Orso Forghieri
Scalable dynamic programming, reinforcement learning, and applied decision-making
CMAP, École polytechnique
Institut Polytechnique de Paris
Palaiseau, France
I am Orso Forghieri, a researcher in reinforcement learning and applied mathematics. I work on scalable dynamic programming and reinforcement learning for Markov Decision Processes, with an emphasis on state abstraction, hierarchical methods, and algorithms that make large decision problems computationally tractable.
My PhD thesis, Hierarchical Reinforcement Learning for Large Scale Problems, was prepared at École polytechnique / Institut Polytechnique de Paris, within CMAP, under the supervision of Erwan Le Pennec, Hind Castel-Taleb, and Emmanuel Hyon.
My applied work connects these methods to resource allocation, edge-computing service placement, network optimization, and systematic trading. I have worked on sequential and explainable decision-making methods for systematic equity trading at Qube Research & Technologies, and on reinforcement-learning approaches for latency-constrained service placement in collaboration with Orange Gardens / CMAP.
I hold a Master’s degree from École normale supérieure Paris-Saclay and an Engineering Diploma from École polytechnique.
Research interests
- State abstraction.
- Approximate dynamic programming.
- Markov aggregation/disaggregation.
- Scalable MDP solving.
- Hierarchical reinforcement learning.
- Planning.
- Stochastic optimization.
Applications
- Large-scale planning.
- Resource allocation.
- Network optimization.
- Edge computing and service placement.
- Market forecasting.
- Railway-delay propagation.
Selected work
- Hierarchical Reinforcement Learning for Large Scale Problems, PhD thesis, 2025. Problem / Method / Outcome: large-scale reinforcement-learning problems are difficult to solve directly; the thesis studies hierarchical and abstraction-based methods; the result is a framework for more scalable decision-making.
- Faster Latency Constrained Service Placement in Edge Computing with Deep Reinforcement Learning, IFIP Networking 2025. Problem / Method / Outcome: latency-constrained edge service placement is computationally demanding; the work applies deep reinforcement learning; the outcome is faster decision-making for network/resource allocation settings.
- State Abstraction Discovery from Progressive Disaggregation Methods, EWRL 2024. Problem / Method / Outcome: useful state abstractions are hard to identify automatically; the method progressively refines aggregated MDP states; the outcome is an abstraction-discovery procedure for reinforcement-learning benchmarks.
- Progressive State Space Disaggregation for Infinite Horizon Dynamic Programming, ICAPS 2024. Problem / Method / Outcome: infinite-horizon dynamic programming can be intractable on large state spaces; the method starts coarse and disaggregates states progressively; the outcome is a scalable solver strategy for benchmark MDPs.
- Selected research software for state-space disaggregation, MDP solving, Gymnasium environments, and applied forecasting.
Research and applied collaborations
I am interested in research collaborations on scalable decision-making, reinforcement learning, approximate dynamic programming, network/resource optimization, and sequential decision problems in applied domains.
Academic service
Selected service and teaching details are listed on the CV and Teaching pages.
Contact / profiles
- Email: orso.forghieri@gmail.com
- GitHub.
- LinkedIn.
- Google Scholar.
- DBLP.