CV | Orso Forghieri

CV highlights

Machine Learning Engineer and PhD researcher in Reinforcement Learning.
PhD in Applied Mathematics / Reinforcement Learning, École Polytechnique / Institut Polytechnique de Paris, Sep. 2022-Dec. 2025.
Quantitative Researcher, Qube Research & Technologies, Paris, Oct. 2025-Mar. 2026.
Publications at IFIP Networking 2025, EWRL 2024, and ICAPS 2024.
Open-source Python research software for reinforcement learning, dynamic programming, and MDP benchmarking.

Research profile

Machine learning engineer and PhD researcher in reinforcement learning, with experience building Python research software, benchmarking infrastructure, and scalable learning algorithms. Strong background in PyTorch/JAX, sequential decision-making, optimization, and large-scale experimentation, with interests in open-source ML tooling, efficient training/inference, and developer-facing ML libraries.

Experience

Oct. 2025-Mar. 2026: Quantitative Researcher, Qube Research & Technologies, Paris, France. Developed machine-learning and representation-learning methods on large-scale datasets covering more than 3,000 equity assets. Designed robust experimentation and validation frameworks for predictive modeling, including feature engineering, out-of-sample evaluation, and statistical performance analysis. Built scalable Python/JAX research infrastructure for training, benchmarking, and evaluating learning-based decision systems on noisy real-world data.
2022-2025: PhD Researcher in Reinforcement Learning, CMAP, École Polytechnique, Institut Polytechnique de Paris. Developed state-abstraction and progressive-disaggregation algorithms for large-scale Markov Decision Processes, reducing the effective planning space while preserving control-relevant structure. Demonstrated consistent speedups on benchmark and applied MDPs, including random MDPs, Four Rooms, Mountain Car, Sutton’s racetrack, tandem queues, and hydro-valley management models. Built reusable Python/Gymnasium benchmarking infrastructure for RL agents and dynamic-programming solvers.
2021-2024: Research Collaboration in Reinforcement Learning and Combinatorial Optimization, Orange Gardens / CMAP. Developed RL and optimization methods for latency-constrained service placement in edge-computing systems. Benchmarked exact, heuristic, and learning-based approaches in an industrial 6G research context, leading to an IFIP Networking 2025 publication.

Education

Sep. 2022-Dec. 2025: PhD in Applied Mathematics / Reinforcement Learning, École Polytechnique, Institut Polytechnique de Paris. Thesis: Hierarchical Reinforcement Learning for Large Scale Problems. Supervision: Erwan Le Pennec, Hind Castel-Taleb, Emmanuel Hyon. Laboratory: Centre de Mathématiques Appliquées, École Polytechnique.
2021-2022: MSc MVA (Mathematics, Vision, Learning), ENS Paris-Saclay. Focus: statistical learning, deep learning, reinforcement learning, time series analysis, and optimization. Thesis: Hierarchical Reinforcement Learning for Optimizing Large Systems, supervised by Hind Castel-Taleb and Emmanuel Hyon.
2018-2021: Engineering Degree, École Polytechnique. Coursework: statistical learning, probability, optimization, algorithms, deep learning, uncertainty quantification, and risk analysis. Research project: An Overview of Market Forecast with Machine Learning Techniques.

Selected Publications

Hierarchical Reinforcement Learning for Large Scale Problems, PhD thesis, Institut Polytechnique de Paris, 2025.
Faster Latency Constrained Service Placement in Edge Computing with Deep Reinforcement Learning, IFIP Networking, 2025.
State Abstraction Discovery from Progressive Disaggregation Methods, European Workshop on Reinforcement Learning, 2024.
Progressive State Space Disaggregation for Infinite Horizon Dynamic Programming, International Conference on Automated Planning and Scheduling, 2024.

Open-Source Projects

Developed open-source Python implementations of state-abstraction, aggregation/disaggregation, and dynamic-programming algorithms for large-scale Markov Decision Processes.
Built reusable benchmarking frameworks and Gymnasium-compatible environments for evaluating planning and reinforcement-learning methods across control, queueing, and resource-allocation problems.
Released research code accompanying peer-reviewed publications, with emphasis on reproducibility, scalable experimentation, and algorithmic evaluation.
Maintained personal research software and documentation through public GitHub repositories and a research website.

Skills

Frameworks: PyTorch, JAX, TensorFlow, Gymnasium/Gym, Hugging Face (Transformers, Datasets, Accelerate).
ML / AI: Deep Learning, Reinforcement Learning, Representation Learning, Sequential Decision-Making, Large-Scale Experimentation, Statistical Learning.
Systems: Python, C++, SQL, Linux, Git, testing, benchmarking, reproducible ML pipelines.
Optimization: Dynamic Programming, Stochastic Optimization, Combinatorial Optimization, Integer Programming, Approximate Planning.