Clarisse Wibault

• French, German & British •

Hi, I'm Clarisse! I'm a Ph.D. student at Oxford supervised by Maike Osborne and Jakob Foerster. My research focuses on how we can scale Reinforcement Learning to the real world. In line with the Collect and Infer pipeline, I'm interested in how we can extract maximally useful policies from large-scale offline data, and, conversely, how this can be used to guide intelligent data collection.

Recent progress in Machine Learning has been driven by methods that scale effectively with data and computation. Rather than relying on rigid, manually engineered priors and heuristics, successful architectures employ general-purpose inductive biases that support scalable learning, such as Convolutional Neural Networks in Computer Vision and Transformers in Natural Language Processing.

Offline Reinforcement Learning offers the potential to train effective decision-making agents from large datasets, but I would argue that a scalable inductive bias for RL has not yet been found. My current intuition is that it lies in hierarchy, which allows for both temporal abstraction and representational abstraction.

Blog Posts

Abstraction for Offline Goal-Conditioned Reinforcement Learning (May 2026)

Recurrent Structural Policy Gradient for Partially Observable Mean Field Games (May 2026)

Selected Publications

(*: equal contribution, see CV for full list)

Abstraction for Offline Goal-Conditioned Reinforcement Learning

Clarisse Wibault, Alexander Goldie, Antonio Villares, Maike Osborne, Jakob Foerster

Preprint

Paper | Code | Blog Post

Recurrent Structural Policy Gradient for Partially Observable Mean Field Games

Clarisse Wibault, Johannes Forkel, Sebastian Towers, Tiphaine Wibault, Juan Duque, George Whittle, Andreas Schaab, Yucheng Yang, Chiyuan Wang, Maike Osborne, Benjamin Moll, Jakob Foerster

ICML 2026 (Spotlight)

Paper | Code | Google Colab | Blog Post

Fully Offline Reinforcement Learning

Mattie Fellows*, Clarisse Wibault*, Uljad Berdica, Johannes Forkel, Maike Osborne, Jakob Foerster

RLC 2026

Paper | Code

Talks

Google DeepMind, London

Structural Policy Gradient for Mean Field Games

Benjamin Moll & Clarisse Wibault

February 2026

Education

University of Oxford

Ph.D. Student at FLAIR and MLRG October 2024 - Present

University of Oxford

MEng October 2020 - June 2024

Work Experience

College Tutor

Magdalen College, University of Oxford, Oxford, England October 2024 - June 2026

VIRDX QuantCo

Munich, Germany July - October 2025

BNP Paribas

London, UK June - August 2023

BMW Group

Munich, Germany June - August 2022