Clarisse Wibault
Hi, I'm Clarisse! I'm a Ph.D. student at Oxford supervised by Mike Osborne and Jakob Foerster. My research focuses on how we can scale Reinforcement Learning to the real world. In line with the Collect and Infer pipeline, I'm interested in how we can extract maximally useful policies from large-scale offline data, and, conversely, how this can be used to guide intelligent data collection.
Recent progress in Machine Learning has been driven by methods that scale effectively with data and computation. Rather than relying on rigid, manually engineered priors and heuristics, successful architectures employ general-purpose inductive biases that support scalable learning, such as Convolutional Neural Networks in Computer Vision and Transformers in Natural Language Processing.
Offline Reinforcement Learning offers the potential to train effective decision-making agents from large datasets, but I would argue that a scalable inductive bias for RL has not yet been found. My current intuition is that it lies in hierarchy, which allows for both temporal and representational abstraction.