Raj Ghugare

rg9360@princeton.edu

I am a PhD student at Princeton University, advised by Ben Eysenbach. Previously, I spent 1.5 years at Mila and the Montreal Robotics and AI Lab. Before that, I completed my bachelor's degree at NIT Nagpur. Broadly, my research goal is to develop simpler and more scalable AI algorithms. I am interested in topics revolving around reinforcement learning and probabilistic inference.

Research

Please refer to Google Scholar for a complete list of my publications.

BuilderBench – A benchmark for generalist agents
Raj Ghugare, Catherine Ji, Kathryn Wantlin, Jin Schofield, Benjamin Eysenbach

project page | paper | code


Today's AI models learn primarily by mimicking and sharpening existing data, which limits their ability to solve novel problems. One way to overcome this is to build agents that acquire skills by exploring and learning through their own experience. We introduce a new benchmark to accelerate research towards such agents, focused on evaluating open-ended exploration, embodied reasoning, and reinforcement learning.
Normalizing Flows are Capable Models for RL [NeurIPS 2025]
Raj Ghugare, Benjamin Eysenbach

project page | paper | code


Normalizing flows are among the most flexible probabilistic models, yet they have received far less attention from the RL community. We take a step towards correcting this by showing that normalizing flows can indeed be powerful models for RL.
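For intuition, here is a minimal sketch of the change-of-variables computation that makes flows tractable. This is illustrative PyTorch for a single affine coupling layer, not the architecture used in the paper:

```python
# Illustrative only: a minimal affine-coupling normalizing flow in PyTorch.
# All names here are hypothetical, not the paper's code.
import torch
import torch.nn as nn

class AffineCoupling(nn.Module):
    """Transforms the second half of x conditioned on the first half."""
    def __init__(self, dim, hidden=64):
        super().__init__()
        self.half = dim // 2
        self.net = nn.Sequential(
            nn.Linear(self.half, hidden), nn.ReLU(),
            nn.Linear(hidden, 2 * (dim - self.half)),
        )

    def forward(self, x):
        x1, x2 = x[:, :self.half], x[:, self.half:]
        scale, shift = self.net(x1).chunk(2, dim=-1)
        scale = torch.tanh(scale)          # keep the Jacobian well conditioned
        y2 = x2 * torch.exp(scale) + shift
        log_det = scale.sum(dim=-1)        # log|det J| of this coupling
        return torch.cat([x1, y2], dim=-1), log_det

# Change of variables: log p(x) = log p(z) + log|det J|.
flow = AffineCoupling(dim=4)
x = torch.randn(8, 4)
z, log_det = flow(x)
base = torch.distributions.Normal(0.0, 1.0)
log_px = base.log_prob(z).sum(dim=-1) + log_det
```

The exact density and exact sampling in one model are what make flows attractive as policies and value-weighted samplers in RL.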
Closing the Gap between TD Learning and Supervised Learning – A Generalisation Point of View [ICLR 2024]
Raj Ghugare, Matthieu Geist, Glen Berseth, Benjamin Eysenbach

paper | code


This paper explores the link between trajectory stitching and combinatorial generalisation. Although stitching is mostly associated with dynamic programming, we show that significant progress (up to a 10x improvement) can be made with much simpler techniques.
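As a flavor of what such simpler techniques can look like, here is a hypothetical sketch of hindsight-style goal relabeling for outcome-conditioned behavioral cloning; the function and data layout are illustrative, not the paper's implementation:

```python
# Hypothetical sketch of goal relabeling for outcome-conditioned behavioral
# cloning; illustrative pseudocode, not the paper's implementation.
import numpy as np

def relabeled_batch(trajectories, batch_size, rng):
    """Sample (state, goal, action) triples, relabeling goals with states
    reached later in the same trajectory (hindsight-style relabeling)."""
    states, goals, actions = [], [], []
    for _ in range(batch_size):
        traj = trajectories[rng.integers(len(trajectories))]  # list of (s, a)
        t = rng.integers(len(traj) - 1)
        k = rng.integers(t + 1, len(traj))   # a future time step
        s, a = traj[t]
        g, _ = traj[k]                       # treat a future state as the goal
        states.append(s); goals.append(g); actions.append(a)
    return np.stack(states), np.stack(goals), np.stack(actions)

# Usage: relabeled_batch(trajectories, 256, np.random.default_rng(0))
```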
Searching for High-Value Molecules Using Reinforcement Learning and Transformers [ICLR 2024]
Raj Ghugare, Santiago Miret, Adriana Hugessen, Mariano Phielipp, Glen Berseth

project page | paper | code


Through extensive experiments spanning datasets with 100 million molecules and 25+ reward functions, we uncover algorithmic choices that are essential for efficient search with RL, and discover phenomena like reward hacking of protein docking scores.
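For context, text-based molecule search with RL often boils down to policy-gradient fine-tuning of an autoregressive sequence model. The sketch below shows a generic REINFORCE step with a mean baseline; `model.sample` and `reward_fn` are assumed interfaces, not the paper's code:

```python
# Illustrative only: REINFORCE-style fine-tuning of an autoregressive model
# over molecule strings (e.g. SMILES). `model` and `reward_fn` are
# hypothetical stand-ins, not the paper's implementation.
import torch

def reinforce_step(model, reward_fn, optimizer, batch_size=64):
    # Sample token sequences and per-token log-probs from the current policy.
    sequences, log_probs = model.sample(batch_size)   # assumed interface
    rewards = torch.tensor([reward_fn(s) for s in sequences])
    baseline = rewards.mean()                         # variance reduction
    loss = -((rewards - baseline) * log_probs.sum(dim=-1)).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return rewards.mean().item()
```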
Simplifying Model-based RL: Learning Representations, Latent-space Models and Policies with One Objective [ICLR 2023]
Raj Ghugare, Homanga Bharadhwaj, Benjamin Eysenbach, Sergey Levine, Ruslan Salakhutdinov

project page | paper | code


We present a joint objective for latent-space model-based RL that lower-bounds the RL objective. Maximising this bound jointly over the encoder, model, and policy boosts sample efficiency, without relying on techniques like ensembles of Q-networks or a high replay ratio.
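As a rough schematic of the kind of bound involved (not the paper's exact objective), such objectives typically pair latent reward prediction with a consistency term between the encoder and the latent dynamics model:

```latex
J(\pi) \;\ge\; \mathbb{E}_{q_\phi}\!\Big[\sum_{t} \hat{r}(z_t, a_t)
  \;-\; \mathrm{KL}\big(q_\phi(z_{t+1} \mid s_{t+1}) \,\big\|\, p_\theta(z_{t+1} \mid z_t, a_t)\big)\Big]
```

Maximising the right-hand side trains the encoder q_phi, the latent model p_theta, and the policy pi with a single loss.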

Last updated: October 2025.