Normalizing Flows are Capable Models for RL

Princeton University

Abstract

Modern reinforcement learning (RL) algorithms have found success by using powerful probabilistic models, such as transformers, energy-based models, and diffusion / flow-based models. To this end, RL researchers often choose to pay the price of accommodating these models into their algorithms -- diffusion models are expressive but computationally intensive due to their reliance on solving differential equations, while autoregressive transformer models are scalable but typically require learning discrete representations. Normalizing flows (NFs), by contrast, provide an appealing alternative, as they enable exact likelihoods and sampling without solving differential equations or using autoregressive architectures. However, their potential in RL has received limited attention, partly due to the prevailing belief that normalizing flows lack sufficient expressivity. We show that this is not the case. Building on recent work in NFs, we propose a single NF architecture that integrates seamlessly into RL algorithms, serving as a policy, Q-function, and occupancy measure. Our approach leads to much simpler algorithms, and achieves higher performance in imitation learning, offline RL, goal-conditioned RL, and unsupervised RL.
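To make the contrast with diffusion models concrete, the following minimal sketch (not the paper's architecture; the scale and shift parameters are hypothetical) shows the core NF property the abstract relies on: with an invertible transform, the change-of-variables formula gives an exact log-likelihood in one forward pass, and sampling is one inverse pass, with no differential-equation solver involved.

```python
import numpy as np

# A single invertible affine transform z = (x - b) / a maps data to a
# standard normal base distribution. The change-of-variables formula gives
#   log p(x) = log N(z; 0, 1) - log|a|
# where log|a| is the log-determinant of the Jacobian dx/dz.
a, b = 2.0, 1.0  # hypothetical learned scale and shift

def log_prob(x):
    """Exact log-likelihood via change of variables (one forward pass)."""
    z = (x - b) / a
    base = -0.5 * (z ** 2 + np.log(2 * np.pi))  # standard normal log-density
    return base - np.log(abs(a))                # subtract log|det Jacobian|

def sample(rng, n):
    """Sampling is a single inverse pass, x = a * z + b -- no ODE solving."""
    z = rng.standard_normal(n)
    return a * z + b

rng = np.random.default_rng(0)
xs = sample(rng, 1000)
print(log_prob(xs).mean())
```

Deep NFs stack many such invertible layers (e.g. coupling layers), but the two operations above, exact density evaluation and one-pass sampling, remain the reason an NF can play the roles of policy, Q-function, and occupancy measure within standard RL algorithms.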

Results

Below we present a sneak peek of results achieved using normalizing flows across a diverse set of RL settings, such as imitation learning, offline RL, goal-conditioned RL, and unsupervised RL. For more details about the experiments and the methods used, please refer to the paper.

Task Visualization


Behavioral Cloning

Method name: NF-BC

Behavioral Cloning Results

Goal-Conditioned Behavioral Cloning

Method name: NF-GCBC

Goal-Conditioned Behavioral Cloning Results

Offline RL

Method name: NF-RLBC

Offline RL Results

Goal-Conditioned RL

Method name: NF-UGS

Goal-Conditioned RL Results

BibTeX

@misc{ghugare2025nf4rl,
  title={Normalizing Flows are Capable Models for RL},
  author={Raj Ghugare and Benjamin Eysenbach},
  year={2025},
  eprint={2505.23527},
  archivePrefix={arXiv},
  primaryClass={cs.LG},
  url={https://arxiv.org/abs/2505.23527},
}