Bellman Diffusion Models

Abstract

Diffusion models have seen tremendous success as generative architectures. Recently, they have been shown to be effective at modelling policies for offline reinforcement learning and imitation learning. We explore using diffusion as a model class for the successor state measure (SSM) of a policy. We find that enforcing the Bellman flow constraints leads to a simple Bellman update on the diffusion step distribution.
