Abstract
Humans leverage rich internal models of the world to reason about the future, imagine counterfactuals, and adapt flexibly to new situations. In Reinforcement Learning (RL), world models aim to capture how the environment evolves in response to the agent's actions, facilitating planning and generalization. However, typical world models directly operate on the environment variables (e.g., pixels, physical attributes), which can make their training slow and cumbersome; instead, it may be advantageous to rely on high-level latent dimensions that capture relevant multimodal variables. Global Workspace (GW) Theory offers a cognitive framework for multimodal integration and information broadcasting in the brain, and recent studies have begun to introduce efficient deep learning implementations of GW. Here, we evaluate the capabilities of an RL system combining GW with a world model. We compare our GW-Dreamer with various versions of the standard PPO and the original Dreamer algorithms. We show that performing the dreaming process (i.e., mental simulation) inside the GW latent space allows for training with fewer environment steps. As an additional emergent property, the resulting model (but not its comparison baselines) displays strong robustness to the absence of one of its observation modalities (images or simulation attributes). We conclude that the combination of GW with World Models holds great potential for improving decision-making in RL agents.