Mastering Diverse Domains through World Models

Abstract

General intelligence requires solving tasks across many domains. Current reinforcement learning algorithms carry this potential but are held back by the resources and knowledge required to tune them for new tasks. We present DreamerV3, a general and scalable algorithm based on world models that outperforms previous approaches across a wide range of domains with fixed hyperparameters. These domains include continuous and discrete actions, visual and low-dimensional inputs, 2D and 3D worlds, different data budgets, reward frequencies, and reward scales. We observe favorable scaling properties of DreamerV3, with larger models directly translating to higher data-efficiency and final performance. Applied out of the box, DreamerV3 is the first algorithm to collect diamonds in Minecraft from scratch without human data or curricula, a long-standing challenge in artificial intelligence. Our general algorithm makes reinforcement learning broadly applicable and allows scaling to hard decision-making problems.
