AdaWorld: Learning Adaptable World Models with Latent Actions

  • 2025-03-24 18:58:15
  • Shenyuan Gao, Siyuan Zhou, Yilun Du, Jun Zhang, Chuang Gan

Abstract

World models aim to learn action-controlled prediction models and have proven essential for the development of intelligent agents. However, most existing world models rely heavily on substantial action-labeled data and costly training, making it challenging to adapt to novel environments with heterogeneous actions through limited interactions. This limitation can hinder their applicability across broader domains. To overcome this challenge, we propose AdaWorld, an innovative world model learning approach that enables efficient adaptation. The key idea is to incorporate action information during the pretraining of world models. This is achieved by extracting latent actions from videos in a self-supervised manner, capturing the most critical transitions between frames. We then develop an autoregressive world model that conditions on these latent actions. This learning paradigm enables highly adaptable world models, facilitating efficient transfer and learning of new actions even with limited interactions and finetuning. Our comprehensive experiments across multiple environments demonstrate that AdaWorld achieves superior performance in both simulation quality and visual planning.
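
The two-stage idea in the abstract (infer a latent action from a pair of frames, then predict the next frame conditioned on that latent action) can be illustrated with a toy sketch. This is not the authors' architecture: frames are stood in for by flat vectors, the weights are random and untrained, and every name and size here (`extract_latent_action`, `predict_next_frame`, `rollout`, `FRAME_DIM`, `LATENT_DIM`) is a hypothetical choice made only to show the data flow.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: a "frame" is a flat vector, a latent action is much smaller.
FRAME_DIM, LATENT_DIM = 16, 2

# Random, untrained weights; a real system would learn these from unlabeled video.
W_enc = rng.normal(scale=0.1, size=(LATENT_DIM, 2 * FRAME_DIM))
W_dyn = rng.normal(scale=0.1, size=(FRAME_DIM, FRAME_DIM + LATENT_DIM))


def extract_latent_action(frame_t, frame_next):
    """Compress the transition between two consecutive frames into a small latent vector."""
    return np.tanh(W_enc @ np.concatenate([frame_t, frame_next]))


def predict_next_frame(frame_t, latent_action):
    """Predict the next frame from the current frame and a latent action."""
    return W_dyn @ np.concatenate([frame_t, latent_action])


def rollout(first_frame, latent_actions):
    """Autoregressive rollout: each prediction becomes the input for the next step."""
    frames = [first_frame]
    for action in latent_actions:
        frames.append(predict_next_frame(frames[-1], action))
    return np.stack(frames)


# A stand-in "video" of 5 frames with no action labels.
video = rng.normal(size=(5, FRAME_DIM))

# One latent action per consecutive frame pair, inferred without labels.
latents = np.stack(
    [extract_latent_action(video[t], video[t + 1]) for t in range(len(video) - 1)]
)

# Replay the inferred latent actions from the first frame.
predicted = rollout(video[0], latents)
```

Because the latent bottleneck is far smaller than a frame, a trained encoder of this kind would be pushed to keep only the information that explains the change between frames, which is the role the abstract assigns to latent actions.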

 
