Experience-Embedded Visual Foresight

Abstract

Visual foresight gives an agent a window into the future, which it can use to anticipate events before they happen and plan strategic behavior. Although impressive results have been achieved on video prediction in constrained settings, these models fail to generalize when confronted with unfamiliar real-world objects. In this paper, we tackle the generalization problem via fast adaptation, where we train a prediction model to quickly adapt to the observed visual dynamics of a novel object. Our method, Experience-embedded Visual Foresight (EVF), jointly learns a fast adaptation module, which encodes observed trajectories of the new object into a vector embedding, and a visual prediction model, which conditions on this embedding to generate physically plausible predictions. For evaluation, we compare our method against baselines on video prediction and benchmark its utility on two real-world control tasks. We show that our method is able to quickly adapt to new visual dynamics and achieves lower error than the baselines when manipulating novel objects.
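
To make the two-module structure described in the abstract concrete, the following is a minimal NumPy sketch, not the paper's architecture. It assumes each trajectory is already a sequence of feature vectors, uses a single linear layer per module, and pools trajectories by averaging. The function names (`encode_experience`, `predict_next`), all dimensions, and the random weights are hypothetical choices made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode_experience(trajectories, W_enc):
    """Map K observed trajectories, shape (K, T, D), to one embedding vector.

    Assumption: mean pooling over trajectories; the paper may aggregate differently.
    """
    flat = trajectories.reshape(trajectories.shape[0], -1)  # (K, T*D)
    per_traj = np.tanh(flat @ W_enc)                        # (K, E)
    return per_traj.mean(axis=0)                            # (E,)

def predict_next(frame, action, embedding, W_pred):
    """Predict next-frame features conditioned on the experience embedding."""
    x = np.concatenate([frame, action, embedding])          # (D + A + E,)
    return x @ W_pred                                       # (D,)

# Hypothetical sizes: K trajectories, T steps, D frame features,
# A action dimensions, E embedding dimensions.
K, T, D, A, E = 5, 4, 8, 2, 3
W_enc = rng.normal(size=(T * D, E)) * 0.1
W_pred = rng.normal(size=(D + A + E, D)) * 0.1

trajectories = rng.normal(size=(K, T, D))   # stand-in for observed experience
embedding = encode_experience(trajectories, W_enc)
next_frame = predict_next(rng.normal(size=D), rng.normal(size=A), embedding, W_pred)
```

In this sketch, adapting to a new object means recomputing `embedding` from that object's trajectories while `W_enc` and `W_pred` stay fixed. Because the pooling is an average, the embedding does not depend on the order in which trajectories are supplied.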
