Learning Modular Representations for Long-Term Multi-Agent Motion Predictions

Abstract

Context plays a significant role in the generation of motion for dynamic agents in interactive environments. This work proposes a modular method that utilises a model of the environment to aid motion prediction of tracked agents. This paper shows that modelling the spatial and dynamic aspects of a given environment alongside the local per-agent behaviour results in more accurate and informed long-term motion prediction. Further, we observe that this decoupling of dynamics and environment models allows for better generalisation to unseen environments, requiring that only a spatial representation of a new environment be learned. We highlight the model's prediction capability using a benchmark pedestrian tracking problem and by tracking a robot arm performing a tabletop manipulation task. The proposed approach allows for robust and data-efficient forward modelling, and relaxes the need for full model re-training in new environments. We evaluate this through an ablation study which shows better performance gain when decoupling representation modules in addition to improved generalisation on tasks with dynamics unseen at training time.
