DMAP: a Distributed Morphological Attention Policy for Learning to Locomote with a Changing Body

Abstract

Biological and artificial agents need to deal with constant changes in the real world. We study this problem in four classical continuous control environments, augmented with morphological perturbations. Learning to locomote when the length and the thickness of different body parts vary is challenging, as the control policy is required to adapt to the morphology to successfully balance and advance the agent. We show that a control policy based on the proprioceptive state performs poorly with highly variable body configurations, while an (oracle) agent with access to a learned encoding of the perturbation performs significantly better. We introduce DMAP, a biologically-inspired, attention-based policy network architecture. DMAP combines independent proprioceptive processing, a distributed policy with individual controllers for each joint, and an attention mechanism, to dynamically gate sensory information from different body parts to different controllers. Despite not having access to the (hidden) morphology information, DMAP can be trained end-to-end in all the considered environments, overall matching or surpassing the performance of an oracle agent. Thus DMAP, implementing principles from biological motor control, provides a strong inductive bias for learning challenging sensorimotor tasks. Overall, our work corroborates the power of these principles in challenging locomotion tasks.
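
The three ingredients named in the abstract (independent proprioceptive processing, one controller per joint, and attention that gates sensory features to each controller) can be illustrated with a toy forward pass. The sketch below is not the DMAP implementation: the abstract does not specify layer types, sizes, or how attention scores are computed, so the class name ToyDistributedAttentionPolicy, the per-joint query vectors, the dot-product scoring, the tanh nonlinearities, and the random untrained weights are all assumptions made purely for illustration.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


class ToyDistributedAttentionPolicy:
    """Hypothetical illustration only; weights are random and untrained."""

    def __init__(self, n_sensors, n_joints, feat_dim=4, seed=0):
        rng = random.Random(seed)

        def rand_vec(n):
            return [rng.uniform(-1.0, 1.0) for _ in range(n)]

        # Independent processing: each sensor channel has its own encoder weights.
        self.enc_w = [rand_vec(feat_dim) for _ in range(n_sensors)]
        # Assumption: each joint owns a query vector used to score sensor features.
        self.query = [rand_vec(feat_dim) for _ in range(n_joints)]
        # Distributed policy: each joint has its own small controller.
        self.ctrl_w = [rand_vec(feat_dim) for _ in range(n_joints)]

    def encode(self, obs):
        # One feature vector per sensor, computed without mixing channels.
        return [[math.tanh(w * x) for w in ws] for ws, x in zip(self.enc_w, obs)]

    def attention(self, feats):
        # One row per joint; each row is a distribution over sensors.
        return [softmax([dot(q, f) for f in feats]) for q in self.query]

    def act(self, obs):
        feats = self.encode(obs)
        attn = self.attention(feats)
        feat_dim = len(feats[0])
        actions = []
        for j, weights in enumerate(attn):
            # Gate: attention-weighted sum of sensor features for joint j.
            gated = [sum(a * f[k] for a, f in zip(weights, feats))
                     for k in range(feat_dim)]
            actions.append(math.tanh(dot(self.ctrl_w[j], gated)))
        return actions, attn


policy = ToyDistributedAttentionPolicy(n_sensors=5, n_joints=3)
actions, attn = policy.act([0.1, -0.4, 0.7, 0.0, 0.2])
```

Here each row of attn indicates how strongly one joint's controller draws on each sensor channel. In a trained policy of this general shape, such a matrix could shift with the body configuration, which is the sense in which attention "gates" sensory information to controllers.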
