Jointly Learning to Construct and Control Agents using Deep Reinforcement Learning

Abstract

The physical design of a robot and the policy that controls its motion are inherently coupled. However, existing approaches largely ignore this coupling, instead choosing to alternate between separate design and control phases, which requires expert intuition throughout and risks convergence to suboptimal designs. In this work, we propose a method that jointly optimizes over the physical design of a robot and the corresponding control policy in a model-free fashion, without any need for expert supervision. For an arbitrary robot morphology, our method maintains a distribution over design parameters and uses reinforcement learning to train a shared neural network controller for sampled designs. Throughout training, we use our policy network to quickly evaluate new designs and refine the robot distribution to maximize expected reward. This results in an assignment to the robot parameters and neural network policy that are jointly optimal. We evaluate our approach in the context of legged locomotion, and demonstrate that it discovers novel robot designs and walking gaits for several different morphologies, achieving performance better than a baseline and hand-crafted designs.
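
The joint update described above can be illustrated with a minimal sketch. It is not the paper's implementation, and everything in it is an assumption made for illustration: the design distribution is a Gaussian with fixed standard deviation, the "controller" is a single scalar gain rather than a neural network, and toy_reward is a hypothetical closed-form stand-in for a physics rollout. The design mean is refined with a score-function (REINFORCE-style) gradient estimate, while the shared controller parameter is updated on the same batch of sampled designs.

```python
import numpy as np


def toy_reward(design, gain):
    # Hypothetical stand-in for a simulator rollout (assumption, not from the paper).
    # Reward peaks when the design equals [1.0, -0.5] and the controller
    # gain matches the first design parameter.
    target = np.array([1.0, -0.5])
    return -np.sum((design - target) ** 2) - (gain - design[0]) ** 2


def train(steps=2000, batch=32, lr=0.05, seed=0):
    rng = np.random.default_rng(seed)
    mu = np.zeros(2)   # mean of the Gaussian design distribution
    sigma = 0.3        # fixed standard deviation (assumption)
    gain = 0.0         # single shared controller parameter (assumption)

    for _ in range(steps):
        # Sample a batch of designs from the current distribution.
        designs = mu + sigma * rng.standard_normal((batch, 2))

        # Evaluate every sampled design with the one shared controller.
        rewards = np.array([toy_reward(d, gain) for d in designs])

        # Score-function gradient for the mean, with a batch-mean baseline
        # to reduce variance: grad log N(d; mu, sigma^2) = (d - mu) / sigma^2.
        advantages = rewards - rewards.mean()
        grad_mu = (advantages[:, None] * (designs - mu)).mean(axis=0) / sigma**2
        mu += lr * grad_mu

        # Controller update: gradient ascent on the average reward over the
        # sampled designs (the analytic gradient of the toy reward).
        grad_gain = np.mean(-2.0 * (gain - designs[:, 0]))
        gain += lr * grad_gain

    return mu, gain


if __name__ == "__main__":
    mu, gain = train()
    print("design mean:", mu, "controller gain:", gain)
```

In this toy setting the design mean and the controller gain drift together toward a mutually consistent optimum, which is the coupling the abstract refers to; a real system would replace toy_reward with simulated rollouts and the scalar gain with a policy network trained by a reinforcement learning algorithm.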
