Hybrid Zero Dynamics Inspired Feedback Control Policy Design for 3D Bipedal Locomotion using Reinforcement Learning

Abstract

This paper presents a novel model-free reinforcement learning (RL) framework to design feedback control policies for 3D bipedal walking. Existing RL algorithms are often trained in an end-to-end manner or rely on prior knowledge of some reference joint trajectories. Unlike these studies, we propose a novel policy structure that appropriately incorporates physical insights gained from the hybrid nature of the walking dynamics and the well-established hybrid zero dynamics approach for 3D bipedal walking. As a result, the overall RL framework has several key advantages, including a lightweight network structure, short training time, and less dependence on prior knowledge. We demonstrate the effectiveness of the proposed method on Cassie, a challenging 3D bipedal robot. The proposed solution produces stable walking limit cycles that can track various walking speeds in different directions. Surprisingly, without being specifically trained with disturbances to achieve robustness, it also performs robustly against various adversarial forces applied to the torso in both the forward and backward directions.
