Bipedal Walking Robot using Deep Deterministic Policy Gradient

Abstract

Machine learning algorithms have found several applications in robotics and control systems. The control systems community has begun to show interest in machine learning algorithms from sub-domains such as supervised learning, imitation learning and reinforcement learning to achieve autonomous control and intelligent decision making. Among the many complex control problems, stable bipedal walking has been among the most challenging. In this paper, we present an architecture to design and simulate a planar bipedal walking robot (BWR) using a realistic robotics simulator, Gazebo. The robot demonstrates successful walking behaviour by learning through trial and error, without any prior knowledge of itself or the world dynamics. The autonomous walking of the BWR is achieved using a reinforcement learning algorithm called Deep Deterministic Policy Gradient (DDPG), an algorithm for learning control policies in continuous action spaces. After training the model in simulation, we observed that, with a properly shaped reward function, the robot achieved faster walking or even produced a running gait, with an average speed of 0.83 m/s. The gait pattern of the bipedal walker was compared with an actual human walking pattern, and the results show that the two have similar characteristics. The video presenting our experiment is available at https://goo.gl/NHXKqR.
