Learning to Grasp on the Moon from 3D Octree Observations with Deep Reinforcement Learning

Abstract

Extraterrestrial rovers with a general-purpose robotic arm have many potential applications in lunar and planetary exploration. Introducing autonomy into such systems is desirable for increasing the time that rovers can spend gathering scientific data and collecting samples. This work investigates the applicability of deep reinforcement learning for vision-based robotic grasping of objects on the Moon. A novel simulation environment with procedurally-generated datasets is created to train agents under challenging conditions in unstructured scenes with uneven terrain and harsh illumination. A model-free off-policy actor-critic algorithm is then employed for end-to-end learning of a policy that directly maps compact octree observations to continuous actions in Cartesian space. Experimental evaluation indicates that 3D data representations enable more effective learning of manipulation skills when compared to traditionally used image-based observations. Domain randomization improves the generalization of learned policies to novel scenes with previously unseen objects and different illumination conditions. To this end, we demonstrate zero-shot sim-to-real transfer by evaluating trained agents on a real robot in a Moon-analogue facility.
