RL-RRT: Kinodynamic Motion Planning via Learning Reachability Estimators from RL Policies

Abstract

This paper addresses two challenges facing sampling-based kinodynamic motion planning: identifying good candidate states for local transitions, and the subsequent, computationally intractable steering between these candidate states. By combining sampling-based planning, in the form of a Rapidly Exploring Randomized Tree (RRT), with an efficient kinodynamic motion planner obtained through machine learning, we propose an efficient solution to long-range kinodynamic motion planning. First, we use deep reinforcement learning to learn an obstacle-avoiding policy that maps a robot's sensor observations to actions, which is used as a local planner during planning and as a controller during execution. Second, we train a reachability estimator in a supervised manner, which predicts the RL policy's time to reach a state in the presence of obstacles. Lastly, we introduce RL-RRT, which uses the RL policy as a local planner and the reachability estimator as the distance function to bias tree growth towards promising regions. We evaluate our method on three kinodynamic systems, including physical robot experiments. Results across all three robots tested indicate that RL-RRT outperforms state-of-the-art kinodynamic planners in efficiency, and also provides a shorter path finish time than a steering-function-free method. The learned local planner policy and accompanying reachability estimator demonstrate transferability to the previously unseen experimental environments, and RL-RRT is fast because the expensive computations are replaced with simple neural network inference.
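
The planning loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the names `policy_rollout`, `time_to_reach`, and `rl_rrt_sketch` are hypothetical, the robot is a 2D point in an obstacle-free workspace, and both learned components are replaced by hand-written stand-ins (a straight-line rollout in place of the RL policy, and Euclidean distance divided by speed in place of the learned reachability estimator). It only shows where the two components would plug into an RRT: the estimator selects the tree node to extend, and the policy rollout performs the extension.

```python
import math
import random


def policy_rollout(start, target, step=0.5, max_steps=5):
    """Stand-in for the RL policy used as local planner (assumption):
    take up to max_steps fixed-size steps straight toward the target."""
    x, y = start
    path = []
    for _ in range(max_steps):
        dx, dy = target[0] - x, target[1] - y
        dist = math.hypot(dx, dy)
        if dist < 1e-9:
            break
        s = min(step, dist)
        x, y = x + s * dx / dist, y + s * dy / dist
        path.append((x, y))
    return path


def time_to_reach(start, target, speed=1.0):
    """Stand-in for the learned reachability estimator (assumption):
    Euclidean distance divided by a constant speed, ignoring obstacles."""
    return math.hypot(target[0] - start[0], target[1] - start[1]) / speed


def rl_rrt_sketch(start, goal, bounds, iters=500, goal_tol=0.5,
                  goal_bias=0.1, seed=0):
    """Grow a tree from start; return a list of states ending near goal,
    or None if no such state is found within iters iterations."""
    rng = random.Random(seed)
    nodes = [start]
    parent = {0: None}
    for _ in range(iters):
        # Sample a target state, occasionally the goal itself.
        if rng.random() < goal_bias:
            target = goal
        else:
            target = (rng.uniform(*bounds[0]), rng.uniform(*bounds[1]))
        # Select the node with the smallest estimated time to reach the target.
        near = min(range(len(nodes)),
                   key=lambda i: time_to_reach(nodes[i], target))
        # Extend the tree by rolling out the local planner from that node.
        path = policy_rollout(nodes[near], target)
        if not path:
            continue
        nodes.append(path[-1])
        parent[len(nodes) - 1] = near
        if time_to_reach(path[-1], goal) <= goal_tol:
            # Walk parent links back to the root to recover the route.
            route, i = [], len(nodes) - 1
            while i is not None:
                route.append(nodes[i])
                i = parent[i]
            return route[::-1]
    return None
```

Calling `rl_rrt_sketch((0.0, 0.0), (5.0, 5.0), ((0.0, 10.0), (0.0, 10.0)))` returns a list of waypoints from the start to a state within `goal_tol` of the goal. In the method the abstract describes, both stand-ins would instead be neural networks that account for obstacles and the robot's dynamics.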
