Annotating Motion Primitives for Simplifying Action Search in Reinforcement Learning

Abstract

Reinforcement learning in large-scale environments is challenging due to the many possible actions that can be taken in specific situations. We have previously developed a means of constraining, and hence speeding up, the search process through the use of motion primitives; motion primitives are sequences of pre-specified actions taken across a state series. As a byproduct of this work, we have found that if the motion primitives' motions and actions are labeled, then the search can be sped up further. Since motion primitives may initially lack such details, we propose a theoretically viewpoint-insensitive and speed-insensitive means of automatically annotating the underlying motions and actions. We do this through a differential-geometric, spatio-temporal kinematics descriptor, which analyzes how the poses of entities in two motion sequences change over time. We use this descriptor in conjunction with a weighted-nearest-neighbor classifier to label the primitives using a limited set of training examples. In our experiments, we achieve high motion and action annotation rates for human-action-derived primitives with as few as one training sample. We also demonstrate that reinforcement learning using accurately labeled trajectories leads to high-performing policies more quickly than standard reinforcement learning techniques. This is partly because motion primitives encode prior domain knowledge and preempt the need to re-discover that knowledge during training. It is also because agents can leverage the labels to systematically ignore action classes that do not facilitate task objectives, thereby reducing the action space.
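
As a rough illustration of the labeling and pruning steps described above, the following minimal sketch labels a primitive by weighted-nearest-neighbor voting and then discards primitives whose labels are not wanted for a task. It rests on several assumptions that the abstract does not specify: each primitive's kinematics descriptor is taken to be a fixed-length vector, distances are Euclidean, and votes are weighted by inverse distance. The function names `weighted_nn_label` and `filter_primitives` are hypothetical and are not drawn from the paper.

```python
import numpy as np


def weighted_nn_label(query, train_feats, train_labels, k=3, eps=1e-9):
    """Label `query` by inverse-distance-weighted voting among its k nearest
    training descriptors (assumed weighting scheme, not the paper's)."""
    train_feats = np.asarray(train_feats, dtype=float)
    query = np.asarray(query, dtype=float)
    # Euclidean distance from the query to every training descriptor.
    dists = np.linalg.norm(train_feats - query, axis=1)
    k = min(k, len(dists))  # allows very small training sets
    nearest = np.argsort(dists)[:k]
    votes = {}
    for i in nearest:
        label = train_labels[i]
        # Closer neighbors contribute larger votes.
        votes[label] = votes.get(label, 0.0) + 1.0 / (dists[i] + eps)
    return max(votes, key=votes.get)


def filter_primitives(primitives, labels, allowed):
    """Keep only primitives whose label is in `allowed`, which shrinks the
    set of actions an agent has to search over."""
    return [p for p, lab in zip(primitives, labels) if lab in allowed]


# Toy descriptors; real ones would come from the kinematics descriptor.
train_feats = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
train_labels = ["walk", "walk", "jump"]
label = weighted_nn_label([4.5, 4.5], train_feats, train_labels)
kept = filter_primitives(["p0", "p1", "p2"], ["walk", "jump", "walk"], {"walk"})
```

Because `k` is clamped to the number of training descriptors, the same function still runs when only one labeled example is available, which mirrors the low-sample setting the abstract reports.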
