Abstract
In this paper, we address the discovery of robotic options from demonstrations in an unsupervised manner. Specifically, we present a framework to jointly learn low-level control policies and higher-level policies of how to use them from demonstrations of a robot performing various tasks. By representing options as continuous latent variables, we frame the problem of learning these options as latent variable inference. We then present a temporal formulation of variational inference based on a temporal factorization of trajectory likelihoods that allows us to infer options in an unsupervised manner. We demonstrate the ability of our framework to learn such options across three robotic demonstration datasets.