Abstract
We address the problem of efficient exploration for transition model learning in the relational model-based reinforcement learning setting without extrinsic goals or rewards. Inspired by human curiosity, we propose goal-literal babbling (GLIB), a simple and general method for exploration in such problems. GLIB samples relational conjunctive goals that can be understood as specific, targeted effects that the agent would like to achieve in the world, and plans to achieve these goals using the transition model being learned. We provide theoretical guarantees showing that exploration with GLIB will converge almost surely to the ground truth model. Experimentally, we find GLIB to strongly outperform existing methods in prediction and planning on a range of tasks, encompassing standard PDDL and PPDDL planning benchmarks and a robotic manipulation task in the PyBullet physics simulator. Video: https://youtu.be/F6lmrPT6TOY
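To make the goal-sampling step concrete, the following is a minimal sketch of how a relational conjunctive goal might be drawn. It is an illustration under stated assumptions, not the implementation evaluated in the paper: the tuple encoding of literals, the function names `ground_literals` and `sample_goal`, the `max_size` bound, and the blocks-style predicates are all hypothetical. Planning toward the sampled goal with the learned transition model is omitted.

```python
import itertools
import random


def ground_literals(predicates, objects):
    """Enumerate every ground literal, e.g. ("on", "a", "b").

    `predicates` maps a predicate name to its arity (hypothetical encoding).
    Arguments within one literal are distinct objects.
    """
    literals = []
    for name, arity in predicates.items():
        for args in itertools.permutations(objects, arity):
            literals.append((name, *args))
    return literals


def sample_goal(predicates, objects, max_size, rng):
    """Sample a conjunctive goal: a set of 1..max_size distinct ground literals."""
    literals = ground_literals(predicates, objects)
    size = rng.randint(1, min(max_size, len(literals)))
    return frozenset(rng.sample(literals, size))


# Hypothetical toy domain: two predicates over three objects.
rng = random.Random(0)
predicates = {"on": 2, "clear": 1}
objects = ["a", "b", "c"]
goal = sample_goal(predicates, objects, max_size=2, rng=rng)
```

In this sketch, each sampled goal is a small conjunction of literals, so it names a specific effect the agent could try to bring about; an exploration loop would then pass the goal to a planner that uses the current learned model.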