Physically Embedded Planning Problems: New Challenges for Reinforcement Learning

Abstract

Recent work in deep reinforcement learning (RL) has produced algorithms capable of mastering challenging games such as Go, chess, or shogi. In these works, the RL agent directly observes the natural state of the game and controls that state directly with its actions. However, when humans play such games, they do not just reason about the moves but also interact with their physical environment. They understand the state of the game by looking at the physical board in front of them and modify it by manipulating pieces using touch and fine-grained motor control. Mastering complicated physical systems with abstract goals is a central challenge for artificial intelligence, but it remains out of reach for existing RL algorithms. To encourage progress towards this goal, we introduce a set of physically embedded planning problems and make them publicly available. We embed challenging symbolic tasks (Sokoban, tic-tac-toe, and Go) in a physics engine to produce a set of tasks that require perception, reasoning, and motor control over long time horizons. Although existing RL algorithms can tackle the symbolic versions of these tasks, we find that they struggle to master even the simplest of their physically embedded counterparts. As a first step towards characterizing the space of solutions to these tasks, we introduce a strong baseline that uses a pre-trained expert game player to provide hints in the abstract space to an RL agent's policy while training it on the full sensorimotor control task. The resulting agent solves many of the tasks, underlining the need for methods that bridge the gap between abstract planning and embodied control.
