Imitation Learning with Concurrent Actions in 3D Games

Abstract

In this work we describe a novel deep reinforcement learning neural network architecture that allows multiple actions to be selected at every time-step. Multi-action policies allow complex behaviors to be learnt that are otherwise hard to achieve when using single-action selection techniques. This work describes an algorithm that uses both imitation learning (IL) and temporal difference (TD) reinforcement learning (RL) to provide a 4x improvement in training time and a 2.5x improvement in performance over single-action selection TD RL. We demonstrate the capabilities of this network using a complex in-house 3D game. Mimicking the behavior of the expert teacher significantly improves world state exploration and allows the agent's vision system to be trained more rapidly than with TD RL alone. This initial training technique kick-starts TD learning, and the agent quickly learns to surpass the capabilities of the expert.
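To make the distinction between single-action and multi-action selection concrete, the following is a minimal sketch, not the architecture described in this work. The abstract does not specify the output layer, so the sketch assumes one common choice: an independent Bernoulli (sigmoid) probability per action, contrasted with a softmax that picks exactly one action. The action names, logit values, and function names are hypothetical and chosen only for illustration.

```python
import math
import random


def sigmoid(x: float) -> float:
    """Logistic function mapping a logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def select_single_action(logits, rng):
    """Single-action selection: sample exactly one action index from a softmax."""
    m = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(v - m) for v in logits]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]


def select_concurrent_actions(logits, rng):
    """Multi-action selection (assumed form): each action is an independent
    Bernoulli draw, so any subset of actions can be active in one time-step."""
    return [1 if rng.random() < sigmoid(v) else 0 for v in logits]


# Hypothetical action set and policy outputs for one time-step.
ACTIONS = ["forward", "turn_left", "jump", "fire"]
logits = [2.0, -1.0, 0.5, 1.5]

rng = random.Random(0)
single = select_single_action(logits, rng)
multi = select_concurrent_actions(logits, rng)

print("single action:", ACTIONS[single])
print("concurrent actions:", [a for a, on in zip(ACTIONS, multi) if on])
```

Under this assumption, a behavior such as moving forward while firing is a single sample from the multi-action policy, whereas a single-action policy would need either separate time-steps or an enumerated action for every combination.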
