PoPS: Policy Pruning and Shrinking for Deep Reinforcement Learning

Abstract

The recent success of deep neural networks (DNNs) for function approximation in reinforcement learning has triggered the development of Deep Reinforcement Learning (DRL) algorithms in various fields, such as robotics, computer games, natural language processing, computer vision, sensing systems, and wireless networking. Unfortunately, DNNs suffer from high computational cost and memory consumption, which limits the use of DRL algorithms in systems with limited hardware resources. In recent years, pruning algorithms have demonstrated considerable success in reducing the redundancy of DNNs in classification tasks. However, existing algorithms suffer from a significant performance reduction in the DRL domain. In this paper, we develop the first effective solution to the performance reduction problem of pruning in the DRL domain, and establish a working algorithm, named Policy Pruning and Shrinking (PoPS), to train DRL models with strong performance while achieving a compact representation of the DNN. The framework is based on a novel iterative policy pruning and shrinking method that leverages the power of transfer learning when training the DRL model. We present an extensive experimental study that demonstrates the strong performance of PoPS using the popular Cartpole, Lunar Lander, Pong, and Pacman environments. Finally, we develop open-source software for the benefit of researchers and developers in related fields.
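To make the terms "pruning" and "shrinking" concrete, the following is a minimal sketch of one prune-then-shrink step on a single hidden layer. It is not the authors' implementation: the abstract does not state the pruning criterion, so magnitude-based pruning is assumed here, and the function names (`magnitude_mask`, `shrink_hidden_layer`), the bias-free ReLU layer, and the toy weights are all hypothetical. The transfer-learning step that retrains the smaller network is not shown.

```python
import numpy as np


def magnitude_mask(weights, sparsity):
    """Return a 0/1 mask that zeroes the `sparsity` fraction of
    smallest-magnitude weights (assumed criterion, not from the paper)."""
    k = int(sparsity * weights.size)
    if k == 0:
        return np.ones_like(weights)
    # Magnitude of the k-th smallest weight; everything at or below it is cut.
    threshold = np.sort(np.abs(weights), axis=None)[k - 1]
    return (np.abs(weights) > threshold).astype(weights.dtype)


def shrink_hidden_layer(w_in, w_out):
    """Drop hidden units whose incoming weights are all zero.

    w_in has shape (n_inputs, n_hidden); w_out has shape (n_hidden, n_outputs).
    Assumes no bias and an activation with f(0) = 0 (e.g. ReLU), so a removed
    unit contributed nothing to the output.
    """
    alive = np.any(w_in != 0, axis=0)
    return w_in[:, alive], w_out[alive, :]


# Toy layer: 2 inputs, 3 hidden units, 1 output (values are illustrative).
w_in = np.array([[0.9, 0.01, -0.7],
                 [-0.8, 0.02, 0.1]])
w_out = np.array([[1.0], [2.0], [3.0]])

mask = magnitude_mask(w_in, sparsity=0.5)   # prune half of the weights
pruned_in = w_in * mask                     # sparse layer, same shape
small_in, small_out = shrink_hidden_layer(pruned_in, w_out)  # dense, smaller

x = np.array([[1.0, -2.0]])
y_pruned = np.maximum(x @ pruned_in, 0.0) @ w_out
y_small = np.maximum(x @ small_in, 0.0) @ small_out
```

In this toy case the second hidden unit loses both incoming weights, so shrinking removes it and the layer goes from three hidden units to two while producing the same output as the pruned layer. An iterative scheme would repeat such a step, retraining between rounds.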
