On Lottery Tickets and Minimal Task Representations in Deep Reinforcement Learning

Abstract

The lottery ticket hypothesis questions the role of overparameterization in supervised deep learning. But how is the performance of winning lottery tickets affected by the distributional shift inherent to reinforcement learning problems? In this work, we address this question by comparing sparse agents who have to address the non-stationarity of the exploration-exploitation problem with supervised agents trained to imitate an expert. We show that feed-forward networks trained with behavioural cloning compared to reinforcement learning can be pruned to higher levels of sparsity without performance degradation. This suggests that in order to solve the RL-specific distributional shift agents require more degrees of freedom. Using a set of carefully designed baseline conditions, we find that the majority of the lottery ticket effect in both learning paradigms can be attributed to the identified mask rather than the weight initialization. The input layer mask selectively prunes entire input dimensions that turn out to be irrelevant for the task at hand. At a moderate level of sparsity the mask identified by iterative magnitude pruning yields minimal task-relevant representations, i.e., an interpretable inductive bias. Finally, we propose a simple initialization rescaling which promotes the robust identification of sparse task representations in low-dimensional control tasks.
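To make the mask-related claims concrete, the following is a minimal NumPy sketch of iterative magnitude pruning on a single input-layer weight matrix. It is not the authors' implementation. The function names (`magnitude_mask`, `iterative_magnitude_pruning`, `pruned_input_dims`, `toy_train_fn`), the `(n_inputs, n_hidden)` weight layout, the 25% pruning fraction, and the stand-in "training" step are all assumptions made for illustration. Each round rewinds to the initial weights, as in the usual lottery ticket recipe.

```python
import numpy as np


def magnitude_mask(weights, mask, prune_fraction):
    """Zero out the smallest-magnitude fraction of the weights that are still alive."""
    alive = np.flatnonzero(mask)
    n_prune = int(round(prune_fraction * alive.size))
    if n_prune == 0:
        return mask.copy()
    magnitudes = np.abs(weights.ravel()[alive])
    drop = alive[np.argsort(magnitudes)[:n_prune]]
    new_mask = mask.copy().ravel()
    new_mask[drop] = 0
    return new_mask.reshape(mask.shape)


def iterative_magnitude_pruning(init_weights, train_fn, rounds, prune_fraction=0.25):
    """Alternate between training the masked network and pruning by magnitude."""
    mask = np.ones_like(init_weights)
    for _ in range(rounds):
        # Rewind: every round restarts from the same initial weights.
        trained = train_fn(init_weights * mask, mask)
        mask = magnitude_mask(trained, mask, prune_fraction)
    return mask


def pruned_input_dims(input_mask):
    """Input dimensions whose outgoing weights are all masked.

    Assumes the mask has shape (n_inputs, n_hidden).
    """
    return np.flatnonzero(input_mask.sum(axis=1) == 0)


# Toy stand-in for training (hypothetical): it ignores the initial weights
# and returns fixed "trained" weights in which input dimension 2 is nearly
# irrelevant, i.e. its outgoing weights have very small magnitude.
target = np.arange(1.0, 21.0).reshape(4, 5)
target[2] *= 1e-3


def toy_train_fn(masked_init, mask):
    return target * mask


init = np.random.default_rng(0).normal(size=(4, 5))
mask = iterative_magnitude_pruning(init, toy_train_fn, rounds=2)
print("remaining weights:", int(mask.sum()))
print("fully pruned input dims:", pruned_input_dims(mask).tolist())
```

In this toy setup the only fully masked input row is index 2, which mirrors the observation that the input-layer mask removes entire task-irrelevant input dimensions. A real experiment would replace `toy_train_fn` with behavioural cloning or reinforcement learning of the masked network.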
