The lottery ticket hypothesis questions the role of overparameterization in supervised deep learning. But how is the performance of winning lottery tickets affected by the distributional shift inherent to reinforcement learning problems? In this work, we address this question by comparing sparse agents who have to address the non-stationarity of the exploration-exploitation problem with supervised agents trained to imitate an expert. We show that feed-forward networks trained via reinforcement learning and imitation learning can be pruned to the same level of sparsity, suggesting that the distributional shift has a limited impact on the size of winning tickets. Using a set of carefully designed baseline conditions, we find that the majority of the lottery ticket effect in both learning paradigms can be attributed to the identified mask rather than the weight initialization. The input layer mask selectively prunes entire input dimensions that turn out to be irrelevant for the task at hand. At a moderate level of sparsity, the mask identified by iterative magnitude pruning yields minimal task-relevant representations, i.e., an interpretable inductive bias. Finally, we propose a simple initialization rescaling which promotes the robust identification of sparse task representations in low-dimensional control tasks.
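
For readers unfamiliar with iterative magnitude pruning, the following is a minimal NumPy sketch of the general technique applied to a single input-layer weight matrix, not the implementation used in this work. The function names, the `train_fn` callback, and the `(n_inputs, n_hidden)` weight layout are assumptions made for illustration. It shows how a mask is built up over train-prune-rewind rounds and how fully pruned input dimensions can be read off the resulting input-layer mask.

```python
import numpy as np


def magnitude_prune(weights, mask, prune_fraction):
    """Drop the smallest-magnitude fraction of the weights still active in mask."""
    active_idx = np.flatnonzero(mask)
    n_prune = int(prune_fraction * active_idx.size)
    magnitudes = np.abs(weights.ravel()[active_idx])
    # Flat indices of the n_prune active weights with the smallest magnitude.
    drop = active_idx[np.argsort(magnitudes)[:n_prune]]
    new_mask = mask.copy().ravel()
    new_mask[drop] = False
    return new_mask.reshape(mask.shape)


def iterative_magnitude_pruning(init_weights, train_fn, n_rounds, prune_fraction):
    """Alternate training and pruning, rewinding to the initial weights each round.

    train_fn is a hypothetical callback (masked_init_weights, mask) -> trained
    weights; it stands in for a reinforcement or imitation learning run.
    """
    mask = np.ones_like(init_weights, dtype=bool)
    for _ in range(n_rounds):
        trained = train_fn(init_weights * mask, mask)
        mask = magnitude_prune(trained, mask, prune_fraction)
    # The candidate "ticket" is the mask applied to the original initialization.
    return mask, init_weights * mask


def pruned_input_dims(input_mask):
    """Indices of input dimensions whose outgoing connections are all pruned.

    Assumes input_mask has shape (n_inputs, n_hidden).
    """
    return np.flatnonzero(~input_mask.any(axis=1))
```

Calling `pruned_input_dims` on the input-layer mask returned by `iterative_magnitude_pruning` lists the input dimensions that the mask has removed entirely, which is the kind of inspection the abstract refers to when it describes the mask as an interpretable inductive bias.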