Abstract
The performance of modern reinforcement learning algorithms critically relies on tuning ever-increasing numbers of hyperparameters. Often, small changes in a hyperparameter can lead to drastic changes in performance, and different environments require very different hyperparameter settings to achieve state-of-the-art performance reported in the literature. We currently lack a scalable and widely accepted approach to characterizing these complex interactions. This work proposes a new empirical methodology for studying, comparing, and quantifying the sensitivity of an algorithm's performance to hyperparameter tuning for a given set of environments. We then demonstrate the utility of this methodology by assessing the hyperparameter sensitivity of several commonly used normalization variants of PPO. The results suggest that several algorithmic performance improvements may, in fact, be a result of an increased reliance on hyperparameter tuning.