The Impact of Quantization and Pruning on Deep Reinforcement Learning Models

Abstract

Deep reinforcement learning (DRL) has achieved remarkable success across various domains, such as video games, robotics, and, recently, large language models. However, the computational costs and memory requirements of DRL models often limit their deployment in resource-constrained environments. This challenge underscores the urgent need to explore neural network compression methods to make DRL models more practical and broadly applicable. Our study investigates the impact of two prominent compression methods, quantization and pruning, on DRL models. We examine how these techniques influence four performance factors: average return, memory, inference time, and battery utilization, across various DRL algorithms and environments. We find that, although these compression techniques reduce model size, they generally do not improve the energy efficiency of DRL models. We provide insights into the trade-offs between model compression and DRL performance, offering guidelines for deploying efficient DRL models in resource-constrained settings.
