Abstract
Multi-task reinforcement learning (MTRL) aims to endow a single agent with the ability to perform well on multiple tasks. Recent works have focused on developing novel sophisticated architectures to improve performance, often resulting in larger models; it is unclear, however, whether the performance gains are a consequence of the architecture design itself or the extra parameters. We argue that gains are mostly due to scale by demonstrating that naively scaling up a simple MTRL baseline to match parameter counts outperforms the more sophisticated architectures, and these gains benefit most from scaling the critic over the actor. Additionally, we explore the training stability advantages that come with task diversity, demonstrating that increasing the number of tasks can help mitigate plasticity loss. Our findings suggest that MTRL's simultaneous training across multiple tasks provides a natural framework for beneficial parameter scaling in reinforcement learning, challenging the need for complex architectural innovations.