Abstract
Fifth-generation (5G) New Radio (NR) cellular networks support a wide range of new services, many of which require an application-specific quality of service (QoS), e.g., in terms of a guaranteed minimum bit rate or a maximum tolerable delay. Therefore, scheduling multiple parallel data flows, each serving a unique application instance, is bound to become an even more challenging task than in previous generations. Leveraging recent advances in deep reinforcement learning, in this paper we propose a QoS-Aware Deep Reinforcement learning Agent (QADRA) scheduler for NR networks. In contrast to state-of-the-art scheduling heuristics, the QADRA scheduler explicitly optimizes for the QoS satisfaction rate while simultaneously maximizing network performance. Moreover, we train our algorithm end-to-end on these objectives. We evaluate QADRA in a full-scale, near-product, system-level NR simulator and demonstrate a significant boost in network performance. In our particular evaluation scenario, the QADRA scheduler improves network throughput by 30% compared to state-of-the-art baselines, while simultaneously maintaining the QoS satisfaction rate of VoIP users served by the network.