SafeSlice: Enabling SLA-Compliant O-RAN Slicing via Safe Deep Reinforcement Learning

Abstract

Deep reinforcement learning (DRL)-based slicing policies have shown significant success in simulated environments but face challenges in physical systems such as open radio access networks (O-RANs) due to simulation-to-reality gaps. These policies often lack safety guarantees to ensure compliance with service level agreements (SLAs), such as the strict latency requirements of immersive applications. As a result, a deployed DRL slicing agent may make resource allocation (RA) decisions that degrade system performance, particularly in previously unseen scenarios. Real-world immersive applications require maintaining SLA constraints throughout deployment to prevent risky DRL exploration. In this paper, we propose SafeSlice to address both the cumulative (trajectory-wise) and instantaneous (state-wise) latency constraints of O-RAN slices. We incorporate the cumulative constraints by designing a sigmoid-based risk-sensitive reward function that reflects the slices' latency requirements. Moreover, we build a supervised learning cost model as part of a safety layer that projects the slicing agent's RA actions to the nearest safe actions, fulfilling instantaneous constraints. We conduct an exhaustive experiment that supports multiple services, including real virtual reality (VR) gaming traffic, to investigate the performance of SafeSlice under extreme and changing deployment conditions. SafeSlice achieves reductions of up to 83.23% in average cumulative latency, 93.24% in instantaneous latency violations, and 22.13% in resource consumption compared to the baselines. The results also indicate SafeSlice's robustness to changing the threshold configurations of latency constraints, a vital deployment scenario that will be realized by the O-RAN paradigm to empower mobile network operators (MNOs).
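
The two mechanisms named in the abstract can be illustrated with a minimal sketch. It is not the paper's formulation: the exact reward shape, the form of the cost model, and the projection method are not given in the abstract, so everything below is an assumption made for illustration. In particular, the function names, the `steepness` parameter, the scalar action (a number of resource blocks for a single slice), the finite candidate set, and the toy latency model `200 / prbs` are all hypothetical.

```python
import math


def sigmoid_latency_reward(latency_ms, threshold_ms, steepness=0.5):
    """Sigmoid-shaped reward (assumed form): close to 1 when latency is well
    below the threshold, exactly 0.5 at the threshold, close to 0 above it."""
    z = steepness * (latency_ms - threshold_ms)
    z = max(-60.0, min(60.0, z))  # clamp the exponent to avoid overflow
    return 1.0 / (1.0 + math.exp(z))


def project_to_safe_action(proposed, candidates, predict_latency, threshold_ms):
    """Return the candidate nearest to `proposed` whose predicted latency
    satisfies the threshold. If no candidate is predicted safe, fall back to
    the candidate with the lowest predicted latency (an assumed fallback)."""
    safe = [a for a in candidates if predict_latency(a) <= threshold_ms]
    if not safe:
        return min(candidates, key=predict_latency)
    return min(safe, key=lambda a: abs(a - proposed))


def toy_cost_model(prbs):
    """Hypothetical stand-in for a learned cost model: predicted latency in
    milliseconds falls as more resource blocks are allocated."""
    return 200.0 / prbs


candidates = list(range(1, 51))  # assumed action space: 1 to 50 resource blocks

# The agent proposes 6 blocks, which the toy model predicts would violate a
# 20 ms threshold, so the safety layer moves it to the nearest safe allocation.
safe_action = project_to_safe_action(6, candidates, toy_cost_model, 20.0)
reward = sigmoid_latency_reward(toy_cost_model(safe_action), 20.0)
```

Under these assumptions, a proposed action that is already predicted safe passes through unchanged, so the safety layer only intervenes when the cost model predicts a violation of the instantaneous constraint.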
