Handoff Design in User-Centric Cell-Free Massive MIMO Networks Using DRL

Abstract

In the user-centric cell-free massive MIMO (UC-mMIMO) network scheme, user mobility necessitates updating the set of serving access points to maintain the user-centric clustering. Such updates are typically performed through handoff (HO) operations; however, frequent HOs lead to overheads associated with the allocation and release of resources. This paper presents a deep reinforcement learning (DRL)-based solution to predict and manage these connections for mobile users. Our solution employs the Soft Actor-Critic algorithm, with a continuous action space representation, to train a deep neural network to serve as the HO policy. We propose a novel reward function that integrates a HO penalty in order to balance the attainable rate against the HO-related overhead. We develop two variants of our system: the first uses mobility direction-assisted (DA) observations that are based on the user movement pattern, while the second uses history-assisted (HA) observations that are based on the history of the large-scale fading (LSF). Simulation results show that our DRL-based continuous action space approach is more scalable than its discrete action space counterpart, and that our derived HO policy automatically learns to gather HOs in specific time slots to minimize the overhead of initiating HOs. Our solution can also operate in real time, with a response time of less than 0.4 ms.
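The rate-versus-overhead trade-off that the reward function targets can be illustrated with a minimal sketch. The abstract does not give the actual reward expression, so everything below is an assumption made for illustration: the linear form, the function names `handoff_count` and `reward`, the `ho_penalty` weight, and the choice to count each added or released access point as one HO.

```python
def handoff_count(prev_serving, new_serving):
    """Count the access points added to or released from the serving set.

    Assumption: every added or released AP counts as one handoff.
    """
    return len(set(prev_serving) ^ set(new_serving))


def reward(rate, prev_serving, new_serving, ho_penalty=0.5):
    """Hypothetical per-slot reward: attainable rate minus a handoff penalty.

    `ho_penalty` is an illustrative weight, not a value from the paper.
    """
    return rate - ho_penalty * handoff_count(prev_serving, new_serving)


# Keeping the same serving set incurs no penalty.
r_keep = reward(10.0, {1, 2, 3}, {1, 2, 3})

# Swapping AP 1 for AP 4 counts as two changes (one release, one addition).
r_swap = reward(10.0, {1, 2, 3}, {2, 3, 4})
```

Under this assumed form, a larger `ho_penalty` pushes a learned policy toward fewer serving-set changes at the cost of some rate, which is the balance described above.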
