Dual-Mind World Models: A General Framework for Learning in Dynamic Wireless Networks

Abstract

Despite the popularity of reinforcement learning (RL) in wireless networks, existing approaches that rely on model-free RL (MFRL) and model-based RL (MBRL) are data-inefficient and short-sighted. Such RL-based solutions cannot generalize to novel network states since they capture only statistical patterns rather than the underlying physics and logic from wireless data. These limitations become particularly challenging in complex wireless networks with high dynamics and long-term planning requirements. To address these limitations, in this paper, a novel dual-mind world model-based learning framework is proposed with the goal of optimizing completeness-weighted age of information (CAoI) in a challenging mmWave V2X scenario. Inspired by cognitive psychology, the proposed dual-mind world model encompasses a pattern-driven System 1 component and a logic-driven System 2 component to learn the dynamics and logic of the wireless network, and to provide long-term link scheduling over reliable imagined trajectories. Link scheduling is learned through end-to-end differentiable imagined trajectories with logical consistency over an extended horizon rather than relying on wireless data obtained from environment interactions. Moreover, through imagination rollouts, the proposed world model can jointly reason about network states and plan link scheduling. During intervals without observations, the proposed method remains capable of making efficient decisions. Extensive experiments are conducted on a realistic simulator based on Sionna with real-world physical channel, ray-tracing, and scene objects with material properties. Simulation results show that the proposed world model achieves a significant improvement in data efficiency and achieves strong generalization and adaptation to unseen environments, compared to the state-of-the-art RL baselines and the world model approach with only System 1.
