Abstract
Root Cause Analysis (RCA) in mobile networks remains a challenging task due to the need for interpretability, domain expertise, and causal reasoning. In this work, we propose a lightweight framework that leverages Large Language Models (LLMs) for RCA. To do so, we introduce TeleLogs, a curated dataset of annotated troubleshooting problems designed to benchmark RCA capabilities. Our evaluation reveals that existing open-source reasoning LLMs struggle with these problems, underscoring the need for domain-specific adaptation. To address this issue, we propose a two-stage training methodology that combines supervised fine-tuning with reinforcement learning to improve the accuracy and reasoning quality of LLMs. The proposed approach fine-tunes a series of RCA models to integrate domain knowledge and generate structured, multi-step diagnostic explanations, improving both interpretability and effectiveness. Extensive experiments across multiple LLM sizes show significant performance gains over state-of-the-art reasoning and non-reasoning models, including strong generalization to randomized test variants. These results demonstrate the promise of domain-adapted, reasoning-enhanced LLMs for practical and explainable RCA in network operation and management.