Abstract
Recent advances in large language models (LLMs) have enabled progress in agentic coding, where models autonomously reason, plan, and act within interactive software development workflows. However, bridging the gap between static text-based training and dynamic real-world agentic execution remains a core challenge. In this technical report, we present KAT-Coder, a large-scale agentic code model trained through a multi-stage curriculum encompassing Mid-Term Training, Supervised Fine-Tuning (SFT), Reinforcement Fine-Tuning (RFT), and Reinforcement-to-Deployment Adaptation. The Mid-Term stage enhances reasoning, planning, and reflection capabilities through a corpus of real software engineering data and synthetic agentic interactions. The SFT stage constructs a million-sample dataset balancing twenty programming languages, ten development contexts, and ten task archetypes. The RFT stage introduces a novel multi-ground-truth reward formulation for stable and sample-efficient policy optimization. Finally, the Reinforcement-to-Deployment phase adapts the model to production-grade IDE environments using Error-Masked SFT and Tree-Structured Trajectory Training. In summary, these stages enable KAT-Coder to achieve robust tool-use reliability, instruction alignment, and long-context reasoning, forming a deployable foundation for real-world intelligent coding agents. Our KAT series 32B model, KAT-Dev, has been open-sourced on https://huggingface.co/Kwaipilot/KAT-Dev.