Scaling Agents via Continual Pre-training

Abstract

Large language models (LLMs) have evolved into agentic systems capable of autonomous tool use and multi-step reasoning for complex problem-solving. However, post-training approaches building upon general-purpose foundation models consistently underperform in agentic tasks, particularly in open-source implementations. We identify the root cause: the absence of robust agentic foundation models forces models during post-training to simultaneously learn diverse agentic behaviors while aligning them to expert demonstrations, thereby creating fundamental optimization tensions. To this end, we are the first to propose incorporating Agentic Continual Pre-training (Agentic CPT) into the deep research agent training pipeline to build powerful agentic foundation models. Based on this approach, we develop a deep research agent model named AgentFounder. We evaluate our AgentFounder-30B on 10 benchmarks and achieve state-of-the-art performance while retaining strong tool-use ability, notably 39.9% on BrowseComp-en, 43.3% on BrowseComp-zh, and 31.5% Pass@1 on HLE.
