DeepAgent: A General Reasoning Agent with Scalable Toolsets

Abstract

Large reasoning models have demonstrated strong problem-solving abilities, yet real-world tasks often require external tools and long-horizon interactions. Existing agent frameworks typically follow predefined workflows, which limit autonomous and global task completion. In this paper, we introduce DeepAgent, an end-to-end deep reasoning agent that performs autonomous thinking, tool discovery, and action execution within a single, coherent reasoning process. To address the challenges of long-horizon interactions, particularly the context length explosion from multiple tool calls and the accumulation of interaction history, we introduce an autonomous memory folding mechanism that compresses past interactions into structured episodic, working, and tool memories, reducing error accumulation while preserving critical information. To teach general-purpose tool use efficiently and stably, we develop an end-to-end reinforcement learning strategy, namely ToolPO, that leverages LLM-simulated APIs and applies tool-call advantage attribution to assign fine-grained credit to the tool invocation tokens. Extensive experiments on eight benchmarks, including general tool-use tasks (ToolBench, API-Bank, TMDB, Spotify, ToolHop) and downstream applications (ALFWorld, WebShop, GAIA, HLE), demonstrate that DeepAgent consistently outperforms baselines across both labeled-tool and open-set tool retrieval scenarios. This work takes a step toward more general and capable agents for real-world applications. The code and demo are available at https://github.com/RUC-NLPIR/DeepAgent.
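
The idea of assigning fine-grained credit to tool invocation tokens can be illustrated with a minimal sketch. It is not taken from the DeepAgent code base: the function name `attribute_advantages`, its parameters, and the additive combination of a trajectory-level advantage with a tool-call advantage are all assumptions made for illustration. The abstract states only that credit is attributed to the tool invocation tokens.

```python
from typing import List


def attribute_advantages(
    tool_call_mask: List[bool],
    task_advantage: float,
    tool_call_advantage: float,
) -> List[float]:
    """Return one advantage value per generated token.

    Assumption (hypothetical, for illustration only): every token shares the
    trajectory-level task advantage, and tokens flagged as part of a tool
    invocation additionally receive the tool-call advantage.
    """
    return [
        task_advantage + (tool_call_advantage if is_tool_token else 0.0)
        for is_tool_token in tool_call_mask
    ]


# Five generated tokens; tokens 2 and 3 form the tool invocation.
mask = [False, False, True, True, False]
per_token = attribute_advantages(mask, task_advantage=0.5, tool_call_advantage=0.25)
print(per_token)
```

Under these assumptions, only the masked tokens receive the extra tool-call signal, so a correct tool invocation can be rewarded without spreading that credit over the surrounding reasoning tokens.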
