WebDancer: Towards Autonomous Information Seeking Agency

Abstract

Addressing intricate real-world problems necessitates in-depth information seeking and multi-step reasoning. Recent progress in agentic systems, exemplified by Deep Research, underscores the potential for autonomous multi-step research. In this work, we present a cohesive paradigm for building end-to-end information-seeking agents from a data-centric and training-stage perspective. Our approach consists of four key stages: (1) browsing data construction, (2) trajectory sampling, (3) supervised fine-tuning for an effective cold start, and (4) reinforcement learning for enhanced generalisation. We instantiate this framework in WebDancer, a web agent built on ReAct. Empirical evaluations on the challenging information-seeking benchmarks GAIA and WebWalkerQA demonstrate the strong performance of WebDancer and highlight the efficacy of our training paradigm. Further analysis of agent training provides valuable insights and actionable, systematic pathways for developing more capable agentic models. The code and demo will be released at https://github.com/Alibaba-NLP/WebAgent.
