RAG-Gym: Optimizing Reasoning and Search Agents with Process Supervision

Abstract

Retrieval-augmented generation (RAG) has shown great potential for knowledge-intensive tasks, but its traditional architectures rely on static retrieval, limiting their effectiveness for complex questions that require sequential information-seeking. While agentic reasoning and search offer a more adaptive approach, most existing methods depend heavily on prompt engineering. In this work, we introduce RAG-Gym, a unified optimization framework that enhances information-seeking agents through fine-grained process supervision at each search step. We also propose ReSearch, a novel agent architecture that synergizes answer reasoning and search query generation within the RAG-Gym framework. Experiments on four challenging datasets show that RAG-Gym improves performance by up to 25.6% across various agent architectures, with ReSearch consistently outperforming existing baselines. Further analysis highlights the effectiveness of advanced LLMs as process reward judges and the transferability of trained reward models as verifiers for different LLMs. Additionally, we examine the scaling properties of training and inference in agentic RAG. The project homepage is available at https://rag-gym.github.io/.
