AgentThink: A Unified Framework for Tool-Augmented Chain-of-Thought Reasoning in Vision-Language Models for Autonomous Driving

  • 2025-05-22 13:56:52
  • Kangan Qian, Sicong Jiang, Yang Zhong, Ziang Luo, Zilin Huang, Tianze Zhu, Kun Jiang, Mengmeng Yang, Zheng Fu, Jinyu Miao, Yining Shi, He Zhe Lim, Li Liu, Tianbao Zhou, Hongyi Wang, Huang Yu, Yifei Hu, Guang Li, Guang Chen, Hao Ye, Lijun Sun, Diange Yang

Abstract

Vision-Language Models (VLMs) show promise for autonomous driving, yet their struggle with hallucinations, inefficient reasoning, and limited real-world validation hinders accurate perception and robust step-by-step reasoning. To overcome this, we introduce AgentThink, a pioneering unified framework that, for the first time, integrates Chain-of-Thought (CoT) reasoning with dynamic, agent-style tool invocation for autonomous driving tasks. AgentThink's core innovations include: (i) Structured Data Generation, by establishing an autonomous driving tool library to automatically construct structured, self-verified reasoning data explicitly incorporating tool usage for diverse driving scenarios; (ii) A Two-stage Training Pipeline, employing Supervised Fine-Tuning (SFT) with Group Relative Policy Optimization (GRPO) to equip VLMs with the capability for autonomous tool invocation; and (iii) Agent-style Tool-Usage Evaluation, introducing a novel multi-tool assessment protocol to rigorously evaluate the model's tool invocation and utilization. Experiments on the DriveLMM-o1 benchmark demonstrate AgentThink significantly boosts overall reasoning scores by 53.91% and enhances answer accuracy by 33.54%, while markedly improving reasoning quality and consistency. Furthermore, ablation studies and robust zero-shot/few-shot generalization experiments across various benchmarks underscore its powerful capabilities. These findings highlight a promising trajectory for developing trustworthy and tool-aware autonomous driving models.
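
For readers unfamiliar with GRPO, the sketch below illustrates its central idea as generally described: several responses are sampled for the same prompt, and each response's reward is normalized against the mean and standard deviation of its own group, which removes the need for a separate value model. This is not AgentThink's implementation. The function name, the reward values, the use of the population standard deviation, and the `eps` stabilizer are all illustrative assumptions.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against the statistics of its own sampling group.

    Assumption: population standard deviation plus a small eps to avoid
    division by zero; actual GRPO implementations may differ in these details.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four responses sampled for one driving question.
advantages = group_relative_advantages([1.0, 0.0, 0.5, 0.5])
```

Responses that score above the group mean receive a positive advantage and are reinforced, while those below the mean are discouraged. A response that matches the mean contributes no gradient signal.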

 
