AgenticIQA: An Agentic Framework for Adaptive and Interpretable Image Quality Assessment

Abstract

Image quality assessment (IQA) is inherently complex, as it reflects both the quantification and interpretation of perceptual quality rooted in the human visual system. Conventional approaches typically rely on fixed models to output scalar scores, limiting their adaptability to diverse distortions, user-specific queries, and interpretability needs. Furthermore, scoring and interpretation are often treated as independent processes, despite their interdependence: interpretation identifies perceptual degradations, while scoring abstracts them into a compact metric. To address these limitations, we propose AgenticIQA, a modular agentic framework that integrates vision-language models (VLMs) with traditional IQA tools in a dynamic, query-aware manner. AgenticIQA decomposes IQA into four subtasks -- distortion detection, distortion analysis, tool selection, and tool execution -- coordinated by a planner, executor, and summarizer. The planner formulates task-specific strategies, the executor collects perceptual evidence via tool invocation, and the summarizer integrates this evidence to produce accurate scores with human-aligned explanations. To support training and evaluation, we introduce AgenticIQA-200K, a large-scale instruction dataset tailored for IQA agents, and AgenticIQA-Eval, the first benchmark for assessing the planning, execution, and summarization capabilities of VLM-based IQA agents. Extensive experiments across diverse IQA datasets demonstrate that AgenticIQA consistently surpasses strong baselines in both scoring accuracy and explanatory alignment.
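As a rough illustration of the planner, executor, and summarizer roles described above, the following is a minimal sketch and not the authors' implementation. Everything in it is an assumption made for illustration: the function names, the `Plan` structure, the keyword-matching planner that stands in for a VLM, and the two toy "tools" that stand in for real IQA metrics. The image is represented as a flat list of pixel intensities in [0, 1].

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Pixels = List[float]  # hypothetical image representation: intensities in [0, 1]


def contrast_tool(pixels: Pixels) -> float:
    """Toy stand-in for an IQA tool: dynamic range of the intensities."""
    return max(pixels) - min(pixels)


def brightness_tool(pixels: Pixels) -> float:
    """Toy stand-in for an IQA tool: 1.0 when mean intensity is mid-gray."""
    mean = sum(pixels) / len(pixels)
    return 1.0 - abs(mean - 0.5) * 2.0


# Hypothetical tool registry keyed by the aspect each tool measures.
TOOLS: Dict[str, Callable[[Pixels], float]] = {
    "contrast": contrast_tool,
    "brightness": brightness_tool,
}


@dataclass
class Plan:
    query: str
    aspects: List[str]  # aspects the planner decided to examine
    tools: List[str]    # tools selected to measure those aspects


def planner(query: str) -> Plan:
    """Build a query-specific plan. A real agent would prompt a VLM here;
    this placeholder only matches aspect names mentioned in the query."""
    aspects = [name for name in TOOLS if name in query.lower()]
    if not aspects:
        aspects = list(TOOLS)  # generic query: examine every known aspect
    return Plan(query=query, aspects=aspects, tools=list(aspects))


def executor(plan: Plan, pixels: Pixels) -> Dict[str, float]:
    """Collect evidence by invoking each tool the plan selected."""
    return {name: TOOLS[name](pixels) for name in plan.tools}


def summarizer(plan: Plan, evidence: Dict[str, float]) -> Tuple[float, str]:
    """Fuse the evidence into one score plus a short textual explanation."""
    score = sum(evidence.values()) / len(evidence)
    weakest = min(evidence, key=evidence.get)
    explanation = (
        f"Overall score {score:.2f} from {len(evidence)} tool(s); "
        f"weakest aspect: {weakest} ({evidence[weakest]:.2f})."
    )
    return score, explanation


def assess(query: str, pixels: Pixels) -> Tuple[float, str]:
    """Run the three stages in order: plan, execute, summarize."""
    plan = planner(query)
    evidence = executor(plan, pixels)
    return summarizer(plan, evidence)
```

Calling `assess("How is the contrast?", [0.2, 0.4, 0.6, 0.8])` runs only the contrast tool, whereas a generic query such as `"Rate this image"` runs both. The point of the sketch is the separation of concerns: the plan decides what to measure, the executor only measures, and the summarizer only interprets, so scoring and explanation draw on the same evidence.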
