Abstract
Document quality assessment is critical for a wide range of applications, including document digitization, OCR, and archival. However, existing approaches often struggle to provide accurate and robust quality scores, limiting their applicability in practical scenarios. With the rapid progress in Multi-modal Large Language Models (MLLMs), recent MLLM-based methods have achieved remarkable performance in image quality assessment. In this work, we extend this success to the document domain by adapting DeQA-Score, a state-of-the-art MLLM-based image quality scorer, for document quality assessment. We propose DeQA-Doc, a framework that leverages the vision-language capabilities of MLLMs and a soft label strategy to regress continuous document quality scores. To adapt DeQA-Score to DeQA-Doc, we adopt two complementary solutions to construct soft labels without variance information. We also relax the resolution constraints to support the high resolution of document images. Finally, we introduce ensemble methods to further enhance performance. Extensive experiments demonstrate that DeQA-Doc significantly outperforms existing baselines, offering accurate and generalizable document quality assessment across diverse degradation types. Code and model weights are available at https://github.com/Junjie-Gao19/DeQA-Doc.
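
As a rough illustration of what a soft label without variance information can look like, the sketch below spreads a continuous score over its two nearest discrete quality levels so that the expected value of the label equals the score. The five levels on a 1 to 5 scale, the linear-interpolation rule, and the function names `soft_label_from_score` and `score_from_probs` are assumptions made for this example; they are not taken from the DeQA-Doc code and may differ from the two solutions the abstract refers to.

```python
import numpy as np

# Assumed: five discrete quality levels on a 1-5 scale (hypothetical setup).
LEVELS = np.array([1.0, 2.0, 3.0, 4.0, 5.0])


def soft_label_from_score(score: float) -> np.ndarray:
    """Build a soft label over LEVELS whose expectation equals the score.

    Only the mean score is used (no variance): the probability mass is
    split linearly between the two levels adjacent to the score.
    """
    s = float(np.clip(score, LEVELS[0], LEVELS[-1]))
    # Index of the lower neighbouring level, capped so lo + 1 stays valid.
    lo = min(int(np.floor(s)) - 1, len(LEVELS) - 2)
    w_hi = s - LEVELS[lo]  # weight assigned to the upper level
    label = np.zeros(len(LEVELS))
    label[lo] = 1.0 - w_hi
    label[lo + 1] = w_hi
    return label


def score_from_probs(probs: np.ndarray) -> float:
    """Recover a continuous score as the expectation over the levels."""
    return float(np.dot(probs, LEVELS))


if __name__ == "__main__":
    label = soft_label_from_score(3.7)
    print(label, score_from_probs(label))
```

Under these assumptions, a model trained to predict such a distribution over level tokens can be read out as a continuous score by taking the same expectation over its predicted probabilities.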