Abstract
While text-to-image models offer numerous benefits, they also pose significant societal risks. Detecting AI-generated images is crucial for mitigating these risks. Detection methods can be broadly categorized into passive and watermark-based approaches: passive detectors rely on artifacts present in AI-generated images, whereas watermark-based detectors proactively embed watermarks into such images. A key question is which type of detector performs better in terms of effectiveness, robustness, and efficiency. However, the current literature lacks a comprehensive understanding of this issue. In this work, we aim to bridge that gap by developing ImageDetectBench, the first comprehensive benchmark to compare the effectiveness, robustness, and efficiency of passive and watermark-based detectors. Our benchmark includes four datasets, each containing a mix of AI-generated and non-AI-generated images. We evaluate five passive detectors and four watermark-based detectors against eight types of common perturbations and three types of adversarial perturbations. Our benchmark results reveal several interesting findings. For instance, watermark-based detectors consistently outperform passive detectors, both in the presence and absence of perturbations. Based on these insights, we provide recommendations for detecting AI-generated images, e.g., when both types of detectors are applicable, watermark-based detectors should be the preferred choice.