RadGPT: Constructing 3D Image-Text Tumor Datasets

  • 2025-08-19 17:05:14
  • Pedro R. A. S. Bassi, Mehmet Can Yavuz, Kang Wang, Xiaoxi Chen, Wenxuan Li, Sergio Decherchi, Andrea Cavalli, Yang Yang, Alan Yuille, Zongwei Zhou

Abstract

Cancers identified in CT scans are usually accompanied by detailed radiology reports, but publicly available CT datasets often lack these essential reports. This absence limits their usefulness for developing accurate report generation AI. To address this gap, we present AbdomenAtlas 3.0, the first public, high-quality abdominal CT dataset with detailed, expert-reviewed radiology reports. All reports are paired with per-voxel masks and they describe liver, kidney and pancreatic tumors. AbdomenAtlas 3.0 has 9,262 triplets of CT, mask and report, 3,955 of them with tumors. These CT scans come from 17 public datasets. Besides creating the reports for these datasets, we expanded their number of tumor masks by 4.2x, identifying 3,011 new tumor cases. Notably, the reports in AbdomenAtlas 3.0 are more standardized, and generated faster than traditional human-made reports. They provide details like tumor size, location, attenuation and surgical resectability. These reports were created by 12 board-certified radiologists using our proposed RadGPT, a novel framework that converted radiologist-revised tumor segmentation masks into structured and narrative reports. Besides being a dataset creation tool, RadGPT can also become a fully-automatic, segmentation-assisted report generation method. We benchmarked this method and 5 state-of-the-art report generation vision-language models. Our results show that segmentation strongly improves tumor detection in AI-made reports.

 
