Incorporating Clinical Guidelines through Adapting Multi-modal Large Language Model for Prostate Cancer PI-RADS Scoring

  • 2024-05-14 18:35:27
  • Tiantian Zhang, Manxi Lin, Hongda Guo, Xiaofan Zhang, Ka Fung Peter Chiu, Aasa Feragen, Qi Dou

Abstract

The Prostate Imaging Reporting and Data System (PI-RADS) is pivotal in the diagnosis of clinically significant prostate cancer through MRI imaging. Current deep learning-based PI-RADS scoring methods often lack the incorporation of essential PI-RADS clinical guidelines (PICG) utilized by radiologists, potentially compromising scoring accuracy. This paper introduces a novel approach that adapts a multi-modal large language model (MLLM) to incorporate PICG into PI-RADS scoring without additional annotations and network parameters. We present a two-stage fine-tuning process aimed at adapting MLLMs originally trained on natural images to the MRI data domain while effectively integrating the PICG. In the first stage, we develop a domain adapter layer specifically tailored for processing 3D MRI image inputs and design the MLLM instructions to differentiate MRI modalities effectively. In the second stage, we translate PICG into guiding instructions for the model to generate PICG-guided image features. Through feature distillation, we align scoring network features with the PICG-guided image feature, enabling the scoring network to effectively incorporate the PICG information. We develop our model on a public dataset and evaluate it in a real-world challenging in-house dataset. Experimental results demonstrate that our approach improves the performance of current scoring networks.
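
To make the feature-distillation step more concrete, the following is a minimal sketch of how a scoring network's features might be pulled toward PICG-guided features during training. The abstract does not state which distillation loss is used, so the cosine-distance form here is an assumption, as are the function names `cosine_distillation_loss` and `total_loss` and the `weight` parameter. The PICG-guided (teacher) features are treated as fixed targets, which is also an assumption.

```python
import math


def cosine_distillation_loss(student, teacher):
    """Return 1 - cosine similarity between two equal-length feature vectors.

    `student` stands in for a scoring-network feature and `teacher` for a
    PICG-guided image feature. The cosine form is an illustrative assumption,
    not the loss reported by the paper.
    """
    if len(student) != len(teacher):
        raise ValueError("feature vectors must have the same length")
    dot = sum(s * t for s, t in zip(student, teacher))
    norm_s = math.sqrt(sum(s * s for s in student))
    norm_t = math.sqrt(sum(t * t for t in teacher))
    if norm_s == 0.0 or norm_t == 0.0:
        raise ValueError("feature vectors must be non-zero")
    return 1.0 - dot / (norm_s * norm_t)


def total_loss(score_loss, student, teacher, weight=1.0):
    """Combine a scoring loss with the distillation term.

    `weight` is a hypothetical trade-off hyperparameter.
    """
    return score_loss + weight * cosine_distillation_loss(student, teacher)
```

In this sketch the distillation term is zero when the two feature vectors point in the same direction and grows as they diverge, so minimizing `total_loss` encourages the scoring network to encode the guideline-conditioned information while still fitting the PI-RADS labels.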

 
