MiniGPT-Med: Large Language Model as a General Interface for Radiology Diagnosis

Abstract

Recent advancements in artificial intelligence (AI) have precipitated significant breakthroughs in healthcare, particularly in refining diagnostic procedures. However, previous studies have often been constrained to limited functionalities. This study introduces MiniGPT-Med, a vision-language model derived from large-scale language models and tailored for medical applications. MiniGPT-Med demonstrates remarkable versatility across various imaging modalities, including X-rays, CT scans, and MRIs, enhancing its utility. The model is capable of performing tasks such as medical report generation, visual question answering (VQA), and disease identification within medical imagery. Its integrated processing of both image and textual clinical data markedly improves diagnostic accuracy. Our empirical assessments confirm MiniGPT-Med's superior performance in disease grounding, medical report generation, and VQA benchmarks, representing a significant step towards reducing the gap in assisting radiology practice. Furthermore, it achieves state-of-the-art performance on medical report generation, higher than the previous best model by 19% accuracy. MiniGPT-Med promises to become a general interface for radiology diagnoses, enhancing diagnostic efficiency across a wide range of medical imaging applications.
