Text Embedded Swin-UMamba for DeepLesion Segmentation

Abstract

Segmentation of lesions on CT enables automatic measurement for clinical assessment of chronic diseases (e.g., lymphoma). Integrating large language models (LLMs) into the lesion segmentation workflow offers the potential to combine imaging features with descriptions of lesion characteristics from the radiology reports. In this study, we investigate the feasibility of integrating text into the Swin-UMamba architecture for the task of lesion segmentation. The publicly available ULS23 DeepLesion dataset was used along with short-form descriptions of the findings from the reports. On the test dataset, a high Dice score of 82% and a low Hausdorff distance of 6.58 pixels were obtained for lesion segmentation. The proposed Text-Swin-UMamba model outperformed prior approaches: it achieved a 37% improvement over the LLM-driven LanGuideMedSeg model (p < 0.001) and surpassed the purely image-based xLSTM-UNet and nnUNet models by 1.74% and 0.22%, respectively. The dataset and code can be accessed at https://github.com/ruida/LLM-Swin-UMamba
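
For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how a Dice score and a Hausdorff distance can be computed for a pair of 2D binary masks. It is not taken from the authors' repository: the function names (`foreground`, `dice_score`, `hausdorff_distance`) and the toy masks are illustrative assumptions, and the distance here is taken over all foreground pixels, whereas evaluation code often uses only surface pixels or a 95th-percentile variant.

```python
import math


def foreground(mask):
    """Return the set of (row, col) coordinates where the binary mask is non-zero."""
    return {(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v}


def dice_score(pred, target):
    """Dice = 2 * |P ∩ T| / (|P| + |T|) over foreground pixels."""
    p, t = foreground(pred), foreground(target)
    if not p and not t:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * len(p & t) / (len(p) + len(t))


def hausdorff_distance(pred, target):
    """Symmetric Hausdorff distance (in pixels) between the two foreground sets."""
    p, t = foreground(pred), foreground(target)
    if not p or not t:
        return math.inf  # assumed convention: undefined when one mask is empty

    def directed(a, b):
        # Largest distance from any point in a to its nearest point in b.
        return max(min(math.dist(x, y) for y in b) for x in a)

    return max(directed(p, t), directed(t, p))


# Toy example (hypothetical masks, not from the dataset):
# the prediction is the target shifted one pixel to the left.
pred = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
target = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 0, 0],
]

print(dice_score(pred, target))          # 0.5
print(hausdorff_distance(pred, target))  # 1.0
```

A higher Dice score indicates greater overlap between prediction and ground truth, while a lower Hausdorff distance indicates that the worst-case boundary disagreement is small.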
