ControlMed: Adding Reasoning Control to Medical Language Model

Abstract

Reasoning Large Language Models (LLMs) with enhanced accuracy and explainability are increasingly being adopted in the medical domain, as the life-critical nature of clinical decision-making demands reliable support. Despite these advancements, existing reasoning LLMs often generate unnecessarily lengthy reasoning processes, leading to significant computational overhead and response latency. These limitations hinder their practical deployment in real-world clinical environments. To address these challenges, we introduce ControlMed, a medical language model that enables users to actively control the length of the reasoning process at inference time through fine-grained control markers. ControlMed is trained through a three-stage pipeline: 1) pre-training on a large-scale synthetic medical instruction dataset covering both direct and reasoning responses; 2) supervised fine-tuning with multi-length reasoning data and explicit length-control markers; and 3) reinforcement learning with model-based reward signals to enhance factual accuracy and response quality. Experimental results on a variety of English and Korean medical benchmarks demonstrate that our model achieves similar or better performance compared to state-of-the-art models. Furthermore, users can flexibly balance reasoning accuracy and computational efficiency by controlling the reasoning length as needed. These findings demonstrate that ControlMed is a practical and adaptable solution for clinical question answering and medical information analysis.
