Abstract
Multimodal large language models (MLLMs) hold considerable promise for applications in healthcare. However, their deployment in safety-critical settings is hindered by two key limitations: (i) sensitivity to prompt design, and (ii) a tendency to generate incorrect responses with high confidence. As clinicians may rely on a model's stated confidence to gauge the reliability of its predictions, it is especially important that when a model expresses high confidence, it is also highly accurate. We introduce Prompt4Trust, the first reinforcement learning (RL) framework for prompt augmentation targeting confidence calibration in MLLMs. A lightweight LLM is trained to produce context-aware auxiliary prompts that guide a downstream task MLLM to generate responses in which the expressed confidence more accurately reflects predictive accuracy. Unlike conventional calibration techniques, Prompt4Trust specifically prioritizes aspects of calibration most critical for safe and trustworthy clinical decision-making. Beyond improvements driven by this clinically motivated calibration objective, our proposed method also improves task accuracy, achieving state-of-the-art medical visual question answering (VQA) performance on the PMC-VQA benchmark, which is composed of multiple-choice questions spanning diverse medical imaging modalities. Moreover, our framework trained with a small downstream task MLLM showed promising zero-shot generalization to larger MLLMs in our experiments, suggesting the potential for scalable calibration without the associated computational costs. This work demonstrates the potential of automated yet human-aligned prompt engineering for improving the trustworthiness of MLLMs in safety-critical settings. Our codebase can be found at https://github.com/xingbpshen/prompt4trust.
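
To make the notion of calibration used above concrete, the following is a minimal Python sketch of two quantities one might inspect: the standard binned expected calibration error (ECE), and the accuracy restricted to high-confidence answers. It is an illustration only, not the reward or evaluation code of Prompt4Trust. The function names, the 10-bin default, and the 0.9 threshold are assumptions introduced here for the example.

```python
from typing import List, Optional, Sequence, Tuple


def expected_calibration_error(
    confidences: Sequence[float],
    correct: Sequence[bool],
    n_bins: int = 10,  # assumed default, not taken from the paper
) -> float:
    """Binned ECE: sample-weighted mean of |accuracy - mean confidence| per bin."""
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    total = len(confidences)
    if total == 0:
        return 0.0

    # Assign each prediction to an equal-width confidence bin over [0, 1].
    bins: List[List[Tuple[float, bool]]] = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(max(int(conf * n_bins), 0), n_bins - 1)
        bins[idx].append((conf, ok))

    ece = 0.0
    for items in bins:
        if not items:
            continue
        mean_conf = sum(c for c, _ in items) / len(items)
        accuracy = sum(1 for _, ok in items if ok) / len(items)
        ece += (len(items) / total) * abs(accuracy - mean_conf)
    return ece


def high_confidence_accuracy(
    confidences: Sequence[float],
    correct: Sequence[bool],
    threshold: float = 0.9,  # hypothetical cut-off for "high confidence"
) -> Optional[float]:
    """Accuracy among answers whose stated confidence is at least `threshold`."""
    selected = [ok for conf, ok in zip(confidences, correct) if conf >= threshold]
    if not selected:
        return None  # no high-confidence answers to evaluate
    return sum(1 for ok in selected if ok) / len(selected)


if __name__ == "__main__":
    # Toy data: stated confidences and whether each answer was correct.
    confs = [0.95, 0.95, 0.55, 0.55]
    hits = [True, False, True, False]
    print(expected_calibration_error(confs, hits))
    print(high_confidence_accuracy(confs, hits))
```

A model that is wrong half the time at 0.95 stated confidence has a low high-confidence accuracy even when its average ECE looks moderate, which is the kind of failure the abstract singles out as most harmful in clinical use.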