Abstract
Clinical trials are vital in advancing drug development and evidence-based medicine, but their success is often hindered by challenges in patient recruitment. In this work, we investigate the potential of large language models (LLMs) to assist individual patients and referral physicians in identifying suitable clinical trials from an extensive selection. Specifically, we introduce TrialGPT, a novel architecture employing LLMs to predict criterion-level eligibility with detailed explanations, which are then aggregated for ranking and excluding candidate clinical trials based on free-text patient notes. We evaluate TrialGPT on three publicly available cohorts of 184 patients and 18,238 annotated clinical trials. The experimental results demonstrate several key findings. First, TrialGPT achieves high criterion-level prediction accuracy with faithful explanations. Second, the aggregated trial-level TrialGPT scores are highly correlated with expert eligibility annotations. Third, these scores prove effective in ranking clinical trials and excluding ineligible candidates. Our error analysis suggests that current LLMs still make some mistakes due to limited medical knowledge and domain-specific context understanding. Nonetheless, we believe the explanatory capabilities of LLMs are highly valuable. Future research is warranted on how such AI assistants can be integrated into the routine trial matching workflow in real-world settings to improve its efficiency.
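
To make the aggregation idea concrete, the following is a minimal illustrative sketch, not the TrialGPT implementation. It assumes each criterion-level prediction carries a label of "met", "not met", or "unknown" together with an explanation, and it combines them with a simple hypothetical rule (fraction of inclusion criteria met minus fraction of exclusion criteria met). The names `CriterionPrediction`, `trial_score`, `rank_trials`, and the `exclude_below` threshold are all assumptions introduced for illustration; in a real system the predictions would come from an LLM rather than being hard-coded.

```python
from dataclasses import dataclass


@dataclass
class CriterionPrediction:
    """One criterion-level prediction (hypothetical structure)."""
    criterion: str
    kind: str         # "inclusion" or "exclusion"
    label: str        # "met", "not met", or "unknown" (assumed label set)
    explanation: str  # free-text rationale, e.g. produced by an LLM


def _fraction_met(predictions):
    """Fraction of predictions labeled "met"; 0.0 for an empty list."""
    if not predictions:
        return 0.0
    return sum(p.label == "met" for p in predictions) / len(predictions)


def trial_score(predictions):
    """Aggregate criterion-level predictions into one trial-level score.

    Assumed rule: reward met inclusion criteria, penalize met exclusion
    criteria. The score lies in [-1.0, 1.0].
    """
    inclusion = [p for p in predictions if p.kind == "inclusion"]
    exclusion = [p for p in predictions if p.kind == "exclusion"]
    return _fraction_met(inclusion) - _fraction_met(exclusion)


def rank_trials(trials, exclude_below=0.0):
    """Drop trials scoring below the threshold, then rank best first."""
    scored = [(trial_id, trial_score(preds)) for trial_id, preds in trials.items()]
    kept = [(trial_id, score) for trial_id, score in scored if score >= exclude_below]
    return sorted(kept, key=lambda item: item[1], reverse=True)


# Placeholder predictions for two hypothetical trials and one patient note.
example_trials = {
    "trial_A": [
        CriterionPrediction("Age 18 or older", "inclusion", "met", "Note states age 54."),
        CriterionPrediction("Type 2 diabetes", "inclusion", "met", "Diagnosis is documented."),
        CriterionPrediction("Pregnancy", "exclusion", "not met", "Patient is male."),
    ],
    "trial_B": [
        CriterionPrediction("Age 18 or older", "inclusion", "met", "Note states age 54."),
        CriterionPrediction("Prior chemotherapy", "inclusion", "not met", "No such history."),
        CriterionPrediction("Type 2 diabetes", "exclusion", "met", "Diagnosis is documented."),
    ],
}

ranking = rank_trials(example_trials)
```

Under this assumed rule, `trial_A` is retained and `trial_B` is excluded, because the patient meets one of its exclusion criteria and only half of its inclusion criteria. Keeping the per-criterion explanations alongside the score is what would let a reviewer see why a trial was ranked or excluded.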