Abstract
Objective: Fluoropyrimidines are widely prescribed for colorectal and breast cancers but are associated with toxicities such as hand-foot syndrome and cardiotoxicity. Because toxicity documentation is often embedded in clinical notes, we aimed to develop and evaluate natural language processing (NLP) methods to extract treatment and toxicity information.

Materials and Methods: We constructed a gold-standard dataset of 236 clinical notes from 204,165 adult oncology patients. Domain experts annotated categories related to treatment regimens and toxicities. We developed rule-based, machine learning-based (Random Forest, Support Vector Machine [SVM], Logistic Regression [LR]), deep learning-based (BERT, ClinicalBERT), and large language model (LLM)-based NLP approaches (zero-shot and error-analysis prompting). Models used an 80:20 train-test split.

Results: Sufficient data existed to train and evaluate 5 annotated categories. Error-analysis prompting achieved perfect precision, recall, and F1 scores (F1=1.000) for both treatment and toxicity extraction, whereas zero-shot prompting reached F1=1.000 for treatment and F1=0.876 for toxicity extraction. LR and SVM ranked second for toxicities (F1=0.937). Deep learning underperformed, with BERT reaching F1=0.873 for treatment and F1=0.839 for toxicities, and ClinicalBERT reaching F1=0.873 for treatment and F1=0.886 for toxicities. Rule-based methods served as our baseline, with F1 scores of 0.857 for treatment and 0.858 for toxicities.

Discussion: LLM-based approaches outperformed all others, followed by machine learning methods. Machine and deep learning approaches were limited by small training data and showed limited generalizability, particularly for rare categories.

Conclusion: LLM-based NLP most effectively extracted fluoropyrimidine treatment and toxicity information from clinical notes and has strong potential to support oncology research and pharmacovigilance.
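
For readers less familiar with the evaluation terms used above, the following is a minimal sketch of an 80:20 train-test split and of precision, recall, and F1 for a single binary category. It is illustrative only and is not the study's code: the helper names `split_80_20` and `precision_recall_f1`, the random seed, and the rounding of the split point are all assumptions, and the abstract does not state whether scores were micro- or macro-averaged across categories.

```python
import random


def split_80_20(items, seed=0):
    """Shuffle a copy of `items` and split it 80:20 (hypothetical helper).

    The seed and the rounding rule are assumptions made for illustration.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(0.8 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def precision_recall_f1(gold, predicted):
    """Return (precision, recall, F1) for parallel lists of 0/1 labels."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == 1 and p == 1)  # true positives
    fp = sum(1 for g, p in pairs if g == 0 and p == 1)  # false positives
    fn = sum(1 for g, p in pairs if g == 1 and p == 0)  # false negatives
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # 236 placeholder note identifiers, matching the dataset size reported above.
    train, test = split_80_20(range(236))
    print(len(train), len(test))
    # Toy labels, invented purely to exercise the metric function.
    print(precision_recall_f1([1, 1, 0, 0], [1, 0, 1, 0]))
```

Under these assumptions, a model that labels every test note correctly for a category yields precision, recall, and F1 of 1.000, which is the situation reported for error-analysis prompting.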