Query-driven Document-level Scientific Evidence Extraction from Biomedical Studies

  • 2025-05-09 17:55:06
  • Massimiliano Pronesti, Joao Bettencourt-Silva, Paul Flanagan, Alessandra Pascale, Oisin Redmond, Anya Belz, Yufang Hou

Abstract

Extracting scientific evidence from biomedical studies for clinical research questions (e.g., Does stem cell transplantation improve quality of life in patients with medically refractory Crohn's disease compared to placebo?) is a crucial step in synthesising biomedical evidence. In this paper, we focus on the task of document-level scientific evidence extraction for clinical questions with conflicting evidence. To support this task, we create a dataset called CochraneForest, leveraging forest plots from Cochrane systematic reviews. It comprises 202 annotated forest plots, associated clinical research questions, full texts of studies, and study-specific conclusions. Building on CochraneForest, we propose URCA (Uniform Retrieval Clustered Augmentation), a retrieval-augmented generation framework designed to tackle the unique challenges of evidence extraction. Our experiments show that URCA outperforms the best existing methods by up to 10.3% in F1 score on this task. However, the results also underscore the complexity of CochraneForest, establishing it as a challenging testbed for advancing automated evidence synthesis systems.

 
