Ontology- and LLM-based Data Harmonization for Federated Learning in Healthcare

Abstract

The rise of electronic health records (EHRs) has unlocked new opportunities for medical research, but privacy regulations and data heterogeneity remain key barriers to large-scale machine learning. Federated learning (FL) enables collaborative modeling without sharing raw data, yet faces challenges in harmonizing diverse clinical datasets. This paper presents a two-step data alignment strategy integrating ontologies and large language models (LLMs) to support secure, privacy-preserving FL in healthcare, demonstrating its effectiveness in a real-world project involving semantic mapping of EHR data.
