Meta-Learning Initializations for Low-Resource Drug Discovery

Abstract

Building in silico models to predict chemical properties and activities is a crucial step in drug discovery. However, drug discovery projects are often characterized by limited labeled data, hindering the applications of deep learning in this setting. Meanwhile, advances in meta-learning have enabled state-of-the-art performances in few-shot learning benchmarks, naturally prompting the question: Can meta-learning improve deep learning performance in low-resource drug discovery projects? In this work, we assess the efficiency of the Model-Agnostic Meta-Learning (MAML) algorithm - along with its variants FO-MAML and ANIL - at learning to predict chemical properties and activities. Using the ChEMBL20 dataset to emulate low-resource settings, our benchmark shows that meta-initializations perform comparably to or outperform multi-task pre-training baselines on 16 out of 20 in-distribution tasks and on all out-of-distribution tasks, providing an average improvement in AUPRC of 7.2% and 14.9% respectively. Finally, we observe that meta-initializations consistently result in the best performing models across fine-tuning sets with $k \in \{16, 32, 64, 128, 256\}$ instances.
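
To make the idea of a meta-initialization concrete, the following is a minimal sketch of first-order MAML (FO-MAML), which drops the second-order derivative terms of MAML. It is not the implementation evaluated in this work: a logistic-regression model on synthetic binary tasks stands in for the property predictor, and every name and hyperparameter (`make_task`, `fo_maml`, `inner_lr`, `outer_lr`, the task distribution) is an illustrative assumption.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def grad(w, data):
    # Gradient of the mean binary cross-entropy of a linear model w.x.
    g = [0.0] * len(w)
    for x, y in data:
        err = sigmoid(sum(wi * xi for wi, xi in zip(w, x))) - y
        for j, xj in enumerate(x):
            g[j] += err * xj / len(data)
    return g


def sgd(w, data, lr, steps):
    # Plain gradient descent; used for inner-loop adaptation and fine-tuning.
    for _ in range(steps):
        g = grad(w, data)
        w = [wi - lr * gi for wi, gi in zip(w, g)]
    return w


def fo_maml(tasks, dim, inner_lr=0.5, outer_lr=0.1, inner_steps=1, epochs=50):
    # Learn an initialization w that adapts well after a few gradient steps.
    w = [0.0] * dim
    for _ in range(epochs):
        meta_g = [0.0] * dim
        for support, query in tasks:
            adapted = sgd(w, support, inner_lr, inner_steps)
            # First-order approximation: use the query-set gradient at the
            # adapted weights and ignore how `adapted` depends on `w`.
            g = grad(adapted, query)
            meta_g = [m + gi / len(tasks) for m, gi in zip(meta_g, g)]
        w = [wi - outer_lr * mi for wi, mi in zip(w, meta_g)]
    return w


def make_task(rng, k):
    # Hypothetical toy task: labels come from a random linear rule on 2-D
    # points; the constant third feature acts as a bias term.
    true_w = [rng.gauss(1.0, 0.3), rng.gauss(-1.0, 0.3), 0.0]

    def sample(n):
        out = []
        for _ in range(n):
            x = [rng.uniform(-1, 1), rng.uniform(-1, 1), 1.0]
            y = 1.0 if sum(a * b for a, b in zip(true_w, x)) > 0 else 0.0
            out.append((x, y))
        return out

    return sample(k), sample(k)  # (support set, query set)
```

Under these assumptions, `fo_maml([make_task(rng, 16) for _ in range(8)], dim=3)` returns an initialization that can then be fine-tuned on a new task's `k` labeled instances with `sgd`, mirroring the low-resource fine-tuning setting described above.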
