MolCLR: Molecular Contrastive Learning of Representations via Graph Neural Networks

  • 2021-02-19 17:35:18
  • Yuyang Wang, Jianren Wang, Zhonglin Cao, Amir Barati Farimani

Abstract

Molecular machine learning bears promise for efficient molecule property prediction and drug discovery. However, due to the limited labeled data and the giant chemical space, machine learning models trained via supervised learning perform poorly in generalization. This greatly limits the applications of machine learning methods for molecular design and discovery. In this work, we present MolCLR: Molecular Contrastive Learning of Representations via Graph Neural Networks (GNNs), a self-supervised learning framework for large unlabeled molecule datasets. Specifically, we first build a molecular graph, where each node represents an atom and each edge represents a chemical bond. A GNN is then used to encode the molecule graph. We propose three novel molecule graph augmentations: atom masking, bond deletion, and subgraph removal. A contrastive estimator is utilized to maximize the agreement of different graph augmentations from the same molecule. Experiments show that molecule representations learned by MolCLR can be transferred to multiple downstream molecular property prediction tasks. Our method thus achieves state-of-the-art performance on many challenging datasets. We also prove the efficiency of our proposed molecule graph augmentations on supervised molecular classification tasks.
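
To make the pipeline in the abstract concrete, here is a minimal Python sketch of the three augmentations and a contrastive objective. It is an illustration under stated assumptions, not the authors' implementation: the molecule is a plain list of atom symbols plus a list of bond index pairs, the `MASK` token and all function names are hypothetical, and the loss uses a normalized temperature-scaled cross-entropy (NT-Xent) form, which the abstract does not specify (it says only "contrastive estimator"). The GNN encoder is omitted, so the embeddings passed to the loss are ordinary lists of floats.

```python
import math
import random

MASK = "[MASK]"  # hypothetical placeholder token for a hidden atom


def atom_mask(atoms, ratio, rng):
    """Return a copy of `atoms` with a random fraction replaced by MASK."""
    k = int(len(atoms) * ratio)
    hidden = set(rng.sample(range(len(atoms)), k))
    return [MASK if i in hidden else a for i, a in enumerate(atoms)]


def bond_delete(bonds, ratio, rng):
    """Return a copy of `bonds` with a random fraction removed."""
    k = int(len(bonds) * ratio)
    dropped = set(rng.sample(range(len(bonds)), k))
    return [b for i, b in enumerate(bonds) if i not in dropped]


def subgraph_remove(atoms, bonds, ratio, rng):
    """Mask a connected region grown by breadth-first search from a random
    atom, and drop every bond whose two ends both lie inside that region."""
    target = int(len(atoms) * ratio)
    neighbors = {i: [] for i in range(len(atoms))}
    for i, j in bonds:
        neighbors[i].append(j)
        neighbors[j].append(i)
    removed, queue = set(), [rng.randrange(len(atoms))]
    while queue and len(removed) < target:
        node = queue.pop(0)
        if node in removed:
            continue
        removed.add(node)
        queue.extend(n for n in neighbors[node] if n not in removed)
    new_atoms = [MASK if i in removed else a for i, a in enumerate(atoms)]
    new_bonds = [(i, j) for i, j in bonds
                 if not (i in removed and j in removed)]
    return new_atoms, new_bonds


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nt_xent(z, temperature=0.1):
    """Assumed NT-Xent loss over 2N embeddings, where z[2k] and z[2k+1]
    are the two augmented views of molecule k."""
    n = len(z)
    total = 0.0
    for i in range(n):
        partner = i ^ 1  # index of the other view of the same molecule
        logits = [cosine(z[i], z[k]) / temperature
                  for k in range(n) if k != i]
        log_denominator = math.log(sum(math.exp(x) for x in logits))
        total += log_denominator - cosine(z[i], z[partner]) / temperature
    return total / n


# Toy four-atom chain used only to exercise the functions above.
rng = random.Random(0)
atoms = ["C", "C", "O", "N"]
bonds = [(0, 1), (1, 2), (2, 3)]
view_a = atom_mask(atoms, 0.25, rng)
view_b = bond_delete(bonds, 0.5, rng)
view_c = subgraph_remove(atoms, bonds, 0.5, rng)
```

In a full setup, two augmented views of each molecule would be encoded by the GNN and the resulting embeddings passed to the loss, which is low when views of the same molecule are more similar to each other than to views of other molecules in the batch.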

 
