Abstract
Deep reinforcement learning has been successfully applied to molecular discovery, as shown by the Molecule Deep Q-network (MolDQN) algorithm. This algorithm faces challenges when applied to optimizing new molecules: training such a model has limited scalability to larger datasets, and the trained model cannot be generalized to different molecules in the same dataset. In this paper, a distributed reinforcement learning algorithm for antioxidants, called DA-MolDQN, is proposed to address these problems. State-of-the-art predictors of bond dissociation energy (BDE) and ionization potential (IP), two chemical properties that are critical when optimizing antioxidants, are integrated into DA-MolDQN. Training time is reduced by algorithmic improvements for molecular modifications. The algorithm is distributed, scalable up to 512 molecules, and generalizes the model to a diverse set of molecules. The proposed models are trained with a proprietary antioxidant dataset. The results have been reproduced with both proprietary and public datasets. The proposed molecules have been validated with DFT simulations, and a subset of them has been confirmed in public "unseen" datasets. In summary, DA-MolDQN is up to 100x faster than previous algorithms and can discover new optimized molecules from proprietary and public antioxidants.