Abstract
Generating novel graph structures that optimize given objectives while obeying given underlying rules is fundamental for chemistry, biology and social science research. This is especially important in the task of molecular graph generation, whose goal is to discover novel molecules with desired properties such as drug-likeness and synthetic accessibility, while obeying physical laws such as chemical valency. However, designing models to find molecules that optimize desired properties while incorporating highly complex and non-differentiable rules remains a challenging task. Here we propose Graph Convolutional Policy Network (GCPN), a general graph convolutional network based model for goal-directed graph generation through reinforcement learning. The model is trained to optimize domain-specific rewards and adversarial loss through policy gradient, and acts in an environment that incorporates domain-specific rules. Experimental results show that GCPN can achieve a 61% improvement on chemical property optimization over state-of-the-art baselines while resembling known molecules, and a 184% improvement on the constrained property optimization task.