A Comprehensive Survey on Graph Neural Networks

  • 2019-01-03 03:20:55
  • Zonghan Wu, Shirui Pan, Fengwen Chen, Guodong Long, Chengqi Zhang, Philip S. Yu


Zonghan Wu, Shirui Pan, Fengwen Chen, Guodong Long, Chengqi Zhang, Philip S. Yu
Z. Wu, F. Chen, G. Long, and C. Zhang are with the Centre for Artificial Intelligence, FEIT, University of Technology Sydney, NSW 2007, Australia. S. Pan is with the Faculty of Information Technology, Monash University, Clayton, VIC 3800, Australia. P. S. Yu is with the Department of Computer Science, University of Illinois at Chicago, Chicago, IL 60607-7053, USA. Corresponding author: Shirui Pan. Manuscript received Dec xx, 2018; revised Dec xx, 201x.
Abstract

Deep learning has revolutionized many machine learning tasks in recent years, ranging from image classification and video processing to speech recognition and natural language understanding. The data in these tasks are typically represented in the Euclidean space. However, there is an increasing number of applications where data are generated from non-Euclidean domains and are represented as graphs with complex relationships and interdependency between objects. The complexity of graph data has imposed significant challenges on existing machine learning algorithms. Recently, many studies on extending deep learning approaches for graph data have emerged. In this survey, we provide a comprehensive overview of graph neural networks (GNNs) in data mining and machine learning fields. We propose a new taxonomy to divide the state-of-the-art graph neural networks into different categories. With a focus on graph convolutional networks, we review alternative architectures that have recently been developed; these learning paradigms include graph attention networks, graph autoencoders, graph generative networks, and graph spatial-temporal networks. We further discuss the applications of graph neural networks across various domains and summarize the open source codes and benchmarks of the existing algorithms on different learning tasks. Finally, we propose potential research directions in this fast-growing field.

Index Terms: deep learning, graph neural networks, graph convolutional networks, graph representation learning, graph autoencoder, network embedding

1 Introduction

The recent success of neural networks has boosted research on pattern recognition and data mining. Many machine learning tasks such as object detection [1, 2], machine translation [3, 4], and speech recognition [5], which once relied heavily on handcrafted feature engineering to extract informative feature sets, have recently been revolutionized by various end-to-end deep learning paradigms, e.g., convolutional neural networks (CNNs) [6], long short-term memory (LSTM) [7], and autoencoders. The success of deep learning in many domains is partially attributed to rapidly developing computational resources (e.g., GPUs) and the availability of large training datasets, and partially to the effectiveness of deep learning in extracting latent representations from Euclidean data (e.g., images, text, and video). Taking image analysis as an example, an image can be represented as a regular grid in Euclidean space. A CNN is able to exploit the shift-invariance, local connectivity, and compositionality of image data [8]; as a result, CNNs can extract locally meaningful features that are shared across the entire dataset for various image analysis tasks.

While deep learning has achieved great success on Euclidean data, there is an increasing number of applications where data are generated from non-Euclidean domains and need to be effectively analyzed. For instance, in e-commerce, a graph-based learning system is able to exploit the interactions between users and products [9, 10, 11] to make highly accurate recommendations. In chemistry, molecules are modeled as graphs and their bioactivity needs to be identified for drug discovery [12, 13]. In a citation network, papers are linked to each other via citations and need to be categorized into different groups [14, 15]. The complexity of graph data has imposed significant challenges on existing machine learning algorithms. This is because graph data are irregular. Each graph has a variable number of unordered nodes, and each node in a graph has a different number of neighbors. As a result, some important operations (e.g., convolutions) that are easy to compute in the image domain are no longer directly applicable to the graph domain. Furthermore, a core assumption of existing machine learning algorithms is that instances are independent of each other. This does not hold for graph data, where each instance (node) is related to others (its neighbors) via complex linkage information, such as citations, friendships, and interactions, that captures the interdependence among the data.

TABLE I: A Summary of Open-source Implementations
Model                       Framework   GitHub Link
ChebNet (2016) [12]         TensorFlow  https://github.com/mdeff/cnn_graph
1stChebNet (2017) [14]      TensorFlow  https://github.com/tkipf/gcn
GGNNs (2015) [18]           Lua         https://github.com/yujiali/ggnn
SSE (2018) [19]             C           https://github.com/Hanjun-Dai/steady_state_embedding
GraphSage (2017) [24]       TensorFlow  https://github.com/williamleif/GraphSAGE
LGCN (2018) [27]            TensorFlow  https://github.com/divelab/lgcn/
SplineCNN (2018) [86]       PyTorch     https://github.com/rusty1s/pytorch_geometric
GAT (2017) [15]             TensorFlow  https://github.com/PetarV-/GAT
GAE (2016) [59]             TensorFlow  https://github.com/limaosen0/Variational-Graph-Auto-Encoders
ARGA (2018) [61]            TensorFlow  https://github.com/Ruiqi-Hu/ARGA
DNGR (2016) [41]            MATLAB      https://github.com/ShelsonCao/DNGR
SDNE (2016) [42]            Python      https://github.com/suanrong/SDNE
DRNE (2018) [63]            TensorFlow  https://github.com/tadpole/DRNE
GraphRNN (2018) [64]        TensorFlow  https://github.com/snap-stanford/GraphRNN
DCRNN (2018) [70]           TensorFlow  https://github.com/liyaguang/DCRNN
CNN-GCN (2017) [71]         TensorFlow  https://github.com/VeritasYin/STGCN_IJCAI-18
ST-GCN (2018) [72]          PyTorch     https://github.com/yysijie/st-gcn
Structural RNN (2016) [73]  Theano      https://github.com/asheshjain399/RNNexp

1.1 Practical Applications

Graph neural networks have a wide range of applications across different tasks and domains. Besides the general tasks in which each category of GNNs specializes, including node classification, node representation learning, graph classification, graph generation, and spatial-temporal forecasting, GNNs can also be applied to node clustering, link prediction [119], and graph partitioning [120]. In this section, we mainly introduce practical applications according to the general domains to which they belong.

Computer Vision. One of the biggest application areas for graph neural networks is computer vision. Researchers have explored leveraging graph structures in scene graph generation, point cloud classification and segmentation, action recognition, and many other directions.

In scene graph generation, semantic relationships between objects facilitate the understanding of the semantic meaning behind a visual scene. Given an image, scene graph generation models detect and recognize objects and predict semantic relationships between pairs of objects [121, 122, 123]. Another application inverts the process by generating realistic images from scene graphs [124]. Since natural language can be parsed into semantic graphs in which each word represents an object, this is a promising approach to synthesizing images from textual descriptions.

In point cloud classification and segmentation, the input is a point cloud, a set of 3D points recorded by LiDAR scans. Solutions for this task enable LiDAR devices to perceive the surrounding environment, which is particularly beneficial for unmanned vehicles. To identify objects depicted by point clouds, [125, 126, 127] convert point clouds into k-nearest neighbor graphs or superpoint graphs, and use graph convolutional networks to explore the topological structure.
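The first step mentioned above, turning a point cloud into a k-nearest neighbor graph, can be sketched as follows. This is a minimal brute-force illustration, not code from [125, 126, 127]: the helper name `knn_graph` and the toy coordinates are assumptions, and a practical pipeline would likely use a spatial index instead of comparing all pairs.

```python
import math

def knn_graph(points, k):
    """Map each point index to the indices of its k nearest neighbors.

    Hypothetical helper for illustration; brute-force search over all pairs.
    """
    graph = {}
    for i, p in enumerate(points):
        # Distance from point i to every other point, paired with that point's index.
        dists = [(math.dist(p, q), j) for j, q in enumerate(points) if j != i]
        dists.sort()
        graph[i] = [j for _, j in dists[:k]]
    return graph

# Toy "point cloud": two small clusters far apart.
cloud = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.0, 0.1, 0.0),
         (5.0, 5.0, 5.0), (5.1, 5.0, 5.0), (5.0, 5.1, 5.0)]
edges = knn_graph(cloud, k=2)
```

Note that the resulting graph is directed: point j being among the k nearest neighbors of point i does not guarantee the reverse.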

In action recognition, the goal is to recognize human actions in videos, which facilitates machine understanding of video content. One group of solutions detects the locations of human joints in video clips. Human joints, which are linked by the skeleton, naturally form a graph. Given the time series of human joint locations, [72, 73] apply spatial-temporal neural networks to learn human action patterns.

In addition, the number of possible directions in which to apply graph neural networks in computer vision is still growing. These include few-shot image classification [128, 129], semantic segmentation [130, 131], visual reasoning [132], and question answering [133].

Recommender Systems. Graph-based recommender systems take items and users as nodes. By leveraging the relations between items and items, users and users, and users and items, as well as content information, graph-based recommender systems are able to produce high-quality recommendations. The key to a recommender system is to score the importance of an item to a user. As a result, recommendation can be cast as a link prediction problem, where the goal is to predict the missing links between users and items. To address this problem, van den Berg et al. [9] and Ying et al. [11] propose a GCN-based graph auto-encoder. Monti et al. [10] combine GCN and RNN to learn the underlying process that generates the known ratings.
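To make the link prediction view concrete, the sketch below scores unobserved user-item pairs and ranks them. It is an illustration under stated assumptions, not the method of [9], [10], or [11]: the dot-product scoring rule, the function names, and the hand-picked embeddings are all hypothetical, and in a real system the embeddings would come from a trained graph encoder.

```python
def score(user_vec, item_vec):
    """Dot-product link score: higher means a user-item edge is more likely."""
    return sum(u * v for u, v in zip(user_vec, item_vec))

def recommend(user, user_emb, item_emb, observed, top_n=2):
    """Rank items that have no observed edge to `user` by predicted link score."""
    candidates = [i for i in item_emb if (user, i) not in observed]
    candidates.sort(key=lambda i: score(user_emb[user], item_emb[i]), reverse=True)
    return candidates[:top_n]

# Hand-picked toy embeddings (assumption); a trained encoder would produce these.
user_emb = {"alice": [1.0, 0.0], "bob": [0.0, 1.0]}
item_emb = {"book": [0.9, 0.1], "film": [0.2, 0.8], "game": [0.6, 0.5]}
observed = {("alice", "book"), ("bob", "film")}  # known user-item links

alice_recs = recommend("alice", user_emb, item_emb, observed)
bob_recs = recommend("bob", user_emb, item_emb, observed)
```

Items already linked to a user are excluded from the candidates, so the ranking covers only the missing links.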

Traffic. Traffic congestion has become a pressing social issue in modern cities. Accurately forecasting traffic speed, volume, or road density in traffic networks is fundamentally important for route planning and flow control. [134, 70, 71, 28] adopt a graph-based approach with spatial-temporal neural networks. The input to their models is a spatial-temporal graph in which nodes represent sensors placed on roads, edges are defined by the pairwise distances between nodes above a threshold, and each node carries a time series as its features. The goal is to forecast the average speed of a road within a time interval. Another interesting application is taxi-demand prediction, which helps intelligent transportation systems use resources and save energy effectively. Given historical taxi demands, location information, weather data, and event features, Yao et al. [135] incorporate LSTM, CNN, and node embeddings trained by LINE [136] to form a joint representation for each location and predict the number of taxis demanded at that location within a time interval.
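One possible reading of the spatial-temporal graph described above is sketched below: sensors become nodes, an edge is kept when a distance-based weight exceeds a threshold, and each node holds a time series. The Gaussian kernel form, the parameters `sigma` and `threshold`, the helper name `build_sensor_graph`, and all numbers are assumptions made for illustration rather than details taken from [134, 70, 71, 28].

```python
import math

def build_sensor_graph(distances, sigma, threshold):
    """Weighted adjacency matrix from pairwise road distances.

    weight(i, j) = exp(-d_ij^2 / sigma^2), kept only if it exceeds `threshold`,
    so nearby sensors are connected and distant ones are not.
    """
    n = len(distances)
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            w = math.exp(-(distances[i][j] ** 2) / (sigma ** 2))
            if w > threshold:
                adj[i][j] = w
    return adj

# Hypothetical pairwise distances (km) between three sensors.
distances = [[0.0, 1.0, 6.0],
             [1.0, 0.0, 5.0],
             [6.0, 5.0, 0.0]]
adj = build_sensor_graph(distances, sigma=2.0, threshold=0.1)

# Node features: one speed time series (km/h) per sensor, aligned in time.
speeds = [[62.0, 58.5, 40.1], [60.2, 55.0, 38.7], [71.3, 70.8, 69.9]]
```

With these toy values only sensors 0 and 1 end up connected; sensor 2 is too far from both to pass the threshold.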

Chemistry. In chemistry, researchers apply graph neural networks to study the graph structures of molecules. In a molecular graph, atoms function as nodes and chemical bonds function as edges. Node classification, graph classification, and graph generation are the three main tasks targeting molecular graphs, in order to learn molecular fingerprints [80, 53], to predict molecular properties [13], to infer protein interfaces [137], and to synthesize chemical compounds [66, 65, 138].
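As a small illustration of the molecular graph representation described above, the following sketch encodes methanol with atoms as nodes and bonds as edges. The atom ordering and the helper name `adjacency_list` are arbitrary choices for this example, and bond types are ignored for simplicity.

```python
# Methanol (CH3OH), hydrogens included. The atom order is an arbitrary choice.
atoms = ["C", "O", "H", "H", "H", "H"]
bonds = [(0, 1), (0, 2), (0, 3), (0, 4), (1, 5)]  # undirected single bonds

def adjacency_list(num_nodes, edges):
    """Undirected adjacency list: each bond is stored in both directions."""
    nbrs = {v: [] for v in range(num_nodes)}
    for u, v in edges:
        nbrs[u].append(v)
        nbrs[v].append(u)
    return nbrs

nbrs = adjacency_list(len(atoms), bonds)
# Node degree equals the number of bonds an atom takes part in.
degrees = {v: len(n) for v, n in nbrs.items()}
```

A graph neural network would then attach feature vectors (for example, atom type) to the nodes and aggregate them along `nbrs`.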

Others. There have been initial explorations into applying GNNs to other problems such as program verification [18], program reasoning [139], social influence prediction [140], adversarial attack prevention [141], electronic health record modeling [142, 143], event detection [144], and combinatorial optimization [145].

2 Future Directions

Though graph neural networks have proven their power in learning from graph data, challenges still exist due to the complexity of graphs. In this section, we suggest four future directions for graph neural networks.

Go Deep. The success of deep learning lies in deep neural architectures. In image classification, for example, ResNet [146] has 152 layers. However, when it comes to graphs, experimental studies have shown that model performance drops dramatically as the number of layers increases [147]. According to [147], this is because graph convolution essentially pushes the representations of adjacent nodes closer to each other, so that, in theory, with an infinite number of convolutions, all nodes’ representations converge to a single point. This raises the question of whether going deep is still a good strategy for learning graph-structured data.
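The convergence effect described above can be observed on a toy graph. The sketch below is a deliberately simplified stand-in for graph convolution: it assumes plain averaging over a node and its neighbors, with no learned weights and no nonlinearity, and the names `propagate` and `spread` are hypothetical. It is meant only to show the smoothing tendency, not to reproduce the analysis in [147].

```python
def propagate(h, nbrs):
    """One smoothing step: each node takes the mean of itself and its neighbors."""
    new_h = {}
    for v, neighbors in nbrs.items():
        group = [v] + neighbors
        new_h[v] = sum(h[u] for u in group) / len(group)
    return new_h

def spread(h):
    """Gap between the largest and smallest node value."""
    return max(h.values()) - min(h.values())

# Path graph 0 - 1 - 2 - 3 with one scalar feature per node.
nbrs = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
h = {0: 0.0, 1: 1.0, 2: 2.0, 3: 3.0}

spreads = [spread(h)]
for _ in range(50):
    h = propagate(h, nbrs)
    spreads.append(spread(h))
```

After enough steps the entries of `spreads` shrink toward zero, meaning the four nodes become nearly indistinguishable by their features.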

Receptive Field. The receptive field of a node refers to a set of nodes including the central node and its neighbors. The number of neighbors of a node follows a power law distribution. Some nodes may have only one neighbor, while other nodes may have thousands. Though sampling strategies have been adopted [24, 26, 27], how to select a representative receptive field for a node remains to be explored.
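A simple way to cope with highly uneven neighbor counts is to draw a fixed number of neighbors per node. The sketch below shows uniform fixed-size sampling, loosely in the spirit of the sampling strategies cited above but not taken from [24, 26, 27]; the function name `sample_neighbors`, the choice to sample with replacement for low-degree nodes, and the toy graph are assumptions.

```python
import random

def sample_neighbors(nbrs, node, k, rng):
    """Return k sampled neighbors of `node` (an empty list if it has none).

    Samples without replacement when the node has at least k neighbors,
    and with replacement otherwise, so the output size does not depend on degree.
    """
    neighbors = nbrs[node]
    if not neighbors:
        return []
    if len(neighbors) >= k:
        return rng.sample(neighbors, k)
    return [rng.choice(neighbors) for _ in range(k)]

# Skewed toy graph: node "hub" has 1000 neighbors, node "leaf" has one.
nbrs = {"hub": list(range(1000)), "leaf": ["hub"]}
rng = random.Random(0)  # fixed seed for repeatability
hub_sample = sample_neighbors(nbrs, "hub", 5, rng)
leaf_sample = sample_neighbors(nbrs, "leaf", 5, rng)
```

Uniform sampling treats all neighbors as equally informative, which is exactly the limitation the paragraph points to: it bounds the cost but does not choose a representative receptive field.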

Scalability. Most graph neural networks do not scale well to large graphs. The main reason is that, when multiple graph convolution layers are stacked, a node’s final state involves the hidden states of a large number of its neighbors, leading to high backpropagation complexity. While several approaches try to improve model efficiency through fast sampling [46, 45] and sub-graph training [24, 27], they are still not scalable enough to handle deep architectures with large graphs.
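The growth in the number of involved neighbors can be made concrete by counting the nodes within k hops of a node, since stacking k layers that each aggregate over direct neighbors makes a node's output depend on its k-hop neighborhood. The sketch below uses a complete binary tree as a toy graph; the helper names and the choice of graph are assumptions for illustration.

```python
def k_hop_nodes(nbrs, start, k):
    """Set of nodes within k hops of `start` (breadth-first expansion)."""
    seen = {start}
    frontier = {start}
    for _ in range(k):
        # Expand one hop, keeping only nodes not reached before.
        frontier = {u for v in frontier for u in nbrs[v]} - seen
        seen |= frontier
    return seen

def complete_binary_tree(depth):
    """Adjacency lists of a complete binary tree with nodes numbered from 0 (root)."""
    n = 2 ** (depth + 1) - 1
    nbrs = {v: [] for v in range(n)}
    for child in range(1, n):
        parent = (child - 1) // 2
        nbrs[parent].append(child)
        nbrs[child].append(parent)
    return nbrs

tree = complete_binary_tree(depth=10)
# Number of nodes the root depends on after k stacked layers, for k = 0..4.
sizes = [len(k_hop_nodes(tree, 0, k)) for k in range(5)]
```

Even with only two children per node, the neighborhood roughly doubles with every extra layer; on graphs with high-degree nodes the growth is far steeper.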

Dynamics and Heterogeneity. The majority of current graph neural networks tackle static homogeneous graphs. On the one hand, graph structures are assumed to be fixed. On the other hand, nodes and edges in a graph are assumed to come from a single source. However, these two assumptions are unrealistic in many scenarios. In a social network, a new person may join at any time and an existing person may leave. In a recommender system, products may be of different types, with inputs in different forms such as text or images. Therefore, new methods should be developed to handle dynamic and heterogeneous graph structures.

3 Conclusion

In this survey, we conduct a comprehensive overview of graph neural networks. We provide a taxonomy which groups graph neural networks into five categories: graph convolutional networks, graph attention networks, graph autoencoders, graph generative networks, and graph spatial-temporal networks. We provide a thorough review, comparison, and summary of the methods within and between categories. We then introduce a wide range of applications of graph neural networks and summarize the datasets, open source codes, and benchmarks for graph neural networks. Finally, we suggest four future directions for graph neural networks.

Acknowledgment

This research was funded by the Australian Government through the Australian Research Council (ARC) under grants 1) LP160100630, in partnership with the Australian Government Department of Health, and 2) LP150100671, in partnership with the Australian Research Alliance for Children and Youth (ARACY) and Global Business College Australia (GBCA). We acknowledge the support of NVIDIA Corporation and MakeMagic Australia with the donation of the GPU used for this research.

References

  • [1] J. Redmon, S. Divvala, R. Girshick, and A. Farhadi, “You only look once: Unified, real-time object detection,” in Proceedings of the IEEE conference on computer vision and pattern recognition, 2016, pp. 779–788.
  • [2] S. Ren, K. He, R. Girshick, and J. Sun, “Faster r-cnn: Towards real-time object detection with region proposal networks,” in Advances in neural information processing systems, 2015, pp. 91–99.
  • [3] M.-T. Luong, H. Pham, and C. D. Manning, “Effective approaches to attention-based neural machine translation,” in Proceedings of the Conference on Empirical Methods in Natural Language Processing, 2015, pp. 1412–1421.
  • [4] Y. Wu, M. Schuster, Z. Chen, Q. V. Le, M. Norouzi, W. Macherey, M. Krikun, Y. Cao, Q. Gao, K. Macherey et al., “Google’s neural machine translation system: Bridging the gap between human and machine translation,” arXiv preprint arXiv:1609.08144, 2016.
  • [5] G. Hinton, L. Deng, D. Yu, G. E. Dahl, A.-r. Mohamed, N. Jaitly, A. Senior, V. Vanhoucke, P. Nguyen, T. N. Sainath et al., “Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups,” IEEE Signal processing magazine, vol. 29, no. 6, pp. 82–97, 2012.
  • [6] Y. LeCun, Y. Bengio et al., “Convolutional networks for images, speech, and time series,” The handbook of brain theory and neural networks, vol. 3361, no. 10, p. 1995, 1995.
  • [7] S. Hochreiter and J. Schmidhuber, “Long short-term memory,” Neural computation, vol. 9, no. 8, pp. 1735–1780, 1997.
  • [8] M. M. Bronstein, J. Bruna, Y. LeCun, A. Szlam, and P. Vandergheynst, “Geometric deep learning: going beyond euclidean data,” IEEE Signal Processing Magazine, vol. 34, no. 4, pp. 18–42, 2017.
  • [9] R. van den Berg, T. N. Kipf, and M. Welling, “Graph convolutional matrix completion,” stat, vol. 1050, p. 7, 2017.
  • [10] F. Monti, M. Bronstein, and X. Bresson, “Geometric matrix completion with recurrent multi-graph neural networks,” in Advances in Neural Information Processing Systems, 2017, pp. 3697–3707.
  • [11] R. Ying, R. He, K. Chen, P. Eksombatchai, W. L. Hamilton, and J. Leskovec, “Graph convolutional neural networks for web-scale recommender systems,” in Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 2018, pp. 974–983.
  • [12] M. Defferrard, X. Bresson, and P. Vandergheynst, “Convolutional neural networks on graphs with fast localized spectral filtering,” in Advances in Neural Information Processing Systems, 2016, pp. 3844–3852.
  • [13] J. Gilmer, S. S. Schoenholz, P. F. Riley, O. Vinyals, and G. E. Dahl, “Neural message passing for quantum chemistry,” in Proceedings of the International Conference on Machine Learning, 2017, pp. 1263–1272.
  • [14] T. N. Kipf and M. Welling, “Semi-supervised classification with graph convolutional networks,” in Proceedings of the International Conference on Learning Representations, 2017.
  • [15] P. Velickovic, G. Cucurull, A. Casanova, A. Romero, P. Lio, and Y. Bengio, “Graph attention networks,” in Proceedings of the International Conference on Learning Representations, 2017.
  • [16] M. Gori, G. Monfardini, and F. Scarselli, “A new model for learning in graph domains,” in Proceedings of the International Joint Conference on Neural Networks, vol. 2. IEEE, 2005, pp. 729–734.
  • [17] F. Scarselli, M. Gori, A. C. Tsoi, M. Hagenbuchner, and G. Monfardini, “The graph neural network model,” IEEE Transactions on Neural Networks, vol. 20, no. 1, pp. 61–80, 2009.
  • [18] Y. Li, D. Tarlow, M. Brockschmidt, and R. Zemel, “Gated graph sequence neural networks,” in Proceedings of the International Conference on Learning Representations, 2015.
  • [19] H. Dai, Z. Kozareva, B. Dai, A. Smola, and L. Song, “Learning steady-states of iterative algorithms over graphs,” in Proceedings of the International Conference on Machine Learning, 2018, pp. 1114–1122.
  • [20] J. Bruna, W. Zaremba, A. Szlam, and Y. LeCun, “Spectral networks and locally connected networks on graphs,” in Proceedings of International Conference on Learning Representations, 2014.
  • [21] M. Henaff, J. Bruna, and Y. LeCun, “Deep convolutional networks on graph-structured data,” arXiv preprint arXiv:1506.05163, 2015.
  • [22] R. Li, S. Wang, F. Zhu, and J. Huang, “Adaptive graph convolutional neural networks,” in Proceedings of the AAAI Conference on Artificial Intelligence, 2018, pp. 3546–3553.
  • [23] R. Levie, F. Monti, X. Bresson, and M. M. Bronstein, “Cayleynets: Graph convolutional neural networks with complex rational spectral filters,” arXiv preprint arXiv:1705.07664, 2017.
  • [24] W. Hamilton, Z. Ying, and J. Leskovec, “Inductive representation learning on large graphs,” in Advances in Neural Information Processing Systems, 2017, pp. 1024–1034.
  • [25] F. Monti, D. Boscaini, J. Masci, E. Rodola, J. Svoboda, and M. M. Bronstein, “Geometric deep learning on graphs and manifolds using mixture model cnns,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, vol. 1, no. 2, 2017, p. 3.
  • [26] M. Niepert, M. Ahmed, and K. Kutzkov, “Learning convolutional neural networks for graphs,” in Proceedings of the International Conference on Machine Learning, 2016, pp. 2014–2023.
  • [27] H. Gao, Z. Wang, and S. Ji, “Large-scale learnable graph convolutional networks,” in Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. ACM, 2018, pp. 1416–1424.
  • [28] J. Zhang, X. Shi, J. Xie, H. Ma, I. King, and D.-Y. Yeung, “Gaan: Gated attention networks for learning on large and spatiotemporal graphs,” in Proceedings of the Uncertainty in Artificial Intelligence, 2018.
  • [29] P. W. Battaglia, J. B. Hamrick, V. Bapst, A. Sanchez-Gonzalez, V. Zambaldi, M. Malinowski, A. Tacchetti, D. Raposo, A. Santoro, R. Faulkner et al., “Relational inductive biases, deep learning, and graph networks,” arXiv preprint arXiv:1806.01261, 2018.
  • [30] J. B. Lee, R. A. Rossi, S. Kim, N. K. Ahmed, and E. Koh, “Attention models in graphs: A survey,” arXiv preprint arXiv:1807.07984, 2018.
  • [31] Z. Zhang, P. Cui, and W. Zhu, “Deep learning on graphs: A survey,” arXiv preprint arXiv:1812.04202, 2018.
  • [32] P. Cui, X. Wang, J. Pei, and W. Zhu, “A survey on network embedding,” IEEE Transactions on Knowledge and Data Engineering, 2017.
  • [33] W. L. Hamilton, R. Ying, and J. Leskovec, “Representation learning on graphs: Methods and applications,” in Advances in Neural Information Processing Systems, 2017, pp. 1024–1034.
  • [34] D. Zhang, J. Yin, X. Zhu, and C. Zhang, “Network representation learning: A survey,” IEEE Transactions on Big Data, 2018.
  • [35] H. Cai, V. W. Zheng, and K. Chang, “A comprehensive survey of graph embedding: problems, techniques and applications,” IEEE Transactions on Knowledge and Data Engineering, 2018.
  • [36] P. Goyal and E. Ferrara, “Graph embedding techniques, applications, and performance: A survey,” Knowledge-Based Systems, vol. 151, pp. 78–94, 2018.
  • [37] S. Pan, J. Wu, X. Zhu, C. Zhang, and Y. Wang, “Tri-party deep network representation,” in Proceedings of the International Joint Conference on Artificial Intelligence. AAAI Press, 2016, pp. 1895–1901.
  • [38] X. Shen, S. Pan, W. Liu, Y.-S. Ong, and Q.-S. Sun, “Discrete network embedding,” in Proceedings of the International Joint Conference on Artificial Intelligence, 2018, pp. 3549–3555.
  • [39] H. Yang, S. Pan, P. Zhang, L. Chen, D. Lian, and C. Zhang, “Binarized attributed network embedding,” in IEEE International Conference on Data Mining. IEEE, 2018.
  • [40] B. Perozzi, R. Al-Rfou, and S. Skiena, “Deepwalk: Online learning of social representations,” in Proceedings of the ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 2014, pp. 701–710.
  • [41] S. Cao, W. Lu, and Q. Xu, “Deep neural networks for learning graph representations,” in Proceedings of the AAAI Conference on Artificial Intelligence, 2016, pp. 1145–1152.
  • [42] D. Wang, P. Cui, and W. Zhu, “Structural deep network embedding,” in Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 2016, pp. 1225–1234.
  • [43] A. Susnjara, N. Perraudin, D. Kressner, and P. Vandergheynst, “Accelerated filtering on graphs using lanczos method,” arXiv preprint arXiv:1509.04537, 2015.
  • [44] J. Atwood and D. Towsley, “Diffusion-convolutional neural networks,” in Advances in Neural Information Processing Systems, 2016, pp. 1993–2001.
  • [45] J. Chen, T. Ma, and C. Xiao, “Fastgcn: fast learning with graph convolutional networks via importance sampling,” in Proceedings of the International Conference on Learning Representations, 2018.
  • [46] J. Chen, J. Zhu, and L. Song, “Stochastic training of graph convolutional networks with variance reduction,” in Proceedings of the International Conference on Machine Learning, 2018, pp. 941–949.
  • [47] F. P. Such, S. Sah, M. A. Dominguez, S. Pillai, C. Zhang, A. Michael, N. D. Cahill, and R. Ptucha, “Robust spatial filtering with graph convolutional neural networks,” IEEE Journal of Selected Topics in Signal Processing, vol. 11, no. 6, pp. 884–896, 2017.
  • [48] Z. Liu, C. Chen, L. Li, J. Zhou, X. Li, and L. Song, “Geniepath: Graph neural networks with adaptive receptive paths,” arXiv preprint arXiv:1802.00910, 2018.
  • [49] C. Zhuang and Q. Ma, “Dual graph convolutional networks for graph-based semi-supervised classification,” in Proceedings of the World Wide Web Conference on World Wide Web. International World Wide Web Conferences Steering Committee, 2018, pp. 499–508.
  • [50] T. Derr, Y. Ma, and J. Tang, “Signed graph convolutional network,” arXiv preprint arXiv:1808.06354, 2018.
  • [51] T. Pham, T. Tran, D. Q. Phung, and S. Venkatesh, “Column networks for collective classification,” in Proceedings of the AAAI Conference on Artificial Intelligence, 2017, pp. 2485–2491.
  • [52] M. Simonovsky and N. Komodakis, “Dynamic edge-conditioned filters in convolutional neural networks on graphs,” in Proceedings of the IEEE conference on computer vision and pattern recognition, 2017.
  • [53] S. Kearnes, K. McCloskey, M. Berndl, V. Pande, and P. Riley, “Molecular graph convolutions: moving beyond fingerprints,” Journal of computer-aided molecular design, vol. 30, no. 8, pp. 595–608, 2016.
  • [54] W. Huang, T. Zhang, Y. Rong, and J. Huang, “Adaptive sampling towards fast graph representation learning,” in Advances in Neural Information Processing Systems, 2018, pp. 4563–4572.
  • [55] M. Zhang, Z. Cui, M. Neumann, and Y. Chen, “An end-to-end deep learning architecture for graph classification,” in Proceedings of the AAAI Conference on Artificial Intelligence, 2018.
  • [56] Z. Ying, J. You, C. Morris, X. Ren, W. Hamilton, and J. Leskovec, “Hierarchical graph representation learning with differentiable pooling,” in Advances in Neural Information Processing Systems, 2018, pp. 4801–4811.
  • [57] J. B. Lee, R. Rossi, and X. Kong, “Graph classification using structural attention,” in Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. ACM, 2018, pp. 1666–1674.
  • [58] S. Abu-El-Haija, B. Perozzi, R. Al-Rfou, and A. A. Alemi, “Watch your step: Learning node embeddings via graph attention,” in Advances in Neural Information Processing Systems, 2018, pp. 9197–9207.
  • [59] T. N. Kipf and M. Welling, “Variational graph auto-encoders,” arXiv preprint arXiv:1611.07308, 2016.
  • [60] C. Wang, S. Pan, G. Long, X. Zhu, and J. Jiang, “Mgae: Marginalized graph autoencoder for graph clustering,” in Proceedings of the ACM on Conference on Information and Knowledge Management. ACM, 2017, pp. 889–898.
  • [61] S. Pan, R. Hu, G. Long, J. Jiang, L. Yao, and C. Zhang, “Adversarially regularized graph autoencoder for graph embedding,” in Proceedings of the International Joint Conference on Artificial Intelligence, 2018, pp. 2609–2615.
  • [62] W. Yu, C. Zheng, W. Cheng, C. C. Aggarwal, D. Song, B. Zong, H. Chen, and W. Wang, “Learning deep network representations with adversarially regularized autoencoders,” in Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. ACM, 2018, pp. 2663–2671.
  • [63] K. Tu, P. Cui, X. Wang, P. S. Yu, and W. Zhu, “Deep recursive network embedding with regular equivalence,” in Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 2018, pp. 2357–2366.
  • [64] J. You, R. Ying, X. Ren, W. L. Hamilton, and J. Leskovec, “Graphrnn: A deep generative model for graphs,” in Proceedings of the International Conference on Machine Learning, 2018.
  • [65] Y. Li, O. Vinyals, C. Dyer, R. Pascanu, and P. Battaglia, “Learning deep generative models of graphs,” in Proceedings of the International Conference on Machine Learning, 2018.
  • [66] N. De Cao and T. Kipf, “Molgan: An implicit generative model for small molecular graphs,” arXiv preprint arXiv:1805.11973, 2018.
  • [67] A. Bojchevski, O. Shchur, D. Zügner, and S. Günnemann, “Netgan: Generating graphs via random walks,” in Proceedings of the International Conference on Machine Learning, 2018.
  • [68] T. Ma, J. Chen, and C. Xiao, “Constrained generation of semantically valid graphs via regularizing variational autoencoders,” in Advances in Neural Information Processing Systems, 2018, pp. 7110–7121.
  • [69] Y. Seo, M. Defferrard, P. Vandergheynst, and X. Bresson, “Structured sequence modeling with graph convolutional recurrent networks,” arXiv preprint arXiv:1612.07659, 2016.
  • [70] Y. Li, R. Yu, C. Shahabi, and Y. Liu, “Diffusion convolutional recurrent neural network: Data-driven traffic forecasting,” in Proceedings of International Conference on Learning Representations, 2018.
  • [71] B. Yu, H. Yin, and Z. Zhu, “Spatio-temporal graph convolutional networks: A deep learning framework for traffic forecasting,” in Proceedings of the International Joint Conference on Artificial Intelligence, 2017, pp. 3634–3640.
  • [72] S. Yan, Y. Xiong, and D. Lin, “Spatial temporal graph convolutional networks for skeleton-based action recognition,” in Proceedings of the AAAI Conference on Artificial Intelligence, 2018.
  • [73] A. Jain, A. R. Zamir, S. Savarese, and A. Saxena, “Structural-rnn: Deep learning on spatio-temporal graphs,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 5308–5317.
  • [74] S. Pan, J. Wu, X. Zhu, C. Zhang, and P. S. Yu, “Joint structure feature exploration and regularization for multi-task graph classification,” IEEE Transactions on Knowledge and Data Engineering, vol. 28, no. 3, pp. 715–728, 2016.
  • [75] S. Pan, J. Wu, X. Zhu, G. Long, and C. Zhang, “Task sensitive feature exploration and learning for multitask graph classification,” IEEE transactions on cybernetics, vol. 47, no. 3, pp. 744–758, 2017.
  • [76] D. I. Shuman, S. K. Narang, P. Frossard, A. Ortega, and P. Vandergheynst, “The emerging field of signal processing on graphs: Extending high-dimensional data analysis to networks and other irregular domains,” IEEE Signal Processing Magazine, vol. 30, no. 3, pp. 83–98, 2013.
  • [77] L. B. Almeida, “A learning rule for asynchronous perceptrons with feedback in a combinatorial environment,” in Proceedings of the International Conference on Neural Networks, vol. 2. IEEE, 1987, pp. 609–618.
  • [78] F. J. Pineda, “Generalization of back-propagation to recurrent neural networks,” Physical review letters, vol. 59, no. 19, p. 2229, 1987.
  • [79] K. Cho, B. Van Merriënboer, C. Gulcehre, D. Bahdanau, F. Bougares, H. Schwenk, and Y. Bengio, “Learning phrase representations using rnn encoder-decoder for statistical machine translation,” in Proceedings of the Conference on Empirical Methods in Natural Language Processing, 2014, pp. 1724–1734.
  • [80] D. K. Duvenaud, D. Maclaurin, J. Iparraguirre, R. Bombarell, T. Hirzel, A. Aspuru-Guzik, and R. P. Adams, “Convolutional networks on graphs for learning molecular fingerprints,” in Advances in Neural Information Processing Systems, 2015, pp. 2224–2232.
  • [81] K. T. Schütt, F. Arbabzadah, S. Chmiela, K. R. Müller, and A. Tkatchenko, “Quantum-chemical insights from deep tensor neural networks,” Nature Communications, vol. 8, p. 13890, 2017.
  • [82] B. Weisfeiler and A. Lehman, “A reduction of a graph to a canonical form and an algebra arising during this reduction,” Nauchno-Technicheskaya Informatsia, vol. 2, no. 9, pp. 12–16, 1968.
  • [83] B. L. Douglas, “The weisfeiler-lehman method and graph isomorphism testing,” arXiv preprint arXiv:1101.5211, 2011.
  • [84] J. Masci, D. Boscaini, M. Bronstein, and P. Vandergheynst, “Geodesic convolutional neural networks on riemannian manifolds,” in Proceedings of the IEEE International Conference on Computer Vision Workshops, 2015, pp. 37–45.
  • [85] D. Boscaini, J. Masci, E. Rodolà, and M. Bronstein, “Learning shape correspondence with anisotropic convolutional neural networks,” in Advances in Neural Information Processing Systems, 2016, pp. 3189–3197.
  • [86] M. Fey, J. E. Lenssen, F. Weichert, and H. Müller, “Splinecnn: Fast geometric deep learning with continuous b-spline kernels,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018, pp. 869–877.
  • [87] S. Pan, J. Wu, and X. Zhu, “Cogboost: Boosting for fast cost-sensitive graph classification,” IEEE Transactions on Knowledge and Data Engineering, no. 1, pp. 1–1, 2015.
  • [88] K. Xu, W. Hu, J. Leskovec, and S. Jegelka, “How powerful are graph neural networks?” arXiv preprint arXiv:1810.00826, 2018.
  • [89] S. Verma and Z.-L. Zhang, “Graph capsule convolutional neural networks,” arXiv preprint arXiv:1805.08090, 2018.
  • [90] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin, “Attention is all you need,” in Advances in Neural Information Processing Systems, 2017, pp. 5998–6008.
  • [91] I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio, “Generative adversarial nets,” in Advances in Neural Information Processing Systems, 2014, pp. 2672–2680.
  • [92] I. Sutskever, O. Vinyals, and Q. V. Le, “Sequence to sequence learning with neural networks,” in Advances in Neural Information Processing Systems, 2014, pp. 3104–3112.
  • [93] P. Vincent, H. Larochelle, Y. Bengio, and P.-A. Manzagol, “Extracting and composing robust features with denoising autoencoders,” in Proceedings of the International Conference on Machine Learning. ACM, 2008, pp. 1096–1103.
  • [94] G. L. Guimaraes, B. Sanchez-Lengeling, C. Outeiral, P. L. C. Farias, and A. Aspuru-Guzik, “Objective-reinforced generative adversarial networks (organ) for sequence generation models,” arXiv preprint arXiv:1705.10843, 2017.
  • [95] M. J. Kusner, B. Paige, and J. M. Hernández-Lobato, “Grammar variational autoencoder,” arXiv preprint arXiv:1703.01925, 2017.
  • [96] H. Dai, Y. Tian, B. Dai, S. Skiena, and L. Song, “Syntax-directed variational autoencoder for molecule generation,” in Proceedings of the International Conference on Learning Representations, 2018.
  • [97] R. Gómez-Bombarelli, J. N. Wei, D. Duvenaud, J. M. Hernández-Lobato, B. Sánchez-Lengeling, D. Sheberla, J. Aguilera-Iparraguirre, T. D. Hirzel, R. P. Adams, and A. Aspuru-Guzik, “Automatic chemical design using a data-driven continuous representation of molecules,” ACS Central Science, vol. 4, no. 2, pp. 268–276, 2018.
  • [98] B. Chen, L. Sun, and X. Han, “Sequence-to-action: End-to-end semantic graph generation for semantic parsing,” in Proceedings of the Annual Meeting of the Association for Computational Linguistics, 2018, pp. 766–777.
  • [99] D. D. Johnson, “Learning graphical state transitions,” in Proceedings of the International Conference on Learning Representations, 2016.
  • [100] M. Schlichtkrull, T. N. Kipf, P. Bloem, R. van den Berg, I. Titov, and M. Welling, “Modeling relational data with graph convolutional networks,” in European Semantic Web Conference. Springer, 2018, pp. 593–607.
  • [101] I. Gulrajani, F. Ahmed, M. Arjovsky, V. Dumoulin, and A. C. Courville, “Improved training of wasserstein gans,” in Advances in Neural Information Processing Systems, 2017, pp. 5767–5777.
  • [102] M. Arjovsky, S. Chintala, and L. Bottou, “Wasserstein gan,” arXiv preprint arXiv:1701.07875, 2017.
  • [103] P. Sen, G. Namata, M. Bilgic, L. Getoor, B. Galligher, and T. Eliassi-Rad, “Collective classification in network data,” AI Magazine, vol. 29, no. 3, p. 93, 2008.
  • [104] X. Zhang, Y. Li, D. Shen, and L. Carin, “Diffusion maps for textual network embedding,” in Advances in Neural Information Processing Systems, 2018.
  • [105] J. Tang, J. Zhang, L. Yao, J. Li, L. Zhang, and Z. Su, “Arnetminer: extraction and mining of academic social networks,” in Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 2008, pp. 990–998.
  • [106] Y. Ma, S. Wang, C. C. Aggarwal, D. Yin, and J. Tang, “Multi-dimensional graph convolutional networks,” arXiv preprint arXiv:1808.06099, 2018.
  • [107] L. Tang and H. Liu, “Relational learning via latent social dimensions,” in Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 2009, pp. 817–826.
  • [108] H. Wang, J. Wang, J. Wang, M. Zhao, W. Zhang, F. Zhang, X. Xie, and M. Guo, “Graphgan: Graph representation learning with generative adversarial nets,” in Proceedings of the AAAI Conference on Artificial Intelligence, 2017.
  • [109] M. Zitnik and J. Leskovec, “Predicting multicellular function through multi-layer tissue networks,” Bioinformatics, vol. 33, no. 14, pp. i190–i198, 2017.
  • [110] N. Wale, I. A. Watson, and G. Karypis, “Comparison of descriptor spaces for chemical compound retrieval and classification,” Knowledge and Information Systems, vol. 14, no. 3, pp. 347–375, 2008.
  • [111] A. K. Debnath, R. L. Lopez de Compadre, G. Debnath, A. J. Shusterman, and C. Hansch, “Structure-activity relationship of mutagenic aromatic and heteroaromatic nitro compounds. Correlation with molecular orbital energies and hydrophobicity,” Journal of Medicinal Chemistry, vol. 34, no. 2, pp. 786–797, 1991.
  • [112] P. D. Dobson and A. J. Doig, “Distinguishing enzyme structures from non-enzymes without alignments,” Journal of Molecular Biology, vol. 330, no. 4, pp. 771–783, 2003.
  • [113] R. Ramakrishnan, P. O. Dral, M. Rupp, and O. A. Von Lilienfeld, “Quantum chemistry structures and properties of 134 kilo molecules,” Scientific Data, vol. 1, p. 140022, 2014.
  • [114] T. Joachims, “A probabilistic analysis of the rocchio algorithm with tfidf for text categorization,” Carnegie Mellon University, Department of Computer Science, Pittsburgh, PA, Tech. Rep., 1996.
  • [115] H. Jagadish, J. Gehrke, A. Labrinidis, Y. Papakonstantinou, J. M. Patel, R. Ramakrishnan, and C. Shahabi, “Big data and its technical challenges,” Communications of the ACM, vol. 57, no. 7, pp. 86–94, 2014.
  • [116] B. N. Miller, I. Albert, S. K. Lam, J. A. Konstan, and J. Riedl, “Movielens unplugged: experiences with an occasionally connected recommender system,” in Proceedings of the International Conference on Intelligent User Interfaces. ACM, 2003, pp. 263–266.
  • [117] A. Carlson, J. Betteridge, B. Kisiel, B. Settles, E. R. Hruschka Jr, and T. M. Mitchell, “Toward an architecture for never-ending language learning,” in Proceedings of the AAAI Conference on Artificial Intelligence, 2010, pp. 1306–1313.
  • [118] P. Veličković, W. Fedus, W. L. Hamilton, P. Liò, Y. Bengio, and R. D. Hjelm, “Deep graph infomax,” arXiv preprint arXiv:1809.10341, 2018.
  • [119] M. Zhang and Y. Chen, “Link prediction based on graph neural networks,” in Advances in Neural Information Processing Systems, 2018.
  • [120] T. Kawamoto, M. Tsubaki, and T. Obuchi, “Mean-field theory of graph neural networks in graph partitioning,” in Advances in Neural Information Processing Systems, 2018, pp. 4362–4372.
  • [121] D. Xu, Y. Zhu, C. B. Choy, and L. Fei-Fei, “Scene graph generation by iterative message passing,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, vol. 2, 2017.
  • [122] J. Yang, J. Lu, S. Lee, D. Batra, and D. Parikh, “Graph r-cnn for scene graph generation,” in European Conference on Computer Vision. Springer, 2018, pp. 690–706.
  • [123] Y. Li, W. Ouyang, B. Zhou, J. Shi, C. Zhang, and X. Wang, “Factorizable net: an efficient subgraph-based framework for scene graph generation,” in European Conference on Computer Vision. Springer, 2018, pp. 346–363.
  • [124] J. Johnson, A. Gupta, and L. Fei-Fei, “Image generation from scene graphs,” arXiv preprint, 2018.
  • [125] Y. Wang, Y. Sun, Z. Liu, S. E. Sarma, M. M. Bronstein, and J. M. Solomon, “Dynamic graph cnn for learning on point clouds,” arXiv preprint arXiv:1801.07829, 2018.
  • [126] L. Landrieu and M. Simonovsky, “Large-scale point cloud semantic segmentation with superpoint graphs,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018.
  • [127] G. Te, W. Hu, Z. Guo, and A. Zheng, “Rgcnn: Regularized graph cnn for point cloud segmentation,” arXiv preprint arXiv:1806.02952, 2018.
  • [128] V. G. Satorras and J. B. Estrach, “Few-shot learning with graph neural networks,” in Proceedings of the International Conference on Learning Representations, 2018.
  • [129] M. Guo, E. Chou, D.-A. Huang, S. Song, S. Yeung, and L. Fei-Fei, “Neural graph matching networks for fewshot 3d action recognition,” in European Conference on Computer Vision. Springer, 2018, pp. 673–689.
  • [130] X. Qi, R. Liao, J. Jia, S. Fidler, and R. Urtasun, “3d graph neural networks for rgbd semantic segmentation,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp. 5199–5208.
  • [131] L. Yi, H. Su, X. Guo, and L. J. Guibas, “Syncspeccnn: Synchronized spectral cnn for 3d shape segmentation,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp. 6584–6592.
  • [132] X. Chen, L.-J. Li, L. Fei-Fei, and A. Gupta, “Iterative visual reasoning beyond convolutions,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018.
  • [133] M. Narasimhan, S. Lazebnik, and A. Schwing, “Out of the box: Reasoning with graph convolution nets for factual visual question answering,” in Advances in Neural Information Processing Systems, 2018, pp. 2655–2666.
  • [134] Z. Cui, K. Henrickson, R. Ke, and Y. Wang, “High-order graph convolutional recurrent neural network: a deep learning framework for network-scale traffic learning and forecasting,” arXiv preprint arXiv:1802.07007, 2018.
  • [135] H. Yao, F. Wu, J. Ke, X. Tang, Y. Jia, S. Lu, P. Gong, J. Ye, and Z. Li, “Deep multi-view spatial-temporal network for taxi demand prediction,” in Proceedings of the AAAI Conference on Artificial Intelligence, 2018, pp. 2588–2595.
  • [136] J. Tang, M. Qu, M. Wang, M. Zhang, J. Yan, and Q. Mei, “Line: Large-scale information network embedding,” in Proceedings of the International Conference on World Wide Web. International World Wide Web Conferences Steering Committee, 2015, pp. 1067–1077.
  • [137] A. Fout, J. Byrd, B. Shariat, and A. Ben-Hur, “Protein interface prediction using graph convolutional networks,” in Advances in Neural Information Processing Systems, 2017, pp. 6530–6539.
  • [138] J. You, B. Liu, R. Ying, V. Pande, and J. Leskovec, “Graph convolutional policy network for goal-directed molecular graph generation,” in Advances in Neural Information Processing Systems, 2018.
  • [139] M. Allamanis, M. Brockschmidt, and M. Khademi, “Learning to represent programs with graphs,” in Proceedings of the International Conference on Learning Representations, 2017.
  • [140] J. Qiu, J. Tang, H. Ma, Y. Dong, K. Wang, and J. Tang, “Deepinf: Social influence prediction with deep learning,” in Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 2018, pp. 2110–2119.
  • [141] D. Zügner, A. Akbarnejad, and S. Günnemann, “Adversarial attacks on neural networks for graph data,” in Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 2018, pp. 2847–2856.
  • [142] E. Choi, M. T. Bahadori, L. Song, W. F. Stewart, and J. Sun, “Gram: graph-based attention model for healthcare representation learning,” in Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 2017, pp. 787–795.
  • [143] E. Choi, C. Xiao, W. Stewart, and J. Sun, “Mime: Multilevel medical embedding of electronic health records for predictive healthcare,” in Advances in Neural Information Processing Systems, 2018, pp. 4548–4558.
  • [144] T. H. Nguyen and R. Grishman, “Graph convolutional networks with argument-aware pooling for event detection,” in Proceedings of the AAAI Conference on Artificial Intelligence, 2018, pp. 5900–5907.
  • [145] Z. Li, Q. Chen, and V. Koltun, “Combinatorial optimization with graph convolutional networks and guided tree search,” in Advances in Neural Information Processing Systems, 2018, pp. 536–545.
  • [146] K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 770–778.
  • [147] Q. Li, Z. Han, and X.-M. Wu, “Deeper insights into graph convolutional networks for semi-supervised learning,” in Proceedings of the AAAI Conference on Artificial Intelligence, 2018.