Abstract
Power system studies require the topological structures of realworld powernetworks; however, such data is confidential due to important securityconcerns. Thus, power grid synthesis (PGS), i.e., creating realistic powergrids that imitate actual power networks, has gained significant attention. Inthis letter, we cast PGS into a graph distribution learning (GDL) problem wherethe probability distribution functions (PDFs) of the nodes (buses) and edges(lines) are captured. A novel deep GDL (DeepGDL) model is proposed to learn thetopological patterns of buses/lines with their physical features (e.g., powerinjection and line impedance). Having a deep nonlinear recurrent structure,DeepGDL understands complex nonlinear topological properties and captures thegraph PDF. Sampling from the obtained PDF, we are able to create a large set ofrealistic networks that all resemble the original power grid. Simulationresults show the significant accuracy of our created synthetic power grids interms of various topological metrics and power flow measurements.
Quick Read (beta)
Deep Generative Graph Distribution Learning
for Synthetic Power Grids
Abstract
Power system studies require the topological structures of realworld power networks; however, such data is confidential due to important security concerns. Thus, power grid synthesis (PGS), i.e., creating realistic power grids that imitate actual power networks, has gained significant attention. In this letter, we cast PGS into a graph distribution learning (GDL) problem where the probability distribution functions (PDFs) of the nodes (buses) and edges (lines) are captured. A novel deep GDL (DeepGDL) model is proposed to learn the topological patterns of buses/lines with their physical features (e.g., power injection and line impedance). Having a deep nonlinear recurrent structure, DeepGDL understands complex nonlinear topological properties and captures the graph PDF. Sampling from the obtained PDF, we are able to create a large set of realistic networks that all resemble the original power grid. Simulation results show the significant accuracy of our created synthetic power grids in terms of various topological metrics and power flow measurements.
.tifpng.pngconvert #1 \OutputFile \AppendGraphicsExtensions.tif \bstctlciteIEEEexample:BSTcontrol
I INTRODUCTION
MANY power system studies require the actual topologies of realworld power networks with real bus and line coordinates as well as real physical charactristics such as power injections and line impedance values. However, due to the confidentiality concerns, there are very few datasets of real power grids such as the IEEE test cases [1] and the Polish grid [2] that are publicly available for such studies. Even these datasets do not contain important information regarding the geographical locations of buses, and lack the diverse characteristics of real topologies. In recent studies, several works are focused on the spatial models for power networks that are realistic but synthesized without revealing any confidential information. [3, 4] create power networks based on the locations of cities and power plants in Texas; however, no geographical or performance comparison with the actual power grid is provided to justify the approach. Although [3, 4] contain useful engineering details, they do not devise a general algorithm that can work with any realistic power network; thus, in practice, these approaches are limited. In this line of research, [5] is the only learningbased power grid synthesis (PGS) model presented in recent literature that applies a Gaussian Mixture Model (GMM) to compute the distributions of bus locations in the Western Interconnect network. However, due to lack of generalization, GMM is theoretically incapable of learning the physical properties of power grid components. Moreover, the data required by [5] exceeds the power grid data as this model represents the demand based on the average residential power usage and city populations. Also, similar to [3] and [4], the GMM approach [5] puts strict assumptions on the node degree and line distributions which do not necessarily hold.
This letter presents a novel deep graph distribution learning (DeepGDL) algorithm to create synthetic power grids. We present a recurrent model that efficiently captures node/edge distributions of arbitrary complex networks. Leveraging a deep architecture, DeepGDL captures complex nonlinear manifolds of nodes/edges in the actual power network, and generates synthetic power grids that imitate the original network. To the best of our knowledge, DeepGDL is the only deep learningbased PGS solution. Also, in contrast to all previous works, the presented algorithm needs no additional information other than the network topology and the bus/line physical properties; hence, it can be applied as a general framework to any PGS problem with minimum amounts of data.
II DeepGDL for Power Grid Synthesis
Let us consider the actual power grid obtained from the Columbia University Synthetic Power Grid (CUSPG) dataset [6] that contains 14430 buses and 18884 lines located in Western US. We represent this network as an undirected weighted graph ${G}_{P}=(V,E)$ where ${n}_{i}\in V$ denotes the $i$th node while an edge ${l}_{i,j}\in E$ is a transmission line connecting two nodes ${n}_{i}$ and ${n}_{j}$. For each node ${n}_{i}$, ${G}_{P}$ contains four features including the latitude $x({n}_{i})$, longitude $y({n}_{i})$, power supply $pw({n}_{i})$ (i.e., power generated at the bus $i$), and power demand $pd({n}_{i})$. A weight ${W}_{i,j}=X({l}_{i,j})\in \mathbb{R}$ is defined to represent the reactance of the line corresponding to each ${l}_{i,j}$. Our objective is to learn the PDFs of nodes ${n}_{i}\in V$ as well as edges ${l}_{i,j}\in E$, and sample from them to generate synthetic power grids.
Algorithm 1 is the pseudocode of our presented model. First, using the modularity optimization in [7], ${G}_{P}$ is decomposed into $K$ communities (i.e., subgraphs) $\mathrm{\Psi}=\{{C}^{1},{C}^{2},\mathrm{\dots},{C}^{K}\}$ where the nodes inside each ${C}^{k}(1\le k\le K)$ are densely connected while the nodes in distinct communities have sparse connections. For each ${C}^{k}=({V}^{k},{E}^{k})$, the adjacency matrix ${A}^{k}$ is modeled based on the node ordering $\tau $ that maps any ${n}_{i}\in {V}^{k}$ to a row/column index $\tau (i)$ in ${A}^{k}$. Having the set of all ${V}^{k}!$ node permutations denoted by $\mathrm{T}$, one can write a new adjacency matrix ${A}^{\tau}\in {A}^{\mathrm{T}}=\{{A}^{\tau}\tau \in \mathrm{T}\}$ defined by ${A}_{i,j}^{\tau}={A}_{\tau (i),\tau (t)}^{k}=X({e}_{\tau (i),\tau (t)})$; hence, we model ${C}^{k}$ as an arbitrarilysized sequence of node orderings each with a unique nodal and edge feature sequence. Considering the ordering $\tau $, a nonlinear function ${f}_{n}$ maps $({C}^{k},\tau )$ to a 2dimensional node feature tensor ${F}_{\tau}^{k}$ using the following formulation:
$${F}_{\tau}^{k}={f}_{n}({C}^{k},\tau )=({F}_{\tau}^{k}(1),{F}_{\tau}^{k}(2),\mathrm{\dots},{F}_{\tau}^{k}({V}^{k}))$$  (1) 
where $$ is the nodal feature of the $i$th node w.r.t the ordering $\tau $. Similarly, we define ${f}_{\mathrm{\Omega}}$ that maps $(G,\tau )$ to a 2dimensional edge feature tensor ${\mathrm{\Omega}}_{\tau}^{k}$ computed by:
$${\mathrm{\Omega}}_{\tau}^{k}={f}_{\mathrm{\Omega}}(G,\tau )=({\mathrm{\Omega}}_{\tau}^{k}(1),{\mathrm{\Omega}}_{\tau}^{k}(2),..,{\mathrm{\Omega}}_{\tau}^{k}({V}^{k}))$$  (2) 
where $$ is the weighted adjacency vector of node $\tau (i)$; thus, for an undirected weighted ${C}^{k}$, the tensor $$ uniquely defines the graph ${C}^{k}$. As a result, one can recover ${C}^{k}$ by the mapping $$; hence, in order to learn the distribution of the nodes and edges in communities ${C}^{k}(k\le K)$ denoted by $P({C}^{k})$, we maximize the likelihood of their observation in actual network ${G}_{P}$ by:
$$  (3) 
where $\mathbb{I}(x)=1$ if $x$ is true and zero otherwise. Using this optimization, we maximize the observation of ${G}_{P}$’s communities, hence finding a model that can imitate ${G}_{P}$.
Let us write a decomposition for $P({F}_{\tau}^{k},{\mathrm{\Omega}}_{\tau}^{k})$ in (3) using conditional probabilities of the observed sequence of nodal and edge features implied by $\tau $ for any community ${C}^{k}$:
$$\prod _{i=1}^{\left{V}^{k}\right}P\left(({F}_{\tau}^{k}\left(i\right),{\mathrm{\Omega}}_{\tau}^{k}\left(i\right))({F}_{\tau}^{k}\left(1\right),{\mathrm{\Omega}}_{\tau}^{k}\left(1\right)),\mathrm{\dots},({F}_{\tau}^{k}\left(i1\right),{\mathrm{\Omega}}_{\tau}^{k}\left(i1\right))\right)$$  (4) 
The recursive characteristics of the graph features in (4) motivates us to model nodal and edge features by a recurrent neural network. Therefore, we model $P({F}_{\tau}^{k},{\mathrm{\Omega}}_{\tau}^{k})$ using a deep Gated Recurrent Unit (GRU) network. For each community ${C}^{k}$, at each iteration $i$ of our deep recurrent model, the GRU computes a graph state ${S}_{i}$ using the previous node/edge observations $I(t)$ $(j=1,2,\mathrm{\dots},i)$ by the following recurrent structure:
$$\begin{array}{c}I\left(t\right)=[{\alpha}^{k}\left(t\right)={F}_{\tau}^{k}\left(t\right),{\beta}^{k}\left(t\right)={\mathrm{\Omega}}_{\tau}^{k}(i,j)]\hfill \\ z\left(t\right)=Sig\left({W}_{1}\cdot [I\left(t\right),M\left(t\right)]\right)\hfill \\ r(t)=Sig({W}_{2}\cdot [I(t),M(t)])\hfill \\ \stackrel{~}{M}\left(t\right)=tnh\left({W}_{M}\cdot [I\left(t\right),r\left(t\right)*M\left(t1\right)]\right)\hfill \\ M\left(t\right)=\left(1z\left(t\right)\right)*M\left(t1\right)+z\left(t\right)*\stackrel{~}{M}\left(t\right)\hfill \end{array}$$  (5) 
Here, ${\alpha}^{k}(t)$ and ${\beta}^{k}(t)$ in $I(t)$ are the intermediate nodal and edge features of ${C}^{k}$ observed at the $j$th round of the recursive formulation. At each round $j$, the temporal features, $z(t)$ and $r(t)$, are computed for the graph observation $I(t)$ using a sigmoid function $Sig$ with weights ${W}_{1}$ and ${W}_{2}$, respectively. Then, the GRU’s updating vector $\stackrel{~}{M}(t)$ is computed using the intermediate graph features, $r(t)$ and $I(t)$, as well as the GRU state at the previous round $M(t1)$. Here, $tnh$ denotes a tangent hyperbolic function with weight ${W}_{M}$ applied to compute $\stackrel{~}{M}(t)$. The GRU state $M(t)$ is finally computed as a linear regression of $M(t1)$ and $\stackrel{~}{M}(t)$ weighted by the temporal feature $z(t)$. After running the recursion (5) for $i$ rounds, the final GRU state $C(j=i)$ is our extracted graph state ${S}_{i}$.
At each iteration $i$, we model the node/edge distribution of each community ${C}^{k}$, and generate a new synthetic community ${C}_{Syn}^{k}=({V}_{Syn}^{k},{E}_{Syn}^{k})$ corresponding to ${C}^{k}$ using the following formulations:
$$\begin{array}{c}{\xi}_{i}^{F}={f}_{F}^{nn}({S}_{i}),{\xi}_{i}^{\mathrm{\Omega}}={f}_{\mathrm{\Omega}}^{nn}({S}_{i})\hfill \\ {F}_{\tau}^{k}(i)\sim {P}_{{\xi}_{i}^{F}}\hspace{1em}\{Samplingthefeaturesofithnode\}\hfill \\ {\mathrm{\Omega}}_{\tau}^{k}(i)\sim {P}_{{\xi}_{i}^{\mathrm{\Omega}}}\hspace{1em}\{Samplingtheweightofithedge\}\hfill \end{array}$$  (6) 
Topological Properties  Power Flow Statistics (MW)  
Networks  #Nodes  #Edges  ${\colorbox[rgb]{0.937254901960784,0.937254901960784,0.937254901960784}{$d$}}_{\colorbox[rgb]{0.937254901960784,0.937254901960784,0.937254901960784}{$a$}\colorbox[rgb]{0.937254901960784,0.937254901960784,0.937254901960784}{$$}\colorbox[rgb]{0.937254901960784,0.937254901960784,0.937254901960784}{$v$}\colorbox[rgb]{0.937254901960784,0.937254901960784,0.937254901960784}{$$}\colorbox[rgb]{0.937254901960784,0.937254901960784,0.937254901960784}{$g$}}$  $\colorbox[rgb]{0.937254901960784,0.937254901960784,0.937254901960784}{$\omega $}$  $\colorbox[rgb]{0.937254901960784,0.937254901960784,0.937254901960784}{$D$}$  $\colorbox[rgb]{0.937254901960784,0.937254901960784,0.937254901960784}{$\rho $}\colorbox[rgb]{0.937254901960784,0.937254901960784,0.937254901960784}{$*$}{\colorbox[rgb]{0.937254901960784,0.937254901960784,0.937254901960784}{$10$}}^{\colorbox[rgb]{0.937254901960784,0.937254901960784,0.937254901960784}{$$}\colorbox[rgb]{0.937254901960784,0.937254901960784,0.937254901960784}{$4$}}$  $\colorbox[rgb]{0.937254901960784,0.937254901960784,0.937254901960784}{$Q$}$  $\colorbox[rgb]{0.937254901960784,0.937254901960784,0.937254901960784}{$C$}$  Average  Median  STDV  Max  
${\colorbox[rgb]{1,1,1}{${G}$}}_{\colorbox[rgb]{1,1,1}{${P}$}}$  14430  18554  2.572  15.061  37  1.782  0.953  0.072  168.93  33.32  320.93  4,777.07  
${\colorbox[rgb]{1,1,1}{${G}$}}_{\colorbox[rgb]{1,1,1}{${S}$}\colorbox[rgb]{1,1,1}{${}$}\colorbox[rgb]{1,1,1}{${y}$}\colorbox[rgb]{1,1,1}{${}$}\colorbox[rgb]{1,1,1}{${n}$}}$  14408  18302  2.510  15.253  37  1.820  0.961  0.072  173.27  29.31  302.21  4,687.83  
GMM  14269  19881  2.269  16.229  35  1.953  0.804  0.063  110.25  42.78  479.08  5,521,37  
DeepDGL Mean(STDV) 








171.18 (11.34)  30.42 (1.20)  307.18 (15.59)  4,590.25 (217.01) 
where ${f}_{F}^{nn}$ and ${f}_{\mathrm{\Omega}}^{nn}$ are two sigmoidal neural networks (SNNs) with the input ${S}_{i}$ and output vectors ${\xi}_{i}^{F}$ and ${\xi}_{i}^{\mathrm{\Omega}}$, respectively. The vector ${\xi}_{i}^{F}$ encodes the nodal feature distribution of the $i$th node in ${C}_{Syn}^{k}$. Similarly, ${\xi}_{i}^{\mathrm{\Omega}}$ encodes the edge weight distribution corresponding to this node. Sampling from these two distributions in (6), we create a new node (i.e., the ith synthesized node) of ${C}_{Syn}^{k}$. We run (6) to create new nodes and edges until the GRU reaches a predefined $END$ state, which terminates the graph generation process and outputs ${C}_{Syn}^{k}$ for every community ${C}^{k}$. Each of the generated communities ${C}_{Syn}^{k}$ is a component of our final synthetic power network ${P}_{Syn}$.
As shown in [5], in dense regions of realworld power grids, the lowdegree buses are generally connected to highdegree buses. Thus, in order to merge the generated components and form a singlecomponent robust network ${P}_{Syn}$, we connect the lowdegree nodes ${n}_{i}\in {\cup}_{k=1}^{K}{V}_{Syn}^{k}$ in dense regions, to highdegree nodes ${n}_{j}\in {\cup}_{k=1}^{K}{V}_{Syn}^{k}$ until the ${P}_{Syn}$ becomes singlecomponent. The resulting ${P}_{Syn}$ is finally reported as the output.
III Simulation Results
First, we set several hyperparameters defined in the DeepGDL algorithm. The GRU state variables $S$, $z$ and $r$, are considered to have $120$ dimensions. Both ${f}_{f}^{nn}$ and ${f}_{\mathrm{\Lambda}}^{nn}$ are modeled as a SNN with $45$ neurons in its first layer and $35$ neurons in the second layer. The gradient descent is considered to have a learning rate of ${10}^{3}$ while its batch size is $30$. In this study, $K=72$ graph communities are considered for ${G}_{P}$. We generate ${10}^{2}$ synthetic power grids via DeepGDL. Fig. 1(a) and (b) depict the actual network ${G}_{P}$ and one of our synthetic networks ${G}_{Syn}$, respectively. Each community is shown by a distinct color. Moreover, Fig. 1(c) and (d) show the amount of power supply at each bus (node) in the two networks. To compare the performance of the generated grids with the actual grid, we conduct cascading failure experiments using DC power flow. Fig. 1(e) and (f) demonstrate the Yield (i.e. ratio of demand supplied after cascades to the actual value of demand) in the two power grids for cascades initiatd in ${10}^{3}$ regions of radius $100$km. Each point shows the computed Yield at the center of its corresponding region. As shown in Fig. 1, ${G}_{Syn}$ accurately resembles the actual grid regarding the topological characteristics and power flow measurements. Table I reports detailed topological measurements and power flow statistics obtained by ${G}_{P}$, DeepGDL algorithm, and the stateoftheart GMM [5] model. It is shown that DeepGDL has significant improvements over GMM in terms of average node degree ${d}_{avg}$, average path length $\omega $, network Diameter $D$, Density $\rho $, Modularity $Q$, and Average Clustering Coefficient $C$. Also, Table I shows the substantial accuracy improvements of DeepGDL over the stateoftheart in terms of power flow measurements.
IV Conclusions
This letter presents a deep graph distribution learning algorithm for the problem of power grid synthesis. A novel recurrent model is proposed to capture deep nodal and edge features from a realistic power network. To the best of our knowledge, DeepGDL is the only deep learning model that can synthesize largescale networks. Simulation results show significant accuracy of DeepGDL in terms of topological measurements and power flow analysis metrics.
References
 [1] “IEEE benchmark systems,” www.labs.ece.uw.edu/pstca/, accessed: 20190120.
 [2] “Polish grid,” www.pserc.cornell.edu/matpower/, accessed: 20190121.
 [3] K. M. Gegner, A. B. Birchfield, T. Xu, K. S. Shetye, and T. J. Overbye, “A methodology for the creation of geographically realistic synthetic power flow models,” in Power and Energy Conference at Illinois (PECI), 2016 IEEE, 2016, pp. 1–6.
 [4] A. B. Birchfield, T. Xu, K. M. Gegner, K. S. Shetye, and T. J. Overbye, “Grid structural characteristics as validation criteria for synthetic networks,” IEEE Transactions on power systems, vol. 32, no. 4, pp. 3258–3265, 2017.
 [5] S. Soltan, A. Loh, and G. Zussman, “A learningbased method for generating synthetic power grids,” IEEE Systems Journal, no. 99, pp. 1–10, 2018.
 [6] S. Soltan, A. Loh, and g. Zussman, “Columbia university synthetic power grid with geographical coordinates,” Tech. Rep., 1 2018.
 [7] X. Lu, K. Kuzmin, M. Chen, and B. K. Szymanski, “Adaptive modularity maximization via edge weighting scheme,” Information Sciences, vol. 424, pp. 55–68, 2018.