Abstract
Data synthesis is gaining momentum as a privacy-enhancing technology. While single-table tabular data generation has seen considerable progress, current methods for multi-table data often lack the flexibility and expressiveness needed to capture complex relational structures. In particular, they struggle with long-range dependencies and complex foreign-key relationships, such as tables with multiple parent tables or multiple types of links between the same pair of tables. We propose a generative model for relational data that generates the content of a relational dataset given the graph formed by the foreign-key relationships. We do this by learning a deep generative model of the content of the whole relational database by flow matching, where the neural network trained to denoise records leverages a graph neural network to obtain information from connected records. Our method is flexible, as it can support relational datasets with complex structures, and expressive, as the generation of each record can be influenced by any other record within the same connected component. We evaluate our method on several benchmark datasets and show that it achieves state-of-the-art performance in terms of synthetic data fidelity.