Abstract
Money laundering is the process that intends to legalize the income derived from illicit activities, thus facilitating its entry into the monetary flow of the economy without jeopardizing its source. It is crucial to identify such activities accurately and reliably in order to enforce anti-money laundering (AML). Despite considerable AML efforts, a large number of such activities still go undetected. Rule-based methods were introduced first and are still widely used in current detection systems. With the rise of machine learning, graph-based learning methods have gained prominence in detecting illicit accounts through the analysis of money transfer graphs. Nevertheless, these methods generally assume that the transaction graph is centralized, whereas in practice, money laundering activities usually span multiple financial institutions. Due to regulatory, legal, commercial, and customer privacy concerns, institutions tend not to share data, which restricts the practical utility of these methods. In this paper, we propose the first algorithm that supports performing AML over multiple institutions while protecting the security and privacy of local data. To evaluate it, we construct Alipay-ECB, a real-world dataset comprising digital transactions from Alipay, the world's largest mobile payment platform, alongside transactions from E-Commerce Bank (ECB). The dataset includes over 200 million accounts and 300 million transactions, covering both intra-institution transactions and those between Alipay and ECB. This makes it the largest real-world transaction graph available for analysis. The experimental results demonstrate that our method can effectively identify cross-institution money laundering subgroups. Additionally, experiments on synthetic datasets demonstrate that our method is efficient, requiring only a few minutes on datasets with millions of transactions.