When Machine Unlearning Jeopardizes Privacy

Abstract

The right to be forgotten states that a data owner has the right to erase their data from an entity storing it. In the context of machine learning (ML), the right to be forgotten requires an ML model owner to remove the data owner's data from the training set used to build the ML model, a process known as machine unlearning. While originally designed to protect the privacy of the data owner, we argue that machine unlearning may leave some imprint of the data in the ML model and thus create unintended privacy risks. In this paper, we perform the first study on investigating the unintended information leakage caused by machine unlearning. We propose a novel membership inference attack that leverages the different outputs of an ML model's two versions to infer whether a target sample is part of the training set of the original model but out of the training set of the corresponding unlearned model. Our experiments demonstrate that the proposed membership inference attack achieves strong performance. More importantly, we show that our attack in multiple cases outperforms the classical membership inference attack on the original ML model, which indicates that machine unlearning can have counterproductive effects on privacy. We notice that the privacy degradation is especially significant for well-generalized ML models where classical membership inference does not perform well. We further investigate four mechanisms to mitigate the newly discovered privacy risks and show that releasing the predicted label only, temperature scaling, and differential privacy are effective. We believe that our results can help improve privacy protection in practical implementations of machine unlearning. Our code is available at https://github.com/MinChen00/UnlearningLeaks.
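
To make the core idea concrete, the following is a minimal sketch of how an attacker might turn the outputs of the original and unlearned models into a single signal for one target sample, and of how temperature scaling shrinks that signal. It is an illustration only: the abstract does not specify the feature construction or the attack model, so the element-wise posterior difference, the helper names (`softmax`, `attack_features`, `l2_norm`), and the logit values are all assumptions.

```python
import math

def softmax(logits, temperature=1.0):
    """Softmax with temperature; a larger temperature flattens the output."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def attack_features(post_original, post_unlearned):
    """Element-wise difference of the two posteriors.

    Assumption: this is one plausible feature choice, not necessarily
    the construction used in the paper.
    """
    return [a - b for a, b in zip(post_original, post_unlearned)]

def l2_norm(vec):
    return math.sqrt(sum(v * v for v in vec))

# Hypothetical logits for one target sample, queried on both model versions.
logits_original = [4.0, 1.0, 0.5]    # original model (sample was in training set)
logits_unlearned = [2.0, 1.5, 1.0]   # unlearned model (sample was removed)

# Size of the attacker's signal without and with temperature scaling.
signal = {}
for temperature in (1.0, 5.0):
    diff = attack_features(
        softmax(logits_original, temperature),
        softmax(logits_unlearned, temperature),
    )
    signal[temperature] = l2_norm(diff)
    print(f"T={temperature}: ||posterior difference|| = {signal[temperature]:.4f}")
```

In a full attack, features of this kind would be fed to a trained binary classifier rather than inspected by hand. The loop only shows that, for these made-up logits, a higher temperature reduces the gap between the two posteriors, which is the intuition behind temperature scaling as a mitigation.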
