Understanding and Controlling User Linkability in Decentralized Learning

Abstract

Machine learning techniques are widely used by online services (e.g. Google, Apple) to analyze and make predictions on user data. As many of the provided services are user-centric (e.g. personal photo collections, speech recognition, personal assistance), user data generated on personal devices is key to providing the service. To protect the data and the privacy of the user, federated learning techniques have been proposed in which the data never leaves the user's device and "only" model updates are communicated back to the server. In our work, we propose a new threat model that is concerned not with learning about the content of user data, but with the linkability of users during such decentralized learning scenarios. We show that model updates are characteristic of users and therefore lend themselves to linkability attacks. We demonstrate identification and matching of users across devices in closed-world and open-world scenarios. In our experiments, we find our attacks to be highly effective, achieving 20x to 175x chance-level performance. To mitigate the risks of linkability attacks, we study various strategies. As adding random noise does not offer convincing operating points, we propose strategies based on using calibrated domain-specific data; we find that these strategies offer substantial protection against linkability threats with little effect on utility.
