TESSERACT: Gradient Flip Score to Secure Federated Learning Against Model Poisoning Attacks

Abstract

Federated learning (multi-party, distributed learning in a decentralized environment) is vulnerable to model poisoning attacks, even more so than centralized learning approaches. This is because malicious clients can collude and send in carefully tailored model updates to make the global model inaccurate. This motivated the development of Byzantine-resilient federated learning algorithms, such as Krum, Bulyan, FABA, and FoolsGold. However, a recently developed untargeted model poisoning attack showed that all prior defenses can be bypassed. The attack uses the intuition that simply by changing the sign of the gradient updates that the optimizer is computing, for a set of malicious clients, a model can be diverted from the optimum to increase the test error rate. In this work, we develop TESSERACT, a defense against this directed deviation attack, a state-of-the-art model poisoning attack. TESSERACT is based on a simple intuition that in a federated learning setting, certain patterns of gradient flips are indicative of an attack. This intuition is remarkably stable across different learning algorithms, models, and datasets. TESSERACT assigns reputation scores to the participating clients based on their behavior during the training phase and then takes a weighted contribution of the clients. We show that TESSERACT provides robustness against even a white-box version of the attack.
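
The idea of scoring clients by gradient flips and then weighting their contributions can be illustrated with a small sketch. The abstract does not give the exact scoring or reputation rules, so everything below is an assumption made for illustration: the flip score is taken to be the number of coordinates whose sign disagrees with the previous global update direction, the clients whose scores deviate most from the median are penalized, and the names `flip_score`, `update_reputations`, `aggregate`, and `n_penalized` are hypothetical.

```python
from statistics import median
from typing import List, Sequence, Tuple


def sign(x: float) -> int:
    """Return -1, 0, or +1 according to the sign of x."""
    return (x > 0) - (x < 0)


def flip_score(update: Sequence[float], prev_direction: Sequence[float]) -> int:
    """Count coordinates whose sign is opposite to the previous global direction.

    Assumption: this simple count stands in for the paper's flip score.
    """
    return sum(1 for u, g in zip(update, prev_direction) if sign(u) * sign(g) < 0)


def update_reputations(
    reputations: Sequence[float],
    scores: Sequence[int],
    n_penalized: int,
    reward: float = 1.0,
    penalty: float = 1.0,
) -> List[float]:
    """Penalize the n_penalized clients whose scores deviate most from the median.

    Assumption: an illustrative rule, not the rule from the paper.
    """
    med = median(scores)
    by_deviation = sorted(
        range(len(scores)), key=lambda i: abs(scores[i] - med), reverse=True
    )
    penalized = set(by_deviation[:n_penalized])
    return [
        r - penalty if i in penalized else r + reward
        for i, r in enumerate(reputations)
    ]


def aggregate(
    updates: Sequence[Sequence[float]], reputations: Sequence[float]
) -> Tuple[List[float], List[float]]:
    """Reputation-weighted average of client updates.

    Negative reputations are clipped to zero; if every weight is zero,
    fall back to a plain average.
    """
    clipped = [max(r, 0.0) for r in reputations]
    total = sum(clipped)
    n = len(updates)
    weights = [c / total for c in clipped] if total > 0 else [1.0 / n] * n
    dim = len(updates[0])
    combined = [sum(w * u[j] for w, u in zip(weights, updates)) for j in range(dim)]
    return combined, weights


if __name__ == "__main__":
    prev_direction = [1.0, 1.0, -1.0, -1.0]
    updates = [
        [0.9, 1.1, -1.0, -0.8],   # honest
        [1.0, 0.8, -1.2, -1.0],   # honest
        [1.1, -0.1, -0.9, -1.1],  # honest, one noisy coordinate
        [-1.0, -1.0, 1.0, 1.0],   # malicious: every sign flipped
    ]
    scores = [flip_score(u, prev_direction) for u in updates]
    reputations = update_reputations([0.0] * 4, scores, n_penalized=1)
    combined, weights = aggregate(updates, reputations)
    print(scores, weights, combined)
```

In this toy round the client that flips every sign receives a negative reputation and therefore zero weight, so the aggregated update keeps the direction of the honest clients. A real deployment would need a scoring rule that also handles attackers who flip only a few carefully chosen coordinates.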
