Quantifying the Impact of Label Noise on Federated Learning

Abstract

Federated Learning (FL) is a distributed machine learning paradigm where clients collaboratively train a model using their local (human-generated) datasets. While existing studies focus on FL algorithm development to tackle data heterogeneity across clients, the important issue of data quality (e.g., label noise) in FL is overlooked. This paper aims to fill this gap by providing a quantitative study on the impact of label noise on FL. We derive an upper bound for the generalization error that is linear in the clients' label noise level. Then we conduct experiments on MNIST and CIFAR-10 datasets using various FL algorithms. Our empirical results show that the global model accuracy linearly decreases as the noise level increases, which is consistent with our theoretical analysis. We further find that label noise slows down the convergence of FL training, and the global model tends to overfit when the noise level is high.
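
To make the notion of a client's "label noise level" concrete, the following is a minimal sketch of one common way to simulate it: symmetric label flipping, where a chosen fraction of a client's labels is reassigned to a different class picked uniformly at random. The abstract does not state which noise model the paper uses, so the symmetric model, the function name `flip_labels`, and its parameters are illustrative assumptions rather than the authors' procedure.

```python
import numpy as np


def flip_labels(labels, noise_level, num_classes, seed=0):
    """Return a copy of `labels` with a `noise_level` fraction flipped.

    Hypothetical helper: assumes symmetric noise, i.e. each selected
    label is replaced by a different class chosen uniformly at random.
    """
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    noisy = labels.copy()

    # Number of samples whose label will be corrupted.
    n_flip = int(round(noise_level * len(labels)))
    idx = rng.choice(len(labels), size=n_flip, replace=False)

    # Adding an offset in [1, num_classes - 1] modulo num_classes
    # guarantees the new label differs from the original one.
    offsets = rng.integers(1, num_classes, size=n_flip)
    noisy[idx] = (labels[idx] + offsets) % num_classes
    return noisy


# Example: one simulated client with 1000 samples over 10 classes
# (the class count matches MNIST and CIFAR-10).
clean = np.arange(1000) % 10
noisy = flip_labels(clean, noise_level=0.2, num_classes=10)
observed_noise = float(np.mean(noisy != clean))
```

In a federated experiment, a helper like this would be applied to each client's local labels before training, possibly with a different `noise_level` per client, and the global model's test accuracy would then be recorded as the noise level varies.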
