Towards Federated Learning with On-device Training and Communication in 8-bit Floating Point

Abstract

Recent work has shown that 8-bit floating point (FP8) can be used for efficiently training neural networks with reduced computational cost compared to training in FP32/FP16. In this work, we investigate the use of FP8 training in a federated learning context. This approach brings not only the usual benefits of FP8, which are desirable for on-device training at the edge, but also reduces client-server communication costs due to significant weight compression. We present a novel method that combines FP8 client training with a global FP32 server model, and we provide a convergence analysis. Experiments across a variety of machine learning models, tasks, and datasets show that our method consistently reduces the communication needed to reach the same trained model accuracy by at least 2.9x compared to an FP32 baseline.
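
The client-to-server flow described above can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the method from the paper: the E4M3-style grid (3 mantissa bits, maximum value 448, minimum normal exponent -6), the per-tensor scaling, and the plain averaging on the server are all assumptions made for illustration, and the names `quantize_fp8_sim` and `server_aggregate` are hypothetical. FP8 values are simulated inside FP32 arrays rather than stored in a real 8-bit type.

```python
import numpy as np

def quantize_fp8_sim(x, mantissa_bits=3, exp_min=-6, max_val=448.0):
    """Round x onto a simulated FP8 grid (E4M3-like parameters are an assumption).

    Returns the rounded values (stored as FP32) and the per-tensor scale used.
    """
    x = np.asarray(x, dtype=np.float32)
    amax = float(np.max(np.abs(x)))
    # Assumed per-tensor scaling: map the largest magnitude to the format maximum.
    scale = max_val / amax if amax > 0 else 1.0
    y = np.clip(x * scale, -max_val, max_val)
    # frexp gives |y| = m * 2**e with m in [0.5, 1), so floor(log2|y|) = e - 1.
    _, e = np.frexp(np.abs(y))
    exp = np.maximum(e - 1, exp_min)  # clamp to the subnormal range
    # Spacing between neighbouring representable values in each binade.
    step = np.ldexp(np.float32(1.0), exp - mantissa_bits)
    q = np.round(y / step) * step
    return q.astype(np.float32), scale

def server_aggregate(client_payloads):
    """Dequantize each client's weights and average them in FP32 (assumed rule)."""
    dequantized = [q / s for q, s in client_payloads]
    return np.mean(dequantized, axis=0).astype(np.float32)

# Usage: three hypothetical clients send FP8-rounded weights to the server.
rng = np.random.default_rng(0)
base = rng.normal(size=1000).astype(np.float32)
clients = [base + 0.01 * rng.normal(size=1000).astype(np.float32) for _ in range(3)]
global_weights = server_aggregate([quantize_fp8_sim(c) for c in clients])

# Raw per-weight payload ratio of FP32 to FP8.
bits_ratio = 32 / 8
```

The raw per-weight ratio of 32 to 8 bits is 4x. The 2.9x figure quoted above is a different quantity: the total communication needed to reach the same accuracy as the FP32 baseline.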
