A Hassle-free Algorithm for Private Learning in Practice: Don't Use Tree Aggregation, Use BLTs

Abstract

The state-of-the-art for training on-device language models for mobile keyboard applications combines federated learning (FL) with differential privacy (DP) via the DP-Follow-the-Regularized-Leader (DP-FTRL) algorithm. Two variants of DP-FTRL are used in practice, tree aggregation and matrix factorization. However, tree aggregation suffers from significantly suboptimal privacy/utility tradeoffs, while matrix mechanisms require expensive optimization parameterized by hard-to-estimate-in-advance constants, and high runtime memory costs. This paper extends the recently introduced Buffered Linear Toeplitz (BLT) mechanism to multi-participation scenarios. Our BLT-DP-FTRL maintains the ease-of-use advantages of tree aggregation, while essentially matching matrix factorization in terms of utility and privacy. We evaluate BLT-DP-FTRL on the StackOverflow dataset, serving as a reproducible simulation benchmark, and across four on-device language model tasks in a production FL system. Our empirical results highlight the advantages of the BLT mechanism and elevate the practicality and effectiveness of DP in real-world scenarios.
