Instance-Optimality for Private KL Distribution Estimation

Abstract

We study the fundamental problem of estimating an unknown discretedistribution $p$ over $d$ symbols, given $n$ i.i.d. samples from thedistribution. We are interested in minimizing the KL divergence between thetrue distribution and the algorithm's estimate. We first construct minimaxoptimal private estimators. Minimax optimality however fails to shed light onan algorithm's performance on individual (non-worst-case) instances $p$ andsimple minimax-optimal DP estimators can have poor empirical performance onreal distributions. We then study this problem from an instance-optimalityviewpoint, where the algorithm's error on $p$ is compared to the minimumachievable estimation error over a small local neighborhood of $p$. Undernatural notions of local neighborhood, we propose algorithms that achieveinstance-optimality up to constant factors, with and without a differentialprivacy constraint. Our upper bounds rely on (private) variants of theGood-Turing estimator. Our lower bounds use additive local neighborhoods thatmore precisely captures the hardness of distribution estimation in KLdivergence, compared to ones considered in prior works.

Quick Read (beta)

loading the full paper ...