Differentially private anonymized histograms

Abstract

For a dataset of label-count pairs, an anonymized histogram is the multisetof counts. Anonymized histograms appear in various potentially sensitivecontexts such as password-frequency lists, degree distribution in socialnetworks, and estimation of symmetric properties of discrete distributions.Motivated by these applications, we propose the first differentially privatemechanism to release anonymized histograms that achieves near-optimal privacyutility trade-off both in terms of number of items and the privacy parameter.Further, if the underlying histogram is given in a compact format, the proposedalgorithm runs in time sub-linear in the number of items. For anonymizedhistograms generated from unknown discrete distributions, we show that thereleased histogram can be directly used for estimating symmetric properties ofthe underlying distribution.

Quick Read (beta)

loading the full paper ...