Abstract
For a dataset of label-count pairs, an anonymized histogram is the multisetof counts. Anonymized histograms appear in various potentially sensitivecontexts such as password-frequency lists, degree distribution in socialnetworks, and estimation of symmetric properties of discrete distributions.Motivated by these applications, we propose the first differentially privatemechanism to release anonymized histograms that achieves near-optimal privacyutility trade-off both in terms of number of items and the privacy parameter.Further, if the underlying histogram is given in a compact format, the proposedalgorithm runs in time sub-linear in the number of items. For anonymizedhistograms generated from unknown discrete distributions, we show that thereleased histogram can be directly used for estimating symmetric properties ofthe underlying distribution.