Abstract
Data privacy is an important issue for "machine learning as a service" providers. We focus on the problem of membership inference attacks: given a data sample and black-box access to a model's API, determine whether the sample existed in the model's training data. Our contribution is an investigation of this problem in the context of sequence-to-sequence models, which are important in applications such as machine translation and video captioning. We define the membership inference problem for sequence generation, provide an open dataset based on state-of-the-art machine translation models, and report initial results on whether these models leak private information against several kinds of membership inference attacks.
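To make the problem statement concrete, the following is a minimal sketch of one simple black-box attack, assuming the attacker holds a candidate (source, reference) pair and can only call the model's translation API. It is not the method evaluated in this paper: the threshold heuristic, the unigram-overlap score, and every name here (`unigram_f1`, `infer_membership`, `toy_translate`, the 0.9 threshold) are hypothetical and chosen for illustration only.

```python
from collections import Counter
from typing import Callable


def unigram_f1(hypothesis: str, reference: str) -> float:
    """Token-overlap F1 between two whitespace-tokenized sentences."""
    hyp, ref = Counter(hypothesis.split()), Counter(reference.split())
    overlap = sum((hyp & ref).values())  # multiset intersection of tokens
    if overlap == 0:
        return 0.0
    precision = overlap / sum(hyp.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def infer_membership(
    translate: Callable[[str], str],  # black-box API: source text -> output text
    source: str,
    reference: str,
    threshold: float = 0.9,  # assumed value; would need tuning on held-out data
) -> bool:
    """Guess 'member' when the model output nearly reproduces the reference."""
    return unigram_f1(translate(source), reference) >= threshold


def toy_translate(source: str) -> str:
    """Toy stand-in for a real API: it 'memorizes' a single training pair."""
    memorized = {"guten morgen": "good morning"}
    return memorized.get(source, "unknown")


if __name__ == "__main__":
    print(infer_membership(toy_translate, "guten morgen", "good morning"))
    print(infer_membership(toy_translate, "gute nacht", "good night"))
```

The intuition behind this heuristic is that a model may reproduce training references more closely than unseen ones. Whether that signal is strong enough to separate members from non-members in realistic machine translation systems is exactly the question the paper investigates.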