An Attack on InstaHide: Is Private Learning Possible with Instance Encoding?

Abstract

A learning algorithm is private if the produced model does not reveal (too much) about its training set. InstaHide [Huang, Song, Li, Arora, ICML'20] is a recent proposal that claims to preserve privacy by an encoding mechanism that modifies the inputs before being processed by the normal learner. We present a reconstruction attack on InstaHide that is able to use the encoded images to recover visually recognizable versions of the original images. Our attack is effective and efficient, and empirically breaks InstaHide on CIFAR-10, CIFAR-100, and the recently released InstaHide Challenge. We further formalize various privacy notions of learning through instance encoding and investigate the possibility of achieving these notions. We prove barriers against achieving (indistinguishability-based notions of) privacy through any learning protocol that uses instance encoding.
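
For intuition about what "instance encoding" means here, the following is a minimal sketch of an InstaHide-style encoder. It is based on the public description of InstaHide (mix several images with random weights that sum to 1, then flip the sign of each pixel at random) and is not the authors' implementation. The function names, the flat-list image representation, and the default `k=4` are illustrative assumptions. The second function shows one simple observation that is not stated in the abstract: taking absolute values undoes the random sign flip up to the sign of the mixed pixel.

```python
import random


def instahide_encode(images, k=4, rng=None):
    """Illustrative InstaHide-style encoding (hypothetical helper).

    images: list of equal-length flat lists of floats in [-1, 1].
    Returns (encoded, mixed) so the effect of the sign mask can be inspected.
    """
    rng = rng or random.Random()
    chosen = rng.sample(images, k)

    # Random nonnegative mixing weights, normalised to sum to 1.
    raw = [rng.random() for _ in range(k)]
    total = sum(raw)
    weights = [w / total for w in raw]

    # Pixel-wise weighted sum of the k chosen images.
    n = len(chosen[0])
    mixed = [sum(w * img[i] for w, img in zip(weights, chosen)) for i in range(n)]

    # Random pixel-wise sign flip applied to the mixed image.
    mask = [rng.choice((-1.0, 1.0)) for _ in range(n)]
    encoded = [m * s for m, s in zip(mixed, mask)]
    return encoded, mixed


def strip_sign_mask(encoded):
    """Absolute value removes the random sign, leaving |mixed| per pixel."""
    return [abs(x) for x in encoded]


if __name__ == "__main__":
    rng = random.Random(0)
    # Eight toy "images" of 16 pixels each, with values in [-1, 1].
    data = [[rng.uniform(-1.0, 1.0) for _ in range(16)] for _ in range(8)]
    encoded, mixed = instahide_encode(data, k=4, rng=rng)
    print(strip_sign_mask(encoded) == [abs(m) for m in mixed])
```

The sketch only illustrates the encoding step and the fact that the sign mask does not hide pixel magnitudes; it does not reproduce the reconstruction attack itself.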
