Improving VAEs' Robustness to Adversarial Attack

Abstract

Variational autoencoders (VAEs) have recently been shown to be vulnerable to adversarial attacks, wherein they are fooled into reconstructing a chosen target image. However, how to defend against such attacks remains an open problem. We make significant advances in addressing this issue by introducing methods for producing adversarially robust VAEs. Namely, we first demonstrate that methods used to obtain disentangled latent representations produce VAEs that are more robust to these attacks. However, this robustness comes at the cost of reducing the quality of the reconstructions. We, therefore, introduce a new hierarchical VAE, the $\textit{Seatbelt-VAE}$, which can produce high-fidelity autoencoders that are also adversarially robust. We confirm the capabilities of the Seatbelt-VAE on several different datasets and with current state-of-the-art VAE adversarial attacks.
