Abstract
We present the first acoustic side-channel attack that recovers what users type on the virtual keyboard of their touch-screen smartphone or tablet. When a user taps the screen with a finger, the tap generates a sound wave that propagates on the screen surface and in the air. We found the device's microphone(s) can recover this wave and "hear" the finger's touch, and the wave's distortions are characteristic of the tap's location on the screen. Hence, by recording audio through the built-in microphone(s), a malicious app can infer text as the user enters it on their device. We evaluate the effectiveness of the attack with 45 participants in a real-world environment on an Android tablet and an Android smartphone. For the tablet, we recover 61% of 200 4-digit PIN-codes within 20 attempts, even if the model is not trained with the victim's data. For the smartphone, we recover 9 words of size 7--13 letters with 50 attempts in a common side-channel attack benchmark. Our results suggest that it is not always sufficient to rely on isolation mechanisms such as TrustZone to protect user input. We propose and discuss hardware, operating-system and application-level mechanisms to block this attack more effectively. Mobile devices may need a richer capability model, a more user-friendly notification system for sensor usage and a more thorough evaluation of the information leaked by the underlying hardware.