Learning Robust Dialog Policies in Noisy Environments

Abstract

Modern virtual personal assistants provide a convenient interface for completing daily tasks via voice commands. An important consideration for these assistants is the ability to recover from automatic speech recognition (ASR) and natural language understanding (NLU) errors. In this paper, we focus on learning robust dialog policies to recover from these errors. To this end, we develop a user simulator which interacts with the assistant through voice commands in realistic scenarios with noisy audio, and use it to learn dialog policies through deep reinforcement learning. We show that dialogs generated by our simulator are indistinguishable from human-generated dialogs, as determined by human evaluators. Furthermore, preliminary experimental results show that the learned policies in noisy environments achieve the same execution success rate with fewer dialog turns compared to fixed rule-based policies.
