Feedback-Based Self-Learning in Large-Scale Conversational AI Agents

Abstract

Today, most large-scale conversational AI agents (e.g. Alexa, Siri, or Google Assistant) are built using manually annotated data to train the different components of the system. Typically, the accuracy of the ML models in these components is improved by manually transcribing and annotating data. As the scope of these systems increases to cover more scenarios and domains, manual annotation to improve the accuracy of these components becomes prohibitively costly and time consuming. In this paper, we propose a system that leverages user-system interaction feedback signals to automate learning without any manual annotation. Users tend to modify a previous query in hopes of fixing an error in the previous turn to get the right results. These reformulations are often preceded by defective experiences caused by errors in ASR, NLU, ER or the application. In some cases, users may not properly formulate their requests (e.g. providing a partial title of a song), but gleaning across a wider pool of users and sessions reveals the underlying recurrent patterns. Our proposed self-learning system automatically detects the errors, generates reformulations and deploys fixes to the runtime system to correct different types of errors occurring in different components of the system. In particular, we propose leveraging an absorbing Markov Chain model as a collaborative filtering mechanism in a novel attempt to mine these patterns. We show that our approach is highly scalable, and able to learn reformulations that reduce Alexa-user errors by pooling anonymized data across millions of customers. The proposed self-learning system achieves a win/loss ratio of 11.8 and effectively reduces the defect rate by more than 30% on utterance-level reformulations in our production A/B tests. To the best of our knowledge, this is the first self-learning large-scale conversational AI system in production.
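
To make the absorbing Markov Chain idea concrete, the following is a minimal sketch and not the system described in the paper. It assumes, purely for illustration, that each session is a list of utterance strings and that the last utterance of a session was the successful one; the function names `build_transitions` and `absorption_probs`, the `"OK"` marker, and the example utterances are all hypothetical. Transition counts are pooled across sessions, and the probability of a defective utterance being absorbed into each successful utterance is then computed by propagating probability mass through the chain.

```python
from collections import defaultdict


def build_transitions(sessions):
    """Pool utterance-to-utterance transition counts across sessions.

    Assumption for this sketch: the final utterance of every session
    succeeded, so it transitions into an absorbing ("OK", utterance) state.
    """
    counts = defaultdict(lambda: defaultdict(float))
    for session in sessions:
        for src, dst in zip(session, session[1:]):
            counts[src][dst] += 1.0
        counts[session[-1]][("OK", session[-1])] += 1.0

    # Normalize each row of counts into transition probabilities.
    probs = {}
    for src, row in counts.items():
        total = sum(row.values())
        probs[src] = {dst: c / total for dst, c in row.items()}
    return probs


def absorption_probs(probs, start, iters=200):
    """Return the probability of ending in each absorbing state from `start`.

    Probability mass is pushed forward one step at a time; mass that reaches
    an absorbing state (a tuple) is accumulated and removed from the walk.
    """
    dist = {start: 1.0}
    absorbed = defaultdict(float)
    for _ in range(iters):
        nxt = defaultdict(float)
        for state, mass in dist.items():
            for dst, p in probs.get(state, {}).items():
                if isinstance(dst, tuple):  # absorbing success state
                    absorbed[dst[1]] += mass * p
                else:  # transient state: keep walking
                    nxt[dst] += mass * p
        dist = nxt
        if not dist:  # all mass has been absorbed
            break
    return dict(absorbed)


# Made-up sessions: three users rephrase the same misrecognized request.
sessions = [
    ["play maj and dragons", "play imagine dragons"],
    ["play maj and dragons", "play imagine dragons"],
    ["play maj and dragons", "play magic dragons"],
    ["play imagine dragons"],
]
probs = build_transitions(sessions)
result = absorption_probs(probs, "play maj and dragons")
best_rewrite = max(result, key=result.get)
```

In this toy data the rewrite chosen for the defective utterance is the successful utterance that most pooled sessions ended on, which is the collaborative-filtering intuition: no single session needs to be reliable as long as the aggregate pattern is.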
