A Short Survey of Pre-trained Language Models for Conversational AI-A New Age in NLP

Abstract

Building a dialogue system that can communicate naturally with humans is a challenging yet interesting problem in agent-based computing. Rapid progress in this area is usually hindered by the long-standing problem of data scarcity, as these systems are expected to learn syntax, grammar, decision making, and reasoning from insufficient amounts of task-specific data. The recently introduced pre-trained language models have the potential to address the issue of data scarcity and bring considerable advantages by generating contextualized word embeddings. These models are considered the counterpart of ImageNet in NLP and have been shown to capture different facets of language such as hierarchical relations, long-term dependencies, and sentiment. In this short survey paper, we discuss the recent progress made in the field of pre-trained language models. We also discuss how the strengths of these language models can be leveraged in designing more engaging and more eloquent conversational agents. This paper, therefore, intends to establish whether these pre-trained models can overcome the challenges pertinent to dialogue systems, and how their architecture could be exploited in order to overcome these challenges. Open challenges in the field of dialogue systems are also discussed.
