LaMDA: Language Models for Dialog Applications

  • 2022-01-20 15:44:37
  • Romal Thoppilan, Daniel De Freitas, Jamie Hall, Noam Shazeer, Apoorv Kulshreshtha, Heng-Tze Cheng, Alicia Jin, Taylor Bos, Leslie Baker, Yu Du, YaGuang Li, Hongrae Lee, Huaixiu Steven Zheng, Amin Ghafouri, Marcelo Menegali, Yanping Huang, Maxim Krikun, Dmitry Lepikhin, James Qin, Dehao Chen, Yuanzhong Xu, Zhifeng Chen, Adam Roberts, Maarten Bosma, Yanqi Zhou, Chung-Ching Chang, Igor Krivokon, Will Rusch, Marc Pickett, Kathleen Meier-Hellstern, Meredith Ringel Morris, Tulsee Doshi, Renelito Delos Santos, Toju Duke, Johnny Soraker, Ben Zevenbergen, Vinodkumar Prabhakaran, Mark Diaz, Ben Hutchinson, Kristen Olson, Alejandra Molina, Erin Hoffman-John, Josh Lee, Lora Aroyo, Ravi Rajakumar, Alena Butryna, Matthew Lamm, Viktoriya Kuzmina, Joe Fenton, Aaron Cohen, Rachel Bernstein, Ray Kurzweil, B
  • 322

Abstract

We present LaMDA: Language Models for Dialog Applications. LaMDA is a family of Transformer-based neural language models specialized for dialog, which have up to 137B parameters and are pre-trained on 1.56T words of public dialog data and web text. While model scaling alone can improve quality, it shows less improvements on safety and factual grounding. We demonstrate that fine-tuning with annotated data and enabling the model to consult external knowledge sources can lead to significant improvements towards the two key challenges of safety and factual grounding. The first challenge, safety, involves ensuring that the model's responses are consistent with a set of human values, such as preventing harmful suggestions and unfair bias. We quantify safety using a metric based on an illustrative set of human values, and we find that filtering candidate responses using a LaMDA classifier fine-tuned with a small amount of crowdworker-annotated data offers a promising approach to improving model safety. The second challenge, factual grounding, involves enabling the model to consult external knowledge sources, such as an information retrieval system, a language translator, and a calculator. We quantify factuality using a groundedness metric, and we find that our approach enables the model to generate responses grounded in known sources, rather than responses that merely sound plausible. Finally, we explore the use of LaMDA in the domains of education and content recommendations, and analyze their helpfulness and role consistency.
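
The candidate-filtering idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the function name `pick_response`, the two scoring callables, and the threshold value are all hypothetical stand-ins, and the abstract does not say how the surviving candidates are ranked. The sketch only shows the general pattern of discarding candidates that a safety classifier scores below a threshold and then choosing among the rest.

```python
from typing import Callable, List, Optional


def pick_response(
    candidates: List[str],
    safety_score: Callable[[str], float],   # hypothetical classifier: higher = safer
    quality_score: Callable[[str], float],  # hypothetical ranker: higher = better
    safety_threshold: float = 0.5,          # assumed value, not taken from the paper
) -> Optional[str]:
    """Drop candidates below the safety threshold, then return the best survivor."""
    safe = [c for c in candidates if safety_score(c) >= safety_threshold]
    if not safe:
        # Every candidate was filtered out; the caller decides on a fallback.
        return None
    return max(safe, key=quality_score)
```

In this sketch a caller would generate several candidate responses, pass them in together with its own scoring functions, and fall back to a default reply when `None` is returned.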

 
