Lemur: Harmonizing Natural Language and Code for Language Agents

  • 2024-08-24 22:30:00
  • Yiheng Xu, Hongjin Su, Chen Xing, Boyu Mi, Qian Liu, Weijia Shi, Binyuan Hui, Fan Zhou, Yitao Liu, Tianbao Xie, Zhoujun Cheng, Siheng Zhao, Lingpeng Kong, Bailin Wang, Caiming Xiong, Tao Yu

Abstract

We introduce Lemur and Lemur-Chat, openly accessible language models optimized for both natural language and coding capabilities to serve as the backbone of versatile language agents. The evolution from language chat models to functional language agents demands that models not only master human interaction, reasoning, and planning but also ensure grounding in the relevant environments. This calls for a harmonious blend of language and coding capabilities in the models. Lemur and Lemur-Chat are proposed to address this necessity, demonstrating balanced proficiencies in both domains, unlike existing open-source models that tend to specialize in either. Through meticulous pre-training using a code-intensive corpus and instruction fine-tuning on text and code data, our models achieve state-of-the-art averaged performance across diverse text and coding benchmarks among open-source models. Comprehensive experiments demonstrate Lemur's superiority over existing open-source models and its proficiency across various agent tasks involving human communication, tool usage, and interaction under fully and partially observable environments. The harmonization between natural and programming languages enables Lemur-Chat to significantly narrow the gap with proprietary models on agent abilities, providing key insights into developing advanced open-source agents adept at reasoning, planning, and operating seamlessly across environments. https://github.com/OpenLemur/Lemur

 
