TextArena

  • 2025-04-15 18:55:20
  • Leon Guertler, Bobby Cheng, Simon Yu, Bo Liu, Leshem Choshen, Cheston Tan

Abstract

TextArena is an open-source collection of competitive text-based games for training and evaluation of agentic behavior in Large Language Models (LLMs). It spans 57+ unique environments (including single-player, two-player, and multi-player setups) and allows for easy evaluation of model capabilities via an online-play system (against humans and other submitted models) with real-time TrueSkill scores. Traditional benchmarks rarely assess dynamic social skills such as negotiation, theory of mind, and deception, creating a gap that TextArena addresses. Designed with research, community, and extensibility in mind, TextArena emphasizes ease of adding new games, adapting the framework, testing models, playing against the models, and training models. Detailed documentation of the environments, games, leaderboard, and examples is available at https://github.com/LeonGuertler/TextArena and https://www.textarena.ai/.

 
