Learning Strategic Language Agents in the Werewolf Game with Iterative Latent Space Policy Optimization

Abstract

Large language model (LLM) agents have recently demonstrated impressive capabilities in various domains like open-ended conversation and multi-step decision-making. However, it remains challenging for these agents to solve strategic language games, such as Werewolf, which demand both strategic decision-making and free-form language interactions. Existing LLM agents often suffer from intrinsic bias in their action distributions and limited exploration of the unbounded text action space, resulting in suboptimal performance. To address these challenges, we propose Latent Space Policy Optimization (LSPO), an iterative framework that combines game-theoretic methods with LLM fine-tuning to build strategic language agents. LSPO leverages the observation that while the language space is combinatorially large, the underlying strategy space is relatively compact. We first map free-form utterances into a finite latent strategy space, yielding an abstracted extensive-form game. Then we apply game-theoretic methods like Counterfactual Regret Minimization (CFR) to optimize the policy in the latent space. Finally, we fine-tune the LLM via Direct Preference Optimization (DPO) to align with the learned policy. By iteratively alternating between these steps, our LSPO agents progressively enhance both strategic reasoning and language communication. Experiments on the Werewolf game show that our agents progressively expand the strategy space while improving performance and outperform existing Werewolf agents, underscoring their effectiveness in free-form language games with strategic interactions.
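To make the game-theoretic step more concrete, the following is a minimal sketch of regret matching, the per-decision update rule on which CFR is built. It is not the paper's implementation: the 2x2 zero-sum payoff matrix stands in for an abstracted game over two hypothetical latent strategies per player, and the names `PAYOFF`, `regret_matching`, and `solve` are illustrative assumptions.

```python
# Hypothetical payoff matrix for a toy zero-sum game (an assumption, not from the paper).
# PAYOFF[i][j] is the row player's payoff when row plays latent strategy i
# and column plays latent strategy j; the column player receives the negative.
PAYOFF = [
    [2.0, -1.0],
    [-1.0, 1.0],
]


def regret_matching(regrets):
    """Return a mixed strategy proportional to the positive cumulative regrets."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total <= 0.0:
        # No action has positive regret: fall back to the uniform strategy.
        return [1.0 / len(regrets)] * len(regrets)
    return [p / total for p in positive]


def solve(payoff, iterations=20000):
    """Run regret matching in self-play and return both players' average strategies."""
    n_rows, n_cols = len(payoff), len(payoff[0])
    row_regrets, col_regrets = [0.0] * n_rows, [0.0] * n_cols
    row_sum, col_sum = [0.0] * n_rows, [0.0] * n_cols

    for _ in range(iterations):
        row = regret_matching(row_regrets)
        col = regret_matching(col_regrets)

        # Expected payoff of each pure action against the opponent's current mix.
        row_values = [
            sum(payoff[i][j] * col[j] for j in range(n_cols)) for i in range(n_rows)
        ]
        col_values = [
            -sum(payoff[i][j] * row[i] for i in range(n_rows)) for j in range(n_cols)
        ]
        row_ev = sum(p * v for p, v in zip(row, row_values))
        col_ev = sum(p * v for p, v in zip(col, col_values))

        # Regret of an action: how much better it would have done than the current mix.
        for i in range(n_rows):
            row_regrets[i] += row_values[i] - row_ev
            row_sum[i] += row[i]
        for j in range(n_cols):
            col_regrets[j] += col_values[j] - col_ev
            col_sum[j] += col[j]

    # The average strategy, not the final one, is what approaches equilibrium.
    row_avg = [s / iterations for s in row_sum]
    col_avg = [s / iterations for s in col_sum]
    return row_avg, col_avg


if __name__ == "__main__":
    row_avg, col_avg = solve(PAYOFF)
    print("row average strategy:", row_avg)
    print("column average strategy:", col_avg)
```

For this particular matrix, the mixed equilibrium puts probability 0.4 on the first strategy for both players, so the averaged strategies should approach (0.4, 0.6). CFR applies the same update at every information set of an extensive-form game, weighting regrets by counterfactual reach probabilities. The sketch omits that part.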
