Steering Language Models with Game-Theoretic Solvers

Abstract

Mathematical models of interactions among rational agents have long been studied in game theory. However, these interactions are often over a small set of discrete game actions, which is very different from how humans communicate in natural language. To bridge this gap, we introduce a framework that allows equilibrium solvers to work over the space of natural language dialogue generated by large language models (LLMs). Specifically, by modelling the players, strategies and payoffs in a "game" of dialogue, we create a binding from natural language interactions to the conventional symbolic logic of game theory. Given this binding, we can ask existing game-theoretic algorithms to provide us with strategic solutions (e.g., what string an LLM should generate to maximize payoff in the face of strategic partners or opponents), giving us predictors of stable, rational conversational strategies. We focus on three domains that require different negotiation strategies: scheduling meetings, trading fruit and debate, and evaluate an LLM's generated language when guided by solvers. We see that LLMs that follow game-theory solvers result in dialogue generations that are less exploitable than the control (no guidance from solvers), and the language generated results in higher rewards, in all negotiation domains. We discuss future implications of this work, and how game-theoretic solvers that can leverage the expressivity of natural language can open up a new avenue of guiding language research.
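
As a rough illustration of the binding described above, the sketch below treats each player's action as a hypothetical prompt instruction (a tone), assigns invented payoffs to each pair of tones, and measures how exploitable a strategy profile is. The action names, the payoff numbers, and the helper `steer_prompt` are assumptions made for this example only and are not taken from the paper. In practice the payoffs would have to be estimated from generated dialogue, and an equilibrium solver would replace the hand-picked profile used here.

```python
from itertools import product

# Hypothetical action set: each action is an instruction appended to the LLM prompt.
ACTIONS = ["assertive", "conciliatory"]

# Invented payoffs for illustration: PAYOFFS[(a0, a1)] = (payoff to player 0, payoff to player 1).
PAYOFFS = {
    ("assertive", "assertive"): (1.0, 1.0),
    ("assertive", "conciliatory"): (3.0, 0.0),
    ("conciliatory", "assertive"): (0.0, 3.0),
    ("conciliatory", "conciliatory"): (2.0, 2.0),
}


def expected_payoff(player, profile):
    """Expected payoff to `player` when each player samples from profile[i]."""
    total = 0.0
    for a0, a1 in product(ACTIONS, repeat=2):
        total += profile[0][a0] * profile[1][a1] * PAYOFFS[(a0, a1)][player]
    return total


def best_response_value(player, profile):
    """Best payoff `player` can reach by deviating to any single action."""
    best = float("-inf")
    for action in ACTIONS:
        deviation = list(profile)
        deviation[player] = {a: 1.0 if a == action else 0.0 for a in ACTIONS}
        best = max(best, expected_payoff(player, deviation))
    return best


def exploitability(profile):
    """Sum over players of the gain available from a unilateral deviation."""
    return sum(
        best_response_value(p, profile) - expected_payoff(p, profile)
        for p in (0, 1)
    )


def steer_prompt(dialogue_so_far, action):
    """Hypothetical helper: turn a chosen action into text that conditions the LLM."""
    return f"{dialogue_so_far}\nReply in a {action} tone."


# Control: no solver guidance, modelled here as a uniform choice over tones.
uniform_profile = [{a: 0.5 for a in ACTIONS} for _ in range(2)]

# A profile with no profitable deviation in this toy game (both players assertive).
guided_profile = [{"assertive": 1.0, "conciliatory": 0.0} for _ in range(2)]

control_gap = exploitability(uniform_profile)
guided_gap = exploitability(guided_profile)
```

In this toy game the uniform profile leaves each player a profitable deviation, while the second profile does not, so `guided_gap` is smaller than `control_gap`. This mirrors only the notion of "less exploitable" used in the abstract; it does not reproduce the paper's experiments or its numbers.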
