Outbidding and Outbluffing Elite Humans: Mastering Liar's Poker via Self-Play and Reinforcement Learning

Abstract

AI researchers have long focused on poker-like games as a testbed for environments characterized by multi-player dynamics, imperfect information, and reasoning under uncertainty. While recent breakthroughs have matched elite human play at no-limit Texas hold'em, the multi-player dynamics are subdued: most hands converge quickly with only two players engaged through multiple rounds of bidding. In this paper, we present Solly, the first AI agent to achieve elite human play in reduced-format Liar's Poker, a game characterized by extensive multi-player engagement. We trained Solly using self-play with a model-free, actor-critic, deep reinforcement learning algorithm. Solly played at an elite human level as measured by win rate (won over 50% of hands) and equity (money won) in heads-up and multi-player Liar's Poker. Solly also outperformed large language models (LLMs), including those with reasoning abilities, on the same metrics. Solly developed novel bidding strategies, randomized play effectively, and was not easily exploitable by world-class human players.
