Abstract
Balancing combat encounters in Dungeons & Dragons (D&D) is a complex task that requires Dungeon Masters (DMs) to manually assess party strength, enemy composition, and dynamic player interactions while avoiding interruption of the narrative flow. In this paper, we propose Encounter Generation via Reinforcement Learning (NTRL), a novel approach that automates Dynamic Difficulty Adjustment (DDA) in D&D via combat encounter design. By framing the problem as a contextual bandit, NTRL generates encounters based on the real-time attributes of party members. In comparison with classic DM heuristics, NTRL iteratively optimizes encounters to extend combat longevity (+200%), increase the damage dealt to party members and thereby reduce post-combat hit points (-16.67%), and raise the number of player deaths while keeping total party kills (TPKs) low. The intensification of combat forces players to act wisely and engage in tactical maneuvers, even though the generated encounters guarantee high win rates (70%). Even in comparison with encounters designed by human Dungeon Masters, NTRL demonstrates superior performance by enhancing the strategic depth of combat while increasing difficulty in a manner that preserves overall game fairness.