On the Effects of Fine-tuning Language Models for Text-Based Reinforcement Learning

Abstract

Text-based reinforcement learning involves an agent interacting with a fictional environment using observed text and admissible actions in natural language to complete a task. Previous works have shown that agents can succeed in text-based interactive environments even in the complete absence of semantic understanding or other linguistic capabilities. The success of these agents in playing such games suggests that semantic understanding may not be important for the task. This raises an important question about the benefits of LMs in guiding the agents through the game states. In this work, we show that rich semantic understanding leads to efficient training of text-based RL agents. Moreover, we describe the occurrence of semantic degeneration as a consequence of inappropriate fine-tuning of language models in text-based reinforcement learning (TBRL). Specifically, we describe the shift in the semantic representation of words in the LM, as well as how it affects the performance of the agent in tasks that are semantically similar to the training games. We believe these results may help develop better strategies to fine-tune agents in text-based RL scenarios.
