Effects of Different Optimization Formulations in Evolutionary Reinforcement Learning on Diverse Behavior Generation

Abstract

Generating a variety of strategies for a given task is challenging. However, it has already been shown to benefit the main learning process in several ways, such as improved behavior exploration. With growing interest in solution heterogeneity in evolutionary computation and reinforcement learning, many promising approaches have emerged. To better understand how to guide multiple policies toward distinct strategies and benefit from diversity, we need to analyze further the influence of reward signal modulation and other evolutionary mechanisms on the obtained behaviors. To that end, this paper considers an existing evolutionary reinforcement learning framework that exploits multi-objective optimization to obtain policies that succeed at behavior-related tasks as well as at completing the main goal. Experiments on Atari games show that optimization formulations which do not consider objectives equally fail to generate diversity and even produce agents that are worse at solving the problem at hand, regardless of the obtained behaviors.
