Paired Open-Ended Trailblazer (POET): Endlessly Generating Increasingly Complex and Diverse Learning Environments and Their Solutions

  • 2019-01-07 11:28:36
  • Rui Wang, Joel Lehman, Jeff Clune, Kenneth O. Stanley
  • 27

Abstract

While the history of machine learning so far encompasses a series of problems posed by researchers and algorithms that learn their solutions, an important question is whether the problems themselves can be generated by the algorithm at the same time as they are being solved. Such a process would in effect build its own diverse and expanding curricula, and the solutions to problems at various stages would become stepping stones towards solving even more challenging problems later in the process. The Paired Open-Ended Trailblazer (POET) algorithm introduced in this paper does just that: it pairs the generation of environmental challenges and the optimization of agents to solve those challenges. It simultaneously explores many different paths through the space of possible problems and solutions and, critically, allows these stepping-stone solutions to transfer between problems if better, catalyzing innovation. The term open-ended signifies the intriguing potential for algorithms like POET to continue to create novel and increasingly complex capabilities without bound. The results show that POET produces a diverse range of sophisticated behaviors that solve a wide range of environmental challenges, many of which cannot be solved by direct optimization alone, or even through a direct, single-path curriculum-based control algorithm introduced to highlight the critical role of open-endedness in solving ambitious challenges. The ability to transfer solutions from one environment to another proves essential to unlocking the full potential of the system as a whole, demonstrating the unpredictable nature of fortuitous stepping stones. We hope that POET will inspire a new push towards open-ended discovery across many domains, where algorithms like POET can blaze a trail through their interesting possible manifestations and solutions.
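
The loop described in the abstract (generate new environments, optimize each paired agent, and transfer an agent between environments when it does better) can be illustrated with a toy sketch. This is not the paper's implementation. Everything here is an assumption made for illustration: environments and agents are single floats, `score` is a made-up objective, the optimizer is plain hill climbing, and the names `poet_sketch`, `mutate_env`, `optimize_step`, `transfer`, and all parameters are hypothetical.

```python
import random


def score(agent, env):
    # Toy objective (assumption): higher is better, 0 means a perfect solution.
    return -abs(agent - env)


def optimize_step(agent, env, rng, sigma=0.1):
    # One hill-climbing step: keep the perturbed agent only if it scores better.
    candidate = agent + rng.gauss(0.0, sigma)
    return candidate if score(candidate, env) > score(agent, env) else agent


def mutate_env(env, rng, step=0.5):
    # Generate a new, somewhat harder environment from an existing one.
    return env + rng.uniform(0.0, step)


def transfer(pairs):
    # For each environment, adopt whichever agent in the population scores best
    # there. The current agent is among the candidates, so scores never drop.
    agents = [agent for _, agent in pairs]
    return [(env, max(agents, key=lambda a: score(a, env))) for env, _ in pairs]


def poet_sketch(iterations=200, max_pairs=5, mutate_every=20,
                solved=-0.05, seed=0):
    rng = random.Random(seed)
    pairs = [(0.0, 0.0)]  # start from one trivial (environment, agent) pair
    for it in range(1, iterations + 1):
        if it % mutate_every == 0:
            # Only environments whose paired agent is good enough may reproduce;
            # each child environment starts with a copy of its parent's agent.
            children = [(mutate_env(env, rng), agent)
                        for env, agent in pairs
                        if score(agent, env) >= solved]
            # Keep the population bounded by dropping the oldest pairs.
            pairs = (pairs + children)[-max_pairs:]
        # Optimize every agent in its own paired environment.
        pairs = [(env, optimize_step(agent, env, rng)) for env, agent in pairs]
        # Let stepping-stone solutions move between environments if better.
        pairs = transfer(pairs)
    return pairs


if __name__ == "__main__":
    for env, agent in poet_sketch():
        print(f"env={env:.3f} agent={agent:.3f} score={score(agent, env):.3f}")
```

Running the transfer step on every iteration is a simplification chosen to keep the sketch short. The abstract does not specify how often transfers are attempted or how environments are selected.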

 
