Towards robust and domain agnostic reinforcement learning competitions

  • 2021-06-07 16:15:46
  • William Hebgen Guss, Stephanie Milani, Nicholay Topin, Brandon Houghton, Sharada Mohanty, Andrew Melnik, Augustin Harter, Benoit Buschmaas, Bjarne Jaster, Christoph Berganski, Dennis Heitkamp, Marko Henning, Helge Ritter, Chengjie Wu, Xiaotian Hao, Yiming Lu, Hangyu Mao, Yihuan Mao, Chao Wang, Michal Opanowicz, Anssi Kanervisto, Yanick Schraner, Christian Scheller, Xiren Zhou, Lu Liu, Daichi Nishio, Toi Tsuneda, Karolis Ramanauskas, Gabija Juceviciute

Abstract

Reinforcement learning competitions have formed the basis for standard research benchmarks, galvanized advances in the state-of-the-art, and shaped the direction of the field. Despite this, a majority of challenges suffer from the same fundamental problems: participant solutions to the posed challenge are usually domain-specific, biased to maximally exploit compute resources, and not guaranteed to be reproducible. In this paper, we present a new framework of competition design that promotes the development of algorithms that overcome these barriers. We propose four central mechanisms for achieving this end: submission retraining, domain randomization, desemantization through domain obfuscation, and the limitation of competition compute and environment-sample budget. To demonstrate the efficacy of this design, we proposed, organized, and ran the MineRL 2020 Competition on Sample-Efficient Reinforcement Learning. In this work, we describe the organizational outcomes of the competition and show that the resulting participant submissions are reproducible, non-specific to the competition environment, and sample/resource efficient, despite the difficult competition task.

 
