Towards robust and domain agnostic reinforcement learning competitions

Abstract

Reinforcement learning competitions have formed the basis for standard research benchmarks, galvanized advances in the state-of-the-art, and shaped the direction of the field. Despite this, a majority of challenges suffer from the same fundamental problems: participant solutions to the posed challenge are usually domain-specific, biased to maximally exploit compute resources, and not guaranteed to be reproducible. In this paper, we present a new framework of competition design that promotes the development of algorithms that overcome these barriers. We propose four central mechanisms for achieving this end: submission retraining, domain randomization, desemantization through domain obfuscation, and the limitation of competition compute and environment-sample budget. To demonstrate the efficacy of this design, we proposed, organized, and ran the MineRL 2020 Competition on Sample-Efficient Reinforcement Learning. In this work, we describe the organizational outcomes of the competition and show that the resulting participant submissions are reproducible, non-specific to the competition environment, and sample/resource efficient, despite the difficult competition task.
