A Survey on Reproducibility by Evaluating Deep Reinforcement Learning Algorithms on Real-World Robots

  • 2019-09-09 11:33:09
  • Nicolai A. Lynnerup, Laura Nolling, Rasmus Hasle, John Hallam

Abstract

As reinforcement learning (RL) achieves more success in solving complex tasks, more care is needed to ensure that RL research is reproducible and that algorithms can be compared easily and fairly with minimal bias. RL results are, however, notoriously hard to reproduce due to the algorithms' intrinsic variance, the environments' stochasticity, and numerous (potentially unreported) hyper-parameters. In this work we investigate the many issues leading to irreproducible research and how to manage them. We further show how to utilise a rigorous and standardised evaluation approach to ease the documentation, evaluation and fair comparison of different algorithms, emphasising the importance of choosing the right measurement metrics and conducting proper statistical analysis of the results for unbiased reporting.

 
