Comparing Model-free and Model-based Algorithms for Offline Reinforcement Learning

Abstract

Offline reinforcement learning (RL) algorithms are often designed with environments such as MuJoCo in mind, in which the planning horizon is extremely long and no noise exists. We compare model-free, model-based, and hybrid offline RL approaches on various industrial benchmark (IB) datasets to test the algorithms in settings closer to real-world problems, including complex noise and partially observable states. We find that on the IB, hybrid approaches face severe difficulties and that simpler algorithms, such as rollout-based algorithms or model-free algorithms with simpler regularizers, perform best on the datasets.
