RecoGym: A Reinforcement Learning Environment for the problem of Product Recommendation in Online Advertising

  • 2018-08-02 09:13:18
  • David Rohde, Stephen Bonner, Travis Dunlop, Flavian Vasile, Alexandros Karatzoglou
  • 22

Abstract

Recommender systems are becoming ubiquitous in many settings and take many forms, from product recommendation in e-commerce stores, to query suggestions in search engines, to friend recommendation in social networks. Current research directions, which are largely based upon supervised learning from historical data, appear to be showing diminishing returns, with many practitioners reporting a discrepancy between improvements in offline metrics for supervised learning and the online performance of the newly proposed models. One possible reason is that we are using the wrong paradigm: when looking at the long-term cycle of collecting historical performance data, creating a new version of the recommendation model, A/B testing it and then rolling it out, we see that there are a lot of commonalities with the reinforcement learning (RL) setup, where the agent observes the environment and acts upon it in order to change its state towards better states (states with higher rewards). To this end we introduce RecoGym, an RL environment for recommendation, which is defined by a model of user traffic patterns on e-commerce and the users' response to recommendations on the publisher websites. We believe that this is an important step forward for the field of recommendation systems research that could open up an avenue of collaboration between the recommender systems and reinforcement learning communities and lead to better alignment between offline and online performance metrics.
