Reinforcement Learning for Personalized Dialogue Management

Abstract

Language systems have been of great interest to the research community and have recently reached the mass market through various assistant platforms on the web. Reinforcement Learning methods that optimize dialogue policies have seen successes in past years and have recently been extended into methods that personalize the dialogue, e.g. take the personal context of users into account. These works, however, are limited to personalization to a single user with whom they require multiple interactions and do not generalize the usage of context across users. This work introduces a problem where a generalized usage of context is relevant and proposes two Reinforcement Learning (RL)-based approaches to this problem. The first approach uses a single learner and extends the traditional POMDP formulation of dialogue state with features that describe the user context. The second approach segments users by context and then employs a learner per context. We compare these approaches in a benchmark of existing non-RL and RL-based methods in three established and one novel application domain of financial product recommendation. We compare the influence of context and training experiences on performance and find that learning approaches generally outperform a handcrafted gold standard.
