SF-DQN: Provable Knowledge Transfer using Successor Feature for Deep Reinforcement Learning

  • 2024-09-22 23:49:54
  • Shuai Zhang, Heshan Devaka Fernando, Miao Liu, Keerthiram Murugesan, Songtao Lu, Pin-Yu Chen, Tianyi Chen, Meng Wang

Abstract

This paper studies the transfer reinforcement learning (RL) problem where multiple RL problems have different reward functions but share the same underlying transition dynamics. In this setting, the Q-function of each RL problem (task) can be decomposed into a successor feature (SF) and a reward mapping: the former characterizes the transition dynamics, and the latter characterizes the task-specific reward function. This Q-function decomposition, coupled with a policy improvement operator known as generalized policy improvement (GPI), reduces the sample complexity of finding the optimal Q-function, and thus the SF & GPI framework exhibits promising empirical performance compared to traditional RL methods like Q-learning. However, its theoretical foundations remain largely unestablished, especially when learning the successor features using deep neural networks (SF-DQN). This paper studies provable knowledge transfer using SF-DQN in transfer RL problems. We establish the first convergence analysis with provable generalization guarantees for SF-DQN with GPI. The theory reveals that SF-DQN with GPI outperforms conventional RL approaches, such as deep Q-network, in terms of both faster convergence rate and better generalization. Numerical experiments on real and synthetic RL tasks support the superior performance of SF-DQN & GPI, aligning with our theoretical findings.
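
The decomposition and GPI step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the paper's implementation: it assumes the Q-function is the inner product of a successor feature vector and a reward-mapping vector, and that GPI picks the action maximizing the best Q-value across previously learned policies. The function names, array shapes, and toy numbers are all hypothetical, and in SF-DQN the successor features would come from a trained deep network rather than fixed arrays.

```python
import numpy as np

def q_values(psi, w):
    """Q(s, a) = psi(s, a) . w for every action.

    psi: successor features at one state, shape (n_actions, d)
    w:   task-specific reward mapping, shape (d,)
    """
    return psi @ w

def gpi_action(psis, w):
    """Generalized policy improvement at one state.

    psis: successor features of several stored policies,
          shape (n_policies, n_actions, d)
    Returns argmax_a max_i psi_i(s, a) . w.
    """
    q = psis @ w                      # shape (n_policies, n_actions)
    return int(np.argmax(q.max(axis=0)))

# Hypothetical toy data: 2 stored policies, 3 actions, 2-dim features.
psis = np.array([
    [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]],
    [[0.2, 0.8], [0.9, 0.1], [0.0, 2.0]],
])
w_new = np.array([0.0, 1.0])          # reward mapping of a new task (assumed)
best_action = gpi_action(psis, w_new)
```

Because only `w_new` changes between tasks in this sketch, the stored successor features are reused as-is, which is the intuition behind the knowledge transfer the abstract refers to.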

 
