SwiftRL: Towards Efficient Reinforcement Learning on Real Processing-In-Memory Systems

Abstract

Reinforcement Learning (RL) trains agents to learn optimal behavior by maximizing reward signals from experience datasets. However, RL training often faces memory limitations, leading to execution latencies and prolonged training times. To overcome this, SwiftRL explores Processing-In-Memory (PIM) architectures to accelerate RL workloads. We achieve near-linear performance scaling by implementing RL algorithms like Tabular Q-learning and SARSA on UPMEM PIM systems and optimizing for hardware. Our experiments on OpenAI GYM environments using UPMEM hardware demonstrate superior performance compared to CPU and GPU implementations.
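
For readers unfamiliar with the two algorithms named above, the following is a minimal sketch of the standard textbook update rules for Tabular Q-learning and SARSA. It is not the SwiftRL or UPMEM implementation: the function names, the list-of-lists Q-table layout, and the values of the learning rate `alpha` and discount factor `gamma` are assumptions chosen for illustration only.

```python
# Illustrative sketch only: textbook tabular updates, not the SwiftRL code.
# The Q-table is a list of rows, one row per state, one entry per action.

def q_learning_update(q, s, a, r, s_next, alpha=0.1, gamma=0.95):
    """Off-policy update: bootstrap from the best action in the next state."""
    td_target = r + gamma * max(q[s_next])
    q[s][a] += alpha * (td_target - q[s][a])


def sarsa_update(q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.95):
    """On-policy update: bootstrap from the action actually taken next."""
    td_target = r + gamma * q[s_next][a_next]
    q[s][a] += alpha * (td_target - q[s][a])


# Hypothetical two-state, two-action table and one recorded transition.
q_table = [[0.0, 0.0], [1.0, 2.0]]
q_learning_update(q_table, s=0, a=1, r=1.0, s_next=1)
```

The only difference between the two rules is the bootstrap term: Q-learning uses the maximum value in the next state's row, while SARSA uses the value of the next action that was actually taken. Both updates read and write a handful of table entries per transition, so they do little computation per memory access.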
