Noise-based reward-modulated learning

Abstract

The pursuit of energy-efficient and adaptive artificial intelligence (AI) has positioned neuromorphic computing as a promising alternative to conventional computing. However, achieving learning on these platforms requires techniques that prioritize local information while enabling effective credit assignment. Here, we propose noise-based reward-modulated learning (NRL), a novel synaptic plasticity rule that mathematically unifies reinforcement learning and gradient-based optimization with biologically inspired local updates. NRL addresses the computational bottleneck of exact gradients by approximating them through stochastic neural activity, transforming the inherent noise of biological and neuromorphic substrates into a functional resource. Drawing inspiration from biological learning, our method uses reward prediction errors as its optimization target to generate increasingly advantageous behavior, and eligibility traces to facilitate retrospective credit assignment. Experimental validation on reinforcement tasks, featuring immediate and delayed rewards, shows that NRL achieves performance comparable to baselines optimized using backpropagation, although with slower convergence, while showing significantly superior performance and scalability in multi-layer networks compared to reward-modulated Hebbian learning (RMHL), the most prominent similar approach. While tested on simple architectures, the results highlight the potential of noise-driven, brain-inspired learning for low-power adaptive systems, particularly in computing substrates with locality constraints. NRL offers a theoretically grounded paradigm well-suited for the event-driven characteristics of next-generation neuromorphic AI.
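To make the ingredients named in the abstract concrete (noise as a gradient estimator, a reward prediction error as the learning signal, and an eligibility trace), the following is a minimal illustrative sketch in Python. It is a generic node-perturbation-style update, not the NRL rule itself, whose exact form is not given in this abstract. The function name `train_noise_rpe`, the two-context toy task, the per-context running-average reward predictor, and all hyperparameters are assumptions chosen for illustration.

```python
import random


def train_noise_rpe(steps=5000, lr=0.01, sigma=0.3, trace_decay=0.0, seed=0):
    """Toy noise-driven, reward-modulated update for one linear unit.

    Hypothetical illustration only: the task, names, and constants are
    assumptions and are not taken from the paper.
    """
    rng = random.Random(seed)
    inputs = [(1.0, 0.0), (0.0, 1.0)]   # two one-hot contexts
    targets = [1.0, -1.0]               # output that maximizes reward per context
    w = [0.0, 0.0]                      # synaptic weights
    trace = [0.0, 0.0]                  # eligibility trace, one entry per synapse
    r_pred = [0.0, 0.0]                 # running reward prediction per context

    for _ in range(steps):
        k = rng.randrange(2)
        x = inputs[k]
        noise = rng.gauss(0.0, sigma)               # stochastic neural activity
        y = w[0] * x[0] + w[1] * x[1] + noise       # noisy output
        reward = -(y - targets[k]) ** 2             # immediate scalar reward
        rpe = reward - r_pred[k]                    # reward prediction error
        r_pred[k] += 0.1 * rpe                      # update the reward predictor
        for i in range(2):
            # Local trace: correlation of presynaptic input and injected noise.
            trace[i] = trace_decay * trace[i] + noise * x[i]
            # Dividing by sigma**2 rescales the noisy estimate so that, on
            # average, the step follows the gradient of the expected reward.
            w[i] += lr * rpe * trace[i] / sigma ** 2
    return w


if __name__ == "__main__":
    print(train_noise_rpe())
```

In this sketch the update uses only quantities available at the synapse (its input, the unit's own noise, and a broadcast scalar error), which is the locality property the abstract emphasizes. With `trace_decay=0.0` the trace reduces to the current noise-input product, matching the immediate-reward setting; a nonzero decay is the usual way to carry credit across delayed rewards, though how NRL handles this in detail is not stated here.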
