A Unified Perspective on Optimization in Machine Learning and Neuroscience: From Gradient Descent to Neural Adaptation

  • 2025-10-21 17:10:15
  • Jesús García Fernández, Nasir Ahmad, Marcel van Gerven

Abstract

Iterative optimization is central to modern artificial intelligence (AI) and provides a crucial framework for understanding adaptive systems. This review provides a unified perspective on this subject, bridging classic theory with neural network training and biological learning. Although gradient-based methods, powered by the efficient but biologically implausible backpropagation (BP), dominate machine learning, their computational demands can hinder scalability in high-dimensional settings. In contrast, derivative-free or zeroth-order (ZO) optimization features computationally lighter approaches that rely only on function evaluations and randomness. While generally less sample efficient, recent breakthroughs demonstrate that modern ZO methods can effectively approximate gradients and achieve performance competitive with BP in neural network models. This ZO paradigm is also particularly relevant for biology. Its core principles of random exploration (probing) and feedback-guided adaptation (reinforcing) parallel key mechanisms of biological learning, offering a mathematically principled perspective on how the brain learns. In this review, we begin by categorizing optimization approaches based on the order of derivative information they utilize, ranging from first-, second-, and higher-order gradient-based to ZO methods. We then explore how these methods are adapted to the unique challenges of neural network training and the resulting learning dynamics. Finally, we build upon these insights to view biological learning through an optimization lens, arguing that a ZO paradigm leverages the brain's intrinsic noise as a computational resource. This framework not only illuminates our understanding of natural intelligence but also holds vast implications for neuromorphic hardware, helping us design fast and energy-efficient AI systems that exploit intrinsic hardware noise.
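To make the probing and reinforcing idea concrete, the following is a minimal sketch of one common ZO scheme, a two-point random-direction gradient estimate, and not necessarily the formulation used in the review. The toy quadratic `loss`, the function names `zo_gradient` and `zo_descent`, and all hyperparameters (`sigma`, `num_probes`, `lr`, `steps`) are illustrative assumptions.

```python
import random


def loss(w):
    # Hypothetical toy objective (assumption): quadratic bowl with minimum at (1, -2).
    return (w[0] - 1.0) ** 2 + (w[1] + 2.0) ** 2


def zo_gradient(f, w, rng, sigma=1e-3, num_probes=8):
    """Estimate the gradient of f at w using function evaluations only."""
    grad = [0.0] * len(w)
    for _ in range(num_probes):
        # Probe: perturb the parameters along a random Gaussian direction.
        u = [rng.gauss(0.0, 1.0) for _ in w]
        w_plus = [wi + sigma * ui for wi, ui in zip(w, u)]
        w_minus = [wi - sigma * ui for wi, ui in zip(w, u)]
        # Reinforce: weight the direction by the loss change it produced.
        slope = (f(w_plus) - f(w_minus)) / (2.0 * sigma)
        for i, ui in enumerate(u):
            grad[i] += slope * ui / num_probes
    return grad


def zo_descent(f, w0, steps=500, lr=0.05, seed=0):
    """Run plain descent, with the ZO estimate standing in for the true gradient."""
    rng = random.Random(seed)
    w = list(w0)
    for _ in range(steps):
        g = zo_gradient(f, w, rng)
        w = [wi - lr * gi for wi, gi in zip(w, g)]
    return w


w_final = zo_descent(loss, [0.0, 0.0])
print(w_final)  # should end up close to the minimum at (1, -2)
```

No derivative of `loss` is ever computed: each update uses only pairs of function evaluations along random directions. The estimate is noisy, which is why several probes are averaged per step, in line with the abstract's remark that ZO methods are generally less sample efficient.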

 
