Abstract
Real-time decoding of neural activity is central to neuroscience and neurotechnology applications, from closed-loop experiments to brain-computer interfaces, where models are subject to strict latency constraints. Traditional methods, including simple recurrent neural networks, are fast and lightweight but often struggle to generalize to unseen data. In contrast, recent Transformer-based approaches leverage large-scale pretraining for strong generalization performance, but typically have much larger computational requirements and are not always suitable for low-resource or real-time settings. To address these shortcomings, we present POSSM, a novel hybrid architecture that combines individual spike tokenization via a cross-attention module with a recurrent state-space model (SSM) backbone to enable (1) fast and causal online prediction on neural activity and (2) efficient generalization to new sessions, individuals, and tasks through multi-dataset pretraining. We evaluate POSSM's decoding performance and inference speed on intracortical decoding of monkey motor tasks, and show that it extends to clinical applications, namely handwriting and speech decoding in human subjects. Notably, we demonstrate that pretraining on monkey motor-cortical recordings improves decoding performance on the human handwriting task, highlighting the exciting potential for cross-species transfer. In all of these tasks, we find that POSSM achieves decoding accuracy comparable to state-of-the-art Transformers, at a fraction of the inference cost (up to 9x faster on GPU). These results suggest that hybrid SSMs are a promising approach to bridging the gap between accuracy, inference speed, and generalization when training neural decoders for real-time, closed-loop applications.