Abstract
We describe Bayesian Layers, a module designed for fast experimentation with neural network uncertainty. It extends neural network libraries with layers capturing uncertainty over weights (Bayesian neural nets), pre-activation units (dropout), activations ("stochastic output layers"), and the function itself (Gaussian processes). With reversible layers, one can also propagate uncertainty from input to output, such as for flow-based distributions and constant-memory backpropagation. Bayesian Layers are a drop-in replacement for other layers, maintaining core features that one typically desires for experimentation. As a demonstration, we fit a 10-billion parameter "Bayesian Transformer", which replaces attention layers with their Bayesian counterparts, on 512 TPUv2 cores.
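
To illustrate what a drop-in layer with uncertainty over weights might look like, the following is a minimal sketch in plain Python. It is not the Bayesian Layers API: the class names `Dense` and `BayesianDense`, the helper `predict`, the method `kl_to_standard_normal`, the fixed initial standard deviation, and the standard normal prior are all assumptions made for illustration. The sketch assumes a fully factorized Gaussian posterior over weights, sampled with the reparameterization trick on every forward pass.

```python
import math
import random


class Dense:
    """Deterministic fully connected layer: y = W x + b."""

    def __init__(self, in_dim, out_dim, rng):
        self.w = [[rng.gauss(0.0, 0.1) for _ in range(in_dim)] for _ in range(out_dim)]
        self.b = [0.0] * out_dim

    def __call__(self, x):
        return [sum(wi * xi for wi, xi in zip(row, x)) + b
                for row, b in zip(self.w, self.b)]


class BayesianDense:
    """Hypothetical layer with the same call signature as Dense.

    Each weight has an independent Gaussian posterior N(mean, std^2)
    (an assumption of this sketch, not a statement about the library).
    """

    def __init__(self, in_dim, out_dim, rng, init_std=0.05):
        self.rng = rng
        self.w_mean = [[rng.gauss(0.0, 0.1) for _ in range(in_dim)] for _ in range(out_dim)]
        self.w_std = [[init_std] * in_dim for _ in range(out_dim)]
        self.b = [0.0] * out_dim

    def __call__(self, x):
        # Reparameterization: w = mean + std * eps with eps ~ N(0, 1),
        # drawn fresh on every call, so repeated calls give different outputs.
        out = []
        for means, stds, b in zip(self.w_mean, self.w_std, self.b):
            row = [m + s * self.rng.gauss(0.0, 1.0) for m, s in zip(means, stds)]
            out.append(sum(wi * xi for wi, xi in zip(row, x)) + b)
        return out

    def kl_to_standard_normal(self):
        # Closed-form KL(N(m, s^2) || N(0, 1)), summed over all weights.
        total = 0.0
        for means, stds in zip(self.w_mean, self.w_std):
            for m, s in zip(means, stds):
                total += 0.5 * (s * s + m * m - 1.0) - math.log(s)
        return total


def predict(layer, x, num_samples=1):
    """Average several forward passes; works unchanged for either layer type."""
    outs = [layer(x) for _ in range(num_samples)]
    return [sum(col) / num_samples for col in zip(*outs)]
```

Because both classes share a call signature, `predict` does not need to know which one it receives. In a training loop, the value returned by `kl_to_standard_normal` would typically be added to the loss as a regularizer, which is the usual variational treatment and an assumption here rather than something the abstract specifies.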