Neural Waveshaping Synthesis

Abstract

We present the Neural Waveshaping Unit (NEWT): a novel, lightweight, fully causal approach to neural audio synthesis which operates directly in the waveform domain, with an accompanying optimisation (FastNEWT) for efficient CPU inference. The NEWT uses time-distributed multilayer perceptrons with periodic activations to implicitly learn nonlinear transfer functions that encode the characteristics of a target timbre. Once trained, a NEWT can produce complex timbral evolutions by simple affine transformations of its input and output signals. We paired the NEWT with a differentiable noise synthesiser and reverb and found it capable of generating realistic musical instrument performances with only 260k total model parameters, conditioned on F0 and loudness features. We compared our method to state-of-the-art benchmarks with a multi-stimulus listening test and the Fréchet Audio Distance and found it performed competitively across the tested timbral domains. Our method significantly outperformed the benchmarks in terms of generation speed, and achieved real-time performance on a consumer CPU, both with and without FastNEWT, suggesting it is a viable basis for future creative sound design tools.
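
The waveshaping idea described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the layer sizes, the random (untrained) weights, and the names `sine_mlp_shaper`, `waveshape`, `index`, `bias`, `out_scale` and `out_bias` are all assumptions made for illustration. It shows only the general structure: a small MLP with sine activations applied independently to every sample acts as a memoryless transfer function, and time-varying affine transformations before and after it change the resulting timbre.

```python
import numpy as np


def sine_mlp_shaper(x, weights, biases):
    """Apply a small MLP with sine activations to each sample independently.

    Because every sample is processed on its own, the MLP behaves as a
    memoryless nonlinear transfer function (a waveshaper).
    """
    h = x.reshape(-1, 1)                      # one scalar input per time step
    for W, b in zip(weights[:-1], biases[:-1]):
        h = np.sin(h @ W + b)                 # periodic activation
    h = h @ weights[-1] + biases[-1]          # linear output layer
    return h.reshape(x.shape)


def waveshape(exciter, index, bias, out_scale, out_bias, weights, biases):
    """Affine transform -> shaper -> affine transform."""
    shaped = sine_mlp_shaper(index * exciter + bias, weights, biases)
    return out_scale * shaped + out_bias


# Hypothetical, untrained shaper: 1 -> 8 -> 8 -> 1 (sizes chosen arbitrarily).
rng = np.random.default_rng(0)
sizes = [1, 8, 8, 1]
weights = [rng.normal(size=(m, n)) for m, n in zip(sizes[:-1], sizes[1:])]
biases = [rng.normal(size=n) for n in sizes[1:]]

# A 220 Hz sinusoidal exciter, one second long at an assumed 16 kHz sample rate.
sr = 16000
t = np.arange(sr) / sr
exciter = np.sin(2 * np.pi * 220.0 * t)

# Sweeping the input gain over time changes the harmonic content of the output.
index = np.linspace(0.1, 2.0, sr)
audio = waveshape(exciter, index, 0.0, 0.5, 0.0, weights, biases)
```

In a trained system the shaper weights would be learned from audio and the affine parameters would be predicted from control features such as F0 and loudness; here they are fixed by hand so the sketch stays self-contained.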
