ADMN: A Layer-Wise Adaptive Multimodal Network for Dynamic Input Noise and Compute Resources

Abstract

Multimodal deep learning systems are deployed in dynamic scenarios due to the robustness afforded by multiple sensing modalities. Nevertheless, they struggle with varying compute resource availability (due to multi-tenancy, device heterogeneity, etc.) and fluctuating quality of inputs (from sensor feed corruption, environmental noise, etc.). Statically provisioned multimodal systems cannot adapt when compute resources change over time, while existing dynamic networks struggle with strict compute budgets. Additionally, both kinds of systems often neglect the impact of variations in modality quality. Consequently, modalities suffering substantial corruption may needlessly consume resources better allocated towards other modalities. We propose ADMN, a layer-wise Adaptive Depth Multimodal Network capable of tackling both challenges: it adjusts the total number of active layers across all modalities to meet strict compute resource constraints and continually reallocates layers across input modalities according to their modality quality. Our evaluations show that ADMN can match the accuracy of state-of-the-art networks while reducing their floating-point operations by up to 75%.
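
As a rough illustration of the idea stated in the abstract (a fixed total layer budget, split across modalities according to input quality), the sketch below distributes a budget of active layers with a simple greedy proportional rule. The function name `allocate_layers`, the scalar quality scores, and the greedy rule itself are assumptions made for this example; the abstract does not say how ADMN computes its allocation, so this should not be read as the authors' method.

```python
def allocate_layers(quality, max_layers, budget):
    """Split `budget` active layers across modalities, favoring higher quality.

    quality:    dict of modality name -> non-negative quality score (hypothetical input)
    max_layers: dict of modality name -> number of layers in that modality's backbone
    budget:     total number of layers allowed to be active across all modalities
    """
    if budget < 0:
        raise ValueError("budget must be non-negative")

    names = list(quality)
    # The budget can never exceed the total number of layers that exist.
    budget = min(budget, sum(max_layers[m] for m in names))
    alloc = {m: 0 for m in names}

    # Greedy proportional rule (an assumption of this sketch): hand out one
    # layer at a time to the modality with the highest quality per layer
    # it would hold after receiving that layer.
    for _ in range(budget):
        candidates = [m for m in names if alloc[m] < max_layers[m]]
        best = max(candidates, key=lambda m: quality[m] / (alloc[m] + 1))
        alloc[best] += 1
    return alloc


if __name__ == "__main__":
    # A clean RGB feed and a heavily corrupted depth feed, 12 layers each,
    # with a strict budget of 8 active layers in total.
    print(allocate_layers({"rgb": 0.9, "depth": 0.1},
                          {"rgb": 12, "depth": 12},
                          budget=8))
```

In this toy setting the corrupted modality receives no layers until the clean one has absorbed the budget, which mirrors the abstract's point that heavily corrupted modalities should not consume resources better spent elsewhere.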
