TabM: Advancing Tabular Deep Learning with Parameter-Efficient Ensembling

Abstract

Deep learning architectures for supervised learning on tabular data range from simple multilayer perceptrons (MLP) to sophisticated Transformers and retrieval-augmented methods. This study highlights a major, yet so far overlooked opportunity for designing substantially better MLP-based tabular architectures. Namely, our new model TabM relies on efficient ensembling, where one TabM efficiently imitates an ensemble of MLPs and produces multiple predictions per object. Compared to a traditional deep ensemble, in TabM, the underlying implicit MLPs are trained simultaneously, and (by default) share most of their parameters, which results in significantly better performance and efficiency. Using TabM as a new baseline, we perform a large-scale evaluation of tabular DL architectures on public benchmarks in terms of both task performance and efficiency, which renders the landscape of tabular DL in a new light. Generally, we show that MLPs, including TabM, form a line of stronger and more practical models compared to attention- and retrieval-based architectures. In particular, we find that TabM demonstrates the best performance among tabular DL models. Then, we conduct an empirical analysis on the ensemble-like nature of TabM. We observe that the multiple predictions of TabM are weak individually, but powerful collectively. Overall, our work brings an impactful technique to tabular DL and advances the performance-efficiency trade-off with TabM, a simple and powerful baseline for researchers and practitioners.
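
To make the idea of one model producing several predictions per object more concrete, the following is a minimal illustrative sketch of a parameter-sharing ensemble, not the TabM implementation. The abstract does not specify how parameters are shared or how the predictions are combined, so two assumptions are made here: members share all weight matrices and differ only in a per-member elementwise input scaling vector (a BatchEnsemble-style scheme), and the final prediction is the plain average of the member predictions. The names `make_model`, `predict_all`, and `predict` are hypothetical, and only the forward pass is shown, not the simultaneous training.

```python
import random


def make_model(d_in, d_hidden, k, seed=0):
    """Build a toy ensemble of k one-hidden-layer MLPs that share weights.

    ASSUMPTION: the sharing scheme (shared matrices plus per-member scaling
    vectors) is illustrative; the abstract does not describe the mechanism.
    """
    rng = random.Random(seed)
    return {
        # Shared by all k implicit members.
        "W1": [[rng.gauss(0, 1) for _ in range(d_in)] for _ in range(d_hidden)],
        "w2": [rng.gauss(0, 1) for _ in range(d_hidden)],
        # Per-member input scaling vectors: the only non-shared parameters.
        "R": [[rng.choice([-1.0, 1.0]) for _ in range(d_in)] for _ in range(k)],
    }


def predict_all(model, x):
    """Return one prediction per implicit member for a single object x."""
    preds = []
    for r in model["R"]:
        # Member-specific view of the input.
        scaled = [xi * ri for xi, ri in zip(x, r)]
        # Shared hidden layer with ReLU activation.
        hidden = [
            max(0.0, sum(w * s for w, s in zip(row, scaled)))
            for row in model["W1"]
        ]
        # Shared linear output head.
        preds.append(sum(w * h for w, h in zip(model["w2"], hidden)))
    return preds


def predict(model, x):
    """ASSUMPTION: combine member predictions by simple averaging."""
    preds = predict_all(model, x)
    return sum(preds) / len(preds)


model = make_model(d_in=4, d_hidden=8, k=3)
x = [0.5, -1.2, 0.3, 2.0]
member_preds = predict_all(model, x)  # k predictions for one object
ensemble_pred = predict(model, x)     # their average

# Parameter accounting for this toy setup.
shared_params = len(model["W1"]) * len(model["W1"][0]) + len(model["w2"])
per_member_params = len(model["R"][0])
```

In this toy configuration the shared parameters (40) outnumber the parameters unique to each member (4), which is the sense in which such an ensemble is cheaper than k independently parameterized MLPs.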
