LocalGLMnet: interpretable deep learning for tabular data

Abstract

Deep learning models have gained great popularity in statistical modelingbecause they lead to very competitive regression models, often outperformingclassical statistical models such as generalized linear models. Thedisadvantage of deep learning models is that their solutions are difficult tointerpret and explain, and variable selection is not easily possible becausedeep learning models solve feature engineering and variable selectioninternally in a nontransparent way. Inspired by the appealing structure ofgeneralized linear models, we propose a new network architecture that sharessimilar features as generalized linear models, but provides superior predictivepower benefiting from the art of representation learning. This new architectureallows for variable selection of tabular data and for interpretation of thecalibrated deep learning model, in fact, our approach provides an additivedecomposition in the spirit of Shapley values and integrated gradients.

Quick Read (beta)

loading the full paper ...