Logistic Regression, Neural Networks and Dempster-Shafer Theory: a New Perspective

Abstract

We revisit logistic regression and its nonlinear extensions, including multilayer feedforward neural networks, by showing that these classifiers can be viewed as converting input or higher-level features into Dempster-Shafer mass functions and aggregating them by Dempster's rule of combination. The probabilistic outputs of these classifiers are the normalized plausibilities corresponding to the underlying combined mass function. This mass function is more informative than the output probability distribution. In particular, it makes it possible to distinguish lack of evidence (when none of the features provides discriminant information) from conflicting evidence (when different features support different classes). This expressivity of mass functions allows us to gain insight into the role played by each input feature in logistic regression, and to interpret hidden unit outputs in multilayer neural networks. It also makes it possible to use alternative decision rules, such as interval dominance, which select a set of classes when the available evidence does not unambiguously point to a single class, thus achieving a lower error rate at the cost of higher imprecision.
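The distinction between lack of evidence and conflicting evidence can be illustrated with a minimal Python sketch of Dempster's rule on a two-class frame. This is an illustration under stated assumptions, not the paper's construction: the function names `dempster` and `normalized_plausibility`, the class labels `a` and `b`, and the hand-picked mass values (0.9 and 0.1) are all hypothetical, whereas the paper derives the masses from feature values.

```python
from itertools import product

def dempster(m1, m2):
    """Combine two mass functions (dict: frozenset -> mass) by Dempster's rule.

    Returns the normalized combined mass function and the degree of conflict.
    """
    joint = {}
    for (A, x), (B, y) in product(m1.items(), m2.items()):
        C = A & B  # intersection of focal sets
        joint[C] = joint.get(C, 0.0) + x * y
    conflict = joint.pop(frozenset(), 0.0)  # mass assigned to the empty set
    if conflict >= 1.0:
        raise ValueError("total conflict: Dempster's rule is undefined")
    return {A: v / (1.0 - conflict) for A, v in joint.items()}, conflict

def normalized_plausibility(m, frame):
    """Plausibility of each singleton, rescaled to sum to one."""
    pl = {w: sum(v for A, v in m.items() if w in A) for w in frame}
    total = sum(pl.values())
    return {w: p / total for w, p in pl.items()}

# Hypothetical two-class frame of discernment.
THETA = frozenset({"a", "b"})

# Lack of evidence: both features yield the vacuous mass function.
vacuous = {THETA: 1.0}
m_none, k_none = dempster(vacuous, vacuous)

# Conflicting evidence: one feature supports "a", the other supports "b".
# The values 0.9 / 0.1 are illustrative assumptions.
for_a = {frozenset({"a"}): 0.9, THETA: 0.1}
for_b = {frozenset({"b"}): 0.9, THETA: 0.1}
m_conf, k_conf = dempster(for_a, for_b)

p_none = normalized_plausibility(m_none, THETA)
p_conf = normalized_plausibility(m_conf, THETA)

print("no evidence:  ", m_none, p_none)
print("conflict:     ", m_conf, p_conf, "degree of conflict:", k_conf)
```

In both cases the normalized plausibilities assign 0.5 to each class, so the output probabilities alone cannot tell the two situations apart. The combined mass functions differ: with no evidence, all mass stays on the whole frame, while with conflicting evidence most of the mass is split between the two singletons and the degree of conflict is high.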
