Variable Importance Clouds: A Way to Explore Variable Importance for the Set of Good Models

  • 2019-01-10 15:06:11
  • Jiayun Dong, Cynthia Rudin
Variable importance is central to scientific studies, including the social sciences and causal inference, healthcare, and in other domains. However, current notions of variable importance are often tied to a specific predictive model. This is problematic: what if there were multiple well-performing predictive models, and a specific variable is important to some of them and not to others? In that case, we may not be able to tell from a single well-performing model whether a variable is always important in predicting the outcome. Rather than depending on variable importance for a single predictive model, we would like to explore variable importance for all approximately-equally-accurate predictive models. This work introduces the concept of a variable importance cloud, which maps every variable to its importance for every good predictive model. We show properties of the variable importance cloud and draw connections to other areas of statistics. We introduce variable importance diagrams as a projection of the variable importance cloud into two dimensions for visualization purposes. Experiments with criminal justice and marketing data illustrate how variables can change dramatically in importance for approximately-equally-accurate predictive models.
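The idea of mapping each variable to its importance across every good model can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the synthetic data with two nearly duplicate features, the grid of two-weight linear models, the tolerance `eps` that defines "approximately equally accurate", and the use of permutation importance (the increase in mean squared error after shuffling one feature) as the importance measure. The function names `make_data`, `mse`, `permutation_importance`, and `importance_cloud` are hypothetical.

```python
import random


def make_data(n=200, seed=0):
    """Synthetic rows (x1, x2, y); x2 is nearly a copy of x1 (illustrative assumption)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x1 = rng.gauss(0, 1)
        x2 = x1 + rng.gauss(0, 0.1)
        y = x1 + x2 + rng.gauss(0, 0.5)
        rows.append((x1, x2, y))
    return rows


def mse(w, rows):
    """Mean squared error of the linear model w[0]*x1 + w[1]*x2."""
    return sum((y - w[0] * x1 - w[1] * x2) ** 2 for x1, x2, y in rows) / len(rows)


def permutation_importance(w, rows, j, seed=1):
    """Increase in MSE after shuffling feature j (a stand-in importance measure)."""
    rng = random.Random(seed)
    col = [r[j] for r in rows]
    rng.shuffle(col)
    shuffled = [
        tuple(col[i] if k == j else v for k, v in enumerate(r))
        for i, r in enumerate(rows)
    ]
    return mse(w, shuffled) - mse(w, rows)


def importance_cloud(rows, eps=0.05, step=0.1):
    """Importance vectors of every grid model whose loss is within (1 + eps) of the best."""
    grid = [round(-1 + step * k, 10) for k in range(int(round(4 / step)) + 1)]
    losses = {(a, b): mse((a, b), rows) for a in grid for b in grid}
    best = min(losses.values())
    good = [w for w, loss in losses.items() if loss <= (1 + eps) * best]
    return [
        (permutation_importance(w, rows, 0), permutation_importance(w, rows, 1))
        for w in good
    ]


if __name__ == "__main__":
    cloud = importance_cloud(make_data())
    x1_importances = [p[0] for p in cloud]
    x2_importances = [p[1] for p in cloud]
    print("good models:", len(cloud))
    print("x1 importance range:", min(x1_importances), max(x1_importances))
    print("x2 importance range:", min(x2_importances), max(x2_importances))
```

Because the two features carry almost the same information in this toy setup, models that lean on one or the other reach similar accuracy, so each feature's importance spreads over a wide range within the good set. Plotting the first coordinate of each point against the second would give a two-dimensional view in the spirit of the variable importance diagram described above.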


