Optimal or Greedy Decision Trees? Revisiting their Objectives, Tuning, and Performance

Abstract

Recently there has been a surge of interest in optimal decision tree (ODT)methods that globally optimize accuracy directly, in contrast to traditionalapproaches that locally optimize an impurity or information metric. However,the value of optimal methods is not well understood yet, as the literatureprovides conflicting results, with some demonstrating superior out-of-sampleperformance of ODTs over greedy approaches, while others show the opposite.Through a novel extensive experimental study, we provide new insights into thedesign and behavior of learning decision trees. In particular, we identify andanalyze two relatively unexplored aspects of ODTs: the objective function usedin training trees, and tuning techniques. Thus, we address these threequestions: what objective to optimize in ODTs; how to tune ODTs; and how dooptimal and greedy methods compare? Our experimental evaluation examines 11objective functions, six tuning methods, and six claims from the literature onoptimal and greedy methods on 180 real and synthetic data sets. Through ouranalysis, both conceptually and experimentally, we show the effect of(non-)concave objectives in greedy and optimal approaches; we highlight theimportance of proper tuning of ODTs; support and refute several claims from theliterature; provide clear recommendations for researchers and practitioners onthe usage of greedy and optimal methods; and code for future comparisons.

Quick Read (beta)

loading the full paper ...