A Dynamic Boosted Ensemble Learning Based on Random Forest

Abstract

We propose Dynamic Boosted Random Forest (DBRF), a novel ensemble algorithm that incorporates the notion of hard example mining into Random Forest (RF) and thus combines the high accuracy of Boosting algorithms with the strong generalization of Bagging algorithms. Specifically, we propose to measure the quality of each leaf node of every decision tree in the random forest to determine hard examples. By iteratively training and then removing easy examples and noise examples from the training data, we evolve the random forest to focus on hard examples dynamically so as to learn decision boundaries better. Data can be cascaded through the random forests learned in each iteration in sequence to generate predictions, thus making RF deep. We also propose an evolution mechanism, a stacking mechanism, and a smart iteration mechanism to improve the performance of the model. DBRF outperforms RF on three UCI datasets and achieves state-of-the-art results compared to other deep models. Moreover, we show that DBRF is also a new way of sampling and can be very useful when learning from unbalanced data.
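
The filtering step described above can be illustrated with a minimal sketch. The abstract does not define how leaf quality is measured, so everything here is an assumption made for illustration: leaf quality is taken to be the fraction of training examples in a leaf that share a given example's label, and the names `leaf_agreement_scores`, `select_hard_examples`, `easy_threshold`, and `noise_threshold` are hypothetical. Examples whose averaged score is very high are treated as easy, those whose score is very low as noise, and the rest as hard examples to keep for the next iteration.

```python
from collections import Counter, defaultdict


def leaf_agreement_scores(leaf_ids, labels):
    """Score each example by how well its leaves agree with its label.

    leaf_ids[t][i] is the leaf that tree t routes example i to.
    The score is the mean, over trees, of the fraction of examples in
    that leaf sharing example i's label (an assumed quality measure,
    not the one defined in the paper).
    """
    totals = [0.0] * len(labels)
    for tree_leaves in leaf_ids:
        # Count class labels per leaf for this tree.
        counts = defaultdict(Counter)
        for leaf, y in zip(tree_leaves, labels):
            counts[leaf][y] += 1
        # Add each example's same-label fraction within its leaf.
        for i, (leaf, y) in enumerate(zip(tree_leaves, labels)):
            leaf_counts = counts[leaf]
            totals[i] += leaf_counts[y] / sum(leaf_counts.values())
    return [total / len(leaf_ids) for total in totals]


def select_hard_examples(leaf_ids, labels,
                         easy_threshold=0.9, noise_threshold=0.2):
    """Return indices of examples that are neither easy nor noise.

    Both thresholds are hypothetical tuning knobs for this sketch.
    """
    scores = leaf_agreement_scores(leaf_ids, labels)
    return [i for i, s in enumerate(scores)
            if noise_threshold < s < easy_threshold]


# Toy forest of two trees over six examples.
labels = [0, 0, 0, 0, 1, 1]
leaf_ids = [
    [0, 0, 0, 1, 1, 1],  # tree 0 mixes example 3 with class 1
    [0, 0, 0, 0, 1, 1],  # tree 1 separates the classes cleanly
]
hard = select_hard_examples(leaf_ids, labels)
```

In an iterative scheme of the kind the abstract describes, a new forest would then be trained on the retained indices only. With scikit-learn, leaf indices could be obtained from `RandomForestClassifier.apply`, which returns an array of shape (n_samples, n_estimators) and would need transposing to match the layout assumed here.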
