Abstract
Decision trees are a widely used family of methods for learning predictive models from both batch and streaming data. Despite achieving positive results in a multitude of applications, incremental decision trees continuously grow in terms of nodes as new data becomes available, i.e., they eventually split on all available features, often multiple times on the same feature, leading to unnecessary complexity and overfitting. With this behavior, incremental trees lose the ability to generalize well, remain human-understandable, and stay computationally efficient. To tackle these issues, we proposed in a previous study a regularization scheme for Hoeffding decision trees that (i) uses a penalty factor to control the gain obtained by creating a new split node on a feature that has not been used thus far and (ii) uses information from previous splits in the current branch to determine whether the observed gain indeed justifies a new split. In this paper, we extend this analysis by applying the proposed regularization scheme to other types of incremental decision trees and report results in both synthetic and real-world scenarios. The main interest is to verify whether and how the proposed regularization scheme affects the different types of incremental trees. Results show that, in addition to the original Hoeffding Tree, the Adaptive Random Forest also benefits from regularization, whereas McDiarmid Trees and Extremely Fast Decision Trees exhibit declines in accuracy.
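The penalty idea described in item (i) can be illustrated with a short sketch. This is not the paper's actual formulation: the function name `regularized_gain`, the multiplicative penalty form, and the default value are our own assumptions for illustration only.

```python
def regularized_gain(gain, feature, used_features, penalty=0.5):
    """Hypothetical sketch of penalty-based split regularization.

    The split gain of a feature not yet used in the current branch is
    multiplied by a factor in (0, 1], making splits on new features
    harder to justify than splits reusing already-selected features.
    """
    if feature in used_features:
        return gain  # feature already used in this branch: full gain
    return penalty * gain  # unused feature: penalized gain
```

Under this sketch, a candidate split on an unseen feature must exhibit a proportionally larger raw gain before it can compete with splits on features the branch has already relied upon.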
Notes
In practice, depending on the metric J being used, we should instead target its minimization. For instance, in CART-based trees [17], the goal would be to minimize the Gini impurity metric rather than maximize it, and the process should be adapted accordingly.
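For concreteness, the Gini impurity mentioned in the note above is computed as one minus the sum of squared class proportions; the function name below is ours, but the formula is the standard CART criterion.

```python
from collections import Counter

def gini_impurity(labels):
    """Gini impurity: 1 - sum(p_i ** 2) over class proportions p_i.

    CART-style trees pick the split that *minimizes* this quantity,
    unlike gain-based criteria, which are maximized.
    """
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())
```

A perfectly pure node yields 0.0, while a two-class node split 50/50 yields 0.5, the maximum for two classes.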
References
Bifet A, Holmes G, Kirkby R, Pfahringer B (2010) MOA: massive online analysis. J Mach Learn Res 11:1601–1604
Barddal JP, Gomes HM, Enembreck F, Pfahringer B, Bifet A (2016) On dynamic feature weighting for feature drifting data streams. In: ECML/PKDD’16, Lecture Notes in Computer Science. Springer, New York
Bahri M, Maniu S, Bifet A (2018) A sketch-based naive Bayes algorithm for evolving data streams. In: 2018 IEEE International Conference on Big Data (Big Data), pp 604–613
Krawczyk B, Wozniak M (2015) Weighted naïve Bayes classifier with forgetting for drifting data streams. In: 2015 IEEE International Conference on Systems, Man, and Cybernetics, pp 2147–2152
Domingos P, Hulten G (2000) Mining high-speed data streams. In: Proceedings of the Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’00, pages 71–80, New York, NY, USA. ACM. ISBN 1-58113-233-6. https://doi.org/10.1145/347090.347107
Rutkowski L, Pietruczuk L, Duda P, Jaworski M (2013) Decision trees for mining data streams based on the McDiarmid’s bound. IEEE Trans Knowl Data Eng 25(6):1272–1279. ISSN 1041-4347. https://doi.org/10.1109/TKDE.2012.66
Amezzane I, Fakhri Y, Aroussi ME, Bakhouya M (2019) Comparative study of batch and stream learning for online smartphone-based human activity recognition. In: Ahmed MB, Boudhir AA, Younes A (eds) Innovations in Smart Cities Applications Edition 2, pp 557–571, Cham. Springer International Publishing. ISBN 978-3-030-11196-0
Bifet A, Frank E, Holmes G, Pfahringer B (2012) Ensembles of restricted hoeffding trees. ACM Trans Intell Syst Technol 3(2):30:1–30:20. ISSN 2157-6904. https://doi.org/10.1145/2089094.2089106
Yang H, Fong S (2011) Optimized very fast decision tree with balanced classification accuracy and compact tree size, pp 57–64
Tibshirani R (1996) Regression shrinkage and selection via the lasso. J Royal Statist Soc Series B (Methodological) 58(1):267–288. ISSN 00359246. http://www.jstor.org/stable/2346178
Barddal JP, Enembreck F (2019) Learning regularized hoeffding trees from data streams. In: Proceedings of the 34th ACM/SIGAPP Symposium on Applied Computing, SAC ’19, pages 574–581, New York, NY, USA. ACM. ISBN 978-1-4503-5933-7. https://doi.org/10.1145/3297280.3297334
Manapragada C, Webb GI, Salehi M (2018) Extremely fast decision tree. In: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, KDD ’18, pages 1953–1962, New York, NY, USA. ACM. ISBN 978-1-4503-5552-0. https://doi.org/10.1145/3219819.3220005
Gomes HM, Bifet A, Read J, Barddal JP, Enembreck F, Pfharinger B, Holmes G, Abdessalem T (2017) Adaptive random forests for evolving data stream classification. Mach Learn 106(9):1469–1495. ISSN 1573-0565. https://doi.org/10.1007/s10994-017-5642-8
Ikonomovska E, Gama J, Džeroski S (2011a) Learning model trees from evolving data streams. Data Mining Know Discovery 23(1):128–168. ISSN 1573-756X. https://doi.org/10.1007/s10618-010-0201-y
Ikonomovska E, Gama J, Zenko B, Dzeroski S (2011b) Speeding-up hoeffding-based regression trees with options. In: ICML, pp 537–544
Widmer G, Kubat M (1996) Learning in the presence of concept drift and hidden contexts. Mach Learn 23(1):69–101. ISSN 0885-6125. https://doi.org/10.1023/A:1018046501280
Breiman L, Friedman JH, Olshen RA, Stone CJ (1984) Classification and regression trees. Wadsworth and brooks, Monterey CA
da Costa VGT, de Leon Ferreira de Carvalho ACP, Barbon Jr. S (2018) Strict very fast decision tree: a memory conservative algorithm for data stream mining. Patt Recog Lett 116:22–28. ISSN 0167-8655. https://doi.org/10.1016/j.patrec.2018.09.004. http://www.sciencedirect.com/science/article/pii/S0167865518305580
Hulten G, Spencer L, Domingos P (2001) Mining time-changing data streams. In: Proceedings of the Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’01, pages 97–106, New York, NY, USA. ACM. ISBN 1-58113-391-X. https://doi.org/10.1145/502512.502529
Bifet A, Gavaldà R (2009) Adaptive learning from evolving data streams. Springer, Berlin, pp 249–260. ISBN 978-3-642-03915-7. https://doi.org/10.1007/978-3-642-03915-7_22
Breiman L (2001) Random forests. Mach Learn 45(1):5–32. ISSN 0885-6125. https://doi.org/10.1023/A:1010933404324
Jankowski D, Jackowski K (2016) Learning decision trees from data streams with concept drift, vol 80, pp 1682–1691. ISSN 1877-0509. https://doi.org/10.1016/j.procs.2016.05.508, http://www.sciencedirect.com/science/article/pii/S1877050916309954. International Conference on Computational Science 2016, ICCS 2016, 6–8 June 2016, San Diego, California, USA
Deng H, Runger G (2012) Feature selection via regularized trees. In: The 2012 International Joint Conference on Neural Networks (IJCNN), pp 1–8, DOI https://doi.org/10.1109/IJCNN.2012.6252640
Gama J, Medas P, Castillo G, Rodrigues P (2004) Learning with drift detection. In: Bazzan AC , Labidi S (eds) Advances in Artificial Intelligence – SBIA 2004, volume 3171 of Lecture Notes in Computer Science. ISBN 978-3-540-23237-7. https://doi.org/10.1007/978-3-540-28645-5_29. Springer, Berlin, pp 286–295
Agrawal R, Imielinski T, Swami A (1993) Database mining: a performance perspective. IEEE Trans Knowl Data Eng 5(6):914–925. ISSN 1041-4347. https://doi.org/10.1109/69.250074
Enembreck F, Ávila BC, Scalabrin EE, Barthès JPA (2007) Learning drifting negotiations. Appl Artif Intell 21(9):861–881. http://dblp.uni-trier.de/db/journals/aai/aai21.html#EnembreckASB07
Harries M (1999) Splice-2 comparative evaluation: electricity pricing. Technical report, The University of New South Wales
Blackard JA, Dean DJ (1999) Comparative accuracies of artificial neural networks and discriminant analysis in predicting forest cover types from cartographic variables. Comput Elect Agri 24(3):131–151. ISSN 0168-1699. https://doi.org/10.1016/S0168-1699(99)00046-0. http://www.sciencedirect.com/science/article/pii/S0168169999000460
Katakis I, Tsoumakas G, Vlahavas I (2006) Dynamic feature space and incremental feature selection for the classification of textual data streams. In: ECML/PKDD-2006 International Workshop on Knowledge Discovery from Data Streams 2006. Springer, New York, p 107
Barddal JP, Gomes HM, Enembreck F (2015) A survey on feature drift adaptation. In: Proceedings of the International Conference on Tools with Artificial Intelligence. IEEE
Hoeffding W (1963) Probability inequalities for sums of bounded random variables. J Am Stat Assoc 58(301):13–30. http://www.jstor.org/stable/2282952
Gomes HM, Barddal JP, Ferreira LEB, Bifet A (2018) Adaptive random forests for data stream regression. In: 26th European Symposium on Artificial Neural Networks, ESANN 2018, Bruges, Belgium, April 25-27, 2018. http://www.elen.ucl.ac.be/Proceedings/esann/esannpdf/es2018-183.pdf
Britto AS, Sabourin R, Oliveira LES (2014) Dynamic selection of classifiers—a comprehensive review. Patt Recog 47(11):3665–3680. ISSN 0031-3203. https://doi.org/10.1016/j.patcog.2014.05.003. http://www.sciencedirect.com/science/article/pii/S0031320314001885
Cruz RMO, Sabourin R, Cavalcanti GDC (2014) Analyzing dynamic ensemble selection techniques using dissimilarity analysis. In: Gayar NE, Schwenker F, Suen C (eds) Artificial Neural Networks in Pattern Recognition, pp 59–70, Cham. Springer International Publishing. ISBN 978-3-319-11656-3
Almeida PRLD, Oliveira LS, Britto ADS, Sabourin R (2016) Handling concept drifts using dynamic selection of classifiers. In: 2016 IEEE 28th International Conference on Tools with Artificial Intelligence (ICTAI), pp 989–995. https://doi.org/10.1109/ICTAI.2016.0153
Zyblewski P, Ksieniewicz P, Woźniak M (2019) Classifier selection for highly imbalanced data streams with minority driven ensemble. In: Rutkowski L, Scherer R, Korytkowski M, Pedrycz W, Tadeusiewicz R, Zurada JM (eds) Artificial Intelligence and Soft Computing, pp 626–635, Cham. Springer International Publishing. ISBN 978-3-030-20912-4
Acknowledgments
The authors would like to thank the anonymous reviewers of ACM SAC 2019 for their constructive comments on our original manuscript, as well as the reviewers of the Annals of Telecommunications for their feedback on this manuscript. This research did not receive any financial support.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Barddal, J.P., Enembreck, F. Regularized and incremental decision trees for data streams. Ann. Telecommun. 75, 493–503 (2020). https://doi.org/10.1007/s12243-020-00782-3