
Regularized and incremental decision trees for data streams


Abstract

Decision trees are a widely used family of methods for learning predictive models from both batch and streaming data. Despite achieving positive results in a multitude of applications, incremental decision trees continuously grow in terms of nodes as new data becomes available, i.e., they eventually split on all available features, and often multiple times on the same feature, thus leading to unnecessary complexity and overfitting. With this behavior, incremental trees lose the ability to generalize well, to be human-understandable, and to be computationally efficient. To tackle these issues, we proposed in a previous study a regularization scheme for Hoeffding decision trees that (i) uses a penalty factor to control the gain obtained by creating a new split node on a feature that has not been used thus far and (ii) uses information from previous splits in the current branch to determine whether the observed gain indeed justifies a new split. In this paper, we extend this analysis by applying the proposed regularization scheme to other types of incremental decision trees, reporting results in both synthetic and real-world scenarios. The main interest is to verify whether and how the proposed regularization scheme affects the different types of incremental trees. Results show that, in addition to the original Hoeffding Tree, the Adaptive Random Forest also benefits from regularization, whereas McDiarmid Trees and Extremely Fast Decision Trees exhibit declines in accuracy.
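To make the two ingredients of the scheme concrete, the following is a minimal Python sketch. It is not the implementation evaluated in the paper: the function names (penalized_gain, justifies_split), the penalty factor, and the min_ratio threshold are illustrative assumptions introduced here for exposition.

    def penalized_gain(raw_gain, feature, branch_features, penalty):
        # Ingredient (i): shrink the gain of a feature not yet used along
        # the current branch; penalty is a factor in (0, 1]. Features
        # already used keep their raw gain, so proven features are favored.
        if feature in branch_features:
            return raw_gain
        return penalty * raw_gain

    def justifies_split(candidate_gain, previous_gains, min_ratio=0.9):
        # Ingredient (ii): compare the candidate gain against the gains of
        # earlier splits in the same branch, and only allow the split if it
        # is at least comparable to what those splits achieved.
        if not previous_gains:
            return True  # no history yet (e.g., splitting at the root)
        return candidate_gain >= min_ratio * min(previous_gains)

    # Hypothetical usage at a leaf, before the usual split-confidence test:
    candidates = {"age": 0.12, "tenure": 0.15}   # raw split gains
    used = {"age", "salary"}                     # features already on this branch
    gains = {f: penalized_gain(g, f, used, penalty=0.5)
             for f, g in candidates.items()}
    best = max(gains, key=gains.get)
    if justifies_split(gains[best], previous_gains=[0.20, 0.18]):
        print("split on", best)

Under such a scheme, splits on fresh features happen only when their penalized gain still dominates and remains comparable to past splits, which is how, per the abstract, the regularization discourages unnecessary tree growth.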


Notes

  1. In practice, depending on the metric J being used, we should instead target its minimization. For instance, in CART-based trees [17], the goal would be to minimize the Gini impurity metric rather than maximize it, and the process should be adapted accordingly (the Gini impurity is recalled after these notes).

  2. For instance, the proposed scheme is the same for McDiarmid trees, except that the McDiarmid bound reported earlier in Eq. 3 is used instead of the Hoeffding bound given in Eq. 2 (the usual form of the latter is recalled after these notes).
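For reference, regarding Note 1: the Gini impurity of a node with class proportions p_1, ..., p_C is

    G = 1 - \sum_{c=1}^{C} p_c^2

It equals zero for a pure node and is maximal under a uniform class distribution, which is why a CART-style split search minimizes the weighted impurity of the resulting children rather than maximizing it.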
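Regarding Note 2, the following is a sketch of the usual form of the Hoeffding bound used in Hoeffding trees [5]; we assume Eq. 2 follows this form, where R is the range of the split metric, δ is the confidence parameter, and n is the number of observations seen at the leaf:

    \epsilon = \sqrt{\frac{R^2 \ln(1/\delta)}{2n}}

With probability 1 − δ, the true mean of the metric lies within ε of its empirical mean, so a split on the best-ranked feature is made once its gain exceeds the runner-up's by more than ε. The McDiarmid-based variant [6] keeps this decision rule but substitutes a bound tailored to the specific split metric.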

References

  1. Bifet A, Holmes G, Kirkby R, Pfahringer B (2010) MOA: massive online analysis. J Mach Learn Res 11:1601–1604

  2. Barddal JP, Gomes HM, Enembreck F, Pfahringer B, Bifet A (2016) On dynamic feature weighting for feature drifting data streams. In: ECML/PKDD’16, Lecture Notes in Computer Science. Springer, New York

  3. Bahri M, Maniu S, Bifet A (2018) A sketch-based naive Bayes algorithm for evolving data streams. In: 2018 IEEE International Conference on Big Data (Big Data), pp 604–613

  4. Krawczyk B, Wozniak M (2015) Weighted naïve Bayes classifier with forgetting for drifting data streams. In: 2015 IEEE International Conference on Systems, Man, and Cybernetics, pp 2147–2152

  5. Domingos P, Hulten G (2000) Mining high-speed data streams. In: Proceedings of the Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’00. ACM, New York, NY, USA, pp 71–80. ISBN 1-58113-233-6. https://doi.org/10.1145/347090.347107

  6. Rutkowski L, Pietruczuk L, Duda P, Jaworski M (2013) Decision trees for mining data streams based on the McDiarmid’s bound. IEEE Trans Knowl Data Eng 25(6):1272–1279. ISSN 1041-4347. https://doi.org/10.1109/TKDE.2012.66

  7. Amezzane I, Fakhri Y, Aroussi ME, Bakhouya M (2019) Comparative study of batch and stream learning for online smartphone-based human activity recognition. In: Ahmed MB, Boudhir AA, Younes A (eds) Innovations in Smart Cities Applications Edition 2. Springer International Publishing, Cham, pp 557–571. ISBN 978-3-030-11196-0

  8. Bifet A, Frank E, Holmes G, Pfahringer B (2012) Ensembles of restricted Hoeffding trees. ACM Trans Intell Syst Technol 3(2):30:1–30:20. ISSN 2157-6904. https://doi.org/10.1145/2089094.2089106

  9. Yang H, Fong S (2011) Optimized very fast decision tree with balanced classification accuracy and compact tree size, pp 57–64

  10. Tibshirani R (1996) Regression shrinkage and selection via the lasso. J Royal Statist Soc Series B (Methodological) 58(1):267–288. ISSN 00359246. http://www.jstor.org/stable/2346178

  11. Barddal JP, Enembreck F (2019) Learning regularized Hoeffding trees from data streams. In: Proceedings of the 34th ACM/SIGAPP Symposium on Applied Computing, SAC ’19. ACM, New York, NY, USA, pp 574–581. ISBN 978-1-4503-5933-7. https://doi.org/10.1145/3297280.3297334

  12. Manapragada C, Webb GI, Salehi M (2018) Extremely fast decision tree. In: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, KDD ’18. ACM, New York, NY, USA, pp 1953–1962. ISBN 978-1-4503-5552-0. https://doi.org/10.1145/3219819.3220005

  13. Gomes HM, Bifet A, Read J, Barddal JP, Enembreck F, Pfahringer B, Holmes G, Abdessalem T (2017) Adaptive random forests for evolving data stream classification. Mach Learn 106(9):1469–1495. ISSN 1573-0565. https://doi.org/10.1007/s10994-017-5642-8

  14. Ikonomovska E, Gama J, Džeroski S (2011a) Learning model trees from evolving data streams. Data Min Knowl Discov 23(1):128–168. ISSN 1573-756X. https://doi.org/10.1007/s10618-010-0201-y

  15. Ikonomovska E, Gama J, Ženko B, Džeroski S (2011b) Speeding-up Hoeffding-based regression trees with options. In: ICML, pp 537–544

  16. Widmer G, Kubat M (1996) Learning in the presence of concept drift and hidden contexts. Mach Learn 23(1):69–101. ISSN 0885-6125. https://doi.org/10.1023/A:1018046501280

  17. Breiman L, Friedman JH, Olshen RA, Stone CJ (1984) Classification and regression trees. Wadsworth and Brooks, Monterey, CA

  18. da Costa VGT, de Leon Ferreira de Carvalho ACP, Barbon Jr. S (2018) Strict very fast decision tree: a memory conservative algorithm for data stream mining. Patt Recog Lett 116:22–28. ISSN 0167-8655. https://doi.org/10.1016/j.patrec.2018.09.004. http://www.sciencedirect.com/science/article/pii/S0167865518305580

  19. Hulten G, Spencer L, Domingos P (2001) Mining time-changing data streams. In: Proceedings of the Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’01. ACM, New York, NY, USA, pp 97–106. ISBN 1-58113-391-X. https://doi.org/10.1145/502512.502529

  20. Bifet A, Gavaldà R (2009) Adaptive learning from evolving data streams. Springer, Berlin, pp 249–260. ISBN 978-3-642-03915-7. https://doi.org/10.1007/978-3-642-03915-7_22

  21. Breiman L (2001) Random forests. Mach Learn 45(1):5–32. ISSN 0885-6125. https://doi.org/10.1023/A:1010933404324

  22. Jankowski D, Jackowski K (2016) Learning decision trees from data streams with concept drift. Procedia Comput Sci 80:1682–1691. ISSN 1877-0509. https://doi.org/10.1016/j.procs.2016.05.508. http://www.sciencedirect.com/science/article/pii/S1877050916309954. International Conference on Computational Science 2016, ICCS 2016, 6–8 June 2016, San Diego, California, USA

  23. Deng H, Runger G (2012) Feature selection via regularized trees. In: The 2012 International Joint Conference on Neural Networks (IJCNN), pp 1–8. https://doi.org/10.1109/IJCNN.2012.6252640

  24. Gama J, Medas P, Castillo G, Rodrigues P (2004) Learning with drift detection. In: Bazzan AC, Labidi S (eds) Advances in Artificial Intelligence – SBIA 2004, Lecture Notes in Computer Science, vol 3171. Springer, Berlin, pp 286–295. ISBN 978-3-540-23237-7. https://doi.org/10.1007/978-3-540-28645-5_29

  25. Agrawal R, Imielinski T, Swami A (1993) Database mining: a performance perspective. IEEE Trans Knowl Data Eng 5(6):914–925. ISSN 1041-4347. https://doi.org/10.1109/69.250074

  26. Enembreck F, Ávila BC, Scalabrin EE, Barthès JPA (2007) Learning drifting negotiations. Appl Artif Intell 21(9):861–881. http://dblp.uni-trier.de/db/journals/aai/aai21.html#EnembreckASB07

  27. Harries M (1999) Splice-2 comparative evaluation: electricity pricing. Technical report, The University of New South Wales

  28. Blackard JA, Dean DJ (1999) Comparative accuracies of artificial neural networks and discriminant analysis in predicting forest cover types from cartographic variables. Comput Elect Agri 24(3):131–151. ISSN 0168-1699. https://doi.org/10.1016/S0168-1699(99)00046-0. http://www.sciencedirect.com/science/article/pii/S0168169999000460

  29. Katakis I, Tsoumakas G, Vlahavas I (2006) Dynamic feature space and incremental feature selection for the classification of textual data streams. In: ECML/PKDD-2006 International Workshop on Knowledge Discovery from Data Streams. Springer, New York, p 107

  30. Barddal JP, Gomes HM, Enembreck F (2015) A survey on feature drift adaptation. In: Proceedings of the International Conference on Tools with Artificial Intelligence. IEEE

  31. Hoeffding W (1963) Probability inequalities for sums of bounded random variables. J Am Stat Assoc 58(301):13–30. http://www.jstor.org/stable/2282952

  32. Gomes HM, Barddal JP, Ferreira LEB, Bifet A (2018) Adaptive random forests for data stream regression. In: 26th European Symposium on Artificial Neural Networks, ESANN 2018, Bruges, Belgium, April 25-27, 2018. http://www.elen.ucl.ac.be/Proceedings/esann/esannpdf/es2018-183.pdf

  33. Britto AS, Sabourin R, Oliveira LES (2014) Dynamic selection of classifiers—a comprehensive review. Patt Recog 47(11):3665–3680. ISSN 0031-3203. https://doi.org/10.1016/j.patcog.2014.05.003. http://www.sciencedirect.com/science/article/pii/S0031320314001885

  34. Cruz RMO, Sabourin R, Cavalcanti GDC (2014) Analyzing dynamic ensemble selection techniques using dissimilarity analysis. In: Gayar NE, Schwenker F, Suen C (eds) Artificial Neural Networks in Pattern Recognition. Springer International Publishing, Cham, pp 59–70. ISBN 978-3-319-11656-3

  35. Almeida PRLD, Oliveira LS, Britto ADS, Sabourin R (2016) Handling concept drifts using dynamic selection of classifiers. In: 2016 IEEE 28th International Conference on Tools with Artificial Intelligence (ICTAI), pp 989–995. https://doi.org/10.1109/ICTAI.2016.0153

  36. Zyblewski P, Ksieniewicz P, Woźniak M (2019) Classifier selection for highly imbalanced data streams with minority driven ensemble. In: Rutkowski L, Scherer R, Korytkowski M, Pedrycz W, Tadeusiewicz R, Zurada JM (eds) Artificial Intelligence and Soft Computing. Springer International Publishing, Cham, pp 626–635. ISBN 978-3-030-20912-4


Acknowledgments

The authors would like to thank the anonymous reviewers of ACM SAC 2019 for their constructive comments on our original manuscript, as well as the reviewers of the Annals of Telecommunications for their feedback on this manuscript. This research did not receive any financial support.

Author information

Corresponding author

Correspondence to Jean Paul Barddal.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Barddal, J.P., Enembreck, F. Regularized and incremental decision trees for data streams. Ann. Telecommun. 75, 493–503 (2020). https://doi.org/10.1007/s12243-020-00782-3

