DOI: 10.1145/3297280.3297551
poster

Decision tree-based feature ranking in concept drifting data streams

Published: 08 April 2019

Abstract

Data stream mining targets the learning of predictive models that evolve over time according to changes in arriving data. Over the years, several approaches have been tailored to create and continuously update predictive models from these streams, and among these, Hoeffding Trees became a popular choice for learning decision trees from data streams. Quantifying and expressing the importance of features in dynamic scenarios is of the utmost importance, as it allows domain experts to back up, or invalidate, a predictive model. Therefore, in this paper we propose a positional gain method tailored for both individual Hoeffding Trees and ensembles of Hoeffding Trees, and we assess how it behaves in both synthetic and real-world scenarios.
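
As background for the Hoeffding Trees mentioned above, the following is a minimal sketch of the Hoeffding bound test that such trees commonly use to decide when enough stream instances have been seen to split a leaf. It is not the positional gain method proposed in the paper, whose details the abstract does not give. The function names, the use of information gain as the split criterion, and the example values are assumptions made for illustration.

```python
import math


def hoeffding_bound(value_range: float, delta: float, n: int) -> float:
    """Hoeffding bound epsilon for the mean of n observations.

    With probability 1 - delta, the true mean of a random variable with
    the given range lies within epsilon of the mean observed over n samples.
    """
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def should_split(best_gain: float, second_gain: float,
                 n_classes: int, delta: float, n: int) -> bool:
    """Return True when the best feature beats the runner-up by more than epsilon.

    Assumption: the split criterion is information gain, whose range is
    log2(n_classes). Tie-breaking thresholds used by practical
    implementations are omitted here.
    """
    value_range = math.log2(n_classes)
    epsilon = hoeffding_bound(value_range, delta, n)
    return (best_gain - second_gain) > epsilon


if __name__ == "__main__":
    # Hypothetical gains for the two best candidate features at one leaf.
    for n in (100, 200):
        print(n, should_split(0.30, 0.05, n_classes=2, delta=1e-7, n=n))
```

In this sketch the same gain difference is not enough to split after 100 instances but is enough after 200, because the bound shrinks as more instances arrive at the leaf.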

Published In

SAC '19: Proceedings of the 34th ACM/SIGAPP Symposium on Applied Computing
April 2019
2682 pages
ISBN: 9781450359337
DOI: 10.1145/3297280
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. concept drift
  2. data stream mining
  3. feature ranking

Qualifiers

  • Poster
