Quantifying Changes in Predictions of Classification Models for Data Streams

Grzenda, Maciej

doi:10.1007/978-3-031-01333-1_10

Maciej Grzenda ORCID: orcid.org/0000-0002-5440-4954

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13205)

Included in the following conference series:

International Symposium on Intelligent Data Analysis

Abstract

Evaluation methods for data stream classification have frequently focused on how available data are used for learning a model and for assessing its performance, with major emphasis on the difference between predicted and true labels. More recently, growing interest in delayed labelling evaluation has resulted in the evaluation of multiple predictions made by an evolving model for an instance before its true label arrives. Still, under this setting, predictions are compared with true labels, and the changes in the predictions themselves are not the focus.

In this study, we aim to provide an intuitive evaluation framework to quantify changes in predictions made over time for the same input instances by evolving classification models. The primary motivation is to gain insight into the impact of the evolution of a classification model on the changes in decision boundaries, which may effectively re-assign the instances to other classes. The prediction change measures proposed in this study make it possible to reveal the scale of such changes. Furthermore, the notions of volatility of predictions and productive volatility are proposed and quantified. Results for a number of real and synthetic data streams show that similar accuracy of the models can be accompanied by significantly different volatility of predictions made by these models.
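
The idea of comparing successive predictions for the same instance can be illustrated with a minimal sketch. It is not the set of measures defined in the paper, which the abstract does not spell out. The function names `prediction_change_rate` and `final_accuracy`, the toy data, and the choice to count a change whenever two consecutive predictions for one instance differ are all assumptions made for illustration only.

```python
from typing import Dict, Sequence


def prediction_change_rate(history: Dict[str, Sequence[int]]) -> float:
    """Share of consecutive prediction pairs, per instance, that differ.

    `history` maps an instance id to the predictions made for it over time
    by an evolving model (hypothetical structure, assumed for this sketch).
    """
    changes = 0
    pairs = 0
    for preds in history.values():
        for earlier, later in zip(preds, preds[1:]):
            pairs += 1
            changes += int(earlier != later)
    return changes / pairs if pairs else 0.0


def final_accuracy(history: Dict[str, Sequence[int]], labels: Dict[str, int]) -> float:
    """Accuracy of the last prediction made for each instance."""
    correct = sum(int(preds[-1] == labels[key]) for key, preds in history.items())
    return correct / len(history)


# Toy data: true labels and three successive predictions per instance.
labels = {"a": 1, "b": 0, "c": 1}
stable = {"a": [1, 1, 1], "b": [0, 0, 0], "c": [0, 0, 0]}
volatile = {"a": [0, 1, 1], "b": [1, 0, 0], "c": [1, 0, 0]}

# Both hypothetical models end with the same accuracy on the last prediction,
# yet only the second one changes its predictions along the way.
print(final_accuracy(stable, labels), prediction_change_rate(stable))
print(final_accuracy(volatile, labels), prediction_change_rate(volatile))
```

In this toy setting the two prediction histories yield the same final accuracy while differing in how often predictions change, which is the kind of contrast the abstract describes.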

Notes

1. The repository with the code and data sets is available at https://github.com/mgrzenda/PredictionVolatility. The code calculating the measures proposed in this study has been implemented as an extension of the Massive Online Analysis (MOA) [2] framework.

References

1. Barros, R.S.M., Santos, S.G.T.C.: A large-scale comparison of concept drift detectors. Inf. Sci. 451–452, 348–370 (2018). https://doi.org/10.1016/j.ins.2018.04.014
2. Bifet, A., Gavaldà, R., Holmes, G., Pfahringer, B.: Machine Learning for Data Streams: With Practical Examples in MOA. The MIT Press, Cambridge (2018)
3. Bifet, A., Gavaldà, R.: Adaptive learning from evolving data streams. In: Adams, N.M., Robardet, C., Siebes, A., Boulicaut, J.-F. (eds.) IDA 2009. LNCS, vol. 5772, pp. 249–260. Springer, Heidelberg (2009). https://doi.org/10.1007/978-3-642-03915-7_22
4. Brzezinski, D., Stefanowski, J.: Reacting to different types of concept drift: the accuracy updated ensemble algorithm. IEEE Trans. Neural Netw. Learn. Syst. 25(1), 81–94 (2014). https://doi.org/10.1109/TNNLS.2013.2251352
5. Ditzler, G., Roveri, M., Alippi, C., Polikar, R.: Learning in nonstationary environments: a survey. IEEE Comput. Intell. Mag. 10(4), 12–25 (2015)
6. Domingos, P., Hulten, G.: Mining high-speed data streams. In: 6th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 71–80. ACM (2000)
7. Gomes, H.M., et al.: Adaptive random forests for evolving data stream classification. Mach. Learn. 106, 1469–1495 (2017). https://doi.org/10.1007/s10994-017-5642-8
8. Grzenda, M., Gomes, H.M., Bifet, A.: Performance measures for evolving predictions under delayed labelling classification. In: 2020 International Joint Conference on Neural Networks (IJCNN), pp. 1–8 (2020). https://doi.org/10.1109/IJCNN48605.2020.9207256
9. Grzenda, M., Gomes, H.M., Bifet, A.: Delayed labelling evaluation for data streams. Data Min. Knowl. Disc. 34(5), 1237–1266 (2019). https://doi.org/10.1007/s10618-019-00654-y
10. Hofer, V., Krempl, G.: Drift mining in data: a framework for addressing drift in classification. Comput. Stat. Data Anal. 57, 377–391 (2013). https://doi.org/10.1016/j.csda.2012.07.007
11. Webb, G.I., Hyde, R., Cao, H., Nguyen, H.L., Petitjean, F.: Characterizing concept drift. Data Min. Knowl. Disc. 30(4), 964–994 (2016). https://doi.org/10.1007/s10618-015-0448-4
12. Webb, G.I., Lee, L.K., Goethals, B., Petitjean, F.: Analyzing concept drift and shift from sample data. Data Min. Knowl. Disc. 32(5), 1179–1199 (2018). https://doi.org/10.1007/s10618-018-0554-1

Acknowledgements

The project was funded by the POB Research Centre for Artificial Intelligence and Robotics of Warsaw University of Technology within the Excellence Initiative Program - Research University (ID-UB).

Author information

Authors and Affiliations

Faculty of Mathematics and Information Science, Warsaw University of Technology, ul. Koszykowa 75, 00-662, Warszawa, Poland
Maciej Grzenda

Authors

Maciej Grzenda

Corresponding author

Correspondence to Maciej Grzenda.

Editor information

Editors and Affiliations

University of Rennes, Rennes, France
Tassadit Bouadi
University of Rennes, Rennes, France
Elisa Fromont
University of Munich, LMU, Munich, Germany
Eyke Hüllermeier

About this paper

Cite this paper

Grzenda, M. (2022). Quantifying Changes in Predictions of Classification Models for Data Streams. In: Bouadi, T., Fromont, E., Hüllermeier, E. (eds) Advances in Intelligent Data Analysis XX. IDA 2022. Lecture Notes in Computer Science, vol 13205. Springer, Cham. https://doi.org/10.1007/978-3-031-01333-1_10

DOI: https://doi.org/10.1007/978-3-031-01333-1_10
Published: 07 April 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-01332-4
Online ISBN: 978-3-031-01333-1
eBook Packages: Computer Science, Computer Science (R0)