Abstract
Evaluation methods for data stream classification have frequently focused on how available data are used for learning a model and for assessing its performance, with major emphasis on the difference between predicted and true labels. More recently, growing interest in delayed labelling evaluation has led to the evaluation of multiple predictions made by an evolving model for an instance before its true label arrives. Still, under this setting predictions are compared with true labels, and the changes in the predictions themselves are not the focus.
In this study, we aim to provide an intuitive evaluation framework to quantify changes in predictions made over time for the same input instances by evolving classification models. The primary motivation is to gain insight into the impact of the evolution of a classification model on the changes in decision boundaries, which may effectively re-assign the instances to other classes. The prediction change measures proposed in this study make it possible to reveal the scale of such changes. Furthermore, the notions of volatility of predictions and productive volatility are proposed and quantified. Results for a number of real and synthetic data streams show that similar accuracy of the models can be accompanied by significantly different volatility of predictions made by these models.
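The idea of counting how often an evolving model changes its prediction for the same instance can be sketched as follows. This is a minimal illustration only: the abstract does not give the formal definitions of the proposed measures, so the function names, the `history` layout, and the two quantities computed here (the share of consecutive prediction pairs that differ, and the share of changes that land on the true label) are assumptions made for this example, not the measures defined in the paper.

```python
from typing import Dict, Hashable, List, Sequence


def count_changes(preds: Sequence[Hashable]) -> int:
    """Number of times consecutive predictions for one instance differ."""
    return sum(1 for prev, cur in zip(preds, preds[1:]) if prev != cur)


def change_rate(history: Dict[str, List[Hashable]]) -> float:
    """Share of consecutive prediction pairs, over all instances, that differ.

    Hypothetical measure for illustration, not the paper's definition.
    """
    pairs = sum(max(len(p) - 1, 0) for p in history.values())
    changes = sum(count_changes(p) for p in history.values())
    return changes / pairs if pairs else 0.0


def share_of_changes_to_true_label(
    history: Dict[str, List[Hashable]],
    true_labels: Dict[str, Hashable],
) -> float:
    """Share of prediction changes whose new prediction equals the true label.

    Hypothetical stand-in for the idea of 'productive' changes.
    """
    changes = 0
    to_true = 0
    for key, preds in history.items():
        for prev, cur in zip(preds, preds[1:]):
            if prev != cur:
                changes += 1
                if cur == true_labels[key]:
                    to_true += 1
    return to_true / changes if changes else 0.0


# Successive predictions made for two instances before their labels arrive.
history = {
    "x1": ["a", "a", "b", "b"],  # one change in three consecutive pairs
    "x2": ["a", "b", "a", "b"],  # three changes in three consecutive pairs
}
true_labels = {"x1": "b", "x2": "b"}

print(change_rate(history))
print(share_of_changes_to_true_label(history, true_labels))
```

In this toy data both instances end with the correct prediction, yet the prediction for `x2` changes three times and the one for `x1` only once, which is the kind of difference that a comparison of final predictions against true labels alone would not show.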
Notes
1. The code and data sets are available in the repository at https://github.com/mgrzenda/PredictionVolatility. The code calculating the measures proposed in this study has been implemented as an extension of the Massive Online Analysis (MOA) [2] framework.
References
Barros, R.S.M., Santos, S.G.T.C.: A large-scale comparison of concept drift detectors. Inf. Sci. 451–452, 348–370 (2018). https://doi.org/10.1016/j.ins.2018.04.014
Bifet, A., Gavaldà, R., Holmes, G., Pfahringer, B.: Machine Learning for Data Streams: With Practical Examples in MOA. The MIT Press, Cambridge (2018)
Bifet, A., Gavaldà, R.: Adaptive learning from evolving data streams. In: Adams, N.M., Robardet, C., Siebes, A., Boulicaut, J.-F. (eds.) IDA 2009. LNCS, vol. 5772, pp. 249–260. Springer, Heidelberg (2009). https://doi.org/10.1007/978-3-642-03915-7_22
Brzezinski, D., Stefanowski, J.: Reacting to different types of concept drift: the accuracy updated ensemble algorithm. IEEE Trans. Neural Netw. Learn. Syst. 25(1), 81–94 (2014). https://doi.org/10.1109/TNNLS.2013.2251352
Ditzler, G., Roveri, M., Alippi, C., Polikar, R.: Learning in nonstationary environments: a survey. IEEE Comput. Intell. Mag. 10(4), 12–25 (2015)
Domingos, P., Hulten, G.: Mining high-speed data streams. In: 6th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 71–80. ACM (2000)
Gomes, H.M., et al.: Adaptive random forests for evolving data stream classification. Mach. Learn. 106, 1469–1495 (2017). https://doi.org/10.1007/s10994-017-5642-8
Grzenda, M., Gomes, H.M., Bifet, A.: Performance measures for evolving predictions under delayed labelling classification. In: 2020 International Joint Conference on Neural Networks (IJCNN), pp. 1–8 (2020). https://doi.org/10.1109/IJCNN48605.2020.9207256
Grzenda, M., Gomes, H.M., Bifet, A.: Delayed labelling evaluation for data streams. Data Min. Knowl. Disc. 34(5), 1237–1266 (2019). https://doi.org/10.1007/s10618-019-00654-y
Hofer, V., Krempl, G.: Drift mining in data: a framework for addressing drift in classification. Comput. Stat. Data Anal. 57, 377–391 (2013). https://doi.org/10.1016/j.csda.2012.07.007
Webb, G.I., Hyde, R., Cao, H., Nguyen, H.L., Petitjean, F.: Characterizing concept drift. Data Min. Knowl. Disc. 30(4), 964–994 (2016). https://doi.org/10.1007/s10618-015-0448-4
Webb, G.I., Lee, L.K., Goethals, B., Petitjean, F.: Analyzing concept drift and shift from sample data. Data Min. Knowl. Disc. 32(5), 1179–1199 (2018). https://doi.org/10.1007/s10618-018-0554-1
Acknowledgements
The project was funded by the POB Research Centre for Artificial Intelligence and Robotics of Warsaw University of Technology within the Excellence Initiative Program - Research University (ID-UB).
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Grzenda, M. (2022). Quantifying Changes in Predictions of Classification Models for Data Streams. In: Bouadi, T., Fromont, E., Hüllermeier, E. (eds) Advances in Intelligent Data Analysis XX. IDA 2022. Lecture Notes in Computer Science, vol 13205. Springer, Cham. https://doi.org/10.1007/978-3-031-01333-1_10
DOI: https://doi.org/10.1007/978-3-031-01333-1_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-01332-4
Online ISBN: 978-3-031-01333-1
eBook Packages: Computer Science, Computer Science (R0)