Abstract
In static environments, Random Projection (RP) is a popular and efficient technique for preprocessing high-dimensional data and reducing its dimensionality. While RP has been widely used and evaluated in stationary data analysis scenarios, its behavior in non-stationary environments has not been well analyzed. In this paper, we evaluate RP on streaming data, including a concept of altering dimensions. We discuss why RP can be used in this scenario and how it can handle stream-specific situations such as concept drift. We also report experiments with RP on streaming data, using state-of-the-art streaming classifiers such as the Adaptive Hoeffding Tree, together with concept drift detectors, on streams containing altering dimensions.
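As a rough illustration of the idea summarized above, the following minimal sketch applies a Gaussian random projection to samples arriving one at a time. It is not the authors' implementation: the helper names (`make_projection`, `project`, `synthetic_stream`), the dimensions, and the synthetic stream with an abrupt mean shift are assumptions chosen for illustration only. The point it shows is that the projection matrix is drawn independently of the data, so it does not need to be refitted when the data distribution changes.

```python
import math
import random


def make_projection(d, k, seed=0):
    """Draw a d x k matrix with i.i.d. N(0, 1/k) entries (hypothetical helper)."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(k)
    return [[rng.gauss(0.0, scale) for _ in range(k)] for _ in range(d)]


def project(x, R):
    """Map one d-dimensional sample to k dimensions: y = x R."""
    k = len(R[0])
    return [sum(x[i] * R[i][j] for i in range(len(x))) for j in range(k)]


def euclidean(a, b):
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))


def synthetic_stream(n, d, seed=1):
    """Toy stream (an assumption): the feature mean shifts abruptly halfway."""
    rng = random.Random(seed)
    for t in range(n):
        shift = 0.0 if t < n // 2 else 3.0
        yield [rng.gauss(shift, 1.0) for _ in range(d)]


D, K = 500, 100  # original and reduced dimensionality (illustrative values)
R = make_projection(D, K)  # drawn once, without looking at any data

samples = list(synthetic_stream(20, D))
projected = [project(x, R) for x in samples]  # one sample at a time

# Ratio of projected to original distance for consecutive samples,
# including the pair that straddles the drift point.
ratios = [
    euclidean(projected[i], projected[i + 1]) / euclidean(samples[i], samples[i + 1])
    for i in range(len(samples) - 1)
]
print(min(ratios), max(ratios))
```

In this sketch the distance ratios should stay close to 1 both before and after the shift, which is the property that makes a fixed projection usable as a preprocessing step in front of a streaming classifier.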
Notes
1. All experiments are implemented in Python supported by the scikit-multiflow framework [18].
2. https://github.com/ChristophRaab/stvm; we are using ‘org vs people’.
References
Achlioptas, D.: Database-friendly random projections. In: Proceedings of the Twentieth ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, pp. 274–281. ACM (2001)
Achlioptas, D.: Database-friendly random projections: Johnson-Lindenstrauss with binary coins. J. Comput. Syst. Sci. 66, 671–687 (2003)
Aggarwal, C.C.: A survey of stream classification algorithms. In: Data Classification: Algorithms and Applications (2014)
Bifet, A., Gavaldà, R.: Learning from time-changing data with adaptive windowing. In: Proceedings of the Seventh SIAM International Conference on Data Mining, Minneapolis, Minnesota, USA, 26–28 April 2007, pp. 443–448 (2007)
Bifet, A., Gavaldà, R.: Adaptive learning from evolving data streams. In: Adams, N.M., Robardet, C., Siebes, A., Boulicaut, J.-F. (eds.) IDA 2009. LNCS, vol. 5772, pp. 249–260. Springer, Heidelberg (2009). https://doi.org/10.1007/978-3-642-03915-7_22
Carraher, L.A., Wilsey, P.A., Moitra, A., Dey, S.: Random projection clustering on streaming data. In: 2016 IEEE 16th ICDMW, pp. 708–715 (2016)
Dasgupta, S., Gupta, A.: An elementary proof of a theorem of Johnson and Lindenstrauss. Random Struct. Algorithms 22, 60–65 (2003)
Gama, J., Zliobaite, I., Bifet, A., Pechenizkiy, M., Bouchachia, A.: A survey on concept drift adaptation. ACM Comput. Surv. 46(4), 1–37 (2014)
Gomes, H.M., et al.: Adaptive random forests for evolving data stream classification. Mach. Learn. 106(9), 1469–1495 (2017). https://doi.org/10.1007/s10994-017-5642-8
Grabowska, M., Kotłowski, W.: Online principal component analysis for evolving data streams. In: Czachórski, T., Gelenbe, E., Grochla, K., Lent, R. (eds.) ISCIS 2018. CCIS, vol. 935, pp. 130–137. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-00840-6_15
Heusinger, M., Raab, C., Schleif, F.-M.: Passive concept drift handling via momentum based robust soft learning vector quantization. In: Vellido, A., Gibert, K., Angulo, C., Martín Guerrero, J.D. (eds.) WSOM 2019. AISC, vol. 976, pp. 200–209. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-19642-4_20
Johnson, W.B., Lindenstrauss, J.: Extensions of Lipschitz mappings into a Hilbert space. Contemp. Math. 26, 189–206 (1984)
Kaban, A.: Improved bounds on the dot product under random projection and random sign projection. In: Proceedings of the 21st ACM SIGKDD. KDD 2015, pp. 487–496. ACM, New York (2015)
Klartag, B., Mendelson, S.: Empirical processes and random projections. J. Funct. Anal. 225(1), 229–245 (2005)
Li, P., Hastie, T.J., Church, K.W.: Very sparse random projections. In: Proceedings of the 12th ACM SIGKDD, pp. 287–296. ACM (2006)
Losing, V., Hammer, B., Wersing, H.: KNN classifier with self adjusting memory for heterogeneous concept drift. In: Proceedings of the IEEE ICDM, pp. 291–300 (2017)
van der Maaten, L., Hinton, G.: Visualizing data using t-SNE. J. Mach. Learn. Res. 9, 2579–2605 (2008)
Montiel, J., Read, J., Bifet, A., Abdessalem, T.: Scikit-multiflow: a multi-output streaming framework. J. Mach. Learn. Res. 19(72), 1–5 (2018)
Oza, N.C.: Online bagging and boosting. In: 2005 IEEE International Conference on Systems, Man and Cybernetics, vol. 3, pp. 2340–2345 (2005)
Pham, X.C., Dang, M.T., Dinh, S.V., Hoang, S., Nguyen, T.T., Liew, A.W.: Learning from data stream based on random projection and Hoeffding tree classifier. In: DICTA 2017, pp. 1–8 (2017)
Raab, C., Heusinger, M., Schleif, F.M.: Reactive soft prototype computing for frequent reoccurring concept drift. In: Proceedings of the 27th ESANN, pp. 437–442 (2019)
Sacha, D., et al.: Visual interaction with dimensionality reduction: a structured literature analysis. IEEE Trans. Vis. Comput. Graph. 23(1), 241–250 (2017)
Schoeneman, F., Mahapatra, S., Chandola, V., Napp, N., Zola, J.: Error metrics for learning reliable manifolds from streaming data. In: Proceedings of the 2017 SIAM International Conference on Data Mining, pp. 750–758. SIAM (2017)
Wold, S., Esbensen, K., Geladi, P.: Principal component analysis. Chemometr. Intell. Lab. Syst. 2, 37–52 (1987)
Acknowledgement
We are thankful for support in the FuE program Informations- und Kommunikationstechnik of the StMWi, project OBerA, grant number IUK-1709-0011// IUK530/010.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Heusinger, M., Schleif, FM. (2020). Random Projection in the Presence of Concept Drift in Supervised Environments. In: Rutkowski, L., Scherer, R., Korytkowski, M., Pedrycz, W., Tadeusiewicz, R., Zurada, J.M. (eds) Artificial Intelligence and Soft Computing. ICAISC 2020. Lecture Notes in Computer Science, vol 12415. Springer, Cham. https://doi.org/10.1007/978-3-030-61401-0_48
DOI: https://doi.org/10.1007/978-3-030-61401-0_48
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-61400-3
Online ISBN: 978-3-030-61401-0
eBook Packages: Computer Science, Computer Science (R0)