Concept Drift Detection with Denoising Autoencoder in Incomplete Data

Murao, Jun; Yonekawa, Kei; Kurokawa, Mori; Amagata, Daichi; Maekawa, Takuya; Hara, Takahiro

doi:10.1007/978-3-030-94822-1_35

Conference paper
First Online: 08 February 2022

Part of the book series: Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering (LNICST, volume 419)

Abstract

Recent e-commerce and location-based services provide personalized recommendations based on machine-learning models that take purchase and visit histories into account. Because these models assume that training and test data follow the same distribution, they cannot keep up with concept drift, i.e., changes in behavioral patterns over time. To keep recommendations accurate, it is important to detect concept drift. Detection generally requires complete data (i.e., data without missing values). Real-world datasets, however, are often incomplete, and existing concept drift detection techniques do not handle incomplete data. To address this issue, we investigate how a deep learning technique, the denoising autoencoder, which imputes missing values, contributes to detecting concept drift in incomplete data. We conduct experiments on synthetic and real datasets to evaluate the robustness of this technique, and our results show its advantages.
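The general idea in the abstract, imputing missing values with a denoising autoencoder and then comparing two data windows, can be illustrated with a minimal NumPy sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the one-hidden-layer network, the masking-noise rate, the synthetic data, the function names (`fit_dae`, `impute`, `drift_score`, `make_window`), and the simple mean-difference statistic, which stands in for whatever drift detector the authors actually use.

```python
import numpy as np

def fit_dae(X, hidden=8, epochs=300, lr=0.1, corrupt=0.2, seed=0):
    """Fit a one-hidden-layer denoising autoencoder on data containing NaNs.

    Missing entries are zero-filled on input and excluded from the loss.
    (Illustrative architecture; not the network from the paper.)
    """
    rng = np.random.default_rng(seed)
    observed = ~np.isnan(X)
    X0 = np.where(observed, X, 0.0)
    n, d = X0.shape
    W1 = rng.normal(0.0, 0.1, (d, hidden)); b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.1, (hidden, d)); b2 = np.zeros(d)
    for _ in range(epochs):
        # Masking noise: randomly zero out a fraction of the inputs.
        noisy = X0 * (rng.random(X0.shape) > corrupt)
        H = np.tanh(noisy @ W1 + b1)
        R = H @ W2 + b2
        # Gradient of half the mean squared error over observed entries only.
        G = (R - X0) * observed / observed.sum()
        GH = (G @ W2.T) * (1.0 - H ** 2)
        W2 -= lr * (H.T @ G); b2 -= lr * G.sum(axis=0)
        W1 -= lr * (noisy.T @ GH); b1 -= lr * GH.sum(axis=0)
    return W1, b1, W2, b2

def impute(X, params):
    """Replace NaNs with the autoencoder's reconstruction; keep observed values."""
    W1, b1, W2, b2 = params
    observed = ~np.isnan(X)
    X0 = np.where(observed, X, 0.0)
    R = np.tanh(X0 @ W1 + b1) @ W2 + b2
    return np.where(observed, X, R)

def drift_score(A, B):
    """Largest absolute Welch-style t statistic across features (a stand-in detector)."""
    se = np.sqrt(A.var(axis=0, ddof=1) / len(A) + B.var(axis=0, ddof=1) / len(B))
    return float(np.max(np.abs(A.mean(axis=0) - B.mean(axis=0)) / se))

def make_window(n, shift, missing, rng):
    """Synthetic correlated data with values missing completely at random."""
    z = rng.normal(size=(n, 1))
    X = z + 0.5 * rng.normal(size=(n, 4)) + shift
    X[rng.random(X.shape) < missing] = np.nan
    return X

rng = np.random.default_rng(42)
reference = make_window(500, 0.0, 0.3, rng)   # training-time window
same = make_window(500, 0.0, 0.3, rng)        # later window, no drift
drifted = make_window(500, 3.0, 0.3, rng)     # later window, mean shifted

params = fit_dae(reference)                   # train on the reference window only
ref_full = impute(reference, params)
print("score without drift:", drift_score(ref_full, impute(same, params)))
print("score with drift:   ", drift_score(ref_full, impute(drifted, params)))
```

In this toy setup the score for the shifted window should come out well above the score for the unshifted one. A real detector would also need a principled threshold, for example from a permutation test, which the sketch leaves out.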

Acknowledgements

This research is supported by JST CREST Grant Number JPMJCR21F2.

Author information

Authors and Affiliations

Osaka University, Osaka, Japan
Jun Murao, Daichi Amagata, Takuya Maekawa & Takahiro Hara
KDDI Research, Inc., Saitama, Japan
Kei Yonekawa & Mori Kurokawa

Corresponding author

Correspondence to Daichi Amagata.

Editor information

Editors and Affiliations

Osaka University, Osaka, Japan
Takahiro Hara
Osaka University, Osaka, Japan
Hirozumi Yamaguchi

About this paper

Cite this paper

Murao, J., Yonekawa, K., Kurokawa, M., Amagata, D., Maekawa, T., Hara, T. (2022). Concept Drift Detection with Denoising Autoencoder in Incomplete Data. In: Hara, T., Yamaguchi, H. (eds) Mobile and Ubiquitous Systems: Computing, Networking and Services. MobiQuitous 2021. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 419. Springer, Cham. https://doi.org/10.1007/978-3-030-94822-1_35

DOI: https://doi.org/10.1007/978-3-030-94822-1_35
Published: 08 February 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-94821-4
Online ISBN: 978-3-030-94822-1
eBook Packages: Computer Science, Computer Science (R0)
