Abstract
Incremental learning is a paradigm in which a model is updated continuously as new data become available; its main challenge is adapting to non-stationary environments without time-consuming re-training. Much work has addressed incremental supervised learning, but obtaining sufficient labeled data remains a major problem. Domain adaptation methods, which have recently gained attention, leverage knowledge from an auxiliary source domain to improve performance in the target domain by reducing the discrepancy between the two. To address these issues, this paper proposes a model that incrementally learns a new domain subject to drift caused by a non-stationary environment. The model uses unsupervised, fuzzy-based domain adaptation to classify data streams under concept drift, assuming a label-agnostic incremental setting in the target domain. Incremental updates are triggered whenever an entropy-based metric indicates uncertainty, so that informative samples are integrated. In addition, outdated samples are forgotten during training through a dynamic sample weighting strategy. Experiments on forty-five tasks show that the proposed model handles dynamic adaptation on non-stationary domains better than competing methods, with improvements in both accuracy and computational efficiency.





Data availability
The data that support the findings of this study are openly available at http://qwone.com/~jason/20Newsgroups/ and http://jmcauley.ucsd.edu/data/amazon/.
References
Li H, Yu H, Min F, Liu D, Li H (2022) Incremental sequential three-way decision based on continual learning network. Int J Mach Learn Cybern 13:1633–1645. https://doi.org/10.1007/s13042-021-01472-9
Han J, Liu Z, Li Y, Zhang T (2023) SCMP-IL: an incremental learning method with super constraints on model parameters. Int J Mach Learn Cybern 14:1751–1767. https://doi.org/10.1007/s13042-022-01725-1
Bouguelia M-R, Nowaczyk S, Santosh KC, Verikas A (2018) Agreeing to disagree: active learning with noisy labels without crowdsourcing. Int J Mach Learn Cybern 9:1307–1319. https://doi.org/10.1007/s13042-017-0645-0
Nakarmi S, Santosh K (2023) Active learning to minimize the risk from future epidemics. In: 2023 IEEE Conference on Artificial Intelligence (CAI), pp 329–330. https://doi.org/10.1109/CAI54212.2023.00145
Santosh KC, Nakarmi S (2023) Active learning to minimize the possible risk of future epidemics. Springer Nature, Singapore
Pan SJ, Yang Q (2009) A survey on transfer learning. IEEE Trans Knowl Data Eng 22:1345–1359. https://doi.org/10.1109/TKDE.2009.191
Zhu H, Chen Z, Liu S (2023) Learning knowledge representation with meta knowledge distillation for single image super-resolution. J Vis Commun Image Represent 95:103874. https://doi.org/10.1016/j.jvcir.2023.103874
Long M, Wang J, Ding G, Sun J, Yu PS (2013) Transfer feature learning with joint distribution adaptation. In: Proceedings of the IEEE international conference on computer vision, pp 2200–2207
Nguyen AT, Tran T, Gal Y, Baydin AG (2021) Domain invariant representation learning with domain density transformations. Adv Neural Inf Process Syst 34:5264–5275
Chen S, Hong Z, Harandi M, Yang X (2022) Domain neural adaptation. IEEE Trans Neural Netw Learn Syst. https://doi.org/10.1109/TNNLS.2022.3151683
Qu X, Liu L, Zhu L, Nie L, Zhang H (2024) Source-free style-diversity adversarial domain adaptation with privacy-preservation for person re-identification. Knowl-Based Syst 283:111150. https://doi.org/10.1016/j.knosys.2023.111150
Wang S, Wang B, Zhang Z, Heidari AA, Chen H (2023) Class-aware sample reweighting optimal transport for multi-source domain adaptation. Neurocomputing 523:213–223
Liu F, Lu J, Zhang G (2018) Unsupervised heterogeneous domain adaptation via shared fuzzy equivalence relations. IEEE Trans Fuzzy Syst 26:3555–3568. https://doi.org/10.1109/TFUZZ.2018.2836364
Lee W, Kim H, Lee J (2021) Compact class-conditional domain invariant learning for multi-class domain adaptation. Pattern Recogn 112:107763. https://doi.org/10.1016/j.patcog.2020.107763
Wang J, Zhang X-L (2023) Improving pseudo labels with intra-class similarity for unsupervised domain adaptation. Pattern Recogn 138:109379
Chen Q, Zhang H, Ye Q, Zhang Z, Yang W (2022) Learning discriminative feature via a generic auxiliary distribution for unsupervised domain adaptation. Int J Mach Learn Cybern 13:175–185. https://doi.org/10.1007/s13042-021-01381-x
Moradi M, Hamidzadeh J (2023) A domain adaptation method by incorporating belief function in twin quarter-sphere SVM. Knowl Inf Syst 65:3125–3163. https://doi.org/10.1007/s10115-023-01857-y
Taufique AMN, Jahan CS, Savakis A (2023) Continual unsupervised domain adaptation in data-constrained environments. IEEE Trans Artif Intell. https://doi.org/10.1109/TAI.2022.3233791
Lu J, Liu A, Dong F, Gu F, Gama J, Zhang G (2018) Learning under concept drift: a review. IEEE Trans Knowl Data Eng 31:2346–2363
Yan MMW (2020) Accurate detecting concept drift in evolving data streams. ICT Express 6:332–338. https://doi.org/10.1016/j.icte.2020.05.011
Guo H, Zhang S, Wang W (2021) Selective ensemble-based online adaptive deep neural networks for streaming data with concept drift. Neural Netw 142:437–456. https://doi.org/10.1016/j.neunet.2021.06.027
Gama J, Žliobaitė I, Bifet A, Pechenizkiy M, Bouchachia A (2014) A survey on concept drift adaptation. ACM Comput Surv (CSUR) 46:44
Baena-García M, del Campo-Ávila J, Fidalgo R, Bifet A, Gavaldà R, Morales-Bueno R (2006) Early drift detection method. In: Fourth international workshop on knowledge discovery from data streams, pp 77–86
Gomes HM, Bifet A, Read J, Barddal JP, Enembreck F, Pfahringer B, Holmes G, Abdessalem T (2019) Correction to: adaptive random forests for evolving data stream classification. Mach Learn 108:1877–1878. https://doi.org/10.1007/s10994-019-05793-3
Xu S, Wang J (2017) Dynamic extreme learning machine for data stream classification. Neurocomputing 238:433–449. https://doi.org/10.1016/j.neucom.2016.12.078
Baidari I, Honnikoll N (2021) Bhattacharyya distance based concept drift detection method for evolving data stream. Expert Syst Appl 183:115303. https://doi.org/10.1016/j.eswa.2021.115303
Chen D, Yang Q, Liu J, Zeng Z (2020) Selective prototype-based learning on concept-drifting data streams. Inf Sci 516:20–32. https://doi.org/10.1016/j.ins.2019.12.046
Zheng X, Li P, Hu X, Yu K (2021) Semi-supervised classification on data streams with recurring concept drift and concept evolution. Knowl-Based Syst 215:106749. https://doi.org/10.1016/j.knosys.2021.106749
de Mello RF, Vaz Y, Grossi CH, Bifet A (2019) On learning guarantees to unsupervised concept drift detection on data streams. Expert Syst Appl 117:90–102. https://doi.org/10.1016/j.eswa.2018.08.054
Haque A, Khan L and Baron M (2016) Sand: Semi-supervised adaptive novel class detection and classification over data stream. In: Proceedings of the AAAI Conference on Artificial Intelligence
Din SU, Shao J (2020) Exploiting evolving micro-clusters for data stream classification with emerging class detection. Inf Sci 507:404–420
Li P, Wu X and Hu X (2010) Mining recurring concept drifts with limited labeled streaming data. In: Proceedings of 2nd Asian conference on machine learning JMLR Workshop and Conference Proceedings, pp 241–252
Alippi C, Roveri M (2008) Just-in-time adaptive classifiers—Part I: Detecting nonstationary changes. IEEE Trans Neural Netw 19:1145–1153
Kuncheva LI, Žliobaitė I (2009) On the window size for classification in changing environments. Intell Data Anal 13:861–872
Tran D-H (2019) Automated change detection and reactive clustering in multivariate streaming data. In: International Conference on Computing and Communication Technologies (RIVF). IEEE, pp 1–6
Abirami MG, Gressel G (2022) Concept drift detection using minimum prediction deviation. In: Soft computing and signal processing. Springer, pp 249–258
Wong W, Koh YS, Dobbie G (2023) Using flexible memories to reduce catastrophic forgetting. Springer, pp 219–230
Kong Y, Liu L, Chen H, Kacprzyk J, Tao D (2023) Overcoming catastrophic forgetting in continual learning by exploring eigenvalues of hessian matrix. IEEE Trans Neural Netw Learn Syst. https://doi.org/10.1109/TNNLS.2023.3292359
Zhong Y, Zhou J, Li P, Gong J (2023) Dynamically evolving deep neural networks with continuous online learning. Inf Sci 646:119411. https://doi.org/10.1016/j.ins.2023.119411
Binici K, Pham NT, Mitra T and Leman K Preventing catastrophic forgetting and distribution mismatch in knowledge distillation via synthetic data. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp 663–671
Liu H, Yan Z, Liu B, Zhao J, Zhou Y, El Saddik A (2023) Distilled meta-learning for multi-class incremental learning. ACM Trans Multimed Comput Commun Appl 19:1–16
Wu Y, Liang T, Feng S, Jin Y, Lyu G, Fei H and Wang Y (2023) MetaZSCIL: a meta-learning approach for generalized zero-shot class incremental learning, pp 10408–10416
Li P, Wu X, Hu X, Wang H (2015) Learning concept-drifting data streams with random ensemble decision trees. Neurocomputing 166:68–83. https://doi.org/10.1016/j.neucom.2015.04.024
Chandra S, Haque A, Khan L and Aggarwal C (2016) An adaptive framework for multistream classification. In: Proceedings of the 25th ACM International on Conference on Information and Knowledge Management, pp 1181–1190
Pinagé F, dos Santos EM, Gama J (2020) A drift detection method based on dynamic classifier selection. Data Min Knowl Disc 34:50–74. https://doi.org/10.1007/s10618-019-00656-w
Hamidzadeh J, Moradi M (2020) Incremental one-class classifier based on convex–concave hull. Pattern Anal Appl 23:1523–1549. https://doi.org/10.1007/s10044-020-00876-7
Li Z, Huang W, Xiong Y, Ren S, Zhu T (2020) Incremental learning imbalanced data streams with concept drift: the dynamic updated ensemble algorithm. Knowl-Based Syst 195:105694. https://doi.org/10.1016/j.knosys.2020.105694
Reis DMd, Flach P, Matwin S and Batista G (2016) Fast unsupervised online drift detection using incremental Kolmogorov-Smirnov test. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining Association for Computing Machinery, San Francisco, California, USA, pp 1545–1554. https://doi.org/10.1145/2939672.2939836
Yu H, Huang J, Liu Y, Zhu Q, Zhou M and Zhao F (2022) Source-free domain adaptation for real-world image Dehazing. arXiv e-prints: arXiv-2207
Li Z, Cai R, Chen J, Yan Y, Chen W, Zhang K and Ye J (2022) Time-series domain adaptation via sparse associative structure alignment: learning invariance and variance. arXiv preprint arXiv:2205.03554
Zou Y, Yu Z, Kumar BVK and Wang J (2018) Unsupervised domain adaptation for semantic segmentation via class-balanced self-training. In: Proceedings of the European conference on computer vision (ECCV), pp 289–305
French G, Mackiewicz M and Fisher M (2017) Self-ensembling for visual domain adaptation. arXiv preprint arXiv:1706.05208
Novosad P, Fonov V and Collins DL (2019) Unsupervised domain adaptation for the automated segmentation of neuroanatomy in MRI: a deep learning approach. bioRxiv: 845537
Wen J, Yuan J, Zheng Q, Liu R, Gong Z, Zheng N (2022) Hierarchical domain adaptation with local feature patterns. Pattern Recogn 124:108445
Pratama M, de Carvalho M, Xie R, Lughofer E and Lu J (2019) ATL: autonomous knowledge transfer from many streaming processes. In: Proceedings of the 28th ACM International Conference on Information and Knowledge Management, pp 269–278
Huang J, Gretton A, Borgwardt K, Schölkopf B, Smola A (2006) Correcting sample selection bias by unlabeled data. Adv Neural Inf Process Syst 19:601–608
Haque A, Wang Z, Chandra S, Dong B, Khan L and Hamlen KW (2017) Fusion: An online method for multistream classification. In: Proceedings of the 2017 ACM on Conference on Information and Knowledge Management, pp 919–928
Gama J, Medas P, Castillo G, Rodrigues P (2004) Learning with drift detection. In: Brazilian symposium on artificial intelligence. Springer, pp 286–295
Zhao P, Hoi SC, Wang J, Li B (2014) Online transfer learning. Artif Intell 216:76–102
Yu C, Wang J, Chen Y, Qin X (2019) Transfer channel pruning for compressing deep domain adaptation models. Int J Mach Learn Cybern 10:3129–3144. https://doi.org/10.1007/s13042-019-01004-6
Song X, Jin Z (2022) Domain adaptive attention-based dropout for one-shot person re-identification. Int J Mach Learn Cybern 13:255–268. https://doi.org/10.1007/s13042-021-01399-1
Kasim S and Sheppard JW Cross-domain similarity in domain adaptation for human activity recognition IEEE, pp 1–8
Zhong X-C, Wang Q, Liu D, Liao J-X, Yang R, Duan S, Ding G, Sun J (2023) A deep domain adaptation framework with correlation alignment for EEG-based motor imagery classification. Comput Biol Med 163:107235. https://doi.org/10.1016/j.compbiomed.2023.107235
Bi X, Zhang X, Wang S, Zhang H (2022) Entropy-weighted reconstruction adversary and curriculum pseudo labeling for domain adaptation in semantic segmentation. Neurocomputing 506:277–289. https://doi.org/10.1016/j.neucom.2022.07.073
Zhe X, Du Z, Lou C, Li J (2023) Alleviating the generalization issue in adversarial domain adaptation networks. Image Vis Comput 135:104695. https://doi.org/10.1016/j.imavis.2023.104695
Rényi A (1961) On measures of entropy and information. University of California Press, pp 547–561
Gâlmeanu H, Andonie R (2021) Concept drift adaptation with incremental–decremental SVM. Appl Sci 11:9644. https://doi.org/10.3390/app11209644
Gâlmeanu H, Andonie R (2008) Implementation issues of an incremental and decremental SVM. In: International Conference on Artificial Neural Networks. Springer, pp 325–335. https://doi.org/10.1007/978-3-540-87536-9_34
Hamidzadeh J, Rezaeenik E, Moradi M (2021) Predicting users’ preferences by fuzzy rough set quarter-sphere support vector machine. Appl Soft Comput 112:107740. https://doi.org/10.1016/j.asoc.2021.107740
Read J (2018) Concept-drifting data streams are time series; The Case for Continuous Adaptation. CoRR abs/1810.02266
Read J (2018) Concept-drifting data streams are time series; the case for continuous adaptation. arXiv preprint arXiv:1810.02266
Mikolov T, Chen K, Corrado G and Dean J (2013) Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781
Arora S, Liang Y, Ma T (2017) A simple but tough-to-beat baseline for sentence embeddings. In: 5th International Conference on Learning Representations, ICLR
Sheskin DJ (2003) Handbook of parametric and nonparametric statistical procedures. CRC Press
Appendices
Appendix I
This section gives the details for computing the metric \(\mathcal{D}\) introduced in [13]. Consider a fuzzy feature vector \({\overline{A} }_{i}\left({a}_{i1},{a}_{i2},\dots ,{a}_{in}\right)\in F({\mathbb{R}}^{n})\). For each component \({\overline{a} }_{ij}\in F({\mathbb{R}})\), the membership function is computed by Eq. (I1).
where \({a}_{ij}\) is the \(j\)th feature value of the \(i\)th sample and \({\rho }_{i}\) is the hesitation degree of the \(i\)th sample under a triangular membership function. Eq. (I2) then yields \({\mu }_{ij}({\varvec{x}}|{\overline{A} }_{i})\), where \({\varvec{x}}=({x}_{1},{x}_{2},\dots ,{x}_{n})\).
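As a rough illustration of a triangular membership function of the kind described here, the sketch below assumes the common symmetric form that equals 1 at the feature value \({a}_{ij}\) and falls linearly to 0 at distance \({\rho }_{i}\). The exact expression of Eq. (I1) is not reproduced in this text, so the function name `triangular_membership`, the formula, and the sample values are all assumptions made for illustration only.

```python
import numpy as np

def triangular_membership(x, center, spread):
    """Symmetric triangular membership, evaluated feature by feature.

    Assumed form (not taken from Eq. (I1)):
        mu(x) = max(0, 1 - |x - center| / spread)
    `center` plays the role of a_ij and `spread` the role of rho_i.
    """
    x = np.asarray(x, dtype=float)
    center = np.asarray(center, dtype=float)
    if spread <= 0:
        raise ValueError("spread (hesitation degree) must be positive")
    # Clip so that points farther than `spread` from the centre get 0.
    return np.clip(1.0 - np.abs(x - center) / spread, 0.0, 1.0)

# Hypothetical sample with three feature values and hesitation degree 0.5.
a_i = np.array([1.0, 2.0, 3.0])
rho_i = 0.5
x = np.array([1.0, 2.25, 4.0])
mu = triangular_membership(x, a_i, rho_i)
```

Under these assumptions, a query value that coincides with the feature value has full membership, a value halfway to the edge of the support has membership 0.5, and a value outside the support has membership 0.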
To define the fuzzy relation between two heterogeneous domains (source and target), the following metric is used to measure the distance between fuzzy vectors.
where \(\lambda\) is the membership value, \({\mathcal{D}}_{\lambda }\left(u,v\right)\) indicates the distance between points \(u\) and \(v\) in \({\mathbb{R}}^{n}\) with the given \(\lambda\). \(\Omega \left(\lambda \right)\) is computed by Eq. (I4).
where \(d\left(v,u\right)\) is the \({l}_{1}\)-norm distance between two \(n\)-dimensional vectors \(u\) and \(v\). Note that the supremum operator (sup) in Eq. (I3) gives the longest distance from the fuzzy vector of one domain to the fuzzy set of the other domain. Eq. (I3) can be rewritten as Eq. (I5).
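The two ingredients named in this paragraph, the \({l}_{1}\) distance and the idea of taking the longest distance, can be sketched as follows. This is only a finite stand-in: the maximum is taken over a hypothetical finite set of candidate points, not over the actual set used in Eq. (I3), and the helper names `l1_distance` and `farthest_l1` are assumptions introduced for this example.

```python
import numpy as np

def l1_distance(u, v):
    """l1 distance between two n-dimensional vectors."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    if u.shape != v.shape:
        raise ValueError("u and v must have the same shape")
    return float(np.sum(np.abs(u - v)))

def farthest_l1(u, candidates):
    """Largest l1 distance from `u` to any row of `candidates`.

    A maximum over finitely many points, used here as a simple
    stand-in for the supremum; it is not the definition in Eq. (I3).
    """
    candidates = np.atleast_2d(np.asarray(candidates, dtype=float))
    return max(l1_distance(u, c) for c in candidates)

# Hypothetical points in R^2.
u = np.array([0.0, 0.0])
points = np.array([[1.0, 1.0], [-2.0, 0.5], [0.5, 0.0]])
d_max = farthest_l1(u, points)
```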
The above equation is de-fuzzified regarding Eqs. (I1) and (I2) as follows.
Eq. (I6) cannot be used directly to compute the fuzzy relation because it does not satisfy two properties required of a fuzzy relation: (1) symmetry, i.e. \(\mathcal{D}\left({\overline{A} }_{i},{\overline{A} }_{j}\right)=\mathcal{D}\left({\overline{A} }_{j},{\overline{A} }_{i}\right)\text{, }\forall {\overline{A} }_{i}\text{, }{\overline{A} }_{j}\), and (2) reflexivity, i.e. \(\mathcal{D}\left({\overline{A} }_{i},{\overline{A} }_{i}\right)=1, \forall {\overline{A} }_{i}\). Thus, the following function is employed.
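Symmetry and reflexivity are easy to check numerically once the pairwise values are arranged in a square matrix. The sketch below is a minimal check under that assumption; the function names and the two example matrices are hypothetical and do not come from the paper, and the code does not implement the corrected function referred to above.

```python
import numpy as np

def _is_square(R):
    return R.ndim == 2 and R.shape[0] == R.shape[1]

def is_symmetric(R, tol=1e-9):
    """True if R[i, j] == R[j, i] for all i, j (within `tol`)."""
    R = np.asarray(R, dtype=float)
    return _is_square(R) and bool(np.allclose(R, R.T, atol=tol))

def is_reflexive(R, tol=1e-9):
    """True if every diagonal entry R[i, i] equals 1 (within `tol`)."""
    R = np.asarray(R, dtype=float)
    return _is_square(R) and bool(np.allclose(np.diag(R), 1.0, atol=tol))

# Hypothetical pairwise values for two fuzzy vectors.
R_bad = np.array([[0.9, 0.4],
                  [0.7, 1.0]])   # neither symmetric nor reflexive
R_ok = np.array([[1.0, 0.4],
                 [0.4, 1.0]])    # symmetric and reflexive

checks = {
    "R_bad": (is_symmetric(R_bad), is_reflexive(R_bad)),
    "R_ok": (is_symmetric(R_ok), is_reflexive(R_ok)),
}
```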
Appendix II
All tables for Sect. 5.3 are presented from here on (see Tables 11, 12, 13, 14, 15 and 16).
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Moradi, M., Rahmanimanesh, M. & Shahzadi, A. Unsupervised domain adaptation by incremental learning for concept drifting data streams. Int. J. Mach. Learn. & Cyber. 15, 4055–4078 (2024). https://doi.org/10.1007/s13042-024-02135-1
DOI: https://doi.org/10.1007/s13042-024-02135-1