
Concept-evolution detection in non-stationary data streams: a fuzzy clustering approach

  • Regular Paper
  • Published in Knowledge and Information Systems

Abstract

We have entered the era of networked communications, in which concepts such as big data and social networks are emerging. The explosion of available data across a broad range of application domains has made data streams an inevitable part of most real-world applications. The classification of data streams faces four major challenges: infinite length, concept drift, recurring concepts, and evolving concepts. This paper proposes a novel method to address these challenges, with a focus on the last. Unlike existing methods for the detection of evolving concepts, we cast joint classification and detection of evolving concepts as the optimization of an objective function obtained by extending a fuzzy agglomerative clustering method. Moreover, rather than keeping instances or hyper-sphere summaries of previously seen classes, we maintain only class boundaries in the kernel space and generate instances of each class on demand. This approach improves accuracy and reduces the memory usage of the proposed method. Experimental results on several synthetic and real datasets show the superiority of the proposed method over related state-of-the-art methods.




Acknowledgements

The authors would like to thank the anonymous reviewers for their constructive comments which improved the paper.

Author information

Corresponding author

Correspondence to Hamid Beigy.

Appendix

In this appendix, we give the proof of Theorem 1. The goal of the optimization procedure is to simultaneously find fuzzy memberships U and cluster centers Z that minimize the objective function given in Eq. (11). Onclad adopts an alternating optimization approach to minimize \(J_{\textsc{Onclad}}\). Minimizing \(J_{\textsc{Onclad}}\) subject to the constraints is a constrained nonlinear optimization problem; introducing Lagrange multipliers yields the following Lagrange function.

$$\begin{aligned} J_{\textsc{Onclad}}(U,Z) = {} & \sum _{j=1}^{K}\sum _{i=1}^{N}u_{ij}d_{ij} + \gamma \sum _{j=1}^{K}\sum _{i=1}^{N}u_{ij}\log u_{ij} \\ & + \alpha \sum _{m \mid P_m \in C^i}\; \sum _{\substack{n \mid P_n \in C^i \\ m\ne n}}\; \sum _{k=1}^{K}\sum _{\substack{l=1 \\ l\ne k}}^{K}u_{mk}u_{nl} \\ & + \beta \sum _{m \mid P_m \in C^i}\; \sum _{\substack{n \mid P_n \in C^j \\ i\ne j}}\; \sum _{k=1}^{K}u_{mk}u_{nk} \\ & + \sum _{i=1}^{N}\lambda _i \left( \sum _{j=1}^{K}u_{ij}-1\right) \end{aligned}$$
(11)

This objective function is minimized using an alternating optimization approach: first, we fix the fuzzy memberships U and minimize the objective function with respect to Z; then, we fix the cluster centers Z and minimize it with respect to U. The optimal cluster centers \(Z \equiv [z_{jl}]_{K \times m}\) and fuzzy memberships \(U \equiv [u_{ij}]_{N \times K}\) are obtained by the following lemmas.
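
To make this scheme concrete, the following is a minimal, self-contained Python sketch of the alternating loop under the simplifying assumption that the coupling terms weighted by \(\alpha \) and \(\beta \) are zero, which reduces the objective to its entropy-regularized fuzzy k-means core. The two updates correspond to Eqs. (12) and (15) derived in the lemmas below; all names are illustrative, not the authors' implementation.

```python
import numpy as np

def fit(X, K, gamma=1.0, n_iters=100, seed=0):
    """Alternating minimization of the simplified objective (alpha = beta = 0)."""
    rng = np.random.default_rng(seed)
    U = rng.dirichlet(np.ones(K), size=len(X))     # random memberships; rows sum to 1
    for _ in range(n_iters):
        Z = (U.T @ X) / U.sum(axis=0)[:, None]     # Eq. (12): weighted means
        d = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(axis=-1)  # squared distances d_ij
        e = np.exp(-(d - d.min(axis=1, keepdims=True)) / gamma)  # stabilized exponentials
        U = e / e.sum(axis=1, keepdims=True)       # Eq. (15) with A = B = 0
    return U, Z
```

In the full method, the penalty terms would be recomputed from the constraint sets at each iteration, and the loop would stop once the decrease in \(J_{\textsc{Onclad}}\) falls below a tolerance.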

Lemma 1

Given that the fuzzy memberships U are fixed, the optimal values of the cluster centers \(Z \equiv [z_{jl}]_{K \times m}\) are obtained using the following equation:

$$\begin{aligned} z_{jl} = \dfrac{\sum \nolimits _{i=1}^{N} u_{ij} x_{il}}{\sum \nolimits _{i=1}^{N} u_{ij}}. \end{aligned}$$
(12)

Proof

By taking the derivative of Eq. (11) with respect to each cluster center and setting it to zero, we obtain:

$$\begin{aligned} \dfrac{\partial J(U,\mathbf Z )}{\partial z_{jl}} = \sum \limits _{i=1}^{N}2u_{ij}(z_{jl}-x_{il})=0 \end{aligned}$$
(13)

Thus, the solution for \(z_{jl}\) is

$$\begin{aligned} z_{jl} = \dfrac{\sum \nolimits _{i=1}^{N} u_{ij} x_{il}}{\sum \nolimits _{i=1}^{N} u_{ij}}, \end{aligned}$$
(14)

which completes the proof of the lemma. \(\square \)
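
Computationally, Eq. (12) is a membership-weighted mean and vectorizes directly. A minimal NumPy sketch, assuming U is the \(N \times K\) membership matrix and X the \(N \times m\) data matrix (the function and argument names are ours, not from the paper):

```python
import numpy as np

def update_centers(X: np.ndarray, U: np.ndarray) -> np.ndarray:
    """Eq. (12): each center z_j is the u_ij-weighted mean of the data."""
    # (K, m) = (K, N) @ (N, m), normalized by each cluster's total membership.
    return (U.T @ X) / U.sum(axis=0)[:, None]
```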

Lemma 2

Given that the cluster centers Z are fixed, the optimal values of the fuzzy memberships are:

$$\begin{aligned} u_{ij}= \dfrac{\exp \left( \dfrac{-d_{ij}}{\gamma }\right) \exp \left( \dfrac{-\alpha A_{ij}}{\gamma }\right) \exp \left( \dfrac{-\beta B_{ij}}{\gamma }\right) }{\sum \nolimits _{l=1}^{K} \exp \left( \dfrac{-d_{il}}{\gamma }\right) \exp \left( \dfrac{-\alpha A_{il}}{\gamma }\right) \exp \left( \dfrac{-\beta B_{il}}{\gamma }\right) } \end{aligned}$$
(15)

Proof

Taking the derivative of Eq. (11) with respect to each fuzzy membership and setting it to zero, we obtain:

$$\begin{aligned} \dfrac{\partial J(\mathbf U ,Z)}{\partial u_{ij}} = {} & d_{ij} + \gamma (1 + \log u_{ij}) + \alpha \Bigg( \underbrace{\sum _{\substack{n \mid P_i \in C^m,\, P_n \in C^l \\ m\ne l}} \; \sum _{\substack{k=1 \\ k\ne j}}^{K} u_{nk}}_{A_{ij}}\Bigg) \\ & + \beta \Bigg( \underbrace{\sum _{n \mid P_i,P_n \in C^m}u_{nj}}_{B_{ij}}\Bigg) + \lambda _i = 0 \end{aligned}$$
(16)

Solving the above equation for \(u_{ij}\), we obtain:

$$\begin{aligned} u_{ij}= \exp (-1)\exp \left( \dfrac{-d_{ij}}{\gamma }\right) \exp \left( \dfrac{-\alpha A_{ij}}{\gamma }\right) \exp \left( \dfrac{-\beta B_{ij}}{\gamma }\right) \exp \left( \dfrac{-\lambda _i}{\gamma }\right) \end{aligned}$$
(17)

Because of the constraint \(\sum _{j=1}^{K} u_{ij}=1\), the Lagrange multipliers satisfy

$$\begin{aligned} \sum _{j=1}^{K} u_{ij} & = \sum _{j=1}^{K} \exp (-1)\exp \left( \dfrac{-d_{ij}}{\gamma }\right) \exp \left( \dfrac{-\alpha A_{ij}}{\gamma }\right) \exp \left( \dfrac{-\beta B_{ij}}{\gamma }\right) \exp \left( \dfrac{-\lambda _i}{\gamma }\right) \\ & = \exp (-1)\exp \left( \dfrac{-\lambda _i}{\gamma }\right) \sum _{j=1}^{K} \exp \left( \dfrac{-d_{ij}}{\gamma }\right) \exp \left( \dfrac{-\alpha A_{ij}}{\gamma }\right) \exp \left( \dfrac{-\beta B_{ij}}{\gamma }\right) = 1 \end{aligned}$$
(18)

By some algebraic simplification, we obtain:

$$\begin{aligned} \exp \left( \dfrac{-\lambda _i}{\gamma }\right) = \dfrac{1}{\exp (-1) \sum \nolimits _{j=1}^{K} \exp \left( \dfrac{-d_{ij}}{\gamma }\right) \exp \left( \dfrac{-\alpha A_{ij}}{\gamma }\right) \exp \left( \dfrac{-\beta B_{ij}}{\gamma }\right) } \end{aligned}$$
(19)

By substituting Eq. (19) in Eq. (17), we obtain the closed form solution for the optimal memberships as

$$\begin{aligned} u_{ij}= \dfrac{\exp \left( \dfrac{-d_{ij}}{\gamma }\right) \exp \left( \dfrac{-\alpha A_{ij}}{\gamma }\right) \exp \left( \dfrac{-\beta B_{ij}}{\gamma }\right) }{\sum \nolimits _{l=1}^{K} \exp \left( \dfrac{-d_{il}}{\gamma }\right) \exp \left( \dfrac{-\alpha A_{il}}{\gamma }\right) \exp \left( \dfrac{-\beta B_{il}}{\gamma }\right) }, \end{aligned}$$
(20)

which completes the proof of the lemma. \(\square \)
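
Computationally, Eq. (15) is a row-wise softmax over \(-(d_{ij} + \alpha A_{ij} + \beta B_{ij})/\gamma \). A minimal sketch, assuming the distance matrix d and the penalty matrices A and B (all \(N \times K\)) have been precomputed; argument names are illustrative:

```python
import numpy as np

def update_memberships(d, A, B, alpha, beta, gamma):
    """Eq. (15): row-wise softmax of -(d + alpha*A + beta*B) / gamma."""
    logits = -(d + alpha * A + beta * B) / gamma
    logits -= logits.max(axis=1, keepdims=True)   # stabilize the exponentials
    e = np.exp(logits)
    return e / e.sum(axis=1, keepdims=True)       # rows sum to one, satisfying the constraint
```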

The following lemma shows that the alternating minimization procedure, which updates Z and U using Eqs. (12) and (15), respectively, converges.

Lemma 3

Let J(Z) be \(J_{\textsc{Onclad}}\) with the fuzzy memberships fixed, let J(U) be \(J_{\textsc{Onclad}}\) with the cluster centers fixed, and let \(\alpha , \beta , \gamma >0\). Then Z and U are a local optimum of \(J_{\textsc{Onclad}}\) if \(z_{jl}\) and \(u_{ij}\) are calculated using Eqs. (12) and (15), respectively.

Proof

The necessity has been proven in Lemmas 1 and 2. To prove sufficiency, we compute the Hessian matrices H(J(Z)) of J(Z) and H(J(U)) of J(U) as follows.

$$\begin{aligned} h_{fg,il}(J(Z)) = \dfrac{\partial }{\partial z_{fg}}\left[ \dfrac{\partial J(Z)}{\partial z_{il}}\right] = {\left\{ \begin{array}{ll} \sum _{n=1}^{N} 2u_{ni}, &{}\quad \text {if}\ f=i,\ g=l\\ 0, &{} \quad \text {otherwise}\\ \end{array}\right. } \end{aligned}$$
(21)
$$\begin{aligned} h_{fg,ij}(J(U)) = \dfrac{\partial }{\partial u_{fg}}\left[ \dfrac{\partial J(U)}{\partial u_{ij}}\right] = {\left\{ \begin{array}{ll} \dfrac{\gamma }{u_{ij}}, &{} \quad \text {if}\ f=i,\ g=j\\ 0, &{}\quad \text {otherwise}\\ \end{array}\right. } \end{aligned}$$
(22)

According to these equations, H(J(Z)) and H(J(U)) are diagonal matrices. Since \(u_{ij}\in (0,1]\) and \(\gamma >0\), all diagonal entries are positive; hence, the Hessian matrices are positive definite, and Eqs. (12) and (15) are sufficient conditions for minimizing J(Z) and J(U), respectively. \(\square \)

Proof of Theorem 1

The necessary conditions for \(J_{\textsc{Onclad}}\) to attain a local minimum were proven in Lemmas 1 and 2. By Lemma 3, each alternating update does not increase the objective, so \(J_{\textsc{Onclad}}(U^{(t+1)},Z^{(t+1)}) \le J_{\textsc{Onclad}}(U^{(t)},Z^{(t)})\). Since the sequence of objective values is non-increasing and bounded below, it converges, which proves convergence to a local minimum. \(\square \)
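
The monotone decrease asserted by Theorem 1 is easy to check numerically. The sketch below runs the simplified updates (again with \(\alpha = \beta = 0\)) on random data and asserts that the corresponding objective never increases; all names are illustrative.

```python
import numpy as np

def objective(X, U, Z, gamma):
    """Simplified J (alpha = beta = 0): weighted distortion plus entropy regularizer."""
    d = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(axis=-1)
    return (U * d).sum() + gamma * (U * np.log(U)).sum()

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))                      # toy data: 200 points in 2-D
U = rng.dirichlet(np.ones(3), size=200)            # random initial memberships, K = 3
J_prev = np.inf
for _ in range(50):
    Z = (U.T @ X) / U.sum(axis=0)[:, None]         # Eq. (12)
    d = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(axis=-1)
    e = np.exp(-(d - d.min(axis=1, keepdims=True)))  # gamma = 1, stabilized
    U = e / e.sum(axis=1, keepdims=True)           # Eq. (15) with A = B = 0
    J = objective(X, U, Z, gamma=1.0)
    assert J <= J_prev + 1e-9                      # non-increasing, as Theorem 1 states
    J_prev = J
```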

About this article

Cite this article

ZareMoodi, P., Kamali Siahroudi, S. & Beigy, H. Concept-evolution detection in non-stationary data streams: a fuzzy clustering approach. Knowl Inf Syst 60, 1329–1352 (2019). https://doi.org/10.1007/s10115-018-1266-y
