Abstract
We have entered the era of networked communications, in which concepts such as big data and social networks are emerging. The explosion of available data across a broad range of application domains has made data streams an inevitable part of most real-world applications. Classification of data streams faces four major challenges: infinite length, concept drift, recurring concepts, and evolving concepts. This paper proposes a novel method that addresses these challenges, with a focus on the last one. Unlike existing methods for detecting evolving concepts, we cast joint classification and detection of evolving concepts as the optimization of an objective function obtained by extending a fuzzy agglomerative clustering method. Moreover, rather than keeping instances or hyper-sphere summaries of previously seen classes, we maintain only class boundaries in the kernel space and generate instances of each class on demand; this improves accuracy and reduces memory usage. Experiments on several synthetic and real datasets show the superiority of the proposed method over related state-of-the-art methods in this area.
Acknowledgements
The authors would like to thank the anonymous reviewers for their constructive comments which improved the paper.
Appendix
In this appendix, we give the proof of Theorem 1. The goal of the optimization procedure is to simultaneously find fuzzy memberships U and cluster centers Z that minimize the objective function given in Eq. (11). Onclad adopts an alternating optimization approach to minimize \(J_{\textsc {Onclad}}\). Minimizing \(J_{\textsc {Onclad}}\) subject to the constraints is a constrained nonlinear optimization problem; introducing Lagrange multipliers yields the following Lagrange function.
This objective function is minimized by alternating optimization: first, we fix the fuzzy memberships U and minimize the objective function with respect to Z; then we fix the cluster centers Z and minimize it with respect to U. The updates for the cluster centers \(Z \equiv [z_{jl}]_{K \times m}\) and the fuzzy memberships \(U \equiv [u_{ij}]_{n \times K}\) are given by the following lemmas.
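The alternating scheme above can be sketched in code. The sketch below is illustrative only: it minimizes the standard fuzzy c-means objective (squared distortion weighted by powered memberships), not the full \(J_{\textsc{Onclad}}\) of Eq. (11), whose additional \(\alpha, \beta, \gamma\) regularization terms are not reproduced here; the function name and parameter defaults are assumptions.

```python
import numpy as np

def fcm_alternating(X, K, fuzzifier=2.0, iters=100, seed=0):
    """Alternating optimization for plain fuzzy c-means (a simplified
    stand-in for J_Onclad: the alpha/beta/gamma terms are omitted)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    # random fuzzy memberships with rows summing to 1
    U = rng.random((n, K))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(iters):
        W = U ** fuzzifier
        # step 1: fix U, update centers Z (analogue of Lemma 1)
        Z = (W.T @ X) / W.sum(axis=0)[:, None]
        # step 2: fix Z, update memberships U (analogue of Lemma 2)
        d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(axis=2)
        d2 = np.maximum(d2, 1e-12)          # avoid division by zero
        inv = d2 ** (-1.0 / (fuzzifier - 1.0))
        U = inv / inv.sum(axis=1, keepdims=True)
    J = ((U ** fuzzifier) * d2).sum()       # final objective value
    return U, Z, J
```

Each iteration solves the two subproblems in closed form, mirroring the structure of Lemmas 1 and 2.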
Lemma 1
Given that the fuzzy memberships U are fixed, the optimal values for the cluster centers \(Z \equiv [z_{jl}]_{K \times m}\) are obtained using the following equation:
Proof
By taking the derivative of Eq. (11) with respect to each cluster center and setting it to zero, we obtain:
Thus, the solution for \(z_{jl}\) is
which completes the proof of the lemma. \(\square \)
Lemma 2
Given that the cluster centers Z are fixed, the optimal values of the fuzzy memberships are given by:
Proof
Taking the derivative of Eq. (11) with respect to each fuzzy membership and setting it to zero, we obtain:
Solving the above equation for \(u_{ij}\), we obtain:
Because of the constraint \(\sum \nolimits _{j=1}^{K} u_{ij}=1\), the Lagrange multipliers are equal to
By some algebraic simplification, we obtain:
By substituting Eq. (19) into Eq. (17), we obtain the closed-form solution for the optimal memberships as
which completes the proof of the lemma. \(\square \)
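The optimality of a closed-form membership update of this kind can be checked numerically. The check below uses the plain fuzzy c-means term (fuzzifier 2) as a stand-in for Eq. (11), since the full objective is not reproduced here: with the centers fixed, no other feasible membership matrix should achieve a lower objective value than the closed-form one.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(30, 2))                # toy data
Z = np.array([[-1.0, 0.0], [1.0, 0.0]])     # fixed cluster centers

# squared distances from each point to each center, shape (30, 2)
d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(axis=2)

def J_of_U(U):
    """Fuzzy c-means term of the objective (fuzzifier 2), Z fixed."""
    return ((U ** 2) * d2).sum()

# closed-form memberships: u_ij proportional to 1/d2_ij, rows sum to 1
inv = 1.0 / d2
U_star = inv / inv.sum(axis=1, keepdims=True)

# random feasible memberships never beat the closed-form update
for _ in range(200):
    V = rng.random((30, 2))
    V /= V.sum(axis=1, keepdims=True)
    assert J_of_U(U_star) <= J_of_U(V) + 1e-9
```

This is exactly the role of Lemma 2: for fixed Z, the membership subproblem has a closed-form global minimizer on the probability simplex.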
The following lemma shows that the alternating minimization procedure, which updates Z and U using Eqs. (12) and (15), respectively, converges.
Lemma 3
Let J(Z) denote \(J_{\textsc {Onclad}}\) with the fuzzy memberships fixed, let J(U) denote \(J_{\textsc {Onclad}}\) with the cluster centers fixed, and let \(\alpha , \beta , \gamma >0\). Then Z and U are a local optimum of \(J_{\textsc {Onclad}}\) if \(z_{ij}\) and \(u_{ij}\) are calculated using Eqs. (12) and (15), respectively.
Proof
Necessity has been proven in Lemmas 1 and 2. To prove sufficiency, the Hessian matrices H(J(Z)) of J(Z) and H(J(U)) of J(U) are obtained as follows.
According to these equations, H(J(Z)) and H(J(U)) are diagonal matrices. Since \(u_{ij}\in (0,1]\) and \(\gamma >0\), all diagonal entries are positive; hence, the Hessian matrices are positive definite, and Eqs. (12) and (15) are sufficient conditions for minimizing J(Z) and J(U), respectively. \(\square \)
Proof of Theorem 1
The necessary conditions for \(J_{\textsc {Onclad}}\) to attain a local minimum were proven in Lemmas 1 and 2. According to Lemma 3, \(J_{\textsc {Onclad}}(U^{(t+1)},Z^{(t+1)}) \le J_{\textsc {Onclad}}(U^{(t)},Z^{(t)})\); since \(J_{\textsc {Onclad}}\) is bounded below, this non-increasing sequence of objective values converges, which establishes convergence to a local minimum. \(\square \)
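The monotone decrease asserted by Theorem 1 can be observed numerically. The sketch below again uses plain fuzzy c-means (fuzzifier 2) as a stand-in for \(J_{\textsc{Onclad}}\), whose regularization terms are not reproduced in this appendix; the trace of objective values produced by the alternating updates should never increase.

```python
import numpy as np

def fcm_sweep(X, U):
    """One alternating sweep: centers from U (Lemma 1 analogue),
    then memberships from the new centers (Lemma 2 analogue)."""
    W = U ** 2
    Z = (W.T @ X) / W.sum(axis=0)[:, None]
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(axis=2)
    d2 = np.maximum(d2, 1e-12)
    inv = 1.0 / d2
    U = inv / inv.sum(axis=1, keepdims=True)
    return U, ((U ** 2) * d2).sum()

def objective_trace(X, K, iters=30, seed=0):
    """Run alternating optimization, recording J(U^(t), Z^(t)) per sweep."""
    rng = np.random.default_rng(seed)
    U = rng.random((X.shape[0], K))
    U /= U.sum(axis=1, keepdims=True)
    trace = []
    for _ in range(iters):
        U, J = fcm_sweep(X, U)
        trace.append(J)
    return trace
```

Each sweep solves both subproblems exactly, so every recorded value is no larger than the previous one, mirroring \(J(U^{(t+1)},Z^{(t+1)}) \le J(U^{(t)},Z^{(t)})\).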
Cite this article
ZareMoodi, P., Kamali Siahroudi, S. & Beigy, H. Concept-evolution detection in non-stationary data streams: a fuzzy clustering approach. Knowl Inf Syst 60, 1329–1352 (2019). https://doi.org/10.1007/s10115-018-1266-y