Abstract
To address the computational complexity of the update procedure in the fast incremental Gaussian mixture model (FIGMM) and the inefficiency of spurious-component removal in the incremental Gaussian mixture model (IGMM), this study proposes a novel algorithm, the modified incremental Gaussian mixture model (MIGMM), which improves on FIGMM, together with a novel adaptive methodology for removing spurious components in the MIGMM. The contributions of this study are twofold. First, a simpler and more efficient update of the prediction matrix, the core of the MIGMM update procedure, is proposed compared with that described in FIGMM. Second, an effective exponential model (\(p_{\mathrm {_{Thv}}}\)) related to the number of output components generated by MIGMM, combined with a Mahalanobis distance-based logical matrix (LM), is proposed to remove spurious components and identify the correct ones. Building on these contributions, comparative experiments on synthetic and real data sets show that the proposed framework removes spurious components more robustly than other well-known information criteria used to determine the number of components. The performance of MIGMM is further evaluated against other efficient unsupervised algorithms on both synthetic and real-world data sets.
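The abstract's idea of flagging spurious mixture components via a Mahalanobis distance-based logical matrix can be illustrated with a minimal sketch. The exact formulation (including the exponential threshold model \(p_{\mathrm {_{Thv}}}\)) is not reproduced here; the `dist_thresh` and `weight_thresh` parameters below are illustrative assumptions, not the paper's values.

```python
import numpy as np

def mahalanobis(x, mean, cov):
    # Mahalanobis distance of point x from a Gaussian with the given mean/covariance.
    d = x - mean
    return float(np.sqrt(d @ np.linalg.solve(cov, d)))

def spurious_mask(means, covs, weights, dist_thresh=2.0, weight_thresh=0.05):
    """Flag candidate spurious components of a Gaussian mixture.

    Builds a logical matrix LM where LM[i, j] is True when the mean of
    component j lies within `dist_thresh` Mahalanobis units of component i,
    i.e. the two components overlap heavily. A component is then marked
    spurious when it overlaps another component and carries a small mixing
    weight. (Illustrative only; the thresholds are hypothetical.)
    """
    k = len(means)
    LM = np.zeros((k, k), dtype=bool)
    for i in range(k):
        for j in range(k):
            if i != j:
                LM[i, j] = mahalanobis(means[j], means[i], covs[i]) < dist_thresh
    overlaps = LM.any(axis=0)  # component j overlaps some other component
    return overlaps & (np.asarray(weights) < weight_thresh)

# Toy mixture: two well-separated components plus one low-weight near-duplicate.
means = [np.array([0.0, 0.0]), np.array([5.0, 5.0]), np.array([0.2, 0.1])]
covs = [np.eye(2)] * 3
weights = np.array([0.55, 0.42, 0.03])
print(spurious_mask(means, covs, weights))  # only the third component is flagged
```

In this toy case the third component's mean is well within two Mahalanobis units of the first component and its mixing weight is small, so it alone is flagged for removal.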







Acknowledgements
This work was supported by the Key Projects of Hunan Provincial Department of Education (Nos. 21A0403 and 21A0405), the Hunan Provincial Natural Science Foundation of China (No. 2022JJ30282), the Key Laboratory of Hunan Province (No. 2019TP1014), and the university-industry collaborative project (No. 202102211006).
Author information
Contributions
SS: Conceptualization, Methodology, Data curation, Algorithm design, Software, Writing original draft, Writing review and editing. YT and BZ: Experimental analysis, Investigation, Algorithm validation, Writing review and editing. BY, LY, PH and HX: Data collection, Experimental analysis and editing.
Ethics declarations
Conflict of interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Sun, S., Tong, Y., Zhang, B. et al. A novel adaptive methodology for removing spurious components in a modified incremental Gaussian mixture model. Int. J. Mach. Learn. & Cyber. 14, 551–566 (2023). https://doi.org/10.1007/s13042-022-01649-w