A new fuzzy MLE-clustering approach based on object-to-group probabilistic distance measure: from anomaly detection to multi-fault classification in datacenter computational nodes

  • Original Research
Journal of Ambient Intelligence and Humanized Computing

Abstract

Datacenters are growing in size and complexity to the point where anomaly detection and infrastructure monitoring have become critical challenges. One strategy for improving the reliability of computational nodes in a datacenter is to identify cluster nodes or virtual machines that exhibit anomalous behavior. In this paper, we introduce a novel clustering approach that analyzes the behavior of cluster nodes running various workloads, based on resource usage details (CPU utilization, network events, etc.). The new technique aims to improve the efficiency of fuzzy clustering algorithms built on the maximum likelihood estimation (MLE) scheme. We propose using a recently developed object-to-group distance, because it assigns each object to the most appropriate group without computing distances between all pairs of objects. Experimental findings under realistic settings show that the proposed algorithm outperforms many similar algorithms frequently used for such tasks.


Data availability

The data that support the findings of this paper are available from the corresponding author, Saloua El Motaki, upon reasonable request.

References

  • Abdelsalam M, Krishnan R, Sandhu R (2017) Clustering-based iaas cloud monitoring. In: 2017 IEEE 10th International Conference on Cloud Computing (CLOUD), pages 672–679. https://doi.org/10.1109/CLOUD.2017.90

  • Anthony A, Benjamin A, Jim B, Ann G, Sophia L, Steve M, Jeff O, Mahesh R, Joel S (2015) Toward rapid understanding of production hpc applications and systems. In: 2015 IEEE International Conference on Cluster Computing, pages 464–473. https://doi.org/10.1109/CLUSTER.2015.71

  • Amruthnath N, Gupta T (2018) A research study on unsupervised machine learning algorithms for early fault detection in predictive maintenance. In: 2018 5th International Conference on Industrial Engineering and Applications (ICIEA), pages 355–361. https://doi.org/10.1109/IEA.2018.8387124

  • Bari MF, Boutaba R, Esteves R, Granville LZ, Podlesny M, Rabbani MG, Zhang Q, Zhani MF (2013) Data center network virtualization: a survey. IEEE Commun. Surv. Tutor. 15(2):909–928

  • Bashir M, Irfan A, Hassan U, Muhammad Y (2019) Failure prediction using machine learning in a virtualised hpc system and application. Clust. Comput. 22:471–485. https://doi.org/10.1007/s10586-019-02917-1 (ISSN 1573-7543)

  • Bhatele A, Mohror K, Langer SH, Isaacs KE (2013) There goes the neighborhood: performance degradation due to nearby jobs. In: SC ’13: Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis, pages 1–12. https://doi.org/10.1145/2503210.2503247

  • Bhattacharyya A (1946) On a measure of divergence between two multinomial populations. Sankhya: The Indian Journal of Statistics (1933-1960), 7(4):401–406, ISSN 00364452. URL http://www.jstor.org/stable/25047882

  • Bi J, Yuan H, Zhang LB, Zhang J (2019) Sgw-scn: an integrated machine learning approach for workload forecasting in geo-distributed cloud data centers. Information Sciences, 481:57–68, ISSN 0020-0255. https://doi.org/10.1016/j.ins.2018.12.027. URL https://www.sciencedirect.com/science/article/pii/S0020025518309642

  • Brandt J, Chen F, De Sapio V, Gentile A, Mayo J, Pébay P, Roe D, Thompson D, Wong M (2010) Quantifying effectiveness of failure prediction and response in hpc systems: methodology and example. In: 2010 International Conference on Dependable Systems and Networks Workshops (DSN-W), pages 2–7. https://doi.org/10.1109/DSNW.2010.5542629

  • Daradkeh T, Agarwal A, Zaman M, Goel N (2020) Dynamic k-means clustering of workload and cloud resource configuration for cloud elastic model. IEEE Access 8:219430–219446. https://doi.org/10.1109/ACCESS.2020.3042716

  • Egele M, Woo M, Chapman P, Brumley D (2014) Blanket execution: dynamic similarity testing for program binaries and components. In: 23rd USENIX Security Symposium (USENIX Security 14), pages 303–317

  • El Motaki S, Yahyaouy A, Gualous H, Sabor J (2019) Gath-geva clustering algorithm for high performance computing (hpc) monitoring. In: 2019 Third International Conference on Intelligent Computing in Data Sciences (ICDS), pages 1–6

  • El Motaki S, Yahyaouy A, Gualous H, Sabor J (2021) A new weighted fuzzy c-means clustering for workload monitoring in cloud datacenter platforms. Clust. Comput. 24(4):3367–3379. https://doi.org/10.1007/s10586-021-03331-2 (ISSN 1573-7543)

  • Gath I, Geva AB (1989) Unsupervised optimal fuzzy clustering. IEEE Trans. Pattern. Anal. Mach. Intell. 11(7):773–780

  • Genuer R, Poggi J-M, Tuleau-Malot C (2010) Variable selection using random forests. Pattern Recognit. Lett. 31(14):2225–2236

  • Gustafson D, Kessel W (1978) Fuzzy clustering with a fuzzy covariance matrix. 1978 IEEE Conference on Decision and Control including the 17th Symposium on Adaptive Processes, pages 761–766

  • Hubert L, Arabie P (1985) Comparing partitions. J. Classif. 2(1):193–218. https://doi.org/10.1007/BF01908075 (ISSN 1432-1343)

  • Hui Y (2018) A virtual machine anomaly detection system for cloud computing infrastructure. J. Supercomput. 21:6126–6134. https://doi.org/10.1007/s11227-018-2518-z

  • Ismaeel S, Miri A, Al-Khazraji A (2016) Energy-consumption clustering in cloud data centre. In: 2016 3rd MEC International Conference on Big Data and Smart City (ICBDSC), pages 1–6

  • Khan A, Yan X, Tao S, Anerousis N (2012) Workload characterization and prediction in the cloud: A multiple time series approach. In: 2012 IEEE Network Operations and Management Symposium, pages 1287–1294

  • Lorido-Botran T, Huerta S, Tomás L, Tordsson J, Sanz B (2017) An unsupervised approach to online noisy-neighbor detection in cloud data centers. Expert Systems with Applications, 89:188–204, ISSN 0957-4174. https://doi.org/10.1016/j.eswa.2017.07.038. https://www.sciencedirect.com/science/article/pii/S0957417417305158. Accessed 6 June 2022

  • Nasibov EN, Ulutagay G (2009) Robustness of density-based clustering methods with various neighborhood relations. Fuzzy Sets Syst. 160(24):3601–3615

  • Pandeeswari N, Kumar G (2016) Anomaly detection system in cloud environment using fuzzy clustering based ann. Mob. Netw. Appl. 21:494–505. https://doi.org/10.1007/s11036-015-0644-x

  • Rugwiro U, Chunhua G (2017) Customization of virtual machine allocation policy using k-means clustering algorithm to minimize power consumption in data centers. In: Proceedings of the Second International Conference on Internet of Things, Data and Cloud Computing, New York, NY, USA, Association for Computing Machinery. ISBN 9781450347747. https://doi.org/10.1145/3018896.3018947

  • Sammon JW (1969) A nonlinear mapping for data structure analysis. IEEE Trans. Comput. 100(5):401–409

  • Sauvanaud C, Kaâniche M, Kanoun K, Lazri K, Da Silva SG (2018) Anomaly detection and diagnosis for cloud services: Practical experiments and lessons learned. Journal of Systems and Software, 139:84–106, ISSN 0164-1212. https://doi.org/10.1016/j.jss.2018.01.039. https://www.sciencedirect.com/science/article/pii/S0164121218300256. Accessed 2 June 2022

  • Shirazi N, Simpson S, Marnerides AK, Watson M, Mauthe A, Hutchison D (2014) Assessing the impact of intra-cloud live migration on anomaly detection. In: 2014 IEEE 3rd International Conference on Cloud Networking (CloudNet), pages 52–57, https://doi.org/10.1109/CloudNet.2014.6968968

  • Snir M, Wisniewski R W, Abraham JA, Adve SV, Saurabh B, Pavan B, Jim B, Pradip B, Franck C, Bill C, Chien AA, Paul C, Debardeleben NA, Diniz PC, Christian E, Mattan E, Saverio F, Al G, Rinku G, Fred J, Sriram K, Sven L, Dean L, Subhasish M, Todd M, Rob S, Jon S, Eric Van H (2014) Addressing failures in exascale computing. Int. J. High Perform. Comput. Appl. 28(2):129–173. https://doi.org/10.1177/1094342014522573 (ISSN 1094-3420)

  • Tavakkol B, Jeong Myong K, Albin Susan L (2017) Object-to-group probabilistic distance measure for uncertain data classification. Neurocomputing 230:143–151. https://doi.org/10.1016/j.neucom.2016.12.007 (ISSN 0925-2312)

  • Tuncer O, Ates EC, Zhang Y, Turk A, Brandt JM, Leung VJ, Egele M, Coskun AK (2017) Diagnosing performance variations in hpc applications using machine learning. In: ISC. https://doi.org/10.1007/978-3-319-58667-0_19

  • Xiao X, Sun J, Yang J (2021) Operation and maintenance (O&M) for data center: an intelligent anomaly detection approach. Computer Communications, 178:141–152. ISSN 0140-3664. https://doi.org/10.1016/j.comcom.2021.06.030. https://www.sciencedirect.com/science/article/pii/S0140366421002541. Accessed 2 June 2022

  • Zhang X, Meng F, Chen P, Xu J (2016) Taskinsight: A fine-grained performance anomaly detection and problem locating system. In: 2016 IEEE 9th International Conference on Cloud Computing (CLOUD), pages 917–920. https://doi.org/10.1109/CLOUD.2016.0136

Acknowledgements

The experimental work was carried out on the HPC-MARWAN computing cluster of the Mohammed V University in Rabat, Morocco.

Funding

The authors of this paper have not received any financial support for research, authorship and/or publication of this article.

Author information

Contributions

Conceptualization: SEM; Formal analysis and implementation: SEM; Writing—original draft preparation: SEM; Writing—review and editing: SEM and BH; Supervision: AY.

Corresponding author

Correspondence to Saloua El Motaki.

Ethics declarations

Conflict of interest

The authors of this paper declare that they have no significant competing financial, professional, or personal interests that might have influenced the performance or presentation of this work.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix: p-Friedman test

The Friedman test is an extension of the Wilcoxon signed-rank test and the non-parametric equivalent of the one-factor analysis of variance with repeated measures (Hubert and Arabie 1985). Its null hypothesis is that k dependent samples come from the same population. Denoting the location parameter of sample i by \(M_{i}\), the null hypothesis \(H_{0}\) and the alternative hypothesis \(H_{a}\) are:

$$\begin{aligned}&H_{0} : M_{1}= M_{2}=\ldots = M_{k} \\&H_{a} : M_{i} \ne M_{j} \text { for at least one pair } \left( i,j \right) , \; i \ne j. \end{aligned}$$

Under the Friedman null hypothesis, the expected rank sum of each group equals \(\frac{n(k + 1)}{2}\), where n is the number of blocks (matched observations) and k is the number of groups. The Friedman test statistic is expressed as follows:

$$\begin{aligned} Q=\frac{12}{nk(k+1)}\sum _{j=1}^{k}\left\{ R_{j} -\frac{n(k+1)}{2} \right\} ^{2} \end{aligned} \tag{15}$$

where \(R_{j}\) is the sum of the ranks for sample j.
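As an illustration only, the statistic in Eq. (15) can be computed in a few lines of Python. This is a minimal sketch, not the authors' implementation: the function names `average_ranks` and `friedman_statistic` and the `scores` table are hypothetical, each row is assumed to be one block (for example, one dataset) and each column one compared algorithm, and ties receive average ranks without the tie correction that statistical packages usually apply.

```python
def average_ranks(row):
    """Rank the values of one block from 1 (smallest) to k, averaging ties."""
    order = sorted(range(len(row)), key=lambda idx: row[idx])
    ranks = [0.0] * len(row)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and row[order[end + 1]] == row[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # mean of the 1-based positions pos+1 .. end+1
        for idx in order[pos:end + 1]:
            ranks[idx] = shared
        pos = end + 1
    return ranks


def friedman_statistic(scores):
    """Return (Q, rank_sums) for an n-by-k table, following Eq. (15)."""
    n = len(scores)      # number of blocks
    k = len(scores[0])   # number of compared groups
    rank_sums = [0.0] * k
    for row in scores:
        for j, rank in enumerate(average_ranks(row)):
            rank_sums[j] += rank
    expected = n * (k + 1) / 2  # expected rank sum of each group under H0
    q = 12 / (n * k * (k + 1)) * sum((r - expected) ** 2 for r in rank_sums)
    return q, rank_sums


# Hypothetical scores: 4 blocks (rows) and 3 algorithms (columns).
scores = [
    [0.90, 0.80, 0.70],
    [0.85, 0.80, 0.60],
    [0.95, 0.70, 0.75],
    [0.90, 0.85, 0.80],
]
q, rank_sums = friedman_statistic(scores)
print(rank_sums)  # [12.0, 7.0, 5.0]
print(q)          # 6.5
```

Here n = 4 and k = 3, so the expected rank sum is 8 and Q = (12/48)(16 + 1 + 9) = 6.5. For sufficiently large n, Q is commonly compared against a chi-square distribution with k − 1 degrees of freedom to obtain a p-value; that step is omitted from the sketch.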

Cite this article

El Motaki, S., Hirchoua, B. & Yahyaouy, A. A new fuzzy MLE-clustering approach based on object-to-group probabilistic distance measure: from anomaly detection to multi-fault classification in datacenter computational nodes. J Ambient Intell Human Comput 14, 12697–12708 (2023). https://doi.org/10.1007/s12652-022-04205-0

  • DOI: https://doi.org/10.1007/s12652-022-04205-0
