Metrics for machine learning evaluation methods in cloud monitoring systems

Published: 20 September 2022

ABSTRACT

During machine learning pipeline development, engineers need to validate the efficiency of the machine learning methods in order to assess the quality of the resulting forecasts.

Because machine learning models and methods are now widely deployed across monitoring systems, assessing these methods within monitoring systems has become a pressing scientific problem. This research concludes that the current standard metrics are not sufficient to accurately assess the machine learning methods in use.

This research provides a new composite rating for anomaly detection, oriented toward the use cases of cloud monitoring systems. The main difference from standard metrics is that the new approach accounts for integration with business processes, the resources a method demands, and a critical view of false-positive alerts. The new approach can be used for model assessment in monitoring systems with similar requirements:

Cost-effective use of computing resources

A low rate of false positives

Fast detection of anomalies
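The abstract does not state the rating's exact formula. As an illustration only, a composite score balancing the three requirements above could be sketched in Python; the function name, weights, and budget parameters below are all hypothetical, not the paper's actual method:

```python
def composite_anomaly_score(precision, recall, mean_delay_s, cpu_seconds,
                            max_delay_s=60.0, cpu_budget_s=10.0,
                            weights=(0.4, 0.3, 0.3)):
    """Combine detection quality, detection speed, and resource cost
    into a single rating in [0, 1] (illustrative weights)."""
    w_quality, w_speed, w_cost = weights
    # Quality: F1 penalizes both false positives (precision) and misses (recall).
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    # Speed: 1.0 for instant detection, 0.0 at or beyond the acceptable delay.
    speed = max(0.0, 1.0 - mean_delay_s / max_delay_s)
    # Cost-effectiveness: 1.0 when well under the CPU budget, 0.0 at or beyond it.
    cost = max(0.0, 1.0 - cpu_seconds / cpu_budget_s)
    return w_quality * f1 + w_speed * speed + w_cost * cost
```

A detector that is accurate but slow, or accurate but resource-hungry, scores lower than one that satisfies all three requirements, which is the behavior the composite rating is described as capturing.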

Furthermore, this research proposes new methods of computing-capacity planning for different anomaly detection methods. These methods are not limited to anomaly detection and could serve as a basis for developing capacity planning for other machine learning techniques and approaches.
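The capacity-planning methods themselves are not detailed in the abstract. Assuming they rest on measuring each detector's time and memory footprint, a minimal profiling harness (all names hypothetical) might look like:

```python
import time
import tracemalloc

def profile_detector(fit_fn, predict_fn, X_train, X_test):
    """Measure wall-clock time and peak memory of a detector's
    training and inference, as raw inputs to capacity planning."""
    tracemalloc.start()  # track peak memory across fit and predict
    t0 = time.perf_counter()
    model = fit_fn(X_train)                  # training cost
    fit_s = time.perf_counter() - t0
    t0 = time.perf_counter()
    predict_fn(model, X_test)                # inference cost
    predict_s = time.perf_counter() - t0
    _, peak_bytes = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return {"fit_s": fit_s, "predict_s": predict_s,
            "peak_mib": peak_bytes / 2**20}
```

Running such a harness over each candidate detector on a representative workload yields per-method time and memory profiles from which cluster capacity can be extrapolated.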

CCS Concepts: · Applied computing → Operations research → Forecasting · Computer systems organization → Architectures → Distributed architectures → Cloud computing → Forecasting · Computing methodologies → Machine learning

Published in

ICCTA '22: Proceedings of the 2022 8th International Conference on Computer Technology Applications
May 2022, 286 pages
ISBN: 9781450396226
DOI: 10.1145/3543712

      Copyright © 2022 ACM


Publisher: Association for Computing Machinery, New York, NY, United States

