Abstract
From cybersecurity to life sciences, anomaly detection is considered crucial because it often surfaces semantically relevant information that can help prevent or detect events such as cyber attacks or patients' heart attacks. Although anomaly detection is a prominent research area, it still faces several challenges, notably the evaluation of results on real-world datasets that are unlabelled and imbalanced. This work contributes to understanding and comparing the behaviour of different evaluation metrics: classic metrics based on positive and negative rates, and density-based metrics that require no class information. We experiment with five state-of-the-art anomaly detection approaches on two datasets with contrasting characteristics in dimensionality or contamination. Each metric's ability to give trustworthy results is analysed with respect to the properties of different datasets or approaches, focusing on the possibility of evaluating real-world unsupervised learning models using density metrics.
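To make the evaluation challenge concrete, the following minimal sketch computes a few classic rate-based metrics on a small, heavily imbalanced toy example. The abstract does not say which rate-based metrics the chapter uses, so the choice of true positive rate, false positive rate, precision and accuracy is an assumption, as are the function name `rate_metrics`, the two hypothetical detectors and the synthetic labels.

```python
def rate_metrics(y_true, y_pred):
    """Return rate-based metrics for binary labels, where 1 marks an anomaly."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def safe_div(num, den):
        # Report 0.0 instead of failing when a rate is undefined.
        return num / den if den else 0.0

    return {
        "tpr": safe_div(tp, tp + fn),        # recall on the anomaly class
        "fpr": safe_div(fp, fp + tn),        # share of normal points flagged
        "precision": safe_div(tp, tp + fp),
        "accuracy": safe_div(tp + tn, len(pairs)),
    }


# Synthetic labels (assumption): 1000 points, 10 anomalies, i.e. 1% contamination.
y_true = [1] * 10 + [0] * 990

# Hypothetical detector A: never flags anything.
y_none = [0] * 1000

# Hypothetical detector B: finds 8 of the 10 anomalies, with 20 false alarms.
y_some = [1] * 8 + [0] * 2 + [1] * 20 + [0] * 970

for name, pred in [("flags nothing", y_none), ("flags some", y_some)]:
    print(name, rate_metrics(y_true, pred))
```

In this toy setting the detector that flags nothing scores a higher accuracy than the one that recovers most anomalies, which illustrates why imbalance complicates evaluation. All of these rates also need ground-truth labels, which unlabelled real-world data does not provide.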
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Maia, R., Antunes, C. (2021). Density-Based Evaluation Metrics in Unsupervised Anomaly Detection Contexts. In: Torra, V., Narukawa, Y. (eds) Modeling Decisions for Artificial Intelligence. MDAI 2021. Lecture Notes in Computer Science, vol 12898. Springer, Cham. https://doi.org/10.1007/978-3-030-85529-1_25
DOI: https://doi.org/10.1007/978-3-030-85529-1_25
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-85528-4
Online ISBN: 978-3-030-85529-1
eBook Packages: Computer Science (R0)