Revisiting Histogram Based Outlier Scores: Strengths and Weaknesses

Aguilera-Martos, Ignacio; Luengo, Julián; Herrera, Francisco

doi:10.1007/978-3-031-40725-3_4

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 14001)

Included in the following conference series:

International Conference on Hybrid Artificial Intelligence Systems

Abstract

Anomaly detection is a crucial task in domains such as finance, cybersecurity, and medical diagnosis. The demand for interpretable and explainable model decisions has revived interest in traceable models, and Histogram Based Outlier Scores (HBOS), a well-known unsupervised anomaly detection algorithm, is a notable option because it is fast and performs well. Despite its popularity, HBOS has several limitations: it cannot update its internal knowledge, it cannot model complex distributions, and it does not consider relations between features. This work provides a comprehensive analysis of the current state of the algorithm and its limitations. We compare HBOS with other state-of-the-art anomaly detection algorithms to identify its strengths and weaknesses. Our study shows that while HBOS is efficient and computationally inexpensive, it may not be the best option when the underlying data distribution is complex or when relations between variables play a significant role. The alternatives and extensions to HBOS presented here provide valuable insights for the development of future anomaly detection methods.
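To make the limitation about feature relations concrete, here is a minimal sketch of a histogram-based outlier score in plain Python. It is not the authors' implementation. It assumes equal-width bins, bin heights normalised so that the tallest bin is 1, and a score equal to the sum over features of log(1 / height). The function names, the bin count, and the `eps` floor for empty bins are illustrative choices.

```python
import math
from typing import List, Optional, Sequence, Tuple

# (low edge, high edge, bin width, normalised bin heights) for one feature
Histogram = Tuple[float, float, float, List[float]]


def bin_index(value: float, lo: float, hi: float, width: float, n_bins: int) -> Optional[int]:
    """Return the bin that holds `value`, or None if it lies outside [lo, hi]."""
    if value < lo or value > hi:
        return None
    return min(int((value - lo) / width), n_bins - 1)


def fit_histograms(data: Sequence[Sequence[float]], n_bins: int = 10) -> List[Histogram]:
    """Build one equal-width histogram per feature, treating features independently."""
    models: List[Histogram] = []
    for j in range(len(data[0])):
        column = [row[j] for row in data]
        lo, hi = min(column), max(column)
        width = (hi - lo) / n_bins or 1.0  # constant feature: fall back to width 1
        counts = [0] * n_bins
        for v in column:
            counts[bin_index(v, lo, hi, width, n_bins)] += 1
        peak = max(counts)
        # Normalise so the tallest bin has height 1 (assumed convention).
        models.append((lo, hi, width, [c / peak for c in counts]))
    return models


def hbos_score(point: Sequence[float], models: Sequence[Histogram], eps: float = 1e-9) -> float:
    """Sum log(1 / height) over features; a higher score means more anomalous."""
    score = 0.0
    for value, (lo, hi, width, heights) in zip(point, models):
        idx = bin_index(value, lo, hi, width, len(heights))
        # Values outside the training range are treated as zero density.
        height = heights[idx] if idx is not None else 0.0
        score += math.log(1.0 / max(height, eps))
    return score


# Toy data: two perfectly correlated features lying on the diagonal x == y.
data = [(float(i), float(i)) for i in range(100)]
models = fit_histograms(data)

inlier = hbos_score((50.0, 50.0), models)        # on the diagonal
off_diagonal = hbos_score((5.0, 95.0), models)   # breaks the correlation
out_of_range = hbos_score((150.0, 50.0), models) # extreme in one feature
```

In this toy setup the off-diagonal point receives the same score as the inlier, because each of its coordinates is common when viewed on its own. Only the point that is extreme in a single feature scores higher. This is the per-feature independence that the abstract identifies as a weakness when relations between variables matter.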

Acknowledgment

This work has been supported by the Ministry of Science and Technology of Spain under project PID2020-119478GB-I00 and the project TED2021-132702B-C21 from the Ministry of Science and Innovation of Spain. I. Aguilera-Martos was supported by the Spanish Ministry of Science under the FPI programme PRE2021-100169.

Author information

Authors and Affiliations

Department of Computer Science and Artificial Intelligence, Andalusian Institute of Data Science and Computational Intelligence (DaSCI), University of Granada, Granada, Spain
Ignacio Aguilera-Martos, Julián Luengo & Francisco Herrera

Corresponding author

Correspondence to Ignacio Aguilera-Martos.

Editor information

Editors and Affiliations

University of Deusto, Bilbao, Spain
Pablo García Bringas
University of Leon, León, Spain
Hilde Pérez García
University of La Rioja, Logroño, La Rioja, Spain
Francisco Javier Martínez de Pisón
Pablo de Olavide University, Seville, Spain
Francisco Martínez Álvarez
Pablo de Olavide University, Seville, Spain
Alicia Troncoso Lora
University of Burgos, Burgos, Spain
Álvaro Herrero
University of A Coruña, Ferrol - Coruña, Spain
José Luis Calvo Rolle
University of A Coruña, Ferrol - Coruña, Spain
Héctor Quintián
University of Salamanca, Salamanca, Spain
Emilio Corchado

Cite this paper

Aguilera-Martos, I., Luengo, J., Herrera, F. (2023). Revisiting Histogram Based Outlier Scores: Strengths and Weaknesses. In: García Bringas, P., et al. Hybrid Artificial Intelligent Systems. HAIS 2023. Lecture Notes in Computer Science, vol 14001. Springer, Cham. https://doi.org/10.1007/978-3-031-40725-3_4

DOI: https://doi.org/10.1007/978-3-031-40725-3_4
Published: 29 August 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-40724-6
Online ISBN: 978-3-031-40725-3
eBook Packages: Computer Science, Computer Science (R0)
