Anomaly Detection Using Agglomerative Hierarchical Clustering Algorithm

  • Conference paper
Information Science and Applications 2018 (ICISA 2018)

Part of the book series: Lecture Notes in Electrical Engineering (LNEE, volume 514)

Abstract

Intrusion detection is an increasingly active research topic in information security. Intrusion detection techniques fall into two main classes: anomaly detection and signature recognition. Anomaly detection techniques are gaining popularity among researchers, and new techniques and algorithms are being developed continually. However, no technique has been found to be perfect. Clustering is an important data mining technique used to find patterns and data distributions in datasets. It is primarily used to identify the dense and sparse regions of a dataset, and the sparse regions are often treated as outliers. Several clustering algorithms have been developed to date, including K-means, K-medoids, CLARA, CLARANS, DBSCAN, ROCK, BIRCH, and CACTUS. Clustering techniques have been used successfully to detect anomalies in datasets and have proved useful in the design of a couple of anomaly-based Intrusion Detection Systems (IDS). However, most of the clustering techniques used for this purpose take a partitioning approach. In this article, we propose a different clustering algorithm for anomaly detection on network datasets. Our algorithm is an agglomerative hierarchical clustering algorithm that discovers outliers in hybrid datasets with both numeric and categorical attributes. For this purpose, we define a suitable similarity measure over the numeric and categorical attributes found in network datasets.
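The abstract does not give the similarity measure or the merge rule, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes a Gower-style distance (range-scaled difference for numeric fields, 0/1 mismatch for categorical fields), average linkage, a merge cut-off `threshold`, and a `min_size` below which a cluster is treated as an outlier. All function names, parameters, and the sample records are hypothetical.

```python
from itertools import combinations


def mixed_distance(a, b, numeric_ranges):
    """Assumed Gower-style distance: range-scaled absolute difference for
    numeric fields, 0/1 mismatch for categorical fields, averaged."""
    total = 0.0
    for i, (x, y) in enumerate(zip(a, b)):
        if i in numeric_ranges:
            span = numeric_ranges[i]
            total += abs(x - y) / span if span else 0.0
        else:
            total += 0.0 if x == y else 1.0
    return total / len(a)


def agglomerate(records, numeric_idx, threshold):
    """Average-linkage agglomerative clustering that stops merging once the
    closest pair of clusters is farther apart than `threshold`."""
    ranges = {}
    for i in numeric_idx:
        column = [r[i] for r in records]
        ranges[i] = max(column) - min(column)

    # Start with every record in its own cluster (lists of record indices).
    clusters = [[k] for k in range(len(records))]

    def linkage(c1, c2):
        pair_sum = sum(mixed_distance(records[p], records[q], ranges)
                       for p in c1 for q in c2)
        return pair_sum / (len(c1) * len(c2))

    while len(clusters) > 1:
        (i, j), d = min(
            (((i, j), linkage(clusters[i], clusters[j]))
             for i, j in combinations(range(len(clusters)), 2)),
            key=lambda item: item[1])
        if d > threshold:
            break
        # combinations() guarantees i < j, so deleting j keeps i valid.
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


def find_outliers(records, numeric_idx, threshold, min_size):
    """Indices of records that end up in clusters smaller than `min_size`."""
    clusters = agglomerate(records, numeric_idx, threshold)
    return sorted(k for c in clusters if len(c) < min_size for k in c)


# Hypothetical connection records: (duration, bytes, protocol, flag).
records = [
    (1.0, 200, "tcp", "SF"),
    (1.2, 220, "tcp", "SF"),
    (0.9, 210, "tcp", "SF"),
    (1.1, 190, "tcp", "SF"),
    (30.0, 9000, "icmp", "REJ"),
]
outliers = find_outliers(records, numeric_idx=[0, 1], threshold=0.3, min_size=2)
print(outliers)  # [4]
```

In this toy data the four similar TCP records merge into one cluster, while the ICMP record stays in a singleton cluster and is flagged. The pairwise search makes the sketch cubic or worse in the number of records, so it is suitable for illustration only.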

Corresponding author

Correspondence to Fokrul Alom Mazarbhuiya.

Copyright information

© 2019 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Mazarbhuiya, F.A., AlZahrani, M.Y., Georgieva, L. (2019). Anomaly Detection Using Agglomerative Hierarchical Clustering Algorithm. In: Kim, K., Baek, N. (eds) Information Science and Applications 2018. ICISA 2018. Lecture Notes in Electrical Engineering, vol 514. Springer, Singapore. https://doi.org/10.1007/978-981-13-1056-0_48

  • DOI: https://doi.org/10.1007/978-981-13-1056-0_48

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-13-1055-3

  • Online ISBN: 978-981-13-1056-0

  • eBook Packages: Engineering, Engineering (R0)
