CID: a novel clustering-based database intrusion detection algorithm

  • Original Research
  • Journal of Ambient Intelligence and Humanized Computing

Abstract

As data volumes grow, attacks against databases are also rising, which makes information security and confidentiality a critical challenge. One promising defense against malicious attacks is the intrusion detection system. In this paper, the anomaly detection concept is used to propose a method for distinguishing between normal and abnormal activities. For this purpose, a new density-based clustering intrusion detection (CID) method is proposed, which clusters queries based on a similarity measure and labels them as normal or intrusion. The experiments are conducted on two standard datasets, TPC-C and TPC-E. The results show that the proposed model outperforms state-of-the-art baseline algorithms in terms of FN, FP, precision, recall and F-score.
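To make the idea of clustering queries by similarity and labeling them concrete, the following is a minimal sketch and not the CID algorithm itself. The abstract does not specify the similarity measure, the clustering procedure, or the labeling rule, so the sketch assumes a normalized Levenshtein distance between raw query strings, a plain DBSCAN-style density clustering, and the rule that queries left as noise are labeled as intrusions. The names `distance`, `dbscan`, `label_queries`, and the parameters `eps` and `min_pts` are all illustrative.

```python
from typing import List

NOISE = -1  # cluster id used for queries that fall in no dense region


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def distance(q1: str, q2: str) -> float:
    """Assumed dissimilarity: edit distance scaled to [0, 1] by the longer query."""
    longest = max(len(q1), len(q2))
    return edit_distance(q1, q2) / longest if longest else 0.0


def dbscan(queries: List[str], eps: float, min_pts: int) -> List[int]:
    """Plain DBSCAN over queries; returns one cluster id per query, NOISE for outliers."""
    n = len(queries)
    # Each query counts as its own neighbor, as in the usual DBSCAN definition.
    neighbors = [
        [j for j in range(n) if distance(queries[i], queries[j]) <= eps]
        for i in range(n)
    ]
    labels: List = [None] * n
    cluster = 0
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        labels[i] = cluster
        frontier = list(neighbors[i])
        while frontier:
            j = frontier.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: reachable but not dense
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbors[j]) >= min_pts:
                frontier.extend(neighbors[j])  # core point: keep expanding
        cluster += 1
    return labels


def label_queries(queries: List[str], eps: float = 0.3, min_pts: int = 3) -> List[str]:
    """Assumed labeling rule: clustered queries are normal, noise is intrusion."""
    return ["intrusion" if c == NOISE else "normal" for c in dbscan(queries, eps, min_pts)]


sample = [
    "SELECT name FROM customer WHERE id = 1",
    "SELECT name FROM customer WHERE id = 2",
    "SELECT name FROM customer WHERE id = 3",
    "SELECT name FROM customer WHERE id = 4",
    "DROP TABLE customer",
]
labels = label_queries(sample)
```

With these toy inputs the four near-identical `SELECT` statements form one dense cluster and the `DROP TABLE` statement is left as noise. In practice the thresholds `eps` and `min_pts` would need tuning on the workload, and a measure that works on parsed query structure rather than raw strings may separate queries better.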

Notes

  1. Available at: https://www.tpc.org.

  2. TPC, TPC Benchmark C, Standard Specification, Ver. 5.1, available at: https://www.tpc.org/tpcc, July 11, 2016.

  3. Transaction Processing Performance Council (TPC): TPC benchmark E, Standard specification, Version 1.13.0, 2014.

Corresponding author

Correspondence to Mohamad Reza Keyvanpour.

About this article

Cite this article

Keyvanpour, M.R., Barani Shirzad, M. & Mehmandoost, S. CID: a novel clustering-based database intrusion detection algorithm. J Ambient Intell Human Comput 12, 1601–1612 (2021). https://doi.org/10.1007/s12652-020-02231-4

  • DOI: https://doi.org/10.1007/s12652-020-02231-4
