Abstract
As data volumes grow, attacks against databases are also rising, making information security and confidentiality a critical challenge. One promising defense against malicious attacks is the intrusion detection system. This paper uses anomaly detection to distinguish between normal and abnormal activities. To this end, it proposes a new density-based clustering intrusion detection (CID) method, which clusters queries according to a similarity measure and labels them as normal or as intrusions. Experiments are conducted on two standard datasets, TPC-C and TPC-E. The results show that the proposed model outperforms state-of-the-art baseline algorithms in terms of false negatives (FN), false positives (FP), precision, recall, and F-score.
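The general idea of clustering queries by similarity and flagging the ones that fall outside any dense region can be sketched as follows. This is only an illustration under stated assumptions, not the CID algorithm itself: the abstract does not specify the similarity measure or the clustering parameters, so the sketch substitutes a normalised Levenshtein distance and a textbook DBSCAN, and it assumes that noise points are the ones labelled as intrusions. The names `eps`, `min_pts`, `label_queries`, `scores`, and the sample queries are all hypothetical.

```python
from typing import List, Optional, Tuple

NOISE = -1  # cluster id given to queries that belong to no dense region


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def distance(q1: str, q2: str) -> float:
    """Edit distance scaled to [0, 1]; 0 means the queries are identical."""
    longest = max(len(q1), len(q2))
    return levenshtein(q1, q2) / longest if longest else 0.0


def dbscan(queries: List[str], eps: float, min_pts: int) -> List[int]:
    """Textbook DBSCAN over query strings (assumed stand-in for CID)."""
    n = len(queries)
    # Each point counts as its own neighbour, as in the original DBSCAN.
    neighbours = [
        [j for j in range(n) if distance(queries[i], queries[j]) <= eps]
        for i in range(n)
    ]
    labels: List[Optional[int]] = [None] * n
    cluster = 0
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbours[i]) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        labels[i] = cluster
        frontier = list(neighbours[i])
        while frontier:
            j = frontier.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbours[j]) >= min_pts:
                frontier.extend(neighbours[j])  # core point: keep growing
        cluster += 1
    return [NOISE if lab is None else lab for lab in labels]


def label_queries(queries: List[str], eps: float, min_pts: int) -> List[str]:
    """Assumption: queries left as noise are reported as intrusions."""
    return ["intrusion" if c == NOISE else "normal"
            for c in dbscan(queries, eps, min_pts)]


def scores(predicted: List[str], truth: List[str]) -> Tuple[float, float, float]:
    """Precision, recall and F-score with 'intrusion' as the positive class."""
    tp = sum(p == t == "intrusion" for p, t in zip(predicted, truth))
    fp = sum(p == "intrusion" and t == "normal" for p, t in zip(predicted, truth))
    fn = sum(p == "normal" and t == "intrusion" for p, t in zip(predicted, truth))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f


# Hypothetical sample: four near-identical lookups and one unrelated statement.
sample = [f"SELECT c_balance FROM customer WHERE c_id = {k}" for k in (1, 2, 3, 4)]
sample.append("DROP TABLE customer")
predicted = label_queries(sample, eps=0.2, min_pts=3)
print(predicted)
```

With these hypothetical settings the four lookups form one cluster and the unrelated statement is left as noise. The pairwise distance computation is quadratic in the number of queries, so a real workload would need indexing or sampling.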
Notes
Available at: https://www.tpc.org.
TPC, TPC Benchmark C, Standard Specification, Ver. 5.1, available at: https://www.tpc.org/tpcc, July 11, 2016.
Transaction Processing Performance Council (TPC): TPC benchmark E, Standard specification, Version 1.13.0, 2014.
About this article
Cite this article
Keyvanpour, M.R., Barani Shirzad, M. & Mehmandoost, S. CID: a novel clustering-based database intrusion detection algorithm. J Ambient Intell Human Comput 12, 1601–1612 (2021). https://doi.org/10.1007/s12652-020-02231-4
DOI: https://doi.org/10.1007/s12652-020-02231-4