Evaluation of an adaptive genetic-based signature extraction system for network intrusion detection

  • Theoretical Advances
Pattern Analysis and Applications

Abstract

Machine learning techniques are frequently applied to intrusion detection problems in various ways, such as classifying normal and intrusive activities or mining interesting intrusion patterns. Self-learning rule-based systems can relieve domain experts of the difficult task of hand-crafting signatures, in addition to providing intrusion classification capabilities. To this end, a genetic-based signature learning system has been developed that can adaptively and dynamically learn signatures of both normal and intrusive activities from network traffic. In this paper, we extend the evaluation of our systems to real-time network traffic captured from a university departmental server. A methodology is developed to build a fully labelled intrusion detection data set by mixing real background traffic with attacks simulated in a controlled environment. Tools are developed to pre-process the raw network data into a feature vector format suitable for a supervised learning classifier system and other related machine learning systems. The signature extraction system is then applied to this data set and the results are discussed. We show that even simple feature sets can help detect payload-based attacks.

Notes

  1. The original Mucus code was written in 2004 and did not support most new Snort keywords. We used an updated version hosted under the Bleeding Threat project [9].

  2. The data set will be made available online later.

  3. Note that UCSSE was run on a much faster machine than the preprocessing tool. The preprocessing time would be reduced further on a faster machine.

References

  1. Almgren M, Jonsson E (2004) Using active learning in intrusion detection. In: Proceedings of the 17th IEEE computer security foundations workshop (CSFW’04). IEEE Computer Society, New Jersey, pp 88–98

  2. Antonatos S, Anagnostakis KG, Markatos EP (2004) Generating realistic workloads for network intrusion detection systems. ACM SIGSOFT Softw Eng Notes 29(1):207–215

  3. Barisani A (2003) Testing firewalls and IDS with FTester. TISC Insight Newslett 5(6):2–4

  4. Bernadó-Mansilla E, Garrell JM (2003) Accuracy-based learning classifier systems: models, analysis and applications to classification tasks. Evol Comput 11(3):209–238

  5. Dixon PW, Corne DW, Oates MJ (2003) A ruleset reduction algorithm for the XCS learning classifier system. In: Proceedings of the 5th international workshop on learning classifier systems, Revised Papers. Springer, Berlin, pp 20–29

  6. Filippone M, Camastra F, Masulli F, Rovetta S (2008) A survey of kernel and spectral methods for clustering. Pattern Recogn 41(1):176–190

  7. Geschke D (2004) FLoP—Fast logging project for Snort. http://www.geschke-online.de/FLoP/

  8. Goldberg DE (1989) Genetic algorithms in search, optimization, and machine learning. Addison-Wesley Publishing Company, Inc., Boston

  9. Gregory J (2005) Mucus—traffic generator for IDS simulation. http://www.bleedingthreats.net/

  10. Hettich S, Bay SD (1999) The UCI KDD archive. http://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html

  11. Holland JH, Booker LB, Colombetti M, Dorigo M, Goldberg DE, Forrest S, Riolo RL, Smith RE, Lanzi PL, Stolzmann W et al (2000) What is a learning classifier system. Learn Classif Syst Found Appl 1813:3–32

  12. Hwang K, Cai M, Chen Y, Qin M (2007) Hybrid intrusion detection with weighted signature generation over anomalous internet episodes. IEEE Trans Dependable Secure Comput 4(1):41–55

  13. Jin S, Yeung DS, Wang X (2007) Network intrusion detection in covariance feature space. Pattern Recogn 40(8):2185–2197

  14. Jung J, Paxson V, Berger AW, Balakrishnan H (2004) Fast portscan detection using sequential hypothesis testing. In: Proceedings of the 2004 IEEE symposium on security and privacy, pp 211–225

  15. Lee W, Stolfo SJ, Mok KW (1999) A data mining framework for building intrusion detection models. IEEE Symp Secur Priv 7:120–132

  16. Lippmann RP, Zissman MA (1998) 1998 DARPA/AFRL off-line intrusion detection evaluation. http://www.ll.mit.edu/IST/ideval/data/data_index.html

  17. Liu Y, Chen K, Liao X, Zhang W (2004) A genetic clustering method for intrusion detection. Pattern Recogn 37(5):927–942

  18. Luo S, Marin GA (2004) Generating realistic network traffic for security experiments. In: Proceedings of the IEEE SoutheastCon, pp 200–207

  19. Mahoney MV, Chan PK (2003) Learning rules for anomaly detection of hostile network traffic. In: Proceedings of the third IEEE international conference on data mining (ICDM 2003), pp 601–604

  20. Mahoney MV (2003) A machine learning approach to detecting attacks by identifying anomalies in network traffic. PhD thesis, Florida Institute of Technology

  21. Mahoney MV, Chan PK (2003) An analysis of the 1999 DARPA/Lincoln laboratory evaluation data for network anomaly detection. In: Proceedings of recent advances in intrusion detection (RAID) 2003. Springer, Berlin, pp 220–237

  22. Massicotte F, Gagnon F, Labiche Y, Briand L, Couture M (2006) Automatic evaluation of intrusion detection systems. In: 22nd annual computer security applications conference, 2006, pp 361–370

  23. McHugh J (2000) Testing intrusion detection systems: a critique of the 1998 and 1999 DARPA intrusion detection system evaluations as performed by Lincoln Laboratory. ACM Trans Inf Syst Secur 3(4):262–294

  24. Mutz D, Vigna G, Kemmerer R (2003) An experience developing an IDS stimulator for the black-box testing of network intrusion detection systems. In: Proceedings of the 19th annual computer security applications conference, pp 374–383

  25. Ramesh A, Mahesh JV (2001) PNrule: a new framework for learning classifier models in data mining (a case-study in network intrusion detection). In: Proceedings of the first SIAM international conference on data mining, Chicago, IL, USA, 5–7 April, 2001

  26. Roesch M (1999) Snort-lightweight intrusion detection for networks. In: Proceedings of USENIX LISA, pp 229–238. http://www.snort.org/

  27. Sabhnani M, Serpen G (2003) Application of machine learning algorithms to KDD intrusion detection dataset within misuse detection context. In: Proceedings of international conference on machine learning: models, technologies, and applications, pp 23–26

  28. Sabhnani M, Serpen G (2004) Why machine learning algorithms fail in misuse detection on KDD intrusion detection data set. Intell Data Anal 8(4):403–415

  29. Shafi K (2008) An online and adaptive signature-based approach for intrusion detection using learning classifier systems. PhD thesis, University of New South Wales, Australian Defence Force Academy, School of Information Technology and Electrical Engineering

  30. Shafi K, Abbass HA (2009) An adaptive genetic-based signature learning system for intrusion detection. Expert Syst Appl 36(10):12036–12043

  31. Shafi K, Abbass HA, Zhu W (2007) Real time signature extraction from a supervised classifier system. In: Proceedings of the IEEE congress on evolutionary computation, CEC 2007, 25–28 September, 2007, pp 2509–2516

  32. Snort. The open source network intrusion detection system. http://www.snort.org/

  33. Sommers J, Yegneswaran V, Barford P (2005) Toward comprehensive traffic generation for online IDS evaluation. Technical report, Department of Computer Science, University of Wisconsin

  34. Stolfo SJ, Fan W, Lee W, Prodromidis A, Chan PK (2000) Cost-based modeling and evaluation for data mining with application to fraud and intrusion detection: results from the JAM Project. In: Proceedings of DARPA information survivability conference, pp 130–144

  35. Team MD (2006) The Metasploit Project. http://www.metasploit.com/

  36. TeleGeography (2008) TeleGeography’s global internet geography. http://www.telegeography.com/products/gig/index.php

  37. Turner A, Bing M (2005) TCPReplay: PCAP editing and replay tools for *nix. http://tcpreplay.sourceforge.net

  38. Wang K, Stolfo SJ (2004) Anomalous payload-based network intrusion detection. Proc Recent Adv Intrusion Detect 7:201–222

  39. Wilson SW (1995) Classifier fitness based on accuracy. Evol Comput 3(2):149–175

  40. Wilson SW (2001) Compact rulesets from XCSI. In: Proceedings of the 4th international workshop on advances in learning classifier systems: Revised Papers. Springer, Berlin, pp 197–210

  41. Witten IH, Frank E (2000) Data mining: practical machine learning tools and techniques with Java implementations. Morgan Kaufmann, San Francisco

Acknowledgments

This work is funded by the University College Postgraduate Research Scholarship (UCPRS). Most of these experiments were run on the Australian Center for Advanced Computing (AC3) supercomputing facilities.

Author information

Corresponding author

Correspondence to Kamran Shafi.

About this article

Cite this article

Shafi, K., Abbass, H.A. Evaluation of an adaptive genetic-based signature extraction system for network intrusion detection. Pattern Anal Applic 16, 549–566 (2013). https://doi.org/10.1007/s10044-011-0255-5

  • DOI: https://doi.org/10.1007/s10044-011-0255-5
