Semi-supervised machine learning approach for DDoS detection

Applied Intelligence

Abstract

Even though advanced Machine Learning (ML) techniques have been adopted for DDoS detection, the attack remains a major threat to the Internet. Most existing ML-based DDoS detection approaches fall into two categories: supervised and unsupervised. Supervised ML approaches rely on the availability of labeled network traffic datasets, whereas unsupervised ML approaches detect attacks by analyzing the incoming network traffic. Both are challenged by large volumes of network traffic data, low detection accuracy and high false positive rates. In this paper we present an online sequential semi-supervised ML approach for DDoS detection based on network entropy estimation, co-clustering, information gain ratio and the Extra-Trees algorithm. The unsupervised part of the approach filters out normal traffic data that is irrelevant for DDoS detection, which reduces false positive rates and increases accuracy. The supervised part further reduces the false positive rates of the unsupervised part and accurately classifies the DDoS traffic. Various experiments were performed to evaluate the proposed approach using three public datasets, namely NSL-KDD, UNB ISCX 12 and UNSW-NB15. Accuracies of 98.23%, 99.88% and 93.71% are achieved on the NSL-KDD, UNB ISCX 12 and UNSW-NB15 datasets respectively, with false positive rates of 0.33%, 0.35% and 0.46%.
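The abstract names entropy estimation as the first stage and reports results as accuracy and false positive rate. The sketch below illustrates those two ideas only; it is not the authors' implementation. The function names, the choice of destination address as the header field, and the sample windows are all assumptions made for illustration. The underlying intuition is that a flood aimed at one victim concentrates destination addresses, which lowers the entropy of that field within a time window.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of `values`."""
    total = len(values)
    if total == 0:
        return 0.0
    counts = Counter(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def accuracy_and_fpr(y_true, y_pred):
    """Return (accuracy, false positive rate), with 1 = attack and 0 = normal."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    accuracy = (tp + tn) / len(pairs)
    # FPR = normal records flagged as attack / all normal records
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return accuracy, fpr


# Hypothetical time windows of destination addresses (illustrative data only).
normal_window = ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4"]
attack_window = ["10.0.0.9"] * 7 + ["10.0.0.1"]

print(shannon_entropy(normal_window))  # evenly spread destinations
print(shannon_entropy(attack_window))  # traffic concentrated on one host

# Hypothetical labels and predictions for four records.
print(accuracy_and_fpr([0, 0, 1, 1], [0, 1, 1, 1]))
```

In a pipeline like the one the abstract describes, windows whose entropy looks anomalous would be passed on to the later stages (co-clustering, feature selection by information gain ratio, and an Extra-Trees classifier), which this sketch does not attempt to reproduce.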

Author information

Corresponding author

Correspondence to Mohamed Idhammad.

Cite this article

Idhammad, M., Afdel, K. & Belouch, M. Semi-supervised machine learning approach for DDoS detection. Appl Intell 48, 3193–3208 (2018). https://doi.org/10.1007/s10489-018-1141-2

  • DOI: https://doi.org/10.1007/s10489-018-1141-2
