Abstract
The increase of botnet presence on the Internet has made it necessary to detect their activity in order to prevent them to attack and spread over the Internet. The main methods to detect botnets are traffic classifiers and sinkhole servers, which are special servers designed as a trap for botnets. However, sinkholes also receive non-malicious automatic online traffic and therefore they also need to use traffic classifiers. For these reasons, we have created two new datasets to evaluate classifiers: the TCP-Int dataset, built from publicly available TCP Internet traces of normal traffic and of three botnets, Kelihos, Miuref and Sality; and the TCP-Sink dataset based on traffic from a private sinkhole server with traces of the Conficker botnet and of automatic normal traffic. We used the two datasets to test four well-known Machine Learning classifiers: Decision Tree, k-Nearest Neighbours, Support Vector Machine and Naïve Bayes. On the TCP-Int dataset, we used the F1 score to measure the capability to identify the type of traffic, i.e., if the trace is normal or from one of the three considered botnets, while on the TCP-Sink we used ROC curves and the corresponding AUC score since it only presents two classes: non-malicious or botnet traffic. The best performance was achieved by Decision Tree, with a 0.99 F1 score and a 0.99 AUC score on the TCP-Int and the TCP-Sink datasets respectively.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
References
Martínez, J., Iglesias, C., García-Nieto, P.: Machine learning techniques applied to cybersecurity. Int. J. Mach. Learn. Cybern. 1–14 (2019)
Silva, S.S., Silva, R.M., Pinto, R.C., Salles, R.M.: Botnets: a survey. Comput. Netw. 57(2), 378–403 (2013)
Boshmaf, Y., Muslukhov, I., Beznosov, K., Ripeanu, M.: Design and analysis of a social botnet. Comput. Netw. 57(2), 556–578 (2013)
Bujlow, T., Carela-Español, V., Barlet-Ros, P.: Independent comparison of popular DPI tools for traffic classification. Comput. Netw. 76, 75–89 (2015)
Kirubavathi, G., Anitha, R.: Botnet detection via mining of traffic flow characteristics. Comput. Electr. Eng. 50, 91–101 (2016)
Kim, H., Choi, S.S., Song, J.: A methodology for multipurpose DNS Sinkhole analyzing double bounce emails. In: International Conference on Neural Information Processing, pp. 609–616 (2013)
Fetzer, C., Felber, P., Rivière, É., Schiavoni, V., Sutra, P.: UniCrawl: a practical geographically distributed web crawler. In: International Conference on Cloud Computing, pp. 389–396 (2015)
Sangkatsanee, P., Wattanapongsakorn, N., Charnsripinyo, C.: Practical real-time intrusion detection using machine learning approaches. Comput. Commun. 34(18), 2227–2235 (2011)
Kim, H., Claffy, K.C., Fomenkov, M., Barman, D., Faloutsos, M., Lee, K.: Internet traffic classification demystified: myths, caveats, and the best practices. In: Proceedings of the 2008 ACM CoNEXT Conference, pp. 11:1–11:12 (2008)
Doshi, R., Apthorpe, N., Feamster, N.: Machine learning DDoS detection for consumer internet of things devices. In: IEEE Security and Privacy Workshops, pp. 29–35 (2018)
García, S., Grill, M., Stiborek, J., Zunino, A.: An empirical comparison of botnet detection methods. Comput. Secur. 45, 100–123 (2014)
Saad, S., et al.: Detecting P2P botnets through network behavior analysis and machine learning. In: 2011 Ninth Annual International Conference on Privacy, Security and Trust, pp. 174–180 (2011)
Zhao, D., et al.: Botnet detection based on traffic behavior analysis and flow intervals. Comput. Secur. 39, 2–16 (2013)
Buntine, W., Niblett, T.: A further comparison of splitting rules for decision-tree induction. Mach. Learn. 8, 75–85 (1992)
Friedman, J.H.: Lazy decision trees. In: Proceedings of the Thirteenth National Conference on Artificial Intelligence, vol. 1, pp. 717–724 (1996)
Dong, W., Moses, C., Li, K.: Efficient K-nearest neighbor graph construction for generic similarity measures. In: Proceedings of the 20th International Conference on World Wide Web, pp. 577–586 (2011)
Cherkassky, V., Ma, Y.: Practical selection of SVM parameters and noise estimation for SVM regression. Neural Netw. 17(1), 113–126 (2004)
Al Nabki, M.W., Fidalgo, E., Alegre, E., de Paz, I.: Classifying illegal activities on TOR network based on web textual contents. In: Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics, vol. 1, pp. 35–43 (2017)
Fidalgo, E., Alegre, E., González-Castro, V., Fernández-Robles, L.: Compass radius estimation for improved image classification using Edge-SIFT. Neurocomputing 197, 119–135 (2016)
Fidalgo, E., Alegre, E., González-Castro, V., Fernández-Robles, L.: Illegal activity categorisation in darknet based on image classification using CREIC method. In: Pérez García, H., Alfonso-Cendón, J., Sánchez González, L., Quintián, H., Corchado, E. (eds.) SOCO/CISIS/ICEUTE -2017. AISC, vol. 649, pp. 600–609. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-67180-2_58
Fidalgo, E., Alegre, E., González-Castro, V., Fernández-Robles, L.: Boosting image classification through semantic attention filtering strategies. Pattern Recogn. Lett. 112, 176–183 (2018)
Schneider, K.: A comparison of event models for Naive Bayes Anti-spam e-Mail Filtering. In: Proceedings of the Tenth Conference on European Chapter of the Association for Computational Linguistics, vol. 1, pp. 307–314 (2003)
Xu, S.: Bayesian Naïve Bayes classifiers to text classification. J. Inf. Sci. 44(1), 48–59 (2018)
Ren, J., Lee, S.D., Chen, X., Kao B., Cheng, R., Cheung, D.: Naive Bayes classification of uncertain data. In: 2009 Ninth IEEE International Conference on Data Mining, pp. 944–949 (2009)
Sasaki, Y.: The truth of the F-measure. Teach Tutor mater 1(5), 1–5 (2007)
Fawcett, T.: An introduction to ROC analysis. Pattern Recogn. Lett. 27(8), 861–874 (2006)
Fan, R.E., Chang, K.W., Hsieh, C.J., Wang, X.R., Lin, C.J.: LIBLINEAR: a library for large linear classification. J. Mach. Learn. Res. 9, 1871–1874 (2008)
van Roosmalen, J., Vranken, H., van Eekelen, M.: Applying deep learning on packet flows for botnet detection. In: Proceedings of the 33rd Annual ACM Symposium on Applied Computing, pp. 1629–1636 (2018)
Acknowledgements
This work was supported by the framework agreement between the University of León and INCIBE (Spanish National Cybersecurity Institute) under Addendum 01.
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Velasco-Mata, J., Fidalgo, E., González-Castro, V., Alegre, E., Blanco-Medina, P. (2019). Botnet Detection on TCP Traffic Using Supervised Machine Learning. In: Pérez García, H., Sánchez González, L., Castejón Limas, M., Quintián Pardo, H., Corchado Rodríguez, E. (eds) Hybrid Artificial Intelligent Systems. HAIS 2019. Lecture Notes in Computer Science(), vol 11734. Springer, Cham. https://doi.org/10.1007/978-3-030-29859-3_38
Download citation
DOI: https://doi.org/10.1007/978-3-030-29859-3_38
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-29858-6
Online ISBN: 978-3-030-29859-3
eBook Packages: Computer ScienceComputer Science (R0)