Abstract
Botnets are among the major threats to network security today. To carry out malicious actions remotely, they rely heavily on Command and Control (C&C) channels. DGA-based botnets use a domain generation algorithm (DGA) to generate a large number of domain names. Traditional machine learning schemes perform well by analyzing the linguistic distinctions between legitimate and DGA-based domain names. However, they struggle to identify domain names that are built from wordlists or generated pseudo-randomly. Accordingly, this paper proposes an efficient CNN-LSTM-based detection model (BotDetector) that uses only a set of easy-to-compute character features. We evaluate our model on two open-source benchmark datasets (360 netlab, Bambenek) and on real DNS traffic from the China Education and Research Network. Experimental results demonstrate that our algorithm improves accuracy and F1-score by 1.6% and reduces computation time by 9.4% compared to other state-of-the-art alternatives. Remarkably, our work can identify botnets' covert communication channels that use domain names based on wordlists or pseudo-random generation without any help from reverse engineering.
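As a rough illustration of what easy-to-compute character features can look like, the following Python sketch derives a few statistics from a domain name and encodes its characters as a fixed-length integer sequence of the kind a character-level CNN-LSTM might consume. The specific features (length, Shannon entropy, digit ratio, vowel ratio), the alphabet, the padding scheme, and the names `char_features` and `encode_chars` are assumptions made for illustration only; they are not the feature set or preprocessing described in the paper.

```python
import math
from collections import Counter

VOWELS = set("aeiou")
# Hypothetical alphabet; index 0 is reserved for padding.
ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789-_"
CHAR_TO_ID = {ch: i + 1 for i, ch in enumerate(ALPHABET)}
UNKNOWN_ID = len(ALPHABET) + 1


def char_features(domain: str) -> dict:
    """Compute a few illustrative character statistics for a domain name."""
    # Use only the leftmost label, lowercased (e.g. "example" in "example.com").
    label = domain.lower().split(".")[0]
    n = len(label)
    if n == 0:
        return {"length": 0, "entropy": 0.0, "digit_ratio": 0.0, "vowel_ratio": 0.0}
    counts = Counter(label)
    # Shannon entropy of the character distribution, in bits.
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return {
        "length": n,
        "entropy": entropy,
        "digit_ratio": sum(ch.isdigit() for ch in label) / n,
        "vowel_ratio": sum(ch in VOWELS for ch in label) / n,
    }


def encode_chars(domain: str, max_len: int = 64) -> list:
    """Map the leftmost label to integer IDs, truncated or zero-padded to max_len."""
    label = domain.lower().split(".")[0][:max_len]
    ids = [CHAR_TO_ID.get(ch, UNKNOWN_ID) for ch in label]
    return ids + [0] * (max_len - len(ids))
```

Pseudo-randomly generated names tend to show higher entropy and lower vowel ratios than dictionary words, which is why statistics of this kind are a common starting point. Wordlist-based names can look much more like legitimate ones on these measures.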
Funding
This work was supported by the Key Laboratory of Computer Network and Information Integration (Ministry of Education) (No. K9392022), the Shandong Computer Society Provincial Key Laboratory Joint Open Fund (No. SDKLCN202203), the Natural Science Foundation of Shandong Province, China (Grant No. ZR2021QF090), the Yangzhou Science and Technology Plan Project (YZ2023200), and the Self-Developing Experimental Instrument and Equipment Project of Yangzhou University (zzyq2023zy06).
Author information
Contributions
All authors contributed to the study conception and design. Material preparation, data collection and analysis were performed by XZ, JC, XZ, JG and GL. The first draft of the manuscript was written by JC and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.
Ethics declarations
Conflict of interest
The authors have no relevant financial or non-financial interests to disclose.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Zang, X., Cao, J., Zhang, X. et al. BotDetector: a system for identifying DGA-based botnet with CNN-LSTM. Telecommun Syst 85, 207–223 (2024). https://doi.org/10.1007/s11235-023-01073-7
DOI: https://doi.org/10.1007/s11235-023-01073-7