BotDetector: a system for identifying DGA-based botnet with CNN-LSTM

Zang, Xiaodong; Cao, Jianbo; Zhang, Xinchang; Gong, Jian; Li, Guiqing

doi:10.1007/s11235-023-01073-7

Published: 27 November 2023

Volume 85, pages 207–223 (2024)

Telecommunication Systems

Xiaodong Zang¹,²,³, Jianbo Cao¹, Xinchang Zhang², Jian Gong³ & Guiqing Li¹

Abstract

Botnets are among the major threats to network security today. To carry out malicious actions remotely, they rely heavily on command-and-control (C&C) channels. DGA-based botnets use a domain generation algorithm (DGA) to generate large numbers of domain names. Traditional machine learning schemes benefit greatly from analyzing the linguistic distinctions between legitimate and DGA-based domain names. However, domain names that are based on wordlists or generated pseudo-randomly remain difficult to identify. Accordingly, this paper proposes an efficient CNN-LSTM-based detection model (BotDetector) that uses only a set of easy-to-compute character features. We evaluate our model on two open-source benchmark datasets (360 netlab, Bambenek) and on real DNS traffic from the China Education and Research Network. Experimental results demonstrate that our algorithm improves accuracy and F1-score by 1.6% and reduces computation time by 9.4% compared with other state-of-the-art alternatives. Notably, our work can identify botnets' covert communication channels that use domain names based on wordlists or pseudo-random generation without any help from reverse engineering.
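To make the idea of easy-to-compute character features concrete, the following minimal Python sketch shows two common ways to turn a domain name into model input: a fixed-length sequence of integer character ids, as typically fed to a character-level CNN-LSTM, and a few cheap character statistics. This preview does not specify the feature set or input length used by BotDetector, so the alphabet, the `MAX_LEN` value, and the helper names `encode_domain` and `char_features` are illustrative assumptions, not the authors' implementation.

```python
from __future__ import annotations

import math
from collections import Counter

# Assumed alphabet: lowercase letters, digits, hyphen, dot, underscore.
ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789-._"
# Id 0 is reserved for padding; characters outside ALPHABET also map to 0.
CHAR_TO_ID = {ch: i + 1 for i, ch in enumerate(ALPHABET)}
MAX_LEN = 64  # assumed fixed input length, not taken from the paper


def encode_domain(domain: str, max_len: int = MAX_LEN) -> list[int]:
    """Map a domain name to a fixed-length list of integer character ids."""
    ids = [CHAR_TO_ID.get(ch, 0) for ch in domain.lower()[:max_len]]
    # Right-pad with zeros so every domain has the same length.
    return ids + [0] * (max_len - len(ids))


def char_features(domain: str) -> dict[str, float]:
    """Compute a few cheap character statistics for one domain name."""
    name = domain.lower()
    total = len(name)
    if total == 0:
        return {"length": 0.0, "digit_ratio": 0.0, "entropy": 0.0}
    counts = Counter(name)
    # Shannon entropy (in bits) of the character distribution.
    entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
    digit_ratio = sum(ch.isdigit() for ch in name) / total
    return {"length": float(total), "digit_ratio": digit_ratio, "entropy": entropy}


if __name__ == "__main__":
    print(encode_domain("ab.com")[:8])
    print(char_features("a1b2"))
```

In a pipeline of this kind, the encoded sequences would usually pass through an embedding layer before the convolutional and recurrent layers. Statistics such as entropy are the sort of linguistic signal that, according to the abstract, is less effective against wordlist-based names.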

Funding

This work was supported by the Key Laboratory of Computer Network and Information Integration (Ministry of Education) (No. K9392022), the Shandong Computer Society provincial key laboratory joint open fund (No. SDKLCN202203), the Natural Science Foundation of Shandong Province, China (No. ZR2021QF090), the Yangzhou Science and Technology Plan Project (YZ2023200), and the Self-Developing Experimental Instrument and Equipment Project of Yangzhou University (zzyq2023zy06).

Author information

Authors and Affiliations

School of Cyber Science and Engineering, Qufu Normal University, Qufu, China
Xiaodong Zang, Jianbo Cao & Guiqing Li
Qilu University of Technology (Shandong Academy of Sciences) Shandong Computer Science Center, Shandong Provincial Key Laboratory of Computer Networks, Jinan, China
Xiaodong Zang & Xinchang Zhang
Key Laboratory of Computer Network and Information Integration, Ministry of Education, Southeast University, Nanjing, China
Xiaodong Zang & Jian Gong

Contributions

All authors contributed to the study conception and design. Material preparation, data collection and analysis were performed by XZ, JC, XZ, JG and GL. The first draft of the manuscript was written by JC and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Xiaodong Zang.

Ethics declarations

Conflict of interest

The authors have no relevant financial or non-financial interests to disclose.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Zang, X., Cao, J., Zhang, X. et al. BotDetector: a system for identifying DGA-based botnet with CNN-LSTM. Telecommun Syst 85, 207–223 (2024). https://doi.org/10.1007/s11235-023-01073-7

Accepted: 28 October 2023
Published: 27 November 2023
Issue Date: February 2024
DOI: https://doi.org/10.1007/s11235-023-01073-7
