A Survey of Machine Learning and Deep Learning Based DGA Detection Techniques

Saeed, Amr M. H.; Wang, Danghui; Alnedhari, Hamas A. M.; Mei, Kuizhi; Wang, Jihe

doi:10.1007/978-3-030-97774-0_12

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13202)

Included in the following conference series:

International Conference on Smart Computing and Communication

Abstract

Botnets are the most commonly used mechanisms for current cyberattacks such as DDoS, ransomware, email spamming, and phishing. Botnets deploy a Domain Generation Algorithm (DGA) to conceal the domain names of Command & Control (C&C) servers by generating many fake domain names. A sophisticated DGA can circumvent traditional detection methods and communicate successfully with the C&C server. Several detection methods, such as DNS sinkholing, DNS filtering, and DNS log analysis, have been studied intensively to neutralize DGAs. However, these methods have a high noise rate and require massive computational resources. To tackle this issue, several researchers have leveraged Machine Learning (ML) and Deep Learning (DL) algorithms to develop lightweight and cost-effective detection methods. The purpose of this paper is to investigate and evaluate the ML/DL-based DGA detection methods published in the last three years. After analyzing the strengths and limitations of the relevant literature, we conclude that low detection speed, sensitivity to encrypted DNS, sensitivity to data imbalance, and low detection accuracy on variant or unknown DGAs are most likely the current research trends and opportunities. As far as we know, this survey is the first of its kind to discuss ML/DL-based DGA detection techniques in depth and to analyze their limitations and future trends.
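
To make the idea in the abstract concrete, the following is a minimal illustrative sketch and is not taken from the surveyed papers. It assumes a hypothetical hash-based generator (`toy_dga`) standing in for a real DGA, and a few hand-picked lexical features (`lexical_features`) of the kind an ML classifier might consume. All function names, the seed format, and the feature set are assumptions made for illustration only.

```python
import hashlib
import math
from collections import Counter


def toy_dga(seed: str, count: int, length: int = 12, tld: str = ".com") -> list[str]:
    """Hypothetical DGA: derive `count` pseudo-random domains from a shared seed.

    Real malware families use their own schemes; this only shows the principle
    that bot and C&C operator can compute the same list independently.
    """
    domains = []
    for i in range(count):
        digest = hashlib.sha256(f"{seed}-{i}".encode()).hexdigest()
        # Map each hex digit (0..15) onto a lowercase letter 'a'..'p'.
        label = "".join(chr(ord("a") + int(c, 16)) for c in digest[:length])
        domains.append(label + tld)
    return domains


def shannon_entropy(label: str) -> float:
    """Character-level Shannon entropy in bits; 0 for an empty string."""
    total = len(label)
    counts = Counter(label)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def lexical_features(domain: str) -> dict[str, float]:
    """Assumed example features computed on the leftmost label of a domain."""
    label = domain.split(".")[0].lower()
    if not label:
        return {"length": 0.0, "entropy": 0.0, "vowel_ratio": 0.0, "digit_ratio": 0.0}
    return {
        "length": float(len(label)),
        "entropy": shannon_entropy(label),
        "vowel_ratio": sum(ch in "aeiou" for ch in label) / len(label),
        "digit_ratio": sum(ch.isdigit() for ch in label) / len(label),
    }


if __name__ == "__main__":
    for name in ["google.com"] + toy_dga("2021-10-05", 3):
        print(name, lexical_features(name))
```

Feature vectors like these could be fed to any standard classifier; deep learning approaches typically skip hand-crafted features and learn directly from the character sequence instead.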

Acknowledgement

This work is supported by the National NSF of China (No. 61802312), the Natural Science Basic Research Plan in Shaanxi Province of China (No. 2019JQ-618), and the open fund of Integrated Aero-Space-Ground-Ocean Big Data Application Technology (No. 20200105).

Author information

Authors and Affiliations

School of Computer Science, Northwestern Polytechnical University, Xi’an, China
Amr M. H. Saeed, Danghui Wang & Jihe Wang
School of Artificial Intelligence, Xidian University, Xi’an, China
Hamas A. M. Alnedhari
Institute of Artificial Intelligence and Robotics, Xi’an Jiaotong University, Xi’an, China
Kuizhi Mei
Engineering Research Center of Embedded System Integration, MOE, Beijing, China
Danghui Wang
National Engineering Laboratory for Integrated Aero-Space-Ground-Ocean Big Data Application Technology, Xi’an, China
Jihe Wang

Corresponding author

Correspondence to Danghui Wang.

Editor information

Editors and Affiliations

Texas A&M University-Commerce, Commerce, TX, USA
Meikang Qiu
Beijing Institute of Technology, Beijing, China
Keke Gai
Tsinghua University, Beijing, China
Han Qiu

About this paper

Cite this paper

Saeed, A.M.H., Wang, D., Alnedhari, H.A.M., Mei, K., Wang, J. (2022). A Survey of Machine Learning and Deep Learning Based DGA Detection Techniques. In: Qiu, M., Gai, K., Qiu, H. (eds) Smart Computing and Communication. SmartCom 2021. Lecture Notes in Computer Science, vol 13202. Springer, Cham. https://doi.org/10.1007/978-3-030-97774-0_12

DOI: https://doi.org/10.1007/978-3-030-97774-0_12
Published: 15 March 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-97773-3
Online ISBN: 978-3-030-97774-0
eBook Packages: Computer Science, Computer Science (R0)