Detection of Algorithmically Generated Domain Names Using SMOTE and Hybrid Neural Network

Zhang, Yudong; Chen, Yuzhong; Lin, Yangyang; Zhang, Yankun

doi:10.1007/978-981-15-1377-0_57

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1042)

Included in the following conference series:

CCF Conference on Computer Supported Cooperative Work and Social Computing

Abstract

Domain generation algorithms (DGAs) use specific parameters as random seeds to generate large numbers of random domain names in order to evade malicious domain name detection, which greatly increases the difficulty of detecting and defending against botnets and malware. State-of-the-art models for detecting algorithmically generated domain names generally analyze the statistical characteristics of a domain name and build a classifier to identify the algorithmically generated ones. However, most current models require the manual construction of feature sets for classification, are sensitive to imbalance in the sample distribution of the domain name dataset, and are difficult to adapt to frequent changes in the domain generation algorithm. To address these issues, we propose a hybrid model that combines a convolutional neural network (CNN) and a bidirectional long short-term memory network (BLSTM). First, because the number of domain names generated by DGAs is relatively small and the sample distribution is unbalanced, which decreases detection accuracy, the borderline synthetic minority over-sampling technique (Borderline-SMOTE) is employed to balance the samples in the domain name dataset. Second, a hybrid deep neural network that combines a CNN and a BLSTM is introduced to extract semantic and context-dependency features from the domain names. Experimental results on different domain name datasets demonstrate that the proposed model achieves significant improvement over state-of-the-art models in precision and robustness.
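
The abstract names borderline synthetic minority over-sampling but gives no implementation details. As a rough illustration only, the following sketch follows the general Borderline-SMOTE idea (oversample only those minority samples whose neighbourhood is dominated, but not entirely occupied, by the majority class). The function name `borderline_smote`, its parameters `m` and `k`, and the toy 2-D points are assumptions made for this example and are not taken from the chapter; real domain names would first have to be encoded as numeric vectors.

```python
import math
import random


def borderline_smote(minority, majority, n_new, m=5, k=3, seed=0):
    """Illustrative Borderline-SMOTE-style oversampling (not the authors' code).

    minority, majority: lists of equal-length numeric tuples.
    n_new: number of synthetic minority samples to generate.
    m: neighbours (from both classes) used to decide whether a sample is borderline.
    k: nearest minority neighbours used for interpolation.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    labelled = [(x, 1) for x in minority] + [(x, 0) for x in majority]

    # Step 1: find borderline ("danger") minority samples.
    danger = []
    for p in minority:
        others = [(x, y) for x, y in labelled if x is not p]
        neighbours = sorted(others, key=lambda xy: math.dist(p, xy[0]))[:m]
        n_major = sum(1 for _, y in neighbours if y == 0)
        # At least half of the neighbours are majority, but not all of them
        # (all-majority neighbourhoods are treated as noise and skipped).
        if m / 2 <= n_major < m:
            danger.append(p)

    # Step 2: interpolate between a borderline sample and one of its
    # k nearest minority neighbours.
    synthetic = []
    if not danger:
        return synthetic
    for _ in range(n_new):
        p = rng.choice(danger)
        pool = [x for x in minority if x is not p]
        nearest = sorted(pool, key=lambda x: math.dist(p, x))[:k]
        q = rng.choice(nearest)
        gap = rng.random()  # position along the segment from p towards q
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(p, q)))
    return synthetic


# Toy data: two minority points sit next to majority points (borderline),
# three minority points form a separate, "safe" cluster.
demo_minority = [(2.0, 2.0), (2.2, 2.0), (5.0, 5.0), (5.2, 5.0), (5.0, 5.2)]
demo_majority = [(1.9, 2.0), (2.0, 1.9), (-5.0, -5.0), (-5.0, -6.0), (-6.0, -5.0)]
demo_new = borderline_smote(demo_minority, demo_majority, n_new=4, m=3, k=1)
print(len(demo_new), "synthetic minority samples generated")
```

In this sketch only the two minority points near the class boundary are used as seeds, so the synthetic samples reinforce the region where a classifier is most likely to confuse the classes. The neighbour search is a simple sort, which is adequate for a small example but not for a large dataset.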

Author information

Authors and Affiliations

College of Mathematics and Computer Sciences, Fuzhou University, Fuzhou, 350116, China
Yudong Zhang, Yuzhong Chen, Yangyang Lin & Yankun Zhang
Fujian Provincial Key Laboratory of Network Computing and Intelligent Information Processing, Fuzhou, 350116, China
Yudong Zhang, Yuzhong Chen, Yangyang Lin & Yankun Zhang

Corresponding author

Correspondence to Yuzhong Chen.

Editor information

Editors and Affiliations

Shandong University, Jinan, China
Yuqing Sun
Fudan University, Shanghai, China
Tun Lu
Kunming University of Science and Technology, Kunming, China
Zhengtao Yu
Tongji University, Shanghai, China
Hongfei Fan
University of Shanghai for Science and Technology, Shanghai, China
Liping Gao

About this paper

Cite this paper

Zhang, Y., Chen, Y., Lin, Y., Zhang, Y. (2019). Detection of Algorithmically Generated Domain Names Using SMOTE and Hybrid Neural Network. In: Sun, Y., Lu, T., Yu, Z., Fan, H., Gao, L. (eds) Computer Supported Cooperative Work and Social Computing. ChineseCSCW 2019. Communications in Computer and Information Science, vol 1042. Springer, Singapore. https://doi.org/10.1007/978-981-15-1377-0_57

DOI: https://doi.org/10.1007/978-981-15-1377-0_57
Published: 14 November 2019
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-1376-3
Online ISBN: 978-981-15-1377-0
eBook Packages: Computer Science, Computer Science (R0)