Detection of Algorithmically Generated Domain Names Using SMOTE and Hybrid Neural Network

  • Conference paper
Computer Supported Cooperative Work and Social Computing (ChineseCSCW 2019)

Part of the book series: Communications in Computer and Information Science ((CCIS,volume 1042))

Abstract

Domain generation algorithms (DGAs) use specific parameters as random seeds to generate large numbers of pseudo-random domain names that evade malicious-domain detection, which greatly increases the difficulty of detecting and defending against botnets and malware. State-of-the-art models for detecting algorithmically generated domain names generally analyze the statistical characteristics of a domain name and build a classifier to identify the generated ones. However, most current models have three problems: they require feature sets to be constructed manually, they are sensitive to imbalance in the sample distribution of the domain-name dataset, and they adapt poorly to frequent changes in the generation algorithms. To address these issues, we propose a hybrid model that combines a convolutional neural network (CNN) and a bidirectional long short-term memory network (BLSTM). First, because DGA-generated domain names are relatively few and the sample distribution is unbalanced, which lowers detection accuracy, the borderline synthetic minority over-sampling technique (Borderline-SMOTE) is employed to balance the domain-name dataset. Second, a hybrid deep neural network that combines the CNN and BLSTM is introduced to extract semantic and context-dependency features from the domain names. Experimental results on different domain-name datasets demonstrate that the proposed model achieves a significant improvement over state-of-the-art models in precision and robustness.
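The borderline over-sampling step mentioned in the abstract can be illustrated with a minimal sketch. It follows the general Borderline-SMOTE idea (synthesize new minority samples only near the class boundary) and is not the authors' implementation. The function name `borderline_smote`, its parameters `m`, `k`, `n_new`, and `seed`, and the 2-D toy points are all assumptions made for illustration; the abstract does not say how domain names are turned into numeric vectors before over-sampling.

```python
import math
import random


def _nearest(idx, points, k):
    """Indices of the k points closest to points[idx], excluding idx itself."""
    others = (j for j in range(len(points)) if j != idx)
    return sorted(others, key=lambda j: math.dist(points[idx], points[j]))[:k]


def borderline_smote(minority, majority, m=5, k=3, n_new=1, seed=0):
    """Illustrative Borderline-SMOTE sketch (hypothetical helper, toy scale).

    A minority sample is "in danger" when at least half, but not all, of its
    m nearest neighbours in the full dataset belong to the majority class.
    Only those samples are used to synthesize new minority points.
    """
    rng = random.Random(seed)
    points = list(minority) + list(majority)
    n_min = len(minority)

    danger = []
    for i in range(n_min):
        # Indices >= n_min refer to majority samples in the combined list.
        n_major = sum(j >= n_min for j in _nearest(i, points, m))
        if m / 2 <= n_major < m:  # n_major == m is treated as noise and skipped
            danger.append(i)

    synthetic = []
    for i in danger:
        neighbours = _nearest(i, minority, k)  # minority-class neighbours only
        if not neighbours:
            continue
        for _ in range(n_new):
            j = rng.choice(neighbours)
            gap = rng.random()  # interpolation factor in [0, 1)
            synthetic.append(tuple(
                a + gap * (b - a) for a, b in zip(minority[i], minority[j])
            ))
    return [minority[i] for i in danger], synthetic


# Toy data (assumed): three minority points far from the boundary, one near it.
minority = [(0.0, 0.0), (0.2, 0.0), (0.0, 0.2), (0.9, 0.9)]
majority = [(1.0, 1.0), (1.1, 0.9), (3.0, 3.0), (3.1, 3.0)]
danger, synthetic = borderline_smote(minority, majority, m=3, k=2, n_new=4, seed=1)
print("borderline samples:", danger)
print("number of synthetic samples:", len(synthetic))
```

In this toy setup only the minority point next to the majority cluster is selected, and each synthetic point lies on a line segment between it and one of its minority neighbours. In practice an established library implementation would normally be preferred over a hand-written one.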

Author information

Corresponding author

Correspondence to Yuzhong Chen.

Copyright information

© 2019 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Zhang, Y., Chen, Y., Lin, Y., Zhang, Y. (2019). Detection of Algorithmically Generated Domain Names Using SMOTE and Hybrid Neural Network. In: Sun, Y., Lu, T., Yu, Z., Fan, H., Gao, L. (eds) Computer Supported Cooperative Work and Social Computing. ChineseCSCW 2019. Communications in Computer and Information Science, vol 1042. Springer, Singapore. https://doi.org/10.1007/978-981-15-1377-0_57

  • DOI: https://doi.org/10.1007/978-981-15-1377-0_57

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-15-1376-3

  • Online ISBN: 978-981-15-1377-0

  • eBook Packages: Computer Science, Computer Science (R0)