Nemesis: Detecting Algorithmically Generated Domains with an LSTM Language Model

Yuan, Dunsheng; Xiong, Ying; Zang, Tianning; Huang, Ji

doi:10.1007/978-3-030-30146-0_24

Part of the book series: Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering (LNICST, volume 292)

Included in the following conference series:

International Conference on Collaborative Computing: Networking, Applications and Worksharing

Abstract

Malware families frequently use Domain Generation Algorithms (DGAs) to generate large numbers of pseudorandom domain names for communicating with their Command and Control (C&C) servers. Security researchers have invested considerable effort in detecting Algorithmically Generated Domains (AGDs) to fight botnets and related malicious network behavior. In this paper, we propose a new AGD detection approach, Nemesis, based on a Long Short-Term Memory (LSTM) language model. Nemesis identifies whether a given domain name is an AGD from its string composition alone, without additional information. Nemesis first uses an n-gram dictionary, built from real domain names, to tokenize domain names into n-grams. A pre-trained detector then classifies each domain name as real or algorithmically generated based on the tokenized result. We evaluate Nemesis' ability to detect domain names generated by known DGAs and to discover new DGA families. Nemesis detects AGDs with a precision of 98.6% and a recall of 96.7%, and it outperforms several existing effective approaches by a large margin.
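
The tokenization step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the n-gram dictionary is built or how n-grams are matched, so everything below is an assumption made for illustration: the function names `build_ngram_dictionary` and `tokenize`, the use of only the leftmost domain label, the frequency cutoff `top_k`, the maximum n-gram length `max_n`, and the greedy longest-match strategy with a single-character fallback. It is not the authors' implementation.

```python
from collections import Counter


def build_ngram_dictionary(real_domains, max_n=3, top_k=1000):
    """Collect the top_k most frequent n-grams (2 <= n <= max_n) from real domains.

    Assumption: only the leftmost label of each domain is considered.
    """
    counts = Counter()
    for domain in real_domains:
        label = domain.lower().split(".")[0]
        for n in range(2, max_n + 1):
            for i in range(len(label) - n + 1):
                counts[label[i:i + n]] += 1
    return {gram for gram, _ in counts.most_common(top_k)}


def tokenize(domain, ngram_dict, max_n=3):
    """Split a domain label into dictionary n-grams by greedy longest match.

    Assumption: spans not covered by the dictionary become single-character tokens.
    """
    label = domain.lower().split(".")[0]
    tokens = []
    i = 0
    while i < len(label):
        # Try the longest candidate first, down to length 2.
        for n in range(min(max_n, len(label) - i), 1, -1):
            if label[i:i + n] in ngram_dict:
                tokens.append(label[i:i + n])
                i += n
                break
        else:
            # No dictionary n-gram starts here: emit one character.
            tokens.append(label[i])
            i += 1
    return tokens
```

Under these assumptions, a name made of common n-grams yields a few long tokens, while a pseudorandom name mostly falls back to single characters. The resulting token sequence is the kind of input a sequence classifier such as an LSTM could consume.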

Notes

1. Nemesis is the goddess of retribution for evil deeds in ancient Greek mythology.

Acknowledgments

This work is supported by the National Key Research and Development Program of China under Grant No. 2018YFB0804702 and No. 2018YFB0804704. The corresponding author is Tianning Zang.

Author information

Authors and Affiliations

School of Cyber Security, University of Chinese Academy of Sciences, Beijing, China
Dunsheng Yuan & Ji Huang
Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China
Dunsheng Yuan, Tianning Zang & Ji Huang
National Computer Network Emergency Response Technical Team/Coordination Center of China, Beijing, China
Ying Xiong

Corresponding author

Correspondence to Tianning Zang.

Editor information

Editors and Affiliations

Xi’an Jiaotong-Liverpool University, Suzhou, China
Xinheng Wang
Shanghai University, Shanghai, China
Honghao Gao
London South Bank University, London, UK
Muddesar Iqbal
University of Exeter, Exeter, UK
Geyong Min

About this paper

Cite this paper

Yuan, D., Xiong, Y., Zang, T., Huang, J. (2019). Nemesis: Detecting Algorithmically Generated Domains with an LSTM Language Model. In: Wang, X., Gao, H., Iqbal, M., Min, G. (eds) Collaborative Computing: Networking, Applications and Worksharing. CollaborateCom 2019. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 292. Springer, Cham. https://doi.org/10.1007/978-3-030-30146-0_24

DOI: https://doi.org/10.1007/978-3-030-30146-0_24
Published: 18 August 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-30145-3
Online ISBN: 978-3-030-30146-0
eBook Packages: Computer Science, Computer Science (R0)