Algorithmically Generated Domain Detection and Malware Family Classification

Choudhary, Chhaya; Sivaguru, Raaghavi; Pereira, Mayana; Yu, Bin; Nascimento, Anderson C.; De Cock, Martine

doi:10.1007/978-981-13-5826-5_50

Part of the book series: Communications in Computer and Information Science (CCIS, volume 969)

Included in the following conference series:

International Symposium on Security in Computing and Communication

Abstract

In this paper, we compare the performance of several machine-learning-based approaches on two tasks: detecting algorithmically generated malicious domains and categorizing domains according to their malware family. The datasets used for model comparison were provided by the shared task on Detecting Malicious Domain names (DMD 2018). Our models ranked first on two of the four test datasets provided in the competition.

Notes

1. http://nlp.amrita.edu/DMD2018/, Accessed: 2018-07-18.
2. https://github.com/baderj/domain_generation_algorithms, Accessed: 2018-07-24.
3. https://data.netlab.360.com/dga/, Accessed: 2018-07-24.
4. https://umbrella.cisco.com/blog/2016/12/14/cisco-umbrella-1-million/, Accessed: 2018-07-24.
5. https://dgarchive.caad.fkie.fraunhofer.de/site/, Accessed: 2018-07-24.
6. https://www.farsightsecurity.com/, Accessed: 2018-07-24.
7. https://www.spamhaus.org/statistics/tlds/, Accessed: 2018-07-18.

Acknowledgments

We gratefully acknowledge the support of NVIDIA Corporation with the donation of the Titan Xp GPU used for this research.

Author information

Authors and Affiliations

University of Washington Tacoma, Tacoma, USA
Chhaya Choudhary, Raaghavi Sivaguru, Anderson C. Nascimento & Martine De Cock
Infoblox Inc., Santa Clara, USA
Mayana Pereira & Bin Yu
Ghent University, Ghent, Belgium
Martine De Cock

Corresponding author

Correspondence to Martine De Cock.

Editor information

Editors and Affiliations

Indian Institute of Information Technology and Management, Kerala, India
Sabu M. Thampi
Department of Computer Science, Missouri University of Science and Technology, Rolla, MO, USA
Sanjay Madria
Guangzhou University, Guangzhou, China
Guojun Wang
Howard University, Washington, DC, USA
Danda B. Rawat
University of the West of Scotland, Paisley, Glasgow, UK
Jose M. Alcaraz Calero

Cite this paper

Choudhary, C., Sivaguru, R., Pereira, M., Yu, B., Nascimento, A.C., De Cock, M. (2019). Algorithmically Generated Domain Detection and Malware Family Classification. In: Thampi, S., Madria, S., Wang, G., Rawat, D., Alcaraz Calero, J. (eds) Security in Computing and Communications. SSCC 2018. Communications in Computer and Information Science, vol 969. Springer, Singapore. https://doi.org/10.1007/978-981-13-5826-5_50

DOI: https://doi.org/10.1007/978-981-13-5826-5_50
Published: 24 January 2019
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-5825-8
Online ISBN: 978-981-13-5826-5
eBook Packages: Computer Science, Computer Science (R0)
