Transfer Learning Approach for Identification of Malicious Domain Names

  • Conference paper

Part of the book series: Communications in Computer and Information Science (CCIS, volume 969)

Abstract

Malicious domains produced by Domain Generation Algorithms (DGAs) are highly dynamic. Traditional blacklisting of malicious domains is time-consuming and ineffective, because a DGA randomly generates the domain names used by the malware. For real-time applications, malware detection must be performed on the fly, so more sophisticated techniques are needed. Although various machine learning techniques have been applied to this problem, their performance depends on how well the features are designed. In this work, we propose a transfer learning technique that combines the best-performing Convolutional Neural Network (CNN) with machine learning algorithms such as the Naive Bayes (NB) classifier for the detection and classification of DGA-generated domains. We evaluated our approach on the dataset released for the DMD 2018 Shared Task, in both the binary and multiclass classification scenarios. Our CNN with NB methodology for binary classification was awarded first rank in the DMD 2018 shared task.
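The pipeline summarised in the abstract, convolutional features passed to a Naive Bayes classifier, can be illustrated with a minimal sketch. The abstract does not give the architecture, hyperparameters, or training procedure, so everything below is an assumption: the character set, the randomly initialised filters (standing in for the weights of a CNN trained on the shared-task data), the Gaussian variant of Naive Bayes, and the helper names `make_filters`, `cnn_features`, and `GaussianNaiveBayes` are all hypothetical. The toy domains are made up for illustration and are not drawn from the DMD 2018 dataset.

```python
import math
import random

# Assumed character set for domain names; not taken from the paper.
ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789-."
CHAR_INDEX = {c: i for i, c in enumerate(ALPHABET)}


def make_filters(num_filters=8, width=3, seed=0):
    """Random 1D convolution filters, a stand-in for trained CNN weights."""
    rng = random.Random(seed)
    return [[[rng.uniform(-1.0, 1.0) for _ in ALPHABET] for _ in range(width)]
            for _ in range(num_filters)]


def cnn_features(domain, filters):
    """Convolve over one-hot characters, apply ReLU, then global max pooling."""
    ids = [CHAR_INDEX[c] for c in domain.lower() if c in CHAR_INDEX]
    features = []
    for filt in filters:
        width = len(filt)
        best = 0.0  # ReLU floor; also the value for domains shorter than the filter
        for start in range(len(ids) - width + 1):
            # With one-hot input, the dot product reduces to a weight lookup.
            activation = sum(filt[k][ids[start + k]] for k in range(width))
            best = max(best, activation)
        features.append(best)
    return features


class GaussianNaiveBayes:
    """Gaussian Naive Bayes over real-valued feature vectors."""

    def fit(self, X, y):
        self.stats = {}
        for label in sorted(set(y)):
            rows = [x for x, target in zip(X, y) if target == label]
            n = len(rows)
            means = [sum(col) / n for col in zip(*rows)]
            # Small constant keeps variances strictly positive.
            variances = [sum((v - m) ** 2 for v in col) / n + 1e-3
                         for col, m in zip(zip(*rows), means)]
            self.stats[label] = (math.log(n / len(y)), means, variances)
        return self

    def predict(self, x):
        def log_posterior(label):
            log_prior, means, variances = self.stats[label]
            return log_prior + sum(
                -0.5 * math.log(2 * math.pi * s) - (v - m) ** 2 / (2 * s)
                for v, m, s in zip(x, means, variances))
        return max(self.stats, key=log_posterior)


# Toy, made-up training examples for illustration only.
train = [("google.com", "benign"), ("wikipedia.org", "benign"),
         ("xjwqkzpvbt.com", "dga"), ("qzxvkhrmwy.net", "dga")]
filters = make_filters()
model = GaussianNaiveBayes().fit(
    [cnn_features(d, filters) for d, _ in train], [label for _, label in train])
print(model.predict(cnn_features("example.com", filters)))
```

With random filters and four training examples the prediction carries no meaning; the sketch only shows how the two stages connect. In a real transfer learning setup, the filters would come from a CNN trained on labelled domains, and only the final classifier would be replaced.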

Acknowledgement

The authors would like to thank the management of Vellore Institute of Technology (VIT), Chennai for supporting this research. We would also like to thank the Department of Science and Engineering Research Board (SERB), Government of India for the financial grant (Award No: ECR/2016/00484) that funded this work.

Author information

Corresponding author

Correspondence to R. Rajalakshmi.

Copyright information

© 2019 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Rajalakshmi, R., Ramraj, S., Ramesh Kannan, R. (2019). Transfer Learning Approach for Identification of Malicious Domain Names. In: Thampi, S., Madria, S., Wang, G., Rawat, D., Alcaraz Calero, J. (eds) Security in Computing and Communications. SSCC 2018. Communications in Computer and Information Science, vol 969. Springer, Singapore. https://doi.org/10.1007/978-981-13-5826-5_51

  • DOI: https://doi.org/10.1007/978-981-13-5826-5_51

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-13-5825-8

  • Online ISBN: 978-981-13-5826-5

  • eBook Packages: Computer Science (R0)
