Threat Detection in URLs by Applying Machine Learning Algorithms*

  • Conference paper
Distributed Computing and Artificial Intelligence, Special Sessions, 19th International Conference (DCAI 2022)

Abstract

Cyber threat groups build infrastructure to distribute malware to victims. The entry vector of these threats is usually the download of a malicious file via a web link, which initiates the system infection. A malicious URL detector can catch these threats at an early stage, before a compromise occurs. This work contributes to detecting cyber threats (mainly Emotet and Qakbot) in advance, given an input URL.

*University of Salamanca

Acknowledgements

This work has been supported by the project “XAI - XAI - Sistemas Inteligentes Auto Explicativos creados con Módulos de Mezcla de Expertos”, ID SA082P20, financed by Junta Castilla y León, Consejería de Educación, and FEDER funds.

Author information

Corresponding author

Correspondence to Angélica González Arrieta.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Bustos-Tabernero, Á., López-Sánchez, D., González Arrieta, A. (2023). Threat Detection in URLs by Applying Machine Learning Algorithms*. In: Machado, J.M., et al. Distributed Computing and Artificial Intelligence, Special Sessions, 19th International Conference. DCAI 2022. Lecture Notes in Networks and Systems, vol 585. Springer, Cham. https://doi.org/10.1007/978-3-031-23210-7_21
