Abstract
Web addresses, or Uniform Resource Locators (URLs), are a vector through which attackers can deliver a multitude of unwanted and potentially harmful effects to users, typically by serving malicious software. Detecting and blocking access to such URLs has traditionally relied on reactive, labour-intensive means such as human verification and the maintenance of whitelists and blacklists. Machine learning has shown great potential to automate this defence and make it proactive through the use of classifier models. Work in this area has produced numerous high-accuracy models, yet the underlying algorithms remain fragile to adversarial manipulation if implemented without consideration being given to their security. Our work investigates the robustness of several classifiers for malicious URL detection by randomly perturbing the labels of samples in the training data. We show that, without a measure of defence against adversarial influence, highly accurate malicious URL detection can be significantly and adversely affected at even low degrees of training data perturbation.
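As a rough illustration of the attack setting described in the abstract, the sketch below flips a randomly chosen fraction of training labels (label contamination) before fitting a classifier and records the effect on test accuracy. The dataset, feature set, contamination rates, and classifier choice are placeholders for illustration only, not the paper's actual experimental setup.

```python
# Minimal sketch of a random label-contamination experiment, assuming a
# generic synthetic feature matrix stands in for lexical URL features.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(42)

# Stand-in for a featurised URL dataset (benign = 0, malicious = 1).
X, y = make_classification(n_samples=5000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)

def contaminate(labels, rate):
    """Flip the labels of a randomly chosen fraction `rate` of training samples."""
    flipped = labels.copy()
    n_flip = int(rate * len(flipped))
    idx = rng.choice(len(flipped), size=n_flip, replace=False)
    flipped[idx] = 1 - flipped[idx]
    return flipped

# Train on increasingly contaminated labels and evaluate on clean test data.
for rate in [0.0, 0.05, 0.1, 0.2, 0.3]:
    clf = RandomForestClassifier(n_estimators=100, random_state=42)
    clf.fit(X_train, contaminate(y_train, rate))
    acc = accuracy_score(y_test, clf.predict(X_test))
    print(f"contamination {rate:.0%}: test accuracy {acc:.3f}")
```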
Acknowledgments
This work has been partly supported by the University of Piraeus Research Center.