Randomized Active Learning to Identify Phishing URL

Ponni, P.; Prabha, D.

doi:10.1007/978-3-031-25088-0_47

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1749)

Included in the following conference series:

International Conference on Advanced Communication and Intelligent Systems

Abstract

Data analytics is increasingly used for cybersecurity problems and has proven useful where data volume and heterogeneity make manual assessment by security specialists impractical. In real-world settings, obtaining annotated data is a well-known limiting factor for many supervised security analytics tasks. Because annotation is largely manual and demands considerable expert effort, large portions of big datasets are frequently left unlabeled. To address this constraint, we adopt a randomly ranked feature active learning strategy to build a semi-supervised solution for an applied cybersecurity problem, phishing classification. An initial classifier is trained on a small sample of annotated data and then iteratively updated by selecting, from a large pool of unlabeled data, only the relevant samples that are most likely to affect classifier performance quickly. Randomly ranked feature active learning shows considerable promise for achieving faster convergence in classification performance in a batch learning setting, while requiring even less human annotation effort. Combined with active learning, a useful feature rank update strategy yields good classification results for labeling phishing/malicious URLs without requiring a large number of labeled training examples during training.
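The loop described in the abstract (train on a small labeled seed, then repeatedly query labels for the most informative samples in an unlabeled pool) can be sketched as follows. This is a minimal illustration under stated assumptions, not the chapter's method: it substitutes plain uncertainty sampling for the randomly ranked feature strategy, whose details are not given in the abstract. The lexical features, the tiny logistic-regression trainer, and the names `url_features`, `train`, `active_learning`, and `oracle` are all hypothetical.

```python
import math
import random


def url_features(url):
    # Hypothetical lexical features scaled to [0, 1]; NOT the chapter's feature set.
    return [
        min(len(url) / 100.0, 1.0),                        # URL length
        min(url.count(".") / 10.0, 1.0),                   # number of dots
        min(url.count("-") / 10.0, 1.0),                   # number of hyphens
        min(sum(ch.isdigit() for ch in url) / 20.0, 1.0),  # number of digits
        1.0 if "@" in url else 0.0,                        # '@' present
        1.0,                                               # bias term
    ]


def predict_proba(weights, x):
    # Logistic model: estimated probability that the URL is phishing.
    z = sum(w * xi for w, xi in zip(weights, x))
    z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
    return 1.0 / (1.0 + math.exp(-z))


def train(urls, labels, epochs=100, lr=0.5):
    # Plain stochastic gradient ascent on the logistic log-likelihood.
    X = [url_features(u) for u in urls]
    weights = [0.0] * len(X[0])
    for _ in range(epochs):
        for x, y in zip(X, labels):
            error = y - predict_proba(weights, x)
            weights = [w + lr * error * xi for w, xi in zip(weights, x)]
    return weights


def active_learning(pool, oracle, seed_size=4, rounds=3, batch_size=2, seed=0):
    # `oracle(url)` stands in for the human annotator and returns 0 or 1.
    rng = random.Random(seed)
    unlabeled = list(pool)
    rng.shuffle(unlabeled)
    labeled, unlabeled = unlabeled[:seed_size], unlabeled[seed_size:]
    labels = [oracle(u) for u in labeled]
    weights = train(labeled, labels)
    for _ in range(rounds):
        if not unlabeled:
            break
        # Uncertainty sampling: query the URLs whose score is closest to 0.5.
        unlabeled.sort(
            key=lambda u: abs(predict_proba(weights, url_features(u)) - 0.5)
        )
        queries, unlabeled = unlabeled[:batch_size], unlabeled[batch_size:]
        labeled.extend(queries)
        labels.extend(oracle(u) for u in queries)
        weights = train(labeled, labels)  # retrain on the enlarged labeled set
    return weights, labeled, labels
```

With the default arguments, the annotator is asked for at most `seed_size + rounds * batch_size` labels (10), regardless of how large the unlabeled pool is; that bound on annotation effort is the practical point of the approach.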

Author information

Authors and Affiliations

Department of Computer Science and Engineering, CMS College of Engineering and Technology, Coimbatore, 641032, India
P. Ponni
Department of Computer Science and Engineering, Sri Krishna College of Engineering and Technology, Coimbatore, 641008, India
D. Prabha

Corresponding author

Correspondence to P. Ponni.

Editor information

Editors and Affiliations

Bharath Institute of Higher Education and Research, Chennai, India
Rabindra Nath Shaw
Systems Research Institute, Warsaw, Poland
Marcin Paprzycki
The Neotia University, Sarisha, India
Ankush Ghosh

About this paper

Cite this paper

Ponni, P., Prabha, D. (2023). Randomized Active Learning to Identify Phishing URL. In: Shaw, R.N., Paprzycki, M., Ghosh, A. (eds) Advanced Communication and Intelligent Systems. ICACIS 2022. Communications in Computer and Information Science, vol 1749. Springer, Cham. https://doi.org/10.1007/978-3-031-25088-0_47

DOI: https://doi.org/10.1007/978-3-031-25088-0_47
Published: 15 February 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-25087-3
Online ISBN: 978-3-031-25088-0
eBook Packages: Computer Science, Computer Science (R0)
