Feature Selections for Phishing URLs Detection Using Combination of Multiple Feature Selection Methods

Authors:

Abulfaz Hajizada,

Sharmin Jahan

ICMLC '23: Proceedings of the 2023 15th International Conference on Machine Learning and Computing

Pages 444 - 450

https://doi.org/10.1145/3587716.3587790

Published: 07 September 2023

Abstract

In the internet era, users are highly susceptible to phishing attacks, in which attackers use social engineering to persuade and manipulate them. The core goal of such an attack is to steal users' sensitive information or to install malicious software that gives the attacker control over users' devices. Attackers use different approaches to persuade the user; one of the most common is sending a phishing URL that looks legitimate and is difficult to distinguish from a genuine one. Machine learning is a prominent approach to phishing URL detection, and several established machine learning models are already available for this purpose. However, a model's performance depends on selecting appropriate features during model building. In this paper, we combine multiple filter methods for feature selection in a procedural way that reduces a large feature list to a smaller one. We then apply a wrapper method to select the final features for building our phishing detection model. The results show that combining multiple feature selection methods improves the model's detection accuracy. Moreover, because we apply backward feature selection as our wrapper method on a data set with a reduced number of features, the computational time of backward feature selection decreases.
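The filter-then-wrapper procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the abstract does not name the specific filter methods, the classifier, or the data set, so the variance filter, the label-correlation filter, the nearest-centroid classifier, the thresholds, and the synthetic data below are all assumptions chosen to keep the example self-contained.

```python
import random
from statistics import mean, pvariance


def column(X, j):
    return [row[j] for row in X]


def pearson(a, b):
    """Pearson correlation; returns 0.0 if either input is constant."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return 0.0 if va == 0 or vb == 0 else cov / (va * vb) ** 0.5


def variance_filter(X, feats, min_var=1e-3):
    """Filter 1 (assumed): drop near-constant features."""
    return [j for j in feats if pvariance(column(X, j)) > min_var]


def relevance_filter(X, y, feats, keep):
    """Filter 2 (assumed): keep the `keep` features most correlated with the label."""
    ranked = sorted(feats, key=lambda j: abs(pearson(column(X, j), y)), reverse=True)
    return sorted(ranked[:keep])


def accuracy(X_tr, y_tr, X_val, y_val, feats):
    """Validation accuracy of a nearest-centroid classifier (stand-in model)."""
    cents = {}
    for label in set(y_tr):
        rows = [r for r, t in zip(X_tr, y_tr) if t == label]
        cents[label] = [mean([r[j] for r in rows]) for j in feats]
    hits = 0
    for r, t in zip(X_val, y_val):
        pred = min(cents, key=lambda c: sum((r[j] - m) ** 2
                                            for j, m in zip(feats, cents[c])))
        hits += pred == t
    return hits / len(y_val)


def backward_selection(X_tr, y_tr, X_val, y_val, feats):
    """Wrapper: drop one feature at a time while accuracy does not decrease."""
    feats = list(feats)
    best = accuracy(X_tr, y_tr, X_val, y_val, feats)
    while len(feats) > 1:
        scored = [(accuracy(X_tr, y_tr, X_val, y_val,
                            [f for f in feats if f != j]), j) for j in feats]
        score, drop = max(scored)
        if score < best:  # every possible removal hurts, so stop
            break
        best = score
        feats.remove(drop)
    return feats, best


def make_data(n=200, seed=0):
    """Synthetic data: features 0-1 informative, 2 constant, 3-7 pure noise."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        row = [label * 3 + rng.gauss(0, 1), -label * 3 + rng.gauss(0, 1), 1.0]
        row += [rng.gauss(0, 1) for _ in range(5)]
        X.append(row)
        y.append(label)
    return X, y


X, y = make_data()
X_tr, y_tr, X_val, y_val = X[:150], y[:150], X[150:], y[150:]
all_feats = list(range(len(X[0])))

stage1 = variance_filter(X_tr, all_feats)                 # cheap filter
stage2 = relevance_filter(X_tr, y_tr, stage1, keep=4)     # second filter
selected, acc = backward_selection(X_tr, y_tr, X_val, y_val, stage2)
print("after filters:", stage2, "after wrapper:", selected, "accuracy:", acc)
```

Each round of backward selection retrains the model once per remaining feature, so running it on the four filtered features instead of all eight requires far fewer model fits. That is the cost argument the abstract makes for applying the filters first.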

Cited By

Jones M, Bayesh M, Jahan S (2024). Unlocking Deeper Understanding: Leveraging Explainable AI for API Anomaly Detection Insights. Proceedings of the 2024 16th International Conference on Machine Learning and Computing, pp. 211-217. DOI: 10.1145/3651671.3651738. Online publication date: 2-Feb-2024.
https://dl.acm.org/doi/10.1145/3651671.3651738

Published In

ICMLC '23: Proceedings of the 2023 15th International Conference on Machine Learning and Computing

February 2023

619 pages

ISBN:9781450398411

DOI:10.1145/3587716

Copyright © 2023 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

ICMLC 2023

ICMLC 2023: 2023 15th International Conference on Machine Learning and Computing

February 17 - 20, 2023

Zhuhai, China
