DOI: 10.1145/3587716.3587790
research-article

Feature Selections for Phishing URLs Detection Using Combination of Multiple Feature Selection Methods

Published: 07 September 2023

Abstract

In the internet era, users are highly prone to falling victim to phishing attacks, in which attackers apply social engineering to persuade and manipulate them. The core goal of such an attack is to steal users' sensitive information or to install malicious software that gives the attacker control over users' devices. Attackers use different approaches to persuade the user; one of the most common is sending a phishing URL that looks legitimate and is difficult to distinguish from a genuine one. Machine learning is a prominent approach to phishing URL detection, and several established machine learning models are already available for this purpose. However, a model's performance depends on the appropriate selection of features during model building. In this paper, we combine multiple filter methods for feature selection in a procedural way that reduces a large feature list to a shorter one. We then apply a wrapper method to select the final features for building our phishing detection model. The results show that combining multiple feature selection methods improves the model's detection accuracy. Moreover, because we apply backward feature selection as our wrapper method on the data set with the reduced number of features, the computational time for backward feature selection decreases.
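
The filter-then-wrapper procedure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not name the specific filter methods, thresholds, or classifier, so the two filters (a variance threshold and a pairwise-correlation filter), the cut-off values `min_var` and `max_corr`, the nearest-centroid classifier, and the synthetic four-feature data set below are all assumptions chosen only to keep the example self-contained. For brevity the wrapper scores candidate subsets on a single hold-out split; a real study would more likely use cross-validation.

```python
import random
from statistics import mean, pvariance


def pearson(xs, ys):
    """Pearson correlation of two sequences (0.0 if either one is constant)."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy) if sx and sy else 0.0


def column(X, j):
    return [row[j] for row in X]


def filter_stage(X, y, min_var=1e-3, max_corr=0.9):
    """Chain two filters (both thresholds are illustrative assumptions)."""
    # Filter 1: drop near-constant features.
    kept = [j for j in range(len(X[0])) if pvariance(column(X, j)) > min_var]
    # Filter 2: visit features from most to least label-correlated and skip
    # any feature that is highly correlated with one already selected.
    kept.sort(key=lambda j: -abs(pearson(column(X, j), y)))
    selected = []
    for j in kept:
        if all(abs(pearson(column(X, j), column(X, k))) < max_corr for k in selected):
            selected.append(j)
    return sorted(selected)


def accuracy(X_train, y_train, X_test, y_test, feats):
    """Hold-out accuracy of a nearest-centroid classifier restricted to `feats`."""
    centroids = {}
    for label in set(y_train):
        rows = [r for r, t in zip(X_train, y_train) if t == label]
        centroids[label] = [mean(r[j] for r in rows) for j in feats]

    def distance(row, label):
        return sum((row[j] - m) ** 2 for j, m in zip(feats, centroids[label]))

    hits = sum(min(centroids, key=lambda c: distance(row, c)) == target
               for row, target in zip(X_test, y_test))
    return hits / len(y_test)


def backward_selection(X_train, y_train, X_test, y_test, feats):
    """Wrapper stage: drop one feature at a time while accuracy does not fall."""
    feats = list(feats)
    best = accuracy(X_train, y_train, X_test, y_test, feats)
    while len(feats) > 1:
        candidates = [
            (accuracy(X_train, y_train, X_test, y_test,
                      [f for f in feats if f != d]), d)
            for d in feats
        ]
        score, drop = max(candidates)
        if score < best:
            break  # every possible removal hurts accuracy, so stop
        best, feats = score, [f for f in feats if f != drop]
    return feats, best


def make_data(n=200, seed=0):
    """Synthetic stand-in for URL features: informative, duplicate, noise, constant."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        signal = 3.0 * label + rng.gauss(0, 0.5)
        X.append([signal, signal + rng.gauss(0, 0.01), rng.gauss(0, 1), 1.0])
        y.append(label)
    return X, y


X, y = make_data()
split = 140  # first 140 rows for fitting, the rest held out for scoring
filtered = filter_stage(X[:split], y[:split])
final, score = backward_selection(X[:split], y[:split], X[split:], y[split:], filtered)
print("features kept by the filters:", filtered)
print("features kept by backward selection:", final, "accuracy:", round(score, 3))
```

The sketch also shows why the ordering matters for run time: backward selection refits the classifier once per remaining feature in every round, so running it on the filtered list requires fewer fits than running it on the full feature list. With scikit-learn, the wrapper stage would typically be written with `SequentialFeatureSelector(direction="backward")` rather than by hand.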


Cited By

  • (2024) Unlocking Deeper Understanding: Leveraging Explainable AI for API Anomaly Detection Insights. Proceedings of the 2024 16th International Conference on Machine Learning and Computing, 211–217. DOI: 10.1145/3651671.3651738. Online publication date: 2-Feb-2024


        Published In

        ACM Other conferences
        ICMLC '23: Proceedings of the 2023 15th International Conference on Machine Learning and Computing
        February 2023
        619 pages
        ISBN: 9781450398411
        DOI: 10.1145/3587716

        Publisher

        Association for Computing Machinery

        New York, NY, United States


        Author Tags

        1. Correlation
        2. Feature Selection
        3. Machine Learning Model
        4. Phishing

        Qualifiers

        • Research-article
        • Research
        • Refereed limited

        Conference

        ICMLC 2023
