DOI: 10.1145/3631991.3632011
research-article

Do Not Feed the Phish: Phishing Website Detection Using URL-based Features

Published: 26 December 2023

Abstract

The pandemic pushed many industries toward digital services. With most transactions conducted online, people have become more susceptible to phishing, and a steady rise in phishing attacks has been observed. Although cybersecurity awareness helps counter these attacks, existing phishing detection approaches still need improvement, because phishing websites continue to evolve as phishers work to bypass current detection methods. This study proposes a phishing detection model that uses URL-based features to detect phishing websites. Models were developed using two well-known classifiers, XGBoost and Random Forest. The XGBoost classifier performed best, with an accuracy of 96.51% and a kappa of 0.930.
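The abstract does not list the URL-based features that were used, so the following is only a minimal sketch of the general idea, not the authors' method. The function names and the feature set (`url_length`, `host_is_ip`, and so on) are illustrative assumptions. Feature dictionaries of this kind could then be passed to a classifier such as XGBoost or Random Forest; that training step is not shown. The second function computes Cohen's kappa, the agreement metric reported alongside accuracy.

```python
from urllib.parse import urlparse
import ipaddress


def extract_url_features(url: str) -> dict:
    """Return a few lexical URL features (a hypothetical feature set, not the paper's)."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    # A raw IP address in place of a domain name is a commonly cited phishing signal.
    try:
        ipaddress.ip_address(host)
        host_is_ip = True
    except ValueError:
        host_is_ip = False
    return {
        "url_length": len(url),
        "host_length": len(host),
        "num_dots_in_host": host.count("."),
        "num_hyphens_in_host": host.count("-"),
        "num_digits": sum(ch.isdigit() for ch in url),
        "has_at_symbol": "@" in url,
        "host_is_ip": host_is_ip,
        "uses_https": parsed.scheme == "https",
    }


def cohen_kappa(y_true: list, y_pred: list) -> float:
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    labels = set(y_true) | set(y_pred)
    # Fraction of samples where the prediction matches the label.
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    # Agreement expected by chance, from the marginal class frequencies.
    expected = sum((y_true.count(c) / n) * (y_pred.count(c) / n) for c in labels)
    if expected == 1.0:
        return 1.0  # degenerate case: a single class in both lists
    return (observed - expected) / (1 - expected)
```

As a rough consistency check, if the test labels and the predictions were both evenly split between the two classes (an assumption; the abstract does not say), chance agreement would be 0.5 and an accuracy of 0.9651 would give a kappa of (0.9651 - 0.5) / 0.5 ≈ 0.930, in line with the reported figures.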

Cited By

  • (2024) Transfer Learning for Phishing Detection: Screenshot-Based Website Classification. 2024 9th International Conference on Computer Science and Engineering (UBMK), 1-6. DOI: 10.1109/UBMK63289.2024.10773490. Online publication date: 26-Oct-2024
  • (2024) A State-of-the-Art Review on Phishing Website Detection Techniques. IEEE Access, 12, 187976-188012. DOI: 10.1109/ACCESS.2024.3514972. Online publication date: 2024

    Information

    Published In

    WSSE '23: Proceedings of the 2023 5th World Symposium on Software Engineering
    September 2023
    352 pages
    ISBN:9798400708053
    DOI:10.1145/3631991
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. Phishing
    2. Random Forest
    3. URL
    4. XGBoost
    5. machine learning

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    WSSE 2023
