Stop-Phish: an intelligent phishing detection method using feature selection ensemble

Ramana, A. V.; Rao, K. Lakshmana; Rao, Routhu Srinivasa

doi:10.1007/s13278-021-00829-w

Original Article
Published: 30 October 2021

Volume 11, article number 110 (2021)

Social Network Analysis and Mining

Abstract

Phishing is a cyber-attack in which a fake website imitates a trusted one to steal sensitive information such as usernames, passwords, and credit card details. Despite the availability of several anti-phishing approaches, online users are still tricked into revealing sensitive information. In this paper, we propose an intelligent model that uses an ensemble of feature selection techniques to detect phishing sites. We evaluated various machine learning algorithms to identify the best classifier and developed an ensemble model with the Random Forest, Decision Tree, and XGBoost algorithms. We also applied various feature selection ensembles to the classification of phishing websites. In our experiments, the model achieved an accuracy of 97.51% on the dataset from UCI (Dataset 1) and 98.45% on the phishing dataset for machine learning from Mendeley (Dataset 2). The proposed model also outperformed the baseline models by a significant margin.
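The abstract does not say how the individual feature selectors or classifiers are combined, so the following is only a minimal sketch of one common way to build such ensembles. It assumes that each selection method yields a score per feature, that a feature is kept when enough methods rank it in their top k, and that the classifiers are combined by majority vote. The function names, feature names, method names, and scores are all hypothetical and are not taken from the paper.

```python
from collections import Counter
from typing import Dict, List, Sequence


def top_k(scores: Dict[str, float], k: int) -> List[str]:
    """Return the k highest-scoring feature names for one selection method."""
    return sorted(scores, key=scores.get, reverse=True)[:k]


def ensemble_select(method_scores: Dict[str, Dict[str, float]],
                    k: int, min_votes: int) -> List[str]:
    """Keep features that land in the top-k list of at least min_votes methods."""
    votes: Counter = Counter()
    for scores in method_scores.values():
        votes.update(top_k(scores, k))
    return sorted(f for f, v in votes.items() if v >= min_votes)


def majority_vote(predictions: Sequence[Sequence[int]]) -> List[int]:
    """Combine per-classifier label lists into one label per sample."""
    return [Counter(sample).most_common(1)[0][0] for sample in zip(*predictions)]


# Hypothetical per-method feature scores (assumed values, for illustration only).
method_scores = {
    "chi2": {"has_ip": 0.9, "url_length": 0.7, "https": 0.2, "num_dots": 0.4},
    "info_gain": {"has_ip": 0.8, "url_length": 0.3, "https": 0.6, "num_dots": 0.5},
    "rf_importance": {"has_ip": 0.5, "url_length": 0.6, "https": 0.1, "num_dots": 0.7},
}
selected = ensemble_select(method_scores, k=2, min_votes=2)

# Hypothetical predictions (1 = phishing, 0 = legitimate) from three classifiers,
# standing in for Random Forest, Decision Tree, and XGBoost outputs.
combined = majority_vote([[1, 0, 1], [1, 1, 0], [0, 0, 1]])
```

With an odd number of binary classifiers, the majority vote never ties. The vote threshold `min_votes` controls how strict the feature selection is: a higher value keeps only features that several methods agree on.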

Acknowledgements

The authors would like to thank the Ministry of Electronics & Information Technology (MeitY), Government of India, for supporting part of this research.

Author information

Authors and Affiliations

AI Research Lab, Department of Computer Science and Engineering, GMR Institute of Technology Rajam, Srikakulam, India, 532127
A. V. Ramana, K. Lakshmana Rao & Routhu Srinivasa Rao

Corresponding author

Correspondence to A. V. Ramana.

Ethics declarations

Conflict of Interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Ramana, A.V., Rao, K.L. & Rao, R.S. Stop-Phish: an intelligent phishing detection method using feature selection ensemble. Soc. Netw. Anal. Min. 11, 110 (2021). https://doi.org/10.1007/s13278-021-00829-w

Received: 23 June 2021
Revised: 10 October 2021
Accepted: 13 October 2021
Published: 30 October 2021
DOI: https://doi.org/10.1007/s13278-021-00829-w
