Consensus and majority vote feature selection methods and a detection technique for web phishing

Alotaibi, Bandar; Alotaibi, Munif

doi:10.1007/s12652-020-02054-3

Original Research
Published: 29 May 2020

Volume 12, pages 717–727 (2021)

Journal of Ambient Intelligence and Humanized Computing

Abstract

Phishing is one of the most common forms of cybercrime that Internet users face and a violation of cybersecurity principles. It is a fraudulent attack carried out over the Internet to obtain and use, without authorization, users’ sensitive information, such as usernames, passwords, credit card details, and bank account information. Widely used phishing attempts rely on email spoofing or instant messaging to convince a victim to visit a spoofed website, which then captures the victim’s information. In this work, we identify and analyze the most important features needed to detect spoofed websites by means of two new feature selection techniques. In both techniques, a set of underlying feature selection methods votes on each feature. In the first (consensus) technique, a feature is selected only if all of the underlying methods agree on it. In the second (majority vote) technique, a feature is selected if a majority of the underlying methods vote for it. We also propose a phishing detection technique based on the AdaBoost and LightGBM ensemble methods to detect spoofed websites. The proposed method achieves very high accuracy compared with existing methods.
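
The two voting rules described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not name the underlying feature selection methods, so each one is represented here simply by the set of features it chose, and the feature names, function names, and the reading of "majority" as "more than half of the selectors" are all assumptions made for illustration.

```python
from collections import Counter
from typing import Iterable, List, Set


def count_votes(selections: Iterable[Set[str]]) -> Counter:
    """Count how many underlying selectors picked each feature."""
    votes = Counter()
    for chosen in selections:
        votes.update(set(chosen))  # one vote per selector per feature
    return votes


def consensus_selection(selections: List[Set[str]]) -> Set[str]:
    """Keep a feature only if every underlying selector chose it."""
    votes = count_votes(selections)
    return {f for f, n in votes.items() if n == len(selections)}


def majority_selection(selections: List[Set[str]]) -> Set[str]:
    """Keep a feature if more than half of the selectors chose it."""
    votes = count_votes(selections)
    return {f for f, n in votes.items() if n > len(selections) / 2}


# Hypothetical outputs of three underlying selectors (feature names invented).
selector_outputs = [
    {"url_length", "has_ip_address", "ssl_state"},
    {"url_length", "ssl_state", "domain_age"},
    {"url_length", "has_ip_address", "domain_age"},
]

consensus = consensus_selection(selector_outputs)
majority = majority_selection(selector_outputs)
print(sorted(consensus))
print(sorted(majority))
```

In this toy input only `url_length` is chosen by all three selectors, so the consensus rule keeps a single feature, while the majority rule keeps every feature that received at least two of the three votes. The consensus rule is therefore the stricter of the two and will generally yield a smaller feature set.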

Author information

Authors and Affiliations

University of Tabuk, Tabuk, 71491, Saudi Arabia
Bandar Alotaibi
Shaqra University, Shaqra, Saudi Arabia
Munif Alotaibi

Corresponding author

Correspondence to Munif Alotaibi.

About this article

Cite this article

Alotaibi, B., Alotaibi, M. Consensus and majority vote feature selection methods and a detection technique for web phishing. J Ambient Intell Human Comput 12, 717–727 (2021). https://doi.org/10.1007/s12652-020-02054-3

Received: 04 September 2019
Accepted: 30 April 2020
Published: 29 May 2020
Issue Date: January 2021
DOI: https://doi.org/10.1007/s12652-020-02054-3
