Abstract
Although anti-phishing solutions have been widely publicized, phishing attacks remain a serious problem. This paper proposes a novel phishing webpage detection algorithm based on webpage noise and n-grams. The algorithm first extracts the webpage noise from a suspicious website and expresses it as a feature vector using n-grams. It then calculates the similarity between the feature vectors of the protected website and the suspicious website. Experimental results on phishing site sample data show that the algorithm detects whether a site is a phishing website more effectively, accurately, and quickly than existing algorithms.
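The pipeline described in the abstract (noise extraction, n-gram feature vector, similarity) can be illustrated with a minimal Python sketch. The abstract does not define how noise is extracted, which n-grams are used, or which similarity measure is applied, so everything below is an assumption for illustration only: noise is crudely approximated as the text inside `nav`, `footer`, `script`, and `style` elements, the n-grams are character trigrams, and the similarity is cosine similarity. The names `NOISE_TAGS`, `extract_noise`, `ngram_vector`, and `noise_similarity` are hypothetical and do not come from the paper.

```python
import math
from collections import Counter
from html.parser import HTMLParser

# Assumption: elements whose content is treated as "noise" (not from the paper).
NOISE_TAGS = {"nav", "footer", "script", "style"}


class NoiseExtractor(HTMLParser):
    """Collects the text found inside NOISE_TAGS elements."""

    def __init__(self):
        super().__init__()
        self.depth = 0    # current nesting depth inside noise elements
        self.chunks = []  # noise text fragments, in document order

    def handle_starttag(self, tag, attrs):
        if tag in NOISE_TAGS:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in NOISE_TAGS and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth > 0 and data.strip():
            self.chunks.append(data.strip())


def extract_noise(html):
    """Return the noise text of a page as one space-joined string."""
    parser = NoiseExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def ngram_vector(text, n=3):
    """Represent text as counts of overlapping character n-grams."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def cosine_similarity(a, b):
    """Cosine similarity of two sparse count vectors (0.0 if either is empty)."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def noise_similarity(protected_html, suspicious_html, n=3):
    """Similarity between the noise n-gram vectors of two pages."""
    return cosine_similarity(
        ngram_vector(extract_noise(protected_html), n),
        ngram_vector(extract_noise(suspicious_html), n),
    )
```

In such a sketch, a suspicious page hosted on a different domain whose noise similarity to the protected page exceeds some threshold would be flagged as a likely phishing copy. The threshold would have to be tuned on labeled data and is not given in the abstract.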
Acknowledgment
This study was supported by the National Natural Science Foundation of China (No. 61304208), the Hunan Province Natural Science Foundation of China (No. 13JJ2031), and the Youth Scientific Research Foundation of Central South University of Forestry & Technology (No. QJ2012009A).
© 2016 Springer International Publishing AG
About this paper
Cite this paper
Deng, Q., Huang, H., Pan, L., Pang, S., Qin, J. (2016). A Phishing Webpage Detecting Algorithm Using Webpage Noise and N-Gram. In: Sun, X., Liu, A., Chao, HC., Bertino, E. (eds) Cloud Computing and Security. ICCCS 2016. Lecture Notes in Computer Science, vol 10039. Springer, Cham. https://doi.org/10.1007/978-3-319-48671-0_15
DOI: https://doi.org/10.1007/978-3-319-48671-0_15
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-48670-3
Online ISBN: 978-3-319-48671-0
eBook Packages: Computer Science (R0)