A Phishing Webpage Detecting Algorithm Using Webpage Noise and N-Gram

  • Conference paper
Cloud Computing and Security (ICCCS 2016)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10039)

Abstract

Although anti-phishing solutions have been highly publicized, phishing attacks remain a serious problem. This paper proposes a novel phishing webpage detection algorithm that uses webpage noise and n-grams. First, the algorithm extracts the webpage noise from a suspicious website and expresses it as a feature vector using n-grams. Then, the similarity between the feature vectors of the protected website and the suspicious website is calculated. Experimental results on phishing site sample data show that the algorithm is more effective, more accurate, and faster than existing algorithms at detecting whether a site is a phishing website.
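
The two later steps in the abstract (turning noise into an n-gram feature vector, then comparing vectors) can be illustrated with a minimal sketch. The abstract does not say how noise is extracted, which n is used, or which similarity measure is applied, so everything here is an assumption for illustration: the noise is taken as an already extracted string, the n-grams are character trigrams, and the measure is cosine similarity. The function names are hypothetical and are not taken from the paper.

```python
from collections import Counter
from math import sqrt


def ngram_vector(text: str, n: int = 3) -> Counter:
    """Count overlapping character n-grams (n=3 is an assumed default)."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity of two sparse count vectors; 0.0 if either is empty."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = sqrt(sum(c * c for c in a.values()))
    norm_b = sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def noise_similarity(protected_noise: str, suspicious_noise: str, n: int = 3) -> float:
    """Compare the noise of a protected page with that of a suspicious page.

    Both arguments are assumed to be noise text that was extracted elsewhere;
    the extraction step itself is not sketched here.
    """
    return cosine_similarity(
        ngram_vector(protected_noise, n),
        ngram_vector(suspicious_noise, n),
    )
```

Under these assumptions, a score close to 1 means the suspicious page's noise closely matches that of the protected page. Any decision threshold would have to be chosen empirically, since the abstract does not state one.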

Acknowledgment

This study is supported by the National Natural Science Foundation of China (No. 61304208), the Hunan Province Natural Science Foundation of China (No. 13JJ2031), and the Youth Scientific Research Foundation of Central South University of Forestry & Technology (No. QJ2012009A).

Corresponding author

Correspondence to Huajun Huang.

Copyright information

© 2016 Springer International Publishing AG

About this paper

Cite this paper

Deng, Q., Huang, H., Pan, L., Pang, S., Qin, J. (2016). A Phishing Webpage Detecting Algorithm Using Webpage Noise and N-Gram. In: Sun, X., Liu, A., Chao, HC., Bertino, E. (eds) Cloud Computing and Security. ICCCS 2016. Lecture Notes in Computer Science, vol 10039. Springer, Cham. https://doi.org/10.1007/978-3-319-48671-0_15

  • DOI: https://doi.org/10.1007/978-3-319-48671-0_15

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-48670-3

  • Online ISBN: 978-3-319-48671-0

  • eBook Packages: Computer Science (R0)
