An Evaluation of Machine Learning-Based Methods for Detection of Phishing Sites

  • Conference paper
Advances in Neuro-Information Processing (ICONIP 2008)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 5506)

Abstract

In this paper, we evaluate the performance of machine learning-based methods for detecting phishing sites. We employ nine machine learning techniques: AdaBoost, Bagging, Support Vector Machines, Classification and Regression Trees, Logistic Regression, Random Forests, Neural Networks, Naive Bayes, and Bayesian Additive Regression Trees. Each technique combines the outputs of several heuristics, and the resulting machine learning-based detection method is used to distinguish phishing sites from legitimate ones. We analyze a dataset composed of 1,500 phishing sites and 1,500 legitimate sites, classify the sites with each machine learning-based detection method, and measure the performance. In our evaluation, we use the f1 measure, the error rate, and the Area Under the ROC Curve (AUC) as performance metrics, along with our requirements for detection methods. The highest f1 measure is 0.8581, the lowest error rate is 14.15%, and the highest AUC is 0.9342, all of which are observed for AdaBoost. We also observe that 7 of the 9 machine learning-based detection methods outperform the traditional detection method.
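The three metrics named in the abstract can be illustrated with a short sketch. This is not the authors' evaluation code. It assumes the standard textbook definitions (f1 as the harmonic mean of precision and recall, error rate as the fraction of misclassified sites, AUC as the probability that a randomly chosen phishing site scores higher than a randomly chosen legitimate one). The labels, scores, and the 0.5 decision threshold are made-up illustrative values, and the function names are hypothetical.

```python
from typing import Sequence, Tuple


def f1_and_error_rate(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (f1, error_rate) for binary labels, where 1 = phishing, 0 = legitimate."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    # Error rate: share of all sites that were misclassified.
    error_rate = (fp + fn) / len(y_true)
    return f1, error_rate


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC via pairwise comparison of positive and negative scores (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Illustrative data only (not from the paper): 3 phishing and 3 legitimate sites.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
# Assumed decision threshold of 0.5 turns scores into hard predictions.
preds = [1 if s >= 0.5 else 0 for s in scores]

f1, err = f1_and_error_rate(labels, preds)
area = auc(labels, scores)
print(f"f1={f1:.4f} error_rate={err:.4f} auc={area:.4f}")
```

Note that f1 and the error rate depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is one reason to report all three together.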

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Miyamoto, D., Hazeyama, H., Kadobayashi, Y. (2009). An Evaluation of Machine Learning-Based Methods for Detection of Phishing Sites. In: Köppen, M., Kasabov, N., Coghill, G. (eds) Advances in Neuro-Information Processing. ICONIP 2008. Lecture Notes in Computer Science, vol 5506. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-02490-0_66

  • DOI: https://doi.org/10.1007/978-3-642-02490-0_66

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-02489-4

  • Online ISBN: 978-3-642-02490-0

  • eBook Packages: Computer Science, Computer Science (R0)
