Abstract
We propose a new phishing detection heuristic based on the search results returned from popular web search engines such as Google, Bing and Yahoo. The full URL of a website a user intends to access is used as the search string, and the number of results returned and ranking of the website are used for classification. Most of the time, legitimate websites get back large number of results and are ranked first, whereas phishing websites get back no result and/or are not ranked at all.
To demonstrate the effectiveness of our approach, we experimented with four well-known classification algorithms – Linear Discriminant Analysis, Naïve Bayesian, K-Nearest Neighbour, and Support Vector Machine – and observed their performance. The K-Nearest Neighbour algorithm performed best, achieving true positive rate of 98% and false positive and false negative rates of 2%. We used new legitimate websites and phishing websites as our dataset to show that our approach works well even on newly launched websites/webpages – such websites are often misclassified in existing blacklisting and whitelisting approaches.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Preview
Unable to display preview. Download preview PDF.
Similar content being viewed by others
References
Aaron, G., Rasmussen, R.: Global phishing survey: Trends and domain name use in 2h2009 (May 2010), http://www.antiphishing.org/reports/APWG_GlobalPhishingSurvey_2H2009.pdf
Cao, Y., Han, W., Le, Y.: Anti-phishing based on automated individual white-list. In: DIM 2008: Proceedings of the 4th ACM Workshop on Digital Identity Management, pp. 51–60. ACM, New York (2008)
Chou, N., Ledesma, R., Teraguchi, Y., Mitchell, J.C.: Client-Side Defense Against Web-Based Identity Theft. In: NDSS 2004: Proceedings of the Network and Distributed System Security Symposium (2004)
Cristianini, N., Shawe-Taylor, J.: An introduction to support vector machines: and other kernel-based learning methods, 1st edn. Cambridge University Press (March 2000)
Domeniconi, C., Peng, J., Gunopulos, D.: Locally adaptive metric nearest-neighbor classification. IEEE Transactions on Pattern Analysis and Machine Intelligence 24, 1281–1285 (2002)
Domingos, P., Pazzani, M.: On the Optimality of the Simple Bayesian Classifier under Zero-One Loss. Machine Learning 29(2-3), 103–130 (1997)
Fu, A.Y., Wenyin, L., Deng, X.: Detecting Phishing Web Pages with Visual Similarity Assessment Based on Earth Mover’s Distance (EMD). IEEE Transactions on Dependable and Secure Computing 3(4), 301–311 (2006)
Fukunaga, K.: Introduction to statistical pattern recognition, 2nd edn. Academic Press Professional, Inc., San Diego (1990)
Hearst, M.A., Dumais, S.T., Osman, E., Platt, J., Scholkopf, B.: Support vector machines. IEEE Intelligent Systems and their Applications 13(4), 18–28 (1998)
Bian, K., Park, J.-M., Hsiao, M.S., Belanger, F., Hiller, J.: Evaluation of Online Resources in Assisting Phishing Detection. In: Ninth Annual International Symposium on Applications and the Internet, SAINT 2009, pp. 30–36. IEEE Computer Society, Bellevue (2009)
Kim, H., Huh, J.H.: Detecting DNS-poisoning-based phishing attacks from their network performance characteristic. Electronics Letters 47(11), 656–658 (2011)
Kirda, E., Kruegel, C.: Protecting Users Against Phishing Attacks with AntiPhish. In: COMPSAC 2005: Proceedings of the 29th Annual International Computer Software and Applications Conference, pp. 517–524. IEEE Computer Society, Washington, DC, USA (2005)
Latourrette, M.: Toward an Explanatory Similarity Measure for Nearest-Neighbor Classification. In: Lopez de Mantaras, R., Plaza, E. (eds.) ECML 2000. LNCS (LNAI), vol. 1810, pp. 238–245. Springer, Heidelberg (2000)
Liu, W., Deng, X., Huang, G., Fu, A.Y.: An Antiphishing Strategy Based on Visual Similarity Assessment. IEEE Internet Computing 10(2), 58–65 (2006)
Moore, T., Clayton, R.: Examining the impact of website take-down on phishing. In: eCrime 2007: Proceedings of the Anti-Phishing Working Groups 2nd Annual eCrime Researchers Summit, pp. 1–13. ACM, New York (2007)
Pan, Y., Ding, X.: Anomaly Based Web Phishing Page Detection. In: ACSAC 2006: Proceedings of the 22nd Annual Computer Security Applications Conference, pp. 381–392. IEEE Computer Society, Washington, DC, USA (2006)
Rish, I.: An empirical study of the naive Bayes classifier. In: Proceedings of IJCAI-2001 Workshop on Empirical Methods in Artificial Intelligence (2001)
Ronda, T., Saroiu, S., Wolman, A.: Itrustpage: a user-assisted anti-phishing tool. ACM SIGOPS Operating Systems Review 42(4), 261–272 (2008)
Sheng, S., Wardman, B., Warner, G., Cranor, L.F., Hong, J., Zhang, C.: An empirical analysis of phishing blacklists. In: CEAS 2009: Proceedings of the 6th Conference on Email and Anti-Spam (2009)
Vapnik, V.N.: Statistical Learning Theory. Wiley-Interscience (September 1998)
Wu, X., Kumar, V., Quinlan, J.R., Ghosh, J., Yang, Q., Motoda, H., McLachlan, G.J., Ng, A., Liu, B., Yu, P.S., Zhou, Z.-H., Steinbach, M., Hand, D.J., Steinberg, D.: Top 10 algorithms in data mining. Knowledge and Information Systems 14(1), 1–37 (2007)
Xiang, G., Hong, J.I.: A hybrid phish detection approach by identity discovery and keywords retrieval. In: WWW 2009: Proceedings of the 18th international conference on World wide web, pp. 571–580. ACM, New York (2009)
Zhang, Y., Egelman, S., Cranor, L., Hong, J.: Phinding Phish: Evaluating Anti-Phishing Tools. In: NDSS 2007: Proceedings of the 14th Annual Network and Distributed System Security Symposium (2007)
Zhang, Y., Hong, J.I., Cranor, L.F.: Cantina: a content-based approach to detecting phishing web sites. In: WWW 2007: Proceedings of the 16th International Conference on World Wide Web, pp. 639–648. ACM, New York (2007)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Huh, J.H., Kim, H. (2012). Phishing Detection with Popular Search Engines: Simple and Effective. In: Garcia-Alfaro, J., Lafourcade, P. (eds) Foundations and Practice of Security. FPS 2011. Lecture Notes in Computer Science, vol 6888. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-27901-0_15
Download citation
DOI: https://doi.org/10.1007/978-3-642-27901-0_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-27900-3
Online ISBN: 978-3-642-27901-0
eBook Packages: Computer ScienceComputer Science (R0)