Detecting Webspam Beneficiaries Using Information Collected by the Random Surfer

Thomas Largillier, Sylvain Peyronnet
Copyright: © 2011 | Volume: 2 | Issue: 2 | Pages: 13
ISSN: 1947-9344 | EISSN: 1947-9352 | EISBN13: 9781613508800 | DOI: 10.4018/joci.2011040103
Cite Article

MLA

Largillier, Thomas, and Sylvain Peyronnet. "Detecting Webspam Beneficiaries Using Information Collected by the Random Surfer." IJOCI, vol. 2, no. 2, 2011, pp. 36-48. http://doi.org/10.4018/joci.2011040103

APA

Largillier, T. & Peyronnet, S. (2011). Detecting Webspam Beneficiaries Using Information Collected by the Random Surfer. International Journal of Organizational and Collective Intelligence (IJOCI), 2(2), 36-48. http://doi.org/10.4018/joci.2011040103

Chicago

Largillier, Thomas, and Sylvain Peyronnet. "Detecting Webspam Beneficiaries Using Information Collected by the Random Surfer." International Journal of Organizational and Collective Intelligence (IJOCI) 2, no. 2 (2011): 36-48. http://doi.org/10.4018/joci.2011040103

Abstract

Search engines use several criteria to rank webpages and to choose which pages to display in response to a query. These criteria fall into two categories: relevance and popularity. Popularity is computed by the search engine from the links pointing to a webpage. Malicious webmasters try to inflate the popularity of their pages artificially; the techniques they use are commonly referred to as Webspam. Webspam takes many forms and evolves constantly, but it usually consists of building a dedicated structure of spam pages around a given target page. A search engine must address Webspam; otherwise, it cannot provide users with fair and reliable results. In this paper, the authors propose a technique to identify Webspam through the frequency language associated with random walks among those dedicated structures. The authors identify the language by calculating the frequency of appearance of k-grams on random walks launched from every node.
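
The k-gram counting step mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say which alphabet the walks are read over, so the per-node `labels` mapping, the function name `kgram_frequencies`, and the parameters `walk_length` and `walks_per_node` are all assumptions made here for illustration.

```python
import random
from collections import Counter


def kgram_frequencies(graph, labels, k=3, walk_length=20, walks_per_node=10, seed=0):
    """Relative frequency of label k-grams seen on random walks from every node.

    graph  : dict mapping each node to a list of its out-neighbours
    labels : dict mapping each node to a symbol (hypothetical alphabet,
             not specified in the abstract)
    """
    rng = random.Random(seed)  # seeded so that runs are reproducible
    counts = Counter()
    for start in graph:
        for _ in range(walks_per_node):
            node = start
            word = [labels[node]]
            for _ in range(walk_length - 1):
                successors = graph[node]
                if not successors:  # page without out-links: stop this walk
                    break
                node = rng.choice(successors)
                word.append(labels[node])
            # Slide a window of size k over the symbols read along the walk.
            for i in range(len(word) - k + 1):
                counts[tuple(word[i:i + k])] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {gram: c / total for gram, c in counts.items()}


# Toy graph (assumed): pages "a" and "b" link only to "target", which links back.
graph = {
    "target": ["a", "b"],
    "a": ["target"],
    "b": ["target"],
    "c": ["target", "d"],
    "d": [],
}
labels = {"target": "t", "a": "s", "b": "s", "c": "n", "d": "n"}
freqs = kgram_frequencies(graph, labels, k=2, walk_length=5, walks_per_node=4, seed=1)
```

In this toy graph, walks that enter the tightly linked group around `target` keep alternating between `t` and `s`, so the 2-grams `("t", "s")` and `("s", "t")` dominate the resulting distribution. How such a distribution is then turned into a spam decision is not described in the abstract and is left out of the sketch.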
