Detecting Search Engine Spam from a Trackback Network in Blogspace

  • Conference paper
Knowledge-Based Intelligent Information and Engineering Systems (KES 2005)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3684)

Abstract

We aim to develop a technique for detecting search engine optimization (SEO) spam websites. Specifically, we propose four methods, each based on a fundamental network metric, for extracting SEO spam entries from a given trackback network in blogspace. Using real trackback network data from blogspace, we experimentally evaluate the performance of the proposed methods and demonstrate that ranking entries by the average degree of their nearest neighbors is a very promising approach for extracting SEO spam entries.
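The ranking idea named in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact definition used in the paper, so the code below assumes the standard average nearest-neighbor degree on an undirected graph: for each entry, the mean degree of the entries it is linked to. The function names, the toy edge list, and the choice to rank in descending order are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def average_neighbor_degree(edges):
    """Return {node: mean degree of its neighbors} for an undirected graph.

    Assumption: trackback links are treated as undirected edges and
    duplicate links are collapsed; the paper may define this differently.
    """
    neighbors = defaultdict(set)
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        neighbors[u].add(v)
        neighbors[v].add(u)
    degree = {node: len(nbrs) for node, nbrs in neighbors.items()}
    return {
        node: sum(degree[m] for m in nbrs) / len(nbrs)
        for node, nbrs in neighbors.items()
    }


def rank_entries(edges, descending=True):
    """Rank nodes by average neighbor degree, breaking ties by node name.

    Assumption: the sort direction is a free parameter here; the abstract
    does not say whether high or low values indicate spam.
    """
    scores = average_neighbor_degree(edges)
    sign = -1 if descending else 1
    return sorted(scores, key=lambda node: (sign * scores[node], node))


# Hypothetical toy network: a densely interlinked group (s1..s4)
# next to a short chain of ordinary entries (a - b - c).
toy_edges = [
    ("s1", "s2"), ("s1", "s3"), ("s1", "s4"),
    ("s2", "s3"), ("s2", "s4"), ("s3", "s4"),
    ("a", "b"), ("b", "c"),
]

scores = average_neighbor_degree(toy_edges)
ranking = rank_entries(toy_edges)
```

In this toy network the densely interlinked entries receive the highest scores, so they appear first in `ranking`. Whether such a pattern corresponds to SEO spam in real trackback data is the paper's empirical question and is not established by this sketch.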

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Kimura, M., Saito, K., Kazama, K., Sato, Sy. (2005). Detecting Search Engine Spam from a Trackback Network in Blogspace. In: Khosla, R., Howlett, R.J., Jain, L.C. (eds) Knowledge-Based Intelligent Information and Engineering Systems. KES 2005. Lecture Notes in Computer Science, vol 3684. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11554028_101

  • DOI: https://doi.org/10.1007/11554028_101

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-28897-8

  • Online ISBN: 978-3-540-31997-9

  • eBook Packages: Computer Science, Computer Science (R0)
