research-article

Combating spam in tagging systems: An evaluation

Published: 27 October 2008

Abstract

Tagging systems allow users to interactively annotate a pool of shared resources using descriptive strings called tags. Tags are used to guide users to interesting resources and help them build communities that share their expertise and resources. As tagging systems are gaining in popularity, they become more susceptible to tag spam: misleading tags that are generated in order to increase the visibility of some resources or simply to confuse users. Our goal is to understand this problem better. In particular, we are interested in answers to questions such as: How many malicious users can a tagging system tolerate before results significantly degrade? What types of tagging systems are more vulnerable to malicious attacks? What would be the effort and the impact of employing a trusted moderator to find bad postings? Can a system automatically protect itself from spam, for instance, by exploiting user tag patterns? In a quest for answers to these questions, we introduce a framework for modeling tagging systems and user tagging behavior. We also describe a method for ranking documents matching a tag based on taggers' reliability. Using our framework, we study the behavior of existing approaches under malicious attacks and the impact of a moderator and our ranking method.
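The abstract does not spell out how taggers' reliability is computed, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes postings are `(user, document, tag)` triples and that a user's reliability is the number of times other users agree with that user's postings. The function name `rank_documents` and the scoring rule are hypothetical, chosen for illustration.

```python
from collections import defaultdict


def rank_documents(postings, tag):
    """Rank documents carrying `tag` by the summed reliability of their taggers.

    Assumption (not from the paper's abstract): a user's reliability is the
    number of other users who made the same (document, tag) assignment.
    """
    # Collect the set of users behind each (document, tag) assignment.
    taggers = defaultdict(set)
    for user, doc, t in postings:
        taggers[(doc, t)].add(user)

    # Each agreeing co-tagger adds one point to a user's reliability.
    reliability = defaultdict(int)
    for users in taggers.values():
        for user in users:
            reliability[user] += len(users) - 1

    # A document's score for the query tag is the sum of its taggers' reliability.
    scores = {
        doc: sum(reliability[u] for u in users)
        for (doc, t), users in taggers.items()
        if t == tag
    }
    # Highest score first; ties broken by document id for a stable order.
    return sorted(scores, key=lambda d: (-scores[d], d))


postings = [
    ("alice", "d1", "python"),
    ("bob", "d1", "python"),
    ("carol", "d1", "python"),
    ("alice", "d2", "java"),
    ("bob", "d2", "java"),
    ("mallory", "d3", "python"),  # hypothetical spammer
    ("sock", "d3", "python"),     # hypothetical colluding account
]
print(rank_documents(postings, "python"))  # ['d1', 'd3']
```

In this toy data the two colluding accounts agree only with each other, so their document scores 2 against 8 for the document tagged by three users who also agree elsewhere. A larger group of colluders could still inflate its own reliability under this simple rule.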

Published in

ACM Transactions on the Web, Volume 2, Issue 4
October 2008
118 pages
ISSN: 1559-1131
EISSN: 1559-114X
DOI: 10.1145/1409220

Copyright © 2008 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery
New York, NY, United States

Publication History

• Published: 27 October 2008
• Accepted: 1 June 2008
• Revised: 1 October 2007
• Received: 1 March 2007

Qualifiers

• research-article
• Research
• Refereed