A User-Oriented Splog Filtering Based on a Machine Learning

  • Conference paper
Recent Trends and Developments in Social Software (BlogTalk 2008, BlogTalk 2009)

Abstract

This paper describes a method for filtering spam blogs (splogs) based on a machine learning technique, together with its evaluation results. Splogs have become one of the major issues on the Web. A central difficulty is that the perceived value of a blog site differs from person to person. We propose a novel user-oriented splog filtering method that adapts to each user’s preference for valuable blogs. We use a support vector machine (SVM) to create a personalized splog filter for each user. We conducted two experiments: (1) an experiment on individual splog judgement, and (2) an experiment on user-oriented splog filtering. From the former, we found that there exist ‘gray’ blogs whose treatment must be decided by each person individually. From the latter, we found that appropriate personalized filters can be provided by choosing the best feature set for each user.
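The idea of training one filter per user and choosing the best feature set for that user can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the two feature-set names ("words" and "links"), the toy blog vectors, the two hypothetical users, and the use of held-out accuracy as the selection criterion. A small bias-free linear SVM trained with Pegasos-style sub-gradient steps stands in for whatever SVM implementation the authors used.

```python
import random

def train_linear_svm(X, y, lam=0.01, epochs=50, seed=0):
    """Bias-free linear SVM trained with Pegasos-style sub-gradient steps.

    X is a list of feature vectors, y a list of labels in {+1, -1}
    (+1 = splog, -1 = valuable blog in this sketch).
    """
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    t = 0
    for _ in range(epochs):
        order = list(range(len(X)))
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)  # decreasing step size
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, X[i]))
            # Regularisation shrink; 1 - 1/t equals 1 - eta * lam.
            w = [(1.0 - 1.0 / t) * wj for wj in w]
            if margin < 1.0:  # hinge loss is active for this example
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w

def predict(w, x):
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) > 0 else -1

def accuracy(w, X, y):
    return sum(predict(w, x) == label for x, label in zip(X, y)) / len(y)

def select_feature_set(train, test, y_train, y_test):
    """Train one SVM per feature set and return (best name, held-out accuracy).

    train and test map a feature-set name to a list of vectors.
    """
    scores = {}
    for name in train:
        w = train_linear_svm(train[name], y_train)
        scores[name] = accuracy(w, test[name], y_test)
    best = max(scores, key=scores.get)
    return best, scores[best]

# Hypothetical toy blogs: each is a (words vector, links vector) pair.
combos = [((1.0, 0.0), (1.0, 0.0)), ((1.0, 0.0), (0.0, 1.0)),
          ((0.0, 1.0), (1.0, 0.0)), ((0.0, 1.0), (0.0, 1.0))]
train_blogs, test_blogs = combos * 3, combos

def featurize(blogs):
    return {"words": [list(b[0]) for b in blogs],
            "links": [list(b[1]) for b in blogs]}

# Two hypothetical users whose judgements depend on different evidence.
judges = {
    "user_a": lambda blog: 1 if blog[0] == (1.0, 0.0) else -1,  # judges by words
    "user_b": lambda blog: 1 if blog[1] == (1.0, 0.0) else -1,  # judges by links
}

choices = {}
for user, judge in judges.items():
    choices[user] = select_feature_set(
        featurize(train_blogs), featurize(test_blogs),
        [judge(b) for b in train_blogs], [judge(b) for b in test_blogs])

print(choices)
# → {'user_a': ('words', 1.0), 'user_b': ('links', 1.0)}
```

On this toy data the same blogs yield a different best feature set for each user, because the two users label them by different criteria. In practice the selection would more plausibly use cross-validation over real blog features rather than a single held-out split.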



Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Yoshinaka, T., Ishii, S., Fukuhara, T., Masuda, H., Nakagawa, H. (2010). A User-Oriented Splog Filtering Based on a Machine Learning. In: Breslin, J.G., Burg, T.N., Kim, HG., Raftery, T., Schmidt, JH. (eds) Recent Trends and Developments in Social Software. BlogTalk 2008, BlogTalk 2009. Lecture Notes in Computer Science, vol 6045. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-16581-8_9

  • DOI: https://doi.org/10.1007/978-3-642-16581-8_9

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-16580-1

  • Online ISBN: 978-3-642-16581-8

  • eBook Packages: Computer Science, Computer Science (R0)
