
Exploring the Relationship between Keywords and Feed Elements in Blog Post Search

World Wide Web

Abstract

Blogs are increasingly accepted as a useful means of disseminating a wide variety of information on the web. As their popularity has grown, a number of blog search engines have appeared to help users find and access blog posts efficiently. Existing approaches, however, tend to rank blog posts by recency or popularity alone, leaving the problem of retrieving the posts most topically relevant to a user’s query largely unexplored. In this paper, we present a novel blog ranking framework, called PTRank, that improves search quality by taking into account relevance feedback from users as well as the information available in RSS feeds. A neural network method is employed to learn ranking functions that assign a relevance score to a keyword and blog post pair. Extensive experiments on real blog data have been conducted to validate the proposed framework for blog post search, and the results indicate that PTRank performs significantly better than the existing popular approach.
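
The idea of learning a relevance score between a keyword and a blog post from feed elements and user feedback can be illustrated with a small sketch. This is not the PTRank model itself, whose architecture and features are not given in the abstract. It assumes a single sigmoid unit, three RSS 2.0 item elements (`title`, `description`, `category`) as hypothetical features, and binary relevance labels standing in for user feedback. The toy posts are invented for illustration only.

```python
import math

# Assumption: these three RSS 2.0 <item> elements are the only features used.
FEED_ELEMENTS = ("title", "description", "category")


def features(keyword, post):
    """One 0/1 feature per feed element: 1.0 if the keyword occurs in it."""
    kw = keyword.lower()
    return [1.0 if kw in post.get(name, "").lower() else 0.0
            for name in FEED_ELEMENTS]


def score(weights, bias, x):
    """Single sigmoid unit producing a relevance score in (0, 1)."""
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))


def train(samples, epochs=500, lr=0.5):
    """Fit the unit with stochastic gradient descent on log loss.

    samples: list of (keyword, post, label); label 1 means a user
    judged the post relevant to the keyword, 0 means not relevant.
    """
    weights = [0.0] * len(FEED_ELEMENTS)
    bias = 0.0
    for _ in range(epochs):
        for keyword, post, label in samples:
            x = features(keyword, post)
            err = score(weights, bias, x) - label  # gradient of log loss w.r.t. z
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
            bias -= lr * err
    return weights, bias


def rank(keyword, posts, weights, bias):
    """Return posts sorted by learned relevance score, highest first."""
    return sorted(posts,
                  key=lambda p: score(weights, bias, features(keyword, p)),
                  reverse=True)


# Invented toy feedback data (hypothetical, not from the paper).
SAMPLES = [
    ("python", {"title": "Python tips",
                "description": "Ten tricks for python lists",
                "category": "programming"}, 1),
    ("python", {"title": "Zoo trip",
                "description": "We saw a python and a lion",
                "category": "travel"}, 0),
    ("travel", {"title": "Travel on a budget",
                "description": "Cheap flights",
                "category": "travel"}, 1),
    ("travel", {"title": "Compiler notes",
                "description": "Parsers travel the token stream",
                "category": "programming"}, 0),
]

weights, bias = train(SAMPLES)
ranked = rank("python", [s[1] for s in SAMPLES[:2]], weights, bias)
print([p["title"] for p in ranked])
```

In this toy data a keyword match in the title is the signal that separates relevant from irrelevant posts, so the learned title weight comes out positive and the post titled "Python tips" is ranked first. Which feed elements actually carry that signal on real blog data is the kind of relationship the paper sets out to explore.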



Author information

Corresponding author

Correspondence to Jonghun Park.

About this article

Cite this article

Han, SK., Shin, D., Jung, JY. et al. Exploring the Relationship between Keywords and Feed Elements in Blog Post Search. World Wide Web 12, 381–398 (2009). https://doi.org/10.1007/s11280-009-0067-3

  • DOI: https://doi.org/10.1007/s11280-009-0067-3
