Effectiveness of Aggregation Methods in Blog Distillation

  • Conference paper
Flexible Query Answering Systems (FQAS 2009)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5822)

Abstract

This paper addresses the blog distillation problem: given a user query, find the blogs that are most related to the query topic. We model each post as evidence of the relevance of its blog to the query and combine that evidence with aggregation methods such as Ordered Weighted Averaging (OWA) operators. We show that using only the highly relevant evidence (posts) of each blog can yield an effective retrieval system. We evaluate our methods on the TREC’06 blog collection with the two standard query sets of TREC’07 and TREC’08. On the TREC’07 query set, our experiments show a 35% improvement in Mean Average Precision and a 22% improvement in Precision@10 over the best fusion method applied to blog distillation. Results on the TREC’08 query set are similar: a 31% improvement in Mean Average Precision and a 20% improvement in Precision@10 over the baseline.
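
To make the aggregation idea concrete, the following is a minimal sketch of an Ordered Weighted Averaging (OWA) operator applied to post-level relevance scores. It is an illustration only, not the authors' implementation: the function names `owa` and `rank_blogs`, the example scores, the weight vector, and the zero-padding of blogs with few posts are all assumptions made here, since the abstract does not specify them.

```python
from typing import Dict, List, Sequence


def owa(scores: Sequence[float], weights: Sequence[float]) -> float:
    """Ordered Weighted Averaging: weights attach to ranks, not to specific posts."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("OWA weights must sum to 1")
    # Sort descending so weights[0] multiplies the strongest evidence.
    ordered = sorted(scores, reverse=True)[: len(weights)]
    # Assumption: a blog with fewer posts than weights is padded with zeros.
    ordered += [0.0] * (len(weights) - len(ordered))
    return sum(w * s for w, s in zip(weights, ordered))


def rank_blogs(post_scores: Dict[str, List[float]],
               weights: Sequence[float]) -> List[str]:
    """Rank blogs by the OWA of their post relevance scores, best first."""
    return sorted(post_scores,
                  key=lambda blog: owa(post_scores[blog], weights),
                  reverse=True)


# Hypothetical post-level relevance scores for one query.
post_scores = {
    "blog_a": [0.9, 0.8, 0.1, 0.1, 0.1, 0.1],  # two strong posts, several weak ones
    "blog_b": [0.5, 0.5, 0.5, 0.5],            # uniformly moderate posts
}

# Hypothetical weights: average only the two most relevant posts of each blog.
top2 = [0.5, 0.5]

print(rank_blogs(post_scores, top2))
```

With these made-up scores, the top-2 weights place `blog_a` ahead of `blog_b`, whereas a plain mean over all posts (0.35 versus 0.5) would reverse the order. This is the effect the abstract refers to when it restricts aggregation to highly relevant posts.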

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Keikha, M., Crestani, F. (2009). Effectiveness of Aggregation Methods in Blog Distillation. In: Andreasen, T., Yager, R.R., Bulskov, H., Christiansen, H., Larsen, H.L. (eds) Flexible Query Answering Systems. FQAS 2009. Lecture Notes in Computer Science, vol 5822. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04957-6_14

  • DOI: https://doi.org/10.1007/978-3-642-04957-6_14

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-04956-9

  • Online ISBN: 978-3-642-04957-6

  • eBook Packages: Computer Science, Computer Science (R0)
