Abstract
This paper addresses the blog distillation problem: given a user query, find the blogs most related to the query topic. We model each post as evidence of a blog's relevance to the query and combine that evidence with aggregation methods such as Ordered Weighted Averaging (OWA) operators. We show that using only the highly relevant evidence (posts) for each blog can yield an effective retrieval system. We evaluate our methods on the TREC’06 blog collection with two standard query sets, TREC’07 and TREC’08. On the TREC’07 query set, our experiments show a 35% improvement in Mean Average Precision and a 22% improvement in Precision@10 over the best fusion method applied to blog distillation. Results on the TREC’08 query set are similar: a 31% improvement in Mean Average Precision and a 20% improvement in Precision@10 over the baseline.
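To make the aggregation idea concrete, the following is a minimal sketch of an Ordered Weighted Averaging operator applied to per-post relevance scores. It is an illustration under stated assumptions, not the authors' implementation: the function names, the uniform top-k weight vector, the example scores, and the handling of blogs with fewer than k posts are all hypothetical choices made for this example.

```python
from collections import defaultdict


def owa(scores, weights):
    """Ordered Weighted Averaging: weights are applied to the scores
    after sorting them in descending order, not to fixed positions."""
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    ordered = sorted(scores, reverse=True)
    return sum(w * s for w, s in zip(weights, ordered))


def top_k_weights(n, k):
    """Hypothetical weight vector: uniform mass on the k highest-scoring
    positions and zero elsewhere, so only the strongest evidence counts."""
    k = min(k, n)  # assumption: blogs with fewer than k posts use all of them
    return [1.0 / k] * k + [0.0] * (n - k)


def rank_blogs(post_scores, k=2):
    """Group (blog_id, score) pairs by blog, aggregate with OWA,
    and return blogs sorted by aggregated score, best first."""
    by_blog = defaultdict(list)
    for blog_id, score in post_scores:
        by_blog[blog_id].append(score)
    blog_scores = {
        blog: owa(scores, top_k_weights(len(scores), k))
        for blog, scores in by_blog.items()
    }
    return sorted(blog_scores.items(), key=lambda item: item[1], reverse=True)


# Made-up post relevance scores, for illustration only.
posts = [
    ("blog_a", 0.9), ("blog_a", 0.1), ("blog_a", 0.8),
    ("blog_b", 0.6), ("blog_b", 0.5),
    ("blog_c", 0.95),
]
ranking = rank_blogs(posts, k=2)
```

With these weights, the weakly relevant post of blog_a does not lower its score, which mirrors the idea of using only highly relevant evidence. Setting the weight vector to `[1, 0, ..., 0]` reduces the operator to the maximum and `[0, ..., 0, 1]` to the minimum; other weight vectors interpolate between the two.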
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Keikha, M., Crestani, F. (2009). Effectiveness of Aggregation Methods in Blog Distillation. In: Andreasen, T., Yager, R.R., Bulskov, H., Christiansen, H., Larsen, H.L. (eds) Flexible Query Answering Systems. FQAS 2009. Lecture Notes in Computer Science, vol 5822. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04957-6_14
DOI: https://doi.org/10.1007/978-3-642-04957-6_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-04956-9
Online ISBN: 978-3-642-04957-6
eBook Packages: Computer Science (R0)