
Exploring Social Annotation Tags to Enhance Information Retrieval Performance

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 6335)

Abstract

Pseudo relevance feedback (PRF) via query expansion has proven to be effective in many information retrieval tasks. Most existing approaches are based on the assumption that the most informative terms in top-ranked documents from the first-pass retrieval can be viewed as the context of the query, and thus can be used to specify the information need. However, irrelevant documents may be used in PRF (especially for hard topics), which can bring noise into the feedback process. The recent development of Web 2.0 technologies on the Internet has provided an opportunity to enhance PRF, as more and more high-quality resources can be freely obtained. In this paper, we propose a generative model to select high-quality feedback terms from social annotation tags. The main advantages of our proposed feedback model are as follows. First, our model explicitly explains how each feedback term is generated. Second, our model can take advantage of the human-annotated semantic relationships among terms. Experimental results on three TREC test datasets show that social annotation tags can be used as a good external resource for PRF. With optimal parameter settings on the WSJ dataset, they are as good as the top-ranked documents from first-pass retrieval. When we combine the top-ranked documents and the social annotation tags, the retrieval performance can be further improved.
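To make the general idea of combining the two feedback sources concrete, the following is a minimal sketch of query expansion by linear interpolation. It is not the generative model proposed in the paper, whose details are not given in the abstract. It assumes simple maximum-likelihood unigram estimates with no stopword removal. The function names (`term_distribution`, `expand_query`) and the parameters `alpha`, `beta`, and `k` are hypothetical and chosen only for illustration.

```python
from collections import Counter


def term_distribution(texts):
    """Maximum-likelihood unigram distribution over whitespace tokens."""
    counts = Counter(tok for text in texts for tok in text.lower().split())
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()} if total else {}


def expand_query(query, feedback_docs, tag_sets, alpha=0.5, beta=0.5, k=5):
    """Interpolate the query model with a feedback model (illustrative only).

    alpha: weight given to the feedback model versus the original query.
    beta:  weight given to top-ranked documents versus social tags.
    k:     number of feedback terms kept.
    All three are assumed parameters, not values taken from the paper.
    """
    q_model = term_distribution([query])
    doc_model = term_distribution(feedback_docs)
    tag_model = term_distribution(tag_sets)

    # Mix the two feedback sources term by term.
    vocab = set(doc_model) | set(tag_model)
    feedback = {
        w: beta * doc_model.get(w, 0.0) + (1 - beta) * tag_model.get(w, 0.0)
        for w in vocab
    }

    # Keep the k highest-weighted terms; ties are broken alphabetically.
    top = sorted(feedback.items(), key=lambda kv: (-kv[1], kv[0]))[:k]
    norm = sum(p for _, p in top)
    if norm == 0:
        return q_model  # no usable feedback: fall back to the original query

    # Interpolate the original query model with the renormalized feedback terms.
    expanded = {w: (1 - alpha) * p for w, p in q_model.items()}
    for w, p in top:
        expanded[w] = expanded.get(w, 0.0) + alpha * p / norm
    return expanded


if __name__ == "__main__":
    weights = expand_query(
        "jaguar speed",
        ["the jaguar is a fast cat", "jaguar cars top speed"],
        ["jaguar animal cat", "cat wildlife"],
    )
    for term, weight in sorted(weights.items(), key=lambda kv: -kv[1]):
        print(f"{term}\t{weight:.3f}")
```

In this sketch, setting `beta=1.0` corresponds to classic PRF from top-ranked documents only, and `beta=0.0` uses tags alone. The abstract reports that combining both sources improves retrieval further, which an intermediate `beta` is meant to mimic under these assumptions.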




Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Ye, Z., Huang, X.J., Jin, S., Lin, H. (2010). Exploring Social Annotation Tags to Enhance Information Retrieval Performance. In: An, A., Lingras, P., Petty, S., Huang, R. (eds) Active Media Technology. AMT 2010. Lecture Notes in Computer Science, vol 6335. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15470-6_27


  • DOI: https://doi.org/10.1007/978-3-642-15470-6_27

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-15469-0

  • Online ISBN: 978-3-642-15470-6

  • eBook Packages: Computer Science, Computer Science (R0)
