A Probabilistic Topic Model with Social Tags for Query Reformulation in Informational Search

  • Conference paper
Advanced Data Mining and Applications (ADMA 2011)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7120)

Abstract

It is non-trivial to formulate a query that precisely describes the goal of an informational search task. Query reformulation based on the query clustering approach addresses this issue by expanding a new query with related existing queries that were generated by other users. However, the query clustering approach is unable to cluster queries that are intrinsically related but neither contain common terms nor return common clicked Web page URLs. More importantly, it does not address the issue of ranking retrieved results according to their relevance to the search goal. In this paper, we present a new query reformulation approach based on a novel probabilistic topic model that discovers the latent semantic relationships between the queries and the URLs. The approach not only discovers related queries that cannot be clustered by existing query clustering approaches but also ranks retrieved results according to the similarities of the probability distributions over the latent topics among the queries and the URLs. Our experimental results show that this approach can significantly improve the performance of an informational search task in terms of search accuracy and search efficiency.
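
The ranking idea described above, ordering URLs by how similar their latent-topic distributions are to that of the query, can be illustrated with a minimal Python sketch. The abstract does not say which similarity measure the paper uses, so the Jensen-Shannon divergence here is an assumption chosen for illustration. The function names (`js_divergence`, `rank_urls`), the example URLs, and the topic distributions are likewise hypothetical and not taken from the paper.

```python
import math


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two discrete distributions.

    Assumed similarity basis for this sketch; the result lies in [0, 1],
    with 0 meaning the distributions are identical.
    """
    def kl(a, b):
        # Kullback-Leibler divergence; terms with a zero weight contribute 0.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    # Midpoint distribution; it is positive wherever p or q is positive.
    m = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def rank_urls(query_topics, url_topics):
    """Return (url, similarity) pairs sorted from most to least similar.

    query_topics: topic distribution of the query (list of probabilities).
    url_topics:   dict mapping each URL to its topic distribution.
    Similarity is defined here as 1 - JS divergence (an assumption).
    """
    scored = [
        (url, 1.0 - js_divergence(query_topics, dist))
        for url, dist in url_topics.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical distributions over three latent topics.
query = [0.7, 0.2, 0.1]
urls = {
    "http://example.org/a": [0.1, 0.2, 0.7],
    "http://example.org/b": [0.6, 0.3, 0.1],
    "http://example.org/c": [0.3, 0.4, 0.3],
}
ranking = rank_urls(query, urls)
```

In this sketch the URL whose topic mixture is closest to the query's comes first. In a full system, the distributions would be inferred by the topic model rather than written by hand.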

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Mao, Y., Shen, H., Sun, C. (2011). A Probabilistic Topic Model with Social Tags for Query Reformulation in Informational Search. In: Tang, J., King, I., Chen, L., Wang, J. (eds) Advanced Data Mining and Applications. ADMA 2011. Lecture Notes in Computer Science, vol 7120. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-25853-4_9

  • DOI: https://doi.org/10.1007/978-3-642-25853-4_9

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-25852-7

  • Online ISBN: 978-3-642-25853-4

  • eBook Packages: Computer Science, Computer Science (R0)
