A Probabilistic Topic Model with Social Tags for Query Reformulation in Informational Search

Mao, Yuqing; Shen, Haifeng; Sun, Chengzheng

doi:10.1007/978-3-642-25853-4_9


Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7120)

Included in the following conference series:

International Conference on Advanced Data Mining and Applications

Abstract

It is non-trivial to formulate a query that precisely describes the goal of an informational search task. Query reformulation based on the query clustering approach addresses this issue by expanding a new query with related existing queries that were generated by other users. However, the query clustering approach is unable to cluster queries that are intrinsically related but neither contain common terms nor return common clicked Web page URLs. More importantly, it does not address the issue of ranking retrieved results according to their relevance to the search goal. In this paper, we present a new query reformulation approach based on a novel probabilistic topic model that discovers the latent semantic relationships between the queries and the URLs. The approach can not only discover related queries that cannot be clustered by existing query clustering approaches but also rank retrieved results according to the similarities between the probability distributions of the queries and the URLs over the latent topics. The results of our experiments show that this approach can significantly improve the performance of an informational search task in terms of search accuracy and search efficiency.
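
The ranking idea in the abstract, comparing the topic distribution of a query with those of candidate URLs, can be illustrated with a minimal sketch. The abstract does not say which similarity measure the paper uses, so the Jensen-Shannon similarity below is an assumption chosen for illustration. The function names, the three-topic distributions, and the URL labels are all hypothetical and are not taken from the paper.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) in bits; terms with p_i == 0 contribute nothing."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_similarity(p, q):
    """1 minus the Jensen-Shannon divergence (base 2); lies in [0, 1]."""
    # The mixture m is positive wherever p or q is, so the KL terms are finite.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 1.0 - 0.5 * (kl_divergence(p, m) + kl_divergence(q, m))

def rank_by_topic_similarity(query_topics, candidates):
    """Return (name, score) pairs sorted by similarity to the query, best first."""
    scored = [(name, js_similarity(query_topics, dist))
              for name, dist in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Hypothetical distributions over three latent topics (made-up numbers).
query = [0.7, 0.2, 0.1]
urls = {
    "url_a": [0.6, 0.3, 0.1],
    "url_b": [0.1, 0.1, 0.8],
    "url_c": [0.3, 0.4, 0.3],
}
ranking = rank_by_topic_similarity(query, urls)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

The same comparison could in principle be applied between two queries to find related queries that share no terms or clicked URLs, provided their topic distributions have been inferred by a topic model first. That inference step is outside this sketch.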

Author information

Authors and Affiliations

School of Computer Engineering, Nanyang Technological University, BLK N4, Nanyang Avenue, Singapore, 639798, Singapore
Yuqing Mao & Chengzheng Sun
School of Computer Science, Engineering & Mathematics, Flinders University, Adelaide, South Australia, 5001, Australia
Yuqing Mao & Haifeng Shen

Editor information

Editors and Affiliations

Department of Computer Science and Technology, Tsinghua University, 100084, Beijing, China
Jie Tang & Jianyong Wang
Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong, SAR, China
Irwin King
Faculty of Engineering and Information Technology, University of Technology, 2007, Sydney, NSW, Australia
Ling Chen

About this paper

Cite this paper

Mao, Y., Shen, H., Sun, C. (2011). A Probabilistic Topic Model with Social Tags for Query Reformulation in Informational Search. In: Tang, J., King, I., Chen, L., Wang, J. (eds) Advanced Data Mining and Applications. ADMA 2011. Lecture Notes in Computer Science, vol 7120. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-25853-4_9

DOI: https://doi.org/10.1007/978-3-642-25853-4_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-25852-7
Online ISBN: 978-3-642-25853-4
eBook Packages: Computer Science, Computer Science (R0)
