Abstract
It is non-trivial to formulate a query that precisely describes the goal of an informational search task. Query reformulation based on the query clustering approach addresses this issue by expanding a new query with related existing queries that were generated by other users. However, the query clustering approach is unable to cluster queries that are intrinsically related but neither contain common terms nor return common clicked Web page URLs. More importantly, it does not address the issue of ranking retrieved results according to their relevance to the search goal. In this paper, we present a new query reformulation approach based on a novel probabilistic topic model that discovers the latent semantic relationships between the queries and the URLs. It can not only discover related queries that cannot be clustered by existing query clustering approaches but also rank retrieved results according to the similarities of the probability distributions over the latent topics among the queries and the URLs. The results of our experiments show that this approach can significantly improve the performance of an informational search task in terms of search accuracy and search efficiency.
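To make the ranking idea concrete, the following is a minimal sketch of ordering candidates by how similar their topic distributions are to that of a query. It is an illustration under stated assumptions, not the model described in the paper: the abstract does not name a similarity measure, so Jensen-Shannon divergence is used here as one common choice, and the function names, example URLs, and three-topic distributions are all hypothetical. The topic distributions themselves are assumed to come from an already trained topic model.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) in bits; terms where p_i == 0 contribute nothing."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits: symmetric and bounded in [0, 1]."""
    # The mixture m is positive wherever p or q is, so KL(. || m) is finite.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def rank_by_topic_similarity(query_topics, candidates):
    """Return (name, similarity) pairs, most similar first.

    Similarity is 1 - JS divergence (an assumed scoring choice).
    `candidates` maps a URL or query string to its topic distribution.
    """
    scored = [
        (name, 1.0 - js_divergence(query_topics, dist))
        for name, dist in candidates.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical distributions over three latent topics.
query_topics = [0.7, 0.2, 0.1]
url_topics = {
    "example.com/a": [0.6, 0.3, 0.1],
    "example.com/b": [0.1, 0.1, 0.8],
    "example.com/c": [0.7, 0.2, 0.1],
}

ranking = rank_by_topic_similarity(query_topics, url_topics)
for name, score in ranking:
    print(f"{name}\t{score:.3f}")
```

Because the scoring depends only on the distributions, the same function could in principle rank existing queries against a new query, which is how two queries with no shared terms or clicked URLs can still be judged related.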
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Mao, Y., Shen, H., Sun, C. (2011). A Probabilistic Topic Model with Social Tags for Query Reformulation in Informational Search. In: Tang, J., King, I., Chen, L., Wang, J. (eds) Advanced Data Mining and Applications. ADMA 2011. Lecture Notes in Computer Science, vol 7120. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-25853-4_9
DOI: https://doi.org/10.1007/978-3-642-25853-4_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-25852-7
Online ISBN: 978-3-642-25853-4
eBook Packages: Computer Science, Computer Science (R0)