Abstract
Users often submit multiple related queries to accomplish a single search task. Identifying search tasks faces two challenges: 1) search tasks are often intertwined and may span anywhere from seconds to days; 2) queries triggered by semantically related search tasks may share few common terms or clicked documents. To address these challenges, we exploit semantic features of named entities to improve the identification of semantically related search tasks. We propose a novel approach to learning a semantic distance function between pairs of queries. The approach uses the categories of named entities as regularization, which reinforces the tendency for queries containing entities from the same category to belong to the same search task. Finally, semantically related search tasks are identified by hierarchical agglomerative clustering with the learned distance function. Experiments show that our approach significantly improves on the corresponding state-of-the-art methods.
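The final clustering step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the learned distance function, so the `distance` function below is a hand-written stand-in (Jaccard distance on query terms, halved when both queries carry an entity of the same category). The toy queries, the category labels, the `category_discount` parameter, the average-linkage choice, and the stopping threshold are all assumptions made for illustration.

```python
from itertools import combinations

# Hypothetical toy log: each query is (set of terms, entity category or None).
QUERIES = [
    ({"paris", "hotels"}, "city"),
    ({"louvre", "tickets"}, "landmark"),
    ({"paris", "flights"}, "city"),
    ({"python", "tutorial"}, "language"),
    ({"rust", "tutorial"}, "language"),
]


def distance(a, b, category_discount=0.5):
    """Stand-in for the learned distance (an assumption, not the paper's).

    Jaccard distance on terms, multiplied by category_discount when both
    queries contain an entity from the same category.
    """
    terms_a, cat_a = a
    terms_b, cat_b = b
    union = terms_a | terms_b
    jaccard = 1.0 - len(terms_a & terms_b) / len(union) if union else 0.0
    if cat_a is not None and cat_a == cat_b:
        return jaccard * category_discount
    return jaccard


def average_linkage(c1, c2, items, dist):
    """Mean pairwise distance between two clusters of item indices."""
    total = sum(dist(items[i], items[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))


def cluster(items, dist, threshold):
    """Agglomerative clustering: merge the closest pair of clusters until
    the closest pair is farther apart than threshold."""
    clusters = [[i] for i in range(len(items))]
    while len(clusters) > 1:
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: average_linkage(
                clusters[p[0]], clusters[p[1]], items, dist
            ),
        )
        if average_linkage(clusters[a], clusters[b], items, dist) > threshold:
            break
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # b > a, so index a is unaffected
    return clusters


tasks = cluster(QUERIES, distance, threshold=0.5)
print(tasks)
```

With the category discount switched off (`category_discount=1.0`), no pair of these toy queries falls under the 0.5 threshold and every query stays in its own cluster, which is the situation the abstract describes for related queries that share few terms.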
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Gong, S., Xiong, J., Zhang, C., Liu, Z. (2013). Identifying Semantic-Related Search Tasks in Query Log. In: Ishikawa, Y., Li, J., Wang, W., Zhang, R., Zhang, W. (eds) Web Technologies and Applications. APWeb 2013. Lecture Notes in Computer Science, vol 7808. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-37401-2_51
DOI: https://doi.org/10.1007/978-3-642-37401-2_51
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-37400-5
Online ISBN: 978-3-642-37401-2
eBook Packages: Computer Science, Computer Science (R0)