Abstract
An approach to clustering short text snippets is proposed. It can be used to group search results into a few relevant clusters, helping users quickly locate the results that interest them. Specifically, the collection of search result snippets is treated implicitly as a similarity graph, in which each snippet is a vertex and each edge is weighted by the similarity between the two snippets it connects. TermCut, the proposed clustering algorithm, then recursively bisects the similarity graph: at each step it selects the current core term and splits the snippets so that one cluster contains the term and the other does not. Experimental results show that the proposed algorithm outperforms the KMeans algorithm by about 0.3 on the FScore criterion.
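The recursive term-based bisection described in the abstract can be illustrated with a minimal Python sketch. The abstract does not say how snippet similarity is measured or how the core term is chosen, so everything below is an assumption for illustration only: snippets are tokenised on whitespace, edges are weighted by Jaccard similarity, the core term is the one whose split minimises a min-max-cut-style score, and the largest splittable cluster is bisected next. The names `jaccard`, `within`, `best_split`, and `termcut` are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two term sets (assumed edge weight)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def within(docs, members):
    """Total similarity inside a cluster; each self-similarity counts as 1."""
    pairs = combinations(members, 2)
    return len(members) + 2 * sum(jaccard(docs[i], docs[j]) for i, j in pairs)


def best_split(docs, members):
    """Try every term as the core term and keep the lowest-scoring bisection."""
    best = None
    for term in sorted(set().union(*(docs[i] for i in members))):
        has = [i for i in members if term in docs[i]]
        lacks = [i for i in members if term not in docs[i]]
        if not has or not lacks:
            continue  # this term does not bisect the cluster
        cut = sum(jaccard(docs[i], docs[j]) for i in has for j in lacks)
        # Assumed objective (min-max-cut style): small cut, cohesive halves.
        score = cut / within(docs, has) + cut / within(docs, lacks)
        if best is None or score < best[0]:
            best = (score, term, has, lacks)
    return best


def termcut(snippets, k):
    """Bisect clusters by core term until k clusters exist or none can split."""
    docs = [set(s.lower().split()) for s in snippets]
    clusters = [list(range(len(docs)))]
    while len(clusters) < k:
        clusters.sort(key=len, reverse=True)  # assumed: try largest first
        for pos, members in enumerate(clusters):
            split = best_split(docs, members)
            if split is not None:
                break
        else:
            break  # no remaining cluster can be bisected by any term
        _, _term, has, lacks = split
        clusters[pos:pos + 1] = [has, lacks]
    return clusters
```

Under these assumptions, calling `termcut(["jaguar car dealer", "jaguar car price", "jaguar cat habitat", "jaguar cat diet"], 2)` selects "car" as the core term and separates the two car snippets from the two cat snippets. The graph is never materialised: similarities are computed on demand, which matches the abstract's description of the similarity graph as implicit.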
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ni, X., Lu, Z., Quan, X., Liu, W., Hua, B. (2009). Short Text Clustering for Search Results. In: Li, Q., Feng, L., Pei, J., Wang, S.X., Zhou, X., Zhu, QM. (eds) Advances in Data and Web Management. APWeb/WAIM 2009. Lecture Notes in Computer Science, vol 5446. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-00672-2_55
DOI: https://doi.org/10.1007/978-3-642-00672-2_55
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-00671-5
Online ISBN: 978-3-642-00672-2
eBook Packages: Computer Science (R0)