ABSTRACT
The interaction of vast numbers of search engine users with sets of search results is a potential source of significant quantities of resource classification data. In this paper we discuss work which uses coselection data (i.e. multiple click-through events generated by the same user on a single search engine result page) as an indicator of mutual relevance between web resources and as a means for the automatic clustering of sense-singular resources. The results indicate that coselection can be used in this way. We ground-truthed unambiguous query clustering, forming a foundation for work on automatic ambiguity detection based on the number of clusters generated. Using the cluster-overlap-by-population principle, an extension of previous work allowed the determination of synonyms or lingual translations, where overlapping clusters indicated mutual relevance in coselection and hence the irrelevance of the actual label inherited from the user query.
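The idea of coselection clustering can be illustrated with a minimal sketch. It is not the authors' implementation: the click-log schema `(user, query, page, url)`, the use of connected components (union-find) as the clustering step, and the `overlap` measure are all assumptions made for illustration, since the abstract does not specify them.

```python
from collections import defaultdict


def coselection_clusters(click_log):
    """Group URLs that were co-selected on the same result page.

    click_log: iterable of (user, query, page, url) tuples.
    This schema is hypothetical; the abstract does not define one.
    """
    # One session = clicks by one user on one result page for one query.
    sessions = defaultdict(set)
    for user, query, page, url in click_log:
        sessions[(user, query, page)].add(url)

    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for urls in sessions.values():
        if len(urls) < 2:
            continue  # a single click is not a coselection
        ordered = sorted(urls)
        for other in ordered[1:]:
            union(ordered[0], other)

    # Collect connected components: each is one candidate cluster.
    clusters = defaultdict(set)
    for url in parent:
        clusters[find(url)].add(url)
    return list(clusters.values())


def overlap(cluster_a, cluster_b):
    """Fraction of the smaller cluster shared with the other (assumed measure)."""
    if not cluster_a or not cluster_b:
        return 0.0
    return len(cluster_a & cluster_b) / min(len(cluster_a), len(cluster_b))


if __name__ == "__main__":
    log = [
        ("u1", "jaguar", 1, "a.com"), ("u1", "jaguar", 1, "b.com"),
        ("u2", "jaguar", 1, "b.com"), ("u2", "jaguar", 1, "c.com"),
        ("u3", "jaguar", 1, "x.com"), ("u3", "jaguar", 1, "y.com"),
        ("u4", "jaguar", 1, "z.com"),  # lone click, ignored
    ]
    for cluster in coselection_clusters(log):
        print(sorted(cluster))
```

Under these assumptions, a query that yields more than one cluster is a candidate ambiguous query, and a high `overlap` between clusters produced by two different query strings hints that the strings may be synonyms or translations of one another.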
Index Terms
- Implicit association via crowd-sourced coselection