ABSTRACT
Search engine log files are a promising resource for data mining because they provide raw data about both users and web documents. In this paper we focus on the latter and explore how the information in log files can be used to improve document descriptions. A pilot experiment demonstrated that document descriptors extracted from the queries associated with a document through clicks provide useful semantic information that complements the descriptors extracted from the full text of the web page.
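The idea of deriving descriptors from clicked queries can be illustrated with a minimal sketch. It is not the method used in the pilot experiment, which the abstract does not detail. It assumes a click log given as (query, clicked URL) pairs and ranks query terms per document by raw frequency. The function name `click_descriptors`, the stopword list, and the sample log are all hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical, deliberately tiny stopword list for illustration only.
STOPWORDS = frozenset({"the", "a", "of", "in", "for", "to", "and"})


def click_descriptors(click_log, top_k=3, stopwords=STOPWORDS):
    """Return the top_k most frequent query terms for each clicked URL.

    click_log is an iterable of (query, url) pairs, where url is the
    document the user clicked after issuing the query (assumed format).
    """
    term_counts = defaultdict(Counter)
    for query, url in click_log:
        for term in query.lower().split():
            if term not in stopwords:
                term_counts[url][term] += 1
    return {
        url: [term for term, _ in counts.most_common(top_k)]
        for url, counts in term_counts.items()
    }


# Made-up sample log; URLs and queries are placeholders.
log = [
    ("cheap flights amsterdam", "example.org/flights"),
    ("flights to amsterdam", "example.org/flights"),
    ("amsterdam flights deals", "example.org/flights"),
    ("python tutorial", "example.org/python"),
]

descriptors = click_descriptors(log)
for url, terms in descriptors.items():
    print(url, terms)
```

Raw term frequency is the simplest possible weighting. A real system would likely normalize for query popularity and combine these descriptors with ones extracted from the page text, as the abstract suggests.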
Index Terms
- Using query logs and click data to create improved document descriptions