poster

Learning to detect task boundaries of query session

Authors:

Zhenzhong Zhang,

Xianpei Han

CIKM '13: Proceedings of the 22nd ACM international conference on Information & Knowledge Management

Pages 1885 - 1888

https://doi.org/10.1145/2505515.2507887

Published: 27 October 2013

Abstract

To accomplish a search task and satisfy a single information need, users usually submit a series of queries to web search engines. It is therefore useful for web search engines to detect the task boundaries in a series of successive queries. Traditional task boundary detection methods are based on time gaps and lexical comparisons, and they often suffer from the vocabulary gap problem: topically related queries may not share any common words. In this paper we learn hidden topics from a query log and leverage them to resolve the vocabulary gap problem. Unlike other external knowledge resources, such as WordNet and Wikipedia, the hidden topics discovered from a query log cover long-tail queries, which is useful for detecting task boundaries. Experimental results on a dataset from a real-world query log demonstrate that the proposed method achieves a significant quality enhancement.
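The vocabulary gap described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the 30-minute timeout, the overlap threshold, the `Query` type, and the `TOY_TOPICS` table are all assumptions made up for this example, and the hand-written topic weights merely stand in for hidden topics that would be learned from a query log.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class Query:
    text: str
    timestamp: float  # seconds, on any consistent clock


def jaccard(a: str, b: str) -> float:
    """Word-overlap similarity between two queries, in [0, 1]."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    if not wa or not wb:
        return 0.0
    return len(wa & wb) / len(wa | wb)


def baseline_boundary(prev: Query, curr: Query,
                      max_gap: float = 1800.0,
                      min_overlap: float = 0.2) -> bool:
    """Time-gap plus lexical baseline (thresholds are assumed values).

    Reports a boundary when the pause is long or the queries share
    too few words.
    """
    too_late = (curr.timestamp - prev.timestamp) > max_gap
    too_different = jaccard(prev.text, curr.text) < min_overlap
    return too_late or too_different


# Hypothetical word-to-topic weights over two topics (travel, programming).
# In the paper's setting these would come from a topic model trained on a
# query log; here they are invented purely for illustration.
TOY_TOPICS = {
    "flights": [0.90, 0.10],
    "paris": [0.80, 0.20],
    "hotel": [0.85, 0.15],
    "eiffel": [0.90, 0.10],
    "tower": [0.70, 0.30],
    "python": [0.05, 0.95],
    "tutorial": [0.10, 0.90],
}


def topic_vector(text: str) -> list[float]:
    """Average the topic weights of the known words in a query."""
    vecs = [TOY_TOPICS[w] for w in text.lower().split() if w in TOY_TOPICS]
    if not vecs:
        return [0.0, 0.0]
    return [sum(v[k] for v in vecs) / len(vecs) for k in range(2)]


def cosine(u: list[float], v: list[float]) -> float:
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(u, v))
    nu = sqrt(sum(x * x for x in u))
    nv = sqrt(sum(y * y for y in v))
    if nu == 0.0 or nv == 0.0:
        return 0.0
    return dot / (nu * nv)


def topic_similarity(a: str, b: str) -> float:
    """Similarity of two queries in the toy topic space."""
    return cosine(topic_vector(a), topic_vector(b))


q1 = Query("flights paris", 0.0)
q2 = Query("eiffel tower hotel", 60.0)
```

Here `q1` and `q2` plausibly belong to the same trip-planning task but share no words, so `baseline_boundary(q1, q2)` reports a boundary even though they are one minute apart. In the toy topic space the two queries are close, while "flights paris" and "python tutorial" are far apart, which is the kind of signal a topic-based feature could contribute.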

Index Terms

Learning to detect task boundaries of query session
1. Information systems
  1. Information retrieval
    1. Information retrieval query processing

Published In

CIKM '13: Proceedings of the 22nd ACM international conference on Information & Knowledge Management

October 2013

2612 pages

ISBN:9781450322638

DOI:10.1145/2505515

General Chairs:
Qi He, LinkedIn, USA
Arun Iyengar, IBM T.J. Watson Research Center, USA

Program Chairs:
Wolfgang Nejdl, L3S Research Center, Germany
Jian Pei, Simon Fraser University, Canada
Rajeev Rastogi, Amazon, India

Copyright © 2013.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Poster

Conference

CIKM'13

CIKM'13: 22nd ACM International Conference on Information and Knowledge Management

October 27 - November 1, 2013

San Francisco, California, USA

Acceptance Rates

CIKM '13 Paper Acceptance Rate 143 of 848 submissions, 17%;

Overall Acceptance Rate 1,861 of 8,427 submissions, 22%
