research-article

Modeling and analysis of cross-session search tasks

Authors:
Alexander Kotov (University of Illinois at Urbana-Champaign, Urbana, IL, USA)
Paul N. Bennett (Microsoft Research, Redmond, WA, USA)
Ryen W. White (Microsoft Research, Redmond, WA, USA)
Susan T. Dumais (Microsoft Research, Redmond, WA, USA)
Jaime Teevan (Microsoft Research, Redmond, WA, USA)

SIGIR '11: Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval, July 2011, Pages 5–14
https://doi.org/10.1145/2009916.2009922

Published: 24 July 2011

ABSTRACT

The information needs of search engine users vary in complexity, depending on the task they are trying to accomplish. Some simple needs can be satisfied with a single query, whereas others require a series of queries issued over a longer period of time. While search engines effectively satisfy many simple needs, searchers receive little support when their information needs span session boundaries. In this work, we propose methods for modeling and analyzing user search behavior that extends over multiple search sessions. We focus on two problems: (i) given a user query, identify all of the related queries from previous sessions that the same user has issued, and (ii) given a multi-query task for a user, predict whether the user will return to this task in the future. We model both problems within a classification framework that uses features of individual queries and long-term user search behavior at different granularity. Experimental evaluation of the proposed models for both tasks indicates that it is possible to effectively model and analyze cross-session search behavior. Our findings have implications for improving search for complex information needs and designing search engine features to support cross-session search tasks.
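To make the classification framing concrete, the following minimal sketch shows how labeled instances for the two problems could be derived from a task-annotated query log. It is an illustration only, not the authors' pipeline: the `Query` record, the session indices, the task labels, and the toy log are all hypothetical, and no features or classifier from the paper are reproduced.

```python
from collections import Counter
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Query:
    text: str
    session: int  # hypothetical session index, increasing over time
    task: str     # hypothetical gold task label (e.g., from annotation)


def same_task_pairs(log):
    """Problem (i): one binary instance per (current query, earlier query)
    pair drawn from different sessions; the label is True when both
    queries belong to the same task."""
    instances = []
    for earlier, later in combinations(log, 2):
        if earlier.session < later.session:
            instances.append(((later.text, earlier.text), earlier.task == later.task))
    return instances


def task_continuation_instances(log, cutoff_session):
    """Problem (ii): one binary instance per multi-query task observed up to
    the cutoff session; the label is True when the task reappears later."""
    counts = Counter(q.task for q in log if q.session <= cutoff_session)
    future = {q.task for q in log if q.session > cutoff_session}
    return {task: task in future for task, n in sorted(counts.items()) if n >= 2}


# Toy log, ordered by time (assumed data, for illustration only).
log = [
    Query("cheap flights to rome", 1, "rome trip"),
    Query("rome hotels", 1, "rome trip"),
    Query("nba scores", 1, "sports"),
    Query("nba standings", 1, "sports"),
    Query("rome hotels near colosseum", 2, "rome trip"),
]

pairs = same_task_pairs(log)
continuation = task_continuation_instances(log, cutoff_session=1)
```

In this sketch each instance would then be turned into a feature vector and passed to a binary classifier; which features and which classifier to use is exactly what the paper investigates.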


Index Terms

Modeling and analysis of cross-session search tasks
1. Information systems
  1. Information retrieval


Reviews

Reviewer: Kazunari Sugiyama

Current search engines cannot provide acceptable results for complicated needs that require a user to issue a series of queries over multiple search sessions ("cross-session" in this paper), such as planning a vacation. Kotov et al. model and analyze cross-session information needs by identifying all previous queries in a user's search history that are dedicated to the same task as the current query, and by predicting whether a user will return to the task in future sessions. They formalize both problems as simple supervised classification tasks, and obtain the promising finding that knowledge of a user's previous queries on the same long-term task enables a search engine to support task resumption. To define the tasks, they employ both automatic initial labeling and additional human annotation: two queries are labeled as belonging to the same task if the similarity between their term sets exceeds a threshold. The proposed approach achieved more than 70 percent accuracy. The authors employ only two regression-based classifiers; to verify which classifier is most effective for this type of task, they should also try other popular classifiers, such as support vector machines and maximum entropy models. They analyze in detail which features are important in the classifiers obtained by logistic regression. Their findings are helpful cues for researchers who work on user search behavior.

Online Computing Reviews Service
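The automatic labeling step mentioned in the review can be sketched as follows. This is a minimal illustration under stated assumptions: the review does not name the similarity measure or the threshold, so Jaccard similarity over whitespace-separated lowercase terms and a cutoff of 0.3 are stand-ins chosen for the example, and the function names are hypothetical.

```python
def term_set(query):
    """Lowercase, whitespace-tokenized term set of a query (assumed tokenization)."""
    return set(query.lower().split())


def jaccard(a, b):
    """Jaccard similarity of two term sets; defined as 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def same_task(q1, q2, threshold=0.3):
    """Label two queries as the same task when their term-set similarity
    exceeds the threshold (0.3 is an assumed value, not from the paper)."""
    return jaccard(term_set(q1), term_set(q2)) > threshold


related = same_task("rome hotels", "rome hotels near colosseum")  # 2 shared of 4 terms
weak = same_task("rome hotels", "cheap flights to rome")          # 1 shared of 5 terms
unrelated = same_task("rome hotels", "nba scores")                # no shared terms
```

A purely lexical rule like this misses related queries that share no terms, which is one reason an initial automatic labeling would be followed by human annotation, as the review describes.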


Published in
SIGIR '11: Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval
July 2011
1374 pages
ISBN: 9781450307574
DOI: 10.1145/2009916
General Chairs: Wei-Ying Ma (Microsoft Research Asia, China), Jian-Yun Nie (University of Montreal, Canada)
Program Chairs: Ricardo Baeza-Yates (Yahoo! Research, Spain), Tat-Seng Chua (National University of Singapore), W. Bruce Croft (University of Massachusetts, Amherst, USA)
Copyright © 2011 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
cross-session search tasks
machine learning
user behavior
Qualifiers
- research-article
Acceptance Rates
Overall Acceptance Rate: 792 of 3,983 submissions, 20%
