DOI:10.1145/2009916.2009922
research-article

Modeling and analysis of cross-session search tasks

Published: 24 July 2011

ABSTRACT

The information needs of search engine users vary in complexity, depending on the task they are trying to accomplish. Some simple needs can be satisfied with a single query, whereas others require a series of queries issued over a longer period of time. While search engines effectively satisfy many simple needs, searchers receive little support when their information needs span session boundaries. In this work, we propose methods for modeling and analyzing user search behavior that extends over multiple search sessions. We focus on two problems: (i) given a user query, identify all of the related queries from previous sessions that the same user has issued, and (ii) given a multi-query task for a user, predict whether the user will return to this task in the future. We model both problems within a classification framework that uses features of individual queries and long-term user search behavior at different granularity. Experimental evaluation of the proposed models for both tasks indicates that it is possible to effectively model and analyze cross-session search behavior. Our findings have implications for improving search for complex information needs and designing search engine features to support cross-session search tasks.
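
As a rough illustration of problem (i), the sketch below splits one user's query log into sessions and lists, for the most recent query, every query from earlier sessions as a candidate for a same-task classifier. The 30-minute inactivity timeout, the `(timestamp, query)` log layout, and the function names are assumptions made for this example; the abstract does not specify them.

```python
from datetime import datetime, timedelta

# Assumption: a gap of more than 30 minutes between consecutive queries
# starts a new session. The abstract does not state a session definition.
SESSION_TIMEOUT = timedelta(minutes=30)


def split_into_sessions(log):
    """Group a user's (timestamp, query) records into time-ordered sessions."""
    sessions = []
    for ts, query in sorted(log):
        # Compare against the timestamp of the last query in the last session.
        if sessions and ts - sessions[-1][-1][0] <= SESSION_TIMEOUT:
            sessions[-1].append((ts, query))
        else:
            sessions.append([(ts, query)])
    return sessions


def cross_session_candidates(log):
    """Pair the latest query with every query from the user's earlier sessions."""
    sessions = split_into_sessions(log)
    if not sessions:
        return []
    current = sessions[-1][-1][1]
    return [(current, q) for session in sessions[:-1] for _, q in session]


# Hypothetical log for a single user.
log = [
    (datetime(2011, 7, 1, 9, 0), "flights to rome"),
    (datetime(2011, 7, 1, 9, 5), "rome hotels"),
    (datetime(2011, 7, 3, 20, 0), "weather forecast"),
    (datetime(2011, 7, 5, 18, 30), "rome hotels near colosseum"),
]
pairs = cross_session_candidates(log)
```

Each pair in `pairs` would then be scored by a classifier that decides whether the earlier query belongs to the same task as the current one.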

        Reviews

        Kazunari Sugiyama

        Current search engines cannot provide acceptable results for complicated needs that require a user to issue a series of queries across multiple search sessions ("cross-session" in this paper), such as planning a vacation. Kotov et al. model and analyze cross-session information needs by identifying all previous queries in a user's search history that are dedicated to the same task as the current query, and by predicting whether a user will return to the task in future sessions. They formalize both problems as simple supervised classification tasks, and obtain the promising finding that knowledge of a user's previous queries on the same long-term task enables a search engine to support task resumption. To define the task, they employ both automatic initial labeling and additional human annotation. Then, they label queries as belonging to the same task if the similarity between the term sets of two queries exceeds a threshold. The proposed approach achieved more than 70 percent accuracy. The authors employ only two regression-based classifiers; to verify which classifier is most effective for this type of task, they should also try other popular classifiers, such as support vector machines and maximum entropy models. They analyze in detail the features that matter most to the classifiers built with logistic regression. Their findings are helpful cues for researchers who work on user search behavior.

        Online Computing Reviews Service
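
The term-set labeling rule mentioned in the review can be sketched as follows. The review does not say which similarity measure or threshold was used, so Jaccard similarity, whitespace tokenization, the 0.5 cutoff, and the function names are all assumptions of this example. It covers only the automatic labeling step, not the human annotation.

```python
def term_set(query):
    """Lowercase a query and split it on whitespace into a set of terms."""
    return set(query.lower().split())


def jaccard(a, b):
    """Jaccard similarity of two term sets: |A & B| / |A | B|."""
    if not a and not b:
        return 0.0  # Treat two empty queries as dissimilar.
    return len(a & b) / len(a | b)


def same_task(q1, q2, threshold=0.5):
    """Label two queries as the same task when similarity exceeds the threshold.

    Assumption: both the measure and the default threshold are illustrative.
    """
    return jaccard(term_set(q1), term_set(q2)) > threshold
```

Under these assumptions, "rome hotels" and "cheap rome hotels" share two of three distinct terms, so they would be labeled as the same task, while "rome hotels" and "weather forecast" share none and would not.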

        • Published in

          SIGIR '11: Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval
          July 2011
          1374 pages
          ISBN:9781450307574
          DOI:10.1145/2009916

          Copyright © 2011 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Acceptance Rates

          Overall Acceptance Rate: 792 of 3,983 submissions, 20%
