Re-Finding Behaviour in Vertical Domains

Published: 05 June 2017

Abstract

Re-finding is the process of searching for information that a user has previously encountered and is a common activity carried out with information retrieval systems. In this work, we investigate re-finding in the context of vertical search, differentiating and modeling user re-finding behavior within different media and topic domains, including images, news, reference material, and movies. We distinguish the re-finding behavior in vertical domains from re-finding in a general search context and engineer features that are effective in differentiating re-finding across the domains. The features are then used to build machine-learned models, achieving an average re-finding detection accuracy of 85.7% across the verticals. Our results demonstrate that detecting re-finding in specific verticals is more difficult than examining re-finding for general search tasks. We then investigate the effectiveness of differentiating re-finding behavior in two restricted contexts. First, we consider the case where the history of a searcher’s interactions with the search system is not available; in this scenario, our features and models achieve an average accuracy of 77.5% across the domains. Second, we examine the detection of re-finding during the early part of a search session. Both of these restrictions represent potential real-world search scenarios, where a system is attempting to learn about a user but may have limited information available. Finally, we investigate in which types of domains re-finding is most difficult. Here, it would appear that re-finding images is particularly challenging for users. This research has implications for search engine design, in terms of adapting search results by predicting the type of user tasks and potentially enabling the presentation of vertical-specific results when re-finding is identified. To the best of our knowledge, this is the first work to investigate the issue of vertical re-finding.

Supplementary Material

a21-sadeghi-appndx.pdf (sadeghi.zip)
Supplemental movie, appendix, image, and software files for Re-Finding Behaviour in Vertical Domains

Cited By

  • (2021) Towards Understanding Complex Known-Item Requests on Reddit. Proceedings of the 32nd ACM Conference on Hypertext and Social Media, 143-154. DOI: 10.1145/3465336.3475096. Online publication date: 30-Aug-2021
  • (2019) Re-finding Behaviour in Educational Search. Digital Libraries for Open Knowledge, 401-405. DOI: 10.1007/978-3-030-30760-8_43. Online publication date: 9-Sep-2019

    Published In

    ACM Transactions on Information Systems  Volume 35, Issue 3
    July 2017
    410 pages
    ISSN:1046-8188
    EISSN:1558-2868
    DOI:10.1145/3026478
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 05 June 2017
    Accepted: 01 July 2016
    Revised: 01 July 2016
    Received: 01 November 2015
    Published in TOIS Volume 35, Issue 3

    Author Tags

    1. Re-finding behavior
    2. difficulty
    3. predictive models
    4. search feature
    5. vertical

    Qualifiers

    • Research-article
    • Research
    • Refereed

    Funding Sources

    • Australian Postgraduate Awards (APA)
    • Yahoo Research Award
