DOI: 10.1145/3077136.3080808

Online In-Situ Interleaved Evaluation of Real-Time Push Notification Systems

Published: 07 August 2017

Abstract

Real-time push notification systems monitor continuous document streams such as social media posts and alert users to relevant content directly on their mobile devices. We describe a user study of such systems in the context of the TREC 2016 Real-Time Summarization Track, where system updates are immediately delivered as push notifications to the mobile devices of a cohort of users. Our study represents, to our knowledge, the first deployment of an interleaved evaluation framework for prospective information needs, and also provides an opportunity to examine user behavior in a realistic setting. Results of our online in-situ evaluation are correlated against the results of a more traditional post-hoc batch evaluation. We observe substantial correlations between many online and batch evaluation metrics, especially those that share the same basic design (e.g., utility-based metrics). For some metrics we observe little correlation, but we are able to identify the volume of messages that a system pushes as one major source of the differences.

Published In

SIGIR '17: Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval
August 2017
1476 pages
ISBN: 9781450350228
DOI: 10.1145/3077136
Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. microblogs
  2. trec
  3. user study

Qualifiers

  • Research-article

Acceptance Rates

SIGIR '17 Paper Acceptance Rate: 78 of 362 submissions (22%)
Overall Acceptance Rate: 792 of 3,983 submissions (20%)

Cited By

  • (2022) Reflection on future directions: a systematic review of reported limitations and solutions in interactive information retrieval user studies. Aslib Journal of Information Management. DOI: 10.1108/AJIM-05-2022-0253. Online publication date: 19-Dec-2022
  • (2022) NaPP: Notification and Push Performance in Wearable Devices. Proceedings of the Future Technologies Conference (FTC) 2022, Volume 2, 634-648. DOI: 10.1007/978-3-031-18458-1_43. Online publication date: 13-Oct-2022
  • (2021) An Effective Hybrid Learning Model for Real-Time Event Summarization. IEEE Transactions on Neural Networks and Learning Systems 32:10, 4419-4431. DOI: 10.1109/TNNLS.2020.3017747. Online publication date: Oct-2021
  • (2021) Design and analysis of microblog-based summarization system. Social Network Analysis and Mining 11:1. DOI: 10.1007/s13278-021-00830-3. Online publication date: 2-Nov-2021
  • (2020) Intelligent Notification Systems. Synthesis Lectures on Mobile and Pervasive Computing 11:1, 1-75. DOI: 10.2200/S00965ED1V01Y201911MPC014. Online publication date: 3-Jan-2020
  • (2020) AHPCap: A Framework for Automated Hardware Profiling and Capture of Mobile Application States. 2020 IEEE International Symposium on Software Reliability Engineering Workshops (ISSREW), 183-188. DOI: 10.1109/ISSREW51248.2020.00069. Online publication date: Oct-2020
  • (2019) NotifyMeHere. Proceedings of the 2019 Conference on Human Information Interaction and Retrieval, 103-111. DOI: 10.1145/3295750.3298932. Online publication date: 8-Mar-2019
  • (2019) MARES. World Wide Web 22:2, 499-515. DOI: 10.1007/s11280-018-0597-7. Online publication date: 1-Mar-2019
  • (2018) Update Delivery Mechanisms for Prospective Information Needs. The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval, 785-794. DOI: 10.1145/3209978.3210018. Online publication date: 27-Jun-2018
  • (2017) Event Detection on Curated Tweet Streams. Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval, 1325-1328. DOI: 10.1145/3077136.3084141. Online publication date: 7-Aug-2017
