DOI:10.1145/3121050.3121052
research-article
Public Access

Mining the Temporal Statistics of Query Terms for Searching Social Media Posts

Published: 01 October 2017

Abstract

There is an emerging consensus that time is an important indicator of relevance for searching streams of social media posts. In a process similar to pseudo-relevance feedback, the distribution of document timestamps from the results of an initial query can be leveraged to infer the distribution of relevant documents, for example, using kernel density estimation. In this paper, we explore an alternative approach to mining relevance signals directly from the temporal statistics of query terms in the collection, without the need to perform an initial retrieval. We propose two approaches: a linear ranking model that combines features derived from temporal collection statistics of query terms and a regression-based method that attempts to directly predict the distribution of relevant documents from query term statistics. Experiments on standard tweet test collections show that our proposed methods significantly outperform competitive baselines. Furthermore, studies of different feature combinations show the extent to which different types of temporal signals impact retrieval effectiveness.
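The temporal feedback idea mentioned in the abstract (estimating where relevant documents fall in time from the timestamps of an initial result list) can be illustrated with a small kernel density estimate. This is a minimal sketch of that baseline idea, not the paper's proposed method: the Gaussian kernel, the bandwidth value, the linear interpolation weight `alpha`, the function names, and the toy data are all assumptions made for illustration.

```python
import math


def gaussian_kde(timestamps, weights=None, bandwidth=1.0):
    """Build a (weighted) Gaussian kernel density estimate over timestamps.

    Returns a function mapping a time t to an estimated density value.
    """
    if not timestamps:
        raise ValueError("need at least one timestamp")
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    if weights is None:
        weights = [1.0] * len(timestamps)
    total = sum(weights)
    # Normalizing constant so the estimate integrates to 1.
    norm = 1.0 / (total * bandwidth * math.sqrt(2.0 * math.pi))

    def density(t):
        return norm * sum(
            w * math.exp(-0.5 * ((t - x) / bandwidth) ** 2)
            for x, w in zip(timestamps, weights)
        )

    return density


def rerank(candidates, density, alpha=0.5):
    """Interpolate a lexical score with the temporal density (illustrative only).

    candidates: list of (doc_id, lexical_score, timestamp) tuples.
    Returns (doc_id, combined_score) pairs sorted by descending score.
    """
    scored = [
        (doc_id, (1.0 - alpha) * lexical + alpha * density(t))
        for doc_id, lexical, t in candidates
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical timestamps (in days) of the top results of an initial query:
# most cluster around day 10, with one stray document at day 3.
initial_result_days = [9.5, 10.0, 10.2, 10.4, 11.0, 3.0]
density = gaussian_kde(initial_result_days, bandwidth=0.5)

# Hypothetical candidates with identical lexical scores, so only time differs.
candidates = [("a", 0.6, 10.1), ("b", 0.6, 3.1), ("c", 0.6, 20.0)]
ranked = rerank(candidates, density)
print([doc_id for doc_id, _ in ranked])
```

With identical lexical scores, the document closest to the burst around day 10 ranks first and the document far from every observed timestamp ranks last. The approach explored in the paper differs in that it derives temporal signals from collection statistics of the query terms, so no initial retrieval is needed.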

Cited By

  • (2022) Evolutionary Poisson Factorization Based Multiple Trust Relationships. Pattern Recognition and Image Analysis 32:1 (218-227). DOI:10.1134/S1054661822010126. Online publication date: 18-Mar-2022
  • (2021) The Information Retrieval Anthology. Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval (2550-2555). DOI:10.1145/3404835.3462798. Online publication date: 11-Jul-2021
  • (2021) Time segment language model for microblog retrieval. Neural Computing and Applications. DOI:10.1007/s00521-020-05534-x. Online publication date: 3-Jan-2021

Published In

ICTIR '17: Proceedings of the ACM SIGIR International Conference on Theory of Information Retrieval
October 2017
348 pages
ISBN:9781450344906
DOI:10.1145/3121050
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. kernel density estimation
  2. temporal collection statistics
  3. temporal ranking
  4. twitter

Qualifiers

  • Research-article

Conference

ICTIR '17

Acceptance Rates

ICTIR '17 paper acceptance rate: 27 of 54 submissions (50%)
Overall acceptance rate: 235 of 527 submissions (45%)
