Mining the Temporal Statistics of Query Terms for Searching Social Media Posts

Authors:

Jimmy Lin

ICTIR '17: Proceedings of the ACM SIGIR International Conference on Theory of Information Retrieval

Pages 133 - 140

https://doi.org/10.1145/3121050.3121052

Published: 01 October 2017

Abstract

There is an emerging consensus that time is an important indicator of relevance for searching streams of social media posts. In a process similar to pseudo-relevance feedback, the distribution of document timestamps from the results of an initial query can be leveraged to infer the distribution of relevant documents, for example, using kernel density estimation. In this paper, we explore an alternative approach to mining relevance signals directly from the temporal statistics of query terms in the collection, without the need to perform an initial retrieval. We propose two approaches: a linear ranking model that combines features derived from temporal collection statistics of query terms and a regression-based method that attempts to directly predict the distribution of relevant documents from query term statistics. Experiments on standard tweet test collections show that our proposed methods significantly outperform competitive baselines. Furthermore, studies of different feature combinations show the extent to which different types of temporal signals impact retrieval effectiveness.
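
As a rough illustration of the feedback-style baseline the abstract mentions (estimating a temporal density from the timestamps of an initial result list), here is a minimal sketch in plain Python. It is not the authors' implementation: the function names `gaussian_kde` and `rerank`, the Gaussian kernel, the fixed `bandwidth`, and the interpolation weight `alpha` are all assumptions made for this example.

```python
import math


def gaussian_kde(timestamps, bandwidth, weights=None):
    """Return a density function estimated from sample timestamps.

    Hypothetical helper: a weighted Gaussian kernel density estimate
    with a fixed, user-supplied bandwidth.
    """
    if not timestamps:
        raise ValueError("need at least one timestamp")
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    if weights is None:
        weights = [1.0] * len(timestamps)
    total = sum(weights)
    norm = 1.0 / (total * bandwidth * math.sqrt(2.0 * math.pi))

    def density(t):
        return norm * sum(
            w * math.exp(-0.5 * ((t - x) / bandwidth) ** 2)
            for x, w in zip(timestamps, weights)
        )

    return density


def rerank(results, bandwidth=1.0, alpha=0.5):
    """Re-score (doc_id, lexical_score, timestamp) triples.

    Assumed combination: a linear interpolation of the lexical score
    and the log of the estimated temporal density at the document's
    timestamp. Returns (doc_id, combined_score) pairs, best first.
    """
    density = gaussian_kde([ts for _, _, ts in results], bandwidth)
    rescored = [
        # The small constant guards against log(0) for isolated timestamps.
        (doc_id, (1 - alpha) * score + alpha * math.log(density(ts) + 1e-12))
        for doc_id, score, ts in results
    ]
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Toy result list: two posts close together in time, one far away.
    initial = [("a", 1.0, 10.0), ("b", 1.0, 10.5), ("c", 1.0, 30.0)]
    for doc_id, score in rerank(initial):
        print(doc_id, round(score, 3))
```

With equal lexical scores, the post that is temporally isolated from the rest of the result list ends up ranked last, which is the intuition behind using timestamp density as a relevance signal. The approach proposed in the paper differs in that it avoids the initial retrieval step and works from collection-level term statistics instead.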

Index Terms

1. Information systems
  1. Information retrieval

Published In

ICTIR '17: Proceedings of the ACM SIGIR International Conference on Theory of Information Retrieval

October 2017

348 pages

ISBN:9781450344906

DOI:10.1145/3121050

General Chairs: Jaap Kamps (University of Amsterdam, The Netherlands), Evangelos Kanoulas (University of Amsterdam, The Netherlands), Maarten de Rijke (University of Amsterdam, The Netherlands)

Program Chairs: Hui Fang (University of Delaware, USA), Emine Yilmaz (University College London, UK)

Copyright © 2017 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

SIGIR: ACM Special Interest Group on Information Retrieval

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Conference

ICTIR '17: ACM SIGIR International Conference on the Theory of Information Retrieval

October 1 - 4, 2017

Amsterdam, The Netherlands

Acceptance Rates

ICTIR '17 Paper Acceptance Rate 27 of 54 submissions, 50%;

Overall Acceptance Rate 235 of 527 submissions, 45%
