Time segment language model for microblog retrieval

Han, Zhong-yuan; Kong, Lei-lei; Qi, Hao-liang

doi:10.1007/s00521-020-05534-x

S.I.: Higher Level Artificial Neural Network Based Intelligent Systems
Published: 03 January 2021

Neural Computing and Applications, Volume 33, pages 4763–4777 (2021)

Abstract

Related studies have shown that the temporal characteristics of microblogs can improve retrieval performance. However, these studies focus mainly on the time distribution of tweets related to a given query, and this single characteristic may not be sufficient to reflect the temporal characteristics of microblogs. Inspired by the recent success of time-based language models for microblog retrieval, this paper proposes a time segment language model (TSLM) to model the temporal characteristics of microblogs. Briefly, TSLM constructs a language model for each time segment, modeling the probability distribution over word sequences within that segment. Based on TSLM, the time distribution of terms (tDT), the time distribution of queries (tDQ), and the time distribution of documents (tDD) are proposed. Furthermore, TSLM is exploited to estimate the query model and the document model and to compute the similarity between query and document. Experimental results on the Tweets2011 corpus show that the proposed approaches outperform several state-of-the-art baselines.
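The abstract does not give the model's formulas, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes fixed-width time segments, unigram counts, add-mu smoothing, a term's time distribution taken as the share of its occurrences falling in each segment, and a query's time distribution taken as the plain average over its terms. All function names, the `mu` parameter, and the toy tweets are hypothetical.

```python
from collections import Counter, defaultdict


def build_segment_counts(tweets, segment_seconds):
    """Group tokens into fixed-width time segments (assumed segmentation).

    tweets: iterable of (timestamp_in_seconds, list_of_tokens).
    Returns {segment_index: Counter of term counts}.
    """
    counts = defaultdict(Counter)
    for timestamp, tokens in tweets:
        counts[int(timestamp // segment_seconds)].update(tokens)
    return dict(counts)


def term_prob(counts, segment, term, vocab_size, mu=1.0):
    """Add-mu smoothed unigram estimate of P(term | segment) (assumed smoothing)."""
    seg = counts.get(segment, Counter())
    total = sum(seg.values())
    return (seg[term] + mu) / (total + mu * vocab_size)


def term_time_distribution(counts, term):
    """Share of a term's occurrences in each segment (one possible reading of tDT)."""
    per_segment = {s: c[term] for s, c in counts.items() if c[term] > 0}
    total = sum(per_segment.values())
    return {s: n / total for s, n in per_segment.items()} if total else {}


def query_time_distribution(counts, query_terms):
    """Average of the per-term distributions (one possible reading of tDQ)."""
    dists = [term_time_distribution(counts, t) for t in query_terms]
    dists = [d for d in dists if d]  # ignore terms never seen in the corpus
    if not dists:
        return {}
    out = defaultdict(float)
    for d in dists:
        for segment, p in d.items():
            out[segment] += p / len(dists)
    return dict(out)


# Toy data: four tweets spread over three one-hour segments.
tweets = [
    (0, ["quake", "japan"]),
    (10, ["quake"]),
    (3600, ["quake", "relief"]),
    (7200, ["relief"]),
]
counts = build_segment_counts(tweets, segment_seconds=3600)
quake_dist = term_time_distribution(counts, "quake")
query_dist = query_time_distribution(counts, ["quake", "relief"])
```

In this toy example "quake" is concentrated in the first hour while "relief" appears later, so the averaged query distribution spreads over all three segments. A document-side distribution (tDD) could be built the same way over a document's terms, again only as an assumption.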

Notes

http://twittertools.cc/.
The corpus contains about 16 million tweets, but some microblogs could not be downloaded for reasons such as having been deleted or hidden.
http://nutch.apache.org/.
http://www.lemurproject.org/indri/.

Author information

Authors and Affiliations

School of Electronic Information Engineering, Foshan University, Foshan, 528231, China
Zhong-yuan Han, Lei-lei Kong & Hao-liang Qi

Corresponding author

Correspondence to Lei-lei Kong.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This work is supported by the National Social Science Fund of China (No. 18BYY125).

About this article

Cite this article

Han, Zy., Kong, Ll. & Qi, Hl. Time segment language model for microblog retrieval. Neural Comput & Applic 33, 4763–4777 (2021). https://doi.org/10.1007/s00521-020-05534-x

Received: 28 August 2020
Accepted: 11 November 2020
Published: 03 January 2021
Issue Date: May 2021
DOI: https://doi.org/10.1007/s00521-020-05534-x
