External Expansion Risk Management: Enhancing Microblogging Filtering Using Implicit Query

Yang, Zhen; Gao, Kaiming; Huang, Jian

doi:10.1007/s11277-017-5075-5

Published: 30 November 2017

Volume 102, pages 2199–2209, (2018)

Wireless Personal Communications

Abstract

Microblogging filtering helps users filter out irrelevant content and extract timely content from microblogs effectively. However, because microblog posts are typically short texts, filtering suffers from an insufficient-samples problem that makes probabilistic models unreliable. Current research regards an explicit brief query as only an abstract of the user's information needs, from which it is hard to infer the user's actual search intent. Instead, we submit relevant external documents as the user's implicit prior knowledge and build a corresponding filtering framework. To guard against the risk of external document expansion, we assume that an external document can be viewed as a complete statement of an explicit query, and we encode the filtering preferences with the degree of divergence between the external document and the original explicit query. The optimal filtering action is thus the one that trades off the degree of divergence against generalization performance. Compared with established baselines, our algorithm yields compelling results for retrieving meaningful tweets. This work furthers understanding of the innate risk characteristics of external expansion in the design of microblogging filtering systems.
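
The trade-off described in the abstract can be illustrated with a small sketch. It is not the authors' algorithm: the abstract does not say how the degree of divergence is measured, so the sketch assumes KL divergence between additively smoothed unigram language models, and every name and parameter in it (`unigram_lm`, `expansion_weight`, `score_tweet`, `mu`, `lam`, `max_share`) is hypothetical. The idea shown is only that an external document that diverges strongly from the explicit query should contribute less to the expanded query model.

```python
import math
from collections import Counter


def unigram_lm(text, vocab, mu=0.5):
    """Additively smoothed unigram distribution over vocab.

    mu is an assumed smoothing constant, not a value from the paper.
    """
    counts = Counter(text.lower().split())
    total = sum(counts[w] for w in vocab) + mu * len(vocab)
    return {w: (counts[w] + mu) / total for w in vocab}


def kl_divergence(p, q):
    """KL(p || q) for two distributions defined over the same vocabulary."""
    return sum(p[w] * math.log(p[w] / q[w]) for w in p)


def expansion_weight(query, external_doc, lam=1.0):
    """Map query/document divergence to a trust weight in (0, 1].

    Assumption: divergence is KL(query model || document model); a larger
    divergence means a riskier expansion and therefore a smaller weight.
    """
    vocab = set(query.lower().split()) | set(external_doc.lower().split())
    p_q = unigram_lm(query, vocab)
    p_d = unigram_lm(external_doc, vocab)
    return math.exp(-lam * kl_divergence(p_q, p_d))


def score_tweet(tweet, query, external_doc, lam=1.0, max_share=0.5):
    """Score a tweet against a query model expanded by the external document.

    The document's share of the expanded model is max_share scaled by the
    trust weight; the score is the negative KL divergence from the expanded
    model to the tweet model (higher means more relevant).
    """
    w = max_share * expansion_weight(query, external_doc, lam)
    vocab = set((query + " " + external_doc + " " + tweet).lower().split())
    p_q = unigram_lm(query, vocab)
    p_d = unigram_lm(external_doc, vocab)
    p_t = unigram_lm(tweet, vocab)
    p_mix = {t: (1 - w) * p_q[t] + w * p_d[t] for t in vocab}
    return -kl_divergence(p_mix, p_t)


if __name__ == "__main__":
    query = "earthquake japan"
    on_topic_doc = "earthquake japan tsunami warning"
    off_topic_doc = "cooking pasta recipe tomato"
    print(expansion_weight(query, on_topic_doc))
    print(expansion_weight(query, off_topic_doc))
    print(score_tweet("tsunami warning issued japan", query, on_topic_doc))
    print(score_tweet("pasta recipe tonight", query, on_topic_doc))
```

In this toy setting the on-topic document receives a larger weight than the off-topic one, and a tweet that shares vocabulary with the expanded query scores higher than an unrelated tweet. Because each score is computed over its own vocabulary, the values are only roughly comparable across tweets; a fixed collection vocabulary would be a natural refinement.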

Notes

http://trec.nist.gov/data/microblog.html.

Acknowledgements

This work was partly supported by the National Natural Science Foundation of China (Grant No. 61671030) and the National Key R&D Program of China (No. 2017YFC0803300).

Author information

Authors and Affiliations

College of Computer Science, Faculty of Information Technology, Beijing University of Technology, Beijing, 100124, China
Zhen Yang & Kaiming Gao
Central University of Finance and Economics, Beijing, 102206, China
Jian Huang

Corresponding author

Correspondence to Zhen Yang.

About this article

Cite this article

Yang, Z., Gao, K. & Huang, J. External Expansion Risk Management: Enhancing Microblogging Filtering Using Implicit Query. Wireless Pers Commun 102, 2199–2209 (2018). https://doi.org/10.1007/s11277-017-5075-5

Issue Date: October 2018
