Detecting Bursty Topics of Correlated News and Twitter for Government Services

Social Media for Government Services

Abstract

This chapter presents a framework for detecting bursty topics in correlated news and Twitter, and discusses how to integrate the framework into government services. As a specific application, the chapter collects news articles and tweets related to “the 2012 London Olympic Games” and applies the proposed framework to them.

Notes

  1. http://www.yomiuri.co.jp/.

  2. http://www.asahi.com/.

  3. https://twitter.com/.

  4. In Kleinberg [1], \(\tau (i,j)\) is defined not as \((j - i)\gamma\) but as \((j - i)\gamma \ln m\), where \(m\) is the number of batches in the sequence \({\mathbf{B}} = (B_{1} , \ldots ,B_{m} )\). For simplicity, this chapter omits the factor \(\ln m\) from the definition.

  5. http://www.cs.princeton.edu/~blei/topicmodeling.html.

  6. http://ja.wikipedia.org/.

  7. These evaluation results are still based on an inside evaluation: the two parameters \(s\) and \(\gamma\) are optimized on the same news articles and tweets that are used for the evaluation shown in this chapter. However, the two parameters are tuned across the 34 evaluation topics, and we observed that their optimal values are mostly consistent across those topics. Parameter optimization with held-out training data is left for future work.

  8. Although Table 2 shows evaluation results only for the 34 topics that are relevant to “the London Olympic Games”, the precision of the detected bursty topics over all 50 topics is also about 90% per day/topic, for both news articles and tweet texts.
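
The transition cost in Note 4 can be made concrete with a small sketch. The chapter's own implementation is not shown here, so the following assumes Kleinberg's two-state model for batched arrivals [1]: state 0 emits relevant documents at the base rate \(p_0\), state 1 at the raised rate \(p_1 = s\,p_0\), and the state sequence that minimizes the sum of emission and transition costs is found by dynamic programming. The function names, the `use_ln_m` switch, the cap on \(p_1\), and the toy counts are illustrative assumptions, not taken from the chapter.

```python
import math


def transition_cost(i, j, gamma, m=None):
    """Cost of moving from state i to state j.

    With m=None this is the simplified (j - i) * gamma of Note 4;
    with m given it is Kleinberg's (j - i) * gamma * ln(m).
    Staying in place or moving to a lower state is free.
    """
    if j <= i:
        return 0.0
    scale = math.log(m) if m is not None else 1.0
    return (j - i) * gamma * scale


def emission_cost(p, relevant, total):
    """Negative log binomial likelihood of `relevant` out of `total` documents."""
    log_binom = (math.lgamma(total + 1) - math.lgamma(relevant + 1)
                 - math.lgamma(total - relevant + 1))
    return -(log_binom + relevant * math.log(p)
             + (total - relevant) * math.log(1.0 - p))


def detect_bursts(relevant, totals, s=2.0, gamma=1.0, use_ln_m=False):
    """Return one state per batch (1 = bursty) with minimum total cost."""
    m = len(relevant)
    p0 = sum(relevant) / sum(totals)
    if not 0.0 < p0 < 1.0:
        raise ValueError("base rate must lie strictly between 0 and 1")
    probs = (p0, min(s * p0, 0.9999))  # cap keeps p1 a valid probability
    m_arg = m if use_ln_m else None

    # cost[q] = cheapest cost of any state sequence ending in state q.
    # The automaton is assumed to start in state 0 before the first batch.
    cost = [transition_cost(0, q, gamma, m_arg)
            + emission_cost(probs[q], relevant[0], totals[0]) for q in (0, 1)]
    back = []
    for t in range(1, m):
        new_cost, pointers = [], []
        for q in (0, 1):
            prev = min((0, 1),
                       key=lambda r: cost[r] + transition_cost(r, q, gamma, m_arg))
            pointers.append(prev)
            new_cost.append(cost[prev]
                            + transition_cost(prev, q, gamma, m_arg)
                            + emission_cost(probs[q], relevant[t], totals[t]))
        cost = new_cost
        back.append(pointers)

    # Trace the cheapest path backwards from the last batch.
    state = 0 if cost[0] <= cost[1] else 1
    states = [state]
    for pointers in reversed(back):
        state = pointers[state]
        states.append(state)
    states.reverse()
    return states


if __name__ == "__main__":
    # Toy data: documents on one topic per day, out of 100 documents per day.
    on_topic = [1, 1, 1, 20, 25, 22, 1, 1]
    per_day = [100] * 8
    print(detect_bursts(on_topic, per_day, s=2.0, gamma=1.0))
```

Setting `use_ln_m=True` restores the \(\ln m\) factor of Kleinberg's original definition. In either variant, a larger \(\gamma\) makes entering the burst state more expensive, so fewer batches are labelled bursty.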

References

  1. Kleinberg, J. (2002). Bursty and hierarchical structure in streams. In Proceedings of 8th SIGKDD (pp. 91–101).

  2. Blei, D. M., Ng, A. Y., & Jordan, M. I. (2003). Latent Dirichlet allocation. Journal of Machine Learning Research, 3, 993–1022.

  3. Blei, D. M., & Lafferty, J. D. (2006). Dynamic topic models. In Proceedings of 23rd ICML (pp. 113–120).

  4. Takahashi, Y., Utsuro, T., Yoshioka, M., Kando, N., Fukuhara, T., Nakagawa, H., & Kiyota, Y. (2012). Applying a burst model to detect bursty topics in a topic model. In JapTAL 2012 (Vol. 7614 of LNCS, pp. 239–249). Berlin: Springer.

  5. Mane, K., & Börner, K. (2004). Mapping topics and topic bursts in PNAS. Proceedings of the National Academy of Sciences, 101(Suppl. 1), 5287–5290.

  6. AlSumait, L., Barbará, D., Gentle, J., & Domeniconi, C. (2009). Topic significance ranking of LDA generative models. In Proceedings of ECML/PKDD (pp. 67–82).

  7. Wang, X., Zhai, C. X., & Hu, R. S. (2007). Mining correlated bursty topic patterns from coordinated text streams. In Proceedings of 13th SIGKDD (pp. 784–793).

  8. Zhang, J., Song, Y., Zhang, C., & Liu, S. (2010). Evolutionary hierarchical Dirichlet processes for multiple correlated time-varying corpora. In Proceedings of 16th SIGKDD (pp. 1079–1088).

  9. Petrović, S., Osborne, M., & Lavrenko, V. (2010). Streaming first story detection with application to Twitter. In HLT-NAACL (pp. 181–189).

  10. Weng, J., & Lee, B. S. (2011). Event detection in Twitter. In Proceedings of Fifth ICWSM (pp. 401–408).

  11. Li, C., Sun, A., & Datta, A. (2012). Twevent: Segment-based event detection from tweets. In Proceedings of 21st CIKM (pp. 155–164).

  12. Diao, Q., Jiang, J., Zhu, F., & Lim, E. P. (2012). Finding bursty topics from microblogs. In Proceedings of 50th ACL (pp. 536–544).

  13. AlSumait, L., Barbará, D., & Domeniconi, C. (2008). On-Line LDA: Adaptive topic models for mining text streams with applications to topic detection and tracking. In Proceedings of 8th ICDM (pp. 3–12).

Author information

Corresponding author

Correspondence to Takehito Utsuro.

Copyright information

© 2015 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Utsuro, T., Inoue, Y., Imada, T., Yoshioka, M., Kando, N. (2015). Detecting Bursty Topics of Correlated News and Twitter for Government Services. In: Nepal, S., Paris, C., Georgakopoulos, D. (eds) Social Media for Government Services. Springer, Cham. https://doi.org/10.1007/978-3-319-27237-5_7

  • DOI: https://doi.org/10.1007/978-3-319-27237-5_7

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-27235-1

  • Online ISBN: 978-3-319-27237-5

  • eBook Packages: Computer Science, Computer Science (R0)
