Yet another approach to understanding news event evolution

World Wide Web

Abstract

With the explosion of information on the Internet, a search engine that merely returns a ranked list of documents no longer meets people’s need to understand news events. A more intelligent news event search engine should not only retrieve all documents related to a specific event but also provide a global view of how the event originates and evolves. Meeting this challenge involves two tasks: event news retrieval and eventline generation. For event news retrieval, existing approaches focus mainly on document-level similarity when retrieving related news documents and do not make effective use of external knowledge. To this end, we propose a similarity model, Event-Oriented Similarity, which combines document-level and knowledge-level similarity to retrieve news documents related to a specific event. For eventline generation, to outline the event structure more accurately, we construct an Event-Oriented Similarity Graph that represents the relationships among the retrieved event news documents, and we develop a community detection algorithm to segment sub-events, which are then chained into a cohesive eventline. Experimental results on real-world datasets demonstrate that the proposed approach outperforms existing methods.
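The pipeline outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the formulas, so everything below is an assumption made for illustration. The document-level score is a bag-of-words cosine, the knowledge-level score is a Jaccard overlap of entity sets, the two are mixed with a hypothetical weight `alpha`, connected components stand in for the paper's community detection algorithm, and sub-events are chained simply by earliest date. The toy documents and all function names are invented.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def jaccard(x, y):
    """Jaccard overlap between two sets (0.0 when both are empty)."""
    union = x | y
    return len(x & y) / len(union) if union else 0.0

def combined_similarity(d1, d2, alpha=0.5):
    """Weighted mix of a document-level and a knowledge-level score (assumed form)."""
    doc_sim = cosine(Counter(d1["text"].lower().split()), Counter(d2["text"].lower().split()))
    kb_sim = jaccard(set(d1["entities"]), set(d2["entities"]))
    return alpha * doc_sim + (1 - alpha) * kb_sim

def build_graph(docs, threshold=0.3, alpha=0.5):
    """Undirected graph: link two documents when their combined score reaches the threshold."""
    adj = {i: set() for i in range(len(docs))}
    for i in range(len(docs)):
        for j in range(i + 1, len(docs)):
            if combined_similarity(docs[i], docs[j], alpha) >= threshold:
                adj[i].add(j)
                adj[j].add(i)
    return adj

def connected_components(adj):
    """Stand-in for community detection: each connected component is one sub-event."""
    seen, groups = set(), []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack, group = [start], []
        while stack:
            node = stack.pop()
            group.append(node)
            for neighbour in adj[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        groups.append(sorted(group))
    return groups

def eventline(docs, threshold=0.3, alpha=0.5):
    """Chain sub-events in order of their earliest document date (ISO date strings)."""
    groups = connected_components(build_graph(docs, threshold, alpha))
    return sorted(groups, key=lambda g: min(docs[i]["date"] for i in g))

# Invented toy documents; entity lists would normally come from an entity linker.
docs = [
    {"text": "plane crash wreckage found desert", "entities": ["AirlineX", "CountryY"], "date": "2020-01-02"},
    {"text": "plane crash investigators recover recorder", "entities": ["AirlineX", "CountryY"], "date": "2020-01-03"},
    {"text": "airline announces new route", "entities": ["AirlineZ"], "date": "2020-01-01"},
]
line = eventline(docs)  # list of sub-events, each a list of document indices
```

In this toy run the two crash reports share both vocabulary and entities, so they fall into one sub-event, while the unrelated route announcement forms its own and is placed first because it is dated earliest. A weighted graph with a modularity-based or map-equation method would be closer in spirit to what the abstract describes.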

Acknowledgments

This research is supported in part by the National Key Research and Development Program of China under Grant 2018YFC0806900.

Author information

Corresponding author

Correspondence to Longtao Huang.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Lv, S., Huang, L., Zang, L. et al. Yet another approach to understanding news event evolution. World Wide Web 23, 2449–2470 (2020). https://doi.org/10.1007/s11280-020-00818-7

  • DOI: https://doi.org/10.1007/s11280-020-00818-7
