DOI: 10.1145/3219819.3219827
research-article

An Extensible Event Extraction System With Cross-Media Event Resolution

Published: 19 July 2018

Abstract

The automatic extraction of breaking news events from natural language text is a valuable capability for decision support systems. Traditional systems tend to extract events from a single media source and often ignore cross-media references. Here, we describe a large-scale automated system for extracting natural disasters and critical events from both newswire text and social media. We outline a comprehensive architecture that can identify, categorize and summarize seven event types: floods, storms, fires, armed conflict, terrorism, infrastructure breakdown, and labour unavailability. The system comprises fourteen modules and is equipped with a novel coreference mechanism capable of linking events extracted from the two complementary data sources. Additionally, the system is easily extensible to accommodate new event types. Our experimental evaluation demonstrates the effectiveness of the system.
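
As a rough illustration of what linking events across the two sources could look like, the sketch below pairs news and social media events that share an event type and have overlapping text. The abstract does not describe the paper's coreference mechanism, so everything here is an assumption made for illustration: the `Event` record, the `jaccard` similarity, the `link_events` function, and the 0.3 threshold are all hypothetical. Only the seven event-type names come from the abstract.

```python
from dataclasses import dataclass

# Event types named in the abstract. Extending the set with a new label is
# only a stand-in for the extensibility the abstract mentions.
EVENT_TYPES = {
    "flood", "storm", "fire", "armed conflict", "terrorism",
    "infrastructure breakdown", "labour unavailability",
}


@dataclass(frozen=True)
class Event:
    source: str      # "news" or "social" (hypothetical source labels)
    event_type: str  # one of EVENT_TYPES
    text: str        # headline or post text


def jaccard(a: str, b: str) -> float:
    """Token-level Jaccard similarity between two strings."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def link_events(news, social, threshold=0.3):
    """Pair news and social events of the same type whose token overlap
    meets the threshold. An assumed heuristic, not the paper's method."""
    links = []
    for n in news:
        for s in social:
            same_type = n.event_type == s.event_type
            if same_type and jaccard(n.text, s.text) >= threshold:
                links.append((n, s))
    return links


news = [Event("news", "flood", "Severe flood hits Jakarta city centre")]
social = [
    Event("social", "flood", "flood hits Jakarta right now"),
    Event("social", "fire", "fire in Jakarta city centre"),
]
linked = link_events(news, social)
```

In this toy input only the flood post is linked to the news item; the fire post is rejected on event type before any text comparison. A real system would presumably also use time, location, and entity information.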

Published In

KDD '18: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining
July 2018
2925 pages
ISBN:9781450355520
DOI:10.1145/3219819
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. event coreference
  2. event extraction
  3. first story detection
  4. information extraction
  5. news analytics

Qualifiers

  • Research-article

Conference

KDD '18

Acceptance Rates

KDD '18 Paper Acceptance Rate 107 of 983 submissions, 11%;
Overall Acceptance Rate 1,133 of 8,635 submissions, 13%
