BUEES: a bottom-up event extraction system

Ding, Xiao; Qin, Bing; Liu, Ting

doi:10.1631/FITEE.1400405


Abstract

Traditional event extraction systems focus mainly on event type identification and event participant extraction based on pre-specified event type paradigms and manually annotated corpora. However, different domains have different event type paradigms. When transferring to a new domain, we have to build a new event type paradigm and annotate a new corpus from scratch. This kind of conventional event extraction system requires massive human effort, and hence prevents event extraction from being widely applicable. In this paper, we present BUEES, a bottom-up event extraction system, which extracts events from the web in a completely unsupervised way. The system automatically builds an event type paradigm in the input corpus, and then proceeds to extract a large number of instance patterns of these events. Subsequently, the system extracts event arguments according to these patterns. By conducting a series of experiments, we demonstrate the good performance of BUEES and compare it to a state-of-the-art Chinese event extraction system, i.e., a supervised event extraction system. Experimental results show that BUEES performs comparably (5% higher F-measure in event type identification and 3% higher F-measure in event argument extraction), but without any human effort.
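The comparison above is reported in terms of F-measure. As a minimal sketch of how such a score is usually computed, the snippet below derives precision, recall, and F-measure from raw counts. It assumes the balanced variant (beta = 1); the abstract does not say which variant or matching criterion the authors used. The function name `f_measure` and the counts in the usage line are hypothetical and are not taken from the paper.

```python
def f_measure(true_positives, false_positives, false_negatives, beta=1.0):
    """Return (precision, recall, F_beta) computed from raw counts.

    Illustrative only: the balanced default (beta = 1) is an assumption,
    not a detail confirmed by the paper.
    """
    predicted = true_positives + false_positives   # items the system output
    gold = true_positives + false_negatives        # items in the reference
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / gold if gold else 0.0
    if precision == 0.0 and recall == 0.0:
        # Avoid dividing by zero when nothing was matched.
        return precision, recall, 0.0
    b2 = beta * beta
    f_score = (1 + b2) * precision * recall / (b2 * precision + recall)
    return precision, recall, f_score


# Hypothetical counts: 6 correct extractions, 2 spurious, 4 missed.
precision, recall, f1 = f_measure(6, 2, 4)
```

With beta = 1 the score is the harmonic mean of precision and recall, so a system cannot reach a high F-measure by maximizing only one of the two.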

Author information

Authors and Affiliations

Research Center for Social Computing and Information Retrieval, Harbin Institute of Technology, Harbin, 150001, China
Xiao Ding, Bing Qin & Ting Liu

Corresponding author

Correspondence to Ting Liu.

Additional information

Project supported by the National Natural Science Foundation of China (Nos. 61133012 and 61472107) and the National Basic Research Program (973) of China (No. 2014CB340503)

A preliminary version was presented at the 6th International Joint Conference on Natural Language Processing, Oct. 14-18, 2013, Japan

ORCID: Xiao DING, http://orcid.org/0000-0002-5838-0320

About this article

Cite this article

Ding, X., Qin, B. & Liu, T. BUEES: a bottom-up event extraction system. Frontiers Inf Technol Electronic Eng 16, 541–552 (2015). https://doi.org/10.1631/FITEE.1400405

Received: 27 November 2014
Accepted: 11 May 2015
Published: 12 July 2015
Issue Date: July 2015
DOI: https://doi.org/10.1631/FITEE.1400405

Keywords

CLC number

TP391
