Abstract
Open-domain event detection (ODED) aims to identify event mentions of all possible types in text. A key challenge for ODED research is the lack of large training datasets. In this work, we explore a novel method to overcome this challenge by fine-tuning the powerful pre-trained language model GPT-2 on existing datasets to automatically generate new training data for ODED. To address the noise present in the generated data, we propose a novel teacher-student architecture in which the teacher model captures anchor knowledge about sentence representations and the differences between data types. The student model is then trained on the combination of the original and generated data, regularized to remain consistent with the teacher's anchor knowledge. We introduce novel regularization mechanisms based on mutual information and optimal transport to enforce this knowledge consistency between the student and the teacher. Moreover, we propose a dynamic sample-weighting technique for the generated examples based on optimal transport and data clustering. Our experiments on three benchmark datasets demonstrate the effectiveness of the proposed model, yielding state-of-the-art performance on these datasets.
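To make the teacher-student consistency idea concrete, the toy NumPy sketch below freezes a set of "teacher" sentence representations as anchors and nudges noisy "student" representations toward them until a consistency loss drops. This is an illustrative sketch only: the cosine-based penalty is a simplified stand-in for the mutual-information and optimal-transport regularizers the paper actually uses, and all arrays and dimensions are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Frozen "teacher" representations for 4 sentences (the anchor knowledge).
teacher = rng.normal(size=(4, 8))
# "Student" representations start as a noisy copy of the teacher's.
student = teacher + 0.5 * rng.normal(size=(4, 8))

def cosine_consistency_loss(s, t):
    """Mean (1 - cosine similarity) between matching student/teacher rows."""
    s_n = s / np.linalg.norm(s, axis=1, keepdims=True)
    t_n = t / np.linalg.norm(t, axis=1, keepdims=True)
    return float(np.mean(1.0 - np.sum(s_n * t_n, axis=1)))

loss_before = cosine_consistency_loss(student, teacher)

# Training stand-in: pull the student toward the teacher anchors.
for _ in range(10):
    student += 0.3 * (teacher - student)

loss_after = cosine_consistency_loss(student, teacher)
print(loss_before, loss_after)  # the consistency loss shrinks
```

In the paper's actual setting, the pull toward the anchors would come from gradient updates on the regularized training objective rather than the direct interpolation shown here.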
Notes
1. We use the small version of GPT-2 in this work.
2. \(K=10\) produces the best performance in our study.
3. Note that we do not use the ACE 2005 dataset [35] as it only focuses on a small set of event types in the news domain, and is thus not appropriate for our open-domain setting of event detection.
4. In the experiments, we find that augmenting the models with GPT-generated data is more helpful for recall.
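The dynamic sample-weighting idea from the abstract (optimal transport combined with data clustering, with a cluster count \(K\) as in note 2) can be sketched as follows. This is an illustrative NumPy-only reconstruction, not the paper's implementation: GPT-generated examples are grouped into clusters, an entropy-regularized (Sinkhorn) transport plan is computed between the generated clusters and the original-data clusters, and each generated example inherits a weight from its cluster's transport cost. The Sinkhorn routine, the squared-distance cost, the exponential weighting, and all data here are assumptions made for the example.

```python
import numpy as np

def sinkhorn(a, b, C, reg=0.1, n_iter=500):
    """Entropy-regularized OT (Sinkhorn-Knopp): plan with marginals a and b."""
    K = np.exp(-C / reg)
    u = np.ones_like(a)
    v = np.ones_like(b)
    for _ in range(n_iter):
        u = a / (K @ v)
        v = b / (K.T @ u)
    return u[:, None] * K * v[None, :]

rng = np.random.default_rng(1)

# Toy features: 12 GPT-generated examples and 3 centroids of the original data.
gen = rng.normal(size=(12, 5))
orig_centroids = rng.normal(size=(3, 5))

# Cluster the generated examples (one k-means-style step: assign each
# example to its nearest seed point, then recompute cluster means).
K_clusters = 4
centers = gen[rng.choice(len(gen), K_clusters, replace=False)]
assign = np.argmin(((gen[:, None] - centers[None]) ** 2).sum(-1), axis=1)
centers = np.stack([gen[assign == k].mean(0) for k in range(K_clusters)])

# Transport plan between generated clusters and original-data clusters.
a = np.bincount(assign, minlength=K_clusters).astype(float)
a /= a.sum()
b = np.full(3, 1.0 / 3.0)
C = ((centers[:, None] - orig_centroids[None]) ** 2).sum(-1)
C /= C.max()  # normalize costs so the Sinkhorn kernel is well-conditioned
plan = sinkhorn(a, b, C)

# Per-cluster transport cost -> per-example weight (cheap clusters weigh more).
cluster_cost = (plan * C).sum(axis=1) / a
weights = np.exp(-cluster_cost)[assign]
weights /= weights.sum()
```

A generated example whose cluster sits close (in transport cost) to the original data thus receives a larger training weight, which matches the intuition that such examples are less likely to be noisy.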
References
Ahn, D.: The stages of event extraction. In: Proceedings of the Workshop on Annotating and Reasoning about Time and Events (2006)
Anaby-Tavor, A., et al.: Do not have enough data? Deep learning to the rescue! In: AAAI (2020)
Araki, J., Mitamura, T.: Open-domain event detection using distant supervision. In: COLING (2018)
Bosselut, A., Rashkin, H., Sap, M., Malaviya, C., Celikyilmaz, A., Choi, Y.: COMET: commonsense transformers for automatic knowledge graph construction. In: ACL (2019)
Chen, Y., Xu, L., Liu, K., Zeng, D., Zhao, J.: Event extraction via dynamic multi-pooling convolutional neural networks. In: ACL-IJCNLP (2015)
Devlin, J., Chang, M.W., Lee, K., Toutanova, K.: BERT: Pre-training of deep bidirectional transformers for language understanding. In: NAACL-HLT (2019)
Ferguson, J., Lockard, C., Weld, D.S., Hajishirzi, H.: Semi-supervised event extraction with paraphrase clusters. In: NAACL (2018)
Hjelm, R.D., et al.: Learning deep representations by mutual information estimation and maximization. In: ICLR (2019)
Huang, L., et al.: Liberal event extraction and event schema induction. In: ACL (2016)
Huang, R., Riloff, E.: Bootstrapped training of event extraction classifiers. In: EACL (2012)
Ji, H., Grishman, R.: Refining event extraction through cross-document inference. In: ACL (2008)
Keith, K., Handler, A., Pinkham, M., Magliozzi, C., McDuffie, J., O’Connor, B.: Identifying civilians killed by police with distantly supervised entity-event extraction. In: EMNLP (2017)
Kumar, V., Choudhary, A., Cho, E.: Data augmentation using pre-trained transformer models. arXiv preprint arXiv:2003.02245 (2020)
Lai, V.D., Dernoncourt, F., Nguyen, T.H.: Exploiting the matching information in the support set for few shot event classification. In: Proceedings of the 24th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD) (2020)
Lai, V.D., Nguyen, T.N., Nguyen, T.H.: Event detection: gate diversity and syntactic importance scores for graph convolution neural networks. In: EMNLP (2020)
Li, Q., Ji, H., Huang, L.: Joint event extraction via structured prediction with global features. In: ACL (2013)
Liao, S., Grishman, R.: Filtered ranking for bootstrapping in event extraction. In: COLING (2010)
Liao, S., Grishman, R.: Using document level cross-event inference to improve event extraction. In: ACL (2010)
Madaan, A., Rajagopal, D., Yang, Y., Ravichander, A., Hovy, E., Prabhumoye, S.: EIGEN: event influence generation using pre-trained language models. arXiv preprint arXiv:2010.11764 (2020)
McClosky, D., Surdeanu, M., Manning, C.: Event extraction as dependency parsing. In: BioNLP Shared Task Workshop (2011)
Miwa, M., Thompson, P., Korkontzelos, I., Ananiadou, S.: Comparable study of event extraction in newswire and biomedical domains. In: COLING (2014)
Naik, A., Rosé, C.: Towards open domain event trigger identification using adversarial domain adaptation. In: ACL (2020)
Nguyen, M., Nguyen, T.H.: Who is killed by police: introducing supervised attention for hierarchical LSTMs. In: COLING (2018)
Nguyen, T.H., Cho, K., Grishman, R.: Joint event extraction via recurrent neural networks. In: NAACL (2016)
Nguyen, T.H., Grishman, R.: Event detection and domain adaptation with convolutional neural networks. In: ACL (2015)
Nguyen, T.H., Grishman, R.: Graph convolutional networks with argument-aware pooling for event detection. In: AAAI (2018)
Nguyen, T.M., Nguyen, T.H.: One for all: neural joint modeling of entities and events. In: AAAI (2019)
Papanikolaou, Y., Pierleoni, A.: DARE: data augmented relation extraction with GPT-2. In: SciNLP workshop at AKBC (2020)
Peng, B., Zhu, C., Zeng, M., Gao, J.: Data augmentation for spoken language understanding via pretrained models. arXiv preprint arXiv:2004.13952 (2020)
Peyré, G., Cuturi, M.: Computational optimal transport: with applications to data science. Foundations and Trends in Machine Learning (2019)
Pustejovsky, J., et al.: The TimeBank corpus. In: Corpus Linguistics (2003)
Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019)
Sims, M., Park, J.H., Bamman, D.: Literary event detection. In: ACL (2019)
Trong, H.M.D., Le, D.T., Veyseh, A.P.B., Nguyen, T., Nguyen, T.H.: Introducing a new dataset for event detection in cybersecurity texts. In: EMNLP (2020)
Walker, C., Strassel, S., Medero, J., Maeda, K.: ACE 2005 multilingual training corpus. LDC, Philadelphia (2006)
Wang, X., Han, X., Liu, Z., Sun, M., Li, P.: Adversarial training for weakly supervised event detection. In: NAACL-HLT (2019)
Yang, B., Mitchell, T.M.: Joint extraction of events and entities within a document context. In: NAACL-HLT (2016)
Yang, S., Feng, D., Qiao, L., Kan, Z., Li, D.: Exploring pre-trained language models for event extraction and generation. In: ACL (2019)
Yang, Y., et al.: Generative data augmentation for commonsense reasoning. In: Findings of EMNLP 2020 (2020)
Yuan, Q., et al.: Open-schema event profiling for massive news corpora. In: CIKM (2018)
Zeng, Y., et al.: Scale up event extraction learning via automatic training data generation. In: AAAI (2017)
Zhang, D., Li, T., Zhang, H., Yin, B.: On data augmentation for extreme multi-label classification. arXiv preprint arXiv:2009.10778 (2020)
Zhang, J., Qin, Y., Zhang, Y., Liu, M., Ji, D.: Extracting entities and events as a single task using a transition-based neural model. In: IJCAI (2019)
Zhang, Y., et al.: A question answering-based framework for one-step event argument extraction. IEEE Access 8, 65420–65431 (2020)
Acknowledgments
This research has been supported by the Army Research Office (ARO) grant W911NF-21-1-0112 and the NSF grant CNS-1747798 to the IUCRC Center for Big Learning. This research is also based upon work supported by the Office of the Director of National Intelligence (ODNI), Intelligence Advanced Research Projects Activity (IARPA), via IARPA Contract No. 2019-19051600006 under the Better Extraction from Text Towards Enhanced Retrieval (BETTER) Program. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies, either expressed or implied, of ARO, ODNI, IARPA, the Department of Defense, or the U.S. Government.
Copyright information
© 2021 Springer Nature Switzerland AG
Cite this paper
Veyseh, A.P.B., Van Nguyen, M., Min, B., Nguyen, T.H. (2021). Augmenting Open-Domain Event Detection with Synthetic Data from GPT-2. In: Oliver, N., Pérez-Cruz, F., Kramer, S., Read, J., Lozano, J.A. (eds) Machine Learning and Knowledge Discovery in Databases. Research Track. ECML PKDD 2021. Lecture Notes in Computer Science, vol. 12977. Springer, Cham. https://doi.org/10.1007/978-3-030-86523-8_39
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-86522-1
Online ISBN: 978-3-030-86523-8