ABSTRACT
Pre-training Event Extraction (EE) models on unlabeled data is an effective strategy that frees researchers from costly and labor-intensive data annotation. However, existing pre-training methods demand substantial computational resources, requiring high-performance hardware and long training times. In response to these challenges, this paper proposes a Lighter, Faster, and more Data-efficient pre-training framework for EE, named LFDe. Distinct from existing methods that strive to establish a comprehensive representation space during pre-training, our framework focuses on quickly familiarizing the model with the task format using a small amount of automatically constructed pseudo-events. It comprises three stages: weak-label data construction, pre-training, and fine-tuning. Specifically, in the first stage, LFDe automatically designates pseudo-triggers and pseudo-arguments based on the characteristics of real events to form pre-training samples. During pre-training and fine-tuning, the framework reframes EE as identifying the tokens in the given sentence that are semantically closest to the prompt. This paper also introduces a novel prompt-based sequence labeling model for EE to accommodate this reframing. Experiments on real-world datasets show that, compared to similar models, our framework requires less pre-training data (only about 0.04%), a shorter pre-training period (about 0.03%), and less memory (about 57.6%). At the same time, our framework significantly improves performance in various data-scarce scenarios.
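The abstract reframes EE as selecting the tokens in a sentence that are semantically closest to a prompt. The paper's model is not reproduced here; as a minimal illustrative sketch (not the authors' implementation), assuming we already have token embeddings and a prompt embedding, and using a hypothetical helper `extract_closest_tokens` with an assumed cosine-similarity threshold, the selection step could look like:

```python
import numpy as np

def extract_closest_tokens(token_embs, prompt_emb, threshold=0.8):
    """Illustrative sketch: return indices of tokens whose cosine similarity
    to the prompt embedding exceeds the threshold, i.e. the tokens treated
    as extraction candidates under the reframed formulation."""
    token_embs = np.asarray(token_embs, dtype=float)
    prompt_emb = np.asarray(prompt_emb, dtype=float)
    # Cosine similarity between every token embedding and the prompt embedding.
    sims = token_embs @ prompt_emb / (
        np.linalg.norm(token_embs, axis=1) * np.linalg.norm(prompt_emb)
    )
    return [i for i, s in enumerate(sims) if s > threshold]

# Toy example: four 2-d "token" embeddings; tokens 1 and 3 point in roughly
# the same direction as the prompt vector.
tokens = [[1, 0], [0, 1], [-1, 0], [0.1, 0.9]]
prompt = [0, 1]
print(extract_closest_tokens(tokens, prompt))  # → [1, 3]
```

In practice the embeddings would come from a pre-trained encoder and the decision rule would be learned rather than a fixed threshold; the sketch only conveys the "closest-to-prompt" framing.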
Index Terms
- LFDe: A Lighter, Faster and More Data-Efficient Pre-training Framework for Event Extraction