Protest Event Detection: When Task-Specific Models Outperform an Event-Driven Method

Basile, Angelo; Caselli, Tommaso

doi:10.1007/978-3-030-58219-7_9

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 12260)

Included in the following conference series:

International Conference of the Cross-Language Evaluation Forum for European Languages

Abstract

2019 was characterized by worldwide waves of protests. Each country's protests are different, but there appear to be common factors. In this paper we present two approaches for identifying protest events in English-language news. Our goal is to provide political science and discourse analysis scholars with tools that may facilitate the understanding of this ongoing phenomenon. We test our approaches against the ProtestNews Lab 2019 benchmark, which challenges systems to perform unsupervised domain adaptation on protest events across three sub-tasks: document classification, sentence classification, and event extraction. Results indicate that dedicated architectures and models developed for each task outperform simpler solutions based on the propagation of labels from lexical items to documents. Furthermore, we complete the description of our systems with a detailed data analysis to shed light on the limits of the methods.
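The idea of propagating labels from lexical items to documents can be illustrated with a minimal sketch. The abstract does not spell out the propagation rule, so the rule used here (a sentence is positive if any token is tagged as an event trigger, and a document is positive if any sentence is positive) is an assumption made for illustration, as are the tag names and the function names.

```python
from typing import List

# Hypothetical tag set (an assumption, not taken from the paper):
# "B-trigger" / "I-trigger" mark protest event trigger tokens, "O" marks all others.
TRIGGER_TAGS = ("B-trigger", "I-trigger")


def sentence_label(token_tags: List[str]) -> int:
    """Return 1 if at least one token in the sentence is tagged as a trigger, else 0."""
    return int(any(tag in TRIGGER_TAGS for tag in token_tags))


def document_label(sentences: List[List[str]]) -> int:
    """Return 1 if at least one sentence in the document is labelled positive, else 0."""
    return int(any(sentence_label(tags) for tags in sentences))


if __name__ == "__main__":
    # A toy document: two sentences, given as per-token tags from some trigger tagger.
    doc = [["O", "O", "B-trigger", "O"], ["O", "O"]]
    print([sentence_label(s) for s in doc])
    print(document_label(doc))
```

Under this assumed rule, one token-level tagger yields labels for all three sub-tasks, which is also why an error at the token level carries over directly to the sentence and document labels.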

Notes

1. https://bit.ly/31oyS5k (last retrieved May 16th, 2020).
2. https://www.ldc.upenn.edu/sites/www.ldc.upenn.edu/files/english-events-guidelines-v5.4.3.pdf.
3. https://github.com/UKPLab/emnlp2017-bilstm-cnn-crf.
4. We used version 3.9.2.
5. We use spaCy’s English sentence tokenizer module.
6. We used the same 30% of the gold data used for the event triggers.

References

Ahn, D.: The stages of event extraction. In: Proceedings of the Workshop on Annotating and Reasoning About Time and Events, pp. 1–8. Association for Computational Linguistics (2006)
Basile, A., Caselli, T.: ProTestA: identifying and extracting protest events in news notebook for ProtestNews lab at CLEF 2019. In: Working Notes of CLEF 2019 - Conference and Labs of the Evaluation Forum (2019)
Bethard, S.: ClearTK-TimeML: a minimalist approach to TempEval 2013. In: Second Joint Conference on Lexical and Computational Semantics (*SEM), vol. 2, pp. 10–14 (2013)
Devlin, J., Chang, M.W., Lee, K., Toutanova, K.: BERT: pre-training of deep bidirectional transformers for language understanding. In: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, vol. 1, pp. 4171–4186. Association for Computational Linguistics, Minneapolis, Minnesota, June 2019. https://doi.org/10.18653/v1/N19-1423
Elsahar, H., Gallé, M.: To annotate or not? Predicting performance drop under domain shift. In: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pp. 2163–2173. Association for Computational Linguistics, Hong Kong, China, November 2019. https://doi.org/10.18653/v1/D19-1222, https://www.aclweb.org/anthology/D19-1222
Ettinger, A., Rao, S., Daumé III, H., Bender, E.M.: Towards linguistically generalizable NLP systems: a workshop and shared task. In: Proceedings of the First Workshop on Building Linguistically Generalizable NLP Systems, pp. 1–10 (2017)
Huang, L., Ji, H., Cho, K., Dagan, I., Riedel, S., Voss, C.: Zero-shot transfer learning for event extraction. In: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2160–2170. Association for Computational Linguistics, Melbourne, July 2018. https://doi.org/10.18653/v1/P18-1201, https://www.aclweb.org/anthology/P18-1201
Hürriyetoğlu, A., et al.: Cross-context news corpus for protest events related knowledge base construction. In: Automated Knowledge Base Construction (2020). https://openreview.net/forum?id=7NZkNhLCjp
Hürriyetoğlu, A., et al.: A task set proposal for automatic protest information collection across multiple countries. In: Azzopardi, L., Stein, B., Fuhr, N., Mayr, P., Hauff, C., Hiemstra, D. (eds.) ECIR 2019. LNCS, vol. 11438, pp. 316–323. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-15719-7_42
Hürriyetoğlu, A., et al.: Overview of CLEF 2019 lab ProtestNews: extracting protests from news in a cross-context setting. In: Crestani, F., et al. (eds.) CLEF 2019. LNCS, vol. 11696, pp. 425–432. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-28577-7_32
Ji, H., Grishman, R.: Refining event extraction through cross-document inference. In: Proceedings of ACL-2008: HLT, pp. 254–262 (2008)
Komninos, A., Manandhar, S.: Dependency based embeddings for sentence classification tasks. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 1490–1500 (2016)
Liakata, M., Saha, S., Dobnik, S., Batchelor, C., Rebholz-Schuhmann, D.: Automatic recognition of conceptualization zones in scientific articles and two life science applications. Bioinformatics 28(7), 991–1000 (2012). https://doi.org/10.1093/bioinformatics/bts071
Ma, X., Hovy, E.: End-to-end sequence labeling via bi-directional LSTM-CNNs-CRF. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, vol. 1, pp. 1064–1074. Association for Computational Linguistics, Berlin, August 2016. https://doi.org/10.18653/v1/P16-1101
Manning, C., Surdeanu, M., Bauer, J., Finkel, J., Bethard, S., McClosky, D.: The Stanford CoreNLP natural language processing toolkit. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics. System Demonstrations, pp. 55–60 (2014)
McClosky, D., Charniak, E., Johnson, M.: Effective self-training for parsing. In: Proceedings of the Human Language Technology Conference of the North American Chapter of the ACL, pp. 152–159. Association for Computational Linguistics (2006). https://dl.acm.org/doi/pdf/10.3115/1220835.1220855
Miwa, M., Thompson, P., Korkontzelos, I., Ananiadou, S.: Comparable study of event extraction in newswire and biomedical domains. In: Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers, pp. 2270–2279. Dublin City University and Association for Computational Linguistics, Dublin, August 2014. https://www.aclweb.org/anthology/C14-1214
Montani, J.P.: TUWienKBS at GermEval 2018: German abusive tweet detection. In: 14th Conference on Natural Language Processing KONVENS 2018, p. 45 (2018)
Nguyen, T.H., Grishman, R.: Event detection and domain adaptation with convolutional neural networks. In: Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers), vol. 2, pp. 365–371 (2015)
Peters, M.E., et al.: Deep contextualized word representations. In: Proceedings of NAACL (2018)
Plank, B.: What to do about non-standard (or non-canonical) language in NLP. arXiv preprint arXiv:1608.07836 (2016)
Plank, B., Van Noord, G.: Effective measures of domain similarity for parsing. In: Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, vol. 1, pp. 1566–1576. Association for Computational Linguistics (2011)
Reimers, N., Gurevych, I.: Reporting score distributions makes a difference: performance study of LSTM-networks for sequence tagging. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 338–348. Association for Computational Linguistics, Copenhagen, September 2017. https://www.aclweb.org/anthology/D17-1035
Ritter, A., Etzioni, O., Clark, S., et al.: Open domain event extraction from Twitter. In: Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1104–1112. ACM (2012)
Ruder, S., Plank, B.: Learning to select data for transfer learning with Bayesian optimization. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 372–382. Association for Computational Linguistics, Copenhagen, September 2017. https://www.aclweb.org/anthology/D17-1038

Author information

Authors and Affiliations

Symanto Research GmbH & Co., Nürnberg, Germany
Angelo Basile
Rijksuniversiteit Groningen, Groningen, The Netherlands
Tommaso Caselli

Corresponding author

Correspondence to Tommaso Caselli.

Editor information

Editors and Affiliations

Department of Electrical and Computer Engineering, Democritus University of Thrace, Xanthi, Greece
Avi Arampatzis
University of Amsterdam, Amsterdam, The Netherlands
Evangelos Kanoulas
Information Technologies Institute, Centre for Research and Technology Hellas, Thessaloniki, Greece
Theodora Tsikrika
Information Technologies Institute, Centre for Research and Technology Hellas, Thessaloniki, Greece
Stefanos Vrochidis
Faculty of Library, Information and Media Science, University of Tsukuba, Ibaraki, Japan
Hideo Joho
Department of Computer Science, University of Copenhagen, Copenhagen, Denmark
Christina Lioma
Brown University, Providence, RI, USA
Carsten Eickhoff
LIMSI-CNRS, Orsay, France
Aurélie Névéol
Department of Information Engineering, University of Padova, Padua, Italy
Linda Cappellato
Department of Information Engineering, University of Padova, Padua, Italy
Nicola Ferro

About this paper

Cite this paper

Basile, A., Caselli, T. (2020). Protest Event Detection: When Task-Specific Models Outperform an Event-Driven Method. In: Arampatzis, A., et al. Experimental IR Meets Multilinguality, Multimodality, and Interaction. CLEF 2020. Lecture Notes in Computer Science, vol 12260. Springer, Cham. https://doi.org/10.1007/978-3-030-58219-7_9

DOI: https://doi.org/10.1007/978-3-030-58219-7_9
Published: 15 September 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-58218-0
Online ISBN: 978-3-030-58219-7
eBook Packages: Computer Science, Computer Science (R0)