Argument parsing via corpus queries

Natalie Dykes; Stefan Evert; Merlin Göttlinger; Philipp Heinrich; Lutz Schröder

doi:10.1515/itit-2020-0051

Published by De Gruyter Oldenbourg May 5, 2021

From the journal it - Information Technology

https://doi.org/10.1515/itit-2020-0051

Abstract

We present an approach to extracting arguments from social media, exemplified by a case study on a large corpus of Twitter messages collected under the #Brexit hashtag during the run-up to the referendum in 2016. Our method is based on constructing dedicated corpus queries that capture predefined argumentation patterns following standard Walton-style argumentation schemes. Query matches are transformed directly into logical patterns, i.e., formulae with placeholders in a general form of modal logic. We prioritize precision over recall, exploiting the fact that the sheer size of the corpus still delivers substantial numbers of matches for all patterns, with the goal of eventually gaining an overview of widely used arguments and argumentation schemes. We evaluate our approach in terms of recall on a manually annotated gold standard of 1000 randomly selected tweets for three selected high-frequency patterns. We also estimate precision by manual inspection of query matches in the entire corpus. Both evaluations are accompanied by an analysis of inter-annotator agreement between three independent judges.
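The pipeline described in the abstract (a query matches a surface pattern, the match fills the placeholders of a logical formula, and the matches are scored against manual judgements) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: a plain Python regular expression stands in for a real corpus query, and the "X says that Y" pattern, the `says(...)` formula template, and the example tweets are hypothetical rather than taken from the paper.

```python
import re
from typing import NamedTuple


class Match(NamedTuple):
    tweet_id: int
    formula: str


# Hypothetical surface pattern "<Agent> says (that) <claim>".
# A regular expression is only a stand-in for a dedicated corpus query.
PATTERN = re.compile(r"\b(?P<agent>[A-Z]\w+) says (?:that )?(?P<claim>[^.!?]+)")

# Hypothetical logical pattern: a formula with two placeholders.
TEMPLATE = "says({agent}, {claim})"


def extract(tweets: dict) -> list:
    """Return one filled-in formula per tweet that matches the pattern."""
    matches = []
    for tweet_id, text in tweets.items():
        m = PATTERN.search(text)
        if m:
            formula = TEMPLATE.format(agent=m["agent"], claim=m["claim"].strip())
            matches.append(Match(tweet_id, formula))
    return matches


def recall(matched_ids: set, gold_ids: set) -> float:
    """Share of gold-standard tweets that the query actually found."""
    return len(matched_ids & gold_ids) / len(gold_ids) if gold_ids else 0.0


def precision(matched_ids: set, correct_ids: set) -> float:
    """Share of query matches that were judged correct on inspection."""
    return len(matched_ids & correct_ids) / len(matched_ids) if matched_ids else 0.0


# Made-up example data, not drawn from the #Brexit corpus.
tweets = {
    1: "Cameron says that leaving would hurt trade.",
    2: "Vote tomorrow! #Brexit",
    3: "Experts warned us about this",
}
matches = extract(tweets)
matched_ids = {m.tweet_id for m in matches}
gold_ids = {1, 3}      # tweets an annotator marked as containing the pattern
correct_ids = {1}      # matches an annotator confirmed as correct
```

In this toy setting the query finds one of the two gold-standard tweets and its single match is correct, which mirrors the precision-over-recall trade-off the abstract describes: a narrow pattern misses paraphrases such as tweet 3 but produces few false positives.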

Keywords: argument mining; reasoning; corpus linguistics; social media

About the authors

Natalie Dykes

Natalie Dykes is working in Stefan Evert’s Computational Corpus Linguistics group. She holds a B. A. in computational linguistics and Scandinavian studies and an M. A. in linguistics. Her research interests include corpus-based discourse analysis, argumentation, and computer-mediated communication.

Prof. Dr. Stefan Evert

Stefan Evert holds the Chair of Computational Corpus Linguistics at FAU Erlangen-Nürnberg. After studying mathematics, physics and English linguistics, he received a PhD degree in computational linguistics from the University of Stuttgart. His research interests include the statistical methodology of corpus linguistics, co-occurrence phenomena and software tools for processing large text corpora.

Merlin Göttlinger

After receiving his B. Sc. in Computer Science and Media from the TH-Nürnberg in 2015, Merlin Göttlinger continued with an M. Sc. in Computer Science at FAU Erlangen-Nürnberg, which he completed in 2018. Afterwards, he started working as a PhD student at the Chair of Theoretical Computer Science (INF8) at FAU Erlangen-Nürnberg, researching logical formalisms for argumentation.

Philipp Heinrich

Philipp Heinrich is working in Stefan Evert’s Computational Corpus Linguistics group. Having studied mathematics, linguistics, and philosophy, his research interests include corpus-based discourse analysis and argumentation mining with a focus on the comparison of social and mass media.

Prof. Dr. Lutz Schröder

Lutz Schröder holds the chair for theoretical computer science at FAU Erlangen-Nürnberg. He received a PhD in mathematics and subsequently the habilitation in computer science from the University of Bremen, and has held a senior researcher position at the German Research Center for Artificial Intelligence (DFKI). His main research area is logic in computer science.

Received: 2020-11-27

Revised: 2021-02-12

Accepted: 2021-03-15

Published Online: 2021-05-05

Published in Print: 2021-02-23
