Abstract
[Context & motivation:] System behavior is often expressed by causal relations in requirements (e.g., "If event 1, then event 2"). Automatically extracting this embedded causal knowledge supports not only reasoning about requirements dependencies, but also various automated engineering tasks such as the seamless derivation of test cases. However, causality extraction from natural language (NL) is still an open research challenge, as existing approaches fail to extract causality with reasonable performance.

[Question/problem:] We understand causality extraction from requirements as a two-step problem: first, we need to detect whether requirements have causal properties; second, we need to understand and extract their causal relations. At present, though, we lack knowledge about the form and complexity of causality in requirements, which is necessary to develop a suitable approach to these two problems.

[Principal ideas/results:] We conduct an exploratory case study with 14,983 sentences from 53 requirements documents originating from 18 different domains and shed light on the form and complexity of causality in requirements. Based on our findings, we develop a tool-supported approach for causality detection (CiRA, standing for Causality in Requirement Artifacts). This constitutes a first step towards causality extraction from NL requirements.

[Contribution:] We report on a case study and the resulting tool-supported approach for causality detection in requirements. Our case study corroborates, among other things, that causality is indeed a widely used linguistic pattern to describe system behavior, as about a third of the analyzed sentences are causal. We further demonstrate that our tool CiRA achieves a macro-F\(_{1}\) score of 82% on real-world data and that it outperforms related approaches with an average gain of 11.06% in macro-Recall and 11.43% in macro-Precision.
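The first step of the two-step problem (detecting whether a sentence is causal at all) can be illustrated with a naive cue-phrase baseline. This is only a hypothetical sketch under our own assumptions: the cue list and the function `looks_causal` are illustrative inventions, not the trained classifier behind CiRA.

```python
import re

# Hypothetical list of causal cue phrases (an illustrative assumption,
# not the cue inventory used by CiRA).
CAUSAL_CUES = re.compile(
    r"\b(if|because|since|when|due to|as a result|therefore|causes?|leads? to)\b",
    re.IGNORECASE,
)

def looks_causal(sentence: str) -> bool:
    """Flag a requirement sentence that contains a causal cue phrase."""
    return bool(CAUSAL_CUES.search(sentence))
```

For example, `looks_causal("If event 1 occurs, then event 2 shall follow.")` returns `True`, while a purely declarative requirement without cue words returns `False`. Such keyword baselines are exactly what trained detectors aim to outperform, since cue words can appear in non-causal contexts and causality can be expressed without any cue at all.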
Finally, we disclose our open data sets as well as our tool to foster the discourse on the automatic detection of causality in the RE community.
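The macro-averaged metrics reported above (macro-Precision, macro-Recall, macro-F\(_{1}\)) average the per-class scores over the two classes, causal and non-causal, so that the majority class does not dominate. A minimal sketch of that computation, with hypothetical label names:

```python
def macro_scores(y_true, y_pred, labels=("causal", "non-causal")):
    """Macro-averaged precision, recall, and F1 over the given classes."""
    precisions, recalls, f1s = [], [], []
    for label in labels:
        # Per-class confusion counts.
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)
    n = len(labels)
    # Macro averaging: unweighted mean of the per-class scores.
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n
```

With class-imbalanced data such as requirements corpora (roughly one third causal sentences), macro averaging is a deliberate choice: it rewards a detector equally for both classes rather than for the more frequent one.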
Notes
1. A demo of CiRA can be accessed at http://cira.diptsrv003.bth.se/. Our code and annotated data sets can be found at https://github.com/fischJan/CiRA.
2. The platform can be accessed at http://clabel.diptsrv003.bth.se/suite.
Acknowledgements
This work was supported by the KKS Foundation through the S.E.R.T. Research Profile project at Blekinge Institute of Technology. We further thank Yannick Debes for his valuable feedback.
Copyright information
© 2021 Springer Nature Switzerland AG
Cite this paper
Fischbach, J., et al. (2021). Automatic Detection of Causality in Requirement Artifacts: The CiRA Approach. In: Dalpiaz, F., Spoletini, P. (eds.) Requirements Engineering: Foundation for Software Quality. REFSQ 2021. Lecture Notes in Computer Science, vol. 12685. Springer, Cham. https://doi.org/10.1007/978-3-030-73128-1_2
Print ISBN: 978-3-030-73127-4
Online ISBN: 978-3-030-73128-1