Abstract
[Context & motivation:] System behavior is often expressed by causal relations in requirements (e.g., "If event 1, then event 2"). Automatically extracting this embedded causal knowledge supports not only reasoning about requirements dependencies, but also various automated engineering tasks such as the seamless derivation of test cases. However, causality extraction from natural language (NL) is still an open research challenge, as existing approaches fail to extract causality with reasonable performance.

[Question/problem:] We understand causality extraction from requirements as a two-step problem: first, we need to detect whether requirements have causal properties; second, we need to understand and extract their causal relations. At present, though, we lack knowledge about the form and complexity of causality in requirements, which is necessary to develop a suitable approach to these two problems.

[Principal ideas/results:] We conduct an exploratory case study with 14,983 sentences from 53 requirements documents originating from 18 different domains and shed light on the form and complexity of causality in requirements. Based on our findings, we develop a tool-supported approach for causality detection (CiRA, standing for Causality in Requirement Artifacts). This constitutes a first step towards causality extraction from NL requirements.

[Contribution:] We report on a case study and the resulting tool-supported approach for causality detection in requirements. Our case study corroborates, among other things, that causality is indeed a widely used linguistic pattern to describe system behavior, as about a third of the analyzed sentences are causal. We further demonstrate that our tool CiRA achieves a macro-F\(_{1}\) score of 82% on real-world data and that it outperforms related approaches with an average gain of 11.06% in macro-Recall and 11.43% in macro-Precision.
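The first step of the two-step problem (detecting whether a sentence is causal at all) can be illustrated with a naive cue-phrase baseline. This is only a hypothetical sketch under our own assumptions: the cue list and the function `looks_causal` are illustrative inventions, not the trained classifier behind CiRA.

```python
import re

# Hypothetical list of causal cue phrases (an illustrative assumption,
# not the cue inventory used by CiRA).
CAUSAL_CUES = re.compile(
    r"\b(if|because|since|when|due to|as a result|therefore|causes?|leads? to)\b",
    re.IGNORECASE,
)

def looks_causal(sentence: str) -> bool:
    """Flag a requirement sentence that contains a causal cue phrase."""
    return bool(CAUSAL_CUES.search(sentence))
```

For example, `looks_causal("If event 1 occurs, then event 2 shall follow.")` returns `True`, while a purely declarative requirement without cue words returns `False`. Such keyword baselines are exactly what trained detectors aim to outperform, since cue words can appear in non-causal contexts and causality can be expressed without any cue at all.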
Finally, we disclose our open data sets as well as our tool to foster the discourse on the automatic detection of causality in the RE community.
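The macro-averaged metrics reported above (macro-Precision, macro-Recall, macro-F\(_{1}\)) average the per-class scores over the two classes, causal and non-causal, so that the majority class does not dominate. A minimal sketch of that computation, with hypothetical label names:

```python
def macro_scores(y_true, y_pred, labels=("causal", "non-causal")):
    """Macro-averaged precision, recall, and F1 over the given classes."""
    precisions, recalls, f1s = [], [], []
    for label in labels:
        # Per-class confusion counts.
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)
    n = len(labels)
    # Macro averaging: unweighted mean of the per-class scores.
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n
```

With class-imbalanced data such as requirements corpora (roughly one third causal sentences), macro averaging is a deliberate choice: it rewards a detector equally for both classes rather than for the more frequent one.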
Notes
1. A demo of CiRA can be accessed at http://cira.diptsrv003.bth.se/. Our code and annotated data sets can be found at https://github.com/fischJan/CiRA.
2. The platform can be accessed at http://clabel.diptsrv003.bth.se/suite.
Acknowledgements
This work was supported by the KKS Foundation through the S.E.R.T. Research Profile project at Blekinge Institute of Technology. We further thank Yannick Debes for his valuable feedback.
Copyright information
© 2021 Springer Nature Switzerland AG
Cite this paper
Fischbach, J., et al. (2021). Automatic Detection of Causality in Requirement Artifacts: The CiRA Approach. In: Dalpiaz, F., Spoletini, P. (eds.) Requirements Engineering: Foundation for Software Quality. REFSQ 2021. Lecture Notes in Computer Science, vol. 12685. Springer, Cham. https://doi.org/10.1007/978-3-030-73128-1_2
Print ISBN: 978-3-030-73127-4
Online ISBN: 978-3-030-73128-1