Abstract
The design of contemporary critical systems involves numerous requirements that must be articulated clearly and coherently, which poses significant challenges for system designers. This paper addresses the challenge of translating ambiguous Natural Language (NL) requirements into unambiguous Computation Tree Logic (CTL) specifications, an essential task for maintaining consistency and precision in system design. We introduce Natural2CTL, a novel dataset comprising 2,095 pairs of NL requirements and their CTL specifications, and describe in detail the methodology used for data collection and annotation. The dataset is validated through evaluations by academic and industry experts, inter-rater reliability assessments, and practical verification using UPPAAL case studies. These validation efforts support the dataset’s reliability and its potential applicability to both research and education in Requirements Engineering (RE) and formal methods.
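As a rough illustration of the kind of NL-to-CTL pair described above (a hypothetical example, not taken from Natural2CTL; the atomic propositions request and grant are assumed names), the requirement "Whenever a request is issued, it is eventually granted" could be formalized as:

```latex
% Hypothetical example pair, not drawn from the dataset.
% NL:  "Whenever a request is issued, it is eventually granted."
% CTL: on all paths, globally, a request implies that on all paths
%      a grant eventually holds.
\[
  \mathbf{AG}\,\bigl(\mathit{request} \rightarrow \mathbf{AF}\,\mathit{grant}\bigr)
\]
```

Here AG reads "on all paths, at every state" and AF reads "on all paths, at some future state". The NL sentence does not say whether the grant must occur on every possible execution or only on some; the CTL formula commits to the first reading, which is the kind of ambiguity the translation has to resolve.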
Acknowledgment
We extend our sincere thanks to P.S. Nouwou Mindom and L. Elfatimi for their vital role in the dataset validation, greatly enriching its integrity and validity. This research was funded by Mitacs under grant IT19246 and grant IT30530.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Zrelli, R., Amaral Misson, H., Ben Attia, M., Gohring de Magalhães, F., Shabah, A., Nicolescu, G. (2024). Natural2CTL: A Dataset for Natural Language Requirements and Their CTL Formal Equivalents. In: Mendez, D., Moreira, A. (eds) Requirements Engineering: Foundation for Software Quality. REFSQ 2024. Lecture Notes in Computer Science, vol 14588. Springer, Cham. https://doi.org/10.1007/978-3-031-57327-9_13
DOI: https://doi.org/10.1007/978-3-031-57327-9_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-57326-2
Online ISBN: 978-3-031-57327-9
eBook Packages: Computer Science (R0)