Natural2CTL: A Dataset for Natural Language Requirements and Their CTL Formal Equivalents

  • Conference paper
Requirements Engineering: Foundation for Software Quality (REFSQ 2024)

Abstract

Designing contemporary critical systems involves numerous requirements that must be articulated clearly and coherently, which poses significant challenges for system designers. This paper addresses the challenge of translating ambiguous Natural Language (NL) requirements into unambiguous Computation Tree Logic (CTL) specifications, an essential task for maintaining consistency and precision in system design. We introduce Natural2CTL, a novel dataset comprising 2,095 pairs of NL requirements and their CTL specifications. A key contribution is a detailed methodology for data collection and annotation. The robustness of Natural2CTL is established through validation that includes evaluations by academic and industry experts, inter-rater reliability assessments, and practical verification using UPPAAL case studies. These validation efforts support the dataset’s reliability and its potential applicability to research and education in Requirements Engineering (RE) and formal methods.
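
To make the target of the translation concrete, the following is an illustrative NL/CTL pair. It is a textbook-style example chosen for this note, not an entry taken from Natural2CTL, and the proposition names request and grant are hypothetical.

```latex
% NL requirement (hypothetical): "Whenever a request is issued, it is eventually granted."
% CTL reading: on all paths, at every state, a request implies that
% on all paths from that state a grant eventually holds.
\[
  \mathbf{AG}\,\bigl(\mathit{request} \rightarrow \mathbf{AF}\,\mathit{grant}\bigr)
\]
```

In UPPAAL's query language, a response property of this shape is typically written with the leads-to operator, request --> grant, which is shorthand for A[] (request imply A<> grant).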

Acknowledgment

We extend our sincere thanks to P.S. Nouwou Mindom and L. Elfatimi for their vital role in the dataset validation, greatly enriching its integrity and validity. This research was funded by Mitacs under grant IT19246 and grant IT30530.

Author information

Corresponding author

Correspondence to Rim Zrelli.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Zrelli, R., Amaral Misson, H., Ben Attia, M., Gohring de Magalhães, F., Shabah, A., Nicolescu, G. (2024). Natural2CTL: A Dataset for Natural Language Requirements and Their CTL Formal Equivalents. In: Mendez, D., Moreira, A. (eds) Requirements Engineering: Foundation for Software Quality. REFSQ 2024. Lecture Notes in Computer Science, vol 14588. Springer, Cham. https://doi.org/10.1007/978-3-031-57327-9_13

  • DOI: https://doi.org/10.1007/978-3-031-57327-9_13

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-57326-2

  • Online ISBN: 978-3-031-57327-9

  • eBook Packages: Computer Science, Computer Science (R0)
