Semi-automatic Software Feature-Relevant Information Extraction from Natural Language User Manuals

An Approach and Practical Experience at Roche Diagnostics GmbH

  • Conference paper
Requirements Engineering: Foundation for Software Quality (REFSQ 2017)

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 10153)

Abstract

Context and motivation: Mature software systems comprise a vast number of heterogeneous system capabilities, which are usually requested by different groups of stakeholders and evolve over time. Software features describe and logically bundle low-level capabilities at an abstract level and thus provide a structured and comprehensive overview of the capabilities of a software system. Question/problem: Software features are often not explicitly managed. Instead, feature-relevant information is often spread across several software engineering artifacts (e.g., user manuals, issue tracking systems). Identifying and extracting feature-relevant information from these artifacts in order to make feature knowledge explicit requires considerable manual effort. Principal ideas/results: Our semi-automatic approach identifies and extracts atomic software feature-relevant information from natural language user manuals by means of a domain glossary, structural sentence information, and natural language processing techniques, with a precision of over 94% and a recall of over 96%. Contribution: Together with this paper, we provide an implementation of the atomic software feature-relevant information extraction approach as well as corresponding evaluations based on example sections of a user manual taken from industry.
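To make the evaluation measures in the abstract concrete, the following is a minimal sketch of glossary-based sentence filtering scored with precision and recall. It is not the authors' implementation: the glossary terms, example sentences, relevance labels, and the function names `is_candidate` and `precision_recall` are all hypothetical, and the sketch uses only glossary matching, leaving out the structural sentence information and parsing steps the abstract mentions.

```python
import re


def is_candidate(sentence, glossary):
    """Return True if the sentence mentions at least one glossary term (whole-word match)."""
    lowered = sentence.lower()
    return any(
        re.search(r"\b" + re.escape(term.lower()) + r"\b", lowered)
        for term in glossary
    )


def precision_recall(predicted, relevant):
    """Compute precision and recall for two sets of sentence indices."""
    true_positives = len(predicted & relevant)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical domain glossary and manual sentences (assumptions for illustration only).
glossary = {"sample", "rack", "calibration"}
sentences = [
    "The system scans each sample before processing.",
    "Contact your local representative for support.",
    "Calibration results are stored for later review.",
    "See the sample chapter of the printed guide.",
    "Results can be exported as a report.",
]
# Hypothetical manual labeling: indices of sentences judged feature-relevant.
relevant = {0, 2, 4}

# Indices of sentences the glossary filter flags as feature-relevant.
predicted = {i for i, s in enumerate(sentences) if is_candidate(s, glossary)}
precision, recall = precision_recall(predicted, relevant)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy data the filter flags one sentence that is not feature-relevant and misses one that is, which illustrates why glossary matching alone would plausibly need to be combined with further sentence-level analysis.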

Notes

  1. https://poi.apache.org/

  2. http://nlp.stanford.edu/software/lex-parser.shtml

Acknowledgements

We would like to thank Roche Diagnostics GmbH for the financial support of this research project. Many thanks also to the GDC experts for their participation in the case study and valuable discussions of the results.

Author information

Corresponding author

Correspondence to Thomas Quirchmayr.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Quirchmayr, T., Paech, B., Kohl, R., Karey, H. (2017). Semi-automatic Software Feature-Relevant Information Extraction from Natural Language User Manuals. In: Grünbacher, P., Perini, A. (eds) Requirements Engineering: Foundation for Software Quality. REFSQ 2017. Lecture Notes in Computer Science, vol 10153. Springer, Cham. https://doi.org/10.1007/978-3-319-54045-0_19

  • DOI: https://doi.org/10.1007/978-3-319-54045-0_19

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-54044-3

  • Online ISBN: 978-3-319-54045-0

  • eBook Packages: Computer Science (R0)
