Systems performance prediction using requirements quality attributes classification

  • Original Article
  • Journal: Requirements Engineering

Abstract

Poor requirements definition can adversely impact system cost and performance for government acquisition programs. This risk can be mitigated by ensuring that requirements statements are written clearly and unambiguously, with high linguistic quality. This paper introduces a statistical model that uses requirements quality factors to predict system operational performance. Four classification techniques (Logistic Regression, Naïve Bayes, Support Vector Machine, and K-Nearest Neighbor) are explored to develop the predictive model, which is built from empirical data on current major acquisition programs within the federal government. Operational Requirements Documents and Operational Test Reports serve as the data sources for the system requirements statements and the accompanying operational test results, respectively. A commercial-off-the-shelf requirements quality analysis tool is used to compute the requirements linguistic quality metrics used in the model. After model construction, the model's predictive value is confirmed through a sensitivity analysis, cross-validation of the data, and an overfitting analysis. Lastly, receiver operating characteristic (ROC) curves are examined to determine the best-performing model. In all, the results establish that requirements quality is indeed a predictive factor for end-system operational performance, and that the resulting statistical model can inform requirements development based on the likelihood of successful operational performance.
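
To make the modeling approach concrete, the sketch below is a minimal, hypothetical illustration rather than the authors' implementation: it assumes the requirements linguistic quality metrics have already been extracted into a feature matrix, substitutes synthetic values for the non-public program data, and compares the abstract's four classifiers using cross-validated ROC AUC, mirroring the cross-validation and ROC analysis steps described above.

```python
# Hypothetical sketch: compare the four classifiers named in the abstract
# on requirements-quality features via cross-validated ROC AUC.
# All data here is synthetic; the paper's program data is not public.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# X: one row per program; columns stand in for linguistic quality metrics
# (e.g., ambiguity or completeness scores from a COTS analysis tool).
# y: 1 if the system met its operational performance criteria, else 0.
X = rng.normal(size=(60, 5))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=60) > 0).astype(int)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Naive Bayes": GaussianNB(),
    "Support Vector Machine": SVC(),  # decision_function supports ROC AUC
    "K-Nearest Neighbor": KNeighborsClassifier(n_neighbors=5),
}

for name, model in models.items():
    # Standardize features so the SVM and KNN are not dominated by scale.
    pipe = make_pipeline(StandardScaler(), model)
    # Five-fold cross-validated AUC mirrors the paper's cross-validation
    # and ROC analysis for selecting the best-performing classifier.
    scores = cross_val_score(pipe, X, y, cv=5, scoring="roc_auc")
    print(f"{name}: mean AUC = {scores.mean():.3f} (sd = {scores.std():.3f})")
```

In the paper's setting, each row would correspond to one program's requirements quality metrics and each label to that program's operational test outcome; the synthetic feature names and dimensions above are placeholders.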

Author information

Corresponding author

Correspondence to John L. Dargan.

About this article

Cite this article

Dargan, J.L., Wasek, J.S. & Campos-Nanez, E. Systems performance prediction using requirements quality attributes classification. Requirements Eng 21, 553–572 (2016). https://doi.org/10.1007/s00766-015-0232-4
