Source code size prediction using use case metrics: an empirical comparison with use case points

  • Original Paper
Innovations in Systems and Software Engineering

Abstract

Software source code size, measured in source lines of code (SLOC), is a key input to many parametric software development effort estimation methods. In this paper, we empirically investigate the early prediction of SLOC for object-oriented software from use case metrics, using several modeling techniques to build the prediction models. We used univariate logistic regression and simple linear regression to evaluate the individual effect of each use case metric on SLOC, and multivariate logistic regression and multiple linear regression to explore their combined effect. We also applied several machine learning methods (k-NN, naïve Bayes, C4.5, random forest, and a multilayer perceptron neural network). The prediction models were evaluated using receiver operating characteristic (ROC) analysis, in particular the area under the curve (AUC), and leave-one-out cross-validation. We report an empirical study using data collected from five open source Java projects and compare the use case metrics with the well-known use case points method. The results provide evidence that the use case metrics-based approach predicts SLOC more accurately than the use case points-based approach.
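To make the evaluation vocabulary of the abstract concrete, the following minimal sketch shows simple linear regression, leave-one-out cross-validation, and the area under the ROC curve working together. It is an illustration under stated assumptions, not the authors' procedure or tooling: the data are synthetic, the metric name `transactions` is hypothetical, and binarizing SLOC at its median to obtain class labels is an assumption made here only so that an AUC can be computed.

```python
from statistics import mean, median

def fit_simple_linear(xs, ys):
    """Ordinary least squares fit of y = a + b * x; returns (a, b)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    return y_bar - b * x_bar, b

def loocv_predictions(xs, ys):
    """Leave-one-out: predict each point from a model fit on all the others."""
    preds = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        a, b = fit_simple_linear(train_x, train_y)
        preds.append(a + b * xs[i])
    return preds

def auc(scores, labels):
    """AUC as the probability that a positive case outranks a negative one
    (ties count as 0.5)."""
    pos = [s for s, lab in zip(scores, labels) if lab == 1]
    neg = [s for s, lab in zip(scores, labels) if lab == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Synthetic data, invented for illustration only (not from the paper):
# one hypothetical use case metric per use case, and the SLOC implementing it.
transactions = [2, 3, 5, 6, 8, 9, 12, 14]
sloc = [110, 160, 240, 310, 390, 450, 610, 690]

# Out-of-sample SLOC predictions and their mean absolute error.
preds = loocv_predictions(transactions, sloc)
mae = mean(abs(p - y) for p, y in zip(preds, sloc))

# Assumption: label a use case "large" when its SLOC exceeds the median.
threshold = median(sloc)
labels = [1 if y > threshold else 0 for y in sloc]
score = auc(preds, labels)

print(f"LOOCV mean absolute error: {mae:.1f} SLOC")
print(f"AUC of LOOCV predictions vs. large/small labels: {score:.2f}")
```

Leave-one-out cross-validation suits small samples because every observation serves once as the test case while the model is fit on all the others. The rank-based AUC used here is equivalent to the area under the empirical ROC curve.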

Notes

  1. http://www.math-cs.gordon.edu/courses/cs211/ATMExample/.

  2. https://commons.apache.org/proper/commons-exec/.

  3. https://commons.apache.org/proper/commons-email/.

  4. http://www.joda.org/joda-time/.

  5. http://www.xlstat.com/.

  6. http://eric.univ-lyon2.fr/~ricco/tanagra/fr/tanagra.html.

Acknowledgments

This work was supported in part by a grant from NSERC (Natural Sciences and Engineering Research Council of Canada).

Author information

Correspondence to Mourad Badri.

Cite this article

Badri, M., Badri, L., Flageol, W. et al. Source code size prediction using use case metrics: an empirical comparison with use case points. Innovations Syst Softw Eng 13, 143–159 (2017). https://doi.org/10.1007/s11334-016-0285-7

  • DOI: https://doi.org/10.1007/s11334-016-0285-7
