Abstract
Software source code size, measured in source lines of code (SLOC), is an important input to many parametric software development effort estimation methods. In this paper, we empirically investigate the early prediction of SLOC for object-oriented software using use case metrics. We built the prediction models with several modeling techniques. We used univariate logistic regression and simple linear regression to evaluate the individual effect of each use case metric on SLOC, and multivariate logistic regression and multiple linear regression to explore the combined effect of the use case metrics on SLOC. We also applied several machine learning methods (k-NN, naïve Bayes, C4.5, random forest, and a multilayer perceptron neural network). The prediction models were evaluated using receiver operating characteristic (ROC) analysis, in particular the area under the curve (AUC) measure, and leave-one-out cross-validation. We report an empirical study based on data collected from five open source Java projects, in which the use case metrics are compared with the well-known use case points method. The results provide evidence that the use case metrics-based approach predicts SLOC more accurately than the use case points-based approach.
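As a rough illustration of two of the techniques named above, the following minimal sketch combines simple linear regression with leave-one-out cross-validation. It is not the authors' implementation: the function names, the use of mean absolute error as the score, and the toy data are all assumptions made for illustration, and the numbers do not come from the study.

```python
def fit_simple_linear(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def loocv_mean_abs_error(xs, ys):
    """Leave-one-out CV: refit without each point, then predict that point."""
    errors = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        intercept, slope = fit_simple_linear(train_x, train_y)
        predicted = intercept + slope * xs[i]
        errors.append(abs(ys[i] - predicted))
    return sum(errors) / len(errors)


if __name__ == "__main__":
    # Hypothetical toy data (NOT from the paper): one use case metric
    # value per use case, paired with the SLOC implementing it.
    metric = [2, 4, 5, 7, 9, 12]
    sloc = [110, 205, 260, 340, 460, 590]
    print(loocv_mean_abs_error(metric, sloc))
```

Each observation is held out once, the model is refit on the remaining points, and the held-out value is predicted. This makes full use of small data sets, at the cost of one refit per observation.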
Acknowledgments
This work was partially supported by a grant from NSERC (Natural Sciences and Engineering Research Council of Canada).
Cite this article
Badri, M., Badri, L., Flageol, W. et al. Source code size prediction using use case metrics: an empirical comparison with use case points. Innovations Syst Softw Eng 13, 143–159 (2017). https://doi.org/10.1007/s11334-016-0285-7
DOI: https://doi.org/10.1007/s11334-016-0285-7