Abstract
Many studies have shown that the predictions of estimation models built from data collected by other companies (cross-company models) can be significantly less accurate than those of estimation models built from a dataset collected by a single company (within-company models). This is due to the differing characteristics of cross-company and within-company datasets. In this paper, we propose an approach based on expert opinion that could help small/medium companies that have no data available from previously developed projects. In particular, experts are in charge of selecting data from public cross-company datasets by looking at the information about the employed software development process and software technologies. The proposed strategy uses a Delphi approach to reach consensus among experts. To assess the strategy, we performed an empirical study in which the cross-company estimation model was built from a dataset from the PROMISE repository that includes information on the functional size expressed in terms of COSMIC. We selected this dataset since COSMIC is the method used to size applications by the company that provided the within-company dataset employed as the test set to assess the accuracy of the built cross-company model. We compared the accuracy of the obtained predictions with those of the cross-company model built without selecting the observations. The results are promising since the effort predictions obtained with the proposed strategy are significantly better than those obtained with the model built on the whole cross-company dataset.
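The comparison described in the abstract can be illustrated with a minimal sketch. It rests on several assumptions that the abstract does not state: the estimation model is taken to be a simple linear regression of effort on COSMIC size, accuracy is measured as mean absolute error on the within-company test set, and the expert consensus is represented by a boolean flag per observation. All data values and names (`fit_ols`, `build_model`, `selected_by_experts`) are hypothetical and serve illustration only.

```python
from statistics import mean


def fit_ols(sizes, efforts):
    """Fit effort = a + b * size by ordinary least squares."""
    mx, my = mean(sizes), mean(efforts)
    sxx = sum((x - mx) ** 2 for x in sizes)
    sxy = sum((x - mx) * (y - my) for x, y in zip(sizes, efforts))
    b = sxy / sxx
    a = my - b * mx
    return a, b


def mean_absolute_error(model, test):
    """Mean absolute residual of the model on (size, effort) pairs."""
    a, b = model
    return mean(abs(effort - (a + b * size)) for size, effort in test)


def build_model(rows):
    """Build a model from (size, effort, selected_by_experts) rows."""
    return fit_ols([r[0] for r in rows], [r[1] for r in rows])


# Hypothetical toy data, NOT taken from the paper:
# (COSMIC size, effort in person-hours, selected_by_experts)
cross_company = [
    (100, 1000, True),
    (200, 2000, True),
    (300, 3000, True),
    (150, 6000, False),  # assumed to differ in process/technology
    (250, 9000, False),
]
# Hypothetical within-company projects used only as the test set.
within_company = [(120, 1200), (220, 2200)]

# Baseline: model built on the whole cross-company dataset.
full_model = build_model(cross_company)
# Proposed strategy: model built only on expert-selected observations.
selected_model = build_model([r for r in cross_company if r[2]])

mae_full = mean_absolute_error(full_model, within_company)
mae_selected = mean_absolute_error(selected_model, within_company)
print(f"MAE whole dataset: {mae_full:.1f}, MAE expert-selected: {mae_selected:.1f}")
```

In this toy setting the expert-selected model has the lower error by construction; the sketch shows only the shape of the comparison, not the paper's results.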
Notes
1. Raw data cannot be revealed because of a Non-Disclosure Agreement with the software company.
2. See the PROMISE repository web site for the description of all the variables: https://terapromise.csc.ncsu.edu/repo/effort/isbsg/isbsg10/isbsg-attribute-info.txt.
3. See the PROMISE repository web site to see the remaining observations: http://openscience.us/repo/effort/isbsg/cosmic.html.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Ferrucci, F., Gravino, C. (2019). Can Expert Opinion Improve Effort Predictions When Exploiting Cross-Company Datasets? - A Case Study in a Small/Medium Company. In: Franch, X., Männistö, T., Martínez-Fernández, S. (eds) Product-Focused Software Process Improvement. PROFES 2019. Lecture Notes in Computer Science, vol 11915. Springer, Cham. https://doi.org/10.1007/978-3-030-35333-9_20
DOI: https://doi.org/10.1007/978-3-030-35333-9_20
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-35332-2
Online ISBN: 978-3-030-35333-9
eBook Packages: Computer Science, Computer Science (R0)