Replicating studies on cross- vs single-company effort models using the ISBSG Database

Empirical Software Engineering

Abstract

In 2001, Jeffery et al. (Using public domain metrics to estimate software development effort. Proceedings Metrics’01, London, pp 16–27, 2001; S1) used the ISBSG database to compare the effort prediction accuracy of cross-company and single-company effort models. Given that more than 2,000 projects were later volunteered to this database, in 2005 Mendes et al. (A replicated comparison of cross-company and within-company effort estimation models using the ISBSG Database, in Proceedings of Metrics’05, Como, 2005; S2) replicated S1 but obtained different results. The difference in results could reflect legitimate differences in data set patterns; however, it could also reflect differences in experimental procedure, since S2 could not follow exactly the procedure used in S1, which was not fully documented. Recently, we applied S2’s experimental procedure to the ISBSG database version used in S1 (release 6) to assess whether differences in experimental procedure contributed to the different results (Lokan and Mendes, Cross-company and single-company effort models using the ISBSG Database: a further replicated study, Proceedings of the ISESE’06, pp 75–84, 2006; S3). Our results corroborated those of S1, suggesting that the differences in the results obtained by S2 were likely caused by legitimate differences in data set patterns. We have since been able to reconstruct the experimental procedure of S1, and in this paper we therefore present both S3 and a further study (S4), which applied the experimental procedure of S1 to the data set used in S2. By applying the experimental procedure of S2 to the data set used in S1 (study S3), and the experimental procedure of S1 to the data set used in S2 (study S4), we investigate the effect of all the variations between S1 and S2. Our results for S4 support those of S3, suggesting that differences in data preparation and analysis procedures did not affect the outcome of the analysis. Thus, the different results of S1 and S2 are very likely due to fundamental differences in the data sets.

Notes

  1. www.isbsg.org

References

  • Briand LC, El-Emam K, Maxwell K, Surmann D, Wieczorek I (1999) An assessment and comparison of common cost estimation models. Proceedings of the 21st International Conference on Software Engineering, ICSE 99, pp 313–322

  • Briand LC, Langley T, Wieczorek I (2000) A replicated assessment of common software cost estimation techniques. Proceedings of the 22nd International Conference on Software Engineering, ICSE 2000, pp 377–386

  • Conte SD, Dunsmore HE, Shen VY (1986) Software engineering metrics and models. Benjamin-Cummings

  • Cook RD (1977) Detection of influential observations in linear regression. Technometrics 19:15–18

  • Jeffery R, Ruhe M, Wieczorek I (2000) A comparative study of two software development cost modeling techniques using multi-organizational and company-specific data. Inf Softw Technol 42:1009–1016

  • Jeffery R, Ruhe M, Wieczorek I (2001) Using public domain metrics to estimate software development effort. Proceedings Metrics’01, London, pp 16–27

  • Kemerer CF (1987) An empirical validation of software cost estimation models. Commun ACM 30(5)

  • Kitchenham BA, Mendes E (2004) A comparison of cross-company and single-company effort estimation models for Web applications, Proceedings EASE 2004, pp 47–55

  • Kitchenham BA, Mendes E, Travassos G (2006) A systematic review of cross- vs within-company cost estimation studies, Proceedings of EASE’06, BCS. (Available at http://ewic.bcs.org/conferences/2006/ease06/index.htm)

  • Kitchenham BA, Taylor NR (1984) Software cost models. ICL Tech J 73–102, May

  • Kirsopp C, Shepperd M (2002) Making inferences with small numbers of training sets. IEE Proc Softw 149:123–130

  • Lefley M, Shepperd MJ (2003) Using genetic programming to improve software effort estimation based on general data sets, Proceedings of GECCO 2003, LNCS 2724. Springer, New York, pp 2477–2487

  • Lokan CJ (2005) Function points. In: Zelkowitz MV (ed) Advances in computers, vol 65. Elsevier, pp 298–347

  • Lokan C, Mendes E (2006) Cross-company and single-company effort models using the ISBSG Database: a further replicated study, Proceedings of the ISESE’06, pp 75–84

  • Maxwell K (2002) Applied statistics for software managers. Software Quality Institute Series, Prentice-Hall, Englewood Cliffs, NJ

  • Maxwell K, Wassenhove LV, Dutta S (1999) Performance evaluation of general and company specific models in software development effort estimation. Manag Sci 45(6):787–803, June

  • Mendes E, Kitchenham BA (2004) Further comparison of cross-company and within company effort estimation models for Web applications. Proceedings Metrics’04, Chicago, Illinois September 11–17th 2004, IEEE Computer Society, pp 348–357

  • Mendes E, Lokan C, Harrison R, Triggs C (2005) A replicated comparison of cross-company and within-company effort estimation models using the ISBSG Database, in Proceedings of Metrics’05, Como

  • Tabachnick BG, Fidell LS (1996) Using multivariate statistics. Harper Collins, New York

  • Wieczorek I, Ruhe M (2002) How valuable is company-specific data compared to multi-company data for software cost estimation? Proceedings Metrics’02, Ottawa, pp 237–246

Acknowledgments

We would like to thank the ISBSG group for making releases 6 and 9 available for our research and all those companies that have volunteered data on their projects.

Author information

Corresponding author

Correspondence to Emilia Mendes.

Additional information

Editor: José Carlo Maldonado

About this article

Cite this article

Mendes, E., Lokan, C. Replicating studies on cross- vs single-company effort models using the ISBSG Database. Empir Software Eng 13, 3–37 (2008). https://doi.org/10.1007/s10664-007-9045-5

  • DOI: https://doi.org/10.1007/s10664-007-9045-5
