Abstract
To date, most research in software effort estimation has not taken chronology into account when selecting projects for training and validation sets. A chronological split makes use of a project’s starting and completion dates, such that any model that estimates effort for a new project p uses as its training set only those projects completed before p’s starting date. A study in 2009 (“S3”) investigated the use of chronological splits taking into account a project’s age. The research question was whether a training set containing only the most recent past projects (a “moving window” of recent projects) would lead to more accurate estimates than using the entire history of past projects completed prior to the starting date of a new project. S3 found that moving windows could improve the accuracy of estimates. The study described herein replicates S3 using three different and independent data sets. Estimation models were built using regression, and accuracy was measured using absolute residuals. The results contradict S3: they show no gain in estimation accuracy when using windows for effort estimation. This is a surprising result, as it does not support the intuition that recent data should be more helpful than old data for effort estimation. Several factors, discussed in this paper, might have contributed to these contradictory results. Our future work includes replicating this study using other data sets, to better understand when using windows is a suitable choice for software companies.
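As an illustration only, the construction of the two training sets can be sketched in R (the language used for the analyses; see Notes). The data frame and its columns (start, end, effort, size) are hypothetical placeholders, not the variables of S3 or of this study:

    # For a new project p: the "growing portfolio" is every project completed
    # before p starts; a moving window keeps only the w most recent of those.
    training_sets <- function(projects, p, w) {
      finished <- projects[projects$end < p$start, ]   # chronological split
      recent   <- finished[order(finished$end, decreasing = TRUE), ]
      list(growing = finished, window = head(recent, w))
    }

    # A regression-based estimate for p, e.g. from a window of 20 projects:
    # fit <- lm(log(effort) ~ log(size), data = training_sets(projects, p, 20)$window)
    # exp(predict(fit, newdata = p))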
Notes
Using R version 3.2.2 and relevant packages as current at January 2016.
Using the “fastbw()” function from Harrell’s “rms” package for R.
Using the “cohen.d()” function from the “effsize” package in R.
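For reference, a minimal sketch of how these functions are typically invoked; the model formula and the residual vectors are hypothetical placeholders, not the paper’s actual variables:

    library(rms)       # provides ols() and fastbw()
    library(effsize)   # provides cohen.d()

    # Backward variable elimination on a linear model fitted with ols():
    fit <- ols(log(effort) ~ log(size) + lang, data = projects)
    fastbw(fit, rule = "p", sls = 0.05)

    # Cohen's d between two paired vectors of absolute residuals:
    cohen.d(ar_window, ar_growing)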
References
Amasaki S, Lokan C (2012) The effects of moving windows to software estimation: comparative study on linear regression and estimation by analogy. IWSM/Mensura 2012, Assisi
Amasaki S, Lokan C (2013) The evaluation of weighted moving windows for software effort estimation. Product-Foc Software Process Improve, LNCS 7983:214–228, Springer
Amasaki S, Lokan C (2014a) On the effectiveness of weighted moving windows: experiment on linear regression based software effort estimation. J Software: Evol Process 27(7):488–507
Amasaki S, Lokan C (2014b) The effects of moving windows on software effort estimation: comparative study with CART. Proc 6th Int Workshop Empirical Software Eng Pract, Osaka, Japan
Amasaki S, Lokan C (2014c) The effects of gradual weighting on duration-based moving windows for software effort estimation. 15th Int Conf Product-Focused Software Eng Process Improve, Helsinki, Finland: 63–77
Amasaki S, Lokan C (2015) A replication of comparative study of moving windows on linear regression and estimation by analogy. Proc 11th Int Conf Predict Models Data Anal Software Eng, Beijing, China: 6:1–6:10
Amasaki S, Lokan C (2016a) Evaluation of moving window policies with CART. Proc 7th Int Workshop Empirical Software Eng Pract, Osaka, Japan
Amasaki S, Lokan C (2016b) A replication study on the effects of weighted moving windows for software effort estimation. Proc 20th Int Conf Eval Assessment Software Eng, Limerick, Ireland
Amasaki S, Takahara Y, Yokogawa T (2011) Performance evaluation of windowing approach on effort estimation by analogy. IWSM/Mensura 2011, Nara, pp 188–195
Azhar D, Mendes E, Riddle P (2012) A systematic review of Web resource estimation. Proc 8th Int Conf Predict Models Software Eng, Lund, Sweden: 49–58
Azzeh M, Cowling PI, Neagu D (2010) Software stage-effort estimation based on association rule mining and fuzzy set theory. Proc 10th Int Conf Comput Inform Technol, Bradford, UK: 249–256
Bibi S, Stamelos I, Angelis L (2008) Combining probabilistic models for explanatory productivity estimation. Inf Softw Technol 50(7–8):656–669
Bibi S, Stamelos I, Gerolimos G, Kollias V (2010) BBN based approach for improving the software development process of an SME—a case study. J Softw Maint Evol Res Pract 22(2):121–140
Britto R, Freitas V, Mendes E, Usman M (2014) Effort estimation in global software development: a systematic literature review. Proc 9th Int Conf Global Software Eng, Shanghai, China: 135–144
Britto R, Mendes E, Börstler J (2015) An empirical investigation on effort estimation in agile global software development. Proc 10th Int Conf Global Software Eng, Ciudad Real, Spain: 38–45
Carver J (2010) Towards reporting guidelines for experimental replications: a proposal. Proc 1st Int Workshop Replic Empirical Software Eng Res, Cape Town, South Africa
Cohen J (1992) A power primer. Psychol Bull 112:155–159
Cohn M (2005) Agile estimating and planning. Prentice Hall
Conte SD, Dunsmore HE, Shen VY (1986) Software engineering metrics and models. Benjamin-Cummings
Cook RD (1977) Detection of influential observations in linear regression. Technometrics 19:15–18
Fernández-Diego M, Martínez-Gómez M, Torralba-Martínez J-M (2010) Sensitivity of results to different data quality meta-data criteria in the sample selection of projects from the ISBSG dataset. Proc 6th Int Conf Predict Models Software Eng, Timisoara, Romania: 13:1–13:9
Forselius P (2006) Data quality criteria for Experience® data collection. STTF Oy
Foss T, Stensrud E, Kitchenham B, Myrtveit I (2003) A simulation study of the model evaluation criterion MMRE. IEEE Trans Softw Eng 29(11):985–995
Han J, Kamber M (2006) Data mining: concepts and techniques. Morgan Kaufmann
Jørgensen M (2004) A review of studies on expert estimation of software development effort. J Syst Softw 70(1):37–60
Jørgensen M (2005) Practical guidelines for expert-judgment-based software effort estimation. IEEE Softw 22(3):57–63
Jørgensen M (2013) Relative estimation of software development effort: it matters with what and how you compare. IEEE Softw 30(2):74–79
Jørgensen M, Grimstad S (2008) Avoiding irrelevant and misleading information when estimating development effort. IEEE Softw 25(3):78–83
Jørgensen M, Shepperd M (2007) A systematic review of software development cost estimation studies. IEEE Trans Softw Eng 33(1):33–53
Kitchenham BA, Mendes E (2009) Why comparative effort prediction studies may be invalid. Proc 5th Int Conf Predict Models Software Eng, Vancouver, Canada: 4:1–4:5
Kitchenham BA, Pickard LM, MacDonell SG, Shepperd MJ (2001) What accuracy statistics really measure. IEE Proc - Software 148(3):81–85
Kitchenham B, Pfleeger SL, McColl B, Eagan S (2002) An empirical study of maintenance and development estimation accuracy. J Syst Softw 64(1):57–77
Kitchenham BA, Mendes E, Travassos G (2007) Cross versus within-company cost estimation studies: a systematic review. IEEE Trans Softw Eng 33(5):316–329
Kocaguneli E, Menzies T, Mendes E (2014) Transfer learning in effort estimation. Empir Softw Eng 19:1–31
Lefley M, Shepperd MJ (2003) Using genetic programming to improve software effort estimation based on general data sets. LNCS 2724, Springer-Verlag, pp 2477–2487
Li YF, Xie M, Goh TN (2009) A study of the non-linear adjustment for analogy based software cost estimation. Empir Softw Eng 14:603–643
Lokan C, Mendes E (2008) Investigating the use of chronological splitting to compare software cross-company and single-company effort predictions. Proc 12th Int Conf Eval Assess Software Eng, Bari, Italy: 151–160
Lokan C, Mendes E (2009a) Using chronological splitting to compare cross- and single-company effort models: further investigation. Proc 32nd Austral Conf Comput Sci, Wellington, NZ: 47–54
Lokan C, Mendes E (2009b) Applying moving windows to software effort estimation. Proc 3rd Int Symp Empirical Software Eng Measure, Lake Buena Vista, Florida, USA: 111–122
Lokan C, Mendes E (2012) Investigating the use of duration-based moving windows to improve software effort estimation. Proc 19th Asia-Pacific Software Eng Conf, Hong Kong
Lokan C, Mendes E (2014) Investigating the use of duration-based moving windows to improve software effort prediction: a replicated study. Inf Softw Technol 56(9):1063–1075
Lopez-Martin C, Isaza C, Chavoya A (2012) Software development effort prediction of industrial projects applying a general regression neural network. Empir Softw Eng 17(6):738–756
MacDonell SG, Shepperd MG (2003) Using prior-phase effort records for re-estimation during software projects. Proc 9th IEEE Int Symp Software Metrics, Sydney, Australia
MacDonell SG, Shepperd MJ (2010) Data accumulation and software effort prediction. Proc 4th Int Symp Empirical Software Eng Measure, Bolzano-Bozen, Italy
Mäntylä MV, Lassenius C, Vanhanen J (2010) Rethinking replication in software engineering: can we see the forest for the trees? Proc 1st Int Workshop Replic Empirical Software Eng Res, Cape Town, South Africa
Maxwell K (2002) Applied statistics for software managers. Software Quality Institute Series, Prentice Hall
Mendes E (2014) Practitioner’s knowledge representation: a pathway to improve software effort estimation. Springer, ISBN 978-3-642-54156-8
Mendes E, Lokan C (2009) Investigating the use of chronological splitting to compare software cross-company and single-company effort predictions: a replicated study. Proc 13th Int Conf Eval Assess Software Eng, Durham, UK
Mendes E, Mosley N (2008) Bayesian network models for web effort prediction: a comparative study. IEEE Trans Softw Eng 34(6):723–737
Menzies T, Krishna R, Pryor D (2016) The promise repository of empirical software engineering data; http://openscience.us/repo. North Carolina State University, Department of Computer Science
Minku LL, Yao X (2012a) Can cross-company data improve performance in software effort estimation? Proc 8th Int Conf Predict Models Software Eng, Lund, Sweden: 69–78
Minku LL, Yao X (2012b) Using unreliable data for creating more reliable online learners. Proc Int Joint Conf Neural Networks, Brisbane, pp 1–8
Minku LL, Sarro F, Mendes E, Ferrucci F (2015) How to make best use of cross-company data for Web effort estimation? Proc 9th Int Symp Empirical Software Eng Measure, Beijing, China: 1–10
Premraj R, Shepperd MJ, Kitchenham BA, Forselius P (2005) An empirical analysis of software productivity over time. Proc 11th Int Symp Software Metrics, Como, Italy
Schmietendorf A, Kunz M, Dumke R (2008) Effort estimation for agile software development projects. Proc 5th Software Measurement European Forum, Milan, pp 113–126
Shepperd MJ, MacDonell SG (2012) Evaluating prediction systems in software project estimation. Inf Softw Technol 54(8):820–827
Shull FJ, Carver JC, Vegas S, Juristo N (2008) The role of replications in empirical software engineering. Empir Softw Eng 13:211–218
Sigweni B, Shepperd MJ, Turchi T (2016) Realistic assessment of software effort estimation models. Proc 20th Int Conf Assess Eval Software Eng, Limerick, Ireland
Song L, Minku LL, Yao X (2013) The impact of parameter tuning on software effort estimation using learning machines. Proc 9th Int Conf Predict Models Software Eng, Baltimore, USA: 9:1–9:10
Tabachnick BG, Fidell LS (1996) Using multivariate statistics. Harper-Collins
Tsunoda M, Amasaki S, Lokan C (2013) How to treat timing information for software effort estimation? Proc 2013 Int Conf Software Syst Process, San Francisco, USA: 10–19
Turhan B (2012) On the dataset shift problem in software engineering prediction models. Empir Softw Eng 17:62–74
Usman M, Mendes E, Weidt F, Britto R (2014) Effort estimation in agile software development: a systematic literature review. Proc 10th Int Conf Predict Models Software Eng, Turin, Italy: 82–91
Acknowledgments
We thank Pekka Forselius for making the Finnish data set available to us for this research.
Additional information
Communicated by: Martin Shepperd
Appendix
Tables 6, 7, and 8 present in full numerical detail the information that is plotted in Figs. 7a and b, 8a and b, and 9a and b. In each table, the first column shows the window size. The second column shows the number of projects for which the use of a window of that size could make a difference to the estimate, compared to using the growing portfolio. The third column shows the MAE across all of those projects when a window is used. The fourth column shows the MAE for the same set of projects when a window is not used and the training set instead contains all projects completed so far. The fifth column shows the difference between columns 3 and 4; a positive number means that MAE is worse when a window is used than when all data are retained, and a negative number means that MAE is better when a window is used. The sixth column expresses the difference in MAE (the fifth column) as a percentage of the MAE without a window (the fourth column). The seventh column shows the p-value from the paired-samples two-sided Wilcoxon test of the hypothesis that MAE with a window differed from MAE with the growing portfolio; values below 0.00055 indicate a statistically significant difference for that test (applying the Holm-Bonferroni correction to the overall significance level of 0.05). The final column shows the effect size r, calculated from Cohen’s d statistic (Cohen 1992): r = d / sqrt(d^2 + 4). Effect size is considered small if it is below about 0.2, medium at about 0.5, and large above about 0.8 (Cohen 1992; Shepperd and MacDonell 2012).
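The statistics reported in these columns can be computed with standard R functions. The following is a sketch under assumed variable names (ar_window and ar_growing, two vectors of absolute residuals aligned by project), not the authors’ actual scripts:

    library(effsize)

    mae_window  <- mean(ar_window)               # column 3
    mae_growing <- mean(ar_growing)              # column 4
    diff_mae    <- mae_window - mae_growing      # column 5
    diff_pct    <- 100 * diff_mae / mae_growing  # column 6

    # Column 7: paired two-sided Wilcoxon test; across all window sizes the
    # p-values are judged against Holm-Bonferroni adjusted thresholds
    # (equivalently, p.adjust(p_values, method = "holm")).
    p <- wilcox.test(ar_window, ar_growing, paired = TRUE)$p.value

    # Final column: effect size r derived from Cohen's d
    d <- cohen.d(ar_window, ar_growing)$estimate
    r <- d / sqrt(d^2 + 4)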
Cite this article
Lokan, C., Mendes, E. Investigating the use of moving windows to improve software effort prediction: a replicated study. Empir Software Eng 22, 716–767 (2017). https://doi.org/10.1007/s10664-016-9446-4