Analysis of attribute weighting heuristics for analogy-based software effort estimation method AQUA⁺

Li, Jingzhou; Ruhe, Guenther

doi:10.1007/s10664-007-9054-4

Published: 27 November 2007

Volume 13, pages 63–96, (2008)

Empirical Software Engineering

Jingzhou Li¹ & Guenther Ruhe¹

Abstract

Estimation by analogy (EBA) predicts effort for a new project by aggregating effort information of similar projects from a given historical data set. Existing research has shown that careful selection and weighting of attributes may improve the performance of estimation methods. This paper continues along that research line and considers weighting of attributes in order to improve estimation accuracy. More specifically, the impact of weighting (and selection) of attributes is studied as an extension to our former EBA method AQUA, which has shown promising results and also allows estimation for data sets that have non-quantitative attributes and missing values. The resulting new method is called AQUA⁺. For attribute weighting, a qualitative analysis pre-step using rough set analysis (RSA) is performed. RSA is a proven machine learning technique for classification of objects. We exploit the RSA results in different ways and define four heuristics for attribute weighting. AQUA⁺ was evaluated in two ways: (1) comparison between AQUA⁺ and AQUA, along with a comparative analysis of the four proposed heuristics for AQUA⁺; and (2) comparison of AQUA⁺ with other EBA methods. The main evaluation results are: (1) AQUA⁺ obtained better estimation accuracy than AQUA over all six data sets; and (2) AQUA⁺ obtained results better than, or very close to, those of other EBA methods for the three data sets to which all the EBA methods were applied. In conclusion, the proposed attribute weighting method using RSA can improve the estimation accuracy of the EBA method AQUA⁺, according to the empirical studies over six data sets. Testing on more data sets is necessary to obtain results that are more statistically significant.
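The general idea of weighted estimation by analogy can be illustrated with a minimal sketch. It is not the AQUA⁺ algorithm itself. The similarity measures, the function names (`local_similarity`, `global_similarity`, `estimate_effort`), the example projects, and the hand-picked weights are all assumptions made for illustration. In particular, the weights here are not derived from rough set analysis. The sketch only shows how attribute weights, non-quantitative attributes, and missing values might enter a similarity-based estimate.

```python
from typing import Optional, Sequence, Tuple, Union

Value = Optional[Union[float, str]]  # None marks a missing value (assumed encoding)


def local_similarity(a: Value, b: Value, value_range: float) -> Optional[float]:
    """Similarity in [0, 1] for one attribute; None if either value is missing."""
    if a is None or b is None:
        return None
    if isinstance(a, str) or isinstance(b, str):
        return 1.0 if a == b else 0.0  # non-quantitative: exact match only
    if value_range == 0:
        return 1.0
    return max(0.0, 1.0 - abs(a - b) / value_range)


def global_similarity(p: Sequence[Value], q: Sequence[Value],
                      weights: Sequence[float], ranges: Sequence[float]) -> float:
    """Weighted mean of local similarities over attributes known in both projects."""
    num = den = 0.0
    for a, b, w, r in zip(p, q, weights, ranges):
        s = local_similarity(a, b, r)
        if s is not None:  # attributes with a missing value are skipped
            num += w * s
            den += w
    return num / den if den > 0 else 0.0


def estimate_effort(target: Sequence[Value],
                    history: Sequence[Tuple[Sequence[Value], float]],
                    weights: Sequence[float], ranges: Sequence[float],
                    k: int = 2) -> float:
    """Similarity-weighted mean effort of the k most similar historical projects."""
    scored = sorted(
        ((global_similarity(target, attrs, weights, ranges), effort)
         for attrs, effort in history),
        key=lambda pair: pair[0],
        reverse=True,
    )[:k]
    total = sum(s for s, _ in scored)
    if total == 0:  # no similar project found: fall back to a plain mean
        return sum(e for _, e in scored) / len(scored)
    return sum(s * e for s, e in scored) / total


# Hypothetical data: (size, team size, language) -> effort
history = [
    ((100.0, 5.0, "java"), 1000.0),
    ((200.0, None, "java"), 2000.0),  # team size is missing
    ((400.0, 10.0, "c"), 5000.0),
]
weights = (3.0, 1.0, 1.0)   # hand-picked; size counts three times as much
ranges = (300.0, 5.0, 0.0)  # value ranges; unused for the categorical attribute
target = (100.0, 5.0, "java")

estimate = estimate_effort(target, history, weights, ranges, k=2)
print(round(estimate, 2))
```

Raising the weight of an attribute makes differences in that attribute count more when analogues are ranked, which is why the choice of weights can change which historical projects are selected and therefore the resulting estimate.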

Acknowledgements

The authors would like to thank the Alberta Informatics Circle of Research Excellence (iCORE) for its financial support of this research. Thanks are also given to Jim McElroy for his contribution to the improvement of readability of this paper. Special thanks are given to the anonymous reviewers for their valuable and in-depth comments.

Author information

Authors and Affiliations

Software Engineering Decision Support Laboratory, University of Calgary, Calgary, AB, T2N1N4, Canada
Jingzhou Li & Guenther Ruhe

Corresponding author

Correspondence to Jingzhou Li.

Additional information

Editor: José Carlo Maldonado

Appendices

Appendix A

1.1 Definition of Attributes in USP05-FT and USP05-RQ

Table 21 Definition of attributes

Appendix B

2.1 Detailed Results of the Comparative Study

Table 22 Results of USP05-FT

Table 23 Results of ISBSG04-2

Table 24 Results of Mends03

Table 25 Results of Kem87

Table 26 Results of Desh89

About this article

Cite this article

Li, J., Ruhe, G. Analysis of attribute weighting heuristics for analogy-based software effort estimation method AQUA⁺. Empir Software Eng 13, 63–96 (2008). https://doi.org/10.1007/s10664-007-9054-4

Received: 30 November 2006
Accepted: 02 October 2007
Published: 27 November 2007
Issue Date: February 2008
DOI: https://doi.org/10.1007/s10664-007-9054-4
