Abstract
Accurate Software Effort Estimation (SEE) is among the most difficult problems in software project management. As the Unified Modeling Language (UML) gained prominence in requirements analysis and software system design, researchers and practitioners became increasingly interested in using Use Case Point (UCP) metrics derived from UML diagrams for SEE. Much research has already been done in this area, and several researchers have applied regression and clustering-based models to UCP estimation. However, most of these models suffer from low accuracy. Among regression models, linear regression (LR) has been used in most studies. The major problem with the LR model is that it fits a single straight line to the data. It therefore underfits non-linear data, which is generally the case with UCP estimation datasets. This study proposes a UCP estimation method based on a Locally Weighted Linear Regression (LWLR) model, which we call Know-UCP, that addresses this issue by assigning weights to the training projects. In addition, not all UCP variables are significant for UCP estimation, and including insignificant ones affects the model’s performance. We therefore perform a significance analysis within the proposed model to find the most significant predictors for UCP estimation. Finally, we compare the proposed approach with other models and find that Know-UCP performs better than them on various performance measures.
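To make the idea of assigning weights to the training projects concrete, the following is a minimal sketch of locally weighted linear regression, not the authors' Know-UCP implementation. It assumes a Gaussian kernel with a bandwidth parameter `tau`; the function name `lwlr_predict`, the kernel choice, and the synthetic data are illustrative assumptions that do not come from the abstract.

```python
import numpy as np


def lwlr_predict(x_query, X, y, tau=1.0):
    """Predict the target at x_query with locally weighted linear regression.

    Assumption (not from the paper): training points are weighted by a
    Gaussian kernel of bandwidth tau centred on the query point.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_query = np.asarray(x_query, dtype=float)

    # Add an intercept column to the training matrix and the query point.
    Xb = np.hstack([np.ones((X.shape[0], 1)), X])
    xq = np.concatenate([[1.0], x_query])

    # Weight each training project by its closeness to the query project.
    sq_dist = np.sum((X - x_query) ** 2, axis=1)
    w = np.exp(-sq_dist / (2.0 * tau ** 2))

    # Weighted least squares: solve (X^T W X) theta = X^T W y.
    XtW = Xb.T * w  # scales column j of X^T by w[j], i.e. X^T W
    theta = np.linalg.pinv(XtW @ Xb) @ (XtW @ y)
    return float(xq @ theta)


# Synthetic, non-linear toy data (hypothetical, for illustration only).
X_demo = np.arange(0.0, 10.5, 0.5).reshape(-1, 1)
y_demo = X_demo[:, 0] ** 2
print(lwlr_predict([3.0], X_demo, y_demo, tau=0.5))
```

A separate linear model is fitted for every query, so nearby training projects dominate the fit while distant ones contribute little. A small `tau` follows local curvature closely, and a very large `tau` makes the result approach ordinary linear regression.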
Data Availability Statement
All data generated or analyzed during this study are included in the published article https://doi.org/10.1016/j.infsof.2017.12.009 and its supplementary information files.
Acknowledgments
The authors are thankful to the Government of India for project funding under the SPARC and VAJRA Scheme. We are also grateful to the reviewers, the associate editor, and the editor for their valuable feedback and efforts.
Author information
Contributions
Both authors contributed equally to the development of the manuscript. Suyash Shukla: conceptualization, methodology, and initial draft preparation. Sandeep Kumar: writing (review and editing), funding acquisition, supervision, and validation.
Ethics declarations
Conflict of Interests
None.
About this article
Cite this article
Shukla, S., Kumar, S. Know-UCP: locally weighted linear regression based approach for UCP estimation. Appl Intell 53, 13488–13505 (2023). https://doi.org/10.1007/s10489-022-04160-5
DOI: https://doi.org/10.1007/s10489-022-04160-5