The Impact of Instance Selection Algorithms on Maintenance Effort Estimation for Open-Source Software

Miloudi, Chaymae; Cheikhi, Laila; Idri, Ali; Abran, Alain

doi:10.1007/978-3-031-04829-6_17

Chaymae Miloudi, Laila Cheikhi, Ali Idri & Alain Abran

Conference paper
First Online: 11 May 2022

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 470)

Abstract

Open-source software is now widely used in industry, and accurately estimating its maintenance effort has become an active research topic. Researchers have conducted many open-source software maintenance effort estimation (O-MEE) studies based on statistical and machine learning (ML) techniques. This study examines the impact of instance selection on the performance of ML techniques in O-MEE, mainly for bug resolution. An empirical study was conducted with three ML techniques, k-nearest neighbor (kNN), support vector machine (SVM), and multinomial naïve Bayes (MNB), combined with All-kNN instance selection, on three datasets: Eclipse JDT, Eclipse Platform, and Mozilla Thunderbird. The study reports on a set of 18 experiments and compares their results, which show that instance selection improved the performance of the ML techniques.
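
As an illustration of the instance selection step named in the abstract, the following is a minimal sketch of All-kNN editing in its commonly described form: an instance is dropped if, for any neighborhood size from 1 to k, the majority label of its nearest neighbors disagrees with its own label. This is not the authors' implementation. The function names, the Euclidean distance, the toy two-dimensional feature vectors, and the "fast"/"slow" resolution labels are all assumptions made for the example.

```python
from collections import Counter
from math import dist


def knn_predict(X, y, query_index, k):
    """Majority label among the k nearest neighbours of X[query_index].

    The query instance itself is excluded. Vote ties go to the label
    seen first among the neighbours, i.e. the closest one.
    """
    neighbours = sorted(
        (j for j in range(len(X)) if j != query_index),
        key=lambda j: dist(X[query_index], X[j]),
    )[:k]
    votes = Counter(y[j] for j in neighbours)
    return votes.most_common(1)[0][0]


def all_knn_select(X, y, k=3):
    """Return the indices of the instances kept by All-kNN editing.

    An instance is flagged when, for any i in 1..k, its i nearest
    neighbours vote for a label different from its own. All flagged
    instances are removed together after the full pass.
    """
    flagged = set()
    for idx in range(len(X)):
        for i in range(1, k + 1):
            if knn_predict(X, y, idx, i) != y[idx]:
                flagged.add(idx)
                break
    return [idx for idx in range(len(X)) if idx not in flagged]


# Hypothetical toy data: two tight clusters plus one mislabelled point.
X = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),  # "fast" cluster
    (0.5, 0.5),                                      # labelled "slow", sits near "fast"
    (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (5.1, 5.1),  # "slow" cluster
]
y = ["fast"] * 4 + ["slow"] + ["slow"] * 4

kept = all_knn_select(X, y, k=3)
print(kept)  # → [0, 1, 2, 3, 5, 6, 7, 8]
```

In this toy run only the mislabelled point at index 4 is removed. In a setup like the one the abstract describes, the retained instances would then be used to train the kNN, SVM, or MNB classifier.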

Author information

Authors and Affiliations

Software Project Management Team, ENSIAS, Mohammed V University in Rabat, Rabat, Morocco
Chaymae Miloudi, Laila Cheikhi & Ali Idri
École de Technologie Supérieure, University of Québec, Montréal, Canada
Alain Abran

Corresponding author

Correspondence to Laila Cheikhi.

Editor information

Editors and Affiliations

ISEG, Universidade de Lisboa, Lisbon, Portugal
Alvaro Rocha
College of Engineering, The Ohio State University, Columbus, OH, USA
Hojjat Adeli
Institute of Data Science and Digital Technologies, Vilnius University, Vilnius, Lithuania
Gintautas Dzemyda
DCT, Universidade Portucalense, Porto, Portugal
Fernando Moreira

Cite this paper

Miloudi, C., Cheikhi, L., Idri, A., Abran, A. (2022). The Impact of Instance Selection Algorithms on Maintenance Effort Estimation for Open-Source Software. In: Rocha, A., Adeli, H., Dzemyda, G., Moreira, F. (eds) Information Systems and Technologies. WorldCIST 2022. Lecture Notes in Networks and Systems, vol 470. Springer, Cham. https://doi.org/10.1007/978-3-031-04829-6_17

DOI: https://doi.org/10.1007/978-3-031-04829-6_17
Published: 11 May 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-04828-9
Online ISBN: 978-3-031-04829-6
eBook Packages: Intelligent Technologies and Robotics (R0)
