A comparative study of handling imbalanced data using generative adversarial networks for machine learning based software fault prediction

Thi Minh Phuong, Ha; Vu Thu Nguyet, Pham; Huu Nhat Minh, Nguyen; Thi My Hanh, Le; Thanh Binh, Nguyen

doi:10.1007/s10489-024-05930-z

Published: 08 January 2025

Volume 55, article number 280 (2025)

Applied Intelligence

Ha Thi Minh Phuong¹,
Pham Vu Thu Nguyet¹,
Nguyen Huu Nhat Minh¹,
Le Thi My Hanh² &
Nguyen Thanh Binh¹

Abstract

Software fault prediction (SFP) is the process of identifying potentially defect-prone modules before the testing stage of a software development process. By identifying faults early, software engineers can focus their efforts on the components most likely to contain defects, thereby improving the overall quality and reliability of the software. However, data imbalance and feature redundancy are challenging issues in SFP that can degrade the performance of fault prediction models. Imbalanced software fault datasets, in which the number of normal modules (majority class) is significantly higher than that of faulty modules (minority class), may lead to many false negatives. In this work, we empirically assess variants of Generative Adversarial Networks (GANs), an emerging synthetic data generation method, for resolving the data imbalance issue in common software fault prediction datasets. Five GAN variants (CopulaGAN, VanillaGAN, CTGAN, TGAN and WGANGP) are used to generate synthetic faulty samples that balance the proportion of majority and minority classes in the datasets. We then present an extensive evaluation of prediction models that combine Recursive Feature Elimination (RFE) for feature selection with GAN oversampling methods, as well as models that pair Autoencoders for feature extraction with GANs. In experiments on five fault datasets extracted from the PROMISE repository, we evaluate six machine learning approaches using precision, recall, F1-score, Area Under Curve (AUC) and Matthews Correlation Coefficient (MCC) as performance metrics. The experimental results demonstrate that the combination of CTGAN with RFE and the pairing of CTGAN with Autoencoders outperform the other baselines on all datasets, followed by WGANGP and VanillaGAN. According to the comparative analysis, GAN-based oversampling methods exhibit significant improvement in dealing with data imbalance for software fault prediction.
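To make the evaluation metrics concrete, the following is a minimal sketch, not the authors' code, of how precision, recall, F1-score and MCC follow from a confusion matrix on an imbalanced fault dataset. The toy labels, the function names and the 90/10 class split are assumptions chosen for illustration. AUC is omitted because it requires predicted scores rather than hard labels.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count TP, FP, TN, FN, treating label 1 as the faulty (minority) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def safe_div(num, den):
    """Return num / den, or 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def fault_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall, F1 and MCC from hard labels."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)
    f1 = safe_div(2 * precision * recall, precision + recall)
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = safe_div(tp * tn - fp * fn, mcc_den)
    accuracy = safe_div(tp + tn, tp + fp + tn + fn)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "mcc": mcc,
    }


# Hypothetical toy data: 90 normal modules (0) followed by 10 faulty modules (1).
y_true = [0] * 90 + [1] * 10
# Hypothetical model: one false alarm, and only 2 of the 10 faulty modules found.
y_pred = [0] * 89 + [1] + [1] * 2 + [0] * 8

metrics = fault_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In this toy case accuracy is high because the majority class dominates, while recall and MCC stay low because most faulty modules are missed. This is the false-negative problem the abstract describes and one reason imbalance-aware metrics are reported.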

Availability of Data and Materials

The source code of these projects from Apache is available at https://github.com/ApoorvaKrisna/NASA-promise-dataset-repository?tab=readme-ov-file

Code Availability

The source code is available on GitHub at https://github.com/htmphuong/GANPaper/tree/main

Acknowledgements

This research is funded by Funds for Science and Technology Development of the University of Danang under project number B2022-DN07-02.

Author information

Authors and Affiliations

The University of Danang, Vietnam - Korea University of Information and Communication Technology, 55000, Da Nang, Vietnam
Ha Thi Minh Phuong, Pham Vu Thu Nguyet, Nguyen Huu Nhat Minh & Nguyen Thanh Binh
The University of Danang, University of Science and Technology, 55000, Da Nang, Vietnam
Le Thi My Hanh

Contributions

All authors discussed the results and contributed to the final manuscript.

Corresponding author

Correspondence to Nguyen Thanh Binh.

Ethics declarations

Conflict of Interest/Competing Interests

There are no conflicts of interest regarding the publication of this paper.

Ethics Approval

Not Applicable

Consent to Participate

Not Applicable

Consent for Publication

I hereby provide consent for the publication of the manuscript.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Thi Minh Phuong, H., Vu Thu Nguyet, P., Huu Nhat Minh, N. et al. A comparative study of handling imbalanced data using generative adversarial networks for machine learning based software fault prediction. Appl Intell 55, 280 (2025). https://doi.org/10.1007/s10489-024-05930-z

Accepted: 16 October 2024
Published: 08 January 2025
DOI: https://doi.org/10.1007/s10489-024-05930-z
