Abstract
Test case prioritization is important in regression testing because it improves testing efficiency by ordering test cases so that errors are caught as early as possible. Traditional test case prioritization methods rank test cases using factors such as code coverage, change information, and historical data. Implicit testing can uncover hidden dependencies and user behaviours, which motivates the use of natural language processing to identify implicit test cases. This chapter proposes a novel approach that includes implicit test cases in test case prioritization, using natural language processing techniques for feature extraction and classification. Test case descriptions are analysed to identify implicit test cases, which can then be prioritized alongside explicit test cases. Feature selection is implemented using term frequency–inverse document frequency (TF–IDF) scores, and a multinomial naive Bayes (MNB) classifier is trained to predict labels based on the selected features. The trained model achieves an accuracy of 92%.
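The classification step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the six test case descriptions, the "explicit"/"implicit" labels, the whitespace tokenizer, the smoothed IDF formula, and the Laplace smoothing value are all assumptions made for illustration. The sketch also weights every training term by its TF–IDF score rather than selecting a subset of terms, and it uses only the Python standard library so that it runs anywhere. In practice, a library such as scikit-learn (`TfidfVectorizer` with `MultinomialNB`) would be the more usual choice.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: lowercase whitespace tokenization, no stemming or stop words.
    return text.lower().split()


def fit_idf(docs):
    # Smoothed inverse document frequency: ln((1 + n) / (1 + df)) + 1.
    n = len(docs)
    df = Counter(term for doc in docs for term in set(tokenize(doc)))
    return {term: math.log((1 + n) / (1 + count)) + 1.0 for term, count in df.items()}


def tfidf(doc, idf):
    # TF-IDF weights for one document; terms unseen in training are dropped.
    tf = Counter(tokenize(doc))
    return {term: count * idf[term] for term, count in tf.items() if term in idf}


def train_mnb(docs, labels, idf, alpha=1.0):
    # Multinomial naive Bayes with TF-IDF weights used as fractional counts.
    class_counts = Counter(labels)
    weights = defaultdict(lambda: defaultdict(float))
    for doc, label in zip(docs, labels):
        for term, w in tfidf(doc, idf).items():
            weights[label][term] += w
    vocab = list(idf)
    model = {}
    for label, n_docs in class_counts.items():
        total = sum(weights[label].values()) + alpha * len(vocab)
        log_prior = math.log(n_docs / len(docs))
        log_like = {t: math.log((weights[label][t] + alpha) / total) for t in vocab}
        model[label] = (log_prior, log_like)
    return model


def predict(doc, idf, model):
    # Return the label with the highest posterior log score.
    feats = tfidf(doc, idf)

    def score(label):
        log_prior, log_like = model[label]
        return log_prior + sum(w * log_like[t] for t, w in feats.items())

    return max(model, key=score)


# Hypothetical test case descriptions and labels, invented for this sketch.
docs = [
    "verify login with valid username and password",
    "verify error message for invalid password",
    "verify user can add item to cart",
    "check session expires after long inactivity",
    "check behaviour when network drops during payment",
    "check back button after logout does not restore session",
]
labels = ["explicit", "explicit", "explicit", "implicit", "implicit", "implicit"]

idf = fit_idf(docs)
model = train_mnb(docs, labels, idf)
print(predict("check session after network drops", idf, model))
```

Test cases labelled as implicit by such a classifier could then be passed to the prioritization step together with the explicit ones. A toy dataset of this size says nothing about accuracy; the 92% figure reported in the abstract refers to the authors' own dataset and model.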
Acknowledgements
We would like to acknowledge Nkosikhona J Dube, Nolwazi Ncube, Ayanda Ncube, Caroline Mhlanga, and Nonhlanhla Mthethwa, who are software developers, for helping with the formulation of implicit test cases that were used to improve the dataset.
Copyright information
© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Ndlovu, S., Mnkandla, E. (2025). Implicit Test Case Identification/Selection for Test Case Prioritization Using Natural Language Processing. In: Woungang, I., Dhurandher, S.K. (eds) The 7th International Conference on Wireless, Intelligent and Distributed Environment for Communication. WIDECOM 2023. Lecture Notes on Data Engineering and Communications Technologies, vol 237. Springer, Cham. https://doi.org/10.1007/978-3-031-80817-3_1
DOI: https://doi.org/10.1007/978-3-031-80817-3_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-80816-6
Online ISBN: 978-3-031-80817-3
eBook Packages: Engineering, Engineering (R0)