Implicit Test Case Identification/Selection for Test Case Prioritization Using Natural Language Processing

Ndlovu, Siqabukile; Mnkandla, Ernest

doi:10.1007/978-3-031-80817-3_1


Part of the book series: Lecture Notes on Data Engineering and Communications Technologies (LNDECT, volume 237)

Included in the following conference series:

International Conference on Wireless Intelligent and Distributed Environment for Communication

Abstract

Test case prioritization is important in regression testing as it enhances testing efficiency by arranging test cases to catch errors quickly. Traditional test case prioritization methods use factors such as code coverage, change information, and historical data to prioritize test cases. Implicit testing can uncover hidden dependencies and user behaviours, which motivates exploring natural language processing for test case identification. This chapter proposes a novel approach that includes implicit test cases in test case prioritization, using natural language processing techniques for feature extraction and classification. Natural language processing is used to analyse test case descriptions to identify implicit test cases, which can then be prioritized alongside explicit test cases. Feature selection is implemented using term frequency–inverse document frequency (TF–IDF) scores, and a multinomial naive Bayes (MNB) classifier is trained to predict labels based on the selected features. The trained model has an accuracy of 92%.
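The pipeline named in the abstract (TF–IDF features feeding a multinomial naive Bayes classifier) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state the library, the smoothing, the label names, or the dataset. The smoothed IDF formula, the Laplace smoothing value, the "explicit"/"implicit" labels, and the six toy test case descriptions below are all assumptions made for illustration, and the sketch makes no attempt to reproduce the reported accuracy.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def fit_idf(docs):
    """Learn smoothed IDF weights, ln((1 + n) / (1 + df)) + 1, from tokenized docs."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}


def tfidf_vector(doc, idf):
    """Map a tokenized doc to {term: tf * idf}; unseen terms are dropped."""
    counts = Counter(t for t in doc if t in idf)
    return {t: c * idf[t] for t, c in counts.items()}


def train_mnb(vectors, labels, vocab, alpha=1.0):
    """Multinomial naive Bayes over TF-IDF weights with Laplace smoothing alpha."""
    class_docs = Counter(labels)
    weight = {c: defaultdict(float) for c in class_docs}
    for vec, label in zip(vectors, labels):
        for t, w in vec.items():
            weight[label][t] += w
    model = {}
    for c in class_docs:
        total = sum(weight[c].values()) + alpha * len(vocab)
        log_prior = math.log(class_docs[c] / len(labels))
        log_like = {t: math.log((weight[c][t] + alpha) / total) for t in vocab}
        model[c] = (log_prior, log_like)
    return model


def predict(model, vec):
    """Return the class with the highest log posterior score."""
    def score(c):
        log_prior, log_like = model[c]
        return log_prior + sum(w * log_like[t] for t, w in vec.items())
    return max(model, key=score)


# Hypothetical toy data: descriptions and labels are invented for this sketch.
TRAIN = [
    ("verify login with valid username and password", "explicit"),
    ("verify error message for invalid password", "explicit"),
    ("verify user can reset password by email", "explicit"),
    ("session should expire when user is idle", "implicit"),
    ("page should stay responsive on slow network", "implicit"),
    ("back button should not resubmit the form", "implicit"),
]

docs = [tokenize(text) for text, _ in TRAIN]
labels = [label for _, label in TRAIN]
idf = fit_idf(docs)
model = train_mnb([tfidf_vector(d, idf) for d in docs], labels, vocab=set(idf))


def classify(description):
    """Classify a free-text test case description with the toy model."""
    return predict(model, tfidf_vector(tokenize(description), idf))


if __name__ == "__main__":
    print(classify("verify login with invalid username"))
    print(classify("app should not crash when network is slow"))
```

In this sketch a description is labelled by whichever class assigns higher smoothed likelihood to its TF–IDF-weighted terms. Descriptions labelled "implicit" could then be merged into the pool of test cases to be ranked, which is the step the abstract describes as prioritizing implicit test cases alongside explicit ones.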

Acknowledgements

We would like to acknowledge Nkosikhona J Dube, Nolwazi Ncube, Ayanda Ncube, Caroline Mhlanga, and Nonhlanhla Mthethwa, who are software developers, for helping with the formulation of implicit test cases that were used to improve the dataset.

Author information

Authors and Affiliations

National University of Science and Technology, Bulawayo, Zimbabwe
Siqabukile Ndlovu
University of South Africa, FL, Johannesburg, South Africa
Ernest Mnkandla

Corresponding author

Correspondence to Siqabukile Ndlovu.

Editor information

Editors and Affiliations

Toronto Metropolitan University, Toronto, ON, Canada
Isaac Woungang
Department of Information Technology, Netaji Subhas University of Technology, New Delhi, Delhi, India
Sanjay Kumar Dhurandher

About this paper

Cite this paper

Ndlovu, S., Mnkandla, E. (2025). Implicit Test Case Identification/Selection for Test Case Prioritization Using Natural Language Processing. In: Woungang, I., Dhurandher, S.K. (eds) The 7th International Conference on Wireless, Intelligent and Distributed Environment for Communication. WIDECOM 2023. Lecture Notes on Data Engineering and Communications Technologies, vol 237. Springer, Cham. https://doi.org/10.1007/978-3-031-80817-3_1

DOI: https://doi.org/10.1007/978-3-031-80817-3_1
Published: 26 February 2025
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-80816-6
Online ISBN: 978-3-031-80817-3
eBook Packages: Engineering, Engineering (R0)