Implicit Test Case Identification/Selection for Test Case Prioritization Using Natural Language Processing

  • Conference paper
The 7th International Conference on Wireless, Intelligent and Distributed Environment for Communication (WIDECOM 2023)

Abstract

Test case prioritization is important in regression testing as it enhances testing efficiency by arranging test cases to catch errors quickly. Traditional test case prioritization methods use factors such as code coverage, change information, and historical data to prioritize test cases. Implicit testing can be used to uncover hidden dependencies and user behaviours, leading to the exploration of natural language processing for test case identification. This chapter proposes a novel approach that includes implicit test cases for test case prioritization using natural language processing techniques for feature extraction and classification. Natural language processing is used to analyse test case descriptions to identify implicit test cases, which can then be prioritized alongside explicit test cases. Feature selection is implemented using term frequency–inverse document frequency (TF–IDF) scores, and a multinomial naive Bayes (MNB) classifier is trained to predict labels based on the selected features. Our trained model has an accuracy of 92%.
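
The pipeline named in the abstract (TF–IDF features feeding a multinomial naive Bayes classifier) can be illustrated with a minimal sketch. This is not the authors' implementation or dataset: the helper names (`tokenize`, `fit_idf`, `to_tfidf`, `TinyMultinomialNB`, `classify`), the six toy test case descriptions, and their "explicit"/"implicit" labels are all assumptions made for illustration, and the sketch weights every term by TF–IDF rather than selecting a subset of features. It uses only the Python standard library and the smoothed IDF form ln((1 + n) / (1 + df)) + 1.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase whitespace tokenizer (a simplifying assumption)."""
    return text.lower().split()


def fit_idf(docs):
    """Smoothed inverse document frequency for every term in the corpus."""
    n = len(docs)
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(tokenize(doc)))
    return {t: math.log((1 + n) / (1 + df)) + 1.0 for t, df in doc_freq.items()}


def to_tfidf(doc, idf):
    """Map a description to {term: tf * idf}; unseen terms are dropped."""
    counts = Counter(t for t in tokenize(doc) if t in idf)
    return {t: c * idf[t] for t, c in counts.items()}


class TinyMultinomialNB:
    """Multinomial naive Bayes over weighted term vectors, Laplace-smoothed."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, vectors, labels, vocab):
        self.classes = sorted(set(labels))
        n = len(labels)
        self.log_prior = {c: math.log(labels.count(c) / n) for c in self.classes}
        # Sum the TF-IDF mass of each term per class.
        totals = {c: defaultdict(float) for c in self.classes}
        for vec, label in zip(vectors, labels):
            for term, weight in vec.items():
                totals[label][term] += weight
        self.log_lik = {}
        for c in self.classes:
            denom = sum(totals[c].values()) + self.alpha * len(vocab)
            self.log_lik[c] = {
                t: math.log((totals[c][t] + self.alpha) / denom) for t in vocab
            }
        return self

    def predict(self, vec):
        scores = {
            c: self.log_prior[c]
            + sum(w * self.log_lik[c][t] for t, w in vec.items())
            for c in self.classes
        }
        return max(scores, key=scores.get)


# Hypothetical training data: NOT taken from the paper.
train_docs = [
    "verify login with valid username and password",
    "verify checkout total matches cart items",
    "verify search returns matching products",
    "app stays responsive when network drops during login",
    "session remains secure after user leaves device idle",
    "page stays responsive when user double clicks submit",
]
train_labels = ["explicit"] * 3 + ["implicit"] * 3

idf = fit_idf(train_docs)
train_vectors = [to_tfidf(d, idf) for d in train_docs]
model = TinyMultinomialNB().fit(train_vectors, train_labels, vocab=list(idf))


def classify(description):
    """Predict 'explicit' or 'implicit' for a new test case description."""
    return model.predict(to_tfidf(description, idf))


print(classify("app stays responsive when user rotates device"))
print(classify("verify login with valid password"))
```

In practice a library implementation and a labelled corpus of real test case descriptions would replace the toy data; with six invented examples, the sketch shows only the mechanics and says nothing about the reported 92% accuracy.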

Acknowledgements

We would like to acknowledge Nkosikhona J Dube, Nolwazi Ncube, Ayanda Ncube, Caroline Mhlanga, and Nonhlanhla Mthethwa, who are software developers, for helping with the formulation of implicit test cases that were used to improve the dataset.

Author information

Corresponding author

Correspondence to Siqabukile Ndlovu.

Copyright information

© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Ndlovu, S., Mnkandla, E. (2025). Implicit Test Case Identification/Selection for Test Case Prioritization Using Natural Language Processing. In: Woungang, I., Dhurandher, S.K. (eds) The 7th International Conference on Wireless, Intelligent and Distributed Environment for Communication. WIDECOM 2023. Lecture Notes on Data Engineering and Communications Technologies, vol 237. Springer, Cham. https://doi.org/10.1007/978-3-031-80817-3_1

  • DOI: https://doi.org/10.1007/978-3-031-80817-3_1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-80816-6

  • Online ISBN: 978-3-031-80817-3

  • eBook Packages: Engineering, Engineering (R0)
