Static test case prioritization using topic models

Empirical Software Engineering

Abstract

Software development teams use test suites to test changes to their source code. In many situations, the test suites are so large that executing every test for every source code change is infeasible, due to time and resource constraints. Development teams need to prioritize their test suite so that as many distinct faults as possible are detected early in the execution of the test suite. We consider the problem of static black-box test case prioritization (TCP), where test suites are prioritized without the availability of the source code of the system under test (SUT). We propose a new static black-box TCP technique that represents test cases using a previously unused data source in the test suite: the linguistic data of the test cases, i.e., their identifier names, comments, and string literals. Our technique applies a text analysis algorithm called topic modeling to the linguistic data to approximate the functionality of each test case, allowing our technique to give high priority to test cases that test different functionalities of the SUT. We compare our proposed technique with existing static black-box TCP techniques in a case study of multiple real-world open-source systems: several versions of Apache Ant and Apache Derby. We find that our static black-box TCP technique outperforms existing static black-box TCP techniques, and performs comparably to or better than two existing execution-based TCP techniques. Static black-box TCP methods are widely applicable because the only input they require is the source code of the test cases themselves. This contrasts with other TCP techniques, which require access to the SUT runtime behavior, to the SUT specification models, or to the SUT source code.
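The core idea of giving high priority to test cases that exercise different functionalities can be illustrated with a minimal sketch. It assumes that each test case has already been reduced to a topic-membership vector (for example, by running a topic model over its identifier names, comments, and string literals). The test names, the vectors, the Manhattan distance, and the greedy farthest-first ordering are all illustrative assumptions, not necessarily the exact algorithm evaluated in the article.

```python
from typing import Dict, List


def manhattan(p: List[float], q: List[float]) -> float:
    """Manhattan (L1) distance between two topic-membership vectors."""
    return sum(abs(a - b) for a, b in zip(p, q))


def prioritize(topics: Dict[str, List[float]]) -> List[str]:
    """Order test cases so that dissimilar ones (by topic) run early.

    Assumption: a greedy farthest-first strategy stands in for the
    prioritization step; it is a sketch, not the article's algorithm.
    """
    remaining = sorted(topics)  # sorted names keep the result deterministic
    if not remaining:
        return []
    # Seed with the test whose total distance to all other tests is largest.
    first = max(
        remaining,
        key=lambda t: sum(manhattan(topics[t], topics[u]) for u in remaining),
    )
    ordered = [first]
    remaining.remove(first)
    while remaining:
        # Next, take the test whose nearest already-ordered test is farthest away.
        nxt = max(
            remaining,
            key=lambda t: min(manhattan(topics[t], topics[o]) for o in ordered),
        )
        ordered.append(nxt)
        remaining.remove(nxt)
    return ordered


# Hypothetical topic memberships for four test cases over three topics.
example = {
    "testCopyFile": [0.75, 0.25, 0.0],
    "testCopyDir": [0.5, 0.25, 0.25],
    "testSqlInsert": [0.0, 0.0, 1.0],
    "testZipArchive": [0.0, 0.75, 0.25],
}
order = prioritize(example)
print(order)
```

In this toy input the two copy-related tests have similar topic vectors, so the ordering places them apart rather than back to back, which is the behavior the abstract describes.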

Author information

Corresponding author

Correspondence to Stephen W. Thomas.

Additional information

Editor: Gregg Rothermel

About this article

Cite this article

Thomas, S.W., Hemmati, H., Hassan, A.E. et al. Static test case prioritization using topic models. Empir Software Eng 19, 182–212 (2014). https://doi.org/10.1007/s10664-012-9219-7

  • DOI: https://doi.org/10.1007/s10664-012-9219-7
