Abstract
Random testing scales well but often misses corner-case program behaviors, whereas systematic testing (e.g., concolic execution) can reach such behaviors but does not scale to exploring all program behaviors. Prior attempts to integrate random testing with systematic testing lack targeted guidance. In this paper, we propose Baton, a guided hybrid testing approach that synergizes random testing with concolic testing. Baton integrates the knowledge contained in test cases and their executions into a conditional execution graph and uses this knowledge to guide test case generation. Specifically, it learns classification models for some conditionals in the conditional execution graph in a demand-driven way. These models guide random testing to reach and cover partially covered conditionals. Baton then applies targeted concolic testing to the conditionals that guided random testing cannot fully cover. We implemented Baton for Java and evaluated it on three benchmarks. The results show that Baton improved branch coverage and mutation score, respectively, by 16.2%–29.4% and 19.0%–30.0% over random testing, by 16.8%–33.8% and 19.4%–34.2% over adaptive random testing, by 2.3%–29.9% and 2.9%–30.1% over concolic testing, and by 1.6%–14.5% and 1.4%–18.7% over simple hybrid testing.
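The workflow summarized in the abstract can be illustrated with a toy sketch. It is not Baton's implementation, and every name in it (GuidedHybridSketch, programUnderTest, learnThreshold, guidedRandom, searchCorner) is hypothetical. It assumes a single integer input, uses a one-feature threshold classifier in place of the learned classification models, and uses a bounded exhaustive search in place of targeted concolic testing. The conditional execution graph and taint analysis are omitted entirely.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.TreeSet;

// Toy illustration only; all names here are hypothetical, not taken from Baton.
class GuidedHybridSketch {
    static final int INPUT_BOUND = 1000; // assumed input domain: [0, 1000)

    // Toy program under test; the return value identifies the branch outcome.
    public static int programUnderTest(int x) {
        if (x > 900) {          // guard that random inputs rarely satisfy
            if (x == 977) {     // corner case that random inputs almost never hit
                return 2;
            }
            return 1;
        }
        return 0;
    }

    // Learns a threshold classifier from executed inputs:
    // the model predicts "guard taken" when x > threshold.
    public static int learnThreshold(List<Integer> executedInputs) {
        int maxNeg = Integer.MIN_VALUE;
        int minPos = Integer.MAX_VALUE;
        for (int x : executedInputs) {
            if (programUnderTest(x) != 0) {
                minPos = Math.min(minPos, x);
            } else {
                maxNeg = Math.max(maxNeg, x);
            }
        }
        if (minPos == Integer.MAX_VALUE) return maxNeg;     // no positive example seen
        if (maxNeg == Integer.MIN_VALUE) return minPos - 1; // no negative example seen
        return maxNeg + (minPos - maxNeg) / 2;              // midpoint between the classes
    }

    // Guided random testing: keep only candidates the model predicts will pass the guard.
    public static List<Integer> guidedRandom(int threshold, int count, Random rng) {
        List<Integer> kept = new ArrayList<>();
        for (int attempts = 0; kept.size() < count && attempts < 100_000; attempts++) {
            int candidate = rng.nextInt(INPUT_BOUND);
            if (candidate > threshold) {
                kept.add(candidate);
            }
        }
        return kept;
    }

    // Stand-in for targeted concolic testing: exhaustively search the region
    // the model points to. Returns -1 if the corner case is not found there.
    public static int searchCorner(int threshold) {
        for (int x = Math.max(threshold + 1, 0); x < INPUT_BOUND; x++) {
            if (programUnderTest(x) == 2) {
                return x;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        Random rng = new Random(42L);
        TreeSet<Integer> covered = new TreeSet<>();
        List<Integer> executed = new ArrayList<>();

        // Phase 1: plain random testing, recording every executed input.
        for (int i = 0; i < 50; i++) {
            int x = rng.nextInt(INPUT_BOUND);
            executed.add(x);
            covered.add(programUnderTest(x));
        }

        // Phase 2: learn a model from the recorded executions and use it to guide generation.
        int threshold = learnThreshold(executed);
        for (int x : guidedRandom(threshold, 50, rng)) {
            covered.add(programUnderTest(x));
        }

        // Phase 3: fall back to the targeted search only if the corner case is still uncovered.
        if (!covered.contains(2)) {
            int x = searchCorner(threshold);
            if (x >= 0) {
                covered.add(programUnderTest(x));
            }
        }

        System.out.println("learned threshold: " + threshold);
        System.out.println("covered outcomes: " + covered);
    }
}
```

The ordering mirrors the abstract: the cheap random phase supplies training data, the learned model narrows where new inputs are drawn from, and the expensive targeted step runs only for what remains uncovered. In this toy the targeted search can miss the corner case if the learned threshold lies above it, which is why it reports -1 rather than assuming success.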
Acknowledgements
This work was supported by National Natural Science Foundation of China (Grant No. 61802067).
Cite this article
Chen, B., Liu, Y., Peng, X. et al. Baton: symphony of random testing and concolic testing through machine learning and taint analysis. Sci. China Inf. Sci. 66, 132101 (2023). https://doi.org/10.1007/s11432-020-3403-2
DOI: https://doi.org/10.1007/s11432-020-3403-2