
Baton: symphony of random testing and concolic testing through machine learning and taint analysis

Research Paper | Published in Science China Information Sciences

Abstract

Random testing is scalable but often fails to exercise corner program behaviors, while systematic testing (e.g., concolic execution) promises to cover corner behaviors but does not scale to explore all program behaviors. Prior attempts to integrate random testing with systematic testing lack targeted guidance. In this paper, we propose a guided hybrid testing approach, named Baton, to synergize random testing with concolic testing. It encodes the knowledge embedded in test cases and their executions into a conditional execution graph, and uses this knowledge to guide test case generation. Specifically, we learn classification models for some of the conditionals in the conditional execution graph in a demand-driven way. These models guide random testing to reach and cover partially-covered conditionals. We further employ targeted concolic testing to cover conditionals that guided random testing cannot fully cover. We implemented Baton for Java and evaluated it on three benchmarks. The results show that Baton improved branch coverage and mutation score over random testing by 16.2%–29.4% and 19.0%–30.0%, over adaptive random testing by 16.8%–33.8% and 19.4%–34.2%, over concolic testing by 2.3%–29.9% and 2.9%–30.1%, and over simple hybrid testing by 1.6%–14.5% and 1.4%–18.7%.
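The three-stage pipeline the abstract describes (random testing to gather executions, learned models to guide input generation toward a partially-covered conditional, and a targeted concolic fallback) can be illustrated with a minimal sketch. Everything below is hypothetical: the toy branch predicate, the one-feature threshold rule standing in for Baton's classification models, and the direct constraint solving standing in for its concolic step are illustrative assumptions, not the paper's actual implementation (which targets Java).

```python
import random

# Toy program under test: a "corner" branch that plain random testing
# covers only occasionally (hypothetical stand-in for a real conditional).
def takes_branch(x, y):
    return 3 * x + y > 5000

def sample(rng):
    return rng.randint(0, 2000), rng.randint(0, 200)

def feature(x, y):
    return 3 * x + y

# Stage 1: random testing gathers executions to use as training data.
rng = random.Random(42)
executions = [sample(rng) for _ in range(300)]

# Stage 2: demand-driven model learning for the partially-covered
# conditional. A one-feature threshold rule stands in for the
# classification models Baton would train.
pos = [feature(x, y) for x, y in executions if takes_branch(x, y)]
neg = [feature(x, y) for x, y in executions if not takes_branch(x, y)]
threshold = min(pos) if pos else max(neg)  # predicted "branch taken" boundary

# Guided random testing: generate candidates, but keep only those the
# model predicts will reach and cover the target branch.
covering = None
for _ in range(2000):
    x, y = sample(rng)
    if feature(x, y) >= threshold and takes_branch(x, y):
        covering = (x, y)
        break

# Stage 3: targeted concolic fallback. If guided random testing still
# misses the branch, solve its condition directly (a real system would
# invoke a constraint solver on the collected path condition).
if covering is None:
    covering = (2000, 200)  # satisfies 3*x + y > 5000 by construction

assert takes_branch(*covering)
print("branch covered by input:", covering)
```

The key design point the sketch mirrors is that the model only filters candidates; it never has to be exact, because any conditional the guided stage still misses falls through to the concolic stage.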



Acknowledgements

This work was supported by National Natural Science Foundation of China (Grant No. 61802067).

Author information

Correspondence to Bihuan Chen.


Cite this article

Chen, B., Liu, Y., Peng, X. et al. Baton: symphony of random testing and concolic testing through machine learning and taint analysis. Sci. China Inf. Sci. 66, 132101 (2023). https://doi.org/10.1007/s11432-020-3403-2
