
Baton: symphony of random testing and concolic testing through machine learning and taint analysis

Research Paper | Published in Science China Information Sciences

Abstract

Random testing is scalable but often fails to exercise corner program behaviors, while systematic testing (e.g., concolic execution) promises to cover corner behaviors but does not scale to explore all program behaviors. Prior attempts to integrate random testing with systematic testing lack targeted guidance. In this paper, we propose a guided hybrid testing approach, named Baton, to synergize random testing with concolic testing. It encodes the knowledge embedded in test cases and their executions into a conditional execution graph, and uses this knowledge to guide test case generation. Specifically, we learn classification models for some of the conditionals in the conditional execution graph in a demand-driven way. These models guide random testing to reach and cover partially-covered conditionals. We further employ targeted concolic testing to cover conditionals that guided random testing cannot fully cover. We implemented Baton for Java and evaluated it on three benchmarks. The results show that Baton improved branch coverage and mutation score over random testing by 16.2%–29.4% and 19.0%–30.0%, over adaptive random testing by 16.8%–33.8% and 19.4%–34.2%, over concolic testing by 2.3%–29.9% and 2.9%–30.1%, and over simple hybrid testing by 1.6%–14.5% and 1.4%–18.7%.
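The three-stage pipeline the abstract describes (random testing to gather executions, learned models to guide input generation toward a partially-covered conditional, and a targeted concolic fallback) can be illustrated with a minimal sketch. Everything below is hypothetical: the toy branch predicate, the one-feature threshold rule standing in for Baton's classification models, and the direct constraint solving standing in for its concolic step are illustrative assumptions, not the paper's actual implementation (which targets Java).

```python
import random

# Toy program under test: a "corner" branch that plain random testing
# covers only occasionally (hypothetical stand-in for a real conditional).
def takes_branch(x, y):
    return 3 * x + y > 5000

def sample(rng):
    return rng.randint(0, 2000), rng.randint(0, 200)

def feature(x, y):
    return 3 * x + y

# Stage 1: random testing gathers executions to use as training data.
rng = random.Random(42)
executions = [sample(rng) for _ in range(300)]

# Stage 2: demand-driven model learning for the partially-covered
# conditional. A one-feature threshold rule stands in for the
# classification models Baton would train.
pos = [feature(x, y) for x, y in executions if takes_branch(x, y)]
neg = [feature(x, y) for x, y in executions if not takes_branch(x, y)]
threshold = min(pos) if pos else max(neg)  # predicted "branch taken" boundary

# Guided random testing: generate candidates, but keep only those the
# model predicts will reach and cover the target branch.
covering = None
for _ in range(2000):
    x, y = sample(rng)
    if feature(x, y) >= threshold and takes_branch(x, y):
        covering = (x, y)
        break

# Stage 3: targeted concolic fallback. If guided random testing still
# misses the branch, solve its condition directly (a real system would
# invoke a constraint solver on the collected path condition).
if covering is None:
    covering = (2000, 200)  # satisfies 3*x + y > 5000 by construction

assert takes_branch(*covering)
print("branch covered by input:", covering)
```

The key design point the sketch mirrors is that the model only filters candidates; it never has to be exact, because any conditional the guided stage still misses falls through to the concolic stage.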



Acknowledgements

This work was supported by National Natural Science Foundation of China (Grant No. 61802067).

Author information

Correspondence to Bihuan Chen.


Cite this article

Chen, B., Liu, Y., Peng, X. et al. Baton: symphony of random testing and concolic testing through machine learning and taint analysis. Sci. China Inf. Sci. 66, 132101 (2023). https://doi.org/10.1007/s11432-020-3403-2
