Abstract
A fuzzer feeds randomly generated inputs to a target program to expose erroneous behavior. To detect defects efficiently, generated inputs should conform to the structure of the input format, so grammars can be used to generate syntactically correct inputs. In this context, fuzzing can be guided by probabilities attached to competing rules in the grammar, which leads to the idea of probabilistic grammar-based fuzzing. However, how to assign probabilities to individual grammar rules so that erroneous behavior is exposed effectively for a given system under test remains an open research question. In this paper, we present EvoGFuzz, an evolutionary grammar-based fuzzing approach that optimizes these probabilities to generate test inputs that are more likely to trigger exceptional behavior. The evaluation shows the effectiveness of EvoGFuzz in detecting defects compared to probabilistic grammar-based fuzzing (the baseline). Applied to ten real-world applications with common input formats (JSON, JavaScript, or CSS3), EvoGFuzz achieved a significantly higher median line coverage than the baseline for all subjects, by up to 48%. Moreover, EvoGFuzz exposed 11 unique defects, five of which were not detected by the baseline.
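To make the idea of probabilities attached to competing grammar rules concrete, the following is a minimal sketch and not the EvoGFuzz implementation. The toy grammar, its probabilities, and the helper names `generate` and `mutate` are assumptions made for illustration only. `generate` samples an input from a probabilistic grammar, and `mutate` re-draws the probabilities of one nonterminal, which is one simple way an evolutionary search could vary a candidate.

```python
import random

# Hypothetical toy grammar (JSON-like subset). Each nonterminal maps to a list
# of (expansion, probability) pairs; probabilities per nonterminal sum to 1.
GRAMMAR = {
    "<value>": [(["<number>"], 0.5), (["<array>"], 0.3), (["true"], 0.2)],
    "<array>": [(["[", "]"], 0.4), (["[", "<elements>", "]"], 0.6)],
    "<elements>": [(["<value>"], 0.7), (["<value>", ",", "<elements>"], 0.3)],
    "<number>": [(["<digit>"], 0.6), (["<digit>", "<number>"], 0.4)],
    "<digit>": [([str(d)], 0.1) for d in range(10)],
}


def generate(grammar, symbol, rng, depth=0, max_depth=20):
    """Expand `symbol` into a string, choosing alternatives by probability."""
    if symbol not in grammar:
        return symbol  # terminal symbol: emit as-is
    alternatives = grammar[symbol]
    if depth >= max_depth:
        # Past the depth limit, always take the shortest alternative so that
        # the recursion is guaranteed to terminate for this grammar.
        expansion = min(alternatives, key=lambda alt: len(alt[0]))[0]
    else:
        expansions = [alt[0] for alt in alternatives]
        weights = [alt[1] for alt in alternatives]
        expansion = rng.choices(expansions, weights=weights, k=1)[0]
    return "".join(
        generate(grammar, part, rng, depth + 1, max_depth) for part in expansion
    )


def mutate(grammar, rng):
    """Return a copy of `grammar` with one nonterminal's probabilities re-drawn.

    Assumption: this is an illustrative mutation operator, not the one
    described in the paper.
    """
    mutated = {nt: list(alts) for nt, alts in grammar.items()}
    target = rng.choice(sorted(mutated))
    raw = [rng.random() + 1e-9 for _ in mutated[target]]
    total = sum(raw)
    mutated[target] = [
        (expansion, weight / total)
        for (expansion, _), weight in zip(mutated[target], raw)
    ]
    return mutated


if __name__ == "__main__":
    rng = random.Random(42)
    print([generate(GRAMMAR, "<value>", rng) for _ in range(5)])
    variant = mutate(GRAMMAR, rng)
    print([generate(variant, "<value>", rng) for _ in range(5)])
```

In an evolutionary setting, each mutated grammar would be scored by the behavior its generated inputs trigger in the system under test, and the better-scoring probability assignments would be kept for the next generation. The scoring function is omitted here because it depends on the subject program.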
Notes
1. Data and code artifacts are available here: https://doi.org/10.5281/zenodo.3961374.
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Eberlein, M., Noller, Y., Vogel, T., Grunske, L. (2020). Evolutionary Grammar-Based Fuzzing. In: Aleti, A., Panichella, A. (eds) Search-Based Software Engineering. SSBSE 2020. Lecture Notes in Computer Science, vol 12420. Springer, Cham. https://doi.org/10.1007/978-3-030-59762-7_8
DOI: https://doi.org/10.1007/978-3-030-59762-7_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-59761-0
Online ISBN: 978-3-030-59762-7
eBook Packages: Computer Science (R0)