skip to main content
research-article

STADS: Software Testing as Species Discovery

Published: 27 June 2018 Publication History

Abstract

A fundamental challenge of software testing is the statistically well-grounded extrapolation from program behaviors observed during testing. For instance, a security researcher who has run the fuzzer for a week has currently no means (1) to estimate the total number of feasible program branches, given that only a fraction has been covered so far; (2) to estimate the additional time required to cover 10% more branches (or to estimate the coverage achieved in one more day, respectively); or (3) to assess the residual risk that a vulnerability exists when no vulnerability has been discovered. Failing to discover a vulnerability does not mean that none exists—even if the fuzzer was run for a week (or a year). Hence, testing provides no formal correctness guarantees.
In this article, I establish an unexpected connection with the otherwise unrelated scientific field of ecology and introduce a statistical framework that models Software Testing and Analysis as Discovery of Species (STADS). For instance, in order to study the species diversity of arthropods in a tropical rain forest, ecologists would first sample a large number of individuals from that forest, determine their species, and extrapolate from the properties observed in the sample to properties of the whole forest. The estimations (1) of the total number of species, (2) of the additional sampling effort required to discover 10% more species, or (3) of the probability to discover a new species are classical problems in ecology. The STADS framework draws from over three decades of research in ecological biostatistics to address the fundamental extrapolation challenge for automated test generation. Our preliminary empirical study demonstrates a good estimator performance even for a fuzzer with adaptive sampling bias—AFL, a state-of-the-art vulnerability detection tool. The STADS framework provides statistical correctness guarantees with quantifiable accuracy.

References

[1]
Gail E. Austen, Markus Bindemann, Richard A. Griffiths, and David L. Roberts. 2016. Species identification by experts and non-experts: Comparing images from field guides. Nature - Scientific Reports 6 (2016), 1--7.
[2]
Greg Banks, Marco Cova, Viktoria Felmetsger, Kevin Almeroth, Richard Kemmerer, and Giovanni Vigna. 2006. SNOOZE: Toward a stateful network protocol fuzZEr. In Proceedings of the 9th International Conference on Information Security (ISC’06). 343--358.
[3]
E. T. Barr, M. Harman, P. McMinn, M. Shahbaz, and S. Yoo. 2015. The oracle problem in software testing: A survey. IEEE Transactions on Software Engineering 41, 5 (May 2015), 507--525.
[4]
Yves Basset, Lukas Cizek, Philippe Cuénoud, Raphael K. Didham, François Guilhaumon, Olivier Missa, Vojtech Novotny, Frode Ødegaard, Tomas Roslin, Jürgen Schmidl, Alexey K. Tishechkin, Neville N. Winchester, David W. Roubik, Henri-Pierre Aberlenc, Johannes Bail, Héctor Barrios, Jon R. Bridle, Gabriela Castaño-Meneses, Bruno Corbara, Gianfranco Curletti, Wesley Duarte da Rocha, Domir De Bakker, Jacques H. C. Delabie, Alain Dejean, Laura L. Fagan, Andreas Floren, Roger L.Kitching, Enrique Medianero, Scott E. Miller, Evandro Gama de Oliveira, Jérôme Orivel, Marc Pollet, Mathieu Rapp, Sérvio P. Ribeiro, Yves Roisin, Jesper B. Schmidt, Line Sørensen, and Maurice Leponce. 2012. Arthropod diversity in a tropical forest. Science 338, 6113 (2012), 1481--1484.
[5]
A. Bertolino. 2007. Software testing research: Achievements, challenges, dreams. In Future of Software Engineering (FOSE’07). 85--103.
[6]
Marcel Böhme, Bruno C. d. S. Oliveira, and Abhik Roychoudhury. 2013. Partition-based regression verification. In Proceedings of the 2013 International Conference on Software Engineering (ICSE’13). 302--311.
[7]
Marcel Böhme, Bruno C. d. S. Oliveira, and Abhik Roychoudhury. 2013. Regression tests to expose change interaction errors. In Proceedings of the 2013 9th Joint Meeting on Foundations of Software Engineering (ESEC/FSE’13). 334--344.
[8]
Marcel Böhme and Soumya Paul. 2016. A probabilistic analysis of the efficiency of automated software testing. IEEE Transactions on Software Engineering 42, 4 (April 2016), 345--360.
[9]
Marcel Böhme, Van-Thuan Pham, Manh-Dung Nguyen, and Abhik Roychoudhury. 2017. Directed greybox fuzzing. In Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security (CCS’17). 2329--2344.
[10]
Marcel Böhme, Van-Thuan Pham, and Abhik Roychoudhury. 2016. Coverage-based greybox fuzzing as Markov chain. In Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security (CCS’16). 1032--1043.
[11]
Zdravko I. Botev and Dirk P. Kroese. 2008. An efficient algorithm for rare-event probability estimation, combinatorial optimization, and counting. Methodology and Computing in Applied Probability 10, 4 (2008), 471--505.
[12]
Ulrich Brose, Neo D. Martinez, and Richard J. Williams. 2003. Estimating species richness: Sensitivity to sample coverage and insensitivity to spatial patterns. Ecology 84, 9 (2003), 2364--2377.
[13]
J. Bunge and M. Fitzpatrick. 1993. Estimating the number of species: A review. Journal of the American Statistical Association 88, 421 (1993), 364--373.
[14]
K. P. Burnham and W. S. Overton. 1979. Robust estimation of population size when capture probabilities vary among animals. Ecology 60, 5 (1979), 927--936.
[15]
Cristian Cadar, Daniel Dunbar, and Dawson Engler. 2008. KLEE: Unassisted and automatic generation of high-coverage tests for complex systems programs. In Proceedings of the 8th USENIX Conference on Operating Systems Design and Implementation (OSDI’08). 209--224.
[16]
Anne Chao. 1984. Nonparametric estimation of the number of classes in a population. Scandinavian Journal of Statistics 11, 4 (1984), 265--270.
[17]
Anne Chao. 1987. Estimating the population size for capture-recapture data with unequal catchability. Biometrics 43, 4 (1987), 783--791.
[18]
Anne Chao and Chun-Huo Chiu. 2014. Species Richness: Estimation and Comparison. John Wiley 8 Sons.
[19]
Anne Chao and Chun-Huo Chiu. 2016. Nonparametric estimation and comparison of species richness. In Encyclopedia of Life Sciences (eLS).
[20]
Anne Chao, Chun-Huo Chiu, Robert K. Colwell, Luiz Fernando S. Magnago, Robin L. Chazdon, and Nicholas J. Gotelli. 2017. Deciphering the enigma of undetected species, phylogenetic, and functional diversity based on good-turing theory. Ecology 98, 11 (2017), 2914--2929.
[21]
Anne Chao and Robert K. Colwell. 2017. Thirty years of progeny from Chao’s inequality: Estimating and comparing richness with incidence data and incomplete sampling. Statistics and Operations Research Transactions 41, 1 (2017), 3--54.
[22]
Anne Chao, Robert K. Colwell, Chih-Wei Lin, and Nicholas J. Gotelli. 2009. Sufficient sampling for asymptotic minimum species richness estimators. Ecology 90, 4 (2009), 1125--1133.
[23]
Anne Chao, T. C. Hsieh, Robin L. Chazdon, Robert K. Colwell, and Nicholas J. Gotelli. 2015. Unveiling the species-rank abundance distribution by generalizing the good-turing sample coverage theory. Ecology 96, 5 (2015), 1189--1201.
[24]
Anne Chao and Lou Jost. 2012. Coverage-based rarefaction and extrapolation: Standardizing samples by completeness rather than size. Ecology 93, 12 (2012), 2533--2547.
[25]
Anne Chao and Shen-Ming Lee. 1992. Estimating the number of classes via sample coverage. Journal of the American Statistical Association 87, 417 (1992), 210--217.
[26]
A. Chao, K. H. Ma, T. C. Hsieh, and C. H. Chiu. 2015. Online Program SpadeR (Species-richness Prediction And Diversity Estimation in R). Program and User’s Guide. Retrieved from http://chao.stat.nthu.edu.tw/wordpress/software_download.
[27]
Anne Chao and Tsung-Jen Shen. 2004. Nonparametric prediction in species sampling. Journal of Agricultural, Biological, and Environmental Statistics 9, 3 (2004), 253--269.
[28]
R. L. Chazdon, R. K. Colwell, J. S. Denslow, and M. R. Guariguata. 1998. Statistical methods for estimating species richness of woody regeneration in primary and secondary rain forests of Northeastern Costa Rica. In Forest Biodiversity Research, Monitoring and Modeling: Conceptual Background and Old World Case Studies. Vol. 20. Man and the Biosphere Series. 285--309.
[29]
Thierry Titcheu Chekam, Mike Papadakis, Yves Le Traon, and Mark Harman. 2017. An empirical study on mutation, statement and branch coverage fault revelation that avoids the unreliable clean program assumption. In Proceedings of the 39th International Conference on Software Engineering (ICSE’17). 597--608.
[30]
Tsong Yueh Chen and Yuen-Tak Yu. 1996. On the expected number of failures detected by subdomain testing and random testing. IEEE Transactions on Software Engineering 22, 2 (1996), 109--119.
[31]
Vitaly Chipounov, Volodymyr Kuznetsov, and George Candea. 2011. S2E: A platform for in-vivo multi-path analysis of software systems. In Proceedings of the 2011 ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS’11). 265--278.
[32]
C. H. Chiu, Y. T. Wang, B. A. Walther, and A. Chao. 2014. An improved nonparametric lower bound of species richness via a modified Good-Turing frequency formula. Biometrics 70, 3 (2014), 671--682.
[33]
Bernard D. Coleman. 1981. On random placement and species-area relations. Mathematical Biosciences 54, 3 (1981), 191--215.
[34]
Robert K. Colwell, Anne Chao, Nicholas J. Gotelli, Shang-Yi Lin, Chang Xuan Mao, Robin L. Chazdon, and John T. Longino. 2012. Models and estimators linking individual-based and sample-based rarefaction, extrapolation and comparison of assemblages. Journal of Plant Ecology 5, 1 (2012), 3.
[35]
Robert K. Colwell and Jonathan A. Coddington. 1994. Estimating terrestrial biodiversity through extrapolation. Philosophical Transactions of the Royal Society of London B: Biological Sciences 345, 1311 (1994), 101--118.
[36]
Robert K. Colwell and Johanna E. Elsensohn. 2014. EstimateS turns 20: Statistical estimation of species richness and shared species from samples, with non-parametric extrapolation. Ecography 37, 6 (2014), 609--613.
[37]
Robert K. Colwell, Chang Xuan Mao, and Jing Chang. 2004. Interpolating, extrapolating, and comparing incidence-based species accumulation curves. Ecology 85, 10 (2004), 2717--2727.
[38]
Edsger W. Dijkstra. 1970. Notes on Structured Programming.
[39]
Joe W. Duran and Simeon C. Ntafos. 1984. An evaluation of random testing. IEEE Transactions on Software Engineering 10, 4 (July 1984), 438--444.
[40]
S. M. Durant, T. Wacher, S. Bashir, R. Woodroffe, P. De Ornellas, C. Ransom, J. Newby, T. Abáigar, M. Abdelgadir, H. El Alqamy, J. Baillie, M. Beddiaf, F. Belbachir, A. Belbachir-Bazi, A. A. Berbash, N. E. Bemadjim, R. Beudels-Jamar, L. Boitani, C. Breitenmoser, M. Cano, P. Chardonnet, B. Collen, W. A. Cornforth, F. Cuzin, P. Gerngross, B. Haddane, M. Hadjeloum, A. Jacobson, A. Jebali, F. Lamarque, D. Mallon, K. Minkowski, S. Monfort, B. Ndoassal, B. Niagate, G. Purchase, S. Samala, A. K. Samna, C. Sillero-Zubiri, A. E. Soultan, M. R. Stanley Price, and N. Pettorelli. 2014. Fiddling in biodiversity hotspots while deserts burn? Collapse of the Sahara’s megafauna. Diversity and Distributions 20, 1 (2014), 114--122.
[41]
Antonio Filieri, Corina S. Păsăreanu, and Willem Visser. 2013. Reliability analysis in symbolic pathfinder. In Proceedings of the 2013 International Conference on Software Engineering (ICSE’13). 622--631.
[42]
R. A. Fisher, A. S. Corbet, and C. B. Williams. 1943. The relation between the number of species and the number of individuals in a random sample of an animal population. Journal of Animal Ecology 12 (1943), 42--58.
[43]
William A. Gale and Geoffrey Sampson. 1995. Good-Turing smoothing without tears. Journal of Quantitative Linguistics 2 (1995), 217--237.
[44]
G. Gay, M. Staats, M. Whalen, and M. P. E. Heimdahl. 2015. The risks of coverage-directed test case generation. IEEE Transactions on Software Engineering 41, 8 (Aug. 2015), 803--819.
[45]
Jaco Geldenhuys, Matthew B. Dwyer, and Willem Visser. 2012. Probabilistic symbolic execution. In Proceedings of the 2012 International Symposium on Software Testing and Analysis (ISSTA’12). 166--176.
[46]
Patrice Godefroid, Adam Kiezun, and Michael Y. Levin. 2008. Grammar-based whitebox fuzzing. In Proceedings of the 29th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI’08). 206--215.
[47]
Patrice Godefroid, Nils Klarlund, and Koushik Sen. 2005. DART: Directed automated random testing. In Proceedings of the 2005 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI’05). 213--223.
[48]
I. J. Good. 2000. Turing’s anticipation of empirical Bayes in connection with the cryptanalysis of the naval enigma. Journal of Statistical Computation and Simulation 66, 2 (2000), 101--111.
[49]
Irving John Good. 1953. The population frequencies of species and the estimation of population parameters. Biometrika 40 (1953), 237--264.
[50]
I. J. Good and G. H. Toulmin. 1956. The number of new species, and the increase in population coverage, when a sample is increased. Biometrika 43, 1/2 (1956), 45--63.
[51]
Steven M. Goodman and Jonathan P. Benstead. 2005. Updated estimates of biotic diversity and endemism for Madagascar. Oryx 39, 1 (2005), 73--77.
[52]
Nicholas J. Gotelli and Anne Chao. 2013. Measuring and estimating species richness, species diversity, and biotic similarity from sampling data. In Encyclopedia of Biodiversity (2nd ed.). Vol. 5. Academic Press.195--211.
[53]
Walter J. Gutjahr. 1999. Partition testing vs. random testing: The influence of uncertainty. IEEE Transactions on Software Engineering 25, 5 (Sept. 1999), 661--674.
[54]
Andrew J. Hamilton, Yves Basset, Kurt K. Benke, Peter S. Grimbacher, Scott E. Miller, Vojtech Novotnỳ, G. Allan Samuelson, Nigel E. Stork, George D. Weiblen, and Jian D.L. Yen. 2010. Quantifying uncertainty in estimation of tropical arthropod species richness. American Naturalist 176, 1 (2010), 90--95.
[55]
Andrew J. Hamilton, Yves Basset, Kurt K. Benke, Peter S. Grimbacher, Scott E. Miller, Vojtech Novotnỳ, G. Allan Samuelson, Nigel E. Stork, George D. Weiblen, and Jian D. L. Yen. 2011. Correction. American Naturalist 177, 4 (2011), 544--545.
[56]
D. Hamlet and R. Taylor. 1990. Partition testing does not inspire confidence {program testing}. IEEE Transactions on Software Engineering 16, 12 (Dec. 1990), 1402--1411.
[57]
Dick Hamlet and Jeff Voas. 1993. Faults on its sleeve: Amplifying software reliability testing. In Proceedings of the 1993 ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA’93). 89--98.
[58]
Mary Jean Harrold. 2000. Testing: A roadmap. In Proceedings of the Conference on the Future of Software Engineering (ICSE’00). 61--72.
[59]
Frank J. Hernandez and David G. Lindquist. 1999. A comparison of two light-trap designs for sampling larval and presettlement juvenile fish above a reef in Onslow Bay, North Carolina. Bulletin of Marine Science 64, 1 (1999), 173--184.
[60]
Joaquin Hortal, Paulo A. V. Borges, and Clara Gaspar. 2006. Evaluating the performance of species richness estimators: Sensitivity to sample grain size. Journal of Animal Ecology 75, 1 (2006), 274--287.
[61]
Matthias Höschele and Andreas Zeller. 2016. Mining input grammars from dynamic taints. In Proceedings of the 31st IEEE/ACM International Conference on Automated Software Engineering (ASE’16). 720--725.
[62]
T. C. Hsieh, K. H. Ma, and Anne Chao. 2016. iNEXT: An R package for rarefaction and extrapolation of species diversity (Hill numbers). Methods in Ecology and Evolution 7, 12 (2016), 1451--1456.
[63]
Stuart H. Hurlbert. 1971. The nonconcept of species diversity: A critique and alternative parameters. Ecology 52, 4 (1971), 577--586.
[64]
Laura Inozemtseva and Reid Holmes. 2014. Coverage is not strongly correlated with test suite effectiveness. In Proceedings of the 36th International Conference on Software Engineering (ICSE’14). 435--445.
[65]
E. T. Jaynes. 2003. Probability Theory: The Logic of Science. Cambridge University Press.
[66]
Y. Jia and M. Harman. 2011. An analysis and survey of the development of mutation testing. IEEE Transactions on Software Engineering 37, 5 (Sept. 2011), 649--678.
[67]
Shen-Ming Lee and Anne Chao. 1994. Estimating population size via sample coverage for closed capture-recapture models. Biometrics 50, 1 (1994), 88--97.
[68]
B. Littlewood and D. Wright. 1997. Some conservative stopping rules for the operational testing of safety critical software. IEEE Transactions on Software Engineering 23, 11 (Nov. 1997), 673--683.
[69]
John T. Longino and Robert K. Colwell. 1997. Biodiversity assessment using structured inventory: Capturing the ant fauna of a tropical rain forest. Ecological Applications 7, 4 (1997), 1263--1277.
[70]
Anne E. Magurran and Brian J. McGill. 2011. Biological Diversity: Frontiers in Measurement and Assessment. Oxford University Press.
[71]
Chang Xuan Mao and Robert K. Colwell. 2005. Estimation of species richness: Mixture models, the role of rare species, and inferential challenges. Ecology 86, 5 (2005), 1143--1153.
[72]
Ke Mao, Mark Harman, and Yue Jia. 2016. Sapienz: Multi-objective automated testing for android applications. In Proceedings of the 25th International Symposium on Software Testing and Analysis (ISSTA’16). 94--105.
[73]
Björn Matthis, Vitalii Avdiienko, Ezekiel Soremekun, Marcel Böhme, and Andreas Zeller. 2017. Detecting information flow by mutating input data. In Proceedings of the 32nd IEEE/ACM International Conference on Automated Software Engineering (ASE’17). 1--11.
[74]
Brian A. Maurer, James H. Brown, and Renee D. Rusler. 1992. The micro and macro in body size evolution. Evolution 46, 4 (1992), 939--953.
[75]
Phil McMinn. 2004. Search-based software test data generation: A survey: Research articles. Journal of Software Testing, Verification and Reliability 14, 2 (June 2004), 105--156.
[76]
P. McMinn. 2011. Search-based software testing: Past, present and future. In Proceedings of the 4th IEEE International Conference on Software Testing, Verification and Validation Workshops (ICSTW’11). 153--163.
[77]
Barton P. Miller, Louis Fredriksen, and Bryan So. 1990. An empirical study of the reliability of UNIX utilities. Communications of the ACM 33, 12 (Dec. 1990), 32--44.
[78]
K. W. Miller, L. J. Morell, R. E. Noonan, S. K. Park, D. M. Nicol, B. W. Murrill, and M. Voas. 1992. Estimating the probability of failure when testing reveals no failures. IEEE Transactions on Software Engineering 18, 1 (Jan. 1992), 33--43.
[79]
Camilo Mora, Derek P. Tittensor, Sina Adl, Alastair G. B. Simpson, and Boris Worm. 2011. How many species are there on earth and in the ocean? PLOS Biology 9, 8 (2011), 1--8.
[80]
Mesrob I. Ohannessian. 2012. On Inference about Rare Events. PhD dissertation. Massachusetts Institute of Technology.
[81]
Alon Orlitsky and Ananda Theertha Suresh. 2015. Competitive distribution estimation: Why is Good-Turing good. In Proceedings of the 28th International Conference on Neural Information Processing Systems (NIPS’15). 2143--2151.
[82]
Alon Orlitsky, Ananda Theertha Suresh, and Yihong Wu. 2016. Optimal prediction of the number of unseen species. Proceedings of the National Academy of Sciences 113, 47 (2016), 13283--13288.
[83]
Michael W. Palmer. 1991. Estimating species richness: The second-order jackknife reconsidered. Ecology 72, 4 (1991), 1512--1513.
[84]
Van-Thuan Pham, Marcel Böhme, and Abhik Roychoudhury. 2016. Model-based whitebox fuzzing for program binaries. In Proceedings of the 2016 31st IEEE/ACM International Conference on Automated Software Engineering (ASE’16). 543--553.
[85]
E. C. Pielou. 1966. Species-diversity and pattern-diversity in the study of ecological succession. Journal of Theoretical Biology 10, 2 (1966), 370--383.
[86]
F. W. Preston. 1948. The commonness, and rarity, of species. Ecology 29, 3 (1948), 254--283.
[87]
Dawei Qi, Hoang D. T. Nguyen, and Abhik Roychoudhury. 2013. Path exploration based on symbolic output. ACM Transactions on Software Engineering and Methodology 22, 4 (Oct. 2013), 32:1--32:41.
[88]
Sushma Reddy and Liliana M. Dávalos. 2003. Geographical sampling bias and its implications for conservation priorities in Africa. Journal of Biogeography 30, 11 (2003), 1719--1727.
[89]
Herbert E. Robbins. 1968. Estimating the total probability of the unobserved outcomes of an experiment. Annals of Mathematical Statistics 39, 1 (1968), 256--257.
[90]
Mark D. Robinson, Davis J. McCarthy, and Gordon K. Smyth. 2010. edgeR: A bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26, 1 (2010), 139--140.
[91]
Konstantin Serebryany, Derek Bruening, Alexander Potapenko, and Dmitry Vyukov. 2012. AddressSanitizer: A fast address sanity checker. In Proceedings of the 2012 USENIX Conference on Annual Technical Conference (USENIX ATC’12). 28--28.
[92]
Tsung-Jen Shen, Anne Chao, and Chih-Feng Lin. 2003. Predicting the number of new species in further taxonomic sampling. Ecology 84, 3 (2003), 798--804.
[93]
Andrew R. Solow and Stephen Polasky. 1999. A quick estimator for taxonomic surveys. Ecology 80, 8 (1999), 2799--2803.
[94]
Evgeniy Stepanov and Konstantin Serebryany. 2015. MemorySanitizer: Fast detector of uninitialized memory use in C++. In Proceedings of the 2015 IEEE/ACM International Symposium on Code Generation and Optimization (CGO’15). 46--55.
[95]
A. B. Wagner, P. Viswanath, and S. R. Kulkarni. 2006. Strong consistency of the good-turing estimator. In Proceedings of the 2006 IEEE International Symposium on Information Theory. 2526--2530.
[96]
Bruno A. Walther and Joslin L. Moore. 2005. The concepts of bias, precision and accuracy, and their use in testing the performance of species richness estimators, with a literature review of estimator performance. Ecography 28, 6 (2005), 815--829.
[97]
Bruno A. Walther and Joslin L. Moore. 2005. The concepts of bias, precision and accuracy, and their use in testing the performance of species richness estimators, with a literature review of estimator performance. Ecography 28, 6 (2005), 815--829.
[98]
Ponemon Institute. 2017. Ponemon Cost of Cyber Crime Study. Retrieved from https://www.accenture.com/us-en/insight-cost-of-cybercrime-2017. Accessed 11-13-2017.
[99]
IEEE Computer Society. 2013. Lockheed Martin Webinar Series: Michael Whalen on the Future of Verification and Validation. Retrieved from https://www.computer.org/cms/Computer.org/webinars/lmco/012413Slides-Whalen.pdf. Accessed: 05-13-2017.
[100]
Michał Zalewski. 2017. AFL: American Fuzzy Lop Fuzzer. Retrieved from http://lcamtuf.coredump.cx/afl/technical_details.txt. Accessed: 05-13-2017.
[101]
Michał Zalewski. 2017. AFL: Pulling Jpegs Out of Thin Air, Michael Zalewski. Retrieved from https://lcamtuf.blogspot.com/2014/11/pulling-jpegs-out-of-thin-air.html. Accessed: 05-13-2017.
[102]
CERT Division. 2017. CERT Triage Tools. Retrieved from https://www.cert.org/vulnerability-analysis/tools/triage.cfm?. Accessed: 05-13-2017.
[103]
The LLVM Compiler Infrastructure Project. 2017. Clang Compiler Documentation. Retrieved from https://clang.llvm.org/docs/index.html. Accessed: 05-13-2017.
[104]
US Defense Advanced Research Projects Agency. 2017. DARPA Cyber Grand Challenge. Retrieved from http://www.darpa.mil/news-events/2016-08-04. Accessed: 05-13-2017.
[105]
Facebook. 2017. Facebook: Mark Harman on Software Engineering at Facebook Scale. Retrieved from https://research.fb.com/mark-harmon-on-software-engineering-at-facebook-scale/. Accessed: 05-13-2017.
[106]
FFMPEG. 2017. FFMPEG: A Complete, Cross-Platform Solution to Record, Convert and Stream Audio and Video. Retrieved from https://www.ffmpeg.org/. Accessed: 05-13-2017.
[107]
GNU. 2017. GCov: Coverage Testing Tool. Retrieved from https://linux.die.net/man/1/gcov. Accessed: 11-13-2017.
[108]
GNU. 2017. GNU GCC Sanitizer Options. Retrieved from https://gcc.gnu.org/onlinedocs/gcc-6.3.0/gcc/Instrumentation-Options.html#index-fsanitize_003daddress-947. Accessed: 05-13-2017.
[109]
Anne Chao, K. H. Ma, and T. C. Hsieh. 2017. iNext Online: Species iNterpolation and EXTrapolation. Retrieved from https://chao.shinyapps.io/iNEXTOnline/. Accessed: 05-13-2017.
[110]
Niels Lohmann. 2017. JSON for Modern C++. Retrieved from https://github.com/nlohmann/json. Accessed: 11-13-2017.
[111]
The LLVM Compiler Infrastructure Project. 2017. LibFuzzer: A Library for Coverage-Guided Fuzz Testing. Retrieved from http://llvm.org/docs/LibFuzzer.html. Accessed: 05-13-2017.
[112]
The Libjpeg-turbo Project. 2017. libjpeg-turbo Is a JPEG Image Codec to Accelerate Baseline JPEG Compression and Decompression. Retrieved from http://libjpeg-turbo.virtualgl.org/. Accessed: 05-13-2017.
[113]
The GNOME Project. 2017. LibXML2: The XML C Parser and Toolkit of Gnome. Retrieved from http://xmlsoft.org/. Accessed: 11-13-2017.
[114]
Microsoft. 2017. Microsoft: Project Springfield. Retrieved from https://www.microsoft.com/Springfield/. Accessed: 05-13-2017.
[115]
Google. 2017. Monkey: Android Random Testing. Retrieved from http://developer.android.com/tools/help/monkey.html. Accessed: 05-13-2017.
[116]
Mozilla. 2017. Mozilla: Fuzzing Firefox with Peach.Retrieved from https://wiki.mozilla.org/Security/Fuzzing/Peach. Accessed: 05-13-2017.
[117]
The OpenSSL Project. 2017. OpenSSL: A Toolkit for the Transport Layer Security (TLS) and Secure Sockets Layer (SSL) Protocols. Retrieved from https://www.openssl.org/. Accessed: 05-13-2017.
[118]
Google. 2017. OSS-Fuzz: Five Months Later. Retrieved from https://testing.googleblog.com/2017/05/oss-fuzz-five-months-later-and.html. Accessed: 05-13-2017.
[119]
Peach Fuzzer. 2017. Peach Fuzzer Platform. Retieved from http://www.peachfuzzer.com/products/peach-platform/. Accessed: 05-13-2017.
[120]
Anne Chao, K. H. Ma, T. C. Hsieh, and Chun-Huo Chiu. 2017. SpadeR Online: Species-Richness Prediction and Diversity Estimation in R. Retrieved from https://chao.shinyapps.io/SpadeR/. Accessed: 05-13-2017.
[121]
Eric Archer. 2017. sprex: Calculate Species Richness and Extrapolation Metrics. Retrieved from https://cran.r-project.org/web/packages/sprex/. Accessed: 05-13-2017.
[122]
Google. 2017. Syzkaller: Coverage-Guided Kernel Fuzzing. Retrieved from https://github.com/google/syzkaller. Accessed: 05-13-2017.
[123]
The Wireshark team. 2017. Wireshark Is the World’s Foremost and Widely-Used Network Protocol Analyzer. Retrieved from https://www.wireshark.org/. Accessed: 05-13-2017.
[124]
Sean Eron Anderson. 2017. Bithacks: Implementing Logarithm Efficiently. Retrieved from https://graphics.stanford.edu/ seander/bithacks.html#IntegerLogFloat. Accessed: 11-13-2017.
[125]
Cloudflare. 2017. Incident Report on Memory Leak Caused by Cloudflare Parser Bug. Retrieved from https://blog.cloudflare.com/incident-report-on-memory-leak-caused-by-cloudflare-parser-bug/. Accessed: 11-13-2017.
[126]
Haseeb Qureshi. 2017. Medium: A Hacker Stole USD 31M of Ether. Retrieved from https://medium.freecodecamp.org/a-hacker-stole-31m-of-ether-how-it-happened-and-what-it-means-for-ethereum-9e5dc29e33ce. Accessed: 05-13-2017.
[127]
Damien Gayle, Alexandra Topping, Ian Sample, Sarah Marsh, and Vikram Dodd. 2017. NHS Seeks to Recover from Global Cyber-Attack as Security Concerns Resurface. Retrieved from https://www.theguardian.com/society/2017/may/12/hospitals-across-england-hit-by-large-scale-cyber-attack. Accessed: 11-13-2017.
[128]
Elaine J. Weyuker. 1982. On testing non-testable programs. Computer Journal 25, 4 (1982), 465--470.
[129]
E. J. Weyuker and B. Jeng. 1991. Analyzing partition testing strategies. IEEE Transactions on Software Engineering 17, 7 (July 1991), 703--711.
[130]
B. Yang and M. Xie. 2000. A study of operational and testing reliability in software reliability analysis. Reliability Engineering 8 System Safety 70, 3 (2000), 323--329.
[131]
Xuejun Yang, Yang Chen, Eric Eide, and John Regehr. 2011. Finding and understanding bugs in C compilers. In Proceedings of the 32nd ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI’11). 283--294.
[132]
Cun-Hui Zhang and Zhiyi Zhang. 2009. Asymptotic normality of a nonparametric estimator of sample coverage. Annals of Statistics 37, 5A (10 2009), 2582--2595.

Cited By

View all
  • (2025)BazzAFL: Moving Fuzzing Campaigns Towards Bugs via Grouping Bug-Oriented SeedsIEEE Transactions on Dependable and Secure Computing10.1109/TDSC.2024.339179522:1(179-191)Online publication date: Jan-2025
  • (2024)Empirical Evaluation of Frequency Based Statistical Models for Estimating Killable MutantsProceedings of the 18th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement10.1145/3674805.3686669(61-71)Online publication date: 24-Oct-2024
  • (2024)DiPri: Distance-Based Seed Prioritization for Greybox FuzzingACM Transactions on Software Engineering and Methodology10.1145/365444034:1(1-39)Online publication date: 19-Dec-2024
  • Show More Cited By

Recommendations

Comments

Information & Contributors

Information

Published In

cover image ACM Transactions on Software Engineering and Methodology
ACM Transactions on Software Engineering and Methodology  Volume 27, Issue 2
April 2018
186 pages
ISSN:1049-331X
EISSN:1557-7392
DOI:10.1145/3234930
Issue’s Table of Contents
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 27 June 2018
Accepted: 01 April 2018
Revised: 01 April 2018
Received: 01 August 2017
Published in TOSEM Volume 27, Issue 2

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. Statistical guarantees
  2. code coverage
  3. discovery probability
  4. extrapolation
  5. fuzzing
  6. measure of confidence
  7. measure of progress
  8. reliability
  9. security
  10. species coverage
  11. stopping rule

Qualifiers

  • Research-article
  • Research
  • Refereed

Funding Sources

  • National Cybersecurity R8D Program TSUNAMi
  • National Research Foundation, Prime Minister's Office, Singapore
  • National Cybersecurity R8D Directorate

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)68
  • Downloads (Last 6 weeks)17
Reflects downloads up to 15 Feb 2025

Other Metrics

Citations

Cited By

View all
  • (2025)BazzAFL: Moving Fuzzing Campaigns Towards Bugs via Grouping Bug-Oriented SeedsIEEE Transactions on Dependable and Secure Computing10.1109/TDSC.2024.339179522:1(179-191)Online publication date: Jan-2025
  • (2024)Empirical Evaluation of Frequency Based Statistical Models for Estimating Killable MutantsProceedings of the 18th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement10.1145/3674805.3686669(61-71)Online publication date: 24-Oct-2024
  • (2024)DiPri: Distance-Based Seed Prioritization for Greybox FuzzingACM Transactions on Software Engineering and Methodology10.1145/365444034:1(1-39)Online publication date: 19-Dec-2024
  • (2024)MicroFuzz: An Efficient Fuzzing Framework for MicroservicesProceedings of the 46th International Conference on Software Engineering: Software Engineering in Practice10.1145/3639477.3639723(216-227)Online publication date: 14-Apr-2024
  • (2024)Extrapolating Coverage Rate in Greybox FuzzingProceedings of the IEEE/ACM 46th International Conference on Software Engineering10.1145/3597503.3639198(1-12)Online publication date: 20-May-2024
  • (2024)Curiosity-Driven Testing for Sequential Decision-Making ProcessProceedings of the IEEE/ACM 46th International Conference on Software Engineering10.1145/3597503.3639149(1-14)Online publication date: 20-May-2024
  • (2024)Directed Fuzzing with Adaptive Path Guidance and Weighted Distribution2024 IEEE 24th International Conference on Software Quality, Reliability, and Security Companion (QRS-C)10.1109/QRS-C63300.2024.00112(839-848)Online publication date: 1-Jul-2024
  • (2023)Statistical Reachability AnalysisProceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering10.1145/3611643.3616268(326-337)Online publication date: 30-Nov-2023
  • (2023)Boosting Fuzzer Efficiency: An Information Theoretic PerspectiveCommunications of the ACM10.1145/361101966:11(89-97)Online publication date: 20-Oct-2023
  • (2023)Green Fuzzing: A Saturation-Based Stopping Criterion using Vulnerability PredictionProceedings of the 32nd ACM SIGSOFT International Symposium on Software Testing and Analysis10.1145/3597926.3598043(127-139)Online publication date: 12-Jul-2023
  • Show More Cited By

View Options

Login options

Full Access

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Figures

Tables

Media

Share

Share

Share this Publication link

Share on social media