
Establishing flight software reliability: testing, model checking, constraint-solving, monitoring and learning

Annals of Mathematics and Artificial Intelligence

Abstract

In this paper we discuss the application of a range of techniques to the verification of mission-critical flight software at NASA’s Jet Propulsion Laboratory. For this type of application we want a higher level of confidence than standard software testing can provide. Unfortunately, given the current state of the art, especially when efforts are constrained by the tight deadlines and resource limitations of a flight project, it is not feasible to produce a rigorous formal proof of correctness of even a well-specified stand-alone module such as a file system (much less more tightly coupled or difficult-to-specify modules). This means that we must look for a practical alternative in the area between traditional testing and proof, as we attempt to optimize rigor and coverage. The approaches we describe here are based on testing, model checking, constraint-solving, monitoring, and finite-state machine learning, in addition to static code analysis. The results we have obtained in the domain of file systems are encouraging, and suggest that for more complex properties of programs with complex data structures, it may be more beneficial to use constraint solvers to guide and analyze execution (i.e., as in testing, even if performed by a model checking tool) than to translate the program and property into a set of constraints, as in abstraction-based and bounded model checkers. Our experience with non-file-system flight software modules shows that methods even further removed from traditional static formal methods can still be assisted by formal approaches, yet remain readily adoptable by test engineers and software developers, even as the key problem shifts from test generation and selection to test evaluation.
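
The abstract's central contrast, guiding and analyzing concrete execution with constraint solvers rather than translating the whole program and property into constraints, is easiest to see in the randomized differential testing of a file system on which the work builds. Below is a minimal sketch of that idea, not the authors' actual harness: it drives the host file system (Python's os module standing in for a flash file system) and a simple in-memory reference model through the same random operation sequence, reporting the first divergence. The function name, operation set, and parameters are all illustrative assumptions.

```python
import os
import random
import tempfile

def differential_test(seed, steps=200):
    """Drive the real file system and a dict-based reference model with
    the same random operations; return a description of the first
    divergence, or None if the run passes."""
    rng = random.Random(seed)
    root = tempfile.mkdtemp()
    model = {}                                  # reference: path -> bytes
    paths = [os.path.join(root, f"f{i}") for i in range(4)]
    for step in range(steps):
        path = rng.choice(paths)
        op = rng.choice(["create", "append", "read", "delete"])
        if op == "create":
            try:
                open(path, "xb").close()        # "x" mode fails if path exists
                created = True
            except FileExistsError:
                created = False
            if created == (path in model):      # both sides must agree on success
                return f"step {step}: create({path}) diverged"
            if created:
                model[path] = b""
        elif op == "append":
            if path in model:                   # only append to known files
                with open(path, "ab") as f:
                    f.write(b"x")
                model[path] += b"x"
        elif op == "read":
            try:
                data = open(path, "rb").read()
            except FileNotFoundError:
                data = None                     # "missing" on both sides is None
            if data != model.get(path):         # contents must match the model
                return f"step {step}: read({path}) diverged"
        else:                                   # delete
            try:
                os.remove(path)
                removed = True
            except FileNotFoundError:
                removed = False
            if removed != (path in model):
                return f"step {step}: delete({path}) diverged"
            model.pop(path, None)
    return None

if __name__ == "__main__":
    for seed in range(50):
        failure = differential_test(seed)
        if failure is not None:
            print(f"seed {seed}: {failure}")
```

Because the host file system here really is POSIX-correct, this sketch finds no divergences; in the flight setting the system under test would be the flash file system, the reference a POSIX-like model, and the operation set would plausibly include injected faults such as power loss and bad blocks.

The monitoring and log-analysis side of the abstract can be sketched just as briefly: a hand-written state-machine monitor that checks a temporal property over a stream of logged events. The event format and the property, "a dispatched command completes exactly once before it may be dispatched again", are invented for illustration.

```python
def monitor(events):
    """events: iterable of (verb, command) pairs taken from a log;
    yield a message for each violation of the dispatch/complete property."""
    pending = set()                             # commands dispatched, not yet completed
    for i, (verb, cmd) in enumerate(events):
        if verb == "dispatch":
            if cmd in pending:
                yield f"event {i}: {cmd} re-dispatched before completion"
            pending.add(cmd)
        elif verb == "complete":
            if cmd not in pending:
                yield f"event {i}: {cmd} completed but never dispatched"
            pending.discard(cmd)
    for cmd in sorted(pending):                 # anything still open at end of log
        yield f"end of log: {cmd} dispatched but never completed"

log = [("dispatch", "PICTURE"), ("complete", "PICTURE"),
       ("dispatch", "DRIVE"), ("dispatch", "DRIVE")]
print(list(monitor(log)))                       # flags the DRIVE re-dispatch
```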



Author information

Correspondence to Alex Groce.

Additional information

The research described in this publication was carried out at the Jet Propulsion Laboratory, California Institute of Technology, under a contract with the National Aeronautics and Space Administration. Funding was also provided by NASA ESAS 6G. © 2008. All rights reserved.


Cite this article

Groce, A., Havelund, K., Holzmann, G. et al. Establishing flight software reliability: testing, model checking, constraint-solving, monitoring and learning. Ann Math Artif Intell 70, 315–349 (2014). https://doi.org/10.1007/s10472-014-9408-8

