
Establishing flight software reliability: testing, model checking, constraint-solving, monitoring and learning

Annals of Mathematics and Artificial Intelligence

Abstract

In this paper we discuss the application of a range of techniques to the verification of mission-critical flight software at NASA’s Jet Propulsion Laboratory. For this type of application we want a higher level of confidence than standard software testing can provide. Unfortunately, given the current state of the art, especially when efforts are constrained by the tight deadlines and resource limitations of a flight project, it is not feasible to produce a rigorous formal proof of correctness of even a well-specified stand-alone module such as a file system (much less more tightly coupled or difficult-to-specify modules). This means that we must look for a practical alternative in the area between traditional testing and proof, as we attempt to optimize rigor and coverage. The approaches we describe here are based on testing, model checking, constraint-solving, monitoring, and finite-state machine learning, in addition to static code analysis. The results we have obtained in the domain of file systems are encouraging, and suggest that for more complex properties of programs with complex data structures, it may be more beneficial to use constraint solvers to guide and analyze execution (i.e., as in testing, even if performed by a model checking tool) than to translate the program and property into a set of constraints, as in abstraction-based and bounded model checkers. Our experience with non-file-system flight software modules shows that methods even further removed from traditional static formal methods can still be assisted by formal approaches, yet remain readily adoptable by test engineers and software developers, even as the key problem shifts from test generation and selection to test evaluation.
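
The abstract's central contrast, guiding and analyzing concrete execution with constraint solvers rather than translating the whole program and property into constraints, is easiest to see in the randomized differential testing of a file system on which the work builds. Below is a minimal sketch of that idea, not the authors' actual harness: it drives the host file system (Python's os module standing in for a flash file system) and a simple in-memory reference model through the same random operation sequence, reporting the first divergence. The function name, operation set, and parameters are all illustrative assumptions.

```python
import os
import random
import tempfile

def differential_test(seed, steps=200):
    """Drive the real file system and a dict-based reference model with
    the same random operations; return a description of the first
    divergence, or None if the run passes."""
    rng = random.Random(seed)
    root = tempfile.mkdtemp()
    model = {}                                  # reference: path -> bytes
    paths = [os.path.join(root, f"f{i}") for i in range(4)]
    for step in range(steps):
        path = rng.choice(paths)
        op = rng.choice(["create", "append", "read", "delete"])
        if op == "create":
            try:
                open(path, "xb").close()        # "x" mode fails if path exists
                created = True
            except FileExistsError:
                created = False
            if created == (path in model):      # both sides must agree on success
                return f"step {step}: create({path}) diverged"
            if created:
                model[path] = b""
        elif op == "append":
            if path in model:                   # only append to known files
                with open(path, "ab") as f:
                    f.write(b"x")
                model[path] += b"x"
        elif op == "read":
            try:
                data = open(path, "rb").read()
            except FileNotFoundError:
                data = None                     # "missing" on both sides is None
            if data != model.get(path):         # contents must match the model
                return f"step {step}: read({path}) diverged"
        else:                                   # delete
            try:
                os.remove(path)
                removed = True
            except FileNotFoundError:
                removed = False
            if removed != (path in model):
                return f"step {step}: delete({path}) diverged"
            model.pop(path, None)
    return None

if __name__ == "__main__":
    for seed in range(50):
        failure = differential_test(seed)
        if failure is not None:
            print(f"seed {seed}: {failure}")
```

Because the host file system here really is POSIX-correct, this sketch finds no divergences; in the flight setting the system under test would be the flash file system, the reference a POSIX-like model, and the operation set would plausibly include injected faults such as power loss and bad blocks.

The monitoring and log-analysis side of the abstract can be sketched just as briefly: a hand-written state-machine monitor that checks a temporal property over a stream of logged events. The event format and the property, "a dispatched command completes exactly once before it may be dispatched again", are invented for illustration.

```python
def monitor(events):
    """events: iterable of (verb, command) pairs taken from a log;
    yield a message for each violation of the dispatch/complete property."""
    pending = set()                             # commands dispatched, not yet completed
    for i, (verb, cmd) in enumerate(events):
        if verb == "dispatch":
            if cmd in pending:
                yield f"event {i}: {cmd} re-dispatched before completion"
            pending.add(cmd)
        elif verb == "complete":
            if cmd not in pending:
                yield f"event {i}: {cmd} completed but never dispatched"
            pending.discard(cmd)
    for cmd in sorted(pending):                 # anything still open at end of log
        yield f"end of log: {cmd} dispatched but never completed"

log = [("dispatch", "PICTURE"), ("complete", "PICTURE"),
       ("dispatch", "DRIVE"), ("dispatch", "DRIVE")]
print(list(monitor(log)))                       # flags the DRIVE re-dispatch
```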



Author information

Correspondence to Alex Groce.

Additional information

The research described in this publication was carried out at the Jet Propulsion Laboratory, California Institute of Technology, under a contract with the National Aeronautics and Space Administration. Funding was also provided by NASA ESAS 6G. © 2008. All rights reserved.


Cite this article

Groce, A., Havelund, K., Holzmann, G. et al. Establishing flight software reliability: testing, model checking, constraint-solving, monitoring and learning. Ann Math Artif Intell 70, 315–349 (2014). https://doi.org/10.1007/s10472-014-9408-8

