
Assessing the quality of industrial avionics software: an extensive empirical evaluation

  • Experience Report
  • Published in: Empirical Software Engineering (2017)

Abstract

A real-time operating system for avionics (RTOS4A) provides the operating environment for avionics application software. Because an RTOS4A runs safety-critical applications, demonstrating a satisfactory level of its quality to its stakeholders is essential. By assessing how quality varies across consecutive releases of an industrial RTOS4A, based on test data collected over 17 months, we aim to provide a set of guidelines to 1) improve test effectiveness, and thus the quality of subsequent RTOS4A releases, and 2) assess the quality of other systems from test data in a similar way. We carefully defined a set of research questions and, for each, a number of variables derived from the available test data, including release and measures of test effort, test effectiveness, complexity, test efficiency, test strength, and failure density. Using these variables, we assessed quality in terms of the number of failures found in testing by applying a combination of analyses: trend analysis using two-dimensional graphs, correlation analysis using Spearman’s test, and difference analysis using the Wilcoxon rank test. The key results are as follows: 1) the number of failures and the failure density decreased in the latest releases, while test coverage was either high or did not decrease from release to release; 2) greater test effort was spent on modules of greater complexity, and the number of failures in these modules was not high; and 3) in all releases, test coverage for modules without failures was not lower than test coverage for modules with failures. Based on this evidence, we conclude that the quality of the RTOS4A studied improved in the latest release. In addition, our industrial partner found our guidelines useful, and we believe they can be used to assess the quality of other applications in the future.
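To illustrate the analysis combination described above, here is a minimal sketch, assuming hypothetical per-module test data, of how the correlation and difference analyses could be run with Python's scipy. It is not the authors' analysis code, all values are invented placeholders, and the rank-sum variant of the Wilcoxon test is an assumption, since the abstract does not specify which variant was used.

```python
# A minimal sketch (not the authors' code) of the correlation and difference
# analyses described in the abstract. All per-module values are hypothetical.
from scipy.stats import spearmanr, ranksums

# Hypothetical per-module measurements for one release.
test_effort = [120, 85, 200, 60, 150]   # e.g., test cases executed per module
complexity  = [34, 21, 58, 12, 40]      # e.g., cyclomatic complexity per module

# Hypothetical coverage figures, split by whether failures were found.
coverage_with_failures    = [0.82, 0.75, 0.88]
coverage_without_failures = [0.91, 0.86, 0.90, 0.84]

# Correlation analysis: is more test effort spent on more complex modules?
rho, p = spearmanr(test_effort, complexity)
print(f"Spearman rho = {rho:.2f}, p = {p:.3f}")

# Difference analysis: is coverage of failure-free modules lower than coverage
# of failing modules? (Wilcoxon rank-sum test; the specific Wilcoxon variant
# used in the paper is an assumption here.)
stat, p = ranksums(coverage_without_failures, coverage_with_failures)
print(f"Wilcoxon rank-sum statistic = {stat:.2f}, p = {p:.3f}")
```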


Notes

  1. This is the result of NoTC_RM divided by the sum of NoTC_L+S and NoTC_HS for rows R3 to R5, respectively, in Table 6.

  2. Calculated as NoTC_AM divided by the sum of NoTC_L+S and NoTC_HS for row R3 in Table 6.

  3. It is computed as NoTC_HS / (NoTC_HS + NoTC_L+S); see the sketch after these notes.

  4. Microsoft’s name for the operating system. See https://en.wikipedia.org/wiki/Windows_Vista for details.

  5. An open-source platform. See www.eclipse.org for details.
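For concreteness, here is a minimal sketch of the ratio computations defined in Notes 1–3. The NoTC counts below are hypothetical placeholders, not the actual figures from Table 6 (which is not reproduced here).

```python
# A minimal sketch of the ratios in Notes 1-3. All NoTC counts below are
# hypothetical placeholders, NOT the actual values from Table 6.
notc_rm = 40    # NoTC_RM  (placeholder count)
notc_am = 25    # NoTC_AM  (placeholder count)
notc_hs = 300   # NoTC_HS  (placeholder count)
notc_ls = 700   # NoTC_L+S (placeholder count)

denominator = notc_ls + notc_hs          # NoTC_L+S + NoTC_HS

ratio_1 = notc_rm / denominator          # Note 1: NoTC_RM / (NoTC_L+S + NoTC_HS)
ratio_2 = notc_am / denominator          # Note 2: NoTC_AM / (NoTC_L+S + NoTC_HS)
ratio_3 = notc_hs / (notc_hs + notc_ls)  # Note 3: NoTC_HS / (NoTC_HS + NoTC_L+S)

print(f"Note 1 ratio: {ratio_1:.3f}")
print(f"Note 2 ratio: {ratio_2:.3f}")
print(f"Note 3 ratio: {ratio_3:.3f}")
```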


Acknowledgments

This research is jointly supported by the Technology Foundation Program (JSZL2014601B008) of the National Defense Technology Industry Ministry and the State Key Laboratory of the Software Development Environment (SKLSDE-2013ZX-12). This work was also supported by the MBT4CPS project funded by the Research Council of Norway (grant no. 240013/O70) under the category of Young Research Talents of the FRIPO funding scheme. Tao Yue and Shaukat Ali are also supported by the EU Horizon 2020 project U-Test (http://www.u-test.eu/) (grant no. 645463), the RFF Hovedstaden funded MBE-CR project (grant no. 239063), the Research Council of Norway funded Zen-Configurator project (grant no. 240024/F20), and the Research Council of Norway funded Certus SFI (grant no. 203461/O30) (http://certus-sfi.no/).

Author information

Correspondence to Ji Wu.

Additional information

Communicated by: Brian Robinson

This paper is an extended version of the conference paper published at the 24th IEEE International Symposium on Software Reliability Engineering (ISSRE 2013).


Cite this article

Wu, J., Ali, S., Yue, T. et al. Assessing the quality of industrial avionics software: an extensive empirical evaluation. Empir Software Eng 22, 1634–1683 (2017). https://doi.org/10.1007/s10664-016-9440-x
