Reliability over consecutive releases of a semiconductor Optical Endpoint Detection software system developed in a small company

https://doi.org/10.1016/j.jss.2017.12.006

Highlights

  • Limited data available from a small company can be used for reliability analysis.

  • We used Laplace trend test to demonstrate an overall reliability growth trend.

  • We used IDRMs to determine system stability and operational reliability.

  • Reliability growth was demonstrated using 3 SRGMs (GO, MO, S models).

  • SRGMs fitted the observations well and provided accurate reliability predictions.

Abstract

Demonstrating software reliability across multiple software releases has become essential to making informed decisions about upgrading software releases without significantly impacting end users’ characterized processes and software quality standards. The standard defect and workload data normally collected in a typical small software development organization can be used for this purpose. Most of these organizations operate under aggressive schedules with limited resources and data availability, conditions significantly different from those of the large commercial software organizations where software reliability engineering has been successfully applied. In this paper, a trend test, input domain reliability models (IDRMs), and software reliability growth models (SRGMs) were applied to a semiconductor Optical Endpoint Detection (OED) software system to examine its overall trend and stability, to assess the system’s operational reliability, and to track its reliability growth over multiple releases. These techniques also provided evidence that continuous defect fixes increased software reliability substantially over time for this software system.

Introduction

Software reliability is the probability that a software system performs its function failure-free during a specified time on a given set of inputs under defined conditions (Lyu, 2007; Musa, Iannino, Okumoto, 1987). This study demonstrates software reliability across consecutive software releases for a semiconductor Optical Endpoint Detection (OED) software system using the standard defect data that is normally available to many software development organizations. The OED is connected to an optical sensing instrument that transmits optical data to be processed by OED’s proprietary algorithms.

Company ABC recognizes the need to measure and analyze software reliability between releases. This can be achieved by documenting the reliability of OED using historical data to predict the future reliability trend. Stakeholders can also use these data as an early indicator of quality issues if the actual data do not conform to the expected and predicted trend; the necessary countermeasures can then be applied to improve quality. This information can also assist some end users in making informed decisions on software upgrades. Since most users have a very restricted process for upgrading their software, putting reported defects in perspective and showing trends in reliability and stability would be very instrumental to their change control boards.

Building upon our previous work on defect analysis for this system (Abuta and Tian, 2015), this study demonstrates how one can show software reliability in subsequent releases and whether continuous defect fixes and code upgrades increase software reliability. The defects reported from August 2001 through July 2014 and the related workload data were analyzed.

In this paper, we start with an examination of related work in defect management, defect analysis, and software reliability. We then describe OED’s environment, system flow, data, testing, and test effort tracking, before focusing on OED’s defect analysis, operational reliability, and reliability growth. We conclude the paper by examining the benefits and limitations of our study and how it can apply to other similar situations, with future work aimed at expanding the study.

Section snippets

Related work

A well-defined system and mechanism is needed in order to record, categorize, track, and analyze defects and reliability of a given system.

Application environment and data

Company ABC has taken measures to develop a process that can ensure certain information is collected and validated within the very limited and aggressive time constraints between development and deployment (Abuta and Tian, 2015). The process adopted for the OED ranges from requirements gathering, defect reporting, and source code control to creating and maintaining a set of regression test cases.

OED defect analysis

For this study, trend distribution and analysis models (Lyu, 2007; Kumaresh and Baskaran, 2010; Gokhale and Trivedi, 1998) were selected to visualize the defect distribution for both the whole population and subpopulations. The whole-population defect distribution is referred to as OED defects, while the subpopulation defect distributions are referred to as branch-specific defects.
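To make the trend analysis concrete: the Laplace trend test cited in the highlights can be computed directly from per-interval defect counts. The sketch below is our own illustration, not the authors’ tooling, and the interval counts are invented for demonstration; a negative Laplace factor indicates reliability growth (decreasing failure intensity), a positive one indicates decay.

```python
import math

def laplace_factor(counts):
    """Laplace trend factor for failure counts over equal-length test intervals.

    u(k) = (sum_{i=1}^{k} (i-1)*n_i / N - (k-1)/2) / sqrt((k^2 - 1) / (12*N))
    where n_i is the failure count in interval i and N = sum of all n_i.
    Negative values suggest reliability growth; positive values, reliability decay.
    """
    k = len(counts)
    n = sum(counts)
    if k < 2 or n == 0:
        raise ValueError("need at least two intervals and at least one failure")
    # enumerate() gives 0-based i, which matches the (i-1) weight for 1-based intervals
    weighted = sum(i * c for i, c in enumerate(counts))
    return (weighted / n - (k - 1) / 2) / math.sqrt((k * k - 1) / (12 * n))

# Steadily declining defect counts across five intervals
print(laplace_factor([9, 7, 5, 3, 1]))  # ≈ -2.83 (negative: growth trend)
```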

Operational reliability for OED

Operational reliability refers to the system reliability snapshots observed during OED’s operation where all the observed failures, including those caused by the same underlying faults yet to be fixed, are counted.
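The classic input domain reliability model is the Nelson model, which estimates reliability as the fraction of sampled runs that succeed; since every observed failure counts, it matches the operational-reliability view described above, where duplicates from the same underlying fault are not removed. A minimal sketch (the run and failure counts are invented for illustration):

```python
def nelson_reliability(runs, failures):
    """Nelson IDRM estimate: R = 1 - f/n over a sample of n runs with f failures.

    All observed failures are counted, including repeats caused by the same
    underlying fault that has not yet been fixed.
    """
    if runs <= 0 or failures < 0 or failures > runs:
        raise ValueError("need runs > 0 and 0 <= failures <= runs")
    return 1.0 - failures / runs

# 4 observed failures in 200 sampled runs
print(nelson_reliability(200, 4))  # 0.98
```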

Reliability growth for OED

After assessing the operational reliability above, the next step was to assess the reliability growth of OED. Unlike in operational reliability, where duplicate defects were counted, only unique defects were used in the reliability growth evaluation. In other words, once a defect was accounted for, it was not counted again in the next test interval. This allowed us to assess the effect of defect fixing on reliability growth.
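The three SRGMs named in the highlights, Goel-Okumoto (GO), Musa-Okumoto (MO), and the S-shaped model, differ chiefly in their mean value function m(t): the expected cumulative number of unique defects found by test time t. The sketch below shows one common form of each function; the parameter values are illustrative, not the fitted values from the paper.

```python
import math

def go_mean(t, a, b):
    """Goel-Okumoto: m(t) = a * (1 - exp(-b*t)).
    a = expected total defects, b = per-defect detection rate."""
    return a * (1 - math.exp(-b * t))

def mo_mean(t, beta0, beta1):
    """Musa-Okumoto logarithmic Poisson: m(t) = beta0 * ln(1 + beta1*t).
    Failure intensity decays exponentially with expected failures experienced."""
    return beta0 * math.log(1 + beta1 * t)

def s_mean(t, a, b):
    """Delayed S-shaped: m(t) = a * (1 - (1 + b*t) * exp(-b*t)).
    Models a learning-phase lag before defect detection ramps up."""
    return a * (1 - (1 + b * t) * math.exp(-b * t))

# With the same a and b, the S-shaped curve lags the GO curve early on
t, a, b = 10.0, 100.0, 0.1
print(go_mean(t, a, b))  # ≈ 63.2 defects expected by t = 10
print(s_mean(t, a, b))   # ≈ 26.4: slower early detection
```

In practice the parameters are estimated from the unique-defect counts per test interval (e.g. by least squares or maximum likelihood), and the fitted m(t) is then used to predict remaining defects and future reliability.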

Discussion and perspective

The main objective of this study was to provide OED stakeholders with evidence of its reliability across multiple releases. The challenge for the study was that the data required by SRGMs and IDRMs to assess reliability is typically only available to large companies (Lyu, 1995; Tian, Lu, Palma, 1995; Musa, 1989). Thus, there was a need to determine whether other types of data normally available to a small software development organization could also be suitable for these techniques. The first

Conclusion

Determining software reliability across multiple software releases is essential to end users. It helps them make informed decisions on upgrading software releases without significantly impacting their characterized processes and software quality standards. We achieved this by analyzing the standard defect data normally collected in a typical small software development organization.

For the initial analysis of the data, we used Laplace trend test results to give a quick assessment and

References (18)

  • N. Ullah et al.

    A comparative analysis of software reliability growth models using defects data of closed and open source software

    IEEE 35th Software Engineering Workshop

    (2012)
  • E. Abuta et al.

    Defect analysis over multiple release versions of a semiconductor software system

    IEEE/ACM 1st International Workshop on Complex Faults and Failures in Large Software Systems (COUFLESS)

    (2015)
  • A. Goel et al.

    Time-dependent error-detection rate model for software reliability and other performance measures

    IEEE Trans. Reliab.

    (1979)
  • S. Gokhale et al.

    Log-logistic software reliability growth model

    Third IEEE International High-Assurance Systems Engineering Symposium Proceedings

    (1998)
  • IEEE

    IEEE Recommended Practice on Software Reliability

    (2008)
  • ISO/IEC/IEEE

    Systems and Software Engineering – Vocabulary

    (2010)
  • S. Kumaresh et al.

    Defect analysis and prevention for software process quality improvement

    Int. J. Comput. Appl.

    (2010)
  • M. Leszak et al.

    A case study in root cause defect analysis

    Proceedings of the 22nd International Conference on Software Engineering

    (2000)
  • H. Lu et al.

    Defect prediction between software versions with active learning and dimensionality reduction

    2014 IEEE 25th International Symposium on Software Reliability Engineering

    (2014)
There are more references available in the full text version of this article.
