Towards a conceptual framework of software run reliability modeling☆
Introduction
MRTF: mean run to failure or median run to failure.
Stochastic modeling methodology I: by which inter-failure times are treated as random variables.
Stochastic modeling methodology II: by which the number of software failures occurring in a time interval is treated as a stochastic process.
Type 1 data: successful runs between failures.
Type 2 data: number of failures among number of runs.
Here, we note that software reliability modeling is concerned with quantifying software reliability behavior: it uses historical software reliability (failure) data to assess the current reliability status and/or forecast future software failures. Type 1 data count how many runs are conducted between two successive software failures and thus are often used to predict how many runs are necessary to expose the next software failure. Type 2 data count how many software failures are observed in a given number of runs and thus are often used to predict the cumulative number of software failures observed in the next given number of runs. Obviously, type 1 data are more accurate than type 2 data, and the former can be converted into the latter. However, type 2 data are easier to collect in practice.
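As a small illustration of the conversion mentioned above (the function name, block size and data are ours, not the paper's), type 1 data can be converted into type 2 data by locating the run index of each failure and then counting failures per block of runs:

```python
# Sketch: converting type 1 data (successful runs between failures)
# into type 2 data (cumulative failures after each block of runs).
# Names, block size and data are illustrative.

def type1_to_type2(runs_between_failures, block_size):
    """Given the number of successful runs before each failure,
    return cumulative failure counts after each block of runs."""
    # Run index at which each failure occurs: each failure consumes
    # its preceding successful runs plus the failing run itself.
    failure_runs = []
    total = 0
    for k in runs_between_failures:
        total += k + 1          # k successes, then 1 failing run
        failure_runs.append(total)

    counts = []
    n = block_size
    while n <= total:
        counts.append(sum(1 for r in failure_runs if r <= n))
        n += block_size
    return counts

# 3 successes then a failure (run 4), 1 success then a failure (run 6), ...
print(type1_to_type2([3, 1, 5], 4))  # → [1, 2, 3]
```

The reverse conversion is impossible in general, which is exactly why type 1 data are the more informative of the two.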
Software reliability modeling has become one of the most important aspects of software reliability engineering since the Jelinski–Moranda model appeared [7], [28], [43], [60]. Various methodologies have been adopted to model software reliability behavior: stochastic modeling methodology I [28], [29], [50], [60]; stochastic modeling methodology II [26], [29], [60], [62]; Bayesian methodology [36]; fuzzy methodology [13], [14]; neural network methodology [30]; non-parametric methodology [54]; and others [18], [33]. One may even claim that software reliability behavior can be well predicted [4], although there are some limits [35].
However, we note that most of the existing work on software reliability modeling assumes a continuous-time base, that is, that software reliability behavior can be measured in terms of calendar time, clock time or CPU execution time. Although this assumption is appropriate for a wide scope of systems, there are many systems for which it does not hold. For example, the reliability behavior of a bank transaction processing software system should be measured in terms of how many transactions are successful, rather than how long the software system operates without failure. Similarly, the reliability behavior of a rocket control software system should be measured in terms of how many rockets are successfully launched, rather than how long a rocket flies without failure. Obviously, for these systems, the time base of reliability measurement is essentially discrete rather than continuous. We must therefore examine whether software reliability modeling techniques developed for a continuous-time base are directly applicable to problems on a discrete-time base.
In order to model software reliability behavior in the context of discrete-time base, throughout this paper we have the following common assumptions:
- 1.
Any software execution process can be divided into a series of runs.
- 2.
When a run is executed, the software either passes or fails.
- 3.
Runs are executed independently.
Run reliability means the probability or possibility that the software successfully performs a run. There has been some work on the topic of run reliability [5], [23], [25], [29], [45], [46], [52], [55], [56], [57], [58], [63]. Some assumed that no software defects were removed during testing, so that software run reliability was constant, corresponding to the software validation phase [5], [23], [45], [46], [52], [57], [58]; some dealt with the case of software defect removals [25], [29], [47], [63]; while others compared or combined continuous-time and discrete-time software reliability modeling [55], [56]. Surprisingly, however, the time base of run reliability has not been properly recognized: it has been interpreted as ‘data domain’ [24], ‘input domain’ [47], or even ‘time-independent’ [58]. In fact, in (hardware) reliability engineering the notion of time is interpreted broadly: time may be calendar time, mileage, or even a positive integer. The time base of run reliability is also a type of time.
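In the constant-reliability case described above (no defect removal), each run is a Bernoulli trial, so the run lifetime is geometric. A minimal sketch (function names and data are ours) of the resulting point estimates:

```python
# Sketch (our illustration): when no defects are removed, each run is a
# Bernoulli trial with constant run reliability R.  The number of runs
# up to and including the first failure is then geometric, so the
# mean run to failure (MRTF) is 1/(1 - R).

def estimate_run_reliability(successes, runs):
    """Point estimate of constant run reliability."""
    return successes / runs

def mrtf(reliability):
    """Mean runs to failure under a geometric run-lifetime model."""
    return 1.0 / (1.0 - reliability)

R = estimate_run_reliability(successes=980, runs=1000)
print(R)               # 0.98
print(round(mrtf(R)))  # 50: on average, 50 runs until the next failure
```

This makes the discrete time base explicit: MRTF is measured in runs, not in calendar or CPU time.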
From previous work, we also note that modeling techniques have not been well established for run reliability: basic definitions and notions behind run reliability have not been well understood, and basic modeling methodologies have not been well formulated. For example, the notion of run lifetime has not been introduced, the relationship between the hazard rate function and the failure intensity function has not been well recognized, and it is not well understood which, and how many, methodologies can be used to model discrete-time software reliability behavior.
Modeling experience with continuous-time software reliability tells us that no single model is superior to all others in all cases [4], [38], and that it is wise to disregard intuitive interpretations of software reliability model parameters [7]. In this paper, we aim to develop a conceptual framework for run reliability modeling rather than a particular run reliability model. By a conceptual framework we mean basic notions and basic methodologies. We try to show that run reliability behavior can be modeled in much the same way as continuous-time reliability behavior, provided we bear in mind that the underlying time base is discrete. Empirical validation of the proposed modeling methodologies is left to future research.
Section 2 discusses the basic definitions and notions of run reliability. Section 3 deals with type 1 data. Section 4 deals with type 2 data. In Section 5, we show how to use Bayesian methodology to deal with run reliability modeling. In Section 6, we show how to use fuzzy methodology to deal with run reliability modeling. In Section 7, we discuss the relationship among the operational profile, discrete-time software reliability behavior and continuous-time software reliability behavior. Concluding remarks are contained in Section 8. Other possible methodologies, based on neural networks, non-parametric analysis or Dempster–Shafer evidence theory [40], [53], are not covered here.
Of course, in order to apply the methodologies proposed in this paper, software run reliability data must be available. Compared to the amount of continuous-time software reliability data reported in the literature, the amount of discrete-time software reliability data reported in the literature is rather limited, although some authors have published their own data [29], [63].
Basic definitions and notions
In this section, we present basic definitions and notions of run reliability in the context of probability. Some of them have been discussed elsewhere [29], [49]. Their counterparts in the context of possibility are left to Section 6.
Dealing with type 1 data
We can use Fig. 1 to represent the software run execution process. The process begins with the first run, run (1,1). By run (1,k) we mean the kth run since the run process began. By run (j,k) we mean the kth run after the (j−1)th software failure. Usually, a defect is removed when a failure occurs, except in the software validation phase, so X1,…,Xn may not be i.i.d. Let
Then as shown in Section 2, we
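When defects are removed at each failure, the inter-failure run counts are no longer identically distributed. One hedged sketch of how such type 1 data might be fitted (a discrete analogue of the Jelinski–Moranda model; the parameterization, data and grid search are ours, not the paper's):

```python
# Sketch: log-likelihood of type 1 data under a discrete analogue of
# the Jelinski-Moranda model.  After j-1 defect removals the per-run
# failure probability is (N - j + 1) * phi, so X_j (runs up to and
# including the j-th failure) is geometric.  N and phi are illustrative.
import math

def log_likelihood(x, N, phi):
    """x[j-1] = number of runs up to and including the j-th failure."""
    ll = 0.0
    for j, xj in enumerate(x, start=1):
        p = (N - j + 1) * phi            # per-run failure probability
        if not 0.0 < p < 1.0:
            return float("-inf")
        # P(X_j = xj) = (1 - p)^(xj - 1) * p
        ll += (xj - 1) * math.log(1.0 - p) + math.log(p)
    return ll

x = [40, 55, 70, 120]  # illustrative type 1 data
# Crude grid search for the maximum-likelihood (N, phi):
best = max(((N, phi) for N in range(len(x), 30)
            for phi in (i / 1000 for i in range(1, 100))),
           key=lambda t: log_likelihood(x, *t))
print(best)
```

Note that time enters only through run counts: the geometric distribution plays the role the exponential distribution plays on a continuous-time base.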
Dealing with type 2 data
We have the following assumptions [63]:
- 1.
N(0)=0.
- 2.
The process has independent increments, i.e., for any collection of the numbers of test runs 0<n1<n2<⋯<nk, the k random variables N(n1),N(n2)−N(n1),…, N(nk)−N(nk−1) are statistically independent.
- 3.
For any of the numbers of test runs ni and nj (0<ni<nj),
This implies that the NHPP model in the context of continuous-time base [2], [24], [51] can be directly used to deal with type 2 data of run
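As a hedged illustration of how a continuous-time NHPP model carries over to the discrete-time base by confining the time instants to positive integers (the Goel–Okumoto mean value function is a standard choice; the parameter values are ours):

```python
# Sketch: the Goel-Okumoto mean value function m(n) = a(1 - exp(-b n)),
# evaluated at positive integers n (numbers of runs).  Under the NHPP
# assumptions above, N(nj) - N(ni) is Poisson with mean m(nj) - m(ni).
# Parameter values are illustrative.
import math

def mean_value(n, a, b):
    """Expected cumulative failures after n runs."""
    return a * (1.0 - math.exp(-b * n))

def expected_new_failures(ni, nj, a, b):
    """Mean of the Poisson increment N(nj) - N(ni), ni < nj."""
    return mean_value(nj, a, b) - mean_value(ni, a, b)

a, b = 30.0, 0.002           # illustrative: ~30 total defects expected
print(round(mean_value(500, a, b), 2))                 # → 18.96
print(round(expected_new_failures(500, 1000, a, b), 2))  # → 6.98
```

Nothing in the model itself changes; only the interpretation of the time axis does.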
Bayesian methodology
Bayesian methodology has a wide scope of applications in system reliability engineering [10], [19], [34], [36]. In comparison with non-Bayesian methodology, where the parameters of concern are treated as constants, Bayesian methodology treats the parameters of concern as random variables whose probability distribution is called the prior probability distribution. The prior probability distribution can capture subjective judgement or is just a parametric probability distribution (e.g., a beta
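For constant run reliability, the beta prior is conjugate to the binomial likelihood of pass/fail run data, which makes the updating step elementary. A minimal sketch (our illustration, with a uniform prior and invented data):

```python
# Sketch (our illustration): conjugate Bayesian updating of constant
# run reliability R with a Beta(alpha, beta) prior.  Observing s
# successes in n runs (a binomial likelihood) gives the posterior
# Beta(alpha + s, beta + n - s).

def posterior(alpha, beta, successes, runs):
    """Beta posterior parameters after observing binomial run data."""
    return alpha + successes, beta + runs - successes

def posterior_mean(alpha, beta):
    """Posterior point estimate of run reliability."""
    return alpha / (alpha + beta)

# Beta(1, 1) is the uniform prior; 98 successes in 100 runs observed.
a, b = posterior(alpha=1.0, beta=1.0, successes=98, runs=100)
print(a, b)                  # 99.0 3.0
print(posterior_mean(a, b))  # ≈ 0.9706
```

Because the posterior is again a beta distribution, the update can be applied run by run as data arrive, which suits the discrete-time base naturally.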
Fuzzy methodology
Using fuzzy or possibilistic methodology to deal with software reliability modeling problems is not a new idea. Ramamoorthy and Bastani’s [47] notion of software correctness possibility is an example. Weiss and Weyuker’s [56] notion of ‘generalized reliability’ and Tsoukalas, Duran and Ntafos’ notion [57] of ‘cost weighted failure rate of software’ are other examples. The latter two notions are actually special cases of profust reliability [15]. On the other hand, Cai developed a fuzzy software
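As one hedged illustration of the possibilistic view (our own formulation, not the paper's): run reliability can be represented as a triangular fuzzy number and propagated through alpha-cuts; since MRTF = 1/(1 − R) is increasing in R, each alpha-cut interval maps endpoint to endpoint:

```python
# Sketch: run reliability as a triangular fuzzy number (l, m, u) and a
# fuzzy MRTF obtained via alpha-cuts.  Because 1/(1 - R) is increasing
# in R, each alpha-cut interval maps endpoint-to-endpoint.  The numbers
# are illustrative.

def alpha_cut(l, m, u, alpha):
    """Alpha-cut [lo, hi] of the triangular fuzzy number (l, m, u)."""
    return l + alpha * (m - l), u - alpha * (u - m)

def fuzzy_mrtf(l, m, u, alpha):
    """Alpha-cut of the fuzzy MRTF = 1/(1 - R)."""
    lo, hi = alpha_cut(l, m, u, alpha)
    return 1.0 / (1.0 - lo), 1.0 / (1.0 - hi)

# Fuzzy run reliability "about 0.95", support [0.90, 0.98]:
lo, hi = fuzzy_mrtf(0.90, 0.95, 0.98, alpha=1.0)
print(round(lo, 6), round(hi, 6))   # core:    20.0 20.0
lo, hi = fuzzy_mrtf(0.90, 0.95, 0.98, alpha=0.0)
print(round(lo, 6), round(hi, 6))   # support: 10.0 50.0
```

The output is a fuzzy MRTF whose core is 20 runs and whose support is [10, 50] runs, a possibilistic counterpart of the probabilistic MRTF.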
Operational profile, discrete-time base and continuous-time base
Now, let us turn our attention back to the probability context. Compared to research work on hardware operational profiles [16], [17], [27], corresponding research work on software operational profiles is relatively limited [5], [20], [21], [22], [29], [42], [46], and there are several disadvantages associated with these works. First, they often defined the software operational profile as a probability distribution across the disjoint classes of test cases [5], [29], [42], [46]. However, the probability
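Under the profile-as-distribution view just described, the overall run reliability of a randomly selected run is the profile-weighted sum of per-class reliabilities. A small sketch (the class names and numbers are ours, purely illustrative):

```python
# Sketch (illustrative classes and numbers): with an operational profile
# given as a probability distribution p_i over disjoint classes of test
# cases, and a per-class run reliability R_i, the overall run
# reliability of a randomly selected run is the weighted sum.

def overall_run_reliability(profile, class_reliability):
    """profile: {class: probability}; class_reliability: {class: R_i}."""
    assert abs(sum(profile.values()) - 1.0) < 1e-9, "profile must sum to 1"
    return sum(p * class_reliability[c] for c, p in profile.items())

profile = {"deposit": 0.5, "withdraw": 0.3, "transfer": 0.2}
reliab  = {"deposit": 0.999, "withdraw": 0.995, "transfer": 0.990}
print(round(overall_run_reliability(profile, reliab), 4))  # 0.996
```

This also shows why the operational profile matters: shifting probability mass toward a less reliable class lowers the overall run reliability even though no per-class reliability changes.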
Concluding remarks
Discrete time is a kind of time measure widely used in (hardware) reliability engineering. However, it has not been properly understood in software reliability modeling, and several confusions exist in the current literature. In the previous sections, we clarified some of these confusions and presented a unified framework of discrete-time software reliability modeling, which parallels or resembles that of continuous-time software reliability modeling. In this framework, basic notions are defined and two
Acknowledgements
The author is most grateful to Bev Littlewood for his constructive discussions and comments. The author would like to thank Andrea Bondavalli, Karama Kanoun, Pascale Thevenod-Fosse, Mladen A. Vouk for their helpful comments on earlier versions of the paper. The comments of one anonymous referee are particularly useful since they help the author to identify and correct two mistakes. The readability of the paper is improved with the help of Didier Dubois. The first draft version of the paper was
References (65)
- System failure engineering and fuzzy methodology: an introductory overview, Fuzzy Sets and Systems (1996)
- On estimating the number of defects remaining in software, Journal of Systems and Software (1998)
- M. Abramowitz, I.A. Stegun, Handbook of Mathematical Functions, National Bureau of Standards Applied Mathematics Series...
- et al., Statistical Inference for Stochastic Processes (1980)
- J.B. Bowles, C.E. Pelaez, Application of fuzzy logic to reliability engineering, in: Proceedings of the IEEE, vol. 83...
- S. Brocklehurst, B. Littlewood, New ways to get accurate reliability measures, IEEE Software (July 1992)...
- J.R. Brown, M. Lipow, Testing for software reliability, in: Proceedings of the International Conference on Reliable...
- Censored software reliability models, IEEE Transactions on Reliability (1997)
- K.Y. Cai, Elements of Software Reliability Engineering, Tsinghua University Press, Beijing, September 1995 (in...
- Introduction to Fuzzy Reliability (1996)
- Software Defect and Operational Profile Modeling
☆ Partially supported by the National Outstanding Youth Foundation of China and the Key Project of China.