An integration of fault detection and correction processes in software reliability analysis
Introduction
Over the past decade, the deployment of computer systems has grown dramatically, and people in modern society are increasingly dependent on both hardware and software systems. Because software is embedded in nearly everything and permeates daily life, the correct performance of software has become an important issue for many critical systems. Software reliability is a useful measure for quantifying software failures and is defined as the probability of failure-free software operation for a specified period of time in a specified environment (Lyu, 1996). Numerous software reliability growth models (SRGMs) have been developed to measure software reliability, and some of them are based on the non-homogeneous Poisson process (NHPP) (Lyu, 1996, Musa et al., 1987, Pham, 2000, Xie, 1991). These SRGMs describe the error-detection process as a discrete or continuous process with a time-dependent error-detection rate (Goel and Okumoto, 1979, Lo et al., 2001, Lo et al., 2003, Yamada et al., 1983, Yamada et al., 1993). A common assumption of conventional SRGMs is that detected faults are removed immediately. However, this assumption is not very realistic: detected faults are rarely corrected immediately (Gokhale et al., 1998, Schneidewind, 1975, Schneidewind, 2003, Ohba, 1984, Xie and Zhao, 1992).
Schneidewind (1975) first modeled the fault-correction process as a fault-detection process delayed by a constant time lag. Later, Xie and Zhao (1992) extended the Schneidewind model to a continuous version by substituting a time-dependent delay function for the constant delay. A key factor of the continuous version is this time-dependent delay function, which measures the expected time lag to correct a detected fault. Software debugging resembles scientific inquiry: fault-correction personnel formulate a hypothesis, make predictions based on it, and then run the software, observe its output, and confirm the hypothesis. The time to remove a fault depends on the complexity of the detected fault, the skills of the debugging team, the available manpower, and the software development environment (Lyu, 1996, Musa et al., 1987, Musa, 1998). It is therefore important to have software reliability models that cover both the fault detection and correction processes. In this paper, we propose a new software reliability model that considers both processes. Numerical examples are based on two real software failure data sets. Experimental results show that the proposed framework, which incorporates the debugging time lag into SRGMs, has a fairly accurate prediction capability.
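The delayed-detection idea described above can be illustrated with a minimal sketch. It is not the model proposed in this paper: it assumes a Goel-Okumoto detection curve m_d(t) = a(1 - e^{-bt}), and the parameter values, function names, and the logarithmic lag are hypothetical choices made only for illustration. The expected number of corrected faults is taken as the detection curve evaluated at an earlier time, m_c(t) = m_d(t - Δ) for a constant lag or m_c(t) = m_d(t - Δ(t)) for a time-dependent lag.

```python
import math


def detected(t, a=100.0, b=0.1):
    """Expected faults detected by time t (assumed Goel-Okumoto form).

    a (total expected faults) and b (detection rate) are illustrative values.
    """
    if t <= 0:
        return 0.0
    return a * (1.0 - math.exp(-b * t))


def corrected_constant_delay(t, delay=2.0, a=100.0, b=0.1):
    """Expected faults corrected by t when every fix lags detection by `delay`."""
    return detected(t - delay, a, b)


def corrected_time_dependent_delay(t, delay_fn, a=100.0, b=0.1):
    """Expected faults corrected by t when the lag is a function of t."""
    return detected(t - delay_fn(t), a, b)


def log_delay(t, b=0.1):
    """Hypothetical lag that grows slowly with testing time: ln(1 + b*t) / b."""
    return math.log(1.0 + b * t) / b


if __name__ == "__main__":
    for t in (1, 5, 10, 20, 40):
        d = detected(t)
        c_const = corrected_constant_delay(t)
        c_var = corrected_time_dependent_delay(t, log_delay)
        # The gap d - c is the expected backlog of detected but uncorrected faults.
        print(f"t={t:>2}  detected={d:6.2f}  "
              f"corrected(const)={c_const:6.2f}  corrected(log lag)={c_var:6.2f}")
```

With the logarithmic lag used here, the corrected curve reduces algebraically to a(1 - (1 + bt)e^{-bt}), an S-shaped curve, which suggests how a choice of delay function can change the shape of the correction process even when the detection process is unchanged.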
This paper is organized as follows. Section 2 reviews the properties of related models and discusses the characteristics of NHPP models. Section 3 proposes an integrated model of the fault detection and correction processes; it also shows how some existing NHPP models can be re-evaluated from the viewpoint of the correction process and compares the original NHPP models with the integrated models. Section 4 presents the numerical examples and comparison results. Finally, Section 5 concludes the paper.
A brief review of some SRGMs based on NHPP
Let {N(t), t ⩾ 0} denote a counting process representing the cumulative number of faults detected by time t, let m(t) be the mean value function (MVF), that is, the expected number of faults detected in the interval (0, t], and let λ(t) denote the failure intensity at testing time t. That is, they satisfy the following:

$$m(t) = \int_0^t \lambda(s)\,ds$$

and

$$\frac{dm(t)}{dt} = \lambda(t).$$

Thus, an SRGM based on NHPP with mean value function m(t) can be formulated as (Yamada et al., 1993)

$$\Pr\{N(t) = n\} = \frac{[m(t)]^n}{n!}\,e^{-m(t)}, \quad n = 0, 1, 2, \ldots$$

From our studies (Lo et al., 2001,
An integrated fault detection and correction model
In the past, much research on software reliability models has concentrated on modeling and predicting failure occurrence and has not given equal priority to modeling the fault correction process (Schneidewind, 2003). However, many latent software faults remain uncorrected for a long time after they are detected, which increases their impact. These remaining faults are often among the main causes of poor software quality. Therefore, from the practical viewpoint, we may
Descriptions of real data sets
In this section, we evaluate the performance of the proposed model by using two sets of software failure data. The first data set is the System T1 data of the Rome Air Development Center (RADC) (Musa, 1985, Musa et al., 1987) and is shown in Table 2. System T1 is used for a real-time command and control application. The size of the software is approximately 21,700 object instructions. It took 21 weeks and nine programmers to complete the test. During the test phase, about 25.3 CPU hours were
Conclusions
Over the past 30 years, researchers have proposed many SRGMs, and some important metrics can be easily determined through them. However, most of these models assume that detected faults are corrected immediately, an assumption that may not be realistic in practice. In this paper, we first propose a general framework for modeling the fault detection and correction processes. We have shown that some existing SRGMs can be easily derived from the concept of the integration of
Acknowledgements
This research was supported by the National Science Council, Taiwan, ROC, under Grant NSC 93-2213-E-267-001, and also substantially supported by a grant from the Ministry of Economic Affairs (MOEA) of Taiwan (Project No. 94-EC-17-A-01-S1-038).
Jung-Hua Lo received the BS (1993) in Mathematics and the MS (1995) and the Ph.D. (2003) in Electrical Engineering from National Taiwan University. Since 1998, he has been with LanYang Institute of Technology, where he is currently an Assistant Professor and the Chairman of the Department of Information Management. His research interests are software engineering, software reliability and testing, etc.
References
- Performance analysis of software reliability growth models with testing-effort and change-point. J. Syst. Software (2005).
- et al. Software reliability growth model with error dependency. Microelectron. Reliab. (1995).
- et al. Time-dependent error-detection rate model for software reliability and other performance measures. IEEE Trans. Reliab. (1979).
- Gokhale, S.S., Lyu, M.R., Trivedi, K.S., 1998. Software reliability analysis incorporating fault detection and...
- Lo, J.H., Kuo, S.Y., Huang, C.Y., 2001. Reliability modeling incorporating error processes for internet-distributed...
- Lo, J.H., Kuo, S.Y., Lyu, M.R., Huang, C.Y., 2002. Optimal resource allocation and reliability analysis for...
- Lo, J.H., Huang, C.Y., Kuo, S.Y., Lyu, M.R., 2003. Sensitivity analysis of software reliability for component-based...
- Handbook of Software Reliability Engineering (1996).
- et al. Applying software reliability models more effectively. IEEE Trans. Softw. (1992).
- Musa, J.D., 1985. Software reliability data, report and database available from data and analysis center for software....
- Software Reliability Engineering: More Reliable Software, Faster Development and Testing
Chin-Yu Huang is currently an Assistant Professor in the Department of Computer Science at National Tsing Hua University, Hsinchu, Taiwan. He received the MS (1994), and the Ph.D. (2000) in Electrical Engineering from National Taiwan University, Taipei. He was with the Bank of Taiwan from 1994 to 1999, and was a senior software engineer at Taiwan Semiconductor Manufacturing Company from 1999 to 2000. Before joining NTHU in 2003, he was a division chief of the Central Bank of China, Taipei. His research interests are software reliability engineering, software testing, software metrics, software testability, fault tree analysis, and system safety assessment, etc. He is a member of IEEE.