Reliability analysis and optimal version-updating for open source software
Introduction
Open source software (OSS) development is a new way of building and deploying large software systems on a global basis, and it differs in many interesting ways from the principles and practices of traditional software engineering [1]. There is widespread recognition across the software industry that open source projects can produce software systems of high quality and functionality, such as the Linux operating system, the Apache web server, the Mozilla browser, and the MySQL database system, which are used by thousands to millions of end-users [2].
OSS development is based on a relatively simple idea: the original core of the OSS system is developed locally by a single programmer or a team of programmers. A prototype system is then released on the internet, so that other programmers can freely read, modify, and redistribute that system's source code. The evolution of an OSS project is much faster than that of a closed source project, because tasks in OSS development are completed without being assigned by hierarchical management, and there is no explicit system-level design and no well-defined plan or schedule. A central managing group may check the code, but this process is much less rigid than in closed source projects.
Several OSS systems are in widespread use with thousands or millions of end-users, e.g. Mozilla, Apache, OpenOffice, Eclipse, NetBeans, GNOME, and Linux. Due to the success of OSS, more and more software companies have switched from closed source to open source development in order to win market share and to improve product quality [3]. Even leading commercial software companies, such as IBM and Sun, have begun to embrace the open source model and are actively taking part in the development of OSS products.
As OSS applications rapidly spread, it is of great importance to assess the reliability of OSS systems to prevent potential financial loss or reputational damage to the company [4]. With this motivation, many recent studies have addressed predicting the number of defects in a system. For instance, Eaddy et al. [5] investigated the relationship between the degree of scattering and the number of defects using stepwise regression and other statistical techniques. Marcus et al. [6] proposed a new measure, named Conceptual Cohesion of Classes (C3), to quantify cohesion in object-oriented software; they also applied C3 in logistic regression to predict software faults and compared it with other object-oriented metrics. Kim et al. [7] introduced a new technique for predicting latent software bugs in OSS, called change classification, which uses a machine learning classifier to determine whether a new software change is more similar to prior buggy changes or to clean changes. In this manner, change classification predicts the existence of bugs in software changes.
Although the works above provide important information for assessing OSS reliability, the total number of defects in a software system is an essentially indirect reliability measure in which the time factor is neglected. Only in some recent studies by Tamura and Yamada [8], [9] has this issue been considered. In particular, Tamura and Yamada [8] combined neural networks with software reliability growth modeling to assess OSS reliability. In [9], a stochastic differential equation is introduced for modeling OSS reliability, and the optimal version-update time is discussed based on it.
In this paper, we further investigate the modeling of OSS reliability and the determination of its optimal version-update time. Our model is based on the non-homogeneous Poisson process (NHPP), which has proven to be a successful model for software reliability [10], [11], [12], [13]. However, unlike the NHPP models for closed source software and the models proposed in [8], [9], our model incorporates the unique patterns of OSS development, such as the multiple-releases property and the hump-shaped fault detection rate function. In addition, because project cost is no longer a crucial factor in determining the optimal release time for most OSS projects, we formulate a new version-update time determination problem for OSS. Specifically, multi-attribute utility theory (MAUT) is adopted for this decision process, where two important strategies are considered simultaneously: rapid release of the software, to keep a sufficient number of volunteers involved, and an acceptable level of OSS reliability.
The rest of this paper is organized as follows. Section 2 describes our proposed NHPP-based model incorporating the unique properties of OSS. Section 3 formulates the optimal version-update time problem based on MAUT, where the rapid release strategy and the level of reliability are considered simultaneously. Section 4 provides numerical examples for validation purposes based on real-world data sets. Conclusions are drawn in the last section.
Modeling fault detection process of open source software
The underlying software fault detection process is commonly assumed to be a non-homogeneous Poisson process (NHPP) [10], [11], [12], [13]. As software faults are detected, isolated, and removed, the software under test becomes more reliable, with a decreasing failure intensity function. In general, an NHPP software reliability growth model (SRGM) can be developed by solving the following differential equation [14]:

dm(t)/dt = b(t)[a(t) − m(t)]

where m(t), a(t), and b(t) are the mean value function of the fault detection process, the total fault content function, and the fault detection rate function, respectively.
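As a minimal sketch of how this framework behaves, the differential equation dm(t)/dt = b(t)[a(t) − m(t)] can be integrated numerically. The hump-shaped rate function and the constant fault content used below are illustrative assumptions chosen for demonstration, not the exact forms proposed in the paper:

```python
import math

def hump_rate(t, peak=10.0, scale=0.1):
    # Illustrative hump-shaped fault detection rate b(t): rises as a new
    # release attracts volunteers, then decays as interest fades.
    # (Assumed form for demonstration, not the paper's exact function.)
    return scale * (t / peak) * math.exp(1.0 - t / peak)

def mean_value(t_end, a=100.0, b=hump_rate, dt=0.01):
    """Euler integration of dm/dt = b(t) * (a - m(t)), with m(0) = 0
    and a constant total fault content a."""
    m, t = 0.0, 0.0
    while t < t_end:
        m += b(t) * (a - m) * dt
        t += dt
    return m

# Expected cumulative fault counts at two points in time.
print(round(mean_value(5.0), 2))
print(round(mean_value(50.0), 2))
```

With a hump-shaped b(t), the cumulative fault count grows slowly at first, accelerates while the release is attracting volunteers, and then levels off toward the total fault content a.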
Determination of optimal version-update time
Optimal release time determination in the testing phase is a typical application of software reliability models. The total expected cost, including both testing cost and operation cost, is a crucial factor in such determination [14], [15], [16]. In fact, software cost estimation is also important in software development [17]. However, most OSS projects are interest-driven, and most development activities in OSS projects are accomplished by volunteers. Consequently, the cost is no longer a crucial factor in determining the version-update time.
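The decision trade-off can be sketched with a simple additive MAUT formulation: one single-attribute utility rewards an early version update, the other rewards reliability growth, and the chosen update time maximizes their weighted sum. The linear and exponential utility shapes and the weight below are illustrative assumptions, not the paper's calibrated functions:

```python
import math

def u_release(T, T_max=60.0):
    # Utility of a rapid release: highest at T = 0, falling linearly to
    # zero at an assumed horizon T_max (illustrative shape).
    return max(0.0, 1.0 - T / T_max)

def u_reliability(T, b=0.1):
    # Utility of reliability: fraction of faults removed under an assumed
    # exponential growth curve, m(T)/a = 1 - e^{-bT}.
    return 1.0 - math.exp(-b * T)

def total_utility(T, w=0.4):
    # Additive MAUT form: weighted sum of the single-attribute utilities.
    return w * u_release(T) + (1.0 - w) * u_reliability(T)

# Grid search for the version-update time T* maximizing the total utility.
best_T = max((t / 10.0 for t in range(0, 601)), key=total_utility)
print(best_T)
```

The optimum balances the two strategies: updating too early sacrifices reliability, while updating too late lets the release's attractiveness to volunteers decay.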
Numerical examples
Special properties of OSS are incorporated into the proposed model for open source software reliability. In order to compare the proposed model against traditional models for reliability assessment, numerical examples are provided based on two real-world data sets from two well-known open source projects: Apache and GNOME. Furthermore, based on the failure data from the first release of Apache, a decision model application example is provided, and sensitivity analysis is introduced to help guide the decision making.
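A typical validation step of this kind fits a model's mean value function to cumulative failure data. The sketch below fits the classical Goel–Okumoto form m(t) = a(1 − e^{−bt}) to a small data set by grid-search least squares; the data values are invented for illustration and are not the Apache or GNOME measurements used in the paper:

```python
import math

# Hypothetical cumulative fault counts at weekly intervals (illustrative
# only; not the actual Apache/GNOME data sets).
weeks  = [1, 2, 3, 4, 5, 6, 7, 8]
faults = [12, 21, 28, 34, 38, 41, 43, 45]

def sse(a, b):
    # Sum of squared errors between the data and m(t) = a * (1 - e^{-bt}).
    return sum((y - a * (1 - math.exp(-b * t))) ** 2
               for t, y in zip(weeks, faults))

# Coarse grid search over (a, b).
best = min(((a, b) for a in range(40, 80)
                   for b in [i / 100 for i in range(5, 60)]),
           key=lambda p: sse(*p))
print(best)
```

In practice, maximum likelihood estimation or a nonlinear least-squares solver would replace the coarse grid search, and goodness-of-fit criteria would be used to compare competing models.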
Conclusions
The OSS approach provides a new paradigm of software development, in which volunteer participation has become a critical issue. Since volunteers are interest-driven and the attractiveness of a specific release of software generally decreases over time, multiple releases are expected to maintain a sufficient number of volunteers and to attract newcomers. In order to describe these unique properties of OSS properly, a modified NHPP model is proposed to assess OSS reliability. Based on the proposed model, an optimal version-update time determination problem is formulated using multi-attribute utility theory, where the rapid release strategy and the level of reliability are considered simultaneously.
References (36)
- Challenges and strategies in the use of open source software by independent software vendors, Information and Software Technology (2008)
- Software reliability and cost models: perspectives, comparison, and practice, European Journal of Operational Research (2003)
- Adaptive ridge regression system for software cost estimating on multi-collinear datasets, Journal of Systems and Software (2010)
- Exploring the relationship of a file’s history and its fault-proneness: an empirical method and its application to open source programs, Information and Software Technology (2010)
- Survival analysis on the duration of open source projects, Information and Software Technology (2010)
- A multi-criteria decision model to determine inspection intervals of condition monitoring based on delay time analysis, Reliability Engineering and System Safety (2009)
- Enhancing and measuring the predictive capabilities of testing-effort dependent software reliability models, Journal of Systems and Software (2008)
- Multi-attribute risk assessment for risk ranking of natural gas pipelines, Reliability Engineering and System Safety (2009)
- A study of the sensitivity of software release time, Journal of Systems and Software (1998)
- A model for availability analysis of distributed software/hardware systems, Information and Software Technology (2002)
- Sensitivity analysis of release time of software reliability models incorporating testing effort with multiple change-points, Applied Mathematical Modelling
- The Cathedral and the Bazaar: Musings on Linux and Open Source by an Accidental Revolutionary
- Two case studies of open source software development: Apache and Mozilla, ACM Transactions on Software Engineering and Methodology
- Empirical validation of object-oriented metrics on open source software for fault prediction, IEEE Transactions on Software Engineering
- Do crosscutting concerns cause defects?, IEEE Transactions on Software Engineering
- Using the conceptual cohesion of classes for fault prediction in object-oriented systems, IEEE Transactions on Software Engineering
- Classifying software changes: clean or buggy?, IEEE Transactions on Software Engineering
- A component-oriented reliability assessment method for open source software, International Journal of Reliability, Quality and Safety Engineering