Statistical analysis of deviation of actual cost from estimated cost using actual project data

doi:10.1016/S0950-5849(00)00092-6

Information and Software Technology

Volume 42, Issue 7, 1 May 2000, Pages 465-473

https://doi.org/10.1016/S0950-5849(00)00092-6

Abstract

This paper analyzes the association between the deviation of the actual cost (measured in person-months) from the estimated cost and the quality and productivity of software development projects. Although the results themselves may not be new from an academic point of view, they could motivate developers to join process improvement activities in a software company and thus become a driving force for promoting process improvement.

We show that if a project is performed faithfully under a well-organized project plan (i.e. the plan is first constructed according to the standards of good writing, and the project is then managed and controlled to meet the plan), the deviation of the actual cost from the estimated one becomes small. Next we show statistically that projects with a small deviation from the cost estimate tend to achieve high quality of final products and high productivity of development teams. The analysis draws on actual data from 37 projects at a certain company.

Introduction

This paper describes empirical research on process improvement [1] in a certain company, which we call Company A for convenience. In Company A, a software engineering process group (SEPG) was established 7 years ago; the SEPG has tried to spread process improvement throughout the company. This study is part of the process improvement activities that the SEPG carried out in 1998 for developers in the company. In a software development project, the size of the project to be developed is estimated first. Next, a project plan is constructed based on the estimate. Then development starts according to the project plan. If a project is performed exactly as the project plan specifies, it is regarded as a successful project. However, some projects inevitably end in a confused situation such as the Death March project [2], in which the actual cost exceeded the estimated cost by 50%. The SEPG therefore strongly wants to reduce the number of confused projects.

In order to reduce confused projects, many methods and guidelines have already been proposed. Well-known methods such as COCOMO [3] and the Function Point method [4] aim to make estimates accurate. In order to improve quality and productivity, review activities have been introduced to detect defects in the early stages of development [5]. The importance of constructing an appropriate project plan and using it during development has also been stressed [6], [7]. However, it is clear that a good method or guideline has no effect if it is applied inappropriately in the development field. To ensure appropriate application, developers must be motivated to use these methods. According to Humphrey [6], only motivated professionals or developers can strive for superior performance. Therefore, we should not force developers to apply a new method before they understand its benefit and importance and are thus motivated to use it.

In this paper, we focus on the deviation of the actual cost from the estimated cost and regard projects with a large cost deviation as confused projects. We introduce a metric DV, which denotes the difference between the actual cost and the estimated cost. We (including the SEPG) conjecture that constructing an appropriate plan and executing it adherently is the key to reducing confused projects. So, in order to motivate developers to construct and execute a good plan, we show its benefit and importance. To sum up, we will show with statistical significance that “construction of an appropriate plan and its adherent execution” is an effective approach to reducing confused projects with large DV in Company A.

We will show the following three propositions with statistical significance for Company A: (P₁) if a project plan is constructed and executed adherently, the deviation of the cost estimate, DV, becomes small; (P₂) if the deviation of the cost estimate, DV, is small, the quality of the product is high; and (P₃) if the DV is small, the productivity of the team is high. As mentioned before, many researchers have already pointed out that an appropriate plan and its adherent execution are important for software development [6], [7], [8]. However, to the best of our knowledge, no research has shown the effect of an appropriate plan quantitatively by using actual development data.

In this study, we use data from 37 projects obtained from actual developments in Company A and show statistical significance in Company A for propositions P₁, P₂ and P₃ by correlation analysis and tests of statistical hypotheses.

With regard to proposition P₁, it is very difficult to define what a good project plan is and, therefore, even more difficult to construct one. Thus we consider project plans that satisfy certain standards (prepared for the construction of plans) to be good plans. To this end we make a checklist for project plans (the details of the checklist are described in Section 3). Based on the checklist we judge and evaluate the project plans. For this purpose we define a metric AD_plan, which indicates whether or not a project plan is constructed adherently to the standard.

Next, we consider the execution of the projects. Ideally, the developers perform a project exactly as specified in the project plan. However, various problems often disturb a development project. For example, if many defects are found in a test activity, unexpected effort is needed to remove them. Thus we evaluate the execution of projects from two points of view: (1) whether a project is managed using a project plan, and (2) whether the ratio of review effort to total effort is large enough to avoid confusion. For this purpose, we define a metric AD_exec that indicates whether or not a development project is performed adherently to its project plan. Then we perform correlation analyses between the evaluation of project plans (AD=AD_plan+AD_exec) and DV (the deviation of the cost estimate). The result of the analysis shows that there is some degree of correlation between them.

As for propositions P₂ and P₃, any project finished with a lower actual cost than the estimated one is likely to be considered successful from an economic point of view. However, from the project managers' point of view, such projects do not adhere to their project plans. We therefore evaluate the resulting effects of the deviation of the cost estimate. To be precise, we investigate the effect of the deviation of the cost estimate on both the quality of the final product and the productivity of the development team. In this analysis, we classify the projects into two distinct classes using DV (the deviation of the cost estimate): C_S and C_C. C_S includes projects with DV<10%, and C_C includes projects with DV≥10%. The test of statistical hypotheses confirmed that both the quality and the productivity of the projects in C_S are higher than those in C_C (the level of significance is chosen as 0.05).
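The split into C_S and C_C can be sketched in a few lines of Python. This is only an illustration under stated assumptions: this excerpt does not give the exact formula for DV, so the sketch assumes DV is the absolute gap between actual and estimated cost expressed as a percentage of the estimate. The function names and the project figures are hypothetical and are not taken from Company A's data.

```python
def deviation_percent(estimated_pm, actual_pm):
    """Assumed form of DV: |actual - estimated| as a percentage of the estimate.

    Costs are in person-months. The exact definition used in the paper is not
    shown in this excerpt, so this formula is an assumption.
    """
    if estimated_pm <= 0:
        raise ValueError("estimated cost must be positive")
    return abs(actual_pm - estimated_pm) / estimated_pm * 100.0


def classify(estimated_pm, actual_pm, threshold=10.0):
    """Return 'C_S' when DV < threshold (10% in the text), otherwise 'C_C'."""
    dv = deviation_percent(estimated_pm, actual_pm)
    return "C_S" if dv < threshold else "C_C"


# Hypothetical projects: name -> (estimated cost, actual cost) in person-months.
projects = {
    "p1": (100.0, 105.0),  # 5% over the estimate
    "p2": (80.0, 100.0),   # 25% over the estimate
    "p3": (50.0, 48.0),    # 4% under the estimate
}

classes = {name: classify(est, act) for name, (est, act) in projects.items()}
print(classes)
```

Because the assumed DV uses an absolute value, a project that finishes well under its estimate lands in C_C just like one that overruns, which matches the remark above that such projects do not adhere to their plans.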

The rest of this paper is organized as follows. Section 2 describes the preliminaries of this study: target projects, process model and fundamental data. The metrics used in the analyses are defined in Section 3. The correlation analysis between adherence to project plans and the deviation of the cost estimate is performed in Section 4. It is shown that the deviation of the cost estimate is small if the plan is constructed adherently (to the standard) and the project is performed or managed adherently (to the constructed plan). The analysis of the effect of the deviation of the cost estimate on quality and productivity is presented in Section 5. It is shown with statistical significance that, in Company A, projects with a small deviation of the cost estimate achieve high quality of the delivered code and high productivity of the development team. Finally, Section 6 summarizes this paper.

Target projects

The projects targeted in this paper are the development of computer control systems with embedded software in Company A. The systems are classified into three categories: banking applications, railroad applications and business applications.

Though we omit the details, such embedded software implements rather complex functions dealing with many sensors, actuators and control signals, including various kinds of interrupts. Furthermore, since it is delivered in the form of LSI chips, modification of

Definition of metrics

In this section, we introduce five kinds of metrics for the analyses described in Section 4 (Analysis 1: deviation of the cost estimate and adherence) and Section 5 (Analysis 2: effect of deviation).

Analysis 1: deviation of the cost estimate and adherence

In the first analysis, we investigate the correlation between AD and DV.
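As a rough illustration of such a correlation analysis, the sketch below computes a Pearson correlation coefficient between AD and DV. This excerpt does not say which correlation coefficient the authors used, so Pearson's r is an assumption here, as is the convention that a larger AD means closer adherence. The function name and the scores are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical adherence scores (AD = AD_plan + AD_exec) and deviations DV in %.
ad = [2, 4, 5, 7, 8]
dv = [30.0, 22.0, 18.0, 9.0, 4.0]

r = pearson_r(ad, dv)
print(f"r = {r:.3f}")  # a negative r would mean higher adherence goes with smaller DV
```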

Analysis 2: effect of deviation

We can see that project estimates in Company A have become more accurate in recent years. We have also observed some improvements in both quality and productivity. In this section, we clarify the relations between the deviation of the cost estimate and quality and productivity.

Conclusion

In this paper, we have statistically tested three propositions, P₁, P₂ and P₃, as the results of empirical research. Although the implications of these propositions may not be new to academics, they may become a driving force for promoting process improvement in a software development company through: (1) exhaustive collection of fundamental data, and (2) establishment of certain standards (mentioned in Section 2).

The main results of our empirical research are

Acknowledgements

The authors would like to thank Prof Koji Torii of Nara Institute of Science and Technology and Associate Prof Shinji Kusumoto of Osaka University for their discussions and advice on the analysis in this paper. They would also like to thank the two reviewers for their useful comments on the earlier version of this paper.

References (17)

F.J. Heemstra, Software cost estimation, Info. Software Technol. (1992)
R.C. Tausworthe, The work breakdown structure in software project management, J. Systems Software (1980)
L. Poiaga, Operations research in project management and cost engineering: an outlook for new operational developments, Eur. J. Oper. Res. (1989)
W.S. Humphrey, Managing the Software Process (1989)
E. Yourdon, Death March: the Complete Software Developer's Guide to Surviving ‘Mission Impossible’ Projects (1997)
B.W. Boehm, Software Engineering Economics (1981)
A.J. Albrecht et al., Software function, source lines of code, and development effort prediction: a software science validation, IEEE Trans. Software Engng (1983)
A.A. Porter et al., An experiment to assess the cost-benefits of code inspections in large scale software development, IEEE Trans. Software Engng (1997)

There are more references available in the full text version of this article.
