Statistical analysis of deviation of actual cost from estimated cost using actual project data

https://doi.org/10.1016/S0950-5849(00)00092-6

Abstract

This paper analyzes how the deviation of the actual cost (measured in person-months) from the estimated cost is associated with the quality and the productivity of software development projects. Although the results themselves may not be new from an academic point of view, they could motivate developers to join process improvement activities in a software company and thus become a driving force for promoting process improvement.

We show that if a project is performed faithfully under a well-organized project plan (i.e. the plan is first constructed according to the standards of good writing, and the project is then managed and controlled to meet the plan), the deviation of the actual cost from the estimated one becomes small. Next, we show statistically that projects with a small deviation of the cost estimate tend to achieve high quality of the final products and high productivity of the development teams. The analysis draws on actual data from 37 projects at a certain company.

Introduction

This paper describes an empirical study on process improvement [1] in a certain company, which we call Company A for convenience. In Company A, a software engineering process group (SEPG) was established 7 years ago; the SEPG has tried to spread process improvement throughout the company. This study is part of the process improvement activities that the SEPG carried out in 1998 for developers in the company. In a software development project, the size of the project to be developed is estimated first. Next, a project plan is constructed based on the estimate. Development then starts according to the project plan. If a project is performed exactly as the project plan specifies, it is regarded as a successful project. However, some projects inevitably end up in a confused situation such as the Death March project [2], in which the actual cost exceeded the estimated cost by 50%. The SEPG therefore strongly wishes to reduce the number of confused projects.

In order to reduce confused projects, many methods and guidelines have already been proposed. Well-known methods such as COCOMO [3] and the Function Point method [4] aim to make estimates accurate. To improve quality and productivity, review activities have been introduced to detect defects in the early stages of development [5]. The importance of constructing an appropriate project plan and using it during development has also been stressed [6], [7]. However, it is clear that a good method or guideline has no effect if it is applied inappropriately in the development field. To ensure appropriate application, developers must be motivated to use these methods. According to Humphrey [6], only motivated professionals or developers can strive for superior performance. Therefore, we should not force developers to apply a new method before they understand its benefit and importance and are thus motivated to use it.

In this paper, we focus on the deviation of the actual cost from the estimated cost and regard projects with a large cost deviation as confused projects. We introduce a metric DV, which denotes the difference between an actual cost and an estimated cost. We (including the SEPG) also conjecture that constructing an appropriate plan and executing it adherently is the key to reducing confused projects. So, in order to motivate developers to construct and execute a good plan, we show its benefit and importance. To sum up, we will show with statistical significance that “construction of appropriate plan and its adherent execution” is an effective approach to reducing confused projects with large DV in Company A.
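
As a rough illustration of the DV metric introduced above, the sketch below computes a relative cost deviation in percent. The exact formula for DV is not given in this part of the paper, so the normalization by the estimated cost and the use of the absolute value are assumptions, chosen only to be consistent with the percentage thresholds used later in the introduction. The function name `cost_deviation` and the sample figures are hypothetical.

```python
def cost_deviation(actual_pm: float, estimated_pm: float) -> float:
    """Return the relative deviation of actual from estimated cost, in percent.

    Assumption: DV = |actual - estimated| / estimated * 100, with both
    costs measured in person-months. The paper's exact definition may differ.
    """
    if estimated_pm <= 0:
        raise ValueError("estimated cost must be positive")
    return abs(actual_pm - estimated_pm) / estimated_pm * 100.0


# Hypothetical example: a project estimated at 40 person-months
# that actually consumed 46 person-months.
dv = cost_deviation(actual_pm=46.0, estimated_pm=40.0)
print(f"DV = {dv:.1f}%")
```

Taking the absolute value treats overruns and underruns alike, which matches the later remark that projects finishing below the estimate also fail to adhere to their plans.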

We will show the following three propositions with statistical significance for Company A: (P1) if a project plan is constructed and executed adherently, the deviation of the cost estimate, DV, becomes small; (P2) if the deviation of the cost estimate, DV, is small, the quality of the product is high; and (P3) if the DV is small, the productivity of the team is high. As mentioned before, many researchers have already pointed out that an appropriate plan and its adherent execution are important for software development [6], [7], [8]. However, to the best of our knowledge, no research has shown the effect of an appropriate plan quantitatively by using actual development data.

In this study, we use data from 37 projects obtained from actual developments in Company A and show the statistical significance of propositions P1, P2 and P3 for Company A by correlation analysis and tests of statistical hypotheses.

With regard to proposition P1, it is very difficult to define what a good project plan is and, therefore, even more difficult to construct one. Thus we consider project plans that satisfy certain standards (prepared for the construction of plans) to be good plans. To this end, we make a checklist for project plans (the details of the checklist are described in Section 3). Based on the checklist, we judge and evaluate the project plans. For this purpose we define a metric ADplan, which indicates whether or not a project plan is constructed adherently to the standard.

Next, we consider the execution of the projects. Ideally, the developers perform a project exactly as specified in the project plan. However, various problems often disturb a development project. For example, if many defects are found in a test activity, unexpected effort is needed to remove them. Thus we evaluate the execution of projects from two points of view: (1) whether a project is managed using a project plan, and (2) whether the ratio of review effort to total effort is large enough to avoid confusion. For this purpose, we define a metric ADexec that indicates whether or not a development project is performed adherently to a project plan. Then we perform correlation analyses between the evaluation of project plans (AD=ADplan+ADexec) and DV (the deviation of the cost estimate). The result of the analysis shows that there is some degree of correlation between them.
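
The correlation analysis between AD and DV can be sketched as follows. This part of the paper does not state which correlation coefficient is used, so the Pearson coefficient here is an assumption; the scoring scale for ADplan and ADexec, the convention that a higher score means closer adherence, and all data values are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical adherence scores and cost deviations (percent) for five projects.
ad_plan = [3, 5, 2, 4, 1]
ad_exec = [2, 4, 1, 3, 1]
dv = [12.0, 3.0, 20.0, 8.0, 35.0]

# AD = ADplan + ADexec, as defined in the text.
ad = [p + e for p, e in zip(ad_plan, ad_exec)]

r = pearson_r(ad, dv)
print(f"r(AD, DV) = {r:.2f}")
```

Under the assumed scoring direction, a negative coefficient would correspond to proposition P1: the more adherent the plan and its execution, the smaller the deviation.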

As for propositions P2 and P3, any project finished with a lower actual cost than the estimated one is likely to be considered successful from an economic point of view. However, from the project managers' point of view, such projects do not adhere to their project plans. Along these lines, we evaluate the resultant effects of the deviation of the cost estimate. To be precise, we investigate the relationship between the deviation of the cost estimate and both the quality of the final product and the productivity of the development team. In this analysis, we classify the projects into two distinct classes using DV (the deviation of the cost estimate): CS and CC. CS includes projects with DV<10%, and CC includes projects with DV≥10%. The tests of statistical hypotheses confirmed that both the quality and the productivity of the projects in CS are higher than those in CC (the level of significance is chosen as 0.05).
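
The classification into CS and CC and a comparison of the two classes can be sketched as below. The 10% threshold comes from the text; the choice of Welch's t statistic is an assumption, since this part of the paper does not name the test that was used, and the productivity figures, field names and helper names are hypothetical.

```python
import math
from statistics import mean, variance

DV_THRESHOLD = 10.0  # percent; CS: DV < 10%, CC: DV >= 10%


def classify(projects):
    """Split projects into CS (small deviation) and CC (large deviation)."""
    cs = [p for p in projects if p["dv"] < DV_THRESHOLD]
    cc = [p for p in projects if p["dv"] >= DV_THRESHOLD]
    return cs, cc


def welch_t(a, b):
    """Welch's t statistic for two independent samples (assumed test)."""
    return (mean(a) - mean(b)) / math.sqrt(
        variance(a) / len(a) + variance(b) / len(b)
    )


# Hypothetical project records: DV in percent, productivity in arbitrary units.
projects = [
    {"dv": 4.0, "productivity": 1.30},
    {"dv": 8.5, "productivity": 1.10},
    {"dv": 9.9, "productivity": 1.25},
    {"dv": 10.0, "productivity": 0.90},
    {"dv": 22.0, "productivity": 0.80},
    {"dv": 35.0, "productivity": 0.95},
]

cs, cc = classify(projects)
t = welch_t(
    [p["productivity"] for p in cs],
    [p["productivity"] for p in cc],
)
print(f"CS: {len(cs)} projects, CC: {len(cc)} projects, t = {t:.2f}")
```

A positive t indicates higher mean productivity in CS than in CC for this made-up sample. Deciding significance at the 0.05 level would additionally require comparing t against the appropriate t distribution, which this sketch leaves out.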

The rest of this paper is organized as follows. Section 2 describes the preliminaries of this study: target projects, process model and fundamental data. The metrics used in the analyses are defined in Section 3. The correlation analysis between the adherence to project plans and the deviation of the cost estimate is performed in Section 4. It shows that the deviation of the cost estimate is small if the plan is constructed adherently (to the standard) and the project is performed or managed adherently (to the constructed plan). The analysis of the relationship between the deviation of the cost estimate and the quality and the productivity is presented in Section 5. It shows with statistical significance that, in Company A, projects with a small deviation of the cost estimate achieve high quality of the delivered code and high productivity of the development team. Finally, Section 6 summarizes this paper.

Target projects

The projects targeted in this paper are the development of computer control systems with embedded software in Company A. The systems are classified into three categories: banking applications, railroad applications and business applications.

Though we omit the details, such embedded software implements rather complex functions dealing with many sensors, actuators and control signals, including various kinds of interrupts. Furthermore, since it is delivered in the form of LSI chips, modification of

Definition of metrics

In this section, we introduce five kinds of metrics for the analyses described in Section 4 (Analysis 1: deviation of the cost estimate and adherence) and Section 5 (Analysis 2: effect of deviation).

Analysis 1: deviation of the cost estimate and adherence

In the first analysis, we investigate the correlation between AD and DV.

Analysis 2: effect of deviation

We can see that project estimates have become more accurate in recent years in Company A. We have also observed some improvements in both the quality and the productivity. In this section, we clarify the relations between the deviation of the cost estimate and the quality and productivity.

Conclusion

In this paper, we have statistically tested three interesting propositions, P1, P2 and P3, as the results of empirical research. Although the implications of these propositions themselves may not be new to academics, they may become a driving force for promoting process improvement in a software development company through: (1) exhaustive collection of fundamental data, and (2) establishment of some kinds of standards (mentioned in Section 2).

The main results of our empirical research are

Acknowledgements

The authors would like to thank Prof Koji Torii of Nara Institute of Science and Technology and Associate Prof Shinji Kusumoto of Osaka University for their discussions and advice on the analysis in this paper. They would also like to thank the two reviewers for their useful comments on the earlier version of this paper.
