Predictive accuracy comparison of fuzzy models for software development effort of small programs

https://doi.org/10.1016/j.jss.2007.08.027

Abstract

Regression analysis to generate predictive equations for software development effort estimation has recently been complemented by analyses using less common methods such as fuzzy logic models. On the other hand, unless engineers have the capabilities provided by personal training, they cannot properly support their teams or consistently and reliably produce quality products. In this paper, an investigation aimed at comparing personal Fuzzy Logic Models (FLM) with a Linear Regression Model (LRM) is presented. The evaluation criteria were based mainly upon the magnitude of error relative to the estimate (MER) as well as the mean of MER (MMER). One hundred and five small programs were developed by thirty programmers. From these programs, three FLM were generated to estimate the effort in the development of twenty programs by seven programmers. Both the verification and the validation of the models were carried out. Results show slightly better predictive accuracy for the FLM than for the LRM when estimating development effort at the personal level for small programs.

Introduction

Software development effort estimates are the basis for project bidding and planning (both are critical practices in the software industry). The consequences of poor budgets and plans can be dramatic: if they are too pessimistic, business opportunities can be lost, while over-optimism may be followed by significant losses (Grimstad et al., 2005). Software effort estimation has even been identified as one of the three great challenges in Computer Science (Brooks, 2003).

Software development estimation techniques can be classified into three general categories (Mendes et al., 2002):

  • (1)

    Expert judgment: It aims to derive estimates based on the experience of experts on similar projects. The means of deriving an estimate are not explicit and therefore not repeatable (Mendes et al., 2002). Studies on expert estimation rate it as the preferred method among professional software developers (Molokken and Jorgensen, 2004). The term expert estimation is not clearly defined and covers a wide range of estimation approaches. A common characteristic, however, is that intuitive processes constitute a determinant part of the estimation (Jorgensen et al., 2000).

  • (2)

    Algorithmic models: To date this is the most popular category in the literature (Briand and Wieczorek, 2001). It attempts to represent the relationship between effort and one or more characteristics of a project; the main cost driver in such a model is usually taken to be some notion of software size (e.g. the number of lines of source code). Its general form is a linear regression equation, such as that used by Kok et al. (1990), or a group of non-linear regression equations, as used by Boehm (1981) in COCOMO 81 and in COCOMO II (Boehm et al., 2000).

  • (3)

    Machine learning: Machine learning techniques have in recent years been used as a complement or alternative to the previous two techniques (Pedrycz, 2002). Fuzzy logic models are included in this category (MacDonell and Gray, 1996) as well as neural networks (Idri et al., 2002a), genetic programming (Burguess and Lefley, 2001), regression trees (Srinivasan and Fisher, 1995), and case-based reasoning (Kadoda et al., 2000).

Given the fact that no single software development estimation technique is best for all situations, a careful comparison of the results of several approaches is most likely to produce realistic estimates (Boehm et al., 1998).

In this paper, the results of using a Linear Regression Model (LRM) are compared with those of three Fuzzy Logic Models (FLM). This comparison is based upon the two main stages of using an estimation model: (1) model adequacy checking (or model verification) must be performed, that is, determining whether the model is adequate to describe the observed (actual) data; if so, then (2) the estimation model is validated using new data.

On the other hand, considering that unless engineers have the capabilities provided by personal training, they will not be able to properly support their teams or consistently and reliably produce quality products (Humphrey, 2000), this paper suggests that the software estimation activity could begin with a personal-level approach, starting with the development of small programs.

The Capability Maturity Model (CMM) provides a description of the goals, methods, and practices needed in software engineering industrial practice, while the Personal Software Process (PSP) allows their instrumentation at a personal level (Humphrey, 1995). Twelve of the eighteen key process areas of the CMM are at least partially covered by the PSP. This paper is based upon some PSP practices and takes account of the guidelines suggested in Kitchenham et al. (2002).

In this study, lines of code and development time were gathered from 105 small programs developed by 30 programmers. From this set of programs, three FLM and a LRM were generated and their adequacy was checked (verification). Then, the FLM and the LRM equation were validated on the effort estimation of 20 programs developed by another group of seven programmers.

In spite of the availability of a wide range of software product size measures, source lines of code (LOC) remains favored by many models (MacDonell, 2003). There are two measures of source code size: physical source lines and logical source statements. The count of physical lines gives the size in terms of the physical length of the code as it appears when printed (Park, 1992).

To generate the FLM and LRM, New and Changed (N&C) (Humphrey, 1995) physical lines of code (LOC) were used. N&C is composed of added and modified code. The added code is the LOC written during the current programming process, while the modified code is the LOC changed in the base program when modifying a previously developed program. The base program is the total LOC of the previous program. Reused code (the LOC of previously developed programs that are used without modification) was analyzed and considered when establishing the LRM and FLM; however, it did not show statistical significance (Section 3.1).

A coding standard should establish a consistent set of coding practices that is used as a criterion when judging the quality of the produced code (Humphrey, 1995). Hence, it is necessary to always use the same coding and counting standards. The programs developed within this study followed these guidelines.
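As an illustration of such a counting standard, the sketch below counts physical LOC while excluding blank and comment-only lines. The exclusion rules and the comment prefix are assumptions chosen for the example; the actual standard used in the study is the one defined by its coding and counting guidelines.

```python
def count_physical_loc(source, comment_prefix="//"):
    """Count physical LOC, excluding blank and comment-only lines
    (one common counting-standard convention; illustrative only)."""
    count = 0
    for line in source.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith(comment_prefix):
            count += 1
    return count

sample = """\
// read two numbers and print their sum
int a = 1;
int b = 2;

printf("%d", a + b);
"""
print(count_physical_loc(sample))  # 3
```

Whatever convention is adopted, the same counter must be applied to every program so that the size measure stays comparable across the dataset.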

The solution adopted by cost estimation researchers for selecting the relevant attributes is to test the correlation (r) between effort and attributes (Idri and Khoshgoftaar, 2001). In this study, lines of code are correlated with development effort.
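As a sketch of this attribute-selection step, the following computes Pearson's r between LOC and effort and fits a least-squares LRM of the form effort = b0 + b1 · LOC. The data points are invented for illustration and are not the study's dataset.

```python
from statistics import mean

# Hypothetical (N&C LOC, effort-in-minutes) pairs -- illustrative only.
loc    = [45, 60, 82, 100, 130, 155]
effort = [30, 38, 52,  61,  79,  95]

def pearson_r(x, y):
    """Pearson correlation coefficient between two samples."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def fit_lrm(x, y):
    """Least-squares line: effort = b0 + b1 * LOC."""
    mx, my = mean(x), mean(y)
    b1 = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    b0 = my - b1 * mx
    return b0, b1

r = pearson_r(loc, effort)
b0, b1 = fit_lrm(loc, effort)
print(f"r = {r:.3f}, effort = {b0:.2f} + {b1:.3f} * LOC")
```

A high r justifies keeping LOC as the single input variable; an attribute with a weak correlation (as reused code turned out to be in this study) would be dropped.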

A fuzzy model is a modeling construct featuring two main properties (Pedrycz, 1998): (1) It operates at a level of linguistic terms (fuzzy sets), and (2) it represents and processes uncertainty.

All estimation techniques have an important limitation, which arises when software projects are described using categorical data (nominal or ordinal scale) such as small, medium, average, or high (linguistic terms or values). A more comprehensive approach to dealing with linguistic values is fuzzy set theory (Idri et al., 2002b). Specifically, fuzzy logic offers a particularly convenient way to generate a mapping between input and output spaces thanks to the natural expression of fuzzy rules (Zadeh, 1999).

In software development effort estimation, two considerations justify the decision to implement a fuzzy model: first, it is impossible to develop a precise mathematical model of the domain (Lewis, 2001); second, metrics only produce estimations of the real complexity. Thus, according to the previous assertions, formulating a small set of natural rules describing the underlying interactions between software metrics and effort could readily reveal their intrinsic and wider correlations; hence, this paper presents a comparative study designed to evaluate this proposition.

Disadvantages of a fuzzy model could be that (1) it requires a lot of data, (2) the estimators must be familiar with the historically developed programs, and (3) it is not useful for programs much larger or smaller than the historical data (Humphrey, 1995). In this research, only the third represents a disadvantage, since the sample comprised more than a hundred programs and these were well known to the estimators.

There are several fuzzification approaches that could potentially be applied to the effort estimation problem (Schofield, 1998). One of them is used in this study: constructing a rule induction system that replaces crisp facts with fuzzy inputs. An inference engine uses a base of rules to map inputs to a fuzzy output, which can either be translated back to a crisp value or left as a fuzzy value.
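This rule-based scheme can be sketched in a few lines. The membership-function parameters, the rule conclusions, and the simplified weighted-average defuzzification below are illustrative assumptions, not the models actually built in this study.

```python
def tri(x, a, b, c):
    """Triangular membership function with feet a, c and peak b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

# Hypothetical fuzzy sets for N&C LOC (illustrative parameters).
loc_sets = {"small": (0, 40, 80), "medium": (40, 90, 140), "large": (90, 150, 220)}

# One rule per input set, each concluding an effort value in minutes.
# Defuzzification here is a weighted average of the rule conclusions
# (a Sugeno-style simplification of the scheme described above).
effort_peaks = {"small": 25.0, "medium": 60.0, "large": 110.0}

def estimate_effort(loc):
    weights = {name: tri(loc, *params) for name, params in loc_sets.items()}
    total = sum(weights.values())
    if total == 0:
        raise ValueError("LOC outside the range covered by the fuzzy sets")
    return sum(w * effort_peaks[name] for name, w in weights.items()) / total

print(estimate_effort(65))  # blends the 'small' and 'medium' conclusions
```

Note that an input outside the supports of all sets raises an error, which mirrors disadvantage (3) above: the model is not useful for programs much larger or smaller than the historical data.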

Previous research (Foss et al., 2003) has demonstrated that the Magnitude of Relative Error (MRE), a common criterion for the evaluation of cost estimation models (Briand et al., 1998), does not identify the best prediction model. According to Foss et al. (2003), the implication of this finding is that the results and conclusions on prediction models over the past 15-25 years are unreliable and may have misled the entire software engineering discipline. It is therefore strongly recommended not to use MMRE to evaluate and compare prediction models (Foss et al., 2003), but rather the MMER proposed by Kitchenham et al. (2001).

The MER is defined as follows:

MER_i = |Actual Effort_i − Predicted Effort_i| / Predicted Effort_i

The MER value is calculated for each observation i whose effort is predicted. The aggregation of MER over multiple observations (N) can be achieved through the mean MER (MMER) as follows:

MMER = (1/N) Σ_{i=1}^{N} MER_i

Intuitively, MER seems preferable to MRE since MER measures the error relative to the estimate.

In Foss et al. (2003), MMER gave better results than MMRE. This is the reason for using MMER in this study.

However, MMER is sensitive to individual predictions with excessively large MERs. Therefore, an aggregate measure less sensitive to extreme values is also considered, namely the median of the MER values for the N observations (MdMER) (Briand et al., 2000).

A complementary criterion is the prediction at level l, Pred(l) = k/N, where k is the number of observations whose MER is less than or equal to l, and N is the total number of observations. Thus, Pred(25) gives the percentage of projects predicted with a MER less than or equal to 0.25.

In general, the accuracy of an estimation technique is proportional to Pred(l) and inversely proportional to MMER. As a reference, for effort prediction models an MMRE ≤ 0.25 is considered acceptable (Conte et al., 1986). No reference for an acceptable value of MMER was found.
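The three criteria above (MMER, MdMER and Pred(l)) can be computed as follows; the actual and predicted effort values are hypothetical, for illustration only.

```python
from statistics import median

def mer(actual, predicted):
    """Magnitude of error relative to the estimate (MER)."""
    return abs(actual - predicted) / predicted

def accuracy_summary(actuals, predictions, l=0.25):
    """MMER, MdMER and Pred(l) over paired actual/predicted efforts."""
    mers = [mer(a, p) for a, p in zip(actuals, predictions)]
    return {
        "MMER": sum(mers) / len(mers),           # mean MER
        "MdMER": median(mers),                   # median MER, robust to outliers
        f"Pred({int(l * 100)})": sum(m <= l for m in mers) / len(mers),
    }

# Hypothetical actual/predicted efforts in minutes -- illustrative only.
actuals     = [30, 45, 60, 80]
predictions = [28, 50, 75, 78]
print(accuracy_summary(actuals, predictions))
```

Note the denominator: dividing by the predicted effort (rather than the actual, as MRE does) is what makes this the error relative to the estimate.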

A group of papers on previous research into software development effort estimation based on FLM was reviewed.

Four different aspects were analyzed: (1) What has been achieved by empirical research studies on software effort estimation based on fuzzy logic? (2) From where has the dataset been gathered? (3) What metrics have been used? and (4) What were the results? A brief description of each paper found is given:

  • (1)

    Ahmed et al. (2004) present a FLM based upon triangular membership functions. The dataset for validating the FLM was (a) generated randomly and (b) taken from COCOMO 81. Results showed that the FLM was slightly better than the COCOMO equations. In addition, they reported promising experimental results in spite of the limited background knowledge in the rule base and training data. This signifies that there is potential for improvement when the framework is deployed in practice, since experienced experts could augment the dataset with their knowledge.

  • (2)

    Crespo et al. (2004) explore fuzzy regression techniques based upon fuzzification of input values. The COCOMO 81 project database is used. Their conclusion is that fuzzy regression is able to obtain estimation models with predictive properties similar to those of existing basic estimation models.

  • (3)

    Gray and MacDonell (1997) compare a FLM with linear regression equations as well as neural networks. The FLM is based upon triangular membership functions. The neural network performed better than the FLM and the regression equation. The dataset was obtained from a Canadian thesis. They concluded that, compared with other techniques, the fuzzy logic model shows good performance. Moreover, they suggest that there is a place in the field of software metrics for fuzzy logic models.

  • (4)

    Höst and Wohlin (1997) report an experiment based upon personal practices, but it uses only expert judgment for estimating the effort.

  • (5)

    Huang et al. (2003) propose a model combining fuzzy logic and neural networks. The dataset was obtained from the original COCOMO (1981). The results of the fuzzy logic model were better than those of the COCOMO equations. The FLM was based upon triangular membership functions. The main benefit of this model is its good interpretability through the fuzzy rules. Another advantage of this research is that it brings together expert knowledge (fuzzy rules), project data and the traditional algorithmic model into one general framework that may have a wide range of applicability in software cost estimation.

  • (6)

    Idri et al. (2000) apply fuzzy logic to the fifteen cost factors of COCOMO 81. The FLM is based upon trapezoidal membership functions. The dataset is randomly generated and compared with actual COCOMO 81 data. Results of the FLM were similar to those of COCOMO 81 (but in some cases worse). The researchers suggest using other kinds of membership functions, such as triangular ones.

  • (7)

    Idri et al. (2002b) propose an approach based on fuzzy logic named Fuzzy Analogy. Its dataset is that of COCOMO 81. Taking into account their results, they suggest the following ranking of the four techniques in terms of accuracy and adequacy to deal with linguistic values: 1. Fuzzy Analogy, 2. Fuzzy intermediate COCOMO'81, 3. Classical intermediate COCOMO'81, and 4. Classical Analogy.

  • (8)

    Musílek et al. (2000) propose a fuzzy model of COCOMO 81 named f-COCOMO that describes the relationship between size fuzzy sets and effort fuzzy sets. Triangular membership functions are used in this study. Moreover, they concluded that (a) fuzzy sets help articulate the estimates and their essence (by exploiting fuzzy numbers described by asymmetric membership functions) and (b) they generate feedback on the uncertainty (granularity) of the results.

  • (9)

    Reformat et al. (2004) propose a development effort estimation model based upon fuzzy neural networks. This model is applied to a medical information system. The dataset is divided into three subsets, one of which is used for validating the model.

  • (10)

    Xu and Khoshgoftaar (2004) presented a fuzzy identification cost estimation modeling technique to deal with linguistic data and automatically generate fuzzy membership functions and rules. A case study based on the COCOMO'81 database compared the proposed model with all three COCOMO'81 models (basic, intermediate and detailed). The fuzzy identification model provided significantly better cost estimations than the three COCOMO'81 models.

Two of the ten references above propose hybrid systems combining fuzzy logic with neural networks or analogy. In contrast, this paper uses only fuzzy logic in the estimation, without combining it with other techniques.

Seven of the ten references used COCOMO'81 as their dataset. In this paper, the dataset was gathered from 105 programs written by thirty developers, while the dataset for validating the models was gathered from 20 programs written by seven other developers.

The references used lines of code, function points (FP) or a variant of FP as metrics. In this paper, N&C lines of code were used as the input variable.

No paper was found that proposes and compares a fuzzy logic model for estimating the development effort of small programs based upon personal practices.

Section snippets

Experimental design

There are two main stages for using an estimation model (Montgomery and Peck, 2001):

  • (1)

    Model adequacy checking (model verification): In this paper the MMER is calculated for each model, together with an analysis of variance (ANOVA) to compare their MER values. The three residual assumptions for the MER ANOVA are analyzed.

  • (2)

    Model validation: Once the adequacy of the models was checked, the effort for the newly gathered dataset was estimated. In this paper the population comprised 37 programmers divided

Conducting the experiment and data collection

In order to ensure completeness and accuracy in data collection, a control method was followed. It consisted of recording measures using the logs described in Section 2.1. Table 2 depicts the actual data used by the developers.

Model adequacy checking (Model verification)

The LRM equation (Eq. (4)) and the three FLM (Table 7, Table 8, Table 9) were applied to the original data set (Table 10). The MER per program as well as the MMER per model were then calculated.

The ANOVA result for the MER of the programs (Table 11) shows that there is no statistically significant difference among the prediction accuracy of the four models.
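The comparison reported in Table 11 rests on a one-way ANOVA over the models' MER values. A pure-Python sketch of the F statistic it relies on follows; the MER samples are invented for illustration and are not the study's data.

```python
def one_way_anova_f(*groups):
    """One-way ANOVA F statistic for k groups of observations."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    ss_between = 0.0  # variation of group means around the grand mean
    ss_within = 0.0   # variation of observations around their group mean
    for g in groups:
        m = sum(g) / len(g)
        ss_between += len(g) * (m - grand_mean) ** 2
        ss_within += sum((x - m) ** 2 for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Hypothetical MER samples for four models (LRM plus three FLM).
lrm  = [0.10, 0.22, 0.15, 0.30]
flm1 = [0.12, 0.18, 0.20, 0.25]
flm2 = [0.09, 0.24, 0.17, 0.28]
flm3 = [0.11, 0.21, 0.16, 0.27]
print(one_way_anova_f(lrm, flm1, flm2, flm3))  # small F: no significant difference
```

An F value well below the critical value for (k − 1, n − k) degrees of freedom supports the conclusion that the models' accuracies do not differ significantly.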

The following three assumptions of residuals for MER ANOVA were analyzed:

  • (1)

    Independent samples: The group of developers was made up of separate

Conclusions and Future Research

This research was founded on the three following facts: (1) software development effort estimation is one of the most critical activities in managing software projects, (2) given that no single software development estimation technique is best for all situations, a careful comparison of the results of several approaches is most likely to produce realistic estimates, (3) unless engineers have the capabilities provided by personal training, they cannot properly support their teams or consistently

Acknowledgements

We would like to thank the Center for Computing Research, at the National Polytechnic Institute, Mexico as well as Consejo Nacional de Ciencia y Tecnología (CONACYT).

We also thank the Federal Commission of Electricity at Guadalajara, Jalisco, the students of the Universidad del Valle de Atemajac in Guadalajara, and the developers of the Programa Avanzado de Formación de Recursos Humanos en Tecnologías de Información (PAFTI) of the Centro de Investigación y de Estudios Avanzados (CINVESTAV)-Guadalajara.


References (43)

  • Ahmed, M.A., et al., 2004. Adaptive Fuzzy Logic-Based Framework for Software Development Effort Prediction.
  • Boehm, B., 1981. Software Engineering Economics.
  • Boehm, B., Abts, Ch., Chulani, S., 1998. Software Development Cost Estimation Approaches – A Survey. Chulani Ph.D. ...
  • Boehm, B., et al., 2000. COCOMO II.
  • Briand, L.C., et al., 2001. Software Resource Estimation.
  • Briand, L.C., Emam, K.E., Surmann, D., Wieczorek, I., 1998. An Assessment and Comparison of Common Software Cost ...
  • Briand, L.C., Langley, T., Wieczorek, I., 2000. A Replicated Assessment and Comparison of Common Software Cost Modeling ...
  • Brooks, F.P., 2003. Three great challenges for half-century-old computer science. Journal of the ACM.
  • Burguess, C.J., et al., 2001. Can Genetic Programming Improve Software Effort Estimation? A Comparative Evaluation.
  • Conte, S.D., et al., 1986. Software Engineering Metrics and Models.
  • Crespo, F.J., et al., 2004. On the use of fuzzy regression in parametric software estimation models: integrating imprecision in COCOMO cost drivers. WSEAS Transactions on Systems.
  • Foss, T., et al., 2003. A simulation study of the model evaluation criterion MMRE. IEEE Transactions on Software Engineering.
  • Gray, A.R., MacDonell, S.G., 1997. Applications of Fuzzy Logic to Software Metric Models for Development Effort ...
  • Grimstad, S., et al., 2005. Software Effort Estimation Terminology: The Tower of Babel.
  • Höst, M., et al., 1997. A subjective effort estimation experiment.
  • Huang, X., Capretz, L.F., Ren, J., Ho, D.A., 2003. Neuro-Fuzzy Model for Software Cost Estimation. In: Proceedings of ...
  • Humphrey, W., 1995. A Discipline for Software Engineering.
  • Humphrey, W., 2000. The Personal Software Process. Technical Report ...
  • Idri, A., Abran, A., Kjiri, L., 2000. COCOMO Cost Model Using Fuzzy Logic. In: 7th International Conference on Fuzzy ...
  • Idri, A., Khoshgoftaar, T., 2001. Fuzzy Analogy: A New Approach for Software Cost Estimation. In: International ...
  • Xu, Z., et al., 2004. Identification of fuzzy models of software cost estimation. Fuzzy Sets and Systems.

    Cuauhtémoc López Martín graduated in Computer Engineering in the Universidad Autónoma de Tlaxcala, México, in 1995. He received a M. Sc. Degree in Information Systems from Universidad de Guadalajara, Jalisco, México in 2000, and a Ph.D. in Computer Science in the Center for Computing Research of the National Polytechnic Institute of Mexico, in 2007. He has been professor at three universities and he has developed software for several organizations. His research interests are on software engineering, specifically on software development effort estimation.

    Cornelio Yáñez Márquez received his B.S. degree in Physics and Mathematics from the Escuela Superior de Física y Matemáticas at IPN, México in 1989; the M.Sc. degree in Computing Engineering from the CINTECIPN, México in 1995, and the Ph.D. from the Center for Computing Research of the National Polytechnic Institute of Mexico in 2002, receiving the Lázaro Cárdenas Award 2002. Currently, he is a titular professor at the Center for Computing Research, México. His research interests include associative memories, mathematical morphology and neural networks.

    Agustín Gutiérrez Tornés received the B.Sc. degree in Economics (Universidad de La Habana, Cuba, 1970). He received the Ph.D. degree in Agricultural Sciences (Economics Department, Universidad Agrícola, SGGW-AR, Warsaw, Poland, 1984). He is currently Coordinator of Systems for BANAMEX, S.A. Moreover, he is a professor at the Tecnológico de Estudios Superiores de Monterrey (ITESM, Mexico City). He has a long history as a researcher and professor on topics related to software engineering. He has worked as a consultant and professor for several universities and organizations of the United Nations.
