Bayesian Prediction of Fault-Proneness of Agile-Developed Object-Oriented System

  • Conference paper

Part of the book series: Lecture Notes in Business Information Processing (LNBIP, volume 190)

Abstract

Logistic regression (LR) and naïve Bayes (NB), both used extensively to predict fault-proneness, assume linear additivity and independence of the predictors, respectively; these assumptions often do not hold in practice. We therefore propose a Bayesian network (BN) model that incorporates data mining techniques as an integrative approach. Compared with LR and NB, BN provides a flexible modeling framework that avoids these assumptions. Using static metrics such as Chidamber and Kemerer’s (C-K) suite and complexity as predictors, we compared the performance of LR, NB and BN models for class-level fault-proneness prediction across five successive releases of Rhino, an open-source implementation of JavaScript developed using an agile process. Based on cross-validation and independent tests on successive versions, we conclude that the proposed BN predicts better than LR and NB for agile software, owing to its flexible modeling framework and its incorporation of multiple sophisticated learning algorithms.
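
For readers unfamiliar with the assumptions contrasted in the abstract, the standard textbook forms of the three models can be sketched as follows. The notation is introduced here for illustration only and is not taken from the paper: Y is assumed to be the binary fault-proneness indicator of a class, x_1, ..., x_n the static metrics used as predictors, the β terms the regression coefficients, and pa(v) the set of parents of node v in the network graph.

```latex
\begin{align}
\text{LR:}\quad & \log\frac{P(Y=1\mid x_1,\dots,x_n)}{1-P(Y=1\mid x_1,\dots,x_n)}
                  = \beta_0 + \sum_{i=1}^{n}\beta_i x_i \\
\text{NB:}\quad & P(Y\mid x_1,\dots,x_n) \propto P(Y)\prod_{i=1}^{n} P(x_i\mid Y) \\
\text{BN:}\quad & P(Y, x_1,\dots,x_n)
                  = \prod_{v\,\in\,\{Y,\,x_1,\dots,x_n\}} P\bigl(v \mid \mathrm{pa}(v)\bigr)
\end{align}
```

In this notation NB is the special case of a BN in which Y has no parents and Y is the only parent of every x_i. A general BN may also place edges between the metrics themselves, which is how it can represent dependencies that the other two forms rule out.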

Acknowledgements

This research is partly supported by the Hong Kong CERG grant PolyU5225/08E, NSFC grant 1171344/D010703, and MOST grants 2012CB955503 and 2011AA120305–1.

Author information

Corresponding author

Correspondence to Hareton Leung.

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Li, L., Leung, H. (2014). Bayesian Prediction of Fault-Proneness of Agile-Developed Object-Oriented System. In: Hammoudi, S., Cordeiro, J., Maciaszek, L., Filipe, J. (eds) Enterprise Information Systems. ICEIS 2013. Lecture Notes in Business Information Processing, vol 190. Springer, Cham. https://doi.org/10.1007/978-3-319-09492-2_13

  • DOI: https://doi.org/10.1007/978-3-319-09492-2_13

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-09491-5

  • Online ISBN: 978-3-319-09492-2

  • eBook Packages: Computer Science, Computer Science (R0)
