Corrective commit probability: a measure of the effort invested in bug fixing

Software Quality Journal

Abstract

The effort invested in software development should ideally be devoted to the implementation of new features. But some of the effort is invariably also invested in corrective maintenance, that is, in fixing bugs. Not much is known about what fraction of software development work is devoted to bug fixing, and what factors affect this fraction. We suggest the Corrective Commit Probability (CCP), which measures the probability that a commit reflects corrective maintenance, as an estimate of the relative effort invested in fixing bugs. We identify corrective commits by applying a linguistic model to the commit messages, achieving an accuracy of 93%, higher than any previously reported model. We compute the CCP of all large active GitHub projects (7,557 projects with 200+ commits in 2019). This leads to the creation of an investment scale, suggesting that the bottom 10% of projects spend less than 6% of their total effort on bug fixing, while the top 10% of projects spend at least 39% of their effort on bug fixing — more than six times as much. Being a process metric, CCP is conditionally independent of source code metrics, enabling their evaluation and investigation. Analysis of project attributes shows that lower CCP (that is, lower relative investment in bug fixing) is associated with smaller files, lower coupling, use of languages like JavaScript and C# as opposed to PHP and C++, fewer code smells, lower project age, better perceived quality, fewer developers, lower developer churn, better onboarding, and better productivity.
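
Since CCP is defined as a simple proportion, a short sketch may help make the computation concrete. The code below is a minimal illustration rather than the authors' implementation: the keyword heuristic is a hypothetical stand-in for the paper's linguistic commit-message model (reported 93% accuracy), the error correction shown is the generic Rogan-Gladen prevalence estimator rather than the paper's exact procedure, and the commit log is invented.

```python
# Minimal sketch of Corrective Commit Probability (CCP), not the authors'
# implementation. The keyword pattern is a hypothetical stand-in for the
# paper's linguistic model; the commit messages below are invented.
import re

# Hypothetical indicators of corrective maintenance in commit messages.
CORRECTIVE = re.compile(
    r"\b(fix(e[sd])?|bug|defect|fault|error|crash)\b", re.IGNORECASE
)

def is_corrective(message: str) -> bool:
    """Classify one commit message as corrective (bug fixing) or not."""
    return bool(CORRECTIVE.search(message))

def raw_ccp(messages: list[str]) -> float:
    """Observed CCP: the fraction of commits classified as corrective."""
    if not messages:
        raise ValueError("need at least one commit message")
    return sum(is_corrective(m) for m in messages) / len(messages)

def adjusted_ccp(observed: float, hit_rate: float, fp_rate: float) -> float:
    """Correct the observed rate for classifier error using the generic
    Rogan-Gladen prevalence estimator (an assumption here; the paper's
    exact adjustment may differ). hit_rate and fp_rate describe the
    classifier, e.g. as measured on a labeled sample."""
    estimate = (observed - fp_rate) / (hit_rate - fp_rate)
    return min(1.0, max(0.0, estimate))  # clamp to a valid probability

if __name__ == "__main__":
    log = [  # toy commit log for one project (illustrative only)
        "Add OAuth2 login flow",
        "Fix null pointer crash on empty config",
        "Refactor parser module",
        "Bug: off-by-one in pagination",
        "Update README",
    ]
    observed = raw_ccp(log)  # 2 of 5 commits match -> 0.40
    print(f"raw CCP      = {observed:.2f}")
    # Assumed example rates: 95% hit rate, 5% false positive rate.
    print(f"adjusted CCP = {adjusted_ccp(observed, 0.95, 0.05):.2f}")
```

On this toy log, two of five commits are classified as corrective, so the raw CCP is 0.40; the paper computes such estimates for 7,557 large active GitHub projects and ranks them on the resulting investment scale.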

Notes

  1. https://console.cloud.google.com/bigquery?d=github_repos&p=bigquery-public-data&page=dataset

  2. https://github.blog/2018-11-08-100m-repos/

  3. https://hacktoberfest.digitalocean.com/

References

  • Al-Kilidar, H., Cox, K., & Kitchenham, B. (2005). The use and usefulness of the ISO/IEC 9126 quality standard. In International Symposium on Empirical Software Engineering, pages 126–132.

  • Allamanis, M. (2019). The adverse effects of code duplication in machine learning models of code. In Proceedings of the 2019 ACM SIGPLAN International Symposium on New Ideas, New Paradigms, and Reflections on Programming and Software, Onward! 2019, pages 143–153, New York, NY, USA. Association for Computing Machinery.

  • Amit, I., Ben Ezra, N., & Feitelson, D. G. (2021). Follow your nose – which code smells are worth chasing? arXiv:2103.01861 [cs.SE].

  • Amit, I., & Feitelson, D. G. (2019). Which refactoring reduces bug rate? In Proceedings of the Fifteenth International Conference on Predictive Models and Data Analytics in Software Engineering, PROMISE’19, pages 12–15, New York, NY, USA. ACM.

  • Amit, I., Firstenberg, E., & Meshi, Y. (2017). Framework for semi-supervised learning when no labeled data is given. U.S. patent application #US20190164086A1.

  • Amor, J. J., Robles, G., Gonzalez-Barahona, J. M., & Navarro, A. (2006). Discriminating development activities in versioning systems: A case study.

  • Antoniol, G., Ayari, K., Di Penta, M., Khomh, F., & Guéhéneuc, Y. G. (2008). Is it a bug or an enhancement? A text-based approach to classify change requests. In Proceedings of the 2008 Conference of the Center for Advanced Studies on Collaborative Research: Meeting of Minds, CASCON ’08, New York, NY, USA. Association for Computing Machinery.

  • Arcelli Fontana, F., Ferme, V., Marino, A., Walter, B., & Martenka, P. (2013). Investigating the impact of code smells on system’s quality: An empirical study on systems of different application domains. In IEEE International Conference on Software Maintenance, ICSM, pages 260–269.

  • Argyle, M. (1989). Do happy workers work harder? the effect of job satisfaction on job performance. In R. Veenhoven (Ed.), How harmful is happiness? Consequences of enjoying life or not. Rotterdam, The Netherlands: Universitaire Pers.

  • Arpit, D., Jastrzȩbski, S., Ballas, N., Krueger, D., Bengio, E., Kanwal, M. S., Maharaj, T., Fischer, A., Courville, A., Bengio, Y., et al. (2017). A closer look at memorization in deep networks. arXiv preprint arXiv:1706.05394.

  • Avelino, G., Constantinou, E., Valente, M. T., & Serebrenik, A. (2019). On the abandonment and survival of open source projects: An empirical investigation. CoRR, abs/1906.08058.

  • Avelino, G., Passos, L. T., Hora, A. C., & Valente, M. T. (2016). A novel approach for estimating truck factors. CoRR, abs/1604.06766.

  • Baggen, R., Correia, J. P., Schill, K., & Visser, J. (2012). Standardized code quality benchmarking for improving software maintainability. Software Quality Journal, 20(2), 287–307.

  • Basili, V. R., Briand, L. C., & Melo, W. L. (1996). A validation of object-oriented design metrics as quality indicators. IEEE Transactions on Software Engineering, 22(10), 751–761.

  • Beizer, B. (2003). Software testing techniques. Dreamtech Press.

  • Berger, E. D., Hollenbeck, C., Maj, P., Vitek, O., & Vitek, J. (2019). On the impact of programming languages on code quality: A reproduction study. ACM Transactions on Programming Languages and Systems, 41(4).

  • Bernardo, J. H., da Costa, D. A., & Kulesza, U. (2018). Studying the impact of adopting continuous integration on the delivery time of pull requests. In Proceedings of the 15th International Conference on Mining Software Repositories, MSR ’18, pages 131–141, New York, NY, USA. Association for Computing Machinery.

  • Bhattacharya, P., & Neamtiu, I. (2011). Assessing programming language impact on development and maintenance: A study on C and C++. In 2011 33rd International Conference on Software Engineering (ICSE), pages 171–180.

  • Bird, C., Bachmann, A., Aune, E., Duffy, J., Bernstein, A., Filkov, V., & Devanbu, P. (2009). Fair and balanced? Bias in bug-fix datasets. In Proceedings of the 7th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on The Foundations of Software Engineering, ESEC/FSE ’09, pages 121–130, New York, NY, USA. ACM.

  • Bird, C., Nagappan, N., Murphy, B., Gall, H., & Devanbu, P. (2011). Don’t touch my code! Examining the effects of ownership on software quality. In Proceedings of the 19th ACM SIGSOFT Symposium and the 13th European Conference on Foundations of Software Engineering, pages 4–14.

  • Bird, C., Rigby, P. C., Barr, E. T., Hamilton, D. J., German, D. M., & Devanbu, P. (2009). The promises and perils of mining git. In 2009 6th IEEE International Working Conference on Mining Software Repositories, pages 1–10.

  • Blum, A., & Mitchell, T. (1998). Combining labeled and unlabeled data with co-training. In Proceedings of the Eleventh Annual Conference on Computational Learning Theory, COLT ’98, pages 92–100, New York, NY, USA. ACM.

  • Boehm, B., & Basili, V. R. (2001). Software defect reduction top 10 list. Computer, 34(1), 135–137.

  • Boehm, B. W. (1981). Software Engineering Economics. Prentice-Hall.

  • Boehm, B. W., Brown, J. R., & Lipow, M. (1976). Quantitative evaluation of software quality. International Conference Software Engineering, 2, 592–605.

  • Boehm, B. W., & Papaccio, P. N. (1988). Understanding and controlling software costs. IEEE Transactions on Software Engineering, 14(10), 1462–1477.

  • Box, G. (1979). Robustness in the strategy of scientific model building. In R. L. Launer & G. N. Wilkinson (Eds.), Robustness in Statistics, pages 201–236. Academic Press.

  • Brooks Jr., F. P. (1975). The Mythical Man-Month: Essays on Software Engineering. Addison-Wesley.

  • Campbell, J. P., McCloy, R. A., Oppler, S. H., & Sager, C. E. (1993). A theory of performance. In N. Schmitt, W. C. Borman, and Associates, editors, Personnel Selection in Organizations, pages 35–70. Jossey-Bass Pub.

  • Chidamber, S. R., & Kemerer, C. F. (1994). A metrics suite for object oriented design. IEEE Transactions on Software Engineering, 20(6), 476–493.

  • Cohen, J. (1960). A coefficient of agreement for nominal scales. Educational and Psychological Measurement, 20(1), 37–46.

  • Corral, L., & Fronza, I. (2015). Better code for better apps: A study on source code quality and market success of Android applications. In 2015 2nd ACM International Conference on Mobile Software Engineering and Systems, pages 22–32.

  • Crosby, P. (1979). Quality Is Free: The Art of Making Quality Certain. McGraw-Hill.

  • Cunningham, W. (1992). The WyCash portfolio management system. In Addendum to the Proceedings on Object-Oriented Programming Systems, Languages, and Applications (Addendum), OOPSLA ’92, pages 29–30, New York, NY, USA. Association for Computing Machinery.

  • da Costa, D. A., McIntosh, S., Treude, C., Kulesza, U., & Hassan, A. E. (2018). The impact of rapid release cycles on the integration delay of fixed issues. Empirical Software Engineering, 23(2), 835–904.

  • D’Ambros, M., Lanza, M., & Robbes, R. (2010). An extensive comparison of bug prediction approaches. In 2010 7th IEEE Working Conference on Mining Software Repositories (MSR 2010), pages 31–41.

  • Dawid, A. P., & Skene, A. M. (1979). Maximum likelihood estimation of observer error-rates using the EM algorithm. Journal of the Royal Statistical Society. Series C (Applied Statistics), 28(1), 20–28.

  • Dawson, M., Burrell, D., Rahim, E., & Brewster, S. (2010). Integrating software assurance into the software development life cycle (SDLC). Journal of Information Systems Technology and Planning, 3, 49–53.

  • Dorfman, R. (1979). A formula for the Gini coefficient. The Review of Economics and Statistics, 61(1), 146–149.

  • Dromey, G. (1995). A model for software product quality. IEEE Transactions on Software Engineering, 21(2), 146–162.

  • Efron, B. (1992). Bootstrap Methods: Another Look at the Jackknife. In S. Kotz & N. L. Johnson (Eds.), Breakthroughs in Statistics: Methodology and Distribution, pages 569–593. Springer New York, New York, NY.

  • Fisher, R. (1919). The correlation between relatives on the supposition of Mendelian inheritance. Transactions of the Royal Society of Edinburgh, 52(2), 399–433.

  • Fowler, M., Beck, K., & Opdyke, W. R. (1997). Refactoring: Improving the design of existing code. In 11th European Conference. Jyväskylä, Finland.

  • Gharehyazie, M., Ray, B., Keshani, M., Zavosht, M. S., Heydarnoori, A., & Filkov, V. (2019). Cross-project code clones in GitHub. Empirical Software Engineering, 24(3), 1538–1573.

  • Ghayyur, S. A. K., Ahmed, S., Ullah, S., & Ahmed, W. (2018). The impact of motivator and demotivator factors on agile software development. International Journal of Advanced Computer Science and Applications, 9(7).

  • Gil, Y., & Lalouche, G. (2017). On the correlation between size and metric validity. Empirical Software Engineering, 22(5), 2585–2611.

  • Gousios, G., & Spinellis, D. (2012). GHTorrent: GitHub’s data from a firehose. In 2012 9th IEEE Working Conference on Mining Software Repositories (MSR), pages 12–21. IEEE.

  • Graves, T. L., Karr, A. F., Marron, J. S., & Siy, H. (2000). Predicting fault incidence using software change history. IEEE Transactions on Software Engineering, 26(7), 653–661.

  • Gyimothy, T., Ferenc, R., & Siket, I. (2005). Empirical validation of object-oriented metrics on open source software for fault prediction. IEEE Transactions on Software Engineering, 31(10), 897–910.

  • Hackbarth, R., Mockus, A., Palframan, J., & Sethi, R. (2016). Improving software quality as customers perceive it. IEEE Software, 33(4), 40–45.

  • Hall, T., Beecham, S., Bowes, D., Gray, D., & Counsell, S. (2012). A systematic literature review on fault prediction performance in software engineering. IEEE Transactions on Software Engineering, 38(6), 1276–1304.

  • Halstead, M. H. (1977). Elements of Software Science (Operating and Programming Systems Series). New York, NY, USA: Elsevier Science Inc.

  • Hastings, C., Mosteller, F., Tukey, J. W., & Winsor, C. P. (1947). Low moments for small samples: A comparative study of order statistics. Annals of Mathematical Statistics, 18(3), 413–426.

  • Hattori, L. P., & Lanza, M. (2008). On the nature of commits. In 2008 23rd IEEE/ACM International Conference on Automated Software Engineering-Workshops, pages 63–71. IEEE.

  • Hawkins, D. M. (2004). The problem of overfitting. Journal of Chemical Information and Computer Sciences, 44(1), 1–12.

  • Herbold, S., Trautsch, A., Ledel, B., Aghamohammadi, A., Ghaleb, T. A., Chahal, K. K., Bossenmaier, T., Nagaria, B., Makedonski, P., Ahmadabadi, M. N., Szabados, K., Spieker, H., Madeja, M., Hoy, N., Lenarduzzi, V., Wang, S., Rodriguez-Perez, G., Colomo-Palacios, R., Verdecchia, R., Singh, P., Qin, Y., Chakroborti, D., Davis, W., Walunj, V., Wu, H., Marcilio, D., Alam, O., Aldaeej, A., Amit, I., Turhan, B., Eismann, S., Wickert, A. K., Malavolta, I., Sulir, M., Fard, F., Henley, A. Z., Kourtzanidis, S., Tuzun, E., Treude, C., Shamasbi, S. M., Pashchenko, I., Wyrich, M., Davis, J., Serebrenik, A., Albrecht, E., Aktas, E. U., Strüber, D., & Erbel, J. (2020). Large-scale manual validation of bug fixing commits: A fine-grained analysis of tangling. arXiv:2011.06244 [cs.SE].

  • Herzig, K., Just, S., & Zeller, A. (2013). It’s not a bug, it’s a feature: How misclassification impacts bug prediction. In Proceedings of the 2013 International Conference on Software Engineering, ICSE ’13, pages 392–401, Piscataway, NJ, USA. IEEE Press.

  • Herzig, K., & Zeller, A. (2013). The impact of tangled code changes. In 2013 10th Working Conference on Mining Software Repositories (MSR), pages 121–130.

  • Hindle, A., German, D. M., Godfrey, M. W., & Holt, R. C. (2009). Automatic classification of large changes into maintenance categories. In 2009 IEEE 17th International Conference on Program Comprehension, pages 30–39.

  • Hovemeyer, D., & Pugh, W. (2004). Finding bugs is easy. SIGPLAN Notices, 39(12), 92–106.

  • International Organization for Standardization. (2001). ISO/IEC 9126-1:2001 Software engineering - Product quality - Part 1: Quality model.

  • International Organization for Standardization. (2011). ISO/IEC 25010:2011 Systems and software engineering - systems and software quality requirements and evaluation (SQuaRE) - System software quality models.

  • Jiang, Z., Naud, P., & Comstock, C. (2007). An investigation on the variation of software development productivity. International Journal of Computer and Information Science and Engineering, 1(2), 72–81.

  • Jones, C. (1991). Applied Software Measurement: Assuring Productivity and Quality. New York, NY, USA: McGraw-Hill Inc.

  • Jones, C. (2006). Social and technical reasons for software project failures. Crosstalk, The Journal of Defense Software Engineering, 19(6), 4.

  • Jones, C. (2012). Software quality in 2012: A survey of the state of the art. [Online; accessed 24-September-2018].

  • Jones, C. (2015). Wastage: The impact of poor quality on software economics. Software Quality Professional, 18(1), 23–32. Retrieved from http://asq.org/pub/sqp/.

  • Kalliamvakou, E., Gousios, G., Blincoe, K., Singer, L., German, D. M., & Damian, D. (2016). An in-depth study of the promises and perils of mining GitHub. Empirical Software Engineering, 21, 2035–2071.

  • Kamei, Y., Shihab, E., Adams, B., Hassan, A. E., Mockus, A., Sinha, A., & Ubayashi, N. (2013). A large-scale empirical study of just-in-time quality assurance. IEEE Transactions on Software Engineering, 39(6), 757–773.

  • Kemerer, C. F. (1993). Reliability of function points measurement: A field experiment. Communications of the ACM, 36(2), 85–97.

  • Kemerer, C. F., & Porter, B. S. (1992). Improving the reliability of function point measurement: An empirical study. IEEE Transactions on Software Engineering, 18(11), 1011–1024.

  • Khomh, F., Dhaliwal, T., Zou, Y., & Adams, B. (2012). Do faster releases improve software quality? An empirical case study of Mozilla Firefox. In Proceedings of the 9th IEEE Working Conference on Mining Software Repositories, MSR ’12, pages 179–188, Piscataway, NJ, USA. IEEE Press.

  • Khomh, F., Di Penta, M., & Gueheneuc, Y. G. (2009). An exploratory study of the impact of code smells on software change-proneness. In 2009 16th Working Conference on Reverse Engineering, pages 75–84. IEEE.

  • Kim, S., & Whitehead Jr., E. J. (2006). How long did it take to fix bugs? In Proceedings of the 2006 International Workshop on Mining Software Repositories, MSR ’06, pages 173–174, New York, NY, USA. ACM.

  • Kim, S., Zimmermann, T., Whitehead Jr., E. J., & Zeller, A. (2007). Predicting faults from cached history. In Proceedings of the 29th International Conference on Software Engineering, ICSE ’07, pages 489–498, Washington, DC, USA. IEEE Computer Society.

  • Kochhar, P. S., Wijedasa, D., & Lo, D. (2016). A large scale study of multiple programming languages and code quality. In 2016 IEEE 23rd International Conference on Software Analysis, Evolution, and Reengineering (SANER), volume 1, pages 563–573.

  • Krawczyk, B. (2016). Learning from imbalanced data: open challenges and future directions. Progress in Artificial Intelligence, 5(4), 221–232.

  • Kruchten, P., Nord, R. L., & Ozkaya, I. (2012). Technical debt: From metaphor to theory and practice. IEEE Software, 29(6), 18–21.

  • LaToza, T. D., Venolia, G., & DeLine, R. (2006). Maintaining mental models: A study of developer work habits. In Proceedings of the 28th International Conference on Software Engineering, pages 492–501.

  • Lehman, M. M. (1980). Programs, life cycles, and laws of software evolution. Proceedings of the IEEE, 68(9), 1060–1076.

  • Lehman, M. M., Ramil, J. F., Wernick, P. D., Perry, D. E., & Turski, W. M. (1997). Metrics and laws of software evolution - the nineties view. International Software Metrics Symposium, 4, 20–32.

  • Levin, S., & Yehudai, A. (2017). Boosting automatic commit classification into maintenance activities by utilizing source code changes. In Proceedings of the 13th International Conference on Predictive Models and Data Analytics in Software Engineering, PROMISE, pages 97–106, New York, NY, USA. ACM.

  • Levin, S., & Yehudai, A. (2017). The co-evolution of test maintenance and code maintenance through the lens of fine-grained semantic changes. In 2017 IEEE International Conference on Software Maintenance and Evolution (ICSME), pages 35–46. IEEE.

  • Lewis, D. D. (1998). Naive (Bayes) at forty: The independence assumption in information retrieval. In C. Nédellec & C. Rouveirol (Eds.), Machine Learning: ECML-98, pages 4–15, Berlin, Heidelberg. Springer Berlin Heidelberg.

  • Lientz, B. P. (1983). Issues in software maintenance. ACM Computing Surveys, 15(3), 271–278.

  • Lientz, B. P., Swanson, E. B., & Tompkins, G. E. (1978). Characteristics of application software maintenance. Communication ACM, 21(6), 466–471.

  • Lipow, M. (1982). Number of faults per line of code. IEEE Transactions on Software Engineering, 4, 437–439.

  • Lopes, C. V., Maj, P., Martins, P., Saini, V., Yang, D., Zitny, J., Sajnani, H., & Vitek, J. (2017). DéjàVu: A map of code duplicates on GitHub. Proceedings of the ACM on Programming Languages, 1(OOPSLA).

  • Maxwell, K. D., & Forselius, P. (2000). Benchmarking software development productivity. IEEE Software, 17(1), 80–88.

  • Maxwell, K. D., Van Wassenhove, L., & Dutta, S. (1996). Software development productivity of European space, military, and industrial applications. IEEE Transactions on Software Engineering, 22(10), 706–718.

  • McCabe, T. J. (1976). A complexity measure. IEEE Transactions on Software Engineering, 2(4), 308–320.

  • Mockus, A., Spinellis, D., Kotti, Z., & Dusing, G. J. (2020). A complete set of related git repositories identified via community detection approaches based on shared commits. In 17th International Conference on Mining Software Repositories (MSR), pages 513–517.

  • Molnar, A. J., Neamţu, A., & Motogna, S. (2020). Evaluation of software product quality metrics. In E. Damiani, G. Spanoudakis, & L. A. Maciaszek (Eds.), Evaluation of Novel Approaches to Software Engineering (pp. 163–187). Cham: Springer International Publishing.

  • Morasca, S., & Russo, G. (2001). An empirical study of software productivity. In 25th Annual International Computer Software and Applications Conference (COMPSAC 2001), pages 317–322.

  • Moser, R., Pedrycz, W., & Succi, G. (2008). Analysis of the reliability of a subset of change metrics for defect prediction. In Proceedings of the Second ACM-IEEE International Symposium on Empirical Software Engineering and Measurement, ESEM ’08, pages 309–311, New York, NY, USA. ACM.

  • Munaiah, N., Kroh, S., Cabrey, C., & Nagappan, M. (2017). Curating GitHub for engineered software projects. Empirical Software Engineering, 22(6), 3219–3253.

  • Murphy-Hill, E., Jaspan, C., Sadowski, C., Shepherd, D. C., Phillips, M., Winter, C., Dolan, A. K., Smith, E. K., & Jorde, M. A. (2021). What predicts software developers’ productivity? IEEE Transactions on Software Engineering, 47(3), 582–594.

  • Myers, G. J., Badgett, T., Thomas, T. M., & Sandler, C. (2004). The art of software testing, volume 2. Wiley Online Library.

  • Nanz, S., & Furia, C. A. (2015). A comparative study of programming languages in Rosetta Code. In 2015 IEEE/ACM 37th IEEE International Conference on Software Engineering, volume 1, pages 778–788.

  • Norick, B., Krohn, J., Howard, E., Welna, B., & Izurieta, C. (2010). Effects of the number of developers on code quality in open source software: A case study. In Proceedings of the 2010 ACM-IEEE International Symposium on Empirical Software Engineering and Measurement, pages 1–.

  • Oak, R., Du, M., Yan, D., Takawale, H., & Amit, I. (2019). Malware detection on highly imbalanced data through sequence modeling. In Proceedings of the 12th ACM Workshop on Artificial Intelligence and Security, AISec’19, pages 37–48, New York, NY, USA. Association for Computing Machinery.

  • Olsen, R. (1971). Can project management be defined? Project Management Quarterly, 2(1), 12–14.

  • Oliveira, E., Fernandes, E., Steinmacher, I., Cristo, M., Conte, T., & Garcia, A. (2020). Code and commit metrics of developer productivity: A study on team leaders’ perceptions. Empirical Software Engineering, 25, 2519–2549.

  • Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., et al. (2011). Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 12, 2825–2830.

  • Potdar, A., & Shihab, E. (2014). An exploratory study on self-admitted technical debt. In 2014 IEEE International Conference on Software Maintenance and Evolution, pages 91–100. IEEE.

  • Prechelt, L. (2000). An empirical comparison of seven programming languages. Computer, 33(10), 23–29.

  • Quinlan, J. R. (1986). Induction of decision trees. Machine Learning, 1(1), 81–106.

  • Rahman, F., & Devanbu, P. (2013). How, and why, process metrics are better. In 2013 35th International Conference on Software Engineering (ICSE), pages 432–441.

  • Rahman, F., Posnett, D., Hindle, A., Barr, E. T., & Devanbu, P. T. (2011). Bugcache for inspections: hit or miss? In SIGSOFT FSE.

  • Rantala, L., Mäntylä, M., & Lo, D. (2020). Prevalence, contents and automatic detection of KL-SATD. In 46th Euromicro Conference on Software Engineering and Advanced Applications, pages 385–388.

  • Ratner, A. J., De Sa, C. M., Wu, S., Selsam, D., & Ré, C. (2016). Data programming: Creating large training sets, quickly. In D. D. Lee, M. Sugiyama, U. V. Luxburg, I. Guyon, and R. Garnett, editors, Advances in Neural Information Processing Systems 29, pages 3567–3575. Curran Associates, Inc.

  • Ray, B., Posnett, D., Filkov, V., & Devanbu, P. (2014). A large scale study of programming languages and code quality in GitHub. In Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering, FSE 2014, pages 155–165, New York, NY, USA. ACM.

  • Raymond, E. (1998). The cathedral and the bazaar. First Monday, 3(3).

  • Reddivari, S., & Raman, J. (2019). Software quality prediction: An investigation based on machine learning. In 2019 IEEE 20th International Conference on Information Reuse and Integration for Data Science (IRI), pages 115–122.

  • Rice, H. G. (1953). Classes of recursively enumerable sets and their decision problems. Transactions of the American Mathematical Society, 74(2), 358–366.

  • Romano, S., Caulo, M., Scanniello, G., Baldassarre, M. T., & Caivano, D. (2020). Sentiment polarity and bug introduction. In International Conference on Product-Focused Software Process Improvement, pages 347–363. Springer.

  • Rosenberg, J. (1997). Some misconceptions about lines of code. In Proceedings of the Fourth International Software Metrics Symposium, pages 137–142. IEEE.

  • Sackman, H., Erikson, W. J., & Grant, E. E. (1968). Exploratory experimental studies comparing online and offline programming performance. Communications of the ACM, 11(1), 3–11.

  • Schach, S. R., Jin, B., Yu, L., Heller, G. Z., & Offutt, J. (2003). Determining the distribution of maintenance categories: Survey versus measurement. Empirical Software Engineering, 8(4), 351–365.

  • Schneidewind, N. F. (2002). Body of knowledge for software quality measurement. Computer, 35(2), 77–83.

  • Settles, B. (2010). Active learning literature survey. Technical report, University of Wisconsin Madison.

  • Shepperd, M. (1988). A critique of cyclomatic complexity as a software metric. Software Engineering Journal, 3(2), 30–36.

  • Shihab, E., Hassan, A. E., Adams, B., & Jiang, Z. M. (2012). An industrial study on the risk of software changes. In Proceedings of the ACM SIGSOFT 20th International Symposium on the Foundations of Software Engineering, FSE ’12, pages 62:1–62:11, New York, NY, USA. ACM.

  • Shrikanth, N. C., & Menzies, T. (2020). Assessing practitioner beliefs about software defect prediction. In International Conference on Software Engineering, number 42.

  • Shrikanth, N. C., Nichols, W., Fahid, F. M., & Menzies, T. (2020). Assessing practitioner beliefs about software engineering. arXiv:2006.05060.

  • Śliwerski, J., Zimmermann, T., & Zeller, A. (2005). When do changes induce fixes? SIGSOFT Software Engineering Notes, 30(4), 1–5.

  • Spinellis, D. (2006). Software Quality: The Open Source Perspective. Pearson Education Inc.

  • Stamelos, I., Angelis, L., Oikonomou, A., & Bleris, G. L. (2002). Code quality analysis in open source software development. Information Systems Journal, 12(1), 43–60.

  • Swanson, E. B. (1976). The dimensions of maintenance. In Proceedings of the 2nd International Conference on Software Engineering, ICSE ’76, pages 492–497, Los Alamitos, CA, USA. IEEE Computer Society Press.

  • Taba, S. E. S., Khomh, F., Zou, Y., Hassan, A. E., & Nagappan, M. (2013). Predicting bugs using antipatterns. In 2013 IEEE International Conference on Software Maintenance, pages 270–279.

  • Tom, E., Aurum, A., & Vidgen, R. (2013). An exploration of technical debt. Journal of Systems and Software, 86(6), 1498–1516.

  • Van Emden, E., & Moonen, L. (2002). Java quality assurance by detecting code smells. In Ninth Working Conference on Reverse Engineering (WCRE 2002), pages 97–106. IEEE.

  • Van Hulse, J., Khoshgoftaar, T. M., & Napolitano, A. (2007). Experimental perspectives on learning from imbalanced data. In Proceedings of the 24th International Conference on Machine Learning, pages 935–942.

  • Vasilescu, B., Yu, Y., Wang, H., Devanbu, P., & Filkov, V. (2015). Quality and productivity outcomes relating to continuous integration in GitHub. In Proceedings of the 2015 10th Joint Meeting on Foundations of Software Engineering, ESEC/FSE 2015, pages 805–816, New York, NY, USA. ACM.

  • Walkinshaw, N., & Minku, L. (2018). Are 20% of files responsible for 80% of defects? In Proceedings of the 12th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement, ESEM ’18, pages 2:1–2:10, New York, NY, USA. ACM.

  • Weyuker, E., Ostrand, T., & Bell, R. (2008). Do too many cooks spoil the broth? Using the number of developers to enhance defect prediction models. Empirical Software Engineering, 13, 539–559.

  • Williams, L., & Kessler, R. (2002). Pair Programming Illuminated. USA: Addison-Wesley Longman Publishing Co., Inc.

  • Wood, A. (1996). Predicting software reliability. Computer, 29(11), 69–77.

  • Wright, T. A., & Cropanzano, R. (2000). Psychological well-being and job satisfaction as predictors of job performance. Journal of Occupational Health Psychology, 5, 84–94.

  • Yamada, S., & Osaki, S. (1985). Software reliability growth modeling: Models and applications. IEEE Transactions on Software Engineering, SE-11(12), 1431–1437.

  • Yamashita, A., & Moonen, L. (2012). Do code smells reflect important maintainability aspects? In 2012 28th IEEE International Conference on Software Maintenance (ICSM), pages 306–315. IEEE.

  • Zaidman, A., Van Rompaey, B., van Deursen, A., & Demeyer, S. (2011). Studying the co-evolution of production and test code in open source and industrial developer test processes through repository mining. Empirical Software Engineering, 16(3), 325–364.

  • Zimmermann, T., Diehl, S., & Zeller, A. (2003). How history justifies system architecture (or not). In Sixth International Workshop on Principles of Software Evolution (IWPSE 2003), pages 73–83.

Acknowledgements

This research was supported by the ISRAEL SCIENCE FOUNDATION (grant No. 832/18). We thank Amiram Yehudai and Stanislav Levin for providing us their data set of labeled commits (Levin & Yehudai, 2017). We thank Guilherme Avelino for drawing our attention to the importance of Truck Factor Developers Detachment (TFDD) and providing a data set (Avelino et al., 2019). Many thanks to the reviewers whose comments were instrumental in improving the focus of the paper.

Author information

Correspondence to Idan Amit.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Amit, I., Feitelson, D.G. Corrective commit probability: a measure of the effort invested in bug fixing. Software Qual J 29, 817–861 (2021). https://doi.org/10.1007/s11219-021-09564-z
