Abstract
Various studies have successfully used graph theory analysis to gain a high-level, abstract view of software systems, for example by constructing a call graph to visualize the dependencies among software components. The level of granularity and the information shown by the graph usually depend on the input, such as variables, methods, classes, packages, or a combination of multiple levels. However, very few studies have investigated how software evolution and change history can be used as a basis for modelling a software-based complex network. It is commonly understood that stable and well-designed source code receives fewer updates throughout the software development lifecycle. Badly designed code, in contrast, tends to be updated because of broken dependencies, high coupling, or dependencies on other classes. This paper puts forward an approach to model a commit change-based weighted complex network from historical software change and evolution data captured from GitHub repositories, with the aim of identifying potential fault-prone classes. Four well-established graph centrality metrics were used as proxy metrics to discover fault-prone classes. Experiments on ten open-source projects showed that using all four centrality metrics together yields reasonably good precision when compared against the ground truth.
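As a rough illustration of the idea, a commit change-based weighted network can be sketched as a co-change graph. This is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not say how edges are weighted or which four centrality metrics are used, so the edge weight here is assumed to be the number of commits that changed both classes, and weighted degree stands in as one example centrality. The function names and the commit data are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_cochange_network(commits):
    """Map each unordered class pair to the number of commits that changed both.

    Assumption: edge weight = co-change count. The abstract does not
    specify the actual weighting scheme.
    """
    weights = defaultdict(int)
    for changed in commits:
        # Each pair of classes touched by the same commit gains one unit of weight.
        for a, b in combinations(sorted(set(changed)), 2):
            weights[frozenset((a, b))] += 1
    return dict(weights)


def weighted_degree(weights):
    """Sum the weights of all edges incident to each class (node strength)."""
    strength = defaultdict(int)
    for pair, w in weights.items():
        for node in pair:
            strength[node] += w
    return dict(strength)


# Hypothetical commit history: each entry lists the classes changed by one commit.
commits = [
    ["Order", "Invoice"],
    ["Order", "Invoice", "Customer"],
    ["Order", "Customer"],
    ["Logger"],
]

network = build_cochange_network(commits)
strength = weighted_degree(network)

# Rank classes from most to least central; ties are broken alphabetically.
ranking = sorted(strength, key=lambda c: (-strength[c], c))
print(ranking)
```

In this sketch a class that is only ever changed on its own (here `Logger`) never forms an edge and therefore does not appear in the ranking. Whether such isolated classes should be kept as zero-weight nodes is a design choice the abstract leaves open.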
Acknowledgement
This work was carried out within the framework of the research project FP001-2016 under the Fundamental Research Grant Scheme provided by the Ministry of Higher Education, Malaysia.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Chong, C.Y., Lee, S.P. (2019). Can Commit Change History Reveal Potential Fault Prone Classes? A Study on GitHub Repositories. In: van Sinderen, M., Maciaszek, L. (eds) Software Technologies. ICSOFT 2018. Communications in Computer and Information Science, vol 1077. Springer, Cham. https://doi.org/10.1007/978-3-030-29157-0_12
DOI: https://doi.org/10.1007/978-3-030-29157-0_12
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-29156-3
Online ISBN: 978-3-030-29157-0
eBook Packages: Computer Science, Computer Science (R0)