Abstract
Software defect prediction has extensive applicability and is therefore a very active research area in Search-Based Software Engineering. A high proportion of software defects are caused by violated couplings. In this paper, we investigate the relevance of semantic coupling in assessing the proneness of software to defects. We propose a hybrid classification model that combines Gradual Relational Association Rules with Artificial Neural Networks and detects defective software entities based on semantic features automatically learned from the source code. The experiments we performed yielded results that confirm the interplay between conceptual coupling and software defect proneness.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Miholca, DL., Czibula, G. (2019). Software Defect Prediction Using a Hybrid Model Based on Semantic Features Learned from the Source Code. In: Douligeris, C., Karagiannis, D., Apostolou, D. (eds) Knowledge Science, Engineering and Management. KSEM 2019. Lecture Notes in Computer Science, vol 11775. Springer, Cham. https://doi.org/10.1007/978-3-030-29551-6_23
DOI: https://doi.org/10.1007/978-3-030-29551-6_23
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-29550-9
Online ISBN: 978-3-030-29551-6
eBook Packages: Computer Science, Computer Science (R0)