Abstract
Cleansing coincidentally correct test cases has proven useful in software fault localization. However, identifying coincidentally correct test cases with k-means clustering has not yet been studied. k-means is a hard clustering method: each sample point is assigned solely to the cluster with the highest similarity, which makes cluster-based identification of coincidental correctness inaccurate. To address this issue, we propose an effective Coincidental Correctness test case identification framework based on Fuzzy C-Means clustering (CC-FCM). First, the elements of coincidental correctness are identified by a probability function we designed, and the feature elements of coincidental correctness are selected. Second, after the dimensionality of the program execution traces is reduced, fuzzy c-means clustering is introduced to identify coincidentally correct test cases. Finally, the test results cleansed of coincidental correctness are used for fault localization. To verify the effectiveness of CC-FCM, experiments were conducted with four fault localization methods (Tarantula, Ochiai, Naish2, and Russell & Rao) on 10 real-world subject programs. The experimental results show that CC-FCM improves significantly on the compared methods and achieves lower false-positive and false-negative rates in identifying coincidentally correct test cases.
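The contrast between hard and fuzzy assignment drawn above can be illustrated with a minimal sketch of the standard fuzzy c-means update rules. This is not the CC-FCM implementation: the toy 2-D points stand in for dimension-reduced execution traces, and the function names, the fuzzifier m = 2, the fixed initial centers, and the iteration count are all assumptions made for illustration.

```python
import math


def fcm_memberships(points, centers, m=2.0):
    """Return u[j][i], the degree to which points[j] belongs to centers[i]."""
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for x in points:
        dists = [math.dist(x, c) for c in centers]
        if any(d == 0.0 for d in dists):
            # The point coincides with a center: full membership there
            # (shared equally if it coincides with several centers).
            hits = [1.0 if d == 0.0 else 0.0 for d in dists]
            total = sum(hits)
            memberships.append([h / total for h in hits])
            continue
        # Standard FCM rule: u_i = 1 / sum_k (d_i / d_k)^(2 / (m - 1))
        memberships.append(
            [1.0 / sum((d_i / d_k) ** exponent for d_k in dists) for d_i in dists]
        )
    return memberships


def fcm_update_centers(points, memberships, m=2.0):
    """Recompute each center as the u^m-weighted mean of all points."""
    dim = len(points[0])
    centers = []
    for i in range(len(memberships[0])):
        weights = [u[i] ** m for u in memberships]
        total = sum(weights)
        centers.append(
            [sum(w * x[d] for w, x in zip(weights, points)) / total for d in range(dim)]
        )
    return centers


def fuzzy_c_means(points, init_centers, m=2.0, iterations=50):
    """Alternate membership and center updates for a fixed number of rounds."""
    centers = [list(c) for c in init_centers]
    for _ in range(iterations):
        u = fcm_memberships(points, centers, m)
        centers = fcm_update_centers(points, u, m)
    return centers, fcm_memberships(points, centers, m)


# Hypothetical toy data: two tight groups plus one ambiguous point in between.
toy_points = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
              [5.0, 5.0], [5.0, 6.0], [6.0, 5.0],
              [3.0, 3.0]]
toy_centers, toy_u = fuzzy_c_means(toy_points, [[0.0, 0.0], [5.0, 5.0]])

for point, row in zip(toy_points, toy_u):
    hard_label = row.index(max(row))  # what a hard assignment would keep
    print(point, [round(v, 3) for v in row], "hard label:", hard_label)
```

A hard assignment keeps only the label of the largest membership, so the in-between point looks as certain as the others. The fuzzy memberships retain how ambiguous it is, which is the kind of information a threshold-based identification step could use.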
Data openly available in a public repository
The data that support the findings of this study are openly available in SIR at http://sir.unl.edu/portal/index.html.
Funding
This work was partially supported by the Cultivation Programme for Young Backbone Teachers in Henan University of Technology, the Key Scientific Research Project of Colleges and Universities in Henan Province (No. 22A520024), the Major Public Welfare Project of Henan Province (No. 201300311200), and the National Natural Science Foundation of China (Nos. 62206087, 62276091, 61602154).
Author information
Contributions
All authors contributed to the study conception and design. Material preparation, data collection, and analysis were performed by Heling Cao, Lei Li, Yonghe Chu, Miaolei Deng, Panpan Wang, and Chenyang Zhao. The first draft of the manuscript was written by Heling Cao and Lei Li, and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.
Ethics declarations
Conflict of interest
The authors declare that they have no conflicts of interest regarding this manuscript, and no commercial or associative interest that represents a conflict of interest in connection with the work submitted.
Ethics approval
Not applicable.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Cao, H., Li, L., Chu, Y. et al. A coincidental correctness test case identification framework with fuzzy C-means clustering. Multimedia Systems 29, 1089–1101 (2023). https://doi.org/10.1007/s00530-022-01039-w
DOI: https://doi.org/10.1007/s00530-022-01039-w