Authors:
Md. Hossain 1; Suravi Akhter 2; Md. Islam 3; Muhammad Alam 4 and Mohammad Shoyaib 1
Affiliations:
1 Institute of Information Technology, University of Dhaka, Dhaka, Bangladesh;
2 Department of Computer Science and Engineering, University of Liberal Arts Bangladesh, Dhaka, Bangladesh;
3 Department of Mathematics, University of Dhaka, Dhaka, Bangladesh;
4 Department of Computer Science and Engineering, Islamic University of Technology, Gazipur, Dhaka, Bangladesh
Keyword(s):
Cross-Project Defect Prediction, Transfer Learning, Second Order Statistics.
Abstract:
Cross-Project Defect Prediction (CPDP) has gained considerable research interest due to the scarcity of
historical labeled defective modules in a project. Although there are several approaches for CPDP, most
of them contain several parameters that need to be tuned optimally to get the desired performance. Often,
higher computational complexities of these methods make it difficult to tune these parameters. Moreover,
existing methods might fail to align the shape and structure of the source and target data, which in turn
deteriorates the prediction performance. Addressing these issues, we investigate correlation alignment for
CPDP (CCPDP) and compare it with state-of-the-art transfer learning methods. Rigorous experimentation
over three benchmark datasets (AEEEM, RELINK and SOFTLAB), which include 46 different project pairs,
demonstrates its effectiveness in terms of F1-score, Balance and AUC compared to six other methods: TCA,
TCA+, JDA, BDA, CTKCCA and DMDA_JFR. In terms of AUC, CCPDP wins on at least 32 and at most 42 of the
46 project pairs when compared with each of the transfer learning based methods.
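
To make the idea of aligning second-order statistics concrete, the following is a minimal sketch of the generic correlation alignment (CORAL) recipe: whiten the source features with the source covariance, then re-color them with the target covariance. It is an illustration under stated assumptions, not the authors' CCPDP implementation; the function names `sym_power` and `coral`, the `eps` regularizer, and the use of NumPy are all choices made here, and the abstract does not specify preprocessing or the downstream classifier.

```python
import numpy as np


def sym_power(mat, power):
    """Raise a symmetric positive-definite matrix to a real power
    via its eigendecomposition (hypothetical helper)."""
    vals, vecs = np.linalg.eigh(mat)
    vals = np.maximum(vals, 1e-12)  # guard against tiny negative round-off
    return (vecs * vals ** power) @ vecs.T


def coral(source, target, eps=1.0):
    """Transform source features so their covariance matches the target's.

    source, target: 2-D arrays of shape (n_modules, n_metrics).
    eps: ridge term added to both covariances for numerical stability
         (an assumption of this sketch).
    """
    d = source.shape[1]
    cov_s = np.cov(source, rowvar=False) + eps * np.eye(d)
    cov_t = np.cov(target, rowvar=False) + eps * np.eye(d)
    whitened = source @ sym_power(cov_s, -0.5)   # remove source correlations
    return whitened @ sym_power(cov_t, 0.5)      # impose target correlations


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-ins for source/target project metrics (not real data).
    src = rng.normal(size=(200, 3)) @ np.array([[2.0, 0.0, 0.0],
                                                [0.5, 1.0, 0.0],
                                                [0.0, 0.0, 0.3]])
    tgt = rng.normal(size=(150, 3)) @ np.array([[1.0, 0.4, 0.0],
                                                [0.0, 0.7, 0.2],
                                                [0.0, 0.0, 1.5]])
    aligned = coral(src, tgt, eps=1e-6)
    gap = np.abs(np.cov(aligned, rowvar=False) - np.cov(tgt, rowvar=False)).max()
    print("max covariance gap after alignment:", gap)
```

In a cross-project setting, a classifier would then be trained on the aligned source modules with their defect labels and applied to the unlabeled target modules. The transformation itself has a closed form and only the regularizer to set, which is in keeping with the abstract's concern about methods that need extensive parameter tuning.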