short-paper

Cross-Project Software Defect Prediction Using Feature-Based Transfer Learning

Author: Yong Xia

Internetware '15: Proceedings of the 7th Asia-Pacific Symposium on Internetware

Pages 74 - 82

https://doi.org/10.1145/2875913.2875944

Published: 06 November 2015

Abstract

Cross-project defect prediction is regarded as an effective way to predict software defects when a project has too little data of its own, as is common in the early phase of software development. Unfortunately, the precision of cross-project defect prediction is usually poor, largely because of differences between the reference projects and the target projects. To address these differences, this paper proposes CPDP, a feature-based transfer learning approach to cross-project defect prediction. The core idea of CPDP is to (1) filter and transfer highly correlated data based on data samples in the target projects, and (2) evaluate and choose learning schemas for the transferred data sets. Models are then built to predict defects in the target projects. We evaluated the proposed approach on the PROMISE datasets. The results show that the approach is suited to cross-project defect prediction: the f-measure improves for 81.8% of the projects and the AUC improves for 54.5% of the projects. It also achieves f-measure and AUC values similar to those of some within-project (inner-project) defect prediction approaches.
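The abstract does not spell out how the filtering step or the evaluation is carried out, so the following is only a minimal sketch under stated assumptions, not the paper's algorithm. It assumes a nearest-neighbour filter (keep the `k` reference-project rows closest to each target sample by Euclidean distance) as a stand-in for "filter and transfer highly correlated data", and it computes the f-measure from binary labels. The names `filter_source`, `f_measure`, and the parameter `k` are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def filter_source(source_X, source_y, target_X, k=2):
    """Keep the k source rows nearest to each target sample.

    Assumption: similarity is plain Euclidean distance on raw metric
    values; real defect metrics would normally be scaled first.
    """
    keep = set()
    for t in target_X:
        ranked = sorted(range(len(source_X)),
                        key=lambda i: euclidean(source_X[i], t))
        keep.update(ranked[:k])
    idx = sorted(keep)  # preserve the original row order
    return [source_X[i] for i in idx], [source_y[i] for i in idx]


def f_measure(y_true, y_pred):
    """Harmonic mean of precision and recall for the defective class (1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

In this sketch the filtered rows would then be used to train whichever classifier is chosen, and `f_measure` would be computed on the target project's labelled modules to compare the candidate learning schemas.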


Index Terms

Cross-Project Software Defect Prediction Using Feature-Based Transfer Learning
1. General and reference
  1. Cross-computing tools and techniques
    1. Metrics
2. Social and professional topics
  1. Professional topics
    1. Management of computing and information systems
      1. System management
        1. Quality assurance

Published In

Internetware '15: Proceedings of the 7th Asia-Pacific Symposium on Internetware

November 2015

247 pages

ISBN:9781450336413

DOI:10.1145/2875913

Copyright © 2015 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

In-Cooperation

Key Laboratory of High Confidence Software Technologies, Ministry of Education

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Short-paper
Research
Refereed limited

Conference

Internetware '15

Internetware '15: The Seventh Asia-Pacific Symposium on Internetware

November 6, 2015

Wuhan, China

Acceptance Rates

Overall Acceptance Rate 55 of 111 submissions, 50%
