DOI: 10.1145/2875913.2875944
short-paper

Cross-Project Software Defect Prediction Using Feature-Based Transfer Learning

Published: 06 November 2015

ABSTRACT

Cross-project defect prediction is regarded as an effective way to predict software defects when data is scarce in the early phases of software development. Unfortunately, its precision is usually poor, largely because of the differences between the reference and target projects. To address these differences, this paper proposes CPDP, a feature-based transfer learning approach to cross-project defect prediction. The core idea of CPDP is to (1) filter and transfer highly correlated data based on data samples in the target projects, and (2) evaluate and choose learning schemas for transferring data sets. Models are then built to predict defects in the target projects. We evaluated the proposed approach on the PROMISE datasets. The results show that the approach is suited to cross-project defect prediction: f-measure improved for 81.8% of the projects, and AUC improved for 54.5% of the projects. It also achieves f-measure and AUC similar to those of some inner-project defect prediction approaches.
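The abstract does not spell out how the filtering step or the evaluation is implemented, so the following is only a minimal sketch under stated assumptions. It uses a plain nearest-neighbour relevance filter as a stand-in for "filter highly correlated data based on data samples in the target projects", and it computes the f-measure in its standard form. The function names `filter_source` and `f_measure`, the parameter `k`, and the toy data are hypothetical and are not taken from the paper.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length metric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def filter_source(source_rows, target_rows, k=2):
    """Return sorted indices of source rows that are among the k nearest
    neighbours of at least one target row.

    Assumption: a nearest-neighbour filter stands in for the paper's
    correlation-based filtering, which the abstract does not detail.
    """
    keep = set()
    for t in target_rows:
        ranked = sorted(
            range(len(source_rows)),
            key=lambda i: euclidean(source_rows[i], t),
        )
        keep.update(ranked[:k])
    return sorted(keep)


def f_measure(y_true, y_pred):
    """Harmonic mean of precision and recall for the defective class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data (hypothetical): each row is a vector of software metrics.
source = [[1.0, 2.0], [1.1, 2.1], [9.0, 9.0], [0.9, 1.8]]
target = [[1.0, 2.0], [1.2, 2.2]]

selected = filter_source(source, target, k=2)
score = f_measure([1, 0, 1, 1], [1, 0, 0, 1])
```

In this toy run the outlying source row `[9.0, 9.0]` is never selected, which is the intended effect of such a filter: reference-project data that looks unlike the target project is excluded before a model is trained.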

      Published in

        Internetware '15: Proceedings of the 7th Asia-Pacific Symposium on Internetware
        November 2015
        247 pages
        ISBN: 9781450336413
        DOI: 10.1145/2875913

        Copyright © 2015 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Qualifiers

        • short-paper
        • Research
        • Refereed limited

        Acceptance Rates

        Overall Acceptance Rate: 55 of 111 submissions, 50%
