Effort-aware cross-project just-in-time defect prediction framework for mobile apps

  • Research Article
Frontiers of Computer Science

Abstract

With the boom of mobile devices, Android mobile apps play an irreplaceable role in people’s daily lives. These apps are updated frequently, with many code commits made to meet new requirements. Just-in-Time (JIT) defect prediction aims to identify whether a commit instance will introduce defects into a new release of an app and provides immediate feedback to developers, which makes it well suited to mobile apps. Within-app defect prediction needs sufficient historical data with labeled commit instances, which is often unavailable in practice; one alternative is to use a cross-project model. In this work, we propose a novel method, called KAL, for the cross-project JIT defect prediction task in the context of Android mobile apps. More specifically, KAL first transforms the commit instances into a high-dimensional feature space using the kernel-based principal component analysis technique to obtain representative features. Then, an adversarial learning technique is used to extract a common feature embedding for model building. We conduct experiments on 14 Android mobile apps and employ four effort-aware indicators for performance evaluation. The results on 182 cross-project pairs demonstrate that the proposed KAL method outperforms 20 comparative methods.
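
The figure of 182 cross-project pairs is consistent with treating every ordered (source app, target app) combination of the 14 apps as one experiment, since 14 × 13 = 182. The sketch below illustrates that count only. The ordered-pair reading is an assumption inferred from the arithmetic, and the app names are hypothetical placeholders because the abstract does not list the apps.

```python
from itertools import permutations

# Hypothetical placeholder names; the abstract does not name the 14 apps.
apps = [f"app_{i:02d}" for i in range(1, 15)]

# Assumption: each ordered (source, target) pair is one cross-project
# experiment, i.e. train on the source app and predict on the target app.
pairs = list(permutations(apps, 2))

print(len(pairs))  # 14 * 13 ordered pairs
```

Under this reading, (A, B) and (B, A) are separate experiments, because a model trained on app A and evaluated on app B is a different setting from the reverse.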

Acknowledgements

This work was supported by the National Natural Science Foundation of China (Grant No. 62072060).

Author information

Corresponding author

Correspondence to Junhao Wen.

Additional information

Tian Cheng is a PhD student at the School of Data and Software Engineering, Chongqing University, China. He received the BS degree from Macao Polytechnic Institute in 2012 and the MS degree from Chongqing University, China in 2014. His research interests include Web data extraction, data mining, and software engineering.

Kunsong Zhao is a master’s student at the School of Computer Science, Wuhan University, China. He received the BS degree from the School of Computer Science and Information Engineering, Hubei University, China. His research interests include software engineering, deep learning, and natural language processing.

Song Sun received the BS and MS degrees in software engineering from Chongqing University, China in 2011 and 2014, respectively. He is currently pursuing the PhD degree at the School of Big Data and Software Engineering, Chongqing University, China. His research interests include recommender systems, computer vision, and machine learning.

Muhammad Mateen received his master’s degree in computer science from Air University, Islamabad, Pakistan in 2015 and his PhD in software engineering from Chongqing University, China in 2020. He is currently an Assistant Professor at Air University Multan Campus, Pakistan. His research interests include software engineering, image processing, and deep learning. He is a member of the China Computer Federation (CCF).

Junhao Wen received the PhD degree from Chongqing University, China in 2008. He is vice head and a professor of the School of Big Data and Software Engineering, Chongqing University, China. His research interests include service computing, cloud computing, and dependable software engineering. He has published more than 80 refereed journal and conference papers in these areas. He has received more than 30 research and industrial grants and has developed many commercial systems and software tools.

About this article

Cite this article

Cheng, T., Zhao, K., Sun, S. et al. Effort-aware cross-project just-in-time defect prediction framework for mobile apps. Front. Comput. Sci. 16, 166207 (2022). https://doi.org/10.1007/s11704-021-1013-5

  • DOI: https://doi.org/10.1007/s11704-021-1013-5
