Abstract
With the boom in mobile devices, Android mobile apps play an irreplaceable role in people’s daily lives. These apps are updated frequently, and each update involves many code commits to meet new requirements. Just-in-Time (JIT) defect prediction aims to identify whether a commit will introduce defects into the next release of an app and gives developers immediate feedback, which makes it well suited to mobile apps. Within-app defect prediction needs sufficient historical data with labeled commit instances, which is often unavailable in practice, so one alternative is to use a cross-project model. In this work, we propose a novel method, called KAL, for the cross-project JIT defect prediction task in the context of Android mobile apps. More specifically, KAL first maps the commit instances into a high-dimensional feature space using kernel-based principal component analysis to obtain representative features. Then, adversarial learning is used to extract a common feature embedding for model building. We conduct experiments on 14 Android mobile apps and employ four effort-aware indicators for performance evaluation. The results on 182 cross-project pairs demonstrate that the proposed KAL method outperforms 20 comparative methods.
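The figure of 182 cross-project pairs is consistent with treating every ordered (source, target) combination of the 14 apps as one experiment, since 14 × 13 = 182. The short sketch below only illustrates that count; the app names are hypothetical placeholders, because the abstract does not list the apps, and the pairing scheme itself is an assumption inferred from the numbers.

```python
from itertools import permutations

# Hypothetical placeholder names; the abstract does not name the 14 apps.
apps = [f"app_{i:02d}" for i in range(1, 15)]

# Assumed setup: train on one app (source), test on a different app (target).
# Order matters, so (A, B) and (B, A) count as two separate pairs.
pairs = list(permutations(apps, 2))

print(len(pairs))  # 14 * 13 = 182
```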
Acknowledgements
This work was supported by the National Natural Science Foundation of China (Grant No. 62072060).
Author information
Tian Cheng is a PhD student at the School of Data and Software Engineering, Chongqing University, China. He received the BS and MS degrees from Macao Polytechnic Institute and Chongqing University, China in 2012 and 2014, respectively. His research interests include Web data extraction, data mining, and software engineering.
Kunsong Zhao is a master’s student at the School of Computer Science, Wuhan University, China. He received the BS degree from the School of Computer Science and Information Engineering, Hubei University, China. His research interests include software engineering, deep learning, and natural language processing.
Song Sun received the BS and MS degrees in software engineering from Chongqing University, China in 2011 and 2014, respectively. He is currently pursuing the PhD degree at the School of Big Data and Software Engineering, Chongqing University, China. His research interests include recommendation systems, computer vision, and machine learning.
Muhammad Mateen received his master’s degree in computer science from Air University, Islamabad, Pakistan in 2015 and his PhD in software engineering from Chongqing University, China in 2020. Currently, he is working as an Assistant Professor at Air University Multan Campus, Pakistan. His research interests include software engineering, image processing, and deep learning. He is a member of the China Computer Federation (CCF).
Junhao Wen received the PhD degree from Chongqing University, China in 2008. He is a vice head and professor of the School of Big Data and Software Engineering, Chongqing University, China. His research interests include service computing, cloud computing, and dependable software engineering. He has published more than 80 refereed journal and conference papers in these areas. He has received more than 30 research and industrial grants and has developed many commercial systems and software tools.
Cite this article
Cheng, T., Zhao, K., Sun, S. et al. Effort-aware cross-project just-in-time defect prediction framework for mobile apps. Front. Comput. Sci. 16, 166207 (2022). https://doi.org/10.1007/s11704-021-1013-5
DOI: https://doi.org/10.1007/s11704-021-1013-5