Effort-aware cross-project just-in-time defect prediction framework for mobile apps

Cheng, Tian; Zhao, Kunsong; Sun, Song; Mateen, Muhammad; Wen, Junhao

doi:10.1007/s11704-021-1013-5

Research Article
Published: 22 January 2022

Volume 16, article number 166207 (2022)

Frontiers of Computer Science

Tian Cheng¹,
Kunsong Zhao²,
Song Sun¹,
Muhammad Mateen³ &
Junhao Wen¹

Abstract

With the boom of mobile devices, Android mobile apps play an irreplaceable role in people’s daily lives. These apps are updated frequently, with many code commits made to meet new requirements. Just-in-Time (JIT) defect prediction aims to identify whether a commit instance will introduce defects into a new release of an app and provides immediate feedback to developers, which makes it well suited to mobile apps. Because within-app defect prediction needs sufficient historical data to label the commit instances, and such data are often inadequate in practice, one alternative is to use a cross-project model. In this work, we propose a novel method, called KAL, for the cross-project JIT defect prediction task in the context of Android mobile apps. More specifically, KAL first transforms the commit instances into a high-dimensional feature space using the kernel-based principal component analysis technique to obtain representative features. Then, an adversarial learning technique is used to extract a common feature embedding for model building. We conduct experiments on 14 Android mobile apps and employ four effort-aware indicators for performance evaluation. The results on 182 cross-project pairs demonstrate that our proposed KAL method obtains better performance than 20 comparative methods.
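
As a rough illustration of the first step named in the abstract, the sketch below maps a few commit feature vectors through a generic kernel principal component analysis. It is not the KAL implementation: the RBF kernel, the `gamma` value, the toy feature vectors, and all function names are assumptions made for this example, and the adversarial learning step is not shown.

```python
import math
import random


def rbf_kernel_matrix(X, gamma):
    """K[i][j] = exp(-gamma * ||x_i - x_j||^2). The RBF kernel is an assumed choice."""
    n = len(X)
    return [[math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(X[i], X[j])))
             for j in range(n)] for i in range(n)]


def center_kernel(K):
    """Double-center K so the implicit feature vectors have zero mean."""
    n = len(K)
    row_mean = [sum(row) / n for row in K]  # equals the column means (K is symmetric)
    total_mean = sum(row_mean) / n
    return [[K[i][j] - row_mean[i] - row_mean[j] + total_mean
             for j in range(n)] for i in range(n)]


def matvec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def top_eigenpairs(A, k, iters=500, seed=0):
    """Top-k eigenpairs of a symmetric PSD matrix by power iteration with deflation."""
    rng = random.Random(seed)
    n = len(A)
    A = [row[:] for row in A]  # work on a copy; deflation modifies it
    pairs = []
    for _ in range(k):
        v = [rng.random() for _ in range(n)]
        for _ in range(iters):
            w = matvec(A, v)
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        lam = sum(x * y for x, y in zip(v, matvec(A, v)))  # Rayleigh quotient
        pairs.append((lam, v))
        for i in range(n):  # remove the component just found
            for j in range(n):
                A[i][j] -= lam * v[i] * v[j]
    return pairs


def kpca_transform(X, n_components, gamma):
    """Project the training points onto the leading kernel principal components."""
    Kc = center_kernel(rbf_kernel_matrix(X, gamma))
    pairs = top_eigenpairs(Kc, n_components)
    # For training point i, the projection on a component with eigenvalue lam
    # and unit eigenvector v is sqrt(lam) * v[i].
    return [[math.sqrt(max(lam, 0.0)) * v[i] for lam, v in pairs]
            for i in range(len(X))]


# Hypothetical commit-level feature vectors (two made-up metrics per commit).
commits = [[0.0, 0.0], [0.2, 0.1], [0.1, 0.3],
           [3.0, 3.0], [3.2, 2.9], [2.8, 3.1]]
features = kpca_transform(commits, n_components=2, gamma=0.5)
for row in features:
    print([round(value, 4) for value in row])
```

In this toy data the first component separates the two groups of commits by sign. Power iteration is used only to keep the sketch dependency-free; a library eigensolver would be the usual choice in practice.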

Acknowledgements

This work was supported by the National Natural Science Foundation of China (Grant No. 62072060).

Author information

Authors and Affiliations

School of Big Data and Software Engineering, Chongqing University, Chongqing, 401331, China
Tian Cheng, Song Sun & Junhao Wen
School of Computer Science, Wuhan University, Wuhan, 430072, China
Kunsong Zhao
Department of Computer Science, Air University Multan Campus, Multan, 60000, Pakistan
Muhammad Mateen

Corresponding author

Correspondence to Junhao Wen.

Additional information

Tian Cheng is a PhD student at the School of Big Data and Software Engineering, Chongqing University, China. He received the BS degree from Macao Polytechnic Institute and the MS degree from Chongqing University, China, in 2012 and 2014, respectively. His research interests include Web data extraction, data mining, and software engineering.

Kunsong Zhao is a master’s student at the School of Computer Science, Wuhan University, China. He received the BS degree from the School of Computer Science and Information Engineering, Hubei University, China. His research interests include software engineering, deep learning, and natural language processing.

Song Sun received the BS and MS degrees in software engineering from Chongqing University, China, in 2011 and 2014, respectively. He is currently pursuing the PhD degree at the School of Big Data and Software Engineering, Chongqing University, China. His research interests include recommendation systems, computer vision, and machine learning.

Muhammad Mateen received his master’s degree in computer science from Air University, Islamabad, Pakistan, in 2015 and his PhD in software engineering from Chongqing University, China, in 2020. He is currently an Assistant Professor at Air University Multan Campus, Pakistan. His research interests include software engineering, image processing, and deep learning. He is a member of the China Computer Federation (CCF).

Junhao Wen received the PhD degree from Chongqing University, China, in 2008. He is a vice head and professor of the School of Big Data and Software Engineering, Chongqing University, China. His research interests include service computing, cloud computing, and dependable software engineering. He has published more than 80 refereed journal and conference papers in these areas. He has received more than 30 research and industrial grants and has developed many commercial systems and software tools.
