DOI: 10.1145/3324884.3416617

BiLO-CPDP: bi-level programming for automated model discovery in cross-project defect prediction

Published: 27 January 2021

Abstract

Cross-Project Defect Prediction (CPDP), which borrows data from similar projects by combining a transfer learner with a classifier, has emerged as a promising way to predict software defects when the available data about the target project is insufficient. However, developing such a model is challenging because it is difficult to determine the right combination of transfer learner and classifier along with their optimal hyper-parameter settings. In this paper, we propose a tool, dubbed BiLO-CPDP, which is the first of its kind to formulate automated CPDP model discovery from the perspective of bi-level programming. In particular, bi-level programming carries out the optimization at two nested levels in a hierarchical manner: the upper-level optimization routine searches for the right combination of transfer learner and classifier, while the nested lower-level optimization routine optimizes the corresponding hyper-parameter settings. To evaluate BiLO-CPDP, we conduct experiments on 20 projects to compare it with 21 existing CPDP techniques, along with its single-level optimization variant and Auto-Sklearn, a state-of-the-art automated machine learning tool. Empirical results show that BiLO-CPDP achieves better prediction performance than all 21 existing CPDP techniques on 70% of the projects, while being overwhelmingly superior to Auto-Sklearn and its single-level optimization variant in all cases. Furthermore, the unique bi-level formalization in BiLO-CPDP also permits allocating more budget to the upper level, which significantly boosts performance.
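
The bi-level formulation can be illustrated with a minimal Python sketch written for this page (not the authors' implementation): the upper level searches over candidate (transfer learner, classifier) combinations, and each candidate is scored by a nested lower-level routine that tunes that combination's hyper-parameters under its own budget. The component names, search spaces, and the random-search strategy below are illustrative assumptions only.

import random
from itertools import product

# Assumed upper-level search space: candidate transfer learners and classifiers
# (illustrative placeholders, not the tool's actual portfolio).
TRANSFER_LEARNERS = ["NNFilter", "TCA", "TrAdaBoost"]
CLASSIFIERS = ["RandomForest", "SVM", "NaiveBayes"]

# Assumed lower-level search space: hyper-parameter ranges per component.
HYPERPARAMS = {
    "NNFilter":     {"k": (1, 20)},
    "TCA":          {"n_components": (2, 20)},
    "TrAdaBoost":   {"n_iterations": (5, 50)},
    "RandomForest": {"n_estimators": (10, 200)},
    "SVM":          {"C": (0.01, 100.0)},
    "NaiveBayes":   {},
}

def evaluate(transfer, classifier, params):
    """Stand-in for building the CPDP pipeline and measuring AUC on
    validation data; returns a random score here for illustration."""
    return random.random()

def lower_level(transfer, classifier, budget):
    """Tune hyper-parameters for one fixed (transfer, classifier) pair."""
    best_auc, best_params = -1.0, None
    for _ in range(budget):
        params = {
            f"{comp}.{name}": random.uniform(lo, hi)
            for comp in (transfer, classifier)
            for name, (lo, hi) in HYPERPARAMS[comp].items()
        }
        auc = evaluate(transfer, classifier, params)
        if auc > best_auc:
            best_auc, best_params = auc, params
    return best_auc, best_params

def upper_level(upper_budget, lower_budget):
    """Search over (transfer learner, classifier) combinations; each
    candidate is scored by the result of its nested lower-level tuning."""
    best = (-1.0, None, None, None)
    combos = list(product(TRANSFER_LEARNERS, CLASSIFIERS))
    for _ in range(upper_budget):
        transfer, classifier = random.choice(combos)
        auc, params = lower_level(transfer, classifier, lower_budget)
        if auc > best[0]:
            best = (auc, transfer, classifier, params)
    return best

auc, transfer, classifier, params = upper_level(upper_budget=20, lower_budget=10)
print(f"best combination: {transfer} + {classifier} (AUC={auc:.3f})")

Because the lower level only ever tunes one fixed combination at a time, the upper-level budget can be increased independently of the lower-level one, which is the lever the abstract reports as significantly boosting performance.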




    Published In

    ASE '20: Proceedings of the 35th IEEE/ACM International Conference on Automated Software Engineering
    December 2020
    1449 pages
    ISBN:9781450367684
    DOI:10.1145/3324884

    In-Cooperation

    • IEEE CS

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 27 January 2021


    Author Tags

    1. automated parameter optimization
    2. classification techniques
    3. configurable software and tool
    4. cross-project defect prediction
    5. transfer learning

    Qualifiers

    • Research-article

    Funding Sources

    • UKRI Future Leaders Fellowship

    Conference

    ASE '20

    Acceptance Rates

    Overall Acceptance Rate 82 of 337 submissions, 24%
