
Adapting bug prediction models to predict reverted commits at Wayfair

Published: 08 November 2020

Abstract

Researchers have proposed many algorithms to predict software bugs. Given a software entity (e.g., a file or method), these algorithms predict whether the entity is bug-prone. However, since these algorithms cannot identify specific bugs, their predictions tend not to be particularly useful in practice. In this work, we adapt this prior work to the related problem of predicting whether a commit is likely to be reverted. Given the batch nature of continuous integration deployment at scale, this allows developers to find time-sensitive bugs in production more quickly. The models in this paper are based on features, typically used in bug prediction, extracted from the revision history of a codebase. Our experiments, performed on the three main repositories for the Wayfair website, show that our models can rank reverted commits above 80% of non-reverted commits on average. Moreover, when given to Wayfair developers, our models reduce the amount of time needed to find certain kinds of bugs by 55%. Wayfair continues to use our findings and models today to help find bugs during software deployments.
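The abstract's ranking claim ("rank reverted commits above 80% of non-reverted commits") corresponds to the area under the ROC curve. As an illustration only -- the features, thresholds, and synthetic data below are invented stand-ins, not the paper's implementation -- here is a minimal sketch of training a revision-history classifier and scoring it with that metric:

```python
import math
import random

random.seed(0)

def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))

# Synthetic commit data (all numbers invented, not from the paper):
# reverted commits skew toward larger, more scattered changes by
# less-experienced authors.
def make_commit(reverted: bool):
    base = 5 if reverted else 1
    features = [
        random.gauss(base * 40, 20),     # lines changed
        random.gauss(base, 1),           # files touched
        random.gauss(12 - 2 * base, 3),  # author's prior commit count
    ]
    return features, 1 if reverted else 0

data = [make_commit(random.random() < 0.2) for _ in range(500)]
dims = 3

# Standardize each feature so gradient descent behaves.
means = [sum(x[d] for x, _ in data) / len(data) for d in range(dims)]
stds = [max(1e-9, (sum((x[d] - means[d]) ** 2 for x, _ in data) / len(data)) ** 0.5)
        for d in range(dims)]
feats = [[(x[d] - means[d]) / stds[d] for d in range(dims)] for x, _ in data]
labels = [y for _, y in data]

# Plain batch-gradient-descent logistic regression.
w, b, lr = [0.0] * dims, 0.0, 0.5
for _ in range(300):
    gw, gb = [0.0] * dims, 0.0
    for x, y in zip(feats, labels):
        err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
        for d in range(dims):
            gw[d] += err * x[d]
        gb += err
    w = [wi - lr * gi / len(feats) for wi, gi in zip(w, gw)]
    b -= lr * gb / len(feats)

def auc(scores, labels):
    """Probability a reverted commit is scored above a non-reverted one --
    the ranking metric the abstract reports (roughly 0.8 on real data)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

scores = [sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) for x in feats]
print(f"training-set AUC: {auc(scores, labels):.2f}")
```

Because the synthetic classes here are far more separable than real commits, this toy AUC lands near 1.0; the point is only to show how a history-feature model and the ranking metric fit together.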

Supplementary Material

Auxiliary Teaser Video (fse20ind-p98-p-teaser.mp4)
Auxiliary Presentation Video (fse20ind-p98-p-video.mp4)
These are the teaser and presentation videos for the paper "Adapting Bug Prediction Models to Predict Reverted Commits at Wayfair" at ESEC/FSE 2020. The main contribution of the paper is a model that predicts whether a commit will be reverted, which we show to measurably improve developer productivity at Wayfair.


Cited By

  • (2022) Predictive Models in Software Engineering: Challenges and Opportunities. ACM Transactions on Software Engineering and Methodology 31, 3, 1-72. https://doi.org/10.1145/3503509. Online publication date: 9 April 2022.

Published In

ESEC/FSE 2020: Proceedings of the 28th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering
November 2020
1703 pages
ISBN:9781450370431
DOI:10.1145/3368089
This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. reverted commits
  2. software defect prediction
  3. software deployment

Qualifiers

  • Research-article

Conference

ESEC/FSE '20

Acceptance Rates

Overall Acceptance Rate 112 of 543 submissions, 21%


Article Metrics

  • Downloads (last 12 months): 13
  • Downloads (last 6 weeks): 1

Reflects downloads up to 03 Mar 2025
