
Adapting bug prediction models to predict reverted commits at Wayfair

Published: 08 November 2020

Abstract

Researchers have proposed many algorithms to predict software bugs. Given a software entity (e.g., a file or method), these algorithms predict whether the entity is bug-prone. However, since these algorithms cannot identify specific bugs, their predictions tend not to be particularly useful in practice. In this work, we adapt this prior work to the related problem of predicting whether a commit is likely to be reverted. Given the batch nature of continuous integration deployment at scale, this allows developers to find time-sensitive bugs in production more quickly. The models in this paper are based on features, typically used in bug prediction, extracted from the revision history of a codebase. Our experiments, performed on the three main repositories for the Wayfair website, show that our models can rank reverted commits above 80% of non-reverted commits on average. Moreover, when given to Wayfair developers, our models reduce the amount of time needed to find certain kinds of bugs by 55%. Wayfair continues to use our findings and models today to help find bugs during software deployments.
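The abstract's ranking claim ("rank reverted commits above 80% of non-reverted commits") corresponds to the area under the ROC curve. As an illustration only -- the features, thresholds, and synthetic data below are invented stand-ins, not the paper's implementation -- here is a minimal sketch of training a revision-history classifier and scoring it with that metric:

```python
import math
import random

random.seed(0)

def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))

# Synthetic commit data (all numbers invented, not from the paper):
# reverted commits skew toward larger, more scattered changes by
# less-experienced authors.
def make_commit(reverted: bool):
    base = 5 if reverted else 1
    features = [
        random.gauss(base * 40, 20),     # lines changed
        random.gauss(base, 1),           # files touched
        random.gauss(12 - 2 * base, 3),  # author's prior commit count
    ]
    return features, 1 if reverted else 0

data = [make_commit(random.random() < 0.2) for _ in range(500)]
dims = 3

# Standardize each feature so gradient descent behaves.
means = [sum(x[d] for x, _ in data) / len(data) for d in range(dims)]
stds = [max(1e-9, (sum((x[d] - means[d]) ** 2 for x, _ in data) / len(data)) ** 0.5)
        for d in range(dims)]
feats = [[(x[d] - means[d]) / stds[d] for d in range(dims)] for x, _ in data]
labels = [y for _, y in data]

# Plain batch-gradient-descent logistic regression.
w, b, lr = [0.0] * dims, 0.0, 0.5
for _ in range(300):
    gw, gb = [0.0] * dims, 0.0
    for x, y in zip(feats, labels):
        err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
        for d in range(dims):
            gw[d] += err * x[d]
        gb += err
    w = [wi - lr * gi / len(feats) for wi, gi in zip(w, gw)]
    b -= lr * gb / len(feats)

def auc(scores, labels):
    """Probability a reverted commit is scored above a non-reverted one --
    the ranking metric the abstract reports (roughly 0.8 on real data)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

scores = [sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) for x in feats]
print(f"training-set AUC: {auc(scores, labels):.2f}")
```

Because the synthetic classes here are far more separable than real commits, this toy AUC lands near 1.0; the point is only to show how a history-feature model and the ranking metric fit together.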

Supplementary Material

Auxiliary Teaser Video (fse20ind-p98-p-teaser.mp4)
Auxiliary Presentation Video (fse20ind-p98-p-video.mp4)
These are the teaser and presentation videos for the paper "Adapting Bug Prediction Models to Predict Reverted Commits at Wayfair" at ESEC/FSE 2020. The main contribution of the paper is a model that predicts whether a commit will be reverted, which we show to measurably improve developer productivity at Wayfair.


Cited By

  • (2022) Predictive Models in Software Engineering: Challenges and Opportunities. ACM Transactions on Software Engineering and Methodology 31, 3, 1-72. https://doi.org/10.1145/3503509. Online publication date: 9 April 2022.

Published In

ESEC/FSE 2020: Proceedings of the 28th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering
November 2020
1703 pages
ISBN:9781450370431
DOI:10.1145/3368089
This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. reverted commits
  2. software defect prediction
  3. software deployment

Qualifiers

  • Research-article

Conference

ESEC/FSE '20

Acceptance Rates

Overall Acceptance Rate 112 of 543 submissions, 21%


Article Metrics

  • Downloads (last 12 months): 13
  • Downloads (last 6 weeks): 1

Reflects downloads up to 03 Mar 2025
