SampleFix: Learning to Generate Functionally Diverse Fixes

  • Conference paper
Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD 2021)

Abstract

Automatic program repair holds the potential of dramatically improving the productivity of programmers during the software development process and the correctness of software in general. Recent advances in machine learning, deep learning, and NLP have rekindled the hope of eventually fully automating the process of repairing programs. However, previous approaches that aim to predict a single fix are prone to fail due to uncertainty about the true intent of the programmer. Therefore, we propose a generative model that learns a distribution over potential fixes. Our model is formulated as a deep conditional variational autoencoder that can efficiently sample fixes for a given erroneous program. In order to ensure diverse solutions, we propose a novel regularizer that encourages diversity over a semantic embedding space. Our evaluations on common programming errors show for the first time the generation of diverse fixes and strong improvements over state-of-the-art approaches by fixing up to \(45\%\) of the erroneous programs. We additionally show that for \(65\%\) of the repaired programs, our approach was able to generate multiple programs with diverse functionalities.
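
For orientation, the generic training objective of a conditional variational autoencoder can be written as a lower bound on the conditional log-likelihood. This is the standard textbook form, not necessarily the exact objective used in the paper, and the symbols are assumptions introduced here for illustration: \(x\) is the erroneous program, \(y\) a fix, \(z\) a latent variable, \(q_\phi\) the recognition network, and \(p_\theta\) the conditional prior and decoder.

\[
\log p_\theta(y \mid x) \;\ge\; \mathbb{E}_{q_\phi(z \mid x, y)}\big[\log p_\theta(y \mid x, z)\big] \;-\; D_{\mathrm{KL}}\big(q_\phi(z \mid x, y)\,\|\,p_\theta(z \mid x)\big)
\]

Under this formulation, candidate fixes for a given \(x\) are obtained by drawing \(z \sim p_\theta(z \mid x)\) and decoding \(y \sim p_\theta(y \mid x, z)\), so repeated draws of \(z\) can yield different fixes. The diversity regularizer mentioned in the abstract would be an additional term on top of this bound; its form is not given here and is not shown.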



Corresponding author

Correspondence to Hossein Hajipour.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Hajipour, H., Bhattacharyya, A., Staicu, CA., Fritz, M. (2021). SampleFix: Learning to Generate Functionally Diverse Fixes. In: Kamp, M., et al. Machine Learning and Principles and Practice of Knowledge Discovery in Databases. ECML PKDD 2021. Communications in Computer and Information Science, vol 1525. Springer, Cham. https://doi.org/10.1007/978-3-030-93733-1_8

  • DOI: https://doi.org/10.1007/978-3-030-93733-1_8

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-93732-4

  • Online ISBN: 978-3-030-93733-1

  • eBook Packages: Computer Science (R0)
