ReBack: recommending backports in social coding environments

Automated Software Engineering

Abstract

Pull-based development is widely used in popular social coding environments such as GitHub and GitLab for both internal and external contributions. When critical bug fixes or features are committed to the main branch of a project, it is often desirable to port those changes to other stable branches as well. This process is referred to as backporting, and the pull-requests it produces are known as backports. Backports are typically determined after extensive discussion with collaborators, and identifying them may take many days; as a result, tags and references to the original pull-requests (i.e., pull-requests for the main branch) are commonly missed. To help software development teams better identify and manage backports, we propose ReBack (Recommending Backports), a tool based on a deep-learning model that automatically identifies backports from pull-requests and their related reviews, discussions, metadata, and committed code. ReBack predicted backports with 90.98% precision and 91.81% recall from 80,000 pull-requests in 17 GitHub projects. Although the results are promising, more research is required to further support backporting, including research into automatically porting a pull-request, which would further reduce the cost of managing software versions and branches.
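For readers less familiar with the reported metrics, the following minimal sketch shows how precision and recall are computed for a binary "backport / not a backport" prediction task. It is not ReBack's evaluation code: the function name `precision_recall` and the toy labels are assumptions introduced here purely for illustration, and the labels are not data from the study.

```python
from typing import Sequence, Tuple


def precision_recall(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (precision, recall) for binary labels where 1 means 'backport'."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    # Count true positives, false positives, and false negatives.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    # Precision: share of predicted backports that really are backports.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: share of real backports that the model found.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy, made-up labels for eight pull-requests (hypothetical, not study data).
actual = [1, 1, 1, 0, 0, 1, 0, 0]
predicted = [1, 1, 0, 0, 1, 1, 0, 0]
precision, recall = precision_recall(actual, predicted)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this reading, high precision means few pull-requests are wrongly flagged as backports, and high recall means few genuine backports are overlooked.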

Acknowledgements

This research is supported in part by Natural Sciences and Engineering Research Council of Canada (NSERC) Discovery grants, by an NSERC Collaborative Research and Training Experience (CREATE) grant, and by two Canada First Research Excellence Fund (CFREF) grants coordinated by the Global Institute for Food Security (GIFS) and the Global Institute for Water Security (GIWS).

Author information

Contributions

All authors contributed to the study's conception and design. Material preparation, data collection, and analysis were performed by Debasish Chakroborti. The first draft of the manuscript was written by Debasish Chakroborti, and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Debasish Chakroborti.

Ethics declarations

Conflict of interest

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Chakroborti, D., Schneider, K.A. & Roy, C.K. ReBack: recommending backports in social coding environments. Autom Softw Eng 31, 18 (2024). https://doi.org/10.1007/s10515-024-00416-1

  • DOI: https://doi.org/10.1007/s10515-024-00416-1
