ReBack: recommending backports in social coding environments

Debasish Chakroborti¹,
Kevin A. Schneider¹ &
Chanchal K. Roy¹


Abstract

Pull-based development is widely used in popular social coding environments like GitHub and GitLab for both internal and external contributions. When critical bug fixes or features are committed to the main branch of a project, it is often desirable to also port those changes to other stable branches. This process is referred to as backporting, and pull-requests in the process are known as backports. Backports are typically determined after extensive discussion with collaborators, and it may take many days to identify backports, which commonly results in tags and references to the original pull-requests (i.e., pull-requests for the main branch) being missed. To help software development teams better identify and manage backports, we propose ReBack (Recommending Backports), a tool based on a deep-learning model for automatically identifying backports from pull-requests and related reviews, discussions, metadata, and committed code. ReBack predicted backports with 90.98% precision and 91.81% recall from 80,000 pull-requests in 17 GitHub projects. Although the results are promising, more research is required to further support backporting, including research into automatically porting a pull-request to further reduce costs when managing software versions and branches.
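The abstract reports precision and recall without defining them. As a minimal sketch, assuming the standard binary-classification definitions (a pull-request counts as a positive when it is a backport), the snippet below shows how the two metrics are computed and how they combine into an F1 score. The function names and the confusion counts are hypothetical and serve only as illustration. The F1 value is derived here from the two reported figures and is not a number stated in the abstract.

```python
def precision(tp: int, fp: int) -> float:
    """Fraction of pull-requests predicted as backports that really are backports."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp: int, fn: int) -> float:
    """Fraction of real backports that the model managed to predict."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def f1_score(p: float, r: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if (p + r) else 0.0


# Hypothetical confusion counts, for illustration only (not taken from the paper).
tp, fp, fn = 90, 10, 5
example_precision = precision(tp, fp)
example_recall = recall(tp, fn)

# Precision and recall as reported in the abstract, expressed as fractions.
reported_precision = 0.9098
reported_recall = 0.9181

# Derived value: the abstract itself does not report an F1 score.
derived_f1 = f1_score(reported_precision, reported_recall)
print(f"Derived F1 from reported precision/recall: {derived_f1:.4f}")
```

Because the reported precision and recall are close to each other, the harmonic mean lands between them, so the model does not appear to trade one metric heavily against the other.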



Acknowledgements

This research is supported in part by Natural Sciences and Engineering Research Council of Canada (NSERC) Discovery grants, an NSERC Collaborative Research and Training Experience (CREATE) grant, and two Canada First Research Excellence Fund (CFREF) grants coordinated by the Global Institute for Food Security (GIFS) and the Global Institute for Water Security (GIWS).

Author information

Authors and Affiliations

Department of Computer Science, University of Saskatchewan, 110 Science Place, Saskatoon, SK, S7N 5C9, Canada
Debasish Chakroborti, Kevin A. Schneider & Chanchal K. Roy


Contributions

All authors contributed to the study's conception and design. Material preparation, data collection, and analysis were performed by Debasish Chakroborti. The first draft of the manuscript was written by Debasish Chakroborti, and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Debasish Chakroborti.

Ethics declarations

Conflict of interest

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article

Cite this article

Chakroborti, D., Schneider, K.A. & Roy, C.K. ReBack: recommending backports in social coding environments. Autom Softw Eng 31, 18 (2024). https://doi.org/10.1007/s10515-024-00416-1


Received: 25 June 2022
Accepted: 15 January 2024
Published: 23 February 2024
DOI: https://doi.org/10.1007/s10515-024-00416-1
