
Pull Requests Integration Process Optimization: An Empirical Study

  • Conference paper
Evaluation of Novel Approaches to Software Engineering (ENASE 2022)

Abstract

Pull-based Development (PbD) is widely used in collaborative development to integrate changes into a project codebase. In this model, contributions are submitted as Pull Requests (PRs), which project administrators are responsible for reviewing and integrating. During integration, conflicts occur when PRs are concurrently open on a given target branch and propose different modifications to the same part of the code. In previous work, we proposed an approach, called IP Optimizer, that improves the Integration Process Efficiency (IPE) by prioritizing PRs. In this work, we conduct an empirical study of 260 open-source projects hosted on GitHub that use PRs intensively, in order to quantify the frequency of conflicts in software projects and to analyze how much the integration process can be improved. Regarding the frequency of conflicts, our results indicate that half of the projects have a moderate or high number of pairwise conflicts, while the other half have a low number or none. Furthermore, on average, 18.82% of the time windows have conflicts. Regarding how much the integration process can be improved, IP Optimizer improves the IPE in 94.16% of the time windows, with an average improvement of 146.15%. In addition, it improves the number of conflict resolutions in 67.16% of the time windows, with an average improvement of 134.28%.
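The notion of a pairwise conflict used in the abstract can be illustrated with a minimal sketch. It is not the authors' tooling: the `PullRequest` record, its field names, and the "same line touched" test are all assumptions made for illustration, and an actual study would more likely rely on a real merge attempt to detect conflicts.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class PullRequest:
    """Hypothetical, simplified PR record (not the paper's data model)."""
    number: int
    target_branch: str
    opened_at: int       # illustrative timestamps, e.g. day indices
    closed_at: int
    touched: frozenset   # (file path, line number) pairs the PR modifies


def concurrently_open(a: PullRequest, b: PullRequest) -> bool:
    """True when both PRs target the same branch and their open intervals overlap."""
    return (a.target_branch == b.target_branch
            and a.opened_at <= b.closed_at
            and b.opened_at <= a.closed_at)


def pairwise_conflicts(prs):
    """Return pairs of PR numbers that are concurrently open and touch a common line.

    Assumption: touching the same (file, line) stands in for "different
    modifications to the same part of the code".
    """
    return [(a.number, b.number)
            for a, b in combinations(prs, 2)
            if concurrently_open(a, b) and a.touched & b.touched]


# Illustrative data only: four PRs, three on "main" and one on "dev".
example_prs = [
    PullRequest(1, "main", 0, 5, frozenset({("a.py", 10)})),
    PullRequest(2, "main", 3, 8, frozenset({("a.py", 10), ("b.py", 1)})),
    PullRequest(3, "main", 6, 9, frozenset({("a.py", 10)})),
    PullRequest(4, "dev", 0, 9, frozenset({("a.py", 10)})),
]
example_conflicts = pairwise_conflicts(example_prs)
```

In this toy data, PRs 1 and 3 touch the same line but are never open at the same time, and PR 4 targets a different branch, so neither combination counts as a pairwise conflict under these assumptions.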


Notes

  1. https://github.com/

  2. https://www.atlassian.com/en/software/jira

  3. https://ghtorrent.org/

  4. Since the GHTorrent dataset has data up to 2019-06-01, last year corresponds to 2018-06-01 and after.

  5. https://git-scm.com/docs/git-merge

  6. Version in which the PR was created.

  7. https://git-scm.com/docs/git-merge#Documentation/git-merge.txt---squash

  8. https://www.atlassian.com/git/tutorials/comparing-workflows/forking-workflow

  9. https://www.antlr.org/


Author information

Corresponding author

Correspondence to Agustín Olmedo.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Olmedo, A., Arévalo, G., Cassol, I., Perez, Q., Urtado, C., Vauttier, S. (2023). Pull Requests Integration Process Optimization: An Empirical Study. In: Kaindl, H., Mannion, M., Maciaszek, L.A. (eds) Evaluation of Novel Approaches to Software Engineering. ENASE 2022. Communications in Computer and Information Science, vol 1829. Springer, Cham. https://doi.org/10.1007/978-3-031-36597-3_8

  • DOI: https://doi.org/10.1007/978-3-031-36597-3_8

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-36596-6

  • Online ISBN: 978-3-031-36597-3

  • eBook Packages: Computer Science, Computer Science (R0)
