Abstract
Pull-based Development (PbD) is widely used in collaborative development to integrate changes into a project codebase. In this model, contributions are submitted as Pull Requests (PRs), which project administrators are responsible for reviewing and integrating. During integration, conflicts occur when PRs are concurrently open on a given target branch and propose different modifications to the same part of the code. In previous work, we proposed an approach, called IP Optimizer, to improve the Integration Process Efficiency (IPE) by prioritizing PRs. In this work, we conduct an empirical study on 260 open-source projects hosted on GitHub that use PRs intensively, in order to quantify the frequency of conflicts in software projects and to analyze how much the integration process can be improved. Regarding the frequency of conflicts, our results indicate that half of the projects have a moderate or high number of pairwise conflicts, while the other half have a low number of pairwise conflicts or none. Furthermore, on average 18.82% of the time windows have conflicts. Regarding how much the integration process can be improved, IP Optimizer improves the IPE in 94.16% of the time windows, with an average improvement of 146.15%. In addition, it improves the number of conflict resolutions in 67.16% of the time windows, with an average improvement of 134.28%.
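The notion of a pairwise conflict used above can be illustrated with a minimal Python sketch. It is not the detection procedure of the study, which the abstract does not describe. It assumes that two PRs conflict when they target the same branch, are open during overlapping periods, and modify overlapping line ranges of the same file. The `PullRequest` class, its fields, and the integer timestamps are hypothetical names introduced only for this illustration. Overlapping line ranges are a simplification: an actual textual conflict depends on the merge tool, and two PRs making identical changes would not conflict.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass
class PullRequest:
    # All fields are hypothetical, chosen for illustration only.
    number: int
    target_branch: str
    opened_at: int   # e.g. days since the start of the observation period
    closed_at: int
    changes: dict    # file path -> list of (first_line, last_line) ranges


def open_concurrently(a, b):
    """True if the periods during which the two PRs are open overlap."""
    return a.opened_at <= b.closed_at and b.opened_at <= a.closed_at


def touch_same_code(a, b):
    """True if both PRs modify overlapping line ranges of a common file."""
    for path, ranges_a in a.changes.items():
        for start_a, end_a in ranges_a:
            for start_b, end_b in b.changes.get(path, []):
                if start_a <= end_b and start_b <= end_a:
                    return True
    return False


def pairwise_conflicts(prs):
    """Return the pairs of PR numbers that conflict under the assumptions above."""
    return [
        (a.number, b.number)
        for a, b in combinations(prs, 2)
        if a.target_branch == b.target_branch
        and open_concurrently(a, b)
        and touch_same_code(a, b)
    ]


prs = [
    PullRequest(1, "main", 0, 5, {"app.py": [(10, 20)]}),
    PullRequest(2, "main", 3, 8, {"app.py": [(15, 30)]}),
    PullRequest(3, "main", 4, 9, {"util.py": [(1, 5)]}),
    PullRequest(4, "dev", 0, 9, {"app.py": [(10, 20)]}),
]
print(pairwise_conflicts(prs))
```

In this example only PRs 1 and 2 form a conflicting pair: PR 3 modifies a different file, and PR 4 targets a different branch.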
Notes
- 4. Since the GHTorrent dataset contains data up to 2019-06-01, "last year" refers to the period from 2018-06-01 onward.
- 6. Version in which the PR was created.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Olmedo, A., Arévalo, G., Cassol, I., Perez, Q., Urtado, C., Vauttier, S. (2023). Pull Requests Integration Process Optimization: An Empirical Study. In: Kaindl, H., Mannion, M., Maciaszek, L.A. (eds) Evaluation of Novel Approaches to Software Engineering. ENASE 2022. Communications in Computer and Information Science, vol 1829. Springer, Cham. https://doi.org/10.1007/978-3-031-36597-3_8
DOI: https://doi.org/10.1007/978-3-031-36597-3_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-36596-6
Online ISBN: 978-3-031-36597-3
eBook Packages: Computer Science (R0)