Abstract
Collaborative software development platforms (such as GitHub and GitLab) have become increasingly popular, attracting thousands of external contributors to open source projects. External contributors submit their contributions via pull requests, which must be reviewed before being integrated into the central repository. During the review process, reviewers provide feedback to contributors, conduct tests, and request further modifications before finally accepting or rejecting the contributions. Reviewers play a key role in maintaining an effective review process. However, the number of decisions that reviewers can make is far outpaced by the growing number of pull request submissions. To help reviewers make more decisions on pull requests within their limited working time, we propose a learning-to-rank (LtR) approach that recommends pull requests that can be quickly reviewed. Unlike a binary model that predicts the decisions of pull requests, our approach ranks the existing list of pull requests by their likelihood of being quickly merged or rejected. We use 18 metrics to build LtR models with six different LtR algorithms, including ListNet, RankNet, MART, and random forest. We conduct empirical studies on 74 Java projects to compare the performance of the six LtR algorithms. We compare the best-performing algorithm against two baselines from previous research on pull request prioritization: the first-in-first-out (FIFO) baseline and the small-size-first baseline. We then conduct a survey with GitHub reviewers to understand how code reviewers perceive the usefulness of our approach. We observe that: (1) The random forest LtR algorithm outperforms the other five well-adapted LtR algorithms in ranking quickly merged pull requests. (2) The random forest LtR algorithm performs better than both the FIFO and small-size-first baselines, which means our LtR approach can help reviewers make more decisions and improve their productivity. (3) The contributor's social connections and the contributor's experience are the most influential metrics for ranking pull requests that can be quickly merged. (4) The GitHub reviewers who participated in our survey acknowledge that our approach complements existing prioritization baselines, helping them prioritize and review more pull requests.
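The core idea described in the abstract can be sketched as follows: train a random-forest model on pull request metrics, then rank open pull requests by their predicted likelihood of being quickly merged. This is a minimal illustrative sketch, not the paper's implementation; the feature names and synthetic data are assumptions, and the paper's actual 18 metrics (e.g., contributor social connections and experience) are not reproduced here.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Synthetic training data: each row is a pull request described by a few
# hypothetical metrics (e.g., size of change, contributor experience,
# contributor social connections).
X_train = rng.random((200, 3))
# Label: 1 if the pull request was merged quickly, 0 otherwise
# (synthetic rule for illustration only).
y_train = (X_train[:, 1] + X_train[:, 2] > 1.0).astype(int)

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# Rank a batch of open pull requests: highest predicted probability
# of a quick merge first.
X_open = rng.random((5, 3))
scores = model.predict_proba(X_open)[:, 1]
ranking = np.argsort(-scores)  # pull request indices, best candidate first
```

In practice, a pointwise random forest is only one of several LtR formulations; pairwise (e.g., RankNet) and listwise (e.g., ListNet) algorithms learn from preference pairs or whole ranked lists instead of per-item labels.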
Communicated by: Alexander Serebrenik
Appendices
Appendix A: Decision Table
Appendix B: Survey E-mail Template
Dear %s,
My name is Guoliang Zhao. I am a PhD student at Queen’s University, Canada. I am inviting you to participate in a survey (it consists of only 5 questions and will not take more than 5 minutes) about improving the pull request review process, because you are an active GitHub code reviewer. We apologize in advance if our email is unwanted or a waste of your time. Your feedback would be highly valuable to our research. This study will help us understand how GitHub code reviewers would react to our pull request recommendation approach. There are no mandatory questions in our survey. To participate, please click on the following link: https://docs.google.com/forms/d/e/1FAIpQLScUWVQkpS42fVNNfshEKwrKKgb7EKJb5HxLVLbKpNJYbnqk1A/viewform?usp=sf_link
To compensate you for your time, you may win one $10 Amazon gift card if you complete the full questionnaire. We will randomly select 20% of participants as winners. We would also be happy to share with you the results of the survey, if you are interested.
If you have any questions about this survey, or have difficulty accessing the site or completing the survey, please contact Guoliang Zhao at g.zhao@queensu.ca or Daniel Alencar da Costa at daniel.alencar@queensu.ca. Thank you in advance for your time and for providing this important feedback!
Best regards,
Guoliang
Cite this article
Zhao, G., da Costa, D.A. & Zou, Y. Improving the pull requests review process using learning-to-rank algorithms. Empir Software Eng 24, 2140–2170 (2019). https://doi.org/10.1007/s10664-019-09696-8