DOI: 10.1145/2652524.2652542

Tracing back the history of commits in low-tech reviewing environments: a case study of the Linux kernel

Published: 18 September 2014

Abstract

Context: During software maintenance, developers typically go back to the original reviews of a patch to understand its design rationale and the potential risks of the code. Whereas modern web-based reviewing environments like Gerrit make this process relatively easy, the low-tech, mailing-list-based reviewing environments of many open source systems make linking a commit back to its reviews and earlier versions far from trivial, since (1) a commit has no physical link to any reviewing email, (2) the discussed patches are not always fully identical to the accepted commits, and (3) some discussions span multiple email threads, each of which may contain multiple versions of the same patch.
Goal: To support maintainers in reconstructing the reviewing history of kernel patches, and to study (for the first time) the characteristics of the recovered reviewing histories.
Method: This paper performs a comparative empirical study, on the Linux kernel mailing lists, of three email-to-email and email-to-commit linking techniques based on checksums, common patch lines, and clone detection.
Results: Around 25% of the patches had an (until now) hidden reviewing history of more than four weeks, and patches with multiple versions are typically larger and have a higher acceptance rate than patches with only one version.
Conclusion: The plus-minus-line-based technique is the best approach for linking patch emails to commits, but it needs to be combined with the checksum-based technique to link different versions of a patch.



Published In

ESEM '14: Proceedings of the 8th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement
September 2014
461 pages
ISBN:9781450327749
DOI:10.1145/2652524


Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. clone detection
  2. linux kernel
  3. low-tech reviewing environment
  4. mailing list
  5. open source
  6. review
  7. software engineering
  8. traceability

Qualifiers

  • Research-article

Conference

ESEM '14

Acceptance Rates

ESEM '14 Paper Acceptance Rate: 23 of 123 submissions, 19%
Overall Acceptance Rate: 130 of 594 submissions, 22%

Cited By
  • (2022) "Code Reviews With Divergent Review Scores: An Empirical Study of the OpenStack and Qt Communities", IEEE Transactions on Software Engineering, 48(1):69-81. DOI: 10.1109/TSE.2020.2977907
  • (2021) "The "Shut the f**k up" Phenomenon: Characterizing Incivility in Open Source Code Review Discussions", Proceedings of the ACM on Human-Computer Interaction, 5(CSCW2):1-35. DOI: 10.1145/3479497
  • (2021) "Synchronous development in open-source projects: A higher-level perspective", Automated Software Engineering, 29(1). DOI: 10.1007/s10515-021-00292-z
  • (2019) "A longitudinal study on the maintainers' sentiment of a large scale open source ecosystem", Proceedings of the 4th International Workshop on Emotion Awareness in Software Engineering, pages 17-22. DOI: 10.1109/SEmotion.2019.00011
  • (2019) "The list is the process", Proceedings of the 41st International Conference on Software Engineering, pages 807-818. DOI: 10.1109/ICSE.2019.00088
  • (2019) "Measuring and analyzing code authorship in 1 + 118 open source projects", Science of Computer Programming. DOI: 10.1016/j.scico.2019.03.001
  • (2017) "Do not trust build results at face value", Proceedings of the 14th International Conference on Mining Software Repositories, pages 312-322. DOI: 10.1109/MSR.2017.7
  • (2017) "Broadcast vs. Unicast Review Technology: Does It Matter?", 2017 IEEE International Conference on Software Testing, Verification and Validation (ICST), pages 219-229. DOI: 10.1109/ICST.2017.27
  • (2016) "Effectiveness of code contribution: from patch-based to pull-request-based tools", Proceedings of the 24th ACM SIGSOFT International Symposium on Foundations of Software Engineering, pages 871-882. DOI: 10.1145/2950290.2950364
  • (2015) "Improving the integration process of large software systems", 2015 IEEE 22nd International Conference on Software Analysis, Evolution, and Reengineering (SANER), page 598. DOI: 10.1109/SANER.2015.7081888
