DOI: 10.1145/2652524.2652542

Tracing back the history of commits in low-tech reviewing environments: a case study of the Linux kernel

Published: 18 September 2014

Abstract

Context: During software maintenance, developers typically go back to the original reviews of a patch to understand its design rationale and the potential risks of the code. Whereas modern web-based reviewing environments like Gerrit make this process relatively easy, the low-tech, mailing-list-based reviewing environments of many open source systems make linking a commit back to its reviews and earlier versions far from trivial, since (1) a commit has no physical link to any reviewing email, (2) the discussed patches are not always fully identical to the accepted commits, and (3) some discussions span multiple email threads, each of which may contain multiple versions of the same patch.
Goal: To support maintainers in reconstructing the reviewing history of kernel patches, and to study (for the first time) the characteristics of the recovered reviewing histories.
Method: This paper performs a comparative empirical study, on the Linux kernel mailing lists, of three email-to-email and email-to-commit linking techniques based on checksums, common patch lines, and clone detection.
Results: Around 25% of the patches had an (until now) hidden reviewing history of more than four weeks, and patches with multiple versions are typically larger and have a higher acceptance rate than patches with only one version.
Conclusion: The plus-minus-line-based technique is the best approach for linking patch emails to commits, but it needs to be combined with the checksum-based technique to link different versions of a patch.



Published In

ESEM '14: Proceedings of the 8th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement
September 2014
461 pages
ISBN:9781450327749
DOI:10.1145/2652524


Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. clone detection
  2. linux kernel
  3. low-tech reviewing environment
  4. mailing list
  5. open source
  6. review
  7. software engineering
  8. traceability

Qualifiers

  • Research-article

Conference

ESEM '14

Acceptance Rates

ESEM '14 Paper Acceptance Rate: 23 of 123 submissions, 19%
Overall Acceptance Rate: 130 of 594 submissions, 22%

Cited By
  • (2022) "Code Reviews With Divergent Review Scores: An Empirical Study of the OpenStack and Qt Communities", IEEE Transactions on Software Engineering, 48(1):69-81. DOI: 10.1109/TSE.2020.2977907
  • (2021) "The "Shut the f**k up" Phenomenon: Characterizing Incivility in Open Source Code Review Discussions", Proceedings of the ACM on Human-Computer Interaction, 5(CSCW2):1-35. DOI: 10.1145/3479497
  • (2021) "Synchronous development in open-source projects: A higher-level perspective", Automated Software Engineering, 29(1). DOI: 10.1007/s10515-021-00292-z
  • (2019) "A longitudinal study on the maintainers' sentiment of a large scale open source ecosystem", Proceedings of the 4th International Workshop on Emotion Awareness in Software Engineering, pages 17-22. DOI: 10.1109/SEmotion.2019.00011
  • (2019) "The list is the process", Proceedings of the 41st International Conference on Software Engineering, pages 807-818. DOI: 10.1109/ICSE.2019.00088
  • (2019) "Measuring and analyzing code authorship in 1 + 118 open source projects", Science of Computer Programming. DOI: 10.1016/j.scico.2019.03.001
  • (2017) "Do not trust build results at face value", Proceedings of the 14th International Conference on Mining Software Repositories, pages 312-322. DOI: 10.1109/MSR.2017.7
  • (2017) "Broadcast vs. Unicast Review Technology: Does It Matter?", 2017 IEEE International Conference on Software Testing, Verification and Validation (ICST), pages 219-229. DOI: 10.1109/ICST.2017.27
  • (2016) "Effectiveness of code contribution: from patch-based to pull-request-based tools", Proceedings of the 24th ACM SIGSOFT International Symposium on Foundations of Software Engineering, pages 871-882. DOI: 10.1145/2950290.2950364
  • (2015) "Improving the integration process of large software systems", 2015 IEEE 22nd International Conference on Software Analysis, Evolution, and Reengineering (SANER), page 598. DOI: 10.1109/SANER.2015.7081888
