DOI: 10.1145/2635868.2635880

Identifying the characteristics of vulnerable code changes: an empirical study

Published: 11 November 2014

Abstract

To focus the efforts of security experts, the goals of this empirical study are to analyze which security vulnerabilities can be discovered by code review, identify characteristics of vulnerable code changes, and identify characteristics of developers likely to introduce vulnerabilities. Using a three-stage manual and automated process, we analyzed 267,046 code review requests from 10 open source projects and identified 413 Vulnerable Code Changes (VCCs). Some key results include: (1) code review can identify common types of vulnerabilities; (2) while more experienced contributors authored the majority of the VCCs, the less experienced contributors' changes were 1.8 to 24 times more likely to be vulnerable; (3) the likelihood of a vulnerability increases with the number of lines changed; and (4) modified files are more likely to contain vulnerabilities than new files. Knowing which code changes are more prone to contain vulnerabilities may allow a security expert to concentrate on a smaller subset of submitted code changes. Moreover, we recommend that projects should: (a) create or adapt secure coding guidelines, (b) create a dedicated security review team, (c) ensure detailed comments during review to help knowledge dissemination, and (d) encourage developers to make small, incremental changes rather than large changes.
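The abstract describes the mining pipeline only at a high level (a three-stage manual and automated process over code review requests). As a rough, hypothetical illustration of the kind of automated first pass such a study could use before manual inspection, the Python sketch below flags code review requests whose comments mention security-related terms. The keyword list, record format, and function name are assumptions made for illustration; they are not the authors' actual implementation.

# Hypothetical first-pass filter: flag code review requests whose comments
# mention security-related terms, so candidates can be inspected manually.
# The keyword list and the review-record format are illustrative assumptions,
# not the pipeline used in the paper.

SECURITY_KEYWORDS = {
    "buffer overflow", "use after free", "integer overflow", "race condition",
    "xss", "cross-site", "sql injection", "format string",
    "denial of service", "vulnerability", "cve-", "security",
}

def is_candidate_vcc(review):
    """Return True if any review comment mentions a security-related term."""
    text = " ".join(review.get("comments", [])).lower()
    return any(keyword in text for keyword in SECURITY_KEYWORDS)

if __name__ == "__main__":
    reviews = [
        {"id": 101, "comments": ["Looks good, just rename the variable."]},
        {"id": 102, "comments": ["This memcpy can overflow the buffer if len is unchecked."]},
    ]
    candidates = [r["id"] for r in reviews if is_candidate_vcc(r)]
    print("Candidate VCC review ids for manual inspection:", candidates)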

    Information

    Published In

    FSE 2014: Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering
    November 2014, 856 pages
    ISBN: 9781450330565
    DOI: 10.1145/2635868

    Publisher

    Association for Computing Machinery, New York, NY, United States



    Author Tags

    1. code review
    2. inspection
    3. open source
    4. security defects
    5. vulnerability

    Qualifiers

    • Research-article

    Conference

    SIGSOFT/FSE'14

    Acceptance Rates

    Overall Acceptance Rate 17 of 128 submissions, 13%
