DOI: 10.1145/2635868.2635880

Identifying the characteristics of vulnerable code changes: an empirical study

Published: 11 November 2014

Abstract

To focus the efforts of security experts, the goals of this empirical study are to analyze which security vulnerabilities can be discovered by code review, identify characteristics of vulnerable code changes, and identify characteristics of developers likely to introduce vulnerabilities. Using a three-stage manual and automated process, we analyzed 267,046 code review requests from 10 open source projects and identified 413 Vulnerable Code Changes (VCCs). Some key results include: (1) code review can identify common types of vulnerabilities; (2) while more experienced contributors authored the majority of the VCCs, the less experienced contributors' changes were 1.8 to 24 times more likely to be vulnerable; (3) the likelihood of a vulnerability increases with the number of lines changed; and (4) modified files are more likely to contain vulnerabilities than new files. Knowing which code changes are more prone to contain vulnerabilities may allow a security expert to concentrate on a smaller subset of submitted code changes. Moreover, we recommend that projects should: (a) create or adapt secure coding guidelines, (b) create a dedicated security review team, (c) ensure detailed comments during review to help knowledge dissemination, and (d) encourage developers to make small, incremental changes rather than large changes.
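The abstract describes the mining pipeline only at a high level (a three-stage manual and automated process over code review requests). As a rough, hypothetical illustration of the kind of automated first pass such a study could use before manual inspection, the Python sketch below flags code review requests whose comments mention security-related terms. The keyword list, record format, and function name are assumptions made for illustration; they are not the authors' actual implementation.

# Hypothetical first-pass filter: flag code review requests whose comments
# mention security-related terms, so candidates can be inspected manually.
# The keyword list and the review-record format are illustrative assumptions,
# not the pipeline used in the paper.

SECURITY_KEYWORDS = {
    "buffer overflow", "use after free", "integer overflow", "race condition",
    "xss", "cross-site", "sql injection", "format string",
    "denial of service", "vulnerability", "cve-", "security",
}

def is_candidate_vcc(review):
    """Return True if any review comment mentions a security-related term."""
    text = " ".join(review.get("comments", [])).lower()
    return any(keyword in text for keyword in SECURITY_KEYWORDS)

if __name__ == "__main__":
    reviews = [
        {"id": 101, "comments": ["Looks good, just rename the variable."]},
        {"id": 102, "comments": ["This memcpy can overflow the buffer if len is unchecked."]},
    ]
    candidates = [r["id"] for r in reviews if is_candidate_vcc(r)]
    print("Candidate VCC review ids for manual inspection:", candidates)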

    Information

    Published In

    FSE 2014: Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering
    November 2014, 856 pages
    ISBN: 9781450330565
    DOI: 10.1145/2635868

    Publisher

    Association for Computing Machinery, New York, NY, United States



    Author Tags

    1. code review
    2. inspection
    3. open source
    4. security defects
    5. vulnerability

    Qualifiers

    • Research-article

    Conference

    SIGSOFT/FSE'14

    Acceptance Rates

    Overall Acceptance Rate 17 of 128 submissions, 13%
