DOI: 10.1145/3661167.3661279

An Empirical Investigation of the Security Weaknesses in Open-Source Projects

Published: 18 June 2024

Abstract

As code reuse increases, so does the likelihood of inheriting security vulnerabilities. Static analysis tools are therefore widely used to evaluate open-source projects for security vulnerabilities. This research empirically studies common weakness types (CWEs), their frequencies, and the correlations between them and open-source project characteristics. We used the PVS-Studio tool to analyze 150 projects hosted on GitHub and written in C#, C++, and Java, and to investigate the common weaknesses found in them. Our statistical analyses identify factors that contribute to the presence of these weaknesses, which gives the study practical implications for developers and researchers interested in open-source project security. Notably, C++ projects tend to have more weaknesses. The most common weakness types detected across these programming languages are CWE-571, CWE-570, CWE-690, CWE-682, CWE-476, CWE-628, CWE-563, CWE-691, CWE-704, and CWE-393. Project age and the number of commits are positively correlated with the number of detected weaknesses, while stars and forks have little impact. These findings highlight the need for caution when using open-source code, which can contain vulnerabilities that compromise the software's security. It is therefore crucial to scan third-party code before incorporating it into a project.
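The correlation claim in the abstract can be made concrete with a small sketch. The abstract does not say which correlation statistic the authors used, so the Spearman rank correlation below is an assumption, chosen because count data such as commits and stars is usually skewed. All data values and the names `average_ranks`, `pearson`, and `spearman` are hypothetical and invented for illustration; they are not results from the study.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Hypothetical per-project measurements (invented, not from the paper).
project_age_years = [1, 3, 4, 7, 10]
stars = [900, 15, 4000, 120, 300]
weakness_count = [5, 12, 20, 41, 77]

print("age vs. weaknesses:  ", spearman(project_age_years, weakness_count))
print("stars vs. weaknesses:", spearman(stars, weakness_count))
```

With this invented data, project age tracks the weakness count almost perfectly, while stars show a coefficient near zero. That mirrors the shape of the reported finding only as an illustration; a real analysis would also need a significance test and the actual per-project measurements.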


Published In

EASE '24: Proceedings of the 28th International Conference on Evaluation and Assessment in Software Engineering
June 2024
728 pages
ISBN: 9798400717017
DOI: 10.1145/3661167

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

EASE 2024

Acceptance Rates

Overall Acceptance Rate 71 of 232 submissions, 31%
