Improving real-world vulnerability characterization with vulnerable slices

Authors:
Solmaz Salimi, Sharif University of Technology, Iran
Maryam Ebrahimzadeh, Sharif University of Technology, Iran
Mehdi Kharrazi, Sharif University of Technology, Iran (ORCID: 0000-0002-1773-8314)

PROMISE 2020: Proceedings of the 16th ACM International Conference on Predictive Models and Data Analytics in Software Engineering, November 2020, Pages 11–20
https://doi.org/10.1145/3416508.3417120

Published: 8 November 2020

ABSTRACT

Vulnerability detection is an important challenge in the security community. Many techniques have been proposed to help identify vulnerabilities, ranging from symbolic execution to fuzzing. Although these approaches have improved considerably, they perform poorly on large-scale code bases. An alternative approach computes software metrics over the overall code structure in the hope of predicting which code segments are more likely to be vulnerable. The underlying assumption is that code that is more complex with respect to the software metrics is more likely to contain vulnerabilities.

In this paper, we conduct an empirical study on a large dataset of vulnerable code to examine whether changing the way we measure metrics can improve vulnerability characterization. More specifically, we introduce vulnerable slices as the vulnerable code units on which software metrics are measured, and then use the newly measured metrics to characterize vulnerable code. The results show that vulnerable slices significantly increase the accuracy of vulnerability characterization. Further, we use vulnerable slices to analyze the dataset of known vulnerabilities, in particular to observe how the size and complexity of real-world vulnerabilities change when they are measured on vulnerable slices.
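
To make the idea of measuring metrics on a slice rather than on a whole function concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not say how vulnerable slices are computed, so this example assumes a backward slice over a hand-written dependence graph. The toy function, the statement marked as vulnerable, and the names `DEPENDS_ON`, `BRANCHES`, `backward_slice`, and `metrics` are all hypothetical.

```python
from collections import deque

# Hypothetical toy function: each statement id maps to the statement ids it
# depends on (data or control dependence). Not taken from the paper.
DEPENDS_ON = {
    1: [],         # n = read_input()
    2: [],         # buf = alloc(16)
    3: [],         # log("start")
    4: [1],        # if n > 0:
    5: [1, 2, 4],  #     copy(buf, src, n)   <- assumed vulnerable statement
    6: [3],        # log("done")
}
BRANCHES = {4}  # statements that are decision points


def backward_slice(depends_on, criterion):
    """Return the criterion plus every statement it transitively depends on."""
    seen = {criterion}
    queue = deque([criterion])
    while queue:
        stmt = queue.popleft()
        for dep in depends_on[stmt]:
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    return seen


def metrics(statements, branches):
    """Size (statement count) and a cyclomatic-style count (decisions + 1)."""
    stmts = set(statements)
    return {"size": len(stmts), "complexity": len(stmts & branches) + 1}


vulnerable_slice = backward_slice(DEPENDS_ON, criterion=5)
function_metrics = metrics(DEPENDS_ON.keys(), BRANCHES)
slice_metrics = metrics(vulnerable_slice, BRANCHES)
```

In this toy case the slice drops the two logging statements, so the size metric shrinks from 6 statements for the whole function to 4 for the slice, while the decision count is unchanged. Real slicing operates on dependence graphs extracted by a static analyzer; the dictionary here only stands in for that output.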

Index Terms

1. Security and privacy
  1. Software and application security
  2. Systems security
    1. Vulnerability management
      1. Vulnerability scanners

Published in
PROMISE 2020: Proceedings of the 16th ACM International Conference on Predictive Models and Data Analytics in Software Engineering
November 2020
80 pages
ISBN: 9781450381277
DOI: 10.1145/3416508
General Chair: Leandro Minku
Program Chairs: Tim Menzies, Mei Nagappan
Copyright © 2020 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Program Slicing
Static Analysis
Vulnerability Characterization
Vulnerability Prediction
Qualifiers
- research-article
Acceptance Rates
Overall Acceptance Rate: 64 of 125 submissions, 51%
