
Improving real-world vulnerability characterization with vulnerable slices

Published: 08 November 2020

ABSTRACT

Vulnerability detection is an important challenge for the security community. Many techniques, ranging from symbolic execution to fuzzing, have been proposed to help identify vulnerabilities. Even though these approaches have improved considerably, they perform poorly on large-scale codebases. An alternative line of work computes software metrics over the overall code structure in the hope of predicting which code segments are more likely to be vulnerable. The underlying logic is that code that is more complex with respect to these metrics is more likely to contain vulnerabilities.

In this paper, we conduct an empirical study on a large dataset of vulnerable code to examine whether changing the way we measure metrics can improve vulnerability characterization. More specifically, we introduce vulnerable slices as the code units over which software metrics are measured, and then use these newly measured metrics to characterize vulnerable code. The results show that vulnerable slices significantly increase the accuracy of vulnerability characterization. Further, we use vulnerable slices to analyze a dataset of known vulnerabilities, in particular to observe how the size and complexity of real-world vulnerabilities change when measured on vulnerable slices.
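The core idea can be illustrated with a minimal sketch: instead of measuring a metric (here, simply statement count) over a whole function or file, measure it only over the backward slice of the vulnerable statement, i.e. the statements it transitively depends on. The dependence graph, statement numbering, and the choice of a vulnerable statement below are hypothetical toy inputs, not the paper's actual tooling or dataset.

```python
# Sketch of a slice-based metric, assuming a precomputed dependence graph:
# edges point from a statement to the statements it depends on
# (data or control dependences).

def backward_slice(deps, criterion):
    """Return the set of statements reachable from `criterion`
    by following dependence edges (the backward slice)."""
    seen = set()
    stack = [criterion]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        stack.extend(deps.get(node, []))
    return seen

# Toy dependence graph for a six-statement function; statement 6 is the
# (hypothetical) vulnerable one, e.g. an unchecked buffer copy.
deps = {
    6: [4, 5],  # the vulnerable statement uses values computed at 4 and 5
    5: [2],
    4: [1],
    3: [1],     # statement 3 is unrelated to the vulnerability
    2: [],
    1: [],
}

slice_ = backward_slice(deps, 6)
print(sorted(slice_))                               # -> [1, 2, 4, 5, 6]
print(len(slice_), "of", len(deps), "statements")   # slice-based size metric
```

A whole-function size metric would count all six statements; the slice-based metric counts only the five that can actually influence the vulnerable statement, which is what lets slice-level measurement characterize the vulnerability more precisely.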


Published in
        PROMISE 2020: Proceedings of the 16th ACM International Conference on Predictive Models and Data Analytics in Software Engineering
        November 2020
        80 pages
        ISBN:9781450381277
        DOI:10.1145/3416508

        Copyright © 2020 ACM


        Publisher

        Association for Computing Machinery

        New York, NY, United States


        Qualifiers

        • research-article

        Acceptance Rates

Overall acceptance rate: 64 of 125 submissions, 51%
