DOI:10.1145/3576915.3623188
research-article

Enhancing OSS Patch Backporting with Semantics

Published: 21 November 2023

Abstract

Keeping open-source software (OSS) up to date is one potential way to protect against known vulnerabilities. However, upgrading requires frequent and costly testing and may introduce compatibility issues. Consequently, developers often choose to backport security patches to the vulnerable versions instead. Manual backporting is time-consuming, especially for large OSS projects such as the Linux kernel, so automating the process is urgently needed. Existing automated approaches to backporting rely on either automatic patch generation or automatic patch migration. However, these methods are often ineffective and error-prone: because they operate only at the syntactic level, they fail to locate the precise patch locations or to generate the correct patch.
In this paper, we propose a patch type-sensitive approach to automatically backport OSS security patches, guided by the patch type and patch semantics. Specifically, our approach identifies patch locations through program dependency graph-based matching at the semantic level. It then applies fine-grained patch migration and fine-tuning based on patch types. We have implemented our approach in a tool named TSBPORT and evaluated it on a large-scale dataset of 1,815 pairs of real-world security patches for the Linux kernel. The evaluation results show that TSBPORT successfully backported 1,589 (87.59%) patches, of which 587 (32.34%) could not be backported by any state-of-the-art approach, significantly outperforming those approaches. In addition, experiments show that TSBPORT generalizes to other OSS projects, backporting their patches with a success rate of 88.18%.

Cited By

  • (2024) Improving VulRepair’s Perfect Prediction by Leveraging the LION Optimizer. Applied Sciences 14:13 (5750). DOI: 10.3390/app14135750. Online publication date: 1-Jul-2024
  • (2024) Unveiling the Characteristics and Impact of Security Patch Evolution. Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering (1094-1106). DOI: 10.1145/3691620.3695488. Online publication date: 27-Oct-2024

    Published In

    CCS '23: Proceedings of the 2023 ACM SIGSAC Conference on Computer and Communications Security
    November 2023
    3722 pages
    ISBN:9798400700507
    DOI:10.1145/3576915
    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. patch backporting
    2. patch semantics
    3. patch type

    Qualifiers

    • Research-article

    Funding Sources

    • the Key Research and Development Science and Technology of Hainan Province
    • the National Key Research and Development Program
    • China Scholarship Council
    • the National Natural Science Foundation of China

    Conference

    CCS '23
    Acceptance Rates

    Overall Acceptance Rate 1,261 of 6,999 submissions, 18%
