Enhancing OSS Patch Backporting with Semantics

Authors:

Yuqing Zhang

CCS '23: Proceedings of the 2023 ACM SIGSAC Conference on Computer and Communications Security

Pages 2366 - 2380

https://doi.org/10.1145/3576915.3623188

Published: 21 November 2023

Abstract

Keeping open-source software (OSS) up to date is one way to protect against known vulnerabilities. However, it requires frequent and costly testing and may introduce compatibility issues. Consequently, developers often choose to backport security patches to the vulnerable versions instead. Manual backporting is time-consuming, especially for large OSS such as the Linux kernel, so automating the process is urgently needed. Existing automated approaches to backporting rely on either automatic patch generation or automatic patch migration. However, these methods are often ineffective and error-prone: they operate only at the syntactic level and therefore fail to locate the precise patch locations or to generate the correct patch.

In this paper, we propose a patch type-sensitive approach to automatically backport OSS security patches, guided by the patch type and patch semantics. Specifically, our approach identifies patch locations at the semantic level by matching on program dependency graphs. It then applies fine-grained patch migration and fine-tuning based on patch types. We have implemented our approach in a tool named TSBPORT and evaluated it on a large-scale dataset consisting of 1,815 pairs of real-world security patches for the Linux kernel. The evaluation results show that TSBPORT successfully backported 1,589 (87.59%) patches, of which 587 (32.34%) could not be backported by any state-of-the-art approach. In addition, experiments show that TSBPORT generalizes to backporting patches in other OSS projects, with a success rate of 88.18%.
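As a rough illustration of why matching on dependencies rather than on text can help locate a patch site, the sketch below compares statements by a small dependency signature. It is not TSBPORT's algorithm. It is a toy that uses only def-use (data) dependencies, whereas a program dependency graph also carries control dependencies. The `Stmt` model, the `op` labels, and the functions `data_deps`, `signature`, and `locate` are all hypothetical names introduced here for illustration.

```python
from typing import FrozenSet, List, NamedTuple, Set, Tuple


class Stmt(NamedTuple):
    """Hypothetical, hand-annotated statement model (an assumption of this sketch)."""
    text: str              # source text of the statement
    op: str                # coarse operation label, e.g. "call:kmalloc"
    defs: FrozenSet[str]   # variables the statement defines
    uses: FrozenSet[str]   # variables the statement reads


def data_deps(stmts: List[Stmt], i: int) -> Set[int]:
    """Indices of the nearest earlier statements that define a variable used by stmts[i]."""
    deps: Set[int] = set()
    for var in stmts[i].uses:
        for j in range(i - 1, -1, -1):
            if var in stmts[j].defs:
                deps.add(j)
                break
    return deps


def signature(stmts: List[Stmt], i: int) -> Tuple[str, FrozenSet[str]]:
    """A statement's own op label plus the op labels of the statements it depends on."""
    return stmts[i].op, frozenset(stmts[j].op for j in data_deps(stmts, i))


def locate(original: List[Stmt], anchor: int, target: List[Stmt]) -> List[int]:
    """Indices in `target` whose dependency signature equals that of original[anchor]."""
    wanted = signature(original, anchor)
    return [k for k in range(len(target)) if signature(target, k) == wanted]


# Code around the patch site in the version the patch was written for.
original = [
    Stmt("buf = kmalloc(len)", "call:kmalloc", frozenset({"buf"}), frozenset({"len"})),
    Stmt("memcpy(buf, src, len)", "call:memcpy", frozenset(), frozenset({"buf", "src", "len"})),
]

# Older version: variables are renamed and an unrelated line sits in between,
# so an exact textual match on the anchor statement finds nothing.
target = [
    Stmt("p = kmalloc(n)", "call:kmalloc", frozenset({"p"}), frozenset({"n"})),
    Stmt("log(n)", "call:log", frozenset(), frozenset({"n"})),
    Stmt("memcpy(p, s, n)", "call:memcpy", frozenset(), frozenset({"p", "s", "n"})),
]

textual_hits = [k for k, s in enumerate(target) if s.text == original[1].text]
semantic_hits = locate(original, 1, target)
print(textual_hits, semantic_hits)
```

Here the textual search comes back empty while `locate(original, 1, target)` returns `[2]`, because the `memcpy` call in both versions depends on the result of a `kmalloc` call regardless of the variable names or the extra line. A real tool would have to derive the statement model from the source code and handle ambiguous or missing matches, which this sketch ignores.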

Cited By

Kishiyama B, Lee Y, Yang J (2024). Improving VulRepair’s Perfect Prediction by Leveraging the LION Optimizer. Applied Sciences 14:13 (5750). Online publication date: 1-Jul-2024. https://doi.org/10.3390/app14135750
Xie Z, Wen M, Wei Z, Jin H, Filkov V, Ray B, Zhou M (2024). Unveiling the Characteristics and Impact of Security Patch Evolution. Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering (1094-1106). Online publication date: 27-Oct-2024. https://dl.acm.org/doi/10.1145/3691620.3695488

Index Terms

Enhancing OSS Patch Backporting with Semantics
1. Theory of computation
  1. Semantics and reasoning
    1. Program reasoning
      1. Program analysis

Information & Contributors

Information

Published In

CCS '23: Proceedings of the 2023 ACM SIGSAC Conference on Computer and Communications Security

November 2023

3722 pages

ISBN:9798400700507

DOI:10.1145/3576915

General Chairs: Weizhi Meng (Technical University of Denmark), Christian D. Jensen (Technical University of Denmark)

Program Chairs: Cas Cremers (CISPA Helmholtz Center for Information Security), Engin Kirda (Khoury College of Computer Sciences)

Copyright © 2023 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Sponsors

SIGSAC: ACM Special Interest Group on Security, Audit, and Control

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Funding Sources

the Key Research and Development Science and Technology of Hainan Province
the National Key Research and Development Program
China Scholarship Council
the National Natural Science Foundation of China

Conference

CCS '23

Sponsor:

SIGSAC

CCS '23: ACM SIGSAC Conference on Computer and Communications Security

November 26 - 30, 2023

Copenhagen, Denmark

Acceptance Rates

Overall Acceptance Rate 1,261 of 6,999 submissions, 18%
