DOI: 10.1145/3620666.3651387
research-article
Open Access

Merlin: Multi-tier Optimization of eBPF Code for Performance and Compactness

Published: 27 April 2024

ABSTRACT

eBPF (extended Berkeley Packet Filter) significantly enhances observability, performance, and security within the Linux kernel, playing a pivotal role in various real-world applications. Implemented as a register-based kernel virtual machine, eBPF features a customized Instruction Set Architecture (ISA) with stringent kernel safety requirements, e.g., a limited number of instructions. This constraint necessitates substantial optimization efforts for eBPF programs to meet performance objectives. Although several compilers support eBPF program compilation, existing tools often overlook key optimization opportunities, resulting in suboptimal performance. In response, this paper introduces Merlin, an optimization framework that leverages customized LLVM passes for Intermediate Representation (IR) transformation and bytecode rewriting for bytecode refinement. Merlin employs two primary optimization strategies, i.e., instruction merging and strength reduction. These optimizations are deployed before eBPF verification. We evaluate Merlin across 19 XDP programs (drawn from the Linux kernel, Meta, hXDP, and Cilium) and three eBPF-based systems (Sysdig, Tetragon, and Tracee, each comprising several hundred eBPF programs). The results show that all optimized programs pass kernel verification. Meanwhile, Merlin can reduce the number of instructions by 73% and runtime overhead by 60% compared with the original programs. Compared to the state-of-the-art technique K2, Merlin can also improve throughput by 0.59% and reduce latency by 5.31%, while being 106 times faster and more scalable to larger and more complex programs without additional manual effort.
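
To make the two strategies named above more concrete, the following is a minimal sketch of strength reduction and instruction merging on a toy instruction list. It is not Merlin's implementation and does not use real eBPF encodings: the tuple format, the opcode names (`mul`, `lsh`, `add`), and the functions `strength_reduce` and `merge_adds` are all hypothetical, chosen only for illustration.

```python
from typing import List, Tuple

# Toy instruction: (opcode, destination register, immediate operand).
# Hypothetical format for illustration only; not real eBPF bytecode.
Insn = Tuple[str, str, int]


def strength_reduce(program: List[Insn]) -> List[Insn]:
    """Replace multiplication by a power of two with a cheaper left shift."""
    out: List[Insn] = []
    for op, dst, imm in program:
        is_power_of_two = imm > 0 and (imm & (imm - 1)) == 0
        if op == "mul" and is_power_of_two:
            # x * 2**k == x << k, so the shift amount is log2(imm).
            out.append(("lsh", dst, imm.bit_length() - 1))
        else:
            out.append((op, dst, imm))
    return out


def merge_adds(program: List[Insn]) -> List[Insn]:
    """Merge adjacent immediate additions that target the same register."""
    out: List[Insn] = []
    for op, dst, imm in program:
        if op == "add" and out and out[-1][0] == "add" and out[-1][1] == dst:
            # Fold this add into the previous one (immediate overflow is
            # ignored in this toy model).
            out[-1] = ("add", dst, out[-1][2] + imm)
        else:
            out.append((op, dst, imm))
    return out


if __name__ == "__main__":
    prog = [("mul", "r1", 8), ("add", "r2", 4), ("add", "r2", 6), ("mul", "r3", 3)]
    print(merge_adds(strength_reduce(prog)))
```

In this toy program the four instructions shrink to three: the multiplication by 8 becomes a shift by 3, the two additions to `r2` fold into one, and the multiplication by 3 is left alone because 3 is not a power of two.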

      • Published in

        ASPLOS '24: Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 3
        April 2024
        1106 pages
        ISBN: 9798400703867
        DOI: 10.1145/3620666

        This work is licensed under a Creative Commons Attribution International 4.0 License.

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Acceptance Rates

        Overall Acceptance Rate: 535 of 2,713 submissions, 20%