Packet Processing Algorithm Identification using Program Embeddings

Authors:

S. VenkataKeerthy,

Yashas Andaluri,

Praveen Tammana,

Ramakrishna Upadrasta

APNet '22: Proceedings of the 6th Asia-Pacific Workshop on Networking

Pages 76 - 82

https://doi.org/10.1145/3542637.3542649

Published: 07 November 2023

Abstract

To keep up with network speeds, many recent works propose offloading network functions to SmartNICs. Doing so involves identifying the packet-processing algorithms in a network function program and then offloading them to the appropriate accelerators available on the SmartNIC. This process is often carried out manually for each architecture, which is laborious and error-prone. In this work, we propose an automated solution for identifying algorithms in network function programs. We formulate the task as a Machine Learning (ML) classification problem and propose representing network function programs with program embeddings. We also identify that available datasets are limited and propose extrapolating them by systematically generating equivalent programs using existing compiler transformations in popular compiler infrastructures. Our approach models programs as embeddings, trains ML models on these extrapolated datasets, and shows superior results over recent works.
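The dataset extrapolation step described in the abstract can be illustrated with a minimal sketch. It assumes LLVM as the compiler infrastructure and only builds the command lines that would produce semantically equivalent variants of one labeled program; it does not run them. The pass selection, the file names, the `checksum` label, and the helper names `emit_ir_command` and `variant_commands` are all hypothetical and are not taken from the paper.

```python
from itertools import combinations

# Function-level LLVM passes, written with new pass manager names.
# Which passes to use is an assumption made for this sketch.
CANDIDATE_PASSES = ["mem2reg", "sroa", "instcombine", "simplifycfg", "early-cse", "gvn"]


def emit_ir_command(source, ir_out):
    """Build a clang command that lowers a C file to textual LLVM IR.

    -disable-O0-optnone keeps clang from marking functions 'optnone' at -O0,
    so later opt passes are allowed to transform them.
    """
    return ["clang", "-O0", "-Xclang", "-disable-O0-optnone",
            "-S", "-emit-llvm", source, "-o", ir_out]


def variant_commands(ir_in, label, max_len=2):
    """Build opt commands, one per pass sequence of length 1..max_len.

    Each transformed file keeps the label of the original program, since
    the transformations preserve program semantics.
    """
    commands = []
    for n in range(1, max_len + 1):
        for seq in combinations(CANDIDATE_PASSES, n):
            out = f"{label}__{'_'.join(seq)}.ll"
            cmd = ["opt", "-S", f"-passes={','.join(seq)}", ir_in, "-o", out]
            commands.append((cmd, label))
    return commands


if __name__ == "__main__":
    print(" ".join(emit_ir_command("checksum.c", "checksum.ll")))
    for cmd, label in variant_commands("checksum.ll", "checksum")[:3]:
        print(label, " ".join(cmd))
```

Each generated variant would then be embedded and added to the training set under the same class label as its source program, which is one plausible reading of how a small labeled dataset could be enlarged.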

Cited By

S. Jain, S. VenkataKeerthy, R. Aggarwal, T. Dangeti, D. Das, and R. Upadrasta. 2022. Reinforcement Learning assisted Loop Distribution for Locality and Vectorization. In 2022 IEEE/ACM Eighth Workshop on the LLVM Compiler Infrastructure in HPC (LLVM-HPC). 1–12. Online publication date: Nov-2022. https://doi.org/10.1109/LLVM-HPC56686.2022.00006

Published In

APNet '22: Proceedings of the 6th Asia-Pacific Workshop on Networking

July 2022

110 pages

ISBN:9781450397483

DOI:10.1145/3542637

Copyright © 2022 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Funding Sources

NMICPS TiHAN
National Supercomputing Mission, India
Google

Conference

APNet 2022

APNet 2022: 6th Asia-Pacific Workshop on Networking

July 1 - 2, 2022

Fuzhou, China

Acceptance Rates

Overall Acceptance Rate 50 of 118 submissions, 42%
