Low-Latency, Line-Rate Variable-Length Field Parsing for 100+ Gb/s Ethernet

Authors:

Christopher Crary

FPGA '24: Proceedings of the 2024 ACM/SIGDA International Symposium on Field Programmable Gate Arrays

Pages 12 - 21

https://doi.org/10.1145/3626202.3637559

Published: 02 April 2024

Abstract

Field-programmable gate arrays (FPGAs) are widely employed in network-interface cards across applications including cloud services, machine learning, and high-frequency trading. These applications often share a common optimization goal: minimizing latency while meeting throughput constraints. In addition, these applications ideally aim to achieve "line-rate" operation, where the FPGA operates at full bandwidth without using back-pressure to stall incoming data. However, these goals are often conflicting. For example, to minimize latency, application protocols must effectively utilize network bandwidth by encoding variable-length data in variable-length fields. However, variable-length fields often have prohibitively complex processing requirements that prevent line-rate throughput or have excessive latency. In this paper, we present a novel variable-length field parser capable of scaling to accommodate the bus widths and clock frequencies necessary for 100+ Gb/s Ethernet, while still achieving low latency. Our experiments demonstrate parsing variable-length fields at line rate for anticipated bus widths and throughputs, achieving ultra-low latencies under 2 ns for some use cases. To the best of our knowledge, this latency surpasses existing work, including fixed-length field parsing.
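To make the abstract's two constraints concrete, the following is a minimal software sketch and not the parser presented in the paper. It assumes a FIX-style tag=value encoding in which fields are terminated by the SOH byte (0x01); the helper names `split_fields`, `delimiter_offsets`, and `min_clock_mhz` are hypothetical and exist only for illustration. The first two helpers show that field boundaries in a variable-length encoding are known only after inspecting the data, and the third shows the clock frequency a given bus width must sustain to absorb a line rate without back-pressure.

```python
SOH = 0x01  # Assumed delimiter: FIX tag=value fields end with ASCII SOH.


def split_fields(message: bytes) -> list[tuple[int, bytes]]:
    """Split a FIX-style message into (tag, value) pairs.

    Illustrative software model only; a hardware parser would locate
    delimiters in parallel across the bus rather than scan serially.
    """
    fields = []
    for raw in message.split(bytes([SOH])):
        if not raw:
            continue  # Skip the empty chunk after the final delimiter.
        tag, _, value = raw.partition(b"=")
        fields.append((int(tag), value))
    return fields


def delimiter_offsets(word: bytes) -> list[int]:
    """Return the byte offsets of every delimiter inside one bus word.

    The offsets differ from word to word, which is why variable-length
    fields cannot be extracted with fixed byte slices.
    """
    return [i for i, b in enumerate(word) if b == SOH]


def min_clock_mhz(line_rate_gbps: float, bus_width_bits: int) -> float:
    """Minimum clock (MHz) for a bus to keep up with a line rate."""
    return line_rate_gbps * 1e9 / bus_width_bits / 1e6


if __name__ == "__main__":
    msg = b"8=FIX.4.2\x019=12\x0135=D\x01"
    print(split_fields(msg))
    print(delimiter_offsets(b"35=D\x0149=AB\x01"))
    print(min_clock_mhz(100, 512))
```

Under these assumptions, a 512-bit bus must run at roughly 195 MHz to sustain 100 Gb/s, and every delimiter in each bus word has to be found within that clock budget to avoid stalling the incoming stream.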

Cited By

A. Myers, B. Nigito, and N. Foster. 2024. Network Design Considerations for Trading Systems. In Proceedings of the 23rd ACM Workshop on Hot Topics in Networks. 282-289. Online publication date: 18-Nov-2024.
https://dl.acm.org/doi/10.1145/3696348.3696890

Index Terms

1. Hardware
  1. Integrated circuits
    1. Reconfigurable logic and FPGAs
      1. Hardware accelerators

Published In

FPGA '24: Proceedings of the 2024 ACM/SIGDA International Symposium on Field Programmable Gate Arrays

April 2024

300 pages

ISBN: 9798400704185

DOI: 10.1145/3626202

General Chair: Zhiru Zhang, Cornell University, USA

Program Chair: Andrew Putnam, Microsoft, USA

Copyright © 2024 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Sponsors

SIGDA: ACM Special Interest Group on Design Automation

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Funding Sources

National Science Foundation

Conference

FPGA '24

Sponsor:

SIGDA

FPGA '24: The 2024 ACM/SIGDA International Symposium on Field Programmable Gate Arrays

March 3 - 5, 2024

Monterey, CA, USA

Acceptance Rates

Overall Acceptance Rate 125 of 627 submissions, 20%

Bibliometrics

Total Citations: 1

Total Downloads: 246

Downloads (Last 12 months): 246

Downloads (Last 6 weeks): 47

Reflects downloads up to 05 Mar 2025
