DOI: 10.1145/3626202.3637559
research-article

Low-Latency, Line-Rate Variable-Length Field Parsing for 100+ Gb/s Ethernet

Published: 02 April 2024

Abstract

Field-programmable gate arrays (FPGAs) are widely employed in network-interface cards across applications including cloud services, machine learning, and high-frequency trading. These applications often share a common optimization goal: minimizing latency while meeting throughput constraints. In addition, these applications ideally aim to achieve "line-rate" operation, where the FPGA operates at full bandwidth without using back-pressure to stall incoming data. However, these goals are often conflicting. For example, to minimize latency, application protocols must effectively utilize network bandwidth by encoding variable-length data in variable-length fields. However, variable-length fields often have prohibitively complex processing requirements that prevent line-rate throughput or have excessive latency. In this paper, we present a novel variable-length field parser capable of scaling to accommodate the bus widths and clock frequencies necessary for 100+ Gb/s Ethernet, while still achieving low latency. Our experiments demonstrate parsing variable-length fields at line rate for anticipated bus widths and throughputs, achieving ultra-low latencies under 2 ns for some use cases. To the best of our knowledge, this latency surpasses existing work, including fixed-length field parsing.
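
The tension between bus width, clock frequency, and latency described in the abstract can be illustrated with some back-of-the-envelope arithmetic. The sketch below is not taken from the paper: the function names, the 512-bit bus width, and the 400 MHz and 500 MHz clocks are assumptions chosen only for illustration, and Ethernet framing overhead (preamble, inter-packet gap) is ignored.

```python
import math


def min_clock_mhz(line_rate_gbps: float, bus_width_bits: int) -> float:
    """Minimum clock frequency (MHz) at which a bus of the given width
    carries the line rate without back-pressure.

    Simplification (assumption): one full bus word is consumed every
    cycle and framing overhead is ignored.
    """
    # Gb/s * 1e3 = Mb/s; dividing by bits per cycle gives Mcycles/s = MHz.
    return line_rate_gbps * 1e3 / bus_width_bits


def cycles_within(latency_ns: float, clock_mhz: float) -> int:
    """Number of whole clock cycles that fit inside a latency budget."""
    period_ns = 1e3 / clock_mhz
    return math.floor(latency_ns / period_ns)


if __name__ == "__main__":
    # Hypothetical configuration: 100 Gb/s over a 512-bit bus.
    print(min_clock_mhz(100, 512))      # 195.3125
    # A 2 ns budget holds one full cycle at 500 MHz, none at 400 MHz.
    print(cycles_within(2.0, 500.0))    # 1
    print(cycles_within(2.0, 400.0))    # 0
```

Under these assumptions, a latency budget of about 2 ns leaves room for at most one clock cycle at typical FPGA frequencies, which gives a sense of how tight the target stated in the abstract is.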

Cited By

  • (2024) Network Design Considerations for Trading Systems. Proceedings of the 23rd ACM Workshop on Hot Topics in Networks, 282-289. DOI: 10.1145/3696348.3696890. Online publication date: 18-Nov-2024

    Published In

    FPGA '24: Proceedings of the 2024 ACM/SIGDA International Symposium on Field Programmable Gate Arrays
    April 2024
    300 pages
    ISBN:9798400704185
    DOI:10.1145/3626202
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. fpga
    2. hft
    3. low-latency
    4. smartnic

    Qualifiers

    • Research-article

    Conference

    FPGA '24
    Acceptance Rates

    Overall Acceptance Rate 125 of 627 submissions, 20%

    Article Metrics

    • Downloads (last 12 months): 246
    • Downloads (last 6 weeks): 47
    Reflects downloads up to 05 Mar 2025
