DOI: 10.1145/3289602.3293943
poster

Evaluating and Enhancing Intel® Stratix® 10 FPGAs for Persistent Real-Time AI

Published: 20 February 2019

Abstract

Interactive intelligent services (e.g., smart web search) are becoming essential datacenter workloads. They rely on data-intensive artificial intelligence (AI) algorithms that do not use batch computation due to their tight latency constraints. Since off-chip data accesses have higher latency and energy consumption than on-chip accesses, a persistent AI approach with the entire model stored in on-chip memory is becoming the new norm for real-time AI. This approach is the cornerstone of Microsoft's Brainwave FPGA-based AI cloud and was recently added to Nvidia's cuDNN library. In this work, we implement, optimize and evaluate a Brainwave-like neural processing unit (NPU) on a large Stratix-10 FPGA. We benchmark it against a large Nvidia Volta GPU running cuDNN persistent AI kernels. Across real-time persistent RNN, GRU, and LSTM workloads, we show that Stratix-10 offers ~3× (FP32) and ~10× (INT8) better latency than the GPU (FP32), which uses only ~6% of its peak throughput. Then, we propose TensorRAM, an ASIC chiplet for persistent AI that is 2.5D integrated with an FPGA in the same package. TensorRAM enhances the on-chip memory capacity and bandwidth, with enough multi-precision INT8/4/2/1 throughput to match that bandwidth. Multiple TensorRAMs can be integrated with Stratix-10. Our evaluation shows that a small 32 mm² TensorRAM on 10 nm offers 64 MB of SRAM with 32 TB/s on-chiplet bandwidth and 64 TOP/s (INT8). A small Stratix-10 with a TensorRAM (INT8) offers 16× better latency and 34× better energy efficiency compared to the GPU (FP32). Overall, Stratix-10 with TensorRAM offers compelling and scalable persistent AI solutions.
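
The abstract says TensorRAM has enough throughput to match its bandwidth but does not show the arithmetic. The sketch below is one plausible reading, not the authors' derivation. It assumes batch-1 matrix-vector multiplication, where every weight read from SRAM feeds exactly one multiply-accumulate, counted as 2 operations. The function names, the 2-operations-per-weight convention, and the binary interpretation of "64 MB" are all assumptions made for illustration.

```python
# Figures quoted in the abstract for a single TensorRAM chiplet.
SRAM_BANDWIDTH_BYTES_PER_S = 32 * 10**12  # 32 TB/s on-chiplet bandwidth
INT8_PEAK_OPS_PER_S = 64 * 10**12         # 64 TOP/s at INT8
SRAM_CAPACITY_BYTES = 64 * 2**20          # 64 MB, assumed to mean binary megabytes

# Assumption (not stated in the abstract): batch-1 matrix-vector multiply,
# so each weight is fetched once per input vector and used in one
# multiply-accumulate, which is counted as 2 operations.
OPS_PER_WEIGHT = 2


def ops_needed_per_second(bandwidth_bytes_per_s, bits_per_weight):
    """Operations per second needed to consume weights as fast as SRAM delivers them."""
    weights_per_second = bandwidth_bytes_per_s * 8 // bits_per_weight
    return weights_per_second * OPS_PER_WEIGHT


def max_persistent_weights(capacity_bytes, bits_per_weight):
    """Number of weights that fit entirely in on-chiplet SRAM."""
    return capacity_bytes * 8 // bits_per_weight


if __name__ == "__main__":
    required = ops_needed_per_second(SRAM_BANDWIDTH_BYTES_PER_S, 8)
    print("INT8 ops/s needed to keep up with SRAM:", required)
    print("Matches quoted INT8 peak:", required == INT8_PEAK_OPS_PER_S)
    for bits in (8, 4, 2, 1):
        print(f"INT{bits}: up to {max_persistent_weights(SRAM_CAPACITY_BYTES, bits)} weights on chiplet")
```

Under these assumptions, 32 TB/s of INT8 weights requires 64 TOP/s of compute, which agrees with the quoted peak. That agreement is consistent with a design balanced for batch-1 workloads, though the abstract itself does not confirm this reasoning.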

Published In

FPGA '19: Proceedings of the 2019 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays
February 2019
360 pages
ISBN: 9781450361378
DOI: 10.1145/3289602
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. ai
  2. asic
  3. deep learning
  4. fpga
  5. gpu
  6. real-time

Qualifiers

  • Poster

Conference

FPGA '19

Acceptance Rates

Overall Acceptance Rate 125 of 627 submissions, 20%

Cited By

  • (2024) Near infrared‐II light‐sheet microscopy: Basic principle, intellectualization, and medical application. Journal of Intelligent Medicine 1:1 (112-133). DOI: 10.1002/jim4.18. Online publication date: 28-Nov-2024
  • (2021) FPGA Architecture: Principles and Progression. IEEE Circuits and Systems Magazine 21:2 (4-29). DOI: 10.1109/MCAS.2021.3071607. Online publication date: Oct-2022
  • (2021) FPGA acceleration on a multi-layer perceptron neural network for digit recognition. The Journal of Supercomputing 77:12 (14356-14373). DOI: 10.1007/s11227-021-03849-7. Online publication date: 1-Dec-2021
  • (2020) Beyond Peak Performance: Comparing the Real Performance of AI-Optimized FPGAs and GPUs. 2020 International Conference on Field-Programmable Technology (ICFPT) (10-19). DOI: 10.1109/ICFPT51103.2020.00011. Online publication date: Dec-2020
  • (2019) FPGA-based Computing in the Era of AI and Big Data. Proceedings of the 2019 International Symposium on Physical Design (35-35). DOI: 10.1145/3299902.3311063. Online publication date: 4-Apr-2019
