Serving DNNs in Real Time at Datacenter Scale with Project Brainwave

Abstract:

To meet the computational demands required of deep learning, cloud operators are turning toward specialized hardware for improved efficiency and performance. Project Brainwave, Microsoft's principal infrastructure for AI serving in real time, accelerates deep neural network (DNN) inferencing in major services such as Bing's intelligent search features and Azure. Exploiting distributed model parallelism and pinning over low-latency hardware microservices, Project Brainwave serves state-of-the-art, pre-trained DNN models with high efficiencies at low batch sizes. A high-performance, precision-adaptable FPGA soft processor is at the heart of the system, achieving up to 39.5 teraflops (Tflops) of effective performance at Batch 1 on a state-of-the-art Intel Stratix 10 FPGA.

Published in: IEEE Micro ( Volume: 38, Issue: 2, Mar./Apr. 2018)

Page(s): 8 - 20

Date of Publication: 20 April 2018

DOI: 10.1109/MM.2018.022071131
