research-article

Extreme Datacenter Specialization for Planet-Scale Computing: ASIC Clouds

Authors:
Shaolin Xie, University of Washington
Scott Davidson, University of Washington
Ikuo Magaki, Apple Inc.
Moein Khazraee, UC San Diego
Luis Vega, University of Washington
Lu Zhang, UC San Diego
Michael B. Taylor, University of Washington

ACM SIGOPS Operating Systems Review, Volume 52, Issue 1, July 2018, pp. 96–108. https://doi.org/10.1145/3273982.3273991

Published: 28 August 2018

Abstract

Planet-scale applications are driving the exponential growth of the cloud, and datacenter specialization is the key enabler of this trend, providing orders-of-magnitude improvements in cost-effectiveness and energy efficiency. While exascale computing remains a goal for supercomputing, specialized datacenters have emerged and have demonstrated beyond-exascale performance and efficiency in specific domains. This paper generalizes the applications, design methodology, and deployment challenges of the most extreme form of specialized datacenter: ASIC Clouds. It analyzes two game-changing, real-world ASIC Clouds, Bitcoin Cryptocurrency Clouds and Tensor Processing Clouds, and discusses their incentives, their enabling technologies, and how they benefit from specialized ASICs. Their business models, architectures, and deployment methods are useful for envisioning future ASIC Clouds and forecasting how they will transform computing, the economy, and society.

Published in
ACM SIGOPS Operating Systems Review, Volume 52, Issue 1
Special Topics
July 2018, 133 pages
ISSN: 0163-5980
DOI: 10.1145/3273982
Editors: Mark Silberstein (Technion, Haifa, Israel), Christopher J. Rossbach (Stop D9500, Austin, TX)
Copyright © 2018 Authors
Publisher: Association for Computing Machinery, New York, NY, United States
Author Tags: ASIC, Accelerator, Datacenter