Extreme Datacenter Specialization for Planet-Scale Computing: ASIC Clouds

Published: 28 August 2018

Abstract

Planet-scale applications are driving the exponential growth of the cloud, and datacenter specialization is the key enabler of this trend, providing orders-of-magnitude improvements in cost-effectiveness and energy-efficiency. While exascale computing remains a goal for supercomputing, specialized datacenters have already emerged and have demonstrated beyond-exascale performance and efficiency in specific domains. This paper generalizes the applications, design methodology, and deployment challenges of the most extreme form of specialized datacenter: the ASIC Cloud. It analyzes two game-changing, real-world ASIC Clouds, Bitcoin Cryptocurrency Clouds and Tensor Processing Clouds, discussing their incentives, their enabling technologies, and how they benefit from specialized ASICs. Their business models, architectures, and deployment methods are useful for envisioning potential future ASIC Clouds and for forecasting how they will transform computing, the economy, and society.
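
Both example clouds named in the abstract are built around small, fixed kernels replicated at enormous scale, which is what makes them amenable to ASIC specialization. As an illustrative sketch only (not taken from the paper), the Python snippet below shows the proof-of-work kernel that Bitcoin ASIC Clouds accelerate: a double SHA-256 over an 80-byte block header, scanned across nonces until the digest falls below a difficulty target. The header prefix and target values are hypothetical placeholders chosen so the loop terminates quickly.

    import hashlib
    import struct

    def sha256d(data: bytes) -> bytes:
        # Bitcoin's double SHA-256.
        return hashlib.sha256(hashlib.sha256(data).digest()).digest()

    def mine(header_prefix: bytes, target: int, max_nonce: int = 1 << 20):
        # Scan nonces until the double hash of the 80-byte header falls below the target.
        for nonce in range(max_nonce):
            header = header_prefix + struct.pack("<I", nonce)  # append 4-byte little-endian nonce
            digest = sha256d(header)
            # Bitcoin compares the resulting hash, read as a little-endian integer, against the target.
            if int.from_bytes(digest, "little") < target:
                return nonce, digest.hex()
        return None, None

    # Hypothetical 76-byte header prefix and a deliberately easy target, for demonstration only.
    nonce, digest = mine(b"\x00" * 76, target=1 << 240)
    print("found nonce:", nonce)

Because the kernel is tiny, fixed, and embarrassingly parallel, replacing general-purpose hardware with ASICs that implement only this datapath is what yields the cost and energy improvements the abstract describes.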

References

  1. May 8, 2016. ASIC Clouds: Specializing the Datacenter . https://csetechrep.ucsd. edu/Dienst/UI/2.0/Describe/ncstrl.ucsd_cse/CS2016-1016.Google ScholarGoogle Scholar
  2. Retrieved 2016. Glassdoor salaries, 2016. https://www.glassdoor.com.Google ScholarGoogle Scholar
  3. Retrieved Jun, 2018. Accelerate Genomics Research with the Broad-Intel Genomics Stack. https://www.intel. com/content/dam/www/public/us/en/documents/white-papers/ accelerate-genomics-research-with-the-broad-intel-genomics-stack-paper. pdf.Google ScholarGoogle Scholar
  4. Retrieved Jun, 2018. Amazon EC2. https://aws.amazon.com/ec2/.Google ScholarGoogle Scholar
  5. Retrieved Jun, 2018. DRAGEN Bio-IT Platform. http://edicogenome.com/ dragen-bioit-platform/.Google ScholarGoogle Scholar
  6. Retrieved Jun, 2018. Ethereum Miner pool. https://ethermine.org.Google ScholarGoogle Scholar
  7. Retrieved Jun, 2018. Falcon Accelerated Genomics Pipelines. https://aws.amazon. com/marketplace/pp/B07C3NV88G.Google ScholarGoogle Scholar
  8. Retrieved Jun, 2018. Litecoin Miner pool. https://www.ltcminer.com.Google ScholarGoogle Scholar
  9. Retrieved Jun, 2018. Microsoft Genomics Acceleration. https://www.microsoft. com/en-us/research/project/genomicsacceleration/.Google ScholarGoogle Scholar
  10. Retrieved Jun, 2018. OpenCL miner for BitCoin. https://github.com/Diablo-D3/ DiabloMiner/blob/master/src/main/resources/DiabloMiner.cl.Google ScholarGoogle Scholar
  11. Retrieved Jun, 2018. Tensorflow CNN Benchmarks. https://github.com/ tensorflow/benchmarks/tree/a03070c016ab33f491ea7962765e378000490d99/ scripts/tf_cnn_benchmarks.Google ScholarGoogle Scholar
  12. Junwhan Ahn et al. 2015. A scalable processing-in-memory accelerator for parallel graph processing.Google ScholarGoogle Scholar
  13. Jorge Albericio et al. 2016. Cnvlutin: Ineffectual-neuron-free deep neural network computing. In International Symposium on Computer Architecture (ISCA). Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. XenParavirtOps. https://wiki.xenproject.org/wiki/XenParavirtOps, 2016.Google ScholarGoogle Scholar
  15. Luiz André Barroso, Jimmy Clidaras, and Urs Hölzle. 2013. The datacenter as a computer: An introduction to the design of warehouse-scale machines. Synthesis lectures on computer architecture (2013).Google ScholarGoogle Scholar
  16. John Beetem et al. 1985. The GF11 Supercomputer. In International Symposium on Computer Architecture (ISCA). Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. Mahdi Nazm Bojnordi et al. 2016. Memristive boltzmann machine: A hardware accelerator for combinatorial optimization and deep learning. In International Symposium on High Performance Computer Architecture (HPCA).Google ScholarGoogle Scholar
  18. J. Adam Butts et al. 2014. The ANTON 2 chip a second-generation ASIC for molecular dynamics. In Hot Chips: A Symposium on High Performance Chips (HOTCHIPS).Google ScholarGoogle Scholar
  19. Yunji Chen et al. 2014. DaDianNao: A Machine-Learning Supercomputer. In International Symposium on Microarchitecture (MICRO). Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. Yu-Hsin Chen et al. 2016. Eyeriss: A spatial architecture for energy-efficient dataflow for convolutional neural networks. In International Symposium on Computer Architecture (ISCA). Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. Ping Chi et al. 2016. PRIME: a novel processing-in-memory architecture for neural network computation in ReRAM-based main memory. In International Symposium on Computer Architecture (ISCA). Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. Eric Chung et al. Mar 2018. Serving DNNs in Real Time at Datacenter Scale with Project Brainwave. IEEE Micro (Mar 2018).Google ScholarGoogle Scholar
  23. MartinMDeneroff et al. 2008. Anton: A specialized ASIC for molecular dynamics. In Hot Chips: A Symposium on High Performance Chips (HOTCHIPS).Google ScholarGoogle Scholar
  24. Daichi Fuijiki et al. 2018. GenAx: A Genome Sequencing Accelerator. In International Symposium on Computer Architecture (ISCA).Google ScholarGoogle Scholar
  25. Boncheol Gu et al. 2016. Biscuit: A framework for near-data processing of big data workloads. In International Symposium on Computer Architecture (ISCA). Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. Anthony Gutierrez et al. 2014. Integrated 3D-stacked Server Designs for Increasing Physical Density of Key-value Stores. In International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS). Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. Tae Jun Ham et al. 2016. Graphicionado: A high-performance and energy-efficient accelerator for graph analytics. In International Symposium on Microarchitecture (MICRO). Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. Song Han et al. 2016. EIE: efficient inference engine on compressed deep neural network. In International Symposium on Computer Architecture (ISCA). Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. Nikos Hardavellas, Michael Ferdman, Babak Falsafi, and Anastasia Ailamaki. 2011. Toward dark silicon in servers. IEEE Micro (2011). Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. Elmar Haubmann. Retrieved Jun, 2018. Comparing Google's TPUv2 against Nvidia's V100 on ResNet-50. https://blog.riseml.com/ comparing-google-tpuv2-against-nvidia-v100-on-resnet-50-c2bbb6a51e5e.Google ScholarGoogle Scholar
  31. Yu Ji et al. 2016. NEUTRAMS: Neural network transformation and co-design under neuromorphic hardware constraints. In International Symposium on Microarchitecture (MICRO). Google ScholarGoogle ScholarDigital LibraryDigital Library
  32. H Jones. 2014. Whitepaper: strategies in optimizing market positions for semiconductor vendors based on IP leverage. International Business Strategies. Inc.(IBS). Google Scholar (2014).Google ScholarGoogle Scholar
  33. Norman P. Jouppi et al. 2017. In-Datacenter Performance Analysis of a Tensor Processing Unit. In International Symposium on Computer Architecture (ISCA). Google ScholarGoogle ScholarDigital LibraryDigital Library
  34. Chi-Cheng Ju et al. 2015. 18.6 A 0.5 nJ/pixel 4K H. 265/HEVC codec LSI for multiformat smartphone applications. In International Solid-State Circuits Conference (ISSCC).Google ScholarGoogle Scholar
  35. Moein Khazraee et al. 2017. Moonwalk: NRE Optimization in ASIC Clouds. In International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS). Google ScholarGoogle ScholarDigital LibraryDigital Library
  36. Moein Khazraee, Luis Vega, Ikuo Magaki, and Michael Taylor. 2017. Specializing a Planet's Computation: ASIC Clouds. IEEE Micro (May 2017).Google ScholarGoogle Scholar
  37. Duckhwan Kim et al. 2016. Neurocube: A programmable digital neuromorphic architecture with high-density 3D memory. In International Symposium on Computer Architecture (ISCA). Google ScholarGoogle ScholarDigital LibraryDigital Library
  38. Onur Kocberber et al. 2013. Meet the walkers: Accelerating index traversals for in-memory databases. In International Symposium on Microarchitecture (MICRO). Google ScholarGoogle ScholarDigital LibraryDigital Library
  39. Alex Krizhevsky et al. 2012. Imagenet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems. Google ScholarGoogle ScholarDigital LibraryDigital Library
  40. Christian Leber et al. 2011. High frequency trading acceleration using FPGAs. In Field Programmable Logic and Applications (FPL). Google ScholarGoogle ScholarDigital LibraryDigital Library
  41. Kevin Lim et al. 2013. Thin servers with smart pipes: designing SoC accelerators for memcached. In International Symposium on Computer Architecture (ISCA). Google ScholarGoogle ScholarDigital LibraryDigital Library
  42. Shaoli Liu et al. 2016. Cambricon: An instruction set architecture for neural networks. In International Symposium on Computer Architecture (ISCA). Google ScholarGoogle ScholarDigital LibraryDigital Library
  43. Ikuo Magaki et al. 2016. ASIC Clouds: Specializing the Datacenter. In International Symposium on Computer Architecture (ISCA). Google ScholarGoogle ScholarDigital LibraryDigital Library
  44. Junichiro Makino et al. 2012. GRAPE-8-An accelerator for gravitational N-body simulation with 20.5 Gflops/W performance. In High Performance Computing, Networking, Storage and Analysis (SC). Google ScholarGoogle ScholarDigital LibraryDigital Library
  45. Satoshi Nakamoto. 2008. Bitcoin: A peer-to-peer electronic cash system. (2008).Google ScholarGoogle Scholar
  46. Courtois Nicolas et al. 2014. Optimizing sha256 in bitcoin mining. In International Conference on Cryptography and Security Systems (CCS).Google ScholarGoogle Scholar
  47. Muhammet Mustafa Ozdal et al. 2016. Energy efficient architecture for graph analytics accelerators. In International Symposium on Computer Architecture (ISCA). Google ScholarGoogle ScholarDigital LibraryDigital Library
  48. A. Pedram et al. 2016. Dark Memory and Accelerator-Rich System Optimization in the Dark Silicon Era. IEEE Design and Test (2016).Google ScholarGoogle Scholar
  49. Putnam et al. 2014. A Reconfigurable Fabric for Accelerating Large-scale Datacenter Services. In International Symposium on Computer Architecture (ISCA). Google ScholarGoogle ScholarDigital LibraryDigital Library
  50. Brandon Reagen et al. 2016. Minerva: Enabling low-power, highly-accurate deep neural network accelerators. In International Symposium on Computer Architecture (ISCA) Google ScholarGoogle ScholarDigital LibraryDigital Library
  51. Ali Shafiee et al. 2016. ISAAC: A convolutional neural network accelerator with in-situ analog arithmetic in crossbars. International Symposium on Computer Architecture (ISCA). Google ScholarGoogle ScholarDigital LibraryDigital Library
  52. Stephen Weston. 2011. FPGA Accelerators at JP Morgan Chase. Stanford Computer Systems Colloquium, https://www.youtube.com/watch?v=9NqX1ETADn0.Google ScholarGoogle Scholar
  53. Michael Taylor. 2013. Bitcoin and the Age of Bespoke Silicon. In International Conference on Compilers, Architecture and Synthesis for Embedded Systems (CASES). Google ScholarGoogle ScholarDigital LibraryDigital Library
  54. Michael Taylor. 2013. A Landscape of the New Dark Silicon Design Regime. Micro, IEEE (Sept-Oct. 2013). Google ScholarGoogle ScholarDigital LibraryDigital Library
  55. Michael B. Taylor. 2012. Is Dark Silicon Useful? Harnessing the Four Horesemen of the Coming Dark Silicon Apocalypse. In DAC. Google ScholarGoogle ScholarDigital LibraryDigital Library
  56. Michael Bedford Taylor. 2017. The Evolution of Bitcoin Hardware. Computer 50, 9 (2017), 58-66Google ScholarGoogle ScholarDigital LibraryDigital Library
  57. Paul Teich. Retrieved Jun, 2018. TEARING APART GOOGLE'S TPU 3.0 AI COPROCESSOR. https://www.nextplatform.com/2018/05/10/ tearing-apart-googles-tpu-3-0-ai-coprocessor/.Google ScholarGoogle Scholar
  58. Yatish Turakhia et al. 2017. Darwin: A Hardware-acceleration Framework for Genomic Sequence Alignment. bioRxiv (2017).Google ScholarGoogle Scholar
  59. Ganesh Venkatesh et al. 2010. Conservation cores: reducing the energy of mature computations. In International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS). Google ScholarGoogle ScholarDigital LibraryDigital Library
  60. Shijin Zhang et al. 2016. Cambricon-X: An accelerator for sparse neural networks. In International Symposium on Microarchitecture (MICRO). Google ScholarGoogle ScholarDigital LibraryDigital Library


  • Published in

    ACM SIGOPS Operating Systems Review, Volume 52, Issue 1: Special Topics
    July 2018, 133 pages
    ISSN: 0163-5980
    DOI: 10.1145/3273982
    Copyright © 2018 Authors

    Publisher

    Association for Computing Machinery, New York, NY, United States
