ABSTRACT
Intel's Xeon roadmap includes package-integrated FPGAs in every new generation. In this talk, we will dissect why this is such a powerful combination at this time of great change in datacenter workloads. We will show how power savings within the CPU complex are a significant multiplier for power savings in the datacenter as a whole. Focusing on the domain of machine learning, we will present the recent evolution of data types and operators, and make the case that FPGAs are the path to facilitating this continued evolution. Finally, we will discuss the criticality of the close coupling between the CPU and the FPGA. This coupling provides the high-bandwidth, low-latency communication required for the development, debugging, and deployment of heterogeneous applications.
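To make the "evolution of data types and operators" concrete: much recent work in machine learning replaces 32-bit floating-point weights with very low-bitwidth representations (binary or ternary values plus a per-tensor scale), which map naturally onto FPGA fabric. The sketch below is a minimal, hypothetical illustration of one such scheme, ternary weight quantization; the function name and the 0.7 threshold heuristic are assumptions for illustration, not part of the talk.

```python
import numpy as np

def ternarize(w, thresh_factor=0.7):
    """Quantize a float weight tensor to the three levels {-alpha, 0, +alpha}.

    A common heuristic in ternary-weight networks: entries whose magnitude
    falls below a threshold become 0; the remaining entries collapse to a
    single scale alpha times their sign.
    """
    delta = thresh_factor * np.mean(np.abs(w))   # magnitude threshold
    mask = np.abs(w) > delta                     # weights that survive
    alpha = np.abs(w[mask]).mean() if mask.any() else 0.0
    return alpha * np.sign(w) * mask

w = np.array([0.9, -0.05, 0.4, -0.8, 0.02])
print(ternarize(w))   # -> [ 0.7  0.   0.7 -0.7  0. ]
```

The appeal for reconfigurable hardware is that multiply-accumulate against {-alpha, 0, +alpha} weights reduces to additions, subtractions, and skips, which an FPGA can implement far more densely than full floating-point multipliers.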