Abstract
In recent years, heterogeneous systems and cooperative computing have become popular research directions in high performance computing. As high performance computer systems scale up rapidly, problems such as power consumption and reliability have come to the forefront. The aim of high performance computing has thus shifted from merely seeking peak performance to comprehensively pursuing high efficiency, which takes into account factors including performance, cost, power, and reliability. A heterogeneous computing system consisting of general-purpose CPU(s) and special-purpose accelerator(s) offers high performance, low power consumption, and low cost, and such systems have already become mainstream in high performance computing. However, they still face many challenges, for example in programmability and reliability. In this paper, we first analyze the main challenges facing heterogeneous computing systems. We then introduce the architecture of the Tianhe-1 (TH-1) heterogeneous system, the first petaflop computing system in China, including its hardware/software interface and interconnect network. Several challenges were encountered during the development of the TH-1 system; we subsequently present our research into solutions to these challenges.
Cite this article
Yang, X., Liao, X., Xu, W. et al. TH-1: China’s first petaflop supercomputer. Front. Comput. Sci. China 4, 445–455 (2010). https://doi.org/10.1007/s11704-010-0383-x
DOI: https://doi.org/10.1007/s11704-010-0383-x