The TH Express high performance interconnect networks

  • Research Article
Frontiers of Computer Science

Abstract

The interconnection network plays an important role in scalable high performance computer (HPC) systems. The TH Express-2 interconnect has been used in the MilkyWay-2 system to provide high-bandwidth, low-latency interprocessor communication, and continuous effort is devoted to the development of our proprietary interconnect. This paper describes the state of the art of our proprietary interconnect, with particular emphasis on the design of the network interface. Several key features are introduced, such as user-level communication, remote direct memory access, offloaded collective operations, and hardware-supported reliable end-to-end communication. The design of a low-level message passing infrastructure and an upper-level message passing service is also proposed. Preliminary performance results demonstrate the efficiency of the TH interconnect interface.

Author information

Corresponding author

Correspondence to Guibin Wang.

Additional information

Zhengbin Pang received the BS, MS, and PhD degrees in computer science from National University of Defense Technology (NUDT), China. He is a professor in College of Computer, NUDT. His research interests include parallel and distributed computing, and high performance computer systems.

Min Xie is a professor in College of Computer at National University of Defense Technology (NUDT), China. His research interests include high-speed interconnects, system software and parallel and distributed computing. He has a PhD in computer science from NUDT.

Jun Zhang received the MS degree in computer science from National University of Defense Technology (NUDT), China. Currently he is an assistant professor at the university. His research interests include high speed communication and ASIC design.

Yi Zheng received the PhD degree in computer science from National University of Defense Technology (NUDT), China. Currently he is an associate professor at the university. His research interests include high performance computer architecture and high performance networks.

Guibin Wang received the BS, MS, and PhD degrees from National University of Defense Technology (NUDT), China in 2004, 2007, and 2011, respectively. Currently, he is an assistant professor in College of Computer, NUDT. His research interests include high-performance computer systems and heterogeneous parallel systems.

Dezun Dong received the BS, MS, and PhD degrees from the National University of Defense Technology (NUDT), China in 2002, 2004, and 2010, respectively. Currently, he is an associate professor in College of Computer, NUDT, China. His research interests include high-performance computer systems, distributed computing, and wireless networks. He is a member of ACM and IEEE.

Guang Suo received his BS in computer science from National University of Defense Technology (NUDT), China in 2003, and received his MS and PhD in computer science from NUDT in 2005 and 2009, respectively. He is an assistant professor in Institute of Computers, NUDT. He has played an important role in the implementation and optimization of the MPI library of the MilkyWay supercomputers. His research interests are in parallel computing, operating systems, and HPC runtime systems.

Cite this article

Pang, Z., Xie, M., Zhang, J. et al. The TH Express high performance interconnect networks. Front. Comput. Sci. 8, 357–366 (2014). https://doi.org/10.1007/s11704-014-3500-9

  • DOI: https://doi.org/10.1007/s11704-014-3500-9
