
Software and Hardware Co-design for Low-Power HPC Platforms

  • Conference paper
  • First Online:
High Performance Computing (ISC High Performance 2019)

Abstract

Keeping an HPC cluster economically viable imposes serious cost limitations on hardware and software deployment, prompting researchers to reconsider the design of modern HPC platforms. In this paper we present a cross-layer communication architecture suitable for emerging HPC platforms based on heterogeneous multiprocessors. We propose simple hardware primitives that enable protected, reliable, and virtualized user-level communication, and that can easily be integrated in the same package with the processing unit. Combined with a lean user-space software stack, the proposed architecture provides efficient, low-latency communication mechanisms to HPC applications. Our implementation of the MPI standard exploits these capabilities to deliver point-to-point and collective primitives with low overheads, including an eager protocol with an end-to-end latency of 1.4 \(\upmu \mathrm{s}\). We port and evaluate our communication stack with real HPC applications on a cluster of 128 ARMv8 processors tightly coupled with FPGA logic. The network interface primitives occupy less than 25% of the FPGA logic and only 3 Mbits of SRAM, yet can easily saturate the 16 Gb/s links in our platform.
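The eager protocol mentioned in the abstract is the standard MPI technique of shipping small payloads together with the message envelope, avoiding a rendezvous handshake. The following is a minimal sketch of that decision logic only; the function names, mailbox model, and the 4 KB threshold are illustrative assumptions, not details of the paper's implementation.

```python
# Toy model of the eager vs. rendezvous choice in an MPI-style send path.
# All names and the threshold value are hypothetical, for illustration only.

EAGER_THRESHOLD = 4096  # bytes; small messages skip the handshake


def send(payload: bytes, mailbox: list) -> str:
    """Deliver `payload` to a receiver-side software mailbox.

    Eager: sender pushes envelope + data in one shot and returns at once;
    the receiver later copies the data out of a bounce buffer.
    Rendezvous: sender posts only a request-to-send header; the bulk data
    moves after the receiver matches the message, avoiding the copy.
    """
    if len(payload) <= EAGER_THRESHOLD:
        mailbox.append(("eager", payload))   # data travels with the envelope
        return "eager"
    mailbox.append(("rts", len(payload)))    # header only; data follows later
    mailbox.append(("data", payload))        # after the (elided) clear-to-send
    return "rendezvous"


mailbox = []
assert send(b"x" * 64, mailbox) == "eager"
assert send(b"x" * 65536, mailbox) == "rendezvous"
```

The trade-off this models is the one behind the paper's 1.4 µs eager latency figure: small messages pay one extra copy to save a round trip, while large messages pay the round trip to move data directly.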


Notes

  1. System MMU in the case of ARM processors.

  2. Note that, at the hardware level, the transfer is acknowledged separately, from the RX engine on node B back to the TX engine on node A.

References

  1. LAMMPS Molecular Dynamics Simulator. Sandia National Laboratories. https://lammps.sandia.gov

  2. The ExaNest project. European Exascale System Interconnect and Storage. GA-671553. www.exanest.eu

  3. Alverson, B., Froese, E., Kaplan, L., Roweth, D.: Cray XC series network. Cray Inc., White Paper WP-Aries01-1112 (2012)


  4. Ammendola, R., et al.: APEnet: a high speed, low latency 3D interconnect network. In: Cluster, p. 481. Citeseer (2004)


  5. EuroEXA: European Exascale System Interconnect and Storage. https://euroexa.eu/

  6. Feldman, M.: Fujitsu switches horses for Post-K supercomputer, will ride ARM into exascale (2016). https://www.top500.org/news/fujitsu-switcheshorses-for-post-k-supercomputer-will-ride-arm-intoexascale

  7. Fu, H., et al.: The Sunway TaihuLight supercomputer: system and applications. Sci. China Inf. Sci. 59(7), 072001 (2016)


  8. HORIZON 2020: The EU Framework Programme for Research and Innovation. https://ec.europa.eu/programmes/horizon2020/

  9. Katevenis, M., et al.: The ExaNeSt project: interconnects, storage, and packaging for exascale systems. In: 2016 Euromicro Conference on Digital System Design (DSD), pp. 60–67, August 2016. https://doi.org/10.1109/DSD.2016.106

  10. Katevenis, M.G.: Interprocessor communication seen as load-store instruction generalization. In: Bertels, K., et al. (eds.) The Future of Computing, Essays in Memory of Stamatis Vassiliadis. Delft, The Netherlands. Citeseer (2007)


  11. Katz, R.H., Eggers, S.J., Wood, D.A., Perkins, C., Sheldon, R.G.: Implementing a cache consistency protocol, vol. 13. IEEE Computer Society Press (1985)


  12. Leitao, B.H.: Tuning 10Gb network cards on Linux. In: Proceedings of the 2009 Linux Symposium, pp. 169–185. Citeseer (2009)


  13. LAMMPS Benchmark suite. http://lammps.sandia.gov/bench.html

  14. OSU Micro-Benchmarks. http://mvapich.cse.ohio-state.edu/benchmarks/

  15. Pfister, G.F.: An introduction to the InfiniBand architecture. High Perform. Mass Storage Parallel I/O 42, 617–632 (2001)


  16. Thakur, R., Rabenseifner, R., Gropp, W.: Optimization of collective communication operations in MPICH. Int. J. High Perform. Comput. Appl. 19(1), 49–66 (2005). https://doi.org/10.1177/1094342005051521


  17. Yokokawa, M., Shoji, F., Uno, A., Kurokawa, M., Watanabe, T.: The K computer: Japanese next-generation supercomputer development project. In: IEEE/ACM International Symposium on Low Power Electronics and Design, pp. 371–372. IEEE (2011)



Acknowledgments

This work is supported by the European Commission under the Horizon 2020 Framework Programme for Research and Innovation [8] through the EuroEXA project [5] (g.a. 754337), the EU H2020 FETHPC project ExaNode (g.a. 671578), and the ExaNeSt project (g.a. 671553) [2].

Author information

Correspondence to Manolis Ploumidis.



Copyright information

© 2019 Springer Nature Switzerland AG

About this paper


Cite this paper

Ploumidis, M., et al. (2019). Software and Hardware Co-design for Low-Power HPC Platforms. In: Weiland, M., Juckeland, G., Alam, S., Jagode, H. (eds) High Performance Computing. ISC High Performance 2019. Lecture Notes in Computer Science, vol 11887. Springer, Cham. https://doi.org/10.1007/978-3-030-34356-9_9


  • DOI: https://doi.org/10.1007/978-3-030-34356-9_9

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-34355-2

  • Online ISBN: 978-3-030-34356-9

  • eBook Packages: Computer Science, Computer Science (R0)
