Abstract
To remain economically viable, an HPC cluster must be deployed under tight hardware and software cost constraints, prompting researchers to reconsider the design of modern HPC platforms. In this paper we present a cross-layer communication architecture suitable for emerging HPC platforms based on heterogeneous multiprocessors. We propose simple hardware primitives that enable protected, reliable, and virtualized user-level communication and that can easily be integrated in the same package with the processing unit. Combined with an efficient user-space software stack, the proposed architecture provides low-latency communication mechanisms to HPC applications. Our implementation of the MPI standard exploits these capabilities to deliver point-to-point and collective primitives with low overheads, including an eager protocol with an end-to-end latency of 1.4 \(\upmu \mathrm{s}\). We port and evaluate our communication stack with real HPC applications on a cluster of 128 ARMv8 processors tightly coupled with FPGA logic. The network interface primitives occupy less than 25% of the FPGA logic and only 3 Mbits of SRAM, while easily saturating the 16 Gb/s links of our platform.
Notes
1. System MMU in the case of ARM processors.
2. Note that, at the hardware level, the transfer has been separately acknowledged from the RX engine on node B to the TX engine on node A.
Copyright information
© 2019 Springer Nature Switzerland AG
Cite this paper
Ploumidis, M. et al. (2019). Software and Hardware Co-design for Low-Power HPC Platforms. In: Weiland, M., Juckeland, G., Alam, S., Jagode, H. (eds) High Performance Computing. ISC High Performance 2019. Lecture Notes in Computer Science(), vol 11887. Springer, Cham. https://doi.org/10.1007/978-3-030-34356-9_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-34355-2
Online ISBN: 978-3-030-34356-9
eBook Packages: Computer Science (R0)