The Performance Characterization of the RSC PetaStream Module

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 8488)

Abstract

The RSC PetaStream architecture is a massively parallel computer design based on Intel® Xeon® Phi manycore co-processors. Each RSC PetaStream module contains eight Intel Xeon Phi co-processors with a PCI Express fabric and an InfiniBand interconnect for intermodule communication. This paper concentrates on the performance of a single RSC PetaStream module, evaluated with low-level (point-to-point MPI), library-level (linear algebra, MAGMA), and application-level (classical molecular dynamics with the GROMACS and LAMMPS codes) tests. A dual-socket system with top-bin Intel Xeon E5-2690 CPUs was used for comparison. This early evaluation demonstrates that, in general, each Xeon Phi co-processor of RSC PetaStream delivers approximately the same performance as a dual-socket Intel Xeon E5 system, at only half the energy-to-solution. The fine-grained parallelism of the Intel Xeon Phi cores takes advantage of the higher message exchange rates at the MPI level for communication between threads placed on different Xeon Phi chips.

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Semin, A., Druzhinin, E., Mironov, V., Shmelev, A., Moskovsky, A. (2014). The Performance Characterization of the RSC PetaStream Module. In: Kunkel, J.M., Ludwig, T., Meuer, H.W. (eds) Supercomputing. ISC 2014. Lecture Notes in Computer Science, vol 8488. Springer, Cham. https://doi.org/10.1007/978-3-319-07518-1_27

  • DOI: https://doi.org/10.1007/978-3-319-07518-1_27

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-07517-4

  • Online ISBN: 978-3-319-07518-1

  • eBook Packages: Computer Science, Computer Science (R0)
