
An Initial Analysis of the Impact of Overlap and Independent Progress for MPI

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 3241)

Abstract

The ability to offload functionality to a programmable network interface is appealing, both for increasing message passing performance and for reducing the overhead on the host processor(s). Two important features of an MPI implementation are independent progress and the ability to overlap computation with communication. In this paper, we compare the performance of several application benchmarks under an MPI implementation that takes advantage of a programmable NIC to implement MPI semantics against their performance under an implementation that does not. Unlike previous such comparisons, we use identical network hardware and virtually the same software stack, so the comparison isolates the impact of these two features of an MPI implementation.
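To make these two features concrete: overlap means the application can perform useful computation while a message transfer is in flight, and independent progress means a posted nonblocking operation completes without the application re-entering the MPI library, for example because the NIC advances the protocol on its own. Below is a minimal sketch of the overlap pattern in C using standard nonblocking MPI calls; the message size and the do_independent_work() routine are illustrative placeholders, not taken from the paper.

```c
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

/* Stand-in for application work that does not touch the message buffer. */
static double do_independent_work(int n) {
    double acc = 0.0;
    for (int i = 0; i < n; i++)
        acc += (double)i * 0.5;
    return acc;
}

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    const int N = 1 << 20;                      /* illustrative message size */
    double *buf = calloc(N, sizeof(double));
    MPI_Request req;

    if (rank == 0 && size > 1) {
        /* Post the send, then compute while the transfer is (ideally)
         * progressed by the NIC rather than by the host processor. */
        MPI_Isend(buf, N, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD, &req);
        double r = do_independent_work(N);
        MPI_Wait(&req, MPI_STATUS_IGNORE);      /* completion point */
        printf("rank 0: computed %f during the send\n", r);
    } else if (rank == 1) {
        MPI_Irecv(buf, N, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD, &req);
        double r = do_independent_work(N);
        MPI_Wait(&req, MPI_STATUS_IGNORE);
        printf("rank 1: computed %f during the receive\n", r);
    }

    free(buf);
    MPI_Finalize();
    return 0;
}
```

Whether the computation actually overlaps the transfer depends on the implementation: without independent progress, the transfer may only advance inside MPI calls such as MPI_Wait, serializing what the code expresses as concurrent.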




Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Brightwell, R., Underwood, K.D., Riesen, R. (2004). An Initial Analysis of the Impact of Overlap and Independent Progress for MPI. In: Kranzlmüller, D., Kacsuk, P., Dongarra, J. (eds) Recent Advances in Parallel Virtual Machine and Message Passing Interface. EuroPVM/MPI 2004. Lecture Notes in Computer Science, vol 3241. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30218-6_51


  • DOI: https://doi.org/10.1007/978-3-540-30218-6_51

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-23163-9

  • Online ISBN: 978-3-540-30218-6

