Improving Performance of OpenMP for SMP Clusters Through Overlapped Page Migrations

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 4315)

Abstract

Costly page migration is a major obstacle to integrating OpenMP with page-based software distributed shared memory (SDSM) to realize an easy-to-use programming paradigm for SMP clusters. To reduce the impact of page migration overhead on application execution time, previous research has mainly focused on reducing the number of page migrations and on hiding the migration overhead by overlapping computation with communication. We propose the ‘collective-prefetch’ technique, which overlaps page migrations with one another even when the prior approach cannot be applied effectively. Experiments with a communication-intensive application show that our technique significantly reduces the page migration overhead and reduces the overall execution time to between 57% and 79%.
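
To give a rough feel for why overlapping page migrations with one another can pay off, the following Python sketch compares two toy cost estimates: fetching pages one at a time on demand, and issuing all requests together so that the round-trip latencies overlap. It is a simplified model under stated assumptions, not the authors' implementation; the function names, the `latency_us` and `transfer_us` parameters, and the assumption that transfers are serialized on a single link are all hypothetical.

```python
def on_demand_cost(n_pages, latency_us, transfer_us):
    """Estimated time when each page is requested only after the previous
    one has arrived: every page pays the full request latency plus its
    own transfer time."""
    return n_pages * (latency_us + transfer_us)


def overlapped_cost(n_pages, latency_us, transfer_us):
    """Estimated time when all page requests are issued back to back.

    Assumption (not from the paper): the request latency is paid once
    because the requests are in flight together, while the page
    transfers still share one link and therefore add up.
    """
    if n_pages == 0:
        return 0
    return latency_us + n_pages * transfer_us


if __name__ == "__main__":
    # Hypothetical numbers: 8 pages, 100 us latency, 20 us transfer each.
    print(on_demand_cost(8, 100, 20))
    print(overlapped_cost(8, 100, 20))
```

In this model the benefit grows with the ratio of latency to transfer time, which suggests why overlapping would matter most for communication-intensive workloads; real gains depend on the network and on how the SDSM runtime schedules the requests.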

Editor information

Matthias S. Mueller, Barbara M. Chapman, Bronis R. de Supinski, Allen D. Malony, Michael Voss

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Jeun, WC., Kee, YS., Ha, S. (2008). Improving Performance of OpenMP for SMP Clusters Through Overlapped Page Migrations. In: Mueller, M.S., Chapman, B.M., de Supinski, B.R., Malony, A.D., Voss, M. (eds) OpenMP Shared Memory Parallel Programming. IWOMP 2005. Lecture Notes in Computer Science, vol 4315. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-68555-5_20

  • DOI: https://doi.org/10.1007/978-3-540-68555-5_20

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-68554-8

  • Online ISBN: 978-3-540-68555-5

  • eBook Packages: Computer Science, Computer Science (R0)
