Offload Compiler Runtime for the Intel® Xeon Phi™ Coprocessor

  • Conference paper
Supercomputing (ISC 2013)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 7905)

Abstract

The Intel® Xeon Phi™ coprocessor platform enables offload of computation from a host processor to a coprocessor that is a fully functional Intel® Architecture CPU. This paper presents the C/C++ and Fortran compiler offload runtime for that coprocessor. The paper addresses why offload to a coprocessor is useful, how it is specified, and under what conditions offload is profitable. It also serves as a guide for potential third-party developers of offload runtimes, such as a gcc-based offload compiler, ports of existing commercial offloading compilers such as CAPS® to the Intel® Xeon Phi™ coprocessor, and third-party offload library vendors that Intel is working with, such as NAG® and MAGMA®. It describes the software architecture and design of the offload compiler runtime. It enumerates the key performance features of this heterogeneous computing stack, which relate to initialization, data movement, and invocation. Finally, it evaluates the performance impact of those features on a set of directed micro-benchmarks and larger workloads.
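The profitability question raised in the abstract can be illustrated with a rough cost model. The sketch below is not taken from the paper: the class name `OffloadCost`, its fields, and the simple additive formula are all assumptions chosen for illustration. It only captures the general idea that offload pays off when the coprocessor's compute time plus the initialization, data movement, and invocation overheads is less than the time to run the same region on the host.

```python
from dataclasses import dataclass


@dataclass
class OffloadCost:
    """Hypothetical additive cost model for one offloaded region (illustrative only)."""

    host_seconds: float            # time to run the region on the host
    coprocessor_seconds: float     # time to run the region on the coprocessor
    bytes_in: int                  # data copied host -> coprocessor
    bytes_out: int                 # data copied coprocessor -> host
    link_bytes_per_second: float   # assumed sustained transfer bandwidth
    invocation_seconds: float = 0.0  # assumed per-offload launch overhead
    init_seconds: float = 0.0        # assumed one-time initialization cost charged here

    def transfer_seconds(self) -> float:
        # Data movement cost, assuming transfers do not overlap with compute.
        return (self.bytes_in + self.bytes_out) / self.link_bytes_per_second

    def offload_seconds(self) -> float:
        # Total time when the region is offloaded.
        return (
            self.init_seconds
            + self.invocation_seconds
            + self.transfer_seconds()
            + self.coprocessor_seconds
        )

    def is_profitable(self) -> bool:
        # Offload is worthwhile only if it beats running on the host.
        return self.offload_seconds() < self.host_seconds


# Example with made-up numbers: 6 GB moved over a 6 GB/s link costs 1 s.
example = OffloadCost(
    host_seconds=2.0,
    coprocessor_seconds=0.5,
    bytes_in=3_000_000_000,
    bytes_out=3_000_000_000,
    link_bytes_per_second=6e9,
    invocation_seconds=0.001,
)
print(example.is_profitable())
```

Under these made-up numbers the transfer dominates the offloaded time, which is why reducing or overlapping data movement tends to matter as much as raw coprocessor speed in a model of this kind.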

For more complete information about compiler optimizations, see Intel’s Optimization Notice at http://software.intel.com/en-us/articles/optimization-notice

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Newburn, C.J. et al. (2013). Offload Compiler Runtime for the Intel® Xeon Phi™ Coprocessor. In: Kunkel, J.M., Ludwig, T., Meuer, H.W. (eds) Supercomputing. ISC 2013. Lecture Notes in Computer Science, vol 7905. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-38750-0_18

  • DOI: https://doi.org/10.1007/978-3-642-38750-0_18

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-38749-4

  • Online ISBN: 978-3-642-38750-0

  • eBook Packages: Computer Science, Computer Science (R0)