DOI: 10.1145/2870650.2870655
research-article

Code vectorization using Intel Array Notation

Published: 13 March 2016

ABSTRACT

In this paper, we explain the steps we took to port a large, industry-grade computational fluid dynamics application to the Intel® Xeon Phi™ coprocessor using the C/C++ Array Notation extensions of Intel® Cilk™ Plus. An essential part of performance refactoring for the Xeon Phi coprocessor is achieving high-quality SIMD vectorization. Although there are other ways to vectorize code, the Array Notation extensions have proven to work best for our application. We encapsulated the Array Notation syntax in a C++ wrapper class, which drastically reduces the refactoring effort. In addition, the architecture independence of the Array Notation extensions further reduces porting and tuning effort. We study how our approach helps the compiler generate vectorized code and, derived from that study, summarize our key findings as well as current limitations. Finally, we present a performance evaluation of the ported computational fluid dynamics application using the introduced C++ wrapper class and compare our solution with related approaches.
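To make the wrapper idea concrete, the following is a minimal sketch of what such a class could look like. It is not the class from the paper: the name `Pack`, its width parameter, the 64-byte alignment, and the helper `axpy` are all assumptions chosen for illustration. The operator bodies are written as plain loops so that the example compiles with any standard C++ compiler; with a compiler that supports Cilk Plus Array Notation, each loop could instead be written as a single array-section statement, as noted in the comments.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical fixed-width wrapper (illustrative only, not the paper's class).
// It hides the element-wise syntax behind ordinary operators, so calling code
// reads like scalar code.
template <typename T, std::size_t N>
struct Pack {
    alignas(64) T data[N];  // assumed alignment, chosen to suit wide SIMD loads

    // Fill every lane with the same value.
    static Pack broadcast(T value) {
        Pack p;
        for (std::size_t i = 0; i < N; ++i) p.data[i] = value;
        return p;
    }

    T& operator[](std::size_t i) { return data[i]; }
    const T& operator[](std::size_t i) const { return data[i]; }
};

// Element-wise addition.
// With Array Notation the loop could read: r.data[:] = a.data[:] + b.data[:];
template <typename T, std::size_t N>
Pack<T, N> operator+(const Pack<T, N>& a, const Pack<T, N>& b) {
    Pack<T, N> r;
    for (std::size_t i = 0; i < N; ++i) r.data[i] = a.data[i] + b.data[i];
    return r;
}

// Element-wise multiplication.
// With Array Notation the loop could read: r.data[:] = a.data[:] * b.data[:];
template <typename T, std::size_t N>
Pack<T, N> operator*(const Pack<T, N>& a, const Pack<T, N>& b) {
    Pack<T, N> r;
    for (std::size_t i = 0; i < N; ++i) r.data[i] = a.data[i] * b.data[i];
    return r;
}

// Example kernel written against the wrapper: alpha * x + y, lane by lane.
template <typename T, std::size_t N>
Pack<T, N> axpy(T alpha, const Pack<T, N>& x, const Pack<T, N>& y) {
    return Pack<T, N>::broadcast(alpha) * x + y;
}
```

A kernel such as `axpy` never mentions the vector width or the vectorization syntax, which is the property that keeps refactoring effort low: only the operator bodies would change when moving between a plain-loop and an Array Notation implementation.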


  • Published in

    WPMVP '16: Proceedings of the 3rd Workshop on Programming Models for SIMD/Vector Processing
    March 2016
    52 pages
    ISBN: 9781450340601
    DOI: 10.1145/2870650

    Copyright © 2016 ACM

    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States



    Acceptance Rates

    Overall Acceptance Rate: 20 of 30 submissions, 67%
