DOI: 10.1145/1065910.1065931
Article

Generation of permutations for SIMD processors

Published: 15 June 2005

ABSTRACT

Short vector (SIMD) instructions are useful in signal processing, multimedia, and scientific applications, where they offer higher performance, lower energy consumption, and better resource utilization. However, compilers still lack good support for SIMD instructions, and code often has to be written by hand in assembly language or with compiler built-in functions. In some applications, higher parallelism could also be achieved if compilers inserted permutation instructions that reorder data in registers. In this paper we describe how we create SIMD instructions from regular code and determine the ordering of individual operations within the SIMD instructions so as to minimize the number of permutation instructions. Individual memory operations are grouped into SIMD operations based on their effective addresses. The SIMD data flow graph is then constructed by following data dependences from the SIMD memory operations, and the orderings of operations are propagated from the SIMD memory operations into the graph. We also describe our approach to computing a decomposition of a given permutation into the permutation instructions of the target architecture. Experiments with our prototype compiler show that this approach scales well with the number of operations in SIMD instructions (SIMD width) and can be used to compile a number of important kernels, achieving up to 35% speedup.
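
The abstract does not say how a permutation is decomposed into target instructions, so the following is only a minimal illustrative sketch of the problem, not the paper's algorithm. It assumes a hypothetical 4-lane target with three made-up permutation instructions (`swap_pairs`, `swap_halves`, `rotate_left`) and uses a plain breadth-first search to find a shortest sequence of them that produces a requested lane ordering.

```python
from collections import deque

# Hypothetical single-register permutation instructions for a 4-lane target.
# Each tuple maps output lane i to the input lane it reads from. The names
# and patterns are illustrative assumptions, not taken from the paper.
PRIMITIVES = {
    "swap_pairs": (1, 0, 3, 2),   # exchange neighbours within each pair
    "swap_halves": (2, 3, 0, 1),  # exchange the low and high halves
    "rotate_left": (1, 2, 3, 0),  # rotate lanes by one position
}


def apply(perm, lanes):
    """Return the register contents after applying one permutation."""
    return tuple(lanes[src] for src in perm)


def decompose(target, primitives=PRIMITIVES):
    """Breadth-first search for a shortest sequence of primitives.

    Returns a list of primitive names that turns the identity ordering
    into `target`, or None if the primitives cannot produce it.
    """
    start = tuple(range(len(target)))
    target = tuple(target)
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        lanes, path = queue.popleft()
        if lanes == target:
            return path
        for name, perm in primitives.items():
            nxt = apply(perm, lanes)
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [name]))
    return None


if __name__ == "__main__":
    # Ask for a full lane reversal and print the instruction sequence found.
    print(decompose((3, 2, 1, 0)))
```

Exhaustive search like this is only practical for small SIMD widths, since the number of lane orderings grows factorially with the width. A real target would also have two-input permutation instructions, which this sketch leaves out.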


    • Published in

      LCTES '05: Proceedings of the 2005 ACM SIGPLAN/SIGBED conference on Languages, compilers, and tools for embedded systems
      June 2005
      248 pages
      ISBN: 1595930183
      DOI: 10.1145/1065910
      • General Chair: Yunheung Paek
      • Program Chair: Rajiv Gupta

      ACM SIGPLAN Notices, Volume 40, Issue 7
      Proceedings of the 2005 ACM SIGPLAN/SIGBED conference on Languages, compilers, and tools for embedded systems
      July 2005
      238 pages
      ISSN: 0362-1340
      EISSN: 1558-1160
      DOI: 10.1145/1070891

      Copyright © 2005 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Acceptance Rates

      Overall Acceptance Rate: 116 of 438 submissions, 26%
