Abstract
Today’s desktop PCs feature a variety of parallel processing units. Developing applications that exploit this parallelism is a demanding task, and a programmer needs detailed knowledge of the hardware to write an efficient implementation. CGiS is a data-parallel programming language providing a unified abstraction for two kinds of parallel processing units: graphics processing units (GPUs) and the vector processing units of CPUs. The CGiS compiler framework fully virtualizes the differences in capability and accessibility by mapping an abstract data-parallel programming model onto those targets. The applicability of CGiS to GPUs has been shown in previous work; this work presents the extension of the framework to the SIMD instruction sets of CPUs. We show how to overcome the obstacles in mapping the abstract programming model of CGiS to the SIMD hardware. Our experimental results underline the viability of this approach: real-world applications can be implemented easily with CGiS and result in efficient code.
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Fritz, N., Lucas, P., Wilhelm, R. (2008). Exploiting SIMD Parallelism with the CGiS Compiler Framework. In: Adve, V., Garzarán, M.J., Petersen, P. (eds) Languages and Compilers for Parallel Computing. LCPC 2007. Lecture Notes in Computer Science, vol 5234. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-85261-2_17
DOI: https://doi.org/10.1007/978-3-540-85261-2_17
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-85260-5
Online ISBN: 978-3-540-85261-2
eBook Packages: Computer Science, Computer Science (R0)