ABSTRACT
This paper investigates the mapping of stream programs to wide-issue clustered VLIW processors so that designers can leverage their existing investments in VLIW-based platforms to harness the advantages of stream programming.
Stream execution on wide-issue clustered VLIW architectures
Proceedings of the 2007 LCTES conference