Design space exploration of an optimized compiler approach for a generic reconfigurable array architecture

Dimitroulakos, Grigoris; Galanis, Michalis D.; Goutis, Costas E.

doi:10.1007/s11227-006-0016-1

Published: 24 February 2007

The Journal of Supercomputing, Volume 40, pages 127–157 (2007)

Abstract

Several mesh-like coarse-grained reconfigurable architectures have been devised in the last few years, accompanied by their corresponding mapping flows. One of the major bottlenecks in mapping algorithms onto these architectures is the limited memory access bandwidth. Only a few mapping methodologies have addressed the problem of limited bandwidth, and none has explored how the resulting performance improvements are affected by the architectural characteristics. In this paper we study the impact that the architectural parameters have on the performance speedups achieved when the local RAMs of the processing elements (PEs) are used for storing the variables with data reuse opportunities. The data reuse values are transferred over the internal interconnection network instead of being fetched from external memories, in order to reduce the data transfer burden on the bus network. A novel mapping algorithm that uses a list scheduling technique is also proposed. The experimental results quantify the trade-offs between the performance improvements and the memory access latency, the interconnection network, and the size of each PE’s local RAM. For this reason, our mapping methodology targets a flexible architecture template, which permits such an exploration. More specifically, the experiments showed that the improvements increase with the memory access latency, while a richer interconnection topology can improve the operation parallelism by a factor of 1.4 on average. Finally, for the considered set of benchmarks, applying our methodology with each PE’s local RAM sized at 8 words improved the operation parallelism from 8.6% to 85.1%.
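The interaction the abstract describes, between list scheduling, bus bandwidth, and operands kept in local RAM, can be illustrated with a small sketch. This is not the authors' mapping algorithm: the function `list_schedule`, its parameters, the one-cycle operation cost, the caller-supplied priority order, and the four-tap sum-of-products example are all assumptions made for illustration, and interconnect routing is ignored entirely.

```python
def list_schedule(priority, deps, ext_loads, num_pes, bus_ports, mem_latency):
    """Toy cycle-by-cycle list scheduler (illustrative model, not the paper's).

    priority    -- operation names, highest priority first (assumed given)
    deps        -- op -> iterable of predecessor ops
    ext_loads   -- op -> number of operands fetched over the external bus
    num_pes     -- number of processing elements
    bus_ports   -- external-memory accesses that may be issued per cycle
    mem_latency -- extra cycles an op waits when it has external operands
    Returns (start cycles, schedule length in cycles).
    """
    if any(n > bus_ports for n in ext_loads.values()):
        raise ValueError("an operation needs more bus ports than exist")

    start, finish = {}, {}
    pe_free_at = [0] * num_pes      # cycle at which each PE becomes idle
    unscheduled = list(priority)
    cycle = 0
    while unscheduled:
        bus_left = bus_ports        # bus budget is renewed every cycle
        for op in list(unscheduled):
            # An op is ready once every predecessor result is available.
            preds = deps.get(op, ())
            if any(p not in finish or finish[p] > cycle for p in preds):
                continue
            n_ext = ext_loads.get(op, 0)
            if n_ext > bus_left:
                continue            # bus saturated for this cycle
            pe = next((i for i, t in enumerate(pe_free_at) if t <= cycle), None)
            if pe is None:
                break               # no idle PE left in this cycle
            duration = 1 + (mem_latency if n_ext else 0)
            start[op] = cycle
            finish[op] = cycle + duration
            pe_free_at[pe] = finish[op]
            bus_left -= n_ext
            unscheduled.remove(op)
        cycle += 1
    return start, max(finish.values(), default=0)


# Hypothetical loop body: four multiplies reduced by an adder tree.
PRIORITY = ["m0", "m1", "m2", "m3", "a0", "a1", "a2"]
DEPS = {"a0": ["m0", "m1"], "a1": ["m2", "m3"], "a2": ["a0", "a1"]}

# Every multiply fetches both operands from external memory.
NO_REUSE = {"m0": 2, "m1": 2, "m2": 2, "m3": 2}
# Reused operands sit in local RAM; only one new value crosses the bus.
WITH_REUSE = {"m0": 1}

if __name__ == "__main__":
    for label, loads in (("no reuse", NO_REUSE), ("with reuse", WITH_REUSE)):
        _, length = list_schedule(PRIORITY, DEPS, loads,
                                  num_pes=4, bus_ports=2, mem_latency=2)
        print(label, "schedule length:", length)
```

In this toy model, raising `mem_latency` lengthens the schedule without reuse more than the schedule with reuse, which matches in spirit the abstract's statement that the improvements increase with the memory access latency.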



Author information

Authors and Affiliations

VLSI Design Laboratory, ECE Department, University of Patras, Patras, Greece
Grigoris Dimitroulakos, Michalis D. Galanis & Costas E. Goutis

Corresponding author

Correspondence to Grigoris Dimitroulakos.

About this article

Cite this article

Dimitroulakos, G., Galanis, M.D. & Goutis, C.E. Design space exploration of an optimized compiler approach for a generic reconfigurable array architecture. J Supercomput 40, 127–157 (2007). https://doi.org/10.1007/s11227-006-0016-1

Issue Date: May 2007
DOI: https://doi.org/10.1007/s11227-006-0016-1
