poster

Cost-Effective Memory Architecture to Achieve Flexible Configuration and Efficient Data Transmission for Coarse-Grained Reconfigurable Array (Abstract Only)

Authors: Chen Yang, Leibo Liu, Shouyi Yin, Shaojun Wei

FPGA '15: Proceedings of the 2015 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays

Page 263

https://doi.org/10.1145/2684746.2689103

Published: 22 February 2015

Abstract

The memory architecture strongly affects the flexibility and performance of a coarse-grained reconfigurable array (CGRA), both of which can be limited by configuration overhead and long data-transmission latency. Multi-context structures and data preloading are widely used in popular CGRAs to relieve the bandwidth bottlenecks of contexts and data. However, these two schemes cannot balance computing performance, area overhead, and flexibility. This paper proposes a group-based context cache and a multi-level data memory architecture to alleviate these bottlenecks. The group-based context cache dynamically transfers and buffers contexts inside the CGRA to reduce off-chip memory accesses for contexts at runtime. The multi-level data memory adds data memories at different levels of the CGRA hierarchy, where they serve as buffers for reused input data and intermediate data. The proposed memory architectures are efficient and cost-effective: they improve performance at the cost of minor area overhead. In experiments, an H.264 video decoding program and the scale-invariant feature transform (SIFT) algorithm achieved performance improvements of 19% and 23%, respectively. Furthermore, the complexity of applications running on the CGRA is no longer restricted by the capacity of the on-chip context memory, which enables flexible configuration. The proposed memory architectures are based on a generic CGRA architecture derived from characteristics common to most existing popular CGRAs, so they can be applied to CGRAs in general.
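
To make the context-cache idea concrete, the following is a minimal sketch and not the authors' design. It assumes that contexts are fetched from off-chip memory in fixed-size groups and that a least-recently-used policy chooses which group to evict; the abstract specifies neither. The names `GroupContextCache`, `count_fetches`, `capacity_groups`, and `group_size` are hypothetical.

```python
from collections import OrderedDict


class GroupContextCache:
    """Toy model of an on-chip cache that buffers contexts in groups.

    Assumptions (not taken from the paper): fixed-size groups, LRU eviction.
    """

    def __init__(self, capacity_groups, group_size):
        if capacity_groups <= 0 or group_size <= 0:
            raise ValueError("capacity_groups and group_size must be positive")
        self.capacity_groups = capacity_groups
        self.group_size = group_size
        self._groups = OrderedDict()  # group_id -> None, ordered by recency
        self.off_chip_fetches = 0

    def access(self, context_id):
        """Request one context; return True on a hit, False on a miss."""
        group_id = context_id // self.group_size
        if group_id in self._groups:
            self._groups.move_to_end(group_id)  # mark as most recently used
            return True
        # Miss: the whole group is fetched from off-chip memory.
        self.off_chip_fetches += 1
        if len(self._groups) >= self.capacity_groups:
            self._groups.popitem(last=False)  # evict least recently used group
        self._groups[group_id] = None
        return False


def count_fetches(trace, capacity_groups, group_size):
    """Replay a trace of context IDs and count off-chip group fetches."""
    cache = GroupContextCache(capacity_groups, group_size)
    for context_id in trace:
        cache.access(context_id)
    return cache.off_chip_fetches
```

Replaying the same trace of context IDs through `count_fetches` with different `capacity_groups` values shows, within this toy model only, how on-chip capacity trades against off-chip context traffic.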

Index Terms

1. Hardware
  1. Emerging technologies
  2. Very large scale integration design

Published In

FPGA '15: Proceedings of the 2015 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays

February 2015

292 pages

ISBN:9781450333153

DOI:10.1145/2684746

General Chair: George A. Constantinides (Imperial College)
Program Chair: Deming Chen (University of Illinois at Urbana-Champaign)

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Poster

Funding Sources

Science and Technology Project of Jiangxi Province China
China National High Technologies Research Program
Projects from State Grid Corporation of China

Conference

FPGA '15: The 2015 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays

Sponsor: SIGDA

February 22 - 24, 2015

Monterey, California, USA

Acceptance Rates

FPGA '15 Paper Acceptance Rate 20 of 102 submissions, 20%;

Overall Acceptance Rate 125 of 627 submissions, 20%
