Power-Awarness in Coarse-Grained Reconfigurable Multi-Functional Architectures: a Dataflow Based Strategy

Palumbo, Francesca; Fanni, Tiziana; Sau, Carlo; Meloni, Paolo

doi:10.1007/s11265-016-1106-9

Published: 09 February 2016

Volume 87, pages 81–106, (2017)

Journal of Signal Processing Systems

Francesca Palumbo¹,
Tiziana Fanni²,
Carlo Sau² &
Paolo Meloni²


Abstract

To accommodate different applications or functionalities on the same substrate and to provide flexibility at the hardware level, modern embedded systems are often resource redundant and, consequently, power hungry. Dedicated design frameworks are therefore required to implement efficient runtime-reconfigurable platforms. To address this scenario, such frameworks also need to offer application-specific support for power management. In this work, we adopt dataflow specifications as a starting point for power minimization in coarse-grained reconfigurable embedded systems. The proposed flow is composed of two consecutive steps: 1) the characterization of the optimal topological system specification(s) and 2) the identification of disjoint logic regions. The latter are then used to implement clock and power gating methodologies. The validity of this model-based approach has been demonstrated on the reconfigurable computing core of a multi-functional coprocessor for image processing applications. Results have been assessed targeting both a 90 nm ASIC technology and a 45 nm one.
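The second step of the flow can be pictured with a small sketch. It is not the authors' algorithm: it assumes, purely for illustration, that each input dataflow is given as a set of actor names and that a logic region collects the actors used by exactly the same set of dataflows. The function names `logic_regions` and `gated_off` and the example dataflows `alpha` and `beta` are hypothetical.

```python
from collections import defaultdict


def logic_regions(dataflows):
    """Partition actors into disjoint regions (illustrative assumption:
    a region is the set of actors used by exactly the same dataflows)."""
    users = defaultdict(set)
    for name, actors in dataflows.items():
        for actor in actors:
            users[actor].add(name)
    regions = defaultdict(set)
    for actor, names in users.items():
        regions[frozenset(names)].add(actor)
    return dict(regions)


def gated_off(regions, active):
    """Regions not needed while the dataflow `active` is running,
    i.e. candidates for clock or power gating."""
    return [actors for names, actors in regions.items() if active not in names]


# Hypothetical input: two dataflows sharing actors A and B.
dataflows = {
    "alpha": {"A", "B", "C"},
    "beta": {"A", "B", "D"},
}
regions = logic_regions(dataflows)
for names, actors in sorted(regions.items(), key=lambda kv: sorted(kv[1])):
    print(sorted(names), "->", sorted(actors))
print("idle while alpha runs:", gated_off(regions, "alpha"))
```

In this toy case the shared actors A and B form one region that is never switched off, while C and D each form a region that is idle whenever the other dataflow is active.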


Notes

Such as the availability of dedicated cells and processes on the implementation stack.
Xilinx boards, for example, are equipped with dedicated blocks (BUFGs), whose outputs can drive distinct logic regions, powering down different design portions (when enabled).
In the worst-case scenario, where all the input DPNs share the same actor, the chain will have as many elements as there are iterations.
A combination is a selection of all or part of a set of objects, regardless of the order in which they are selected. Given A, B and C, the complete list of possible selections of two items would be: AB, AC, and BC.
The back-annotation requires performing a training set of synthesis trials. In this paper we used the RTL Compiler of Cadence SoC Encounter and a 90 nm CMOS technology.
Only the static contribution to the power consumption is considered.
SoC Encounter has been used to extract the CP associated with the N different input specifications, each synthesized stand-alone with a 90 nm CMOS technology.
Coefficients in Eq. 4 are technology dependent and have to be modeled for each target technology node by interpolating a training set of experimental results. This form derives from experiments carried out by means of the RTL Compiler (Cadence SoC Encounter), using a 90 nm ASIC CMOS technology as reference and varying the number of SBoxes (from 1 to 100) and the number of bits of the SBoxes data (1, 8, 16, 32, 64).
By kernel contribution we mean the power consumption of the design when the considered kernel is enabled.
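The definition of a combination given in the notes above can be checked with a short sketch using Python's standard `itertools.combinations`. The helper `all_selections` is a hypothetical name introduced here to enumerate selections of "all or part" of the set, meaning every non-empty size.

```python
from itertools import combinations
from math import comb

items = ["A", "B", "C"]

# Order-independent selections of two items out of three.
pairs = ["".join(c) for c in combinations(items, 2)]
print(pairs)        # ['AB', 'AC', 'BC']
print(comb(3, 2))   # 3


def all_selections(objects):
    """Every non-empty combination, from single items up to the full set."""
    result = []
    for size in range(1, len(objects) + 1):
        result.extend("".join(c) for c in combinations(objects, size))
    return result


print(all_selections(items))
# ['A', 'B', 'C', 'AB', 'AC', 'BC', 'ABC']
```

For n objects the number of non-empty selections is 2^n - 1, which is 7 for the three items above.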


Acknowledgments

Dr. Carlo Sau is grateful to the Sardinia Regional Government for funding the RPCT Project (L.R. 7/2007, CRP-18324) that led to these results. Carlo Sau and Tiziana Fanni are grateful to the Sardinia Regional Government for supporting their PhD scholarships (P.O.R. F.S.E., European Social Fund 2007-2013 - Axis IV Human Resources).

Author information

Authors and Affiliations

POLCOMING - Information Engineering Unit, University of Sassari, Sassari, Italy
Francesca Palumbo
DIEE - Department of Electronics Engineering, University of Cagliari, Cagliari, Italy
Tiziana Fanni, Carlo Sau & Paolo Meloni

Corresponding author

Correspondence to Francesca Palumbo.

Appendix A: MDC Generated CPF

About this article

Cite this article

Palumbo, F., Fanni, T., Sau, C. et al. Power-Awarness in Coarse-Grained Reconfigurable Multi-Functional Architectures: a Dataflow Based Strategy. J Sign Process Syst 87, 81–106 (2017). https://doi.org/10.1007/s11265-016-1106-9

Received: 01 March 2015
Revised: 05 January 2016
Accepted: 14 January 2016
Published: 09 February 2016
Issue Date: April 2017
DOI: https://doi.org/10.1007/s11265-016-1106-9
