An Integrated Exploration and Virtual Platform Framework for Many-Accelerator Heterogeneous Systems

Published: 18 March 2016

Abstract

The recent advent of many-accelerator systems-on-chip (SoCs), driven by the need to maximize throughput and power efficiency, has led to an exponential increase in hardware/software co-design complexity. The reason for this increase is that the designer has to explore a vast number of architectural parameter combinations for each accelerator, as well as inter-accelerator configuration combinations, under specific area, throughput, and power constraints, given that each accelerator has different computational requirements. In such a case, the design space size explodes. As a result, existing design space exploration (DSE) techniques give poor-quality solutions, because the design space cannot be adequately covered in a reasonable time. This problem is aggravated by the very long simulation times of many-accelerator virtual platforms (VPs). This article addresses these design issues by (a) presenting a virtual prototyping solution that decreases the exploration time by enabling the evaluation of multiple configurations per VP simulation and (b) proposing a DSE methodology that efficiently explores the design space of many-accelerator systems. Using two fully developed use cases, namely an H.264 decoding server for multiple video streams and a parallelized denoising system for MRI scans, we show that the proposed DSE methodology either leads to Pareto points that dominate those of a typical DSE scenario or finds new solutions that the typical DSE might not find. In addition, the proposed virtual prototyping solution reduces DSE runtime by up to 10× for H.264 and up to 5× for Rician denoising.
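
The abstract compares DSE results in terms of Pareto points that dominate one another. As a minimal sketch of what that comparison means, and not the article's own methodology, the Python snippet below filters a set of design points down to its non-dominated front. It assumes every objective is minimized (a maximized objective such as throughput can be negated first). The names `dominates` and `pareto_front` and the numeric values are hypothetical, chosen only for illustration.

```python
from typing import List, Tuple

# One design point: a tuple of objective values, all to be minimized.
Point = Tuple[float, ...]


def dominates(a: Point, b: Point) -> bool:
    """Return True if a is no worse than b in every objective
    and strictly better in at least one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points: List[Point]) -> List[Point]:
    """Keep the points that no other point dominates (simple O(n^2) scan)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (area, latency, power) values, for illustration only.
designs = [
    (4.0, 10.0, 2.0),
    (5.0, 8.0, 2.5),
    (6.0, 12.0, 3.0),  # worse than the first design in every objective
]
front = pareto_front(designs)
```

Here the third design is dropped because the first one dominates it, while the first two remain because each is better than the other in at least one objective.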

Published In

ACM Transactions on Embedded Computing Systems, Volume 15, Issue 3
July 2016
520 pages
ISSN:1539-9087
EISSN:1558-3465
DOI:10.1145/2899033
Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 18 March 2016
Accepted: 01 December 2015
Revised: 01 June 2015
Received: 01 September 2014
Published in TECS Volume 15, Issue 3

Author Tags

  1. Many accelerator
  2. MultiDynamic
  3. SystemC
  4. analytical models
  5. design space exploration
  6. design space reduction
  7. simulation
  8. virtual platforms

Qualifiers

  • Research-article
  • Research
  • Refereed

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 11
  • Downloads (Last 6 weeks): 1
Reflects downloads up to 14 Feb 2025

Citations

Cited By

  • (2021) The TaPaSCo Open-Source Toolflow. Journal of Signal Processing Systems 93:5 (545-563). DOI: 10.1007/s11265-021-01640-8. Online publication date: 2-May-2021
  • (2021) Fast DSE of reconfigurable accelerator systems via ensemble machine learning. Analog Integrated Circuits and Signal Processing 108:3 (495-509). DOI: 10.1007/s10470-021-01885-0. Online publication date: 1-Sep-2021
  • (2020) Regression Ensembles for Fast Design Space Exploration of Heterogeneous Hardware Designs. 2020 19th IEEE International Conference on Machine Learning and Applications (ICMLA) (201-204). DOI: 10.1109/ICMLA51294.2020.00041. Online publication date: Dec-2020
  • (2018) OpenCL-based Virtual Prototyping and Simulation of Many-Accelerator Architectures. ACM Transactions on Embedded Computing Systems 17:5 (1-27). DOI: 10.1145/3242179. Online publication date: 24-Sep-2018
