The Design and Experiments of A SID-Based Power-Aware Simulator for Embedded Multicore Systems

Published: 02 March 2015

Abstract

Embedded multicore systems play an increasingly important role in the design of consumer electronics. The objective of such systems is to optimize both the performance and the power characteristics of mobile devices. However, the popular application design platforms (such as SID) that application developers use currently provide no power metrics, which hinders developers' ability to optimize power consumption. In this article we present the design and experiments of a SID-based power-aware simulation framework for embedded multicore systems. The proposed power estimation flow includes two phases: IP-level power modeling and power-aware system simulation. The first phase employs PowerMixerIP to construct the power models for the processor IP and other major IPs. The second phase applies a power abstract interpretation method to summarize the simulation trace; a CPE module then estimates the power consumption from the summarized trace information and the IP power models. In addition, a Manager component is devised to map each digital signal processor (DSP) component to a host thread and to maintain access to shared resources, with the aim of maintaining simulation performance as the number of simulated DSP components increases. A power-profiling API is also provided so that developers of embedded software can tune the granularity of power profiling for a specific code section of the target application. We demonstrate via case studies and experiments how application developers can use our SID-based power simulator to optimize the power consumption of their applications. We characterize the power consumption of DSP applications with the DSPstone benchmark and discuss how compiler optimization levels with SIMD intrinsics influence performance and power consumption. A histogram application and an augmented-reality application based on a human-face-based RMS (recognition, mining, and synthesis) application are deployed as running examples on multicore systems to demonstrate how developers can use our power simulator during optimization to obtain different views of an application's power dissipation.
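
The second phase of the flow described in the abstract (summarize a simulation trace, then combine the summary with per-IP power models) can be illustrated with a minimal Python sketch. This is not the article's implementation: the trace format, the function names `summarize_trace` and `estimate_energy_nj`, and the per-event energy values in `IP_POWER_MODEL_NJ` are all hypothetical and chosen only to show the general idea of count-based estimation.

```python
from collections import Counter

# Hypothetical per-event energy table (nanojoules), standing in for the
# IP power models that the first phase would produce. Illustrative values only.
IP_POWER_MODEL_NJ = {
    ("dsp", "alu"): 0.5,
    ("dsp", "load"): 1.0,
    ("mem", "read"): 2.0,
    ("mem", "write"): 2.5,
}


def summarize_trace(trace):
    """Collapse a list of (ip, event) records into per-(ip, event) counts."""
    return Counter(trace)


def estimate_energy_nj(summary, model):
    """Sum count * per-event energy over the summarized trace."""
    return sum(count * model[key] for key, count in summary.items())


# A made-up six-record trace; a real simulator would emit far more events.
trace = [
    ("dsp", "alu"), ("dsp", "alu"), ("dsp", "load"),
    ("mem", "read"), ("dsp", "alu"), ("mem", "write"),
]
summary = summarize_trace(trace)
total_nj = estimate_energy_nj(summary, IP_POWER_MODEL_NJ)
print(summary[("dsp", "alu")], total_nj)
```

Summarizing before estimating keeps the estimation cost proportional to the number of distinct event kinds rather than to the trace length, which is one plausible reason for separating the two steps.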

Cited By

  • (2017) LP-HLS: Automatic power-intent generation for high-level synthesis based hardware implementation flow. Microprocessors and Microsystems 50, 26-38. DOI: 10.1016/j.micpro.2017.02.002. Online publication date: May 2017.
  • The Support of MISRA C++ Analyzer for Reliability of Embedded Systems. ACM Transactions on Cyber-Physical Systems. DOI: 10.1145/3611390.

Published In

ACM Transactions on Design Automation of Electronic Systems, Volume 20, Issue 2
February 2015
404 pages
ISSN:1084-4309
EISSN:1557-7309
DOI:10.1145/2742143
Editor: Naehyuck Chang
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 02 March 2015
Accepted: 01 September 2014
Revised: 01 September 2014
Received: 01 February 2014
Published in TODAES Volume 20, Issue 2

Author Tags

  1. DSP
  2. Multicore simulation
  3. embedded processor
  4. power modeling

Qualifiers

  • Research-article
  • Research
  • Refereed

Funding Sources

  • MediaTek research
  • Ministry of Science and Technology of Taiwan
  • Ministry of Economic Affairs of Taiwan
