Compiler-Assisted Instruction Decoder Energy Optimization for Clustered VLIW Architectures

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 4873)

Abstract

Traditionally, an instruction decoder is designed as a monolithic structure, which inhibits leakage energy optimization. In this paper, we consider a split instruction decoder that enables leakage energy optimization. We also propose a compiler scheduling algorithm that exploits instruction slack to lengthen the simultaneous active and idle periods in the instruction decoder. The proposed compiler-assisted scheme obtains a further 14.5% reduction in instruction decoder energy consumption over a hardware-only scheme for a VLIW architecture. The benefits are 17.3% and 18.7% for a 2-clustered and a 4-clustered VLIW architecture, respectively.
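The idea of using slack to lengthen idle periods can be illustrated with a toy model. The sketch below is not the paper's algorithm: it assumes a single decoder block that can only be put into a low-leakage state during idle runs of at least `breakeven` cycles, it ignores data dependences between operations, and it searches all placements exhaustively. The names `sleepable_cycles` and `best_placement`, the slack windows, and the break-even value are all hypothetical.

```python
from itertools import groupby, product


def sleepable_cycles(active, length, breakeven):
    """Count idle cycles lying in idle runs of at least `breakeven` cycles.

    `active` lists the cycles in which the (hypothetical) decoder block is busy;
    `length` is the schedule length in cycles.
    """
    busy = set(active)
    timeline = [cycle in busy for cycle in range(length)]
    total = 0
    for is_busy, run in groupby(timeline):
        run_length = len(list(run))
        if not is_busy and run_length >= breakeven:
            total += run_length
    return total


def best_placement(windows, length, breakeven):
    """Pick one cycle per operation inside its slack window (earliest, latest)
    so that the number of sleepable idle cycles is maximized.

    Exhaustive search; only suitable for tiny illustrative inputs.
    """
    candidates = product(*(range(lo, hi + 1) for lo, hi in windows))
    return max(candidates, key=lambda p: sleepable_cycles(p, length, breakeven))


# Three operations with assumed slack windows in an 8-cycle schedule.
windows = [(0, 0), (2, 4), (5, 7)]
asap = [lo for lo, _ in windows]          # earliest-cycle placement
tuned = best_placement(windows, 8, 3)     # slack-aware placement

print(sleepable_cycles(asap, 8, 3), sleepable_cycles(tuned, 8, 3))
```

In this toy input the earliest-cycle placement leaves only short idle gaps, while moving the last operation to the end of its window merges them into one run long enough to exceed the assumed break-even length.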

Editor information

Srinivas Aluru, Manish Parashar, Ramamurthy Badrinath, Viktor K. Prasanna

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Nagpal, R., Srikant, Y.N. (2007). Compiler-Assisted Instruction Decoder Energy Optimization for Clustered VLIW Architectures. In: Aluru, S., Parashar, M., Badrinath, R., Prasanna, V.K. (eds) High Performance Computing – HiPC 2007. HiPC 2007. Lecture Notes in Computer Science, vol 4873. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-77220-0_38

  • DOI: https://doi.org/10.1007/978-3-540-77220-0_38

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-77219-4

  • Online ISBN: 978-3-540-77220-0

  • eBook Packages: Computer Science, Computer Science (R0)
