Energy Efficiency of a Multi-Core Processor by Tag Reduction

  • Regular Paper
Journal of Computer Science and Technology

Abstract

We consider the problem of saving energy in the caches of a multi-core processor. Previous research on low-power processors has proposed various methods to reduce power dissipation, and tag reduction is one of them. This paper extends the tag reduction technique from a single-core processor to a multi-core processor and investigates its energy-saving potential there. We formulate our approach as an equivalent assignment problem: assign all instruction pages in physical memory to a set of cores such that the tag-reduction conflicts on each core are avoided or minimized. We then propose three algorithms that use different heuristics for this assignment problem. We collect experimental data from a real operating system rather than from a processor simulator, the traditional approach, which cannot simulate operating system functions or the full memory hierarchy. Experimental results show that our proposed algorithms reduce total energy by up to 83.93% on an 8-core processor and 76.16% on a 4-core processor on average, compared with a processor that does not use tag reduction. They also significantly outperform the tag-reduction-based algorithm on a single-core processor.
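
The assignment problem stated in the abstract can be illustrated with a small sketch. The abstract does not describe the three heuristics, so the following is not one of the paper's algorithms. It is a hypothetical greedy placement under stated assumptions: a page is identified by its physical page number, the reduced tag is taken to be the low-order `tag_bits` bits of that number, and a conflict is taken to mean two pages on the same core sharing a reduced tag. The names `reduced_tag`, `assign_pages`, and `count_conflicts` are invented for illustration.

```python
from collections import defaultdict


def reduced_tag(page_number: int, tag_bits: int) -> int:
    """Assumed model: keep only the low-order tag_bits bits of a page number."""
    return page_number & ((1 << tag_bits) - 1)


def assign_pages(pages, num_cores, tag_bits):
    """Hypothetical greedy heuristic: put each page on the core where it
    collides with the fewest already-placed pages, breaking ties by load."""
    # counts[c][t] = number of pages on core c whose reduced tag is t
    counts = [defaultdict(int) for _ in range(num_cores)]
    load = [0] * num_cores
    assignment = {}
    for page in pages:
        tag = reduced_tag(page, tag_bits)
        # Prefer fewest same-tag pages, then the lightest core, then lowest index.
        core = min(range(num_cores), key=lambda c: (counts[c][tag], load[c], c))
        assignment[page] = core
        counts[core][tag] += 1
        load[core] += 1
    return assignment


def count_conflicts(assignment, tag_bits):
    """Count pages that share a reduced tag with an earlier page on the same core."""
    seen = set()
    conflicts = 0
    for page, core in assignment.items():
        key = (core, reduced_tag(page, tag_bits))
        if key in seen:
            conflicts += 1
        else:
            seen.add(key)
    return conflicts
```

With the made-up page numbers `[0x10, 0x20, 0x30, 0x40, 0x11, 0x21]` and `tag_bits=4`, this sketch places the pages on four cores with no conflicts, while a single core leaves four conflicting pages. That mirrors, in a toy setting, why spreading instruction pages across cores can leave more room for tag reduction than a single core does.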

Author information

Corresponding author

Correspondence to Long Zheng.

Additional information

This work was supported by the National Basic Research 973 Program of China under Grant No. 2007CB310900, the National Natural Science Foundation of China under Grant No. 60725208, and Fellowships of the Japan Society for the Promotion of Science for Young Scientists Program.

About this article

Cite this article

Zheng, L., Dong, MX., Ota, K. et al. Energy Efficiency of a Multi-Core Processor by Tag Reduction. J. Comput. Sci. Technol. 26, 491–503 (2011). https://doi.org/10.1007/s11390-011-1149-0

  • DOI: https://doi.org/10.1007/s11390-011-1149-0
