Reducing energy consumption of parallel sparse matrix applications through integrated link/CPU voltage scaling

The Journal of Supercomputing

Abstract

Power consumption is quickly becoming a first-class optimization metric for many high-performance parallel computing platforms. Voltage scaling is one technique that many prior proposals employ to reduce it, and past research has applied it to different components such as networks, CPUs, and memories. In contrast to most existing efforts on voltage scaling, which target a single component (CPU, network, or memory), this paper proposes and experimentally evaluates a voltage/frequency scaling algorithm that considers the CPUs and the communication links of a mesh network at the same time. More specifically, it scales the voltages/frequencies of the CPUs in the nodes and of the communication links among them in a coordinated fashion (instead of one after another) so that energy savings are maximized without impacting execution time. Our experiments with several tree-based sparse matrix computations reveal that the proposed integrated voltage scaling approach is very effective in practice, bringing 13% and 17% energy savings over the pure CPU and pure communication link voltage scaling schemes, respectively. The results also show that the savings are consistent across different network sizes and different sets of voltage/frequency levels.
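The idea of coordinated scaling can be illustrated with a toy model. The sketch below is not the algorithm from the paper. It assumes the textbook CMOS relation in which dynamic energy per cycle grows with the square of the supply voltage, a hypothetical set of three voltage/frequency levels shared by CPUs and links, and a single node whose computation phase is followed by a communication phase that must both finish within a given deadline (the deadline stands in for slack off the critical path). The names `LEVELS`, `scaled`, and `best_pair` are illustrative assumptions.

```python
from itertools import product

# Hypothetical (supply voltage in volts, frequency relative to maximum) levels.
# Illustrative values only; they are not taken from the paper.
LEVELS = [(1.0, 0.50), (1.2, 0.75), (1.5, 1.00)]
FULL_SPEED = LEVELS[-1]


def scaled(time_at_max, level):
    """Return (time, relative energy) for work taking time_at_max at full speed."""
    voltage, freq = level
    time = time_at_max / freq
    # Assumed model: the cycle count is fixed and energy per cycle ~ V^2,
    # so relative energy ~ V^2 * time_at_max.
    energy = voltage ** 2 * time_at_max
    return time, energy


def best_pair(compute, comm, deadline, cpu_levels=LEVELS, link_levels=LEVELS):
    """Exhaustively pick the (cpu, link) levels with least energy that meet the deadline.

    Returns (energy, cpu_level, link_level), or None if no pair is feasible.
    """
    best = None
    for cpu, link in product(cpu_levels, link_levels):
        t_cpu, e_cpu = scaled(compute, cpu)
        t_link, e_link = scaled(comm, link)
        if t_cpu + t_link > deadline:
            continue  # this pair would lengthen execution beyond the allowed time
        total = e_cpu + e_link
        if best is None or total < best[0]:
            best = (total, cpu, link)
    return best


# Toy workload: 4 time units of compute, 2 of communication, deadline of 9.
joint = best_pair(4.0, 2.0, 9.0)
cpu_only = best_pair(4.0, 2.0, 9.0, link_levels=[FULL_SPEED])
link_only = best_pair(4.0, 2.0, 9.0, cpu_levels=[FULL_SPEED])
print("joint:", joint)
print("CPU only:", cpu_only)
print("link only:", link_only)
```

Because the joint search space contains every choice available to the CPU-only and link-only variants, the joint result in this toy setting can never use more energy than either of them. A real system would additionally have to account for transition overheads, static power, and dependencies between nodes, none of which are modeled here.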

Author information

Corresponding author

Correspondence to Seung Woo Son.

About this article

Cite this article

Son, S.W., Malkowski, K., Chen, G. et al. Reducing energy consumption of parallel sparse matrix applications through integrated link/CPU voltage scaling. J Supercomput 41, 179–213 (2007). https://doi.org/10.1007/s11227-007-0113-9

  • DOI: https://doi.org/10.1007/s11227-007-0113-9
