Extending Amdahl’s law and Gustafson’s law by evaluating interconnections on multi-core processors

The Journal of Supercomputing

Abstract

Multi-core chips are emerging as the mainstream solution for high-performance computing. In multi-core collaboration, communication overheads generally cause large performance degradation, and large-scale interconnects are needed to handle them. Amdahl’s and Gustafson’s laws have been applied to multi-core chips, but inter-core communication has not been taken into account. In this paper, we introduce interconnection into Amdahl’s and Gustafson’s laws so that they model performance more precisely in the multi-core era. We further propose an area cost model and analyse our speedup models under area constraints. From our speedup models we derive optimized parameters, which provide useful feedback to architects in the initial phase of their designs. We also present a case study to show the necessity of incorporating interconnection into Amdahl’s and Gustafson’s laws.
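For orientation, the two classical laws named in the abstract can be sketched in a few lines of Python. The first two functions are the standard textbook forms. The third adds a communication penalty that grows linearly with core count; that term, the name `amdahl_with_overhead`, and the parameter `c` are illustrative assumptions only and are not the interconnection model proposed in the paper, which is not reproduced in this preview.

```python
def amdahl_speedup(f, n):
    """Classical Amdahl's law: fixed problem size.

    f is the parallelizable fraction of the work (0 <= f <= 1),
    n is the number of cores.
    """
    return 1.0 / ((1.0 - f) + f / n)


def gustafson_speedup(f, n):
    """Classical Gustafson's law: problem size scales with n.

    f is the parallel fraction of the scaled workload, n the core count.
    """
    return (1.0 - f) + f * n


def amdahl_with_overhead(f, n, c):
    """Amdahl's law with a HYPOTHETICAL communication term.

    Assumption (not from the paper): inter-core communication adds
    c * (n - 1) to the normalized execution time, where c is a cost
    per additional core expressed as a fraction of the serial runtime.
    """
    return 1.0 / ((1.0 - f) + f / n + c * (n - 1))


if __name__ == "__main__":
    f, c = 0.9, 0.001
    for n in (1, 10, 100):
        print(n,
              round(amdahl_speedup(f, n), 3),
              round(gustafson_speedup(f, n), 3),
              round(amdahl_with_overhead(f, n, c), 3))
```

Even under this simple assumed penalty, the speedup no longer grows monotonically with the core count, which illustrates why a model that ignores inter-core communication can overestimate the benefit of adding cores.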

Acknowledgement

This paper is partially sponsored by the National High-Technology Research and Development Program of China (863 Program) (No.2009AA012201).

Correspondence to Yongxin Zhu.

Huang, T., Zhu, Y., Qiu, M. et al. Extending Amdahl’s law and Gustafson’s law by evaluating interconnections on multi-core processors. J Supercomput 66, 305–319 (2013). https://doi.org/10.1007/s11227-013-0908-9

  • DOI: https://doi.org/10.1007/s11227-013-0908-9
