
TAMER: an adaptive task allocation method for aging reduction in multi-core embedded real-time systems

The Journal of Supercomputing

Abstract

Technology scaling has exacerbated the impact of aging on the performance and reliability of integrated circuits. With the move into the nanotechnology era in recent years, power density per unit area has increased, which leads to higher chip temperatures. Aging in a chip originates from multiple phenomena, all of which are intensified by increased temperature. Several circuit- and architecture-level schemes in the literature have tried to mitigate aging. However, these schemes are not sufficient for multi-core systems because they are unaware of the unique constraints and features of these platforms. In this paper, we propose a system-level aging mitigation method called Adaptive Task Allocation for Aging Reduction in Multi-core Embedded Real-time Systems (TAMER). As a task allocation algorithm, TAMER takes the cores’ utilization and the activity of their internal units into account to smooth the temperature pattern inside the chip. By minimizing both temporal and spatial thermal variations, TAMER prevents the occurrence of hotspots over time and space. We evaluated TAMER using a framework consisting of the gem5 full-system cycle-accurate simulator, MATLAB, the ESESC multi-core simulator, and the HotSpot temperature modeling tool. The simulation results show that TAMER decreases the maximum and average temperature standard deviation of the cores by 56% and 37%, respectively, compared to the best previous temperature-distribution task allocation algorithm. Notably, these improvements impose neither area nor performance overhead on the system.
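To make the idea of utilization-aware allocation and the reported metric more concrete, the following is a minimal sketch, not the TAMER algorithm itself. It assumes a plain worst-fit-decreasing heuristic that places each task on the currently least-utilized core, and it measures spatial thermal variation as the population standard deviation of per-core temperatures. The function names `allocate_worst_fit` and `spatial_std`, the utilization bound of 1.0 per core, and all input values are hypothetical and chosen only for illustration. The abstract does not describe how TAMER weighs the activity of internal units, so that aspect is not modeled here.

```python
from statistics import pstdev


def allocate_worst_fit(task_utils, num_cores):
    """Assign each task to the currently least-utilized core.

    Illustrative worst-fit-decreasing heuristic (an assumption, not TAMER).
    task_utils: list of task utilizations, each in (0, 1].
    Returns (assignment, loads), where assignment maps task index -> core index.
    """
    loads = [0.0] * num_cores
    assignment = {}
    # Place heavier tasks first so lighter tasks can even out the load later.
    for task_id, util in sorted(enumerate(task_utils), key=lambda p: -p[1]):
        core = min(range(num_cores), key=lambda c: loads[c])
        # Assumed schedulability bound: total utilization per core must not exceed 1.0.
        if loads[core] + util > 1.0:
            raise ValueError(f"task {task_id} does not fit on any core")
        loads[core] += util
        assignment[task_id] = core
    return assignment, loads


def spatial_std(core_temps):
    """Population standard deviation of per-core temperatures at one instant."""
    return pstdev(core_temps)


if __name__ == "__main__":
    assignment, loads = allocate_worst_fit([0.5, 0.3, 0.2, 0.4], num_cores=2)
    print(assignment, loads)
    # Hypothetical temperatures in degrees Celsius for a four-core chip.
    print(spatial_std([62.0, 58.0, 60.0, 60.0]))
```

A lower value from `spatial_std` indicates a more uniform temperature map across the cores, which is the direction of improvement the abstract reports.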


Notes

  1. The proposed method will tame the hotspots.


Acknowledgements

Funding was provided by School of Computer Science, Institute for Research in Fundamental Sciences (IPM) (Grant No. 10).

Author information

Corresponding author

Correspondence to Hamed Farbeh.


About this article


Cite this article

Saadatmand, F.S., Rohbani, N., Baharvand, F. et al. TAMER: an adaptive task allocation method for aging reduction in multi-core embedded real-time systems. J Supercomput 77, 1939–1957 (2021). https://doi.org/10.1007/s11227-020-03326-7


  • DOI: https://doi.org/10.1007/s11227-020-03326-7
