Analysis of integrated circuits thermal dynamics with point heating time

https://doi.org/10.1016/j.mejo.2010.09.011

Abstract

The article presents an analysis of thermal dynamics in integrated circuits using a new method. The introduced variable that defines the dynamic state of an IC is point heating time (PHT). The authors examine the dynamic behaviour of the integrated circuit caused by the thermal activity of one and many functional modules on the multi-core chip. Various substrate materials were analysed and the PHT value was compared to the mean temperature of the integrated circuit and its time constant. The PHT value is also analysed as a variable dependent on the distance from the heat centre of the heating module. The analytical equation, which is a result of the analysis, can describe the PHT value versus distance for the whole chip surface. Further analysis helps verify the hypothesis using a massive multi-core integrated circuit. The result can be used to increase the thermal efficiency of multi-core integrated circuits by thermal-aware flow modification of the scheduling algorithm.

Introduction

The current trend in general-purpose processor design is toward multi-core processing units [1]. The design principle is to place many functionally identical, independent processing units on one die. This may help sustain Moore's law, because there is currently no visible roadmap for processors that gain performance from higher clock frequency alone [2], [3]. This trend has led to products such as the Intel Core i7 processor [4] (Fig. 1), to laboratory designs such as the 80-core Intel Polaris [5] (Fig. 2), and to the IBM/Toshiba CELL processor [6] (Fig. 3). Despite the complicated internal structure, designers can determine power consumption for the whole chip surface. However, when the chip consists of many independent modules on a relatively large die, temperatures are still difficult to keep at acceptable levels [25], [26], [27]. The difficulty arises because the processor is operated for maximum performance close to the thermal boundary of the integrated circuit for a given technology, while the maximum temperature must not exceed the critical thermal level. Current thermal design practice does not take the dynamic aspect of thermal behaviour into account.

An additional factor with an increasingly significant impact on the computational capabilities of integrated circuits is their thermal limits. Integrated circuit designs need an effective method of heat dissipation, so designers include increasingly complicated active cooling techniques. For the most popular CMOS process, the thermal boundary is set at 125 °C [7]. An active cooling mechanism reacts to a temperature change with a considerable delay after its cause. A steep temperature rise on the chip surface (generated, e.g., by increased activity of a computation module) is therefore unlikely to be neutralized by passive and active cooling. This is the main reason why cooling mechanisms are used to hold the working temperature of all current processor designs down to acceptable levels. The main goal of this paper is to present a new method for analysing the dynamic thermal behaviour of a multi-core integrated circuit.

In [8], the partitioning of dissipated power into dynamic and static losses emphasizes the rising share of the latter in total power. Nonetheless, the analysis in [9] shows that, as microelectronics evolves, dynamic power will still have considerable influence on the total power loss in an integrated circuit. This is a strong justification for research on minimizing dynamic power loss and its influence on the thermal working boundaries of integrated circuit design. Much research has been done on minimizing total power dissipation. The most popular techniques include dynamic voltage scaling [10], [11], [12], [13], [14], [15] and dynamic frequency scaling [16], [17], [18], [19]. Dynamic power loss [20] is described as $P_{dyn} = f C_L V_{DD}^2$, where $f$ is the chip working frequency, $C_L$ is the load capacitance, and $V_{DD}$ is the supply voltage.
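A minimal numerical sketch of the dynamic power relation, showing why voltage scaling has a stronger effect than frequency scaling: the loss is linear in the working frequency but quadratic in the supply voltage. The function name and the operating point (3 GHz, 1 nF effective load capacitance, 1.2 V) are illustrative assumptions, not values from the paper.

```python
def dynamic_power(f_hz: float, c_load_f: float, v_dd_v: float) -> float:
    """Dynamic power loss P_dyn = f * C_L * V_DD^2, in watts."""
    return f_hz * c_load_f * v_dd_v ** 2


# Hypothetical operating point (assumed values, for illustration only).
f, c_load, v_dd = 3.0e9, 1.0e-9, 1.2

baseline = dynamic_power(f, c_load, v_dd)
freq_scaled = dynamic_power(0.8 * f, c_load, v_dd)   # 20% lower frequency
volt_scaled = dynamic_power(f, c_load, 0.8 * v_dd)   # 20% lower supply voltage

print(f"baseline:         {baseline:.2f} W")
print(f"frequency scaled: {freq_scaled / baseline:.2f} x baseline")
print(f"voltage scaled:   {volt_scaled / baseline:.2f} x baseline")
```

Under these assumptions, the same 20% reduction removes 20% of the dynamic loss when applied to frequency and 36% when applied to voltage.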

Based on thermal analysis, the authors proposed a novel method for asynchronous control of functional module activity in an integrated circuit. In [21], the authors present an example activity control mechanism that minimizes the peak temperature of an integrated circuit. The switching frequency of module activity influences the peak temperature. Note that the switching frequency refers to switching between the active and passive modes of an integrated circuit module, not to the working frequency of the integrated circuit. Additionally, the algorithm switches activity between modules in order to spread the thermal activity over the whole surface of the integrated circuit. An example mechanism that switches between cores is shown in Fig. 4, where the activity is spread across the available cores.

As stated in [21], the temperature caused only by active power dissipation during activity switching conforms to the equation $T \sim \frac{1}{\omega}$, where $T$ is the temperature and $\omega = 2\pi f$ is the pulsation (angular frequency) of the switching activity. The activity switching method is illustrated in Fig. 5. Asynchronous activity switching occurs when the temperature of the integrated circuit exceeds a defined level, marked in the figure as the over-temperature area.
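The threshold-triggered switching described here can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not the algorithm from [21]: each core follows a first-order heating and cooling model, temperatures are rises above ambient, the activity moves to the coolest core once the active core exceeds the threshold, and all names and parameter values are hypothetical.

```python
def next_active_core(temps, active, threshold):
    """Keep the current core unless it is over the threshold; then pick the coolest core."""
    if temps[active] <= threshold:
        return active
    return min(range(len(temps)), key=lambda i: temps[i])


def simulate(n_cores=4, steps=500, dt=0.01, tau=0.5, t_rise=100.0, threshold=40.0):
    """Toy model: the active core heats towards t_rise, idle cores cool towards 0.

    Returns the final temperatures, the active-core history and the peak temperature.
    All parameter values are assumptions chosen for illustration.
    """
    temps = [0.0] * n_cores
    active = 0
    history = []
    peak = 0.0
    for _ in range(steps):
        for i in range(n_cores):
            target = t_rise if i == active else 0.0
            temps[i] += (target - temps[i]) * dt / tau  # explicit Euler step
        peak = max(peak, max(temps))
        active = next_active_core(temps, active, threshold)
        history.append(active)
    return temps, history, peak


_, history, peak = simulate()
_, _, peak_fixed = simulate(threshold=float("inf"))  # activity never leaves core 0

print(f"peak with switching:    {peak:.1f}")
print(f"peak without switching: {peak_fixed:.1f}")
print(f"cores used: {sorted(set(history))}")
```

In this toy model the peak stays close to the threshold when activity is spread over the cores, whereas a single permanently active core approaches its full steady-state rise.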

For these reasons there is a strong demand to estimate and define the dynamic power loss in real time. The best solution is not only to estimate the current power loss but also to predict and control power dissipation on the basis of the computation flow, before real damage is caused at the hardware level.

In order to define the dynamic thermal behaviour of an integrated circuit, a corresponding variable has to describe the dependencies that connect the physical components of the integrated circuit with its thermal response to a test pattern [22]. The proposed variable describes the influence of one thermally active module on the total chip temperature and its dynamic change during computation of a stream of data. To investigate the thermal dynamics of any integrated circuit, the point heating time (PHT) value is introduced. The PHT value allows us to compare the physical and thermal properties of an integrated circuit. The next section covers the PHT definition and the method of calculating it from initial thermal simulation data.

Section snippets

Point heating time

The point heating time (PHT) value is the duration of temperature growth at any measurement point in the chip volume, caused by a heat source located on the chip surface. If we assume that the temperature of an integrated circuit or micro-circuit reaches its stationary state after the time 3τ, then τ is the point heating time value for the mean temperature [23]. This value is derived from the equation that describes the mean temperature of the integrated circuit [22]: $T_m(t) = \sum_{i=1}^{N} \frac{P_i}{2\alpha L_x L_y}\left[1 - \exp(-t/\tau)\right]$, where $\tau = \frac{\rho c L_z}{2\alpha}$.
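A short numerical sketch of these two relations, reading the mean temperature as a first-order heating curve, T_m(t) = Σ P_i / (2α L_x L_y) · [1 − exp(−t/τ)] with τ = ρ c L_z / (2α). The symbol meanings (ρ density, c specific heat, α heat transfer coefficient, L_x, L_y, L_z chip dimensions), the approximate silicon properties, the value of α and the 10 W source are assumptions for illustration. The helper `time_to_fraction` reads a time constant off a sampled trace at the 1 − 1/e point, which is one possible extraction rule and may differ from the authors' PHT calculation.

```python
import math


def time_constant(rho, c, l_z, alpha):
    """tau = rho * c * L_z / (2 * alpha), in seconds (SI inputs)."""
    return rho * c * l_z / (2.0 * alpha)


def mean_temperature(t, powers, alpha, l_x, l_y, tau):
    """Mean temperature rise at time t for sources with powers P_i (assumed heating form)."""
    steady_state = sum(powers) / (2.0 * alpha * l_x * l_y)
    return steady_state * (1.0 - math.exp(-t / tau))


def time_to_fraction(times, temps, fraction=1.0 - math.exp(-1.0)):
    """First sampled time at which the trace reaches `fraction` of its final value."""
    target = fraction * temps[-1]
    for t, temp in zip(times, temps):
        if temp >= target:
            return t
    raise ValueError("trace never reaches the requested fraction")


# Assumed values: approximate silicon properties, hypothetical heat transfer coefficient.
RHO_SI = 2330.0                          # kg/m^3, approximate
C_SI = 700.0                             # J/(kg*K), approximate
ALPHA = 1000.0                           # W/(m^2*K), hypothetical
L_X, L_Y, L_Z = 20e-3, 10e-3, 0.625e-3   # m, test-model dimensions used in the paper
POWERS = [10.0]                          # W, hypothetical single heat source

tau = time_constant(RHO_SI, C_SI, L_Z, ALPHA)
steady = sum(POWERS) / (2.0 * ALPHA * L_X * L_Y)
times = [k * tau / 1000.0 for k in range(10001)]   # samples from 0 to 10*tau
trace = [mean_temperature(t, POWERS, ALPHA, L_X, L_Y, tau) for t in times]
at_3tau = mean_temperature(3.0 * tau, POWERS, ALPHA, L_X, L_Y, tau)

print(f"tau = {tau:.3f} s")
print(f"fraction of steady state at 3*tau: {at_3tau / steady:.3f}")
print(f"time constant read from the trace: {time_to_fraction(times, trace):.3f} s")
```

After 3τ the curve has covered about 95% of its rise, which is the sense in which the stationary state is taken to be reached at 3τ.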

Point heating time for substrates

The authors analyze PHT changes for different substrate materials on which an active module is placed. For the analysis, the authors chose silicon (Si), gallium arsenide (GaAs), copper (Cu), aluminium (Al) and aluminium oxide (Al2O3). As a test model, let us define a single heating module (M1) of size 4×4 mm² placed on a 20×10 mm² surface plane. The total thickness of the integrated circuit is 0.625 mm (the thickness of a 5″ silicon wafer). Dimensions of the test model are shown in Fig. 7. The boundary

Point heating time for different chip thickness

According to Eq. (4), the time constant value (and consequently the PHT value) depends on chip thickness. The next analysis compares chip thicknesses. The same test model as described in the previous section is used. For the comparison, the authors chose a silicon substrate with standard wafer thicknesses of 0.275, 0.375, 0.625 and 0.925 mm. The PHT and time constant values for these thicknesses are presented in Table 2.
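Reading Eq. (4) as τ = ρ c L_z / (2α), the time constant is linear in the thickness L_z, so the expected ratios between the listed wafer thicknesses follow directly if the material properties and the heat transfer coefficient α are held fixed. The ratios below are plain arithmetic on the thicknesses (in mm), not values taken from Table 2:

```latex
\frac{\tau(L_{z,2})}{\tau(L_{z,1})}
  = \frac{\rho c L_{z,2} / (2\alpha)}{\rho c L_{z,1} / (2\alpha)}
  = \frac{L_{z,2}}{L_{z,1}},
\qquad
\frac{\tau(0.375)}{\tau(0.275)} \approx 1.36, \quad
\frac{\tau(0.625)}{\tau(0.275)} \approx 2.27, \quad
\frac{\tau(0.925)}{\tau(0.275)} \approx 3.36
```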

The data show

Point heating time for distance estimation

In order to compute the PHT value for the whole chip surface, the authors chose a test model identical to the one in the previous section. The chip thickness was 0.625 mm, the substrate material was silicon. The heating module was placed according to Fig. 8. The PHT value for the chip surface is presented in Fig. 9.

Fig. 8 shows regular change of the PHT value depending on the distance from an active module. The PHT value versus distance from the heat centre has been calculated. Results are

Test models for multi-source case

The analysis of concurrent use of multi-core modules is presented for two cases, which differ in the total core (module) count. Let us define an integrated circuit with dimensions 20×10 mm² in the XY plane. The chip thickness is taken as that of the most common silicon wafers, that is, 0.275 mm, 0.625 mm and 0.925 mm.

Let us define the core count and its placement. For research purposes, two cases were defined. The first scenario had two functional modules placed

Multi-source case results

Analysis of dynamic PHT values was performed for the two presented test models. The simulation data were collected regarding minimal and maximal value of dynamic PHT for a given case. The PHT value computed from the mean temperature – identically to the τ value (Eq. (4)) – was computed for comparison.

The values above were computed for any possible combination of functional module activity. Results for the two active modules are presented in Fig. 14. Colour bars represent dynamic PHT variation.

ASTER—control algorithm verification

In order to check the theoretical conclusions presented above, one needs to define a simulation case for a multi-core integrated circuit. As was mentioned in the introduction, massive multi-core systems are going to be a major player in the computing industry. Because of this, the authors are going to analyze thermal aspects of the Intel Polaris processor. The processor consists of 80 cores placed regularly on the chip surface. The area of the chip – 21×10 mm² – has on the basic plane nothing

ASTER, round-robin, random—results

Based on the presented principles for all three algorithms, a new software testbench was created, which performed the following simulation steps:

  • Test program definition: the program consists of two types of instructions (A and B), with a computation time of 20 ms for type A and 30 ms for type B. The software randomly selects instructions of type A or B as long as the total code length is less than 2 s.

  • Program data is forwarded to the three algorithms—ASTER using 100% of available cores, round-robin using
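The first step in the list above, generating the random test program, can be sketched as follows. This is a minimal illustration, not the authors' testbench: the function names, the seed and the equal selection probability for both types are assumptions. The loop takes the stopping rule literally (keep selecting while the total is below 2 s), so the last instruction may push the total slightly past 2 s.

```python
import random

DURATION_MS = {"A": 20, "B": 30}   # computation time per instruction type, from the text
LIMIT_MS = 2000                    # 2 s total code length, from the text


def generate_program(rng: random.Random, limit_ms: int = LIMIT_MS) -> list:
    """Append randomly chosen A/B instructions while the total length is below limit_ms."""
    program = []
    total = 0
    while total < limit_ms:
        kind = rng.choice(["A", "B"])   # equal probability is an assumption
        program.append(kind)
        total += DURATION_MS[kind]
    return program


def program_length_ms(program) -> int:
    """Total computation time of a program, in milliseconds."""
    return sum(DURATION_MS[kind] for kind in program)


program = generate_program(random.Random(42))   # fixed seed for repeatability
total_ms = program_length_ms(program)
print(len(program), "instructions,", total_ms, "ms")
```

The same generated program would then be handed to each scheduling algorithm so that all of them are compared on identical input.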

Conclusions

In this article the authors presented an analysis of integrated circuit thermal dynamics. The research connects the thermal state of the chip with its physical description. PHT mean values were compared with the analytical time constant values. Different substrate materials and chip thicknesses were examined. The simulation results agree with the analytical equations: the values differ by no more than 10%. For the given test materials, silicon has the lowest PHT value from

Acknowledgement

This work was prepared as a part of project grant number N R13 0065 10 and paid for by the National Center for Research and Development, Poland.

References (27)

  • S. Mikula et al., Asynchronous control of modules activity in integrated system for reducing peak temperatures, Integration, the VLSI Journal (2008)
  • S. Borkar, Thousand core chips: a technology perspective, in: Proceedings of the 44th Annual Conference on Design...
  • J. Bautista, Tera-scale computing – motivation and challenges – Conference Computing of the Future—Energy-Efficient...
  • J. Shalf, B. Tchudi, S. Elbert et al., Power, cooling, and energy consumption for the petascale and beyond, in:...
  • Intel Core™ i7 Processor Extreme Edition Series and Intel Core™ i7 Processor, Specification update, Intel...
  • T. Mattson, Scalable software for many core chips – programming Intel's 80-core research chip, in: Proceedings of...
  • D. Pham, H. Anderson, E. Behnen, Bolliger, et al., Key features of the design methodology enabling a multi-core SoC...
  • U. Paschen, D. Dittrich, H. Vogt, N. Kordas, High temperature CMOS process with dielectric isolation, in: Proceedings of...
  • Dirk Grunwald, Philip Levis, Keith I. Farkas, Charles B. Morrey III, Michael Neufeld, Policies for dynamic clock...
  • A. Gołda, A. Kos, Energy losses in digital CMOS integrated circuits: state-of-the-art and future trends, in:...
  • K. Flautner, S. Reinhardt, T. Mudge, Automatic performance setting for dynamic voltage scaling. Wireless Networks 8, 5...
  • T. Horvath, Dynamic voltage scaling in multitier web servers with end-to-end delay control, IEEE Transactions on Computers (2007)
  • W. Kim, J. Kim, S.L. Min, Preemption-aware dynamic voltage scaling in hard real-time systems, in: Proceedings of the...