1 Introduction

Driven by new fabrication technologies, growing design complexity and the integration of multiple embedded cores, System-on-Chip (SoC) and Very Large Scale Integration (VLSI) designs are evolving rapidly [4]. Built-in self-test (BIST) is widely applied in Design for Testability (DFT), and Automatic Test Pattern Generation (ATPG) is central to these test technologies, which face a series of thorny issues such as high test power dissipation, large test data volume and extra area overhead.

Generally, test data volume can be reduced through three test compression techniques: linear decompression based schemes, broadcast scan-based schemes and code based schemes [13], which are applied directly to existing test patterns and avoid any additional ATPG or fault simulation [15]. Among them, the code based scheme does not require circuit structural information, which makes it suitable for intellectual property cores, so it is preferred for test data compression. Representative codes include Huffman coding [9], Golomb coding [3], Count Compatible Pattern Run-Length (CCPRL) coding [16], Frequency-Directed Run-Length (FDR) coding [5], Alternating Run-Length (ARL) coding [6] and Extended Frequency-Directed Run-Length (EFDR) coding [7].

In addition, a circuit or system with increased switching activity consumes more power in test mode than in normal mode [2], because of the logic state transitions within each test pattern and between consecutive test patterns. The surge in scan-in test power caused by the instantaneous current in test mode may severely damage a circuit with excessively high switching activity. Two kinds of techniques have therefore been developed to reduce test power dissipation effectively. One is to modify conventional LFSR configurations, as in LP-TPG [1], DS-LFSR [14] and LT-TPG [12]. The other is to decrease the number of transitions during Circuit under Test (CUT) testing, for example by test pattern reordering, scan cell reordering and favorable X-filling schemes [2].

This paper proposes a power efficient BIST TPG method based on don't care bit based 2-D adjusting, Hamming Distance based 2-D reordering and ASFDR encoding with the MT-filling scheme. The method is verified on the six largest ISCAS'89 benchmark circuits. It not only effectively decreases scan-in test power dissipation, but also achieves a high compression ratio, consumes less test application time and avoids extra area overhead.

2 Background

Three concepts are involved in the proposed BIST TPG method.

2.1 Scan-in Test Power Dissipation Model

The test power dissipation caused by switching activities can be effectively reduced by decreasing the number of logic state transitions in the test set [2]. Scan-in test power dissipation is proportional to the switching activity of the test pattern. In addition, the scan-in operation, the scan-out operation and the transition between the two [11] all contribute to the number of logic state transitions and to test power dissipation.

Since the scan-out response heavily depends on the scan-in test pattern, we focus only on the scan-in operation. The weighted transition metric (WTM) model [4] estimates the scan-in test power dissipation of a given test pattern, which depends not only on the number of logic state transitions but also on their relative positions.

Consider a test set T = {T1, T2, T3, ⋯, Tm} in which each test pattern has length n. Every test pattern can be expressed as Ti = {ti1, ti2, ⋯, tin}, where tij denotes the jth bit of the ith test pattern, 1 ≤ i ≤ m, 1 ≤ j ≤ n. The weighted transition metric WTMi, the average scan-in power dissipation Pavg and the peak scan-in power dissipation Ppeak are then estimated as follows:

$$ {\mathrm{WTM}}_{\mathrm{i}}={\displaystyle {\sum}_{\mathrm{j}=1}^{\mathrm{n}-1}\left(n-j\right)\ast}\left({\mathrm{t}}_{\mathrm{i},\mathrm{j}}\oplus {\mathrm{t}}_{\mathrm{i},\mathrm{j}+1}\right) $$
(1)
$$ {\mathrm{P}}_{\mathrm{avg}}=\frac{{\displaystyle {\sum}_{\mathrm{i}=1}^{\mathrm{m}}{\mathrm{WTM}}_{\mathrm{i}}}}{\mathrm{m}} $$
(2)
$$ {\mathrm{P}}_{\mathrm{peak}}= \max {}_{1\le \mathrm{i}\le \mathrm{m}}{\mathrm{WTM}}_{\mathrm{i}} $$
(3)

According to the above formulas, three parameters can be calculated to evaluate the test power dissipation of a test pattern during the scan-in operation. An important conclusion can be drawn from this scan-in power model: besides the logic state transitions themselves, different X-filling schemes [2] and different ordering strategies [10] all affect test power dissipation.
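The three estimates above can be sketched in a few lines of code. This is a minimal illustration of formulas (1)–(3), with test patterns given as bit strings; the function names are ours, not from the paper.

```python
def wtm(pattern):
    """Weighted transition metric of one fully specified pattern (formula 1)."""
    n = len(pattern)
    # A transition between bits j and j+1 (1-based) is weighted by n-j,
    # i.e. by how far it travels down the scan chain.
    return sum((n - 1 - j) * (pattern[j] != pattern[j + 1])
               for j in range(n - 1))

def avg_power(test_set):
    """Average scan-in power dissipation P_avg (formula 2)."""
    return sum(wtm(t) for t in test_set) / len(test_set)

def peak_power(test_set):
    """Peak scan-in power dissipation P_peak (formula 3)."""
    return max(wtm(t) for t in test_set)
```

For example, the pattern "0101" of length 4 has transitions at weights 3, 2 and 1, giving WTM = 6.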

2.2 Test Power Dissipation Estimated for Filled Test Set

In scan-based BIST, a test pattern that detects many targeted faults may still contain a large number of don't care bits (X) [2]. In conventional scan ATPG, each X bit is filled with 0 or 1 at random, since this does not affect fault coverage; in practice, the number of X bits in a test set is typically large. A power-aware X-filling scheme instead assigns 0 or 1 to the X bits so that the number of logic state transitions in the scan cells is minimized, which reduces the overall switching activity in the Circuit under Test (CUT) during shift cycles [1]. Table 1 shows a WTM comparison for two groups of test sets, each with four test patterns, filled by different schemes: 0-filling, 1-filling and Minimum Transitions filling (MT-filling) [2].

Table 1 WTM comparison of two test sets

For the two groups of test sets above, the 0-filling and 1-filling schemes show similar WTM values, whereas the MT-filling scheme yields an obvious reduction in test power dissipation. MT-filling is therefore regarded as the power efficient filling method; both scan-in test power dissipation and compression ratio are considered further in the proposed BIST TPG method.
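A common formulation of MT-filling lets every X bit copy the nearest preceding specified bit, so adjacent bits repeat and no new transitions are introduced; the sketch below assumes that formulation (leading X bits copy the first specified bit, and an all-X pattern defaults to zeros).

```python
def mt_fill(pattern):
    """MT-filling sketch: each X repeats the last specified bit seen."""
    bits = list(pattern)
    # Leading Xs copy the first specified bit; default '0' if all bits are X.
    last = next((b for b in bits if b != 'X'), '0')
    for i, b in enumerate(bits):
        if b == 'X':
            bits[i] = last
        else:
            last = b
    return ''.join(bits)
```

For instance, "0XX10" becomes "00010", which has the fewest transitions possible given its specified bits.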

2.3 Run Length Based Code Scheme

The run length based code scheme has been widely applied to test data compression in BIST. Jas and Touba proposed run-length codes that encode runs of 0s to reduce test data volume. Golomb code, proposed by Chandra and Chakrabarty, encodes runs of 0s with a variable length code that allows efficient encoding of longer runs [1]; however, it requires a synchronization mechanism between the tester and the chip. They then proposed a new scheme, the Frequency-Directed Run-length (FDR) code [5], whose group size is variable compared with the Golomb code. The FDR code is very efficient for compressing test data that has few 1s but long runs of 0s. Maleh and Abaji further proposed an Extension of FDR (EFDR) that encodes both types of runs to remedy the defects of FDR; as a result, the EFDR code outperforms the FDR code when the test data has few 0s [7].
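As a concrete illustration of the FDR code just described: run lengths are partitioned into groups Ak covering 2^k − 2 through 2^(k+1) − 3, the prefix is (k − 1) ones followed by a '0', and the tail is the k-bit offset within the group. The sketch below follows that construction; the function name is ours.

```python
def fdr_encode(run_length):
    """Return the FDR codeword (bit string) for one run of 0s."""
    k = 1
    group_min = 0                      # smallest run length in group A_k
    # Group A_k holds 2**k run lengths; advance until run_length fits.
    while run_length > group_min + 2 ** k - 1:
        group_min += 2 ** k
        k += 1
    prefix = '1' * (k - 1) + '0'
    tail = format(run_length - group_min, '0{}b'.format(k))
    return prefix + tail
```

So a run of length 0 encodes as "00", length 1 as "01", and length 3 (group A2, offset 1) as "1001".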

To further improve the compression ratio, Hellebrand and Wurtenberger proposed an evolution of the alternating run-length code called Alternating Shifted FDR (ASFDR) to reduce SoC test data volume [8]. In an alternating code no run has length zero, so the codeword for run length 0 is unnecessary; this codeword is reassigned to run length 1, and every codeword is shifted one position higher. ASFDR thus obtains a higher compression ratio than Alternating FDR. A test data compression example for two test sets is shown in Table 2.

Table 2 Test data compression example

The coding results show that the ASFDR scheme achieves a higher compression ratio than the EFDR scheme. In addition, an on-chip decoder with acceptable area overhead is required to load the encoded test data from the Automatic Test Equipment (ATE).

3 Proposed BIST TPG Method

Hamming distance based reordering [11] relies on the following two facts: ① ATPG-derived test patterns contain a large number of don't care bits [2]. ② Test patterns for stuck-at faults can be reordered without any loss of fault coverage, provided that the corresponding fault-free outputs stored in the ATE as golden references are reordered in the same sequence. To reduce the scan-in power dissipation of the test patterns and obtain a high compression ratio in run-length coding, the test set is preprocessed according to the following procedure.

A given test set T = {O1, O2, O3, O4}, composed of four ten-bit test patterns, is used as a running example.

  • O1: 0010100011

  • O2: 0XX1000110

  • O3: 11X01XX01X

  • O4: 10XX1XXXXX

  1. Don't care bit based first adjusting

    The test patterns are sorted by the number of don't care bits in each pattern, in descending order, giving the following sequence:

    • O4: 10XX1XXXXX

    • O3: 11X01XX01X

    • O2: 0XX1000110

    • O1: 0010100011

    During the don't care bit based first adjusting, the positions of the test patterns within the test set are interchanged, so that patterns with more don't care bits move toward the front as much as possible.

  2. Hamming distance based first reordering

    The test pattern (O4) with the maximum number of don't care bits occupies the first position in the above sequence, so it is selected as the first test pattern of the reordered list. The pattern with the most don't care bits is chosen first because it offers the greatest flexibility for don't care bit mapping.

    The Hamming Distance between two test patterns is defined as the number of corresponding mismatched bits, computed position by position [11]. The Hamming Distances from the first test pattern (O4) in the adjusted list to all remaining patterns (O3, O2, O1) are calculated, and the pattern with the minimum Hamming Distance from O4 is placed next to it.

    To find the Hamming distance between the first test pattern (O4) and test pattern (O3), each bit of O4 is compared with the bit in the same position of O3: if O4(i) = O3(i), O4(i) = X or O3(i) = X, then Hd(i) = 0; otherwise Hd(i) = 1. The total Hamming Distance between O4 and O3 is the sum of the Hd(i). For the given test set, the Hamming distances of all test patterns are calculated below.

    • O4: 10XX1XXXXX

    • O3: 11X01XX01X (Hd between O4 and O3 is 1)

    • O2: 0XX1000110 (Hd between O4 and O2 is 2)

    • O1: 0010100011 (Hd between O4 and O1 is 1)

    As test pattern O3 has the minimum Hamming distance from the first test pattern O4, it is placed in the second position of the new list. The remaining patterns O2 and O1 are then compared with O3, and O1 is selected and placed according to its Hamming distance from the second test pattern O3. The reordering continues until the last test pattern has been placed. The reordered test set is as follows:

    • O4: 10XX1XXXXX

    • O3: 11X01XX01X

    • O1: 0010100011

    • O2: 0XX1000110
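The X-tolerant distance and the greedy nearest-neighbor reordering above can be sketched as follows. The paper does not state how ties are broken; this sketch assumes the earlier pattern in the current list wins a tie, which reproduces the example's result.

```python
def hd(p, q):
    """Hamming distance treating 'X' as compatible with any bit."""
    return sum(a != b and a != 'X' and b != 'X' for a, b in zip(p, q))

def hd_reorder(patterns):
    """Greedy reordering: start from patterns[0], repeatedly append the
    remaining pattern closest (in hd) to the last placed one."""
    ordered, pool = [patterns[0]], list(patterns[1:])
    while pool:
        # min() keeps the first minimal element, i.e. ties favor list order.
        nxt = min(pool, key=lambda p: hd(ordered[-1], p))
        pool.remove(nxt)
        ordered.append(nxt)
    return ordered
```

Applied to the adjusted list (O4, O3, O2, O1), this yields the order O4, O3, O1, O2 shown above.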

  3. The first matrix transpose

    Consider the given test set as a matrix of size 4 × 10, where 4 is the number of test patterns and 10 is the number of bits in each pattern. To cluster the don't care bits in each test pattern, columns with matching bits should be placed close together. The transpose of the reordered test set is taken:

    • T1: 1100

    • T2: 010X

    • T3: XX1X

    • T4: X001

    • T5: 1110

    • T6: XX00

    • T7: XX00

    • T8: X001

    • T9: X111

    • T10: XX10
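The transpose step itself is a one-liner over equal-length bit strings; this small sketch (our naming) produces the column list T1–T10 above from the reordered patterns.

```python
def transpose(patterns):
    """Transpose a list of equal-length bit strings (rows <-> columns)."""
    return [''.join(col) for col in zip(*patterns)]
```

Transposing twice recovers the original matrix, which is why the procedure can return to pattern form in step (6).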

  4. Don't care bit based second adjusting

    All rows are sorted by the number of don't care bits in descending order; row T3, which has the most don't care bits, becomes the new first row.

    • T3: XX1X

    • T6: XX00

    • T7: XX00

    • T10: XX10

    • T2: 010X

    • T4: X001

    • T8: X001

    • T9: X111

    • T1: 1100

    • T5: 1110

    In the don't care bit based second adjusting, the positions of the don't care bits within each test pattern are effectively interchanged, moving the don't care bits toward the front as much as possible.
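Both adjusting steps (1) and (4) amount to a stable descending sort on the X count, which can be sketched as:

```python
def adjust_by_x_count(patterns):
    """Stable sort: patterns with more don't care bits come first."""
    return sorted(patterns, key=lambda p: -p.count('X'))
```

Because Python's sort is stable, rows with equal X counts keep their relative order, which matches the T3, T6, T7, T10, ... sequence shown above.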

  5. Hamming distance based second reordering

    The new first row T3 remains fixed, and the Hamming Distances from T3 to all remaining rows are calculated and compared. As row T10 has the minimum Hamming distance from row T3, it is placed in the second position of the new list. The result of the Hamming distance based second reordering is as follows:

    • T3: XX1X

    • T10: XX10

    • T5: 1110

    • T6: XX00

    • T7: XX00

    • T2: 010X

    • T1: 1100

    • T4: X001

    • T8: X001

    • T9: X111

  6. The second matrix transpose

    At the end of the second Hamming distance based reordering, the reordered list is obtained and transposed again. The required test set is then read from this transpose matrix, composed of the reordered test patterns:

    • N1: XX1XX01XXX

    • N2: XX1XX11001

    • N3: 1110000001

    • N4: X0000X0111

  7. MT-filling scheme

    The proposed TPG method is applied to the test set with four test patterns. Before the scan-in test power dissipation and the compression ratio are evaluated, the don't care bits in the test set are completely filled, as shown in Table 3.

    Table 3 Compression result comparison with different method

    As Table 3 shows, the X-filling scheme applied to the test set strongly affects both the compression ratio and the scan-in test power dissipation. MT-filling usually decreases the number of switching activities more than 0-filling or 1-filling. Here, the don't care bit based 2-D adjusting effectively reduces the WTM value, and the Hamming distance based 2-D reordering decreases the test power dissipation further.

4 The Decoder Architecture and Area Overhead

The run length based code scheme needs an on-chip decoder that loads the encoded test data from the ATE and then decodes it for on-chip scan testing. To show that the scheme holds in an actual chip design, the test application time, the decoder architecture and the decoder area overhead are also considered in the proposed scheme.

4.1 The Decoder Architecture

Figure 1 improves the decoder by inserting a bit swapping logic array before the test patterns are injected into the single/parallel scan chains for on-chip testing. For all test sets, the don't care bit adjusting and the bit swapping within each test pattern can be performed by programming the on-chip FPGA. Provided the BIST state machine is well controlled at higher clock speeds to generate test patterns at normal speed, the don't care bit based 2-D adjusting and Hamming Distance based 2-D reordering perform well during test set encoding and decoding.

Fig. 1

The decoder architecture

4.2 The Decoder Area Overhead

Table 4 compares the decoder area overhead of different coding methods. The decoder area overhead is computed as (area of decoder × 100)/(area of benchmark circuit), where the area of the decoder includes the hardware overhead of the core FSM decoder and the bit swapping logic array. The proposed BIST TPG method applied with EFDR/ASFDR requires only the basic decoder and completely avoids the difference test pattern method [11]. It therefore requires no CSR, involves no delay caused by a CSR, and only needs different routing for the scan-in operation.

Table 4 The decoder area overhead

Compared with the CSR decoder structure, the proposed BIST TPG method needs a simpler decoder architecture. The improved decoder consumes no extra area overhead, so it is easy to implement in an actual chip design.

4.3 A Decoding Example

Recalling the test set given in section 3, the test set T with four original test patterns {O1, O2, O3, O4} is turned into the test set N with four new test patterns {N1, N2, N3, N4} and then encoded by EFDR/ASFDR. The encoded test set must be decoded before it is applied to scan testing; however, the encoding does not preserve the scan-in order of the bits. Bit swapping within each test pattern is therefore performed during the decoding process, and a routing decoder for the given test set is designed. Figure 2 shows how the encoded test set is decoded by the don't care bit adjusting and bit swapping operations and then loaded into the scan chain for on-chip testing.

Fig. 2

The decoder architecture of a given test set

The power efficient test set consumes less test power while maintaining a high compression ratio. Since the architectural complexity of the BIST circuit consumes more power, especially in deep submicron designs with high transistor leakage, this paper evaluates scan-in test power dissipation with the WTM model; each ISCAS'89 benchmark circuit is further synthesized and simulated at the gate level in section 5.

4.4 Performance Comparison with Previous Works

This paper provides an extensive discussion of decoder area overhead, scan-in test power dissipation, test application time and algorithmic complexity. Table 5 compares the performance of several related evolution methods.

Table 5 Performance comparison of some evolution methods

Compared with existing commercial methods, the proposed BIST TPG method has obvious advantages. Test power dissipation is reduced without affecting test application time, and the improved decoder is implemented without extra area overhead, which indicates that the scheme holds in an actual chip design.

5 Simulation Experiment and Results Analysis

5.1 Compression Ratio Comparison

The proposed BIST TPG method is verified on ISCAS'89 benchmark circuits with full scan chains. The test sets obtained from the Mintest ATPG program are used for experimental verification, performed on a workstation with a 2.5 GHz Intel Core processor and 4 GB of memory. Before the test patterns are injected into the scan chain, don't care bit based 2-D adjusting and Hamming Distance based 2-D reordering are applied in turn to the Mintest test set; the don't care bits are filled by the MT-filling scheme and then encoded by EFDR [7] and ASFDR [8], yielding the encoded test set. Let TD and TE denote the Mintest test set and the encoded test set obtained from the proposed BIST TPG method, respectively. The compression ratio is calculated by formula (4):

$$ \mathrm{C}\mathrm{R}\%=\frac{T_D-{T}_{\mathrm{E}}}{T_D}\times 100\% $$
(4)
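Formula (4) can be evaluated directly from the bit counts of the two test sets; the sketch below uses illustrative names for the sizes of TD and TE.

```python
def compression_ratio(td_bits, te_bits):
    """CR% from formula (4): relative reduction of TD down to TE."""
    # Multiply before dividing to keep integer inputs exact.
    return (td_bits - te_bits) * 100 / td_bits
```

For example, compressing 100 bits of original data to 28 encoded bits gives a compression ratio of 72 %.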

Table 6 compares the compression ratios obtained by different test set preprocessing methods. For EFDR with the MT-filling scheme, the proposed BIST TPG method (HDDR+PEDCBC) shows obvious advantages over plain EFDR and the 2-D reordering method (HDDR+PEBF) [11], and the average compression ratio reaches 72.14 %. For ASFDR with the MT-filling scheme, similar conclusions can be drawn, and the average compression ratio reaches 72.96 %.

Table 6 Compression ratio comparison with previous works

5.2 Scan-in Test Power Dissipation

The proposed BIST TPG method applies a series of adjusting and reordering strategies to preprocess the test patterns before the test set is loaded for scan testing. The don't care bit based 2-D adjusting greatly reduces the WTM during the scan-in operation, and the Hamming Distance based 2-D reordering is then applied to the adjusted test set to reduce test power dissipation further. Table 7 compares the scan-in test power dissipation of the encoded test set under different X-filling schemes.

Table 7 Test power dissipation comparison

First, the don't care bits in the original test set are filled by two filling schemes; the experimental results show that the MT-filling scheme decreases test power dissipation more effectively than the 0-filling scheme. Once the don't care bits are completely filled by the MT-filling scheme, the scan-in test power dissipation can be evaluated. The peak and average test power of the proposed BIST TPG method (HDDR+PEDCBC) are obviously lower than those of the 2-D reordering method (HDDR+PEBF) [11], with average reductions of 35.14 % and 43.01 % respectively.

5.3 Algorithmic Complexity Analysis

HDDR+PEDCBC EFDR/ASFDR with the MT-filling scheme is based on the WTM power model; its test pattern generation includes EFDR/ASFDR encoding, don't care bit adjusting and filling, and Hamming Distance based 2-D reordering. These operations are devoted to reducing test power dissipation during scan testing, which seems to make the BIST TPG method more complicated. Therefore, the algorithmic complexity, including test set encoding time, test application time, the decoder architecture and area overhead, is roughly evaluated in this paper. Table 8 compares the test set encoding times, including the don't care bit filling time and the Hamming Distance 2-D reordering time. The experimental results show that HDDR+PEDCBC ASFDR with the MT-filling scheme consumes less test set encoding time than the other previous works.

Table 8 Test set encoding time (s)

5.4 Test Application Time and Decoder Area Overhead

In the proposed scheme, fate and fscan denote the ATE frequency and the on-chip scan frequency respectively, and the frequency ratio is defined as α = fscan/fate. Since a slow tester tests a high speed system, fate < fscan. The decoding proceeds in three stages: the decoder receives the encoded test data from the ATE at frequency fate, the HDDR+PEDCBC EFDR/ASFDR codes are decoded at frequency fscan, and bit swapping within each test pattern is performed to obtain the power efficient test set.

It is assumed that the ATE uses only one channel for sending codewords to the CUT and takes one ATE clock cycle to send one bit. The clock of the bit swapping logic array is synchronized with the chip operating clock during scan testing. Let the test application time t(m, n) be the total time required to decode a codeword that is the nth member of the mth group, tshift(m, n) the time required to transfer the encoded test data from the ATE to the chip, tdecode(m, n) the time required to decode the codeword, and tswap(m, n) the time required to swap the corresponding bits in the test pattern. The test application time (TAT), expressed in ATE clock cycles [6], is then given by formula (5):

$$ t\left(m,n\right)={t}_{shift}\left(m,n\right)+{t}_{decode}\left(m,n\right)+{t}_{swap}\left(m,n\right) $$
(5)

The decoder is synthesized on an FPGA optimized for area overhead, assuming a maximum run length of 2000, and the estimated gate count for the decoder is calculated. Table 9 compares four techniques in terms of the number of clock cycles needed to decode the test set, which corresponds to the test application time under various frequency ratios; the decoder area overhead is also shown.

Table 9 Test application time and decoder area overhead

For all test sets involved in the experimental verification, the number of clock cycles needed for HDDR+PEDCBC EFDR/ASFDR with MT-filling is less than that needed for FDR and EFDR, indicating that the proposed scheme consumes less test application time than previous works. In addition, the estimated gate counts for HDDR+PEDCBC EFDR/ASFDR with MT-filling are slightly higher than those for FDR and EFDR, but the decoder area overhead remains small in comparison with the actual circuit size.

6 Conclusion

A power efficient BIST TPG method is proposed to reduce scan-in test power dissipation. Before scan testing, the test set is preprocessed by don't care bit based 2-D adjusting and Hamming Distance based 2-D reordering in an interleaved way. First, don't care bit based first adjusting sorts the test set by the number of don't care bits in each test pattern. Second, Hamming Distance based first row-wise reordering is applied to the adjusted list. Third, the first matrix transpose is performed. Fourth, don't care bit based second adjusting is applied to each row of the transposed matrix. Fifth, Hamming Distance based second column-wise reordering is applied to the new list in the transposed matrix. Finally, the second matrix transpose is performed, yielding the power efficient test set. The method is verified on the six largest ISCAS'89 benchmark circuits. The experimental results show that it decreases test power dissipation during scan testing while ensuring a high compression ratio, consuming less test application time and occupying only a small decoder area overhead.