1 Introduction

Today, integrated circuits (ICs) have a wide range of applications in domains such as the military and the economy. The consequences of an attack on them can be catastrophic, which underlines the importance of hardware security studies. An important issue in hardware security is the level of trust in the functionality of integrated circuits. Trust in the IC supply chain has become one of the key challenges in hardware security because outsourcing the design and fabrication process is unavoidable. Nowadays, outsourcing the fabrication process is common in the IC industry [3]. As a result, ICs are exposed to malicious attacks at various points of this chain [23, 31]. Malicious modifications made at any stage of the supply chain are called Hardware Trojans (HTs) [9, 24].

Trigger and payload circuits are the two major parts of HTs [9, 23, 24, 31]. The trigger part initiates the activity of the HT, and the payload causes the circuit to malfunction. HTs are placed in ICs for different purposes, such as leaking secret information or changing circuit functionality [16]. Adversaries try to hide Trojans in such a way that traditional manufacturing tests are not able to detect them [17]. Consequently, efficient methods must be developed to detect HTs during the test process.

Side-channel analysis and logic testing are two possible approaches to detecting HTs. Most side-channel methods compare side-channel signatures (e.g., delay, transient power, and leakage power) with the signatures of golden ICs. Logic testing methods form the other category of HT detection. In the ideal case, an HT detection method successfully mutates the circuit input vectors, triggers the Trojan, observes the malicious behavior, and reports the triggering input vector to the design owner. In practice, however, activating an HT is very difficult and sometimes impossible, because we do not have enough information about the HT's features, including its location, trigger condition, and malicious functionality [29]. Therefore, increasing the percentage of trigger coverage in a short time is the main goal of this category of detection methods. It is noteworthy that logic testing methods can also improve the efficiency of side-channel techniques, since side-channel analysis depends on HT activation [22].

Generating proper test vectors is the main part of activating HTs. The method in [8] generates random test vectors until each rare node of the circuit has been triggered independently at least N times. This simple heuristic approach to test vector generation leads to poor trigger coverage, especially for hard-to-trigger Trojans. Moreover, triggering all rare nodes N times is extremely difficult with this method. The method in [22] uses a genetic algorithm to address the challenges of the previous work. However, it is not efficient in terms of trigger coverage percentage and test vector generation time.

In this paper, we try to overcome this challenge and increase the trigger coverage percentage. We introduce a novel logic-testing method for detecting HTs that exploits a Genetic Algorithm (GA). An appropriate fitness function (the main part of a GA) is proposed. This function takes testability measures into account, which, to the best of our knowledge, has not been done in previous work. We advise the GA through an update phase at the end of each iteration so that it produces better test vectors. The proposed method attempts to maintain the advantages of previous methods while overcoming their disadvantages.

The rest of the paper is organized as follows. Section 2 provides the background on countermeasures against HTs. Section 3 describes the proposed method, including the Trojan model, the genetic algorithm, and the TRIAGE algorithm. Section 4 discusses Trojan insertion, test vector evaluation, and the experimental setup. The results of the analysis are reported and discussed in Section 5. Conclusions and future work are given in Section 6.

2 Related Works

As Fig. 1 indicates, HT countermeasures can be applied in three different domains: Design For Security (DFS), Detection and Run-Time Monitoring.

Fig. 1 Overview of different protection approaches against Hardware Trojan attacks [5]

The goal of the DFS approach is to prevent the insertion of HTs by hiding the main functionality of the circuit. Hiding the main functionality through design obfuscation can provide design security. One such method is logic encryption for combinational or sequential circuits. Sequential encryption modifies the state transition graph: additional logical states, called black states, are added to the state-transition function of the circuit. For example, [7] achieves obfuscation by modifying the state-transition function and the internal circuit structure. The circuit operates in normal mode if the correct sequence of patterns is applied to the primary inputs. Combinational encryption modifies the gate-level netlist. Gates such as XNOR and XOR gates [19,20,21] and MUX gates [21] are inserted into the original circuit to conceal the main functionality. Another technique in the DFS domain [6, 23] tries to facilitate HT detection. For example, in [23], dummy flip-flops are inserted to facilitate HT detection by increasing the transition probability of rare nodes. Layout filling and split manufacturing are other techniques for HT prevention [13].

Run-time approaches aim to detect Trojans that have escaped the other detection methods for any reason. It is desirable to detect HTs in ICs before deployment, but this cannot always be achieved. Thus, run-time approaches are regarded as the last line of defense against HTs, since they can monitor an IC over its entire operational lifetime [30]. The methods in [4, 9, 30] follow this approach to HT detection.

HT detection can be performed at the pre-silicon stage and at the post-silicon stage. Zhang et al. [28, 29] and Waksman et al. [26] have proposed functionality validation at the pre-silicon stage. These methods require complex computations and are not applicable to large circuits. Side-channel analysis and logic testing are the two categories of post-silicon methods. Previous works such as [1, 2, 12, 14, 16,17,18] have employed side-channel information to detect HTs. Side-channel detection approaches are not effective when the HT is tiny, because the circuit's noise usually overshadows the HT's activity and its side-channel leakage.

In [3, 8, 22, 27], HTs are detected by applying test vectors to circuits. Logic testing methods look for test vectors that activate the Trojan and then propagate the effect of the triggered Trojan to observable nodes. Since Trojans are designed to be hard to trigger, the main challenge in this approach is to ensure that the Trojan is triggered by the generated test vectors [9].

Test pattern generation in [3] is based on the exploration of outliers. The authors observe that an HT's trigger is chosen among signals that are not correlated with the general switching activity.

A method called MERO (Multiple Excitation of Rare Occurrence) was presented in [8]. This method generates random test vectors and randomly mutates them until each rare node of the circuit has been triggered independently at least N times. MERO has three disadvantages:

  • poor trigger coverage for hard-to-trigger HTs (with triggering probability less than 10⁻⁶)

  • triggering all rare nodes N times is impractical

  • only a small number of test vectors is explored

Saha et al. [22] tried to address these three issues. Their goals were to increase trigger coverage, to decrease the number of test vectors, and to diagnose Trojans using a Trojan database created for each circuit. Although the method in [22] improves on MERO, it has the following shortcomings:

  • It is not useful for triggering very hard-to-trigger HTs (with triggering probability less than 10⁻⁸).

  • It is not very efficient in terms of memory overhead.

  • Test vector generation takes a long time for moderately sized circuits.

We incorporate the MERO idea and SCOAP parameters [10] into the fitness function to retain the previous benefits while addressing the shortcomings mentioned above. We use the SCOAP parameters to quantify the rare nodes.

3 Proposed Method

In this section, we introduce the Trojan model assumed throughout this paper and briefly explain the GA. We then describe how the GA is used to generate test patterns. Finally, we explain the TRIAGE method.

3.1 Hardware Trojan Model

HTs are composed of two parts [15]. The first part is the trigger (the activation mechanism) and the second part is the payload (the part of the circuit activated after trigger excitation). Figure 2 shows the assumed Trojan model. In line with related work, we consider simple combinational HTs. In this paper, without loss of generality, we consider an AND gate whose inputs come from a subset of the rare nodes in the circuit as the trigger part of the HT, and an XOR gate as the HT payload. The Trojan is triggered when all circuit nodes connected to the AND inputs take the logical value '1'. The logical value at the payload node is then inverted by the Trojan. In general, the number of trigger inputs is unrestricted. However, following [22], we consider combinations of up to four rare nodes as trigger inputs.

Fig. 2 Example of model for combinational Trojan circuits
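The Trojan model described above can be illustrated with a minimal Python sketch. The function names and the node names (`n7`, `n12`, `n30`, `victim`) are hypothetical and chosen only for illustration; the sketch assumes, as in the text, that the rare value of every trigger node is logic 1.

```python
from typing import Dict, Sequence


def trojan_triggered(node_values: Dict[str, int], trigger_nodes: Sequence[str]) -> bool:
    """AND-gate trigger: fires only when every tapped rare node is at logic 1."""
    return all(node_values[n] == 1 for n in trigger_nodes)


def payload_output(original_value: int, triggered: bool) -> int:
    """XOR payload: inverts the victim signal while the trigger is active."""
    return original_value ^ int(triggered)


# Hypothetical internal node values produced by one test vector.
values = {"n7": 1, "n12": 1, "n30": 0, "victim": 1}

# Both trigger nodes are 1, so the payload inverts the victim signal.
print(payload_output(values["victim"], trojan_triggered(values, ["n7", "n12"])))
# One trigger node is 0, so the victim signal passes through unchanged.
print(payload_output(values["victim"], trojan_triggered(values, ["n7", "n30"])))
```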

3.2 The Genetic Algorithm (GA)

In this part, we describe how a GA works and how it can be used to generate more effective test vectors and find HTs more efficiently. A genetic algorithm is an adaptive heuristic search method for solving optimization problems.

A candidate solution to the problem that the GA is trying to solve is referred to as a chromosome, which is composed of genes. Each gene can take different values (e.g., numerical, binary, symbol, or character) depending on the problem type. The set of chromosomes is known as the population. Selection, crossover, and mutation are the three evolutionary operators in a GA. The selection operator chooses a number of chromosomes as parents. These parents produce children through the crossover process; the genes of each child are a combination of the genes of its parents. In mutation, one or more gene values in a chromosome are flipped. This step keeps the algorithm from getting stuck in a local extremum. Finally, the next generation is chosen among parents and children by the Darwinian rule, which says that chromosomes with a higher fitness value have a greater chance of being selected. The best solution can be reached after several generations [11].

In our case, the algorithm repeatedly modifies a population of test vectors. At each step, the GA randomly selects individuals from the current test vectors and uses them as parents to produce the children for the next generation. Over successive generations, the test vectors evolve toward optimal test vectors that have the highest percentage of trigger coverage.

3.3 Mapping Between Test Vector Generation Problem and GA Parameters

The mapping between the test vector generation problem and the GA parameters is as follows:

  • Gene: each bit (0 or 1) of a test vector is a gene.

  • Chromosome: each test vector represents a chromosome.

  • Population: the set of chromosomes is known as the population. In this problem, a collection of test vectors is a population. We use random test vectors as the initial population.

  • Fitness function: we need an appropriate way to evaluate test vectors. We rank the test vectors using this function. The rank of a test vector depends on which rare nodes it triggers.

  • Parent selection: we need to choose two chromosomes as parents to perform crossover. In this paper, we use Roulette Wheel (fitness-proportionate) selection for this purpose.

  • Crossover: we use two-point crossover, i.e., two points are selected on the parent test vectors and every bit between the two points is swapped between the parents, producing two new test vectors, as illustrated in Fig. 3.

  • Mutation: as shown in Fig. 3, a gene at a random position of the test vector is modified in the mutation process, in order to prevent the test vectors in the population from becoming too similar to each other.

  • Test vector selection: at the end of the round, we use Roulette Wheel selection to choose test vectors. These vectors form the new population for the next generation.

Fig. 3 Example of crossover and mutation on chromosomes (test vectors)

To summarize, we first generate test vectors as an initial population and evaluate them with the fitness function. The population evolves toward better test vectors only if an appropriate fitness function is defined, so the definition of the fitness function is critical. Parents are then selected using the Roulette Wheel, and crossover is applied to the parents to create offspring. In the next step, we flip a random bit of the test vectors as a mutation. Finally, we use the Roulette Wheel again to choose the next-generation test vectors, and these steps are repeated.
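The three operators summarized above can be sketched in Python as follows. This is an illustrative sketch rather than the authors' implementation (which, according to Section 4.1, was written in C#); the function names and the fallback to uniform selection when all fitness values are zero are assumptions.

```python
import random
from typing import List, Sequence, Tuple

Vector = List[int]  # a test vector (chromosome) as a list of bits (genes)


def roulette_select(population: Sequence[Vector], fitness: Sequence[float],
                    rng: random.Random) -> Vector:
    """Fitness-proportionate (Roulette Wheel) selection of one chromosome."""
    if sum(fitness) <= 0:
        # Assumption: fall back to uniform choice when no vector has any fitness.
        return list(rng.choice(population))
    return list(rng.choices(population, weights=fitness, k=1)[0])


def two_point_crossover(a: Vector, b: Vector,
                        rng: random.Random) -> Tuple[Vector, Vector]:
    """Swap the bits between two random cut points of the parents."""
    i, j = sorted(rng.sample(range(len(a) + 1), 2))
    return a[:i] + b[i:j] + a[j:], b[:i] + a[i:j] + b[j:]


def mutate(vec: Vector, rng: random.Random) -> Vector:
    """Flip one bit at a random position of the test vector."""
    out = list(vec)
    out[rng.randrange(len(out))] ^= 1
    return out


rng = random.Random(1)
parents = [[0, 0, 0, 0, 0, 0], [1, 1, 1, 1, 1, 1]]
child_a, child_b = two_point_crossover(parents[0], parents[1], rng)
print(child_a, child_b, mutate(child_a, rng))
```

In a full generation step, crossover and mutation would be applied with the probabilities given in Section 4.1 (0.9 and 0.05), and `roulette_select` would be called once for parent selection and again for survivor selection.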

3.4 TRIAGE (Hardware TRojan DetectIon Using an Advised Genetic Algorithm Based Logic TEsting) Method

In this section, we first introduce a fitness function and then present a general scheme for the HT detection problem. In the first step, the netlist, mutation rate, crossover rate, population size, and a threshold for rare node detection (𝜃) are given to the TRIAGE algorithm as inputs. In the next step, we find the rare nodes based on the transition probabilities of the nodes. To compute these probabilities, we use the method proposed in [23]. As shown in Fig. 4, this method assumes a switching probability of 0.5 for all inputs. After levelizing the circuit, it computes the zero and one transition probabilities for every node at every level. Then, we compare the transition probability of each node with 𝜃 to determine the rare nodes.

Fig. 4 Transition probability for a target cone
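The rare-node identification step can be sketched as follows. This is a simplified illustration, not the exact procedure of [23]: it propagates the probability of logic 1 through a levelized netlist under the assumption that gate inputs are statistically independent (reconvergent fan-out is ignored), and it uses the probability of each logic value as a stand-in for the zero and one transition probabilities. The netlist format and function names are hypothetical.

```python
from math import prod
from typing import Dict, List, Tuple

Gate = Tuple[str, str, List[str]]  # (output name, gate type, input names)


def one_probabilities(inputs: List[str], gates: List[Gate]) -> Dict[str, float]:
    """Probability of logic 1 per node; gates must be in levelized order."""
    p1 = {name: 0.5 for name in inputs}  # 0.5 for every primary input
    for name, kind, fanin in gates:
        ps = [p1[f] for f in fanin]
        if kind == "AND":
            p1[name] = prod(ps)
        elif kind == "OR":
            p1[name] = 1.0 - prod(1.0 - p for p in ps)
        elif kind == "NOT":
            p1[name] = 1.0 - ps[0]
        else:
            raise ValueError(f"unsupported gate type: {kind}")
    return p1


def rare_nodes(p1: Dict[str, float], theta: float) -> Dict[str, int]:
    """Map each rare node to its rare logic value (probability below theta)."""
    rare = {}
    for name, p in p1.items():
        if p < theta:
            rare[name] = 1
        elif 1.0 - p < theta:
            rare[name] = 0
    return rare


gates = [("g1", "AND", ["a", "b"]), ("g2", "AND", ["c", "d"]),
         ("g3", "AND", ["g1", "g2"]), ("g4", "NOT", ["g3"])]
probs = one_probabilities(["a", "b", "c", "d"], gates)
print(rare_nodes(probs, theta=0.1))  # g3 is rarely 1, g4 is rarely 0
```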

As mentioned, adversaries try to hide the Trojan so that it is not detected by traditional manufacturing tests. We leverage testability measures and transition probabilities to find the nodes that are potentially susceptible to Trojan insertion. For this purpose, we use the SCOAP testability measures developed by Goldstein [10]. Controllability and observability are the important testability measures for digital circuits. For a digital circuit, “controllability is defined as the difficulty of setting a particular logic signal to a 0 or a 1 and observability is defined as the difficulty of observing the state of a logic signal” [25]. The next step is to compute the SCOAP parameters, because we want to consider the controllability and observability of nodes in the fitness function. Before evaluating test vectors, we need to quantify the rare nodes. For this purpose, we define Eq. 1 based on transition probability and SCOAP parameters.

$$ RareNode_{i} = \begin{cases} \dfrac{\omega_{1}}{tp_{0} + c_{1}} + \omega_{2}\,(CC0 + CO), & R_{i} \to 0 \\[2ex] \dfrac{\omega_{1}}{tp_{1} + c_{1}} + \omega_{2}\,(CC1 + CO), & R_{i} \to 1 \end{cases} $$
(1)

Here, tp0 and tp1 are the transition probabilities (for 0 and 1, respectively), CC0 (CC1) is the difficulty of setting a circuit line to logic 0 (logic 1), CO is the difficulty of observing a circuit line, and ω1 and ω2 are constants that balance the two terms of the expression.
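As a rough illustration of Eq. 1, the score of a single rare node can be computed as in the sketch below. The default values of `w1`, `w2`, and `c1` are placeholders, not values from the paper. The constant c1 appears in Eq. 1 but is not described in the text; treating it as a small positive constant that keeps the denominator non-zero is an assumption.

```python
def rare_node_score(tp: float, cc: float, co: float,
                    w1: float = 1.0, w2: float = 1.0, c1: float = 1e-6) -> float:
    """Eq. 1 for one rare node.

    tp: transition probability toward the rare value (tp0 or tp1)
    cc: SCOAP controllability for the rare value (CC0 or CC1)
    co: SCOAP observability of the node
    w1, w2, c1: placeholder constants (assumed, not taken from the paper)
    """
    return w1 / (tp + c1) + w2 * (cc + co)


# A node that is rarer (smaller tp) and harder to control/observe scores higher.
print(rare_node_score(tp=0.05, cc=12, co=20))
print(rare_node_score(tp=0.001, cc=30, co=45))
```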

The advantage of Eq. 1 over [22] is that it takes testability measures into account when identifying rare nodes. For example, consider an internal flip-flop that holds the logical value '1' most of the time. It would be identified as a rare node if we considered only the transition probability, although the state of this flip-flop can easily be set to 0 in scan mode. As another example, consider nodes that are close to the outputs of a circuit. Even if they have low transition probabilities, they are not suitable locations for inserting a Trojan, since they are highly observable and the Trojan might be discovered during manufacturing test. These examples show that it is not sufficient to consider only the transition probability of nodes to find rare nodes. Therefore, in TRIAGE, we consider testability measures in addition to transition probability.
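For readers unfamiliar with SCOAP, the combinational controllability values CC0 and CC1 can be computed with the standard rules sketched below (primary inputs cost 1; each gate adds 1 to the cheapest way of producing the requested output value). Only AND, OR, and NOT gates are covered, observability is omitted for brevity, and the netlist format is the same hypothetical one used for illustration, not the authors' data structure.

```python
from typing import Dict, List, Tuple

Gate = Tuple[str, str, List[str]]  # (output name, gate type, input names)


def scoap_controllability(inputs: List[str],
                          gates: List[Gate]) -> Tuple[Dict[str, int], Dict[str, int]]:
    """Return (CC0, CC1) for every node; gates must be in levelized order."""
    cc0 = {n: 1 for n in inputs}
    cc1 = {n: 1 for n in inputs}
    for name, kind, fanin in gates:
        if kind == "AND":
            cc0[name] = min(cc0[f] for f in fanin) + 1  # any single input at 0
            cc1[name] = sum(cc1[f] for f in fanin) + 1  # all inputs at 1
        elif kind == "OR":
            cc0[name] = sum(cc0[f] for f in fanin) + 1  # all inputs at 0
            cc1[name] = min(cc1[f] for f in fanin) + 1  # any single input at 1
        elif kind == "NOT":
            cc0[name] = cc1[fanin[0]] + 1
            cc1[name] = cc0[fanin[0]] + 1
        else:
            raise ValueError(f"unsupported gate type: {kind}")
    return cc0, cc1


gates = [("g1", "AND", ["a", "b"]), ("g2", "AND", ["c", "d"]),
         ("g3", "AND", ["g1", "g2"])]
cc0, cc1 = scoap_controllability(["a", "b", "c", "d"], gates)
print(cc0["g3"], cc1["g3"])  # setting g3 to 1 is much harder than setting it to 0
```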

In the fourth step, the GA runs as described in Section 3.3, and every test vector is evaluated with the fitness function. We define the fitness function as

$$ Value(V) = \sum_{i \,\in\, \{\text{nodes triggered by } V\}} RareNode_{i} $$
(2)

Here, Value(V) is the fitness value of a test vector V. This value depends on the rare nodes that are triggered by V. In summary, Eq. 1 quantifies each rare node and Eq. 2 quantifies each test vector.
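Eq. 2 can be illustrated with the following sketch, which assumes that a logic simulator has already produced the value of every node for the test vector V. A rare node counts as triggered when it takes its rare logic value. The function and variable names are hypothetical.

```python
from typing import Dict, Set


def triggered_rare_nodes(node_values: Dict[str, int],
                         rare_values: Dict[str, int]) -> Set[str]:
    """Rare nodes that take their rare logic value under one test vector."""
    return {n for n, rv in rare_values.items() if node_values[n] == rv}


def vector_fitness(node_values: Dict[str, int], rare_values: Dict[str, int],
                   scores: Dict[str, float]) -> float:
    """Eq. 2: sum of the RareNode scores of the nodes triggered by the vector."""
    return sum(scores[n] for n in triggered_rare_nodes(node_values, rare_values))


rare_values = {"g3": 1, "g4": 0, "g9": 1}      # node -> rare logic value
scores = {"g3": 40.0, "g4": 25.0, "g9": 90.0}  # Eq. 1 scores (illustrative)
simulated = {"g3": 1, "g4": 0, "g9": 0}        # node values for one vector
print(vector_fitness(simulated, rare_values, scores))  # 40.0 + 25.0
```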

In the proposed fitness function, a test vector is more valuable if it triggers more rare nodes, since we aim to detect more Trojans with fewer vectors. On the other hand, a test vector is also more valuable if it triggers a rare node that has not been triggered so far, even if it triggers only a small number of rare nodes. For example, consider a rare node that is triggered in every round of the GA. We claim that it should be scored differently from nodes that have been triggered less often so far. Hence, we should consider not only the number of triggered rare nodes but also their quality. For this purpose, before completing each round, we add a step to the normal GA process that adjusts the quantification of the rare nodes. Through this update phase, we advise the GA and encourage it to evolve toward triggering other rare nodes. The update is defined as

$$ RareNode_{i} = \begin{cases} RareNode_{i} + \omega_{3}\,(N - Ctr_{t-i}), & Ctr_{t-i} < N \\ RareNode_{i}, & \text{otherwise} \end{cases} $$
(3)

Here, Ctr_{t-i} is the number of times that rare node i has been triggered so far, N is the number of times that we want each rare node to be triggered, and ω3 is a constant. When a rare node has been triggered fewer than N times, its score is increased in proportion to the difference between N and Ctr_{t-i}.
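The update phase of Eq. 3 can be sketched as below. The default value of `w3` is a placeholder, and the dictionary-based bookkeeping of trigger counts is an assumption made for illustration.

```python
from typing import Dict


def update_scores(scores: Dict[str, float], trigger_counts: Dict[str, int],
                  n_target: int, w3: float = 1.0) -> Dict[str, float]:
    """Eq. 3: boost rare nodes that have been triggered fewer than N times."""
    updated = {}
    for node, score in scores.items():
        count = trigger_counts.get(node, 0)
        if count < n_target:
            score = score + w3 * (n_target - count)
        updated[node] = score
    return updated


scores = {"g3": 40.0, "g9": 90.0}
counts = {"g3": 7, "g9": 1}  # g9 has rarely been triggered so far
print(update_scores(scores, counts, n_target=5, w3=2.0))
```

Because the boost is added at the end of every generation, a node that stays under-triggered keeps gaining weight, which steers the Roulette Wheel toward vectors that excite it.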

Consequently, the rank of a test vector depends on two factors:

  1) The number of rare nodes that it triggers.

  2) The quality of the triggered rare nodes, which can change dynamically.

In the last step, we check the termination conditions of the algorithm. The algorithm terminates when either of the following conditions is satisfied:

  1) p% of the rare nodes have been triggered at least N times and no rare node has been left entirely untriggered (we consider p = 95%).

  2) The generation threshold has been reached (following [8], we consider 1000 generations).
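The two termination conditions can be expressed as a small predicate. The defaults follow the values stated in the text (p = 95%, 1000 generations); the function name, its signature, and the handling of an empty rare-node list are assumptions.

```python
from typing import Dict, Sequence


def should_stop(trigger_counts: Dict[str, int], rare_nodes: Sequence[str],
                n_target: int, generation: int,
                p: float = 0.95, max_generations: int = 1000) -> bool:
    """Return True when either termination condition holds."""
    if generation >= max_generations:      # condition 2: generation threshold
        return True
    counts = [trigger_counts.get(n, 0) for n in rare_nodes]
    if not counts:                         # assumption: nothing left to trigger
        return True
    enough = sum(c >= n_target for c in counts) / len(counts) >= p
    none_missed = all(c > 0 for c in counts)
    return enough and none_missed          # condition 1


nodes = [f"r{i}" for i in range(20)]
counts = {n: 5 for n in nodes}
print(should_stop(counts, nodes, n_target=5, generation=12))
```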

With this fitness function, update phase, and termination conditions, the proposed method combines the advantages of the methods in [22] and [8] while compensating for their shortcomings. As mentioned earlier, MERO looks for test vectors that excite candidate trigger nodes individually multiple (at least N) times [8]. We adopt this concept in the first termination condition to increase, on average, the probability of exciting rare nodes to their rare logic value. On the other hand, according to [22], one shortcoming arises from extremely rare nodes, which are very hard to trigger N times. To overcome this challenge, we require only p% of the rare nodes to be triggered at least N times. Nevertheless, we do not leave any rare node untriggered. In addition, we use the GA to generate the test vectors. The evolution of the test vectors satisfies this condition faster than random test vector generation. The final test vectors are available once the GA completes. Algorithm 1 shows the complete test pattern generation scheme.

Algorithm 1 Complete test pattern generation scheme

4 Experimental Setup and Trojan Insertion Conditions

4.1 Experimental Setup

The TRIAGE method was implemented in C#. We also implemented the method proposed in [22] to evaluate the effectiveness of TRIAGE. To make a fair comparison, the genetic algorithm parameters were set according to [22], as follows:

  • Trigger combinations are limited to a random sample of 100000.

  • The number of initial test vectors is considered to be 2500.

  • Probabilities of crossover and mutation are considered to be 0.9 and 0.05 respectively.

  • The threshold of generation is considered to be 1000. Also, p is considered to be 95%.

  • The number of selected vectors in each generation is 200 (except for s38417, for which it is 500).

  • We used 𝜃= 0.1 for generating test vectors.

  • We used a subset of the ISCAS–85 and ISCAS–89 circuits. Sequential circuits are converted to scan mode. In this mode, all flip-flops can be set to any desired logical value by shifting those logic states into the shift register.

  • The implementation was executed on a machine with a 3.3 GHz processor and 4 GB of main memory.

4.2 Trojan Insertion and Test Vectors Evaluation

We needed to insert Trojans to evaluate the efficiency of the TRIAGE method. The main challenge was to select a set of nodes as the Trojan trigger. The combination must not only be hard to trigger but also be feasible, i.e., triggerable by at least one test vector. To overcome this problem, we generated random test vectors for each circuit and applied them to the circuit to determine which rare nodes they stimulate. We can then choose Trojan triggers from this set depending on the desired degree of triggering difficulty. As 𝜃 decreases, the transition probability of the rare nodes also decreases, and we can select a very hard-to-trigger combination for the Trojan. We inserted two sets of Trojans: one created with the rare node detection threshold 𝜃 set to 0.1, and the other with 𝜃 set to 0.01.

Then one of these Trojans is inserted into the circuit, and each test vector generated by TRIAGE is applied to the circuit. We use a counter to measure trigger coverage: the counter is incremented if the test vectors detect the Trojan.
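The coverage counter described above can be sketched as follows. The sketch assumes that each Trojan is represented by its trigger condition (a mapping from rare node to rare value) and that the node values produced by every test vector are available from a logic simulator; these representations and names are hypothetical.

```python
from typing import Dict, List


def trigger_coverage(trojans: List[Dict[str, int]],
                     simulated_vectors: List[Dict[str, int]]) -> float:
    """Percentage of Trojans whose trigger is satisfied by at least one vector."""
    covered = 0
    for trigger in trojans:
        if any(all(values[n] == rv for n, rv in trigger.items())
               for values in simulated_vectors):
            covered += 1  # the counter is incremented once per detected Trojan
    return 100.0 * covered / len(trojans)


trojans = [{"g3": 1, "g7": 1}, {"g3": 1, "g9": 0}]
simulated_vectors = [{"g3": 1, "g7": 0, "g9": 1}, {"g3": 1, "g7": 1, "g9": 1}]
print(trigger_coverage(trojans, simulated_vectors))  # only the first Trojan fires
```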

5 Result and Discussions

In this section, the TRIAGE method is compared with the methods in [22] and [8] in terms of test vector generation time and trigger coverage. As mentioned in [22], Trojan coverage differs from trigger coverage. According to the results of [22], Trojan coverage is always lower than trigger coverage, and the ratio of trigger coverage to Trojan coverage is approximately constant across all circuits. For this reason, we did not consider Trojan coverage. In addition, higher trigger coverage results in higher Trojan coverage.

Table 1 reports, for each circuit, the number of gates, the generation time, and the number of test vectors generated by each method. To give insight into the performance of previous works, the simulation results reported in [22] and [8] are also shown in Table 1. Our own simulation results for the proposed scheme and for the method in [22] are reported separately.

Table 1 Comparison between TRIAGE and other methods with respect to the test-set length and test vector generation time

The test-set length for the TRIAGE method is chosen to be similar to that of the other methods so that the remaining assessments are conducted fairly. Table 1 demonstrates the efficiency of the proposed method: for the same test-set length, the test vector generation time of TRIAGE is far lower than that of the methods in [22] and [8].

Although we use an idea from MERO, the test vector generation time of TRIAGE is lower than that of MERO. This can be explained by three factors:

  • Triggering all of the rare nodes at least N times is very difficult with the random test vectors used in MERO.

  • In the TRIAGE method, only 95% of the rare nodes, rather than all of them, must be triggered at least N times.

  • Test vectors are generated by the GA rather than at random.

Therefore, we expected the test vector generation time of TRIAGE to be lower than that of MERO.

The test vector generation time of TRIAGE is also lower than that of the method in [22], because the latter requires Boolean Satisfiability (SAT) tools at the beginning and end of the algorithm, whereas TRIAGE uses random test vectors for the initial population.

There is a difference between our simulation results and the results reported in [22] regarding test vector generation time, because the implementations of the Boolean Satisfiability tool differ. It is noteworthy that the test vector generation time of TRIAGE is much lower than both figures (not only the one from our simulations but also the one reported in [22]).

The methods were also compared in terms of the percentage of trigger coverage. Table 2 shows the trigger coverage for each method and each circuit. The TRIAGE method achieves higher trigger coverage than the others for most of the ISCAS circuits (all except c3540). This improvement can be explained in several ways. First, the test vectors are developed by the GA and evolve toward a better population. Second, considering the controllability and observability parameters in the fitness function allows TRIAGE to identify suitable locations for Trojan insertion correctly. Third, in the update phase we advise the GA by adjusting the scores of the rare nodes. In summary, all of these reasons are related to the definition of the fitness function.

Table 2 Comparison between TRIAGE and other methods regarding trigger coverage for hard-to-trigger and very hard-to-trigger Trojans

To clearly show the efficiency of the TRIAGE method for very hard-to-trigger Trojans (the transition probability of the trigger part is less than 10⁻⁸), we generated a set of such Trojans (as explained in Section 4.2) and applied the test vectors generated by each method. Contrary to what is stated in [22], the number of Trojans whose transition probability is less than 10⁻⁸ is not extremely low, especially for large circuits. The last two columns of Table 2 indicate that both methods have lower trigger coverage in this case, but the proposed method is more effective than the method in [22]. As before, TRIAGE is able to trigger hard-to-trigger Trojans because of the various parameters considered in the fitness function, the advantages inherited from MERO, and the update phase.

It is noteworthy that trigger coverage decreases as the circuit size increases, because the number of possible nodes for inserting HTs grows and triggering all of them with a limited number of test vectors is difficult. This trend is visible in both sets of simulation results (ours and those of [22]). It matters most for Trojans whose triggering probabilities are below 10⁻⁸. To obtain higher coverage for these Trojans, we would need to increase the number of test vectors or the population size of each iteration. We chose the number of test vectors according to the reference work [22] in order to make a fair comparison. Even so, the proposed method has higher coverage than the reference work [22] in two cases. We also increased the population size to 500 for s38417, which is larger than s15850, to see the impact of this factor on trigger coverage. We expected the trigger coverage to increase. The results in Table 2 indeed indicate higher trigger coverage for s38417 than for s15850, but the run time grows almost 5.5 times. Both appear reasonable.

To sum up, in this work we improved trigger coverage with respect to other methods while reducing test vector generation time.

6 Conclusion and Future Work

HT detection has become one of the challenges of hardware security, and many methods have been presented to detect HTs.

In this paper, a new method was proposed for generating test vectors to detect HTs. The proposed method combines the advantages of the methods in [22] and [8] so that each covers the shortcomings of the other. The simulation results show that the proposed method improves the percentage of Trojan detection. In addition, test vector generation time is reduced significantly, which is important when circuits are large. Based on this study, we suggest the following directions for future work:

  • Use the test vectors generated by this method in side-channel analysis. Although side-channel techniques can detect Trojans without triggering them in some cases, they always look for methods that trigger part of the Trojan.

  • Consider the Trojan payload in this process. This may help close the gap between trigger coverage and Trojan coverage.