1 Introduction

Use-after-free (UAF) vulnerabilities are widely exploited by attackers to execute arbitrary code remotely and to escalate privileges. Such a vulnerability arises when a program dereferences a dangling pointer, i.e., a pointer that points to a freed memory object. Eliminating dangling pointers from a program is challenging for at least two reasons. First, memory allocation/de-allocation and pointer nullification/re-initialization are usually handled in different functions, or even different files, which leaves a window for dangling pointers between the memory de-allocation in one function and the pointer nullification/re-initialization in another. Second, complicated pointer operations at runtime may create many alias pointers that point to the same memory object. Hence, a single memory de-allocation may produce many dangling pointers, which are nontrivial to identify completely.

Existing works [5, 6, 14, 17] use dynamic dangling pointer nullification to defeat UAF vulnerabilities. This kind of nullification eliminates dangling pointers by embedding a points-to information management structure into the program, tracking points-to information, and inserting nullification instructions after free operations. However, it introduces complex synchronization issues and cannot handle applications that allocate or release memory intensively. In addition, it requires the program to maintain additional data structures at runtime, which enlarges the original program’s code space. Moreover, its heavy instrumentation and extra address operations often involve clearing parts of function stack frames, which increases the risk of crashing the program. Finally, such a dynamic approach depends heavily on the number and diversity of test inputs to achieve code coverage, but sufficient test inputs are hard to obtain in practice.

In this paper, we view the identification of dangling pointers as a demand-driven alias analysis problem and propose a novel mitigation approach against use-after-free vulnerabilities, Static Dangling Pointer Nullification (SDPN). SDPN involves two stages: an analysis stage and a dangling pointer repair stage. During the analysis stage, SDPN uses our alias analysis algorithm to collect potential dangling pointers, filters them according to the definition of alias pointers, and classifies them into global dangling pointers, dangling pointers in the same function, and dangling pointers in different functions. During the repair stage, SDPN directly repairs global dangling pointers and dangling pointers in the same function, and repairs dangling pointers in different functions at the proper program locations. We implement a prototype of SDPN and demonstrate its effectiveness against real-world UAF vulnerabilities. We also evaluate it using the SPEC CPU2006 benchmarks and observe negligible runtime overhead, i.e., less than 1%.

Our contributions are summarized as follows:

  • To the best of our knowledge, we are the first to use static analysis to eliminate dangling pointers.

  • We generalize identifying dangling pointers as a special alias analysis problem and devise a novel algorithm to solve it.

  • We implement a prototype of SDPN and evaluate it against real-world CVE vulnerabilities. The experimental results demonstrate its effectiveness against UAF, incurring negligible runtime overhead.

2 Background and Related Works

2.1 Background

Use-after-free and Dangling Pointers. A memory chunk allocated by a memory allocator can be pointed to by many pointers during the runtime of a program. When the memory chunk is freed, those pointers become dangling pointers. Essentially, a use-after-free vulnerability is the use of such dangling pointers. The root cause is that releasing the memory chunk does not destroy the pointers that reference it. Hence, uses of those dangling pointers may be exploited to escalate privileges or to achieve remote code execution.

To manage memory efficiently and limit fragmentation, modern memory allocators, such as ptmalloc, usually put freed chunks into carefully designed linked lists. If a suitable freed chunk exists when the program next requests memory, the chunk is removed from the linked list and used to satisfy the request. Attackers craft memory requests to occupy the previously released chunk and then use a dangling pointer to access that memory. Use-after-free exploitation therefore creates a memory chunk that is interpreted under different program semantics: during one period of execution the memory is interpreted as writable data, whereas during another period it may be interpreted as control-flow data, such as function pointers.

Dangling Pointer Analysis and Alias Analysis. Pointer operations in a program can be classified into create, use and release, corresponding to the three sets \(\mathcal {C}\), \(\mathcal {U}\) and \(\mathcal {F}\) respectively. The program location of a pointer p is denoted \(l_p\). For \(\forall {p}\in \mathcal {F}, \forall {q}\in \mathcal {U}\), a UAF vulnerability requires that p and q point to the same memory object (the spatial condition) and that there are one or more execution paths from \(l_p\) to \(l_q\) (the temporal condition) [16]. In contrast, a dangling pointer only requires that p and q point to the same memory object, i.e., that they are a pair of alias pointers spatially. Dangling pointer analysis is therefore essentially alias analysis starting from the pointers in memory-free statements.

Alias analysis [11, 15, 20] examines whether memory references point to the same memory and are therefore aliases:

figure a

where PointTo(p) is defined as the set of variables or storage locations pointed to by a pointer p. PointTo(p) can be obtained using points-to analysis, one of the most fundamental static program analysis techniques, which determines the variables or storage locations that pointers point to [2].

Alias analysis can be regarded as a context-free-language path reachability problem on a graph. Four variants of this problem were proposed in [9]: the all-pairs L-path problem, the single-source L-path problem, the single-target L-path problem and the single-source-single-target L-path problem. Current studies mainly focus on solving the single-source-single-target alias problem [15, 20] and the all-pairs L-path problem [18, 19]. The dangling pointer problem, however, should be viewed as the single-target L-path problem or the single-source L-path problem; that is, for each dangling pointer there must be at least one L-path that reaches the memory release statement, either forward or backward. For clarity, we define the set of dangling pointers associated with a pointer p as \(DP(p)=\{q_0, q_1, ... , q_n\}\): \(\forall {q}\in {DP}(p),Alias(p,q) = true\).

2.2 Related Works

Recently, many UAF mitigation measures have been proposed, e.g., Cling [1], DieHard [3], DieHarder [8], etc., which try to avoid unsafe memory reuse by building a new memory allocator. For example, Cling improves memory safety by only allowing memory to be reused by objects of the same type and alignment, but it still leaves attackers the opportunity to exploit UAF vulnerabilities on objects of the same type. In addition, the allocator-replacement approach depends heavily on the actual deployment/operating environment. If the program is deployed in an environment without the secure allocator, such an approach will almost completely fail.

The reference counting approach [10] prevents the recycling of memory objects that are still pointed to by dangling pointers: it keeps a reference count for each memory chunk in the program and only reuses memory objects whose reference count is zero. However, this approach delays memory reuse in the program and can still cause severe memory leakage if the to-be-reused memory is not properly cleaned up. In addition, it requires strict and accurate instrumentation; otherwise, reference counts can be recorded incorrectly, leading to memory leakage and other security issues.

Runtime checking, i.e., dynamic analysis, has been proposed to solve the problem of dangling pointers [4, 7]. This approach identifies dangling pointers and prevents their dereference by maintaining additional metadata for each pointer and precisely tracking its semantics. However, accurate tracking of pointer semantics at runtime is a complicated and challenging problem, and it may generate a huge amount of pointer-related data, thus incurring significant performance degradation.

Fig. 1. Example: Node p is the pointer in the free statement selected in the program. Horizontal arrows represent assignment edges (A) and vertical arrows represent dereference edges (D).

3 Overview

3.1 A Motivating Example

The method SDPN uses to identify dangling pointers is inspired by equipotentiality, which in mathematics and physics refers to a region in space where every point is at the same potential. We found that alias properties resemble equipotentiality, and that this resemblance can simplify some program analyses.

Fig. 1 illustrates our idea with an example. The left of Fig. 1 is a piece of code that performs a series of complex pointer assignments and finally releases the pointer p. For brevity, the code omits memory allocations and variable declarations. The right of Fig. 1 is the PEG (Program Expression Graph) converted from the code. The nodes labeled with numbers on the PEG, such as 7, 8, 9, are temporary nodes generated by ‘*’ (dereference), and the nodes labeled with letters, such as u, o, p, correspond to variables in the code. Horizontal arrows represent assignment edges (A) and vertical arrows represent dereference edges (D).

The code finally releases the memory pointed to by p, which creates dangling pointers. For the safety of the program, these dangling pointers should be nullified. Nullifying the pointer p itself is obvious and easy. The other dangling pointers are aliases of p, but they are hard to find. While searching for them, we observed that, under combinations of assignment and dereference operations, the pointers related to p by aliasing form an equipotential layer. For example, a pointer produced from another by assignments alone is necessarily its alias, and its layer does not change. A dereference operation lowers the level, and an address-of operation (reverse dereference) raises it. After the level has fallen and risen, a pointer that returns to the alias level may again be an alias. We found that aliases must lie in the same equipotential layer, but pointers in the same equipotential layer are not necessarily aliases. This property cannot determine aliases precisely, but it can narrow down the range of pointers considered when looking for aliases. The layers are discussed further in Sect. 4.1 and Sect. 4.4. The relationship between equipotential layers and aliases is summarized as Rule 1 and Rule 2.

Following this intuition, we introduce a new parameter, level, to the PEG to roughly measure the relationship between pointers. The level propagates and changes along the edges of the graph. Equipotential surfaces based on level values can be used to filter pointers. We first select the nodes with the same level value as p, here \(\{v, v2, w, 7, y\}\). Next, we make more accurate judgments about the pointers in this set. Node 7 is actually the variable v2, because t and t2 are aliases of each other and both point to v2. After pointer analysis, we find that \(PointTo(v)=PointTo(v2)=PointTo(p)=\{o\}, PointTo(w)=\{o3\}, PointTo(y)=\{o2\}\). According to the definition of alias, v and v2 are aliases of pointer p and should be nullified.

3.2 Overview of the Design

This paper designs and implements a prototype, SDPN, that uses a multi-step filtering method to find dangling pointers. The whole process is shown in Fig. 2. In Stage 0, SDPN compiles the source code of the target program to LLVM IR and, from the IR, builds the PEG of the target program extended with the level parameter. In Stage 1, SDPN finds the free pointer nodes in the graph and collects all pointer nodes that may have alias relationships with these pointers to form the selected pointer set S(p). This stage reduces the analysis target from all pointers to the selected pointer set. In Stage 2, SDPN reduces the selected pointer set S(p) to the remaining pointer set DP(p) according to the definition of alias and the validity of the pointers in S(p). In Stage 3, SDPN divides DP(p) into global pointers, local pointers in the same function stack and local pointers in different function stacks, generates the corresponding nullification statements, and finds the corresponding insertion locations according to the pointer type to complete the repair. Finally, SDPN generates the repaired executable.

Fig. 2. Overview

4 Design

4.1 Stage 0: Generate PEG with Level

This section describes the program information that needs to be obtained during the SDPN analysis phase.

Fig. 3. Interpretation of expressions on extended PEG graphs

A PEG [19, 20] \(G=(N,E)\) is a bidirected graph extracted from program statements, where N is the set of nodes and E is the set of edges. The nodes of a PEG represent variables in the program, and there are two types of edges: pointer dereference edges (D) and pointer assignment edges (A). A D edge represents the dereference relationship between variables. For example, the C expression *p produces two nodes, \(*p\) and p, and an edge \(p\xrightarrow {D}*p\) expressing that *p is obtained by dereferencing p. An A edge represents the assignment relationship between variables. For example, the statement \(p=q\) assigns the value of q to p, so there is an A edge \(p\xleftarrow {A}q\). In a PEG, every edge has a reverse edge: if \({p}\xrightarrow {D}*p\in {E}\), then \({p}\xleftarrow {\overline{D}}*p\in {E}\); if \({p}\xleftarrow {A}q\in {E}\), then \({p}\xrightarrow {\overline{A}}q\in {E}\).

Previous research used a precise context-free-grammar path reachability method to analyze all aliases as accurately as possible. The productions of the context-free grammar for alias analysis on a PEG are as follows [20]:

$$\begin{aligned}&M\quad {::}\!= \ \overline{D}\ V\ D \end{aligned} \tag{1}$$
$$\begin{aligned}&V\quad {::}\!= \ (M?\overline{A})^{*}\ M?\ (AM?)^{*} \end{aligned} \tag{2}$$

There are two types of aliases here: a memory alias, M, means that two pointers point to the same memory; a value alias, V, means that two pointers evaluate to the same pointer value. In the grammar, ‘?’ indicates 0 or 1 repetition and ‘\(*\)’ indicates any number of repetitions. If the labels on the edges of a path connecting two pointers can be reduced to one of the above productions, then the two pointers are aliases of each other. SDPN, however, first produces candidate aliases and then screens them step by step to make the results more precise. To select candidate pointers, SDPN uses necessary but not sufficient conditions derived from the above productions.

To derive the filter rules, SDPN extends the traditional PEG nodes with a new parameter, level. SDPN marks the pointer p in each free statement with \(level = 1\) and starts the graph traversal from this node. Traversing an A or \(\overline{A}\) edge leaves the level unchanged; traversing a \(\overline{D}\) edge increases the level by 1; traversing a D edge decreases it by 1.

As shown in Fig. 3, the calculation of level is equivalent to interpreting the expressions in the program as follows: the ADDR expression is translated into a \(\overline{D}\) edge on the PEG, so the level value increases by 1 along the direction of the edge; the COPY expression is translated into an A edge, so the level value remains unchanged; the LOAD expression is translated into a combination of a D edge and an A edge, so the level value is decremented by 1; the STORE expression is translated into a combination of an A edge and a \(\overline{D}\) edge, so the level value is incremented by 1.
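The level bookkeeping described above can be sketched in a few lines of Python. This is only an illustration of the rules stated in the text, not SDPN's implementation: the label spellings `A_bar` and `D_bar`, the table names, and the function `level_after` are assumptions introduced here.

```python
# Level change contributed by each PEG edge label, as stated in Sect. 4.1.
# "A_bar" and "D_bar" are hypothetical spellings of the reverse edges.
LEVEL_DELTA = {"A": 0, "A_bar": 0, "D": -1, "D_bar": +1}

# Edge sequences for the four expression kinds, following the description
# of Fig. 3 in the text (the figure itself is not reproduced here).
EXPRESSION_EDGES = {
    "ADDR": ["D_bar"],        # level + 1
    "COPY": ["A"],            # level unchanged
    "LOAD": ["D", "A"],       # level - 1
    "STORE": ["A", "D_bar"],  # level + 1
}


def level_after(start_level, labels):
    """Return the level reached after traversing the given edge labels."""
    level = start_level
    for label in labels:
        level += LEVEL_DELTA[label]
    return level


# The freed pointer starts at level 1.
print(level_after(1, EXPRESSION_EDGES["LOAD"]))   # one level below the freed pointer
print(level_after(1, ["D_bar", "A", "D"]))        # back on the starting layer
```

A path such as `D_bar, A, D` returns to the starting layer, which is the situation in which an alias may reappear.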

4.2 Stage 1: Level Filter

SDPN now has the extended PEG. Next, we show how SDPN filters candidate alias pointers using the new level attribute.

SDPN introduces a lv() function:

figure b

For lv(x), if x is a node, lv(x) returns the level value of the node x; if x is an edge, it returns the level difference between the end node and the start node of this edge; if x is a path, it returns the level difference between the end node and the start node of this path. The difference is signed, as required by the values \(lv(D)=-1\) and \(lv(\overline{D})=1\).

Applying lv() to productions (1) and (2) of the alias grammar gives:

$$\begin{aligned} lv(M)&= lv(\overline{D}\ V\ D) \\&= lv(\overline{D}) + lv(V) + lv(D) \\&= 0 \\ lv(V)&= lv((M?\overline{A})^{*}\ M? (AM?)^{*}) \\&= (lv(M)? + lv(\overline{A}))^{*} + lv(M)? + (lv(A) + lv(M)?)^{*} \\&= 0 \end{aligned}$$

According to the definition above, \(lv(A)=lv(\overline{A})=0, lv(D)=-1, lv(\overline{D})=1\). Here ‘?’ is interpreted as a factor of 0 or 1, and each of the two ‘\(*\)’ stands for an arbitrary natural number of repetitions, say m and n. Substituting these values into the two formulas above gives \(lv(M)=lv(V)=0\).
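The substitution can be written out step by step. The following is a sketch of the omitted reasoning under the definitions above; the indicator symbols \(a_i, b, c_j\) and the indexed sub-paths \(M_i, M_0, M'_j\) are notation introduced here for illustration.

$$\begin{aligned} lv(M)&= lv(\overline{D}) + lv(V) + lv(D) = 1 + lv(V) - 1 = lv(V) \\ lv(V)&= \sum _{i=1}^{m}\bigl (a_i\,lv(M_i) + lv(\overline{A})\bigr ) + b\,lv(M_0) + \sum _{j=1}^{n}\bigl (lv(A) + c_j\,lv(M'_j)\bigr ) \\&= \sum _{i=1}^{m} a_i\,lv(M_i) + b\,lv(M_0) + \sum _{j=1}^{n} c_j\,lv(M'_j), \qquad a_i, b, c_j \in \{0,1\} \end{aligned}$$

A V-path that contains no nested M has \(a_i=b=c_j=0\), so its level difference is 0, and so is that of the M-path that encloses it. By induction on the nesting depth of M inside V, every nested \(lv(M_i)\), \(lv(M_0)\) and \(lv(M'_j)\) is 0, hence \(lv(V)=0\) and \(lv(M)=lv(V)=0\) in general.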

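That is to say, both memory aliases and value aliases require that the level difference of the traversed path is 0. The converse does not hold: the two endpoints of a path with a level difference of 0 are not necessarily aliases of each other. For example, the path \(A\overline{A}\) has a level difference of 0, but it cannot be derived from the grammar of aliases: it does not start with \(\overline{D}\), so it is not an M-path, and in production (2) every top-level \(\overline{A}\) must precede every top-level A, so it is not a V-path either.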

SDPN uses two filtering rules based on the above conclusions to obtain alias pointers related to free pointers from all pointers in the program.

Rule 1: For any node p, every alias node of p has the same level value as p, but a node with the same level value as p is not necessarily its alias.

This is a necessary but not sufficient condition. Although we cannot use it to accurately determine the alias, we can use it to do the first screening. With this simple condition, the number of candidate nodes with a size of \(|\mathcal {C}\cup \mathcal {U}\cup \mathcal {F}|\) drops drastically.

Rule 2: Starting from node p, nodes that are reachable only by paths through a node o with \(lv(o) < 0\) will not be aliases of p.

If \(lv(o)\ge 1\), a path from o may reach a node m with \(lv(m)=1\).

If \(lv(o)=0\), node o is reached by dereferencing node p once. When the address of node o is taken and assigned to another pointer q, q returns to the \(lv(q)=1\) layer, so q and p may become aliases.

If \(lv(o)=-1\), node o is reached by dereferencing node p twice. Suppose this dereference chain is \(p\xrightarrow {*}m\xrightarrow {*}o\). The level can only rise through ADDR or STORE instructions. That is, the address of node o must be stored in a new memory address node n, and the address of n stored in q, which generates an address chain \( q\xleftarrow { \& }n\xleftarrow { \& }o\). Nodes m and n may be aliases because both may point to node o. But p and q cannot be aliases, because the new node q can only point to the new node n and cannot point to the old node m. All aliases of p must point to node m itself, not to a new node, even if the pointer represented by the new node n is equal to m.

The deeper reason is that a pointer can be used to obtain the object it points to, but the pointer cannot in general be recovered from the object. It follows that a node cannot affect the alias relationship between nodes that are two or more layers higher than itself. In the example of Fig. 1, the level of node p is 1, so the level of node 9 is -1, and the node y, which is reached through node 9, can be filtered out in advance.

SDPN generates the initial alias candidates by starting a demand-driven graph traversal from the pointer in every free statement, computing level values and applying these two rules along the way, and combines the resulting nodes into the set \(\mathcal {S} (p)\).
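A minimal sketch of such a level-filtering traversal is given below, assuming the PEG is stored as an adjacency list with explicit reverse edges. The helper names (`add_edge`, `level_filter`), the toy graph, and the `max_level` bound are assumptions made for illustration; they are not taken from the SDPN implementation.

```python
from collections import deque

LEVEL_DELTA = {"A": 0, "A_bar": 0, "D": -1, "D_bar": +1}
REVERSE = {"A": "A_bar", "A_bar": "A", "D": "D_bar", "D_bar": "D"}


def add_edge(graph, src, label, dst):
    """Add an edge and its reverse edge, as every PEG edge has one."""
    graph.setdefault(src, []).append((label, dst))
    graph.setdefault(dst, []).append((REVERSE[label], src))


def level_filter(graph, free_ptr, max_level=4):
    """Collect candidate aliases of free_ptr using Rule 1 and Rule 2.

    max_level is a hypothetical bound that keeps the traversal finite.
    """
    start = (free_ptr, 1)              # the freed pointer sits at level 1
    seen = {start}
    queue = deque([start])
    candidates = set()
    while queue:
        node, level = queue.popleft()
        if level == 1 and node != free_ptr:
            candidates.add(node)       # Rule 1: same level as free_ptr
        for label, nxt in graph.get(node, []):
            nxt_level = level + LEVEL_DELTA[label]
            if nxt_level < 0 or nxt_level > max_level:
                continue               # Rule 2: never pass through level < 0
            state = (nxt, nxt_level)
            if state not in seen:
                seen.add(state)
                queue.append(state)
    return candidates


# Toy PEG (hypothetical): an A edge runs from the assigned-from node
# to the assigned-to node, matching p <-A- q for "p = q".
g = {}
add_edge(g, "q", "A", "p")      # p = q
add_edge(g, "q", "A", "r")      # r = q
add_edge(g, "p", "D", "*p")     # level 0
add_edge(g, "*p", "D", "**p")   # level -1, pruned by Rule 2
add_edge(g, "**p", "A", "t")
add_edge(g, "t", "D_bar", "u")
add_edge(g, "u", "D_bar", "y")  # level 1, but only reachable via level -1
add_edge(g, "*p", "A", "s")     # level 0
add_edge(g, "w", "D", "s")      # s is obtained by dereferencing w

print(sorted(level_filter(g, "p")))
```

In this toy graph the result contains `w`, which shares the level of `p` without necessarily being its alias; this is why a refinement stage is still needed after the level filter.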

4.3 Stage 2: Refined Filter

\(\mathcal {S} (p)\) is a set of alias candidates selected according to necessary conditions, many of which are false positives. This section discusses how to use the definition of alias and variable scope to filter out the candidates that are not true aliases.

Definition of Alias Analysis. SDPN pairs the elements of S(p) one by one with the released pointer p for more precise filtering. There are two ways to determine whether two pointers are aliases: single-source-single-target alias analysis or the definition of alias. Since single-source-single-target alias analysis is similar to the Stage 1 analysis, SDPN uses the definition for further analysis:

def: Two pointers are aliases if they point to the same memory.

SDPN simply computes the PointTo sets of the dangling pointer candidate dp and the free pointer p separately, and then determines the relationship between the two sets by the definition of alias. If PointTo(dp) and PointTo(p) are exactly equal, dp and p are considered must-aliases. If the two sets intersect but are not necessarily equal, the two pointers are considered may-aliases. If the intersection of the two sets is \(\emptyset \), the two pointers are considered not aliased. Because SDPN also inserts dynamic checking code when repairing the dangling pointers in Sect. 4.4, we select may-alias as the filtering criterion in order to reduce the false negative rate.
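The three-way classification can be sketched as follows, using the points-to sets of the example in Sect. 3.1. The function names `classify_alias` and `refine` are hypothetical, and the points-to sets are supplied by hand here rather than by a real pointer analysis.

```python
def classify_alias(points_to_dp, points_to_p):
    """Classify the relation between two PointTo sets as described above."""
    a, b = set(points_to_dp), set(points_to_p)
    if not (a & b):
        return "no"      # empty intersection
    if a == b:
        return "must"    # sets are exactly equal
    return "may"         # sets intersect but differ


def refine(candidates, points_to, free_ptr):
    """Keep every candidate that is at least a may-alias of free_ptr."""
    target = points_to[free_ptr]
    return {
        dp for dp in candidates
        if classify_alias(points_to.get(dp, set()), target) != "no"
    }


# Points-to sets from the example of Sect. 3.1 (node 7 is the variable v2).
pts = {"p": {"o"}, "v": {"o"}, "v2": {"o"}, "w": {"o3"}, "y": {"o2"}}
print(sorted(refine({"v", "v2", "w", "y"}, pts, "p")))
```

Keeping may-aliases trades precision for fewer false negatives, which is acceptable here because the repair stage adds a runtime check.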

Useless Alias Filter. SDPN divides S(p) into three categories according to the positional relationship between the dangling pointers and the pointers in free statements: global dangling pointers, dangling pointers in the same function as the free statement, and potentially dangerous dangling pointers in a different function from the free statement. The scope of a stack pointer is limited to the function the pointer belongs to. The more complicated case is when a dangling pointer and the pointer in the free statement come from different functions.

The life cycle of a global pointer goes through two stages: the normal pointer stage before the free expression and the dangling pointer stage after it. A global pointer lives throughout the entire program, so once it becomes a dangling pointer, it poses a potential threat until the program ends. Not every dangling pointer causes a use-after-free vulnerability, but dangling pointers are the prerequisite for most use-after-free vulnerabilities. These global dangling pointers are all nullified at the appropriate locations, as described in Sect. 4.4.

The life cycle of a stack pointer is short, so some methods that dynamically clear dangling pointers simply ignore all stack pointers. Most of the dangling pointers located in a different function from the released pointer have already been destructed when the program executes the free statement. However, some dangling pointers are still on the function stack and are not destructed when the free statement is executed; these are not discussed by other papers. Stack pointers therefore also carry the risk of serious vulnerabilities.

SDPN uses relative call paths to solve the problem. The relative call string \(rcs=[f_{1}...f_{n}]\) is used to represent the function call relationship from the function, \(f_1\), where a dangling pointer dp is located to the function, \(f_n\), where release statement free is located. If such a path exists, it means that there is a runtime state where the dangling pointer is still in the function stack, and the program will re-enter the scope of dp in the future. If such a path does not exist, the dangling pointer dp can be ignored. This is filter Rule 3:

Rule 3: A dangling pointer will be filtered if there is no call path from the function containing the dangling pointer to the free statement.
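As an illustration of Rule 3, the sketch below searches a call graph for a relative call string with a breadth-first search. The function name `relative_call_string`, the adjacency-list representation, and the toy call graph (including `main` and `helper`) are assumptions for illustration only.

```python
from collections import deque


def relative_call_string(call_graph, dp_func, free_func):
    """Return one call path [f_1, ..., f_n] from dp_func to free_func,
    or None if no such path exists (the pointer is filtered by Rule 3)."""
    parents = {dp_func: None}
    queue = deque([dp_func])
    while queue:
        func = queue.popleft()
        if func == free_func:
            path = []
            while func is not None:     # walk back to the start function
                path.append(func)
                func = parents[func]
            return path[::-1]
        for callee in call_graph.get(func, []):
            if callee not in parents:
                parents[callee] = func
                queue.append(callee)
    return None


# Hypothetical toy call graph: caller -> list of callees.
cg = {
    "main": ["compressStream", "helper"],
    "compressStream": ["BZ2_bzWriteClose64"],
    "BZ2_bzWriteClose64": [],
    "helper": [],
}
print(relative_call_string(cg, "compressStream", "BZ2_bzWriteClose64"))
print(relative_call_string(cg, "helper", "BZ2_bzWriteClose64"))
```

A dangling pointer located in `helper` would be dropped in this toy graph, because no call path leads from `helper` to the function containing the free statement.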

SDPN combines Rule 3 with the stack pointer repair localization implemented in Algorithm 1, discussed in Sect. 4.4. SDPN uses these strategies to eliminate a large number of meaningless stack pointers in S(p) and produce a new set DP(p). The dangling pointers remaining in DP(p) are either risky or meaningful.

4.4 Stage 3: Repair Generator

Dangling pointers in DP(p) now have to be repaired, including global dangling pointers, dangling pointers in the same function as the free statement, and potentially dangerous dangling pointers in a different function from the free statement. In this section, we design separate repair schemes for global dangling pointers and stack dangling pointers.

Fig. 4. Repair Example: g_bzf was added for demonstration purposes. The b and bzf in the function BZ2_bzWriteClose64, the bzf in the function compressStream and the global variable g_bzf are the dangling pointers formed by the free statement in the function BZ2_bzWriteClose64.

figure c

Global Dangling Pointers Repair. Generally, a global dangling pointer has the longest life cycle, so it has the greatest probability of being turned into a use-after-free vulnerability. Global dangling pointers can be accessed from most locations in the program, so they can all be nullified directly after the appropriate free statements, like g_bzf in Fig. 4. Nullifying dangling pointers cannot completely prevent vulnerabilities, but it can downgrade high-threat vulnerabilities to null pointer dereferences, which are of little use to attackers. Because the analysis results contain false positives, a conditional statement should be inserted before the nullification code. This conditional statement checks whether each global pointer actually points to the memory location pointed to by bzf at the moment the program has just executed the free(bzf) expression.
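The effect of the guarded nullification can be simulated in a few lines. This is a toy model, not the instrumentation SDPN emits: pointers are modeled as a dictionary from names to integer addresses, and the names `free_and_nullify`, `g_other` and the address values are assumptions.

```python
NULL = 0


def free_and_nullify(pointers, freed_name, candidates):
    """Simulate free(freed_name) followed by guarded nullification.

    pointers maps pointer names to the addresses they currently hold.
    candidates are the statically computed (possibly over-approximate)
    dangling pointers for this free statement.
    """
    freed_addr = pointers[freed_name]
    # free(freed_name) would release the chunk at freed_addr here.
    for name in candidates:
        # Runtime guard: only clear pointers that really alias the chunk,
        # so a false positive from the static analysis is left untouched.
        if pointers[name] == freed_addr:
            pointers[name] = NULL
    pointers[freed_name] = NULL
    return pointers


state = {"bzf": 0x1000, "g_bzf": 0x1000, "g_other": 0x2000}
print(free_and_nullify(state, "bzf", ["g_bzf", "g_other"]))
```

Here `g_other` stands for a false positive of the static analysis: the guard leaves it untouched because it does not hold the freed address at runtime.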

Stack Dangling Pointers Repair. Stack dangling pointers are more difficult to nullify. The life cycle of a stack variable is short, and its scope is the function to which it belongs. However, dangling pointers are often located in a different function from the free statement, so some of them cannot be accessed immediately after the free statement. If the stack dangling pointers in DP(p) are in the same function stack frame as the free statement, they can be nullified in the same way as global pointers. Otherwise, a stack dangling pointer can only be accessed or nullified within its own function. Therefore, the nullification statement should be inserted, within that function, after a statement whose call chain can reach the free statement, as in Fig. 4.

For dangling pointers located in a different function from their free statement, this paper proposes Algorithm 1 to find the appropriate insertion locations for nullification statements. These pointers cannot be cleared immediately after the free statement, and the use operation of a use-after-free vulnerability cannot occur at that point either. The dangerous execution region is the remaining unexecuted statements of the function in which the dangling pointer was created. The algorithm finds all positions in the function containing the dangling pointer that can reach the free statement in the call graph. The function isReachable uses a common graph traversal algorithm. In addition, the same function may contain multiple call sites that can reach the free statement, so the output of the algorithm is a set containing all the locations at which nullification is to be inserted.
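Since Algorithm 1 is not reproduced in this text, the following is only a sketch of the idea as described in the paragraph above: every call site in the pointer's function whose callee can reach the function containing the free statement becomes an insertion point. The names `is_reachable` and `insertion_points`, the `(location, callee)` representation, and the toy data are assumptions.

```python
def is_reachable(call_graph, src, dst):
    """Depth-first search: can function src reach function dst?"""
    stack, seen = [src], {src}
    while stack:
        func = stack.pop()
        if func == dst:
            return True
        for callee in call_graph.get(func, []):
            if callee not in seen:
                seen.add(callee)
                stack.append(callee)
    return False


def insertion_points(call_sites, call_graph, free_func):
    """Return the locations of all call sites that may lead to the free.

    call_sites is a list of (location, callee) pairs taken from the
    function that owns the dangling pointer; the nullification statement
    would be inserted right after each returned location.
    """
    return {
        loc for loc, callee in call_sites
        if is_reachable(call_graph, callee, free_func)
    }


# Hypothetical call graph and call sites inside compressStream.
cg = {
    "wrapper": ["BZ2_bzWriteClose64"],
    "BZ2_bzWriteClose64": [],
    "log_msg": [],
}
sites = [(12, "log_msg"), (30, "BZ2_bzWriteClose64"), (41, "wrapper")]
print(sorted(insertion_points(sites, cg, "BZ2_bzWriteClose64")))
```

An empty result corresponds to Rule 3: no call path exists, so the pointer needs no repair.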

5 Evaluation

In this section, we compare SDPN-protected programs with the original programs. First, we test how effectively SDPN mitigates UAF exploits. Next, we measure the overhead of the analysis phase and the runtime phase. All experiments are carried out on an Ubuntu 20.04.3 LTS system (Linux kernel 5.11.0-44-generic) with a quad-core 3.90 GHz CPU (Intel Xeon E3-1240 v5), 48 GB RAM, and a 1 TB SSD.

5.1 Implementation

We implement a prototype of SDPN that statically clears dangling pointers on top of the LLVM 12.0.0 compiler framework. SDPN takes a bytecode file composed of LLVM IR as input and dumps the repaired IR to another bytecode file ending in ‘_dpn.bc’. SDPN uses SVF [12] to extract the CFG, PAG, SVFG and other intermediate graph structures from the LLVM IR. SDPN combines the CFG with LLVM's instruction traversal API to obtain the execution order of instructions. SDPN converts the PAG of the SVF project into the PEG with level for pointer selection and filtering. For pointer analysis, SDPN uses SUPA [13], a demand-driven, strong-update pointer analysis performed on the SVFG. The pointer filtering and verification rules, as well as repair localization and generation, are implemented in C++ on top of LLVM and the SVFG.

5.2 Security

To test the effectiveness of our method, we chose five real-world UAF vulnerabilities in two software programs. We selected open-source programs with available PoCs, as listed in Table 1.

Table 1. Security evaluation against real world vulnerabilities

5.3 Runtime Overhead

As SDPN employs static analysis and static instrumentation to clear potential dangling pointers, the main overhead of this method lies in the static analysis phase, and its runtime overhead is small. The statically inserted code is simple: if instructions, clearing operations, and a few additional global variables. Since the number of dangling pointers corresponding to each object is relatively limited, relatively few instructions are inserted, in contrast to blindly tracing pointers dynamically. Therefore, the file size after instrumentation should not differ much from the original. The data in Table 2 confirm that the file size changes little. Intuitively, this method should incur extremely small, or almost no, additional memory overhead, and the experimental results show that the additional memory overhead is indeed very small.

Table 2. Statistics for SPEC CPU2006.

The overhead of SDPN on SPEC CPU2006 is shown in Table 2. The runtime overhead of SDPN is small, with a maximum additional runtime overhead of less than 1%. The three types of pointers affect the program overhead quite differently. Global dangling pointers and dangling pointers in the same function include many pointers used directly by free statements. For these dangling pointers, we only add clearing operations, while for the remaining dangling pointers we add clearing guarded by a safety check. The branch statements of these checks bring more overhead to the program.

6 Discussion

Loops and pointer arithmetic make many static analyses inaccurate. A loop continuously increases the node level value calculated by SDPN. Given that the level of the freed pointer node is 1, we conjecture that the more a node's level differs from it, the less likely the node is to become a dangling pointer. SDPN therefore performs dynamic loop unrolling bounded by a maximum level value: if the level value is higher when entering the loop, the loop is unrolled fewer times; if it is lower, the loop is unrolled more times. In LLVM, pointer computation and address dereference operations are separate instructions, so SDPN interprets computation instructions as level-invariant propagation instructions. Although the pointer filtering considers these two situations, the pointer analysis selected for the subsequent stage does not, so the final result ignores both cases.

7 Conclusion

Several dynamic dangling pointer elimination methods are available, but programmers are reluctant to deploy them because of their high runtime overheads. Being based on static program analysis, SDPN has low runtime overhead and changes the original program only slightly. We applied it to real-world UAF vulnerabilities, which shows the effectiveness and compatibility of SDPN. Through a series of evaluation experiments, we showed that SDPN has very low runtime overhead and hardly changes the program. We believe SDPN can be applied to real-world applications to eliminate dangling pointers.