More precise construction of static single assignment programs using reaching definitions

doi:10.1016/j.jss.2020.110590

Journal of Systems and Software

Volume 166, August 2020, 110590

https://doi.org/10.1016/j.jss.2020.110590

Abstract

The Static Single Assignment (SSA) form is an intermediate representation used for the analysis and optimization of programs in modern compilers. Placing ϕ-functions is the most computationally expensive part of converting a program into SSA form. The most widely used ϕ-function placement algorithms are based on computing dominance frontiers (DF). However, algorithms of this kind work under the limiting assumption that all variables are defined at the beginning of the program, which is not the case for local variables. In this paper, we introduce a novel ϕ-placement algorithm based on computing reaching definitions (RD), which generates a precise number of ϕ-functions. We provide theorems and proofs establishing the correctness and the theoretical computational complexity of our algorithms. We implemented our approach and a well-known DF-based algorithm in the Clang/LLVM compiler framework and performed experiments on a number of benchmarks. The results show that, compared with the more accurate results of our RD-based approach, the limiting assumption of the DF-based algorithm leads to up to 87% (69% on average) superfluous ϕ-functions across all benchmarks, which is a significant loss of precision. Moreover, even though our approach computes more information to generate precise results, it analyzes up to 92.96% of the procedures (65.63% on average) of all benchmarks within twice the execution time of the reference DF-based approach.

Introduction

Most current compilers and virtual machines, including the well-known GNU Compiler Collection (GCC)¹, the LLVM Compiler Infrastructure (LLVM)², and the Java Hotspot³, use the so-called static single assignment (SSA) form as an intermediate representation (IR) of programs. SSA programs are often used for efficient program analysis, transformation, optimization, and efficient register allocation. Programs represented in the SSA form require that each variable is defined exactly once, but it may be used multiple times. Moreover, the variable definition should always appear before its use.

Any straight-line sequence of non-SSA code can be converted to SSA form by renaming the program variables in a way that adheres to the definition of an SSA program, as shown in Fig. 1. However, if the code contains branching instructions, the renaming process is complicated by the fact that multiple definitions of a program variable may reach control flow merge points. For example, the print statement in Fig. 2 receives two distinct definitions of Y from the two branches of the if statement. It may be hard, or even impossible, to statically decide which definition of that variable to use afterwards. Any non-SSA program is transformed to SSA form by performing the following two steps:

(i) identifying the merge points in the control flow graph (CFG) of the program to place pseudo-assignments of the form $x = ϕ(x, \dots, x)$ for each variable x, where multiple distinct definitions of x may arrive through different branches of the control flow, and
(ii) renaming each x such that any assignment or pseudo-assignment to x (i.e., a definition of x) is uniquely renamed and uses the renamed x at each reference of that particular definition.

Each argument of the ϕ-function corresponds to a particular reaching definition of x coming from one of the branches. Thus, a so-called join set $J^{+} (S)$ identifying all those merge points requiring pseudo-assignments for each variable x needs to be constructed, where S is the set of CFG nodes containing assignments to x.
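The straight-line case is simple enough to sketch in a few lines. The following Python fragment is a minimal illustration, not the renaming procedure used in this paper: it assumes statements are given as (target, expression) pairs, that identifiers match a simple regular expression, and that versions are written with a hypothetical `name_k` suffix. Variables that are used before any definition are left unchanged, since they are inputs to the fragment.

```python
import re
from collections import defaultdict

def rename_straight_line(statements):
    """Rename a branch-free list of (target, expression) assignments so
    that every variable is assigned exactly once (illustrative sketch)."""
    version = defaultdict(int)  # latest version number of each variable
    renamed = []
    for target, expr in statements:
        def latest(match):
            # A use refers to the most recent definition seen so far.
            name = match.group(0)
            return f"{name}_{version[name]}" if version[name] else name
        new_expr = re.sub(r"[A-Za-z_]\w*", latest, expr)
        version[target] += 1  # the definition gets a fresh version
        renamed.append((f"{target}_{version[target]}", new_expr))
    return renamed

program = [("x", "1"), ("y", "x + 2"), ("x", "x * y"), ("y", "x + y")]
for target, expr in rename_straight_line(program):
    print(f"{target} = {expr}")
```

Because there is no branching, each use has exactly one reaching definition and no ϕ-function is needed. The difficulty described above arises only once two such definitions can meet at a merge point.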

Cytron et al. (1991) carried out pioneering work on a pragmatically efficient construction of the SSA form based on computing so-called dominance frontiers (DF), which are relations among CFG nodes based on dominators.⁴ This approach is currently leveraged by most SSA construction algorithms used in modern compilers, such as LLVM. A closer look at Cytron et al.’s approach shows that, on the assumption that computing the aforementioned join set $J^{+}(S)$ directly is inefficient in practice, the iterated dominance frontier $DF^{+}(S)$ is computed instead. This substitution is possible thanks to the equality $DF^{+}(S) = J^{+}(S)$, which holds when S contains the Entry node of the CFG (Cytron et al., 1991; Weiss, 1992). The set S contains all CFG nodes in which a particular program variable is defined. Since the equality requires the Entry node to be included in S, DF-based SSA construction methods implicitly assume that all program variables are defined at the beginning of the program. This is a limiting restriction that does not always hold, especially for local variables, which are mostly declared at the beginning but defined later in the program. Thus, all DF-based SSA construction algorithms produce superfluous ϕ-functions and hence construct larger SSA programs than necessary.
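To make the role of the Entry node concrete, the following is a minimal Python sketch of the textbook dominance frontier computation. It is not the implementation evaluated in this paper. The CFG is a hypothetical single loop (entry, n0 to n3, exit), the function names `dominators`, `dominance_frontiers`, and `iterated_df` are illustrative, and the code assumes that every node is reachable from the entry node.

```python
def dominators(succ, entry):
    """Iteratively compute the dominator set of every node."""
    nodes = set(succ)
    pred = {n: set() for n in nodes}
    for n, targets in succ.items():
        for t in targets:
            pred[t].add(n)
    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            # n is dominated by itself and by whatever dominates all preds.
            new = {n} | set.intersection(*(dom[p] for p in pred[n]))
            if new != dom[n]:
                dom[n], changed = new, True
    return dom, pred

def dominance_frontiers(succ, entry):
    """DF(x): nodes y such that x dominates a predecessor of y
    but does not strictly dominate y."""
    dom, pred = dominators(succ, entry)
    df = {n: set() for n in succ}
    for y in succ:
        for p in pred[y]:
            for x in dom[p]:
                if x == y or x not in dom[y]:
                    df[x].add(y)
    return df

def iterated_df(df, defs):
    """Close the dominance frontier of the definition nodes under DF."""
    result, work = set(), list(defs)
    while work:
        n = work.pop()
        for y in df[n] - result:
            result.add(y)
            work.append(y)
    return result

# Hypothetical loop: entry -> n0 -> n1 -> n2 -> n3 -> n0, and n0 -> exit.
succ = {"entry": ["n0"], "n0": ["n1", "exit"], "n1": ["n2"],
        "n2": ["n3"], "n3": ["n0"], "exit": []}
df = dominance_frontiers(succ, "entry")
print(iterated_df(df, {"n1", "n3"}))  # {'n0'}
```

For a variable defined only at n1 and n3, the iterated dominance frontier contains the loop header n0, so a DF-based method places a ϕ-function there. In this hypothetical graph every path from n1 to n0 passes through n3, so no two disjoint paths from distinct definitions converge at n0 unless Entry is also treated as a definition.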

Our contribution. In this work, we explore the impact of the seemingly benign condition that S includes the Entry node, under which the equality $DF^{+}(S) = J^{+}(S)$ holds. To do so, we provide an algorithm based on computing reaching definitions (RDs) (Nielson et al., 1999) that accurately computes the join set $J^{+}(S)$ for a freely chosen set S of variable definitions. Computing RDs for SSA construction is nontrivial and more complex than computing RDs for program analysis. Our novel approach to computing $J^{+}(S)$ is efficient on most of our benchmarks. When the Entry node is included in S, DF-based approaches and ours produce the same number of ϕ-functions.

On the other hand, our algorithm produces more accurate ϕ-functions by considering that only global variables and formal parameters are defined at the Entry node of the CFG. Our experiments on a number of benchmarks reveal that the DF-based SSA construction approach generates up to 87%, and on average 69%, superfluous ϕ-functions compared to those generated by our RD-based approach. Our approach is applicable to both structured and unstructured programs containing dense or sparse variable definitions. Moreover, along with constructing the SSA form, our RD information can be reused to optimize the generated SSA program.

Note that this is an invited extended version of the paper entitled “Towards constructing the SSA form using reaching definitions over dominance frontiers” (Masud and Ciccozzi, 2019) and published at the IEEE International Working Conference on Source Code Analysis and Manipulation. More specifically, this article has been extended as follows:

• We have included a new section with a detailed description of where and how the DF-based approach loses precision in computing ϕ-functions (Section 3).
• We have extended the formal development of RD-based SSA construction in Section 4.2. Three new algorithms (Algorithms 3 to 5) are included to provide a complete and detailed picture of how we perform the computation.
• We have added a new section (Section 5) discussing the correctness of our algorithms and their computational complexity. Section 5.1 provides Theorem 1 and its proof, along with some auxiliary lemmas, to establish the correctness of our algorithms. Section 5.2 provides Theorem 2, some auxiliary lemmas, and their proofs, which state the computational complexity of our algorithms.
• We have extended our experimental evaluation by running our solution on six additional benchmarks, which confirmed our positive results (Section 6).

Paper organization. The remainder of the paper is organized as follows. In Section 2 we provide the core concepts and terminology upon which our approach is based. In Section 3 we describe the precision loss in the DF-based approach. Our RD-based approach is described in detail in Section 4, while its correctness and computational complexity are discussed in Section 5. The extended experimental evaluation is presented in Section 6. Related work is outlined in Section 7, and Section 8 concludes the paper with a summary and future work.

Section snippets

Background and terminology

Definition 1 Control flow graph (CFG)

The CFG of any given program is a directed graph $G = (N, E, \mathit{entry})$ where

• N is the set of nodes and each node n ∈ N represents a basic block containing a straight-line sequence of code,
• E ⊆ N × N is the set of edges representing the program control flow, and
• entry is the unique Entry node representing the starting basic block from where the execution starts.
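As an illustration of Definition 1, the triple (N, E, entry) can be encoded directly. The Python sketch below is one possible encoding under our own assumptions: the class name `CFG`, the use of strings as node identifiers, and the helper methods `successors` and `predecessors` are hypothetical and not taken from the paper's implementation.

```python
from dataclasses import dataclass

@dataclass
class CFG:
    nodes: set    # N: one identifier per basic block
    edges: set    # E: (source, target) pairs, a subset of N x N
    entry: str    # the unique Entry node

    def __post_init__(self):
        # Enforce the constraints stated in the definition.
        if self.entry not in self.nodes:
            raise ValueError("entry must be a node of the graph")
        for src, dst in self.edges:
            if src not in self.nodes or dst not in self.nodes:
                raise ValueError(f"edge ({src}, {dst}) leaves the node set")

    def successors(self, n):
        return {dst for src, dst in self.edges if src == n}

    def predecessors(self, n):
        return {src for src, dst in self.edges if dst == n}

# A hypothetical loop with header n0 and body n1.
g = CFG(nodes={"entry", "n0", "n1", "exit"},
        edges={("entry", "n0"), ("n0", "n1"), ("n1", "n0"), ("n0", "exit")},
        entry="entry")
print(sorted(g.predecessors("n0")))  # ['entry', 'n1']
```

A node with more than one predecessor, such as n0 here, is a control flow merge point and therefore a candidate location for a ϕ-function.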

Note that the above definition of CFG is intraprocedural. Since SSA construction is usually performed per procedure, we consider only intraprocedural

Precision loss of DF-based ϕ-placement methods

DF-based ϕ-placement methods lose precision due to the assumption that all program variables are defined at the beginning. In the following, we illustrate this precision loss with some examples obtained from a real-life benchmark suite SPEC CPU2017 (Bucek et al., 2018).

Consider the CFG skeleton in Fig. 3(a). The cycle $n_{0}, n_{1}, n_{2}, n_{3}, n_{0}$ in the CFG represents a loop structure in the program code. Variable $ix$ is locally declared inside the loop body at $n_{1}$ and defined at $n_{1}$ and $n_{3}$. This local

SSA Construction procedure

In this section, we provide methods to compute the join sets requiring ϕ-functions without using the concept of dominance frontiers. The method is based on a forward dataflow analysis (Nielson et al., 1999) that accumulates facts about reaching variable definitions at different CFG nodes. In Section 4.1, we provide Algorithm 1 to perform the dataflow analysis, which collects dataflow facts called abstract and concrete reaching definitions. In Section 4.2, we develop methods to resolve all
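For readers unfamiliar with the underlying analysis, the following Python sketch shows the classical forward reaching-definitions analysis with a worklist. It is the textbook formulation only and should not be read as Algorithm 1, which additionally distinguishes abstract and concrete reaching definitions. The function name, the encoding of a definition as a (variable, node) pair, and the diamond-shaped CFG are assumptions made for illustration.

```python
from collections import deque

def reaching_definitions(succ, defs):
    """succ maps each node to its successors; defs maps a node to the
    set of variables it assigns. Returns the definitions reaching the
    entry and the exit of every node as sets of (variable, node) pairs."""
    pred = {n: [] for n in succ}
    for n, targets in succ.items():
        for t in targets:
            pred[t].append(n)
    rd_in = {n: set() for n in succ}
    rd_out = {n: set() for n in succ}
    work = deque(succ)
    while work:
        n = work.popleft()
        # Forward analysis: merge the facts flowing out of all predecessors.
        rd_in[n] = set().union(*(rd_out[p] for p in pred[n]))
        assigned = defs.get(n, set())
        # Kill older definitions of reassigned variables, then add new ones.
        new_out = {(v, d) for (v, d) in rd_in[n] if v not in assigned}
        new_out |= {(v, n) for v in assigned}
        if new_out != rd_out[n]:
            rd_out[n] = new_out
            work.extend(succ[n])  # successors must be re-evaluated
    return rd_in, rd_out

# Hypothetical if-then-else: both branches assign y before the merge point.
succ = {"entry": ["then", "else"], "then": ["join"],
        "else": ["join"], "join": []}
rd_in, _ = reaching_definitions(succ, {"then": {"y"}, "else": {"y"}})
print(sorted(rd_in["join"]))  # [('y', 'else'), ('y', 'then')]
```

Two distinct definitions of y reach the merge point, which is the situation in which a ϕ-function for y is required there.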

Correctness of computing ϕ nodes

In the remainder of this section, we assume the following:

• (N, E) is the CFG of any program,
• $N_{μ}$ is the set of pseudo nodes related to N,
• $G_{μ}^{α} = (N_{μ}, E_{μ})$ is any dependency graph generated according to Definition 3,
• C ⊆ $N_{μ}$ is the set of CDs for the variable x ∈ Var,
• Σ records the flow of CDs in C in Algorithm 2, and
• $\mathit{Nodes}_{ϕ}(C, Σ, G_{μ}^{α})$ is the set of ϕ nodes computed in Algorithm 3.

In this section, we provide Theorem 1 to state the correctness of Algorithm 4. We prove this theorem with the aid of some auxiliary

Experimental evaluation

We implemented both our approach and the DF-based ϕ-placement approach of Cytron et al. (1991) in the Clang/LLVM compiler framework (Lattner and Adve, 2004). We performed the experiments on an Intel(R) Core(TM) i7-7567U CPU at 3.50 GHz using a number of SPEC CPU2017 (Bucek et al., 2018) benchmarks consisting of approximately 2081 KLOC. SPEC is a set of industry-standardized, CPU-intensive suites for measuring and comparing, among others, compute-intensive performance and compilers. Table 2 shows

Related work

The first approach to generate the set of nodes that require pseudo assignments, or ϕ-functions, dates back to the work of Shapiro and Saint (1970). Subsequent contributions include the work of (i) Reif (1978) providing a complex ϕ-placement algorithm in a bottom-up walk of the dominator tree, and (ii) Rosen et al. (1988) generating SSA form for reducible programs. However, Cytron et al. (1991) presented the first practically efficient algorithm based on computing dominance frontiers to

Conclusion and future work

Most SSA construction algorithms are based on computing dominance frontiers, which is very efficient for reducible programs. However, the correctness and precision condition (i.e., $DF^{+}(S) = J^{+}(S)$) of any DF-based method depends on the limiting assumption that all program variables are defined at the beginning (i.e., S contains the Entry node), which is not always the case for local variables. To understand the impact of this assumption, we have developed a novel RD-based ϕ-placement algorithm

CRediT authorship contribution statement

Abu Naser Masud: Conceptualization, Investigation, Methodology, Software, Data curation, Writing - original draft, Visualization, Validation. Federico Ciccozzi: Writing - review & editing, Project administration, Funding acquisition, Validation.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgment

This research is supported by the Knowledge Foundation through the MOMENTUM and the HERO projects led by the Mälardalen University.

References (23)

Y. Sui et al. Parallel construction of interprocedural memory SSA form. J. Syst. Softw. (2018)
F.E. Allen. Control flow analysis. Proceedings of a Symposium on Compiler Optimization (1970)
J. Aycock et al. Simple generation of static single-assignment form. Proceedings of the 9th International Conference on Compiler Construction, CC ’00 (2000)
G. Bilardi et al. Algorithms for computing the static single assignment form. J. ACM (2003)
M.M. Brandis et al. Single-pass generation of static single-assignment form for structured languages. ACM Trans. Program. Lang. Syst. (1994)
M. Braun et al. Simple and efficient construction of static single assignment form. Proceedings of the 22nd International Conference on Compiler Construction, CC ’13 (2013)
P. Briggs et al. Practical improvements to the construction and destruction of static single assignment form. Softw. Pract. Exper. (1998)
J. Bucek et al. SPEC CPU2017: Next-generation compute benchmark. Companion of the 2018 ACM/SPEC International Conference on Performance Engineering, ICPE ’18 (2018)
J.-D. Choi et al. Automatic construction of sparse data flow evaluation graphs. Proceedings of the 18th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL ’91 (1991)
F.C. Chow et al. Effective representation of aliases and indirect memory operations in SSA form. Proceedings of the 6th International Conference on Compiler Construction, CC ’96 (1996)
C. Click et al. A simple graph-based intermediate representation. SIGPLAN Not. (1995)

