Constructing depth-optimum circuits for adders and And-Or paths

doi:10.1016/j.dam.2021.12.007

Discrete Applied Mathematics

Volume 310, 31 March 2022, Pages 10-31

https://doi.org/10.1016/j.dam.2021.12.007

Abstract

We examine the fundamental problem of constructing depth-optimum circuits for binary addition. More precisely, as in the literature, we consider the following problem: Given auxiliary inputs $t_{0}, \dots, t_{m - 1}$, the so-called generate and propagate signals, construct a depth-optimum circuit over the basis $\{And2, Or2\}$ computing all $n$ carry bits of an $n$-bit adder, where $m = 2n - 1$. In fact, carry bits are And-Or paths, i.e., Boolean functions of the form $t_{0} \lor (t_{1} \land (t_{2} \lor (\dots t_{m - 1}) \dots))$. Classical approaches construct so-called prefix circuits, which do not achieve a competitive depth. For instance, the popular construction by Kogge and Stone (1973) is only a 2-approximation. A lower bound on the depth of any prefix circuit is $1.44 \log_{2} m + \text{const}$, while recent non-prefix circuits have a depth of $\log_{2} m + \log_{2} \log_{2} m + \text{const}$. However, it is unknown whether any of these polynomial-time approaches achieves the optimum depth for all $m \in \mathbb{N}$.

We present a new exponential-time algorithm solving the problem optimally. The previously best exact algorithm by Hegerfeld (2018) with a running time of $O(2.45^{m})$ is viable only for $m \leq 29$. Our algorithm is significantly faster: We achieve a theoretical running time of $O(2.02^{m})$ and apply sophisticated pruning strategies to improve practical running times dramatically. This allows us to compute optimum circuits for all $m \leq 64$. Combining these computational results with new theoretical insights, we derive the optimum depths for the computation of all carry bits of $2^{k}$-bit adder circuits for all $k \leq 13$, previously known only for $k \leq 4$.

In fact, we solve a more general problem, namely delay optimization of generalized And-Or paths, which originates from late-stage logic optimization in VLSI design. Delay is a natural extension of circuit depth to prescribed input arrival times; and generalized And-Or paths are a generalization of And-Or paths where And and Or do not necessarily alternate. Our algorithm arises from our new structure theorem which characterizes delay-optimum generalized And-Or path circuits.

Introduction

In this work, we construct fast circuits for binary addition and for related Boolean functions, so-called And-Or paths. An And-Or path is a function of the form $t_{0} \lor (t_{1} \land (t_{2} \lor (\dots t_{m - 1}) \dots))$ for some $m \in \mathbb{N}$; and a circuit for a Boolean function is a graph-based model for the computation of the function via elementary building blocks (called gates) on a computer chip. Here, we use And2 and Or2 as elementary building blocks, i.e., logical And and Or with two inputs each. Motivated by VLSI design, our objective function is circuit delay, a generalization of circuit depth to prescribed input arrival times $a(t_{i}) \in \mathbb{N}$ for each input $t_{i}$. The delay of a circuit is the maximum delay of any input $t_{i}$, i.e., $a(t_{i})$ plus the maximum length of any directed path in the circuit starting in $t_{i}$. In particular, when $a \equiv 0$, circuit delay is actually circuit depth, i.e., the maximum length of any directed path in the circuit. Given a specific And-Or path with input arrival times, we want to find a delay-optimum circuit for this Boolean function using only $And2$ and $Or2$ gates. Important secondary objective functions include circuit size (i.e., number of gates) and fanout (i.e., number of successors of a gate).
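The two definitions above can be made concrete with a minimal sketch. The tuple-based circuit encoding and the names `and_or_path` and `delay` are assumptions chosen for illustration, not notation from the paper.

```python
from typing import Sequence, Union

def and_or_path(t: Sequence[int]) -> int:
    """Evaluate t0 OR (t1 AND (t2 OR (...))) from the innermost input outwards."""
    value = t[-1]
    for i in range(len(t) - 2, -1, -1):
        # Inputs at even positions are combined by Or, at odd positions by And.
        value = (t[i] | value) if i % 2 == 0 else (t[i] & value)
    return value

# Hypothetical encoding: a node is an input index (int) or a gate
# written as a tuple ("and" | "or", left_node, right_node).
Node = Union[int, tuple]

def delay(node: Node, arrival: Sequence[int]) -> int:
    """Arrival time at a node: a(t_i) for inputs, 1 + latest predecessor for gates."""
    if isinstance(node, int):
        return arrival[node]
    _, left, right = node
    return 1 + max(delay(left, arrival), delay(right, arrival))

# The straightforward circuit for the And-Or path on three inputs.
path3 = ("or", 0, ("and", 1, 2))
depth = delay(path3, [0, 0, 0])       # all arrival times 0: delay equals depth
late_t0 = delay(path3, [2, 0, 0])     # t0 arrives late but sits close to the output
```

With all arrival times equal to 0, `delay` returns the circuit depth, which matches the remark that depth is the special case $a \equiv 0$.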

And-Or paths occur as carry bits in the computation of adder circuits: Assume we compute the sum of two $n$-bit binary numbers $\sum_{i = 0}^{n - 1} a_{i} 2^{i}$ and $\sum_{i = 0}^{n - 1} b_{i} 2^{i}$. A circuit for this task can be constructed via carry bits which are defined recursively by $c_{0} = 0$ and $c_{i + 1} = g_{i} \lor (p_{i} \land c_{i})$ for $0 \leq i \leq n - 1$, where $g_{i} = a_{i} \land b_{i}$ and $p_{i} = a_{i} \oplus b_{i}$, see, e.g., Weinberger and Smith [35] or Knowles [21]. Using the carry bits, the sum $\sum_{i = 0}^{n} s_{i} 2^{i}$ can be computed via $s_{i} = c_{i} \oplus p_{i}$ for $i \in \{0, \dots, n - 1\}$ and $s_{n} = c_{n}$.
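As a sanity check of the recursion above, the following sketch computes the sum of two $n$-bit numbers purely from the signals $g_{i}$, $p_{i}$ and the carry bits $c_{i}$. The function name `add_via_carries` is an assumption for illustration.

```python
def add_via_carries(a: int, b: int, n: int) -> int:
    """Add two n-bit numbers using generate/propagate signals and carry bits."""
    a_bits = [(a >> i) & 1 for i in range(n)]
    b_bits = [(b >> i) & 1 for i in range(n)]
    g = [x & y for x, y in zip(a_bits, b_bits)]   # g_i = a_i AND b_i
    p = [x ^ y for x, y in zip(a_bits, b_bits)]   # p_i = a_i XOR b_i
    c = [0]                                       # c_0 = 0
    for i in range(n):
        c.append(g[i] | (p[i] & c[i]))            # c_{i+1} = g_i OR (p_i AND c_i)
    s = [c[i] ^ p[i] for i in range(n)] + [c[n]]  # s_i = c_i XOR p_i, s_n = c_n
    return sum(bit << i for i, bit in enumerate(s))
```

The loop over `i` is exactly the ripple-carry order: each carry waits for the previous one, which is why this naive evaluation has linear depth.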

The computation of all $g_{i}$ and $p_{i}$ as well as the computation of the sum from the carry bits only requires a constant depth and a linear number of gates. Therefore, a circuit computing all the And-Or paths

$$c_{i + 1} = g_{i} \lor (p_{i} \land c_{i}) = g_{i} \lor (p_{i} \land (g_{i - 1} \lor (p_{i - 1} \land c_{i - 1}))) = g_{i} \lor (p_{i} \land (g_{i - 1} \lor (p_{i - 1} \land (g_{i - 2} \lor (p_{i - 2} \land \dots (p_{1} \land g_{0})))))) \tag{1}$$

for $0 \leq i \leq n - 1$ can be used to construct an adder circuit with almost the same depth and delay. We call such a circuit computing all carry bits $c_{i + 1}$ from the signals $g_{0}, p_{1}, g_{1}, \dots, p_{i}, g_{i}$ a carry-propagate adder circuit. Note that a naive implementation of a carry-propagate adder circuit following the formula above yields a ripple-carry adder circuit with linear delay, cf. Fig. 1(a). A logically equivalent implementation is given in Fig. 1(b).
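The unrolled formula says that $c_{i + 1}$ is the And-Or path on the $2i + 1$ inputs $g_{i}, p_{i}, g_{i - 1}, \dots, p_{1}, g_{0}$. A small sketch that checks this against the recursive definition; the helper names are assumptions for illustration.

```python
def and_or_path(t):
    """Evaluate t0 OR (t1 AND (t2 OR (...)))."""
    value = t[-1]
    for i in range(len(t) - 2, -1, -1):
        value = (t[i] | value) if i % 2 == 0 else (t[i] & value)
    return value

def carry_inputs(g, p, i):
    """Inputs of the And-Or path for c_{i+1}: g_i, p_i, g_{i-1}, ..., p_1, g_0."""
    t = []
    for j in range(i, 0, -1):
        t += [g[j], p[j]]
    t.append(g[0])  # p_0 does not occur because c_0 = 0
    return t

def ripple_carries(g, p):
    """Carry bits c_0, ..., c_n from the recursion c_{i+1} = g_i OR (p_i AND c_i)."""
    c = [0]
    for gi, pi in zip(g, p):
        c.append(gi | (pi & c[-1]))
    return c
```

For an $n$-bit adder, the longest of these paths (for $c_{n}$) has $2(n - 1) + 1 = 2n - 1 = m$ inputs, which is the relation between $n$ and $m$ used throughout.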

Now, we consider the reverse reduction: Given input bits $g_{0}, p_{1}, g_{1}, \dots, p_{n - 1}, g_{n - 1}$, using constant additional depth, one can construct signals $a_{0}, \dots, a_{n - 1}$ and $b_{0}, \dots, b_{n - 1}$ such that $c_{n}$ (i.e., the output signal of an And-Or path on $g_{0}, p_{1}, g_{1}, \dots, p_{n - 1}, g_{n - 1}$) equals the most significant bit $s_{n}$ of the sum of $a$ and $b$. Thus, any adder circuit could also be used to compute the And-Or path on $g_{0}, p_{1}, g_{1}, \dots, p_{n - 1}, g_{n - 1}$ with almost the same depth.
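The text does not spell out the construction of $a$ and $b$. One possible choice, stated here as an assumption and not necessarily the one the authors have in mind, is $a_{i} = g_{i} \lor p_{i}$ and $b_{i} = g_{i}$ (with $a_{0} = b_{0} = g_{0}$): then $a_{i} \land b_{i} = g_{i}$ and $a_{i} \oplus b_{i} = p_{i} \land \lnot g_{i}$, so the adder's carry recursion yields $g_{i} \lor (p_{i} \land c_{i})$ again. The sketch below checks this choice exhaustively for small $n$.

```python
def reverse_reduction(g, p):
    """Hypothetical construction: a_i = g_i OR p_i, b_i = g_i (p[0] is ignored)."""
    a = [g[0]] + [g[i] | p[i] for i in range(1, len(g))]
    b = list(g)
    return a, b

def top_sum_bit(a, b):
    """Most significant bit s_n of the integer sum of the n-bit numbers a and b."""
    n = len(a)
    x = sum(bit << i for i, bit in enumerate(a))
    y = sum(bit << i for i, bit in enumerate(b))
    return ((x + y) >> n) & 1

def and_or_path_carry(g, p):
    """c_n as the And-Or path on g_0, p_1, g_1, ..., p_{n-1}, g_{n-1}."""
    c = g[0]
    for i in range(1, len(g)):
        c = g[i] | (p[i] & c)
    return c
```

Each $a_{i}$ costs a single Or gate, so this particular choice adds depth 1 in front of the adder.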

Note that for And-Or paths, we restrict ourselves to the basis $\{And 2, Or 2\}$ , while for general adders, non-monotone gates are at least required to compute the signals $g_{i}$ and $p_{i}$ . It is unknown whether better adder depths can be obtained by not using the reduction to monotone And-Or paths above, i.e. not using carry-propagate adders, and instead exploiting non-monotone gates in a better way. However, the reverse reduction above shows that this could only be the case if the usage of non-monotone gates led to faster And-Or path circuits, and, until now, no approach can exploit non-monotone gates for And-Or path circuits. In fact, all adder circuits known from literature are based on carry-propagate circuits, i.e. they explicitly compute all carry bits. For more details, see also Section 1.1.

In this work, we will construct depth-optimum circuits for And-Or paths over the basis $\{And 2, Or 2\}$ . In fact, we consider generalized And-Or paths, i.e., a generalization of And-Or paths where And and Or do not necessarily alternate. We will see that this more general problem has a rich structure which we will exploit for our new results. To the best of our knowledge, we are the first to directly consider this generalized problem. Delay optimization of generalized And-Or paths can be applied to optimize the delay of critical combinatorial paths in VLSI design, but existing approaches (see, e.g., Werber et al. [36]) use a simple reduction to And-Or paths which leads to sub-optimal solutions.
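To illustrate the generalization, here is a minimal sketch that evaluates a path $t_{0} \circ_{0} (t_{1} \circ_{1} (\dots (t_{m - 2} \circ_{m - 2} t_{m - 1})))$ for an arbitrary sequence of gate types. Representing the gate types as a list of strings is an assumption made for this example.

```python
def generalized_and_or_path(t, gates):
    """Evaluate t0 o_0 (t1 o_1 (... (t_{m-2} o_{m-2} t_{m-1}))).

    gates[i] is "and" or "or" and is the gate type attached to input t[i];
    the last input t[m-1] has no gate of its own.
    """
    if len(gates) != len(t) - 1:
        raise ValueError("need exactly one gate type per input except the last")
    value = t[-1]
    for i in range(len(t) - 2, -1, -1):
        value = (t[i] & value) if gates[i] == "and" else (t[i] | value)
    return value
```

Choosing `gates = ["or", "and", "or", ...]` recovers the ordinary And-Or path, while a list such as `["or", "or", "and"]` gives a path in which the gate types do not alternate.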

We now review previous results on adder and And-Or path optimization over the basis $\{And 2, Or 2\}$ . In this section, following a widely used convention (see e.g. Held and Spirkl [12]), we call any carry-propagate adder circuit – which computes the carry bits $c_{i + 1}$ from the signals $g_{0}, p_{1}, g_{1}, \dots, p_{i}, g_{i}$ as in Eq. (1) – an adder circuit. Recall from Eq. (1) that an $n$ -bit adder can be obtained from And-Or paths on $1, 3, \dots, 2 n - 1$ inputs, so the optimum depth of an $n$ -bit adder equals the optimum depth of an $m$ -input And-Or path with $m = 2 n - 1$ , i.e., $n = \frac{m + 1}{2}$ . Note that in the following logarithmic bounds, applying the substitution $m = 2 n - 1$ only leads to a constant additive term, and hence in all bounds without explicit additive constants, $n$ and $m$ can be freely interchanged. Depth bounds for classical adder constructions are given in terms of $n$ instead of $m$ .

Some of the following results only apply to depth optimization, some also to delay optimization for general arrival times. For general arrival times, a lower bound on the delay of a Boolean circuit on inputs $t_{0}, \dots, t_{m - 1}$ is given by $\lceil \log_{2} W \rceil$ due to the Kraft inequality [23], where $W := \sum_{i = 0}^{m - 1} 2^{a(t_{i})}$. Note that for $a \equiv 0$, we have $W = m$. In the subsequent delay bounds, $W$ can also be replaced by $m$ to obtain the corresponding depth bounds.
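The lower bound $\lceil \log_{2} W \rceil$ is easy to evaluate exactly with integer arithmetic. A minimal sketch, where the function name is an assumption:

```python
def kraft_lower_bound(arrival):
    """ceil(log2(W)) with W = sum of 2**a(t_i) over all inputs."""
    weight = sum(2 ** a for a in arrival)
    # For an integer W >= 1, ceil(log2(W)) equals the bit length of W - 1.
    return (weight - 1).bit_length()
```

For example, three inputs with arrival times $(2, 0, 0)$ give $W = 4 + 1 + 1 = 6$ and hence a lower bound of 3, while five inputs with arrival time 0 give $W = m = 5$ and again a bound of 3.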

Depth optimization of adder circuits is a classical and well-studied problem. Many researchers construct adder circuits via so-called prefix gates, e.g., Sklansky [31], Kogge and Stone [22], Ladner and Fischer [24], or Roy et al. [28], [29]. Although, for $n$ bits, these circuits have an optimum depth of $\lceil \log_{2} n \rceil$ in terms of prefix gates, they have a depth of $2 \lceil \log_{2} n \rceil$ over the basis $\{And2, Or2\}$ since each prefix gate has to be realized by a circuit of depth 2. Based on Rautenbach et al. [26], [27], Held and Spirkl [13] directly optimize the And-Or delay of their prefix adders and obtain a delay guarantee of $μ \log_{2} W + \text{const}$ over $\{And2, Or2\}$, where $1.44 \leq μ \leq 1.441$. However, Held and Spirkl [13] also proved a lower bound of $μ \log_{2} n - 1$ on the logic gate depth of any prefix-based adder circuit. Thus, further progress is only possible with adders that are not based on prefix gates. We also refer to Paterson et al. [25] for general lower bounds on the delay of circuits that repeatedly use the same gadget gate type to combine inputs, e.g., prefix gates.
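To see where the factor 2 comes from, the sketch below implements a Kogge-Stone style prefix computation. It assumes the standard prefix operator $(g, p) \circ (g', p') = (g \lor (p \land g'), p \land p')$, which is not written out in the text; the function name and the returned depth count are likewise illustrative.

```python
def kogge_stone_carries(g, p):
    """Return ([c_1, ..., c_n], depth over {And2, Or2}) for a Kogge-Stone prefix adder."""
    n = len(g)
    G, P = list(g), list(p)
    levels = 0
    d = 1
    while d < n:
        new_G, new_P = G[:], P[:]
        for i in range(d, n):
            # One prefix gate: an And2 feeding an Or2, i.e. depth 2 for G.
            new_G[i] = G[i] | (P[i] & G[i - d])
            new_P[i] = P[i] & P[i - d]
        G, P = new_G, new_P
        d *= 2
        levels += 1
    return G, 2 * levels
```

The loop runs $\lceil \log_{2} n \rceil$ times, and every level contributes depth 2, which gives the stated depth of $2 \lceil \log_{2} n \rceil$.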

Using non-prefix circuits, Brent [3], Khrapchenko [17], and Held and Spirkl [12] achieve a guaranteed depth of the form

for And-Or paths and adders. Spirkl [33] also considered non-uniform arrival times and constructed circuits with a delay bound of the form

. For And-Or paths with arbitrary arrival times, the best known delay guarantee of

$$\log_{2} W + \log_{2} \log_{2} m + \log_{2} \log_{2} \log_{2} m + 4.3$$

was first obtained by Brenner and Hermann [1]. For depth optimization, the best known depth bound of

$$\log_{2} m + \log_{2} \log_{2} m + O(1)$$

is due to Grinchuk [9]; the additive constant was improved to 1.58 by Hermann [14].
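For a rough feeling of how these bounds compare on adder sizes of practical interest, the following sketch evaluates the Kraft lower bound $\lceil \log_{2} m \rceil$, the non-prefix upper bound $\log_{2} m + \log_{2} \log_{2} m + 1.58$, and the Kogge-Stone depth $2 \lceil \log_{2} n \rceil$ for $n = 2^{k}$ and $m = 2n - 1$. The function names are assumptions, and the numbers are plain evaluations of the formulas quoted above, not results from the paper.

```python
import math

def kogge_stone_depth(n: int) -> int:
    """2 * ceil(log2 n), the prefix-adder depth over {And2, Or2}."""
    return 2 * (n - 1).bit_length()

def non_prefix_depth_bound(m: int) -> float:
    """log2 m + log2 log2 m + 1.58 (requires m >= 2)."""
    return math.log2(m) + math.log2(math.log2(m)) + 1.58

def kraft_depth_lower_bound(m: int) -> int:
    """ceil(log2 m), the Kraft bound for all arrival times equal to 0."""
    return (m - 1).bit_length()

for k in range(3, 14):
    n = 2 ** k
    m = 2 * n - 1
    # Depths are integers, so the upper bound can be rounded down.
    print(k, kraft_depth_lower_bound(m),
          math.floor(non_prefix_depth_bound(m)), kogge_stone_depth(n))
```

The gap between the first two columns is the range in which the true optimum depth must lie, which is what motivates an exact algorithm for small instances.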

On the other hand, both for adders and And-Or paths, there is a constant $c$ such that for depth optimization, a lower bound of $\log_{2} m + \log_{2} \log_{2} m + c$ holds for sufficiently large $m$, which was shown by Commentz-Walter [4]. Hitzschke [15] showed that if $c$ is chosen as $c = -5.02$, then the lower bound holds for $m \geq 2^{2^{18}}$. Commentz-Walter [4] obtains the lower bound via structural insights on And-Or paths that are the basis for our structure theorem, see Section 3. For the non-monotone case where not only And2 and Or2, but also INV gates are allowed to be used, Commentz-Walter and Sattler [5] proved that for each $α \in (0, 1)$, there is some $M_{α} \in \mathbb{N}$ such that a lower bound of $\log_{2} m + α \log_{2} \log_{2} \log_{2} m$ holds for all $m \geq M_{α}$. From this, Khrapchenko [19] derives that for any $m \geq 2^{2^{32}}$, the lower bound $\log_{2} m + 0.15 \log_{2} \log_{2} \log_{2} \frac{m}{2} - 1$ holds. However, so far there is no approach that can exploit non-monotone gates to obtain better depths. It is an interesting open question whether such an approach exists or stronger lower bounds (e.g., similar to the monotone case) can be shown for the non-monotone case.

Compared to the monotone lower bound, Grinchuk [9] and Hermann [14] construct depth-optimum circuits up to an additive constant, and the delay-optimization algorithm by Brenner and Hermann [1] is, applied to the special case of depth optimization, optimum up to an additive term of $O(\log_{2} \log_{2} \log_{2} m)$. However, the difference between these upper bounds and the lower bound is still substantial, leading to sub-optimal results, in particular on small instances as they occur in practice in VLSI design.

In order to obtain good empirical results, the algorithm by Brenner and Hermann [2] combines the ideas from these theoretical constructions with practical improvements and generalizations. Although no better worst-case bound can be shown, the obtained circuits mostly have better – often optimal – delay. However, there are instances with only $6$ inputs for which the algorithm does not find an optimum solution, see Section 6.3 in Hermann [14]. Still, regarding depth optimization, all instances with known optimum depth that can be solved by this algorithm are indeed solved optimally, and it is open whether this construction is always optimum in the depth case. There is also a depth-optimization heuristic by Grinchuk [8] which might actually be an exact algorithm (cf. the depths displayed in the table in Section 5 of [8]).

There are three previously known exact algorithms for depth optimization of And-Or paths.

Apart from the aforementioned heuristic, Grinchuk [8] also provides an exact algorithm for depth optimization of And-Or paths with a running time of $\Omega(4^{m})$. No explicit empirical results are given, but it is mentioned that the algorithm can only be used for up to 20 or 30 inputs. The idea of this algorithm is to compute the optimum achievable depth for all monotone Boolean functions on $m$ inputs in a bottom-up dynamic programming fashion. Each Boolean function is identified by its truth table, and circuits of larger depth are obtained by pairwise combinations of existing circuits with And or Or gates. Naively, the dynamic-programming table thus would have $2^{2^{m}}$ entries. Grinchuk’s main contribution is the observation that a truth table of size $m$ (called a passport in [8]) suffices to identify a monotone And-Or path circuit. This way, the size of the dynamic-programming table is reduced to $2^{m}$, which implies a running time of $\Omega(4^{m})$ to compute all table entries and hence a depth-optimum And-Or path circuit.
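The naive bottom-up idea (full truth tables, without Grinchuk's passport compression) can be sketched in a few lines. This is an illustrative brute force under that assumption, not Grinchuk's algorithm, and it is only practical for very small $m$ because the set of reachable functions grows quickly.

```python
from itertools import product

def and_or_path_table(m):
    """Truth table of t0 OR (t1 AND (...)) as an integer with 2**m bits."""
    table = 0
    for row, bits in enumerate(product((0, 1), repeat=m)):
        value = bits[-1]
        for i in range(m - 2, -1, -1):
            value = (bits[i] | value) if i % 2 == 0 else (bits[i] & value)
        table |= value << row
    return table

def input_tables(m):
    """Truth tables of the projections t_0, ..., t_{m-1}."""
    tables = []
    for i in range(m):
        table = 0
        for row, bits in enumerate(product((0, 1), repeat=m)):
            table |= bits[i] << row
        tables.append(table)
    return tables

def optimum_depth_bruteforce(m):
    """Smallest depth of an {And2, Or2} circuit for the m-input And-Or path."""
    target = and_or_path_table(m)
    reachable = set(input_tables(m))  # functions computable with depth 0
    depth = 0
    while target not in reachable:
        current = list(reachable)
        combined = set(reachable)
        for x in current:
            for y in current:
                combined.add(x & y)   # And2 on two functions of depth <= depth
                combined.add(x | y)   # Or2 on two functions of depth <= depth
        reachable = combined
        depth += 1
    return depth
```

After round $d$, `reachable` holds every function computable by a circuit of depth at most $d$, so the first round in which the target appears is its optimum depth. For $m = 4$ this yields depth 3, one more than the Kraft bound $\lceil \log_{2} 4 \rceil = 2$.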

Hegerfeld [11] proposes two enumeration algorithms constructing depth-optimum And-Or path circuits.

In a first algorithm, Hegerfeld constructs all circuits that are size-optimum among all depth-optimum And-Or path circuits. This algorithm is based on tree enumeration and is viable for up to $19$ inputs. The algorithm can also be used to enumerate And-Or path circuits with non-optimum depth with an increase in running time, which leads to optimum solutions with respect to delay for certain arrival time profiles.

Hegerfeld’s second algorithm is much faster, but is restricted to depth-optimum formula circuits (i.e., circuits where each gate has fanout 1) with a certain size guarantee (cf. Section 7). It has a provable running time of $O(2.45^{m})$ and can be applied for up to $29$ inputs. Hegerfeld does not enumerate formula circuits for And-Or paths directly, but so-called rectangle-good protocol trees for Karchmer–Wigderson games (see Karchmer and Wigderson [16]) for And-Or paths, which come from the area of communication complexity. From these, Hegerfeld derives the optimum formula circuits. This is the fastest previously known exact algorithm for depth optimization of And-Or paths.

In this work, we present a new exact algorithm for constructing delay-optimum generalized And-Or path circuits. In the most prominent special case of depth optimization of And-Or path circuits (and thereby carry-propagate adders), our algorithm is significantly faster than previous approaches, both in theory and in practice. For the general problem, which occurs in late-stage timing optimization in VLSI design, we obtain the first known non-trivial exact algorithm.

We prove a new structure theorem which characterizes the structure of specific delay-optimum circuits for generalized And-Or paths. More precisely, we show how optimum solutions for generalized And-Or paths can be obtained by combining optimum circuits for certain smaller generalized And-Or paths in a recursive fashion, directly motivating a dynamic programming algorithm. We stress that an analogous statement does not hold for non-generalized And-Or paths, that is, generalized And-Or paths also occur as sub-functions of delay-optimum circuits for non-generalized And-Or paths.

In the general case, the running time of our new algorithm is at most $O(3^{m})$. For And-Or paths, the bound is improved to $O(2.45^{m})$.

For the special case of depth optimization of And-Or paths, our algorithm has a running time of $O(2.02^{m})$, significantly improving upon the previously best running time of $O(2.45^{m})$ of the formula enumeration algorithm by Hegerfeld [11]. Hegerfeld computes a depth-optimum formula circuit with a certain size guarantee (cf. Section 7). Our algorithm can either compute such a circuit or an arbitrary delay-optimum circuit, which can be done much faster. In contrast to Hegerfeld, in practice, we apply very efficient pruning techniques that drastically reduce empirical running times. The largest instance solved by Hegerfeld has $29$ inputs, while our algorithm with size optimization can solve instances with up to $42$ inputs; without size optimization even up to $64$ inputs. Our running times on $26$ inputs are 2.1 s with size optimization and 0.007 s without size optimization, while Hegerfeld’s running time is $17$ hours. Our largest running time without size optimization on any of these instances is roughly 2.7 h.

From our structure theorem, our computations and the results computed by the heuristic And-Or path optimization algorithm by Grinchuk [8], we deduce the optimum depths of carry-propagate adder circuits (i.e., circuits computing all the carry bits from the $g_{i}$- and $p_{i}$-signals as in Eq. (1)) over the basis $\{And2, Or2\}$ on $2^{k}$ bits, where $k \leq 13$. To the best of our knowledge, we are the first to obtain such a result.

The rest of the paper is organized as follows: In Section 2, we formally introduce the problem and basic concepts. In Section 3, we present and prove our structure theorem. From this, in Section 4, we derive our exact algorithm, which is refined for the special case of depth optimization of And-Or paths in Section 5. Practical speedups are presented in Section 6. In Section 7, we show computational results, i.e., our practical running times and the computed optimum adder depths.

Boolean functions and circuits

Our notation regarding Boolean functions and circuits is based on Savage [30]. We denote the set of natural numbers including $0$ by $\mathbb{N}$. For an $n$-tuple $(x_{0}, \dots, x_{n - 1})$ and an index $i \in \{0, \dots, n - 1\}$, we use the standard notation $(x_{0}, \dots, \hat{x_{i}}, \dots, x_{n - 1})$ to denote the $(n - 1)$-tuple arising from $x$ by deleting the entry $x_{i}$.

For us, a Boolean function is a function $f : \{0, 1\}^{m} \to \{0, 1\}$ for some $m \in \mathbb{N}$. We often write $t = (t_{0}, \dots, t_{m - 1})$ for the input variables, short inputs, of $f$. For an input variable $t_{i}$ of $f$ and a value $α \in \{0, 1\}$, the restriction

Structure theorem

Our structure theorem and our algorithm presented in Section 4 both reduce the problem of optimizing a given generalized And-Or path to smaller instances of a specific form.

Definition 3.1

Consider a generalized And-Or path $h(t; Γ)$ with $m \geq 1$ inputs. Given indices $0 \leq i_{0} < \dots < i_{k - 1} \leq m - 1$, the generalized And-Or path $t_{i_{0}} \circ_{i_{0}} (t_{i_{1}} \circ_{i_{1}} (\dots \circ_{i_{k - 3}} (t_{i_{k - 2}} \circ_{i_{k - 2}} t_{i_{k - 1}})))$ is called a sub-path of $h(t; Γ)$. Now, let a gate type $\circ \in \{And, Or\}$ and a set $S_{1}^{\circ}$ with $\emptyset \neq S_{1}^{\circ} \subseteq S^{\circ}$ be given, and let $i$ be maximum with $t_{i} \in S_{1}^{\circ}$. Then, the sub-path of $h(t; Γ)$ containing

General algorithm

The structure theorem from the previous section motivates an exact algorithm for the DELAY OPTIMIZATION PROBLEM FOR GENERALIZED AND-OR PATHS: Consider a generalized And-Or path $h(t; Γ)$ with prescribed input arrival times. Assume that we know a delay-optimum formula circuit for all strict sub-paths of $h(t; Γ)$. Then, by Theorem 3.4, there are $\circ \in \{And, Or\}$ and a partition such that a delay-optimum circuit $C$ for $h(t; Γ)$ can be obtained from delay-optimum circuits $C_{1}$ for $h(t; Γ)_{S_{1}^{\circ}}$ and $C_{2}$ for $h(t; Γ)_{S_{2}^{\circ}}$

Improved Algorithm for Depth Optimization of AND-OR Paths

In this section, we speed up Algorithm 4.1 for the special case of depth optimization of And-Or paths. For this, we partition all sub-paths considered during the algorithm into so-called $sp$-equivalence classes, where two sub-paths with segment partitions $P_{0}, \dots, P_{c}$ and $P_{0}^{'}, \dots, P_{c^{'}}^{'}$ are considered as $sp$-equivalent if and only if $c = c^{'}$ and $|P_{b}| = |P_{b}^{'}|$ for all $b \in \{0, \dots, c\}$. Then, up to renaming of the input variables, any two $sp$-equivalent And-Or paths are either logically equivalent or dual to each other, i.e., the delays

Practical implementation

We implemented Algorithm 4.1 (Section 4) in a C++ program, using 64-bit bit sets to encode the sub-paths via the bijection $κ$ to subsets of $\{t_{0}, \dots, t_{m - 1}\}$. In order to obtain good practical running times, we implemented several speedup techniques. On most instances, these in particular imply that we compute the delay for only a fraction of the sub-paths from our dynamic-programming table, see also Table 4 (Section 7.2). Hence, we store the table in a hash set, which violates the worst-case running

Computational results

In Section 7.1, we analyze results for delay optimization of And-Or paths and generalized And-Or paths. Then, in Section 7.2, we consider the DEPTH OPTIMIZATION PROBLEM FOR AND-OR PATHS. In particular, here we analyze all speedup techniques in detail, including their individual impact on the empirical running time. These speedups allow us to solve all instances of the DEPTH OPTIMIZATION PROBLEM FOR AND-OR PATHS with up to $64$ inputs. For this problem, we also compare our running times with those

Conclusions

We presented a new exact algorithm for constructing depth- and delay-optimum And-Or path and carry-propagate adder circuits over the basis $\{And 2, Or 2\}$ . Our algorithm is much faster than previous approaches – both empirically and regarding provable worst-case running time – and hence can solve significantly larger instances. For all And-Or path instances with up to 64 inputs, the optimum depth was computed in reasonable time.

Using these empirical computations and new theoretical results, we derived

References (36)

Rautenbach, D. et al.: Delay optimization of linear depth boolean circuits with prescribed input arrival times. J. Discrete Algorithms (2006)

Rautenbach, D. et al.: The delay of circuits whose inputs have specified arrival times. Discrete Appl. Math. (2007)

Brenner, U. et al.: Faster carry bit computation for adder circuits with prescribed arrival times. ACM Trans. Algorithms (2019)

Brenner, U. et al.: Delay optimization of combinational logic by and-or path restructuring (2020)

Brent, R.: On the addition of binary numbers. Trans. Comput. (1970)

Commentz-Walter, B.: Size-depth tradeoff in monotone Boolean formulae. Acta Inform. (1979)

Commentz-Walter, B. et al.: Size-depth tradeoff in non-monotone Boolean formulae. Acta Inform. (1980)

Conradie, W. et al.: Logic and Discrete Mathematics: A Concise Introduction (2015)

Crama, Y. et al.: Boolean Functions: Theory, Algorithms, and Applications (2011)

M.I. Grinchuk, Low depth circuit design, US patent 8499264...

Grinchuk, M.I.: Sharpening an upper bound on the adder and comparator depths. Disk. Anal. I Issledovanie Oper. (2008)

Grinchuk, M.I.: Sharpening an upper bound on the adder and comparator depths. J. Appl. Ind. Math. (2009)

Hegerfeld, F.: Optimal monotone realizations of And-Or-paths (2018)

Held, S. et al.: Binary adder circuits of asymptotically minimum depth, linear size, and fan-out two. ACM Trans. Algorithms (2017)

Held, S. et al.: Fast prefix adders for non-uniform input arrival times. Algorithmica (2017)

Hermann, A.: Faster circuits for and-or paths and binary addition (2020)

Hitzschke, J.M.: Untere Schranken für Tiefe und Delay von AND-OR-Pfaden (2018)

M. Karchmer, A. Wigderson, Monotone circuits for connectivity require super-logarithmic depth, 3(2) (1990)...

¹: Now with Synopsys GmbH, Germany.

²: Now with Greenplan GmbH, Germany.
