# Universal Switch Modules for FPGA Design

YAO-WEN CHANG and D. F. WONG
University of Texas at Austin
and
C. K. WONG
Chinese University of Hong Kong

A switch module M with W terminals on each side is said to be universal if every set of nets satisfying the dimensional constraint (i.e., the number of nets on each side of M is at most W) is simultaneously routable through M. In this article, we present a class of universal switch modules. Each of our switch modules has 6W switches and switch-module flexibility three (i.e., $F_S=3$). We prove that no switch module with fewer than 6W switches can be universal. We also compare our switch modules with those used in the Xilinx XC4000 family FPGAs and the antisymmetric switch modules (with $F_S=3$) suggested by Rose and Brown [1991]. Although these two kinds of switch modules also have $F_S=3$ and 6W switches, we show that they are not universal. Based on combinatorial counting techniques, we show that each of our universal switch modules can accommodate up to 25% more routing instances, compared with the XC4000-type switch module of the same size. Experimental results demonstrate that our universal switch modules improve routability at the chip level. Finally, our work also provides a theoretical insight into the important observation by Rose and Brown [1991] (based on extensive experiments) that $F_S=3$ is often sufficient to provide high routability.

Categories and Subject Descriptors: B.7.1 [Integrated Circuits]: Types and Design Styles - gate arrays; B.7.2 [Integrated Circuits]: Design Aids - placement and routing

General Terms: Design, Experimentation, Measurement, Performance, Theory, Verification

#### 1. INTRODUCTION

As a relatively new technology, FPGAs are still undergoing significant change in their architectures [Brown et al. 1992; Trimberger 1994]. This article addresses the FPGA architecture design problem.
A typical FPGA consists of a symmetric array of logic modules that can be connected by general routing resources. Figure 1(a) shows the symmetric-array FPGA model. The logic modules contain circuits that implement logic functions. The routing resources comprise segments of wires and two kinds of modules, switch modules and connection modules, which contain user-programmable switches. The intersection of horizontal and vertical channels is referred to as a switch module; the switch module serves to connect wire segments, and this requires using programmable switches inside it. Figure 1(b) illustrates a switch module in which the programmable switches, denoted by dashed lines between terminal 1 and others, are shown. The *flexibility* of a switch module, denoted by $F_S$, is defined as the number of programmable switches between one terminal and others [Rose and Brown 1991]; for example, the switch module in Figure 1(b) has $F_S = 6$.

Fig. 1. (a) Symmetric-array FPGA model; (b) switch module.

<sup>1</sup> More general $F_S$'s were considered in Rose and Brown [1991].

This work was partially supported by the Texas Advanced Research Program under grant 003658459.

Authors' addresses: Y.-W. Chang and D. F. Wong: Department of Computer Sciences, University of Texas at Austin, Austin, TX 78712; C. K. Wong: Department of Computer Science, Chinese University of Hong Kong, Hong Kong.

Permission to make digital/hard copy of part or all of this work for personal or classroom use is granted without fee provided that the copies are not made or distributed for profit or commercial advantage, the copyright notice, the title of the publication, and its date appear, and notice is given that copying is by permission of the ACM, Inc. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee.

© 1996 ACM 1084-4309/96/0100-0080 \$03.50
Connection modules are used to connect logic-module pins to wire segments. We refer to the *connection-module flexibility*, denoted by $F_C$, as the number of tracks to which a logic-module pin can be connected [Rose and Brown 1991]; see Figure 2 for an illustration. Logic circuits are implemented in an FPGA by partitioning logic into individual logic modules and then interconnecting the modules by programming the switches in switch and connection modules; Figure 3(a) shows a routing example using the switch module depicted in Figure 3(b).

Architectural studies for the symmetric-array FPGA have been widely reported in the literature. Logic-module architectures were studied by Lin et al. [1994], Rose et al. [1990], and Thakur and Wong [1995], and connection-module architectures by Fujiyoshi et al. [1994] and Rose and Brown [1991]. Researchers have shown that the feasibility of FPGA design is constrained more by routing resources than by logic resources [Bhat and Hill 1992; Trimberger and Chene 1992]. Thus it is important to facilitate routing in the FPGA design.

Fig. 2. Connection-module flexibility ($F_C = 2$).

Fig. 3. (a) Routing instance; (b) switch module.

<sup>2</sup> See Alexander and Robins [1995], Brown et al. [1992b], Chang et al. [1994; 1995a], Lemieux and Brown [1993], Sun and Liu [1994], and Sun et al. [1993].

<sup>3</sup> Number of routing instances that can route on a switch module. Section 2 gives a formal definition.

Switch modules are a crucial component for FPGA routing. Intuitively, a switch module with a larger *routing capacity*<sup>3</sup> would have better area performance in FPGA routing. To verify this intuition, we
perform experiments and show that switch modules with larger routing capacities result in better routing solutions.<sup>4</sup> The following crucial factors contribute to this phenomenon:

- Switch modules with larger routing capacity increase the connectivity of routing components, and thus improve the overall routability of an FPGA.
- Most logic-module pins are logically equivalent [Trimberger 1994]<sup>5</sup>; the pin permutations combined with highly routable switch modules pave the way for optimizing routing.
- For practical applications, most nets are short. For example, about 60% (90%) of nets in the CGE [Brown et al. 1992b] and SEGA [Lemieux and Brown 1993] benchmark circuits route through no more than two (five) switch modules, independent of FPGA sizes. Thus the routability of a single switch module plays an important role in overall FPGA routing.

<sup>4</sup> In fact, work by Chang et al. [1994; 1995a], Rose and Brown [1991], Sun and Liu [1994], and Sun et al. [1993] has also shown this.

<sup>5</sup> For example, the lookup-table and control inputs in a logic module are logically equivalent.

Fig. 4. (a), (b) Two different size antisymmetric switch modules; (c) Xilinx XC4000-type switch module; and (d) its corresponding switch-module model.

Hence increasing the routing capacity of a switch module also improves the area performance of a router. Therefore it is of particular importance to consider the switch-module architecture of an FPGA.

The main consideration in the FPGA switch-module design is the tradeoff between the routing capacity and the area limitation of a switch module. The programmable switches usually occupy large areas, and hence the number of switches that can be placed in a switch module is usually limited. On the other hand, fewer switches in a switch module would reduce its routing capacity.
A switch module M with W terminals on each side is said to be universal<sup>6</sup> if every set of nets satisfying the dimensional constraint, that is, the number of nets on each side of M is at most W, is simultaneously routable through M. A universal switch module has the maximum routing capacity, and thus it is desirable to design such a switch module using the minimum number of switches.

Switch-module architectures for symmetric-array FPGAs have been much studied recently.<sup>7</sup> Two kinds of well-known switch-module architectures were used by Rose and Brown [1991] and Xilinx, Inc. [Hsieh et al. 1990; Xilinx Inc. 1994]. Figures 4(a) and (b) show two antisymmetric architectures used in Rose and Brown [1991] ($F_S = 3$), in which W = 3 and W = 4, respectively. Figure 4(c) depicts a switch module of W = 3 used in the Xilinx XC4000 family FPGAs [Hsieh et al. 1990; Xilinx Inc. 1994], and Figure 4(d) illustrates its switch-module model; the XC4000-type switch modules also have $F_S = 3$.

<sup>6</sup> A precise definition of the universal switch module is given in Definition 2.1.

<sup>7</sup> See Chang et al. [1995b], Hsieh et al. [1990], Rose and Brown [1991], Sun and Liu [1994], Sun et al. [1993], Wu and Chang [1994], Wu et al. [1994], and Zhu et al. [1993].

The effects of switch-module architectures on routing in symmetric-array FPGAs were first studied experimentally by Rose and Brown [1991]. An important observation by Rose and Brown [1991] is that 100% detailed-routing completion is often achieved for $F_S = 3$ combined with high $F_C$. This provides an empirical way to choose a switch-module architecture. The switch modules used in the Xilinx XC4000 family FPGAs are currently regarded as a best architecture among those with $F_S = 3$ [Wu et al. 1994]. However, we show later that there exist universal switch modules whose routing capacities are the proper supersets of those of the antisymmetric and the XC4000-type ones of the same size
and with the same number of switches; that is, neither of these two kinds of well-known switch modules is universal.

In this article, we present a class of universal switch modules. Each of our switch modules has 6W switches and $F_S=3$. We prove that no switch module with fewer than 6W switches can be universal. We also compare our switch modules with the XC4000-type and the antisymmetric switch modules (with $F_S=3$). Although these two kinds of switch modules also have $F_S=3$ and 6W switches, we show that they are not universal. Based on combinatorial counting techniques, we show that each of our universal switch modules can accommodate up to 25% more routing instances, compared with the Xilinx XC4000-type one of the same size. Experimental results demonstrate that our universal switch modules improve routability at the chip level. Our work also provides a theoretical insight into the important observation by Rose and Brown [1991] (based on extensive experiments) that $F_S=3$ combined with high $F_C$ is often sufficient to provide high routability.

We focus on switch modules with the flexibility $F_S=3$ in this article. The reasons are threefold:

- It becomes clear later that it suffices to use $F_S = 3$ to construct a universal switch module.
- As shown in Brown et al. [1992a; 1993] and Rose and Brown [1991], 100% detailed-routing completion is often achieved for $F_S = 3$.
- The switch modules used in the Xilinx XC4000 family FPGAs have $F_S = 3$.

The remainder of this article is organized as follows. Section 2 introduces some notation and definitions. Section 3 presents a class of universal switch modules. Section 4 discusses routing on the XC4000-type and the antisymmetric switch modules. Section 5 presents a technique to analyze the routing capacity of a switch module. Experimental results are reported in Section 6.

#### 2. PRELIMINARIES

A switch module is a $W \times W$ square block, where W is the number of terminals on each side of the switch module. Some pairs of terminals, on different sides of the module, may have programmable switches and thus can be connected by programming the switches to be "ON." Moreover, these switches are electrically noninteracting, unless they share a terminal. We represent a switch module by M(T, S), where T is the set of terminals, and S the set of switches. Label the terminals $t_1, t_2, \ldots, t_{4W}$ starting from the bottommost terminal on the left side and proceeding clockwise. Let $T_L = \{t_1, \ldots, t_W\}$ (left terminals), $T_T = \{t_{W+1}, \ldots, t_{2W}\}$ (top terminals), $T_R = \{t_{2W+1}, \ldots, t_{3W}\}$ (right terminals), and $T_B = \{t_{3W+1}, \ldots, t_{4W}\}$ (bottom terminals). Therefore, $S = \{(t_i, t_j) \mid \text{there exists a programmable switch between terminals } t_i \text{ and } t_j\}$, and $T = \bigcup_{i \in \{L, T, R, B\}} T_i$. For convenience, we often refer to a switch module M(T, S) simply as M, omitting T and S, if there is no ambiguity about T and S, or T and S are not of concern in the context.

Fig. 5. Six types of connections.

A net can be routed through a switch module by programming some switch to be "ON." To characterize such a local route, we say a *connection* is established in the switch module between two terminals $t_i$ and $t_j$, on different sides of the switch module, if the switch $(t_i, t_j)$ is programmed to be "ON." There are six types of connections. Each type is characterized by two sides of a switch module. Figure 5 shows the classification. The connection labeled i, $1 \le i \le 6$, in Figure 5, is said to be of *Type-i*. For instance, Type-3 connections connect terminals on the left and the top sides of a switch module. Type-1 and -2 connections are *straight connections* whereas the others are *bent connections*.
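The terminal labeling and the six connection types can be made concrete with a short sketch. The helper names `side_of` and `connection_type` are hypothetical. The assignment of type numbers to side pairs is an assumption inferred from the per-side inequalities of Definition 2.1 and the comments in Figure 7 (Type-1 left-right, Type-2 top-bottom, Type-3 left-top, Type-4 top-right, Type-5 right-bottom, Type-6 left-bottom), because Figure 5 itself is not reproduced here.

```python
# Sides in clockwise labeling order: t_1..t_W left, then top, right, bottom.
SIDES = "LTRB"

# Assumed mapping from a pair of sides to the connection type (see lead-in).
TYPE_OF_SIDES = {
    frozenset("LR"): 1,  # straight, horizontal
    frozenset("TB"): 2,  # straight, vertical
    frozenset("LT"): 3,  # bent
    frozenset("TR"): 4,  # bent
    frozenset("RB"): 5,  # bent
    frozenset("LB"): 6,  # bent
}


def side_of(k: int, W: int) -> str:
    """Return the side ('L', 'T', 'R' or 'B') holding terminal t_k, 1 <= k <= 4W."""
    if not 1 <= k <= 4 * W:
        raise ValueError("terminal index out of range")
    return SIDES[(k - 1) // W]


def connection_type(i: int, j: int, W: int) -> int:
    """Return the type (1..6) of a connection between terminals t_i and t_j."""
    a, b = side_of(i, W), side_of(j, W)
    if a == b:
        raise ValueError("a connection joins terminals on different sides")
    return TYPE_OF_SIDES[frozenset((a, b))]


if __name__ == "__main__":
    W = 2
    print([side_of(k, W) for k in range(1, 4 * W + 1)])
    print(connection_type(1, 4, W))  # a left-top connection
```

With W = 2, terminals $t_1, t_2$ lie on the left side and $t_3, t_4$ on the top side, so a connection between $t_1$ and $t_4$ is classified as Type-3 under this assumed numbering.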
A routing requirement vector (RRV) $\vec{n}$ is a six-tuple $(n_1, n_2, \ldots, n_6)$ where $0 \le n_i \le W$, $1 \le i \le 6$. A routing for an RRV on a given switch module is a set of connections such that there are $n_i$ Type-i connections, for $i \in \{1, \ldots, 6\}$, and those connections are electrically noninteracting. An RRV $\vec{n}$ is said to be routable on a switch module M if there exists a routing for $\vec{n}$ on M. For example, Figure 3(a) shows the routing with the vector (1, 0, 1, 1, 0, 0) on the top-left switch module $M_1$. The routing capacity of a switch module M is defined as the number of distinct routable vectors on M; that is, the routing capacity of M is the cardinality $|\{\vec{n} \mid \vec{n} \text{ is routable on } M\}|$. The universal switch module is defined as follows:

*Definition* 2.1 A switch module M of size W is called *universal* if the following set of inequalities is the necessary and sufficient condition for an RRV $\vec{n} = (n_1, \ldots, n_6)$ to be routable on M:

$$\begin{cases} n_1 + n_3 + n_6 \leq W \\ n_2 + n_3 + n_4 \leq W \\ n_1 + n_4 + n_5 \leq W \\ n_2 + n_5 + n_6 \leq W. \end{cases}$$

Note that the number of nets routing through each side of M cannot exceed W; this dimensional constraint is characterized by the preceding four inequalities, one for each side. Therefore a universal switch module has the maximum routing capacity. This article addresses the problem of designing universal switch modules using the minimum number of programmable switches.

Fig. 6. Switch modules with different topologies (W = 2): (a) $M_1$; (b) $M_2$; (c) $M_3$; (d) $M_4$; (e) $M_5$; (f) $M_6$.

#### 3. UNIVERSAL SWITCH MODULES

Consider the six switch modules depicted in Figure 6. Each contains 12 switches and is of size W = 2 and with flexibility $F_S=3$. However, only three out of the seven RRVs listed in Table I are routable on $M_3$, $M_4$, $M_5$, and $M_6$ whereas all seven RRVs are routable on $M_1$ and $M_2$.
(In Table I, a $\bigcirc$ indicates that the RRV listed in the same row is routable on the corresponding switch module, and a $\times$ denotes that it is unroutable.) This shows the effects of switch-module topologies on routing. It is obvious that $M_1$ and $M_2$ have the largest routing capacity among these six switch modules. We refer to the topology of $M_1$ as the *symmetric topology*. See Figure 7, Algorithm Symmetric\_Switch\_Module, for the construction of symmetric switch modules. Note that the switch module $M_3$ corresponds to those used in the Xilinx XC4000 family FPGAs (see Figure 4(c,d)).

As mentioned earlier, we intend to identify not just a single universal switch module but a whole class of them. We first borrow the terminology *isomorphism* from graph theory (and algebra). It is used to identify a class of switch modules with the same routing capacity. The following is its definition.

*Definition* 3.1 Two switch modules M(T, S) and M'(T', S') are *isomorphic* if there exists a bijection $f: T \to T'$ such that $(t_i, t_j) \in S$ if and only if $(f(t_i), f(t_j)) \in S'$ and, for any two terminals $t_i$ and $t_j$, $t_i, t_j \in T_p$ if and only if $f(t_i), f(t_j) \in T'_q$.

In other words, M(T, S) and M'(T', S') are isomorphic if we can relabel the terminals of M to be the terminals of M', maintaining the corresponding switches in M and M'; and, for terminals on the same side of M, their corresponding terminals are also on the same side of M'. For instance, the switch modules shown in Figure 8 are all isomorphic, and their corresponding terminals are indicated by the same number. For any two isomorphic switch modules, we have the following theorem.

THEOREM 3.1 Any two isomorphic switch modules have the same routing capacity.

PROOF. If M(T, S) and M'(T', S') are isomorphic, we can relabel the six types of connections and have the same switch-connection configuration with respect to each type (see Figure 9).
Let $\vec{n}'$ be a permutation of $\vec{n}$ so that $\vec{n}$ and $\vec{n}'$ correspond to the original (defined in Section 2) and the new (depending on the permutation) definitions of the six types of connections, respectively. It is obvious that $\vec{n}$ is routable on M(T, S) if and only if $\vec{n}'$ is routable on M'(T', S'); thus M(T, S) and M'(T', S') have the same routing capacity. $\square$

Table I. Effects of Switch-Module Topologies on Routing ($\bigcirc$: routable; $\times$: unroutable)

| RRV | $M_1$ | $M_2$ | $M_3$ | $M_4$ | $M_5$ | $M_6$ |
|---|---|---|---|---|---|---|
| (1,1,1,0,1,0) | $\bigcirc$ | $\bigcirc$ | $\bigcirc$ | | | |
| (1,1,0,1,0,1) | $\bigcirc$ | $\bigcirc$ | $\bigcirc$ | | | |
| (1,0,1,1,0,0) | $\bigcirc$ | $\bigcirc$ | $\times$ | | | |
| (1,0,0,0,1,1) | $\bigcirc$ | $\bigcirc$ | $\times$ | | | |
| (0,1,1,0,0,1) | $\bigcirc$ | $\bigcirc$ | $\times$ | | | |
| (0,1,0,1,1,0) | $\bigcirc$ | $\bigcirc$ | $\times$ | | | |
| (0,0,1,1,1,1) | $\bigcirc$ | $\bigcirc$ | $\bigcirc$ | | | |
| # Other routable RRVs | 49 | 49 | 49 | 49 | 49 | 49 |
| Routing capacity | 56 | 56 | 52 | 52 | 52 | 52 |

```
Algorithm: Symmetric_Switch_Module(W)
Input:  W -- size of the switch module.
Output: M(T,S) -- the symmetric switch module of size W;
        T: set of terminals; S: set of switches.

T <- union of T_i over i in {L, T, R, B};  /* See Section 2 for labeling */
S <- empty set;
for i <- 1 to W do {
    S <- S + {(t_i,      t_{3W-i+1})};  /* Type-1 connections */
    S <- S + {(t_{W+i},  t_{4W-i+1})};  /* Type-2 connections */
    S <- S + {(t_i,      t_{2W-i+1})};  /* Type-3 connections */
    S <- S + {(t_{W+i},  t_{3W-i+1})};  /* Type-4 connections */
    S <- S + {(t_{2W+i}, t_{4W-i+1})};  /* Type-5 connections */
    S <- S + {(t_i,      t_{4W-i+1})};  /* Type-6 connections */
} /* for */
Output M(T,S);
```

Fig. 7. Symmetric\_Switch\_Module algorithm.

Fig. 8. (a) Symmetric switch module; (b-d) isomorphic switch modules of (a).

COROLLARY 3.1 For any two isomorphic switch modules M(T, S) and M'(T', S'), M(T, S) is universal if and only if M'(T', S') is universal.
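The construction in Figure 7 translates almost line for line into executable form. The following is a minimal sketch under stated assumptions: terminals $t_1, \ldots, t_{4W}$ are represented by the integers 1 to 4W, a switch is a pair of such integers, and the function names `symmetric_switch_module` and `flexibility` are hypothetical.

```python
def symmetric_switch_module(W: int):
    """Return (terminals, switches) following the construction in Figure 7.

    Terminals are the integers 1..4W (t_1..t_4W in the text); each switch is
    a pair (i, j) of terminal indices on different sides of the module.
    """
    terminals = set(range(1, 4 * W + 1))
    switches = set()
    for i in range(1, W + 1):
        switches.add((i, 3 * W - i + 1))          # Type-1: left-right
        switches.add((W + i, 4 * W - i + 1))      # Type-2: top-bottom
        switches.add((i, 2 * W - i + 1))          # Type-3: left-top
        switches.add((W + i, 3 * W - i + 1))      # Type-4: top-right
        switches.add((2 * W + i, 4 * W - i + 1))  # Type-5: right-bottom
        switches.add((i, 4 * W - i + 1))          # Type-6: left-bottom
    return terminals, switches


def flexibility(terminals, switches):
    """Map each terminal to the number of switches incident on it."""
    degree = {t: 0 for t in terminals}
    for a, b in switches:
        degree[a] += 1
        degree[b] += 1
    return degree


if __name__ == "__main__":
    for W in (1, 2, 3, 4):
        T, S = symmetric_switch_module(W)
        print(W, len(S), set(flexibility(T, S).values()))
```

For each W the sketch yields 6W distinct switches and exactly three switches per terminal, which matches $F_S = 3$ as stated for the symmetric switch modules.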
By Corollary 3.1, we can identify a whole class of universal switch modules by performing isomorphism operations on a given universal switch module. The following theorem gives a way to find such a "base" universal switch module.

Fig. 9. (a) Switch module and original type definition; (b) isomorphic switch module of (a) and its new type definition.

THEOREM 3.2 The switch modules constructed by Algorithm Symmetric\_Switch\_Module are universal.

PROOF. By Definition 2.1, we show that, for a switch module M of size W, constructed by Algorithm Symmetric\_Switch\_Module, $\vec{n}$ is routable on M if and only if the following inequalities are simultaneously satisfied:

$$\begin{cases} n_1 + n_3 + n_6 \le W \\ n_2 + n_3 + n_4 \le W \\ n_1 + n_4 + n_5 \le W \\ n_2 + n_5 + n_6 \le W. \end{cases}$$

For the switch modules constructed by the algorithm, we have the following key observations (see Figure 10). For a switch module with an even W, we can partition it into $W/2$ noninteracting submodules (shown in Figure 10(b)); each submodule has the same topology as that of $M_1$ in Figure 6(a). As mentioned earlier, the 56 RRVs satisfying the dimensional constraint for W = 2 are all routable on $M_1$ (see Table I); that is, $M_1$ is universal. The reason is that, *for W = 2*, the three terminals, say terminals b, c, and d, that connect to a terminal, say a, do not share any switch (see Figure 10(b)); thus the connections associated with them are noninteracting, except those associated with a. For a switch module with an odd W, we can partition it into $\lceil W/2 \rceil$ noninteracting submodules, with each of $\lfloor W/2 \rfloor$ submodules identical to $M_1$ and one submodule formed by the four terminals in the middle of each side of the switch module (see Figure 10(d)). Because terminals in different submodules are noninteracting, each submodule can be considered independently.
(*If*) If the constraints $n_1+n_3+n_6\leq W$ and $n_1+n_4+n_5\leq W$ ($n_2+n_3+n_4\leq W$ and $n_2+n_5+n_6\leq W$) are satisfied, by the preceding observations, it is always possible to place up to $W-n_1$ ($W-n_2$) Type-3 and -6, and Type-4 and -5 (Type-3 and -4, and Type-5 and -6) connections after $n_1$ Type-1 ($n_2$ Type-2) connections are placed. Hence, if all four inequalities are satisfied, there must exist a feasible routing for $\vec{n}$; that is, $\vec{n}$ is routable on M.

(*Only If*) The total number of connections routing through each side of M cannot exceed W. Hence, if $\vec{n}$ is routable on M, the four inequalities must be satisfied. $\square$

Fig. 10. Two universal switch modules and their submodules: (a) W = 4; (b) two submodules of switch module in (a); (c) W = 3; (d) two submodules of switch module in (c).

By Corollary 3.1 and Theorem 3.2, we can perform isomorphism operations on a switch module constructed by Algorithm Symmetric\_Switch\_Module to obtain a whole family of universal switch modules. Note that there are $2 \times F_S \times W$ switches in a switch module [Rose and Brown 1991]. Because the switch modules constructed by the algorithm have $F_S = 3$, we have the following corollary.

COROLLARY 3.2 Only 6W switches are needed to construct a universal switch module.

In particular, 6W switches are also the minimum requirement for constructing a universal switch module.

THEOREM 3.3 No switch module with fewer than 6W switches can be universal.

PROOF. By Definition 2.1, an RRV with only one nonzero component W, such as (W, 0, 0, 0, 0, 0), (0, W, 0, 0, 0, 0), and so on, is routable on a universal switch module. Hence at least W noninteracting switches are needed for each type of connection to construct a universal switch module. Because there are six types, the theorem follows. $\square$

By Corollary 3.2 and Theorem 3.3, our universal switch modules do have the minimum number of switches.
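Theorem 3.2 can be checked exhaustively for small W. The sketch below is a brute-force check, not the proof technique used in this article: it enumerates every set of switches that share no terminal, records the resulting RRV, and compares the set of routable RRVs with the set of RRVs allowed by Definition 2.1. The function names are hypothetical, and the run time grows quickly with W, so it is intended only for very small modules.

```python
from itertools import product


def symmetric_switches(W):
    """Switches of the symmetric module as (type, terminal, terminal) triples,
    following the index formulas of Figure 7."""
    switches = []
    for i in range(1, W + 1):
        switches += [
            (1, i, 3 * W - i + 1),
            (2, W + i, 4 * W - i + 1),
            (3, i, 2 * W - i + 1),
            (4, W + i, 3 * W - i + 1),
            (5, 2 * W + i, 4 * W - i + 1),
            (6, i, 4 * W - i + 1),
        ]
    return switches


def routable_vectors(switches):
    """All RRVs realizable by a set of ON switches that share no terminal."""
    found = set()

    def extend(k, used, counts):
        if k == len(switches):
            found.add(tuple(counts))
            return
        extend(k + 1, used, counts)              # leave switch k OFF
        typ, a, b = switches[k]
        if a not in used and b not in used:      # turn switch k ON
            counts[typ - 1] += 1
            extend(k + 1, used | {a, b}, counts)
            counts[typ - 1] -= 1

    extend(0, frozenset(), [0] * 6)
    return found


def dimensional_vectors(W):
    """All RRVs satisfying the four inequalities of Definition 2.1."""
    return {
        n for n in product(range(W + 1), repeat=6)
        if n[0] + n[2] + n[5] <= W and n[1] + n[2] + n[3] <= W
        and n[0] + n[3] + n[4] <= W and n[1] + n[4] + n[5] <= W
    }


def is_universal(switches, W):
    return routable_vectors(switches) == dimensional_vectors(W)


if __name__ == "__main__":
    for W in (1, 2, 3):
        print(W, is_universal(symmetric_switches(W), W))
```

Dropping any single switch from the W = 1 module makes one of the single-connection RRVs unroutable, which illustrates the counting argument behind Theorem 3.3 on the smallest case.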
Note that the requirement, 6W switches, is quite small compared to that for a fully connected switch module, which contains $6W^2$ switches.

#### 4. TWO WELL-KNOWN SWITCH MODULES

We explore the properties associated with the XC4000-type and the antisymmetric switch modules, first showing that neither of them is universal and then discussing their feasibility conditions.

We first consider the XC4000-type switch modules. Their switch-module architectures are illustrated in Figure 11. As mentioned in the preceding section, the switch module $M_3$ of size W = 2 shown in Figure 6(c) is of the XC4000 type. The four RRVs (1, 0, 1, 1, 0, 0), (1, 0, 0, 0, 1, 1), (0, 1, 1, 0, 0, 1), and (0, 1, 0, 1, 1, 0) listed in Table I satisfy the dimensional constraint (i.e., $n_1+n_3+n_6 \leq W$, $n_2+n_3+n_4 \leq W$, $n_1+n_4+n_5 \leq W$, and $n_2+n_5+n_6 \leq W$) but are not routable on the XC4000-type switch module. Hence the XC4000-type switch modules are not universal. More specifically, we have the following theorem for the XC4000-type switch modules.

Fig. 11. (a) Xilinx XC4000-type switch module (W = 3) and its interconnect points; (b) corresponding switch-module model of (a) ($F_S = 3$); (c) three submodules of switch module in (b).

THEOREM 4.1 For a Xilinx XC4000-type switch module M of size W, $\vec{n}$ is routable on M if and only if $\max\{n_1, n_2\} + \max\{n_3, n_5\} + \max\{n_4, n_6\} \le W$.

PROOF. An XC4000-type switch module M consists of W interconnect points on its "diagonal" (see Figure 11(a)). We have the following key observations:

- (1) The connections associated with different interconnect points are noninteracting, and thus can be considered independently (see Figure 11(c)).
- (2) Type-1 and -2, Type-3 and -5, or Type-4 and -6 connections are noninteracting as they are associated with different sides of *M*.
- (3) For each interconnect point, only two connections of noninteracting types can be used simultaneously; for example, no bent connection is feasible after a straight connection is placed.

(*If*) If the inequality is satisfied, we have

$$\begin{cases} n_3 + n_4 \leq W - \max\{n_1, n_2\} \\ n_4 + n_5 \leq W - \max\{n_1, n_2\} \\ n_5 + n_6 \leq W - \max\{n_1, n_2\} \\ n_3 + n_6 \leq W - \max\{n_1, n_2\}. \end{cases}$$

After $n_1$ Type-1 and $n_2$ Type-2 connections are placed, there are still $W' = W - \max\{n_1, n_2\}$ interconnect points available for bent connections, by the preceding observations; the reason is that we may place the $n_1$ Type-1 and $n_2$ Type-2 connections on $\max\{n_1, n_2\}$ interconnect points. For the remaining $W'$ interconnect points, it is always possible to place up to $W' - \max\{n_4, n_6\}$ Type-3 or -5 connections. Note that Type-3 and -5 connections are noninteracting, by Observation (2). By the same reasoning, it is always possible to place up to $W' - \max\{n_3, n_5\}$ Type-4 or -6 connections. Hence, if the inequality is satisfied, there must exist a feasible routing for $\vec{n}$; that is, $\vec{n}$ is routable on M.

Fig. 12. (a, b) Two different size antisymmetric switch modules with $F_S=3$; (c) unroutable vector (2, 2, 1, 0, 1, 0) for antisymmetric switch module of W = 3.

(*Only If*) If $\vec{n}$ is routable on M, the number of available interconnect points for bent connections must be less than or equal to $W' = W - \max\{n_1, n_2\}$ after $n_1$ Type-1 and $n_2$ Type-2 connections are placed, by Observation (3). Inasmuch as Type-3 and -4, Type-4 and -5, Type-5 and -6, or Type-3 and -6 connections are associated with the top, the right, the bottom, or the left side, respectively, the number of connections for each pair of types cannot exceed the remaining interconnect points $W'$.
Thus we have

$$\begin{cases} n_3 + n_4 \leq W - \max\{n_1, n_2\} \\ n_4 + n_5 \leq W - \max\{n_1, n_2\} \\ n_5 + n_6 \leq W - \max\{n_1, n_2\} \\ n_3 + n_6 \leq W - \max\{n_1, n_2\}. \end{cases}$$

The satisfaction of these four inequalities implies that of the inequality $\max\{n_1, n_2\} + \max\{n_3, n_5\} + \max\{n_4, n_6\} \le W$. $\square$

Figures 12(a) and (b) illustrate two different-size antisymmetric switch modules generated by the program used in Rose and Brown [1991] (with $F_S=3$). It is simple to verify that the RRV (2, 2, 1, 0, 1, 0), which satisfies the dimensional constraint, is not routable on the antisymmetric switch module of W = 3; see Figure 12(c) for an illustration. For different-size antisymmetric switch modules with $F_S=3$, the switch-connection configurations are not uniform. Thus we do not explore their individual feasibility conditions; however, we note that the antisymmetric switch modules are not universal.

#### 5. ROUTING-CAPACITY ANALYSIS

The preceding two sections give the feasibility conditions of the universal and the XC4000-type switch modules. In this section we analyze their routing capacities based on combinatorial counting techniques.

Let $M_{U,W}$ and $M_{X,W}$ be a universal and a Xilinx XC4000-type switch module of size W, respectively. Let $U_W$ be the *feasible set* for $M_{U,W}$; that is, $U_W = \{\vec{n} \mid \vec{n} \text{ is routable on } M_{U,W}\}$. $X_W$ is similarly defined. We have the following lemma.

LEMMA 5.1 $X_W \subseteq U_W$.

PROOF. Immediately from Theorem 3.2 and Theorem 4.1. The feasibility condition of $M_{X,W}$ implies that of $M_{U,W}$, and thus $X_W \subseteq U_W$. $\square$

Let $|U_W|$ ($|X_W|$) be the cardinality of $U_W$ ($X_W$). By enumerating the feasible routing instances, we can compute the ratio $|U_W|/|X_W|$.
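For small W the two feasible sets can be enumerated directly from their feasibility conditions (Definition 2.1 for $U_W$ and Theorem 4.1 for $X_W$). The sketch below is a plain brute-force enumeration rather than the counting argument developed in this section, and the function names are hypothetical.

```python
from itertools import product


def satisfies_dimensional_constraint(n, W):
    """Feasibility condition of a universal switch module (Definition 2.1)."""
    n1, n2, n3, n4, n5, n6 = n
    return (n1 + n3 + n6 <= W and n2 + n3 + n4 <= W
            and n1 + n4 + n5 <= W and n2 + n5 + n6 <= W)


def satisfies_xc4000_condition(n, W):
    """Feasibility condition of an XC4000-type switch module (Theorem 4.1)."""
    n1, n2, n3, n4, n5, n6 = n
    return max(n1, n2) + max(n3, n5) + max(n4, n6) <= W


def feasible_sets(W):
    """Return (U_W, X_W) as sets of six-tuples with entries in 0..W."""
    vectors = list(product(range(W + 1), repeat=6))
    U = {n for n in vectors if satisfies_dimensional_constraint(n, W)}
    X = {n for n in vectors if satisfies_xc4000_condition(n, W)}
    return U, X


if __name__ == "__main__":
    for W in range(1, 5):
        U, X = feasible_sets(W)
        # X <= U is the set-inclusion test corresponding to Lemma 5.1.
        print(W, len(U), len(X), X <= U, round(len(U) / len(X), 4))
```

For W = 2 the enumeration gives 56 and 52 vectors, the routing capacities listed in Table I for $M_1$ and $M_3$.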
It is shown that $|U_W|/|X_W| \rightarrow 1.25$; in other words, for the two kinds of switch modules of the same size, the universal switch modules have up to 25% larger routing capacities than the XC4000-type ones. To obtain the ratio $|U_W|/|X_W|$, we first find the closed forms for $|X_W|$ and $|U_W|$.

LEMMA 5.2 (Closed Forms)

$$|X_W| = \binom{W+6}{6} + 3\binom{W+5}{6} + 3\binom{W+4}{6} + \binom{W+3}{6}. \tag{1}$$

$$|U_W| = \left\lfloor \frac{1}{6!}\left(10W^6 + 120W^5 + 595W^4 + 1560W^3 + 2320W^2 + 1920W + 720\right) \right\rfloor. \tag{2}$$

PROOF. (1) By Theorem 4.1, $X_W$ is the set of RRVs $\vec{n}$ satisfying the following inequality:

$$\max\{n_1, n_2\} + \max\{n_3, n_5\} + \max\{n_4, n_6\} \leq W.$$

Hence we have

$$\begin{cases} n_4 \leq W - \max\{n_1, n_2\} - \max\{n_3, n_5\} \\ n_6 \leq W - \max\{n_1, n_2\} - \max\{n_3, n_5\} \end{cases}$$

and

$$\begin{cases} n_3 + n_4 \leq W - \max\{n_1, n_2\} \\ n_4 + n_5 \leq W - \max\{n_1, n_2\} \\ n_5 + n_6 \leq W - \max\{n_1, n_2\} \\ n_3 + n_6 \leq W - \max\{n_1, n_2\}. \end{cases}$$

Consider the following two sets:

$$X_{W,p,1} = \{(n_1, n_2) \mid \max\{n_1, n_2\} = p,\ 0 \le p \le W\}$$

$$X_{W,p,2} = \{(n_3, n_4, n_5, n_6) \mid n_3 + n_4 \le W - p,\ n_4 + n_5 \le W - p,\ n_5 + n_6 \le W - p,\ n_3 + n_6 \le W - p,\ 0 \le p \le W\}.$$

We have

$$|X_W| = \sum_{p=0}^W |X_{W,p,1}|\,|X_{W,p,2}|$$

$$|X_{W,p,1}| = 2p + 1, \quad 0 \le p \le W.$$

To compute $|X_{W,p,2}|$, we define the following two sets:

$$X_{W-p,q,3} = \{(n_3, n_5) \mid \max\{n_3, n_5\} = q,\ 0 \le q \le W - p\}$$

$$X_{W-p,q,4} = \{(n_4, n_6) \mid n_4 \le W - p - q,\ n_6 \le W - p - q,\ 0 \le q \le W - p\}.$$

We have

$$\begin{split} |X_{W,p,2}| &= \sum_{q=0}^{W-p} |X_{W-p,q,3}|\,|X_{W-p,q,4}| \\ |X_{W-p,q,3}| &= 2q+1, \qquad 0 \leq q \leq W-p \\ |X_{W-p,q,4}| &= (W-p-q+1)^2 = \binom{W-p-q+2}{2} + \binom{W-p-q+1}{2}. \end{split}$$

Hence,

$$\begin{split} |X_{W,p,2}| &= |\{(n_3, n_4, n_5, n_6) \mid n_3 + n_4 \le W - p,\ n_4 + n_5 \le W - p,\ n_5 + n_6 \le W - p,\ n_3 + n_6 \le W - p,\ 0 \le p \le W\}| \\ &= \sum_{q=0}^{W-p} |X_{W-p,q,3}|\,|X_{W-p,q,4}| \\ &= \sum_{q=0}^{W-p} (2q+1) \left( \binom{W-p-q+2}{2} + \binom{W-p-q+1}{2} \right) \\ &= \sum_{q=0}^{W-p} (2q+1) \binom{W-p-q+2}{2} + \sum_{q=0}^{W-p} (2q+1) \binom{W-p-q+1}{2} \\ &= \left( \sum_{q=1}^{W-p+1} q \binom{W-p-q+3}{2} + \sum_{q=1}^{W-p} q \binom{W-p-q+2}{2} \right) \\ &\quad + \left( \sum_{q=1}^{W-p+1} q \binom{W-p-q+2}{2} + \sum_{q=1}^{W-p} q \binom{W-p-q+1}{2} \right) \\ &= \sum_{q=0}^{W-p+3} \binom{q}{1} \binom{W-p+3-q}{2} + \sum_{q=0}^{W-p+2} \binom{q}{1} \binom{W-p+2-q}{2} \\ &\quad + \sum_{q=0}^{W-p+2} \binom{q}{1} \binom{W-p+2-q}{2} + \sum_{q=0}^{W-p+1} \binom{q}{1} \binom{W-p+1-q}{2} \\ &= \binom{W-p+4}{4} + 2\binom{W-p+3}{4} + \binom{W-p+2}{4}. \end{split}$$

Note that the identity

$$\sum_{k=0}^{l} \binom{l-k}{m} \binom{q+k}{n} = \binom{l+q+1}{m+n+1},$$

where $n \ge q \ge 0$ and $l, m, n, q \in \mathbb{Z}^+ \cup \{0\}$, is an extension of *Vandermonde's convolution* [Graham et al. 1989].
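The identity quoted above is easy to spot-check numerically before relying on it in the summations. The following sketch compares both sides over a small grid of parameters that respect the stated condition $n \ge q \ge 0$; the function names `lhs` and `rhs` are hypothetical, and a numerical check of this kind is only a sanity test, not a proof.

```python
from math import comb


def lhs(l: int, m: int, n: int, q: int) -> int:
    """Left-hand side: sum over k = 0..l of C(l-k, m) * C(q+k, n)."""
    return sum(comb(l - k, m) * comb(q + k, n) for k in range(l + 1))


def rhs(l: int, m: int, n: int, q: int) -> int:
    """Right-hand side: C(l+q+1, m+n+1)."""
    return comb(l + q + 1, m + n + 1)


if __name__ == "__main__":
    ok = all(
        lhs(l, m, n, q) == rhs(l, m, n, q)
        for l in range(8)
        for m in range(4)
        for q in range(4)
        for n in range(q, q + 4)   # the identity requires n >= q >= 0
    )
    print(ok)
```

In the derivation of $|X_{W,p,2}|$ the identity is applied with $m = 2$, $n = 1$, and $q = 0$, so that a sum with upper limit $l$ collapses to $\binom{l+1}{4}$.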
As a result,

$$\begin{split} |X_W| &= |\{\vec{n} \mid \max\{n_1, n_2\} + \max\{n_3, n_5\} + \max\{n_4, n_6\} \leq W\}| \\ &= \sum_{p=0}^W |X_{W,p,1}|\,|X_{W,p,2}| \\ &= \sum_{p=0}^W (2p+1) \left( \binom{W-p+4}{4} + 2\binom{W-p+3}{4} + \binom{W-p+2}{4} \right) \\ &= \left( \sum_{p=1}^{W+1} p \binom{W-p+5}{4} + \sum_{p=1}^W p \binom{W-p+4}{4} \right) \\ &\quad + 2 \left( \sum_{p=1}^{W+1} p \binom{W-p+4}{4} + \sum_{p=1}^W p \binom{W-p+3}{4} \right) \\ &\quad + \left( \sum_{p=1}^{W+1} p \binom{W-p+3}{4} + \sum_{p=1}^W p \binom{W-p+2}{4} \right) \\ &= \sum_{p=0}^{W+5} \binom{p}{1} \binom{W+5-p}{4} + \sum_{p=0}^{W+4} \binom{p}{1} \binom{W+4-p}{4} \\ &\quad + 2 \sum_{p=0}^{W+4} \binom{p}{1} \binom{W+4-p}{4} + 2 \sum_{p=0}^{W+3} \binom{p}{1} \binom{W+3-p}{4} \\ &\quad + \sum_{p=0}^{W+3} \binom{p}{1} \binom{W+3-p}{4} + \sum_{p=0}^{W+2} \binom{p}{1} \binom{W+2-p}{4} \\ &= \binom{W+6}{6} + 3\binom{W+5}{6} + 3\binom{W+4}{6} + \binom{W+3}{6} \\ &= \frac{1}{6!}\, \vec{X} \cdot \vec{\omega}, \end{split}$$

where

$$\vec{X} = (8, 96, 500, 1440, 2372, 2064, 720)$$

$$\vec{\omega} = (W^6, W^5, W^4, W^3, W^2, W, 1).$$

(2) Applying similar techniques, we get

$$|U_W| = \begin{cases} \dfrac{1}{6!}\, \vec{U}_1 \cdot \vec{\omega}, & W = 2k, \quad k \in \mathbb{Z}^+ \cup \{0\} \\[2mm] \dfrac{1}{6!}\, \vec{U}_2 \cdot \vec{\omega}, & W = 2k+1, \quad k \in \mathbb{Z}^+ \cup \{0\} \end{cases}$$

where

$$\vec{U}_1 = (10, 120, 595, 1560, 2320, 1920, 720)$$

$$\vec{U}_2 = (10, 120, 595, 1560, 2320, 1920, 675).$$

Because $|U_W|$ is an integer and the two expressions differ by only $45/6! < 1$, we have

$$|U_W| = \left\lfloor \frac{1}{6!}\, \vec{U} \cdot \vec{\omega} \right\rfloor,$$

where $\vec{U} = \vec{U}_1$. $\square$

THEOREM 5.1 (Routing Capacities)

- (1) $|U_W|/|X_W|$ is a strictly increasing function of W, $W \ge 1$;
- (2) $\lim_{W \to \infty} |U_W|/|X_W| = 1.25$.

PROOF. (1) When $W \ge 1$,

$$\begin{split} \frac{|U_{W+1}|}{|X_{W+1}|} - \frac{|U_W|}{|X_W|} &= \frac{1}{|X_{W+1}||X_W|} \left( |U_{W+1}||X_W| - |U_W||X_{W+1}| \right) \\ &= \frac{1}{|X_{W+1}||X_W|} \left( \left\lfloor \frac{1}{6!}\, \vec{U} \cdot \vec{\omega}' \right\rfloor \left( \frac{1}{6!}\, \vec{X} \cdot \vec{\omega} \right) - \left\lfloor \frac{1}{6!}\, \vec{U} \cdot \vec{\omega} \right\rfloor \left( \frac{1}{6!}\, \vec{X} \cdot \vec{\omega}' \right) \right) \\ &\geq \frac{1}{|X_{W+1}||X_W|} \left( \left( \frac{1}{6!}\, \vec{U} \cdot \vec{\omega}' - 1 \right) \left( \frac{1}{6!}\, \vec{X} \cdot \vec{\omega} \right) - \left( \frac{1}{6!}\, \vec{U} \cdot \vec{\omega} \right) \left( \frac{1}{6!}\, \vec{X} \cdot \vec{\omega}' \right) \right) \\ &= \frac{10\, \vec{A} \cdot \vec{\omega}''}{(6!)^2 |X_{W+1}||X_W|} \\ &> 0, \end{split}$$

where

$$\vec{A} = (48, 1080, 10512, 57384, 191484, 396450, 483648, 285426, 9288, -48600)$$

$$\vec{\omega}' = ((W+1)^6, (W+1)^5, (W+1)^4, (W+1)^3, (W+1)^2, (W+1), 1)$$

$$\vec{\omega}'' = (W^9, W^8, W^7, W^6, W^5, W^4, W^3, W^2, W, 1).$$

Because
$|U_W|/|X_W| < |U_{W+1}|/|X_{W+1}|$ when $W \ge 1$, $|U_W|/|X_W|$ is a strictly increasing function of $W$.

(2)

$$\lim_{W \to \infty} \frac{|U_W|}{|X_W|} = \lim_{W \to \infty} \frac{\vec{U} \cdot \vec{\omega}}{\vec{X} \cdot \vec{\omega}} = \frac{10}{8} = 1.25,$$

where

$$\vec{X} = (8, 96, 500, 1440, 2372, 2064, 720), \qquad \vec{U} = (10, 120, 595, 1560, 2320, 1920, 720), \qquad \vec{\omega} = (W^6, W^5, W^4, W^3, W^2, W, 1).$$

Therefore, a universal switch module has up to 25% larger routing capacity than the XC4000-type one of the same size. For current commercially available FPGAs, the sizes of switch modules are usually small, say $W \le 40$. Thus the ratios for these small $W$s are of particular interest; Table II lists their corresponding routing capacity ratios. It shows that the universal switch modules have about 22.5%, 24.2%, 24.6%, and 24.8% larger routing capacities than the XC4000-type ones for $W$ = 10, 20, 30, and 40, respectively.

Table II. Routing Capacity Comparison of Universal and XC4000-Type Switch Modules

| $W$ | Routing capacity, universal S. M. $\mid U_W \mid$ | Routing capacity, XC4000-type S. M. $\mid X_W \mid$ | Capacity ratio $\mid U_W \mid / \mid X_W \mid$ |
|----|------------|------------|-------|
| 1 | 10 | 10 | 1.000 |
| 2 | 56 | 52 | 1.077 |
| 3 | 214 | 190 | 1.126 |
| 4 | 641 | 553 | 1.159 |
| 5 | 1,620 | 1,372 | 1.181 |
| 6 | 3,616 | 3,024 | 1.196 |
| 7 | 7,340 | 6,084 | 1.206 |
| 8 | 13,825 | 11,385 | 1.214 |
| 9 | 24,510 | 20,086 | 1.220 |
| 10 | 41,336 | 33,748 | 1.225 |
| 15 | 334,680 | 270,504 | 1.237 |
| 20 | 1,573,121 | 1,266,265 | 1.242 |
| 25 | 5,377,190 | 4,319,406 | 1.245 |
| 30 | 14,905,856 | 11,959,552 | 1.246 |
| 35 | 35,622,150 | 28,560,078 | 1.247 |
| 40 | 76,215,041 | 61,075,609 | 1.248 |

### 6. EXPERIMENTAL RESULTS

To explore the effects of switch-module architectures on routing, we first modified the code of the CGE router [Rose and Brown 1991] to consider various switch-module architectures, and then tested the area performance of the router on the benchmark circuits used in Rose and Brown [1991]. Table III gives the names of the circuits, the numbers of logic modules in the FPGAs, the numbers of nets and connections in the circuits, and the types of the circuits. The connection-module switches were automatically determined by the CGE package once an $F_C$ value was specified. The switch-module architectures used were the universal, the Xilinx XC4000-type [Hsieh et al. 1990; Xilinx Inc. 1994], and the antisymmetric [Rose and Brown 1991] ones. These switch modules all have flexibility three ($F_S = 3$); thus each of them contains $6W$ switches. Note that the universal switch modules used in these experiments were constructed by Algorithm Symmetric\_Switch\_Modules.

Table III. CGE Benchmark Circuits

| Circuit | # Logic modules | # Nets | # Connections | Function type |
|---------|-----------------|--------|---------------|----------------------------|
| BUSC | $12 \times 13$ | 151 | 392 | Bus controller |
| DMA | $16 \times 18$ | 213 | 771 | DMA controller |
| BNRE | $21 \times 22$ | 352 | 1257 | Random logic and data path |
| DFSM | $22 \times 23$ | 420 | 1422 | State machine |
| Z03 | $26 \times 27$ | 608 | 2135 | 8-Bit multiplier |

The quality of a switch module was evaluated by the area performance of the CGE detailed router. Table IV shows the results. For the results listed in this table, we first determined the minimum number of tracks $W$ and then the smallest connection-module flexibility $F_C$ required for 100% routing completion for each circuit, using the three kinds of switch modules.
We then obtained the minimum $W$s needed for 100% routing completion for each circuit using the three kinds of switch modules based on the previously determined $F_C$. The results based on this "minimal" $F_C$ are then reported in the table. Note that, for each circuit, the detailed-routing results associated with different kinds of switch modules are all based on the same global routes. Our results show that, among the three kinds of switch modules, the universal switch modules needed the minimum $W$s and $F_C$s for 100% routing completion for all of the five circuits. Figure 15 shows the detailed routing solution for the circuit BNRE with the parameters $W = 12$ and $F_C = 12$, using the symmetric switch module ($F_S = 3$).

Table IV. Detailed Routing Results: Number of Tracks Needed for CGE Detailed Routing ($F_S = 3$)

| Circuit | $F_C$ | Universal | XC4000-type | Antisymmetric |
|---------|-------|-----------|-------------|---------------|
| BUSC | 9 | 10 | 11 | 10 |
| DMA | 10 | 11 | 14 | 11 |
| DFSM | 10 | 11 | 15 | 11 |
| BNRE | 12 | 12 | 14 | 14 |
| Z03 | 14 | 14 | 15 | 14 |
| Total | | 58 | 69 | 60 |

Fig. 13. $W$s required for 100% routing completion based on specified $F_C$'s for the BUSC circuit.

Our experimental results show that, among the three kinds of switch modules, the universal switch modules usually achieve the best area performance, and the XC4000-type ones often have the worst performance. In Figure 13, for example, the $W$s required for 100% routing completion (represented by the vertical axis) are plotted as a function of specified $F_C$'s (represented by the horizontal axis) for the BUSC circuit; this is done for the three kinds of switch modules. Though not presented in Table IV, the results based on various $F_C$'s are highly consistent with this phenomenon. Note that the architectures of the universal and the antisymmetric switch modules are alike (see Figures 10 and 12); however, as mentioned earlier, the antisymmetric ones are still not universal. This explains why the experimental performance of the antisymmetric switch modules is worse than but close to that of the universal ones.

## 7. CONCLUDING REMARKS

We have presented a class of universal switch modules and shown theoretically and experimentally that they have better performance in routing, compared to the two kinds of well-known switch modules used in Hsieh et al. [1990], Xilinx Inc. [1994], and Rose and Brown [1991]. The feasible sets of the three kinds of switch modules discussed in the article bear the relationship illustrated in Figure 14.<sup>8</sup> Experiments with the three kinds of switch modules have shown that switch modules with larger routing capacities often result in better routing solutions. Our work lays a scientific foundation for the design of switch modules for FPGAs and for the exploration of the effects of switch-module architectures on FPGA routing. Finally, our research also provides a theoretical insight into the important observation by Rose and Brown [1991] that $F_S = 3$ combined with high $F_C$ is often sufficient to achieve high routability.

Fig. 14. Relationship of feasible sets for the three kinds of switch modules.

Fig. 15. The detailed routing solution for the circuit BNRE with the parameters $W = 12$, $F_S = 3$, and $F_C = 12$ using the symmetrical switch module.

<sup>8</sup> Both subsets P and Q in Figure 14 are nonempty. For instance, for $W = 3$, (2, 1, 1, 1, 0, 0) is routable on the antisymmetric switch module (see Figure 4(a)), but not on the XC4000-type one (see Figure 4(d)); conversely, (2, 2, 1, 0, 1, 0) is routable on the XC4000-type switch module, but not on the antisymmetric one. However, both RRVs are routable on the universal switch module (see Figure 10(c)).
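The XC4000-type half of footnote 8 can be checked mechanically against the routability condition used in Section 5 to define $X_W$, namely $\max\{n_1, n_2\} + \max\{n_3, n_5\} + \max\{n_4, n_6\} \le W$. The Python sketch below applies that condition to the two RRVs from the footnote and, as a further check, compares a brute-force count of routable RRVs with the polynomial of equation (2) for small $W$. The function names are illustrative assumptions; the sketch does not model the universal or antisymmetric modules, whose routability conditions are not restated in this section.

```python
from itertools import product

# Coefficient vector X from equation (2), highest power of W first.
X_COEFFS = (8, 96, 500, 1440, 2372, 2064, 720)
FACTORIAL_6 = 720


def xc4000_routable(rrv, w):
    """True if the RRV (n1, ..., n6) meets the XC4000-type condition
    max{n1, n2} + max{n3, n5} + max{n4, n6} <= W."""
    n1, n2, n3, n4, n5, n6 = rrv
    return max(n1, n2) + max(n3, n5) + max(n4, n6) <= w


def count_xc4000(w):
    """Brute-force |X_W|: enumerate every RRV with components in 0..W."""
    return sum(xc4000_routable(v, w) for v in product(range(w + 1), repeat=6))


def formula_xc4000(w):
    """|X_W| evaluated as (1/6!) X . omega, following equation (2)."""
    total = sum(c * w ** (6 - i) for i, c in enumerate(X_COEFFS))
    return total // FACTORIAL_6


if __name__ == "__main__":
    print(xc4000_routable((2, 1, 1, 1, 0, 0), 3))  # first RRV of footnote 8
    print(xc4000_routable((2, 2, 1, 0, 1, 0), 3))  # second RRV of footnote 8
    for w in range(1, 5):
        print(w, count_xc4000(w), formula_xc4000(w))
```

The enumeration grows as $(W+1)^6$, so it is practical only for the first few rows of Table II; the polynomial form covers the larger values of $W$.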
#### **ACKNOWLEDGMENTS**

We would like to thank Stephen Brown and the authors of Brown et al. [1992b] for providing us with the CGE package.

#### **REFERENCES**

- Alexander, M. and Robins, G. 1995. New performance-driven FPGA routing algorithm. In *Proceedings of the ACM/IEEE Design Automation Conference* (San Francisco, June 12–16), 562–567.
- Bhat, N. and Hill, D. 1992. Routable technology mapping for LUT FPGAs. In *Proceedings of the International Conference on Computer Design, VLSI in Computers and Processors* (Cambridge, MA, Oct.), 95–98.
- Brown, S., Rose, J., and Vranesic, Z. G. 1993. A stochastic model to predict the routability of field-programmable gate arrays. *IEEE Trans. Comput.-Aided Des.* 12, 12, 1827–1838.
- Brown, S., Francis, R. J., Rose, J., and Vranesic, Z. G. 1992a. *Field-Programmable Gate Arrays*. Kluwer Academic Publishers, Boston, MA.
- Brown, S., Rose, J., and Vranesic, Z. G. 1992b. A detailed router for field-programmable gate arrays. *IEEE Trans. Comput.-Aided Des.* 11, 620–627.
- Chang, Y.-W., Wong, D. F., and Wong, C. K. 1995a. FPGA global routing based on a new congestion metric. In *Proceedings of the IEEE International Conference on Computer Design, VLSI in Computers and Processors* (Austin, TX, Oct. 2–4), 372–378.
- Chang, Y.-W., Wong, D. F., and Wong, C. K. 1995b. Design and analysis of FPGA/FPIC switch modules. In *Proceedings of the IEEE International Conference on Computer Design, VLSI in Computers and Processors* (Austin, TX, Oct. 2–4), 394–401.
- Chang, Y.-W., Thakur, S., Zhu, K., and Wong, D. F. 1994. A new global routing algorithm for FPGAs. In *Proceedings of the IEEE/ACM International Conference on Computer-Aided Design* (Santa Clara, CA, Nov. 7–11), 380–385.
- Chen, C.-D., Lee, Y.-S., Wu, C.-H., and Lin, Y.-L. 1995. TRACER-fpga: A router for RAM-based FPGAs. *IEEE Trans. Comput.-Aided Des.* 14, 3 (March), 371–374.
- Fujiyoshi, K., Kajitani, Y., and Niitsu, H. 1994. Design of optimum totally perfect connection-blocks of FPGA. In *Proceedings of the IEEE International Symposium on Circuits and Systems* (London, May 30–June 6), 221–224.
- Graham, R. L., Knuth, D. E., and Patashnik, O. 1989. *Concrete Mathematics*. Addison-Wesley, Reading, MA.
- Hsieh, H. C. et al. 1990. Third-generation architecture boosts speed and density of field-programmable gate arrays. In *Proceedings of the IEEE Custom Integrated Circuits Conference* (Boston, MA, May), 31.2.1–31.2.7.
- Lemieux, G. and Brown, S. 1993. A detailed routing algorithm for allocating wire segments in field-programmable gate arrays. In *Proceedings of the ACM/SIGDA Physical Design Workshop* (Lake Arrowhead, CA), 215–226.
- Lin, C., Marek-Sadowska, M., and Gatlin, D. 1994. Universal logic gate for FPGA design. In *Proceedings of the IEEE/ACM International Conference on Computer-Aided Design* (San Jose, CA, Nov. 6–10), 164–168.
- Palczewski, M. 1992. Plane parallel A\* maze router and its applications. In *Proceedings of the ACM/IEEE Design Automation Conference* (Anaheim, CA, June), 691–697.
- Rose, J. and Brown, S. 1991. Flexibility of interconnection structures for field-programmable gate arrays. *IEEE J. Solid-State Circuits* 26, 3, 277–282.
- Rose, J., Francis, R., Lewis, D., and Chow, P. 1990. Architecture of programmable gate arrays: The effect of logic block functionality on area efficiency. *IEEE J. Solid-State Circuits* 25, 1217–1225.
- Sun, Y. and Liu, C. L. 1994. Routing in a new 2-dimensional FPGA/FPIC routing architecture. In *Proceedings of the ACM/IEEE Design Automation Conference* (San Diego, CA, June 6–10), 171–176.
- Sun, Y., Wang, T.-C., Wong, C. K., and Liu, C. L. 1993. Routing for symmetric FPGAs and FPIC. In *Proceedings of the IEEE/ACM International Conference on Computer-Aided Design* (Santa Clara, CA, Nov. 7–11), 486–490.
- Thakur, S. and Wong, D. F. 1995. On designing ULM-based FPGA logic modules. In *Proceedings of the ACM/SIGDA International Symposium on Field Programmable Gate Arrays* (Monterey, CA, Feb. 12–14), 3–9.
- Trimberger, S. 1994. *Field-Programmable Gate Array Technology*. Kluwer Academic, Boston, MA.
- Trimberger, S. and Chene, M. 1992. Placement-based partitioning for lookup-table-based FPGA. In *Proceedings of the International Conference on Computer Design, VLSI in Computers and Processors* (Cambridge, MA, Oct.), 91–94.
- Wu, Y.-L. and Chang, D. 1994. On the NP-completeness of regular 2-D FPGA routing architectures and a novel solution. In *Proceedings of the IEEE/ACM International Conference on Computer-Aided Design* (San Jose, CA, Nov. 6–10), 362–366.
- Wu, Y.-L., Tsukiyama, S., and Marek-Sadowska, M. 1994. Computational complexity of 2-D FPGA routing for arbitrary switch box topologies. In *Proceedings of the ACM International Workshop on FPGA* (Berkeley, CA, Feb. 13–15).
- Xilinx Inc. 1994. *The Programmable Logic Data Book*.
- Zhu, K., Wong, D. F., and Chang, Y.-W. 1993. Switch module design with application to two-dimensional segmentation design. In *Proceedings of the IEEE/ACM International Conference on Computer-Aided Design* (Santa Clara, CA, Nov. 7–11), 481–486.

ACM Transactions on Design Automation of Electronic Systems, Vol. 1, No. 1, January 1996.

Received June 1995; revised August 1995; accepted September 1995