Abstract
This paper introduces and initiates a study of a new model of arithmetic circuits coupled with new complexity measures. The new model consists of multilinear circuits with arbitrary multilinear gates, rather than the standard multilinear circuits that use only addition and multiplication gates. In light of this generalization, the arity of gates becomes of crucial importance and is indeed one of our complexity measures. Our second complexity measure is the number of gates in the circuit, which (in our context) is significantly different from the number of wires in the circuit (which is typically used as a measure of size). Our main complexity measure, denoted \({\mathtt{AN}}(\cdot )\), is the maximum of these two measures (i.e., the maximum between the arity of the gates and the number of gates in the circuit). We also consider the depth of such circuits, focusing on depth-two and unbounded depth.
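To make the two measures concrete, the following is a minimal sketch of a toy depth-two circuit with arbitrary multilinear gates for the bilinear function \(\sum _{i\in [n]}x_iy_i\) over \(\mathrm{GF}(2)\). It is our own illustration and not a construction taken from the paper: it assumes that n is a perfect square, and the name `diag_circuit` and the block layout are hypothetical. Each intermediate gate is a bilinear function of the \(2\sqrt{n}\) variables of one block, and the top gate is a parity of the \(\sqrt{n}\) intermediate gates.

```python
import math
from itertools import product

def diag_circuit(n):
    """Return (evaluator, AN value) for a toy depth-two circuit computing
    sum_i x[i]*y[i] over GF(2). Assumes n is a perfect square."""
    s = math.isqrt(n)
    if s * s != n:
        raise ValueError("this sketch assumes n is a perfect square")
    blocks = [range(k * s, (k + 1) * s) for k in range(s)]

    def intermediate_gate(k, x, y):
        # Bilinear gate fed by the s x-variables and s y-variables of block k.
        return sum(x[i] & y[i] for i in blocks[k]) % 2

    def top_gate(x, y):
        # Linear gate (parity) fed by the s intermediate gates.
        return sum(intermediate_gate(k, x, y) for k in range(s)) % 2

    arities = [2 * s] * s + [s]   # s intermediate gates, then the top gate
    num_gates = len(arities)      # s + 1 gates in total
    an_value = max(max(arities), num_gates)
    return top_gate, an_value

# Exhaustive correctness check for n = 4.
top4, an4 = diag_circuit(4)
for x in product([0, 1], repeat=4):
    for y in product([0, 1], repeat=4):
        assert top4(x, y) == sum(a & b for a, b in zip(x, y)) % 2

top16, an16 = diag_circuit(16)
print(an16)  # prints 8: arity 2*4 dominates the 5 gates
```

In this toy circuit the maximum arity is \(2\sqrt{n}\) and the number of gates is \(\sqrt{n}+1\), so the maximum of the two is \(2\sqrt{n}\); a circuit of standard addition and multiplication gates of bounded arity would be charged very differently.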
Our initial motivation for the study of this arithmetic model is the fact that its two main variants (i.e., depth-two and unbounded depth) yield natural classes of depth-three Boolean circuits for computing multilinear functions. The resulting circuits have size that is exponential in the new complexity measure. Hence, lower bounds on the new complexity measure yield size lower bounds on a restricted class of depth-three Boolean circuits (for computing multilinear functions). Such lower bounds are a sanity check for our conjecture that multilinear functions of relatively low degree over \(\mathrm{GF}(2)\) are good candidates for obtaining exponential lower bounds on the size of constant-depth Boolean circuits (computing explicit functions). Specifically, we propose to move gradually from linear functions to multilinear ones, and conjecture that, for any \(t\ge 2\), some explicit t-linear functions \(F:(\{0,1\}^n)^t\rightarrow \{0,1\}\) require depth-three circuits of size \(\exp (\varOmega (tn^{t/(t+1)}))\).
Letting \({{{\mathtt{AN}}}_2}(\cdot )\) denote the complexity measure \({{{\mathtt{AN}}}}(\cdot )\), when minimized over all depth-two circuits of the above type, our main results are as follows.
- For every t-linear function F, it holds that \({{{\mathtt{AN}}}}(F)\le {{{\mathtt{AN}}}_2}(F)=O((tn)^{t/(t+1)})\).
- For almost all t-linear functions F, it holds that \({{{\mathtt{AN}}}_2}(F)\ge {{{\mathtt{AN}}}}(F)=\varOmega ((tn)^{t/(t+1)})\).
- There exists a bilinear function F such that \({{{\mathtt{AN}}}}(F)=O({\sqrt{n}})\) but \({{{\mathtt{AN}}}_2}(F)=\varOmega (n^{2/3})\).
The main open problem posed in this paper is proving that \({{{\mathtt{AN}}}_2}(F)\ge {{{\mathtt{AN}}}}(F)=\varOmega ((tn)^{t/(t+1)})\) holds for an explicit t-linear function F, with \(t\ge 2\). For starters, we seek a lower bound of \(\varOmega ((tn)^{0.51})\) for an explicit t-linear function F, preferably for constant t. We outline an approach that reduces this challenge (for \(t=3\)) to a question regarding matrix rigidity.
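As a routine sanity check (our own calculation, not taken from the paper), the bound \((tn)^{t/(t+1)}\) on the new complexity measure agrees, up to a factor of two, with the exponent \(tn^{t/(t+1)}\) in the conjectured circuit-size bound, because \(1\le t^{1/(t+1)}<2\) for every \(t\ge 1\):

\[
(tn)^{t/(t+1)} \;=\; t^{t/(t+1)}\cdot n^{t/(t+1)} \;=\; \frac{t\cdot n^{t/(t+1)}}{t^{1/(t+1)}},
\qquad \text{hence} \qquad
\tfrac{1}{2}\,t\,n^{t/(t+1)} \;<\; (tn)^{t/(t+1)} \;\le\; t\,n^{t/(t+1)}.
\]

For instance, \(t=2\) gives \((2n)^{2/3}\) and \(t=3\) gives \((3n)^{3/4}\), whereas the interim goal \((tn)^{0.51}\) has an exponent just above \(1/2\).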
Notes
- 1.
- 2.
Note that \({F_\mathtt{all}^{t,n}}({x^{(1)}},...,{x^{(t)}}) = \prod _{j\in [t]}\sum _{i_j\in [n]}{x^{(j)}_{i_{j}}}\), which means that it can be computed by a t-way conjunction of n-way parity circuits, whereas \({F_\mathtt{diag}^{t,n}}\) is obviously an n-way parity of t-way conjunctions of variables.
- 3.
Thus, these tensors should be constructible within \(\exp (tn)\)-time. Note that we can move from the tensor to the multilinear function (and vice versa) in \(n^t\ll \exp (tn)\) oracle calls.
- 4.
Note that \({F_\mathtt{leq}^{t,n}}({x^{(1)}},...,{x^{(t)}})\) equals \(\sum _{i\in [n]}{F_\mathtt{leq}^{t-1,i}}(x^{(1)}_{[1,i]},...,x^{(t-1)}_{[1,i]}) \cdot x^{(t)}_i\), where \(x^{(j)}_{[1,i]}=(x^{(j)}_{1},...,x^{(j)}_{i})\). So, for every \(t'\in [t-1]\), the dynamic program uses the n values \(({F_\mathtt{leq}^{t',i}}(x^{(1)}_{[1,i]},...,x^{(t')}_{[1,i]}))_{i\in [n]}\) in order to compute the n values \(({F_\mathtt{leq}^{t'+1,i}}(x^{(1)}_{[1,i]},...,x^{(t'+1)}_{[1,i]}))_{i\in [n]}\).
- 5.
Again, we use dynamic programming, but here we apply it to generalizations of these functions. Specifically, let \({T_\mathtt{tet}^{t,n,d}}=\{(i_1,...,i_t)\in [n]^t:\sum _{j\in [t]}|i_j-{\lfloor n/2\rfloor }|\le d\}\) and note that the associated function satisfies \({F_\mathtt{tet}^{t,n,d}}({x^{(1)}},...,{x^{(t)}}) = \sum _{i\in [n]} {F_\mathtt{tet}^{t-1,n,d-|i-{\lfloor n/2\rfloor }|}}({x^{(1)}},...,{x^{(t-1)}})\cdot x^{(t)}_i\). Likewise, consider the tensor \({T_{\mathrm{mod}\, {p}}^{t,n,r}} = \left\{ (i_1,...,i_t)\in [n]^t:\sum _{j\in [t]}i_j\equiv r\pmod p\right\} \) and note that the associated function satisfies \({F_{\mathrm{mod}\, {p}}^{t,n,r}}({x^{(1)}},...,{x^{(t)}}) = \sum _{i\in [n]} {F_{\mathrm{mod}\, {p}}^{t-1,n,r-i}}({x^{(1)}},...,{x^{(t-1)}})\cdot x^{(t)}_i\).
- 6.
In brief, when computing t-linear polynomials, a lower bound of \(\exp (\varOmega (s/2^t))\) on the size of depth-two circuits can be justified (see [11, Apdx C]). Furthermore, for \(2^t\ll s\), a lower bound of \(\exp (\varOmega (s))\) can be justified if the CNFs (or DNFs) used are “canonical” (i.e., use only s-way gates at the second (i.e., \(F_i\)’s) level).
- 7.
Since such directly fed variables can be replaced by dummy gates that are each fed by the corresponding variable.
- 8.
Doing so may increase the arity of the top gate, but this increase is upper-bounded by the number of gates. A more general argument is presented in Remark 2.4, which asserts that if gate G computes a monomial that contains no leaves, then this monomial can be moved up to the parent of G.
- 9.
Thus, \(n=4\) and t is the number of matrices being multiplied.
- 10.
Concretely, the conjectured hardness of computing a multilinear function by constant-depth Boolean circuits may stem from the number (denoted n) of variables of the same type (i.e., the variables in \({x^{(j)}}\)), even when the arity of multiplication (denoted t) is relatively small (e.g., we even consider bilinear functions), whereas in the multilinear circuits hardness seems to be related to t (cf., indeed, the aforementioned lower bound for iterated matrix multiplication).
- 11.
See ECCC TR13-043, March 2013.
- 12.
However, as stated in Sect. 1.5, our main results extend to other fields.
- 13.
In fact, \({\mathtt{AN}}_2({F_\mathtt{had}^{2,n}})={\widetilde{O}}({\sqrt{n}})\). In contrast, by [9], \({\mathtt{AN}}_2({F_\mathtt{tet}^{3,n}})={\widetilde{\varOmega }}(n^{2/3})\) and \({\mathtt{AN}}({F_\mathtt{tet}^{3,n}})={\widetilde{\varOmega }}(n^{0.6})\).
- 14.
Clearly, w.l.o.g., H is multilinear in its m inputs, since we are considering multiplication over \(\mathrm{GF}(2)\). However, what we consider next is not the dependency of H on its own inputs, but rather its dependency on the inputs of the circuits as reflected in the composed function \(H(F_1,...,F_m)\). Furthermore, we do not consider this function per se, but rather its syntactic form (before cancellations).
- 15.
Actually, this may increase m by one unit. The reason is that if the top gate is fed by i variables, then the number of intermediate gates in the circuit is at most \(m-i\). So introducing intermediate singleton gates yields a depth-two circuit with at most \((m-i)+i\) intermediate gates.
- 16.
The point is that this alternative class G does not refer to the “description length” but rather to the complexity measures defined in this section. In this case, we may show that a random restriction of the type used in the original proof leaves m/s live variables in each \(G_i\), in expectation, just as it holds for the \(B_i\)’s. Using \(m=0.9n^{2/3}\), it holds that, with high probability, none of the gates exceeds this expectation by a factor of 1/0.9. Next, we upper-bound the size of G, very much as done in the foregoing proof, where here the crucial fact is that each \(B_i\) has only \(m\cdot n^{1/6}\) live terms, whereas \(m^2\cdot n^{1/6} = 0.81\cdot n^{3/2}\).
- 17.
Here, we assume (as is standard in the area) that the cancellations must hold over any extension field of \(\mathrm{GF}(2)\) (rather than only over \(\mathrm{GF}(2)\) itself); that is, the polynomial \(x^i\) may be cancelled by the polynomial \((2k+1)\cdot x^j\) if and only if \(i=j\).
- 18.
Variables that feed directly into the top gate can be replaced by 1-ary identity gates.
- 19.
Depth-two circuits can be derived by combining the t-way multiplication gate with the \(\sqrt{n}\)-way addition gates feeding it (resp., each \(\sqrt{n}\)-way addition gate with the t-way multiplication gate feeding it).
- 20.
By the induction hypothesis, for every \(t'\in [t-1]\), we can express the functions \({F_\mathtt{leq}^{t-t',[(k-1)s+1,n]}}({x^{(1)}},...,{x^{(t-t')}})\) for all \(k\in [s]\), but here we need the functions \({F_\mathtt{leq}^{t-t',[(k-1)s+1,n]}}({x^{(t'+1)}},...,{x^{(t)}})\). Still, these are the same functions; we just need to change the variable names in the expressions.
- 21.
For starters, we allowed each gate to be fed by m original variables and m auxiliary functions, whereas the arity bound is m. Furthermore, we allowed each gate to be fed by all other gates, whereas the circuit should be acyclic. Moreover, the choice of the t-partition can be the same for all gates, let alone that the various t-partitions must be consistent among gates and adhere to the multilinearity condition of Definition 2.1.
- 22.
Denoting by \(m_j\) the number of variables and/or gates that belong to the \(j^\mathrm{th}\) block, the number of possible monomials is \(\prod _{j\in [t]}(m_j+1)\), where in our case \(\sum _{j\in [t]}m_j\le 2m\).
- 23.
Added in Revision: Interestingly, a subsequent work of Dvir and Liu [3, 4] shows that no Toeplitz matrix is rigid in the range of parameters sought by Valiant [31]. Specifically, they show that, for any constant \(c>1\), no Toeplitz matrix has rigidity \(n^c\) with respect to rank \(n/\log n\) (see [4], which builds upon [3]). In contrast, the subsequent work of Goldreich and Tal [9] shows that almost all Toeplitz matrices have rigidity \({\widetilde{\varOmega }}(n^3/r^2)\) with respect to rank \(r\in [{\sqrt{n}},n/32]\).
- 24.
As in Construction 2.6, we may replace variables that feed directly into the top gate by 1-ary identity gates. That is, if \(F(x,y)=H(F_1(x,y),...,F_{m'}(x,y),z_{m'+1},...,z_{m-1})\), where each \(z_i\) belongs either to x or to y, then we let \(F(x,y)=H(F_1(x,y),...,F_{m-1}(x,y))\), where \(F_i(x,y)=z_i\) for every \(i\in [m'+1,m-1]\).
- 25.
In terms of Eq. (1), letting T denote the set of one-entries of M, it holds that \(B(x,y)=\sum _{(k,\ell )\in T}x_ky_\ell \).
- 26.
That is, letting \(L'_j(y)=\sum _{j_2:(j,j_2)\in P}L_{j_2}(y)\), we consider the sum \(\sum _{j_1\in [m-1]}L_{j_1}(x)\cdot L'_{j_1}(y)\), and note that each term corresponds to a rank-1 matrix (i.e., the \((k,\ell )^\mathrm{th}\) entry of the \(j_1^\mathrm{th}\) matrix equals \(L_{j_1}(0^{k-1}10^{n-k})\cdot L'_{j_1}(0^{\ell -1}10^{n-\ell })\)).
- 27.
Actually, we can combine all products that involve \(F_i\), see below.
- 28.
Since, as argued next, such monomials exist only in the top gate, it follows that (w.l.o.g.) they cannot be a single linear function, because the top gate must compute a homogeneous polynomial of degree 2.
- 29.
Note that \(\sigma _1+\sigma _2 = \sum _k F_k(x)\cdot L_k(y) +\sum _\ell L_\ell (x)\cdot F_\ell (y)\), where the \(L_i\)’s are arbitrary linear functions (which may depend on an arbitrary number of variables in either x or y).
- 30.
This generalizes the claim made in Remark 3.5. Furthermore, as stated there, mixed gates are potentially beneficial. The observation made here is that this benefit (i.e., a mixed monomial) comes at the “cost” of using auxiliary functions of lower degree.
- 31.
The opposite direction is equally simple: Just note that \({F_\mathtt{tet}^{3,n}}\) can be expressed as a sum of the values in the eight directions corresponding to \(\{\pm 1\}^3\).
- 32.
Equivalently, we wish to set I such that \(|\{i\in I:\chi (i,j,k)=1\}|\equiv h_{0,j+k}\pmod 2\) holds, where \(\chi (i,j,k)=1\) if \((i,j,k)\in T_3\) (equiv., \(i+j+k\le n'\)). Letting \(\zeta _i\in \mathrm{GF}(2)\) represent whether \(i\in I\), we solve the linear system \(\sum _{i\in [[n']]} \chi (i,0,j+k)\zeta _i \equiv h_{0,j+k}\pmod 2\) for \(j+k\in [[n']]\). Note that the matrix corresponding to this linear system has full rank.
- 33.
This means that one considers the syntactic polynomial computed by the circuit (over a generic field) and requires that it equals the target polynomial when the field remains unspecified.
- 34.
Recall that, w.l.o.g., gates that compute quadratic \(F_i\)’s (for \(i\in [m']\)) may only feed into the top gate. Ditto for gates computing products of two linear \(F_i\)’s (for \(i\in [m'+1,m-1]\)). Thus, \(F_0=Q_0+\sum _{i\in [m']}F_i+\sum _{i=m'+1}^{m-1}L_{0,i}F_i\), where \(Q_0\) is a sum of the products of pairs of variables that appear in \(F_0\), the \(L_{0,i}\)’s are arbitrary linear functions, and for \(i>m'\) the linear function \(F_i\) is computed by an internal gate. Furthermore, for every \(i\in [m']\), it holds that \(F_i=Q_i+\sum _{j=m'+1}^{m-1}L_{i,j}F_j\), where \(Q_i\) is a sum of the products of pairs of variables that appear in \(F_i\), the \(L_{i,j}\)’s are arbitrary linear functions, and for \(j>m'\) the linear function \(F_j\) is computed by an internal gate. Letting \(L_j=\sum _{i=0}^{m'}L_{i,j}\), we get Eq. (12).
- 35.
Recall that, w.l.o.g., gates that compute cubic \(F_i\)’s (for \(i\in [m']\)) may only feed into the top gate. Ditto for gates computing products of linear \(F_i\)’s and quadratic \(F_i\)’s (for \(i\in [m'+1,m-1]\)). Thus, \(F_0=C_0+\sum _{i\in [m']}F_i +\sum _{i=m'+1}^{m'+m''}L_{0,i}F_{i} +\sum _{i=m'+m''+1}^{m'+m''+m''}Q_{0,i}F_{i}\), where \(C_0\) is a sum of the products of triples of variables that appear in \(F_0\), the \(L_{0,i}\)’s (resp., \(Q_{0,i}\)’s) are arbitrary linear (resp., quadratic) functions, and for \(i>m'\) the quadratic (resp., linear) function \(F_i\) is computed by an internal gate. Furthermore, for every \(i\in [m']\), it holds that \(F_i=C_i+\sum _{j=m'+1}^{m'+m''}L_{i,j}F_{j} +\sum _{j=m'+m''+1}^{m'+m''+m''}Q_{i,j}F_{j}\), where \(C_i\) is a sum of the products of triples of variables that appear in \(F_i\), the \(L_{i,j}\)’s (resp., \(Q_{i,j}\)’s) are arbitrary linear (resp., quadratic) functions, and for \(j>m'\) the quadratic (resp., linear) function \(F_j\) is computed by an internal gate. Letting \(L_j=\sum _{i=0}^{m'}L_{i,j}\) and \(Q_j=\sum _{i=0}^{m'}Q_{i,j}\), we get Eq. (13).
- 36.
In a gate that is fed by a trivial multiplication-gate, the argument representing the trivial gate’s output is replaced by the (up to) t input variables feeding this trivial gate.
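The dynamic programs described in Notes 4 and 5 can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: we assume that the tensor underlying \({F_\mathtt{leq}^{t,n}}\) consists of the non-decreasing tuples \(i_1\le \cdots \le i_t\) (which is what the recursion in Note 4 suggests), we take the empty product as the base case, and the function names are hypothetical. All arithmetic is over \(\mathrm{GF}(2)\), with `xs[j]` holding the bits of \(x^{(j+1)}\).

```python
from itertools import product

def f_leq_dp(xs):
    """Note 4: vals[i-1] holds F_leq^{t',i} on the length-i prefixes."""
    n = len(xs[0])
    vals = [1] * n  # t' = 0: empty product (assumed base case)
    for x in xs:
        acc, new_vals = 0, []
        for i in range(n):
            acc ^= vals[i] & x[i]  # add the term F_leq^{t',i} * x_i
            new_vals.append(acc)
        vals = new_vals
    return vals[-1]

def f_mod_dp(xs, p, r):
    """Note 5: vals[res] holds F_mod^{t',n,res}; indices range over 1..n."""
    n = len(xs[0])
    vals = [1 if res == 0 else 0 for res in range(p)]  # t' = 0
    for x in xs:
        new_vals = [0] * p
        for res in range(p):
            for i in range(1, n + 1):
                new_vals[res] ^= vals[(res - i) % p] & x[i - 1]
        vals = new_vals
    return vals[r % p]

def brute_force(xs, keep):
    """Sum over GF(2) of the monomials whose 1-based index tuple satisfies keep."""
    t, n = len(xs), len(xs[0])
    total = 0
    for idx in product(range(1, n + 1), repeat=t):
        if keep(idx):
            term = 1
            for j in range(t):
                term &= xs[j][idx[j] - 1]
            total ^= term
    return total

def is_sorted(idx):
    return all(a <= b for a, b in zip(idx, idx[1:]))

# Exhaustive comparison for t = 3, n = 3, p = 2.
for bits in product([0, 1], repeat=9):
    xs = [bits[0:3], bits[3:6], bits[6:9]]
    assert f_leq_dp(xs) == brute_force(xs, is_sorted)
    for r in range(2):
        assert f_mod_dp(xs, 2, r) == brute_force(xs, lambda idx: sum(idx) % 2 == r)
```

Each level of either program uses only the values produced by the previous level, so the work is linear in t (times n for the first program and times pn for the second), compared with the \(n^t\) monomials of the brute-force sum.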
References
Ajtai, M.: \(\varSigma _1^1\)-formulae on finite structures. Ann. Pure Appl. Logic 24(1), 1–48 (1983)
Babai, L.: Random oracles separate PSPACE from the polynomial-time hierarchy. IPL 26, 51–53 (1987)
Dvir, Z., Liu, A.: Fourier and circulant matrices are not rigid. In: 34th CCC, pp. 17:1–17:23 (2019). arXiv:1902.07334 [math.CO]
Dvir, Z., Liu, A.: Fourier and circulant matrices are not rigid. To appear in TOC, special issue of 34th CCC. See also ECCC, TR19-129, September 2019
Erdos, P., Spencer, J.: Probabilistic Methods in Combinatorics. Academic Press Inc., New York (1974)
Furst, M.L., Saxe, J.B., Sipser, M.: Parity, circuits, and the polynomial-time hierarchy. Math. Syst. Theory 17(1), 13–27 (1984). Preliminary version in 22nd FOCS (1981)
Goldreich, O.: Computational Complexity: A Conceptual Perspective. Cambridge University Press, Cambridge (2008)
Goldreich, O.: Improved bounds on the AN-complexity of multilinear functions. In: ECCC, TR19-171 (2019)
Goldreich, O., Tal, A.: Matrix rigidity of random Toeplitz matrices. Comput. Complex. 27(2), 305–350 (2018). Preliminary versions in 48th STOC (2016) and ECCC TR15-079 (2015)
Goldreich, O., Tal, A.: On constant-depth canonical Boolean circuits for computing multilinear functions. In: ECCC, TR17-193 (2017)
Goldreich, O., Wigderson, A.: On the size of depth-three Boolean circuits for computing multilinear functions. In: ECCC, TR13-043 (2013)
Hastad, J.: Almost optimal lower bounds for small depth circuits. In: Micali, S. (ed.) Advances in Computing Research: A Research Annual, (Randomness and Computation), vol. 5, pp. 143–170 (1989). Extended abstract in 18th STOC (1986)
Hastad, J.: Computational Limitations for Small Depth Circuits. MIT Press, Cambridge (1987)
Hastad, J., Jukna, S., Pudlak, P.: Top-down lower bounds for depth-three circuits. Comput. Complex. 5(2), 99–112 (1995)
Hrubes, P., Rao, A.: Circuits with medium fan-in. In: ECCC, TR14-020 (2014)
Jukna, S.: Boolean Function Complexity: Advances and Frontiers. Algorithms and Combinatorics, vol. 27. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-24508-4
Karchmer, M., Wigderson, A.: Monotone circuits for connectivity require super-logarithmic depth. SIAM J. Discret. Math. 3(2), 255–265 (1990)
Kushilevitz, E., Nisan, N.: Communication Complexity. Cambridge University Press, Cambridge (1997)
Lokam, S.V.: Complexity lower bounds using linear algebra. Found. Trends Theor. Comput. Sci. 4, 1–155 (2009)
van Melkebeek, D.: A survey of lower bounds for satisfiability and related problems. Found. Trends Theor. Comput. Sci. 2, 197–303 (2007)
Nisan, N.: Pseudorandom bits for constant depth circuits. Combinatorica 11(1), 63–70 (1991)
Nisan, N., Wigderson, A.: Hardness vs randomness. J. Comput. Syst. Sci. 49(2), 149–167 (1994). Preliminary version in 29th FOCS (1988)
Nisan, N., Wigderson, A.: Lower bound on arithmetic circuits via partial derivatives. Comput. Complex. 6, 217–234 (1996)
Raz, R.: Tensor-rank and lower bounds for arithmetic formulas. In: Proceedings of the 42nd STOC, pp. 659–666 (2010)
Raz, R., Yehudayoff, A.: Lower bounds and separations for constant depth multilinear circuits. In: ECCC, TR08-006 (2008)
Razborov, A.: Lower bounds on the size of bounded-depth networks over a complete basis with logical addition. Matematicheskie Zametki 41(4), 598–607 (1987). (in Russian). English translation in Math. Notes Acad. Sci. USSR 41(4), 333–338 (1987)
Savitch, W.J.: Relationships between nondeterministic and deterministic tape complexities. JCSS 4(2), 177–192 (1970)
Shaltiel, R., Viola, E.: Hardness amplification proofs require majority. SIAM J. Comput. 39(7), 3122–3154 (2010). Extended abstract in 40th STOC (2008)
Smolensky, R.: Algebraic methods in the theory of lower bounds for Boolean circuit complexity. In: 19th STOC, pp. 77–82 (1987)
Strassen, V.: Vermeidung von Divisionen. J. Reine Angew. Math. 264, 182–202 (1973)
Valiant, L.G.: Graph-theoretic arguments in low-level complexity. In: Gruska, J. (ed.) MFCS 1977. LNCS, vol. 53, pp. 162–176. Springer, Heidelberg (1977). https://doi.org/10.1007/3-540-08353-7_135
Valiant, L.G.: Exponential lower bounds for restricted monotone circuits. In: 15th STOC, pp. 110–117 (1983)
Vazirani, U.V.: Efficiency considerations in using semi-random sources. In: 19th STOC, pp. 160–168 (1987)
Yao, A.C.: Separating the polynomial-time hierarchy by oracles. In: 26th FOCS, pp. 1–10 (1985)
Acknowledgments
We are grateful to Or Meir for extremely helpful discussions, and to Avishay Tal for many suggestions for improving the presentation. Research was partially done while O.G. visited the IAS.
© 2020 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Goldreich, O., Wigderson, A. (2020). On the Size of Depth-Three Boolean Circuits for Computing Multilinear Functions. In: Goldreich, O. (eds) Computational Complexity and Property Testing. Lecture Notes in Computer Science, vol 12050. Springer, Cham. https://doi.org/10.1007/978-3-030-43662-9_6
DOI: https://doi.org/10.1007/978-3-030-43662-9_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-43661-2
Online ISBN: 978-3-030-43662-9
eBook Packages: Computer Science (R0)