On the inequivalence of bilinear algorithms for $3\times 3$ matrix multiplication
Introduction
For two $2\times 2$ matrices A and B, let C be the product AB. Strassen [1] showed that seven scalar multiplications are sufficient to compute C. Let $m_k = \big(\sum_{i,j} \alpha^{(k)}_{ij} a_{ij}\big)\big(\sum_{i,j} \beta^{(k)}_{ij} b_{ij}\big)$ (1) and $c_{ij} = \sum_{k} \gamma^{(k)}_{ij} m_k$ (2), where $a_{ij}$, $b_{ij}$, and $c_{ij}$ are the entries of A, B, and C, and $\alpha^{(k)}_{ij}$, $\beta^{(k)}_{ij}$, $\gamma^{(k)}_{ij}$ are scalar coefficients.
By finding such numbers $\alpha^{(k)}_{ij}$, $\beta^{(k)}_{ij}$, $\gamma^{(k)}_{ij}$ for $k = 1, \dots, 7$, he proved that 7 scalar multiplications are enough to compute C. By applying the result recursively to the problem of multiplying two $n\times n$ matrices, Strassen derived an algorithm with $O(n^{\log_2 7}) \approx O(n^{2.81})$ arithmetic operations, which is better than $O(n^3)$, the complexity of the naive method. Later, Winograd [2] proved that 7 is the minimum number of multiplications required to multiply two $2\times 2$ matrices in the context of bilinear combinations.
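The recursion above can be made concrete. The following is a minimal sketch (ours, for illustration), using Strassen's published seven products for the $2\times 2$ block case and assuming the matrices are square with power-of-two dimensions:

```python
import numpy as np

def strassen(A, B):
    """Multiply two 2^m x 2^m matrices with Strassen's 7-multiplication scheme."""
    n = A.shape[0]
    if n == 1:                      # base case: a single scalar product
        return A * B
    h = n // 2                      # split each matrix into four h x h blocks
    A11, A12, A21, A22 = A[:h, :h], A[:h, h:], A[h:, :h], A[h:, h:]
    B11, B12, B21, B22 = B[:h, :h], B[:h, h:], B[h:, :h], B[h:, h:]
    # the seven recursive block products
    m1 = strassen(A11 + A22, B11 + B22)
    m2 = strassen(A21 + A22, B11)
    m3 = strassen(A11, B12 - B22)
    m4 = strassen(A22, B21 - B11)
    m5 = strassen(A11 + A12, B22)
    m6 = strassen(A21 - A11, B11 + B12)
    m7 = strassen(A12 - A22, B21 + B22)
    # recombine the products into the four blocks of C
    C = np.empty_like(A)
    C[:h, :h] = m1 + m4 - m5 + m7
    C[:h, h:] = m3 + m5
    C[h:, :h] = m2 + m4
    C[h:, h:] = m1 - m2 + m3 + m6
    return C
```

Each level replaces 8 recursive block multiplications by 7, which is exactly what drives the exponent from 3 down to $\log_2 7$.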
One way to obtain fast algorithms for multiplying two matrices is to look for other small cases, such as the problem of multiplying a $2\times 2$ matrix by a $2\times 3$ matrix. For this case, it was proved that 11 multiplications are required. The upper bound was obtained by combining Strassenʼs algorithm with a vector-matrix multiplication, and the lower bound was obtained by Alexeyev [3]. For the case of multiplying a $2\times 3$ matrix by a $3\times 3$ matrix, the possible number of multiplications is either 14 or 15. The upper bound of 15 was derived by Hopcroft and Kerr [4], and the lower bound of 14 was proved by Brockett and Dobkin [5].
In this paper, we consider the problem of multiplying two $3\times 3$ matrices. For this case, Laderman [6] devised an algorithm that uses 23 multiplications instead of 27. Bläser [7] proved a lower bound of 19. Applied recursively, Ladermanʼs algorithm has time complexity $O(n^{\log_3 23}) \approx O(n^{2.854})$. Although this complexity is worse than that of Strassenʼs algorithm, the result has an important theoretical meaning, which will be discussed in Section 1.2.
Currently, the best complexity for the product of two $n\times n$ matrices is $O(n^{2.3727})$ by Williams [8].
We can think of the coefficients $\alpha^{(k)}_{ij}$, $\beta^{(k)}_{ij}$, and $\gamma^{(k)}_{ij}$ as matrices. That is, for each fixed $k$, $\alpha^{(k)}_{ij}$ can be considered as a $3\times 3$ matrix where $i$ ($j$, resp.) denotes the row index (column index, resp.). Then, an algorithm for the multiplication of two $3\times 3$ matrices using 23 scalar multiplications consists of $69 = 23 \times 3$ matrices. Let $\{\alpha'^{(k)}, \beta'^{(k)}, \gamma'^{(k)}\}$ be another algorithm. Two algorithms, consisting of $\{\alpha^{(k)}, \beta^{(k)}, \gamma^{(k)}\}$ and $\{\alpha'^{(k)}, \beta'^{(k)}, \gamma'^{(k)}\}$, respectively, are called equivalent when one can be transformed to the other using a series of transformations by permutation, cyclic permutation, transposition, scalar multiplication, and matrix multiplication [9].
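To make this matrix view concrete, the sketch below (ours, not from the paper) encodes Strassen's $2\times 2$ algorithm as its $21 = 7\times 3$ coefficient matrices and evaluates the product directly from them; the coefficient values are Strassen's published ones, while the helper name is our own:

```python
import numpy as np

# Strassen's algorithm as coefficient matrices: for each k,
# m_k = (sum_ij alpha[k,i,j] * a_ij) * (sum_ij beta[k,i,j] * b_ij)
# and c_ij = sum_k gamma[k,i,j] * m_k.
alpha = np.array([[[1, 0], [0, 1]], [[0, 0], [1, 1]], [[1, 0], [0, 0]],
                  [[0, 0], [0, 1]], [[1, 1], [0, 0]], [[-1, 0], [1, 0]],
                  [[0, 1], [0, -1]]])
beta = np.array([[[1, 0], [0, 1]], [[1, 0], [0, 0]], [[0, 1], [0, -1]],
                 [[-1, 0], [1, 0]], [[0, 0], [0, 1]], [[1, 1], [0, 0]],
                 [[0, 0], [1, 1]]])
gamma = np.array([[[1, 0], [0, 1]], [[0, 0], [1, -1]], [[0, 1], [0, 1]],
                  [[1, 0], [1, 0]], [[-1, 1], [0, 0]], [[0, 0], [0, 1]],
                  [[1, 0], [0, 0]]])

def bilinear_multiply(A, B, alpha, beta, gamma):
    """Evaluate C = AB from the coefficient matrices of a bilinear algorithm."""
    # one scalar multiplication per k: m_k = <alpha_k, A> * <beta_k, B>
    m = np.einsum('kij,ij->k', alpha, A) * np.einsum('kij,ij->k', beta, B)
    return np.einsum('kij,k->ij', gamma, m)
```

An algorithm in this representation is nothing more than the three stacked tensors; the transformations mentioned above act on these matrices.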
For the multiplication of $2\times 2$ matrices, de Groote [10] proved that Strassenʼs algorithm is unique under these transformations by showing that any algorithm is transformable to Strassenʼs. In other words, every algorithm using 7 scalar multiplications is equivalent to Strassenʼs.
For the $3\times 3$ matrix multiplication, Johnson and McLoughlin [11] presented two parameterized algorithms and proved that they are inequivalent to Ladermanʼs. However, their result does not answer a critical question: how many inequivalent algorithms exist? In this paper, we show that there exist many algorithms inequivalent to Ladermanʼs and Johnson–McLoughlinʼs algorithms.
De Grooteʼs transformation has an invariance property for the ranks of the matrices, which is useful when proving inequivalence between two algorithms. Simply speaking, if the distribution of ranks for one algorithm is different from that for another algorithm, then they are inequivalent. (Here we consider only the overall distribution of the ranks of the 69 matrices; de Grooteʼs transformation has a stronger invariance property than the one mentioned here.)
Ladermanʼs algorithm consists of 51 matrices of rank 1, 12 matrices of rank 2, and 6 matrices of rank 3. For convenience, we simply write this distribution as $(51, 12, 6)$. In the case of Johnson and McLoughlinʼs first parameterized algorithm, the distribution takes one of two forms depending on the given parameters; the distribution of their second algorithm takes one of three forms. Since the rank distributions of the two Johnson–McLoughlin algorithms differ from Ladermanʼs, those algorithms are inequivalent to Ladermanʼs. (Johnson and McLoughlinʼs first algorithm is also inequivalent to their second algorithm.) Johnson and McLoughlin used this invariance property to prove their result.
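Laderman's 69 coefficient matrices are too long to reproduce here, but computing a rank distribution is mechanical. The helper below is our own illustration (not code from the paper); it is demonstrated on the naive 27-multiplication algorithm $m_t = a_{il} b_{lj}$, whose $81 = 27\times 3$ coefficient matrices each have a single nonzero entry and hence rank 1:

```python
import numpy as np

def rank_distribution(alpha, beta, gamma):
    """Tally the ranks of all coefficient matrices of a bilinear algorithm."""
    dist = {}
    for tensor in (alpha, beta, gamma):
        for M in tensor:
            r = int(np.linalg.matrix_rank(M))
            dist[r] = dist.get(r, 0) + 1
    return dist

# Build the naive 27-multiplication algorithm for 3x3 matrices:
# m_t = a_il * b_lj contributes to c_ij, so each coefficient matrix
# has exactly one nonzero entry (rank 1); the distribution is (81, 0, 0).
K, n = 27, 3
alpha = np.zeros((K, n, n))
beta = np.zeros((K, n, n))
gamma = np.zeros((K, n, n))
for t, (i, l, j) in enumerate(
        (i, l, j) for i in range(n) for l in range(n) for j in range(n)):
    alpha[t, i, l] = beta[t, l, j] = gamma[t, i, j] = 1
```

Applied to Laderman's tensors, the same tally would return the $(51, 12, 6)$ distribution quoted above.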
To prove inequivalence, we need to find two candidate algorithms. However, it is difficult to find even one candidate algorithm. To find his algorithm, Laderman used a deep analysis of the problem. Johnson and McLoughlin used the help of a computer and a fairly delicate procedure based on de Grooteʼs transformation to obtain their two parameterized algorithms.
Since our first step is to find many candidate algorithms, the published methods mentioned above are not satisfactory in that they take too much time. We added a simple rounding method to Brentʼs numerical method to obtain a simple and fast heuristic that finds many algorithms in a short time.
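This excerpt does not spell out the rounding step, so the following is only one plausible minimal realization (names and details are ours): round the entries of an approximate numerical solution to the nearest integers and accept the result only if it satisfies the Brent equations exactly.

```python
import numpy as np

def brent_residual(alpha, beta, gamma, n):
    """Residual tensor of the Brent equations for n x n matrix multiplication:
    the coefficient of a_ij * b_jl in c_il must be 1, all others 0."""
    T = np.einsum('kij,klm,kpq->ijlmpq', alpha, beta, gamma)
    target = np.zeros_like(T)
    for i in range(n):
        for j in range(n):
            for l in range(n):
                target[i, j, j, l, i, l] = 1
    return T - target

def round_and_verify(alpha, beta, gamma, n):
    """Round an approximate solution to the nearest integers and check
    whether the rounded coefficients satisfy the Brent equations exactly."""
    a, b, g = (np.rint(X) for X in (alpha, beta, gamma))
    return (a, b, g), not np.any(brent_residual(a, b, g, n))
```

When the numerical iterate is close to an integral solution, rounding recovers it exactly, and the exact check filters out spurious near-solutions.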
Since we could find algorithms with 23 multiplications, we also tried to find an algorithm with 22 multiplications. Unfortunately, given the $\alpha$ʼs, $\beta$ʼs, and the resulting $m_k$ʼs, at most 8 out of the 9 $c_{ij}$ʼs could be obtained.
We first present the basic numerical method commonly used for solving (1), (2). Then we present our heuristic, which adds a simple rounding step to the basic numerical method. Finally, we give three algorithms that are inequivalent to the published ones; due to space limitations, we could not list all the inequivalent algorithms that we found. Thus, before showing the three algorithms, we give a summary of their rank distributions.
Section snippets
Basic numerical method
Brent [12] introduced a basic numerical method. When we plug (1) into Eq. (2), we get the following condition on the coefficients (the Brent equations): $\sum_{k} \alpha^{(k)}_{ij}\,\beta^{(k)}_{lm}\,\gamma^{(k)}_{np} = \delta_{jl}\,\delta_{in}\,\delta_{mp}$, where $\delta$ denotes the Kronecker delta. In [12], an error function was defined as the sum of the squared residuals of these equations. The function is decomposed into three parts. Each part is computed as follows:
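The snippet cuts off before the details of the procedure, so the sketch below is our own reading of the general idea rather than Brent's exact algorithm: the squared error is quadratic in each of the $\alpha$, $\beta$, and $\gamma$ families separately, so each family can be updated exactly by least squares while the other two are held fixed (alternating least squares); all function names are ours.

```python
import numpy as np

def als_search(n, K, iters=50, seed=0):
    """Alternating least squares on the Brent equations for an n x n
    matrix multiplication algorithm with K scalar multiplications."""
    rng = np.random.default_rng(seed)
    # target tensor: the coefficient of a_ij * b_jl in c_il is 1, else 0
    tgt = np.zeros((n,) * 6)
    for i in range(n):
        for j in range(n):
            for l in range(n):
                tgt[i, j, j, l, i, l] = 1

    def update(X, Y, t):
        # exact least-squares solve for the free factor, X and Y fixed
        M = np.einsum('kab,kcd->kabcd', X, Y).reshape(K, -1)
        sol, *_ = np.linalg.lstsq(M.T, t.reshape(n * n, -1).T, rcond=None)
        return sol.reshape(K, n, n)

    A, B, G = (rng.standard_normal((K, n, n)) for _ in range(3))
    for _ in range(iters):
        A = update(B, G, tgt)                              # solve for alpha
        B = update(A, G, tgt.transpose(2, 3, 0, 1, 4, 5))  # solve for beta
        G = update(A, B, tgt.transpose(4, 5, 0, 1, 2, 3))  # solve for gamma
    err = float(np.sum((np.einsum('kij,klm,kpq->ijlmpq', A, B, G) - tgt) ** 2))
    return A, B, G, err
```

Each update is the exact minimizer over one factor, so the error is non-increasing from iteration to iteration; a run that drives the error near zero yields a numerical candidate to which the rounding step of our heuristic can be applied.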
Inequivalent algorithms
In this section, we first show some statistics of the rank distributions of the algorithms that we found. Then we show some algorithms inequivalent to the published algorithms. We cannot list all the inequivalent algorithms due to space limitations. Instead, we select three of them that are interesting in comparison with the published algorithms. The first is an integral algorithm where all variables are …
Conclusion
We devised a simple and fast heuristic for finding algorithms for $3\times 3$ matrix multiplication with 23 scalar multiplications. Our heuristic is considerably faster than the published methods. We then proved that our algorithms are inequivalent to the published ones.
References (12)
On multiplication of matrices, Linear Algebra Appl. (1971)
On the complexity of some algorithms of matrix multiplication, J. Algorithms (1985)
On the complexity of the multiplication of matrices of small formats, J. Complexity (2003)
On varieties of optimal algorithms for the computation of bilinear mappings I. The isotropy group of a bilinear mapping, Theoret. Comput. Sci. (1978)
On varieties of optimal algorithms for the computation of bilinear mappings II. Optimal algorithms for 2×2-matrix multiplication, Theoret. Comput. Sci. (1978)
Gaussian elimination is not optimal, Numer. Math. (1969)
Cited by (17)
Equivalent polyadic decompositions of matrix multiplication tensors, Journal of Computational and Applied Mathematics (2022). Citation excerpt: "The techniques used by de Groote, Johnson and McLoughlin, and Oh et al. to prove the equivalence/inequivalence of decompositions are either too specific [4,23] or too conservative [24] (some inequivalent decompositions are not recognized as such) to be applied to general decompositions of matrix multiplication tensors of arbitrary size."
New ways to multiply 3 × 3-matrices, Journal of Symbolic Computation (2021)
A Normal Form for Matrix Multiplication Schemes, Lecture Notes in Computer Science (2022)