On the inequivalence of bilinear algorithms for $3\times 3$ matrix multiplication
Introduction
For two $2\times 2$ matrices A and B, let C be the product AB. Strassen [1] showed that seven scalar multiplications are sufficient to compute C. Let $m_k = \big(\sum_{i,j} \alpha^{(k)}_{ij} a_{ij}\big)\big(\sum_{i,j} \beta^{(k)}_{ij} b_{ij}\big)$ (1) and $c_{ij} = \sum_{k} \gamma^{(k)}_{ij} m_k$ (2), where $a_{ij}$, $b_{ij}$, and $c_{ij}$ are the entries of A, B, and C, and $\alpha^{(k)}_{ij}$, $\beta^{(k)}_{ij}$, $\gamma^{(k)}_{ij}$ are scalar coefficients.
By finding such numbers $\alpha^{(k)}_{ij}$, $\beta^{(k)}_{ij}$, $\gamma^{(k)}_{ij}$ for $k = 1, \dots, 7$, he proved that 7 scalar multiplications are enough to compute C. By applying the result recursively to the problem of multiplying two $n\times n$ matrices, Strassen derived an algorithm with $O(n^{\log_2 7}) \approx O(n^{2.81})$ arithmetic operations, which is better than $O(n^3)$, the complexity of the naive method. Later, Winograd [2] proved that 7 is the minimum number of multiplications required to multiply two $2\times 2$ matrices in the context of bilinear combinations.
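The recursion above can be made concrete. The following is a minimal sketch (ours, for illustration), using Strassen's published seven products for the $2\times 2$ block case and assuming the matrices are square with power-of-two dimensions:

```python
import numpy as np

def strassen(A, B):
    """Multiply two 2^m x 2^m matrices with Strassen's 7-multiplication scheme."""
    n = A.shape[0]
    if n == 1:                      # base case: a single scalar product
        return A * B
    h = n // 2                      # split each matrix into four h x h blocks
    A11, A12, A21, A22 = A[:h, :h], A[:h, h:], A[h:, :h], A[h:, h:]
    B11, B12, B21, B22 = B[:h, :h], B[:h, h:], B[h:, :h], B[h:, h:]
    # the seven recursive block products
    m1 = strassen(A11 + A22, B11 + B22)
    m2 = strassen(A21 + A22, B11)
    m3 = strassen(A11, B12 - B22)
    m4 = strassen(A22, B21 - B11)
    m5 = strassen(A11 + A12, B22)
    m6 = strassen(A21 - A11, B11 + B12)
    m7 = strassen(A12 - A22, B21 + B22)
    # recombine the products into the four blocks of C
    C = np.empty_like(A)
    C[:h, :h] = m1 + m4 - m5 + m7
    C[:h, h:] = m3 + m5
    C[h:, :h] = m2 + m4
    C[h:, h:] = m1 - m2 + m3 + m6
    return C
```

Each level replaces 8 recursive block multiplications by 7, which is exactly what drives the exponent from 3 down to $\log_2 7$.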
One way to obtain fast algorithms for multiplying two matrices is to look for other small cases, such as the problem of multiplying a $2\times 2$ matrix by a $2\times 3$ matrix. For this case, it was proved that 11 multiplications are required. The upper bound was obtained by combining Strassenʼs algorithm with a vector-matrix multiplication, and the lower bound was obtained by Alexeyev [3]. For the case of multiplying a $2\times 3$ matrix by a $3\times 3$ matrix, the possible number of multiplications is either 14 or 15. The upper bound of 15 was derived by Hopcroft and Kerr [4], and the lower bound of 14 was proved by Brockett and Dobkin [5].
In this paper, we consider the problem of multiplying two $3\times 3$ matrices. For this case, Laderman [6] devised an algorithm that uses 23 multiplications instead of 27. Bläser [7] proved a lower bound of 19. Applied recursively, Ladermanʼs algorithm has time complexity $O(n^{\log_3 23}) \approx O(n^{2.854})$. Although this complexity is worse than that of Strassenʼs algorithm, the result has an important theoretical meaning, which will be discussed in Section 1.2.
Currently, the best complexity for the product of two $n\times n$ matrices is $O(n^{2.3727})$ by Williams [8].
We can think of the coefficients $\alpha^{(k)}_{ij}$, $\beta^{(k)}_{ij}$, and $\gamma^{(k)}_{ij}$ as matrices. That is, for each fixed $k$, $\alpha^{(k)}_{ij}$ can be considered as a $3\times 3$ matrix where $i$ ($j$, resp.) denotes the row index (column index, resp.). Then, an algorithm for the multiplication of two $3\times 3$ matrices using 23 scalar multiplications consists of $69 = 23 \times 3$ matrices. Let $\{\alpha'^{(k)}, \beta'^{(k)}, \gamma'^{(k)}\}$ be another algorithm. Two algorithms, consisting of $\{\alpha^{(k)}, \beta^{(k)}, \gamma^{(k)}\}$ and $\{\alpha'^{(k)}, \beta'^{(k)}, \gamma'^{(k)}\}$, respectively, are called equivalent when one can be transformed to the other using a series of transformations by permutation, cyclic permutation, transposition, scalar multiplication, and matrix multiplication [9].
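To make this matrix view concrete, the sketch below (ours, not from the paper) encodes Strassen's $2\times 2$ algorithm as its $21 = 7\times 3$ coefficient matrices and evaluates the product directly from them; the coefficient values are Strassen's published ones, while the helper name is our own:

```python
import numpy as np

# Strassen's algorithm as coefficient matrices: for each k,
# m_k = (sum_ij alpha[k,i,j] * a_ij) * (sum_ij beta[k,i,j] * b_ij)
# and c_ij = sum_k gamma[k,i,j] * m_k.
alpha = np.array([[[1, 0], [0, 1]], [[0, 0], [1, 1]], [[1, 0], [0, 0]],
                  [[0, 0], [0, 1]], [[1, 1], [0, 0]], [[-1, 0], [1, 0]],
                  [[0, 1], [0, -1]]])
beta = np.array([[[1, 0], [0, 1]], [[1, 0], [0, 0]], [[0, 1], [0, -1]],
                 [[-1, 0], [1, 0]], [[0, 0], [0, 1]], [[1, 1], [0, 0]],
                 [[0, 0], [1, 1]]])
gamma = np.array([[[1, 0], [0, 1]], [[0, 0], [1, -1]], [[0, 1], [0, 1]],
                  [[1, 0], [1, 0]], [[-1, 1], [0, 0]], [[0, 0], [0, 1]],
                  [[1, 0], [0, 0]]])

def bilinear_multiply(A, B, alpha, beta, gamma):
    """Evaluate C = AB from the coefficient matrices of a bilinear algorithm."""
    # one scalar multiplication per k: m_k = <alpha_k, A> * <beta_k, B>
    m = np.einsum('kij,ij->k', alpha, A) * np.einsum('kij,ij->k', beta, B)
    return np.einsum('kij,k->ij', gamma, m)
```

An algorithm in this representation is nothing more than the three stacked tensors; the transformations mentioned above act on these matrices.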
For the multiplication of $2\times 2$ matrices, de Groote [10] proved that Strassenʼs algorithm is unique under these transformations by showing that any algorithm is transformable to Strassenʼs. In other words, every algorithm using 7 scalar multiplications is equivalent to Strassenʼs.
For the $3\times 3$ matrix multiplication, Johnson and McLoughlin [11] presented two parameterized algorithms and proved that they are inequivalent to Ladermanʼs. However, their result does not answer a critical question: how many inequivalent algorithms exist? In this paper, we show that there exist many algorithms inequivalent to Ladermanʼs and Johnson–McLoughlinʼs algorithms.
De Grooteʼs transformation has an invariance property for the ranks of the matrices, which is useful when proving inequivalence between two algorithms. Simply speaking, if the distribution of ranks for one algorithm is different from that for another algorithm, then they are inequivalent. (Here we consider only the overall distribution of the ranks of the 69 matrices; de Grooteʼs transformation has a stronger invariance property than the one mentioned here.)
Ladermanʼs algorithm consists of 51 matrices of rank 1, 12 matrices of rank 2, and 6 matrices of rank 3. For convenience, we simply write this distribution as $(51, 12, 6)$. In the case of Johnson and McLoughlinʼs first parameterized algorithm, the distribution takes one of two forms depending on the given parameters; the distribution of their second algorithm takes one of three forms. Since the rank distributions of the two Johnson–McLoughlin algorithms differ from Ladermanʼs, those algorithms are inequivalent to Ladermanʼs. (Johnson and McLoughlinʼs first algorithm is also inequivalent to their second algorithm.) Johnson and McLoughlin used this invariance property to prove their result.
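Laderman's 69 coefficient matrices are too long to reproduce here, but computing a rank distribution is mechanical. The helper below is our own illustration (not code from the paper); it is demonstrated on the naive 27-multiplication algorithm $m_t = a_{il} b_{lj}$, whose $81 = 27\times 3$ coefficient matrices each have a single nonzero entry and hence rank 1:

```python
import numpy as np

def rank_distribution(alpha, beta, gamma):
    """Tally the ranks of all coefficient matrices of a bilinear algorithm."""
    dist = {}
    for tensor in (alpha, beta, gamma):
        for M in tensor:
            r = int(np.linalg.matrix_rank(M))
            dist[r] = dist.get(r, 0) + 1
    return dist

# Build the naive 27-multiplication algorithm for 3x3 matrices:
# m_t = a_il * b_lj contributes to c_ij, so each coefficient matrix
# has exactly one nonzero entry (rank 1); the distribution is (81, 0, 0).
K, n = 27, 3
alpha = np.zeros((K, n, n))
beta = np.zeros((K, n, n))
gamma = np.zeros((K, n, n))
for t, (i, l, j) in enumerate(
        (i, l, j) for i in range(n) for l in range(n) for j in range(n)):
    alpha[t, i, l] = beta[t, l, j] = gamma[t, i, j] = 1
```

Applied to Laderman's tensors, the same tally would return the $(51, 12, 6)$ distribution quoted above.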
To prove inequivalence, we need to find two candidate algorithms. However, it is difficult to find even one candidate algorithm. To find his algorithm, Laderman used a deep analysis of the problem. Johnson and McLoughlin used the help of a computer and a fairly delicate procedure based on de Grooteʼs transformation to obtain their two parameterized algorithms.
Since our first step is to find many candidate algorithms, the published methods mentioned above are not satisfactory in that they take too much time. We added a simple rounding method to Brentʼs numerical method to obtain a simple and fast heuristic that finds many algorithms in a short time.
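This excerpt does not spell out the rounding step, so the following is only one plausible minimal realization (names and details are ours): round the entries of an approximate numerical solution to the nearest integers and accept the result only if it satisfies the Brent equations exactly.

```python
import numpy as np

def brent_residual(alpha, beta, gamma, n):
    """Residual tensor of the Brent equations for n x n matrix multiplication:
    the coefficient of a_ij * b_jl in c_il must be 1, all others 0."""
    T = np.einsum('kij,klm,kpq->ijlmpq', alpha, beta, gamma)
    target = np.zeros_like(T)
    for i in range(n):
        for j in range(n):
            for l in range(n):
                target[i, j, j, l, i, l] = 1
    return T - target

def round_and_verify(alpha, beta, gamma, n):
    """Round an approximate solution to the nearest integers and check
    whether the rounded coefficients satisfy the Brent equations exactly."""
    a, b, g = (np.rint(X) for X in (alpha, beta, gamma))
    return (a, b, g), not np.any(brent_residual(a, b, g, n))
```

When the numerical iterate is close to an integral solution, rounding recovers it exactly, and the exact check filters out spurious near-solutions.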
Since we could find algorithms with 23 multiplications, we also tried to find an algorithm with 22 multiplications. Unfortunately, given the $\alpha$ʼs, $\beta$ʼs, and the resulting $m_k$ʼs, at most 8 out of the 9 $c_{ij}$ʼs could be obtained.
We first present the basic numerical method commonly used for solving (1), (2). Then we present our heuristic, which adds a simple rounding step to the basic numerical method. Finally, we give three algorithms that are inequivalent to the published ones; due to space limitations, we could not list all the inequivalent algorithms that we found. Thus, before showing the three algorithms, we give a summary of their rank distributions.
Section snippets
Basic numerical method
Brent [12] introduced a basic numerical method. When we plug (1) into Eq. (2), we get the following condition on the coefficients (the Brent equations): $\sum_{k} \alpha^{(k)}_{ij}\,\beta^{(k)}_{lm}\,\gamma^{(k)}_{np} = \delta_{jl}\,\delta_{in}\,\delta_{mp}$, where $\delta$ denotes the Kronecker delta. In [12], an error function was defined as the sum of the squared residuals of these equations. The function is decomposed into three parts. Each part is computed as follows:
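The snippet cuts off before the details of the procedure, so the sketch below is our own reading of the general idea rather than Brent's exact algorithm: the squared error is quadratic in each of the $\alpha$, $\beta$, and $\gamma$ families separately, so each family can be updated exactly by least squares while the other two are held fixed (alternating least squares); all function names are ours.

```python
import numpy as np

def als_search(n, K, iters=50, seed=0):
    """Alternating least squares on the Brent equations for an n x n
    matrix multiplication algorithm with K scalar multiplications."""
    rng = np.random.default_rng(seed)
    # target tensor: the coefficient of a_ij * b_jl in c_il is 1, else 0
    tgt = np.zeros((n,) * 6)
    for i in range(n):
        for j in range(n):
            for l in range(n):
                tgt[i, j, j, l, i, l] = 1

    def update(X, Y, t):
        # exact least-squares solve for the free factor, X and Y fixed
        M = np.einsum('kab,kcd->kabcd', X, Y).reshape(K, -1)
        sol, *_ = np.linalg.lstsq(M.T, t.reshape(n * n, -1).T, rcond=None)
        return sol.reshape(K, n, n)

    A, B, G = (rng.standard_normal((K, n, n)) for _ in range(3))
    for _ in range(iters):
        A = update(B, G, tgt)                              # solve for alpha
        B = update(A, G, tgt.transpose(2, 3, 0, 1, 4, 5))  # solve for beta
        G = update(A, B, tgt.transpose(4, 5, 0, 1, 2, 3))  # solve for gamma
    err = float(np.sum((np.einsum('kij,klm,kpq->ijlmpq', A, B, G) - tgt) ** 2))
    return A, B, G, err
```

Each update is the exact minimizer over one factor, so the error is non-increasing from iteration to iteration; a run that drives the error near zero yields a numerical candidate to which the rounding step of our heuristic can be applied.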
Inequivalent algorithms
In this section, we first show some statistics of the rank distributions of the algorithms that we found. Then we show some algorithms inequivalent to the published algorithms. We cannot list all the inequivalent algorithms due to space limitations. Instead, we select three of them that are interesting in comparison with the published algorithms. The first is an integral algorithm where all variables are …
Conclusion
We devised a simple and fast heuristic for finding algorithms for $3\times 3$ matrix multiplication with 23 scalar multiplications. Our heuristic is considerably faster than the published methods. We then proved that our algorithms are inequivalent to the published ones.
References (12)
On multiplication of matrices, Linear Algebra Appl. (1971)
On the complexity of some algorithms of matrix multiplication, J. Algorithms (1985)
On the complexity of the multiplication of matrices of small formats, J. Complexity (2003)
On varieties of optimal algorithms for the computation of bilinear mappings I. The isotropy group of a bilinear mapping, Theoret. Comput. Sci. (1978)
On varieties of optimal algorithms for the computation of bilinear mappings II. Optimal algorithms for 2×2-matrix multiplication, Theoret. Comput. Sci. (1978)
Gaussian elimination is not optimal, Numer. Math. (1969)
Cited by (17)
Equivalent polyadic decompositions of matrix multiplication tensors, Journal of Computational and Applied Mathematics (2022). Citation excerpt: "The techniques used by de Groote, Johnson and McLoughlin, and Oh et al. to prove the equivalence/inequivalence of decompositions are either too specific [4,23] or too conservative [24] (some inequivalent decompositions are not recognized as such) to be applied to general decompositions of matrix multiplication tensors of arbitrary size."
New ways to multiply 3 × 3-matrices, Journal of Symbolic Computation (2021)
A Normal Form for Matrix Multiplication Schemes, Lecture Notes in Computer Science (2022)