# Strassen's matrix multiplication

Consider two matrices, $A=[\;\begin{smallmatrix}a_{11}&a_{12} \newline a_{21}&a_{22}\end{smallmatrix}\;]$ and $B=[\;\begin{smallmatrix}b_{11}&b_{12} \newline b_{21}&b_{22}\end{smallmatrix}\;]$. The matrix multiplication $C=AB$ can be computed using the following expression:

\[\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{bmatrix} = \begin{bmatrix} a_{11}b_{11}+a_{12}b_{21} & a_{11}b_{12}+a_{12}b_{22} \\ a_{21}b_{11}+a_{22}b_{21} & a_{21}b_{12}+a_{22}b_{22} \end{bmatrix}.\]Note that each element in the resulting matrix is in the form $c_{ij} = a_{i1}b_{1j}+a_{i2}b_{2j}$. Thus, there are two *basic* multiplications between numbers before they are summed up. (For matrices of dimension $n{\times}n$, we have $c_{ij} = \sum_1^n a_{i\square}b_{\square j}$ instead.)

The advantage of this equation is that each $a_{ij}$ and $b_{ij}$ doesn’t have to be a scalar. They can also be submatrices of $A$ and $B$, provided both matrices are partitioned the same way.

For instance, when $A$ and $B$ are both $4{\times}4$ matrices, we have

\[\begin{align} {\color{red}A}{\color{blue}B} &= {\color{red}\left[\;\begin{array}{c:c} \boxed{\begin{matrix}a_{11} & a_{12} \\ a_{21} & a_{22}\end{matrix}} & \boxed{\begin{matrix}a_{13} & a_{14} \\ a_{23} & a_{24}\end{matrix}} \\ \hdashline \boxed{\begin{matrix}a_{31} & a_{32} \\ a_{41} & a_{42}\end{matrix}} & \boxed{\begin{matrix}a_{33} & a_{34} \\ a_{43} & a_{44}\end{matrix}} \end{array}\;\right]} {\color{blue}\left[\;\begin{array}{c:c} \boxed{\begin{matrix}b_{11} & b_{12} \\ b_{21} & b_{22}\end{matrix}} & \boxed{\begin{matrix}b_{13} & b_{14} \\ b_{23} & b_{24}\end{matrix}} \\ \hdashline \boxed{\begin{matrix}b_{31} & b_{32} \\ b_{41} & b_{42}\end{matrix}} & \boxed{\begin{matrix}b_{33} & b_{34} \\ b_{43} & b_{44}\end{matrix}} \end{array}\;\right]} \\ &= \left[\;\begin{array}{c:c} {\color{red}\boxed{\begin{matrix}a_{11} & a_{12} \\ a_{21} & a_{22}\end{matrix}}}\, {\color{blue}\boxed{\begin{matrix}b_{11} & b_{12} \\ b_{21} & b_{22}\end{matrix}}} + {\color{red}\boxed{\begin{matrix}a_{13} & a_{14} \\ a_{23} & a_{24}\end{matrix}}}\, {\color{blue}\boxed{\begin{matrix}b_{31} & b_{32} \\ b_{41} & b_{42}\end{matrix}}} & \cdots \\ \hdashline \vdots & \ddots \end{array}\;\right] \\ &= \left[\;\begin{array}{c:c} \boxed{\begin{matrix} \sum_1^2 a_{1\square}b_{\square1} & \sum_1^2 a_{1\square}b_{\square2} \\ \sum_1^2 a_{2\square}b_{\square1} & \sum_1^2 a_{2\square}b_{\square2} \end{matrix}} + \boxed{\begin{matrix} \sum_3^4 a_{1\square}b_{\square1} & \sum_3^4 a_{1\square}b_{\square2} \\ \sum_3^4 a_{2\square}b_{\square1} & \sum_3^4 a_{2\square}b_{\square2} \end{matrix}} & \cdots \\ \hdashline \vdots & \ddots \end{array}\;\right] \\ &= \left[\;\begin{array}{cc:cc} \sum_1^4 a_{1\square}b_{\square1} & \sum_1^4 a_{1\square}b_{\square2} & \cdots & \sum_1^4 a_{1\square}b_{\square4} \\ \sum_1^4 a_{2\square}b_{\square1} & \sum_1^4 a_{2\square}b_{\square2} && \\ \hdashline \vdots && \ddots \\ \sum_1^4 a_{4\square}b_{\square1} &&& \sum_1^4 a_{4\square}b_{\square4} \end{array}\;\right]. 
\end{align}\]Thus, we can apply the divide-and-conquer technique to derive an algorithm for this problem.
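The divide-and-conquer idea can be sketched as follows. This is a minimal illustration for square matrices whose size is a power of two, using plain lists of lists; the helper names (`split`, `add`, `join`, `multiply`) are my own, not from any particular library.

```python
def split(M):
    """Split an n x n matrix into four (n/2) x (n/2) quadrants."""
    n = len(M) // 2
    return ([row[:n] for row in M[:n]], [row[n:] for row in M[:n]],
            [row[:n] for row in M[n:]], [row[n:] for row in M[n:]])

def add(X, Y):
    """Entrywise sum of two equally sized matrices."""
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def join(C11, C12, C21, C22):
    """Reassemble four quadrants into one matrix."""
    top = [r1 + r2 for r1, r2 in zip(C11, C12)]
    bottom = [r1 + r2 for r1, r2 in zip(C21, C22)]
    return top + bottom

def multiply(A, B):
    if len(A) == 1:  # base case: 1x1, a single basic multiplication
        return [[A[0][0] * B[0][0]]]
    A11, A12, A21, A22 = split(A)
    B11, B12, B21, B22 = split(B)
    # Eight recursive subproblems, mirroring c_ij = a_i1 b_1j + a_i2 b_2j
    C11 = add(multiply(A11, B11), multiply(A12, B21))
    C12 = add(multiply(A11, B12), multiply(A12, B22))
    C21 = add(multiply(A21, B11), multiply(A22, B21))
    C22 = add(multiply(A21, B12), multiply(A22, B22))
    return join(C11, C12, C21, C22)
```

Note the four block products of $C$ each require two recursive multiplications, giving the eight subproblems we will count below.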

Naturally, the next question is: what is the complexity of this algorithm? The simplest analysis involves counting how many times the basic multiplication between numbers occurs.

Let’s consider $n{\times}n$ matrices. When $n=2$, based on the previous observation that each element in the result matrix involves two basic multiplications, we end up with a total of eight basic multiplications. Now, when $n=4$, we can observe that the problem is divided into eight subproblems of $2{\times}2$ matrix multiplication. This results in a total of $8\cdot8=64$ basic multiplications.
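In recurrence form: multiplying two $n{\times}n$ matrices costs eight multiplications of $(n/2){\times}(n/2)$ matrices, so the number of basic multiplications $T(n)$ satisfies

\[T(n) = 8\,T(n/2), \qquad T(1) = 1, \qquad\text{hence}\qquad T(n) = 8^{\log_2 n} = n^{\log_2 8} = n^3.\]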

Roughly speaking, we can conclude from this basic multiplication count that the complexity of the algorithm is $O(n^3)$. This complexity aligns well with the definition of matrix multiplication, suggesting that it might be an optimal solution that cannot be further improved…

However, it was not until 1969 that Volker Strassen proposed a matrix multiplication algorithm that uses only seven multiplications in each layer. The technique relies on the observation that we don’t need to compute each $c_{ij}$ right away. Instead, we can compute slightly more complicated intermediate results that intertwine many elements, and combine these intermediate results into the $c_{ij}$ later. Crucially, these intermediate results are reused across different elements, which is what reduces the number of multiplications used.

Precisely, we compute these intermediate results:

\[\begin{align} m_1 &= (a_{11}+a_{22})(b_{11}+b_{22}), \\ m_2 &= (a_{21}+a_{22})b_{11}, \\ m_3 &= a_{11}(b_{12}-b_{22}), \\ m_4 &= a_{22}(b_{21}-b_{11}), \\ m_5 &= (a_{11}+a_{12})b_{22}, \\ m_6 &= (a_{21}-a_{11})(b_{11}+b_{12}), \\ m_7 &= (a_{12}-a_{22})(b_{21}+b_{22}). \end{align}\]Then we find $C=AB$ with

\[\begin{bmatrix} c_{11} & c_{12} \\ c_{21} & c_{22} \end{bmatrix} = \begin{bmatrix} m_1 + m_4 - m_5 + m_7 & m_3 + m_5 \\ m_2 + m_4 & m_1 - m_2 + m_3 + m_6 \end{bmatrix}.\]By reducing the number of multiplications to seven, the complexity is lowered to $O(n^{\log_27}) \approx O(n^{2.8})$.
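One layer of Strassen’s scheme translates directly into code. The sketch below assumes numeric entries for simplicity, but the same formulas apply when $a_{ij}$ and $b_{ij}$ are submatrices, with $+$, $-$, and $\cdot$ replaced by matrix operations; the function name is my own.

```python
def strassen_2x2(a11, a12, a21, a22, b11, b12, b21, b22):
    """One layer of Strassen's scheme: seven multiplications instead of eight."""
    m1 = (a11 + a22) * (b11 + b22)
    m2 = (a21 + a22) * b11
    m3 = a11 * (b12 - b22)
    m4 = a22 * (b21 - b11)
    m5 = (a11 + a12) * b22
    m6 = (a21 - a11) * (b11 + b12)
    m7 = (a12 - a22) * (b21 + b22)
    # Combine the intermediate results into the entries of C = AB
    c11 = m1 + m4 - m5 + m7
    c12 = m3 + m5
    c21 = m2 + m4
    c22 = m1 - m2 + m3 + m6
    return c11, c12, c21, c22
```

Note that the savings come entirely from reuse: $m_1$ appears in both $c_{11}$ and $c_{22}$, $m_4$ in both $c_{11}$ and $c_{21}$, and so on.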

The above system of equations can be complicated to understand. However, I found that an explanatory picture on Wikipedia helped me *visualize* the main concept of this algorithm. It’s like a building-block game where we need to assemble the correct building: each block occupies cells of a $4{\times}4$ table in the product grid between $A$ and $B$, and we construct each $c_{ij}$ by selecting a combination of building blocks from $m_1$ to $m_7$.

(Figure: the building blocks $m_1$ through $m_7$ and how they combine into $c_{11}$, $c_{12}$, $c_{21}$, and $c_{22}$.)

While Strassen’s algorithm may seem like only a slight improvement, it marks a significant milestone that paved the way for the study of matrix multiplication algorithms. Over the past half-century, extensive research has been conducted in this area. (Even Strassen himself later proposed another, faster algorithm, although his first algorithm is generally more well-known.) In one of the latest advancements, in 2020, Josh Alman and Virginia Vassilevska Williams developed novel techniques that further reduced the complexity to $O(n^{2.3728596})$.
