Section 3.2 Algebra of matrices
The last section was devoted to what might be called the arithmetic of matrices. We learned the basic operations of adding, multiplying, scaling, and transposing matrices. In this section we tackle the algebra of matrices. We will investigate the properties enjoyed (and not enjoyed) by our matrix operations, and will show how to use these operations to solve matrix equations.
As you learn about matrix algebra, always keep in mind your old friend, real number algebra. For the most part these two algebraic systems closely resemble one another, as Theorem 3.2.1 below makes clear. However, there are two crucial points where they differ (see Theorem 3.2.8): two important properties of real number algebra that do not hold for matrices. The consequences of these two simple aberrations are far-reaching and imbue matrix algebra with a fascinating richness in comparison to real number algebra.
Theorem 3.2.1. Properties of matrix addition, multiplication and scalar multiplication.
The following properties hold for all matrices \(A, B, C\) and scalars \(a, b, c\in \R\) for which the given expression makes sense.
-
Addition commutative law.
\(\displaystyle A+B=B+A\)
-
Addition associative law.
\(\displaystyle A+(B+C)=(A+B)+C\)
-
Multiplication associative law.
\(\displaystyle A(BC)=(AB)C\)
-
Left-distributive law.
\(\displaystyle A(B+C)=AB+AC\)
-
Right-distributive law.
\(\displaystyle (B+C)A=BA+CA\)
-
Scaling distributive law.
\(\displaystyle a(B+C)=aB+aC\)
-
Another scaling distributive law.
\(\displaystyle (a+b)C=aC+bC\)
-
Scaling associative law.
\(\displaystyle a(bC)=(ab)C\)
-
Scaling commutative law.
\(a(BC)=(aB)C=B(aC)\text{.}\)
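Each of these laws can be spot-checked numerically. The sketch below (plain Python with hand-rolled matrix operations; the helper names `matmul`, `madd`, `scal` are choices made here, not from the text) tests every identity in the theorem on small concrete matrices. A check on examples, of course, not a proof.

```python
# Minimal matrix operations on lists of lists.
def matmul(A, B):
    # (AB)_{ij} = sum_k a_{ik} b_{kj}
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def madd(A, B):
    return [[A[i][j] + B[i][j] for j in range(len(A[0]))] for i in range(len(A))]

def scal(c, A):
    return [[c * x for x in row] for row in A]

# Concrete test matrices (all 2x2, so every expression makes sense).
A = [[1, 2], [3, 4]]
B = [[0, 1], [5, -2]]
C = [[2, 0], [-1, 3]]
a, b = 3, -2

assert madd(A, B) == madd(B, A)                                    # A+B = B+A
assert madd(A, madd(B, C)) == madd(madd(A, B), C)                  # A+(B+C) = (A+B)+C
assert matmul(A, matmul(B, C)) == matmul(matmul(A, B), C)          # A(BC) = (AB)C
assert matmul(A, madd(B, C)) == madd(matmul(A, B), matmul(A, C))   # A(B+C) = AB+AC
assert matmul(madd(B, C), A) == madd(matmul(B, A), matmul(C, A))   # (B+C)A = BA+CA
assert scal(a, madd(B, C)) == madd(scal(a, B), scal(a, C))         # a(B+C) = aB+aC
assert scal(a + b, C) == madd(scal(a, C), scal(b, C))              # (a+b)C = aC+bC
assert scal(a, scal(b, C)) == scal(a * b, C)                       # a(bC) = (ab)C
assert scal(a, matmul(B, C)) == matmul(scal(a, B), C) == matmul(B, scal(a, C))
print("all nine laws hold on these matrices")
```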
How does one actually prove one of these properties? These are all matrix equalities of the form \(X=Y\text{,}\) so according to the matrix equality definition we must show (1) that the matrices \(X\) and \(Y\) have the same dimension, and (2) that \((X)_{ij}=(Y)_{ij}\) for all \((i,j)\text{.}\) The proof below illustrates this technique for the multiplication associative law of Theorem 3.2.1.
Proof of (iv).
We prove only the multiplication associative law. Let \(A=[a_{ij}]_{m\times r}\text{,}\) \(B=[b_{ij}]_{r\times s}\text{,}\) \(C=[c_{ij}]_{s\times n}\text{.}\) To show
\begin{equation*} A(BC)=(AB)C\text{,} \end{equation*}
we must show (1) that \(A(BC)\) and \((AB)C\) have the same dimension, and (2) that
\begin{equation*} \left(A(BC)\right)_{ij}=\left((AB)C\right)_{ij} \end{equation*}
for all possible \((i,j)\text{.}\)
(1) The usual observation about “inner” and “outer” dimensions shows that both \(A(BC)\) and \((AB)C\) have dimension \(m\times n\text{.}\)
(2) Using Definition 3.1.17, given any \((i,j)\) with \(1\leq i\leq m\) and \(1\leq j\leq n\text{,}\) we have:
\begin{align*} \left(A(BC)\right)_{ij} \amp = \sum_{k=1}^{r}a_{ik}\left(BC\right)_{kj} = \sum_{k=1}^{r}a_{ik}\left(\sum_{\ell=1}^{s}b_{k\ell}c_{\ell j}\right) = \sum_{k=1}^{r}\sum_{\ell=1}^{s}a_{ik}b_{k\ell}c_{\ell j}\\ \amp = \sum_{\ell=1}^{s}\left(\sum_{k=1}^{r}a_{ik}b_{k\ell}\right)c_{\ell j} = \sum_{\ell=1}^{s}\left(AB\right)_{i\ell}c_{\ell j} = \left((AB)C\right)_{ij}\text{.} \end{align*}
This proves that all entries of the two matrices are equal, and hence \(A(BC)=(AB)C\text{.}\)
As in real number algebra, we can identify some special matrices that act as additive identities and multiplicative identities; and every matrix has an additive inverse. What we mean here is spelled out in detail in Theorem 3.2.4.
Definition 3.2.2. Additive inverse of a matrix.
Given an \(m\times n\) matrix \(A=[a_{ij}]\text{,}\) its additive inverse \(-A\) is defined as
\begin{equation*} -A=[-a_{ij}]\text{.} \end{equation*}
Definition 3.2.3. Identity matrix.
The \(n\times n\) identity matrix is the square \(n\times n\) matrix
\begin{equation*} I_n=\begin{bmatrix} 1\amp 0\amp \cdots\amp 0\\ 0\amp 1\amp \cdots\amp 0\\ \vdots\amp \vdots\amp \ddots\amp \vdots\\ 0\amp 0\amp \cdots\amp 1 \end{bmatrix} \end{equation*}
with ones along the diagonal and zeros everywhere else. In other words, for all \(1\leq i\leq n\) and \(1\leq j\leq n\text{,}\) we have
\begin{equation*} (I_n)_{ij}=\begin{cases} 1\amp \text{if } i=j\\ 0\amp \text{if } i\ne j\text{.} \end{cases} \end{equation*}
When the size \(n\) of the identity matrix is not important, we will often denote it simply as \(I\text{.}\)
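The entrywise description of \(I_n\) translates directly into code. A quick sketch (plain Python; the name `identity` is chosen here for illustration):

```python
def identity(n):
    # (I_n)_{ij} = 1 when i = j, and 0 otherwise
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

print(identity(3))  # [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
```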
Theorem 3.2.4. Additive identities, additive inverses, and multiplicative identities.
-
Additive identities.
The \(m\times n\) zero matrix \(\boldzero_{m\times n}\) is an additive identity for \(m\times n\) matrices in the following sense: for any \(m\times n\) matrix \(A\) we have
\begin{equation*} \boldzero_{m\times n}+A=A\text{.} \end{equation*} -
Additive inverses.
For any \(m\times n\) matrix \(A\) we have
\begin{equation*} A+(-A)=A+(-1)A=\boldzero_{m\times n}\text{.} \end{equation*} -
Multiplicative identities.
The \(n\times n\) identity matrix is a multiplicative identity for \(n\times n\) matrices in the following sense: for any \(n\times n\) matrix \(A\) we have
\begin{equation*} I_n\, A=A\, I_n=A\text{.} \end{equation*}
Proof.
Left as an exercise.
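While the formal proof is left as an exercise, the three statements are easy to confirm on a concrete matrix. A sketch in plain Python (helper names are choices made here, not from the text):

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def madd(A, B):
    return [[A[i][j] + B[i][j] for j in range(len(A[0]))] for i in range(len(A))]

def scal(c, A):
    return [[c * x for x in row] for row in A]

A = [[1, -2], [3, 0]]
Z = [[0, 0], [0, 0]]       # the 2x2 zero matrix
I2 = [[1, 0], [0, 1]]      # the 2x2 identity matrix

assert madd(Z, A) == A                       # 0 + A = A
assert madd(A, scal(-1, A)) == Z             # A + (-1)A = 0
assert matmul(I2, A) == A == matmul(A, I2)   # I A = A = A I
print("additive identity, additive inverse, multiplicative identity: all confirmed")
```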
Corollary 3.2.5. Additive cancellation of matrices.
Given \(m\times n\) matrices \(A, B\text{,}\) and \(C\text{,}\) we have \(A+B=A+C\) if and only if \(B=C\text{.}\) Using logical notation:
\begin{equation*} A+B=A+C \iff B=C\text{.} \end{equation*}
Proof.
As simple as this claim might seem, remember that we are dealing with a completely new algebraic system here. We will prove both implications of the “if and only if” statement separately.
Proof: \(A+B=A+C\implies B=C\).
We prove this via a chain of implications:
\begin{align*} A+B=A+C \amp\implies -A+(A+B)=-A+(A+C)\\ \amp\implies (-A+A)+B=(-A+A)+C\\ \amp\implies \boldzero_{m\times n}+B=\boldzero_{m\times n}+C\\ \amp\implies B=C\text{.} \end{align*}
Proof: \(B=C\implies A+B=A+C\).
This direction is obvious: if \(B\) and \(C\) are equal matrices, then they remain equal when we add \(A\) to each of them.
Remark 3.2.6.
The algebraic importance of Corollary 3.2.5 is that we can perform additive cancellation in matrix equations just as we do in real number algebra. For example, we can solve the matrix equation \(A+B=3A\) for \(B\) as follows:
\begin{align*} A+B=3A \amp\implies A+B=A+2A\\ \amp\implies B=2A\text{.} \end{align*}
Warning 3.2.7.
Though we can perform additive cancellation in matrix algebra, we cannot always perform multiplicative cancellation. For example, consider the matrices
\begin{equation*} A=\begin{bmatrix} 1\amp 0\\ 0\amp 0 \end{bmatrix}, \quad B=\begin{bmatrix} 1\amp 1\\ 0\amp 0 \end{bmatrix}, \quad C=\begin{bmatrix} 1\amp 1\\ 2\amp 3 \end{bmatrix}\text{.} \end{equation*}
Check for yourself that \(AB=AC\text{,}\) and yet \(B\ne C\text{.}\) In other words, we cannot always “cancel” \(A\) from the matrix equation \(AB=AC\text{.}\)
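The check is easy to carry out in code. The sketch below (plain Python; the particular matrices are one concrete choice made here) exhibits \(AB=AC\) with \(B\ne C\text{:}\)

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

# Left-multiplying by A keeps the first row of the other factor and
# zeroes out the second row, so B and C may differ in their second
# rows without changing the product.
A = [[1, 0], [0, 0]]
B = [[1, 1], [0, 0]]
C = [[1, 1], [2, 3]]

assert matmul(A, B) == matmul(A, C) == [[1, 1], [0, 0]]
assert B != C
print("AB = AC even though B != C")
```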
The example in our warning above is but one instance of the general failure of the principle of multiplicative cancellation in matrix algebra. This in turn is a consequence of the following theorem, which identifies the two crucial places where matrix algebra differs significantly from real number algebra.
Theorem 3.2.8. Matrix algebra abnormalities.
-
Matrix multiplication is not commutative.
For two \(n\times n\) matrices \(A\) and \(B\text{,}\) we do not necessarily have \(AB=BA\text{.}\)
-
Products of nonzero matrices may be equal to zero.
If the product of two matrices is the zero matrix, we cannot conclude that one of the matrices is the zero matrix. In logical notation:
\begin{equation*} \underset{m\times n}{A}\underset{n\times r}{B}=0_{m\times r}\;\not\!\!\!\!\implies A=0_{m\times n} \text{ or } B=0_{n\times r}\text{.} \end{equation*}
Proof.
To prove that an identity does not hold, it suffices to provide a single counterexample to that effect. We do so for each statement in turn. There is no significance to the particular counterexamples chosen here, and indeed there are infinitely many counterexamples to choose from in both cases.
Observe that \(\begin{bmatrix}1\amp 0\\ 0\amp 0 \end{bmatrix} \begin{bmatrix}0\amp 1\\ 0\amp 0 \end{bmatrix} \ne \begin{bmatrix}0\amp 1\\ 0\amp 0 \end{bmatrix} \begin{bmatrix}1\amp 0\\ 0\amp 0 \end{bmatrix} \text{.}\)
Observe that \(\begin{bmatrix}0\amp 1\\ 0\amp 0 \end{bmatrix} \begin{bmatrix}1\amp 0\\ 0\amp 0 \end{bmatrix} =\begin{bmatrix}0\amp 0\\ 0\amp 0 \end{bmatrix} \text{.}\) This is an example of two nonzero matrices whose product is the zero matrix.
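Both counterexamples from this proof can be verified mechanically. A sketch in plain Python (the helper name `matmul` is a choice made here):

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

# The two matrices used in the proof above.
X = [[1, 0], [0, 0]]
Y = [[0, 1], [0, 0]]

assert matmul(X, Y) == [[0, 1], [0, 0]]
assert matmul(Y, X) == [[0, 0], [0, 0]]   # nonzero times nonzero is zero
assert matmul(X, Y) != matmul(Y, X)       # multiplication is not commutative
print("both abnormalities confirmed")
```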
Corollary 3.2.9. Failure of multiplicative cancellation.
-
Suppose matrices \(A, B, C\) satisfy \(AB=AC\text{.}\) We cannot conclude that \(B=C\text{.}\) In logical notation:
\begin{align*} AB=AC \amp \;\not\!\!\!\!\implies B=C \end{align*} -
Suppose matrices \(B, C, D\) satisfy \(BD=CD\text{.}\) We cannot conclude that \(B=C\text{.}\) In logical notation:
\begin{align*} BD=CD \amp \;\not\!\!\!\!\implies B=C \end{align*}
Proof.
Again, we need only provide explicit counterexamples for each statement.
-
Let \(A=\begin{bmatrix}1\amp 1\amp 0\\ 0\amp 0\amp 0 \end{bmatrix}\text{,}\) \(B=\begin{bmatrix}1\amp 0\\ 0\amp 0\\ 0\amp 0 \end{bmatrix}\text{,}\) \(C=\begin{bmatrix}0\amp 0\\ 1\amp 0\\ 0\amp 0 \end{bmatrix}\text{.}\) Verify for yourself that
\begin{equation*} \begin{bmatrix}1\amp 1\amp 0\\ 0\amp 0\amp 0 \end{bmatrix} \begin{bmatrix}1\amp 0\\ 0\amp 0\\ 0\amp 0 \end{bmatrix} = \begin{bmatrix}1\amp 1\amp 0\\ 0\amp 0\amp 0 \end{bmatrix} \begin{bmatrix}0\amp 0\\ 1\amp 0\\ 0\amp 0 \end{bmatrix}= \begin{bmatrix} 1\amp 0\\ 0\amp 0 \end{bmatrix}\text{.} \end{equation*}Thus \(AB=AC\text{,}\) but clearly \(B\ne C\text{.}\)
-
Let \(B=\begin{bmatrix}2\amp 0\\ 0\amp 0 \end{bmatrix}\text{,}\) \(C=\begin{bmatrix}1\amp 1\\ 0\amp 0 \end{bmatrix}\text{,}\) \(D=\begin{bmatrix}1\amp 1\amp 1\\ 1\amp 1\amp 1 \end{bmatrix}\text{.}\) We have
\begin{equation*} \begin{bmatrix}2\amp 0\\ 0\amp 0 \end{bmatrix} \begin{bmatrix}1\amp 1\amp 1\\ 1\amp 1\amp 1 \end{bmatrix} = \begin{bmatrix}1\amp 1\\ 0\amp 0 \end{bmatrix} \begin{bmatrix}1\amp 1\amp 1\\ 1\amp 1\amp 1 \end{bmatrix}= \begin{bmatrix} 2\amp 2\amp 2\\ 0\amp 0\amp 0 \end{bmatrix}\text{.} \end{equation*}Thus \(BD=CD\text{,}\) but \(B\ne C\text{.}\)
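The "verify for yourself" steps in this proof can be delegated to a short computation. A sketch in plain Python, using the same matrices as the proof (helper name chosen here):

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

# Statement 1: AB = AC yet B != C.
A = [[1, 1, 0], [0, 0, 0]]
B = [[1, 0], [0, 0], [0, 0]]
C = [[0, 0], [1, 0], [0, 0]]
assert matmul(A, B) == matmul(A, C) == [[1, 0], [0, 0]]
assert B != C

# Statement 2: BD = CD yet B != C.
B2 = [[2, 0], [0, 0]]
C2 = [[1, 1], [0, 0]]
D = [[1, 1, 1], [1, 1, 1]]
assert matmul(B2, D) == matmul(C2, D) == [[2, 2, 2], [0, 0, 0]]
assert B2 != C2
print("both cancellation failures confirmed")
```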
Remark 3.2.10.
Mark well this important abnormality of matrix algebra. Confronted with a real number equation of the form \(ab=ac\text{,}\) we have a deeply ingrained impulse to declare that either \(a=0\) or \(b=c\text{.}\) (If we're sloppy we may forget about that first possibility.) The corresponding maneuver for the matrix equation \(AB=AC\) is simply not available to us, unless we know something more about \(A\text{.}\)
We end our foray into matrix algebra with some properties articulating how matrix transposition interacts with matrix addition, multiplication and scalar multiplication.
Theorem 3.2.11. Properties of matrix transposition.
The following properties hold for all matrices \(A, B, C\) and scalars \(c\in \R\) for which the given expression makes sense.
\(\displaystyle (A+B)^T=A^T+B^T\)
\(\displaystyle (cA)^T=cA^T\)
\(\displaystyle (AB)^T=B^TA^T\)
\(\displaystyle \left(A^T\right)^T=A\)
Proof.
We prove only the first statement. First observe that if \(A\) is \(m\times n\text{,}\) then so are \(B\) and \(A+B\text{.}\) Then \((A+B)^T\) is \(n\times m\) by Definition 3.1.24. Similarly, we see that \(A^T+B^T\) is \(n\times m\text{.}\)
Next, given any \((i,j)\) with \(1\leq i\leq n\text{,}\) \(1\leq j\leq m\text{,}\) we have
\begin{align*} \left((A+B)^T\right)_{ij} \amp = (A+B)_{ji}\\ \amp = (A)_{ji}+(B)_{ji}\\ \amp = (A^T)_{ij}+(B^T)_{ij}\\ \amp = (A^T+B^T)_{ij}\text{.} \end{align*}
Since the \(ij\)-entries of both matrices are equal for each \((i,j)\text{,}\) it follows that \((A+B)^T=A^T+B^T\text{.}\)
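All four transpose properties can be spot-checked numerically as well. A sketch in plain Python (helper names are choices made here; the transpose uses the `zip(*A)` idiom):

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def madd(A, B):
    return [[A[i][j] + B[i][j] for j in range(len(A[0]))] for i in range(len(A))]

def scal(c, A):
    return [[c * x for x in row] for row in A]

def transpose(A):
    # rows of A^T are the columns of A
    return [list(row) for row in zip(*A)]

A = [[1, 2, 3], [4, 5, 6]]   # 2x3
B = [[0, 1, -1], [2, 0, 3]]  # 2x3
C = [[1, 0], [0, 2], [3, 1]]  # 3x2, so AC makes sense

assert transpose(madd(A, B)) == madd(transpose(A), transpose(B))  # (A+B)^T
assert transpose(scal(5, A)) == scal(5, transpose(A))             # (cA)^T
assert transpose(matmul(A, C)) == matmul(transpose(C), transpose(A))  # (AB)^T = B^T A^T
assert transpose(transpose(A)) == A                               # (A^T)^T = A
print("all four transpose properties confirmed")
```

Note the order reversal in the third check: transposing a product reverses the factors, exactly as the theorem states.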
Video examples: proving matrix equalities.
Exercises
1.
In this exercise you will complete the proof of Theorem 3.2.1.
2.
Prove all three statements of Theorem 3.2.4.
3.
In this exercise you will complete the proof of Theorem 3.2.11.
4.
Let \(A\) be an \(n\times n\) matrix. We define its square \(A^2\) as \(A^2=AA\text{.}\)
-
In real number algebra we know that \(a^2=0\implies a=0\text{.}\) By contrast, show that there are infinitely many \(2\times 2\) matrices \(A\) satisfying \(A^2=\boldzero_{2\times 2}\text{.}\)
Optional: can you describe in a parametric manner the set of all matrices \(A\) satisfying \(A^2=\boldzero_{2\times 2}\text{?}\)
-
In real number algebra we know that \(a^2=a\implies a=0 \text{ or } a=1\text{.}\) By contrast, show that there are infinitely many \(2\times 2\) matrices \(A\) satisfying \(A^2=A\text{.}\)
-
In real number algebra we have the identity \((x+y)^2=x^2+2xy+y^2\text{.}\) Show that two \(n\times n\) matrices \(A\text{,}\) \(B\) satisfy
\begin{equation*} (A+B)^2=A^2+2AB+B^2 \end{equation*}if and only if \(AB=BA\text{.}\)
For (a) set \(A=\abcdmatrix{a}{b}{c}{d}\text{,}\) compute \(A^2\text{,}\) set this matrix equal to \(\boldzero_{2\times 2}\text{,}\) and try to find some solutions to the corresponding (nonlinear) system of four equations in the unknowns \(a,b,c,d\text{.}\)
Similar hint for (b), only now set \(A^2=A\text{.}\)
5.
Consider the matrix equation \(A\, \begin{bmatrix} 1\amp 1\\ 1\amp 1 \end{bmatrix}=\begin{bmatrix} 1\amp 1\\ 1\amp 1 \end{bmatrix}\text{.}\)
-
The following chain of implications is invalid.
\begin{align*} A\, \begin{bmatrix} 1\amp 1\\ 1\amp 1 \end{bmatrix}=\begin{bmatrix} 1\amp 1\\ 1\amp 1 \end{bmatrix}\amp\implies A\, \begin{bmatrix} 1\amp 1\\ 1\amp 1 \end{bmatrix}=I_2\, \begin{bmatrix} 1\amp 1\\ 1\amp 1 \end{bmatrix} \\ \amp\implies A=I_2 \text{.} \end{align*}For each implication in the chain, explain why it is valid or invalid.
-
Find all \(A\) satisfying
\begin{equation*} A\, \begin{bmatrix} 1\amp 1\\ 1\amp 1 \end{bmatrix}=\begin{bmatrix} 1\amp 1\\ 1\amp 1 \end{bmatrix}\text{.} \end{equation*}Write \(A=\begin{bmatrix} a\amp b\\ c\amp d \end{bmatrix}\) and set up a system of linear equations in the unknowns \(a,b,c,d\text{.}\)