
Section 3.1 Matrices and their arithmetic

Matrices played a small supporting role in our discussion of linear systems in Chapter 2. In this chapter we bring them to center stage and give them a full-blown treatment as independent mathematical objects in their own right.

Like any mathematical object worth its salt, matrices can be employed in a vast multitude of ways. As such it is important to allow matrices to transcend their humble beginnings in this course as boiled-down systems of linear equations. We record this observation as an official principle.

Principle 3.1.1. Matrices as mathematical objects.

Matrices are interesting and useful mathematical objects in their own right, independently of their role as a convenient notation for systems of linear equations.

Subsection 3.1.1 The basics

We begin with some elementary definitions about matrices, matrix equality, and special types of matrices. As the next definition makes clear, a matrix is just an ordered sequence of numbers arranged in a very particular manner.

Definition 3.1.2. Matrix.

A (real) matrix is a rectangular array of real numbers

\begin{equation} A=\begin{bmatrix} a_{11}\amp a_{12}\amp \cdots \amp a_{1n}\\ a_{21}\amp a_{22}\amp \cdots \amp a_{2n}\\ \vdots \amp \vdots \amp \ddots \amp \vdots \\ a_{m1}\amp a_{m2}\amp \cdots \amp a_{mn} \end{bmatrix}\text{.}\label{s_matrix_eq_matrix}\tag{3.1.1} \end{equation}

The number \(a_{ij}\) located in the \(i\)-th row and \(j\)-th column of \(A\) is called the \((i,j)\)-entry (or \(ij\)-th entry) of \(A\text{.}\)

A matrix with \(m\) rows and \(n\) columns is said to have size (or dimension) \(m\times n\text{.}\)

We will typically use capital letters near the beginning of the alphabet (e.g. \(A, B,C, D\text{,}\) etc.) to denote matrices.

The displayed matrix in (3.1.1) is costly both in the space it takes up in print and in the time it takes to write down or typeset. Accordingly we introduce two somewhat complementary forms of notation to help describe matrices.

Definition 3.1.3.

Matrix-building notation

The notation \([a_{ij}]_{m\times n}\) denotes the \(m\times n\) matrix whose \(ij\)-th entry (\(i\)-th row, \(j\)-th column) is \(a_{ij}\text{.}\) When there is no danger of confusion, this notation is often shortened to \([a_{ij}]\text{.}\)

Matrix entry notation

Given a matrix \(A\text{,}\) the notation \([A]_{ij}\) denotes the \(ij\)-th entry of \(A\text{.}\)

Thus if \(A=[a_{ij}]_{m\times n}\text{,}\) then \([A]_{ij}=a_{ij}\) for all \(1\leq i\leq m\) and \(1\leq j\leq n\text{.}\)

Remark 3.1.4.

The matrix-building notation is often used simply to give names to the entries of an arbitrary matrix. However, it can also be used to describe a matrix whose \(ij\)-th entry is given by a specified rule or formula.

For example, let \(A=[a_{ij}]_{2\times 3}\text{,}\) where \(a_{ij}=(i-j)j\text{.}\) This is the \(2\times 3\) matrix whose \(ij\)-th entry is \((i-j)j\text{.}\) Thus

\begin{equation*} A=\begin{bmatrix}(1-1)1 \amp (1-2)2 \amp (1-3)3\\ (2-1)1 \amp (2-2)2 \amp (2-3)3 \end{bmatrix}=\begin{bmatrix}0 \amp -2 \amp -6\\ 1 \amp 0 \amp -3 \end{bmatrix}\text{.} \end{equation*}

In this example we have \([A]_{23}=-3\) and \([A]_{ii}=0\) for \(i=1,2\text{.}\)
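For readers who like to check such computations by machine, here is a minimal NumPy sketch (an aside, not part of the text) that builds \(A\) directly from the rule \(a_{ij}=(i-j)j\text{;}\) note that NumPy indexes entries from 0, so the textbook indices are shifted by one.

    import numpy as np

    m, n = 2, 3
    # Build A = [a_ij] with a_ij = (i - j) * j, using 1-based textbook indices.
    A = np.array([[(i - j) * j for j in range(1, n + 1)]
                  for i in range(1, m + 1)])
    print(A)         # [[ 0 -2 -6]
                     #  [ 1  0 -3]]
    print(A[1, 2])   # the (2,3)-entry [A]_{23}: -3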

In everyday language the notion of equality is taken as self-evident. Two things are equal if they are the same. What more is there to say? In mathematics, each time we introduce a new type of mathematical object (e.g., sets, functions, \(n\)-tuples, etc.) we need to spell out exactly what we mean for two things to be considered equal. We do so now with matrices.

Definition 3.1.5. Matrix equality.

Let \(A\) and \(B\) be matrices of dimension \(m\times n\) and \(m'\times n'\text{,}\) respectively. The two matrices are equal if

  1. \(m=m'\) and \(n=n'\text{;}\)

  2. \([A]_{ij}=[B]_{ij}\) for all \(1\leq i\leq m\) and \(1\leq j\leq n\text{.}\)

In other words, we have \(A=B\) if and only if \(A\) and \(B\) have the same shape, and each entry of \(A\) is equal to the corresponding entry of \(B\text{.}\)

Example 3.1.6.

The matrix \(A=\begin{bmatrix}1\amp 2\amp 3\amp 4 \end{bmatrix}\) is not equal to \(B=\begin{bmatrix} 1\\ 2\\ 3\\ 4\end{bmatrix}\text{,}\) despite their having the same entries in roughly the same order: 1 is first, 2 is second, etc. Equality does not hold because \(A\) and \(B\) have different shapes: \(A\) is \(1\times 4\text{,}\) and \(B\) is \(4\times 1\text{.}\)

The matrices \(A=\begin{bmatrix}1\amp 2 \\3\amp 4 \end{bmatrix}\) and \(B=\begin{bmatrix}1\amp 2\\ 5\amp 4\end{bmatrix}\) have the same dimension, but are not equal since \([A]_{21}=3\ne 5=[B]_{21}\text{.}\)
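As an informal aside, NumPy's array comparison mirrors Definition 3.1.5: two arrays are equal only when both their shapes and all corresponding entries agree.

    import numpy as np

    A = np.array([[1, 2, 3, 4]])         # 1 x 4
    B = np.array([[1], [2], [3], [4]])   # 4 x 1
    C = np.array([[1, 2], [3, 4]])
    D = np.array([[1, 2], [5, 4]])

    # array_equal requires equal shapes and equal corresponding entries.
    print(np.array_equal(A, B))   # False: shapes (1, 4) and (4, 1) differ
    print(np.array_equal(C, D))   # False: the (2, 1) entries differ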

Definition 3.1.7. Square matrices, row vectors, column vectors, zero matrices.

A matrix \(A\) is square if its dimension is \(n\times n\text{.}\) The diagonal of a square matrix \(A=[a_{ij}]_{n\times n}\) consists of the entries \(a_{ii}\) for \(1\leq i\leq n\text{.}\)

A \(1\times n\) matrix

\begin{equation*} \bolda=\begin{bmatrix}a_1\amp a_2\amp \cdots \amp a_n \end{bmatrix} \end{equation*}

is called a row vector. The \(j\)-th entry of a row vector \(\bolda\) is denoted \([\bolda]_j\text{.}\)

An \(n\times 1\) matrix

\begin{equation*} \boldb=\begin{bmatrix}b_1\\ b_2\\\vdots \\ b_n \end{bmatrix} \end{equation*}

is called a column vector. The \(i\)-th entry of a column vector \(\boldb\) is denoted \([\boldb]_i\text{.}\)

The \(m\times n\) zero matrix, denoted \(\boldzero_{m\times n}\text{,}\) is the matrix of that dimension, all of whose entries are zero: i.e., \([\boldzero_{m\times n}]_{ij}=0\) for all \(1\leq i\leq m\) and \(1\leq j\leq n\text{.}\)

When the actual dimension is not significant, we will often drop the subscript and write simply \(\boldzero\) for a zero matrix of suitable dimension.
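If you choose to model these objects in NumPy (again, an illustrative aside), the row/column distinction is recorded in the shape of the array:

    import numpy as np

    a = np.array([[1, 2, 3]])       # a row vector: shape (1, 3)
    b = np.array([[4], [5], [6]])   # a column vector: shape (3, 1)
    Z = np.zeros((2, 3))            # the 2 x 3 zero matrix

    print(a.shape, b.shape, Z.shape)   # (1, 3) (3, 1) (2, 3)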

Remark 3.1.8. Matrices as collections of columns/rows.

Let \(A\) be an \(m\times n\) matrix. We will often think of \(A\) as a collection of columns, in which case we write

\begin{equation} A=\begin{bmatrix}\vert \amp \vert \amp \amp \vert \\ \boldc_1 \amp \boldc_2\amp \cdots \amp \boldc_n \\ \vert \amp \vert \amp \amp \vert \end{bmatrix}\text{,}\label{eq_columns}\tag{3.1.2} \end{equation}

where \(\boldc_j\) is the column vector consisting of the entries of the \(j\)-th column of \(A\text{:}\) i.e.,

\begin{equation*} \boldc_j=\begin{bmatrix}a_{1j}\\ a_{2j}\\ \vdots \\ a_{mj}\end{bmatrix}\text{.} \end{equation*}

Similarly, when we think of \(A\) as a collection of rows, we write

\begin{equation} A=\begin{bmatrix}\ -\boldr_{1}- \ \\ \ -\boldr_{2}- \ \\ \vdots \\ \ -\boldr_{m}- \ \\ \end{bmatrix}\text{,}\label{eq_rows}\tag{3.1.3} \end{equation}

where \(\boldr_i\) is the row vector consisting of the entries of the \(i\)-th row of \(A\text{:}\) i.e.,

\begin{equation*} \boldr_i= \begin{bmatrix}a_{i1}\amp a_{i2}\amp\cdots\amp a_{in} \end{bmatrix}\text{.} \end{equation*}

The vertical and horizontal lines in (3.1.2) and (3.1.3) are used to emphasize that the \(\boldc_j\) are column vectors and the \(\boldr_i\) are row vectors.
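In NumPy terms (an aside; the variable names are ours), the columns and rows of a matrix are just slices of the array:

    import numpy as np

    A = np.array([[1, 2, 3],
                  [4, 5, 6]])

    c2 = A[:, 1]   # entries of the second column of A: [2 5]
    r1 = A[0, :]   # entries of the first row of A: [1 2 3]
    print(c2, r1)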

Subsection 3.1.2 Matrix arithmetic: addition, subtraction and scalar multiplication

We now lay out the various algebraic operations we will use to combine and transform matrices; we refer to the use of these operations loosely as matrix arithmetic. Some of these operations resemble familiar operations from real arithmetic in terms of their notation and definition. Do not be lulled into complacency! These are new operations defined for a new class of mathematical objects, and must be treated carefully. In particular, pay close attention to (a) exactly what type of mathematical objects serve as inputs for each operation (the ingredients of the operation), and (b) what type of mathematical object is outputted.

Definition 3.1.9. Matrix addition and subtraction.

Matrix addition is the operation defined as follows: given two \(m\times n\) matrices \(A=[a_{ij}]_{m\times n}\) and \(B=[b_{ij}]_{m\times n}\text{,}\) we define their sum to be the matrix

\begin{equation*} A+B\colon =[a_{ij}+b_{ij}]_{m\times n}\text{.} \end{equation*}

In other words \(A+B\) is the \(m\times n\) matrix satisfying

\begin{equation*} [A+B]_{ij}=[A]_{ij}+[B]_{ij}=a_{ij}+b_{ij} \end{equation*}

for all \(1\leq i\leq m\) and \(1\leq j\leq n\text{.}\)

Matrix subtraction is the operation defined as follows: given two \(m\times n\) matrices \(A=[a_{ij}]_{m\times n}\) and \(B=[b_{ij}]_{m\times n}\text{,}\) we define their difference to be the matrix

\begin{equation*} A-B\colon =[a_{ij}-b_{ij}]_{m\times n}\text{.} \end{equation*}

In other words \(A-B\) is the \(m\times n\) matrix satisfying

\begin{equation*} [A-B]_{ij}=[A]_{ij}-[B]_{ij}=a_{ij}-b_{ij} \end{equation*}

for all \(1\leq i\leq m\) and \(1\leq j\leq n\text{.}\)

Remark 3.1.10.

Observe that matrix addition/subtraction is not defined for any old pair of matrices. The ingredients of matrix addition (or subtraction) are two matrices of the same dimension; and the output is a third matrix of this common dimension.
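Here is a brief NumPy sketch of Definition 3.1.9 and Remark 3.1.10 (illustrative only): addition and subtraction are entrywise, and the operands must have the same dimension.

    import numpy as np

    A = np.array([[1, -1], [2, 0]])
    B = np.array([[3, 5], [-2, 4]])

    print(A + B)   # entrywise sum:        [[ 4  4] [ 0  4]]
    print(A - B)   # entrywise difference: [[-2 -6] [ 4 -4]]

    # Caution: NumPy broadcasts certain mismatched shapes as a convenience;
    # Definition 3.1.9 requires equal dimensions.
    C = np.array([[1, 2, 3]])   # a 1 x 3 matrix: not the same dimension as A
    try:
        A + C
    except ValueError as err:
        print("not defined:", err)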

Definition 3.1.11. Scalar multiplication of matrices.

Given any matrix \(A=[a_{ij}]_{m\times n}\) and any constant \(c\in \R\text{,}\) we define

\begin{equation*} cA=[ca_{ij}]\text{.} \end{equation*}

In other words, \(cA\) is the \(m\times n\) matrix obtained by “scaling” each of the entries of \(A\) by the constant \(c\text{.}\)

We call \(cA\) a scalar multiple of \(A\text{.}\) Furthermore, to help distinguish between matrices and real numbers, we will refer to elements of \(\R\) as scalars.

Remark 3.1.12.

Whereas matrix addition and subtraction closely resemble corresponding operations involving real numbers, there is no obvious real arithmetic analogue to matrix scalar multiplication. In particular, notice how matrix scalar multiplication is a sort of hybrid operation that combines mathematical objects of two very different natures: a real number (or scalar) on the one hand, and a matrix on the other.

We call the result of applying a sequence of matrix additions and scalar multiplications a linear combination of matrices.

Definition 3.1.13. Linear combination of matrices.

Given matrices \(A_1,A_2,\dots, A_r\) of the same dimension, and scalars \(c_1,c_2, \dots ,c_r\text{,}\) the expression

\begin{equation*} c_1A_1+c_2A_2+\cdots +c_rA_r \end{equation*}

is called a linear combination of matrices. The scalars \(c_i\) are called the coefficients of the linear combination.

Example 3.1.14.

Let \(A=\begin{amatrix}[rrr]1\amp -1\amp 2\\ 0\amp 0\amp 1\end{amatrix}\) and \(B=\begin{amatrix}[rrr]0\amp 1\amp 1\\ -1\amp -1\amp 1\end{amatrix}\text{.}\) Compute \(2A+(-3)B\text{.}\)

Solution.
\begin{align*} 2A+(-3)B\amp= \begin{amatrix}[rrr]2\amp -2\amp 4\\ 0\amp 0\amp 2\end{amatrix}+\begin{amatrix}[rrr]0\amp -3\amp -3\\ 3\amp 3\amp -3\end{amatrix}\\ \amp=\begin{amatrix}[rrr]2\amp -5\amp 1\\ 3\amp 3\amp -1\end{amatrix} \text{.} \end{align*}
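This computation is easy to verify by machine; the following NumPy lines are an informal check, not part of the solution.

    import numpy as np

    A = np.array([[1, -1, 2], [0, 0, 1]])
    B = np.array([[0, 1, 1], [-1, -1, 1]])

    print(2 * A + (-3) * B)   # [[ 2 -5  1]
                              #  [ 3  3 -1]]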

Example 3.1.15.

Show that \(B=\begin{amatrix}[rrr]3\amp -3\amp 3 \end{amatrix}\) can be expressed as a linear combination of the matrices

\begin{equation*} A_1=\begin{amatrix}[rrr]1\amp 1\amp 1\end{amatrix}, \ A_2=\begin{amatrix}[rrr]1\amp -1\amp 0\end{amatrix}, \ A_3=\begin{amatrix}[rrr]1\amp 1\amp -2\end{amatrix}\text{.} \end{equation*}
Solution.

We must solve the matrix (or row vector) equation

\begin{equation*} aA_1+bA_2+cA_3=B \end{equation*}

for the scalars \(a,b,c\text{.}\) Computing the linear combination on the left yields the matrix equation

\begin{equation*} \begin{amatrix}[rrr]a+b+c\amp a-b+c\amp a-2c\end{amatrix}=\begin{amatrix}[rrr]3\amp -3\amp 3\end{amatrix}\text{.} \end{equation*}

Using the definition of matrix equality (Definition 3.1.5), we get the system of equations

\begin{equation*} \begin{linsys}{3} 1a \amp +\amp b \amp + \amp c \amp = \amp 3\\ a \amp-\amp b\amp +\amp c\amp =\amp -3\\ a \amp \amp \amp -\amp 2c\amp =\amp 3 \end{linsys}\text{.} \end{equation*}

Using Gaussian elimination we find that there is a unique solution to this system: namely, \((a,b,c)=(1,3,-1)\text{.}\) We conclude that \(B=A_1+3A_2+(-1)A_3=A_1+3A_2-A_3\text{.}\)
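Since the coefficients are found by solving a linear system, the answer can also be confirmed numerically. The sketch below (an aside, using NumPy) takes as coefficient matrix the \(3\times 3\) matrix whose columns are \(A_1, A_2, A_3\) written as column vectors.

    import numpy as np

    # Coefficient matrix: columns are A_1, A_2, A_3 written as column vectors.
    M = np.array([[1, 1, 1],
                  [1, -1, 1],
                  [1, 0, -2]])
    rhs = np.array([3, -3, 3])   # the entries of B

    a, b, c = np.linalg.solve(M, rhs)
    print(a, b, c)   # 1.0 3.0 -1.0, so B = A_1 + 3*A_2 - A_3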

Remark 3.1.16.

Let \(A_1, A_2,\dots, A_r\) be \(m\times n\) matrices. An easy induction argument on \(r\) shows that for any scalars \(c_1,c_2,\dots, c_r\) we have

\begin{equation*} [c_1A_1+c_2A_2+\cdots +c_rA_r]_{ij} =c_1[A_1]_{ij}+c_2[A_2]_{ij}+\cdots +c_r[A_r]_{ij} \end{equation*}

for all \(1\leq i\leq m\text{,}\) \(1\leq j\leq n\text{.}\) (See Exercise 3.1.6.6.)

Subsection 3.1.3 Matrix arithmetic: matrix multiplication

So how do we define the product of two matrices? Looking at the previous operations, you might have guessed that we should define the product of two \(m\times n\) matrices by taking the product of their corresponding entries. Not so!

Definition 3.1.17. Matrix multiplication.

Matrix multiplication is the operation defined as follows: given an \(m\times n\) matrix \(A=[a_{ij}]_{m\times n}\) and an \(n\times r\) matrix \(B=[b_{ij}]_{n\times r}\text{,}\) we define their product to be the \(m\times r\) matrix \(AB\) whose \(ij\)-th entry is given by the formula

\begin{equation*} [AB]_{ij}=a_{i1}b_{1j}+a_{i2}b_{2j}+\cdots +a_{in}b_{nj} =\sum_{\ell=1}^na_{i\ell}b_{\ell j} \end{equation*}

for all \(1\leq i\leq m\) and \(1\leq j\leq r\text{.}\)

Figure 3.1.18. In \(C=AB\text{,}\) the \(ij\)-th entry \(c_{ij}=\sum_{k=1}^na_{ik}b_{kj}\) is computed by moving across the \(i\)-th row of \(A\) and down the \(j\)-th column of \(B\text{.}\)
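The entry formula can be spelled out directly in code. The sketch below (an aside; the function name matmul_by_definition is ours) computes each entry with explicit loops and compares the result with NumPy's built-in product.

    import numpy as np

    def matmul_by_definition(A, B):
        """Compute AB entry by entry via [AB]_ij = sum_k a_ik * b_kj."""
        m, n = A.shape
        n2, r = B.shape
        assert n == n2, "inner dimensions must agree"
        C = np.zeros((m, r))
        for i in range(m):
            for j in range(r):
                C[i, j] = sum(A[i, k] * B[k, j] for k in range(n))
        return C

    A = np.array([[1, 2], [3, 4], [5, 6]])    # 3 x 2
    B = np.array([[7, 8, 9], [10, 11, 12]])   # 2 x 3
    print(np.array_equal(matmul_by_definition(A, B), A @ B))   # True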

The formula for \([AB]_{ij}\) is undoubtedly more complicated than you expected, and seems to come completely out of the blue. We will be able to retroactively motivate this definition once we introduce linear transformations. For now let's focus on understanding how and when we can compute the product of two matrices. In particular, let's concentrate on how matrix dimension comes into play.

For the product of an \(m\times n\) matrix \(A\) and a \(p\times r\) matrix \(B\) to be defined, we need \(n=p\text{.}\) In other words, for the product below to make sense we need the “inner” dimensions of \(A\) and \(B\) to be equal:

\begin{equation*} \underset{m\times \boxed{n}}{A}\hspace{5pt} \underset{\boxed{n}\times r}{B}\text{.} \end{equation*}

If this condition is met, the dimension of the resulting matrix \(AB\) is determined by the “outer” dimensions of \(A\) and \(B\text{.}\) Schematically, you can think of the inner dimensions as being “canceled out”:

\begin{equation*} \underset{\boxed{m}\times\cancel{n}}{A}\hspace{5pt}\underset{\cancel{n}\times\boxed{r}}{B}=\underset{m\times r}{AB}. \end{equation*}
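A quick shape experiment in NumPy (illustrative only) makes the same point: the product exists only when the inner dimensions match, and the result has the outer dimensions.

    import numpy as np

    A = np.ones((2, 3))   # 2 x 3
    B = np.ones((3, 4))   # 3 x 4: the inner dimensions (3 and 3) match
    print((A @ B).shape)  # (2, 4), the outer dimensions

    C = np.ones((5, 4))   # 5 x 4: inner dimensions 3 and 5 do not match
    try:
        A @ C
    except ValueError as err:
        print("product not defined:", err)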

All of this will make more sense once we begin thinking of matrices \(A\) as defining certain functions \(T_A\text{.}\) Our formula for the entries of \(AB\) is chosen precisely so that this new matrix corresponds to the composition of the functions \(T_A\) and \(T_B\text{:}\) i.e. so that

\begin{equation*} T_{AB}=T_A\circ T_B\text{.} \end{equation*}

The ponderous restriction on the dimensions of the ingredient matrices ensures that the two functions \(T_A\) and \(T_B\) can be composed. (See Composition of linear transformations and matrix multiplication.)
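Although linear transformations are treated later, the claim can already be tested numerically: multiplying a column vector by \(AB\) gives the same result as multiplying by \(B\) and then by \(A\text{.}\) The sketch below is an aside, and the names T_A and T_B are ours.

    import numpy as np

    A = np.array([[1, 2], [3, 4]])          # 2 x 2
    B = np.array([[0, 1, -1], [2, 0, 1]])   # 2 x 3
    x = np.array([[1], [1], [2]])           # 3 x 1 column vector

    def T_A(v):
        return A @ v   # the function v -> Av

    def T_B(v):
        return B @ v   # the function v -> Bv

    # Multiplying by AB agrees with applying T_B and then T_A.
    print(np.array_equal((A @ B) @ x, T_A(T_B(x))))   # True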

Subsection 3.1.4 Alternative methods of multiplication

In addition to the given definition of matrix multiplication, we will make heavy use of two further ways of computing matrix products, called the column method and the row method.

Theorem 3.1.19. Column method of matrix multiplication.

Let \(A\) be an \(m\times n\) matrix, and let \(B\) be an \(n\times r\) matrix with columns \(\boldb_1, \boldb_2, \dots, \boldb_r\text{.}\)

  1. (Step 1) The \(j\)-th column of \(AB\) is \(A\boldb_j\text{:}\) i.e.,

    \begin{equation*} AB=\begin{bmatrix}\vert \amp \vert \amp \amp \vert \\ A\boldb_1\amp A\boldb_2\amp \cdots\amp A\boldb_r\\ \vert \amp \vert \amp \amp \vert \end{bmatrix}\text{.} \end{equation*}

  2. (Step 2) If \(\bolda_1, \bolda_2, \dots, \bolda_n\) are the columns of \(A\) and \(\boldb\) is the \(n\times 1\) column vector with entries \(b_1, b_2, \dots, b_n\text{,}\) then

    \begin{equation*} A\boldb=b_1\bolda_1+b_2\bolda_2+\cdots +b_n\bolda_n\text{.} \end{equation*}

We prove the equalities in both steps separately.

Proof of Step 1.

We must show \(AB=C\text{,}\) where

\begin{equation*} C=\begin{bmatrix}\vert \amp \vert \amp \amp \vert \\ A\boldb_1\amp A\boldb_2\amp \cdots\amp A\boldb_r\\ \vert \amp \vert \amp \amp \vert \end{bmatrix}\text{.} \end{equation*}

First we show \(AB\) and \(C\) have the same size. By definition of matrix multiplication, \(AB\) is \(m\times r\text{.}\) By construction \(C\) has \(r\) columns and its \(j\)-th column is \(A\boldb_j\text{.}\) Since \(A\) and \(\boldb_j\) have size \(m\times n\) and \(n\times 1\text{,}\) respectively, \(A\boldb_j\) has size \(m\times 1\text{.}\) Thus each of the \(r\) columns of \(C\) is an \(m\times 1\) column vector. It follows that \(C\) is \(m\times r\text{,}\) as desired.

Next we show that \([AB]_{ij}=[C]_{ij}\) for all \(1\leq i\leq m\text{,}\) \(1\leq j\leq r\text{.}\) Since the \(ij\)-th entry of \(C\) is the \(i\)-th entry of the \(j\)-th column of \(C\text{,}\) we have

\begin{align*} [C]_{ij} \amp= [A\boldb_j]_{i} \\ \amp=\sum_{k=1}^n a_{ik}b_{kj} \\ \amp =[AB]_{ij}\text{.} \end{align*}
Proof of Step 2.

We must show that \(A\boldb=\boldc\text{,}\) where

\begin{equation*} \boldc=b_1\bolda_1+b_2\bolda_2+\cdots +b_n\bolda_n\text{.} \end{equation*}

The usual argument shows that both \(A\boldb\) and \(\boldc\) are \(m\times 1\) column vectors. It remains only to show that the \(i\)-th entry \([A\boldb]_i\) of the column \(A\boldb\) is equal to the \(i\)-th entry \([\boldc]_i\) of \(\boldc\) for all \(1\leq i\leq m\text{.}\) For any such \(i\) we have

\begin{align*} [\boldc]_i \amp = [b_1\bolda_1+b_2\bolda_2+\cdots +b_n\bolda_n]_i\\ \amp= b_1[\bolda_1]_i+b_2[\bolda_2]_i+\cdots +b_n[\bolda_n]_i \amp (\knowl{./knowl/rm_entry_lin_comb.html}{\text{Remark 3.1.16}})\\ \amp= b_1a_{i1}+b_2a_{i2}+\cdots +b_na_{in}\amp (\text{def. of } \bolda_j) \\ \amp= a_{i1}b_1+a_{i2}b_2+\cdots+a_{in}b_n \\ \amp =[A\boldb]_i \amp (\knowl{./knowl/d_matrix_mult.html}{\text{Definition 3.1.17}})\text{.} \end{align*}

Remark 3.1.20.

Theorem 3.1.19 amounts to a two-step process for computing an arbitrary matrix product \(AB\text{.}\)

The first statement (Step 1) tells us that the \(j\)-th column of the matrix \(AB\) can be obtained by computing the product \(A\,\boldb_j\) of \(A\) with the \(j\)-th column of \(B\text{.}\)

The second statement (Step 2) tells us that each product \(A\,\boldb_j\) can itself be computed as a certain linear combination of the columns of \(A\) with coefficients drawn from \(\boldb_j\text{.}\)

A similar remark applies to computing matrix products using the row method, as described below in Theorem 3.1.21.

Theorem 3.1.21. Row method of matrix multiplication.

Let \(A\) be an \(m\times n\) matrix with rows \(\bolda_1, \bolda_2, \dots, \bolda_m\text{,}\) and let \(B\) be an \(n\times r\) matrix.

  1. (Step 1) The \(i\)-th row of \(AB\) is \(\bolda_i B\text{.}\)

  2. (Step 2) If \(\boldb_1, \boldb_2, \dots, \boldb_n\) are the rows of \(B\) and \(\bolda\) is the \(1\times n\) row vector with entries \(a_1, a_2, \dots, a_n\text{,}\) then

    \begin{equation*} \bolda B=a_1\boldb_1+a_2\boldb_2+\cdots +a_n\boldb_n\text{.} \end{equation*}

The proof is very similar to that of Theorem 3.1.19 and is left to the reader.

Example 3.1.22.

Let \(A=\begin{amatrix}[rrr] 1\amp 1 \amp -2 \\ 1\amp 3\amp 2\end{amatrix}\) and \(B=\begin{amatrix}[rc]1\amp 1\\ 0\amp 1 \\ -2\amp 1 \end{amatrix}\text{.}\)

Compute \(AB\) using (a) the definition of matrix multiplication, (b) the column method, (c) the row method.

Solution.
  1. Using the definition, we see easily that

    \begin{equation*} AB=\begin{amatrix}[rr]5\amp 0\\ -3\amp 6 \end{amatrix} \end{equation*}
  2. Let \(\bolda_1, \bolda_2, \bolda_3\) be the columns of \(A\text{,}\) and let \(\boldb_1, \boldb_2\) be the columns of \(B\text{.}\) We have

    \begin{align*} AB \amp= \begin{amatrix}[cc]\vert \amp \vert \\ A\boldb_1\amp A\boldb_2 \\ \vert\amp \vert\end{amatrix} \amp \text{(Step 1)} \\ \amp= \begin{amatrix}[cc]\vert \amp \vert \\ (1\bolda_1+0\bolda_2-2\bolda_3)\amp (\bolda_1+\bolda_2+\bolda_3) \\ \vert\amp \vert\end{amatrix} \amp \text{(Step 2)}\\ \amp= \begin{amatrix}[rr]5\amp 0\\ -3\amp 6 \end{amatrix} \amp \text{(arithmetic)} \end{align*}
  3. Now let \(\bolda_1, \bolda_2\) be the rows of \(A\text{,}\) and let \(\boldb_1, \boldb_2, \boldb_3\) be the rows of \(B\text{.}\) We have

    \begin{align*} AB \amp= \begin{amatrix}[c]--\bolda_1\, B--\\ --\bolda_2\, B-- \end{amatrix}\amp \text{(Step 1)}\\ \amp= \begin{amatrix}[c]--(1\boldb_1+1\boldb_2-2\boldb_3)-- \\ --(1\boldb_1+3\boldb_2+2\boldb_3)-- \end{amatrix} \amp \text{(Step 2)} \\ \amp=\begin{amatrix}[rr]5\amp 0\\ -3\amp 6 \end{amatrix} \amp \text{(arithmetic)} \end{align*}
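The three computations above can be cross-checked in NumPy; the sketch below (an aside, not part of the example) implements the column and row methods directly and compares them with the built-in product.

    import numpy as np

    A = np.array([[1, 1, -2], [1, 3, 2]])
    B = np.array([[1, 1], [0, 1], [-2, 1]])

    # Column method: column j of AB is a linear combination of the columns
    # of A, with coefficients taken from column j of B.
    col_method = np.column_stack(
        [sum(B[k, j] * A[:, k] for k in range(3)) for j in range(2)])

    # Row method: row i of AB is a linear combination of the rows of B,
    # with coefficients taken from row i of A.
    row_method = np.vstack(
        [sum(A[i, k] * B[k, :] for k in range(3)) for i in range(2)])

    print(col_method)                          # [[ 5  0] [-3  6]]
    print(np.array_equal(col_method, A @ B))   # True
    print(np.array_equal(row_method, A @ B))   # True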

Video example of matrix multiplication.

Figure 3.1.23. Three methods of matrix multiplication

Subsection 3.1.5 Transpose of a matrix

We end this section with one last operation, matrix transposition. We will not make much use of this operation until later, but this is as good a place as any to introduce it.

Definition 3.1.24. Matrix transposition.

Given an \(m\times n\) matrix \(A=[a_{ij}]\text{,}\) its transpose \(A^T\) is the matrix whose \(ij\)-th entry is the \(ji\)-th entry of \(A\text{.}\) In other words, \(A^T\) is the \(n\times m\) matrix satisfying \([A^T]_{ij}=[A]_{ji}\) for all \(1\leq i\leq n\) and \(1\leq j\leq m\text{.}\)

Remark 3.1.25.

Given a matrix \(A\) we can give a column- or row-based description of \(A^T\) as follows:

  • \(A^T\) is the matrix whose \(i\)-th row is the \(i\)-th column of \(A\text{.}\)

  • \(A^T\) is the matrix whose \(j\)-th column is the \(j\)-th row of \(A\text{.}\)

Example 3.1.26.

Let \(A=\begin{bmatrix}1\amp 2\amp 3\\4\amp 5\amp 6 \end{bmatrix}\text{;}\) then \(A^T=\begin{bmatrix}1\amp 4\\2\amp 5\\3\amp 6 \end{bmatrix}\text{.}\)

Let \(B=\begin{bmatrix}1\\0\\3 \end{bmatrix}\text{;}\) then \(B^T=\begin{bmatrix}1\amp 0\amp 3 \end{bmatrix}\text{.}\)
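In NumPy (an aside), the transpose is available as the .T attribute, and the row/column description of Remark 3.1.25 is easy to observe:

    import numpy as np

    A = np.array([[1, 2, 3], [4, 5, 6]])
    print(A.T)                                  # [[1 4] [2 5] [3 6]]
    print(np.array_equal(A.T[0, :], A[:, 0]))   # row 1 of A^T is column 1 of A: True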

Exercises 3.1.6 Exercises

1.

For each part below write down the most general \(3\times 3\) matrix \(A=[a_{ij}]\) satisfying the given condition (use letter names \(a,b,c\text{,}\) etc. for entries).

  1. \(a_{ij}=a_{ji}\) for all \(i,j\text{.}\)

  2. \(a_{ij}=-a_{ji}\) for all \(i,j\text{.}\)

  3. \(a_{ij}=0\) for \(i\ne j\text{.}\)

2.

Let

\begin{equation*} A = \begin{bmatrix}3\amp 0\\ -1\amp 2\\ 1\amp 1 \end{bmatrix} , \hspace{5pt} B = \begin{bmatrix}4\amp -1\\ 0\amp 2 \end{bmatrix} , \hspace{5pt} C = \begin{bmatrix}1\amp 4\amp 2\\ 3\amp 1\amp 5 \end{bmatrix} \end{equation*}
\begin{equation*} D = \begin{bmatrix}1\amp 5\amp 2\\ -1\amp 0\amp 1\\ 3\amp 2\amp 4 \end{bmatrix} , \hspace{5pt} E = \begin{bmatrix}6\amp 1\amp 3\\ -1\amp 1\amp 2\\ 4\amp 1\amp 3 \end{bmatrix}\text{.} \end{equation*}

Compute the following matrices, or else explain why the given expression is not well defined.

  1. \(\displaystyle (2D^T-E)A\)

  2. \(\displaystyle (4B)C+2B\)

  3. \(\displaystyle B^T(CC^T-A^TA)\)

3.

Let

\begin{equation*} A = \begin{bmatrix}3\amp -2\amp 7\\ 6\amp 5\amp 4\\ 0\amp 4\amp 9 \end{bmatrix} , \hspace{5pt} B = \begin{bmatrix}6\amp -2\amp 4\\ 0\amp 1\amp 3\\ 7\amp 7\amp 5 \end{bmatrix}\text{.} \end{equation*}

Compute the following using either the row or column method of matrix multiplication. Make sure to show how you are using the relevant method.

  1. the first column of \(AB\text{;}\)

  2. the second row of \(BB\text{;}\)

  3. the third column of \(AA\text{.}\)

Solution.
  1. Using expansion by columns, the first column of \(AB\) is given by \(A\) times the first column of \(B\text{.}\) We compute

    \begin{equation*} \begin{bmatrix}3\amp -2\amp 7\\ 6\amp 5\amp 4\\ 0\amp 4\amp 9 \end{bmatrix} \begin{bmatrix}6\\ 0\\ 7 \end{bmatrix} = 6 \begin{amatrix}[r]3 \\ 6 \\ 0 \end{amatrix}+0 \begin{amatrix}[r]-2 \\ 5 \\ 4 \end{amatrix}+7\begin{amatrix}[r]7 \\ 4 \\ 9 \end{amatrix}= \begin{bmatrix}67\\ 64\\ 63 \end{bmatrix} \end{equation*}
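A short NumPy check (not part of the solution) confirms this column-method computation:

    import numpy as np

    A = np.array([[3, -2, 7], [6, 5, 4], [0, 4, 9]])
    B = np.array([[6, -2, 4], [0, 1, 3], [7, 7, 5]])

    # First column of AB = A times the first column of B.
    print(A @ B[:, [0]])   # [[67] [64] [63]]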

4.

Use the row or column method to quickly compute the following product:

\begin{equation*} \begin{amatrix}[rrrrr]1\amp -1\amp 1\amp -1\amp 1\\ 1\amp -1\amp 1\amp -1\amp 1\\ 1\amp -1\amp 1\amp -1\amp 1\\ 1\amp -1\amp 1\amp -1\amp 1\\ 1\amp -1\amp 1\amp -1\amp 1 \end{amatrix} \begin{amatrix}[rrrr]1\amp 1\amp 1\amp 1\\ -1\amp 0\amp 0\amp 0\\ 0\amp 1\amp 0\amp 0\\ 0\amp 0\amp 2\amp 0\\ 0\amp 0\amp 0\amp 3 \end{amatrix} \end{equation*}
Solution.

I'll just describe the row method here; write \(A\) and \(B\) for the first and second matrix in the product, respectively.

Note that the rows of \(A\) are all identical, and equal to \(\begin{bmatrix}1 \amp -1 \amp 1 \amp -1 \amp 1 \end{bmatrix}\text{.}\) From the row method it follows that each row of \(AB\) is given by

\begin{equation*} \begin{bmatrix}1 \amp -1 \amp 1 \amp -1 \amp 1 \end{bmatrix} B\text{.} \end{equation*}

Thus the rows of \(AB\) are all identical, and the row method computes the product above by taking the corresponding alternating sum of the rows of \(B\text{:}\)

\begin{equation*} \begin{bmatrix}1 \amp -1 \amp 1 \amp -1 \amp 1 \end{bmatrix} B=\begin{bmatrix}2\amp 2\amp -1\amp 4 \end{bmatrix}\text{.} \end{equation*}

Thus \(AB\) is the \(5\times 4\) matrix, all of whose rows are \(\begin{bmatrix}2\amp 2\amp -1\amp 4 \end{bmatrix}\text{.}\)
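As an informal sanity check (not part of the solution), NumPy confirms both the repeated row and the full product:

    import numpy as np

    A = np.array([[1, -1, 1, -1, 1]] * 5)   # five identical rows
    B = np.array([[1, 1, 1, 1],
                  [-1, 0, 0, 0],
                  [0, 1, 0, 0],
                  [0, 0, 2, 0],
                  [0, 0, 0, 3]])

    print(A[0] @ B)   # [ 2  2 -1  4]
    print(A @ B)      # every row of the 5 x 4 product equals [ 2  2 -1  4]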

5.

Each of the \(3\times 3\) matrices \(B_i\) below performs a specific row operation when multiplying a \(3\times n\) matrix \(A=\begin{bmatrix}-\boldr_1-\\ -\boldr_2-\\ -\boldr_3- \end{bmatrix}\) on the left; i.e., the matrix \(B_iA\) is the result of performing a certain row operation on the matrix \(A\text{.}\) Use the row method of matrix multiplication to decide what row operation each \(B_i\) performs.

\begin{equation*} B_1=\begin{bmatrix}1\amp 0\amp 0\\ 0\amp 1\amp 0\\ -2\amp 0\amp 1 \end{bmatrix} , B_2=\begin{bmatrix}1\amp 0\amp 0\\ 0\amp \frac{1}{2}\amp 0\\ 0\amp 0\amp 1 \end{bmatrix} , B_3=\begin{bmatrix}0\amp 0\amp 1\\ 0\amp 1\amp 0\\ 1\amp 0\amp 0 \end{bmatrix}\text{.} \end{equation*}

6.

Let \(r\geq 2\) be an integer. Prove, by induction on \(r\text{,}\) that for any \(m\times n\) matrices \(A_1, A_2,\dots, A_r\) and scalars \(c_1,c_2,\dots, c_r\text{,}\) we have

\begin{equation*} [c_1A_1+c_2A_2+\cdots +c_rA_r]_{ij} =c_1[A_1]_{ij}+c_2[A_2]_{ij}+\cdots +c_r[A_r]_{ij} \end{equation*}

for all \(1\leq i\leq m\text{,}\) \(1\leq j\leq n\text{.}\)