Rank-nullity theorem and fundamental spaces

Section 4.6 Rank-nullity theorem and fundamental spaces

This section is in a sense just a long-format example of how to compute bases and dimensions of subspaces. Along the way, however we meet the rank-nullity theorem (sometimes called the “fundamental theorem of linear algebra”), and apply this theorem in the context of fundamental spaces of matrices (Definition 4.6.5).

Subsection 4.6.1 The rank-nullity theorem

The rank-nullity theorem relates the the dimensions of the null space and image of a linear transformation \(T\colon V\rightarrow W\text{,}\) assuming \(V\) is finite dimensional. Roughly speaking, it says that the bigger the null space, the smaller the image. More precisely, it tells us that

\begin{equation*} \dim V=\dim\NS T+\dim\im T\text{.} \end{equation*}

As we will see, this elegant result can be used to significantly simplify computations with linear transformations. For example, in a situation where we wish to compute explicitly both the null space and image of a given linear transformation, we can often get away with just computing one of the two spaces and using the rank-nullity theorem (and a dimension argument) to easily determine the other. Additionally, the rank-nullity theorem will directly imply some intuitively obvious properties of linear transformations. For example, suppose \(V\) is a finite-dimensional vector space. It seems obvious that if \(\dim W> \dim V\text{,}\) then there is no linear transformation mapping \(V\) surjectively onto \(W\text{:}\) i.e., you should not be able to map a “smaller” vector space onto a “bigger” one. Similarly, if \(\dim W \lt \dim V\text{,}\) then we expect that there is no injective linear transformation mapping \(V\) injectively into \(W\text{.}\) Both these results are easy consequences of the rank-nullity theorem.

Before proving the theorem we give names to \(\dim \NS T\) and \(\dim\im T\text{.}\)

Definition 4.6.1. Rank and nullity.

Let \(T\colon V\rightarrow W\) be a linear transformation.

The rank of \(T\text{,}\) denoted \(\rank T\text{,}\) is the dimension of \(\im T\text{:}\) i.e.,

\begin{equation*} \rank T=\dim\im T\text{.} \end{equation*}
The nullity of \(T\text{,}\) denoted \(\nullity T\text{,}\) is the dimension of \(\NS T\text{:}\) i.e.,

\begin{equation*} \nullity T=\dim\NS T\text{.} \end{equation*}

Theorem 4.6.2. Rank-nullity.

Let \(V\) be a vector space of dimension \(n\text{,}\) and let \(T\colon V\rightarrow W\) be a linear transformation. Then

\begin{equation*} n=\dim\NS T+\dim\im T\text{,} \end{equation*}

or alternatively,

\begin{equation*} n=\nullity T+\rank T\text{.} \end{equation*}

Proof.

Choose a basis \(B'=\{\boldv_1, \boldv_2, \dots, \boldv_k\}\) of \(\NS T\) and extend \(B'\) to a basis \(B=\{\boldv_1, \boldv_2,\dots, \boldv_k,\boldv_{k+1},\dots, \boldv_n\}\text{,}\) using Theorem 4.5.15. Observe that \(\dim\NS T=\nullity T=k\) and \(\dim V=n\text{.}\)

We claim that \(B''=\{T(\boldv_{k+1}),T(\boldv_{k+2}),\dots, T(\boldv_{n})\}\) is a basis of \(\im T\text{.}\)

Proof of claim.

\(B''\) is linearly independent.

Suppose \(a_kT(\boldv_k)+a_{k+1}T(\boldv_{k+1})+\cdots +a_nT(\boldv_n)=\boldzero\text{.}\) Then the vector \(\boldv=a_k\boldv_k+a_{k+1}\boldv_{k+1}+\cdots +a_n\boldv_n\) satisfies \(T(\boldv)=\boldzero\) (using linearity of \(T\)), and hence \(\boldv\in \NS T\text{.}\) Then, using the fact that \(B'\) is a basis of \(\NS T\text{,}\) we have

\begin{equation*} b_1\boldv_1+b_2\boldv_2+\cdots +\boldv_k=\boldv=a_k\boldv_k+a_{k+1}\boldv_{k+1}+\cdots +a_n\boldv_n, \end{equation*}

and hence

\begin{equation*} b_1\boldv_1+b_2\boldv_2+\cdots +\boldv_k-a_k\boldv_k-a_{k+1}\boldv_{k+1}-\cdots -a_n\boldv_n=\boldzero. \end{equation*}

Since the set \(B\) is linearly independent, we conclude that \(b_i=a_j=0\) for all \(1\leq i\leq k\) and \(k+1\leq j\leq n\text{.}\) In particular, \(a_{k+1}=a_{k+2}=\cdots=a_n=0\text{,}\) as desired.

\(B''\) spans \(\im T\).

It is clear that \(\Span B''\subseteq \im T\) since \(T(\boldv_i)\in \im T\) for all \(k+1\leq i\leq n\) and \(\im T\) is closed under linear combinations.

For the other direction, suppose \(\boldw\in \im T\text{.}\) Then there is a \(\boldv\in V\) such that \(\boldw=T(\boldv)\text{.}\) Since \(B\) is a basis of \(V\) we may write

\begin{equation*} \boldv=a_1\boldv_1+a_2\boldv_2+\cdots a_k\boldv_k+a_{k+1}\boldv_{k+1}+\cdots +a_n\boldv_n\text{,} \end{equation*}

in which case

\begin{align*} \boldw=T(\boldv)\amp= T(a_1\boldv_1+a_2\boldv_2+\cdots a_k\boldv_k+a_{k+1}\boldv_{k+1}+\cdots +a_n\boldv_n)\\ \amp=a_1T(\boldv_1)+a_2T(\boldv_2)+\cdots a_kT(\boldv_k)+a_{k+1}T(\boldv_{k+1})+\cdots +a_nT(\boldv_n) \amp (T \text{ is linear })\\ \amp=\boldzero +a_kT(\boldv_k)+a_{k+1}T(\boldv_{k+1})+\cdots +a_nT(\boldv_n) \amp (\boldv_i\in\NS T \text{ for } 1\leq i\leq k) \text{.} \end{align*}

This shows that \(\boldw=a_kT(\boldv_k)+a_{k+1}T(\boldv_{k+1})+\cdots +a_nT(\boldv_n)\in \Span B''\text{,}\) as desired.

Having shown \(B''\) is a basis for \(\im T\text{,}\) we conclude that \(\dim \im T=\val{B''}=n-(k+1)+1=n-k\text{,}\) and thus that

\begin{align*} \dim V \amp=k+(n-k) \\ \amp=\dim\NS T+\dim \im T \\ \amp = \nullity T+\rank T\text{.} \end{align*}

Example 4.6.3. Rank-nullity: verification.

Verify the rank-nullity theorem for the linear transformation \(T\colon \R^3\rightarrow \R^2\) defined as \(T(x,y,z)=(x+y+z,x+y+z)\text{.}\)

Solution.

To verify the rank-nullity theorem, we must compute bases for \(\NS T\) and \(\im T\text{.}\) Consider first \(\NS T\text{.}\) We have

\begin{align*} \NS T \amp =\{(x,y,z)\colon x+y+z=0\}\\ \amp =\{(-s-t,s,t)\colon s,t\in\R\} \text{.} \end{align*}

Here the parametric description is obtained using our usual technique for solving systems of equations (Theorem 2.3.5). From the parametric description, it is clear that the set \(B=\{(-1,1,0), (-1,0,1)\}\) spans \(\NS T\text{.}\) Since \(B\) is clearly linearly independent, it is a basis for \(\NS T\text{,}\) and we conclude that \(\dim \NS=\val{B}=2\text{.}\) (Alternatively, the equation \(x+y+z=0\) defines a plane passing through the origin in \(\R^3\text{,}\) and we know such subspaces are of dimension two. )

Next it is fairly clearly that \(\im T=\{(t,t)\colon t\in \R\}=\Span\{(1,1)\}\text{.}\) Thus \(B'=\{(1,1)\}\) is a basis for \(\im T\) and \(\dim\im T=\val{B'}=1\text{.}\)

Finally we observe that

\begin{equation*} \nullity T+\rank T=\dim\NS T+\dim\im T=2+1=3=\dim \R^3, \end{equation*}

as predicted by the rank-nullity theorem.

Example 4.6.4. Rank-nullity: application.

Show that the linear transformation

\begin{align*} T\colon \R^4 \amp\rightarrow \R^3 \\ (x,y,z,w)\amp \mapsto (x+z,y+z+w,z+2w) \end{align*}

is surjective: i.e., \(\im T=\R^3\text{.}\) Do so by first computing \(\NS T\text{.}\)

Solution.

We first examine \(\NS T\text{.}\) We have

\begin{equation*} T(x,y,z,w)=\boldzero \iff \begin{linsys}{4} x\amp \amp \amp \amp \amp + \amp w\amp =\amp 0\\ \amp \amp y\amp+ \amp z \amp + \amp w\amp =\amp 0\\ \amp \amp \amp\amp z \amp + \amp 2w\amp =\amp 0 \end{linsys}\text{.} \end{equation*}

The system above is already in row echelon form, and so we easily see that

\begin{equation*} \NS T=\{(-t,t,-2t,t)\colon t\in \R\}=\Span\{(-1,1,-2,1)\}\text{.} \end{equation*}

Thus \(B=\{(-1,1,-2,1)\}\) is a basis of \(\NS T\text{,}\) and we conclude that \(\dim \NS T=1\text{.}\) The rank-nullity theorem now implies that

\begin{equation*} \dim \im T=4-\dim \NS T=4-1=3\text{.} \end{equation*}

Since \(\im T\subseteq \R^3\) and \(\dim\im T=\dim \R^3=3\text{,}\) we conclude by Corollary 4.5.17 that \(\im T=\R^3\text{.}\) Thus \(T\) is surjective.

Subsection 4.6.2 Fundamental spaces of matrices

Definition 4.6.5. Fundamental spaces.

Let \(A\) be a an \(m\times n\) matrix. Let \(\boldr_1,\dots, \boldr_m\) be the \(m\) rows of \(A\text{,}\) and let \(\boldc_1,\dots \boldc_n\) be its \(n\) columns. The following subspaces are called the fundamental subspaces of \(A\).

The null space of \(A\), denoted \(\NS A\) is defined as

\begin{equation*} \NS A =\{\boldx\in\R^n\colon A\boldx=\boldzero\}\subseteq \R^n.\text{.} \end{equation*}
The row space of \(A\), denoted \(\RS A\text{,}\) is defined as

\begin{equation*} \RS A=\Span \{\boldr_1, \boldr_2, \dots, \boldr_m\}\subseteq \R^n\text{.} \end{equation*}
The column space of \(A\), denoted \(\CS A\text{,}\) is defined as

\begin{equation*} \CS A=\Span \{\boldc_1, \boldc_2, \dots, \boldc_n\}\subseteq \R^m\text{.} \end{equation*}

The rank and nullity of \(A\text{,}\) denoted \(\rank A\) and \(\nullity A\text{,}\) respectively, are defined as \(\rank A=\dim \CS A\) and \(\nullity A=\dim\NS A\text{.}\)

Theorem 4.6.6. Fundamental spaces.

Let \(A\) be an \(m\times n\) matrix. Suppose \(A\) is row equivalent to the row echelon matrix \(U\text{.}\)

Let \(T_A\colon \R^n\rightarrow \R^m\) be the matrix transformation associated to \(A\text{.}\) Then \(\NS A=\NS T_A\) and \(\CS A=\im T_A\text{.}\)
We have
- \(\displaystyle \NS A=\NS U\)
- \(\displaystyle \RS A=\RS U\)
- \(\dim\CS A=\dim \CS U\) (but in general it is not the case that \(\dim\CS A=\dim\CS U\)).
Let \(r \) be the number of leading ones in \(U\text{,}\) and let \(s=n-r\text{;}\) i.e., \(r\) and \(k\) are the number of leading and free variables, respectively, of the system corresponding to \(\begin{amatrix}[r|r]U\amp \boldzero\end{amatrix}\text{.}\) Then

\begin{align*} \rank A\amp =\dim\CS A=\dim \RS A=r \\ \nullity A\amp=\dim\NS A=s \text{.} \end{align*}
Rank-nullity for matrices.
We have

\begin{equation*} n=\dim\NS A+\dim\CS A=\dim\NS A+\dim\RS A\text{.} \end{equation*}

Proof.

We prove each statement in turn.

Proof of (1).

Recall that \(T_A\) is defined as \(T_A(\boldx)=A\boldx\) for all \(\boldx\in \mathbb{R}^n \text{.}\) Then clearly \(T_A(\boldx)=\boldzero\) if and only if \(A\boldx=\boldzero\text{,}\) showing that \(\NS T=\NS A\text{.}\)

Let \(\boldc_1, \boldc_2,\dots, \boldc_n\) be the columns of \(A\text{.}\) We have

\begin{align*} \boldy\in \im T_A \amp\iff \boldy=T_A(\boldx) \text{ for some } \boldx\in \mathbb{R}^n \\ \amp\iff \boldy=A \boldx \text{ for some } \boldx=(x_1,x_2,\dots, x_n)\in \mathbb{R}^n \amp (\text{definition of } T_A) \\ \amp \iff \boldy=x_1\boldc_1+x_2\boldc_2+\cdots +x_n\boldc_n \text{ for some } x_i\in \mathbb{\R} \amp (\knowl{./knowl/th_column_method.html}{\text{Theorem 3.1.19}})\\ \amp \iff \boldy\in\CS A\text{.} \end{align*}

Thus \(\im T_A=\CS A\text{.}\)

Proof of (2).

First observe that \(A=\begin{bmatrix}1\amp 1\\ 1\amp 1 \end{bmatrix}\) is row equivalent to \(U=\begin{bmatrix}1\amp 1\\ 0 \amp 0\end{bmatrix}\) and yet \(\CS A=\Span\{(1,1)\}=\{(t,t)\colon t\in \R\}\ne \CS U=\Span\{(1,0)\}=\{(t,0)\colon t\in\R\}\text{.}\) Thus we do not have \(\CS A=\CS U\) in general.

Next, we show more generally that if \(A\) is row equivalent to \(B\text{,}\) then \(\NS A=\NS B\text{,}\) \(\RS A=\RS B\text{,}\) and \(\dim\CS A=\dim \CS B\text{.}\)

Assume that \(A\) is row equivalent to \(B\text{.}\) Using the formulation of row reduction in terms of multiplication by elementary matrices, we see that there is an invertible matrix \(Q\) such that \(B=QA\text{,}\) and hence also \(A=Q^{-1}B\text{.}\) Then:

\begin{align*} \boldx\in\NS A\amp\iff A\boldx=\boldzero \\ \amp\iff QA\boldx=Q\boldzero \amp (\knowl{./knowl/th_inverse_cancel.html}{\text{Theorem 3.3.3}}) \\ \amp \iff B\boldx=\boldzero \\ \amp iff \boldx\in\NS B \end{align*}

. This proves \(\NS A=\NS B\text{.}\)

Next, by (1) we have \(\NS A=\NS T_A\text{,}\) \(\CS A=\im T_A\) and \(\NS B=\NS T_B\text{,}\) \(\CS B=\im T_B\text{.}\) Then

\begin{align*} \dim \CS A \amp=\dim\im T_A \\ \amp=n-\dim\NS A \amp (\knowl{./knowl/th_rank-nullity.html}{\text{Theorem 4.6.2}}) \\ \amp=n-\dim NS B \amp (\NS A=\NS B)\\ \amp =\dim\im T_B \amp ( \knowl{./knowl/th_rank-nullity.html}{\text{Theorem 4.6.2}}\\ \amp =\dim \CS B\text{.} \end{align*}

Lastly, we turn to the row spaces. We will show that each row of \(B\) is an element of \(\RS A\text{,}\) from whence it follows that \(\RS B\subseteq \RS A\text{.}\) Let \(\boldr_i\) be the \(i\)-th row of \(B\text{,}\) and let \(\boldq_i\) be the \(i\)-th column of \(Q\text{.}\) By Theorem 3.1.21, we have \(\boldr_i=\boldq_i A\text{,}\) and furthermore, \(\boldq_i A\) is the linear combination of the rows of \(A\) whose coefficients come from the entries of \(\boldq_i\text{.}\) Thus \(\boldr_i\in\RS A\text{,}\) as desired.

Having shown that \(\RS B\subseteq \RS A\text{,}\) we see that the same argument works mutatis mutandis (swapping the roles of \(A\) and \(B\) and using \(Q^{-1}\) in place of \(Q\)) to show that \(\RS A\subseteq \RS B\text{.}\) We conclude that \(\RS A=\RS B\text{.}\)

Proof of (3).

By (2) we know that \(\NS A=\NS U, \RS A=\RS U\text{,}\) and \(\dim\CS A=\dim\CS U\text{.}\) So it is enough to show that \(\dim \NS U=\dim\RS U=r\) and \(\dim \NS U=s\text{.}\)

First, we will show that the \(r\) nonzero rows of \(U\) form a basis for \(\RS U\text{,}\) proving \(\dim\RS U=r\text{.}\) Clearly the nonzero rows span \(\RS U\text{,}\) since any linear combination of all the rows of \(U\) can be expressed as a linear combination of the nonzero rows. Furthermore, since \(U\) is in row echelon form, the staircase pattern of the leading ones appearing in the nonzero rows assures that these row vectors are linearly independent.

Next, we show that the columns of \(U\) containing leading ones form a basis of \(\CS U\text{.}\) Let \(\boldu_{i_1},\dots, \boldu_{i_r}\) be the columns of \(U\) with leading ones, and let \(\boldu_{j_1}, \boldu_{j_2}, \dots, \boldu_{j_s}\) be the columns without leading ones. To prove the \(\boldu_{i_k}\) form a basis for \(\CS U\text{,}\) we will show that given any \(\boldy\in \CS U\) there is a unique choice of scalars \(c_1, c_2,\dots, c_r\) such that \(c_1\boldu_{i_1}+\cdots +c_r\boldu_{i_r}=\boldy\text{.}\) (Recall that the uniqueness of this choice implies linear independence.)

So assume \(\boldy\in \CS U\text{.}\) Then we can find \(\boldx\in\R^n\) such that \(U\boldx=\boldy\text{,}\) which means the linear system with augmented matrix \([\ U\ \vert \ \boldy]\) is consistent. Using our Gaussian elimination theory (specifically, Theorem 2.3.5), we know that the solutions \(\boldx=(x_1,x_2,\dots, x_n)\) to this system are in 1-1 correspondence with choices for the free variables \(x_{j_1}=t_{j_1}, x_{j_2}=t_{j_2}, \dots, x_{j_s}=t_{j_s}\text{.}\) (Remember that the columns \(\boldu_{j_k}\) without leading ones correspond to the free variables.) In particular, there is a unique solution to \(U\boldx=\boldy\) where we set all the free variables equal to 0. By the column method (Theorem 3.1.19), this gives us a unique linear combination of only the columns \(\boldu_{i_k}\) with leading ones equal to \(\boldy\text{.}\) This proves the claim, and shows that the columns with leading ones form a basis for \(\CS U\text{.}\) We conclude that \(\dim\CS U=r\text{.}\)

Lastly, again using the results of (1), we have

\begin{align*} \dim NS U \amp =\dim NS T_U \\ \amp =n-\dim\im T_U \amp (\knowl{./knowl/th_rank-nullity.html}{\text{Theorem 4.6.2}}) \\ \amp =n-\dim\CS U\\ \amp =n-r \\ \amp =s \text{,} \end{align*}

where the last equality uses the fact that the sum of the number of columns with leading ones (\(r\)) and the number of columns without leading ones (\(s\)) is \(n\text{,}\) the total number of columns.

Proof of (4).

We have

\begin{align*} n \amp =\dim\NS T_A+\dim\im T_A \amp (\knowl{./knowl/th_rank-nullity.html}{\text{Theorem 4.6.2}})\\ \amp =\dim\NS A+\dim\CS A \amp (\text{by (1)})\\\ \amp =\dim\NS A+\dim\RS A \amp (\text{ by (3)})\text{.} \end{align*}

Procedure 4.6.7. Computing bases of fundamental spaces.

To compute bases for the fundamental spaces of an \(m\times n\) matrix \(A\text{,}\) proceed as follow.

Row reduce \(A\) to a matrix \(U\) in row echelon form.
We have \(\NS A=\NS U\text{.}\) Compute a parametric description of the solutions to the linear system \(U\boldx=\boldzero\) following Theorem 2.3.5. If the free variables are \(t_1, t_2, \dots, t_k \text{,}\) a basis \(B=\{\boldv_1, \boldv_2, \dots, \boldv_k\}\) of \(\NS A\) is obtained by letting \(\boldv_i\) be the solution corresponding to the choice \(t_i=1\) and \(t_j=0\) for \(j\ne i\text{.}\)
We have \(\RS A=\RS U\text{.}\) The set of nonzero rows of \(U\) is a basis for \(\RS A\text{.}\)
In general \(\CS A\ne \CS U\text{.}\) However, the columns of \(U\) containing leading ones form a basis of \(\CS U\text{,}\) and the corresponding columns of \(A\) form a basis for \(\CS A\text{.}\)

Theorem 4.6.6 allows us to add seven more equivalent statements to our invertibility theorem, bringing us to a total of fourteen! We have tried to sandwich in the new statements in places where the equivalence with a neighbor is fairly straightforward to show, or else already established.

Theorem 4.6.8. Invertibility theorem (supersized).

Let \(A\) be an \(n\times n\) matrix. The following statements are equivalent.

\(A\) is invertible.
The matrix equation

\begin{equation*} A\underset{n\times 1}{\boldx}=\underset{n\times 1}{\boldb} \end{equation*}

has a unique solution for any column vector \(\boldb\text{.}\)
\(\displaystyle \NS A=\{\boldzero\}\)
\(\displaystyle \nullity A=0\)
\(\displaystyle \rank A=0\)
\(\displaystyle \RS A=\R^n\)
\(\displaystyle \CS A=\R^n\)
The matrix equation

\begin{equation*} A\underset{n\times 1}{\boldx}=\underset{n\times 1}{\boldb} \end{equation*}

has a solution for any column vector \(\boldb\text{.}\)
The matrix equation

\begin{equation*} A\underset{n\times 1}{\boldx}=\underset{n\times 1}{\boldzero} \end{equation*}

has a unique solution: namely, \(\boldx=\boldzero_{n\times 1}\text{.}\)
Any of the following equivalent conditions about the set \(S\) of columns of \(A\) hold: \(S\) is a basis of \(\R^n\text{;}\) \(S\) spans \(\R^n\text{;}\) \(S\) is linearly independent.
Any of the following equivalent conditions about the set \(S\) of rows of \(A\) hold: \(S\) is a basis of \(\R^n\text{;}\) \(S\) spans \(\R^n\text{;}\) \(S\) is linearly independent.
\(A\) is row equivalent to \(I_n\text{,}\) the \(n\times n\) identity matrix.
\(A\) is a product of elementary matrices.
\(\det A\ne 0\text{.}\)

Subsection 4.6.3 Contracting and expanding to bases

Thanks to Theorem 4.5.11 we know that spanning sets can be contracted to bases, and linearly independent sets can be extended to bases; and we have already seen a few instances where this result has been put to good use. However, neither the theorem nor its proof provide a practical means of performing this contraction or extension. We would like a systematic way of determining which vectors to throw out (when contracting), or which vectors to chuck in (when extending). In the special case where \(V=\R^n\) for some \(n\text{,}\) we can adapt Procedure 4.6.7 to our needs.

Procedure 4.6.9. Contracting and extending to bases of \(\R^n\).

Let \(S=\{\boldv_1, \boldv_2,\dots, \boldv_r\}\subseteq \R^n\text{.}\)

Contracting to a basis

Assume \(S\) spans \(\R^n\text{.}\) To contract \(S\) to a basis \(B\subseteq S\text{,}\) proceed as follows.

Let \(A\) be the \(n\times r\) matrix whose \(j\)-th column is given by \(\boldv_j\) for all \(1\leq j\leq r\text{.}\)
Use the column space procedure (4.6.7) to compute a basis \(B\) of \(\CS A\text{,}\) chosen from among the original columns of \(A\text{.}\)
The subset \(B\subseteq S\) is a basis for \(\R^n\text{.}\)

Extending to a basis

Assume \(S\) is linearly independent. To extend \(S\) to a basis \(B\) of \(\R^n\) proceed as follows.

Let \(A\) be the \(n\times (r+n)\) matrix whose first \(r\) columns are the elements of \(S\text{,}\) and whose remaining \(n\) columns consist of \(\bolde_1, \bolde_2, \dots, \bolde_n\text{,}\) the standard basis elements of \(\R^n\text{.}\)
Use the column space procedure (4.6.7) to compute a basis \(B\) of \(\CS A\text{,}\) chosen from among the original columns of \(A\text{.}\)
The set \(B\) is a basis for \(\R^n\) containing \(S\text{.}\)

Proof.

Let's see why in both cases the procedure produces a basis of \(\R^n\) that is either a sub- or superset of \(S\text{.}\)

Contracting to a basis.

In this case we have \(\CS A=\Span S=\R^n\text{.}\) Thus \(B\) is a basis for \(\R^n\text{.}\) Since the column space procedure selects columns from among the original columns of \(A\text{,}\) we have \(B\subseteq S\text{,}\) as desired.

Extending to a basis.

Since \(\CS A\) contains \(\bolde_j\) for all \(1\leq j\leq n\text{,}\) we have \(\CS A=\R^n\text{.}\) Thus \(B\) is a basis for \(\R^n\text{.}\) Since the first \(r\) columns of \(A\) are linearly independent (they are the elements of \(S\)), when we row reduce \(A\) to a matrix \(U\) in row echelon form, the first \(r\) columns of \(U\) will contain leading ones. (To see this, imagine row reducing the \(n\times r\) submatrix \(A'\) consisting of the first \(r\) columns of \(A\) to a row echelon matrix \(U'\text{.}\) Since these columns are linearly independent, they already form a basis for \(\CS A'\text{.}\) Thus the corresponding colmns of \(U'\) must all have leading ones. ) It follows that the first \(r\) columns of \(A\) are selected to be in the basis \(B\text{,}\) and hence that \(S\subseteq B\text{,}\) as desired.

Exercises 4.6.4 Exercises

1.

In this exercise we will show that for any \((a,b,c,d)\in \R^4\text{,}\) there is a polynomial \(p(x)\in P_3\) satisfying

\begin{equation*} p(-1)=a, p(0)=b, p(1)=c, p(2)=d\text{.} \end{equation*}

In other words given any list of values \(a, b, c, d\text{,}\) we can find a polynomial that evaluates to these values at the inputs \(x=-1, 0, 1, 2\text{.}\)

Define \(T\colon P_3\rightarrow \R^4\) by \(T(p(x))=(p(-1), p(0), p(1), p(2))\text{.}\) Show that \(T\) is linear.
Compute \(\NS T\text{.}\) You may use the fact that a polynomial of degree \(n\) has at most \(n\) roots.
Use the rank-nullity theorem to compute \(\dim \im T\text{.}\) Explain why this implies \(\im T=\R^4\)
Explain why the equality \(\im T=\R^4\) is equivalent to the claim we wish to prove.

2.

Use the rank-nullity theorem to compute the rank of the linear transformation \(T\) described.

\(T\colon\R^7\rightarrow M_{32}\text{,}\) \(\nullity T=2\)
\(T\colon P_3\rightarrow \R^5\text{,}\) \(\NS T=\{\boldzero\}\)
\(T\colon P_5 \rightarrow P_5\text{,}\) \(\NS T=P_4\)

3.

For each linear transformation \(T\colon V\rightarrow W\) use the rank-nullity theorem to decide whether \(\im T=W\text{.}\)

\(\displaystyle T\colon \R^4\rightarrow \R^5\)
\(T\colon M_{22}\rightarrow P_2\text{,}\) \(\nullity T=2\)
\(T\colon M_{23}\rightarrow M_{32}\text{,}\) \(\NS T=\{\boldzero\}\)

4.

Let \(A\) be \(m\times n\) with \(n\lt m\text{.}\) Prove, using Theorem 4.6.6 that there is a \(\boldb\in\R^m\) such that the system \(A\boldx=\boldb\) is inconsistent.

5.

For each matrix \(A\) (i) row reduce \(A\) to a matrix \(U\) in row echelon form, (ii) compute bases for \(\CS A\) and \(\CS U\text{,}\) (iii) compute \(\dim \CS A\) and \(\dim \CS U\text{,}\)and (iv) decide whether \(\CS A=\CS U\text{.}\)

\begin{equation*} A=\boldzero_{n\times n} \end{equation*}
\begin{equation*} A=\begin{amatrix}[rr]-2 \amp 1\\ 1\amp 5 \end{amatrix} \end{equation*}
\begin{equation*} A=\begin{amatrix}[rrr]1 \amp 1 \amp 3 \\ 1 \amp 2 \amp 1 \\ 1 \amp 3 \amp -1 \end{amatrix} \end{equation*}

6.

Assume \(A\) is invertible, and is row equivalent to the row echelon matrix \(U\text{.}\) Prove: \(\CS A=\CS U\text{.}\)

7.

For each matrix below, (i) compute bases for each fundamental space, (ii) identify these spaces as familiar geometric objects in \(\R^2\) or \(\R^3\text{,}\) and (iii) provide sketches of each space. The sketches of \(\NS A\) and \(\RS A\) should be combined in the same coordinate system.

\begin{equation*} A=\begin{amatrix}[rrr]1\amp 0\amp 0\\ 0\amp 1\amp 0 \end{amatrix} \end{equation*}
\begin{equation*} A=\begin{amatrix}[rrr]1\amp 1\amp -2\\ 3\amp -4\amp 1 \end{amatrix} \end{equation*}
\begin{equation*} A=\begin{amatrix}[rr] 1\amp 2\\ 0\amp 0 \\ -1\amp -2 \end{amatrix} \end{equation*}

8.

For each \(A\) compute bases for each fundamental space. In each case, you can find bases for one of the fundamental spaces by inspection, and then use the rank-nullity theorem to reduce your workload for the other spaces. See first solution for a model example.

\begin{equation*} A=\begin{amatrix}[rrrr]1\amp 1\amp 1\amp 1\\ 1\amp 1\amp 1\amp 1\\ \end{amatrix} \end{equation*}
\begin{equation*} A=\begin{bmatrix}1\amp 1\amp 1\amp 1\\ 2\amp 1\amp 2\amp 1\\ 1\amp 1\amp 1\amp 1 \end{bmatrix} \end{equation*}

Solution.

Clearly, \(B=\{(1,1)\}\) is a basis for \(\CS A\text{,}\) and \(B'=\{(1,1,1,1)\}\) is a basis for \(\RS A\text{.}\) It follows that \(\rank A=1\) and hence \(\nullity A=4-1=3\text{.}\) Thus we need to find three linearly independent elements of \(\NS A\) to find a basis. We can do so by inspection with the help of the column method. Namely, observe that \(\boldv_1=(1,-1,0,0), \boldv_2=(0,1,-1,0), \boldv_3=(0,0,1,-1)\) are all in \(\NS A\) (column method). The location of zeros in these vectors make it clear that \(B''=\{\boldv_1,\boldv_2, \bolv_3\}\) are linearly independent. Since \(\dim NS A=3\text{,}\) and \(\val{B''}=3\text{,}\) we conclude that \(B''\) is a basis of \(\NS A\) (4.5.17).

9.

For each \(A\) use Procedure 4.6.7 to compute bases for each fundamental space.

\begin{equation*} A= \begin{bmatrix}1\amp 2\amp 4\amp 5\\ 0\amp 1\amp -3\amp 0\\ 0\amp 0\amp 1\amp -3\\ 0\amp 0\amp 0\amp 0 \end{bmatrix} \end{equation*}
\begin{equation*} A= \begin{bmatrix}1\amp 2\amp -1\amp 5\\ 0\amp 1\amp 4\amp 3\\ 0\amp 0\amp 1\amp -7\\ 0\amp 0\amp 0\amp 1 \end{bmatrix} \end{equation*}
\begin{equation*} A = \begin{bmatrix}1\amp 4\amp 5\amp 6\amp 9\\ 3\amp -2\amp 1\amp 4\amp -1\\ -1\amp 0\amp -1\amp -2\amp -1\\ 2\amp 3\amp 5\amp 7\amp 8 \end{bmatrix} \end{equation*}

10.

Find the rank and nullity of each matrix by reducing it to row echelon form.

\begin{equation*} A = \begin{amatrix}[rrrrr] 1\amp 0\amp -2\amp 1\amp 0\\ 0\amp -1\amp -3\amp 1\amp 3\\ -2\amp -1\amp 1\amp -1\amp 3\\ 0\amp 1\amp 3\amp 0\amp -4 \end{amatrix} \end{equation*}
\begin{equation*} A = \begin{amatrix}[rrrr] 1\amp 3\amp 1\amp 3\\ 0\amp 1\amp 1\amp 0\\ -3\amp 0\amp 6\amp -1\\ 3\amp 4\amp -2\amp 1\\ 2\amp 0\amp -4\amp -2 \end{amatrix} \end{equation*}

11.

Let \(A\) be an \(n\times n\) matrix.

Prove: \(A^2=\boldzero_{n\times n}\) if and only if \(\CS A\subseteq \NS A\text{.}\)
Construct a \(2\times 2\) matrix \(A\) with \(\NS A=\CS A=\Span\{(1,2)\}\text{.}\) Verify that your \(A\) satisfies \(A^2=\boldzero_{2\times 2}\text{.}\)

12.

Suppose \(A\) is \(m\times n\) with \(m\ne n\text{.}\)

Prove: either the rows of \(A\) are linearly dependent or the columns of \(A\) are linearly dependent.

13.

Prove: \(\nullity A=\nullity A^T\) if and only if \(A\) is a square matrix.

14.

Suppose \(\underset{m\times n}{A}\) reduces to the row echelon matrix \(\underset{m\times n}{U}\text{.}\)

Prove: the columns \(\boldu_{i_1}, \boldu_{i_2}, \dots, \boldu_{i_r}\) of \(U\) form a basis for \(\CS U\) if and only if the corresponding columns \(\bolda_{i_1}, \bolda_{i_2}, \dots, \bolda_{i_r}\) of \(A\) form a basis for \(\CS A\text{.}\)
Prove: the columns of \(U\) with leading ones form a basis for \(\CS U\text{.}\)
Prove: \(\dim \NS U=\#(\text{ free variables } )\) in the linear system \(U\boldx=\boldzero\text{.}\)
Suppose \(x_{j_1}=t_{j_1}, x_{j_2}=t_{j_2}, \dots, x_{j_s}=t_{j_s}\) are the free variables of the system \(U\boldx=\boldzero\text{.}\) Let \(\boldv_{j_k}\) be the element of \(\NS U\) obtained by setting \(t_{j_k}=1\) and \(t_{j_\ell}=0\) for \(\ell\ne k\) in our parametric description of solutions to \(U\boldx=\boldzero\text{.}\) Prove: \(\{\boldv_{j_1}, \boldv_{j_2}, \dots, \boldv_{j_s}\}\) is a basis for \(\NS U\text{.}\)

Solution.

Recall that \(A\) being row equivalent to \(U\) is equivalent to there being an invertible matrix \(Q\) such that

\begin{equation*} QA=U\tag{\(*\)}\text{.} \end{equation*}

We will use this fact repeatedly.

By the column method of matrix multiplication we have \(\boldu_i=Q\bolda_i\text{:}\) i.e., the \(i\)-th column of \(U\) is obtained by multiplying the \(i\)-th column of \(A\) on the left by \(Q\text{.}\) First observe that

\begin{align*} \boldzero=c_1\bolda_{i_1}+\cdots +c_r\bolda_{i_r}\amp \Leftrightarrow Q\boldzero=Q(c_1\bolda_{i_1}+\cdots +c_r\bolda_{i_r}) \amp \text{ (since \(Q\) is invertible) }\\ \amp \Leftrightarrow \boldzero=c_1Q\bolda_{i_1}+\cdots +c_rQ\bolda_{i_r} \amp \text{ (matrix arithmetic) }\\ \amp \Leftrightarrow \boldzero=c_1\boldu_{i_1}+\cdots +c_r\boldu_{i_r} \end{align*}

In particular, there is a nontrivial linear combination of the \(\bolda_{i_k}\) equal to \(\boldzero\) if and only if there is a nontrivial linear combination of the \(\boldu_{i_k}\) equal to \(\boldzero\text{.}\) Thus the \(\bolda_{i_k}\) are independent if and only if the \(\boldu_{i_k}\) are independent.

Next, we claim the \(\bolda_{i_k}\) span \(\CS A\) if and only if the \(\boldu_{i_k}\) span \(\CS U\text{.}\) Indeed, suppose the \(\bolda_{i_k}\) span \(\CS A\text{.}\) Take \(\boldy\in \CS U\text{.}\) we will show that \(\boldy\) is a linear combination of the \(\boldu_{i_k}\text{.}\)

Since \(\boldy\in \CS U\text{,}\) there is an \(\boldx\in\R^n\) such that \(\boldy=U\boldx\) (since \(\CS U=\range U\)). We have \(U=QA\text{,}\) and thus \(\boldy=QA(\boldx)=Q(\boldw)\text{,}\) where \(\boldw=A\boldx\text{.}\) Then \(\boldw\in \CS A\text{,}\) and we can write \(\boldw=c_1\bolda_{i_1}+\cdots +c_r\bolda_{i_r}\text{.}\) It then follows that

\begin{equation*} \boldy=Q\boldw=Q(c_1\bolda_{i_1}+\cdots +c_r\bolda_{i_r})=c_1Q\bolda_{i_1}+\cdots +c_rQ\bolda_{i_r}=c_1\boldu_{i_1}+\cdots +c_r\boldu_{i_r}\text{.} \end{equation*}

Since \(\boldy\) was any element of \(\CS U\text{,}\) we see that the \(\boldu_{i_k}\) span \(\CS U\text{,}\) as desired.

To go the other way (i.e., that if the \(\boldu_{i_k}\) span \(\CS U\text{,}\) the \(\bolda_{i_k}\) span \(\CS A\)), note that \((*)\) implies \(A=Q^{-1}U\text{,}\) and we can use the same argument above with the roles of \(\bolda_{i_k}\) and \(\boldu_{i_k}\) swapped!

We have shown that the \(\bolda_{i_k}\) are independent if and only if the \(\boldu_{i_k}\text{,}\) and that they span \(\CS A\) if and only if the \(\boldu_{i_k}\) span \(\CS U\text{.}\) It follows that the \(\bolda_{i_k}\) form a basis for \(\CS A\) if and only if the \(\boldu_{i_k}\) form a basis for \(\CS U\text{.}\)
Let \(\boldu_{i_1},\dots, \boldu_{i_r}\) be the columns of \(U\) with leading ones, and let \(\boldu_{j_1}, \boldu_{j_2}, \dots, \boldu_{j_s}\) be the columns without leading ones. To prove the \(\boldu_{i_k}\) form a basis for \(\CS U\text{,}\) we will show that given any \(\boldy\in \CS U\) there is a unique choice of scalars \(c_1, c_2,\dots, c_r\) such that \(c_1\boldu_{i_1}+\cdots +c_r\boldu_{i_r}=\boldy\text{.}\) (Recall that the uniqueness of this choice implies linear independence.)

So assume \(\boldy\in \CS U\text{.}\) Then we can find \(\boldx\in\R^n\) such that \(U\boldx=\boldy\text{,}\) which means the linear system with augmented matrix \([\ U\ \vert \ \boldy]\) is consistent. Using our Gaussian elimination theory, we know that the solutions \(\boldx=(x_1,x_2,\dots, x_n)\) to this system are in 1-1 correspondence with choices for the free variables \(x_{j_1}=t_{j_1}, x_{j_2}=t_{j_2}, \dots, x_{j_s}=t_{j_s}\text{.}\) (Remember that the columns \(\boldu_{j_k}\) without leading ones correspond to the free variables.) In particular, there is a unique solution to \(U\boldx=\boldy\) where we set all the free variables equal to 0. By the column method, this gives us a unique linear combination of only the columns \(\boldu_{i_k}\) with leading ones equal to \(\boldy\text{.}\) This proves the claim.
Using the notation and result from (b) we see that \(\rank U=\dim\range U=\dim\CS U=r\text{,}\) the number of columns of \(U\) with leading ones. By the rank-nullity theorem, \(\dim\NS U=n-r=s\text{,}\) the number of columns without leading ones. This is also equal to the number of free variables in the corresponding system of equations.
(d) The given recipe produces a list of \(s\) distinct vectors in \(\NS U\text{.}\) Since \(s=\dim\NS U\text{,}\) by part (c), it suffice to show the \(\boldv_i\) are linearly independent. This is easy to see. Indeed suppose we have \(c_1\boldv_1+c_2\boldv_2+\cdots +c_s\boldv_s=\boldzero\text{.}\) Since for each \(1\leq i\leq s\text{,}\) \(\boldv_i\) is the only vector with a nonzero entry for the \(i\)-th component, we must have \(c_i=0\) for all \(1\leq i\leq s\text{.}\) Thus we see that the set is linearly independent.