Section 2.2 Gaussian elimination
In Section 2.1 we sketched a procedure for solving a linear system \(L\text{.}\) That procedure involved applying a sequence of row operations to \(L\) to obtain a “simpler” system \(L'\text{.}\)
We will fill out this sketch in the next two sections. Specifically, we will
describe precisely what we mean by a “simpler” system,
provide an algorithm (or recipe) that decides exactly what sequence of row operations to apply to obtain this simpler system,
explain how to find all solutions of the resulting simpler system.
Before we embark on these steps, we introduce a useful bit of notation. As you may have noticed, when performing row operations on a system of equations, we essentially treat the unknowns, as well as the plus and equals symbols, as placeholders; the only things that actually change in a given step are the coefficients in the equations. The augmented matrix associated to a linear system is a formal way of extracting just the information of the coefficients from a linear system.
Definition 2.2.1. Augmented matrix.
Let \(L\) be the linear system
The augmented matrix associated to \(L\) is the matrix
Remark 2.2.2.
As defined more thoroughly in Definition 3.1.2, a matrix is just a rectangular array of numbers.
Subsection 2.2.1 Row echelon matrices
Here is our precise formulation of a “simple” linear system; it is a system whose associated augmented matrix is in row echelon form, as described below.
Definition 2.2.3. Row echelon form.
A zero row of a matrix, is a row whose entries are all equal to zero; a nonzero row is a row that contains at least one nonzero entry.
A matrix is in row echelon form if the following conditions hold.
- (i)
In any nonzero row the first (i.e., leftmost) nonzero entry is equal to one. A leading one of a matrix is such an entry: i.e., an entry of a row that is equal to one, and that is also the first nonzero entry of that row.
- (ii)
All zero rows are grouped together at the bottom of the matrix.
- (iii)
Given any two nonzero rows in the matrix, the leading one of the lower row occurs strictly to the right of the leading one of the row above it.
A matrix is in reduced row echelon form if in addition to conditions (i)-(iii) it also satisfies the following condition.
- (iv)
- Any column of the matrix that contains a leading one has zeros elsewhere. In other words, if a column contains a leading one, then that is the only nonzero entry of that column.
A linear system \(L\) is in row echelon form (resp. reduced row echelon form) if its augmented matrix is in row echelon form (resp. reduced row echelon form).
In practice to decide whether a matrix is in in (reduced) row echelon form, follow these steps:
First verify whether all zero rows are at the bottom.
For each nonzero row, determine whether the first nonzero entry is a 1, and put a box around it.
Make sure your boxes make a staircase pattern.
(For reduced row echelon form only.) Decide whether each column with a box has 0's everywhere else.
Example 2.2.4. Row echelon versus reduced row echelon form.
For each matrix decide (a) whether it is in row echelon form, and (b) whether it is in reduced row echelon form.
- \begin{equation*} \begin{bmatrix}0\amp 2\amp 1\amp 0\amp 0\\ 0\amp 0\amp 0\amp 0\amp 0\\ 0\amp 1\amp 0\amp 0\amp 1\\ 1\amp 0\amp 0\amp 0\amp 0\\ 0\amp 0\amp 0\amp 0\amp 0 \end{bmatrix} \end{equation*}
- \begin{equation*} \begin{bmatrix}1\amp 0\amp 0\amp -3\amp -7\\ 0\amp 0\amp 1\amp 2\amp 0\\ 0\amp 0\amp 0\amp 0\amp 1\\ 0\amp 0\amp 0\amp 0\amp 0\\ 0\amp 0\amp 0\amp 0\amp 0 \end{bmatrix} \end{equation*}
-
Below you find the matrix with leading ones boxed. This matrix fails to be in row echelon form (and hence also reduced row echelon form) for a variety of reasons: the zero rows are not all grouped at the bottom; the first row is nonzero, but does not have a leading one; the leading one of the fourth row is to the left of the leading one of the leading one in the row above it.
\begin{equation*} \begin{bmatrix}0\amp 2\amp 1\amp 0\amp 0\\ 0\amp 0\amp 0\amp 0\amp 0\\ 0\amp \boxed{1}\amp 0\amp 0\amp 1\\ \boxed{1}\amp 0\amp 0\amp 0\amp 0\\ 0\amp 0\amp 0\amp 0\amp 0 \end{bmatrix} \end{equation*} -
Below you find the matrix with leading ones boxed. This matrix is in row echelon form: the zero rows (rows 4 and 5) are grouped at the bottom; each nonzero row has a leading one (boxed in the matrix below); the leading ones step strictly to the right as we move down the rows.
\begin{equation*} \begin{bmatrix}\boxed{1}\amp 0\amp 0\amp -3\amp -7\\ 0\amp 0\amp \boxed{1}\amp 2\amp 0\\ 0\amp 0\amp 0\amp 0\amp \boxed{1}\\ 0\amp 0\amp 0\amp 0\amp 0\\ 0\amp 0\amp 0\amp 0\amp 0 \end{bmatrix} \end{equation*}The matrix is not in reduced row echelon form, as the last column contains a leading one in its third row, and a nonzero entry in its first row.
Subsection 2.2.2 Gaussian elimination
We will now describe a systematic procedure, called Gaussian elimination, that allows us to reduce a given linear system \(L\) to a system \(L'\)in row echelon form. In keeping with the foregoing discussion, we will identify a system \(L\) with its augmented matrix \(A\text{.}\) Furthermore, reducing a linear system using elementary operations on equations is now cast as performing elementary row operations on matrices. At the risk of redundancy we now officially translate a number of our former notions regarding reduction of linear systems to the setting of matrices.
Definition 2.2.5. Elementary row operations on matrices.
An elementary row operation is one of the three following types of matrix operations. Let \(A\) be a given \(m\times n\) matrix, and denote by \(r_i\) the \(i\)-th row of \(A\text{.}\)
- Scalar multplication
Multiply a row by a nonzero number \(c\ne 0\text{:}\) i.e., replace \(r_i\) with \(c\,r_i\text{,}\) the result of multiplying all entries of the row by \(c\text{.}\)
- Row swap
Swap two rows of \(A\text{.}\)
- Row addition
Add a multiple of one row to another: i.e., replace \(r_i\) with \(r_i+cr_j\) for some \(c\text{,}\) \(i\text{,}\) and \(j\text{.}\)
The act of transforming a matrix using elementary row operations is called row reduction
Two matrices are row equivalent if the one can be obtained from the other by performing a finite sequence of elementary row operations.
Remark 2.2.6. Notation.
Our former elementary operation notation easily transfers to row operations on matrices. The expressionsAt last we are ready to define Gaussian elimination. This is a procedure, or algorithm, that takes an input matrix \(A\) and row reduces it to a matrix \(B\) in row echelon form.
Mantra 2.2.7. Gaussian elimination is the workhorse of linear algebra.
Simplifying linear systems is but one of the many useful applications of Gaussian elimination, and you should think of it as having a life independent of its role in solving systems of linear equations. In its essence Gaussian elimination is simply a certain procedure performed on matrices: one that we will come back to again and again, and to diverse ends.
Definition 2.2.8. Gaussian elimination.
Gaussian elimination is the algorithm described below. It takes as an input a matrix \(A\) and returns as an output a row equivalent matrix \(B\) in row echelon form.
- Step 1
Find the leftmost nonzero column and perform a row swap to move the row with this nonzero entry to the top of the matrix.
- Step 2
Scale the new top row to produce a leading one in the row. Call this new row \(r\text{.}\)
- Step 3
For each row \(r_i\) below \(r\) perform a row operation of the form \(r_i+c\,r\) to replace all entries below the leading one of \(r\) with zeros.
- Step 4
Begin again with Step 1 applied to the matrix consisting of all rows below \(r\text{.}\) Continue until the matrix is in row echelon form.
Subsection 2.2.2.1 Model example
Use the following example as a model of how to both perform and annotate the steps in Gaussian elimination. When first learning this procedure, make sure to follow it to the letter. In particular, in the spirit of the mandate issued in Remark 2.1.10, you should only perform one row operation at a time, and only in the sequence prescribed in Steps 1-4 of Definition 2.2.8.
The matrix outputed by Gaussian elimination is guaranteed to be in row echelon form, but it may not be in reduced row echelon form. Gauss-Jordan elimination describes a systematic way to continue reducing to this even simpler state.
Definition 2.2.9. Gauss-Jordan elimination.
Gauss-Jordan elimination is the algorithm described below. It takes as an input a matrix \(A\) and returns a row equivalent matrix \(B\) in reduced row echelon form.
- Steps 1-4
Apply Gaussian elimination to transform \(A\) to a matrix in row echelon form.
- Step 5
Find the rightmost column of the matrix containing a leading one. Let \(r_i\) be the row containing this leading one. For each row \(r_j\) above \(r_i\) perform a row operation of the form \(r_i+c\,r_j\) to replace all entries above the leading one with zeros.
- Step 6
Begin again with Step 5 applied to the next column to the left that contains a leading one. Continue until the matrix is in reduced row echelon form.
Continuing our previous example:
Definition 2.2.8 and Definition 2.2.9 are really theorems in disguise, and we make this official in Theorem 2.2.10.
Theorem 2.2.10. Row equivalent matrix forms.
- Row echelon forms exist
Any matrix \(A\) is row equivalent to a matrix in row echelon form. Indeed, Gaussian elimination row reduces any matrix to a matrix in row echelon form.
- Reduced row echelon forms exist
Any matrix \(A\) is row equivalent to a matrix in reduced row echelon form. Indeed, Gauss-Jordan elimination row reduces any matrix to a matrix in reduced row echelon form.
- Reduced row echelon forms are unique
Any matrix \(A\) is row equivalent to a unique matrix in reduced row echelon form.
We will make heavy use of the first two results of Theorem 2.2.10. The proofs of these statements are not difficult, but not especially illuminating. Accordingly we omit them here, and point the interested reader to Robert Beezer's A First Course in Linear Algebra (Theorem REMEF).
The third statement of Theorem 2.2.10, that every matrix is row equivalent to a unique matrix in reduced row echelon form, does in fact have an enlightening proof. We will postpone this proof, however, until we have a little more theory at our disposal. (See Corollary 3.4.15.) Until then we will conscientiously not make use of this result in developing any of our further theory.
Example 2.2.11. Row echelon form is not unique.
Show that a matrix \(A\) may be row equivalent to two or more matrices in row echelon form.
Take \(A=\begin{bmatrix} 1 \amp 1 \\ 1 \amp 2 \end{bmatrix}\text{.}\) This row reduces to \(B=\begin{bmatrix} 1 \amp 1 \\ 0 \amp 1 \end{bmatrix}\) using Gaussian elimination; and it row reduces further to \(C=\begin{bmatrix}1\amp 0\\ 0\amp 1\end{bmatrix}\) using Gauss-Jordan elimination. Thus we see that \(A\) is row equivalent to two different matrices in row echelon form. (According to Theorem 2.2.10, the matrix \(C\) is the only matrix in reduced row echelon that is row equivalent to \(A\text{.}\))
Exercises 2.2.3 Exercises
Exercise Group.
1.
\(A=\begin{bmatrix} 1 \amp 2\amp 2\amp -3\\ 0\amp -3\amp 4\amp 0\\ 0\amp 0\amp 0\amp 1 \end{bmatrix}\)
The first nonzero term in the second row is not a one.
2.
\(A=\begin{bmatrix} 0\amp 1\amp 2\amp -3\\ 0\amp 1\amp 4\amp 0\\ 0\amp 0\amp 0\amp 1 \end{bmatrix}\)
3.
\(A=\begin{bmatrix} 1\amp 1\amp 2\amp -3\\ 0\amp 0\amp 0\amp 0\\ 0\amp 1\amp 1\amp 1 \end{bmatrix}\)
Exercise Group.
4.
First bring the system into standard form:
Then perform Gaussian elimination on the associated augmented matrix:
The corresponding equivalent system is