Section 5.2 Orthogonal bases and orthogonal projection
Subsection 5.2.1 Orthogonal sets
Definition 5.2.1. Orthogonal.
Let \((V,\langle \ , \rangle)\) be an inner product space. Vectors \(\boldv, \boldw\in V\) are orthogonal if \(\langle \boldv, \boldw\rangle =0\text{.}\)
Let \(S\subseteq V\) be a subset of nonzero vectors.
The set \(S\) is orthogonal if \(\langle\boldv,\boldw \rangle=0\) for all \(\boldv\ne\boldw\in S\text{.}\) We say that the elements of \(S\) are pairwise orthogonal in this case.
The set \(S\) is orthonormal if it is both orthogonal and satisfies \(\norm{\boldv}=1\) for all \(\boldv\in S\text{:}\) i.e., \(S\) consists of pairwise orthogonal unit vectors.
Theorem 5.2.2. Orthogonal implies linearly independent.
Let \((V,\langle\ , \rangle)\) be an inner product space. If \(S\) is orthogonal, then \(S\) is linearly independent.
Proof.
Suppose \(\boldv_1, \boldv_2, \dots, \boldv_r\in S\) and \(a_1\boldv_1+a_2\boldv_2+\cdots +a_r\boldv_r=\boldzero\text{.}\) Taking the inner product of both sides with \(\boldv_i\) and using the orthogonality of \(S\text{,}\) we get
\begin{equation*} 0=\langle \boldzero, \boldv_i\rangle =\sum_{j=1}^r a_j\langle \boldv_j,\boldv_i\rangle =a_i\langle \boldv_i,\boldv_i\rangle\text{.} \end{equation*}
Since \(\boldv_i\ne\boldzero\text{,}\) we have \(\langle \boldv_i,\boldv_i\rangle>0\text{,}\) and hence \(a_i=0\) for all \(i\text{,}\) proving that \(S\) is linearly independent.
Subsection 5.2.2 Example
Let \(V=C([0,2\pi])\) with standard inner product \(\langle f, g\rangle=\int_0^{2\pi} f(x)g(x) \, dx\text{.}\)
Let
\begin{equation*} S=\{1, \cos x, \sin x, \cos 2x, \sin 2x, \dots, \cos nx, \sin nx\}\text{.} \end{equation*}
Then \(S\) is orthogonal, hence linearly independent.
Proof.
Using some trig identities (the product-to-sum formulas), one can show the following: for all integers \(j, k\geq 1\) with \(j\ne k\text{,}\)
\begin{equation*} \int_0^{2\pi}\cos jx\cos kx\, dx=\int_0^{2\pi}\sin jx\sin kx\, dx=0\text{,} \end{equation*}
and for all integers \(j, k\geq 1\text{,}\)
\begin{equation*} \int_0^{2\pi}\cos jx\sin kx\, dx=\int_0^{2\pi}\cos jx\, dx=\int_0^{2\pi}\sin jx\, dx=0\text{.} \end{equation*}
These equalities say precisely that the elements of \(S\) are pairwise orthogonal.
Orthogonality holds more generally if we replace the interval \([0,2\pi]\) with any interval of length \(L\text{,}\) and replace \(S\) with
\begin{equation*} S=\left\{1, \cos\left(\tfrac{2\pi x}{L}\right), \sin\left(\tfrac{2\pi x}{L}\right), \cos\left(\tfrac{4\pi x}{L}\right), \sin\left(\tfrac{4\pi x}{L}\right), \dots, \cos\left(\tfrac{2\pi n x}{L}\right), \sin\left(\tfrac{2\pi n x}{L}\right)\right\}\text{.} \end{equation*}
Subsection 5.2.3 Orthogonal bases
Definition 5.2.3. Orthogonal and orthonormal bases.
Let \((V,\langle \ , \rangle)\) be an inner product space. An orthogonal basis (resp., orthonormal basis) of \(V\) is a basis \(B\) that is orthogonal (resp., orthonormal) as a set.
Theorem 5.2.4. Existence of orthonormal bases.
Let \((V,\langle \ , \rangle)\) be an inner product space of dimension \(n\text{.}\)
There is an orthonormal basis for \(V\text{.}\) In fact, any basis of \(V\) can be converted to an orthonormal basis using the Gram-Schmidt procedure.
If \(S\subseteq V\) is an orthogonal set, then there is an orthogonal basis \(B\) containing \(S\text{:}\) i.e., any orthogonal set can be extended to an orthogonal basis.
The proof that every finite-dimensional inner product space has an orthogonal basis is actually a procedure, called the Gram-Schmidt procedure, for converting an arbitrary basis of the space into an orthogonal basis.
Procedure 5.2.5. Gram-Schmidt procedure.
Let \((V, \langle \ , \ \rangle)\) be an inner product space, and let \(B=\{\boldv_1, \boldv_2, \dots, \boldv_n\}\) be a basis for \(V\text{.}\) We can convert \(B\) into an orthogonal basis \(B'=\{\boldw_1, \boldw_2, \dots, \boldw_n\}\text{,}\) and further to an orthonormal basis \(B''=\{\boldu_1, \boldu_2, \dots, \boldu_n\}\text{,}\) as follows:
Set \(\boldw_1=\boldv_1\text{.}\)
-
For \(2\leq r\leq n\) replace \(\boldv_r\) with
\begin{equation*} \boldw_r:=\boldv_r-\frac{\angvec{\boldv_r, \boldw_{r-1}}}{\angvec{\boldw_{r-1},\boldw_{r-1}}}\boldw_{r-1}-\frac{\angvec{\boldv_r, \boldw_{r-2}}}{\angvec{\boldw_{r-2},\boldw_{r-2}}}\boldw_{r-2}-\cdots -\frac{\angvec{\boldv_r, \boldw_{1}}}{\angvec{\boldw_{1},\boldw_{1}}}\boldw_1\text{.} \end{equation*}The resulting set \(B'=\{\boldw_1, \boldw_2, \dots, \boldw_n\}\) is an orthogonal basis of \(V\text{.}\)
-
For each \(1\leq i\leq n\) let
\begin{equation*} \boldu_i=\frac{1}{\norm{\boldw_i}}\,\boldw_i\text{.} \end{equation*}The set \(B''=\{\boldu_1, \boldu_2, \dots, \boldu_n\}\) is an orthonormal basis of \(V\text{.}\)
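The procedure translates directly into code. Below is a minimal NumPy sketch for the dot product on \(\R^n\text{;}\) the function name gram_schmidt and the sample basis are our own choices for illustration, not part of any standard library.
```python
import numpy as np

def gram_schmidt(vectors):
    """Convert a list of linearly independent vectors in R^n (dot product)
    into an orthonormal list, following the Gram-Schmidt procedure."""
    ortho = []
    for v in vectors:
        w = np.array(v, dtype=float)
        # Subtract the projection of v onto each previously computed w_j.
        for u in ortho:
            w = w - (np.dot(w, u) / np.dot(u, u)) * u
        ortho.append(w)
    # Normalize to obtain an orthonormal basis.
    return [w / np.linalg.norm(w) for w in ortho]

# Example: start with the (non-orthogonal) basis {(1,1,0), (1,0,1), (0,1,1)} of R^3.
basis = [(1, 1, 0), (1, 0, 1), (0, 1, 1)]
for u in gram_schmidt(basis):
    print(np.round(u, 4))
```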
Theorem 5.2.6. Calculating with orthogonal bases.
Let \((V, \langle , \rangle )\) be an inner product space, and let \(B=\{\boldv_1,\dots,\boldv_n\}\subseteq V\) be an orthogonal basis.
-
Given any \(\boldv\in V\) we have
\begin{equation*} \boldv=a_1\boldv_1+a_2\boldv_2+\cdots +a_n\boldv_n \end{equation*}where
\begin{equation*} a_i=\frac{\langle \boldv,\boldv_i\rangle}{\langle\boldv_i,\boldv_i\rangle}\text{.} \end{equation*} -
If \(B\) is further assumed to be orthonormal, then
\begin{equation*} a_i=\langle\boldv,\boldv_i\rangle \end{equation*}for each \(1\leq i\leq n\text{.}\)
Subsection 5.2.4 Example
Let \(V=\R^2\) with the standard inner product (aka the dot product).
(a) Verify that \(B'=\{\boldv_1=(\sqrt{3}/2,1/2), \boldv_2=(-1/2,\sqrt{3}/2)\}\) is an orthonormal basis.
(b) Compute \([\boldv]_{B'}\) for \(\boldv=(4,2)\text{.}\)
(c) Compute \(\underset{B\rightarrow B'}{P}\text{,}\) where \(B\) is the standard basis.
Solution.
(a) We compute \(\boldv_1\cdot\boldv_1=3/4+1/4=1\text{,}\) \(\boldv_2\cdot\boldv_2=1/4+3/4=1\text{,}\) and \(\boldv_1\cdot\boldv_2=-\sqrt{3}/4+\sqrt{3}/4=0\text{,}\) so \(B'\) is indeed orthonormal.
(b) Since \(B'\) is orthonormal, \(\boldv=a_1\boldv_1+a_2\boldv_2\) where \(a_1=\boldv\cdot\boldv_1=2\sqrt{3}+1\) and \(a_2=\boldv\cdot\boldv_2=\sqrt{3}-2\text{.}\) Thus \([\boldv]_{B'}=\begin{bmatrix}2\sqrt{3}+1\\ \sqrt{3}-2 \end{bmatrix}\)
(c) As we have seen before, \(\underset{B'\rightarrow B}{P}=\begin{bmatrix}\sqrt{3}/2\amp -1/2\\1/2\amp \sqrt{3}/2 \end{bmatrix}\) (put the elements of \(B'\) in as columns). Hence \(\underset{B\rightarrow B'}{P}=(\underset{B'\rightarrow B}{P})^{-1}=\begin{bmatrix}\sqrt{3}/2\amp 1/2\\-1/2\amp \sqrt{3}/2 \end{bmatrix}\text{.}\)
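A quick numerical check of (b) and (c), assuming NumPy is available; it simply re-does the dot products and the matrix inverse from the solution above.
```python
import numpy as np

v1 = np.array([np.sqrt(3)/2, 1/2])   # first basis vector of B'
v2 = np.array([-1/2, np.sqrt(3)/2])  # second basis vector of B'
v  = np.array([4, 2])

# (b) coordinates of v relative to the orthonormal basis B'
a1, a2 = np.dot(v, v1), np.dot(v, v2)
print(a1, 2*np.sqrt(3) + 1)   # both approximately 4.4641
print(a2, np.sqrt(3) - 2)     # both approximately -0.2679

# (c) change-of-basis matrices
P_BprimeToB = np.column_stack([v1, v2])     # columns are the vectors of B'
P_BToBprime = np.linalg.inv(P_BprimeToB)    # equals the transpose, since B' is orthonormal
print(np.allclose(P_BToBprime, P_BprimeToB.T))  # True
```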
Theorem 5.2.7. Orthogonal matrices.
Let \(A\) be an \(n\times n\) matrix. The following statements are equivalent.
\(A\) is invertible and \(A^{-1}=A^T\text{.}\)
The columns of \(A\) are orthonormal.
The rows of \(A\) are orthonormal.
The columns (resp., rows) of \(A\) form an orthonormal basis of \(\R^n\text{.}\)
Definition 5.2.8. Orthogonal matrices.
An \(n\times n\) matrix \(A\) is orthogonal if it is invertible and \(A^{-1}=A^T\text{.}\) Equivalently, \(A\) is orthogonal if its columns (or rows) are orthonormal.
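For instance, a rotation matrix has orthonormal columns, and hence satisfies \(A^{-1}=A^T\text{.}\) A short NumPy check of this (an illustration of ours, not part of the text's examples):
```python
import numpy as np

t = 0.7  # any angle
A = np.array([[np.cos(t), -np.sin(t)],
              [np.sin(t),  np.cos(t)]])

# Columns are orthonormal, so A^T A = I and A^{-1} = A^T.
print(np.allclose(A.T @ A, np.eye(2)))     # True
print(np.allclose(np.linalg.inv(A), A.T))  # True
```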
Subsection 5.2.6 Orthogonal complement
Definition 5.2.9. Orthogonal complement.
Let \((V,\langle \ , \rangle)\) be an inner product vector space, and let \(W\subseteq V\) be a finite-dimensional subspace. The orthogonal complement of \(W\), denoted \(W^\perp\text{,}\) is defined as
\begin{equation*} W^\perp=\{\boldv\in V\colon \langle \boldv, \boldw\rangle=0 \text{ for all } \boldw\in W\}\text{.} \end{equation*}
In other words \(W^\perp\) is the set of vectors that are orthogonal to all elements of \(W\text{.}\)
Theorem 5.2.10. Orthogonal complement.
Let \((V,\langle \ , \rangle)\) be an inner product vector space, and let \(W\subseteq V\) be a subspace.
The orthogonal complement \(W^\perp\) is a subspace of \(V\text{;}\)
We have \(W\cap W^\perp=\{\boldzero\}\text{.}\)
Subsection 5.2.6.1 Example
Let \(V=\R^3\) equipped with the dot product, and let \(W=\Span\{(1,1,1)\}\subset \R^3\text{.}\) This is the line defined by the vector \((1,1,1)\text{.}\) Then \(W^\perp\) is the set of vectors orthogonal to \((1,1,1)\text{:}\) i.e., the plane perpendicular to \((1,1,1)\text{.}\)
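Concretely, \(W^\perp\) here is the plane with equation \(x+y+z=0\text{.}\) A small NumPy computation (our own illustration) recovers a basis for this plane as the null space of the \(1\times 3\) matrix whose single row is \((1,1,1)\text{:}\)
```python
import numpy as np

# W = span{(1,1,1)}; W-perp is the null space of the 1x3 matrix [1 1 1].
A = np.array([[1.0, 1.0, 1.0]])
_, _, Vt = np.linalg.svd(A)
basis_W_perp = Vt[1:]   # rows of V^T beyond rank(A) = 1 span the null space

for b in basis_W_perp:
    print(np.round(b, 4), "dot with (1,1,1):", round(float(b @ np.array([1, 1, 1])), 10))
```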
Subsection 5.2.7 Geometry of fundamental spaces
The notion of orthogonal complement gives us a new way of understanding the relationship between the various fundamental spaces of a matrix.
Theorem 5.2.11.
Let \(A\) be \(m\times n\text{,}\) and consider \(\R^n\) and \(\R^m\) as inner product spaces with respect to the dot product. Then:
\(\NS(A)=\left(\RS(A)\right)^\perp\text{,}\) and thus \(\RS(A)=\left(\NS(A)\right)^\perp\text{.}\)
\(\NS(A^T)=\left(\CS(A)\right)^\perp\text{,}\) and thus \(\CS(A)=\left(\NS(A^T)\right)^\perp\text{.}\)
Proof.
(i) Using the dot product method of matrix multiplication, we see that a vector \(\boldv\in\NS(A)\) if and only if \(\boldv\cdot\boldr_i=0\) for each row \(\boldr_i\) of \(A\text{.}\) Since the \(\boldr_i\) span \(\RS(A)\text{,}\) the linear properties of the dot product imply that \(\boldv\cdot\boldr_i=0\) for each row \(\boldr_i\) of \(A\) if and only if \(\boldv\cdot\boldw=0\) for all \(\boldw\in\RS(A)\) if and only if \(\boldv\in \RS(A)^\perp\text{.}\)
(ii) This follows from (i) and the fact that \(\CS(A)=\RS(A^T)\text{.}\)
Subsection 5.2.8 Example
Understanding the orthogonal relationship between \(\NS(A)\) and \(\RS(A)\) allows us in many cases to quickly determine or visualize one from the other. Consider the example \(A=\begin{bmatrix}1\amp -1\amp 1\\ 1\amp -1\amp -1 \end{bmatrix}\text{.}\)
Looking at the columns, we see easily that \(\rank(A)=2\text{,}\) which implies that \(\nullity(A)=3-2=1\text{.}\) Since \((1,1,0)\) is an element of \(\NS(A)\) and \(\dim(\NS(A))=1\text{,}\) we must have \(\NS(A)=\Span\{(1,1,0)\}\text{,}\) a line.
By orthogonality, we conclude that
\begin{equation*} \RS(A)=\left(\NS(A)\right)^\perp=\left(\Span\{(1,1,0)\}\right)^\perp\text{,} \end{equation*}
the plane with defining equation \(x+y=0\text{.}\)
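A numerical sanity check of this example, assuming NumPy: we confirm that \((1,1,0)\) lies in the null space and is orthogonal to each row of \(A\text{.}\)
```python
import numpy as np

A = np.array([[1, -1,  1],
              [1, -1, -1]])
v = np.array([1, 1, 0])

print(A @ v)                     # [0 0], so v is in NS(A)
print([row @ v for row in A])    # [0, 0]: v is orthogonal to each row, i.e. to RS(A)
print(np.linalg.matrix_rank(A))  # 2, so nullity(A) = 3 - 2 = 1 and NS(A) = span{v}
```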
Subsection 5.2.9 Orthogonal Projection
Theorem 5.2.12. Orthogonal projection theorem.
Let \((V,\langle \ , \rangle)\) be an inner product space, and let \(W\subseteq V\) be a finite-dimensional subspace.
-
Orthogonal decomposition.
For all \(\boldv\in V\) there is a unique choice of vectors \(\boldw\in W\) and \(\boldw^\perp\in W^\perp\) such that \(\boldv=\boldw+\boldw^\perp\text{.}\) We call this vector expression an orthogonal decomposition of \(\boldv\text{,}\) and denote \(\boldw=\proj{\boldv}{W}\) and \(\boldw^\perp=\proj{\boldv}{W^\perp}\text{,}\) the orthogonal projections of \(\boldv\) onto \(W\) and \(W^\perp\text{,}\) respectively.
-
Distance to \(W\).
The orthogonal projection \(\boldw=\proj{\boldv}{W}\) is the unique element of \(W\) that minimizes the distance to \(\boldv\text{.}\) In other words
\begin{equation*} \norm{\boldv-\proj{\boldv}{W}}\leq\norm{\boldv-\boldw'} \end{equation*}for all \(\boldw'\in W\text{.}\)
Accordingly, we define the distance from \(\boldv\) to \(W\), denoted \(d(\boldv, W)\text{,}\) as
\begin{equation*} d(\boldv, W)=d(\boldv, \proj{\boldv}{W})=\norm{\boldw^\perp}\text{.} \end{equation*} -
Orthogonal projection formula.
Pick an orthogonal basis \(B=\{\boldv_1,\boldv_2,\dots, \boldv_r\}\) of \(W\text{.}\) Then
\begin{equation*} \proj{\boldv}{W}=\sum_{i=1}^r\frac{\angvec{\boldv,\boldv_i}}{\angvec{\boldv_i, \boldv_i}}\boldv_i\text{.} \end{equation*}
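The projection formula translates directly into code. Here is a minimal NumPy sketch for the dot product on \(\R^n\) (the helper name proj_onto_subspace is ours); it assumes the vectors passed in form an orthogonal basis of \(W\text{.}\)
```python
import numpy as np

def proj_onto_subspace(v, orthogonal_basis):
    """Orthogonal projection of v onto W = span(orthogonal_basis),
    assuming the given basis vectors are pairwise orthogonal (dot product)."""
    v = np.asarray(v, dtype=float)
    w = np.zeros_like(v)
    for u in orthogonal_basis:
        u = np.asarray(u, dtype=float)
        w += (np.dot(v, u) / np.dot(u, u)) * u
    return w

# Example: project (1,2,3) onto W = span{(-3,1,2),(1,1,1)}, an orthogonal basis.
print(proj_onto_subspace((1, 2, 3), [(-3, 1, 2), (1, 1, 1)]))
# approximately [0.9286 2.3571 2.7143], i.e. (13/14, 33/14, 38/14)
```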
Subsection 5.2.10 Proof of orthogonal projection theorem
Pick an orthogonal basis \(B=\{\boldv_1,\boldv_2,\dots, \boldv_r\}\) of \(W\) and set \(\boldw=\sum_{i=1}^r\frac{\angvec{\boldv,\boldv_i}}{\angvec{\boldv_i, \boldv_i}}\boldv_i\text{.}\) This is clearly an element of \(W\text{.}\) Next we set \(\boldw^\perp=\boldv-\boldw=\boldv-\sum_{i=1}^r\frac{\angvec{\boldv,\boldv_i}}{\angvec{\boldv_i, \boldv_i}}\boldv_i\text{.}\)
To complete the proof, we must show the following: (A) \(\boldw^\perp\in W^\perp\text{,}\) (B) this choice of \(\boldw\) and \(\boldw^\perp\) is unique, and (C) \(\boldw\) is the closest element of \(W\) to \(\boldv\text{.}\)
Subsection 5.2.10.1 (A)
For all \(i\) we have
\begin{equation*} \langle \boldw^\perp, \boldv_i\rangle=\langle \boldv, \boldv_i\rangle-\sum_{j=1}^r\frac{\angvec{\boldv,\boldv_j}}{\angvec{\boldv_j, \boldv_j}}\langle \boldv_j, \boldv_i\rangle=\langle \boldv, \boldv_i\rangle-\frac{\angvec{\boldv,\boldv_i}}{\angvec{\boldv_i, \boldv_i}}\langle \boldv_i, \boldv_i\rangle=0\text{,} \end{equation*}
using the orthogonality of \(B\text{.}\) Since \(\boldw^\perp\) is orthogonal to each element of the spanning set \(B\) of \(W\text{,}\) linearity of the inner product shows it is orthogonal to all of \(W\text{:}\) i.e., \(\boldw^\perp\in W^\perp\text{.}\)
Subsection 5.2.10.2 (B)+(C)
Recall: \(\boldw\) satisfies \(\boldv=\boldw+\boldw^\perp\text{,}\) where \(\boldw^\perp\in W^\perp\text{.}\) Now take any other \(\boldw'\in W\text{.}\) Then
\begin{equation*} \norm{\boldv-\boldw'}^2=\norm{\boldw^\perp+(\boldw-\boldw')}^2=\norm{\boldw^\perp}^2+\norm{\boldw-\boldw'}^2\geq \norm{\boldw^\perp}^2=\norm{\boldv-\boldw}^2\text{,} \end{equation*}
where the second equality holds since \(\boldw^\perp\in W^\perp\) and \(\boldw-\boldw'\in W\) are orthogonal (the Pythagorean theorem).
Taking square roots now proves the desired inequality. Furthermore, we have equality if and only if the inequality above is an equality, if and only if \(\norm{\boldw-\boldw'}=0\text{,}\) if and only if \(\boldw=\boldw'\text{.}\) This proves our choice of \(\boldw\) is the unique element of \(W\) minimizing the distance to \(\boldv\text{!}\) Uniqueness of the decomposition follows similarly: if \(\boldv=\boldw_1+\boldw_1^\perp\) with \(\boldw_1\in W\) and \(\boldw_1^\perp\in W^\perp\text{,}\) then \(\boldw-\boldw_1=\boldw_1^\perp-\boldw^\perp\in W\cap W^\perp=\{\boldzero\}\text{,}\) and hence \(\boldw_1=\boldw\) and \(\boldw_1^\perp=\boldw^\perp\text{.}\)
Corollary 5.2.13.
Let \((V,\angvec{\ , \ })\) be an inner product space, and let \(W\subseteq V\) be a finite-dimensional subspace. Then \((W^\perp)^\perp=W\text{.}\)
Proof.
Clearly \(W\subseteq (W^\perp)^\perp\text{.}\) For the other direction, take \(\boldv\in (W^\perp)^\perp\text{.}\) Using the orthogonal projection theorem, we can write \(\boldv=\boldw+\boldw^\perp\) with \(\boldw\in W\) and \(\boldw^\perp\in W^\perp\text{.}\) We will show \(\boldw^\perp=\boldzero\text{.}\)
Since \(\boldv\in (W^\perp)^\perp\) we have \(\angvec{\boldv,\boldw^\perp}=0\text{.}\) Then we have
\begin{equation*} 0=\angvec{\boldv,\boldw^\perp}=\angvec{\boldw+\boldw^\perp,\boldw^\perp}=\angvec{\boldw,\boldw^\perp}+\angvec{\boldw^\perp,\boldw^\perp}=\angvec{\boldw^\perp,\boldw^\perp}\text{,} \end{equation*}
since \(\angvec{\boldw,\boldw^\perp}=0\text{.}\)
Thus \(\angvec{\boldw^\perp,\boldw^\perp}=0\text{.}\) It follows that \(\boldw^\perp=\boldzero\text{,}\) and hence \(\boldv=\boldw+\boldzero=\boldw\in W\text{.}\)
Corollary 5.2.14.
Let \((V,\angvec{\ , \ })\) be an inner product space, and let \(W\subseteq V\) be a finite-dimensional subspace.
Define \(T\colon V\rightarrow V\) as \(T(\boldv)=\proj{\boldv}{W}\text{.}\) Then \(T\) is a linear transformation.
In other words, orthogonal projection onto \(W\) defines a linear transformation of \(V\text{.}\)
Proof.
We must show that \(T(c\boldv_1+d\boldv_2)=cT(\boldv_1)+dT(\boldv_2)\) for all \(c,d\in\R\) and \(\boldv_1,\boldv_2\in V\text{.}\) This is easily shown by picking an orthogonal basis \(B=\{\boldw_1,\boldw_2, \dots, \boldw_r\}\) of \(W\) and using the formula from the orthogonal projection theorem.
Subsection 5.2.11 Projection onto lines and planes in $\R^3$
Let's revisit orthogonal projection onto lines and planes in \(\R^3\) passing through the origin. Here the relevant inner product is dot product.
Subsection 5.2.12 Projection onto a line $\ell$
Any line in \(\R^3\) passing through the origin can be described as \(\ell=\Span\{\boldv_0\}\text{,}\) for some \(\boldv_0=(a,b,c)\ne 0\text{.}\) Since \(\{\boldv_0\}\) is an orthogonal basis of \(\ell\text{,}\) by the orthogonal projection theorem we have, for any \(\boldv=(x,y,z)\text{,}\)
\begin{equation*} \proj{\boldv}{\ell}=\frac{\boldv\cdot\boldv_0}{\boldv_0\cdot\boldv_0}\boldv_0=\frac{ax+by+cz}{a^2+b^2+c^2}(a,b,c)=\frac{1}{a^2+b^2+c^2}\begin{bmatrix}a^2\amp ab\amp ac\\ ab\amp b^2\amp bc\\ ac\amp bc\amp c^2 \end{bmatrix}\begin{bmatrix}x\\ y\\ z \end{bmatrix}\text{.} \end{equation*}
We have re-derived the matrix formula for orthogonal projection onto \(\ell\text{.}\)
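A quick check of this matrix formula, assuming NumPy: the outer product \(\boldv_0\boldv_0^T/(\boldv_0\cdot\boldv_0)\) is exactly the matrix above, and it reproduces the projection formula.
```python
import numpy as np

v0 = np.array([1.0, 2.0, 2.0])               # direction vector (a, b, c) of the line
P_line = np.outer(v0, v0) / np.dot(v0, v0)   # projection matrix onto span{v0}

v = np.array([3.0, 0.0, 1.0])
print(P_line @ v)                             # matrix formula
print((np.dot(v, v0) / np.dot(v0, v0)) * v0)  # same answer via the projection formula
```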
Subsection 5.2.14 Projection onto a plane
Any plane in \(\R^3\) passing through the origin can be described with the equation \(\mathcal{P}\colon ax+by+cz=0\) for some \(\boldn=(a,b,c)\ne 0\text{.}\) This says precisely that \(\mathcal{P}\) is the orthogonal complement of the line \(\ell=\Span\{(a,b,c)\}\text{:}\) i.e., \(\mathcal{P}=\ell^\perp\text{.}\)
From the orthogonal projection theorem, we know that
\begin{equation*} \boldv=\proj{\boldv}{\ell}+\proj{\boldv}{\mathcal{P}}\text{.} \end{equation*}
But then
\begin{equation*} \proj{\boldv}{\mathcal{P}}=\boldv-\proj{\boldv}{\ell}=\boldv-A\boldv=(I-A)\boldv\text{,} \end{equation*}
where \(A\) is the matrix formula for \(\proj{\boldv}{\ell}\) from the previous example. We conclude that the matrix defining \(\proj{\boldv}{\mathcal{P}}\) is \(I-A\text{.}\)
We can express this in terms of matrix multiplication as
\begin{equation*} \proj{\boldv}{\mathcal{P}}=\frac{1}{a^2+b^2+c^2}\begin{bmatrix}b^2+c^2\amp -ab\amp -ac\\ -ab\amp a^2+c^2\amp -bc\\ -ac\amp -bc\amp a^2+b^2 \end{bmatrix}\begin{bmatrix}x\\ y\\ z \end{bmatrix}\text{.} \end{equation*}
If the line or plane instead passes through a point \(Q=(q_1,q_2,q_3)\) other than the origin, we can still project a point \(P=(x,y,z)\) onto it:
Translate the whole picture by \(-Q=(-q_1,-q_2, -q_3)\text{,}\) which means we replace \(P=(x,y,z)\) with \(P-Q=(x-q_1,y-q_2,z-q_3)\text{.}\)
Apply our formulas from before, replacing \((x,y,z)\) with \((x-q_1,y-q_2,z-q_3)\text{.}\)
Translate back by adding \(Q\) to your answer.
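A short NumPy sketch tying these formulas together (an illustration of ours): project onto a plane \(ax+by+cz=0\) via \(I-A\text{,}\) and handle a plane through a point \(Q\) by the translate, project, translate-back recipe above.
```python
import numpy as np

def proj_plane_through_origin(v, n):
    """Project v onto the plane through the origin with normal vector n = (a, b, c)."""
    n = np.asarray(n, dtype=float)
    A = np.outer(n, n) / np.dot(n, n)   # projection onto the line spanned by n
    return (np.eye(3) - A) @ np.asarray(v, dtype=float)

def proj_plane_through_Q(P, n, Q):
    """Project the point P onto the plane through Q with normal n:
    translate by -Q, project, then translate back."""
    P, Q = np.asarray(P, dtype=float), np.asarray(Q, dtype=float)
    return proj_plane_through_origin(P - Q, n) + Q

print(proj_plane_through_origin((1, 1, 1), (0, 0, 1)))       # [1. 1. 0.]
print(proj_plane_through_Q((1, 1, 1), (0, 0, 1), (0, 0, 5))) # [1. 1. 5.]
```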
Subsection 5.2.15 Example: sine/cosine series
Let \(V=C[0,2\pi]\) with inner product \(\langle f, g\rangle=\int_0^{2\pi}f(x)g(x) \, dx\text{.}\)
We have seen that the set
\begin{equation*} B=\{1, \cos x, \sin x, \cos 2x, \sin 2x, \dots, \cos nx, \sin nx\} \end{equation*}
is orthogonal. Thus \(B\) is an orthogonal basis of \(W=\Span(B)\text{,}\) which we might describe as the space of trigonometric polynomials of degree at most \(n\text{.}\)
Given an arbitrary function \(f(x)\in C[0,2\pi]\text{,}\) its orthogonal projection onto \(W\) is the function
\begin{equation*} \hat{f}(x)=\proj{f}{W}=a_0+\sum_{k=1}^n\left(a_k\cos kx+b_k\sin kx\right)\text{,} \end{equation*}
where
\begin{equation*} a_0=\frac{\langle f, 1\rangle}{\langle 1,1\rangle}=\frac{1}{2\pi}\int_0^{2\pi}f(x)\, dx, \qquad a_k=\frac{1}{\pi}\int_0^{2\pi}f(x)\cos kx\, dx, \qquad b_k=\frac{1}{\pi}\int_0^{2\pi}f(x)\sin kx\, dx\text{.} \end{equation*}
The projection theorem tells us that \(\hat{f}\) is the “best” trigonometric polynomial approximation of \(f(x)\) (of degree at most \(n\)), in the sense that for any other trigonometric polynomial \(g\in W\text{,}\) \(\left\vert\left\vert f-\hat{f}\right\vert\right\vert\leq \norm{f-g}\text{.}\)
This means in turn
\begin{equation*} \int_0^{2\pi}\left(f(x)-\hat{f}(x)\right)^2\, dx\leq \int_0^{2\pi}\left(f(x)-g(x)\right)^2\, dx \end{equation*}
for all \(g\in W\text{.}\)
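A numerical illustration, assuming NumPy (the sample function and degree are our own choices): compute the projection coefficients by approximating the integrals with a Riemann sum, then check that the squared error for \(\hat f\) is no larger than for another element of \(W\text{.}\)
```python
import numpy as np

x = np.linspace(0, 2*np.pi, 20000)
dx = x[1] - x[0]
f = x          # sample function f(x) = x on [0, 2*pi]
n = 3          # degree of the trigonometric approximation

# Projection coefficients via (Riemann-sum) integrals.
a0 = np.sum(f) * dx / (2*np.pi)
a = [np.sum(f * np.cos(k*x)) * dx / np.pi for k in range(1, n+1)]
b = [np.sum(f * np.sin(k*x)) * dx / np.pi for k in range(1, n+1)]

f_hat = a0 + sum(a[k-1]*np.cos(k*x) + b[k-1]*np.sin(k*x) for k in range(1, n+1))

# The projection minimizes the integral of (f - g)^2 over g in W.
g_other = np.pi + np.sin(x)          # some other element of W
err_hat   = np.sum((f - f_hat)**2) * dx
err_other = np.sum((f - g_other)**2) * dx
print(err_hat <= err_other)          # True
```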
Subsection 5.2.16 Example: least-squares solution to $A\boldx=\boldy$
Often in applications we have an \(m\times n\) matrix \(A\) and vector \(\boldy\in\R^m\) for which the matrix equation
\begin{equation*} A\boldx=\boldy \end{equation*}
has no solution. In terms of fundamental spaces, this means simply that \(\boldy\notin \CS(A)\text{.}\) Set \(W=\CS(A)\text{.}\)
In such situations we speak of a least-squares solution to the matrix equation. This is a vector \(\hat{\boldx}\) such that \(A\hat{\boldx}=\hat{\boldy}\text{,}\) where \(\hat{\boldy}=\proj{\boldy}{W}\text{.}\) Here the inner product is taken to be the dot product.
Note: the equation \(A\hat{\boldx}=\hat{\boldy}\) is guaranteed to have a solution since \(\hat{\boldy}=\proj{\boldy}{W}\) lies in \(\CS(A)\text{.}\)
The vector \(\hat{\boldx}\) is called a least-squares solution because its image \(\hat{\boldy}\) is the element of \(\CS(A)\) that is “closest” to \(\boldy\) in terms of the dot product. Writing \(\boldy=(y_1,y_2,\dots,y_m)\) and \(\hat{\boldy}=(y_1',y_2',\dots, y_m')\text{,}\) this means that \(\hat{\boldy}\) minimizes the distance
\begin{equation*} d(\boldy,\hat{\boldy})=\norm{\boldy-\hat{\boldy}}=\sqrt{(y_1-y_1')^2+(y_2-y_2')^2+\cdots +(y_m-y_m')^2} \end{equation*}
among all elements of \(\CS(A)\text{.}\)
Subsection 5.2.17 Least-squares example (curve fitting)
Suppose we wish to find an equation of a line \(y=mx+b\) that best fits (in the least-squares sense) the following \((x,y)\) data points: \(P_1=(-3,1), P_2=(1,2), P_3=(2,3)\text{.}\)
Then we seek \(m\) and \(b\) such that
\begin{equation*} -3m+b=1, \qquad m+b=2, \qquad 2m+b=3\text{,} \end{equation*}
or equivalently, we wish to solve \(\begin{bmatrix}-3\amp 1\\ 1\amp 1\\ 2\amp 1 \end{bmatrix} \begin{bmatrix}m \\ b \end{bmatrix} =\begin{bmatrix}1\\ 2\\ 3 \end{bmatrix}\text{.}\)
This equation has no solution as \(\boldy=(1,2,3)\) does not lie in \(W=\CS(A)=\Span\{(-3,1,2),(1,1,1)\}\text{.}\) So instead we compute \(\hat{\boldy}=\proj{\boldy}{W}=(13/14,33/14,38/14)\text{.}\) (This was not hard to compute since, conveniently, the given basis of \(W\) was already orthogonal!)
Finally we solve \(A\begin{bmatrix}m\\ b \end{bmatrix} =\hat{\boldy}\text{,}\) getting \(m=5/14\text{,}\) \(b=28/14=2\text{.}\) Thus \(y=\frac{5}{14}x+2\) is the line best fitting the data in the least-squares sense.
Subsection 5.2.18 Least-squares example contd.
In what sense does \(y=\frac{5}{14}x+2\) “best” fit the data?
Let \(\boldy=(1,2,3)=(y_1,y_2,y_3)\) be the given \(y\)-values of the points, and \(\hat{\boldy}=(y_1',y_2',y_3')\) be the projection we computed before. In the graph the values \(\epsilon_i\) denote the vertical difference \(\epsilon_i=y_i-y_i'\) between the data points and our fitting line.
The projection \(\hat{\boldy}\) makes the error \(\norm{\boldy-\hat{\boldy}}=\sqrt{ \epsilon_1^2+\epsilon_2^2+\epsilon_3^2}\) as small as possible.
This means if I draw any other line and compute the corresponding differences \(\epsilon_i'\) at the \(x\)-values \(-3\text{,}\) \(1\) and \(2\text{,}\) then we have
\begin{equation*} \sqrt{\epsilon_1^2+\epsilon_2^2+\epsilon_3^2}\leq \sqrt{(\epsilon_1')^2+(\epsilon_2')^2+(\epsilon_3')^2}\text{.} \end{equation*}
Subsection 5.2.19 Finding least squares solutions
As the last example illustrated, one method of finding a least-squares solution \(\boldx\) to \(A\boldx=\boldy\) is to first produce an orthogonal basis for \(\CS(A)\text{,}\) then compute \(\hat{\boldy}=\proj{\boldy}{\CS(A)}\text{,}\) and then use GE to solve \(A\boldx=\hat{\boldy}\text{.}\)
Alternatively, it turns out (through a little trickery) that \(\hat{\boldy}=A\boldx\text{,}\) where \(\boldx\) is a solution to the normal equation
\begin{equation*} A^TA\boldx=A^T\boldy\text{.} \end{equation*}
This saves us the hassle of computing an orthogonal basis for \(\CS(A)\text{:}\) to find a least-squares solution \(\boldx\) for \(A\boldx=\boldy\text{,}\) we simply use GE to solve the equation \(A^TA\boldx=A^T\boldy\text{.}\) (Some more trickery shows a solution is guaranteed to exist!)
Subsection 5.2.19.1 Example
In the previous example we were seeking a least-squares solution \(\boldx=\colvec{m\\ b}\) to \(A\boldx=\boldy\text{,}\) where \(A=\begin{bmatrix}-3\amp 1\\ 1\amp 1\\ 2\amp 1 \end{bmatrix} , \boldy=\colvec{1\\2\\3}\text{.}\)
The equation \(A^TA\boldx=A^T\boldy\) is thus
\begin{equation*} \begin{bmatrix}14\amp 0\\ 0\amp 3 \end{bmatrix}\begin{bmatrix}m\\ b \end{bmatrix}=\begin{bmatrix}5\\ 6 \end{bmatrix}\text{.} \end{equation*}
As you can see, \(\boldx=\colvec{m\\ b}=\colvec{5/14\\ 2}\) is a least-squares solution, just as before.
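The same computation in NumPy, assuming the setup above: solving the normal equation \(A^TA\boldx=A^T\boldy\) directly agrees with NumPy's built-in least-squares solver.
```python
import numpy as np

A = np.array([[-3, 1],
              [ 1, 1],
              [ 2, 1]], dtype=float)
y = np.array([1, 2, 3], dtype=float)

# Solve the normal equation A^T A x = A^T y.
x_normal = np.linalg.solve(A.T @ A, A.T @ y)
print(x_normal)                          # [0.35714286 2.], i.e. m = 5/14, b = 2

# Built-in least-squares solver gives the same result.
x_lstsq, *_ = np.linalg.lstsq(A, y, rcond=None)
print(np.allclose(x_normal, x_lstsq))    # True
```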
Exercises 5.2.20 Exercises
1.
The vectors
are pairwise orthogonal with respect to the dot product, as is easily verified. For each \(\boldv\) below, find the scalars \(c_i\) such that
\(\displaystyle \boldv=(3,0,-1,0)\)
\(\displaystyle \boldv=(1,2,0,1)\)
\(\boldv=(a,b,c,d)\) (Your answer will be expressed in terms of \(a,b,c\text{,}\) and \(d\text{.}\) )
2.
Consider the inner product space given by \(V=\R^3\) together with the dot product. Let \(W\) be the plane with defining equation \(x+2y-z=0\text{.}\) Compute an orthogonal basis of \(W\text{,}\) and then extend this to an orthogonal basis of \(\R^3\text{.}\)
3.
Consider the vector space \(V=C([0,1])\) with the integral inner product. Apply Gram-Schmidt to the basis \(B=\{1,2^x, 3^x\}\) of \(W=\Span(B)\) to obtain an orthogonal basis of \(W\text{.}\)
The resulting orthogonal basis is \(B'=\{f_1, f_2,f_3\}\text{,}\) where
OK, I admit, I used technology to compute those integrals.
4.
Consider the vector space \(V=P_2\) with the evaluation at \(-1, 0, 1\) inner product:
\begin{equation*} \langle p, q\rangle=p(-1)q(-1)+p(0)q(0)+p(1)q(1)\text{.} \end{equation*}
Apply Gram-Schmidt to the standard basis of \(P_2\) to obtain an orthogonal basis of \(P_2\text{.}\)
5.
Let \(V=M_{22}\) with inner product \(\angvec{A,B}=\tr(A^TB)\text{,}\) and let \(W\subseteq V\) be the subspace of matrices whose trace is 0.
Compute an orthogonal basis for \(W\text{.}\) You can do this either by inspection (the space is manageable), or by starting with a simple basis of \(W\) and applying the Gram-Schmidt procedure.
-
Compute \(\proj{A}{W}\text{,}\) where
\begin{equation*} A=\begin{bmatrix}1\amp 2\\ 1\amp 1 \end{bmatrix}\text{.} \end{equation*}
6.
Let \(V=C([0,1])\) with the integral inner product, and let \(f(x)=x\text{.}\) Find the function of the form \(g(x)=a+b\cos(2\pi x)+c\sin(2\pi x)\) that “best approximates” \(f(x)\) in terms of this inner product: i.e., find the \(g(x)\) of this form that minimizes \(d(g,f)\text{.}\)
The set \(S=\{f(x)=1, g(x)=\cos(2\pi x), h(x)=\sin(2\pi x)\}\) is orthogonal with respect to the given inner product.
7.
Let \((V,\langle , \rangle )\) be an inner product space. Prove: if \(\angvec{\boldv, \boldw}=0\text{,}\) then
\begin{equation*} \norm{\boldv+\boldw}^2=\norm{\boldv}^2+\norm{\boldw}^2\text{.} \end{equation*}
This result can be thought of as the Pythagorean theorem for general inner product spaces.
8.
Let \((V, \langle , \rangle )\) be an inner product space, let \(S=\{\boldw_1, \boldw_2, \dots, \boldw_r\}\subseteq V\text{,}\) and let \(W=\Span S\text{.}\) Prove:
\begin{equation*} W^\perp=\{\boldv\in V\colon \angvec{\boldv,\boldw_i}=0 \text{ for all } 1\leq i\leq r\}\text{.} \end{equation*}
In other words, to check whether an element is in \(W^\perp\text{,}\) it suffices to check that it is orthogonal to each element of its spanning set \(S\text{.}\)
9.
Let \((V, \langle , \rangle )\) be an inner product space, and suppose \(B=\{\boldv_1, \boldv_2, \dots, \boldv_n\}\) is an orthonormal basis of \(V\text{.}\) Suppose \(\boldv, \boldw\in V\) satisfy
\begin{equation*} \boldv=\sum_{i=1}^nc_i\boldv_i, \qquad \boldw=\sum_{i=1}^nd_i\boldv_i\text{.} \end{equation*}
-
Prove:
\begin{equation*} \langle \boldv, \boldw\rangle =\sum_{i=1}^nc_id_i\text{.} \end{equation*} -
Prove:
\begin{equation*} \norm{\boldv}=\sqrt{\sum_{i=1}^nc_i^2}\text{.} \end{equation*}
10.
Prove both statements of Theorem 5.2.10.
11.
Prove Corollary 5.2.14 following the suggestion in the text.
12.
Let \(V\) be an inner product space, and let \(W\subseteq V\) be a finite-dimensional subspace. Recall that \(\proj{\boldv}{W}\) is defined as the unique \(\boldw\in W\) satisfying \(\boldv=\boldw+\boldw^\perp\text{,}\) where \(\boldw^\perp\in W^\perp\text{.}\) Use this definition (including the uniqueness claim) to prove the following statements.
If \(\boldv\in W\text{,}\) then \(\proj{\boldv}{W}=\boldv\text{.}\)
We have \(\boldv\in W^\perp\) if and only if \(\proj{\boldv}{W}=\boldzero\text{.}\)
13. Dimension of \(W^\perp\).
Let \((V, \ \angvec{\ , \ })\) be an inner product space of dimension \(n\text{,}\) and suppose \(W\subseteq V\) is a subspace of dimension \(r\text{.}\) Prove: \(\dim W^\perp=n-r\text{.}\)
Begin by picking an orthogonal basis \(B=\{\boldv_1,\dots ,\boldv_r\}\) of \(W\) and extend to an orthogonal basis \(B'=\{\boldv_1,\boldv_2, \dots, \boldv_r, \boldu_1,\dots , \boldu_{n-r}\}\) of all of \(V\text{.}\) Show the \(\boldu_i\) form a basis for \(W^\perp\text{.}\)
14.
We consider the problem of fitting a collection of data points \((x,y)\) with a quadratic curve of the form \(y=f(x)=ax^2+bx+c\text{.}\) Thus we are given some collection of points \((x,y)\text{,}\) and we seek parameters \(a, b, c\) for which the graph of \(f(x)=ax^2+bx+c\) “best fits” the points in some way.
Show, using linear algebra, that if we are given any three points \((x,y)=(r_1,s_1), (r_2,s_2), (r_3,s_3)\text{,}\) where the \(x\)-coordinates \(r_i\) are all distinct, then there is a unique choice of \(a,b,c\) such that the corresponding quadratic function agrees precisely with the data. In other words, given just about any three points in the plane, there is a unique quadratic curve connecting them.
-
Now suppose we are given the four data points
\begin{equation*} P_1=(0,2), P_2=(1,0), P_3=(2,2), P_4=(3,6)\text{.} \end{equation*}Use the least-squares method described in the lecture notes to come up with a quadratic function \(y=f(x)\) that “best fits” the data.
Graph the function \(f\) you found, along with the points \(P_i\text{.}\) (You may want to use technology.) Use your graph to explain precisely in what sense \(f\) “best fits” the data.