Theorem. (Orthogonal Decomposition Theorem)
Let \(W\) be a subspace of \(\mathbb R^n\) and \(\overrightarrow{y}\in \mathbb R^n\). Then
\[\overrightarrow{y}=\overrightarrow{w}+\overrightarrow{z}\]
for unique vectors \(\overrightarrow{w}\in W\) and \(\overrightarrow{z}\in W^{\perp}\). Moreover,
if \(\{\overrightarrow{w_1},\overrightarrow{w_2},\ldots,\overrightarrow{w_k}\}\) is an orthogonal basis of \(W\),
then
\[\overrightarrow{w}=\frac{\overrightarrow{y}\cdot \overrightarrow{w_1}}{\overrightarrow{w_1} \cdot \overrightarrow{w_1}}\overrightarrow{w_1}+
\frac{\overrightarrow{y}\cdot \overrightarrow{w_2}}{\overrightarrow{w_2} \cdot \overrightarrow{w_2}}\overrightarrow{w_2}+\cdots+
\frac{\overrightarrow{y}\cdot \overrightarrow{w_k}}{\overrightarrow{w_k} \cdot \overrightarrow{w_k}}\overrightarrow{w_k}
\text{ and } \overrightarrow{z}=\overrightarrow{y}-\overrightarrow{w}.\]
Proof.
Suppose \(\{\overrightarrow{w_1},\overrightarrow{w_2},\ldots,\overrightarrow{w_k}\}\) is an orthogonal basis of \(W\).
Then
\[\overrightarrow{w}=\frac{\overrightarrow{y}\cdot \overrightarrow{w_1}}{\overrightarrow{w_1} \cdot \overrightarrow{w_1}}\overrightarrow{w_1}+
\frac{\overrightarrow{y}\cdot \overrightarrow{w_2}}{\overrightarrow{w_2} \cdot \overrightarrow{w_2}}\overrightarrow{w_2}+\cdots+
\frac{\overrightarrow{y}\cdot \overrightarrow{w_k}}{\overrightarrow{w_k} \cdot \overrightarrow{w_k}}\overrightarrow{w_k} \in \operatorname{Span} \{\overrightarrow{w_1},\overrightarrow{w_2},\ldots,\overrightarrow{w_k}\}=W.\]
Let \(\overrightarrow{z}=\overrightarrow{y}-\overrightarrow{w}\). We show that \(\overrightarrow{z}\in W^{\perp}\).
For \(i=1,2,\ldots,k\),
\[\begin{align*}
\overrightarrow{z} \cdot \overrightarrow{w_i}&= (\overrightarrow{y}-\overrightarrow{w}) \cdot \overrightarrow{w_i}\\
&= \overrightarrow{y} \cdot \overrightarrow{w_i}-\overrightarrow{w} \cdot \overrightarrow{w_i}\\
&= \overrightarrow{y} \cdot \overrightarrow{w_i}-\left( \frac{\overrightarrow{y}\cdot \overrightarrow{w_1}}{\overrightarrow{w_1} \cdot \overrightarrow{w_1}}\overrightarrow{w_1}+
\frac{\overrightarrow{y}\cdot \overrightarrow{w_2}}{\overrightarrow{w_2} \cdot \overrightarrow{w_2}}\overrightarrow{w_2}+\cdots+
\frac{\overrightarrow{y}\cdot \overrightarrow{w_k}}{\overrightarrow{w_k} \cdot \overrightarrow{w_k}}\overrightarrow{w_k} \right) \cdot \overrightarrow{w_i}\\
&= \overrightarrow{y} \cdot \overrightarrow{w_i}-\left(0+\cdots+0+
\frac{\overrightarrow{y}\cdot \overrightarrow{w_i}}{\overrightarrow{w_i} \cdot \overrightarrow{w_i}}\overrightarrow{w_i}\cdot \overrightarrow{w_i}+0+\cdots+0 \right)\\
&= 0.
\end{align*}\]
Since \(\overrightarrow{z} \cdot \overrightarrow{w_i}=0\) for \(i=1,2,\ldots,k\) and every vector of
\(W=\operatorname{Span} \{\overrightarrow{w_1},\overrightarrow{w_2},\ldots,\overrightarrow{w_k}\}\) is a linear combination of
\(\overrightarrow{w_1},\overrightarrow{w_2},\ldots,\overrightarrow{w_k}\), the vector \(\overrightarrow{z}\) is orthogonal to every vector of \(W\),
and consequently \(\overrightarrow{z}\in W^{\perp}\).
To show the uniqueness of the decomposition \(\overrightarrow{y}=\overrightarrow{w}+\overrightarrow{z}\),
let \(\overrightarrow{y}=\overrightarrow{w}'+\overrightarrow{z}'\) for some \(\overrightarrow{w}' \in W\) and
\(\overrightarrow{z}' \in W^{\perp}\).
Then
\[\begin{array}{rrl}
&\overrightarrow{0} = &\overrightarrow{y}-\overrightarrow{y}=(\overrightarrow{w}+\overrightarrow{z})-(\overrightarrow{w}'+\overrightarrow{z}')\\
\implies & \overrightarrow{w}'-\overrightarrow{w} =& \overrightarrow{z}-\overrightarrow{z}' \in W\cap W^{\perp}=\{\overrightarrow{0} \}\\
\implies & \overrightarrow{w}'=\overrightarrow{w}, & \overrightarrow{z}'=\overrightarrow{z}.
\end{array}\]
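The formula in the theorem translates directly into a computation. The following is a minimal Python sketch, not part of the original notes: the helper names `dot` and `orthogonal_decomposition` and the sample vectors are chosen here for illustration, and exact rational arithmetic via `fractions.Fraction` is an assumption made to avoid rounding error.

```python
from fractions import Fraction

def dot(a, b):
    """Dot product of two vectors given as equal-length lists."""
    return sum(x * y for x, y in zip(a, b))

def orthogonal_decomposition(y, basis):
    """Return (w, z) with y = w + z, w in Span(basis) and z orthogonal to it.

    Assumes `basis` is a list of nonzero, pairwise orthogonal vectors.
    """
    y = [Fraction(c) for c in y]
    w = [Fraction(0)] * len(y)
    for b in basis:
        coeff = dot(y, b) / dot(b, b)          # (y . w_i) / (w_i . w_i)
        w = [wc + coeff * bc for wc, bc in zip(w, b)]
    z = [yc - wc for yc, wc in zip(y, w)]      # z = y - w
    return w, z

# Illustrative data: an orthogonal basis of the xy-plane in R^3.
basis = [[1, 1, 0], [1, -1, 0]]
y = [3, 1, 4]
w, z = orthogonal_decomposition(y, basis)
print(w, z)
print([dot(z, b) for b in basis])  # each entry should be zero
```

The final line checks the property established in the proof: the residual \(\overrightarrow{z}\) is orthogonal to every basis vector of \(W\).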
Definition.
Let \(W\) be a subspace of \(\mathbb R^n\). Each vector \(\overrightarrow{y}\in \mathbb R^n\) can be uniquely written
as \(\overrightarrow{y}=\overrightarrow{w}+\overrightarrow{z}\)
where \(\overrightarrow{w}\in W\) and \(\overrightarrow{z}\in W^{\perp}\). The unique vector \(\overrightarrow{w}\in W\) is called
the orthogonal projection of \(\overrightarrow{y}\) onto \(W\) and it is denoted by \(\operatorname{proj}_W \overrightarrow{y}\).
Example.
Let \(\overrightarrow{w}=[2,1]^T\) and \(W=\operatorname{Span} \{\overrightarrow{w}\}\).
For \(\overrightarrow{y}=[2,3]^T\), find \(\operatorname{proj}_W \overrightarrow{y}\) and the orthogonal decomposition
of \(\overrightarrow{y}\) with respect to \(W\).
Solution.
\(\operatorname{proj}_W \overrightarrow{y}=\frac{\overrightarrow{y}\cdot \overrightarrow{w}}{\overrightarrow{w} \cdot \overrightarrow{w}}\overrightarrow{w}=
\frac{7}{5}[2,1]^T \in W\) and \(\overrightarrow{y}-\operatorname{proj}_W \overrightarrow{y}=\frac{1}{5}[-4,8]^T \in W^{\perp}\).
The orthogonal decomposition of \(\overrightarrow{y}\) with respect to \(W\) is
\[\overrightarrow{y}=[2,3]^T=\frac{7}{5}[2,1]^T+\frac{1}{5}[-4,8]^T.\]
Example.
Let \(\overrightarrow{w_1}=\left[\begin{array}{r}2\\3\\0\end{array} \right]\), \(\overrightarrow{w_2}=\left[\begin{array}{r}0\\0\\2\end{array} \right]\),
and \(W=\operatorname{Span} \{\overrightarrow{w_1},\overrightarrow{w_2}\}\). For \(\overrightarrow{y}=\left[\begin{array}{r}1\\0\\1\end{array} \right]\),
find \(\operatorname{proj}_W \overrightarrow{y}\) and the orthogonal decomposition of \(\overrightarrow{y}\)
with respect to \(W\).
Solution.
\[\begin{align*}
\operatorname{proj}_W \overrightarrow{y}&=\frac{\overrightarrow{y}\cdot \overrightarrow{w_1}}{\overrightarrow{w_1} \cdot \overrightarrow{w_1}}\overrightarrow{w_1}
+\frac{\overrightarrow{y}\cdot \overrightarrow{w_2}}{\overrightarrow{w_2} \cdot \overrightarrow{w_2}}\overrightarrow{w_2}\\
&=\frac{2}{13}\left[\begin{array}{r}2\\3\\0\end{array} \right]
+\frac{2}{4}\left[\begin{array}{r}0\\0\\2\end{array} \right]\\
&=\frac{1}{13}\left[\begin{array}{r}4\\6\\13\end{array} \right]\in W,\\
\overrightarrow{y}-\operatorname{proj}_W \overrightarrow{y}&=\frac{1}{13}\left[\begin{array}{r}9\\-6\\0\end{array} \right] \in W^{\perp}.
\end{align*}\]
The orthogonal decomposition of \(\overrightarrow{y}\) with respect to \(W\) is
\[\overrightarrow{y}=\left[\begin{array}{r}1\\0\\1\end{array} \right]
=\frac{1}{13}\left[\begin{array}{r}4\\6\\13\end{array} \right]+\frac{1}{13}\left[\begin{array}{r}9\\-6\\0\end{array} \right].\]
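The projection formula relies on the basis being orthogonal. As a hedged illustration (a sketch added here, with the helper name `projection_formula` and the second basis chosen for this purpose), the Python code below reproduces the computation above exactly and then applies the same formula to the non-orthogonal basis \(\{\overrightarrow{w_1},\overrightarrow{w_1}+\overrightarrow{w_2}\}\) of the same subspace \(W\), where the residual is no longer orthogonal to \(W\).

```python
from fractions import Fraction

def dot(a, b):
    """Dot product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))

def projection_formula(y, vectors):
    """Sum of ((y . v) / (v . v)) v over `vectors`.

    This equals proj_W y only when `vectors` is an orthogonal basis of W.
    """
    total = [Fraction(0)] * len(y)
    for v in vectors:
        coeff = Fraction(dot(y, v), dot(v, v))
        total = [t + coeff * c for t, c in zip(total, v)]
    return total

w1, w2, y = [2, 3, 0], [0, 0, 2], [1, 0, 1]

# Orthogonal basis {w1, w2}: the formula gives the projection.
p = projection_formula(y, [w1, w2])
z = [a - b for a, b in zip(y, p)]
print(p, z)
print(dot(z, w1), dot(z, w2))   # both zero

# Non-orthogonal basis {w1, w1 + w2} of the same subspace W.
w3 = [a + b for a, b in zip(w1, w2)]
q = projection_formula(y, [w1, w3])
r = [a - b for a, b in zip(y, q)]
print(dot(r, w1))               # nonzero, so r is not in W-perp
```

For a basis that is not orthogonal, one would first orthogonalize it (for instance by the Gram-Schmidt process) before applying the formula.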
Corollary.
Let \(W\) be a subspace of \(\mathbb R^n\) with an orthonormal basis
\(\{\overrightarrow{w_1},\overrightarrow{w_2},\ldots,\overrightarrow{w_k}\}\). Let \(U=[\overrightarrow{w_1}\; \overrightarrow{w_2}\; \cdots \;\overrightarrow{w_k}]\).
Then for each \(\overrightarrow{y}\in \mathbb R^n\),
\[\operatorname{proj}_W \overrightarrow{y}=UU^T\overrightarrow{y}=
(\overrightarrow{y}\cdot \overrightarrow{w_1}) \overrightarrow{w_1}
+(\overrightarrow{y}\cdot \overrightarrow{w_2}) \overrightarrow{w_2}
+\cdots+ (\overrightarrow{y}\cdot \overrightarrow{w_k}) \overrightarrow{w_k}.\]
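One way to see why the matrix form agrees with the sum is the following short derivation, which uses only the definition of \(U\). The \(i\)th entry of \(U^T\overrightarrow{y}\) is \(\overrightarrow{w_i}\cdot \overrightarrow{y}\), and the product of a matrix with a vector is a linear combination of the columns of the matrix, so
\[UU^T\overrightarrow{y}
=U\left[\begin{array}{c}
\overrightarrow{w_1}\cdot \overrightarrow{y}\\
\overrightarrow{w_2}\cdot \overrightarrow{y}\\
\vdots\\
\overrightarrow{w_k}\cdot \overrightarrow{y}\end{array}\right]
=(\overrightarrow{y}\cdot \overrightarrow{w_1}) \overrightarrow{w_1}
+(\overrightarrow{y}\cdot \overrightarrow{w_2}) \overrightarrow{w_2}
+\cdots+ (\overrightarrow{y}\cdot \overrightarrow{w_k}) \overrightarrow{w_k}.\]
Because \(\overrightarrow{w_i}\cdot \overrightarrow{w_i}=1\) for an orthonormal basis, this is the formula of the Orthogonal Decomposition Theorem with every denominator equal to \(1\).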
Remark.
Recall that for an \(m\times n\) real matrix \(A\), \(A\overrightarrow{x}=\overrightarrow{b}\) has a solution
if and only if \(\overrightarrow{b} \in \operatorname{CS}\left(A\right)\). So \(A\overrightarrow{x}=\overrightarrow{b}\)
has no solution if and only if \(\overrightarrow{b} \notin \operatorname{CS}\left(A\right)\).
When no solution exists, we instead look for the vector \(\overrightarrow{w}\in \operatorname{CS}\left(A\right)\) that is closest to
\(\overrightarrow{b}\), i.e., the best approximation to \(\overrightarrow{b}\) by a vector
of \(\operatorname{CS}\left(A\right)\).
Theorem. (Best Approximation Theorem)
Let \(W\) be a subspace of \(\mathbb R^n\) and \(\overrightarrow{b}\in \mathbb R^n\). Then
\[\min_{\overrightarrow{w}\in W}\left\lVert\overrightarrow{b}-\overrightarrow{w}\right\rVert
=\left\lVert\overrightarrow{b}-\operatorname{proj}_W \overrightarrow{b}\right\rVert.\]
Proof.
It suffices to show that \(\left\lVert\overrightarrow{b}-\operatorname{proj}_W \overrightarrow{b}\right\rVert
< \left\lVert\overrightarrow{b}- \overrightarrow{w}\right\rVert\) for every \(\overrightarrow{w}\in W\)
with \(\overrightarrow{w}\neq \operatorname{proj}_W \overrightarrow{b}\). Let \(\overrightarrow{w}\in W\)
with \(\overrightarrow{w}\neq \operatorname{proj}_W \overrightarrow{b}\).
Since \(\operatorname{proj}_W \overrightarrow{b}\) and \(\overrightarrow{w}\) both lie in \(W\),
\(\overrightarrow{0}\neq \operatorname{proj}_W \overrightarrow{b}-\overrightarrow{w}\in W\).
By the Orthogonal Decomposition Theorem, \(\overrightarrow{b}-\operatorname{proj}_W \overrightarrow{b} \in W^{\perp}\). Then
\[(\overrightarrow{b}-\operatorname{proj}_W \overrightarrow{b})\cdot (\operatorname{proj}_W \overrightarrow{b}-\overrightarrow{w})=0.\]
By the Pythagorean theorem,
\[\begin{align*}
& \left\lVert(\overrightarrow{b}-\operatorname{proj}_W \overrightarrow{b})+
(\operatorname{proj}_W \overrightarrow{b}-\overrightarrow{w})\right\rVert^2
=\left\lVert\overrightarrow{b}-\operatorname{proj}_W \overrightarrow{b}\right\rVert^2
+\left\lVert\operatorname{proj}_W \overrightarrow{b}-\overrightarrow{w}\right\rVert^2\\
\implies & \left\lVert\overrightarrow{b}-\overrightarrow{w}\right\rVert^2=
\left\lVert\overrightarrow{b}-\operatorname{proj}_W \overrightarrow{b}\right\rVert^2
+\left\lVert\operatorname{proj}_W \overrightarrow{b}-\overrightarrow{w}\right\rVert^2
> \left\lVert\overrightarrow{b}-\operatorname{proj}_W \overrightarrow{b}\right\rVert^2
\end{align*}\]
because \(\operatorname{proj}_W \overrightarrow{b}-\overrightarrow{w}\neq \overrightarrow{0}\).
Thus \(\left\lVert\overrightarrow{b}- \overrightarrow{w}\right\rVert
> \left\lVert\overrightarrow{b}-\operatorname{proj}_W \overrightarrow{b}\right\rVert\).
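The inequality can be spot-checked numerically. The sketch below is an illustration only and proves nothing: it takes \(W=\operatorname{Span}\{[2,3,0]^T,[0,0,2]^T\}\) and \(\overrightarrow{b}=[1,0,1]^T\), samples vectors of \(W\) on a coefficient grid chosen arbitrarily here, and compares squared distances using exact fractions. The names `dist_sq`, `grid`, and `violations` are hypothetical helpers introduced for this check.

```python
from fractions import Fraction
from itertools import product

def dot(a, b):
    """Dot product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))

def dist_sq(a, b):
    """Squared Euclidean distance between a and b."""
    d = [x - y for x, y in zip(a, b)]
    return dot(d, d)

u, v = [2, 3, 0], [0, 0, 2]      # orthogonal basis of W
b = [1, 0, 1]

# Coefficients of proj_W b with respect to the orthogonal basis {u, v}.
s0 = Fraction(dot(b, u), dot(u, u))
t0 = Fraction(dot(b, v), dot(v, v))
proj = [s0 * p + t0 * q for p, q in zip(u, v)]
best = dist_sq(b, proj)

# Sample vectors w = s*u + t*v and record any that are at least as close as proj.
grid = [Fraction(k, 4) for k in range(-8, 9)]
violations = []
for s, t in product(grid, repeat=2):
    w = [s * p + t * q for p, q in zip(u, v)]
    if w != proj and dist_sq(b, w) <= best:
        violations.append((s, t))

print(best, violations)
```

An empty `violations` list is consistent with the theorem: no sampled vector of \(W\) other than \(\operatorname{proj}_W \overrightarrow{b}\) is as close to \(\overrightarrow{b}\).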
Example.
Let \(\overrightarrow{u}=[2,3,0]^T\), \(\overrightarrow{v}=[0,0,2]^T\), and \(W=\operatorname{Span} \{\overrightarrow{u},\overrightarrow{v}\}\).
For \(\overrightarrow{y}=[1,0,1]^T\), find the point on \(W\) closest to \(\overrightarrow{y}\) (the best approximation to \(\overrightarrow{y}\)
by a vector of \(W\)) and find the distance between \(\overrightarrow{y}\) and \(W\).
Solution.
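The point on \(W\) closest to \(\overrightarrow{y}\) is \(\operatorname{proj}_W \overrightarrow{y}\).
Since \(\overrightarrow{u}\cdot \overrightarrow{v}=0\), \(\{\overrightarrow{u},\overrightarrow{v}\}\) is an orthogonal basis of \(W\), so
\[\operatorname{proj}_W \overrightarrow{y}
=\frac{\overrightarrow{y}\cdot \overrightarrow{u}}{\overrightarrow{u} \cdot \overrightarrow{u}}\overrightarrow{u}
+\frac{\overrightarrow{y}\cdot \overrightarrow{v}}{\overrightarrow{v} \cdot \overrightarrow{v}}\overrightarrow{v}
=\frac{2}{13}[2,3,0]^T+\frac{2}{4}[0,0,2]^T
=\frac{1}{13}[4,6,13]^T \in W.\]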
The distance between \(\overrightarrow{y}\) and \(W\) is
\(\left\lVert\overrightarrow{y}-\operatorname{proj}_W \overrightarrow{y}\right\rVert
=\left\lVert\frac{1}{13}[9,-6,0]^T\right\rVert=\frac{\sqrt{117}}{13}=\frac{3}{\sqrt{13}}\).
To find \(\operatorname{proj}_W \overrightarrow{y}\) in an alternative way, note that
\(\left\lbrace \frac{\overrightarrow{u}}{\left\lVert\overrightarrow{u}\right\rVert},\;
\frac{\overrightarrow{v}}{\left\lVert\overrightarrow{v}\right\rVert} \right\rbrace\) is an orthonormal basis of \(W\).
Let \(U=\left[\frac{\overrightarrow{u}}{\left\lVert\overrightarrow{u}\right\rVert}\;
\frac{\overrightarrow{v}}{\left\lVert\overrightarrow{v}\right\rVert} \right]
=\left[\begin{array}{cc}
\frac{2}{\sqrt{13}}&0\\
\frac{3}{\sqrt{13}}&0\\
0&1\end{array}\right]\).
Then
\[\operatorname{proj}_W \overrightarrow{y}=UU^T\overrightarrow{y}=
\left[\begin{array}{cc}
\frac{2}{\sqrt{13}}&0\\
\frac{3}{\sqrt{13}}&0\\
0&1\end{array}\right]
\left[\begin{array}{ccc}
\frac{2}{\sqrt{13}}&\frac{3}{\sqrt{13}}&0\\
0&0&1\end{array}\right]
\left[\begin{array}{r}1\\0\\1\end{array} \right]
=\left[\begin{array}{rr}
\frac{2}{\sqrt{13}}&0\\
\frac{3}{\sqrt{13}}&0\\
0&1\end{array}\right] \left[\begin{array}{c} \frac{2}{\sqrt{13}}\\1\end{array} \right]
=\frac{1}{13}\left[\begin{array}{c}4\\6\\13\end{array} \right].\]
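The matrix computation above can be mirrored in a few lines of Python. This is a floating-point sketch using only the standard library. The helper `normalize` and the list-of-columns representation of \(U\) are assumptions made for illustration, and results agree with the exact values only up to rounding.

```python
import math

u, v, y = [2.0, 3.0, 0.0], [0.0, 0.0, 2.0], [1.0, 0.0, 1.0]

def normalize(a):
    """Return a / ||a|| for a nonzero vector a."""
    n = math.sqrt(sum(c * c for c in a))
    return [c / n for c in a]

# Columns of U: the orthonormal basis u/||u||, v/||v|| of W.
cols = [normalize(u), normalize(v)]

# U^T y: one dot product per column of U.
coeffs = [sum(c * yi for c, yi in zip(col, y)) for col in cols]

# U (U^T y): linear combination of the columns of U.
proj = [sum(k * col[i] for k, col in zip(coeffs, cols)) for i in range(len(y))]

# Distance from y to W.
distance = math.sqrt(sum((yi - pi) ** 2 for yi, pi in zip(y, proj)))

print(proj, distance)
```

Computing \(U(U^T\overrightarrow{y})\) in two steps, as here, avoids forming the \(n\times n\) matrix \(UU^T\) explicitly.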