Theorem (Gram-Schmidt Process).
Let \(W\) be a subspace of \(\mathbb R^n\) with a basis \(\{\overrightarrow{w_1},\overrightarrow{w_2},\ldots,\overrightarrow{w_k}\}\).
There is an orthogonal basis \(\{\overrightarrow{v_1},\overrightarrow{v_2},\ldots,\overrightarrow{v_k}\}\) of
\(W\) where
\[\overrightarrow{v_1}=\overrightarrow{w_1} \text{ and } \overrightarrow{v_i}=\overrightarrow{w_i}-\sum_{j=1}^{i-1} \frac{\overrightarrow{w_i}\cdot \overrightarrow{v_j}}{\overrightarrow{v_j}\cdot \overrightarrow{v_j}} \overrightarrow{v_j},\; i=2,3,\ldots,k.\]
Moreover, \(\operatorname{Span} \{\overrightarrow{v_1},\overrightarrow{v_2},\ldots,\overrightarrow{v_i}\}
=\operatorname{Span} \{\overrightarrow{w_1},\overrightarrow{w_2},\ldots,\overrightarrow{w_i}\}\) for \( i=1,2,\ldots,k\).
Let \(W_i=\operatorname{Span} \{\overrightarrow{w_1},\overrightarrow{w_2},\ldots,\overrightarrow{w_i}\}\)
for \( i=1,2,\ldots,k\). By finite induction, we prove that \(W_i=\operatorname{Span} \{\overrightarrow{v_1},\overrightarrow{v_2},\ldots,\overrightarrow{v_i}\}\)
and \(\{\overrightarrow{v_1},\overrightarrow{v_2},\ldots,\overrightarrow{v_i}\}\) is an orthogonal set for each
\( i=1,2,\ldots,k\).
Since \(\overrightarrow{v_1}=\overrightarrow{w_1}\), \(W_1=\operatorname{Span} \{\overrightarrow{w_1}\}=\operatorname{Span} \{\overrightarrow{v_1}\}\)
and \(\{\overrightarrow{v_1}\}=\{\overrightarrow{w_1}\}\) is an orthogonal set. So the statement is true for \(i=1\).
Suppose the statement is true for some \(j < k\), i.e., \(W_j=\operatorname{Span} \{\overrightarrow{v_1},\overrightarrow{v_2},\ldots,\overrightarrow{v_j}\}\)
and \(\{\overrightarrow{v_1},\overrightarrow{v_2},\ldots,\overrightarrow{v_j}\}\) is an orthogonal set.
We prove the statement is true for \(i=j+1\). Note that
\[\overrightarrow{v_{j+1}}=\overrightarrow{w_{j+1}}-\sum_{t=1}^{j} \frac{\overrightarrow{w_{j+1}}\cdot \overrightarrow{v_t}}{\overrightarrow{v_t}\cdot \overrightarrow{v_t}} \overrightarrow{v_t}=\overrightarrow{w_{j+1}}-\operatorname{proj}_{W_j} \overrightarrow{w_{j+1}}.\]
Since \(\overrightarrow{w_{j+1}} \notin W_j=\operatorname{Span} \{\overrightarrow{v_1},\overrightarrow{v_2},\ldots,\overrightarrow{v_j}\}\),
by the orthogonal decomposition we have
\[\overrightarrow{v_{j+1}}=\overrightarrow{w_{j+1}}-\operatorname{proj}_{W_j} \overrightarrow{w_{j+1}} \in W_j^{\perp}.\]
Then \(\overrightarrow{v_{j+1}}\) is orthogonal to each of \(\overrightarrow{v_1},\overrightarrow{v_2},\ldots,\overrightarrow{v_j}\)
which are in \(W_j\). Thus \(\{\overrightarrow{v_1},\overrightarrow{v_2},\ldots,\overrightarrow{v_{j+1}}\}\) is an
orthogonal set of \(j+1\) vectors in \((j+1)\)-dimensional subspace \(W_{j+1}\) and consequently it spans \(W_{j+1}\)
(in fact it forms an orthogonal basis of \(W_{j+1}\)).
Remark.
To find an orthonormal basis, normalize each vector of an orthogonal basis by making each vector a unit vector.
Example.
Find an orthogonal basis of \(\operatorname{CS}\left(A\right)\) for
\(A=\left[\begin{array}{rrr}
3&1&0\\ 4&0&-1\\0&2&0\end{array}\right]\).
Let \(\overrightarrow{w_1}=[3,4,0]^T\), \(\overrightarrow{w_2}=[1,0,2]^T\), and \(\overrightarrow{w_3}=[0,-1,0]^T\).
Since the columns \( \overrightarrow{w_1},\overrightarrow{w_2},\overrightarrow{w_3}\) of \(A\) are linearly indpendent,
they form a basis of \(\operatorname{CS}\left(A\right)\).
Let \(\overrightarrow{v_1}=\overrightarrow{w_1}\) and \(W_1=\operatorname{Span}\{\overrightarrow{v_1}\}\).
Let \(\overrightarrow{v_2}=\overrightarrow{w_2}-\operatorname{proj}_{W_1} \overrightarrow{w_2}
=\overrightarrow{w_2}- \frac{\overrightarrow{w_2}\cdot \overrightarrow{v_1}}{\overrightarrow{v_1}\cdot \overrightarrow{v_1}} \overrightarrow{v_1}
=\left[\begin{array}{r}1\\0\\2\end{array} \right]-\frac{3}{25}\left[\begin{array}{r}3\\4\\0\end{array} \right]
=\frac{1}{25}\left[\begin{array}{r}16\\-12\\50\end{array} \right]\) and \(W_2=\operatorname{Span}\{\overrightarrow{v_1},\overrightarrow{v_2}\}\).
\[\begin{align*}
\text{Let }
\overrightarrow{v_3}=\overrightarrow{w_3}-\operatorname{proj}_{W_2} \overrightarrow{w_3}
&=\overrightarrow{w_3}- \frac{\overrightarrow{w_3}\cdot \overrightarrow{v_1}}{\overrightarrow{v_1}\cdot \overrightarrow{v_1}} \overrightarrow{v_1}
- \frac{\overrightarrow{w_3}\cdot \overrightarrow{v_2}}{\overrightarrow{v_2}\cdot \overrightarrow{v_2}} \overrightarrow{v_2}\\
&=\left[\begin{array}{r}0\\-1\\0\end{array} \right]-\frac{-4}{25}\left[\begin{array}{r}3\\4\\0\end{array} \right]-\frac{12/25}{2900/25^2}\frac{1}{25}\left[\begin{array}{r}16\\-12\\50\end{array} \right]\\
&=\frac{1}{29}\left[\begin{array}{r}12\\-9\\-6\end{array} \right]
\end{align*}\]
Thus an orthogonal basis of \(\operatorname{CS}\left(A\right)\) is
\[\{\overrightarrow{v_1},\overrightarrow{v_2},\overrightarrow{v_3}\}
=\left\lbrace\left[\begin{array}{r}3\\4\\0\end{array} \right],\frac{1}{25}\left[\begin{array}{r}16\\-12\\50\end{array} \right], \frac{1}{29}\left[\begin{array}{r}12\\-9\\-6\end{array} \right] \right\rbrace \text{ or simply }
\left\lbrace\left[\begin{array}{r}3\\4\\0\end{array} \right], \left[\begin{array}{r}16\\-12\\50\end{array} \right], \left[\begin{array}{r}12\\-9\\-6\end{array} \right] \right\rbrace.\]
An orthonormal basis of \(\operatorname{CS}\left(A\right)\) is
\[\left\lbrace \frac{\overrightarrow{v_1}}{\left\lVert\overrightarrow{v_1}\right\rVert},
\frac{\overrightarrow{v_2}}{\left\lVert\overrightarrow{v_2}\right\rVert},
\frac{\overrightarrow{v_3}}{\left\lVert\overrightarrow{v_3}\right\rVert} \right\rbrace
=\left\lbrace \frac{1}{5} \left[\begin{array}{r}3\\4\\0\end{array} \right],
\frac{1}{10\sqrt{29}}\left[\begin{array}{r}16\\-12\\50\end{array} \right],
\frac{1}{3\sqrt{29}}\left[\begin{array}{r}12\\-9\\-6\end{array} \right] \right\rbrace.\]
Theorem (\(QR\)-factorization).
If an \(m\times n\) real matrix \(A\) has linearly independent columns, then \(A\) can be factored as \(A=QR\)
where \(Q\) is an \(m\times n\) real matrix whose columns form an orthonormal basis of \(\operatorname{CS}\left(A\right)\)
and \(R\) is an \(n\times n\) upper-triangular real matrix.
By the Gram-Schmidt process find an orthonormal basis
\(\{\overrightarrow{v_1},\overrightarrow{v_2},\ldots,\overrightarrow{v_n}\}\) of \(\operatorname{CS}\left(A\right)\).
Let \(Q=[\overrightarrow{v_1},\overrightarrow{v_2},\cdots,\overrightarrow{v_n}]\). Since the columns of \(Q\)
are orthonormal, \(Q^TQ=I_n\) and consequently \(Q^TA=Q^TQR=I_nR=R\).