Definition.
A set \(\{\overrightarrow{v_1},\overrightarrow{v_2},\ldots,\overrightarrow{v_k}\}\) of vectors in \(\mathbb R^n\)
is called an orthogonal set if \(\overrightarrow{v_i} \cdot \overrightarrow{v_j}=0\) for all distinct
\(i,j=1,2,\ldots,k\). Also \(\{\overrightarrow{v_1},\overrightarrow{v_2},\ldots,\overrightarrow{v_k}\}\) is called an
orthonormal set if it is an orthogonal set of unit vectors.
Example.
Let \(\overrightarrow{v_1}=\left[\begin{array}{r}2\\0\\-1\end{array} \right]\),
\(\overrightarrow{v_2}=\left[\begin{array}{r}0\\2\\0\end{array} \right]\), and
\(\overrightarrow{v_3}=\left[\begin{array}{r}1\\0\\2\end{array} \right]\).
Verify that \(\overrightarrow{v_1}\cdot \overrightarrow{v_2}=0,\; \overrightarrow{v_1}\cdot \overrightarrow{v_3}=0,\; \overrightarrow{v_2}\cdot \overrightarrow{v_3}=0\).
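Explicitly,
\[\overrightarrow{v_1}\cdot \overrightarrow{v_2}=(2)(0)+(0)(2)+(-1)(0)=0,\quad
\overrightarrow{v_1}\cdot \overrightarrow{v_3}=(2)(1)+(0)(0)+(-1)(2)=0,\quad
\overrightarrow{v_2}\cdot \overrightarrow{v_3}=(0)(1)+(2)(0)+(0)(2)=0.\]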
Then
\(\{\overrightarrow{v_1},\overrightarrow{v_2},\overrightarrow{v_3}\}\) is an orthogonal set in \(\mathbb R^3\)
but it is not orthonormal, since for example \(\left\lVert\overrightarrow{v_1}\right\rVert=\sqrt{5}\neq 1\).
Dividing each vector by its norm yields an orthonormal set:
\[\left\lbrace \frac{\overrightarrow{v_1}}{\left\lVert\overrightarrow{v_1}\right\rVert},
\frac{\overrightarrow{v_2}}{\left\lVert\overrightarrow{v_2}\right\rVert},
\frac{\overrightarrow{v_3}}{\left\lVert\overrightarrow{v_3}\right\rVert} \right\rbrace
=\left\lbrace \frac{1}{\sqrt{5}} \left[\begin{array}{r}2\\0\\-1\end{array} \right],
\frac{1}{2}\left[\begin{array}{r}0\\2\\0\end{array} \right],
\frac{1}{\sqrt{5}}\left[\begin{array}{r}1\\0\\2\end{array} \right] \right\rbrace.\]
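These claims are easy to confirm numerically. Below is a minimal NumPy sketch (the variable names are ours, not part of the text) that checks the pairwise dot products and that normalizing the vectors yields an orthonormal set.

```python
import numpy as np

v1 = np.array([2.0, 0.0, -1.0])
v2 = np.array([0.0, 2.0, 0.0])
v3 = np.array([1.0, 0.0, 2.0])

# All pairwise dot products vanish: an orthogonal set.
for a, b in [(v1, v2), (v1, v3), (v2, v3)]:
    assert np.isclose(a @ b, 0.0)

# After dividing each vector by its norm, the matrix Q whose
# columns are the normalized vectors satisfies Q^T Q = I.
Q = np.column_stack([v / np.linalg.norm(v) for v in (v1, v2, v3)])
assert np.allclose(Q.T @ Q, np.eye(3))
```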
Theorem.
If \(\{\overrightarrow{v_1},\overrightarrow{v_2},\ldots,\overrightarrow{v_k}\}\) is an orthogonal set of nonzero vectors
in \(\mathbb R^n\), then \(\{\overrightarrow{v_1},\overrightarrow{v_2},\ldots,\overrightarrow{v_k}\}\) is
linearly independent and consequently it forms a basis of \(\operatorname{Span} \{\overrightarrow{v_1},\overrightarrow{v_2},\ldots,\overrightarrow{v_k}\}\).
Proof.
Let \(c_1\overrightarrow{v_1}+c_2\overrightarrow{v_2}+\cdots+c_k\overrightarrow{v_k}=\overrightarrow{0}\)
for some scalars \(c_1,c_2,\ldots,c_k\).
Then
\[\begin{array}{rrl}
&\overrightarrow{0} \cdot \overrightarrow{v_1} & =(c_1\overrightarrow{v_1}+c_2\overrightarrow{v_2}+\cdots+c_k\overrightarrow{v_k}) \cdot \overrightarrow{v_1}\\
\implies & 0 & =c_1(\overrightarrow{v_1}\cdot \overrightarrow{v_1})+c_2(\overrightarrow{v_2}\cdot \overrightarrow{v_1})+\cdots+c_k(\overrightarrow{v_k}\cdot \overrightarrow{v_1}) \\
\implies & 0 & =c_1\left\lVert\overrightarrow{v_1}\right\rVert^2+0+\cdots+0\\
\implies & c_1& =0 \left(\text{since } \left\lVert\overrightarrow{v_1}\right\rVert \neq 0 \text{ as }\overrightarrow{v_1}\neq \overrightarrow{0} \right).
\end{array}\]
Similarly, taking the dot product with \(\overrightarrow{v_2},\ldots,\overrightarrow{v_k}\) in turn gives \(c_2=c_3=\cdots=c_k=0\). Thus \(\{\overrightarrow{v_1},\overrightarrow{v_2},\ldots,\overrightarrow{v_k}\}\)
is linearly independent and consequently it forms a basis of \(\operatorname{Span} \{\overrightarrow{v_1},\overrightarrow{v_2},\ldots,\overrightarrow{v_k}\}\).
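As a numerical illustration of the theorem (a sketch, reusing the orthogonal set from the example above), the matrix with these vectors as columns has full column rank, confirming linear independence.

```python
import numpy as np

# Columns are the orthogonal, nonzero vectors v1, v2, v3 from the example.
A = np.column_stack([[2, 0, -1], [0, 2, 0], [1, 0, 2]])
# Full column rank <=> the columns are linearly independent.
assert np.linalg.matrix_rank(A) == 3
```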
Definition.
Let \(W\) be a subspace of \(\mathbb R^n\). An orthogonal basis of \(W\) is a basis of \(W\) that is
an orthogonal set. Similarly an orthonormal basis of \(W\) is a basis of \(W\) that is an orthonormal set.
Example.
Let \(\overrightarrow{v_1}=\left[\begin{array}{r}2\\0\\-1\end{array} \right]\),
\(\overrightarrow{v_2}=\left[\begin{array}{r}0\\2\\0\end{array} \right]\), and
\(\overrightarrow{v_3}=\left[\begin{array}{r}1\\0\\2\end{array} \right]\). Then
\(\{\overrightarrow{v_1},\overrightarrow{v_2},\overrightarrow{v_3}\}\) is an orthogonal set of three nonzero vectors in \(\mathbb R^3\), hence linearly independent
by the preceding theorem, and therefore an orthogonal basis of \(\mathbb R^3\).
Theorem.
Let \(W\) be a subspace of \(\mathbb R^n\) and \(\{\overrightarrow{w_1},\overrightarrow{w_2},\ldots,\overrightarrow{w_k}\}\)
be an orthogonal basis of \(W\). If \(\overrightarrow{v}\in W\),
then
\[\overrightarrow{v}=\frac{\overrightarrow{v}\cdot \overrightarrow{w_1}}{\overrightarrow{w_1} \cdot \overrightarrow{w_1}}\overrightarrow{w_1}+
\frac{\overrightarrow{v}\cdot \overrightarrow{w_2}}{\overrightarrow{w_2} \cdot \overrightarrow{w_2}}\overrightarrow{w_2}+\cdots+
\frac{\overrightarrow{v}\cdot \overrightarrow{w_k}}{\overrightarrow{w_k} \cdot \overrightarrow{w_k}}\overrightarrow{w_k}.\]
Proof.
Let \(\overrightarrow{v}\in W=\operatorname{Span} \{\overrightarrow{w_1},\overrightarrow{w_2},\ldots,\overrightarrow{w_k}\}\).
Then \(\overrightarrow{v}=c_1\overrightarrow{w_1}+c_2\overrightarrow{w_2}+\cdots+c_k\overrightarrow{w_k}\) for some
scalars \(c_1,c_2,\ldots,c_k\). Then
\[\begin{array}{rrl}
&\overrightarrow{v} \cdot \overrightarrow{w_1} & =(c_1\overrightarrow{w_1}+c_2\overrightarrow{w_2}+\cdots+c_k\overrightarrow{w_k}) \cdot \overrightarrow{w_1} \\
\implies & \overrightarrow{v} \cdot \overrightarrow{w_1} & =c_1(\overrightarrow{w_1}\cdot \overrightarrow{w_1})+c_2(\overrightarrow{w_2}\cdot \overrightarrow{w_1})+\cdots+c_k(\overrightarrow{w_k}\cdot \overrightarrow{w_1}) \\
\implies &\overrightarrow{v} \cdot \overrightarrow{w_1} & =c_1(\overrightarrow{w_1}\cdot \overrightarrow{w_1})+0+\cdots+0\\
\implies & c_1& =\displaystyle\frac{\overrightarrow{v} \cdot \overrightarrow{w_1}}{\overrightarrow{w_1}\cdot \overrightarrow{w_1}} \;\;
\left(\text{since } \overrightarrow{w_1}\cdot \overrightarrow{w_1}=\left\lVert\overrightarrow{w_1}\right\rVert^2 \neq 0 \text{ as }\overrightarrow{w_1}\neq \overrightarrow{0} \right).
\end{array}\]
Similarly, taking the dot product with \(\overrightarrow{w_i}\) shows that \(c_i =\displaystyle\frac{\overrightarrow{v} \cdot \overrightarrow{w_i}}{\overrightarrow{w_i}\cdot \overrightarrow{w_i}}\)
for \(i=2,3,\ldots,k\).
Example.
Let \(\overrightarrow{v_1}=\left[\begin{array}{r}2\\0\\-1\end{array} \right]\),
\(\overrightarrow{v_2}=\left[\begin{array}{r}0\\2\\0\end{array} \right]\), and
\(\overrightarrow{v_3}=\left[\begin{array}{r}1\\0\\2\end{array} \right]\).
Write \(\overrightarrow{v}=\left[\begin{array}{r}-1\\4\\3\end{array} \right]\) as a (necessarily unique) linear combination of
\(\overrightarrow{v_1},\overrightarrow{v_2},\overrightarrow{v_3}\), which form an orthogonal basis of \(\mathbb R^3\) by the preceding example.
Solution.
\[\begin{align*}
\overrightarrow{v}=\left[\begin{array}{r}-1\\4\\3\end{array} \right] &=\frac{\overrightarrow{v}\cdot \overrightarrow{v_1}}{\overrightarrow{v_1} \cdot \overrightarrow{v_1}}\overrightarrow{v_1}
+\frac{\overrightarrow{v}\cdot \overrightarrow{v_2}}{\overrightarrow{v_2} \cdot \overrightarrow{v_2}}\overrightarrow{v_2}
+\frac{\overrightarrow{v}\cdot \overrightarrow{v_3}}{\overrightarrow{v_3} \cdot \overrightarrow{v_3}}\overrightarrow{v_3}\\
&=\frac{-5}{5}\overrightarrow{v_1}+\frac{8}{4}\overrightarrow{v_2}+\frac{5}{5}\overrightarrow{v_3}\\
&=-\overrightarrow{v_1}+2\overrightarrow{v_2}+\overrightarrow{v_3}.
\end{align*}\]
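The same coefficients can be computed mechanically from the formula \(c_i=\frac{\overrightarrow{v}\cdot \overrightarrow{v_i}}{\overrightarrow{v_i}\cdot \overrightarrow{v_i}}\); a minimal NumPy sketch:

```python
import numpy as np

v1 = np.array([2.0, 0.0, -1.0])
v2 = np.array([0.0, 2.0, 0.0])
v3 = np.array([1.0, 0.0, 2.0])
v = np.array([-1.0, 4.0, 3.0])

# c_i = (v . w_i) / (w_i . w_i) for an orthogonal basis {w_i}.
coeffs = [(v @ w) / (w @ w) for w in (v1, v2, v3)]
print(coeffs)  # [-1.0, 2.0, 1.0]

# Reconstructing v from the coefficients recovers the original vector.
assert np.allclose(coeffs[0] * v1 + coeffs[1] * v2 + coeffs[2] * v3, v)
```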
Theorem.
An \(m\times n\) real matrix \(U\) has orthonormal columns if and only if \(U^TU=I_n\).
Proof.
Let \(U=[\overrightarrow{u_1}\:\overrightarrow{u_2}\:\cdots\:\overrightarrow{u_n}]\) be an \(m\times n\) real matrix.
Then
\[U^TU= \left[\begin{array}{c}\overrightarrow{u_1}^T\\\overrightarrow{u_2}^T\\ \vdots\\\overrightarrow{u_n}^T \end{array} \right] [\overrightarrow{u_1}\:\overrightarrow{u_2}\:\cdots\overrightarrow{u_n}]
=\left[\begin{array}{cccc}
\overrightarrow{u_1}\cdot \overrightarrow{u_1} &\overrightarrow{u_1}\cdot \overrightarrow{u_2}&\cdots &\overrightarrow{u_1}\cdot \overrightarrow{u_n}\\
\overrightarrow{u_2}\cdot \overrightarrow{u_1} &\overrightarrow{u_2}\cdot \overrightarrow{u_2}&\cdots &\overrightarrow{u_2}\cdot \overrightarrow{u_n}\\
\vdots&\vdots&\ddots &\vdots\\
\overrightarrow{u_n}\cdot \overrightarrow{u_1} &\overrightarrow{u_n}\cdot \overrightarrow{u_2}&\cdots &\overrightarrow{u_n}\cdot \overrightarrow{u_n}\\ \end{array}\right].\]
Since the \((i,j)\) entry of \(U^TU\) is \(\overrightarrow{u_i}\cdot \overrightarrow{u_j}\), we have \(U^TU=I_n\) if and only if
\(\overrightarrow{u_i}\cdot \overrightarrow{u_j}=0\) for all \(i\neq j\) and \(\overrightarrow{u_i}\cdot \overrightarrow{u_i}=1\) for all \(i\),
i.e., if and only if the columns of \(U\) are orthonormal.
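Note that the identity \(U^TU=[\overrightarrow{u_i}\cdot \overrightarrow{u_j}]\) holds for any real matrix. A quick sketch using the (orthogonal but not orthonormal) columns from the first example, where \(U^TU\) comes out diagonal but is not the identity:

```python
import numpy as np

U = np.column_stack([[2.0, 0.0, -1.0], [0.0, 2.0, 0.0], [1.0, 0.0, 2.0]])
G = U.T @ U
# Entry (i, j) of U^T U is the dot product of columns i and j.
for i in range(3):
    for j in range(3):
        assert np.isclose(G[i, j], U[:, i] @ U[:, j])
print(G)  # diag(5, 4, 5): orthogonal columns, but not orthonormal
```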
Definition.
A square real matrix \(U\) is called an orthogonal matrix if \(U\) has orthonormal columns, equivalently if
\(U^TU=I\).
Theorem.
The following are equivalent for an \(n\times n\) real matrix \(U\).
(a) \(U\) is an orthogonal matrix.
(b) \(U\) has orthonormal columns.
(c) \(U^TU=I_n\).
(d) \(UU^T=I_n\).
(e) \(U\) has orthonormal rows.
(f) \(U^{-1}=U^T\).
Example.
\(U=\left[\begin{array}{rrr}
\frac{2}{\sqrt{5}}&0&\frac{1}{\sqrt{5}}\\0&1&0\\\frac{-1}{\sqrt{5}}&0&\frac{2}{\sqrt{5}}\end{array}\right]\)
is an orthogonal matrix and
\(U^{-1}=U^T=\left[\begin{array}{rrr}
\frac{2}{\sqrt{5}}&0&\frac{-1}{\sqrt{5}}\\0&1&0\\\frac{1}{\sqrt{5}}&0&\frac{2}{\sqrt{5}}\end{array}\right]\).
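A numerical check of several of the equivalent conditions for this \(U\) (a sketch):

```python
import numpy as np

s = 1 / np.sqrt(5)
U = np.array([[2 * s, 0, s],
              [0, 1, 0],
              [-s, 0, 2 * s]])

I = np.eye(3)
assert np.allclose(U.T @ U, I)             # orthonormal columns
assert np.allclose(U @ U.T, I)             # orthonormal rows
assert np.allclose(np.linalg.inv(U), U.T)  # U^{-1} = U^T
```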
Theorem.
Let \(U\) be an \(m\times n\) real matrix with orthonormal columns. Then
(a) \((U\overrightarrow{x}) \cdot (U\overrightarrow{y})=\overrightarrow{x} \cdot \overrightarrow{y}\)
for all \(\overrightarrow{x},\overrightarrow{y} \in \mathbb R^n\).
(b) \((U\overrightarrow{x}) \cdot (U\overrightarrow{y})=0\) if and only if
\(\overrightarrow{x} \cdot \overrightarrow{y}=0\) for all \(\overrightarrow{x},\overrightarrow{y} \in \mathbb R^n\) (i.e.,
the map \(\overrightarrow{x} \mapsto U\overrightarrow{x}\) preserves orthogonality between vectors).
(c) \(\left\lVert U\overrightarrow{x}\right\rVert=\left\lVert\overrightarrow{x}\right\rVert\)
for all \(\overrightarrow{x}\in \mathbb R^n\) (i.e.,
the map \(\overrightarrow{x} \mapsto U\overrightarrow{x}\) preserves the length of vectors).
Proof.
Since the \(m\times n\) real matrix \(U\) has orthonormal columns, \(U^TU=I_n\) by the preceding theorem. Then
\((U\overrightarrow{x}) \cdot (U\overrightarrow{y})=(U\overrightarrow{x})^T (U\overrightarrow{y})
=\overrightarrow{x}^TU^TU\overrightarrow{y}=\overrightarrow{x}^TI_n\overrightarrow{y}
=\overrightarrow{x} \cdot \overrightarrow{y}\)
for all \(\overrightarrow{x},\overrightarrow{y} \in \mathbb R^n\), which proves (a). Part (b) follows immediately from (a), and setting
\(\overrightarrow{y}=\overrightarrow{x}\) in (a) gives \(\left\lVert U\overrightarrow{x}\right\rVert^2=\left\lVert\overrightarrow{x}\right\rVert^2\), which proves (c).
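A sketch that checks parts (a) and (c) on random vectors, using the orthogonal matrix from the example above:

```python
import numpy as np

rng = np.random.default_rng(0)
s = 1 / np.sqrt(5)
U = np.array([[2 * s, 0, s], [0, 1, 0], [-s, 0, 2 * s]])

x, y = rng.standard_normal(3), rng.standard_normal(3)
# (Ux) . (Uy) = x . y, so dot products (and hence lengths) are preserved.
assert np.isclose((U @ x) @ (U @ y), x @ y)
assert np.isclose(np.linalg.norm(U @ x), np.linalg.norm(x))
```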
Corollary.
An \(n\times n\) real matrix \(U\) is orthogonal if and only if \(\left\lVert U\overrightarrow{x}\right\rVert
=\left\lVert\overrightarrow{x}\right\rVert\) for all \(\overrightarrow{x}\in \mathbb R^n\).
Proof.
Let \(U\) be an \(n\times n\) real matrix.
(\(\implies\)) It follows from (c) of the preceding theorem.
(\(\Longleftarrow\)) Suppose \(\left\lVert U\overrightarrow{x}\right\rVert=\left\lVert\overrightarrow{x}\right\rVert\)
for all
\(\overrightarrow{x}\in \mathbb R^n\). Let \(U^TU=[a_{ij}]\). Since \(U^TU\) is symmetric, \(a_{ij}=a_{ji}\).
For \(i=1,2,\ldots,n\), \(a_{ii}=(U\overrightarrow{e_i})^T(U\overrightarrow{e_i})
=\left\lVert U\overrightarrow{e_i}\right\rVert^2=\left\lVert\overrightarrow{e_i}\right\rVert^2=1\).
For \(i\neq j\), using \(a_{ii}=a_{jj}=1\) and \(a_{ij}=a_{ji}\),
\[\begin{array}{rrl}
& a_{ii}-a_{ji}-a_{ij}+a_{jj}&=(\overrightarrow{e_i}-\overrightarrow{e_j})^T(U^TU)(\overrightarrow{e_i}-\overrightarrow{e_j})
=(U(\overrightarrow{e_i}-\overrightarrow{e_j}))^T(U(\overrightarrow{e_i}-\overrightarrow{e_j}))\\
\implies & 2-2a_{ij}&=\left\lVert U(\overrightarrow{e_i}-\overrightarrow{e_j})\right\rVert^2
=\left\lVert\overrightarrow{e_i}-\overrightarrow{e_j}\right\rVert^2=2\\
\implies & a_{ij}&=0.
\end{array}\]
Thus \(U^TU=I_n\) and \(U\) is orthogonal.
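The polarization computation in the proof can be replayed numerically: the entries of \(U^TU\) are recovered from the norms \(\left\lVert U\overrightarrow{e_i}\right\rVert\) and \(\left\lVert U(\overrightarrow{e_i}-\overrightarrow{e_j})\right\rVert\) alone. A sketch using a \(2\times 2\) rotation matrix, a standard example of a norm-preserving map:

```python
import numpy as np

t = 0.7  # any angle; rotations preserve lengths
U = np.array([[np.cos(t), -np.sin(t)],
              [np.sin(t),  np.cos(t)]])

e1, e2 = np.eye(2)
# As in the proof: a_ii = ||U e_i||^2 and, for i != j,
# a_ij = (a_ii + a_jj - ||U(e_i - e_j)||^2) / 2.
a11 = np.linalg.norm(U @ e1) ** 2
a22 = np.linalg.norm(U @ e2) ** 2
a12 = (a11 + a22 - np.linalg.norm(U @ (e1 - e2)) ** 2) / 2
assert np.allclose([a11, a22, a12], [1.0, 1.0, 0.0])  # hence U^T U = I
```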