
Orthogonal Bases and Matrices

Definition. A set $$\{\overrightarrow{v_1},\overrightarrow{v_2},\ldots,\overrightarrow{v_k}\}$$ of vectors in $$\mathbb R^n$$ is called an orthogonal set if $$\overrightarrow{v_i} \cdot \overrightarrow{v_j}=0$$ for all distinct $$i,j=1,2,\ldots,k$$. Also $$\{\overrightarrow{v_1},\overrightarrow{v_2},\ldots,\overrightarrow{v_k}\}$$ is called an orthonormal set if it is an orthogonal set of unit vectors.

Example. Let $$\overrightarrow{v_1}=\left[\begin{array}{r}2\\0\\-1\end{array} \right]$$, $$\overrightarrow{v_2}=\left[\begin{array}{r}0\\2\\0\end{array} \right]$$, and $$\overrightarrow{v_3}=\left[\begin{array}{r}1\\0\\2\end{array} \right]$$. Verify that $$\overrightarrow{v_1}\cdot \overrightarrow{v_2}=0,\; \overrightarrow{v_1}\cdot \overrightarrow{v_3}=0,\; \overrightarrow{v_2}\cdot \overrightarrow{v_3}=0$$. Then $$\{\overrightarrow{v_1},\overrightarrow{v_2},\overrightarrow{v_3}\}$$ is an orthogonal set in $$\mathbb R^3$$ but not orthonormal. The following is an orthonormal set: $\left\lbrace \frac{\overrightarrow{v_1}}{\left\lVert\overrightarrow{v_1}\right\rVert}, \frac{\overrightarrow{v_2}}{\left\lVert\overrightarrow{v_2}\right\rVert}, \frac{\overrightarrow{v_3}}{\left\lVert\overrightarrow{v_3}\right\rVert} \right\rbrace =\left\lbrace \frac{1}{\sqrt{5}} \left[\begin{array}{r}2\\0\\-1\end{array} \right], \frac{1}{2}\left[\begin{array}{r}0\\2\\0\end{array} \right], \frac{1}{\sqrt{5}}\left[\begin{array}{r}1\\0\\2\end{array} \right] \right\rbrace.$
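The dot products and the normalization in this example can be spot-checked numerically; here is a minimal NumPy sketch using the same three vectors:

```python
import numpy as np

# The three vectors from the example
v1 = np.array([2.0, 0.0, -1.0])
v2 = np.array([0.0, 2.0, 0.0])
v3 = np.array([1.0, 0.0, 2.0])

# Pairwise dot products vanish, so the set is orthogonal
print(v1 @ v2, v1 @ v3, v2 @ v3)  # 0.0 0.0 0.0

# Dividing each vector by its norm yields an orthonormal set
u1, u2, u3 = (v / np.linalg.norm(v) for v in (v1, v2, v3))
print(np.linalg.norm(u1), np.linalg.norm(u2), np.linalg.norm(u3))  # each ≈ 1.0
```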

Theorem. If $$\{\overrightarrow{v_1},\overrightarrow{v_2},\ldots,\overrightarrow{v_k}\}$$ is an orthogonal set of nonzero vectors in $$\mathbb R^n$$, then $$\{\overrightarrow{v_1},\overrightarrow{v_2},\ldots,\overrightarrow{v_k}\}$$ is linearly independent and consequently it forms a basis of $$\operatorname{Span} \{\overrightarrow{v_1},\overrightarrow{v_2},\ldots,\overrightarrow{v_k}\}$$.

Proof. Let $$c_1\overrightarrow{v_1}+c_2\overrightarrow{v_2}+\cdots+c_k\overrightarrow{v_k}=\overrightarrow{0}$$ for some scalars $$c_1,c_2,\ldots,c_k$$. Then $\begin{array}{rrl} &\overrightarrow{0} \cdot \overrightarrow{v_1} & =(c_1\overrightarrow{v_1}+c_2\overrightarrow{v_2}+\cdots+c_k\overrightarrow{v_k}) \cdot \overrightarrow{v_1}\\ \implies & 0 & =c_1(\overrightarrow{v_1}\cdot \overrightarrow{v_1})+c_2(\overrightarrow{v_2}\cdot \overrightarrow{v_1})+\cdots+c_k(\overrightarrow{v_k}\cdot \overrightarrow{v_1}) \\ \implies & 0 & =c_1\left\lVert\overrightarrow{v_1}\right\rVert^2+0+\cdots+0\\ \implies & c_1& =0 \left(\text{since } \left\lVert\overrightarrow{v_1}\right\rVert \neq 0 \text{ as }\overrightarrow{v_1}\neq \overrightarrow{0} \right). \end{array}$ Similarly we can prove $$c_2=c_3=\cdots=c_k=0$$. Thus $$\{\overrightarrow{v_1},\overrightarrow{v_2},\ldots,\overrightarrow{v_k}\}$$ is linearly independent and consequently it forms a basis of $$\operatorname{Span} \{\overrightarrow{v_1},\overrightarrow{v_2},\ldots,\overrightarrow{v_k}\}$$.

Definition. Let $$W$$ be a subspace of $$\mathbb R^n$$. An orthogonal basis of $$W$$ is a basis of $$W$$ that is an orthogonal set. Similarly an orthonormal basis of $$W$$ is a basis of $$W$$ that is an orthonormal set.

Example. Let $$\overrightarrow{v_1}=\left[\begin{array}{r}2\\0\\-1\end{array} \right]$$, $$\overrightarrow{v_2}=\left[\begin{array}{r}0\\2\\0\end{array} \right]$$, and $$\overrightarrow{v_3}=\left[\begin{array}{r}1\\0\\2\end{array} \right]$$. Then $$\{\overrightarrow{v_1},\overrightarrow{v_2},\overrightarrow{v_3}\}$$ is an orthogonal basis of $$\mathbb R^3$$.

Theorem. Let $$W$$ be a subspace of $$\mathbb R^n$$ and $$\{\overrightarrow{w_1},\overrightarrow{w_2},\ldots,\overrightarrow{w_k}\}$$ be an orthogonal basis of $$W$$. If $$\overrightarrow{v}\in W$$, then $\overrightarrow{v}=\frac{\overrightarrow{v}\cdot \overrightarrow{w_1}}{\overrightarrow{w_1} \cdot \overrightarrow{w_1}}\overrightarrow{w_1}+ \frac{\overrightarrow{v}\cdot \overrightarrow{w_2}}{\overrightarrow{w_2} \cdot \overrightarrow{w_2}}\overrightarrow{w_2}+\cdots+ \frac{\overrightarrow{v}\cdot \overrightarrow{w_k}}{\overrightarrow{w_k} \cdot \overrightarrow{w_k}}\overrightarrow{w_k}.$

Proof. Let $$\overrightarrow{v}\in W=\operatorname{Span} \{\overrightarrow{w_1},\overrightarrow{w_2},\ldots,\overrightarrow{w_k}\}$$. Then $$\overrightarrow{v}=c_1\overrightarrow{w_1}+c_2\overrightarrow{w_2}+\cdots+c_k\overrightarrow{w_k}$$ for some scalars $$c_1,c_2,\ldots,c_k$$. Then $\begin{array}{rrl} &\overrightarrow{v} \cdot \overrightarrow{w_1} & =(c_1\overrightarrow{w_1}+c_2\overrightarrow{w_2}+\cdots+c_k\overrightarrow{w_k}) \cdot \overrightarrow{w_1} \\ \implies & \overrightarrow{v} \cdot \overrightarrow{w_1} & =c_1(\overrightarrow{w_1}\cdot \overrightarrow{w_1})+c_2(\overrightarrow{w_2}\cdot \overrightarrow{w_1})+\cdots+c_k(\overrightarrow{w_k}\cdot \overrightarrow{w_1}) \\ \implies &\overrightarrow{v} \cdot \overrightarrow{w_1} & =c_1(\overrightarrow{w_1}\cdot \overrightarrow{w_1})+0+\cdots+0\\ \implies & c_1& =\displaystyle\frac{\overrightarrow{v} \cdot \overrightarrow{w_1}}{\overrightarrow{w_1}\cdot \overrightarrow{w_1}} \;\; \left(\text{since } \overrightarrow{w_1}\cdot \overrightarrow{w_1}=\left\lVert\overrightarrow{w_1}\right\rVert^2 \neq 0 \text{ as }\overrightarrow{w_1}\neq \overrightarrow{0} \right). \end{array}$ Similarly we can prove that $$c_i =\displaystyle\frac{\overrightarrow{v} \cdot \overrightarrow{w_i}}{\overrightarrow{w_i}\cdot \overrightarrow{w_i}}$$ for $$i=2,3,\ldots,k$$.

Example. Let $$\overrightarrow{v_1}=\left[\begin{array}{r}2\\0\\-1\end{array} \right]$$, $$\overrightarrow{v_2}=\left[\begin{array}{r}0\\2\\0\end{array} \right]$$, and $$\overrightarrow{v_3}=\left[\begin{array}{r}1\\0\\2\end{array} \right]$$. Write $$\overrightarrow{v}=\left[\begin{array}{r}-1\\4\\3\end{array} \right]$$ as a unique linear combination of $$\overrightarrow{v_1},\overrightarrow{v_2},\overrightarrow{v_3}$$ which form an orthogonal basis of $$\mathbb R^3$$.
Solution. \begin{align*} \overrightarrow{v}=\left[\begin{array}{r}-1\\4\\3\end{array} \right] &=\frac{\overrightarrow{v}\cdot \overrightarrow{v_1}}{\overrightarrow{v_1} \cdot \overrightarrow{v_1}}\overrightarrow{v_1} +\frac{\overrightarrow{v}\cdot \overrightarrow{v_2}}{\overrightarrow{v_2} \cdot \overrightarrow{v_2}}\overrightarrow{v_2} +\frac{\overrightarrow{v}\cdot \overrightarrow{v_3}}{\overrightarrow{v_3} \cdot \overrightarrow{v_3}}\overrightarrow{v_3}\\ &=\frac{-5}{5}\overrightarrow{v_1}+\frac{8}{4}\overrightarrow{v_2}+\frac{5}{5}\overrightarrow{v_3}\\ &=-\overrightarrow{v_1}+2\overrightarrow{v_2}+\overrightarrow{v_3}. \end{align*}
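The coefficient formula used in this solution translates directly into a short computation; a NumPy sketch with the same vectors:

```python
import numpy as np

v1 = np.array([2.0, 0.0, -1.0])
v2 = np.array([0.0, 2.0, 0.0])
v3 = np.array([1.0, 0.0, 2.0])
v = np.array([-1.0, 4.0, 3.0])

# c_i = (v . v_i) / (v_i . v_i), as in the theorem above
coeffs = [(v @ w) / (w @ w) for w in (v1, v2, v3)]
print(coeffs)  # [-1.0, 2.0, 1.0]

# Reassembling v from the orthogonal basis recovers it exactly
recon = sum(c * w for c, w in zip(coeffs, (v1, v2, v3)))
print(np.allclose(recon, v))  # True
```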
Theorem. An $$m\times n$$ real matrix $$U$$ has orthonormal columns if and only if $$U^TU=I_n$$.

Proof. Let $$U=[\overrightarrow{u_1}\:\overrightarrow{u_2}\:\cdots\overrightarrow{u_n}]$$ be an $$m\times n$$ real matrix. Then $U^TU= \left[\begin{array}{c}\overrightarrow{u_1}^T\\\overrightarrow{u_2}^T\\ \vdots\\\overrightarrow{u_n}^T \end{array} \right] [\overrightarrow{u_1}\:\overrightarrow{u_2}\:\cdots\overrightarrow{u_n}] =\left[\begin{array}{cccc} \overrightarrow{u_1}\cdot \overrightarrow{u_1} &\overrightarrow{u_1}\cdot \overrightarrow{u_2}&\cdots &\overrightarrow{u_1}\cdot \overrightarrow{u_n}\\ \overrightarrow{u_2}\cdot \overrightarrow{u_1} &\overrightarrow{u_2}\cdot \overrightarrow{u_2}&\cdots &\overrightarrow{u_2}\cdot \overrightarrow{u_n}\\ \vdots&\vdots&\ddots &\vdots\\ \overrightarrow{u_n}\cdot \overrightarrow{u_1} &\overrightarrow{u_n}\cdot \overrightarrow{u_2}&\cdots &\overrightarrow{u_n}\cdot \overrightarrow{u_n}\\ \end{array}\right].$ Thus $$U$$ has orthonormal columns if and only if $$U^TU=I_n$$.

Definition. A square real matrix $$U$$ is called an orthogonal matrix if $$U$$ has orthonormal columns, equivalently if $$U^TU=I$$.

Theorem. The following are equivalent for an $$n\times n$$ real matrix $$U$$.

1. $$U$$ is an orthogonal matrix.

2. $$U$$ has orthonormal columns.

3. $$U^TU=I_n$$.

4. $$UU^T=I_n$$.

5. $$U$$ has orthonormal rows.

6. $$U^{-1}=U^T$$.

Example. $$U=\left[\begin{array}{rrr} \frac{2}{\sqrt{5}}&0&\frac{1}{\sqrt{5}}\\0&1&0\\\frac{-1}{\sqrt{5}}&0&\frac{2}{\sqrt{5}}\end{array}\right]$$ is an orthogonal matrix and $$U^{-1}=U^T=\left[\begin{array}{rrr} \frac{2}{\sqrt{5}}&0&\frac{-1}{\sqrt{5}}\\0&1&0\\\frac{1}{\sqrt{5}}&0&\frac{2}{\sqrt{5}}\end{array}\right]$$.
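These equivalences can be spot-checked for this particular $$U$$; a NumPy sketch (floating-point comparisons use a tolerance):

```python
import numpy as np

s = 1 / np.sqrt(5)
U = np.array([[2*s, 0.0,   s],
              [0.0, 1.0, 0.0],
              [ -s, 0.0, 2*s]])

# Orthonormal columns and orthonormal rows: U^T U = U U^T = I
print(np.allclose(U.T @ U, np.eye(3)))  # True
print(np.allclose(U @ U.T, np.eye(3)))  # True

# U^{-1} = U^T
print(np.allclose(np.linalg.inv(U), U.T))  # True
```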

Theorem. Let $$U$$ be an $$m\times n$$ real matrix with orthonormal columns. Then

1. $$(U\overrightarrow{x}) \cdot (U\overrightarrow{y})=\overrightarrow{x} \cdot \overrightarrow{y}$$ for all $$\overrightarrow{x},\overrightarrow{y} \in \mathbb R^n$$.

2. $$(U\overrightarrow{x}) \cdot (U\overrightarrow{y})=0$$ if and only if $$\overrightarrow{x} \cdot \overrightarrow{y}=0$$ for all $$\overrightarrow{x},\overrightarrow{y} \in \mathbb R^n$$ (i.e., the map $$\overrightarrow{x} \mapsto U\overrightarrow{x}$$ preserves the orthogonality between vectors).

3. $$\left\lVert U\overrightarrow{x}\right\rVert=\left\lVert\overrightarrow{x}\right\rVert$$ for all $$\overrightarrow{x}\in \mathbb R^n$$ (i.e., the map $$\overrightarrow{x} \mapsto U\overrightarrow{x}$$ preserves the length of vectors).

Proof. Since the $$m\times n$$ real matrix $$U$$ has orthonormal columns, $$U^TU=I_n$$.
1. $$(U\overrightarrow{x}) \cdot (U\overrightarrow{y})=(U\overrightarrow{x})^T (U\overrightarrow{y}) =\overrightarrow{x}^TU^TU\overrightarrow{y}=\overrightarrow{x}^TI_n\overrightarrow{y} =\overrightarrow{x} \cdot \overrightarrow{y}$$ for all $$\overrightarrow{x},\overrightarrow{y} \in \mathbb R^n$$.

2. Follows from (1).

3. By (1), $$\left\lVert U\overrightarrow{x}\right\rVert^2 =(U\overrightarrow{x}) \cdot (U\overrightarrow{x})=\overrightarrow{x} \cdot \overrightarrow{x} =\left\lVert \overrightarrow{x}\right\rVert^2 \implies \left\lVert U\overrightarrow{x}\right\rVert =\left\lVert\overrightarrow{x}\right\rVert$$.
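All three properties can be illustrated numerically for a generic matrix with orthonormal columns; a sketch that obtains such a matrix from a QR factorization (the random matrix and seed are assumptions of this illustration, not part of the theorem):

```python
import numpy as np

rng = np.random.default_rng(0)
# The reduced QR factor Q is 5 x 3 with orthonormal columns
Q, _ = np.linalg.qr(rng.standard_normal((5, 3)))

x = rng.standard_normal(3)
y = rng.standard_normal(3)

# (Qx).(Qy) = x.y : the map x -> Qx preserves dot products
print(np.isclose((Q @ x) @ (Q @ y), x @ y))  # True

# ||Qx|| = ||x|| : the map x -> Qx preserves lengths
print(np.isclose(np.linalg.norm(Q @ x), np.linalg.norm(x)))  # True
```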

Corollary. An $$n\times n$$ real matrix $$U$$ is orthogonal if and only if $$\left\lVert U\overrightarrow{x}\right\rVert =\left\lVert\overrightarrow{x}\right\rVert$$ for all $$\overrightarrow{x}\in \mathbb R^n$$.

Proof. Let $$U$$ be an $$n\times n$$ real matrix.
($$\implies$$) This follows from part (3) of the preceding theorem.
($$\Longleftarrow$$) Suppose $$\left\lVert U\overrightarrow{x}\right\rVert=\left\lVert\overrightarrow{x}\right\rVert$$ for all $$\overrightarrow{x}\in \mathbb R^n$$. Let $$U^TU=[a_{ij}]$$. Since $$U^TU$$ is symmetric, $$a_{ij}=a_{ji}$$. For $$i=1,2,\ldots,n$$, $$a_{ii}=(U\overrightarrow{e_i})^T(U\overrightarrow{e_i}) =\left\lVert U\overrightarrow{e_i}\right\rVert^2=\left\lVert\overrightarrow{e_i}\right\rVert^2=1$$. For $$i\neq j$$, $\begin{array}{rrl} & a_{ii}-a_{ji}-a_{ij}+a_{jj}&=(U(\overrightarrow{e_i}-\overrightarrow{e_j}))^T(U(\overrightarrow{e_i}-\overrightarrow{e_j}))\\ \implies & 2-2a_{ij}&=\left\lVert U(\overrightarrow{e_i}-\overrightarrow{e_j})\right\rVert^2 =\left\lVert\overrightarrow{e_i}-\overrightarrow{e_j}\right\rVert^2=2\\ \implies & a_{ij}&=0. \end{array}$ Thus $$U^TU=I_n$$ and $$U$$ is orthogonal.
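The key identity in the ($$\Longleftarrow$$) direction, $$\left\lVert U(\overrightarrow{e_i}-\overrightarrow{e_j})\right\rVert^2 = a_{ii}-2a_{ij}+a_{jj}$$, holds for any real matrix, not just norm-preserving ones; a NumPy sketch with an arbitrary random matrix:

```python
import numpy as np

rng = np.random.default_rng(1)
U = rng.standard_normal((4, 4))  # any real matrix, not necessarily orthogonal
A = U.T @ U                      # A = [a_ij]; note A is symmetric

i, j = 0, 2
e = np.eye(4)                    # rows are the standard basis vectors e_1, ..., e_4
lhs = np.linalg.norm(U @ (e[i] - e[j])) ** 2
rhs = A[i, i] - 2 * A[i, j] + A[j, j]
print(np.isclose(lhs, rhs))  # True
```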
