Numerical Analysis Home

Eigenvalues: Power Method

Let $$A$$ be an $$n\times n$$ matrix. If $$A\overrightarrow{x}=\lambda \overrightarrow{x}$$ for some nonzero vector $$\overrightarrow{x}$$ and some scalar $$\lambda$$, then $$\lambda$$ is an eigenvalue of $$A$$ and $$\overrightarrow{x}$$ is an eigenvector of $$A$$ corresponding to $$\lambda$$.

Example. Consider $$A=\left[\begin{array}{rr}1&2\\0&3\end{array} \right],\;\lambda=3,\; \overrightarrow{v}=\left[\begin{array}{r}1\\1\end{array} \right],\; \overrightarrow{u}=\left[\begin{array}{r}-2\\1\end{array} \right]$$.
Since $$A\overrightarrow{v} =\left[\begin{array}{rr}1&2\\0&3\end{array} \right]\left[\begin{array}{r}1\\1\end{array} \right] =\left[\begin{array}{r}3\\3\end{array} \right] =3\left[\begin{array}{r}1\\1\end{array} \right] =\lambda\overrightarrow{v}$$, $$3$$ is an eigenvalue of $$A$$ and $$\overrightarrow{v}$$ is an eigenvector of $$A$$ corresponding to the eigenvalue $$3$$.
Since $$A\overrightarrow{u} =\left[\begin{array}{rr}1&2\\0&3\end{array} \right]\left[\begin{array}{r}-2\\1\end{array} \right] =\left[\begin{array}{r}0\\3\end{array} \right] \neq \lambda\left[\begin{array}{r}-2\\1\end{array} \right] =\lambda\overrightarrow{u}$$ for all scalars $$\lambda$$, $$\overrightarrow{u}$$ is not an eigenvector of $$A$$.

The characteristic polynomial of $$A$$ is $$\det(A-\lambda I)$$ and its roots are the eigenvalues of $$A$$. The eigenvectors of $$A$$ corresponding to the eigenvalue $$\lambda$$ are nonzero solutions of $$(A-\lambda I)\overrightarrow{x}=\overrightarrow{O}$$. The eigenspace of $$A$$ corresponding to $$\lambda$$ is $\text{NS} (A-\lambda I)=\{\overrightarrow{x}\;|\;(A-\lambda I)\overrightarrow{x}=\overrightarrow{O}\}.$ The dominant eigenvalue of $$A$$ is the eigenvalue of $$A$$ with the largest modulus. The spectral radius $$\rho(A)$$ of $$A$$ is the modulus of the dominant eigenvalue of $$A$$. For example, if $$3$$ and $$-4$$ are the eigenvalues of $$A$$, then $$-4$$ is the dominant eigenvalue of $$A$$ and $$\rho(A)=4$$.

The Power Method constructs a sequence $$\{\overrightarrow{x}^{(k)}\}$$ defined by $\overrightarrow{x}^{(k)}=\frac{A^k\overrightarrow{x}^{(0)}}{||A^k\overrightarrow{x}^{(0)}||},\; k\geq 1$ with an initial approximation $$\overrightarrow{x}^{(0)}$$ to approximate an unit eigenvector corresponding to $$\lambda_1$$:
Since $$\overrightarrow{v_1},\ldots,\overrightarrow{v_n}$$ are $$n$$ linearly independent eigenvectors, they span $$\mathbb R^n$$ and $\overrightarrow{x}^{(0)}=c_1\overrightarrow{v_1}+c_2\overrightarrow{v_2}+\cdots+c_n\overrightarrow{v_n},$ for some scalars $$c_1,c_2,\ldots,c_n$$. We choose $$\overrightarrow{x}^{(0)}$$ such that $$c_1\neq 0$$. Then $\overrightarrow{x}^{(k)}=\frac{A^k\overrightarrow{x}^{(0)}}{||A^k\overrightarrow{x}^{(0)}||} \to \pm\frac{\overrightarrow{v_1}}{||\overrightarrow{v_1}||}\text{ as } k\to \infty.$

\begin{align*} & \overrightarrow{x}^{(0)}=c_1\overrightarrow{v_1}+c_2\overrightarrow{v_2}+\cdots+c_n\overrightarrow{v_n}\\ %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \implies & A\overrightarrow{x}^{(0)}=c_1A\overrightarrow{v_1}+c_2A\overrightarrow{v_2}+\cdots+c_nA\overrightarrow{v_n}\\ %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \implies & A\overrightarrow{x}^{(0)}=c_1\lambda_1\overrightarrow{v_1}+c_2\lambda_2\overrightarrow{v_2}+\cdots+c_n\lambda_n\overrightarrow{v_n}\\ %%%%%%%%%%%%%%%%%%%%%%%%%%%%% \implies & A^2\overrightarrow{x}^{(0)}=c_1\lambda_1A\overrightarrow{v_1}+c_2\lambda_2A\overrightarrow{v_2}+\cdots+c_n\lambda_nA\overrightarrow{v_n}\\ %%%%%%%%%%%%%%%%%%%%%%%%%%% \implies & A^2\overrightarrow{x}^{(0)}=c_1\lambda_1^2\overrightarrow{v_1}+c_2\lambda_2^2\overrightarrow{v_2}+\cdots+c_n\lambda_n^2\overrightarrow{v_n}\\ &\cdots\\ %%%%%%%%%%%%%%%%%%%%%%%%%%% \implies & A^k\overrightarrow{x}^{(0)}=c_1\lambda_1^k\overrightarrow{v_1}+c_2\lambda_2^k\overrightarrow{v_2}+\cdots+c_n\lambda_n^k\overrightarrow{v_n}\\ %%%%%%%%%%%%%%%%%%%%%%%%% \implies & A^k\overrightarrow{x}^{(0)}=\lambda_1^k\left(c_1\overrightarrow{v_1}+c_2\left(\frac{\lambda_2}{\lambda_1}\right)^k\overrightarrow{v_2}+\cdots+c_n\left(\frac{\lambda_n}{\lambda_1}\right)^k\overrightarrow{v_n}\right) %\implies & \frac{A^k\overrightarrow{x}^{(0)}}{\lambda_1^k}=\left(c_1\overrightarrow{v_1}+c_2\left(\frac{\lambda_2}{\lambda_1}\right)^k\overrightarrow{v_2}+\cdots+c_n\left(\frac{\lambda_n}{\lambda_1}\right)^k\overrightarrow{v_n}\right)\to c_1\overrightarrow{v_1} \text{ as } k\to \infty. \end{align*} \begin{align*} \overrightarrow{x}^{(k)}=\frac{A^k\overrightarrow{x}^{(0)}}{||A^k\overrightarrow{x}^{(0)}||} &=\frac{\lambda_1^k\left(c_1\overrightarrow{v_1}+c_2\left(\frac{\lambda_2}{\lambda_1}\right)^k\overrightarrow{v_2}+\cdots+c_n\left(\frac{\lambda_n}{\lambda_1}\right)^k\overrightarrow{v_n}\right)}{\left\Vert\lambda_1^k\left(c_1\overrightarrow{v_1}+c_2\left(\frac{\lambda_2}{\lambda_1}\right)^k\overrightarrow{v_2}+\cdots+c_n\left(\frac{\lambda_n}{\lambda_1}\right)^k\overrightarrow{v_n}\right)\right\Vert }\\ %%%%%%%%%%%%%%%%%%%%%%%%%%%%% &=\frac{\lambda_1^k\left(c_1\overrightarrow{v_1}+c_2\left(\frac{\lambda_2}{\lambda_1}\right)^k\overrightarrow{v_2}+\cdots+c_n\left(\frac{\lambda_n}{\lambda_1}\right)^k\overrightarrow{v_n}\right)}{|\lambda_1^k|\left\Vert\left(c_1\overrightarrow{v_1}+c_2\left(\frac{\lambda_2}{\lambda_1}\right)^k\overrightarrow{v_2}+\cdots+c_n\left(\frac{\lambda_n}{\lambda_1}\right)^k\overrightarrow{v_n}\right)\right\Vert }\\ %%%%%%%%%%%%%%%%%%%%%%%%%% &=\pm\frac{c_1\overrightarrow{v_1}+c_2\left(\frac{\lambda_2}{\lambda_1}\right)^k\overrightarrow{v_2}+\cdots+c_n\left(\frac{\lambda_n}{\lambda_1}\right)^k\overrightarrow{v_n}}{\left\Vert c_1\overrightarrow{v_1}+c_2\left(\frac{\lambda_2}{\lambda_1}\right)^k\overrightarrow{v_2}+\cdots+c_n\left(\frac{\lambda_n}{\lambda_1}\right)^k\overrightarrow{v_n}\right\Vert }\\ %%%%%%%%%%%%%%%%%%%%%% &\longrightarrow \pm\frac{c_1\overrightarrow{v_1}}{||c_1\overrightarrow{v_1}||} =\pm\frac{c_1\overrightarrow{v_1}}{|c_1|\cdot||\overrightarrow{v_1}||}=\pm\frac{\overrightarrow{v_1}}{||\overrightarrow{v_1}||}\text{ as } k\to \infty. \end{align*}

Note that $A\overrightarrow{v_1}=\lambda_1\overrightarrow{v_1} \implies \overrightarrow{v_1}^T A\overrightarrow{v_1}=\lambda_1\overrightarrow{v_1}^T\overrightarrow{v_1} \implies \lambda_1=\frac{\overrightarrow{v_1}^T A\overrightarrow{v_1}}{\overrightarrow{v_1}^T \overrightarrow{v_1}}$ If $$\overrightarrow{x}^{(k)}\approx\overrightarrow{v_1}$$, then $\lambda_1\approx \lambda_1^{(k)}:=\frac{(\overrightarrow{x}^{(k)})^T A\overrightarrow{x}^{(k)}}{(\overrightarrow{x}^{(k)})^T \overrightarrow{x}^{(k)}}.$

Remark.

1. The power method does not work for $$\overrightarrow{x}^{(0)}=c_1\overrightarrow{v_1}+c_2\overrightarrow{v_2}+\cdots+c_n\overrightarrow{v_n}$$ when $$c_1=0$$.

2. The rate convergence of $$\{\overrightarrow{x}^{(k)}\}$$ is faster when $$\displaystyle \left\vert\frac{\lambda_2}{\lambda_1}\right\vert$$ is smaller, i.e., $$|\lambda_2|<<|\lambda_1|$$.

Example. Do three iterations by the power method with $$\overrightarrow{x}^{(0)}=[2,3]^T$$ to approximate the dominant eigenvalue of $$A=\left[\begin{array}{rr}1&2\\2&1\end{array}\right]$$ with its eigenvector.
Solution. \begin{align*} A\overrightarrow{x}^{(0)}=\left[\begin{array}{r}8\\7\end{array}\right] &\implies \overrightarrow{x}^{(1)}=\frac{A\overrightarrow{x}^{(0)}}{||A\overrightarrow{x}^{(0)}||_{\infty}}=\frac{1}{8} \left[\begin{array}{r}8\\7\end{array}\right]=\left[\begin{array}{r}1\\0.87\end{array}\right]\\ %%%%%%%%%%%% A\overrightarrow{x}^{(1)}=\left[\begin{array}{r}2.74\\2.87\end{array}\right] &\implies \overrightarrow{x}^{(2)}=\frac{A\overrightarrow{x}^{(1)}}{||A\overrightarrow{x}^{(1)}||_{\infty}}=\frac{1}{2.87} \left[\begin{array}{r}2.74\\2.87\end{array}\right]=\left[\begin{array}{r}0.95\\1\end{array}\right]\\ %%%%%%%%%%%% A\overrightarrow{x}^{(2)}=\left[\begin{array}{r}2.95\\2.90\end{array}\right] &\implies \overrightarrow{x}^{(3)}=\frac{A\overrightarrow{x}^{(2)}}{||A\overrightarrow{x}^{(2)}||_{\infty}}=\frac{1}{2.95} \left[\begin{array}{r}2.95\\2.9\end{array}\right]=\left[\begin{array}{r}1\\0.98\end{array}\right]\\ %%%%%%%%%%%% A\overrightarrow{x}^{(3)}=\left[\begin{array}{r}2.96\\2.98\end{array}\right] & \end{align*} \begin{align*} \lambda_1^{(1)}= \frac{(\overrightarrow{x}^{(1)})^T A\overrightarrow{x}^{(1)}}{(\overrightarrow{x}^{(1)})^T \overrightarrow{x}^{(1)}} & \approx 5.23/1.75= 2.98\\ \lambda_1^{(2)}= \frac{(\overrightarrow{x}^{(2)})^T A\overrightarrow{x}^{(2)}}{(\overrightarrow{x}^{(2)})^T \overrightarrow{x}^{(2)}} &\approx 5.70/1.90 =3.00\\ \lambda_1^{(3)}= \frac{(\overrightarrow{x}^{(3)})^T A\overrightarrow{x}^{(3)}}{(\overrightarrow{x}^{(3)})^T \overrightarrow{x}^{3)}} &\approx 5.88/1.96 =3.00 \end{align*} The actual $$\lambda_1=3$$ and $$\overrightarrow{v_1}=\left[\begin{array}{r}1\\1\end{array}\right]$$. Note that if $$\overrightarrow{x}^{(0)}=[1,-1]^T$$ which is $$\overrightarrow{v_2}$$, then the power method fails. Since $$\displaystyle\left\vert\frac{\lambda_2}{\lambda_1}\right\vert=\frac{1}{3}$$ is small, the convergence is fast.

Applications:
Power method is used to calculate Google PageRank and to find recommendations of who to follow in Twitter.

Last edited