Definition.
Let \(A\) and \(B\) be \(n\times n\) matrices. \(A\) is similar to \(B\) if \(A=PBP^{-1}\) for some
invertible matrix \(P\).
Remark.
If \(A\) is similar to \(B\), then \(B\) is similar to \(A\) because \(B=P^{-1}A(P^{-1})^{-1}\). So we simply say
\(A\) and \(B\) are similar.
Example.
Consider \(A=\left[\begin{array}{rr}6&-1\\2&3\end{array} \right],\; B=\left[\begin{array}{rr}5&0\\0&4\end{array} \right],\;
C=\left[\begin{array}{rr}5&4\\0&0\end{array} \right]\).
\(A\) and \(B\) are similar because \(A=PBP^{-1}\) where \(P=\left[\begin{array}{rr}1&1\\1&2\end{array} \right]\) and
\(P^{-1}=\left[\begin{array}{rr}2&-1\\-1&1\end{array} \right]\).
It can be verified that there is no invertible matrix \(P\) such that \(A=PCP^{-1}\). So \(A\) and \(C\) are not similar.
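For readers who want to experiment, the similarity claims in this example can be checked numerically. Here is a minimal sketch using NumPy (our choice of tool, not part of the text); the matrices are exactly those defined above.

```python
import numpy as np

A = np.array([[6.0, -1.0], [2.0, 3.0]])
B = np.array([[5.0, 0.0], [0.0, 4.0]])
C = np.array([[5.0, 4.0], [0.0, 0.0]])
P = np.array([[1.0, 1.0], [1.0, 2.0]])

# A = P B P^{-1}, so A and B are similar.
print(np.allclose(P @ B @ np.linalg.inv(P), A))  # True

# A and C have different eigenvalues ({5, 4} vs {5, 0}), so by the
# theorem below they cannot be similar.
print(sorted(np.linalg.eigvals(A)), sorted(np.linalg.eigvals(C)))
```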
Theorem.
If \(n\times n\) matrices \(A\) and \(B\) are similar, then they have the same characteristic polynomial and
consequently the same eigenvalues, counting multiplicities.
Proof.
Let \(A=PBP^{-1}\) for some invertible matrix \(P\). Then
\[\begin{eqnarray*}
\det(\lambda I-A)&=&\det(\lambda I-PBP^{-1})\\
&=&\det(\lambda PP^{-1} -PBP^{-1})\\
&=&\det(P(\lambda I-B)P^{-1})\\
&=&\det P\det(\lambda I-B)\det(P^{-1})\\
&=&\det(\lambda I-B)\det P\det(P^{-1})\\
&=&\det(\lambda I-B)\det(PP^{-1})\\
&=&\det(\lambda I-B)\cdot 1.
\end{eqnarray*}\]
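This computation can be replayed numerically. In the sketch below (NumPy assumed), `np.poly` returns the coefficients of the characteristic polynomial \(\det(\lambda I-M)\), and a randomly generated \(P\) is invertible with probability one.

```python
import numpy as np

rng = np.random.default_rng(0)
B = rng.standard_normal((3, 3))
P = rng.standard_normal((3, 3))      # generically invertible
A = P @ B @ np.linalg.inv(P)         # A is similar to B by construction

# Similar matrices have the same characteristic polynomial.
print(np.allclose(np.poly(A), np.poly(B)))  # True, up to round-off
```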
Remark.
In the preceding example, \(A\) and \(B\) are similar, so they have the same eigenvalues. Since the eigenvalues
of \(C\) (namely \(5\) and \(0\)) differ from those of \(B\) (namely \(5\) and \(4\)), \(C\) is similar to neither \(B\) nor \(A\).
If \(A\) and \(B\) are similar, they have the same eigenvalues. But the converse is not true.
For example, \(\left[\begin{array}{rr}1&0\\0&1\end{array}\right]\) and \(\left[\begin{array}{rr}1&1\\0&1\end{array}\right]\)
have the same characteristic polynomial \((\lambda-1)^2\), but they are not similar: \(PIP^{-1}=I\) for every
invertible \(P\), so the only matrix similar to the identity matrix is the identity matrix itself.
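A quick NumPy sketch of the shared characteristic polynomial (the non-similarity itself is the algebraic argument above, not something a numerical check proves):

```python
import numpy as np

I2 = np.eye(2)
J = np.array([[1.0, 1.0],
              [0.0, 1.0]])

# Both matrices have characteristic polynomial (lambda - 1)^2 ...
print(np.allclose(np.poly(I2), np.poly(J)))  # True

# ... yet they are not similar: P @ I2 @ inv(P) equals I2 for every
# invertible P, so the identity is similar only to itself.
```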
Theorem.
Let \(A\) be an \(n\times n\) matrix with eigenvalue \(\lambda\). Then the geometric multiplicity of \(\lambda\)
is less than or equal to the algebraic multiplicity of \(\lambda\).
Proof.
Let \(k\) be the geometric multiplicity of \(\lambda\) and suppose \(\overrightarrow{x_1},\ldots,\overrightarrow{x_k}\)
are \(k\) linearly independent eigenvectors of \(A\) corresponding to \(\lambda\). Extend them to a basis of \(\mathbb R^n\)
and let \(P=[\overrightarrow{x_1}\cdots\overrightarrow{x_k}\,*\cdots *]\) be the resulting invertible \(n\times n\) matrix.
Since \(P^{-1}\overrightarrow{x_i}=\overrightarrow{e_i}\) and \(A\overrightarrow{x_i}=\lambda\overrightarrow{x_i}\)
for \(i=1,\ldots,k\), the \(i\)th column of \(P^{-1}AP\) is \(P^{-1}A\overrightarrow{x_i}=\lambda\overrightarrow{e_i}\)
for \(i=1,\ldots,k\), i.e., \(P^{-1}AP=\left[\begin{array}{c|c}\lambda I_k&*\\\hline 0&*\end{array}\right]\).
This block upper triangular form shows that \((t-\lambda)^k\) divides \(\det(tI-P^{-1}AP)\), so \(\lambda\) is an eigenvalue
of \(P^{-1}AP\) with algebraic multiplicity at least \(k\). Since \(A\) and \(P^{-1}AP\), being similar, have the
same characteristic polynomial, the algebraic multiplicity of \(\lambda\) is at least \(k\).
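To make the two multiplicities concrete, here is a small NumPy sketch (the tolerance-based rank test is our own choice): the geometric multiplicity is \(\dim\operatorname{NS}(A-\lambda I)=n-\operatorname{rank}(A-\lambda I)\), and the algebraic multiplicity is the number of times \(\lambda\) occurs among the eigenvalues.

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [0.0, 1.0]])   # eigenvalue 1 with algebraic multiplicity 2
lam = 1.0
n = A.shape[0]

# Geometric multiplicity: dim NS(A - lam*I) = n - rank(A - lam*I).
geo = n - np.linalg.matrix_rank(A - lam * np.eye(n))

# Algebraic multiplicity: how often lam occurs as a root of det(lambda*I - A).
alg = int(sum(np.isclose(ev, lam) for ev in np.linalg.eigvals(A)))

print(geo, alg)  # 1 2, so geometric <= algebraic, as the theorem asserts
```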
Definition.
A square matrix \(A\) is diagonalizable if \(A\) is similar to a diagonal matrix, i.e., \(A=PDP^{-1}\)
for some invertible matrix \(P\) and some diagonal matrix \(D\).
Example. In the first example, \(A\) is diagonalizable as \(A=PBP^{-1}\) where \(B\) is a diagonal matrix.
Theorem.
Let \(A\) be an \(n\times n\) matrix. Then the following are equivalent (TFAE):
(a) \(A\) is diagonalizable,
(b) there are \(n\) linearly independent eigenvectors of \(A\),
(c) the sum of the geometric multiplicities of the distinct eigenvalues of \(A\) is \(n\), and
(d) geometric multiplicity and algebraic multiplicity are the same for all eigenvalues of \(A\).
Proof.
First note that (b), (c), and (d) are equivalent. So we prove (a)\(\Longleftrightarrow\)(b).
(a)\(\Longrightarrow\)(b) There is an invertible matrix \(P=[\overrightarrow{p_1},\ldots,\overrightarrow{p_n}]\)
such that \(A=P\mbox{diag}(\lambda_1,\ldots,\lambda_n)P^{-1}\), i.e., \(AP=P\; \mbox{diag}(\lambda_1,\ldots,\lambda_n)\).
Comparing columns, \(A\overrightarrow{p_i}=\lambda_i\overrightarrow{p_i}\) for \(i=1,\ldots,n\). Since \(P\) is invertible,
each column \(\overrightarrow{p_i}\) is nonzero, so \(\overrightarrow{p_i}\) is an eigenvector of \(A\) corresponding to
the eigenvalue \(\lambda_i\). Moreover, the columns \(\overrightarrow{p_1},\ldots,\overrightarrow{p_n}\) of \(P\) are
linearly independent by the IMT.
(b)\(\Longrightarrow\)(a) Suppose \(\overrightarrow{x_1},\ldots,\overrightarrow{x_n}\) are \(n\) linearly independent
eigenvectors of \(A\) corresponding to the eigenvalues \(\lambda_1,\ldots,\lambda_n\) respectively.
Then \(P=[\overrightarrow{x_1},\ldots,\overrightarrow{x_n}]\) is invertible by the IMT. Since \(A\overrightarrow{x_i}=\lambda_i\overrightarrow{x_i}\)
for \(i=1,\ldots,n\), we get
\[\begin{eqnarray*}
[A\overrightarrow{x_1},\ldots,A\overrightarrow{x_n}]&=&[\lambda_1\overrightarrow{x_1},\ldots,\lambda_n\overrightarrow{x_n}]\\
A[\overrightarrow{x_1},\ldots,\overrightarrow{x_n}]&=&[\overrightarrow{x_1},\ldots,\overrightarrow{x_n}]\mbox{diag}(\lambda_1,\ldots,\lambda_n)\\
AP&=&PD,
\end{eqnarray*}\]
where \(D=\mbox{diag}(\lambda_1,\ldots,\lambda_n)\). Thus \(A=PDP^{-1}\).
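Condition (b) suggests a simple numerical test. The sketch below (NumPy assumed; the helper name `is_diagonalizable` and the tolerance are our own choices) checks whether the eigenvector matrix returned by `np.linalg.eig` has full rank.

```python
import numpy as np

def is_diagonalizable(A, tol=1e-10):
    # np.linalg.eig returns the eigenvectors as the columns of V;
    # by condition (b), A is diagonalizable exactly when those n
    # columns are linearly independent, i.e. V has full numerical rank.
    n = A.shape[0]
    _, V = np.linalg.eig(A)
    return np.linalg.matrix_rank(V, tol=tol) == n

print(is_diagonalizable(np.array([[2.0, 0, 0], [1, 2, 1], [-1, 0, 1]])))  # True
print(is_diagonalizable(np.array([[1.0, 1.0], [0.0, 1.0]])))              # False (defective)
```

Note that the test is tolerance-dependent: for matrices near the boundary of diagonalizability, floating-point rounding can flip the answer.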
Corollary. Let \(A\) be an \(n\times n\) matrix.
If \(A\) has \(n\) distinct eigenvalues, then \(A\) is diagonalizable. This follows from the theorem because
eigenvectors corresponding to distinct eigenvalues are linearly independent. More generally, suppose that \(A\) has
\(k\) distinct eigenvalues \(\lambda_1,\ldots,\lambda_k\) with eigenbases (bases of the corresponding eigenspaces)
\(B_1,\ldots,B_k\) respectively. Then \(A\) is diagonalizable if and only if \(B_1\cup\cdots\cup B_k\) is a basis for \(\mathbb R^n\).
A formula for \(A^k\): Suppose \(A\) is diagonalizable and \(A=PDP^{-1}\) for some diagonal matrix \(D\).
Then \[A^k=PD^kP^{-1}.\]
Indeed, since each interior factor \(P^{-1}P\) cancels,
\[A^k=\underbrace{AA\cdots A}_{k\ \text{factors}}=(PDP^{-1})(PDP^{-1})\cdots (PDP^{-1})=P\underbrace{DD\cdots D}_{k\ \text{factors}}P^{-1}=PD^kP^{-1}.\]
Note that \(D^k\) is obtained from \(D\) by raising each diagonal entry of \(D\) to the \(k\)th power.
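As a computational sketch (NumPy assumed; when \(A\) is diagonalizable, `np.linalg.eig` supplies a usable \(P\) and \(D\)):

```python
import numpy as np

def power_via_diagonalization(A, k):
    # Assumes A is diagonalizable: eig gives A P = P diag(w).
    w, P = np.linalg.eig(A)
    Dk = np.diag(w ** k)          # raise each diagonal entry to the k-th power
    return P @ Dk @ np.linalg.inv(P)

A = np.array([[6.0, -1.0],
              [2.0,  3.0]])       # the matrix A from the first example
print(np.allclose(power_via_diagonalization(A, 5),
                  np.linalg.matrix_power(A, 5)))  # True
```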
Example.
Let \(A=\left[\begin{array}{rrr}
2&0&0\\
1&2&1\\
-1&0&1
\end{array}\right]\).
(a) Diagonalize \(A\), if possible.
(b) Find \(A^k\), if \(A\) is diagonalizable.
Solution. \(\det(\lambda I-A)=\left|\begin{array}{ccc}
\lambda-2&0&0\\
-1&\lambda-2&-1\\
1&0&\lambda-1
\end{array}\right|=(\lambda-1)(\lambda-2)^2=0\implies \lambda=1,2,2\).
Verify the following:
\[\begin{eqnarray*}
\operatorname{NS}(A-1I)&=&\displaystyle\operatorname{Span}\left\{\left[\begin{array}{r}0\\-1\\1\end{array}\right]\right\}\\
\operatorname{NS}(A-2I)&=&\displaystyle\operatorname{Span}\left\{
\left[\begin{array}{r}0\\1\\0\end{array}\right],
\left[\begin{array}{r}-1\\0\\1\end{array}\right]
\right\}
\end{eqnarray*}\]
(a) Since the \(3\times 3\) matrix \(A\) has \(3\) linearly independent eigenvectors, \(A\) is diagonalizable and
\(A=PDP^{-1}\) where
\(D=\left[\begin{array}{rrr}
1&0&0\\0&2&0\\0&0&2\end{array}\right]\) and
\(P=\left[\begin{array}{rrr}
0&0&-1\\-1&1&0\\1&0&1\end{array}\right]\).
You may verify this by showing \(AP=PD=\left[\begin{array}{rrr}
0&0&-2\\-1&2&0\\1&0&2\end{array}\right]\).
(b) Since \(P^{-1}=\left[\begin{array}{rrr}1&0&1\\1&1&1\\-1&0&0\end{array}\right]\) and
\(D^k=\left[\begin{array}{ccc}1&0&0\\0&2^k&0\\0&0&2^k\end{array}\right]\), we get
\[A^k=PD^kP^{-1}=\left[\begin{array}{ccc}2^k&0&0\\2^k-1&2^k&2^k-1\\1-2^k&0&1\end{array}\right].\]
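A final numerical check of both parts (a NumPy sketch; the matrices are those computed above):

```python
import numpy as np

A = np.array([[2.0, 0, 0], [1, 2, 1], [-1, 0, 1]])
P = np.array([[0.0, 0, -1], [-1, 1, 0], [1, 0, 1]])
D = np.diag([1.0, 2, 2])

# (a) AP = PD confirms A = P D P^{-1}.
print(np.allclose(A @ P, P @ D))  # True

# (b) The closed form for A^k matches repeated multiplication.
k = 7
Ak = np.array([[2.0**k,     0,      0],
               [2.0**k - 1, 2.0**k, 2.0**k - 1],
               [1 - 2.0**k, 0,      1]])
print(np.allclose(np.linalg.matrix_power(A, k), Ak))  # True
```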