Inverse of a Matrix

Definition. An \(n\times n\) matrix \(A\) is invertible if there is an \(n\times n\) matrix \(B\) such that \[AB=BA=I_n.\] This \(B\) is called the inverse of \(A\), denoted by \(A^{-1}\), for which \(AA^{-1}=A^{-1}A=I_n.\) An invertible matrix is also called a nonsingular matrix. A square matrix that is not invertible is called a singular matrix.

Example. For \(A=\left[\begin{array}{rrr}1&2\\4&6\end{array} \right]\) and \(B=\left[\begin{array}{rrr}-3&1\\2&-0.5\end{array} \right]\), \(AB=\left[\begin{array}{rrr}1&0\\0&1\end{array} \right] =BA.\) So \(B=A^{-1}\).
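This can be checked numerically. The following is a minimal sketch using NumPy (the library choice is an assumption of this sketch, not part of the notes):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [4.0, 6.0]])
B = np.array([[-3.0, 1.0],
              [2.0, -0.5]])

# Both products should equal the 2x2 identity matrix I_2.
print(A @ B)                                  # [[1. 0.] [0. 1.]]
print(B @ A)                                  # [[1. 0.] [0. 1.]]
print(np.allclose(A @ B, np.eye(2)) and
      np.allclose(B @ A, np.eye(2)))          # True, so B = A^{-1}
```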

Theorem. Let \(A\) and \(B\) be two \(n\times n\) invertible matrices. Then the following hold.

  1. \(A^{-1}\) is invertible and \((A^{-1})^{-1}=A\).

  2. \(A^T\) is invertible and \((A^T)^{-1}= (A^{-1})^T\).

  3. For \(c\neq 0\), \(cA\) is invertible and \((cA)^{-1}=\frac{1}{c}A^{-1}\).

  4. \(AB\) is invertible and \((AB)^{-1}=B^{-1}A^{-1}\).

Parts 1 and 3 are exercises. For part 2 note that \[\begin{align*} A^T(A^{-1})^T & =(A^{-1}A)^T=I_n^T=I_n \text{ and }\\ (A^{-1})^TA^T &=(AA^{-1})^T=I_n^T=I_n. \end{align*}\] For part 4 note that \[\begin{align*} (AB) (B^{-1}A^{-1}) & =A(BB^{-1}) A^{-1} =AI_n A^{-1} =A A^{-1}=I_n \text{ and }\\ (B^{-1}A^{-1}) (AB) & =B^{-1} (A^{-1}A) B =B^{-1} I_n B = B^{-1}B=I_n. \end{align*}\]

Example. For \(A=\left[\begin{array}{rrr}1&1\\3&4\end{array} \right]\) and \(B=\left[\begin{array}{rrr}1&2\\2&5\end{array} \right]\), \(A^{-1}=\left[\begin{array}{rrr}4&-1\\-3&1\end{array} \right]\) and \(B^{-1}=\left[\begin{array}{rrr}5&-2\\-2&1\end{array} \right]\). Verify \((A^T)^{-1}=\left[\begin{array}{rrr}1&3\\1&4\end{array} \right]^{-1} = \left[\begin{array}{rrr}4&-3\\-1&1\end{array} \right] = (A^{-1})^T\), \((5A)^{-1}=\frac{1}{5} \left[\begin{array}{rrr}4&-1\\-3&1\end{array} \right] =\frac{1}{5} A^{-1}\), and \[(AB)^{-1}=\left[\begin{array}{rrr}3&7\\11&26\end{array} \right]^{-1} =\left[\begin{array}{rrr}26&-7\\-11&3\end{array} \right] =B^{-1}A^{-1}.\]
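The same identities can be verified numerically. A NumPy sketch (the library and the use of `np.linalg.inv` are assumptions added here; the matrices are the ones from this example):

```python
import numpy as np

A = np.array([[1.0, 1.0], [3.0, 4.0]])
B = np.array([[1.0, 2.0], [2.0, 5.0]])

Ainv = np.linalg.inv(A)   # [[ 4, -1], [-3, 1]]
Binv = np.linalg.inv(B)   # [[ 5, -2], [-2, 1]]

# Part 2: (A^T)^{-1} = (A^{-1})^T
print(np.allclose(np.linalg.inv(A.T), Ainv.T))         # True
# Part 3: (5A)^{-1} = (1/5) A^{-1}
print(np.allclose(np.linalg.inv(5 * A), Ainv / 5))     # True
# Part 4: (AB)^{-1} = B^{-1} A^{-1}
print(np.allclose(np.linalg.inv(A @ B), Binv @ Ainv))  # True
```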


How do we know a given square matrix \(A\) is invertible? How do we find \(A^{-1}\)?

Theorem. Let \(A\) be an \(n\times n\) matrix. Then the following are equivalent.

  1. \(A\) is invertible.

  2. \(A\overrightarrow{x}=\overrightarrow{b}\) has a unique solution for each \(\overrightarrow{b}\in \mathbb R^n\).

  3. The RREF of \(A\) is \(I_n\).

(2) \(\iff\) (3): \(A\overrightarrow{x}=\overrightarrow{b}\) has a unique solution for each \(\overrightarrow{b}\in \mathbb R^n\) if and only if each column of the RREF of \(A\) has a leading 1, which holds if and only if the RREF of \(A\) is \(I_n\).
(1) \(\implies\) (2): Suppose \(A\) is invertible and let \(\overrightarrow{b}\in \mathbb R^n\). Then \(\overrightarrow{x}=A^{-1}\overrightarrow{b}\) is a solution since \(A(A^{-1}\overrightarrow{b})=I_n\overrightarrow{b}=\overrightarrow{b}\), and it is the only one because \(A\overrightarrow{x}=\overrightarrow{b}\) implies \(\overrightarrow{x}=A^{-1}A\overrightarrow{x}=A^{-1}\overrightarrow{b}\).
(2) \(\implies\) (1): Suppose \(A\overrightarrow{x}=\overrightarrow{b}\) has a unique solution for each \(\overrightarrow{b}\in \mathbb R^n\). For \(i=1,2,\ldots,n\) let \(\overrightarrow{v_i}\) be the solution of \(A\overrightarrow{x}=\overrightarrow{e_i}\), so that \(A\overrightarrow{v_i}=\overrightarrow{e_i}\). Then \[A[\overrightarrow{v_1}\:\overrightarrow{v_2}\:\cdots\overrightarrow{v_n}] =[A\overrightarrow{v_1}\: A\overrightarrow{v_2}\:\cdots A\overrightarrow{v_n}] =[\overrightarrow{e_1}\:\overrightarrow{e_2}\:\cdots\overrightarrow{e_n}]=I_n.\] To show \(A^{-1}=[\overrightarrow{v_1}\:\overrightarrow{v_2}\:\cdots\overrightarrow{v_n}]\), it suffices to show \([\overrightarrow{v_1}\:\overrightarrow{v_2}\:\cdots\overrightarrow{v_n}]A=I_n\). Since \(A[\overrightarrow{v_1}\:\overrightarrow{v_2}\:\cdots\overrightarrow{v_n}]=I_n\), \[A[\overrightarrow{v_1}\:\overrightarrow{v_2}\:\cdots\overrightarrow{v_n}]A =I_nA=A.\] Let \(\overrightarrow{b_i}\) be the \(i\)th column of \([\overrightarrow{v_1}\:\overrightarrow{v_2}\:\cdots\overrightarrow{v_n}]A\) and \(\overrightarrow{a_i}\) the \(i\)th column of \(A\) for \(i=1,2,\ldots,n\). Then \(A\overrightarrow{b_i}=\overrightarrow{a_i}\). But also \(A\overrightarrow{e_i}=\overrightarrow{a_i}\). By the uniqueness of the solution of \(A\overrightarrow{x}=\overrightarrow{a_i}\), \(\overrightarrow{b_i}=\overrightarrow{e_i}\) for \(i=1,2,\ldots,n\). Thus \[[\overrightarrow{v_1}\:\overrightarrow{v_2}\:\cdots\overrightarrow{v_n}]A =[\overrightarrow{e_1}\:\overrightarrow{e_2}\:\cdots\overrightarrow{e_n}]=I_n.\]
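The third condition gives a simple computational test for invertibility. A sketch using SymPy's `rref` (the library choice and the singular \(2\times 2\) example are assumptions added here; the \(3\times 3\) matrix is the one used later in these notes):

```python
import sympy as sp

A = sp.Matrix([[0, 2, 4],
               [1, -3, 0],
               [-1, 3, 1]])

R, pivots = A.rref()               # row-reduce A
print(R == sp.eye(3))              # True: the RREF is I_3, so A is invertible

# A singular example: the second row is twice the first, so the RREF is not I_2.
S = sp.Matrix([[1, 2], [2, 4]])
print(S.rref()[0] == sp.eye(2))    # False: S is not invertible
```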

To find \(A^{-1}\) for an invertible matrix \(A\), we investigate how row operations on \(A\) are obtained from premultiplying \(A\) by elementary matrices.

Definition. An \(n\times n\) elementary matrix is obtained by applying an elementary row operation on \(I_n\).

Example.

  1. \(E_{ij}\) is obtained by \(R_i\leftrightarrow R_j\) on \(I_n\). Note that \(E_{ij}A\) is obtained by \(R_i\leftrightarrow R_j\) on \(A\). \[\begin{align*} A=\left[\begin{array}{rrr}0&2&4\\1&-3&0\\-1&3&1\end{array} \right]\xrightarrow{R_1\leftrightarrow R_2} \left[\begin{array}{rrr}1&-3&0\\0&2&4\\-1&3&1\end{array} \right] &=\left[\begin{array}{rrr}0&1&0\\1&0&0\\0&0&1\end{array} \right] \left[\begin{array}{rrr}0&2&4\\1&-3&0\\-1&3&1\end{array} \right]\\ &=E_{12}A. \end{align*}\]

  2. For \(c\neq 0\), \(E_{i}(c)\) is obtained by \(cR_i\) on \(I_n\). Note that \(E_{i}(c)A\) is obtained by \(cR_i\) on \(A\). \[\begin{align*} E_{12}A=\left[\begin{array}{rrr}1&-3&0\\0&2&4\\-1&3&1\end{array} \right]\xrightarrow{\frac{1}{2} R_2} \left[\begin{array}{rrr}1&-3&0\\0&1&2\\-1&3&1\end{array} \right] &=\left[\begin{array}{rrr}1&0&0\\0&\frac{1}{2}&0\\0&0&1\end{array} \right] \left[\begin{array}{rrr}1&-3&0\\0&2&4\\-1&3&1\end{array} \right]\\ &=E_2\left(\frac{1}{2}\right) E_{12}A. \end{align*}\]

  3. \(E_{ij}(c)\) is obtained by \(cR_i+R_j\) on \(I_n\). Note that \(E_{ij}(c)A\) is obtained by \(cR_i+R_j\) on \(A\). \[\begin{align*} E_2\left(\frac{1}{2}\right) E_{12}A=\left[\begin{array}{rrr}1&-3&0\\0&1&2\\-1&3&1\end{array} \right]\xrightarrow{R_1+ R_3} \left[\begin{array}{rrr}1&-3&0\\0&1&2\\0&0&1\end{array} \right] &=\left[\begin{array}{rrr}1&0&0\\0&1&0\\1&0&1\end{array} \right] \left[\begin{array}{rrr}1&-3&0\\0&1&2\\-1&3&1\end{array} \right]\\ &=E_{13}(1)E_2\left(\frac{1}{2}\right) E_{12}A. \end{align*}\]

Remark. Elementary matrices are invertible. Moreover, \(E_{ij}^{-1}=E_{ij}\), \(E_{i}(c)^{-1}=E_{i}\left(\frac{1}{c}\right)\) for \(c\neq 0\), and \(E_{ij}(c)^{-1}=E_{ij}(-c)\).
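Both the premultiplication fact and the inverse formulas in the remark can be checked numerically. A NumPy sketch (the variable names encoding \(E_{12}\), \(E_2(\frac{1}{2})\), \(E_{13}(1)\) are assumptions added here; the matrix \(A\) is the one from the \(3\times 3\) example above):

```python
import numpy as np

A = np.array([[0.0, 2.0, 4.0],
              [1.0, -3.0, 0.0],
              [-1.0, 3.0, 1.0]])

# Build each elementary matrix by applying the row operation to I_3.
E12 = np.eye(3); E12[[0, 1]] = E12[[1, 0]]   # R1 <-> R2
E2_half = np.eye(3); E2_half[1, 1] = 0.5     # (1/2) R2
E13_1 = np.eye(3); E13_1[2, 0] = 1.0         # R1 + R3

# Premultiplying by an elementary matrix applies the same row operation to A.
print(E12 @ A)                               # rows 1 and 2 of A swapped
print(E13_1 @ (E2_half @ (E12 @ A)))         # matches the result in the example above

# Inverses from the remark: E_ij^{-1} = E_ij, E_i(c)^{-1} = E_i(1/c), E_ij(c)^{-1} = E_ij(-c).
print(np.allclose(np.linalg.inv(E12), E12))                      # True
print(np.allclose(np.linalg.inv(E2_half), np.diag([1, 2, 1])))   # True
E13_neg1 = np.eye(3); E13_neg1[2, 0] = -1.0
print(np.allclose(np.linalg.inv(E13_1), E13_neg1))               # True
```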

Theorem. Let \(A\) be an \(n\times n\) invertible matrix. A sequence of elementary row operations that reduces \(A\) to \(I_n\) also reduces \(I_n\) to \(A^{-1}\).

Since \(A\) is invertible, the RREF of \(A\) is \(I_n\). Suppose \(I_n\) is obtained from \(A\) by successively premultiplying by elementary matrices \(E_1,E_2,\ldots,E_k\), i.e., \[E_kE_{k-1}\cdots E_1A=I_n.\] Postmultiplying by \(A^{-1}\), we get \[E_kE_{k-1}\cdots E_1AA^{-1}=I_nA^{-1} \implies E_kE_{k-1}\cdots E_1 I_n=A^{-1}.\]

Gauss-Jordan elimination:
Find the RREF of \([A\;|\;I_n]\). If the RREF of \(A\) is \(I_n\), then \(A\) is invertible and the RREF of \([A\;|\;I_n]\) is \([I_n\;|\;A^{-1}]\). Otherwise \(A\) is not invertible.
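A minimal implementation of this procedure (a NumPy sketch written for these notes, not an excerpt from them; the function name, partial pivoting, and tolerance are choices made here):

```python
import numpy as np

def inverse_by_gauss_jordan(A, tol=1e-12):
    """Row-reduce [A | I_n] to [I_n | A^{-1}]; return None if A is singular."""
    A = np.array(A, dtype=float)
    n = A.shape[0]
    M = np.hstack([A, np.eye(n)])            # the augmented matrix [A | I_n]
    for col in range(n):
        # Choose a pivot row (partial pivoting for numerical stability).
        pivot = col + np.argmax(np.abs(M[col:, col]))
        if abs(M[pivot, col]) < tol:
            return None                      # no pivot in this column: A is singular
        M[[col, pivot]] = M[[pivot, col]]    # R_col <-> R_pivot
        M[col] /= M[col, col]                # scale the pivot row to get a leading 1
        for row in range(n):
            if row != col:
                M[row] -= M[row, col] * M[col]   # clear the rest of the column
    return M[:, n:]                          # the right half is A^{-1}

A = [[0, 2, 4], [1, -3, 0], [-1, 3, 1]]
print(inverse_by_gauss_jordan(A))
# [[ 1.5 -5.  -6. ]
#  [ 0.5 -2.  -2. ]
#  [ 0.   1.   1. ]]
```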

Example. \[[A\;|\;I_3]=\left[\begin{array}{rrr|rrr}0&2&4&1&0&0\\1&-3&0&0&1&0\\-1&3&1&0&0&1\end{array} \right] \xrightarrow{R_1\leftrightarrow R_2} \left[\begin{array}{rrr|rrr}1&-3&0&0&1&0\\0&2&4&1&0&0\\-1&3&1&0&0&1\end{array} \right] \] \[ \xrightarrow{R_1+ R_3} \left[\begin{array}{rrr|rrr}1&-3&0&0&1&0\\0&2&4&1&0&0\\0&0&1&0&1&1\end{array} \right] \xrightarrow{-4R_3+ R_2} \left[\begin{array}{rrr|rrr}1&-3&0&0&1&0\\0&2&0&1&-4&-4\\0&0&1&0&1&1\end{array} \right] \] \[ \xrightarrow{\frac{1}{2} R_2} \left[\begin{array}{rrr|rrr}1&-3&0&0&1&0\\0&1&0&\frac{1}{2}&-2&-2\\0&0&1&0&1&1\end{array} \right] \xrightarrow{3R_2+ R_1} \left[\begin{array}{rrr|rrr}1&0&0&\frac{3}{2}&-5&-6\\0&1&0&\frac{1}{2}&-2&-2\\0&0&1&0&1&1\end{array} \right] =[I_3\;|\;A^{-1}] \] Thus \(A^{-1}=\left[\begin{array}{rrr}\frac{3}{2}&-5&-6\\ \frac{1}{2}&-2&-2\\0&1&1\end{array} \right]\). Notice how elementary matrices \(E_{12}\), \(E_{13}(1)\), \(E_{32}(-4)\), \(E_2\left(\frac{1}{2}\right)\), \(E_{21}(3)\) are successively applied on \(A\) to get \(I_3\):
\[E_{21}(3) E_2\left(\frac{1}{2}\right) E_{32}(-4) E_{13}(1) E_{12}A= I_3.\] Verify that the product of those elementary matrices is \(A^{-1}\):
\[A^{-1}=E_{21}(3) E_2\left(\frac{1}{2}\right) E_{32}(-4) E_{13}(1) E_{12}.\]
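This factorization can also be checked numerically (a NumPy sketch; the variable names encoding \(E_{12}\), \(E_{13}(1)\), \(E_{32}(-4)\), \(E_2(\frac{1}{2})\), \(E_{21}(3)\) are assumptions added here):

```python
import numpy as np

# Elementary matrices from the example, built by applying each row operation to I_3.
E12 = np.eye(3); E12[[0, 1]] = E12[[1, 0]]   # R1 <-> R2
E13_1 = np.eye(3); E13_1[2, 0] = 1.0         # R1 + R3
E32_m4 = np.eye(3); E32_m4[1, 2] = -4.0      # -4 R3 + R2
E2_half = np.eye(3); E2_half[1, 1] = 0.5     # (1/2) R2
E21_3 = np.eye(3); E21_3[0, 1] = 3.0         # 3 R2 + R1

A = np.array([[0.0, 2.0, 4.0], [1.0, -3.0, 0.0], [-1.0, 3.0, 1.0]])
P = E21_3 @ E2_half @ E32_m4 @ E13_1 @ E12
print(np.allclose(P @ A, np.eye(3)))         # True: the product reduces A to I_3
print(np.allclose(P, np.linalg.inv(A)))      # True: the product is A^{-1}
```
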
Remark. For an \(m\times n\) matrix \(A\) there is a generalized inverse called the Moore-Penrose inverse, denoted by \(A^+\), which can be found using the singular-value decomposition of \(A\).
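For instance, a small NumPy sketch (the \(3\times 2\) matrix is an assumed example, and it has full column rank so every singular value is nonzero) builds \(A^+\) from the SVD and compares it with NumPy's built-in `pinv`:

```python
import numpy as np

# A non-square matrix has no inverse, but it has a Moore-Penrose inverse A^+.
A = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])

# From the reduced SVD A = U S V^T, take A^+ = V S^{-1} U^T
# (reciprocals are valid here because all singular values are nonzero).
U, s, Vt = np.linalg.svd(A, full_matrices=False)
A_plus = Vt.T @ np.diag(1.0 / s) @ U.T

print(np.allclose(A_plus, np.linalg.pinv(A)))   # True: matches NumPy's built-in pinv
print(np.allclose(A @ A_plus @ A, A))           # True: one of the Penrose conditions
```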

