Definition.
A function \(T: V \to W\) from a subspace \(V\) of \(\mathbb R^n\) to a subspace \(W\) of \(\mathbb R^m\) is called a
linear transformation if
\(T(\overrightarrow{u}+\overrightarrow{v})= T(\overrightarrow{u})+T(\overrightarrow{v})\) for all
\(\overrightarrow{u}, \overrightarrow{v} \in V\) and
\(T(c\overrightarrow{v})=cT(\overrightarrow{v})\) for all \(\overrightarrow{v}\in V\) and all scalars
\(c \in \mathbb R\).
In short, a function \(T: V \to W\) is a linear transformation if it preserves linear combinations of vectors:
\(T(c\overrightarrow{u}+d\overrightarrow{v})= cT(\overrightarrow{u})+dT(\overrightarrow{v})\) for all
\(\overrightarrow{u}, \overrightarrow{v} \in V\) and all scalars \(c,d \in \mathbb R\).
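A short sketch of why the combined condition is equivalent to the two defining conditions: applying additivity first and then homogeneity gives
\[T(c\overrightarrow{u}+d\overrightarrow{v})=T(c\overrightarrow{u})+T(d\overrightarrow{v})=cT(\overrightarrow{u})+dT(\overrightarrow{v}).\]
Conversely, taking \(c=d=1\) in the combined condition recovers additivity, and taking \(d=0\) recovers \(T(c\overrightarrow{u})=cT(\overrightarrow{u})\).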
Example.
The projection \(T:\mathbb R^3 \to \mathbb R^3\) of \(\mathbb R^3\) onto the \(xy\)-plane in \(\mathbb R^3\) is
defined by
\[T \left( \left[\begin{array}{c}x_1\\x_2\\x_3 \end{array} \right] \right) = \left[\begin{array}{c}x_1\\x_2\\0 \end{array} \right] \text{ for all } \overrightarrow{x}=\left[\begin{array}{c}x_1\\x_2\\x_3 \end{array} \right] \in \mathbb R^3.\]
Sometimes it is simply denoted by \(T(x_1,x_2,x_3)=(x_1,x_2,0)\) in terms of row vectors. To show it is a linear transformation,
let \(\overrightarrow{x}=(x_1,x_2,x_3)\) and \(\overrightarrow{y}=(y_1,y_2,y_3)\) in \(\mathbb R^3\) and \(c,d\in \mathbb R\).
Then
\[\begin{align*}
T(c\overrightarrow{x}+d\overrightarrow{y}) &= T(cx_1+dy_1,cx_2+dy_2,cx_3+dy_3)\\
&= (cx_1+dy_1,cx_2+dy_2,0)\\
&= (cx_1,cx_2,0)+(dy_1,dy_2,0)\\
&=cT(\overrightarrow{x})+dT(\overrightarrow{y}).
\end{align*}\]
For the matrix \(A=\left[\begin{array}{rr}1&2\\0&1 \end{array}\right]\), define the shear transformation
\(T:\mathbb R^2\to \mathbb R^2\) by \(T(\overrightarrow{x})=A\overrightarrow{x}\). Let \(\overrightarrow{x}, \overrightarrow{y}\in \mathbb R^2\)
and \(c,d\in \mathbb R\). Then
\[T(c\overrightarrow{x}+d\overrightarrow{y})=A(c\overrightarrow{x}+d\overrightarrow{y})
= cA \overrightarrow{x}+dA \overrightarrow{y}= cT(\overrightarrow{x})+dT(\overrightarrow{y}).\]
Thus \(T\) is a linear transformation that transforms the square with vertices \((0,0),(1,0),(1,1),(0,1)\)
to the parallelogram with vertices \((0,0),(1,0),(3,1),(2,1)\).
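As a quick check of the vertices, one can apply \(A\) to the three nonzero corners of the square:
\[A\left[\begin{array}{c}1\\0 \end{array} \right]=\left[\begin{array}{c}1\\0 \end{array} \right],\quad
A\left[\begin{array}{c}1\\1 \end{array} \right]=\left[\begin{array}{c}1+2\\1 \end{array} \right]=\left[\begin{array}{c}3\\1 \end{array} \right],\quad
A\left[\begin{array}{c}0\\1 \end{array} \right]=\left[\begin{array}{c}2\\1 \end{array} \right].\]
The origin stays fixed because \(A\overrightarrow{0_2}=\overrightarrow{0_2}\).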
Definition.
A matrix transformation is the linear transformation \(T:\mathbb R^n\to \mathbb R^m\) defined by
\(T(\overrightarrow{x})=A\overrightarrow{x}\) for some \(m\times n\) matrix \(A\). It is denoted by \(\overrightarrow{x} \mapsto A\overrightarrow{x}\).
From the definition of a linear transformation we have the following properties.
Property
For a linear transformation \(T: V \to W\) where \(V\leq \mathbb R^n\) and \(W\leq \mathbb R^m\),
\(T(\overrightarrow{0_n})=\overrightarrow{0_m}\) and
for all \(\overrightarrow{v_1}, \ldots,\overrightarrow{v_k} \in V\) and all \(c_1,\ldots,c_k \in \mathbb R\),
\[T(c_1\overrightarrow{v_1}+c_2\overrightarrow{v_2}+\cdots+c_k\overrightarrow{v_k})
= c_1 T(\overrightarrow{v_1})+c_2 T(\overrightarrow{v_2})+\cdots+c_k T(\overrightarrow{v_k}).\]
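One way to see the first property (a brief sketch): pick any \(\overrightarrow{v}\in V\) and write \(\overrightarrow{0_n}=0\overrightarrow{v}\). Then
\[T(\overrightarrow{0_n})=T(0\overrightarrow{v})=0\,T(\overrightarrow{v})=\overrightarrow{0_m}.\]
The second property follows by applying additivity and homogeneity repeatedly to the \(k\) terms of the sum.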
Example.
Consider the function \(T:\mathbb R^3 \to \mathbb R^3\) defined by \(T(x_1,x_2,x_3)=(x_1,x_2,5)\). Since
\(T(0,0,0)=(0,0,5) \neq (0,0,0)\), \(T\) is not a linear transformation.
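The failure can also be seen directly from additivity. For \(\overrightarrow{x}=(x_1,x_2,x_3)\) and \(\overrightarrow{y}=(y_1,y_2,y_3)\) in \(\mathbb R^3\),
\[T(\overrightarrow{x}+\overrightarrow{y})=(x_1+y_1,x_2+y_2,5) \quad\text{but}\quad T(\overrightarrow{x})+T(\overrightarrow{y})=(x_1+y_1,x_2+y_2,10),\]
so \(T(\overrightarrow{x}+\overrightarrow{y})\neq T(\overrightarrow{x})+T(\overrightarrow{y})\).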
Theorem.
For a linear transformation \(T:\mathbb R^n \to \mathbb R^m\), there exists a unique \(m\times n\) matrix \(A\),
called the standard matrix of \(T\), for which
\[T(\overrightarrow{x})=A\overrightarrow{x} \text{ for all } \overrightarrow{x}\in \mathbb R^n.\]
Moreover, \(A=[T(\overrightarrow{e_1})\: T(\overrightarrow{e_2}) \:\cdots T(\overrightarrow{e_n})]\) where
\(\overrightarrow{e_i}\) is the \(i\)th column of \(I_n\).
Let \(\overrightarrow{x}=[x_1,x_2,\ldots,x_n]^T \in \mathbb R^n\). We can write \(\overrightarrow{x}=x_1\overrightarrow{e_1}+x_2\overrightarrow{e_2}+\cdots+x_n\overrightarrow{e_n}\).
Then
\[\begin{align*}
T(\overrightarrow{x})=T(x_1\overrightarrow{e_1}+x_2\overrightarrow{e_2}+\cdots+x_n\overrightarrow{e_n})
&=x_1 T(\overrightarrow{e_1})+x_2 T(\overrightarrow{e_2})+\cdots+x_n T(\overrightarrow{e_n})\\ &=[T(\overrightarrow{e_1})\: T(\overrightarrow{e_2}) \:\cdots T(\overrightarrow{e_n})]
\left[\begin{array}{c}x_1\\x_2\\ \vdots\\x_n \end{array} \right]\\
&= A\overrightarrow{x}.
\end{align*}\]
For uniqueness, suppose an \(m\times n\) matrix \(B\) also satisfies \(T(\overrightarrow{x})=B\overrightarrow{x}\) for all \(\overrightarrow{x}\in \mathbb R^n\).
Then for each \(i\), the \(i\)th column of \(B\) is \(B\overrightarrow{e_i}=T(\overrightarrow{e_i})\), which is the \(i\)th column of \(A\). Thus \(B=A\).
Example.
Use the standard matrix to find the rotation transformation \(T:\mathbb R^2 \to \mathbb R^2\) that rotates
each point of \(\mathbb R^2\) about the origin through an angle \(\theta\) counterclockwise.
Solution. By trigonometry we have
\[T(\overrightarrow{e_1})=T\left(\left[\begin{array}{c}1\\0 \end{array} \right] \right)=\left[\begin{array}{c}\cos \theta \\ \sin\theta \end{array} \right] \text{ and }
T(\overrightarrow{e_2})=T\left(\left[\begin{array}{c}0\\1 \end{array} \right] \right)=\left[\begin{array}{r} -\sin\theta\\ \cos \theta \end{array} \right].\]
Then the standard matrix is \(A=[T(\overrightarrow{e_1})\: T(\overrightarrow{e_2}) ]
=\left[\begin{array}{rr} \cos \theta& -\sin\theta\\ \sin\theta& \cos \theta \end{array} \right].\) Thus
\[T(\overrightarrow{x})=A\overrightarrow{x}, \text{ i.e., }
T\left( \left[\begin{array}{c}x_1\\x_2 \end{array} \right] \right)
=\left[\begin{array}{c}x_1\cos \theta -x_2\sin\theta \\ x_1\sin\theta +x_2\cos \theta \end{array} \right]
\text{ for all } \overrightarrow{x}\in \mathbb R^2.\]
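For instance, taking \(\theta=\pi/2\) (a value chosen here only for illustration), we have \(\cos\theta=0\) and \(\sin\theta=1\), so
\[A=\left[\begin{array}{rr} 0& -1\\ 1& 0 \end{array} \right] \text{ and }
T\left( \left[\begin{array}{c}x_1\\x_2 \end{array} \right] \right)
=\left[\begin{array}{r}-x_2 \\ x_1 \end{array} \right].\]
In particular \(T(1,0)=(0,1)\) and \(T(0,1)=(-1,0)\), as expected for a quarter turn counterclockwise.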
Consider the linear transformation \(T:\mathbb R^2 \to \mathbb R^3\) defined by
\[T(x_1,x_2)=(x_1-x_2,2x_1+3x_2,4x_2).\]
Note that \(T(\overrightarrow{e_1})=T(1,0)=(1,2,0)\) and \(T(\overrightarrow{e_2})=T(0,1)=(-1,3,4)\). The standard matrix
of \(T\) is
\[A=[T(\overrightarrow{e_1})\: T(\overrightarrow{e_2}) ]
=\left[\begin{array}{rr} 1&-1\\ 2&3\\ 0&4 \end{array} \right].\]
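As a check, multiplying \(A\) by a general vector recovers the defining formula of \(T\):
\[A\left[\begin{array}{c}x_1\\x_2 \end{array} \right]
=x_1\left[\begin{array}{r}1\\2\\0 \end{array} \right]+x_2\left[\begin{array}{r}-1\\3\\4 \end{array} \right]
=\left[\begin{array}{c}x_1-x_2\\2x_1+3x_2\\4x_2 \end{array} \right].\]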
For any given linear transformation \(T:\mathbb R^n \to \mathbb R^m\), the domain space is \(\mathbb R^n\) and
the codomain space is \(\mathbb R^m\). We study a subspace of the domain space called the kernel (or null space) and
a subspace of the codomain space called the image space (or range).
Definition.
The kernel or null space of a linear transformation \(T:\mathbb R^n \to \mathbb R^m\), denoted by
\(\ker (T)\) or \(\ker T\), is the following subspace of \(\mathbb R^n\):
\[\ker T= \{\overrightarrow{x} \in \mathbb R^n \;|\; T(\overrightarrow{x})=\overrightarrow{0_m}\}.\]
The nullity of \(T\), denoted by \(\operatorname{nullity}(T)\), is the dimension of \(\ker T\), i.e.,
\[\operatorname{nullity}(T)=\operatorname{dim}(\ker T).\]
Remark.
If \(A\) is the standard matrix of a linear transformation \(T:\mathbb R^n \to \mathbb R^m\), then
\(\ker T=\operatorname{NS}(A)\) and \(\operatorname{nullity}(T)=\operatorname{nullity}(A)\).
Example.
The linear transformation \(T:\mathbb R^3 \to \mathbb R^2\) defined by \(T(x_1,x_2,x_3)=(x_1,x_2)\) has the standard matrix
\(A=[T(\overrightarrow{e_1})\: T(\overrightarrow{e_2}) \: T(\overrightarrow{e_3})]
=\left[\begin{array}{rrr} 1&0&0\\ 0&1&0 \end{array} \right]\). Note that
\[\ker T=\operatorname{NS}(A)=\operatorname{Span} \left\lbrace \left[\begin{array}{r} 0\\0\\1 \end{array} \right] \right\rbrace,\]
and \(\operatorname{nullity}(T)=\operatorname{nullity}(A)=1\).
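The null space here can be found by solving \(A\overrightarrow{x}=\overrightarrow{0_2}\) directly:
\[A\left[\begin{array}{c}x_1\\x_2\\x_3 \end{array} \right]=\left[\begin{array}{c}x_1\\x_2 \end{array} \right]=\left[\begin{array}{c}0\\0 \end{array} \right]
\implies x_1=x_2=0 \text{ and } x_3 \text{ is free},\]
so every solution has the form \(\overrightarrow{x}=x_3\left[\begin{array}{r} 0\\0\\1 \end{array} \right]\).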
Definition.
The image space or range of a linear transformation \(T:\mathbb R^n \to \mathbb R^m\), denoted by
\(\operatorname{im} (T)\) or \(\operatorname{im} T\) or \(T(\mathbb R^n)\), is the following subspace of \(\mathbb R^m\):
\[\operatorname{im} T= \{T(\overrightarrow{x}) \;|\; \overrightarrow{x} \in \mathbb R^n\}.\]
The rank of \(T\), denoted by \(\operatorname{rank}(T)\), is the dimension of \(\operatorname{im} T\), i.e.,
\[\operatorname{rank}(T)=\operatorname{dim}(\operatorname{im} T).\]
Remark.
If \(A\) is the standard matrix of a linear transformation \(T:\mathbb R^n \to \mathbb R^m\), then
\(\operatorname{im} T=\operatorname{CS}\left(A\right)\)
and \(\operatorname{rank}(T)=\operatorname{rank}(A)\).
Example.
The linear transformation \(T:\mathbb R^2 \to \mathbb R^3\) defined by \(T(x_1,x_2)=(x_1,x_2,0)\) has the standard matrix
\(A=[T(\overrightarrow{e_1})\: T(\overrightarrow{e_2}) ]
=\left[\begin{array}{rr} 1&0\\ 0&1\\ 0&0 \end{array} \right]\).
Note that
\[\operatorname{im} T=\operatorname{CS}\left(A\right)=\operatorname{Span} \left\lbrace \left[\begin{array}{r} 1\\0\\0 \end{array} \right],
\left[\begin{array}{r} 0\\1\\0 \end{array} \right] \right\rbrace,\]
and \(\operatorname{rank}(T)=\operatorname{rank}(A)=2\).
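To see the span explicitly, write a general output as a linear combination of the columns of \(A\):
\[A\left[\begin{array}{c}x_1\\x_2 \end{array} \right]=\left[\begin{array}{c}x_1\\x_2\\0 \end{array} \right]
=x_1\left[\begin{array}{r} 1\\0\\0 \end{array} \right]+x_2\left[\begin{array}{r} 0\\1\\0 \end{array} \right].\]
The two columns are linearly independent, so they form a basis of \(\operatorname{im} T\).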
Theorem. (Rank-Nullity Theorem)
For a linear transformation \(T:\mathbb R^n \to \mathbb R^m\),
\[\operatorname{rank}(T)+\operatorname{nullity}(T)=n.\]
Let \(A\) be the \(m\times n\) standard matrix of \(T\). Then by the Rank-Nullity Theorem on \(A\),
\[\operatorname{rank}(T)+\operatorname{nullity}(T)=\operatorname{rank}(A)+\operatorname{nullity}(A)=n.\]
Example.
The linear transformation \(T:\mathbb R^3 \to \mathbb R^2\) defined by \(T(x_1,x_2,x_3)=(x_1,x_2)\) has
\(\operatorname{nullity}(T)=1\) (see the earlier example). Then by the Rank-Nullity Theorem,
\[\operatorname{rank}(T)=3-\operatorname{nullity}(T)=2.\]
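The same computation can be run in the other direction. For instance, the linear transformation \(T:\mathbb R^2 \to \mathbb R^3\) defined by
\(T(x_1,x_2)=(x_1,x_2,0)\) has \(\operatorname{rank}(T)=2\) (see the earlier example) and domain \(\mathbb R^2\), so
\[\operatorname{nullity}(T)=2-\operatorname{rank}(T)=0.\]
This agrees with the direct observation that \((x_1,x_2,0)=(0,0,0)\) forces \(x_1=x_2=0\).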
Now we discuss two important types of linear transformations \(T:\mathbb R^n \to \mathbb R^m\).
Definition.
Let \(T:\mathbb R^n \to \mathbb R^m\) be a linear transformation. \(T\) is onto if each
\(\overrightarrow{b}\in \mathbb R^m\) has a pre-image \(\overrightarrow{x}\) in \(\mathbb R^n\) under \(T\), i.e.,
\(T(\overrightarrow{x})=\overrightarrow{b}\). \(T\) is one-to-one if each \(\overrightarrow{b}\in \mathbb R^m\)
has at most one pre-image in \(\mathbb R^n\) under \(T\).
Example.
The linear transformation \(T:\mathbb R^3 \to \mathbb R^2\) defined by \(T(x_1,x_2,x_3)=(x_1,x_2)\) is onto
because each \((x_1,x_2)\in \mathbb R^2\) has a pre-image \((x_1,x_2,0)\in \mathbb R^3\) under \(T\). But
\(T\) is not one-to-one because \(T(0,0,0)=T(0,0,1)=(0,0)\), i.e., \((0,0)\) has two distinct pre-images \((0,0,0)\) and
\((0,0,1)\) under \(T\).
The linear transformation \(T:\mathbb R^2 \to \mathbb R^3\) defined by \(T(x_1,x_2)=(x_1,x_2,0)\) is one-to-one
because \(T(x_1,x_2)=T(y_1,y_2) \implies (x_1,x_2,0)=(y_1,y_2,0) \implies (x_1,x_2)=(y_1,y_2)\). But \(T\) is not onto
because \((0,0,1)\in \mathbb R^3\) has no pre-image \((x_1,x_2)\in \mathbb R^2\) under \(T\).
The linear transformation \(T:\mathbb R^2 \to \mathbb R^2\) defined by \(T(x_1,x_2)=(x_1+x_2,x_1-x_2)\) is
one-to-one and onto (exercise).
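One possible approach to the exercise, given here only as a sketch: for a given \((b_1,b_2)\in \mathbb R^2\), solve \(T(x_1,x_2)=(b_1,b_2)\), i.e.,
\[\begin{align*}
x_1+x_2 &= b_1\\
x_1-x_2 &= b_2.
\end{align*}\]
Adding and subtracting the two equations gives
\[x_1=\frac{b_1+b_2}{2} \text{ and } x_2=\frac{b_1-b_2}{2}.\]
A solution exists for every \((b_1,b_2)\), so \(T\) is onto, and the solution is unique, so \(T\) is one-to-one.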
Theorem.
Let \(T:\mathbb R^n \to \mathbb R^m\) be a linear transformation with the standard matrix \(A\). Then the following
are equivalent.
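(a) \(T\) (i.e., \(\overrightarrow{x} \mapsto A\overrightarrow{x}\)) is one-to-one.
(b) \(\ker T=\operatorname{NS}(A)=\{\overrightarrow{0_n}\}\).
(c) \(\operatorname{nullity}(T)=\operatorname{nullity}(A)=0\).
(d) The columns of \(A\) are linearly independent.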
(b), (c), and (d) are equivalent by the definitions.
(a) \(\implies\) (b) Suppose \(T\) (i.e., \(\overrightarrow{x} \mapsto A\overrightarrow{x}\)) is one-to-one.
Let \(\overrightarrow{x} \in \ker T=\operatorname{NS}(A)\). Then \(A\overrightarrow{x}=\overrightarrow{0_m}\).
Also \(\overrightarrow{0_n} \mapsto A\overrightarrow{0_n}=\overrightarrow{0_m}\). Since \(\overrightarrow{x} \mapsto A\overrightarrow{x}\)
is one-to-one, \(\overrightarrow{x}=\overrightarrow{0_n}\). Thus
\[\operatorname{NS}(A)=\{\overrightarrow{0_n}\}.\]
(b) \(\implies\) (a) Suppose \(\ker T=\operatorname{NS}(A)=\{\overrightarrow{0_n}\}\). Let \(\overrightarrow{x},\overrightarrow{y} \in \mathbb R^n\)
such that \(A\overrightarrow{x}= A\overrightarrow{y}\). Then \(A(\overrightarrow{x}-\overrightarrow{y})=\overrightarrow{0_m}\).
Then \(\overrightarrow{x}-\overrightarrow{y} \in \operatorname{NS}(A)=\{\overrightarrow{0_n}\}\) which implies
\(\overrightarrow{x}-\overrightarrow{y}=\overrightarrow{0_n}\), i.e., \(\overrightarrow{x}=\overrightarrow{y}\).
Thus \(\overrightarrow{x} \mapsto A\overrightarrow{x}\) is one-to-one.
Example.
The linear transformation \(T:\mathbb R^2 \to \mathbb R^3\) defined by \(T(x_1,x_2)=(x_1,x_2,0)\) has the standard
matrix \(A=[T(\overrightarrow{e_1})\: T(\overrightarrow{e_2}) ]
=\left[\begin{array}{rr} 1&0\\ 0&1\\ 0&0 \end{array} \right]\). Note that the columns of \(A\) are linearly independent,
\(\ker T=\operatorname{NS}(A)=\{\overrightarrow{0_2}\}\), and \(\operatorname{nullity}(T)=\operatorname{nullity}(A)=0\).
Thus \(T\) (i.e., \(\overrightarrow{x} \mapsto A\overrightarrow{x}\)) is one-to-one.
Theorem.
Let \(T:\mathbb R^n \to \mathbb R^m\) be a linear transformation with the standard matrix \(A\). Then the following are
equivalent.
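(a) \(T\) (i.e., \(\overrightarrow{x} \mapsto A\overrightarrow{x}\)) is onto.
(b) \(\operatorname{im} T=\operatorname{CS}\left(A\right)=\mathbb R^m\).
(c) \(\operatorname{rank}(T)=\operatorname{rank}(A)=m\).
(d) Each row of \(A\) has a pivot position.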
(b), (c), and (d) are equivalent by the definitions.
(a) \(\implies\) (b) Suppose \(T\) (i.e., \(\overrightarrow{x} \mapsto A\overrightarrow{x}\)) is onto.
Let \(\overrightarrow{b} \in \mathbb R^m\). Since \(\overrightarrow{x} \mapsto A\overrightarrow{x}\) is onto,
\(\overrightarrow{b}=A\overrightarrow{x}\) for some \(\overrightarrow{x}\in \mathbb R^n\). Then
\(\overrightarrow{b}=A\overrightarrow{x} \in \operatorname{CS}\left(A\right)\). Thus \(\operatorname{im} T=\operatorname{CS}\left(A\right)=\mathbb R^m\).
(b) \(\implies\) (a) Suppose \(\operatorname{im} T=\operatorname{CS}\left(A\right)=\mathbb R^m\). Let
\(\overrightarrow{b} \in \mathbb R^m\). Since \(\overrightarrow{b} \in \operatorname{CS}\left(A\right)=\mathbb R^m\),
\(\overrightarrow{b}=A\overrightarrow{x}\) for some \(\overrightarrow{x}\in \mathbb R^n\). Thus \(\overrightarrow{x} \mapsto A\overrightarrow{x}\)
is onto.
Example.
The linear transformation \(T:\mathbb R^3 \to \mathbb R^2\) defined by \(T(x_1,x_2,x_3)=(x_1,x_2)\) has the standard
matrix \(A=[T(\overrightarrow{e_1})\: T(\overrightarrow{e_2}) \: T(\overrightarrow{e_3})]
=\left[\begin{array}{rrr} 1&0&0\\ 0&1&0 \end{array} \right]\). Note that each row of \(A\) has a pivot position,
\(\operatorname{im} T=\operatorname{CS}\left(A\right)=\mathbb R^2\), and \(\operatorname{rank}(T)=\operatorname{rank}(A)=2\).
Thus \(T\) (i.e., \(\overrightarrow{x} \mapsto A\overrightarrow{x}\)) is onto.
Definition.
A linear transformation \(T:\mathbb R^n \to \mathbb R^n\) is an isomorphism if it is one-to-one and onto.
Example.
The linear transformation \(T:\mathbb R^2 \to \mathbb R^2\) defined by \(T(x_1,x_2)=(x_1+x_2,x_1-x_2)\) is one-to-one
and onto, and consequently an isomorphism. Showing \(T\) is one-to-one is enough to show \(T\) is an isomorphism by the
following theorem.
Theorem.
Let \(T:\mathbb R^n \to \mathbb R^n\) be a linear transformation with the \(n\times n\) standard matrix \(A\).
Then the following are equivalent.
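(a) \(T\) (i.e., \(\overrightarrow{x} \mapsto A\overrightarrow{x}\)) is an isomorphism.
(b) \(T\) (i.e., \(\overrightarrow{x} \mapsto A\overrightarrow{x}\)) is one-to-one.
(c) \(\ker T=\operatorname{NS}(A)=\{\overrightarrow{0_n}\}\).
(d) \(\operatorname{nullity}(T)=\operatorname{nullity}(A)=0\).
(e) The columns of \(A\) are linearly independent.
(f) \(T\) (i.e., \(\overrightarrow{x} \mapsto A\overrightarrow{x}\)) is onto.
(g) \(\operatorname{im} T=\operatorname{CS}\left(A\right)=\mathbb R^n\).
(h) \(\operatorname{rank}(T)=\operatorname{rank}(A)=n\).
(i) Each row and column of \(A\) has a pivot position.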
(f), (g), (h), and (i) are equivalent by the preceding theorem. (b), (c), (d), and (e) are equivalent by the theorem
before the preceding theorem. Now for the \(n\times n\) standard matrix \(A\),
\(\operatorname{rank}(A)+\operatorname{nullity}(A)=n\). Thus \(\operatorname{nullity}(A)=0\) if and only if
\(\operatorname{rank}(A)=n\), i.e., (d) and (h) are equivalent. Since (b) and (f) are equivalent, they are equivalent to
(a).
Example.
What can we say about \(\operatorname{CS}\left(A\right),\operatorname{NS}(A),\operatorname{rank}(A),
\operatorname{nullity}(A)\), and pivot positions of a \(3\times 3\) matrix with three linearly independent columns?
What about \(\overrightarrow{x} \mapsto A\overrightarrow{x}\)?
Solution.
By the preceding theorem, \(\operatorname{CS}\left(A\right)=\mathbb R^3,\operatorname{NS}(A)
=\{\overrightarrow{0_3}\},\operatorname{rank}(A)=3,\operatorname{nullity}(A)=0\), \(A\) has 3 pivot positions, and
\(\overrightarrow{x} \mapsto A\overrightarrow{x}\) is a one-to-one linear transformation from \(\mathbb R^3\) onto
\(\mathbb R^3\).
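For a concrete illustration, consider the following matrix, chosen here only as an example:
\[A=\left[\begin{array}{rrr} 1&1&0\\ 0&1&1\\ 0&0&1 \end{array} \right].\]
It is already in echelon form with pivot positions in every row and every column. Solving \(A\overrightarrow{x}=\overrightarrow{0_3}\) by back substitution gives
\[x_3=0,\quad x_2=-x_3=0,\quad x_1=-x_2=0,\]
so \(\operatorname{NS}(A)=\{\overrightarrow{0_3}\}\), \(\operatorname{nullity}(A)=0\), and \(\operatorname{rank}(A)=3-0=3\), in agreement with the solution above.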