Linear transformations


One way to think of functions, as you are accustomed, is as input-output boxes: one value goes in, exactly one corresponding value comes out. Another way to think about them is as relationships between sets. If we are dealing with pure sets and the only thing worth talking about is their size, then a bijection can tell us, say, that there are as many rational numbers as there are integers (wait, what?). If we are dealing with additional structure, then we have ways of seeing how two algebras are the same. A monumental concept in mathematics is the idea of a structure-preserving map between two objects. In short, if X and Y are objects with features you care about, and whenever B \subseteq Y has those features the pre-image f^{-1}(B) \subseteq X has them too, then the function f:X \to Y is said to preserve that structure. In particular, f can tell us in what ways X and Y are the same and in what ways they are different.

You are no doubt aware that we are interested in linear structure, namely the ability to add and stretch vectors in a “nice” way (i.e. in a way that follows the vector space axioms). Therefore, functions that preserve our linear structure will preserve the addition and scaling operations. These functions turn out to be wicked nice to deal with, especially in the case that our vector space is finite-dimensional (as it always goes in one’s first linear algebra class). Let’s get started.

Linear transformations

In everything that follows, let V and W be vector spaces and T:V \to W a function between them.

We say that T is a linear transformation (or function, or map) if it preserves addition and scalar multiplication, which means that for all {\mathbf v}, {\mathbf w} \in V and \lambda \in {\mathbf R} we have

T({\mathbf v} + {\mathbf w}) = T({\mathbf v}) + T({\mathbf w}) \qquad T(\lambda {\mathbf v}) = \lambda T({\mathbf v}).

e.g. We have seen loads of linear transformations throughout math classes. Examples include scalar multiplication, matrix multiplication, differentiation, and integration.
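As a quick sanity check (the matrix and vectors below are made up for illustration), we can verify both defining properties numerically for a matrix map:

```python
import numpy as np

# Hypothetical example: check that T(v) = A v preserves
# addition and scaling for a specific matrix A.
A = np.array([[1.0, 2.0],
              [3.0, 4.0]])

def T(v):
    return A @ v

v = np.array([1.0, -2.0])
w = np.array([0.5, 3.0])
lam = 7.0

print(np.allclose(T(v + w), T(v) + T(w)))   # additivity: True
print(np.allclose(T(lam * v), lam * T(v)))  # homogeneity: True
```

Of course, two checks on particular vectors do not prove linearity; the definition quantifies over all vectors and scalars.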

If T is linear, then it preserves the origin. In other words, let {\mathbf 0}_V be the zero vector in V and let {\mathbf 0}_W be the zero vector in W. Then T({\mathbf 0}_V) = {\mathbf 0}_W. This means that linear transformations never move, or translate, the space. They rotate, or stretch, or shear, or reflect it instead.
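This fact follows in one line from the scaling property, taking \lambda = 0:

```latex
T({\mathbf 0}_V) = T(0 \cdot {\mathbf 0}_V) = 0 \cdot T({\mathbf 0}_V) = {\mathbf 0}_W.
```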

Composition of linear transformations

Let T:V \to W and U:W \to X be linear maps. (I don’t need to say that X is a vector space; were it not, then it would be nonsense to say that U is a linear transformation.) Their composition, recall from precalculus, is the map

UT:V \to X \qquad \text{defined by } \qquad UT({\mathbf v}) = U(T({\mathbf v})).

The map UT takes {\mathbf v}, does T to it, and then does U to the result. We read UT from the inside-out, just like the order of operations from algebra. First of all, UT is linear. Second, TU may not even exist, let alone be equal to UT. (Sound familiar?) For TU to exist, we would need X = V, since T would have to pick up where U ended.
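For matrix maps, composition is just matrix multiplication, and the dimension mismatch that kills TU shows up as a shape mismatch. A small made-up example:

```python
import numpy as np

# Hypothetical example: T(v) = A v maps R^2 -> R^2,
# and U(w) = B w maps R^2 -> R^1, so UT : R^2 -> R^1 exists.
A = np.array([[1.0, 0.0],
              [1.0, 1.0]])   # T : R^2 -> R^2
B = np.array([[2.0, -1.0]])  # U : R^2 -> R^1

v = np.array([3.0, 4.0])

# Doing T then U, step by step, equals applying the single matrix B A.
print(np.allclose(B @ (A @ v), (B @ A) @ v))  # True

# TU does not exist here: U lands in R^1, but T needs inputs from R^2,
# so A @ (B @ v) raises a shape-mismatch error.
```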

The rank-nullity theorem

Two subspaces tell us what information from V the map T keeps and what it destroys. As with all functions, the image of T is all values that it takes:

\text{im } T = \{ T({\mathbf v}) \; : \; {\mathbf v} \in V \}.

Special to algebraic structures (since plain sets don’t have a notion of an additive identity {\mathbf 0}), we have the kernel of T, which is everything in V that gets sent to {\mathbf 0}_W:

\text{ker }T = \{ {\mathbf v} \in V \; : \; T({\mathbf v}) = {\mathbf 0}_W \}.

Importantly, since the whole point of linear transformations is that they preserve linear combinations, \text{ker }T is a subspace of V and \text{im }T is a subspace of W. The rank of T is the dimension of \text{im }T and the nullity of T is the dimension of \text{ker }T.

It makes sense that if you add up what you destroy with what you save, you get what you started with, right? This is basically the point of the rank-nullity theorem, which is the tip of an iceberg: a more powerful, more complicated algebraic idea called the first isomorphism theorem. In short,

\dim \text{ker }T + \dim \text{im }T = \dim V.
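We can watch the theorem hold numerically. In this made-up example the second row of the matrix is a multiple of the first, so T collapses a direction; the kernel dimension is computed independently from the singular values rather than just as n minus the rank:

```python
import numpy as np

# Hypothetical example: T(v) = A v, where row 2 of A is twice row 1,
# so T destroys one direction of R^3.
A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],
              [0.0, 1.0, 1.0]])

n = A.shape[1]                          # dim V
rank = np.linalg.matrix_rank(A)         # dim im T
s = np.linalg.svd(A, compute_uv=False)  # singular values of A
nullity = int(np.sum(s < 1e-10))        # dim ker T = # of zero singular values

print(rank, nullity, n)  # 2 1 3, and 2 + 1 = 3 as the theorem predicts
```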


You may have noticed that when viewed as members of their respective vector spaces, quadratic polynomials ax^2 + bx + c behave a lot like 3-vectors [a, b, c]^t. In both cases, when we add and scale, all that really matters are the three numbers a, b, c: the coefficients in one case and the components in the other. Polynomials and column vectors are different objects, though, so the vector spaces aren’t equal. How can we quantify this “sameness” of P_2(x) and {\mathbf R}^3 with a notion weaker than equality?

If you guessed linear transformations would do the trick, good on you. An isomorphism (Greek “same form”) T:V \to W is a linear map that is also a bijection. In looser language, V has exactly as many elements as W (that’s the bijection part), and those elements add and scale the same way (that’s the linear map part). If V and W admit an isomorphism between them, then we say they are isomorphic and write V \simeq W.
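A sketch of the P_2(x) \simeq {\mathbf R}^3 correspondence: the map sending a polynomial to its coefficient vector preserves addition and scaling, which is exactly the linear-map half of being an isomorphism. (The polynomials below are made up for illustration; numpy stores coefficients lowest degree first.)

```python
import numpy as np
from numpy.polynomial import Polynomial

# The coordinate map: send c + b x + a x^2 to its coefficient vector.
def coords(p):
    return p.coef  # coefficients, lowest degree first

p = Polynomial([3.0, -2.0, 1.0])  # 3 - 2x + x^2
q = Polynomial([1.0, 5.0, 0.0])   # 1 + 5x

# Mapping the sum equals summing the maps, and likewise for scaling.
print(np.allclose(coords(p + q), coords(p) + coords(q)))  # True
print(np.allclose(coords(2.5 * p), 2.5 * coords(p)))      # True
```

The map is also a bijection (each quadratic has exactly one coefficient vector and vice versa), which is the other half.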


There are a few equivalent, more illustrative ways to recognize an isomorphism than directly verifying that a linear map is 1-1 and onto. In both cases below, T must be a linear map:

  • If \dim V = \dim W = n and \text{ker }T = \{ {\mathbf 0}_V \}, then T is an isomorphism.

Why? Well, first, V and W are the same size. Second, T doesn’t destroy any information (its kernel is trivial, i.e. just the zero vector). That means that every vector in V must be sent to a unique vector in W, and V is just big enough that it totally fills up W in this way.

  • If T sends a basis for V to a basis for W, then T is an isomorphism.

More formally, let \mathscr{B}=\{{\mathbf b}_1, {\mathbf b}_2, \ldots, {\mathbf b}_n\} be a basis for V. We suppose that T(\mathscr{B}) = \{T({\mathbf b}_1), T({\mathbf b}_2), \ldots, T({\mathbf b}_n)\} is a basis for W. This implies that \dim V = \dim W. If we can also show that \text{ker }T is trivial, then by the first criterion above we are done.

Suppose for contradiction that {\mathbf v} \neq {\mathbf 0}_V lies in \text{ker }T. Since the {\mathbf b}_i form a basis for V, there exist scalars \alpha_i, not all zero, such that {\mathbf v} = \sum_{i=1}^n \alpha_i {\mathbf b}_i. Then

T(\sum_{i=1}^n \alpha_i {\mathbf b}_i) = \sum_{i=1}^n \alpha_i T({\mathbf b}_i) = {\mathbf 0}_W

which contradicts the hypothesis that T(\mathscr{B}) is a linearly independent set.

Inverse isomorphism

Of course, when any function is a bijection, it can be inverted. If T:V \to W is an isomorphism, then there exists a map T^{-1}:W \to V such that T^{-1}T:V \to V is equal to I_V and TT^{-1}:W \to W is equal to I_W. Better still, T^{-1} is itself linear, so it is an isomorphism from W to V.
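For a matrix isomorphism, the inverse map is given by the inverse matrix, and both composition identities can be checked directly (the matrix below is made up for illustration):

```python
import numpy as np

# Hypothetical example: A is invertible (det = 1), so T(v) = A v is an
# isomorphism of R^2 with itself; np.linalg.inv gives the matrix of T^{-1}.
A = np.array([[2.0, 1.0],
              [1.0, 1.0]])
A_inv = np.linalg.inv(A)

I = np.eye(2)
print(np.allclose(A_inv @ A, I))  # T^{-1} T = I_V: True
print(np.allclose(A @ A_inv, I))  # T T^{-1} = I_W: True
```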

Equivalence relations

In general, an equivalence relation is a set of properties that, when they hold, mean a relationship behaves like equality for all intents and purposes. Isomorphism carries these properties: it is an equality-type algebraic relationship that doesn’t require the two spaces to have the same elements. These properties are:

  1. Reflexivity: Every vector space is isomorphic to itself, V \simeq V, via the identity map.
  2. Symmetry: If V \simeq W by the isomorphism T, then W \simeq V by the isomorphism T^{-1}.
  3. Transitivity: If V \simeq W by the isomorphism T and W \simeq X by the isomorphism U, then V \simeq X by the isomorphism UT.

Change of basis map

We end on a useful isomorphism that we will return to in sections 6 and 7. Suppose that V has a basis \mathscr{B}=\{{\mathbf b}_1, {\mathbf b}_2, \ldots, {\mathbf b}_n\} and a basis \mathscr{C} = \{{\mathbf c}_1, {\mathbf c}_2, \ldots, {\mathbf c}_n\}. Then a change of basis map is the isomorphism T:V \to V (called an operator when the domain equals the codomain) such that

T({\mathbf b}_i) = {\mathbf c}_i \qquad 1 \le i \le n.

Per the preceding argument, this is an isomorphism since it sends one basis to another basis.
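As a concrete sketch in {\mathbf R}^2 (the bases below are made up for illustration): stacking the basis vectors as the columns of matrices B and C, the matrix of T is C B^{-1}, since that matrix sends each {\mathbf b}_i to {\mathbf c}_i.

```python
import numpy as np

# Hypothetical bases of R^2, written as matrix columns.
B = np.column_stack([[1.0, 0.0], [1.0, 1.0]])   # b1, b2
C = np.column_stack([[2.0, 1.0], [0.0, 3.0]])   # c1, c2

# T B = C, so T = C B^{-1}.
T = C @ np.linalg.inv(B)

print(np.allclose(T @ B[:, 0], C[:, 0]))  # T(b1) = c1: True
print(np.allclose(T @ B[:, 1], C[:, 1]))  # T(b2) = c2: True
```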