Matrix algebra


Introduction

In the last section, we learned that by consolidating a system of equations into a matrix we could quickly (and, more importantly, in a way that is easily automated—cf. extra credit #1) get at the solution(s) of the system, should they exist.

It is standard operating procedure in mathematics, once an object is discovered, to ask what else can be done with the object. What structure exists here? There are many kinds of mathematical structure, but one with which you are already familiar is the algebraic properties of the real numbers {\mathbf R}. Let’s recall what they are.

  1. Addition is associative: (a + b) + c = a + (b + c).
  2. Addition is commutative: a + b = b + a.
  3. There exists an additive identity 0 such that a + 0 = 0 + a = a.
  4. There exist additive inverses -a such that a + (-a) = 0.
  5. Multiplication is associative: a(bc) = (ab)c.
  6. Multiplication is commutative: ab = ba.
  7. There exists a multiplicative identity 1 such that 1 \cdot a = a \cdot 1 = a.
  8. Multiplication distributes over addition: a(b+c) = ab + ac.
  9. There exist multiplicative inverses 1/a for all nonzero a such that a(1/a) = 1.

The question arises: what are ways to define addition and multiplication on matrices that both make sense and follow as many of these rules as possible? We’ll answer that question in this section.

The basics

A matrix, loosely speaking, is an array of numbers. To be more technical (and we must), let m and n be positive integers. When we have a matrix A with m rows and n columns whose entries are all real numbers, we say A \in M_{m \times n}({\mathbf R}). The entry in the i-th row and j-th column of A (or, more easily, the (i,j)-th entry) is denoted A_{i,j}. The matrix looks like

A = \left[ \begin{array}{ccccc} A_{1,1} & A_{1,2} & A_{1,3} & \cdots & A_{1,n} \\ A_{2,1} & A_{2,2} & A_{2,3} & \cdots & A_{2,n} \\ & & & \ddots & \\ A_{m,1} & A_{m,2} & A_{m,3} & \cdots & A_{m,n} \end{array} \right]

This is all very abstract so let’s see an example. Kicking up a random 3×4 (three rows, four columns) matrix in Octave gives

A = \left[ \begin{array}{cccc} 8 & 5 & 1 & 3 \\ 4 & 0 & 6 & 3 \\ 3 & 2 & 2 & 4 \end{array} \right].

Here, A belongs to the set M_{3 \times 4}({\mathbf R}). (If you are unfamiliar with set theory, we will cover the basics at the beginning of the second part of the course.) The element in its second row and third column is 6, so A_{2,3} = 6. The element in its first row and fourth column is 3, so A_{1,4} = 3. And so on.
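
If you’d like to follow along, here is a minimal Octave sketch that enters this matrix by hand and pulls out those entries (indexing uses parentheses, with the row first and the column second):

A = [8 5 1 3; 4 0 6 3; 3 2 2 4]   % three rows, four columns
A(2,3)    % the (2,3)-th entry: 6
A(1,4)    % the (1,4)-th entry: 3
size(A)   % the dimensions: 3 4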

Addition and subtraction

If I asked you to make up a way to add matrices together, you’d probably come up with adding each of their entries together. You’d be right, except for one problem: the matrices must be of the same size. It doesn’t make sense to add entries if one of the entries doesn’t exist! Addition for matrices of the same size, then, is defined

(A+B)_{i,j} = A_{i,j} + B_{i,j}.

The above sentence is read “the (i,j)-th entry of A+B is the (i,j)-th entry of A plus the (i,j)-th entry of B.” For notational simplicity, this is how we will define all of the matrix operations—in terms of what happens to each entry.

For example,

\left[ \begin{array}{cc} 8 & 3 \\ 7 & 7 \\ 9 & 2 \end{array} \right] + \left[ \begin{array}{cc} 7 & 2 \\ 5 & 8 \\ 7 & 9 \end{array} \right] = \left[ \begin{array}{cc} 15 & 5 \\ 12 & 15 \\ 16 & 11 \end{array} \right]

Subtraction’s definition should not surprise you.

(A-B)_{i,j} = A_{i,j} - B_{i,j}.

For example,

\left[ \begin{array}{cc} 8 & 3 \\ 7 & 7 \\ 9 & 2 \end{array} \right] - \left[ \begin{array}{cc} 7 & 2 \\ 5 & 8 \\ 7 & 9 \end{array} \right] = \left[ \begin{array}{cc} 1 & 1 \\ 2 & -1 \\ 2 & -7 \end{array} \right]
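
In Octave, addition and subtraction of same-size matrices use the ordinary + and - operators; a quick sketch with the matrices above:

A = [8 3; 7 7; 9 2];
B = [7 2; 5 8; 7 9];
A + B    % entrywise sum: [15 5; 12 15; 16 11]
A - B    % entrywise difference: [1 1; 2 -1; 2 -7]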

Addition satisfies the rules we’d hope for.

  1. Addition is associative: (A + B) + C = A + (B + C).
  2. Addition is commutative: A + B = B + A.
  3. There exists an additive identity

    0_{m \times n} = \left[ \begin{array}{ccccc} 0 & 0 & 0 & \cdots & 0 \\ 0 & 0 & 0 & \cdots & 0 \\ & & & \ddots & \\ 0 & 0 & 0 & \cdots & 0 \end{array} \right]

    such that A + 0_{m \times n} = 0_{m \times n} + A = A.

  4. For every A there exists an additive inverse -A such that A + (-A) = 0_{m \times n}.

One way to regard subtraction is as addition of a number times negative one, like how 4 - 3 = 4 + (-1)3. To get a result like this for matrices, we will need to introduce a new operation.

Scalar multiplication

We haven’t talked about scalar multiplication before, because for numbers scalar multiplication is just, well, multiplication. In linear algebra, however, it is of the utmost importance, along with addition.

Let \lambda (Greek “lambda”) be a real number and A be a matrix. Then

(\lambda A)_{i,j} = \lambda A_{i,j}.

In other words, just multiply every entry of A by \lambda. Example:

4 \left[ \begin{array}{cc} 3 & 4 \\ 4 & 0 \\ -8 & -2 \end{array} \right] = \left[ \begin{array}{cc} 12 & 16 \\ 16 & 0 \\ -32 & -8 \end{array} \right]
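
In Octave, this is the ordinary * operator applied to a scalar and a matrix:

A = [3 4; 4 0; -8 -2];
4 * A    % every entry multiplied by 4: [12 16; 16 0; -32 -8]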

The properties of multiplication by scalars \lambda and \mu (Greek “mu”) on compatible matrices A and B are listed below. These will be important later, as they are a key ingredient in what makes a vector space, the central object of study in linear algebra.

  1. Multiplication is associative: (\lambda \mu) A = \lambda (\mu A).
  2. There exists an identity scalar 1 such that 1\cdot A = A.
  3. Multiplication distributes over matrix addition: \lambda(A + B) = \lambda A+ \lambda B.
  4. Multiplication also distributes over scalar addition: (\lambda + \mu)A = \lambda A+ \mu A.

It may be clear to you that scalar multiplication is a sort of “stretching” operation. In fact, if the matrix in question is a row or column vector with two or three entries, it is identical to the scalar multiplication from vector calculus, which geometrically was in fact a stretching operation.

So far, we can add matrices and scale them. Is there a sensible way to multiply two matrices together?

There is, but you won’t like it.

Matrix multiplication

In the interest of establishing the usefulness of matrices concretely before diving headfirst into the abstraction necessary to talk about vector spaces, linear transformations, and the magic that’s going on under the hood here, I will just tell you how to multiply matrices without telling you why it is an extremely good definition. As we move forward in the course, hopefully you will see that for yourself: remarkably many useful properties fall out of this definition.

If A and B are matrices such that A has as many columns as B has rows, then

(AB)_{i,j} = [i-th row of A] \cdot [j-th column of B].

Here, \cdot is the vector dot product from calculus. As a reminder, that is the operation where you multiply corresponding components together and then add. Assuming that A has n columns and B has n rows,

(AB)_{i,j} = A_{i,1}B_{1,j} + A_{i,2}B_{2,j} + \cdots + A_{i,n}B_{n,j}.

Let’s see an example.

\left[ \begin{array}{cc} 9 & -2 \\  3 & 0 \end{array}\right] \left[ \begin{array}{ccc} -1&  2 & 5 \\ 0 &  2 & 0 \end{array}\right]

First, we note that the matrix on the left, A, has the same number of columns as the one on the right, B, has rows (two each). Good! These matrices are compatible. Furthermore, the other dimensions of the factors tell us that the result will be a 2 (rows of the left factor) × 3 (columns of the right factor) matrix.

\left[ \begin{array}{cc} 9 & -2 \\  3 & 0 \end{array}\right] \left[ \begin{array}{ccc} -1&  2 & 5 \\ 0 &  2 & 0 \end{array}\right] = \left[ \begin{array}{ccc} & &  \\   & &  \end{array}\right] 

To get the (1,1)-th entry of the product, we take the 1st row of A and dot it with the 1st column of B. The result is 9(-1) + (-2)0 = -9:

\left[ \begin{array}{cc} 9 & -2 \\  3 & 0 \end{array}\right] \left[ \begin{array}{ccc}-1 &  2 & 5 \\ 0 &  2 & 0 \end{array}\right] = \left[ \begin{array}{ccc} -9 & &  \\   & &  \end{array}\right] 

To get the (2,2)-th entry of the product, compute the dot product of the 2nd row of A with the 2nd column of B: 3(2) + 0(2) = 6.

\left[ \begin{array}{cc} 9 & -2 \\  3 & 0 \end{array}\right] \left[ \begin{array}{ccc}-1 &  2 & 5 \\ 0 &  2 & 0 \end{array}\right] = \left[ \begin{array}{ccc} -9 & &  \\   & 6 &  \end{array}\right] 

The rest are filled in as follows. Try it yourself!

\left[ \begin{array}{cc} 9 & -2 \\  3 & 0 \end{array}\right] \left[ \begin{array}{ccc}-1 &  2 & 5 \\ 0 &  2 & 0 \end{array}\right] = \left[ \begin{array}{ccc} -9 & 14 & 45 \\   -3 & 6 & 15  \end{array}\right] 
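
In Octave, this product is again the * operator (beware: .* is entrywise multiplication, a different operation entirely):

A = [9 -2; 3 0];
B = [-1 2 5; 0 2 0];
A * B      % the 2x3 product: [-9 14 45; -3 6 15]
% B * A    % error: nonconformant arguments (2x3 times 2x2)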

However, observe that if we reversed the order of the matrices, it wouldn’t work! Since B has three columns and A has two rows, the product BA doesn’t exist. If AB exists but BA doesn’t, we are forced to conclude that in general

AB \neq BA

even if both exist!

Multiplication, by necessity, will not follow all the rules that we’d like. Above, we see commutativity is out. (The reason that commutativity can’t happen is due to the magic-under-the-hood view of matrix multiplication, which we’ll see in a couple of sections.) What rules do work, then? If A, B, and C are compatible matrices (i.e. can be multiplied together as described),

  1. Multiplication is associative: (AB)C = A(BC).
  2. Multiplication distributes over addition: A(B + C) = AB + AC.
  3. If A is an m \times n matrix, then A has a left identity I_m such that I_mA = A and a right identity I_n such that AI_n = A. If A is square, we can just write I_m = I_n = I.

This identity matrix is

I_n = \left[ \begin{array}{ccccc} 1 & 0 & 0 & \cdots & 0 \\ 0 & 1 & 0 & \cdots & 0 \\ & & & \ddots & \\ 0 & 0 & 0 & \cdots & 1 \end{array} \right],

i.e. it is the matrix with 1’s on the diagonal and 0’s everywhere else.
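
Octave builds identity matrices with eye; a quick check of the identity property:

A = [9 -2; 3 0];
I = eye(2)   % the 2x2 identity matrix
I * A        % returns A unchanged
A * I        % so does this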

The inverse of a matrix

If matrices can be multiplied together, then one rightfully wonders if there is some type of division operation. (This is the kind of thing you think about, right?) Again, the answer is sometimes. Recall that in real number division, the reciprocal of a is a number a^{-1} such that

a^{-1}a = aa^{-1} = 1

so long as a is nonzero. Extending this sentence to refer to matrices A rather than real numbers a, we might write

A^{-1}A = AA^{-1} = I_m.

Such a matrix A^{-1} is called the inverse of A.

Let A have m rows and n columns and A^{-1} have h rows and k columns. Observe first that if the above sentence is true, then n = h and k = m so that the multiplication is possible at all. Since we want I_m to be the same size no matter how we order the factors, we also have m = n = h = k. Therefore, for such a sentence to be possible, A must be square. (Square matrices, as we will see, are the most like numbers. For example, they may always be added and multiplied.)

Can we find the inverse of every square matrix? No, just like not every number has a reciprocal. There is a zero-ness to matrices that are not invertible. In the next section, we will discover the special quality that makes a matrix invertible, but it is necessary that the matrix be square.

Recall that matrix equations like AX = B may be rendered as augmented matrices [ A | B ]. The solution is found by row-reducing the augmented matrix until the left side of the bar is in reduced row echelon form. The right side of the bar becomes the solution in this process. To find the inverse matrix, we are solving the equation AX = I, or row-reducing the augmented matrix [ A | I ]. If A^{-1} exists, then we should be able to row-reduce [ A | I ] into [ I | A^{-1} ].
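
Here is a minimal sketch of this procedure in Octave, using a small invertible matrix chosen purely for illustration (Octave also provides inv(A) directly):

A = [2 1; 5 3];
R = rref([A eye(2)])   % row-reduce the augmented matrix [A | I]
Ainv = R(:, 3:4)       % the right side of the bar: [3 -1; -5 2]
A * Ainv               % returns the 2x2 identity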

If a matrix A has an inverse, then the matrix equation A{\mathbf x} = {\mathbf b} has a unique solution {\mathbf x} = A^{-1}{\mathbf b}.
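
Continuing the Octave sketch with the same illustrative matrix:

A = [2 1; 5 3];
b = [1; 2];
x = inv(A) * b   % the unique solution, here [1; -1]
A * x            % recovers b (numerically, A \ b is preferred in practice)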

As an example, the square matrix

\left[ \begin{array}{cc} a & b \\ c & d \end{array} \right]

has the inverse

\dfrac{1}{ad-bc} \left[ \begin{array}{cc} d & -b \\ -c & a \end{array} \right]

so long as ad - bc is not zero. To see this, multiply the two matrices together (in both orders!) and see that you get the identity matrix both times.
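
Carrying out one of the two multiplications, for instance:

\left[ \begin{array}{cc} a & b \\ c & d \end{array} \right] \cdot \dfrac{1}{ad-bc} \left[ \begin{array}{cc} d & -b \\ -c & a \end{array} \right] = \dfrac{1}{ad-bc} \left[ \begin{array}{cc} ad-bc & -ab+ba \\ cd-dc & -cb+da \end{array} \right] = \left[ \begin{array}{cc} 1 & 0 \\ 0 & 1 \end{array} \right].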

To end, we will summarize two last easy matrix operations.

Transposition

The transpose of a matrix swaps its rows and columns. If A is an m \times n matrix, then A^t is an n \times m matrix such that

(A^t)_{i,j} = A_{j,i}.

A matrix with real entries is normal if it commutes with its transpose, i.e. AA^t = A^tA. A matrix is symmetric if it is its own transpose, i.e. A = A^t.
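
In Octave, the transpose of a real matrix A is written A' (for complex matrices ' also conjugates; plain transposition is .'):

A = [8 5 1 3; 4 0 6 3; 3 2 2 4];
A'                % the 4x3 transpose
S = A * A';       % a matrix times its transpose is always symmetric
isequal(S, S')    % returns 1 (true)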

Trace

If A is a square matrix of size n, then the trace of A is the sum of its diagonal entries:

\text{tr }A = \displaystyle \sum_{k=1}^n A_{k,k}.
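
Octave computes this with trace:

A = [9 -2; 3 0];
trace(A)    % sum of the diagonal entries: 9 + 0 = 9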
