Return to the course page.

In the interest of time, these notes are kept brief, and do not include the wicked cool geometric example of determinants representing areas, volumes, hypervolumes. (You’re coming to class anyway, right?) These examples will be added later, or in future iterations of the course.


Last time, we learned that for a 2 x 2 matrix

\left[ \begin{array}{cc} a & b \\ c & d \end{array} \right]

the matrix

\dfrac{1}{ad-bc} \left[ \begin{array}{cc} d & -b \\ -c & a \end{array} \right]

serves as its inverse, provided of course that ad – bc is nonzero. I also dropped the hint that square matrices behave the most like numbers of all matrices, and that a matrix fails to be invertible when it has some sort of je ne sais quoi it shares with the number zero.
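If you'd like to check the formula numerically, here's a quick sketch in Python (numpy assumed; the helper name inverse_2x2 is mine, not standard):

```python
import numpy as np

def inverse_2x2(M):
    # The formula from last time: 1/(ad - bc) * [[d, -b], [-c, a]].
    a, b = M[0, 0], M[0, 1]
    c, d = M[1, 0], M[1, 1]
    det = a * d - b * c
    if det == 0:
        raise ValueError("ad - bc is zero: the matrix is not invertible")
    return (1 / det) * np.array([[d, -b], [-c, a]])

A = np.array([[1.0, 2.0], [3.0, 4.0]])
print(inverse_2x2(A) @ A)  # the 2 x 2 identity, up to rounding
```

Multiplying the result against A on either side should give the identity matrix.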

Obviously the 2 x 2 matrix

\left[ \begin{array}{cc} 0 & 0 \\ 0 & 0 \end{array} \right]

will not be invertible, not least because ad – bc here is definitely zero. But the matrix

A = \left[ \begin{array}{cc} 1 & 2\\ 3 & 6 \end{array} \right]

might look perfectly fine if not for 1(6) – 3(2) = 0. It’s not invertible either. What gives?

As we go forward, we’ll be looking at how matrices transfer information. What does that mean? Well, if I multiply A on the left of the vector [2, -1]^t, I get

\left[ \begin{array}{cc} 1 & 2\\ 3 & 6 \end{array} \right] \left[ \begin{array}{c} 2 \\ -1 \end{array} \right] = \left[ \begin{array}{c}  0 \\ 0 \end{array} \right].

If you’re able to picture it geometrically, this means that the vector [2, -1]^t gets collapsed into the origin. In fact, since scalars pass through matrix multiplication—A(\lambda B) = \lambda (AB), check it yourself!—the entire line through the origin and [2,-1]^t gets squeezed into the origin.

In other words, multiplying a 2-vector by A loses information. If a function loses information, there is no unique way to retrieve the information—how do we come back from [0,0]^t? Should we use [2, -1]^t? Why not [4,-2]^t or even [0,0]^t?—just like when we multiply by zero there is no unique way to reverse the process. The process, one might say, is not invertible.
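A quick numerical experiment (a Python/numpy sketch) makes the collapse visible: every scalar multiple of [2, -1]^t lands on the origin, so there is no hope of undoing the multiplication.

```python
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 6.0]])
v = np.array([2.0, -1.0])

# Every point on the line through the origin and v gets squeezed
# into the origin, so multiplication by A cannot be reversed.
for lam in [1.0, 2.0, -3.0, 0.5]:
    print(A @ (lam * v))  # [0. 0.] every time
```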

So this number ad – bc, which we’ll be calling the determinant of a 2 x 2 matrix (det A) going forward, is somehow a predictor of invertibility. In fact,

A square matrix whose determinant is zero is said to be singular. A matrix is invertible if and only if it is not singular.

The determinant tells us exactly which square matrices are invertible, i.e. which ones do not have this zero-ness that will destroy some of our information.

So how do we compute it?

Example: the cross product

Keeping with the mathematics curriculum’s frustrating tendency to use “future math” as necessary without explaining it, you’ve actually seen determinants before. If [a,b,c]^t and [d, e, f]^t are 3-vectors, then their cross product is

\det \left[ \begin{array}{ccc} \mathbf i & \mathbf j & \mathbf k \\ a & b & c \\ d & e & f \end{array} \right] = (bf-ce){\mathbf i} - (af - cd){\mathbf j} + (ae - bd){\mathbf k}

where i, j, and k are the basis vectors [1,0,0]^t, [0,1,0]^t, and [0,0,1]^t as in calculus.
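Expanding that determinant along the top row can be checked against numpy's built-in cross product (the function name cross_via_det is mine):

```python
import numpy as np

def cross_via_det(u, v):
    a, b, c = u
    d, e, f = v
    # Coefficients of i, j, k from expanding the determinant along the top row:
    # (bf - ce) i - (af - cd) j + (ae - bd) k
    return np.array([b * f - c * e, -(a * f - c * d), a * e - b * d])

u = np.array([1.0, 2.0, 3.0])
v = np.array([4.0, 5.0, 6.0])
print(cross_via_det(u, v))  # [-3.  6. -3.]
print(np.cross(u, v))       # the same
```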

Let’s notice a couple of things about how this was computed. It seems like we add something times i, minus something times j, plus something times k. The sum seems to be alternating along the top row of the determinant. The coefficients of i, j, and k are the determinants, notice, of the smaller matrices obtained by removing the row and column in which i and j and k are located. These smaller determinants are called the cofactors of these elements. The cofactor of the (i,j)-th entry (here unfortunately i and j refer to rows and columns as usual) is denoted C_{i,j}.


If we write

A = \left[ \begin{array}{ccc} \mathbf i & \mathbf j & \mathbf k \\ a & b & c \\ d & e & f \end{array} \right]

then the determinant may be re-written as

\det A = A_{1,1}C_{1,1} - A_{1,2}C_{1,2} + A_{1,3}C_{1,3}.

This is how we’ll compute the determinant of any square matrix.
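That recipe—pick a row, multiply each entry by its cofactor with alternating signs, and repeat on the smaller matrices—can be sketched as a recursive Python function (numpy assumed; det_cofactor is my name for it):

```python
import numpy as np

def det_cofactor(M):
    # Recursive cofactor expansion along the first row.
    n = M.shape[0]
    if n == 1:
        return M[0, 0]
    total = 0.0
    for j in range(n):
        # The cofactor: determinant of M with row 0 and column j removed.
        minor = np.delete(np.delete(M, 0, axis=0), j, axis=1)
        total += (-1) ** j * M[0, j] * det_cofactor(minor)
    return total

A = np.array([[2.0, 0.0, 1.0], [1.0, 3.0, 2.0], [0.0, 1.0, 1.0]])
print(det_cofactor(A))       # 3.0
print(np.linalg.det(A))      # agrees, up to rounding
```

This is wildly inefficient for large matrices (it does n! multiplications), which is part of why the row-reduction shortcut below matters.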

Computing the determinant

We don’t have to use the top row. We may compute the determinant of any square matrix along any column or row we wish, and the metric we’ll usually use to choose the row or column is simplicity. The more zeroes, the better! (If a row or column is all zeroes, the determinant will be zero anyway—why?)

If the determinant is computed along the i-th row (hold i fixed), then

\det A = \displaystyle \sum_{j=1}^n (-1)^{i+j} A_{i,j}C_{i,j}.

Note that the (-1)^{i+j} factor keeps the alternation from the cross product example.

If the determinant is computed along the j-th column (hold j fixed), then

\det A = \displaystyle \sum_{i=1}^n (-1)^{i+j} A_{i,j}C_{i,j}.

Notice that the two formulas are pretty much identical: all that changes is whether you sum across the entries of a row or down the entries of a column.
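Here are both formulas in Python (a numpy sketch; the helper names are mine), confirming that every row and every column of a matrix yields the same determinant:

```python
import numpy as np

def minor_det(M, i, j):
    # The cofactor C_{i,j}: determinant of M with row i and column j removed.
    return np.linalg.det(np.delete(np.delete(M, i, axis=0), j, axis=1))

def det_along_row(M, i):
    n = M.shape[0]
    return sum((-1) ** (i + j) * M[i, j] * minor_det(M, i, j) for j in range(n))

def det_along_col(M, j):
    n = M.shape[0]
    return sum((-1) ** (i + j) * M[i, j] * minor_det(M, i, j) for i in range(n))

A = np.array([[1.0, 0.0, 4.0], [2.0, 3.0, 0.0], [0.0, 5.0, 6.0]])
print([det_along_row(A, i) for i in range(3)])  # 58.0 three times
print([det_along_col(A, j) for j in range(3)])  # 58.0 three times
```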

Now, maybe it’s—oh, I don’t know—the midterm, and you’ve got a scary-looking 4 x 4 matrix where none of the rows or columns look particularly friendly. We can use row reduction to whittle the matrix into something we’d like to work with, but first, we’ll need to know how multiplication affects the determinant.

Properties of the determinant

The determinant plays with our known operations in the following way. If A and B are n x n matrices and \lambda is a real number, then

  1. det AB = det A det B
  2. \det \lambda A = \lambda^n \det A
  3. \det A^t = \det A
  4. If A is invertible, then \det A^{-1} = 1/(\det A).
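All four properties are easy to spot-check numerically (a Python/numpy sketch with a random 3 x 3 matrix):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
B = rng.standard_normal((3, 3))
lam, n = 2.5, 3

# 1. det(AB) = det(A) det(B)
assert np.isclose(np.linalg.det(A @ B), np.linalg.det(A) * np.linalg.det(B))
# 2. det(lam * A) = lam^n det(A)
assert np.isclose(np.linalg.det(lam * A), lam ** n * np.linalg.det(A))
# 3. det(A^t) = det(A)
assert np.isclose(np.linalg.det(A.T), np.linalg.det(A))
# 4. det(A^{-1}) = 1 / det(A)
assert np.isclose(np.linalg.det(np.linalg.inv(A)), 1 / np.linalg.det(A))
print("all four properties check out")
```

Note that property (2) scales by \lambda^n, not \lambda—multiplying an n x n matrix by \lambda multiplies every one of its n rows by \lambda.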

Simplifying the determinant

Oddly enough, it’s the multiplication rule, (1) above, that predicts how the determinant responds to row reduction. That’s because all three of the row operations we use—swapping rows, multiplying a row by a scalar, and adding a multiple of one row to another—can be realized by multiplying by modifications of the identity matrix called elementary matrices.

For example, to swap the second and third rows of a 3 x 3 matrix, you can multiply it on the left by the matrix

\left[ \begin{array}{ccc} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{array} \right].

Observe that this is just the identity matrix with its second and third rows exchanged. See for yourself that the determinant of this matrix is -1. In fact, any row-swapping elementary matrix will have a determinant of -1, so

If B is obtained from A by swapping two rows, then det B = -det A.

To multiply the first row of a 3 x 3 matrix by 7, for example, multiply the whole matrix on the left by

\left[ \begin{array}{ccc} 7 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1\end{array} \right]

which, observe, is the identity matrix with its first row multiplied by 7. The determinant of this matrix is 7. In general,

If B is obtained from A by multiplying a row by λ, then det B = λ det A.

Finally, if you were to subtract three times the second row from the first, you would multiply a 3 x 3 matrix on the left by

\left[ \begin{array}{ccc} 1 & -3 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1\end{array} \right].

The determinant of this matrix is just 1.

If B is obtained from A by adding a multiple of one row to another, then det B = det A.

If we keep track of what each move does to our determinant and undo all of the changes on the way, then we can get the determinant of a matrix from the determinant of its simpler, row-reduced form.
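Here are the three elementary matrices from above checked numerically against an arbitrary 3 x 3 matrix (a Python/numpy sketch):

```python
import numpy as np

A = np.array([[2.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]])

# Swap rows 2 and 3: the determinant flips sign.
E_swap = np.array([[1.0, 0.0, 0.0], [0.0, 0.0, 1.0], [0.0, 1.0, 0.0]])
assert np.isclose(np.linalg.det(E_swap @ A), -np.linalg.det(A))

# Multiply row 1 by 7: the determinant is multiplied by 7.
E_scale = np.diag([7.0, 1.0, 1.0])
assert np.isclose(np.linalg.det(E_scale @ A), 7 * np.linalg.det(A))

# Subtract 3 times row 2 from row 1: the determinant is unchanged.
E_add = np.array([[1.0, -3.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
assert np.isclose(np.linalg.det(E_add @ A), np.linalg.det(A))

print("row reduction affects the determinant exactly as advertised")
```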

Application: Cramer’s rule

Finally, let’s look at an application of determinants to solving linear systems (for this part of the course, it all comes back to linear systems!). Let Ax = b represent a system of linear equations with n variables, n equations, and a unique solution.

Let’s break that down: if there are n variables and n equations, then A is square; and if the system has a unique solution, then that solution is {\mathbf x} = A^{-1}{\mathbf b}, so A has a nonzero determinant.

The unique solution can be found using the determinant of A and the determinant of a new matrix called A_i, which is identical to A except that its i-th column is replaced with b. Cramer’s rule says that

If Ax = b represents a system of n linear equations in n variables with a unique solution, then that solution has entries

x_i = \dfrac{\det A_i}{\det A}, \qquad 1 \le i \le n.
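Cramer’s rule translates almost line-for-line into code (a Python/numpy sketch; the function name cramer is mine):

```python
import numpy as np

def cramer(A, b):
    # x_i = det(A_i) / det(A), where A_i is A with its i-th column replaced by b.
    n = A.shape[0]
    det_A = np.linalg.det(A)
    x = np.empty(n)
    for i in range(n):
        A_i = A.copy()
        A_i[:, i] = b
        x[i] = np.linalg.det(A_i) / det_A
    return x

# The system 2x + y = 5, x + 3y = 10 has the unique solution x = 1, y = 3.
A = np.array([[2.0, 1.0], [1.0, 3.0]])
b = np.array([5.0, 10.0])
print(cramer(A, b))           # [1. 3.]
print(np.linalg.solve(A, b))  # the same
```

In practice, row reduction (which is what np.linalg.solve does under the hood, via an LU factorization) is far cheaper than computing n + 1 determinants, but Cramer’s rule gives a tidy closed-form answer.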


That wraps up the first half of our course. Here’s what you should know so far:

  • Matrices are arrays of numbers used to represent systems of linear equations.
  • Linear systems can be solved by a process called row reduction applied to the augmented matrix that represents that system.
  • Matrices have a number-like algebra on their own, independent of their origin as representations of linear systems. In particular, matrices of the same size can be added together. All matrices can be scaled by a number. Matrices with compatible columns and rows can be multiplied together. Some square matrices can be inverted, which is kind of like division.
  • Those square matrices that can be inverted have a nonzero determinant. The determinant is a measure of how much the matrix is “like zero” in the sense of whether or not it destroys information by multiplication.

We have reached a natural conclusion in the basic study of matrices, and at this point you know enough to be dangerous. But there’s a thread I’ve been alluding to all through the course, that matrix multiplication is a type of information transfer. Matrix multiplication acts as a special type of function between vector spaces, of which you’ve explicitly seen two—the two-dimensional plane and three-dimensional space in calculus—and of which you’ve seen many more without being told about it. In the second part of the course, we’ll take a look at matrices from that point of view, which will illuminate a lot of what we’ve learned about their basic properties. (In particular, the formula for matrix multiplication will finally make sense!)