The Physics and Mathematics Guild

Tags: physics, mathematics, science, universe

Intro to Linear Algebra

Layra-chan
Crew

PostPosted: Mon Jun 22, 2009 10:00 am


Preface

This thread is a basic introduction to Linear Algebra and will cover the linear algebra necessary to understand most of modern physics. It will also include a few more advanced topics that a first-time reader may wish to skip.
This is not intended to be a formal course, so it will not have a lot of proofs or exercises. Rather, it is intended as a simple reference and as an alternative to simply saying "read this book and come back in a year".

I shall be updating this thread as I think of things and as suggestions come in. If you see any problems or have ideas for things to put in, feel free to post in this thread.
PostPosted: Mon Jun 22, 2009 10:04 am


Table of Contents

-Intro: Vectors

-Spaces and subspaces

-Products of vectors

-Intro: Matrices

-Linear Transformations

-Trace and Determinant

-Eigenstuff

-Multilinear Forms

-Orthogonality and Normality

-Advanced stuff

Layra-chan
Crew

PostPosted: Mon Jun 22, 2009 10:09 am


Vectors

Consider the plane. A vector denoted as (x, y)ᵀ (I'll explain the ᵀ in a bit) would be a journey x units to the right and y units up (if x is negative then you move to the left instead of the right, and if y is negative you move down instead of up).

Now suppose you had two vectors, (x1, y1)ᵀ and (x2, y2)ᵀ. Then if you do (x1, y1)ᵀ first, and then (x2, y2)ᵀ, you land x1+x2 units to the right and y1+y2 units up from where you started. Also, if you go back to the starting point and do (x2, y2)ᵀ first and then (x1, y1)ᵀ, you end up in the same place. Since either two-part journey is equivalent to doing (x1+x2, y1+y2)ᵀ, we say that

(x1, y1)ᵀ + (x2, y2)ᵀ = (x2, y2)ᵀ + (x1, y1)ᵀ = (x1+x2, y1+y2)ᵀ

This is called, unsurprisingly, vector addition, where each coordinate of the first vector is added to the corresponding coordinate of the second vector to get the vector sum.

What happens when you add a vector to itself, i.e. what happens when you go along a vector, and then do so again? According to the above formula,

(x, y)ᵀ+(x, y)ᵀ = (2x, 2y)ᵀ

We can write the left side of the equation as 2(x, y)ᵀ. In fact, we can say that for any scalar c,

c(x, y)ᵀ = (cx, cy)ᵀ

This is called scalar multiplication. Note that I haven't said what scalar means quite yet. I'll get to that in a moment. In the case of the plane, scalar means real number.

Since (x1+x2, y1+y2)ᵀ and (cx, cy)ᵀ both give movements on the plane, we note that the sum of two vectors is a vector, and a vector times a scalar is a vector.
But you don't have to restrict yourself to just the plane. What if you wanted to look at a 3-dimensional volume? You'd need three coordinates, (x, y, z). So your vectors would have three coordinates and would be written (x, y, z)ᵀ. In fact, we could have any (whole) number of coordinates, with vectors of the form (x1, x2, x3,..., xn)ᵀ. The coordinates are now called entries in the vector, since it looks like a list. Let's stick to vectors with a finite number of entries for now.
We still have scalar multiplication, in that

c(x1, x2, x3,..., xn)ᵀ = (cx1, cx2, cx3,...,cxn)ᵀ

When doing vector addition, you have to make sure that the two vectors you're trying to add have the same number of entries, otherwise the addition won't match up. But we still have

(x1, x2, x3,...,xn)ᵀ+(y1,y2,y3,....,yn)ᵀ = (x1+y1, x2+y2, x3+y3,...,xn+yn)ᵀ
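
If you want to see these rules numerically, here is a quick sketch, assuming Python with NumPy is available (the particular vectors and the scalar are arbitrary choices of mine, not part of the original post):

import numpy as np

# Two vectors with three entries each.
u = np.array([1.0, 2.0, 3.0])
v = np.array([4.0, -1.0, 0.5])

# Vector addition is done entry by entry, and the order doesn't matter.
print(u + v)    # [5.  1.  3.5]
print(v + u)    # same result

# Scalar multiplication multiplies every entry by the scalar.
print(2.0 * u)  # [2. 4. 6.]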

More abstractly

We can characterize a vector by just these properties: a vector is something that you can add to another vector to get a third vector and that you can multiply by a scalar to get a vector, i.e. if we denote the set of vectors by V, we say that
If u and v are in V, then u+v is in V
If u is in V and c is a scalar, then cu is in V

I'm denoting vectors by bold lowercase letters and sets of vectors by bold uppercase letters to distinguish them from regular numbers. Also, I don't have the ᵀs here, since the abstract vectors are assumed to be written correctly without needing a ᵀ. Again, this will be explained further in a later segment.

There are a number of axioms needed to make sure that scalar multiplication and vector addition are compatible in a useful way. Specifically, given vectors u, v, and w and scalars a and b,

u + (v + w) = (u + v) + w
u + v = v + u
There is a vector called the zero vector, denoted 0, such that 0 + v = v. In coordinates, it is written as (0, 0, 0,...,0)ᵀ
For all v, there exists an additive inverse denoted -v such that v + (-v) = 0
a(u + v) = au + av
(a + b)v = av + bv
a(bv) = (ab)v
1v = v

Speaking of multiplying, what happens if you multiply a vector by a complex number? Does i(1, 1)ᵀ = (i, i)ᵀ lead from one point on the plane to another?
This leads to the question of what I mean by scalar. In the case of the plane, note that the two entries x and y are both real numbers. You can multiply a vector that takes you from one point of the plane to another by any real number and end up with a second vector that leads between points on the plane, so we say that the vectors that describe journeys on the plane belong to R². Similarly, the vectors that have n real entries, i.e. those of the form (x1, x2, x3,...,xn)ᵀ, belong to Rⁿ.
So can we make a vector with complex numbers in it? Sure we can. We simply say that such a vector belongs to C² if it has exactly two entries, and Cⁿ if it has n entries. Note that a vector with all real entries can still be in Cⁿ, since the complex numbers include the real numbers.
A scalar is then something that you can multiply a vector by to get another vector. A scalar always has to be an element of a field such as R or C.
Note that if c is a real number and u is in Rⁿ, then cu is in Rⁿ, but if c is a non-real complex number and u is in Rⁿ, then cu is not in Rⁿ.
If c is a complex number and v is in Cⁿ, then cv is in Cⁿ, for all complex numbers c, real or not. So we say that you can multiply vectors in Cⁿ by any complex scalar, but vectors in Rⁿ you can only multiply by real scalars.
The set of scalars that you're allowed to multiply vectors by is called the ground field or scalar field.

-To be finished-
PostPosted: Mon Jun 22, 2009 10:12 am


Spaces and Subspaces

The entire set of vectors is called a vector space. The sum of two vectors in the vector space is again a vector in the vector space, and a vector multiplied by a scalar is also in the vector space.
So we can say that Rⁿ is a vector space for any n, and that Cⁿ is also a vector space. In fact, for any field F, we can make vector spaces Fⁿ, being lists of n entries where each entry is in F.

Consider for instance the rational numbers Q, i.e. the numbers of the form a/b where a and b are both whole numbers (and b isn't 0). We can make the vector space Qⁿ, although most people don't use this space because the rational numbers can be troublesome. But nonetheless, if we have whole numbers a through h, then we have that

(a/b, c/d)+(e/f, g/h) = (a/b+e/f, c/d+g/h)

as expected.

All vector spaces whose scalars are in R are called real vector spaces, and all vector spaces whose scalars are in C are called complex vector spaces.

Subspaces, Spans, Linear Independence, and Bases

A subspace of a vector space V is a set of vectors in V such that given any two vectors in the subspace, their sum is in the subspace, and given any scalar, the scalar times any vector in the subspace is in the subspace.
So a subspace is a vector space that happens to be part of a larger vector space. Note that it has the same scalar field as the larger vector space.
Note that the entire vector space is a subspace (but not a proper subspace) and that the space containing only 0 is also a subspace.

Now consider two vectors u and v in some vector space V that has a scalar field F. Let's consider the set of all vectors that can be written as au+bv where a and b are in F. Such a vector would be called a linear combination of u and v, and a and b would be called the coefficients of u and v respectively. The set of all such linear combinations would be called the span of u and v.
The span of u, v and w is the set of all vectors that can be written as au+bv+cw, where a, b and c are in F, and similarly for larger sets of vectors. Note that the span of a list of vectors is a subspace of V, in that given two vectors in the span their sum is in the span, and given a vector in the span and a scalar, the scalar times the vector is in the span. So the span is a subspace.

Given a list of vectors u1, u2, u3, etc., we can consider linear combinations of them, i.e. a1u1+a2u2+a3u3+... but with the restriction that only finitely many of the coefficients can be non-zero.
The span of this list contains the zero vector, since we can simply write

0 = 0u1+0u2+0u3+...

But is this the only way to write 0? What if there were a list of scalars a1, a2, a3, etc such that a1u1+a2u2+a3u3+... = 0 but not all of a1, a2, a3, etc were 0?
In this case we say that the list of vectors is linearly dependent. If there is no such list of scalars, i.e. if the only way to write 0 as a linear combination of the vectors is to make every coefficient 0, then we say that the list of vectors is linearly independent.
Suppose we had a list of vectors u, v and w that was linearly dependent, i.e. we could write au+bv+cw = 0 and at least one of a, b and c isn't 0. Then this list could be considered redundant. Suppose that c is the one that isn't 0. Then we can say that
w = (-a/c)u+(-b/c)v
So w is in the span of u and v. That's another way to state linear independence: a list is linearly independent when no vector in the list is in the span of the rest of the vectors in the list.
Furthermore, if the vectors in the list are linearly independent, then given any vector in the span of the list, there is precisely one list of scalars that gives a linear combination equal to that vector.

Finally, what can you say about a list of vectors that is linearly independent and spans the entire vector space V? We can say that for every vector in the vector space, there is one and only one way to write the vector as a linear combination of vectors in the list. Such a list is called a basis of V.
Some simple thought will yield that no fewer than two vectors will span R², and that you can have two vectors in R² that are linearly independent (for instance, (1, 0)ᵀ and (0, 1)ᵀ). So we have a basis of two vectors for R². Similarly, Rⁿ and Cⁿ will have bases with n vectors in them. In fact, for both Rⁿ and Cⁿ, all bases will have exactly n vectors in them. This is difficult to prove, though, so I won't.
The size of the basis is a more general way to give the dimension of a vector space. For finite-dimensional vector spaces, it is the size of any basis, as well as the maximum possible size of a list of linearly independent vectors, and the minimum possible size of a list of vectors that span the entire space. For infinite-dimensional vector spaces, this breaks down a bit.
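
As a purely illustrative sketch (assuming Python with NumPy; the vectors are my own examples), you can test a list of vectors for linear independence by stacking them as the columns of a matrix and checking whether the rank equals the number of vectors:

import numpy as np

# (1, 0)ᵀ and (0, 1)ᵀ as columns: rank 2, so they are linearly independent
# and form a basis of R².
M = np.column_stack([[1, 0], [0, 1]])
print(np.linalg.matrix_rank(M))   # 2

# (1, -1)ᵀ and (2, -2)ᵀ: the second is a multiple of the first, so the rank is 1,
# i.e. the list is linearly dependent.
N = np.column_stack([[1, -1], [2, -2]])
print(np.linalg.matrix_rank(N))   # 1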


Function Spaces
Some of the most interesting vector spaces are those where the vectors aren't lists of coordinates but are instead functions.
Consider a polynomial P(x) = p0+p1x+p2x²+... where the p0, p1 etc are all in some base field F. If you take a number c in F and multiply P(x) by it, you get another polynomial

cP(x) = cp0+cp1x+cp2x²+...

Also, if you had two polynomials P(x) and R(x) = r0 + r1x + r2x²+..., you can add them to get a third polynomial

P(x)+R(x) = (p0+r0) + (p1+r1)x+(p2+r2)x²+...

So polynomials with real coefficients form a real vector space. Similarly, polynomials with complex coefficients form a complex vector space. The polynomials of degree n or below always form an n+1 dimensional vector space, since we have the basis (1, x, x²,...,xⁿ).
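
Here is a small sketch of that idea (my own illustration, assuming Python with NumPy): store a polynomial of degree at most 2 as its coefficient vector in the basis (1, x, x²), and the vector operations become polynomial operations.

import numpy as np

# P(x) = 1 + 2x + 3x² and R(x) = 4 - x, as coefficient vectors in the basis (1, x, x²).
P = np.array([1.0, 2.0, 3.0])
R = np.array([4.0, -1.0, 0.0])

print(P + R)    # [5. 1. 3.]  ->  5 + x + 3x²
print(2.0 * P)  # [2. 4. 6.]  ->  2 + 4x + 6x²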

This can be generalized to functions other than polynomials. The space of functions of k variables that always spit out real numbers forms a real vector space, since such a function times a real number is the same kind of function, and the sum of two such functions is again such a function. If the functions return complex numbers instead, the vector space is complex. The dimensions of more general function spaces can be a bit difficult to deal with since function spaces are often infinite-dimensional.

Given a vector space V over a field F, we can consider the function space called the dual space of V, which is the set of linear functions from V to the scalar field F. Specifically, it is the set of functions f that take vectors in V and send them to values in F, with the constraints that given two vectors u and v and a scalar c,
f(u+v) = f(u) + f(v)
f(cu) = cf(u)
A more explicit construction of the dual space will come in the section on matrices.

-To Be Finished-

Layra-chan
Crew

PostPosted: Mon Jun 22, 2009 10:57 am


Products of Vectors

Although vectors can be added naturally, there is no one single notion of what it means to multiply two vectors. Instead there are a few different ideas, the two most popular being the dot product and the cross product.

The Dot Product

Once again in R², we define the dot product of two vectors as follows: for u = (u1, u2)ᵀ and v = (v1, v2)ᵀ, the dot product of u and v is:
u·v = u1·v1+u2·v2
where the · for the ui and such is just regular multiplication.

So, for example, if we take the vectors (2, 3)ᵀ and (1, -2)ᵀ, their dot product is
(2, 3)ᵀ·(1,-2)ᵀ = 2·1+3·(-2) = 2-6 = -4

In general, for vectors in Rⁿ, the dot product of u = (u1, u2, u3,...,un)ᵀ and v = (v1, v2, v3,...,vn)ᵀ is
u·v = u1·v1+u2·v2+u3·v3+...+un·vn

The dot product gives us some important information about the vectors in Rⁿ. For Cⁿ and for function spaces, there are other similar objects, but we'll concentrate on Rⁿ for now.

If you take the dot product of a vector with itself, you get
u·u = u1²+u2²+u3²+...+un²
Pythagoras' theorem tells us that this is the square of the length of the vector, i.e. if you started at the origin, (0, 0, 0,...,0) and went along the vector u, you'd end up at a point a total distance of √(u1²+u2²+u3²+...+un²) away from the origin.
We write the length of the vector u as ||u||, and thus we get that
u·u = ||u||²

Another interesting case is what it means when u·v = 0. For example, (1, 0)ᵀ·(0,1)ᵀ = 0. Note that (1, 0)ᵀ and (0, 1)ᵀ form a right angle. In fact, this is true whenever u·v = 0. Another such pair is (1, 1)ᵀ and (1, -1)ᵀ, which you can check yourself.
In three dimensions, we have that (1, 0, 0)ᵀ, (0, 1, 0)ᵀ and (0, 0, 1)ᵀ are all at right angles, and the dot product of any two of them is 0.
In general, we say that two vectors are orthogonal when their dot product is 0. The word orthogonal actually has a slightly more general meaning, which is why we use it instead of "at right angles" or "perpendicular".

There is in fact a general relation for the dot product that encapsulates both the result about length and the result about orthogonal vectors. If we denote the angle between two vectors u and v as θ, then we get that
u·v = ||u||·||v||cos(θ)

Note again that the · on the right means regular multiplication, since the length of a vector is a scalar.
In the case where u = v, θ = 0, so cos(θ) = 1 and thus the right side is just ||u||², while if u and v are orthogonal, θ = 90° so cos(θ) = 0 and the right side becomes 0.
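
A quick numerical check of these claims, as a sketch assuming Python with NumPy (the vectors are the ones from the example above, plus an arbitrary orthogonal pair):

import numpy as np

u = np.array([2.0, 3.0])
v = np.array([1.0, -2.0])

print(np.dot(u, v))                        # -4.0, matching 2·1 + 3·(-2)
print(np.dot(u, u), np.linalg.norm(u)**2)  # both are 13 (up to rounding)

# An orthogonal pair: the dot product is 0.
print(np.dot(np.array([1.0, 1.0]), np.array([1.0, -1.0])))  # 0.0

# u·v = ||u|| ||v|| cos(θ), so we can recover the angle between u and v.
cos_theta = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
print(np.degrees(np.arccos(cos_theta)))    # roughly 119.7 degrees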

The Cross Product
While the dot product can be defined in Rⁿ, the cross product only really makes sense in certain cases, specifically R², R³ and R⁷. I will first deal with the R³ and R⁷ cases, as they are similar.

The cross product for R³ and R⁷ returns a vector, unlike the dot product which returns a scalar.
For two vectors u = (u1, u2, u3)ᵀ and v = (v1, v2, v3)ᵀ, we get that the cross product of the two vectors is
uxv = (u2·v3-u3·v2, u3·v1-u1·v3, u1·v2-u2·v1)ᵀ
For two vectors u' = (u1, u2, u3, u4, u5, u6, u7)ᵀ and v' = (v1, v2, v3, v4, v5, v6, v7)ᵀ, we get that the cross product of the two vectors is
(u2·v3-u3·v2+u4·v5-u5·v4+u7·v6-u6·v7,
u3·v1-u1·v3+u4·v6-u6·v4+u5·v7-u7·v5,
u1·v2-u2·v1+u4·v7-u7·v4+u6·v5-u5·v6,
u5·v1-u1·v5+u6·v2-u2·v6+u7·v3-u3·v7,
u1·v4-u4·v1+u3·v6-u6·v3+u7·v2-u2·v7,
u1·v7-u7·v1+u2·v4-u4·v2+u5·v3-u3·v5,
u2·v5-u5·v2+u3·v4-u4·v3+u6·v1-u1·v6)ᵀ
Don't let the hugeness of the R⁷ version scare you; there is a logic behind it. If you ignore the last four coordinates, you get the R³ cross product.

The cross product has a number of interesting properties.

Note that uxv = -vxu. Because of this, the cross-product gives a sense of orientation, of whether things have been reflected across a mirror or not. Also, we get that the cross product of a vector with itself is always the 0 vector.

A geometrical interpretation of the cross product is that the area of the parallelogram formed by having u and v as two of its sides is given by ||uxv||. Some fiddling with triangles will give you that if the angle between u and v is given by θ,
||uxv|| = ||u||·||v||·|sin(θ)|

An interesting property of the cross product is that the product vector is orthogonal to both of the original vectors, i.e.
u·(uxv) = v·(uxv) = 0

This follows from a more general notion of what happens when you take the dot product of a vector with the result of a cross product of two vectors:
Given three vectors u, v and w, the volume V of the parallelepiped whose edges are defined by the three vectors (analogous to the parallelograms in two dimensions) is given by
V = |u·(vxw)|
If two of the vectors are the same, then the parallelepiped is flat and thus has no volume.
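
These properties are easy to check numerically; here is a sketch assuming Python with NumPy, whose np.cross implements the R³ formula above (the vectors are arbitrary examples of mine):

import numpy as np

u = np.array([1.0, 2.0, 3.0])
v = np.array([4.0, 5.0, 6.0])
w = np.array([1.0, 0.0, 1.0])

c = np.cross(u, v)
print(c)                           # [-3.  6. -3.]
print(np.cross(v, u))              # [ 3. -6.  3.], i.e. v x u = -(u x v)
print(np.dot(u, c), np.dot(v, c))  # 0.0 0.0, so u x v is orthogonal to both

# |u·(v x w)| is the volume of the parallelepiped with edges u, v, w.
print(abs(np.dot(u, np.cross(v, w))))
# With a repeated edge the parallelepiped is flat and the volume is 0.
print(abs(np.dot(u, np.cross(u, w))))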

You might come across a notion of a cross product in two dimensions. This is in reality just derived from the three-dimensional case. Specifically, for vectors u = (u1, u2)ᵀ and v = (v1, v2)ᵀ, we define the vectors u' = (u1, u2, 0)ᵀ and v' = (v1, v2, 0)ᵀ, and say that the cross product of u and v is just the third component of the cross product of u' and v', i.e. u1·v2-u2·v1. As in the three- and seven-dimensional cases, the absolute value of this cross product is the area of the parallelogram whose sides are defined by u and v.

There are notions that generalize the cross product to other dimensions, but I shall leave them aside for now.

-To be finished-
PostPosted: Mon Jun 22, 2009 10:59 am


Matrices

Just as a vector is a column of numbers, a matrix is a grid of numbers, with rows and columns. If we have an m by n matrix, we have a set of m rows, where each row has n numbers in it.
So a 2 by 3 matrix would be something like:

[[A11, A12, A13],[A21, A22, A23]], where Aij denotes the entry in the i'th row and j'th column.

A matrix where m and n are the same is called a square matrix i.e. an n by n matrix is called an n-dimensional square matrix.

Like vectors, matrices can be added to matrices with the same dimensions, and can be multiplied by scalars from the appropriate scalar field. Matrix addition is, like with vectors, done entry by entry, so that we have

[[A11, A12],[A21, A22]] + [[B11, B12],[B21, B22]] = [[A11+B11, A12+B12],[A21+B21, A22+B22]]

Similarly, scalar multiplication is also done entry by entry.

Unlike vectors, however, matrices have a natural notion of multiplication, at least for certain cases. Specifically, when we have an m by n matrix A and an n by p matrix B, we can multiply them together to get an m by p matrix AB as follows:

(AB)ij = Ai1·B1j + Ai2·B2j + ... + Ain·Bnj

Note that the entry in the (i, j) position of the new matrix AB is equal to the dot product of the ith row of matrix A with the jth column of matrix B, if we interpret the row and column as vectors.
In fact, we can define the multiplication of an m by n matrix A by an n-dimensional vector v as such, writing v as a column:

(Av)i = Ai1·v1 + Ai2·v2 + ... + Ain·vn

Note that we get a vector again, but of a different dimension. We can actually consider vectors to be like n by 1 matrices.

We define the transpose of a matrix to be what we get if we flip the entries of the matrix over the upper-left to lower-right diagonal, switching the dimensions as well.

(Aᵀ)ij = Aji, so for example [[A11, A12, A13],[A21, A22, A23]]ᵀ = [[A11, A21],[A12, A22],[A13, A23]]

We denote the transpose of a matrix A by Aᵀ. This is why I wrote the ᵀs on the coordinate forms of vectors earlier, because they are supposed to be written as columns, not horizontal lists.
Now the dot product of two vectors u and v can be written as
uᵀv
which gives a 1 by 1 matrix, i.e. a scalar.

Note that you can't just switch the order of multiplication. First, the dimensions have to match, and second, even if the dimensions do match, you're taking different things from each matrix (rows from one, columns from the other), so the order of multiplication matters. Thus we say that matrix multiplication is noncommutative. We've already seen an example of noncommutative multiplication with the cross product, but in that case swapping the order only flips the sign. Matrix multiplication is even more complicated.

[[0, 1],[0, 0]][[0, 0],[1, 0]] = [[1, 0],[0, 0]], but [[0, 0],[1, 0]][[0, 1],[0, 0]] = [[0, 0],[0, 1]]

There is a special kind of n by n square matrix called the identity matrix, denoted usually by In. The entries on the diagonal going from the top left to the bottom right are 1, and everything else is a 0. The identity matrix is called such because given an m by n dimensional matrix A, AIn = A, and given an n by m dimensional matrix B, InB = B.
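
As a concrete sketch of all of this (assuming Python with NumPy; the matrices are arbitrary examples):

import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])   # a 2 by 3 matrix
B = np.array([[1, 0],
              [0, 1],
              [2, 1]])      # a 3 by 2 matrix

print(A @ B)    # the 2 by 2 product; entry (i, j) is row i of A dotted with column j of B
print(A.T)      # the transpose, a 3 by 2 matrix

v = np.array([1, 1, 1])
print(A @ v)    # matrix times vector: [ 6 15]

I2 = np.eye(2)
print(np.allclose(I2 @ (A @ B), A @ B))  # True: multiplying by the identity changes nothing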

-To be finished-

Layra-chan
Crew

PostPosted: Mon Jun 22, 2009 11:04 am


Linear Transformations

Why is matrix multiplication so weird?

Suppose we want a function f from one vector space to another that preserves vector addition and scalar multiplication, i.e. for any vectors u and v and any scalar c,
f(u + v) = f(u) + f(v)
f(cv) = cf(v)

f is called a linear transformation. It turns out that for functions from an n-dimensional vector space to an m-dimensional one, where both n and m are finite, the function f can be written as multiplication by an m by n matrix. Note that the columns of the matrix are the images under f of the basis vectors of the input space, and because the linear transformation respects vector addition and scalar multiplication, we can find out where any linear combination of basis vectors ends up by taking the corresponding linear combination of the images of the basis vectors.

For example, if f sends (1, 0)ᵀ to (a, c)ᵀ and (0, 1)ᵀ to (b, d)ᵀ, then f is multiplication by the matrix [[a, b],[c, d]].

Let's consider a few simple linear transformations:

Consider the plane. Suppose we want to rotate everything by 90°, keeping everything the same length. Then we write:

[[0, 1],[-1, 0]][[x],[y]] = [[y],[-x]]

which you can check is (x, y)ᵀ rotated by 90°. Let us denote the rotation matrix by A.

Suppose we want add whatever the x-coordinate is to the y-coordinate (this is called a shear). We write this as

[[1, 0],[1, 1]][[x],[y]] = [[x],[x+y]]

We denote the shear matrix as B.

Now suppose we want to first perform a shear, and then a rotation. We write this as multiplying first by B and then by A (note that this is written in reverse order since matrices multiply on the left):

A(Bv) = [[0, 1],[-1, 0]]([[1, 0],[1, 1]][[x],[y]]) = [[x+y],[-x]]

Note that this is what you'd get if you multiplied v by (AB) using the matrix multiplication rule.
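
You can verify this composition numerically; here is a sketch assuming Python with NumPy, with an arbitrary test vector of my choosing:

import numpy as np

A = np.array([[0, 1],
              [-1, 0]])   # the rotation
B = np.array([[1, 0],
              [1, 1]])    # the shear

v = np.array([2, 5])      # an arbitrary (x, y)

print(A @ (B @ v))        # [ 7 -2], i.e. (x+y, -x)
print((A @ B) @ v)        # the same: composing maps matches multiplying matrices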

You can also do other things with linear transformations/matrix multiplication. For example, if you have a three-dimensional object, you can project it to a two-dimensional or even one-dimensional space using a 2 by 3 or 1 by 3 matrix. These projections aren't unique; the different planes/lines that you can project onto correspond to different matrices.

Kernel and Image

Any linear transformation A from a vector space U to a vector space V defines a subspace of each. The kernel of A is the set of vectors in U that A sends to 0; since A is linear, this is actually a subspace of U. The image of A is the set of vectors v in V such that there is a u in U where Au = v; again, this is actually a subspace, this time of V.
The dimension of the kernel is called the nullity of A, and the dimension of the image is called the rank of A. A useful fact is that the rank plus the nullity is the dimension of U; this is called the Rank-Nullity theorem.
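
A small illustration of the Rank-Nullity theorem, as a sketch assuming Python with NumPy (the matrix is an arbitrary choice of mine):

import numpy as np

A = np.array([[1, 2, 0],
              [0, 0, 1]])        # a linear transformation from R³ to R²

rank = np.linalg.matrix_rank(A)
nullity = A.shape[1] - rank      # Rank-Nullity: rank + nullity = dim of the domain
print(rank, nullity)             # 2 1

# (-2, 1, 0)ᵀ is in the kernel: A sends it to the zero vector.
print(A @ np.array([-2, 1, 0]))  # [0 0]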

Change of Basis

When you move from one basis of a vector space to another, you find that the operation of changing the basis has some familiar properties.
Using u, v and w to denote vectors in the original basis and u', v' and w' to denote vectors in the new basis, we find that the following relations hold:

If u+v = w, then u'+v' = w'
If cu = w, then cu' = w'

In other words, changing the basis is a linear transformation! Therefore, it can be represented by a matrix, which we'll call the "change of basis matrix".

Suppose that we have a basis e1,...,en, and we have a new basis f1,...,fn.
The change of basis matrix P is given as follows: write each ei in terms of the f basis, i.e. as a linear combination of the f vectors; the coefficient of fj in the linear combination for ei then goes in the j'th row and i'th column of P. In other words, the i'th column of P is ei written in f coordinates. Then any vector u written in the e basis becomes Pu when written in the f basis.
Note that we are able to switch back to the e basis from the f basis, so the matrix P must have an inverse, i.e. there must exist a matrix Pˉ¹ such that PPˉ¹ = I.
So what happens to a linear transformation when you change basis? Well, a matrix A for a linear transformation is written with respect to two bases; the vector space of the input has a basis, and the vector space of the output has a basis. If we change the basis of the input vector space by a matrix P and change the output vector space by a matrix Q, we get that the matrix A becomes QAPˉ¹. Why is this?
In the original bases, the vector u got sent to Au; this vector in the new basis is QAu, so we want a matrix such that Pu gets sent to QAu; thus, to cancel out the P, we use the matrix QAPˉ¹.

Oftentimes we consider linear transformations from a space to itself. In that case when changing basis we set Q = P, since we want the basis to which we apply the transformation to be the same basis that the result is expressed in. So then the matrix A with the change of basis P would then be PAPˉ¹.
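
Here is a numerical sketch of the PAPˉ¹ rule (assuming Python with NumPy; the matrix, the change of basis, and the test vector are all arbitrary choices of mine):

import numpy as np

A = np.array([[3, 2],
              [-1, 0]])           # a linear map written in the old basis
P = np.array([[1, 0],
              [1, 1]])            # an invertible change-of-basis matrix
u = np.array([2, -1])             # a vector written in the old basis

A_new = P @ A @ np.linalg.inv(P)  # the same map written in the new basis

# Changing basis after applying the map agrees with applying the new matrix
# to the vector's new coordinates.
print(P @ (A @ u))                # [4 2]
print(A_new @ (P @ u))            # [4. 2.]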

-To be finished-
PostPosted: Mon Jun 22, 2009 11:09 am


Trace and Determinants

When you change basis, the actual values of the entries of a matrix change. But for square matrices, there are two important numbers associated with a given square matrix that don't change. These are known as the trace and the determinant.

Let's consider a square matrix A. We can denote its entries by Aij, where the first index i denotes the row, and the second index j denotes the column. So A32 is the entry in the third row from the top and the second column from the left.

Trace

Consider the diagonal of A going from the top-left to the bottom-right, i.e. the entries of the form Aii. Then the sum of these diagonal entries is called the trace of A, i.e.

tr(A) = A11 + A22 + A33 + ... + Ann

The trace has some nice properties. It is linear, i.e. given two n-by-n matrices A and B and a scalar c,

tr(A+B) = tr(A)+tr(B)
tr(cA) = ctr(A)

As mentioned earlier, it does not change under change-of-basis, specifically, tr(PAPˉ¹) = tr(A). Also, if you have an n-by-m matrix A and an m-by-n matrix B (note the dimensions), then AB and BA are both square matrices, albeit perhaps of different dimensions, and tr(AB) = tr(BA).
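
A quick check of these trace properties (a sketch assuming Python with NumPy; the matrices are arbitrary, and P is just some invertible matrix):

import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
B = np.array([[0.0, 1.0],
              [1.0, 0.0]])
P = np.array([[2.0, 1.0],
              [1.0, 1.0]])   # invertible, since its determinant is 1

print(np.trace(A + B), np.trace(A) + np.trace(B))  # linearity: both 5.0
print(np.trace(A @ B), np.trace(B @ A))            # tr(AB) = tr(BA): both 5.0
print(np.trace(P @ A @ np.linalg.inv(P)))          # equals tr(A) = 5.0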

Determinant

The (absolute value of) the determinant tells you how volume changes when A is applied.
For example, if A is a two-by-two matrix, then the determinant of A tells you how large the parallelogram with sides [A11, A12] and [A21, A22] would be. Similarly, if A is an n-by-n matrix, then the determinant of A tells you how big an n-volume the n-dimensional parallelepiped with edges given by the columns of A would be.

Before giving a formula for the determinant, we need to talk about permutations. A permutation is a way to order the numbers from 1 to n. For example, there are six permutations of the numbers {1, 2, 3}: 123, 132, 213, 231, 312, 321. We can consider a permutation to be a function from the numbers from 1 to n to the numbers from 1 to n. So for the function associated with 231, we say that the function sends 1 to 2, 2 to 3, and 3 to 1.
In general, a permutation/function is denoted by sigma, σ. So again, in the case where σ represents 231, σ(1) = 2, σ(2) = 3 and σ(3) = 1.
A permutation has a sign, denoted by sgn(σ); if the sequence of the permutation can be gotten by starting with the numbers in order and performing an odd number of switches of adjacent numbers, then sgn(σ) = -1; if it takes an even number of switches, then sgn(σ) = 1. 231 is even, since we start with 123, switch the first pair of numbers to get 213, and then switch the second pair to get 231. 321 is odd, since we switch the first pair to get 213, then the second pair to get 231, and then the first pair again to get 321.

Now we can finally give a formula for the determinant.

det(A) = Σσ sgn(σ)·A1σ(1)·A2σ(2)·...·Anσ(n), where the sum is taken over all permutations σ of the numbers 1 to n

In the two-by-two case, we can simply write
det(A) = A11A22-A21A12

For two n-by-n matrices A and B, det(AB) = det(A)det(B). In other words, the determinant is multiplicative. This gives us that det(I) = 1, and that det(Aˉ¹) = (det(A))ˉ¹
It turns out that this fact gives us a handy way to check if a matrix is invertible: if there exists an inverse Aˉ¹, then det(A) cannot be 0, and conversely if det(A) is not 0, then Aˉ¹ exists.

Since the volume of a region does not depend on the basis, we expect, and indeed have claimed earlier, that the determinant does not change when the basis changes. We see that given a change-of-basis matrix P,
det(PAPˉ¹) = det(P)det(A)det(Pˉ¹) = det(P)(det(P))ˉ¹det(A) = det(A)
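
And a matching check for the determinant (again a sketch assuming Python with NumPy, with arbitrary matrices):

import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
B = np.array([[2.0, 0.0],
              [1.0, 1.0]])

print(np.linalg.det(A))            # -2.0, matching A11·A22 - A21·A12 = 1·4 - 3·2
print(np.linalg.det(A @ B),
      np.linalg.det(A) * np.linalg.det(B))   # multiplicative: both -4.0 (up to rounding)
print(np.linalg.det(np.eye(2)))              # 1.0
print(np.linalg.det(np.linalg.inv(A)))       # -0.5 = 1/det(A)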

-To be finished-

Layra-chan
Crew

PostPosted: Mon Jun 22, 2009 11:22 am


Eigenvalues and Eigenvectors

Consider the identity matrix; it sends every vector to itself. Two times the identity matrix sends every vector to twice itself.
Now consider the following matrix A:

3 2
-1 0

A sends (1, 0)ᵀ to (3, -1)ᵀ and (0, 1)ᵀ to (2, 0)ᵀ. But it also sends (1, -1)ᵀ to (1, -1)ᵀ and (2, -1)ᵀ to (4, -2)ᵀ. So it sends (1, -1)ᵀ to itself and (2, -1)ᵀ to twice itself.

We call (1, -1)ᵀ and (2, -1)ᵀ eigenvectors of the matrix A, and we call 1 and 2 the eigenvalues that correspond to (1, -1)ᵀ and (2, -1)ᵀ respectively.

Because A sends its eigenvectors to multiples of themselves, A² also sends the same vectors to multiples of themselves, and so on. So A² applied to (1, -1)ᵀ is again (1, -1)ᵀ, and (2, -1)ᵀ gets sent to (8, -4)ᵀ. Applying A again gives (1, -1)ᵀ and (16, -8)ᵀ respectively, and so on.

Note that (2, -2)ᵀ is also an eigenvector A with eigenvalue 1, and that (4, -2)ᵀ is also an eigenvector of A with eigenvalue 2. In general, the set of eigenvectors with a shared eigenvalue λ form a vector subspace (called the eigenspace for λ): if a vector u is an eigenvector with eigenvalue λ, then for any scalar c, cu is an eigenvector with eigenvalue λ, and if u and v both have eigenvalue λ, then u+v also has eigenvalue λ. The dimension of the eigenspace is called the multiplicity of λ.

Another way of thinking about eigenvectors is that an eigenvector with eigenvalue λ is in the kernel of A - λI. In turn, this means that det(A - λI) = 0.

For a general n-by-n matrix A, if we have enough linearly independent eigenvectors to make a basis for the n-dimensional space that A acts on, then we call such a basis an eigenbasis. Note that we do not always have enough eigenvectors, though. For instance, the matrix

[[1, 1],[0, 1]]

has (1, 0)ᵀ as an eigenvector, but all other eigenvectors are multiples of (1, 0)ᵀ.

Sometimes the eigenvectors and the eigenvalues are not over the same scalar field as the matrix. For instance, the matrix
[[0, 1],[-1, 0]]
can be viewed as a matrix over the real numbers, but its eigenvalues are i and -i, and its eigenvectors are (1, i)ᵀ and (1, -i)ᵀ respectively; in other words, over the real numbers the matrix has no eigenvalues or eigenvectors.

If we do have an eigenbasis for a matrix A, then we can write any vector
u as u1e1+...+unen
where the ej are the elements of the eigenbasis, with corresponding eigenvalues λj; then we get that
Au = λ1u1e1+...+λnunen

For a matrix A with enough linearly independent eigenvectors to form an eigenbasis, we can change to the eigenbasis so that A becomes diagonal, with the eigenvalues as the diagonal entries; the number of times an eigenvalue appears is its multiplicity.
Because of this, we get that the trace of A is the sum of its eigenvalues (taking into account multiplicity), and the determinant of A is the product of its eigenvalues, again with multiplicity.
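
You can check all of this on the matrix A from the start of this section; here is a sketch assuming Python with NumPy:

import numpy as np

A = np.array([[3.0, 2.0],
              [-1.0, 0.0]])

vals, vecs = np.linalg.eig(A)
print(vals)   # the eigenvalues 2 and 1, in some order
print(vecs)   # the columns are corresponding eigenvectors, scaled to unit length

print(np.trace(A), vals.sum())        # both 3: trace = sum of the eigenvalues
print(np.linalg.det(A), vals.prod())  # both 2 (up to rounding): determinant = product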

We say that u is a generalized eigenvector with generalized eigenvalue λ if there is some positive integer m such that u is in the kernel of (A-λI)ᵐ. In this case, at least over C, we have enough generalized eigenvectors to make a basis, and we can write A as an upper-triangular matrix with the generalized eigenvalues along the diagonal. In this case the trace is the sum of the generalized eigenvalues with multiplicity, and the determinant is the product of the generalized eigenvalues with multiplicity.

-To be finished-
PostPosted: Mon Jun 22, 2009 11:25 am


Multilinear Forms

Remember the dot product? u·v = u1v1+u2v2+...

We can imagine that the dot product is actually written as uᵀIv, where I is the identity matrix. If you do out the multiplication explicitly, you'll find that this matches the original definition of the dot product.

What happens if we replace I by some other matrix?
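
As a sketch of where this is going (assuming Python with NumPy; the vectors and the replacement matrix are arbitrary choices of mine):

import numpy as np

u = np.array([1.0, 2.0])
v = np.array([3.0, -1.0])

I = np.eye(2)
print(u @ I @ v, np.dot(u, v))   # both 1.0: uᵀIv is just the dot product

M = np.array([[2.0, 0.0],
              [0.0, 1.0]])       # replace I with some other matrix
print(u @ M @ v)                 # 4.0: a different product of the same two vectors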

-To be finished-

Layra-chan
Crew

PostPosted: Mon Jun 22, 2009 11:29 am


Orthogonality and Normality

For certain multilinear forms there are other notions associated with them.

Consider the dot product again. What happens when you apply a change of basis matrix?

-To be finished-
PostPosted: Mon Jun 22, 2009 11:33 am


Direct, Tensor and Wedge Products

There are yet more notions of what the product of two vectors entails. But instead of ending up in the base field like a multilinear form or in the same vector space like the cross product, these products end up in spaces of higher dimension.

-To be finished-
