MATRIX ALGEBRA FOR STATISTICS: PART 1
Matrices provide a compact notation for expressing systems of equations or
variables. For instance, a linear function might be written as:

y = x1·b1 + x2·b2 + … + xk·bk

This is really the product of a bunch of b variables and a bunch of x variables. A
vector is simply a collection of variables (in a particular order). We could define a
(k-dimensional) vector x = (x1, x2, …, xk), and another vector b = (b1, b2, …, bk). Again,
these vectors simply represent the collections of x and b variables; the dimension of
the vector is the number of elements in it.
We define the product of two vectors to be:

x · b = Σi xi·bi = x1b1 + x2b2 + … + xkbk
(Specifically, this is called a dot product or inner product; there exist other ways
to calculate products, but we won’t be using those.) If you think of b as “the
collection of all b variables” and x as “the collection of all x variables”, then the
product x · b is the sum of the products of each b variable with the corresponding x variable.
You can calculate the (dot) product only when the two vectors have the same
dimension.
Example: Let a = (1, 2, 3, 4) and let b = (5, 6, 7, 8). These are both 4-dimensional vectors,
so we can calculate their dot product:

a · b = (1·5) + (2·6) + (3·7) + (4·8) = 5 + 12 + 21 + 32 = 70.
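To make this concrete, here is a minimal sketch of the same calculation in Python with NumPy (the library choice and variable names are mine, not part of the original notes):

```python
import numpy as np

a = np.array([1, 2, 3, 4])
b = np.array([5, 6, 7, 8])

# The dot product multiplies corresponding elements and sums them.
print(np.dot(a, b))   # 1*5 + 2*6 + 3*7 + 4*8 = 70
print(a @ b)          # the @ operator is an equivalent way to write it
```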
We say that two vectors are orthogonal if their dot product equals zero.
Orthogonality has two interpretations. Graphically, it means that the vectors are
perpendicular. On a deeper philosophical level, it means that the vectors are
unrelated.
Example: Let c = (0, 1) and let d = (1, 0). Since c · d = (0·1) + (1·0) = 0 + 0 = 0, the
vectors are orthogonal. Graphically, we can represent c as a line from the origin to the point
(0, 1) and d as a line from the origin to (1, 0). These lines are perpendicular. (In a deeper
sense, they are “unrelated” because the first vector moves only along the y-axis and never changes
its x-direction; the second moves only along the x-axis and doesn’t change its y-direction.)
Example: Let e = (1, 1) and let f = (1, −1). Since e · f = (1·1) + (1·(−1)) = 1 + (−1) = 0, the
vectors are orthogonal. Again, we can show that these lines are perpendicular in a graph. (It’s a
bit harder to see graphically how they are “unrelated,” but we could create a new coordinate system
for the space in which they are.) There’s a moral to this exercise: two vectors can have a product
of zero even though neither of the vectors is zero.
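As a quick sketch (again using NumPy by my own choice), we can confirm both orthogonality examples numerically:

```python
import numpy as np

c, d = np.array([0, 1]), np.array([1, 0])
e, f = np.array([1, 1]), np.array([1, -1])

# Two vectors are orthogonal when their dot product is zero.
print(c @ d)  # 0
print(e @ f)  # 0  -- neither e nor f is the zero vector, yet the product is zero
```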
Finally, dot products have a statistical interpretation. Let’s let x and y be two
random variables, each with mean zero. We will collect a sample of size N, and we
will record the values of x and y for each observation. We can then construct a
vector x = (x1, x2, …, xN) and a similar vector y = (y1, y2, …, yN). When we take their
dot product, we calculate:

x · y = x1y1 + x2y2 + … + xNyN = (N − 1) · Côv(x, y)
The dot product is essentially their (empirical) covariance. Saying that the vectors x
and y are orthogonal is exactly the same as saying that the variables x and y are
uncorrelated.
Similarly, the dot product of a vector with itself is:

x · x = x1² + x2² + … + xN² = (N − 1) · Vâr(x)
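A small NumPy sketch of this relationship (the data are made up for illustration; I demean the samples so the mean-zero assumption holds exactly):

```python
import numpy as np

rng = np.random.default_rng(0)
N = 100
x = rng.normal(size=N)
y = rng.normal(size=N)
x = x - x.mean()   # force the sample means to zero
y = y - y.mean()

# Dot product versus (N - 1) times the sample covariance / variance.
print(x @ y, (N - 1) * np.cov(x, y)[0, 1])
print(x @ x, (N - 1) * np.var(x, ddof=1))
```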
Here’s an unnecessary bit of trivia: if we graph two vectors in N-dimensional space,
the angle θ between them must always satisfy:

cos θ = (x · y) / ( √(x · x) · √(y · y) )

In the case of these random variables,

cos θ = (N − 1)·Côv(x, y) / ( √((N − 1)·Vâr(x)) · √((N − 1)·Vâr(y)) ) = Côrr(x, y)
The correlation coefficient is the cosine of the angle between the vectors!
(Remember that the cosine of the angle between two rays is one if they point in exactly the same
direction, zero if they are perpendicular, negative one if they point in exactly
opposite directions, exactly the same as with correlations.) Coincidence? Not
really, on a very deep level, but we don’t have to go there.
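A sketch of this fact, continuing the NumPy examples above (the variable names and data are mine):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=50)
y = 0.5 * x + rng.normal(size=50)
x, y = x - x.mean(), y - y.mean()

# Cosine of the angle between the demeaned data vectors...
cosine = (x @ y) / (np.sqrt(x @ x) * np.sqrt(y @ y))
# ...equals the sample correlation coefficient.
print(cosine, np.corrcoef(x, y)[0, 1])
```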
Now let’s move on to matrices. As it turns out, vectors are just special cases of
matrices, so there’s not much point in discussing them specifically. We used vectors
to express a single linear equation, and we will use matrices to represent a system of
linear equations, like:

y1 = x11·b1 + x12·b2 + … + x1k·bk
y2 = x21·b1 + x22·b2 + … + x2k·bk
⋮
yn = xn1·b1 + xn2·b2 + … + xnk·bk
(The subscripts above are two separate numbers. The first line would be read “y-one
equals x-one-one times b-one plus x-one-two times b-two….” A careful person might separate the
indices with a comma, to make it clear that x11 is not x-eleven.) Instead of this
complicated system of equations, we can represent the vector y = (y1, y2, …, yn) as the
product of an n × k matrix X with the k-dimensional vector b:

y = Xb
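As a sketch of what this notation buys us (NumPy again, with made-up numbers): a single matrix-vector product evaluates every equation in the system at once.

```python
import numpy as np

# X has one row per equation (n = 3) and one column per coefficient (k = 2).
X = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])
b = np.array([10.0, -1.0])

# y_i = x_i1*b_1 + x_i2*b_2 for every i, computed in one step.
y = X @ b
print(y)   # [ 8. 26. 44.]
```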
A matrix A is defined as a collection of n × k entries arranged into n rows and k
columns. The entry in the i-th row and j-th column is denoted by a_ij.
The elements of a matrix are scalars. A scalar is a real number (or a function that
takes on a specific value). Tacking the dimensions onto the bottom right-hand
corner of the matrix (writing A as A_{n×k}) makes it easier to remember the dimensions of that
matrix; this is strictly optional. The dimensions are always expressed as rows ×
columns. An n × k matrix is different from a k × n matrix.
Incidentally, a vector is just a special kind of matrix: it is a matrix with a single
column. An n-dimensional vector is nothing more or less than an n × 1 matrix.
A spreadsheet containing data is a common example of a matrix. I might have an
Excel file with my students’ grades:
Student Exam 1 Exam 2 Exam 3
Ann 90 85 86
Bob 78 62 73
Carl 83 86 91
Doris 92 91 90
Pat 97 98 93
Essentially, I have a 5 × 3 matrix of grades:

[ 90 85 86 ]
[ 78 62 73 ]
[ 83 86 91 ]
[ 92 91 90 ]
[ 97 98 93 ]
This is how we usually use matrices in econometrics: to express a collection of data.
We will be applying the same formula to each observation in our empirical model
(much as I would apply the same formula to calculate the final grade of each
student). However, let’s just leave this example matrix for now, and study basic
matrix operations.
Given an n × k matrix A with the entries described as above, the transpose of A is
the k × n matrix A′ (sometimes written as Aᵀ) that results from interchanging the
columns and rows of A. That is, the i-th column of A becomes the i-th row of A′;
the j-th row of A becomes the j-th column of A′:

(A′)_ij = a_ji

Think of this like flipping the matrix on its diagonal.
Example: With the matrix of grades above, the transpose is the 3 × 5 matrix:

[ 90 78 83 92 97 ]
[ 85 62 86 91 98 ]
[ 86 73 91 90 93 ]
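A sketch of the same operation in NumPy (the array holds the grades from the table above; the variable names are mine):

```python
import numpy as np

grades = np.array([[90, 85, 86],
                   [78, 62, 73],
                   [83, 86, 91],
                   [92, 91, 90],
                   [97, 98, 93]])

print(grades.shape)    # (5, 3): five students, three exams
print(grades.T)        # the 3 x 5 transpose
print(grades.T.shape)  # (3, 5)
```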
Addition of matrices is fairly straightforward. Given two
matrices A and B that have the same dimension n × k, their sum A + B is also an
n × k matrix, which we obtain by adding elements in the corresponding positions:

(A + B)_ij = a_ij + b_ij

Not all matrices can be added; their dimensions must be exactly the same. As with
addition of scalars (that is, addition as you know it), matrix addition is both
commutative and associative; that is, if A and B and C are matrices of the same
dimension, then (A + B) + C = A + (B + C) and A + B = B + A.
Example: Let D and E be the matrices below:

D = [ 1 2 ]    E = [ 1 0 ]
    [ 3 4 ]        [ 1 1 ]
    [ 6 7 ]        [ 0 1 ]
Then their sum is the matrix:

D + E = [ 1+1  2+0 ]   [ 2  2 ]
        [ 3+1  4+1 ] = [ 4  5 ]
        [ 6+0  7+1 ]   [ 6  8 ]
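A quick NumPy sketch of the addition example (and of subtraction, which works the same way, element by element):

```python
import numpy as np

D = np.array([[1, 2], [3, 4], [6, 7]])
E = np.array([[1, 0], [1, 1], [0, 1]])

print(D + E)   # [[2 2] [4 5] [6 8]]
print(D - E)   # elementwise subtraction works the same way
```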
Again, matrix addition probably feels very natural. Matrix subtraction is the same.
There are two types of multiplication used with matrices, and the first should also
feel natural. This is called scalar multiplication: we multiply an entire
matrix by a constant value. If α is some scalar (just a single number), and B is an
n × k matrix, then αB is computed by multiplying each component of B by the
constant α:

(αB)_ij = α · b_ij

Scalar multiplication has all the familiar properties: it is distributive, commutative,
and associative. That is, α(A + B) = αA + αB, (α + β)A = (β + α)A,
α(βA) = (αβ)A, and α(β + γ)A = αβA + αγA.
Example: Use the matrix D from the previous example. Then 4D is the matrix:

4D = [ 4·1  4·2 ]   [  4   8 ]
     [ 4·3  4·4 ] = [ 12  16 ]
     [ 4·6  4·7 ]   [ 24  28 ]
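In NumPy this is just the * operator between a number and an array (a sketch continuing the example):

```python
import numpy as np

D = np.array([[1, 2], [3, 4], [6, 7]])
print(4 * D)   # [[ 4  8] [12 16] [24 28]]
```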
Multiplying one matrix by another matrix is more complicated. Matrix
multiplication is only defined between an n × k matrix A and a k × m matrix B,
and the order matters. The number of columns in the first must equal the number of
rows in the second. Their product is the n × m matrix C, where the ij-th element is
defined as:

c_ij = a_i1·b_1j + a_i2·b_2j + … + a_ik·b_kj

In other words, we get c_ij by taking the dot product of the i-th row of A and
the j-th column of B.
Notice that multiplying a row of A by a column of B is unlikely to give you the same
answer as multiplying a column of A by a row of B. Matrix multiplication is not
commutative: AB ≠ BA (except by coincidence, or when both are diagonal matrices of
the same dimension). It is very, very important to keep the order right.
Here are two other rules to know about matrix multiplication:

AB = 0 does not imply that (A = 0 or B = 0), and AB = AC does not imply that B = C,

except in special cases. Fortunately, matrix multiplication is still associative and
distributive. That is, A(BC) = (AB)C and A(B + C) = AB + AC. This makes
multiplication a bit easier.
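Here is a small NumPy sketch of both warnings; the matrices are ones I picked for illustration, not from the original notes:

```python
import numpy as np

# AB can be the zero matrix even though neither A nor B is zero.
A = np.array([[0, 1],
              [0, 0]])
print(A @ A)   # [[0 0] [0 0]]

# AB = AC does not force B = C.
A = np.array([[1, 0],
              [0, 0]])
B = np.array([[1, 2],
              [3, 4]])
C = np.array([[1, 2],
              [9, 9]])
print(A @ B)   # [[1 2] [0 0]]
print(A @ C)   # [[1 2] [0 0]]  -- same product, but B != C
```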
Because I find it really hard to remember which column gets multiplied by which
row and ends up where, I use this trick to keep everything straight when multiplying
matrices. I align the two matrices A and B so that the second one is above and to
the right of the first. For each row i of A I trace a line out to the right, and for each
column j of B a line going down, and where these intersect is where their product
lies in the matrix C. This is like a coordinate system for the c_ij entries.
I also find this trick very useful for multiplying a bunch of matrices. If we have to find
the product ABD of three matrices, once I find C = AB as above, all I have to do is
stick the matrix D immediately to the right of B, and I have my “coordinate system”
for the product of C and D.
Example: Let F be a 2 × 2 matrix, and let G be a 2 × 2 matrix, defined below:

F = [ 1 2 ]    G = [  1 0 ]
    [ 3 4 ]        [ −1 2 ]

Then the product FG is the 2 × 2 matrix:

FG = [ 1·1 + 2·(−1)   1·0 + 2·2 ]   [ −1  4 ]
     [ 3·1 + 4·(−1)   3·0 + 4·2 ] = [ −1  8 ]
Example: Let C be a 2 × 3 matrix, and let D be a 3 × 2 matrix, defined below:

C = [ 1 2  0 ]    D = [ 1 2 ]
    [ 0 3 −1 ]        [ 3 4 ]
                      [ 6 7 ]

Then the product CD is the 2 × 2 matrix:

CD = [ 1·1 + 2·3 + 0·6   1·2 + 2·4 + 0·7 ]   [ 1+6+0   2+8+0  ]   [ 7 10 ]
     [ 0·1 + 3·3 − 1·6   0·2 + 3·4 − 1·7 ] = [ 0+9−6   0+12−7 ] = [ 3  5 ]
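A NumPy sketch of both products, plus a check that the order matters (DC is not even the same shape as CD here):

```python
import numpy as np

F = np.array([[1, 2], [3, 4]])
G = np.array([[1, 0], [-1, 2]])
print(F @ G)   # [[-1  4] [-1  8]]
print(G @ F)   # a different matrix: order matters

C = np.array([[1, 2, 0], [0, 3, -1]])
D = np.array([[1, 2], [3, 4], [6, 7]])
print(C @ D)           # [[ 7 10] [ 3  5]]
print((D @ C).shape)   # (3, 3): reversing the order gives a different shape
```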
Now let’s talk about some names for special types of matrices. A square matrix is
one that has the same number of rows as columns; that is, an n × n matrix. A
diagonal matrix is a square matrix that has the entry a_ij = 0 for all i ≠ j (in other
words, zero everywhere except for the diagonal). For example,

[ a_11   0     0   ]
[  0    a_22   0   ]
[  0     0    a_33 ]

is a diagonal matrix. A symmetric matrix is one that is the same as its
transpose, A = A′. An idempotent matrix is one that is the same when
multiplied by itself, A² = AA = A.
The n × n identity matrix (denoted by I or I_n) is a diagonal matrix with ones on
the diagonal (and zeros everywhere else):

I_n = [ 1 0 ⋯ 0 ]
      [ 0 1 ⋯ 0 ]
      [ ⋮ ⋮ ⋱ ⋮ ]
      [ 0 0 ⋯ 1 ]
This has the property that for any n × k matrix A, the product AI_k equals A. In
matrix multiplication, it is the analogue of the number one in simple multiplication.
(In fact, you could take the position that simple multiplication is just matrix
multiplication using 1 × 1 matrices; the 1 × 1 identity matrix is just [1].)
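A short sketch with np.eye, which builds identity matrices (the example matrix is mine):

```python
import numpy as np

A = np.array([[1, 2, 0],
              [0, 3, -1]])       # a 2 x 3 matrix
I3 = np.eye(3, dtype=int)        # the 3 x 3 identity

print(I3)
print(np.array_equal(A @ I3, A))   # True: multiplying by I leaves A unchanged
```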
We will use the identity matrix to define the matrix equivalent of division.
However, we never “divide” matrices; we always “multiply by the inverse”. With
normal numbers, the “inverse of a” is defined as the number a⁻¹ such that
a⁻¹·a = 1. Most square matrices (but not all) are invertible, and given an n × n
matrix A, its inverse is the matrix A⁻¹ with the property that:

A⁻¹A = AA⁻¹ = I_n
Computing inverses of matrices is a major pain in the ass most of the time.
Fortunately, we usually only do this in theory; we let Stata calculate it for us the rest
of the time. However, you should know that the inverse is not obtained by inverting
each individual component of the matrix.
Counterexample and Example: Let F be a 2 × 2 matrix, and let H be a 2 × 2
matrix, defined below:

F = [ 1 2 ]    H = [ 1/1  1/2 ]
    [ 3 4 ]        [ 1/3  1/4 ]

H is not the inverse of F, since the product is not the identity matrix:

FH = [ 1/1 + 2/3   1/2 + 2/4 ]   [  5/3   4/4 ]
     [ 3/1 + 4/3   3/2 + 4/4 ] = [ 13/3  10/4 ]
If we want to compute the inverse of F, it is some 2 × 2 matrix of the form:

F⁻¹ = [ w  x ]
      [ y  z ]

where we will treat w, x, y, and z as unknowns. This matrix F⁻¹ has the property that:
F F⁻¹ = [ 1·w + 2·y   1·x + 2·z ]   [ 1  0 ]
        [ 3·w + 4·y   3·x + 4·z ] = [ 0  1 ]
This gives us a system of four equations in four unknowns:
1·w + 2·y = 1
1·x + 2·z = 0
3·w + 4·y = 0
3·x + 4·z = 1
We can solve this by iterated substitution. The second equation tells us that x = −2z. We can
plug this into the last equation and get 3·(−2z) + 4z = 1, so −2z = 1, and z = −1/2. This
means that x = −2z = −2·(−1/2) = 1. Next, we observe from the third equation that
y = −(3/4)·w. Plugging this into the first equation, we have 1·w + 2·(−(3/4)·w) = 1, so
w − (3/2)·w = 1, −(1/2)·w = 1, so w = −2. This means that y = 3/2. Putting this all together, the
inverse of F must be:

F⁻¹ = [ −2     1  ]
      [ 3/2  −1/2 ]
We can verify this by taking the product:

F F⁻¹ = [ 1·(−2) + 2·(3/2)   1·1 + 2·(−1/2) ]   [ −2 + 3   1 − 1 ]   [ 1  0 ]
        [ 3·(−2) + 4·(3/2)   3·1 + 4·(−1/2) ] = [ −6 + 6   3 − 2 ] = [ 0  1 ]
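In practice we let software do this for us. A sketch with NumPy (the original notes mention Stata; NumPy is just my stand-in here):

```python
import numpy as np

F = np.array([[1.0, 2.0],
              [3.0, 4.0]])

F_inv = np.linalg.inv(F)
print(F_inv)            # [[-2.   1. ] [ 1.5 -0.5]]
print(F @ F_inv)        # the identity matrix (up to rounding)

# Note: this is NOT the same as inverting each element of F.
print(1.0 / F)
```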
Again, computing the inverse of a matrix is a pain, and the 2 × 2 case is as easy as it
gets! (More generally, with an n × n matrix you have n² equations in n²
unknowns, so it rapidly gets complicated.) We generally deal with matrix inverses
only in theory, so it’s important to know some theoretical properties of inverses. I’ll
add some rules for transposes as well, since they mirror the others:
(A⁻¹)⁻¹ = A        (AB)⁻¹ = B⁻¹A⁻¹        (A′)⁻¹ = (A⁻¹)′
(A′)′ = A          (AB)′ = B′A′           (A + B)′ = A′ + B′
Note that the order of multiplication changes when passing the transpose or inverse
through parentheses. Also, the rule (AB)⁻¹ = B⁻¹A⁻¹ works only when each matrix is
a square matrix (otherwise, they don’t have individual inverses, but their product
might be a square matrix, so it might still have an inverse).
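A quick numerical sketch of two of these rules, using matrices I made up:

```python
import numpy as np

A = np.array([[2.0, 1.0], [0.0, 3.0]])
B = np.array([[1.0, 4.0], [5.0, 6.0]])

# (AB)^(-1) = B^(-1) A^(-1)   (note the reversed order)
lhs = np.linalg.inv(A @ B)
rhs = np.linalg.inv(B) @ np.linalg.inv(A)
print(np.allclose(lhs, rhs))   # True

# (AB)' = B' A'
print(np.allclose((A @ B).T, B.T @ A.T))   # True
```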
As I mentioned before, not all square matrices are invertible. (The same is true of
regular numbers: zero has no inverse.) A square matrix that has no inverse is called a
singular matrix. Let me give you one example. The 2 × 2 matrix

J = [ 2  0 ]
    [ 2  0 ]

is not invertible. If it did have an inverse, it would be some matrix of the form:

J⁻¹ = [ w  x ]
      [ y  z ]

with the property that:

J J⁻¹ = [ 1  0 ]
        [ 0  1 ]

(That’s just the definition of an inverse.) That would mean that:

[ 2w + 0y   2x + 0z ]   [ 1  0 ]
[ 2w + 0y   2x + 0z ] = [ 0  1 ]
This gives the system of equations:
2w + 0y = 1
2w + 0y = 0
2x + 0z = 0
2x + 0z = 1
This cannot possibly have a solution. Look at the first two equations: they are the
same on the left-hand side, but equal one in the first equation and zero in the
second. For these to be satisfied, we would have to have 1 = 2w + 0y = 0, so 1 = 0.
That’s just not possible. J cannot possibly have an inverse, so it is “singular”.
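A sketch of what happens in NumPy when you ask for the inverse of this singular matrix:

```python
import numpy as np

J = np.array([[2.0, 0.0],
              [2.0, 0.0]])

print(np.linalg.det(J))   # 0.0: a zero determinant signals singularity

try:
    np.linalg.inv(J)
except np.linalg.LinAlgError as err:
    print("No inverse:", err)   # NumPy refuses: the matrix is singular
```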
Here are some rules for identifying whether a matrix is singular:
1. If all of the elements in one row (or column) of the matrix are zero, then the matrix has
no inverse.
2. If two rows (or two columns) are identical, then the matrix has no inverse.
3. If two rows (or two columns) are proportional, then the matrix has no inverse.
4. If one row (or one column) can be written as a linear function of some other rows (or of
some other columns), then the matrix has no inverse.
These essentially exhaust all possibilities. We can use the matrix J as an example of
both the first and second cases. The second column of J is all zeros, so this indicates
that the matrix has no inverse. It is also the case that the first and second rows of J
are identical, which also indicates that it has no inverse.
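As a final sketch, NumPy can check these conditions for us: a matrix whose rank is less than its dimension (equivalently, whose determinant is zero) is singular. The matrix K below is my own example of the third rule, with proportional rows.

```python
import numpy as np

J = np.array([[2.0, 0.0],
              [2.0, 0.0]])
K = np.array([[1.0, 2.0],
              [2.0, 4.0]])   # second row is proportional to the first

for name, M in [("J", J), ("K", K)]:
    rank = np.linalg.matrix_rank(M)
    det = np.linalg.det(M)
    print(name, "rank =", rank, "det =", det, "singular =", rank < M.shape[0])
```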