This class has three parts. In Part I, we analyze linear systems that transform an n-dimensional vector (x) into another n-dimensional vector (y). This transformation is often expressed as a linear system via matrix multiplication: y = Ax. In Part II, we expand the type of systems we can solve, including systems that transform vectors from n dimensions to m dimensions. We will also consider solution strategies that use alternative objectives when the system contains either too much or not enough information. Finally, Part III dispenses with linear systems altogether, focusing purely on observations of sets of n-dimensional vectors (matrices). We will learn how to analyze and extract information from matrices without a clear input/output relationship.
We will distinguish scalars, vectors, and matrices with the following typographic conventions:
Object      Font and Symbol                  Examples
Scalars     italicized, lowercase letters    x, α, y
Vectors     bold, lowercase letters          x, y, n, w
Matrices    bold, uppercase letters          A, A^{-1}, B, Γ
There are many ways to represent rows or columns of a matrix A. Since this course uses Matlab, we think it is convenient to use a matrix addressing scheme that reflects Matlab's syntax. So, the ith row of matrix A will be A(i, :), and the jth column will be A(:, j). Rows or columns of a matrix are themselves vectors, so we choose to keep the boldface font for the matrix A even when it is subscripted. We could also use Matlab syntax for vectors (x(i), for example). However, the form x_i is standard across many fields of mathematics and engineering, so we retain the common notation. The lack of boldface font reminds us that elements of vectors are scalars in the field.
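A short Matlab sketch of this addressing scheme (the matrix entries here are arbitrary):

    A = [1 2 3;
         4 5 6;
         7 8 9];

    A(2, :)    % the 2nd row:    [4 5 6]
    A(:, 3)    % the 3rd column: [3; 6; 9]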
These symbols can be used to succinctly write mathematical statements. For example, we can formally define the set of rational numbers as
A number is rational if and only if it can be expressed as the quotient of two integers.
with the statement
r ∈ Q ⇔ ∃ p, q ∈ Z, q ≠ 0, s.t. r = p/q
While the latter statement is shorter, it is more difficult to understand. So whenever possible we recommend writing statements with as few symbols as necessary. Rely on mathematical symbols only when a textual definition would be unwieldy or imprecise, or when brevity is important (like when writing on a chalkboard).
Associativity:    a + b + c = (a + b) + c = a + (b + c)
                  abc = (ab)c = a(bc)
Commutativity:    a + b = b + a
                  ab = ba
Distributivity:   a(b + c) = ab + ac
Identities:       a + 0 = a
                  1 × a = a
Inverses:         a + (−a) = 0
                  a × a^{-1} = 1 (for a ≠ 0)
It might surprise you that only five axioms are sufficient to recreate everything you know about algebra. For example, nowhere do we state the special property of zero that a × 0 = 0 for any number a. We don’t need to state this property, as it follows from the field axioms:
Theorem. a × 0 = 0
Proof.
a × 0 = a × (1 − 1)
      = a × 1 + a × (−1)
      = a − a
      = 0
Similarly, we can prove corollaries from the field axioms.
Corollary. If ab = 0, then either a = 0 or b = 0 (or both).
Proof. Suppose a ≠ 0. Then there exists a^{-1} such that

a^{-1}ab = a^{-1} × 0
   1 × b = 0
       b = 0

A similar argument follows when b ≠ 0.
The fundamental theorem of algebra relies on the above corollary when solving polynomials. If we factor a polynomial into the form (x − r_1)(x − r_2) · · · (x − r_k) = 0, then we know the polynomial has roots r_1, r_2, …, r_k. This is only true because the left-hand side of the factored expression only reaches zero when one of the factors is zero, i.e. when x = r_i.
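Matlab can move between the factored and expanded forms of a polynomial; here is a small sketch using the built-in poly and roots functions with the arbitrarily chosen roots 1 and 2:

    % Coefficients of (x - 1)(x - 2) = x^2 - 3x + 2
    p = poly([1 2])    % returns [1 -3 2]

    % Recover the roots; the expanded polynomial is zero exactly
    % where one of its factors is zero.
    r = roots(p)       % returns [2; 1]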
The advantage of fields is that once a set is proven to obey the five field axioms, we can operate on elements in the field just like we would operate on real numbers. Besides the real numbers (which the concept of fields was designed to emulate), what are some other fields? The rational numbers are a field. The numbers 0 and 1 are rational, so they are in the field. Since we add and multiply rational numbers just as we do real numbers, these operations commute, associate, and distribute. All that remains is to show that the rationals have additive and multiplicative inverses in the field. Let us consider a rational number p/q, where p and q are integers. Its additive inverse −p/q is again a ratio of integers, and (when p ≠ 0) so is its multiplicative inverse q/p, so both inverses remain in the field.
What happens when we try to define multiplication as an elementwise operation? For example,

[ 1 ]   [ 0 ]   [ 1 × 0 ]   [ 0 ]
[ 0 ] × [ 1 ] = [ 0 × 1 ] = [ 0 ]
This is bad. Very bad. Here we have an example where xy = 0, but neither x nor y is the zero element 0. This is a direct violation of a corollary of the field axioms, so elementwise vector multiplication is not a valid algebraic operation. Sadly, vectors are not a field. There is no way to define multiplication using only vectors that satisfies the field axioms. Nor is there anything close to a complete set of multiplicative inverses, or even the element 1. (On the bright side, if vectors were a field this class would be far too short.) Instead, we will settle for a weaker result – showing that vectors live in a normed inner product space. The concepts of a vector norm and inner product will let us create most of the operations and elements that vectors need to be a field.
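Matlab's elementwise product (the .* operator) makes the violation easy to reproduce; a quick sketch:

    x = [1; 0];
    y = [0; 1];
    x .* y    % returns [0; 0], although neither factor is the zero vector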
When you were first taught to multiply, it was probably introduced as a "faster" method of addition, i.e. 4 × 3 = 3 + 3 + 3 + 3. If so, why do we need multiplication as a separate requirement for fields? Couldn't we simply require the addition operator and construct multiplication from it? The answer is no, for two reasons. First, the idea of multiplication as a shortcut for addition only makes sense when discussing the non-negative integers. What, for example, does it mean to have −2.86 × 3? What do −2.86 groups look like in terms of addition? (Also, the integers are not a field!) Second, we must realize that multiplication is a much stronger relationship between numbers. To understand why, we should start talking about the "linear" part of linear algebra.
Linear systems have two special properties.

1. Proportionality: scaling the input scales the output by the same factor, f(kx) = k f(x).
2. Superposition: the response to a sum of inputs is the sum of the individual responses, f(x_1 + x_2) = f(x_1) + f(x_2).

We can combine both of these properties into a single condition for linearity.
Definition. A system f is linear if and only if

f(k_1 x_1 + k_2 x_2) = k_1 f(x_1) + k_2 f(x_2)

for all inputs x_1 and x_2 and scalars k_1 and k_2.
Consider a very simple function, f(x) = x + 3. Is this function linear? First we calculate the left-hand side of the definition of linearity.
f(k_1 x_1 + k_2 x_2) = k_1 x_1 + k_2 x_2 + 3
We compare this to the right-hand side.

k_1 f(x_1) + k_2 f(x_2) = k_1(x_1 + 3) + k_2(x_2 + 3)
                        = k_1 x_1 + k_2 x_2 + 3(k_1 + k_2)
                        ≠ f(k_1 x_1 + k_2 x_2)
This does not follow the definition of linearity. The function f(x) = x + 3 is not linear. Now let's look at a simple function involving multiplication: f(x) = 3x. Is this function linear?
f(k_1 x_1 + k_2 x_2) = 3(k_1 x_1 + k_2 x_2)
                     = k_1(3 x_1) + k_2(3 x_2)
                     = k_1 f(x_1) + k_2 f(x_2)
The function involving multiplication is linear. These results might not be what you expected, at least concerning the nonlinearity of functions of the form f(x) = x + b. This is probably because in earlier math courses you referred to equations of straight lines (y = mx + b) as linear equations. In fact, any equation of this form (with b ≠ 0) is called affine, not linear. Truly linear functions have the property that f(0) = 0. (This follows from proportionality: if f(k·0) = k f(0) for all k, then f(0) must equal zero.) Addition is, in a way, not "strong" enough to drive a function to zero. The expression x + y is zero only when both x and y are zero. By contrast, the product xy is zero when either x or y is zero.
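These checks are easy to run numerically; here is a Matlab sketch with arbitrarily chosen inputs and scalars:

    f = @(x) 3*x;     % multiplication: linear
    g = @(x) x + 3;   % addition of a constant: affine, not linear
    x1 = 2; x2 = -5; k1 = 4; k2 = 0.5;

    f(k1*x1 + k2*x2) - (k1*f(x1) + k2*f(x2))   % 0: f satisfies the definition
    g(k1*x1 + k2*x2) - (k1*g(x1) + k2*g(x2))   % -10.5: g violates it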
One of the nice properties of the real numbers is that they are well ordered. Being well ordered means that for any two real numbers, we can determine which number is larger (or if the two numbers are equal). Well orderedness allows us to make all sorts of comparisons between the real numbers. Vectors are not well ordered. Consider the vectors (3, 4) and (5, 2). Which one is larger? Each vector has one element that is larger than the other (4 in the first, 5 in the second). There is no unambiguous way to place all vectors in order.
The normalized unit vector (x̂) is

x̂ = [  3/‖x‖ ]   [  3/5 ]
    [ −4/‖x‖ ] = [ −4/5 ]

(We use the hat symbol ˆ over a unit vector to remind us that it has been normalized.)
Figure 1.2: Vectors separate into a length (norm) and direction (unit vector). The length and direction can be combined by scalar multiplication
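A minimal Matlab sketch of this decomposition, assuming the example vector above is x = (3, −4):

    x = [3; -4];
    len  = norm(x);    % length: 5
    xhat = x / len;    % direction: [0.6; -0.8]
    len * xhat         % recombining recovers [3; -4]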
We saw earlier that elementwise multiplication was a terrible idea. In fact, defining multiplication this way violates a corollary of the field axioms (xy = 0 implies that x = 0 or y = 0). However, elementwise multiplication does work in one case – scalar multiplication, or the product between a scalar (real number) and a vector:
kx = k [ x_1 ]   [ k x_1 ]
       [ x_2 ] = [ k x_2 ]
       [  ⋮  ]   [   ⋮   ]
       [ x_n ]   [ k x_n ]
where k is a scalar real number. Notice that scalar multiplication does not suffer from the same problem as elementwise vector multiplication. If kx = 0, then either the scalar k equals zero or the vector x must be the zero vector. What happens when you multiply a vector by a scalar? For one, the norm changes:

‖kx‖ = √( (k x_1)^2 + (k x_2)^2 + · · · + (k x_n)^2 )
     = √( k^2 (x_1^2 + x_2^2 + · · · + x_n^2) )
     = |k| ‖x‖

(Remember that √(k^2) = |k|, not k itself. We consider the square root to be the positive root.)
Scalar multiplication scales the length of a vector by the magnitude of the scalar. If the scalar is negative, the direction of the vector "reverses".
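A one-line Matlab check of ‖kx‖ = |k| ‖x‖, using an arbitrary vector and a negative scalar:

    x = [3; -4];
    k = -2;
    norm(k*x)           % 10
    abs(k) * norm(x)    % 10: scaling by k scales the length by |k|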
One way to think of the product of two vectors is to consider the product of their norms (magnitudes). Such operations are common in mechanics. Work, for example, is the product of force and displacement. However, simply multiplying the magnitude of the force vector and the magnitude of the displacement vector disregards the orientation of the vectors. We know from physics that only the component of the force aligned with the displacement should count. In general, we want an operation that multiplies the magnitude of one vector with the projection of a second vector onto the first. We call this operation the inner product or the dot product. Geometrically,
the dot product is a measure of both the product of the vectors' magnitudes and how well they are aligned. For vectors x and y the dot product is defined

x · y = ‖x‖ ‖y‖ cos θ
where θ is the angle between the vectors. Now we see why we use the symbol × for multiplication; the dot (·) is reserved for the dot product.
Figure 1.3: The projection of x onto y is a scalar equal to ‖x‖ cos θ.
If two vectors are perfectly aligned, θ = 0° and the dot product is simply the product of the magnitudes. If the two vectors point in exactly opposite directions, θ = 180° and the dot product is −1 times the product of the magnitudes. If the vectors are orthogonal, the angle between them is 90°, so cos θ = 0 and the dot product is zero. Thus, the dot product of two vectors is zero if and only if the vectors are orthogonal.
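Matlab's dot function lets us check all three cases directly; a sketch with arbitrary vectors:

    x = [2; 0];
    dot(x, [3; 0])    %  6: aligned (theta = 0), the product of the magnitudes
    dot(x, [-3; 0])   % -6: opposed (theta = 180)
    dot(x, [0; 5])    %  0: orthogonal (theta = 90)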
We know how to calculate norms, but how do we calculate the angle between two n-dimensional vectors? The answer is that we don't need to. There is an easier way to calculate x · y than the formula ‖x‖ ‖y‖ cos θ. First, we need to define a special set of vectors – the unit vectors ê_i. These are vectors that have only a single nonzero entry, a 1 at element i. For example,
ê_1 = [ 1 ]    ê_2 = [ 0 ]          ê_n = [ 0 ]
      [ 0 ]          [ 1 ]                [ 0 ]
      [ ⋮ ] ,        [ ⋮ ] ,  … ,         [ ⋮ ]
      [ 0 ]          [ 0 ]                [ 1 ]
Every vector can be written as a sum of scalar products with unit vectors. For example,

[ −3 ]
[  6 ] = −3 ê_1 + 6 ê_2 + 2 ê_3
[  2 ]
In general,

x = x_1 ê_1 + x_2 ê_2 + · · · + x_n ê_n = ∑_{i=1}^{n} x_i ê_i
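In Matlab the unit vectors are the columns of the identity matrix, so this decomposition can be checked directly; a sketch using the example above:

    E = eye(3);                             % columns of the identity are e1, e2, e3
    x = -3*E(:,1) + 6*E(:,2) + 2*E(:,3)     % returns [-3; 6; 2]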
Let's take stock of the operations we've defined so far: vector addition, scalar multiplication, the norm, and the dot product. All of these operations appeared consistent with the field axioms. Unfortunately, we still do not have a true multiplication operation for vectors. We do, however, have enough machinery to describe a linear system that constructs an n-dimensional vector y from an n-dimensional vector x using the n equations
y_1 = a_{11} x_1 + a_{12} x_2 + · · · + a_{1n} x_n
y_2 = a_{21} x_1 + a_{22} x_2 + · · · + a_{2n} x_n
  ⋮
y_n = a_{n1} x_1 + a_{n2} x_2 + · · · + a_{nn} x_n
where the scalars a_{ij} determine the relative weight of x_j when constructing y_i. There are n^2 scalars required to unambiguously map x to y. For convenience, we collect the set of weights into an n by n numeric grid called a matrix. If A is a real-valued matrix with dimensions m × n, we say A ∈ R^{m×n} and dim(A) = m × n.
A = [ a_{11} a_{12} · · · a_{1n} ]
    [ a_{21} a_{22} · · · a_{2n} ]
    [   ⋮      ⋮             ⋮  ]
    [ a_{n1} a_{n2} · · · a_{nn} ]
What we have been calling "vectors" all along are really just matrices with only one column. Thinking of vectors as matrices lets us write a simple, yet powerful, definition of multiplication.
Definition. The product of matrices AB is a matrix C where each element c_{ij} in C is the dot product between the ith row in A and the jth column in B:

c_{ij} = A(i, :) · B(:, j)
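This is exactly what Matlab's * operator computes. A small sketch verifying one element of the product against the dot-product definition (the matrix values are arbitrary):

    A = [1 2; 3 4];
    B = [5 6; 7 8];
    C = A * B              % [19 22; 43 50]

    % Element (2,1) is the dot product of row 2 of A with column 1 of B.
    C(2,1)                 % 43
    dot(A(2,:), B(:,1))    % 43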
Using this definition of matrix multiplication, the previous system of n equations becomes the matrix equation
[ y_1 ]   [ a_{11} · · · a_{1n} ] [ x_1 ]
[  ⋮  ] = [   ⋮              ⋮  ] [  ⋮  ]
[ y_n ]   [ a_{n1} · · · a_{nn} ] [ x_n ]
or, more succinctly,

y = Ax
In the previous example, both x and y were n-dimensional. This does not need to be the case. In general, the vector y could have m ≠ n dimensions. The matrix A would have m rows, each used to construct an element y_i in y. However, the matrix A would still need n columns to match the n rows in x. (Each row in A is "dotted" with the n-dimensional vector x, and dot products require the two vectors have the same dimension.) Any matrices A and B are conformable for multiplication if the number of columns in A matches the number of rows in B. If the dimensions of A are m × n and the dimensions of B are n × p, then the product will be a matrix of dimensions m × p. For the system y = Ax, if dim(A) = m × n and dim(x) = n × 1, then dim(y) = m × 1, i.e. y is a column vector in R^m. (Matlab returns an error that "matrix dimensions must agree" when multiplying non-conformable objects.)
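A sketch of these dimension rules in Matlab (size reports the dimensions of its argument; the commented line shows a non-conformable product):

    A = ones(2, 3);    % dim(A) = 2 x 3
    B = ones(3, 4);    % dim(B) = 3 x 4
    size(A * B)        % [2 4]: (2 x 3)(3 x 4) -> 2 x 4

    x = ones(3, 1);    % a column vector in R^3
    size(A * x)        % [2 1]: y = Ax lives in R^2

    % B * A would raise an error: the inner dimensions (4 and 2) do not agree.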
Matrix multiplication is associative [ABC = (AB)C = A(BC)] and distributive over addition [A(B + C) = AB + AC], provided A, B, and C are all conformable. However, it is not commutative. To see why, consider A ∈ R^{m×n} and B ∈ R^{n×p}. The product AB is an m × p matrix, but the product BA is not conformable since p ≠ m. Even when BA is conformable, it is generally not the same as the product AB.
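Even for square matrices, where both products are conformable, the two orders generally disagree; a quick Matlab check with arbitrary matrices:

    A = [1 2; 3 4];
    B = [0 1; 1 0];
    A * B    % [2 1; 4 3] -- B on the right swaps the columns of A
    B * A    % [3 4; 1 2] -- B on the left swaps the rows of A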