
LINEAR ALGEBRA

An Introduction to Data Science

Paul A. Jensen

University of Illinois at Urbana-Champaign

  • Introduction
    • 0.1 Notation
  • 1 Fields and Vectors
    • 1.1 Algebra
    • 1.2 The Field Axioms
      • 1.2.1 Common Fields in Mathematics
    • 1.3 Vector Addition
    • 1.4 Vector Multiplication is not Elementwise
      • 1.4.1 Do We Need Multiplication?
    • 1.5 Linear Systems
    • 1.6 Vector Norms
      • 1.6.1 Normalized (Unit) Vectors
    • 1.7 Scalar Vector Multiplication
    • 1.8 Inner (Dot) Products
      • 1.8.1 Computing the Dot Product
      • 1.8.2 Dot Product Summary
  • 2 Matrices
    • 2.1 Matrix Multiplication
      • 2.1.1 Generalized Multiplication
    • 2.2 Identity Matrix
    • 2.3 Matrix Transpose
    • 2.4 Solving Linear Systems
    • 2.5 Gaussian Elimination
    • 2.6 Computational Complexity of Gaussian Elimination
    • 2.7 Solving Linear Systems in Matlab
  • 3 The Finite Difference Method
    • 3.1 Finite Differences
    • 3.2 Linear Differential Equations
    • 3.3 Discretizing a Linear Differential Equation
    • 3.4 Boundary Conditions
  • 4 Inverses, Solvability, and Rank
    • 4.1 Matrix Inverses
    • 4.2 Elementary Matrices
    • 4.3 Proof of Existence for the Matrix Inverse
    • 4.4 Computing the Matrix Inverse
    • 4.5 Numerical Issues
    • 4.6 Inverses of Elementary Matrices
    • 4.7 Rank
    • 4.8 Rank and Matrix Inverses
    • 4.9 Summary

Introduction

This class has three parts. In Part I, we analyze linear systems that transform an n-dimensional vector (x) into another n-dimensional vector (y). This transformation is often expressed as a linear system via matrix multiplication: y = Ax. In Part II, we expand the type of systems we can solve, including systems that transform vectors from n dimensions to m dimensions. We will also consider solution strategies that use alternative objectives when the system contains either too much or not enough information. Finally, Part III dispenses with linear systems altogether, focusing purely on observations of sets of n-dimensional vectors (matrices). We will learn how to analyze and extract information from matrices without a clear input/output relationship.

0.1 Notation

We will distinguish scalars, vectors, and matrices with the following typographic conventions:

Object     Font and symbol                  Examples
Scalars    italicized, lowercase letters    x, α, y
Vectors    bold, lowercase letters          x, y, n, w
Matrices   bold, uppercase letters          A, A⁻¹, B, Γ

There are many ways to represent rows or columns of a matrix A. Since this course uses Matlab, we think it is convenient to use a matrix addressing scheme that reflects Matlab's syntax. So, the ith row of matrix A will be A(i, :), and the jth column will be A(:, j). Rows or columns of a matrix are themselves vectors, so we choose to keep the boldface font for the matrix A even when it is subscripted. We could also use Matlab syntax for vectors (x(i), for example). However, the form xᵢ is standard across many fields of mathematics and engineering, so we retain the common notation. The lack of boldface font reminds us that elements of vectors are scalars in the field.
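As a quick illustration of this addressing scheme (a minimal sketch; the matrix entries are arbitrary and not from the text):

    % An arbitrary 3-by-3 matrix, used only to illustrate Matlab addressing.
    A = [1 2 3; 4 5 6; 7 8 9];

    row2 = A(2, :)   % the 2nd row of A: [4 5 6]
    col3 = A(:, 3)   % the 3rd column of A: [3; 6; 9]
    a23  = A(2, 3)   % a single element, a scalar: 6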

These symbols can be used to succinctly write mathematical statements. For example, we can formally define the set of rational numbers as

A number is rational if and only if it can be expressed as the quotient of two integers.

with the statement

r ∈ ℚ ⇔ ∃ p, q ∈ ℤ s.t. r = p/q

While the latter statement is shorter, it is more difficult to understand. So whenever possible we recommend writing statements with as few symbols as necessary. Rely on mathematical symbols only when a textual definition would be unwieldy or imprecise, or when brevity is important (like when writing on a chalkboard).


  1. Associativity.

a + b + c = (a + b) + c = a + (b + c)

abc = (ab)c = a(bc)

  2. Commutativity.

a + b = b + a

ab = ba

  3. Distribution of multiplication over addition.

a(b + c) = ab + ac

  4. Identity. There exist elements 0 and 1, both in the field, such that

a + 0 = a

1 × a = a

  5. Inverses.
    • For all a, there exists an element (−a) in the field such that a + (−a) = 0.
    • For all a ≠ 0, there exists an element (a⁻¹) in the field such that a × a⁻¹ = 1.

It might surprise you that only five axioms are sufficient to recreate everything you know about algebra. For example, nowhere do we state the special property of zero that a × 0 = 0 for any number a. We don’t need to state this property, as it follows from the field axioms:

Theorem. a × 0 = 0

Proof.

a × 0 = a × (1 − 1)
      = a × 1 + a × (−1)
      = a − a
      = 0

Similarly, we can prove corollaries from the field axioms.

Corollary. If ab = 0, then either a = 0 or b = 0 (or both).


Proof. Suppose a ≠ 0. Then there exists a⁻¹ such that

a⁻¹ab = a⁻¹ × 0
1 × b = 0
b = 0

A similar argument follows when b ≠ 0.

The fundamental theorem of algebra relies on the above corollary when solving polynomials. If we factor a polynomial into the form (x − r₁)(x − r₂) · · · (x − rₖ) = 0, then we know the polynomial has roots r₁, r₂, . . . , rₖ. This is only true because the left-hand side of the factored expression only reaches zero when one of the factors is zero, i.e. when x = rᵢ.
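We can see this root/factor correspondence in Matlab (a sketch; poly and roots are built-in functions that convert between a polynomial's roots and its coefficients, and the roots chosen here are arbitrary):

    r = [1 2 5];    % arbitrary roots r1, r2, r3
    p = poly(r);    % coefficients of (x - 1)(x - 2)(x - 5)
    roots(p)        % returns 5, 2, 1: the x values where some factor is zero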

1.2.1 Common Fields in Mathematics

The advantage of fields is that once a set is proven to obey the five field axioms, we can operate on elements in the field just like we would operate on real numbers. Besides the real numbers (which the concept of fields was designed to emulate), what are some other fields?

The rational numbers are a field. The numbers 0 and 1 are rational, so they are in the field. Since we add and multiply rational numbers just as we do real numbers, these operations commute, associate, and distribute. All that remains is to show that the rationals have additive and multiplicative inverses in the field. Let us consider a rational number p/q, where p and q are integers.

  • We know that −p/q is also rational, since −p is still an integer. The additive inverse of a rational number is in the field of rational numbers.
  • The multiplicative inverse of p/q is q/p, which is also rational (provided p ≠ 0). The multiplicative inverse of a rational is also in the field.

So the rational numbers are a field. What does this mean? If we are given an algebraic expression, we can solve it by performing any algebraic manipulation and still be assured that the answer will be another rational number.

The integers, by contrast, are not a field. Every integer has a reciprocal (2 → 1/2, −100 → −1/100, etc.). However, the reciprocals are themselves not integers, so they are not in the same field. The field axioms require that the inverses of every element are members of the field.

When constructing a field, every part of every axiom must be satisfied. Let's see an example of this. Imagine the simple equation y = ax + b, which we solve for x to yield

x = (y − b)/a


1.4 Vector Multiplication is not Elementwise

What happens when we try to define multiplication as an elementwise operation? For example,

(−1, 0, 4) × (0, 2, 0) = (−1 × 0, 0 × 2, 4 × 0) = (0, 0, 0)
This is bad. Very bad. Here we have an example where xy = 0, but neither x nor y is the zero element 0. This is a direct violation of a corollary of the field axioms, so elementwise vector multiplication is not a valid algebraic operation. Sadly, vectors are not a field. There is no way to define multiplication using only vectors that satisfies the field axioms. (On the bright side, if vectors were a field this class would be far too short.) Nor is there anything close to a complete set of multiplicative inverses, or even the element 1. Instead, we will settle for a weaker result – showing that vectors live in a normed inner product space. The concepts of a vector norm and inner product will let us create most of the operations and elements that vectors need to be a field.
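Matlab makes the problem easy to see (a sketch using the vectors above; .* is Matlab's elementwise product, which is exactly the operation being ruled out):

    x = [-1; 0; 4];
    y = [ 0; 2; 0];
    x .* y            % returns [0; 0; 0] even though neither x nor y is zero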

1.4.1 Do We Need Multiplication?

When you were first taught to multiply, it was probably introduced as a "faster" method of addition, i.e. 4 × 3 = 3 + 3 + 3 + 3. If so, why do we need multiplication as a separate requirement for fields? Couldn't we simply require the addition operator and construct multiplication from it? The answer is no, for two reasons. First, the idea of multiplication as a shortcut for addition only makes sense when discussing the non-negative integers. What, for example, does it mean to have −2.86 × 3? What do −2.86 groups look like in terms of addition? (Also, the integers are not a field!) Second, we must realize that multiplication is a much stronger relationship between numbers. To understand why, we should start talking about the "linear" part of linear algebra.

1.5 Linear Systems

Linear systems have two special properties.

  1. Proportionality. If the input to a linear system is multiplied by a scalar, the output is multiplied by the same scalar: f(kx) = kf(x).
  2. Additivity. If two inputs are added, the result is the sum of the original outputs: f(x₁ + x₂) = f(x₁) + f(x₂).

We can combine both of these properties into a single condition for linearity.


Definition. A system f is linear if and only if

f(k₁x₁ + k₂x₂) = k₁f(x₁) + k₂f(x₂)

for all inputs x₁ and x₂ and scalars k₁ and k₂.

Consider a very simple function, f(x) = x + 3. Is this function linear? First we calculate the left-hand side of the definition of linearity.

f(k₁x₁ + k₂x₂) = k₁x₁ + k₂x₂ + 3

We compare this to the right-hand side.

k₁f(x₁) + k₂f(x₂) = k₁(x₁ + 3) + k₂(x₂ + 3)
                  = k₁x₁ + k₂x₂ + 3(k₁ + k₂)
                  ≠ f(k₁x₁ + k₂x₂)

This does not follow the definition of linearity. The function f(x) = x + 3 is not linear. Now let's look at a simple function involving multiplication: f(x) = 3x. Is this function linear?

f(k₁x₁ + k₂x₂) = 3(k₁x₁ + k₂x₂)
               = k₁(3x₁) + k₂(3x₂)
               = k₁f(x₁) + k₂f(x₂)

The function involving multiplication is linear. These results might not be what you expected, at least concerning the nonlinearity of functions of the form f(x) = x + b. This is probably because in earlier math courses you referred to equations of straight lines (y = mx + b) as linear equations. In fact, any equation of this form (with b ≠ 0) is called affine, not linear. Truly linear functions have the property that f(0) = 0. (This follows from proportionality: if f(k0) = kf(0) for all k, then f(0) must equal zero.) Addition is, in a way, not "strong" enough to drive a function to zero. The expression x + y is zero only when both x and y are zero. By contrast, the product xy is zero when either x or y is zero.
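We can spot-check the linearity condition numerically (a sketch; the inputs and scalars are arbitrary, and a single passing test suggests, but does not prove, linearity):

    f = @(x) x + 3;   % affine, not linear
    g = @(x) 3*x;     % linear
    k1 = 2; k2 = -5; x1 = 1.5; x2 = 4;    % arbitrary test values

    f(k1*x1 + k2*x2) - (k1*f(x1) + k2*f(x2))   % nonzero: f fails the condition
    g(k1*x1 + k2*x2) - (k1*g(x1) + k2*g(x2))   % zero: g satisfies the condition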

1.6 Vector Norms

One of the nice properties of the real numbers is that they are well ordered. Being well ordered means that for any two real numbers, we can determine which number is larger (or if the two numbers are equal). Well-orderedness allows us to make all sorts of comparisons between the real numbers. Vectors are not well ordered. Consider the vectors (3, 4) and (5, 2). Which one is larger? Each vector has one element that is larger than the other (4 in the first, 5 in the second). There is no unambiguous way to place all vectors in order.


The normalized unit vector x̂ is

x̂ = (3/‖x‖, −4/‖x‖)

We use the hat symbol (ˆ) over a unit vector to remind us that it has been normalized.
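In Matlab, normalization is a single division (a sketch for the vector x = (3, −4) used above):

    x = [3; -4];
    xhat = x / norm(x)   % the unit vector [0.6; -0.8]
    norm(xhat)           % equals 1, as it must for a unit vector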

1.7 Scalar Vector Multiplication

Figure 1.2: Vectors separate into a length (norm) and direction (unit vector). The length and direction can be combined by scalar multiplication.

We saw earlier that elementwise multiplication was a terrible idea. In fact, defining multiplication this way violates a corollary of the field axioms (xy = 0 implies that x = 0 or y = 0). However, elementwise multiplication does work in one case – scalar multiplication, or the product between a scalar (real number) and a vector:

kx = k ⎡ x₁ ⎤ = ⎡ kx₁ ⎤
       ⎢ x₂ ⎥   ⎢ kx₂ ⎥
       ⎢ ⋮  ⎥   ⎢  ⋮  ⎥
       ⎣ xₙ ⎦   ⎣ kxₙ ⎦

where k is a scalar real number. Notice that scalar multiplication does not suffer from the same problem as elementwise vector multiplication. If kx = 0, then either the scalar k equals zero or the vector x must be the zero vector. What happens when you multiply a vector by a scalar? For one, the norm changes:

‖kx‖ = √((kx₁)² + (kx₂)² + · · · + (kxₙ)²) = √(k²(x₁² + x₂² + · · · + xₙ²)) = |k| ‖x‖

(Remember that √k² = |k|, not k itself. We consider the square root to be the positive root.)

Scalar multiplication scales the length of a vector by the scalar. If the scalar is negative, the direction of the vector “reverses”.
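This norm property is easy to verify numerically (a sketch; the vector and the negative scalar are arbitrary, chosen to show the |k| behavior):

    k = -3;
    x = [1; 2; 2];      % an arbitrary vector with norm 3
    norm(k*x)           % returns 9
    abs(k) * norm(x)    % also 9, confirming ‖kx‖ = |k|‖x‖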

1.8 Inner (Dot) Products

One way to think of the product of two vectors is to consider the product of their norms (magnitudes). Such operations are common in mechanics. Work, for example, is the product of force and displacement. However, simply multiplying the magnitude of the force vector and the magnitude of the displacement vector disregards the orientation of the vectors. We know from physics that only the component of the force aligned with the displacement should count. In general, we want an operation that multiplies the magnitude of one vector with the projection of a second vector onto the first. We call this operation the inner product or the dot product. Geometrically,


the dot product is a measure of both the product of the vectors' magnitudes and how well they are aligned. For vectors x and y the dot product is defined

x · y = ‖x‖ ‖y‖ cos θ

where θ is the angle between the vectors. Now we see why we use the symbol × for multiplication; the dot (·) is reserved for the dot product.

Figure 1.3: The projection of x onto y is a scalar equal to ‖x‖ cos θ.

If two vectors are perfectly aligned, θ = 0° and the dot product is simply the product of the magnitudes. If the two vectors point in exactly opposite directions, θ = 180° and the dot product is −1 times the product of the magnitudes. If the vectors are orthogonal, the angle between them is 90°, so cos θ = 0 and the dot product is zero. Thus, the dot product of two vectors is zero if and only if the vectors are orthogonal.
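Matlab's built-in dot function lets us recover the angle from this definition (a sketch reusing the vectors (3, 4) and (5, 2) from the ordering discussion; acosd returns the angle in degrees):

    x = [3; 4];
    y = [5; 2];
    costheta = dot(x, y) / (norm(x) * norm(y));
    theta = acosd(costheta)   % roughly 31.3 degrees between x and y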

1.8.1 Computing the Dot Product

We know how to calculate norms, but how do we calculate the angle between two n-dimensional vectors? The answer is that we don't need to. There is an easier way to calculate x · y than the formula ‖x‖ ‖y‖ cos θ. First, we need to define a special set of vectors – the unit vectors êᵢ. These are vectors that have only a single nonzero entry, a 1 at element i. For example,

ê₁ = (1, 0, . . . , 0), ê₂ = (0, 1, . . . , 0), . . . , êₙ = (0, 0, . . . , 1)

Every vector can be written as a sum of scalar products with unit vectors. For example,

(−3, 6, 2) = −3ê₁ + 6ê₂ + 2ê₃

In general,

x = x₁ê₁ + x₂ê₂ + · · · + xₙêₙ = ∑ᵢ₌₁ⁿ xᵢêᵢ
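The decomposition is easy to check in Matlab (a sketch using the example vector (−3, 6, 2) from above; eye(3) holds the unit vectors ê₁, ê₂, ê₃ as its columns):

    E = eye(3);                             % columns are e1, e2, e3
    x = -3*E(:,1) + 6*E(:,2) + 2*E(:,3)     % rebuilds the vector [-3; 6; 2]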

2 Matrices

2.1 Matrix Multiplication

Let’s take stock of the operations we’ve defined so far.

  • The norm (magnitude) maps a vector to a scalar (ℝⁿ ↦ ℝ).
  • The scalar product maps a scalar and a vector to a new vector (ℝ × ℝⁿ ↦ ℝⁿ), but can only scale the magnitude of the vector (or flip it if the scalar is negative).
  • The dot product maps two vectors to a scalar (ℝⁿ × ℝⁿ ↦ ℝ) by projecting one onto the other and multiplying the resulting magnitudes.

All of these operations appeared consistent with the field axioms. Unfortunately, we still do not have a true multiplication operation – one that can transform any vector into any other vector. Can we construct such an operation using only the above methods?

Let's construct a new vector y from vector x. To be as general as possible, we should let each element in y be an arbitrary linear combination of the elements in x. This implies that

y₁ = a₁₁x₁ + a₁₂x₂ + · · · + a₁ₙxₙ
y₂ = a₂₁x₁ + a₂₂x₂ + · · · + a₂ₙxₙ
 ⋮
yₙ = aₙ₁x₁ + aₙ₂x₂ + · · · + aₙₙxₙ

where the scalars aᵢⱼ determine the relative weight of xⱼ when constructing yᵢ. There are n² scalars required to unambiguously map x to y. For convenience, we collect the set of weights into an n-by-n numeric grid called a matrix. If A is a real-valued matrix with dimensions m × n, we say A ∈ ℝᵐˣⁿ and dim(A) = m × n.


A = ⎡ a₁₁ a₁₂ · · · a₁ₙ ⎤
    ⎢ a₂₁ a₂₂ · · · a₂ₙ ⎥
    ⎢  ⋮   ⋮         ⋮ ⎥
    ⎣ aₙ₁ aₙ₂ · · · aₙₙ ⎦

What we have been calling "vectors" all along are really just matrices with only one column. Thinking of vectors as matrices lets us write a simple, yet powerful, definition of multiplication.

Definition. The product of matrices AB is a matrix C where each element cᵢⱼ in C is the dot product between the ith row in A and the jth column in B:

cᵢⱼ = A(i, :) · B(:, j)
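The definition translates directly into Matlab (a sketch with arbitrary 2-by-2 matrices, not how Matlab implements * internally; the loop builds each cᵢⱼ as a row-column dot product and checks the result against the built-in product):

    A = [1 2; 3 4];
    B = [5 6; 7 8];
    C = zeros(2, 2);
    for i = 1:2
        for j = 1:2
            C(i, j) = dot(A(i, :), B(:, j));   % c_ij = A(i,:) . B(:,j)
        end
    end
    isequal(C, A*B)    % true: the loop matches Matlab's * operator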

Using this definition of matrix multiplication, the previous system of n equations becomes the matrix equation

⎡ y₁ ⎤   ⎡ a₁₁ · · · a₁ₙ ⎤ ⎡ x₁ ⎤
⎢ ⋮  ⎥ = ⎢  ⋮         ⋮ ⎥ ⎢ ⋮  ⎥
⎣ yₙ ⎦   ⎣ aₙ₁ · · · aₙₙ ⎦ ⎣ xₙ ⎦

or, more succinctly,

y = Ax

2.1.1 Generalized Multiplication

In the previous example, both x and y were n-dimensional. This does not need to be the case. In general, the vector y could have m ≠ n dimensions. The matrix A would have m rows, each used to construct an element yᵢ in y. However, the matrix A would still need n columns to match the n rows in x. (Each row in A is "dotted" with the n-dimensional vector x, and dot products require the two vectors have the same dimension.)

Any matrices A and B are conformable for multiplication if the number of columns in A matches the number of rows in B. (Matlab returns an error that "matrix dimensions must agree" when multiplying non-conformable objects.) If the dimensions of A are m × n and the dimensions of B are n × p, then the product will be a matrix of dimensions m × p. For the system y = Ax, if dim(A) = m × n and dim(x) = n × 1, then dim(y) = m × 1, i.e. y is a column vector in ℝᵐ.

Matrix multiplication is associative [ABC = (AB)C = A(BC)] and distributive over addition [A(B + C) = AB + AC], provided A, B, and C are all conformable. However, it is not commutative. To see why, consider A ∈ ℝᵐˣⁿ and B ∈ ℝⁿˣᵖ. The product AB is an m × p matrix, but the product BA is not conformable when p ≠ m. Even if BA were conformable, it is generally not the same as the product AB.
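A short Matlab sketch of conformability (the dimensions are arbitrary): AB exists, but BA trips the dimension error quoted above:

    A = rand(2, 3);    % m = 2, n = 3
    B = rand(3, 4);    % n = 3, p = 4
    size(A * B)        % [2 4]: an m-by-p result
    % B * A            % error: matrix dimensions must agree (p = 4, m = 2)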
