G00TE1204: Convex Optimization and Its Applications in Signal Processing
Instructor: Anthony Man–Cho So
Updated: May 10, 2015
The purpose of this handout is to give a brief review of some of the basic concepts and results in linear algebra. If you are not familiar with the material and/or would like to do some further reading, you may consult, e.g., the books [1, 2, 3].
We denote the set of real numbers (also referred to as scalars) by R. For positive integers m, n ≥ 1, we use R^{m×n} to denote the set of m × n arrays whose components are from R. In other words, R^{m×n} is the set of m × n real matrices, and an element A ∈ R^{m×n} can be written as

A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix},    (1)
where a_{ij} ∈ R for i = 1, ..., m and j = 1, ..., n. A row vector is a matrix with m = 1, and a column vector is a matrix with n = 1. The word vector will always mean a column vector unless otherwise stated. The set of all n–dimensional real vectors is denoted by R^n, and an element x ∈ R^n can be written as x = (x_1, ..., x_n). Note that we still view x = (x_1, ..., x_n) as a column vector, even though typographically it does not appear so. The reason for such a notation is simply to save space. Now, given an m × n matrix A of the form (1), its transpose A^T is defined as the following n × m matrix:

A^T = \begin{bmatrix} a_{11} & a_{21} & \cdots & a_{m1} \\ a_{12} & a_{22} & \cdots & a_{m2} \\ \vdots & \vdots & & \vdots \\ a_{1n} & a_{2n} & \cdots & a_{mn} \end{bmatrix}.
An m × m real matrix A is said to be symmetric if A = A^T. The set of m × m real symmetric matrices is denoted by S^m. We use x ≥ 0 to indicate that all the components of x are non–negative, and x ≥ y to mean that x − y ≥ 0. The notations x > 0, x ≤ 0, x < 0, x > y, x ≤ y, and x < y are to be interpreted accordingly. We say that a finite collection C = {x^1, x^2, ..., x^m} of vectors in R^n is linearly dependent if there exist scalars α_1, ..., α_m ∈ R, not all zero, such that \sum_{i=1}^m α_i x^i = 0. Similarly, a finite collection C′ = {x^0, x^1, ..., x^m} of vectors in R^n is affinely dependent if there exist scalars α_0, α_1, ..., α_m ∈ R, not all zero, such that \sum_{i=0}^m α_i x^i = 0 and \sum_{i=0}^m α_i = 0. The collection C (resp. C′) is said to be linearly independent (resp. affinely independent) if it is not linearly dependent (resp. affinely dependent).
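As a quick numerical illustration (added here; the example vectors and the use of NumPy are assumptions, not part of the original handout), linear independence can be checked by stacking the vectors as columns of a matrix and computing its rank: the collection is linearly independent exactly when the rank equals the number of vectors.

import numpy as np

# Three vectors in R^3 (hypothetical example data).
x1 = np.array([1.0, 0.0, 2.0])
x2 = np.array([0.0, 1.0, 1.0])
x3 = np.array([2.0, 1.0, 5.0])   # x3 = 2*x1 + x2, so the collection is linearly dependent

M = np.column_stack([x1, x2, x3])
print(np.linalg.matrix_rank(M) == 3)   # False: the rank is 2

# Affine (in)dependence of {x0, x1, ..., xm} can be checked in the same way by
# testing linear (in)dependence of the differences {x1 - x0, ..., xm - x0}.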
Given two vectors x, y ∈ R^n, their inner product is defined as

x^T y ≡ \sum_{i=1}^n x_i y_i.
We say that x and y are orthogonal if x^T y = 0. The Euclidean norm of x ∈ R^n is defined as

‖x‖_2 ≡ \sqrt{x^T x} = \left( \sum_{i=1}^n |x_i|^2 \right)^{1/2}.

A fundamental inequality that relates the inner product of two vectors and their respective Euclidean norms is the Cauchy–Schwarz inequality:

|x^T y| ≤ ‖x‖_2 · ‖y‖_2.
Equality holds iff the vectors x and y are linearly dependent; i.e., x = αy for some α ∈ R. Note that the Euclidean norm is not the only norm one can define on R^n. In general, a function ‖ · ‖ : R^n → R is called a vector norm on R^n if for all x, y ∈ R^n, we have
(a) (Non–Negativity) ‖x‖ ≥ 0;
(b) (Positivity) ‖x‖ = 0 iff x = 0 ;
(c) (Homogeneity) ‖αx‖ = |α| · ‖x‖ for all α ∈ R;
(d) (Triangle Inequality) ‖x + y‖ ≤ ‖x‖ + ‖y‖.
For instance, for p ≥ 1, the ℓ_p–norm on R^n, which is given by

‖x‖_p = \left( \sum_{i=1}^n |x_i|^p \right)^{1/p},

is a vector norm on R^n. It is well known that

‖x‖_∞ = \lim_{p→∞} ‖x‖_p = \max_{1≤i≤n} |x_i|.
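To make these definitions concrete, the following short sketch (added for illustration; the example vectors and the use of NumPy are assumptions) computes the ℓ_1, ℓ_2, and ℓ_∞ norms of a vector and numerically checks the Cauchy–Schwarz inequality.

import numpy as np

x = np.array([3.0, -4.0, 1.0])
y = np.array([1.0, 2.0, -2.0])

# l_p norms for p = 1, 2 and the limiting case p = infinity.
print(np.linalg.norm(x, 1))        # |3| + |-4| + |1| = 8
print(np.linalg.norm(x, 2))        # sqrt(9 + 16 + 1) = sqrt(26)
print(np.linalg.norm(x, np.inf))   # max(|3|, |-4|, |1|) = 4

# Cauchy-Schwarz: |x^T y| <= ||x||_2 * ||y||_2.
lhs = abs(x @ y)
rhs = np.linalg.norm(x) * np.linalg.norm(y)
print(lhs <= rhs)                  # True (7 <= 3 * sqrt(26))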
We say that a function ‖ · ‖ : R^{n×n} → R is a matrix norm on the set of n × n matrices if for any A, B ∈ R^{n×n}, we have

(a) (Non–Negativity) ‖A‖ ≥ 0;

(b) (Positivity) ‖A‖ = 0 iff A = 0;

(c) (Homogeneity) ‖αA‖ = |α| · ‖A‖ for all α ∈ R;

(d) (Triangle Inequality) ‖A + B‖ ≤ ‖A‖ + ‖B‖;

(e) (Submultiplicativity) ‖AB‖ ≤ ‖A‖ · ‖B‖.
Moreover, we have rank(A) ≤ min{m, n}, and if equality holds, then we say that A has full rank. The nullspace of A is the set null(A) ≡ {x ∈ R^n : Ax = 0}. It is a subspace of R^n and has dimension n − rank(A). The following summarizes the relationships among the subspaces range(A), range(A^T), null(A), and null(A^T):

(range(A))^⊥ = null(A^T),   (range(A^T))^⊥ = null(A).
The above implies that given an m × n real matrix A of rank r ≤ min{m, n}, we have rank(AA^T) = rank(A^T A) = r. This fact will be frequently used in the course.
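The rank identity and the orthogonality relations above are easy to verify numerically. The sketch below is an added illustration (the randomly generated matrix and the use of NumPy are assumptions, not part of the original text).

import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 2)) @ rng.standard_normal((2, 6))   # a 4 x 6 matrix of rank 2

r = np.linalg.matrix_rank(A)
print(r, np.linalg.matrix_rank(A @ A.T), np.linalg.matrix_rank(A.T @ A))   # 2 2 2

# Orthogonality relations: null(A^T) = (range(A))^perp and null(A) = (range(A^T))^perp.
U, s, Vt = np.linalg.svd(A)
N = Vt[r:].T    # columns form an orthonormal basis of null(A)
M = U[:, r:]    # columns form an orthonormal basis of null(A^T)
print(np.allclose(A @ N, 0), np.allclose(A.T @ M, 0))   # True True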
Let S_0 be a subspace of R^n and x^0 ∈ R^n be an arbitrary vector. Then, the set S = x^0 + S_0 ≡ {x + x^0 : x ∈ S_0} is called an affine subspace of R^n, and its dimension is equal to the dimension of the underlying subspace S_0. Now, let C = {x^1, ..., x^m} be a finite collection of vectors in R^n, and let x^0 ∈ R^n be arbitrary. By definition, the set S = x^0 + span(C) is an affine subspace of R^n, and every y ∈ S can be written as

y = \sum_{i=1}^m \left[ α_i (x^0 + x^i) + β_i (x^0 − x^i) \right]

for some α_1, ..., α_m, β_1, ..., β_m ∈ R such that \sum_{i=1}^m (α_i + β_i) = 1; i.e., the vector y ∈ R^n is an affine combination of the vectors x^0 ± x^1, ..., x^0 ± x^m ∈ R^n. Conversely, let C = {x^1, ..., x^m} be a finite collection of vectors in R^n, and define
S ≡ \left\{ \sum_{i=1}^m α_i x^i : α_1, ..., α_m ∈ R, \; \sum_{i=1}^m α_i = 1 \right\}

to be the set of affine combinations of the vectors in C. We claim that S is an affine subspace of R^n. Indeed, it can be readily verified that

S = x^1 + span\{x^2 − x^1, ..., x^m − x^1\}.
This establishes the claim. Given an arbitrary (i.e., not necessarily finite) collection C of vectors in Rn, the affine hull of C, denoted by aff(C), is the set of all finite affine combinations of the vectors in C. Equivalently, we can define aff(C) as the intersection of all affine subspaces containing C.
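The identity above can also be checked numerically. The sketch below is an added illustration (random vectors and NumPy are assumptions): it forms an affine combination of x^1, ..., x^m and confirms that it lies in x^1 + span{x^2 − x^1, ..., x^m − x^1}.

import numpy as np

rng = np.random.default_rng(1)
m, n = 4, 5
X = rng.standard_normal((n, m))          # columns are x^1, ..., x^m in R^5

alpha = rng.random(m)
alpha /= alpha.sum()                     # coefficients summing to 1: an affine combination
y = X @ alpha

# Spanning set of the underlying subspace: the differences x^i - x^1.
D = X[:, 1:] - X[:, [0]]

# y is in the affine hull iff y - x^1 lies in span{x^2 - x^1, ..., x^m - x^1},
# which we test by solving a least-squares problem and checking the residual.
c, *_ = np.linalg.lstsq(D, y - X[:, 0], rcond=None)
print(np.allclose(D @ c, y - X[:, 0]))   # True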
The following classes of matrices will be frequently encountered in this course.
Now, let A be a non–singular n × n real matrix. Suppose that A is partitioned as

A = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix},

where A_{ii} ∈ R^{n_i × n_i} for i = 1, 2, with n_1 + n_2 = n. Then, provided that the relevant inverses exist, the inverse of A has the following form:

A^{−1} = \begin{bmatrix} A_{11}^{−1} + A_{11}^{−1} A_{12} S^{−1} A_{21} A_{11}^{−1} & −A_{11}^{−1} A_{12} S^{−1} \\ −S^{−1} A_{21} A_{11}^{−1} & S^{−1} \end{bmatrix},

where S = A_{22} − A_{21} A_{11}^{−1} A_{12}. In index–set notation, with α ⊆ {1, ..., n} indexing the first diagonal block and α′ its complement, the matrix S can be written as

A(α′) − A(α′, α) A(α)^{−1} A(α, α′).
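A quick numerical sanity check of the block–inverse formula (an added illustration; the random matrix and the use of NumPy are assumptions) is the following.

import numpy as np

rng = np.random.default_rng(2)
n1, n2 = 3, 2
A = rng.standard_normal((n1 + n2, n1 + n2))
A11, A12 = A[:n1, :n1], A[:n1, n1:]
A21, A22 = A[n1:, :n1], A[n1:, n1:]

# Schur complement of A11 (the relevant inverses exist for a generic random A).
A11inv = np.linalg.inv(A11)
S = A22 - A21 @ A11inv @ A12
Sinv = np.linalg.inv(S)

block_inverse = np.block([
    [A11inv + A11inv @ A12 @ Sinv @ A21 @ A11inv, -A11inv @ A12 @ Sinv],
    [-Sinv @ A21 @ A11inv,                         Sinv],
])
print(np.allclose(block_inverse, np.linalg.inv(A)))   # True (up to rounding)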
We say that A defines an orthogonal projection onto the subspace S ⊂ R^n if for every x = x^1 + x^2 ∈ R^n, where x^1 ∈ S and x^2 ∈ S^⊥, we have Ax = x^1. Note that if A defines an orthogonal projection onto S, then I − A defines an orthogonal projection onto S^⊥. Furthermore, it can be shown that A is an orthogonal projection onto S iff A is a symmetric projection matrix with range(A) = S. As an illustration, consider an m × n real matrix A, with m ≤ n and rank(A) = m. Then, the projection matrix corresponding to the orthogonal projection onto the nullspace of A is given by P_{null(A)} = I − A^T (AA^T)^{−1} A.
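The formula for P_{null(A)} can be verified numerically. The sketch below is an added illustration (a random full–row–rank matrix and NumPy are assumptions): it checks that P is symmetric, idempotent, and maps every vector into null(A).

import numpy as np

rng = np.random.default_rng(3)
m, n = 3, 6
A = rng.standard_normal((m, n))        # full row rank with probability 1

P = np.eye(n) - A.T @ np.linalg.inv(A @ A.T) @ A

print(np.allclose(P, P.T))             # symmetric
print(np.allclose(P @ P, P))           # idempotent, i.e. a projection matrix
print(np.allclose(A @ P, 0))           # every column of P lies in null(A)

x = rng.standard_normal(n)
print(np.allclose(A @ (P @ x), 0))     # the projection of any x lies in null(A)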
gives rise to a set of k eigenvectors of A whose associated eigenvalue is λ̄. It is worth noting that if {v^1, ..., v^k} is an orthonormal basis of L̄, then we can find an orthogonal matrix P_{1k} ∈ R^{k×k} such that V_{1k} = U_{1k} P_{1k}, where U_{1k} (resp. V_{1k}) is the n × k matrix whose i–th column is u^i (resp. v^i), for i = 1, ..., k. In particular, if A = UΛU^T = VΛV^T are two spectral decompositions of A with

Λ = \begin{bmatrix} λ_{i_1} I_{n_1} & & & \\ & λ_{i_2} I_{n_2} & & \\ & & \ddots & \\ & & & λ_{i_l} I_{n_l} \end{bmatrix},

where λ_{i_1}, ..., λ_{i_l} are the distinct eigenvalues of A, I_k denotes a k × k identity matrix, and n_1 + n_2 + · · · + n_l = n, then there exists an orthogonal matrix P with the block diagonal structure

P = \begin{bmatrix} P_{n_1} & & & \\ & P_{n_2} & & \\ & & \ddots & \\ & & & P_{n_l} \end{bmatrix},
where P_{n_j} is an n_j × n_j orthogonal matrix for j = 1, ..., l, such that V = UP. Now, suppose that we order the eigenvalues of A as λ_1 ≥ λ_2 ≥ · · · ≥ λ_n. Then, the Courant–Fischer theorem states that the k–th largest eigenvalue λ_k, where k = 1, ..., n, can be found by solving the following optimization problems:

λ_k = \min_{w^1, ..., w^{k−1} ∈ R^n} \; \max_{x ≠ 0, \, x ∈ R^n, \, x ⊥ w^1, ..., w^{k−1}} \frac{x^T A x}{x^T x} = \max_{w^1, ..., w^{n−k} ∈ R^n} \; \min_{x ≠ 0, \, x ∈ R^n, \, x ⊥ w^1, ..., w^{n−k}} \frac{x^T A x}{x^T x}.   (3)
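As an added numerical illustration of this variational characterization (the random symmetric matrix and NumPy are assumptions), the sketch below checks that the largest eigenvalue is the maximum of the Rayleigh quotient and that λ_k is obtained when x is restricted to be orthogonal to the top k − 1 eigenvectors, which is one particular choice of w^1, ..., w^{k−1}.

import numpy as np

rng = np.random.default_rng(4)
n = 6
B = rng.standard_normal((n, n))
A = (B + B.T) / 2                        # a random symmetric matrix

vals, vecs = np.linalg.eigh(A)           # eigenvalues in ascending order
vals, vecs = vals[::-1], vecs[:, ::-1]   # reorder so that lambda_1 >= ... >= lambda_n

def rayleigh(x):
    return (x @ A @ x) / (x @ x)

print(np.isclose(rayleigh(vecs[:, 0]), vals[0]))   # lambda_1 is attained at the top eigenvector

k = 3
W = vecs[:, :k - 1]                      # w^1, ..., w^{k-1}: the top k-1 eigenvectors
samples = rng.standard_normal((n, 2000))
samples -= W @ (W.T @ samples)           # force the samples to be orthogonal to w^1, ..., w^{k-1}
quotients = np.einsum('ij,ij->j', samples, A @ samples) / np.einsum('ij,ij->j', samples, samples)

print(quotients.max() <= vals[k - 1] + 1e-9)              # lambda_k bounds the quotient from above
print(np.isclose(rayleigh(vecs[:, k - 1]), vals[k - 1]))  # and the bound is attained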
By definition, a real positive semidefinite matrix is symmetric, and hence it has the properties listed above. However, much more can be said about such matrices. For instance, the following statements are equivalent for an n × n real symmetric matrix A:
(a) A is positive semidefinite.
(b) All the eigenvalues of A are non–negative.
(c) There exists a unique n × n positive semidefinite matrix A^{1/2} such that A = A^{1/2} A^{1/2}.

(d) There exists a k × n matrix B, where k = rank(A), such that A = B^T B.
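To illustrate equivalences (b)–(d) (an added sketch; the randomly generated positive semidefinite matrix and NumPy are assumptions), A^{1/2} and a factor B can be built directly from a spectral decomposition of A.

import numpy as np

rng = np.random.default_rng(5)
n, k = 5, 3
C = rng.standard_normal((k, n))
A = C.T @ C                                  # positive semidefinite of rank k by construction

vals, U = np.linalg.eigh(A)
print((vals >= -1e-10).all())                # (b) all eigenvalues are non-negative (up to rounding)

vals = np.clip(vals, 0.0, None)              # remove tiny negative rounding errors
A_half = U @ np.diag(np.sqrt(vals)) @ U.T    # (c) the positive semidefinite square root
print(np.allclose(A_half @ A_half, A))       # True

idx = vals > 1e-10                           # indices of the nonzero eigenvalues
B = np.sqrt(vals[idx])[:, None] * U[:, idx].T
print(B.shape, np.allclose(B.T @ B, A))      # (d) a rank(A) x n factor: (3, 5) True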
Similarly, the following statements are equivalent for an n × n real symmetric matrix A:
(a) A is positive definite.
(b) A^{−1} exists and is positive definite.
(c) All the eigenvalues of A are positive.
(d) There exists a unique n × n positive definite matrix A^{1/2} such that A = A^{1/2} A^{1/2}.
Sometimes it would be useful to have a criterion for determining the positive semidefiniteness of a matrix from a block partitioning of the matrix. Here is one such criterion. Let

A = \begin{bmatrix} X & Y \\ Y^T & Z \end{bmatrix}

be an n × n real symmetric matrix, where both X and Z are square. Suppose that Z is invertible. Then, the Schur complement of the matrix A is defined as the matrix S_A = X − Y Z^{−1} Y^T. If Z ≻ 0, then it can be shown that A ⪰ 0 iff X ⪰ 0 and S_A ⪰ 0. There is of course nothing special about the block Z. If X is invertible, then we can similarly define the Schur complement of A as S′_A = Z − Y^T X^{−1} Y. If X ≻ 0, then we have A ⪰ 0 iff Z ⪰ 0 and S′_A ⪰ 0.
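A small numerical check of this criterion (an added sketch; the random blocks, which make Z positive definite, and the use of NumPy are assumptions) can be written as follows.

import numpy as np

rng = np.random.default_rng(6)
p, q = 3, 4

X = rng.standard_normal((p, p))
X = X @ X.T                                     # positive semidefinite
Y = rng.standard_normal((p, q))
Z = rng.standard_normal((q, q))
Z = Z @ Z.T + np.eye(q)                         # positive definite

A = np.block([[X, Y], [Y.T, Z]])
S = X - Y @ np.linalg.inv(Z) @ Y.T              # Schur complement S_A

def is_psd(M, tol=1e-9):
    return np.linalg.eigvalsh((M + M.T) / 2).min() >= -tol

# With Z > 0, the criterion says: A is PSD iff X is PSD and S_A is PSD.
print(is_psd(A) == (is_psd(X) and is_psd(S)))   # True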
Let A be an m × n real matrix of rank r ≥ 1. Then, there exist orthogonal matrices U ∈ R^{m×m} and V ∈ R^{n×n} such that

A = U Λ V^T,   (4)

where Λ ∈ R^{m×n} has Λ_{ij} = 0 for i ≠ j and Λ_{11} ≥ Λ_{22} ≥ · · · ≥ Λ_{rr} > Λ_{r+1,r+1} = · · · = Λ_{qq} = 0 with q = min{m, n}. The representation (4) is called the Singular Value Decomposition (SVD) of A; cf. (2). The entries Λ_{11}, ..., Λ_{qq} are called the singular values of A, and the columns of U (resp. V) are called the left (resp. right) singular vectors of A. For notational convenience, we write σ_i ≡ Λ_{ii} for i = 1, ..., q. Note that (4) can be equivalently written as

A = \sum_{i=1}^r σ_i u^i (v^i)^T,
where u^i (resp. v^i) is the i–th column of the matrix U (resp. V), for i = 1, ..., r. The rank of A is equal to the number of non–zero singular values. Now, suppose that we order the singular values of A as σ_1 ≥ σ_2 ≥ · · · ≥ σ_q, where q = min{m, n}. Then, the Courant–Fischer theorem states that the k–th largest singular value σ_k, where k = 1, ..., q, can be found by solving the following optimization problems:

σ_k = \min_{w^1, ..., w^{k−1} ∈ R^n} \; \max_{x ≠ 0, \, x ∈ R^n, \, x ⊥ w^1, ..., w^{k−1}} \frac{‖Ax‖_2}{‖x‖_2} = \max_{w^1, ..., w^{n−k} ∈ R^n} \; \min_{x ≠ 0, \, x ∈ R^n, \, x ⊥ w^1, ..., w^{n−k}} \frac{‖Ax‖_2}{‖x‖_2}.   (5)
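The rank–one expansion of the SVD is easy to reproduce numerically. The sketch below is an added illustration (the random matrix and NumPy are assumptions): it computes the SVD and rebuilds A from its r rank–one terms σ_i u^i (v^i)^T.

import numpy as np

rng = np.random.default_rng(7)
m, n = 4, 6
A = rng.standard_normal((m, 2)) @ rng.standard_normal((2, n))   # rank 2 by construction

U, s, Vt = np.linalg.svd(A)            # s holds the singular values in decreasing order
r = int(np.sum(s > 1e-10))
print(r)                               # 2: the rank equals the number of nonzero singular values

# Rebuild A as the sum of r rank-one terms sigma_i * u^i (v^i)^T.
A_rebuilt = sum(s[i] * np.outer(U[:, i], Vt[i]) for i in range(r))
print(np.allclose(A_rebuilt, A))       # True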
The optimization problems (3) and (5) suggest that singular values and eigenvalues are closely related notions. Indeed, if A is an m × n real matrix, then

λ_k(A^T A) = λ_k(AA^T) = σ_k^2(A)   for k = 1, ..., q,

where q = min{m, n}. Moreover, the columns of U and V are the eigenvectors of AA^T and A^T A, respectively. In particular, our discussion in Section 2.1 implies that the set of singular values of A is unique, but the sets of left and right singular vectors are not. Finally, we note that the largest singular value function induces a matrix norm, which is known as the spectral norm and is sometimes denoted by ‖A‖_2 = σ_1(A).
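These relations can be confirmed numerically as well. The sketch below is an added illustration (the random matrix and NumPy are assumptions): it compares the squared singular values of A with the leading eigenvalues of A^T A and AA^T, and checks that the spectral norm equals σ_1(A).

import numpy as np

rng = np.random.default_rng(8)
m, n = 4, 6
A = rng.standard_normal((m, n))
q = min(m, n)

sigma = np.linalg.svd(A, compute_uv=False)             # sigma_1 >= ... >= sigma_q
eig_AAt = np.sort(np.linalg.eigvalsh(A @ A.T))[::-1]   # eigenvalues of A A^T, decreasing
eig_AtA = np.sort(np.linalg.eigvalsh(A.T @ A))[::-1]   # eigenvalues of A^T A, decreasing

print(np.allclose(sigma**2, eig_AAt[:q]))              # True
print(np.allclose(sigma**2, eig_AtA[:q]))              # True

# The spectral norm ||A||_2 equals the largest singular value sigma_1(A).
print(np.isclose(np.linalg.norm(A, 2), sigma[0]))      # True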