



































Introduction
These notes provide an introduction to the use of vectors and matrices in engineering analysis.
They also discuss how the simple concept of a vector in mechanics leads to the concept of vector spaces for engineering analysis.
Matrix notation simplifies the representation of linear algebraic equations. In addition, the matrix representation of a system of equations reveals important properties of that system. The discussion here presents many results without proof. You can refer to a general advanced engineering mathematics text, like the one by Kreyszig, or to a text on linear algebra for such proofs.
Parts of these notes have been prepared for use in a variety of courses to provide background
information on the use of matrices in engineering problems. Consequently, some of the material may not be used in this course and different sections from these notes may be assigned at
different times in the course.
Vectors
A vector is a common concept in engineering mechanics that most students first saw in their high-school physics courses. Vectors are usually described in introductory courses as quantities that have a magnitude and a direction. Force and velocity are common examples of vectors used in basic mechanics courses.
In addition to representing a vector in terms of its magnitude and direction, we can also represent a vector in terms of its components. This is illustrated in the figure below. Here we have a force vector, f, with a magnitude, |f|, and a direction, θ, relative to the x axis. (Note that the notation for the vector, f, and its magnitude, |f|, are different. The vector is the full specification of a magnitude and direction; e.g., 2000 pounds force at an angle of 30° from the x axis. The magnitude |f| is 2000 pounds in this example.) The components of the vector in the x and y directions are called fx and fy, respectively. These are not vectors, but scalars that are multiplied by the unit vectors in the x and y directions to give the vector forces in the coordinate directions. The unit vectors in the x and y directions are usually given the symbols i and j, respectively. In this case we would write the vector in terms of its components as f = fx i + fy j. The vector components are called scalars to distinguish them from vectors. (Formally, a scalar is defined as a quantity that is invariant under a coordinate transformation.)
The concept of writing a vector in terms of its components is an important one in engineering
analysis. Instead of writing f = fx i + fy j , we can write f = [fx fy], with the understanding that the first number is the x component of the vector and the second number is the y component of the
vector. Using this notation we can write the unit vectors in the x and y directions as i = [1 0] and j = [0 1]. This notation for unit vectors provides a link between representing a vector as a row or
column matrix, as we will do below, and the conventional vector notation: f = fx i + fy j and f = [fx fy]. If we substitute i = [1 0] and j = [0 1] in the equation f = fx i + fy j, we get the result that f = fx[1 0] + fy[0 1] = [fx fy].

[Figure: a force vector f at an angle θ from the x axis, resolved into components fx i along the x axis and fy j along the y axis]

An alternative notation uses numerical subscripts for the coordinate directions and components. In this scheme we would call the x and y coordinate directions the x1 and x2 directions, and the vector components would be labeled f1 and f2. The numerical notation allows a generalization to systems with an arbitrary number of dimensions.
From the diagram of the vector, f , and its components, we see that the magnitude of the vector,
| f |, is given by Pythagoras’s theorem:
|f|² = fx² + fy² = f1² + f2²

We know that we can extend the two-dimensional vector shown above to three dimensions. In this case our vectors have three components, one in each coordinate direction. We can write the unit vectors in the three coordinate directions as i = [1 0 0], j = [0 1 0], and k = [0 0 1]. We would then write our three-dimensional vector, using numerical subscripts in place of x, y, and z subscripts, as f = f1 i + f2 j + f3 k or f = [f1 f2 f3]. If we substitute i = [1 0 0], j = [0 1 0], and k = [0 0 1] in the equation f = f1 i + f2 j + f3 k, we get the result that f = f1[1 0 0] + f2[0 1 0] + f3[0 0 1] = [f1 f2 f3].
The dot product of two vectors, a and b, is written as a • b. The dot product is a scalar and its value is |a||b| cos(θ), where θ is the angle between the two vectors. The magnitude of the unit vectors, i, j, and k, is one. Each unit vector is parallel to itself, so if we evaluate i • i, j • j, or k • k, we get |1||1| cos(0) = 1 for the dot product. Any two different unit vectors are perpendicular to each other, so the angle between them is 90°; thus the dot product of any two different unit vectors is |1||1| cos(90°) = 0. The dot product of two vectors expressed in terms of their components can then be expanded as follows: a • b = (a1 i + a2 j + a3 k) • (b1 i + b2 j + b3 k) = a1b1 i • i + a1b2 i • j + a1b3 i • k + a2b1 j • i + a2b2 j • j + a2b3 j • k + a3b1 k • i + a3b2 k • j + a3b3 k • k = a1b1 + a2b2 + a3b3. This result – the dot product of two vectors is the sum of the products of the individual components – is the basis for the generalization of the dot product into the inner product as discussed below.
The dot product represents the magnitude of the projection of the first vector along the direction of the second vector times the magnitude of the second vector. The most familiar application of the dot product in engineering mechanics is in the definition of work as dW = f • dx; this gives the product of the magnitude of the force component in the direction of the displacement times the magnitude of the displacement.
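A short C++ function illustrates this componentwise calculation; the function names here are illustrative choices, not something from the notes' own code, and zero-based array subscripts are used rather than the 1-to-n notation of the text.

#include <cmath>   // for sqrt

// Dot product of two three-component vectors: a . b = a1*b1 + a2*b2 + a3*b3.
double dotProduct(const double a[3], const double b[3])
{
    double sum = 0.0;
    for (int k = 0; k < 3; k++)
        sum += a[k] * b[k];
    return sum;
}

// The magnitude of a vector is the square root of its dot product with itself.
double magnitude(const double f[3])
{
    return sqrt(dotProduct(f, f));
}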
The fact that the unit vectors are perpendicular to each other gives a particularly simple relationship for the dot product. This is an important tool in later applications of vectors. We use the word orthogonal to describe a set of vectors that are mutually perpendicular. In addition, when we have a set of mutually perpendicular vectors, each of which has a magnitude of one, we call this set of vectors an orthonormal set.
We can represent any three-dimensional vector in terms of the three unit vectors, i, j, and k. Because of this we say that these three vectors are a basis set for representing any real three-dimensional vector. In fact, we could use any three vectors in place of i, j, and k to represent any three-dimensional vector, so long as the set of three vectors is linearly independent. For example, we could use a new set, m = i + j + k, n = i + j – k, and o = i + k. This would be an inconvenient set to use, since these basis vectors are not orthogonal and the dot products would be hard to compute. Nevertheless, we could represent any vector as a = a1 m + a2 n + a3 o instead of the equivalent vector (a1 + a2 + a3) i + (a1 + a2) j + (a1 – a2 + a3) k. We can convert the components of a vector b = b1 i + b2 j + b3 k into the m, n, o basis by solving the following set of equations:

a1 + a2 + a3 = b1
a1 + a2 = b2
a1 – a2 + a3 = b3    [1]
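Because this particular set of equations is small, it can be solved by direct elimination: subtracting the second equation from the first gives a3, and subtracting the third from the first gives 2a2. A minimal C++ sketch of that calculation (the function name and argument order are illustrative assumptions) is shown below.

// Convert the components (b1, b2, b3) of a vector in the i, j, k basis
// into components (a1, a2, a3) in the m, n, o basis by solving equation [1].
// Eq1 - Eq2 gives a3 = b1 - b2; Eq1 - Eq3 gives 2*a2 = b1 - b3.
void toMnoBasis(double b1, double b2, double b3,
                double &a1, double &a2, double &a3)
{
    a3 = b1 - b2;
    a2 = 0.5 * (b1 - b3);
    a1 = b2 - a2;   // from Eq2: a1 + a2 = b2
}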
The matrix formulation of engineering problems also leads to important analytical results. We will see that a matrix property known as its eigenvalues represents the fundamental vibration frequencies in a mechanical system.

A general matrix, A, with n rows and m columns has components aij, where the first subscript denotes the row and the second denotes the column:

        [ a11  a12  a13  ...  a1m ]
        [ a21  a22  a23  ...  a2m ]
    A = [ a31  a32  a33  ...  a3m ]
        [ ...  ...  ...  ...  ... ]
        [ an1  an2  an3  ...  anm ]
Two matrices can be added or subtracted if both matrices have the same size. If we define a
matrix, C , as the sum (or difference) of two matrices, A and B , we can write this sum (or difference) in terms of the matrices as follows.
C = A ± B    (possible only if A and B have the same size)    [5]
The components of the C matrix are simply the sum (or difference) of the components of the two matrices being added (or subtracted). Thus, for the matrix sum (or difference) shown in equation [5], the components of C are given by the following equation.

C = A ± B  if  cij = aij ± bij    (i = 1, …, n; j = 1, …, m)    [6]
The product of a matrix, A , with a single number, x, yields a second matrix whose size is the
same as that of matrix A. Each component of the new matrix is the component of the original matrix, aij, multiplied by the number x. The number x in this case is usually called a scalar to
distinguish it from a matrix or a matrix component.
B = xA  if  bij = x aij    (i = 1, …, n; j = 1, …, m)    [7]
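In code, equations [6] and [7] are simple elementwise loops. The sketch below follows the conventions of the other code examples in these notes: all variables are assumed to be declared and initialized, and subscripts run from 1 to n and 1 to m. The array name e for the scaled matrix is an illustrative choice.

for (i = 1; i <= n; i++)
    for (j = 1; j <= m; j++) {
        c[i][j] = a[i][j] + b[i][j];   // C = A + B, equation [6]
        e[i][j] = x * a[i][j];         // E = xA, equation [7]
    }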
We define two special matrices, the null matrix, 0 , and the identity matrix, I. The null matrix is an
arbitrary size matrix in which all the elements are zero. The identity matrix is a square matrix in which all the diagonal terms are 1 and the off-diagonal terms are zero. These matrices are
sometimes written as 0(m x n) or In to specify a particular size for the null or identity matrix. The null matrix and the identity matrix are shown below.

    [ 0  0  ...  0 ]        [ 1  0  ...  0 ]
0 = [ 0  0  ...  0 ]    I = [ 0  1  ...  0 ]
    [ ... ... ... .]        [ ... ... ... .]
    [ 0  0  ...  0 ]        [ 0  0  ...  1 ]
A matrix that has the same pattern as the identity matrix, but has terms other than ones on its
principal diagonal is called a diagonal matrix. The general term for such a matrix is diδij, where di is the diagonal term for row i and δij is the Kronecker delta; the latter is defined such that δij = 0
unless i = j, in which case δij = 1. A diagonal matrix is sometimes represented in the following form: D = diag(d1, d2, d3, …, dn); this says that D is a diagonal matrix whose diagonal components are given by di.
We call the diagonal for which the row index is the same as the column index, the main or
principal diagonal. Algorithms in the numerical analysis of differential equations lead to matrices whose nonzero terms lie along diagonals. For such a matrix, all the nonzero terms may be
represented by symbols like ai,i-k or ai,i+k. Diagonals with subscripts ai,i-k or ai,i+k are said to lie, respectively, below or above the main diagonal.
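The Kronecker delta definition of a diagonal matrix translates directly into code. A short sketch, using the same conventions as the notes' other examples (here d is assumed to hold the n diagonal values), is:

// Build D = diag(d1, d2, ..., dn): diagonal entries from d, zeros elsewhere.
for (i = 1; i <= n; i++)
    for (j = 1; j <= n; j++)
        D[i][j] = (i == j) ? d[i] : 0.0;   // the Kronecker delta selects the diagonal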
If the n rows and m columns in a matrix, A, are interchanged, we will have a new matrix, B, with m rows and n columns. The matrix B is said to be the transpose of A, written as AT.
B = A^T  if  bij = aji    [i = 1, …, n; j = 1, …, m; A is (n x m); B is (m x n)]    [9]
An example of an original A matrix and its transpose is shown below.

A = [ 1  2  3 ]    A^T = [ 1  4 ]
    [ 4  5  6 ]          [ 2  5 ]
                         [ 3  6 ]    [10]
The transpose of a product of matrices equals the product of the transposes of individual
matrices, with the order reversed. That is,
(AB)^T = B^T A^T    (ABC)^T = C^T B^T A^T    (ABCD)^T = D^T C^T B^T A^T    [11]
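Computing a transpose in code is a single pair of loops; the sketch below uses the same assumptions as the other code examples in these notes.

// B = A-transpose for an (n x m) matrix A, equation [9]: b[j][i] = a[i][j].
for (i = 1; i <= n; i++)
    for (j = 1; j <= m; j++)
        b[j][i] = a[i][j];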
Matrices with only one row are called row matrices; matrices with only one column are called column matrices.1 Although we can write the elements of such matrices with two subscripts, the subscript of one for the single row or the single column is usually not included. The examples below for the row matrix, r, and the column matrix, c, show two possible forms for the subscripts. In each case, the second form has the commonly used notation. When row and column matrices are used in formulas that have two matrix subscripts, the first form shown below is implicitly used to give the second subscript for the equation.

r = [ r11  r12  r13  ...  r1m ] = [ r1  r2  r3  ...  rm ]        c = [ c11 ]   [ c1 ]
                                                                     [ c21 ] = [ c2 ]
                                                                     [ ... ]   [ .. ]
                                                                     [ cn1 ]   [ cn ]    [12]
1. Row and column matrices are called row vectors or column vectors when they are used to represent the components of a vector. In these notes we will use upper case boldface letters such as A and B to represent matrices with more than one row or more than one column; we will use lower case boldface letters such as a or b to represent matrices with only one row or only one column. We will generally refer to these matrices as vectors.
In the 2 x 2 example of equation [16], there are four coefficients, cij, to be computed. In each bik akj product, the second b subscript (k) is the same as the first a subscript. From these observations we can write a general equation for each of the four coefficients in equation [16] as follows.
cij = Σ(k = 1 to 2) bik akj    (i, j = 1, 2)    [17]
The definition of matrix multiplication is a generalization of the simple example in equation [17] to any general sizes of matrices. In this general case, we define the product, C = AB , of two
matrices, A with n rows and p columns, and B with p rows and m columns by the following equation.
cij = Σ(k = 1 to p) aik bkj    (i = 1, …, n; j = 1, …, m)    [18]
There are two important items to consider in the formula for matrix multiplication. The first is that order is important. The product AB is different from the product BA. In fact, one of the products
may not be possible. The second item is the need for compatibility between the first and second matrix in the AB product.^2 In order to obtain the product AB the number of columns in A must
equal the number of rows in B. A simple example of matrix multiplication is shown below.
[ 1  2 ] [ 5  6 ]   [ (1)(5) + (2)(7)  (1)(6) + (2)(8) ]   [ 19  22 ]
[ 3  4 ] [ 7  8 ] = [ (3)(5) + (4)(7)  (3)(6) + (4)(8) ] = [ 43  50 ]    [19]
Matrix multiplication is simple to program. The C++ code for multiplying two matrices is shown
below.^3 This code assumes that all variables have been properly declared and initialized. The code uses the obvious notation to implement equation [18]. The array components are denoted
as a[i][k], b[k][j], and c[i][j]. The product matrix, C, has the same number of rows, n, as matrix A and the same number of columns, m, as matrix B. The number of columns in A is equal to p,
which must also equal the number of rows in B.
for (i = 1; i <= n; i++)
    for (j = 1; j <= m; j++) {
        c[i][j] = 0.0;
        for (k = 1; k <= p; k++)
            c[i][j] += a[i][k] * b[k][j];
    }

2. The terms premultiply and postmultiply are commonly used to indicate the order of the matrices involved in matrix multiplication. In the matrix product AB, we say that B is premultiplied by A or that A is postmultiplied by B. Alternatively, the terms left multiplied and right multiplied are used. In the AB product, A is right multiplied by B and B is left multiplied by A.

3. The basic code structure is the same in any language. There are three nested loops. The two outer loops cover all possible combinations of i and j to ensure that all the cij components are computed. The inner loop code is the typical code for summing a number of items. C++ programmers will note that the loop indices used in this code ignore the fact that the minimum index for a C++ array is zero. This was done deliberately for all code examples in these notes to provide similar notation for the notes and the code.
We now examine how the coordinate transformations that we used above to introduce matrix
multiplication can be represented as matrix equations. We can define matrices, A , B , and C to represent the coefficients that we used in our coordinate transformation equations.
A = [ a11  a12 ]    B = [ b11  b12 ]    C = [ c11  c12 ]    [20]
    [ a21  a22 ]        [ b21  b22 ]        [ c21  c22 ]
The various coordinate pairs can be represented as column matrices as shown below.
x = [ x11 ] = [ x1 ]    y = [ y11 ] = [ y1 ]    z = [ z11 ] = [ z1 ]    [21]
    [ x21 ]   [ x2 ]        [ y21 ]   [ y2 ]        [ z21 ]   [ z2 ]
With these matrix definitions, the two sets of simultaneous linear equations shown in equation
[13] can be represented by the following pair of matrix equations:
y = Ax  and  z = By    [22]
You can verify that the equations above are correct by applying the general formula for matrix
multiplication in equation [18] to the matrix equations in [22]. To do this, you should use the definitions of A , B , x , y , and z , provided in equations [20] and [21]. If we combine the matrix
equations in [22] to eliminate the y matrix, we get the following result.
z = By = BAx,  or  z = Cx  with  C = BA    [23]
Note the importance of the order of multiplication. In general, BA ≠ AB.
There are two cases where the order is not important. These are multiplication by a null matrix, which produces a null matrix, and multiplication by an identity matrix, which produces the original
matrix.
0A = A0 = 0  and  AI = IA = A    [24]
Although the order is not important here, the actual identity and null matrices used may be
different. We can rewrite equations [24] to explicitly show the rows and columns in each matrix.
A(n x m) I(m x m) = I(n x n) A(n x m) = A(n x m)
0(p x n) A(n x m) = 0(p x m)    A(n x m) 0(m x q) = 0(n x q)    [25]
By definition the identity matrix is a square matrix. One size specification for the identity matrix,
the number of rows or the number of columns, is set by the compatibility condition for matrix multiplication. Once this is done, the other size is set by the requirement that I is square. The
same is not true for the null matrix. It can have any shape. Thus the size specifications p and q, for the null matrices in equation [25] are arbitrary. (The other size specifications, n or m, for the
null matrices in equation [25] must match the sizes for the A matrix.)
a multiple of the first equation from the second equation to obtain the following pair of equations, which is equivalent to the original set in equation [29].

3x1 + 5x2 = 13
α22 x2 = β2    [30]
We can readily solve the second equation to find x2 = β2/α22 = 2, and substitute this value of x2 into the first equation to find that x1 = [13 – 5(2)]/3 = 1. The general process for solving the system of equations represented by equations [26], [27], or [28], known as Gauss elimination, is similar to
the one just shown. It requires a series of operations on the coefficients aij and bi to produce a set of equations with the form shown in equation [31], below, without changing the solution of the
initial problem.
α11 x1 + α12 x2 + α13 x3 + ... + α1n xn = β1
         α22 x2 + α23 x3 + ... + α2n xn = β2
                  α33 x3 + ... + α3n xn = β3
                            ...........
                                 αnn xn = βn    [31]
The basic rule in the Gauss elimination process is that we can use a linear combination of
two equations to replace one of those equations, without changing the solution to the problem. This is the process that we used above in going from the set of equations in [29] to the
set of equations in [30]. Both sets of equations are equivalent in the sense that both sets of equations give the same answers for x 1 and x 2. However, the second set of equations can be
directly solved for all the unknowns.
The revised coefficient matrix in equation [31] is called an upper triangular matrix. The only
nonzero terms are on or above the principal diagonal. The same operations that are used to obtain the revised coefficient matrix are used to obtain the revised right-hand-side matrix.
The revised A and b matrices are obtained in a series of steps. In the first step, the x 1 coefficients are eliminated from all equations except the first one. This is done by the following
replacement operations on the coefficients in equations 2 to n. The replacement notation (←) from computer programming is used here to indicate that an old value of aij is being replaced by the result of a calculation. This avoids the need to use mathematical notation that would require
separate symbols for the two values.
aij ← aij – (ai1/a11) a1j  (j = 1, …, n)  and  bi ← bi – (ai1/a11) b1    (i = 2, …, n)    [32]
After equation [32] is applied to all rows below the first row, the only nonzero x 1 coefficient is in
the first equation (represented by the first row of the matrix). You can confirm that this will set ai1 = 0 for i > 1. You can also apply the formulae in [32] to equation [29] to see that the result is
equation [30]. The elimination process is next applied to make the x 2 coefficients on all equations
below the second equation zero.
aij ← aij – (ai2/a22) a2j  (j = 2, …, n)  and  bi ← bi – (ai2/a22) b2    (i = 3, …, n)    [33]
Equation [33] has the same form as equation [32]; only the starting points for the row and column
operations are different. The process described by equations [32] and [33] continues until the form shown in equation [31] is obtained. From equation [31], the various values of x can be found
by back substitution. We can simply find xn as βn/αnn. The remaining values of x are found in reverse order by the following equation.
xi = (βi – Σ(j = i+1 to n) αij xj) / αii    (i = n–1, n–2, …, 1)    [34]
When we are solving for xi, all previous values of xj required in the summation are known.
The C++ code below shows a simplified version^4 of how the Gauss elimination method is applied
to the solution of equations. As in previous code examples, all data values are assumed to be properly declared and initialized. The number of equations is equal to the number of unknowns,
n. The row that is subtracted from all rows below it is called the pivot row. The main outer loop in the first part of the code uses the variable, pivot, to represent this row. The code execution is
simplified by augmenting the a matrix so that ai,n+1 = bi. This allows the code to proceed without separate consideration of similar operations on the A and b matrix components.
// augment a matrix with b values
for (row = 1; row <= n; row++)
    a[row][n+1] = b[row];

// get upper triangular array
for (pivot = 1; pivot < n; pivot++)
    for (row = pivot+1; row <= n; row++)
        for (column = pivot+1; column <= n+1; column++)
            a[row][column] -= a[row][pivot] * a[pivot][column] / a[pivot][pivot];

// Upper triangular matrix complete; get x values by back substitution
for (row = n; row >= 1; row--) {
    x[row] = a[row][n+1];
    for (column = n; column > row; column--)
        x[row] -= a[row][column] * x[column];
    x[row] /= a[row][row];
}
The process outlined above for the solution of a set of simultaneous equations is known as the Gaussian elimination procedure. Alternative procedures, such as the Gauss-Jordan method, are discussed in numerical analysis texts.
4. Actual code would have to account for the possibility that the system of equations might not have a solution. It would also use different operations to reduce round-off error. This example continues the practice used previously of starting the array subscripts at 1 and ending them at n to be consistent with the notation. Typical C++ code starts the array subscripts at 0.
A matrix in row echelon form has all nonzero rows above any rows of all zeros, and the leading nonzero entry of each row lies to the right of the leading entry of the row above it. The rank of a matrix is the number of nonzero rows in its row echelon form. For example, the matrix below is in row echelon form and has rank five:

[ 6  0  2  0  0  0 ]
[ 0  1  7  8  6  2 ]
[ 0  0  2  0  3  5 ]
[ 0  0  0  4  1  0 ]
[ 0  0  0  0  6  0 ]
[ 0  0  0  0  0  0 ]    [36]
The existence and uniqueness of solutions are defined in terms of the rank of the augmented matrix, [ A , b ]. This is the matrix in which the right hand side column matrix, b , is added as the
final column in the A matrix. This augmented matrix is shown below for the general case of n equations and m unknowns. The n equations mean that there are n rows in the matrix. The m
unknowns give m + 1 columns to the augmented matrix.
           [ a11  a12  a13  ...  a1m | b1 ]
           [ a21  a22  a23  ...  a2m | b2 ]
[A, b] =   [ a31  a32  a33  ...  a3m | b3 ]
           [ ...  ...  ...  ...  ... | .. ]
           [ an1  an2  an3  ...  anm | bn ]    [37]
The existence and uniqueness of solutions to Ax = b is stated below without proof.
If the rank of the original matrix, A, equals the rank of the augmented matrix, [A, b], and both equal the number of unknowns, m, there is a unique solution to the matrix equation, Ax = b.
If the rank of the original matrix, A, equals the rank of the augmented matrix, [A, b], but is less than the number of unknowns, m, there are an infinite number of solutions to the matrix equation, Ax = b.
If the rank of the original matrix, A, is not equal to the rank of the augmented matrix, [A, b], there is no solution to the matrix equation, Ax = b.
We can see that these statements are consistent with the examples in equation [35]. A formal proof of these statements is given in linear algebra texts.
These guidelines for the existence and uniqueness of solutions to simultaneous linear equations are illustrated in the three sets of equations shown below. Each equation set has three equations in three unknowns. Each original equation set is converted to an upper triangular form. We see that the first set has a unique solution. The
second and third sets do not have a unique solution; however, there is a difference between these two. The second set has an infinite number of solutions: for any value that we pick for x3, we can determine values of x1 and x2 that are consistent with the original set of equations.
However, for the third set of equations, the upper triangular form gives an inconsistent third equation. Thus this set of equations has no solution.
Set I
    Original equation set:
        x1 + 4x2 + 26x3 = 2
        2x2 + 9x3 = 5
        –7x1 + 3x2 + 8x3 = 13
    Upper triangular form:
        x1 + 4x2 + 26x3 = 2
        2x2 + 9x3 = 5
        50.5x3 = –50.5
    Solution: x1 = 0, x2 = 7, x3 = –1

Set II
    Original equation set:
        x1 + 4x2 + 26x3 = 2
        2x2 + 9x3 = 5
        2x1 + 10x2 + 61x3 = 9
    Upper triangular form:
        x1 + 4x2 + 26x3 = 2
        2x2 + 9x3 = 5
        0 = 0
    Solution: infinitely many; for any choice of x3, x2 = (5 – 9x3)/2 and x1 = –8 – 8x3

Set III
    Original equation set:
        x1 + 4x2 + 26x3 = 2
        2x2 + 9x3 = 5
        2x1 + 10x2 + 61x3 = 8
    Upper triangular form:
        x1 + 4x2 + 26x3 = 2
        2x2 + 9x3 = 5
        0 = –1 (inconsistent)
    Solution: No solution
These three sets of equations are shown in terms of their A and augmented [ A b ] matrices in the table below. We see that the set of equations in the table above corresponds to the data in
the augmented matrix. The first set of equations has rank A = rank [ A b ] = 3, the number of unknowns. We have already seen that this provides the unique solution above. The second set
of equations has rank A = rank [ A b ] = 2, less than the number of unknowns. This means that we have an infinite number of solutions. Again, this corresponds to the result above. Finally, the
third case below has rank A = 2, but rank [ A b ] = 3. This difference in rank shows that there are no solutions.
Set I
    Original matrices:
        A = [  1  4  26 ]        [A b] = [  1  4  26 |  2 ]
            [  0  2   9 ]                [  0  2   9 |  5 ]
            [ –7  3   8 ]                [ –7  3   8 | 13 ]
    Row echelon form:
        A = [ 1  4  26   ]       [A b] = [ 1  4  26   |   2   ]
            [ 0  2   9   ]               [ 0  2   9   |   5   ]
            [ 0  0  50.5 ]               [ 0  0  50.5 | –50.5 ]
    Rank A = 3; Rank [A b] = 3

Set II
    Original matrices:
        A = [ 1   4  26 ]        [A b] = [ 1   4  26 | 2 ]
            [ 0   2   9 ]                [ 0   2   9 | 5 ]
            [ 2  10  61 ]                [ 2  10  61 | 9 ]
    Row echelon form:
        A = [ 1  4  26 ]         [A b] = [ 1  4  26 | 2 ]
            [ 0  2   9 ]                 [ 0  2   9 | 5 ]
            [ 0  0   0 ]                 [ 0  0   0 | 0 ]
    Rank A = 2; Rank [A b] = 2

Set III
    Original matrices:
        A = [ 1   4  26 ]        [A b] = [ 1   4  26 | 2 ]
            [ 0   2   9 ]                [ 0   2   9 | 5 ]
            [ 2  10  61 ]                [ 2  10  61 | 8 ]
    Row echelon form:
        A = [ 1  4  26 ]         [A b] = [ 1  4  26 |  2 ]
            [ 0  2   9 ]                 [ 0  2   9 |  5 ]
            [ 0  0   0 ]                 [ 0  0   0 | –1 ]
    Rank A = 2; Rank [A b] = 3
There is one final case to consider: the case of homogeneous equations, where the b matrix is all zeros. If there are n equations and the rank of the coefficient matrix is n, then the only solution to the set of equations is that all xi = 0. (This is called the trivial solution.) However, if the rank is less than n, it is possible to have a solution in which the xi are not all zero. However, such a solution is not unique.
Consider the two sets of homogeneous equations shown below. Each set of equations has a right-hand side that is all zeros. (The two equation sets are identical except for the coefficient of the x1 term in the first equation.)
The general equation for computing a determinant is given in terms of the minors (or cofactors) of the determinant. The minor, Mij, of a determinant is the smaller determinant that results if row i and column j are eliminated from the original determinant. The cofactor, Aij, equals (–1)^(i+j) Mij. For example, if we start with a 3x3 determinant, such as the one shown in equation [41], we can define nine possible minors (and cofactors). Four of these are shown below:
M33 = | a11  a12 |    A33 = (–1)^(3+3) M33 = M33        M22 = | a11  a13 |    A22 = (–1)^(2+2) M22 = M22
      | a21  a22 |                                            | a31  a33 |

M32 = | a11  a13 |    A32 = (–1)^(3+2) M32 = –M32       M31 = | a12  a13 |    A31 = (–1)^(3+1) M31 = M31
      | a21  a23 |                                            | a22  a23 |    [42]
The determinant of a matrix can be written in terms of its minors or cofactors as follows.
Det A(n x n) = Σ(j = 1 to n) aij Aij = Σ(j = 1 to n) (–1)^(i+j) aij Mij    (any single row i)
             = Σ(i = 1 to n) aij Aij = Σ(i = 1 to n) (–1)^(i+j) aij Mij    (any single column j)    [43]
Note that the sum is taken over any one row or over any one column. In applying this formula,
one seeks rows or columns with a large number of zeros to simplify the calculation of the determinant. We can show that this equation is consistent with the results given previously for
the determinants of 2x2 and 3x3 arrays. Applying equation [43] to the third row of a 3x3 array gives the following result.
Det A(3 x 3) = Σ(j = 1 to 3) a3j A3j = a31 A31 + a32 A32 + a33 A33    [44]
We could have applied equation [43] to any of the three rows or any of the three columns to compute the determinant. I chose to use the third row since the necessary cofactors can be
found in equation [42]. If we use equation [40] to expand the (2 x 2) cofactors in [42] and apply those results to equation [44], we obtain the following result.
Det A(3 x 3) = a31 A31 + a32 A32 + a33 A33 = a31 M31 – a32 M32 + a33 M33

  = a31 | a12  a13 | – a32 | a11  a13 | + a33 | a11  a12 |
        | a22  a23 |       | a21  a23 |       | a21  a22 |

  = a31 (a12 a23 – a22 a13) – a32 (a11 a23 – a21 a13) + a33 (a11 a22 – a12 a21)

  = a31 a12 a23 – a31 a22 a13 – a32 a11 a23 + a32 a21 a13 + a33 a11 a22 – a33 a12 a21    [45]
The final result, after some rearrangement, is the same as the one in equation [41].
Two rules about determinants are apparent from equation [44]:
• A determinant is zero if any row or any column contains all zeros.
• If one row or one column of a determinant is multiplied by a constant, k, the value of the determinant is multiplied by the same constant. Note the implication for matrices: if a matrix is multiplied by a constant, k, then each matrix element is multiplied by k. Thus, if A is an n x n matrix, Det(kA) = k^n Det(A).
Additional rules for and properties of determinants are stated below without proof.
• If one row (or one column) of a determinant is replaced by a linear combination of that row (or column) with another row (or column), the value of the determinant is not changed. This means that the operations of the Gauss elimination process do not change the determinant of a matrix.
• If two rows (or two columns) of a determinant are linearly dependent, the value of the determinant is zero.
• The determinant of the product of two matrices, A and B, is the product of the determinants of the individual matrices: Det(AB) = Det(A) Det(B).
• The determinant of a transposed matrix is the same as the determinant of the original matrix: Det(A^T) = Det(A).
If we apply the column expansion of equation [43] to an upper triangular matrix, A, we find that Det A = a11 A11, since the a11 term is the only term in the first column. We can apply this expansion repeatedly to the cofactors. Each application shows that the determinant is simply the new term in the upper left of the array times its cofactor. Continuing in this fashion, we see that the determinant of an upper triangular matrix is simply the product of the diagonal terms. We can combine this result with the fact noted above that the operations of the Gauss elimination process do not change the determinant of a matrix to develop a practical method for computing the determinant of any matrix: apply Gauss elimination to put the matrix in upper triangular form; the determinant (of both the original matrix and the one in upper triangular form) is then simply the product of the diagonal elements.
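A C++ sketch of this determinant calculation, reusing the elimination loops from the Gauss elimination code above, is shown below. As with that code, all variables are assumed to be declared and initialized; there is no pivoting, so real code would have to handle a zero pivot element, and the array a is overwritten.

// Reduce a to upper triangular form by row replacement (which leaves the
// determinant unchanged), then multiply the diagonal elements together.
double det = 1.0;
for (pivot = 1; pivot < n; pivot++)
    for (row = pivot + 1; row <= n; row++)
        for (column = pivot + 1; column <= n; column++)
            a[row][column] -= a[row][pivot] * a[pivot][column] / a[pivot][pivot];
for (row = 1; row <= n; row++)
    det *= a[row][row];   // product of the diagonal terms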
As an example, consider the matrices from Set I in the table above. The original matrix was

[  1  4  26 ]
[  0  2   9 ]
[ –7  3   8 ]

and its upper triangular form was

[ 1  4  26   ]
[ 0  2   9   ]
[ 0  0  50.5 ]

We can readily compute the determinant as the product (1)(2)(50.5) = 101. You can show that the same value is obtained by the conventional formula for the evaluation of the original 3 x 3 determinant.
Determinants are not used in normal numerical calculations. However if you need to find the
numerical value for a large determinant, the process outlined above is the most direct numerical approach.
Cramer’s rule gives the solution to a system of linear equations in terms of determinants. This approach is rarely used except in some theoretical applications and for very small systems. According to Cramer’s rule, the solution for a particular unknown xi is the ratio of two determinants. The determinant in the denominator uses the usual matrix coefficients, aij. The determinant in the numerator consists of the aij coefficients except in one column: when we are solving for xi, we replace column i of the aij coefficients by the right-hand-side matrix coefficients, bi. For a set of three equations in three unknowns, Cramer’s rule would give the solutions shown in equation [46].

xi = Di / D    (i = 1, 2, 3), where D = Det A and Di is the determinant of the matrix formed by replacing column i of A with the column b    [46]
Cramer’s rule allows us to find an analytical expression for the solution of a set of equations, and
it is sometimes used to solve small sets of equations (2 x 2 or 3 x 3). However, it is never used for numerical calculations of larger systems because it is extremely time consuming.
It is usually not necessary to find the inverse of a matrix. If necessary, you can find a numerical value of the inverse by the same process used to solve simultaneous linear algebraic equations. To understand how this is done, we define a second matrix, B , as A -1. Then, by the definition of inverse we have the following equation.
If B = A^-1, then AB = I    [50]
Equation [51] shows the matrices involved in this equation.
[ a11  a12  a13  ...  a1n ] [ b11  b12  b13  ...  b1n ]   [ 1  0  0  ...  0 ]
[ a21  a22  a23  ...  a2n ] [ b21  b22  b23  ...  b2n ]   [ 0  1  0  ...  0 ]
[ a31  a32  a33  ...  a3n ] [ b31  b32  b33  ...  b3n ] = [ 0  0  1  ...  0 ]
[ ...  ...  ...  ...  ... ] [ ...  ...  ...  ...  ... ]   [ .. .. .. ... .. ]
[ an1  an2  an3  ...  ann ] [ bn1  bn2  bn3  ...  bnn ]   [ 0  0  0  ...  1 ]    [51]
We have a form similar to the usual problem of solving a set of equations. The coefficient matrix, A , is the same, but we have n right-hand side columns of known values. Each of these columns of known values corresponds to one column of unknowns in the B matrix that is A -1. If we use our usual process for solving Ax = b , with, for example, b = [1 0 0 0 …0]T, we will obtain the first column of B = A -1. Repeating the process for similar b columns, which are all zeros except for a 1 in row k gives us column k of the inverse. For example, equation [52] shows the solution for the second column of B = A -1.
[ a11  a12  a13  ...  a1n ] [ b12 ]   [ 0 ]
[ a21  a22  a23  ...  a2n ] [ b22 ]   [ 1 ]
[ a31  a32  a33  ...  a3n ] [ b32 ] = [ 0 ]
[ ...  ...  ...  ...  ... ] [ ... ]   [ . ]
[ an1  an2  an3  ...  ann ] [ bn2 ]   [ 0 ]    [52]
Because the operations for solving a set of simultaneous linear equations are based on the A matrix only, the solution for the inverse is actually done simultaneously for all columns.
An analytical expression for the inverse can be obtained in terms of the cofactors discussed in the section on determinants. We continue to define B = A -1; the components of the inverse, bij, are then given in terms of the minors or cofactors of the original A matrix and its determinant.
If B = A^-1, then bij = Aji / Det A = (–1)^(i+j) Mji / Det A    [53]
The simplest application of this equation is to a 2x2 matrix. For such a matrix,
b11 = A11/Det A = a22/Det A        b12 = A21/Det A = –a12/Det A
b21 = A12/Det A = –a21/Det A       b22 = A22/Det A = a11/Det A    [54]
Combining the results of equation [54] with equation [40] for a 2x2 determinant gives the following result for the inverse of a 2x2 matrix.
[ a11  a12 ]^-1            1          [  a22  –a12 ]
[ a21  a22 ]     = ----------------- [ –a21   a11 ]    [55]
                   a11 a22 – a21 a12
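In code, equation [55] is a one-line calculation per element. A minimal C++ sketch (again assuming declared coefficients and a nonzero determinant) is:

// Inverse of a 2 x 2 matrix from equation [55].
double det = a11 * a22 - a21 * a12;
double b11 =  a22 / det;   double b12 = -a12 / det;
double b21 = -a21 / det;   double b22 =  a11 / det;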
You can easily show that this is correct by multiplying the original matrix by its inverse. You will obtain a unit matrix from either multiplication: AA^-1 or A^-1 A. The same process can be used to find the inverse of a (3x3) matrix; the result is shown below:
[ a11  a12  a13 ]^-1       [ a22 a33 – a32 a23    a32 a13 – a12 a33    a12 a23 – a22 a13 ]
[ a21  a22  a23 ]     = D  [ a31 a23 – a21 a33    a11 a33 – a31 a13    a21 a13 – a11 a23 ]
[ a31  a32  a33 ]          [ a21 a32 – a31 a22    a31 a12 – a11 a32    a11 a22 – a21 a12 ]

where D = 1 / (a11 a22 a33 + a21 a32 a13 + a31 a12 a23 – a11 a32 a23 – a21 a12 a33 – a31 a22 a13)    [56]
Equations [55] and [56] show the value of determinants in providing analytical solutions for inverses. Although determinants are valuable in such cases, any use of determinants should be avoided in numerical work.
The general rule for the inverse of a matrix product and the inverses of the individual matrices is
similar to the same equation for the transpose of a matrix product and the product of the transposes of the individual matrices. This relation is shown below.
(AB)^-1 = B^-1 A^-1    (ABC)^-1 = C^-1 B^-1 A^-1    (ABCD)^-1 = D^-1 C^-1 B^-1 A^-1    [57]
Matrix Eigenvalues and Eigenvectors
If a square matrix premultiplies a column vector and returns the original column vector multiplied by a scalar, the scalar is said to be an eigenvalue of the matrix and the column vector is called an eigenvector. In the following equation, the scalar, λ, is an eigenvalue of the matrix A, and x is an eigenvector.
A(n x n) x(n x 1) = λ x(n x 1)    [58]
We can use the identity matrix to rewrite this equation as follows.
[A(n x n) – λ I(n x n)] x(n x 1) = 0(n x 1)    [59]
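Whatever method is used to find them, a candidate eigenvalue and eigenvector can always be checked directly against equation [58]. A short C++ sketch of that check, in the style of the other code in these notes (variables assumed declared; the tolerance is an illustrative choice, and fabs comes from <cmath>), is:

// Check whether (lambda, x) satisfies A x = lambda x, equation [58].
bool isEigenpair = true;
for (i = 1; i <= n; i++) {
    double sum = 0.0;
    for (j = 1; j <= n; j++)
        sum += a[i][j] * x[j];                // component i of A x
    if (fabs(sum - lambda * x[i]) > 1.0e-9)   // compare with lambda * x, within round-off
        isEigenpair = false;
}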