







































Definition 1.1 Accounting data is routinely recorded data (that is, records of transactions) collected as part of market activities. Most of the data used in econometrics is accounting data.
Accounting data has many shortcomings, since it is not collected specifically for the purposes of the econometrician. An econometrician would prefer data collected in experiments or gathered specifically for her.
The econometrician does not have any control over nonexperimental data. This causes various problems. Econometrics is used to deal with or solve these problems.
1.1.3 Data Types
Definition 1.2 Time-series data are data that represent repeated observations of some variable in subsequent time periods. A time-series variable is often subscripted with the letter t.
Definition 1.3 Cross-sectional data are data that represent a set of observations of some variable at one specific instant over several agents. A cross-sectional variable is often subscripted with the letter i.
Definition 1.4 Time-series cross-sectional data are data that are both time-series and cross-sectional.
A special case of time-series cross-sectional data is panel data. Panel data are observations of the same set of agents over time.
We need to look at the data to detect regularities. Often, we use stylized facts, but this can lead to over-simplifications.
1.2 Models
Models are simplifications of the real world. The data we use in our models is what motivates theory. By choosing assumptions, the modeler determines which features of the real world the model captures. Models can also be developed by postulating relationships among the variables.
Economic models are usually stated in terms of one or more equations.
Because none of our models are exactly correct, we include an error component in our equations, usually denoted $u_i$. In econometrics, we usually assume that the error component is stochastic (that is, random). It is important to note that the error component cannot be modeled by economic theory. We impose assumptions on $u_i$ and, as econometricians, focus our attention on $u_i$.
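For concreteness, a simple linear model with an additive error term might be written as follows (a sketch; the particular variables are illustrative rather than taken from the text):
$$y_i = \beta_0 + \beta_1 x_i + u_i, \qquad i = 1, 2, \ldots, n,$$
where the systematic part $\beta_0 + \beta_1 x_i$ comes from economic theory and the stochastic component $u_i$ absorbs everything the equation leaves out.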
1.3 Statistics
Statistical Statements
As statisticians, we are often called upon to answer questions or make statements concerning certain random variables. For example: Is a coin fair (i.e., is the probability of heads 0.5)? What is the expected value of GNP for the quarter?
Population Distribution
Typically, answering such questions requires knowledge of the distribution of the random variable. Unfortunately, we usually do not know this distribution (although we may have a strong idea, as in the case of the coin).
Experimentation
In order to gain knowledge of the distribution, we draw several realizations of the random variable. The notion is that the observations in this sample contain information concerning the population distribution.
Inference
Definition 2.1 The process by which we make statements concerning the population distribution based on the sample observations is called inference.
Example 2.1 We decide whether a coin is fair by tossing it several times and observing whether it seems to be heads about half the time.
Suppose we draw n observations of a random variable, denoted $\{x_1, x_2, \ldots, x_n\}$. If each $x_i$ is independent and has the same (marginal) distribution, then $\{x_1, x_2, \ldots, x_n\}$ constitutes a simple random sample.
Example 2.2 We toss a coin three times. The outcomes are presumably independent. If $x_i$ counts the number of heads for toss $i$, then we have a simple random sample.
Note that not all samples are simple random samples.
Example 2.3 We are interested in the income level for the population in general. The n observations available in this case are not identically distributed, since higher-income individuals will tend to be more variable.
Example 2.4 Consider the aggregate consumption level. The n observations available in this case are not independent, since a high consumption level in one period is usually followed by a high level in the next.
Definition 2.2 Any function of the observations in the sample which is the basis for inference is called a sample statistic.
Example 2.5 In the coin tossing experiment, let $S$ count the total number of heads and let $P = S/3$ be the sample proportion of heads. Both $S$ and $P$ are sample statistics.
A sample statistic is a random variable – its value will vary from one experiment to another. As a random variable, it is subject to a distribution.
Definition 2.3 The distribution of the sample statistic is the sample distribution of the statistic.
Example 2.6 The statistic $S$ introduced above has a binomial sample distribution. Specifically, $\Pr(S = 0) = 1/8$, $\Pr(S = 1) = 3/8$, $\Pr(S = 2) = 3/8$, and $\Pr(S = 3) = 1/8$.
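These are just binomial probabilities for three fair tosses; a minimal numerical check, assuming Python with only the standard library, is:

    import math

    n, p = 3, 0.5  # three independent tosses of a fair coin
    for s in range(n + 1):
        # Pr(S = s) = C(n, s) * p^s * (1 - p)^(n - s)
        prob = math.comb(n, s) * p**s * (1 - p) ** (n - s)
        print(f"Pr(S = {s}) = {prob}")  # prints 0.125, 0.375, 0.375, 0.125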
For a simple random sample, the sample mean $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$ has $\mathrm{E}[\bar{x}] = \mu$ and $\mathrm{Var}(\bar{x}) = \frac{1}{n^2}\, n\sigma^2 = \frac{\sigma^2}{n}$.
We have been able to establish the mean and variance of the sample mean. However, in order to know its complete distribution precisely, we must know the probability density function (pdf) of the random variable x.
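Both moments of the sample mean are easy to verify by simulation. A minimal sketch, assuming iid normal draws and arbitrary parameter values:

    import numpy as np

    rng = np.random.default_rng(0)
    mu, sigma, n, reps = 5.0, 2.0, 25, 200_000

    # Draw `reps` independent samples of size n and compute each sample mean.
    xbar = rng.normal(mu, sigma, size=(reps, n)).mean(axis=1)

    print(xbar.mean())  # close to mu = 5
    print(xbar.var())   # close to sigma^2 / n = 4 / 25 = 0.16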
2.3 The Normal Distribution
Definition 2.4 A continuous random variable $x_i$ with the density function
$$f(x_i) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{1}{2\sigma^2}(x_i - \mu)^2} \qquad (2.5)$$
follows the normal distribution, where $\mu$ and $\sigma^2$ are the mean and variance of $x_i$, respectively.
Since the distribution is characterized by the two parameters $\mu$ and $\sigma^2$, we denote a normal random variable by $x_i \sim N(\mu, \sigma^2)$. The normal density function is the familiar "bell-shaped" curve, as is shown in Figure 2.1 for $\mu = 0$ and $\sigma^2 = 1$. It is symmetric about the mean $\mu$. Approximately 2/3 of the probability mass lies within $\pm\sigma$ of $\mu$, and about 0.95 lies within $\pm 2\sigma$. There are numerous examples of random variables that have this shape. Many economic variables are assumed to be normally distributed.
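These probability-mass statements can be checked directly from the normal cdf; a short sketch, assuming scipy is available:

    from scipy.stats import norm

    # Probability mass within one and within two standard deviations of the mean.
    within_1sd = norm.cdf(1) - norm.cdf(-1)  # about 0.6827, roughly 2/3
    within_2sd = norm.cdf(2) - norm.cdf(-2)  # about 0.9545
    print(within_1sd, within_2sd)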
Consider the transformed random variable
$$Y_i = a + b x_i.$$
We know that
$$\mu_Y = \mathrm{E}[Y_i] = a + b\mu_x$$
and
$$\sigma_Y^2 = \mathrm{E}(Y_i - \mu_Y)^2 = b^2\sigma_x^2.$$
If $x_i$ is normally distributed, then $Y_i$ is normally distributed as well. That is,
$$Y_i \sim N(\mu_Y, \sigma_Y^2).$$
Figure 2.1: The Standard Normal Distribution
Moreover, if $x_i \sim N(\mu_x, \sigma_x^2)$ and $z_i \sim N(\mu_z, \sigma_z^2)$ are independent, then
$$Y_i = a + b x_i + c z_i \sim N\big(a + b\mu_x + c\mu_z,\; b^2\sigma_x^2 + c^2\sigma_z^2\big).$$
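For example, if $x_i \sim N(1, 4)$ and $z_i \sim N(2, 9)$ are independent (numbers chosen purely for illustration), then
$$Y_i = 3 + 2x_i + z_i \sim N\big(3 + 2\cdot 1 + 2,\; 2^2\cdot 4 + 1^2\cdot 9\big) = N(7, 25).$$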
These results will be formally demonstrated in a more general setting in the next chapter.
If, for each $i = 1, 2, \ldots, n$, the $x_i$'s are independent, identically distributed (iid) normal random variables, then
$$\bar{x} \sim N\!\left(\mu_x, \frac{\sigma_x^2}{n}\right).$$
The distribution of $\bar{x}$ will vary with different values of $\mu_x$ and $\sigma_x^2$, which is inconvenient. Rather than dealing with a unique distribution for each case, we work with the standardized statistic $(\bar{x} - \mu_x)/\sqrt{\sigma_x^2/n}$, whose density converges to the standard normal density $\phi(\cdot)$ as the sample size grows. That is,
$$\lim_{n\to\infty} f\!\left(\frac{\bar{x} - \mu_x}{\sqrt{\sigma_x^2/n}}\right) = \phi\!\left(\frac{\bar{x} - \mu_x}{\sqrt{\sigma_x^2/n}}\right).$$
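This convergence is the central limit theorem at work, and it does not require the parent distribution to be normal. A minimal simulation sketch, assuming an exponential parent and an arbitrary sample size:

    import numpy as np
    from scipy.stats import norm

    rng = np.random.default_rng(1)
    n, reps = 50, 100_000

    # Exponential(1) has mean 1 and variance 1, but is strongly skewed.
    xbar = rng.exponential(scale=1.0, size=(reps, n)).mean(axis=1)
    z = (xbar - 1.0) / np.sqrt(1.0 / n)  # standardized sample means

    # Tail coverage is already close to that of the standard normal.
    print((np.abs(z) < 1.96).mean())         # close to 0.95
    print(norm.cdf(1.96) - norm.cdf(-1.96))  # 0.9500...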
2.5 Distributions Associated With The Normal Distribution
Definition 2.5 Suppose that $Z_1, Z_2, \ldots, Z_n$ is a simple random sample, and $Z_i \sim N(0, 1)$. Then
$$\sum_{i=1}^{n} Z_i^2 \sim \chi_n^2, \qquad (2.10)$$
where $n$ are the degrees of freedom of the chi-squared distribution.
The probability density function for the $\chi_n^2$ is
$$f_{\chi^2}(x) = \frac{1}{2^{n/2}\,\Gamma(n/2)}\, x^{n/2 - 1} e^{-x/2}, \qquad x > 0, \qquad (2.11)$$
where $\Gamma(x)$ is the gamma function. See Figure 2.2. If $x_1, x_2, \ldots, x_n$ is a simple random sample, and $x_i \sim N(\mu_x, \sigma_x^2)$, then
$$\sum_{i=1}^{n} \left(\frac{x_i - \mu_x}{\sigma_x}\right)^2 \sim \chi_n^2. \qquad (2.12)$$
The chi-squared distribution will prove useful in testing hypotheses on both the variance of a single variable and the (conditional) means of several. This multivariate usage will be explored in the next chapter.
Example 2.7 Consider the estimate of $\sigma^2$
$$s^2 = \frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n - 1}.$$
Then
$$(n-1)\,\frac{s^2}{\sigma^2} \sim \chi_{n-1}^2. \qquad (2.13)$$
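A simulation sketch of (2.13), assuming normal data with arbitrary parameter values, compares an empirical quantile of $(n-1)s^2/\sigma^2$ with the corresponding chi-squared quantile:

    import numpy as np
    from scipy.stats import chi2

    rng = np.random.default_rng(2)
    mu, sigma, n, reps = 0.0, 3.0, 10, 100_000

    samples = rng.normal(mu, sigma, size=(reps, n))
    s2 = samples.var(axis=1, ddof=1)   # unbiased sample variances
    q = (n - 1) * s2 / sigma**2        # should follow a chi-squared with n-1 df

    print(np.quantile(q, 0.95))        # close to the theoretical value below
    print(chi2.ppf(0.95, df=n - 1))    # about 16.92 for 9 degrees of freedom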
Figure 2.2: Some Chi-Squared Distributions
Definition 2.6 Suppose that $Z \sim N(0, 1)$, $Y \sim \chi_k^2$, and that $Z$ and $Y$ are independent. Then
$$\frac{Z}{\sqrt{Y/k}} \sim t_k, \qquad (2.14)$$
where $k$ are the degrees of freedom of the $t$ distribution.
The probability density function for a $t$ random variable with $n$ degrees of freedom is
$$f_t(x) = \frac{\Gamma\!\left(\frac{n+1}{2}\right)}{\sqrt{n\pi}\,\Gamma\!\left(\frac{n}{2}\right)}\left(1 + \frac{x^2}{n}\right)^{-(n+1)/2}, \qquad (2.15)$$
for $-\infty < x < \infty$. See Figure 2.3. The $t$ distribution (also known as Student's $t$) is named after W. S. Gosset, who published under the pseudonym "Student." It is useful in testing hypotheses concerning the (conditional) mean when the variance is estimated.
Example 2.8 Consider the sample mean from a simple random sample of normals. We know that $\bar{x} \sim N(\mu, \sigma^2/n)$ and
$$\frac{\bar{x} - \mu}{\sqrt{\sigma^2/n}} \sim N(0, 1).$$
Replacing the unknown $\sigma^2$ with the estimate $s^2$ yields
$$\frac{\bar{x} - \mu}{\sqrt{s^2/n}} \sim t_{n-1}.$$
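The practical consequence of estimating the variance is that critical values come from the $t$ distribution rather than the normal; the difference matters in small samples. A small sketch, assuming scipy:

    from scipy.stats import norm, t

    print(norm.ppf(0.975))  # 1.96, the two-sided 5% normal critical value
    for df in (5, 10, 30, 100):
        # t critical values shrink toward the normal value as df grows:
        # roughly 2.57, 2.23, 2.04, 1.98
        print(df, t.ppf(0.975, df))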
Definition 2.7 Suppose that $Y \sim \chi_m^2$, $W \sim \chi_n^2$, and that $Y$ and $W$ are independent. Then
$$\frac{Y/m}{W/n} \sim F_{m,n}, \qquad (2.17)$$
where $m, n$ are the degrees of freedom of the $F$ distribution.
The probability density function for an $F$ random variable with $m$ and $n$ degrees of freedom is
$$f_F(x) = \frac{\Gamma\!\left(\frac{m+n}{2}\right)(m/n)^{m/2}\, x^{(m/2)-1}}{\Gamma\!\left(\frac{m}{2}\right)\Gamma\!\left(\frac{n}{2}\right)\left(1 + \frac{mx}{n}\right)^{(m+n)/2}}.$$
The F distribution is named after the great statistician Sir Ronald A. Fisher, and is used in many applications, most notably in the analysis of variance. This situation will arise when we seek to test multiple (conditional) mean parameters with estimated variance. Note that when $x \sim t_n$, then $x^2 \sim F_{1,n}$. Some examples of the F distribution can be seen in Figure 2.4.
Figure 2.4: Some F Distributions
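The relationship between the $t$ and $F$ distributions noted above is easy to confirm numerically; a minimal sketch, assuming scipy:

    from scipy.stats import f, t

    n = 20
    t_crit = t.ppf(0.975, n)    # two-sided 5% critical value of t_n
    f_crit = f.ppf(0.95, 1, n)  # 5% critical value of F_{1, n}
    print(t_crit**2, f_crit)    # the two numbers coincide (about 4.35)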
Let
$$x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_m \end{bmatrix}$$
be an m × 1 vector-valued random variable. Each element of the vector is a scalar random variable of the type discussed in the previous chapter. The expectation of a random vector is
$$\mathrm{E}[x] = \begin{bmatrix} \mathrm{E}[x_1] \\ \mathrm{E}[x_2] \\ \vdots \\ \mathrm{E}[x_m] \end{bmatrix} = \begin{bmatrix} \mu_1 \\ \mu_2 \\ \vdots \\ \mu_m \end{bmatrix} = \mu. \qquad (3.1)$$
Note that μ is also an m × 1 column vector. We see that the mean of the vector is the vector of the means. Next, we evaluate the following:
$$\Sigma = \mathrm{E}[(x - \mu)(x - \mu)'] = \mathrm{E}\begin{bmatrix} (x_1-\mu_1)^2 & (x_1-\mu_1)(x_2-\mu_2) & \cdots & (x_1-\mu_1)(x_m-\mu_m) \\ (x_2-\mu_2)(x_1-\mu_1) & (x_2-\mu_2)^2 & \cdots & (x_2-\mu_2)(x_m-\mu_m) \\ \vdots & & \ddots & \vdots \\ (x_m-\mu_m)(x_1-\mu_1) & (x_m-\mu_m)(x_2-\mu_2) & \cdots & (x_m-\mu_m)^2 \end{bmatrix}$$
This matrix of expectations is the covariance matrix of $x$, denoted $\Sigma$. For any $m \times 1$ vector $a$, $a'\Sigma a = \mathrm{E}[(a'(x - \mu))^2] \ge 0$, since the term inside the expectation is a quadratic. Hence, $\Sigma$ is a positive semidefinite matrix, and it can be written as $\Sigma = PP'$ for some $m \times m$ matrix $P$. Note that $P$ satisfying this relationship is not unique. Let $D$ be any $m \times m$ orthonormal matrix; then $DD' = I_m$ and $P^* = PD$ yields $P^*P^{*\prime} = PDD'P' = PI_mP' = \Sigma$. Usually, we will choose $P$ to be an upper or lower triangular matrix with $m(m+1)/2$ nonzero elements.
Positive Definite
Since $\Sigma$ is a positive semidefinite matrix, it will be a positive definite matrix if and only if $\det(\Sigma) \ne 0$. Now, we know that $\Sigma = PP'$ for some $m \times m$ matrix $P$. This implies that $\det(P) \ne 0$.
Let $y = b + Bx$, where $y$ and $b$ are $m \times 1$ vectors and $B$ is an $m \times m$ matrix. Then
$$\mathrm{E}[y] = b + B\,\mathrm{E}[x] = b + B\mu = \mu_y. \qquad (3.3)$$
Thus, the mean of a linear transformation is the linear transformation of the mean. Next, we have
$$\mathrm{E}[(y - \mu_y)(y - \mu_y)'] = \mathrm{E}\{[B(x - \mu)][B(x - \mu)]'\} = B\,\mathrm{E}[(x - \mu)(x - \mu)']\,B' = B\Sigma B' \qquad (3.4)$$
$$= \Sigma_y, \qquad (3.5)$$
where we use the result $(ABC)' = C'B'A'$, if conformability holds.
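A short simulation confirms both moment results for the linear transformation; a sketch with arbitrary choices of $b$, $B$, $\mu$, and $\Sigma$:

    import numpy as np

    rng = np.random.default_rng(3)
    mu = np.array([1.0, 2.0])
    Sigma = np.array([[2.0, 0.5],
                      [0.5, 1.0]])
    b = np.array([0.5, -1.0])
    B = np.array([[1.0, 2.0],
                  [0.0, 3.0]])

    x = rng.multivariate_normal(mu, Sigma, size=200_000)
    y = b + x @ B.T                 # each row is b + B x_i

    print(y.mean(axis=0))           # close to b + B mu
    print(b + B @ mu)
    print(np.cov(y, rowvar=False))  # close to B Sigma B'
    print(B @ Sigma @ B.T)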
3.2 Change Of Variables
Let $x$ be a random variable and $f_x(\cdot)$ be the probability density function of $x$. Now, define $y = h(x)$, where
$$h'(x) = \frac{d\,h(x)}{d\,x} > 0.$$
That is, $h(x)$ is a strictly monotonically increasing function, and so $y$ is a one-to-one transformation of $x$. Now, we would like to know the probability density function of $y$, $f_y(y)$. To find it, we note that
$$\Pr(y \le h(a)) = \Pr(x \le a), \qquad (3.6)$$
$$\Pr(x \le a) = \int_{-\infty}^{a} f_x(x)\,dx = F_x(a), \qquad (3.7)$$
and
$$\Pr(y \le h(a)) = \int_{-\infty}^{h(a)} f_y(y)\,dy = F_y(h(a)), \qquad (3.8)$$
for all $a$. Assuming that the cumulative distribution functions are differentiable, we use (3.6) to combine (3.7) and (3.8), and take the total differential, which gives us
$$dF_x(a) = dF_y(h(a)),$$
$$f_x(a)\,da = f_y(h(a))\,h'(a)\,da$$
for all a. Thus, for a small perturbation,
$$f_x(a) = f_y(h(a))\,h'(a) \qquad (3.9)$$
for all $a$. Also, since $y$ is a one-to-one transformation of $x$, we know that $h(\cdot)$ can be inverted. That is, $x = h^{-1}(y)$. Thus, $a = h^{-1}(y)$, and we can rewrite (3.9) as
$$f_x(h^{-1}(y)) = f_y(y)\,h'(h^{-1}(y)).$$
Therefore, the probability density function of y is
$$f_y(y) = \frac{f_x(h^{-1}(y))}{h'(h^{-1}(y))}. \qquad (3.10)$$
Note that $f_y(y)$ is nonnegative, since $h'(\cdot) > 0$. If $h'(\cdot) < 0$, (3.10) can be corrected by taking the absolute value of $h'(\cdot)$, which assures that the probability density function takes only nonnegative values.
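As a concrete check of (3.10), take the monotone transformation $y = h(x) = e^x$ with $x \sim N(0, 1)$, so that $h^{-1}(y) = \ln y$ and $h'(h^{-1}(y)) = y$; the result should match the known lognormal density. A sketch, assuming scipy:

    import numpy as np
    from scipy.stats import lognorm, norm

    y = np.linspace(0.1, 5.0, 50)

    # Change-of-variables formula (3.10): f_y(y) = f_x(h^{-1}(y)) / h'(h^{-1}(y)).
    fy_formula = norm.pdf(np.log(y)) / y

    # Known answer: exp of a standard normal is lognormal with shape parameter 1.
    fy_known = lognorm.pdf(y, s=1.0)

    print(np.max(np.abs(fy_formula - fy_known)))  # essentially zero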
Consider the graph of the relationship shown in Figure 3.1. We know that
Pr[ h( b ) > y > h( a )] = Pr( b > x > a ).
Also, we know that
$$\Pr[h(b) > y > h(a)] \approx f_y[h(b)]\,[h(b) - h(a)],$$
We also assume that $\partial h(x)/\partial x'$ exists. This is the $m \times m$ Jacobian matrix, where
$$\frac{\partial h(x)}{\partial x'} = \frac{\partial}{\partial(x_1\, x_2\, \cdots\, x_m)}\begin{bmatrix} h_1(x) \\ h_2(x) \\ \vdots \\ h_m(x) \end{bmatrix} = \begin{bmatrix} \frac{\partial h_1(x)}{\partial x_1} & \frac{\partial h_2(x)}{\partial x_1} & \cdots & \frac{\partial h_m(x)}{\partial x_1} \\ \frac{\partial h_1(x)}{\partial x_2} & \frac{\partial h_2(x)}{\partial x_2} & \cdots & \frac{\partial h_m(x)}{\partial x_2} \\ \vdots & \vdots & & \vdots \\ \frac{\partial h_1(x)}{\partial x_m} & \frac{\partial h_2(x)}{\partial x_m} & \cdots & \frac{\partial h_m(x)}{\partial x_m} \end{bmatrix} = J_x(x). \qquad (3.12)$$
Given this notation, the multivariate analog to (3.11) can be shown to be
$$f_y(y) = \frac{f_x[h^{-1}(y)]}{|\det(J_x[h^{-1}(y)])|}.$$
Since $h(\cdot)$ is differentiable and one-to-one, $\det(J_x[h^{-1}(y)]) \ne 0$.
Example 3.1 Let $y = b_0 + b_1 x$, where $x$, $b_0$, and $b_1$ are scalars. Then
$$x = \frac{y - b_0}{b_1} \quad \text{and} \quad \frac{dy}{dx} = b_1.$$
Therefore,
$$f_y(y) = f_x\!\left(\frac{y - b_0}{b_1}\right)\frac{1}{|b_1|}.$$
Example 3.2 Let $y = b + Bx$, where $y$ is an $m \times 1$ vector and $\det(B) \ne 0$. Then
$$x = B^{-1}(y - b) \quad \text{and} \quad \frac{\partial y}{\partial x'} = B = J_x(x).$$
Thus,
$$f_y(y) = f_x\!\left[B^{-1}(y - b)\right]\frac{1}{|\det(B)|}.$$
3.3 Multivariate Normal Distribution
Definition 3.1 An $m \times 1$ random vector $z$ is said to be spherically normally distributed if
$$f(z) = \frac{1}{(2\pi)^{m/2}}\, e^{-\frac{1}{2} z'z}.$$
Such a random vector can be seen to be a vector of independent standard normals. Let $z_1, z_2, \ldots, z_m$ be i.i.d. random variables such that $z_i \sim N(0, 1)$. That is, $z_i$ has the pdf given in (2.8), for $i = 1, \ldots, m$. Then, by independence, the joint distribution of the $z_i$'s is given by
$$f(z_1, z_2, \ldots, z_m) = f(z_1)\,f(z_2)\cdots f(z_m) = \prod_{i=1}^{m} \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2} z_i^2} = \frac{1}{(2\pi)^{m/2}}\, e^{-\frac{1}{2}\sum_{i=1}^{m} z_i^2} = \frac{1}{(2\pi)^{m/2}}\, e^{-\frac{1}{2} z'z}, \qquad (3.14)$$
where $z' = (z_1\; z_2\; \ldots\; z_m)$.
Definition 3.2 The $m \times 1$ random vector $x$ with density
$$f_x(x) = \frac{1}{(2\pi)^{m/2}\,[\det(\Sigma)]^{1/2}}\, e^{-\frac{1}{2}(x - \mu)'\Sigma^{-1}(x - \mu)} \qquad (3.15)$$
is said to be distributed multivariate normal with mean vector $\mu$ and positive definite covariance matrix $\Sigma$.
Such a distribution for $x$ is denoted by $x \sim N(\mu, \Sigma)$. The spherical normal distribution is seen to be a special case where $\mu = 0$ and $\Sigma = I_m$. There is a one-to-one relationship between the multivariate normal random vector and a spherical normal random vector. Let $z$ be an $m \times 1$ spherical normal random vector and
$$x = \mu + Az,$$
where $z$ is defined above, and $\det(A) \ne 0$. Then,
$$\mathrm{E}[x] = \mu + A\,\mathrm{E}[z] = \mu, \qquad (3.16)$$
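This construction is how multivariate normal draws are typically generated in practice: take $A$ to be a matrix square root of $\Sigma$, for example its Cholesky factor. A sketch with an arbitrary $\mu$ and $\Sigma$:

    import numpy as np

    rng = np.random.default_rng(4)
    mu = np.array([1.0, -2.0, 0.5])
    Sigma = np.array([[4.0, 1.0, 0.0],
                      [1.0, 2.0, 0.3],
                      [0.0, 0.3, 1.0]])

    A = np.linalg.cholesky(Sigma)           # lower triangular, A A' = Sigma
    z = rng.standard_normal((100_000, 3))   # rows are spherical normal draws
    x = mu + z @ A.T                        # x_i = mu + A z_i

    print(x.mean(axis=0))                   # close to mu
    print(np.cov(x, rowvar=False))          # close to Sigma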