




























































































Material Type: Exam; Class: APPLIED ECONOMETRICS; Subject: Economics; University: Clark University; Term: Unknown 1989;
The purpose of this course is to:
The guiding principle is learning by doing! It is, therefore, important to actively participate in the exercises which are all based on real data sets. Half of the exam questions will be related to the exercises.
The aim of this section is to discuss how the analysis of cross-sectional data and panel data differs from the analysis of time-series data. It is important to understand that in time-series modelling we have to make a number of simplifying assumptions regarding the constancy of the mean and the covariances over time, some of which cannot be tested because we have just one realization per time period. This is contrary to panel data analysis, where several observations are available per time period. To test constancy of parameters (i.e. constancy of the covariances of the data over time) we often need quite long time series. This means that it is almost impossible to know whether some macroeconomic mechanisms have changed as a result of a regime change, for example the adoption of the Euro. Thus, when interpreting the results from macroeconometric models, it is important to have a realistic sense of the reliability of the results.
The notation in econometrics is far from standardized and it is important from the outset to get used to the fact that different authors use different notations for the same concepts. Though there will be occasional exceptions, the following notation will generally be used during the course. In MV (Marno Verbeek), Y denotes a random variable and y a realization of the random variable. For time-series data we often use the notation y_t both for a random variable and its realization. It is also customary to use capital letters (for example C_t) to denote a variable before the log transformation and lower-case letters (for example c_t) to denote ln C_t. In the following we use the notation:

- y_t is the dependent/endogenous variable (the regressand),
- x_{i,t} is an explanatory/exogenous variable (a regressor),
- β_i is the theoretical regression coefficient,
- b_i or β̂_i is an estimate of β_i, whereas the formula (for example b = (X'X)^{-1}X'y) is called an estimator,
- σ²_y or σ_{yy} denote the theoretical variance of y; the former is often used in connection with a single time series, the latter in connection with a covariance matrix,
- s²_y/σ̂²_y or s_{yy}/σ̂_{yy} are the corresponding estimates,
- Σ is often used to denote a matrix of theoretical variances and covariances, and Σ̂ denotes the corresponding estimated variances and covariances.
[Figure 2: graph of realizations y_1, ..., y_6 with time-varying means μ_1, ..., μ_6; plot not reproduced.]

Figure 2. E(y_t) = μ_t, Var(y_t) = σ²_y, t = 1, ..., 6
In the two examples, the line connecting the realizations y_t produces the graph of the time series. For instance, in Figure 1 we have assumed that the distribution, the mean value and the variance are the same for each y_t, t = 1, ..., T. In Figure 2 the distribution and the variance are identical, but the mean varies with t. Note that the observed time graph is the same in both cases, illustrating the fact that we often need rather long time series to be able to statistically distinguish between different hypotheses in time-series models. To be able to make statistical inference we need:
(i) a probability model for y_t, for example the normal model;
(ii) a sampling model for y_t, for example dependent or independent drawings.
For the normal distribution, the first two moments around the mean are sufficient to describe the variation in the data. Without simplifying assumptions on the time-series process we have the general formulation for t = 1, ..., T:
E(y_t) = μ_t
Var(y_t) = E(y_t − μ_t)² = σ_{t,t.0}
Cov(y_t, y_{t−h}) = E[(y_t − μ_t)(y_{t−h} − μ_{t−h})] = σ_{t,t−h.h},  h = ..., −1, 1, ...
E[y] = E[(y_1, y_2, ..., y_T)'] = (μ_1, μ_2, ..., μ_T)' = μ
Cov[y] = E[y − E(y)][y − E(y)]' =

  ⎡ σ_{11.0}     σ_{12.1}     σ_{13.2}    ···  σ_{1T.T−1} ⎤
  ⎢ σ_{21.1}     σ_{22.0}     σ_{23.1}    ···  σ_{2T.T−2} ⎥
  ⎢ σ_{31.2}     σ_{32.1}     σ_{33.0}    ···  σ_{3T.T−3} ⎥
  ⎢    ⋮            ⋮            ⋮                 ⋮      ⎥
  ⎣ σ_{T1.T−1}   σ_{T2.T−2}   σ_{T3.T−3}  ···  σ_{TT.0}   ⎦
y = (y_1, y_2, ..., y_T)' ∼ N(μ, Σ)
Because there is just one realization of the process at each time t, there is not enough information to make statistical inference about the underlying functional form of the distribution of each y_t, t = 1, ..., T, and we have to make simplifying assumptions to secure that the number of parameters describing the process is smaller than the number of observations available. A typical assumption in time-series models is that each y_t has the same distribution and that the functional form is approximately normal. Furthermore, given the normal distribution, it is frequently assumed that the mean is the same, i.e. E(y_t) = μ_y for t = 1, ..., T, and that the variance is the same, i.e. E(y_t − μ)² = σ²_y for t = 1, ..., T.
We will now move on to the more interesting case where we observe a variable y_t (the 'endogenous' variable) and k explanatory variables x_{i,t}, i = 1, ..., k. In this case we need to discuss covariances between the variables {y_t, x_{i,t}} at time t as well as covariances between t and t − h. The covariances contain information about static and dynamic relationships between the variables which we would like to uncover using econometrics. For notational simplicity, y_t and x_{i,t} will here denote both a random variable and its realization. We consider the vector z_t = (y_t, x_{1,t}, ..., x_{k,t})' and stack the observations in Z = (z_1', z_2', ..., z_T')', where Z is a (k + 1)T × 1 vector. The covariance matrix Σ̃ is given by
Σ̃ = E[(Z − μ)(Z − μ)'],  a matrix of dimension T(k+1) × T(k+1),
where Σ_{t.h} = Cov(z_t, z_{t−h}) = E(z_t − μ_t)(z_{t−h} − μ_{t−h})'. The above notation provides a completely general description of a multivariate vector time-series process. Since there are far more parameters than observations available for estimation, it is of no practical use as it stands. Therefore, we have to make simplifying assumptions to reduce the number of parameters. Empirical models are typically based on the following assumptions:
1. E(z_t) = μ for all t (a constant mean), and
2. Cov(z_t, z_{t−h}) = Σ_h for all t (covariances that depend only on the lag h, not on t).

These two assumptions are needed to secure parameter constancy in the dynamic regression model to be discussed subsequently. When the assumptions are satisfied we can write the mean and the covariances of the data matrix in the simplified form:
μ̃ = (μ', μ', ..., μ')',  and Σ̃ becomes a block matrix whose (t, s) block is Σ_{t−s}, so that all covariance information is summarized by Σ_0, Σ_1, Σ_2, ...
The static regression model generally disregards the information contained in Σ_i, i ≠ 0. Thus the static regression model is based only on the information in Σ_0.
The above two assumptions for infinite T define a weakly stationary process:
Definition 1 Let {y_t} be a stochastic process (an ordered series of random variables) for t = ..., −1, 0, 1, 2, .... If

E[y_t] = μ < ∞ for all t,
E[y_t − μ]² = σ² < ∞ for all t,
E[(y_t − μ)(y_{t+h} − μ)] = σ_{.h} < ∞ for all t and h = 1, 2, ...,

then {y_t} is said to be weakly stationary. Strict stationarity requires that the distribution of (y_{t_1}, ..., y_{t_k}) is the same as that of (y_{t_1+h}, ..., y_{t_k+h}) for h = ..., −1, 1, 2, ....
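The definition above can be illustrated with a small simulation. The sketch below (not part of the course material; all numbers are illustrative) generates a stationary AR(1) process y_t = φ y_{t−1} + ε_t with |φ| < 1, whose theoretical moments are constant over t: E[y_t] = 0, Var(y_t) = σ²/(1 − φ²), and Corr(y_t, y_{t−1}) = φ.

```python
import numpy as np

# A minimal sketch of weak stationarity: a stationary AR(1) process
# y_t = phi*y_{t-1} + eps_t with |phi| < 1 has constant mean 0,
# constant variance sigma^2/(1 - phi^2), and lag-h autocorrelation phi^h.
rng = np.random.default_rng(0)
phi, sigma, T = 0.6, 1.0, 200_000

eps = rng.normal(0.0, sigma, T)
y = np.empty(T)
y[0] = eps[0]
for t in range(1, T):
    y[t] = phi * y[t - 1] + eps[t]

var_theory = sigma**2 / (1 - phi**2)
# lag-1 sample autocovariance
acov1 = np.mean((y[1:] - y.mean()) * (y[:-1] - y.mean()))

print(round(y.var(), 2), round(var_theory, 4), round(acov1 / y.var(), 2))
```

With a long enough sample, the sample variance approaches σ²/(1 − φ²) and the lag-1 sample autocorrelation approaches φ, in line with the constancy of moments that weak stationarity requires.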
The data set is defined by [c^r_t, y^r_t, w^r_t, R_{b,t}, Δp_t, p_{h,t} − p_{c,t}], t = 1973:1, ..., 2003:1, where

c^r_t = c_t − p_t is a measure of real private consumption at time t, where c_t is the log of nominal consumption expenditure in Denmark and p_t is the log of the implicit consumption deflator,
y^r_t is the log of real domestic expenditure, GNE,
R_{b,t} is the 10-year government bond rate,
Δp_t is the quarterly inflation rate measured by the implicit consumption deflator, and
p_{h,t} − p_{c,t} is the log difference between the house price deflator and the consumption deflator.
Figures 3 and 4 show the graphs of the data in levels and in first differences.
[Figure 4: panels Rb, DRb, DLPc, DDLPc, LphLPc and DLphLPc over 1970-2000; plot not reproduced.]

Figure 4: The graphs of the bond rate, inflation rate, and relative house-consumption prices in levels and differences.
is related to the lifting of previous restrictions on capital movements and the start of the ‘hard’ EMS in 1983.
These are realistic examples that point to the need to include additional information on interventions and institutional reforms in the empirical model analysis. This can be done by including new variables measuring the effect of institutional reforms or, if such variables are not available, by using dummy variables as a proxy for the change in institutions.
At the start of the empirical analysis it is not always possible to know whether an intervention was strong enough to produce an 'extraordinary' effect. Essentially every month, quarter, and year is subject to some kind of political intervention; most of them have a minor impact on the data and the model. Thus, if an ordinary intervention does not 'stick out' as an outlier, it will for practical reasons be treated as a random shock. Major interventions, like removing restrictions on capital movements or joining the EMS, are likely to have a much more fundamental impact on economic behavior and hence need to be included in the systematic part of the model. Ignoring this problem is likely to seriously bias all estimates of the model and result in invalid inference.
It is always a good idea to start with a visual inspection of the data and their time-series properties as a first check of the assumptions of the linear regression model. Based on the graphs we can get a first impression of whether x_{i,t} looks stationary with constant mean and variance, or whether this is the case for Δx_{i,t}. If the answer is negative to the first question but positive to the second, we can solve the problem by respecifying the model in error-correction form, as will be subsequently demonstrated. If the answer is negative to both questions, it is often a good idea to check the economic calendar to find out whether any significant departure from the constant mean and constant variance coincides with specific reforms or interventions. The next step is then to include this information in the model and find out whether the intervention or reform changed the parameters of the model. If so, the intervention is likely to have caused a regime shift and the model would need to be re-specified allowing for the shift in the model structure. We will subsequently discuss procedures to check for parameter constancy over the chosen sample period.
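The pattern this graphical check is meant to reveal can be reproduced with simulated data (an illustrative sketch, not the Danish series): a random walk x_t = x_{t−1} + ε_t has no constant mean, while its first difference Δx_t = ε_t does.

```python
import numpy as np

# Illustrative sketch with assumed data: levels of a random walk wander,
# while the first differences fluctuate around a constant mean.
rng = np.random.default_rng(42)
eps = rng.normal(size=400)
x = np.cumsum(eps)        # levels: nonstationary (random walk)
dx = np.diff(x)           # first differences: stationary

# Compare sub-sample means: they drift apart for levels, not for differences.
m1, m2 = x[:200].mean(), x[200:].mean()
d1, d2 = dx[:200].mean(), dx[200:].mean()
print(round(abs(m1 - m2), 2), round(abs(d1 - d2), 2))
```

In practice one plots x_t and Δx_t (as in Figures 3 and 4) and makes this comparison visually; the split-sample means are just a crude numerical stand-in for that inspection.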
The covariance of two variables, xk and xj , is defined as:
cov(x_k, x_j) = E[(x_k − μ_k)(x_j − μ_j)] = E(x_k x_j) − μ_k μ_j = Σ_i f_i (x_{k,i} − μ_k)(x_{j,i} − μ_j) = σ_{kj}
and an unbiased estimator is given by:
côv(x_k, x_j) = σ̂_{kj} = s_{kj} = (1/(T − 1)) Σ_{t=1}^{T} (x_{kt} − x̄_k)(x_{jt} − x̄_j)
A positive (i.e., upward-sloping) linear relationship between the variables will give a positive covariance, and a negative (i.e., downward-sloping) linear relationship gives a negative covariance. The variance-covariance matrix of a set of variables {xi, xj , xk} is given by:
Σ_{x.0} =

  ⎡ var(x_i)       cov(x_i, x_j)  cov(x_i, x_k) ⎤     ⎡ σ_{ii}  σ_{ij}  σ_{ik} ⎤
  ⎢ cov(x_i, x_j)  var(x_j)       cov(x_j, x_k) ⎥  =  ⎢ σ_{ji}  σ_{jj}  σ_{jk} ⎥
  ⎣ cov(x_i, x_k)  cov(x_j, x_k)  var(x_k)      ⎦     ⎣ σ_{ki}  σ_{kj}  σ_{kk} ⎦
with the variances on the diagonal and the covariances off the diagonal. The subscript 0 in Σ_{x.0} indicates that the covariances have been calculated from current (not lagged) values of the variables, i.e. h = 0. The sample standard deviation of the variable x_i is given by:
σ̂_i = sqrt(σ̂²_i) = sqrt( (1/(T − 1)) Σ_{t=1}^{T} (x_{it} − x̄_i)² )
The sample correlation coefficient between two variables, xi and xj , is given by:
r_{ij} = corr(x_i, x_j) = Σ_t (x_{it} − x̄_i)(x_{jt} − x̄_j) / ( sqrt(Σ_t (x_{it} − x̄_i)²) · sqrt(Σ_t (x_{jt} − x̄_j)²) )
The relationship between the standard deviation, the covariance, and the correlation coefficient can be obtained by multiplying the numerator and the denominator by 1/(T − 1), to give:

r_{ij} = (1/(T − 1)) Σ_t (x_{it} − x̄_i)(x_{jt} − x̄_j) / ( sqrt((1/(T − 1)) Σ_t (x_{it} − x̄_i)²) · sqrt((1/(T − 1)) Σ_t (x_{jt} − x̄_j)²) )
which reduces to:
r_{ij} = σ̂_{ij} / (σ̂_i · σ̂_j)
The correlation coefficient measures the strength of the linear relationship between the variables. Perfect negative and positive linear relationships are indicated by r = −1 and r = 1, respectively, and a value of r = 0 indicates no linear relationship. Its interpretation is strictly limited to linear relationships.
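The identity r_{ij} = σ̂_{ij}/(σ̂_i · σ̂_j) can be verified numerically. The sketch below uses simulated data (not the Danish series) and checks the hand-built standardized covariance against numpy's built-in correlation.

```python
import numpy as np

# Sketch with assumed data: the sample correlation is the sample covariance
# standardized by the two sample standard deviations, r = s_xy / (s_x * s_y).
rng = np.random.default_rng(7)
x = rng.normal(size=200)
y_series = 0.8 * x + rng.normal(scale=0.5, size=200)

s_xy = np.cov(x, y_series, ddof=1)[0, 1]     # unbiased sample covariance
r_manual = s_xy / (x.std(ddof=1) * y_series.std(ddof=1))
r_numpy = np.corrcoef(x, y_series)[0, 1]     # numpy's correlation coefficient
print(round(r_manual, 4))
```

Note the ddof=1 choices: the (T − 1) divisors cancel in the ratio, which is exactly the point of the derivation above.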
The estimated correlation matrix, i.e. the standardized covariance matrix Σ̂_0, for the Danish consumption data (output from the PcGive Descriptive Statistics package) is:
Means, standard deviations and correlations (using consumption.in7)
The sample is 1973 (1) - 2003 (1)

Means
          LrC       LrY       LrW        Rb      DLPc
       6.1337    6.7752    8.2741  0.029070        0.

Standard deviations (using T-1)
          LrC       LrY       LrW        Rb      DLPc
      0.11663   0.13811   0.13517  0.011418        0.

Correlation matrix:
           LrC       LrY       LrW        Rb      DLPc
LrC     1.0000   0.98299   0.97384  -0.86493  -0.70119
LrY    0.98299    1.0000   0.97986  -0.87192  -0.64968
LrW    0.97384   0.97986    1.0000  -0.88104  -0.67227
Rb    -0.86493  -0.87192  -0.88104    1.0000   0.67992
DLPc  -0.70119  -0.64968  -0.67227   0.67992    1.0000
Note that the non-standardized covariances can be derived using formula (1.2).
and

σ_{yy.x} = σ_{yy} − σ_{yx} σ_{xx}^{−1} σ_{xy}    (1.4)
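Formula (1.4) can be evaluated directly for a small partitioned covariance matrix. The sketch below uses assumed numbers purely for illustration; it computes the conditional (residual) variance of y given x and confirms that conditioning can only reduce variance.

```python
import numpy as np

# Sketch of (1.4) with assumed numbers: partition the joint covariance matrix
# into [[s_yy, s_yx], [s_xy, s_xx]]; the conditional variance of y given x is
# s_yy.x = s_yy - s_yx s_xx^{-1} s_xy.
s_yy = 4.0
s_yx = np.array([[1.0, 0.5]])          # 1 x 2 block Cov(y, x)
s_xx = np.array([[2.0, 0.3],
                 [0.3, 1.0]])          # 2 x 2 block Cov(x)

s_yy_x = (s_yy - s_yx @ np.linalg.inv(s_xx) @ s_yx.T).item()
print(round(s_yy_x, 4))
```

Since the subtracted quadratic form is non-negative for a positive definite Σ_{xx}, the conditional variance is never larger than the unconditional one.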
The joint distribution of zt can now be expressed as the product of the conditional and the marginal distribution:
P(y_t, x_t; θ) = P(y_t | x_t; θ_1) × P(x_t; θ_2)
(joint distribution = conditional distribution × marginal distribution)
The linear regression model:
yt = β 0 + β 1 xt + εt
corresponds to the conditional expectation of y_t for given values of x_t (or, alternatively, when keeping x_t fixed).
The linear regression model, in matrix notation, can either be written as:
y_t = β'x_t + ε_t,  t = 1, ..., T
where β is a k × 1 vector of coefficients, xt is a k × 1 vector of explanatory variables, including a constant, or in more compact form as:
y = Xβ + ε (2.1)
where y is a (T × 1) vector, X is a (T × k) matrix whose first column consists of 1's, β is a (k × 1) vector, and ε is a (T × 1) vector. Estimation of the standard linear model by ordinary least squares (OLS) is motivated by the Gauss-Markov theorem, which states that the OLS estimators are best linear unbiased estimators (b.l.u.e.). Least-squares estimators are 'best' in the sense that, among the class of linear unbiased estimators, they have the smallest variance under the following assumptions.
To minimize the sum of squared residuals, take the derivative of e'e with respect to β̂ and set it equal to zero:
∂(e'e)/∂β̂ = −2X'y + 2X'Xβ̂ = 0
which yields
(X'X)β̂ = X'y.
If the matrix X has full rank, the design matrix X'X is invertible and we can find the OLS estimator as:

β̂ = (X'X)^{−1} X'y.    (2.2)
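Formula (2.2) translates directly into code. The sketch below uses simulated data (all names and numbers are illustrative) and cross-checks the textbook formula against numpy's least-squares solver.

```python
import numpy as np

# Minimal sketch of (2.2): compute b = (X'X)^{-1} X'y on simulated data.
rng = np.random.default_rng(1)
T, beta_true = 100, np.array([1.0, 2.0, -0.5])

# Design matrix with a constant in the first column, as in the text.
X = np.column_stack([np.ones(T), rng.normal(size=(T, 2))])
y = X @ beta_true + rng.normal(scale=0.1, size=T)

b_ols = np.linalg.inv(X.T @ X) @ (X.T @ y)        # textbook formula (2.2)
b_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)   # numerically safer route
print(np.round(b_ols, 2))
```

In applied work one would prefer the lstsq/QR route, since explicitly inverting X'X is numerically fragile when the regressors are nearly collinear; the formula is shown here because it is the object the derivation manipulates.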
To derive the variance of the OLS estimate β̂ we insert the value of y from (2.1) into (2.2):
β̂ = (X'X)^{−1}X'(Xβ + ε) = (X'X)^{−1}(X'X)β + (X'X)^{−1}X'ε = β + (X'X)^{−1}X'ε
and
E(β̂) = β + E[(X'X)^{−1}X'ε] = β + (X'X)^{−1}E(X'ε)
Under the assumption A.2:
E(X'ε) = 0
the OLS estimator is unbiased. To derive the standard error of estimate σ_β̂ we first express the deviation of the OLS estimate from the true value (see (2.3)) as:

(β̂ − β) = (X'X)^{−1}X'ε.
The variance can then be expressed as:
E{(β̂ − β)(β̂ − β)'} = E{(X'X)^{−1}X'εε'X(X'X)^{−1}}
By assumption A.5, E(εε') = σ²_ε I_T, and we obtain:

var(β̂) = σ²_β̂ = (X'X)^{−1}X'σ²_ε I_T X(X'X)^{−1} = σ²_ε(X'X)^{−1}
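The result var(β̂) = σ²_ε(X'X)^{−1} can be checked by Monte Carlo. The sketch below (illustrative numbers, not course data) holds X fixed across replications, matching the fixed-regressor assumption used in the derivation, and compares the empirical covariance of the OLS estimates with the theoretical matrix.

```python
import numpy as np

# Sketch: Monte Carlo check of var(b) = sigma^2 (X'X)^{-1} with X held fixed.
rng = np.random.default_rng(2)
T, sigma = 50, 1.0
X = np.column_stack([np.ones(T), rng.normal(size=T)])
XtX_inv = np.linalg.inv(X.T @ X)
var_theory = sigma**2 * XtX_inv

reps = 5000
betas = np.empty((reps, 2))
for r in range(reps):
    y = X @ np.array([1.0, 2.0]) + rng.normal(scale=sigma, size=T)
    betas[r] = XtX_inv @ (X.T @ y)      # OLS estimate for this replication

var_mc = np.cov(betas.T)                # empirical covariance of the estimates
print(np.round(var_mc, 3))
print(np.round(var_theory, 3))
```

The two matrices agree up to simulation noise, which is the operational content of the formula.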
The normality assumption implies that β̂ is a linear combination of normally distributed variables. Since we know its mean and variance, it can be concluded that β̂ ∼ N(β, σ²_ε(X'X)^{−1}).
The OLS residual, e, is connected to the error term, ε, in the following way:
e'e = ε'Mε,

where M = I − X(X'X)^{−1}X' is an idempotent matrix of reduced rank T − k. An unbiased estimator of the residual error variance, σ̂²_ε, is:
σ̂²_ε = ε'Mε/(T − k) = e'e/(T − k)
or, equivalently,

σ̂²_ε = (1/(T − k)) Σ_{t=1}^{T} e_t² = RSS/(T − k)

where RSS stands for the residual sum of squares. Thus, note that

RSS = (T − k) · σ̂²_ε.

The square root of the estimated residual variance is the standard error of the regression, σ̂_ε, calculated by

σ̂_ε = sqrt( RSS/(T − k) )
Finally, as will be shown in the next chapter, the quadratic form ε'Mε is χ²-distributed when ε is normally distributed:

ε'Mε/σ²_ε = e'e/σ²_ε ∼ χ²(T − k)

where the degrees of freedom, T − k, are equal to the rank of the idempotent matrix M.
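The properties of M used above are easy to verify numerically. The sketch below (simulated data, illustrative only) builds the residual-maker matrix, checks that it is idempotent with trace, and hence rank, equal to T − k, and recovers RSS and σ̂²_ε from it.

```python
import numpy as np

# Sketch: M = I - X(X'X)^{-1}X' is idempotent (M M = M), its trace equals
# its rank T - k, and e = M y gives the OLS residuals.
rng = np.random.default_rng(3)
T, k = 40, 3
X = np.column_stack([np.ones(T), rng.normal(size=(T, k - 1))])
y = X @ np.ones(k) + rng.normal(size=T)

M = np.eye(T) - X @ np.linalg.inv(X.T @ X) @ X.T
e = M @ y                        # OLS residuals
rss = e @ e                      # residual sum of squares
sigma2_hat = rss / (T - k)       # unbiased estimate of the error variance

print(int(round(np.trace(M))), T - k)
```

For an idempotent matrix the trace equals the rank, which is why T − k appears both as the divisor in σ̂²_ε and as the degrees of freedom of the χ² distribution.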