GOODNESS OF FIT FOR LATTICE PROCESSES

JAVIER HIDALGO

Abstract. The paper discusses tests for the correct specification of a model when data are observed in a $d$-dimensional lattice, extending previous work where the data are collected on the real line. As happens with the latter type of data, the asymptotic distributions of the tests are functionals of a Gaussian sheet process, say $B(\lambda)$, $\lambda\in[0,\pi]^d$. Because it is not easy to find a time transformation $h(\lambda)$ such that $B(h(\lambda))$ becomes the standard Brownian sheet, a consequence is that the critical values are difficult, if at all possible, to obtain. So, to overcome the problem of its implementation, we propose to employ a bootstrap approach, showing its validity in our context. JEL Classification: C21, C23.

1. INTRODUCTION

The paper is concerned with testing the goodness of fit of a parametric family of models for data collected in a lattice. More specifically, we are concerned with the correct specification (or model selection) of the dynamic structure of time series and/or spatial stationary processes $\{x(t)\}_{t\in\mathbb{Z}^d}$ defined on a $d$-dimensional lattice. The key idea of the test is to compare how close the parametric and nonparametric fits of the data are, to provide support for the null hypothesis. In the paper, we shall specifically consider data for which $d\le 3$. The motivation to focus on the case $d\le 3$ lies in the fact that the most common type of data available in economics is when $d=2$, say with agricultural or environmental data, or when $d=3$. An important example of the latter is the spatial-temporal data set, that is, data collected in a lattice during a number of periods. However, we ought to mention that extensions to higher-index lattice processes can be adapted under suitable modifications.

All throughout the paper we will assume that the (spatial) process $\{x(t)\}_{t\in\mathbb{Z}^d}$ can be represented by the multilateral model

$$(1.1)\qquad x(t)-\mu=\sum_{j\in\mathbb{Z}^d}\psi(j)\,\varepsilon(t-j);\qquad \sum_{j\in\mathbb{Z}^d}\psi^2(j)<\infty,\quad \psi(0)=1,$$

for some sequence $\{\varepsilon(t)\}_{t\in\mathbb{Z}^d}$ satisfying $E(\varepsilon(t))=0$ and $E(\varepsilon(0)\,\varepsilon(t))=\sigma_\varepsilon^2$ if $t=0$, and $=0$ for all $t\ne 0$. Notice that because our model is multilateral, the sequence $\{\varepsilon(t)\}_{t\in\mathbb{Z}^d}$ loses its interpretation as the "prediction" error, or the property that its elements can be regarded as innovations. Under (1.1), the spectral density function of $\{x(t)\}_{t\in\mathbb{Z}^d}$ can be factorized as

$$f(\lambda)=\frac{\sigma_\varepsilon^2}{(2\pi)^d}\,|\Psi(\lambda)|^2,\qquad \lambda\in\Pi^d,$$

where $\Pi=(-\pi,\pi]$ and with

$$(1.2)\qquad \Psi(\lambda)=\sum_{j\in\mathbb{Z}^d}\psi(j)\exp(-ij\cdot\lambda).$$

Date: 1 February 2008.
Key words and phrases. Goodness of fit tests. Spatial linear processes. Spectral domain. Bootstrap tests.

Henceforth the notation "$j\cdot\lambda$" means the inner product of the $d$-dimensional vectors $j$ and $\lambda$. The function $\Psi(\lambda)$ summarizes the covariogram structure of $\{x(t)\}_{t\in\mathbb{Z}^d}$, which is the main feature to obtain good and accurate prediction/extrapolation and/or interpolation (kriging) in the case of spatial data. Notice that the ultimate aim when modelling data is nothing but to predict the future. The aim of the paper is then to test whether the data support the null hypothesis that $\Psi(\lambda)$ belongs to a specific parametric family

$$(1.3)\qquad H=\left\{\Psi_\theta(\lambda):\theta\in\Theta\right\},$$

where $\Theta\subset\mathbb{R}^p$ is a proper compact parameter set. That is, we are interested in the null hypothesis

$$(1.4)\qquad H_0:\ \forall\lambda\in[-\pi,\pi]^d\ \text{and for some}\ \theta_0\in\Theta,\quad |\Psi(\lambda)|^2=|\Psi_{\theta_0}(\lambda)|^2.$$

The alternative hypothesis is the negation of $H_0$. Alternatively, we could have formulated the null hypothesis in terms of the covariogram given by $\{\gamma(s)\}_{s\in\mathbb{Z}^d}$, where $\gamma(s)=\mathrm{Cov}(x(t),x(t+s))$. That is, the null hypothesis is that the covariogram follows a particular parametric family, say $\{\gamma(s)\}_{s\in\mathbb{Z}^d}=\{\gamma_\vartheta(s)\}_{s\in\mathbb{Z}^d}$, where from now on we denote $\vartheta=\left(\theta',\sigma_\varepsilon^2\right)'$. This is the case after observing that, for any stationary spatial lattice process $\{x(t)\}_{t\in\mathbb{Z}^d}$, the spectral density $f(\lambda)$ and the covariogram $\gamma(s)$ are related through the expressions

$$\gamma(s)=\int_{\Pi^d}f(\lambda)\,e^{is\cdot\lambda}\,d\lambda;\qquad \gamma_\vartheta(s)=\int_{\Pi^d}f_\vartheta(\lambda)\,e^{is\cdot\lambda}\,d\lambda,\qquad s=0,\pm1,\pm2,\dots.$$

For the moment, it will be convenient to work in a general $d$-dimensional setting. Herewith, any element $a$ that belongs to $\mathbb{Z}^d$ (or $\Pi^d$), the $d$-fold Cartesian product of the set $\mathbb{Z}$ (or $\Pi$), is referred to as a multi-index of dimension $d$. Also, we shall write, say, $a=(a[1],\dots,a[d])$, with the square brackets used to denote the components of $a$. Some particular parameterizations of (1.1) or (2.12), given in Condition C1 below, are the ARMA field model

$$P(L)\,(x(t)-\mu)=Q(L)\,\varepsilon(t),$$

where

$$P(z)=\sum_{j\in\mathbb{Z}^d}p(j)\,z^j;\quad p(0)=1,\qquad\qquad Q(z)=\sum_{j\in\mathbb{Z}^d}q(j)\,z^j;\quad q(0)=1,$$

are finite series in $\mathbb{Z}^d$. That is, only a finite number of the $p(j)$'s and $q(j)$'s coefficients are non-zero. For instance, the ARMA field model given by

$$\sum_{j=-k_1}^{k_2}p(j)\,(x(t-j)-\mu)=\sum_{j=-\ell_1}^{\ell_2}q(j)\,\varepsilon(t-j),\qquad p(0)=q(0)=1,$$

whose spectral density function is

$$f(\lambda)=\frac{\sigma_\varepsilon^2}{(2\pi)^d}\,\frac{\Bigl|\sum_{j=-\ell_1}^{\ell_2}q(j)\,e^{-ij\cdot\lambda}\Bigr|^2}{\Bigl|\sum_{j=-k_1}^{k_2}p(j)\,e^{-ij\cdot\lambda}\Bigr|^2}.$$
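As a concrete illustration (a minimal sketch under assumed parameter values, not code or a model from the paper), the following Python snippet evaluates such a spectral density on the lattice of Fourier frequencies for a simple first-order autoregressive field, i.e. $Q(z)=1$ and $P(z)=1-p_{10}z_1-p_{01}z_2$; the coefficients `p10`, `p01` and the grid size are hypothetical.

```python
import numpy as np

# Hypothetical first-order AR field: (1 - p10 L1 - p01 L2) x(t) = eps(t),
# so f(lam) = sigma2 / (2*pi)^d * |Q(e^{i lam})|^2 / |P(e^{i lam})|^2 with Q = 1.
def spectral_density(lam1, lam2, p10=0.4, p01=0.3, sigma2=1.0):
    """Evaluate f(lam) on a grid of frequencies for the AR(1,1) field above."""
    P = 1.0 - p10 * np.exp(1j * lam1) - p01 * np.exp(1j * lam2)
    return sigma2 / (2.0 * np.pi) ** 2 / np.abs(P) ** 2

# Fourier frequencies 2*pi*k/n in [-pi, pi) on an n x n lattice
n = 64
lam = 2.0 * np.pi * np.fft.fftfreq(n)
LAM1, LAM2 = np.meshgrid(lam, lam, indexing="ij")
f = spectral_density(LAM1, LAM2)   # n x n array of f(lam_k)
```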

Notice that the ARMA field model becomes a causal representation if the polynomials $Q(L)$ and $P(L)$ are both unilateral. When, say, $Q(L)=1$, a condition for the latter is that $\int_{\Pi^d}\log P\left(e^{i\lambda}\right)d\lambda=0$. So that, in general, we can expect that $\int_{\Pi^d}\log\Psi_\theta\left(e^{i\lambda}\right)d\lambda\neq 0$ for any admissible value of $\theta$, see Guyon (1982b). The

face the strange situation that, with the same data set, two different practitioners might conclude differently, or that if a practitioner chose to optimize the size of the test, that choice would lead to tests which have very poor power properties, and vice versa. The latter is clearly not very appealing from either a theoretical or an applied standpoint. So, in this context, one of our main motivations is to extend goodness-of-fit tests examined and described when $d=1$ to $d\ge 1$, where we do not require the choice of any bandwidth parameter. For that purpose, we rely on the periodogram: although it is not a consistent estimator of $f(\lambda)$, its integral is a consistent estimator of the spectral distribution function, as the integral is the most natural smoothing algorithm.

The remainder of the paper is organized as follows. In the next section, we present the test and examine its asymptotic properties when the true value of the parameter $\theta_0$ is known, whereas Section 3 extends these results to more realistic situations where we need to estimate the parameters of the model. Because the asymptotic distribution of the test in the latter scenario is not pivotal and is model dependent, Section 4 describes the bootstrap test, showing its validity. Section 5 gives the proof of a series of lemmas employed in the proof of our main results in Section 6.

2. TESTS WHEN THE PARAMETERS ARE KNOWN

This section discusses and examines how we can test the null hypothesis $H_0$ given in (1.4). That is,

$$H_0:\ f(\lambda)=\frac{\sigma_\varepsilon^2}{(2\pi)^d}\,\left|\Psi_{\theta_0}(\lambda)\right|^2\qquad \forall\lambda\in\widetilde\Pi^d\ \text{for some value}\ \theta_0,$$

when the "true" value of $\theta_0$ is known, and where herewith $\widetilde\Pi^d$ denotes $[0,\pi]\times(-\pi,\pi]^{d-1}$, that is, $\lambda\in\widetilde\Pi^d$ if $\lambda[1]\in[0,\pi]$ and $\lambda[\ell]\in(-\pi,\pi]$ for $\ell=2,\dots,d$. Before we introduce and describe the test, we notice that the null hypothesis $H_0$ can be alternatively stated as

$$(2.1)\qquad H_0:\ \frac{G_{\theta_0}(\lambda)}{G_{\theta_0}(\bar\pi)}=\prod_{\ell=1}^{d}\frac{\lambda[\ell]}{\pi}\qquad \text{for all}\ \lambda\in[0,\pi]^d,$$

where $\bar\pi=(\pi,\dots,\pi)$,

$$G_\theta(\lambda)=2\int_{-\bar\pi}^{\lambda}\frac{f(\omega)}{|\Psi_\theta(\omega)|^2}\,d\omega,$$

with the notation

$$\int_{-\bar\pi}^{\lambda}=\int_{0}^{\lambda[1]}\int_{-\pi}^{\lambda[2]}\cdots\int_{-\pi}^{\lambda[d]}.$$

Under $H_0$, $G_{\theta_0}(\lambda)$ is the spectral distribution function of the lattice process $\{\varepsilon(t)\}_{t\in\mathbb{Z}^d}$ and $G_{\theta_0}(\bar\pi)=\sigma_\varepsilon^2$. Notice that, by the symmetry of $f(\lambda)$, it does not matter which coordinate we take to belong only to $[0,\pi]$. The consequence is that the choice does not affect the value of $G_\theta(\lambda)$, and so the value of the test given below.

Given a record $\{x(t)\}_{t=1}^{n}$, and denoting henceforth $N=\prod_{\ell=1}^{d}n[\ell]$, a natural estimator of $G_{\theta_0}(\lambda)$ is

$$\widehat G_{\theta,N}(\lambda)=\frac{(2\pi)^d}{N}\sum_{j=-[\tilde n\lambda/\pi]}^{[\tilde n\lambda/\pi]}\frac{I_x(\lambda_j)}{|\Psi_\theta(\lambda_j)|^2},$$

where $I_v(\lambda)$ denotes the periodogram of a generic sequence $\{v(t)\}_{t=1}^{n}$,

$$I_v(\lambda)=\frac{1}{N}\Bigl|\sum_{t=1}^{n}v(t)\,e^{-it\cdot\lambda}\Bigr|^2,\qquad \lambda\in\widetilde\Pi^d,$$

and, similarly to the definition of $\int_{-\bar\pi}^{\lambda}$, we are employing henceforth the notation

$$\sum_{j=-[\tilde n\lambda/\pi]}^{[\tilde n\lambda/\pi]}=\sum_{j[1]=1}^{[\tilde n[1]\lambda[1]/\pi]_+}\ \sum_{j[2]=-[\tilde n[2]\lambda[2]/\pi]}^{[\tilde n[2]\lambda[2]/\pi]}\cdots\sum_{j[d]=-[\tilde n[d]\lambda[d]/\pi]}^{[\tilde n[d]\lambda[d]/\pi]},$$

where $[q]_+=\max\{|q|,1\}$. Also, we have abbreviated $[n[\ell]/2]$ by $\tilde n[\ell]$ for $\ell=1,\dots,d$.

As usual, we have excluded the frequency $\lambda_j=0$ from the sum $\sum_{j=-[\tilde n\lambda/\pi]}^{[\tilde n\lambda/\pi]}$, so that we can take $Ex(t)=0$ or assume that $x(t)$ has been centered around its sample mean. It is often the case in real applications that, in order to make use of the fast Fourier transform, the periodogram is evaluated at the Fourier frequencies, that is, $\lambda_k=\left(\lambda_{k[1]},\dots,\lambda_{k[d]}\right)$, where

$$\lambda_{k[1]}=\frac{2\pi k[1]}{n[1]},\quad k[1]=0,1,\dots,\tilde n[1];\qquad \lambda_{k[\ell]}=\frac{2\pi k[\ell]}{n[\ell]},\quad k[\ell]=0,\pm1,\dots,\pm\tilde n[\ell],\quad \ell=2,\dots,d.$$
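As an illustration of this evaluation (a hedged sketch, not code from the paper), the $d$-dimensional periodogram at all Fourier frequencies can be obtained with a single FFT:

```python
import numpy as np

def lattice_periodogram(x):
    """Periodogram I_x(lam_k) = |sum_t x(t) e^{-i t.lam_k}|^2 / N of a d-dim
    array x, at the Fourier frequencies lam_k = 2*pi*k/n, via one FFT."""
    N = x.size
    w = np.fft.fftn(x)            # DFT over all d axes at once
    return np.abs(w) ** 2 / N

# toy 2-d white-noise field: E I(lam_k) should be close to sigma2 = 1
rng = np.random.default_rng(0)
x = rng.standard_normal((32, 32))
I = lattice_periodogram(x - x.mean())   # centring kills the lam_k = 0 spike
```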

Unfortunately, as noted by Guyon (1982a), due to non-negligible end effects, the bias of the periodogram does not converge to zero fast enough when $d>1$, so that it has unwanted consequences. One of these is for the Whittle estimator of the parameters $\vartheta$, see Guyon (1982a), which does not have the standard asymptotic properties it enjoys when $d=1$. Because of that, in the paper we shall employ the taper periodogram, defined as

$$I_v^T(\lambda_j)=\bigl|w_v^T(\lambda_j)\bigr|^2,$$

where

$$w_v^T(\lambda_j)=\frac{1}{\left(\sum_{t=1}^{n}h^2(t)\right)^{1/2}}\sum_{t=1}^{n}h(t)\,v(t)\,e^{-it\cdot\lambda_j}$$

is the taper discrete Fourier transform of a generic sequence $\{v(t)\}_{t=1}^{n}$. Tapering is primarily a technique employed to reduce the bias of the "standard" periodogram $I_v(\lambda)$. Notice that when $h(t)=1$, the taper discrete Fourier transform $w_v^T(\lambda_j)$ becomes the standard discrete Fourier transform (DFT).
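The following sketch illustrates the tapered DFT; the cosine-bell form $h(t)=\tfrac12\left(1-\cos(2\pi t/n)\right)$ is an assumption here, since the paper's equation (2.4) is not visible in this preview. Because the taper is a product over coordinates, the normalisation $\left(\sum_t h^2(t)\right)^{1/2}$ factors across axes, which is what the per-axis division implements.

```python
import numpy as np

def cosine_bell(n):
    """Assumed cosine-bell taper h(t) = 0.5*(1 - cos(2*pi*t/n)), t = 1..n."""
    t = np.arange(1, n + 1)
    return 0.5 * (1.0 - np.cos(2.0 * np.pi * t / n))

def tapered_dft(x):
    """w_v^T(lam_j): taper each axis and normalise by (sum h^2)^{1/2} per axis."""
    out = x.astype(complex)
    for ax, n in enumerate(x.shape):
        h = cosine_bell(n)
        shape = [1] * x.ndim
        shape[ax] = n
        out = out * h.reshape(shape) / np.sqrt((h ** 2).sum())
    return np.fft.fftn(out)

def tapered_periodogram(x):
    """I_v^T(lam_j) = |w_v^T(lam_j)|^2 on the full frequency lattice."""
    return np.abs(tapered_dft(x)) ** 2
```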

It is worth mentioning that, to alleviate the bias problem, alternative procedures to tapering have been proposed. One of these proposals is due to Guyon (1982a), who replaced the periodogram by

$$\breve I_v(\lambda_k)=\frac{1}{(2\pi)^d}\sum_{h\in D}\breve c_v(h)\,e^{-ih\cdot\lambda_k},$$

where $\breve c_v(h)=N_{|h|}^{-1}\sum_{t(h)}v(t)\,v(t+h)$ and $D=\{h:-n[\ell]<h[\ell]<n[\ell];\ \ell=1,\dots,d\}$. Notice that the standard periodogram $I_v(\lambda_k)$ replaces $\breve c_v(h)$ by $\widehat c_v(h)=\frac{1}{N}\sum_{t(h)}v(t)\,v(t+h)$. However, Dahlhaus and Künsch (1987) have criticized the use of $\breve I_v(\lambda_k)$ on the grounds that, when it is employed to estimate the parameters of the model via a Whittle estimator, see (3.1) below, the estimator loses its minimum distance interpretability and the objective function possesses several local maxima. The latter implies that obtaining the maximum of the Whittle function becomes more strenuous. Another possibility is that described by Robinson and Vidal-Sanz (2006). The latter proposal will be helpful when $d\ge 4$. However, as we

One rationale for the statistic $\alpha_{\theta,N}(\lambda)$ follows from the observation (see Lemma 4 in Section 5) that, under $H_0$, we have that

$$\max_{-\tilde n\le j\le\tilde n}E\left(\frac{I_x^T(\lambda_j)}{\left|\Psi_{\theta_0}(\lambda_j)\right|^2}-I_\varepsilon^T(\lambda_j)\right)=o(1),$$

where "$a\le b$" means that $a[\ell]\le b[\ell]$ for all $\ell=1,\dots,d$, and

$$I_\varepsilon^T(\lambda_j)=\frac{1}{\sum_{t=1}^{n}h^2(t)}\Bigl|\sum_{t=1}^{n}h(t)\,\varepsilon(t)\,e^{-it\cdot\lambda_j}\Bigr|^2.$$

Also, observe that $0<j[1]\le\tilde n[1]$ whereas $-\tilde n[\ell]<j[\ell]\le\tilde n[\ell]$ for $\ell=2,\dots,d$. Thus, from the previous observation, we can expect that $\alpha_{\theta_0,N}$ will be asymptotically equivalent to Bartlett's Up process for $\{\varepsilon(t)/\sigma_\varepsilon\}_{t\in\mathbb{Z}^d}$, i.e.

$$(2.10)\qquad \beta_N^0(\lambda)=2^{-1/2}N^{1/2}\left(\frac{G_N^0(\lambda)}{G_N^0(\bar\pi)}-\prod_{\ell=1}^{d}\frac{\lambda[\ell]}{\pi}\right)$$

with

$$G_N^0(\lambda)=\frac{(2\pi)^d}{N}\sum_{j=-[\tilde n\lambda/\pi]}^{[\tilde n\lambda/\pi]}I_\varepsilon^T(\lambda_j),\qquad \lambda\in[0,\pi]^d.$$

Observe that the Up process $\beta_N^0$ and the Tp process $\alpha_{\theta_0,N}$ are identical when $\{x(t)\}_{t\in\mathbb{Z}^d}$ is a "white noise" process. We should now comment on why the formulation of the test in terms of the spectral density function might be useful. For that purpose, we have just to remember how the spectral density function is related to the conditional distribution at each site on an infinite lattice. Indeed, denoting

$$\frac{\sigma_\varepsilon^2}{f(z)}=\sum_{j\in\mathbb{Z}^d}a(j)\,z^j,$$

we then have that

$$(2.11)\qquad E\left\{x(t)\mid x(r):r\ne t\right\}=-\sum_{j\in\mathbb{Z}^d\setminus\{0\}}a(j)\,x(t-j).$$

So, (2.11) gives a motivation why the spectral density function plays a central role, and therefore why we have decided to work in terms of $f(\lambda)$. The equality in (2.11) is related to the well known CAR (Conditional Autoregression) model, as compared to the SAR (Simultaneous Autoregression) model in (2.12) below. See, for instance, Besag (1974) and Whittle (1954), respectively. It is, however, known that the class of CAR models is more general than that of SAR models. In fact, as Cressie (1993, Ch. 6) observed, any SAR model has a CAR representation but not vice versa; see also Besag (1974) or Guyon (1982b).

Let us introduce the following regularity conditions.

Condition C1: (a) The process $\{\varepsilon(t)\}_{t\in\mathbb{Z}^d}$ in (1.1) is a zero mean, independent, identically distributed sequence of random variables with variance $\sigma_\varepsilon^2$ equal to 1 and finite 4th moments, with $\kappa_\varepsilon$ denoting the fourth cumulant of $\{\varepsilon(t)\}_{t\in\mathbb{Z}^d}$. (b) The multilateral Moving Average representation of $\{x(t)\}_{t\in\mathbb{Z}^d}$ in (1.1) can be written (or it has a representation) as a multilateral autoregressive model

$$(2.12)\qquad \sum_{j\in\mathbb{Z}^d}\phi(j)\,x(t-j)=\varepsilon(t),\qquad \phi(0)=1,$$

where $\phi(j)$ is the coefficient of $z^j$ in the Fourier expansion of $\Psi^{-1}(z)$, where

$$\Psi(z)=\Psi(z[1],\dots,z[d])=\sum_{j\in\mathbb{Z}^d}\psi(j)\,z^j,$$

denoting for multi-indices $z$ and $j$, $z^j=\prod_{\ell=1}^{d}z[\ell]^{j[\ell]}$, with the convention that $0^0=1$.

Condition C2: $N=\prod_{\ell=1}^{d}n[\ell]$, where $n[\ell]\asymp n$ for $\ell=1,\dots,d$, and "$a\asymp b$" means that $0<C_1\le a/b\le C_2<\infty$ for some finite positive constants $C_1$ and $C_2$.

Condition C3: $\{h(t)\}_{t=1}^{n}$ is the cosine-bell taper function in (2.4).

We now comment on Conditions C1 to C3. Part (a) of Condition C1 seems to be a minimal condition for Proposition 1 below to hold true. Observe that, due to the quadratic nature of $\beta_N^0$, for the latter to have finite second moments we require finite fourth moments for the lattice process $\{\varepsilon(t)\}_{t\in\mathbb{Z}^d}$. Also, we have assumed that the true value of $\sigma_\varepsilon^2$ is 1. The latter follows from our comments made after the definition of $\widehat G_{\theta,N}(\lambda)$ in (2.9). However, we shall emphasize that we are not saying or suggesting that the true value of $\sigma_\varepsilon^2$ is known, only that it is equal to 1. A sufficient regularity condition for the validity of the expansion as an autoregression in part (b) is that $\Psi(z)$ be non-zero for any $z[\ell]$, $\ell=1,\dots,d$, which simultaneously satisfy $|z[1]|=1,\dots,|z[d]|=1$, at least when the Moving Average representation is of finite order. The latter implies that $f(\lambda)$ is a positive function.

Looking at the proof of Proposition 1 below, and then that of Theorem 1, it appears that we do not need to assume finite fourth moments of the sequence $\{\varepsilon(t)\}_{t\in\mathbb{Z}^d}$. The reason is similar to that in the work of Anderson and Walker (1964) for the statistic $\alpha_{\theta,N}(\lambda)$ in (2.8). However, as in the more realistic situation when we need to estimate the unknown parameters of the model under $H_0$ we require finite fourth moments to obtain the asymptotic properties of the estimates, we have just preferred to leave the condition as it stands. Condition C2 can be generalized to the case where the rate of convergence to zero of $n[\ell]^{-1}$ differs across $\ell=1,\dots,d$. However, for notational simplicity we prefer to leave it as it stands. On the other hand, in C3 the taper function employed for the asymptotics to follow can be more general, such as those given by Kolmogorov's or Parzen's tapers. In fact, in situations where the dimension $d$ is greater than 3, this might be needed for the results of the paper to follow. However, as the most important cases in empirical applications are covered in the paper, we shall leave the cosine-bell taper explicitly as the taper function to be employed.

The empirical processes $\beta_N^0(\lambda)$ and $\alpha_{\theta_0,N}(\lambda)$, given in (2.10) and (2.8) respectively, are random elements in $D[0,\pi]^d$. The functional space $D[0,\pi]^d$ is endowed with Skorohod's metric (see e.g. Billingsley, 1968, and Bickel and Wichura, 1971), and convergence in distribution in the corresponding topology will be denoted by "$\Rightarrow$".

Proposition 1. Under C1−C3, we have that

$$(2.13)\qquad \beta_N^0(\lambda)\ \Rightarrow\ \widetilde B(\lambda)=B\left(\frac{\lambda}{\pi}\right)-\left(\prod_{\ell=1}^{d}\frac{\lambda[\ell]}{\pi}\right)B(\bar 1),\qquad \lambda\in[0,\pi]^d,$$

where $\left\{B(u):u\in[0,1]^d\right\}$ is the standard Brownian sheet and $\bar 1=(1,\dots,1)$.

Remark 1. Recall that the covariance structure of the standard Brownian sheet is

$$\mathrm{Cov}\left(B(u),B(v)\right)=\prod_{\ell=1}^{d}\left(u[\ell]\wedge v[\ell]\right),\qquad \text{for}\ u,v\in[0,1]^d.$$
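Although the paper obtains critical values by the bootstrap precisely because functionals of this limit are awkward to tabulate, the sheet itself is easy to simulate on a grid; a minimal illustrative sketch (not from the paper) follows, built from cumulative sums of iid Gaussian cell increments.

```python
import numpy as np

def brownian_sheet(m, d=2, rng=None):
    """Standard Brownian sheet on a regular m^d grid of [0,1]^d:
    cumulative sums of iid N(0, m^{-d}) cell increments give
    Cov(B(u), B(v)) ~ prod_l min(u[l], v[l])."""
    if rng is None:
        rng = np.random.default_rng()
    B = rng.standard_normal((m,) * d) * m ** (-d / 2.0)
    for ax in range(d):
        B = np.cumsum(B, axis=ax)
    return B

m = 200
B = brownian_sheet(m)
u = np.linspace(1.0 / m, 1.0, m)
U1, U2 = np.meshgrid(u, u, indexing="ij")
B_tilde = B - U1 * U2 * B[-1, -1]   # tied-down sheet of Proposition 1 (lam/pi = u)
```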

A popular estimator of $\vartheta_0=\left(\theta_0',\sigma_\varepsilon^2\right)'$ is the Whittle (1954) estimator, defined as

$$\widehat\vartheta_c=\arg\min_{\vartheta\in\Theta\times\mathbb{R}_+}Q^c(\vartheta),$$

where

$$Q^c(\vartheta)=\int_{\Pi^d}\left(\log f_\vartheta(\lambda)+\frac{I_x^T(\lambda)}{(2\pi)^d f_\vartheta(\lambda)}\right)d\lambda,$$

or, in its discrete version,

$$(3.1)\qquad \widehat\vartheta=\arg\min_{\vartheta\in\Theta\times\mathbb{R}_+}Q_N(\vartheta),$$

where

$$(3.2)\qquad Q_N(\vartheta)=\frac{1}{N}\sum_{j=-\tilde n}^{\tilde n}\left(\log f_\vartheta(\lambda_j)+\frac{I_x^T(\lambda_j)}{(2\pi)^d f_\vartheta(\lambda_j)}\right)$$

with $f_\vartheta(\lambda_j)=\sigma_\varepsilon^2\left|\Psi_\theta(\lambda_j)\right|^2/(2\pi)^d$ and $\Theta\subset\mathbb{R}^p$ a compact set. Recall our notation given in (2.3), and that the true value of the variance of $\varepsilon(t)$ is unknown, so we need to estimate it. In this case, the Tp process $\alpha_{\theta_0,N}(\lambda)$ becomes

$$(3.3)\qquad \alpha_{\widehat\vartheta,N}(\lambda)=2^{-1/2}N^{1/2}\left(\frac{\widehat G_{\widehat\theta,N}(\lambda)}{\widehat G_{\widehat\theta,N}(\bar\pi)}-\prod_{\ell=1}^{d}\frac{\lambda[\ell]}{\pi}\right),\qquad \lambda\in[0,\pi]^d,$$

where $\widehat G_{\theta,N}(\lambda)$ is given in (2.9).
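A hedged numerical sketch of the discrete objective (3.2) for the illustrative AR(1,1) field used in the earlier snippet (the parametrisation and the Nelder–Mead choice are assumptions, not the paper's prescription):

```python
import numpy as np
from scipy.optimize import minimize

# vartheta = (p10, p01, sigma2); I_T is the tapered periodogram on the full
# frequency lattice and LAM1, LAM2 the Fourier frequencies, as built above.
def whittle_objective(vartheta, I_T, LAM1, LAM2):
    p10, p01, sigma2 = vartheta
    if sigma2 <= 0.0:                      # keep the search inside Theta x R+
        return np.inf
    P = 1.0 - p10 * np.exp(1j * LAM1) - p01 * np.exp(1j * LAM2)
    f = sigma2 / (2.0 * np.pi) ** 2 / np.abs(P) ** 2      # f_vartheta(lam_j)
    mask = (LAM1 != 0) | (LAM2 != 0)                      # drop lam_j = 0
    # Q_N = N^{-1} sum_j { log f(lam_j) + I^T(lam_j) / ((2 pi)^d f(lam_j)) }
    return np.mean(np.log(f[mask]) + I_T[mask] / ((2.0 * np.pi) ** 2 * f[mask]))

def whittle_fit(I_T, LAM1, LAM2, start=(0.1, 0.1, 1.0)):
    res = minimize(whittle_objective, np.asarray(start),
                   args=(I_T, LAM1, LAM2), method="Nelder-Mead")
    return res.x
```

The zero frequency is dropped from the sum, matching the exclusion of $\lambda_j=0$ discussed in Section 2.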

Notice that, contrary to the standard causal models, as Whittle (1954) noticed, the estimator of $\vartheta_0$ obtained by

$$\bar\theta=\arg\min_{\theta\in\Theta}\frac{1}{N}\sum_{j=-\tilde n}^{\tilde n}\frac{I_x^T(\lambda_j)}{\left|\Psi_\theta(\lambda_j)\right|^2},\qquad \bar\sigma_\varepsilon^2=\frac{1}{N}\sum_{j=-\tilde n}^{\tilde n}\frac{I_x^T(\lambda_j)}{\bigl|\Psi_{\bar\theta}(\lambda_j)\bigr|^2}$$

is inconsistent. The main reason for the lack of consistency of $\bar\theta$ is that, when the model is not causal, then $\int_{\Pi^d}\varphi_\theta(\lambda)\,d\lambda\ne 0$, where from now on we write

$$\nabla_\vartheta(\lambda)=\frac{\partial}{\partial\vartheta}\log f_\vartheta(\lambda)=\left(\varphi_\theta'(\lambda),\,\sigma_\varepsilon^{-2}\right)'$$

and

$$(3.4)\qquad \varphi_\theta(\lambda)=\frac{\partial}{\partial\theta}\log\left|\Psi_\theta(\lambda)\right|^2.$$

Let us introduce the following regularity conditions on $\theta_0$ and on the model (1.1) or (2.12).

Condition C5: $\theta_0$ is an interior point of the compact parameter set $\Theta\subset\mathbb{R}^p$, and $\sigma_\varepsilon^2\in\mathbb{R}_+$.

Condition C6: $\Psi_\theta(\lambda)$ is a positive and twice continuously differentiable function in $\theta$ on $[-\pi,\pi]^d$.

Condition C7: If $\theta_1\ne\theta_2$, then $\Psi_{\theta_1}(\lambda)\ne\Psi_{\theta_2}(\lambda)$ in a set $\Lambda\subset[-\pi,\pi]^d$ with positive Lebesgue measure.

The conditions imposed on $\Theta$ and the model are standard, so we omit any comment on them. Let

$$(3.5)\qquad q_{\vartheta,N}=\frac{1}{N}\sum_{j=-\tilde n}^{\tilde n}\nabla_\vartheta(\lambda_j)\left(\frac{I_x^T(\lambda_j)}{\sigma_\varepsilon^2\left|\Psi_\theta(\lambda_j)\right|^2}-1\right),\qquad Q_{\vartheta,N}=\frac{1}{N}\sum_{j=-\tilde n}^{\tilde n}\nabla_\vartheta(\lambda_j)\,\nabla_\vartheta'(\lambda_j),$$

and also, recalling our notation in (2.2),

$$\Phi_\vartheta=\frac{1}{(2\pi)^d}\int_{\Pi^d}\nabla_\vartheta(\lambda)\,d\lambda\qquad\text{and}\qquad \Xi_\vartheta=\frac{1}{(2\pi)^d}\int_{\Pi^d}\nabla_\vartheta(\lambda)\,\nabla_\vartheta'(\lambda)\,d\lambda.$$

Notice that we write $\sigma_\varepsilon^2$ explicitly, as it is a parameter in itself.

Condition C8: $\Xi_{\vartheta_0}$ is a continuous positive definite matrix.

Theorem 2. Under C1−C3 and C5−C8, we have that

$$N^{1/2}\left(\widehat\vartheta-\vartheta_0\right)\ \to_d\ N\left(0,\ 2\,\Xi_{\vartheta_0}^{-1}V_{\vartheta_0}\Xi_{\vartheta_0}^{-1}\right),$$

where $V_{\vartheta_0}=2\,\Xi_{\vartheta_0}+\kappa_\varepsilon\left(\tfrac{3}{2}\right)^{d}\Phi_{\vartheta_0}\Phi_{\vartheta_0}'$.

Proof. First, by definition, we know that

$$\widehat\vartheta-\vartheta_0=-\widetilde Q_{\tilde\vartheta,N}^{-1}\,q_{\vartheta_0,N},$$

where $\tilde\vartheta$ is an intermediate point between $\vartheta_0$ and $\widehat\vartheta$, $q_{\vartheta,N}$ is given in (3.5), and $\widetilde Q_{\vartheta,N}$ is given by

$$Q_{\vartheta,N}+\frac{1}{N}\sum_{j=-\tilde n}^{\tilde n}\left(2\,\nabla_\vartheta(\lambda_j)\nabla_\vartheta'(\lambda_j)-\frac{1}{f_\vartheta(\lambda_j)}\frac{\partial^2 f_\vartheta(\lambda_j)}{\partial\vartheta\,\partial\vartheta'}\right)\left(\frac{I_x^T(\lambda_j)}{\sigma_\varepsilon^2\left|\Psi_\theta(\lambda_j)\right|^2}-1\right)=Q_{\vartheta,N}+o_p(1)$$

by Lemma 5 and the fact that $\widehat\vartheta-\vartheta_0=o_p(1)$ by Lemma 6. On the other hand, by Brillinger (1981, p. 15) and standard arguments, since $\tilde\vartheta-\vartheta_0=o_p(1)$, we have that $\widetilde Q_{\tilde\vartheta,N}-\Xi_{\vartheta_0}=o_p(1)$. Next, by Lemma 4 with $\eta(\lambda)=\nabla_{\vartheta_0}(\lambda_j)$ there,

$$q_{\vartheta_0,N}=\frac{1}{N}\sum_{j=-\tilde n}^{\tilde n}\nabla_{\vartheta_0}(\lambda_j)\left(I_\varepsilon^T(\lambda_j)-1\right)+o_p\left(N^{-1/2}\right).$$

From here the proof proceeds as in Robinson and Vidal-Sanz (2006). □

Looking at the proof of Theorem 2, and denoting in what follows

$$\widetilde\varphi_\theta(\lambda)=\varphi_\theta(\lambda)-\frac{1}{(2\pi)^d}\int_{\Pi^d}\varphi_\theta(\lambda)\,d\lambda,\qquad \widetilde\nabla_\vartheta(\lambda)=\left(\widetilde\varphi_\theta'(\lambda),\,0\right)',$$

$$\widetilde\varphi_{\theta,N}(\lambda_j)=\varphi_\theta(\lambda_j)-\frac{1}{N}\sum_{j=-\tilde n}^{\tilde n}\varphi_\theta(\lambda_j),\qquad \widetilde\nabla_{\vartheta,N}(\lambda)=\left(\widetilde\varphi_{\theta,N}'(\lambda),\,0\right)',$$

with $\varphi_\theta(\lambda)$ given in (3.4), standard algebra establishes that the Whittle estimator $\widehat\vartheta$ in (3.1) satisfies the asymptotic linearization

$$(3.6)\qquad \widehat\vartheta-\vartheta_0=Q_{\vartheta_0,N}^{-1}\left\{\int_{\Pi^d}\widetilde\nabla_{\vartheta_0}(\lambda)\,\beta_{\vartheta_0,N}(d\lambda)+\int_{\Pi^d}\nabla_{\vartheta_0}(\lambda)\,d\lambda\left(\frac{1}{N}\sum_{j=-\tilde n}^{\tilde n}\frac{I_x^T(\lambda_j)}{(2\pi)^d f_{\vartheta_0}(\lambda_j)}-1\right)\right\}+o_p\left(N^{-1/2}\right).$$

Then, using (3.6) and defining

$$\beta_1(\lambda)=\widetilde B(\lambda)-\frac{1}{(2\pi)^d}\left(\int_{-\bar\pi}^{\lambda}\widetilde\varphi_{\theta_0}'(\omega)\,d\omega\right)\widetilde A^{-1}(\theta_0)\int_{\Pi^d}\widetilde\varphi_{\theta_0}(\omega)\,\widetilde B(d\omega),$$

where

$$\widetilde A(\theta)=\frac{1}{(2\pi)^d}\int_{\Pi^d}\widetilde\varphi_\theta(\lambda)\,\widetilde\varphi_\theta'(\lambda)\,d\lambda,$$

However, its implementation is quite cumbersome even for the rather simple case $d=1$; see, for instance, Anderson (1997) for details. So, in view of the preceding arguments, we consider a third approach based on bootstrap algorithms. This is the route employed, among others, by Chen and Romano (2000) or Hainz and Dahlhaus (2000) for short-range models using the Up process, and by Hidalgo and Kreiss (2006), who also allow for long-range dependence models, using the Tp process. Of course, all those articles were for $d=1$. Also, we will see that bootstraps employed when $d=1$ are not valid in our context.

4. BOOTSTRAP TESTS

Since Efron (1979), bootstrap algorithms have become a common tool in applied work, and thus considerable effort has been devoted to their development. The primary motivation for this effort is that they have proved to be a very useful statistical tool. We can cite two main reasons. First, bootstrap methods are capable of approximating the finite sample distribution of statistics better than those based on their asymptotic counterparts. And secondly, and perhaps most importantly, they allow computing valid asymptotic quantiles of the limiting distribution in situations where the practitioner is unable to compute its quantiles. In the present paper we face the latter situation.

Following our comments at the end of the previous section, the aim of this section is to propose a bootstrap procedure for $\alpha_{\widehat\vartheta,N}(\lambda)$ given in (3.3), and thus for $\widehat\Upsilon_N=\Upsilon\left(\alpha_{\widehat\vartheta,N}\right)$. The resampling method must be such that the conditional distribution, given $x=\{x(t)\}_{t=1}^{n}$, of the bootstrap statistic, say $\widehat\Upsilon_N^*$, consistently estimates the distribution of $\Upsilon(\beta_1)$ under $H_0$. That is, $\widehat\Upsilon_N^*\to_{d^*}\Upsilon(\beta_1)$ in probability under $H_0$, where "$\to_{d^*}$" denotes

$$\Pr\left\{\widehat\Upsilon_N^*\le z\ \big|\ x\right\}\ \to_p\ G(z)$$

at each continuity point $z$ of $G(z)=\Pr\left(\Upsilon(\beta_1)\le z\right)$. Moreover, under the local alternatives

$$(4.1)\qquad H_a:\ f(\lambda)=f_\vartheta(\lambda)\left(1+\frac{g(\lambda)}{N^{1/2}}\right)\quad \text{for some}\ \vartheta\in\Theta\times\mathbb{R}_+,$$

where $g(\lambda)$ is some symmetric, non-constant continuous function on $[0,\pi]^d$ such that $1+N^{-1/2}g(\lambda)>0$ for all $N\ge 1$, $\widehat\Upsilon_N^*$ must also converge, in bootstrap distribution, to $\Upsilon(\beta_1)$, whereas under the alternative $H_1$ we only require that $\widehat\Upsilon_N^*$ be bounded in probability, to have good power properties.

Remark 3. We should point out that $H_a$ could have been written as

$$H_a:\ f(\lambda)=f_\vartheta(\lambda)+\frac{\widetilde g(\lambda)}{N^{1/2}}\quad \text{for some}\ \vartheta\in\Theta\times\mathbb{R}_+,$$

where $\widetilde g(\lambda)$ is a positive integrable function. However, since we are concerned with the relative error of $I^T(\lambda_j)$ compared to $f_\vartheta(\lambda_j)=\sigma_\varepsilon^2\left|\Psi_\theta(\lambda_j)\right|^2/(2\pi)^d$, we found it notationally more convenient to write the alternative hypothesis $H_a$ as given in (4.1).

When $d=1$, Hidalgo and Kreiss (2006) examined a bootstrap algorithm based on an approach in Hidalgo (2003), showing its validity and consistency. This bootstrap consists of the following 3 STEPS.

STEP 1: Let $\widetilde x(t)=\left(x(t)-\bar x\right)/\widehat\sigma_x$, where $\bar x=N^{-1}\sum_{t=1}^{n}x(t)$ and $\widehat\sigma_x^2=N^{-1}\sum_{t=1}^{n}\left(x(t)-\bar x\right)^2$, and draw a random sample of size $N$ with replacement from the empirical distribution of $\widetilde x(t)$. Denote that sample as $x^*=\{x^*(t)\}_{t=1}^{n}$.

STEP 2: For $j=1,\dots,\tilde n$, compute the bootstrap periodogram $\widetilde I_{x^*}^T(\lambda_j)=\widehat f_{\widehat\vartheta}(\lambda_j)\,I_{x^*}^T(\lambda_j)$, where

$$\widehat f_{\widehat\vartheta}(\lambda_j)=\frac{\widehat G_{\widehat\theta,N}(\bar\pi)}{(2\pi)^d}\,\bigl|\Psi_{\widehat\theta}(\lambda_j)\bigr|^2,\qquad I_{x^*}^T(\lambda_j)=\frac{1}{\sum_{t=1}^{n}h^2(t)}\Bigl|\sum_{t=1}^{n}h(t)\,x^*(t)\,e^{-it\cdot\lambda_j}\Bigr|^2,$$

and the bootstrap analogue of $\widehat\vartheta$ by

$$(4.2)\qquad \widehat\vartheta^*=\arg\min_{\vartheta\in\Theta\times\mathbb{R}_+}\widetilde Q_N^*(\vartheta),$$

where

$$\widetilde Q_N^*(\vartheta)=\frac{1}{N}\sum_{j=-\tilde n}^{\tilde n}\left(\log f_\vartheta(\lambda_j)+\frac{\widetilde I_{x^*}^T(\lambda_j)}{(2\pi)^d f_\vartheta(\lambda_j)}\right)$$

with $f_\vartheta(\lambda_j)=\sigma_\varepsilon^2\left|\Psi_\theta(\lambda_j)\right|^2/(2\pi)^d$.

STEP 3: Compute the bootstrap Tp process

$$\alpha_{\widehat\vartheta^*,N}^*(\lambda)=2^{-1/2}N^{1/2}\left(\frac{\widetilde G_{\widehat\theta^*,N}^*(\lambda)}{\widetilde G_{\widehat\theta^*,N}^*(\bar\pi)}-\prod_{\ell=1}^{d}\frac{\lambda[\ell]}{\pi}\right),\qquad \lambda\in[0,\pi]^d,$$

where

$$\widetilde G_{\theta,N}^*(\lambda)=\frac{(2\pi)^d}{N}\sum_{j=-[\tilde n\lambda/\pi]}^{[\tilde n\lambda/\pi]}\frac{\widetilde I_{x^*}^T(\lambda_j)}{\left|\Psi_\theta(\lambda_j)\right|^2}.$$
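In outline, STEPS 1 and 2 translate into the following sketch, reusing the hypothetical `tapered_periodogram` and `whittle_fit` helpers from the earlier snippets; this is illustrative, not the paper's code.

```python
import numpy as np

def iid_bootstrap_periodogram(x, fitted_f, rng=None):
    """STEP 1: resample standardized observations with replacement;
    STEP 2: bootstrap periodogram I~*(lam_j) = f_hat(lam_j) * I_{x*}^T(lam_j).
    STEP 3 would then recompute the Whittle estimator (whittle_fit) and the
    Tp process from the returned array."""
    if rng is None:
        rng = np.random.default_rng()
    x_std = (x - x.mean()) / x.std()
    x_star = rng.choice(x_std.ravel(), size=x.size, replace=True).reshape(x.shape)
    return fitted_f * tapered_periodogram(x_star)
```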

Some other procedures are possible, such as one based on that of Franke and Härdle (1992), where the bootstrap periodogram $\widetilde I_{x^*}^T(\lambda_j)=\widehat f_{\widehat\vartheta}(\lambda_j)\,I_{x^*}^T(\lambda_j)$ is replaced by $\widetilde I_x^T(\lambda_j)=\widehat f_{\widehat\vartheta}(\lambda_j)\,\eta_j^*$, where $\eta_{-\tilde n}^*,\dots,\eta_{\tilde n}^*$ are independent exponential random variables. However, unlike in the case of $d=1$, the previous bootstrap algorithm will not be valid. The reason is that the bootstrap does not correctly "estimate" the fourth cumulant $\kappa_\varepsilon$. More specifically, the asymptotic distribution of the bootstrap estimator $\widehat\vartheta^*$ in (4.2) will not have the same asymptotic variance as that of $\widehat\vartheta$ in (3.1). So, to overcome this problem, following Hidalgo (2007), see also Hidalgo and Lazarova (2007), we propose in the paper an alternative algorithm, as described in the next 4 STEPS.

STEP 1: We first obtain the residuals

$$\widehat\varepsilon(t)=\frac{(2\pi)^{d/2}}{N^{1/2}}\sum_{j=-\tilde n}^{\tilde n}e^{-it\cdot\lambda_j}\,\widehat\Psi^{-1}\left(e^{i\lambda_j}\right)w_x(\lambda_j),$$

for $t=1,\dots,n$, where $\widehat\Psi=\Psi_{\widehat\theta}$. From here, as usual, we obtain a random sample of size $N$ with replacement from the empirical distribution function of $\{\widehat\varepsilon(t)\}_{t=1}^{n}$. Let us denote the bootstrap sample by $\{\varepsilon^*(t)\}_{t=1}^{n}$.
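Computationally, STEP 1 amounts to one forward and one inverse FFT; a rough sketch follows (the leading constants depend on the DFT normalisation adopted for $w_x$, so treat it as illustrative only).

```python
import numpy as np

def whittle_residuals(x, Psi_hat_on_grid):
    """eps_hat(t): divide the DFT of x by the fitted transfer function
    Psi_hat(e^{i lam_j}) on the frequency lattice, then invert; this mirrors
    eps_hat(t) ~ N^{-1/2} sum_j e^{-i t.lam_j} Psi_hat^{-1} w_x(lam_j)
    up to the paper's (2 pi)^{d/2} normalisation constant."""
    w_x = np.fft.fftn(x) / np.sqrt(x.size)           # w_x(lam_j), up to constants
    eps = np.fft.ifftn(w_x / Psi_hat_on_grid) * np.sqrt(x.size)
    return np.real(eps)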

Remark 4. (a) Notice that, because the test statistic $\widehat\Upsilon_N=\Upsilon\left(\alpha_{\widehat\vartheta,N}(\lambda)\right)$ is asymptotically independent of the mean and variance of $\{\varepsilon(t)\}_{t\in\mathbb{Z}^d}$, we do not need to standardize $\widehat\varepsilon(t)$ before computing the empirical distribution of $\{\widehat\varepsilon(t)\}_{t=1}^{n}$ in order to obtain the bootstrap sample. (b) The motivation to compute the residuals as in STEP 1 comes from the observation that, for $t=1,\dots,n$,

$$\varepsilon(t)\simeq\frac{(2\pi)^{d/2}}{N^{1/2}}\sum_{j=-\tilde n}^{\tilde n}e^{-it\cdot\lambda_j}\,\Psi_{\theta_0}^{-1}\left(e^{i\lambda_j}\right)w_x(\lambda_j).$$

Theorem 5. Under $H_0$ and assuming C1−C3 and C5−C8, uniformly in $\lambda\in[0,\pi]^d$,

(a)
$$\alpha_{\widehat\vartheta^*,N}^*(\lambda)=\beta_N^{*0}(\lambda)-\frac{1}{N}\sum_{j=-[\tilde n\lambda/\pi]}^{[\tilde n\lambda/\pi]}\widetilde\varphi_{\widehat\theta,N}'(\lambda_j)\ \widetilde A_{\widehat\theta,N}^{-1}\ \frac{1}{N}\sum_{j=-\tilde n}^{\tilde n}\widetilde\varphi_{\widehat\theta,N}(\lambda_j)\,I_{\varepsilon^*}^T(\lambda_j)+o_{p^*}(1),$$

where the $o_{p^*}(1)$ is uniform in $\lambda$.

(b) $\alpha_{\widehat\vartheta^*,N}^*\ \Rightarrow^*\ \beta_1$, in probability.

A conclusion from Theorem 5 is the following corollary.

Corollary 3. Let $\widehat\Upsilon_N^*=\Upsilon\left(\alpha_{\widehat\vartheta^*,N}^*(\lambda)\right)$. Under the maintained hypothesis and assuming C1−C3 and C5−C8, we have that, for any continuous functional $\Upsilon$,

$$\widehat\Upsilon_N^*\ \to_{d^*}\ \Upsilon\left(\beta_1(\lambda)\right).$$

Proof. The proof follows from Theorem 5 and the continuous mapping theorem. □

Thus, Theorem 5 and Corollary 3 indicate that the bootstrap statistic $\widehat\Upsilon_N^*$ is consistent. That is, let $c_{N,(1-\alpha)}$ and $c_{(1-\alpha)}^a$ be such that

$$\Pr\left\{\bigl|\widehat\Upsilon_N\bigr|>c_{N,(1-\alpha)}\right\}=\alpha\qquad\text{and}\qquad \lim_{N\to\infty}\Pr\left\{\bigl|\widehat\Upsilon_N\bigr|>c_{(1-\alpha)}^a\right\}=\alpha,$$

respectively. So, Theorems 3 and 5 indicate that $c_{N,(1-\alpha)}\to c_{(1-\alpha)}^a$ and $c_{(1-\alpha)}^*\to_p c_{(1-\alpha)}^a$, respectively, where $c_{(1-\alpha)}^*$ is defined by

$$\Pr\left\{\bigl|\widehat\Upsilon_N^*\bigr|>c_{(1-\alpha)}^*\ \Big|\ x\right\}=\alpha.$$

Typically, the finite sample distribution of $\widehat\Upsilon_N^*$ is not available, although the critical values $c_{(1-\alpha)}^*$ can be approximated, as accurately as desired, by standard Monte Carlo simulation. To that end, consider the bootstrap samples $\left\{\widehat\varepsilon^{*\ell}(t)\right\}_{t=1}^{n}$ for $\ell=1,\dots,B$, and compute $\alpha_{\widehat\vartheta^*,N}^{*\ell}(\lambda)$ as in (4.5) for each $\ell$. Then, $c_{(1-\alpha)}^*$ is approximated by the value $c_{(1-\alpha)}^{*B}$ that satisfies

$$\frac{1}{B}\sum_{\ell=1}^{B}I\left(\Upsilon\left(\alpha_{\widehat\vartheta^*,N}^{*\ell}(\lambda)\right)>c_{(1-\alpha)}^{*B}\right)=\alpha.$$
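In code, this Monte Carlo approximation is simply an empirical quantile of the $B$ bootstrap statistics; a minimal illustrative sketch:

```python
import numpy as np

def bootstrap_critical_value(boot_stats, alpha=0.05):
    """Approximate c*_{1-alpha} by the (1-alpha) empirical quantile of the
    B values of the functional of the bootstrap Tp process."""
    return np.quantile(np.abs(np.asarray(boot_stats)), 1.0 - alpha)

# usage: reject H0 at level alpha if |Upsilon_hat| exceeds this critical value
```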

Next we study the behaviour of the bootstrap tests under the alternative hypothesis $H_1$.

Corollary 4. Assuming C1−C8, under $H_1$,

$$\widehat\Upsilon_N^*\ \to_{d^*}\ \Upsilon\left(\widetilde\beta_1(\lambda)\right)\quad\text{in probability},$$

where $\widetilde\beta_1(\lambda)$ is a centered Gaussian process with the same covariance structure as $\beta_1(\lambda)$ but with $\theta_0$ replaced by $\theta_1=\operatorname{plim}\widehat\theta$.

Proof. The proof proceeds exactly as that of Theorem 5 and then Corollary 3, but writing $\widehat\theta-\theta_1=o_p(1)$ instead of $\widehat\theta-\theta_0=o_p(1)$, and $\theta_1$ instead of $\theta_0$. □

5. LEMMAS

First, we introduce some notation. We denote the conjugate of a complex number $a$ by $\bar a$. Also, for a generic function $\eta(\lambda)$, we abbreviate $\eta(\lambda_j)$ by $\eta_j$, $j=\left(j[1],\dots,j[d]\right)$, and $C$ will denote a generic positive and finite constant. For the next two lemmas, we shall assume that $\{\zeta(t)\}_{t\in\mathbb{Z}^d}$ and $\{\xi(t)\}_{t\in\mathbb{Z}^d}$ are two stationary spatial processes with a representation as that in (1.1) and whose respective errors satisfy C1. Also,

$$f_{\zeta\xi}(\lambda)=\frac{1}{(2\pi)^d}\sum_{j\in\mathbb{Z}^d}E\left(\zeta(t)\,\xi(t+j)\right)\exp\{-ij\cdot\lambda\},$$

the cross-spectral density function, is a twice continuously differentiable function in $\lambda\in\Pi^d$. Denote $\widetilde{\mathbb{Z}}^d=\{j:(-\tilde n\le j\le\tilde n)\wedge(0<j[1])\}$.

Lemma 1. Consider $j\in\widetilde{\mathbb{Z}}^d$. Then,

$$\text{(a)}\quad E\left(w_{\zeta,j}^T\,\overline w_{\xi,j}^T\right)-f_{\zeta\xi,j}=O\left(n^{-2}\right);\qquad \text{(b)}\quad E\left(w_{\zeta,j}^T\,w_{\xi,j}^T\right)=O\Bigl(\prod_{\ell=1}^{d}j[\ell]^{-3}\Bigr).$$

Proof. We begin with part (a). By definition, the left side of the equality in (a) is

$$\int_{\Pi^d}\left(f_{\zeta\xi}(\lambda)-f_{\zeta\xi}(\lambda_j)\right)\prod_{\ell=1}^{d}K^T\left(\lambda[\ell]-\lambda_{j[\ell]}\right)d\lambda,$$

suppressing any reference to $\zeta$, $\xi$ in $K^T$ and/or $D^T$ for notational simplicity. Now, because $f_{\zeta\xi}(\lambda)$ is twice continuously differentiable and $\int_{\Pi}\lambda\,K^T(\lambda)\,d\lambda=0$, we have that the last displayed expression is bounded in modulus by

$$C\int_{\Pi^d}\Bigl|\sum_{\ell=1}^{d}\sum_{p=1}^{d}\left(\lambda[\ell]-\lambda_{j[\ell]}\right)\left(\lambda[p]-\lambda_{j[p]}\right)\Bigr|\prod_{\ell=1}^{d}K^T\left(\lambda[\ell]-\lambda_{j[\ell]}\right)d\lambda\ \le\ C\int_{\Pi^d}\sum_{\ell=1}^{d}\left(\lambda[\ell]-\lambda_{j[\ell]}\right)^2\prod_{\ell=1}^{d}K^T\left(\lambda[\ell]-\lambda_{j[\ell]}\right)d\lambda$$

by the Cauchy–Schwarz inequality. Now, using (2.5), the fact that the Fejér kernel integrates to 1, and that $\sum_{t[\ell]=1}^{n[\ell]}h^2(t[\ell])\ge Cn$, we obtain that the right side of the last displayed inequality is bounded by

$$\frac{C}{N}\int_{\Pi^d}\sum_{\ell=1}^{d}\left(\lambda[\ell]-\lambda_{j[\ell]}\right)^2\prod_{\ell=1}^{d}\min\left\{n^2,\ n^{-4}\left(\lambda[\ell]-\lambda_{j[\ell]}\right)^{-6}\right\}d\lambda=O\left(n^{-2}\right)$$

by C2 and standard algebra. Next, we show part (b). Again by definition and the fact that $|f_{\zeta\xi}(\lambda)|<C$, we obtain that $\bigl|E\bigl(w_{\zeta,j}^T\,w_{\xi,j}^T\bigr)\bigr|$ is bounded by

$$C\int_{\Pi^d}\left|f_{\zeta\xi}(\lambda)\right|\prod_{\ell=1}^{d}n^{-1}\left|D^T\left(\lambda[\ell]-\lambda_{j[\ell]}\right)\right|\left|D^T\left(\lambda[\ell]+\lambda_{j[\ell]}\right)\right|d\lambda\ \le\ C\prod_{\ell=1}^{d}j[\ell]^{-3}$$

by standard arguments after using (2.5). □

Lemma 2. Let $k\ne j\in\widetilde{\mathbb{Z}}^d$ and $c_{jk}=\min\left\{\prod_{\ell=1}^{d}\left|j[\ell]-k[\ell]\right|_+^{-3},\ \frac{\log n}{n}\right\}$, where $|q|_+=\max\{1,|q|\}$. Then,

$$\text{(a)}\quad E\left(w_{\zeta,j}^T\,\overline w_{\xi,k}^T\right)=f_{\zeta\xi,j}\,I\left(\left|j[\ell]-k[\ell]\right|=2,\ \ell=1,\dots,d\right)+O\left(c_{jk}\right);\qquad \text{(b)}\quad E\left(w_{\zeta,j}^T\,w_{\xi,k}^T\right)=O\left(c_{jk}\right).$$

Proof. Denote $\varrho_j=u_j-v_j$. By standard arguments, the left side of (5.3) is

$$\sum_{j=r}^{s}\eta_j^2\,E\left(v_j\bar v_j\,\varrho_j\bar\varrho_j\right)+\sum_{j\ne k=r}^{s}\eta_j\eta_k\,E\left(v_j\bar v_k\,\varrho_j\bar\varrho_k\right)=\sum_{j=r}^{s}\eta_j^2\left\{a_{j1}+a_{j2}\right\}+\sum_{j\ne k=r}^{s}\eta_j\eta_k\left\{b_{jk,1}+b_{jk,2}\right\},$$

where

$$a_{j1}=E\left(v_j\bar v_j\right)E\left(\varrho_j\bar\varrho_j\right)+\left|E\left(v_j\bar\varrho_j\right)\right|^2+\left|E\left(v_j\varrho_j\right)\right|^2$$
$$a_{j2}=\mathrm{cum}\left(v_j,\bar v_j,u_j,\bar u_j\right)+\mathrm{cum}\left(v_j,\bar v_j,v_j,\bar v_j\right)-\mathrm{cum}\left(v_j,\bar v_j,u_j,\bar v_j\right)-\mathrm{cum}\left(v_j,\bar v_j,\bar u_j,v_j\right)$$
$$b_{jk,1}=E\left(v_j\bar v_k\right)E\left(\varrho_j\bar\varrho_k\right)+E\left(v_j\bar\varrho_j\right)E\left(\bar v_k\varrho_k\right)+E\left(v_j\bar\varrho_k\right)E\left(\bar v_k\varrho_j\right)$$
$$b_{jk,2}=\mathrm{cum}\left(v_j,\bar v_k,u_j,\bar u_k\right)+\mathrm{cum}\left(v_j,\bar v_k,v_j,\bar v_k\right)-\mathrm{cum}\left(v_j,\bar v_k,u_j,\bar v_k\right)-\mathrm{cum}\left(v_j,\bar v_k,\bar u_k,v_j\right).$$

After observing that $E\left(v_j\bar u_j\right)=1+O\left(n^{-2}\right)$ and $E\left(v_j\bar\varrho_j\right)=E\left(v_j\bar u_j\right)-E\left(v_j\bar v_j\right)$, we have that Lemma 1 implies that $a_{j1}=O\left(n^{-2}\right)$, whereas Lemmas 1 and 2 imply that $b_{jk,1}=O\left(c_{jk}^2+n^{-1}\log n\,I\left(\left|j[\ell]-k[\ell]\right|=2,\ \ell=1,\dots,d\right)\right)$, with $c_{jk}$ as defined there. From here it is immediate to conclude that the contribution due to $a_{j1}$ and $b_{jk,1}$ into the left of (5.3) is its right side. Finally, we examine $a_{j2}$ and $b_{jk,2}$. Using formulae in Brillinger [(1975), (2.6.3), page 26, and (2.10.3), page 39], we deduce after standard algebra that

$$b_{jk,2}=\frac{\kappa_\varepsilon}{N^2}\int_{\Pi^d}\int_{\Pi^d}\frac{\Psi(\lambda)\,\overline\Psi(\mu)}{\Psi_j\,\overline\Psi_k}\,D^T\left(\lambda-\lambda_j\right)D^T\left(\mu+\lambda_k\right)D^T\left(\lambda_j-\lambda_k-\lambda-\mu\right)d\lambda\,d\mu.$$

By the Cauchy–Schwarz inequality, we have that $\left|b_{jk,2}\right|^2$ is bounded by $CN^{-1}$ times

$$\int_{\Pi^d}\left|\frac{\Psi(\lambda)}{\Psi_j}\right|^2 K^T\left(\lambda-\lambda_j\right)d\lambda\int_{\Pi^d}\int_{\Pi^d}\left|\frac{\Psi(\mu)}{\Psi_k}\right|^2 K^T\left(\mu+\lambda_k\right)K^T\left(\lambda_j-\lambda_k-\lambda-\mu\right)d\lambda\,d\mu.$$

Proceeding as in Lemma 2 and by C4, we then obtain that $b_{jk,2}=O\left(n^{-2}N^{-1/2}\right)$. Likewise, $a_{j2}=O\left(n^{-2}N^{-1/2}\right)$. From here, the conclusion of the lemma easily follows by observing that $\prod_{\ell=1}^{d}\left|s[\ell]-r[\ell]\right|_+\le N$. □

Lemma 4. Let $\eta(\lambda)$ be a function as in Lemma 3. Then, under C1−C4,

$$E\sup_{\lambda\in[0,\pi]^d}\Bigl|\sum_{j=-[\tilde n\lambda/\pi]}^{[\tilde n\lambda/\pi]}\eta_j\left(\frac{I_{x,j}^T}{\left|\Psi_j\right|^2}-I_{\varepsilon,j}^T\right)\Bigr|=o\left(N^{1/2}\right).$$

Proof. We shall consider the proof in the positive quadrant $\sum_{j=1}^{[\tilde n\lambda/\pi]}$, the proof for the remaining $2^{d-1}-1$ quadrants being handled similarly. By the Cauchy–Schwarz and the triangle inequalities, it suffices to show that

$$(5.4)\qquad E\sup_s\Bigl|\sum_{j=1}^{s}\eta_j\left(\frac{I_{x,j}^T}{\left|\Psi_j\right|^2}-I_{\varepsilon,j}^T\right)\Bigr|\ \le\ E\sup_s\sum_{j=1}^{s}\left|\eta_j\right|\left|\varrho_j\right|^2+2\,E\sup_s\Bigl|\sum_{j=1}^{s}\eta_j\,v_j\bar\varrho_j\Bigr|$$

is $o\left(N^{1/2}\right)$, where we abbreviate "$\sup_{s=1,\dots,\tilde n}$" by "$\sup_s$" and $\varrho_j=u_j-v_j$.

The first term on the right of (5.4) is bounded by

$$C\sum_{j=1}^{\tilde n}\left\{\left(E|u_j|^2-1\right)-\left(E\left(u_j\bar v_j\right)-1\right)-\left(E\left(\bar u_j v_j\right)-1\right)+\left(E|v_j|^2-1\right)\right\}=o\left(N^{1/2}\right)$$

because $\left|\eta_j\right|\le C$, $d<4$ and, by Lemma 1, say,

$$\left|E\left(u_j\bar u_j-v_j\bar u_j\right)\right|=\frac{1}{\sigma_\varepsilon^2\left|\Psi_j\right|^2}\left|E\left(w_{x,j}\left(\bar w_{x,j}-\overline\Psi_j\,\bar w_{\varepsilon,j}\right)\right)\right|=O\left(n^{-2}\right).$$

Next, we examine the second term of (5.4). To that end, let $q=0,\dots,[\tilde n^{\varsigma}]-1$ for some $0<\varsigma<1/d$. (Recall that $\tilde n^{a}=\left(\tilde n[1]^{a},\dots,\tilde n[d]^{a}\right)$ for any $a>0$.) Standard inequalities imply that the square of the second term on the right of (5.4) is bounded by a constant times

$$(5.5)\qquad E\max_s\Bigl|\sum_{j=1}^{s}\eta_j v_j\bar\varrho_j-\sum_{j=1}^{q(s)[\tilde n^{1-\varsigma}]}\eta_j v_j\bar\varrho_j\Bigr|^2+E\max_s\Bigl|\sum_{j=1}^{q(s)[\tilde n^{1-\varsigma}]}\eta_j v_j\bar\varrho_j\Bigr|^2,$$

where herewith $q(s)$ denotes the value of $q=0,\dots,[\tilde n^{\varsigma}]-1$ such that $q(s)\left[\tilde n^{1-\varsigma}\right]$ is the largest vector $s_1$ such that $s_1\le s$, using the convention $\sum_{j=c}^{d}\equiv 0$ if $d<c$. From now on, we abbreviate $\left(\tilde n[1]/[\tilde n^{\varsigma}[1]],\dots,\tilde n[d]/[\tilde n^{\varsigma}[d]]\right)$ by $\left[\tilde n^{1-\varsigma}\right]$.

From the definition of $q(s)$ and

$$\sup_p\left|c_p\right|^2\le\sum_p\left|c_p\right|^2,$$

the second term of (5.5) is bounded by

$$\sum_{q=0}^{[\tilde n^{\varsigma}]-1}E\Bigl|\sum_{j=1}^{q[\tilde n^{1-\varsigma}]}\eta_j v_j\bar\varrho_j\Bigr|^2=O\left(N^{1+\varsigma}n^{-1}\log^2 n\right)=o(N)$$

by Lemma 3 and because $\varsigma<1/d$. To complete the proof, we need to show that the first term in (5.5) is $o(N)$. To that end, we note that it is bounded by

$$E\max_{q=1,\dots,[\tilde n^{\varsigma}]-1}\ \max_{s=1+q[\tilde n^{1-\varsigma}],\dots,(q+1)[\tilde n^{1-\varsigma}]}\Bigl|\sum_{j=1+q[\tilde n^{1-\varsigma}]}^{s}\eta_j v_j\bar\varrho_j\Bigr|^2,$$

which is $O\left(N^{\varsigma}\right)E\max_{s=1,\dots,[\tilde n^{1-\varsigma}]}\bigl|\sum_{j=1}^{s}\eta_j v_j\bar\varrho_j\bigr|^2$. So, we have that the square of the second term on the right of (5.4) is

$$O\left(N^{1+\varsigma}n^{-1}\log^2 n\right)+O\left(N^{\varsigma}\right)E\max_{s=1,\dots,[\tilde n^{1-\varsigma}]}\Bigl|\sum_{j=1}^{s}\eta_j v_j\bar\varrho_j\Bigr|^2.$$

Observe that the second factor of the second term of the last displayed expression is similar to the second term on the right of (5.4), but with $s=1,\dots,\left[\tilde n^{1-\varsigma}\right]$ instead of $s=1,\dots,\tilde n$. So, repeating the same steps, the last displayed expression, and so