

SOME CHARACTERIZATION PROBLEMS IN STATISTICS

YU. V. PROHOROV

V. A. STEKLOV INSTITUTE, MOSCOW

1. Introduction

In this paper we shall discuss problems connected with tests of the hypothesis that a theoretical distribution belongs to a given class, for instance, the class of normal distributions, of uniform distributions, or of Poisson distributions. The statistical data consist of a large number of small samples (see [1]).

2. Reduction to simple hypotheses

Let $(\mathcal{X}, \mathcal{A})$ be a measurable space ($\mathcal{X}$ is a set and $\mathcal{A}$ is a $\sigma$-algebra of subsets of $\mathcal{X}$). Let $\mathcal{P}$ be a set of probability distributions defined on $\mathcal{A}$, let $(\mathcal{Y}, \mathcal{B})$ be another measurable space, and let $Y = f(X)$, $X \in \mathcal{X}$, be a measurable mapping of $(\mathcal{X}, \mathcal{A})$ into $(\mathcal{Y}, \mathcal{B})$. With this mapping every distribution $P$ induces on $\mathcal{B}$ a corresponding distribution which we shall denote by $Q_P^Y$. We will be interested in the mappings (statistics) $Y$ which possess the following two properties:

(1) $Q_P^Y$ is the same for all $P \in \mathcal{P}$; in this case we will simply write $Q_{\mathcal{P}}^Y$.

(2) If for some $P'$ on $\mathcal{A}$ one has $Q_{P'}^Y = Q_{\mathcal{P}}^Y$, then $P' \in \mathcal{P}$.

Sometimes it is expedient to formulate requirement (2) in the weakened form:

(2a) If $P' \in \mathcal{P}' \supset \mathcal{P}$ and $Q_{P'}^Y = Q_{\mathcal{P}}^Y$, then $P' \in \mathcal{P}$.

In other words, we can assert in this case only that the equation $Q_{P'}^Y = Q_{\mathcal{P}}^Y$ implies $P' \in \mathcal{P}$ under some a priori restrictions ($P' \in \mathcal{P}'$) on $P'$.

If $Y$ is a statistic satisfying (1) and (2), then it is clear that the hypothesis that the distribution of $X$ belongs to the class $\mathcal{P}$ is equivalent to the hypothesis that the distribution of $Y$ is equal to $Q_{\mathcal{P}}^Y$.
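The equivalence just stated can be spelled out in two lines. The block below is only a restatement of properties (1) and (2); the symbols $Q_P^Y$ (the distribution of $Y$ induced by $P$) and $Q_{\mathcal{P}}^Y$ (its common value on the class $\mathcal{P}$) are the notation assumed here for the garbled symbols of the scan.

```latex
\begin{align*}
P \in \mathcal{P}
  &\;\Longrightarrow\; Q_P^Y = Q_{\mathcal{P}}^Y
  && \text{by property (1)},\\
Q_P^Y = Q_{\mathcal{P}}^Y
  &\;\Longrightarrow\; P \in \mathcal{P}
  && \text{by property (2)}.
\end{align*}
```

With (2a) in place of (2), the second implication is available only for $P$ in the larger class $\mathcal{P}'$.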

Let us consider some examples. In these examples $(\mathcal{X}, \mathcal{A})$ is an $n$-dimensional Euclidean space of points $X = (x_1, \ldots, x_n)$ with the $\sigma$-algebra of Borel sets. The distributions belonging to $\mathcal{P}$ have a probability density of the form

$$p(x_1, \theta)\, p(x_2, \theta) \cdots p(x_n, \theta) \tag{2.1}$$

where $p$ is a one-dimensional density and $\theta$ a parameter taking values in a parameter space.

EXAMPLE 1 (I. N. Kovalenko [2]). Translation parameter. Let $p(x; \theta) = p(x - \theta)$, with $-\infty < \theta < \infty$ (additive type). Here obviously it is necessary to take the $(n - 1)$-dimensional statistic $Y = (x_1 - x_n, \ldots, x_{n-1} - x_n)$. Of course, we can take any uniquely invertible function of it, for example $Y' = (x_1 - \bar{x}, \ldots, x_n - \bar{x})$ where $\bar{x} = (1/n) \sum_{k=1}^{n} x_k$.
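The two statistics of example 1 can be illustrated numerically. The following is a minimal sketch, not part of the paper: the function names and the sample values are hypothetical, and it only checks that the vector of differences from the last observation is unchanged by a common shift and that the centered vector can be recovered from it and vice versa.

```python
import math
from typing import List, Sequence


def diffs_to_last(x: Sequence[float]) -> List[float]:
    """Y = (x_1 - x_n, ..., x_{n-1} - x_n)."""
    return [xi - x[-1] for xi in x[:-1]]


def centered(x: Sequence[float]) -> List[float]:
    """Y' = (x_1 - mean, ..., x_n - mean)."""
    mean = sum(x) / len(x)
    return [xi - mean for xi in x]


def centered_from_diffs(y: Sequence[float]) -> List[float]:
    """Recover Y' from Y alone: append d_n = 0 and subtract the mean."""
    d = list(y) + [0.0]
    mean = sum(d) / len(d)
    return [dk - mean for dk in d]


def diffs_from_centered(y_prime: Sequence[float]) -> List[float]:
    """Recover Y from Y' by subtracting the last component."""
    return [v - y_prime[-1] for v in y_prime[:-1]]


sample = [2.0, 3.5, -1.0, 0.5]          # hypothetical observations
shifted = [xi + 10.0 for xi in sample]  # same sample, translated by 10

print(diffs_to_last(sample))   # unchanged by the shift
print(diffs_to_last(shifted))
print(centered_from_diffs(diffs_to_last(sample)))
print(centered(sample))
```

Because each of the two maps can be computed from the other, fixing the distribution of one statistic fixes the distribution of the other.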

FIFTH BERKELEY SYMPOSIUM: PROHOROV

In [2] it is shown that for $n \ge 3$, the distribution of $Y$ determines the characteristic function $f(t) = \int_{-\infty}^{\infty} e^{itx} p(x)\, dx$ to within a factor of the form $e^{i\gamma t}$, on every interval where $f(t) \ne 0$. In particular, if $f(t) \ne 0$ for every $t$, then for $n \ge 3$, the statistic $Y$ satisfies conditions (1) and (2) of section 2. This is also true if $f(t)$ is uniquely determined by its values in some neighborhood of zero (for example, if $f(t)$ is analytic in some neighborhood of zero). In the same paper, for every $n$ a pair of distributions is given, not belonging to the same additive type, for which the distribution of the statistic $Y$ is the same for samples of size $n$. In section 4 these results are extended to a sample from a multidimensional population, and in section 5 to the case of a scale parameter.

REMARK. Let us assume that a distribution with density $p(x)$ has four finite moments: $m_1 = 0$, $m_2$, $m_3$, $m_4$, and that $p(x) < A$. Let us denote by $F(x)$ the corresponding distribution function and let $G(x)$ be another distribution function such that the distribution of $Y$ is the same for $F$ and $G$. Then it can be shown that

$$\inf_{\theta} \sup_{x} |G(x) - F(x - \theta)| < \frac{C(A, m_2, m_3, m_4)}{\sqrt{n}}. \tag{2.2}$$

That is, if the sample size $n$ is large, all the additive types corresponding to a given distribution of the statistic $Y$ must be close to each other.

EXAMPLE 2 (A. A. Zinger, Yu. V. Linnik [3], [4]). Let $\theta = (a, \sigma)$, $-\infty < a < \infty$, $\sigma > 0$, and let

$$p(x, \theta) = \frac{1}{\sigma}\, \varphi\!\left(\frac{x - a}{\sigma}\right) \tag{2.3}$$

where $\varphi$ is a normal $(0, 1)$ density. Here it is natural to take the $(n - 2)$-dimensional statistic $Y = (y_1, \ldots, y_n)$, $y_k = (x_k - \bar{x})/s$, where $s^2 = \sum_k (x_k - \bar{x})^2$, $s > 0$. The sum of the components $y_k$ of the vector $Y$ is equal to zero, and the sum of their squares is unity. Thus the distribution of $Y$ is concentrated on an $(n - 2)$-dimensional sphere $\sum_k y_k = 0$, $\sum_k y_k^2 = 1$.
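The two constraints that place $Y$ on the sphere are easy to verify numerically. The sketch below is illustrative only: the function name `standardize` and the sample values are assumptions. It computes $y_k = (x_k - \bar{x})/s$ with $s^2 = \sum_k (x_k - \bar{x})^2$ and checks the two constraints, together with invariance under $x \mapsto \sigma x + a$ for $\sigma > 0$.

```python
import math
from typing import List, Sequence


def standardize(x: Sequence[float]) -> List[float]:
    """Return y_k = (x_k - mean) / s, where s^2 is the sum of squared deviations."""
    mean = sum(x) / len(x)
    s = math.sqrt(sum((xk - mean) ** 2 for xk in x))
    if s == 0.0:
        raise ValueError("all observations coincide; s must be positive")
    return [(xk - mean) / s for xk in x]


x = [1.2, -0.7, 3.4, 0.1, 2.2, -1.5]                 # hypothetical sample
y = standardize(x)
y_affine = standardize([2.5 * xk - 4.0 for xk in x])  # sigma = 2.5, a = -4

print(sum(y))                    # close to 0
print(sum(v * v for v in y))     # close to 1
print(max(abs(u - v) for u, v in zip(y, y_affine)))  # close to 0
```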

It is known [1], [3] that for $p(x, \theta)$ defined by formula (2.3) the distribution of $Y$ is uniform on this sphere. In [3] it is shown that for $n \ge 6$, the statistic $Y$ possesses properties (1) and (2) of section 2; that is, from uniformity of the distribution of $Y$ on the corresponding sphere it follows that the $x$'s are normally distributed. This result is extended to distributions different from the normal in section 6.

It is clear for both examples cited that the choice of the statistic $Y$ is based on considerations of invariance. Namely, there exists a group $\mathcal{G}$ of one-to-one (or almost one-to-one) mappings of the sample space onto itself ($X = (x_1, \ldots, x_n) \to (x_1 - a, \ldots, x_n - a)$ in the first example and $X \to ((x_1 - a)/\sigma, \ldots, (x_n - a)/\sigma)$ in the second) having the property that distributions of "random elements" $X$ and $gX$, $g \in \mathcal{G}$, simultaneously belong to or do not belong to $\mathcal{P}$. In addition, for any two distributions $P_1$ and $P_2$ there exists $g \in \mathcal{G}$ such that for every $A \in \mathcal{A}$, $P_2(A) = P_1(gA)$.

In this case it is natural to take for $Y$ a maximal invariant of the group $\mathcal{G}$.


We are interested in its real continuous solutions with $a(0) = 0$ (actually, from the assumption of the theorem it follows that $A_i(t)$ is infinitely differentiable in the neighborhood of zero which we are considering, and therefore it can be assumed that $a(t)$ is infinitely differentiable). We have $a(t') + a(\tau') = a(t' + \tau')$. Therefore,

$$a(t') = \sum_{j=1}^{k} \gamma_j t^{(j)}. \tag{3.6}$$

Further, from $a(t) + a(\tau') = a(t' + \tau')$, it follows that $a(t) = a(t')$. Thus, in the neighborhood of zero which we are considering,

$$A_1(t) - A_2(t) = \sum_{j=1}^{k} \gamma_j t^{(j)} \tag{3.7}$$

and

$$f_1(t) = f_2(t) \exp\Big\{ i \sum_{j=1}^{k} \gamma_j t^{(j)} \Big\}. \tag{3.8}$$

Because of the analyticity of $f_1$ and $f_2$, this equation holds for all values of $t$.

REMARK. If $f(t) \ne 0$ for every $t$, then equation (3.8) is obtained without the condition stated in theorem 1.

4. Scale parameter in a multidimensional population

Now we shall consider the case of a family $\mathcal{P}$, given by formula (2.1), under the assumption that $x_i$ is an $\ell$-dimensional vector and

$$p(x, \theta) = \frac{1}{\theta^{\ell}}\, p\!\left(\frac{x}{\theta}\right). \tag{4.1}$$

Let $X = (x_1, x_2, \ldots, x_n)$ be a sample from the distribution (4.1), $x_j = (x_j^{(1)}, \ldots, x_j^{(\ell)})$. The distribution of the $2\ell$-dimensional vectors

$$v_j = \big(\ln |x_j^{(1)}|, \ldots, \ln |x_j^{(\ell)}|, \operatorname{sign} x_j^{(1)}, \ldots, \operatorname{sign} x_j^{(\ell)}\big) \tag{4.2}$$

belongs to the $2\ell$-dimensional additive type with density

$$q(v, \vartheta) = q\big(v^{(1)} - \vartheta, \ldots, v^{(\ell)} - \vartheta, v^{(\ell+1)}, \ldots, v^{(2\ell)}\big) \tag{4.3}$$

where $\vartheta = \ln \theta$. The following theorem is easily derived from the result of the preceding section.

THEOREM 2. Assume that $p(x)$ is bounded and satisfies Cramér's condition. Then the statistic $Y = (V_1 - V_3, V_2 - V_3)$, where $V_j$ is defined in terms of $v_j$ according to the rule of theorem 1 (with replacement of $\ell$ by $2\ell$ and $k$ by $\ell$), possesses properties (1) and (2).

The proof consists of verifying that the distribution of $v_j$ satisfies the conditions of theorem 1.
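The passage from a scale family to an additive family rests on the map that sends each coordinate to its log-modulus and its sign. The following sketch illustrates this for a single $\ell$-dimensional observation; the function name `log_sign`, the sample vector, and the use of one positive scalar scale `theta` are assumptions made for the illustration.

```python
import math
from typing import List, Sequence


def log_sign(x: Sequence[float]) -> List[float]:
    """Map (x^(1), ..., x^(l)) to (ln|x^(1)|, ..., ln|x^(l)|, sign x^(1), ..., sign x^(l))."""
    if any(c == 0.0 for c in x):
        raise ValueError("coordinates must be nonzero for the logarithm to exist")
    logs = [math.log(abs(c)) for c in x]
    signs = [math.copysign(1.0, c) for c in x]
    return logs + signs


x = [0.5, -2.0, 3.0]   # hypothetical observation, l = 3
theta = 4.0            # hypothetical positive scale
v = log_sign(x)
v_scaled = log_sign([theta * c for c in x])

# The first l coordinates move by ln(theta); the sign coordinates do not move.
print([b - a for a, b in zip(v, v_scaled)])
print(math.log(theta))
```

Multiplying the observation by a positive scale therefore acts on the transformed vector as a translation of its first $\ell$ coordinates, which is the situation of the additive type.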

5. One-dimensional linear type

We return to the one-dimensional case analogous to that considered in example 2, section 2. Let $\theta = (a, b)$, $-\infty < a < \infty$, $b > 0$ and

$$p(x, \theta) = \frac{1}{b}\, p\!\left(\frac{x - a}{b}\right). \tag{5.1}$$

We will call a type symmetric if it is possible to choose the function $p$ to be even. Let $\bar{x}$, $s$, and $y_k$ keep the same meaning as in example 2, section 2. Let us denote by $\mathcal{P}'$ the family of distributions (2.1) which corresponds to symmetric types.

THEOREM 3. If $p$ is symmetric and bounded and satisfies Cramér's condition, then for $n \ge 6$, the statistic

$$Y^* = \left[ \left( \frac{y_4 - y_3}{y_2 - y_1} \right)^2, \left( \frac{y_6 - y_5}{y_2 - y_1} \right)^2 \right] \tag{5.2}$$

possesses properties (1) and (2a) of section 2.

PROOF. We have

$$Y_1^* = \left( \frac{y_4 - y_3}{y_2 - y_1} \right)^2 = \left( \frac{x_4 - x_3}{x_2 - x_1} \right)^2 \tag{5.3}$$

and an analogous equality for the second component $Y_2^*$ of the vector $Y^*$. Let $p'$ be a symmetric density, different from $p$ and such that $Q_{p'}^{Y^*} = Q_{p}^{Y^*}$. From the fact that $p$ satisfies Cramér's condition and is bounded, it follows easily that $\ln Y_1^*$ and at the same time $\ln (x_2 - x_1)^2$ satisfy Cramér's condition (both for $p$ and for $p'$). To the sample of size 3 made up of the variables $\ln (x_2 - x_1)^2$, $\ln (x_4 - x_3)^2$, $\ln (x_6 - x_5)^2$, one can apply what was said in the remark on example 1, section 2. Consequently, the distribution of $Y^*$ determines the distribution of $\ln (x_2 - x_1)^2$ to within a translation parameter, and the distribution of $(x_2 - x_1)^2$ to within a scale parameter. Since the variable $x_2 - x_1$ is symmetrically distributed, its distribution also is determined to within a scale parameter. We note that thus far we have not made use anywhere of the symmetry of $p'$. If, for example, $p$ is normal, then the distribution of $x_2 - x_1$ under $p'$ is normal, and by Cramér's theorem $x_1$ is normal. In the general case, for a symmetric density $p'$, the distribution of $x_1$ is uniquely determined except for a translation parameter by the distribution of $x_2 - x_1$. The theorem is proved.
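Equality (5.3) holds because $y_k - y_m = (x_k - x_m)/s$, so the common factor $s$ cancels in each ratio. A small numerical sketch of this, and of the resulting invariance of the statistic (5.2) under $x \mapsto bx + a$, is given below; the helper names and the sample are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def standardize(x: Sequence[float]) -> List[float]:
    """y_k = (x_k - mean) / s with s^2 the sum of squared deviations, as in example 2."""
    mean = sum(x) / len(x)
    s = math.sqrt(sum((xk - mean) ** 2 for xk in x))
    return [(xk - mean) / s for xk in x]


def y_star(v: Sequence[float]) -> Tuple[float, float]:
    """Statistic (5.2) from the first six values (v[0] is v_1 of the paper)."""
    d = v[1] - v[0]
    return (((v[3] - v[2]) / d) ** 2, ((v[5] - v[4]) / d) ** 2)


x = [0.3, 1.9, -0.4, 2.6, 1.1, -2.0]            # hypothetical sample, n = 6
from_y = y_star(standardize(x))                 # computed from the y_k
from_x = y_star(x)                              # right-hand side of (5.3)
from_affine = y_star([3.0 * xk + 7.0 for xk in x])

print(from_y, from_x, from_affine)  # the three pairs agree up to rounding
```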

Without the assumptions of symmetry, the formulation must be changed.

THEOREM 4. Assume that $p$ is bounded and satisfies Cramér's condition. Then for $n \ge 9$, the statistic $Y = (Y_1^*, Y_2^*)$ where

$$\begin{aligned}
Y_1^* &= \left( \ln \left| \frac{y_3 - y_1}{y_9 - y_7} \right|, \ln \left| \frac{y_2 - y_1}{y_8 - y_7} \right|, \operatorname{sign}(y_3 - y_1), \operatorname{sign}(y_2 - y_1) \right), \\
Y_2^* &= \left( \ln \left| \frac{y_6 - y_4}{y_9 - y_7} \right|, \ln \left| \frac{y_5 - y_4}{y_8 - y_7} \right|, \operatorname{sign}(y_6 - y_4), \operatorname{sign}(y_5 - y_4) \right)
\end{aligned} \tag{5.4}$$

possesses properties (1) and (2), section 2.

PROOF. The distribution of the vector $(x_3 - x_1, x_2 - x_1)$ belongs to the multiplicative type. Using a sample of size 3, namely $(x_3 - x_1, x_2 - x_1)$, $(x_6 - x_4, x_5 - x_4)$, $(x_9 - x_7, x_8 - x_7)$, this multiplicative type is determined uniquely by the distribution of the statistic mentioned in the formulation of the theorem. Knowing the distribution of $(x_3 - x_1, x_2 - x_1)$, we determine the additive type of the distribution of $x_1$. The theorem is proved.


From this follows, as is easily seen, the "shift-compactness" of the distributions of $\ln (x_2^{(N)} - x_1^{(N)})^2$. We shall take now an arbitrary sequence $N_k \uparrow \infty$ of natural numbers and choose from it a subsequence $M_k$ for which the distributions of

$$\ln \left( \frac{x_2^{(M_k)} - x_1^{(M_k)}}{b_k} \right)^2, \qquad b_k > 0, \tag{6.8}$$

converge weakly to a limit distribution. Then the distributions of

$$\frac{x_2^{(M_k)}}{b_k} - \frac{x_1^{(M_k)}}{b_k} \tag{6.9}$$

also form a weakly convergent sequence, and the sequence of distributions of $x_1^{(M_k)}/b_k$ is "shift-compact." From this it is obvious that the convergence (6.4) implies relative compactness of the sequence of types $T(p^{(N)})$. The proof can now be completed in the same way as in part A.

7. Characterization of multidimensional linear types

We shall say that the distributions of random vectors $x$ and $y$ belong to the same type if there exists a nonsingular matrix $A$ and vector $b$ such that

$$gx = Ax + b \tag{7.1}$$

has the same distribution as $y$. Let $p(x)$ be any $\ell$-dimensional density and $\theta = (A, b)$. We denote by $\mathcal{P} = T(p) = \{p_\theta\}$ the linear type generated in the obvious manner by the density $p$.

All presently known results on characterization of multidimensional distributions have been obtained under the assumption that the distributions considered belong to the class $\mathcal{P}'$, defined in the following manner. The distribution of an $\ell$-dimensional random vector $x$ belongs to class $\mathcal{P}'$ if in some coordinate system its components are independent. The group $G$ of all transformations (7.1) induces a group $\mathcal{G}$ of transformations $gX = (gx_1, \ldots, gx_n)$ in the $n\ell$-dimensional space of vectors $X = (x_1, \ldots, x_n)$. A maximal invariant $Y$ of the group $\mathcal{G}$ can be expressed in terms of the determinants

$$X_{i_1, \ldots, i_\ell} = [x_{i_1} - \bar{x}, \ldots, x_{i_\ell} - \bar{x}], \qquad \text{where } \bar{x} = (1/n)(x_1 + \cdots + x_n), \tag{7.2}$$

and where $[z_1, \ldots, z_\ell]$ denotes the volume of the oriented parallelepiped constructed on the vectors $z_1, \ldots, z_\ell$.

Let us assume that the sample size $n \ge 6\ell$. We shall take vectors $z_j = x_{2j} - x_{2j-1}$ with components $z_j^{(k)}$, $k = 1, 2, \ldots, \ell$. Let

$$\delta_k = [z_{k\ell+1}, \ldots, z_{k\ell+\ell}]^2, \qquad k = 0, 1, 2, \tag{7.3}$$

$$\zeta_1 = \ln \frac{\delta_1}{\delta_0}, \qquad \zeta_2 = \ln \frac{\delta_2}{\delta_0}, \tag{7.4}$$

$$Z = (\zeta_1, \zeta_2). \tag{7.5}$$

It is clear that $Z$ is a function of a maximal invariant $Y$ of the group $\mathcal{G}$. The following theorem (see [5]) holds.


THEOREM 6. If a density $p$ satisfies Cramér's condition and if it is bounded, then the statistic $Y$ possesses properties (1) and (2a) with respect to the class $\mathcal{P}''$ of distributions of random vectors $x$ which can be transformed into vectors with independent, identically distributed, symmetrical components by a transformation of the form (7.1).

The proof of this theorem is based on a lemma which has independent interest.

LEMMA. Let $V_j^{(i)}$ $(i, j = 1, \ldots, \ell)$ be independent random variables with the same distribution function $V(x)$, and let $W_j^{(i)}$ $(i, j = 1, \ldots, \ell)$ also be independent and have a distribution function $W(x)$. If all moments of $V(x)$ exist and the distribution of the determinant $\Delta = \det \|V_j^{(i)}\|$ coincides with the distribution of the determinant $\tilde{\Delta} = \det \|W_j^{(i)}\|$, then $V = W$.
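The invariance behind formulas (7.3) and (7.4) can be checked in the plane. Under a transformation (7.1) every difference $x_{2j} - x_{2j-1}$ is multiplied by $A$, so each squared determinant in (7.3) is multiplied by $(\det A)^2$ and the log-ratios in (7.4) do not change. The sketch below does this for dimension 2 with 12 sample points; the names `det2`, `affine`, `log_ratios`, the points, and the matrix are all hypothetical choices.

```python
import math
from typing import List, Sequence, Tuple

Vec = Tuple[float, float]


def det2(u: Vec, v: Vec) -> float:
    """Oriented area of the parallelogram built on u and v."""
    return u[0] * v[1] - u[1] * v[0]


def affine(A: Sequence[Sequence[float]], b: Vec, x: Vec) -> Vec:
    """Transformation (7.1): x -> A x + b in the plane."""
    return (A[0][0] * x[0] + A[0][1] * x[1] + b[0],
            A[1][0] * x[0] + A[1][1] * x[1] + b[1])


def log_ratios(sample: Sequence[Vec]) -> Tuple[float, float]:
    """Log-ratios of squared determinants, (7.3)-(7.4), for dimension 2 and 12 points."""
    # z[j] is the difference of the (2j+2)-th and (2j+1)-th observations (1-based).
    z: List[Vec] = [(sample[2 * j + 1][0] - sample[2 * j][0],
                     sample[2 * j + 1][1] - sample[2 * j][1]) for j in range(6)]
    delta = [det2(z[2 * k], z[2 * k + 1]) ** 2 for k in range(3)]
    return (math.log(delta[1] / delta[0]), math.log(delta[2] / delta[0]))


points: List[Vec] = [(0, 0), (1, 0), (0, 1), (1, 3), (2, 1), (4, 2),
                     (1, 1), (0, 3), (3, 0), (5, 5), (2, 2), (3, -1)]
A = [[2.0, 1.0], [0.5, -3.0]]   # nonsingular: det A = -6.5
b = (4.0, -1.0)
mapped = [affine(A, b, p) for p in points]

print(log_ratios(points))
print(log_ratios(mapped))   # same values up to rounding
```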

8. Application to testing hypotheses

The classical method of testing the hypothesis that the distribution of a sample belongs to a given parametric family (2.1) consists in the construction, based on the results of observations, of an estimate $\hat{\theta}$ for $\theta$ and in the subsequent test of the significance of the deviation of the empirical distribution from the theoretical with $\theta = \hat{\theta}$.

Another statement of the problem will interest us. A large number $s$ of small samples $X_1, \ldots, X_s$ of sizes $n_1, \ldots, n_s$, respectively, is given. The null hypothesis $H_0$ is that for every $j$ the distribution of $X_j$ is in the family (2.1). If there exist statistics $Y_1, \ldots, Y_s$; $Y_j = f_j(X_j)$, satisfying properties (1) and (2), then the composite hypothesis $H_0$ is replaced by the simple hypothesis $H_0'$: for every $j$ the distribution of $Y_j$ is equal to $Q_{\mathcal{P}}^{Y_j}$. Let the dimensionality of the statistic $Y_j$ be equal to $m_j$. With the proper transformation one can translate $Y_j$ into $z_j$, $Y_j = \psi_j(z_j)$, where $z_j$ has a uniform distribution on the unit cube in $m_j$-dimensional Euclidean space. This transforms the hypothesis $H_0'$ into the equivalent hypothesis $H_0''$: the components of the $(m_1 + \cdots + m_s)$-dimensional vector $Z = (z_1, \ldots, z_s)$ are independent and uniformly distributed on the interval $[0, 1]$. In this way one can give a standard form to the hypothesis $H_0$.

Of course, the first question which arises in connection with such transformations concerns the form taken by the alternative hypotheses. From this point of view the transformations mentioned must be "sufficiently smooth" so that they transform the "alternatives close" to $H_0$ into the "alternatives close" to $H_0''$. For now we shall postpone the corresponding analysis.
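As an illustration of the reduction to a uniform variable, consider the normal family of example 2 with samples of size 3. The statistic then lives on a circle, and under the null hypothesis it is uniform there, so its angle divided by $2\pi$ is uniform on $[0, 1]$. The sketch below computes that coordinate; the function name and the particular orthonormal basis of the plane $\sum_k y_k = 0$ are assumptions, and the paper does not prescribe a specific transformation.

```python
import math
from typing import Sequence


def circle_coordinate(x: Sequence[float]) -> float:
    """Map a sample of size 3 to a number in [0, 1) that ignores location and positive scale."""
    if len(x) != 3:
        raise ValueError("this sketch handles samples of size 3 only")
    mean = sum(x) / 3.0
    c = [xk - mean for xk in x]
    # Coordinates of the centered sample in an orthonormal basis of the plane sum = 0.
    u = (c[0] - c[1]) / math.sqrt(2.0)
    v = (c[0] + c[1] - 2.0 * c[2]) / math.sqrt(6.0)
    if u == 0.0 and v == 0.0:
        raise ValueError("all observations coincide")
    angle = math.atan2(v, u)  # the angle does not depend on the scale s
    return ((angle + math.pi) / (2.0 * math.pi)) % 1.0


sample = [1.0, 2.0, 4.0]                        # hypothetical small sample
rescaled = [3.0 * xk - 5.0 for xk in sample]    # same sample after x -> 3x - 5

print(circle_coordinate(sample))
print(circle_coordinate(rescaled))  # same value up to rounding
```

Applying this map to each of the $s$ small samples would give $s$ numbers that, under the null hypothesis of normality, behave like independent uniform variables on $[0, 1]$, to which any standard test of uniformity could then be applied.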

REFERENCES

[1] A. A. PETROV, "Tests, based on small samples, of statistical hypotheses concerning the type of a distribution," Teor. Verojatnost. i Primenen., Vol. 1 (1956), pp. 248-271.

[2] I. N. KOVALENKO, "On the recovery of the additive type of a distribution on the basis of a sequence of series of independent observations," Proceedings of the All Union Congress on the Theory of Probability and Mathematical Statistics (Erevan, 1958), Erevan, Press of the Armenian Academy of Sciences, 1960.