
DEFINITIONS AND FORMULAE WITH

STATISTICAL TABLES

FOR ELEMENTARY STATISTICS AND

QUANTITATIVE METHODS COURSES

Department of Statistics

University of Oxford

October 2015

Contents

1 Laws of Probability
2 Theoretical mean and variance for discrete distributions
3 Mean and variance for sums of Normal random variables
4 Estimates from samples
5 Two common discrete distributions
6 Standard errors
7 95% confidence limits for population parameters
8 z-tests
9 t-tests
10 The χ²-test
11 Correlation and regression
12 Analysis of variance
13 Median test for two independent samples
14 Rank sum test or Mann-Whitney test
15 Sign test for matched pairs
16 Wilcoxon test for matched pairs
17 Kolmogorov-Smirnov test
18 Kruskal-Wallis test for several independent samples
19 Spearman's Rank Correlation Coefficient
20 TABLE 1 : The Normal Integral
21 TABLE 2 : Table of t    TABLE 3 : Table of χ²
22 TABLE 4 : Table of F for P = 0.05
23 TABLE 5 : Critical values of R for the Mann-Whitney rank-sum test
24 TABLE 6 : Critical values for T in the Wilcoxon Matched-Pairs Signed-Rank test

6 Standard errors

Single sample of size n

$\mathrm{SE}(\bar{x}) = \sigma/\sqrt{n}$ or, if $\sigma$ unknown, $s/\sqrt{n}$

$\mathrm{SE}(\hat{p}) = \sqrt{pq/n}$ with $q = 1 - p$, or, if $p$ unknown, $\sqrt{\hat{p}(1-\hat{p})/n}$

Sampling without replacement

When $n$ individuals are sampled from a population of $N$ without replacement, the standard error is reduced. The standard error for no replacement, $\mathrm{SE}_{NR}$, is related to the standard error with replacement, $\mathrm{SE}_{WR}$, by the formula

$\mathrm{SE}_{NR} = \mathrm{SE}_{WR}\sqrt{1 - \frac{n-1}{N-1}} = \frac{\sigma}{\sqrt{n}}\sqrt{1 - \frac{n-1}{N-1}}$

where $\sigma$ is the known standard deviation of the whole population.
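To make the single-sample formulae concrete, here is a minimal Python sketch of $\mathrm{SE}(\bar{x})$, $\mathrm{SE}(\hat{p})$ and the finite-population correction; the sample sizes and summary figures are invented for illustration and are not from the sheet.

```python
import math

# Hypothetical sample: n = 25 observations with sample s.d. s (sigma unknown).
n, s = 25, 4.0
se_mean = s / math.sqrt(n)                    # SE(x-bar) = s / sqrt(n)

# Hypothetical proportion: 12 successes out of 40 trials.
m, successes = 40, 12
p_hat = successes / m
se_prop = math.sqrt(p_hat * (1 - p_hat) / m)  # SE(p-hat) = sqrt(p(1-p)/n)

# Sampling the n = 25 values without replacement from a population of N = 200:
# SE_NR = SE_WR * sqrt(1 - (n-1)/(N-1))
N = 200
se_mean_nr = se_mean * math.sqrt(1 - (n - 1) / (N - 1))

print(round(se_mean, 3), round(se_prop, 3), round(se_mean_nr, 3))
```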

Two independent samples of sizes $n_1$ and $n_2$

$\mathrm{SE}(\bar{x}_1 - \bar{x}_2) = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$ or, if $\sigma_1$ and $\sigma_2$ unknown and different, $\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$.

For common but unknown $\sigma$, $\mathrm{SE}(\bar{x}_1 - \bar{x}_2) = s\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$ with $s^2 = \frac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1+n_2-2}$

$\mathrm{SE}(\hat{p}_1 - \hat{p}_2) = \sqrt{\frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2}}$, or, if $p_1$ and $p_2$ unknown and unequal, $\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}}$

For common but unknown $p$, $\mathrm{SE}(\hat{p}_1 - \hat{p}_2) = \sqrt{\hat{p}\hat{q}\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$ where $\hat{p}$ is a pooled estimate of $p$ defined as $\hat{p} = \frac{n_1\hat{p}_1 + n_2\hat{p}_2}{n_1 + n_2}$ and $\hat{q} = 1 - \hat{p}$.
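A similar sketch for the two-sample standard errors, again with invented summary statistics, pooling the variances and the proportions exactly as in the formulae above:

```python
import math

# Invented summary statistics for two independent samples.
n1, s1 = 30, 3.2
n2, s2 = 40, 3.5

# Unknown and different variances: SE = sqrt(s1^2/n1 + s2^2/n2)
se_diff = math.sqrt(s1**2 / n1 + s2**2 / n2)

# Common but unknown sigma: pooled s^2 = ((n1-1)s1^2 + (n2-1)s2^2) / (n1+n2-2)
s2_pooled = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
se_pooled = math.sqrt(s2_pooled) * math.sqrt(1 / n1 + 1 / n2)

# Difference of proportions with the pooled estimate p-hat.
p1_hat, p2_hat = 18 / n1, 16 / n2
p_hat = (n1 * p1_hat + n2 * p2_hat) / (n1 + n2)
se_prop_diff = math.sqrt(p_hat * (1 - p_hat) * (1 / n1 + 1 / n2))

print(round(se_diff, 3), round(se_pooled, 3), round(se_prop_diff, 3))
```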

7 95% confidence limits for population parameters

Mean: when $\sigma$ known use $\bar{x} \pm 1.96\,\sigma/\sqrt{n}$

when $\sigma$ unknown use $\bar{x} \pm t\,s/\sqrt{n}$

where $t$ is the tabulated two-sided 5% level value with degrees of freedom, d.f. $= n - 1$

Proportion: $\hat{p} \pm 1.96\sqrt{\hat{p}\hat{q}/n}$
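As a worked illustration of these limits, the sketch below uses invented data; the two-sided 5% t value for d.f. = 20 is read from Table 2 rather than computed.

```python
import math

# Invented sample of n = 21 values with mean 50.2 and s.d. 6.1 (sigma unknown).
n, xbar, s = 21, 50.2, 6.1
t_5pct = 2.09                                  # Table 2, two-sided P = 0.05, d.f. = 20
half_width = t_5pct * s / math.sqrt(n)
ci_mean = (xbar - half_width, xbar + half_width)

# Invented proportion: 34 successes in 80 trials.
m, successes = 80, 34
p_hat = successes / m
half_width_p = 1.96 * math.sqrt(p_hat * (1 - p_hat) / m)
ci_prop = (p_hat - half_width_p, p_hat + half_width_p)

print(ci_mean, ci_prop)
```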

8 z-tests

Single sample test for population mean $\mu$ (known $\sigma$): $z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$

Single sample test for population proportion $p$: $z = \frac{\hat{p} - p}{\sqrt{pq/n}}$

Two sample test for difference between two means (known $\sigma_1$ and $\sigma_2$): $z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}}$

Two sample test for difference between two proportions: $z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}\hat{q}(1/n_1 + 1/n_2)}}$

where $\hat{p}$ is a pooled estimate of $p$ defined as $\hat{p} = \frac{n_1\hat{p}_1 + n_2\hat{p}_2}{n_1 + n_2}$ and $\hat{q} = 1 - \hat{p}$
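The following sketch works the two-proportion z-test with the pooled estimate $\hat{p}$ on invented counts; normal_two_sided_p is our own helper name, and the two-sided p-value comes from the normal integral via math.erf.

```python
import math

def normal_two_sided_p(z):
    """Two-sided tail probability for a standard normal value z."""
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

# Invented counts: 45/120 successes in sample 1 versus 30/100 in sample 2.
n1, x1 = 120, 45
n2, x2 = 100, 30
p1_hat, p2_hat = x1 / n1, x2 / n2

# Pooled estimate p-hat = (n1 p1 + n2 p2) / (n1 + n2), q-hat = 1 - p-hat.
p_hat = (n1 * p1_hat + n2 * p2_hat) / (n1 + n2)
q_hat = 1 - p_hat

z = (p1_hat - p2_hat) / math.sqrt(p_hat * q_hat * (1 / n1 + 1 / n2))
print(round(z, 3), round(normal_two_sided_p(z), 4))
```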

9 t-tests

Population variance $\sigma^2$ unknown and estimated by $s^2$

Single sample test for population mean $\mu$: $t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$ with d.f. $= n - 1$

Paired samples: test for zero mean difference, using $n$ pairs $(x, y)$, $d = x - y$:

$t = \frac{\bar{d}}{s_d/\sqrt{n}}$ with d.f. $= n - 1$, where $\bar{d}$ and $s_d$ are the mean and standard deviation of $d$.

Independent samples test for difference between population means $\mu_x$ and $\mu_y$ using $n_x$ $x$'s and $n_y$ $y$'s. Provided that $s_x^2$ and $s_y^2$ are similar values, use the pooled variance estimate $s^2 = \frac{(n_x-1)s_x^2 + (n_y-1)s_y^2}{n_x + n_y - 2}$, and $t = \frac{\bar{x} - \bar{y}}{s\sqrt{1/n_x + 1/n_y}}$ with d.f. $= n_x + n_y - 2$
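A short sketch of the pooled two-sample t statistic on invented samples; statistics.stdev uses the n − 1 divisor, so it matches the sample standard deviation s used above.

```python
import math
from statistics import mean, stdev

# Two small invented samples.
x = [12.1, 11.4, 13.0, 12.7, 11.9, 12.4]
y = [10.8, 11.2, 10.5, 11.9, 10.9]
nx, ny = len(x), len(y)

# Pooled variance estimate s^2 = ((nx-1)sx^2 + (ny-1)sy^2) / (nx+ny-2)
sx2, sy2 = stdev(x) ** 2, stdev(y) ** 2
s2 = ((nx - 1) * sx2 + (ny - 1) * sy2) / (nx + ny - 2)

t = (mean(x) - mean(y)) / (math.sqrt(s2) * math.sqrt(1 / nx + 1 / ny))
df = nx + ny - 2
print(round(t, 3), df)   # compare |t| with the Table 2 value for d.f. = 9
```

If SciPy is available, scipy.stats.ttest_ind(x, y), which pools the variances by default, should reproduce the same statistic.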

10 The χ²-test

(Note that the two tests in this section are nonparametric tests. There are χ² tests of variances, not included here, that are parametric.)

χ² Goodness-of-fit tests using $k$ groups have d.f. $= (k - 1) - p$ where $p$ is the number of independent parameters estimated and used to obtain the (fitted) expected values.

χ² Contingency table tests on two-way tables with $r$ rows and $c$ columns have d.f. $= (r - 1)(c - 1)$

For both tests, $\chi^2 = \sum \frac{(O - E)^2}{E}$ where $O$ is an observed frequency and $E$ is the corresponding expected frequency
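A sketch of the contingency-table calculation on an invented 2×3 table, forming expected frequencies from the row and column totals and summing (O − E)²/E:

```python
# Invented 2x3 contingency table of observed frequencies.
observed = [[20, 30, 25],
            [30, 20, 25]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

chi2 = 0.0
for i, row in enumerate(observed):
    for j, O in enumerate(row):
        E = row_totals[i] * col_totals[j] / grand_total   # expected frequency
        chi2 += (O - E) ** 2 / E

df = (len(observed) - 1) * (len(observed[0]) - 1)         # (r-1)(c-1)
print(round(chi2, 3), df)   # compare with the Table 3 value for d.f. = 2
```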

13 Median test for two independent samples

For two independent samples, sizes $n_1$ and $n_2$, the median of the whole sample of $n = n_1 + n_2$ observations is found. The number in each sample above this median is counted and expressed as a proportion of that sample size. The two proportions are compared using the Z-test as in §8.
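A sketch of this median test on two invented samples; with samples this small the normal approximation behind the z-test is rough, so the numbers are purely illustrative.

```python
import math
from statistics import median

# Two invented independent samples.
sample1 = [4.2, 5.1, 6.3, 5.8, 4.9, 6.7, 5.5, 6.0]
sample2 = [3.9, 4.4, 5.0, 4.1, 4.8, 5.2, 3.7]
n1, n2 = len(sample1), len(sample2)

overall_median = median(sample1 + sample2)

# Proportion of each sample lying above the overall median.
p1_hat = sum(x > overall_median for x in sample1) / n1
p2_hat = sum(x > overall_median for x in sample2) / n2

# Pooled z-test for two proportions, as in section 8.
p_hat = (n1 * p1_hat + n2 * p2_hat) / (n1 + n2)
q_hat = 1 - p_hat
z = (p1_hat - p2_hat) / math.sqrt(p_hat * q_hat * (1 / n1 + 1 / n2))
print(round(z, 3))
```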

14 Rank sum test or Mann-Whitney test

For two independent samples, sizes $n_1$ and $n_2$, ranked without regard to sample, call the sum of the ranks in the smaller sample $R$. If $n_1 \le n_2 \le 10$ refer to Table 5, otherwise use a Z test with $z = (R - \mu)/\sigma$ where $\mu = \frac{1}{2}n_1(n_1 + n_2 + 1)$ and $\sigma = \sqrt{\frac{n_1 n_2(n_1 + n_2 + 1)}{12}}$, assuming $n_1 \le n_2$. In case of ties, ranks are averaged.
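A sketch of the large-sample z form on invented data (with n1 ≤ n2 ≤ 10, as here, the sheet refers to Table 5 instead, so this only illustrates the formula); scipy.stats.rankdata is assumed available and averages ranks over ties.

```python
import math
from scipy.stats import rankdata   # averages ranks over tied values

# Invented samples; sample1 is the smaller one (n1 <= n2).
sample1 = [7.1, 8.4, 6.9, 9.2, 7.8]
sample2 = [6.2, 7.0, 5.9, 6.8, 7.3, 6.5]
n1, n2 = len(sample1), len(sample2)

ranks = rankdata(sample1 + sample2)   # rank the pooled sample from 1 to n1+n2
R = ranks[:n1].sum()                  # rank sum of the smaller sample

mu = n1 * (n1 + n2 + 1) / 2
sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
z = (R - mu) / sigma
print(round(float(R), 1), round(z, 3))
```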

15 Sign test for matched pairs

The number of positive differences from the $n$ pairs is counted. This number is binomially distributed with $p = \frac{1}{2}$, assuming a population zero median difference. So apply the Z test for a binomial proportion with $p = \frac{1}{2}$.
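A sketch of the sign test on invented matched pairs, applying the z-test for a binomial proportion with p = 1/2 (no continuity correction):

```python
import math

# Invented matched pairs (x, y).
pairs = [(12.0, 13.5), (10.1, 9.8), (8.7, 9.9), (11.2, 12.0),
         (9.5, 10.4), (10.8, 10.3), (12.4, 13.1), (9.9, 11.0)]

diffs = [x - y for x, y in pairs if x != y]   # ignore zero differences
n = len(diffs)
positives = sum(d > 0 for d in diffs)

# z-test for a binomial proportion, null value p = 1/2.
p_hat, p = positives / n, 0.5
z = (p_hat - p) / math.sqrt(p * (1 - p) / n)
print(positives, n, round(z, 3))
```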

16 Wilcoxon test for matched pairs

Ignoring zero differences, the differences between the values in each pair are ranked without regard to sign and the sums of the positive ranks, $R_+$, and of the negative ranks, $R_-$, are calculated. (Check $R_+ + R_- = \frac{1}{2}n(n+1)$, where $n$ is the number of nonzero differences.) The smaller of $R_+$ and $R_-$ is called $T$ and may be compared with the critical values in Table 6 for a two-tailed test. (For one-tailed tests, use $R_-$ and $R_+$ with the same table, remembering to halve $P$.) In case of ties, ranks are averaged.
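A sketch computing R+, R− and T on invented pairs, including the check R+ + R− = n(n + 1)/2; scipy.stats.rankdata is assumed available for the tie-averaged ranks.

```python
from scipy.stats import rankdata   # averages ranks over tied values

# Invented matched pairs.
x = [12.0, 10.1, 8.7, 11.2, 9.5, 10.8, 12.4, 9.9]
y = [13.5, 9.8, 9.9, 12.0, 10.4, 10.3, 13.1, 11.0]

diffs = [a - b for a, b in zip(x, y) if a != b]   # ignore zero differences
n = len(diffs)

ranks = rankdata([abs(d) for d in diffs])         # rank |d| without regard to sign
R_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
R_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
assert R_plus + R_minus == n * (n + 1) / 2        # the check from the sheet

T = min(R_plus, R_minus)
print(R_plus, R_minus, T)   # compare T with the Table 6 critical value for n
```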

17 Kolmogorov-Smirnov test

Two samples of sizes $n_1$ and $n_2$ are each ordered along a scale. At each point on the scale the empirical cumulative distribution function is calculated for each sample and the differences between the pairs are recorded as $D_i$. The largest absolute value of the $D_i$ is called $D_{max}$ and this value is compared with the 5% one-tailed value

$D_{crit} = 1.36\sqrt{\frac{n_1 + n_2}{n_1 n_2}}$

The single sample version, which compares a sample with a theoretical distribution, uses

$D_{crit} = \frac{1.36}{\sqrt{n}}$

The test should only be used when there are no ties, but it commonly is used otherwise. With ties, the value of $D_{max}$ tends to be too small, so that the p-value is an overestimate.
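A sketch of the two-sample statistic on invented, tie-free data, evaluating both empirical cumulative distribution functions at every observed point:

```python
import math

# Invented samples with no ties.
sample1 = [2.1, 3.4, 2.8, 4.0, 3.1, 2.5]
sample2 = [3.0, 3.8, 4.2, 3.5, 4.5]
n1, n2 = len(sample1), len(sample2)

def ecdf(sample, t):
    """Empirical cumulative distribution function of `sample` at the point t."""
    return sum(x <= t for x in sample) / len(sample)

points = sorted(sample1 + sample2)
D_max = max(abs(ecdf(sample1, t) - ecdf(sample2, t)) for t in points)

D_crit = 1.36 * math.sqrt((n1 + n2) / (n1 * n2))   # 5% critical value from the sheet
print(round(D_max, 3), round(D_crit, 3), D_max > D_crit)
```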

18 Kruskal-Wallis test for several independent samples

(Analysis of variance for a single factor). For $k$ samples of sizes $n_1, n_2, \ldots, n_k$, comprising a total of $n$ observations, all values are ranked without regard to sample, from 1 to $n$. The rank sums for the samples are calculated as $R_1, R_2, \ldots, R_k$. (Check $\sum R_i = \frac{1}{2}n(n+1)$.) The test statistic is

$H = \frac{12}{n(n+1)}\sum \frac{R_i^2}{n_i} - 3(n+1)$,

which is compared with the χ² table with d.f. $= k - 1$
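A sketch of H for three invented samples, ranking all values together and accumulating the $R_i^2/n_i$ terms; scipy.stats.rankdata is assumed available.

```python
from scipy.stats import rankdata   # averages ranks over tied values

# Three invented samples.
samples = [[6.1, 7.2, 5.8, 6.9],
           [7.5, 8.1, 7.9, 8.4, 7.0],
           [5.2, 5.9, 6.3, 5.5]]
sizes = [len(s) for s in samples]
n, k = sum(sizes), len(samples)

ranks = rankdata([value for s in samples for value in s])  # rank all n values together

# Rank sum R_i for each sample, taken from consecutive slices of `ranks`.
total, start = 0.0, 0
for ni in sizes:
    R_i = ranks[start:start + ni].sum()
    total += R_i ** 2 / ni
    start += ni

H = 12 / (n * (n + 1)) * total - 3 * (n + 1)
print(round(H, 3), k - 1)   # H and d.f. = k - 1, compared with the Table 3 value
```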

19 Spearman’s Rank Correlation Coefficient

If $x$ and $y$ are ranked variables the Spearman Rank Correlation Coefficient is just the sample product moment correlation coefficient between the pairs of ranks, $r_s$, which may also be computed by

$r_s = 1 - \frac{6\sum d^2}{n(n^2 - 1)}$

where $d$ is the difference $x - y$, and $n$ is the number of pairs $(x, y)$.

Test $r_s$ using $t = r_s\sqrt{\frac{n - 2}{1 - r_s^2}}$ with d.f. $= n - 2$
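A sketch computing $r_s$ from the $6\sum d^2$ formula and the corresponding t statistic on invented pairs (no ties, so the shortcut formula applies exactly); scipy.stats.rankdata is assumed available.

```python
import math
from scipy.stats import rankdata   # averages ranks over tied values

# Invented paired observations.
x = [3.1, 4.5, 2.2, 5.0, 3.8, 4.1, 2.9]
y = [2.8, 4.9, 2.5, 4.6, 3.5, 5.2, 2.6]
n = len(x)

rx, ry = rankdata(x), rankdata(y)
d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))   # sum of squared rank differences

r_s = 1 - 6 * d2 / (n * (n ** 2 - 1))
t = r_s * math.sqrt((n - 2) / (1 - r_s ** 2))    # test statistic with d.f. = n - 2
print(round(float(r_s), 3), round(float(t), 3))
```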



21 TABLE 2 : Table of t    TABLE 3 : Table of χ²

TABLE 2 gives the probability P of lying outside ±t; TABLE 3 gives the probability P of a value of χ² greater than the tabulated value.

      TABLE 2 : t                             TABLE 3 : χ²
d.f.  P=0.10  P=0.05  P=0.02  P=0.01      d.f.  P=0.05  P=0.01
1     6.31    12.71   31.82   63.7        1     3.84    6.63
2     2.92    4.30    6.96    9.93        2     5.99    9.21
3     2.35    3.18    4.54    5.84        3     7.81    11.34
4     2.13    2.78    3.75    4.60        4     9.49    13.28
5     2.02    2.57    3.36    4.03        5     11.07   15.09
6     1.94    2.45    3.14    3.71        6     12.59   16.81
7     1.90    2.37    3.00    3.50        7     14.07   18.48
8     1.86    2.31    2.90    3.36        8     15.51   20.09
9     1.83    2.26    2.82    3.25        9     16.92   21.67
10    1.81    2.23    2.76    3.17        10    18.31   23.21
11    1.80    2.20    2.72    3.11        11    19.68   24.73
12    1.78    2.18    2.68    3.06        12    21.03   26.22
13    1.77    2.16    2.65    3.01        13    22.36   27.69
14    1.76    2.15    2.62    2.98        14    23.68   29.14
15    1.75    2.13    2.60    2.95        15    25.00   30.58
16    1.75    2.12    2.58    2.92        16    26.30   32.00
17    1.74    2.11    2.57    2.90        17    27.59   33.41
18    1.73    2.10    2.55    2.88        18    28.87   34.81
19    1.73    2.09    2.54    2.86        19    30.14   36.19
20    1.73    2.09    2.53    2.85        20    31.41   37.57
21    1.72    2.08    2.52    2.83        21    32.67   38.93
22    1.72    2.07    2.51    2.82        22    33.92   40.29
23    1.71    2.07    2.50    2.81        23    35.17   41.64
24    1.71    2.06    2.49    2.80        24    36.42   42.98
25    1.71    2.06    2.49    2.79        25    37.65   44.31
26    1.71    2.06    2.48    2.78        26    38.89   45.64
27    1.70    2.05    2.47    2.77        27    40.11   46.96
28    1.70    2.05    2.47    2.76        28    41.34   48.28
29    1.70    2.05    2.46    2.76        29    42.56   49.59
30    1.70    2.04    2.46    2.75        30    43.77   50.89
40    1.68    2.02    2.42    2.70        40    55.76   63.69
60    1.67    2.00    2.39    2.66        60    79.08   88.38
∞     1.65    1.96    2.33    2.58
22 TABLE 4 : Table of F for P = 0.05

Variance ratio $F = s_1^2/s_2^2$ with $\nu_1$ and $\nu_2$ degrees of freedom respectively.

ν2 \ ν1    1      2      3      4      5      6      8      12     24     ∞
6          5.99   5.14   4.76   4.53   4.39   4.28   4.15   4.00   3.84   3.67
8          5.32   4.46   4.07   3.84   3.69   3.58   3.44   3.28   3.12   2.93
10         4.96   4.10   3.71   3.48   3.33   3.22   3.07   2.91   2.74   2.54
12         4.75   3.89   3.49   3.26   3.11   3.00   2.85   2.69   2.51   2.30
14         4.60   3.74   3.34   3.11   2.96   2.85   2.70   2.53   2.35   2.13
16         4.49   3.63   3.24   3.01   2.85   2.74   2.59   2.42   2.24   2.01
18         4.41   3.55   3.16   2.93   2.77   2.66   2.51   2.34   2.15   1.92
20         4.35   3.49   3.10   2.87   2.71   2.60   2.45   2.28   2.08   1.84
30         4.17   3.32   2.92   2.69   2.53   2.42   2.27   2.09   1.89   1.62
40         4.08   3.23   2.84   2.61   2.45   2.34   2.18   2.00   1.79   1.51
60         4.00   3.15   2.76   2.53   2.37   2.25   2.10   1.92   1.70   1.39
23 TABLE 5 : Critical values of R for the Mann-Whitney rank-sum test

The pairs of values below are approximate critical values of R for two-tailed tests at levels P = 0.10 (upper pair) and P = 0.05 (lower pair). (Use relevant P = 0.10 entry for one-tailed test at level 0.05.)

                               larger sample size n2
smaller sample size n1     4        5        6        7        8        9        10
4    P=0.10               12,24    13,27    14,30    15,33    16,36    17,39    18,42
     P=0.05               11,25    12,28    12,32    13,35    14,38    15,41    16,44
5    P=0.10                        19,36    20,40    22,43    23,47    25,50    26,54
     P=0.05                        18,37    19,41    20,45    21,49    22,53    24,56
6    P=0.10                                 28,50    30,54    32,58    33,63    35,67
     P=0.05                                 26,52    28,56    29,61    31,65    33,69
7    P=0.10                                          39,66    41,71    43,76    46,80
     P=0.05                                          37,68    39,73    41,78    43,83
8    P=0.10                                                   52,84    54,90    57,95
     P=0.05                                                   49,87    51,93    54,98
9    P=0.10                                                            66,105   69,111
     P=0.05                                                            63,108   66,114
10   P=0.10                                                                     83,127
     P=0.05                                                                     79,131