Statistical Methods Formula Sheet: Descriptive Stats, Probability, Hypothesis Testing, Cheat Sheet of Statistics

Formulas and concepts for statistical methods, including descriptive statistics (five number summary, sample variance, permutations, combinations), probability theory (addition rule, multiplication rule, independent events, law of total probability, De Morgan's laws), expected value and variance for discrete and continuous random variables, and hypothesis testing and confidence intervals for one mean, the difference of two means, one proportion, and the difference of two proportions. It also covers the binomial, hypergeometric, Poisson, and normal distributions, as well as sampling distributions.

What you will learn

  • What is the multiplication rule in probability theory?
  • How do you calculate the sample mean?
  • What is the difference between permutations and combinations?
  • What is the expected value of a continuous random variable?
  • How do you calculate the sample variance using the given formula?

Typology: Cheat Sheet

2021/2022

Uploaded on 02/07/2022

zeb
Formula Sheet for Statistical Methods (201-DDD-05)

Five number summary:

min, $Q_1$, median, $Q_3$, max

$Q_1$: median of the smallest half

$Q_3$: median of the largest half

Fourth spread

$f_s = Q_3 - Q_1$

Outliers

$x_i$ is an outlier if its distance from the closest fourth ($Q_1$ or $Q_3$) is $> 1.5 f_s$
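
As a rough illustration of the five number summary, the fourth spread, and the outlier rule, here is a minimal Python sketch. The function names are hypothetical, and the code assumes that for an odd sample size the median is counted in both halves when the fourths are computed; the sheet does not state which convention it uses, and others exist.

```python
from statistics import median

def five_number_summary(data):
    """Return (min, Q1, median, Q3, max) for a non-empty sample."""
    xs = sorted(data)
    n = len(xs)
    # Assumption: for odd n the median belongs to both halves.
    half = (n + 1) // 2
    q1 = median(xs[:half])        # median of the smallest half
    q3 = median(xs[n - half:])    # median of the largest half
    return xs[0], q1, median(xs), q3, xs[-1]

def outliers(data):
    """Values whose distance from the closest fourth exceeds 1.5 * fs."""
    _, q1, _, q3, _ = five_number_summary(data)
    fs = q3 - q1                  # fourth spread
    return [x for x in data if x < q1 - 1.5 * fs or x > q3 + 1.5 * fs]

sample = [1, 2, 3, 4, 5, 6, 7, 8, 100]
print(five_number_summary(sample))  # (1, 3, 5, 7, 100)
print(outliers(sample))             # [100]
```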

Sample variance

$s^2 = \frac{1}{n-1} \sum (x_i - \bar{x})^2$

$s^2 = \frac{1}{n-1} \left[ \sum x_i^2 - \frac{(\sum x_i)^2}{n} \right]$

Sample standard deviation

$s = \sqrt{s^2}$
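
The two sample variance formulas are algebraically equivalent. The sketch below checks this on a small made-up data set and compares both against `statistics.variance` from the Python standard library; the function names and the data are illustrative assumptions.

```python
import math
from statistics import variance

def sample_variance_definition(xs):
    """s^2 as the sum of squared deviations from the mean over n - 1."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)

def sample_variance_shortcut(xs):
    """s^2 via the computational formula using sum(x^2) and (sum x)^2 / n."""
    n = len(xs)
    return (sum(x * x for x in xs) - sum(xs) ** 2 / n) / (n - 1)

data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical sample
s2 = sample_variance_definition(data)
s = math.sqrt(s2)                  # sample standard deviation
print(s2, sample_variance_shortcut(data), variance(data), s)
```

In floating point the shortcut form can lose precision when the values are large relative to their spread, so the definitional form is usually the safer one to code.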

Permutations

$P_{k,n} = \frac{n!}{(n-k)!}$

Combinations

$C_{k,n} = \binom{n}{k} = \frac{n!}{k!\,(n-k)!}$
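
A short sketch of both counting formulas, written directly from the factorial definitions and cross-checked against `math.perm` and `math.comb` (available in Python 3.8 and later). The function names and the argument order (k first, matching the subscripts on the sheet) are choices made here for illustration.

```python
import math

def permutations(k, n):
    """P_{k,n}: number of ordered selections of k items from n."""
    return math.factorial(n) // math.factorial(n - k)

def combinations(k, n):
    """C_{k,n}: number of unordered selections of k items from n."""
    return math.factorial(n) // (math.factorial(k) * math.factorial(n - k))

print(permutations(2, 5))  # 20
print(combinations(2, 5))  # 10

# The standard library takes the arguments in the order (n, k).
assert permutations(2, 5) == math.perm(5, 2)
assert combinations(2, 5) == math.comb(5, 2)
```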

Addition rule

$P(A \cup B) = P(A) + P(B) - P(A \cap B)$

Multiplication rule

$P(A \cap B) = P(A)\,P(B \mid A)$

Independent events

$A$ and $B$ are independent if $P(B \mid A) = P(B)$, equivalently $P(A \cap B) = P(A)\,P(B)$

Law of Total Probability

$A_1, \ldots, A_k$ mutually exclusive and exhaustive:

$P(B) = P(A_1 \cap B) + \cdots + P(A_k \cap B)$

Special case: $P(E) + P(E') = 1$

De Morgan's laws

$(A \cup B)' = A' \cap B'$

$(A \cap B)' = A' \cup B'$
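
The probability rules above can be checked on a small finite sample space. The sketch below uses one roll of a fair die with two made-up events; the sample space, the events, and the helper names are all assumptions chosen for illustration.

```python
from fractions import Fraction

omega = set(range(1, 7))   # hypothetical sample space: one fair die
A = {2, 4, 6}              # "even"
B = {4, 5, 6}              # "at least four"

def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event & omega), len(omega))

def complement(event):
    return omega - event

# Addition rule
assert prob(A | B) == prob(A) + prob(B) - prob(A & B)

# Multiplication rule, with P(B|A) counted inside the reduced space A
p_b_given_a = Fraction(len(A & B), len(A))
assert prob(A & B) == prob(A) * p_b_given_a

# Law of total probability with the partition {A, A'}
assert prob(B) == prob(A & B) + prob(complement(A) & B)

# De Morgan's laws
assert complement(A | B) == complement(A) & complement(B)
assert complement(A & B) == complement(A) | complement(B)

# Independence check: here P(A and B) = 1/3 but P(A)P(B) = 1/4
independent = prob(A & B) == prob(A) * prob(B)
print(prob(A | B), independent)  # 2/3 False
```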

Expected value for a discrete r.v.

$E(X) = \mu_X = \sum x\,p(x)$

$E(h(X)) = \sum h(x)\,p(x)$

Expected value for a continuous r.v.

$E(X) = \mu_X = \int_{-\infty}^{\infty} x\,f(x)\,dx$

$E(h(X)) = \int_{-\infty}^{\infty} h(x)\,f(x)\,dx$

Variance and standard deviation

$V(X) = \sigma_X^2 = E(X^2) - [E(X)]^2$

$\sigma_X = \sqrt{V(X)}$

Rule for expected value

$E(aX + b) = aE(X) + b$

Rule for variance

$V(aX + b) = a^2 V(X)$
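
For a discrete random variable, the expected value, the variance shortcut, and the two linear-transformation rules can be verified exactly with rational arithmetic. The probability mass function and the constants a and b below are made up for illustration, as are the helper names.

```python
from fractions import Fraction as F

# Hypothetical pmf: values of X mapped to their probabilities.
pmf = {0: F(1, 5), 1: F(1, 2), 2: F(3, 10)}

def expect(pmf, h=lambda x: x):
    """E(h(X)) = sum of h(x) p(x); with the default h this is E(X)."""
    return sum(h(x) * p for x, p in pmf.items())

def var(pmf):
    """V(X) = E(X^2) - E(X)^2."""
    return expect(pmf, lambda x: x * x) - expect(pmf) ** 2

a, b = 3, 2
# pmf of Y = aX + b: same probabilities, transformed values.
pmf_y = {a * x + b: p for x, p in pmf.items()}

assert expect(pmf_y) == a * expect(pmf) + b   # rule for expected value
assert var(pmf_y) == a ** 2 * var(pmf)        # rule for variance

print(expect(pmf), var(pmf))  # 11/10 49/100
```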

Binomial distribution

$X \sim \mathrm{Bin}(n, p)$:

$p(x) = \binom{n}{x} p^x (1-p)^{n-x}$ for $x = 0, 1, \ldots, n$

$E(X) = np$, $V(X) = np(1-p)$
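
The binomial formulas can be sanity-checked numerically: the pmf should sum to 1, and the mean and variance computed from the pmf should match np and np(1 − p). The parameter values and names below are illustrative assumptions.

```python
import math

def binom_pmf(x, n, p):
    """P(X = x) for X ~ Bin(n, p)."""
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)

n, p = 10, 0.3   # hypothetical parameters
probs = [binom_pmf(x, n, p) for x in range(n + 1)]

total = sum(probs)
mean = sum(x * q for x, q in enumerate(probs))
variance = sum(x * x * q for x, q in enumerate(probs)) - mean ** 2

print(total, mean, variance)   # close to 1, n*p = 3, n*p*(1-p) = 2.1
```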

Hypergeometric distribution

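$n$ = sample size, $N$ = population size, $M$ = number of successes in population

$p(x) = \frac{\binom{M}{x} \binom{N-M}{n-x}}{\binom{N}{n}}$

$E(X) = n \cdot \frac{M}{N}$, $V(X) = \frac{N-n}{N-1} \cdot n \cdot \frac{M}{N} \left(1 - \frac{M}{N}\right)$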
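
A minimal sketch of the hypergeometric pmf using exact fractions, confirming that the mean and variance computed from the pmf agree with the closed-form expressions. The values of n, M, and N are made up, and the function name is hypothetical.

```python
from fractions import Fraction
from math import comb

def hypergeom_pmf(x, n, M, N):
    """P(X = x): x successes in a sample of n drawn without replacement
    from a population of N that contains M successes."""
    return Fraction(comb(M, x) * comb(N - M, n - x), comb(N, n))

n, M, N = 5, 7, 20   # hypothetical sample size, successes, population size
# Possible values of X: cannot exceed n or M, and the n - x failures
# drawn cannot exceed the N - M failures available.
support = range(max(0, n - (N - M)), min(n, M) + 1)

mean = sum(x * hypergeom_pmf(x, n, M, N) for x in support)
var = sum(x * x * hypergeom_pmf(x, n, M, N) for x in support) - mean ** 2

assert mean == Fraction(n * M, N)
assert var == Fraction(N - n, N - 1) * n * Fraction(M, N) * (1 - Fraction(M, N))
print(mean, var)
```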

Poisson distribution

$X \sim \mathrm{Poisson}(\lambda)$:

$p(x) = \frac{e^{-\lambda} \lambda^x}{x!}$ for $x = 0, 1, 2, \ldots$

$E(X) = \lambda$, $V(X) = \lambda$

Percentiles

$\eta$ = $100p$th percentile of $X$ (continuous r.v.):

$P(X \le \eta) = p$

Normal distribution

If $X \sim N(\mu, \sigma)$ then $\frac{X - \mu}{\sigma} \sim N(0, 1)$

For $Z \sim N(0, 1)$ set $\Phi(z) = P(Z \le z)$

$\Phi(z_\alpha) = 1 - \alpha$
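
Standardizing, evaluating Φ, and finding a critical value z_α can all be done with `statistics.NormalDist` from the Python standard library (3.8 and later). The values of μ, σ, and x below are made up, and the helper names are hypothetical.

```python
from statistics import NormalDist

Z = NormalDist()   # standard normal: mean 0, standard deviation 1

def phi(z):
    """Phi(z) = P(Z <= z)."""
    return Z.cdf(z)

def z_alpha(alpha):
    """Critical value with upper-tail area alpha: Phi(z_alpha) = 1 - alpha."""
    return Z.inv_cdf(1 - alpha)

mu, sigma = 100, 15          # hypothetical parameters of X
x = 130
z = (x - mu) / sigma         # standardized value, here 2.0

print(phi(z))                # roughly 0.977, i.e. P(X <= 130)
print(z_alpha(0.025))        # roughly 1.96
```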

Statistics

$X_1, \ldots, X_n$ random sample:

$\bar{X} = \frac{1}{n} \sum X_i$ (sample mean)

$S^2 = \frac{1}{n-1} \sum (X_i - \bar{X})^2$ (sample variance)

Sampling distributions

$X_1, \ldots, X_n$ random sample, $X_i \sim$ distribution with mean $\mu$ and std. dev. $\sigma$:

$E(\bar{X}) = \mu$, $V(\bar{X}) = \sigma^2 / n$

CLT: $\frac{\bar{X} - \mu}{\sigma / \sqrt{n}} \sim N(0, 1)$ approximately ($n > 30$)
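
As an illustration of how the CLT is typically applied, the sketch below approximates a probability about the sample mean by standardizing with σ/√n. The numbers are made up and the function name is hypothetical; the approximation relies on the n > 30 rule of thumb stated on the sheet.

```python
from math import sqrt
from statistics import NormalDist

def prob_mean_at_most(c, mu, sigma, n):
    """Approximate P(Xbar <= c) using the CLT."""
    standard_error = sigma / sqrt(n)     # std. dev. of the sample mean
    z = (c - mu) / standard_error
    return NormalDist().cdf(z)

# Hypothetical population: mean 50, std. dev. 10, sample of size 36.
print(prob_mean_at_most(52, mu=50, sigma=10, n=36))   # about 0.885
```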

Regression and Correlation

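$S_{xx} = \sum x_i^2 - \frac{(\sum x_i)^2}{n}$

$S_{yy} = \sum y_i^2 - \frac{(\sum y_i)^2}{n}$

$S_{xy} = \sum x_i y_i - \frac{(\sum x_i)(\sum y_i)}{n}$

$\mathrm{SSE} = \sum y_i^2 - \hat{\beta}_0 \sum y_i - \hat{\beta}_1 \sum x_i y_i$

$\mathrm{SST} = S_{yy}$

$\hat{\beta}_1 = \frac{S_{xy}}{S_{xx}}$

$\hat{\beta}_0 = \frac{\sum y_i - \hat{\beta}_1 \sum x_i}{n}$

$r = \frac{S_{xy}}{\sqrt{S_{xx} S_{yy}}}$, $r^2 = 1 - \frac{\mathrm{SSE}}{\mathrm{SST}}$

$s^2 = \frac{\mathrm{SSE}}{n-2}$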
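
The regression and correlation quantities can be computed in the order listed, starting from the raw sums. The sketch below does this for a tiny made-up data set; the function name and the dictionary keys are hypothetical.

```python
from math import sqrt

def simple_regression(xs, ys):
    """Least-squares line and related summaries via Sxx, Syy, Sxy."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_yy = sum(y * y for y in ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))

    sxx = sum(x * x for x in xs) - sum_x ** 2 / n
    syy = sum_yy - sum_y ** 2 / n          # also SST
    sxy = sum_xy - sum_x * sum_y / n

    b1 = sxy / sxx                         # slope estimate
    b0 = (sum_y - b1 * sum_x) / n          # intercept estimate
    sse = sum_yy - b0 * sum_y - b1 * sum_xy

    return {
        "b0": b0,
        "b1": b1,
        "sse": sse,
        "r": sxy / sqrt(sxx * syy),
        "r2": 1 - sse / syy,
        "s2": sse / (n - 2),
    }

# Hypothetical data
fit = simple_regression([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(fit)   # slope about 0.6, intercept about 2.2, r2 about 0.6
```

The computational formula for SSE subtracts quantities of similar size, so with real data it is worth carrying full precision in the coefficient estimates before plugging them in.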

HYPOTHESIS TESTING AND CONFIDENCE INTERVALS

($\alpha$ = significance level) ($100(1 - \alpha)\%$ confidence level)

One mean

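$H_0: \mu = \mu_0$

$z^* = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} \sim N(0, 1)$ (if $n > 30$)

$t^* = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} \sim t_{n-1}$ (if data normally distr.)

$\bar{x} \pm z_{\alpha/2} \frac{s}{\sqrt{n}}$ (if $n > 30$)

$\bar{x} \pm t_{\alpha/2,\,n-1} \frac{s}{\sqrt{n}}$ (if data normally distr.)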
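
A minimal sketch of the large-sample (z) test and confidence interval for one mean, working from summary statistics. The numbers are made up, the function name is hypothetical, and a two-sided alternative is assumed since the sheet lists only the null hypothesis. The t versions have the same structure but need a t critical value from a table or a statistics package, which the Python standard library does not provide.

```python
from math import sqrt
from statistics import NormalDist

def one_mean_z(xbar, s, n, mu0, alpha=0.05):
    """Return (z*, confidence interval, reject H0?) for H0: mu = mu0."""
    standard_error = s / sqrt(n)
    z_star = (xbar - mu0) / standard_error
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)      # z_{alpha/2}
    ci = (xbar - z_crit * standard_error, xbar + z_crit * standard_error)
    reject = abs(z_star) > z_crit                     # two-sided (assumption)
    return z_star, ci, reject

# Hypothetical summary statistics with n > 30
z_star, ci, reject = one_mean_z(xbar=52, s=10, n=36, mu0=50)
print(z_star, ci, reject)   # z* about 1.2, interval about (48.7, 55.3), False
```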

Difference of two means

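$H_0: \mu_1 - \mu_2 = \Delta_0$

$z^* = \frac{\bar{x}_1 - \bar{x}_2 - \Delta_0}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} \sim N(0, 1)$ (if $n_1 > 30$ and $n_2 > 30$)

$t^* = \frac{\bar{x}_1 - \bar{x}_2 - \Delta_0}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} \sim t_\nu$ (if data normally distr.)

$\bar{x}_1 - \bar{x}_2 \pm z_{\alpha/2} \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$ (if $n_1 > 30$ and $n_2 > 30$)

$\bar{x}_1 - \bar{x}_2 \pm t_{\alpha/2,\,\nu} \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$ (if data normally distr.)

where $\nu = \frac{\left( \frac{s_1^2}{n_1} + \frac{s_2^2}{n_2} \right)^2}{\frac{(s_1^2 / n_1)^2}{n_1 - 1} + \frac{(s_2^2 / n_2)^2}{n_2 - 1}}$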
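
The degrees-of-freedom formula for ν is the most error-prone part of the two-sample procedure, so a small sketch may help. It computes the test statistic and ν from summary statistics; the inputs are made up and the function names are hypothetical. Many textbooks round ν down to an integer before using a t table; the sheet does not say, so no rounding is done here.

```python
from math import sqrt

def two_mean_statistic(x1, x2, s1, s2, n1, n2, delta0=0.0):
    """Test statistic for H0: mu1 - mu2 = delta0 (same form for z* and t*)."""
    return (x1 - x2 - delta0) / sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)

def welch_df(s1, n1, s2, n2):
    """Approximate degrees of freedom nu for the two-sample t procedure."""
    a = s1 ** 2 / n1
    b = s2 ** 2 / n2
    return (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))

# Hypothetical summary statistics
t_star = two_mean_statistic(x1=10, x2=8, s1=2, s2=3, n1=10, n2=15)
nu = welch_df(s1=2, n1=10, s2=3, n2=15)
print(t_star, nu)   # t* about 2.0, nu about 23
```

With equal standard deviations and equal sample sizes n, the formula reduces to ν = 2(n − 1), which is a convenient check.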

One proportion

$H_0: p = p_0$

$z^* = \frac{\hat{p} - p_0}{\sqrt{p_0 (1 - p_0) / n}} \sim N(0, 1)$ (if $n$ large)

$\hat{p} \pm z_{\alpha/2} \sqrt{\frac{\hat{p} (1 - \hat{p})}{n}}$
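
Note that the test statistic for one proportion uses p₀ in the standard error while the confidence interval uses p̂. The sketch below keeps the two apart; the counts are made up and the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def one_proportion_z(successes, n, p0, alpha=0.05):
    """Return (z*, confidence interval) for H0: p = p0, large n assumed."""
    p_hat = successes / n
    # Test statistic: standard error computed under H0, using p0.
    z_star = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    # Confidence interval: standard error estimated from p_hat.
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z_crit * sqrt(p_hat * (1 - p_hat) / n)
    return z_star, (p_hat - half_width, p_hat + half_width)

# Hypothetical data: 60 successes in 100 trials, H0: p = 0.5
z_star, ci = one_proportion_z(60, 100, 0.5)
print(z_star, ci)   # z* about 2.0, interval about (0.504, 0.696)
```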

Difference of two proportions

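$H_0: p_1 - p_2 = \Delta_0$

$z^* = \frac{\hat{p}_1 - \hat{p}_2 - \Delta_0}{\sqrt{\frac{\hat{p}_1 (1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2 (1 - \hat{p}_2)}{n_2}}} \sim N(0, 1)$ (if $n_1$, $n_2$ large)

$\hat{p}_1 - \hat{p}_2 \pm z_{\alpha/2} \sqrt{\frac{\hat{p}_1 (1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2 (1 - \hat{p}_2)}{n_2}}$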

One variance

$H_0: \sigma^2 = \sigma_0^2$

$\chi^2 = \frac{(n-1) s^2}{\sigma_0^2} \sim \chi^2_{n-1}$ (if data normally distr.)

$\left( \frac{(n-1) s^2}{\chi^2_{\alpha/2,\,n-1}},\ \frac{(n-1) s^2}{\chi^2_{1-\alpha/2,\,n-1}} \right)$

Ratio of two variances

$H_0: \sigma_1^2 = \sigma_2^2$

$f^* = \frac{s_1^2}{s_2^2} \sim F_{n_1 - 1,\,n_2 - 1}$ (if data normally distr.)

property of critical $F$-values: $F_{1-\alpha/2,\,\nu_1,\,\nu_2} = \frac{1}{F_{\alpha/2,\,\nu_2,\,\nu_1}}$

Slope of regression line

$H_0: \beta_1 = \beta_{10}$

$t^* = \frac{\hat{\beta}_1 - \beta_{10}}{s / \sqrt{S_{xx}}} \sim t_{n-2}$ (if data normally distr.)

$\hat{\beta}_1 \pm t_{\alpha/2,\,n-2} \frac{s}{\sqrt{S_{xx}}}$

Correlation coefficient

$H_0: \rho = 0$

$t^* = \frac{r \sqrt{n-2}}{\sqrt{1 - r^2}} \sim t_{n-2}$ (if data normally distr.)
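
The test statistic for the correlation coefficient depends only on r and n, so it is easy to compute by hand or in a line of code. The sketch below uses made-up values and a hypothetical function name; the result would then be compared with a critical value of the t distribution with n − 2 degrees of freedom taken from a table.

```python
from math import sqrt

def correlation_t(r, n):
    """t statistic for H0: rho = 0, given sample correlation r and size n."""
    return r * sqrt(n - 2) / sqrt(1 - r ** 2)

# Hypothetical values: r = 0.6 from n = 18 pairs
print(correlation_t(0.6, 18))   # about 3.0, with 16 degrees of freedom
```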

Test of normality (Ryan-Joiner)

$H_0$: population distribution is normal

test statistic: correlation coefficient $r$ from probability plot

if $r < r_c$, reject $H_0$