Statistics and Probability Cheat Sheet, Cheat Sheet of Statistics

Descriptive statistics, probability terms and rules and many formulas

Typology: Cheat Sheet

2020/2021

Uploaded on 04/26/2021

thecoral
Statistics Cheat Sheets

Descriptive Statistics:

Example data set: {1, 16, 1, 3, 9}

| Term | Meaning | Population Formula | Sample Formula | Example |
|---|---|---|---|---|
| Sort | Sort values in increasing order | | | {1, 1, 3, 9, 16} |
| Mean | Average | $\mu_X = \frac{1}{N}\sum_{i=1}^{N} X_i$ | $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i = \frac{X_1 + X_2 + \dots + X_n}{n}$ | 6 |
| Median | The middle value: half are below and half are above | | | 3 |
| Mode | The value with the most appearances | | | 1 |
| Variance | The average of the squared deviations between the values and the mean | $\sigma^2 = \frac{1}{N}\sum_{i=1}^{N} (X_i - \mu)^2$ | $s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar{X})^2$ | $(1-6)^2 + (1-6)^2 + (3-6)^2 + (9-6)^2 + (16-6)^2$ divided by 5 values = 168/5 = 33.6 (population formula) |
| Standard Deviation | The square root of Variance, thought of as the "average" deviation from the mean. | $\sigma = \sqrt{\sigma^2}$ | $s = \sqrt{s^2}$ | Square root of 33.6 = 5.7966 |
| Coefficient of Variation | The variation relative to the value of the mean | | $CV = \frac{s}{\bar{X}}$ | 5.7966 divided by 6 = 0.9661 |
| Minimum | The minimum value | | | 1 |
| Maximum | The maximum value | | | 16 |
| Range | Maximum minus Minimum | | | 16 - 1 = 15 |
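The example column can be checked with a short Python sketch. It uses only the standard-library `statistics` module, and the variable names are illustrative rather than part of the cheat sheet. The sheet's variance example divides by 5 values, so `pvariance` and `pstdev` (population formulas) reproduce it, while `variance` applies the sample formula with n - 1.

```python
import statistics

data = [1, 16, 1, 3, 9]  # example data set from the table

mean = statistics.mean(data)
median = statistics.median(data)
mode = statistics.mode(data)
pop_var = statistics.pvariance(data)    # divides by N, as in the table's example
sample_var = statistics.variance(data)  # divides by n - 1 (sample formula)
pop_sd = statistics.pstdev(data)
cv = pop_sd / mean                      # coefficient of variation
value_range = max(data) - min(data)

print(sorted(data), mean, median, mode)
print(pop_var, sample_var, round(pop_sd, 4), round(cv, 4), value_range)
```

The sample variance (168/4) is larger than the population variance (168/5) because it divides the same sum of squared deviations by a smaller number.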
Probability Terms:

| Term | Meaning | Notation | Example* (see footnote) |
|---|---|---|---|
| Probability | For any event A, the probability satisfies $0 \le P(A) \le 1$. | $P(\cdot)$ | 0.5 |
| Random Experiment | A process leading to at least 2 possible outcomes with uncertainty as to which will occur. | | Rolling a die |
| Event | A subset of all possible outcomes of an experiment. | | Events A and B |
| Intersection of Events | Let A and B be two events. Then the intersection of the two events is the event that both A and B occur (logical AND). | $A \cap B$ | The event that a 2 appears |
| Union of Events | The union of the two events is the event that A or B (or both) occurs (logical OR). | $A \cup B$ | The event that a 1, 2, 4, 5 or 6 appears |
| Complement | Let A be an event. The complement of A is the event that A does not occur (logical NOT). | $\bar{A}$ | The event that an odd number appears |
| Mutually Exclusive Events | A and B are said to be mutually exclusive if at most one of the events A and B can occur. | | A and B are not mutually exclusive because if a 2 appears, both A and B occur |
| Collectively Exhaustive Events | A and B are said to be collectively exhaustive if at least one of the events A or B must occur. | | A and B are not collectively exhaustive because if a 3 appears, neither A nor B occurs |
| Basic Outcomes | The simple indecomposable possible results of an experiment. One and exactly one of these outcomes must occur. The set of basic outcomes is mutually exclusive and collectively exhaustive. | | Basic outcomes 1, 2, 3, 4, 5, and 6 |
| Sample Space | The totality of basic outcomes of an experiment. | | {1,2,3,4,5,6} |

*Roll a fair die once. Let A be the event that an even number appears, and let B be the event that a 1, 2 or 5 appears.
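The footnote's die example maps directly onto set operations. The following is a minimal sketch using Python's built-in `set` type; the variable names are illustrative and not taken from the cheat sheet.

```python
sample_space = {1, 2, 3, 4, 5, 6}  # one roll of a fair die
A = {2, 4, 6}                      # an even number appears
B = {1, 2, 5}                      # a 1, 2 or 5 appears

intersection = A & B               # both A and B occur
union = A | B                      # A or B (or both) occurs
complement_A = sample_space - A    # A does not occur

mutually_exclusive = A.isdisjoint(B)
collectively_exhaustive = union == sample_space

print(intersection, union, complement_A)
print(mutually_exclusive, collectively_exhaustive)
```

Both flags come out `False`, matching the table: a 2 lies in both events, and a 3 lies in neither.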
1137 Yoavi Liedersdorf (MBA’03)

Probability Rules:

| Term | If events A and B are mutually exclusive | If events A and B are NOT mutually exclusive |
|---|---|---|
| $P(A)$ | $P(A)$ | $P(A)$ |
| $P(\bar{A})$ | $1 - P(A)$ | $1 - P(A)$ |
| $P(A \cap B)$ | $0$ | $P(A) \cdot P(B)$, only if A and B are independent |
| $P(A \cup B)$ | $P(A) + P(B)$ | $P(A) + P(B) - P(A \cap B)$ |

Conditional probability (the probability that A holds given that B holds, the relation underlying Bayes' Law):

$P(A \mid B) = \frac{P(A \cap B)}{P(B)}$

General probability rules:

1) If $P(A \mid B) = P(A)$, then A and B are independent events (for example, rolling dice one after the other).

2) If there are n possible outcomes which are equally likely to occur:

$P(\text{outcome } i \text{ occurs}) = \frac{1}{n}$ for each $i \in \{1, 2, \dots, n\}$

*Example: Shuffle a deck of cards, and pick one at random. P(chosen card is one particular card, such as a specific 10) = 1/52.

3) If the sample space consists of n equally likely basic outcomes, then for any event A:

$P(A) = \frac{\text{Number of basic outcomes in } A}{n}$

*Example: Suppose we toss two dice. Let A denote the event that the sum of the two dice is 9. P(A) = 4/36 = 1/9, because 4 of the 36 basic outcomes sum to 9.
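The two-dice example can be verified by enumerating the basic outcomes. This is a small sketch using the standard library; the names `outcomes` and `event_a` are illustrative.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely basic outcomes of tossing two dice.
outcomes = list(product(range(1, 7), repeat=2))

# Event A: the two dice sum to 9.
event_a = [roll for roll in outcomes if sum(roll) == 9]

# Rule 3: count of basic outcomes in A divided by the total count.
p_a = Fraction(len(event_a), len(outcomes))

print(event_a, p_a)
```

Using `Fraction` keeps the result exact, so it prints as 1/9 rather than a rounded decimal.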

$P(A \cap B) = P(A \mid B) \cdot P(B)$

$P(A \cap B) = P(B \mid A) \cdot P(A)$

$P(A) = P(A \cap B) + P(A \cap \bar{B}) = P(A \mid B)P(B) + P(A \mid \bar{B})P(\bar{B})$

*Example: Take a deck of 52 cards. Take out 2 cards sequentially, but don't look at the first. The probability that the second card is of a given suit is the probability of drawing that suit second (event A) after drawing that suit first (event B), plus the probability of drawing that suit second (event A) after not drawing it first (event $\bar{B}$), which equals (12/51)(13/52) + (13/51)(39/52) = 1/4 = 0.25.
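The card example is an application of the total probability formula, and the arithmetic can be checked exactly. This is a minimal sketch in which the variable names are illustrative, B is the event that the first card is of the chosen suit, and A is the event that the second card is.

```python
from fractions import Fraction

p_b = Fraction(13, 52)              # first card is of the chosen suit
p_not_b = 1 - p_b                   # first card is of another suit
p_a_given_b = Fraction(12, 51)      # 12 cards of the suit remain among 51
p_a_given_not_b = Fraction(13, 51)  # all 13 cards of the suit remain among 51

# P(A) = P(A|B) P(B) + P(A|not B) P(not B)
p_a = p_a_given_b * p_b + p_a_given_not_b * p_not_b

print(p_a, float(p_a))
```

The result equals 13/52: not looking at the first card leaves the second card's suit probability unchanged.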

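Uniform Distribution:

Here X is uniformly distributed between a and b.

| Term/Meaning | Formula |
|---|---|
| Expected Value | $\mu_X = \frac{a + b}{2}$ |
| Variance | $\sigma_X^2 = \frac{(b - a)^2}{12}$ |
| Standard Deviation | $\sigma_X = \frac{b - a}{\sqrt{12}}$ |
| Probability that X falls between c and d | $P(c < X < d) = \frac{d - c}{b - a}$ |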
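The uniform formulas translate into a few lines of Python. This sketch assumes a continuous uniform distribution on [a, b] with a <= c <= d <= b; the function names and the example interval [0, 10] are hypothetical choices, not from the cheat sheet.

```python
import math

def uniform_summary(a, b):
    """Return (mean, variance, standard deviation) of a uniform on [a, b]."""
    mean = (a + b) / 2
    variance = (b - a) ** 2 / 12
    return mean, variance, math.sqrt(variance)

def uniform_prob_between(a, b, c, d):
    """P(c < X < d), assuming a <= c <= d <= b."""
    return (d - c) / (b - a)

mean, variance, sd = uniform_summary(0, 10)
prob = uniform_prob_between(0, 10, 2, 5)
print(mean, round(variance, 4), round(sd, 4), prob)
```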

Normal Distribution:

[Table: standard normal (z) table, with one column per second decimal place of z starting at 0.00]

Probability Density Function:

$f_X(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{(x - \mu)^2}{2\sigma^2}}$

where $\pi \approx 3.1416$ and $e \approx 2.7183$

Standard Deviations away from the mean:

$Z = \frac{X - \mu}{\sigma}$

(Z and $\sigma$ are swappable!)

P(a < X < b) = area under $f_X(x)$ between a and b:

$P(a < X < b) = P\left(\frac{a - \mu}{\sigma} < Z < \frac{b - \mu}{\sigma}\right)$

[Figure: Standard Normal Table, seven usage scenarios]
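In place of a printed z table, the standardization step can be sketched in Python. This version computes the standard normal cumulative probability from the standard-library `math.erf`. The function names and the example values (mean 100, standard deviation 10) are hypothetical.

```python
import math

def standard_normal_cdf(z):
    """P(Z < z) for a standard normal variable, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def normal_prob_between(a, b, mu, sigma):
    """P(a < X < b) for X ~ Normal(mu, sigma), by converting to z-scores."""
    z_a = (a - mu) / sigma
    z_b = (b - mu) / sigma
    return standard_normal_cdf(z_b) - standard_normal_cdf(z_a)

# Hypothetical example: X has mean 100 and standard deviation 10.
within_one_sd = normal_prob_between(90, 110, 100, 10)
within_196 = normal_prob_between(-1.96, 1.96, 0, 1)
print(round(within_one_sd, 4), round(within_196, 4))
```

The second value is close to 0.95, which is why 1.96 is the two-tail z-value for a 95% confidence level.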

Correlation:

• If X and Y are two different sets of data, their correlation is represented by Corr(X, Y), $r_{XY}$, or $\rho_{XY}$ (rho).

• If Y increases as X increases, $0 < \rho_{XY} < 1$. If Y decreases as X increases, $-1 < \rho_{XY} < 0$.

• The extremes $\rho_{XY} = 1$ and $\rho_{XY} = -1$ indicate perfect correlation: information about one results in an exact prediction about the other.

• If X and Y are completely uncorrelated, $\rho_{XY} = 0$.

• The covariance of X and Y, Cov(X, Y), has the same sign as $\rho_{XY}$, has unusual units, and is usually a means to find $\rho_{XY}$.

| Term | Formula | Notes |
|---|---|---|
| Correlation | $Corr(X, Y) = \frac{Cov(X, Y)}{\sigma_X \sigma_Y}$ | Used with the Covariance formulas below |
| Covariance (formula 1) | $Cov(X, Y) = E[(X - \mu_X)(Y - \mu_Y)]$ (difficult to calculate) | Sum of the products of all sample pairs' distances from their respective means, multiplied by their respective probabilities |
| Covariance (formula 2) | $Cov(X, Y) = E[XY] - \mu_X \mu_Y$ | Sum of the products of all sample pairs multiplied by their respective probabilities, minus the product of both means |
| Finding Covariance given Correlation | $Cov(X, Y) = \sigma_X \sigma_Y \, Corr(X, Y)$ | |
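To see that the two covariance formulas agree, the sketch below evaluates both on a small joint distribution. The distribution is made up for illustration (four equally likely pairs), and the helper `expectation` is a hypothetical name.

```python
import math

# Hypothetical joint distribution: (x, y) pair -> probability.
joint = {(1, 2): 0.25, (2, 4): 0.25, (3, 5): 0.25, (4, 9): 0.25}

def expectation(f):
    """Probability-weighted sum of f(x, y) over all pairs."""
    return sum(p * f(x, y) for (x, y), p in joint.items())

mu_x = expectation(lambda x, y: x)
mu_y = expectation(lambda x, y: y)

# Formula 1: expected product of deviations from the means.
cov_definition = expectation(lambda x, y: (x - mu_x) * (y - mu_y))
# Formula 2: E[XY] minus the product of the means.
cov_shortcut = expectation(lambda x, y: x * y) - mu_x * mu_y

sigma_x = math.sqrt(expectation(lambda x, y: (x - mu_x) ** 2))
sigma_y = math.sqrt(expectation(lambda x, y: (y - mu_y) ** 2))
corr = cov_definition / (sigma_x * sigma_y)

print(cov_definition, cov_shortcut, round(corr, 4))
```

Both formulas give 2.75 here, and the correlation is positive because Y rises with X in this made-up data.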

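Portfolio Analysis:

| Term | Formula | Example |
|---|---|---|
| Mean of any Portfolio "S" | $\mu_S = a\mu_X + b\mu_Y$ | $\mu_S$ = ¾(8.0%) + ¼(11.0%) = 8.75% |
| Uncorrelated Portfolio Variance | $\sigma_S^2 = a^2\sigma_X^2 + b^2\sigma_Y^2$ | $\sigma_S^2$ = (¾)²(0.5)² + (¼)²(6.0)² ≈ 2.3906 |
| Uncorrelated Portfolio Standard Deviation | $\sigma_S = \sqrt{a^2\sigma_X^2 + b^2\sigma_Y^2}$ | $\sigma_S$ ≈ 1.5462 |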

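Confidence Intervals:

| Parameter | Confidence Interval | Usage | Sample |
|---|---|---|---|
| $\mu$ | $\bar{X} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$ | Normal | Known $\sigma$ |
| $\mu$ | $\bar{X} \pm z_{\alpha/2} \frac{s}{\sqrt{n}}$ | | Large, Unknown $\sigma$ |
| $\mu$ | $\bar{X} \pm t_{n-1,\alpha/2} \frac{s}{\sqrt{n}}$ | Normal | Small, Unknown $\sigma$ |
| $p$ | $\hat{p} \pm z_{\alpha/2} \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$ | Binomial | Large |
| $\mu_X - \mu_Y$ | $\bar{D} \pm z_{\alpha/2} \frac{s_D}{\sqrt{n}}$ | Normal | Matched pairs |
| $\mu_X - \mu_Y$ | $(\bar{X} - \bar{Y}) \pm z_{\alpha/2} \sqrt{\frac{\sigma_X^2}{n_X} + \frac{\sigma_Y^2}{n_Y}}$ | Normal | Known $\sigma_X$, $\sigma_Y$; Independent Samples |
| $\mu_X - \mu_Y$ | $(\bar{X} - \bar{Y}) \pm z_{\alpha/2} \sqrt{\frac{s_X^2}{n_X} + \frac{s_Y^2}{n_Y}}$ | | Large |
| $p_X - p_Y$ | $(\hat{p}_X - \hat{p}_Y) \pm z_{\alpha/2} \sqrt{\frac{\hat{p}_X(1 - \hat{p}_X)}{n_X} + \frac{\hat{p}_Y(1 - \hat{p}_Y)}{n_Y}}$ | Binomial | Large |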
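Two of the large-sample intervals above can be sketched as follows. The function names and the sample figures are hypothetical. The default `z=1.96` is the standard two-tail value for a 95% confidence level.

```python
import math

def mean_confidence_interval(x_bar, s, n, z=1.96):
    """Large-sample interval for a mean: x_bar +/- z * s / sqrt(n)."""
    margin = z * s / math.sqrt(n)
    return x_bar - margin, x_bar + margin

def proportion_confidence_interval(p_hat, n, z=1.96):
    """Large-sample interval for a proportion."""
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# Hypothetical samples: mean 50 with s = 10 from n = 100; 50 successes in 100.
mean_low, mean_high = mean_confidence_interval(50.0, 10.0, 100)
prop_low, prop_high = proportion_confidence_interval(0.5, 100)
print(round(mean_low, 2), round(mean_high, 2))
print(round(prop_low, 3), round(prop_high, 3))
```

For a small sample with unknown standard deviation, the same shape applies with the z-value replaced by the t-value for n - 1 degrees of freedom.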

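Formulae Guide

[Figure: decision flowchart leading to outcomes numbered 1 to 6, with decision nodes "Large/Normal or Small?", "Mean or Proportion?", "Single Mean or Difference?", "Matched or Independent?" and "Single p or Difference?"]

t-table

| d.f. | 0.100 | 0.050 | 0.025 | 0.010 | 0.005 |
|---|---|---|---|---|---|
| 1 | 3.078 | 6.314 | 12.706 | 31.821 | 63.657 |
| 2 | 1.886 | 2.920 | 4.303 | 6.965 | 9.925 |
| 3 | 1.638 | 2.353 | 3.182 | 4.541 | 5.841 |
| 4 | 1.533 | 2.132 | 2.776 | 3.747 | 4.604 |
| 5 | 1.476 | 2.015 | 2.571 | 3.365 | 4.032 |
| 6 | 1.440 | 1.943 | 2.447 | 3.143 | 3.707 |
| 7 | 1.415 | 1.895 | 2.365 | 2.998 | 3.499 |
| 8 | 1.397 | 1.860 | 2.306 | 2.896 | 3.355 |
| 9 | 1.383 | 1.833 | 2.262 | 2.821 | 3.250 |
| 10 | 1.372 | 1.812 | 2.228 | 2.764 | 3.169 |
| 11 | 1.363 | 1.796 | 2.201 | 2.718 | 3.106 |
| 12 | 1.356 | 1.782 | 2.179 | 2.681 | 3.055 |
| 13 | 1.350 | 1.771 | 2.160 | 2.650 | 3.012 |
| 14 | 1.345 | 1.761 | 2.145 | 2.624 | 2.977 |
| 15 | 1.341 | 1.753 | 2.131 | 2.602 | 2.947 |
| 16 | 1.337 | 1.746 | 2.120 | 2.583 | 2.921 |
| 17 | 1.333 | 1.740 | 2.110 | 2.567 | 2.898 |
| 18 | 1.330 | 1.734 | 2.101 | 2.552 | 2.878 |
| 19 | 1.328 | 1.729 | 2.093 | 2.539 | 2.861 |
| 20 | 1.325 | 1.725 | 2.086 | 2.528 | 2.845 |
| 21 | 1.323 | 1.721 | 2.080 | 2.518 | 2.831 |
| 22 | 1.321 | 1.717 | 2.074 | 2.508 | 2.819 |
| 23 | 1.319 | 1.714 | 2.069 | 2.500 | 2.807 |
| 24 | 1.318 | 1.711 | 2.064 | 2.492 | 2.797 |
| 25 | 1.316 | 1.708 | 2.060 | 2.485 | 2.787 |
| 26 | 1.315 | 1.706 | 2.056 | 2.479 | 2.779 |
| 27 | 1.314 | 1.703 | 2.052 | 2.473 | 2.771 |
| 28 | 1.313 | 1.701 | 2.048 | 2.467 | 2.763 |

Confidence Level to Z-Value Guide

| Confidence Level | $\alpha$ | $Z_{\alpha/2}$ (2-Tail) | $Z_{\alpha}$ (1-Tail) |
|---|---|---|---|
| 80% | 20% | 1.28 | 0.84 |
| 90% | 10% | 1.645 | 1.28 |
| 95% | 5% | 1.96 | 1.645 |
| 99% | 1% | 2.575 | 2.325 |
| c | 1.0 - c | Z(c/2) | z(c - 0.5) |

Determining the Appropriate Sample Size

| Term | Normal Distribution Formula | Proportion Formula |
|---|---|---|
| Sample Size (for +/- e) | $n = \frac{z_{\alpha/2}^2 \sigma^2}{e^2}$ | $n = \frac{z_{\alpha/2}^2}{4e^2}$ |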
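A sample-size calculation for a margin of error of +/- e can be sketched as below. It assumes the common forms n = z²σ²/e² for a mean and n = z²/(4e²) for a proportion (the worst case, p = 0.5). Rounding up to a whole observation is an added assumption, and the function names and inputs are hypothetical.

```python
import math

def sample_size_for_mean(z, sigma, e):
    """Smallest n with z * sigma / sqrt(n) <= e (rounded up; an assumption)."""
    return math.ceil((z * sigma / e) ** 2)

def sample_size_for_proportion(z, e):
    """Worst-case (p = 0.5) sample size for a proportion, rounded up."""
    return math.ceil(z ** 2 / (4 * e ** 2))

# Hypothetical targets at 95% confidence (z = 1.96).
n_mean = sample_size_for_mean(1.96, 10.0, 2.0)     # sigma = 10, margin = 2
n_prop = sample_size_for_proportion(1.96, 0.05)    # margin = 5 percentage points
print(n_mean, n_prop)
```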

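Hypothesis Testing:

| Test Type | Test Statistic | Two-tailed $H_a$ | Critical Value | Lower-tail $H_a$ | Critical Value | Upper-tail $H_a$ | Critical Value |
|---|---|---|---|---|---|---|---|
| Single $\mu$ ($n \ge 30$) | $z_0 = \frac{\bar{X} - \mu_0}{s/\sqrt{n}}$ | $\mu \ne \mu_0$ | $\pm z_{\alpha/2}$ | $\mu < \mu_0$ | $-z_{\alpha}$ | $\mu > \mu_0$ | $z_{\alpha}$ |
| Single $\mu$ ($n < 30$) | $t_0 = \frac{\bar{X} - \mu_0}{s/\sqrt{n}}$ | $\mu \ne \mu_0$ | $\pm t_{n-1,\alpha/2}$ | $\mu < \mu_0$ | $-t_{n-1,\alpha}$ | $\mu > \mu_0$ | $t_{n-1,\alpha}$ |
| Single $p$ ($n \ge 30$) | $z_0 = \frac{\hat{p} - p_0}{\sqrt{p_0(1 - p_0)/n}}$ | $p \ne p_0$ | $\pm z_{\alpha/2}$ | $p < p_0$ | $-z_{\alpha}$ | $p > p_0$ | $z_{\alpha}$ |
| Diff. between two $\mu$s | $z_0 = \frac{(\bar{X} - \bar{Y}) - 0}{\sqrt{\frac{s_X^2}{n_X} + \frac{s_Y^2}{n_Y}}}$ | $\mu_X - \mu_Y \ne 0$ | $\pm z_{\alpha/2}$ | $\mu_X - \mu_Y < 0$ | $-z_{\alpha}$ | $\mu_X - \mu_Y > 0$ | $z_{\alpha}$ |
| Diff. between two $p$s | $z_0 = \frac{(\hat{p}_X - \hat{p}_Y) - 0}{\sqrt{\hat{p}(1 - \hat{p}) \frac{n_X + n_Y}{n_X n_Y}}}$, where $\hat{p}$ is the pooled sample proportion | $p_X - p_Y \ne 0$ | $\pm z_{\alpha/2}$ | $p_X - p_Y < 0$ | $-z_{\alpha}$ | $p_X - p_Y > 0$ | $z_{\alpha}$ |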
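The table pairs each test statistic with a critical value for each tail. The sketch below applies that pairing to a difference between two means. The function names, the sample figures, and the default 5% critical values (1.96 two-tail, 1.645 one-tail) are illustrative assumptions.

```python
import math

def two_mean_z(x_bar, y_bar, s_x, s_y, n_x, n_y, diff_0=0.0):
    """z statistic for the difference between two means (large samples)."""
    standard_error = math.sqrt(s_x ** 2 / n_x + s_y ** 2 / n_y)
    return ((x_bar - y_bar) - diff_0) / standard_error

def reject_null(z_0, tail, z_half_alpha=1.96, z_alpha=1.645):
    """Compare z_0 with the critical value for the chosen alternative."""
    if tail == "two":
        return abs(z_0) > z_half_alpha
    if tail == "lower":
        return z_0 < -z_alpha
    if tail == "upper":
        return z_0 > z_alpha
    raise ValueError("tail must be 'two', 'lower' or 'upper'")

# Hypothetical samples: means 52 and 50, s = 4 and 3, 50 observations each.
z_0 = two_mean_z(52.0, 50.0, 4.0, 3.0, 50, 50)
print(round(z_0, 3), reject_null(z_0, "two"), reject_null(z_0, "lower"))
```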

Classic Hypothesis Testing Procedure

| Step | Description | Example |
|---|---|---|
| 1. Formulate Two Hypotheses | The hypotheses ought to be mutually exclusive and collectively exhaustive. The hypothesis to be tested (the null hypothesis) always contains an equals sign, referring to some proposed value of a population parameter. The alternative hypothesis never contains an equals sign, but can be either a one-sided or two-sided inequality. | $H_0$, $H_A$ |
| 2. Select a Test Statistic | The test statistic is a standardized estimate of the difference between our sample and some hypothesized population parameter. It answers the question: "If the null hypothesis were true, how many standard deviations is our sample away from where we expected it to be?" | $\frac{\bar{X} - \mu_0}{s/\sqrt{n}}$ |
| 3. Derive a Decision Rule | The decision rule consists of regions of rejection and non-rejection, defined by critical values of the test statistic. It is used to establish the probable truth or falsity of the null hypothesis. | We reject $H_0$ if $\frac{\bar{X} - \mu_0}{s/\sqrt{n}} > z_{\alpha}$ |
| 4. Calculate the Value of the Test Statistic; Invoke the Decision Rule in light of the Test Statistic | Either reject the null hypothesis (if the test statistic falls into the rejection region) or do not reject the null hypothesis (if the test statistic does not fall into the rejection region). | $\frac{\bar{X} - \mu_0}{s/\sqrt{n}} = \frac{0.21 - 0}{0.80/\sqrt{50}}$ |
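The four steps can be walked through numerically. The sketch below uses the figures that appear in the step 4 example (0.21, 0, 0.80 and 50), read here as the sample mean, the hypothesized mean, the sample standard deviation and the sample size. The upper-tail alternative and the 5% level (critical value 1.645) are assumptions made for illustration, since the example's hypotheses are not legible.

```python
import math

# Step 1 (assumed for illustration): H0: mu = 0 versus HA: mu > 0.
mu_0 = 0.0

# Step 2: test statistic z0 = (x_bar - mu_0) / (s / sqrt(n)).
x_bar, s, n = 0.21, 0.80, 50

# Step 3: decision rule for an upper-tail test at an assumed 5% level.
z_alpha = 1.645

# Step 4: compute the statistic and apply the rule.
z_0 = (x_bar - mu_0) / (s / math.sqrt(n))
reject_h0 = z_0 > z_alpha

print(round(z_0, 4), reject_h0)
```

Under these assumptions the statistic is about 1.86 standard errors above the hypothesized mean, so the null hypothesis would be rejected at the 5% level but not at the 1% level (critical value 2.325).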

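Regression:

| Statistic | Symbol |
|---|---|
| Independent Variables | $X_1, \dots, X_k$ |
| Dependent Variable (a random variable) | $Y$ |
| Dependent Variable (an individual observation among sample) | $Y_i$ |
| Intercept (or constant); an unknown population parameter | $\beta_0$ |
| Estimated intercept; an estimate of $\beta_0$ | $\hat{\beta}_0$ |
| Slope (or coefficient) for Independent Variable 1 (unknown) | $\beta_1$ |
| Estimated slope for Independent Variable 1; an estimate of $\beta_1$ | $\hat{\beta}_1$ |

Regression Statistics

| Statistic | Value |
|---|---|
| Multiple R | 0.9568 |
| R Square | 0.9155 |
| Adjusted R Square | 0.9015 |
| Standard Error | 6.6220 |
| Observations | 15 |

ANOVA

| | df | SS | MS | F | Significance F |
|---|---|---|---|---|---|
| Regression | 2 | 5704.0273 | 2852.0137 | 65.0391 | 0. |
| Residual | 12 | 526.2087 | 43.8507 | | |
| Total | 14 | 6230.2360 | | | |

| | Coefficients | Standard Error | t Stat | P-value |
|---|---|---|---|---|
| Intercept | -20.3722 | 9.8139 | -2.0758 | 0. |
| Size (100 sq ft) | 4.3117 | 0.4104 | 10.5059 | 0. |
| Lot Size (1000 sq ft) | 4.7177 | 0.7646 | 6.1705 | 0. |

| Statistic (Mapped to Output Above) | Symbol | Formula |
|---|---|---|
| Dependent Variable (sample mean of n observations) | $\bar{Y}$ | $\frac{1}{n}\sum_{i=1}^{n} Y_i$ |
| Dependent Variable (estimated value for a given vector of independent variables) | $\hat{Y}_i$ | $\hat{\beta}_0 + \hat{\beta}_1 x_{1i} + \hat{\beta}_2 x_{2i} + \hat{\beta}_3 x_{3i} + \dots + \hat{\beta}_k x_{ki}$ |
| Error for observation i: the unexplained difference between the actual value of $Y_i$ and the prediction for $Y_i$ based on our regression model | $\varepsilon_i$ | $Y_i - \hat{Y}_i$ |
| Total Sum of Squares | TSS (or SST) | $\sum_{i=1}^{n} (Y_i - \bar{Y})^2 = SSR + SSE$ |
| Sum of Squares due to Error | SSE | $\sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2$ |
| Mean Squares due to Error | MSE | $\frac{SSE}{n - k - 1}$ |
| Sum of Squares due to Regression | SSR | $\sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2$ |
| Mean Squares due to Regression | MSR | $\frac{SSR}{k}$ |
| R-square (Coefficient of Determination) | $R^2$ | $1 - \frac{SSE}{TSS}$ |
| Multiple R (Coefficient of Multiple Correlation) | $R$ | $\sqrt{R^2}$ |
| Adjusted R-square | $\bar{R}^2$ | $1 - \frac{SSE/(n - k - 1)}{SST/(n - 1)}$ |
| Standard Error (a.k.a. Standard Error of the Estimate) | $s_{\varepsilon}$ | $\sqrt{\frac{SSE}{n - k - 1}}$ |
| t-statistic for testing $H_0: \beta_1 = 0$ vs. $H_A: \beta_1 \ne 0$ | $t_0$ | $\frac{\hat{\beta}_1 - 0}{s_{\hat{\beta}_1}}$ |
| p-value for testing $H_0: \beta_1 = 0$ vs. $H_A: \beta_1 \ne 0$ | p-value | $P(\lvert T \rvert > \lvert t_0 \rvert)$ |
| F | $F$ | $\frac{MSR}{MSE} = \frac{2852.0137}{43.8507} = \frac{R^2}{1 - R^2} \cdot \frac{n - k - 1}{k}$ |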
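The summary statistics in a regression output follow from the ANOVA sums of squares. The sketch below recomputes them from the SSR and SSE values shown in the sample output, with n = 15 observations and k = 2 independent variables. The variable names are illustrative, and the formulas are the standard ones for a regression with an intercept.

```python
import math

ssr = 5704.0273  # sum of squares due to regression (from the ANOVA output)
sse = 526.2087   # sum of squares due to error (residual)
n, k = 15, 2     # observations and independent variables

sst = ssr + sse                       # total sum of squares
msr = ssr / k                         # mean squares due to regression
mse = sse / (n - k - 1)               # mean squares due to error
f_stat = msr / mse
r_squared = 1 - sse / sst
multiple_r = math.sqrt(r_squared)
adjusted_r_squared = 1 - (sse / (n - k - 1)) / (sst / (n - 1))
standard_error = math.sqrt(mse)

# The F statistic can also be written in terms of R-square.
f_from_r_squared = (r_squared / (1 - r_squared)) * ((n - k - 1) / k)

print(round(r_squared, 4), round(adjusted_r_squared, 4), round(multiple_r, 4))
print(round(standard_error, 4), round(f_stat, 4), round(f_from_r_squared, 4))
```

The computed F statistic agrees with the 65.0391 printed in the output up to the rounding of the inputs.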