






Descriptive statistics, probability terms and rules and many formulas
Typology: Cheat Sheet
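| Term | Meaning | Population Formula | Sample Formula | Example: {1, 16, 1, 3, 9} |
|---|---|---|---|---|
| Sort | Sort values in increasing order | | | {1, 1, 3, 9, 16} |
| Mean | Average | $\mu = \frac{1}{N}\sum_{i=1}^{N} x_i$ | $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$ | (1 + 1 + 3 + 9 + 16) / 5 = 6 |
| Median | The middle value: half are below and half are above | | | 3 |
| Mode | The value with the most appearances | | | 1 |
| Variance | The average of the squared deviations between the values and the mean | $\sigma^2 = \frac{1}{N}\sum_{i=1}^{N} (x_i - \mu)^2$ | $s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2$ | $(1-6)^2 + (16-6)^2 + (1-6)^2 + (3-6)^2 + (9-6)^2$ divided by 5 values = 168/5 = 33.6 |
| Standard Deviation | The square root of Variance, thought of as the "average" deviation from the mean | $\sigma = \sqrt{\sigma^2}$ | $s = \sqrt{s^2}$ | Square root of 33.6 = 5.7966 |
| Coefficient of Variation | The variation relative to the value of the mean | $CV = \sigma / \mu$ | $CV = s / \bar{x}$ | 5.7966 divided by 6 = 0.9661 |
| Minimum | The minimum value | | | 1 |
| Maximum | The maximum value | | | 16 |
| Range | Maximum minus Minimum | | | 16 - 1 = 15 |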
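The example column above can be reproduced with Python's standard library. This is a minimal sketch, assuming the population formulas (division by N) that the worked example uses; the variable names are illustrative only.

```python
import math
from statistics import mean, median, mode, pvariance, variance

values = [1, 16, 1, 3, 9]

mu = mean(values)                        # arithmetic mean
med = median(values)                     # middle value of the sorted data
most_common = mode(values)               # value with the most appearances
pop_var = pvariance(values)              # population variance: divides by N
sample_var = variance(values)            # sample variance: divides by n - 1
pop_sd = math.sqrt(pop_var)              # population standard deviation
cv = pop_sd / mu                         # coefficient of variation
value_range = max(values) - min(values)  # maximum minus minimum

print(sorted(values))
print(mu, med, most_common, value_range)
print(pop_var, sample_var, round(pop_sd, 4), round(cv, 4))
```

Note that the sample variance of the same five values is 168/4 = 42, larger than the population figure of 33.6 because of the n - 1 divisor.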
| Term | Meaning | Notation | Example (see footnote)* |
|---|---|---|---|
| Probability | For any event A, the probability lies in the range $0 \le P(A) \le 1$. | $P(A)$ | |
| Random Experiment | A process leading to at least 2 possible outcomes with uncertainty as to which will occur. | | Rolling a die |
| Event | A subset of all possible outcomes of an experiment. | | Events A and B |
| Intersection of Events | Let A and B be two events. Then the intersection of the two events is the event that both A and B occur (logical AND). | $A \cap B$ | The event that a 2 appears |
| Union of Events | The union of the two events is the event that A or B (or both) occurs (logical OR). | $A \cup B$ | The event that a 1, 2, 4, 5 or 6 appears |
| Complement | Let A be an event. The complement of A is the event that A does not occur (logical NOT). | $\bar{A}$ | The event that an odd number appears |
| Mutually Exclusive Events | A and B are said to be mutually exclusive if at most one of the events A and B can occur. | | A and B are not mutually exclusive because if a 2 appears, both A and B occur |
| Collectively Exhaustive Events | A and B are said to be collectively exhaustive if at least one of the events A or B must occur. | | A and B are not collectively exhaustive because if a 3 appears, neither A nor B occurs |
| Basic Outcomes | The simple indecomposable possible results of an experiment. One and exactly one of these outcomes must occur. The set of basic outcomes is mutually exclusive and collectively exhaustive. | | Basic outcomes 1, 2, 3, 4, 5, and 6 |
| Sample Space | The totality of basic outcomes of an experiment. | | {1,2,3,4,5,6} |
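The event operations in this table map directly onto Python set operations. The sketch below assumes A = {2, 4, 6} (an even number appears) and B = {1, 2, 5}; these definitions are not stated in the text and are inferred from the examples, since they are the only sets on one die roll that reproduce every example in the table.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}  # assumption: an even number appears
B = {1, 2, 5}  # assumption: inferred from the table's examples

intersection = A & B              # both A and B occur (logical AND)
union = A | B                     # A or B or both occur (logical OR)
complement_A = sample_space - A   # A does not occur (logical NOT)
mutually_exclusive = len(A & B) == 0
collectively_exhaustive = (A | B) == sample_space

def prob(event):
    """Probability of an event when all basic outcomes are equally likely."""
    return Fraction(len(event), len(sample_space))

print(intersection, union, complement_A)
print(mutually_exclusive, collectively_exhaustive)
print(prob(union), prob(A) + prob(B) - prob(intersection))
```

The last line checks the addition rule for events that are not mutually exclusive: both expressions give 5/6.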
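| | If events A and B are mutually exclusive | If events A and B are NOT mutually exclusive |
|---|---|---|
| $P(A \cup B)$ | $P(A) + P(B)$ | $P(A) + P(B) - P(A \cap B)$ |
| $P(A \cap B)$ | $0$ | $P(A \mid B)\,P(B)$, which equals $P(A)\,P(B)$ only if A and B are independent |

Conditional probability: $P(A \mid B) = \frac{P(A \cap B)}{P(B)}$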
General probability rules:
1) If P(A|B) = P(A), then A and B are independent events (for example, rolling dice one after the other).
2) If there are n possible outcomes which are equally likely to occur:
P(outcome i occurs) = 1/n for each i in {1, 2, ..., n}
*Example: Shuffle a deck of cards, and pick one at random. P(chosen card is the 10 of one particular suit) = 1/52.
3) If the sample space consists of n equally likely basic outcomes and event A is composed of n_A of them:
P(A) = n_A / n
*Example: Suppose we toss two dice. Let A denote the event that the sum of the two dice is 9. P(A) = 4/36 = 1/9, because 4 of the 36 basic outcomes sum to 9.
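The two-dice example can be checked by enumerating the sample space. This is a small sketch using exact fractions; the names `outcomes` and `event_a` are illustrative only.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely basic outcomes of tossing two dice.
outcomes = list(product(range(1, 7), repeat=2))

# Event A: the two dice sum to 9.
event_a = [o for o in outcomes if sum(o) == 9]

# Equally likely outcomes: P(A) = (outcomes in A) / (all outcomes).
p_a = Fraction(len(event_a), len(outcomes))

print(event_a)
print(p_a)
```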
$$P(A) = P(A \cap B) + P(A \cap \bar{B}) = P(A \mid B)\,P(B) + P(A \mid \bar{B})\,P(\bar{B})$$
*Example: Take a deck of 52 cards. Take out 2 cards sequentially, but don't look at the first. The probability that the second card is of one particular suit is the probability of drawing that suit (event A) after drawing that suit on the first card (event B), plus the probability of drawing that suit (event A) after not drawing it on the first card (the complement of event B), which equals (12/51)(13/52) + (13/51)(39/52) = 1/4 = 0.25.
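The arithmetic in the card example can be verified with exact fractions. This is a minimal sketch in which event B is "the first card is of the chosen suit" and event A is "the second card is of the chosen suit"; the variable names are illustrative.

```python
from fractions import Fraction

p_b = Fraction(13, 52)              # first card is of the chosen suit
p_not_b = Fraction(39, 52)          # first card is of another suit
p_a_given_b = Fraction(12, 51)      # 12 of the suit remain among 51 cards
p_a_given_not_b = Fraction(13, 51)  # all 13 of the suit remain among 51 cards

# Total probability: condition on whether B occurred or not.
p_a = p_a_given_b * p_b + p_a_given_not_b * p_not_b

print(p_a, float(p_a))
```

The result equals 13/52, the same as for the first card, which is why not looking at the first card leaves the probability unchanged.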
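For X uniformly distributed between a and b:

| Term/Meaning | Formula |
|---|---|
| Expected Value | $\mu_X = \frac{a + b}{2}$ |
| Variance | $\sigma_X^2 = \frac{(b - a)^2}{12}$ |
| Standard Deviation | $\sigma_X = \frac{b - a}{\sqrt{12}}$ |
| Probability that X falls between c and d | $P(c < X < d) = \frac{d - c}{b - a}$ |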
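The four formulas above fit in two short functions. The sketch below assumes a continuous uniform distribution on [a, b] with a <= c <= d <= b; the function names and the interval [0, 12] are hypothetical and chosen only for illustration.

```python
import math

def uniform_summary(a, b):
    """Return (mean, variance, standard deviation) of a uniform on [a, b]."""
    mean = (a + b) / 2
    var = (b - a) ** 2 / 12
    return mean, var, math.sqrt(var)

def uniform_prob(a, b, c, d):
    """P(c < X < d) for X uniform on [a, b], assuming a <= c <= d <= b."""
    return (d - c) / (b - a)

# Hypothetical example: X uniform on [0, 12].
mean, var, sd = uniform_summary(0, 12)
print(mean, var, round(sd, 4))
print(uniform_prob(0, 12, 3, 6))
```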
[Standard normal table, indexed by z, with columns for the second decimal from 0.00 to 0.09]
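Probability Density Function:
$$f_X(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$
where $\pi \approx 3.1416$ and $e \approx 2.7183$
Standard Deviations away from the mean:
$$Z = \frac{X - \mu}{\sigma}$$
(Z and $\sigma$ are swappable!)
P(a < X < b) = area under $f_X(x)$ between a and b:
$$P(a < X < b) = \int_a^b f_X(x)\,dx = P\left(\frac{a-\mu}{\sigma} < Z < \frac{b-\mu}{\sigma}\right)$$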
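In place of a printed table, the area under the normal curve can be computed with `statistics.NormalDist` from the Python standard library (Python 3.8 or later). This is a sketch only; the mean of 100 and standard deviation of 15 are hypothetical values chosen for illustration.

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    """Number of standard deviations that x lies away from the mean."""
    return (x - mu) / sigma

def normal_prob(a, b, mu, sigma):
    """P(a < X < b) for X normal with mean mu and standard deviation sigma."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(b) - dist.cdf(a)

# Hypothetical example: mu = 100, sigma = 15.
print(z_score(130, 100, 15))
print(round(normal_prob(85, 115, 100, 15), 4))

# Standardizing first gives the same area under the standard normal curve.
print(round(normal_prob(-1.0, 1.0, 0, 1), 4))
```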
[Figure: Standard Normal Table, seven usage scenarios]
If X and Y are two different sets of data, their correlation is represented by Corr(X,Y), $r_{XY}$, or $\rho_{XY}$ (rho).
If Y increases as X increases, $0 < \rho_{XY} < 1$. If Y decreases as X increases, $-1 < \rho_{XY} < 0$.
The extremes $\rho_{XY} = 1$ and $\rho_{XY} = -1$ indicate perfect correlation; information about one results in an exact prediction about the other.
If X and Y are completely uncorrelated, $\rho_{XY} = 0$.
| Term | Formula | Notes |
|---|---|---|
| Correlation | $\rho_{XY} = \frac{\sigma_{XY}}{\sigma_X \sigma_Y}$ | Used with Covariance formulas below |
| Covariance (2 formulas) | $\sigma_{XY} = \mathrm{Cov}(X,Y) = E[(X - \mu_X)(Y - \mu_Y)]$ (difficult to calculate) | Sum of the products of all sample pairs' distance from their respective means multiplied by their respective probabilities |
| | $\sigma_{XY} = \mathrm{Cov}(X,Y) = E[XY] - \mu_X \mu_Y$ | Sum of the products of all sample pairs multiplied by their respective probabilities, minus the product of both means |
| Finding Covariance given Correlation | $\sigma_{XY} = \rho_{XY}\,\sigma_X \sigma_Y$ | |
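Both covariance formulas give the same number, which is easy to confirm on a small joint distribution. The sketch below uses a made-up joint probability table (the `joint` dictionary is hypothetical, not from the source) and then recovers the covariance from the correlation.

```python
import math

# Hypothetical joint distribution: (x, y) -> probability.
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

mu_x = sum(p * x for (x, y), p in joint.items())
mu_y = sum(p * y for (x, y), p in joint.items())

# Formula 1: expected product of deviations from the means.
cov_def = sum(p * (x - mu_x) * (y - mu_y) for (x, y), p in joint.items())

# Formula 2: E[XY] minus the product of the means.
cov_short = sum(p * x * y for (x, y), p in joint.items()) - mu_x * mu_y

sd_x = math.sqrt(sum(p * (x - mu_x) ** 2 for (x, y), p in joint.items()))
sd_y = math.sqrt(sum(p * (y - mu_y) ** 2 for (x, y), p in joint.items()))

rho = cov_def / (sd_x * sd_y)      # correlation
cov_from_rho = rho * sd_x * sd_y   # covariance recovered from correlation

print(round(cov_def, 4), round(cov_short, 4), round(rho, 4))
```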
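| Term | Formula | Example* |
|---|---|---|
| Mean of any Portfolio "S" | $\mu_S = a\mu_X + b\mu_Y$ | $\mu_S$ = ¾(8.0%) + ¼(11.0%) = 8.75% |
| Uncorrelated Portfolio Variance | $\sigma_S^2 = a^2\sigma_X^2 + b^2\sigma_Y^2$ | $\sigma_S^2 = (¾)^2(0.5)^2 + (¼)^2(6.0)^2 = 2.3906$ |
| Uncorrelated Portfolio Standard Deviation | $\sigma_S = \sqrt{a^2\sigma_X^2 + b^2\sigma_Y^2}$ | $\sigma_S = 1.5462$ |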
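The portfolio example can be recomputed as follows. This is a sketch assuming the two assets are uncorrelated, with weights ¾ and ¼, means 8.0% and 11.0%, and standard deviations 0.5 and 6.0 as read from the example column; the function name `portfolio` is illustrative.

```python
import math

def portfolio(a, b, mu_x, mu_y, sd_x, sd_y):
    """Mean, variance and standard deviation of S = aX + bY, assuming
    X and Y are uncorrelated (no covariance term)."""
    mean = a * mu_x + b * mu_y
    var = a ** 2 * sd_x ** 2 + b ** 2 * sd_y ** 2
    return mean, var, math.sqrt(var)

mean, var, sd = portfolio(0.75, 0.25, 8.0, 11.0, 0.5, 6.0)
print(mean, round(var, 4), round(sd, 4))
```

If X and Y were correlated, the variance would need an extra covariance term, $2ab\,\sigma_{XY}$, which this sketch deliberately leaves out.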
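| Parameter | Confidence Interval | Usage / Sample |
|---|---|---|
| $\mu$ | $\bar{X} \pm z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$ | $\sigma$ known |
| $\mu$ | $\bar{X} \pm z_{\alpha/2}\,\frac{s}{\sqrt{n}}$ | $\sigma$ unknown, large sample |
| $\mu$ | $\bar{X} \pm t_{n-1,\,\alpha/2}\,\frac{s}{\sqrt{n}}$ | $\sigma$ unknown, small sample |
| $p$ | $\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$ | Single proportion |
| $\mu_X - \mu_Y$ | $\bar{D} \pm z_{\alpha/2}\,\frac{s_D}{\sqrt{n}}$ | Matched pairs |
| $\mu_X - \mu_Y$ | $(\bar{X} - \bar{Y}) \pm z_{\alpha/2}\sqrt{\frac{\sigma_X^2}{n_X} + \frac{\sigma_Y^2}{n_Y}}$ | Independent samples, variances known |
| $\mu_X - \mu_Y$ | $(\bar{X} - \bar{Y}) \pm z_{\alpha/2}\sqrt{\frac{s_X^2}{n_X} + \frac{s_Y^2}{n_Y}}$ | Independent samples, variances unknown |
| $p_X - p_Y$ | $(\hat{p}_X - \hat{p}_Y) \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_X(1-\hat{p}_X)}{n_X} + \frac{\hat{p}_Y(1-\hat{p}_Y)}{n_Y}}$ | Difference between two proportions |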
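As an illustration of the large-sample intervals, the sketch below computes a z-based interval for a mean and for a proportion. The function names and the sample figures (mean 100, s = 15, n = 36; p-hat = 0.5, n = 100) are hypothetical, and the critical value comes from `statistics.NormalDist` rather than a printed table.

```python
import math
from statistics import NormalDist

def z_critical(confidence):
    """Two-tailed critical value z_(alpha/2) for a given confidence level."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)

def mean_ci_z(xbar, s, n, confidence=0.95):
    """Large-sample interval: xbar +/- z_(alpha/2) * s / sqrt(n)."""
    half_width = z_critical(confidence) * s / math.sqrt(n)
    return xbar - half_width, xbar + half_width

def proportion_ci(p_hat, n, confidence=0.95):
    """Interval: p_hat +/- z_(alpha/2) * sqrt(p_hat * (1 - p_hat) / n)."""
    half_width = z_critical(confidence) * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width

# Hypothetical samples, for illustration only.
print(mean_ci_z(100, 15, 36))
print(proportion_ci(0.5, 100))
```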
Formulae Guide
Decision questions in the flowchart (the outcomes are numbered 1 to 6):
Large/Normal or Small?
Mean or Proportion?
Single Mean or Difference?
Matched or Independent?
Single p or Difference?
t-table
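| d. f. | 0.100 | 0.050 | 0.025 | 0.010 | 0.005 |
|---|---|---|---|---|---|
| 1 | 3.078 | 6.314 | 12.706 | 31.821 | 63.657 |
| 2 | 1.886 | 2.920 | 4.303 | 6.965 | 9.925 |
| 3 | 1.638 | 2.353 | 3.182 | 4.541 | 5.841 |
| 4 | 1.533 | 2.132 | 2.776 | 3.747 | 4.604 |
| 5 | 1.476 | 2.015 | 2.571 | 3.365 | 4.032 |
| 6 | 1.440 | 1.943 | 2.447 | 3.143 | 3.707 |
| 7 | 1.415 | 1.895 | 2.365 | 2.998 | 3.499 |
| 8 | 1.397 | 1.860 | 2.306 | 2.896 | 3.355 |
| 9 | 1.383 | 1.833 | 2.262 | 2.821 | 3.250 |
| 10 | 1.372 | 1.812 | 2.228 | 2.764 | 3.169 |
| 11 | 1.363 | 1.796 | 2.201 | 2.718 | 3.106 |
| 12 | 1.356 | 1.782 | 2.179 | 2.681 | 3.055 |
| 13 | 1.350 | 1.771 | 2.160 | 2.650 | 3.012 |
| 14 | 1.345 | 1.761 | 2.145 | 2.624 | 2.977 |
| 15 | 1.341 | 1.753 | 2.131 | 2.602 | 2.947 |
| 16 | 1.337 | 1.746 | 2.120 | 2.583 | 2.921 |
| 17 | 1.333 | 1.740 | 2.110 | 2.567 | 2.898 |
| 18 | 1.330 | 1.734 | 2.101 | 2.552 | 2.878 |
| 19 | 1.328 | 1.729 | 2.093 | 2.539 | 2.861 |
| 20 | 1.325 | 1.725 | 2.086 | 2.528 | 2.845 |
| 21 | 1.323 | 1.721 | 2.080 | 2.518 | 2.831 |
| 22 | 1.321 | 1.717 | 2.074 | 2.508 | 2.819 |
| 23 | 1.319 | 1.714 | 2.069 | 2.500 | 2.807 |
| 24 | 1.318 | 1.711 | 2.064 | 2.492 | 2.797 |
| 25 | 1.316 | 1.708 | 2.060 | 2.485 | 2.787 |
| 26 | 1.315 | 1.706 | 2.056 | 2.479 | 2.779 |
| 27 | 1.314 | 1.703 | 2.052 | 2.473 | 2.771 |
| 28 | 1.313 | 1.701 | 2.048 | 2.467 | 2.763 |

Confidence Level to Z-Value Guide

| Confidence Level | $\alpha$ | $Z_{\alpha/2}$ (2-Tail) | $Z_{\alpha}$ (1-Tail) |
|---|---|---|---|
| 80% | 20% | 1.28 | 0.84 |
| 90% | 10% | 1.645 | 1.28 |
| 95% | 5% | 1.96 | 1.645 |
| 99% | 1% | 2.575 | 2.325 |
| c | 1.0 - c | Z(c/2) | z(c - 0.5) |

Determining the Appropriate Sample Size

| Term | Normal Distribution Formula | Proportion Formula |
|---|---|---|
| Sample Size (for +/- e) | $n = \frac{z_{\alpha/2}^2\,\sigma^2}{e^2}$ | $n = \frac{z_{\alpha/2}^2\,p(1-p)}{e^2}$, which is at most $\frac{z_{\alpha/2}^2}{4e^2}$ (reached at p = 0.5) |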
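The z-values in the guide and the sample-size formulas can be recomputed with the standard library. This is a sketch under stated assumptions: the function names are illustrative, p = 0.5 is used as the conservative default for a proportion, and the inputs (sigma = 15, e = 2, e = 0.05) are hypothetical.

```python
import math
from statistics import NormalDist

def z_two_tail(confidence):
    """z_(alpha/2): critical value for a two-tailed interval or test."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)

def z_one_tail(confidence):
    """z_alpha: critical value for a one-tailed test."""
    return NormalDist().inv_cdf(confidence)

def sample_size_mean(sigma, e, confidence=0.95):
    """Smallest n with margin of error at most e when estimating a mean."""
    return math.ceil((z_two_tail(confidence) * sigma / e) ** 2)

def sample_size_proportion(e, confidence=0.95, p=0.5):
    """Smallest n with margin of error at most e for a proportion."""
    return math.ceil(z_two_tail(confidence) ** 2 * p * (1 - p) / e ** 2)

for level in (0.80, 0.90, 0.95, 0.99):
    print(level, round(z_two_tail(level), 3), round(z_one_tail(level), 3))

print(sample_size_mean(15, 2))       # hypothetical sigma = 15, e = 2
print(sample_size_proportion(0.05))  # hypothetical e = 0.05
```

At the 99% level the computed values are about 2.576 and 2.326, slightly different from the 2.575 and 2.325 listed in the guide, which appear to be rounded table readings.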
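| Test Type | Test Statistic | Two-tailed $H_a$ | Critical Value | Lower-tail $H_a$ | Critical Value | Upper-tail $H_a$ | Critical Value |
|---|---|---|---|---|---|---|---|
| Single $\mu$ ($n \ge 30$) | $z_0 = \frac{\bar{X} - \mu_0}{s/\sqrt{n}}$ | $\mu \ne \mu_0$ | $\pm z_{\alpha/2}$ | $\mu < \mu_0$ | $-z_{\alpha}$ | $\mu > \mu_0$ | $+z_{\alpha}$ |
| Single $\mu$ ($n < 30$) | $t_0 = \frac{\bar{X} - \mu_0}{s/\sqrt{n}}$ | $\mu \ne \mu_0$ | $\pm t_{n-1,\,\alpha/2}$ | $\mu < \mu_0$ | $-t_{n-1,\,\alpha}$ | $\mu > \mu_0$ | $+t_{n-1,\,\alpha}$ |
| Single $p$ ($n \ge 30$) | $z_0 = \frac{\hat{p} - p_0}{\sqrt{p_0(1-p_0)/n}}$ | $p \ne p_0$ | $\pm z_{\alpha/2}$ | $p < p_0$ | $-z_{\alpha}$ | $p > p_0$ | $+z_{\alpha}$ |
| Diff. between two $\mu$s | $z_0 = \frac{(\bar{X} - \bar{Y}) - 0}{\sqrt{\frac{\sigma_X^2}{n_X} + \frac{\sigma_Y^2}{n_Y}}}$ | $\mu_X - \mu_Y \ne 0$ | $\pm z_{\alpha/2}$ | $\mu_X - \mu_Y < 0$ | $-z_{\alpha}$ | $\mu_X - \mu_Y > 0$ | $+z_{\alpha}$ |
| Diff. between two $p$s | $z_0 = \frac{(\hat{p}_X - \hat{p}_Y) - 0}{\sqrt{\frac{\hat{p}_X(1-\hat{p}_X)}{n_X} + \frac{\hat{p}_Y(1-\hat{p}_Y)}{n_Y}}}$ | $p_X - p_Y \ne 0$ | $\pm z_{\alpha/2}$ | $p_X - p_Y < 0$ | $-z_{\alpha}$ | $p_X - p_Y > 0$ | $+z_{\alpha}$ |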
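As one instance of the test statistics in this table, the sketch below runs a single-proportion z-test. The function name and the sample figures (p-hat = 0.56, n = 100, p0 = 0.5) are hypothetical, and the critical values are taken from `statistics.NormalDist` instead of the printed guide.

```python
import math
from statistics import NormalDist

def proportion_z_test(p_hat, p0, n, alpha=0.05, tail="two"):
    """Return (z0, reject) for H0: p = p0 against the chosen alternative.

    tail is "two" (p != p0), "lower" (p < p0) or "upper" (p > p0).
    """
    z0 = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)
    std_normal = NormalDist()
    if tail == "two":
        reject = abs(z0) > std_normal.inv_cdf(1 - alpha / 2)
    elif tail == "lower":
        reject = z0 < -std_normal.inv_cdf(1 - alpha)
    elif tail == "upper":
        reject = z0 > std_normal.inv_cdf(1 - alpha)
    else:
        raise ValueError("tail must be 'two', 'lower' or 'upper'")
    return z0, reject

# Hypothetical sample: 56 successes out of 100, testing p0 = 0.5.
z0, reject = proportion_z_test(0.56, 0.5, 100)
print(round(z0, 4), reject)
```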
Classic Hypothesis Testing Procedure

| Step | Description | Example |
|---|---|---|
| 1. Formulate Two Hypotheses | The hypotheses ought to be mutually exclusive and collectively exhaustive. The hypothesis to be tested (the null hypothesis) always contains an equals sign, referring to some proposed value of a population parameter. The alternative hypothesis never contains an equals sign, but can be either a one-sided or two-sided inequality. | $H_0: \mu = \mu_0$, $H_A: \mu \ne \mu_0$ |
| 2. Select a Test Statistic | The test statistic is a standardized estimate of the difference between our sample and some hypothesized population parameter. It answers the question: "If the null hypothesis were true, how many standard deviations is our sample away from where we expected it to be?" | $\frac{\bar{X} - \mu_0}{s/\sqrt{n}}$ |
| 3. Derive a Decision Rule | The decision rule consists of regions of rejection and non-rejection, defined by critical values of the test statistic. It is used to establish the probable truth or falsity of the null hypothesis. | We reject $H_0$ if $\frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} > z_{\alpha}$ |
| 4. Calculate the Value of the Test Statistic; Invoke the Decision Rule in light of the Test Statistic | Either reject the null hypothesis (if the test statistic falls into the rejection region) or do not reject the null hypothesis (if the test statistic does not fall into the rejection region). | $\frac{\bar{X} - \mu_0}{s/\sqrt{n}} = \frac{0.21 - 0}{0.80/\sqrt{50}} \approx 1.86$ |
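The four steps can be sketched as a single function. The inputs below (sample mean 0.21, hypothesized mean 0, s = 0.80, n = 50) are an assumed reading of the partly garbled step 4 example, and the 5% significance level and the function name are illustrative choices, not taken from the source.

```python
import math
from statistics import NormalDist

def z_test_mean(xbar, mu0, s, n, alpha=0.05, tail="two"):
    """Large-sample z-test for H0: mu = mu0.

    Returns (z, critical_value, reject). tail is "two", "lower" or "upper".
    """
    # Step 2: standardized distance between the sample mean and mu0.
    z = (xbar - mu0) / (s / math.sqrt(n))
    std_normal = NormalDist()
    # Step 3: decision rule defined by the critical value.
    if tail == "two":
        crit = std_normal.inv_cdf(1 - alpha / 2)
        reject = abs(z) > crit
    elif tail == "upper":
        crit = std_normal.inv_cdf(1 - alpha)
        reject = z > crit
    elif tail == "lower":
        crit = -std_normal.inv_cdf(1 - alpha)
        reject = z < crit
    else:
        raise ValueError("tail must be 'two', 'lower' or 'upper'")
    # Step 4: compare the statistic with the rejection region.
    return z, crit, reject

# Assumed example values: xbar = 0.21, mu0 = 0, s = 0.80, n = 50.
print(z_test_mean(0.21, 0, 0.80, 50, tail="two"))
print(z_test_mean(0.21, 0, 0.80, 50, tail="upper"))
```

With these inputs the statistic is about 1.86, so at the 5% level the two-tailed test does not reject the null hypothesis while the upper-tail test does, which shows why the alternative must be fixed in step 1 before looking at the data.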
| Statistic | Symbol |
|---|---|
| Independent Variables | $X_1, \ldots, X_k$ |
| Dependent Variable (a random variable) | $Y$ |
| Dependent Variable (an individual observation among sample) | $Y_i$ |
| Intercept (or constant); an unknown population parameter | $\beta_0$ |
| Estimated intercept; an estimate of $\beta_0$ | $\hat{\beta}_0$ |
| Slope (or coefficient) for Independent Variable 1 (unknown) | $\beta_1$ |
| Estimated slope for Independent Variable 1; an estimate of $\beta_1$ | $\hat{\beta}_1$ |

Regression Statistics

| Statistic | Value |
|---|---|
| Multiple R | 0.9568 |
| R Square | 0.9155 |
| Adjusted R Square | 0.9015 |
| Standard Error | 6.6220 |
| Observations | 15 |

ANOVA

| | df | SS | MS | F | Significance F |
|---|---|---|---|---|---|
| Regression | 2 | 5704.0273 | 2852.0137 | 65.0391 | 0.… |
| Residual | 12 | 526.2087 | 43.8507 | | |
| Total | 14 | 6230.2360 | | | |

| | Coefficients | Standard Error | t Stat | P-value |
|---|---|---|---|---|
| Intercept | -20.3722 | 9.8139 | -2.0758 | 0.… |
| Size (100 sq ft) | 4.3117 | 0.4104 | 10.5059 | 0.… |
| Lot Size (1000 sq ft) | 4.7177 | 0.7646 | 6.1705 | 0.… |
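| Statistic (Mapped to Output Above) | Symbol | Formula |
|---|---|---|
| Dependent Variable (sample mean of n observations) | $\bar{Y}$ | $\bar{Y} = \frac{1}{n}\sum_{i=1}^{n} Y_i$ |
| Dependent Variable (estimated value for a given vector of independent variables) | $\hat{Y}_i$ | $\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_{1i} + \hat{\beta}_2 x_{2i} + \hat{\beta}_3 x_{3i} + \ldots + \hat{\beta}_k x_{ki}$ |
| Error for observation i. The unexplained difference between the actual value of $Y_i$ and the prediction for $Y_i$ based on our regression model. | $e_i$ | $e_i = Y_i - \hat{Y}_i$ |
| Total Sum of Squares (or SST) | $SST$ | $SST = \sum_{i=1}^{n} (Y_i - \bar{Y})^2$ |
| Sum of Squares due to Error | $SSE$ | $SSE = \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2$ |
| Mean Squares due to Error | $MSE$ | $MSE = \frac{SSE}{n - k - 1}$ |
| Sum of Squares due to Regression | $SSR$ | $SSR = \sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2$ |
| Mean Squares due to Regression | $MSR$ | $MSR = \frac{SSR}{k}$ |
| R-square (Coefficient of Determination) | $R^2$ | $R^2 = \frac{SSR}{SST} = 1 - \frac{SSE}{SST}$ |
| Multiple R (Coefficient of Multiple Correlation) | $R$ | $R = \sqrt{R^2}$ |
| Adjusted R-square | $\bar{R}^2$ | $\bar{R}^2 = 1 - \frac{SSE/(n - k - 1)}{SST/(n - 1)}$ |
| Standard Error (a.k.a. Standard Error of the Estimate) | $s_e$ | $s_e = \sqrt{\frac{SSE}{n - k - 1}} = \sqrt{MSE}$ |
| t-statistic for testing $H_0: \beta_1 = 0$ vs. $H_A: \beta_1 \ne 0$ | $t$ | $t = \frac{\hat{\beta}_1 - 0}{s_{\hat{\beta}_1}}$ |
| p-value for testing $H_0: \beta_1 = 0$ vs. $H_A: \beta_1 \ne 0$ | $p$ | $p = P(\lvert T_{n-k-1} \rvert > \lvert t \rvert)$ |
| F | $F$ | $F = \frac{MSR}{MSE} = \frac{2852.0137}{43.8507} = 65.0391$ |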
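The summary statistics in the regression output follow from the ANOVA sums of squares alone. The sketch below recomputes them from the values printed in the output (n = 15 observations, k = 2 independent variables); small differences in the last decimal are possible because the printed inputs are themselves rounded.

```python
import math

n, k = 15, 2                   # observations and independent variables
ssr, sse = 5704.0273, 526.2087  # regression and residual sums of squares

sst = ssr + sse                      # total sum of squares
msr = ssr / k                        # mean squares due to regression
mse = sse / (n - k - 1)              # mean squares due to error
r_square = ssr / sst                 # coefficient of determination
multiple_r = math.sqrt(r_square)     # coefficient of multiple correlation
adj_r_square = 1 - mse / (sst / (n - 1))
std_error = math.sqrt(mse)           # standard error of the estimate
f_stat = msr / mse

# t-statistic for a slope: estimated coefficient over its standard error.
t_size = 4.3117 / 0.4104

print(round(r_square, 4), round(multiple_r, 4), round(adj_r_square, 4))
print(round(std_error, 4), round(f_stat, 2), round(t_size, 2))
```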