
Understanding Population & Sample Statistics: Mean, Median, Dispersion & Confidence Interv, Exercises of Statistics

University of North Florida (UNF), Statistics

An in-depth exploration of statistics, focusing on population and sample mean, median, measures of location and dispersion, and confidence intervals. It covers the philosophical and practical meanings of a population, statistical samples, strategies for defining survey objectives, and various measures of location and dispersion such as mean, median, population variance, sample variance, and standard deviation. Additionally, it discusses point and interval estimates, sampling distributions, and the Central Limit Theorem.

What you will learn

How is the standard error of the mean calculated?
What is the difference between population variance and sample variance?
How is the median calculated?

What is the difference between population and sample mean?
What is the Central Limit Theorem and how does it apply to statistics?

Typology: Exercises

2021/2022

Uploaded on 09/27/2022

lalitdiya 🇺🇸


Basic Statistical Concepts

Statistical Population

The entire underlying set of observations from which samples are drawn.
- Philosophical meaning: all observations that could ever be taken for the range of inference
  - e.g. all barnacle populations that have ever existed, that exist or that will exist
- Practical meaning: all observations within a reasonable range of inference
  - e.g. barnacle populations on that stretch of coast


Statistical Sample

A representative subset of a population.
- What counts as being representative
  - Unbiased and hopefully precise

Strategies

Define survey objectives: what is the goal of the survey or experiment? What are your hypotheses?
Define population parameters to estimate (e.g. number of individuals, growth, color, etc.).
Implement a sampling strategy
- measure every individual (think of the implications in terms of cost, time and practicality, especially if measurement is destructive)
- measure a representative portion of the population (a sample)

Measures of location

Population mean ($\mu$) - the average value
Sample mean ($\bar{x}$) estimates $\mu$
Population median - the middle value
Sample median estimates population median
In a normal distribution the mean = median (also the mode); this is not ensured in other distributions

[Figure: two distributions of Y, one in which the mean and median coincide and one in which the median and mean differ.]
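To make the mean versus median point concrete, here is a minimal sketch using Python's standard `statistics` module. The two data sets are hypothetical and chosen only for illustration: one is symmetric, the other has a single large value that pulls the mean away from the median.

```python
import statistics

# Hypothetical measurements (not from the source slides).
symmetric = [21, 23, 25, 27, 29]
skewed = [21, 23, 25, 27, 79]  # one large value skews the distribution

for name, data in (("symmetric", symmetric), ("skewed", skewed)):
    mean = statistics.mean(data)      # the average value
    median = statistics.median(data)  # the middle value
    print(f"{name}: mean = {mean}, median = {median}")
```

In the symmetric set the two measures agree; in the skewed set the median stays at the middle value while the mean shifts toward the large observation.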

Measures of dispersion

Population variance ($\sigma^2$) - the average of the squared deviations from the population mean
Sample variance ($s^2$) estimates the population variance
Standard deviation ($s$)
- square root of variance
- same units as original variable

Measures (statistics) of Dispersion

Population sum of squares: $\sum (x_i - \mu)^2$

Sample sum of squares: $SS = \sum (x_i - \bar{x})^2$

Population variance: $\sigma^2 = \dfrac{\sum (x_i - \mu)^2}{n}$
- Note, units are squared
- Denominator is ($n$)

Sample variance: $s^2 = \dfrac{\sum (x_i - \bar{x})^2}{n - 1}$
- Note, units are squared
- Denominator is ($n - 1$)

Sample standard deviation: $s = \sqrt{\dfrac{\sum (x_i - \bar{x})^2}{n - 1}}$
- Note, units are not squared
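The difference between the two denominators is easy to see numerically. The following is a small sketch, assuming a hypothetical five-value sample; it computes the sum of squares by hand and then checks the results against the standard library's `statistics.pvariance` (denominator n) and `statistics.variance` (denominator n - 1).

```python
import math
import statistics

# Hypothetical sample of five measurements (not from the source slides).
x = [2.0, 4.0, 4.0, 5.0, 10.0]
n = len(x)
x_bar = sum(x) / n

# Sum of squares: squared deviations from the mean, added up.
ss = sum((xi - x_bar) ** 2 for xi in x)

pop_var = ss / n                   # denominator n: treats x as the whole population
sample_var = ss / (n - 1)          # denominator n - 1: estimates the population variance
sample_sd = math.sqrt(sample_var)  # square root returns to the original units

print(f"SS = {ss}, population variance = {pop_var}")
print(f"sample variance = {sample_var}, sample SD = {sample_sd}")

# Cross-check against the standard library.
assert math.isclose(pop_var, statistics.pvariance(x))
assert math.isclose(sample_var, statistics.variance(x))
```

Dividing by n - 1 always gives the larger of the two variances, which compensates for measuring deviations from the sample mean rather than from the unknown population mean.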

More Statistics of Dispersion

Standard error of the mean: $s_{\bar{x}} = \sqrt{\dfrac{s^2}{n}} = \dfrac{s}{\sqrt{n}}$
- This is also the standard deviation of the sample means

Coefficient of variation: $CV = \dfrac{s}{\bar{x}}$
- Measurement of variation independent of units
- Expressed as a percentage of mean

Covariance: $s_{xy} = \dfrac{\sum (x_i - \bar{x})(y_i - \bar{y})}{n - 1}$
- Measure of how two variables covary
- Range is between $-\infty$ and $+\infty$
- Value depends in part on range in data: bigger numbers yield bigger values of covariance
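A short sketch of these three statistics follows, using two hypothetical paired variables. It also rescales the data by a factor of 10 to illustrate the slide's point that covariance grows with the size of the numbers while the coefficient of variation does not. The helper names `covariance` and `cv_percent` are introduced here for illustration only.

```python
import math
import statistics

def covariance(x, y):
    """Sample covariance with denominator n - 1."""
    x_bar, y_bar = statistics.mean(x), statistics.mean(y)
    return sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (len(x) - 1)

def cv_percent(x):
    """Coefficient of variation as a percentage of the mean."""
    return 100 * statistics.stdev(x) / statistics.mean(x)

# Hypothetical paired measurements (not from the source slides).
x = [2.0, 4.0, 4.0, 5.0, 10.0]
y = [1.0, 3.0, 2.0, 5.0, 9.0]

sem = statistics.stdev(x) / math.sqrt(len(x))  # standard error of the mean
print(f"SEM = {sem:.3f}, CV = {cv_percent(x):.1f}%, covariance = {covariance(x, y):.2f}")

# Same data expressed in units ten times smaller (values ten times bigger).
x10 = [10 * xi for xi in x]
y10 = [10 * yi for yi in y]
print(f"rescaled: CV = {cv_percent(x10):.1f}%, covariance = {covariance(x10, y10):.2f}")
```

The rescaled covariance is 100 times larger even though the relationship between the variables is unchanged, which is why covariance alone is hard to compare across data sets.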

[Figure: sampling distribution of sample means. Multiple samples drawn from a population with true mean = 25 give multiple sample means; a histogram plots number of cases against estimate of mean.]

Sampling distribution of mean

The sampling distribution of the sample mean approaches a normal distribution as n gets larger (the Central Limit Theorem).
The mean of this sampling distribution is $\mu$, the mean of the original population.
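The Central Limit Theorem can be checked by simulation. The sketch below is an illustration under stated assumptions, not part of the original slides: the population is taken to be exponential (strongly skewed) with mean 25, and the sample sizes and number of repeats are arbitrary choices.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

POPULATION_MEAN = 25.0  # assumed true mean of a skewed (exponential) population
REPEATS = 5000          # number of samples drawn for each sample size

def sample_mean(n):
    """Mean of one random sample of size n from the exponential population."""
    return statistics.mean([random.expovariate(1 / POPULATION_MEAN) for _ in range(n)])

# For each sample size, collect the means of many independent samples.
results = {n: [sample_mean(n) for _ in range(REPEATS)] for n in (2, 10, 40)}

for n, means in results.items():
    print(f"n = {n:2d}: mean of sample means = {statistics.mean(means):.2f}, "
          f"SD of sample means = {statistics.stdev(means):.2f}")
```

The mean of the sample means stays close to the population mean at every sample size, while their spread shrinks as n grows; plotting a histogram of `results[40]` would show a shape much closer to normal than the skewed population itself.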

[Figure: histogram of estimates of the mean ($\bar{x}$) from a large number of samples, plotted as proportion per bar; about 2 SEM either side of $\mu$ leaves 2.5% of the probability in each tail.]

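Standard deviation can be calculated for any distribution. The standard deviation of the distribution of sample means can be calculated the same way as for a given sample:

$\sigma_{\bar{x}} = \sqrt{\dfrac{\sum (\bar{x}_i - \bar{\bar{x}})^2}{N - 1}}$

where $\bar{x}_i$ is the mean of sample $i$, $\bar{\bar{x}}$ is the mean of the sample means and $N$ is the number of samples. However, to do so would require an immense sampling effort, hence an approximation is used:

$\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}} \approx s_{\bar{x}} = \dfrac{s}{\sqrt{n}} = SEM$

Where: s = sample standard deviation and n = number of replicates in the sample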
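The approximation can be compared against the brute-force calculation in a simulation. This is a sketch under assumed values: a normal population with mean 25 and standard deviation 8, and samples of n = 16, so that $\sigma / \sqrt{n} = 2$. None of these numbers come from the slides.

```python
import math
import random
import statistics

random.seed(7)  # fixed seed so the sketch is repeatable

MU, SIGMA, N = 25.0, 8.0, 16  # assumed population mean, population SD, sample size

def draw_sample():
    """One random sample of N values from the assumed normal population."""
    return [random.gauss(MU, SIGMA) for _ in range(N)]

# Brute force: the standard deviation of many sample means.
sample_means = [statistics.mean(draw_sample()) for _ in range(4000)]
sd_of_means = statistics.stdev(sample_means)

# Approximation: s / sqrt(n) from a single sample.
one_sample = draw_sample()
sem = statistics.stdev(one_sample) / math.sqrt(N)

print(f"SD of 4000 sample means: {sd_of_means:.2f}")
print(f"sigma / sqrt(n):         {SIGMA / math.sqrt(N):.2f}")
print(f"s / sqrt(n), one sample: {sem:.2f}")
```

The single-sample estimate is noisier than the brute-force value, but it needs 16 measurements rather than 64,000, which is the practical reason the approximation is used.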

Standard error of mean

The standard deviation of the sampling distribution of the mean ($\sigma / \sqrt{n}$) is estimated by the sample SE: $s / \sqrt{n}$

- measures precision of sample mean
- how close the sample mean is likely to be to the true population mean

Standard error of mean

If SE is low:
- repeated samples would produce similar sample means
- therefore, any single sample mean likely to be close to population mean
If SE is high:
- repeated samples would produce very different sample means
- therefore, any single sample mean may not be close to population mean

[Figure: effect of standard error on the estimate of $\mu$ (assume df is large). Two panels with different SEM plot probability against estimate of mean; in each panel the central region spans about 2 SEM either side of $\mu$, with 2.5% in each tail.]

Distribution of sample means

Calculate the proportion of sample means within a range of values.

Transform distribution of means to a distribution with mean = 0 and standard deviation = 1

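[Figure: distribution of sample means, $P(\bar{y})$ against $\bar{y}$, with the central 95% and 99% regions marked.]

t statistic

$t = \dfrac{\bar{y} - \mu}{s / \sqrt{n}}$

[Figure: null distribution of t, plotted as probability against t from -5 to 5.]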

t statistic – interpretation and units

$t = \dfrac{\bar{y} - \mu}{s / \sqrt{n}}$

The deviation between the sample and population mean is expressed in terms of standard errors (i.e. standard deviations of the sampling distribution).
Hence values of t are in standard errors.
For example, t = 2 indicates that the deviation ($\bar{y} - \mu$) is equal to 2 × the standard error.
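As a worked illustration of the t statistic, the sketch below computes t for a hypothetical sample of five values against an assumed population mean of 25. The data and the value of $\mu$ are invented for the example; the critical value 2.776 is the 4 df, two-tailed 0.05 entry from the t table in these slides.

```python
import math
import statistics

# Hypothetical sample and assumed population mean (not from the source slides).
y = [27.0, 29.0, 31.0, 33.0, 35.0]
mu = 25.0

n = len(y)
y_bar = statistics.mean(y)
s = statistics.stdev(y)   # sample standard deviation (denominator n - 1)
se = s / math.sqrt(n)     # standard error of the mean

# t expresses the deviation of the sample mean from mu in standard errors.
t = (y_bar - mu) / se
print(f"mean = {y_bar}, SE = {se:.3f}, t = {t:.3f} standard errors")

# Two-tailed 95% critical value for df = n - 1 = 4, taken from the t table.
T_CRIT_4DF = 2.776
print("outside the central 95% of the t distribution:", abs(t) > T_CRIT_4DF)
```

Here the sample mean lies a little over 4 standard errors from the assumed $\mu$, which is outside the range that contains 95% of t values for 4 df.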

Two tailed t-values

Probabilities of occurring outside the range $-t_{df}$ to $+t_{df}$:

Degrees of Freedom   .01     .02     .05     .10     .
1                    63.66   31.82   12.71   6.314   3.
2                    9.925   6.965   4.303   2.920   1.
3                    5.841   4.541   3.182   2.353   1.
4                    4.604   3.747   2.776   2.132   1.
5                    4.032   3.365   2.571   2.015   1.
10                   3.169   2.764   2.228   1.812   1.
15                   2.947   2.602   2.132   1.753   1.
20                   2.845   2.528   2.086   1.725   1.
25                   2.787   2.485   2.060   1.708   1.
z                    2.575   2.326   1.960   1.645   1.

[Figure: t distribution for 4 df, plotted as probability against t from -5 to 5, with 95% of the area between -2.78 and +2.78.]

$t = \dfrac{\bar{y} - \mu}{s / \sqrt{n}}$

One and two tailed t-values (df 4)

Probability (1 tailed / 2 tailed):

Degrees of Freedom   .005/.01   .01/.02   .025/.05   .05/.10   .10/.
1                    63.66      31.82     12.71      6.314     3.
2                    9.925      6.965     4.303      2.920     1.
3                    5.841      4.541     3.182      2.353     1.
4                    4.604      3.747     2.776      2.132     1.
5                    4.032      3.365     2.571      2.015     1.
10                   3.169      2.764     2.228      1.812     1.
15                   2.947      2.602     2.132      1.753     1.
20                   2.845      2.528     2.086      1.725     1.
25                   2.787      2.485     2.060      1.708     1.
z                    2.575      2.326     1.960      1.645     1.

[Figure: three t distributions for 4 df. 2 tailed: 95% of the area lies between -2.78 and +2.78. 1 tailed: 95% lies below +2.132. 1 tailed: 95% lies above -2.132.]

$t = \dfrac{\bar{y} - \mu}{s / \sqrt{n}}$

The t statistic

This t statistic follows a t -distribution, which has a mathematical formula.
Approximately the same as the normal distribution for n > 30; otherwise flatter and more spread out than the normal distribution.
There are different t distributions for different sample sizes below 30 (strictly, for different df, which is n - 1).
The proportion of t values lying between two particular t values yields a confidence estimate (the likelihood that the true mean is in the range).

For n = 5 (df = 4), 95% of all t values occur between t = -2.78 and t = +2.78.

[Figure: Pr(t) against t for 4 df, with 95% of the area between -2.78 and +2.78.]

Probability is 95% that t is between -2.78 and +2.78.
Probability is 95% that $\dfrac{\bar{y} - \mu}{s / \sqrt{n}}$ is between -2.78 and +2.78.
Rearrange the equation to solve for $\mu$.

Worked example (Lovett et al. 2000): sample mean 61.92, sample SD 5.24, SE 0.84 (n = 39)

The t value (95%, 38 df) = 2.02 (from a t-table)
2.5% of t values are greater than 2.02
2.5% of t values are less than -2.02
95% of t values are between -2.02 and +2.02

$P\{61.92 - 2.02\,(5.24 / \sqrt{39}) < \mu < 61.92 + 2.02\,(5.24 / \sqrt{39})\} = 0.95$

$P\{60.22 < \mu < 63.62\} = 0.95$
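The arithmetic of the worked example can be reproduced in a few lines. This sketch uses the figures quoted in the example (mean 61.92, SD 5.24, 38 df so n = 39, and t = 2.02 read from the t table); the variable names are illustrative.

```python
import math

# Values quoted in the worked example (Lovett et al. 2000).
y_bar = 61.92   # sample mean
s = 5.24        # sample standard deviation
n = 39          # sample size, giving 38 degrees of freedom
t_crit = 2.02   # two-tailed 95% t value for 38 df, read from a t table

se = s / math.sqrt(n)     # standard error of the mean
half_width = t_crit * se  # distance from the mean to each confidence limit

lower = y_bar - half_width
upper = y_bar + half_width
print(f"SE = {se:.3f}")
print(f"95% CI: {lower:.3f} < mu < {upper:.3f}")
```

Carrying the unrounded standard error through the calculation can shift the second decimal place slightly relative to the slide's limits, which appear to use the SE rounded to 0.84.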

Confidence Interval (2 tailed) assume 95% CI is desired

Lovett et al. (2000): sample mean 61.92, SEM 0.84, DF 38

Probability:

Degrees of Freedom   .01     .02     .05     .10     .
1                    63.66   31.82   12.71   6.314   3.
2                    9.925   6.965   4.303   2.920   1.
3                    5.841   4.541   3.182   2.353   1.
4                    4.604   3.747   2.776   2.132   1.
5                    4.032   3.365   2.571   2.015   1.
10                   3.169   2.764   2.228   1.812   1.
15                   2.947   2.602   2.132   1.753   1.
20                   2.845   2.528   2.086   1.725   1.
25                   2.787   2.485   2.060   1.708   1.
38                   2.705   2.426   2.020   1.685   1.

[Figure: t distribution for 38 df with the central 95% region marked.]

$\Pr[\,\bar{y} - t\,(s / \sqrt{n}) \le \mu \le \bar{y} + t\,(s / \sqrt{n})\,]$

Lower limit: $\bar{y} - t\,(s / \sqrt{n}) = 61.92 - 2.02\,(0.84) = 60.22$
Upper limit: $\bar{y} + t\,(s / \sqrt{n}) = 61.92 + 2.02\,(0.84) = 63.62$

$60.22 < \mu < 63.62$

Intervals constructed this way will contain $\mu$ 95% of the time.
We are 95% confident that the interval 60.22 to 63.62 contains $\mu$.

Effect on Confidence Interval

Case         Mean    Sample size (SS)   Standard deviation (SD)   Standard error   Probability (%)   Lower confidence limit   Upper confidence limit
Reference    61.92   39                 5.24                      0.834            95%               60.22                    63.62
Double SD    61.92   39                 10.48                     1.68             95%               58.53                    65.
Reduce SS    61.92   20                 5.24                      1.17             95%               59.47                    64.
Increase %   61.92   39                 5.24                      0.834            99%               59.65                    64.
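The four cases in the table can be recomputed to see how each change widens the interval. This is a sketch, with two caveats stated up front: the helper `confidence_interval` is a name introduced here, and the t value 2.093 for the reduced sample size (19 df) is an assumption taken from a standard t table, since that row does not appear in the slides. The other t values (2.02 and 2.705 for 38 df) come from the table in the slides.

```python
import math

def confidence_interval(mean, sd, n, t_crit):
    """Return (standard error, lower limit, upper limit) for mean +/- t * SE."""
    se = sd / math.sqrt(n)
    half_width = t_crit * se
    return se, mean - half_width, mean + half_width

# (case, mean, SD, sample size, two-tailed t value)
cases = [
    ("Reference",  61.92, 5.24,  39, 2.02),
    ("Double SD",  61.92, 10.48, 39, 2.02),
    ("Reduce SS",  61.92, 5.24,  20, 2.093),  # assumed t for 19 df, not in the slides
    ("Increase %", 61.92, 5.24,  39, 2.705),  # 99% level, 38 df
]

results = {}
for label, mean, sd, n, t_crit in cases:
    se, lower, upper = confidence_interval(mean, sd, n, t_crit)
    results[label] = (se, lower, upper)
    print(f"{label:<11} SE = {se:.3f}  CI = {lower:.2f} to {upper:.2f}  "
          f"width = {upper - lower:.2f}")
```

Doubling the standard deviation, reducing the sample size and raising the confidence level all widen the interval relative to the reference case. The recomputed reference standard error comes out near 0.839 rather than the 0.834 shown in the table, so small differences in the last decimal place are to be expected.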
