






This document explains the central limit theorem and demonstrates its validity through simulation. It compares the distributions of sample means for a right-skewed population and a normal population, showing how the distribution of the sample mean converges to a normal distribution as the sample size increases. It also includes R code for generating random samples and calculating sample means.
Typology: Lab Reports
The central limit theorem says that if $x \sim D(\mu, \sigma)$, where D is a probability density function or probability mass function (regardless of the form of the distribution), then when the sample size n is large enough: $\bar{x} \sim N(\mu, \sigma/\sqrt{n})$. When the underlying population distribution is itself Normal, we can assume $\bar{x} \sim N(\mu, \sigma/\sqrt{n})$ even for smaller n. In this handout we use simulation to obtain the sampling distribution of the sample mean for two populations: a right-skewed population and a normally distributed one.

Figure 1 shows a clearly right-skewed population with μ = 2.029 and σ = 1.382. We take 1000 samples of size 2 from this population and obtain the 1000 sample averages associated with those random samples. We then repeat this process for samples of sizes 3, 6, 10, 20, and 100, each time keeping track of the mean and the distribution of the 1000 sample averages. We also plot histograms and QQ-plots for each scenario. As the sample size increases, the distribution of the 1000 sample averages converges to normality (figures 2 and 3), and the standard deviation of those sampling distributions gets closer to $\sigma/\sqrt{n}$ (table 1).

Next, we sample from a normal population with μ = 99.602 and σ = 10.2211 (figure 4) and repeat the same procedure for the sample averages. One can verify that, regardless of sample size, the central limit theorem holds (figures 5, 6, and table 2).
Sample Size           2          3          6          10         20         100
Mean                  2.0945     2.061667   2.016      2.0301     2.0186     2.
Standard Deviation    0.9860487  0.78204    0.5660369  0.453773   0.3010642  0.
Table 1. Results for population 1. The mean and standard deviation of the sample mean for different sample sizes.
Sample Size           2          3          6          10         20         100
Mean                  99.95793   99.30017   99.5832    99.5639    99.653     99.
Standard Deviation    7.199265   5.83941    4.132356   3.290563   2.240824   0.
Table 2. Results for population 2. The mean and standard deviation of the sample mean for different sample sizes.
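To compare the simulated standard deviations in Tables 1 and 2 with the value σ/√n predicted by the central limit theorem, the theoretical values can be computed directly. This is a minimal sketch that uses only the population standard deviations quoted in the text; the variable names (`sigma1`, `sigma2`, `theory`) are illustrative and do not come from the handout.

```r
# Population standard deviations quoted in the text
sigma1 <- 1.382     # population 1 (right skewed)
sigma2 <- 10.2211   # population 2 (normal)

# Sample sizes used in the simulation
n <- c(2, 3, 6, 10, 20, 100)

# Theoretical standard deviation of the sample mean: sigma / sqrt(n)
theory <- data.frame(
  n       = n,
  se_pop1 = sigma1 / sqrt(n),
  se_pop2 = sigma2 / sqrt(n)
)
print(round(theory, 4))
```

For n = 2 this gives roughly 0.977 and 7.227, which can be set against the simulated values 0.9860487 and 7.199265 in the tables.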
[Histogram of test; x-axis: test, y-axis: Frequency]
Figure 1. Case one: Population distribution. The distribution is skewed to the right.
[Six Normal Q-Q plot panels; x-axis: Theoretical Quantiles, y-axis: Sample Quantiles]
Figure 3. QQ-plots for the sample mean distributions for different sample sizes.
[Histogram of test; x-axis: test, y-axis: Frequency]
Figure 4. Case two: Population distribution. Normal distribution.
[Six Normal Q-Q plot panels; x-axis: Theoretical Quantiles, y-axis: Sample Quantiles]
Figure 6. QQ-plots for the sample mean distributions for different sample sizes.
(a) Population 1: Right Skewed Distribution
We can simulate from a Poisson distribution:
test1 <- rpois(1000, 2)   # population of 1000 Poisson values with rate 2
hist(test1)               # histogram of the population
mean(test1)
[1] 2.
sd(test1)
[1] 1.
(b) Population 1: Obtaining 1000 Samples With Size 2, 3, 6, 10, 20, 100
Here is the case for size 2. The others are similar.
test <- matrix(nrow = 1000, ncol = 2)
for (i in 1:1000) {
  test[i, ] <- sample(test1, 2)      # one random sample of size 2 per row
}
mean.size2 <- apply(test, 1, mean)   # the 1000 sample means
mean(mean.size2)
[1] 2.
sd(mean.size2)
[1] 0.
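Since the other sample sizes are handled the same way, the whole procedure can be written as one loop over the sizes. The following is a minimal sketch, not the handout's own code: it regenerates a Poisson population as a stand-in for test1, and the seed and the names `pop`, `sizes`, `sample_means`, and `results` are assumptions made for illustration.

```r
set.seed(1)                        # arbitrary seed, only for reproducibility (assumption)
pop   <- rpois(1000, 2)            # stand-in for test1 from part (a)
sizes <- c(2, 3, 6, 10, 20, 100)   # sample sizes used in the handout

# For each sample size, draw 1000 samples and keep the 1000 sample means
sample_means <- lapply(sizes, function(n) {
  replicate(1000, mean(sample(pop, n)))
})
names(sample_means) <- paste0("n", sizes)

# Mean and standard deviation of the sample means, one column per sample size
results <- rbind(
  mean = sapply(sample_means, mean),
  sd   = sapply(sample_means, sd)
)
print(round(results, 4))
```

Using `replicate` avoids preallocating a matrix for each size; the rows of `results` correspond to the two rows reported in Table 1.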
(c) Population 2: Obtaining 1000 Samples With Size 2, 3, 6, 10, 20, 100
Again, only the case for size 2 is included.
test <- matrix(nrow = 1000, ncol = 2)
for (i in 1:1000) {
  test[i, ] <- sample(test2, 2)           # one random sample of size 2 per row
}
mean.size2.norm <- apply(test, 1, mean)   # the 1000 sample means
mean(mean.size2.norm)
[1] 99.
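The histograms and QQ-plots of the sample means can be drawn with base R graphics. The sketch below is illustrative only: the code that creates test2 is not shown in this excerpt, so a normal population is generated here with assumed parameters (mean 100, standard deviation 10), and the seed and the names `pop2` and `means` are likewise assumptions.

```r
set.seed(1)                                 # arbitrary seed (assumption)
pop2 <- rnorm(1000, mean = 100, sd = 10)    # ASSUMED stand-in for test2

# 1000 sample means from samples of size 2
means <- replicate(1000, mean(sample(pop2, 2)))

op <- par(mfrow = c(1, 2))                  # two panels side by side
hist(means, main = "Sample means, n = 2", xlab = "Sample mean")
qqnorm(means)                               # points near a straight line suggest normality
qqline(means)                               # reference line through the quartiles
par(op)                                     # restore the previous graphics settings
```

Changing the sample size in the call to `sample` gives the corresponding panels for the other sample sizes.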
(1) Use R to generate 10,000 random samples of size 10 from a Binomial(10, 0.5) distribution. In each repetition, calculate the sample mean. Graph the distribution of X̄, the sample mean. Does the central limit theorem hold in this case?
(2) Repeat the above procedure, increasing the sample size to 50, 100, 1000, and 10000. Comment on the validity of the central limit theorem.
(3) Scores on the Stanford-Binet IQ test are normally distributed with μ = 100 and σ = 16.
(a) Use R to obtain 1000 random samples of size 20 from this distribution.
(b) Compute the sample mean for each of the 1000 samples.
(c) Draw a histogram of the 1000 sample means. Comment on its shape.
(d) What do you expect the mean and the standard deviation of the sampling distribution of the mean to be?
(e) Compute the mean and the standard deviation of the sample means. Are they close to the theoretical values?
(f) Using the simulation results, approximate the probability that a random sample of size 20 results in a sample mean greater than 108.
(4) The incomes in a certain large population of college teachers have a distribution with a mean of μ = $35,000 and a standard deviation of σ = $5,000. Four hundred teachers are selected at random from this population.
(a) What is the sampling distribution of X̄, the sample mean income?
(b) What is the probability that the average salary of the 400 teachers will exceed $35,615?