Central Limit Theorem: Demonstrating Convergence to Normal Distribution through Simulation, Lab Reports of Mathematics

This document explains the central limit theorem and demonstrates its validity through simulation. It compares the distributions of sample means for a right-skewed population and a normal population, showing how the distribution of the sample mean approaches a normal distribution as the sample size increases. It also includes R code for generating random samples and calculating sample means.

Typology: Lab Reports

Pre 2010

Uploaded on 08/16/2009

koofers-user-l87
Math 338, Lab 5, Due Monday 22, 2009
1 Central Limit Theorem
The central limit theorem says that if $x \sim D(\mu, \sigma)$, where $D$ is a probability density function or probability mass function with mean $\mu$ and standard deviation $\sigma$ (regardless of the form of the distribution), then when the sample size $n$ is large enough, approximately $\bar{x} \sim N(\mu, \sigma/\sqrt{n})$. When the underlying population distribution is itself Normal, we can assume $\bar{x} \sim N(\mu, \sigma/\sqrt{n})$ even for small $n$.

In this handout we use simulation to obtain the sampling distribution of the sample mean for two populations: a right-skewed population and a normally distributed one.

Figure 1 shows a clearly right-skewed population with $\mu = 2.029$ and $\sigma = 1.382$. We take 1000 random samples of size 2 from this population and obtain the 1000 sample averages associated with those samples. We then repeat this process for samples of sizes 3, 6, 10, 20, and 100, each time recording the mean and the distribution of the 1000 sample averages. We also plot histograms and QQ-plots for each scenario. As the sample size increases, the distribution of the 1000 sample averages converges to normality (figures 2 and 3). Also, the standard deviation of those sampling distributions gets closer to $\sigma/\sqrt{n}$ (table 1).

Next, we sample from a normal population with $\mu = 99.602$ and $\sigma = 10.2211$ (figure 4) and repeat the same procedure for the sample averages. One can verify that, regardless of sample size, the central limit theorem holds (figures 5, 6, and table 2).
Sample Size            2          3          6          10         20         100
Mean                   2.0945     2.061667   2.016      2.0301     2.0186     2.02654
Standard Deviation     0.9860487  0.78204    0.5660369  0.453773   0.3010642  0.1300328

Table 1. Results for population 1. The mean and standard deviation of the sample mean for different sample sizes.

Sample Size            2          3          6          10         20         100
Mean                   99.95793   99.30017   99.5832    99.5639    99.653     99.57605
Standard Deviation     7.199265   5.83941    4.132356   3.290563   2.240824   0.9427503

Table 2. Results for population 2. The mean and standard deviation of the sample mean for different sample sizes.
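The claim that the standard deviation of the sample mean approaches $\sigma/\sqrt{n}$ can be checked directly against the two tables. The sketch below is only an illustration: it copies the population standard deviations and the simulated values from the text and tables above, and the variable names (`theory1`, `sim1`, `comparison`, and so on) are chosen here for convenience rather than taken from the handout.

```r
# Population standard deviations and sample sizes quoted in the handout
sigma1 <- 1.382     # right-skewed population (Figure 1)
sigma2 <- 10.2211   # normal population (Figure 4)
n <- c(2, 3, 6, 10, 20, 100)

# Theoretical standard deviation of the sample mean: sigma / sqrt(n)
theory1 <- sigma1 / sqrt(n)
theory2 <- sigma2 / sqrt(n)

# Simulated standard deviations copied from Tables 1 and 2
sim1 <- c(0.9860487, 0.78204, 0.5660369, 0.453773, 0.3010642, 0.1300328)
sim2 <- c(7.199265, 5.83941, 4.132356, 3.290563, 2.240824, 0.9427503)

# Side-by-side comparison, one row per sample size
comparison <- data.frame(n,
                         theory1 = round(theory1, 4), sim1,
                         theory2 = round(theory2, 4), sim2)
print(comparison)
```

Each row puts the theoretical value next to the simulated one, so the agreement can be read off for every sample size.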

[Histogram of test; x-axis: test (0 to 7), y-axis: Frequency (0 to 300)]

Figure 1. Case one: Population distribution. The distribution is skewed to the right.

[Six Normal Q-Q plots; x-axis: Theoretical Quantiles (−3 to 3), y-axis: Sample Quantiles]

Figure 3. QQ-plots for the sample mean distributions for different sample sizes.

[Histogram of test; x-axis: test (70 to 130), y-axis: Frequency (0 to 200)]

Figure 4. Case two: Population distribution. Normal distribution.

[Six Normal Q-Q plots; x-axis: Theoretical Quantiles (−3 to 3), y-axis: Sample Quantiles]

Figure 6. QQ-plots for the sample mean distributions for different sample sizes.

R-codes For Simulation

(a) Population 1: Right Skewed Distribution

We can simulate from a Poisson distribution:

# Population 1: 1000 draws from a Poisson distribution with rate 2
test1 <- rpois(1000, 2)
hist(test1)
mean(test1)
# [1] 2.
sd(test1)
# [1] 1.
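As a reference point for the output of `mean(test1)` and `sd(test1)`, the theoretical moments of a Poisson distribution with rate $\lambda$ are standard results. With $\lambda = 2$, as in `rpois(1000, 2)`:

```latex
\[
E[X] = \lambda = 2, \qquad
\mathrm{sd}(X) = \sqrt{\lambda} = \sqrt{2} \approx 1.414, \qquad
\text{skewness} = \frac{1}{\sqrt{\lambda}} = \frac{1}{\sqrt{2}} \approx 0.707 > 0 .
\]
```

The positive skewness is what makes this population right skewed, and the values $\mu = 2.029$ and $\sigma = 1.382$ reported for the simulated population are close to these theoretical figures.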

(b) Population 1: Obtaining 1000 Samples With Size 2, 3, 6, 10, 20, 100

Here is the case for size 2. Others are similar.

# Each row of test holds one random sample of size 2 drawn from test1
test <- matrix(nrow = 1000, ncol = 2)
for (i in 1:1000) {
  test[i, ] <- sample(test1, 2)
}

# Row-wise means give the 1000 sample averages
mean.size2 <- apply(test, 1, mean)

mean(mean.size2)
# [1] 2.
sd(mean.size2)
# [1] 0.
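The handout shows only the size-2 case and notes that the others are similar. One possible way to run all six sample sizes in a single pass is sketched below. It is not the handout's own code: it swaps the matrix-and-loop approach for `replicate()`, and the names `sizes`, `sample.means`, and `results`, as well as the seed, are assumptions introduced for illustration.

```r
set.seed(338)  # arbitrary seed, used only so the sketch is reproducible

# Population 1, generated as in part (a)
test1 <- rpois(1000, 2)

sizes <- c(2, 3, 6, 10, 20, 100)

# For each sample size, draw 1000 samples from test1 and keep their means
sample.means <- lapply(sizes, function(n) {
  replicate(1000, mean(sample(test1, n)))
})
names(sample.means) <- paste0("size", sizes)

# Mean and standard deviation of the 1000 sample means, per sample size
results <- data.frame(
  size = sizes,
  mean = sapply(sample.means, mean),
  sd   = sapply(sample.means, sd)
)
print(results)

# Histograms, then normal Q-Q plots, one panel per sample size
par(mfrow = c(2, 3))
for (k in seq_along(sizes)) {
  hist(sample.means[[k]], main = paste("n =", sizes[k]), xlab = "sample mean")
}
for (k in seq_along(sizes)) {
  qqnorm(sample.means[[k]], main = paste("Q-Q plot, n =", sizes[k]))
}
```

The `results` data frame has the same layout as Table 1, although the exact numbers will differ from run to run and from the handout's values because the random draws differ.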

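(c) Population 2: Obtaining 1000 Samples With Size 2, 3, 6, 10, 20, 100

Again, only the case for size 2 is included.

# test2 is the vector holding population 2, the normal population of Figure 4
# Each row of test holds one random sample of size 2 drawn from test2
test <- matrix(nrow = 1000, ncol = 2)
for (i in 1:1000) {
  test[i, ] <- sample(test2, 2)
}

# Row-wise means give the 1000 sample averages
mean.size2.norm <- apply(test, 1, mean)

mean(mean.size2.norm)
# [1] 99.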
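The code above uses `test2`, but the line that creates it does not appear in this part of the handout. The sketch below fills that gap with an explicit assumption: `rnorm(1000, mean = 100, sd = 10)` is a guess that is merely consistent with the reported $\mu = 99.602$ and $\sigma = 10.2211$, not the author's confirmed call. The seed and the use of `replicate()` are likewise choices made for this illustration.

```r
set.seed(338)  # arbitrary seed, used only so the sketch is reproducible

# ASSUMPTION: the handout does not show how test2 was generated;
# these parameters are a guess consistent with the reported mean and sd
test2 <- rnorm(1000, mean = 100, sd = 10)

# 1000 sample means from samples of size 2
mean.size2.norm <- replicate(1000, mean(sample(test2, 2)))

# Compare the simulated values with the theoretical sigma / sqrt(n)
mean(mean.size2.norm)
sd(mean.size2.norm)
sd(test2) / sqrt(2)

# Normal Q-Q plot of the sample means with a reference line
qqnorm(mean.size2.norm)
qqline(mean.size2.norm)
```

Because the population is already normal, the Q-Q plot should look close to a straight line even at this small sample size.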

Assignment

(1) Use R to generate 10,000 random samples of size 10 from a Binomial(10, 0.5) distribution. In each repetition, calculate the sample mean. Graph the distribution of X̄, the sample mean. Does the central limit theorem hold in this case?

(2) Repeat the above procedure by increasing the sample size to 50, 100, 1000, 10000. Comment on the validity of the central limit theorem.

(3) Scores on the Stanford-Binet IQ test are normally distributed with μ = 100 and σ = 16.

(a) Use R to obtain 1000 random samples of size 20 from this distribution.
(b) Compute the sample mean for each of the 1000 samples.
(c) Draw a histogram of the 1000 sample means. Comment on its shape.
(d) What do you expect the mean and the standard deviation of the sampling distribution of the mean to be?
(e) Compute the mean and the standard deviation of the sample means. Are they close to the theoretical values?
(f) Using the simulation results, approximate the probability that a random sample of size 20 results in a sample mean greater than 108.

(4) The incomes in a certain large population of college teachers have a distribution with a mean of μ = $35,000 and a standard deviation of σ = $5,000. Four hundred teachers are selected at random from this population.

(a) What is the sampling distribution of X̄, the sample mean income?
(b) What is the probability that the average salary of the 400 teachers will exceed $35,615?