Statistical Analysis of Two Independent Samples, Study notes of Statistics

Worked calculations for the statistical analysis of two independent samples, covering sample means, standard deviations, and hypothesis tests using z-scores and t-tests. Each problem lists the sample sizes, sample means, sample standard deviations, and (where needed) degrees of freedom.

Typology: Study notes

2010/2011

Uploaded on 04/29/2011

rollercoaster-101
ECON 2202 - LONG ANSWER QUESTIONS
TOPIC 1
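Q1.

Hypothesis Test:

1. Two independent random samples; population variances unknown (and unequal).
$\bar{x}_1 = 5.30$; $\bar{x}_2 = 5.10$; $s_1 = 2.20$; $s_2 = 2.644$.
$n_1 = 75 \ge 30$ and $n_2 = 80 \ge 30$, so the CLT applies.
Use $z_0 = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$, $\alpha = 0.05$.
2. $H_0: \mu_1 - \mu_2 \le 0$; $H_a: \mu_1 - \mu_2 > 0$
3. Reject $H_0$ if $Z_0 > Z_\alpha = Z_{0.05} = 1.645$.
4. $z_0 = \dfrac{(5.30 - 5.10) - 0}{\sqrt{\dfrac{2.2^2}{75} + \dfrac{2.644^2}{80}}} = \dfrac{0.2}{0.3898} = 0.5131$
5. Do not reject $H_0$ since $Z_0 < Z_\alpha$ ($0.5131 < 1.645$). At the 5% significance level, the mean of population 1 is not larger than the mean of population 2.

95% Confidence Interval

1. See above. Use $(\bar{x}_1 - \bar{x}_2) \pm Z_{\alpha/2}\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$.
2. $\bar{x}_1 - \bar{x}_2 = 0.2$
3. $Z_{\alpha/2} = Z_{0.025} = 1.96$
4. $\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}} = \sqrt{\dfrac{2.2^2}{75} + \dfrac{2.644^2}{80}} = 0.3898$
5. $(\bar{x}_1 - \bar{x}_2) \pm Z_{\alpha/2}\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}} = 0.2 \pm 1.96 \times 0.3898 = 0.2 \pm 0.7639$. The 95% CI for $\mu_1 - \mu_2$ is $(-0.5639, 0.9639)$.

ECON 2202, Topic 1 & 2 – © S. Dubey, 2011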
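The Q1 arithmetic can be checked with a short script. This is only a sketch using Python's standard library; the function names `two_sample_z` and `z_interval` are illustrative and not part of the course material, and the critical value 1.96 is passed in from the text rather than computed.

```python
import math

def two_sample_z(x1, x2, s1, s2, n1, n2, delta0=0.0):
    """Return (standard error, z statistic) for H0: mu1 - mu2 = delta0.

    Uses the unpooled standard error sqrt(s1^2/n1 + s2^2/n2).
    """
    se = math.sqrt(s1**2 / n1 + s2**2 / n2)
    z0 = ((x1 - x2) - delta0) / se
    return se, z0

def z_interval(x1, x2, se, z_crit):
    """Two-sided interval (x1 - x2) +/- z_crit * se."""
    diff = x1 - x2
    return diff - z_crit * se, diff + z_crit * se

# Q1 inputs as given in the text
se, z0 = two_sample_z(5.30, 5.10, 2.20, 2.644, 75, 80)
lo, hi = z_interval(5.30, 5.10, se, 1.96)

print(round(se, 4), round(z0, 4))
print(round(lo, 4), round(hi, 4))
```

Comparing `z0` with 1.645 reproduces the one-tailed decision, and the pair `(lo, hi)` is the two-sided interval from step 5.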
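Q2.

Hypothesis Test:

1. $\bar{x}_1 \sim N(\mu_1, \sigma_1^2/n_1)$, $\bar{x}_2 \sim N(\mu_2, \sigma_2^2/n_2)$; $\sigma_1^2$, $\sigma_2^2$ unknown; independent random samples.
$\bar{x}_1 = 430$; $\bar{x}_2 = 405$; $s_1 = 20$; $s_2 = 24$.
$n_1 = 25 < 30$ and $n_2 = 20 < 30$, so the CLT does not apply.
Use $t_0 = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$, $\alpha = 0.05$, with
$df^* = \mathrm{int}\left[\dfrac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{(s_1^2/n_1)^2}{n_1 - 1} + \dfrac{(s_2^2/n_2)^2}{n_2 - 1}}\right] = \mathrm{int}(36.9475) = 36$.
2. $H_0: \mu_1 - \mu_2 = 0$; $H_a: \mu_1 - \mu_2 \ne 0$
3. Reject $H_0$ if $t_0 > t_{\alpha/2,\,df^*} = t_{0.025,\,36} = 2.0211$ or $t_0 < -t_{\alpha/2,\,df^*} = -2.0211$.
4. $t_0 = \dfrac{(430 - 405) - 0}{\sqrt{\dfrac{20^2}{25} + \dfrac{24^2}{20}}} = \dfrac{25}{6.6933} = 3.7351$
5. Reject $H_0$ since $t_0 > t_{\alpha/2}$ ($3.7351 > 2.0211$). At the 5% significance level, the mean of population 1 is not equal to the mean of population 2.

95% CI:

1. See above. Use $(\bar{x}_2 - \bar{x}_1) \pm t_{\alpha/2,\,df^*}\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$.
2. $\bar{x}_2 - \bar{x}_1 = -25$
3. $t_{\alpha/2,\,df^*} = t_{0.025,\,36} = 2.0211$
4. $\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}} = \sqrt{\dfrac{20^2}{25} + \dfrac{24^2}{20}} = 6.6933$
5. $(\bar{x}_2 - \bar{x}_1) \pm t_{\alpha/2,\,df^*}\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}} = -25 \pm 2.0211 \times 6.6933 = -25 \pm 13.5278$. The 95% CI for $\mu_2 - \mu_1$ is $(-38.5278, -11.4722)$.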
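The unequal-variance degrees of freedom in Q2 are easy to mistype by hand, so a small check may help. The sketch below assumes the usual Welch-Satterthwaite form of the df formula; the helper name `welch` is illustrative only.

```python
import math

def welch(x1, x2, s1, s2, n1, n2):
    """Return (standard error, t statistic, unrounded df) for two samples
    with unknown, unequal variances (H0: mu1 - mu2 = 0)."""
    v1, v2 = s1**2 / n1, s2**2 / n2          # per-sample variance of the mean
    se = math.sqrt(v1 + v2)
    t0 = (x1 - x2) / se
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return se, t0, df

# Q2 inputs as given in the text
se, t0, df = welch(430, 405, 20, 24, 25, 20)
df_star = math.floor(df)   # the notes truncate df with int(...)

print(round(se, 4), round(t0, 4), round(df, 4), df_star)
```

Truncating rather than rounding the degrees of freedom, as the notes do, gives a slightly larger critical value and so a slightly more conservative test.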

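Q4.

a.

1. $\bar{x}_1 \sim \mathrm{Dist}_1(\mu_1, \sigma_1^2/n_1)$, $\bar{x}_2 \sim \mathrm{Dist}_2(\mu_2, \sigma_2^2/n_2)$; $\sigma_1^2$, $\sigma_2^2$ unknown; samples independent.
$\bar{x}_1 = 2.5$, $\bar{x}_2 = 3.5$, $s_1^2 = 2 \times 2 = 4$, $s_2^2 = 1.5 \times 1.5 = 2.25$.
$n_1 = 15$, $n_2 = 15$, so the CLT doesn't apply and you must assume a normal distribution.
Use $t_0 = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$; $\alpha = 0.05$; $df^* = \mathrm{int}(25.9644) = 25$.
2. $H_0: \mu_1 \le \mu_2$; $H_a: \mu_1 > \mu_2$
3. Reject $H_0$ if $t_0 > t_{\alpha,\,df^*} = t_{0.05,\,25} \approx 1.7081$.
4. $t_0 = \dfrac{(2.5 - 3.5) - 0}{\sqrt{\dfrac{4}{15} + \dfrac{2.25}{15}}} = \dfrac{-1}{0.6455} = -1.5492$
5. Since $t_0 < t_\alpha$ ($-1.5492 < 1.7081$), do not reject $H_0$. Reject the manager's claim that discount coupons are as likely to increase sales of DVDs as are on-line discounts at the 5% significance level.

b. 95% CI for $\mu_1 - \mu_2$

1. First three points are the same as in part a. Use CI formula $(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\,df^*}\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$; $\alpha = 0.05$; $df^* = \mathrm{int}(25.9644) = 25$.
2. $\bar{x}_1 - \bar{x}_2 = -1$
3. $t_{\alpha/2,\,df^*} = t_{0.025,\,25} = 2.0595$
4. $\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}} = \sqrt{\dfrac{4}{15} + \dfrac{2.25}{15}} = 0.6455$
5. $(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\,df^*}\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}} = -1 \pm 2.0595 \times 0.6455 = -1 \pm 1.3294$, or $(-2.3294, 0.3294)$. The 95% CI for the difference in population means, $\mu_1 - \mu_2$, is $(-2.3294, 0.3294)$.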

Q5.

a.

1. Let $X_1$ = number of successes in population 1; $X_2$ = number of successes in population 2; assume the two samples are independent random samples.
$X_1 = 70$, $X_2 = 75$, $p_1 = X_1/n_1 = 70/100 = 0.7$, $p_2 = X_2/n_2 = 75/100 = 0.75$.
$n_1 p_1 = 70 > 5$, $n_1(1 - p_1) = 30 > 5$, $n_2 p_2 = 75 > 5$, $n_2(1 - p_2) = 25 > 5$, $n_1 = 100 \ge 30$, $n_2 = 100 \ge 30$, so the CLT applies.
Use $Z_0 = \dfrac{(p_1 - p_2) - (\pi_1 - \pi_2)}{\sqrt{\dfrac{\bar{p}(1 - \bar{p})}{n_1} + \dfrac{\bar{p}(1 - \bar{p})}{n_2}}}$; $\bar{p} = \dfrac{X_1 + X_2}{n_1 + n_2}$; $\alpha = 0.05$.
2. $H_0: \pi_1 - \pi_2 \ge 0$; $H_a: \pi_1 - \pi_2 < 0$
3. Reject $H_0$ if $Z_0 < -Z_\alpha = -Z_{0.05} = -1.645$.
4. $Z_0 = \dfrac{(0.70 - 0.75) - 0}{\sqrt{\dfrac{0.725(0.275)}{100} + \dfrac{0.725(0.275)}{100}}} = -0.7918$
5. Since $Z_0 > -Z_\alpha$ ($-0.7918 > -1.645$), do not reject $H_0$. Population proportion 1 is not smaller than population proportion 2 at the 5% significance level.

b.

1. First three givens are the same as in part a. Use $(p_1 - p_2) \pm Z_{\alpha/2}\sqrt{\dfrac{p_1(1 - p_1)}{n_1} + \dfrac{p_2(1 - p_2)}{n_2}}$; $\alpha = 0.05$.
2. $p_1 - p_2 = -0.05$
3. $Z_{\alpha/2} = Z_{0.025} = 1.96$
4. $\sqrt{\dfrac{p_1(1 - p_1)}{n_1} + \dfrac{p_2(1 - p_2)}{n_2}} = \sqrt{\dfrac{0.7(0.3)}{100} + \dfrac{0.75(0.25)}{100}} = 0.0630$
5. $(p_1 - p_2) \pm Z_{\alpha/2}\sqrt{\dfrac{p_1(1 - p_1)}{n_1} + \dfrac{p_2(1 - p_2)}{n_2}} = -0.05 \pm 1.96 \times 0.0630 = -0.05 \pm 0.1236$. The 95% CI for the difference in the two population proportions is $(-0.1736, 0.0736)$.

Note: The 95% CI for $(p_2 - p_1)$ is $(-0.0736, 0.1736)$, which requires changes to formulas in Steps 1, 2 & 5.

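c: The population standard error for the difference in sample proportions is $\sqrt{\dfrac{\pi_1(1 - \pi_1)}{n_1} + \dfrac{\pi_2(1 - \pi_2)}{n_2}}$. A CI makes no assumptions about $\pi_1$ and $\pi_2$, so estimate them by $p_1$ and $p_2$, respectively. If $\pi_1 = \pi_2 = \pi$, as in the "=" part of $H_0$, then $\pi_1(1 - \pi_1) = \pi_2(1 - \pi_2) = \pi(1 - \pi)$, so common factoring gives $\sqrt{\pi(1 - \pi)\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}$, with $\pi$ estimated by the pooled proportion $\bar{p}$.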
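The difference between the pooled standard error (used in the test) and the unpooled one (used in the CI) is easiest to see side by side. The following is a sketch on the Q5 counts; `pooled_z` and `unpooled_interval` are hypothetical helper names, and the critical value 1.96 is taken from the text.

```python
import math

def pooled_z(x1, n1, x2, n2):
    """Z statistic for H0: pi1 - pi2 = 0, using the pooled proportion."""
    p1, p2 = x1 / n1, x2 / n2
    p_bar = (x1 + x2) / (n1 + n2)
    se = math.sqrt(p_bar * (1 - p_bar) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

def unpooled_interval(x1, n1, x2, n2, z_crit):
    """CI for pi1 - pi2, estimating each proportion separately."""
    p1, p2 = x1 / n1, x2 / n2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z_crit * se, diff + z_crit * se

# Q5 counts as given in the text
z0 = pooled_z(70, 100, 75, 100)
lo, hi = unpooled_interval(70, 100, 75, 100, 1.96)

print(round(z0, 4))
print(round(lo, 4), round(hi, 4))
```

Pooling is only justified under the null hypothesis of equal proportions, which is why the interval does not use it.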

b. Classical Hypothesis Test:

1. See a. Use $Z_0 = \dfrac{\bar{x} - \mu}{s/\sqrt{n}}$.
2. $H_0: \mu \le 50$ (km); $H_a: \mu > 50$ (km)
3. Reject $H_0$ if $Z_0 > Z_\alpha = Z_{0.10} = 1.2816$.
4. $Z_0 = \dfrac{\bar{x} - \mu}{s/\sqrt{n}} = 0.5229$
5. Do not reject $H_0$ since $Z_0 < Z_\alpha$ ($0.5229 < 1.2816$). Do not accept the engineers' claim that you can drive on a punctured tire for an average of more than 50 km.

c. Hypothesis Test with p-value approach:

Steps 1 and 2 are the same as in b.

3. Reject $H_0$ if p-value $< \alpha = 0.10$.
4. p-value $= P(Z > Z_0) = P(Z > 0.5299) = 0.5 - P(0 < Z < 0.5299) = 0.5 - 0.2019 = 0.2981$
5. Do not reject $H_0$ since p-value $> \alpha$ ($0.2981 > 0.10$). Do not accept the engineers' claim that you can drive on a punctured tire for an average of more than 50 km.
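The table lookup in the p-value step can be reproduced without statistical tables. The sketch below builds the standard normal CDF from `math.erf`; the function names are illustrative, and the test statistic 0.5299 is the value used in the p-value line of the text.

```python
import math

def normal_cdf(z):
    """Standard normal CDF, Phi(z), via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def upper_tail_p_value(z0):
    """p-value for a right-tailed test: P(Z > z0)."""
    return 1.0 - normal_cdf(z0)

alpha = 0.10
p_value = upper_tail_p_value(0.5299)
reject = p_value < alpha   # decision rule: reject H0 if p-value < alpha

print(round(p_value, 4), reject)
```

The same helper would give a left-tailed p-value as `normal_cdf(z0)` and a two-tailed one as twice the smaller tail.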

d. In general, you cannot use the results from a 90% CI to make a conclusion about the engineers' one-tailed claim, since the CI is two-tailed and the test is one-tailed. Though both use the same standard error, the CI has a larger critical value ($Z_{\alpha/2} = Z_{0.05} = 1.645$ versus $Z_\alpha = Z_{0.10} = 1.2816$).

Q8. Since the random samples come from the same 11 days, the test of difference in means should use paired samples, since the day of sale is likely to influence volume of sales.

1. Assume sales in the two locations are normally distributed (so the difference is normally distributed), assume $\alpha = 0.05$, and assume the test is $H_0: \mu_A = \mu_B$.
$\bar{d} = \sum d_i / n = -139/11 = -12.6364$; $s_d^2 = \dfrac{\sum d_i^2 - n\bar{d}^2}{n - 1} = 97{,}629.6546$.
$n = 11 < 30$, so the CLT does not apply.
Use $t_0 = \dfrac{\bar{d} - \mu_d}{s_d/\sqrt{n}}$ with $\alpha = 0.05$.

Location A   Location B   Difference d_i = A_i - B_i   d_i^2
$444         $233          211                          44521
 200          299          -99                           9801
 167          800         -633                         400689
 300          780         -480                         230400
 478          127          351                         123201
 400          250          150                          22500
 250          340          -90                           8100
 600          370          230                          52900
 501          230          271                          73441
 350          300           50                           2500
 300          400         -100                          10000
                          ∑d_i = -139                  ∑d_i^2 = 978053

2. $H_0: \mu_A = \mu_B$, $H_A: \mu_A \ne \mu_B$, or $H_0: \mu_d = 0$, $H_A: \mu_d \ne 0$, where $\mu_d = \mu_A - \mu_B$.
3. Reject $H_0$ if $t_0 > t_{\alpha/2,\,n-1} = t_{0.025,\,10} = 2.2281$ or $t_0 < -t_{\alpha/2,\,n-1} = -2.2281$.
4. $t_0 = \dfrac{\bar{d} - \mu_d}{s_d/\sqrt{n}} = \dfrac{-12.6364 - 0}{\sqrt{97629.65455/11}} = -0.1341$
5. Do not reject $H_0$ since $-t_{\alpha/2} < t_0 < t_{\alpha/2}$ ($-2.2281 < -0.1341 < 2.2281$). Do not reject the claim, at the 5% level, that average daily sales are the same in the two locations.
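The paired-sample quantities in Q8 can be recomputed directly from the two columns of sales figures. This is a sketch using only the standard library; `paired_t` is an illustrative name, and the variance uses the same shortcut formula as the notes.

```python
import math

# Daily sales from the Q8 table
location_a = [444, 200, 167, 300, 478, 400, 250, 600, 501, 350, 300]
location_b = [233, 299, 800, 780, 127, 250, 340, 370, 230, 300, 400]

def paired_t(x, y, mu0=0.0):
    """Return (mean difference, variance of differences, t statistic)
    for paired samples, testing H0: mu_d = mu0."""
    d = [xi - yi for xi, yi in zip(x, y)]
    n = len(d)
    d_bar = sum(d) / n
    var_d = (sum(di**2 for di in d) - n * d_bar**2) / (n - 1)
    t0 = (d_bar - mu0) / math.sqrt(var_d / n)
    return d_bar, var_d, t0

d_bar, var_d, t0 = paired_t(location_a, location_b)
print(round(d_bar, 4), round(var_d, 4), round(t0, 4))
```

Passing a nonzero `mu0` covers tests against a hypothesized difference other than zero.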

the population average service time for customers is the same in both the drive-through service and the regular walk-in service, which is the same conclusion as the hypothesis test.

Q10.

1. Same as Q9.
2. $H_0: \mu_1 \ge \mu_2$, $H_a: \mu_1 < \mu_2$
3. Reject $H_0$ if $t_0 < -t_{\alpha,\,df} = -t_{0.05,\,25} = -1.7081$.
4. $t_0 = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} = -1.5492$
5. Do not reject $H_0$ since $t_0 > -t_\alpha$ ($-1.5492 > -1.7081$). Therefore the evidence suggests you reject the manager's claim, at the 5% level, that the drive-through service is more efficient.

b. 95% CI does not change. The answer is the same as in Q9.

c. You can use the (2-sided) CI as a (2-tailed) test on means, since both use the same critical values and standard errors.

Q11.

a.

1. Let Boys = Population 1, Girls = Population 2.
$\bar{x}_1 \sim N(\mu_1, \sigma_1^2/n_1)$, $\bar{x}_2 \sim N(\mu_2, \sigma_2^2/n_2)$; $\sigma_1^2$, $\sigma_2^2$ unknown; 2 independent random samples.
$\bar{x}_1 = 3.2$; $\bar{x}_2 = 2$; $s_1 = 4$; $s_2 = 1.5$.
$n_1 = 16 < 30$ and $n_2 = 9 < 30$, so the CLT does not apply.
Use $(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\,df}\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$, $\alpha = 0.05$, $df = \mathrm{int}(20.9790) = 20$.
2. $\bar{x}_1 - \bar{x}_2 = 1.2$
3. $t_{\alpha/2,\,df} = t_{0.025,\,20} = 2.0860$
4. $\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}} = \sqrt{\dfrac{4^2}{16} + \dfrac{1.5^2}{9}} = 1.1180$
5. $(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\,df}\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}} = 1.2 \pm 2.0860 \times 1.1180 = 1.2 \pm 2.3322$. The 95% CI for the difference in population average internet use between boys and girls is $(-1.1322, 3.5322)$.

b.

1. Same as in part a, except use formula $t_0 = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$.
2. $H_0: \mu_1 = \mu_2$; $H_a: \mu_1 \ne \mu_2$
3. Reject $H_0$ if $t_0 > t_{\alpha/2,\,df} = t_{0.025,\,20} = 2.0860$ or $t_0 < -t_{\alpha/2,\,df} = -2.0860$.
4. $t_0 = \dfrac{(3.2 - 2) - 0}{\sqrt{\dfrac{4^2}{16} + \dfrac{1.5^2}{9}}} = \dfrac{1.2}{1.1180} = 1.0733$
5. Do not reject $H_0$ since $-t_{\alpha/2} < t_0 < t_{\alpha/2}$ ($-2.0860 < 1.0733 < 2.0860$). At the 5% significance level, conclude that boys and girls do have the same average daily internet use.

c. In general, you can use the (2-sided) CI to test the (2-tailed) claim on population means since they both use the same critical value and standard error.

Q12.

Classical Hypothesis Test:

1. $\bar{x}_1 = 11.79$, $\sigma_1 = 2.50$, $\sigma_2 = 1.95$; $\bar{x}_1 - \bar{x}_2 \sim N\left(\mu_1 - \mu_2,\ \dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}\right)$.
Small $n_1 = 25$, $n_2 = 25$, with known distribution and population variances.
Use $Z_0 = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$ and $\alpha = 0.05$.
2. $H_0: \mu_1 - \mu_2 \le 2$, $H_a: \mu_1 - \mu_2 > 2$ (men's spending is at least women's + $2)
3. Reject $H_0$ if $Z_0 > Z_\alpha = Z_{0.05} = 1.645$.
4. $Z_0 = \dfrac{(\bar{x}_1 - \bar{x}_2) - 2}{\sqrt{\dfrac{2.50^2}{25} + \dfrac{1.95^2}{25}}} = 1.2616$
5. Do not reject $H_0$ since $Z_0 < Z_\alpha$ ($1.2616 < 1.645$). Based on the evidence, reject the claim, at the 5% level, that men spend, on average, at least $2 more for lunch than women at this restaurant.

p-value approach

1. Same as above.
2. Same as above.
3. Reject $H_0$ if p-value $< \alpha = 0.05$.

Q13.

a. Hypothesis Test:

1. Let $X_1$ = number of Engineers in sample who found a job within 6 months; $X_2$ = number of Computer Scientists in sample who found a job within 6 months.
$X_1$ & $X_2$ come from 2 independent random samples, each with a Binomial distribution.
$X_1 = 70$; $X_2 = 80$; $p_1 = \dfrac{X_1}{n_1}$, $p_2 = \dfrac{X_2}{n_2}$, $\bar{p} = \dfrac{X_1 + X_2}{n_1 + n_2}$.
$n_1 p_1 = X_1 = 70 > 5$; $n_1(1 - p_1) = 30 > 5$; $n_2 p_2 = X_2 = 80 > 5$; $n_2(1 - p_2) = 20 > 5$; $n_1 = 100 \ge 30$; $n_2 = 100 \ge 30$; so the CLT applies and $p_1 - p_2 \sim \mathrm{Dist}\left(\pi_1 - \pi_2,\ \dfrac{\pi_1(1 - \pi_1)}{n_1} + \dfrac{\pi_2(1 - \pi_2)}{n_2}\right)$.
Use $Z_0 = \dfrac{(p_2 - p_1) - (\pi_2 - \pi_1)}{\sqrt{\dfrac{\bar{p}(1 - \bar{p})}{n_1} + \dfrac{\bar{p}(1 - \bar{p})}{n_2}}}$, $\alpha = 0.01$.
2. $H_0: \pi_2 \le \pi_1$; $H_a: \pi_2 > \pi_1$, or $H_0: \pi_2 - \pi_1 \le 0$, $H_a: \pi_2 - \pi_1 > 0$
3. Reject $H_0$ if $Z_0 > Z_\alpha = Z_{0.01} = 2.3264$.
4. $Z_0 = \dfrac{(0.80 - 0.70) - 0}{\sqrt{\dfrac{0.75(0.25)}{100} + \dfrac{0.75(0.25)}{100}}} = 1.6330$
5. Since $Z_0 < Z_\alpha$ ($1.6330 < 2.3264$), do not reject $H_0$. The evidence does not support, at the 1% level, the claim that a larger proportion of Computer Scientists than Engineers found a job within 6 months of graduation.

b. 99% CI

1. As above. Use formula $(p_2 - p_1) \pm Z_{\alpha/2}\sqrt{\dfrac{p_1(1 - p_1)}{n_1} + \dfrac{p_2(1 - p_2)}{n_2}}$.
2. $p_2 - p_1 = 0.10$
3. $Z_{\alpha/2} = Z_{0.005} = 2.5758$
4. $\sqrt{\dfrac{p_1(1 - p_1)}{n_1} + \dfrac{p_2(1 - p_2)}{n_2}} = \sqrt{\dfrac{0.7(0.3)}{100} + \dfrac{0.8(0.2)}{100}} = 0.0608$
5. $(p_2 - p_1) \pm Z_{\alpha/2}\sqrt{\dfrac{p_1(1 - p_1)}{n_1} + \dfrac{p_2(1 - p_2)}{n_2}} = 0.10 \pm 2.5758 \times 0.0608 = 0.10 \pm 0.1567$, or $(-0.0567, 0.2567)$. The 99% CI for the difference in the population proportion of Computer Scientists to Engineers that found a job within 6 months ($\pi_2 - \pi_1$) is $(-0.0567, 0.2567)$.

Q14. Answers are numerically equivalent to Q13 parts a and b, though concluding statements should refer to Population 1 and Population 2, not to Engineers and Computer Scientists.

c. For p-value = 0.0332 and Ha: π2 – π1 > 10 (H0: π2 – π1 ≤ 10 and decision rule: reject H0 if p-value < α)

i) if α = 0.10, α > p-value, so reject H0 and accept the claim in Ha (π2 – π1 > 10)

ii) if α = 0.05, α > p-value, so reject H0 and accept the claim in Ha (π2 – π1 > 10)

iii) if α = 0.01, α < p-value, so do not reject H0 (π2 – π1 ≤ 10)
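The three cases in part c apply one decision rule at different significance levels, which a few lines of code can make explicit. This is a minimal sketch; `decide` is an illustrative helper, and the p-value is the one stated in the text.

```python
def decide(p_value, alpha):
    """Decision rule from the text: reject H0 when p-value < alpha."""
    return "reject H0" if p_value < alpha else "do not reject H0"

p_value = 0.0332
decisions = {alpha: decide(p_value, alpha) for alpha in (0.10, 0.05, 0.01)}

for alpha, decision in decisions.items():
    print(f"alpha = {alpha:.2f}: {decision}")
```

Because the p-value is fixed by the data, lowering α can only move the decision from rejecting to not rejecting, never the other way.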

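Q15.

a.

1. Let $p_1$ = proportion of Women in sample primarily concerned with weight of laptop; $p_2$ = proportion of Men in sample primarily concerned with weight of laptop.
Assume $X_1$ & $X_2$ come from 2 independent random samples; both have a Binomial distribution.
$p_1 = 0.70$; $p_2 = 0.59$; $\bar{p} = \dfrac{p_1 n_1 + p_2 n_2}{n_1 + n_2}$.
$n_1 p_1 = 0.7 \times 481 = 336.7 > 5$; $n_1(1 - p_1) = 144.3 > 5$; $n_2 p_2 = 0.59 \times 374 = 220.66 > 5$; $n_2(1 - p_2) = 153.34 > 5$; $n_1 = 481 \ge 30$; $n_2 = 374 \ge 30$; so the CLT applies, and $p_1 - p_2 \sim \mathrm{Dist}\left(\pi_1 - \pi_2,\ \dfrac{\pi_1(1 - \pi_1)}{n_1} + \dfrac{\pi_2(1 - \pi_2)}{n_2}\right)$.
Use $(p_1 - p_2) \pm Z_{\alpha/2}\sqrt{\dfrac{p_1(1 - p_1)}{n_1} + \dfrac{p_2(1 - p_2)}{n_2}}$, $\alpha = 0.05$.
2. $p_1 - p_2 = 0.11$
3. $Z_{\alpha/2} = Z_{0.025} = 1.96$
4. $\sqrt{\dfrac{p_1(1 - p_1)}{n_1} + \dfrac{p_2(1 - p_2)}{n_2}} = \sqrt{\dfrac{0.70(0.30)}{481} + \dfrac{0.59(0.41)}{374}} = 0.0329$
5. $(p_1 - p_2) \pm Z_{\alpha/2}\sqrt{\dfrac{p_1(1 - p_1)}{n_1} + \dfrac{p_2(1 - p_2)}{n_2}} = 0.11 \pm 1.96 \times 0.0329 = 0.11 \pm 0.0645$. The 95% CI for the difference in the population proportion of women minus men who primarily care about laptop weight is $(0.0455, 0.1745)$.

b. Test

1. As above. Use formula $Z_0 = \dfrac{p_1 - p_2}{\sqrt{\dfrac{\bar{p}(1 - \bar{p})}{n_1} + \dfrac{\bar{p}(1 - \bar{p})}{n_2}}}$.
2. $H_0: \pi_1 \le \pi_2$; $H_a: \pi_1 > \pi_2$
3. Reject $H_0$ if $Z_0 > Z_\alpha = Z_{0.05} = 1.645$.
4. $Z_0 = \dfrac{p_1 - p_2}{\sqrt{\dfrac{\bar{p}(1 - \bar{p})}{n_1} + \dfrac{\bar{p}(1 - \bar{p})}{n_2}}}$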

Q17.

a.

1. Let $X_1$ = # of government donors out of sample; $X_2$ = # of private sector donors out of sample.
$X_1 = 16$, $p_1 = \dfrac{X_1}{n_1} = 0.40$; $X_2 = 28$, $p_2 = \dfrac{X_2}{n_2} = 0.35$; $\bar{p} = \dfrac{X_1 + X_2}{n_1 + n_2}$.
$n_1 p_1 = X_1 = 16 > 5$; $n_1(1 - p_1) = 24 > 5$; $n_2 p_2 = X_2 = 28 > 5$; $n_2(1 - p_2) = 52 > 5$; $n_1 = 40 \ge 30$; $n_2 = 80 \ge 30$. Both samples are large ($n \ge 30$, $np > 5$ and $n(1 - p) > 5$), so the CLT applies.
Use $Z_0 = \dfrac{(p_1 - p_2) - (\pi_1 - \pi_2)}{\sqrt{\dfrac{\bar{p}(1 - \bar{p})}{n_1} + \dfrac{\bar{p}(1 - \bar{p})}{n_2}}}$ with $\alpha = 0.05$.
2. $H_0: \pi_1 \le \pi_2$, $H_a: \pi_1 > \pi_2$, or $H_0: \pi_1 - \pi_2 \le 0$, $H_a: \pi_1 - \pi_2 > 0$
3. Reject $H_0$ if $Z_0 > Z_\alpha = Z_{0.05} = 1.645$.
4. $Z_0 = \dfrac{(0.40 - 0.35) - 0}{\sqrt{\dfrac{0.3667(0.6333)}{40} + \dfrac{0.3667(0.6333)}{80}}} = 0.5358$
5. Do not reject $H_0$ since $Z_0 < Z_\alpha$ ($0.5358 < 1.645$). Reject the claim that government employees donate more than private sector employees.

b. 95% Confidence Interval:

1. Same as above, except use formula $(p_1 - p_2) \pm Z_{\alpha/2}\sqrt{\dfrac{p_1(1 - p_1)}{n_1} + \dfrac{p_2(1 - p_2)}{n_2}}$.
2. $p_1 - p_2 = 0.05$
3. $Z_{\alpha/2} = Z_{0.025} = 1.96$
4. $\sqrt{\dfrac{p_1(1 - p_1)}{n_1} + \dfrac{p_2(1 - p_2)}{n_2}} = \sqrt{\dfrac{0.40(0.60)}{40} + \dfrac{0.35(0.65)}{80}} = 0.0940$
5. $(p_1 - p_2) \pm Z_{\alpha/2}\sqrt{\dfrac{p_1(1 - p_1)}{n_1} + \dfrac{p_2(1 - p_2)}{n_2}} = 0.05 \pm 1.96 \times 0.0940 = 0.05 \pm 0.1843$. So the 95% CI for the difference in the proportion of public sector and private sector employees who donate is $(-0.1343, 0.2343)$.

c. In general, you cannot use a (2-sided) CI to test this (1-sided) hypothesis, since the CI and Hypothesis Test use different critical values and different standard errors. In this particular example, however, you would reach the same conclusion since the CI also leads you to accept, at the 5% significance level, that the proportion of donors is the same in the two groups.

Q19.

1. Given: 2 independent random samples from normal distributions; population variances unknown (and unequal).

Sample 1            50   47   44   48   40   36   43   46   72   40   54
Sample 2            38   44   38   37   43   44   31   38   39   54   41
d_i = x_1 – x_2     12    3    6   11   -3   -8   12    8   33  -14   13    ∑d_i = 73
d_i^2              144    9   36  121    9   64  144   64 1089  196  169    ∑d_i^2 = 2045

$\bar{d} = \dfrac{\sum d_i}{n} = \dfrac{12 + 3 + 6 + 11 - 3 - 8 + 12 + 8 + 33 - 14 + 13}{11} = \dfrac{73}{11} = 6.6364$.
$s_d^2 = \dfrac{\sum d_i^2 - n\bar{d}^2}{n - 1} = \dfrac{2045 - 11 \times 6.6364^2}{10} = 156.0545$
$n = 11 < 30$, so the CLT does not apply.
Use $t_0 = \dfrac{\bar{d} - \mu_d}{s_d/\sqrt{n}}$, $\alpha = 0.05$.
2. $H_0: \mu_d = 5$; $H_a: \mu_d \ne 5$
3. Reject $H_0$ if $t_0 > t_{\alpha/2,\,n-1} = t_{0.025,\,10} = 2.2281$ or $t_0 < -t_{\alpha/2,\,n-1} = -2.2281$.
4. $t_0 = \dfrac{\bar{d} - \mu_d}{s_d/\sqrt{n}} = \dfrac{6.6364 - 5}{\sqrt{156.0545/11}} = 0.4344$
5. Do not reject $H_0$ since $-t_{\alpha/2} < t_0 < t_{\alpha/2}$ ($-2.2281 < 0.4344 < 2.2281$). At the 5% significance level, the difference in the means of the two populations is equal to 5.

This conclusion would not change at the 1% level of significance, since the critical values would be larger (in absolute value), so the test statistic would still lie between these critical values.

Q20.

a.

2009   2010   d_i = x_2010 - x_2009   d_i^2
 11     12      1                       1
 22     24      2                       4
  6      4     -2                       4
 12     10     -2                       4
 26     22     -4                      16
              ∑d_i = -5               ∑d_i^2 = 29

$\bar{d} = \dfrac{\sum d_i}{n} = \dfrac{-5}{5} = -1$

$s_d^2 = \dfrac{\sum d_i^2 - n\bar{d}^2}{n - 1} = \dfrac{(1 + 4 + 4 + 4 + 16) - 5 \times 1}{4} = \dfrac{29 - 5}{4} = \dfrac{24}{4} = 6$

b. 95% CI:

1. Assume the difference in movies watched between 2010 and 2009 is normally distributed.
$\bar{d} = -1$, $s_d^2 = 6$.
$n = 5$, so the CLT does not apply.
Use $\bar{d} \pm t_{\alpha/2,\,n-1}\, s_d/\sqrt{n}$.
2. $\bar{d} = -1$
3. $t_{\alpha/2,\,n-1} = t_{0.025,\,4} = 2.7765$
4. $s_d/\sqrt{n} = \sqrt{s_d^2/n} = \sqrt{6/5} = \sqrt{1.2} = 1.0954$
5. $\bar{d} \pm t_{\alpha/2,\,n-1}\, s_d/\sqrt{n} = -1 \pm 2.7765 \times 1.0954 = -1 \pm 3.0415$. The 95% CI for the average difference in movies watched between 2010 and 2009 is $(-4.0415, 2.0415)$.

c. Based on the 95% CI, we cannot reject, at the 5% significance level, the claim that movie goers watched the same number of movies, on average, in the two years, since: a) a hypothesis test would use the same critical value and standard error, and b) the CI includes the value "0", so it is possible there is no average difference in movies watched in the two years.
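The paired interval in part b can also be reproduced with Python's `statistics` module, which computes the sample standard deviation directly from the differences instead of using the shortcut formula. This is a sketch only; the critical value 2.7765 is taken from the text, and the variable names are illustrative.

```python
import math
import statistics

movies_2009 = [11, 22, 6, 12, 26]
movies_2010 = [12, 24, 4, 10, 22]

# Differences d_i = x_2010 - x_2009, as in the table for part a
d = [b - a for a, b in zip(movies_2009, movies_2010)]
n = len(d)

d_bar = statistics.mean(d)     # sample mean of the differences
s_d = statistics.stdev(d)      # sample standard deviation (n - 1 divisor)
t_crit = 2.7765                # t_{0.025, 4}, value taken from the text

margin = t_crit * s_d / math.sqrt(n)
interval = (d_bar - margin, d_bar + margin)

print(d_bar, round(s_d**2, 4), round(margin, 4))
print(tuple(round(v, 4) for v in interval))
```

Checking whether 0 falls inside `interval` gives the same decision as the two-tailed paired test at the 5% level.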