10 Hypothesis Testing with Two Independent Samples, Slides of Biostatistics

Leeds Trinity University Biostatistics


Typology: Slides

2021/2022

Uploaded on 09/27/2022

sharina 🇬🇧


MATH1015 Biostatistics Week 10

10 Hypothesis Testing with Two Independent Samples

Previously we have studied:

• the one-sample t-test for the population mean μ, using the information provided by a single sample;

• the one-sample z-test for the population proportion p, based on one sample;

• the matched pairs t-test, based on two observations on each (or identical) subject (which reduces to the one-sample t-test for differenced data).

This week we consider an extension of the above work and study methods to compare two population means and two population proportions, both based on two independent samples from two populations.

10.1 Two-sample t-test comparing two population means (P.145-149)

10.1.1 Introduction

In every area of human activity, new procedures are invented and existing techniques are revised. Advances occur whenever a new technique is proved to be better than the old. Hence we need to test whether the new method is better than the old one, but using an experimental design other than the matched pairs design.

SydU MATH1015 (2013) First semester 1


This section develops a popular statistical test that compares the means of two independent populations.

A motivational example: Suppose that there are two types of food available for milking cows. A farmer wishes to test which type of food helps cows produce a higher milk yield.

An experiment: The farmer can select two independent groups of cows that produce similar milk yields. One group is given food A and the other group is given food B. After one week, the farmer calculates the means and standard deviations of the milk yields for each group and then uses this information to decide which type of food gives the better yield.

Further examples:

1. Compare the average age at first marriage of females in two ethnic groups.

2. Compare the average efficiency of two brands of fertilisers.

3. Compare the average marks of statistics students at USYD and UNSW.

Note: This type of design makes the comparison using two independent samples rather than matched pairs. It is necessary in situations where matched pairs from similar or identical subjects are difficult to form, for example in the comparison of two ethnic groups, where human characteristics are difficult to match.

A statistical test, the two-sample t-test, for such comparisons can be developed under the following assumptions:

$H_1: \mu_1 < \mu_2$ or $H_1: \mu_1 - \mu_2 < 0$ (one-sided);
$H_1: \mu_1 \neq \mu_2$ or $H_1: \mu_1 - \mu_2 \neq 0$ (two-sided).

Note: As we have two sample variances $s_1^2$ and $s_2^2$, we need to combine them into a single variance in order to develop a t-test. This can be done by combining or pooling $s_1^2$ and $s_2^2$ as given below:

10.1.4 Combined or Pooled Variance

It can be shown that the best combination of $s_1^2$ and $s_2^2$ to produce the common variance is given by

$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$

Remarks:

• This combined variance $s_p^2$ is called the pooled variance.
• The pooled variance is simply a weighted average of the two individual sample variances, weighted by their degrees of freedom (df), $n_1 - 1$ and $n_2 - 1$.

10.1.5 The Test Statistic

It can be proved that under the null hypothesis $H_0: \mu_1 - \mu_2 = 0$, the test statistic

$$t = \frac{\bar{X}_1 - \bar{X}_2}{s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} \sim t_{n_1 + n_2 - 2}$$

is t-distributed with $n_1 + n_2 - 2$ degrees of freedom.

Note: The corresponding df for this two-sample problem is 2 less than the total sample size $n_1 + n_2$. Compare this with the one-sample t-test, where

$$t = \frac{\bar{X} - \mu_0}{s/\sqrt{n}} \sim t_{n-1}$$

is t-distributed with $n - 1$ degrees of freedom.

Example: Two independent samples have been taken from two independent normal populations. The observations are:

Sample 1: 8, 5, 7, 6, 9, 7
Sample 2: 2, 6, 4, 7, 6

Find an estimate of the combined variance (or pooled variance).

Solution:

Sample 1: $n_1 = 6$, $\bar{x}_1 = 7$, $s_1^2 = 2$.
Sample 2: $n_2 = 5$, $\bar{x}_2 = 5$, $s_2^2 = 4$.

Therefore, the combined or pooled variance (estimate) is:

$$s_p^2 = \frac{(6 - 1)(2) + (5 - 1)(4)}{6 + 5 - 2} = \frac{26}{9} \approx 2.889$$
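The pooled variance for these two samples can be checked numerically. The following is a minimal sketch using only the Python standard library; the function name `pooled_variance` is an illustrative choice, not something defined in the notes.

```python
from statistics import variance


def pooled_variance(sample_1, sample_2):
    """Weighted average of the two sample variances, weighted by their df."""
    n1, n2 = len(sample_1), len(sample_2)
    s1_sq = variance(sample_1)  # sample variance, divisor n1 - 1
    s2_sq = variance(sample_2)  # sample variance, divisor n2 - 1
    return ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)


sample_1 = [8, 5, 7, 6, 9, 7]
sample_2 = [2, 6, 4, 7, 6]

sp_sq = pooled_variance(sample_1, sample_2)
print(round(sp_sq, 3))  # 26/9, about 2.889
```

The result lies between the two sample variances (2 and 4), as expected of a weighted average.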

Example (cont): State the distribution of

$$t = \frac{\bar{X}_1 - \bar{X}_2}{s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$$

under $H_0$.

Solution: Since the df = 6 + 5 − 2 = 9,

$$t = \frac{\bar{X}_1 - \bar{X}_2}{s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} \sim t_9$$

Example: Using the sample information, calculate the value of the test statistic.
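One way to carry out this calculation is sketched below in Python, reusing the two small samples from the pooled variance example. The variable names are illustrative; the formula is the two-sample t statistic given in Section 10.1.5.

```python
from math import sqrt
from statistics import mean, variance

sample_1 = [8, 5, 7, 6, 9, 7]
sample_2 = [2, 6, 4, 7, 6]
n1, n2 = len(sample_1), len(sample_2)
df = n1 + n2 - 2

# Pooled variance: sample variances weighted by their degrees of freedom.
sp_sq = ((n1 - 1) * variance(sample_1) + (n2 - 1) * variance(sample_2)) / df

# Standard error of the difference in sample means under equal variances.
standard_error = sqrt(sp_sq) * sqrt(1 / n1 + 1 / n2)

t_obs = (mean(sample_1) - mean(sample_2)) / standard_error
print(df, round(t_obs, 3))
```

With these data the observed statistic comes out at about 1.94 on 9 degrees of freedom.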

Example: A feeding test is conducted on a herd of 25 dairy cows to compare two diets, A and B. A sample of 13 cows randomly selected from the herd is fed diet A and the remaining cows are fed diet B. From observations made over a three-week period, the average daily milk production (in L) is recorded for each cow:

Milk yield (in L)
Diet A ($x_1$): 44 44 56 46 47 38 58 53 49 35 46 30 41
Diet B ($x_2$): 35 47 55 29 40 39 32 41 42 57 51 39

Assume these two samples come from independent normally distributed populations with equal variances $\sigma^2$.

(i) Find the mean and the sd for each sample.

(ii) Find an estimate of the ‘pooled variance’ $s_p^2$, which estimates the common variance $\sigma^2$.

(iii) Perform the two-sample t-test to investigate the evidence of a difference in true mean milk yields for the two diets.

Solution:

(i) $\bar{x}_1 = 45.15$, $s_1 = 7.998$, $n_1 = 13$ for A;
$\bar{x}_2 = 42.25$, $s_2 = 8.740$, $n_2 = 12$ for B.

(ii) The ‘pooled’ sample variance is

$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{12(7.998)^2 + 11(8.740)^2}{23} \approx 69.91$$

(iii) The two-sample t-test:

Hypotheses: As we want to test whether there is a difference in milk yields, we have a two-sided alternative:

$$H_0: \mu_1 = \mu_2 \quad \text{against} \quad H_1: \mu_1 \neq \mu_2.$$

Test Statistic: Under $H_0$,

$$t_0 = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} = \frac{45.15 - 42.25}{\sqrt{69.91}\,\sqrt{\dfrac{1}{13} + \dfrac{1}{12}}} \approx 0.867$$

P-value: $2P(T_{23} \geq 0.867) > 0.05$
Conclusion: Since the P-value is > 0.05, the data are consistent with $H_0$. There is no significant difference between the two diets.

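[Figure: two-sided t-test on the $t_{23}$ density. The observed values ±0.867 lie inside the critical values ±2.069 that bound the rejection region (RR) at α = 0.05; each tail beyond ±0.867 has area 0.197.]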
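The milk yield test can be reproduced from the raw data. The sketch below uses only the Python standard library, so instead of computing an exact P-value it compares the observed statistic with the critical value 2.069 for 23 df at α = 0.05, which is the value shown on the figure in the notes. Variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

diet_a = [44, 44, 56, 46, 47, 38, 58, 53, 49, 35, 46, 30, 41]
diet_b = [35, 47, 55, 29, 40, 39, 32, 41, 42, 57, 51, 39]
n1, n2 = len(diet_a), len(diet_b)
df = n1 + n2 - 2

# (ii) Pooled variance from the two sample standard deviations.
sp_sq = ((n1 - 1) * stdev(diet_a) ** 2 + (n2 - 1) * stdev(diet_b) ** 2) / df

# (iii) Two-sample t statistic under H0: mu_1 = mu_2.
t_obs = (mean(diet_a) - mean(diet_b)) / (sqrt(sp_sq) * sqrt(1 / n1 + 1 / n2))

# Assumption: critical value copied from the notes, not computed here.
T_CRIT = 2.069
reject_h0 = abs(t_obs) > T_CRIT

print(round(sp_sq, 2), round(t_obs, 2), reject_h0)
```

The statistic is about 0.87, well inside ±2.069, so the sketch reaches the same decision as the notes: $H_0$ is not rejected.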

10.1.6 Confidence interval (CI) for $\mu_1 - \mu_2$

The $(1 - \alpha)100\%$ CI for $\mu_1 - \mu_2$ is

$$\bar{x}_1 - \bar{x}_2 \pm t_{n_1 + n_2 - 2,\,\alpha/2}\; s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$
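As an illustration of this formula, the sketch below computes a 95% CI for the milk yield data of the feeding test example. The notes do not work this interval out; the critical value 2.069 for 23 df is the one quoted in the notes for α = 0.05 and is hard-coded here as an assumption rather than computed.

```python
from math import sqrt
from statistics import mean, variance

diet_a = [44, 44, 56, 46, 47, 38, 58, 53, 49, 35, 46, 30, 41]
diet_b = [35, 47, 55, 29, 40, 39, 32, 41, 42, 57, 51, 39]
n1, n2 = len(diet_a), len(diet_b)
df = n1 + n2 - 2

sp_sq = ((n1 - 1) * variance(diet_a) + (n2 - 1) * variance(diet_b)) / df
standard_error = sqrt(sp_sq) * sqrt(1 / n1 + 1 / n2)
difference = mean(diet_a) - mean(diet_b)

T_CRIT = 2.069  # assumption: t critical value for 23 df, alpha/2 = 0.025
half_width = T_CRIT * standard_error
ci = (difference - half_width, difference + half_width)

print(tuple(round(bound, 2) for bound in ci))
```

The interval is roughly (−4.02, 9.83) litres. It contains 0, which is consistent with the two-sided test not rejecting $H_0$.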

10.2 Two-sample z-test for comparing two population proportions (P.157-161)

In some life science problems we need to test whether the two population proportions for a particular attribute are equal.

Motivating example: Suppose that a federal member of parliament wishes to test whether two suburbs in his electorate have the same unemployment rate.

To test this, the member can take two independent samples (one from each suburb) and calculate the proportion unemployed in each. However, these two sample proportions alone cannot show whether the difference between them is large enough to indicate a real difference in unemployment rates. Therefore, we need to develop a proper statistical test.

10.2.1 Assumption

Two independent samples
Both sample sizes are large: both n 1 ≥ 30 and n 2 ≥ 30.

10.2.2 Hypotheses

Null hypothesis of interest: As we would like to compare the two proportions $p_1$ and $p_2$ of the two populations,

$$H_0: p_1 = p_2 \quad \text{or equivalently} \quad H_0: p_1 - p_2 = 0$$

Alternative hypothesis: Depending on the specific problem, it can be:

$H_1: p_1 > p_2$ or equivalently $H_1: p_1 - p_2 > 0$ (one-sided),
$H_1: p_1 < p_2$ or equivalently $H_1: p_1 - p_2 < 0$ (one-sided),
$H_1: p_1 \neq p_2$ or equivalently $H_1: p_1 - p_2 \neq 0$ (two-sided).

To develop a suitable test statistic, we need a single estimate for the proportion based on two independent samples under H 0 of equal proportions. This combined or pooled estimate is obtained using the formula given below:

10.2.3 Combined or pooled proportion

Suppose that $x_1$ and $x_2$ are the numbers of “successes” in each independent sample, and $n_1$ and $n_2$ their respective sample sizes. Under the null hypothesis that the two population proportions are equal, we estimate this common proportion using:

$$\hat{p} = \frac{x_1 + x_2}{n_1 + n_2}$$

Note: Since $x_1 = n_1 \hat{p}_1$ and $x_2 = n_2 \hat{p}_2$, it is clear that

$$\hat{p} = \frac{n_1 \hat{p}_1 + n_2 \hat{p}_2}{n_1 + n_2}$$

where $\hat{p}_1 = x_1/n_1$ and $\hat{p}_2 = x_2/n_2$ are the estimates of the two proportions based on the two independent samples.

Remark: $\hat{p}$ is just a weighted average of the two sample proportions $\hat{p}_1$ and $\hat{p}_2$, weighted by their sample sizes.
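The equivalence of the two forms of the pooled proportion can be checked with exact arithmetic. In the sketch below the success counts and sample sizes are hypothetical numbers chosen only for illustration; they do not come from the notes.

```python
from fractions import Fraction

# Hypothetical counts, for illustration only.
x1, n1 = 18, 40
x2, n2 = 33, 60

p1_hat = Fraction(x1, n1)
p2_hat = Fraction(x2, n2)

# Form 1: total successes over total sample size.
pooled_from_counts = Fraction(x1 + x2, n1 + n2)

# Form 2: sample proportions weighted by their sample sizes.
pooled_from_weights = (n1 * p1_hat + n2 * p2_hat) / (n1 + n2)

print(pooled_from_counts == pooled_from_weights)
print(float(p1_hat), float(pooled_from_counts), float(p2_hat))
```

Using `Fraction` avoids floating-point rounding, so the equality check is exact. The pooled value always lies between the two sample proportions.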

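10.2.4 The Test Statistic

The formula for the test statistic is

$$z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1 - \hat{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} \sim N(0, 1)$$

(approximately, for large samples), since under the null hypothesis of a common proportion $p$,

$$\mathrm{Var}(\hat{p}_1 - \hat{p}_2) = \mathrm{Var}(\hat{p}_1) + \mathrm{Var}(\hat{p}_2) = \frac{p(1 - p)}{n_1} + \frac{p(1 - p)}{n_2} = p(1 - p)\left(\frac{1}{n_1} + \frac{1}{n_2}\right),$$

which is estimated by replacing $p$ with the pooled proportion $\hat{p}$.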

Preliminary calculations: Sample proportions are:

$$\hat{p}_1 = \frac{X_1}{n_1} = 0.767, \qquad \hat{p}_2 = \frac{X_2}{n_2}$$

Combined or the pooled proportion is:

$$\hat{p} = \frac{x_1 + x_2}{n_1 + n_2}$$

Hence, with $n_1 = 30$ and $n_2 = 72$, the test statistic is:

$$z_0 = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1 - \hat{p})\left(\dfrac{1}{30} + \dfrac{1}{72}\right)}} = -0.28$$

P-value: $P(Z < -0.28) = 1 - P(Z < 0.28) = 1 - 0.6103 = 0.3897$
Conclusion: Since the P-value is > 0.05, the data are consistent with $H_0$. That is, there is insufficient evidence that Instructor A is inferior to Instructor B in terms of their student success rate.

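[Figure: one-sided Z-test on the N(0, 1) density. The observed value −0.28 lies to the right of the critical value −1.645 that bounds the lower-tail rejection region (RR) at α = 0.05.]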
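The lower-tail P-value and the critical value for this one-sided test can be checked without tables. The sketch below uses `statistics.NormalDist` from the Python standard library and takes the observed statistic −0.28 as given in the notes.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

z_obs = -0.28   # observed test statistic from the notes
alpha = 0.05

# One-sided (lower-tail) P-value: P(Z < z_obs).
p_value = standard_normal.cdf(z_obs)

# Lower-tail critical value at level alpha.
z_crit = standard_normal.inv_cdf(alpha)

reject_h0 = z_obs < z_crit
print(round(p_value, 4), round(z_crit, 3), reject_h0)
```

The P-value is about 0.39 and the critical value about −1.645, so $H_0$ is not rejected, in agreement with the conclusion in the notes.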

Example 2: On October 23, 2009, an outbreak of mumps was reported in Borough Park, Brooklyn. Fifty-seven children were diagnosed with this childhood disease. Surprisingly, 43 of the children had received the recommended two doses of MMR vaccine, which is supposed to protect against the disease. In the past, from a sample of 100 children with mumps in New York State, 83% had received the recommended two doses of the vaccine. Test the hypothesis that the MMR vaccination rate for the two groups is different at α = 0.05.

Solution: Let

$p_1$ = proportion of children with mumps in Borough Park who were vaccinated
$p_2$ = proportion of children with mumps in New York State (NYS) who were vaccinated

Hypotheses: As we want to test whether the rates are different, we have a two-sided alternative:

$$H_0: p_1 = p_2 \quad \text{vs.} \quad H_1: p_1 \neq p_2.$$

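Test statistic: Preliminary calculations:

$$\hat{p}_1 = \frac{X_1}{n_1} = \frac{43}{57} \approx 0.75, \qquad \hat{p}_2 = \frac{X_2}{n_2} = \frac{83}{100} = 0.83$$

$$\hat{p} = \frac{x_1 + x_2}{n_1 + n_2} = \frac{43 + 83}{57 + 100} = \frac{126}{157} \approx 0.8025$$

Hence the test statistic is:

$$z_0 = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1 - \hat{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} = \frac{0.75 - 0.83}{\sqrt{0.8025(1 - 0.8025)\left(\dfrac{1}{57} + \dfrac{1}{100}\right)}} = -1.21$$

Note: This value uses $\hat{p}_1$ rounded to 0.75. With the unrounded $43/57 \approx 0.7544$ the statistic is about −1.14. Either way $|z_0| < 1.96$, so the two-sided test does not reject $H_0$ at α = 0.05.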
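The Example 2 calculation can be reproduced from the counts in the problem statement (43 of 57 and 83 of 100). The sketch below works with unrounded proportions, so it gives a statistic of about −1.14; the value −1.21 in the notes is what results when $\hat{p}_1$ is first rounded to 0.75. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

x1, n1 = 43, 57     # vaccinated children among the Borough Park cases
x2, n2 = 83, 100    # vaccinated children among the NYS sample (83% of 100)

p1_hat = x1 / n1
p2_hat = x2 / n2
p_pooled = (x1 + x2) / (n1 + n2)

# Standard error under H0, using the pooled proportion.
se_pooled = sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))

z_obs = (p1_hat - p2_hat) / se_pooled
z_rounded = (0.75 - p2_hat) / se_pooled  # p1_hat rounded, as in the notes

# Two-sided P-value: 2 * P(Z <= -|z_obs|).
p_value = 2 * NormalDist().cdf(-abs(z_obs))

print(round(z_obs, 2), round(z_rounded, 2), round(p_value, 3))
```

Both versions of the statistic are smaller than 1.96 in absolute value and the P-value exceeds 0.05, so the data do not show a significant difference between the two vaccination rates.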

Solution: The 95% CI for $p_1 - p_2$ is:

$$\hat{p}_1 - \hat{p}_2 \pm z_{1 - \alpha/2} \sqrt{\hat{p}(1 - \hat{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$$

Since the 95% CI contains 0, there is no significant difference between the two success rates. This result agrees with the result from hypothesis testing.

Exercise: Find a 95% CI for p 1 − p 2 for example 2.

Answer: (-0.2051, 0.0539)
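The stated answer can be verified with a short calculation. The sketch below follows the CI formula in the notes, which uses the pooled proportion in the standard error, and obtains the normal quantile from `statistics.NormalDist`. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

x1, n1 = 43, 57
x2, n2 = 83, 100

p1_hat = x1 / n1
p2_hat = x2 / n2
p_pooled = (x1 + x2) / (n1 + n2)

# Standard error with the pooled proportion, as in the notes' CI formula.
se_pooled = sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))

alpha = 0.05
z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96

difference = p1_hat - p2_hat
ci = (difference - z_crit * se_pooled, difference + z_crit * se_pooled)

print(tuple(round(bound, 4) for bound in ci))
```

Rounded to four decimal places the interval matches the answer above, and it contains 0.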
