


Experimental Design and ANALYSIS of VARIANCE
ANALYSIS of VARIANCE tests equivalence of means by comparing sample variances. It assumes normal and independent populations with equal variance.
H0: μ1 = μ2 = μ3 = ···    Ha: some mean is different.
We could test the means pair-wise to see if some pair of means shows a difference, but multiplying tests on the same data set multiplies the alpha risk (the risk of a type I error). If you do three tests at the .05 level [as you would to compare three means], there is up to a 14% chance (not just a 5% chance; the exact value depends on the actual relationships) of getting a significant result just by chance [that is, even if there is no difference at all in the populations]. With six comparisons (as you would need to compare four means in pairs) the risk is up to 26.5%.

Example for class discussion: The following table gives the number of miles per gallon on a tankful for five different brands of gasoline, each achieved by four different test drivers. The problem is to determine whether the brands differ in their yields in miles per gallon.
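As a quick check of those figures, here is a minimal sketch (Python; the function name is ours, not from the notes) of the overall alpha risk when m independent tests are each run at level alpha:

```python
# Overall (familywise) risk of at least one type I error when m independent
# tests are each run at significance level alpha: 1 - (1 - alpha)^m.
def familywise_alpha(alpha, m):
    return 1 - (1 - alpha) ** m

print(familywise_alpha(0.05, 3))  # ≈ 0.1426, the "up to 14%" for three pairwise tests
print(familywise_alpha(0.05, 6))  # ≈ 0.2649, the "up to 26.5%" for six pairwise tests
```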
Mileage (miles per gallon) for cars driven with five brands of gasoline

            BRAND A     BRAND B     BRAND C     BRAND D     BRAND E
              27          29          30          24          25
              26          28          32          24          21
              21          27          27          23          22
              22          24          27          21          20
   mean     x̄a = 24     x̄b = 27     x̄c = 29     x̄d = 23     x̄e = 22
   st. dev. sa = 2.94    sb = 2.16   sc = 2.45   sd = 1.41   se = 2.16
Grand mean, denoted x̿ = 25 (the mean of all 20 observations); total sample size n = 20.
H0: mean mileage for A = mean mileage for B = ... = mean mileage for E
Ha: there is some difference among brands in the (long-term) mean mileage.
The type of gasoline is called a factor (the different brands are the treatments). As in this example, the factor is frequently (but not always) a qualitative variable. The values of the factor determine the columns of the table.
The yield, in miles per gallon, is the response variable. The response variable is always a quantitative variable. The numbers in the table are values of the response variable.
Let k = # of columns (# of treatments) and nj = # of rows (# of replications) for the j-th treatment (j = 1, 2, ..., k). Then:
Treatment (factor) sum of squares:  SSTR = Σ_{j=1}^{k} nj (x̄j − x̿)²   [between-factor variation in x]

Error sum of squares:  SSE = Σ_{j=1}^{k} Σ_{i=1}^{nj} (xij − x̄j)²   [within-factor variation in x]

Alternatively:  SSE = Σ_{j=1}^{k} (nj − 1) sj²

Total sum of squares:  SST = Σ_{j=1}^{k} Σ_{i=1}^{nj} (xij − x̿)² = SSTR + SSE
MSTR = SSTR / (k − 1), the mean square (of deviations) due to the difference of treatments ["mean square for treatments", also called "mean square for factors"].

MSE = SSE / (n − k), the mean square (of deviations) due to sampling error ["mean square of (remaining) error"].

The test statistic is F = MSTR / MSE.
Numerator degrees of freedom = k − 1 and denominator degrees of freedom = n − k
Note that factor degrees of freedom + error degrees of freedom = (k − 1) + (n − k) = n − 1 = total degrees of freedom.
ANALYSIS OF VARIANCE TABLE

DUE TO     D.F.     SS      MS                    F
FACTOR     k − 1    SSTR    MSTR = SSTR/(k − 1)   MSTR/MSE
ERROR      n − k    SSE     MSE  = SSE/(n − k)
TOTAL      n − 1    SST
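As a worked check, filling in this table for the gasoline example above (all values computed from the mileage data; k = 5, n = 20):

SSTR = 4[(24 − 25)² + (27 − 25)² + (29 − 25)² + (23 − 25)² + (22 − 25)²] = 4(34) = 136
SSE  = 3(sa² + sb² + sc² + sd² + se²) = 3(26) = 78
SST  = 136 + 78 = 214

DUE TO     D.F.    SS      MS      F
FACTOR      4      136     34      6.54
ERROR      15       78     5.2
TOTAL      19      214

Since the sample F ≈ 6.54 exceeds the critical value F.05(4, 15) ≈ 3.06, we reject H0 at the .05 level and conclude that the brands differ in mean mileage.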
MINITAB: Select STAT > ANOVA > ONE-WAY (UNSTACKED) if the data are in separate columns [one column for each treatment]; select those columns for the Responses panel. Or (less likely for textbooks, more likely for data encountered in practice) select STAT > ANOVA > ONE-WAY if all values of the response are in one column [the "Response" column] and the treatment for each value is identified by a key code in a second column [the "factor" column]. We reject the null hypothesis, and conclude that we have evidence of a difference among the (population) means, if our sample F is larger than the critical F for the desired alpha with the correct numerator and denominator degrees of freedom.
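For readers working outside MINITAB, here is a minimal Python sketch of the same one-way ANOVA on the gasoline data (assumes SciPy is installed; the variable names are ours):

```python
from scipy import stats

# Mileage data from the table above, one list per brand (the "unstacked" layout).
brand_a = [27, 26, 21, 22]
brand_b = [29, 28, 27, 24]
brand_c = [30, 32, 27, 27]
brand_d = [24, 24, 23, 21]
brand_e = [25, 21, 22, 20]

# One-way ANOVA: returns the sample F statistic and its p-value.
f_stat, p_value = stats.f_oneway(brand_a, brand_b, brand_c, brand_d, brand_e)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")   # F ≈ 6.54; reject H0 when p < alpha
```

Rejecting H0 whenever the reported p-value is below the chosen alpha is equivalent to comparing the sample F with the critical F.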
After the F-test

In most analysis of variance problems it is not enough simply to know that the treatments do not all have the same mean. We also want to know which treatment is best; depending on the situation, "best" can mean having the highest mean or the lowest mean. Thus we must consider post-analysis-of-variance procedures, that is, what to do when H0 is rejected. There is still the problem of "adding alpha" if we try to perform many comparison tests.
Recall s_x̄ = s/√n, so the (1 − α) confidence interval for μj is x̄j ± t_{α/2} √(MSE/nj), with d.f. = n − k [the MSE and the degrees of freedom come from the whole sample, not just one treatment].
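For instance, a sketch of this interval for brand C in the gasoline example (assumes SciPy; MSE = 5.2 and d.f. = 15 are taken from the worked table above):

```python
from math import sqrt
from scipy import stats

mse, df_error = 5.2, 15        # MSE and d.f. = n - k from the worked ANOVA table
xbar_c, n_c = 29, 4            # brand C sample mean and number of observations
alpha = 0.05

t_crit = stats.t.ppf(1 - alpha / 2, df_error)   # two-tailed t value, t_{alpha/2}
margin = t_crit * sqrt(mse / n_c)
print(f"95% CI for mu_C: ({xbar_c - margin:.2f}, {xbar_c + margin:.2f})")
# Roughly (26.6, 31.4): 29 ± 2.13 * sqrt(5.2/4)
```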
The standard error of the difference of two sample means, if we assume equal population variances [which we do, if we use ANOVA], is s_{x̄i−x̄j} = √(MSE (1/ni + 1/nj)), so the (1 − α) confidence interval for the difference of two means (μi − μj) is (x̄i − x̄j) ± t_{α/2} √(MSE (1/ni + 1/nj)), and we reject H0: μi = μj in favor of Ha: μi ≠ μj (the "not equal" alternative) if zero is not in this interval.
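The same computation for a difference of means, here brands C and B (a sketch using the MSE and d.f. from the worked table above):

```python
from math import sqrt
from scipy import stats

mse, df_error = 5.2, 15
xbar_c, n_c = 29, 4            # brand C
xbar_b, n_b = 27, 4            # brand B
alpha = 0.05

t_crit = stats.t.ppf(1 - alpha / 2, df_error)
margin = t_crit * sqrt(mse * (1 / n_c + 1 / n_b))
diff = xbar_c - xbar_b
print(f"95% CI for mu_C - mu_B: ({diff - margin:.2f}, {diff + margin:.2f})")
# Roughly (-1.4, 5.4): zero is inside, so C and B are not significantly
# different at alpha = .05.
```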
For the one-sided alternative Ha: μi > μj we reject H0: μi = μj if (x̄i − x̄j)/s_{x̄i−x̄j} > tα. This last inequality is equivalent (with some algebraic manipulation) to the condition x̄i − x̄j > tα √(MSE (1/ni + 1/nj)). The value on the right is known as the Least Significant Difference (LSD) between the (sample) means. This allows us to compare any pair of means if the ANOVA says there is a significant difference, and conclude that μi > μj if x̄i − x̄j > LSD. If ni = nj = r this becomes:
Reject H0 [conclude μi > μj] if the ANOVA shows a significant difference among the means and x̄i − x̄j > tα √(2·MSE/r).
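A sketch of the LSD for the gasoline data (equal sample sizes r = 4; t is one-tailed here, matching the one-sided comparison above):

```python
from math import sqrt
from scipy import stats

mse, df_error, r = 5.2, 15, 4
alpha = 0.05

t_crit = stats.t.ppf(1 - alpha, df_error)      # one-tailed t value, t_alpha
lsd = t_crit * sqrt(2 * mse / r)
print(f"LSD = {lsd:.2f}")                      # about 2.83

# Example: x̄_C − x̄_E = 29 − 22 = 7 > LSD, so (given the significant overall F)
# we may conclude mu_C > mu_E.
```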
In comparing k means simultaneously [all with the same sample size r] at the overall significance level α, we let α′ = 1 − (1 − α)^(1/k) (the per-comparison level whose combined risk over k comparisons is α), and the True Significant Difference (TSD) is given by TSD = t_{α′} √(2·MSE/r).

NOTE: α′ may be approximated by α/k, where k is the number of means (or factors) being compared. Thus when comparing four means at the .05 level we use α′ ≈ .05/4 = .0125.
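A sketch of the adjusted level and the resulting TSD for this four-mean example (same MSE, d.f., and r as in the LSD sketch above):

```python
from math import sqrt
from scipy import stats

alpha, k = 0.05, 4
alpha_exact = 1 - (1 - alpha) ** (1 / k)   # ≈ 0.0127, the exact per-comparison level
alpha_approx = alpha / k                   # 0.0125, the approximation used in the notes

mse, df_error, r = 5.2, 15, 4
tsd = stats.t.ppf(1 - alpha_approx, df_error) * sqrt(2 * mse / r)
print(f"alpha' ≈ {alpha_exact:.4f} (approx. {alpha_approx}), TSD = {tsd:.2f}")
# The TSD is larger than the LSD, so simultaneous comparisons require a
# bigger gap between sample means before we declare a difference.
```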
Two-way analysis of variance allows removal of the variance due to another factor, called blocks, or testing the significance of two different factors simultaneously. MINITAB: Select STAT > ANOVA > TWO-WAY.
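Because the mileage table also records which driver produced each tankful, the drivers can serve as blocks in a two-way analysis of the same data. A minimal sketch (assumes pandas and statsmodels are available; the column names are ours):

```python
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

# Stacked layout: one row per observation, with the brand (treatment) and
# driver (block) for each mileage value identified by key codes.
data = pd.DataFrame({
    "mpg":    [27, 29, 30, 24, 25,   26, 28, 32, 24, 21,
               21, 27, 27, 23, 22,   22, 24, 27, 21, 20],
    "brand":  ["A", "B", "C", "D", "E"] * 4,
    "driver": [d for d in ["1", "2", "3", "4"] for _ in range(5)],
})

# Two-way ANOVA: brand is the treatment factor, driver is the blocking factor.
model = ols("mpg ~ C(brand) + C(driver)", data=data).fit()
print(sm.stats.anova_lm(model, typ=2))   # sums of squares, F, and p for each factor
```

Removing the driver-to-driver variation from the error term makes the test for brand differences more sensitive than the one-way analysis above.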
Example for class discussion (starting salaries by College):
(a) Should the Career Resource Center tell students that average starting salaries are the same for graduates of all the Colleges?
(b) Find the 90% confidence interval for the mean starting salary of an Education graduate.
(c) Can we conclude that the average starting salary is better for a Liberal Arts graduate than for an Education graduate? (alpha = .05)
(d) If the goal is to obtain the highest starting salary, can you tell the students what College they should attend? (alpha = .05)