Design of Experiment, Study notes of Statistics

Statistic with Design of Experiment

Typology: Study notes

2019/2020

Uploaded on 05/06/2020

Taufik_Rizkiandi
Design of Experiments
1. Analysis of Variance
2. More about Single Factor Experiments
3. Randomized Blocks, Latin Squares
4. Factorial Designs
5. 2^k Factorial Designs
6. Blocking and Confounding
Montgomery, D.C. (1997): Design and Analysis of Experiments (4th ed.), Wiley.


1. Single Factor – Analysis of Variance

Example: Investigate tensile strength y of new synthetic fiber.

Known: y depends on the weight percent of cotton (which should range within 10% – 40%).

Decision: (a) test specimens at 5 levels of cotton weight: 15%, 20%, 25%, 30%, 35%. (b) test 5 specimens at each level of cotton content.

Single Factor Experiment with a = 5 levels and n = 5 Replicates.

=⇒ 25 runs.

Runs should be performed in random order (to prevent systematic effects such as machine warm-up ...)
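A randomized run order for the 25 runs can be generated in R. The following is a minimal sketch; the seed and the names `design` and `run_order` are illustrative assumptions and do not come from the original notes.

```r
# Build the design: a = 5 cotton levels, each replicated n = 5 times
a <- 5; n <- 5
levels_pct <- c(15, 20, 25, 30, 35)
design <- data.frame(cotton    = rep(levels_pct, each = n),
                     replicate = rep(seq_len(n), times = a))

set.seed(1)                        # arbitrary seed, only for reproducibility
run_order <- sample(nrow(design))  # random permutation of 1..25
design <- design[run_order, ]      # specimens are tested in this order
design$run <- seq_len(nrow(design))
head(design)
```

Each row of `design` is one specimen; testing them from top to bottom spreads any time-related drift evenly over the five cotton levels.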

boxplot(y~w); plot(as.numeric(w), y); points(tapply(y, w, mean), pch=20)

[Figure: boxplot of Tensile Strength by Cotton Weight Percent (left); scatter plot of Tensile Strength against the level index 1 to 5 of Cotton Weight Percent, with the treatment means marked as filled dots (right).]

We wish to test for differences between the mean strengths at all a = 5 levels of cotton weight percent ⇒ Analysis of Variance.

Analysis of Variance (ANOVA)

Use the Linear Regression Model

$$y_{ij} = \mu + \tau_i + \varepsilon_{ij}$$

for treatment i = 1, ..., a, and replication j = 1, ..., n.

Observation yij (ith treatment, jth replication)

Parameter μ is common to all treatments (Overall Mean)

Parameter τi is unique to the ith treatment (ith Treatment Effect)

Random variable εij is the Random Error component.

Further assumption: εij iid ∼ N(0, σ^2).

Our interest is in the treatment effects.

Fixed Effects Model

Treatment effects τi are usually defined as the deviations from the overall mean

$$\mu := \frac{1}{a}\sum_{i=1}^{a} \mu_i = \frac{1}{a}\sum_{i=1}^{a} (\mu + \tau_i) = \mu + \frac{1}{a}\sum_{i=1}^{a} \tau_i ,$$

Thus, we have a restriction on these effects, namely

$$\sum_{i=1}^{a} \tau_i = 0.$$

Here, μi = E(yij) is the mean of all observations yij in the ith treatment (row).

ANOVA Decomposition

We are interested in testing the equality of the a treatment means

$$H_0: \mu_1 = \mu_2 = \dots = \mu_a \iff H_0: \tau_1 = \tau_2 = \dots = \tau_a = 0$$

that is, testing that all treatment effects are equal and hence, because they sum to zero, all equal to zero.

The Sum of Squares decomposition in Regression is valid

SST = SSR + SSE

where SSR, the Sum of Squares due to the Regression model, is only related to the treatment effects τi. Hence, we have

$$\sum_{i=1}^{a}\sum_{j=1}^{n} (y_{ij} - \hat\mu)^2 = \sum_{i=1}^{a}\sum_{j=1}^{n} (\hat\mu_i - \hat\mu)^2 + \sum_{i=1}^{a}\sum_{j=1}^{n} (y_{ij} - \hat\mu_i)^2$$

Therefore, the total variability in the data can be partitioned into a sum of squares of the differences between the treatment averages and the grand average, plus a sum of squares of the differences of observations within treatments from the treatment average.
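The decomposition can be checked numerically. The sketch below assumes the tensile strength values from Montgomery's textbook example; the raw data are not listed in these notes, so treat them as an assumption. The names `mu_hat` and `mu_i_hat` are illustrative.

```r
# Tensile strength data (ASSUMPTION: values as in Montgomery's textbook
# example; they are not printed in these notes)
y <- c( 7,  7, 15, 11,  9,    # 15% cotton
       12, 17, 12, 18, 18,    # 20%
       14, 18, 18, 19, 19,    # 25%
       19, 25, 22, 19, 23,    # 30%
        7, 10, 11, 15, 11)    # 35%
w <- gl(5, 5, labels = c(15, 20, 25, 30, 35))

mu_hat   <- mean(y)     # grand average
mu_i_hat <- ave(y, w)   # treatment average, repeated for each observation

SST <- sum((y - mu_hat)^2)         # total variability
SSR <- sum((mu_i_hat - mu_hat)^2)  # between treatments
SSE <- sum((y - mu_i_hat)^2)       # within treatments
c(SST = SST, SSR = SSR, SSE = SSE, check = SSR + SSE)
```

With these data, `SSR` and `SSE` agree with the treatment and error sums of squares (475.76 and 161.20) reported for the tensile strength example.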

ANOVA Table

Source of Variation          Sum of Squares   Degrees of Freedom   Mean Square   F
Between Treatments           SSR              a − 1                MSR           MSR/MSE
Error (within Treatments)    SSE              N − a                MSE
Total                        SST              N − 1

Here N = a·n is the total number of observations, MSR = SSR/(a − 1) and MSE = SSE/(N − a).

Tensile Strength Data: Test H0: μ1 = μ2 = μ3 = μ4 = μ5 against H1: some means are different

Source of Variation          Sum of Squares   Degrees of Freedom   Mean Square   F(4, 20)   p-value
Cotton Weight Percent        475.76           4                    118.94        14.76      < 0.001
Error (within Treatments)    161.20           20                   8.06
Total                        636.96           24

Thus, we reject H 0 and conclude that the treatment means differ!
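The F statistic and its p-value follow directly from the sums of squares and degrees of freedom in the table. A minimal sketch using only those reported values (the variable names are illustrative):

```r
# Values taken from the ANOVA table for the tensile strength data
SSR <- 475.76; df_R <- 4     # between treatments, a - 1
SSE <- 161.20; df_E <- 20    # within treatments,  N - a

MSR <- SSR / df_R            # mean square between treatments
MSE <- SSE / df_E            # mean square error, estimates sigma^2
F0  <- MSR / MSE             # observed F statistic

p_value <- pf(F0, df_R, df_E, lower.tail = FALSE)  # upper tail of F(4, 20)
crit    <- qf(0.95, df_R, df_E)                    # 5% critical value
c(F = F0, p = p_value, crit = crit)
```

H0 is rejected at the 5% level whenever `F0` exceeds `crit`, which is the case here.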

summary(aov(y~w))
            Df Sum Sq Mean Sq F value    Pr(>F)
w            4 475.76  118.94  14.757 9.128e-06 ***
Residuals   20 161.20    8.06
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Moreover, MSE estimates σ^2 and the (1 − α) confidence interval for the ith treatment mean μi is

$$\left[\, \bar{y}_{i\cdot} \pm t_{1-\alpha/2,\,N-a} \sqrt{MSE/n} \,\right]$$
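As a hedged illustration, the interval can be evaluated for the tensile strength example. The treatment means are not printed directly in these notes; the sketch assumes they can be rebuilt from the reported model coefficients (intercept 9.8 with treatment contrasts 5.6, 7.8, 11.8, and grand mean 15.04).

```r
# Quantities reported in these notes
MSE   <- 161.20 / 20     # error mean square
n <- 5; a <- 5; N <- a * n
alpha <- 0.05

# ASSUMPTION: treatment means reconstructed from the reported coefficients;
# the 35% mean follows from the grand mean 15.04
means4 <- 9.8 + c(0, 5.6, 7.8, 11.8)
ybar   <- c(means4, a * 15.04 - sum(means4))
names(ybar) <- c(15, 20, 25, 30, 35)

half_width <- qt(1 - alpha / 2, df = N - a) * sqrt(MSE / n)
ci <- cbind(lower = ybar - half_width, mean = ybar, upper = ybar + half_width)
round(ci, 2)
```

All five intervals share the same half-width (about 2.65 here) because the design is balanced and MSE is pooled over all treatments.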

W <- C(w, treatment); coefficients(aov(y~W)) # default contrast for w
(Intercept)         W20         W25         W30         W35
        9.8         5.6         7.8        11.8         1.0
W <- C(w, sum); coefficients(aov(y~W))
(Intercept)          W1          W2          W3          W4
      15.04       -5.24        0.36        2.56        6.56
options(contrasts=c("contr.sum", "contr.poly")) # for all factors

Bartlett’s Test for Equality of Variances:

$$H_0: \sigma_1^2 = \sigma_2^2 = \dots = \sigma_a^2$$

The test statistic K^2 is based on the (pooled) sample variances and is approximately χ^2 distributed with a − 1 degrees of freedom under H0.

bartlett.test(y~W)

        Bartlett test for homogeneity of variances

data:  y by W
Bartlett’s K-squared = 0.9331, df = 4, p-value = 0.9198

=⇒ The test does not reject H0: there is no evidence that the 5 variances differ.

This test is very sensitive to the normality assumption!
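To make the statistic concrete, K^2 can be computed by hand from the sample variances and compared with `bartlett.test`. This is a sketch of the standard Bartlett formula; the tensile strength values are assumed from Montgomery's textbook example because the raw data are not listed in these notes.

```r
# Tensile strength data (ASSUMPTION: values as in Montgomery's textbook
# example; they are not printed in these notes)
y <- c( 7,  7, 15, 11,  9,   12, 17, 12, 18, 18,   14, 18, 18, 19, 19,
       19, 25, 22, 19, 23,    7, 10, 11, 15, 11)
w <- gl(5, 5, labels = c(15, 20, 25, 30, 35))

s2  <- tapply(y, w, var)      # sample variance per treatment
n_i <- tapply(y, w, length)   # replicates per treatment
a   <- length(s2); N <- sum(n_i)

s2_pooled <- sum((n_i - 1) * s2) / (N - a)   # pooled variance (equals MSE)
q  <- (N - a) * log(s2_pooled) - sum((n_i - 1) * log(s2))
cc <- 1 + (sum(1 / (n_i - 1)) - 1 / (N - a)) / (3 * (a - 1))  # correction factor
K2 <- q / cc
p_value <- pchisq(K2, df = a - 1, lower.tail = FALSE)
c(K2 = K2, p = p_value)

# Built-in result for comparison
unname(bartlett.test(y ~ w)$statistic)
```

The hand-computed `K2` should match the built-in statistic; both rely on the normality assumption noted above.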

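If the variances are not equal, a power transformation y∗ = y^λ (log y for λ = 0) can stabilize them. With σy ∝ μ^α, choose λ = 1 − α:

Relationship b/w σy and μ    α      λ = 1 − α    Transformation
σy ∝ const                   0      1            no transformation
σy ∝ μ^(1/2)                 1/2    1/2          Square Root
σy ∝ μ                       1      0            Log
σy ∝ μ^(3/2)                 3/2    −1/2         Reciprocal Square Root
σy ∝ μ^2                     2      −1           Reciprocal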

Selection of the Power: If σyi ∝ μi^α, i.e. σyi = θ μi^α, then

$$\log \sigma_{y_i} = \log \theta + \alpha \log \mu_i$$

A plot of log σyi versus log μi is a straight line with slope α. Substitute σyi and μi by their estimates Si and ȳi· and guess the value of α from the plot.

Example: 4 different estimation methods of the peak discharge applied to the same watershed.

Method   discharge (cubic feet / second)                 ȳi·      Si
1         0.34    0.12    1.23    0.70    1.75    0.12    0.71    0.66
2         0.91    2.94    2.14    2.36    2.86    4.55    2.63    1.19
3         6.31    8.37    9.75    6.09    9.82    7.24    7.93    1.65
4        17.15   11.82   10.95   17.20   14.35   16.82   14.72    2.80

y <- c(0.34, 0.12, ..., 16.82); m <- gl(4, 6, labels=c(1, 2, 3, 4))
tapply(y, m, mean); tapply(y, m, sd)
        1         2         3         4
 0.710000  2.626667  7.930000 14.715000
       1        2        3        4
0.661090 1.192202 1.647070 2.800891
summary(aov(y~m))
            Df Sum Sq Mean Sq F value    Pr(>F)
m            3 708.35  236.12  76.067 4.111e-11 ***
Residuals   20  62.08    3.10

bartlett.test(y~m)

        Bartlett test for homogeneity of variances

data:  y by m
Bartlett’s K-squared = 8.9958, df = 3, p-value = 0.02935

The Bartlett Test rejects Equality of Variances. Thus we analyze y∗ = √y.
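The choice of the square root can be motivated by the log-log relationship described earlier. The sketch below estimates the slope α by least squares from the peak discharge data in the table; fitting a regression line instead of reading the slope off a plot is a choice made here for illustration.

```r
# Peak discharge data from the table above (6 observations per method)
y <- c( 0.34,  0.12,  1.23,  0.70,  1.75,  0.12,
        0.91,  2.94,  2.14,  2.36,  2.86,  4.55,
        6.31,  8.37,  9.75,  6.09,  9.82,  7.24,
       17.15, 11.82, 10.95, 17.20, 14.35, 16.82)
m <- gl(4, 6, labels = 1:4)

ybar <- tapply(y, m, mean)   # estimates of mu_i
s    <- tapply(y, m, sd)     # estimates of sigma_yi

fit <- lm(log(s) ~ log(ybar))            # log S_i = log theta + alpha * log ybar_i
alpha_hat  <- unname(coef(fit)[2])       # estimated slope alpha
lambda_hat <- 1 - alpha_hat              # suggested power for y^lambda
c(alpha = alpha_hat, lambda = lambda_hat)
```

A slope close to 1/2 corresponds to λ = 1/2 in the transformation table, that is, the square root.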

ry <- sqrt(y); tapply(ry, m, sd)
        1         2         3         4
0.4044534 0.3857295 0.2929908 0.3734610
summary(aov(ry~m))
            Df Sum Sq Mean Sq F value    Pr(>F)
m            3 32.684  10.895  81.049 2.296e-11 ***
Residuals   20  2.688   0.134

To account for the use of the data to estimate α we reduce the error degrees of freedom by one. This gives F = 76.99, again with p-value < 0.001.
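The adjusted test can be reproduced from the sums of squares of the square-root analysis. A minimal sketch using the rounded values printed in the ANOVA output (variable names are illustrative):

```r
# Sums of squares from the ANOVA of sqrt(y)
SSR <- 32.684; df_R <- 3        # between methods
SSE <- 2.688;  df_E <- 20 - 1   # error df reduced by one for estimating alpha

F_adj <- (SSR / df_R) / (SSE / df_E)
p_adj <- pf(F_adj, df_R, df_E, lower.tail = FALSE)
c(F = F_adj, p = p_adj)
```

Because the inputs are rounded, `F_adj` comes out near 77.0 rather than exactly 76.99; the conclusion is unchanged.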

r <- residuals(aov(ry~m)); f <- fitted(aov(ry~m)); plot(f, r)
library(MASS); boxcox(y~m)

[Figure: residuals versus fitted values for the square-root model (left); Box-Cox profile log-likelihood versus lambda with the 95% confidence limit marked (right).]