Worked examples, Exercises of Statistics

Practice Examples with Solutions.

Typology: Exercises

2021/2022

Uploaded on 02/24/2022

albertein 🇺🇸

MATH1725 Introduction to Statistics: Worked examples

Worked Example: Lectures 1–2

The lifetimes of 400 light-bulbs were found to the nearest hour. The results were recorded as follows.

Lifetime (hours)   0–199  200–399  400–599  600–799  800–999  1000–1199  1200–1999
Frequency            143       97       64       51       14         14         17

Construct a histogram and cumulative frequency polygon for these data. Estimate the percentage of bulbs with lifetime less than 480 hours.

Answer: Lifetimes cannot be negative so class intervals are [0, 199.5), [199.5, 399.5), [399.5, 599.5), and so on.

[Figure: histogram. Horizontal axis: lifetime (hours), 0 to 2000; vertical axis: frequency per 200-hour class.]

Adjust the height of the rectangle for the "1200–1999" interval to make histogram area proportional to frequency. If the vertical axis is "frequency per interval of 200 hours", the height of the [0, 199.5) class is 143 × 200/199.5 = 143.4, to allow for the first class not being of width 200.

Lifetime (hours)       0.0  199.5  399.5  599.5  799.5  999.5  1199.5  1999.5
Cumulative frequency     0    143    240    304    355    369     383     400

Make the cumulative frequency at time zero equal to 0.

[Figure: cumulative frequency polygon. Horizontal axis: lifetime (hours), 0 to 2000; vertical axis: cumulative frequency, 0 to 400. An enlarged view of 400 to 600 hours marks the interpolated value 265.8 at 480 hours.]

Estimated number of light-bulbs with lifetime less than 480 hours is

$$240 + \frac{480 - 399.5}{200} \times (304 - 240) = 265.8.$$

Required percentage is

$$\frac{265.8}{400} \times 100 = 66.4\%.$$
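The interpolation on the cumulative frequency polygon can be checked numerically. The following is a minimal sketch, not part of the original handout: the helper name `cumulative_at` is hypothetical, and the boundary 1199.5 is taken as the upper edge of the 1000–1199 class.

```python
from bisect import bisect_right

# Class boundaries (hours) and cumulative frequencies from the worked example.
# 1199.5 is the upper boundary of the 1000-1199 class.
boundaries = [0.0, 199.5, 399.5, 599.5, 799.5, 999.5, 1199.5, 1999.5]
cum_freq = [0, 143, 240, 304, 355, 369, 383, 400]


def cumulative_at(t, xs, cs):
    """Linearly interpolate the cumulative frequency polygon at time t."""
    if t <= xs[0]:
        return float(cs[0])
    if t >= xs[-1]:
        return float(cs[-1])
    i = bisect_right(xs, t) - 1  # index of the lower boundary of t's class
    frac = (t - xs[i]) / (xs[i + 1] - xs[i])
    return cs[i] + frac * (cs[i + 1] - cs[i])


count_480 = cumulative_at(480, boundaries, cum_freq)
percent_480 = 100 * count_480 / cum_freq[-1]
print(round(count_480, 1), round(percent_480, 1))
```

The same function can be evaluated at any other lifetime, for example to read off an approximate median by searching for the time at which the cumulative frequency reaches 200.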

Worked Example: Lectures 1–

The Christmas cactus Zygocactus truncatus has branches made up of separate segments. For one such cactus the number of segments in each branch was counted.

Number x of segments                  1  2  3  4  5   6  7  8  9
Number of branches with x segments    3  0  6  7  8  18  8  0  2

Construct a cumulative frequency polygon to represent these data.

Answer: The data are discrete, so the cumulative frequency plot is a step function.

Number x of segments                    1  2  3   4   5   6   7   8   9
Number of branches with ≤ x segments    3  3  9  16  24  42  50  50  52

[Figure: step-function cumulative frequency plot. Horizontal axis: number of segments, 0 to 10; vertical axis: cumulative frequency, 0 to 60.]

Worked Example: Lectures 1–

The following data give one hundred measurement errors made during the mapping of the American state of Massachusetts during the last century.

Error X (in minutes ′ of arc)   (−4, −2]  (−2, 0]  (0, +2]  (+2, +4]  (+4, +6]
Frequency                             10       43       39         5         3

Show that the sample mean and sample standard deviation for these data are $\bar{x} = -0.04'$ and $s = 1.717'$ respectively.

Answer:

Class           Class frequency f   Class mid-point x    f x   f x^2
−4 < x ≤ −2            10                  −3             −30      90
−2 < x ≤ 0             43                  −1             −43      43
 0 < x ≤ +2            39                  +1              39      39
+2 < x ≤ +4             5                  +3              15      45
+4 < x ≤ +6             3                  +5              15      75
Totals              n = 100                                −4     292
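The totals in the table lead directly to the quoted mean and standard deviation. As a hedged sketch (variable names are illustrative, and the divisor n − 1 is assumed because the question asks for the sample standard deviation), the grouped-data calculation is:

```python
from math import sqrt

# Class mid-points and frequencies from the table above.
midpoints = [-3, -1, 1, 3, 5]
freqs = [10, 43, 39, 5, 3]

n = sum(freqs)                                               # 100
sum_fx = sum(f * x for f, x in zip(freqs, midpoints))        # sum of f x
sum_fx2 = sum(f * x * x for f, x in zip(freqs, midpoints))   # sum of f x^2

mean = sum_fx / n
variance = (sum_fx2 - n * mean ** 2) / (n - 1)  # sample variance, divisor n - 1
sd = sqrt(variance)

print(n, sum_fx, sum_fx2, round(mean, 2), round(sd, 3))
```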

Worked Example: Lectures 4–

A firm investigates the length of telephone conversations of their office staff. Ten consecutive conversations had lengths, in minutes:

10.7, 9.5, 11.1, 7.8, 11.9, 4.1, 10.0, 9.2, 6.5, 9.2.

Derive a 95% confidence interval for the mean conversation length. Test whether the mean length of a conversation is eight minutes.

Answer:

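$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i = \frac{90.0}{10} = 9 \text{ minutes}.$$

$$s^2 = \frac{1}{n-1}\left\{\sum_{i=1}^{n} x_i^2 - n\bar{x}^2\right\} = \frac{858.74 - 810}{9} = 5.42.$$

Estimate the population variance $\sigma^2$ by $s^2$ with $s = \sqrt{5.42} = 2.33$. Then

$$\frac{\bar{X} - \mu}{s/\sqrt{n}} \sim t_{n-1}.$$

The 95% confidence interval for $\mu$ is $\bar{x} \pm t_9(2.5\%)\, s/\sqrt{10}$. Here $s/\sqrt{10} = 0.737$ and $t_9(2.5\%) = 2.262$, so

$$\bar{x} \pm t_9(2.5\%)\frac{s}{\sqrt{10}} = 9 \pm (2.262 \times 0.737) = 9 \pm 1.667,$$

that is, approximately (7.33, 10.67) minutes.

Since 8 minutes lies inside the 95% confidence interval we would accept $H_0$ in testing $H_0: \mu = 8$ vs. $H_1: \mu \neq 8$ at the 5% significance level.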
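The interval and the test can be reproduced from the raw conversation lengths. This is a sketch only: the critical value 2.262 is copied from the answer (the standard library has no t quantile function), and the names are illustrative. Because it works with unrounded intermediate values, the end points may differ from the hand calculation in the third significant figure.

```python
from math import sqrt
from statistics import mean, stdev

lengths = [10.7, 9.5, 11.1, 7.8, 11.9, 4.1, 10.0, 9.2, 6.5, 9.2]
T9_UPPER_2_5 = 2.262  # t_9(2.5%) as quoted in the answer, taken from tables

n = len(lengths)
xbar = mean(lengths)
s = stdev(lengths)                 # sample standard deviation, divisor n - 1
std_error = s / sqrt(n)

half_width = T9_UPPER_2_5 * std_error
ci = (xbar - half_width, xbar + half_width)

# Two-sided test of H0: mu = 8 using the same critical value.
t_stat = (xbar - 8) / std_error
reject_h0 = abs(t_stat) > T9_UPPER_2_5

print(round(xbar, 2), round(s, 2), ci, reject_h0)
```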

Worked Example: Lectures 5–

A population has a Poisson distribution but it is not known whether the mean μ is 1 or 4. To test the hypothesis H0: μ = 1 vs. H1: μ = 4 on the basis of one observation X the following test procedure is considered: reject H0 if X ≥ i. Type I error is defined to be "rejecting H0 when H0 is true". Find the probability of type I error for the three cases i = 2, 3, 4.

Answer: If $H_0$ is true, $\mu = 1$ and

$$\mathrm{pr}\{X = x\} = \frac{e^{-1}}{x!}, \quad x = 0, 1, 2, \ldots,$$

so that pr{Type I error} = pr{X ≥ i}. If i = 2,

$$\mathrm{pr}\{\text{Type I error}\} = \mathrm{pr}\{X \ge 2\} = 1 - \mathrm{pr}\{X < 2\} = 1 - \mathrm{pr}\{X = 0\} - \mathrm{pr}\{X = 1\} = 1 - e^{-1} - e^{-1} = 0.264.$$

Similarly if i = 3,

$$\mathrm{pr}\{\text{Type I error}\} = \mathrm{pr}\{X \ge 3\} = 1 - \mathrm{pr}\{X < 3\} = 0.080.$$

If i = 4, pr{Type I error} = pr{X ≥ 4} = 0.019.

Notice that an exact 5% or 10% significance level test does not exist for this discrete distribution.
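The three tail probabilities can be tabulated with a few lines of code, which also makes it easy to see why no cut-off gives exactly 5% or 10%. This is a small sketch with hypothetical function names, assuming the null mean of 1.

```python
from math import exp, factorial


def poisson_pmf(x, mu):
    """pr{X = x} for X ~ Poisson(mu)."""
    return exp(-mu) * mu ** x / factorial(x)


def type_one_error(i, mu0=1.0):
    """pr{X >= i} under H0, i.e. the size of the test 'reject H0 if X >= i'."""
    return 1.0 - sum(poisson_pmf(x, mu0) for x in range(i))


sizes = {i: type_one_error(i) for i in (2, 3, 4)}
for i, alpha in sizes.items():
    print(i, round(alpha, 3))
```

The attainable sizes jump from about 0.08 to about 0.02 between i = 3 and i = 4, so a nominal 5% level falls between two achievable values.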

Worked Example: Lectures 5–

A sample of size 64 is drawn by simple random sampling from a normal population which has known variance 4. The sample mean is −0.45. Test the hypothesis $H_0: \mu = 0$ vs. $H_1: \mu \neq 0$ at the 5% level of significance. Repeat for testing $H_0: \mu = 0$ vs. $H_1: \mu < 0$.

Answer: Here $\bar{X} \sim N(\mu, \sigma^2/n)$ with $\sigma^2 = 4$, $n = 64$, so $\sigma^2/n = 0.0625$ and $\bar{X} \sim N(\mu, 0.0625)$. Test statistic is

$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}},$$

where $Z \sim N(0, 1)$ if $H_0$ is true.

For α = 0.05 with a two-sided test, $z_{\alpha/2} = 1.96$. Critical region is Z < −1.96 or Z > 1.96. Observed value is z = −0.45/0.25 = −1.8. This does not lie in the critical region so accept $H_0$.

For α = 0.05 with a one-sided test, $z_{\alpha} = 1.645$. Critical region is Z < −1.645. Observed value is z = −1.8, which lies in the critical region so reject $H_0$.
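Both decisions can be checked with the standard normal quantiles from Python's `statistics.NormalDist`. The sketch below assumes the one-sided test uses the lower-tail critical region Z < −1.645 stated in the answer; the function name `z_statistic` is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def z_statistic(xbar, mu0, sigma2, n):
    """Standardised sample mean when the population variance is known."""
    return (xbar - mu0) / sqrt(sigma2 / n)


std_normal = NormalDist()  # N(0, 1)
alpha = 0.05

z = z_statistic(xbar=-0.45, mu0=0.0, sigma2=4.0, n=64)
z_two_sided = std_normal.inv_cdf(1 - alpha / 2)  # about 1.96
z_one_sided = std_normal.inv_cdf(1 - alpha)      # about 1.645

reject_two_sided = abs(z) > z_two_sided
reject_lower_tail = z < -z_one_sided

print(round(z, 2), reject_two_sided, reject_lower_tail)
```

The same observation is therefore not significant against a two-sided alternative but is significant against the one-sided lower-tail alternative, because the one-sided test places all 5% in one tail.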

Worked Example: Lecture 6

The absenteeism rates (in days and parts of days) for nine employees of a large company were recorded in two consecutive years.

Employee    1    2     3    4    5     6     7     8    9
Year 1    3.0  6.7  11.3  5.0  9.4  15.7   8.0  10.0   9.
Year 2    2.8  5.1   8.4  5.0  6.2  12.2  10.0   6.8   6.

Is there any evidence that the average absenteeism rate is different for the two years?

Answer: Data paired as same employee studied in each of the two years. Form difference $d_i = (\text{year 1})_i - (\text{year 2})_i$. Need to estimate variance $\sigma_d^2$. Test $H_0: \mu_d = 0$ vs. $H_1: \mu_d \neq 0$. See lecture 6.

Worked Example: Lecture 8

Which phrases i-iv below apply to the sample correlation coefficient rXY? (i) measures linear association between two variables, (ii) is never negative, (iii) has positive slope, (iv) depends on the units of measurement of X and Y.

Answer: i only.

Worked Example: Lecture 8

The tensile strength of a glued joint is related to the glue thickness. A sample of six values gave the following results:

Glue Thickness (inches)    0.12  0.12  0.13  0.13  0.14  0.
Tensile Strength (lbs.)    49.8  46.1  46.5  45.8  44.3  45.

Calculate the sample correlation coefficient r for these data. Use the fitted least squares regression line to predict the tensile strength of a joint for a glue thickness of 0.14 inches. Using scatter-diagrams, sketch the form of regression line expected in the three cases when r takes the values −1, 0, and +1.

Joint probabilities p(x, y) are found by summing probabilities for each outcome giving rise to (X = x, Y = y). Thus p(1, 2) = pr{HTT or TTH} = 1/4. Marginal probabilities are found by forming row or column sums. For example

$$\mathrm{pr}\{X = 2\} = p(2, 1) + p(2, 2) + p(2, 3).$$

(c) If X = 1, then

$$\mathrm{pr}\{Y = y \mid X = 1\} = \frac{p(1, y)}{p_X(1)} = \frac{p(1, y)}{3/8}.$$

Thus

$$\mathrm{pr}\{Y = 1 \mid X = 1\} = \frac{1/8}{3/8} = \frac{1}{3}, \quad \mathrm{pr}\{Y = 2 \mid X = 1\} = \frac{1/4}{3/8} = \frac{2}{3}, \quad \mathrm{pr}\{Y = 3 \mid X = 1\} = 0.$$

If X = 1, then the outcome is one of HTT, THT, TTH. In one out of these three cases we observe Y = 1 and in two out of three we observe Y = 2.

Worked Example: Lecture 11

Suppose X and Y are independent continuous random variables which are each uniformly distributed on the interval (0, 1).
(a) Find the probability that 0 < X + Y < z for values z ∈ (0, 2).
(b) If Z = X + Y, deduce the form of the probability density function f(z) of Z.
Hints: In (a), think about the area on the x-y plane corresponding to 0 < x + y < z. In (b), first find the cumulative distribution function F(z) = pr{Z ≤ z}.

Answer: As X and Y are uniformly distributed on the interval (0, 1) they have pdf

$$f_X(x) = \begin{cases} 1 & \text{if } 0 < x < 1, \\ 0 & \text{otherwise,} \end{cases} \qquad f_Y(y) = \begin{cases} 1 & \text{if } 0 < y < 1, \\ 0 & \text{otherwise.} \end{cases}$$

(a) Joint probability density is f (x, y) = fX (x)fY (y) by independence of X and Y. Hence f (x, y) = 1, a constant, for 0 < x < 1 and 0 < y < 1.

Probability of an event A is volume under pdf with base area given by A. Here A is the region for which 0 < X + Y < z.

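Consider the two cases z < 1 and z > 1 separately.

[Figure: the joint density f(x, y) = 1 over the unit square, and the region A where x + y < z inside the unit square, drawn for the case z < 1 and for the case z > 1 (where the excluded corner triangle has side 2 − z).]

From the figure above,

$$\mathrm{pr}\{0 < X + Y < z\} = \begin{cases} \tfrac{1}{2} z^2 & \text{if } 0 < z < 1, \\ 1 - \tfrac{1}{2}(2 - z)^2 & \text{if } 1 \le z < 2. \end{cases}$$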

An alternative derivation uses integration. For example, in the case z < 1,

$$\mathrm{pr}\{0 < X + Y < z\} = \iint_{0 < x + y < z} f(x, y)\,dx\,dy = \int_{y=0}^{z} \int_{x=0}^{z-y} dx\,dy = \int_{y=0}^{z} (z - y)\,dy = \tfrac{1}{2} z^2.$$

(b) If Z = X + Y , then Z has cumulative distribution function F (z) where, from (a) above,

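$$F(z) = \mathrm{pr}\{X + Y \le z\} = \begin{cases} \tfrac{1}{2} z^2 & \text{if } 0 < z < 1, \\ 1 - \tfrac{1}{2}(2 - z)^2 & \text{if } 1 \le z < 2. \end{cases}$$

Probability density function for Z is then

$$f(z) = \frac{dF(z)}{dz} = \begin{cases} z & \text{if } 0 < z < 1, \\ 2 - z & \text{if } 1 \le z < 2. \end{cases}$$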

Z has a triangular distribution on the interval (0, 2) with mode at z = 1.
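As a sanity check on part (b), the density should equal the slope of the distribution function at every point of (0, 2). The sketch below codes both piecewise functions and compares a central-difference derivative of F with f; the function names and the chosen check points are illustrative assumptions, not part of the original answer.

```python
def cdf_z(z):
    """F(z) = pr{X + Y <= z} for independent U(0, 1) variables X and Y."""
    if z <= 0:
        return 0.0
    if z < 1:
        return 0.5 * z * z
    if z < 2:
        return 1.0 - 0.5 * (2.0 - z) ** 2
    return 1.0


def pdf_z(z):
    """Triangular density of Z = X + Y on (0, 2)."""
    if 0 < z < 1:
        return float(z)
    if 1 <= z < 2:
        return 2.0 - z
    return 0.0


def central_difference(f, z, h=1e-6):
    """Numerical derivative of f at z."""
    return (f(z + h) - f(z - h)) / (2 * h)


check_points = [0.25, 0.75, 1.25, 1.75]
max_gap = max(abs(central_difference(cdf_z, z) - pdf_z(z)) for z in check_points)
print(max_gap)
```

The two branches of F agree at z = 1 (both give 1/2), which is why the density is continuous there and peaks at the mode.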

Worked Example: Lecture 14

Measurements of stature were made on each member of a large population of pairs of adult brothers. The height of the elder brother was denoted by X and of the younger brother by Y. Both X and Y had the same mean μ and the same standard deviation σ. The correlation coefficient was ρ. Deduce the mean and variance of (i) U = X − Y , and (ii) V = X + Y. Derive the covariance of U and V.

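Answer:

$$E[U] = E[X - Y] = E[X] - E[Y] = \mu - \mu = 0.$$

$$\mathrm{Var}[U] = \mathrm{Var}[X - Y] = \mathrm{Var}[X] + \mathrm{Var}[Y] - 2\,\mathrm{cov}(X, Y) = \sigma^2 + \sigma^2 - 2\rho\sigma^2 = 2\sigma^2(1 - \rho),$$

as

$$\mathrm{corr}(X, Y) = \frac{\mathrm{cov}(X, Y)}{\sqrt{\mathrm{Var}[X]\,\mathrm{Var}[Y]}} \;\Rightarrow\; \mathrm{cov}(X, Y) = \mathrm{corr}(X, Y)\sqrt{\mathrm{Var}[X]\,\mathrm{Var}[Y]} = \rho\sqrt{\sigma^2\sigma^2} = \rho\sigma^2.$$

$$E[V] = E[X + Y] = E[X] + E[Y] = \mu + \mu = 2\mu.$$

$$\mathrm{Var}[V] = \mathrm{Var}[X + Y] = \mathrm{Var}[X] + \mathrm{Var}[Y] + 2\,\mathrm{cov}(X, Y) = \sigma^2 + \sigma^2 + 2\rho\sigma^2 = 2\sigma^2(1 + \rho).$$

$$\mathrm{cov}(U, V) = \mathrm{cov}(X - Y, X + Y) = \mathrm{cov}(X, X) - \mathrm{cov}(Y, Y) + \mathrm{cov}(X, Y) - \mathrm{cov}(Y, X) = \mathrm{Var}[X] - \mathrm{Var}[Y] + \mathrm{cov}(X, Y) - \mathrm{cov}(X, Y) = \sigma^2 - \sigma^2 = 0.$$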
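The three results follow from the general rules for variances and covariances of linear combinations, which can be written as two small helper functions. This is an illustrative sketch: the helper names are hypothetical, and the values σ² = 6.25 and ρ = 0.6 are made-up numbers chosen only to exercise the formulas.

```python
def var_of_combination(a, b, var_x, var_y, cov_xy):
    """Var[aX + bY]."""
    return a * a * var_x + b * b * var_y + 2 * a * b * cov_xy


def cov_of_combinations(a, b, c, d, var_x, var_y, cov_xy):
    """cov(aX + bY, cX + dY)."""
    return a * c * var_x + b * d * var_y + (a * d + b * c) * cov_xy


sigma2 = 6.25          # hypothetical common variance of X and Y
rho = 0.6              # hypothetical correlation between brothers' heights
cov_xy = rho * sigma2  # cov(X, Y) = rho * sigma^2

var_u = var_of_combination(1, -1, sigma2, sigma2, cov_xy)   # U = X - Y
var_v = var_of_combination(1, 1, sigma2, sigma2, cov_xy)    # V = X + Y
cov_uv = cov_of_combinations(1, -1, 1, 1, sigma2, sigma2, cov_xy)

print(var_u, var_v, cov_uv)
```

Whatever values are substituted, the covariance of U and V vanishes as long as X and Y share the same variance, since only the term Var[X] − Var[Y] survives.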

Worked Example: Lecture 14.

Let $T = a_1 X_1 + a_2 X_2$, where $X_1$ and $X_2$ are uncorrelated random variables with mean $\mu$ and variance $\sigma^2$, and $a_1$ and $a_2$ are constants chosen so that $E[T] = \mu$. Deduce that the choice $a_2 = 1 - a_1$ gives $E[T] = \mu$. In this case prove that the variance of T is a minimum if $a_1 = a_2 = \tfrac{1}{2}$.

Answer: Have two independent normal distributions with unknown variances.
Wrens: $\bar{x}_1 = 21.18$ mm, $s_1^2 = 0.6418$, $n_1 = 10$.
Reed warblers: $\bar{x}_2 = 22.14$ mm, $s_2^2 = 0.4116$, $n_2 = 10$.
Assume $\sigma_1^2 = \sigma_2^2 = \sigma^2$ (unknown). Estimate $\sigma^2$ using

$$s^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{9s_1^2 + 9s_2^2}{18} = 0.5267.$$

Also

$$\bar{x}_1 - \bar{x}_2 = 21.18 - 22.14 = -0.96, \quad s^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right) = 0.1053, \quad t_{18}(2.5\%) = 2.101.$$

If $\mu_1 = \mu_2$ then the two groups of eggs have the same mean length.

To test $H_0: \mu_1 = \mu_2$ vs. $H_1: \mu_1 \neq \mu_2$ at 5% level, reject $H_0$ if

$$\left|\frac{\bar{x}_1 - \bar{x}_2}{\sqrt{s^2(1/n_1 + 1/n_2)}}\right| \ge t_{18}(2.5\%).$$

Here

$$\left|\frac{\bar{x}_1 - \bar{x}_2}{\sqrt{s^2(1/n_1 + 1/n_2)}}\right| = \left|\frac{-0.96}{\sqrt{0.1053}}\right| = 2.95$$

so reject the null hypothesis of equal means at 5% level. The two groups of eggs are significantly different at 5% level.

This does not necessarily imply cuckoos can control their egg size. It has been proposed that a cuckoo lays its egg in the particular nest for which it is best adapted. For further information see: Wyllie, I. (1981) The Cuckoo. Batsford: London. Davies, N.B. and Brooke, M. Coevolution of the cuckoo and its host, Scientific American, January 1991, p.66-73.
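The pooled two-sample t statistic for the egg lengths can be recomputed from the summary statistics alone. In this sketch the function name `pooled_t` is hypothetical and the critical value 2.101 is copied from the answer rather than computed. Working with unrounded intermediates gives a statistic of roughly 2.96 in absolute value, close to the 2.95 quoted.

```python
from math import sqrt


def pooled_t(xbar1, s2_1, n1, xbar2, s2_2, n2):
    """Two-sample t statistic assuming a common unknown variance."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * s2_1 + (n2 - 1) * s2_2) / df
    std_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (xbar1 - xbar2) / std_error, pooled_var, df


# Wrens (group 1) and reed warblers (group 2), as summarised in the answer.
t_stat, pooled_var, df = pooled_t(21.18, 0.6418, 10, 22.14, 0.4116, 10)

T18_UPPER_2_5 = 2.101  # t_18(2.5%) as quoted in the answer, taken from tables
reject_h0 = abs(t_stat) > T18_UPPER_2_5

print(round(pooled_var, 4), df, round(t_stat, 2), reject_h0)
```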

Question (lecture 1-2).

For values 1, 3, 4, 5, 6 obtain the sample mean, sample median, sample variance and sample standard deviation. Answer: 1

Question (lecture 1-2).

The number of insurance policies sold by a small firm per week is 7, 8, 5, 6, 6, 7, 9, 5, 7, 8, 4, 7, 6, 7, 7, 5, 8, 6, 7, 6, 6. Obtain the sample mean, sample median, sample variance, sample standard deviation. Check your values using R. Answer: 2

Question (lecture 3).

For Z ∼ N(0, 1), calculate pr{Z ≤ 0.55}, pr{Z > 2.25}, pr{Z ≤ −0.15}, pr{−1.50 < Z ≤ 2.25}. Answer: 3

Question (lecture 3).

For Z ∼ N(0, 1), calculate pr{Z ≤ 0.63}. Answer: 4

Question (lecture 3).

For Z ∼ N(0, 1), determine the value of z such that: pr{Z ≤ z} = 0.8944, pr{Z > −z} = 0.9713, pr{−z < Z ≤ z} = 0.9108. Answer: 5

Question (lecture 3).

An advertising company requires all of its job applicants to take a psychometric test. Based on recent studies, it is believed that the test score follows a normal distribution with mean 100 and standard deviation 15. Determine the probability that a job applicant will receive a test score below 118, above 112, between 100 and 112. Answer: 6

Question (lecture 4).

If $X \sim t_5$, for what value of x is pr{X > x} = 0.05? Answer: 7

Question (lecture 4).

If $T \sim t_8$, for what value t is pr{T > t} = 0.025? For what value t is pr{T ≤ t} = 0.05? Answer: 8

Question (lecture 4).

Footnote 1: 3.8, 4, 3.7, 1.92.

Footnote 2: 6.524, 7.0 (middle ordered value), 1.462, 1.209.

Footnote 3: pr{Z ≤ 0.55} = Φ(0.55) = 0.7088, pr{Z > 2.25} = 1 − pr{Z ≤ 2.25} = 1 − Φ(2.25) = 0.0122, pr{Z ≤ −0.15} = 1 − pr{Z ≤ 0.15} = 1 − Φ(0.15) = 0.4404, pr{−1.50 < Z ≤ 2.25} = pr{Z ≤ 2.25} − pr{Z ≤ −1.50} = 0.9210. Recall that pr{Z > z} = 1 − pr{Z ≤ z}, pr{Z < −z} = pr{Z > z} by symmetry, and also pr{X < b} = pr{X < a} + pr{a < X < b}.

Footnote 4: Using interpolation in the tables Φ(0.63) = 0.7356.

Footnote 5: pr{Z ≤ 1.25} = 0.8944, pr{Z > −1.90} = pr{Z ≤ 1.90} = 0.9713, pr{−z < Z ≤ z} = Φ(z) − Φ(−z) = 2Φ(z) − 1 = 0.9108 so Φ(z) = 0.9554 and z = 1.70.

Footnote 6: 0.8849, 0.2119, 0.2881. Hint: If $X \sim N(\mu, \sigma^2)$, then $\mathrm{pr}\{X \le x\} = \Phi\left(\frac{x - \mu}{\sigma}\right)$.

Footnote 7: From tables, x = 2.015.

Footnote 8: $t_8(2.5\%) = 2.306$. pr{T > 1.860} = 0.05 so pr{T ≤ −1.860} = 0.05 by symmetry. Thus t = −1.860.

Answer: 15

Question (lecture 8).

For values (x, y) as given below, obtain the sample correlation r.

xi   1.1  2.2  3.4   4.5   5.0
yi   3.3  6.1  7.0  10.4  11.5

Answer: 16

Question (lecture 10).

For values (x, y) as given below, obtain the line of regression for y given x. What does the residual at the first data point x 1 = 1.1 equal? If x = 4, what is the predicted value of y?

xi   1.1  2.2  3.4   4.5   5.0
yi   3.3  6.1  7.0  10.4  11.5

Answer: 17

Question (lecture 10).

For values (x, y) as given below, a line of regression for y given x is fitted.

xi   1.1  2.2  3.4   4.5   5.0
yi   3.3  6.1  7.0  10.4  11.5

Test the hypothesis that the slope β equals zero. Answer: 18

Question (lecture 11).

Suppose pr{X = x} = x/10 for x = 1, 2, 3, 4. Check that the probability function is valid (is 0 ≤ pr{X = x} ≤ 1 for all x and does $\sum_x \mathrm{pr}\{X = x\} = 1$?). Calculate E[X] and Var[X].

Footnote 15: n = 4, $\bar{x} = 4$, $s^2 = 3.333$, $\mu_0 = 1$, $s^2/n = 0.8333$. Test statistic is

$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{4 - 1}{\sqrt{0.8333}} = 3.286.$$

Test rule is reject $H_0$ if $|t| > t_3(2.5\%)$. As $t_3(2.5\%) = 3.182$, reject $H_0$ at 5% level.

Footnote 16: $\bar{x} = 3.24$,

$$s_x^2 = \frac{1}{n-1}\sum (x_i - \bar{x})^2 = \frac{1}{n-1}\left(\sum x_i^2 - n\bar{x}^2\right) = 2.593,$$

$\bar{y} = 7.66$,

$$s_y^2 = \frac{1}{n-1}\sum (y_i - \bar{y})^2 = \frac{1}{n-1}\left(\sum y_i^2 - n\bar{y}^2\right) = 11.033,$$

$$s_{xy} = \frac{1}{n-1}\sum (x_i - \bar{x})(y_i - \bar{y}) = \frac{1}{n-1}\left(\sum x_i y_i - n\bar{x}\bar{y}\right) = 5.2645,$$

$$r_{XY} = \frac{s_{xy}}{\sqrt{s_x^2 s_y^2}} = 0.984.$$

Check your answer using R!

    x=c(1.1,2.2,3.4,4.5,5.0) # And setup y similarly.
    cor(x,y)

Footnote 17: $\bar{x} = 3.24$, $\bar{y} = 7.66$, $s_x^2 = 2.593$, $s_y^2 = 11.033$, $s_{xy} = 5.2645$. Regression line is $y = \alpha + \beta x$ where $\hat{\beta} = s_{xy}/s_x^2 = 2.030$, $\hat{\alpha} = \bar{y} - \hat{\beta}\bar{x} = 1.082$ so fitted line is $y = 1.082 + 2.030x$. If $x_1 = 1.1$, predict $\hat{y}_1 = 3.315$. At x = 1.1, residual is $r_1 = y_1 - \hat{y}_1 = 3.3 - 3.315 = -0.015$. If x = 4, predict y = 9.203. Check your answers using R!

    x=c(1.1,2.2,3.4,4.5,5.0) # And setup y similarly.
    lm(y~x)             # Gives parameter estimates.
    model=lm(y~x)       # Stores regression model output as model.
    model$residuals[1]  # First residual value.

Footnote 18: If $H_0: \beta = 0$, then

$$\frac{\hat{\beta}}{\sqrt{\hat{\sigma}^2 / S_{xx}}} \sim t_{n-2}, \quad \text{where } S_{xx} = \sum (x_i - \bar{x})^2 = (n - 1)s_x^2.$$

Here $\sqrt{\hat{\sigma}^2 / S_{xx}} = 0.2105$ where $S_{xx} = (n - 1)s_x^2 = 10.372$. Thus t = 9.646. $t_3(2.5\%) = 3.182$. As |t| > 3.182, reject $H_0$ at 5% level. Check your answers using R!

    x=c(1.1,2.2,3.4,4.5,5.0) # And setup y similarly.
    model=lm(y~x)
    summary(model)      # Can you find your answers in the R output?
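For readers without R to hand, the quantities in footnotes 16 to 18 can also be reproduced from first principles. The sketch below is an assumption-laden illustration: the fifth y value, 11.5, is inferred from the quoted mean ȳ = 7.66 (the table entry is cut off in this extract), and the variable names are illustrative. With unrounded coefficients the prediction at x = 4 comes out at about 9.20.

```python
from math import sqrt

x = [1.1, 2.2, 3.4, 4.5, 5.0]
y = [3.3, 6.1, 7.0, 10.4, 11.5]  # last value inferred from the quoted mean 7.66

n = len(x)
xbar = sum(x) / n
ybar = sum(y) / n

# Corrected sums of squares and products.
sxx = sum((xi - xbar) ** 2 for xi in x)
syy = sum((yi - ybar) ** 2 for yi in y)
sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))

r = sxy / sqrt(sxx * syy)        # sample correlation
beta = sxy / sxx                 # least squares slope
alpha = ybar - beta * xbar       # least squares intercept

residuals = [yi - (alpha + beta * xi) for xi, yi in zip(x, y)]
sigma2_hat = sum(e * e for e in residuals) / (n - 2)  # residual variance
t_slope = beta / sqrt(sigma2_hat / sxx)               # tests H0: beta = 0
pred_at_4 = alpha + beta * 4

print(round(r, 3), round(beta, 3), round(alpha, 3), round(t_slope, 2))
```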

Answer: 19

Question (lecture 12).

Suppose (X, Y ) take values (0,0), (0,1), (1,0), (1,1) with probabilities 0.2, 0.5, 0.2, 0.1 respectively. Obtain the marginal probabilities for X, and the conditional probabilities for Y given X = 1. Obtain E[XY ]. Are X and Y independent? Answer: 20

Question (lecture 12).

Suppose fXY (x, y) = 4xy for 0 < x < 1 and 0 < y < 1. Obtain the marginal pdf fX (x). Obtain E[XY ]. Are X and Y independent? Answer: 21

Question (lecture 13).

The table below gives the joint probability function for (X, Y ).

               Y
           0     1     2
     0    0.1   0.1   0.1
X    1    0.2   0.0   0.2
     2    0.1   0.0   0.2

Obtain the marginal probabilities pX (x) and pY (y) for X and Y. Hence obtain E[X], E[Y ], Var[X], Var[Y ]. Obtain cov(X, Y ) and corr(X, Y ). Answer: 22

Question (lecture 14).

If cov(X, Y ) = 0.5 and Var[X] = 2, what is cov(X, X + Y )? Answer: 23

Question (lecture 14).

If cov(X + Y, X − Y ) = 12, Var[X + Y ] = 20 and Var[X − Y ] = 16, obtain σ^2 X = Var[X], σ^2 Y = Var[Y ], σXY = cov(X, Y ) and so obtain corr(X, Y ). Answer: 24

Question (lecture 14).

A fair die is rolled 100 times and the number X of ones and the number Y of twos is counted. What distribution does X have? What distribution does Y have? If Z = X + Y is the total number of ones or twos in the 100 rolls of the die, what distribution does Z have? What is the variance of X, Y and Z? Hence obtain cov(X, Y ) and corr(X, Y ). Answer: 25

Footnote 19: Yes, 3, 1.

Footnote 20: $p_X(0) = 0.7$, $p_X(1) = 0.3$, pr{Y = 0 | X = 1} = 2/3, pr{Y = 1 | X = 1} = pr{X = 1 ∩ Y = 1}/pr{X = 1} = 1/3. E[XY] = 0.1. No.

Footnote 21: $f_X(x) = \int_y f_{XY}(x, y)\,dy = 2x$ for 0 < x < 1. E[XY] = 4/9. Yes.

Footnote 22: Marginal probabilities for X are 0.3, 0.4, 0.3, and for Y they are 0.4, 0.1, 0.5. E[X] = 1, E[Y] = 1.1, Var[X] = 0.6, Var[Y] = 0.89, cov(X, Y) = 0.1, corr(X, Y) = 0.137.

Footnote 23: Var[X] + cov(X, Y) = 2.5.

Footnote 24: $\sigma_X^2 - \sigma_Y^2 = 12$, $\sigma_X^2 + 2\sigma_{XY} + \sigma_Y^2 = 20$, $\sigma_X^2 - 2\sigma_{XY} + \sigma_Y^2 = 16$, so $2\sigma_X^2 + 2\sigma_Y^2 = 36$ and $4\sigma_{XY} = 4$. Thus $\sigma_X^2 = 15$, $\sigma_Y^2 = 3$, $\sigma_{XY} = 1$ and corr(X, Y) = $1/\sqrt{45}$.

Footnote 25: $X \sim \mathrm{Bin}(n = 100, \theta = \tfrac{1}{6})$. Similarly for Y. $Z \sim \mathrm{Bin}(100, \theta = \tfrac{1}{3})$. Var[X] = Var[Y] = 500/36, Var[Z] = 200/9 = $\sigma_X^2 + 2\sigma_{XY} + \sigma_Y^2$. Hence cov(X, Y) = −100/36 so corr(X, Y) = −1/5. Notice X and Y are not uncorrelated. If you have a lot of ones, you would expect fewer twos!
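Marginal and conditional probabilities for a small joint distribution are easy to check by brute force. The sketch below uses the four-point distribution from the lecture 12 question answered in footnote 20; the helper name `marginal` and the tolerance used for the independence check are illustrative choices.

```python
# Joint probabilities pr{X = x, Y = y}, keyed by (x, y).
joint = {(0, 0): 0.2, (0, 1): 0.5, (1, 0): 0.2, (1, 1): 0.1}


def marginal(joint_probs, axis):
    """Sum the joint probabilities over the other variable (axis 0 = X, 1 = Y)."""
    out = {}
    for key, p in joint_probs.items():
        out[key[axis]] = out.get(key[axis], 0.0) + p
    return out


p_x = marginal(joint, 0)
p_y = marginal(joint, 1)

# Conditional distribution of Y given X = 1.
cond_y_given_x1 = {y: joint[(1, y)] / p_x[1] for y in p_y}

e_xy = sum(x * y * p for (x, y), p in joint.items())

# X and Y are independent only if every joint probability factorises.
independent = all(
    abs(p - p_x[x] * p_y[y]) < 1e-12 for (x, y), p in joint.items()
)

print(p_x, cond_y_given_x1, e_xy, independent)
```

Replacing `joint` with another table of probabilities (for example a 3 × 3 table) reuses the same steps without change.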

Question (lecture 15).

Two independent samples gave values 3, 6, 5, 2 for sample 1 and 2, 2, 3, 3, 5 for sample 2. Assuming that the samples come from independent normal distributions with common unknown variance σ^2 , test at the 5% level whether the difference in mean equals zero against the alternative that it does not equal zero. Answer: 31

Question (lecture 15).

Five randomly selected remuneration packages for US oil and gas CEOs in 2008 were (in thousands of US dollars) 21333, 7294, 6712, 5727, 7087. Five randomly selected remuneration packages for US health care CEOs in 2008 were (in thousands of dollars) 14262, 8381, 7245, 10211, 1817. Test at the 5% level whether the difference in mean remuneration equals zero against the alternative hypothesis that it does not equal zero. You can assume that the two populations have common (unknown) variance σ^2. Answer: 32

Question (lecture 16).

A quarter of insurance claims are incomplete in some way. If you have 250 forms to process, what is the approximate probability that you will find fewer than 50 of them incomplete? Answer: 33

Question (lecture 16).

In n = 100 tosses of a coin I obtain X = 72 heads. Obtain an approximate 95% confidence interval for the probability θ of a head. Answer: 34

Question (lecture 17).

In December 2010 two analysts suggested several shares as likely to rise in 2011. By the end of October 2011 one (Neil Woodford) had four out of n 1 = 7 “share tips” showing a rise while the other (Harry Nummo) had three out of n 2 = 10 “share tips” showing a rise. Test at the 5% level whether the two success proportions are significantly different. Answer: 35

$$z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} = \frac{4 - 3}{\sqrt{\tfrac{4}{4} + \tfrac{1}{5}}} = 0.913.$$

Test rule is reject $H_0$ if |z| > 1.96. Thus accept $H_0$ at 5% level.

Footnote 31: $n_1 = 4$, $\bar{x}_1 = 4$, $s_1^2 = 3.333$, $n_2 = 5$, $\bar{x}_2 = 3$, $s_2^2 = 1.5$, pooled estimate of $\sigma^2$ is $s^2 = \dfrac{3s_1^2 + 4s_2^2}{7} = 2.2857$. Testing $H_0: \mu_1 - \mu_2 = 0$ vs. $H_1: \mu_1 - \mu_2 \neq 0$. Test statistic is

$$t = \frac{\bar{x}_1 - \bar{x}_2}{s\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} = \frac{4 - 3}{1.5119 \times \sqrt{\tfrac{1}{4} + \tfrac{1}{5}}} = 0.986.$$

Test rule is reject $H_0$ if $|t| > t_7(2.5\%)$. As $t_7(2.5\%) = 2.365$, accept $H_0$ at 5% level.

Footnote 32: Data source: http://graphicsweb.wsj.com/php/CEOPAY09.html. $n_1 = 5$, $\bar{x}_1 = 9630.6$, $s_1^2 = 43158021$, $n_2 = 5$, $\bar{x}_2 = 8383.2$, $s_2^2 = 20577907$, $n_1 + n_2 - 2 = 8$, $t_8(2.5\%) = 2.306$. If variances are equal to $\sigma^2$, estimate $\sigma^2$ using

$$s^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = 31867964.$$

Test statistic is

$$t = \frac{|\bar{x}_1 - \bar{x}_2|}{\sqrt{s^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} = \frac{1247.4}{3570.324} = 0.349.$$

Since $t_8(2.5\%) = 2.306$, then $|t| < t_8(2.5\%)$ so accept $H_0$ that $\mu_1 = \mu_2$ against the alternative $\mu_1 \neq \mu_2$ at the 5% level.

Footnote 33: If X is the number of incomplete forms, $X \sim \mathrm{Bin}(n = 250, \theta = \tfrac{1}{4}) \approx N(\mu = 62.5, \sigma^2 = 46.875)$. You require

$$\mathrm{pr}\{X < 50\} = \mathrm{pr}\{X \le 49\} = \Phi\left(\frac{49 + \tfrac{1}{2} - \mu}{\sigma}\right) = \Phi(-1.899) = 0.0288.$$

Notice we have used a continuity correction.

Footnote 34: Number of heads $X \sim \mathrm{Bin}(n = 100, \theta)$. Here n = 100, X = 72 observed, $\hat{\theta} = X/n = 72/100 = 0.72$. Approximate 95% confidence interval is

$$\hat{\theta} \pm 1.96\sqrt{\frac{\hat{\theta}(1 - \hat{\theta})}{n}} = 0.72 \pm 0.088.$$

Footnote 35: Data source: http://www.thisismoney.co.uk/money/investing/article-1709914/Stock-market-predict

Question (lecture 17).

In January 2011 Durham police were reported as disappointed by the increase in the number of people arrested for drinking and driving. Between December 1st 2010 and December 31st 2010 they had 52 positive breath tests out of 1799 breath tests administered, while for the same period in 2009 they had 41 positive tests out of 1433 administered. Construct a 95% confidence interval for the difference in proportion of drivers who tested positive. Source: http://www.bbc.co.uk/news/uk-england- Answer: 36

Question (lecture 17).

I observe two dice. For one die I notice that it gives a six 20 times out of 100 and for the second die I notice that it gives a six 22 times out of 80. Test at the 5% level whether the two dice give the same probability of showing a six. Answer: 37

Question (lecture 18).

If $X \sim \chi^2_4$, for what value of x is pr{X > x} = 0.05? Answer: 38

Question (lecture 19).

I roll a die 100 times and observe the following results.

Outcome i            1   2   3   4   5   6
Observed frequency  16  15  16  15  15  23

Test at the 5% level whether the die is fair. Answer: 39

ions-tips-2011.html
Two binomial proportions here. $\hat{\theta}_1 = 4/7 = 0.571$, $\hat{\theta}_2 = 3/10 = 0.300$, $n_1 = 7$, $n_2 = 10$. Common estimated proportion is $\hat{\theta} = \dfrac{7\hat{\theta}_1 + 10\hat{\theta}_2}{17} = 0.412$. Approximate test statistic is

$$z = \frac{|\hat{\theta}_1 - \hat{\theta}_2|}{\sqrt{\hat{\theta}(1 - \hat{\theta})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} = 1.119.$$

Reject $H_0$ at 5% level if |z| > 1.96, so here accept the hypothesis that the two proportions are equal.

Footnote 36: Two binomial proportions again. $\hat{\theta}_1 = 52/1799 = 0.028905$, $\hat{\theta}_2 = 41/1433 = 0.028611$, $n_1 = 1799$, $n_2 = 1433$. Common estimated proportion is $\hat{\theta} = \dfrac{1799\hat{\theta}_1 + 1433\hat{\theta}_2}{3232} = 0.0288$. (This is very small so the normal approximation is doubtful. In practice we would transform to give approximate normality.) Approximate test statistic is

$$z = \frac{|\hat{\theta}_1 - \hat{\theta}_2|}{\sqrt{\hat{\theta}(1 - \hat{\theta})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} = 0.0496.$$

Reject $H_0$ at 5% level if |z| > 1.96, so here accept the hypothesis that the two proportions are equal.

Footnote 37: $n_1 = 100$, $x_1 = 20$, $\hat{\theta}_1 = 20/100 = 0.200$, $n_2 = 80$, $x_2 = 22$, $\hat{\theta}_2 = 22/80 = 0.275$. We test $H_0: \theta_1 = \theta_2\,(= \theta)$ vs. $H_1: \theta_1 \neq \theta_2$. This is equivalent to testing $H_0: \theta_1 - \theta_2 = 0$ vs. $H_1: \theta_1 - \theta_2 \neq 0$. Assuming $H_0$ is true, the common proportion θ is estimated by

$$\hat{\theta} = \frac{n_1\hat{\theta}_1 + n_2\hat{\theta}_2}{n_1 + n_2} = \frac{20 + 22}{180} = 0.2333.$$

Test statistic is

$$z = \frac{\hat{\theta}_1 - \hat{\theta}_2}{\sqrt{\dfrac{\hat{\theta}(1 - \hat{\theta})}{n_1} + \dfrac{\hat{\theta}(1 - \hat{\theta})}{n_2}}} = \frac{0.200 - 0.275}{\sqrt{0.0017889 + 0.0022361}} = -1.18.$$

Test rule is reject $H_0$ if |z| > 1.96, so accept $H_0$ at 5% level.

Footnote 38: From tables, x = 9.488.

Footnote 39: Let X denote the outcome of the die. We test whether pr{X = i} = 1/6 for all i. Expected frequency for any outcome would then be 100 × 1/6 = 16.667.

Outcome i                    1       2       3       4       5       6
Observed frequency Oi       16      15      16      15      15      23
Expected frequency Ei    16.67   16.67   16.67   16.67   16.67   16.67
(Oi − Ei)^2/Ei          0.0267  0.1667  0.0267  0.1667  0.1667   2.407   sum = 2.96
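The goodness-of-fit statistic in the table can be recomputed directly from the observed counts. In the sketch below the critical value 11.07 for a chi-squared distribution with 5 degrees of freedom at the 5% level is an assumption taken from standard tables; it does not appear in this extract, and the variable names are illustrative.

```python
observed = [16, 15, 16, 15, 15, 23]

n = sum(observed)               # 100 rolls
k = len(observed)               # 6 possible outcomes
expected = [n / k] * k          # fair die: 100 / 6 for each face

contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
chi2 = sum(contributions)

df = k - 1
CHI2_5_UPPER_5 = 11.07  # assumed table value: chi-squared, 5 df, upper 5% point
reject_fair = chi2 > CHI2_5_UPPER_5

print([round(c, 4) for c in contributions], round(chi2, 2), df, reject_fair)
```

Under that assumed critical value the statistic is well below the rejection threshold, so these counts would give no evidence against a fair die at the 5% level.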