










Practice Examples with Solutions.
The lifetimes of 400 light-bulbs were found to the nearest hour. The results were recorded as follows.
Lifetime (hours)   0–199   200–399   400–599   600–799   800–999   1000–1199   1200–
Frequency            143        97        64        51        14          14      17
Construct a histogram and cumulative frequency polygon for these data. Estimate the percentage of bulbs with lifetime less than 480 hours.
Answer: Lifetimes cannot be negative, so the class intervals are [0, 199.5), [199.5, 399.5), [399.5, 599.5), and so on.
[Figure: histogram of lifetime (hours, 0 to 2000) against frequency per 200-hour class.]
Adjust the height of the rectangle for the “1200–2000” interval to make the histogram area proportional to frequency. If the vertical axis is “frequency per interval of 200 hours”, the height of the [0, 199.5) class is 143 × 200/199.5 = 143.4, to allow for the first class not being of width 200.
Lifetime (hours)       0.0   199.5   399.5   599.5   799.5   999.5   1199.5   1999.5
Cumulative frequency     0     143     240     304     355     369      383      400
Make the cumulative frequency at time zero equal to 0.
[Figure: cumulative frequency polygon of lifetime (hours) against cumulative frequency, with an enlarged view of the 400–600 hour range marking 480 hours.]
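Estimated number of light-bulbs with lifetime less than 480 hours, by linear interpolation between the class boundaries 399.5 and 599.5, is
$$240 + \frac{480 - 399.5}{200} \times 64 \approx 265.8.$$
Required percentage is $100 \times 265.8/400 \approx 66.4\%$.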
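The interpolation step can be checked numerically. The following is a minimal Python sketch, assuming the cumulative frequency polygon is linear between the class boundaries 399.5 and 599.5 (cumulative frequencies 240 and 304, as in the table above); the function and variable names are illustrative only.

```python
# Class boundaries and cumulative frequencies bracketing 480 hours,
# read from the cumulative frequency table in the answer.
lower_bound, upper_bound = 399.5, 599.5
cum_lower, cum_upper = 240, 304
total_bulbs = 400


def interpolate_cumulative(t):
    """Linearly interpolate the cumulative frequency at lifetime t (hours)."""
    fraction = (t - lower_bound) / (upper_bound - lower_bound)
    return cum_lower + fraction * (cum_upper - cum_lower)


estimated_count = interpolate_cumulative(480)
estimated_percent = 100 * estimated_count / total_bulbs
print(round(estimated_count, 2), round(estimated_percent, 1))
```

The same function gives the polygon's value at any lifetime inside the 400–599 class; other classes would need their own boundaries.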
The Christmas cactus Zygocactus truncatus has branches made up of separate segments. For one such cactus the number of segments in each branch were counted.
Number x of segments                  1   2   3   4   5    6   7   8   9
Number of branches with x segments    3   0   6   7   8   18   8   0   2
Construct a cumulative frequency polygon to represent these data.
Answer: The data is discrete so cumulative frequency plot is a step function.
Number x of segments                    1   2   3    4    5    6    7    8    9
Number of branches with ≤ x segments    3   3   9   16   24   42   50   50   52
[Figure: step-function cumulative frequency plot of number of segments against cumulative frequency.]
The following data give one hundred measurement errors made during the mapping of the American state of Massachusetts during the last century.
Error X (in minutes ′ of arc)   (−4, −2]   (−2, 0]   (0, +2]   (+2, +4]   (+4, +6]
Frequency                           10         43        39          5          3
Show that the sample mean and sample standard deviation for these data are $\bar{x} = -0.04'$ and $s = 1.717'$ respectively.
Answer:
Class           Class frequency f   Class mid-point x    f x    f x^2
−4 < x ≤ −2            10                  −3            −30      90
−2 < x ≤ 0             43                  −1            −43      43
 0 < x ≤ +2            39                  +1             39      39
+2 < x ≤ +4             5                  +3             15      45
+4 < x ≤ +6             3                  +5             15      75
Totals              n = 100                               −4     292
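The totals in the table lead to the stated mean and standard deviation through the grouped-data formulas $\bar{x} = \sum f x / n$ and $s^2 = (\sum f x^2 - n\bar{x}^2)/(n-1)$. A short Python sketch of that arithmetic follows; the variable names are illustrative and each class is represented by its mid-point, as in the table.

```python
import math

# Class mid-points and frequencies from the table above.
midpoints = [-3, -1, 1, 3, 5]
frequencies = [10, 43, 39, 5, 3]

n = sum(frequencies)
sum_fx = sum(f * x for f, x in zip(frequencies, midpoints))
sum_fx2 = sum(f * x * x for f, x in zip(frequencies, midpoints))

mean = sum_fx / n
variance = (sum_fx2 - n * mean**2) / (n - 1)  # divisor n - 1 for a sample
sd = math.sqrt(variance)

print(n, sum_fx, sum_fx2, round(mean, 2), round(sd, 3))
```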
A firm investigates the length of telephone conversations of their office staff. Ten consecutive conversations had lengths, in minutes:
10.7, 9.5, 11.1, 7.8, 11.9, 4.1, 10.0, 9.2, 6.5, 9.2.
Derive a 95% confidence interval for the mean conversation length. Test whether the mean length of a conversation is eight minutes.
Answer:
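$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i = \frac{90}{10} = 9 \text{ minutes}.$$
$$s^2 = \frac{1}{n-1}\left\{\sum_{i=1}^{n} x_i^2 - n\bar{x}^2\right\} = \frac{858.74 - 10 \times 9^2}{9} = 5.42.$$
Estimate the population variance $\sigma^2$ by $s^2$, with $s = \sqrt{5.42} = 2.33$. Then
$$\frac{\bar{X} - \mu}{s/\sqrt{n}} \sim t_{n-1}.$$
A 95% confidence interval for $\mu$ is $\bar{x} \pm t_9(2.5\%)\, s/\sqrt{n}$. Here $s/\sqrt{10} = 0.737$ and $t_9(2.5\%) = 2.262$, so the interval is
$$\bar{x} \pm t_9(2.5\%)\frac{s}{\sqrt{10}} = 9 \pm 1.67, \text{ i.e. approximately } (7.33, 10.67).$$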
Since 8 minutes lies inside the 95% confidence interval we would accept $H_0$ in testing $H_0: \mu = 8$ vs. $H_1: \mu \neq 8$ at the 5% significance level.
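The interval and the test can be reproduced directly from the ten conversation lengths. The sketch below is in Python and takes the critical value $t_9(2.5\%) = 2.262$ from the worked answer rather than computing it; all names are illustrative. Because it uses the unrounded value of s, the endpoints may differ from the hand calculation in the last digit.

```python
import math
import statistics

# Conversation lengths in minutes, as listed in the question.
lengths = [10.7, 9.5, 11.1, 7.8, 11.9, 4.1, 10.0, 9.2, 6.5, 9.2]
t_crit = 2.262  # t_9(2.5%), quoted from the worked answer (table value)

n = len(lengths)
mean = statistics.mean(lengths)
s = statistics.stdev(lengths)          # sample standard deviation, divisor n - 1
std_error = s / math.sqrt(n)

half_width = t_crit * std_error
lower, upper = mean - half_width, mean + half_width

# Two-sided t test of H0: mu = 8.
t_stat = (mean - 8) / std_error
accept_h0 = abs(t_stat) < t_crit

print(round(lower, 2), round(upper, 2), round(t_stat, 2), accept_h0)
```

Testing $\mu = 8$ at the 5% level and checking whether 8 lies inside the 95% interval are the same comparison, which is why both give the same conclusion.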
A population has a Poisson distribution but it is not known whether the mean $\mu$ is 1 or 4. To test the hypothesis $H_0: \mu = 1$ vs. $H_1: \mu = 4$ on the basis of one observation X the following test procedure is considered: reject $H_0$ if $X \geq i$. Type I error is defined to be “rejecting $H_0$ when $H_0$ is true”. Find the probability of type I error for the three cases i = 2, 3, 4.
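Answer: If $H_0$ is true, $\mu = 1$ and
$$\operatorname{pr}\{X = x\} = \frac{e^{-1}}{x!}, \quad x = 0, 1, 2, \ldots,$$
so that pr{Type I error} = pr{X ≥ i}. If i = 2,
$$\operatorname{pr}\{\text{Type I error}\} = \operatorname{pr}\{X \geq 2\} = 1 - \operatorname{pr}\{X < 2\} = 1 - \operatorname{pr}\{X = 0\} - \operatorname{pr}\{X = 1\} = 1 - e^{-1} - e^{-1} = 0.264.$$
Similarly, if i = 3,
$$\operatorname{pr}\{\text{Type I error}\} = \operatorname{pr}\{X \geq 3\} = 1 - \operatorname{pr}\{X < 3\} = 0.080.$$
If i = 4, pr{Type I error} = pr{X ≥ 4} = 0.019.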
Notice that an exact 5% or 10% significance level test does not exist for this discrete distribution.
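The three tail probabilities can be verified by summing the Poisson probabilities below the cut-off. A minimal Python sketch, with illustrative function names:

```python
import math


def poisson_pmf(x, mu):
    """pr{X = x} for X ~ Poisson(mu)."""
    return math.exp(-mu) * mu**x / math.factorial(x)


def type_one_error(i, mu0=1.0):
    """pr{X >= i} under H0: mu = mu0, i.e. the probability of rejecting a true H0."""
    return 1.0 - sum(poisson_pmf(x, mu0) for x in range(i))


for i in (2, 3, 4):
    print(i, round(type_one_error(i), 3))
```

Because X takes only integer values, the attainable significance levels jump from about 0.080 to about 0.019, which is why no exact 5% test exists here.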
A sample of size 64 is drawn by simple random sampling from a normal population which has known variance 4. The sample mean is −0.45. Test the hypothesis $H_0: \mu = 0$ vs. $H_1: \mu \neq 0$ at the 5% level of significance. Repeat for testing $H_0: \mu = 0$ vs. $H_1: \mu < 0$.
Answer: Here $\bar{X} \sim N(\mu, \sigma^2/n)$ with $\sigma^2 = 4$, n = 64, so $\sigma^2/n = 0.0625$ and $\bar{X} \sim N(\mu, 0.0625)$. The test statistic is
$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}},$$
where $Z \sim N(0, 1)$ if $H_0$ is true. For $\alpha = 0.05$ with a two-sided test, $z_{\alpha/2} = 1.96$. The critical region is Z < −1.96 or Z > 1.96. The observed value is z = −0.45/0.25 = −1.8. This does not lie in the critical region, so accept $H_0$. For $\alpha = 0.05$ with a one-sided test against $H_1: \mu < 0$, $z_\alpha = 1.645$. The critical region is Z < −1.645. The observed value z = −1.8 lies in the critical region, so reject $H_0$.
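Both decisions can be checked with the standard normal quantiles. The Python sketch below uses `statistics.NormalDist` from the standard library; the one-sided check uses the lower-tail critical region Z < −1.645 that appears in the answer, and the variable names are illustrative.

```python
import math
from statistics import NormalDist

sigma2, n = 4.0, 64          # known variance and sample size
xbar, mu0 = -0.45, 0.0       # observed sample mean and hypothesised mean
alpha = 0.05

z = (xbar - mu0) / math.sqrt(sigma2 / n)

std_normal = NormalDist()    # N(0, 1)
z_two_sided = std_normal.inv_cdf(1 - alpha / 2)   # about 1.96
z_one_sided = std_normal.inv_cdf(1 - alpha)       # about 1.645

reject_two_sided = abs(z) > z_two_sided
reject_lower_tail = z < -z_one_sided

print(round(z, 2), reject_two_sided, reject_lower_tail)
```

The same observed z leads to opposite decisions because the one-sided test places all of the 5% in one tail.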
The absenteeism rates (in days and parts of days) for nine employees of a large company were recorded in two consecutive years.
Employee    1     2      3     4     5      6      7      8     9
Year 1     3.0   6.7   11.3   5.0   9.4   15.7    8.0   10.0   9.
Year 2     2.8   5.1    8.4   5.0   6.2   12.2   10.0    6.8   6.
Is there any evidence that the average absenteeism rate is different for the two years?
Answer: The data are paired, as the same employee is studied in each of the two years. Form the differences $d_i = (\text{year 1})_i - (\text{year 2})_i$. Need to estimate the variance $\sigma_d^2$. Test $H_0: \mu_d = 0$ vs. $H_1: \mu_d \neq 0$. See lecture 6.
Which phrases i-iv below apply to the sample correlation coefficient rXY? (i) measures linear association between two variables, (ii) is never negative, (iii) has positive slope, (iv) depends on the units of measurement of X and Y.
Answer: i only.
The tensile strength of a glued joint is related to the glue thickness. A sample of six values gave the following results:
Glue thickness (inches)   0.12   0.12   0.13   0.13   0.14   0.
Tensile strength (lbs.)   49.8   46.1   46.5   45.8   44.3   45.
Calculate the sample correlation coefficient r for these data. Use the fitted least squares regression line to predict the tensile strength of a joint for a glue thickness of 0.14 inches. Using scatter-diagrams, sketch the form of regression line expected in the three cases when r takes the values −1, 0, and +1.
Joint probabilities p(x, y) are found by summing probabilities for each outcome giving rise to (X = x, Y = y). Thus p(1, 2) = pr{HT T or T T H} = 1/4. Marginal probabilities are found by forming row or column sum. For example
pr{X = 2} = p(2, 1) + p(2, 2) + p(2, 3) =
(c) If X = 1, then
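$$\operatorname{pr}\{Y = y \mid X = 1\} = \frac{p(1, y)}{p_X(1)} = \frac{p(1, y)}{3/8}.$$
Thus
$$\operatorname{pr}\{Y = 1 \mid X = 1\} = \frac{1/8}{3/8} = \frac{1}{3}, \quad \operatorname{pr}\{Y = 2 \mid X = 1\} = \frac{2/8}{3/8} = \frac{2}{3}, \quad \operatorname{pr}\{Y = 3 \mid X = 1\} = 0.$$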
If X = 1, then the outcome is one of HTT, THT, TTH. In one out of these three cases we observe Y = 1 and in two out of three we observe Y = 2.
Suppose X and Y are independent continuous random variables which are each uniformly distributed on the interval (0, 1). (a) Find the probability that 0 < X + Y < z for values z ∈ (0, 2). (b) If Z = X + Y, deduce the form of the probability density function f(z) of Z. Hints: In (a), think about the area on the x-y plane corresponding to 0 < x + y < z. In (b), first find the cumulative distribution function F(z) = pr{Z ≤ z}.
Answer: As X and Y are uniformly distributed on the interval (0, 1) they have pdf
$$f_X(x) = \begin{cases} 1 & \text{if } 0 < x < 1, \\ 0 & \text{otherwise,} \end{cases} \qquad f_Y(y) = \begin{cases} 1 & \text{if } 0 < y < 1, \\ 0 & \text{otherwise.} \end{cases}$$
(a) Joint probability density is f (x, y) = fX (x)fY (y) by independence of X and Y. Hence f (x, y) = 1, a constant, for 0 < x < 1 and 0 < y < 1.
Probability of an event A is volume under pdf with base area given by A. Here A is the region for which 0 < X + Y < z.
Consider the two cases z < 1 and z > 1 separately.
[Figure: the joint density f(x, y) over the unit square, and the region x + y < z shaded within the unit square for the case z < 1 and for the case z > 1.]
From the figure above,
$$\operatorname{pr}\{0 < X + Y < z\} = \begin{cases} \tfrac{1}{2} z^2 & \text{if } 0 < z < 1, \\ 1 - \tfrac{1}{2}(2 - z)^2 & \text{if } 1 \leq z < 2. \end{cases}$$
An alternative derivation uses integration. For example, in the case z < 1,
$$\operatorname{pr}\{0 < X + Y < z\} = \iint_{0 < x + y < z} f(x, y)\,dx\,dy = \int_{y=0}^{z} \int_{x=0}^{z-y} dx\,dy = \int_{y=0}^{z} (z - y)\,dy = \tfrac{1}{2} z^2.$$
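(b) If Z = X + Y, then Z has cumulative distribution function F(z) where, from (a) above,
$$F(z) = \operatorname{pr}\{X + Y \leq z\} = \begin{cases} \tfrac{1}{2} z^2 & \text{if } 0 < z < 1, \\ 1 - \tfrac{1}{2}(2 - z)^2 & \text{if } 1 \leq z < 2. \end{cases}$$
The probability density function for Z is then
$$f(z) = \frac{dF(z)}{dz} = \begin{cases} z & \text{if } 0 < z < 1, \\ 2 - z & \text{if } 1 \leq z < 2. \end{cases}$$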
Z has a triangular distribution on the interval (0, 2) with mode at z = 1.
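The area argument can be checked numerically. The following Python sketch compares the closed-form F(z) with the fraction of a fine midpoint grid on the unit square that satisfies x + y < z; the grid size `m` and the function names are illustrative choices, not part of the original answer.

```python
def cdf_sum_uniform(z):
    """F(z) = pr{X + Y <= z} for independent U(0, 1) variables, 0 < z < 2."""
    if z < 1:
        return 0.5 * z**2
    return 1 - 0.5 * (2 - z) ** 2


def grid_estimate(z, m=400):
    """Fraction of an m-by-m midpoint grid on the unit square with x + y < z.

    Since f(x, y) = 1 on the square, this fraction approximates the probability.
    """
    inside = 0
    for i in range(m):
        x = (i + 0.5) / m
        for j in range(m):
            y = (j + 0.5) / m
            if x + y < z:
                inside += 1
    return inside / (m * m)


for z in (0.5, 1.0, 1.5):
    print(z, cdf_sum_uniform(z), round(grid_estimate(z), 3))
```

The grid estimate agrees with the formula up to an error of order 1/m, for both the triangular region (z < 1) and the square-minus-triangle region (z > 1).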
Measurements of stature were made on each member of a large population of pairs of adult brothers. The height of the elder brother was denoted by X and of the younger brother by Y. Both X and Y had the same mean μ and the same standard deviation σ. The correlation coefficient was ρ. Deduce the mean and variance of (i) U = X − Y , and (ii) V = X + Y. Derive the covariance of U and V.
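Answer: $E[U] = E[X - Y] = E[X] - E[Y] = \mu - \mu = 0$.
$$\operatorname{Var}[U] = \operatorname{Var}[X - Y] = \operatorname{Var}[X] + \operatorname{Var}[Y] - 2\operatorname{cov}(X, Y) = \sigma^2 + \sigma^2 - 2\rho\sigma^2 = 2\sigma^2(1 - \rho),$$
as
$$\operatorname{corr}(X, Y) = \frac{\operatorname{cov}(X, Y)}{\sqrt{\operatorname{Var}[X]\operatorname{Var}[Y]}} \;\Rightarrow\; \operatorname{cov}(X, Y) = \operatorname{corr}(X, Y)\sqrt{\operatorname{Var}[X]\operatorname{Var}[Y]} = \rho\sqrt{\sigma^2\sigma^2} = \rho\sigma^2.$$
$E[V] = E[X + Y] = E[X] + E[Y] = \mu + \mu = 2\mu$.
$$\operatorname{Var}[V] = \operatorname{Var}[X + Y] = \operatorname{Var}[X] + \operatorname{Var}[Y] + 2\operatorname{cov}(X, Y) = \sigma^2 + \sigma^2 + 2\rho\sigma^2 = 2\sigma^2(1 + \rho).$$
$$\operatorname{cov}(U, V) = \operatorname{cov}(X - Y, X + Y) = \operatorname{cov}(X, X) - \operatorname{cov}(Y, Y) + \operatorname{cov}(X, Y) - \operatorname{cov}(Y, X) = \operatorname{Var}[X] - \operatorname{Var}[Y] + \operatorname{cov}(X, Y) - \operatorname{cov}(X, Y) = \sigma^2 - \sigma^2 = 0.$$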
Let $T = a_1 X_1 + a_2 X_2$, where $X_1$ and $X_2$ are uncorrelated random variables with mean $\mu$ and variance $\sigma^2$, and $a_1$ and $a_2$ are constants chosen so that $E[T] = \mu$. Deduce that the choice $a_2 = 1 - a_1$ gives $E[T] = \mu$. In this case prove that the variance of T is a minimum if $a_1 = a_2 = \tfrac{1}{2}$.
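The worked answer to this question does not appear in this excerpt. One possible line of argument, offered only as a sketch, uses linearity of expectation and the fact that the variance of a sum of uncorrelated variables is the sum of the variances, then completes the square in $a_1$:

```latex
\begin{align*}
E[T] &= a_1 E[X_1] + a_2 E[X_2] = (a_1 + a_2)\mu,
  \quad\text{so } a_2 = 1 - a_1 \text{ gives } E[T] = \mu. \\
\operatorname{Var}[T] &= a_1^2 \operatorname{Var}[X_1] + a_2^2 \operatorname{Var}[X_2]
  = \sigma^2\left\{a_1^2 + (1 - a_1)^2\right\} \\
 &= \sigma^2\left\{2\left(a_1 - \tfrac{1}{2}\right)^2 + \tfrac{1}{2}\right\}
  \;\geq\; \tfrac{1}{2}\sigma^2,
\end{align*}
```

with equality exactly when $a_1 = \tfrac{1}{2}$, and hence $a_2 = 1 - a_1 = \tfrac{1}{2}$.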
Answer: Have two independent normal distributions with unknown variances. Wrens: $\bar{x}_1 = 21.18$ mm, $s_1^2 = 0.6418$, $n_1 = 10$. Reed warblers: $\bar{x}_2 = 22.14$ mm, $s_2^2 = 0.4116$, $n_2 = 10$. Assume $\sigma_1^2 = \sigma_2^2 = \sigma^2$ (unknown). Estimate $\sigma^2$ using
$$s^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{9s_1^2 + 9s_2^2}{18} = 0.5267.$$
Also $\bar{x}_1 - \bar{x}_2 = 21.18 - 22.14 = -0.96$,
$$s^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right) = 0.1053, \quad t_{18}(2.5\%) = 2.101.$$
If $\mu_1 = \mu_2$ then the two groups of eggs have the same mean length.
To test $H_0: \mu_1 = \mu_2$ vs. $H_1: \mu_1 \neq \mu_2$ at the 5% level, reject $H_0$ if
$$\frac{|\bar{x}_1 - \bar{x}_2|}{\sqrt{s^2(1/n_1 + 1/n_2)}} \geq t_{18}(2.5\%).$$
Here
$$\frac{|\bar{x}_1 - \bar{x}_2|}{\sqrt{s^2(1/n_1 + 1/n_2)}} = 2.96,$$
so reject the null hypothesis of equal means at the 5% level. The two groups of eggs are significantly different at the 5% level.
This does not necessarily imply cuckoos can control their egg size. It has been proposed that a cuckoo lays its egg in the particular nest for which it is best adapted. For further information see: Wyllie, I. (1981) The Cuckoo. Batsford: London. Davies, N.B. and Brooke, M. Coevolution of the cuckoo and its host, Scientific American, January 1991, p.66-73.
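The pooled two-sample t calculation above can be reproduced from the summary statistics alone. A minimal Python sketch, taking $t_{18}(2.5\%) = 2.101$ from the answer rather than computing it (names are illustrative):

```python
import math

# Summary statistics quoted in the answer (egg lengths in mm).
n1, mean1, var1 = 10, 21.18, 0.6418   # eggs found in wrens' nests
n2, mean2, var2 = 10, 22.14, 0.4116   # eggs found in reed warblers' nests
t_crit = 2.101                        # t_18(2.5%), quoted from the answer

# Pooled estimate of the common variance.
pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)
std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))

t_stat = (mean1 - mean2) / std_error
reject_h0 = abs(t_stat) >= t_crit

print(round(pooled_var, 4), round(std_error**2, 4), round(abs(t_stat), 2), reject_h0)
```

With equal sample sizes the pooled variance is simply the average of the two sample variances.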
For values 1, 3, 4, 5, 6 obtain the sample mean, sample median, sample variance and sample standard deviation. Answer: 1
The number of insurance policies sold by a small firm per week is 7, 8, 5, 6, 6, 7, 9, 5, 7, 8, 4, 7, 6, 7, 7, 5, 8, 6, 7, 6, 6. Obtain the sample mean, sample median, sample variance, sample standard deviation. Check your values using R. Answer: 2
For Z ∼ N(0, 1), calculate pr{Z ≤ 0.55}, pr{Z > 2.25}, pr{Z ≤ −0.15}, pr{−1.50 < Z ≤ 2.25}. Answer: 3
For Z ∼ N(0, 1), calculate pr{Z ≤ 0.63}. Answer: 4
For Z ∼ N(0, 1), determine the value of z such that: pr{Z ≤ z} = 0.8944, pr{Z > −z} = 0.9713, pr{−z < Z ≤ z} = 0.9108. Answer: 5
An advertising company requires all of its job applicants to take a psychometric test. Based on recent studies, it is believed that the test score follows a normal distribution with mean 100 and standard deviation 15. Determine the probability that a job applicant will receive a test score below 118, above 112, between 100 and 112. Answer: 6
If $X \sim t_5$, for what value of x is pr{X > x} = 0.05? Answer: 7
If $T \sim t_8$, for what value t is pr{T > t} = 0.025? For what value t is pr{T ≤ t} = 0.05? Answer: 8
[1] 3.8, 4, 3.7, 1.92.
[2] 6.524, 7.0 (middle ordered value), 1.462, 1.209.
[3] pr{Z ≤ 0.55} = Φ(0.55) = 0.7088, pr{Z > 2.25} = 1 − pr{Z ≤ 2.25} = 1 − Φ(2.25) = 0.0122, pr{Z ≤ −0.15} = 1 − pr{Z ≤ 0.15} = 1 − Φ(0.15) = 0.4404, pr{−1.50 < Z ≤ 2.25} = pr{Z ≤ 2.25} − pr{Z ≤ −1.50} = 0.9210. Recall that pr{Z > z} = 1 − pr{Z ≤ z}, pr{Z < −z} = pr{Z > z} by symmetry, and also pr{X < b} = pr{X < a} + pr{a < X < b}.
[4] Using interpolation in the tables, Φ(0.63) = 0.7356.
[5] pr{Z ≤ 1.25} = 0.8944, pr{Z > −1.90} = pr{Z ≤ 1.90} = 0.9713, pr{−z < Z ≤ z} = Φ(z) − Φ(−z) = 2Φ(z) − 1 = 0.9108 so Φ(z) = 0.9554 and z = 1.70.
[6] 0.8849, 0.2119, 0.2881. Hint: If $X \sim N(\mu, \sigma^2)$, then $\operatorname{pr}\{X \leq x\} = \Phi\left(\frac{x - \mu}{\sigma}\right)$.
[7] From tables, x = 2.015.
[8] $t_8(2.5\%) = 2.306$. pr{T > 1.860} = 0.05 so pr{T ≤ −1.860} = 0.05 by symmetry. Thus t = −1.860.
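The source suggests checking values in R; as an alternative, answers 1, 3 and 6 can be checked with Python's standard `statistics` module. This is a sketch with illustrative variable names, using `NormalDist.cdf` in place of the printed Φ tables.

```python
import statistics
from statistics import NormalDist

# Answer 1: summary statistics for the values 1, 3, 4, 5, 6.
values = [1, 3, 4, 5, 6]
sample_mean = statistics.mean(values)
sample_median = statistics.median(values)
sample_var = statistics.variance(values)   # divisor n - 1
sample_sd = statistics.stdev(values)

# Answer 3: standard normal probabilities.
phi = NormalDist().cdf
p_le_055 = phi(0.55)
p_gt_225 = 1 - phi(2.25)
p_le_minus_015 = phi(-0.15)
p_between = phi(2.25) - phi(-1.50)

# Answer 6: test scores distributed N(100, 15^2).
scores = NormalDist(mu=100, sigma=15)
p_below_118 = scores.cdf(118)
p_above_112 = 1 - scores.cdf(112)
p_100_to_112 = scores.cdf(112) - scores.cdf(100)

print(sample_mean, sample_median, sample_var, round(sample_sd, 2))
print(round(p_le_055, 4), round(p_gt_225, 4), round(p_le_minus_015, 4), round(p_between, 4))
print(round(p_below_118, 4), round(p_above_112, 4), round(p_100_to_112, 4))
```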
Answer: 15
For values (x, y) as given below, obtain the sample correlation r.
x_i   1.1   2.2   3.4    4.5    5.0
y_i   3.3   6.1   7.0   10.4   11.5
Answer: 16
For values (x, y) as given below, obtain the line of regression for y given x. What does the residual at the first data point x 1 = 1.1 equal? If x = 4, what is the predicted value of y?
x_i   1.1   2.2   3.4    4.5    5.0
y_i   3.3   6.1   7.0   10.4   11.5
Answer: 17
For values (x, y) as given below, a line of regression for y given x is fitted.
x_i   1.1   2.2   3.4    4.5    5.0
y_i   3.3   6.1   7.0   10.4   11.5
Test the hypothesis that the slope β equals zero. Answer: 18
Suppose pr{X = x} = x/10 for x = 1, 2, 3, 4. Check that the probability function is valid (is 0 ≤ pr{X = x} ≤ 1 for all x and does $\sum_x \operatorname{pr}\{X = x\} = 1$?). Calculate E[X] and Var[X].
[15] n = 4, $\bar{x} = 4$, $s^2 = 3.333$, $\mu_0 = 1$, $s^2/n = 0.8333$. Test statistic is $t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{4 - 1}{\sqrt{0.8333}} = 3.286$. Test rule is: reject $H_0$ if $|t| > t_3(2.5\%)$. As $t_3(2.5\%) = 3.182$, reject $H_0$ at the 5% level.
[16] $\bar{x} = 3.24$, $s_x^2 = \frac{1}{n-1}\sum (x_i - \bar{x})^2 = \frac{1}{n-1}\left(\sum x_i^2 - n\bar{x}^2\right) = 2.593$,
$\bar{y} = 7.66$, $s_y^2 = \frac{1}{n-1}\sum (y_i - \bar{y})^2 = \frac{1}{n-1}\left(\sum y_i^2 - n\bar{y}^2\right) = 11.033$,
$s_{xy} = \frac{1}{n-1}\sum (x_i - \bar{x})(y_i - \bar{y}) = \frac{1}{n-1}\left(\sum x_i y_i - n\bar{x}\bar{y}\right) = 5.2645$, $r_{XY} = s_{xy}/\sqrt{s_x^2 s_y^2} = 0.984$. Check your answer using R!
    x = c(1.1, 2.2, 3.4, 4.5, 5.0)  # And set up y similarly.
    cor(x, y)
[17] $\bar{x} = 3.24$, $\bar{y} = 7.66$, $s_x^2 = 2.593$, $s_y^2 = 11.033$, $s_{xy} = 5.2645$. Regression line is $y = \alpha + \beta x$ where $\hat{\beta} = s_{xy}/s_x^2 = 2.030$, $\hat{\alpha} = \bar{y} - \hat{\beta}\bar{x} = 1.082$, so the fitted line is y = 1.082 + 2.030x. If $x_1 = 1.1$, predict $\hat{y}_1 = 3.315$. At x = 1.1, the residual is $r_1 = y_1 - \hat{y}_1 = 3.3 - 3.315 = -0.015$. If x = 4, predict y = 9.203. Check your answers using R!
    x = c(1.1, 2.2, 3.4, 4.5, 5.0)  # And set up y similarly.
    lm(y ~ x)             # Gives parameter estimates.
    model = lm(y ~ x)     # Stores regression model output as model.
    model$residuals[1]    # First residual value.
[18] If $H_0: \beta = 0$, then $\hat{\beta}\big/\sqrt{\hat{\sigma}^2/S_{xx}} \sim t_{n-2}$, where $S_{xx} = \sum (x_i - \bar{x})^2 = (n - 1)s_x^2$. Here $\sqrt{\hat{\sigma}^2/S_{xx}} = 0.2105$ where $S_{xx} = (n - 1)s_x^2 = 10.372$. Thus t = 9.646. $t_3(2.5\%) = 3.182$. As |t| > 3.182, reject $H_0$ at the 5% level. Check your answers using R!
    x = c(1.1, 2.2, 3.4, 4.5, 5.0)  # And set up y similarly.
    model = lm(y ~ x)
    summary(model)        # Can you find your answers in the R output?
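For readers without R to hand, the quantities in answers 16 to 18 can also be computed from first principles. The Python sketch below follows the formulas in the footnotes; it assumes the last y value is 11.5, which is inferred from the stated mean $\bar{y} = 7.66$ rather than printed in full in the source, and all names are illustrative.

```python
import math

x = [1.1, 2.2, 3.4, 4.5, 5.0]
y = [3.3, 6.1, 7.0, 10.4, 11.5]  # last value inferred from the stated mean 7.66

n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n

# Corrected sums of squares and products.
sxx = sum((xi - xbar) ** 2 for xi in x)
syy = sum((yi - ybar) ** 2 for yi in y)
sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))

r = sxy / math.sqrt(sxx * syy)            # sample correlation
slope = sxy / sxx                         # beta-hat
intercept = ybar - slope * xbar           # alpha-hat

residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
prediction_at_4 = intercept + slope * 4

# Residual variance on n - 2 degrees of freedom and t statistic for H0: beta = 0.
sigma2_hat = sum(e * e for e in residuals) / (n - 2)
t_slope = slope / math.sqrt(sigma2_hat / sxx)

print(round(r, 3), round(slope, 3), round(intercept, 3))
print(round(residuals[0], 3), round(prediction_at_4, 3), round(t_slope, 3))
```

Dividing both sxy and sxx by n − 1 gives the footnotes' $s_{xy}$ and $s_x^2$; the ratio, and hence the slope, is unchanged.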
Answer: 19
Suppose (X, Y ) take values (0,0), (0,1), (1,0), (1,1) with probabilities 0.2, 0.5, 0.2, 0.1 respectively. Obtain the marginal probabilities for X, and the conditional probabilities for Y given X = 1. Obtain E[XY ]. Are X and Y independent? Answer: 20
Suppose fXY (x, y) = 4xy for 0 < x < 1 and 0 < y < 1. Obtain the marginal pdf fX (x). Obtain E[XY ]. Are X and Y independent? Answer: 21
The table below gives the joint probability function for (X, Y ).
         Y = 0   Y = 1   Y = 2
X = 0     0.1     0.1     0.1
X = 1     0.2     0.0     0.2
X = 2     0.1     0.0     0.2
Obtain the marginal probabilities pX (x) and pY (y) for X and Y. Hence obtain E[X], E[Y ], Var[X], Var[Y ]. Obtain cov(X, Y ) and corr(X, Y ). Answer: 22
If cov(X, Y ) = 0.5 and Var[X] = 2, what is cov(X, X + Y )? Answer: 23
If cov(X + Y, X − Y) = 12, Var[X + Y] = 20 and Var[X − Y] = 16, obtain $\sigma_X^2 = \operatorname{Var}[X]$, $\sigma_Y^2 = \operatorname{Var}[Y]$, $\sigma_{XY} = \operatorname{cov}(X, Y)$ and so obtain corr(X, Y). Answer: 24
A fair die is rolled 100 times and the number X of ones and the number Y of twos is counted. What distribution does X have? What distribution does Y have? If Z = X + Y is the total number of ones or twos in the 100 rolls of the die, what distribution does Z have? What is the variance of X, Y and Z? Hence obtain cov(X, Y ) and corr(X, Y ). Answer: 25
[19] Yes, 3, 1.
[20] $p_X(0) = 0.7$, $p_X(1) = 0.3$, pr{Y = 0 | X = 1} = 2/3, pr{Y = 1 | X = 1} = pr{X = 1 ∩ Y = 1}/pr{X = 1} = 1/3. E[XY] = 0.1. No.
[21] $f_X(x) = \int_y f_{XY}(x, y)\,dy = 2x$ for 0 < x < 1. E[XY] = 4/9. Yes.
[22] Marginal probabilities for X are 0.3, 0.4, 0.3, and for Y they are 0.4, 0.1, 0.5. E[X] = 1, E[Y] = 1.1, Var[X] = 0.6, Var[Y] = 0.89, cov(X, Y) = 0.1, corr(X, Y) = 0.137.
[23] Var[X] + cov(X, Y) = 2.5.
[24] $\sigma_X^2 - \sigma_Y^2 = 12$, $\sigma_X^2 + 2\sigma_{XY} + \sigma_Y^2 = 20$, $\sigma_X^2 - 2\sigma_{XY} + \sigma_Y^2 = 16$, so $2\sigma_X^2 + 2\sigma_Y^2 = 36$ and $4\sigma_{XY} = 4$. Thus $\sigma_X^2 = 15$, $\sigma_Y^2 = 3$, $\sigma_{XY} = 1$ and $\operatorname{corr}(X, Y) = 1/\sqrt{45}$.
[25] $X \sim \operatorname{Bin}(n = 100, \theta = \tfrac{1}{6})$. Similarly for Y. $Z \sim \operatorname{Bin}(100, \theta = \tfrac{1}{3})$. Var[X] = Var[Y] = 500/36, $\operatorname{Var}[Z] = 200/9 = \sigma_X^2 + 2\sigma_{XY} + \sigma_Y^2$. Hence cov(X, Y) = −100/36 so corr(X, Y) = −1/5. Notice X and Y are not uncorrelated. If you have a lot of ones, you would expect fewer twos!
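The argument in answer 25 rests on the identity Var[X + Y] = Var[X] + Var[Y] + 2cov(X, Y), rearranged to isolate the covariance. A short Python sketch using exact fractions (variable names are illustrative):

```python
from fractions import Fraction

n = 100
theta_one = Fraction(1, 6)      # probability of a one (or of a two) on a single roll
theta_either = Fraction(1, 3)   # probability of a one or a two on a single roll

# Binomial variances n * theta * (1 - theta).
var_x = n * theta_one * (1 - theta_one)        # Var[X] = Var[Y]
var_z = n * theta_either * (1 - theta_either)  # Var[Z], where Z = X + Y

# Var[Z] = Var[X] + Var[Y] + 2 cov(X, Y), with Var[Y] = Var[X].
cov_xy = (var_z - 2 * var_x) / 2
corr_xy = cov_xy / var_x        # sqrt(Var[X] Var[Y]) = Var[X] here

print(var_x, var_z, cov_xy, corr_xy)
```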
Two independent samples gave values 3, 6, 5, 2 for sample 1 and 2, 2, 3, 3, 5 for sample 2. Assuming that the samples come from independent normal distributions with common unknown variance σ^2 , test at the 5% level whether the difference in mean equals zero against the alternative that it does not equal zero. Answer: 31
Five randomly selected remuneration packages for US oil and gas CEOs in 2008 were (in thousands of US dollars) 21333, 7294, 6712, 5727, 7087. Five randomly selected remuneration packages for US health care CEOs in 2008 were (in thousands of dollars) 14262, 8381, 7245, 10211, 1817. Test at the 5% level whether the difference in mean remuneration equals zero against the alternative hypothesis that it does not equal zero. You can assume that the two populations have common (unknown) variance σ^2. Answer: 32
A quarter of insurance claims are incomplete in some way. If you have 250 forms to process, what is the approximate probability that you will find fewer than 50 of them incomplete? Answer: 33
In n = 100 tosses of a coin I obtain X = 72 heads. Obtain an approximate 95% confidence interval for the probability θ of a head. Answer: 34
In December 2010 two analysts suggested several shares as likely to rise in 2011. By the end of October 2011 one (Neil Woodford) had four out of n 1 = 7 “share tips” showing a rise while the other (Harry Nummo) had three out of n 2 = 10 “share tips” showing a rise. Test at the 5% level whether the two success proportions are significantly different. Answer: 35
$$z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} = \frac{4 - 3}{\sqrt{\dfrac{4}{4} + \dfrac{1}{5}}} = 0.913.$$
Test rule is: reject $H_0$ if |z| > 1.96. Thus accept $H_0$ at the 5% level.
[31] $n_1 = 4$, $\bar{x}_1 = 4$, $s_1^2 = 3.333$, $n_2 = 5$, $\bar{x}_2 = 3$, $s_2^2 = 1.5$, pooled estimate of $\sigma^2$ is $s^2 = \frac{3s_1^2 + 4s_2^2}{7} = 2.2857$. Testing $H_0: \mu_1 - \mu_2 = 0$ vs. $H_1: \mu_1 - \mu_2 \neq 0$. Test statistic is
$$t = \frac{\bar{x}_1 - \bar{x}_2}{s\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} = \frac{4 - 3}{\sqrt{2.2857}\sqrt{\dfrac{1}{4} + \dfrac{1}{5}}} = 0.986.$$
Test rule is: reject $H_0$ if $|t| > t_7(2.5\%)$. As $t_7(2.5\%) = 2.365$, accept $H_0$ at the 5% level.
[32] Data source: http://graphicsweb.wsj.com/php/CEOPAY09.html. $n_1 = 5$, $\bar{x}_1 = 9630.6$, $s_1^2 = 43158021$, $n_2 = 5$, $\bar{x}_2 = 8383.2$, $s_2^2 = 20577907$, $n_1 + n_2 - 2 = 8$, $t_8(2.5\%) = 2.306$. If the variances are both equal to $\sigma^2$, estimate $\sigma^2$ using
$$s^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = 31867964.$$
Test statistic is
$$t = \frac{|\bar{x}_1 - \bar{x}_2|}{\sqrt{s^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} = \frac{1247.4}{3570.324} = 0.349.$$
Since $t_8(2.5\%) = 2.306$, then $|t| < t_8(2.5\%)$ so accept $H_0$ that $\mu_1 = \mu_2$ against the alternative $\mu_1 \neq \mu_2$ at the 5% level.
[33] If X is the number of incomplete forms, $X \sim \operatorname{Bin}(n = 250, \theta = \tfrac{1}{4}) \approx N(\mu = 62.5, \sigma^2 = 46.875)$. You require
$$\operatorname{pr}\{X < 50\} = \operatorname{pr}\{X \leq 49\} = \Phi\left(\frac{49 + \tfrac{1}{2} - \mu}{\sigma}\right) = \Phi(-1.899) = 0.0288.$$
Notice we have used a continuity correction.
[34] Number of heads $X \sim \operatorname{Bin}(n = 100, \theta)$. Here n = 100, X = 72 observed, $\hat{\theta} = X/n = 72/100 = 0.72$. Approximate 95% confidence interval is
$$\hat{\theta} \pm 1.96\sqrt{\frac{\hat{\theta}(1 - \hat{\theta})}{n}} = 0.72 \pm 0.088.$$
[35] Data source: http://www.thisismoney.co.uk/money/investing/article-1709914/Stock-market-predict
In January 2011 Durham police were reported as disappointed by the increase in the number of people arrested for drinking and driving. Between December 1st 2010 and December 31st 2010 they had 52 positive breath tests out of 1799 breath tests administered, while for the same period in 2009 they had 41 positive tests out of 1433 administered. Construct a 95% confidence interval for the difference in proportion of drivers who tested positive. Source: http://www.bbc.co.uk/news/uk-england- Answer: 36
I observe two dice. For one die I notice that it gives a six 20 times out of 100 and for the second die I notice that it gives a six 22 times out of 80. Test at the 5% level whether the two dice give the same probability of showing a six. Answer: 37
If $X \sim \chi^2_4$, for what value of x is pr{X > x} = 0.05? Answer: 38
I roll a die 100 times and observe the following results.
Outcome i             1    2    3    4    5    6
Observed frequency   16   15   16   15   15   23
Test at the 5% level whether the die is fair. Answer: 39
ions-tips-2011.html
Two binomial proportions here. $\hat{\theta}_1 = 4/7 = 0.571$, $\hat{\theta}_2 = 3/10 = 0.300$, $n_1 = 7$, $n_2 = 10$. Common estimated proportion is $\hat{\theta} = \frac{7\hat{\theta}_1 + 10\hat{\theta}_2}{17} = 0.412$. Approximate test statistic is
$$z = \frac{|\hat{\theta}_1 - \hat{\theta}_2|}{\sqrt{\hat{\theta}(1 - \hat{\theta})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} = 1.119.$$
Reject $H_0$ at the 5% level if |z| > 1.96, so here accept the hypothesis that the two proportions are equal.
[36] Two binomial proportions again. $\hat{\theta}_1 = 52/1799 = 0.028905$, $\hat{\theta}_2 = 41/1433 = 0.028611$, $n_1 = 1799$, $n_2 = 1433$. Common estimated proportion is $\hat{\theta} = \frac{1799\hat{\theta}_1 + 1433\hat{\theta}_2}{3232} = 0.0288$. (This is very small so the normal approximation is doubtful. In practice we would transform to give approximate normality.) Approximate test statistic is
$$z = \frac{|\hat{\theta}_1 - \hat{\theta}_2|}{\sqrt{\hat{\theta}(1 - \hat{\theta})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} = 0.0496.$$
Reject $H_0$ at the 5% level if |z| > 1.96, so here accept the hypothesis that the two proportions are equal.
[37] $n_1 = 100$, $x_1 = 20$, $\hat{\theta}_1 = 20/100 = 0.200$, $n_2 = 80$, $x_2 = 22$, $\hat{\theta}_2 = 22/80 = 0.275$. We test $H_0: \theta_1 = \theta_2\ (= \theta)$ vs. $H_1: \theta_1 \neq \theta_2$. This is equivalent to testing $H_0: \theta_1 - \theta_2 = 0$ vs. $H_1: \theta_1 - \theta_2 \neq 0$. Assuming $H_0$ is true, the common proportion $\theta$ is estimated by $\hat{\theta} = \frac{n_1\hat{\theta}_1 + n_2\hat{\theta}_2}{n_1 + n_2} = \frac{20 + 22}{180} = 0.2333$. Test statistic is
$$z = \frac{\hat{\theta}_1 - \hat{\theta}_2}{\sqrt{\dfrac{\hat{\theta}(1 - \hat{\theta})}{n_1} + \dfrac{\hat{\theta}(1 - \hat{\theta})}{n_2}}} = \frac{0.200 - 0.275}{\sqrt{0.0017889 + 0.0022361}} = -1.18.$$
Test rule is: reject $H_0$ if |z| > 1.96, so accept $H_0$ at the 5% level.
[38] From tables, x = 9.488.
[39] Let X denote the outcome of the die. We test whether pr{X = i} = 1/6 for all i. Expected frequency for any outcome would then be 100 × 1/6 = 16.667.
Outcome i                   1        2        3        4        5        6
Observed frequency O_i     16       15       16       15       15       23
Expected frequency E_i   16.67    16.67    16.67    16.67    16.67    16.67
(O_i − E_i)^2 / E_i     0.0267   0.1667   0.0267   0.1667   0.1667   2.407     sum = 2.96
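The goodness-of-fit statistic in the table can be recomputed directly. The excerpt ends before the conclusion, so the comparison below is an assumption about how the test would be completed: it uses the upper 5% point of the chi-squared distribution on 5 degrees of freedom (six categories minus one), 11.070 from standard tables, which is not quoted in the source. Names are illustrative.

```python
# Observed counts for outcomes 1..6 in 100 rolls, from the question.
observed = [16, 15, 16, 15, 15, 23]

n = sum(observed)
expected = n / len(observed)   # 100 * 1/6 under the hypothesis of a fair die

# Pearson chi-squared statistic: sum of (O - E)^2 / E over the six outcomes.
chi2_stat = sum((o - expected) ** 2 / expected for o in observed)

# Assumption: upper 5% point of chi-squared with 6 - 1 = 5 degrees of freedom,
# taken from standard tables (not stated in the source excerpt).
chi2_crit = 11.070
reject_fair = chi2_stat > chi2_crit

print(round(chi2_stat, 2), reject_fair)
```

Under that assumed critical value the statistic falls well short of the rejection region, so these counts would give no evidence at the 5% level that the die is unfair.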