Midterm Exam with Solutions - Statistical Methods I | STAT 251, Study notes of Data Analysis & Statistical Methods

Material Type: Notes; Professor: Clark; Class: Statistical Methods I; Subject: Statistics; University: Hollins University; Term: Fall 2008;

Uploaded on 08/18/2009

Stat 251 – Midterm Solutions
October 8, 2008
Please write clearly on this paper, using the back of a page if you need
additional space. You may use your text, handouts, notes, calculator,
Minitab, Excel, and Java applets.
For questions that call for calculations, present your method of solution in a
clear, well-labeled manner and show the details of your calculations. For
questions that ask for interpretations and explanations, explain your
answers fully unless instructed otherwise. For problems with multiple
parts, be aware that you can usually complete later parts successfully
whether or not you do so on earlier parts. If one part of a question does use
the answer to an earlier part that you have not been able to answer, you
may use a suitable symbol in place of the answer in working the later part
of the question. If you made an approximation or an assumption, be sure to
note when you have done so, and why.
Don’t spend too long on one question. Tentative point values have been
given to each problem to help you manage your time.
If any questions arise during the exam, or you need any terminology
clarified, please do not hesitate to ask!
You have 60 minutes to complete the first six problems. The last four
problems are to be finished on a take-home basis and handed in by 10:20
am on Monday, October 13, 2008. You are to work completely
independently on all parts of the exam, and you may not use any aids other
than those mentioned above.
Name:


cannot conclude that sleep deprivation is the cause of the increased likelihood of crashing, because this is an observational study.

(11 pts) Biologists studied the relative brain sizes (measured as brain weight divided by body weight, times 1000) for 96 species of mammals. The species were also classified by whether their average litter size is less than 2 or not. Summary statistics are below:

Variable              Average litter size   N    Mean    StDev   Q1      Median   Q3
relative brain size   2 or more             45   10.97   9.84    3.32    7.97     18.
relative brain size   under 2               51   6.886   5.460   2.480   5.000    10.

A simulation was used to produce 1000 repetitions where the 96 brain sizes were randomly assigned to the 2 litter size groups.

a) (5 pts) Based on these simulation results, would you consider the increase in average brain sizes for the larger litters to be statistically significant? Explain by estimating and interpreting the p-value.

The observed difference in group means = 6.89 - 10.97 = -4.08.

So the p-value ≈ (1 + 0 + 6) / 1000 = .007.

With such a small p-value, we conclude that the difference is statistically significant. It would be very surprising to find a difference of -4.08 or less (at least as extreme) if there were no real difference in average brain size between the two groups.
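The randomization procedure behind this p-value can be sketched in Python. This is only an illustration: the function name `randomization_p_value` and the twelve data values are hypothetical (the exam's 96 brain sizes are not reproduced here), and the fixed seed is an assumption made so the run is repeatable.

```python
import random

def randomization_p_value(group_a, group_b, reps=1000, seed=0):
    """One-sided randomization test for mean(group_a) - mean(group_b).

    Returns the observed difference and the fraction of random
    re-assignments whose difference is at least as low as the observed one.
    """
    rng = random.Random(seed)  # fixed seed (assumption) so the sketch is repeatable
    n_a = len(group_a)
    pooled = list(group_a) + list(group_b)

    def mean_diff(values):
        first, second = values[:n_a], values[n_a:]
        return sum(first) / len(first) - sum(second) / len(second)

    observed = mean_diff(pooled)
    count = 0
    for _ in range(reps):
        rng.shuffle(pooled)  # randomly re-assign all values to the two groups
        if mean_diff(pooled) <= observed + 1e-12:  # small tolerance for float rounding
            count += 1
    return observed, count / reps

# Hypothetical relative brain sizes (NOT the study's data)
under_2 = [2.1, 3.4, 4.0, 5.2, 6.3, 7.7]
two_or_more = [6.8, 9.5, 11.2, 14.1, 16.0, 19.4]
observed, p_value = randomization_p_value(under_2, two_or_more)
print(f"observed difference = {observed:.2f}, approximate p-value = {p_value:.3f}")
```

The difference is taken as (under 2) minus (2 or more), matching the sign convention in the solution, so "at least as extreme" means "at or below the observed value".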

b) (3 pts) The previous study was operationally identical to that of another study and the results of the two studies were combined. The sample sizes were now roughly twice as large in each group, and the other summary statistics remained similar to the values listed above. Without calculating, would the p -value for this combined study be larger, smaller, or approximately the same as that in (a)? Explain your reasoning.

  1. (14 pts) Consider two basketball players, Alice and Bree. Suppose that for games played at Hollins, Alice successfully makes 40 of her 50 free throws, while Bree successfully makes 9 of her 10 free throws. Suppose that for games played away from Hollins, Alice successfully makes 2 of her 10 free throws, while Bree successfully makes 25 of her 50 free throws.

a) (3 pts) For games played at Hollins, which player successfully makes a higher proportion of free throws? Justify your answer with appropriate calculations.

Alice makes 40/50 = .800, Bree makes 9/10 = .900.

Bree makes a higher proportion.

b) (3 pts) For games played away from Hollins, which player successfully makes a higher proportion of free throws? Justify your answer with appropriate calculations.

Alice makes 2/10 = .200, Bree makes 25/50 = .500.

Bree makes a higher proportion.

c) (4 pts) Now combine games played at Hollins and away from Hollins. When these games are combined, which player successfully makes a higher proportion of her free throws? Justify your answer with appropriate calculations.

Alice makes 42/60 = .700, Bree makes 34/60 ≈ .567.

Alice makes a higher proportion.

d) (4 pts) Explain why Simpson’s paradox occurs here. (Be sure that you do more than describe the paradox; be sure to explain why it happens in this case.) Base your explanation on the data provided.

Both players have a higher success proportion at home than away. Alice gets most of her opportunities at home (50 of 60), and Bree gets most of her opportunities away (also 50 of 60). So, even though Bree does better than Alice at both places, Alice does better overall because she gets most of her opportunities where the success proportions are higher.
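The reversal can be checked numerically. The sketch below uses the counts from the problem statement; the dictionary layout and the helper names (`rate`, `combined_rate`, `weighted_rate`) are illustrative choices, not part of the exam.

```python
# (made, attempted) counts taken from the problem statement
shots = {
    "Alice": {"home": (40, 50), "away": (2, 10)},
    "Bree": {"home": (9, 10), "away": (25, 50)},
}

def rate(player, site):
    """Success proportion for one player at one site."""
    made, attempted = shots[player][site]
    return made / attempted

def combined_rate(player):
    """Success proportion with home and away games pooled."""
    made = sum(m for m, _ in shots[player].values())
    attempted = sum(a for _, a in shots[player].values())
    return made / attempted

def weighted_rate(player):
    """Combined rate rebuilt as a weighted average of the per-site rates."""
    total = sum(a for _, a in shots[player].values())
    return sum((a / total) * rate(player, site)
               for site, (_, a) in shots[player].items())

for player in shots:
    print(player, rate(player, "home"), rate(player, "away"),
          round(combined_rate(player), 3))
```

`weighted_rate` makes the mechanism explicit: each player's combined proportion is her home and away proportions weighted by where she took her shots, and the two players' weights are nearly opposite (50/60 at home for Alice, 50/60 away for Bree).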

Stat 251 – Midterm

Take-Home Questions:

This portion of the midterm is due at 10:20 am Monday, October 13th – no

exceptions. You are to work completely independently on this portion of the

exam – that means you cannot get help from any other person, or from

another person’s notes, or from the web, etc. You may use your text,

handouts, notes, calculator, Minitab, Excel, and Java applets. You may

consult me if you have questions or if you need terminology clarified – but I

am unlikely to be available Sunday evening – so start the exam TODAY!

For questions 8-10, present your method of solution in a clear, well-labeled

manner and show ALL the details of your calculations. For questions that

ask for interpretations and explanations, explain your answers in excessive

detail. If you made an approximation or an assumption, be sure to note

when and how you have done so, and why. DO NOT expect me to make any

assumptions, or figure out anything that you have done.

If you use Minitab, Excel or an Applet, explain each and every step –

including every command and macro you used. If you decide to ‘explain’ by

submitting a copy of a Minitab or Excel file or a screen shot – please make

sure that the file/screen shot shows everything I could possibly need to see.

(For example – the Minitab session window does not display macros!) Bald

answers (numbers with no supporting work) will receive NO CREDIT on this

portion of the exam – even if they are correct to 10 decimal places.

Enjoy! 

  1. Suppose that the subjects in an experiment are to be randomly divided into two groups.

a) (3 pts) Suppose that there are 8 subjects, of whom 4 are men and 4 are women. Determine the probability that 2 subjects of each gender are assigned to each group.

This is hypergeometric with N = 8, M = 4, n = 4, and we want P(x = 2).

P(x = 2) = C(4,2)×C(4,2) / C(8,4) = 36/70 ≈ .514.
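The hypergeometric probability for part a) can be evaluated directly with binomial coefficients. In this sketch the function name `hypergeom_pmf` is a hypothetical helper; its parameter names follow the N, M, n notation used in the solution.

```python
from math import comb

def hypergeom_pmf(x, N, M, n):
    """P(x successes) when n items are drawn without replacement
    from N items, M of which are successes."""
    return comb(M, x) * comb(N - M, n - x) / comb(N, n)

# 8 subjects, 4 men; one group of 4 is drawn; we want exactly 2 men in it
p = hypergeom_pmf(2, N=8, M=4, n=4)
print(round(p, 4))
```

Choosing one group of 4 fixes the other group as well, so 2 men in the drawn group means 2 subjects of each gender in both groups.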

b) (3 pts) Now consider the general case that there are 4N subjects, of whom 2N are men and 2N are women. Derive an expression for the probability that N subjects of each gender are assigned to each group, as a function of N.

This is hypergeometric with population size 4N, M = 2N, n = 2N, and we want P(x = N).

P(x = N) = C(2N,N)×C(2N,N) / C(4N,2N)

c) (3 pts) Produce and submit a graph of your function from b), for values of N ranging from 1 to 10. Does the function appear to be increasing or decreasing? Explain why this makes sense.

This function is decreasing, which makes sense because as N gets larger, it becomes less and less likely that exactly half of the women and half of the men end up in each group. (C(4N,2N) increases much more rapidly than does C(2N,N)^2.)

[Table and graph of N versus P(x = N) for N = 1 to 10: values not preserved in this copy.]
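The values of C(2N,N)×C(2N,N) / C(4N,2N) for N = 1 to 10 can be tabulated with a few lines of Python. This is a sketch for checking the trend, not the table submitted with the exam; `p_balanced` is a hypothetical function name.

```python
from math import comb

def p_balanced(N):
    """P(x = N): exactly N men and N women in each group of 2N,
    when 4N subjects (2N men, 2N women) are split at random."""
    return comb(2 * N, N) ** 2 / comb(4 * N, 2 * N)

table = [(N, p_balanced(N)) for N in range(1, 11)]
for N, p in table:
    print(f"{N:2d}  {p:.4f}")
```

Plotting the second column against the first (in Minitab, Excel, or any plotting tool) gives the graph the question asks for.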

  1. (33 pts) A psychology experiment investigated whether people display more creativity when they are thinking about intrinsic or extrinsic motivations. The subjects were 47 people with extensive experience with creative writing. They were randomly assigned to one of two groups: one group answered a survey about intrinsic motivations for writing (such as the pleasure of self-expression) and the other group answered a survey about extrinsic motivations (such as public recognition). Then all subjects were instructed to write a Haiku poem, and these poems were evaluated for creativity by a panel of judges. The researchers conjectured that subjects who were thinking about intrinsic motivations would display more creativity than subjects who were thinking about extrinsic motivations. The creativity scores are in the Minitab worksheet creativity.mtw.

Note that the data appear in both “stacked” format (c1 and c2) and unstacked format (c4 and c5).

a) (4 pts) Which group (intrinsic or extrinsic) tended to achieve higher creativity scores? Report the values of appropriate summary statistics (i.e., measures of center) to support your answer. (Do not bother to write a paragraph or even a sentence.)

The intrinsic group tended to have higher creativity scores. This is seen in comparing the means

(19.883 vs. 15.74) and medians (20.4 vs. 17.2).

b) (4 pts) Which group (intrinsic or extrinsic) tended to have higher variability in creativity scores? Report the values of appropriate summary statistics (i.e., measures of spread) to support your answer. (Do not bother to write a paragraph or even a sentence.)

The extrinsic group tended to have higher variability in creativity scores. This is seen in

comparing the standard deviations (5.25 vs. 4.24) and interquartile ranges (7.200 vs. 5.225).

c) (2 pts) Do any of the creativity scores in either group show up as outliers on boxplots? If so, identify the values of the outliers. (You do not need to conduct an outlier test by hand; just create and look at the boxplots.)

There are no outliers in either group.
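For reference, the usual boxplot convention flags values more than 1.5 interquartile ranges beyond the quartiles. The sketch below applies that rule to hypothetical scores (not the creativity.mtw data). It assumes Python's default "exclusive" quartile method; Minitab's quartile convention may differ slightly, so treat this as an approximation of what the boxplot shows.

```python
from statistics import quantiles

def boxplot_outliers(values):
    """Return the values lying beyond the 1.5 * IQR fences."""
    q1, _, q3 = quantiles(values, n=4)  # default method="exclusive"
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    return [v for v in values if v < low_fence or v > high_fence]

# Hypothetical scores for illustration only
print(boxplot_outliers([1, 2, 3, 4, 5, 6, 7, 8, 9, 100]))
print(boxplot_outliers([1, 2, 3, 4, 5]))
```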

The values in c10 are differences in group means, obtained from simulating the random assignment process 10,000 times, assuming no difference between the intrinsic and extrinsic motivation groups.

d) (6 pts) Use this column of simulation results to approximate the p-value of the randomization test. Describe how you do this, as well as reporting the approximate p-value.

We need to count how many of the 10,000 values in c10 (differences in group means) are at least as large as 19.883 - 15.74 = 4.143. This count turns out to be 25, which you can find by typing: let c11=(c10>=4.143) and then summing the 0/1 indicator values in c11. The approximate p-value is therefore 25/10,000 = .0025.

Or: let c11 = (c10 <= -4.143), in which case you will find an approximate p-value of
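The counting step has a direct Python analogue: build a 0/1 indicator for each simulated difference and average it. The list `sim_diffs` below is a hypothetical stand-in for column c10, and `tail_p_value` is an illustrative name.

```python
def tail_p_value(sim_diffs, observed):
    """Fraction of simulated differences at least as large as the observed one."""
    indicator = [1 if d >= observed else 0 for d in sim_diffs]  # like c11 = (c10 >= observed)
    return sum(indicator) / len(sim_diffs)

# Hypothetical simulated differences (stand-in for c10)
sim_diffs = [-1.0, 0.5, 4.2, 5.0, 0.0]
print(tail_p_value(sim_diffs, 4.143))

# With the counts reported in the solution: 25 of 10,000
print(25 / 10_000)
```

Counting the opposite tail instead (values at or below -4.143) amounts to calling `tail_p_value([-d for d in sim_diffs], 4.143)`.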

e) (8 pts) Summarize the conclusion that you would draw from this study. Be sure to address the issue of cause-and-effect as well as the issue of statistical significance.

The very small p-value provides very strong evidence that the intrinsic motivation group has

significantly higher scores on average than the extrinsic motivation group. Because this is a

randomized experiment, we can further conclude that the intrinsic motivation causes an

increase in creativity scores.

f) (3 pts) Suppose that you were asked to write a Minitab macro to conduct this simulation analysis. What would the first line of this macro be (i.e., the line that performs the random assignment of subjects to groups)?

sample 47 c2 c