

Instructions for an activity on inferring and predicting in multiple regression using a dataset of national universities' graduation rates, percent of classes under 20 students, student-faculty ratios, and alumni/ae giving rates. Students are expected to work in teams, complete exercises, and discuss results. The goal is to determine if the given predictors can effectively predict the alumni/ae giving rate and identify statistically significant predictors.
Statistical Applications ACTIVITY 9: Inference and prediction in multiple regression
Why
We fit a regression equation to describe the relation between the predictors and the response in the population. The regression equation gives the best linear fit to the sample data, and the coefficient of determination (R^2) measures how close that fit is to the sample, not to the population. Two issues decide whether the equation is useful. The first is whether the data indicate a linear relation in the population, and which predictors contribute to it. The second, if such a relation exists, is estimating the rate of change of the response with respect to each predictor and using the equation to predict means and individual values.
LEARNING OBJECTIVES
CRITERIA
RESOURCES
PLAN
EXERCISE The table given in Case Problem 3 (pp. 687-688 in your text) gives information on a sample of 48 national universities, showing graduation rate (percent of entering first-year students who graduate from the school), percent of classes with fewer than 20 students, the student-faculty ratio (number of students per faculty member) and the alumni/ae giving rate (percent of alumni/ae who donate to the university). A national fundraising organization wishes to see whether graduation rate, percent of classes under 20 and student-faculty ratio can be used to effectively predict the alumni/ae giving rate. The Minitab printout of the regression calculation is given below.
(a) Give the regression equation (with context, of course) for predicting alumni/ae giving rate based on these three predictors.
(b) What does the equation give as the predicted giving rate for New York University (graduation rate 72%, 63% of classes under 20, student-faculty ratio 13)? What is the residual?
(c) What proportion of variation in alumni/ae giving rate is “explained” by the relation to the three predictors?
(d) The coefficient of “graduation rate” in the equation is 0.748. What does this tell us (be specific: interpret that actual number)?
(e) Set up and carry out [no calculation is necessary, but writing is] the test to determine whether there is evidence of a linear relation useful for predicting alumni/ae giving rate based on the set of three predictors used here. Does it appear there is a significant linear relationship?
(f) Which predictors are shown to be useful [statistically significant] for predicting the giving rate (give the evidence)?
(g) Use the information in the printout (you will have to do some calculation) to give a 95% confidence estimate for the decrease in alumni/ae giving rate for each unit increase in the student-faculty ratio. [Note: the standard error s_bi of each coefficient is given in the printout, in the column SE Coef]
(h) The printout shows the “95% CI” for schools with graduation rate 72%, with 63% of classes under 20 and with student-faculty ratio 13 as “(14.20, 24.77)”. What does this mean? [It certainly isn’t the same as (g), above]
(i) Similarly, the printout shows the “95% PI” for schools with graduation rate 80%, with 45% of classes under 20 and with student-faculty ratio 15 as “(3.26, 35.70)”. What does this mean [why is it different from the preceding result]?
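The arithmetic behind parts (b) and (g) can be sketched in a few lines of Python. This is only an illustration, not part of the Minitab workflow: the coefficients and the standard error are copied from the printout at the end of this activity, the critical value 2.0154 is an assumed t-table value for 44 error degrees of freedom, and the function names are made up for this sketch.

```python
# Coefficients and standard error copied from the Minitab printout.
INTERCEPT = -20.72
B_GRAD = 0.7482
B_UNDER20 = 0.0290
B_SF_RATIO = -1.1920
SE_SF_RATIO = 0.3867

# Assumption: two-sided 95% critical value of t with 44 error degrees of
# freedom, taken from a t table (it does not appear in the printout).
T_CRIT_DF44 = 2.0154


def predict_giving(grad, under20, sf_ratio):
    """Predicted alumni/ae giving rate (percent) from the fitted equation."""
    return INTERCEPT + B_GRAD * grad + B_UNDER20 * under20 + B_SF_RATIO * sf_ratio


def residual(observed, predicted):
    """Residual = observed value minus predicted value."""
    return observed - predicted


def coefficient_ci(coef, se, t_crit):
    """Interval coef +/- t_crit * se for a single regression coefficient."""
    margin = t_crit * se
    return coef - margin, coef + margin


# Part (b): prediction at the New York University values.
nyu_pred = predict_giving(grad=72, under20=63, sf_ratio=13)

# Part (g): 95% interval for the S/FRatio coefficient.
sf_low, sf_high = coefficient_ci(B_SF_RATIO, SE_SF_RATIO, T_CRIT_DF44)

print(f"Predicted giving rate: {nyu_pred:.2f}%")
print(f"95% CI for S/FRatio coefficient: ({sf_low:.3f}, {sf_high:.3f})")
```

The residual in part (b) needs the observed giving rate for New York University from the table in the text, which is not reproduced here, so `residual` is left as a function to call with that value. Both ends of the interval for the S/FRatio coefficient are negative, so the interval reads directly as a range for the decrease in giving rate per one-unit increase in the ratio.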
(a) Give the regression equation.
(b) Compare the usefulness of this regression model to the three-predictor model, using R^2(adj). How does this correspond to your result in part (f) above?
(c) What does this model give as a 95% confidence interval for the decrease in giving rate for each unit increase in student-faculty ratio? Why doesn’t it match the previous result (1g)?
READING ASSIGNMENT (in preparation for next class) Read Sections 15.7 (use of qualitative predictors) and 15.8 (residual analysis)
SKILL EXERCISES (hand in, individually, at next class meeting): p. 645 #24, 25; p. 648 #28, 29, 31 (use Minitab where possible)
Regression Analysis: %Giving versus %Grad, Under 20, S/FRatio
The regression equation is
%Giving = - 20.7 + 0.748 %Grad + 0.029 Under 20 - 1.19 S/FRatio

Predictor      Coef  SE Coef      T   P
Constant     -20.72    17.52  -1.18  0.
%Grad        0.7482   0.1660   4.51  0.
Under 20     0.0290   0.1393   0.21  0.
S/FRatio    -1.1920   0.3867  -3.08  0.
S = 7.60972 R-Sq = 70.0% R-Sq(adj) = 67.9%
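As a check on how the T column is built, each t ratio is the coefficient divided by its standard error. The short Python sketch below recomputes the ratios from the Coef and SE Coef columns and compares each one with an assumed t-table critical value (2.0154, two-sided 95%, 44 error degrees of freedom); that critical value and the variable names are assumptions of this sketch, not part of the printout.

```python
# (Coef, SE Coef) pairs copied from the Minitab coefficient table.
coef_table = {
    "Constant": (-20.72, 17.52),
    "%Grad": (0.7482, 0.1660),
    "Under 20": (0.0290, 0.1393),
    "S/FRatio": (-1.1920, 0.3867),
}

# Assumption: two-sided 95% critical value of t with 44 degrees of freedom,
# read from a t table.
T_CRIT_DF44 = 2.0154

# t ratio = coefficient / standard error of the coefficient.
t_ratios = {name: coef / se for name, (coef, se) in coef_table.items()}

# Terms whose |t| exceeds the critical value are significant at the 5% level.
significant = [name for name, t in t_ratios.items() if abs(t) > T_CRIT_DF44]

for name, t in t_ratios.items():
    print(f"{name:<10} t = {t:6.2f}")
print("Significant at 5%:", significant)
```

Comparing |t| with a critical value leads to the same decision as comparing the P column with 0.05, which is the evidence asked for in part (f).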
Analysis of Variance
Source           DF      SS      MS      F   P
Regression        3  5943.5  1981.2  34.21  0.
Residual Error   44  2547.9     57.
Total            47   8491.
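The summary figures in the printout (F, S, R-Sq and R-Sq(adj)) all follow from the regression and residual sums of squares and their degrees of freedom. The Python sketch below recomputes them as an illustration; the variable names are chosen for this sketch, and the total sum of squares is formed by adding the two components rather than read from the table.

```python
import math

# Sums of squares and degrees of freedom copied from the ANOVA table.
SS_REGRESSION = 5943.5
SS_ERROR = 2547.9
DF_REGRESSION = 3
DF_ERROR = 44

# Totals are computed as the sum of the two components.
ss_total = SS_REGRESSION + SS_ERROR
df_total = DF_REGRESSION + DF_ERROR

# Mean squares and the overall F statistic.
ms_regression = SS_REGRESSION / DF_REGRESSION
ms_error = SS_ERROR / DF_ERROR
f_stat = ms_regression / ms_error

# Standard error of the estimate (S in the printout).
s = math.sqrt(ms_error)

# Coefficient of determination and its adjusted version.
r_sq = SS_REGRESSION / ss_total
r_sq_adj = 1 - (SS_ERROR / DF_ERROR) / (ss_total / df_total)

print(f"F = {f_stat:.2f}")
print(f"S = {s:.4f}")
print(f"R-Sq = {100 * r_sq:.1f}%   R-Sq(adj) = {100 * r_sq_adj:.1f}%")
```

Because the inputs are the rounded values shown in the table, the recomputed S can differ from the printed 7.60972 in the last digits.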