
Regression: Model Fit Measures

1. Coefficient of Multiple Correlation R and Coefficient of Determination R^2

As previously noted, one measure of model fit—how well the regression model is able to reproduce the observed scores on the dependent variable Y—is the simple Pearson's correlation between observed Y and predicted Y'.

R = Pearson’s correlation, r, between Y and Y'

The closer R is to 1.00, the better the regression model is able to reproduce Y; the closer R is to 0.00, the worse the model performs in reproducing Y. While R may be negative, this is neither expected nor likely; one anticipates R to be positive since the regression model is designed to predict Y as well as possible given the data.

The coefficient of determination, R^2, is simply R squared:

R^2 = R × R = proportion of variance in Y predicted (or explained) by regression model

The coefficient of determination may be interpreted as the proportional reduction in error resulting from use of the regression model to predict Y. Another interpretation of the coefficient of determination is explained variance—the proportion of variance in Y explained, or predicted, by the regression model. The complement of this, 1 − R^2, is the amount of variance in Y that is not explained or predicted by the regression model.

1 − R^2 = proportion of variance in Y not explained by regression model
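
For concreteness, a minimal Python sketch of these definitions (not part of the original handout; the scores below are made-up illustration values, not the course data). A least-squares line is fit with numpy only so that predicted scores Y' are available:

```python
# Sketch: R as the Pearson correlation between observed Y and predicted Y',
# and R^2 as explained variance / proportional reduction in error.
import numpy as np

x = np.array([46.0, 47.0, 62.0, 33.0, 50.0])   # hypothetical predictor scores
y = np.array([3.0, 4.4, 4.5, 3.7, 4.8])        # hypothetical observed Y

# Fit a least-squares line so that predicted scores Y' are available
b1, b0 = np.polyfit(x, y, deg=1)               # slope, intercept
y_hat = b0 + b1 * x                            # predicted Y'

# R = Pearson's r between observed Y and predicted Y'
R = np.corrcoef(y, y_hat)[0, 1]
R2 = R ** 2                                    # coefficient of determination

# Proportional-reduction-in-error view of R^2 (equals R^2 for a
# least-squares fit with an intercept)
ss_total = np.sum((y - y.mean()) ** 2)         # error predicting Y by its mean
ss_resid = np.sum((y - y_hat) ** 2)            # error using the regression model
pre = 1 - ss_resid / ss_total

print(f"R = {R:.3f}, R^2 = {R2:.3f}, PRE = {pre:.3f}, unexplained = {1 - R2:.3f}")
```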

Recall the student ratings data:

Table 1: Student Ratings and Course Grades Data

Course    Quarter    Year    Student Ratings (mean ratings for course)    Percent A's
EDR852    FALL       1994    3.00                                         46.00
EDR761    FALL       1994    4.40                                         47.00
EDR761    FALL       1993    4.40                                         53.00
EDR751    SUMM       1994    4.50                                         62.00
EDR751    SUMM       1994    4.90                                         64.00
EDR761    SPRI       1994    4.40                                         50.00
EDR751    SPRI       1994    3.70                                         33.00
EDR751    WINT       1994    3.30                                         25.00
EDR751    WINT       1994    4.40                                         53.00
EDR751    FALL       1993    4.80                                         50.00
EDR751    SUMM       1993    4.80                                         54.00
EDR751    SUMM       1993    3.80                                         60.00
EDR751    SPRI       1993    4.60                                         54.00
EDR761    SPRI       1993    4.10                                         37.00
EDR751    WINT       1993    4.20                                         53.00
EDR751    FALL       1992    3.50                                         41.00
EDR751    FALL       1992    3.80                                         47.00

SPSS Data File: http://www.bwgriffin.com/gsu/courses/edur8132/notes/student_ratings.sav

1. What is the coefficient of multiple correlation value for the student ratings data; that is, what is the correlation between observed ratings (Y) and predicted ratings (Y')?
2. What is the coefficient of determination value?
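
As a rough check on these two questions, here is a Python sketch (not from the handout) that enters the Table 1 values and, assuming mean student ratings (Y) are regressed on Percent A's as the single predictor, computes R and R^2; variable names are chosen for illustration only:

```python
# Sketch for Questions 1 and 2: regress mean ratings (Y) on Percent A's
# (assumed sole predictor), then correlate observed Y with predicted Y'.
import numpy as np

ratings = np.array([3.00, 4.40, 4.40, 4.50, 4.90, 4.40, 3.70, 3.30, 4.40,
                    4.80, 4.80, 3.80, 4.60, 4.10, 4.20, 3.50, 3.80])
pct_a   = np.array([46.0, 47.0, 53.0, 62.0, 64.0, 50.0, 33.0, 25.0, 53.0,
                    50.0, 54.0, 60.0, 54.0, 37.0, 53.0, 41.0, 47.0])

# Least-squares fit of ratings on percent A's
b1, b0 = np.polyfit(pct_a, ratings, deg=1)
predicted = b0 + b1 * pct_a                     # Y'

# Question 1: coefficient of multiple correlation, R = r(Y, Y')
R = np.corrcoef(ratings, predicted)[0, 1]

# Question 2: coefficient of determination, R^2
print(f"R = {R:.4f}, R^2 = {R ** 2:.4f}")
```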

2. Residuals and Model Fit: SEE and MSE

Recall that a residual, or error, is the difference between observed Y and predicted Y':

e = Y - Y'

One way to measure model fit is to examine variation in residuals.

From basic statistics, note that variance in raw data may be calculated for the population as

σ^2 = Σ(Y − Ȳ)^2 / N

and variance for sample data may be calculated as

s^2 = Σ(Y − Ȳ)^2 / (n − 1)

The difference between these formulas is the degrees of freedom. In the population case the count of all observations, N, is used, but in the sample formula n − 1 degrees of freedom is used (to provide an unbiased estimate of σ^2).
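
A small Python sketch (illustration only, with made-up scores) showing the two denominators side by side:

```python
# Population vs. sample variance: same sum of squares, different denominator.
import numpy as np

y = np.array([3.0, 4.4, 4.5, 3.7, 4.8])        # hypothetical scores
ss = np.sum((y - y.mean()) ** 2)               # sum of squared deviations from the mean

var_population = ss / len(y)                   # divide by N        (same as np.var(y))
var_sample     = ss / (len(y) - 1)             # divide by n - 1    (same as np.var(y, ddof=1))

print(var_population, var_sample)
```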

The variance for residuals may also be calculated in the same manner, taking into account the regression model degrees of freedom, n − k − 1 (where k is the number of predictors):

σ̂^2 = Σ(Y − Y')^2 / (n − k − 1) = MSE

The above produces a variance that has many names:

variance error of residuals, or variance error of estimate, or mean squared error (MSE)

and is denoted as σ̂^2 or MSE.

The square root of MSE, √MSE, is conceptually the standard deviation of the residuals, but since these data are residuals, or errors, √MSE is known as the standard error of residuals or standard error of estimate and is symbolized as

σ̂ = √MSE = SEE (standard error of estimate)

Note that as SEE and MSE become smaller, the fit of the model is better since the residuals are smaller.
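
A short Python sketch (not from the handout) pulling these pieces together for hypothetical observed and predicted scores; here k is taken to be the number of predictors, matching the degrees-of-freedom convention used above:

```python
# Sketch: MSE and SEE from residuals for hypothetical observed (Y) and
# predicted (Y') scores.
import numpy as np

y     = np.array([3.00, 4.40, 4.40, 4.50, 4.90])   # hypothetical observed Y
y_hat = np.array([3.40, 4.10, 4.50, 4.60, 4.70])   # hypothetical predicted Y'
k = 1                                              # one predictor in this sketch

residuals = y - y_hat                              # e = Y - Y'
n = len(y)

mse = np.sum(residuals ** 2) / (n - k - 1)         # mean squared error (variance error of estimate)
see = np.sqrt(mse)                                 # standard error of estimate

print(f"MSE = {mse:.4f}, SEE = {see:.4f}")
```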