Module 12: Intuition: R² vs standard error

ShubhankarYadav
Forum Newbie

Group: Awaiting Activation
Posts: 4, Visits: 155
NEAS - 2/13/2010 9:48:03 AM

Module 12: Statistical inference for multiple linear regression

 

(The attached PDF file has better formatting.)

 

Intuition: R² vs standard error

 

Jacob: We have two measures of goodness-of-fit: R² and the standard error of the regression. Do they measure the same thing? Is one a function of the other, like the t statistic and the F statistic for the two-variable regression model?

 

Rachel: The standard error measures the unexplained variation: it is based on the residual sum of squares (also called the error sum of squares), which is what is left over after the regression. The R² measures the percentage of the total sum of squares that is explained by the regression.
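In symbols, using the standard OLS definitions (the notation n for the number of observations and k for the number of explanatory variables is mine, not in the original post):

    TSS = Σ (yᵢ − ȳ)²                  total sum of squares
    RSS = Σ êᵢ²                        residual (error) sum of squares
    R²  = 1 − RSS / TSS
    s   = √( RSS / (n − k − 1) )       standard error of the regression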

 

If the total sum of squares is held constant, the R² and the standard error measure the same thing. If the total sum of squares differs for two regression equations, the equation with the larger total sum of squares may have the same R² but a higher standard error.
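A minimal numerical sketch of this point, assuming numpy (the data and variable names are illustrative, not from the original post): both fits below have the same R², but the fit with the larger total sum of squares has the larger standard error.

    import numpy as np

    def fit_stats(x, y):
        # One-variable OLS fit; returns TSS, R-squared, and the standard error s.
        n, k = len(y), 1                       # k = number of explanatory variables
        slope, intercept = np.polyfit(x, y, 1)
        resid = y - (intercept + slope * x)
        tss = np.sum((y - y.mean()) ** 2)      # total sum of squares
        rss = np.sum(resid ** 2)               # residual sum of squares
        return tss, 1 - rss / tss, np.sqrt(rss / (n - k - 1))

    rng = np.random.default_rng(0)
    x = np.arange(10, dtype=float)
    y1 = 2 + x + 0.5 * rng.normal(size=10)
    y2 = 4 * y1                                # rescaled: 16x the TSS, same R-squared

    for label, y in (("y1", y1), ("y2", y2)):
        tss, r2, s = fit_stats(x, y)
        print(f"{label}: TSS = {tss:8.2f}, R² = {r2:.4f}, s = {s:.4f}")
    # Both fits print the same R², but y2 (the larger TSS) has four times the s.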

 

Jacob: Why would the total sum of squares be the same for two regression equations? The total sum of squares is the sum of the squared deviations of the Y values from their mean. The Y values are stochastic, so why should the total sum of squares be the same for different regressions?

 

Rachel: Suppose we seek to explain a phenomenon, such as the scores on an actuarial exam. We consider several explanatory variables, such as the hours the candidate studies, the candidate's college grades, the actuarial courses taken, and the years of work experience.

 

•	Given a regression equation, we assume the Y values are stochastic. In theory, the regression is an experiment: we choose values for the independent variable, and we observe the resulting values of the dependent variable.
•	In practice, we begin with the observed scores. We hypothesize explanations, and we test each explanatory variable.

 

 

If the total sum of squares is fixed before we begin forming regression equations, the R² and the standard error measure the same thing. If we start in the opposite direction (first choosing values of the independent variable and then observing the Y values), the total sum of squares is not fixed.
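Combining the definitions above shows the equivalence directly: with n, k, and the TSS held fixed, either statistic determines the other, since

    s = √( (1 − R²) × TSS / (n − k − 1) )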

 

Jacob: If the total sum of squares is not fixed, which is the better measure of goodness-of-fit, the R² or the standard error?

 

Rachel: If the residuals have a normal distribution with a constant variance, the standard error should not depend on the dispersion of the X values. The squared standard error, s², is an unbiased estimate of σ², which is assumed to be constant. If the variance is not constant, a wider dispersion of X values may give a higher σ², and hence a higher standard error.
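A simulation sketch of this point, assuming numpy (the design and numbers are illustrative, not from the original post): with constant error variance, the average s² matches σ² whether the X values are narrowly or widely dispersed; when the error variance grows with |x|, the wider X design inflates s².

    import numpy as np

    rng = np.random.default_rng(42)

    def mean_s2(x, hetero, trials=2000):
        # Average s^2 of a one-variable OLS fit over many simulated samples.
        n, k = len(x), 1
        sigma = 0.5 * np.abs(x) if hetero else 1.0   # error s.d. per observation
        total = 0.0
        for _ in range(trials):
            y = 1 + 2 * x + rng.normal(scale=sigma, size=n)
            slope, intercept = np.polyfit(x, y, 1)
            rss = np.sum((y - intercept - slope * x) ** 2)
            total += rss / (n - k - 1)
        return total / trials

    x_narrow = np.linspace(-1, 1, 20)
    x_wide   = np.linspace(-5, 5, 20)

    print(mean_s2(x_narrow, hetero=False))   # ≈ 1.0 (σ² = 1): X dispersion irrelevant
    print(mean_s2(x_wide,   hetero=False))   # ≈ 1.0 as well
    print(mean_s2(x_narrow, hetero=True))    # small average s²
    print(mean_s2(x_wide,   hetero=True))    # larger: wider X values inflate s²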

 

 



If the TSS is not fixed and the error term has non-constant variance, the standard error is no longer an unbiased estimator. Is that why R² is the better measure when the TSS is not fixed and the variance is not constant?