Neas-Seminars

Fox Module 12 Statistical inference for multiple regression


http://33771.hs2.instantasp.net/Topic8626.aspx

By NEAS - 12/2/2009 2:28:58 PM

Fox Module 12 Statistical inference for multiple regression

 


           Confidence intervals

           Hypothesis testing

           Empirical vs structural relations


 

 

Read Section 6.2.1, “The multiple regression model,” on pages 105-106.

 

The five assumptions on page 105 and the five attributes of least squares estimators on page 106 are the same as for simple linear regression.

 

Know equation 6.2 on page 106, and read carefully the explanatory paragraph afterward. R²j is the “squared multiple correlation from the regression of Xj on all the other X’s.” Think of a multiple regression with two explanatory variables: R²2 is the squared multiple correlation from the regression of X2 on X1.
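
For reference, equation 6.2 has this general form (a reconstruction in the book's notation, where S²j is the sample variance of Xj):

    V(Bj) = [1 / (1 − R²j)] × σ²ε / [(n − 1) S²j]

The first factor is the variance-inflation factor; the second factor is the sampling variance the slope would have in a simple regression of Y on Xj alone.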

 

Focus on two critical points in this paragraph:

 


 

           The sampling variance of Bj (and so its standard error) is smallest when the explanatory variables are orthogonal, since then R²j = 0.

           The variance-inflation factor 1/(1 − R²j) increases as the explanatory variables become more correlated; see the sketch below.
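
A minimal numpy sketch (my own illustration, not from the text) of the variance-inflation factor with two explanatory variables; the data and coefficients are made up:

    import numpy as np

    rng = np.random.default_rng(0)
    n = 200

    # Correlated explanatory variables: x2 is partly a linear function of x1.
    x1 = rng.normal(size=n)
    x2 = 0.8 * x1 + 0.6 * rng.normal(size=n)

    # Regress x2 on x1 (with intercept) to get the squared multiple correlation R²2.
    X = np.column_stack([np.ones(n), x1])
    beta, *_ = np.linalg.lstsq(X, x2, rcond=None)
    resid = x2 - X @ beta
    r2 = 1 - (resid @ resid) / ((x2 - x2.mean()) @ (x2 - x2.mean()))

    vif = 1 / (1 - r2)
    print(f"R²2 = {r2:.3f}, VIF = {vif:.2f}")  # VIF > 1 because x1 and x2 are correlated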


 

 

Read Section 6.2.2, “Confidence intervals and hypothesis tests,” on pages 106-110. Focus on the degrees of freedom for the t-distribution (n-k-1) on page 106 and the standard error of Bj at the top of page 107 (which follows directly from the previous section).

 

The example on page 107 is clear. Expect similar questions on the final exam.
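
A sketch of the same kind of computation with hypothetical numbers (not the textbook's): a 95% confidence interval for a slope uses the t-distribution with n − k − 1 degrees of freedom.

    import scipy.stats as stats

    # Hypothetical inputs: sample size, number of explanatory variables,
    # a fitted slope, and its standard error.
    n, k = 45, 3
    b_j, se_bj = 0.755, 0.0547

    df = n - k - 1                   # degrees of freedom for the t-distribution
    t_crit = stats.t.ppf(0.975, df)  # two-sided 95% critical value
    lo, hi = b_j - t_crit * se_bj, b_j + t_crit * se_bj
    print(f"95% CI for Bj: ({lo:.3f}, {hi:.3f})")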

 

Know both forms of the F-test for the omnibus null hypothesis on page 108: one uses RegSS and RSS, and the other uses R². RegSS + RSS = TSS, so the final exam may give various input data (RSS, RegSS, TSS, R²) and ask for the F-statistic.
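
The two forms are algebraically equivalent. A short sketch with made-up sums of squares (any consistent set works, since RegSS + RSS = TSS):

    # Hypothetical inputs.
    n, k = 50, 4
    RegSS, RSS = 120.0, 80.0
    TSS = RegSS + RSS
    R2 = RegSS / TSS

    # Form 1: sums of squares.
    F_ss = (RegSS / k) / (RSS / (n - k - 1))
    # Form 2: R-squared; gives the identical value.
    F_r2 = (R2 / k) / ((1 - R2) / (n - k - 1))
    print(F_ss, F_r2)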

 

Know the analysis of variance table on page 108. The residual mean square (RMS) is the estimate of the error variance σ²ε. You use analysis of variance for qualitative factors and for the student project, so learn the definitions in this module.
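
The table has this general layout (a reconstruction of the standard ANOVA table, not copied verbatim from page 108):

    Source        Sum of Squares    df           Mean Square
    Regression    RegSS             k            RegMS = RegSS / k
    Residuals     RSS               n − k − 1    RMS = RSS / (n − k − 1)
    Total         TSS               n − 1

The omnibus F-statistic is RegMS / RMS.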

 

The F-test for a subset of slopes is hard to grasp, but it is essential for regression analysis. It is tested on the final exam and is used in the student projects.

 


 

           Know the two forms of the F-statistic at the bottom of page 109.

           Distinguish between q (the number of slopes being tested) and k (the number of explanatory variables in the full model).


 

 

Know the degrees of freedom (q and n-k-1).
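
A sketch of the incremental F-test with hypothetical values (not from the text): the null model drops q of the full model's k slopes, and the test asks whether the fit worsens significantly.

    # Hypothetical inputs for nested models.
    n, k, q = 60, 5, 2
    RSS_full, RSS_null = 150.0, 190.0
    TSS = 400.0
    R2_full = 1 - RSS_full / TSS
    R2_null = 1 - RSS_null / TSS

    # Form 1: residual sums of squares.
    F_ss = ((RSS_null - RSS_full) / q) / (RSS_full / (n - k - 1))
    # Form 2: squared multiple correlations; algebraically identical.
    F_r2 = ((R2_full - R2_null) / q) / ((1 - R2_full) / (n - k - 1))
    print(F_ss, F_r2)  # compare with the F-distribution on (q, n - k - 1) df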

 

Read Section 6.2.3, “Empirical vs structural relations,” on pages 110-112. This section is intuition, not formulas. Know the relation to the bias of the regression equation in the gray box on page 112. Understand the last line in the box: “Bias in least squares estimation results from the correlation that is induced between the included explanatory variable and the error by incorporating the omitted explanatory variable in the error.”
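
A minimal simulation sketch (my own illustration, not from the text): X2 is a common prior cause of both X1 and Y, with a true structural coefficient of 1 on X1. Omitting X2 folds it into the error term, which then correlates with X1 and biases B1; including X2 as a regressor, that is, controlling for X2, recovers the structural coefficient.

    import numpy as np

    rng = np.random.default_rng(1)
    n = 100_000

    # X2 causes both X1 and Y; the structural effect of X1 on Y is 1.0.
    x2 = rng.normal(size=n)
    x1 = 0.7 * x2 + rng.normal(size=n)
    y = 1.0 * x1 + 2.0 * x2 + rng.normal(size=n)

    def ols(cols, y):
        X = np.column_stack([np.ones(len(y))] + cols)
        return np.linalg.lstsq(X, y, rcond=None)[0]

    b_omit = ols([x1], y)      # X2 omitted: it moves into the error term
    b_full = ols([x1, x2], y)  # X2 included: "controlling for X2"

    print(f"B1 omitting X2:        {b_omit[1]:.3f}")  # biased well above 1
    print(f"B1 controlling for X2: {b_full[1]:.3f}")  # close to the true 1.0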

 

 

By ShubhankarYadav - 2/14/2018 6:54:49 PM



On the last page of this section on "Empirical vs Structural relations", the author says that in the case of spurious causal relationships, it is critical to "control for X2".

Does he mean that X2 must be added as an explanatory variable to our model? If we don't add it, the coefficient of X1 will incorrectly capture this spurious causation due to the common prior cause X2.
By including X2 in our model, we avoid adding the spurious component of the relation between X1 and Y to B1 (the coefficient of X1).