Group: Administrators
Posts: 4.5K,
Visits: 1.6K
Regression analysis Module 12: F test practice problems

(The attached PDF file has better formatting.)

** Exercise 12.1: F-Test

RegSS is the regression sum of squares. RSS is the residual (error) sum of squares. TSS is the total sum of squares. n is the number of data points in the sample. k is the number of explanatory variables (not including the intercept). An F-statistic tests the hypothesis that all the slopes (β's) are zero.

What is the expression for the F-statistic using sums of squares?
What is the expression for the F-statistic using R²?

Part A: The F-statistic using sums of squares is

F-statistic = (RegSS / k) ÷ (RSS / (n – k – 1))

RegSS + RSS = TSS, so some textbooks write this as

F-statistic = (RegSS / k) ÷ ((TSS – RegSS) / (n – k – 1))

Part B: The F-statistic using R² is

F-statistic = (R² / k) ÷ ((1 – R²) / (n – k – 1))

R² = RegSS / TSS, so the expression with R² is the expression with RegSS and TSS after dividing the numerator and denominator by TSS.

Intuition: The total sum of squares (TSS) is divided between the regression sum of squares (RegSS) that is explained by the regression equation and the residual sum of squares (RSS) that remains unexplained. If a greater percentage is explained by the regression line, R² is greater (RegSS is a greater percentage of TSS), the F-statistic is larger, and the regression is more likely to be significant.

(See Fox, Chapter 6, statistical inference, page 108)

** Exercise 12.2: Degrees of freedom of F-statistic

A regression model has N data points, k explanatory variables (β's), and an intercept.

An F-test for the null hypothesis that q slopes are 0 has how many degrees of freedom in the numerator?
This F-test has how many degrees of freedom in the denominator?

Part A: The F-test asks: "How much additional predictive power does the model under review have compared to what we would otherwise use, as a ratio to the total predictive power of the model under review?" Each part of this ratio is adjusted for its degrees of freedom.

The degrees of freedom in the numerator adjusts for the extra predictive power of the model under review stemming from additional explanatory variables. If the model under review has one extra explanatory variable, it predicts better even if this extra explanatory variable has no actual correlation with the response variable. The degrees of freedom is the number of extra explanatory variables, or q.

If the F-test has a p-value of P% with q degrees of freedom in the numerator, its p-value is more than P% with q + 1 degrees of freedom in the numerator. A higher p-value means that it is more likely that the observed increase in predictive power reflects the spurious effects of additional explanatory variables.

Part B: The degrees of freedom for the model under review is N – k – 1; this is the degrees of freedom in the denominator of the F-ratio. As N increases but no other parameters change, the additional predictive power of the model under review is less likely to be spurious (more likely to be real), so the p-value decreases.
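As a rough illustration of these two exercises, here is a short Python sketch (the function names are my own, and scipy is assumed for the F distribution): the omnibus F-statistic in its two equivalent forms, and a p-value computed with q numerator and N – k – 1 denominator degrees of freedom.

```python
from scipy import stats

def f_from_sums_of_squares(reg_ss, rss, n, k):
    # Omnibus F-statistic: tests that all k slopes are zero
    return (reg_ss / k) / (rss / (n - k - 1))

def f_from_r2(r2, n, k):
    # Same statistic written with R-squared (divide the numerator and
    # denominator of the sums-of-squares form by TSS)
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))

def f_p_value(f_stat, q, n, k):
    # q degrees of freedom in the numerator, n - k - 1 in the denominator
    return stats.f.sf(f_stat, q, n - k - 1)

# The two forms agree because R2 = RegSS / TSS
reg_ss, rss, n, k = 28.9, 4.3, 5, 1        # figures from Exercise 12.3 below
r2 = reg_ss / (reg_ss + rss)
print(f_from_sums_of_squares(reg_ss, rss, n, k))   # about 20.16
print(f_from_r2(r2, n, k))                         # same value
print(f_p_value(f_from_r2(r2, n, k), q=1, n=n, k=k))
```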
** Exercise 12.3: F test

A linear regression Yj = α + β × Xj + εj with 5 observations has an estimated σ²ε = 1.4333 and an F value of 10.1628.

What is the residual sum of squares (RSS) of the regression?
What is the regression sum of squares (RegSS)?
What is the R² of the regression?
What is the absolute value of the correlation between the explanatory variable and the response variable?
If the ordinary least squares estimator of β is 1.7, what is its standard error?

Part A: The regression equation has one intercept, one explanatory variable, and five observations, so it has N – k – 1 = 5 – 1 – 1 = 3 degrees of freedom. σ²ε = RSS / degrees of freedom, so RSS = σ²ε × degrees of freedom = 1.4333 × 3 = 4.300.

Part B: The F value = (RegSS / k) / (RSS / (N – k – 1)) = (RegSS / 1) / (4.3 / 3), so RegSS = (4.3 / 3) × 20.1628 = 28.900.

Part C: The R² of the regression is the regression sum of squares divided by the total sum of squares. The total sum of squares TSS = RegSS + RSS = 28.9 + 4.3 = 33.2, so R² = 28.9 / 33.2 = 0.87048.

Part D: The absolute value of the correlation is the square root of the R²: √0.87048 = 0.9330.

Part E: The t value for the ordinary least squares estimator of β is the square root of the F value: t value = √20.1628 = 4.4903. This t value is the ordinary least squares estimator of β divided by its standard error, so the standard error = 1.7 / 4.4903 = 0.3786.
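A quick numeric check of Parts A through E (a minimal sketch; the variable names are mine, and it uses the F value of 20.1628 that the solution applies in Parts B and E):

```python
import math

n, k = 5, 1            # observations; explanatory variables excluding the intercept
sigma2_eps = 1.4333    # estimated error variance
F = 20.1628            # F value as used in Parts B and E
b = 1.7                # ordinary least squares estimate of the slope

df_resid = n - k - 1                 # 3 degrees of freedom
RSS = sigma2_eps * df_resid          # Part A: 4.300
RegSS = F * (RSS / df_resid)         # Part B: 28.900
TSS = RegSS + RSS                    # 33.2
R2 = RegSS / TSS                     # Part C: 0.87048
corr = math.sqrt(R2)                 # Part D: 0.9330
se_b = b / math.sqrt(F)              # Part E: 1.7 / 4.4903 = 0.3786

print(RSS, RegSS, R2, corr, se_b)
```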
Group: Awaiting Activation
Posts: 13,
Visits: 393
I think there is an error here: the F value is shown as 10.1628, but Part B uses an F value of 20.1628.