Fox Module 10 R2 practice problems

Author
Message
lms0123
Forum Newbie
Forum Newbie (6 reputation)

Group: Forum Members
Posts: 6, Visits: 46
The solution for Exercise 10.1 Part D shows R^2 = RegSS/RSS, which appears to be misstated; I believe it should be RegSS/TSS, which is equivalent to the second expression 1 – RSS/TSS.

[NEAS: Thank you for pointing out the typo; the file has been corrected and re-uploaded.]
NEAS
Supreme Being
Supreme Being (5.9K reputation)

Group: Administrators
Posts: 4.5K, Visits: 1.6K

Fox Module 10 R2 practice problems

(The attached PDF file has better formatting.)

** Exercise 10.1: R2

A simple linear regression with an intercept and one explanatory variable fit to 18 observations has a total sum of squares (TSS) = 256 and s2 (the ordinary least squares estimator for σ2) = 4.


A. How many degrees of freedom does the regression equation have?

B. What is RSS, the residual sum of squares?

C. What is RegSS, the regression sum of squares?

D. What is the R2 of the regression equation?

E. What is the adjusted (corrected) R2 of the regression equation?

F. What is the correlation of the explanatory variable and the response variable?

G. What is the F-value for the omnibus F-test?

H. What is the t-value for the explanatory variable?

Part A: The regression equation has N – k – 1 = 18 – 1 – 1 = 16 degrees of freedom.

Take heed: In this equation, k is the number of explanatory variables, not including the intercept α.

Part B: The estimate of the variance of the error term (s2) is the residual (error) sum of squares divided by the number of degrees of freedom, or N – k – 1: s2 = RSS / df, so the residual sum of squares (RSS) = s2 × degrees of freedom = 4 × 16 = 64.

Part C: The regression sum of squares (RegSS) = TSS – RSS.


The total sum of squares TSS is the sum of the squared deviations of the response variable from its mean, given in the problem as 256.

RSS = s2 × (N – 2) = 4 × 16 = 64.

RegSS = 256 – 64 = 192.


Part D: The R2 = RegSS / TSS = 1 – RSS/TSS = 192 / 256 = 75%.

Part E: Adjusted R2 = 1 – (1 – R2) × (N – 1) / (N – k – 1) = 1 – (1 – 75%) × 17 / 16 = 73.44%

Part F: The correlation ρ(x,y) = r = √R2 = √75% = 0.866.

Part G: Fox, Chapter 6, page 109: for the omnibus F-test in a simple linear regression, R02 = 0 and k = 1, so

F = (N – 2) × R2 / (1 – R2) = (18 – 2) × 0.75 / (1 – 0.75) = 48.000

Part H: The t-value for a simple linear regression is the square root of the F-value: √48 = 6.928
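For readers who want to check the arithmetic, here is a minimal Python sketch of Parts A through H. It only restates the numbers given in the exercise; the variable names are ours, not Fox's.

```python
# Minimal check of the arithmetic in Exercise 10.1 (pure Python, no libraries).
from math import sqrt

N, k = 18, 1          # observations, explanatory variables (excluding the intercept)
TSS, s2 = 256.0, 4.0  # given in the exercise

df = N - k - 1                                   # Part A: 16
RSS = s2 * df                                    # Part B: 64
RegSS = TSS - RSS                                # Part C: 192
R2 = RegSS / TSS                                 # Part D: 0.75
adj_R2 = 1 - (1 - R2) * (N - 1) / (N - k - 1)    # Part E: 0.7344
r = sqrt(R2)                                     # Part F: 0.866
F = (df / k) * R2 / (1 - R2)                     # Part G: 48
t = sqrt(F)                                      # Part H: 6.928

print(df, RSS, RegSS, R2, round(adj_R2, 4), round(r, 3), F, round(t, 3))
```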


[This practice problem is an essay question, reviewing the meaning of the significance tests, goodness-of-fit tests, and measures of predictive power. It relates the statistical tests to the form of the regression line, emphasizing the intuition. Final exam problems test specific items in a multiple choice format.]

** Exercise 10.2: Measures of significance

The R2, the adjusted (corrected) R2, the s2 (the ordinary least squares estimator for σ2), the t-value, and the F-value measure the significance, goodness-of-fit, or predictive power of the regression.


A. What does the R2 measure?

B. What does the adjusted (corrected) R2 measure?

C. When is it important to use the adjusted (corrected) R2 instead of the simple R2?

D. If the R2 ≈ 0, what can one say about the regression?

E. If the R2 ≈ 1, what can one say about the regression?

F. What does the s2 measure?

G. Given R2, what is the F-value for the omnibus F-test?

H. What does the F-value measure?

I. If the F-value ≈ 0, what can one say about the regression?

Part A: R2 measures the percentage of the total sum of squares explained by the regression, or RegSS / TSS.

Jacob: Why does the textbook show the R2 as 1 – RSS / TSS? This is equivalent, since RSS + RegSS = TSS.

Rachel: To adjust for degrees of freedom (for the corrected R2), we adjust RSS and TSS. The format R2 = 1 – RSS / TSS makes it easier to understand the adjustment for degrees of freedom.

Jacob: Does the R2 measure whether the regression analysis is significant? The textbook gives significance levels for t-values and F-values (and associated confidence intervals for the regression coefficients), but it does not give significance levels for R2.

Rachel: R2 combines two items: whether the explanatory variables have predictive power and whether the regression coefficients are significantly different from zero (or from another null hypothesis). This exercise reviews the concepts and explains what R2 implies vs what s2 and the F-value imply.

Part B: R2 does not adjust for degrees of freedom. If the regression has N data points and uses N explanatory variables (or N – 1 independent variables + 1 intercept), all points are fit exactly, and the R2 = 100%. This is true even if the explanatory variables have no predictive power: that is, each explanatory variable is independent of the response variable.

The same problem exists even if the number of explanatory variables is less than the number of data points. Even if the explanatory variables are independent of the response variable and have no predictive power, the R2 is always more than zero.

The adjusted (corrected) R2 adjusts for degrees of freedom. The degrees of freedom apply to RSS and TSS, not to RegSS. With N data points and k independent variables (= k + 1 explanatory variables including the intercept), the TSS has N – 1 degrees of freedom and the RSS has N – k – 1 degrees of freedom.

Fox explains: R2 is 1 – RSS / TSS = the complement of (the residual sum of squares / the total sum of squares). The adjusted R2 is the complement of (the residual variance / the total variance).

The adjusted (corrected) R2 = 1 – (RSS / (N – k – 1)) / (TSS / (N – 1)).

The R2 is a ratio of sums of squares, and the adjusted (corrected) R2 is a ratio of variances.
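A small Python check (reusing the TSS and RSS from Exercise 10.1, purely as an illustration) shows that the variance-ratio form and the rescaled-R2 form of the adjusted R2 give the same number:

```python
# Sketch: the two expressions for the adjusted R-squared agree.
N, k, TSS, RSS = 18, 1, 256.0, 64.0
R2 = 1 - RSS / TSS

adj_from_variances = 1 - (RSS / (N - k - 1)) / (TSS / (N - 1))   # ratio of variances
adj_from_R2        = 1 - (1 - R2) * (N - 1) / (N - k - 1)        # rescaled R-squared

print(adj_from_variances, adj_from_R2)   # both 0.734375
```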

Part C: For most regression analyses, the R2 is fine. It says what percentage of the variation in the sample values is explained by the regression. This percentage is not used for tests of significance, so a slight over-statement is not a problem.

Jacob: Is the R2 over-stated? The textbook does not say that it is over-stated.

Rachel: The R2 says what percentage of the variation in the sample values is explained by the regression. It is the correct percentage, not over- or under-stated. Some of the explanation is spurious, caused by random fluctuations in small data samples. The adjusted R2 says: what would the R2 be if we had an infinite number of data points?

Jacob: This adjustment seems proper; why do we still use the simple R2?

Rachel: We have a single data set; we don’t know what the R2 would be if we had an infinite number of data points. We estimate the expected correction. This estimate is unbiased, but it is sometimes too high and sometimes too low.

To compare regression equations with different degrees of freedom, one must use the adjusted R2. For example, suppose one regresses a response variable Y on several explanatory variables. One might say that the best regression equation is the one which explains the largest percentage of the variation in the response variable. R2 is not a valid measure, since adding an explanatory variable always increases the R2, even if the explanatory variable is unrelated to the response variable. Instead, we choose the regression equation with the highest adjusted R2.
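The sketch below illustrates this point with simulated data (the data set, seed, and helper function are our own assumptions, not from the textbook): adding a regressor that is unrelated to the response raises the R2 slightly but typically lowers the adjusted R2.

```python
# Sketch (assumed data): adding an unrelated regressor raises R-squared but
# usually lowers the adjusted R-squared. Uses numpy least squares only.
import numpy as np

rng = np.random.default_rng(0)
N = 30
x = rng.normal(size=N)
y = 2.0 + 1.5 * x + rng.normal(scale=2.0, size=N)
junk = rng.normal(size=N)          # unrelated to y by construction

def r2_stats(regressors, y):
    X = np.column_stack([np.ones(len(y))] + list(regressors))   # add intercept
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    RSS = resid @ resid
    TSS = ((y - y.mean()) ** 2).sum()
    k = X.shape[1] - 1                                           # explanatory variables
    R2 = 1 - RSS / TSS
    adj = 1 - (1 - R2) * (len(y) - 1) / (len(y) - k - 1)
    return round(R2, 4), round(adj, 4)

print(r2_stats([x], y))         # one real regressor
print(r2_stats([x, junk], y))   # R2 rises slightly; adjusted R2 typically falls
```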

Part D: If R2 is close to zero, the explanatory variables explain almost none of the variance in the response variable. For a simple linear regression with one explanatory variable, the correlation of X and Y is close to zero.

Jacob: Suppose we draw a scatterplot of Y against X. If R2 is close to zero, is the scatterplot a cloud of points with no pattern?

Rachel: The R2 reflects two things: the variance of the error term and the slope of the regression line. The variance of the error term, compared to the dispersion of the response variable, determines whether the scatterplot is a cloud of points with no clear pattern or a set of points lying next to the regression line. The slope of the regression line (the β coefficient) determines whether the explanatory variable much affects the response variable.

The units of measurement are important. Suppose we regress personal auto claim frequency on the distance the car is driven.


If the slope coefficient is β when the distance is in miles (or kilometers), the slope coefficient is β × 1,000 when the distance is in thousands of miles (kilometers).

If the slope coefficient is β when the claim frequency is in claims per car, the slope coefficient is β × 100 when the claim frequency is in claims per hundred cars.


Illustration: Suppose the regression line is Y = 1 + 0 × X + ε. N (the number of points) = 1,000, the explanatory variables are the integers from 1 to 1,000, and σ2ε = 1. The scatterplot is a horizontal line Y = 1 with slight random fluctuations above and below the line. The scatterplot shows a clear pattern; it is not a cloud of points. But R2 is close to zero, since the values of X have no effect on the values of Y.

Now suppose the true regression line is Y = 1 + 1 × X + ε, with N (the number of points) = 1,000, the explanatory variables are the integers from 1 to 1,000, and σ2ε = 1 million. The scatterplot is a 45° diagonal line Y = X with much random fluctuation above and below the line. The scatterplot does not show a clear pattern; it appears as a cloud of points, and only by looking carefully does one see the pattern. But R2 is not close to zero, since the values of X have a strong effect on the values of Y. The exact value of R2 depends on the error terms.
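A rough simulation of these two illustrations (the seed and random draws are assumptions; the exact R2 values vary with the error terms, as noted above):

```python
# Sketch: a flat line with small noise gives R-squared near 0; a 45-degree
# line with huge noise gives an R-squared clearly above 0 despite the cloud.
import numpy as np

rng = np.random.default_rng(1)
X = np.arange(1, 1001, dtype=float)

def r_squared(y, x):
    b = np.cov(x, y, bias=True)[0, 1] / x.var()   # OLS slope
    a = y.mean() - b * x.mean()                   # OLS intercept
    resid = y - (a + b * x)
    return 1 - (resid @ resid) / ((y - y.mean()) ** 2).sum()

y_flat  = 1 + 0 * X + rng.normal(scale=1.0, size=X.size)      # sigma^2 = 1
y_slope = 1 + 1 * X + rng.normal(scale=1000.0, size=X.size)   # sigma^2 = 1 million

print(round(r_squared(y_flat, X), 4))    # close to 0
print(round(r_squared(y_slope, X), 4))   # larger; exact value depends on the draws
```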

Some statisticians do not much use R2, since it is a mix of two values: the slope of the regression line and the ratio of σε to the dispersion of the Y values. We do not use R2 for goodness-of-fit tests or tests of significance, since it mixes two items. We use the t-value (or the F-value) for the significance of the explanatory variables.

Part E: If R2 is close to 1, the correlation of the explanatory variable and the response variable (X and Y) is close to 1 or –1. Almost all the variation in the response variable is explained by the explanatory variables.

An R2 close to 1 implies that the ratio of σε to the dispersion of the Y values (the variance of Y) is low. Three things affect the R2:


RSS and σ2ε are low.

β is not low.

TSS (the variance of Y) is high.


Part F: s2 is the ordinary least squares estimator of σ2ε. Most importantly, s2 is an unbiased estimator of σ2ε.

Jacob: Does this imply that s is an unbiased estimator of σε?

Rachel: If s2 is an unbiased estimator of σ2ε, s is not an unbiased estimator of σε. To grasp the rationale for this, suppose σ2ε is 4 and s2 is 2, 3, 4, 5, or 6, with a 20% probability of each.


σε is √4 = 2.

s is √2, √3, √4, √5, or √6, with a 20% probability of each.

The mean of s is (√2 + √3 + √4 + √5 + √6) / 5 ≈ 1.966.

s is a reasonable estimator of σε, but it is not unbiased.
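A quick numeric check of this discrete example (plain Python arithmetic):

```python
# Check of the discrete example: s^2 is unbiased for 4, but s is biased low.
from math import sqrt

s2_values = [2, 3, 4, 5, 6]                    # each with probability 20%
mean_s2 = sum(s2_values) / 5                   # 4.0 -> unbiased for sigma^2 = 4
mean_s = sum(sqrt(v) for v in s2_values) / 5   # about 1.966 < 2 = sqrt(4)

print(mean_s2, round(mean_s, 3))
```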


Part G: Use the relation F = [ (N – k – 1) / q ] × R2 / (1 – R2), where k is the number of explanatory variables (not including the intercept) and q is the number of variables in the group being tested.

Jacob: How is this relation derived?

Rachel: Use the expression for the F-value in terms of RSS and divide the numerator and denominator by TSS.

Jacob: Fox has a q in his formula (page 109) and an R0. What is the difference between k and q, and what is R0?

Rachel: Fox shows the general form of the F-value. For the omnibus F-test, the null hypothesis is that all β’s are zero, so k = q and R02 (the R2 for the null hypothesis) = 0.

Jacob: Can you explain the intuition for that last statement?

Rachel: If all β’s are zero, RSS = TSS, and RegSS = 0.

Part H: The F-value measures if a group of explanatory variables in combination is significant. The omnibus F-test measures if all the explanatory variables in combination are significant.

Jacob: Is that the same as saying at least one explanatory variable is significant? After all, if the explanatory variables in combination are significant, at least one of them must be significant.

Rachel: No, that is not correct. A clear example is a regression analysis on a group of correlated explanatory variables. Suppose an actuary regresses the loss cost trend for workers’ compensation on three inflation indices: monetary inflation (the change in the CPI), wage inflation, and medical inflation. All three inflation indices are highly correlated. If any one were used in the regression equation alone, it would significantly affect the loss cost trend. If all three are used, we may not be able to discern which affects the loss cost trend, and none might be significant.

Jacob: If the regression equation has only one explanatory variable, are the t-value and the F-value the same?

Rachel: They have the same p-values, and they are equivalent significance tests, but they have different units. The F-value is the square of the t-value.
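A short sketch with made-up data (the data and seed are assumptions) confirms the relation F = t2 for a simple linear regression, using only the formulas above:

```python
# Sketch (assumed data): in a simple linear regression, the omnibus F equals
# the square of the slope's t-value. Plain numpy, no statistics package.
import numpy as np

rng = np.random.default_rng(2)
N = 25
x = rng.normal(size=N)
y = 0.5 + 0.8 * x + rng.normal(size=N)

xd, yd = x - x.mean(), y - y.mean()
b = (xd @ yd) / (xd @ xd)                   # slope estimate
resid = yd - b * xd
s2 = (resid @ resid) / (N - 2)              # estimate of the error variance
t = b / np.sqrt(s2 / (xd @ xd))             # t-value of the slope

R2 = 1 - (resid @ resid) / (yd @ yd)
F = (N - 2) * R2 / (1 - R2)                 # omnibus F-test

print(round(t ** 2, 6), round(F, 6))        # identical up to rounding
```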

Part I: If the F-value is close to zero, the slope coefficient is not significantly different from zero. This means one of three things:

1. The slope coefficient is close to zero. The slope coefficient β depends on the units of measurement, so the term close to zero depends on the units of measurement. To avoid problems with the units of measurement, assume the X and Y values are normalized: deviations from the mean in units of the standard deviation.

2. The variance of the error term σ2ε is large relative to the variance of the response variable. The random fluctuation in the residual variance overwhelms the effect of the explanatory variable.

3. The data sample has so few points that the regression pattern is spurious. For example, one can draw a straight line connecting any two points, so the regression analysis means nothing. The F-value has zero degrees of freedom and is not significant no matter how large it is.


[The following exercise explains some intuition for R2, adjusted R2, F-values, and significance.]

** Exercise 10.3: Measures of significance

Two regression equations Y and Z regress inflation rates on interest rates using data from different periods. The true population distributions of the explanatory variable and the response variable are the same in the two equations.


Equation Y has the higher R2 and an estimated slope coefficient of βY.

Equation Z has the higher adjusted (corrected) R2 and an estimated slope coefficient of βZ.



A. Which regression equation uses a larger data set?

B. Which regression equation has a greater F-value?

C. Which is the better estimate of the slope coefficient: βY or βZ?

Part A: Equation Y has the higher R2 and the lower adjusted (corrected) R2. This implies that Equation Y has fewer data points, and more of its R2 is spurious.

Part B: The F-test uses the same adjustment for degrees of freedom as the adjusted R2, so Equation Z has the higher F-value.

Part C: βZ has the higher t-value (the square root of the F-value), so it is the better estimate. In practice, we would use a weighted average of the two β’s, with more weight given to Equation Z.


** Exercise 10.4: R2

A simple (two-variable) linear regression model Yi = α + β × Xi + εi is fit to the 5 points:

(0, 0), (1, 1), (2, 4), (3, 4), (4, 6)


A. What is the mean X value?

B. What is the mean Y value?

C. What are the five points in deviation form?

D. What is ∑(xi)2?

E. What is ∑(yi)2?

F. What is ∑(xi)(yi)?

G. What is R2?

H. What is the adjusted (corrected) R2?

Part A: The mean X value (x̄) = (0 + 1 + 2 + 3 + 4) / 5 = 2

Part B: The mean Y value (ȳ) = (0 + 1 + 4 + 4 + 6) / 5 = 3

Part C: For the deviations from the mean, subtract 2 from each X value and 3 from each Y value to get

(–2, –3), (–1, –2), (0, 1), (1, 1),(2, 3)

Part D: ∑(xi)2 = 4 + 1 + 0 + 1 + 4 = 10

Part E: ∑(yi)2 = 9 + 4 + 1 + 1 + 9 = 24

Part F: ∑(xi)(yi) = 6 + 2 + 0 + 1 + 6 = 15

Part G: The total sum of squares (TSS) = ∑(yi)2 = 9 + 4 + 1 + 1 + 9 = 24

The regression sum of squares (RegSS) = [ ∑(xi)(yi) ]² / ∑(xi)2 = 15² / 10 = 22.5

The R2 = RegSS / TSS = 22.5 / 24 = 93.75%

Part H: Adjusted R2 = 1 – (1 – R2) × (N – 1) / (N – k – 1) = 1 – (1 – 0.9375) × (5 – 1) / (5 – 1 – 1) = 0.917
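The same steps in a short Python sketch, using the five given points (variable names are ours):

```python
# Check of Exercise 10.4 with the five given points (pure Python).
xs = [0, 1, 2, 3, 4]
ys = [0, 1, 4, 4, 6]
N, k = len(xs), 1

x_bar = sum(xs) / N                              # 2
y_bar = sum(ys) / N                              # 3
xd = [x - x_bar for x in xs]                     # deviations of X
yd = [y - y_bar for y in ys]                     # deviations of Y

Sxx = sum(d * d for d in xd)                     # 10
TSS = sum(d * d for d in yd)                     # 24
Sxy = sum(a * b for a, b in zip(xd, yd))         # 15

RegSS = Sxy ** 2 / Sxx                           # 22.5
R2 = RegSS / TSS                                 # 0.9375
adj_R2 = 1 - (1 - R2) * (N - 1) / (N - k - 1)    # about 0.917

print(x_bar, y_bar, Sxx, TSS, Sxy, RegSS, R2, round(adj_R2, 3))
```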


** Question 10.5: Adjusted R2

We fit the model Yi = α + β1 X1i + β2 X2i + β3 X3i + β4 X4i + εi to N observations.


Y = the expected value of R2.

Z = the expected value of the adjusted R2.


As N increases, which of the following is true?


A. Y increases and Z increases

B. Y increases and Z decreases

C. Y decreases and Z increases

D. Y decreases and Z decreases

E. Y decreases and Z stays the same

Answer 10.5: E

If N = 2, R2 = 100%, since we can fit a straight line connecting two points. As N increases, R2 declines to the square of the correlation between the population variables X and Y.

The adjusted R2 is corrected for degrees of freedom, so its expected value is the square of the correlation between the variables X and Y, regardless of N.

Intuition: R2 is correct for large samples and overstated for small samples.

The adjusted (corrected) R2 is an unbiased estimate for all samples.
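A simulation sketch of this intuition (the data-generating process, with four regressors unrelated to the response, is our own assumption): the average R2 shrinks toward the population value of zero as N grows, while the average adjusted R2 stays near zero for every N.

```python
# Sketch: average R-squared falls as N grows when the regressors have no
# predictive power; the average adjusted R-squared stays near zero.
import numpy as np

rng = np.random.default_rng(3)

def avg_r2(N, k=4, trials=1000):
    r2s, adj = [], []
    for _ in range(trials):
        X = np.column_stack([np.ones(N), rng.normal(size=(N, k))])
        y = rng.normal(size=N)               # response unrelated to the X's
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ beta
        R2 = 1 - (resid @ resid) / ((y - y.mean()) ** 2).sum()
        r2s.append(R2)
        adj.append(1 - (1 - R2) * (N - 1) / (N - k - 1))
    return round(np.mean(r2s), 3), round(np.mean(adj), 3)

for N in (10, 25, 100):
    print(N, avg_r2(N))   # R-squared shrinks with N; adjusted stays near zero
```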


** Question 10.6: Adjusted R2

We estimate two regression equations, S and T, with a different number of observations and a different number of independent variables in each regression equation.


R2s and R2t are the R2 for equations S and T.

Ns and Nt are the number of observations for equations S and T.

Ks and Kt are the number of independent variables for equations S and T.


R2s = R2t. Under what conditions is the adjusted R2 for equation S definitely greater than the adjusted R2 for equation T?


A. Ns > Nt and Ks > Kt

B. Ns < Nt and Ks < Kt

C. Ns > Nt and Ks < Kt

D. Ns < Nt and Ks > Kt

E. In all scenarios, the adjusted R2 for equation S may be more or less than the adjusted R2 for equation T.

Answer 10.6: C

Use the formula for the adjusted R2 in terms of R2, N, and k. Intuitively, the difference between the R2 and the adjusted R2 decreases as the degrees of freedom increase.

Adjusted R2 = 1 – (1 – R2) × (N – 1) / (N – k – 1), where k is the number of independent variables.

N is greater than k + 1. The value of (N – 1) / (N – k – 1)


decreases as N increases

increases as k increases


As (N – 1) / (N – k – 1) decreases, the adjusted R2 increases. Choice C has these relations.
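A numeric check with assumed values (the common R2 = 0.60 and the illustrative N and k are not from the question): equation S, with larger N and smaller k, has the smaller penalty factor and therefore the higher adjusted R2.

```python
# Numeric check for Question 10.6: equal R-squared, but equation S has more
# observations and fewer independent variables (choice C), so its penalty
# factor (N - 1)/(N - k - 1) is smaller and its adjusted R-squared is higher.
R2 = 0.60                        # assumed common R-squared

def adjusted(R2, N, k):
    return 1 - (1 - R2) * (N - 1) / (N - k - 1)

print(adjusted(R2, N=50, k=2))   # equation S: larger N, smaller k -> higher
print(adjusted(R2, N=20, k=5))   # equation T: smaller N, larger k -> lower
```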

