Fox Module 19 Heteroscedasticity
Non-constant error variance
Residual plots
Read Section 12.2, “Non-constant error variance,” on pages 272-274.
Know the three bullet points on page 268. Some statements may not be clear at first, and you must invest the time to understand them. The first bullet point says that “although the validity of least squares estimation is robust, the efficiency of least squares is not robust.”
Fox explains what this means, and the final exam tests whether you grasp the concepts.
The next bullet point says “highly skewed error distributions compromise the interpretation of the least squares fit.” Fox means that in a highly skewed distribution the mean is not the median, so the least squares fit, which estimates the conditional mean, does not indicate the center of the distribution.
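The gap between mean and median in a skewed error distribution can be seen in a short simulation. This is only a sketch: the lognormal draws, the seed, and the variable names are assumptions chosen for illustration, not data from Fox. The draws are centered so that the errors have mean zero, as regression errors do.

```python
import random
import statistics

random.seed(19)  # arbitrary seed so the run is repeatable

# Positively skewed draws (lognormal is an assumed example distribution).
raw = [random.lognormvariate(0.0, 1.0) for _ in range(10_000)]

# Center the draws so the simulated errors have mean zero.
raw_mean = statistics.fmean(raw)
errors = [value - raw_mean for value in raw]

mean_error = statistics.fmean(errors)
median_error = statistics.median(errors)

print(f"mean of errors   = {mean_error:.3f}")
print(f"median of errors = {median_error:.3f}")
```

The mean is zero by construction, but the median is well below zero: more than half of the observations lie below the fitted mean.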
The last bullet point discusses error distributions with two modes. All concepts on this page are tested on the final exam.
Know quantile plots well; see Figure 12.1 at the top of page 269. From a quantile plot, you determine if the distribution is heavy tailed or thin tailed and if it is symmetric, positively skewed, or negatively skewed. The final exam tests these relations.
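To see how a quantile plot reveals skewness, the points of a normal quantile plot can be computed by hand. The sketch below is illustrative only: the function name normal_quantile_pairs, the exponential sample, and the cumulative proportion (i - 0.5)/n are assumptions (the last is one common convention). Each ordered, standardized sample value is paired with the matching standard normal quantile; points on the line y = x indicate a normal shape.

```python
import random
from statistics import NormalDist, fmean, stdev

def normal_quantile_pairs(sample):
    """Return (normal quantile, standardized ordered value) pairs."""
    n = len(sample)
    centre, spread = fmean(sample), stdev(sample)
    standard_normal = NormalDist()
    pairs = []
    for i, value in enumerate(sorted(sample), start=1):
        p = (i - 0.5) / n                   # cumulative proportion for the i-th ordered value
        z = standard_normal.inv_cdf(p)      # matching standard normal quantile
        pairs.append((z, (value - centre) / spread))
    return pairs

random.seed(19)  # arbitrary seed so the run is repeatable
skewed = [random.expovariate(1.0) for _ in range(2_000)]  # positively skewed sample
pairs = normal_quantile_pairs(skewed)

low_z, low_value = pairs[0]      # smallest observation
high_z, high_value = pairs[-1]   # largest observation
print(f"lower tail: normal quantile {low_z:.2f}, sample value {low_value:.2f}")
print(f"upper tail: normal quantile {high_z:.2f}, sample value {high_value:.2f}")
```

For this positively skewed sample, both ends of the plot lie above the line: the lower tail is shorter than a normal tail and the upper tail is longer.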
Fox shows how transformations can correct skewness and lead to normal quantile plots. Review his comments at the bottom of page 269 and top of page 270.
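A small simulation shows a transformation removing positive skewness. This is a sketch under assumed conditions: the data are simulated lognormal draws, the log is used as one example of a transformation down the ladder of powers, and sample_skewness is a helper written for this illustration. It is not a reproduction of Fox's example.

```python
import math
import random
from statistics import fmean

def sample_skewness(values):
    """Moment-based skewness: third central moment over the 1.5 power of the second."""
    centre = fmean(values)
    m2 = fmean([(v - centre) ** 2 for v in values])
    m3 = fmean([(v - centre) ** 3 for v in values])
    return m3 / m2 ** 1.5

random.seed(19)  # arbitrary seed so the run is repeatable
raw = [random.lognormvariate(0.0, 1.0) for _ in range(5_000)]  # positively skewed
logged = [math.log(v) for v in raw]                            # log-transformed values

skew_raw = sample_skewness(raw)
skew_log = sample_skewness(logged)
print(f"skewness before log: {skew_raw:.2f}")
print(f"skewness after log:  {skew_log:.2f}")
```

The raw values are strongly skewed, while the logged values have skewness near zero, so their normal quantile plot would be close to a straight line.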
Read Section 12.2.1, “Residual plots,” on pages 272-274. The relation in the second paragraph in this section on page 272, “Plotting residuals …,” is tested on the final exam: the linear correlation between the residuals and the observed values of the response variable is the square root of 1 – R². The formula in the last two paragraphs on this page (p = 1 – b) is not tested on the final exam.
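The relation can be checked numerically. The sketch below fits a simple regression to simulated data (the data, seed, and helper names are assumptions for illustration) and compares the correlation of the residuals with the observed values against the square root of 1 – R². It also shows that the residuals are uncorrelated with the fitted values.

```python
import math
import random

def mean(values):
    return sum(values) / len(values)

def correlation(a, b):
    """Pearson linear correlation of two equal-length lists."""
    a_bar, b_bar = mean(a), mean(b)
    s_ab = sum((u - a_bar) * (v - b_bar) for u, v in zip(a, b))
    s_aa = sum((u - a_bar) ** 2 for u in a)
    s_bb = sum((v - b_bar) ** 2 for v in b)
    return s_ab / math.sqrt(s_aa * s_bb)

random.seed(19)  # arbitrary seed so the run is repeatable
x = [random.uniform(0, 10) for _ in range(200)]
y = [2.0 + 1.5 * xi + random.gauss(0, 3) for xi in x]

# Least squares fit of y on x (with an intercept).
x_bar, y_bar = mean(x), mean(y)
slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
         / sum((xi - x_bar) ** 2 for xi in x))
intercept = y_bar - slope * x_bar

fitted = [intercept + slope * xi for xi in x]
residuals = [yi - fi for yi, fi in zip(y, fitted)]

r_squared = 1 - sum(e ** 2 for e in residuals) / sum((yi - y_bar) ** 2 for yi in y)
corr_resid_observed = correlation(residuals, y)
corr_resid_fitted = correlation(residuals, fitted)

print(f"sqrt(1 - R^2)                    = {math.sqrt(1 - r_squared):.6f}")
print(f"corr(residuals, observed values) = {corr_resid_observed:.6f}")
print(f"corr(residuals, fitted values)   = {corr_resid_fitted:.6f}")
```

The first two numbers agree, and the third is zero up to rounding error. A plot of residuals against observed values is therefore tilted even when the model is correct; a plot against fitted values is not.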
Know the meaning of plots of studentized residuals versus fitted values and of spread-level plots. Fox shows examples in Figures 12.3 and 12.4 on page 273.
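The coordinates of a spread-level plot can be computed directly. The sketch below is an illustration under assumptions, not Fox's example: the data are simulated so that the error standard deviation is proportional to the mean of the response, the regression has one explanatory variable, and all names are hypothetical. It computes studentized residuals, regresses the log absolute studentized residuals on the log fitted values, and reports the slope b and the power p = 1 – b.

```python
import math
import random

def fit_line(x, y):
    """Least squares (intercept, slope) for a simple regression of y on x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    return y_bar - slope * x_bar, slope

random.seed(19)  # arbitrary seed so the run is repeatable
n = 2_000
x = [random.uniform(1, 10) for _ in range(n)]
# Assumed data: mean 10 + 10x, error standard deviation equal to 20% of the mean.
y = [(10 + 10 * xi) * (1 + random.gauss(0, 0.2)) for xi in x]

intercept, slope = fit_line(x, y)
fitted = [intercept + slope * xi for xi in x]
residuals = [yi - fi for yi, fi in zip(y, fitted)]

# Hat values for a simple regression with an intercept.
x_bar = sum(x) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
hat = [1 / n + (xi - x_bar) ** 2 / sxx for xi in x]

# Studentized residuals: each uses the error variance estimated without that case.
rss = sum(e ** 2 for e in residuals)
studentized = []
for e, h in zip(residuals, hat):
    s2_deleted = (rss - e ** 2 / (1 - h)) / (n - 3)
    studentized.append(e / math.sqrt(s2_deleted * (1 - h)))

# Spread-level plot coordinates and the slope of the line through them.
log_fitted = [math.log(f) for f in fitted]
log_abs_resid = [math.log(abs(t)) for t in studentized]
_, b = fit_line(log_fitted, log_abs_resid)
p = 1 - b

print(f"slope b of spread-level plot = {b:.2f}")
print(f"suggested power p = 1 - b    = {p:.2f}")
```

With spread proportional to level, the slope b is near 1 and p is near 0, which corresponds to a log transformation of the response.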
Heteroscedasticity is important when the values of the dependent variable (the response variable) range widely. Mortality rates, claim frequencies, and claim severities vary greatly among policyholders.
Residual plots are essential for judging a regression model, and your student project should show these plots. Plot the residuals against the fitted values, not the observed values.