Fox Module 16: One way ANOVA
Homework assignment: insurance renewals
An insurer examines policy renewals by territory.
Urban territory | Renewal rate | Sub-urban territory | Renewal rate | Rural territory | Renewal rate |
1 | 65.00% | 6 | 80.00% | 13 | 89.00% |
2 | 80.00% | 7 | 75.00% | 14 | 90.00% |
3 | 75.00% | 8 | 90.00% | 15 | 91.00% |
4 | 75.00% | 9 | 90.00% | | |
5 | 80.00% | 10 | 90.00% | | |
| | 11 | 82.00% | | |
| | 12 | 88.00% | | |
Use a one-way analysis of variance to determine whether renewal rates differ by type of territory (urban, sub-urban, rural). Renewal rates are percentages, so use the log odds (logits), not the observed renewal rates.
A. Why would a linear regression of renewal rates on territory not be appropriate?
B. What are the log odds by territory?
C. Why might a logistic regression of renewal rates on territory be appropriate?
Part A: Renewal rates are percentages: the number of renewals in a territory has a binomial distribution, and the observed renewal rate is that number divided by the number of drivers. If a territory has 1,000 drivers and the expected renewal rate is P, the variance of the observed renewal rate is P × (1 – P) / 1,000. Classical regression analysis assumes the response variable has a normal distribution with the same variance in all territories. What is the range of a normal distribution? What is the range of a binomial distribution? If two binomial distributions have different expected values but the same N, can they have the same variance?
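The variance formula in Part A can be tabulated with a short sketch. The function name renewal_rate_variance is illustrative, and the three rates are simply values that appear in the table; N = 1,000 is the figure used in Part A, not an observed count.

```python
# Variance of an observed renewal rate under a binomial model:
# Var(rate) = P * (1 - P) / N, the formula stated in Part A.

def renewal_rate_variance(p: float, n: int) -> float:
    """Variance of the observed renewal rate for n drivers with expected rate p."""
    return p * (1.0 - p) / n

N = 1000  # drivers per territory (the illustrative figure from Part A)

for p in (0.65, 0.80, 0.90):  # rates taken from the table
    print(f"P = {p:.2f}: variance of observed rate = {renewal_rate_variance(p, N):.6f}")
```

The printed variances change with P even though N is fixed, which is the point to weigh against the constant-variance assumption of classical regression.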
Part B: The log odds are ln( π / (1 – π) ). Calculate the log odds by territory.
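A minimal sketch of the Part B calculation, assuming the renewal rates are entered as proportions (0.65 for 65.00%). The names log_odds and renewal_rates are illustrative choices, not part of the assignment.

```python
import math

def log_odds(p: float) -> float:
    """Log odds (logit) of a proportion p, defined for 0 < p < 1."""
    return math.log(p / (1.0 - p))

# Renewal rates from the table, keyed by territory number within each group.
renewal_rates = {
    "Urban": {1: 0.65, 2: 0.80, 3: 0.75, 4: 0.75, 5: 0.80},
    "Sub-urban": {6: 0.80, 7: 0.75, 8: 0.90, 9: 0.90, 10: 0.90, 11: 0.82, 12: 0.88},
    "Rural": {13: 0.89, 14: 0.90, 15: 0.91},
}

for group, rates in renewal_rates.items():
    for territory, rate in rates.items():
        print(f"{group:9s} territory {territory:2d}: "
              f"rate {rate:.2f}, log odds {log_odds(rate):.3f}")
```

A renewal rate of 50% has log odds of zero; rates above 50%, as in every territory here, give positive log odds.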
Part C: What is the range of the log odds? As π ➝ 1, what are the log odds? As π ➝ 0, what are the log odds?
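One way to explore the Part C questions numerically is to evaluate the log odds at rates close to 1 and close to 0, and to map log odds back to rates. This is only a sketch: the helper inverse_log_odds and the sample values are assumptions chosen for illustration.

```python
import math

def log_odds(p: float) -> float:
    """Log odds (logit) of a proportion p, defined for 0 < p < 1."""
    return math.log(p / (1.0 - p))

def inverse_log_odds(x: float) -> float:
    """Map a log-odds value back to a proportion (the logistic function)."""
    return 1.0 / (1.0 + math.exp(-x))

# Rates approaching 1, then rates approaching 0.
for p in (0.9, 0.99, 0.999, 0.999999):
    print(f"rate {p}: log odds {log_odds(p):+.3f}")
for p in (0.1, 0.01, 0.001, 0.000001):
    print(f"rate {p}: log odds {log_odds(p):+.3f}")

# Moderate log-odds values (kept small to avoid overflow in math.exp)
# map back to proportions strictly between 0 and 1.
for x in (-10.0, 0.0, 10.0):
    print(f"log odds {x:+.1f}: rate {inverse_log_odds(x):.6f}")
```

The log odds are not bounded by 0 and 1 the way the renewal rate is, which is what makes them a candidate response variable for a regression.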
Fox shows how to regress renewal rates on explanatory variables in chapter 14. This homework assignment converts renewal rates to log odds, which is the first step in the procedure.
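Once the rates are converted, the one-way analysis of variance on the log odds can be sketched as below. This is an illustration under stated assumptions, not the course's solution: each territory is given equal weight (the table does not give the number of policies per territory), and the function name one_way_anova is hypothetical.

```python
import math

def log_odds(p: float) -> float:
    """Log odds (logit) of a proportion p, defined for 0 < p < 1."""
    return math.log(p / (1.0 - p))

def one_way_anova(groups):
    """Return (F, df_between, df_within) for a dict of group name -> list of values."""
    all_values = [v for values in groups.values() for v in values]
    n_total = len(all_values)
    k = len(groups)
    grand_mean = sum(all_values) / n_total

    ss_between = 0.0  # variation of group means around the grand mean
    ss_within = 0.0   # variation of observations around their group mean
    for values in groups.values():
        group_mean = sum(values) / len(values)
        ss_between += len(values) * (group_mean - grand_mean) ** 2
        ss_within += sum((v - group_mean) ** 2 for v in values)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Renewal rates from the table, grouped by type of territory.
rates = {
    "Urban": [0.65, 0.80, 0.75, 0.75, 0.80],
    "Sub-urban": [0.80, 0.75, 0.90, 0.90, 0.90, 0.82, 0.88],
    "Rural": [0.89, 0.90, 0.91],
}
logits = {name: [log_odds(p) for p in ps] for name, ps in rates.items()}

f_stat, df_b, df_w = one_way_anova(logits)
print(f"F = {f_stat:.2f} on ({df_b}, {df_w}) degrees of freedom")
```

The F statistic is then compared with an F distribution on the printed degrees of freedom. If SciPy is available, scipy.stats.f_oneway applied to the three lists of log odds should give the same statistic and also a p-value.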
Classical regression analysis on the log odds is still not ideal, since the variance is not constant: it is greater in rural territories than in urban territories. Later modules show better methods for this statistical analysis: generalized linear models (GLMs).