In setting up the dummy variables to use for the Excel regression, is it always appropriate to use an average value for Y? For example, in this case I was going to set up Excel to perform the regression as follows.
Category | Avg. Y | D1 | D2 |
Urban | 15.18% | 1 | 0 |
Suburban | 12.50% | 0 | 1 |
Rural | 9.54% | 0 | 0 |
I do not see how you could look at all the Y values individually, since there are only two dummy variables and a default category. Is this the correct approach to this problem, or does the number of territories sampled for each group (5) need to be taken into account somehow?
[NEAS: Use the average values.
Jacob: Do we use dummy variables for urban, suburban, and rural, or dummy variables for each territory?
Rachel: This homework assignment assumes we group the territories into three categories: urban, suburban, and rural. If the insurer believed that each territory has a different claim frequency, it might use dummy variables for each territory. ]
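For anyone who wants to check the Excel output, here is a minimal sketch of the same regression in plain Python. It is only an illustration, not the course's method: it uses the three average Y values from the table, treats Rural as the default category (D1 = D2 = 0), and solves the normal equations directly. The helper names `solve` and `ols` are made up for this example.

```python
# Dummy-variable regression on the three category averages (illustrative sketch).
# Each row of X is [intercept, D1, D2]; Rural is the default (D1 = D2 = 0).
X = [
    [1.0, 1.0, 0.0],  # Urban
    [1.0, 0.0, 1.0],  # Suburban
    [1.0, 0.0, 0.0],  # Rural
]
y = [15.18, 12.50, 9.54]  # average Y per category, in percent


def solve(a, b):
    """Solve the square system a @ x = b by Gauss-Jordan elimination (hypothetical helper)."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Partial pivoting: move the largest remaining entry in this column to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y (hypothetical helper)."""
    k = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)


intercept, b_urban, b_suburban = ols(X, y)
print(f"intercept (Rural):     {intercept:.2f}")
print(f"D1 (Urban - Rural):    {b_urban:.2f}")
print(f"D2 (Suburban - Rural): {b_suburban:.2f}")
```

With three data points and three parameters the fit is exact: the intercept is the Rural average (9.54), the D1 coefficient is Urban minus Rural (5.64), and the D2 coefficient is Suburban minus Rural (2.96). A regression on the individual territory values with the same two dummies would also fit each category's mean, so it would give the same coefficients; only the residuals and standard errors would differ.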