Fox Module 16 analysis of variance: explanation of Duncan’s prestige data

(The attached PDF file has better formatting.)

This file explains one-way analysis of variance on pages 147-148 of the Fox textbook. Final exam problems compute R² and F-statistics from the TSS, RSS, and RegSS.

Chapter 8 of the Fox textbook uses Duncan’s prestige data to illustrate a one-way ANOVA analysis. The data show 45 occupations, with four attributes:
Type of occupation: professional or managerial, white collar, or blue collar
Average income
Education
Measure of prestige
Income, education, and prestige are scaled from 0 to 100. The table below shows the data by occupation. The Excel workbook attached to this posting shows all the figures for the one-way analysis of variance. The computations are straightforward; the final exam problems test these computations on a small data set.

Occupation | Type | Income | Education | Prestige
accountant | prof | 62 | 86 | 82
pilot | prof | 72 | 76 | 83
architect | prof | 75 | 92 | 90
author | prof | 55 | 90 | 76
chemist | prof | 64 | 86 | 90
minister | prof | 21 | 84 | 87
professor | prof | 64 | 93 | 93
dentist | prof | 80 | 100 | 90
reporter | wc | 67 | 87 | 52
engineer | prof | 72 | 86 | 88
undertaker | prof | 42 | 74 | 57
lawyer | prof | 76 | 98 | 89
physician | prof | 76 | 97 | 97
welfare.worker | prof | 41 | 84 | 59
teacher | prof | 48 | 91 | 73
conductor | wc | 76 | 34 | 38
contractor | prof | 53 | 45 | 76
factory.owner | prof | 60 | 56 | 81
store.manager | prof | 42 | 44 | 45
banker | prof | 78 | 82 | 92
bookkeeper | wc | 29 | 72 | 39
mail.carrier | wc | 48 | 55 | 34
insurance.agent | wc | 55 | 71 | 41
store.clerk | wc | 29 | 50 | 16
carpenter | bc | 21 | 23 | 33
electrician | bc | 47 | 39 | 53
RR.engineer | bc | 81 | 28 | 67
machinist | bc | 36 | 32 | 57
auto.repairman | bc | 22 | 22 | 26
plumber | bc | 44 | 25 | 29
gas.stn.attendant | bc | 15 | 29 | 10
coal.miner | bc | 7 | 7 | 15
streetcar.motorman | bc | 42 | 26 | 19
taxi.driver | bc | 9 | 19 | 10
truck.driver | bc | 21 | 15 | 13
machine.operator | bc | 21 | 20 | 24
barber | bc | 16 | 26 | 20
bartender | bc | 16 | 28 | 7
shoe.shiner | bc | 9 | 17 | 3
cook | bc | 14 | 22 | 16
soda.clerk | bc | 12 | 30 | 6
watchman | bc | 17 | 25 | 11
janitor | bc | 7 | 20 | 8
policeman | bc | 34 | 47 | 41
waiter | bc | 8 | 32 | 10

The one-way ANOVA analysis tests whether the type of occupation affects prestige.
Professional occupations have higher prestige than white collar or blue collar; we test if the differences are statistically significant.

Jacob: Prestige depends on education and income, not type of occupation. The highest blue collar prestige level (67) is for RR engineer, which also has the highest blue collar income (81). The highest professional prestige level is for physicians (97), who have the third highest education. In Duncan’s study, dentists and lawyers have higher education, but this is probably measurement error: medical school along with internship and residency is longer for doctors than for dentists or lawyers.

Rachel: You are correct; the full ANOVA analysis also considers income and education. One-way ANOVA considers a simpler question: does prestige differ by type of occupation? Our goal is to explain the statistical technique. Education and income affect prestige, and this simple analysis is not complete.

The explanatory variable is type of occupation; the response variable is prestige. Regression analysis assumes the response variable has a normal distribution (strictly, that the errors around each group mean are normal). But prestige is a value from 0 to 100; it does not have a normal distribution. We transform prestige to logit(prestige / 100). The transformed response variable is closer to a normal distribution.

Jacob: How do we test if the response variable has a normal distribution?

Rachel: We use QQ plots. The QQ plot for prestige is thin-tailed; the QQ plot for logit(prestige / 100) fits better to a normal distribution.

The textbook shows the computations for both prestige and logit(prestige / 100). The overall mean prestige is 47.68889. The mean prestige by type of occupation is
Professional: 80.44444
White collar: 36.66667
Blue collar: 22.76190
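As a check on the means above, here is a minimal Python sketch that recomputes them from the prestige column of the data table. The dictionary name `prestige` and the short keys `prof`, `wc`, and `bc` are illustrative choices for this sketch, not something from the Fox textbook or the Excel workbook.

```python
from statistics import mean

# Prestige scores copied from the data table, grouped by type of occupation.
prestige = {
    "prof": [82, 83, 90, 76, 90, 87, 93, 90, 88, 57, 89, 97, 59, 73, 76, 81, 45, 92],
    "wc": [52, 38, 39, 34, 41, 16],
    "bc": [33, 53, 67, 57, 26, 29, 10, 15, 19, 10, 13, 24, 20, 7, 3, 16, 6, 11, 8, 41, 10],
}

# Overall mean uses all 45 occupations, not the average of the group means.
all_scores = [y for scores in prestige.values() for y in scores]
overall_mean = mean(all_scores)

# One mean per type of occupation.
group_means = {group: mean(scores) for group, scores in prestige.items()}

print(f"overall: {overall_mean:.5f}")
for group, m in group_means.items():
    print(f"{group}: {m:.5f}")
```

The overall mean is weighted by group size (18 professional, 6 white collar, 21 blue collar), which is why it is not the simple average of the three group means.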
The total sum of squares (TSS) is the sum over all occupations of the square of (prestige minus the overall average prestige). For accountants, the contribution is (82 – 47.68889)² = 1,177.25.

The residual sum of squares (RSS) is the sum of the squares of (prestige minus the average prestige for that type of occupation). For accountants, the contribution is (82 – 80.44444)² = 2.42.

The regression sum of squares (RegSS) is the sum of the squares of (average prestige for the type of occupation minus the overall average prestige). For accountants, the contribution is (80.44444 – 47.68889)² = 1,072.93.

In the table below, the TSS, RSS, and RegSS columns show each occupation's contribution, and Mn(prs) is the mean prestige for the occupation's type.

Occupation | Type | Inc | Edu | TSS | Prestige | Mn(prs) | RSS | RegSS
accountant | prof | 62 | 86 | 1,177.25 | 82 | 80.4444 | 2.42 | 1,072.93
pilot | prof | 72 | 76 | 1,246.87 | 83 | 80.4444 | 6.53 | 1,072.93
architect | prof | 75 | 92 | 1,790.23 | 90 | 80.4444 | 91.31 | 1,072.93
author | prof | 55 | 90 | 801.52 | 76 | 80.4444 | 19.75 | 1,072.93
chemist | prof | 64 | 86 | 1,790.23 | 90 | 80.4444 | 91.31 | 1,072.93
minister | prof | 21 | 84 | 1,545.36 | 87 | 80.4444 | 42.98 | 1,072.93
professor | prof | 64 | 93 | 2,053.10 | 93 | 80.4444 | 157.64 | 1,072.93
dentist | prof | 80 | 100 | 1,790.23 | 90 | 80.4444 | 91.31 | 1,072.93
reporter | wc | 67 | 87 | 18.59 | 52 | 36.6667 | 235.11 | 121.49
engineer | prof | 72 | 86 | 1,624.99 | 88 | 80.4444 | 57.09 | 1,072.93
undertaker | prof | 42 | 74 | 86.70 | 57 | 80.4444 | 549.64 | 1,072.93
lawyer | prof | 76 | 98 | 1,706.61 | 89 | 80.4444 | 73.20 | 1,072.93
physician | prof | 76 | 97 | 2,431.59 | 97 | 80.4444 | 274.09 | 1,072.93
welfare.worker | prof | 41 | 84 | 127.94 | 59 | 80.4444 | 459.86 | 1,072.93
teacher | prof | 48 | 91 | 640.65 | 73 | 80.4444 | 55.42 | 1,072.93
conductor | wc | 76 | 34 | 93.87 | 38 | 36.6667 | 1.78 | 121.49
contractor | prof | 53 | 45 | 801.52 | 76 | 80.4444 | 19.75 | 1,072.93
factory.owner | prof | 60 | 56 | 1,109.63 | 81 | 80.4444 | 0.31 | 1,072.93
store.manager | prof | 42 | 44 | 7.23 | 45 | 80.4444 | 1,256.31 | 1,072.93
banker | prof | 78 | 82 | 1,963.47 | 92 | 80.4444 | 133.53 | 1,072.93
bookkeeper | wc | 29 | 72 | 75.50 | 39 | 36.6667 | 5.44 | 121.49
mail.carrier | wc | 48 | 55 | 187.39 | 34 | 36.6667 | 7.11 | 121.49
insurance.agent | wc | 55 | 71 | 44.74 | 41 | 36.6667 | 18.78 | 121.49
store.clerk | wc | 29 | 50 | 1,004.19 | 16 | 36.6667 | 427.11 | 121.49
carpenter | bc | 21 | 23 | 215.76 | 33 | 22.7619 | 104.82 | 621.35
electrician | bc | 47 | 39 | 28.21 | 53 | 22.7619 | 914.34 | 621.35
RR.engineer | bc | 81 | 28 | 372.92 | 67 | 22.7619 | 1,957.01 | 621.35
machinist | bc | 36 | 32 | 86.70 | 57 | 22.7619 | 1,172.25 | 621.35
auto.repairman | bc | 22 | 22 | 470.41 | 26 | 22.7619 | 10.49 | 621.35
plumber | bc | 44 | 25 | 349.27 | 29 | 22.7619 | 38.91 | 621.35
gas.stn.attendant | bc | 15 | 29 | 1,420.45 | 10 | 22.7619 | 162.87 | 621.35
coal.miner | bc | 7 | 7 | 1,068.56 | 15 | 22.7619 | 60.25 | 621.35
streetcar.motorman | bc | 42 | 26 | 823.05 | 19 | 22.7619 | 14.15 | 621.35
taxi.driver | bc | 9 | 19 | 1,420.45 | 10 | 22.7619 | 162.87 | 621.35
truck.driver | bc | 21 | 15 | 1,203.32 | 13 | 22.7619 | 95.29 | 621.35
machine.operator | bc | 21 | 20 | 561.16 | 24 | 22.7619 | 1.53 | 621.35
barber | bc | 16 | 26 | 766.67 | 20 | 22.7619 | 7.63 | 621.35
bartender | bc | 16 | 28 | 1,655.59 | 7 | 22.7619 | 248.44 | 621.35
shoe.shiner | bc | 9 | 17 | 1,997.10 | 3 | 22.7619 | 390.53 | 621.35
cook | bc | 14 | 22 | 1,004.19 | 16 | 22.7619 | 45.72 | 621.35
soda.clerk | bc | 12 | 30 | 1,737.96 | 6 | 22.7619 | 280.96 | 621.35
watchman | bc | 17 | 25 | 1,346.07 | 11 | 22.7619 | 138.34 | 621.35
janitor | bc | 7 | 20 | 1,575.21 | 8 | 22.7619 | 217.91 | 621.35
policeman | bc | 34 | 47 | 44.74 | 41 | 22.7619 | 332.63 | 621.35
waiter | bc | 8 | 32 | 1,420.45 | 10 | 22.7619 | 162.87 | 621.35
Total / average | | | | 43,687.64 | 47.6889 | | 10,597.59 | 33,090.05

The total / average row shows that TSS (43,687.64) = RSS (10,597.59) + RegSS (33,090.05).
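The column totals in the table can be reproduced with a short Python sketch. The function name `anova_sums_of_squares` and the dictionary layout are assumptions made for this illustration; the attached Excel workbook does the same arithmetic cell by cell.

```python
def anova_sums_of_squares(groups):
    """Return (TSS, RSS, RegSS) for a one-way layout given {group: [values]}."""
    all_values = [y for values in groups.values() for y in values]
    grand_mean = sum(all_values) / len(all_values)

    # TSS: squared deviation of every observation from the overall mean.
    tss = sum((y - grand_mean) ** 2 for y in all_values)

    rss = 0.0
    regss = 0.0
    for values in groups.values():
        group_mean = sum(values) / len(values)
        # RSS: squared deviation of each observation from its own group mean.
        rss += sum((y - group_mean) ** 2 for y in values)
        # RegSS: squared deviation of the group mean from the overall mean,
        # counted once for each observation in the group.
        regss += len(values) * (group_mean - grand_mean) ** 2
    return tss, rss, regss


# Prestige scores copied from the data table, grouped by type of occupation.
prestige = {
    "prof": [82, 83, 90, 76, 90, 87, 93, 90, 88, 57, 89, 97, 59, 73, 76, 81, 45, 92],
    "wc": [52, 38, 39, 34, 41, 16],
    "bc": [33, 53, 67, 57, 26, 29, 10, 15, 19, 10, 13, 24, 20, 7, 3, 16, 6, 11, 8, 41, 10],
}

tss, rss, regss = anova_sums_of_squares(prestige)
print(f"TSS = {tss:,.2f}, RSS = {rss:,.2f}, RegSS = {regss:,.2f}")
```

The same function works on the small data sets used in practice problems: pass a dictionary with one list of values per group.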
The prestige scores do not have a normal distribution. For a better ANOVA analysis, Fox uses the logit of the prestige scores divided by 100. Let Pr = prestige / 100, so logit(Pr) = ln(Pr / (1 – Pr)). We do not show the analysis of variance table for unadjusted prestige levels, though you can compute it easily from the last row of the table.

Logit of (Prestige / 100)

We form the same table using logit(prestige / 100). The attached Excel workbook has the same figures. Mn(logit) is the mean logit for the occupation's type; the TSS, RegSS, and RSS columns show each occupation's contribution on the logit scale.

Occupation | Type | Inc | Edu | TSS | Pres | logit(Pr) | Mn(logit) | RegSS | RSS
accountant | prof | 62 | 86 | 2.66960 | 82 | 1.5163 | 1.632114 | 3.06130 | 0.01340
pilot | prof | 72 | 76 | 2.90079 | 83 | 1.5856 | 1.632114 | 3.06130 | 0.00216
architect | prof | 75 | 92 | 5.35815 | 90 | 2.1972 | 1.632114 | 3.06130 | 0.31935
author | prof | 55 | 90 | 1.61347 | 76 | 1.1527 | 1.632114 | 3.06130 | 0.22986
chemist | prof | 64 | 86 | 5.35815 | 90 | 2.1972 | 1.632114 | 3.06130 | 0.31935
minister | prof | 21 | 84 | 4.07435 | 87 | 1.9010 | 1.632114 | 3.06130 | 0.07228
professor | prof | 64 | 93 | 7.31287 | 93 | 2.5867 | 1.632114 | 3.06130 | 0.91121
dentist | prof | 80 | 100 | 5.35815 | 90 | 2.1972 | 1.632114 | 3.06130 | 0.31935
reporter | wc | 67 | 87 | 0.03904 | 52 | 0.0800 | -0.590384 | 0.22358 | 0.44947
engineer | prof | 72 | 86 | 4.45199 | 88 | 1.9924 | 1.632114 | 3.06130 | 0.12983
undertaker | prof | 42 | 74 | 0.15952 | 57 | 0.2819 | 1.632114 | 3.06130 | 1.82321
lawyer | prof | 76 | 98 | 4.87652 | 89 | 2.0907 | 1.632114 | 3.06130 | 0.21034
physician | prof | 76 | 97 | 12.91426 | 97 | 3.4761 | 1.632114 | 3.06130 | 3.40028
welfare.worker | prof | 41 | 84 | 0.23185 | 59 | 0.3640 | 1.632114 | 3.06130 | 1.60820
teacher | prof | 48 | 91 | 1.23691 | 73 | 0.9946 | 1.632114 | 3.06130 | 0.40640
conductor | wc | 76 | 34 | 0.13839 | 38 | -0.4895 | -0.590384 | 0.22358 | 0.01017
contractor | prof | 53 | 45 | 1.61347 | 76 | 1.1527 | 1.632114 | 3.06130 | 0.22986
factory.owner | prof | 60 | 56 | 2.45722 | 81 | 1.4500 | 1.632114 | 3.06130 | 0.03316
store.manager | prof | 42 | 44 | 0.00691 | 45 | -0.2007 | 1.632114 | 3.06130 | 3.35910
banker | prof | 78 | 82 | 6.55304 | 92 | 2.4423 | 1.632114 | 3.06130 | 0.65648
bookkeeper | wc | 29 | 72 | 0.10875 | 39 | -0.4473 | -0.590384 | 0.22358 | 0.02047
mail.carrier | wc | 48 | 55 | 0.29784 | 34 | -0.6633 | -0.590384 | 0.22358 | 0.00532
insurance.agent | wc | 55 | 71 | 0.06072 | 41 | -0.3640 | -0.590384 | 0.22358 | 0.05127
store.clerk | wc | 29 | 50 | 2.37371 | 16 | -1.6582 | -0.590384 | 0.22358 | 1.14029
carpenter | bc | 21 | 23 | 0.34886 | 33 | -0.7082 | -1.482151 | 1.86216 | 0.59902
electrician | bc | 47 | 39 | 0.05650 | 53 | 0.1201 | -1.482151 | 1.86216 | 2.56735
RR.engineer | bc | 81 | 28 | 0.68183 | 67 | 0.7082 | -1.482151 | 1.86216 | 4.79757
machinist | bc | 36 | 32 | 0.15952 | 57 | 0.2819 | -1.482151 | 1.86216 | 3.11171
auto.repairman | bc | 22 | 22 | 0.86197 | 26 | -1.0460 | -1.482151 | 1.86216 | 0.19026
plumber | bc | 44 | 25 | 0.60504 | 29 | -0.8954 | -1.482151 | 1.86216 | 0.34430
gas.stn.attendant | bc | 15 | 29 | 4.32508 | 10 | -2.1972 | -1.482151 | 1.86216 | 0.51133
coal.miner | bc | 7 | 7 | 2.61488 | 15 | -1.7346 | -1.482151 | 1.86216 | 0.06373
streetcar.motorman | bc | 42 | 26 | 1.77547 | 19 | -1.4500 | -1.482151 | 1.86216 | 0.00103
taxi.driver | bc | 9 | 19 | 4.32508 | 10 | -2.1972 | -1.482151 | 1.86216 | 0.51133
truck.driver | bc | 21 | 15 | 3.18057 | 13 | -1.9010 | -1.482151 | 1.86216 | 0.17540
machine.operator | bc | 21 | 20 | 1.07151 | 24 | -1.1527 | -1.482151 | 1.86216 | 0.10855
barber | bc | 16 | 26 | 1.60973 | 20 | -1.3863 | -1.482151 | 1.86216 | 0.00919
bartender | bc | 16 | 28 | 6.09668 | 7 | -2.5867 | -1.482151 | 1.86216 | 1.22000
shoe.shiner | bc | 9 | 17 | 11.27990 | 3 | -3.4761 | -1.482151 | 1.86216 | 3.97583
cook | bc | 14 | 22 | 2.37371 | 16 | -1.6582 | -1.482151 | 1.86216 | 0.03100
soda.clerk | bc | 12 | 30 | 6.93792 | 6 | -2.7515 | -1.482151 | 1.86216 | 1.61134
watchman | bc | 17 | 25 | 3.89351 | 11 | -2.0907 | -1.482151 | 1.86216 | 0.37038
janitor | bc | 7 | 20 | 5.40471 | 8 | -2.4423 | -1.482151 | 1.86216 | 0.92198
policeman | bc | 34 | 47 | 0.06072 | 41 | -0.3640 | -1.482151 | 1.86216 | 1.25034
waiter | bc | 8 | 32 | 4.32508 | 10 | -2.1972 | -1.482151 | 1.86216 | 0.51133
Total / avg | | | | 134.15390 | 47.689 | -0.1175 | | 95.55014 | 38.60376

These tables show how Fox calculated the figures on page 148. Fox doesn’t show all the work; the tables here show all the computations.
The logit of prestige / 100 for accountants is ln(0.82 / (1 – 0.82)) = 1.51635.
The average logit for all occupations is –0.11754.
The average logit by type of occupation is 1.6321 for professional, –0.5904 for white collar, and –1.4821 for blue collar.
The contribution of accountants to the regression sum of squares (RegSS) is (1.6321 – (–0.11754))² = 3.06124.
The contribution of accountants to the residual sum of squares (RSS) is (1.6321 – 1.5163)² = 0.01341.
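The logit figures can be checked with a few lines of Python. This is only a sketch: the helper name `logit` and the variable names are chosen for this illustration, and the prestige scores are the ones from the data table.

```python
import math


def logit(p):
    """Log-odds of a proportion p, defined for 0 < p < 1."""
    return math.log(p / (1 - p))


# Prestige scores copied from the data table, grouped by type of occupation.
prestige = {
    "prof": [82, 83, 90, 76, 90, 87, 93, 90, 88, 57, 89, 97, 59, 73, 76, 81, 45, 92],
    "wc": [52, 38, 39, 34, 41, 16],
    "bc": [33, 53, 67, 57, 26, 29, 10, 15, 19, 10, 13, 24, 20, 7, 3, 16, 6, 11, 8, 41, 10],
}

# Transform each prestige score to logit(prestige / 100).
logits = {g: [logit(y / 100) for y in scores] for g, scores in prestige.items()}

accountant_logit = logit(82 / 100)
all_logits = [v for values in logits.values() for v in values]
overall_mean_logit = sum(all_logits) / len(all_logits)
group_mean_logit = {g: sum(v) / len(v) for g, v in logits.items()}

print(f"accountant: {accountant_logit:.5f}")
print(f"overall mean logit: {overall_mean_logit:.5f}")
for g, m in group_mean_logit.items():
    print(f"{g}: {m:.6f}")
```

The logit is undefined at prestige scores of exactly 0 or 100; no occupation in Duncan's data reaches either end, so the transform applies to all 45 rows.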
The total row in the table forms the analysis of variance. Fox shows the following table:

Source | Sum of Squares | Degrees of Freedom | Mean Square | F | p
Groups | 95.550 | 2 | 47.775 | 51.98 | << 0.001
Residuals | 38.604 | 42 | 0.919 | |
Total | 134.154 | 44 | | |
The mean square is the sum of squares divided by the degrees of freedom: 95.550 / 2 = 47.775; 38.604 / 42 = 0.9191.
The F-statistic is the mean square for the groups divided by the mean square of the residuals: 47.775 / 0.9191 = 51.98.
The R² is the sum of squares for the groups (RegSS) divided by the total sum of squares (TSS): 95.550 / 134.154 = 71.22%.
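The arithmetic for the mean squares, the F-statistic, and R² is the kind of computation the final exam problems ask for, so a small Python sketch may help. The function name `anova_summary` and its argument names are made up for this illustration; the inputs are the totals from the logit table.

```python
def anova_summary(regss, rss, df_groups, df_resid):
    """Return mean squares, F-statistic, and R-squared for a one-way ANOVA."""
    ms_groups = regss / df_groups      # mean square for the groups
    ms_resid = rss / df_resid          # mean square for the residuals
    f_stat = ms_groups / ms_resid      # F = MS(groups) / MS(residuals)
    r_squared = regss / (regss + rss)  # R^2 = RegSS / TSS, with TSS = RegSS + RSS
    return ms_groups, ms_resid, f_stat, r_squared


# Totals from the logit(prestige / 100) table: RegSS, RSS, and their degrees of freedom.
ms_groups, ms_resid, f_stat, r_squared = anova_summary(95.55014, 38.60376, 2, 42)

print(f"MS groups    = {ms_groups:.3f}")
print(f"MS residuals = {ms_resid:.4f}")
print(f"F            = {f_stat:.2f}")
print(f"R-squared    = {r_squared:.2%}")
```

The p-value is not computed here; it needs the F distribution with 2 and 42 degrees of freedom, which the Python standard library does not provide.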
Jacob: How do we get the degrees of freedom?

Rachel: 45 data points minus 1 parameter (the mean) = 44 degrees of freedom for the total sum of squares. Three occupation types (groups) minus one relation = 2 degrees of freedom for the groups. The relation is that an occupation is either professional, white collar, or blue collar. Degrees of freedom for TSS – degrees of freedom for RegSS = degrees of freedom for RSS: 44 – 2 = 42.
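Rachel's rule for the degrees of freedom can be written as a tiny Python helper. The function name `one_way_degrees_of_freedom` is hypothetical and used only for this sketch.

```python
def one_way_degrees_of_freedom(n_obs, n_groups):
    """Return (df_total, df_groups, df_residual) for a one-way ANOVA."""
    df_total = n_obs - 1               # data points minus one parameter (the mean)
    df_groups = n_groups - 1           # groups minus one relation
    df_resid = df_total - df_groups    # equivalently n_obs - n_groups
    return df_total, df_groups, df_resid


# Duncan's data: 45 occupations in 3 types of occupation.
df_total, df_groups, df_resid = one_way_degrees_of_freedom(45, 3)
print(df_total, df_groups, df_resid)
```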
|
Going from the occupational table to the type of occupation table, why is the sum of the residual sum of squares used as the RegSS for the group mean square, and on the other side the RegSS of the occupational table is used as the RSS for the Residual Mean Square?

[NEAS: Thank you for noticing the typo: the RSS and RegSS column headings were reversed on one of the exhibits, though all the figures were correct and the final F-statistic was correctly computed. The regression sum of squares is the square of the group mean minus the overall mean. The residual sum of squares is the square of the individual value minus the group mean. The post has been corrected and re-posted.]
|