Neas-Seminars

Fox Module 19: Heteroscedasticity HW


http://33771.hs2.instantasp.net/Topic8680.aspx

By NEAS - 12/3/2009 2:31:56 PM


 

(The attached PDF file has better formatting.)

 

Homework assignment: residual plots and heteroscedasticity

 

This homework assignment simulates heteroscedastic data and forms residual plots. Use Excel or other statistical software.

 

You must use Excel or other statistical software for the student project. Learn the statistical tools in your software. Simulations are the quickest way to learn the statistical concepts.

 


A.     Choose a sample for the explanatory variables. The exhibits below use 1 to 100.

B.     Simulate response variables whose observed values are heteroscedastic. The exhibits below use Yj = Xj + Xj × N(0,1), where N(0,1) is a standard normal draw: the standard deviation of the error term is proportional to Xj, so its variance is proportional to Xj².

C.    Form the regression equation and derive the fitted values at each point.

D.    Plot the residuals against the observed values (left pane below).

E.     Plot the residuals against the fitted values (right pane below).

F.     Explain why the two plots are so different.

 

We show sample exhibits below. Your exhibits will look different, since you simulate points.

 

If your variance is too small, you won’t see the effect that the textbook discusses.

Strictly, we should use Yj = Xj + ŷj × N(0,1), so that the standard deviation of the error term is proportional to the fitted value of Yj. This is harder to simulate, so we use the simpler formula above.

 

Form the exhibits in Excel, SAS, R, or any other software. Sample exhibits are below.

 

By NEAS - 1/3/2013 8:04:23 AM

NEAS: RAND() gives a random number from a uniform distribution on (0,1). To get a random number from a standard normal distribution, use NORMSINV(RAND()), which will give you the plots in the homework posting. RAND() has a smaller variance (1/12 instead of 1) and a mean of 0.5 instead of 0, so it doesn’t show the difference between the two plots. You can also use R to generate the plots:

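# Explanatory variable: the integers 1 to 100
xvv <- 1:100
# Response: the standard deviation of the error term grows with xvv
yvv <- xvv + xvv * rnorm(100,0,1)
# Fit the regression, then extract the residuals and the fitted values
lm.hsce <- lm(yvv ~ xvv)
rsds <- residuals(lm.hsce)
fits <- fitted(lm.hsce)
# Two panes side by side: residuals vs observed (left), residuals vs fitted (right)
par(mfcol=c(1,2))
plot(yvv, rsds, xlab="observed values", ylab = "residuals", main="residuals vs observed values")
plot(fits, rsds, xlab="fitted values", ylab = "residuals", main="residuals vs fitted values")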
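The two plots can also be summarized numerically. The sketch below is only one way to do this and is not part of the homework posting: the seed, the split at x = 50, and the names cor.fit, cor.obs, sd.low, and sd.high are illustrative assumptions. It checks that least squares residuals are uncorrelated with the fitted values but positively correlated with the observed values, and that the residual spread is larger for large x.

```r
# Sketch only: the seed and the split at x = 50 are illustrative assumptions.
set.seed(19)
xvv <- 1:100
yvv <- xvv + xvv * rnorm(100, 0, 1)   # error standard deviation grows with xvv

lm.hsce <- lm(yvv ~ xvv)
rsds <- residuals(lm.hsce)
fits <- fitted(lm.hsce)

# Least squares (with an intercept) makes the residuals uncorrelated with the
# fitted values; the residuals stay positively correlated with the observed values.
cor.fit <- cor(fits, rsds)   # zero up to rounding error
cor.obs <- cor(yvv, rsds)    # equals sqrt(1 - R^2)

# Heteroscedasticity appears as a larger residual spread in the upper half of x.
sd.low  <- sd(rsds[xvv <= 50])
sd.high <- sd(rsds[xvv > 50])

print(c(cor.fit = cor.fit, cor.obs = cor.obs, sd.low = sd.low, sd.high = sd.high))
```

If the error variance is made very small, R² approaches 1 and cor.obs approaches 0, so the two plots start to look alike.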

Using random draws from a uniform distribution instead gives the plots that you formed, as with

yvv <- xvv + xvv * runif(100,0,1)
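To see how the two kinds of draws differ, the sketch below compares them directly. It assumes that qnorm(runif(n)) is a fair R stand-in for Excel's NORMSINV(RAND()) and that runif(n) stands in for RAND(); the seed, the sample size, and the names z.inv and u.draw are illustrative choices, not part of the original posting.

```r
# Sketch only: seed, sample size, and variable names are illustrative assumptions.
set.seed(2013)
n <- 100000

z.inv  <- qnorm(runif(n))   # inverse normal CDF of a uniform draw, like NORMSINV(RAND())
u.draw <- runif(n)          # plain uniform draw on (0,1), like RAND()

# Standard normal draws: mean near 0, variance near 1
print(c(mean = mean(z.inv), variance = var(z.inv)))

# Uniform draws: mean near 0.5, variance near 1/12
print(c(mean = mean(u.draw), variance = var(u.draw)))
```

Because the uniform draws are never negative and have a mean of about 0.5, multiplying them by xvv shifts the response upward as well as shrinking the spread of the errors.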