Fox Module 19: Heteroscedasticity HW



NEAS

Fox Module 19: Heteroscedasticity

 

(The attached PDF file has better formatting.)

 

Homework assignment: residual plots and heteroscedasticity

 

This homework assignment simulates heteroscedastic data and forms residual plots. Use Excel or other statistical software.

You must use Excel or other statistical software for the student project, so learn the statistical tools in your software now. Simulations are the quickest way to learn the statistical concepts.

 


A. Choose a sample for the explanatory variable. The exhibits below use X = 1 to 100.

B. Simulate response variables whose observed values are heteroscedastic. The exhibits below use Yj = Xj + Xj × N(0,1), where N(0,1) is a draw from a standard normal distribution: the standard deviation of the error term is proportional to Xj.

C. Fit the regression equation and derive the fitted value at each point.

D. Plot the residuals against the observed values (left pane below).

E. Plot the residuals against the fitted values (right pane below).

F. Explain why the two plots are so different.

 

We show sample exhibits below. Your exhibits will look different, since you simulate points.

 

If your variance is too small, you won’t see the effect that the textbook discusses.

Strictly, we should use Yj = Xj + ŷj × N(0,1), so that the standard deviation of the error term is proportional to the fitted value of Yj. This is harder to simulate, so we use the simpler formula above.

 

Form the exhibits in Excel, SAS, R, or any other software. Sample exhibits are below.

 


Delta

In Excel, which function is good for N(0,1)? Or should we just use a random number generator such as RAND() or RANDBETWEEN(0,100)?

[NEAS: Either function is fine.]


CalLadyQED

We can send in graphs from Excel, right? They don't have to be hand-drawn?

Is it fine to just plot the raw residuals here, or do we need to calculate the studentized residuals and use those?

[NEAS: Yes, use Excel graphs and just plot the residuals.]


anne26

For A, X is labeled 1 to 100 while for B, Y = Xi + Xi* rand(0,1). Is my understanding correct?
Please advise. Thanks!

[NEAS: You can choose any set of points, as long as you make the standard errors dependent on either the X values or the Y values. The example given uses Y = X with a standard error that varies with X.]


dclevel
What is the meaning of plots of studentized residuals vs fitted values, and of spread-level plots?

Or where can I find this info?
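
A minimal sketch of both plots in R, offered as one reading of these terms rather than an official course answer. It assumes the simulation from the assignment (X from 1 to 100, standard normal errors scaled by X); the seed and the names stud, yhat, pos, and sl_fit are illustrative choices. A studentized residual is a residual divided by an estimate of its standard error. A spread-level plot graphs the log of the absolute studentized residuals against the log of the fitted values, so a rising trend suggests that the spread grows with the level of the response.

```r
# Sketch: studentized residuals vs fitted values, and a spread-level plot.
set.seed(19)                          # arbitrary seed, for reproducibility only
x <- 1:100
y <- x + x * rnorm(100)               # the assignment's simulation
fit <- lm(y ~ x)

stud <- rstudent(fit)                 # externally studentized residuals
yhat <- fitted(fit)

par(mfcol = c(1, 2))
# A fan shape (spread growing with the fitted value) suggests
# nonconstant error variance.
plot(yhat, stud, xlab = "fitted values", ylab = "studentized residuals",
     main = "studentized residuals vs fitted")

# Spread-level plot: log |studentized residual| against log fitted value.
# Logs need positive fitted values, so non-positive ones are dropped.
pos <- yhat > 0
plot(log(yhat[pos]), log(abs(stud[pos])), xlab = "log fitted values",
     ylab = "log |studentized residuals|", main = "spread-level plot")
sl_fit <- lm(log(abs(stud[pos])) ~ log(yhat[pos]))
abline(sl_fit)                        # trend line through the spread-level plot
```

rstudent() and fitted() are base R functions, so nothing beyond a default R installation is assumed.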
smh1021

Based on the example plots given, it appears that you are trying to show that the residuals vs observed values plot has a more constant variance, while the residuals vs fitted values plot has a nonconstant error variance and thus is much more spread out. Is this a correct interpretation of these plots?

[NEAS: The variance is not constant. The point of the example is that an observed value greater than the fitted value causes a positive residual, and an observed value less than the fitted value causes a negative residual.]
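
This point can be checked numerically. The sketch below assumes the assignment's simulation with an arbitrary seed; the variable names are illustrative. Least squares forces the residuals to be uncorrelated with the fitted values, but since observed = fitted + residual, the residuals are positively correlated with the observed values, and that correlation equals sqrt(1 - R²).

```r
# Sketch: why residuals track observed values but not fitted values.
set.seed(19)                          # arbitrary seed, for reproducibility only
x <- 1:100
y <- x + x * rnorm(100)               # the assignment's simulation
fit  <- lm(y ~ x)
e    <- residuals(fit)
yhat <- fitted(fit)
r2   <- summary(fit)$r.squared

# Least squares makes the residuals uncorrelated with the fitted values ...
cor_fitted <- cor(e, yhat)            # zero up to rounding error
# ... but observed = fitted + residual, so residuals correlate with observed.
cor_observed <- cor(e, y)             # equals sqrt(1 - R^2)

print(c(cor_fitted = cor_fitted, cor_observed = cor_observed,
        sqrt_1_minus_r2 = sqrt(1 - r2)))
```

The lower the R², the stronger the upward trend in the residuals vs observed values plot, which is why a noisy simulation makes the two plots look so different.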

When I did my own simulation in Excel, my residuals vs fitted values plot had a similar pattern to the one shown, but my residuals vs observed values plot was much more spread out than the example shown. Comparing these graphs, they are not as "different" as I would expect them to be. Is there a possible reason for this, or do I need to re-do my simulation?

[NEAS: The difference is clear in very small samples and large residuals at one point. In large samples with residuals of similar size at all points, the difference is hard to see.]


mbellis2011
I must be doing something wrong...

I am using Excel to run my regression, with X from 1 to 100 and Yi = Xi + Xi*RAND():

X      Y
1      1.944
2      2.205
...    ...
100    169.911


When I run my regression my Resid vs Observed plot looks identical to my Resid vs Fitted plot (both look like the Resid vs Fitted plot illustrated in the HW). I obviously made a mistake somewhere along the way (maybe in the X's and Y's I am using in the regression?).



Adversely Selected
mbellis2011 (12/12/2012)
I think what you are doing is correct, because I got similar results. I think the graph they provided for residuals vs observed values is incorrect. What NEAS plotted in the first graph looks more like fitted values vs observed values.
Edited 11 Years Ago by NEAS
NEAS
NEAS: RAND() gives a random number from a uniform distribution on (0,1). To get a random number from a standard normal distribution, use NORMSINV(RAND()), which will give you the plots in the homework posting. RAND() has a smaller variance (1/12, versus 1 for the standard normal) and is never negative, so it doesn’t show the difference between the two plots. You can also use R to generate the plots:

xvv <- 1:100                                # explanatory variable
yvv <- xvv + xvv * rnorm(100, 0, 1)         # error standard deviation proportional to xvv
lm.hsce <- lm(yvv ~ xvv)                    # fit the regression
rsds <- residuals(lm.hsce)                  # residuals
fits <- fitted(lm.hsce)                     # fitted values
par(mfcol = c(1, 2))                        # two panes, side by side
plot(yvv, rsds, xlab = "observed values", ylab = "residuals", main = "residuals vs observed values")
plot(fits, rsds, xlab = "fitted values", ylab = "residuals", main = "residuals vs fitted values")

Using random draws from a uniform distribution gives the plots which you formed, as with

yvv <- xvv + xvv * runif(100,0,1)
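
The two choices of random draw can be compared side by side. The following is a rough sketch, not part of the original posting: the seed, the reuse of one set of uniform draws for both simulations, and the names u, sims, and compare are assumptions made for illustration. qnorm(runif(n)) plays the role of Excel's NORMSINV(RAND()). With uniform draws the regression should typically show a much higher R², so the residuals are only weakly correlated with the observed values and the two plots look alike.

```r
# Sketch: normal errors (NORMSINV(RAND())) versus uniform errors (RAND()).
set.seed(19)                          # arbitrary seed, for reproducibility only
x <- 1:100
u <- runif(100)                       # like Excel RAND(): uniform on (0,1)

sims <- list(
  normal  = x + x * qnorm(u),         # like Xi + Xi*NORMSINV(RAND())
  uniform = x + x * u                 # like Xi + Xi*RAND()
)

# For each simulation, report R^2 and the residual/observed correlation.
compare <- sapply(sims, function(y) {
  fit <- lm(y ~ x)
  c(r_squared = summary(fit)$r.squared,
    cor_resid_observed = cor(residuals(fit), y))
})
print(round(compare, 3))
```

Each column of compare describes one simulation; the row cor_resid_observed measures how strongly the residuals vs observed values plot trends upward.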

