MS Mod 16: Regression estimates – practice problems


Author: NEAS

MS Module 16: Regression estimates – practice problems

(The attached PDF file has better formatting.)

Exercise 16.1: Least squares estimator for β1

●    A linear regression uses the N points Xi = {1, 2, …, 10, 11}
●    The least squares estimator for β1 is a linear function of the Y values: β1 = Σ γiYi

(The textbook uses the notation β1 = Σ ciYi)

A.    What is x̄, the mean X value?
B.    What is Sxx, the sum of squared deviations of the X values from their mean?
C.    What is γ2, the coefficient of the Y value corresponding to X=2, in the estimate of β1?

Part A: The mean X value is (1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9 + 10 + 11) / 11 = 6

Part B: Sxx, the sum of squared deviations of the X values from their mean, is

(1-6)² + (2-6)² + (3-6)² + (4-6)² + (5-6)² + (6-6)² + (7-6)² + (8-6)² + (9-6)² + (10-6)² + (11-6)² = 110

Part C: γi = (xi – x̄) / Sxx, so γ2 = (2 – 6) / 110 = -0.03636
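
The arithmetic in Parts A to C can be checked with a short Python sketch. The variable names (xs, x_bar, sxx, gammas) are illustrative choices, not notation from the textbook.

```python
# Weights gamma_i = (x_i - x_bar) / Sxx for the design points 1, 2, ..., 11.
xs = list(range(1, 12))
n = len(xs)

x_bar = sum(xs) / n                          # Part A: mean X value
sxx = sum((x - x_bar) ** 2 for x in xs)      # Part B: sum of squared deviations
gammas = {x: (x - x_bar) / sxx for x in xs}  # Part C: one weight per X value

print(x_bar, sxx, round(gammas[2], 5))       # 6.0 110.0 -0.03636
```

The weights sum to zero because the deviations (xi – x̄) sum to zero.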

Question: How does this formula relate to the formula β1 = Sxy / Sxx?

Answer: Expand the formula β1 = Sxy / Sxx = Σ (xi – x̄) (yi – ȳ) / Sxx =

     Σ (xi – x̄) × yi / Sxx – Σ (xi – x̄) × ȳ / Sxx = Σ γiYi – 0

The value of ȳ / Sxx is independent of the subscript i, and the deviations from the mean sum to zero (Σ (xi – x̄) = 0), so Σ (xi – x̄) × ȳ / Sxx = [Σ (xi – x̄)] × [ȳ / Sxx] = 0, and

     Σ (xi – x̄) × yi / Sxx – Σ (xi – x̄) × ȳ / Sxx = Σ γiYi
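
A numerical check of this identity, as a minimal sketch: the exercise gives no Y values, so the ys list below is made up purely for illustration. Any Y values should give the same agreement.

```python
# Check that Sxy / Sxx equals the weighted sum of the Y values with weights gamma_i.
xs = list(range(1, 12))
# Hypothetical Y values (an assumption; the exercise does not supply any).
ys = [2.1, 2.9, 4.2, 4.8, 6.1, 7.3, 7.9, 9.2, 9.8, 11.1, 12.4]
n = len(xs)

x_bar = sum(xs) / n
y_bar = sum(ys) / n
sxx = sum((x - x_bar) ** 2 for x in xs)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))

beta1_ratio = sxy / sxx                                                # Sxy / Sxx
beta1_weighted = sum((x - x_bar) / sxx * y for x, y in zip(xs, ys))    # sum of gamma_i * y_i
cross_term = sum(x - x_bar for x in xs) * y_bar / sxx                  # the term that vanishes

print(abs(beta1_ratio - beta1_weighted) < 1e-12, abs(cross_term) < 1e-12)
```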

Exercise 16.2: Summary statistics

A regression analysis on 10 data points has summary statistics

●    Σxi = 40
●    Σyi = 20
●    Σxi² = 4,000
●    Σyi² = 1,200
●    Σxiyi = 1,600

A.    What is x̄, the average X value?
B.    What is ȳ, the average Y value?
C.    What is Sxx, the sum of squared deviations of the X values?
D.    What is Syy, the sum of squared deviations of the Y values?
E.    What is Sxy, the cross sum of squares of the X and Y values?
F.    What is the least squares estimate for β1?
G.    What is the least squares estimate for β0?
H.    What is the error sum of squares SSE?
I.    What is s², the least squares estimate for σ²?
J.    What is the correlation ρ between X and Y?
K.    What is the least squares estimate for R²?

Part A: The average X value is x̄ = Σxi / N = 40 / 10 = 4

Part B: The average Y value is ȳ = Σyi / N = 20 / 10 = 2

Part C: Sxx, the sum of squared deviations of the X values, is Σxi² – N × x̄² = Σxi² – (Σxi)²/N =

    4,000 – 10 × 4² = 3,840

Part D: Syy, the sum of squared deviations of the Y values (the total sum of squares SST), is Σyi² – N × ȳ² =

    1,200 – 10 × 2² = 1,160

Part E: Sxy, the cross sum of squares of the X and Y values, is Σxiyi – N × x̄ × ȳ =

    1,600 – 10 × 4 × 2 = 1,520

Part F: The least squares estimate for β1 is Sxy / Sxx = 1,520 / 3,840 = 0.395833

Part G: The least squares estimate for β0 is ȳ – β1 × x̄ = 2 – 0.39583333 × 4 = 0.416667

Part H: The error sum of squares SSE is Σyi² – β0 × Σyi – β1 × Σxiyi =

1,200 – 0.41666667 × 20 – 0.39583333 × 1,600 = 558.333339
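
Parts A to H can be reproduced from the five summary statistics alone. The sketch below uses illustrative variable names; sse_alt is an extra cross-check using the equivalent form Syy – Sxy²/Sxx, which is not part of the worked solution above.

```python
# Least squares estimates from summary statistics (Exercise 16.2, Parts A-H).
n = 10
sum_x, sum_y = 40, 20
sum_x2, sum_y2, sum_xy = 4000, 1200, 1600

x_bar = sum_x / n                        # Part A
y_bar = sum_y / n                        # Part B
sxx = sum_x2 - n * x_bar ** 2            # Part C
syy = sum_y2 - n * y_bar ** 2            # Part D (also SST)
sxy = sum_xy - n * x_bar * y_bar         # Part E

beta1 = sxy / sxx                        # Part F
beta0 = y_bar - beta1 * x_bar            # Part G
sse = sum_y2 - beta0 * sum_y - beta1 * sum_xy   # Part H
sse_alt = syy - sxy ** 2 / sxx           # equivalent form, used as a cross-check

print(sxx, syy, sxy, round(beta1, 6), round(beta0, 6), round(sse, 4))
```

At full floating-point precision SSE is 558.3333; the trailing digits of 558.333339 in the hand calculation come from using β0 and β1 rounded to eight decimal places.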

Question: Do we need so many significant digits?

Answer: Extra significant digits are not used in real problems, since they give a false sense of accuracy. The practice problems show many significant digits so that when you work the problems on a spreadsheet or a calculator you can check your answers.

Some terms have very small numbers multiplied by very large numbers. If you round 0.00149 × 200 to 0.001 × 200, your solution may be incorrect.

The textbook says “in computing β0, use extra digits in β1, because, if x̄ is large in magnitude, rounding may affect the final answer.” See page 619 for an example.
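
To see how much rounding can matter in this exercise, the sketch below recomputes SSE from Part H with β0 and β1 rounded to two decimal places. The choice of two decimals and the helper name sse_from are assumptions made for illustration.

```python
# Effect of rounding the coefficients before computing SSE (Exercise 16.2, Part H).
sum_y, sum_y2, sum_xy = 20, 1200, 1600

beta1 = 1520 / 3840          # Sxy / Sxx
beta0 = 2 - beta1 * 4        # y_bar - beta1 * x_bar

def sse_from(b0, b1):
    """SSE = sum(y^2) - b0 * sum(y) - b1 * sum(x*y)."""
    return sum_y2 - b0 * sum_y - b1 * sum_xy

sse_full = sse_from(beta0, beta1)                          # about 558.33
sse_rounded = sse_from(round(beta0, 2), round(beta1, 2))   # uses 0.42 and 0.40

print(round(sse_full, 2), round(sse_rounded, 2))           # 558.33 551.6
```

Rounding β1 from 0.395833 to 0.40 changes it by only about 0.004, but that error is multiplied by Σxiyi = 1,600, so SSE moves by almost 7.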

Part I: The value of s², the least squares estimate for σ², is SSE / (N – 2) = 558.3333 / (10 – 2) = 69.7917

Part J: The correlation ρ between X and Y is Sxy / (Sxx × Syy)^½ =

    1,520 / (3,840 × 1,160)^0.5 = 0.720193

Part K: R² is 1 – SSE/SST; SST is the same as Syy.

    R² = 1 – 558.333333 / 1,160 = 0.518678

Note that R² is the square of the correlation between X and Y: 0.720193² = 0.518678
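
Parts I to K follow from Sxx, Syy, and Sxy. This minimal sketch, with illustrative variable names, also confirms that R² equals the squared correlation.

```python
import math

# Exercise 16.2, Parts I-K, starting from the sums of squares found in Parts C-E.
n = 10
sxx, syy, sxy = 3840, 1160, 1520

sse = syy - sxy ** 2 / sxx           # error sum of squares, same value as Part H
s2 = sse / (n - 2)                   # Part I: estimate of sigma^2
r = sxy / math.sqrt(sxx * syy)       # Part J: correlation between X and Y
r2 = 1 - sse / syy                   # Part K: R^2, with SST = Syy

print(round(s2, 4), round(r, 6), round(r2, 6))   # 69.7917 0.720193 0.518678
```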

Exercise 16.3: Estimating σ²

A statistician estimating σ² for a regression analysis mistakenly divides SSE (the error sum of squares) by (n-1) instead of (n-2), where n is the number of observations.

If the population is normally distributed, n = 17, and σ² = 4:

A.    What is the expected value of the statistician’s estimator?
B.    What is the bias of the statistician’s estimator?

Part A: Let s² be the unbiased estimator of σ², using a denominator of (n-2). The mistaken estimator using a denominator of (n-1) is s² × (n-2)/(n-1), and its expected value is σ² × (n-2)/(n-1). For n = 17 and σ² = 4, the expected value is 4 × 15/16 = 3.750.

Part B: The bias of the mistaken estimator is σ² × (n-2)/(n-1) – σ² = – σ²/(n-1). For n = 17 and σ² = 4, the bias is –4/(17-1) = –4/16 = –0.250.

(See Example 7.6 on pages 339-340 of the textbook (second edition) or page 333 of the first edition)
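
The expected value and bias can also be checked by simulation. The sketch below is one possible check, not part of the textbook solution: the X values 1 to 17, the true line 1.0 + 0.5x, the seed, and the number of trials are all arbitrary assumptions, and only n = 17 and σ² = 4 come from the exercise.

```python
import random

def fit_sse(xs, ys):
    """Error sum of squares of the least squares line, computed as Syy - Sxy^2 / Sxx."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return syy - sxy ** 2 / sxx

n, sigma2 = 17, 4.0
expected = sigma2 * (n - 2) / (n - 1)   # Part A: 3.75
bias = expected - sigma2                # Part B: -0.25

rng = random.Random(0)                  # fixed seed so the run is repeatable
xs = list(range(1, n + 1))              # assumed X values
trials = 20000
total = 0.0
for _ in range(trials):
    # Assumed true line 1.0 + 0.5 x with normal errors of variance sigma2.
    ys = [1.0 + 0.5 * x + rng.gauss(0.0, sigma2 ** 0.5) for x in xs]
    total += fit_sse(xs, ys) / (n - 1)  # the mistaken estimator
simulated_mean = total / trials

print(expected, bias, round(simulated_mean, 2))
```

The simulated mean should land close to 3.75 rather than 4, whatever line and X values are assumed, because the distribution of SSE depends only on σ² and n.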



Edited 6 Years Ago by NEAS