Module 8: Simple linear regression practice problems
(The attached PDF file has better formatting.)
Linear Regression: practice exam problems
This posting illustrates linear regression exam problems covering the basic formulas. On the final exam, expect a scenario with five pairs of points similar to the exercise below. The problem derives the ordinary least squares estimators, their standard errors, t values, significance levels, and the F statistic. Some of these items are taught in later modules; together, these practice problems cover the main items in basic regression analysis.
An actuary fits a two-variable regression model (Yi = α + β × Xi + εi) to the relation between the incurred loss ratio (x) and the retrospective ratio (y), using the data below:
Policy Year | (x) | (y) | (x – x̄) | (x – x̄)² | (y – ȳ) | (y – ȳ)² | (x – x̄)(y – ȳ) |
20X1 | 61.00% | 15.00% | 0.00% | 0.00% | 0.32% | 0.0010% | 0.0000% |
20X2 | 62.00% | 13.20% | 1.00% | 0.01% | -1.48% | 0.0219% | -0.0148% |
20X3 | 63.00% | 14.00% | 2.00% | 0.04% | -0.68% | 0.0046% | -0.0136% |
20X4 | 60.00% | 15.20% | -1.00% | 0.01% | 0.52% | 0.0027% | -0.0052% |
20X5 | 59.00% | 16.00% | -2.00% | 0.04% | 1.32% | 0.0174% | -0.0264% |
Average | 61.00% | 14.68% | 0.00% | 0.02% | 0.00% | 0.009536% | -0.01200% |
The column captions use lower case x and y for the variables; the deviations are shown explicitly as (x – x̄) and (y – ȳ). Some statisticians use upper case letters for the variables and lower case letters for the deviations.
Take heed: The notation in the John Fox regression analysis text differs slightly from the notation in some of the discussion forum postings.
John Fox uses the symbols A and B as the least squares estimators for α and β.
He uses RSS for the residual sum of squares; other authors use ESS, the error sum of squares.
He uses RegSS for the regression sum of squares; other authors use RSS.
The final exam problems use Fox’s notation.
Question 8.1: Ordinary Least Squares Estimator of β
What is the value of B, the ordinary least squares estimator of β?
A. –0.600
B. –0.120
C. –0.020
D. –0.019
E. –0.012
Answer 8.1: A
The table gives the sum of the cross-product terms and of the squared deviations of X.
B = ∑(xi – x̄)(yi – ȳ) / ∑(xi – x̄)² = –0.012% / 0.020% = –0.600
Note: The last row of the table shows averages; the ratio of the averages equals the ratio of the sums.
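As a quick check (not part of the exam problem), B can be reproduced from the raw data in a few lines of Python:

```python
# Data from the problem: incurred loss ratios (x) and retrospective ratios (y).
x = [0.61, 0.62, 0.63, 0.60, 0.59]
y = [0.15, 0.132, 0.14, 0.152, 0.16]
x_bar = sum(x) / len(x)
y_bar = sum(y) / len(y)

# B = sum of cross-products of deviations / sum of squared deviations of x
B = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
     / sum((xi - x_bar) ** 2 for xi in x))
print(round(B, 4))  # -0.6
```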
Question 8.2: Ordinary Least Squares Estimator of α
What is the value of A, the ordinary least squares estimator of α?
A. –0.6100
B. –0.1468
C. +0.1468
D. +0.5128
E. +0.6100
Answer 8.2: D
Use the relation: A = ȳ – B × x̄ = 14.68% – (–0.60) × 61.00% = 0.5128
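The same relation in Python (a check, not part of the posting):

```python
# Data from the problem.
x = [0.61, 0.62, 0.63, 0.60, 0.59]
y = [0.15, 0.132, 0.14, 0.152, 0.16]
x_bar, y_bar = sum(x) / 5, sum(y) / 5
B = -0.6  # slope from Question 8.1

# The fitted line passes through the point of means: A = y_bar - B * x_bar
A = y_bar - B * x_bar
print(round(A, 4))  # 0.5128
```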
Question 8.3: Total Sum of Squares (TSS)
What is the total sum of squares (TSS)?
A. 0.0117%
B. 0.0360%
C. 0.0477%
D. 0.0833%
E. 0.1310%
Answer 8.3: C
The total sum of squares can be found two ways.
(1) We subtract the mean of Y from each observed value and square the deviations:
∑(yi – ȳ)² = 0.32%² + (–1.48%)² + (–0.68%)² + 0.52%² + 1.32%² = 0.04768%
(2) We square the observed values of Y and subtract N times the square of the mean:
[1]
= 15%² + 13.2%² + 14%² + 15.2%² + 16%² – 5 × 14.68%² = 0.04768%
The table in the exam problem gives the TSS as 5 × 0.009536% = 0.04768%
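Both methods can be verified in Python (a check, not part of the posting):

```python
y = [0.15, 0.132, 0.14, 0.152, 0.16]
y_bar = sum(y) / len(y)

# Method (1): sum of squared deviations from the mean.
tss_1 = sum((yi - y_bar) ** 2 for yi in y)
# Method (2): sum of squares minus N times the squared mean.
tss_2 = sum(yi ** 2 for yi in y) - len(y) * y_bar ** 2
print(round(tss_1, 7), round(tss_2, 7))  # both 0.0004768
```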
Question 8.4: Regression Sum of Squares (RegSS)
What is the regression sum of squares (RegSS)?
A. 0.0117%
B. 0.0360%
C. 0.0477%
D. 0.0833%
E. 0.1310%
Answer 8.4: B
Find the fitted Y value at each observation as A + B × X. Subtract the mean of Y and square the result. The sum of these is the regression sum of squares.
Policy Year | (x) | (y) | ŷ | (ŷ – ȳ) | (ŷ – ȳ)² |
20X1 | 61.00% | 15.00% | 14.68% | 0.00% | 0.0000% |
20X2 | 62.00% | 13.20% | 14.08% | -0.60% | 0.0036% |
20X3 | 63.00% | 14.00% | 13.48% | -1.20% | 0.0144% |
20X4 | 60.00% | 15.20% | 15.28% | 0.60% | 0.0036% |
20X5 | 59.00% | 16.00% | 15.88% | 1.20% | 0.0144% |
Average | 61.00% | 14.68% | 14.68% | 0.00% | 0.007200% |
5 × 0.0072% = 0.0360%
A quick formula: regression sum of squares (RegSS) = B² × ∑(xi – x̄)²
∑(xi – x̄)² = 0%² + 1%² + 2%² + (–1%)² + (–2%)² = 0.10%
RegSS = B² × 0.10% = (–0.6)² × 0.10% = 0.0360%
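Both the long method and the shortcut can be checked in Python (not part of the posting):

```python
x = [0.61, 0.62, 0.63, 0.60, 0.59]
y = [0.15, 0.132, 0.14, 0.152, 0.16]
x_bar, y_bar = sum(x) / 5, sum(y) / 5
A, B = 0.5128, -0.6  # OLS estimates from Questions 8.1 and 8.2

# Method (1): squared deviations of the fitted values from the mean of y.
fitted = [A + B * xi for xi in x]
regss_1 = sum((fi - y_bar) ** 2 for fi in fitted)
# Method (2): the shortcut RegSS = B^2 * (sum of squared x deviations).
regss_2 = B ** 2 * sum((xi - x_bar) ** 2 for xi in x)
print(round(regss_1, 7), round(regss_2, 7))  # both 0.00036
```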
Question 8.5: Error Sum of Squares (ESS) or Residual Sum of Squares (RSS)
What is the error sum of squares (ESS), also called the residual sum of squares (RSS)?
A. 0.0117%
B. 0.0360%
C. 0.0477%
D. 0.0833%
E. 0.1310%
Answer 8.5: A
We compute the residual sum of squares two ways.
(1) The ESS (RSS) is the TSS minus the RegSS.
0.04768% – 0.0360% = 0.01168%
(2) We determine residuals as the observed Y minus the fitted Y. The sum of the squared residuals is the residual sum of squares (RSS).
Policy Year | (x) | (y) | ŷ | (y – ŷ) | (y – ŷ)² |
20X1 | 61.00% | 15.00% | 14.68% | 0.32% | 0.0010% |
20X2 | 62.00% | 13.20% | 14.08% | -0.88% | 0.0077% |
20X3 | 63.00% | 14.00% | 13.48% | 0.52% | 0.0027% |
20X4 | 60.00% | 15.20% | 15.28% | -0.08% | 0.0001% |
20X5 | 59.00% | 16.00% | 15.88% | 0.12% | 0.0001% |
Average | 61.00% | 14.68% | 14.68% | 0.00% | 0.002336% |
5 × 0.002336% = 0.011680%
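Both methods in Python (a check, not part of the posting):

```python
x = [0.61, 0.62, 0.63, 0.60, 0.59]
y = [0.15, 0.132, 0.14, 0.152, 0.16]
A, B = 0.5128, -0.6  # OLS estimates from Questions 8.1 and 8.2

# Method (1): TSS minus RegSS.
rss_1 = 0.0004768 - 0.00036
# Method (2): sum of squared residuals, observed y minus fitted y.
residuals = [yi - (A + B * xi) for xi, yi in zip(x, y)]
rss_2 = sum(r ** 2 for r in residuals)
print(round(rss_1, 8), round(rss_2, 8))  # both 0.0001168
```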
Question 8.6: Estimated Variance of the Regression
What is s², the estimated variance of the regression?
A. 0.0036%
B. 0.0039%
C. 0.0360%
D. 0.0389%
E. 0.0117%
Answer 8.6: B
The estimated variance of the regression is the residual sum of squares divided by the degrees of freedom: the number of observations minus the number of estimated parameters (the slope coefficients plus the constant term). For a simple linear regression, this is N – 2 = 3, so s² = 0.01168% / 3 = 0.003893%. This is an unbiased estimate of σ².
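In Python (a check, not part of the posting):

```python
rss = 0.0001168  # residual sum of squares from Question 8.5
n, k = 5, 2      # observations; estimated parameters (slope + intercept)

# s^2 = RSS / (n - k), an unbiased estimate of sigma^2
s2 = rss / (n - k)
print(s2)  # about 3.893e-05, i.e. 0.003893%
```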
Question 8.7: Variance of Ordinary Least Squares Estimator of β
What is the variance of the ordinary least squares estimator of β? (This is the variance, not the standard error.)
A. 0.36%
B. 0.39%
C. 3.60%
D. 3.89%
E. 1.17%
Answer 8.7: D
The variance of B is σ² (or its unbiased estimate s²) divided by ∑(xi – x̄)²:
0.003893% / 0.10% = 3.893%
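In Python (a check, not part of the posting):

```python
x = [0.61, 0.62, 0.63, 0.60, 0.59]
x_bar = sum(x) / len(x)
s2 = 0.0001168 / 3  # estimated variance of the regression (Question 8.6)

# var(B) = s^2 / sum of squared x deviations
var_B = s2 / sum((xi - x_bar) ** 2 for xi in x)
print(round(var_B, 5))  # 0.03893
```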
Question 8.8: t Statistic
What is the t statistic for testing the null hypothesis that β = 0?
A. –3
B. –2
C. –1
D. +1
E. +2
Answer 8.8: A
The t statistic is the difference between B and the null hypothesis value, divided by the standard deviation of B (the square root of the variance of B):
–0.6 / √3.893% = –0.6 / 0.19731 = –3.041
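In Python (a check, not part of the posting):

```python
B = -0.6
var_B = (0.0001168 / 3) / 0.001  # variance of B from Question 8.7

# t = (B - null hypothesis value) / standard deviation of B
t = (B - 0) / var_B ** 0.5
print(round(t, 3))  # -3.041
```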
Question 8.9: p-value
The p-value for β for this regression equation is 0.0558. Which of the following is true, assuming the classical regression assumptions hold?
A. The true β is within ±5.58% (multiplicative) of the ordinary least squares estimator.
B. The true β is within ±0.0558 (additive) of the ordinary least squares estimator.
C. The probability is 95% that the true β is within ±0.0558 of the ordinary least squares estimator.
D. If the true value of β is zero, the probability that the absolute value of the ordinary least squares estimator of β is at least as great as in this regression equation is 5.58%.
E. If the true value of β is zero, the probability is 95% that the absolute value of the ordinary least squares estimator of β is no more than 0.0558.
Answer 8.9: D
To test hypotheses, we consider the probability that we would observe an ordinary least squares estimator as far from the null hypothesis (or farther) because of sampling error. The p-value gives this probability, as stated in Statement D.
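With 3 degrees of freedom the Student t distribution has a closed-form CDF, so the stated p-value can be checked without a statistics library (a sketch, not part of the posting):

```python
import math

# Closed-form Student-t CDF for 3 degrees of freedom:
#   F(t) = 1/2 + (1/pi) * (u / (1 + u**2) + atan(u)),  where u = t / sqrt(3)
t = -3.041  # t statistic from Question 8.8
u = abs(t) / math.sqrt(3)
upper_tail = 0.5 - (u / (1 + u ** 2) + math.atan(u)) / math.pi
p_value = 2 * upper_tail  # two-sided test
print(round(p_value, 4))  # 0.0558
```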
Question 8.10: F Statistic
What is the F statistic for testing the null hypothesis that β = 0?
A. –4
B. –1
C. +1
D. +4
E. +9
Answer 8.10: E
We compute the F statistic two ways.
(1) For a two-variable regression model, the F statistic is the square of the t statistic:
(–0.6)² / 3.893% = 9.247
(2) The F statistic is the ratio of the regression sum of squares divided by its degrees of freedom to the error sum of squares divided by its degrees of freedom:
[0.0360% / 1 ] / [0.01168% / 3] = 9.247
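Both methods in Python (a check, not part of the posting):

```python
regss, rss = 0.00036, 0.0001168  # from Questions 8.4 and 8.5
t = -3.041                       # from Question 8.8

# Method (1): for a two-variable model, F is the square of t.
f_1 = t ** 2
# Method (2): ratio of mean squares (RegSS on 1 df, RSS on 3 df).
f_2 = (regss / 1) / (rss / 3)
print(round(f_1, 2), round(f_2, 2))  # both about 9.25
```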
Question 8.11: R²
What is the value of R², the coefficient of determination?
A. 55%
B. 65%
C. 75%
D. 85%
E. 95%
Answer 8.11: C
R² = RegSS / TSS = 0.0360% / 0.04768% = 75.50%
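In Python (a check, not part of the posting):

```python
regss, tss = 0.00036, 0.0004768  # from Questions 8.3 and 8.4

# R^2 = share of total variation explained by the regression
r2 = regss / tss
print(round(r2, 4))  # 0.755
```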
[1]
∑yi² – N × ȳ² = ∑yi² – (∑yi)² / N