Questions and Answers on Time Series Modeling



n2thornl

CM-

Thank you.  I had forgotten that those equations were in there.  I think I will try that, if it comes to it.

So, what made you want to try an AR(2) model?  Did your AR(1) model just not fit right? 

Did you check forecasts against actual data to decide which model was better, or what?

Just curious... you don't need to reply if you're busy.

Thanks again!

Jacob: How do we compare an AR(1) model with an AR(2) model? If the AR(2) model has a higher R² and a lower sum of squared residuals, won’t its forecasts also be better?

Rachel: Fitted to the same observations, the AR(2) model will have an R² at least as high and a sum of squared residuals at least as low. This must be the case, since the AR(2) model uses the same independent variables as the AR(1) model plus an additional variable. But its out-of-sample forecasts may be worse. For a perfect random walk, this will often be the case.
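Rachel's point can be checked on simulated data. The sketch below is only an illustration, not the NEAS method: it assumes a driftless random walk generated with NumPy, fits AR(1) and AR(2) by ordinary least squares on the first 200 points, and compares one-step-ahead forecasts on the last 100. The variable names, seed, and sample sizes are arbitrary choices.

```python
import numpy as np

def ols(X, y):
    """Return least-squares coefficients and the sum of squared residuals."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return coef, float(resid @ resid)

rng = np.random.default_rng(0)
y = np.cumsum(rng.normal(size=300))      # simulated driftless random walk
train = y[:200]

# Both models are fitted to the same dependent observations (train[2:]).
target = train[2:]
X1 = np.column_stack([np.ones(len(target)), train[1:-1]])   # constant, lag 1
X2 = np.column_stack([X1, train[:-2]])                      # plus lag 2
coef1, ssr1 = ols(X1, target)
coef2, ssr2 = ols(X2, target)

# One-step-ahead forecasts for the hold-out points y[200:].
holdout = y[200:]
F1 = np.column_stack([np.ones(len(holdout)), y[199:-1]])
F2 = np.column_stack([F1, y[198:-2]])
mse1 = float(np.mean((holdout - F1 @ coef1) ** 2))
mse2 = float(np.mean((holdout - F2 @ coef2) ** 2))

print("in-sample SSR   AR(1):", ssr1, " AR(2):", ssr2)
print("hold-out MSE    AR(1):", mse1, " AR(2):", mse2)
```

The in-sample SSR of the AR(2) fit can never exceed that of the AR(1) fit. The hold-out comparison can go either way, which is why it is worth checking.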


kquick

I thought I understood all of the examples in the text, but now I am very confused.  Looking at the 3 month T Bills, trying to get a handle on what NEAS did.  The sample autocorr function for the yt series leads me to take first differences.  Fine.  The sample autocorr function for the first differences goes toward zero quickly and Bartlett's confirms that this is good, so I would go with AR(1).  I have no idea how to get the coefficients?  I'm pretty sure, based on the book, my own understanding and the posts here, that we would need to regress on the first differences.  I would get that series, make predictions with it, then add up those predictions to get a prediction for the original yt (I'm not there yet, so that is a little hazy).  But it doesn't make sense to fit a regression to the first differences, since they are just basically hanging out and alternating around zero.  Which is what they should be doing - indicating stationarity.  But you can't get a good fit to data like that.  Am I right?  What am I missing?   Regressing on the original yt gives a nice fit, because it is a nicely increasing series.  Not stationary, I know....

Can someone help me out with this? 

Thanks!

Jacob: If interest rates are a random walk, are the first differences a white noise process? What model do we use, and what items do we use in the regression?

Rachel: If the interest rates are a perfect random walk, the first differences are white noise. The AR(1) model on the first differences gives a β of zero and an α equal to the drift of the interest rates. This is a perfect fit of the ARIMA model, with d = 1 and β = 0.

In practice, we don’t expect a perfect fit. We might have seasonality, or changing means, drifts, and variances. As the graph of interest rates on the NEAS web site shows, interest rates are complex and their pattern changes from time to time. Even if the ideal model is AR(1), the stochasticity makes the fit less than perfect.
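To see the idealized case in numbers, here is a small sketch on simulated data (not the T-bill series). It assumes a random walk with a made-up monthly drift of 0.05 and σ of 0.25, takes first differences, and regresses each difference on the previous one. The fitted β should be near zero and the fitted α near the drift.

```python
import numpy as np

rng = np.random.default_rng(1)
drift, sigma, T = 0.05, 0.25, 2000       # illustrative values, not real rates

y = np.cumsum(drift + sigma * rng.normal(size=T))   # random walk with drift
w = np.diff(y)                                       # first differences

# AR(1) on the differences: w_t = alpha + beta * w_(t-1) + error
X = np.column_stack([np.ones(len(w) - 1), w[:-1]])
coef, *_ = np.linalg.lstsq(X, w[1:], rcond=None)
alpha, beta = float(coef[0]), float(coef[1])

print("alpha (should be near the drift):", alpha)
print("beta  (should be near zero):     ", beta)
```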


n2thornl

kquick-

alright, this is JUST my opinion, but here goes...

I see exactly what you're saying.  However, I think you're missing something.  We do not perform a regression on JUST the yt-1 (the first differences)... you perform a regression on TWO columns of data: yt and yt-1.  This is to form a regression model that looks like:

yt = a*yt-1 + b

a would be your AR(1) coefficient, and b would be your... whatever that greek letter is.  Delta?  I dunno.  The circle-ish thing with a squiggly mark growing out the top.  I'm tired, forgive me.

So, although a graph of your first differences should look like a squished together zig zag line, the graph of yt vs. yt-1 hopefully should not, and hopefully you can fit a decent line to it.

Caveat - I haven't gotten to this point yet either, although I feel confident doing what I plan to do.  I've had trouble getting the regression package in Excel to work.

Anyone else out there agree or disagree on this?


kquick

n2thornl:

Thanks for the reply, I definitely appreciate it.  However, I am making sure that I am fitting my regression to the equation yt = A*yt-1 + B.  I find that this seems to get nice results from the regression output.  (High R square value ~ .98).  But the series is not stationary, so I take first differences and the autocorrelations tell me that the series of the first differences should be stationary.  So, what I had thought we were supposed to do with this is regress on the first differenced series, i.e., fit the regression to the series wt=(delta)yt = A*(delta)yt-1 + B.  But the regression output for this is not very promising at all, very low R square value (~.38). 

This is where I must be missing something.  What did everyone else do with their first differences?  I'm going by pages 498-499 in the text, where they have the stationary wt series.  Then, as they say at the beginning of section 16.2.2, you have to integrate (or sum up) forecasts from the wt series to get your forecasts for the original yt. 

I thought I understood the theory pretty well but it is just not working in practice.  I am sure I am missing something, but don't know what.  Can someone give me an idea of what they did or if they have the same problem?  N2thornl:  when you get to this part could you let me know how you make out? 

Thanks!

Quality of Fit: Non-stationary Random Walk vs Stationary White Noise Process

Jacob: When I regress the interest rates on the lagged interest rates, using an AR(1) model on the monthly interest rates themselves, I get a β coefficient of one, indicating that the model is not stationary. The t statistic for β is high, the p-value is low, and the R² for the regression is high. The fit seems excellent.

When I regress the first differences of the interest rates on the lagged first differences, using an AR(1) model on the first differences, I get a β coefficient of zero, indicating that the model is stationary. But the t statistic for β is low and not significant, the p-value is high, and the R² for the regression is low. The fit seems poor.

I had expected the opposite results.

• If β ≈ 1, the time series is a random walk. It is not stationary, and we do not use it for forecasts.

• If β ≈ 0, the time series is a white noise process. It is stationary, and we use it for forecasts.

Am I doing something wrong? Why does the random walk have a good fit and the white noise process has a poor fit?

Rachel: Nothing is wrong; these are expected results. Consider what each test implies.

The t statistic tests the hypothesis that β = 0. If the time series is a random walk, β is 1, not zero. When we take first differences, β = 0; that is exactly correct.

To test whether the time series is a random walk, the null hypothesis is β = 1. The t statistic is then the estimated β minus one, divided by the standard error of β. It is close to zero, and we do not reject the null hypothesis.

The R² says how much of the variance is explained by the β coefficient. If the time series is a random walk with a low standard error (low σ), the R² is high. Values of 98% or 99% are reasonable for a perfect random walk with many observations.

If the β coefficient is zero for the AR(1) model of first differences, we expect the R² to be about zero: a β of zero explains none of the variance.
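The contrast Jacob describes can be reproduced on a simulated random walk. This is a sketch under assumptions: the data are generated with NumPy rather than taken from interest rates, and ar1_fit is a helper written for this illustration.

```python
import numpy as np

def ar1_fit(series):
    """Regress series[t] on a constant and series[t-1].

    Returns (beta, standard error of beta, R squared)."""
    x, y = series[:-1], series[1:]
    X = np.column_stack([np.ones(len(x)), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    rss = float(resid @ resid)
    tss = float(((y - y.mean()) ** 2).sum())
    s2 = rss / (len(y) - 2)                       # residual variance
    se_beta = (s2 / float(((x - x.mean()) ** 2).sum())) ** 0.5
    return float(coef[1]), se_beta, 1 - rss / tss

rng = np.random.default_rng(4)
levels = np.cumsum(rng.normal(size=2000))         # simulated random walk
diffs = np.diff(levels)                           # its first differences

beta_level, se_level, r2_level = ar1_fit(levels)
beta_diff, se_diff, r2_diff = ar1_fit(diffs)

print("levels:      beta", beta_level, " R2", r2_level)
print("  t for beta = 0:", beta_level / se_level)
print("  t for beta = 1:", (beta_level - 1) / se_level)
print("differences: beta", beta_diff, " R2", r2_diff)
print("  t for beta = 0:", beta_diff / se_diff)
```

The levels regression shows β near one and a high R²; the differences regression shows β near zero and an R² near zero. Both are what a random walk should produce.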

Forecasts: A random walk is not stationary, but we still use it for forecasts. For a random walk with a drift of zero, the L-period forecast is the current value. For a random walk with a drift of k, the L-period forecast is the current value + L × k.
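The forecast rule can be written as a short routine that forecasts the first differences and sums them back up to the original series. This is a sketch; forecast_levels and its arguments are names made up for the illustration. With β = 0 it reduces to the current value + L × k, where k is the drift (the α of the differenced model).

```python
def forecast_levels(y_last, w_last, alpha, beta, horizon):
    """Forecast y for 1..horizon periods ahead from an AR(1) on differences.

    y_last: last observed level
    w_last: last observed first difference
    alpha, beta: constant and AR(1) coefficient of the differenced model
    """
    forecasts = []
    level, w = y_last, w_last
    for _ in range(horizon):
        w = alpha + beta * w      # forecast of the next difference
        level = level + w         # integrate: add it to the running level
        forecasts.append(level)
    return forecasts

# Random walk with drift k = 0.1, starting from 8.0: 8.1, 8.2, 8.3
print(forecast_levels(8.0, 0.3, alpha=0.1, beta=0.0, horizon=3))
```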

Jacob: For a perfect random walk, with β = 1 and a drift of zero, I assume the R² depends on the stochasticity. If σ is high, the R² should be low; if σ is low, the R² should be high.

Rachel: That is true for most regression equations, where the values of X are not stochastic. For the AR(1) model, the X values are the Y values lagged one period. The dispersion of the X values varies directly with σ.

If σ is twice as large, σ² is four times as large, and the sum of squared deviations of the X values, Σ(x − x̄)², is also four times as large. The variance of the ordinary least squares estimator of β, which is σ² / Σ(x − x̄)², does not change, and the t statistic does not change. The R² is one minus the residual sum of squares (which is proportional to σ²) divided by the total sum of squares (TSS) of the Y values. The X values are also the Y values, so the TSS scales with σ² as well and the R² does not change.

Jacob: That is counter-intuitive. Suppose the starting interest rate is 8%. If σ is 1%, which is high stochasticity for monthly interest rates, I presume the R² will be low. If σ is 0.01%, which is low stochasticity, I presume the R² will be high.

Rachel: To conceive of the relations, assume the starting interest rate is zero. The deviation is the actual value minus the mean, so we might as well start with a mean of zero.

• If σ = 1%, we have much random fluctuation in the Y values. This makes the X values more dispersed, which lowers the variance of the ordinary least squares estimator.

• If σ = 0.01%, we have little random fluctuation in the Y values. The X values are less dispersed, which raises the variance of the ordinary least squares estimator.

The two effects offset each other.

Jacob: Shouldn’t the degree of stochasticity affect the R2?

Rachel: Think of the change from σ = 1% to σ = 0.01% as a change in the units of measurement. Nothing about the regression has changed.
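The units-of-measurement argument is easy to verify numerically. The sketch below assumes a simulated random walk that starts at zero and uses the same random shocks twice, once with σ = 1 and once with σ = 0.01; ar1_r2_and_t is a helper written for this illustration.

```python
import numpy as np

def ar1_r2_and_t(y):
    """AR(1) regression of y[t] on a constant and y[t-1]: return (R squared, t for beta)."""
    x, target = y[:-1], y[1:]
    X = np.column_stack([np.ones(len(x)), x])
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    resid = target - X @ coef
    rss = float(resid @ resid)
    tss = float(((target - target.mean()) ** 2).sum())
    se_beta = (rss / (len(x) - 2) / float(((x - x.mean()) ** 2).sum())) ** 0.5
    return 1 - rss / tss, float(coef[1]) / se_beta

rng = np.random.default_rng(2)
shocks = rng.normal(size=500)             # same shocks for both runs

r2_high, t_high = ar1_r2_and_t(np.cumsum(1.00 * shocks))   # sigma = 1%
r2_low, t_low = ar1_r2_and_t(np.cumsum(0.01 * shocks))     # sigma = 0.01%

print("sigma = 1%   : R2", r2_high, " t", t_high)
print("sigma = 0.01%: R2", r2_low, " t", t_low)
```

Apart from rounding error, the two runs give the same R² and the same t statistic, since the second series is the first one multiplied by 0.01.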


n2thornl
No problem, will do. I've got the day off, so I am trying to get it all done today. I will definitely be in your situation, because my series are only stationary as differences. I think I'll have to use second differences even!

Theoretically, though, what you did should have worked... what I think is most likely is that you just don't have an AR(1) model, which is entirely possible. If you do think that is the case, you can try AR(2) (use the formulas from... chapter 4, I think, get the AR(2) formula without using multiple regression) or try MA(1) (I don't know how to do that, but there is a post out there somewhere where it is discussed). Go back to your correlogram for your stationary series, and try to use that to see what type of ARIMA model you think it should be. There are several posts put up by NEAS that could help.

One guy tried fitting all of those models, and still didn't get anything to fit, and he submitted his project anyway ... he just said none of the simple models fit, and the rest are out of scope of this project! So don't worry too much...
n2thornl

Well, I'm almost done. I have to figure out how to do two more things, then do them, and I'm all done.

So, my first series was easily modeled as AR(1).  I tested the sample autocorrelations of the residuals from that regression, and they were definitely white noise.  No real problem.

The problem has been my second series.  AR(1) and AR(2) don't fit.  Not for the rates, nor the first or second differences.  So, I need to try an MA(1) and an ARMA(1,1) model.  I have a friend using the same data, very similar periods, and her second series was an ARIMA(1,2,1), so I think I am on the right path.  The hassle is that I don't know how to create those models!  She used Minitab, which I don't have.  I'm stuck re-reading through posts and the books, because apparently there is a 'long way' to create those models.

I created a Word document as I worked through this, noting what I was doing (mostly to keep me from getting lost).  It's 7-8 pages now.  I think I did a good job on the project.

Anybody out there have advice regarding MA(1) or ARMA(1,1) models?

Jacob: Should we assume that an AR(1) model works? Should we assume that we need a moving average component? Should we assume that Treasury bill and Treasury bond rates have different time series models? How do we know which is the right model? The textbook fits a complex ARIMA model; if we can’t do a nonlinear regression, how do we fit that model?

Rachel: The purpose of the student project is not to find the ideal model. No one knows the ideal ARIMA model for interest rates. The suggestion in the textbook is one model for one time period that might not work for another time period.

Jacob: If there is no clear solution, why are we using these data?

Rachel: The purpose of the student project is to apply the time series concepts to real data. No model is ideal, and your results depend on the time period you choose. Choosing a time period that differs by a year or two may lead to a different model. The student project should show your hypotheses, your statistical tests, and your results. We are not looking for a specific answer.

Jacob: If AR(1) works for one time period, why doesn’t it work for another time period?

Rachel: Several explanations are possible:

In one period, the Federal Reserve Board targets a stable interest rate; in another period, the Fed targets lower unemployment or higher GDP growth or lower inflation.

Twenty years ago, interest rates were influenced primarily by domestic factors. In the past decade, international competition and capital flows play a greater role.

Inflation, regulation, and markets have changed greatly over the years. Each of these affects the interest rate process.

The apparent difference may be spurious. Exogenous factors, like wars, elections, oil price changes, and recessions, cause temporary changes in the ARIMA parameters.

Jacob: If an AR(1) model fits for two eras, do we expect the same parameters?

Rachel: Of the three interest rate eras on the NEAS illustrative workbook, the first period has an upward drift and the third period has a downward drift. These give different autoregressive parameters.


Chesters Mom

MA(1)'s parameter can be backed into by equation 18.13.  ARMA(1,1) is a combination of AR(1) and MA(1).  Nonlinear regression is required for estimating MA parameters (other than order of 1) but is not required for the student project.  There are also Excel add-ins available on the internet that do nonlinear regression, and any statistical software should have this function.  If you can figure out how to do it, great; otherwise you can say so in your conclusion.
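One way to back into the MA(1) parameter is from the lag-1 sample autocorrelation. The sketch below assumes the model is written as yt = μ + εt − θ εt-1, for which the standard relation is ρ1 = −θ / (1 + θ²). Whether this matches equation 18.13 exactly, including the sign convention, is an assumption to check against the textbook. The function name is made up for the illustration.

```python
import math

def ma1_theta_from_rho1(rho1):
    """Solve rho1 = -theta / (1 + theta**2) for the invertible root (|theta| < 1).

    Assumes the sign convention y_t = mu + e_t - theta * e_(t-1)."""
    if rho1 == 0:
        return 0.0
    if abs(rho1) > 0.5:
        # No real solution: an MA(1) process cannot have |rho1| above 0.5.
        raise ValueError("lag-1 autocorrelation is outside the MA(1) range")
    # Quadratic rho1*theta^2 + theta + rho1 = 0; this root has |theta| <= 1.
    return (-1 + math.sqrt(1 - 4 * rho1 ** 2)) / (2 * rho1)

print(ma1_theta_from_rho1(-0.4))   # approximately 0.5
```

If the sample autocorrelation at lag 1 is larger than 0.5 in absolute value, no MA(1) parameter reproduces it, which is itself a hint that a pure MA(1) model does not fit.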

Jacob: Are we expected to use nonlinear regression to fit moving average components?

Rachel: We are not expecting anyone to use nonlinear regression.


PayMeBack

A little help....I'm stuck... I've regressed my differenced series and believe that it has a MA part.  I've backed into the MA(1) coefficient.  So I have an ARMA (1,1) model with coefficients.  The next step is to test my model using a Box-Pierce Statistic or to test if the autocorrelations of the residuals have a normal dist.  The r's are calculated by eq 18.15 and the Q Stat follows, but the problem I'm having is coming up with the e^'s.  How are these determined??? Thanks.

Jacob: How do we determine the residuals?

Rachel: The residuals are the actual interest rates minus the estimated interest rates. The actual interest rates are given. The estimated interest rates are based on the past interest rates and the ARIMA model.

Jacob: How do we test if the autocorrelations of the residuals have a normal distribution?

Rachel: Compute the sample autocorrelation function of the residuals and form a correlogram; we show an example in an illustrative spreadsheet. If the ARIMA model fits well, the residuals are a white noise process, and their sample autocorrelations are approximately normally distributed with a mean of zero and a standard deviation of 1/√T, where T is the number of observations. Check the number of sample autocorrelations exceeding a given absolute value.

Jacob: If we have 400 observations, do we check 400 sample autocorrelations?

Rachel: The last 30 or 40 sample autocorrelations don’t have enough points. The first 4 or 5 sample autocorrelations can be distorted by other factors. Start with lags 6 through 55, for a total of 50 sample autocorrelations.

If only 2 or 3 sample autocorrelations are outside the 95% confidence interval, we presume the distribution is normal with the hypothesized standard deviation.

If 8 or 9 sample autocorrelations are outside the 95% confidence interval, we presume it is not a white noise process.

If 4 to 7 sample autocorrelations are outside the 95% confidence interval, we examine lags 56 to 105. We may have a white noise process with minor distortions. The ARIMA model may be reasonably good, even if it is not perfect.
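The counting rule above can be sketched in a few lines. Assumptions: the 95% confidence interval is taken as ±1.96/√T, the residuals here are simulated white noise standing in for real model residuals, and counts the rule does not mention (0, 1, or 10 and above) are grouped with the nearest case. All function names are made up for the illustration.

```python
import numpy as np

def sample_acf(x, max_lag):
    """Sample autocorrelations r_1 ... r_max_lag of the series x."""
    d = np.asarray(x, dtype=float)
    d = d - d.mean()
    denom = float(d @ d)
    return np.array([float(d[k:] @ d[:-k]) / denom for k in range(1, max_lag + 1)])

def count_outside(resid, first_lag=6, last_lag=55):
    """Count sample autocorrelations outside +/- 1.96 / sqrt(T) for the given lags."""
    bound = 1.96 / np.sqrt(len(resid))
    r = sample_acf(resid, last_lag)
    return int(np.sum(np.abs(r[first_lag - 1:last_lag]) > bound))

def verdict(count):
    """Apply the rule of thumb for 50 sample autocorrelations."""
    if count <= 3:
        return "consistent with white noise"
    if count >= 8:
        return "not white noise"
    return "inconclusive: examine lags 56 to 105"

rng = np.random.default_rng(3)
residuals = rng.normal(size=400)          # stand-in for real model residuals
n_out = count_outside(residuals)
print(n_out, "of 50 outside the interval:", verdict(n_out))
```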


kquick

I am with you, I get the theory of this stuff but when it comes to application, I get a little fuzzy.

I used as my e^t = actual (delta)yt - predicted (delta)yt, for the first differenced equation.  Basically it is actual minus predicted.  I had a little trouble calculating e^(t-1) which we need for the MA component of the model.  In other words, I need to calculate this in order to get the predicted (delta)yt.  What I did was use e^(t-1) = actual (delta)yt-1 - mean.  I used the calculated mean of my model.  This was just to get the initial value of e^(t-1).  Then for all subsequent values, I used actual minus predicted (delta)yt-1 values.

I hope this makes sense.  I am really looking for someone to give me an idea of what they did for this.  For people whose models have an MA component, how did you calculate the e^(t-1) value to get your predicted values.

Jacob: How do we get the estimated values for an autoregressive model?

Rachel: Suppose the order of the autoregressive process is p. For an AR(1) model, p = 1; for an AR(2) model, p = 2.

Take an AR(2) model as the example. For t > 2, yt is estimated from the ARIMA model using the observed lagged values.

For t = 2, we assume y0 = the mean of the ARIMA model.

For t = 1, we assume y0 and y–1 = the mean of the ARIMA model.

Jacob: If the ARIMA model has a moving average component, how do we determine the estimated values?

The estimate for y1 requires knowledge of ε0, which we don’t know.

The estimate for yt requires knowledge of εt-1, which we don’t know, since we don’t know the estimated value of yt-1.

Unless we have a way of starting, we don’t know the estimates for any values.

Rachel: We assume the residuals for all values before the first observed value are zero.

Jacob: The candidate who posted this message used the actual minus the mean as the residual. Is this wrong?

Rachel: This is not wrong; we don’t know the true residual for the first period. This candidate’s method is as good as any, though the textbook uses the method described above.
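The start-up rule Rachel describes can be sketched for an ARMA(1,1) model on the differenced series. Assumptions: the model is written as wt = δ + φ wt-1 + εt − θ εt-1 (check the sign of θ against the textbook), the value before the first observation is set to the model mean δ / (1 − φ), and the residual before the first observation is set to zero. The function name and the numbers are made up for the illustration.

```python
def arma11_residuals(w, delta, phi, theta):
    """Residuals (actual minus predicted) for w_t = delta + phi*w_(t-1) + e_t - theta*e_(t-1).

    Pre-sample value of w is the model mean; pre-sample residual is zero."""
    mean = delta / (1 - phi)
    prev_w, prev_e = mean, 0.0
    residuals = []
    for actual in w:
        predicted = delta + phi * prev_w - theta * prev_e
        e = actual - predicted
        residuals.append(e)
        prev_w, prev_e = actual, e     # each residual feeds the next prediction
    return residuals

# Tiny made-up series: residuals come out as roughly [0.0, 1.0, 0.2]
print(arma11_residuals([1.0, 2.0, 1.5], delta=0.5, phi=0.5, theta=0.2))
```

Using the actual value minus the mean as the first residual, as the candidate did, changes only the starting value; the loop for later periods is the same.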


n2thornl

Probably not what anyone wants to hear, but I just used Minitab for any regression past AR(2). 

I tried to get the MA(1) formula on my own, and could never match what Minitab had given me.  No post by NEAS forbids Minitab, so I included my Minitab regression outputs in the Word doc I sent in. 

Jacob: Can we use Minitab, SAS, and similar statistical packages for the student project?

Rachel: The student project ensures that you can apply the concepts to real data using statistical software. Any statistical package is fine. We show illustrations in Excel, since most candidates are familiar with Excel. If you have Minitab, SAS, or similar packages, you may use them.

