UNIVERSITY OF CALIFORNIA, LOS ANGELES
Department of Economics

Economics 143 (Cameron) - Applied Regression Analysis

Problem Set #7: Serially Correlated Errors

December 1, 1998
NOT TO BE HANDED IN

NETWORK FILES NEEDED:  carcred.dat, int2.dat

1. (From the Fall 1997 final exam) Explore a preliminary model to explain the observed quarterly time-series variation in automobile loans at commercial banks. The data are contained in the file carcred.dat, where the first seven lines must be skipped because they contain read-statement information. (Use this information to begin your SHAZAM program file.) The variables involved are defined as follows:

DATE = year and quarter in decimal form (e.g. 1960.25=1960:1)
AUTOCRED = consumer installment credit outstanding: automobiles, commercial banks (million $, end of month, not seasonally adjusted) [CITIBASE variable CCIUAC; monthly data averaged for each quarter (1960:1-1996:4)]
YP = gross national product, total [CITIBASE variable GNP; quarterly (1960:1-1996:4)].
R = nominal interest rate, measured as the rate on commercial paper, 6-mo (% per annum, not seasonally adjusted) [CITIBASE variable FYCP; monthly data averaged for each quarter (1960:1-1996:4)].
AUTOINV = inventories, business, retail durables, motor vehicle dealers; billions [CITIBASE variable GLRDA; quarterly (1960:1-1996:4)]
QTR1, QTR2, QTR3, QTR4 = set of quarterly dummy variables, equal to one during each respective quarter, zero otherwise.

a.) Which three variables in this data set are the most highly correlated?

b.) Approximately what is the change in outstanding automobile loans per year?

c.) Based solely on an ordinary least squares regression of autocred on yp, r, autoinv, and a set of quarterly dummies, does multicollinearity compromise our ability to discern the incremental effects on autocred of changes in any of the individual explanatory variables? Explain.

d.) Is there evidence of systematic "seasonal" variations in the level of autocred, according to this initial OLS regression? Explain.

e.) Save the fitted OLS regression errors from this initial OLS regression and create the first through fourth lags of these errors. Run a regression of current period errors on the first four lags of the error term. What is the purpose of this regression? What does it imply about the results obtained from your preliminary OLS regression?

f.) Use the AUTO command to regress autocred on yp, r, autoinv, and a set of quarterly dummies. Is this model likely to be adequate to correct the problems revealed by your regression of current on lagged OLS errors? Why or why not? Explain.

g.) Now use the AUTO command to regress autocred on yp, r, autoinv, and a set of quarterly dummies, but allow for fourth-order autoregressive errors (use the ORDER= option). Suppose this ends up being your preferred model. (It may not be, if we continue to explore further....) Does this specification suggest the presence of seasonal effects in AUTOCRED? Which months tend to have the highest amount of outstanding car loans? Which months tend to have the lowest amount of outstanding car loans?

h.) How do the implications of your autoregressive-error model (with AR(4) errors) differ from those of the corresponding OLS specification concerning the effects on car loans of (a) nominal interest rates, and (b) car dealer inventories? Explain.

i.) Compare the goodness-of-fit of your AR(4) specification with that of the naive OLS model. Comment.
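
The auxiliary regression described in part (e) can be illustrated outside SHAZAM. The following is a minimal Python sketch, assuming NumPy is available; the function name lagged_residual_regression and the simulated AR(1) errors are hypothetical stand-ins, not the carcred.dat data or SHAZAM output. It regresses a residual series on its own first four lags, which is one informal way to look for serial correlation of up to fourth order.

```python
import numpy as np

def lagged_residual_regression(e, max_lag=4):
    """Regress e[t] on e[t-1], ..., e[t-max_lag] with an intercept.

    Returns (coefficients, r_squared); coefficients[0] is the intercept.
    The first max_lag observations are dropped because their lags are missing.
    """
    e = np.asarray(e, dtype=float)
    n = e.size
    y = e[max_lag:]
    # Column k holds the residual lagged k periods, aligned with y.
    lags = [e[max_lag - k : n - k] for k in range(1, max_lag + 1)]
    X = np.column_stack([np.ones(n - max_lag)] + lags)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    fitted = X @ beta
    ss_res = np.sum((y - fitted) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return beta, 1.0 - ss_res / ss_tot

# Simulated residuals (an assumption for illustration only): AR(1) with
# rho = 0.8 over 148 quarters, the length of the 1960:1-1996:4 sample.
rng = np.random.default_rng(0)
n = 148
e = np.zeros(n)
shocks = rng.normal(size=n)
for t in range(1, n):
    e[t] = 0.8 * e[t - 1] + shocks[t]

beta, r2 = lagged_residual_regression(e)
print("lag coefficients:", np.round(beta[1:], 2))
print("R-squared:", round(r2, 2))
```

If the OLS errors were serially uncorrelated, the lag coefficients and the R-squared from a regression like this would be expected to be close to zero. Sizable, significant lag coefficients suggest that the OLS standard errors should not be taken at face value.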
 

2. A very simplistic model of the determination of real interest rates (REALR) assumes that real interest rates are determined by the interaction of money demand and money supply. Liquidity preference theory says that the demand for real balances is affected negatively by real interest rates and positively by real incomes. The nominal supply of money (M) is less sensitive to real interest rates, being determined primarily by the Federal Reserve Policy, but the price level influences real money supply (=M/P).

The file int2.dat contains information on the following variables:

R = nominal interest rate, measured as the rate on commercial paper, 6-mo (% per annum, not seasonally adjusted) [CITIBASE variable FYCP; monthly data averaged for each quarter (1959:1-1997:2)].
P = implicit price deflator, Gross National Product (1992=100) [CITIBASE variable GD; quarterly (1959:1-1997:2)].
YP = gross national product, total [CITIBASE variable GNP; quarterly (1959:1-1997:2)].
M1 = M1 money supply, demand deposits, total (billion $, not seasonally adjusted) [CITIBASE variable FZMD; monthly data averaged for each quarter (1959:1-1997:2)].
M2 = M2 money supply, M1+o'nite RPS, EURO$, G/P&B/D MMMFS&SAV&SM TIME DEP (billion $, NSA) [CITIBASE variable FZMS2; monthly data averaged for each quarter (1959:1-1997:2)].

Your data will be input as follows (OBS is an artifact of the data source--it gives year and quarter as a single number--and can be ignored in your work, but it must appear in the read statement for the data to be read in correctly). There are 154 observations in the data set (1959:1-1997:2). However, we will start with the subset of these series--the first 115 observations--that I used to employ for this problem (1959:1-1987:3).

sample 1 115
read(int2.dat) obs p yp m1 m2 r

a.) Real interest rates (REALR) are defined as nominal interest rates minus the expected rate of inflation. Since the latter is unobservable, economists usually substitute the actual observed rate of inflation as a proxy in simple models. Before running any regressions, you will need to create a variable for the percent change in prices from the previous period to use as your inflation measure. Furthermore, note that the inflation in prices over one quarter is not compatible with the rate of interest during a quarter (the latter being defined on an annual basis). Institute an appropriate "fix." Be sure to note what this section does to the number of usable observations. Also, you will need to convert YP, M1, and M2 into constant dollar terms (HINT: divide by the price deflator).

b.) Run a conventional OLS regression model of real interest rates (REALR) on YP and M2. Save the fitted residuals. Use GENR T=TIME(0) to create a "time" variable and plot the residuals from the OLS regression against time to look for any discernible pattern. From viewing the residuals, what can you infer about your point estimates and your standard errors? Anything? (Be careful.)

c.) Plot current against one-period lagged residuals to perform an alternative visual check for the presence of serial correlation in the residuals. Standard SHAZAM plots are adequate, but if you omit the STOP command in your SHAZAM program and run your program interactively, you might try using the / GNU LINEONLY option on the plot command to see a plot with better resolution. To get a printed plot, you will have to use the commfile= and datafile= options as suggested in the second lab session.

Since these are quarterly data, you should probably also try current against other lagged residuals. Use GENR e1=lag(e), GENR e2=lag(e,2), GENR e3=lag(e,3) and GENR e4=lag(e,4). Which correlation is strongest? (A quantitative measure can be found using a STAT ... / PCOR command.) You might also want to try regressing current period residuals on residuals lagged once, twice, three times, and four times (in the same regression), being careful to employ the AUXRSQR option. You might also consider using the NOCONSTANT option. (Why might regression through the origin be imposed?) What are the coefficients for your regressions of current on lagged residuals? What about the significance levels of the point estimates?

d.) Now be more rigorous about testing statistically for the presence of first-order serial correlation. Obtain and evaluate the Durbin-Watson test statistic for the null hypothesis of no serial correlation. You can do this with the RSTAT or EXACTDW (better) options on your regression. What hypothesis is tested using this statistic? Do you need to use the tables in the back of the textbook to perform the hypothesis test? What evidence do you find regarding first-order serial correlation from your DW test statistic?

e.) You could correct for serial correlation the hard way by using your estimate of the first-order error correlation from the STAT command in (c.) or by using the DW test statistic, transformed according to the relationship between rho and d, to transform all of the data into generalized differences. You could then re-run the model on the transformed data and check to see whether the new residuals display less serial correlation. Instead, use the AUTO command to obtain the optimum transformation of the data using the value of rho produced by the Cochrane-Orcutt procedure.

f.) Compare the point estimates and the standard error estimates for the parameters which you get using plain OLS and the model "corrected" for serial correlation. Since both estimators produce unbiased estimates of the true underlying population parameters, do we expect the coefficients to be identical for both procedures? Why or why not?

g.) Compare the effect on the predictive ability of the model of resorting to the AUTO procedure instead of naive OLS. In order to do this, you will need to save the fitted values of the dependent variable from the OLS model in one variable (say, using PREDICT=FITOLS) and the fitted values for the AUTO model (say, using PREDICT=FITAUTO). On two separate plots, display the actual and fitted values for OLS and then for AUTO. (This will be most visible with / GNU LINE or / GNU LINEONLY on the PLOT command, but printing these plots, as usual, is a trifle involved.) Comment.

h.) Now run your program again, this time bringing the sample up to 1997 (using sample 1 154). Comment on the difference in your findings.
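
The Durbin-Watson statistic in part (d) and the "hard way" described in part (e) can be sketched in a few lines of Python. This is a rough illustration, assuming NumPy and simulated data; the helper names ols, durbin_watson, and cochrane_orcutt are hypothetical, and the code is not a reproduction of SHAZAM's AUTO command. It relies on the standard approximation d ≈ 2(1 - rho), so a d well below 2 points to positive first-order serial correlation. The generalized differences are y[t] - rho*y[t-1] and x[t] - rho*x[t-1], which cost one observation.

```python
import numpy as np

def ols(X, y):
    """Least-squares coefficients and residuals."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta, y - X @ beta

def durbin_watson(e):
    """d = sum((e[t] - e[t-1])^2) / sum(e[t]^2)."""
    return np.sum(np.diff(e) ** 2) / np.sum(e ** 2)

def cochrane_orcutt(X, y, tol=1e-6, max_iter=50):
    """Iterate between estimating rho and re-fitting on generalized differences."""
    beta, e = ols(X, y)
    rho = 0.0
    for _ in range(max_iter):
        # Estimate rho by regressing e[t] on e[t-1] through the origin.
        rho_new = (e[1:] @ e[:-1]) / (e[:-1] @ e[:-1])
        # Generalized differences; the first observation is lost.
        y_star = y[1:] - rho_new * y[:-1]
        X_star = X[1:] - rho_new * X[:-1]
        beta, _ = ols(X_star, y_star)
        # Residuals in the original (untransformed) units for the next pass.
        e = y - X @ beta
        converged = abs(rho_new - rho) < tol
        rho = rho_new
        if converged:
            break
    return beta, rho

# Simulated data (an assumption for illustration only): 115 observations,
# one trending regressor, AR(1) errors with rho = 0.7.
rng = np.random.default_rng(1)
n = 115
x = np.cumsum(rng.normal(size=n))
u = np.zeros(n)
shocks = rng.normal(size=n)
for t in range(1, n):
    u[t] = 0.7 * u[t - 1] + shocks[t]
y = 1.0 + 2.0 * x + u
X = np.column_stack([np.ones(n), x])

beta_ols, e_ols = ols(X, y)
d = durbin_watson(e_ols)
beta_co, rho_hat = cochrane_orcutt(X, y)
print("OLS slope:", round(beta_ols[1], 3), "DW:", round(d, 3))
print("Cochrane-Orcutt slope:", round(beta_co[1], 3), "rho:", round(rho_hat, 3))
```

Because the constant column of X is differenced along with the other regressors, the first coefficient returned by cochrane_orcutt is already the intercept of the original model. It does not need to be divided by (1 - rho).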

Updated: 6:46 PM 12/7/98; Prepared by: Trudy Ann Cameron