UNIVERSITY OF CALIFORNIA, LOS ANGELES
Department of Economics
Economics 143 (Cameron) - Applied Regression Analysis
Problem Set #7: Serially Correlated Errors
December 1, 1998
NOT TO BE HANDED IN
NETWORK FILES NEEDED: carcred.dat, int2.dat
1. (From the Fall 1997 final exam) Explore a preliminary model to explain
the observed quarterly time-series variation in automobile loans at commercial
banks. The data are contained in the file carcred.dat, where the first seven lines must be skipped
because they contain read-statement information. (Use this information to begin
your SHAZAM program file.) The variables involved are defined as follows:
DATE = year and quarter in decimal form (e.g. 1960.25=1960:1).
AUTOCRED = consumer installment credit outstanding: automobiles, commercial banks
(million $, end of month, not seasonally adjusted) [CITIBASE variable CCIUAC;
monthly data averaged for each quarter (1960:1-1996:4)].
YP = gross national product, total [CITIBASE variable GNP; quarterly
(1960:1-1996:4)].
R = nominal interest rate, measured as the rate on commercial paper, 6-mo (% per
annum, not seasonally adjusted) [CITIBASE variable FYCP; monthly data averaged for
each quarter (1960:1-1996:4)].
AUTOINV = inventories, business, retail durables, motor vehicle dealers; billions
[CITIBASE variable GLRDA; quarterly (1960:1-1996:4)].
QTR1, QTR2, QTR3, QTR4 = set of quarterly dummy variables, equal to one during each
respective quarter, zero otherwise.
a.) Which three variables in this data set are the most highly correlated?
b.) Approximately what is the change in outstanding automobile loans per year?
c.) Based solely on an ordinary least squares regression of autocred on yp, r,
autoinv, and a set of quarterly dummies, does multicollinearity compromise our
ability to discern the incremental effects on autocred of changes in any of the
individual explanatory variables? Explain.
d.) Is there evidence of systematic "seasonal" variations in the level of autocred,
according to this initial OLS regression? Explain.
e.) Save the fitted OLS regression errors from this initial OLS regression and
create the first through fourth lags of these errors. Run a regression of current
period errors on the first four lags of the error term. What is the purpose of this
regression? What does it imply about the results obtained from your preliminary OLS
regression?
f.) Use the AUTO command to regress autocred on yp, r, autoinv, and a set of
quarterly dummies. Is this model likely to be adequate to correct the problems
revealed by your regression of current on lagged OLS errors? Why or why not?
Explain.
g.) Now use the AUTO command to regress autocred on yp, r, autoinv, and a set of
quarterly dummies, but allow for fourth-order autoregressive errors (use the ORDER=
option). Suppose this ends up being your preferred model. (It may not be, if we
continue to explore further....) Does this specification suggest the presence of
seasonal effects in AUTOCRED? Which quarters tend to have the highest amount of
outstanding car loans? Which quarters tend to have the lowest amount of outstanding
car loans?
h.) How do the implications of your autoregressive-error model (with AR(4) errors)
differ from those of the corresponding OLS specification concerning the effects on
car loans of (a) nominal interest rates, and (b) car dealer inventories? Explain.
i.) Compare the goodness-of-fit of your AR(4) specification with that of the naive
OLS model. Comment.
2. A very simplistic model of the determination of real interest rates (REALR)
assumes that real interest rates are determined by the interaction of money demand
and money supply. Liquidity preference theory says that the demand for real
balances is affected negatively by real interest rates and positively by real
incomes. The nominal supply of money (M) is less sensitive to real interest
rates, being determined primarily by Federal Reserve policy, but the price
level influences the real money supply (=M/P).
The file int2.dat contains information on the
following variables:
R = nominal interest rate, measured as the rate on commercial paper, 6-mo (%
per annum, not seasonally adjusted) [CITIBASE variable FYCP; monthly data averaged
for each quarter (1959:1-1997:2)].
P = implicit price deflator, Gross National Product (1992=100) [CITIBASE
variable GD; quarterly (1959:1-1997:2)].
YP = gross national product, total [CITIBASE variable GNP; quarterly (1959:1-
1997:2)].
M1 = M1 money supply, demand deposits, total (billion $, not seasonally
adjusted) [CITIBASE variable FZMD; monthly data averaged for each quarter (1959:1-
1997:2)].
M2 = M2 money supply, M1 + overnight RPs, Eurodollars, general-purpose and
broker/dealer money market mutual funds, savings and small time deposits (billion $,
not seasonally adjusted) [CITIBASE variable FZMS2; monthly data averaged for each
quarter (1959:1-1997:2)].
Your data will be input as follows (OBS is an artifact of the data source--it
gives year and quarter as a single number--and can be ignored in your work, but it
must appear in the read statement for the data to be read in correctly). There
are 154 observations in the data set (1959:1-1997:2). However, we will start with
the subset of these series--the first 115 observations--that I previously used for
this problem (1959:1-1987:3).
sample 1 115
read(int2.dat) obs p yp m1 m2 r
a.) Real interest rates (REALR) are defined as nominal interest rates minus
the expected rate of inflation. Since the latter is unobservable, economists
usually substitute the actual observed rate of inflation as a proxy in simple
models. Before running any regressions, you will need to create a variable for
the percent change in prices from the previous period to use as your inflation
measure. Furthermore, note that the inflation in prices over one quarter is not
compatible with the rate of interest during a quarter (the latter being defined on
an annual basis). Institute an appropriate "fix." Be sure to note what this
step does to the number of usable observations. Also, you will need to convert
YP, M1, and M2 into constant dollar terms (HINT: divide by the price
deflator).
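The data preparation described in (a.) can be sketched outside SHAZAM as follows. This is only an illustration in Python, not the required SHAZAM code: the function names are hypothetical, and multiplying the quarterly percent change by four is one common annualization choice assumed here (compounding is another).

```python
def annualized_inflation(prices, periods_per_year=4):
    """Percent change in prices from the previous period, scaled to an annual rate.

    The first observation has no predecessor, so the result is one element
    shorter than the input.
    """
    rates = []
    for prev, curr in zip(prices, prices[1:]):
        quarterly_pct = 100.0 * (curr - prev) / prev
        # Simple (non-compounded) annualization: an assumption of this sketch.
        rates.append(periods_per_year * quarterly_pct)
    return rates


def real_rate(nominal, prices, periods_per_year=4):
    """Nominal rate minus annualized actual inflation (a proxy for expected inflation)."""
    inflation = annualized_inflation(prices, periods_per_year)
    # Drop the first nominal rate so both series cover the same quarters.
    return [r - pi for r, pi in zip(nominal[1:], inflation)]


def deflate(series, prices, base=100.0):
    """Convert a nominal series to constant dollars using a deflator whose base-year value is `base`."""
    return [base * x / p for x, p in zip(series, prices)]
```

Note that three price observations yield only two inflation rates, which is the loss of a usable observation mentioned above.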
b.) Run a conventional OLS regression model of real interest rates (REALR)
on YP and M2. Save the fitted residuals. Use GENR T=TIME(0) to create a "time"
variable and plot the residuals from the OLS regression against time to look for
any discernible pattern. From viewing the residuals, what can you infer about
your point estimates and your standard errors? Anything? (Be careful.)
c.) Plot current against one-period lagged residuals to perform an
alternative visual check for the presence of serial correlation in the residuals.
Standard SHAZAM plots are adequate, but if you omit the STOP command in your
SHAZAM program and run your program interactively, you might try using the / GNU
LINEONLY option on the plot command to see a plot with better resolution. To get
a printed plot, you will have to use the commfile= and datafile= options as
suggested in the second lab session.
Since these are quarterly data, you should probably also try current against other
lagged residuals. Use GENR e1=lag(e), GENR e2=lag(e,2), GENR e3=lag(e,3) and GENR
e4=lag(e,4). Which correlation is strongest? (A quantitative measure can be
found using a STAT ... / PCOR command.) You might also want to try regressing
current period residuals on residuals lagged once, twice, three times, and four
times (in the same regression), being careful to employ the AUXRSQR option. You
might also consider using the NOCONSTANT option. (Why might regression through
the origin be imposed?) What are the coefficients for your regressions of current
on lagged residuals? What about the significance levels of the point estimates?
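To make the idea of correlating current with lagged residuals concrete, here is a minimal Python sketch. It is not SHAZAM output and does not reproduce the STAT / PCOR command; the helper names lag and correlation are hypothetical, and missing lagged values are represented by None and simply dropped.

```python
def lag(series, k=1):
    """Shift a series back k periods (k >= 1); the first k entries are None (missing)."""
    return [None] * k + list(series[:-k])


def correlation(x, y):
    """Pearson correlation over the pairs where both values are present."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    sxx = sum((a - mean_x) ** 2 for a, _ in pairs)
    syy = sum((b - mean_y) ** 2 for _, b in pairs)
    return sxy / (sxx * syy) ** 0.5
```

For a residual series e, correlation(e, lag(e, k)) for k = 1, ..., 4 gives a rough sense of which lag is most strongly related to the current residual.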
d.) Now be more rigorous about testing statistically for the presence of
first-order serial correlation. Obtain and evaluate the Durbin-Watson test
statistic for the null hypothesis of no serial correlation. You can do this with
the RSTAT or EXACTDW (better) options on your regression. What
hypothesis is tested using this statistic? Do you need to use the tables in the
back of the textbook to perform the hypothesis test? What evidence do you find
regarding first-order serial correlation from your DW test statistic?
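For reference, the Durbin-Watson statistic is the sum of squared first differences of the residuals divided by the sum of squared residuals. The short Python sketch below (the function name is hypothetical) only illustrates that arithmetic; the RSTAT and EXACTDW options remain the way to obtain the statistic in SHAZAM.

```python
def durbin_watson(residuals):
    """Durbin-Watson d: sum of squared first differences over sum of squared residuals."""
    numerator = sum((curr - prev) ** 2 for prev, curr in zip(residuals, residuals[1:]))
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator
```

Values of d near 2 are consistent with no first-order serial correlation, values near 0 with positive correlation, and values near 4 with negative correlation.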
e.) You could correct for serial correlation the hard way by using your
estimate of the first-order error correlation from the STAT command in (c.) or by
using the DW test statistic, transformed according to the relationship between rho
and d, to transform all of the data into generalized differences. You could then
re-run the model on the transformed data and check to see whether the new
residuals display less serial correlation. Instead, use the AUTO command to
obtain the optimum transformation of the data using the value of rho
produced by the Cochrane-Orcutt procedure.
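The "hard way" described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not what the AUTO command does internally: it uses the standard approximation d = 2(1 - rho), and the function names are hypothetical.

```python
def rho_from_dw(d):
    """Approximate first-order autocorrelation implied by the DW statistic: rho = 1 - d/2."""
    return 1.0 - d / 2.0


def generalized_difference(series, rho):
    """Return x_t - rho * x_(t-1) for t = 2, ..., n; the first observation is lost."""
    return [curr - rho * prev for prev, curr in zip(series, series[1:])]
```

Every variable in the regression, including the dependent variable, would be transformed with the same rho. The constant term is affected as well, since a column of ones becomes a column of (1 - rho).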
f.) Compare the point estimates and the standard error estimates for the
parameters which you get using plain OLS and the model "corrected" for serial
correlation. Since both estimators produce unbiased estimates of the true
underlying population parameters, do we expect the coefficients to be identical
for both procedures? Why or why not?
g.) Compare the predictive ability of the model under the AUTO procedure with that
under naive OLS. In order to do this, you will need to
save the fitted values of the dependent variable from the OLS model in one
variable (say, using PREDICT=FITOLS) and the fitted values for the AUTO model
(say, using PREDICT=FITAUTO). On two separate plots, display the actual and
fitted values for OLS and then for AUTO. (This will be most visible with / GNU
LINE or /GNU LINEONLY on the PLOT command, but printing these plots, as usual, is
a trifle involved.) Comment.
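The plots are the comparison requested here. As an optional numerical complement, which the problem does not ask for, a root-mean-squared error could be computed for each set of fitted values. The Python sketch below uses a hypothetical helper name and assumes the actual and fitted series cover the same observations.

```python
def rmse(actual, fitted):
    """Root-mean-squared difference between actual and fitted values."""
    errors = [a - f for a, f in zip(actual, fitted)]
    return (sum(err ** 2 for err in errors) / len(errors)) ** 0.5
```

A smaller value for one model's fitted series indicates that, on average, it tracks the actual values more closely within the sample.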
h.) Now run your program again, this time bringing the sample up to 1997
(using sample 1 154). Comment on the difference in your findings.
Updated: 6:46 PM 12/7/98; Prepared by: Trudy Ann Cameron