March 5, 1998

Due:

NETWORK FILES NEEDED: n:het1.dat, n:het2.dat, n:het1.sha, n:het2.sha, n:wls_sim.sha

NOTE: After this, there will be one more homework set, covering serially correlated errors, endogeneity, and dummy dependent variable models. However, that homework set will NOT be turned in for grading. It is designed solely to give you an idea of the types of exam issues you might encounter concerning these models.

1. Weighted least squares (WLS) is the most common technique to use for
(mostly) cross-sectional data where heteroskedasticity is a problem. We
will look at a pair of contrived data sets in the files n:het1.dat and
n:het2.dat. In n:het1.dat, there are 50 observations for recently hired
production line workers on two variables: weeks of experience
(**weeks_i**) and the percent of perfect gizmos produced.
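
Since the contents of n:het1.dat are not reproduced here, the grouped-variance idea can be sketched in Python on invented data (the sample layout, parameter values, and group structure below are assumptions, not the real data set). With repeated observations at each experience level, the conditional error variance can be estimated group by group from the naive OLS residuals, and its reciprocal used as the WLS weight:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for n:het1.dat: 50 workers, 10 at each of
# 5 experience levels, with error variance growing with experience.
weeks = np.repeat([1, 2, 3, 4, 5], 10)
y = 60.0 + 5.0 * weeks + rng.normal(0.0, 2.0 * weeks)  # heteroskedastic errors

# Naive OLS first, to obtain residuals.
X = np.column_stack([np.ones_like(weeks, dtype=float), weeks])
b_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
e = y - X @ b_ols

# Repeated observations at each x let us estimate the conditional
# error variance group by group; WLS weights are the reciprocals.
group_var = {w: np.var(e[weeks == w], ddof=1) for w in np.unique(weeks)}
w_i = np.array([1.0 / group_var[w] for w in weeks])

# WLS is OLS on all variables (constant included) scaled by sqrt(weight).
s = np.sqrt(w_i)
b_wls, *_ = np.linalg.lstsq(X * s[:, None], y * s, rcond=None)
```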

2. At other times, we will *not* have the luxury of repeated observations
at each value of the explanatory variable(s) that would let us calculate
an estimate of the conditional error variance for each group of
observations to use in our weighting computations. The data set in
n:het2.dat contains another sample of 50 new workers for the same firm,
but this time experience is reported in days (**days_i**). The dependent
variable is again the percent of perfect gizmos produced.
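
With no repeated x values there are no groups to average over, so one common remedy is to posit a model for the error variance instead. The sketch below uses invented data standing in for n:het2.dat, and assumes for illustration that the variance is proportional to days squared (which implies dividing every variable, constant included, by days):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical stand-in for n:het2.dat: 50 workers, experience in days,
# no repeated x values; the error variance grows with days^2 by
# construction here (an assumption, not the real data set).
days = rng.uniform(5.0, 150.0, size=50)
y = 60.0 + 0.5 * days + rng.normal(0.0, 0.05 * days)

# Naive OLS first; the squared residuals reveal the variance pattern.
X = np.column_stack([np.ones_like(days), days])
b_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
e2 = (y - X @ b_ols) ** 2

# Positing var(e_i) = sigma^2 * days_i^2 implies weights 1/days_i^2,
# i.e. scaling every variable (constant included) by 1/days_i.
s = 1.0 / days
b_wls, *_ = np.linalg.lstsq(X * s[:, None], y * s, rcond=None)
```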

3. Verifying what goes on in the background when you specify the
**weight=** option on an OLS command: Run the simulation program
contained in n:wls_sim.sha and review the output. This program creates a
single sample of size 300 for a dependent variable **d** and an
explanatory variable **r** with a known form of heteroskedasticity in
the population regression function from which the data were drawn. Each
time the program is run, an entirely different set of data will be
created, so your results will differ from those of other people in your
study group.

Review the commentary and questions contained in the program, then run it
and see if your results are typical. Does the **ols d r / weight=...**
method of producing weighted least squares regression estimates yield the
same results as those obtained by making explicit transformations of all
the variables in the model and running the regression on the transformed
data? Note also that the transformed data, when plotted, probably look
much more like they satisfy the maintained hypotheses of ordinary least
squares regression. A scatterplot of the transformed data certainly looks
different from a scatterplot of the raw data.
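
The equivalence can be checked directly. The Python sketch below (invented data; only the variable names **d** and **r** follow the handout) computes the estimates both ways, once by solving the weighted normal equations, as a **weight=** option does internally, and once by plain OLS on the transformed variables:

```python
import numpy as np

rng = np.random.default_rng(2)

# Invented sample mimicking the simulation: variance grows with r^2.
n = 300
r = rng.uniform(1.0, 10.0, size=n)
d = 2.0 + 3.0 * r + rng.normal(0.0, r)
w = 1.0 / r**2                      # weights proportional to 1/variance

X = np.column_stack([np.ones(n), r])

# Route 1: solve the weighted normal equations (X'WX)b = X'Wd directly.
W = np.diag(w)
b_weighted = np.linalg.solve(X.T @ W @ X, X.T @ W @ d)

# Route 2: scale every variable by sqrt(w_i) and run ordinary OLS.
s = np.sqrt(w)
b_transformed, *_ = np.linalg.lstsq(X * s[:, None], d * s, rcond=None)

print(np.allclose(b_weighted, b_transformed))  # prints: True
```

Both routes minimize the same weighted sum of squared errors, so the coefficient vectors coincide to machine precision.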

Mission: insert lines to perform an initial "naive" OLS regression without
any weights. Save the fitted residuals and square them. Then regress these
squared residuals alternately on **r** (the explanatory variable in the
main model), on the square of **r**, and on both at the same time. Do you
recover the type of heteroskedasticity that was used in the production of
the data? The lesson here is that a sample will not always reveal exactly
the type of heteroskedasticity that afflicts the population from which the
sample is drawn. BOTTOM LINE: We usually just do the best we can to
characterize the approximate nature of our heteroskedasticity problem. We
then use this information to effect the most appropriate WLS solution that
we can come up with.
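
The auxiliary-regression step can be sketched as follows, again on invented Python data in place of the SHAZAM simulation (here the true variance is proportional to r squared by construction, so the r-squared specification should tend to fit the squared residuals best, though any one sample may be ambiguous):

```python
import numpy as np

rng = np.random.default_rng(3)

# Invented sample: variance proportional to r^2 by construction.
n = 300
r = rng.uniform(1.0, 10.0, size=n)
d = 2.0 + 3.0 * r + rng.normal(0.0, r)

# Step 1: naive OLS, then save and square the residuals.
X = np.column_stack([np.ones(n), r])
b, *_ = np.linalg.lstsq(X, d, rcond=None)
e2 = (d - X @ b) ** 2

# Step 2: regress e^2 on r, on r^2, and on both, comparing fits.
def fit_r2(Z, y):
    """R-squared from regressing y on Z (constant already in Z)."""
    coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
    resid = y - Z @ coef
    return 1.0 - (resid @ resid) / ((y - y.mean()) @ (y - y.mean()))

ones = np.ones(n)
r2_lin  = fit_r2(np.column_stack([ones, r]), e2)
r2_sq   = fit_r2(np.column_stack([ones, r**2]), e2)
r2_both = fit_r2(np.column_stack([ones, r, r**2]), e2)
```

Because the two-regressor model nests each one-regressor model, its fit is never worse; the interesting comparison is how much the r-squared term adds over r alone.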


Updated: March 4, 1998

Prepared by: Trudy Ann Cameron