UNIVERSITY OF CALIFORNIA, LOS ANGELES
Department of Economics
Economics 143 (Cameron) - Applied Regression Analysis
Computing Lab Session #7: Heteroscedasticity; Stapler Experiment Data
Goals for this Lab:
Heteroscedasticity
- Conduct a Monte Carlo exercise to observe the sampling distribution of simple
regression slope and intercept estimates when the population regression function
is characterized by heteroscedasticity.
- Reinforce the idea that OLS and WLS are both unbiased estimators, but that
correctly weighted WLS is more efficient (has lower variance) when there is
heteroscedasticity.
Functional Form (REAL LIVE DATA)
- Read about the Fall 1997 Econ 1 "Stapler Experiment"
- Explore the data for viable functional forms
- Assess the data for multicollinearity, heteroscedasticity
Tasks:
1. The program entitled wls_eff.sha does the following things:
- "Creates" population data for the revenues and fund-raising expenditures of
55 non-profit organizations. If these data were real, we would worry that they
are jointly determined, but here, we will assume that fund-raising expenditures
depend upon available revenues (and not vice-versa).
- Checks the PRF - it will be different for everybody, because everybody
will have a different set of population data.
- As in the plotsrf.sha program, we will then draw
100 different samples of size 10 from this population. For each sample, we will
fit a regression slope and intercept by OLS and also by WLS using the weight
that should be appropriate, given the way the data were created.
- We will then look at the average point estimates of the regression parameters
produced by each method, and also (most importantly) at the standard deviations of
these point estimates. The standard deviations for the WLS estimators should be
smaller than those for the OLS estimators. This is what we mean by WLS being a
"more efficient" estimator in the presence of heteroscedasticity.
2. In the Fall quarter of 1997, I taught an Economics 1 course wherein I put
the students to work in order to generate a "real" production function.
In each of eleven discussion sections, most with approximately 36 students,
students organized themselves into "firms" of about nine people. Each firm was
asked to produce "tri-fold mailers." Each mailer is a sheet of letter paper
folded into thirds and stapled at each end. Each firm started with one
stapler and one worker and did a production run of 60 seconds. Then they
increased the labor to two workers, then three, and so on. Then they
moved to two staplers and added incremental units of labor. The complete
description of the experiment can be found at my 1997 Econ 1 website. The unit
of observation is a 60-second work shift. Data from the experiment can be read
as follows:
sample 1 254
read(e1prod.dat) firm shift staplers qcon coo &
labor totalq goodq rejectq
sample 3 254
* The first two observations are "fake" observations so that some plots
* would come out nicely. They force zero output with zero inputs
For our purposes, the relevant variables are:
- STAPLERS = number of staplers used on this shift
- LABOR = number of workers used on this shift
- TOTALQ = total units produced
- GOODQ = total units produced that meet quality control requirements
- REJECTQ = total units produced that do not meet quality control
requirements
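For anyone exploring the data outside SHAZAM, the read step above might be mirrored as follows. This is a sketch that assumes e1prod.dat is a whitespace-delimited text file with one shift per row and the nine variables in the order given in the read command; the file layout and the function names parse_shifts and read_shifts are assumptions, not something the handout specifies.

```python
from pathlib import Path

# Variable order taken from the SHAZAM read command.
COLUMNS = ["firm", "shift", "staplers", "qcon", "coo",
           "labor", "totalq", "goodq", "rejectq"]


def parse_shifts(text, skip_fake=2):
    """Parse whitespace-delimited rows into dicts keyed by variable name.

    The first `skip_fake` rows are dropped, mirroring `sample 3 254`,
    which excludes the two "fake" zero-input observations.
    """
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # ignore blank lines
        if len(fields) != len(COLUMNS):
            raise ValueError(f"expected {len(COLUMNS)} fields, got {len(fields)}")
        rows.append(dict(zip(COLUMNS, (float(f) for f in fields))))
    return rows[skip_fake:]


def read_shifts(path, skip_fake=2):
    """Read the data file at `path` (assumed layout; see parse_shifts)."""
    return parse_shifts(Path(path).read_text(), skip_fake=skip_fake)
```

A call such as read_shifts("e1prod.dat") would then return one dictionary per real work shift, from which LABOR, STAPLERS, and TOTALQ can be pulled for plotting or regression.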
1. Explore these data, selecting a reasonable production function for tri-fold
mailers. Justify your choice.
a.) Does (can?) your production function exhibit diminishing returns to labor
as increasing amounts of labor are applied to a given quantity of capital
equipment?
b.) Does (can?) negative marginal productivity set in within the range of
your data?
2. These are cross-sectional data. What common data pathologies might you be
on the lookout for? Are they present here? Explain.
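As a starting point for question 1, it may help to compare what two common functional forms can and cannot do. The notation here is an assumption for illustration: Q stands for output (TOTALQ or GOODQ), L for LABOR, and K for STAPLERS. Neither form is asserted to be the right one for these data.

```latex
% Cobb-Douglas (log-log) form: Q = A L^{\alpha} K^{\beta}
\[
\frac{\partial Q}{\partial L} = \alpha \frac{Q}{L},
\qquad
\frac{\partial^2 Q}{\partial L^2} = \alpha(\alpha - 1)\frac{Q}{L^2}.
\]
% With 0 < \alpha < 1 the marginal product of labor is positive and falling:
% diminishing returns are possible, but a negative marginal product is not.

% Quadratic-in-labor form: Q = \beta_1 L + \beta_2 L^2 + (\text{terms not involving } L)
\[
\frac{\partial Q}{\partial L} = \beta_1 + 2\beta_2 L,
\qquad
\frac{\partial^2 Q}{\partial L^2} = 2\beta_2.
\]
% With \beta_1 > 0 and \beta_2 < 0 returns diminish, and the marginal product
% turns negative once L > -\beta_1 / (2\beta_2).
```

Under the quadratic form, comparing the estimated turning point with the largest value of LABOR in the sample indicates whether negative marginal productivity sets in within the range of the data. The log-log form requires strictly positive Q, L, and K, so it cannot use zero-input observations.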
Updated: 4:15 PM 11/23/98; Prepared by: Trudy Ann Cameron