UNIVERSITY OF CALIFORNIA, LOS ANGELES
Department of Economics
Economics 143 (Cameron) - Applied Regression Analysis
Computing Lab Session #3:
Distribution of Sample Regression Functions
Main Idea to be Gleaned
Usually, we get only one sample from a population to work with in drawing inferences about the characteristics of the relationship between variables in that population. Here, we look at the distribution of possible intercept and slope coefficients if we had the luxury of drawing a large number of independent random samples from the same parent population. For each sample, we compute the sample regression function and save the point estimates of the intercept and the slope. After accumulating the results of many such samples, we can look at the marginal distribution of the intercept estimates, the marginal distribution of the slope estimates, and the joint distribution of the intercept and slope together. We can also visualize the range of possible fitted regression lines and see something that looks like the "confidence interval for prediction" that will be covered in a theoretical sense in the lectures.
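For readers who want to see the same idea outside SHAZAM, here is a minimal Python sketch of the experiment described above. It is not a translation of plotsrf.sha. The population is synthetic (55 made-up observations standing in for table5_1.dat, which is not reproduced here), the sample size of 20 and the 200 simulations are arbitrary assumptions, and the names ols_simple and simulate_srf are hypothetical helpers invented for this illustration. Sampling is done without replacement, which is one plausible reading of what sorting on a random variable and keeping the first few observations achieves.

```python
import random
import statistics


def ols_simple(x, y):
    """Least-squares intercept and slope for the line y = b1 + b2 * x."""
    xbar = statistics.fmean(x)
    ybar = statistics.fmean(y)
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return ybar - slope * xbar, slope


def simulate_srf(pop_x, pop_y, nsam, nsim, seed=0):
    """Draw nsim random samples of size nsam (without replacement) from the
    population and return the lists of estimated intercepts and slopes."""
    rng = random.Random(seed)
    intercepts, slopes = [], []
    for _ in range(nsim):
        chosen = rng.sample(range(len(pop_x)), nsam)
        b1, b2 = ols_simple([pop_x[i] for i in chosen],
                            [pop_y[i] for i in chosen])
        intercepts.append(b1)
        slopes.append(b2)
    return intercepts, slopes


# Synthetic stand-in population of 55 observations (NOT the course data).
pop_rng = random.Random(143)
pop_x = [pop_rng.uniform(0, 100) for _ in range(55)]
pop_y = [10 + 0.5 * x + pop_rng.gauss(0, 5) for x in pop_x]

# "True" values: the regression that uses the entire population.
true_b1, true_b2 = ols_simple(pop_x, pop_y)

# Assumed settings: 200 simulated samples of 20 observations each.
b1s, b2s = simulate_srf(pop_x, pop_y, nsam=20, nsim=200)

print("population intercept and slope:", true_b1, true_b2)
print("mean of simulated intercepts:  ", statistics.fmean(b1s))
print("mean of simulated slopes:      ", statistics.fmean(b2s))
```

The lists b1s and b2s play the role of the saved coefficient estimates: each can be examined on its own for a marginal distribution, and the two can be plotted against each other for the joint distribution.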
Tasks to be Performed
- Copy the program n:plotsrf.sha from the network (or, preferably, download
it from the Web site) to your c: or a: drive, so that it can be edited.
- When you have loaded the program into your editor, note the location specified
for the READ file and alter this as
necessary to reflect the location from which you will be reading the data file,
which is called a:table5_1.dat if you are working from a
diskette, or perhaps c:\temp\table5_1.dat if you are working in the lab, or c:\shazam\table5_1.dat
if you are working with your own copy of SHAZAM for Windows.
- Preliminaries: Look over the contents of the program so
you have a feel for what tasks will
be performed.
- Find the OLS regression that uses the entire population of 55 observations.
- Find the block of commands enclosed by do #=1,[nsim] at the beginning
and endo at the end. Each time the program loops through this set of
commands, the character # is replaced in the code by the number of the current
iteration.
- Notice the use of the sort and sample commands to obtain
different random samples from the population of 55 observations.
- Note the options on the ols commands (within this "do-loop") that save
the fitted coefficients from each regression (coef=) as well as the vector
of fitted values for the dependent variable (predict=).
- If you enjoy the challenge, review the basic matrix commands in SHAZAM and
figure out what is happening in the matrix and copy commands. Don't
panic if you do not know matrix algebra. This program is just a tool to
demonstrate an important point. You will not be expected to write SHAZAM code of
this complexity during the course.
- Find the crucial plot commands. When the sample is set to
sample 1 [nsamsim], the command plot qdd pdd / gnu will display the range of
fitted sample regression functions across the alternative samples. When the
sample is set to sample 1 [nsim], the command plot bb1 bb2 / gnu will show the
correlation between the slope and the intercept estimates across the different
sample regressions.
- Now, run the plotsrf.sha program. Have a pencil and paper handy. When
the program first pauses, you should be able to move the output window to the
"top" and see the regression estimates for the entire population (all 55
observations). Note these "true" values for the slope and intercept parameters.
- Over-ride the pause and watch the program go through the iterations. In each
iteration, a separate random sample is drawn, and a sample regression function is
calculated. When you get to the next pause, move the output window to the top again and
make a note of the recommended form
of the first plot command, given the values you have established for the number of
simulations. Over-ride the pause to continue.
- At the next pause, note the second set of plotting instructions for the
current run of the program. Over-ride the pause to continue.
- When you eventually get to the end of the pre-written program and back to the
TYPE COMMAND prompt, issue the recommended plotting
commands and observe what happens.
- Questions:
- When we study "confidence intervals for prediction" in simple regression,
we will derive the shape of the distribution of sample regression fitted values
around the true population regression function. Describe what shape you expect
this confidence band to have, based on the evidence in your simulations. (You may
need to try more than 100 simulated samples to get a really clear sense of this
shape.)
- How would you describe the shape of the sampling distribution of intercepts?
Of slopes? (Again, you may need more than 100 simulations to see a tendency.)
- What is the relationship between the slope and the intercept across your
different random samples from the population and the different sample regression
functions they produce? Are they correlated? How? Is this logical?
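If you would like to back up your visual impressions with numbers, the two plots can be summarized roughly as follows. This is only a sketch in Python, not part of the SHAZAM program: pearson_r and fitted_range are hypothetical helper names, and the three coefficient pairs are made-up placeholders chosen so the arithmetic is easy to check by hand. They say nothing about what your own simulations will show; substitute the intercepts and slopes you actually saved.

```python
import math


def pearson_r(a, b):
    """Sample correlation coefficient between two equal-length lists."""
    abar = sum(a) / len(a)
    bbar = sum(b) / len(b)
    sab = sum((ai - abar) * (bi - bbar) for ai, bi in zip(a, b))
    saa = sum((ai - abar) ** 2 for ai in a)
    sbb = sum((bi - bbar) ** 2 for bi in b)
    return sab / math.sqrt(saa * sbb)


def fitted_range(intercepts, slopes, x_grid):
    """For each x in x_grid, return (x, lowest fitted value, highest fitted
    value) across all the fitted lines b1 + b2 * x."""
    band = []
    for x in x_grid:
        fits = [b1 + b2 * x for b1, b2 in zip(intercepts, slopes)]
        band.append((x, min(fits), max(fits)))
    return band


# Placeholder coefficients (made up for illustration, NOT simulation output).
demo_b1 = [1.0, 3.0, 2.0]   # intercepts
demo_b2 = [2.0, 1.5, 2.5]   # slopes

r = pearson_r(demo_b1, demo_b2)
band = fitted_range(demo_b1, demo_b2, [0.0, 2.0, 4.0])

print("correlation of intercepts and slopes:", r)
for x, low, high in band:
    print("x =", x, " fitted values range from", low, "to", high)
```

With your own results, comparing the width (highest minus lowest) of the range at several values of x gives a numerical counterpart to the first plot, and the correlation coefficient gives one for the second.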
Updated: 2:14 PM 10/19/98; prepared by: Trudy Ann Cameron