UNIVERSITY OF CALIFORNIA, LOS ANGELES
Department of Economics
Economics 143 (Cameron) - Applied Regression Analysis

Computing Lab Session #3:
Distribution of Sample Regression Functions

Main Idea to be Gleaned

Usually, we get only one sample from a population to work with in drawing inferences about the characteristics of the relationship between variables in that population. Here, we look at the distribution of possible intercept and slope coefficients if we had the luxury of drawing a large number of independent random samples from the same parent population. For each sample, we compute the sample regression function and save the point estimates of the intercept and the slope. After accumulating the results of many such samples, we can look at the marginal distribution of the intercept estimates, the marginal distribution of the slope estimates, and the joint distribution of the intercept and slope together. We can also visualize the range of possible fitted regression lines and see something that looks like the "confidence interval for prediction" that will be covered in a theoretical sense in the lectures.
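The idea can be tried outside SHAZAM with a short Python sketch. It is only an illustration under stated assumptions. The 55-observation population is synthetic (it is not table5_1.dat), and the sample size, number of simulations, seed, and all names are hypothetical choices made for this example.

```python
import random

random.seed(143)  # arbitrary seed so the run is repeatable

# Synthetic stand-in for the 55-observation population (not table5_1.dat).
N_POP = 55
x_pop = [random.uniform(0.0, 10.0) for _ in range(N_POP)]
y_pop = [2.0 + 0.5 * x + random.gauss(0.0, 1.0) for x in x_pop]


def fit_line(x, y):
    """Return (intercept, slope) from an OLS regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    return y_bar - slope * x_bar, slope


# "True" coefficients: the regression on the whole population.
pop_intercept, pop_slope = fit_line(x_pop, y_pop)

N_SIM = 200     # number of simulated samples (assumed)
N_SAMPLE = 20   # observations per sample (assumed)
intercepts, slopes = [], []
for _ in range(N_SIM):
    idx = random.sample(range(N_POP), N_SAMPLE)  # draw without replacement
    a, b = fit_line([x_pop[i] for i in idx], [y_pop[i] for i in idx])
    intercepts.append(a)
    slopes.append(b)

print(f"population:        intercept={pop_intercept:.3f}  slope={pop_slope:.3f}")
print(f"mean over samples: intercept={sum(intercepts) / N_SIM:.3f}  "
      f"slope={sum(slopes) / N_SIM:.3f}")
```

The lists intercepts and slopes hold one point estimate per sample, so a histogram of either list approximates the corresponding marginal sampling distribution.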

Tasks to be Performed
  1. Copy the program n:plotsrf.sha from the network (or, preferably, download it from the Web site) to your c: or a: drive, so that it can be edited.

  2. When you have loaded the program into your editor, note the location specified for the READ file and change it, if necessary, to the location from which you will read the data file. This will be a:table5_1.dat if you are working from a diskette, perhaps c:\temp\table5_1.dat if you are working in the lab, or c:\shazam\table5_1.dat if you are working with your own copy of SHAZAM for Windows.

  3. Preliminaries: Look over the contents of the program so you have a feel for what tasks will be performed.

    1. Find the OLS regression that uses the entire population of 55 observations.

    2. Find the block of commands enclosed by do #=1,[nsim] at the beginning and endo at the end. Each time the program loops through this set of commands, the character # is replaced in the code by the number of the current iteration.

    3. Notice the use of the sort and sample commands to obtain different random samples from the population of 55 observations.

    4. Note the options on the ols commands (within this "do-loop") that save the fitted coefficients from each regression (coef=) as well as the vector of fitted values for the dependent variable (predict=).

    5. If you enjoy the challenge, review the basic matrix commands in SHAZAM and figure out what is happening in the matrix and copy commands. Don't panic if you do not know matrix algebra. This program is just a tool to demonstrate an important point. You will not be expected to write SHAZAM code of this complexity during the course.

    6. Find the crucial plot commands. When the sample is set to sample 1 [nsamsim], the command plot qdd pdd / gnu will display the range of fitted sample regression functions obtained from the alternative samples. When the sample is set to sample 1 [nsim], the command plot bb1 bb2 / gnu will show the correlation between the slope and the intercept estimates across the different sample regressions we will be estimating.

  4. Now, run the plotsrf.sha program. Have a pencil and paper handy. When the program first pauses, you should be able to move the output window to the "top" and see the regression estimates for the entire population (all 55 observations). Note these "true" values for the slope and intercept parameters.

  5. Override the pause and watch the program go through the iterations. In each iteration, a separate random sample is drawn, and a sample regression function is calculated. When you get to the next pause, move the output window to the top again and make a note of the recommended form of the first plot command, given the values you have established for the number of simulations. Override the pause to continue.

  6. At the next pause, note the second set of plotting instructions for the current run of the program. Override the pause to continue.

  7. When you eventually get to the end of the pre-written program and back to the TYPE COMMAND prompt, issue the recommended plotting commands and observe what happens.

  8. Questions:

    1. When we study "confidence intervals for prediction" in simple regression, we will derive the shape of the distribution of sample regression fitted values around the true population regression function. Describe what shape you expect this confidence band to have, based on the evidence in your simulations. (You may need to try more than 100 simulated samples to get a really clear sense of this shape.)

    2. How would you describe the shape of the sampling distribution of intercepts? Of slopes? (Again, you may need more than 100 simulations to see a tendency.)

    3. What is the relationship between the slope and the intercept across your different random samples from the population and the different sample regression functions they produce? Are they correlated? How? Is this logical?
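For readers who want to experiment outside SHAZAM, the loop examined in step 3 and the summaries behind the questions can be sketched in Python. This is an illustration, not a translation of plotsrf.sha. The population is synthetic rather than table5_1.dat, the per-sample size NSAM is an assumed value, and every name is hypothetical. The stacked lists have NSAM times NSIM entries, which may or may not match the program's nsamsim.

```python
import random

random.seed(3)  # arbitrary seed so the run is repeatable

# Synthetic population of 55 (x, y) pairs; the lab reads table5_1.dat instead.
population = [(float(x), 10.0 + 0.8 * x + random.gauss(0.0, 5.0))
              for x in range(1, 56)]

NSIM = 100   # number of simulated samples
NSAM = 15    # observations kept per sample (assumed value)


def ols(x, y):
    """Return (intercept, slope) from an OLS regression of y on x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    return y_bar - slope * x_bar, slope


def correlation(u, v):
    """Pearson correlation between two equal-length lists."""
    n = len(u)
    u_bar, v_bar = sum(u) / n, sum(v) / n
    s_uv = sum((a - u_bar) * (b - v_bar) for a, b in zip(u, v))
    s_uu = sum((a - u_bar) ** 2 for a in u)
    s_vv = sum((b - v_bar) ** 2 for b in v)
    return s_uv / (s_uu * s_vv) ** 0.5


intercepts, slopes = [], []   # one entry per sample
x_stack, fit_stack = [], []   # x and fitted values from all samples, end to end

for _ in range(NSIM):
    # Sort the population on a fresh random key and keep the first NSAM rows.
    shuffled = sorted(population, key=lambda row: random.random())
    sample = shuffled[:NSAM]
    x = [row[0] for row in sample]
    y = [row[1] for row in sample]
    a, b = ols(x, y)
    intercepts.append(a)
    slopes.append(b)
    x_stack.extend(x)
    fit_stack.extend(a + b * xi for xi in x)


def fitted_spread(x0):
    """Standard deviation, across samples, of the fitted value at x0."""
    fits = [a + b * x0 for a, b in zip(intercepts, slopes)]
    mean = sum(fits) / len(fits)
    return (sum((f - mean) ** 2 for f in fits) / (len(fits) - 1)) ** 0.5


r = correlation(intercepts, slopes)
print(f"correlation(intercept, slope) = {r:.3f}")
for x0 in (1, 14, 28, 42, 55):
    print(f"x = {x0:2d}: spread of fitted values = {fitted_spread(x0):.3f}")
```

A scatter of fit_stack against x_stack plays the role of the first plot, and a scatter of slopes against intercepts plays the role of the second. Comparing the printed spreads at different values of x, and rerunning with a larger NSIM, is one way to check your answers to the questions above.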


Updated: 2:14 PM 10/19/98; prepared by: Trudy Ann Cameron