UNIVERSITY OF CALIFORNIA, LOS ANGELES
Department of Policy Studies
Winter 1998 Cameron
Policy Studies 208 - Final Examination

INSTRUCTIONS: Answer all questions in the space provided (or indicate clearly where you have continued your answer on the back of the page). Calculators are NOT permitted. Reduce all computations to the simplest form so that anyone with a calculator could obtain the answer easily. Show your work and reasoning to the fullest extent possible so that part marks can be assigned as warranted. You have three hours to complete this exam. There are 25 questions (or question sections) worth 5 points each except where noted. Total points = 125. Budget your time carefully. Exhibit pages should not be turned in with your exam. Remember: answer questions in a manner that reflects the econometric reasoning you have learned in this course.

1.  Exhibit A describes an analysis of some fictitious data concerning graduate student movie-going behavior as a function of the average full-price (including parking costs) of movies at theater complexes in the student's local area and the annual income of the student (in thousands of dollars per year).

a.)  According to Regression A1 and Regression A2, are movies a normal or an inferior good, on average, for the graduate students in the sample? Or are they a normal good for lower-income students and an inferior good for higher-income students? Explain.

b.)  What does the diagnos / het output following Regression A2 tell us? What are the implications for the standard OLS results produced by Regression A2?

c.)  Consider Regression A3, Regression A4, Regression A5, and Regression A6. Is there one exogenous variable that is unambiguously the most closely related to the sizes of the unobserved individual conditional error variances, σ_i^2? Explain.

d.)  Among Regression A7, Regression A8, and Regression A9, which specification is inappropriate as a potential remedy for the problems afflicting Regression A2? Explain why.

e.)  In this example, are the substantive implications of the fitted model altered by the use of weighted least squares methods? Discuss.

f.)  If you did not have to worry about violations of the maintained hypotheses for OLS regarding error terms, would you prefer the linear specification in Regression A2 or the log-log specification in Regression A10? By what criterion? Explain.

g.)  Sometimes, using a log-log model will eliminate a heteroscedasticity problem. Is this the case here? Explain. Mention the circumstances under which a logarithmic transformation of the dependent variable will perfectly remedy a heteroscedasticity problem.

2. The following questions pertain to EXHIBIT B. These are real data, and we will explore a preliminary model to explain the observed monthly time-series variation in new construction of public buildings for education. The variables read by the program are defined as follows:

YRMO = year and month in CITIBASE format (e.g., 9509 = September 1995).
PUBLIC = New construction, public buildings, educational (million $, monthly, not seasonally adjusted) [CITIBASE variable CZONQE; (1964:1-1995:12)].
P1 = Population estimate; under 5 years (thousands, annual) [CITIBASE variable PAN1; annual data replicated for each month of the corresponding year; (1964-1995)].
P2 = Population estimate; 5-9 years (thousands, annual) [CITIBASE variable PAN2; annual data replicated for each month of the corresponding year; (1964-1995)].
P3 = Population estimate; 10-14 years (thousands, annual) [CITIBASE variable PAN3; annual data replicated for each month of the corresponding year; (1964-1995)].
P4 = Population estimate; 15-19 years (thousands, annual) [CITIBASE variable PAN4; annual data replicated for each month of the corresponding year; (1964-1995)].
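For readers handling the data themselves, the YRMO coding described above can be unpacked as in the sketch below. This is an illustration only: the function name `decode_yrmo` is hypothetical, and the code assumes every year falls in the 1900s, which is consistent with the 1964-1995 sample period but is not stated as a rule of the CITIBASE format.

```python
from typing import Tuple


def decode_yrmo(yrmo: int) -> Tuple[int, int]:
    """Split a YYMM code such as 9509 into (year, month)."""
    # The last two digits are the month, the leading digits the two-digit year.
    yy, mm = divmod(yrmo, 100)
    if not 1 <= mm <= 12:
        raise ValueError(f"invalid month in YRMO code {yrmo}")
    # Assumption: all years are in the 1900s (sample runs 1964-1995).
    return 1900 + yy, mm


print(decode_yrmo(9509))  # (1995, 9), i.e. September 1995
```
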
a.)  According to Regression B1, has new construction of public school buildings been growing over time? Explain.

b.)  According to Regression B1, does new construction of public school buildings depend upon the numbers of kids of different ages in the population? Explain.

c.)  Is there multicollinearity among the regressors in Regression B1? Is it causing any problems of inference concerning the parameters in this model? Explain.

d.)  Based on the output following Regression B1 and on the results of Regression B2, what do you suspect might be wrong with the results of Regression B1? Why?

e.)  Why does Regression B2 involve so many explanatory variables? Are we concerned that there might be multicollinearity among these regressors? Explain.

f.)  Explain succinctly the main tasks that are performed "behind the scenes" by SHAZAM when the AUTO command is used.

g.)  Consider the revised results in Regression B3: (i.) Does public school construction depend on demographics? Explain. (ii.) Does public school construction activity seem to anticipate future enrollments, or simply respond to current enrollments? Explain.

h.)  Regression B4 explores a more-general specification for the public school construction model. According to this model, does this new construction change systematically over time? Does it change systematically in response to populations of children in different age groups? Explain each answer carefully.

i.)  Is there a "typical" seasonal pattern in public school construction expenditures? Characterize this pattern. Does it conform with your intuition?


3. (10 points) Non-experimental data can sometimes make it very difficult to draw policy implications from regression analysis. Choose (a.) OR (b.)

a.) GUN CONTROL: Suppose your sample consists of households that have been victimized by robbery. The dependent variable takes a value of 1 if a household member is shot during the robbery and 0 otherwise. One of your explanatory variables is a dummy variable equal to 1 if there is a handgun present in the house, 0 otherwise. When a handgun is present in a household, an occupant of that house is much more likely to be shot in the process of a robbery than when no handgun is present. Therefore, to minimize injury and loss of life from robbery incidents, private ownership of handguns should be banned. Evaluate this policy proposal and the "evidence" upon which it is premised. Briefly describe the nature of the true "experiment" that would allow an unambiguous determination of the effect of handgun presence on robbery shootings via a regression like this.

b.) LEGALIZATION OF MARIJUANA: Suppose you have a random sample of at-risk 18-year-olds. The dependent variable is the number of times each teenager has used heroin. Among the explanatory variables is a dummy variable that takes a value of 1 if the subject experimented with marijuana prior to age 13, and 0 otherwise. You find that the coefficient on this dummy variable is positive and strongly statistically significant. Therefore, we should not legalize marijuana use (which would make it much more accessible to pre-teens) since this will lead to widespread use of heroin. Evaluate this policy proposal and the "evidence" upon which it is premised. Briefly describe the nature of the true "experiment" that would allow an unambiguous determination of the effect of pre-teen marijuana use on subsequent heroin use via a regression like this.

 
4. Assume your dependent variable takes on a value of 1 if a high-school student is affiliated with a gang and zero otherwise. Among your explanatory variables are included: family income level, GPA in school, dummy variables for father present in household and mother present in household, eligibility for after-school programs, educational attainment of each parent, etc.

a.)  What sort of estimation method would you probably choose to determine empirically the effect of after-school program eligibility on gang affiliation? How would you interpret the results? Are there any caveats you might add concerning this single-equation model?

b.)  Multicollinearity among the regressors can lead to problems in making clear inferences about the effects of changes in individual explanatory variables only in Ordinary Least Squares models. It is not a concern in fundamentally nonlinear estimation methods such as probit or logit models. True, False, Uncertain? Explain.

 
5. Suppose you are reading an article concerning the effects of immigration status on utilization levels of social services among legal and undocumented immigrants who have been in the US for less than 10 years and who have been receiving social services. You encounter the following estimated model. (Note that the sample producing these results is fictitious.)
 

SERVi = 30.90 - 1.20 TIMEi + 9.30 LEGALi + 0.33 TIMEi*LEGALi
        (5.2)  (0.31)       (5.50)        (0.20)

where SERVi = value of social services utilized (in hundreds of dollars per year);
          TIMEi = time spent in the US (in years);
          LEGALi = 1 if legal immigrant; = 0 if undocumented; i = 1,...,676.

and the parameter standard errors are given in parentheses below each point estimate.
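As a reading aid, the fitted equation can be written as a small function that returns predicted SERV for given values of TIME and LEGAL. This is only a sketch: the function and argument names are hypothetical, the coefficients are the point estimates reported above, and the standard errors are ignored.

```python
def predicted_serv(time_years: float, legal: int) -> float:
    """Predicted social-service use, in hundreds of dollars per year.

    time_years: time spent in the US, in years (TIME).
    legal: 1 for a legal immigrant, 0 for an undocumented immigrant (LEGAL).
    """
    if legal not in (0, 1):
        raise ValueError("legal must be 0 or 1")
    # Point estimates copied from the fitted model above.
    return (30.90
            - 1.20 * time_years
            + 9.30 * legal
            + 0.33 * time_years * legal)
```

Calling `predicted_serv` with `legal=0` and with `legal=1` at the same value of `time_years` shows how the dummy variable shifts the intercept and how the interaction term shifts the slope.
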


a.)   Based on the point estimates, what is the average utilization of social services for a legal immigrant in the first year after arrival in the US? ______________ For an undocumented immigrant in the first year? ___________________________

b.)   Based on the point estimates, how does utilization of social services vary with time in the US for a legal immigrant? _________________ For an undocumented immigrant? ____________________________

c.)   Overall, does legal/undocumented status have a statistically significant effect on utilization of social services? Explain.

d.)   Does this model predict that legal immigrants will always utilize more social services than undocumented immigrants (or vice-versa)? If not, how does the predicted utilization differential (legal minus undocumented) change with time in the US? When will predicted utilization be the same for both groups? Comment.

6.  Suppose you are working with individual household survey data. If you do not have data at the individual household level for one of your explanatory variables, you might be able to use group averages as a proxy for this variable (e.g. 5-digit zip code median household income instead of individual household incomes for a nation-wide sample). To the extent that the groups you use are relatively homogeneous, the proxies may be very useful in mitigating what would otherwise be omitted variables bias. The same strategy is appropriate if you do not have any individual data for your desired dependent variable. True, False, Uncertain? Explain, suggesting the best alternative if you disagree.

BONUS: (5 points)  If you estimate a regression model and get a counter-intuitive sign on a slope coefficient, what sort of problem(s) do you initially suspect? Explain.



Updated: March 26, 1998
Prepared by: Trudy Ann Cameron