UNIVERSITY OF CALIFORNIA, LOS ANGELES
Department of Economics
Economics 143 (Cameron) - Applied Regression Analysis
Computing Lab Session #6: Functional Form; Dummy Variables
Goals for this Lab
- View the sorts of scatterplots where logarithmic models can be appropriate.
- Review criteria for choosing between logarithmic and linear models.
- Practice using quadratic specifications.
- Practice using dummy variable(s).
- Practice interpreting interaction terms (here involving dummy variables).
- Practice solving an estimated quadratic model for the value of a regressor
that maximizes or minimizes the dependent variable.
Tasks:
1. The file gourmet.dat contains 100 hypothetical
observations on household annual consumption of gourmet coffee beans (in ounces)
and taxable annual household income from all sources (in thousands of dollars).
The variables are beans and income, respectively.
a.) Plot beans as a function of income. What do you
observe?
b.) Regress beans on income. If this relationship is not contaminated
by any form of omitted variables bias, what are the implications of this model for
the income elasticity of demand for gourmet coffee beans? Be specific.
c.) Consider models that involve log transformations. Generate the
logarithm of each variable and plot log(beans) against income,
beans against log(income), and log(beans) against log(income).
Which plot seems to conform best to the maintained hypotheses of
ordinary least squares regression? Why?
d.) Do regressions for each of these models involving logs. Which model
produces the best "fit"? Remember to use the correct option on the OLS command
(either LOGLOG, LOGLIN, or LINLOG, as necessary.)
In a log-log model, what demand elasticity is implied by the estimated
equation? How does this compare to the "elasticity" from a model in the levels of
each variable?
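As a rough illustration of how the implied income elasticity differs across the four specifications in Task 1, the following Python sketch fits each form by ordinary least squares with NumPy and reports the elasticity, evaluated at the sample means where it is not constant. The data are synthetic, generated with a true elasticity of 1.5; they are not the contents of gourmet.dat. The helper names ols and elasticities_at_means are illustrative assumptions, not part of the course software.

```python
import numpy as np


def ols(y, x):
    """Return (intercept, slope) from an OLS regression of y on a constant and x."""
    X = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[0], coef[1]


def elasticities_at_means(y, x):
    """Elasticity of y with respect to x implied by four functional forms.

    Where the elasticity varies with x or y, it is evaluated at the sample means.
    """
    xbar, ybar = x.mean(), y.mean()
    _, b_lin = ols(y, x)                       # y = a + b*x
    _, b_loglin = ols(np.log(y), x)            # log(y) = a + b*x
    _, b_linlog = ols(y, np.log(x))            # y = a + b*log(x)
    _, b_loglog = ols(np.log(y), np.log(x))    # log(y) = a + b*log(x)
    return {
        "linear": b_lin * xbar / ybar,
        "log-lin": b_loglin * xbar,
        "lin-log": b_linlog / ybar,
        "log-log": b_loglog,
    }


# Synthetic data only (an assumption for illustration, NOT gourmet.dat):
# constant-elasticity demand with a true income elasticity of 1.5.
rng = np.random.default_rng(0)
income = rng.uniform(20.0, 120.0, size=100)
beans = 2.0 * income ** 1.5 * np.exp(rng.normal(0.0, 0.1, size=100))

result = elasticities_at_means(beans, income)
for name, value in result.items():
    print(f"{name:8s} {value:6.3f}")
```

Only the log-log form implies an elasticity that is the same at every income level; in the other three forms the elasticity depends on the point at which it is evaluated. Note also that R-squared values are not directly comparable between models whose dependent variables differ (beans versus log(beans)).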
2. The file qcontrol.dat contains 100
hypothetical observations on numbers of data-entry errors per hour for employees
of a data-processing firm (err) as a function of the number of days of
experience the employee has with his or her job (exp).
a.) Repeat tasks similar to those in the last problem, and determine which
specification provides the better "fit."
3. The file agility.dat contains 17 hypothetical
observations on results from a test of mental agility. The variables are (in
order) the agility measure (agility), the age of the subject (age),
and a gender dummy (female = 1 if female, 0 otherwise).
a.) Try a simple regression of agility on age. What are the implications of
this model?
b.) Control for gender before determining the linear effect of age on
agility. Does gender have an effect on mental agility in these data? (Provide a
formal hypothesis concerning the true underlying parameters of the model that
corresponds to this "verbal" question. Test this hypothesis using the estimates
from this sample.)
c.) Forget about gender for the moment and explore whether agility is
linear in age, or whether the effect of an additional year on mental
agility varies with age level. [HINT: Trying a model that is quadratic in
age would be a sensible strategy to start with.] Convert this verbal issue
to a formal hypothesis about the true underlying parameters of the model and test
it using the data.
d.) At what age level is mental agility maximized or minimized (state
which)? Is this within the range of the data?
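The turning-point calculation can be sketched as follows. For an estimated model agility = b0 + b1*age + b2*age^2, setting the derivative b1 + 2*b2*age to zero gives age* = -b1 / (2*b2), which is a maximum when b2 < 0 and a minimum when b2 > 0. The coefficient values and the age range below are hypothetical placeholders, not estimates from agility.dat, and the function name is an assumption made for this illustration.

```python
def quadratic_turning_point(b1, b2):
    """Turning point of y = b0 + b1*x + b2*x**2 and whether it is a max or a min."""
    if b2 == 0:
        raise ValueError("b2 is zero: the model is linear in x and has no turning point")
    x_star = -b1 / (2.0 * b2)          # solves dy/dx = b1 + 2*b2*x = 0
    kind = "maximum" if b2 < 0 else "minimum"
    return x_star, kind


# Hypothetical coefficient estimates and sample age range (placeholders only).
b1_hat, b2_hat = 2.4, -0.03
age_min, age_max = 18, 75

x_star, kind = quadratic_turning_point(b1_hat, b2_hat)
in_range = age_min <= x_star <= age_max
print(f"{kind} at age {x_star:.1f}; inside the sample range: {in_range}")
```

Comparing the turning point with the smallest and largest ages in the sample shows whether the estimated peak or trough is supported by the data or is an extrapolation.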
e.) Now allow mental agility to vary non-linearly with age, but allow
separate agility-age profiles for males and females. [HINT: Is just a
female dummy variable sufficient? Is it sufficient to have a female
dummy variable and an interaction term between female and age? Or
do you also need an interaction term between female and
age^2?]
f.) Do the male and female agility-age profiles peak at the same age?
Formulate this as a rigorous hypothesis in terms of the parameters of
the model and test it statistically.
g.) At what age does male mental agility peak? At what age does female
mental agility peak?
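One possible way to set up separate quadratic profiles is sketched below, assuming a model that interacts the female dummy with both age and age^2: agility = b0 + b1*age + b2*age^2 + d0*female + d1*female*age + d2*female*age^2. Under that assumption the male profile turns at -b1 / (2*b2) and the female profile at -(b1 + d1) / (2*(b2 + d2)). The data are synthetic (true peaks at ages 40 and 45), not agility.dat, and the function names are hypothetical.

```python
import numpy as np


def fit_interacted_quadratic(agility, age, female):
    """OLS with a full set of female interactions; returns (b0, b1, b2, d0, d1, d2)."""
    X = np.column_stack([
        np.ones_like(age), age, age ** 2,          # male (baseline) profile
        female, female * age, female * age ** 2,   # female shifts in each term
    ])
    coef, *_ = np.linalg.lstsq(X, agility, rcond=None)
    return coef


def peak_ages(coef):
    """Turning points of the male and female quadratic profiles."""
    _, b1, b2, _, d1, d2 = coef
    male_peak = -b1 / (2.0 * b2)
    female_peak = -(b1 + d1) / (2.0 * (b2 + d2))
    return male_peak, female_peak


# Synthetic data only (an assumption for illustration, NOT agility.dat).
rng = np.random.default_rng(1)
n = 200
age = rng.uniform(20.0, 70.0, size=n)
female = (rng.uniform(size=n) < 0.5).astype(float)
agility = (50.0 + 4.0 * age - 0.05 * age ** 2
           + female * (-10.0 + 0.5 * age)
           + rng.normal(0.0, 1.0, size=n))

coef = fit_interacted_quadratic(agility, age, female)
male_peak, female_peak = peak_ages(coef)
print(f"male peak: {male_peak:.1f}, female peak: {female_peak:.1f}")
```

In this parameterization, equal peak ages correspond to the restriction b1*d2 = b2*d1 (provided b2 and b2 + d2 are nonzero), which is nonlinear in the parameters, so an ordinary t-test on a single coefficient does not test it directly.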
Update date: 4:30 PM 11/9/98; Prepared by: Trudy Ann Cameron