UNIVERSITY OF CALIFORNIA, LOS ANGELES
Department of Economics

Economics 143 (Cameron) - Applied Regression Analysis

Computing Lab Session #6: Functional Form; Dummy Variables


Tasks:

1. The file gourmet.dat contains 100 hypothetical observations on household annual consumption of gourmet coffee beans (in ounces) and taxable annual household income from all sources (in thousands of dollars). The variables are beans and income, respectively.

a.) Plot beans as a function of income. What do you observe?

b.) Regress beans on income. If this relationship is not contaminated by any form of omitted variables bias, what does this model imply about the income elasticity of demand for gourmet coffee beans? Be specific.

c.) Consider models that involve log transformations. Generate the logarithm of each variable and explore plots of log(beans) against income, beans against log(income), and log(beans) against log(income). Which plot seems to conform best to the maintained hypotheses for ordinary least squares regression? Why?

d.) Run the regression for each of these models involving logs. Which model produces the best "fit"? Remember to use the correct option on the OLS command (LOGLOG, LOGLIN, or LINLOG, as necessary). In the log-log model, what demand elasticity is implied by the estimated equation? How does this compare to the "elasticity" from the model in the levels of each variable?
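As a rough illustration of parts b.) and d.), the sketch below shows how the implied elasticity differs across the four functional forms. It is written in Python with NumPy rather than the lab's own software, and it uses synthetic data, not gourmet.dat. The helper names (ols, elasticities) and the data-generating values are assumptions made up for this example. For the forms where the elasticity is not constant, it is evaluated at the sample means.

```python
import numpy as np

def ols(x, y):
    """Simple least-squares regression of y on x; returns (intercept, slope)."""
    X = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[0], coef[1]

def elasticities(x, y):
    """Elasticity of y with respect to x implied by each functional form.

    Only the log-log elasticity is constant; the others are evaluated
    at the sample means of x and y.
    """
    xbar, ybar = x.mean(), y.mean()
    _, b_lin = ols(x, y)                     # y = a + b*x
    _, b_loglin = ols(x, np.log(y))          # log(y) = a + b*x
    _, b_linlog = ols(np.log(x), y)          # y = a + b*log(x)
    _, b_loglog = ols(np.log(x), np.log(y))  # log(y) = a + b*log(x)
    return {
        "linear": b_lin * xbar / ybar,   # (dy/dx) * (x/y) with dy/dx = b
        "log-lin": b_loglin * xbar,      # dy/dx = b*y, so elasticity = b*x
        "lin-log": b_linlog / ybar,      # dy/dx = b/x, so elasticity = b/y
        "log-log": b_loglog,             # the slope is the elasticity
    }

# Synthetic stand-in data (hypothetical; NOT the contents of gourmet.dat).
rng = np.random.default_rng(0)
income = rng.uniform(20.0, 120.0, size=100)
beans = np.exp(0.5 + 0.8 * np.log(income) + rng.normal(0.0, 0.1, size=100))

for form, value in elasticities(income, beans).items():
    print(f"{form:8s} elasticity: {value:.3f}")
```

Note that R-squared values are not directly comparable between models whose dependent variables differ (beans versus log(beans)), which is one reason to declare the functional form to the regression command rather than simply feeding it transformed variables.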

2. The file qcontrol.dat contains 100 hypothetical observations on numbers of data-entry errors per hour for employees of a data-processing firm (err) as a function of the number of days of experience the employee has with his or her job (exp).

a.) Repeat the tasks from the previous problem, with err in place of beans and exp in place of income, and determine which specification provides the best "fit."

3. The file agility.dat contains 17 hypothetical observations on results from a test of mental agility. The variables are (in order) the agility measure (agility), the age of the subject (age), and a gender dummy variable (female, =1 if female, =0 otherwise).

a.) Try a simple regression of agility on age. What are the implications of this model?

b.) Control for gender before determining the linear effect of age on agility. Does gender have an effect on mental agility in these data? (Provide a formal hypothesis concerning the true underlying parameters of the model that corresponds to this "verbal" question. Test this hypothesis using the estimates from this sample.)

c.) Forget about gender for the moment and explore whether agility is linear in age, or whether the effect of an additional year of age on mental agility varies with age. [HINT: A model that is quadratic in age would be a sensible place to start.] Convert this verbal issue to a formal hypothesis about the true underlying parameters of the model and test it using the data.

d.) At what age is mental agility maximized or minimized (state which)? Is this within the range of the data?

e.) Now allow mental agility to vary non-linearly with age, but allow separate agility-age profiles for males and females. [HINT: Is just a female dummy variable sufficient? Is it sufficient to have a female dummy variable and an interaction term between female and age? Or do you also need an interaction term between female and age^2?]

f.) Do the male and female agility-age profiles peak at the same age? Formulate this as a rigorous hypothesis in terms of the parameters of the model and test it statistically.

g.) At what age does male mental agility peak? Female mental agility?
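The mechanics behind parts d.), e.) and g.) can be sketched as follows. This is only an illustration in Python with NumPy, run on synthetic data rather than agility.dat. The function names (fit, turning_point) and the numbers used to generate the data are hypothetical. In a model that is quadratic in age, the fitted profile has zero slope where b_age + 2*b_age2*age = 0. That point is a maximum if the coefficient on age squared is negative and a minimum if it is positive. With a full set of female interactions, the female profile uses the sums of the base and interaction coefficients.

```python
import numpy as np

def fit(X, y):
    """Least-squares coefficients for the regression of y on the columns of X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def turning_point(b_age, b_age2):
    """Age at which b_age*age + b_age2*age**2 has zero slope."""
    return -b_age / (2.0 * b_age2)

# Synthetic stand-in data (hypothetical; NOT the contents of agility.dat).
# By construction the male profile peaks at age 40 and the female one at 45.
rng = np.random.default_rng(1)
n = 200
age = rng.uniform(20.0, 70.0, size=n)
female = (rng.random(n) < 0.5).astype(float)
agility = np.where(
    female == 1,
    60.0 - 0.02 * (age - 45.0) ** 2,
    55.0 - 0.03 * (age - 40.0) ** 2,
) + rng.normal(0.0, 0.5, size=n)

# Columns: constant, age, age^2, female, female*age, female*age^2
X = np.column_stack(
    [np.ones_like(age), age, age**2, female, female * age, female * age**2]
)
b = fit(X, agility)

male_peak = turning_point(b[1], b[2])                  # base coefficients
female_peak = turning_point(b[1] + b[4], b[2] + b[5])  # base + interactions
print(f"male turning point:   {male_peak:.1f}")
print(f"female turning point: {female_peak:.1f}")
print("male profile has a", "maximum" if b[2] < 0 else "minimum")
```

Dropping the two interaction columns forces both profiles to share one turning point, which is why the choice of interaction terms in part e.) matters for parts f.) and g.).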
Update date: 4:30 PM 11/9/98; Prepared by: Trudy Ann Cameron