Before you prepare your final proposal, you are strongly encouraged to undertake the following. Even if you choose not to hand in this "pre-proposal," it is a valuable exercise in that it gets you thinking clearly about the likely nature of your proposal. In very brief point form, provide the following information (I'll try to give some timely feedback).
1. Identify the population that your proposed sample is intended to represent. For example:
a. Economics majors at UCLA (observation = a student) cross-sectional data
b. Firms in the S&P 500 (observation = a firm) cross-sectional data
c. Canada in specified years (observation = a year) annual time-series data
d. Games played by professional basketball teams (observation = a team in a particular game) pooled cross-sectional data over time
e. Counties in California (observation = county) cross-sectional data
2. Identify the dependent variable and be sure it is likely to display some variation over your proposed sample (so that there will be something to explain using a regression model). Make sure it can be measured at the level of the individual observation.
a. Example: avoid county-level average data for Y (e.g. proportion of homeowners) and individual-level data for the X variables (e.g. age of household head, ethnic identity of household head, household income). The Y variable cannot be more highly aggregated (over observations or time) than the X variables.
b. Example: if data for Y are at the level of the individual household, but for some X variables, household data are not available, it is sometimes possible to use, say, zip code median values for those variables (with the clear caveat that these variables only capture "neighborhood" characteristics, which may or may not be highly correlated with the desired but missing X variable for the household).
c. Be aware that it IS possible to use categories of outcomes as a dependent variable. Models of this kind are called "discrete choice models." If your proposal concerns a YES/NO dependent variable, proceed for now as though it were a conventional continuous dependent variable. We will cover procedures for these models (better than OLS) in the last couple of lectures.
3. Identify some plausible explanatory (X) variables. Some of these will be key to the main hypotheses you are hoping to test; others will be incidental determinants of the dependent (Y) variable, included in order to avoid omitted variables bias in the coefficients on the key variables.
4. Identify at least one interesting hypothesis that can be tested using your proposed model. Remember that for reliable hypothesis testing, you need precise (small standard error) and unbiased estimates of the relevant slope coefficients. It is not possible to get a fix on the slope of a regression function with respect to an explanatory variable if that variable does not display sufficient variation across observations. For example, if you are trying to estimate the effect of business taxes on where firms locate within a city, you will not get far if business taxes are the same everywhere in that city.
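
To see why variation in an explanatory variable matters for precision, here is a minimal sketch in Python (not the course software) using made-up data. Everything in it is an assumption for illustration: the helper name slope_and_se, the simulated "tax rate" variable, and the true slope of -0.5. It fits a simple regression twice, once where X varies widely and once where X barely varies, and compares the standard errors of the slope.

```python
import math
import random


def slope_and_se(x, y):
    """OLS slope and its standard error for the model y = a + b*x + e."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Total variation in x around its mean
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = y_bar - b * x_bar
    # Residual variance estimate with n - 2 degrees of freedom
    ssr = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(ssr / (n - 2) / sxx)
    return b, se


random.seed(0)
n = 200
base = [random.gauss(0.0, 1.0) for _ in range(n)]
noise = [random.gauss(0.0, 1.0) for _ in range(n)]

# Hypothetical tax rates: same mean, very different spread across observations
x_wide = [5.0 + 2.0 * z for z in base]
x_narrow = [5.0 + 0.02 * z for z in base]

# Same true relationship (slope -0.5) and same noise in both samples
y_wide = [10.0 - 0.5 * xi + e for xi, e in zip(x_wide, noise)]
y_narrow = [10.0 - 0.5 * xi + e for xi, e in zip(x_narrow, noise)]

b_wide, se_wide = slope_and_se(x_wide, y_wide)
b_narrow, se_narrow = slope_and_se(x_narrow, y_narrow)

print("wide variation in X:   slope = %.3f, s.e. = %.3f" % (b_wide, se_wide))
print("narrow variation in X: slope = %.3f, s.e. = %.3f" % (b_narrow, se_narrow))
```

Because the spread of X in the second sample is 100 times smaller while the noise is unchanged, the standard error of the slope is about 100 times larger, and any hypothesis test on that slope becomes correspondingly uninformative. A quick look at the standard deviation of each proposed X variable before you commit to a data set serves the same purpose.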