UNIVERSITY OF CALIFORNIA, LOS ANGELES
Department of Economics

Economics 143 (Cameron) - Applied Regression Analysis

Classroom Handout #16: Proposal Planning


IMPORTANT: Some people will still be under the mistaken impression that they will actually be gathering data and doing an original regression analysis as a term paper for Economics 143. This is not the case. A "research proposal" is just that--a plan for what you might do in the way of gathering data and estimating a model. Your job is to be persuasive regarding the fundamental importance of the quantitative relationships you want to measure, and to convince the reader of the proposal that you are a competent regression analyst.

Before you prepare your final proposal, you are strongly encouraged to undertake the following. This is a valuable exercise in that it gets you thinking clearly about the likely nature of your proposal. In very brief point form, be sure you can provide the following information.

1. Identify the population that your proposed sample is intended to represent, and be sure you know what constitutes an "observation". For example:
    a. Economics majors at UCLA (observation = a student) cross-sectional data
    b. Firms in the S&P 500 (observation = a firm) cross-sectional data
    c. Canada in specified years (observation = a year) annual time-series data
    d. Games played by professional basketball teams (observation = a team in a particular game) pooled cross-sectional data over time
    e. Counties in California (observation = a county) cross-sectional data

2. Identify the dependent variable and be sure it is likely to display some variation over your proposed sample (so that there will be something to explain using a regression model). Make sure it can be measured at the level of the individual observation.
    a. Example: avoid county-level average data for Y (e.g. proportion of homeowners) and individual-level data for the X variables (e.g. age of household head, ethnic identity of household head, household income). The Y variable cannot be more highly aggregated (over observations or time) than the X variables.
    b. Example: if data for Y are at the level of the individual household, but for some X variables, household data are not available, it is sometimes possible to use, say, zip code median values for those variables (with the clear caveat that these variables only capture "neighborhood" characteristics, which may or may not be highly correlated with the desired but missing X variable for the household).
    c. Be aware that it IS possible to use categories of outcomes as a dependent variable. These are called "discrete choice models." If your proposal concerns a YES/NO dependent variable, proceed for now as though this were a conventional continuous dependent variable. We will cover procedures for these models (better than OLS) in the last couple of lectures.
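
As an illustration of point 2b, the sketch below attaches zip code median values to household-level observations. It assumes Python with the pandas library; every column name and number is invented for the example and does not come from any real data set.

```python
import pandas as pd

# Hypothetical household-level data: Y (owns_home) and one X (age_of_head)
# are measured for each household. All values are made up for illustration.
households = pd.DataFrame({
    "household_id": [1, 2, 3, 4],
    "zip_code": ["90024", "90024", "90095", "90210"],
    "owns_home": [1, 0, 0, 1],
    "age_of_head": [52, 29, 34, 61],
})

# Hypothetical zip-code-level medians standing in for household income,
# which is assumed to be unavailable at the household level.
zip_medians = pd.DataFrame({
    "zip_code": ["90024", "90095", "90210"],
    "zip_median_income": [70000, 45000, 150000],
})

# Left merge keeps one row per household; validate checks that each
# zip code appears only once in the zip-level table.
merged = households.merge(
    zip_medians, on="zip_code", how="left", validate="many_to_one"
)
print(merged)
```

The unit of observation is still the household. Households in the same zip code share the same value of zip_median_income, so that variable measures a "neighborhood" characteristic rather than the household's own income.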

3. Identify some plausible explanatory (X) variables. Some of these will be key to the main hypotheses you are hoping to test; others will be incidental determinants of the dependent (Y) variable, included in order to avoid omitted variables bias in the coefficients on the key variables.
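
The reason for including incidental determinants can be seen in the standard two-variable case, written out here as a worked equation. The symbols are introduced only for this illustration: X_1 is a key variable, X_2 is an incidental determinant of Y that is left out, and delta-hat_1 is the sample slope from a regression of X_2 on X_1.

```latex
% True model, with both explanatory variables:
Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + u

% If X_2 is omitted and Y is regressed on X_1 alone,
% the slope estimate \tilde{\beta}_1 satisfies:
E[\tilde{\beta}_1 \mid X_1, X_2] = \beta_1 + \beta_2 \hat{\delta}_1
```

The bias term is zero only if X_2 has no effect on Y (beta_2 = 0) or X_2 is uncorrelated with X_1 in the sample (delta-hat_1 = 0).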

4. Identify at least one interesting hypothesis that can be tested using your proposed model. Remember that reliable hypothesis testing requires precise (small standard error) and unbiased estimates of the relevant slope coefficients. It is not possible to get a fix on the slope of a regression function with respect to an explanatory variable if that variable does not display sufficient variation across observations. For example, if you are trying to explain the effect of business taxes on firm location within a city, you will not get far if business taxes are the same everywhere in that city.
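
The link between variation in X and the precision of the slope estimate can be checked with a small simulation. The sketch below assumes Python with NumPy; the function name slope_standard_error, the tax rates, the coefficients, and the sample size are all hypothetical choices made for this illustration. It uses the usual simple-regression formula, in which the slope's standard error is the square root of s^2 divided by the sum of squared deviations of X from its mean.

```python
import numpy as np

def slope_standard_error(x, y):
    """Return the OLS slope and its standard error for y regressed on x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if np.ptp(x) == 0:
        # Every observation has the same x, so no slope can be estimated.
        raise ValueError("x has no variation; the slope cannot be estimated")
    n = x.size
    sxx = np.sum((x - x.mean()) ** 2)          # total variation in x
    slope = np.sum((x - x.mean()) * (y - y.mean())) / sxx
    intercept = y.mean() - slope * x.mean()
    residuals = y - intercept - slope * x
    s2 = np.sum(residuals ** 2) / (n - 2)      # estimated error variance
    return slope, np.sqrt(s2 / sxx)

# Invented data: the same true slope (-0.5) and the same errors,
# but very different amounts of variation in the tax rate.
rng = np.random.default_rng(0)
n = 200
base = rng.normal(0.0, 1.0, n)
noise = rng.normal(0.0, 1.0, n)

tax_wide = 10 + 2.0 * base      # tax rates differ a lot across observations
tax_narrow = 10 + 0.02 * base   # tax rates are nearly the same everywhere

_, se_wide = slope_standard_error(tax_wide, 5 - 0.5 * tax_wide + noise)
_, se_narrow = slope_standard_error(tax_narrow, 5 - 0.5 * tax_narrow + noise)

print("standard error with wide variation:  ", se_wide)
print("standard error with narrow variation:", se_narrow)
```

Because the only difference between the two samples is the spread of the tax variable, the larger standard error in the second case is due entirely to the lack of variation in X. If the tax rate were literally identical for every observation, the function would raise an error, since no slope can be computed at all.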


Updated: 7:29 AM 11/3/98; Prepared by: Trudy Ann Cameron