Assignment 9. Additional Topics: Robustness and Bootstrapping.
Inference Comparing Two Populations.


1. Bootstrap. Draw a 2% sample from the GSS to serve as the
original sample for TVHOURS.
(Review Assignment 6, Problem 6, where you did a 10% sample.
Follow the instructions there for the setup and the first sample,
except substituting .02 for .1 at the appropriate place.)

This should yield about 40 cases with valid TVHOURS scores.  

Bootstrap that original sample 200 times, for _result(3), the
sample mean, and for _result(10), the sample median, saving those
200 values of bootstrap sample means and medians in a file named
"tvmnmdbs".

        bs "summarize tvhours, detail" "_result(3) _result(10)", rep(200) saving(tvmnmdbs)

Find and compare two alternative 95% confidence intervals for the
population mean: (a) the Percentile confidence interval from
bootstrapping, (b) the usual confidence interval based on the
original sample and the t distribution. Compare both confidence
intervals with the "true" population mean, from the entire GSS94
dataset, here treated as a known population. 

(The Percentile Confidence Interval is one of three alternatives
displayed in the Stata printout; for present purposes, ignore the
other two, both of which in one way or another drag the Normal
distribution back in.)
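
For readers who want to see the mechanics behind the two intervals,
here is a minimal sketch in Python rather than Stata. It is not part
of the assignment and not a substitute for the bs output. The 40 data
values are invented (they are not GSS data), the seed is arbitrary,
and the function names are hypothetical. Stata may interpolate
percentiles slightly differently, so do not expect identical digits.

```python
import random
import statistics

# Invented right-skewed "hours of TV per day" values for 40 respondents.
# These are NOT GSS data; they only stand in for the original sample.
sample = [0, 0, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3,
          4, 4, 4, 4, 4, 5, 5, 5, 6, 6, 6, 7, 8, 8, 10, 10, 12, 14, 16, 20]

def bootstrap(data, stat, reps=200, seed=12345):
    """Resample data with replacement reps times; return stat of each resample."""
    rng = random.Random(seed)  # fixed (arbitrary) seed so reruns match
    return [stat(rng.choices(data, k=len(data))) for _ in range(reps)]

def percentile_ci(boot_stats):
    """95% percentile interval: 2.5th and 97.5th percentiles of the bootstrap values."""
    cuts = statistics.quantiles(boot_stats, n=40)  # cut points every 2.5%
    return cuts[0], cuts[-1]

def t_ci(data, t_crit):
    """Usual interval: mean +/- t_crit * s / sqrt(n)."""
    n = len(data)
    mean = statistics.mean(data)
    se = statistics.stdev(data) / n ** 0.5
    return mean - t_crit * se, mean + t_crit * se

boot_means = bootstrap(sample, statistics.mean)
pct_lo, pct_hi = percentile_ci(boot_means)

# 2.0227 is the 0.975 quantile of t with 39 degrees of freedom (from a t table).
t_lo, t_hi = t_ci(sample, t_crit=2.0227)

print(f"percentile 95% CI: ({pct_lo:.2f}, {pct_hi:.2f})")
print(f"t-based    95% CI: ({t_lo:.2f}, {t_hi:.2f})")
```

The t interval is symmetric about the sample mean by construction;
the percentile interval need not be, since it follows the shape of
the bootstrap distribution.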

Clear Stata, and load the tvmnmdbs.dta file, in which the 200
cases are the 200 bootstrap samples and the 2 variables are the
sample mean and the sample median. Do detailed summaries and
graphs of both variables, and write a paragraph describing those
results.


2. After reading Moore and McCabe Section 8.2, "Comparing Two
Proportions", do the following exercises beginning on page 609:
        8.27, 8.29, 8.31, 8.40, 8.41

3.  After reading Moore and McCabe Section 7.2, "Comparing Two
Means", do the following exercises beginning on page 556:
        7.51, 7.52, 7.57

4.  Graph the distribution of EDUC in the GSS, separately for
males and for females.  Comment on whether there are any
differences sufficiently large that you consider them potentially
important for subsequent careers, and the nature of any such
differences (e.g., one sex more skewed than the other).

5.  Tabulate EDUC by SEX and perform a chi-square test of the null
hypothesis that those two variables are statistically
independent.  Give an alternative, equivalent statement of the
null hypothesis in terms of two different distributions. 
Interpret the results.
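
As a check on what the chi-square statistic measures, the following
is a minimal sketch in Python (not Stata) of the calculation from a
table of counts. The counts are invented, the three row categories
are a hypothetical grouping of EDUC, and the function name is an
assumption made for illustration. The sketch compares the statistic
with a tabled critical value instead of computing a p-value.

```python
def chi_square_independence(table):
    """Return (statistic, degrees of freedom) for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df

# Invented counts: rows are three hypothetical education bands,
# columns are male and female. These are NOT GSS data.
observed = [[30, 35],
            [50, 60],
            [20, 25]]

stat, df = chi_square_independence(observed)

# 5.991 is the 0.95 quantile of chi-square with 2 degrees of freedom (from a table).
print(f"chi-square = {stat:.3f} on {df} df; exceeds 5.991: {stat > 5.991}")
```

With the real EDUC by SEX table there are many more rows, so the
degrees of freedom and the critical value will differ from this toy
case.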