Assignment 9. Additional Topics: Robustness and Bootstrapping. Inference Comparing Two Populations.

1. Bootstrap. Draw a 2% original sample from the GSS for TVHOURS. (Review Assignment 6, Problem 6, where you drew a 10% sample. Follow the instructions there for the setup and the first sample, except substituting .02 for .1 at the appropriate place.) This should yield about 40 cases with valid TVHOURS scores.

Bootstrap that original sample 200 times, for _result(3), the sample mean, and for _result(10), the sample median, saving those 200 values of bootstrap sample means and medians in a file named "tvmnmdbs":

    bs "summarize tvhours , detail" "_result(3) _result(10)" , rep(200) saving(tvmnmdbs)

Find and compare two alternative 95% confidence intervals for the population mean:
(a) the Percentile confidence interval from bootstrapping;
(b) the usual confidence interval based on the original sample and the t distribution.

Compare both confidence intervals with the "true" population mean, from the entire GSS94 dataset, here treated as a known population. (The Percentile Confidence Interval is one of three alternatives displayed in the Stata printout; for present purposes, ignore the other two, both of which in one way or another drag the Normal distribution back in.)

Clear Stata, and load the tvmnmdbs.dta file, in which the 200 cases are the 200 bootstrap samples and the 2 variables are the sample mean and the sample median. Do detailed summaries and graphs, and write a paragraph describing those results.

2. After reading Moore and McCabe Section 8.2, "Comparing Two Proportions", do the following exercises beginning on page 609: 8.27, 8.29, 8.31, 8.40, 8.41

3. After reading Moore and McCabe Section 7.2, "Comparing Two Means", do the following exercises beginning on page 556: 7.51, 7.52, 7.57

4. Graph the distribution of EDUC in the GSS, separately for males and for females. Comment on whether there are any differences sufficiently large that you consider them potentially important for subsequent careers, and the nature of any such differences (e.g., one sex more skewed than the other).

5. Tabulate EDUC by SEX and perform a chi-square test of the null hypothesis that those two variables are statistically independent. Give an alternative, equivalent statement of the null hypothesis in terms of two different distributions. Interpret the results.
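
Note on Problem 1. The two intervals compared there can be sketched outside Stata to make the mechanics concrete. The Python below is only an illustration under stated assumptions: the 40 "TV hours" values are synthetic (not GSS data), the function names bootstrap and percentile_ci are hypothetical, the percentile rule is one simple convention and may differ slightly from the one Stata uses, and the t critical value 2.0227 is hardcoded for 39 degrees of freedom (n = 40).

```python
import math
import random
import statistics

def bootstrap(sample, stat, reps=200, rng=None):
    """Resample with replacement, same size as the sample, `reps` times."""
    rng = rng or random.Random()
    n = len(sample)
    return [stat(rng.choices(sample, k=n)) for _ in range(reps)]

def percentile_ci(values, level=0.95):
    """Percentile interval: cut an equal share of values from each tail."""
    ordered = sorted(values)
    n = len(ordered)
    k = round((1 - level) / 2 * n)  # number of values trimmed from each tail
    return ordered[k], ordered[n - 1 - k]

# Synthetic, right-skewed stand-in for about 40 valid TVHOURS scores.
rng = random.Random(0)
sample = [min(24, int(rng.expovariate(1 / 3))) for _ in range(40)]

# (a) Percentile interval from 200 bootstrap sample means.
boot_means = bootstrap(sample, statistics.mean, reps=200, rng=rng)
boot_medians = bootstrap(sample, statistics.median, reps=200, rng=rng)
pct_lo, pct_hi = percentile_ci(boot_means)

# (b) Usual t interval from the original sample.
T_CRIT_39DF = 2.0227  # assumed: 0.975 quantile of t with 39 df
mean = statistics.mean(sample)
half_width = T_CRIT_39DF * statistics.stdev(sample) / math.sqrt(len(sample))
t_lo, t_hi = mean - half_width, mean + half_width

print("percentile CI for the mean:", (pct_lo, pct_hi))
print("t-based CI for the mean:   ", (t_lo, t_hi))
```

The t interval is symmetric about the sample mean by construction, while the percentile interval need not be; with skewed data such as TV hours, that asymmetry is one thing to look for when comparing (a) and (b).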