27sep00

Outline:

Topic 1: Infrastructure for Intermediate Statistical Methods

Professor: David D. McFarland


Web Pages for Fall 2000


Prolog: Computerized Statistical Description of Computerized Data

In the first part of the quarter we will retrace (but not just as review) some of the material covered in introductory statistics under the rubric of "univariate descriptive statistics". Here the emphasis will be on getting the computer to do the calculations, using large datasets of real sociological data, and paying close attention to just what all this has to do with substantive sociological research.

The descriptive statistics done in intro stat courses may have had various limitations (e.g., artificial datasets made up just to serve as classroom exercises, or examples from fields remote from sociology). But those exercises did serve to teach about such things as describing frequency distributions with tables or histograms; summarizing them with means and standard deviations, or medians and other percentiles; and other things that go beyond univariate descriptive statistics.
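
As a reminder of what those summaries look like in practice, here is a minimal sketch. It is written in Python rather than stata only so that it is self-contained. The ten values are made up for illustration (they are not GSS data), and the variable names are hypothetical.

```python
import statistics
from collections import Counter

# Hypothetical values: years of schooling for ten made-up respondents (not GSS data).
years_of_schooling = [12, 12, 16, 14, 12, 18, 10, 16, 12, 14]

# Frequency distribution as a table: value -> count, ordered by value.
frequency_table = dict(sorted(Counter(years_of_schooling).items()))

# Summaries of location and spread.
mean = statistics.mean(years_of_schooling)
std_dev = statistics.stdev(years_of_schooling)  # sample SD (n - 1 in the denominator)
median = statistics.median(years_of_schooling)

# Cut points dividing the data into four equal-probability groups
# (the 25th, 50th, and 75th percentiles, by the module's default method).
quartiles = statistics.quantiles(years_of_schooling, n=4)

# A crude text histogram: one asterisk per respondent.
for value, count in frequency_table.items():
    print(f"{value:>3} {'*' * count} ({count})")

print(f"mean={mean:.2f}  sd={std_dev:.2f}  median={median}  quartiles={quartiles}")
```

Different packages use slightly different interpolation rules for percentiles, so the quartiles reported by one program need not match another's exactly on small samples.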

In the early part of this quarter's course, we will build on that knowledge, accessing and analyzing some real survey data from several thousand respondents in a nationally representative sample of households called the "General Social Survey" (GSS). Later in the course we will consider some alternative study designs, such as experiments.

Calculations will be done in a statistical software package called "stata". The computer will relieve us of some arithmetic drudgery, but not of the need for careful thought. For instance, we will need to assess (taking into consideration measurement properties of the variables used and overall design of the particular study) which of the things statistical software such as stata routinely calculates may be meaningless or inapplicable, which require cautious interpretation, and which are meaningful and important in a straightforward manner. Most important, we need to keep in mind the kinds of sociological problems to which such calculations may be relevant.
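
To illustrate the point about measurement properties, here is a small sketch, again in Python with made-up data. The region codes and labels are a hypothetical coding scheme invented for this example, not taken from the GSS codebook. Software will compute a mean for any numerically coded variable, whether or not that mean means anything.

```python
import statistics

# Hypothetical coding scheme for a nominal variable (not the GSS codebook).
REGION_LABELS = {1: "Northeast", 2: "Midwest", 3: "South", 4: "West"}

# Made-up responses from eight respondents, stored as numeric codes.
region_codes = [3, 1, 4, 3, 2, 3, 4, 1]

# Computable, but meaningless: the codes are labels, not quantities.
mean_of_codes = statistics.mean(region_codes)

# Meaningful for a nominal variable: the most common category.
modal_region = REGION_LABELS[statistics.mode(region_codes)]

# An equally valid (hypothetical) recoding that reverses the code numbers.
RECODE = {1: 4, 2: 3, 3: 2, 4: 1}
RECODED_LABELS = {RECODE[code]: label for code, label in REGION_LABELS.items()}
recoded = [RECODE[code] for code in region_codes]

# The "mean" moves when the arbitrary numbers change; the modal category does not.
mean_after_recode = statistics.mean(recoded)
modal_region_after_recode = RECODED_LABELS[statistics.mode(recoded)]

print(mean_of_codes, mean_after_recode)
print(modal_region, modal_region_after_recode)
```

A statistic that changes when the category numbers are arbitrarily reassigned is describing the coding scheme, not the respondents. The modal category survives the recoding, which is one sign that it is appropriate for this kind of variable.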

We will be using computers connected to the internet to access the GSS data and, while we are at it, explore some other datasets, including textual ones related to the conduct of statistical analyses.

Also, looking beyond this course, we should give some attention to getting the results of statistical analyses (tables and graphs, and verbal presentations thereof) into documents of the sorts wanted by thesis advisers or journal editors.


Topic 1: Overview; Infrastructure, Local and Otherwise

Our infrastructure includes networked PCs, and such things as survey datasets, data documentation, statistical software, and journal articles. Most of the items discussed here are used by professional sociologists, not just students in courses such as this one.

  • Computer Laboratories
      • Social Sciences Computing (SSC), 2035 Public Policy. http://computing.sscnet.ucla.edu. Login using SSC account procedures. PCs here have stata software and Zip drives. Limited printing (syllabi) at no extra charge.

      • CLICC labs in Powell Library, www.clicc.ucla.edu. PCs here have stata software and Zip drives. Login using BOLid. Printing billed to student's UCLA account.

      • Various departmental computer labs. Check with your local tech support person regarding such matters as lab access rules, logon procedures, stata software, Adobe Acrobat Reader software, Zip drives, printing.

  • Professional Association
      • American Sociological Association, www.asanet.org. The ASA "Instructions to Authors" stylesheet is available for download as an Adobe PDF file (and also as a Word 7.0 document).

  • Libraries. In this course we will be referring not only to the books on statistics and stata software, but also to examples of social research. The ones I have selected are typically available online in JSTOR or other CDL sources, but some items cited are not available electronically.

      • Orion2, http://orion2.library.ucla.edu, is UCLA's new web-based online library catalog.

      • melvyl is the combined library catalog for the entire University of California. Access it through the UCLA library's main website, www.library.ucla.edu, or at melvyl.ucop.edu via telnet. It lists each book's location, but not its checkout status.

      • California Digital Library, www.cdlib.org, or access it through the UCLA library's main website, www.library.ucla.edu. It offers articles from some relevant journals that are not in JSTOR (see below), via sciencedirect, idealibrary, and others. Beware, however, of items that are unusably incomplete (text only, figures missing, tables missing).

      • JSTOR, www.jstor.org, or access it through the UCLA library's main website, www.library.ucla.edu. This online journal storage facility provides the complete text, tables, and figures of articles, but it does not cover all journals, and articles appear only several years after publication in paper form.

      • And amid all the "virtual libraries", don't forget the physical libraries, especially the Young Research Library.

  • Data Archives
      • UCLA Social Science Data Archives, www.sscnet.ucla.edu/issr/da/. Located in the UCLA Institute for Social Science Research, this is the main UCLA repository of machine-readable social science data.
      • California Census Research Data Center, www.ccrdc.ucla.edu, is the local resource for UCLA researchers doing research with US Census Bureau data.
      • The Inter-university Consortium for Political and Social Research (ICPSR), www.icpsr.umich.edu. The Consortium, of which UCLA is a member, is located physically in Ann Arbor, but "virtually" everywhere. Online ICPSR holdings include, among many others, the General Social Survey data and the GSS online codebook.
      • An overview of the GSS is available from NORC, where the GSS originates: www.norc.uchicago.edu/gss/.