THE UNIVERSITY OF CALIFORNIA, LOS ANGELES
Department of Economics
Economics 143 - Applied Regression Analysis
January 13, 1998
Cameron
Problem Set # 1: Univariate Statistics (Review)
 
Due: Beginning of lecture, Tuesday, January 20, 1998 [For review. Will not be graded in detail and will not count quantitatively towards the course grade, but must be submitted for inspection.]
 
INSTRUCTIONS: To get you up to speed, this problem set highlights some of the concepts you should have grasped in your prerequisite coursework. The first three or four lectures will be devoted to a thorough, although quick, review of this material. This problem set covers material in Gujarati, Chapter 2.

1. We will be using summation notation in this course. What do the following stand for? (Simplify to the extent possible.)

a.) 

b.) 

c.) 

d.) 

e.) 

f.) 
 

2. Correlation is a measure of the degree of linear relatedness of two variables. If Y and X are uncorrelated, then they are statistically independent (i.e., a scatterplot of their values will be an amorphous blob). True, False, Uncertain? Explain.
 

3. For each of the following, is this a complete and valid probability distribution (in the case of a discrete random variable) or a complete and valid probability density function (in the case of a continuous random variable)? Why or why not?

a.)  f(Y) = P(Y = yi) = .50

b.)  f(X) = .2 when x = 0, 1, 2;  f(X) = .1 when x = 3, 4, 5, 10;  f(X) = 0 otherwise.

c.) The random variable X can take on four different values: -1, 0 , 1 , 3, with corresponding probabilities f(X) = -.2, .5, .9, -.2.

d.) f(X) =  x-1,   1 < x < 3;   0 otherwise

e.) f(X,Z) = 1/3,   0 < x < 1;   2 < z < 5;   0 otherwise

f.) f(X,Z) = 1/9,   x = 0, 1, 2;   z = 2, 3, 4;   0 otherwise
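
The two validity conditions behind this question (no negative probabilities or densities, and total probability equal to one) can be checked mechanically. The following is a minimal sketch, not a definitive implementation: the helper names `is_valid_pmf` and `is_valid_pdf` and all of the example distributions are hypothetical and are not the cases (a.) through (f.) above. The density check uses a simple midpoint rule on a bounded interval, so it inspects the function only at grid points.

```python
import math

def is_valid_pmf(probs, tol=1e-9):
    """Discrete case: every probability lies in [0, 1] and they sum to one."""
    values = list(probs.values())
    all_in_range = all(0.0 <= p <= 1.0 for p in values)
    sums_to_one = math.isclose(sum(values), 1.0, abs_tol=tol)
    return all_in_range and sums_to_one

def is_valid_pdf(density, lower, upper, steps=100_000, tol=1e-6):
    """Continuous case: density is non-negative and integrates to one.

    Uses the midpoint rule on [lower, upper]; the density is assumed
    to be zero outside that interval.
    """
    width = (upper - lower) / steps
    heights = [density(lower + (i + 0.5) * width) for i in range(steps)]
    if min(heights) < 0:
        return False
    return math.isclose(sum(heights) * width, 1.0, abs_tol=tol)

# Hypothetical examples (illustration only).
fair_die = {face: 1 / 6 for face in range(1, 7)}
leaky = {0: 0.5, 1: 0.3}        # sums to 0.8, so some probability is missing
negative = {0: 1.2, 1: -0.2}    # sums to one but contains a negative entry

print(is_valid_pmf(fair_die), is_valid_pmf(leaky), is_valid_pmf(negative))
print(is_valid_pdf(lambda x: 2 * x, 0.0, 1.0))   # triangle density on (0, 1)
print(is_valid_pdf(lambda x: 1.0, 0.0, 2.0))     # height 1 over width 2
```

A check like this only confirms the arithmetic; the "why or why not" part of each answer still has to name which condition fails.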
 

4.  For the following joint (or bivariate) discrete distribution of the variables Y and X:

   Y=3   |    0.1      0.1      0.1
         |
   Y=2   |    0.1      0.2       0
         |
   Y=1   |    0.3      0.1       0
--------------------------------------
         |    X=0      X=1      X=2

Assuming that this joint discrete probability function is the true population distribution, f(X,Y), compute:

a.) the marginal distribution f(Y), its mean and its variance;

b.) the conditional distribution of Y and its mean, given that X=0; given that X=2. Does the conditional mean of Y appear to be related to the magnitude of X? How? [Recall: the relative frequencies in a conditional distribution must be scaled so that the probabilities sum to one.]

c.) given your answer in (b.), can the random variable Y be statistically independent of X? (I.e., is the test for independence, f(X,Y) = f(X)f(Y), violated for any of these specific (x,y) pairs?)

d.) compute the covariance between X and Y and then the correlation between these variables. Bear in mind that Cov(X,Y) equals E(XY)-E(X)E(Y) and Corr(X,Y) equals Cov(X,Y) divided by the product of the individual marginal standard deviations of the two variables.
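
The mechanics of parts (a.) through (d.) can be sketched on a smaller table. The joint distribution below is hypothetical and is deliberately not the table from this problem; the function names are likewise illustrative assumptions. The sketch computes marginals by summing over the other variable, rescales one column to get a conditional distribution, applies the test f(X,Y) = f(X)f(Y) cell by cell, and uses Cov(X,Y) = E(XY) - E(X)E(Y).

```python
import math
from collections import defaultdict

# Hypothetical joint distribution f(x, y), keyed by (x, y).
# Illustrative numbers only; NOT the table from Problem 4.
joint = {(0, 1): 0.4, (1, 1): 0.1,
         (0, 2): 0.2, (1, 2): 0.3}

def marginal(joint, axis):
    """Sum joint probabilities over the other variable (axis 0 = X, 1 = Y)."""
    out = defaultdict(float)
    for pair, p in joint.items():
        out[pair[axis]] += p
    return dict(out)

def mean(dist):
    return sum(v * p for v, p in dist.items())

def variance(dist):
    mu = mean(dist)
    return sum((v - mu) ** 2 * p for v, p in dist.items())

def conditional_y_given_x(joint, x0):
    """Rescale the column X = x0 so that its probabilities sum to one."""
    column = {y: p for (x, y), p in joint.items() if x == x0}
    total = sum(column.values())
    return {y: p / total for y, p in column.items()}

f_x, f_y = marginal(joint, 0), marginal(joint, 1)

# Cov(X,Y) = E(XY) - E(X)E(Y); Corr divides by both standard deviations.
e_xy = sum(x * y * p for (x, y), p in joint.items())
cov = e_xy - mean(f_x) * mean(f_y)
corr = cov / math.sqrt(variance(f_x) * variance(f_y))

# Independence requires f(x, y) = f(x) f(y) in every cell of the table.
independent = all(math.isclose(p, f_x[x] * f_y[y])
                  for (x, y), p in joint.items())

print(mean(conditional_y_given_x(joint, 0)), mean(conditional_y_given_x(joint, 1)))
print(cov, corr, independent)
```

Cells with zero probability must be listed explicitly in `joint` for the independence check to see them; the dictionary above has no such cells.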
 

5. Suppose you are told that E(X) = 4 and that Var(X) = 16. What are the expected values and variances of the following expressions? [Recall the formulas for the mean and variance of a linear function of a single random variable.]

a.) Y = 3X + 2

b.) Y = .6X - 3

c.) Y = X/5

d.) Y = aX + b, where a and b are scalar constants
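
One way to check an answer to this problem is to build the distribution of Y directly and compare its mean and variance with what the linear-function rules predict. The sketch below assumes a hypothetical two-point distribution for X and hypothetical constants a and b (none of them taken from parts a. through d.), and relies on the standard results E(aX + b) = aE(X) + b and Var(aX + b) = a^2 Var(X).

```python
import math

def mean(dist):
    return sum(v * p for v, p in dist.items())

def variance(dist):
    mu = mean(dist)
    return sum((v - mu) ** 2 * p for v, p in dist.items())

def linear_transform(dist, a, b):
    """Distribution of Y = a*X + b for a discrete X.

    Assumes a != 0 so that distinct x values map to distinct y values.
    """
    return {a * x + b: p for x, p in dist.items()}

# Hypothetical X: takes the values 0 and 2 with equal probability,
# so E(X) = 1 and Var(X) = 1.
x_dist = {0: 0.5, 2: 0.5}
a, b = 2, -1                      # hypothetical constants

y_dist = linear_transform(x_dist, a, b)

# Brute-force moments of Y versus the rule-based predictions.
print(mean(y_dist), a * mean(x_dist) + b)
print(variance(y_dist), a ** 2 * variance(x_dist))
```

Note that the additive constant b shifts the mean but drops out of the variance.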

 
6. What is the formula for the variance of the linear combination aX1 + bX2, where X1 and X2 are two random variables? Let X1 stand for the rate of return on one security, and let X2 stand for the rate of return on another security, and let E(X1) = E(X2). A simple "investment portfolio" would consist of a combination of these two securities. Suppose that Var(X1) is 16 and Var(X2) is 9 and that the rates of return have a correlation of 0.6. If you had $10,000 to invest, would you be better off investing all of it in security 1, all of it in security 2, or half in each? This is the essence of modern portfolio theory.
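
The portfolio comparison can be sketched numerically. The code below assumes the standard result Var(aX1 + bX2) = a^2 Var(X1) + b^2 Var(X2) + 2ab Cov(X1,X2), with Cov(X1,X2) recovered as the correlation times the product of the two standard deviations. The function name and the input values are hypothetical and are not the figures given in this problem; a and b are read as the shares of the budget placed in each security.

```python
import math

def linear_combination_variance(a, b, var1, var2, corr):
    """Variance of a*X1 + b*X2 from the two variances and the correlation."""
    cov = corr * math.sqrt(var1 * var2)   # Cov = Corr * sd1 * sd2
    return a ** 2 * var1 + b ** 2 * var2 + 2 * a * b * cov

# Hypothetical inputs (illustration only, not the values in Problem 6).
var1, var2, corr = 4.0, 1.0, 0.5

all_in_1 = linear_combination_variance(1.0, 0.0, var1, var2, corr)
all_in_2 = linear_combination_variance(0.0, 1.0, var1, var2, corr)
half_each = linear_combination_variance(0.5, 0.5, var1, var2, corr)

print(all_in_1, all_in_2, half_each)
```

Because the two expected returns are equal, the allocations differ only in risk, so the comparison reduces to which allocation has the smallest variance.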
 

7. One of my favorite ways to remember the difference between probability theory and statistical inference is to contrast the endeavors of the professor and the student around final exam time. The student (having sat through the course) knows the population of possible questions that could be asked on the final exam and, in the process of studying, tries to ascertain which ones are most likely to be asked on the final. On the other hand, the professor asks only a limited number of questions on the final exam, but must try to ascertain from this sample what proportion of the subject matter each student has actually mastered. Who is thinking about probability theory and who is conducting statistical inference? Explain.
 

8. Is there any difference between a mean, an average, an expected value, and a first moment around zero? Is there any difference between a standard deviation and a standard error? Is there any difference between a mean squared deviation (MSD), a variance, and a second moment around the mean? Is there any difference between variance in the population and variance in a sample? [If your prerequisite used different terminology, don't panic. We'll sort it out.] 



Updated: January 12, 1998
Prepared by: Trudy Ann Cameron