A. Expected Value
Intuitively, E[X] is the probability-weighted average of the values of the random variable:
the center of gravity of its distribution. It is also written μ_X and called the population mean.
Its properties (with a and b constants) are:
1. E[b] = b
2. E[X+Y] = E[X] + E[Y]
3. E[aX] = a E[X]
4. E[aX+b] = a E[X] + b
5. E[aX+bY] = a E[X] + b E[Y]
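The properties above can be checked numerically. The following is a minimal sketch, assuming a hypothetical pair of fair six-sided dice for X and Y and arbitrary constants a = 3, b = 5; the helper name expect is introduced here only for illustration. Exact fractions are used so the comparisons are not blurred by rounding.

```python
from fractions import Fraction
from itertools import product

# Hypothetical example: X and Y are two fair six-sided dice.
die = {v: Fraction(1, 6) for v in range(1, 7)}

# Joint distribution of (X, Y). The dice are rolled independently here only
# to make the joint probabilities easy to write; linearity does not need it.
joint = {(x, y): px * py
         for (x, px), (y, py) in product(die.items(), die.items())}

def expect(g):
    """E[g(X, Y)]: probability-weighted average of g over the joint distribution."""
    return sum(p * g(x, y) for (x, y), p in joint.items())

a, b = 3, 5  # arbitrary constants chosen for illustration

e_x = expect(lambda x, y: x)                  # E[X] = 7/2
e_y = expect(lambda x, y: y)                  # E[Y] = 7/2
e_const = expect(lambda x, y: b)              # property 1: E[b]
e_sum = expect(lambda x, y: x + y)            # property 2: E[X+Y]
e_scaled = expect(lambda x, y: a * x)         # property 3: E[aX]
e_affine = expect(lambda x, y: a * x + b)     # property 4: E[aX+b]
e_combo = expect(lambda x, y: a * x + b * y)  # property 5: E[aX+bY]

print(e_const == b)
print(e_sum == e_x + e_y)
print(e_scaled == a * e_x)
print(e_affine == a * e_x + b)
print(e_combo == a * e_x + b * e_y)
```

Each printed comparison corresponds to one numbered property, in order.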
Unfortunately, you must also remember the following:
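1. E[X/Y] is not, in general, equal to E[X]/E[Y]
2. E[XY] is not, in general, equal to E[X] E[Y]; equality holds when X and Y are independent (more generally, whenever Cov[X,Y] = 0)
3. E[g(X)] is not, in general, equal to g(E[X]); equality holds when g is linear, g(X) = aX + b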
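A small numeric illustration of these warnings, as a sketch only: X is assumed to be uniform on {1, 2, 3, 4} (a hypothetical choice), and the helper name expect is not from the original notes. Taking Y = X, which is as dependent as two variables can be, the first comparison also shows E[XY] differing from E[X] E[Y].

```python
from fractions import Fraction

# Hypothetical example: X takes the values 1, 2, 3, 4 with equal probability.
pmf = {x: Fraction(1, 4) for x in (1, 2, 3, 4)}

def expect(g=lambda x: x):
    """E[g(X)] for the discrete distribution above."""
    return sum(p * g(x) for x, p in pmf.items())

mean_x = expect()                               # E[X] = 5/2

# Nonlinear g(x) = x^2: E[X^2] versus (E[X])^2
mean_of_square = expect(lambda x: x * x)        # 15/2
square_of_mean = mean_x ** 2                    # 25/4

# Nonlinear g(x) = 1/x: E[1/X] versus 1/E[X]
mean_of_recip = expect(lambda x: Fraction(1, x))  # 25/48
recip_of_mean = 1 / mean_x                        # 2/5

# Linear g(x) = 3x + 5: here the two sides do agree
mean_of_linear = expect(lambda x: 3 * x + 5)
linear_of_mean = 3 * mean_x + 5

print(mean_of_square, square_of_mean)
print(mean_of_recip, recip_of_mean)
print(mean_of_linear == linear_of_mean)
```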
B. Variance (and Standard Deviation)
Variance is a measure of the amount of dispersion or "spread" in a distribution. It is measured in units that are the square of the units in which the variable is measured, so standard deviation (the square root of variance) is often more intuitive. Standard deviation is measured in the same units as the variable itself.
1. Var[a] = 0 (if "a" is a constant)
2. If X and Y are independent:
Var[X+Y] = Var[X] + Var[Y]
Var[X-Y] = Var[X] + Var[Y] (still a sum)
3. Var[X+b] = Var[X] (constant b has no variance)
4. Var[aX] = a^2 Var[X]
5. Var[aX+b] = a^2 Var[X]
6. If X and Y are independent:
Var[aX+bY] = a^2 Var[X] + b^2 Var[Y]
7. If X and Y are not independent:
Var[aX+bY] = a^2 Var[X] + b^2 Var[Y] + 2ab Cov[X,Y]
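The rules for constants and for independent variables can be verified directly from the definition Var[Z] = E[(Z - E[Z])^2]. This is a sketch under assumed inputs: two independent fair dice and constants a = 3, b = 5 are hypothetical choices, and the helper names expect and variance are illustrative.

```python
from fractions import Fraction
from itertools import product

# Hypothetical example: X and Y are independent fair six-sided dice.
die = {v: Fraction(1, 6) for v in range(1, 7)}
joint = {(x, y): px * py
         for (x, px), (y, py) in product(die.items(), die.items())}

def expect(g):
    """E[g(X, Y)] over the joint distribution."""
    return sum(p * g(x, y) for (x, y), p in joint.items())

def variance(g):
    """Var[g(X, Y)] = E[(g - E[g])^2], computed from the definition."""
    mean = expect(g)
    return expect(lambda x, y: (g(x, y) - mean) ** 2)

a, b = 3, 5  # arbitrary constants chosen for illustration

var_const = variance(lambda x, y: b)               # rule 1
var_x = variance(lambda x, y: x)                   # 35/12 for a fair die
var_y = variance(lambda x, y: y)
var_shift = variance(lambda x, y: x + b)           # rule 3
var_scale = variance(lambda x, y: a * x)           # rule 4
var_affine = variance(lambda x, y: a * x + b)      # rule 5
var_sum = variance(lambda x, y: x + y)             # rule 2
var_diff = variance(lambda x, y: x - y)            # rule 2 (still a sum)
var_combo = variance(lambda x, y: a * x + b * y)   # rule 6

print(var_const == 0)
print(var_shift == var_x)
print(var_scale == a**2 * var_x, var_affine == a**2 * var_x)
print(var_sum == var_x + var_y, var_diff == var_x + var_y)
print(var_combo == a**2 * var_x + b**2 * var_y)
```

Standard deviation is the square root of each of these variances, so it scales by |a| rather than a^2.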
But what is Cov[X,Y]?
C. Covariance
Covariance measures the extent to which two variables move together. If E[X] = μ_X and E[Y] = μ_Y, then Cov[X,Y] = E[(X-μ_X)(Y-μ_Y)] = E[XY] - E[X]E[Y]. For discrete random variables with joint probability function P(x,y), the formula is given by:
Cov[X,Y] = Σ_x Σ_y (x-μ_X)(y-μ_Y) P(x,y)
1. If X and Y are independent, their covariance is zero.
2. Cov[a+bX, c+dY] = bd Cov[X,Y]
3. Cov[X,X] = Var[X]
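For discrete variables, the covariance can be computed by summing over a joint probability table. The sketch below assumes a small hypothetical joint distribution in which X and Y tend to take the same value; the table, the constants, and the helper names expect and cov are illustrative and not part of the original notes.

```python
from fractions import Fraction

# Hypothetical joint distribution P(x, y) of two dependent 0/1 variables.
joint = {
    (0, 0): Fraction(2, 5),
    (0, 1): Fraction(1, 10),
    (1, 0): Fraction(1, 10),
    (1, 1): Fraction(2, 5),
}

def expect(g):
    """E[g(X, Y)]: sum of g(x, y) * P(x, y) over the table."""
    return sum(p * g(x, y) for (x, y), p in joint.items())

def cov(g, h):
    """Cov[g, h] = E[(g - E[g]) (h - E[h])], computed from the definition."""
    mean_g, mean_h = expect(g), expect(h)
    return expect(lambda x, y: (g(x, y) - mean_g) * (h(x, y) - mean_h))

X = lambda x, y: x
Y = lambda x, y: y

cov_xy = cov(X, Y)                                          # definition
shortcut = expect(lambda x, y: x * y) - expect(X) * expect(Y)  # E[XY] - E[X]E[Y]

# Property 2 with a=1, b=2, c=3, d=-4: Cov[a+bX, c+dY] = bd Cov[X,Y]
cov_linear = cov(lambda x, y: 1 + 2 * x, lambda x, y: 3 - 4 * y)

# Property 3: Cov[X,X] = Var[X]
cov_xx = cov(X, X)
var_x = expect(lambda x, y: (x - expect(X)) ** 2)

print(cov_xy, shortcut)            # 3/20 3/20
print(cov_linear == 2 * -4 * cov_xy)
print(cov_xx == var_x)
```

The positive covariance reflects that most of the probability sits on the outcomes where X and Y agree.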
D. Correlation
Covariances have associated units that are the product of the units of X and the units of Y. If X is in feet and Y is in pounds, Cov[X,Y] is measured in foot-pounds. It is often desirable to have a unit-free measure of the extent of the linear relationship between two variables; for this, use correlation, ρ_X,Y = Cov[X,Y] / (σ_X σ_Y), where σ_X and σ_Y are the standard deviations of X and Y.
1. sign(ρ_X,Y) = sign(Cov[X,Y])
2. ρ_X,Y measures only the degree of LINEAR association, not slope
3. -1 ≤ ρ_X,Y ≤ +1
4. ρ_X,X = 1
Now, for statistically dependent random variables, we must allow for Cov[X,Y] nonzero, so the general and special-case variance formulas are:
Var[aX+bY] = a^2 Var[X] + b^2 Var[Y] + 2ab Cov[X,Y] (general case)
Var[X+Y] = Var[X] + Var[Y] + 2 Cov[X,Y]
Var[X-Y] = Var[X] + Var[Y] - 2 Cov[X,Y]
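The correlation and the variance formulas for dependent variables can be checked on a small example. This is a sketch only: the joint table below is a hypothetical distribution of two dependent 0/1 variables, the helper names are illustrative, and correlation is computed with the usual definition Cov[X,Y] divided by the product of the standard deviations.

```python
import math
from fractions import Fraction

# Hypothetical joint distribution P(x, y) of two dependent 0/1 variables.
joint = {
    (0, 0): Fraction(2, 5),
    (0, 1): Fraction(1, 10),
    (1, 0): Fraction(1, 10),
    (1, 1): Fraction(2, 5),
}

def expect(g):
    """E[g(X, Y)] over the joint table."""
    return sum(p * g(x, y) for (x, y), p in joint.items())

def cov(g, h):
    """Cov[g, h] from the definition E[(g - E[g]) (h - E[h])]."""
    mean_g, mean_h = expect(g), expect(h)
    return expect(lambda x, y: (g(x, y) - mean_g) * (h(x, y) - mean_h))

X = lambda x, y: x
Y = lambda x, y: y

var_x, var_y, cov_xy = cov(X, X), cov(Y, Y), cov(X, Y)

# Correlation: covariance divided by the product of standard deviations.
rho = float(cov_xy) / math.sqrt(float(var_x * var_y))

# Variances computed directly, to compare against the formulas.
var_sum = cov(lambda x, y: x + y, lambda x, y: x + y)
var_diff = cov(lambda x, y: x - y, lambda x, y: x - y)
a, b = 3, 5  # arbitrary constants chosen for illustration
var_combo = cov(lambda x, y: a * x + b * y, lambda x, y: a * x + b * y)

print(rho)
print(var_sum == var_x + var_y + 2 * cov_xy)
print(var_diff == var_x + var_y - 2 * cov_xy)
print(var_combo == a**2 * var_x + b**2 * var_y + 2 * a * b * cov_xy)
```

Because the covariance here is positive, the variance of the sum is larger than Var[X] + Var[Y] and the variance of the difference is smaller.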