Suppose X is a r.v. (random variable) with density f(x). If its distribution is discrete, then the density is also known as the probability mass function and in fact gives the ``point'' probabilities:
\[
P[X = x] = f(x).
\]
Probabilities of sets with multiple points are obtained by summing:
\[
P[X \in A] = \sum_{x \in A} f(x).
\]
If the distribution of X is continuous, then all point probabilities are 0 and the density gives probabilities of intervals through integration:
\[
P[a \le X \le b] = \int_a^b f(x)\,dx.
\]
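To make the discrete/continuous contrast concrete, here is a small numerical sketch; the Poisson(2) and Exponential(1) distributions are assumed examples, not from the notes. Point probabilities are summed in the discrete case, while in the continuous case the density is integrated over an interval.

```python
import math

# Assumed example (discrete): Poisson(2) mass function
# f(x) = e^{-lam} lam^x / x!
lam = 2.0
def pmf(x):
    return math.exp(-lam) * lam ** x / math.factorial(x)

# P[X in {0, 1, 2}] is a sum of point probabilities.
p_discrete = sum(pmf(x) for x in range(3))

# Assumed example (continuous): Exponential(1) density f(x) = e^{-x}, x >= 0.
def density(x):
    return math.exp(-x)

# P[1 <= X <= 3] by midpoint-rule integration of the density.
n = 100_000
h = (3.0 - 1.0) / n
p_continuous = sum(density(1.0 + (i + 0.5) * h) for i in range(n)) * h

print(p_discrete)    # 5 e^{-2}, approximately 0.6767
print(p_continuous)  # approximately e^{-1} - e^{-3}, about 0.3181
```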
The mathematical expectation of a function h(X) of a r.v. is defined to be
\[
E[h(X)] = \sum_x h(x) f(x) \quad \text{(discrete case)}, \qquad
E[h(X)] = \int h(x) f(x)\,dx \quad \text{(continuous case)}.
\]
The formulae displayed here illustrate the general principle: summations for discrete r.v.'s and integration for continuous r.v.'s. Also, whenever the limits of integration or summation are not shown, it is assumed that they are over the entire ``space,'' which effectively means all values of x where f(x) > 0.
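As a minimal sketch of the ``summation for discrete r.v.'s'' rule (the fair-die distribution and the choice h(x) = x^2 are assumed examples):

```python
from fractions import Fraction

# Assumed example: X uniform on {1,...,6} (a fair die), h(x) = x**2,
# so f(x) = 1/6 over the whole support and E[h(X)] = sum of h(x) f(x).
f = Fraction(1, 6)
e_h = sum(Fraction(x ** 2) * f for x in range(1, 7))
print(e_h)  # 91/6
```

Exact rational arithmetic via `Fraction` avoids any floating-point rounding in the sum.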
Technical Note: To define E[h(X)], it is usually required that either (i) h(x) \ge 0 for all x, so that the summands or integrand are never negative, and then E[h(X)] may possibly be +\infty, or (ii) E[|h(X)|] < \infty, which means effectively that the summation or integration converges absolutely. We will generally not bother ourselves with such details. We always assume that the integral or summation satisfies whatever mathematical properties are needed for things to make sense. There are very few practical situations where problems of infinite expectation arise.
The connection between mathematical expectation and data comes through the notion of long run averages: If x_1, x_2, \ldots, x_n is a sample of realized values of the r.v. X, then as n \to \infty the sample mean \frac{1}{n} \sum_{i=1}^n h(x_i) tends to E[h(X)]. The precise mathematical formulation of this ``principle'' is the Law of Large Numbers, which makes certain assumptions on how the sample is generated (e.g. that x_1, x_2, \ldots, x_n are realized values of independent and identically distributed (abbreviated i.i.d.) random variables with the same distribution as X).
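The long-run-average principle is easy to check by simulation. In the sketch below, X ~ Exponential(1) and h(x) = x are assumed choices, so the sample mean of i.i.d. draws should settle near E[X] = 1.

```python
import random

random.seed(0)  # reproducible draws

# Assumed example: X ~ Exponential(1), h(x) = x, so E[h(X)] = E[X] = 1.
n = 100_000
draws = [random.expovariate(1.0) for _ in range(n)]
sample_mean = sum(draws) / n
print(sample_mean)  # close to 1
```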
Of course, there are certain mathematical expectations which are of most interest, namely the mean and variance of the r.v.:
\[
\mu = E[X], \qquad \sigma^2 = \mathrm{Var}[X] = E[(X - \mu)^2].
\]
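Both quantities can be computed directly from their definitions by summation; the Bernoulli(0.3) distribution below is an assumed example.

```python
# Assumed example: X ~ Bernoulli(p) with p = 0.3, so f(1) = p, f(0) = 1 - p.
# mu = E[X] and sigma^2 = E[(X - mu)^2], each computed by summation.
p = 0.3
f = {0: 1 - p, 1: p}                      # point probabilities
mu = sum(x * fx for x, fx in f.items())
var = sum((x - mu) ** 2 * fx for x, fx in f.items())
print(mu, var)  # 0.3 and p(1 - p) = 0.21, up to floating-point rounding
```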
Some useful properties of these mathematical ``operators'' are summarized in the next proposition. Note that if X and Y are r.v.'s, then so are h(X) and g(X,Y) for any appropriately defined real valued functions h(x) and g(x,y).
Proof. Part (i) is proved in Hogg & Craig. For part (ii), assuming X is a continuous r.v. we have
\[
E[h(X)] = \int h(x) f(x)\,dx \ge 0,
\]
since the integrand is nonnegative. (The discrete case is the same with a sum in place of the integral.)
For (iii) we apply the version of Chebyshev's inequality that says if X is a nonnegative r.v. and c > 0 then P[X > c] \le E[X]/c; see Hogg & Craig. Since E[X] = 0, it follows that P[X > c] = 0 for all c > 0. In particular, P[X > 0] = \lim_{c \downarrow 0} P[X > c] = 0. Since we know P[X \ge 0] = 1, we have 1 = P[X \ge 0] = P[X = 0] + P[X > 0] = P[X = 0].
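The Chebyshev-type bound used in the proof can be sanity-checked numerically. Exponential(1) below is an assumed example of a nonnegative r.v. (here E[X] = 1, not 0, so the bound is informative rather than forcing the probabilities to vanish).

```python
import random

random.seed(1)  # reproducible draws

# Assumed example: X ~ Exponential(1), a nonnegative r.v. with E[X] = 1.
# Check empirically that P[X > c] <= E[X]/c for several values of c > 0.
n = 200_000
draws = [random.expovariate(1.0) for _ in range(n)]
emp_mean = sum(draws) / n
checks = []
for c in (0.5, 1.0, 2.0):
    freq = sum(x > c for x in draws) / n  # empirical P[X > c]
    checks.append(freq <= emp_mean / c)   # bound E[X]/c
print(checks)  # the bound should hold for every c
```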
Proof. Since the r.v. (X - \mu)^2 is nonnegative, it follows that \mathrm{Var}[X] = E[(X - \mu)^2] \ge 0 by part (ii) of Proposition 1. Continuing, if 0 = \mathrm{Var}[X] = E[(X - \mu)^2], then by part (iii) of Proposition 1 it follows that the r.v. (X - \mu)^2 = 0 with probability 1, i.e.\ X = \mu with probability 1. Since \mu is a constant, this completes the proof of part (i) of Proposition 2.
For part (ii), note from part (i) of Proposition 1 that E[a X + b] = a E[X] + b, so
\[
\mathrm{Var}[aX + b] = E\bigl[(aX + b - (a E[X] + b))^2\bigr] = E\bigl[a^2 (X - E[X])^2\bigr] = a^2\,\mathrm{Var}[X].
\]
Part (iii) is already proved in Hogg & Craig, right after
the definition of variance.
The corresponding ``sample'' quantities are given by
\[
\bar{x} = \frac{1}{n} \sum_{i=1}^n x_i, \qquad
s^2 = \frac{1}{n-1} \sum_{i=1}^n (x_i - \bar{x})^2.
\]
An ``alternative'' sample variance is sometimes considered:
\[
\hat{\sigma}^2 = \frac{1}{n} \sum_{i=1}^n (x_i - \bar{x})^2.
\]
The difference between the two sample variances is unimportant when n is large.
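A small numeric sketch of the two sample variances; the data values are made up for illustration.

```python
# Made-up data for illustration.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
n = len(data)
xbar = sum(data) / n
ss = sum((x - xbar) ** 2 for x in data)   # sum of squared deviations
s2 = ss / (n - 1)                          # usual sample variance, divisor n - 1
s2_alt = ss / n                            # ``alternative'' version, divisor n
print(xbar, s2, s2_alt)  # 5.0, 32/7 (about 4.571), 4.0
```

With only n = 8 points the two versions differ noticeably; as n grows, the ratio (n - 1)/n tends to 1 and the difference becomes unimportant.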