Definition
Let X and Y be random variables with joint probability mass function (or density) f(x, y). The covariance of X and Y is the number

Cov(X, Y) = E[(X - μ_X)(Y - μ_Y)],

where μ_X = E(X) and μ_Y = E(Y).
Remark
By the definitions of Cov(X, Y) and E[g(X, Y)], taking g(x, y) = (x - μ_X)(y - μ_Y), we obtain:

Cov(X, Y) = Σ_x Σ_y (x - μ_X)(y - μ_Y) f(x, y)   for discrete X, Y,

Cov(X, Y) = ∫∫ (x - μ_X)(y - μ_Y) f(x, y) dx dy   for continuous X, Y.
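To make the discrete formula concrete, here is a minimal Python sketch that evaluates the double sum directly; the joint pmf used here is a made-up example for illustration, not one from the lecture:

```python
# Minimal sketch: Cov(X, Y) for a discrete pair, computed directly from
# the definition via the double sum above. The pmf is a made-up example.

pmf = {  # (x, y) -> f(x, y)
    (0, 0): 0.2, (0, 1): 0.3,
    (1, 0): 0.1, (1, 1): 0.4,
}

mu_X = sum(x * p for (x, y), p in pmf.items())  # E(X) = 0.5
mu_Y = sum(y * p for (x, y), p in pmf.items())  # E(Y) = 0.7

# Sum of (x - mu_X)(y - mu_Y) f(x, y) over the support.
cov = sum((x - mu_X) * (y - mu_Y) * p for (x, y), p in pmf.items())
print(cov)  # ~0.05
```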
Notation: instead of Cov(X, Y) we often write σ_XY.
Interpretation. Covariance is a measure of the relationship between two random variables:
(a) If large values of X (larger than μ_X) tend to occur with large values of Y (larger than μ_Y), and small values of X (smaller than μ_X) with small values of Y (smaller than μ_Y), then Cov(X, Y) > 0.
(b) If large values of X (larger than μ_X) tend to occur with small values of Y (smaller than μ_Y), and small values of X (smaller than μ_X) with large values of Y (larger than μ_Y), then Cov(X, Y) < 0.
(c) Note that for X = Y we get Cov(X, Y) = Var(X) ≥ 0.
Proposition
Cov(X, Y) = E(XY) - μ_X μ_Y.
Proof
Cov(X, Y) = E[(X - μ_X)(Y - μ_Y)] = E(XY - Xμ_Y - Yμ_X + μ_X μ_Y) =
= E(XY) - μ_Y E(X) - μ_X E(Y) + μ_X μ_Y = E(XY) - μ_X μ_Y,
using the linearity of expectation together with E(X) = μ_X and E(Y) = μ_Y.
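As a quick sanity check of the proposition, the following sketch (again on a made-up illustrative pmf) confirms that the shortcut formula and the defining double sum give the same number:

```python
# Sketch: on a small made-up joint pmf, E(XY) - mu_X * mu_Y agrees
# with the covariance computed from the definition.

pmf = {(0, 0): 0.2, (0, 1): 0.3, (1, 0): 0.1, (1, 1): 0.4}

mu_X = sum(x * p for (x, y), p in pmf.items())
mu_Y = sum(y * p for (x, y), p in pmf.items())

by_definition = sum((x - mu_X) * (y - mu_Y) * p for (x, y), p in pmf.items())
by_shortcut = sum(x * y * p for (x, y), p in pmf.items()) - mu_X * mu_Y

print(abs(by_definition - by_shortcut) < 1e-12)  # True
```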
Theorem
If random variables X and Y are independent, then Cov(X,Y) = 0.
Proof
For independent random variables, E(XY) = E(X)E(Y). Combining this with the formula from the proposition above, we get:
Cov(X, Y) = E(XY) - μ_X μ_Y = E(X)E(Y) - μ_X μ_Y = 0.
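A rough Monte Carlo illustration of the theorem: the sample covariance of two independently generated sequences stays near zero. The uniform distribution and sample size here are arbitrary choices.

```python
# Sketch: sample covariance of two independently drawn sequences is
# close to 0, as the theorem predicts (up to Monte Carlo error).
import random

random.seed(0)
n = 100_000
xs = [random.random() for _ in range(n)]
ys = [random.random() for _ in range(n)]  # drawn independently of xs

mx, my = sum(xs) / n, sum(ys) / n
sample_cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
print(sample_cov)  # near 0
```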
Remark
The converse usually does not hold. For example, let (X, Y) be a discrete random vector with the joint probability mass function concentrated on four points:

(x, y):     (2, 2)   (-2, -2)   (-4, 1)   (4, -1)
f(x, y):     1/4       1/4        1/4       1/4

Note that the value of X determines the value of Y, so X and Y are dependent. At the same time, EX = EY = 0 and E(XY) = (1/4)(2×2 + (-2)×(-2) + (-4)×1 + 4×(-1)) = 0. Therefore Cov(X, Y) = 0.
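The following sketch verifies the moments of this counterexample exactly, using the four support points from the table above:

```python
# Sketch: exact check of the counterexample's moments. The support
# points are taken from the table above, each with probability 1/4.

points = [(2, 2), (-2, -2), (-4, 1), (4, -1)]
p = 1 / 4

e_X = sum(x * p for x, y in points)        # 0.0
e_Y = sum(y * p for x, y in points)        # 0.0
e_XY = sum(x * y * p for x, y in points)   # 0.0
print(e_XY - e_X * e_Y)  # Cov(X, Y) = 0.0, yet Y is a function of X
```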
Theorem
For any constants a and b,
Var(aX + bY) = a²Var(X) + b²Var(Y) + 2ab Cov(X, Y).
Proof
Var(aX + bY) = E{[(aX + bY) - (aμ_X + bμ_Y)]²} = E{[a(X - μ_X) + b(Y - μ_Y)]²} =
= E{a²(X - μ_X)²} + E{2ab(X - μ_X)(Y - μ_Y)} + E{b²(Y - μ_Y)²} =
= a²Var(X) + 2ab Cov(X, Y) + b²Var(Y).
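A numerical spot-check of the identity, on a made-up joint pmf with arbitrarily chosen constants a and b:

```python
# Sketch: check Var(aX + bY) = a^2 Var X + b^2 Var Y + 2ab Cov(X, Y)
# on a small made-up joint pmf; a and b are arbitrary test values.

pmf = {(0, 0): 0.25, (0, 1): 0.25, (1, 0): 0.1, (1, 1): 0.4}
a, b = 3.0, -2.0

def E(g):  # expectation of g(X, Y) under the joint pmf
    return sum(g(x, y) * p for (x, y), p in pmf.items())

var_X = E(lambda x, y: x * x) - E(lambda x, y: x) ** 2
var_Y = E(lambda x, y: y * y) - E(lambda x, y: y) ** 2
cov_XY = E(lambda x, y: x * y) - E(lambda x, y: x) * E(lambda x, y: y)

lhs = E(lambda x, y: (a * x + b * y) ** 2) - E(lambda x, y: a * x + b * y) ** 2
rhs = a ** 2 * var_X + b ** 2 * var_Y + 2 * a * b * cov_XY
print(abs(lhs - rhs) < 1e-12)  # True
```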
Corollary
If X and Y are independent random variables, then
Var(aX + bY) = a²Var(X) + b²Var(Y).
Example
Let X_1, ..., X_5 be the numbers of spots obtained in 5 independent rolls of a die. Then
Var((X_1 + X_2)/2) = (1/2)Var(X_1), and Var((X_1 + X_2 + ... + X_5)/5) = (1/5)Var(X_1).
We can see that the variance of the average number of spots decreases in inverse proportion to the number of rolls, while its standard deviation decreases in inverse proportion to the square root of the number of rolls. The same property holds for the variance of the average of results obtained in independent repetitions of the same experiment.
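A Monte Carlo sketch of this decay, assuming a fair six-sided die (so Var(X_1) = 35/12); the trial count is an arbitrary choice:

```python
# Sketch: the variance of the average of n fair-die rolls behaves like
# Var(X_1)/n, where Var(X_1) = 35/12 for a fair die.
import random

random.seed(1)
trials = 100_000

def var_of_average(n):
    avgs = [sum(random.randint(1, 6) for _ in range(n)) / n
            for _ in range(trials)]
    m = sum(avgs) / trials
    return sum((a - m) ** 2 for a in avgs) / trials

for n in (1, 2, 5):
    # simulated variance vs. the theoretical value 35/12/n
    print(n, round(var_of_average(n), 3), round(35 / 12 / n, 3))
```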