
Covariance

Given $n$ sets of variates denoted $\{x_1\}$, ..., $\{x_n\}$, a quantity called the Covariance Matrix is defined by

\begin{eqnarray*}
V_{ij} &=& \mathop{\rm cov}\nolimits (x_i,x_j) \qquad (1)\\
&\equiv& \left\langle{(x_i-\mu_i)(x_j-\mu_j)}\right\rangle \qquad (2)\\
&=& \left\langle{x_ix_j}\right\rangle - \left\langle{x_i}\right\rangle\left\langle{x_j}\right\rangle, \qquad (3)
\end{eqnarray*}

where $\mu_i=\left\langle{x_i}\right\rangle{}$ and $\mu_j=\left\langle{x_j}\right\rangle{}$ are the Means of $x_i$ and $x_j$, respectively. An individual element $V_{ij}$ of the Covariance Matrix is called the covariance of the two variates $x_i$ and $x_j$, and provides a measure of how strongly correlated these variables are. In fact, the derived quantity
\begin{displaymath}
\mathop{\rm cor}\nolimits (x_i,x_j)\equiv {\mathop{\rm cov}\nolimits (x_i,x_j)\over\sigma_i\sigma_j},
\end{displaymath} (4)

where $\sigma_i$, $\sigma_j$ are the Standard Deviations, is called the Correlation of $x_i$ and $x_j$. Note that if $x_i$ and $x_j$ are taken from the same set of variates (say, $x$), then
\begin{displaymath}
\mathop{\rm cov}\nolimits (x,x) = \left\langle{x^2}\right\rangle - \left\langle{x}\right\rangle^2 = \mathop{\rm var}\nolimits (x),
\end{displaymath} (5)

giving the usual Variance $\mathop{\rm var}\nolimits (x)$. The covariance is also symmetric, since interchanging the roles of $x$ and $y$ leaves $\left\langle{(x-\mu_x)(y-\mu_y)}\right\rangle$ unchanged:
\begin{displaymath}
\mathop{\rm cov}\nolimits (x,y) = \mathop{\rm cov}\nolimits (y,x).
\end{displaymath} (6)
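As a concrete numerical illustration of (1)-(6), the following Python sketch (assuming only NumPy; the simulated variates and variable names are illustrative) estimates the covariance matrix and the correlations from sampled data and checks the variance and symmetry identities.

\begin{verbatim}
import numpy as np

rng = np.random.default_rng(0)
# Three sets of variates (one per row), 10,000 samples each; y is
# built from x so that the pair (x, y) is correlated.
x = rng.normal(size=10_000)
y = 0.5 * x + rng.normal(size=10_000)
z = rng.normal(size=10_000)
data = np.vstack([x, y, z])

# Covariance matrix V_ij = <x_i x_j> - <x_i><x_j>, eqs. (1)-(3).
centered = data - data.mean(axis=1, keepdims=True)
V = centered @ centered.T / data.shape[1]

# Correlation cor(x_i, x_j) = cov(x_i, x_j)/(sigma_i sigma_j), eq. (4).
sigma = np.sqrt(np.diag(V))
cor = V / np.outer(sigma, sigma)

assert np.allclose(np.diag(V), data.var(axis=1))  # cov(x,x) = var(x), eq. (5)
assert np.allclose(V, V.T)                        # symmetry, eq. (6)
print(np.round(cor, 2))
\end{verbatim}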

For two variables, the covariance is related to the Variance by
\begin{displaymath}
\mathop{\rm var}\nolimits (x+y) = \mathop{\rm var}\nolimits (x) + \mathop{\rm var}\nolimits (y) + 2\mathop{\rm cov}\nolimits (x,y).
\end{displaymath} (7)
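A minimal numerical check of (7) follows (again assuming NumPy; the identity holds exactly for sample moments computed with a common normalization, up to floating-point rounding).

\begin{verbatim}
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=100_000)
y = 0.3 * x + rng.normal(size=100_000)  # deliberately correlated with x

cov_xy = np.mean(x * y) - x.mean() * y.mean()           # eq. (3)
assert np.allclose(np.var(x + y),
                   np.var(x) + np.var(y) + 2 * cov_xy)  # eq. (7)
\end{verbatim}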


For two independent variates $x=x_i$ and $y=x_j$,

\begin{displaymath}
\mathop{\rm cov}\nolimits (x,y) = \left\langle{xy}\right\rangle - \mu_x\mu_y = \left\langle{x}\right\rangle\left\langle{y}\right\rangle - \mu_x\mu_y = 0,
\end{displaymath} (8)

so the covariance is zero. However, if the variables are correlated in some way, then their covariance will be Nonzero. In fact, if $\mathop{\rm cov}\nolimits (x,y) > 0$, then $y$ tends to increase as $x$ increases, and if $\mathop{\rm cov}\nolimits (x,y) < 0$, then $y$ tends to decrease as $x$ increases. (The converse does not hold, however: zero covariance does not by itself imply independence.)
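The following sketch (NumPy assumed; the constructed variates are illustrative) exhibits all three cases: an independent pair whose sample covariance is near zero, a positively related pair, and a negatively related pair. Note that the sample covariance of independent variates is only approximately zero for finite samples.

\begin{verbatim}
import numpy as np

rng = np.random.default_rng(2)
n = 100_000
x = rng.normal(size=n)
y_indep = rng.normal(size=n)            # independent of x
y_up = x + 0.1 * rng.normal(size=n)     # tends to increase with x
y_down = -x + 0.1 * rng.normal(size=n)  # tends to decrease with x

def cov(a, b):
    """Population covariance <ab> - <a><b>, eq. (3)."""
    return np.mean(a * b) - a.mean() * b.mean()

print(cov(x, y_indep))  # ~ 0 (zero in expectation)
print(cov(x, y_up))     # > 0
print(cov(x, y_down))   # < 0
\end{verbatim}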


The covariance obeys the identity

\begin{eqnarray*}
\mathop{\rm cov}\nolimits (x+z,y) &=& \left\langle{(x+z)y}\right\rangle - \left\langle{x+z}\right\rangle\left\langle{y}\right\rangle \\
&=& \left\langle{xy}\right\rangle + \left\langle{zy}\right\rangle - (\left\langle{x}\right\rangle + \left\langle{z}\right\rangle)\left\langle{y}\right\rangle \\
&=& \left\langle{xy}\right\rangle - \left\langle{x}\right\rangle\left\langle{y}\right\rangle + \left\langle{zy}\right\rangle - \left\langle{z}\right\rangle\left\langle{y}\right\rangle \\
&=& \mathop{\rm cov}\nolimits (x,y) + \mathop{\rm cov}\nolimits (z,y). \qquad (9)
\end{eqnarray*}
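A quick numerical confirmation of (9) (NumPy assumed; the identity is exact for sample moments, up to floating-point rounding):

\begin{verbatim}
import numpy as np

rng = np.random.default_rng(3)
x, z, y = rng.normal(size=(3, 50_000))

def cov(a, b):
    """Population covariance <ab> - <a><b>, eq. (3)."""
    return np.mean(a * b) - a.mean() * b.mean()

assert np.allclose(cov(x + z, y), cov(x, y) + cov(z, y))  # eq. (9)
\end{verbatim}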

By induction on (9), it therefore follows that
\begin{eqnarray*}
\mathop{\rm cov}\nolimits \left({\sum_{i=1}^n x_i, y}\right) &=& \sum_{i=1}^n \mathop{\rm cov}\nolimits (x_i,y) \qquad (10)\\
\mathop{\rm cov}\nolimits \left({\sum_{i=1}^n x_i, \sum_{j=1}^m y_j}\right) &=& \sum_{i=1}^n \mathop{\rm cov}\nolimits \left({x_i, \sum_{j=1}^m y_j}\right) \qquad (11)\\
&=& \sum_{i=1}^n \mathop{\rm cov}\nolimits \left({\sum_{j=1}^m y_j, x_i}\right) \qquad (12)\\
&=& \sum_{i=1}^n \sum_{j=1}^m \mathop{\rm cov}\nolimits (y_j, x_i) \qquad (13)\\
&=& \sum_{i=1}^n \sum_{j=1}^m \mathop{\rm cov}\nolimits (x_i, y_j). \qquad (14)
\end{eqnarray*}
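Likewise, a sketch checking the double-sum identity (14) on simulated variates (NumPy assumed; sizes are illustrative):

\begin{verbatim}
import numpy as np

rng = np.random.default_rng(4)
n, m, N = 4, 3, 20_000
xs = rng.normal(size=(n, N))  # the variates x_1, ..., x_n
ys = rng.normal(size=(m, N))  # the variates y_1, ..., y_m

def cov(a, b):
    """Population covariance <ab> - <a><b>, eq. (3)."""
    return np.mean(a * b) - a.mean() * b.mean()

lhs = cov(xs.sum(axis=0), ys.sum(axis=0))
rhs = sum(cov(xi, yj) for xi in xs for yj in ys)  # eq. (14)
assert np.allclose(lhs, rhs)
\end{verbatim}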

See also Correlation (Statistical), Covariance Matrix, Variance


