Covariance and Correlation

Covariance and correlation describe the relationship between two random variables. Covariance is a measure of how two random variables change together. Correlation is a standardized measure of the strength and direction of the linear relationship between two variables.

\[ \begin{align*} \text{Cov}(\mathbf{X},\mathbf{Y}) &= E\left[(\mathbf{X}-E(\mathbf{X}))(\mathbf{Y}-E(\mathbf{Y}))\right] \\ &= E\left[(\mathbf{X}-\mu_\mathbf{X})(\mathbf{Y}-\mu_\mathbf{Y})\right] \end{align*} \]

Properties

\[ \require{cancel} \begin{align*} \text{Cov}(\mathbf{X},\mathbf{Y}) &= E\left[(\mathbf{X}-\mu_\mathbf{X})(\mathbf{Y}-\mu_\mathbf{Y})\right] \\ &= E(\mathbf{X}\mathbf{Y}) - \mu_\mathbf{X}E(\mathbf{Y}) - \cancel{\mu_\mathbf{Y}E(\mathbf{X})} + \cancel{\mu_\mathbf{X}\mu_\mathbf{Y}} \\ &= E(\mathbf{X}\mathbf{Y}) - \mu_\mathbf{X}E(\mathbf{Y}) \\ &= E(\mathbf{X}\mathbf{Y}) - E(\mathbf{X})E(\mathbf{Y}) \end{align*} \]
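
As a quick sanity check, the definition and the shortcut form \(E(\mathbf{X}\mathbf{Y}) - E(\mathbf{X})E(\mathbf{Y})\) can be compared on a small discrete distribution. The sketch below is illustrative only: the joint pmf `pmf` and the helper `expect` are made-up names and values, not part of the original notes, and exact fractions are used so the two results can be compared without rounding.

```python
from fractions import Fraction

# Hypothetical joint pmf of (X, Y): maps (x, y) -> probability (sums to 1).
pmf = {
    (0, 0): Fraction(1, 4),
    (0, 1): Fraction(1, 4),
    (1, 1): Fraction(1, 4),
    (2, 3): Fraction(1, 4),
}

def expect(g):
    """Return E[g(X, Y)] under the joint pmf above."""
    return sum(p * g(x, y) for (x, y), p in pmf.items())

mu_x = expect(lambda x, y: x)  # E(X)
mu_y = expect(lambda x, y: y)  # E(Y)

# Definition: E[(X - mu_X)(Y - mu_Y)]
cov_definition = expect(lambda x, y: (x - mu_x) * (y - mu_y))

# Shortcut: E(XY) - E(X)E(Y)
cov_shortcut = expect(lambda x, y: x * y) - mu_x * mu_y

print(cov_definition, cov_shortcut)
```

Both expressions evaluate to the same fraction for this pmf, as the derivation predicts.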

If \(\mathbf{X}\) and \(\mathbf{Y}\) are independent, then \(E(\mathbf{X}\mathbf{Y}) = E(\mathbf{X})E(\mathbf{Y})\), so \(\text{Cov}(\mathbf{X},\mathbf{Y}) = 0\).

\[ \begin{align*} \text{Var}(\mathbf{X}) &= \text{Cov}(\mathbf{X},\mathbf{X}) \\ \text{Var}(\mathbf{X}+\mathbf{Y}) &= \text{Var}(\mathbf{X}) + \text{Var}(\mathbf{Y}) + 2\text{Cov}(\mathbf{X},\mathbf{Y}) \end{align*} \]
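
The variance-of-a-sum identity can be checked numerically. The following is a minimal sketch, assuming a small set of made-up paired observations (`xs`, `ys`) treated as an equally weighted empirical distribution, so that population (divide-by-\(n\)) moments apply. The helpers `mean`, `cov`, and `var` are hypothetical names defined here for illustration.

```python
from fractions import Fraction

# Hypothetical paired observations, each pair carrying probability 1/n.
xs = [1, 2, 4, 7]
ys = [3, 1, 5, 2]

def mean(values):
    """Population mean as an exact fraction."""
    return Fraction(sum(values), len(values))

def cov(a, b):
    """Population covariance E[(A - mu_A)(B - mu_B)]."""
    mean_a, mean_b = mean(a), mean(b)
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / len(a)

def var(a):
    """Var(A) = Cov(A, A)."""
    return cov(a, a)

sums = [x + y for x, y in zip(xs, ys)]

lhs = var(sums)                             # Var(X + Y)
rhs = var(xs) + var(ys) + 2 * cov(xs, ys)   # Var(X) + Var(Y) + 2 Cov(X, Y)

print(lhs, rhs)
```

Note that `var` is implemented as `cov(a, a)`, which mirrors the first identity above.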

Correlation of \(\mathbf{X}\) and \(\mathbf{Y}\)

  • A correlation of +1 indicates a perfect positive linear relationship.
  • A correlation of -1 indicates a perfect negative linear relationship.
  • A correlation of 0 implies no linear relationship.
\[ \begin{align*} \text{Cor}(\mathbf{X},\mathbf{Y}) &= \frac{\text{Cov}(\mathbf{X},\mathbf{Y})}{\sigma(\mathbf{X})\sigma(\mathbf{Y})} \\ &= \frac{\sigma_{\mathbf{X}\mathbf{Y}}}{\sigma_{\mathbf{X}}\sigma_{\mathbf{Y}}} \\ \text{Cor}(a\mathbf{X}+b, c\mathbf{Y}+d) &= \text{sign}(ac)\,\text{Cor}(\mathbf{X}, \mathbf{Y}), \quad a, c \neq 0 \end{align*} \]
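
To illustrate how correlation behaves under shifting and rescaling, here is a small sketch on made-up data. The sample values and the helpers `mean`, `cov`, and `cor` are assumptions for illustration only. It assumes nonzero scale factors \(a\) and \(c\): the magnitude of the correlation is unchanged, and its sign is preserved when \(ac > 0\) and flipped when \(ac < 0\).

```python
import math

# Hypothetical paired observations, treated as an equally weighted distribution.
xs = [1.0, 2.0, 4.0, 7.0]
ys = [3.0, 1.0, 5.0, 2.0]

def mean(values):
    return sum(values) / len(values)

def cov(a, b):
    """Population covariance E[(A - mu_A)(B - mu_B)]."""
    mean_a, mean_b = mean(a), mean(b)
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / len(a)

def cor(a, b):
    """Correlation: covariance divided by the product of standard deviations."""
    return cov(a, b) / math.sqrt(cov(a, a) * cov(b, b))

r = cor(xs, ys)

# a = 2, c = 3 (ac > 0): correlation is unchanged.
r_same_sign = cor([2 * x + 5 for x in xs], [3 * y - 1 for y in ys])

# a = -2, c = 3 (ac < 0): correlation changes sign.
r_flipped = cor([-2 * x + 5 for x in xs], [3 * y - 1 for y in ys])

print(r, r_same_sign, r_flipped)
```

The offsets \(b\) and \(d\) drop out entirely because covariance and standard deviation are both computed from deviations about the mean.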

The Caveat

While independent variables always have zero covariance and zero correlation, the converse does not hold: zero covariance or zero correlation does not imply independence, because both quantities measure only linear association.
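
A standard counterexample, sketched here under the assumption that \(\mathbf{X}\) is uniform on \(\{-1, 0, 1\}\) and \(\mathbf{Y} = \mathbf{X}^2\), makes this concrete: \(\mathbf{Y}\) is completely determined by \(\mathbf{X}\), yet the covariance is exactly zero. The variable names below are illustrative only.

```python
from fractions import Fraction

# Assumed setup: X is uniform on {-1, 0, 1} and Y = X**2.
support = [-1, 0, 1]
p = Fraction(1, 3)  # P(X = x) for each x in the support

e_x = sum(p * x for x in support)            # E(X)
e_y = sum(p * x**2 for x in support)         # E(Y) = E(X^2)
e_xy = sum(p * x * x**2 for x in support)    # E(XY) = E(X^3)

cov_xy = e_xy - e_x * e_y                    # Cov(X, Y)

# Independence would require P(X=0, Y=0) == P(X=0) * P(Y=0).
p_joint = p          # Y = 0 exactly when X = 0, so P(X=0, Y=0) = 1/3
p_product = p * p    # P(X=0) * P(Y=0) = 1/3 * 1/3

print(cov_xy, p_joint, p_product)
```

The covariance vanishes because the symmetric values \(-1\) and \(+1\) cancel in \(E(\mathbf{X})\) and \(E(\mathbf{X}^3)\), while the mismatch between the joint probability and the product of marginals shows that the variables are dependent.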