When Does No Correlation Imply Independence?
It is a well-known fact in probability that zero correlation does not imply independence. The standard counterexample is simple: let $X$ be a standard normal random variable and $Y = X^2$. Then $X$ and $Y$ are clearly dependent, since knowing $X$ determines $Y$ completely, yet $\operatorname{Cov}(X,Y) = \mathbb E[X^3] - \mathbb E[X]\mathbb E[X^2] = \mathbb E[X^3] = 0$, because the odd moments of a standard normal vanish by symmetry. Independence is a much stronger condition than mere uncorrelatedness.
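A quick Monte Carlo check of this counterexample (a sketch; the sample size and seed are arbitrary choices): the sample covariance sits near zero, while the joint probability of an event pair visibly fails to factorize.

```python
import random

random.seed(0)
n = 200_000

# X ~ N(0, 1), Y = X^2: fully dependent, yet uncorrelated.
xs = [random.gauss(0.0, 1.0) for _ in range(n)]
ys = [x * x for x in xs]

mx = sum(xs) / n
my = sum(ys) / n
cov = sum(x * y for x, y in zip(xs, ys)) / n - mx * my

# Dependence shows up at the level of events: take A = B = (1, inf).
# P(X > 1, Y > 1) = P(X > 1), far from the product P(X > 1) * P(Y > 1).
p_joint = sum(1 for x, y in zip(xs, ys) if x > 1 and y > 1) / n
p_prod = (sum(1 for x in xs if x > 1) / n) * (sum(1 for y in ys if y > 1) / n)

print(f"sample Cov(X, Y) = {cov:+.4f}")
print(f"P(X>1, Y>1) = {p_joint:.3f}  vs  P(X>1)P(Y>1) = {p_prod:.3f}")
```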
But what, more precisely, is the relationship between correlation (more generally, covariance) and independence? In this note I explain that relationship and show that a stronger covariance condition, combined with a regularity condition, does imply independence. Specifically, if two random variables have exponentially decaying tails and all of their mixed polynomial covariances vanish, then they must be independent. This result applies to many common families of distributions, including the Gaussian, Gamma, and Exponential families.
Notations. We work on a fixed probability space $(\Omega,\mathcal F,\mathbb P)$. For a random variable $X$, we denote its law by $\mu_X$. For a pair $(X,Y)$, $\mu_{(X,Y)}$ is the joint law and $\mu_X \otimes \mu_Y$ is the product measure. The covariance is $\operatorname{Cov}(X,Y) = \mathbb E[XY] - \mathbb E[X]\mathbb E[Y]$, and the moment generating function is $M_X(\theta) = \mathbb E[e^{\theta X}]$.
What Independence Means
Two random variables $X$ and $Y$ are independent if their joint distribution factorizes:
\[\mathbb P(X\in A, Y\in B) = \mathbb P(X\in A)\cdot\mathbb P(Y\in B)\]for all Borel sets $A,B\subset\mathbb R$. This is equivalent to expectation factorizing for all bounded measurable functions:
\[\mathbb E[f(X)g(Y)] = \mathbb E[f(X)]\cdot\mathbb E[g(Y)].\]To see the equivalence, note that if the joint law factorizes then integrating $f(x)g(y)$ against $\mu_X\otimes\mu_Y$ gives factorization by Fubini’s theorem. Conversely, taking $f=\mathbf 1_A$ and $g=\mathbf 1_B$ recovers the probabilistic definition.
The zero covariance condition $\mathbb E[XY] = \mathbb E[X]\mathbb E[Y]$ only gives factorization for linear functions, which is far weaker than what independence requires. This explains why the standard counterexample $Y = X^2$ works: factorization holds for linear functions but fails for nonlinear ones such as $f(x) = x^2$.
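For the counterexample, the failure is already visible at $f(x) = x^2$, $g(y) = y$: since $Y = X^2$, we have $\mathbb E[f(X)g(Y)] = \mathbb E[X^4] = 3$ while $\mathbb E[f(X)]\,\mathbb E[g(Y)] = \mathbb E[X^2]^2 = 1$. A small simulation sketch (seed arbitrary):

```python
import random

random.seed(1)
n = 500_000

xs = [random.gauss(0.0, 1.0) for _ in range(n)]

# f(x) = x^2, g(y) = y, with Y = X^2:
#   E[f(X)g(Y)] = E[X^4] = 3, but E[f(X)] * E[g(Y)] = E[X^2]^2 = 1.
e_fg = sum(x ** 4 for x in xs) / n
e_f = sum(x * x for x in xs) / n
e_g = e_f  # g(Y) = Y = X^2, so E[g(Y)] = E[X^2]

print(f"E[f(X)g(Y)] = {e_fg:.3f}   E[f(X)]E[g(Y)] = {e_f * e_g:.3f}")
```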
The Main Theorem
Theorem. Let $X$ and $Y$ be real random variables such that
- $\mathbb P(|X|>t) \le C_X e^{-c_X t}$ and $\mathbb P(|Y|>t) \le C_Y e^{-c_Y t}$ for all $t>0$ and some constants $C_X,c_X,C_Y,c_Y>0$,
- $\operatorname{Cov}(X^m,Y^n) = 0$ for all positive integers $m,n$.
Then $X$ and $Y$ are independent.
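As a sanity check of the hypotheses in the easy direction: independent Exponential(1) variables have exponential tails (with $C = c = 1$), and their mixed polynomial covariances should all vanish. A simulation sketch (sample size, exponents, and seed are arbitrary choices):

```python
import random

random.seed(2)
n = 400_000

# Two independent Exponential(1) samples: P(X > t) = e^{-t}.
xs = [random.expovariate(1.0) for _ in range(n)]
ys = [random.expovariate(1.0) for _ in range(n)]

def mean(v):
    return sum(v) / len(v)

# For independent draws, every Cov(X^m, Y^k) should be near 0.
covs = []
for m in (1, 2):
    for k in (1, 2):
        c = (mean([x**m * y**k for x, y in zip(xs, ys)])
             - mean([x**m for x in xs]) * mean([y**k for y in ys]))
        covs.append(c)
        print(f"Cov(X^{m}, Y^{k}) = {c:+.4f}")
```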
Proof Strategy
Our covariance hypothesis gives us $\mathbb E[X^m Y^n] = \mathbb E[X^m]\mathbb E[Y^n]$ for all positive integers $m,n$, which extends by linearity to all polynomials with real coefficients: $\mathbb E[p(X)q(Y)] = \mathbb E[p(X)]\mathbb E[q(Y)]$. To prove independence, we need to extend this factorization from polynomials to all bounded measurable functions.
One trick is to use moment generating functions. Our strategy is to show that the joint moment generating function $M_{(X,Y)}(\theta,\eta) = \mathbb E[e^{\theta X + \eta Y}]$ and the product $M_X(\theta) M_Y(\eta)$ both exist and are equal in a neighborhood of zero. By the standard uniqueness theorem for MGFs, this equality forces the distributions to be equal, which gives independence.
The proof proceeds in three steps. First, we show that exponential tails imply the moment generating function has positive radius of convergence. Second, we use the covariance condition to show the joint moment generating function factorizes. Finally, we invoke the standard uniqueness theorem for MGFs to conclude that equality of MGFs implies equality of distributions.
Proof of the Main Theorem
First, we will prove three lemmas.
Lemma 1. For all $n \ge 0$, we have $\mathbb E[|X|^n] \le C_X n! c_X^{-n}$. Moreover, $M_X(\theta)$ converges absolutely for all $|\theta| < c_X$.
Proof. For $n = 0$ the bound reads $1 \le C_X$, which we may assume after enlarging $C_X$ to $\max(C_X, 1)$; this preserves the tail bound. For $n \ge 1$, the layer-cake formula gives
\[\begin{align} \mathbb E[|X|^n] &= \int_0^\infty nt^{n-1}\mathbb P(|X| > t)\,dt \\ &\le \int_0^\infty nt^{n-1}C_Xe^{-c_Xt}\,dt \\ &= C_Xn!\,c_X^{-n} < \infty, \end{align}\]where the last step is the Gamma integral $\int_0^\infty nt^{n-1}e^{-c_Xt}\,dt = n!\,c_X^{-n}$. For all $|\theta| < c_X$, the Taylor series of the MGF converges absolutely by the moment bound:
\[\begin{align} \mathbb E[e^{|\theta||X|}] &= \sum_{n=0}^\infty \frac{|\theta|^n}{n!}\mathbb E[|X|^n] \\ &\le 1 + C_X\sum_{n=1}^\infty \left(\frac{|\theta|}{c_X}\right)^n < \infty, \end{align}\]where the interchange of expectation and sum is justified by Tonelli's theorem, since all terms are nonnegative. This domination gives absolute convergence of $M_X(\theta) = \sum_{n=0}^\infty \frac{\theta^n}{n!}\mathbb E[X^n]$ for $|\theta| < c_X$. $\square$

By Lemma 1, $M_X(\theta)$ converges for $|\theta| < c_X$, and by the same argument $M_Y(\eta)$ converges for $|\eta| < c_Y$. We now verify that the joint MGF also converges in a rectangle around the origin. For $|\theta| < c_X/2$ and $|\eta| < c_Y/2$, the Cauchy-Schwarz inequality gives
\[\begin{align} \mathbb E[e^{\theta X + \eta Y}] &\le \mathbb E[e^{|\theta||X|+|\eta||Y|}] \\ &\le \sqrt{\mathbb E[e^{2|\theta||X|}]\mathbb E[e^{2|\eta||Y|}]} < \infty, \end{align}\]where the right side is finite since $2|\theta| < c_X$ and $2|\eta| < c_Y$, so $\mathbb E[e^{2|\theta||X|}] < \infty$ and $\mathbb E[e^{2|\eta||Y|}] < \infty$ by Lemma 1. Thus the joint MGF $M_{(X,Y)}(\theta,\eta) = \mathbb E[e^{\theta X + \eta Y}]$ converges for all $|\theta| < c_X/2$ and $|\eta| < c_Y/2$. The covariance condition now implies factorization of the joint MGF.
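Lemma 1 can be sanity-checked on the standard Laplace distribution, for which $\mathbb P(|X| > t) = e^{-t}$ (so $C_X = c_X = 1$) and the MGF is known in closed form: $M_X(\theta) = 1/(1-\theta^2)$ for $|\theta| < 1$, exactly the interval of convergence the lemma predicts. A numerical-integration sketch (integration range and step count are arbitrary choices):

```python
import math

def laplace_mgf(theta, lo=-40.0, hi=40.0, steps=200_000):
    """Numerically integrate E[e^{theta X}] against the standard
    Laplace density f(x) = 0.5 * exp(-|x|) by the trapezoid rule."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        x = lo + i * h
        w = 0.5 if i in (0, steps) else 1.0
        total += w * 0.5 * math.exp(-abs(x)) * math.exp(theta * x)
    return total * h

theta = 0.5
# Closed form: M(theta) = 1 / (1 - theta^2) = 4/3 at theta = 0.5.
print(laplace_mgf(theta), 1.0 / (1.0 - theta**2))
```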
Lemma 2. For $|\theta| < c_X/2$ and $|\eta| < c_Y/2$, we have $\mathbb E[e^{\theta X}e^{\eta Y}] = \mathbb E[e^{\theta X}]\mathbb E[e^{\eta Y}]$.
Proof. For $|\theta| < c_X/2$ and $|\eta| < c_Y/2$, we may expand $e^{\theta X} = \sum_{m=0}^\infty \frac{\theta^m}{m!}X^m$ and $e^{\eta Y} = \sum_{n=0}^\infty \frac{\eta^n}{n!}Y^n$. The partial sums are dominated by the integrable function $e^{|\theta||X|+|\eta||Y|}$, which has finite expectation as shown above. By dominated convergence,
\[\begin{align} \mathbb E[e^{\theta X}e^{\eta Y}] &= \mathbb E\left[\sum_{m=0}^\infty \frac{\theta^m}{m!}X^m \sum_{n=0}^\infty \frac{\eta^n}{n!}Y^n\right] \\ &= \sum_{m=0}^\infty\sum_{n=0}^\infty \frac{\theta^m\eta^n}{m!n!}\mathbb E[X^m Y^n] \\ &= \sum_{m=0}^\infty\sum_{n=0}^\infty \frac{\theta^m\eta^n}{m!n!}\mathbb E[X^m]\mathbb E[Y^n] \\ &= \mathbb E[e^{\theta X}]\mathbb E[e^{\eta Y}], \end{align}\]where the third equality uses the factorization $\mathbb E[X^m Y^n] = \mathbb E[X^m]\mathbb E[Y^n]$ for all $m,n \ge 0$. For $m,n \ge 1$, this follows from the covariance assumption $\operatorname{Cov}(X^m,Y^n) = 0$. For $m=0$ or $n=0$, the factorization is trivial since $X^0 = Y^0 = 1$, giving $\mathbb E[X^0Y^n] = \mathbb E[Y^n] = \mathbb E[X^0]\mathbb E[Y^n]$ and $\mathbb E[X^mY^0] = \mathbb E[X^m] = \mathbb E[X^m]\mathbb E[Y^0]$. $\square$
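The MGF factorization of Lemma 2 can be observed numerically in the independent case. For independent standard normals, $\mathbb E[e^{\theta X + \eta Y}]$ should match $\mathbb E[e^{\theta X}]\mathbb E[e^{\eta Y}] = e^{\theta^2/2}e^{\eta^2/2}$. A Monte Carlo sketch (the choices of $\theta$, $\eta$, sample size, and seed are arbitrary):

```python
import math
import random

random.seed(3)
n = 300_000
theta, eta = 0.4, -0.7

# Independent standard normals.
xs = [random.gauss(0.0, 1.0) for _ in range(n)]
ys = [random.gauss(0.0, 1.0) for _ in range(n)]

joint = sum(math.exp(theta * x + eta * y) for x, y in zip(xs, ys)) / n
prod = ((sum(math.exp(theta * x) for x in xs) / n)
        * (sum(math.exp(eta * y) for y in ys) / n))
exact = math.exp(0.5 * (theta**2 + eta**2))  # e^{theta^2/2} * e^{eta^2/2}

print(f"joint MGF = {joint:.4f}, product = {prod:.4f}, exact = {exact:.4f}")
```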
The final step is to show that factorization of the MGF implies factorization of the distribution.
Lemma 3. If $M_{(X,Y)}(\theta,\eta)$ and $M_X(\theta)M_Y(\eta)$ both exist and are equal for all $|\theta| < c_X/2$ and $|\eta| < c_Y/2$, then $\mu_{(X,Y)} = \mu_X \otimes \mu_Y$.
Proof. This is a standard result in probability theory. See Khoshnevisan, Moment-generating functions and independence, Theorem 6 for a proof. $\square$
We can now complete the proof of the main theorem. By Lemma 2, for $|\theta| < c_X/2$ and $|\eta| < c_Y/2$, we have
\[M_{(X,Y)}(\theta,\eta) = M_X(\theta)M_Y(\eta).\]By Lemma 3, this implies $\mu_{(X,Y)} = \mu_X \otimes \mu_Y$, so $X$ and $Y$ are independent. $\square$
