Central Limit Theorem, Normal Distribution and Entropy


A simple form of the Central Limit Theorem:

For independent, identically distributed (i.i.d.) random vectors Y_{1},Y_{2},\cdots, let \bar{Y}_{n}=\sum_{i=1}^{n}Y_{i}/n.

If E\|Y_{1}\|^{2}<\infty, then \sqrt{n}(\bar{Y}_{n}-EY_{1})\to N(0,Cov\,Y_{1}) in distribution.
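As a quick sanity check, here is a minimal simulation sketch (my own illustration, not from the references), assuming i.i.d. two-dimensional Y_{i} with independent Exponential(1) coordinates, so that EY_{1}=(1,1) and Cov\,Y_{1}=I:

```python
# Scaled, centered sample means of i.i.d. Exponential(1) vectors should have
# mean close to (0, 0) and covariance close to I, matching the CLT statement.
import numpy as np

rng = np.random.default_rng(0)
n, trials = 2000, 1000

samples = rng.exponential(scale=1.0, size=(trials, n, 2))

# sqrt(n) * (sample mean - true mean) for each trial
z = np.sqrt(n) * (samples.mean(axis=1) - 1.0)

print("empirical mean:", z.mean(axis=0))   # close to (0, 0)
print("empirical cov:\n", np.cov(z.T))     # close to Cov(Y_1) = I
```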

When I first learned the central limit theorem, I was curious why the sum of a large number of random variables is distributed approximately as a normal/Gaussian distribution. The answer is simple: that is where the normal distribution comes from. The normal distribution was found when people first tried to find the limit of the sum of a large number of random variables. In other words, the normal distribution is defined as the limiting distribution of a (properly normalized) sum of a large number of random variables.

First, using the i.i.d. condition, it is easy to check that E(\bar{Y}_{n}-EY_{1})=0 and Cov(\bar{Y}_{n}-EY_{1})=\frac{Cov\,Y_{1}}{n}.
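To spell out the covariance step: by independence the cross-covariances vanish, so Cov(\bar{Y}_{n}-EY_{1})=Cov(\bar{Y}_{n})=\frac{1}{n^{2}}\sum_{i=1}^{n}Cov(Y_{i})=\frac{n\,Cov\,Y_{1}}{n^{2}}=\frac{Cov\,Y_{1}}{n}. The fluctuations of \bar{Y}_{n} around EY_{1} shrink at rate 1/n, which is exactly what the \sqrt{n} scaling in the theorem compensates for.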

Before we move on, let's review the basic idea of entropy. For a discrete source s (say, a collection of words \omega with probabilities p(\omega)), the entropy is H(s)=-\sum_{\omega\in s}p(\omega)\log p(\omega). Entropy is a measure of uncertainty: the less information we have about something, the larger its entropy will be.
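As a concrete illustration (the function name and example strings below are mine, not from the references), a minimal Python sketch that computes this entropy from symbol counts:

```python
# H(s) = -sum_w p(w) log p(w), with p(w) estimated from the empirical
# frequency of each symbol w in s (base-2 logs, so the result is in bits).
import math
from collections import Counter

def entropy(symbols):
    counts = Counter(symbols)
    total = sum(counts.values())
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())

print(entropy("aaaa"))      # 0.0 bits: a single symbol, no uncertainty
print(entropy("aabb"))      # 1.0 bit : two equally likely symbols
print(entropy("abcdefgh"))  # 3.0 bits: eight equally likely symbols
```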

Now, setting the central limit theorem aside for a while, let's imagine what the limiting distribution of a large number of random variables could be.

Note that \bar{Y}_{n} is a sample mean. The more terms we average, the more individual characteristics are lost. Averaging hides individual characteristics, makes individuals indistinguishable, and hence reduces the information we have. So it is natural that the more we average, the larger the entropy becomes.
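A rough numerical illustration of this intuition (my own sketch, assuming SciPy's sample-based estimator scipy.stats.differential_entropy, available since SciPy 1.6, with Exponential(1) summands chosen purely for convenience). Every standardized mean below has variance 1, so the estimated entropies are directly comparable, and they climb toward the Gaussian value \frac{1}{2}\log(2\pi e)\approx 1.419 as n grows:

```python
# Estimate the differential entropy of sqrt(n) * (Ybar_n - E[Y_1]) for growing n.
# Each standardized mean has variance 1, so the entropies are comparable.
import numpy as np
from scipy.stats import differential_entropy

rng = np.random.default_rng(0)
trials = 20000

for n in (1, 2, 5, 20, 100):
    z = np.sqrt(n) * (rng.exponential(1.0, size=(trials, n)).mean(axis=1) - 1.0)
    print(n, differential_entropy(z))        # increases with n

print("Gaussian limit:", 0.5 * np.log(2 * np.pi * np.e))  # ~1.4189
```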

OK, that's enough. From the point above, we can expect that the limiting distribution has a large entropy. In other words, the normal distribution should have a large entropy.

The fact is that:

Theorem 8.6.5 ([1], P254)

Let the random vector X\in R^{n} have zero mean and covariance K=EXX^{t}. Then h(X)\leq\frac{1}{2}\log\left((2\pi e)^{n}|K|\right), with equality iff X\sim N(0,K).
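A tiny one-dimensional check of this bound (my own numbers, not from [1]): take n=1, K=\sigma^{2}, and compare the Gaussian entropy \frac{1}{2}\log(2\pi e\sigma^{2}) with a uniform distribution of the same variance, whose entropy is \frac{1}{2}\log(12\sigma^{2}):

```python
# Uniform(-a, a) with a = sqrt(3*sigma2) has variance sigma2 and entropy
# log(2a) = 0.5*log(12*sigma2), strictly below the Gaussian value
# 0.5*log(2*pi*e*sigma2) because 12 < 2*pi*e ~ 17.08.
import math

sigma2 = 2.0                                               # illustrative variance
h_gauss   = 0.5 * math.log(2 * math.pi * math.e * sigma2)  # equality case of the bound
h_uniform = 0.5 * math.log(12 * sigma2)                    # same variance, smaller entropy

print(h_gauss, h_uniform)   # h_uniform < h_gauss, as the theorem predicts
```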

Theorem 8.6.6 ([1], P255, Estimation error and differential entropy)

For any random variable X and estimator \hat{X}, E(X-\hat{X})^{2}\geq\frac{1}{2\pi e}e^{2h(X)}, with equality if and only if X is Gaussian and \hat{X} is the mean of X.

Corollary: Given side information Y and estimator \hat{X}(Y), it follows that E(X-\hat{X}(Y))^{2}\geq\frac{1}{2\pi e}e^{2h(X|Y)}.
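A matching check of Theorem 8.6.6 with the same two one-dimensional examples (again my own illustrative numbers): the best constant estimator \hat{X}=EX achieves E(X-\hat{X})^{2}=Var(X), which meets the bound \frac{1}{2\pi e}e^{2h(X)} exactly in the Gaussian case and exceeds it in the uniform case:

```python
# For a fixed variance, compute the entropy-based lower bound on the squared
# estimation error for a Gaussian X and for a uniform X with the same variance.
import math

var = 2.0                       # Var(X) = E(X - EX)^2, achieved by X_hat = EX
two_pi_e = 2 * math.pi * math.e

bound_gauss   = math.exp(2 * 0.5 * math.log(two_pi_e * var)) / two_pi_e  # = var (equality)
bound_uniform = math.exp(2 * 0.5 * math.log(12 * var)) / two_pi_e        # < var (strict)

print(var, bound_gauss)
print(var, bound_uniform)
```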

8a.6 ([2], P532) N_{p} as a Distribution with Maximum Entropy

The multivariate normal distribution N_{p} has the maximum entropy H=-\int P(U)\log P(U)\,dv subject to the condition that the mean and covariance are fixed.

For a given mean and covariance, the Gaussian distribution is the maximum entropy distribution. “It gives the lowest log likelihood to its members on average. That means Gaussian distribution is the safest assumption when the true distribution is unknown” ([3]). That also explains why so many people tend to “abuse” the Gaussian assumption.

References:

[1] Elements of Information Theory, Second Edition, Thomas M. Cover and Joy A. Thomas, 2006

[2] Linear Statistical Inference and Its Applications, Second Edition, C. Radhakrishna Rao, 2002

[3] A Short Introduction to Model Selection, Kolmogorov Complexity and Minimum Description Length (MDL), Volker Nannen, 2003


Responses to Central Limit Theorem, Normal Distribution and Entropy

  1. San Diego says:

    Bruce, is it possible to prove the Cramer-Rao Limit using calculus of variations?

    • Bruce Zhou says:

      Maybe, I don’t know. I know only a little about calculus of variations. There is an old book on the applications of calculus of variations in Statistics (Variational Methods in Statistics by Jagdish S. Rustagi, 1976). For the C-R bound, most proofs I’ve seen are based on the Cauchy-Schwarz inequality.

