Engineers are familiar with mathematical convergence - that the partial sums of a series approach some limit as the number of terms increases.

We are less familiar with the analogous statistical concept of "**convergence in
distribution**," where the limit isn't a single value: instead, the sequence of
random variables itself comes to behave like draws from some specific
distribution. The classic example is the central limit theorem. Further examples
are illustrated here, with the dotted arrows indicating asymptotic relationships.
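
The central limit theorem can be watched in action with a few lines of Python (the sample sizes and seed below are illustrative, not prescriptive): standardized sums of uniform draws behave like draws from N(0,1).

```python
import random
import statistics

random.seed(42)

# Sum n i.i.d. Uniform(0,1) draws; by the CLT the standardized sum
# converges in distribution to N(0,1) as n grows.
n, trials = 30, 20000
mu, sigma = n * 0.5, (n / 12) ** 0.5  # mean and sd of the sum of n uniforms

z = [(sum(random.random() for _ in range(n)) - mu) / sigma
     for _ in range(trials)]

print(round(statistics.mean(z), 2))   # near 0
print(round(statistics.stdev(z), 2))  # near 1
# Roughly 68% of draws should fall within one standard deviation of 0:
print(round(sum(abs(v) < 1 for v in z) / trials, 2))
```

No single standardized sum equals any particular value; it is the *distribution* of the sums that settles down.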

"**Convergence in probability**" is not quite the same as convergence in
distribution. Convergence in probability says that a random variable converges
to a known value (or to another random variable): in the limit, (r.v. - value) → 0,
or (r.v. - other r.v.) → 0, with probability approaching one.
**Convergence in distribution** says only that two random variables behave the
same way (but aren't the same value). Clearly, if X has a standard normal
density, X~N(0,1), and Y, too, has a standard normal density, Y~N(0,1), then the
difference between a random draw from X and a random draw from Y is not equal to
zero, X-Y ≠ 0.
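
A small simulation makes the distinction concrete (sample size and seed are illustrative): X and Y are equal *in distribution*, yet X - Y is nowhere near the constant 0 - it is itself a random variable with variance 2.

```python
import random
import statistics

random.seed(0)

# X and Y are equal in distribution (both N(0,1)), but individual draws
# are not equal: X - Y is distributed N(0, 2), not the constant 0.
diffs = [random.gauss(0, 1) - random.gauss(0, 1) for _ in range(20000)]

print(round(statistics.mean(diffs), 2))   # centered near 0 on average...
print(round(statistics.stdev(diffs), 2))  # ...but with sd near sqrt(2) ≈ 1.41
```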

Still other examples of convergence in distribution are the extreme value distributions.
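
For instance, a quick sketch (the exponential parent distribution and sample sizes are my choice, for checkability): maxima of many i.i.d. exponential draws, suitably shifted, converge in distribution to a Gumbel extreme value distribution.

```python
import math
import random
import statistics

random.seed(1)

# The maximum of n i.i.d. Exponential(1) draws, shifted by log(n),
# converges in distribution to a standard Gumbel as n grows.
n, trials = 200, 5000
maxima = [max(random.expovariate(1.0) for _ in range(n)) - math.log(n)
          for _ in range(trials)]

print(round(statistics.mean(maxima), 2))   # near Euler-Mascheroni ≈ 0.58
print(round(statistics.stdev(maxima), 2))  # near pi / sqrt(6) ≈ 1.28
```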

So what? In practical applications, simple, direct-sampling* Monte Carlo simulation may not be up to the task of producing draws from the target joint density, even when that density is correctly specified. (Sadly, many engineering MC simulations rely on an inadequate correlation coefficient or, worse, ignore dependencies among variables altogether.)

Recent advances** in computational statistics take advantage of convergence in
distribution to simulate the often complicated joint density by **sampling
directly from the joint probability density itself (!)** These are iterative,
rather than direct, sampling methods. It can be shown that, under suitable
conditions, the sequence of samples ultimately becomes **ergodic*****, with
elements of the sequence converging in distribution, thus representing samples
from the desired joint probability density.
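
The best-known of these iterative methods is the random-walk Metropolis algorithm. A minimal sketch, with a standard normal chosen as the target purely so the result is easy to check (proposal scale, chain length, and burn-in are illustrative):

```python
import math
import random
import statistics

random.seed(7)

# Random-walk Metropolis: an iterative sampler whose draws converge in
# distribution to the target density -- here N(0,1) for easy checking.
def log_target(x):
    return -0.5 * x * x  # log of the N(0,1) density, up to a constant

x, chain = 0.0, []
for _ in range(60000):
    prop = x + random.gauss(0, 1.0)          # symmetric proposal
    if math.log(random.random()) < log_target(prop) - log_target(x):
        x = prop                             # accept; otherwise keep x
    chain.append(x)

samples = chain[10000:]                      # discard burn-in
print(round(statistics.mean(samples), 2))    # near 0
print(round(statistics.stdev(samples), 2))   # near 1
```

Early elements of the chain depend on the starting point; it is the later elements that converge in distribution to the target.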

Because they do not have to sample everywhere in the probability space, only
where the variables most probably reside, these methods are not fettered by the
problem of large dimensions (the Curse of Dimensionality).
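
The problem these methods sidestep is easy to demonstrate (the ball-in-a-cube geometry below is a standard illustration, not from the original text): the fraction of a sampling space occupied by the region of interest collapses as dimension grows, so naive space-filling sampling wastes nearly every draw.

```python
import random

random.seed(3)

# Fraction of uniform draws from the cube [-1,1]^d that land inside the
# unit ball: the "interesting" region shrinks rapidly with dimension d.
def hit_rate(d, trials=20000):
    hits = sum(sum(random.uniform(-1, 1) ** 2 for _ in range(d)) <= 1
               for _ in range(trials))
    return hits / trials

rates = {d: hit_rate(d) for d in (2, 5, 10)}
for d, r in rates.items():
    print(d, round(r, 4))
```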

* Direct-sampling methods attempt to sample from the entire probability space, and thus from the joint probability density of interest, usually by inverting the marginal cdfs.
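
A one-dimensional sketch of that inverse-cdf (inverse transform) idea, using an Exponential marginal chosen for its closed-form inverse (rate and sample size are illustrative):

```python
import math
import random
import statistics

random.seed(5)

# Inverse transform sampling: for Exponential(rate), the cdf is
# F(x) = 1 - exp(-rate*x), so F^{-1}(u) = -ln(1 - u) / rate.
rate = 2.0
draws = [-math.log(1.0 - random.random()) / rate for _ in range(20000)]

print(round(statistics.mean(draws), 2))  # near 1/rate = 0.5
```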

** Regrettably, many engineers view statistics as static, hidebound, if not moribund, and sort of a mathematical analog to Latin. This lamentably ignorant perspective does little to dispel an equally common view of engineer-as-buffoon held by many statisticians.

*** Time-dependent and other sequential processes are called **ergodic** if the
eventual distribution of states in the system does not depend on the starting
state, so the random sequence S_{m} from time = t_{n} to time = t_{n+m} does not
depend on *n*.
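
A two-state Markov chain shows the property directly (the transition probabilities below are illustrative): run from either starting state, the long-run distribution of states is the same.

```python
import random

random.seed(9)

# Two-state ergodic Markov chain: long-run behavior forgets the start.
P = {0: 0.9,   # from state 0: stay with prob 0.9, switch with prob 0.1
     1: 0.3}   # from state 1: switch to 0 with prob 0.3, stay with prob 0.7

def long_run_share(start, steps=100000):
    """Fraction of time spent in state 1, starting from `start`."""
    s, ones = start, 0
    for _ in range(steps):
        s = 0 if random.random() < (P[s] if s == 0 else P[s]) and s == 0 or (s == 1 and random.random() < 0) else s
        ones += s
    return ones / steps

def long_run_share(start, steps=100000):
    """Fraction of time spent in state 1, starting from `start`."""
    s, ones = start, 0
    for _ in range(steps):
        if s == 0:
            s = 0 if random.random() < 0.9 else 1
        else:
            s = 0 if random.random() < 0.3 else 1
        ones += s
    return ones / steps

a, b = long_run_share(0), long_run_share(1)
print(round(a, 2), round(b, 2))  # both near the stationary value 0.25
```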