Glivenko–Cantelli theorem


In the theory of probability, the Glivenko–Cantelli theorem, named after Valery Ivanovich Glivenko and Francesco Paolo Cantelli, determines the asymptotic behaviour of the empirical distribution function as the number of independent and identically distributed observations grows.

The uniform convergence of more general empirical measures becomes an important property of the Glivenko–Cantelli classes of functions or sets. The Glivenko–Cantelli classes arise in Vapnik–Chervonenkis theory, with applications to machine learning. Further applications can be found in econometrics, where M-estimators are used.

Statement

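Assume that $X_1, X_2, \dots$ are independent and identically distributed random variables in $\mathbb{R}$ with common cumulative distribution function $F(x)$. The empirical distribution function for $X_1, \dots, X_n$ is defined by
$$F_n(x) = \frac{1}{n} \sum_{i=1}^{n} I_{[X_i, \infty)}(x) = \frac{1}{n} \left| \{ i : X_i \le x,\ 1 \le i \le n \} \right|,$$
where $I_C$ is the indicator function of the set $C$. For every fixed $x$, $F_n(x)$ is a sequence of random variables which converges to $F(x)$ almost surely by the strong law of large numbers; that is, $F_n$ converges to $F$ pointwise. Glivenko and Cantelli strengthened this result by proving uniform convergence of $F_n$ to $F$.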
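As an illustration, the empirical distribution function of a finite sample can be evaluated by counting the observations that do not exceed a given point. The following Python sketch shows one way to do this; the function name `empirical_cdf` and the sample values are chosen here for illustration and are not part of the theorem.

```python
from bisect import bisect_right

def empirical_cdf(sample):
    """Return the empirical distribution function of `sample` as a callable."""
    ordered = sorted(sample)
    n = len(ordered)
    if n == 0:
        raise ValueError("sample must be non-empty")

    def F_n(x):
        # bisect_right counts the observations X_i with X_i <= x
        return bisect_right(ordered, x) / n

    return F_n

F_n = empirical_cdf([2.0, 0.5, 1.0, 1.0])
print(F_n(0.0), F_n(1.0), F_n(3.0))  # 0.0 0.75 1.0
```

Sorting once and using binary search keeps each evaluation at logarithmic cost, which is convenient when the function is evaluated at many points.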
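Theorem
$$\|F_n - F\|_\infty = \sup_{x \in \mathbb{R}} |F_n(x) - F(x)| \longrightarrow 0 \quad \text{almost surely.}$$
This theorem originates with Valery Glivenko and Francesco Cantelli, in 1933.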
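The uniform convergence can be observed numerically. The sketch below is a simulation, not a proof. It assumes observations drawn from the uniform distribution on [0, 1], whose distribution function is the identity on that interval, and the helper names `sup_distance` and `uniform_cdf` are illustrative. For a continuous distribution function, the supremum of the absolute difference is attained at, or just to the left of, a sample point, which is what the loop exploits.

```python
import random

def sup_distance(sample, cdf):
    """Exact value of sup_x |F_n(x) - F(x)| for a continuous cdf F."""
    ordered = sorted(sample)
    n = len(ordered)
    worst = 0.0
    for i, x in enumerate(ordered, start=1):
        fx = cdf(x)
        # F_n jumps from (i-1)/n to i/n at the i-th order statistic, so it is
        # enough to compare F with both values at every sample point.
        worst = max(worst, i / n - fx, fx - (i - 1) / n)
    return worst

def uniform_cdf(x):
    """Distribution function of Uniform(0, 1) (assumed data model)."""
    return min(1.0, max(0.0, x))

rng = random.Random(0)  # fixed seed so that the run is reproducible
for n in (10, 100, 1000, 10000):
    sample = [rng.random() for _ in range(n)]
    print(n, round(sup_distance(sample, uniform_cdf), 4))
```

The printed distances should tend to shrink as the sample size grows, although a single run is not guaranteed to be monotone.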
Remarks
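Proof
For simplicity, consider the case of a continuous random variable $X$. Fix $-\infty = x_0 < x_1 < \cdots < x_{m-1} < x_m = \infty$ such that $F(x_j) - F(x_{j-1}) = \frac{1}{m}$ for $j = 1, \dots, m$, where $F_n$ and $F$ are understood to take the value 0 at $-\infty$ and the value 1 at $\infty$. Now for all $x \in \mathbb{R}$ there exists $j \in \{1, \dots, m\}$ such that $x \in [x_{j-1}, x_j]$. Note that, because $F_n$ and $F$ are non-decreasing,
$$F_n(x) - F(x) \le F_n(x_j) - F(x_{j-1}) = F_n(x_j) - F(x_j) + \frac{1}{m},$$
$$F_n(x) - F(x) \ge F_n(x_{j-1}) - F(x_j) = F_n(x_{j-1}) - F(x_{j-1}) - \frac{1}{m}.$$
Therefore, almost surely,
$$\|F_n - F\|_\infty = \sup_{x \in \mathbb{R}} |F_n(x) - F(x)| \le \max_{j \in \{0, \dots, m\}} |F_n(x_j) - F(x_j)| + \frac{1}{m}.$$
Since $\max_{j \in \{0, \dots, m\}} |F_n(x_j) - F(x_j)| \to 0$ almost surely by the strong law of large numbers (the maximum is over finitely many points, at each of which $F_n(x_j) \to F(x_j)$ almost surely), for any $\varepsilon > 0$ and any integer $m$ with $\frac{1}{m} < \varepsilon$ we can, almost surely, find $N$ such that for all $n \ge N$
$$\max_{j \in \{0, \dots, m\}} |F_n(x_j) - F(x_j)| \le \varepsilon - \frac{1}{m}, \quad \text{and hence} \quad \|F_n - F\|_\infty \le \varepsilon,$$
which is the definition of almost sure convergence.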
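The key step of the argument, bounding the supremum over all points by a maximum over a finite grid plus the grid spacing 1/m, can be checked numerically. The sketch below assumes data from the uniform distribution on [0, 1], so that the grid points with equal probability increments are simply j/m; the function name `sup_and_grid_bound` is hypothetical.

```python
import random
from bisect import bisect_right

def sup_and_grid_bound(sample, m):
    """For Uniform(0, 1) data, return the exact sup distance and the grid bound."""
    ordered = sorted(sample)
    n = len(ordered)
    # Exact supremum: F_n jumps from (i-1)/n to i/n at the i-th order statistic.
    sup = max(max(i / n - x, x - (i - 1) / n)
              for i, x in enumerate(ordered, start=1))
    # Grid with F(x_j) = j/m. For Uniform(0, 1) this is x_j = j/m, and the
    # end points 0 and 1 stand in for -infinity and +infinity.
    grid_max = max(abs(bisect_right(ordered, j / m) / n - j / m)
                   for j in range(m + 1))
    return sup, grid_max + 1 / m

rng = random.Random(0)  # fixed seed so that the run is reproducible
sample = [rng.random() for _ in range(500)]
for m in (2, 10, 50):
    sup, bound = sup_and_grid_bound(sample, m)
    print(m, round(sup, 4), round(bound, 4))
```

In each printed row the first number should not exceed the second, up to floating-point rounding. A finer grid makes the 1/m term smaller, at the price of a maximum over more points.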

Empirical measures

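One can generalize the empirical distribution function by replacing the set $(-\infty, x]$ by an arbitrary set $C$ from a class of sets $\mathcal{C}$ to obtain an empirical measure indexed by sets $C \in \mathcal{C}$:
$$P_n(C) = \frac{1}{n} \sum_{i=1}^{n} I_C(X_i), \quad C \in \mathcal{C},$$
where $I_C(x)$ is the indicator function of each set $C$.
A further generalization is the map induced by $P_n$ on measurable real-valued functions $f$, which is given by
$$f \mapsto P_n f = \int_{\mathcal{S}} f \, dP_n = \frac{1}{n} \sum_{i=1}^{n} f(X_i), \quad f \in \mathcal{F}.$$
An important property of these classes is then whether the strong law of large numbers holds uniformly on $\mathcal{F}$ or $\mathcal{C}$.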
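Both generalizations are plain sample averages, as the following Python sketch illustrates. A set is represented by its indicator (a function returning True or False); the names `empirical_measure` and `empirical_integral`, the sample, and the interval are all illustrative choices.

```python
def empirical_measure(sample, indicator):
    """Fraction of observations that fall in the set described by `indicator`."""
    return sum(1 for x in sample if indicator(x)) / len(sample)

def empirical_integral(sample, f):
    """Average of f over the observations, i.e. the integral of f against the empirical measure."""
    return sum(f(x) for x in sample) / len(sample)

sample = [0.2, 0.9, 0.4, 0.7]

def in_interval(x):
    # the set C = [0.3, 0.8]
    return 0.3 <= x <= 0.8

print(empirical_measure(sample, in_interval))        # 0.5
print(empirical_integral(sample, lambda x: x * x))   # sample mean of x^2
```

Choosing the indicator of a half-line, for example `lambda t: t <= 0.5`, recovers the value of the empirical distribution function at that point.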

Glivenko–Cantelli class

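Consider a set $\mathcal{S}$ with a sigma algebra of Borel subsets $A$ and a probability measure $P$. For a class of subsets,
$$\mathcal{C} \subset \{ C : C \text{ is a measurable subset of } \mathcal{S} \}$$
and a class of functions
$$\mathcal{F} \subset \{ f : \mathcal{S} \to \mathbb{R},\ f \text{ is measurable} \}$$
define random variables
$$\|P_n - P\|_{\mathcal{C}} = \sup_{C \in \mathcal{C}} |P_n(C) - P(C)|,$$
$$\|P_n - P\|_{\mathcal{F}} = \sup_{f \in \mathcal{F}} |P_n f - P f|,$$
where $P_n(C)$ is the empirical measure, $P_n f$ is the corresponding map, and
$$P f = \int_{\mathcal{S}} f \, dP = \mathbb{E} f,$$
assuming that it exists.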
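For a concrete feel for the supremum over a class of functions, the sketch below evaluates it for a small finite class under an assumed model: observations uniform on [0, 1] and the monomials x, x^2, ..., x^K, whose exact integrals are 1/(k+1). Because the class is finite, the convergence to zero already follows from the ordinary strong law of large numbers, so this only illustrates the quantity itself; the function name `sup_deviation_monomials` is hypothetical.

```python
import random

def sup_deviation_monomials(sample, max_degree):
    """Largest gap between sample mean and exact mean of x**k, k = 1..max_degree,
    under the assumption that the data are Uniform(0, 1)."""
    n = len(sample)
    worst = 0.0
    for k in range(1, max_degree + 1):
        empirical = sum(x ** k for x in sample) / n  # sample average of x^k
        exact = 1.0 / (k + 1)                        # integral of x^k over [0, 1]
        worst = max(worst, abs(empirical - exact))
    return worst

rng = random.Random(0)  # fixed seed so that the run is reproducible
for n in (100, 1000, 10000):
    sample = [rng.random() for _ in range(n)]
    print(n, round(sup_deviation_monomials(sample, 5), 4))
```

For infinite classes the supremum generally cannot be computed by enumeration, and whether it tends to zero depends on the richness of the class.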
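Definitions
A class of sets $\mathcal{C}$ is called a Glivenko–Cantelli class (or GC class) with respect to a probability measure $P$ if $\sup_{C \in \mathcal{C}} |P_n(C) - P(C)| \to 0$ almost surely as $n \to \infty$. A class of functions $\mathcal{F}$ is called a Glivenko–Cantelli class in the same way, with $\sup_{f \in \mathcal{F}} |P_n f - P f|$ in place of the supremum over sets.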
Theorem

Examples