To define the Hellinger distance in terms of measure theory, letP and Qdenote two probability measures that are absolutely continuouswith respect to a third probability measure λ. The square of the Hellinger distance between P and Q is defined as the quantity Here, dP / dλ and dQ / dλ are the Radon–Nikodym derivatives of P and Q respectively. This definition does not depend on λ, so the Hellinger distance between P and Q does not change if λ is replaced with a different probability measure with respect to which both P and Q are absolutely continuous. For compactness, the above formula is often written as
To define the Hellinger distance in terms of elementary probability theory, we take λ to be Lebesgue measure, so that dP / dλ and dQ / dλ are simply probability density functions. If we denote the densities as f and g, respectively, the squared Hellinger distance can be expressed as a standard calculus integral where the second form can be obtained by expanding the square and using the fact that the integral of a probability density over its domain equals 1. The Hellinger distance H satisfies the property
Discrete distributions
For two discrete probability distributions and, their Hellinger distance is defined as which is directly related to the Euclidean norm of the difference of the square root vectors, i.e. Also,
Properties
The Hellinger distance forms a bounded metric on the space of probability distributions over a given probability space. The maximum distance 1 is achieved when P assigns probability zero to every set to which Q assigns a positive probability, and vice versa. Sometimes the factor in front of the integral is omitted, in which case the Hellinger distance ranges from zero to the square root of two. The Hellinger distance is related to the Bhattacharyya coefficient as it can be defined as Hellinger distances are used in the theory ofsequential and asymptotic statistics. The squared Hellinger distance between two normal distributions and is: The squared Hellinger distance between two multivariate normal distributions and is The squared Hellinger distance between two exponential distributions and is: The squared Hellinger distance between two Weibull distributions and : The squared Hellinger distance between two Poisson distributions with rate parameters and, so that and, is: The squared Hellinger distance between two Beta distributions and is: where is the Beta function.
The Hellinger distance and the total variation distance are related as follows: These inequalities follow immediately from the inequalities between the 1-norm and the 2-norm.