Evidence lower bound


In statistics, the evidence lower bound (abbreviated ELBO) is the quantity optimized in variational Bayesian methods. These methods handle cases where a distribution $Q$ over unobserved variables $Z$ is optimized as an approximation to the true posterior $P(Z \mid X)$, given observed data $X$. Then the evidence lower bound is defined as

$$\operatorname{ELBO}(Q) := H(Q) - H(Q; P(\cdot, X)) = \mathbb{E}_{Z \sim Q}\!\left[\ln \frac{P(Z, X)}{Q(Z)}\right],$$

where $H(Q; P(\cdot, X)) = -\mathbb{E}_{Z \sim Q}[\ln P(Z, X)]$ is cross entropy and $H(Q)$ is the entropy of $Q$. Maximizing the evidence lower bound minimizes $D_{\mathrm{KL}}(Q \parallel P(\cdot \mid X))$, the Kullback–Leibler divergence, a measure of dissimilarity of $Q$ from the true posterior. The primary reason why this quantity is preferred for optimization is that it can be computed without access to the posterior, given a good choice of $Q$.
For other measures of dissimilarity to be optimized to fit $Q$, see Divergence (statistics).
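
Because the ELBO is an expectation under $Q$ of quantities involving only the joint $P(Z, X)$ and $Q$ itself, it can be estimated by plain Monte Carlo sampling from $Q$. The following Python sketch illustrates this on a toy conjugate model; the model $Z \sim N(0, 1)$, $X \mid Z \sim N(Z, 1)$, the Gaussian variational family, and the function name elbo_estimate are illustrative assumptions made here, not part of the article.

# Monte Carlo sketch of the ELBO for an assumed toy model:
#   prior       Z ~ N(0, 1)
#   likelihood  X | Z ~ N(Z, 1), one observation x
#   variational family Q = N(mu_q, sigma_q^2)
import numpy as np
from scipy.stats import norm

def elbo_estimate(x, mu_q, sigma_q, num_samples=200_000, seed=0):
    rng = np.random.default_rng(seed)
    z = rng.normal(mu_q, sigma_q, size=num_samples)                 # Z ~ Q
    log_joint = norm.logpdf(z, 0.0, 1.0) + norm.logpdf(x, z, 1.0)   # ln P(Z, X)
    log_q = norm.logpdf(z, mu_q, sigma_q)                           # ln Q(Z)
    return np.mean(log_joint - log_q)                               # E_Q[ln P(Z,X)/Q(Z)]

x = 1.0
# For this conjugate model the exact posterior is N(x/2, 1/2) and the
# log-evidence is ln N(x; 0, 2); the ELBO attains it when Q equals the posterior.
print(elbo_estimate(x, x / 2, np.sqrt(0.5)))   # ≈ exact log-evidence
print(norm.logpdf(x, 0.0, np.sqrt(2.0)))       # exact log-evidence, for comparison
print(elbo_estimate(x, 0.0, 1.0))              # a poorer Q yields a smaller value

Note that nothing in this estimator requires the normalized posterior $P(Z \mid X)$; only the joint $P(Z, X)$ and $Q$ are evaluated, which is the point of the remark above.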

Justification as a lower bound on the evidence

The name evidence lower bound is justified by analyzing a decomposition of the KL-divergence between the true posterior and $Q$:

$$D_{\mathrm{KL}}(Q \parallel P(\cdot \mid X)) = \mathbb{E}_{Z \sim Q}\!\left[\ln \frac{Q(Z)}{P(Z \mid X)}\right] = \mathbb{E}_{Z \sim Q}\!\left[\ln \frac{Q(Z)}{P(Z, X)}\right] + \ln P(X) = \ln P(X) - \operatorname{ELBO}(Q).$$

Rearranging and using the non-negativity of the KL-divergence gives

$$\ln P(X) = \operatorname{ELBO}(Q) + D_{\mathrm{KL}}(Q \parallel P(\cdot \mid X)) \geq \operatorname{ELBO}(Q).$$

As this equation shows, the evidence lower bound is indeed a lower bound on the log-evidence $\ln P(X)$ for the model considered. As $\ln P(X)$ does not depend on $Q$, this equation additionally shows that maximizing the evidence lower bound on the right minimizes $D_{\mathrm{KL}}(Q \parallel P(\cdot \mid X))$, as claimed above.
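
The identity $\ln P(X) = \operatorname{ELBO}(Q) + D_{\mathrm{KL}}(Q \parallel P(\cdot \mid X))$ can be checked numerically on the same toy Gaussian model as above. The Python sketch below is again an illustration under assumed distributions, not a derivation from the article, and the helper name kl_gaussians is hypothetical.

# Sketch: check ln P(X) = ELBO(Q) + D_KL(Q || P(.|X)) on the assumed toy model
#   prior Z ~ N(0, 1), likelihood X | Z ~ N(Z, 1), one observation x.
import numpy as np
from scipy.stats import norm

def kl_gaussians(mu_q, sigma_q, mu_p, sigma_p):
    # Closed-form D_KL(N(mu_q, sigma_q^2) || N(mu_p, sigma_p^2)).
    return (np.log(sigma_p / sigma_q)
            + (sigma_q**2 + (mu_q - mu_p)**2) / (2 * sigma_p**2) - 0.5)

x = 1.0
mu_q, sigma_q = 0.3, 0.8                       # an arbitrary variational choice
post_mu, post_sigma = x / 2, np.sqrt(0.5)      # exact posterior N(x/2, 1/2)
log_evidence = norm.logpdf(x, 0.0, np.sqrt(2.0))

rng = np.random.default_rng(0)
z = rng.normal(mu_q, sigma_q, size=200_000)    # Z ~ Q
elbo = np.mean(norm.logpdf(z, 0.0, 1.0) + norm.logpdf(x, z, 1.0)
               - norm.logpdf(z, mu_q, sigma_q))
kl = kl_gaussians(mu_q, sigma_q, post_mu, post_sigma)

print(elbo + kl)        # agrees with the log-evidence up to Monte Carlo error
print(log_evidence)

Because the gap between the log-evidence and the ELBO is exactly this KL-divergence, maximizing the ELBO over the variational family drives $Q$ toward the true posterior.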