Hammersley–Clifford theorem


The Hammersley–Clifford theorem is a result in probability theory, mathematical statistics and statistical mechanics that gives necessary and sufficient conditions under which a strictly positive probability distribution can be represented as a that of events generated by a Markov network. It is the fundamental theorem of random fields. It states that a probability distribution that has a strictly positive mass or density satisfies one of the Markov properties with respect to an undirected graph G if and only if it is a Gibbs random field, that is, its density can be factorized over the cliques of the graph.
The relationship between Markov and Gibbs random fields was initiated by Roland Dobrushin and Frank Spitzer in the context of statistical mechanics. The theorem is named after John Hammersley and Peter Clifford, who proved the equivalence in an unpublished paper in 1971. Simpler proofs using the inclusion–exclusion principle were given independently by Geoffrey Grimmett, Preston and Sherman in 1973, with a further proof by Julian Besag in 1974.

Proof Outline

It is a trivial matter to show that a Gibbs random field satisfies every Markov property. As an example of this fact, see the following:
In the image to the right, a Gibbs random field over the provided graph has the form. If variables and are fixed, then the global Markov property requires that: , since forms a barrier between and.
With and constant, where and. This implies that.
To establish that every positive probability distribution that satisfies the local Markov property is also a Gibbs random field, the following lemma, which provides a means for combining different factorizations, needs to be proven:
Lemma 1
Let denote the set of all random variables under consideration, and let and denote arbitrary sets of variables.
If
for functions and, then there exist functions and such that
In other words, provides a template for further factorization of.
In order to use as a template to further factorize, all variables outside of need to be fixed. To this end, let be an arbitrary fixed assignment to the variables from . For an arbitrary set of variables, let denote the assignment restricted to the variables from .
Moreover, to factorize only, the other factors need to be rendered moot for the variables from. To do this, the factorization
will be re-expressed as
For each : is where all variables outside of have been fixed to the values prescribed by.
Let
and
for each so
What is most important is that when the values assigned to do not conflict with the values prescribed by, making "disappear" when all variables not in are fixed to the values from.
Fixing all variables not in to the values from gives
Since,
Letting
gives:
which finally gives:
Lemma 1 provides a means of combining two different factorizations of. The local Markov property implies that for any random variable, that there exists factors and such that:
where are the neighbors of node. Applying Lemma 1 repeatedly eventually factors into a product of clique potentials.
End of Proof