Maximum a posteriori estimation

In Bayesian statistics, a maximum a posteriori probability (MAP) estimate is an estimate of an unknown quantity that equals the mode of the posterior distribution. The MAP estimate can be used to obtain a point estimate of an unobserved quantity on the basis of empirical data. It is closely related to the method of maximum likelihood (ML) estimation, but employs an augmented optimization objective which incorporates a prior distribution over the quantity one wants to estimate. MAP estimation can therefore be seen as a regularization of ML estimation.

Description

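Assume that we want to estimate an unobserved population parameter $\theta$ on the basis of observations $x$. Let $f$ be the sampling distribution of $x$, so that $f(x \mid \theta)$ is the probability of $x$ when the underlying population parameter is $\theta$. Then the function
$$\theta \mapsto f(x \mid \theta)$$
is known as the likelihood function, and the estimate
$$\hat{\theta}_{\mathrm{MLE}}(x) = \underset{\theta}{\operatorname{arg\,max}}\ f(x \mid \theta)$$
is the maximum likelihood estimate of $\theta$.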
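Now assume that a prior distribution $g$ over $\theta$ exists. This allows us to treat $\theta$ as a random variable, as in Bayesian statistics. We can calculate the posterior distribution of $\theta$ using Bayes' theorem:
$$\theta \mapsto f(\theta \mid x) = \frac{f(x \mid \theta)\, g(\theta)}{\int_{\Theta} f(x \mid \vartheta)\, g(\vartheta)\, d\vartheta}$$
where $g$ is the density function of $\theta$ and $\Theta$ is the domain of $g$.
The method of maximum a posteriori estimation then estimates $\theta$ as the mode of the posterior distribution of this random variable:
$$\hat{\theta}_{\mathrm{MAP}}(x) = \underset{\theta}{\operatorname{arg\,max}}\ f(\theta \mid x) = \underset{\theta}{\operatorname{arg\,max}}\ \frac{f(x \mid \theta)\, g(\theta)}{\int_{\Theta} f(x \mid \vartheta)\, g(\vartheta)\, d\vartheta} = \underset{\theta}{\operatorname{arg\,max}}\ f(x \mid \theta)\, g(\theta)$$
The denominator of the posterior distribution is always positive and does not depend on $\theta$, and therefore plays no role in the optimization. Observe that the MAP estimate of $\theta$ coincides with the ML estimate when the prior $g$ is uniform (that is, a constant function).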
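As a minimal numerical sketch of the two statements above, the following Python snippet assumes a hypothetical coin-flip model that is not part of the text: k successes in n Bernoulli trials, with a Beta(a, b) prior on the parameter. All function and variable names are illustrative. The snippet maximizes the unnormalized posterior (likelihood times prior) on a grid, so the normalizing denominator is never computed, and it repeats the search with a uniform prior.

```python
import math

def log_likelihood(theta, k, n):
    # Bernoulli log-likelihood for k successes in n trials (constant term dropped).
    return k * math.log(theta) + (n - k) * math.log(1.0 - theta)

def log_beta_prior(theta, a, b):
    # Unnormalized log-density of a Beta(a, b) prior; a = b = 1 gives the uniform prior.
    return (a - 1.0) * math.log(theta) + (b - 1.0) * math.log(1.0 - theta)

def argmax_on_grid(log_objective, num_points=9999):
    # Evaluate the objective on an evenly spaced grid inside (0, 1) and return the maximizer.
    grid = [(i + 1) / (num_points + 1) for i in range(num_points)]
    return max(grid, key=log_objective)

k, n = 7, 10  # hypothetical data: 7 successes in 10 trials

# ML estimate: maximize the likelihood alone.
theta_mle = argmax_on_grid(lambda t: log_likelihood(t, k, n))

# MAP estimate with a uniform prior: the prior term is constant, so nothing changes.
theta_map_uniform = argmax_on_grid(
    lambda t: log_likelihood(t, k, n) + log_beta_prior(t, 1.0, 1.0)
)

# MAP estimate with an informative Beta(5, 5) prior centred on 0.5.
theta_map_informative = argmax_on_grid(
    lambda t: log_likelihood(t, k, n) + log_beta_prior(t, 5.0, 5.0)
)

print(theta_mle, theta_map_uniform, theta_map_informative)
```

Under these assumptions the first two values agree, while the third is pulled from the sample proportion toward the prior mean, matching the closed-form posterior mode (k + a - 1) / (n + a + b - 2) up to the grid resolution.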
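When the loss function is of the form
$$L(\theta, a) = \begin{cases} 0, & \text{if } |a - \theta| < c, \\ 1, & \text{otherwise}, \end{cases}$$
as $c$ goes to 0, the Bayes estimator approaches the MAP estimator, provided that the distribution of $\theta$ is quasi-concave. But generally a MAP estimator is not a Bayes estimator unless $\theta$ is discrete.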
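The limiting behaviour can be checked numerically. The sketch below assumes a hypothetical unimodal posterior proportional to a Beta(3, 9) density, whose mode is 0.2, and discretizes it on a grid. Under the interval loss, the Bayes estimate is the centre of the interval of half-width c that captures the most posterior mass. The function name, grid size and posterior are illustrative assumptions.

```python
def interval_loss_bayes_estimate(c, alpha=3.0, beta=9.0, num=2000):
    # Midpoint grid on (0, 1) with unnormalized Beta(alpha, beta) weights.
    grid = [(i + 0.5) / num for i in range(num)]
    weights = [t ** (alpha - 1.0) * (1.0 - t) ** (beta - 1.0) for t in grid]

    # Prefix sums let us read off the mass of any window of grid cells.
    prefix = [0.0]
    for w in weights:
        prefix.append(prefix[-1] + w)

    half = int(round(c * num))  # window half-width, in grid cells
    best_a, best_mass = None, -1.0
    for i, a in enumerate(grid):
        lo = max(0, i - half)
        hi = min(num, i + half + 1)
        mass = prefix[hi] - prefix[lo]  # approximate mass of (a - c, a + c)
        if mass > best_mass:
            best_a, best_mass = a, mass
    # Maximizing the captured mass is the same as minimizing the expected loss.
    return best_a

posterior_mode = (3.0 - 1.0) / (3.0 + 9.0 - 2.0)  # closed-form mode of Beta(3, 9)
est_wide = interval_loss_bayes_estimate(0.2)
est_narrow = interval_loss_bayes_estimate(0.005)
print(posterior_mode, est_wide, est_narrow)
```

With a wide interval the estimate sits to the right of the mode because this posterior is right-skewed; as the half-width shrinks, the estimate moves toward the mode.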

Computation

MAP estimates can be computed in several ways:

Analytically, when the mode of the posterior distribution can be given in closed form. This is the case when conjugate priors are used.
Via numerical optimization such as the conjugate gradient method or Newton's method. This usually requires first or second derivatives, which have to be evaluated analytically or numerically.
Via a modification of an expectation-maximization algorithm. This does not require derivatives of the posterior density.
Via a Monte Carlo method using simulated annealing.
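The first two approaches in the list above can be compared in a short sketch. It assumes a hypothetical Bernoulli model with a conjugate Beta(a, b) prior, for which the posterior mode has a closed form, and recovers the same value by applying Newton's method to the log-posterior using analytic first and second derivatives. The function names, starting point and damping rule are illustrative choices, not a reference implementation.

```python
def closed_form_map(k, n, a, b):
    # Mode of the Beta(k + a, n - k + b) posterior (valid when both exponents below are positive).
    return (k + a - 1.0) / (n + a + b - 2.0)

def newton_map(k, n, a, b, theta0=0.2, tol=1e-12, max_iter=100):
    # Log-posterior up to a constant: A * log(theta) + B * log(1 - theta).
    A = k + a - 1.0
    B = n - k + b - 1.0
    theta = theta0
    for _ in range(max_iter):
        grad = A / theta - B / (1.0 - theta)                  # first derivative
        hess = -A / theta ** 2 - B / (1.0 - theta) ** 2       # second derivative (negative)
        step = grad / hess
        theta_new = theta - step
        # Damp the step if it would leave the open interval (0, 1).
        while not (0.0 < theta_new < 1.0):
            step *= 0.5
            theta_new = theta - step
        theta = theta_new
        if abs(step) < tol:
            break
    return theta

k, n, a, b = 7, 10, 5.0, 5.0  # hypothetical data and prior
theta_closed = closed_form_map(k, n, a, b)
theta_newton = newton_map(k, n, a, b)
print(theta_closed, theta_newton)
```

For models without a conjugate prior the closed form is unavailable, and only the iterative route remains; the derivatives may then have to be approximated numerically.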

Limitations

While only mild conditions are required for MAP estimation to be a limiting case of Bayes estimation, MAP estimation is not very representative of Bayesian methods in general. This is because MAP estimates are point estimates, whereas Bayesian methods are characterized by the use of distributions to summarize data and draw inferences. Bayesian methods therefore tend to report the posterior mean or median instead, together with credible intervals. There are two reasons for this. First, these estimators are optimal under squared-error and linear-error loss respectively, which are more representative of typical loss functions. Second, the posterior distribution may not have a simple analytic form: in this case, the distribution can be simulated using Markov chain Monte Carlo techniques, while optimization to find its mode may be difficult or impossible.
[Figure: a density in which the highest mode is uncharacteristic of the majority of the distribution]
In many types of models, such as mixture models, the posterior may be multi-modal. In such a case, the usual recommendation is that one should choose the highest mode: this is not always feasible, nor in some cases even possible. Furthermore, the highest mode may be uncharacteristic of the majority of the posterior.
Finally, unlike ML estimators, the MAP estimate is not invariant under reparameterization. Switching from one parameterization to another involves introducing a Jacobian that impacts on the location of the maximum.
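The effect of the Jacobian can be seen in a small numerical sketch. It assumes a hypothetical posterior proportional to a Beta(3, 9) density in the original parameter, and a log-odds reparameterization phi = log(theta / (1 - theta)); both choices and all names are illustrative. The mode is located on a grid in each parameterization, and the log-odds mode is mapped back for comparison.

```python
import math

def sigmoid(phi):
    # Inverse of the log-odds transform: maps phi back to theta in (0, 1).
    return 1.0 / (1.0 + math.exp(-phi))

def log_post_theta(theta, alpha, beta):
    # Unnormalized log-density of the posterior in the theta parameterization.
    return (alpha - 1.0) * math.log(theta) + (beta - 1.0) * math.log(1.0 - theta)

def log_post_phi(phi, alpha, beta):
    # Density of phi = same density evaluated at theta(phi), times |d theta / d phi|.
    theta = sigmoid(phi)
    log_jacobian = math.log(theta) + math.log(1.0 - theta)  # d theta / d phi = theta (1 - theta)
    return log_post_theta(theta, alpha, beta) + log_jacobian

alpha, beta = 3.0, 9.0  # hypothetical posterior parameters

theta_grid = [i / 10000 for i in range(1, 10000)]
phi_grid = [-8.0 + 0.001 * i for i in range(16001)]

map_theta = max(theta_grid, key=lambda t: log_post_theta(t, alpha, beta))
map_phi = max(phi_grid, key=lambda p: log_post_phi(p, alpha, beta))
map_theta_via_phi = sigmoid(map_phi)

print(map_theta, map_theta_via_phi)
```

Under these assumptions the mode in the original parameterization is (alpha - 1) / (alpha + beta - 2) = 0.2, whereas the mode found in log-odds space maps back to alpha / (alpha + beta) = 0.25. The likelihood carries no Jacobian factor, which is why the ML estimate does not shift in this way.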
As an example of the difference between the Bayes estimators mentioned above and a MAP estimate, consider the case where there is a need to classify inputs $x$ as either positive or negative. Suppose there are just three possible hypotheses about the correct method of classification, $h_1$, $h_2$ and $h_3$, with posteriors 0.4, 0.3 and 0.3 respectively. Suppose that, given a new instance $x$, $h_1$ classifies it as positive, whereas the other two classify it as negative. Using the MAP estimate for the correct classifier, $h_1$, $x$ is classified as positive, whereas the Bayes estimators would average over all hypotheses and classify $x$ as negative.
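The arithmetic of this example can be written out in a few lines of Python. The names h1, h2 and h3 are labels chosen here for the three hypotheses; the posterior values and predictions are those stated in the paragraph above.

```python
# Posterior probability of each hypothesis and its prediction for the new instance.
posteriors = {"h1": 0.4, "h2": 0.3, "h3": 0.3}
predictions = {"h1": "positive", "h2": "negative", "h3": "negative"}

# MAP approach: commit to the single most probable hypothesis and use its prediction.
map_hypothesis = max(posteriors, key=posteriors.get)
map_label = predictions[map_hypothesis]

# Averaging approach: add up the posterior mass behind each label.
label_mass = {}
for hypothesis, probability in posteriors.items():
    label = predictions[hypothesis]
    label_mass[label] = label_mass.get(label, 0.0) + probability
averaged_label = max(label_mass, key=label_mass.get)

print(map_hypothesis, map_label)   # h1 positive
print(averaged_label)              # negative
```

The most probable single hypothesis carries only 0.4 of the posterior mass, while the two hypotheses that disagree with it carry 0.6 together, which is why the two approaches give different labels.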

Example

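Suppose that we are given a sequence $(x_1, \dots, x_n)$ of IID $N(\mu, \sigma_v^2)$ random variables and that a prior distribution of $\mu$ is given by $N(\mu_0, \sigma_m^2)$. We wish to find the MAP estimate of $\mu$. Note that the normal distribution is its own conjugate prior, so we will be able to find a closed-form solution analytically.
The function to be maximized is then given by
$$g(\mu)\, f(x \mid \mu) = \frac{1}{\sqrt{2\pi}\,\sigma_m} \exp\left(-\frac{1}{2}\left(\frac{\mu - \mu_0}{\sigma_m}\right)^2\right) \prod_{j=1}^{n} \frac{1}{\sqrt{2\pi}\,\sigma_v} \exp\left(-\frac{1}{2}\left(\frac{x_j - \mu}{\sigma_v}\right)^2\right),$$
which is equivalent to minimizing the following function of $\mu$:
$$\sum_{j=1}^{n} \left(\frac{x_j - \mu}{\sigma_v}\right)^2 + \left(\frac{\mu - \mu_0}{\sigma_m}\right)^2.$$
Thus, we see that the MAP estimator for $\mu$ is given by
$$\hat{\mu}_{\mathrm{MAP}} = \frac{\sigma_m^2\, n}{\sigma_m^2\, n + \sigma_v^2} \left(\frac{1}{n} \sum_{j=1}^{n} x_j\right) + \frac{\sigma_v^2}{\sigma_m^2\, n + \sigma_v^2}\, \mu_0 = \frac{\sigma_m^2 \left(\sum_{j=1}^{n} x_j\right) + \sigma_v^2\, \mu_0}{\sigma_m^2\, n + \sigma_v^2},$$
which turns out to be a linear interpolation between the prior mean and the sample mean weighted by their respective covariances.
The case of $\sigma_m \to \infty$ is called a non-informative prior and leads to an ill-defined prior probability distribution; in this case $\hat{\mu}_{\mathrm{MAP}} \to \hat{\mu}_{\mathrm{MLE}}$.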
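The closed-form estimator can be checked against a direct numerical minimization of the sum-of-squares objective. The sketch below uses a small made-up data set and hypothetical values for the noise standard deviation, prior mean and prior standard deviation; the golden-section search is one simple choice of one-dimensional minimizer, used here only for the check.

```python
import math

def map_mean(xs, sigma_v, mu_0, sigma_m):
    # Closed-form MAP estimate of the mean of a normal model with a normal prior.
    n = len(xs)
    return (sigma_m ** 2 * sum(xs) + sigma_v ** 2 * mu_0) / (sigma_m ** 2 * n + sigma_v ** 2)

def objective(mu, xs, sigma_v, mu_0, sigma_m):
    # Function of mu whose minimizer is the MAP estimate.
    data_term = sum(((x - mu) / sigma_v) ** 2 for x in xs)
    prior_term = ((mu - mu_0) / sigma_m) ** 2
    return data_term + prior_term

def golden_section_minimize(f, lo, hi, tol=1e-8):
    # Golden-section search for the minimizer of a unimodal function on [lo, hi].
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    while b - a > tol:
        if f(c) < f(d):
            b = d
        else:
            a = c
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
    return 0.5 * (a + b)

xs = [2.1, 1.7, 2.6, 2.3, 1.9]          # made-up observations
sigma_v, mu_0, sigma_m = 1.0, 0.0, 0.5  # hypothetical noise and prior settings

sample_mean = sum(xs) / len(xs)
mu_map = map_mean(xs, sigma_v, mu_0, sigma_m)
mu_numeric = golden_section_minimize(
    lambda mu: objective(mu, xs, sigma_v, mu_0, sigma_m), -10.0, 10.0
)
mu_map_flat_prior = map_mean(xs, sigma_v, mu_0, 1e6)  # very wide prior

print(mu_map, mu_numeric, sample_mean, mu_map_flat_prior)
```

With these settings the MAP estimate lies between the prior mean and the sample mean, and with a very wide prior it is numerically indistinguishable from the sample mean, which is the ML estimate for this model.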
