Poisson distribution


In probability theory and statistics, the Poisson distribution, named after French mathematician Siméon Denis Poisson, is a discrete probability distribution that expresses the probability of a given number of events occurring in a fixed interval of time or space if these events occur with a known constant mean rate and independently of the time since the last event. The Poisson distribution can also be used for the number of events in other specified intervals such as distance, area or volume.
For instance, an individual keeping track of the amount of mail they receive each day may notice that they receive an average number of 4 letters per day. If receiving any particular piece of mail does not affect the arrival times of future pieces of mail, i.e., if pieces of mail from a wide range of sources arrive independently of one another, then a reasonable assumption is that the number of pieces of mail received in a day obeys a Poisson distribution. Other examples that may follow a Poisson distribution include the number of phone calls received by a call center per hour and the number of decay events per second from a radioactive source.

Definitions

Probability mass function

The Poisson distribution is popular for modeling the number of times an event occurs in an interval of time or space.
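A discrete random variable X is said to have a Poisson distribution with parameter λ > 0 if, for k = 0, 1, 2, ..., the probability mass function of X is given by
$$f(k;\lambda) = \Pr(X = k) = \frac{\lambda^{k} e^{-\lambda}}{k!},$$
where k is the number of occurrences, e is Euler's number (e ≈ 2.71828), and k! is the factorial of k.
The positive real number λ is equal to the expected value of X and also to its variance: $\lambda = \operatorname{E}(X) = \operatorname{Var}(X)$.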
The Poisson distribution can be applied to systems with a large number of possible events, each of which is rare. The number of such events that occur during a fixed time interval is, under the right circumstances, a random number with a Poisson distribution.

Example

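The Poisson distribution may be useful to model events such as the number of phone calls received by a call center per hour or the number of decay events per second from a radioactive source.
The Poisson distribution is an appropriate model if the following assumptions are true:
k is the number of times an event occurs in an interval, and k can take the values 0, 1, 2, ....
The occurrence of one event does not affect the probability that a second event will occur; that is, events occur independently.
The average rate at which events occur is constant.
Two events cannot occur at exactly the same instant.
If these conditions are true, then k is a Poisson random variable, and the distribution of k is a Poisson distribution.
The Poisson distribution is also the limit of a binomial distribution, for which the probability of success for each trial equals λ divided by the number of trials, as the number of trials approaches infinity.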

Probability of events for a Poisson distribution

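An event can occur 0, 1, 2, ... times in an interval. The average number of events in an interval is designated λ. λ is the event rate, also called the rate parameter. The probability of observing k events in an interval is given by the equation
$$P(k \text{ events in interval}) = \frac{\lambda^{k} e^{-\lambda}}{k!},$$
where λ is the average number of events per interval, e is Euler's number (e ≈ 2.71828), k takes the values 0, 1, 2, ..., and k! is the factorial of k.
This equation is the probability mass function for a Poisson distribution.
This equation can be adapted if, instead of the average number of events λ, we are given a time rate r for the events to happen. Then λ = rt (with r in units of 1/time), and
$$P(k \text{ events in interval } t) = \frac{(rt)^{k} e^{-rt}}{k!}.$$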

Examples of probability for Poisson distributions

On a particular river, overflow floods occur once every 100 years on average. Calculate the probability of k = 0, 1, 2, 3, 4, 5, or 6 overflow floods in a 100-year interval, assuming the Poisson model is appropriate.
Because the average event rate is one overflow flood per 100 years, λ = 1, so that
$$P(k \text{ overflow floods in 100 years}) = \frac{1^{k} e^{-1}}{k!} = \frac{e^{-1}}{k!}.$$
The table below gives the probability for 0 to 6 overflow floods in a 100-year period.
k    P(k overflow floods in 100 years)
0    0.368
1    0.368
2    0.184
3    0.061
4    0.015
5    0.003
6    0.0005
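
The flood table can be reproduced with a few lines of Python. This is a minimal sketch that assumes the standard form of the probability mass function, P(X = k) = λ^k e^(−λ) / k!; the function name `poisson_pmf` and the variable `flood_probabilities` are illustrative choices, not part of any library.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Return P(X = k) for X ~ Poisson(lam), directly from the definition."""
    return lam ** k * math.exp(-lam) / math.factorial(k)


# Overflow-flood example: one flood per 100 years on average, so lam = 1.
flood_probabilities = [poisson_pmf(k, 1.0) for k in range(7)]

for k, p in enumerate(flood_probabilities):
    print(f"{k}  {p:.4f}")
```

The same function can be called with `lam = 2.5` to check other tables of this kind. The direct formula is adequate for small λ and k; for large values a log-space evaluation is usually preferred.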

Ugarte and colleagues report that the average number of goals in a World Cup soccer match is approximately 2.5 and the Poisson model is appropriate.
Because the average event rate is 2.5 goals per match, λ = 2.5.
The table below gives the probability for 0 to 7 goals in a match.
k    P(k goals in a World Cup soccer match)
0    0.082
1    0.205
2    0.257
3    0.214
4    0.134
5    0.067
6    0.028
7    0.010

Once in an interval events: the special case of λ = 1 and k = 0

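Suppose that astronomers estimate that large meteorites hit the earth on average once every 100 years (λ = 1 event per 100 years), and that the number of meteorite hits follows a Poisson distribution. What is the probability of k = 0 meteorite hits in the next 100 years?
$$P(k = 0 \text{ meteorites hit in next 100 years}) = \frac{1^{0} e^{-1}}{0!} = \frac{1}{e} \approx 0.37.$$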
Under these assumptions, the probability that no large meteorites hit the earth in the next 100 years is roughly 0.37. The remaining 1 − 0.37 = 0.63 is the probability of 1, 2, 3, or more large meteorite hits in the next 100 years.
In an example above, an overflow flood occurred once every 100 years. The probability of no overflow floods in 100 years was roughly 0.37, by the same calculation.
In general, if an event occurs on average once per interval (λ = 1), and the events follow a Poisson distribution, then P(0 events in next interval) = 0.37. In addition, P(exactly one event in next interval) = 0.37, as shown in the table for overflow floods.

Examples that violate the Poisson assumptions

The number of students who arrive at the student union per minute will likely not follow a Poisson distribution, because the rate is not constant and the arrivals of individual students are not independent.
The number of magnitude 5 earthquakes per year in a country may not follow a Poisson distribution if one large earthquake increases the probability of aftershocks of similar magnitude.
Examples in which at least one event is guaranteed are not Poisson distributed, but may be modeled using a zero-truncated Poisson distribution.
Count distributions in which the number of intervals with zero events is higher than predicted by a Poisson model may be modeled using a zero-inflated model.

Properties

Descriptive statistics

Bounds for the median ν of the distribution are known and are sharp:
$$\lambda - \ln 2 \le \nu < \lambda + \frac{1}{3}.$$

Higher moments

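For the non-centered moments we define $B = k/\lambda$, then
$$\operatorname{E}[X^{k}]^{1/k} \le C \cdot \begin{cases} k/B & \text{if } B < e \\ k/\log B & \text{if } B \ge e \end{cases}$$
where C is some absolute constant greater than 0.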

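Sums of Poisson-distributed random variables

If $X_i \sim \operatorname{Pois}(\lambda_i)$ for i = 1, ..., n are independent, then $\sum_{i=1}^{n} X_i \sim \operatorname{Pois}\left(\sum_{i=1}^{n} \lambda_i\right)$.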

Other properties

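Let $X \sim \operatorname{Pois}(\lambda)$ and $Y \sim \operatorname{Pois}(\mu)$ be independent random variables, with λ < μ, then we have that
$$\frac{e^{-(\sqrt{\mu} - \sqrt{\lambda})^{2}}}{(\lambda + \mu)^{2}} - \frac{e^{-(\lambda + \mu)}}{2\sqrt{\lambda\mu}} - \frac{e^{-(\lambda + \mu)}}{4\lambda\mu} \le P(X - Y \ge 0) \le e^{-(\sqrt{\mu} - \sqrt{\lambda})^{2}}.$$
The upper bound is proved using a standard Chernoff bound.
The lower bound can be proved by noting that $P(X - Y \ge 0 \mid X + Y = i)$ is the probability that $Z \ge i/2$, where $Z \sim \operatorname{Bin}\left(i, \frac{\lambda}{\lambda + \mu}\right)$, which is bounded below by $\frac{1}{(i+1)^{2}} e^{-i D\left(0.5 \,\|\, \frac{\lambda}{\lambda + \mu}\right)}$, where D is relative entropy. Further noting that $X + Y \sim \operatorname{Pois}(\lambda + \mu)$, and computing a lower bound on the unconditional probability gives the result. More details can be found in the appendix of Kamath et al.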

Related distributions

General

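Assume $X_1 \sim \operatorname{Pois}(\lambda_1), X_2 \sim \operatorname{Pois}(\lambda_2), \dots, X_n \sim \operatorname{Pois}(\lambda_n)$ where $\lambda_1 + \lambda_2 + \dots + \lambda_n = 1$, then $(X_1, X_2, \dots, X_n)$ is multinomially distributed
$$(X_1, X_2, \dots, X_n) \sim \operatorname{Mult}(N, \lambda_1, \lambda_2, \dots, \lambda_n)$$
conditioned on $N = X_1 + X_2 + \dots + X_n$.
This means, among other things, that for any nonnegative function $f(x_1, x_2, \dots, x_n)$,
if $(Y_1, Y_2, \dots, Y_n) \sim \operatorname{Mult}(m, \mathbf{p})$ is multinomially distributed, then
$$\operatorname{E}[f(Y_1, Y_2, \dots, Y_n)] \le e\sqrt{m}\, \operatorname{E}[f(X_1, X_2, \dots, X_n)]$$
where $(X_1, X_2, \dots, X_n) \sim \operatorname{Pois}(\mathbf{p})$.
The factor of $e\sqrt{m}$ can be replaced by 2 if f is further assumed to be monotonically increasing or decreasing.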

Bivariate Poisson distribution

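This distribution has been extended to the bivariate case. The generating function for this distribution is
$$g(u, v) = \exp\left[(\theta_1 - \theta_{12})(u - 1) + (\theta_2 - \theta_{12})(v - 1) + \theta_{12}(uv - 1)\right]$$
with
$$\theta_1, \theta_2 > \theta_{12} > 0.$$
The marginal distributions are Poisson(θ1) and Poisson(θ2) and the correlation coefficient is limited to the range
$$0 \le \rho \le \min\left\{ \sqrt{\frac{\theta_1}{\theta_2}}, \sqrt{\frac{\theta_2}{\theta_1}} \right\}.$$
A simple way to generate a bivariate Poisson distribution $X_1, X_2$ is to take three independent Poisson distributions $Y_1, Y_2, Y_3$ with means $\lambda_1, \lambda_2, \lambda_3$ and then set $X_1 = Y_1 + Y_3$, $X_2 = Y_2 + Y_3$. The probability function of the bivariate Poisson distribution is
$$\Pr(X_1 = k_1, X_2 = k_2) = \exp(-\lambda_1 - \lambda_2 - \lambda_3) \frac{\lambda_1^{k_1}}{k_1!} \frac{\lambda_2^{k_2}}{k_2!} \sum_{k=0}^{\min(k_1, k_2)} \binom{k_1}{k} \binom{k_2}{k} k! \left(\frac{\lambda_3}{\lambda_1 \lambda_2}\right)^{k}.$$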

Free Poisson distribution

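The free Poisson distribution with jump size α and rate λ arises in free probability theory as the limit of repeated free convolution
$$\left( \left(1 - \frac{\lambda}{N}\right) \delta_0 + \frac{\lambda}{N} \delta_\alpha \right)^{\boxplus N}$$
as N → ∞.
In other words, let $X_N$ be random variables so that $X_N$ has value α with probability λ/N and value 0 with the remaining probability. Assume also that the family $X_1, X_2, \dots$ are freely independent. Then the limit as N → ∞ of the law of
$$X_1 + \dots + X_N$$
is given by the free Poisson law with parameters λ, α.
This definition is analogous to one of the ways in which the classical Poisson distribution is obtained from a Poisson process.
The measure associated to the free Poisson law is given by
$$\mu = \begin{cases} (1 - \lambda)\delta_0 + \nu, & \text{if } 0 \le \lambda \le 1 \\ \nu, & \text{if } \lambda > 1, \end{cases}$$
where
$$\nu = \frac{1}{2\pi\alpha t} \sqrt{4\lambda\alpha^{2} - (t - \alpha(1 + \lambda))^{2}} \, dt$$
and has support $[\alpha(1 - \sqrt{\lambda})^{2}, \alpha(1 + \sqrt{\lambda})^{2}]$.
This law also arises in random matrix theory as the Marchenko–Pastur law. Its free cumulants are equal to $\kappa_n = \lambda\alpha^{n}$.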

Some transforms of this law

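We give values of some important transforms of the free Poisson law; the computation can be found, for example, in the book Lectures on the Combinatorics of Free Probability by A. Nica and R. Speicher.
The R-transform of the free Poisson law is given by
$$R(z) = \frac{\lambda\alpha}{1 - \alpha z}.$$
The Cauchy transform is given by
$$G(z) = \frac{z + \alpha - \lambda\alpha - \sqrt{(z - \alpha(1 + \lambda))^{2} - 4\lambda\alpha^{2}}}{2\alpha z}.$$
The S-transform is given by
$$S(z) = \frac{1}{z + \lambda}$$
in the case that α = 1.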

Statistical Inference

Parameter estimation

Given a sample of n measured values $k_i \in \{0, 1, \dots\}$, for i = 1, ..., n, we wish to estimate the value of the parameter λ of the Poisson population from which the sample was drawn. The maximum likelihood estimate is
$$\hat{\lambda}_{\mathrm{MLE}} = \frac{1}{n} \sum_{i=1}^{n} k_i.$$
Since each observation has expectation λ, so does the sample mean. Therefore, the maximum likelihood estimate is an unbiased estimator of λ. It is also an efficient estimator since its variance achieves the Cramér–Rao lower bound. Hence it is minimum-variance unbiased. It can also be proven that the sum $\sum_{i=1}^{n} k_i$ (and hence the sample mean, which is a one-to-one function of the sum) is a complete and sufficient statistic for λ.
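To prove sufficiency we may use the factorization theorem. Consider partitioning the probability mass function of the joint Poisson distribution for the sample into two parts: one that depends solely on the sample $\mathbf{k} = (k_1, \dots, k_n)$, called $h(\mathbf{k})$, and one that depends on the parameter λ and the sample $\mathbf{k}$ only through the function $T(\mathbf{k})$. Then $T(\mathbf{k})$ is a sufficient statistic for λ.
$$P(\mathbf{k}) = \prod_{i=1}^{n} \frac{\lambda^{k_i} e^{-\lambda}}{k_i!} = \frac{1}{\prod_{i=1}^{n} k_i!} \times \lambda^{\sum_{i=1}^{n} k_i} e^{-n\lambda}$$
The first term, $h(\mathbf{k}) = 1 / \prod_{i=1}^{n} k_i!$, depends only on $\mathbf{k}$. The second term, $g(T(\mathbf{k}) \mid \lambda) = \lambda^{\sum_{i=1}^{n} k_i} e^{-n\lambda}$, depends on the sample only through $T(\mathbf{k}) = \sum_{i=1}^{n} k_i$. Thus, $T(\mathbf{k})$ is sufficient.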
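To find the parameter λ that maximizes the probability function for the Poisson population, we can use the logarithm of the likelihood function:
$$\ell(\lambda) = \ln \prod_{i=1}^{n} f(k_i \mid \lambda) = \sum_{i=1}^{n} \ln\left(\frac{e^{-\lambda} \lambda^{k_i}}{k_i!}\right) = -n\lambda + \left(\sum_{i=1}^{n} k_i\right) \ln \lambda - \sum_{i=1}^{n} \ln(k_i!).$$
We take the derivative of ℓ with respect to λ and set it equal to zero:
$$\frac{d\ell}{d\lambda} = -n + \left(\sum_{i=1}^{n} k_i\right) \frac{1}{\lambda} = 0.$$
Solving for λ gives a stationary point:
$$\lambda = \frac{\sum_{i=1}^{n} k_i}{n}.$$
So λ is the average of the $k_i$ values. Obtaining the sign of the second derivative of ℓ at the stationary point will determine what kind of extreme value λ is:
$$\frac{d^{2}\ell}{d\lambda^{2}} = -\lambda^{-2} \sum_{i=1}^{n} k_i.$$
Evaluating the second derivative at the stationary point gives:
$$\frac{d^{2}\ell}{d\lambda^{2}} = -\frac{n^{2}}{\sum_{i=1}^{n} k_i},$$
which is the negative of n times the reciprocal of the average of the $k_i$. This expression is negative when the average is positive. If this is satisfied, then the stationary point maximizes the probability function.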
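
The derivation can be checked numerically. The sketch below assumes the log-likelihood ℓ(λ) = −nλ + (Σ k_i) ln λ − Σ ln(k_i!) and uses a small made-up sample; the sample values and the function name `poisson_log_likelihood` are illustrative assumptions only.

```python
import math


def poisson_log_likelihood(lam: float, sample: list[int]) -> float:
    """Log-likelihood of an i.i.d. Poisson(lam) sample."""
    n = len(sample)
    return (-n * lam
            + sum(sample) * math.log(lam)
            - sum(math.lgamma(k + 1) for k in sample))  # ln(k!) = lnGamma(k + 1)


# Hypothetical counts, chosen only for illustration (sum = 24, n = 8).
sample = [2, 4, 3, 5, 1, 4, 2, 3]

# The stationary point of the log-likelihood is the sample mean.
lam_hat = sum(sample) / len(sample)

# The log-likelihood is lower on either side of lam_hat, as the negative
# second derivative predicts.
at_max = poisson_log_likelihood(lam_hat, sample)
left = poisson_log_likelihood(lam_hat - 0.1, sample)
right = poisson_log_likelihood(lam_hat + 0.1, sample)
print(lam_hat, at_max > left, at_max > right)
```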
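For completeness, a family of distributions is said to be complete if and only if $\operatorname{E}(g(T)) = 0$ implies that $P_\lambda(g(T) = 0) = 1$ for all λ. If the individual $X_i$ are iid $\operatorname{Pois}(\lambda)$, then $T = \sum_{i=1}^{n} X_i \sim \operatorname{Pois}(n\lambda)$. Knowing the distribution we want to investigate, it is easy to see that the statistic is complete.
$$\operatorname{E}(g(T)) = \sum_{t=0}^{\infty} g(t) \frac{(n\lambda)^{t} e^{-n\lambda}}{t!} = 0$$
For this equality to hold, g(t) must be 0. This follows from the fact that none of the other terms will be 0 for all t in the sum and for all possible values of λ. Hence, $\operatorname{E}(g(T)) = 0$ for all λ implies that $P_\lambda(g(T) = 0) = 1$, and the statistic has been shown to be complete.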

Confidence interval

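The confidence interval for the mean of a Poisson distribution can be expressed using the relationship between the cumulative distribution functions of the Poisson and chi-squared distributions. The chi-squared distribution is itself closely related to the gamma distribution, and this leads to an alternative expression. Given an observation k from a Poisson distribution with mean μ, a confidence interval for μ with confidence level 1 − α is
$$\tfrac{1}{2} \chi^{2}(\alpha/2;\, 2k) \le \mu \le \tfrac{1}{2} \chi^{2}(1 - \alpha/2;\, 2k + 2),$$
or equivalently,
$$F^{-1}(\alpha/2;\, k, 1) \le \mu \le F^{-1}(1 - \alpha/2;\, k + 1, 1),$$
where $\chi^{2}(p; n)$ is the quantile function (corresponding to a lower tail area p) of the chi-squared distribution with n degrees of freedom and $F^{-1}(p; n, 1)$ is the quantile function of a gamma distribution with shape parameter n and scale parameter 1. This interval is 'exact' in the sense that its coverage probability is never less than the nominal 1 − α.
When quantiles of the gamma distribution are not available, an accurate approximation to this exact interval has been proposed (based on the Wilson–Hilferty transformation):
$$k \left(1 - \frac{1}{9k} - \frac{z_{\alpha/2}}{3\sqrt{k}}\right)^{3} \le \mu \le (k + 1) \left(1 - \frac{1}{9(k + 1)} + \frac{z_{\alpha/2}}{3\sqrt{k + 1}}\right)^{3},$$
where $z_{\alpha/2}$ denotes the standard normal deviate with upper tail area α/2.
For application of these formulae in the same context as above (given a sample of n measured values $k_i$, each drawn from a Poisson distribution with mean λ), one would set
$$k = \sum_{i=1}^{n} k_i,$$
calculate an interval for μ = nλ, and then derive the interval for λ.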
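
As an illustration, the sketch below computes both intervals for a single observed count using only the Python standard library. Two assumptions are made here and are not taken from the text above: the exact limits are found by solving the tail equations P(X ≥ k; μ) = α/2 and P(X ≤ k; μ) = α/2 by bisection (which is expected to agree with the chi-squared form), and the approximation is taken to be the cube-root formula k(1 − 1/(9k) − z/(3√k))³ ≤ μ ≤ (k+1)(1 − 1/(9(k+1)) + z/(3√(k+1)))³. All function names are hypothetical.

```python
import math
from statistics import NormalDist


def poisson_cdf(k: int, mu: float) -> float:
    """P(X <= k) for X ~ Poisson(mu), summing the pmf term by term."""
    term = math.exp(-mu)
    total = term
    for i in range(1, k + 1):
        term *= mu / i
        total += term
    return total


def _root_of_increasing(g, hi: float) -> float:
    """Bisection for an increasing function g with g(0) < 0."""
    lo = 0.0
    while g(hi) < 0:  # widen the bracket until it contains the root
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def exact_interval(k: int, alpha: float = 0.05) -> tuple[float, float]:
    """Solve the two tail equations directly instead of using quantile tables."""
    lower = 0.0 if k == 0 else _root_of_increasing(
        lambda mu: (1.0 - poisson_cdf(k - 1, mu)) - alpha / 2, hi=k + 1.0)
    upper = _root_of_increasing(
        lambda mu: alpha / 2 - poisson_cdf(k, mu), hi=k + 1.0)
    return lower, upper


def approximate_interval(k: int, alpha: float = 0.05) -> tuple[float, float]:
    """Cube-root normal approximation; needs k >= 1 for the lower limit."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2)
    lower = k * (1 - 1 / (9 * k) - z / (3 * math.sqrt(k))) ** 3
    m = k + 1
    upper = m * (1 - 1 / (9 * m) + z / (3 * math.sqrt(m))) ** 3
    return lower, upper


exact = exact_interval(10)          # roughly (4.80, 18.39) at the 95% level
approx = approximate_interval(10)
print(exact, approx)
```

The term-by-term CDF is only suitable for modest counts, since `math.exp(-mu)` underflows for very large μ.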

Bayesian inference

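In Bayesian inference, the conjugate prior for the rate parameter λ of the Poisson distribution is the gamma distribution. Let
$$\lambda \sim \operatorname{Gamma}(\alpha, \beta)$$
denote that λ is distributed according to the gamma density g parameterized in terms of a shape parameter α and an inverse scale parameter β:
$$g(\lambda \mid \alpha, \beta) = \frac{\beta^{\alpha}}{\Gamma(\alpha)} \lambda^{\alpha - 1} e^{-\beta\lambda} \quad \text{for } \lambda > 0.$$
Then, given the same sample of n measured values $k_i$ as before, and a prior of Gamma(α, β), the posterior distribution is
$$\lambda \sim \operatorname{Gamma}\left(\alpha + \sum_{i=1}^{n} k_i,\ \beta + n\right).$$
The posterior mean $\operatorname{E}[\lambda] = \left(\alpha + \sum_{i=1}^{n} k_i\right) / (\beta + n)$ approaches the maximum likelihood estimate $\hat{\lambda}_{\mathrm{MLE}}$ in the limit as α → 0, β → 0, which follows immediately from the general expression of the mean of the gamma distribution.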
The posterior predictive distribution for a single additional observation is a negative binomial distribution, sometimes called a gamma–Poisson distribution.
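
A minimal sketch of the conjugate update, assuming the usual result that a Gamma(α, β) prior combined with n Poisson counts k_i gives a Gamma(α + Σ k_i, β + n) posterior, with mean (shape)/(rate). The sample and the prior parameters are made up for illustration, and `gamma_posterior` is a hypothetical helper name.

```python
def gamma_posterior(alpha: float, beta: float, sample: list[int]) -> tuple[float, float]:
    """Posterior (shape, rate) for a Poisson rate under a Gamma(alpha, beta) prior."""
    return alpha + sum(sample), beta + len(sample)


# Hypothetical counts (sum = 24, n = 8) and an illustrative Gamma(2, 1) prior.
sample = [2, 4, 3, 5, 1, 4, 2, 3]
alpha_post, beta_post = gamma_posterior(2.0, 1.0, sample)
posterior_mean = alpha_post / beta_post      # (2 + 24) / (1 + 8)

# With a nearly flat prior the posterior mean is close to the sample mean.
weak_shape, weak_rate = gamma_posterior(1e-9, 1e-9, sample)
weak_prior_mean = weak_shape / weak_rate
print(posterior_mean, weak_prior_mean)
```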

Simultaneous estimation of multiple Poisson means

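Suppose $X_1, X_2, \dots, X_p$ is a set of independent random variables from a set of p Poisson distributions, each with a parameter $\lambda_i$, i = 1, ..., p, and we would like to estimate these parameters. Then, Clevenson and Zidek show that under the normalized squared error loss $L(\lambda, \hat{\lambda}) = \sum_{i=1}^{p} \lambda_i^{-1} (\hat{\lambda}_i - \lambda_i)^{2}$, when p > 1, then, similarly to Stein's example for the Normal means, the MLE estimator $\hat{\lambda}_i = X_i$ is inadmissible.
In this case, a family of minimax estimators is given for any $0 < c \le 2(p - 1)$ and $b \ge (p - 2 + p^{-1})$ as
$$\hat{\lambda}_i = \left(1 - \frac{c}{b + \sum_{i=1}^{p} X_i}\right) X_i, \qquad i = 1, \dots, p.$$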

Occurrence and applications

Applications of the Poisson distribution can be found in many fields.
The Poisson distribution arises in connection with Poisson processes. It applies to various phenomena of discrete properties whenever the probability of the phenomenon happening is constant in time or space. Examples of events that may be modelled as a Poisson distribution include the number of phone calls received by a call center per hour, the number of decay events per second from a radioactive source, and the number of mutations in a given sequence of DNA.

Gallagher showed in 1976 that the counts of prime numbers in short intervals obey a Poisson distribution provided a certain version of the unproved prime r-tuple conjecture of Hardy-Littlewood is true.

Law of rare events

The rate of an event is related to the probability of an event occurring in some small subinterval. In the case of the Poisson distribution, one assumes that there exists a small enough subinterval for which the probability of an event occurring twice is "negligible". With this assumption one can derive the Poisson distribution from the binomial one, given only the information of the expected number of total events in the whole interval. Let this total number be λ. Divide the whole interval into n subintervals $I_1, \dots, I_n$ of equal size, such that n > λ. This means that the expected number of events in each of the n subintervals is equal to λ/n. Now we assume that the occurrence of an event in the whole interval can be seen as a sequence of n Bernoulli trials, where the i-th Bernoulli trial corresponds to looking whether an event happens in the subinterval $I_i$ with probability λ/n. The expected number of total events in n such trials would be λ, the expected number of total events in the whole interval. Hence for each subdivision of the interval we have approximated the occurrence of the event as a Bernoulli process of the form $\mathrm{B}(n, \lambda/n)$. As we have noted before, we want to consider only very small subintervals. Therefore, we take the limit as n goes to infinity.
In this case the binomial distribution converges to what is known as the Poisson distribution by the Poisson limit theorem.
In several of the above examples (such as the number of mutations in a given sequence of DNA) the events being counted are actually the outcomes of discrete trials, and would more precisely be modelled using the binomial distribution, that is
$$X \sim \mathrm{B}(n, p).$$
In such cases n is very large and p is very small. Then the distribution may be approximated by the less cumbersome Poisson distribution
$$X \sim \operatorname{Pois}(np).$$
This approximation is sometimes known as the law of rare events, since each of the n individual Bernoulli events rarely occurs. The name may be misleading because the total count of success events in a Poisson process need not be rare if the parameter np is not small. For example, the number of telephone calls to a busy switchboard in one hour follows a Poisson distribution with the events appearing frequent to the operator, but they are rare from the point of view of the average member of the population who is very unlikely to make a call to that switchboard in that hour.
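
The quality of this approximation can be inspected numerically. The sketch below compares the B(n, λ/n) and Pois(λ) probability mass functions for a fixed λ as n grows; the value λ = 2.5, the range of k, and the helper names are arbitrary illustrative choices.

```python
import math


def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ B(n, p)."""
    return math.comb(n, k) * p ** k * (1.0 - p) ** (n - k)


def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for X ~ Pois(lam)."""
    return lam ** k * math.exp(-lam) / math.factorial(k)


lam = 2.5  # expected number of events in the whole interval (illustrative)


def max_abs_gap(n: int, k_max: int = 10) -> float:
    """Largest pointwise difference between B(n, lam/n) and Pois(lam) for k <= k_max."""
    return max(abs(binomial_pmf(k, n, lam / n) - poisson_pmf(k, lam))
               for k in range(k_max + 1))


# The gap shrinks as the interval is cut into more, smaller subintervals.
gaps = {n: max_abs_gap(n) for n in (10, 100, 1000, 10000)}
for n, gap in gaps.items():
    print(n, f"{gap:.2e}")
```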
The word law is sometimes used as a synonym of probability distribution, and convergence in law means convergence in distribution. Accordingly, the Poisson distribution is sometimes called the "law of small numbers" because it is the probability distribution of the number of occurrences of an event that happens rarely but has very many opportunities to happen. The Law of Small Numbers is a book by Ladislaus Bortkiewicz about the Poisson distribution, published in 1898.

Poisson point process

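The Poisson distribution arises as the number of points of a Poisson point process located in some finite region. More specifically, if D is some region of space, for example Euclidean space $\mathbb{R}^{d}$, for which |D|, the area, volume or, more generally, the Lebesgue measure of the region is finite, and if N(D) denotes the number of points in D, then
$$P(N(D) = k) = \frac{(\lambda |D|)^{k} e^{-\lambda |D|}}{k!},$$
where λ is the intensity of the process, that is, the mean number of points per unit measure.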

Poisson regression and negative binomial regression

Poisson regression and negative binomial regression are useful for analyses where the dependent variable is the count of the number of events or occurrences in an interval.

Other applications in science

In a Poisson process, the number of observed occurrences fluctuates about its mean λ with a standard deviation $\sigma_k = \sqrt{\lambda}$. These fluctuations are denoted as Poisson noise or as shot noise.
The correlation of the mean and standard deviation in counting independent discrete occurrences is useful scientifically. By monitoring how the fluctuations vary with the mean signal, one can estimate the contribution of a single occurrence, even if that contribution is too small to be detected directly. For example, the charge e on an electron can be estimated by correlating the magnitude of an electric current with its shot noise. If N electrons pass a point in a given time t on the average, the mean current is $I = eN/t$; since the current fluctuations should be of the order $\sigma_I = e\sqrt{N}/t$ (i.e., the standard deviation of the Poisson process), the charge e can be estimated from the ratio $t\sigma_I^{2}/I$.
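
The shot-noise estimate can be verified in one line. The following is a sketch assuming the standard expressions I = eN/t for the mean current and σ_I = e√N/t for its fluctuation (the latter because the electron count has standard deviation √N); substituting both into the ratio cancels everything except the charge:

```latex
\frac{t\,\sigma_I^{2}}{I}
  = \frac{t \cdot \dfrac{e^{2} N}{t^{2}}}{\dfrac{e N}{t}}
  = \frac{e^{2} N / t}{e N / t}
  = e
```

The unknown mean count N drops out, which is why the charge of a single carrier can be inferred from macroscopic measurements of the current and its noise.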
An everyday example is the graininess that appears as photographs are enlarged; the graininess is due to Poisson fluctuations in the number of reduced silver grains, not to the individual grains themselves. By correlating the graininess with the degree of enlargement, one can estimate the contribution of an individual grain. Many other molecular applications of Poisson noise have been developed, e.g., estimating the number density of receptor molecules in a cell membrane.
In Causal Set theory the discrete elements of spacetime follow a Poisson distribution in the volume.

Computational methods

The Poisson distribution poses two different tasks for dedicated software libraries: Evaluating the distribution, and drawing random numbers according to that distribution.

Evaluating the Poisson distribution

Computing $P(k; \lambda)$ for given k and λ is a trivial task that can be accomplished by using the standard definition of $P(k; \lambda)$ in terms of exponential, power, and factorial functions. However, the conventional definition of the Poisson distribution contains two terms that can easily overflow on computers: $\lambda^{k}$ and k!. The fraction of $\lambda^{k}$ to k! can also produce a rounding error that is very large compared to $e^{-\lambda}$, and therefore give an erroneous result. For numerical stability the Poisson probability mass function should therefore be evaluated as
$$f(k; \lambda) = \exp\left[k \ln \lambda - \lambda - \ln \Gamma(k + 1)\right],$$
which is mathematically equivalent but numerically stable. The natural logarithm of the gamma function can be obtained using the lgamma function in the C standard library or R, the gammaln function in MATLAB or SciPy, or the log_gamma function in Fortran 2008 and later.
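
The difference between the two evaluation strategies shows up quickly in practice. This sketch uses Python's `math.lgamma` and assumes the log-space form exp[k ln λ − λ − ln Γ(k + 1)]; the function names are illustrative.

```python
import math


def poisson_pmf_log_space(k: int, lam: float) -> float:
    """Evaluate the Poisson pmf as exp(k*ln(lam) - lam - lnGamma(k + 1))."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))


def poisson_pmf_naive(k: int, lam: float) -> float:
    """Direct definition; lam**k overflows a double for large lam and k."""
    return lam ** k * math.exp(-lam) / math.factorial(k)


try:
    naive = poisson_pmf_naive(1000, 1000.0)
except OverflowError:
    naive = None  # 1000.0**1000 is far outside the double-precision range

stable = poisson_pmf_log_space(1000, 1000.0)
print(naive, stable)
```

For small arguments the two functions agree to rounding error, so the log-space version can be used throughout.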
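Some computing languages provide built-in functions to evaluate the Poisson distribution, for example R (function dpois(x, lambda)), Excel (function POISSON(x, mean, cumulative), with a flag to select the cumulative distribution), and Mathematica (the distribution PoissonDistribution[λ]).
The less trivial task is to draw random integers from the Poisson distribution with given λ.
Solutions are provided by, for example, R (function rpois(n, lambda)), the GNU Scientific Library (function gsl_ran_poisson), and NumPy (function numpy.random.poisson).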
A simple algorithm to generate random Poisson-distributed numbers has been given by Knuth:
algorithm poisson random number (Knuth):
    init:
        Let L ← e^(−λ), k ← 0 and p ← 1.
    do:
        k ← k + 1.
        Generate uniform random number u in [0, 1] and let p ← p × u.
    while p > L.
    return k − 1.
The complexity is linear in the returned value k, which is λ on average. There are many other algorithms that improve on this; some are given in Ahrens & Dieter.
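
A direct Python transcription of the multiplication method might look as follows. This is a sketch, not a tuned implementation; the function name `poisson_knuth` and the optional `rng` argument (any object with a `random()` method, such as a seeded `random.Random`) are illustrative choices.

```python
import math
import random


def poisson_knuth(lam: float, rng=random) -> int:
    """Draw one Poisson(lam) variate by multiplying uniforms until the
    running product drops to exp(-lam) or below."""
    limit = math.exp(-lam)   # L in the pseudocode
    k = 0
    p = 1.0
    while True:              # do ... while p > L
        k += 1
        p *= rng.random()
        if p <= limit:
            return k - 1


rng = random.Random(0)       # seeded only to make the demo repeatable
draws = [poisson_knuth(4.0, rng) for _ in range(10)]
print(draws)
```

Because the loop runs about λ + 1 times per draw and `math.exp(-lam)` underflows to zero for very large λ, this version is only appropriate for small to moderate rates.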
For large values of λ, the value of L = e^(−λ) may be so small that it is hard to represent. This can be solved by a change to the algorithm which uses an additional parameter STEP such that e^(−STEP) does not underflow:
algorithm poisson random number (Knuth, with STEP parameter):
    init:
        Let λLeft ← λ, k ← 0 and p ← 1.
    do:
        k ← k + 1.
        Generate uniform random number u in (0, 1) and let p ← p × u.
        while p < 1 and λLeft > 0:
            if λLeft > STEP:
                p ← p × e^STEP
                λLeft ← λLeft − STEP
            else:
                p ← p × e^λLeft
                λLeft ← 0
    while p > 1.
    return k − 1.
The choice of STEP depends on the threshold of overflow. For the double-precision floating-point format, the threshold is near e^700, so 500 should be a safe STEP.
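
The STEP variant can be transcribed in the same way. The sketch below assumes the rescaling logic described above, with a default `step` of 500; the function and parameter names are illustrative.

```python
import math
import random


def poisson_knuth_step(lam: float, step: float = 500.0, rng=random) -> int:
    """Multiplication method with the exp(-lam) factor applied in chunks of
    at most exp(step), so that no intermediate value underflows."""
    lam_left = lam
    k = 0
    p = 1.0
    while True:                          # do ... while p > 1
        k += 1
        p *= rng.random()
        # Whenever p falls below 1, fold in part of the remaining exp(lam).
        while p < 1.0 and lam_left > 0.0:
            if lam_left > step:
                p *= math.exp(step)
                lam_left -= step
            else:
                p *= math.exp(lam_left)
                lam_left = 0.0
        if p <= 1.0:
            return k - 1


rng = random.Random(0)                   # seeded only for a repeatable demo
print([poisson_knuth_step(1000.0, rng=rng) for _ in range(5)])
```

The running time is still linear in the returned value, so for very large λ the rejection or Gaussian-approximation methods remain preferable.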
Other solutions for large values of λ include rejection sampling and using Gaussian approximation.
Inverse transform sampling is simple and efficient for small values of λ, and requires only one uniform random number u per sample. Cumulative probabilities are examined in turn until one exceeds u.
algorithm Poisson generator based upon the inversion by sequential search:
    init:
        Let x ← 0, p ← e^(−λ), s ← p.
        Generate uniform random number u in [0, 1].
    while u > s do:
        x ← x + 1.
        p ← p × λ / x.
        s ← s + p.
    return x.
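
The sequential search translates to Python almost line for line. In this sketch the extra `p > 0.0` guard is an addition not present in the pseudocode: it stops the loop in the rare case where rounding leaves the accumulated sum just below the uniform draw. The function name and the `rng` argument are illustrative.

```python
import math
import random


def poisson_inversion(lam: float, rng=random) -> int:
    """Draw one Poisson(lam) variate by walking up the CDF until it
    reaches a single uniform random number."""
    x = 0
    p = math.exp(-lam)       # P(X = 0)
    s = p                    # running CDF value P(X <= x)
    u = rng.random()
    while u > s and p > 0.0:  # p > 0 is a safety guard against rounding
        x += 1
        p *= lam / x         # P(X = x) from P(X = x - 1)
        s += p
    return x


rng = random.Random(0)       # seeded only for a repeatable demo
print([poisson_inversion(2.5, rng) for _ in range(10)])
```

For λ = 1 the cumulative probabilities are about 0.368, 0.736 and 0.920, so a uniform draw of 0.5 maps to 1 and a draw of 0.9 maps to 2.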

History

The distribution was first introduced by Siméon Denis Poisson and published together with his probability theory in his work Recherches sur la probabilité des jugements en matière criminelle et en matière civile. The work theorized about the number of wrongful convictions in a given country by focusing on certain random variables N that count, among other things, the number of discrete occurrences that take place during a time-interval of given length. The result had already been given in 1711 by Abraham de Moivre in De Mensura Sortis seu; de Probabilitate Eventuum in Ludis a Casu Fortuito Pendentibus. This makes it an example of Stigler's law and it has prompted some authors to argue that the Poisson distribution should bear the name of de Moivre.
In 1860, Simon Newcomb fitted the Poisson distribution to the number of stars found in a unit of space.
A further practical application of this distribution was made by Ladislaus Bortkiewicz in 1898 when he was given the task of investigating the number of soldiers in the Prussian army killed accidentally by horse kicks; this experiment introduced the Poisson distribution to the field of reliability engineering.
