Ziggurat algorithm

The ziggurat algorithm is an algorithm for pseudo-random number sampling. Belonging to the class of rejection sampling algorithms, it relies on an underlying source of uniformly-distributed random numbers, typically from a pseudo-random number generator, as well as precomputed tables. The algorithm is used to generate values from a monotone decreasing probability distribution. It can also be applied to symmetric unimodal distributions, such as the normal distribution, by choosing a value from one half of the distribution and then randomly choosing which half the value is considered to have been drawn from. It was developed by George Marsaglia and others in the 1960s.
A typical value produced by the algorithm only requires the generation of one random floating-point value and one random table index, followed by one table lookup, one multiply operation and one comparison. Occasionally, when the sample falls near the edge of a layer or in the tail, more computation is required. Nevertheless, the algorithm is computationally much faster than the two most commonly used methods of generating normally distributed random numbers, the Marsaglia polar method and the Box–Muller transform, which require at least one logarithm and one square root calculation for each pair of generated values. However, since the ziggurat algorithm is more complex to implement, it is best used when large quantities of random numbers are required.
The term ziggurat algorithm dates from Marsaglia's paper with Wai Wan Tsang in 2000; it is so named because it is conceptually based on covering the probability distribution with rectangular segments stacked in decreasing order of size, resulting in a figure that resembles a ziggurat.
When a sample falls in the part of a layer that is not entirely under the curve, it must be tested further: a random y value within the chosen layer is generated and compared to f(x). If y is less than f(x), the point is under the curve and the value x is output. If not, the chosen point x is rejected and the algorithm is restarted from the beginning.

Theory of operation

The ziggurat algorithm is a rejection sampling algorithm; it randomly generates a point in a distribution slightly larger than the desired distribution, then tests whether the generated point is inside the desired distribution. If not, it tries again. Given a random point underneath a probability density curve, its x coordinate is a random number with the desired distribution.
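
The rejection principle described above can be illustrated with a minimal Python sketch. It is not the ziggurat algorithm itself: the enclosing region here is a single rectangle, and the function name sample_by_rejection and the example density f(x) = 1 − x on [0, 1) are assumptions chosen only for illustration.

```python
import random

def sample_by_rejection(f, width, height, rng=random):
    """Draw x with density proportional to f on [0, width).

    Assumes 0 <= f(x) <= height on that interval (hypothetical helper).
    """
    while True:
        x = rng.random() * width    # candidate point, uniform in the rectangle
        y = rng.random() * height
        if y < f(x):                # the point lies under the curve: accept x
            return x
        # otherwise the point is rejected and a new one is tried

# Example: a monotone decreasing density proportional to 1 - x on [0, 1).
value = sample_by_rejection(lambda x: 1.0 - x, 1.0, 1.0)
```

For this density a single rectangle rejects about half of the candidate points; stacking many thin layers, as the ziggurat does, makes the enclosing region fit the curve far more tightly.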
The distribution the ziggurat algorithm chooses from is made up of n equal-area regions; n − 1 rectangles that cover the bulk of the desired distribution, on top of a non-rectangular base that includes the tail of the distribution.
Given a monotone decreasing probability density function f(x), defined for all x ≥ 0, the base of the ziggurat is defined as all points inside the distribution and below y₁ = f(x₁). This consists of a rectangular region from (0, 0) to (x₁, y₁), and the tail of the distribution, where x > x₁ and y < y₁.
This layer, layer 0, has area A. On top of this, add a rectangular layer of width x₁ and height A/x₁, so it also has area A. The top of this layer is at height y₂ = y₁ + A/x₁, and intersects the density function at a point (x₂, y₂), where y₂ = f(x₂). This layer includes every point in the density function between y₁ and y₂, but also includes points such as (x₁, y₂) which are not in the desired distribution.
Further layers are then stacked on top. To use a precomputed table of size n, one chooses x₁ such that x_n = 0, meaning that the top box, layer n − 1, reaches the distribution's peak at (0, f(0)) exactly.
Layer i extends vertically from y_i to y_i+1, and can be divided into two regions horizontally: the portion from 0 to x_i+1 which is entirely contained within the desired distribution, and the portion from x_i+1 to x_i, which is only partially contained.
Ignoring for a moment the problem of layer 0, and given uniform random variables U₀ and U₁ ∈ [0,1), the ziggurat algorithm can be described as:

1. Choose a random layer 0 ≤ i < n.
2. Let x = U₀x_i.
3. If x < x_i+1, return x.
4. Let y = y_i + U₁(y_i+1 − y_i).
5. Compute f(x). If y < f(x), return x.
6. Otherwise, choose new random numbers and go back to step 1.

Step 1 amounts to choosing a low-resolution y coordinate. Step 3 tests if the x coordinate is clearly within the desired density function without knowing more about the y coordinate. If it is not, step 4 chooses a high-resolution y coordinate, and step 5 does the rejection test.
With closely spaced layers, the algorithm terminates at step 3 a very large fraction of the time. For the top layer n − 1, however, this test always fails, because x_n = 0.
Layer 0 can also be divided into a central region and an edge, but the edge is an infinite tail. To use the same algorithm to check if the point is in the central region, generate a fictitious x₀ = A/y₁. This will generate points with x < x₁ with the correct frequency, and in the rare case that layer 0 is selected and x ≥ x₁, use a special fallback algorithm to select a point at random from the tail. Because the fallback algorithm is used less than one time in a thousand, speed is not essential.
Thus, the full ziggurat algorithm for one-sided distributions is:

1. Choose a random layer 0 ≤ i < n.
2. Let x = U₀x_i.
3. If x < x_i+1, return x.
4. If i = 0, generate a point from the tail using the fallback algorithm.
5. Let y = y_i + U₁(y_i+1 − y_i).
6. Compute f(x). If y < f(x), return x.
7. Otherwise, choose new random numbers and go back to step 1.
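
The one-sided procedure above can be sketched in Python for the exponential density f(x) = e^(−x). This is an illustrative sketch under stated assumptions, not a reference implementation: the table size N = 256, the constant X1 (a value commonly quoted for 256-region exponential tables, not taken from the text above), and the names build and sample_exponential are all assumptions.

```python
import math
import random

N = 256                     # number of regions (assumed table size)
X1 = 7.69711747013104972    # assumed x_1 for N = 256 and f(x) = exp(-x)

def f(x):
    return math.exp(-x)     # exponential density (already normalized)

def build(n=N, x1=X1):
    """Fill x[0..n] and y[0..n] for the exponential ziggurat."""
    area = x1 * f(x1) + math.exp(-x1)    # base rectangle plus tail area
    x = [area / f(x1), x1]               # x_0 is the fictitious base width A / y_1
    y = [0.0, f(x1)]
    for i in range(1, n - 1):
        y.append(y[i] + area / x[i])     # y_{i+1} = y_i + A / x_i
        x.append(-math.log(y[i + 1]))    # x_{i+1} = f^-1(y_{i+1}) = -ln(y_{i+1})
    x.append(0.0)                        # x_n = 0
    y.append(f(0.0))                     # y_n = f(0)
    return x, y

XS, YS = build()

def sample_exponential(rng=random):
    while True:
        i = rng.randrange(N)                      # step 1: pick a layer
        x = rng.random() * XS[i]                  # step 2
        if x < XS[i + 1]:                         # step 3: clearly under the curve
            return x
        if i == 0:                                # step 4: tail fallback
            return X1 - math.log(1.0 - rng.random())
        y = YS[i] + rng.random() * (YS[i + 1] - YS[i])   # step 5
        if y < f(x):                              # step 6: rejection test
            return x
        # step 7: rejected, loop again with fresh random numbers

draws = [sample_exponential() for _ in range(5)]
```

The tail fallback uses 1 − U rather than U so that the logarithm never receives zero; both are uniform, so the distribution is unchanged.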

For a two-sided distribution, of course, the result must be negated 50% of the time. This can often be done conveniently by choosing U₀ ∈ (−1,1) and, in step 3, testing if |x| < x_i+1.

Fallback algorithms for the tail

Because the ziggurat algorithm generates most, but not all, outputs very rapidly, and requires a fallback algorithm whenever x > x₁, it is always more complex than a more direct implementation. The fallback algorithm, of course, depends on the distribution.
For an exponential distribution, the tail looks just like the body of the distribution. One way is to fall back to the most elementary algorithm E = −ln(U₁) and let x = x₁ − ln(U₁). Another is to call the ziggurat algorithm recursively and add x₁ to the result.
For a normal distribution, Marsaglia suggests a compact algorithm:

1. Let x = −ln(U₁)/x₁.
2. Let y = −ln(U₂).
3. If 2y > x², return x + x₁.
4. Otherwise, go back to step 1.

Since x₁ ≈ 3.5 for typical table sizes, the test in step 3 is almost always successful.
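
The normal tail procedure above translates almost line for line into Python. The following is a small sketch; the function name normal_tail and the rng parameter are illustrative assumptions.

```python
import math
import random

def normal_tail(x1, rng=random):
    """Sample from the standard normal tail beyond x1 (x1 > 0)."""
    while True:
        # 1.0 - rng.random() lies in (0, 1], so log() never receives zero.
        x = -math.log(1.0 - rng.random()) / x1    # step 1
        y = -math.log(1.0 - rng.random())         # step 2
        if 2.0 * y > x * x:                       # step 3
            return x + x1
        # step 4: otherwise try again

sample = normal_tail(3.5)
```

For a two-sided sampler the caller would still apply the random sign afterwards, as with any other output of the one-sided algorithm.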

Optimizations

The algorithm can be performed efficiently with precomputed tables of x_i and y_i = f(x_i), but there are some modifications to make it even faster:

Nothing in the ziggurat algorithm depends on the probability distribution function being normalized, so removing normalizing constants can speed up the computation of f(x).
Most uniform random number generators are based on integer random number generators which return an integer in the range [0, 2³² − 1]. A table of 2⁻³²x_i lets you use such numbers directly for U₀.
When computing two-sided distributions using a two-sided U₀ as described earlier, the random integer can be interpreted as a signed number in the range [−2³¹, 2³¹ − 1], and a scale factor of 2⁻³¹ can be used.
Rather than comparing U₀x_i to x_i+1 in step 3, it is possible to precompute x_i+1/x_i and compare U₀ with that directly. If U₀ is an integer random number generator, these limits may be premultiplied by 2³² so an integer comparison can be used.
With the above two changes, the table of unmodified x_i values is no longer needed and may be deleted.
When generating IEEE 754 single-precision floating point values, which only have a 24-bit mantissa, the least-significant bits of a 32-bit integer random number are not used. These bits may be used to select the layer number.
The first three steps may be put into an inline function, which can call an out-of-line implementation of the less frequently needed steps.
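
The integer-comparison optimizations above can be sketched as follows for the one-sided case. This is an illustration under assumptions: xs is assumed to hold the widths x_0 through x_n, u32 is assumed to be a uniform integer in [0, 2³² − 1], and the names integer_tables and fast_path are hypothetical.

```python
def integer_tables(xs):
    """Build integer thresholds k and scale factors w from widths xs[0..n]."""
    n = len(xs) - 1
    scale = 2.0 ** 32
    # k[i] = floor(2^32 * x_{i+1} / x_i): u32 < k[i] implies u32 * w[i] < x_{i+1}
    k = [int(scale * xs[i + 1] / xs[i]) for i in range(n)]
    # w[i] = 2^-32 * x_i turns the raw integer directly into x
    w = [xs[i] / scale for i in range(n)]
    return k, w

def fast_path(u32, i, k, w):
    """Return x if layer i accepts u32 by integer comparison alone, else None."""
    if u32 < k[i]:
        return u32 * w[i]
    return None    # the caller falls through to the slower steps

# Toy widths, chosen only so that the arithmetic is easy to follow.
k, w = integer_tables([4.0, 3.0, 2.0, 1.0, 0.0])
x = fast_path(2 ** 31, 0, k, w)   # 2^31 is below k[0] = 3 * 2^30, so x = 2.0
```

Once k and w exist, the unmodified x_i values are not consulted on the fast path, which is why the original table can be dropped.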

Generating the tables

It is possible to store the entire table precomputed, or just include the values n, y₁, A, and an implementation of f⁻¹(y) in the source code, and compute the remaining values when initializing the random number generator.
As previously described, you can find x_i = f⁻¹(y_i) and y_i+1 = y_i + A/x_i. Repeat n − 1 times for the layers of the ziggurat. At the end, you should have y_n = f(0). There will, of course, be some round-off error, but it is a useful sanity test to see that it is acceptably small.
When actually filling in the table values, just assume that x_n = 0 and y_n = f(0), and accept the slight difference in layer n − 1's area as rounding error.
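
The table-filling recurrence can be sketched in Python as below, here for the unnormalized normal density f(x) = e^(−x²/2) with n = 128. The function name build_tables, the choice of n, and the constant X1 (a value commonly quoted for 128-region normal tables, not taken from the text above) are assumptions for illustration.

```python
import math

def build_tables(n, x1, area, f, f_inv):
    """Return (x, y, closure_error) for an n-region ziggurat."""
    x = [area / f(x1), x1]    # x_0 is the fictitious base width A / y_1
    y = [0.0, f(x1)]          # the base layer spans heights 0 to y_1
    for i in range(1, n - 1):
        y.append(y[i] + area / x[i])    # y_{i+1} = y_i + A / x_i
        x.append(f_inv(y[i + 1]))       # x_{i+1} = f^-1(y_{i+1})
    # Sanity test: the computed y_n should match f(0) up to round-off.
    closure_error = y[n - 1] + area / x[n - 1] - f(0.0)
    x.append(0.0)             # then simply assume x_n = 0
    y.append(f(0.0))          # and y_n = f(0)
    return x, y, closure_error

X1 = 3.442619855899           # assumed x_1 for n = 128 (see lead-in)

def f(v):
    return math.exp(-0.5 * v * v)

def f_inv(v):
    return math.sqrt(-2.0 * math.log(v))

# A = x_1 * y_1 + t, where t is the tail area of the unnormalized density.
AREA = X1 * f(X1) + math.sqrt(math.pi / 2.0) * math.erfc(X1 / math.sqrt(2.0))
xs, ys, err = build_tables(128, X1, AREA, f, f_inv)
```

The returned closure_error is the round-off check mentioned above; a value far from zero would indicate that x₁ and A do not belong to the chosen n.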

Finding x₁ and A

Given an initial guess at x₁, you need a way to compute the area t of the tail for which x > x₁. For the exponential distribution, this is just e^(−x₁), while for the normal distribution, assuming you are using the unnormalized f(x) = e^(−x²/2), this is √(π/2) erfc(x₁/√2). For more awkward distributions, numerical integration may be required.
With this in hand, from x₁, you can find y₁ = f(x₁), the area t in the tail, and the area of the base layer A = x₁y₁ + t.
Then compute the series y_i and x_i as above. If y_i > f(0) for any i < n, then the initial estimate x₁ was too low, leading to too large an area A. If y_n < f(0), then the initial estimate x₁ was too high.
Given this, use a root-finding algorithm to find the value x₁ which produces y_n as close to f(0) as possible. Alternatively, look for the value which makes the area of the topmost layer, x_n−1(f(0) − y_n−1), as close to the desired value A as possible. This saves one evaluation of f⁻¹ and is actually the condition of greatest interest.
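
One possible way to carry out this search is plain bisection on the sign of y_n − f(0). The sketch below is illustrative only: the function names, the bracketing interval [1, 10] and the iteration count are assumptions, and bisection is just one choice of root-finding algorithm.

```python
import math

def closure_error(x1, n, f, f_inv, tail_area):
    """Return y_n - f(0) for a trial x1 (math.inf if the stack overshoots early)."""
    area = x1 * f(x1) + tail_area(x1)    # A = x_1 * y_1 + t
    x, y = x1, f(x1)
    for i in range(1, n):
        y += area / x                    # y_{i+1} = y_i + A / x_i
        if i + 1 < n:
            if y >= f(0.0):
                return math.inf          # peak reached below the top: x1 too low
            x = f_inv(y)
    return y - f(0.0)

def find_x1(n, f, f_inv, tail_area, lo, hi, iterations=100):
    """Bisect on x1, assuming closure_error is positive at lo and negative at hi."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if closure_error(mid, n, f, f_inv, tail_area) > 0.0:
            lo = mid    # area A too large, so x1 must increase
        else:
            hi = mid    # area A too small, so x1 must decrease
    return 0.5 * (lo + hi)

# Unnormalized normal density with n = 128 regions.
x1_normal = find_x1(
    128,
    lambda x: math.exp(-0.5 * x * x),
    lambda y: math.sqrt(-2.0 * math.log(y)),
    lambda x: math.sqrt(math.pi / 2.0) * math.erfc(x / math.sqrt(2.0)),
    1.0,
    10.0,
)
```

The result for the normal density should be consistent with the x₁ ≈ 3.5 figure quoted earlier for typical table sizes.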
