Lehmer random number generator
The Lehmer random number generator, sometimes also referred to as the Park–Miller random number generator, is a type of linear congruential generator that operates in multiplicative group of integers modulo n. The general formula is:
where the modulus m is a prime number or a power of a prime number, the multiplier a is an element of high multiplicative order modulo m, and the seed X is coprime to m.
Other names are multiplicative linear congruential generator and multiplicative congruential generator .
Parameters in common use
In 1988, Park and Miller suggested a Lehmer RNG with particular parameters m = 2 − 1 = 2,147,483,647 and a = 7 = 16,807, now known as MINSTD. Although MINSTD was later criticized by Marsaglia and Sullivan, it is still in use today. Park, Miller and Stockmeyer responded to the criticism, saying:
Given the dynamic nature of the area, it is difficult for nonspecialists to make decisions about what generator to use. "Give me something I can understand, implement and port... it needn't be state-of-the-art, just make sure it's reasonably good and efficient." Our article and the associated minimal standard generator was an attempt to respond to this request. Five years later, we see no need to alter our response other than to suggest the use of the multiplier a = 48271 in place of 16807.
This revised constant is used in C++11's
minstd_rand
random number generator.The Sinclair ZX81 and its successors use the Lehmer RNG with parameters m = 2+1 = 65,537 and a = 75.
The CRAY random number generator RANF is a Lehmer RNG with the power-of-two modulus m = 2 and a = 44,485,709,377,909. The GNU Scientific Library includes several random number generators of the Lehmer form, including MINSTD, RANF, and the infamous IBM random number generator RANDU.
Choice of modulus
Most commonly, the modulus is chosen as a prime number, making the choice of a coprime seed trivial. This produces the best quality output, but introduces some implementation complexity and the range of the output is unlikely to match the desired application; converting to the desired range requires an additional multiplication.Using a modulus m which is a power of two makes for a particularly convenient computer implementation, but comes at a cost: the period is at most m/4, and the low bits have periods shorter than that. This is because the low k bits form a modulo-2 generator all by themselves; the higher-order bits never affect lower-order bits. The values X are always odd, bits 2 and 1 alternate, the low 4 bits repeat with a period of 4, and so on. Therefore, the application using these random numbers must use the most significant bits; reducing to a smaller range using a modulo operation with an even modulus will produce disastrous results.
To achieve this period, the multiplier must satisfy a ≡ ±3 and the seed X must be odd.
Using a composite modulus is possible, but the generator must be seeded with a value coprime to m, or the period will be greatly reduced. For example, a modulus of F = 2+1 might seem attractive, as the outputs can be easily mapped to a 32-bit word 0 ≤ X−1 < 2. However, a seed of X = 6700417 or any multiple would lead to an output with a period of only 640.
A more popular implementation for large periods is a combined linear congruential generator; combining several generators is equivalent to the output of a single generator whose modulus is the product of the component generators' moduli. and whose period is the least common multiple of the component periods. Although the periods will share a common divisor of 2, the moduli can be chosen so that is the only common divisor and the resultant period is /2. One example of this is the Wichmann–Hill generator.
Relation to LCG
While the Lehmer RNG can be viewed as a particular case of the linear congruential generator with, it is a special case that implies certain restrictions and properties. In particular, for the Lehmer RNG, the initial seed X must be coprime to the modulus m that is not required for LCGs in general. The choice of the modulus m and the multiplier a is also more restrictive for the Lehmer RNG. In contrast to LCG, the maximum period of the Lehmer RNG equals m−1 and it is such when m is prime and a is a primitive root modulo m.On the other hand, the discrete logarithms of X in represent a linear congruential sequence modulo the Euler totient.
Implementation
A prime modulus requires the computation of a double-width product and an explicit reduction step. If a modulus just less than a power of 2 is used, reduction modulo can be implemented more cheaply than a general double-width division using the identity.The basic reduction step divides the product into two e-bit parts, multiplies the high part by d, and adds them:. This can be followed by subtracting m until the result is in range. The number of subtractions is limited to ad/m, which can be easily limited to one if d is small and is chosen.
When the modulus is a Mersenne prime, the procedure is particularly simple. Not only is multiplication by d trivial, but the conditional subtraction can be replaced by an unconditional shift and addition. To see this, note that the algorithm guarantees that, meaning that x = 0 and x = m are both impossible. This avoids the need to consider equivalent e-bit representations of the state; only values where the high bits are non-zero need reduction.
The low e bits of the product ax cannot represent a value larger than m, and the high bits will never hold a value greater than a−1 ≤ m−2. Thus, the first reduction step produces a value at most m+a−1 ≤ 2m−2 = 2e+1−4. This is an e+1-bit number which can be greater than m, but the high half is at most 1, and if it is, the low e bits will be strictly less than m. Thus, whether the high bit is 1 or 0, a second reduction step will never overflow e bits and the sum will be the desired value.
If d > 1, conditional subtraction can also be avoided, but the procedure is more intricate. The fundamental challenge of a modulus like 232−5 lies in ensuring that we produce only one representation for values such as 1 ≡ 232−4. The solution is to temporarily add d so that the range of possible values is d through 2e−1, and reduce values larger than e bits in a way that never generates representations less than d. Finally subtracting the temporary offset produces the desired value.
Begin by assuming that we have a partially reduced value y which is bounded so 0 ≤ y < 2m = 2e+1−2d. In this case, a single offset subtraction step will produce 0 ≤ < m. To see this, consider two cases:
; 0 ≤ y < m = 2e − d
; m ≤ y < 2m
Reducing a larger product ax to less than 2m = 2e+1 − 2d can be done by one or more reduction steps without an offset.
If ad ≤ m, then one additional reduction step suffices. Since x < m, ax < am ≤ 2e, and one reduction step converts this to at most 2e − 1 + d = m + ad − 1. This is within the limit of 2m if ad − 1 < m, which is the initial assumption.
If ad > m, then it is possible for the first reduction step to produce a sum greater than 2m = 2e+1−2d, which is too large for the final reduction step. However, as long as d2 < 2e, the first reduction will produce a value in the range required for the preceding case of two reduction steps to apply.
Schrage's method
If a double-width product is not available, Schrage's method, also called the approximate factoriation method, may be used to compute, but this comes at the cost:- The modulus must be representable in a signed integer; arithmetic operations must allow a range of ±m.
- The choice of multiplier a is restricted. We require that, commonly achieved by choosing a ≤ .
- One division per iteration is required.
To use Schrage's method, first factor, i.e. precompute the auxiliary constants and = /a. Then, each iteration, compute.
This equality holds because
so if we factor x = + q, we get:
The reason it does not overflow is that both terms are less than m. Since x mod q < q ≤ m/a, the first term is strictly less than am/a = m and may be computed with a single-width product.
If a is chosen so that r ≤ q, then the second term is also less than m: r ≤ rx/q = x ≤ x = x < m. Thus, the difference lies in the range and can be reduced to with a single conditional add.
This technique may be extended to allow a negative r, changing the final reduction to a conditional subtract.
The technique may also be extended to allow larger a by applying it recursively. Of the two terms subtracted to produce the final result, only the second risks overflow. But this is itself a modular multiplication by a compile-time constant r, and may be implemented by the same technique. Because each step, on average, halves the size of the multiplier, this would appear to require one step per bit and be spectacularly inefficient. However, each step also divides x by an ever-increasing quotient, and quickly a point is reached where the argument is 0 and the recursion may be terminated.
Sample C99 code
Using C code, the Park-Miller RNG can be written as follows:This function can be called repeatedly to generate pseudorandom numbers, as long as the caller is careful to initialize the state to any number greater than zero and less than the modulus. In this implementation, 64-bit arithmetic is required; otherwise, the product of two 32-bit integers may overflow.
To avoid the 64-bit division, do the reduction by hand:
To use only 32-bit arithmetic, use Schrage's method:
or use two 16×16-bit multiplies:
Another popular Lehmer generator uses the prime modulus 2−5:
This can also be written without a 64-bit division:
Many other Lehmer generators have good properties. The following modulo-2128 Lehmer generator requires 128-bit support from the compiler and uses a multiplier computed by L'Ecuyer. It has a period of 2126:
The generator computes an odd 128-bit value and returns its upper 64 bits.
This generator passes BigCrush from TestU01, but fails the TMFn test from . That test has been designed to catch exactly the defect of this type of generator: since the modulus is a power of 2, the period of the lowest bit in the output is only 262, rather than 2126. Linear congruential generators with a power-of-2 modulus have a similar behavior.
The following core routine improves upon the speed of the above code for integer workloads :
However, because the multiplication is deferred, it is not suitable for hashing, since the first call simply returns the upper 64 bits of the seed state.