Suppose we need to factorize a number, where is a non-trivial factor. A polynomial modulo, called , is used to generate a pseudorandom sequence: A starting value, say 2, is chosen, and the sequence continues as,,, etc. The sequence is related to another sequence. Since is not known beforehand, this sequence cannot be explicitly computed in the algorithm. Yet, in it lies the core idea of the algorithm. Because the number of possible values for these sequences are finite, both the sequence, which is mod, and sequence will eventually repeat, even though we do not know the latter. Assume that the sequences behave like random numbers. Due to the birthday paradox, the number of before a repetition occurs is expected to be, where is the number of possible values. So the sequence will likely repeat much earlier than the sequence. Once a sequence has a repeated value, the sequence will cycle, because each value depends only on the one before it. This structure of eventual cycling gives rise to the name "Rho algorithm", owing to similarity to the shape of the Greek character ρ when the values,, etc. are represented as nodes in a directed graph. This is detected by Floyd's cycle-finding algorithm: two nodes and are kept. In each step, one moves to the next node in the sequence and the other moves forward by two nodes. After that, it is checked whether. If it is not 1, then this implies that there is a repetition in the sequence is a divisor of other than 1. This may be itself, since the two sequences might repeat at the same time. In this case the algorithm fails, and can be repeated with a different parameter.
Algorithm
The algorithm takes as its inputs, the integer to be factored; and, a polynomial in computed modulo. In the original algorithm,, but nowadays it is more common to use. The output is either a non-trivial factor of, or failure. It performs the following steps: x ← 2 y ← 2 d ← 1 while d = 1: x ← g y ← g d ← gcd if d = n: return failure else: return d Here and corresponds to and in the section about core idea. Note that this algorithm may fail to find a nontrivial factor even when is composite. In that case, the method can be tried again, using a starting value other than 2 or a different.
Example factorization
Let and.
1
5
26
1
2
26
7474
1
3
677
871
97
97 is a non-trivial factor of 8051. Starting values other than may give the cofactor instead of 97. One extra iteration is shown above to make it clear that moves twice as fast as. Note that even after a repetition, the GCD can return to 1.
Variants
In 1980, Richard Brent published a faster variant of the rho algorithm. He used the same core ideas as Pollard but a different method of cycle detection, replacing Floyd's cycle-finding algorithm with the related Brent's cycle finding method. A further improvement was made by Pollard and Brent. They observed that if, then also for any positive integer. In particular, instead of computing at every step, it suffices to define as the product of 100 consecutive terms modulo, and then compute a single. A major speed up results as 100 steps are replaced with 99 multiplications modulo and a single. Occasionally it may cause the algorithm to fail by introducing a repeated factor, for instance when is a square. But it then suffices to go back to the previous term, where, and use the regular ρ algorithm from there.
Application
The algorithm is very fast for numbers with small factors, but slower in cases where all factors are large. The ρ algorithm's most remarkable success was the factorization of the Fermat number = 1238926361552897 * 93461639715357977769163558199606896584051237541638188580280321. The ρ algorithm was a good choice for because the prime factor = 1238926361552897 is much smaller than the other factor. The factorization took 2 hours on a UNIVAC1100/42. Brent, R. P., & Pollard, J. M.. Factorization of the Eighth Fermat Number. Mathematics of Computation, 36, 627.
The example = 10403 = 101 · 103
Here we introduce another variant, where only a single sequence is computed, and the is computed inside the loop that detects the cycle.
C code sample
The following code sample finds the factor 101 of 10403 with a starting value of = 2.
include
include
int gcd int main
The above code will show the algorithm progress as well as intermediate values. The final output line will be "factor is 101". The code will only work for small test values as overflow will occur in integer data types during the square of x.
from itertools import count from math import gcd number, x = 10403, 2 for cycle in count: y = x for i in range: x = % number factor = gcd if factor > 1: print exit
The results
In the following table the third and fourth columns contain secret information not known to the person trying to factor = 10403. They are included to show how the algorithm works. If we start with = 2 and follow the algorithm, we get the following numbers:
step
2
2
2
2
0
5
2
5
2
1
26
2
26
2
2
677
26
71
26
3
598
26
93
26
4
3903
26
65
26
5
3418
26
85
26
6
156
3418
55
85
7
3531
3418
97
85
8
5168
3418
17
85
9
3724
3418
88
85
10
978
3418
69
85
11
9812
3418
15
85
12
5983
3418
24
85
13
9970
3418
72
85
14
236
9970
34
72
15
3682
9970
46
72
16
2016
9970
97
72
17
7087
9970
17
72
18
10289
9970
88
72
19
2594
9970
69
72
20
8499
9970
15
72
21
4973
9970
24
72
22
2799
9970
72
72
23
The first repetition modulo 101 is 97 which occurs in step 17. The repetition is not detected until step 23, when. This causes to be, and a factor is found.
Complexity
If the pseudorandom number occurring in the Pollard ρ algorithm were an actual random number, it would follow that success would be achieved half the time, by the Birthday paradox in iterations. It is believed that the same analysis applies as well to the actual rho algorithm, but this is a heuristic claim, and rigorous analysis of the algorithm remains open.