Proofs of Fermat's theorem on sums of two squares
Fermat's theorem on sums of two squares asserts that an odd prime number $p$ can be expressed as
$$p = x^2 + y^2$$
with integer $x$ and $y$ if and only if $p$ is congruent to 1 modulo 4. The statement was announced by Girard in 1625, and again by Fermat in 1640, but neither supplied a proof.
The "only if" clause is easy: a perfect square is congruent to 0 or 1 modulo 4, hence a sum of two squares is congruent to 0, 1, or 2 modulo 4. An odd prime number is congruent to either 1 or 3 modulo 4, and the second possibility has just been ruled out. The first proof that such a representation exists was given by Leonhard Euler in 1747 and was complicated. Since then, many different proofs have been found. Among them are the proof using Minkowski's theorem about convex sets and Don Zagier's short proof based on involutions.
Euler's proof by infinite descent
Euler succeeded in proving Fermat's theorem on sums of two squares in 1749, when he was forty-two years old. He communicated this in a letter to Goldbach dated 12 April 1749. The proof relies on infinite descent, and is only briefly sketched in the letter. The full proof consists of five steps and is published in two papers. The first four steps are Propositions 1 to 4 of the first paper and do not correspond exactly to the four steps below. The fifth step below is from the second paper.
For the avoidance of ambiguity, zero will always be a valid possible constituent of "sums of two squares", so for example every square of an integer is trivially expressible as the sum of two squares by setting one of them to be zero.
1. The product of two numbers, each of which is a sum of two squares, is itself a sum of two squares.
2. If a number which is a sum of two squares is divisible by a prime which is a sum of two squares, then the quotient is a sum of two squares.
3. If a number which can be written as a sum of two squares is divisible by a number which is not a sum of two squares, then the quotient has a factor which is not a sum of two squares.
4. If $a$ and $b$ are relatively prime positive integers then every factor of $a^2 + b^2$ is a sum of two squares.
5. Every prime of the form $4n + 1$ is a sum of two squares.
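Steps 1 and 2 rest on explicit identities, so they can be checked numerically. The following is a minimal sketch in Python, not Euler's own presentation: the function names are illustrative, and the quotient routine assumes that $c^2 + d^2$ is prime and divides $a^2 + b^2$. It relies on the identity $(a^2+b^2)(c^2+d^2) = (ac \pm bd)^2 + (ad \mp bc)^2$.

```python
def product_as_two_squares(a, b, c, d):
    """Step 1: return (u, v) with u^2 + v^2 == (a^2 + b^2) * (c^2 + d^2)."""
    return abs(a * c - b * d), abs(a * d + b * c)


def quotient_as_two_squares(a, b, c, d):
    """Step 2: return (u, v) with u^2 + v^2 == (a^2 + b^2) / (c^2 + d^2).

    Assumes q = c^2 + d^2 is prime and divides a^2 + b^2.
    """
    q = c * c + d * d
    for s in (1, -1):
        # (ac + s*bd)^2 + (ad - s*bc)^2 == (a^2 + b^2) * q for either sign s.
        u_num = a * c + s * b * d
        v_num = a * d - s * b * c
        if u_num % q == 0 and v_num % q == 0:
            return abs(u_num) // q, abs(v_num) // q
    raise ValueError("c^2 + d^2 must be a prime dividing a^2 + b^2")
```

For instance, 65 = 8^2 + 1^2 is divisible by the prime 5 = 2^2 + 1^2, and calling `quotient_as_two_squares(8, 1, 2, 1)` produces a pair whose squares sum to 13.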
Lagrange's proof through quadratic forms
Lagrange completed a proof in 1775 based on his general theory of integral quadratic forms. The following presentation incorporates a slight simplification of his argument, due to Gauss, which appears in article 182 of the Disquisitiones Arithmeticae.
An integral binary quadratic form is an expression of the form $ax^2 + bxy + cy^2$ with $a, b, c$ integers. A number $n$ is said to be represented by the form if there exist integers $x, y$ such that $n = ax^2 + bxy + cy^2$. Fermat's theorem on sums of two squares is then equivalent to the statement that a prime $p$ is represented by the form $x^2 + y^2$ exactly when $p$ is congruent to $1$ modulo $4$.
The discriminant of the quadratic form $ax^2 + bxy + cy^2$ is defined to be $b^2 - 4ac$. The discriminant of $x^2 + y^2$ is then equal to $-4$.
Two forms $ax^2 + bxy + cy^2$ and $a'x'^2 + b'x'y' + c'y'^2$ are equivalent if and only if there exist substitutions with integer coefficients
$$x = \alpha x' + \beta y', \qquad y = \gamma x' + \delta y'$$
with $\alpha\delta - \beta\gamma = \pm 1$ such that, when substituted into the first form, yield the second. Equivalent forms are readily seen to have the same discriminant, and hence also the same parity for the middle coefficient $b$, which coincides with the parity of the discriminant. Moreover, it is clear that equivalent forms will represent exactly the same integers, because these kinds of substitutions can be reversed by substitutions of the same kind.
Lagrange proved that all positive definite forms of discriminant −4 are equivalent. Thus, to prove Fermat's theorem it is enough to find any positive definite form of discriminant −4 that represents $p$. For example, one can use a form
$$px^2 + 2mxy + \left(\frac{m^2 + 1}{p}\right)y^2,$$
where the first coefficient $a = p$ was chosen so that the form represents $p$ by setting $x = 1$ and $y = 0$, the coefficient $b = 2m$ is an arbitrary even number, and finally $c = \frac{m^2 + 1}{p}$ is chosen so that the discriminant $b^2 - 4ac = 4m^2 - 4pc$ is equal to −4, which guarantees that the form is indeed equivalent to $x^2 + y^2$. Of course, the coefficient $c$ must be an integer, so the problem is reduced to finding some integer $m$ such that $p$ divides $m^2 + 1$: or in other words, a "square root of −1 modulo $p$".
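We claim such a square root of $-1$ is given by $K = \prod_{k=1}^{(p-1)/2} k$. Firstly, it follows from Euclid's Fundamental Theorem of Arithmetic that $ab \equiv 0 \pmod{p}$ if and only if $a \equiv 0 \pmod{p}$ or $b \equiv 0 \pmod{p}$. Consequently, $a^2 \equiv 1 \pmod{p} \iff a \equiv \pm 1 \pmod{p}$: that is, $\pm 1$ are their own inverses modulo $p$, and this property is unique to them. It then follows from the validity of Euclidean division in the integers, and the fact that $p$ is prime, that for every $2 \le a \le p - 2$ the gcd of $a$ and $p$ may be expressed via the Euclidean algorithm, yielding a unique and distinct inverse $a^{-1} \ne a$ of $a$ modulo $p$. In particular, therefore, the product of all non-zero residues modulo $p$ is $-1$. Let $L = \prod_{l=(p+1)/2}^{p-1} l$: from what has just been observed, $KL \equiv -1 \pmod{p}$. But by definition, since each term in $K$ may be paired with its negative in $L$, $L \equiv (-1)^{(p-1)/2} K \pmod{p}$, which since $p$ is odd shows that $K^2 \equiv -1 \pmod{p} \iff p \equiv 1 \pmod{4}$, as required.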
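The claim about $K$ and the resulting form can be checked for small primes. The sketch below is an illustration only: the function names are hypothetical, and `form_through_p` assumes its argument is a prime congruent to 1 modulo 4. It computes $K = ((p-1)/2)!$ modulo $p$, uses it as $m$, and builds the coefficients $(p, 2m, (m^2+1)/p)$.

```python
def half_factorial(p):
    """Return K = ((p - 1) / 2)! reduced modulo the odd prime p."""
    K = 1
    for k in range(1, (p - 1) // 2 + 1):
        K = (K * k) % p
    return K


def form_through_p(p):
    """Return (a, b, c) = (p, 2m, (m^2 + 1) / p) with m = K.

    Assumes p is a prime congruent to 1 modulo 4.
    """
    m = half_factorial(p)
    if (m * m + 1) % p != 0:
        raise ValueError("K^2 is not -1 mod p; expected a prime p = 1 mod 4")
    return p, 2 * m, (m * m + 1) // p


def discriminant(a, b, c):
    """Discriminant b^2 - 4ac of the form a x^2 + b x y + c y^2."""
    return b * b - 4 * a * c
```

For $p = 13$ this gives $K = 5$ and the form $13x^2 + 10xy + 2y^2$, whose discriminant is $100 - 104 = -4$. For a prime congruent to 3 modulo 4, such as 7, the same $K$ squares to $+1$ instead.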
Dedekind's two proofs using Gaussian integers
Richard Dedekind gave at least two proofs of Fermat's theorem on sums of two squares, both using the arithmetical properties of the Gaussian integers, which are numbers of the form $a + bi$, where $a$ and $b$ are integers, and $i$ is the square root of −1. One appears in section 27 of his exposition of ideals published in 1877; the second appeared in Supplement XI to Peter Gustav Lejeune Dirichlet's Vorlesungen über Zahlentheorie, and was published in 1894.
1. First proof. If $p$ is an odd prime number, then we have $i^p = (-1)^{(p-1)/2}\, i$ in the Gaussian integers. Consequently, writing a Gaussian integer $\omega = x + iy$ with $x, y \in \mathbb{Z}$ and applying the Frobenius automorphism in $\mathbb{Z}[i]/(p)$, one finds
$$\omega^p = (x + yi)^p \equiv x^p + y^p i^p \equiv x + (-1)^{(p-1)/2}\, yi \pmod{p},$$
since the automorphism fixes the elements of $\mathbb{Z}/(p)$. In the current case, $p = 4n + 1$ for some integer $n$, and so in the above expression for $\omega^p$, the exponent $(p-1)/2$ of −1 is even. Hence the right hand side equals $\omega$, so in this case the Frobenius endomorphism of $\mathbb{Z}[i]/(p)$ is the identity.
Kummer had already established that if $f \in \{1, 2\}$ is the order of the Frobenius automorphism of $\mathbb{Z}[i]/(p)$, then the ideal $(p)$ in $\mathbb{Z}[i]$ would be a product of $2/f$ distinct prime ideals. Therefore, the ideal $(p)$ is the product of two different prime ideals in $\mathbb{Z}[i]$. Since the Gaussian integers are a Euclidean domain for the norm function $N(x + iy) = x^2 + y^2$, every ideal is principal and generated by a nonzero element of the ideal of minimal norm. Since the norm is multiplicative, the norm of a generator $\alpha$ of one of the ideal factors of $(p)$ must be a strict divisor of $N(p) = p^2$, so that we must have $p = N(\alpha) = N(a + bi) = a^2 + b^2$, which gives Fermat's theorem.
2. Second proof. This proof builds on Lagrange's result that if $p = 4n + 1$ is a prime number, then there must be an integer $m$ such that $m^2 + 1$ is divisible by $p$; it also uses the fact that the Gaussian integers are a unique factorization domain. Since $p \in \mathbb{Z}$ does not divide either of the Gaussian integers $m + i$ and $m - i$, but it does divide their product $m^2 + 1$, it follows that $p$ cannot be a prime element in the Gaussian integers. We must therefore have a nontrivial factorization of $p$ in the Gaussian integers, which in view of the norm can have only two factors, so it must be of the form $p = (x + yi)(x - yi)$ for some integers $x$ and $y$. This immediately yields that $p = x^2 + y^2$.
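The second proof can be made effective: because the Gaussian integers are a Euclidean domain, a greatest common divisor of $p$ and $m + i$ can be computed by the Euclidean algorithm, and it is a factor $x + yi$ of norm $p$. The following is a rough sketch under stated assumptions, not Dedekind's argument itself: Gaussian integers are represented as pairs `(re, im)`, all function names are illustrative, and $m$ is taken to be $((p-1)/2)!$ modulo $p$, which is a square root of −1 when $p$ is a prime congruent to 1 modulo 4.

```python
def round_div(x, n):
    """Nearest integer to x / n for a positive integer n."""
    return (2 * x + n) // (2 * n)


def gauss_mod(a, b):
    """Remainder of a modulo b in Z[i], with norm at most half the norm of b."""
    n = b[0] * b[0] + b[1] * b[1]
    # a * conj(b), whose quotient by n is a / b.
    num = (a[0] * b[0] + a[1] * b[1], a[1] * b[0] - a[0] * b[1])
    q = (round_div(num[0], n), round_div(num[1], n))
    qb = (q[0] * b[0] - q[1] * b[1], q[0] * b[1] + q[1] * b[0])
    return a[0] - qb[0], a[1] - qb[1]


def gauss_gcd(a, b):
    """Euclidean algorithm in Z[i]; the result is determined up to a unit."""
    while b != (0, 0):
        a, b = b, gauss_mod(a, b)
    return a


def two_squares_via_gcd(p):
    """Return (x, y) with x^2 + y^2 == p, assuming p is a prime = 1 mod 4."""
    m = 1
    for k in range(1, (p - 1) // 2 + 1):
        m = (m * k) % p  # m = ((p-1)/2)! mod p, a square root of -1
    x, y = gauss_gcd((p, 0), (m, 1))
    return abs(x), abs(y)
```

Computing $m$ through a factorial keeps the sketch short but takes time proportional to $p$; it is meant for small examples only.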
Proof by Minkowski's Theorem
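For $p$ congruent to $1$ mod $4$ a prime, $-1$ is a quadratic residue mod $p$ by Euler's criterion. Therefore, there exists an integer $m$ such that $p$ divides $m^2 + 1$. Let $\hat{i}, \hat{j}$ be the standard basis elements for the vector space $\mathbb{R}^2$ and set $\vec{u} = \hat{i} + m\hat{j}$ and $\vec{v} = 0\hat{i} + p\hat{j}$. Consider the lattice $S = \{a\vec{u} + b\vec{v} \mid a, b \in \mathbb{Z}\}$. If $\vec{w} = a\vec{u} + b\vec{v} = a\hat{i} + (am + bp)\hat{j} \in S$ then $\|\vec{w}\|^2 = a^2 + (am + bp)^2 \equiv a^2(1 + m^2) \equiv 0 \pmod{p}$. Thus $p$ divides $\|\vec{w}\|^2$ for any $\vec{w} \in S$.
The area of the fundamental parallelogram of the lattice is $p$. The area of the open disk, $D$, of radius $\sqrt{2p}$ centered around the origin is $2\pi p > 4p$. Furthermore, $D$ is convex and symmetrical about the origin. Therefore, by Minkowski's theorem there exists a nonzero vector $\vec{w} \in S$ such that $\vec{w} \in D$. Both $\|\vec{w}\|^2 < 2p$ and $p \mid \|\vec{w}\|^2$, so $p = \|\vec{w}\|^2$. Hence $p$ is the sum of the squares of the components of $\vec{w}$.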
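Minkowski's theorem only asserts that a short lattice vector exists, but for small primes one can find it by direct search. The sketch below is an illustration with hypothetical function names, assuming $p$ is a prime congruent to 1 modulo 4. It obtains $m$ from Euler's criterion and then, for each first coordinate $a$, takes the lattice point $(a, am + bp)$ whose second coordinate is closest to zero.

```python
from math import isqrt


def sqrt_minus_one(p):
    """Return m with m^2 = -1 (mod p), assuming p is a prime = 1 mod 4."""
    for g in range(2, p):
        # Euler's criterion: g is a non-residue iff g^((p-1)/2) = -1 (mod p).
        if pow(g, (p - 1) // 2, p) == p - 1:
            return pow(g, (p - 1) // 4, p)
    raise ValueError("no quadratic non-residue found; is p an odd prime?")


def short_lattice_vector(p):
    """Return a nonzero (a, y) in the lattice with a^2 + y^2 < 2p."""
    m = sqrt_minus_one(p)
    for a in range(1, isqrt(2 * p) + 1):
        y = (a * m) % p          # second coordinate a*m + b*p, reduced mod p
        if y > p // 2:
            y -= p               # choose b so that |y| is as small as possible
        if a * a + y * y < 2 * p:
            return a, y
    raise ValueError("no short vector found; is p a prime = 1 mod 4?")
```

Any vector returned has squared length divisible by $p$ and smaller than $2p$, so its squared length is exactly $p$.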
Zagier's "one-sentence proof"
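Let $p = 4k + 1$ be prime, let $\mathbb{N}$ denote the natural numbers, and consider the finite set $S = \{(x, y, z) \in \mathbb{N}^3 : x^2 + 4yz = p\}$ of triples of numbers.
Then $S$ has two involutions: an obvious one $(x, y, z) \mapsto (x, z, y)$ whose fixed points $(x, y, y)$ correspond to representations of $p$ as a sum of two squares, and a more complicated one,
$$(x, y, z) \mapsto \begin{cases} (x + 2z,\ z,\ y - x - z), & \text{if } x < y - z \\ (2y - x,\ y,\ x - y + z), & \text{if } y - z < x < 2y \\ (x - 2y,\ x - y + z,\ y), & \text{if } x > 2y \end{cases}$$
which has exactly one fixed point $(1, 1, k)$. Two involutions over the same finite set must have sets of fixed points with the same parity, and since the second involution has an odd number of fixed points, so does the first.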
Zero is even, so the first involution has a nonzero number of fixed points, any one of which gives a representation of $p$ as a sum of two squares, namely $p = x^2 + (2y)^2$.
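The argument can be traced on a concrete prime by enumerating the set of triples and applying both involutions. The sketch below is illustrative only: the function names are hypothetical, the natural numbers are taken to start at 1, and $p$ is assumed to be a prime congruent to 1 modulo 4.

```python
from math import isqrt


def zagier_set(p):
    """All triples (x, y, z) of positive integers with x^2 + 4yz == p."""
    triples = []
    for x in range(1, isqrt(p) + 1):
        rest = p - x * x
        if rest <= 0 or rest % 4 != 0:
            continue
        n = rest // 4            # need y * z == n
        for y in range(1, n + 1):
            if n % y == 0:
                triples.append((x, y, n // y))
    return triples


def swap(t):
    """The obvious involution (x, y, z) -> (x, z, y)."""
    x, y, z = t
    return x, z, y


def zagier_map(t):
    """The second involution; x == y - z and x == 2y cannot occur for prime p."""
    x, y, z = t
    if x < y - z:
        return x + 2 * z, z, y - x - z
    if x < 2 * y:
        return 2 * y - x, y, x - y + z
    return x - 2 * y, x - y + z, y


def fixed_points(f, triples):
    """Elements of the list left unchanged by f."""
    return [t for t in triples if f(t) == t]
```

For $p = 13$ the set consists of $(1, 1, 3)$, $(1, 3, 1)$ and $(3, 1, 1)$. The second involution fixes only $(1, 1, 3)$, and the swap fixes $(3, 1, 1)$, which corresponds to $13 = 3^2 + 2^2$.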
This proof, due to Zagier, is a simplification of an earlier proof by Heath-Brown, which in turn was inspired by a proof of Liouville. The technique of the proof is a combinatorial analogue of the topological principle that the Euler characteristics of a topological space with an involution and of its fixed-point set have the same parity and is reminiscent of the use of sign-reversing involutions in the proofs of combinatorial bijections.
This proof is equivalent to a geometric or "visual" proof using "windmill" figures, described in a Mathologer YouTube video.