In this post I sketch a proof of the following theorem, following the treatment in Borevich and Shafarevich’s Number Theory. The sketch is by no means meant to be highly detailed, and I am writing it mostly for my own purposes, so I skip the proofs of some steps, even ones that aren’t that straightforward.
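Thue’s theorem: Suppose $latex f(x,y)$ is a binary form with integer coefficients which has degree $latex n\geq 3$, is irreducible (i.e. $latex f(x,1)$ is an irreducible polynomial in $latex x$) and such that $latex f(x,1)$ has at least one nonreal root in $latex \mathbb{C}$. Then for any nonzero integer $latex c$ the equation $latex f(x,y)=c$ has only finitely many integral solutions.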
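To get a feel for the statement, here is a small brute-force sketch (it is not part of the proof). The form $latex x^3-2y^3$, the right-hand side $latex 1$, the box size 200 and the helper name `thue_solutions` are all my own illustrative choices. A finite search proves nothing by itself; it only shows how sparse the solutions are. For contrast I also run the degree 2 form $latex x^2-2y^2$, which the theorem does not cover and which keeps producing solutions (Pell's equation).

```python
def thue_solutions(coeffs, c, bound):
    """Return all (x, y) with |x|, |y| <= bound and f(x, y) == c.

    coeffs = [a_0, ..., a_n] encodes f(x, y) = sum of a_i * x^(n-i) * y^i.
    Hypothetical helper, for illustration only.
    """
    n = len(coeffs) - 1
    found = []
    for x in range(-bound, bound + 1):
        for y in range(-bound, bound + 1):
            value = sum(a * x ** (n - i) * y ** i for i, a in enumerate(coeffs))
            if value == c:
                found.append((x, y))
    return found


# Degree 3, covered by the theorem: x^3 - 2y^3 = 1.
cubic = thue_solutions([1, 0, 0, -2], 1, 200)
# Degree 2, not covered: x^2 - 2y^2 = 1 (Pell's equation).
pell = thue_solutions([1, 0, -2], 1, 200)

print(cubic)
print(len(pell))
```

In this box the cubic equation has only the two solutions $latex (-1,-1)$ and $latex (1,0)$, while the Pell equation already has 14, and enlarging the box keeps adding more to the latter.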
Proof: Suppose otherwise…
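First of all, we may suppose that the coefficient $latex a_0$ of $latex x^n$ in $latex f$ is equal to $latex 1$, for otherwise we replace $latex f(x,y)$ with $latex a_0^{n-1}f(x/a_0,y)$ (and $latex c$ with $latex a_0^{n-1}c$), which still has integer coefficients. Write
$latex \displaystyle f(x,y)=(x-\theta_1 y)(x-\theta_2 y)\cdots(x-\theta_n y).$
The numbers $latex \theta_1,\dots,\theta_n$ are all conjugates of $latex \theta=\theta_1$, since we assumed $latex f$ is irreducible. It’s then easy to see
$latex \displaystyle f(x,y)=N(x-\theta y),$
where $latex N$ is the norm of the field $latex K=\mathbb{Q}(\theta)$. Also put $latex \alpha=x-\theta y$. Hence we are interested in the solutions of $latex N(\alpha)=c$, where $latex \alpha$ is in the module (i.e. the additive subgroup) $latex M$ generated by $latex 1,\theta$. Extend this two-element set to a basis $latex \mu_1=1,\mu_2=\theta,\mu_3,\dots,\mu_n$ of $latex K$ and denote by $latex \overline{M}$ the module generated by these. To recover elements of $latex M$ among these, we use the dual basis, i.e. elements $latex \mu_1^*,\dots,\mu_n^*$ such that $latex T(\mu_i\mu_j^*)=0$ for $latex i\neq j$ and $latex T(\mu_i\mu_i^*)=1$, where $latex T$ is the trace. The trace of $latex \alpha\mu_i^*$ then recovers the coefficient of $latex \mu_i$ in $latex \alpha$, hence we want
$latex \displaystyle T(\alpha\mu_i^*)=0,\quad i=3,\dots,n.$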
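The dual basis trick can be checked by hand in a small case. Below is a sketch in exact rational arithmetic, assuming the concrete field $latex \mathbb{Q}(\theta)$ with $latex \theta^3=2$ and the basis $latex 1,\theta,\theta^2$; the field and every helper name (`mul`, `trace`, `invert`) are my own choices for illustration, not anything from the book. It computes the dual basis with respect to the trace form and confirms that traces against it read off coordinates, so an element lies in the module generated by $latex 1,\theta$ exactly when its third trace vanishes.

```python
from fractions import Fraction

# Elements of K = Q(theta) with theta^3 = 2, stored as [a, b, c] = a + b*theta + c*theta^2.

def mul(u, v):
    """Multiply two elements of K, reducing with theta^3 = 2."""
    prod = [Fraction(0)] * 5
    for i in range(3):
        for j in range(3):
            prod[i + j] += u[i] * v[j]
    return [prod[0] + 2 * prod[3], prod[1] + 2 * prod[4], prod[2]]

mu = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]  # the basis 1, theta, theta^2

def trace(u):
    """Trace of the multiplication-by-u map, read off in the basis mu."""
    return sum(mul(u, mu[i])[i] for i in range(3))

def invert(matrix):
    """Invert a square matrix over Q by Gauss-Jordan elimination."""
    size = len(matrix)
    rows = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(size)]
            for i, row in enumerate(matrix)]
    for col in range(size):
        pivot = next(r for r in range(col, size) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        lead = rows[col][col]
        rows[col] = [x / lead for x in rows[col]]
        for r in range(size):
            if r != col:
                factor = rows[r][col]
                rows[r] = [x - factor * y for x, y in zip(rows[r], rows[col])]
    return [row[size:] for row in rows]

# Gram matrix of the trace form: gram[i][j] = T(mu_i * mu_j).
gram = [[trace(mul(mu[i], mu[j])) for j in range(3)] for i in range(3)]
inverse = invert(gram)
# mu_j^* = sum_i inverse[i][j] * mu_i; mu is the standard basis, so this is a column.
dual = [[inverse[i][j] for i in range(3)] for j in range(3)]

alpha = [5, -7, 0]  # alpha = 5 - 7*theta, an element of the module generated by 1, theta
coords = [trace(mul(alpha, d)) for d in dual]
print(coords)
```

Here the third coordinate of `coords` is zero precisely because `alpha` has no $latex \theta^2$ component, which is the condition the trace equations encode.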
A general result (Theorem 1, Section 5.2, Chapter 2 in Borevich-Shafarevich, slightly rephrased) about elements of fixed norm in a module states the following.
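Theorem 1: For a module $latex \overline{M}$ of rank $latex n$ in a field $latex K$ of degree $latex n$ there are elements $latex \varepsilon_1,\dots,\varepsilon_t$ and $latex \gamma_1,\dots,\gamma_k$ such that every solution $latex \alpha\in\overline{M}$ of $latex N(\alpha)=c$ can be uniquely written as
$latex \displaystyle \alpha=\gamma_j\varepsilon_1^{u_1}\cdots\varepsilon_t^{u_t},\quad u_1,\dots,u_t\in\mathbb{Z}.$
Moreover, $latex t=r+s-1$, where $latex r$ is the number of real embeddings of $latex K$ into $latex \mathbb{C}$ and $latex s$ is the number of pairs of complex conjugate embeddings.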
Therefore as above is in if it satisfies the system of equations
Since we assume there are infinitely many solving the above system, and ranges over a finite set, we can choose one of the such that infinitely many solutions of the above have . We can now write this system as
where are embeddings of into ordered so that .
So now we want to derive a contradiction from the assumption that the system above has infinitely many integer solutions.
Entering the $latex \mathfrak{p}$-adic world
The idea now is to prove that the system above not only has finitely many integral solutions, but that it has finitely many solutions in $latex \mathfrak{p}$-adic integers, where $latex \mathfrak{p}$ is some prime of $latex K$. More precisely, we take any prime $latex \mathfrak{p}$ (= prime ideal in the ring of integers) and the corresponding valuation $latex \nu$. Then we construct the completion $latex K_\mathfrak{p}$ of $latex K$ with respect to this valuation. By a “$latex \mathfrak{p}$-adic number” we mean any element of $latex K_\mathfrak{p}$, and the ones with nonnegative valuation are going to be called “$latex \mathfrak{p}$-adic integers”.
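We now want to make sense of the equations of the system when the exponents are not necessarily integers, but also $latex \mathfrak{p}$-adic integers. The problem reduces to making sense of $latex \varepsilon^u$ for $latex \varepsilon$ a fixed $latex \mathfrak{p}$-adic number and $latex u$ a $latex \mathfrak{p}$-adic integer, which is meant to vary. For this, we employ exponential and logarithmic functions: we will write $latex \varepsilon^u=\exp(u\log\varepsilon)$. $latex \exp$ and $latex \log$ are defined using their power series:
$latex \displaystyle \exp(x)=\sum_{k=0}^\infty\frac{x^k}{k!},\qquad \log(1+x)=\sum_{k=1}^\infty\frac{(-1)^{k+1}x^k}{k}.$
These two functions are each other’s inverses, that is,
$latex \displaystyle \exp(\log(1+x))=1+x,\qquad \log(\exp(x))=x.$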
There are many ways to justify this, the most straightforward one being that we know these equalities hold for complex numbers, hence they are formal equalities of power series, hence they must also hold for $latex \mathfrak{p}$-adic numbers. However, these functions are not defined everywhere. Nevertheless, they can be shown to have positive radius of convergence. More precisely:
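Lemma 1: There is a rational integer $latex \kappa$ such that both $latex \exp(x)$ and $latex \log(1+x)$ are defined for $latex \nu(x)\geq\kappa$. Moreover, $latex \nu(\log(1+x))\geq\kappa$ for such $latex x$, so $latex \exp(u\log(1+x))$ is defined for any $latex \mathfrak{p}$-adic integer $latex u$.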
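Both claims about $latex \exp$ and $latex \log$ can be sanity-checked by machine. The sketch below makes two simplifying assumptions of mine: it truncates all power series at degree 8, and for the convergence part it works in the ordinary 3-adic numbers rather than in a general completion (where ramification would change the constant in Lemma 1). The first half verifies the formal inverse identities with exact rational coefficients; the second half lists the 3-adic valuations of the terms $latex x^k/k!$ when $latex x$ has valuation 1.

```python
from fractions import Fraction
from math import factorial

N = 8  # truncation degree (arbitrary illustrative choice)

def multiply(a, b):
    """Product of two power series, truncated after degree N."""
    out = [Fraction(0)] * (N + 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j <= N:
                out[i + j] += ai * bj
    return out

def compose(outer, inner):
    """outer(inner(x)) truncated after degree N; inner must have no constant term."""
    result = [Fraction(0)] * (N + 1)
    power = [Fraction(1)] + [Fraction(0)] * N  # inner^0
    for coeff in outer:
        result = [r + coeff * p for r, p in zip(result, power)]
        power = multiply(power, inner)
    return result

exp_series = [Fraction(1, factorial(k)) for k in range(N + 1)]
log1p_series = [Fraction(0)] + [Fraction((-1) ** (k + 1), k) for k in range(1, N + 1)]
expm1_series = [Fraction(0)] + exp_series[1:]

exp_of_log = compose(exp_series, log1p_series)    # exp(log(1 + x))
log_of_exp = compose(log1p_series, expm1_series)  # log(1 + (exp(x) - 1))

def valuation(m, p):
    """p-adic valuation of a positive integer m."""
    count = 0
    while m % p == 0:
        m //= p
        count += 1
    return count

p, vx = 3, 1  # assume x has 3-adic valuation 1
# Valuation of the k-th term x^k / k! of exp(x), for k = 1, ..., 29.
exp_term_valuations = [k * vx - valuation(factorial(k), p) for k in range(1, 30)]
print(exp_term_valuations)
```

The valuations stay above $latex k/2$, so the terms tend to zero 3-adically and the series for $latex \exp(x)$ converges for such $latex x$.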
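Unfortunately, there is no reason to expect that the numbers $latex \varepsilon_i$ suit our purposes, i.e. that they are of the form $latex 1+x$ with $latex \nu(x)\geq\kappa$. However, we can change them so that this is the case. First of all, we may suppose that $latex \mathfrak{p}$ is such that all of $latex \varepsilon_i$ have valuation zero (there are finitely many of these numbers, and they have nonzero valuation only with respect to finitely many prime ideals). Now we look at reduction modulo $latex \mathfrak{p}^\kappa$ (or, more precisely, modulo any element with valuation $latex \kappa$). The quotient ring is finite, so its group of units is finite as well, say of order $latex d$. Then $latex \varepsilon_i^d$ is always congruent to $latex 1$ modulo $latex \mathfrak{p}^\kappa$, i.e. $latex \varepsilon_i^d=1+x$ for $latex x$ of valuation at least $latex \kappa$.
Moreover, we can replace the set of $latex \gamma_j$ by products of $latex \gamma_j$ and suitable powers of $latex \varepsilon_i$: writing each exponent as $latex u_i=dv_i+r_i$ with $latex 0\leq r_i<d$, we only need to multiply by powers between $latex 0$ and $latex d-1$. To avoid introducing more notation, we will just assume that $latex \varepsilon_i$, and hence also their powers, are of the form which allows us to speak of their exponential functions.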
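Here is the same reduction in the simplest possible setting, as a sketch: I assume the ordinary integers modulo $latex 5^3$ in place of the quotient ring above, and take $latex d$ to be the number of invertible residues. The prime, the exponent 3 and the sample values are arbitrary choices of mine. The code checks that every unit raised to the power $latex d$ is congruent to 1, and that an arbitrary exponent splits into a multiple of $latex d$ plus a remainder between $latex 0$ and $latex d-1$.

```python
p, kappa = 5, 3                 # illustrative prime and precision
modulus = p ** kappa            # we work in Z / 5^3
units = [a for a in range(1, modulus) if a % p != 0]
d = len(units)                  # order of the group of units of Z / 5^3

# Every unit becomes congruent to 1 after raising to the d-th power.
all_trivial = all(pow(a, d, modulus) == 1 for a in units)

# Split an exponent u as u = d*v + r with 0 <= r < d.
eps, u = 2, 1234
v, r = divmod(u, d)
lhs = pow(eps, u, modulus)
# (eps^d)^v * eps^r: the first factor is 1 + (multiple of 5^3), the second is
# one of finitely many "small" powers that can be absorbed into the gammas.
rhs = pow(pow(eps, d, modulus), v, modulus) * pow(eps, r, modulus) % modulus

print(d, all_trivial, (v, r), lhs == rhs)
```

The point of the split is that only the factor $latex (\varepsilon^d)^v$ has a varying exponent, and its base is now close enough to 1 for the logarithm to make sense.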
The exponential function on -adic numbers satisfies all the familiar properties. Thanks to this, equations can be rewritten as
where and . Note that the involved functions are all continuous functions of .
Now we use the fact that -adic integers are compact (under the topology induced by the valuation). Since we assumed has infinitely many (-adic) integral solutions, there must be a subsequence of these solutions which converges to some tuple . By continuity, this tuple constitutes another solution to . By a change of variables , we get a system of equations
where , which by above has a sequence of solutions converging to the origin. We point out at this point that the equations in are linearly independent, i.e. the matrix of coefficients has rank . This is because is the product of and , and the matrix of all is invertible, as the square of its determinant is the discriminant of a linearly independent tuple, and hence is nonzero.
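The fact used at the end of the last paragraph, that the square of the determinant of the matrix of conjugates is a discriminant, can be checked numerically in a small case. The sketch below assumes the field generated by a cube root of 2 with the basis $latex 1,\theta,\theta^2$ (my choice of example, worked in floating point, so the check is only approximate). The discriminant of $latex x^3-2$ is $latex -108$, and the squared determinant should agree with it up to rounding.

```python
import cmath

real_root = 2 ** (1 / 3)
omega = cmath.exp(2j * cmath.pi / 3)  # primitive cube root of unity
# Images of theta under the three embeddings of Q(theta) into C.
conjugates = [real_root * omega ** k for k in range(3)]

# Row k holds the images of the basis 1, theta, theta^2 under the k-th embedding.
matrix = [[1, t, t * t] for t in conjugates]

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

disc = det3(matrix) ** 2
print(disc)  # numerically close to -108
```

Since the discriminant of a basis is nonzero, so is the determinant, which is the invertibility being used.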
We consider the local analytic manifold of , i.e. the set of solutions of this system in some small neighbourhood of the origin. By assumption on the sequence of solutions converging to the origin, this manifold consists of more than one point. Hence, by a general theorem, it must contain an analytic curve: there is a system of (formal) power series, not all identically zero and all with no constant term, which satisfy the system identically when plugged in for the variables. Equivalently, if we put , we get
where are power series with no constant terms.
Finishing the proof
We have the system of equations involving (exponentials of) . However, are also linear combinations of power series. Therefore, by linear algebra, we can find a system of independent linear equations
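satisfied by these power series. We will now use the assumption we haven’t used yet: that $latex f(x,1)$ has a nonreal root. Recall this implies the field $latex K$ has at least one complex embedding, i.e. $latex s\geq 1$ (see the statement of Theorem 1). Therefore $latex t=r+s-1\leq n-2$, as $latex n=r+2s$. Combining the exponential system with these linear equations, we can therefore use the following lemma: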
Lemma 2: Suppose formal power series (over some field of characteristic zero) with no constant term satisfy a system of equations of the form
and also a system of two equations of the form
Then for some .
Before we provide a proof of this lemma, we will show why it helps us complete the proof. Recalling the definition of , this implies that any analytic curve contained in the manifold is also contained in the manifold defined by the equation
It follows (though not immediately) that . We will obtain a contradiction as soon as we deduce contains only finitely many points corresponding to the solutions of , since we assumed that contains infinitely many such points. Equivalently, since product in the definition of consists of finitely many terms, we need to show only finitely many tuples can satisfy
Let be a solution of coming from , and . We have
where is a constant independent of $\alpha$. Similarly,
Assuming , this implies
Taking to be a different such solution, this implies
and hence and ( can’t be both zero, so neither can be). It follows that is a rational multiple of , say . But recall that have the same norm, so has norm , hence it is . Therefore are equal or opposite. Hence there are only two possible values of , which is certainly a finite amount! As explained above, this gives us a contradiction.
Proof of lemma 2
Since power series satisfy independent linear equations, we can express all of them in terms of just two, say and . Put
Suppose . Then and are equal. They have constant terms equal to, respectively, since have no constant term, so and we can deduce from this (computing coefficients one-by-one) that . Hence we may assume (as otherwise we are already done). Putting we then have
and we may also assume are nonzero. Differentiation gives
Previous two equations combined give
with for . We now use the other pair of assumed equations. By subtracting suitable multiples of from them we find
If either is zero, this gives us a nontrivial linear relation between . Otherwise, subtracting suitable multiples and using independence we again get a nontrivial linear relation. In either case, we get
for not all zero. Differentiation and give us
(setting ). As we deduce
Hence we get that the rational function
vanishes when we put . But unless this function vanishes identically, this would imply is algebraic over its field of coefficients. But no nonconstant power series over a field is algebraic over that field, so this can’t be as . Thus this rational function is identically zero. This means that some two are equal (otherwise this function would have a pole as for any with ). Therefore for some .
Since , gives us
Comparing constant coefficients and then other coefficients, we get with . $\square$
The proof goes roughly as follows:
- Suppose otherwise.
- Using (a variation of) Dirichlet’s unit theorem and general results on modules, reduce the problem to showing that a certain exponential equation in many variables has only finitely many solutions.
- Generalize the context of the question to the $latex \mathfrak{p}$-adic analytic setting so that we can speak of exponentials of (some) non-rational-integers.
- Using some difficult words like “local analytic manifold” reduce (a big part of) the problem to (essentially) showing it cannot contain an analytic curve.
- Use a fancy lemma to deduce the manifold is too algebraically constrained to contain infinitely many integral points.
- Write an ultrabrief summary.
Clearly two of these steps are (arguably) the most ingenious and crucial ones: passing from a number field to its completion and then reducing the problem to an analogous problem in a functional setting (i.e. there is no formal power series blah blah). Both complete fields (more precisely called local fields) and functional questions have many times in mathematics proven themselves to be much easier to work with than number fields. The former’s advantage is mainly that we can use analytic tools (and difficult words), while in the functional setting we have an incredibly useful tool: differentiation.
You can see the simplicity of working in a functional setting e.g. in the proof of the Riemann hypothesis for function fields. In the future I will probably make more posts showcasing local methods like this one, possibly less difficult ones (or perhaps more).