Lecture Notes in Mathematics Editors: A. Dold, Heidelberg B. Eckmann, Ztirich F. Takens, Groningen
1467
Wolfgang M. Schmidt
Diophantine Approximations and Diophantine Equations
Springer-Verlag Berlin Heidelberg NewYork London Paris Tokyo Hong Kong Barcelona Budapest
Author Wolfgang M. Schmidt Department of Mathematics. University of Colorado Boulder, Colorado, 80309-0426. USA
Mathematics Subject Classification (1991): 11J68, 11J69, 11057, 11D61
ISBN 3-540-54058-X Springer-Verlag Berlin Heidelberg New York ISBN 0-387-54058-X Springer-Verlag New York Berlin Heidelberg This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, reproduction on microfilms or in other ways, and storage in data banks. Duplication of this publication or parts thereof is only permitted under the provisions of the German Copyright Law of September 9, 1965, in its current version, and a copyright fee must always be paid. Violations fall under the prosecution act of the German Copyright Law. © Springer-Verlag Berlin Heidelberg 1991 Printed in Germany Typesetting: Camera ready by author Printing and binding: Druckhaus Beltz, Hemsbach/Bergstr. 2146/3140-543210 - Printed on acid-free paper
Preface The present notes are the outcome of lectures I gave at Columbia University in the fall of 1987, and at the University of Colorado 1988//1989. Although there is necessarily some overlap with my earlier Lecture Notes on Diophantine Approximation (Springer Lecture Notes 785, 1980), this overlap is small. In general, whereas in the earlier Notes I gave a systematic exposition with all the proofs, the present notes present a varirety of topics, and sometimes quote from the literature wihtout giving proofs. Nevertheless, I believe that the pace is again leisurely. Chapter I contains a fairly thorough discussion of Siegel's Lemma and of heights. Chapter II is devoted to Roth's Theorem. Rather than Roth's Lemma, I use a generalization of Dyson's Lemma as given by Esnault and Viehweg. A proof of this generalized lemma is not given; it is beyond the scope of the present notes. An advantage of the lemma is that it leads to new bounds on the number of exceptional approximations in Roth's Theorem, as given recently by Bombieri and Van der Poorten. These bounds turn out to be best possible in some sense. Chapter III deals with the Thue equation. Among the recent developments are bounds by Bombieri and author on the number of solutions of such equations, and by Mueller and the author on the number of solutions of Thue equations with few nonzero coefficients, say s such coefficients (apart from the constant term). I give a proof of the former, but deal with the latter only up to s = 3, i.e., to trinomial Thue equations. Chapter IV is about S-unit equations and hyperelliptic equations. S-unit equations include equations such as 2 ~ + 3 y = 4 z. I present Evertse's remarkable bounds for such equations. As for elliptic and hyperelliptic equations, I mention a few basic facts, often without proofs, and proceed to counting the number of solutions as in recent works of Evertse, and of Silverman, where the connection with the Mordell-Weil rank is explored. Chapter V is on certain diophantine equations in more than two variables. A tool here is my Subspace Theorem, of which I quote several versions, but without proofs. I study generalized S-unit equations, such as, e.g. -4-a~1 -4- a~ 2 + ... -4- a~" = 0 with given integers ai > 1, as well as norm form equations. Recent advances permit to give explicit estimates on the number of solutions. The notes end with an Epilogue on the abc-conjecture of Oesterl$ and Masser. Hand written notes of my lectures were taken at Columbia University by Mr. Agboola, and at the University of Colorado by MS. Deanna Caveny. The manuscript was typed by Ms. Andrea Hennessy and Ms. Elizabeth Stimmel. My thanks are due to them. January 1991
Wolfgang M. Schmidt
Table of Contents
Chapter
Page
I. S i e g e l ' s L e m m a s a n d H e i g h t s . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1. 2. 3. 4. 5. 6. 7. 8. 9.
Siegel's L e m m a . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . G e o m e t r y of N u m b e r s . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Lattice Packings ....................................................... Siegel's Lemma Again .................................................. Grassman Algebra .................................................... Absolute Values ....................................................... H e i g h t s in N u m b e r F i e l d s . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Heights of Subspaces .................................................. A n o t h e r V e r s i o n o f Siegel's L e m m a . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
II. D i o p h a n t i n e A p p r o x i m a t i o n . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1. 2. 3. 4. 5. 6. 7. 8. 9. 10.
Dirichlet's Theorem and Liouvilte's Theorem .......................... Roth's Theorem ....................................................... Construction of a Polynomial .......................................... U p p e r B o u n d s for t h e I n d e x . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . E s t i m a t i o n of V o l u m e s . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A V e r s i o n of R o t h ' s T h e o r e m . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . P r o o f of t h e M a i n T h e o r e m s . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Counting Good Rational Approximations .............................. T h e N u m b e r of G o o d A p p r o x i m a t i o n s t o A l g e b r a i c N u m b e r s . . . . . . . . . . A G e n e r a l i z a t i o n of R o t h ' s T h e o r e m . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
III. T h e T h u e E q u a t i o n . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1. M a i n R e s u l t . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2. P r e l i m i n a r i e s . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3. M o r e o n t h e c o n n e c t i o n b e t w e e n T h u e ' s E q u a t i o n a n d Diophantine Approximation ....................................... 4. L a r g e S o l u t i o n s . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5. S m a l l S o l u t i o n s . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6. H o w t o G o F r o m F ( x ) = 1 to F ( x ) = m . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7. T h u e E q u a t i o n s w i t h F e w Coefficients . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8. T h e D i s t r i b u t i o n of t h e R o o t s of S p a r s e P o l y n o m i a l s . . . . . . . . . . . . . . . . . . 9. T h e A n g u l a r D i s t r i b u t i o n of R o o t s . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10. O n T r i n o m i a l s . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11. R o o t s of f close to ~ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12. P r o o f o f 7 A . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13. G e n e r a l i z a t i o n s of t h e T h u e E q u a t i o n . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
1 1 3 7 9 11 18 22 28 32 34 34 38 42 45 48 49 52 57 63 69 73 73 76 83 85 86 91 99 100 106 111
116 119 124
Viii Table o f C o n t e n t s (cont.) Chapter
Page
IV. S-unit Equations and Hyperelliptic Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1. 2. 3. 4. 5. 6. 7. 8. 9. 10. 11. 12.
S-unit Equations and Hyperelliptic Equations . . . . . . . . . . . . . . . . . . . . . . . . Evertse's Bound . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . More on S-unit Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Elliptic, Hyperelliptic, and Superelliptic Equations . . . . . . . . . . . . . . . . . . . . The N u m b e r of Solutions of Elliptic, Hyperelliptic, and Superelliptic Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . On Elliptic Curves . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . The Rank of Cubic Thue Curves . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Lower Bounds for the Number of Solutions of Cubic Thue Equations .. Upper Bounds for Rational Points on Certain Elliptic Equations in terms of the Mordell-Weil Rank . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Isogenies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Upper Bounds on Cubic Thue Equations in Terms of the Mordell-Weil Rank . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . More General Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
V. Diophantine Equations in More T h a n Two Variables . . . . . . . . . . . . . . . . . . . . . . 1. 2. 3. 4. 5. 6. 7. 8. 9. 10. Epilogue.
The Subspace Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Generalized S-unit Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Norm Form Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A Reduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . An Application of the Geometry of Numbers . . . . . . . . . . . . . . . . . . . . . . . . . P r o d u c t s of Linear Forms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A Generalized Gap Principle . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Small Solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Large Solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . P r o o f of Theorem 3B . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . The a b c - c o n j e c t u r e
................................................
127 127 129 134 137 142 147 159 163 165 169 173 175 176 176 180 182 186 188 192 193 199 200 203 205
Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
209
Index of Symbols . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
216
I. Siegel's L e m m a a n d H e i g h t s §1. S i e g e l ' s L e m m a . Consider a system of homogeneous linear equations a l l x l + " " + al,~xn = 0
•
(1.1)
amlX 1 _.}_. . . qt_ amnXn = 0
If m < n and the coefficients lie in a field, then there is a nontrivial solution with components in the field. If m < n and the coefficients lie in Z (the integers), then there is a nontrivial solution in integers. (Just take a solution with rational components and multiply by the common denominator.) It is reasonable to believe that if the coefficients are small integers, then there will also be a solution in small integers. This idea was used by A. Thue (1909) and formalized by Siegel in (1929; on p. 213 of his Collected Works). L E M M A 1. S u p p o s e that in (1.1) the coet~cients aij lie in Z and have [alj[ < A (1 <__i , j < n ) where A is n a t u r a l T h e n there is a nontrivial solution in Z with Ixil < 1 + ( n A ) m / ( " - m )
(i = 1 , . . . , n ) .
P r o o f . We follow Siegel. Let H be an integer parameter to be specified later. Let C be the cube consisting of points _•
= (Xl,...
, xn)
with
Ix l =< H
(4 = 1 , . . . ,
n).
There are (2H + 1) n integer points in this cube, since there are 2 H + 1 possibilities for each coordinate. Let T be the linear map R n --* ~ m with T_x ----(allX 1 "[- " " + alnXn,. .. , a m l x l W " " T a m n X n ) . Writing T x = y = (Yl,.-. , ym), we observe that the image of C lies in the cube C' :
[yj] = < nAg
(j = 1 , . . . , m )
of R m. T h e n u m b e r of integer points in C' is (2nAH- + 1) ~. Suppose now that ( 2 n A g + 1) m < (2H + 1)'.
(1.2)
T h e n T restricted to the integers in the original cube will not be one-to-one. So there exist x ' , x " in C, with x' ~ x", such that T x ' = T x " . P u t x = x I - x". T h e n T x = O. So _x is a solution to the system with integer coordinates. Note that [x~I =< 2 / ~ (i =
Typeset by .AA4S-TEX
1 , . . . ,n), bec=use Ix~l, Ix~l < H (i = 1 , . . . ,n), so Ix~l = Ix: - ~71 < I~:1 + Ix"l _-< 2H. Choose H to be the natural number satisfying ( n A ) m l C n ~ m ~ <= 2 H I ~ ( n A ) m / ( n - m ) + 1.
Then
(2H + 1) n = (2H + 1)m(2H + 1) n - m ~__ (2H + 1 ) m ( n A ) m > ( 2 n A H + 1) m.
So there exists an x= satisfying Ixil < 2 H < 1 + ( n A ) m / ( " - m ) . So the proof of Siegers L e m m a uses the box principle. Can the exponent be improved? The answer is "no". Siegel's L e m m a is almost best possible. Put k = n - m , and for large P pick distinct primes p~j (i < i < k, 1 < j < m) with P r / < pij < P , where 0 < 77 < 1 is given. P u t
Pj
= PljV2j'"Pkj
Qi = p l a p i 2 " " p i m
(1
= < j
< m),
=
(1 < i < k),
P,j = P~lpis Qij = Q i l p i j
(1 < i _5
=< m ) .
Consider the system of m equations in n variables: PllXl
-~- " ' " -~-
PklXk
--
PlY1
P12xl + "'" + Pk2zk Plmxl +""
=
-P2Y2 -Pmym
+ PkmZk
0
= 0 = O.
T h e m a x i m u m modulus A of its coefficients has A < pk. The following are solution vectors: z =(QI,0,.. ,O, Q i i , Q 1 2 , . . . , Q l m ) ----1
z k = (0,0,... ,Qk,Qkl,Qki,...
,Qkm).
It is clear that every solution of our system of equations is a linear combination cl_z1 + . . . -4- ck___zk. For integer solutions, ci is necessarily a rational n u m b e r whose denominator is Qi, and then every component of ciz/ has denominator Qi. Moreover, if, say, cl is not integral, then (since Q l l , Q12,... , Qlm are coprime), Cl__Z1 is not integral and has some component whose denominator is a prime Plj. But since c2_~,... , ckz__k don't have P l j in the denominator, cl_zI + c2z 2 + -.. + ckz k cannot be i n t e g r a l - - a contradiction. Therefore the integer solutions are z = ClZ1 + .-- + ck_zk with c l , . . . , ck in Z. When z # _0, say Cl # 0, the first component xl of_z has Ixll > Q1 > (r~P) m = r f " P m > r f " A m/k = r f ~ A ' ' / ( " - ' ~ ) .
Therefore every integer solution ( X l , . . . , xk, yl , . . . , Ym ) # ( 0 , . . . , O) has
max(Ix, l,..., Ixkl)
>
~TmA'~1("-').
3
Here 77 may be taken arbitrarily close to 1. Another approach is as follows. When m = n - 1, consider the system of equations A x i - Xi+l = 0
(i = 1 , . . . , n - 1).
Every nontriviM solution, in fact every nontriviM complex solution, has x , / x l Thus if we set q(x_) = max Ix,/x¢ t,
= A ~-1.
with the maximum over i, j in i < i, j __
A m/<'-m) E x e r c i s e l a . Suppose now that m = 1. For large A, construct an equation al Xl 4 - ' "
+ a.xn = 0
with integral coefficients and la~l =< A (i = 1 , . . . , n), such that every nontrivial solution x with complex components has q(x_) >= c ( n ) A 11("-1) = c , ( n , m ) A ml('~-'O > O.
This approach can be carried out for general n, m. See Schmidt (1985). §2. G e o m e t r y o f N u m b e r s . The subject was founded by Minkowski (1896 ~¢ 1910). Other references are Cassels (1959), Gruber and Lekkerkerker (1987), and Schmidt (1980, Chapter IV). A lattice A is a subgroup of R " which is generated by n linearly independent vectors __bl,... ,b n (linearly independent over Rn). The elements of this lattice are clb 1 + .-. + c , b with Ci E Z .
////////
/
J
/
-////)// -
~
The set b l , . . , ' =bn is called a basis. A basis is not uniquely determined. For example, bl,b I + b2,=b3,... , b is another basis. How unique is a basis? Suppose __b'l, b' is another basis. Then n
6' Z
=i =
j=l
cijb j
and
cij C Z
and
n
b = E cijb ' ~ and =j
ctjiEZ.
i=1
So the matrices (cij) and (c~i) are inverse to each other and cij, c~i E Z, so det (c/j) = det (c~i) = +1. Thus the matrix (cij) is unimodular, where by definition a unimodular matrix is a square matrix with integer entries and determinant 1 or - 1 . L E M M A 2A. A necessary and sumcient condition/'or a subset A of R n to be a lattice is the foI1owing: (i) A is a group under addition. (ii) A contains n lineaxly independent vectors. (iii) A is discrete. For a proof, see e.g., Schmidt (1980, Ch. IV, Theorem 8A). Consider R n with the Euclidean metric and A a lattice with =ba,... ,b n as basis. Let II be the set of linear combinations Alb 1 + ..- + A,b n with 0 =< A~ < 1 (i = 1 , . . . , n). T h e n H is called a fundamental parallelepiped of A.
h, T h e fundamental parallelepiped does depend on which basis is chosen. T h e volume of II is given by V(H) = [det ( b l , . . . , bn) I where the right-hand side involves the matrix whose rows are respectively made up of the coordinates of b l , . . . ,b n with respect to an orthonormal bases of ~n. This volume is independent of the chosen basis of the lattice, since different bases are connected by unimodular transformations. It is an invariant of the lattice. We define det A = V(II). Notice that when __--| b. = (hi1,... , bin), then
V 2 = det
bll b21 .
\ bnl
b12 b22
.."'"
bn2 ... bib
bin ~ b2n I bnn
)
bib 2
det
iz bll b21 "" [ b12 b22 ""
/
.
\ bin .-.
b2n
° • ,
bnl "~ bn2 ) bnl.
l
=bibn )
= det
where the inner product of vectors x_, y is denoted by x y.
(2.1)
Every x_ in R " may uniquely be written as x = x' + x__"where x_' E H and x_" E A. X_ =
~ib i = i=1
{ ~ i } b i 71i=1
[~ilb i . i=l
~"n
eA
Here we used the notation that uniquely = [~] + {~} where [~] is an integer, called the integer part of ~, and {~} satisfies 0 < {~} < 1 and is called the fractional part of ~. i A
Z " is a lattice with basis ___el,... ,_e where___ei = ( 0 , . . . , 0, 1, 0 , . . . , 0), (i = 1 , . . . , n), and with det Z " = 1. If A is an arbitrary lattice with basis b l , . . . , b , then there exists a linear transformation T such that Te__i = b i (i = 1 , . . . ,n). So T Z " = A. Is T unique? Suppose T Z " = T ' Z " . Then ( T ' ) - I T Z " = Z", so det ( ( T ' ) - ~ T ) = 4-1 and ( T I ) - I T is unimodular. Call it U. Then T = TIU. Observe that det A = l d e t
T I.
T H E O R E M 2B. (Minkowski's First Theorem on Convex Sets.) Let B C R " be a convex set which is symmetric about the origin (i.e., x_ 6 B if and only if -x_ 6 B ) of volume
V(B) >
2" aet A
(2.2)
where A is a lattice. Then B contains a non-zero lattice point. C o m m e n t s . The volume here is the Jordan volume, i.e., the Riemann integral over the characteristic function of the set. Every bounded convex set has such a volume. Let g_ E B be a non-zero lattice point in B. Then - g ~ 0 and -g_ E B by symmetry, so B contains actually at least two non-zero lattice p~nts. P r o o f . (Mordell, 1934). V ( B ) / d e t h is invariant under non-singular linear maps. Therefore, after applying a linear transformation, we may assume that A = Z". Then the theorem reduces to: If V ( B ) > 2 n, then B contains a non-zero integer point. Let Bm be the set of rational points in B with common denominator m. Then IS~l
, V(B)
as
m--* oo
Tt~ n
where [ [ denotes the cardinality. For m sufficiently large, IBm[/m" > 2" and thus [B,n[ > (2m)". So there are two points a_ = ( a l / m , . . . , a , / m ) , b = ( b l / m , . . . , b a l m ) in Bm with ai - bi ( m o d 2 m ) (i = 1 , . . . ,n). Then 1
- b)
B
since the m i d p o i n t of___aa n d -=b is ½( a - b) E B. Let _g = ½(a - _b). Clearly g is a n o n - z e r o integer point. E x e r c i s e 2a. If B is s y m m e t r i c , convex, a n d V(B) > 2'~k det A w h e r e k E N, then B contains at least k pairs of non-zero lattice points. A convex body is a c o m p a c t , convex set containing 0 as an interior point. Such a b o d y clearly has 0 < V(B) < oo. R e m a r k 2 C . If B is a s y m m e t r i c , convex b o d y a n d V(B) > 2 n det A, t h e n B contains a n o n - z e r o lattice point. It is easy to show t h a t 2C follows f r o m 2B, a n d vice versa. R e m a r k 2 D . T h e o r e m 2B is best possible. Take A = Z n, B the cube with [xi[ < 1, (i = 1 , . . . , n). T h e n V(B) = 2" = 2 n det A a n d there are no n o n - z e r o integer points in B. Minkowski defines successive minima as follows: Given B, A where B is s y m m e t r i c , convex, b o u n d e d , a n d with 0 in its interior, let hi = inf {h : h B contains a non-zero lattice p o i n t ) . * M o r e generally, for 1 < j < n, Aj -- inf{h : h B contains j linearly i n d e p e n d e n t lattice p o i n t s ) . T h e n 0
< A 2 < A 3 < ...
< A,
Here hi > 0 since B is b o u n d e d a n d h,, < oo since _0 is an interior point. THEOREM
2 E . (Minkowski's Second T h e o r e m on Convex Bodies.) 2" n! det A < h l - - ' h n V ( B )
< 2" det A.
(2.3)
Example. Let n = 2, A - - Z 2 a n d B the rectangle [x,[ =< k, Ix2[ =< 1/k where k > 1. T h e n hi = 1/k, h2 = k a n d h l h 2 V ( B ) = V(B) = 4 = 22 det A. A p r o o f wii1 not b e given here. T h e r e is no really simple p r o o f of the u p p e r b o u n d in (2.3). A weaker b o u n d is p r o v e d in S c h m i d t (1980, p. 88). R e m a r k 2 F . T h e c o n s t a n t s 2n/n! a n d 2" are best possible. Let A = Z " a n d B the cube [xi[ < 1. T h e n V(B) = 2 n. Now __el,... ,__e are on the b o u n d a r y , so A1 . . . . . An = 1. We have A 1 - ' . . X n V ( B ) = 2 n and 2 n d e t A = 2". So we get equality on the r i g h t - h a n d side of (2.3). Next, let A = Z " and B the " o c t a h e d r o n " [ X l [ + ' " + Ix,[ < 1. It m a y be seen t h a t V(B) = 2"/n!. We have again A1 = A2 . . . . . A, = 1. T h u s ( 2 n / n ! ) det A = 2n/n! a n d Am... A , V ( B ) = 2"/n!. We have equality on the left-hand side of (2.3). Note t h a t AI'V(B ) < 2 n det A (2.4) * )~B is the set of points )~_xwith x_ E B.
since A1 < A2 =< ... =< A,. Suppose now that V(B) > 2 n det A. Then A1 < 1 so that B = 1B D AIB contains a non-zero lattice point. Therefore (2.4) implies 2B. It is easily seen that (2.4) and 2B are equivalent. Exercise 2b. Suppose B is a symmetric, convex set, A a lattice, )~1 the first minimum. Given/~ > 0, the number of lattice points in #B is < ((2#/A1) + 1) n. §3. L a t t i c e Packings. A good reference for general packing and covering problems is C. A. Rogers (1964). Let B C R" be compact and measurable. Given a lattice A, write B + A = {b +_~ :
b=eB,~=eA}.
If the translates of B by vectors of A are disjoint (as in the picture), then we call
B + A a lattice packing of B. Having disjoint translates is equivalent to having unique representations of the vectors of B + A as _b+ _L The density of such a lattice packing is 6(B, A) = lim Y(h + B, r)
(3.1)
r-0
where V(r) is the volume of a ball of radius r, and V(A + B, r) is the volume of the intersection of A + B with the ball of radius r and center 0. Exercise 3a. Show that the limit (3.1) always exists under our hypotheses, and that (with II a fundamental parallelepiped) 6(B, A) = V(II f3 (h + B))/V(H) = V(B)/det A. We define 6(B) =
sup A+B
6(B, A).
A a lattice
packing
This is invariant under nonsingular linear transformations T, since 5(TB, TA) = 6(B, A). Suppose B is convex and symmetric about 0. Suppose B contains no non-zero lattice point.
C l a i m : T h e n A + 71B is a lattice packing.
Justification: Suppose :eI + lb2=1= =~2+ Ib2 where =tl, ~ E A, =b1, _~ E B. Then by an argument used above, !b2=1- !b2=2is in B N A . But !b2=l- !b2--2= =e2-gl= and B contains no non-zero lattice points, so that ~2-~1 = 0, _t2 = ~1' hence b 1 = b_2. Therefore A + 1 B is indeed a lattice packing.
V(}B) 5( B , A ) - -
det A
V(B) < 5 ( 1 B ) = 5 ( B ) 2n det A =
-9
<1 =
and therefore
v(B) _5 2" (B)det A. T H E O R E M 3A. If B is convex and symmetric about 0 and V ( B) > 2 " 5 ( B ) det A, then B contains a non-zero lattice point. In particular, this happens when V ( B ) > 2 n det A. So 3A is a strengthening of Minkowski's T h e o r e m 2B. R e m a r k 3B. For B convex, symmetric about '_0, one can show that 5(B) < 1 except for certain polyhedra. E.g., the cube has density 5 = 1. So do regular hexagons in the plane.
Let 5. = 5(B) where B is a ball in R". Consider the following picture.
The "triangle" lattice A has det A = 1 V~. It is easy to guess that
det A
This had already been proved implicitly by Gauss in his theory of positive definite binary quadratic forms. We know the values of ~2, g3,-. • , 68; see Cassels (1959, Appendix) and Exercise 3b below. The estimation of 6n for large n remains among the central unsolved problems in the Geometry of Numbers. Blichfeldt (1929) proved that ~, < 2-n/2(1 + ~). Also, ~,~ > (½ - e) n if n > n0(e). See Cassels (1959, p. 249). More recently, G.A. Kabatjanskii and V.I. Leven~tein (1978) have shown that ~, -< 2 -°'599"(1-~) for n > One may in a fairly obvious way define a general (not necessarily lattice) packing of a set B, and the maximum general packing density. For a disk in R 2, Thue (1892) had shown that the maximum packing density is in fact achieved for a lattice-packing. It is not known whether a similar result holds for a ball in R 3. It is generally believed that the densest packing density of a ball in R " where n is sufficiently large is less than the smallest lattice packing density. Now V(B)A~ < 2 ~ det A~(B), so that A1 < 2(det A)I/n(~(B)/V(B)) a/". For the unit ball B, V(B) = Y(n) = ~rn/2/F(1 + 2), so that by Stirling's formula we have the asymptotic relation v(nU"
= V(BU"
~
-* oo. n
We define Hermite's constant 7- to be least such that for any lattice A
ha =<71/2(det a)'/" where A1 =/~1 (unit ball). So <
7.=
4~/n y(
n
)z/.
< 2n
= - 7r
if
n>2.
We have l i m ( 7 , / n ) =< ~4 = yg, 2 by using the trivial estimate ~ =< 1. If instead we use Blichfeldt's estimate, we obtain lim(7~/n) =< ~ee" 1 The result of Kabatjanskii and Leven~tein quoted above yields lim(%~/n) = 2-°'agV(Trc) -1. E x e r c i s e 3b. Show that 7n = 4 ~ l " / V ( n ) 21". E x e r c i s e 3c. Let Q(X) = ~i~,j=l aijXiXj be a positive definite quadratic form with real coefficients aij = aji. Then there exists a non-zero integer point x_ with Q(x=) < 7 , ( d e t Q)I/,. Moreover, 7n is least with this property. §4. Siegel's L e m m a Again.
A rational subspace of •n or C n is a subspace spanned by vectors with rational coordinates. A rational subspace S k of dimension k is spanned by k vectors a l , . . . , a k E Qn. Such a space S k may be defined by n - k linear homogeneous equations with rational coefficients. The integer points in a subspace S k form a set A which is a lattice of S t by Lemma 2A.
10 T h e height of S k is defined by
H ( S k) = det A. An integer point a = ( a l , . . . , a,,) is called primitive if g c d ( a x , . . . , a,,) = 1. T h e n a is "closest to the origin," i.e., there is no integer point on the line segment between _0 and a. Either a or - a is a basis of the 1-dimensional lattice of integer points of the space S 1 spanned by a. Thus H ( S 1) = lal.
By the definition of Hermite's constant, there is on S k an integer point x ~ 0 with
"I~/2H(Sk) 1/k
<
L E M M A 4A. Consider the unit cube C in R n, i.e., Ixil < 1 (i = 1,... ,n). Let S k be a k-dlmensionM subspace. Then C ¢q S k has k-dimensional volume > 2 k. This result, due to 3. Vaaler (1979), will not be proved here. Let S k be a rational subspace, A the lattice of integer points associated with S k, i.e., h = A(Sk). Let B = C gl S k. By Minkowski's Theorem 2C, A~V(B) < 2 k det A. Now V ( B ) > 2 k so that A~2k =< 2 ~ det A, Ak < det A = H(Sk), ,h < H(Sk)I/k.
Recall that Ix[ was the Euclidean norm. Let
U = m x(Ix11,..., Ix,I) be the m a x i m u m norm. Our results may be summarized in LEMMA
4B. Given a rational subspace S k there is an integer point x_ ~ 0 on S k
with
< "1~/2H(S) l/k.
=
Also, there is such a point x=1 with I~'l
<
H(S)I/k.
When S k is a rational subspace, then (Sk) -1- is a rational subspace of dimension
L E M M A 4 0 . H ( S a') = H(S). T h e proof is postponed to the next section (see Corollary 5J). To make the lemma correct for S O and S " = R", we set H ( S °) = 1.
11
Let us go b a c k t o a s y s t e m of linear equations = 0
allXl +'''+alnXn
a'lXl
" ~ " " • -}- a ' n X n
=
0
w h e r e 0 < m < n, aii E Z. If these equations are i n d e p e n d e n t , t h e y define a r a t i o n a l s u b s p a c e S k of d i m e n s i o n k = n - m. A n d ( S k ) ± is s p a n n e d b y the row vectors a l , . . . , a m where___ai = ( a i l , . . . ,ain). T h e n H ( S k) = H ( S k±) < la__ll...la_r.l. T h e last inequality occurs b e c a u s e a l , . . . , a m can be w r i t t e n as linear c o m b i n a t i o n s of basis vectors for the lattice, so t h a t det l a l , . . . , a I is an integer m u l t i p l e of the d e t e r m i n a n t of a basis, t T h u s det(A(Sk±))
< I d e t ( a ~ , -.- , a ' ) l
<
I%1.-I%1
b y H a d a m a x d ' s inequality, which is a consequence of L e m m a 5E below. LEMMA 4 D . (Siegel's L e m m a ) Given the s y s t e m of equations above, (i) there is a non-trivial integer solution x_ with
= 7._r.u~,~..-l~r.l) 1/¢"-'> 5
n
( v ~ A) m/("-'O
if[a~i[ < A for e v e r y i , j . (ii) Also, there is a non-trivial integer solution z_1 with
I~=11 =< (la11...1_%1)1/("-r")
=< (v~A)'/("-').
In the first inequality we used t h a t 7 - - " < ~ ( n - m ) < -2n,~ if n - m => 2, a n d 7 - - - , = 1 < 3-n if n - m = 1, so t h a t n > 2. It is clear t h a t we do not have to restrict to t h e case w h e n the m equations are i n d e p e n d e n t . R e m a r k 4 E . If Minkowski's Second T h e o r e m (2E) is used, (ii) can be s t r e n g t h e n e d to get the following: there are n - m linearly independent solutions x l , . . . ,X__n_r. of our s y s t e m of equations such that
)~1 )1~1 - -. Ix=._ " r _5 I~1 I - " )~" ). T h e first assertion can be s t r e n g t h e n e d in the s a m e way, b u t this is not so obvious.
§5. G r a s s m a n
Algebra.
) W e t h i n k of a__x,... , a m as v e c t o r s w i t h m c o m p o n e n t s in t e r m s of a n o r t h o n o r m a l coordinate s y s t e m in S k±
12 (Also presented in Schmidt (1980), Ch. IV.) Notation: Let K be a field, K " a vector space, and __ei = ( 0 , . . . , 0 , 1 , 0 , . . . . ,0) be basis vectors. Suppose 0 < p < n. Let C ( n , p ) i be the set o f p - t u p l e s a = { i l , . . . , i p } w i t h i i E Z and 1 < il < i2 < - " This set has cardinality
< ip < n.
(;)
Let E
be the formal expression E
~O"
= e~--t . 1 Ae. A . . . A e l.l .l t p ~i 2
There are ( vn ) such expressions. For p = 0, let E• = 1. Let K~rt be the vector space over K generatedby E with a e C ( n , p ) . Then d i m ( K ; ) = ( ,ltl) . Elements of K~rt are called p-vectors. S p e c i a l cases: K p = K n, K~ = K , and K~ is spanned by the single vector e_l h ~ A . . - A e = ~ . We now introduce a more flexible notation. For any p and any integers i l , . . . , i v between 1 and n, the symbol e h - . . A e , should be 0 i f i i = ij, for s o m e j ~ jl. Otherwise, if { i , , . . . ,iv} = { j , , . . . , j , } , (considered as unordered sets), where j l < j2 < "'" < jp, then e. A . . - A e . =-t-e. A . . . A e . with the + sign if we get the i's from the j ' s by an even permutation, and with the sign otherwise. Set Gn = K g {9 I~/~ @ . . - @ K 2. n Then dim Gn = ( o ) + ( 1n ) + " " + (,,) = 2n" We are going to make G , into an algebra over K . We need a product, or wedge A. By linearity, it suffices to define products of the basis vectors E . We set
1A1=1, . , 1 A (_eil A- "" A_eip ) = e .=Ix A . - - A e=zp (_ei A . . . A e :.1 1
~
)Al=e.
~1 1
A . - . A e:.1 p ,
(e_i A...A_e__ip)A(e_j~_ A...A=.he. )----=,:e, A...A=eip A=3~e....A=jqe . Initially, this is given for il < .-- < ip and j l < "" "jq, but clearly it remains true in general. We have -----s e. A ----J e. = - e=.J A ----t e.. This algebra is associative. Note that this fits in with the original notation e. A ... A e . . The resulting algebra is called the Grassman algebra, or ezterior algebra. If x_:,... ,x_v E K n, then x A - . . A x~ p E K ~ ~.
~1
Such a p-vector is called decomposable.
13 LEMMA
5A. Suppose x_i = (~il,... ,~i,) = )-~'=I ~ije__j (i = 1,... ,p). Then
$1^-..^x =
~
GE
u6C(n,p) where ~ is the p x p determinant [~ulwith 1 < i < p, j 6 o. For example, when n = 3,p = 2,
~1A~2 =
~11 ~12]
~1 ~22 ~ 2 +
~11%~13
T h e linear map with E12 ~-~ _~, =_Ela ~-~ - ~ , wedge produced with the cross product. When n = 4 , p - - 2 __z1 ^ x_2
:
I
~12 ~13]
(21 ~23 ~13 + ~22 ~23 ~3" ~2a ~-~ ~1 identifies K 3 with K 3 and the
[11 621
I
~13 ~21 ~22 E12 ~- "'" "~ ~23 ~24 & 4 '
which has six terms. When p = n,
-z_1
A...Ax
=p ----
• " "
~ln
• " "
~nn
"
E12...n
P r o o f . T h e left-hand side is linear in each vector =zi, the right-hand side is linear, too. So it suffices to consider the case when X_l,... ,x___p E {__el,... ,_e }. If two of the =,z"s are the same, then both sides vanish. So without loss of generality =,x = __ejl, (i = 1 , . . . ,p) with ji 6 { 1 , . . . ,n} distinct. Since both sides behave in obvious ways under permutations of vectors, we may suppose j l < j2 < "'" < jp. T h e n the left-hand side is E . = E where 7" = { j l , - . ,jp}. On the right-hand side ~ = 1 if a = r , ~-21""Jn
=r
but ~ = 0 if ~r # % since ~ is the determinant of the submatrix of (~ij) with columns
jl,...
,jp.
A consequence of L e m m a 5A is Laplace's expansion of a determinant after a set of rows. For simplicity we will deal only with expansion after the first p rows. Given p,q with p + q = n, and given a 6 C ( n , p ) , let # E C(n,q) be the complement of ~r in { 1 , . . . ,n}. Let e(a,#) be 1 or - 1 , depending on whether (a,~Y) is an even or an odd permutation of { 1 , . . . ,n}. Let =Xl,... ,_z_p,~ 1 ' " " ' y----q be in K " , and write •
=
~p
=
trEC(n,p)
--
=q
=
-rEC(n,q)
14 Then 271 A . . .
A x p A y I A " ' " A 1/ = =q
o,7 -E
^E
~'6C(n,p) r6O(n,q)
=
aEC(n,p) =
.
a6C(n,p)
By L e m m a 5A, X__1 A
"'"
AX
=P
Ay I
A""
Ay
=q
=
(detM)E12...n
where M is the matrix with rows X l , . . . ,xp, Y l ' " " ' y " We therefore have =q
LEMMA
5B. (Laplace expansion of a determinant.) detM=
~
e(a, # ) ~ a .
aeC(n,p)
Note that by L e m m a 5A the ~, are the (p × p)-determinants from the rows x l , . . . ,xp of M, and the r/a are the (q × q)-determinants from the complementary rows
Yl'''"
' 1/ " =q
LEMMA
5C. x I A . . . A =px = =0if and only if =x1' . . . , x_p are linearly d e p e n d e n t .
P r o o f . It is an immediate consequence of Lemma 5A. LEMMA
5 D . Ilia__1,... ,X__p are linearly i n d e p e n d e n t and Y=I'"" '1/
=p
i n d e p e n d e n t , then £1 A . . . A x
=P
is proportional
to
Yl A
"'"
are linearly
A 1/ if and only i f x l , . . . =p
=
,z__p
and Y=I' " " ' 1/ span the s a m e subspace of K " . =p
Proof.
If the x__'s and _y's span the same subspace, then each !/. (i = 1 , . . . ,p)
is a linear combination of =xl,"" ,_x_p, so that Y l " " . ,yp is a multiple of _xI A -.. A =px. The factor is the determinant of the coefficient matrix for the y.'s in terms of the ~t
x.'s. Conversely, suppose that ~--1 x A ... A :xp = A(y 1 A .-. A =1/p ). For any _x, the vector =.1 = -x A (x I A .-. A__xp) = x A_£1 A - . . A x is zero precisely when x lies in the space spanned __xp)= )~(1/i A Yl A - . - A y=p ) = 0 s i n c e t w o y . ' =t soccur. by x__l, .. . ,z__p. But ~i A (-£1 A .-. A __ SO ~i is in the space spanned by z_l,... ,x__p(i = 1 , . . . ,p). Therefore the spaces are the same. Let S p C_ K " be a subspace of dimension p. Let X__l,... , x___pbe a basis of S p. Then let X = __xt A . . . A _xp, which is a vector with ~ = ( pn ) components and which lies in K ~ ~ K t. T h e components of X are called the " G r a s s m a n coordinates of S / ' . By the lemma, the Grassman coordinates are given up to a factor. Grassman coordinates of distinct p-dimensional subspaces are not proportional. Incidentally, the Grassman
15 coordinates in general do not fill all of K ; , i.e., not every p - v e c t o r is decomposable. A heuristic a r g u m e n t is t h a t the p - d i m e n s i o n a l subspaces of K n constitute a "manifold" with p ( n - p ) degrees of freedom, so t h a t the G r a s s m a n coordinates should be a manifold of dimension p ( n - p) + 1, and for most cases with 0 < p < n we have p ( n - p) + 1 < ($) = dimg;. Now suppose t h a t K = R or C. Make R " into a Euclidean space or introduce a H e r m i t i a n metric on C n with e . e . = gij (i < i, j < n). T h u s in the H e r m i t i a n case, if _x = ( ~ 1 , . . . , ~,), Y = ( r / l , . . . , r/,,), then _xy = ~ / 1 + . . - + ~,~/,. Introduce a similar metric on K ; with E E = =
=
0
= ~,
otherwise.
say. LEMMA
hE. (Laplace identity.) Given
Xl,... ,_Xp,--Yl'''"' _---p y in
have
^...
(=y ^ . . . A g )
=
IX
R " (or C n), we
.-
•
Here the inner p r o d u c t of the left-hand side is in R~ (or C~), b u t each inner product on the right-hand side is in R " (or C"). E x e r c i s e ha. Prove L e m m a hE, using linearity. A consequence of the Laplace identity is t h a t
ii= ~1 """ I_X_lA . . . A ~pl --
-Z-l--%
"
X1
"
"'"
1/2 •
X =p X ~-p
As we have seen, this is the volume of the p - d i m e n s i o n a l parallelepiped spanned by
~1,'" ,~=p" LEMMA
5 F . For a n y ~ l , . . .
(i) 1~1 A " " Axp A~I A " " A y [ ~ Ix I ^ " "
ifx.~uj = 0 ( 1 (ii) For a n y u l , . . .
=q
,z__p, =Yl'"" ' y '
^m, lly1 ^ . . . A ~ l, where equality holds ***
_< i _< p , 1 < j
1~ ^ " " ^ ~-~1" I-~ ^ ' " ^ -~ ^ g~ ^ ' " ^ gp ^ ~=1^ __< I_UlA . . . A ~ A _z1 A . . . A x
I1-~ A . - .
A _ ~ A Yl A " "
^ y I A y I.
***There are other cases of equality, e.g. when ~1 ^ "'" ^ -~p = 0 or Yl ^ "" "^ y = O.
16
T h e r e is a geometric interpretation to (i): The volume of the (p + q)-dimensional parallelepiped spanned by x_q,... ,z_p, --Yl"'" ' y does not exceed the product of the vol=q
umes of the p-dimensional parallelepiped spanned by z__l,... , x__pand the q-dimensional parallelepiped spanned by Y l ' " " ' y " This is well-known, e.g. when p = 2, q = 1. l~q
P r o o f • The length of a wedge product is invariant under orthogonal (unitary) linear transformations in N'* (or (2'*) (i) is obvious if X l , . . . ,xp, y , . . . y are linearly dependent. So we may assume that =q
x l , . . . ,xp, Yl'" "" ' y span a space S of dimension m = =q
p+q.
After a suitable orthogonal
(or unitary) transformation, S consists of points (~1,.-. ,~m, 0 , . . . ,0). So we may writex.= (~n,-..,~i,,,0,...,0) andy = (r/jl,...,r/jm,0,...,0) (i= 1,...,p; j = ~S
~3
1 , . . . ,q). T h e n
=P
A__YlA--'Ay~_q =
"""
-.-
(~m
" • •
~lrn
7111
• ..
(pp rhp
r/q1
...
r/qp
•
XlA'"Ax
~lp
•°
(lm
, 0 , . . . ,0
%-,
T h e length is the absolute value of this m × m determinant, x 1 A. • • A =xp has coordinates which are the p × p determinants from (~ij) and y A --• A y has coordinates which =q
are the (q × q) determinants from (r/i j). The first assertion of (i) follows from Laplace's expansion of the m x m determinant with respect to the first p rows and last q rows and from Cauchy's inequality• (Laplace's expansion has ( mp ) = ( mq ) summands.) For (ii), we can write =x, = =x'., + =x"i (i = 1, " ' " ' p) where the =z i" ' s are spanned by the u's and the x'.'s are orthogonal to the u's (i = 1 , . . . ,p)• Similarly, y = y'. + y" (j = 1 , . . . , q). The exterior products in (ii) are unchanged if we replace _x_i's with x='i's and y ' s with y"s. . So we may suppose, without loss of generality, that =,x'=su = 0, y u__ = 0 (i = 1 , . . . ,p; j = 1 , . . . ,q; s = 1 , . . . ,g). Then, using the second assertion in (i), the left-hand side becomes
I~IA-•.A~I=I~,A.•.A~_=pA~ A•-.Ay=~h the right-hand side becomes
I~-1 ^
A-~l~l_-zl A ' " A % I r v I A.--Ay I,
and (ii) follows. Let p, q be positive integers with p + q = n . Let K n be •n or C n, equipped with the usual Euclidean or Hermitian metric. Let S p , T q be subspaces of respective dimensions p, q which are orthogonal complements of each other.
17 LEMMA
5G. If {~a} is a Grassman coordinate vector for S p, then {7,} with
~ = ~(e,~)~, is a Grassman coordinate vector for T q. (Here ~ denotes the complement but ~ the complex conjugate). Example.
(2:21,2:22,2:23).
Let p = 2, q = 1. A
set of
Suppose S 2 is spanned by
(~:11,2:12,2:13)
i
and
coordinates will be (~12, ~13, ~23) ----( 2:112:121 2721 2:22 I ' \
Grassman
12:112:21 2:232:13I ' 12:12x22 2:232:13) . But T1 is spanned by (~2z _~13, ~23), so that its Grassman coordinates are (~23,-~la, ~12)- So the lemma holds. Proof.,
Let _zl,... ,z__p be a basis for S p and =Vl,... ,y a basis for Tq. Write =q X__= z_1 A .-. A ~p, Y = __VlA ..- A y . Discarding the notation in the statement of the Lemma, write ~c(.,p) By Lemma 5F we have
r~O(n,p)
IX ^ YI = I_YII_yl.We
obtain
Ixl21_YI ~ = I x ^ El 2 =
=
=
~
^
~ ~,~(~, ~)~...~ ~
(~(~,~)~,)~,~
r~C(.,q)
: lXl2l__Yl2. Hence, since the Cauchy-Schwartz inequality is an equality, the vector {~/r} must be a multiple of the vector {¢(~, r)~r}. The Lemma follows. Suppose S p is a rational subspace of R n and A = A(S p) is the lattice of integer points in S p. Let =xl,... ,~p be a basis for A. Then
x=~i^...^2:=p ~This proof was suggested by Prof. L. Baggett.
18
will be a set of Grassman coordinates for S k. LEMMA
5H. The components of X are relatively prime.
P r o o f . Suppose the components of X have a common prime factor ~. Then the matrix
k
Xll •..
Xln /
Xpl
T, p n
• . .
/
modulo ~ is singular. So there exist integers e l , . . . ,%, not all congruent to 0 mod ~, such that ClX~ + - . - + % =x p - 0 (mod g). Without loss of generality, Cl ~ 0 (mod £). Then take x__' 1 = ~ ( c l x 1 + . . . + % =x p ) • h. But x'1 is not a linear combination with integer coefficients of X l , . . . , x f Contradiction, since the _x_/'s form a basis. C O R O L L A R Y 5I. H ( S p) = IX], where __Xis a Grassman coordinate vector with coprime components of the rational subspace S p. COROLLARY S ± has
5J. When S is a rational subspace, its orthogonal complement H ( S ±) = H ( S ) .
P r o o f i Let X be a Grassman coordinate vector of S with coprime integer components. By Lemma 5G, S ± has a Grassman coordinate vector whose components are the same as those of X__, except for signs and ordering. §6. A b s o l u t e Values. An absolute value is a map a ~ ~ [hi from a field K to the reals such that [el > 0, and la l = lall l,
[a[=0
precisely when
a=0,
[a -t- b[ < ]a[-4- [b[. The last relation is called triangle inequality. When K = Q, the standard absolute value is an absolute value in our sense. In order to avoid confusion with other possible absolute values, we will denote it by la[c¢. For any prime p, we can write any nonzero a E Q as a = p ~ ( u / v ) where p I uv. The p-adic absolute value is defined by
{p
-~
lalP=
0
if if
a#O,
a=O.
Then [a[, satisfies all the axioms of an absolute value. In fact we have ]a+bIp <=max(iaip, [blp), which is stronger than the triangle inequality. An absolute value with ]a -t- bI < max(]al, Ib[)
(6.1)
19
is called non-Archimedean. A b s o l u t e values which d o n ' t satisfy (6.1) are called
Archimedean. Let M ( Q ) = {c¢, 2, 3, 5 , . . . }. For v • M ( Q ) , t h e absolute value I Iv is A r c h i m e d e a n precisely w h e n v = c¢. If a • Q is nonzero, we have the p r o d u c t f o r m u l a
II ,alo=
yaM(Q)
LEMMA 6 A . Let I . " I be an absolute va/ue on a field K of characteristic zero. This absolute vMue is non-Archimedean if and only if Inl _5 1 for every n • Z. P r o o f . Ill = 111111,so t h a t 111 = 1. Also tll = l - 1 f [ - 11, so t h a t l - 11 = 1. In the case of a n o n - A r c h i m e d e a n absolute value one now proves easily by i n d u c t i o n on n > 0 that I'~1 5 1, and therefore also ] - n I = Inl 5 1. Conversely, suppose that Inl =< 1 for n • Z. We have
i----0
therefore la + b] u <
]alilbl ~-i
]alilb] u-i < (u + 1 ) N v
<
i=0
i=0
where N = max(M, Ibl)- Taking . h
roots we get
la+bl 5 ~ " ~ + l U , a n d letting u -* oo we o b t a i n la + b I < N = max(la[, Ibl). T h e absolute value with lal= 1 for every a # 0 in K is called trivial. THEOREM
6 B . (Ostrowski (1935)) The non-trivial absolute values on Q are
given by tal=lal~
where O < p = < 1
and
l a l = lal~
where
a > O.
P r o o f . One proves b y i n d u c t i o n on positive integers n > 0 t h a t In] < n, so t h a t also I - n I < n, a n d
Inl =< Inl~.
20 L e t a, b E Z with a > i, b > 1. For u > 0, we can write b" = co + c l a -4- " " -t- Cna n with 0 __< ci < a and c,, # 0. T h e n
IbV = Ib~l _5 Ic01 + I~111~1 + . . . + Ic, llal" < (n + i)aM"
__<
~logb~ a M n ,
1 + log a )
where M = max(a, lal)- Taking v-th roots gives /
< "~/1+ u l o g b ~r~ MlOg b~ log a. Ibl = V log a And letting u --+ ~ gives [b[ =< M l°s b/los a T h a t is,
Ibl -__<m a x ( i , lal ~°gb/~o~ a).
(6.2)
C a s e I. I I Archimedean. By L e m m a 6A, there exists an integer b with Ib[ > 1. T h e n by (6.2) lal > 1 for any a E Z with a > 1. So lal > 1 for any a E Z with a ~ { - 1 , 0 , - 1 } . For a , b E Z and a , b > 1, we have from the above t h a t
Ibl _5 max(l, I~I ~°""/~°~) Then [b[1/l°gb
=<
[a[1/log~
By s y m m e t r y , we get equality, i.e.
Ibl'/~°g~ = lall/'°s". We have 1 < Ibl 5 Ibloo = b. So Ibl = bP with 0 by (6.3). T h e n for any rational r, we get Irl = IriS.
(6.3) <
p __< 1 and then lal = aP = l a l ~
C a s e I I . I I non-Archimedean. We have In[ __< 1 for every n E Z. Since [] is non-triviM, there exists a E Z with lal < 1. Let I = {a E Z : lal < 1}. T h e n I is an ideal in Z. If labl < X, then lal < 1 or Ibl < 1 since labl = lallbl- So I is a p r i m e ideal,
say I = (p). Now let r E Q with r # 0. Write r = p " x / y with x, y E Z and p [ x y . T h e n x, y ¢~ I so Ixl = I~1 = 1 and Irl = IpVI = IpV. Since p E I , we have [p[ < 1 so ]p]=p-~
with
a>0.
21
Then Irl = IPl~ = P-~V = ( P - U F = Irl~ with a > 0. We now t u r n to algebraic n u m b e r fields. With each algebraic n u m b e r field K there is associated a set M ( K ) along with certain absolute values lal where v E M ( K ) . We have [ [~ # [ [., if v # v' and the following: (i) M ( Q ) = {oo, 2, 3, 5 , . . . }. (ii) For any a E K , a # 0, we have [a[~ = 1 for all but finitely m a n y v E M ( K ) . (iii) W i t h every v is associated a natural number n~ such that for a # 0 i n K we have the product formula
II
vEM(K)
lat: = 1
For v E M ( Q ) , n~ = 1. (iv) If g ' C g and v • M ( K ) , then there is a v' • M ( K ) such that lair restricted to a • g ' equals [a[,,,. This v' is unique and n., I n . . We write v [v'. (v) If K ' C g and v' • M ( K ' ) , then there are finitely many v • M ( K ) with v [ v' and --
vEM(K)
nvt
= [Z:
g'].
vI vt
In particular, by (iii), (v), given v' • M ( Q ) we have
uEM(U) vl ot
(vi) If K is an algebraic extension of Q with rl real embeddings mapping a • K respectively into c O ) , . . . , a (rl) and r2 pairs of complex conjugate embeddings mapping a into a(rt+l), a(rl+l),...
, a(rt+r~)~ a(rt+r2)
where rl + 2r2 = [K : Q], then the absolute values dividing c~ are [a 0) 1 , . . . , [a(")1,
la('~+l)l,..., la(-~+-~)I.
T h e first rl of these have n . = 1 and the last r2 have n~ = 2. (vii) Let a : K ~ L be an isomorphism. If v • M(L), for a • K put la[~, -- laal~. T h e n w • M ( K ) , and this gives a o n e - t o - o n e map M ( L ) --* M ( K ) , and in this correspondence nv = nw. We listed these properties in an axiomatic way but don't intend to prove them. T h e y can be found in any treatment of algebraic number fields based on absolute values. For readers more familiar with classical ideal theory we now sketch the connection with ideals. With any algebraic n u m b e r field K there is associated a ring D of integers in K . Any nonzero ideal 9.1 C L3 can uniquely be written as a product of prime ideals, i.e., ~l = ~3~t .-- q3~t with nonnegative integers a l , . . . , at. In particular, any principal ideal (p) can be factored in this manner, (p) = ~ ' . . . ~ t , where q3i I P. We can define
22 norms of these prime ideals by 9 l ( ~ i ) = pl~ where card(D/~3i) = pl~. A fractional nonzero ideal 92 can also uniquely be written as 92 = ~3~~ . . . ~3~~ with a l , . . . , at in Z. Non-Archimedean absolute values [ [v with v ~ M ( K ) are in o n e - t o - o n e correspondence with prime ideals. T h a t is, lair = p-"/* if (a) = ~ 9 2 where ~3 [ p a n d (p) = ~ 3 ~ 1 . - . ~ , and where ~ does not occur in the factorization of 92. So [P[~ -----P-~ = IPIp and [ [~ extends [ Iv. We also have n ~ = e f where ~ ( ~ ) = pf. E x a m p l e : Let K = Q ( v ~ ) . T h e n 7 = (3 + vr2)(3 - x/r2) in K . By (vi), there are two absolute values dividing o0. Say vl ] oo and v2 I oo. T h e n nvt = nv2 = 1. A n d we
ha~e 13+ v~lv, = 3 + v ~ ,
13-v~l~
= 3-~,
13+v~lv, = 3 - v ~ ,
13- v~tv, = 3 + ~ .
Now look at the absolute values dividing 7. ~ = (3 + v~), ~32 = (3 - v ~ ) divide 7. If Wl,W2 are the absolute values associated with ~ 1 , ~ 2 we have nw~ = nw2 = 1 and 13 + v~l,o, = 7 -1, 1 3 - v~l~o, = 1, 13 + Vr21w~ = 1, 1 3 - v~l,o~ = 7 -1- For u # Vl,V~,W~,W2, 13 + v~l= = 1, so the product formula holds for a = 3 + v/2 and a = 3 - v~. R e m a r k . If v I ~ , then I Iv is Archimedean. If v I c¢, then I Iv is non-Archimedean. This follows from L e m m a 6A. §7. H e i g h t s in N u m b e r Consider a = ( a l , . . .
Fields.
, a,,) E K n. Define
2 1/2 (lallv2 + . . . + I~,1~) I-~lv =
max(l~l~,...
,1~,1~)
if if
v is Archimedean ~ is non-Archimedean.
I f ~ :# _0, then ]~]v = 1 for all but finitely m a n y v. For ~ # 0, define the height (or "field height") of ~ by
HK(~)=
H
I-~l~~"
.CM(h') If, e.g., a l :fi O, then [~lv --> ]oq Iv for each v, which implies H K ( ~ ) iAIvl=~lv,so HK(),~=) = HK(~_)for A # 0 by the product formula. Example.
> 1. Also
IA_~lv --
Take K = Q. Let x = ( x l , . . . ,x,,) be a primitive integer point. We
have I~1~ = I~I (the usual Euclidean norm) and i~lv = m a x ( i x l l v , . . . , Ix-lp) = 1 for every p # oo, since x is primitive. So HQ(x___)= Ix_I.
E x a m p l e . Take K = Q(v~). Let ~ = (1, 3 + v ~ ) . We have I~lv, = ~/12 + (3 + v ~ ) 2 = v ~ , / 2 + V~, la[v~ = ~ , / 2 v~, I~-i~1 = t~[~, = 1, in fact i-~i- = 1 for ~ nonArchimedean in M ( K ) . So HK(~_) = [~_[,~I~_1,,, = 6 V~.
23 Suppose K C / ~ and ~ E K n, ~ # 0. Then how do HK(~=) and HR(a=) compare?
0EM(R}
veM(K)
0E~(R')
And (v) of section 6 gives
HR(~)= H
I~I.~[~:~
vEM(K)
= (HK(~)) tR:K] . The absolute height H(~) is defined by
H(~) = H~(~_)~/tK:Q1. Then H(a__) does not depend on the field K. R e m a r k . If K -- L and a : K ---* L an isomorphism, ~ E K " , a=~ 6 L " , then by (vii) HK(a__) = HL(a~)and H(~) = H(a~). So conjugate vectors have the same height. E x e r c i s e 7a. Is it possible to estimate H ( S + ~) in terms of H(_q) and H(_fl)? You would need to supppose a.q_# _0, fl # O, a + fl # O. If P ( X ) --- a , X " + . . . + a l X + ao with-ai E K is a nonzero polynomial, define the height by
HK(P) = H g ( a , , . . . , Oil, O~o). We can define H(P), the absolute height of a polynomial, in a similar fashion. LEMMA
7A.
H(PQ) < v/~ + 1 H(P)H(Q) when deg P = n, ( or n = min(deg P, deg Q)). P r o o f . Write P = a n X " + ".. + ao. Associate P with _q.a= ( a , , . . . ,a0). Define [PI~ = I~[v. Then
HK(P)=
H
IPl:~"
vEM(K)
Suppose v is non-Archimedean. Then
IPQI. =
IPIdQI~.
This is essentially Gauss' Lemma. We leave its proof as E x e r c i s e 7b. Now write Q = fl,~X m + " " + flo and PQ = 7,+,~X n+m + . . . + 70, where 7i = ~ a + b = i Ota~b. If V is Archimedean, then
2 [PQI2v : E h'il2v: ~ i i
"
E a+b=i
Ota~b
"
24
Cauchy's inequality implies t h a t N
2
N
__
k----1
Using this inequality we obtain nnCm
IPQI~ <= ~ (n + 1) =
l~/hl ~ a+b=i
i----O
o5,5~
(n + 1) ~ I~.I = ~ lh'bl~ a
= (n +
b
2
2
1)IPI.IQI,,.
So IPQI~ < ~
+ I
IPI,,IQI.
if v is Archimedean. T h e n
HK(PQ) < (n + 1)½"[K:~HK(P)H~:(Q) since ~ , l o o n , = [ K : Q]. And
H(PQ) < (n + 1)'/2H(P)H(Q). For a E K 1 --- K we have HK(a) = H K ( 1 ) ---- 1. This doesn't tell us m u c h a b o u t o~, so we define hK(a) ----Hh-(1, a ) and
h(~) = H(1, o0. Then hK(~) =
n
l(1,'~)I~"
vEM(K)
Remark.
We have h(1/a) = H(1, l / a ) = H ( a , 1) -- h(a).
R e m a r k . T h e reader should be warned that other authors often use another height, with the m a x i m u m n o r m for b o t h Archimedean and non-Archimedean absolute values. This is true, e.g., of Bombieri and Van der Poorten (1987) or Mueller and Schmidt (198.9). But Bombieri and Vaaler (1983) use the same n o r m as in these Notes.
25
Example. Let a / b E Q be in lowest terms. HQ(b, a) = vc~ + b=.
Then hQ(a/b) = H Q ( 1 , a / b )
=
E x e r c i s e 7c. Estimate h(crf/) and h ( a +/~) in terms of h(a), h(~). L E M M A 7BI I f P ( X )
= ( ~ d ( X - - 3 ' l ) ' " ( X - T a ) where ad, " / 1 , . . . , "i'd are algebraic,
then 5 - d / 2 h ( ~ l ) "'" h('yd) =< H ( P )
< 2(d-1)/2h(7,) .." h('yd).
R e m a r k . Notice that the leading coefficient aa doesn't enter into the inequalities since the height of a polynomial is independent of multiplication by a constant factor. P r o o f . The upper limit follows on applying Lemma 7A, d - 1 times. T h a t gives H(P)
< 2 ( d - 1 ) / 2 H ( X - ")'1)"" H ( X - 7d)-
Notice that g ( x - 7) = H(1, - 7 ) = H ( 1 , 7 ) = h(7), so the upper bound is proven. For the lower bound, we may suppose that C~d = 1. If v is non-Archimedean, then d
1-I max(l, I'ri[~) = ]Rio i=1
by Gauss' Lemma. If v is Archimedean, then we claim that d
1-I x/x + I'~il=. < 5d/21Pl~. i=1
(This claim will be proven below.) If K is a number field of degree n containing all of the 7's, then d
H hK(T,) < 5 n d / 2 H K ( P ) . i=1
Taking n-th roots gives the result d
H h(7i) < 5 d / 2 H ( p ) " i=l
The proof:~ of the claim is by induction on d. The case d = 1 is trivial. For v Archimedean, we may think of I ]~ as the usual absolute value on C. Without loss of generality, we have I711 < "'" < ]Td[, and we may suppose that 17d] > 2. (Otherwise the result is clear.) Write P ( X ) = X d + Old_l X d - 1 -~- '' " -~ Ol0 = Q(X)(X
- "rd)
~:I a m i n d e b t e d to Prof. H a l b e r s t a m for simplifying m y original proof.
26
w h e r e Q ( X ) = X d-1 + ]~d--2x d - 2
+ ''" -~- ~0" T h e n i = O,i,... ,d-i,
O~i----~i-1 -- "/dfli,
with
fl_]
= o,
fld-1 = i.
Writing c = lTd], we get I~il 2 > (c]Zil ~ - I Z ~ - I ])~ = a l Z i l ~ + ]Z~-ll ~ - 2 c l Z ~ - , Z ~ ] _> c~]Zil ~ + ]Z~-,I ~ - ~(IZ~] ~ + IZ~-x] 2) = (c ~
-
c)l#~l
i)lfli_] 12,
(c-
= -
and, summing over i, d-1
d-2
d-2
1 + ~--~' Io.I 2 _> 1 + a - c + (c~ - ~)])-]~ In, l~ - ( c - 1) ~ i=0
i=0
I,a,I 2
i--0
d--2
= (~ - i)~ ~
In,l~ + ~ - ~ + i d--2
= (c - 1)2{i + ~
I/kl ~} +
c,
i=0
so that
IPI,, > (c - 1)IQI,, and
i
IQI. < ~_--:-i-IPI.. By induction, d-1
1-[(1 + hd~) 1/2 _5 5('i-~)/21Q1., i=1
so that
d 1 - I ( 1 .~_
2,1/2
(1
C2)1/25(d_1)/2
i
i=1
=< 5d/2JPiv, since c > 2. L E M M A 7C. Given n, d and B , there are only finitely m a n y non-zero vectors ( a_ = ( a l , . . . , a , ) with each ai of degree < d and H ( ~ ) < B , if you consider proportional vectors the same. P r o o f . Without loss of generality, consider vectors (1, a 2 , . . . , a,,). Then Hg(1,a2,...
,an) > HK(1,ai)=hK(ai)
(i=2,...,n),
27
since I(1, a s , . . . , ~.)1. => I(1, a~)l.. Thus it suffices to show there are only finitely many a of given degree d with h(a) <__B. Such a would satisfy a polynomial equation P ( a ) = 0 where
P( X)
=
(x
-
~))( x
-
~(~))... ( x
-
~,(~))
with O~(1),... , CI~(d) the conjugates of a. Here P has rational coefficients. Then
H ( P ) < 2(d-1)/2h(a(1))... h(a (d)) < 2(d-1)/2Bd
since h(cr (1)) . . . . . h(a (a)) and h(a) < B. There axe only finitely m a n y such polynomials P. Because suppose P ( X ) = X d + pal_iX d-1 + ... + Po where Pi E Q (i = 0 , . . . , d - 1 ) . If H ( P ) = H ( 1 , p d - 1 , . . . ,Po) < 2(a-1)/~B a, then H(1,pi) < 2(d-l)/2Bd (i = 0 , . . . , d - 1), and clearly there are only finitely m a n y such rational p 0 , . . . , Pd-1. O p e n P r o b l e m . Given n, d, and B, find an asymptotic formula for the number of vectors ~ satisfying the hypothesis of Lemma 7C. For a E K and v E M ( K ) , we define nv
<~>v = I~1~. Then the product formula becomes
l-[
<°0~ = 1
~M(K)
for c~ 5~ 0 in K. LEMMA
7D. Ifoq 5~ ¢~2 in K, then for any v E M ( K ) , ,,, ~-- IOQ -- ~=1""
I
=> h~(~,,)hK(o~)"
P r o o f . If v is non-Archimedean, then
=< (max(I, I-,k))(m~(1, I~*k)). If v is Archimedean, then
28 which follows easily b y considering al,a2 as complex n u m b e r s a n d [ [. as the usual absolute value on C. T h e n the p r o d u c t f o r m u l a gives
w~M(K)
II
o
'~E M ( K )
_5
I,~,
-
~I~ °
II
l(l,,~,)l,~"lfl,,~)l~"
-,6M(K)
= lal
-
nv
a 2 l . hK(aa)hK(a2),
a n d the result follows.
§8. Heights of Subspaces. Let K be an algebraic n u m b e r field and S p a s u b s p a c e of dimension p s p a n n e d by _x,,... ,xp. T h e n t h e G r a s s m a n coordinates of S p are given by =X = x_1 A ..- A =xp . We define t h e field height of S p b y
H K ( S p) = H K ( X ) a n d the absolute height by
H ( S p) = H ( X ) . W h e n p = 0, so t h a t S O = {0}, p u t HK(S °) = H ( S °) = 1. /.From L e m m a 5G, we have for 0 < p < n t h a t
H ( S p) = H((SP)±). Since K ~ is a 1 - d i m e n s i o n a l vector space, it is clear t h a t H ( K n) = 1. T h e r e f o r e the above relation holds for p = 0 or p = n also. LEMMA 8 A . If S, T C_ K " are subspaees and S + T is the subspace of vectors _s + t where s_ 6 S, t 6 T, then
H(S
rl
T)H(S + T) <=H(S)H(T).
P r o o f . Let U = S 71T. Pick a basis _ul,... ,__~ of U. T h e n we can choose a basis u__x,... ,u=t , z__l,... ,z__p for S a n d a basis u__i,... ,__ut, =Vl,... ,Y for T. All we need to =q
show is t h a t
I ~ A . . . A ~1,~1-~1 A - . . A ~ A ~ A . . . A x A y~ A ' . . A y I, _
_
_
i I~-, ^ " - ^_~ ^ ~=~^ --
= p
=q
^_% I~ Igl,... ,_~ A =y A... A =Vq I~
for every v 6 M ( K ) . We have already p r o v e n the A r c h i m e d e a n case. (See L e m m a 5F.) T h e n o n - A r c h i m e d e a n case is left to the reader.
29
Exercise 8a. Prove the inequality above for v non-Archimedean. (When g = 0, the above inequality is to be interpreted as [x1 A . . . A xp A Ya A-.. A y [, < Ix_l A . . . A
pl,l=u 1 ^ . . . ^ =q y Iv.) For B E Z with B > 0, let N ( K , n , p , B ) denote the number of p-dimensionM subspaces S p of K " with H K ( S p) < B. Then N ( K , n,p, B) is bounded. W. M. Schmidt showed (1967) that N ( K , n , p , B ) >><< B '~ (8.1) where the notation >><< means there exist constants C, C ~ such that C'(K,n,p)B n < U(K,n,p,B) < C(K,n,p)B". Notice that the constants C, C ~may depend on K, n, and p. Schmidt (1968) also showed that in the special case K = Q, we have the asymptotic relationship U ( Q , n , p , B) ... co(n,p)B" for0
Here
co(n,p)= 1 (~)V(n)V(n-1)...V(n-p+l) n V-(~(2i:-:~(~
,(2)~(3)...~(p) ¢ ( n ) C ( n - 1)... C ( n - p + 1)
where Y ( n ) denotes the volume of the unit ball in B:n. Notice that co(n,p) = co(n, n p). In general, (i.e., for K an arbitrary number field), an asymptotic formula was recently derived by J. Thunder (to appear). Schanuel (1979) has given such a formula for N ( K , n, 1, B), but for different heights. L E M M A 8B. Let K be an algebraic number t]eld of degree d and 0 7~ 0 an algebraic number which is not necessarily in K. Given B > 1, the number of a 's in K satisfying
h( o) <= B is bounded by 36(2B) 2d. A similar result is due to Evertse (1984). R e m a r k . Consider the case 0 = 1. For a E K, the condition h(a) < B can be written as H K ( 1 , a ) < B a since (h(a)) ~ = hK(a) = HK(1,(~). Lemma 8B agrees with (8.1), according to which the number of pairs (1, 4) satisfying the last inequality has order of magnitude B 2d. P r o o f . For a E K, we will use the notation (a)~ = I~17 ~ and the corresponding product formula
II vEM(K)
/s>o = 1
30
For fl E K " we will let (/3>~ =
I~1~ so that H,~(_Z)=
<_~>~.
H veM(K)
Let L be a field containing K and 0. For v E M(K), we have
(~>o = Ida"" =
H
(141;'1 ~/EL:ul
,,~EM(L) wJv
since the exponent on the right-hand side is (see (v) of §6) Z
7.~tO
--
wI,J
In our new notation, ,~EM(£)
w]v
t~ix an Archimedean v ~ M(K). Assume, say, that v corresponds to a complex embedding of K into C. Consider
H
h~(~o)=
<(1,4o1>~
wEM(L)
= hL,(~O)hL~(40) where
hL,(40)=
H
((1,a0))w
and
hz2(40)=
,oEM(L) •"1"
H
((1'40)>IIM
wEM(L) '*'1"
Then
h(aO) = h,(aO)h2(40) where hi(40) = hLi(O:O) 1/[L:Q] (i : 1, 2). We will initially estimate the number of a's in K satisfying h1(40) < B1 and h2(40) < B2. Take such an 4. Then (4}v=
(4)~/[L:K] (by(8.2))
H ,,,~M(L)
w]v
=
<40>~/tL:'qM:d
H ~EM(L)
where m~
[[
31 Then
(~1~ ~ hL'(ae)'/EL:KJM:d = (h, (o~O)/M~)~ <=(B,/M~) d
where the first inequality follows because (aS)w < ((1, e~8))w for each w. The assumption that v corresponds to a complex embedding gives lair =< (BI/M~) d/= since (a}~ = la12~. So a lies in a square centered at t h e origin with sides of length
2(B1/M~) d/2. Suppose there are N elements a satisfying the inequalities hl(c~8) < B1, ha(oLS) < B2. Assume that N > 2. Pick the positive integer t satisfying t2"
5.
Divide the original square into $2 squares of equal size. Since N > t 2, there exist oq # a2 satisfying the two inequalities and lying in the same subsquare. Then
Io,, - ~1~ =< 2-~t2 (B'/Mv)d/2 < a(B~/M~)d/= and
_q (cq --o~2)v < ~ 2 ( B I / M v ) d 36
(s.3)
d
<=~ ( B i / M ~ ) . In the case where v corresponds to a real embedding, the same bound can be derived. Since 8(eel - a2) is non-zero, the product formula holds. Consider first the factor
H (O)w(Oq--~2)w= M[vL:Q](~I--ff2)~
:K'j
tvEM(L) w[,J
< 36 [L:KIB~L:~N -[L:K] (by (8.3))
= (36B~N-1)[ L:K]. The remaining factor is
II (0~1 - O~a),o. wEM(L) w[v
Using the fact that
I0o,1 -0o~21,,, =< {
max(l, [Sal]w)max(1, ]8a2].,)
if w is non-Archimedean,
V 1 "{-[0Oll]2wV 1 "3!-1OOt2]L if w is Archimedean,
32 we have H
<~(::~1-- 00~2>w ~
,~EM(L)
H
((1, 0al))w<(1, 0a2))w
,~EM(L) --_ <
hL2 (O011 )hL 2 (O012 )
= (h2(O~ 1) h 2 ( 0 o ~ 2 ) ) [L:K]'d =<
Then the product formula gives 1 =< (36. ~I~21:tdI:=t2dM-I~[L:K]' ' ] and N =< 36. ~ll:=tdl~2d~2"
(8.4)
This is trivially true when N < 2. Recall that N was the number of a E K with < B1, h2(o~O) = < B2. hi(a0) = We now complete the proof of the lemma as follows. Given c~ with h(aO) = hx(o~O)h2(o:O) <=B , let k be the integer with
2 k-1 =< hl(O~0) < 2 k. Thenk
> 1 a n d 2 k-1 < B s o t h a t
h2(o~0) =< B.21-k. Given k, the number of corresponding ~ E K is (by (8.4) with BI = 2 k, B2 = B . 2 l - k ) < 36- 2kdB2d22d-2kd = 36 • 22aB2d2 -kd. Summing over all k __>1, the total number of ~'s is
< 36.22dB 2d.
§9. A n o t h e r Version of Siegel's Lemma. Let K be a number field of degree d with embeddings U l , . . . , O"d into C. Given ~ 1 ' " " '-~m in K " , let S be the subspace of K " which they span. We have ~ i ) , . . . , a~) in K (i) (i = 1 , . . . ,d), where ai is the isomorphism K -+ K (i). Let /~ denote the compositum of K(D,. K (a), and S the subspace o f / ~ spanned by ~ 1 ) , a(1) _~d),. a (a) If rh d i m S , then rh < rod. Let wl, so that each a . has the form "
"
} ~
m
"
~
~---
~_j = wlx__il + . . . + wax=ja
"
°
"
,Wd be a field basis for K / Q ,
(1 =< j =< m)
33
with x_jt 6 Q" (1 < g < d). Then a(. i ) = w ( i ) x
-----3
1 rail
+...+w(i)x
d =jd
(1 < j < m , 1 < i .
.
.
.
The matrix (w~O) is non-singular. For fixed j, each v e c t o r x_jt (1 < g < d) is a linear combination of a()) ~---3
is
~'"
. a (d) So S is spanned by x_jt (1 < j '--~-J
"
=
< rn, 1 < g < d), and =
----
---
defined over Q (i.e.,S is spanned by vectors with rational components). The orthogonM complement S± is also defined over I Q and dimS±=n-Fn
> n-rod.
Assume now that n > rod. By Lemma 4B, there exists an integer point x_ 5~ 0 in S± with ~=~ <=H(S±)I/(n-'h)
< H(~±)l/(n-md) = H(S)i/(n-'nd). Let S (i) (1 __
(by Lemma 8A)
= H(S) d
(by (vii) of §6)
__<(H(_a)..-H(___a)) ~
(by Lemma 8A)
We have proven the following version of Siegel's Lemma: L E M M A 9A. S u p p o s e K is a n u m b e r field of degree d, the vectors a__l,. . . , ~,~ in K " span a subspace S, and d m < n. Then there is a vector z= 6 Z " \ O with a.x = 0 and
(j = 1,... ,m)
<-5H(S) d/('-md) < (H(a=l)... H ( ~ , ~ ) ) d/("-md).
This particular formulation is due to Bombieri and Vaaler (1983). They assumed, however, that S (a) 6) ..- (9 S (a) had dimension rod. R e m a r k . In fact, there are n - d m linearly independent solutions _xl,... 'z--,-dm such that 1~11".. I~,_d,,,I =< H ( S ) d < (H(~I)...
H(~__m))d.
More generally, Bombieri and Vaaler consider a number field k C K and a system of equations a . x = 0 (i = 1, rn) where a. 6 K", and they seek solutions x 6 k".
II. Diophantine Approximation References: Cassels (1957), Schmidt (1980). §1. D i r i c h l e t ' s T h e o r e m a n d L i o u v i l l e ' s T h e o r e m . T H E O R E M 1A. (Dirichlet, (1842)) Given a E R and N > 1, there exist integers x.,y with l <=y <=N and lay - xl < 1 / N .
When a is irrational, there are infinitely m a n y reduced [ractions x / y with
R e m a r k . It is clear that the first statement remains true when we require that x, y be relatively prime. Then the second inequality follows by a-y
1 < 1 < N---y = y2"
Since a is irrational, a y - x is never zero, thus for fixed x, y the inequality l a y - x [ < 1 / N can be satisfied only for N under a fixed bound. Then as N --~ oo, we must get infinitely m a n y distinct pairs x, y. THEOREM l B . (Dirichlet) Given a l , . . . , a n in R and N > 1, there exist y, x l , . . . ,xn in Z with 1 <=y < N and
[ a i y - zi[ < N -1/n
(i = l , . . . ,n).
If at least one of the a l , . . . , an is irrational, then there are int~nitely m a n y n-tuples (2y , . . . , *-~y) with gcd(y, x l , . . . x n ) = 1 and
ta i -
1 < y:+C:t--------y
(i = 1, ...
Consider the system of inequalities: ] a l l X 1 "J- "'" + alnXnl < A:
(1.1) [an--l,lXl +''"
+ an--l,nXn[ < A . - I
lan:Xl + " " + annx,~l < An where [ d e t ( a i j ) I # 0 and Ai > 0 (i = 1 , . . . , n). This system defines a parallelepiped of volume 2"A1 --~ An i det(aij)l . (1.2)
35 Suppose A a . . . A , > [ det(ai/)[. T h e n the volume of the parallelepiped is greater than or equal to 2", and the result would follow by Minkowski's T h e o r e m (2C) of Chapter I if we had a compact set. However, we have L E M M A 1C. S u p p o s e Ai > 0 (i = 1 , . . . , n ) and A1 . . . A n (1.1). T h e n the s y s t e m o f inequalities has a solution x_ E Zn\O.
> Idet(aii)[ > 0 in
E x e r c i s e l a . Prove L e m m a 1C. P r o o f (of T h e o r e m 1B). In R "+1, consider the system of inequalities J a l y - xl[ < N -1/"
loony -- xn[ < N - 1 / n
_-< N. By L e m m a 1C, there is a non-trivial solution. If we had y = 0, then x l , . . . , z , would all be zero, too. Thus y ¢ 0. Then there exists a solution with y > 0, therefore I < y < N. T h e seeond assertion of Theorem 1B follows just like in T h e o r e m 1A. T h e o r e m 1A was improved by Hurwitz (1891). He showed, for a irrational, that there exist infinitely m a n y fractions x / y with <
1
See, e.g., Schmidt (1980) for a proof. The following L e m m a can be used to show that the constant v ~ is best possible. L E M M A 1D. S u p p o s e a is a r e a / quadratic irrationM satisfying aa 2 + ba + c = 0 w i t h a, b, c E Z, the leading coemcient a > 0 a n d discriminant D = b2 - 4ac. T h e n for A > V ~ , there are only t~nitely m a n y fractions x / y with < a - Y
1 Ay----~ .
E x a m p l e . Consider the polynomial equation a 2 - a - 1 = 0. Here D = 5 and a = (1 + v/'5)/2. Using L e m m a 1D, we see that for A > x/5 there are only finitely many solutions to [o~ - ( x / y ) [ < 1 / ( A y 2 ) . Thus nurwitz's result is best possible.
36 P r o o f . Writing f ( X ) = a X 2 + bX + c = a ( X - a ) ( X - a') gives D = a~(a - a ' ) 2. Then if ]a - ( x / y ) l < 1/(Ay2), we have y2 =
o)
=
< - - {2 Ay 2 v~ < ~
O¢
-
O/
+ - -X y
OL
a + A~y----~.
Subtracting v/-D/Ay 2 from both sides gives 1(._~__~) y2 1 -which becomes
a < A2y 4 , a
y2 <
a( A - V ~ )"
The result is proven. Given any quadratic irrationality a, there exists a c > 0 such that a
y
c
x
> y2
for any ~ # o~. Any irrational a satisfying Ot
_ •
>
y
=
y2
for some constant c(a) > 0 is called badly approximable. T H E O R E M 1E. (Liouville (1844).) Suppose a is algebraic of degree d. Then there exists a constant c(a) > 0 such that for any rational ~ ~ a, we have -y
> yd "
P r o o f . The proof is broken into three steps which will be important in a more general context later on. (a) Let P ( X ) be the defining polynomial of a. So deg P = d, the coefficients of P are in Z (i.e., P ( X ) E Z[X]), and P ( a ) = O. (b) For rational ~ 7~ a, we have 1
> y~.
37 (c) Expanding P into a Taylor series at a, we get
P (y)
=
-
t P(i)(Ol) a
i=1
i!
'
since P ( a ) = O. We may assume that
(Otherwise, we're done.) T h e n
1 < ]p ( ~ ) [ ya
=
< o~ y [ ~ [e(i)(°~)[ i! i=1
=
Let c ( a ) be defined by
IP(i)( )I i=1
i~
1 - 2c(a) ;
then the result follows. C O R O L L A R Y 1F. (Liouville) T h e n u m b e r a = Eve¢=I 2 -~: is t r a n s c e n d e n t M . Liouville was first to exhibit transcendental numbers, in fact first to prove the existence of such numbers. P r o o f . Write y ( k )
_
2 k' a n d z ( k ) = 2 k' zX-~k . , v = , -9-u! . T h e n x ( k ) , y ( k )
and
e Z (k = > 1)
o~
y(k
- =k+ 1 = 2-(k+1) ! + . . . < 2- 2 -(k+l)! = 2 / y ( k + 1) < c/y(k) d
for any given c, d, provided that k > ko(c, d). Hence, for any d, we have cr not algebraic of degree d by Liouville's T h e o r e m (1E). T h e numbers which can be proved transcendental by Liouville's T h e o r e m are called "Liouville numbers". T h e y form a set of measure zero. This explains why Liouville's T h e o r e m is not enough to prove the transcendence of classical numbers such as e or ~r. E x e r c i s e l b . Given a E R and N > 0, there exist x, y E Z, not both 0, with N[c~y - x[ + N -1 lY[ < v'~.
38 Now use the arithmetic-geometric inequality to show that given ~ E R \ Q , there are infinitely many rationals ~ with a
x[ 1 --y <--. 2y 2
(This is better than Dirichlet's Theorem, but worse than Hurwitz's Theorem.) §2. R o t h ' s T h e o r e m . A consequence of Liouville's Theorem is the following: If deg a = d > 2 and p > d, then a-y
1
has only finitely many solutions ~. Thue (1908) strengthened this result by weakening the hypothesis to # > (d/2) + 1. Siegel (1921) in his thesis improved this to p > 2 v~. Dyson (1947) and Gelfond (1952) showed that the result holds for p > v ~ . In 1956, Roth received a Field prize for his 1955 result with # > 2. Dirichlet's Theorem shows that Roth's result is best possible. T H E O R E M 2A. (Roth (1955).) Ira is algebraic and 5 > 0, there are only finitely many rationals *with y -
1
< y~+~.
Remarks.
(i) Roth's result is correct but trivial for a E C\R. (ii) If deg a = 2, then Lemma 1D is better. (iii) We know that there are infinitely many ~ with
¢X--y < - and only finitely many ~ with
tO~- ~xt < y2-{-----' 1 -~ with 5 > 0. For any given ~ with deg a > 3, it is still unknown whether o~ is badly approximable, i.e. whether there exists a c > 0 so that
>5 for every rational -.~ The conjecture is that this holds for no algebraic a of degree >3.
39
(iv) Another conjecture is that Roth's Theorem holds in the following strengthened form: the inequality
Oe-y<
1
y~(log y)k
has only finitely many solutions for k > 1. The following theorem gives heuristic grounds for the conjectures in (iii) and (iv). T H E O R E M 2B. (Khintchine (1926).) Suppose ¢(y) > 0 is defined on the positive integers and ¢, is nonincreasing. Consider the inequality
]~-
~
¢(Y)
<
(2.1)
If
(i) ~ , = , ¢(y) < ~ , (ii) ~ = ,
(2.1) h ~ only ~nitely many solutions for al=o,t all ¢(y) = c¢, then (2.1) has in6nitely many solutions t'or almost all a.
~.
R e m a r k s . Take ¢(y) = 1/y(logy) k with k > 1. Then case (i) in Kintchine's Theorem says that
Io:<
y2 (log y)~
has only finitely m a n y solutions for almost all a. Taking ¢(y) = 1/ylogy, case (ii) tells us that
-
<
y2~ogy
has infinitely many solutions for almost all a. Here we will only prove the easy part (i) of Khintchine's Theorem. The inequality (2.1) defines an interval for a of length 2¢(y)/y. The union of these intervals for x = 1 , 2 , . . . ,y has measure < 2¢(y). The union of the intervals (2.1) with x e Z is a set which is invariant under translations by integers, and the intersection of this set with 0 =< c~ < 1 has measure __<2¢(y). Thus if S(y) is the set of numbers o~ in 0 _-< a < 1 for which (2.1) holds for some x, then S(y) has measure #(S(y)) _-< 2¢(y). Further if
sN = U s(y), y=N N then I,(SN) --, 0 since Ey=~ ¢(Y) is convergent. Now o, in 0 ___<~ < 1 has infinitely
m a n y solutions to (2.1) precisely when el lies in oo
N SN; N=I
but this set has measure 0. Outline o f P r o o f o f R o t h ' s T h e o r e m . We might try the fb!lowing.
40 (a) Pick a polynomial P ( X ) E Z[X] which is not identically zero, vanishes at a of order q, and has degree r. (b) Show t h a t P ( ~ ) ~ 0 with only finitely m a n y exceptions i " T h e n
= yr"
(c) Consider the Taylor expansion
T h e n if ~ - a
< 1, we have
for some constant C ( a ) , so t h a t Y
-
a
C'
with C ' constant. This is good if r/q is small. But if deg a = d, then r > qd. T h u s r/q > d, and we get no i m p r o v e m e n t over Liouville's Theorem. This a r g u m e n t can be modified by using a polynomial in m variables. T h u e used a polynomial in 2 variables of the form P ( X 1 , X 2 ) = X2Q(X1) - P(X~). Siegel used a m o r e general polynomial in two variables, and so did Dyson and Gelfond. Only Roth was able to overcome the difficulties involved in dealing with more t h a n 2 variables. To see why a polynomial in m variables offers an advantage, consider P ( X 1 , . . . , X m ) E Z [ X 1 , . . . ,Xm] of degree at most r in each variablefl Such a polynomial is m a d e up of m o n o m i a l s X~ 1 . . . X~,"* with 0 < i l , . . . , z,, < r. The numbers of such monomials is (r + 1) m, so the n u m b e r of possible coefficients is also (r + 1) m --~ r '~ as m --+ oc. Try to m a k e P vanish at ( a , . . . , a ) of order q. T h e n P ( J ' ..... J ~ ) ( a , . . . , a ) = 0 for j l + ' " + j m <=q. T h e n u m b e r of implied linear homogeneous equations with rational coefficients for the coefficients of P does not exceed
as q --~ oo. Roughly speaking, we can choose r, q with
r m ..~ d q m / m ! , tOne might be tempted to try a polynomial of bounded ~olal degree, but difficulties in part (b) preclude such an approach.
41 or
r/q~
V
m!
Use the multidimensional Taylor's formula to write p
x
..,
.
.
where the sum is over j l , . . .
.
.
c(jl,...,jm)
, j m with jl + "'" + j m
> q. Then if things go well, in
%
particular when P { { . , . . . , {.~ # 0, we have ktl
1 1<_(_ =P
yT,
y,...,
<<
-a
q
and -
y >> yr,*/q
with r m / q ~ cm • TcQ, where = mt
(2.3)
So we can hope to improve Liouville's result to # > c,,d lIra, and this will actually be achieved in Theorem 6A below. In order to get Roth's Theorem, one further has to show that the number of conditions imposed on the auxiliary polynomial P is often less than (2.2). The difficulty with this approach is in step (b), The zero-set of P ( X 1 , . . . , X m ) is some algebraic manifold in R m, so that it is hard to show that P ( ~ , . . . , ~) # 0. To overcome this difficulty, one considers instead an m-tuple *-~ - ~ of distinct %
Yl
' " " "
'
yrn
- ~ ~ 0. It turns out that rational approximations, and tries to show that P [\ *_r Yl ' " " " ' Ym ] one needs Yl < Y2 < "'" < Ym increasing rapidly. In order to make this approach work, one needs I n - -~] all small (i = 1,... m). For yi example, in the case m = 2, one needs two good approximations *_r *_z with Y2 much Y t ' Y:~ larger than yl. This is why just one very good approximation gives no contradiction, and the result is "ineffective" in the sense that no bound can be stated for the size of the numerators y of very good approximations. Effective improvements of Liouville's Theorem for certain cubic irrationals were given by A. Baker (1964). Then Feldman (1971) used Baker's theory of linear forms of logarithms to give improvements for general algebraic a. These were of the type _
>
C(O/)
yd-Ct ( C*)
where c(a) > 0 and c~(a) > 0 are effective. Unfortunately, cl(a) so obtained is usually very small. Then further improvements for special numbers were obtained by Baker and Stewart (1988), Bombieri (1982), Bombieri and Mueller (1983), Chudnovsky (1983).
42
§3. C o n s t r u c t i o n o f a P o l y n o m i a l . We will follow Bombieri and Van der Poorten (1987). We will construct a polynomial P ( X 1 , . . . Xm) E Z[X1,... ,Xm]. For any such polynomial P, define Iel to be the m a x i m u m absolute value of its coefficients. If I = ( i i , . . . ,ira), then let
pZ _
1 Oq +'"+i"~P il!i2!'" i,d OX~ ~... OXi~' '
Then pZ E Z[X1,... ,Xm] for P E Z[Xx,... ,Xm]. If P has degree ___
Ieq =<2r'+ +'~iE = 2r~q
(3.1)
with r = rl + " " + rm. One similarly shows that I(et)JI
< 3 ''++''"
Iel = 3"lel.
(3.2)
We will say that P is of multidegree < R = ( r ~ , . . . , rm) if its degree in X~ is < r~ (i = 1 , . . . , m ) . Let ~ = ( a l , . . . , am) with real or complex coordinates and E = ( e l , . . . , em) with natural coordinates be given. If P ( X a , . . . , X ~ ) ~ O, the indez of P at __awith respect to E is defined to be the largest value of t such that
#(~)
= o
for every I = (ia,...ira) with
i_£+ .... + i,, < t. el
em
The index of the zero polynomial is understood to be oc. (This definition is due to Roth.) R e m a r k s . If el . . . . . e m = 1, then the index is simply the order of vanishing of P at __a. The ei's allow different weights to be given to the different variables. Given m-tuples I = ( i l , . . . , i m ) and E = ( e l , . . . era), let
E Let ~ (t, R) denote the set of ( ~ 1 , . . - , ~m) satisfying 0 __ < ~i _< 1
(i = 1 , . . . , m )
and
~1~ + . . . + ~ m rm =
Then
em
i_~ + . . . + _ _ igll < t el
em
43
if and only if • G
t,
.
(3.3)
The index of P with respect to E is the largest value of t such that Pr(_a) -- 0 for every I satisfying (3.3). R e m a r k . In our applications, R and E will both be the multidegree of P. Let W ( t , R ) denote the volume of ~(t,~),R let ~(t) = ~ ( t , ~ ) , and W ( t ) = W (t, R) the volume of G(t). L E M M A 3A. Suppose a l , . . . , a m lie in an algebraic number t]eld K of degree d. Suppose t > 0 and d W ( t ) < 1. Suppose ~ > O. Then if R = ( r l , . . . ,rm) is large, (i.e., each ri > c ( a l , . . . , a m , t , e)). there exists a polynomial
P(Xl,...Xm) e
,Xm]
~[Xl,...
which is not identically zero and has multidegree < R, such that the index of P at ~___= ( a l , . . . , a m ) With respect to R is >=t, and
Ip---~ __<((4h(a1))~1... ( 4 h ( a m ) ) ~ ' ) ~
(1+~).
P r o o f . Try rl
rm
e= ~
... 5 :
j~=0
c(j~,... ,jm)x~' ...x:~
j,,=0
= ~c(s)x', J
with J = ( j l , . . . ,jm). The coefficients c(J) are the "unknowns" which need to be found. The number of coefficients (i.e., unknowns) is n = (~1 + 1 ) . . . (~m + 1). We want PI(al,... , a,,) = 0
for every I with -~ E ~(t). That is, ~c(J) J
il
"" im
a,
•
This is a system of homogeneous linear equations in n unknowns. The conditions are parametrized by I with -~ E 6(t). Given such an I, denote the coefficien~ vector of the linear condition by ~I" The components of ~I are
44 For v 6 M ( K ) non-Archimedean, we have
I~zl. =< max(1, l°'~lv) ~' --'max(1, l°~-,I,,) r". For v 6 M ( K ) Archimedean~ each component of oe. has norm
=< 2~'+"+~mx/~+lc~llZ.~l-'-x/l+
"~--m2r'. ,
and the n u m b e r of components of =~I is n < 2 ~l+'''+r" . Thus for v Archimedean,
I~zl.
~ 4r~+'"+r~ X/1 + I~11~"~.-. V/1 + [em[~ ~"
Then
H K ( ~ I ) <-_4(rl+'"+r'OdhK(Oq)n ... hK(O:m) r'* and
H ( ~ I ) <=(4h(c~1)) r l . . - (4h(am)) ~'~. Recall, the number of unknowns c(J) is n = (rl + 1 ) . . . (rm + 1) -,- fir2 ... rm. Let k be the n u m b e r of conditions (i.e., the number of I's with -~ 6 ®(t)). For large R, we have
k ~, f i r 2 " . , rmW(t). This can be seen in the case m = 2 by considering the following picture.
I So we have
n - dk ~ Therefore,
an
- -
n - dk
rlr2
<
. . . rm(1
r1 dW(t)). -
dW(t) ( 1 + ~)
1 - dW(t)
for R sufficiently large. By Siegel's L e m m a 9A of Chapter I, there is a solution of bounded size. More precisely, there exists a polynomial P ~ 0 satisfying the index condition, of size
IPI ___<((4h(oq)) n - - . (4h(olm))rm)~(1+').
45
§4. U p p e r B o u n d s for t h e I n d e x . 4A. (Roth's Lemma (1955).) Suppose 0 < ~ < 1/12. Put
THEOREM
w = w(m,¢) = 2 4 . 2 - m ( ~ / 1 2 ) 2~-~ where m E N. Let R = ( r l , . . . ,rm) with
wrh >rh+l Suppose x_r Y l "t •
(h = 1 , . . . , m -
1).
z_m are rationals in reduced form with denominators satisfying
•
• ~ Ym
y'~ =>2 am
(h = 1,... , m )
and rh
Vh = V[~
(h = 1,... ,m).
Let p ( z ~ , . . . , x , , ) • z[x~,...xm] be such that p # o and
Suppose P is of multidegree < R. Then the index of P at ~Yl ~ with respect to R is at most ~. See Roth (1955), Cassels (1957), or Schmidt (1980) for a proof. Roth's Theorem m a y be proved either by using Roth's Lemma or Theorem 4B below. Neither of these will be proved in these Notes. The proof is by induction on m. Here we will only consider the (trivial) case m = 1. Let P ( X ) E Z[X] and ~Yt a rational with gcd(xi,yl) = 1. We may write " " "
where M (~-~) w ~ 0 and t is the order of vanishing of P at ~-~. vl We P(X)
=
(ylX
--
'
Ym
can
also write
Xl)tQ(X)
where Q (~--~) v, # 0. Since P ( X ) E Z[XI and ( y l X -- X I ) has integer coefficients a n d content 1, we get Q ( X ) E Z[X] by Gauss' Lemma. Thus the leading coefficient of P is divisible by y~ and
yf_<~<
= Y7r, = Y l ~r, "
Therefore, _<~ rl
and the index of P at ~ with respect to R = (rl) is < e. Note that this argument, using Gauss' Lemma, is arithmetical. Now suppose P ( X 1 , . . . , Xm) E Z[X1, • • • Xm] is a polynomial of multidegree < R = ( r l , . . . ,to,). The number of coefficients of P is (rl + 1 ) ' - . ( r m + 1) "~ r l r 2 . . . r m if
46
each ri is large. Suppose the index of P at ( a l , . . . , a m ) ¢ C m with respect to E is > t. Thus P I ( a l , . . . , a m ) = 0 for every I with -~ • ~ ( t , R ) . The n u m b e r of such conditions is given asymptotically by r ~ r 2 . - , r m W (t,-~). Suppose __al,... ,_a_k • C m and the index o f P at_a m with respect to E i s > th (h = 1 , . . . , k ) . T h e n the total n u m b e r of conditions is approximately k h=l
If these conditions are independent and P ¢ O, then ~ =k1 W (th, R) should not be much larger than 1. THEOREM 4 B . (Esnault and Viehweg (1984).) Suppose r l => r 2 => . . . and let a_q,... , a_k E C m with the condition that if a = (O~il , a i m ) , then ---:1
air#air
if
i•j
'
"
"
= > rm,
°
(1 < g __<m).
Suppose P ( X I , , . . , X m ) E C ( X , , . . . X m ) of multidegree < R with P • O, and the index of P at a_h with respect to E = ( O , . . . e m ) is > th ( h = 1 , . . . , m ) . Then
h=l
j----1
i=j+l
where k' = max(2, k). Bombieri (1982) did the case m = 2 before the general case was done. He called this Dyson's Lemma, in reference to work done by Dyson in 1947. For the m = 2 case, the bound is slightly better, namely,
EW
h=l
th,
=1+
-rl
Viola (1985) gave another argument for the m = 2 case. He removed the condition ail 7~ a j l , ai2 7~ aj2 for i 7~ j and imposed the condition that P ( X 1 , X 2 ) have no factor of the form X1 - c or X2 - c. T h e o r e m 4B is algebraic in nature. The proof involves a lot of algebraic geometry and will not be given here. Now suppose K is a number field of degree d. Let _.a = ( a l , . . . , a m ) E K m with Q ( a i ) = K (i = 1 , . . . , m ) and fl = (fll,..-,/~,~) E Qm. We will apply T h e o r e m 4B of Esnault and Viehweg with k = d + 1 and _aO),... ,~(d),/~. T h e gth coordinates are a ~ l ) , . . . ,a~ d), fit, which are all distinct. We will set tl . . . . .
td = t and td+l = r.
47
If P ( X 1 , . . . , X m ) # 0 of multidegree R = (r~,... ,rm) has index __> t at (i = 1 , . . . , d) and index > r at fl with respect to R, then Theorem 4B gives
dW(t) + W(r) <= I I
1+(a-1)
j=l
r,
a_(O
.
i=j+l
Suppose now that ri+ 1 <
ri
1 = 2dmA
(1 < i < m -
1)
where 1 __~1. Then (since d > 2) ... i=i+~ ri
~
2--d-~Sm A
< - -2 3dmA '
and we have H
l+(d-1)
E
j=l
ri
< II
1+3--m~2
j--1
i=j+l
< e 2/3"x < l q -
1
since A > 1. We have proven the following lemma. L E M M A 4C. Suppose P ( X 1 , . . . , X m ) ~ 0 has coet~cients in Q and has multidegree < R = ( r l , . . . ,rm). Furthermore, suppose ri+l
ri
<
=
l
2dmA
(1 < i < m -
1)
(4.1)
is satisfied with I > 1. Suppose a-, fl are as above and t, r satisfy 1 d W ( t ) + W(~-) >=1 + -~. Then if P has index > t at a- with respect to R, it follows that P must have index < r at ~ with respect to R. E x e r c i s e 4a. Define r = r(k, t) to be the least integer such that given any k points _al,... , a k in C~, there exists a polynomial P ( X , Y ) ~ 0 with coefficients in C, of total degree __ t at each of a-l'"'a-k" Certainly, r <= ro(k,t) where r0 is least with
48
Thus < V/-£.
limt-~r(k't) t Compute the following: r(2, t)
and
lim r(2, t) t---,co
t
as well as r(3,t)
and
lim r(3,t)
t---,oo
t
E x e r c i s e 4b. Compute r(4,t) and r(5, t). §5. E s t i m a t i o n
of Volumes.
Recall from §3 that W(t) represents the volume of the set of (~1,... , ~m) satisfying 0 =< ~i =< 1
(i=l,....,rn)
and ~ l - ~ - ' ' ' - } - ~ r a ~ t.
LEMMA
5A. I f t <=1, then W(t) = tm/m!.
R e m a r k . If t =< 1, then ~(t) is the set of (~1,..- ,~m) satisfying 0 < ~i
and
~l+'"+~m
_-< t.
The lemma may be obtained by induction on m. LEMMA
5 B . / f t = ~ - 0 w;hereO < 0 < 9 , then W(t) < e -°2/m.
R e m a r k . Consider the cube C consisting of points ((1,.-. ,(m) with 0 < (i _-< 1 (i = 1,... , m). Lemma 5B says that those points satisfying ~l+'"+(m
< m =
0
2
form a small proportion of C and similarly, by symmetry about ( ½ , . . . , ½), for those points satisfying (1.Ar.....~_~m)> m
Thus most points satisfy (l+'''+(m--2 which agrees with probability.
=<0'
49
P r o o f . Set q =
30/2m, so that q __<3/4. For points in 6 ( t ) , we have m<_0.
~I q- "'" "~- ~m
Then
--q(~l"l-"'+~m
--
2
2)=> Oq=30212m"
Consider
W(t)e a°'/2m <
~001L 1 ...
exp
( -q (~1 +"" + ~m - 2 ) ) d~l"" d~m
= (Llexp(-q('-l))d')
m
Im~ where I is the integral on the right-hand side. By a change of variables, we get I =
x12 J-1/2
exp(-q~)d~
eq/2 _ e-q~2
2 = -2 sin h ( ~ ) q
2((q/2)+~. (q/2)3"4q
q2
)
q4
= 1 + ~-~ + -1-2- -0-.-2- ~ + " "
q2
<1+1---~ < eq2/10 = e902/40m2 < eO~/4m2 Then and
W(t)e 302/2m < e°2/4m W(t) <e -s°~/4" < e -°Vm.
§6. A v e r s i o n o f R o t h ' s T h e o r e m . Let a be algebraic of degree d and fl = ~. We will consider inequalities of the type l a - fl] < 1/h(fl) 2+6.
50
Given C > 1, a window of exponential width C will m e a n an interval of real n u m b e r s of type
w<=¢<w c where w > 1. THEOREM 6A. Suppose a is algebraic o f degree d > 3. Suppose 1 < m! < d and 0 < X < 1. Set A = 2 d ( 6 / X ) " and cm = m / ( m ! ) x/m. Then the rational solutions/3 of the inequality [ a - fl] < h(/3) -c"d'lm(x+x) (6.1)P
have their heights in the union of the interval h(/3) < (8h(a)) 6)~/x -- BI, say, and at most m -- 1 windows of exponential width C where C = 6alma = 12d2m(6/X) r". R e m a r k s . In the case m = 2, we get cmd 1/m = v ~ . T h u s if ~ < # < 2 ,Cry, there is a X with # = c,,,dl/m(1 + X) and 0 < X < 1. This implies the D y s o n - G e l f o n d estimate. In this case, we have at most 1 window. In the case m = 3, we get cm = 3/~/6 = ~ and # > ~ r ~ - ~ . ,~/'d. In this case, there are two possible windows. THEOREM 6B. Suppose a is algebraic of degree d > 3. Suppose 0 < 5 < 1 and m = [(25/6) 2 log 2d]. Let A = 2m!. Then the rational solutions/3 of the inequality
la -/31 < h(/3) have their heights in the union of the interval h(fl) < (4h(a)) 25Me = B2, say, and at most rn -- 1 windows of exponential width C = 6drnA. R e m a r k . T h e o r e m 6B contains R o t h ' s Theorem. Let a be given in a n u m b e r field K with deg K = d and K = Q ( a ) . F u r t h e r m o r e , let/3 E Q a n d A > 1 be given. We introduce the mixed height of a and/3, given by
ha(a,/3) = (4h(a)) a - 4h(/3). We will now state the main theorems. THEOREM 6 C . Let ( a l , / 3 1 ) , . . . , ( a m , / 3 r n ) b e Such that ( ~ ( a i ) ----K and~3, E Q (i = 1 , . . . , m). Suppose that (i) l < r n ! < d, (ii) $ > 2 d . 4 m, (iii) ]oq -/31[ < hx(ai,/3i) -cr"d'l'~ (l+4(2d/A)'lm) (i .= 1,... , m ) ,
51 (iv) h~(ai+l,ti+l) This is impossible!
> h j , ( o l i , t i ) 3dm'k (i = 1 , . . .
,m-
1).
THEOREM 6 D . L e t (hi, tll),.. • , (am, tim) b e as a b o v e , a3"ld s u p p o s e t h a t (i) m > 36 log 2d, (ii))~ __>2m!, (iii) ] a i - f l i l < hx(ai,fli) - 2 - 1 2 ~ (iv) as above. This is impossible.
(i = 1,... ,m),
R e m a r k . Theorems 6C and 6D give Theorems 6A and 6B, respectively. First, we will verify that Theorem 6C implies Theorem 6A. Consider solutions fl of
(6.1), i.e. I~ -
Zl < h(fl) c=d''m°+x)
Let B, = (Sh(a)) 6a/x, and suppose h(fl) > B1. 6(2d/~) 1/m and (6.1) implies that la - fll
Let ~ = 2d(6/X) m.
Then X =
x < h(t)-c'd'l"(l+~X)h(t) - e " dXlm (=x) < h(t)-c,,da/'(l+4(2dla)l/=)(Sh(oL))-cmd~/='22~
< ha(a, t ) -c'd'l"O+4(2a/a)I/"), since 2 > 1+4(2d/)01/m and 8 a > (4~).4. If there is no approximation t with h(fl) > B1, then we are finished. Otherwise, let t l have minimal height with h(fl) > BI. If each such t has h ( t ) < h(t~) 6dm~, then they all lie in a single window and we are done. Otherwise, let t2 have minimal height with h ( t ) > h(fll)6dma. Then
h~(a,t~) > h(Z~) >= h(t~) ~dm~. B~ dma >= (h(t,). (Sh(a))~) 3dm~ => h~(a,,
t,)~dm~.
Continue in this fashion. If the solutions with h(fl) > BI do not lie in m - 1 windows, then t l , t 2 , . . . , t m can be found, and Theorem 6C with ai = a (i = 1 , . . . , m) gives a contradiction. Next, we will show that Theorem 6D implies Theorem 6B. Suppose that la - t[ <
h(t) -=-6 and
h(t) >= 13~ = (4h(a)) ~a/~. With m, )~ as in Theorem 6B, we have log2d (~)2 m < gg
,
/i v ' ( l o g 2 ~ ) / m > -'24
52 We infer t h a t
Io/- ~1 < h(13)-2-61ih(/3) -612 < h(/3)-~-'~ ~ < h(~)-2-,2 ~
B:" (4h(o/))-300x x/Oog2d)lm/5.
But since log 2d -~-> m
v/(log 2 d ) / m / 6 > - 25
,
we obtain
Io/--/31 < h(/3) -2-12 ~
(4h(o/)) -12x
< hA(o/, fl)-2-12
~l(log2d)/m.
We now proceed as with T h e o r e m 6A: If there is no a p p r o x i m a t i o n / 3 with h(/3) > B2, then we are finished. Otherwise, . . . .
§7. P r o o f o f t h e M a i n T h e o r e m s ,
i.e., T h e o r e m s
6C, 6D.
Suppose (o/1,fll),... ,(Otrn,flrn) are given such t h a t Q ( a i ) = K and fl, E Q 1 , . . . , m). Pick t = t(m, A) such t h a t d W ( t ) = 1 - (l/A), and 7" such t h a t W(7") = We will use L e m m a 3A to construct a polynomial P ( X 1 , . . . , X , , ) of multidegree ( r l , . . . , r m ) , where the ri are large, such t h a t P has index > t at ( h i , . . . , a m ) respect to R. We can choose r l , . . . , rm such that ri+l < _ _ 1
ri
(i = 1,..
. ,m
(i = 2/A. R = with
-- 1).
= 2dmA
T h e n by L e m m a 4C, we know t h a t P will have index < r at (/3t,... ,/3,~) E Q. T h u s there exists an I = ( i l , . . . ,ira) with
i~ + . . - + i., < r rl rm
(7.1)
such t h a t p I ( ~ ) # 0. Set Q ( X ) = P I ( X ) . T h e n by (3.1) we have IP--71 = IQI < 2rlPl where r = rl + . . . + rm. Moreover, by (3.2) we have IQJI < 3~lPI for any J . Since P had index => t at c~, it follows from (7.1) that Q = p I has index => t - r at _a (with respect to R). Since dW(t) 1 - (l/A)
1 - dW(O we have
-
(1/A)
-A-l,
dW(t) _ 1 d W ( t ) (1 + ~) < A
for ¢ > 0 sufficiently small. T h e n by L e m m a 3A, the polynomial P can be constructed such t h a t I p I __< ( ( 4 h ( a l ) ) r , . . . (4h(a,,,))r,~)x.
53
Then IQ"I < a"lPl < 3r((4h(al)) r' " " ( 4 h ( a , ~ ) ) r ' ) a. Writing 13i = x i / Y i (i = 1 , . . . , m ) , we have y[' . . . y~"Q(/3) • g \ { 0 } . Thus uP"-
u,~"Q(~')
=> 1.
Writing the Taylor's expansion for Q about ~ gives
Q(_~) =
~
(/3~ - ~lY' ... (/~m - ~,,)i~ QJ(~_)
J=(/~ ..... jr.)
~+...+~
_>_,-~
where the sum is restricted to ~ + . - - + ~,, => t - r since all lower order partial derivatives vanish at e. By condition (iii), we have
] a i - fli[ < 1/2
(i --~ 1~... ~m)~
which gives 1 Io, d < 1,8'~1+ ~ =
." [
Ix,I + (1/')./I,. I 1 + ~ = ~
(i = 1 , . . . ,m).
Applying Cauchy's inequality to the right-hand side gives
I~,1 =< v'T~h(fl,)
(i =
1,... ,m).
Yi Now we have I q s ( ~ ) l < 1~-71
,-~ + 1
*
-.-
"'
and therefore
y["" "yDqs(a=) < 3rlPl
(r~ + 1)
h(5~) i----1
.
54
The Taylor expansion for Q gives us y[' • • • y~,~Q(__fl)[
max
" ' l~
] ~+...+~_>_,-~
(1/31 - ~, Ij '
IZ~
-~1 j') ri + 1) 2
= IPI
3
max
h(fli)
(Icq - fill i t - . . la,~
i=1
-flmlJ"). So for r i large,
luP -. • u2" Q(__Z) <
( 4 h ( ~ i ) ) r'
IPI
max
~ + . . - + ~ __>~-~
(1~, - ~ l J ' • • • I~,~ - ~mV~"),
and 1 <
max
( h x ( a i , f l l ) ) r'
~+.. + ~ >__,-r
(]fll - ~x] "1 "'" ]tim - c~m]"'~).
(7.2)
Now suppose that I Z ~ - ~ 1 _5 h x ( ~ , f l ~ ) - ¢
with ¢ > 0. Then (7.2) yields 1 <
h.x(~i, fli)) rl
max
~ + . . + ~ >__, - ~
(hx((~l,fll)-Jt~P'"hx(~m,flm)
-jm¢)
Following Bombieri and Van der Poorten (1987) we take logarithms of both sides. Write L / = log h x ( a i , fli), (i = 1 , . . . , m ) . We then obtain
0 < rlLl+'"+rmLm-•,
min (jlLl+'"+jmLm). ~ + " + ~ : ,-~
Putting qai= j i / r i (i = 1 , . . . , m), we have ((7.3))
rlLl +...rmLm
= > ¢
min
~ot+'--+~m > t--r ~oiEIL
(~lrlL1 +--.¢pmrmLm).
=
Now choose ri = [L/Li] (i = 1 , . . . , m ) where L is large. By condition (iv), we have Li+l > 3 d m / k L i (i = 1 , . . . , m - 1), so that ri+l
1
ri
2dm~
- - < - -
(i=1,
"
.. m - i ) . '
55 Dividing (7.3) by L and letting L --~ co, we get m > ¢
min
(~1 + " " + q0m) = ¢ ( t -
r).
~l'+'"-t-'.Pm > l - - r
Hence
¢<
m
(7.4)
t--T
This is the key inequality. Using it together with estimates for m~ t~ and r , we will prove the two main theorems. RecM1 that d W ( t ) = 1 - (l/A) and W(~-) = 2/~. In T h e o r e m 6C, we have (i) 1 < m! < d and (ii) ~ > 2d. 4 m. This gives 1< 1 W(t) < ~ = ~ . and
w(~) < ~ _ so that we can apply L e m m a 5A in both cases. T h a t is, tm
w ( ~ ) = m--i' d~-]=
1-
,
t = (m!) l/m
1-
,
and T m
W(r)
=
2
i
m!'
vm
)~ m!' r = (m!)'/'(21~) So we have
t - r = (m!)'/md -'Ira > (ml)'i"d-'l" = (m!)'lmd-'l"(1
((
1-
((l-
'/'.
- (2dl,~) '/"
)
~) -(2dlJ)'l")
- '7)
where
= (11,~) + (2dl,~) li"
(7.5)
~, 2(2dl,t )'l"
< 112
(by (ii)).
56
Observe that
1
l-r/
< I + 2 r / ~ 1 + 4(2dlA) 1/m =
Therefore t
-
r ~(m!)ll'*d-'l'~(1
+
(by (7.5)).
4(2dlA)'l')-'.
Now we are in a position to e s t i m a t e ¢ with
IZi - ail < hA(ozi,~i) -¢. B y (7.4) we have
¢5
m t--7"
< (Fr~!)l/rn m 31/.,( ID -, -1 -t- 4(2dlA) l/m) = cmda/m(1 + 4(2d/A)l/m). However, in T h e o r e m 6C (iii), we have
]Oti -- flil <~ h)~(oti,l~i) -c'dl/'~(l+4~/A)l/m), which gives a contradiction. We now t u r n to T h e o r e m 6D. As before, dW(t) = 1 - ( l / A ) a n d W ( r ) = 2/A < 1 / ( m ! ) b y (ii). Since W(r) is an increasing function of r , we m a y infer f r o m L e m m a 5A t h a t r < 1. Next, d > 2 gives us W(t) < 1/2. T h e r e f o r e t < m/2, say t = ( r n / 2 ) - 8 with 0 < 8 < m/2. T h e n by L e m m a 5B,
w ( t ) < ~-°2/'', so t h a t
Now we h a v e
eO21m < d ( l _ l )
-1
< 2d
(by (ii)).
T a k i n g the l o g a r i t h m of b o t h sides gives
8< ~
log 2d
and t > (m12)-
4-A-~2~.
So t - ~ > (m/2) - 1 - v/~iog2d.
N o w we may return to our task of estimating ¢. By (7.4), ¢<
m t-r'
57
so that we obtain
~< =
m (m/2)
-
1 -
v/-~l-og
2d
2
1
(2/m)
- 2 v/(log
2d)/m
2
<
1 - 3 v/(log 2 d ) / m < 2(1 + 6 v/(log 2d)/m )
(by (i))
= 2 + 12 x/(log 2d)/m. But T h e o r e m 6D (iii) is
which gives the desired contradiction. §8. C o u n t i n g G o o d R a t i o n a l A p p r o x i m a t i o n s . In the smnmer of 1987, in the course of a n u m b e r theory conference in Budapest, A. Schinzel asked the following, almost philosophical question: "But how can it be, how can it be in number theory, that one could prove the finiteness of a set of natural numbers, without being able to give a b o u n d for its cardinality?" The next day he himself provided the following explanation. Suppose we are given a set S of positive integers and suppose we can prove that if y, y~ are in S, then y~ < 2y. Then S must be finite. However, unless we know at least one y in S, we are unable to estimate the cardinality of S without further information. We will generalize Schinzel's remark as follows. Given C > 1, a set S of positive numbers is a C-set, if for any y, y~ in S we have y~ <__Cy. Any C - s e t consisting of integers is finite. Now let 7 > 1 be given. A 7-set is a set of positive real numbers with the following property: if y, y~ are in the set and y < y', then y' > 7Y. Thus a "~-set has a certain "Gap Principle". A set which is b o t h a C - s e t and a 7-set will be called a (C, 7)-set. Its elements are positive real numbers, not necessarily integers. LEMMA
8A. Suppose C > 1 and 7 > 1 are given.
The cardinality of any
(C, 7 ) - s e t is < 1 + ( l o g C ) / l o g 7. P r o o f . Let G > 1 and 7 > 1 be given. Suppose V0 < yl < y2 < "'" < y~ belong to a (G, 7)-set. T h e n
vi _->y0- 7 i
(i = 0 , . . . ,
and
Cyo > y~ > y07 ~. Therefore u __< (log C)/log u,
~)
58 and the cardinality of the (C, 7)-set is =< 1 + (logC)/log3'. Suppose 5 > O. Let L = log(1 + 5). LEMMA
(s.1)
8B. Let a reM number ~ be given. The number of reduced fractions
(~/~) with xI
1
y < 2y2+-------g
-
(8.2)
and y in a window of exponentiM width C is
C)/L.
< 1 + (log
P r o o f . If y, y~ are in a window of exponential width C, then yt < yC. We will call this an exponential C-set. Now, if x/y, x'/y' satisfy the hypotheses and x / y • x'/y', say y~ => y, then
I
<
_
yy~ =
~7
1 < ~ < 1
1 + 2y,2+-------~
= y2+5 ' and we have
yt ~>yl+5 ~ y-t,
where 3' = 1 +5. We call such a set an exponential 7-~et. The logarithms of the numbers y form a (C,')')-set. By Lemma 8A its cardinality is < 1+ l°gC-l+log 3'
log C L
Suppose now that 1 < A < B are given, and consider rational approximations to with ~_y
<
1 2y~+6
and A __< y __< B. The denominators y lie in a window of exponential width C = (log B ) / l o g A. Therefore, Lemma 8B says the number of such V is
<1+
log(log B~ log A) L
59
There are a couple of drawbacks to L e m m a 8B. We cannot let A go to 1, since C = (log B ) / l o g A. Secondly, we have a 2 in the denominator in (8.2). We will try to remove these drawbacks. For 6 > 0, we will call x / y a 6-approximation to ~ if y > 0 and (x, y) = 1 and 1
< y2+-----7"
(8.3)
T h e n we have the following results. L E M M A 8C. The n u m b e r of 6-approximations to ~ with y in a window W ~= y < W c where W > 41/~ is < 1 + (log 2 C ) / L . P r o o f . Suppose x / y and x ' / y I are such approximations and y' > y. Using the same argument as above (i.e., in the proof of L e m m a 8B), we get > Y 1+~/2 Y, =
Then logy' > (1 + 6 ) l o g y - log2. Now suppose that xo/Yo, x l / Y l , . . . , x , / y , • .. = < y~. T h e n
are such approximations with y0 < yl <
logyl > (1 + 6)log y0 - log2, logy2 > (1 + 6 ) l o g y l - log2
__>(1 + 6)2 log yo - ((1 + 6) + 1)log 2,
l o g y , => (l q- 6)" logy0 -- ((l + 6) "-1 + . . - + ( l q - 6 ) > (1 + 6)"(log W - (log 2)/6). Since W = > 41/6, we have log W __>(log 4)/6 = 2(log 2)/6 and l o g y . > (1 + 6 ) " ( l o g W ) / 2 . We also have y. < W c, so that C l o g W => l o g y , => (1 + 6 ) " ( l o g W ) / 2 . Thus (1 + 6) ~ =< 2 C
and
u = < (log2C)/L. T h e lemma follows.
T1)log2
60 THEOREM where B > e, is
8D. The n u m b e r of 5-approximations with denominators y < B , < L -1 log log B + 20((1/6) + 1).
Recall, L = log(1 + 6), as in (S.1). This result, as well as T h e o r e m 8E below, is due to Mueller and Schmidt (1989). P r o o f . We will say that "large solutions" are those with e ~/~ __
log B 6 log e2/~ = 2 log B.
By L e m m a 8C, the n u m b e r of solutions here is log(6 log B)
<1+ -
L
log log B -- -
-
+
I
log 6 +
-
-
L
L
log log B < - - + 2 , L since L = log(1 + 6) > log & We will let "small solutions" be those with y < e 2/s. Given an integer u, let the set <..-<-S(u) consist of 6-approximations with e" =< y < e ~+1. Suppose -x0- < - - xl xf, Y0 Yl Y~ are elements of S(u). Any two consecutive elements cannot be too close, since Xi+l
Yi+l
xi > 1 = - - > e yi YiYi+l
-2~-2
(i = O , . . . , ~ -
We m a y conclude that X/t
X0 > ~ - - 2 u - - 2
Yt,
Y0
On the other hand, we have Xtt
X0 <
y#
y0
x0 -~-
e 7 1
< ~
e
1
+ y~+~
< 2e-U(2+6) , Combining these two estimates, we get # < 2e 2-'~*, and hence card ( S ( u ) ) = 1 + # < 1 + 2e 2-"8.
l).
61
The "small approximations" lie in the union of S(O), S ( 1 ) , . . . , S(k) where k = [log e 2/$] < 2/6. So the total n u m b e r of "small" 6-approximations is k
=<
+ 2e 2-"') cx~
< k + 1 + 2e 2 ~
e -u~
< ~2 + 1 + 2 e 2 ( 1 -
e_6)_ 1
< l+~+2e
+1
2
Adding the estimates for the numbers of "small" and "large" 6-approximations, we get a bound which is less than L -1 loglog B + 20((1/6) + 1). T h e following theorem tells us that the main term in the conclusion of Theorem 8D (i.e., L -1 loglog B ) is indeed best possible. T H E O R E M BE. Let 6 > 0 be given. There exists a reM transcendentM n u m b e r such that for every B >=e, the number of 6-approximations with y < B is
_> loglog_____B_B+ log(6/2)__ L
L
The proof uses continued fractions. For an introduction to continued fractions, see Hardy and Wright (1954), or Cassels (1957), or Schmidt (1980). P r o o f . Given natural numbers a l , . . . , an, write 1
Pn qn
1
al -Ia2 -4-
°..
+- 1 an
with qn > 0 and g.c.d. (Pn,qn) = 1. We will define a sequence a l , a 2 , . . , Set al = 1. Given a l , . . . ,an, let an+l be the least integer with:~
inductively.
an+lqn + qn-1 => 27n^l+6. :l:The factor 2 on the righthand side is not needed for our present purpose but will come in handy in §9.
62 Then since (by the theory of continued fractions) q,,+l = an+lqn + q,,-1, we have 2q~+6 =< qn+l =< 3~n-1+5. Again from the theory of continued fractions, the limit lim,~--.,~p,,/q,, exists. Denote this limit by ~. It is customary to write 1 1 a l '4- -
-
a2 +
.
and to interpret the right hand side as an "infinite continued fraction". It is known that _ Pn < - - , qnqn+l so that in our case <
(s.4)
We would like to know how fast these denominators grow. We have ql = 1,
logql = 0,
q2 =< 3q~ +8,
Iogq2 =< log 3,
q3 =< 3q~ +~,
logq3 =< ( ( 1 + 6 ) + 1 ) 1 o g 3 .
For arbitrary n, the inequality is logqn =< ( ( 1 + 6 ) " - 2 + . . - + ( 1 + 6 ) + 1 ) 1 o g 3 (1 + 6) "-1 < 6 log 3. So the number of qn < B is at least N, where N is the largest integer with
(1 + Then
6) N - a
6
log 3 =< log B.
6 (1 + 6) N > ~ l o g B
and N > L -1 log(61og B / log 3) > L -1 log log B + L -1 log(6/2). Since we have infinitely many g-approximations, we know t~hat ( is transcendental by Roth's Theorem.
63
E x e r c i s e 8a. T h e n u m b e r of approximations x / y in reduced form with
a n d 2 =< y = B
< (1 + ~)1o-7~Bg~og when B > Bo(e).
§9. T h e N u m b e r
of Good
Approximations
to Algebraic
Numbers.
T H E O R E M 9A. Suppose that a is algebraic of degree d > 3, that m satisfies I < rn! < d, and that # = cmdl/m(1 + X) where X > 0. Then the number of approximations x / y to a with
(9.1) is <
log+ log h(~) .
m,x
log d
where << means that the implicit constant m a y depend on rn and X. Fbrthermore, the rn~X
number o f such approximations with y >__h(a) is tn,X
Consider the case m = 2. Then, e.g., # = 3 v Q / 2 has # = v / ~ ( 1 + X) for some X > 0. T h e o r e m 9A says the n u m b e r of approximations satisfying xI - v :~log+ z =
logz 0
if z > l , if O < z < l
1 < v3~/----~
64 is bounded as indicated, i.e., <<1+
log + log h(a) log d
P r o o f . We may suppose that 0 < X < 1. Then # > cmd 1/m > cm31/m > 2.1. As before, we distinguish between "small" and "large solutions". "Small solutions" will be those satisfying y < ( 8 h ( ~ ) ) '~A/X = B , ,
where A = 2d(12/X) m. By Theorem 8B with 6 = # - 2, the number of such approximations is log log B 1 << l o g ( , - 1) + V = 2 + 1 <<
log log B1 +i. log d
(Here and in the rest of this section, << means << .) m,x
We will estimate log log B1. Since 12A
logBa =
X
log(Sh(a))
: 2d
log(Sh(a)),
we have that log log B1 << log d + log + log h(a) + 1. Thus the number of "small solutions" is <<
log + log h(a) +I. log d
Now consider the "large solutions", i.e., those satisfying y => (8h(~)) "A/X = B1. We will write fl = x/y. We have from (9.1) that y
< lal -F I'
so that h(#) < (lal + 2)y
= < 3h(a)dy
65 since H
<= h(a) d. Now consider yt~ : yCmdll"'(l+x) = yc,v,dxl"~(l+(x/2))ycmdl/"~(X/2) > h(/3) c,,,d ' / " (, +( x / ))((3h(2 a )d)-2 Y x/2)~,,,d'/'~.
Since
u > (3h(oO") ~/x, we have
(3h(ot)d)-2y x/2 > 1, so that
y~' > h(t3)c~t'/'(i+(xt2)).
We therefore obtain
Is -/31 < h(3) -cmd'/~°-(x/2)). Apply T h e o r e m 6A with X/2 in place of X. Then we have either h(fl) < B1, or h(fl) lies in the union of at most m - 1 windows of exponential width C = 6dmA where )~ = 2d(12/X) m. We also know that the first case is ruled out because h(fl) > y > B1. Therefore, by L e m m a 8C, the number of solutions is not greater than 1+
log 2C log(# - 1)"
We know that 2C = 2d2(12/x)mm, so that log 2C << 1 + log d. Also, log(# - 1) >> log# >> logd. So the n u m b e r of solutions in such windows is << 1, and the main statement of the theorem is proven. We have already seen that the number of approximations with y > B1 is << 1. It remains to count the solutions with y in the interval h(a) < y < B1. Recall that
B, = (8h(~)) 1~/~ and A = 2d(12/X) m. Then
B1 -- (8h(o~)) 2d(121x)'+'.
66
If h(a) < 41°, then the second assertion of the theorem follows from the first. If h(a) > 41°, then h(a) > (Sh(a)) 1/~. Our interval is a window of exponential width not greater than log B1 = 4d(12/X),,+l. log(Sh(c~)) 1/2 We also have 5 = tt - 2 > 2.1 - 2 = 1/10, so that 41/~ < 4 l°. Then by Lemma 8C, the number of approximations is not greater than
log(Sd(12/X)) m+'
1+
log(~
-
1)
'
which is << 1. T H E O R E M 9B. Suppose a is algebraic of degree d > 3, and 0 < 5 < 1. Then the number of 5-approximations to ~ is less than log + log h(o~) +e(d,5), L
(9.2)
where
-~-(log 2d) log
log 2d
.
This theorem estimates the number of "exceptional" approximations in Roth's Theorem. Davenport and Roth (1956) had given an estimate with a s u m m a n d exp(70daS-2). The latest results are by Bombieri and Van der Poorten (1988) and by Luckhardt (1989). Both use the Theorem of Esnault and Viehweg. In Theorem 9B, the first term in the estimate is best possible (see Theorem 9C below), but the c(d, 5) term can probably be improved. Actually Bombieri and Van der Poorten had 3000 in place of 10 s, but they also had
c~--~ <
1
64y2+ ~ rather than 5-approximations, and had 5/2 in place of L = log(1 + 5) > 5/2 in (9.2). P r o o f . Put m = [(50/5) 21og2d] and A = 2m m. Then A > d. Consider "small solutions" to be those with
y =<
= B2.
By Theorem 8D, the number of such approximations is not greater than L
+20
+1
Estimating the first term, we have 50A log 132 = T log(4h(a)),
.
67
so that log log B2 < log ~ + log X + log log 4h(c@ We know that
and also loglog4h(o 0 < 1 + log + log h(oO. So the total number of "small solutions" is not greater than log + log h(a) 2((50//~) 2 log 2d) log((50/~) 2 log 2d) + L ~/2 Now consider "large solutions" to be those with y > B2. As in the previous proof, if ,~ = x/y, then h(~) < 3h(a)dy. Consider y2+~ =
y2+(~/2)y6/2
> hG3)2+(~/2)(3h(a))-3dyS/2
> h(¢~)2+(~/2~. The last inequality follows because y > B2 and X > d. So y2+~ yields 1
h(¢~)2+(~/2)" Apply Theorem 6B with ~/2 in place of 6. Then either h(~) < B2 (which in our case is ruled out) or h(/~) lies in the union of at most m - 1 windows of exponential width C = 6drnX. Notice that B2 > 4 2/~, so we can apply Lemma 8C with ~/2 in place of 6. This tells us that the number of approximations with h(~) in the given window is not larger than log 2C 4 + 1 =< ~ 1 o g 2 C + 1
log(1 +
< 5 log 2C. o
We will estimate log2C. We know that 2 C = 24din re+l, so that log 2C = < 2m log m + log 24d.
68
Then the number of approximations in a given window is <
m log m + ~ log 24d
< ~ m log m. The number of windows is less than m, so the total number of "large" approximations is less than ~ - m logm <
(log2d)21og
log2d
.
Combining the results for "small" and "large solutions" gives the desired bound. T H E O R E M 9C. Let K be a real algebraic number t~eld of degree d > 2. Let 5 > 0 be given. Then there are infinitely many a E K, with K = Q(a) and h(o0 > e,
such that the number of 5-approximations to a is greater than co(K, 5).
loglog h(a) L
R e m a r k . We could drop the condition that h(a) > e if we replace log log h(a) with log + log h(a). This result is due to Mueller and Schmidt (1989). P r o o f . We may choose 7 E K with Q(7) = K and I'rl < 1/2. We also construct
thesequence {P-~n} a s i n T h e o r e m S E . G i v e n N > 1, let b N b e t h e l e a s t i n t e g e r s u e h that bN > q~N +8. Then we may pick an integer aN with aN bN
"Y
[
1
<
2bN
=
< __
PN qN
Set
1
2a~v+~"
aN aN
bN "
= 7
Then a N generates our number field. Suppose n satisfies 1 _< n _< N. Then we have aN
__ P "
~
<
aN
P~
1 < 2q-~N+~+
+
1 q n- q n + l
< 1 1 = 2q +' + 2q +' 1 q2v+~ '
P~N N
q,
I
69 (__/ where the last inequality follows from the construction o f / P ~ ' ~ in§8. (Recall, we had q,,+l = > 2qa,+~.) So for 1 =< n = < N, we have that p,,/q,~ is a 6-approximation to aN. Hence, o~N has at least N of these 6-approximations. Now we seek a lower limit for N in terms of h(ag). We have (see Exercise 7C in Chapter I)
h(aN) = h 7 -
= < v~h(7) h ( a ~ ) Furthermore,
a~-NN=
+
=<
bu.
We than have
h(OtN) 5 v/~h(~/)bN <
+'
since bN was the least integer with bN > q2N+6.Taking logarithms gives log h(a) =< (2 + 6)log qN + c ' ( g ) , where the constant depends only on K at this point. By our construction in §8, we know that logqN < (1 + 6) n-1 = 6 log 3. Therefore,
log h(aN) < c"(K, 8)(1 + ~)N and
N > loglogh(aN) _ co(K,~). =
§10. A Generalization
L
of Roth's
Theorem.
Let Q c k C K be algebraic number fields. As in §I.6, let M(K) be an indexing set for absolute values of K. We write
M(K) = Mo(g) U Moo(K),
70
where Mo(K) consists of non-Archimedean, and M ~ of the Archimedean absolute values. In most parts of these Notes, S will denote a finite set of the type M ~ ( K ) C S C M(K). H o w e v e r , in the present section we need only that S is a finite subset o f M ( K ) . Suppose that for each v 6 S we are given a linear form Lv = L~(X, Y ) with coefficients in K. We will study the inequality
IX IL'(x)lT°lxl-o < Hk(x) - ~ - "
(10.1)
yES
in unknowns x = (x, y) with components in k, where Ixl. = max(Ix[r, [y[~) and where n~ is the local degree. Since (10.1) is unaffected if we replace x by a multiple, x in projective space Pl(k). 10n. Given ~ > O, (10.1) has only finitelym a n y solutions x 6 pl(k). This is due to Lang (1962). Earlier generalisations of Roth's Theorem were given by Ridout (1958). Now why does this actually give Roth's Theorem? Suppose a is algebraic, and suppose THEOREM
~_~ < 1 lyl2+ .
(10.2)
Then ]o~y- x] < Cl]X[ - 1 - t
with a constant C1. Further if a = a O ) , . . . , a(d) are the conjugates of a (in C), then - x J . . . I (d)y -
xl =< C21xl d-
-t
Set L , ( X , Y ) = a Y - X for each v Then if k = Q and K = Q(a), we have H vEMcc(K)
IL'(x)[~u = f i [a(/)y - x[ < C2lx[ - 2 - " = C2Hk(x) d-2-~, i=1
or
II ] L ' ( x ) I ~ < C2Hk(x)-2-e" v6M~(g) ]xln~ By Theorem 10A this has only a finite number of solutions, x 6 pl(Q), so that (10.2) has only finitely many solutions x/y. A more quantitative version is as follows. THEOREM 10B. Suppose K-is of degree 5. Suppose these are not more than t distinct forms Lv for v 6 S. Define [Lvlv and HK(Lv) in terms of the coe~cient vectors
71
of L~, and suppose that HK(L~) < H t'or v 6 S. Then for given C > 0, the number of solutions
H (' IL~(x)l~
)°o< CHK(X)-~-~
in x E Pl(k) with Hk(x) > Cl(6, t,e)(C + H + 1) c2(6't'')
is less than
c3( , t, where s = Card S. As Evertse, GySry, Stewart and Tijdeman (1988) point out, this theorem can be proved by making Lang's arguments more explicit, and combining them with ideas of Davenport and Roth (1955). But no explicit proof of T h e o r e m 10B has been published. T h e following exercises are not on the material of this particular section but could have been given earlier. E x e r c i s e 10a. Let B be a symmetric convex body in •"
and A a lattice. The
inhomogeneous minimum of B with respect to A is defined as the least # such that A + # B covers R", (i.e., every x 6 R " may be written as g + x with l 6 A and y E #B). Prove that # is well-defined, and that 0 < # < n)~n/2, where )~, is the n t h minimum. For n = 2 and B a disk centered at the origin we have the following picture.
E x e r c i s e 10b. Let c~ 6 R be irrational. We call (x, y) 6 Z 2 a best approximation to a if y is positive, l a y - x [ < 1/2, and if for any other pair (x', y') with 1 < y' < y, we have lay' - x'] > lay - x[. Show that one gets an infinite sequence of best approximations, say ( x l , y l ) , ( x 2 , y 2 ) , . . . , with yx < y2 < "'" and 1
l a y i - x i l _-< - Yi+l
(i = 1 , 2 , . . . ) .
72 E x e r c i s e 10c. Let a E R be irrational, and for N > 1, let H(N) be the parallelogram lay-
zl =< ~1,
lyl =< N .
(10.3)
Since the area of P ( N ) is 4, Minkowski's Convex Body Theory says that the first minimum satisfies ~1 = AI(N) =< 1. Show that there are arbitrarily large values of N with A2(N) < 1. (Hint: This should follow from Exercise 10b.) E x e r c i s e 10d. Combine Exercises 10a and 10c to show that if a, fl E R, where a is irrationM and fl is not of the type fl = m a + n with m, n E Z, then there are infinitely many (x,y) E Z ~ with y > 0 and lay-
~-
x I < l/y.
R e m a r k s . Exercise 10d is a quantitative version of the one-dimensional "Kronecker's Theorem", which only asserts that we can solve l a y - ~ - xl <
for (~ irrational. Minkowski proved that (10.3) may be replaced by the stronger inequality l a y - Z - xl < 1 / ( 4 y ) ,
which is best possible. (See Cassels (1957), Chapter III, Theorem II h).
III. The Thue Equation. References: Whue (1909), A. Baker (1968), Bombieri and Schmidt (1987). §1. M a i n R e s u l t . Let F ( X , Y ) = ao Xd + a l X d - l Y + . . . + adY d with ai E Z be a form of degree d => 3 which is irreducible over Q. R e m a r k . Such a form F can never be irreducible over C. First consider
F ( X , 1) = aox d + a i X d-i j r . . . Jr ad = ao(X - (~1)"" (X - (~d) with a l , . . . , ad algebraic of degree d and conjugates of one another. Then
F(X,Y) = ydF (X,1)
= ao(X - - o ~ l Y ) . . . ( X - - a d Y ) .
T h e o r e m 1A. (Thue, 1909). Let F as above and m be given. The equation (1.1)
F(x, y) = m has only t~nitely many integer solutions (x, y). R e m a r k . Today, equations of type (1.1) are called Thue equations. R e m a r k . Theorem 1A is false for d -- 2. Consider, for example, x 2 -- 2y 2 =
1.
This equation factors into (x + v
y)(x - v S y ) = 1.
If D -- Z[x/2], i.e., the ring of elements x + vf2y with x,y E Z, then e = x + v ~ y and g = x - v ~ y axe units in D. In particular, we can take x = 3 and y -- 2. Then ~0 = 3 jr 2x/~ is a unit. For each n > 1, the number ~ is also a unit, which gives a solution e~ = x , jr v ~ y , to x 2n _2y2 = 1. For example E02 = 17jr 12v/2, so that x2 = 17, y2 -- 12. P r o o f . Factoring F(x, y) over C, we can write a0(
-
-
dY) = m .
(1.2)
Then dividing by yd and taking absolute values gives ]a0[]C~l-y "'" ~d--Yl---- ~ddl" We have, without loss of generality, Ix -- aly[
=
min
l
Ix - aiy],
(1.3)
74 which is the same as -y
l~i~
Also, let 1 3' = 5 ml ai ni i c j
-ail
> 0.
If y is large, then both sides of (1.3) will be small. In particular, a l - ,~ will be small. For i ~ 1, observe that ~i -
= l a i - ~11 -
~1 -
=
- 3' = 3"-
Then we have --
a0
~d
i~1 a"
Since d > 3, Roth's Theorem implies that there is only a finite number of solutions
(~,u).
E x e r c i s e l a . The proof of Liouville's Theorem 1E of Chapter II uses implicitly that
If( x, Y)I > 1 for integers x, y. (Actually, it uses that IP(z/y)l > 1/y d for a polynomial P E Z[X] of degree d which does not vanish at x/y.) Employ Roth's Theorem to show that for F as in Theorem 1A,
IF(x,y)l _->c 0 ( F , e ) ( m a x ( I x l ,
lyl) a - z - "
> 0.
E x e r c i s e l b . In Theorem 1A, one may weaken the hypothesis on F. Rather than supposing that F is irreducible over Q, assume that at least three of the complex numbers a l , . . . , ad in (1.2) are distinct. The methods of Thue, Siegel, and Roth are "ineffective" in the sense that they don't yield a bound A = A(F, m) such that any solution (x, y) satisfies max(]xh lyl) < A. Alan Baker (1967) remedied this situation. Thue's method, however, can be used to give some upper bound on the number of possible solutions. Lewis and Mahler (1961) gave a bound
B = B(d, m, g), where H =
max
lail. For m a n y years, it had been conjectured that a better bound
~<=i<=d
could be obtained. Siegel (1929) conjectured that there should be some B = B(d, m) independent of H. In subsequent work, he proved this for some cases. Evertse (1983) proved the conjecture in his thesis, obtaining the bound
715((~)+1)2 + 6 x 7 2(~)(~+1),
75
where u is the n u m b e r of distinct prime factors of m. More recently, Bombieri and Schmidt gave the following result. THEOREM l B . (Bombieri and Schmidt, 1987). Let F and m be given. The n u m b e r of primitive solutions to the diophantine equation F ( x , y) = m is not greater than c0 a l l + v ,
where co is an absolute constant, u is the n u m b e r of distinct prime factors o f m, and d is the degree of F. Further advances on the n u m b e r of solutions were made in recent work of Stewart (to appear). THEOREM inequality"
1C. Let F and m be given. The n u m b e r of solutions of the "Thue
IF(x,
=< m
is <( dm2/d(1 + log ml/a). In T h e o r e m 1B, consider the case m = 1. T h e n we have F ( x , y) = 1 and v = O. So the n u m b e r of solutions is << d. This b o u n d is not bad, as is seen by the following example. Consider x
+ c(x - y ) ( 2 x - y ) . . . (dx -
= 1.
(1.5)
This equation has at least d solutions, namely (1,1), ( 1 , 2 ) , . . . , (1,d). In fact, if d is even, we have 2d solutions. Hence we see that << d is best possible. Note that when c is e.g., a prime, then the form in (1.5) is irreducible over Q by Eisenstein's criterion. Now let m be arbitrary, say m = p~l ... pZ~. T h e n Pl "'" p~ < m and logpl +logp2 + • .- + log p~ < log m. If we have Pl < P2 < "'" < P~, then Pi > i + 1 (i = 1 , . . . , v). Thus log2 + log3 + . . . + log(g + 1) < logm, and for m sufficiently large, v l o g v < (1 + ¢ ) l o g m . Then, again for e > 0 and m sufficiently large, v <
logm
"1
= log--i~ogm( + e ) " So the n u m b e r of solutions of F ( x , y) = m is << d l + v << m ~ d,e a s m --÷ Oo.
C o n j e c t u r e . Given a form F as above, the n u m b e r of solutions of F ( x , y) = m is (log m)%
76
where c is an absolute constant. Perhaps this is valid for c = 1, or at least for every c>l. Now we will look at T h e o r e m 1C. Consider the set S ( m ) of (x, y) E R 2 with
tF(
,y)l _-< m.
T h e n S(rn) = m l / d S ( 1 ) . Suppose F ( x , y ) = ao(x - ~ l Y ) " " (x - ady), and say, for simplicity, that the ai are real. Then $(1) contains the lines x = a i y as in the following illustration. x
=
~ly
=
~2y
x = w3y In T h e o r e m 1C, we are counting the integer points in S ( m ) . Intuitively, the n u m b e r should be approximately the area of S ( m ) . S o i f A F = area(S(1)), the n u m b e r of solutions should be about m 2 / d A f . Remark. inequality is
Mahler (1934) has shown that the n u m b e r of solutions of the Thue ,.., AFro2~ d
as m ~ oo. But the dependency of the error term on the coefficients of F was left unspecified.
§2. P r e l i m i n a r i e s . forms in 2 variables,
We may deal with a more general type of forms t h a n the
F ( X , Y ) = a o ( X - aO)Y) . . . ( X - a ( d ) r ) ,
of §1. Let L ( X ) be a linear form O ¢ l X 1 3v - . . -~- o t n X n with coefficients in a n u m b e r field K of degree d. We will suppose that a l , . . . , a n are linearly independent over
77
Q. If a ~
o~(i) (i = 1 , . . . ,d) are the embeddings of K into C, then let L(i)(x) =
ol~i)Xl ~-"'" "31"Og(~)Xn (i = 1,... ,d). * •orm form is a form F ( X ) = F ( X 1 , . . . , X , ) = aL(')(X) .-. L(~)(X) where a E Q×. If K C_C, put
ILl = x/l&,l ~ + . - . + I~.12 . The norm form F ( X ) has rational coefficients, so we may write F ( X ) = a ' a ( X ) where a' q Q and the coefficients of G are relatively prime integers. Then we have cont (F) = la'l. We define HK(L) = HK (Oil,... ,ITn). E x e r c i s e 2a. Let o~ be a root of a 4 - 2a - 2 = 0 and K = Q(a). Compute the norm form 91(x + a y + a 2 z ) = z 4 + . . . .
LEMMA
2A. Given a norm form F ( X ) (as above), we have lallL(')l"" IL(d) l = (cont F)HK(L).
P r o o f . Write M( K) = Moo(K) U Mo( K), where Moo(K) consists of Archimedean absolute values (i.e., v [ co), and M0 of non-Archimedean absolute values. We have
HK(L)= H
ILl:',
~eM(K)
where ILl, = I(~,,-..,
~-)1~. Therefore HK(L) = HKOO(L)HKo(L)
(2.1)
where
HKj(L) =
E [LI:" ~,CM~(g)
(j = co, O).
By (vi) of Chapter I, §6,
Hgoo(L) = IL(1)] ... IL (~)j. By (vii) of Chapter I, §6, we see that if a , embeddings of K into C, then
H~o(L) = H~:(,)o(L(O) =
(2.2)
, a (/) (i = 1 , . . . ,d) are the isomorphic
H IL(')l"o vEMo(K(O)
(i = 1,...d).
Therefore if E is the composition of K 0 ) , . . . , g(d), then
HKo(L) =
H
wEMo(E)
IL(OI~/"
(i = 1 , . . . d )
78
where e = [ E : K (I)] (i = 1 . , . . . , d ) . Thus
(IL(1)lw... IL(a)Jw)"" wEMo(E)
--
la-l Fl~,",
II
wEMo(E)
where in the last relation we used F = aL 0 ) . . . L (a) and Gauss' Lemma, and the notation that for any polynomial P, we write IPI,~ for the m a x i m u m of Iclw for the coefficients c of P. Since a - i F has rational coefficients,
o
Therefore
HKo(L) = lal " (Cont F ) -1. This, in conjunction with (2.1), (2.2), gives the assertion. Recall, (~1,--- , (~, are linearly independent over Q. So we can extend them to a field basis (~1,... , ~ , , - . . ,~d of K over Q. T h e n it is known that the matrix
(l
=
is non-singular. Hence the submatrix
o~j(i)")
(1 _
l =<j = n <)
has rank n. Among the linear forms L 0 ) , . . . ,L(d), there is a set of n linearly independent ones. Let I be the collection of n-tuples ( i l , . . . , i , ) with 1 < i i < d for which the forms L ( i ' ) , . . . , L (/") are linearly independent. The semi-discriminant of F is given by
D(F)
=
lali±In/11
Idet( L(i')'''" 'L(~"))h
(i~,... ,i,)EI
where I I I = card (I). It is easy to see that D(F) is independent of how we write F. Say F -- aL (1) ..- L (d) = bM (1) ... M (d). By the essential uniqueness of factorization, we have after reordering that M (/) = )`iL (i) with certain coefficients ) , 1 , . . . , )`d. Since I = I ( L ( 1 ) , . . . , L (d)) = I ( M ( D , . . . , M (d)), it will suffice (for the independence) that each L (0 occurs exactly ]I]n/d times in a determinant of the product. All of the L (/) together occur II]n times, so if it is "fair", each L (0 will occur IIIn/d times. Since there is an automorphism of E mapping L 0) into L (0, (1 _< i < d), each L (0 does indeed occur the same number of times, i.e., IIIn/d times.
79 Clearly D(F) is nonzero. Because the product is fixed (where E is the composition of K 0 ) , . . . , K(d)), and since by the exponent IIin/d is an integer, we may conclude that D(F) The coefficients of the L (i) lie in the field E . If v • Mo(E),
under a • Gae(E/Q), what we have just said is rational. then by Gauss' Lemma,
lal,lL(a)l~"" IL(a)l~ --- IFI~If we suppose that F has integer coefficients, which we shall, then
lal~lL(a)l~"" IL(a)l~ = IFI, _5 1. On the other hand,
I det(L(il),... L(in))l ~ __< IL(il)l~... IL(i-)l~. We have noted that each conjugate L (0 occurs exactly IIIn/d times in the product for D. T h e n for v E Mo(E), we have
]nl~ _-< (lal~l/(a)l. -.- l/(~)l~)llln/a < 1. Since this is true for every non-Archimedean is the usual absolute value. If v is Archimedean, then
I1~,
IOl, _5 (lal~lL(])l,-..
we have D E Z and IDI ~
1,
where II
IL(a)l,)lIl'/a
by H a d a m a r d ' s inequality. So we have
IOl =< (lallL(1)l'" IL(a)l)lZtn/a for the ordinary absolute value also. We may summarize our results as follows. L E M M A 2B. Suppose F is a norm form with coet~cients in Z. rationa/integer, D ¢ 0, and
Then D is a
D(F) < H*(F) IxlnH, where H*(F) = (cont F)HK(L) = lall L(1) . . . L(d)l. R e m a r k . Suppose n = 2 and F = a(X - ~(1)y)... (X - a(a)Y) is irreducible over Q. Then a(i) ~ a(j) for i ¢ j , and I consists of all pairs (i,j) with i ¢ j and 1 < i, j < d, so that I I I = d ( d - 1)/2. We may conclude that
D(F) < H*(F) d-1. Now suppose T : Z n --* Z n is a linear map. P u t F T ( x ) = F ( T X ) . T h e n
D( F T) = I det TIIII D( F ).
(2.3)
8O
Note t h a t the set I introduced above depends on the ordering of L O ) , . . . , L (d). But its cardinality, which by abuse of notation we denote by I I ( F ) I , depends on F only. For T E GL(nonsingular T, Z),
l I ( F r ) l = II(r)l. Let p be a prime. Let To, T 1 , . . . , Tp be the linear m a p s given by the matrices
0)
Suppose x = ( ; )
0)
lies in Z 2. T h e n either x -- 0 ( m o d p ) or x ~ 0 ( m o d p ) . In the first
case, x = pxl for some Xl E Z and
On the other hand, if x ~ 0 ( m o d p ) , then y - j x ( m o d p ) , or y = j x + pyl with 1 < j < p. In this case,
We have seen t h a t any integer point x m a y be written as X = TjX I
for some j , (0 < j < p), and x ' E Z 2. Further, if x is primitive, then so is x ' . More generally, for n variables, put
/p / 1
To=
O
1
O
,
1
0
J
p
Tj=
1°) 1
.°.
O
1
,
( I <"j < P=)=
",
1
Again, every x E Z'* m a y be written as x = Tjx' for some j and some x ~ E Z n. In symbols, P
zn= Ur zn. j=O
Suppose t h a t we want to s t u d y the n u m b e r of solutions of the diophantine equation F ( x ) -- 1. This n u m b e r is not bigger t h a n the sum of the n u m b e r s of solutions of
F Tj (x) = F ( T / x ) = 1
81 for j = 0 , 1 , . . . ,p. Since D ( F T) = [detT[ID(F), we know that D ( F ~ ) > pl1[. The following lemma is now obvious. L E M M A 2C. Let d be fixed and let ~ be a class of norm forms which is closed under nonsingular substitutions F : Z n ~ Z n. Let N¢ be the maximum number of solutions o f F ( x ) -- 1 over all F in ~. Let N¢ (p) be the maximum number of solutions of F(x) = 1 over ~ F in ~ with D ( F ) >= # l . Then
N¢ < (V + 1)No (V). R e m a r k 2D. The lemma can be modified in two ways. Rather than counting solutions of F ( x ) --- 1, we m a y count solutions of F ( x ) E S, where S is a given set of integers. Secondly, the lemma remains true if instead of counting all integer points we count only primitive integer points. Now let us restrict to n = 2 and F ( X , Y ) as in Thue's Theorem, and of degree d. Let WE(m) be the number of integer solutions of the Whue inequality [F(x, Y)I < m and PF(m) the number of primitive integer solutions of the same inequality. Let P~F(m) be the number of primitive integer solutions of the inequality
m/2 ~ < IF(x, u)l _5 m.
(2.4)
Furthermore, let N ( m ) = max.NF(m), and likewise for P ( m ) and P'(m). F P R O P O S I T I O N 2E. Suppose F is a form as in Thue's Theorem and D ( F ) >=(50ma/d) 2111. Then
P V m ) << d(1 + log-~'/~), where the implicit constant is absolute. Proposition 2E will be proven in sections 3, 4, and 5. We will use it now to deduce Theorem 1C. P r o o f . (of Theorem 1C). Pick a prime p > 250Ore 2/a.
(2.5)
Then if D ( F ) > pl1], the condition of Proposition 2E is true. Combining Remark 2D and the proposition, we get that
P ' ( m ) << (p + 1)d(1 +
logm l/d)
<< dm2/d(1 + log mUa), provided p is a prime chosen as small as possible with (2.5). Given m, pick u satisfying 2 du <-_ m <
2 d(u+l).
82
Then
P(m) =< P ( 2 d('+l)) u+l
= ~ P'(2 ~) j=0 u+l
<<
d ~ 2~(1 + log 20 j=O
<< d22U(1 + u) << dm2/d(1 + logml/d). Now we can count all the solutions of [F(x, V)] < m. Given such a solution, say (z,v) = t(x',y') where (x',y') is primitive and t > 0, we have
[F(z', Y')I < rn/td, since F is homogeneous of degree d. Thus
t < mr/d
<< d ~
m2/d
t----1
(
l+log
<< dm2/d(1 + log ml/d). Before giving a proof of Proposition 2E, we show that we may place additional hypotheses on F by introducing an equivalence relation on forms. If F and G are forms of degree d, we say F -~ G if G = F T for some T E SL(2, Z), where SL(2, Z) is the group of one-to-one, linear maps from Z 2 onto itself. The number of solutions of the T h u e equation F ( z , y) = m, or the inequality IF(z, v)l _5 m, depends only on the equivalence class of F. T h e same is also true for primitive solutions. Recall that we let P'F(m) denote the number of primitive solutions of the inequality (2.4). Suppose there is such a solution, say x0 = (x0, Y0). Since x0 is primitive, there exist integers xl, Yl G Z with xl
Yl
Let G(X, Y) = F(zoX + xlY, yoX + ylY). Then G ,-~ F and G(1,0) = F(xo, Yo). This gives us
m/2 d < IV(l, 0)l _5 m. We may, therefore, restrict ourselves to forms F which are normalized in the sense that m/2 d < I F ( l , 0)1 <m. By writing the form as F(X, Y) = a(X - a(1)Y) ... ( X - a(d)Y), this inequality becomes m / d d < Jal 5_ m. We will call F reduced if F is normalized and if H*(F) is minimal among all normalized forms which are equivalent to F. We have seen that Proposition 2E is
83
sufficient to get our bound on the number of solutions of the Thue inequality. Since D(F) and II(F)I are invariant under substitutions from SL(2, Z), we may restrict ourselves to reduced forms F. In view of D(F) > (50rnl/d) 21II and Lemma 2B, we need only concern ourselves with reduced forms having
H*(F) > 50din.
(2.6)
We m a y also suppose, w i t h o u t loss of generality, that cont (F) = 1. For in general, if F = c/~ with cont (z~) = 1, we replace F, m respectively by ~" = c - I F , fit = c - l m , so that (2.4) becomes fit/2 d < IF(x)l < fit, and F is normalized (with respect to fit) and reduced, and (2.6) changes into H * ( F ) > 50dfit. W h a t we need to prove, then, is the following PROPOSITION 2F. Let F be a form as in Thue's Theorem. reduced, cont ( F) = 1, and (2.6) holds. Then
P~F(m) << d(1 + log mud).
Suppose F is
(2.7)
§3. M o r e o n t h e c o n n e c t i o n b e t w e e n T h u e ' s E q u a t i o n a n d D i o p h a n t i n e A p p r o x i m a t i o n s . Throughout the next few sections, we will let F ( X , Y ) be a form of degree d as in Whue's Theorem. We will write f ( X ) = F ( X , 1) = a(X - 4 ( ' ) ) . . . ( X a(d)). LEMMA
3A. Suppose we are given (z, y) with Y > O. Let a be a root of f ( X )
with
Suppose 1 < u < d and f(")(a) # O. Then Ot
x I < d ( 2 d l F ( x , y ) l ~ ~/~
(3.2)
P r o o f . Let 6 denote any ordered subset of { 1 , 2 , . . . ,d} of cardinality u. How m a n y such ~ are there? There are
d(d-1)...(d-u+l) Let f 6
< d u.
(X) be defined by
f~
(X)=a
H (X--°t(i))" l<__i<_d
Then we have
f(")(x) =
(x). 6
84
Therefore, given our a, there is some such 6 with If(=)(~)l _5 d"lfe (~)1. This e will be fixed in the sequel. For any j, where 1 =< j _-< d, we have by (3.1) that
I~(~ - ~(J))l -_< I~ - xl + I~ (~) - xl 5_ 21y~ (;) - xl. Now f 6
(a) satisfies
lya-~f6 (~)1
= lal 1-I lu(~ - ~
and f(")(a) satisfies
Iya-~f(~)(~)l ~ d~2d-~lal 1-'[ lYc~(i)- xl. j~6 Multiplication of both sides by 1-Ije6 lYa(j) - x[ gives
so that lw~ - xl~lyd-~f(")('~)l ~ d'~2a-'*IF(x,Y)l •
The assertion (3.2) now easily follows. In what follows, we will apply the lemma for u = 1. For this we need a lower bound for If'(~)l. L E M M A 3B. Suppose f ( X ) E Z[X] is of degree d with cont (f) = 1, and is irreducible over Q. Let a be a root of f and K = Q(a). Then If'(~)l
>
ID(f)ll/2 >
= hK(ot)d-2
1
= hK(o~)d-2
(where D(f) is the (classical) discriminant of f ). Proof.
W r i t e f ( X ) = a(X
-
oL(1)) ..- (X
-
~ ( d ) ) and suppose a = a ('). W e have
if(a) = a(a - a(z)) -.. (a - a (d)) and
a2d--2
(f'(~))~
=
a2d-4
H
1 < i<3 < d (OL(i) I-~
2 5- i<j <=
_ a(.i))2
d(OL(i) -- c~(J)) 2
D(f) G
'
85
say, where G
=
a 2 d - 4 H 2 < i<j
<= d("(i)
].(i) -- .(j)12
--
.(j))2. We need to estimate G. We have
=< ].(012 + ].(i)]2 + 21.(0.0")1
< (1 +
since 21.(011.(J)l _5 1 + I.(Ol2l.(J)l 2. IGI _5 laU -4
I.(')12)(1
+
I.O)12)
Therefore,
17[
(1 + I.")12)(1 + I.¢J)l 2)
2 <__ i < j < d d
= la[ 2a-4 II(1 + I.(i)l~) d-2
i=2
_5 (lal(1 + ].0)]2)1/2 .--(1
+
I.(a)12) '/2)
2d-4.
Let F ( X , Y ) = a(X - . ( 1 ) y ) . . . (X - o~(a)Y) = aLO)(X) ... L(d)(x). Then
IGI =< (lallL(')l"" IL(d)l)~d-4 = (hK(.))~d_ 4 by Lemma 2A, since cont (F) = 1 and HK(L (1)) = hK(.). follows, since
The desired result now
-- (hK(.))~d-," Combining these two results, with u = 1 in Lemma 3A, gives the following. LEMMA
3 e . Let F and f be as above. Suppose
IF(~,y)I 5 m. If, as before, y > 0 a n d .
is a m o n g . O ) , . . . a
,.(d) with
x] min a (i) x] -- y = 1 < i < d --'y" '
then
la-~l
<=d2ahKa(d)2-m2
yd
This result is due to Lewis and Mahler (1961). We now have some reasonably nice values for the constant in (1.4). §4. L a r g e S o l u t i o n s . primitive solutions of (2.4):
Recall, we are estimating P~,(m), i.e., the number of
m/2 ~ < IF(x)i 5 m. The hypothesis on F in Proposition 2F is that F is reduced with cont ( F ) = 1 and
H*(F) => S0dm.
86
If y = 0, we have the primitive points (1,0) and ( - 1 , 0). By putting another factor in our << inequality, we may restrict our solutions to those with y > 0. We will let "large solutions" be those with y > H * ( F ) s. (4.1) Write F ( X , Y ) = a ( X - a O ) Y ) . . . ( X - a(d)Y) and L -- X - a Y . Then H * ( F ) = ( c o n t ' F ) H K ( L ) = h K ( a ) = hK((~(i)). Therefore, "large solutions" satisfy y > (hK(a(i))) s (i = 1 , . . . , d). By L e m m a 3C, some root a satisfies x o:
-
d 2dhK(a)d-2rn <
ud
Since h g ( a ) > 50din, we have (4.2)
[a - y
<
hl~(a)d y~
with plenty to spare. Since d > 3, we have d=~
1d
7d
+8
1
> !d
> -8d + - 2 - d = 8
+
v/-d "
Then (4.2) for large solutions yields x
- y
d
1
(4.3)
< yd/Sy3 V~ /2 < ya Vq /-----~ "
As an application of Theorem 9A of Chapter II, we saw that the number of solutions of (4.3) with y > h(c 0 is << 1. Here we have y > h g ( a ) s > h((~). Thus the number of large solutions for the given root ~ is << 1. Then the total number of large solutions is << d. We state this in the following lemma. L E M M A 4A. Let F be a form of degree d as in Thue's Theorem. Suppose cont (F) = 1 and H * ( F ) > 50din. Then the n u m b e r of primitive solutions of (2.4) with y > H * ( F ) s is << d. Thus when it comes to large solutions, the log factor in (2.7) is unnecessary. §5. S m a l l S o l u t i o n s . We let small solutions be those with y < H * ( F ) s. By Lemma 2F, we may assume that F is reduced (and therefore normalized). We will also assume that H * ( F ) > 50din. (5.1) L E M M A hA. Suppose a ( x , Y ) = bo(X - ~ ( 1 ) y ) . . . ( X - ~(d)y) is normalized and is equivalent to F. Let g E Z and let 71i ~- 1/~(i) -- ~1 "4- 1
(1 --< i 5 d).
87 Then 71i ... rld > Q, where
(5.2)
Q = H*(F)/m. P r o o f . Let
G ( X , Y ) = G ( X + gY, Y ) d
= bo 1-[(x
- (~(') - ,)Y).
i=1 T h e n G ,,~ G ,,, F and G is normalized since G is normalized. F u r t h e r m o r e , since F is reduced, we have H * ( F ) < H*(G) d
= Ibol 1-I ~/1 + I~¢0 - el ~ i=1 < roT/1 •.. 7/4, since
[bol ~
m.
LEMMA inequality
5 B . Suppose F is as above and x, Xo are primitive solutions of our
m / 2 d < IF(x)l =< m.
(5.3)
Then there are numbers ¢ 1 , . . - , Ca (depending on x, x0, F, m ) such that the following conditions are satisfied: (i) for each i, either ¢ , = 0 or 1/2d <=~, <=1, d (ii) Z ¢ i _> 1/2, 4=1 (iii) /'or each i, ILi(xo)/Li(x)l = > where L i ( x ) = o~(i)y -- x a n d D(x°'x)=lx°x
Y° I'y
P r o o f . Since x is a primitive integer point, we m a y pick x ' E Z 2 with D ( x ' , x) = 1. T h e n x, x t is a basis for Z 2. Therefore, x0=rx+sx'
with
r, s E Z .
/.From linear algebra, we have D ( x o , x) = s D ( x ' , x) = s. T h e n we have
x0 = rx + D(xo, x)x'.
88 Now
Li(x0)
- r - D ( x o , x)fll,
-- r + D ( x o , x)
where fli = -Li(x')/Li(x).
(i = 1 , . . . ,d)
Set
G(V, W) = F ( V x + W x ' ) d
= ao H Li(Vx + W x ' ) i--=l d
= a0 l - I ( v L i ( x )
+ WLi(x'))
i=l d
= bo 1 - [ ( v -
iw),
i=l
where b0 = F ( x ) . So G is normalized and G ,-, F . Since x, x0 are solutions to our original inequality, we have
0 < IF(xo)/F(x)l < 2 a, or equivalently, d
0 < H 1Li(xo)/Li(x)l < 2 a. i=l
T h e n at least one factor is no greater t h a n 2, say
ILa(xo)/La(x)l =< 2. T h e n we have
I v - D(xo,x)/~l < 2. P u t t i n g / 3 = Re fla gives
I v - D(xo,x)Zl 5 2. Let g E Z with [~ - g ] _5 1/2. We have
ILi(xo)/Li(x)[ = I(~ - 13i)D(xo, x) + r - D ( x o , x)fl I
=>I/~ -/3~llD(xo,x)l --->( I g - ~ l
1
fi
- 2
2)lD(x°,x)[
~)lD(xo,x)h
= (oiif ~/i = ]g - f l i ] + 1. For each i, p u t Q t
r/i=
if
T/i if 1
if
7/i__>Q, Q1/2d <
= T/i
r/i < Q1/2a.
89
Define ¢i by r/~ = Q¢'. By Lemma 5A, we have 7 h . . . T/d =>Q. Therefore, r/l ..- r/~ > Q1/2, d
so that Y~¢i > 1/2. Furthermore, r/i > r/~ = Q¢', which gives i=1
as desired. In applications of Lemma 5B, we will take x0 = (1, 0). This is a solution since F is normalized. In this case, Li(x0) = - 1 and D(x0, x) = y, so the lemma gives ¢ 1 , . . . , Ca satisfying
1/,Li(x)] > (O'~' - ~ ) Now suppose that ~bi > 1/2d. Then Q~' > ~
(i=1,. ,d)... > 7 by (5.1), (5.2), and
1/ILl(x)[ > Q¢'y/2 > Q¢,/2y
(i = 1,... ,d).
That is, IL,(x)l _5 llQ¢'/2y
(i = 1,... ,d)
and
a ( i ) - - Y l < I/Q¢'/2y2
(i -- 1 , . . . , d ) .
We state this in the following lemma. L E M M A 5C. Suppose F. is a form as above (normalized and reduced) and x = (x, y) is a primitive solution of our inequality, i.e.
m / 2 d < [F(x, y)[ < m
(5.3)
with y > O. Then there exist numbers ¢i = ¢i(x) (i = 1,... d) as above with
]a(i) - yx I <
1
Q¢,/2y--------~
(5.4)
for every i in 1 < i <_ d with ¢i > O. Compare this with Lemma 3C which says that for some a (1), we have
For small values of y, Lemma 5C is better than 3C, because there is no constant in the numerator in (5.4).
90
L E M M A 5D. Let 3£ be the set of primitive solutions of our inequality (5.3) with 1 =< y =< Y = H * ( F ) s. Then for a n y i , 1 _< i _< d, we have
log___Y_Y Z
¢i(x) << 1 + logQ"
xE~
P r o o f . Given i, let X l , . . . ,X v be the elements of 3£ with ¢i(x) > 0, ordered such that yl < Y2 < "'" < y, < Y. Then l
<
Xj
XjT 1 [
Yj
Yj+I
YjYj+I
l
Yi Yj+l 1 1 < Q¢,(xj)/2y2 + Q¢,(xj+,)/2y~+ 1 < 1 + - -1 = q,~(xi)/2y~ 2yjyj+l " The last inequality follows from Q ¢i(xi+~)/2 we have 1
=>
Q1/4d >
501/4 >
=
2 and Yj+I = > Yj. Now
1
<
2yjyj+l = Qq,~(,,j)12y~ ' which gives yj+l > Q¢'("J)l~yj/2 => Q¢,(,,j)/4 yj. (This relationship between yj and Yj+x can be thought of as a "variable gap principle".) Since we have y~,/yl < Y , this gap principle tells us that v--1
YI Q~,(xj)/4 __
and therefore v--I
log Y
¢,(xj) < 4log j=l
=
Q.
Since ¢ i ( x , ) =< 1, the lemma follows. We are now able to estimate the cardinality of 3£. /,From Lemmas 5B (ii) and 5D, we have d
XX i=1
xE.~
f
<< d i1 + =
(1 +
\
log Y'~ logQ] log H *(F)S "~
logQ
]"
91 Therefore, on using (5.1), (5.2), we obtain fXl<
logm
1 + ~ /
= d(1 + logrnl/d). We have proven, then, the Proposition, and therefore the main result ( T h e o r e m 1C), which said that the n u m b e r of solutions of the Thue inequality IF(x)l < m is << dm2/a(1 + log ml/a). In particular, the n u m b e r of solutions of F ( x ) = 1 is << d. R e m a r k . One might believe that the number of solutions could be estimated in terms of the n u m b e r of real a(0's, rather than d. This is not the case, as shown by the following. ,4 E x a m p l e . Let F ( X , Y ) = X d + c ( X - Y)~(2X - y ) 2 ... ( ~ X - y ) 2 , where c > 0, and d is even. Then F ( x , y ) = 0 for real x , y would imply that x = y = 0. So F ( x , 1) has no real root. However, F ( x , y ) = 1 has d non-trivial integer solutions, namely +(1, 1 ) , + ( 1 , 2 ) , . . . , 4-(1, d). This example was communicated to me by M. Waidschmidt. §6. H o w t o G o f r o m F ( x ) = 1 t o F ( x ) = m. We now know that the equality F ( x ) = 1 has << d solutions. We want to extend our results to T h e o r e m 1B, which states that F ( x ) = m has <:< d l+v solutions, where v = v(m) is the number of distinct prime factors of m. To consider this problem, we go to a wider setting. Consider forms F ( X 1 , . . . , X n ) • Z [ X 1 , . . . ,Xn] which are decomposable, i.e., F = L I " " Ld where L 1 , . . . ,Ld are linear forms in n variables with complex coefficients. E x e r c i s e 6a. Let F be decomposable as above. Show that F can be written as a product L1 " " Ld, where now the coefficients of each L~ generate an algebraic number field Ki of degree [K~ : Q] < d. For given n and d, let ¢ be a class of decomposable forms which is closed in the following sense. If F E ~ and G = c F T with c a non-zero rational, T E G L ( n , Z), and G E Z [ X 1 , . . . , X,~], then G • ~: also. E x a m p l e . Suppose n = 2 and ¢ the class of forms of given degree d > 3 and irreducible over Q, as in Thue's equation. T h e n this class is closed, since irreducibility is unchanged under a linear transformation. E x a m p l e . Suppose F = aL (1) ... L (d) is a norm form where any n among L 0 ) , . . . , L (d) axe linearly independent. T h e n this class is also closed.
92 For a given class E, let Me (1) be the m a x i m u m number of solutions of F ( x ) = 1 with x E Z n, where the maximum is over all F E E. Let Me (m) be the m a x i m u m number of primitive solution of F ( x ) = m over all F E E. It is trivial that Me (m) = Me ( - m ) . THEOREM
6A. If Me (1) is tinite, then
Me (m) < =
n-1
d,,_l(ma)M¢ (1),
where v = v(m) is the n u m b e r of distinct prime factors of m, and dn-l(X) is the n u m b e r of ways of writing x = x l x 2 . . . x , - 1 with xi E N, (i = 1,... , n - 1). This result has its genesis in Bombieri and Schmidt (1987), where the case n = 2 was shown. R e m a r k . The function d2(x) is the ordinary divisor function, while dl(x) = 1. In the special case n = 2, we have
Me (m) <=dVMe (1). Furthermore, for the Thue equation, we know that Me (1) << d, so that
Mc (m) << d~+1, and T h e o r e m 1B is proved. E x e r c i s e 6b. Given n and d, let
g(m)= ( n d Show that g(m)is multiplicative, i.e., g(mlm2)= g(ml)g(m2)if gcd ( m l , m 2 ) =
1.
E x e r c i s e 6c. Given n, d and e > 0, and for g(m) as above, show that g(rn) << m e d~n~e a s /Tt ----+ (:X).
Because of the multiplicativity of g(m), it will suffice to prove the following. PROPOSITION
6B. Fork > 0 and a prime power p ~ where u > 1 and p l k, we
have
Me (kp ~) = < ( n -d 1) d"-I(P~d)M¢ (k) In what follows, suppose that E is a field and [ [ is a non-Archimedean absolute value on E , sometimes called an ultra-metric absolute value. For x = ( x l , . . . , x n ) E E n, put [x[ = max(Ix1 [ , . . . , [xn[). Similarly, if P ( X 1 , . . . , X , ) is a polynomial with coefficients in E , then put [P[ = max [c[, over all coefficients c of P.
93
LEMMA
6 C . Suppose such a field E is algebraically dosed. Then m a x ]P(x)[ = [P[. x6E n Ixl-<_ I
Proof. It is clear from the properties of a non-Archimedean absolute value that
m a x [P(x)I < IPI.
xEE" Ixl =< I
=
T h u s it will suffice to show t h a t there is an x 6 E " with [x[ = 1 and IP(x)[ = [P[. Consider the case n = 1. We m a y suppose t h a t P # 0. T h e n write P(X) = PI(X) + P 2 ( X ) , where P1 is the sum of the m o n o m i a l s of P whose coefficients h~ve n o r m [P[, a n d / ' 2 is the sum of the remaining monomials. Suppose t h a t P I ( X ) = cQX it + ci~X i2 + " " + ci, X i' with il < "'" < it. Choose z 6 E with Pl(x) = CitXi'+l.
T h e n [x I = 1: For if Ixl > 1, the s u m m a n d ci, x I'+1 would dominate, and if Ixl < 1, the s u m m a n d ci, x iI would dominate. We have IPl(x)l = Ic,,l = IPI and IP2(x)l < IPI. Therefore, by the isosceles triangle principle for n o n - A r c h i m e d e a n absolute values, we have IP(x)l = IPI, as desired. T h e general case follows by induction on n. R e m a r k . T h e hypothesis of algebraic closure was really necessary, as illustrated by the following example. E x a m p l e . Let E be the field of rational n u m b e r s with [ [ = [ [p, the p-adic absolute value associated with some p r i m e p. Let P = X p - X . T h e n [PI = 1. S u p p o s e x 6 Q and Ixl =< 1. Write x = u/v with u,v E Z a n d p i v . T h e n xP-~=
( u~ ) p
-
( v ) u _P - u v p--j 1
T h e n u m e r a t o r satisfies uP - uvP - 1 = uP - u -= 0 ( m o d V). T h e r e f o r e , I~' - ~l < 1 / p
and m a x ]P(x)[ < 1 • EQ p I~l
94
L E M M A 6D. Let E be a tJeld, not necessarily algebraically closed, with a nonArchimedean absolute value [ [. Let L , ( X ) , . . . , L t ( X ) be linearly dependent linear forms with coemcients in E. Given non-negative real numbers A , , . . . , At, let Y~ be the set of x with components in E and satisfying
IL~(x)l < A~ (i = 1,... ,t).
(6.1)
Then J~ m a y be detined by t - 1 of these inequalities, i.e., there exists an io, 1 <=io <-_t, such that Y~ is detlned by
IL,(x)I :~ A,
(i = 1 , . . . , io - 1, io + 1 , . . . , 0-
R e m a r k . This does not hold in the Archimedean case. Consider, for example, the field of real numbers R with the usual Archimedean absolute value and the system of inequalities Ix, I < 1, Ix21 < 1, Ix, +x21 < 3/2. The corresponding linear forms are clearly linearly dependent. However, the solution set (shown below) cannot be defined with less than three inequalities.
(o, I)
[o, -l)
P r o o f . Since L 1 , . . . , Lt are linearly dependent, there is a non-trivial relation 71L1 + "'" + 7tLt = O, with 7i E E (i = 1 , . . . ,t). Furthermore, not all of the 7i's are zero, say 71 ~ 0 , . . . ,Tt O, but 7t+1 . . . . . 7t = O. Without loss of generality, we have [71 [A1 = max (171 ]A1,..., ]TtIAt).
95
Now take i0 = 1. If (6.1) holds for i = 2 , . . . , t, then the linear relation gives
[7iLl(X)[ _-< max(172[lL2(x)[,... ,17tllLt(x)l) < max(172[~2,
, [TtlAt)
< 1711~1, so that [Ll(x)l =< ~1, and the proof is complete. L E M M A 6E. Let E and I [ be as above. Let L x , . . . ,Ln be n Iinear forms in n variables with coefficients in E. Let non-negative A I , . . . , A n be given. Suppose there is an x' E E" with
[x'l
=
1,
IL,(x')l
< )~,
(i =
1,...,n).
Then there exists an io, 1 < io < n, such that any x satisfying
Ixl =< 1,
IL,(x)l =< ~,
(i
= 1,... ,i0 - 1,i0 + 1,... ,n),
(6.2)
has in fact
IL~(x)l =< Ai
(i = 1,... ,n).
P r o o f . By the preceding lemma, we may suppose that L 1 , . . . ,Ln are linearly independent. Then each coordinate Xi may be written as a linear combination of the linear forms L 1 , . . . , L , . That is, Xi = 7 i l L I ( X ) + "'" +TinLn(X),
(i = 1 , . . . ,n),
with 7ij G E for 1 __
17111~1=
max
l<~i,j<__n
17,~l~i.
The given x ~ has J
and since [x'l = 1, this gives us 1 ___<17u]A1. In particular, 711 # 0. Now we will show that i0 = 1 works. Suppose x satisfies (6.2). Then the linear expression for the first coordinate gives [711[[L,(x)l =< ma×(171u[lLu(x)[,..., 171,11L,(x)l, [Xl I)
___<m~(17,21A2,..., 171-1~,, Izx l) ___<17111~1 since f~,l =< Ixl =< 1 =< 17,11~,. Dividing by 17111 in the inequality gives ILl(x)l =< ~1, and the lemma is proved. L E M M A 6F. For d >=n, let L , ( X ) , . . . , Ld(X) be d linear forms in n variables with coetticients in E. Let non-negative t e a / n u m b e r s A1,... , Ad be given. Suppose
96
there is an x' E E n with
Ix'l = 1,
IL~(x')l _-< Ai
(i = 1 , . . . ,d).
Then there are n - 1 among these forms, say L i l , . . . , L i . _ t , such that any x E E n satisfying
Ixl <
1,
IL~(x)l
< )qj
(j = 1 , . . . ,n - 1)
also has
[L~(x)[ < A~
(i -- 1 , . . . , d).
P r o o f . Apply L e m m a 6D repeatedly, d - n times, to reduce the system to n linear forms. T h e n apply L e m m a 6E to reduce further to n - 1 forms. For the following exercises, suppose that E is a field with a non-Archimedean absolute value [ ], as before. Furthermore, suppose that E is complete, i.e., any Cauchy sequence has a limit in E with respect to the absolute value. A set ~ in E " is called symmetric, convex if Ax + # y C ~ for every x , y E .~ and [A[ < 1, I~1 -< 1. Given X l , . . . , x , E R, their c o n v e x hull is the set of ail points AlXl + ... + A,xn with I~il _-< x, (i = 1,..., n). This is simply the smallest convex, symmetric set containing Xl,...
,Xn.
E x e r c i s e 6d. Suppose h C_ E " is a symmetric, convex set, as above. Suppose als that .~ is sequentially compact (i.e., every sequence in .¢i has a convergent subsequence) and that .¢l contains n linearly independent points. Show that .tl is the convex hull of certain n linearly independent points. E x e r c i s e 6e. Suppose .~ is as in Exercise 6d. Show that there are n linearly independent linear forms L 1 , . . . , L , with coefficients in E such that ~ may be defined by the inequalities ILi(x)l < 1 (i = 1 , . . . ,n). Returning to Proposition 6B, we want to consider the n u m b e r of primitive solutions of F ( x ) = k p ~,
where
(6.3)
p lk.
By a previous exercise, we write F ( X ) = L I ( X ) . - - L d ( X ) , where the coefficients of Li lie in a field Ki. Let E be the algebraic closure of Q and let ] ]on E be some extension of the p-adic absolute value [[p. With any primitive solution x' of (6.3), we will associate real numbers A1,... , Ad such that
ILi(x')l-- -~i
(i -- 1 , . . . , d).
Since x' is primitive, we have [x'[p = 1. Therefore, by L e m m a 6F, given such an x', we can pick n - 1 of the forms, say L i l , . . . , L i . _ l , such that any x E E with
Ixl _-< 1,
ILii(x)[ < Aij
(j = 1 , . . . , n -
also satisfies [Li(x)l _-< Ai
(i = 1 , . . . , n ) .
1)
97
With each x', we associate ( L i l , . . . ,Li,_l; h i l , . . . ,hi._~), which will be called the anchor of x ~. Initially, we will count the number of solutions with a given anchor. Without loss of generality, consider solutions with the anchor ( L I , . . . , Ln-1; h i , . . . , h,,-1). Such an anchor is associated with x ~ and h i , . - - , hd satisfying hl''"
"~d =
I L l ( x ' ) " " L~(x')l
=
Ikp"l =
p-".
Suppose some x E E'* is a solution to the inequalities Ixl < 1
and
ILi(x)l < hl
(i = 1 , . . . , n -
1).
(6.4)
By L e m m a 6F, such an x also satisfies
ILi(x)l < hi
(i = 1,...,d).
Thus I r ( x ) l = I L l ( X ) ' " Ld(x)l < h i ' " hd = p - " . Now, the points x E Z'* with Ini(x)[ < hl
(i = 1 , . . . , n -
1)
make up a lattice. Choose some basis for this lattice, say a l , . . . , a , . Consider x = y l a i + ' " + y,,a,~ where yl e E and lYi] < 1 (i = 1 , . . . ,n). For such an x, we have (6.4), namely
Ixl _5 1 and
ILi(x)l _5 hi
(i = 1,...,~-
1),
therefore IF(x)l 5 p - " as above. Furthermore, if x E Z n belongs to our anchor, then x is in this lattice, i.e.
x = ylal + ' " + ynan.
(6.5)
We introduce another form G by G ( Y 1 , . . . , Y n ) = F ( Y l a l + . . " + Yna,). If x satisfies F ( x ) = kp u and (6.5), then G ( y ) = kp ~', where y = ( y l , . . - , Y n ) from (6.5). But for any y E E n with lyl _5 1, we have IG(y)I _5 p-". Hence, tGI _5 p-", by L e m m a 6A. We know, then, that every coefficient of G is divisible by p". Set G ( Y ) = p " R ( Y ) with R ( Y ) E Z[Y]. T h e n G ( y ) = kp" becomes R ( y ) = k. Furthermore, R E ~:, so the n u m b e r of solutions to R ( y ) = k is not greater than Mc (k). We may conclude, then, that the number of solutions with a given anchor is not greater than Me (k). It remains for us to estimate the number of possible anchors (Li~,... ,Li,_~; hi~,... ,hi,_~). The number of choices for (n - 1)-tuples, i.e., ( i l , . . . , i n - l ) , is ( d _ l ) . Given i l , . . . , i n - l , how many possibilities exist for h i , , . . . , hi,_~? Without loss of generality, we may consider the number of anchors of the form (L1,... , Ln-1 ; h i , . . . , hn--1). To answer our question, it is necessary to count the number of possibilities for h i , . . . , h,,-1. We begin by making several observations.
98
(i) B y Gauss' L e m m a , we have I L l [ . . . ILal = IFI < 1. S u p p o s e / ~ 1 , . . . ,*~n-1 belongs to an a n c h o r coming f r o m A1,... , Ad. T h e n A1 "--Aa = p - ~ . P u t t i n g #i = A i / l L i [ , we have #i_-< 1
(i=l,...,d)
and #l"''#d
> P-".
(ii) If A1,.. • , A , - 1 and A'1 , - - - , A.-1 ' b o t h occur in anchors and if Ai <= A~ (i = 1 , . . . , n - l ) , t h e n Ai = A~ (i = 1 , . . . , n - l ) . For suppose A1,..- ,An-1 comes from A I , . • • , A d and A~ , . . . ,A~_ i 1 comes from A~,...A~. T h e n there m u s t be x and x ~ with Ix[ = [x'[ = 1, ILl(x)[ = AI, and ]Li(x')] = A~. T h e r e f o r e
Ixl-- 1 and Since A~,...
,
IL~(x)l _5 ,~
(i = 1 , . . . , n -
1).
Atn--1 belong to an anchor, we have
ILi(x)l =< ,X~
(i = 1, . . . ,d),
and thus, Ai =< A~
(i = 1 , . . .
,d).
However, therefore (i = 1 , . . . , d ) .
=
(iii) Recall t h a t , for each i, the field Ki is an algebraic extension of Q of degree < d. T h e p-adic absolute value ] Iv on Q has the value set {pro : m E Z}. Since the absolute value I Iv on K i is an extension of I Iv, we know t h a t it has the value set {pm/e~ : m E Z}, where ei = e i ( K i ) and 1 =< ei =< d e g K i =< d. (This ei is known as the r a m i f i c a t i o n i n d e x . ) Given a primitive solution x, for each i we have Ai = ILi(x)l- F r o m observation (iii), Ai is of the t y p e pro~e, for some m E Z. T h e n write Ai
_ p-~/~i
for some vi E Z. We have vi _-> 0 (i = 1 , . . . , n - 1) and ]z1 "'" ~n--1 => p--U by observation (i). Hence Vl + el
_~_ Yn--1
< U•
en--1
Suppose two different anchors have v l , . . . ,vn-1 and v ~ , . . , v~n-i with vi =< v i~ (i = 1 , . . . , n - 1). By applying observation (ii) to the corresponding A1,... ,A,,-1 and A~, . . . , A In _ i , we find t h a t vi = v i' (i = 1 , . .. , n - 1). Therefore, v n - 1 is unique once V 1, . . . , U n _ 2 a r e given• It remains, then, to count the n u m b e r of non-negative V l , . . . , vn-2 with Vl 71el
+ Vn--2 < U• en--2
99
Since ei <=d, this number is bounded by the number of non-negative Vl,... , Y n - - 2 with Vl + " " + v , - 2 < du, which in turn equals the number of non-negative v l , . . . , vn-1 with vl + ... + v , - 1 = du. If one thinks of factoring pdU as pVl ...pV,,_l, it is obvious that the number of non-negative v l , . . . , vn-1 with vl + - - - + v,,-1 = du is simply d,_l(pa"). Thus, the number of possible anchors is ( , d l ) dn_l(pd*'), and the number of primitive solutions of F(x) = kp ~' is not greater than n d ) dn_l(pd,,)M¢ (k). -1
§7. T h u e E q u a t i o n s w i t h Few Coefficients. Recall that the bound on the number of solutions to F ( x , y) = m depends only on d (the degree of F), and on m. Siegel (1929) hypothesized that there ought to be a bound for the number of solutions of certain diophantine equations f ( x , y) = 0 which depends only on the number of non-zero coefficients. We consider the special case of the Thue equations F ( x , y) = m. For cubic Thue equations, the Siegel conjecture is incorrect. There is no bound independent of m. See Ch. IV, §8. Consider the simplest Thue equation, namely the binomial one ax
d --
by d = m,
d >3 .
This equation had already been studied by Siegel, (1929) (1970), Domar (1954), Hyyr5 (1964), Evertse (1982), and Mueller (1987). They achieved upper bounds B = B ( m ) independent of the degree d. Consider also the trinomial Thue equation, ax
d
+ bxqy a-q + cy d = m,
d > 3 .
In this case, Mueller and Schmidt (1987) obtained a bound B = B ( m ) . One may also study the more general Thue equations where F is a form of degree d > 3 with s + 1 non-zero coefficients. That is, F may be written as F(x, y) = ~
aixd'y d-d'
(7.1)
i=0
with 0 = do < dl < ... < de = d. It turned out to be just as easy to study the related "Thue inequality" IF(z,y)l = <m . (7.2) Schmidt (1987) obtained a bound on the number of solutions, namely << V/~s m2/d(1 + logml/d). Later, Mueller and Schmidt (1988) obtained a second bound, showing that the number of solutions of (7.2) is << s 2 m2/d(1 + logml/a).
100
This b o u n d is itself bounded in terms of m and s, i.e. the n u m b e r of coefficients of F , but independently of the degree. It is easily seen that the term m 2/d is needed, but the logarithmic term is probably unnecessary. T h e term s 2 should probably be s. In these Notes we will deal only with the binomial and trinomial case. We will prove (Mueller and Schmidt (1987)): T H E O R E M 7A. Suppose F is a binomiM or trinomial of degree d > 9. Then the number o[ solutions o[ the Thue Inequality (7.2) is
<< m2/~ . If we combine this with Theorem 1C (to deal with 3 _< d _< 8), we see that for any binomial or trinomial Thue inequality, the number of solutions is << m2/~(1 + logml/d).
§8. The Distribution of the Roots of Sparse Polynomials. By "sparse polynomial", we mean polynomials with few coefficients (as compared to their degree). Given a binary form $
F(x, y) = ~
a,xdly ~-a'
i=0
as in §7, we have a corresponding polynomial in one variable, f ( x ) , determined by f ( x ) = F(x, 1). T h a t is, $
f(x)
=
a,x
(s.1)
i=0
with 0 = do < dl < ... < d,. In this section, we will study the distribution of the roots of such a sparse polynomial f(x). We start with the case s = 2, say f ( z ) = a 0 + al zdt + a2 zd2,
with 0 = do < dl < d2 = d. First, suppose that al is "large", i.e. [all is large relative to laol, [a2[ • The polynomial ao + alz d' has dl roots, all of which satisfy [z[ = [ao/al Ix/d'. T h e n it turns out that the original polynomial f ( z ) has dl roots with [z[ .~ [ao/all 1/~1. Similarly, the polynomial alz dl + a2z d2 has d2 - d~ nonzero roots satisfying [z[ = [alia2[ 1/(d2-d'), and f ( z ) has d2 - d l roots with ]z[ ~ [ella2[ 1/(a2-d'). On the other hand, suppose that al is "small" relative to ao and a 2 . Then the polynomial ao + a2z d2 has d2 roots with [z] ~ [ao/a2[ lids. In this case, f ( z ) has all d2 = d roots with [z[ ~ [ao/a2 [1/d2.
101
How may one remember this? Consider the three points P0 = ( 0 , - l o g [a0l), P1
=
(d~, -
Ps = (d2
-
log la~ I), log lasl).
In the first case, when [al I is large relative to [a01 and las 1, we have the following picture, with P1 lying below the line segment P0 P2.
d2
d
P2
P
As discussed above, we have dl roots with log Iz[ ~ -
log lax I - log [a0 [ = slopePoP1, d
and there are ds - dl roots with log
Izl
-
log [as[ - log [aal ds -
= slopeP1 Ps.
dI
In the second case, when lall is small relative to la01 and [a2[, the point P1 lies on or above the line segment PoPs, as illustrated below.
dt
d2
t
° PI P0
)
-~P2
102
Here all d = d2 roots satisfy
loglzl ~
logla~l-logla01
d2
= slopePoP2.
Returning to the more general case, suppose we have a sparse polynomial f given by (8.1) with 0 = do < dl < . . . < ds = d. We call the points
Pi = ( d i , - l o g l a i [ ) ,
(i = 0 , . . . ,s),
the Newton points. T h e corresponding Newton polygon is defined to be the lower boundary of the convex hull of the Newton points. In other words, the polygon consists of points (x, y) in the convex hull such that (x, y - ~) is not in the convex hull for ~ > 0. For s = 5, we m a y have the following picture.
P5 = P , = P~(3)
Pi(o) =
P0
"/2 --,.,.,_
/'3 = P~(,)
So t h a t we m a y refer to each of the vertices of the Newton polygon, we label t h e m P0 = Pi(0), PI(1),.-- , Pi(t) = P:, as shown above. For 1 < u < g, we also let cr(u) denote the slope of the line segment P i ( : - I ) Pi(,,). Since the Newton polygon is convex, (r(u) is an increasing function of u. By studying the case s = 2, we noticed that there is some relationship between the roots of the polynomial f ( z ) and the Newton polygon of f . T h e following theorem illustrates this connection in the general case. THEOREM 8 A . There exists a map from the set of roots of f ( z ) to the set of segments of the Newton polygon such that exactly di(u) - di(u-1) roots correspond to the segment Pi(=-l) Pi(~), and these roots have
l o g l z l - ~(u) < (2iog3)s. Remark.
In p- adic analysis, the Newton polygon is well-known. In t h a t case,
log Izl = o(u). See, e.g., Koblitz (1977), Ch. IV. T h e T h e o r e m holds for all polynomials, sparse or not, b u t is p r o b a b l y more interesting in the sparse case. Consider the circular ring (or annulus) R : given by
e ~(~)-2x" < Izl < e "(~)+~x',
103
where = log 3.
(8.2)
Then the theorem says that the di(,,) - di(,,-1) roots are associated with an annulus R , , and in fact, these roots lie in R , . One should be aware, however, that there may be some overlapping of the annuli R , . For this reason, we will call these annuli "large rings" and we will introduce another set of smaller annuli. For 1 __
< Izl < e
Two of these small annuli, say S~ and S~+1, will overlap precisely if a(u + 1) < a(u) + 2A. In this case, the segments to the left and right of Pi(,) have similar slopes, so the angle at the vertex Pi(u) is not very sharp. We will say that the vertex P/(,) is "blunt". On the other hand, those vertices P/(,) with a(u + 1) __>a ( u ) + 2A will be referred to as "sharp". By definition, we have that P0 and P, are sharp vertices. As with vertices in general, we would like a notation which allows us to refer to a particular sharp vertex. Suppose that we have sharp vertices P/(,) for u(0) = 0, u ( 1 ) , . . . , u(p)L Setting i[k] = i(u(k)) for k = 0 , . . . ,p, we have the sharp vertices P0 = P/J0], Pi[x],... , P/[p] = P~To eliminate the problem of overlap in the rings, we introduce a third set of annuli which we will call "medium rings". For 1 < k < p, let Lk be the annulus given by e°'(u(k-1)+1)-A < ]Z] < e a(u(k))q''k
.
Then Lk is the union of the small annuli S, with u(k - 1) + 1 < u < u(k).
(8.3)
The u in u(k - 1) < u < u(k) correspond to blunt vertices P/(,), so that a(u + 1) < a(u) + 2A
( u ( k - 1) + 1 < u < u ( k ) ) .
Thus, subsequent rings S,, S,+1 in this range will overlap. Furthermore, by adding up the changes in the slopes of adjacent line segments, we have a(u(k)) < a(u(k - 1) + 1) + 2A(u(k) - u(k - 1) - 1) < a(u(k - 1) + 1) + 2A(a - 1). Let Lk be any medium annulus and z E L k . For any u satisfying the corresponding inequality (8.3), we have e < Izl < e In other words, the medium ring Lk is contained in the intersection of the large rings R~ with u satisfying (8.3). Using the medium annuli Lk, (1 < k < p), we will see that it suffices to prove the following proposition in place of Theorem 8A.
104
P R O P O S I T I O N 8B. E x a c t l y dqM - di[k-1) roots lie in L~ (1 __ e ~('(k)+l)-x .
P r o o f . As before, let f(z)
=
a i z d' .
i=0
Given a E R, put f * ( z ) = f ( e ' z ) . Then the Newton points Pi(0),... ,Pi(t) of f are mapped into the Newton points P,~0)," "" , P,~t) of f* by the linear map (x, y) ~-~ (x, y ax). In fact, the sharp vertices of f are mapped into the sharp vertices of f*. This is easily seen by considering ~r*(u), i.e. the slope of the segment to the left of P ~ ) on the Newton polygon of f*. We see that a*(u) = or(u) - cr. Given k corresponding to a sharp vertex Pi[k], we put
Then a* ( u ( k ) ) = a ( u ( k ) ) - ~ = - A and a * ( u ( k ) + 1) = ¢ ( u ( k ) + 1) - c r ( u ( k ) ) - A > 2A A = A. We illustrate these facts in the following picture.
slopo. = ,~(.(k) + 1)
slope ~
1
ope > A
Plikl slope = a(u(k)) Newton polygon of f
Newton polygon of f"
105
T h e n Pi~k] is the "lowest" point on the Newton polygon of f* for the chosen value of a. Now write 8
I*(Z)
= ~
a*.z I d'
i=0
First, suppose t h a t i < i[k]. T h e n the slope of the line segment P 'i P *i[k] is =< - A, since the Newton polygon is convex. C o m p u t i n g this slope, we get _
log I a*~[,~]1
-
log la71
di[k] -- di
<
_ ,x.
=
Thus
laTI < lai*tk]Ie-x(a'tk]-a') Since A = log 3, we have
la~l _-< la~*[k]l 3 '-'[k]
for
i <
i[k].
In a similar way, one gets
la*l _5 lai*tkll 3 ~tkl-~ for
i > i[k].
W h a t does this tell us when k = p? In t h a t case, i[k] = i[p] = s and we have
la~l___ < la:13'-"
for
i < s.
This will show t h a t all of the roots of f* lie in Iz[ < 1. To see this, suppose Izl > 1. Then
If*(z)l > Izl d (la:l
I ,-~1 1
_5 lzl d la:l(1
.,. 1 9
3
laSI)
"' ) >0"
Since all of the roots of f* lie in Izl < 1, we have t h a t all the roots of f lie in ]z] < e ~ = e ~(~(r))+x. T h u s L e m m a 8C (i) is true for k = p. For 0 < k < p, consider instead the polynomial ilk]
:o(z) = ~
a,z d'
i=0
and the corresponding
f;(z) = yo(e ~z) = ~
a~z"
i=0
with a given by (8.4). Applying our previous results to f0, we know t h a t all of the di[k] roots of f~(z) lie in Iz] < 1. We claim t h a t exactly di[k] roots of i f ( z ) are, in fact, in Izl < 1. Once this is proven, we will know t h a t exactly di[k] roots of f(z) lie in [zl < e ~ = e ~(~(k))+~.
106
Using Rouch6's Theorem to prove the claim, it will suffice to show that
lf;(z)- f*(z)l < lf;(z)l for
lzl=
x.
Then the polynomials f0 and f~ will have the same number of roots in the disk ]z I < 1. But for [z I = 1, we have [ f ~ ( z ) ] - If~(z) - i f ( z ) [ >= [ai*[k][- E
i•i[k]
>la,'t 11(1 ---~0.
]a[l
1
1
3
9
'
1
1
3
9
"")
To prove part (ii), the lower bound for the zeros, put ] ( w ) = w a f ( w1) . Then the Newton polygon of ] is obtained from the Newton polygon of f by a reflection through the line x = d/2. Under such a reflection, sharp vertices remain sharp and the bounds follow.
Exercise 8a. Let f ( z ) be a polynomial with coefficients in a field E and I I a non-Archimedean absolute value. Define the Newton polygon as before. Suppose that f ( z ) has all of its roots in the field E. Then it is known (see, e.g., Koblitz (1977)) that one has a mapping from the roots to the segments of the Newton polygon such that roots corresponding to Pi(,,-1) Pi(~) have log Izl = ~(u), where a ( u ) is the slope of this segment. Now prove this statement for a trinomial. §9. T h e A n g u l a r D i s t r i b u t i o n o f R o o t s . In §8, we studied the radial distribution of the roots of a sparse polynomial f ( z ) . In this section, we will consider the angular distribution for binomials f ( z ) = az a + c and trinomials f ( z ) = az ~ + bzq + c. For a binomial, the roots make up a regular d-gon, so that the angular distribution is completely regular. In what follows, A will denote a wedge with vertex 0, i.e., a region bounded by two rays emanating from 0. Part or all of the boundary of A may belong to A. Write IAI = ¢/2~', where ¢ is the angle between the rays. We will consider the whole plane to be a wedge with IAI = 1.
0
W i t h this notation, we have the following result, which also holds (trivially) for b = O, i.e., for binomials.
107
T H E O R E M 9 A . Let f ( z ) be a trinomial of degree d. If Z ( A ) denotes the n u m b e r of roots of f which lie in A, then Z ( A ) - dlA I < 6 ~_
•
P r o o f . We may suppose t h a t b is real since the roots of f ( z ) are not affected when we multiply the polynomial by a suitable constant. P u t t = d - q, so that d = t + q, and write the equation as az t + cz -q = -b. We may assume, without loss of generality, that t > q so that t > d/2. Introducing the notation e(x) = e 2~I~, write a = lale(~), c = ]c]e(7), and z = ]z]e(C). T h e n
lal Izl' e(tC + ~) + Icl Izl -q e(-qC + ~) = - b . The imaginary part of the left-hand side must be zero, so we have lal Iz[t sin (2~r(tC + a)) = [c[ [zl-q sin (27r(q C - 7 ) ) . The left-hand side of this equality vanishes for some ~0, hence precisely for ~0 + ( m / 2 t ) , for m E 2g. The right-hand side may also vanish for one of these values. By a change of notations we can suppose that it vanishes at the same value {0. In that case, the right-hand side will vanish at Co + (m'/2q) for m' E Z. For the time being, we will require the additional hypothesis that gcd(t, q) = 1. 1 Thus, for the values T h e n m / 2 t = m'/2q is possible only when m / 2 t = m'/2q C=~Z. C = Co + (1/2t), C0 + ( 2 / 2 t ) , . . . , C0 + ((t - 1)/2t), ¢0 + ((t + 1 ) / 2 t ) , . . . , (0 + ( ( 2 t - 1)/2t), the left-hand side vanishes but the right-hand side does not. If arg(z) = ¢ and ( is one of the above values, then we see that z cannot be a root of f . T h e rays given by arg(z) = C, where ( is one of the above, are called "forbidden rays" because they contain no solutions. Furthermore, these rays are determined independently of b. Now let a # 0, c # 0 be fixed and denote the d roots of f as functions of b by z l ( b ) , . . . , zd(b). One can arrange this such that the zi(b) are continuous functions of t h e real variable b. By the continuity of zi(b), we see that the values of zi(b) for various b can not cross any forbidden ray.
forbidden ray
)
108
W h e n b = 0, the roots of f form a regular d-gon and
Z(A)-dlA
I = < 2.
Now let A be an angular domain bounded by two forbidden rays. We call such an A a "special angular domain". In that case, the continuity of the zi(b) gives
Z(A)-
diAl]
<
2
for any b. Now let A be an arbitrary angular domain. There exist special angular domains A1, A2 such that A1 c A c A 2 and
I A 2 I - I A I I < 2/t < 4 / d . (We do allow the possibility that Ai is empty.) For the arbitrary angular domain A, we have Z(A) < Z(A2) < dlA21 + 2 < diAl + d(IA~l- IAll) ÷ 2. Then
Z(A) < dlA I ÷ 6, since [A21- lgl I < 4/d. The lower bound for Z(A) is proved similarly. Now we need only to remove the additional hypothesis that gcd(t, q) = 1. In general, we have gcd(t, q) = ~. Write d = ~dl, t = ~tl, q = @1, so that gcd(tl, ql) = 1. The roots of f are of the type z = w 1/6 where w is a root of the polynomial h(w) = aw dl ÷bw ql +c. To each root w of h, there correspond ~ roots of f which form a regular 8-gon. For any angular domain A, we have m p [A I = ~ + ~ ' where m E Z and 0 =< # < 1 , as illustrated.
109 We will count the number of roots z of f in the domain A by considering the roots of h. A domain of angle 1/6 in the z-plane corresponds to a complete circle in the w-plane
z-plane w-plane
Furthermore, every root w of h will give rise to a root z in each of the m domains of angle 1/6. Thus we get dim roots in the portion of A of angle m/6. Now consider the portion of A of angle #/6 in the z-plane. This corresponds to a angular domain B in the w-plane with IBI = ~.
z-plane w-plane If Z' denotes the number of roots of h in the domain B, then Iz'-dl
l =< 6
by our previous work. Combining these results gives
I
Z(A)-dlA I = = [ d i m - dim + Z ' - dl#[ = IZ'-dl#[ = < 6.
This result may be be generalized to a polynomial with an arbitrary number of terms, say S
S(Z) ~---~ a i i=0
zdi
110 as before. K h o v a n s k y (1981) showed in this case t h a t
Z ( A ) - dlA] =< k(s) where k depends only on s. Khovansky obtains k(s) of the order of m a g n i t u d e e cs2. This is almost certainly larger t h a n need be. His work was related to an open question which we will discuss below. First, we consider a special case as an exercise. E x e r c i s e 9a. Suppose f(z) is a polynomial with s + 1 t e r m s as above. Show that f has not m o r e t h a n s positive, real roots. Now suppose we have two polynomials f(z, w), g(z, w), each containing no more t h a n s + 1 monomials. Furthermore, suppose f and g have only finitely m a n y (complex) zeros in common. C o n j e c t u r e . The polynomials f and g have not more t h a n s 2 c o m m o n real roots (z, w) in the first quadrant, i.e. z > 0, w > 0. Khovansky showed that the n u m b e r of such roots is not greater t h a n g(s), where g(s) is some function similar to k(s) above. T h e reader m a y find an account of K h o v a n s k y ' s work in Risler's (1984/85) paper. Erdhs and T u r a n (1950) gave another result on angular distribution. T h e y considered polynomials of the form f(z) = adz d +... + alz + ao, where a0 ¢ 0, ad ¢ 0. In this case,
]Z(A)-d,A,
<=16(dlog [a°'+ ' a l [ + " ' ~- ~ ]ad[)
If all the coefficients are close in absolute value, then we have the b o u n d << (dlog We now r e t u r n to trinomials. THEOREM there is a subset we have
d) I/2 .
9 B . Let f(z) be a trinomial of degree d with roots al,... ,aa. Then of these roots, say 51,... , ~t, where g < 32, such tfiat for every reM ~, min
1 < j < t
t¢ - 3jl < el°d
min
a <_. i <_ d
I~" - a i l •
R e m a r k . The n u m b e r s 32 and e 1° are larger t h a n need be. A similar result holds for polynomials with s + 1 terms, where 32 and e a° are replaced by constants which depend on s. See Schmidt (1987). P r o o f . In the ease of a trinomial, the corresponding Newton polygon has either one or two segments. Hence the roots of a trinomial lie in one or two annuli of the type Iloglzl-a
< (2•)(2)=4A<5
Thus which we will write as
< N <
111 where B2/B1 = e 1°. Consider the following picture.
• e
O~
T h e n u m b e r of roots in each of these two angular domains is not greater than
dlA I+ 6 = 8, since IAI = 2/d. So, there are at most 16 roots in the two angular domains together. Now suppose that a is a root in the annulus which is not in one of these angular domains. Suppose 5 is any root in the annulus. Let ~ E R. If I~[ > 2B2, then I ~ - a l > B~ and
I¢ - 61 => I¢ - ~1 + Is - 61 _5 I¢ - ~1 + 2B~ _5 3 IC - ~1 < d°d
I¢ -
~1.
On the other hand, if Iffl < 2Be, then write o~ = pe(r/), where I~1 _5 1/2. First, say I~1 =< U4. T h e n 101 --> 1/d since o~ is not in the angular domains. We have
IC - 61 _5 ICI + I~1 < 3B2. In this case, we also have
IC- ~1--> 12m~l
= Ipsin(2'~)l---> 400 _->4nl/d.
Combining these two estimates, gives 3 B2 el0d I~ - ~1 =< ~ d ~ I¢ - ~1 < I¢ - ~1. It is now clear t h a t the theorem holds with 6 x , . . . , ~t the roots in the angular domains in each of the annuli, if there are such roots. If the angular domains in one of the annuli contains no root, but if there are roots in this annulus, then we pick one such root to be among 81,... , $t. Clearly, the T h e o r e m holds with this choice. §10. O n T r i n o m i a l s . Initially, we will consider trinomials of the form g(z) = z d + #zq + 1. We will write
M = I#1 a n d d = q + t .
112 L E M M A 10A. ( i ) W h e n M > 3 4d, then g has exactly t roots in the annulus M1/t 3-4 < lZ[ < M1/t 3 4 , and exactly q roots in
MX/q 3-4
< [z[ -1 < M1/q 3 4 .
( i i ) When M < 3 4d, then all d roots of g lie in the annulus 3 -4(d+l) <
Izl
< 34(d+I) •
R e m a r k . In case (i), roots z in the first annulus have [z I > 1 and those in the second annulus have [z[ < 1. In what follows, we will call these "large" roots and "small" roots, respectively. P r o o f . Consider the Newton polygon for g(z), illustrated below.
When M > 1, the Newton polygon has two segments, and when M < 1 it has only one. In case (i), apply Theorem 8A to see that there are t roots corresponding to the second segment with log Izl
log M
So we have M lit 3 -4 <
Izl
< 4 log 3
< M 1/t 3 4 •
Similarly, there are q roots corresponding to the first segment which satisfy log [z I + lOgqM [ < 4 log 3. Then M -1/q 3 -4 <
[z[
< M -1/q 3 4 .
113
In case (ii), since M < 34d, we may have either one or two segments on the Newtwon polygon, as mentioned above. The absolute value of the slope of any segment, however, is not greater than log M < (4log 3)d. Using Theorem 8A once again, we know that all roots have - ( d + 1)(4 log 3) < log Izl < (d + 1)(4 log 3). T h a t is, all d roots satisfy 3 - 4 ( d + l ) < Izl < 3 4(d+l) .
We introduce the notation A ~- B to mean that A > B / K d, where K is an absolute constant. LEMMA
10B.
(i) W h e n M > 3 4d, then every large root z has
Ig'(z)l
M (d-1)/t
~-
and every small root z has Ig'(~)l >- M1/q
•
(ii) W h e n M < 34d, then every root has either
Ig'(z)l >- 1
or
Ig"(z)l ~- 1 .
Proof. (i). Since g(~) = zd + ~z~ + 1, we have g'(z) = dz d - i + q ~ z ~ - i dz d-1 + d # z q-1 - t # z q-1 -t- ~ 7: ~z • If z is a root, then
=
dz d-1 + d p z q-1 -Jr d = dg(z) _ 0, z
z
and we have Ig'(z)l =
- t ~ = q-1 a=
> 1 = N _
_
(tMizl q
-
-
d)
.
If z is a large root of g, then M lit 3 -4 < Izl, so that Ig'(z)l
> M1/t
'
34
( t M l + ( q / 0 3 -4q - d )
o
Furthermore, t M l+(q/t) 3 - 4 q > M > 2d, so that we have
Ig'(z)l >
Ml+tq/t)
= 2.3 4.3 4d.M1/t
~. M ( d - 1 ) I t .
If z is a small root of g, then ~ = 1 / z is a large root of the reciprocal polynomial ~(z) = q-z d -4- # z t -t- 1. In that case, I~1--< Ma/q 34 and I~'(~)1 >- M ( d - 1 ) / q • The original polynomial g ( z ) and its reciprocal polynomial ~(z) are related by the equation
114
g(~) = z~ get
~(~).
F r o m the p r o d u c t rule a n d the fact t h a t ~(~) = 0 for a root z of g, we
g'(z) = _~d-~ ~,(~) = _ ~ - d ~ , ( ~ ) .
Then
Ig'(z)t ~- M (2-d)/q MCd-1)/q-~
Ml/q
•
Now we look at case (ii). F r o m the first p a r t of the proof, we have gt(z)
~-- - - t # z q - 1
d
T-
g
w h e n z is a root of g. S t a r t i n g with the original p o l y n o m i a l g a n d differentiating twice, we get g " ( z ) = d(d - 1)z d-2 + # q ( q - 1)z q-2 = d(d-
1) g ( z ) -7
+ # ( q ( q - 1) - d(d - 1))z q-2 T
d(d-
z2
1) •
As previously, the first t e r m vanishes if z is a root. In t h a t case, g " ( z ) = # ( q ( q - 1) - d(d - 1))z q-2 T
d(d - 1)
z2
Now consider t h e expression z g ' ( z ) ( d ( d - 1) - q(q - 1)) - z 2 g " ( z ) t = ~:d(d(d - 1) - q(q - 1) - t ( d - 1)) = ~ : d q t .
T h i s e q u a t i o n implies t h a t either [zg'(z)(d(d-
1) - q ( q - 1))1 > d q t / 2
or
[z2g"(z)tl
>= d q t / 2 .
In o t h e r words, either
Izg'(z)l ~- 1
or
Iz2g"(z)l >- 1
B y L e m m a 10A, we have Iz] -~ 1, so t h a t ]g'(z)l ~- 1
or
[g"(:)i >'- i.
T h e two p r e c e d i n g l e m m a s dealt with t r i n o m i a l s of t h e f o r m g ( z ) = z d + p z q 4- 1. In general, we have f ( z ) = a z d + bzq 4- c E Z[z], where a, c > 0. P u t # = ba - q / d c - t / a
and
M=[#I.
T h e n f ( z ) = c g ( w ) , where w = ( a / c ) l / d z . Now the various cases will d e p e n d on H = m a x (a, Ibh c). We will s u p p o s e t h a t a, b, c are integers, so t h a t in p a r t i c u l a r a = > 1, c__> 1.
115 LEMMA
1 0 C . / . f M > 3ad and H = c, then every root o f f has
If'(z)l
~- n 1-(1/d) •
P r o o f . As we have seen, f ( z ) = cg((a/c)l/dz), so
lY'(z)l =
¢ ( a l c ) 1/d I g ' ( ( a l c ) ' / d z ) l
.
If z is a root of f , then (a/c)l/dz is a root of the special polynomial g, and by L e m m a lOB, we know that ~- 1 . Therefore,
Ig'((a/c)l/dz)l If'(z)l
~- c(alc) "lId = a l l d c l - ( i l d )
> H 1-(1/d),
since a > 1 . LEMMA
I O D . If M > 34d and H = Ibl, then every large root o f f has
If'(z)l where
~- P q H 1-(2/a) ,
p=(~b_~a)O-1/d)/t
( L ~ ) (Ud)/q
P r o o f . As before, we have f ( z ) = cg((a/c)l/dz) and
l/'(z)l
c(a/c) lid
=
Ig'((a/c)l/dz)l
N" c(a/c) lid M (d-i)/t
(Ibl~-~~/* ....
= \ ~-zzr-, ] = pqal/d
ibll-(~ld) ella
> pq HX-(21d) . LEMMA
1 0 E . Suppose M < 34d and a < c. Then every root o f f has either
If'(z)l ~-
or
n 1-v/a)
P r o o f i Since M -4 1, we have with w = (a/c)I/dz,
If"(z)l ~-
[b[ -< a q/a c t/d -.4 c
n ~-(2/a) •
and H -~ c. From the chain rule,
If'(z)l =
a lid c 1-°/d)
Ig'(w)l
If"(z)l =
a ~/d c 1-(2/d)
Ig"(w)l.
and By L e m m a 10B, either
If'(z)l >" a l / d
cl-(1/d)
~"
H 1-(l/d)
116 or If"(z)l ~-
a =/a c 1 - ( 2 / a )
>'- H 1-(2/a).
§11. R o o t s o f f c l o s e t o ~. We now return to the T h u e inequality
IF(x,y)l < m,
(11.1)
where F is the homogeneous polynomial given by
F ( X , Y ) = a X d + b x q Y t -4- cY d , with a, c > 0. We will see t h a t either there exists a root a of f ( z ) = F ( z , 1) which is close to ~, or there exists a root/3 of the reciprocal polynomial ] ( z ) = F(1, z) which is close to ~. We m u s t distinguish several cases.
Let H = m a x (a, c). Suppose that M > 3 4a and (x, y) is a solution to (11.1) with x ¢ 0, y ¢ O. Then either there is a root a o f f ( z ) -- F ( z , 1) with LEMMA
llA.
x ]
m H (1/d)-I
or there is a root fl o f f ( z ) = F(1, z) with
fl - -x y '< mHO/a)-l]x[ d
P r o o f . We m a y suppose t h a t H = c. In this case, we will see t h a t the first alternative holds. By L e m m a 3A, with the p a r a m e t e r u = 1, there is a root a of f with
O/
X[ -~
m ~ if,(a)l lyl ~ •
F u r t h e r m o r e , by L e m m a IOC, we have [if(a)[ ~- H 1-(l/d)
, so
that
x ] mH (1/d)-I T h e second case follows similarly when H = a. LEMMA l l B . Suppose M >__ 3 4d and H : ]b], and (x,y) is a solution to (11.1) with ~ # o, y # o. T h e . either there is a l ~ e root ~ o f f ( z ) = F(z, 1) with
o~
x - y
-~
m H (2/a)-1
I#
117
or there is a
large root/3 o f / ( z ) = F ( 1 , z ) with ?1 x
mH(2/d)-I Ix[ d
P r o o f . T h e polynomial f(z) has t large roots, say a l , . . . , a t , and q small roots, say 1/fll,... , 1//~q. T h e n the reciprocal polynomial ] has large roots /31,... ,/~q and small roots 1 / a l , . . . , 1/at. Let L = min (Ix -
a l y h . . . , Ix - atyl)
L = min ( l Y -
fllxh'.. , ] Y - flqxD,
and and consider the two real n u m b e r s
L(~b]) (I-lId)It
(.C._C '~(1-1/d)/q
L\lbl ] By s y m m e t r y , we m a y suppose t h a t
( a "~(1-1/d)/t < L ( c "~(1-1/d)/q L\lblj = \Ibl) We will see t h a t the first case follows. We have
(1 <
k < q),
so t h a t
L __!
\lbl]
]
I/~kl Ix-fill
Now/~k is a large root o f / , so flk is
ul
(1 _< k
< q).
(a/c) 1/d times a large root of g. T h e n we have
I#kl-< (al~) lid M ' l q = (Ibll~) '1~ and
L -4
([b[~(1-1/d)/t([bl~ 1/qd ka/
\c/
.Ix - t311y[ = P Ix - 13;'y I
(t 5 k 5 q),
where P is as in L e m m a 10D. By reordering, we m a y suppose t h a t L -- Ix - a l y]. T h e n
I(a~ - aj)~l _5 Ix - a,rA + I~ - '~j~l --<-2 Ix - ajyl
(2 =< j =< t).
118
From our work above, we also have I ( ° q - f l -~kl)yl
+ Ix -- /3;lVl
-~ PIz - fl;ay[,
(1 < k < q)
since P ___>1. Writing f as a product of its linear factors, differentiating, evaluating at x = ax, and multiplying by Ix - a~y[ lyd-'l gives
Ix-,~,vl [f'(aa)Yd-ll
=
Ix-,~yl a
H
I'~I-'~)Yl
2<j
Pqa 1-[ Ix-~jyl 1 < j < t
But IF(x,
y)l
= a
H
Ix - aJ yl
1 <=_j <=_t
H
'~'-
l
[I
Ix-/3;'yl.
1 <__k <_. q
H
Ix - fl~-lyl, so that by (11.1) we have
1 <_ k <_ q _
OL1
x
mPq
-~
y
I1'(~1)1 M d "
Furthermore, Lemma IOD gives a lower bound for the derivative, namely
If'(aa)l >'-
P qHl-(21d)
•
x I rnH(2/d)-I Otl -- y "~ [y[d
'
Then
as desired. L E M M A l l C . Suppose M < 3 4d, a n d ( x , y ) is a solution to (11.1) with x # O, y ~ O. I r a
or
(11.2)
m H (l/d)-I ~
lyl d
,
m l / 2 H(1/d)-(1/2) x lUld/~ Y -~
a-
If c < a, then there is a root 13 of ] with either ] ~ m H /3 --< or
(l/d)-I iz[d ,
- Y .~ mlt2H(ild)-Ot2)
[/3
Ixld/2
(11.3)
119
P r o o f . By s y m m e t r y , we m a y assume a < c. Once again, a p p l y L e m m a 3A, this time with u = 1 and u = 2. We see t h a t there is a root a of f with
c~-
"~
a n d with
a-
If'(~)l
lul d
ml[ 2
x
~
I/"(~)l'n
lYl'U ~ •
By Lemma fOE, either If'(~)l ~- H '-('/~) or I f " ( ~ ) l ~" HI-(~/~), which gives the two cases above, respectively. In the next section, we will complete the proof of T h e o r e m 7A, which gives a b o u n d on the n u m b e r of solutions of the T h u e inequality (11.1). For any solution pair (x, y), there m a y be a different a satisfying one of the results in this section. This could possibly introduce a factor of d. However, T h e o r e m 9B allows us to restrict ourselves to as few as 32 roots a if we are willing to give up a factor which is -~ 1. So it remains for us to estimate the n u m b e r of integers (z, y) satisfying one of the l e m m a s in this section for a fixed root a . Before doing so, we would like to combine our l e m m a s to get a single relation. Write K = m H (2/d)-1 (11.4) T h e n (11.3) in L e m m a l l C becomes
a
x I
K112 - ;. --< lyld/-~
and all of the other relations for ~ imply K
If we restrict ourselves to solutions where min(lxl, lyD > K1/d, then we see t h a t it will suffice to s t u d y
lyldl2
•
By symmetry, a similar result holds for roots 1//~.
§12. P r o o f o f T h e o r e m 7A. Recall, in T h e o r e m 7A, we give a b o u n d on the n u m b e r of solutions of
IF( x, Y)I < m , where F ( x , y) is an irreducible trinomial of degree d > 9. As in the previous sections, we write
F ( x , y) = ax d + bxty d-t 4- ey d ,
120 with a > 0, c > 0. We let h = max (a,
[bh c)
and K = m / h 1-(2/d) .
THEOREM 12A. (a) I T F is a trinomiM with d => 9 and K <_- 1, then the number of primitive solutions of the Thue inequality above is
<< 1 + l ° g + ( l ° g m / l o g ( 2 / K ) ) << 1 + l°g+ logm log d log d (b) If F and d are as above and K > 1, then the number of primitive solutions to the Thue inequality with min (Ixl, ]Yl) > c, K 1/(d-4) is <<1+
log + log rn log d
C O R O L L A R Y 12B. If m < H 1-(3/d), then the number of solutions is << 1. The corollary follows since m <=H 1-(3/~) gives 1 / K >=H1/d. Then log m / l o g ( 2 / K ) << (log H ) / ( d log H) = d . R e m a r k . In fact, the corollary is still true when the 3 is replaced by any constant greater t h a n 2. P r o o f ( T h e o r e m 12A). Let c~1,... , C~dbe the roots of f ( z ) = F ( z , 1). By Lemma 7B of Chapter I, h(o~l).., h(c~d) <= 5 ~/2 H ( f ) ~ H. In particular, each root o~ has h(o0 -~ H. Since d _-> 9, we have d/2 = V ~ < V/~ = x 2 > l --
'
with X = ~ ' ~ • In case (a), in view of what we said at the end of the last section, we need to consider - y K1/2 "< yd/2 " T h a t is, look at X
-y
<
Cd2 K 1/2 K 1/2 1 - - < - yd/2 < 4 ydl2x yd/2x
(12.1)
if y > ca. The exponent in the last inequality is # = d / 2 x > X v ' ~ . Thus, by Theorem 9A of Chapter II, the number of solutions with lYl > h(~) is << 1.
121 Next consider solutions of (12.1) with c3 =< lyl _-< h(~). Suppose these solutions have denominators c3 =< Y0 = < Yl = < --- = < Y~ = < h(c0. By the Gap Principle, we have
Yj+l >
2
(dl2x)-I >
= K1/2Yj
2
all3
= Ki/2Yj
,
and so
h(a)
> v, > (
=
=
2
•
This gives a b o u n d on the number of denominators, namely u+l<2+
log+(log h(a)l log (21K'12)) log (d/3)
Since h(a) -< H, (i.e. h(a) < cdH), we have u+l<<2+
log + (log H~ log ( 2 / g li2)) log d
Furthermore,
log H I log(21K li~) << (log m + log(1/K))/log(21K'12) << 1 + (log m l l o g ( 2 / K ) ) , and we see that the n u m b e r of solutions in this case is
<
log+ (log m/log(2/K))
log d
For part (a) of the theorem, it remains to count the solutions with lYl < c3. This will be done after stating L e m m a 12C, which follows shortly. In part (b), we have K > 1. As before, we consider the inequality
a - Y < cd g ' l ~ ca g'12 ydl-------i---< yXV~ yC~a
(12.2)
If y > c6 K cT/d, then
yXV~ " Again by T h e o r e m 9A of Chapter II, the number of solutions with y > h(a) is << 1. Thus it remains to count the solutions with y __< max (c6K c'/d, h(a)). We rewrite the inequality in (12.2) as
or- Yl < cd2 K I/2 _ A _ A ydl2 yal2 y2+~ '
(12.3)
122 where 6 = (d/2) - 2. By a variation of L e m m a 8C of Chapter II, we know that the n u m b e r of solutions to (12.3) with W __ (4A) 1/6, is <<1+
log (C log W) log(1 + 6)
In our particular case, we want to count solutions with (4A) '/6 < y __< max (c~K ~'/a, h(a)), so we let W = (4A) 1/6 and C-
l o g c 6 K c'/d log(4A)l/6
or
C-
logh(a) log (4A)1/~
In the first case, in view of A = c d K 1/2, we have C << 1, and in the second, C < log h(a) < log (cdH) << d + log m. We also have l o g W = log (4A) 1/~ << 1 + l o g m , and the b o u n d on the number of solutions becomes <<1+
log + log m log d
In case (b), we are left with solutions with [Y[ < (4A) a/6 < c r K 1 / 2 6 = e7 K1/(d-4) • Thus by the way T h e o r e m 12A (b) was formulated, we are finished with this case. In case (a), it remains to count solutions with [y] < c3. We will finish this after the following lemma. LEMMA with
1 2 C . Let p ( X ) = A X d + B X q + C be a polynomial. The r e a / n u m b e r s
£all into the union o£ at most eight intervals of totM length << 1 + k - ~ [ ]
min
1, \ [ B [ ]
'
-~[
"
T h e proof will not be given here, but the following picture may give the reader some indication as to what goes on.
123
For a proof (using only calculus) see Mueller and Schmidt (1987, L e m m a 7.1). In particular, if [A[ > 1, the total length is
,~]ld ml/2 << 1 + m i n \ t , , ] - ~ ) ' IBl('12)-('/d) I ¢ m
rnl/2
' IClO-7~T:(Iid))
"
Now we are able to complete the proof of part (a) of T h e o r e m 12C. Recall that we left open the n u m b e r of solutions of ] f ( x , y ) ] = < m where ]y[ =< c3 (or when Ix[ = < c3). For given y, consider the polynomial p(X) = F(X,y)
-~ a X d + by t X q + cy d
= AX d + BX q + C , with A = a, B = by ~, C = :kcy a. We have K < 1, so that m = < H 1-(2/a), and then m 1/2 < H 0 / 2 ) - ( 1 / d ) . For each y # 0, the lemma tells us that the n u m b e r of solutions is << 1. W h e n y = 0, the only possible primitive points are (1,0) and ( - 1 , 0). All together, we have << 1 solutions in this case. We may now prove the main theorem, which we state again for the reader. THEOREM
7A. I f F ( X , Y ) is an irreducible t r i n o m i a l o f degree d > 9, t h e n t h e
n u m b e r of s o l u t i o n s of
IF(x,y)l <__ m is << m21 d .
124 P r o o f . As in section 2, it suffices to show that
P ( m ) (< m 2/d , where P ( m ) denotes the number of primitive solutions. By T h e o r e m 12A, we have P ( m ) << 1 +
log + log m log d
provided that g < 1 or min(ixl, ]y]) > cl g i/(d-4) , where K = m / H ~-(2/d). So we are left with the case where K > 1 and min(iz h [Yl) < Cl K1/(d-4) • Say 1 =< [y[ < ci g ~/(d-4). It suffices to consider 1 =< [y[ < Cl m ~/(d-4) < c~m 2/d if d => 8. Given such a y, the number of solutions in x is (by L e m m a 12C with A = a, B = by t, C = cy d) << 1 .F min ( ( _ ~ )
=< 1 + min \¢ m
1/d
1/a '
rnl/2) ' ICl(1/2)-O/d)
ml/2
)
"
So for given y, the n u m b e r of solutions is << m lId, and there are << m 21d solutions with 1 < ]Yl < ml/d • Now it remains to count those solutions with
mlld < [U] < ¢1m2/d Their n u m b e r is
<< rn2/d +
E rnl/d ~ I~l 5-
m 1/~ \( ly~(-T]~12)-l )
(12.4)
c~m 2/d
W e estimate the sum by considering the integral
f
c~m2/d
jmt/d
1
dy << rn ~(2-~) 1 d = m~2 ½
y(di2)-I
It is now easily seen that the sum in (12.4) is << m 2/d, and the proof is complete.
§13. Generalizations of the Thue Equation. Let K be an algebrMc number field of degree [K : Q] = 5. As usual, let M ( K ) denote the set of absolute values on K and let S be a finite subset of M ( K ) , containing all of the Archimedean absolute values. Let s = card (S) and for a E K , let [dis =
1-L s
°.
An element a E K having --5 < 1 for every v • S is called an S-integer. These integers form a subring D s of K . T h e units of D s form a group Us consisting of a E K with ]a]~ = 1 for every v ~ S. Consider the special case where S = Mo~(K), i.e. the set of Archimedean absolute values of K . If the n u m b e r field K has rl real embeddings and r2 complex embeddings,
125 t h e n s = rl + r 2 . In this c a s e D s = D , the ring of integers i n K , and Us = U, the group of units in K . By a generalized version of Dirichlet's Theorem, we know that Us has rank s - 1, not just in the special case, but also for general S as above. T h e torsion part of Us is finite and consists of the roots of unity in K . ( T h e reader m a y find a proof of this version of Dirichlet's Unit Theorem in Borevich and Shafarevich (1966)). Let F ( X , Y ) E Ds[X, Y] be a homogenous form of degree d > 3. Suppose that F has no multiple factors. Consider the generalized Thue equation
If(x, Y)ls = 1 in variables x,y E Ds. Two pairs (x,y) and (x',y') are called equivalent if x = xly, y = y ~ with 7/E Us. If (x, y) is a solution to the equation, then every pair which is equivalent to (x,y) is also a solution, since [~]s = 1. THEOREM
1 3 A . With conventions as above, the number of equivalence classes
of solutions to If(x,
)ls = 1
is __<(4s) 2~ (4d) 26~ . This result, which will not be proved here, is due to Bombieri (1985). COROLLARY
1 3 B . The number o[ equivalence classes of solutions to
F(x,
e Us
is < (4s) 2~ (4d) 26~ . Now consider solutions to F(x, y) = 1 with x, y e D s . If (x, y) and (x', y') are solutions in the same equivalence class, then they differ by a factor ~ E Us. Since (by F(x, y) = f ( x ' , y') = 1), Oa = 1, we have the following corollary. COROLLARY
1 3 C . The number o£ solutions of F(x,
= 1
with x, y E D s is < d(4s) 2~ (4d) 26' . Now consider the equation F ( x , y) =
...
where F ( X , Y ) E Z[X, IF] and P l , . . . ,P~ are distinct primes. We seek solutions x, y, z l , . . . ,z~ E Z with
gcd (x,y, pi) = 1
(i = 1,... ,v).
This is known as the Thue-Mahler equation, since it is a generalization of the Thue equation which was first studied by Mahler (1933).
126
Example.
We m a y seek integer solutions (x, y, z) of x 3 - 2 y 3 = 5 ~ where gcd(x, y, 5) =
1. For the T h u e - M a h l e r equation above, take S : i o n , p 1 , . . . ,pv} with s = v + 1. As in Corollary 13B, look for solutions to F ( x , y) E Us with x, y E LDs. T h e corollary gives a b o u n d on the n u m b e r of equivalence classes of solutions. T w o equivalent solutions differ by a factor in Us, which consists of =t=p~'~ ... p ~ . Because of the condition gcd(x, y, p~) = 1, these factors can only be + l , and we have the following result. COROLLARY
1 3 D . The number of solutions of the Thue-Mahler equation
F(x,y) =p~'...p:~ is < 2 (4v + 4) 2~ (4d) 26(~+1) . This corollary includes the case F ( x , y ) = m, where m = PlZ l ...P~,~ is given. In this particular case, we know that, e.g., the factor 26 in the exponent is unnecessary. For lower bounds see Erd6s, Stewart and T i j d e m a n (1988). Next, we consider the T h u e - M a h l e r Equation in terms of ideals in D s . We know that the n o n - A r c h i m d e a n absolute values in S come from p r i m e ideals, say ~31,... , ~3t. T h e n the group of units Us consists of a E K* which generate a principal ideal (a} of the form (a) = ~3~1 . . . ~ ' . Above, we considered the equation F ( x , y) e Us, but this Can also be written as the generalized T h u e - M a h l e r equation (F(x,
=
...
in unknowns x , y in O s and z l , . . . ,zt in Z. Now consider the special case (x d -
wy d) =
...
,
where w is a given coefficient. Evertse (1984a) studied a variation of this, namely
(x d - wy d} (xd, wyd> - - ~ ' . . . ~ ; '
,
(13.1)
where (a,/3) is the fractional ideal generated by a and/3. In this setting, one studies solutions (x, y) E p I ( K ) and Z l , . . . , zt E Z. Evertse obtained the following result. THEOREM 1 3 E . The number of solutions of (13.1) is no g r e a t e r than (cld) s , where Cl is an absolute constant. (Here s = c a r d s equals t plus the number of Archimedean absolute values).
IV. S-unlt Equations and Hyperelliptic Equations. References: Evertse, Gy6ry, Stewart and T i j d e m a n (1988), Shorey and Tijdeman (1986), Schlickewei (1977).
§1. S-unit Equations. As in the end of Chapter III, let K be a n u m b e r field with [K : Q] = 5. Also, M o o ( K ) C S C M ( K ) and card S = s. An S-unit equation is one of the form a l X + a2y = 1, where non-zero a l , a 2 6 K are given, with unknowns x , y E Us. Consider the concrete example where K = Q and S = { o o , p l , . . . ,p~}, with the equation x+y=l. So we are looking at ±p?
where z i , w i 6 Z, 1. T h e n we have
. . . p~v + v T ' . . . vY v = 1 , W r i t e x = x ' / z ' a n d y = y ' / z ' w i t h gcd ( x ' , y ' , z ' )
(i = 1 , . . . , v ) .
xl
+
yl
= z'
=
,
or
+ v ? . . . p~" + v ~ ' . . . v : ~ ± p l ' - . , v 2 = 0 ,
where z i , w i , t i (i = 1 , . . . ,v) are nonnegative integers and the summands have no common prime factor. (Thus rain (zi, wi, tl) = 0 (i = 1 , . . . , v)). Example: ± 2 zl 3 *2 -t- 2 wl 3 w2 ± 2 h 3 '2
=
0 .
THEOREM 1A. A n S-unit equation a l x + a2y = 1 has at m o s t a finite n u m b e r of solutions. Fhrthermore, if a l , a2 6 Us, then their n u m b e r is ~ 825cS ,
where c is an absolute constant.
Remark. We may force a l , a2 6 Us by enlarging S if necessary. This may change the bound, though. In section 2, we will give bounds which do not require a l , a2 6 Us. We follow Evertse, Gy6ry, Stewart and Tijdeman (1988). P r o o f . We prove the second assertion, since it implies the first. By a generalized version of Dirichlet's Unit T h e o r e m mentioned in Chapter III, the group of units Us has the form Us
=Z
~ . . . @ Z ~ (Torsion) ,
*-- 8--1 times--*
where the torsion part is a finite cyclic group consisting of roots of unity.
128
Let Ug be the subgroup of Us consisting of elements of the form u d where u E Us. T h e quotient group U s / U g has cardinality __< d s - i . d = d s . Let ~1,... ,~t be coset representatives, where £ < d s. T h e n any x, y E Us m a y be written as x = ~iX d
y = ~jyd ,
and
where X , Y E Us and 1 < i, j < g . T h e S-unit equation s i x + ~2Y = 1 becomes ~ l ¢ i X d + ~2~j y d = 1 . Applying the a r g u m e n t for Corollary 13C of C h a p t e r III with any given d > 3, but noting t h a t factors 7? with T]a do not affect x = ~i(XTl) a and y = ~j(y,)d, we see t h a t the total n u m b e r of solutions is < 62 (4s) 2~ (4d) 268 < d2,(4s)2~ (4d) 26~. For instance, if d = 3, we get the b o u n d 32s (12) 26s (4s) 2~ =< s2~(12) 28s 42~ < s 2~ 1230s . So we have used the fact t h a t a Thue equation has only finitely m a n y solutions, to show t h a t an S-unit equation has only finitely m a n y solutions as well. One can see t h a t the converse is also true, i.e. t h a t the finiteness of the n u m b e r of solutions of S-unit equations implies the finiteness for Thue equations. We consider the T h u e equation F ( x , y) = m , where F ( X , Y ) E D s [ X , Y] is homogeneous with at least three non-proportional linear factors and m E D s . In some extension, F factors as F = L1 . . . L d , where the Li are linear with coefficients in a field N D K D Q • Let S ~ C M ( N ) consist of the extensions of the absolute values of S to N. We can assume t h a t m and the coefficients of Li lie in D8,. Consider the equation Ll(~_)...
Ld(x)
= m
(1.1)
and seek solutions x, y E D s , • Take linear forms L1,L2, L3 which are non-proportional. Since these are three linear forms in two variables, they must be linearly dependent. We have La = 11 L1 + 12L2
with
X1,~2 E N × ,
and Ll(X_) "~1 ~
L2 (x_) -~- )~2
--
La(x__)
- 1 .
129 By (1.1) above, we know that Ll(__x) I m in iDs, . Up to equivalence, there are only finitely m a n y divisors of m. Say LI(__x) = pu , where u 6 Ust, and there are only finitely many possibilities for p. T h e n Ll(x__)
where ul 6 Us, and there axe only finitey m a n y choices for pl- Similarly,
L~(Z) L3(~)
- p2 u2.
This gives A,plu, + A2p2u2 = 1
with u l , u 2 6 Us, • This is an S'-unit equation in u l , u 2 , and it has only finitely m a n y solutions. Thus there are only finitely m a n y possibilities for L l ( x ) / L 3 ( x ) and L2(x_)/L3(x). But these quotients determine x up to a factor of proportionality. It now follows immediately that (1.1) has only finitely m a n y solutions x.
§2. Evertse~s B o u n d . Let K be a n u m b e r field with [K : Q] = 6 and S C M ( K ) with card S = s, as before. Evertse (1983) gives a bound for the number of solutions of the S-unit equation a l x + a2y = 1
which is independent of the coefficients a l , a2 and which does not require that a l , a2 6
us. THEOREM
2 A . (Evertse (1984)) The equation O:lX + o~2y = 1,
where a,, OL2 6 K
are non-zero has at most
3 • 7 ~+2s solutions x, y in Us. Here, we will prove less, namely that the number of solutions is =< c , ( 6 ) e ~ .
For instance, if K = Q, then the number is < c~. First we reformulate the problem in projective space. T h a t is, we look at solutions to the equation ~ l x + ~ 2 Y - z = 0,
130 where (x, y, z) 6 P2(Us). O r let aa = - 1 and consider OllX + a 2 y + a s Z = O, which m a y be r e w r i t t e n as yl + y 2 + y a
=0.
T h e n we consider solutions where y l / a i E Us are considered the same. Let _a = ( a l , a 2 , a 3 ) and y = (yl,y2,ya). define 1-T A = !!
(i = 1, 2,3), and p r o p o r t i o n a l solutions As in C h a p t e r I, let (()~ = [~[~"~, and •
v~S
Since [yi/ail. = 1 for v ~ S, we have
H
v6S
(YlY2Y3)V 1 V~S 1 V~S ((~-~->3 (YlY2Y3)V) (y___)3 -- 1~ ~ (YlY2Y3)v ~,(y)3 (O/10/20/3)v _
v6S
=
= A Hk(y_) -a . Given y and v, let ( i , , j , , k,) be the p e r m u t a t i o n of (1, 2, 3) with the p r o p e r t y t h a t
[Yi~ [~ < [yj~ [~ < lYkv[.- If v is n o n - A r c h i m e d e a n , t h e n lYj~ tv = [Ykv[, since Yi~ + Yj~ + Yku -- 0. If v is Archimedean, t h e n [Yk~I- < 2[y/~ [. and [_y]. < v~[Yi~ ],. This gives
I_yl
'
where 1 a t , ~--
T h e n we have
if v is n o n - A r c h i m e d e a n , if v is Archimedean.
(Yi~ (y)~).._._~v < 36 A v6S = YI
Hk(y) -a .
(2.1)
This ties in with R o t h ' s T h e o r e m in its generalized form, t h a t is, T h e o r e m 10B of C h a p t e r II. We consider the variables as yl, y~, and we have the linear forms
L.=
Yl
if
i~ = 1
Y2
if
it,=2
-yl-y2
if
i.=3
We know t h a t m a x (lYll~, lY21~, ]Yl + Y2[~) =< 2 ~'(~) m a x (lYl[~, [Y2[~),
131 where
: { oi
if v is n o n - A r c h i m e d e a n if v is A r c h i m e d e a n .
L e t t i n g x = (yl, y2), the inequality b e c o m e s
T h e n (2.1) yields H
(L.(x_)>~ 6~ _ (L.>~x__>. < A HK(X_) -a
(2.2)
v6S
since (Lv)v => I. To count solutions, we apply Theorem 10B of Chapter If. W e get s
solutions with
gg(x__) > c1(6,t,c) (C + H + I) c2(~'''*) , where t is the number of distinct linear forms L., so that t = 3, and 6 = I, C = 12~A.
So we h a v e
< c3(*)4 solutions w i t h
HK(X=) > c1(6) (12~A + 2 + 1) c:(~) . Now we are left to count small solutions. We have
HK(X_) < Hg(y) < 2~HK(x) , a n d there are two possibilities.
If A < 12 n~, t h e n HK(y_) < Ca(6) . T h e r e are only
finitely m a n y solutions in this case, say < cs(6). T h e n we are left with the case where A > 1212~, a n d we have to count solutions with H K ( y ) < A c~(~). To do this, we need the following l e m m a . LEMMA 2 B . Suppose 0 < 7 < 1 and s 6 Z + are given. Then there is a finite set G of s-tuples (P1,.-. ,l'~s) o f non-negative reals with F1 + . . . + Fs = 7, such that for e v e r y x = ( x l , . . . , x s ) with xi > O, there is an s-tuple from 6 with xi > F i ( x l + . . . + x , ) (i = 1 , . . . , s). Fhrthermore, card ~ < c8(7)*.
E x e r c i s e 2 a . P r o v e L e m m a 2B. In the case s = 2, the e l e m e n t s o f ~ following picture.
lie on the line F1 + F 2
= 7. We have the
132
~
a ( r l , r~)
3'
1
It is easy to see that one may cover the line Xl + x2 = 1 with a finite n u m b e r of sets o = ~ ( r l , r ~ ) consisting of (xl,x2) with xi > Fi (i = 1,2). The lemma then follows for the case s = 2. Remark. As a consequence of this lemma, we have that for xi < 0, (i = 1,...,n),thereissome(F1,...,F,)•®suchthatxi
(Y'°)" =< (~s
(2.3)
for every v 6 S. (Notice that we have changed the subscripts of Fi ( i = 1 , . . . ,s) to F . (v • S).) Using the estimates (2.2), (2.3), we have
(y,o)~
<
(_Y) v
=
(6~A HK(y)_a)r.
(v 6 S).
(2.4)
=
We subdivide the set of solutions into those with fixed {F.}~Es, which gives < c8(7) s = c~ classes. We then further subdivide into classes for which i , is fixed for every v 6 S, which gives 3 s subclasses. All together, we have cf0 classes. In the next lemma, we establish a Gap Principle which allows us to count the number of solutions in a fixed class. LEMMA
2C. Consider solutions y,
y' to (2.4) which lie in a given class. Suppose
H~(~_) <__H~(y_'). Then H~(y_') >= 12-~ £/6 H~(y_)3/2 .
133
P r o o f . Write y = (yl,y2,ya) and y' = (Y~,Y~,Y~a). For v • M ( K ) , form the "determinant" a ~ = (YiY; - YjY~)v
(u)v (y')v where i ~ j . (Since ~ yi = ~ y~ = 1, it doesn't matter which pair i ~ j we choose). In particular, for v • S, take i = iv. Then Av < 2 x(')"~ max
(Yiv)v (Y~v)v~ (_--~ , ~ ],
where
f 1 ( 0
if v is Archimedean if v is non-Archimedean.
Then, by (2.4), we have A~ =< 2x(v) n~
() )ro 6 ' A HK y --3
where Y~,,es F,, = 5/6. Taking the product over v • S, we get 6 ' a H g ( y ) -a
H Av < 2' yES Now, for v # S, we have yl, so then
ty~/a~lv
)
s/s (2.5)
= 1 since yilai G Us. A similar result holds for
(Yi)v = (Y~)v -= (Oli)v,
(i = 1, 2, 3).
If we pick i,j, k such that
(-/)v < (-i)v < (-~)v = (_~)v, then (--)I
(a)~
where the inequality holds since v ~ S is a non-Archimedean absolute value. Taking the product over v ~ S gives
I I ~v __< I I (.,.j.k)v = A-, In conjunction with (2.5) this yields 1 H A~ < 12' A -1/6 H g ( y ) -5/2 . HK(y) HK(_y') = = _ vEM(K)
134
L e m m a 2C follows. We return to the proof of Theorem 2A, where we were left with solutions having HK(y) < A c7(~) • Furthermore, we may initially restrict ourselves to solutions in a given class, as explained above. For two such solutions y, y', the Gap Principle of L e m m a 2C, along with A > 1212~, gives the inequality HK(y__') > A 1/12 HK(y_) 3/2. Suppose some fixed class contains solutions Y0' Y l ' " " 'Y, with non-decreasing heights b o u n d e d above by A ~*(*). T h e n by the Gap Principle, HK(Yl) > A 1 / 1 2
,HK(y ) > (A1/12) (3/2Y'-'
But HIr(y ) < A cT(e), so we have ~V
1Q3)
u-1
- -
<
12
=
and v < Cll(G ). Recall that the n u m b e r of classes was not greater than c[0 , so that we have no more t h a n cn(g)c~o solutions with height bounded by A ~7(e). Combining this with the other estimates, we get no more than c1(6)c~ solutions to the S-unit equation, where 6 = [K : Q] and s = card S. Our proof essentially followed the m e t h o d of Evertse, Gygry, Stewart and Tijdeman (1988). Evertse's proof of Theorem 2A used hypergeometric functions. These are advantageous when studying rational approximations to radicals, i.e. numbers of the type
§3. M o r e on S-unit E q u a t i o n s . In sections 1 and 2, we considered S-unit equations in two variables. We reformulate this in projective space by considering the equation O~1 Xl + Ol2 X2 -t- O~3 X3 = 0 ,
where a l , a2, aa E K* are given and the solutions ( z l , z2, xa) lie in P2(Us). Even more generally, we may consider O~OX0 "~ O~lX 1 + . . . + O/nX a ~---0 ,
where ai E K* (i = 0,... ,n) and where we seek solutions ( x 0 , . . . ,x,,) E [*"(Us). If n __> 3, we no longer necessarily have only finitely m a n y solutions. Consider, for instance, the following example. Let n = 3 and consider solutions of the equation xl + x2 + xs + x4 = 0, where S = {oc,2} and the n u m b e r field K = Q. There are infinitely m a n y solutions of the form ( 2 " , - 2 n, 2 m, - 2 " ) where m, n E Z. However, with suitable restrictions on the solutions, an S-unit equation will have only finitely many solutions. We see this in the following theorem due to Evertse (1984b) and Schlickewei and van der Poorten (1982).
135
T H E O R E M 3A. A n S-unit equation has only flnitely m a n y solutions in Fn(Us) for which no subsum vanishes, i.e. for which ai, xi~ + . . . + s i , xi, # 0 for any {il, i2, . . . , it }
c {0,1,... ,,~} with t # O, ~ # ,~+ 1. The proof, which will not be given here, (but see Ch. V, §2), uses results on simultaneous diophantine approximations which involve both Archimedean and nonArchimedean absolute values. Mahler (1933) was the first to study diophantine approximations using non-Archimedean absolute values. For the remainder of this section we return to the case n = 2, considering equations of the type S 0 X 0 3I- O~lXl -4- Ot2X2 = 0 (3.1). Evertse, GySry, Stewart, and Tijdeman defined an equivalence relation on S-unit equations. A slight variation on their definition is that two equations (3.1) and ' s'o*o + S'lZl + a2z2 =
0
(3.2).
are equivalent if
a~ = ~
(i = 0,1, 2),
where ¢i E Us and ~ E K x. Equivalent equations have the same number of S-unit solutions, for if x0, x l, x2 is a solutions of (3.1), then x0/~0, x~/~1, x2/e2 is a solution of
(3.2). T H E O R E M 3B. (Evertse, GySry, Stewart, Tijdeman (1988)). E x c e p t for finitely m a n y equivalence classes of equations, an S-unit equation over p2(Us) has at most two solutions. R e m a r k . In general, there may be more solutions. Consider the S-unit equation x + y = 1. Nagell (1969) proved that when 6 > 5, there are number fields K of degree such that this equation has at least 3(2' - 3) solutions in units of K , i.e. with S = M~(K). In a more recent result, Erdhs, Stewart and Tijdeman (1988) proved that for K = Q and s arbitrary, there are sets S of cardinality s such that the S-unit equation above has at least e ~sl/2/l°ss solutions. Stewart made a case that e s2/3 is the correct order. R e m a r k . The number 2 in Theorem 3B is best possible, provided s > 1, so that Us is infinite. To see this, pick ~,77 E Us, with ~ ~ q, ~ # 1, 71 ~ 1. Consider the
equation s i x + s 2 y = 1,
where s l = (T/- 1 ) / ( 7 / - ~) and s2 = (~ - 1)/(~ - ~/). This has solutions (1, 1) and (~, T/). Since there are infinitely m a n y choices for ~, r/, there are infinitely m a n y S-unit equations of this particular form. However, only a finite number of these belong to a given class. To see this, suppose s i x -4- s2y = 1 and
s i x + s~y = 1
136
lie in the same class. Then a t = a l ¢ l and a S = a2~2 where ~1,e2 E Us. Then the second equation becomes aielX + a2e2y = 1. Since (1,1) is a solution of this new equation, we have O~1~1 -{-O~2~ 2 ~- 1 . But this is an S-unit equation in the unknowns el,e2, thus it has only finitely m a n y solutions. Therefore, we have infinitely many equivalence classes of equations with at least two solutions. P r o o f o f T h e o r e m 3B. We are considering S-unit equations of the form a x + fly + Tz = O ,
with solutions (x, y, z) E ]P2(Us). equation of the type
Every equivalence class of equations contains an a x + fly + z = O .
Now suppose that we have three distinct solutions, say (xi, yi, zi) Thus a x i + fiYi + zi = 0 (i = 1, 2, 3).
(i = 1,2, 3). (3.3)
Because the solutions are distinct in p2(K), any two of the triples (xi, yi, zi) are nonproportional. So we have r a n k ( xlx2 Y2Yl z2Z1) = 2 , and then
I Y l [ # 0, Y2 I
Xl x2
(3.4)
by (3.3). Therefore, xlY2Za 7t x2YlZ3. Cyclic permutations give two more relations, namely x2yazl ~ x3y2zl and x3ylz2 # xly3z2. All together, considering y , z or z , x in place of x, y in (3.4), we get nine relations. Also, the three solutions to the given S-unit equation satisfy (3.3), which may be interpreted as a system of three equations for ¢x, fl, 7. The matrix of this system must be singular: Xl
Yl
x2 x3
Y2 z2 = 0 . Y3 z3
Zl
Writing out this determinant, we have x l y 2 z a + x2yaz] + x a y l z 2 -- xlYaZ2 -- X2ylZ3 -- x3Y2Zl = 0 ,
(3.5)
and the nine relations above impose the additional condition that no term with a + sign can equal a term with a - sign. To finish the proof, we need the following lemma. LEMMA a n d yl z2 / y2 zl .
3C. There are only tlnitely m a n y possibilities for the ratios xl z 2 / x 2 z l
137 P r o o f . T h e relation (3.5) is about a sum of S-units, i.e. an equation al + a2 + a3 -- bl - b2 - b3 = 0, with ai, b~ E Us (i = 1,2, 3). By T h e o r e m 3A, this equation has only a finite n u m b e r of solutions in LOS(Us) for which no subsum vanishes. We consider several cases. c a s e (i). Suppose no subsum vanishes. T h e n we have only finitely m a n y possibilities for bl/a 2 = 2;lyaz2/x2yaz I ---- 2;lZ2/2;2Zl
and an~ha = xaylz2/xay2zl = ylz2/y2zl. c a s e (ii). Suppose al + a2 + a3 = 0 and bl + b2 + bs = 0. For each of these subsums, we may apply T h e o r e m 3A to see that there axe only finitely m a n y possibilities for al/a2, an~a2, bl/b2, bl/ba. T h e n there are only finitely m a n y possibilities for the product alasb~/a~b2bs, which simplifies to (x l z 2 / x 2 z 1)3. So we h a v e o n l y finitely m a n y possibilities for 2;lz2/x2zl, as desired. By symmetry, we get the same result for Yl z2/y2 Zl. Again by symmetry, we axe left with the following two cases. case (iii). al + a2 = 0 and as - bl - b2 - b3 = 0. case (iv). a l + a 2 - b l = 0 a n d a s - b 2 - b a = 0 . E x e r c i s e 3a. Establish the result for cases (iii) and (iv). We return to the proof of Theorem 3B. From (3.3) with i = 1, 2 we obtain
::
Z2
t
2;2
Z2
Yl
Xl
2;1
Yl
Y2
x2
x2
Y2
so that
(x v,12;iv ) : l'
z,
(2;lv212;
v,)- I
Thus, by L e m m a 3C, there are only finitely many choices for aXl/Zl and ~ y l / z l . Since 2;1/Zl, y l / z l E Us, there are only finitely many possibilities for a, fl up to equivalence classes (i.e. up to S-units). Therefore, there are, up to equivalence, only finitely many S-unit equations with more than two solutions. R e m a r k . This argument can not be generalized to give a similar result for S-unit equations in four variables. R e m a r k . T h e n u m b e r of exceptional equivalence classes has not been estimated, although such an estimate could perhaps be derived from recent bounds on the number of solutions of 5'-unit equations in n variables. §4. Elliptic~ Hyperelliptic, and Superelliptic E q u a t i o n s .
138 We consider equations of the form yd = f(x), where d > 2, deg f = n > 2 and A ( f ) = discr(f) ¢ 0. The case where d = n = 2 will be excluded. The polynomial f m a y have its coefficients in various fields or rings, for example, f(X) E Q[X], or f(X) E K[X] where g is a number field, or f(X) E Ds[X] where Ds is as before. In the case d = 2, n = 3, we have elliptic equation~, which have the form y2 = f(x), where f is a cubic polynomial with distinct roots. W h e n d = 2 and deg f = n > 3, we have hyperelliptic equations. The most general case, namely y~ = f(x), where d > 2, n > 2, but n and d not both 2 is called a superelliptic equation. In the case of hyperelliptic equations, the genus of the Riemann surface is g = [(n 1)/2]. For suppose the equation is written in factored form as y2 = a(x-al )... (x-an). Consider the case where n is even, say n = 2m. Since n is even, we m a y pair the roots, as shown, making a cut between each pair. O~2
C~4
J
.,.
Ofn--,
We see that y is an analytic function in this cut plane. In other words, we have two Riemann spheres with m cuts in each. el Oil O
°~2
d2
e2
Identifying edges of the various cuts, we get a single Riemann surface which is h o m o m o r p h i c to the surface of a pretzel with m - 1 "holes". We have to identify, e.g., the upper edge el with the lower edge d2, and the lower edge e2 with the upper edge dl. Thus the genus is m - 1 = [(2m - 1)/2] = [(n - 1)/2]. The proof is the same for n odd, except that the last root is paired with the point at infinity on the sphere. E x e r c i s e 4a. Consider the equation y3 = with distinct roots. Show the genus g = 1.
f(x),
where f is a cubic polynomial
139
T H E O R E M 4 A . A superelliptic equation with coefficients in an Mgebra/c number tield K has only finitely many solutions x, y E Os. (Here S C M ( K ) and O s are as before.) The special case of an elliptic equation was done by Mordell (1922). T h e general case, which is due to Siegel (1926), was published under the pseudonym X. In the proof it is necessary to consider an extension field K ~ D K . We then choose S ~ C M ( K t) large, in particular so large that it contains every extension of elements of S to the field K t. T h e n if x E K is an S-unit, it is also an St-unit. For if ]x[, = 1 for every v ~ S, then [x[,o = 1 for every w ~ S'. P r o o f . In some extension field K r, the polynomial f factors, i.e. (4.1) with a, ai E K ~, provided K ~ is sufficiently large. Furthermore, if S' is large enough, then a, ai E Ds,. By a change of notation, we may suppose that K, S have these properties. We will use the following lemma to rewrite the factors x - ai. L E M M A 4 B . There exists a finite set B of non-zero elements of K such that for any solution x , y to (4.1) with y ~ O, we have -
where/3i E B and yi E Ds. Proofi (i) T h e r e is a finite subset ~a,-.. , ~,~ in Us such that every u E Us m a y be written in the form U ~- ~itL Id,
where 1 _< i < m and u ~ E Us. (ii) Let P be the set of prime ideals in D s which either divide ai - a i for some i ~ j or divide a. Then P is finite. Suppose some prime ideal ~ divides x - ai and x - a 1 for i ~ j . T h e n ~3 E P . (iii) Say P consists of ~31, .... , ~e. If f i is an ideal whose prime factors lie in P , then d = .~3t (~31 where 0 =< c i < d for i = 1 , . . . ,g. In other.words, f i = fi*L~a with only finitely many possibilities for fiB*. (iv) We know that the class number h of the ring D s is finite. This means that there are certain fractional ideals ~ x , - - . , ~ h such that every ideal has the form ,,
• ..
where (z) denotes the principal fractional ideal generated by z E K and where 1 < i < h. If the ~ i are properly chosen, then any integral ideal ~ will have the form where now z E £Ds.
140
We now combine these observations. For x - ai given, write (x - a l ) = ~ i ~ i , where ~{i consists of prime factors which are not in P and Bi consists of prime factors which are in P. Then we have
= (a)~l...
<x -- Ogn)
~,~1...
~,,,
where each ~i is coprime to the other factors on the right-hand side by remark (ii). But the left-hand side is a dth power, so each PAi must be also. Using this fact and (iii) from above, write * d ~i = ~/a and ~ i = ~ i "t'~i • Then <x - ~ )
= ~7(a~)d,
and we noted that there are only finitely many possibilities for the ~*. By (iv), we may write ~ i ~ i = ~ , {zi) with zi E D s . Taking dth powers and noting that (x - ~i) is a principal ideal in D s , we have with only finitely many possibilities for the principal ideal (~i)- Then X -- Ol i :
Ui~i z d ,
where ui E Us, and ~i lies in a finite set. By observation (i), we have Ui = ~k i It td i ,
with only finitely m a n y possibilities for ~k,. So finally, we get - ~i = (¢~,~i)(,~'~z,) d,
which gives the result with/3i = ~k,~i and yi = u~zi. We return to the proof of Theorem 4A, considering two cases. In the first case, we suppose d __>3, n > 2. Using the lemma, write x - ~1 = / ~ 1 y l d,
Then /~,yl - Z~y~ = ,~ - ~, ~
o,
which is a Thue equation in the variables Ya, Y2 since d > 3. So we have only finitely m a n y solutions yl, y2- Since x is determined by the/~i's and yi's, we have only finitely m a n y possibilites for x.
141
T h e remaining ease is when d = 2 and n > 3. As above, write X - - O ~ 1 -----
X --OL 2 =
X--
OL3 =
We need to solve this system of equations in x, yl, y2, y3 E First, we extend K so that it contains v/-~l, x/~2, sides will be squares, i.e., let zi = v/-~i yl so that x - ai 73 = ~2 -- or1 #- 0, and permuting the indices to get 71,72,
L~s. v ~ 3 . T h e n the right-hand = zi2 (i = 1,2, 3). Letting we have
zl2 - z~ : 73, -
71,
=
Z 2 __ Z 2
~=72.
Now the left-hand sides can be factored. We have, for instance, (Zl -- Z2)(ZI
"~- Z 2 ) : - "~3"
We write zl
-
z2
=
(4.2)
p3u3,
where u3 is a unit and (since Zz - z2 divides 73) where we may take P3 from a finite set. We also have Z 2 - - Z 3 ----- P l U l , Z3 -- Z1 =
/)2152.
Adding these last three equations gives plUl
-{- p 2 U 2 + p 3 U 3
== 0 ,
an S-unit equation. Hence there are only finitely m a n y solutions (u,, u2, u3) G P2(Us). We would like to know that there are only finitely m a n y possibilities for the zi (i = 1, 2, 3). T h e n it will follow that there are only finitely many solutions to the originM hypereUiptic equation in this case. So we consider zl+z2--
73 , p3u3
which in conjunction with (4.2) gives zl
=
pau3
+
p3U3
Similarly, by cyclic permutation, z2 =
plUl
+
plul
(4.3)
142 and
z3 = x
p~u2 +
p2u2
We also have directly from (4.2), (4.3) that
Z2 =
-~
paU3
-- psU3
•
Now the "Yi are fixed and we have only finitely many choices for the pi- There are finitely m a n y possibilities for (ul, u2, u3) up to equivalence in p2(Us). So suppose that we replace ui by Aui (i = 1, 2, 3). Equating the two expressions for z2 gives
( ")'1 (fllUl -[- f13U3))~ = -- P'~I
"f3 )~ PSU3 '
so A is determined (up to + ) unless plul + p3u3 = 0, which is impossible. In the next section, we will obtain estimates on the number of solutions. §5. T h e N u m b e r o f S o l u t i o n s o f E l l i p t i c , H y p e r e l l i p t i c , a n d S u p e r e l l i p t i c
Equations. Here we discuss relatively explicit bounds on the number of solutions of the various equations. These results are the joint work of Evertse and Silverman (1986). Let K be a number field of degree 6 and K × the multiplicative group of K . Let S be a finite set of absolute values which contains all of the non-Archimedean ones, i.e. Moo(K) C S C M ( K ) , and let s = card S. As above, let Ds denote the S-integers in K and Us the S-units. Consider polynomials f ( X ) E Ds[X] with discriminant A ( f ) E Us. Notice that this last requirement is not much of a restriction, since we may enlarge S to force A ( f ) E Us. Then the cardinality s will reflect the number of prime factors of A(f). In what follows, L is an extension of K with degree [L : K] = t. We will also have d > 2, and hd(L) will denote the order of the subgroup of the ideal class group of L consisting of elements [9.1] with [9/]d = 1. We will count solutions of the superelliptic equation yd = f ( x ) , (5.1) with x E Ds, y ~ 0, y E K . (Then automatically, y E Ds).
T H E O R E M 5A. (a) Suppose d > 3, n >__2, and L contains at least two roots of f . Then the number of solutions of(5.1) with x E D s and y E K* is < 17 t(6~+~) d 2ts ha(L).
(b) Suppose d = 2, n > 3 and L contains at least three roots of f. Then the number of solutions is 7t(46+9s) h2(L) 2.
143
R e m a r k . We may pick L with g < n ( n - 1) in case (a) and g < n ( n - 1)(n - 2) in case (b). Aside from the choice of L, the coefficients of the polynomial f do not enter into the estimates. In the case of an elliptic equation, one may conclude that the number of solutions is < c(¢)H 2+e, where H is the height of the equation. See Schmidt (to appear). Here we will prove a weaker form of the case (a). We will show that the number of solutions in (a) is
=< (c,-dU" hd(L). We need several lemmas first. L E M M A 5B. Suppose [ [ is a non-Archimedean absolute value on a field E. Let the polynomial f(X)
= anX n +...
- al)...(X
-I- ao = a ( X
- an),
be given with ai, ai in E and la, I _5 1 (i = 0,... , n ) , and also Ih(f)l = x where A denotes the discriminant. Then for every x E E with Izl <=1 and i # j, we have
la,
-
a,I
-- m a x (Ix
-
a,l, I~ - ajl)
= m a x ( l , lad) m a x ( l , la, I). P r o o f . Because [ [ is non-Archimedean, we have
[ai--aj[ < max(lx-aih Ix-oql)
=< max0, [a/I) max(l, laiD-
(5.2)
We also know that
IA(f)l = la"l 2"-2 I I
la, - ~.~l = 1.
(5.3)
By Gauss' Lemma (and with the notation Ifl = max la, I)
la.I fl max(l, laid
=
Ifl _5 1,
i=1 so
[anl2n-2 H
m a x ( l , I~1) m a x ( i ,
I~1) =< I.
Comparing (5.3) and (5.4) gives
H max(l' [aiDmax(1, [ajD< H I ° q - a j [ ' and this is <
I'I m a x ( l , I~,1) m~(1, laid
(5.4)
144
by (5.2). We therefore have equality everywhere, in particular in (5.2). Now let v E M ( K ) such that v ] p for a prime p. Then v extends the p-adic absolute value. In this case, the value group G~ of v consists of powers lr t (g E Z), where ~r is some fixed fractional power of p, say rr = f / * . L E M M A 5(3. Let v, ~r be as above. Suppose f ( x ) = yd, where y E K x. Then Ix - ~,lv = max(l,
where ui E Z (i = 1 , . . .
, n).
Ixl. < l,
I'~,l,)
Ifl. --< 1, I/X(f)l. = 1, and
~'~
In fact, ui = 0 with the possible exception of one value
tti o •
P r o o f . By the proof of the preceding lemma, we have n
la=l. E m a x ( 1 ,
I~lv)
= l f l , = 1.
i=1
Since f ( x ) C ( K x)d,
If(x)lv = la,,l~ fi Ix- ,~1~ e a~. i=1
Then
i=l max(l, I'~1o)
a~.
Letting c~ = Ix - , ~ l J max (1, I'~lv), we have
ficl
E G d •
i=1
Now if la, l~ > 1, then ci ~-- 1. If, on the other hand, la, l~ _-< 1 and I~jl. _-< 1, then I(x - ai) - (x - % ) Iv = lai - a j l . = 1 by Lemma 5B. So only one of Ix - a i h Ix - %l can be strictly less than 1, thus only one of ci, cj can be strictly less than 1. Therefore, ci = 1 with one possible exception, and each ci E G d. T h a t is, I X - - Olll v
m a x ( l , I,~1~)
e a~,
(i = 1 , . . . , ~)
as desired. As in Chapter III, Section 13, suppose there are t non-Archimedean elements of S. These absolute values correspond to prime ideals ~ 1 , . . . , ~ t . Given fractional ideals ~t, ~ , we write 92 =__~3 (mod S)
145
if ~/fl~ is of the type ~3~' ... ~ ' with integers c l , . . . , ct. We write ~1 - ~B (mod S, d) if ~i/~B is of the type ~ t ... a t ' E~ where ~: is any fractional ideal. Consider the congruence in the variable z given by (z) - m (mod S, d) (5.4), where (z} is the principal ideal with generator z. If z is a solution and z' = zw d, then z' is also a solution. So it is valid to count solutions z E K X / ( K X ) d. LEMMA
5 D . The number of solutions of the congruence (5.4) in g × / ( g x ) d is
< d t hd(K).
P r o o f . Suppose that there exists a solution z0. T h e n for any other solution z, we have
(z/zo} = (1} ( m o d S , d). Thus, it suffices to count solutions z of
( z / - (1) (mod S, d). Suppose z is such a solution. By the definition of the congruence relation, we have
and without a loss of generality, 0 < ci < d (i = 1 , . . . ,t). We will count solutions z with fixed c l , . . . , ct. Say zt is a fixed such solution,
(zll = v ? . . . v ? ¢ f and z is an arbitrary such solution, (z)
=
. . . V?
•
Then
(z/z,i = (¢/¢,)d . T h e ideal class of ¢ / ~ , , say [~/~,]~ has [ ~ / ~ 1 ] d = [1]. Also, if ~:, ¢1 are in the same ideal class, then ( z / z , ) E ( K × ) a. So (since we only want solutions modulo ( K × ) d) all that remains is to count ideal classes whose d th power is [1]. But their n u m b e r is hd(K) by definition. Allowing for all the possibilities for c t , . . . ,ct with 0 _-< ci < d (i = 1 , . . . ,t), we have < dt hd(K) solutions in K x / ( K x )a.
146
LEMMA
5E. Suppose that d > 3 and ~ is a fractional ideal. Consider solutions
a E K x to the paJr o f congruences
(rood ,9, d),
(a) =_ ~
(1 - a) = (1, a) (rood S). T h e n u m b e r o f such a is < (cld) 2" h a ( K ) ,
where cl is an absolute constant. P r o o f . Write a = w z ~, where w runs through a complete residue system in K × / ( K X ) d. T h e n by hypothesis,
- w z d) (-17 ~ --(1)(roodS),
(1
which m a y be written as
< 1 - wz d)
=
~?"'~?"
For a given w, we count the n u m b e r of solutions z. By T h e o r e m 13E of Chapter III, the n u m b e r of solutions is < ( c l d ) " where cl is absolute. (Sorry about the occurrence of Cl with two meanings.) But we also have (~) _= ~ (mod S, d), where w E K × / ( K × )d, and by the preceding lemma, the number of such w is < d t h d ( K ) . Therefore, the total n u m b e r of solutions is < (cld) 2" h d ( K ) . We are now ready to return to the proof of Theorem 5A, part (a). We are considering solutions of the hyperelliptic equation yd = f ( x ) = a(x -- of1 ) . . . (x
--
O~n),
wherexEOs, y E KX. In (a) we had d > 3, n > 2, and L contained at least two ]roots o f f , say a l , a 2 E L. Let S I be the set of absolute values of L which extend absolute values of S. For x E O s , put Z(
x)
--
X -- Oll . X - - OL2
For v ~ S', we have IA(f)I. = 1, so by Lemma 5B, we have 141 - ~21~ = m ~ ( I x
-- ~ x k , Ix - ~21,~)
147
and 1
0~I -- 012
xX - a l ~2 v
X
~
O~ 2
: max
(1,
v
This means t h a t for every v ~ S'
I1 - z ( = ) l ~ = m a x (1, I Z ( x ) l . ) , or in terms of prime ideals,
(1 - Z(x)) _= (1, Z(z)) (roodS'). Also, by L e m m a 5C, for v E S', we have
max (1, lail~)
(i = 1, 2),
E a~
so that IZ(=)l~
=
ma~(1, I~1~) a m a x O , Io~=1~) " 9~,
with g~ E G~. Now we have
(Z(x)) = 93(modS', d), where 93 is a certain ideal. By the last lemma, the number of possibilities for Z(x) is <=(cld) 2is ha(L), since c a r d ( S ' ) < / c a r d s = es, where e = [L: K]. §6. O n E l l i p t i c C u r v e s . Consider an irreducible polynomial equation
f ( x , v) = 0 and its associated affine curve, embedded in two-dimensional space. A point P on the curve is called a singular point if 0 Otherwise, it is called a r~on-singular point. For such a point, the equation of the tangent line at P is given by
where C is a constant. Thus, at non-singular points, we have a well-defined tangent line.
148 If the total degree of f is d, then put
so t h a t F is homogeneous of degree d. We s t u d y solutions of
F ( z , y, z) = O, where ( x , y , z ) E p2. If the point ( x o , y o ) lies on the affine curve F ( x , y ) = 0, then (xo,Yo, 1) lies on the projective curve F ( x , y, z) = 0. Conversely, if (x0, yo, zo) lies on the projective curve and z0 # 0, then ( x o / z o , yo/zo) lies on the affine curve. In other words, there is a one-to-one correspondence between points on the affine curve and points on the projective curve with z # 0. Points on the projective curve with z = 0 are called the points at infinity of the afflne curve. We say the curve is of degree d if F is of degree d. E x a m p l e . Consider y2 = f ( x ) , where f is a cubic with non-zero discriminant. We m a y write y2 = a(x - a l ) ( x - a2)(x - ha), and we see t h a t the corresponding projective curve is y2z = a(x - alZ)(X - a 2 z ) ( z - aaz). T h e points at :infinity occur when z = 0, so x = 0 and y # 0, i.e. the point (0, 1, 0) C p2. E x a m p l e . Consider the cubic T h u e equation f ( x , y ) = m and the corresponding projective curve f ( x , y) = m z a. In factored form we have a ( z - a l y ) ( x - a2 y ) ( x - aa y) = m z a. If z = 0, then x = a i y for some i E {1,2,3}, so the three points at infinity are (hi, 1,0), (a2, 1,0), (as, 1, 0) • p2. We will also need to talk a b o u t lines in projective space. Recall, a line in affine space is given by the equation ax + by + c = 0, where a ancl b are not b o t h zero. In projective space, this becomes ax + by + cz = 0, where we require that a, b, c are not all zero. T h e additional line which appears, i.e. the line z = 0, is called the line at infinity. Recall, in affine space, we said the singular points on f correspond to (x,y)
=
= 0
and
0.
In projective space, we have (recall t h a t F was homogeneous of degree d) dF(x,y,z)
xOF . yOF . zOF . -- --~-x ( X , y , z ) + --~-y ( X , y , z ) + - - ~ z ( X , y , z ) ,
so t h a t singular points m a y reasonably be defined by OF OF x, OF y, z) = (X,y,z) Ox (z, y, z) =
= 0
It is a simple l e m m a to prove t h a t if an affine point is non-singular, then the corresponding projective point is also non-singular, and conversely'.
149
Given a projective curve, we could specialize any of the variables to 1 to obtain corresponding affine curves. This is illustrated by the diagram here. afflne curves
F ( x , y, 1) = o
projective curve F ( x , y, z) = 0
F ( x , 1, z) = 0 F(1,y,z)=O
We may use these affine curves to study properties of the projective curve. E x a m p l e . Consider the curve y2 = f ( x ) with discr(f) ~ 0. Does it contain any singular points? If so, there is a solution to 2y = 0, f ( x ) -- 0, i f ( x ) = O, but this can not happen since discr(f) ¢ 0. W h a t about the point at infinity? Is it a singular point? The projective curve is given by y 2 Z -~- a ( x -- o Q z ) ( x
-- o ~ 2 z ) ( x -- o/3z )
and the point at infinity is (0,1,0) • p2, as seen earlier. We check the partial with respect to z, and we see that
which does not happen at (0,1,0) • p2. So the curve is non-singular, i.e. it has no singular points. E x a m p l e . The projective cubic Thue equation f ( x , y) = m z 3 is also non-singulax. Bez6ut~s T h e o r e m . I f C1, C2 are projective curves of degrees dl, d~, respectively, then the totaJ n u m b e r of their intersection points (counted according to multiplicities which axe not defined here) is did2. In the special case of a cubic curve intersected with a line, the number of points of intersection is 3. Actually, it is not necessary to use the general version of Bez6ut's Theoerem to get this result. We return to the examples once again, considering intersections with particular lines. E x a m p l e . Consider the cubic equation y2 = f ( x ) and the line at infinity, z = 0. Their intersection is the point (0, 1, 0) in p2, so this point must have multiplicity 3. E x a m p l e . Consider the cubic f ( x , y) = m z 3 where f = a(x - a , y ) ( x - a2y)(x day) and the line x = aiy for i • {1, 2, 3). These intersect at (ai, 1, 0), which axe all triple points of intersection. Now suppose that C is a non-singulax curve. A divisor is a formal sum D=
Z PEC
c(P)P
150
where c(P) E Z and c(P) = 0 for all but finitely m a n y points P. The divisors of C form a group, denoted by Div C. For D a divisor, we let degD=
c(P)-
E PEC
Example.
l For instance, D = 3P1 - 5/°2 is a divisor with deg D = - 2 . Consider the afflne line, i.e. the curve C which is the affine line. The rational functions on C are r(x) = a(x)/b(x) where a, b are polynomials in x. At any a E C, we can expand r(x) into a Laurent series, say OO
t~m
where (when r # 0) we may suppose that cm # O. We say o r d a r = m. We put orda0 = +oo. Consider an affine curve C. A rational function on C is by definition a rational function r(x, y) e C_~x,y) whose denominator is not identically zero on C, with two functions r, s considered equal if they coincide on C. Now consider a curve C C p2. We would like to define rational functions on C. They would have the form
r(x, y, z) -- b(x, y, z) ' where a, b are homogeneous polynomials of equal degree and b(x, y, z) is not identically zero on C. We will consider two rational functions on C as equal if they coincide for every point of C where their denominators are not zero. E x a m p l e . Consider the curve y2 = We m a y rewrite r as r(P) = y
f(z). Then r(P) = y is a rational function. + y2
-
f(x).
If the curve is interpreted to lie in p2, then we may write r ( P ) ~-. y = Y z d - 1 -4-y2zd-2 - - f ( x / z ) z z zd
d
151 where d = deg f . Now suppose t h a t r is some rational function on C and P is a point on C. If P is non-singular, then there exists a tangent line to C at P . Near P , everything can be expressed as a function of a local p a r a m e t e r t at P , as illustrated here. t
C
For points on C near P , the function r m a y be written as a function of t, and r has a Lanrent series in t. We let o r d p r denote the order of this Lanrent series. E x a m p l e . We return to the curve y2 = f ( x ) = a ( x - a l ) ( x ai are real, then we could have the following picture.
- a2)(x - a s ) . If the
x
For P on C, take r ( P ) = y as in the previous example. T h e n r vanishes only at Pi = ( a i , 0 ) . Since the tangent lines are vertical, the local p a r a m e t e r is y itself, and ordpir = 1 (i = 1,2,3). E x a m p l e . Consider the above curve with r ( P ) = x - a l for P = (x, y). T h e function r vanishes at P1 = ( a l , 0). T h e r e the local p a r a m e t e r is y. So we need to find the Laurent series for x - a l in the variable y. We have x - ~1 = c 2 y 2 + c 4 y 4 ~ . - . , and so o r d p 1(x - a l ) = 2. So far, we have only considered o r d p r for non-singular a ~ n e points. By homogenizing the curve and the rational function r, we m a y consider non-singular projective points as well. E x a m p l e . R e t u r n to the curve y2 = f ( x ) = a ( x - a l ) ( x - a2)(x - c~s) and the rational function r ( P ) = y. We homogenize to y 2 z = a ( x - ~ l z ) ( x - a 2 z ) ( x - a 3 z ) and r ( P ) = y / z . Consider the point at infinity P ~ = (0, 1,0). T h e n o r d p ~ ( y / z ) = - o r d p ~ ( z / y ) . We consider the affine curve with y = 1, t h a t is, z = a ( x - ~1 z ) ( x a2z)(x - a s z ) , and the corresponding point at infinity P ~ = (0,0). We can now
152 d e t e r m i n e ordPoo(z). If g ( x , z ) i s the p o l y n o m i a l of this afflne c u r v e , t h e n ~ at Poo.
/
/
= 1
4 z
So x is a local p a r a m e t e r , a n d we can e x p a n d z = "/3x 3 + . . . . above.) T h e n o r d p ~ z = 3 a n d ordpco ( y / z ) = -ordp~o (z) = --3. C o m b i n i n g this with w h a t we saw above, we have 1 ordpy =
= 0, ~
(See the figure
i f P = Pi,
- 3
if P = Poo,
0
everywhere else,
a n d we see t h a t -'~ordpy = 0 . PEc This is, in fact, t r u e in general. THEOREM. curve C, then
I f r is any non-zero rational function on a non-singular projective ~--~ o r d p r = 0 . PEc
This t h e o r e m will not be p r o v e d here. See e.g. D e u r i n g (1973). Earlier, we h a d i n t r o d u c e d the g r o u p of divisors, Div C, for a n o n - s i n g u l a r curve C. Given a n o n - z e r o r a t i o n a l f u n c t i o n r on the curve C, we associate a divisor b y Divr = ~ (order)P. PEC T h e n b y the t h e o r e m , d e g ( D i v r) = 0. Example. For the curve y2 = f ( x ) a n d r ( P ) = y, as. above, we have Div y = PI + P~ + P3 - 3Po~. A divisor D is called principal if D = Div r for some ratiional f u n c t i o n r. W e h a v e D i v ( r s ) = D i v ( r ) + Div(s). We have the following inclusions. g r o u p of divisors
ul g r o u p of divisors of degree 0
ul g r o u p of principal divisors.
153
Now let D, E be two divisors on a non-singular curve C. Say
E=
Z
c*(P)P,
PEC
and D as above. We write D > E if c(P) > c*(P) for every P e C. This gives a partial ordering on the group of divisors. For a divisor D, we write I~(D) to denote the set of rational functions r with D i v r > - D. E x a m p l e . Let C b e t h e x-axis a n d P 1 = 1,P2 = 2. Let D = 2 P 1 - P 2 . Then £ ( D ) consists of all rational functions r with Div r > - 2P1 + P2- This allows a pole of order at most 2 at P1 and requires a zero of order at least 1 at P2. Then, since no pole is allowed at c~, 12(D) consists of all r of the form (z -- 2)(az + b) ( z - 1) 2
(a,b E (2)
which shows that it is a vector space over C of dimension 2. In general, t2(D) is a vector space over C. We let ~(D) = d i m £ ( D ) . It is a consequence of the R i e m a n n - R o c h Theorem (see Deuring (1973)), there is a unique non-negative integer g = g(C) such that if deg D > 2g - 2, then g(D) = (deg D) - g + 1. This g is called the genus of C, and it turns out to be the same as the topological genus which we mentioned earlier. E x a m p l e . Let C be the x-axis. Then it is easily seen that g = 0. For P1, P2 on C, let D = 2 P 1 - P 2 , sodegD = 1 > 2g-2. Theng(D) = 1-0+1 = 2 , as we had determined earlier in a special case. If D, E are divisors, we say D ,,~ E if D - E is a principal divisor. Notice that for D, E to be equivalent, it is necessary that d e g D = deg E. E x a m p l e . Let C be the x-axis. If P, Q are any two points on C, then ( P ) ,,~ (Q). For suppose that P = a, Q = fl, and both are finite. Then take r(z) = (z - a ) / ( z - fl). Or if P = a, Q = e~, then take r(z) = z - a. In fact, for any curve C with g = 0, two points P, Q on C are equivalent. To see this, let D = P - Q . Then deg D = 0 > 2 g - 2 , and we have/~(D) = 1 by the consequence of the R i e m a n n - R o c h Theorem which was mentioned above. Hence there exists an f with o r d p f > 1, o r d Q f > - 1, and o r d R f > 0 otherwise. Since }'-~Rec o r d n f = 0, the inequalities are all equalities. Then D = Div f and ( P ) ~ (Q). Now consider the case where g = 1. Take D = (Q). Then d e g D = 1 > 2 g - 2 and then £(D) = 1. Up to multiplication by C, there exists one function r which has at most a pole at Q. In this case, £ ( D ) consists only of constants, so ( P ) ~- (Q) is the same as P = Q. L E M M A 6 A . Let C be a nonsingular curve with genus g = 1. Let 0 be a point oi2 C. Given a divisor D with deg D = 0, there is a unique P E C with
(P) - ( 0 ) ,,, D.
154
P r o o f . Let D I = D + (O) so t h a t deg D t = 1 > 0 = 2g - 2. By the R i e m a n n - R o c h T h e o r e m , g ( D ' ) = d e g D ' - g + 1 = 1, so there exists a function f with D i v f > - D ' . Since deg D i v f = 0 = 1 - 1 = 1 + d e g ( - D ' ) , there is a point P with D i v f = - D ' + ( P ) = - D - (O) + (P). But then ( P ) - (O) ,,- D. T h e point P is unique, for if we h a d solutions P and P ' , then ( P ) ,,~ ( P ' ) and thus P = P ' by previous work. Let C be a nonsingular curve of genus 1, and let O be a point on C. T h e n C together with (O) is called an elliptic curve. Let ~ o be the group of divisors of degree 0 and ~ p the subgroup of principal divisors. By the preceding lemma, there is a 1 - 1 correspondence between the points P on the curve and elements of the factor group ~ o / ~ p . (This factor group is called the Picard group). Namely, P corresponds to the class of ( P ) - (O) modulus ~ p . Since ~ o / ~ v is a group, this induces a group structure on C. Let P1 + P2 be the sum of points, as defined in this way. T h e n (P1 + P2) - (O) --- ( P l ) - (O) + (P2) - (O). T h u s P1 + P2 is the unique point with (P1 + P2) + (O) ,,~ (P1) -4- (P2). T h e point O, which is called the base point, is the zero element of the group. Concerning the s u m of n points, it is i m m e d i a t e that (P1 + . . . +
Pn) - (O) ,~ (P1) - (O) + . . . + (Pn) - (O).
Our group law depends on the choice of the base point, but at any rate the group is isomorphic to ~ o / ~ p . Given points O, O', the canonical isomorphism g between the elliptic curves C(O), C(O') with respective base points O, O' is given by
(P)- (0),
, (g(P))- (0').
It m a y be shown t h a t a nonsingular cubic curve has genus 1. Given a cubic curve C and two points P1, P2 on C, we would like to find a function f with D i v f = (Pl + P2) + (O) - ( P , ) - (e2). T h e n f has a zero of order 1 at P1 + P2 and 0 and poles of order 1 at P1,P2. P1 # P2, we m a y have the following picture.
For
155
l +t2
(Determine PiP2 as the point of intersection of C with the line ~ through P1, P2- Then determine ~' as the line through O, P1P2. Finally P1 + P2 is the point of intersection of C and ~'.) If ~ is given by the linear form L(x) = 0 and ~' is given by L'(x) = 0, then f ( x ) = L'(x=)/L(x=) has the desired properties. In the case where P1 = P2 (= P, say) we take ~ to be the tangent line to C at P. Again we have f(x_) = L'(x=)/L(x_).
fpp
2P
156
In the case where P2 = O, t a k e ~ = 2,', a n d f(_x_) = 1.
p~O £=£,
We may also use this graphical technique to find - P for P on C. We draw the tangent line to C at the base point O. Call its third intersection point R. T h e n draw the line through P and R. Its third intersection point is - P . This is illustrated here.
P
In the special case where 0 is a triple point on its tangent line, we have 0 = R and the picture simplifies.
157
In this case we have another nice result. We have P + Q + S = 0 if and only if P, Q, S are eolinear.
PQ = - ( p + Q)
Let C : y2 = f ( x ) , where f is a cubic with distinct roots. T h e n we take 0 = (0, 1,0), which is a triple point of the line at infinity. Since the lines through this point at infinity are simply vertical lines, our picture looks like this.
~/(p + Q)
\
P+Q
158
Now suppose that our elliptic curve is a cubic of genus 1. Suppose also that it is defined by an equation with rational coefficients and that the base point O is rational. If P, Q are rational points on the curve then PQ is rational. Then, since O is rational, we have P + Q rational. Thus given a rational point on our curve, we can generate other rational points. Let E(C) denote the group of all complex points on an elliptic curve E . Let E ( Q ) denote the group of all rational points on E. Then E ( Q ) is a subgroup of E(C). THEOREM ( M o r d e l l - W e i l ) . The group E ( Q ) is finitely generated. T h e theorem as stated here is actually due to Mordell (1922), while Weil has generalized it. By the theorem, we know that E(Q) ~_Z ~ ... ~ Z @ Torsion, --r times-, where r is the rank of E(Q), and the torsion part is finitely generated. Curves with rank as high as 14 are known. The conjecture is that there exist curves of arbitrarily large rank. There are, on the other hand, only finitely many possibilities for the torsion part. THEOREM
(Mazur).
There are exactly fourteen groups which may arise as the
torsion part of an elliptic curve. Let E C p2(Q) be an elliptic curve. T h e n every point on E may be represented as ( x ( P ) , y(P), z(P)), where x ( P ) , y(P), z(P) E Z are relatively prime. This representation is unique up to sign. T h e Mordell-Weil height is defined by
ho(P) = l o g ( m a x ( [ x ( P ) l , [y(P)[, [z(P)[) The Neron-Tate height is given by
h(P) = lim n---+ OO
h0(2nP) n
THEOREM. The limit above exists, and h(P) = 0 if and only if P is a torsion point. If P is non-torsion, then
ho(P) < c(E)h(P), where c(E) is a constant depending on E. Furthermore, h(P) is a quadratic form on E ( Q ) , where a quadratic form is defined as below. A quadratic form on an abelian group G is a real-valued function f such that for any P, Q in G we have
f ( P + Q) + f ( P - Q) = 2 f ( P ) + 2 f ( Q ) The theorems generated in this section will not be proved here. For more on elliptic curves, see Silverman (1985).
159
Exercise 6a. Let P 1 , . . . , Pk be in the abelian group G. Then k aij n i n j , i,j=l
where the coefficients aij depend on P1,... , Pk.
§7. The Rank of Cubic Thue Curves. Consider the cubic Thue equation F(x, y) = m,
where F is a homogeneous cubic polynomial in two variables with no multiple factors. The genus g = 1. This equation has the homogeneous form F ( x , y) = m z 3.
If the corresponding elliptic curve contains at least one rational point, then we can study the group of rational points Era(Q). THEOREM rank E,n0 (Q) = > 1.
7A. Given any such F, there is an integer m o >
0 such that
Our proof, which comes later, will follow the work of Silverman (1983). PROPOSITION has rank 3.
7B. The group of rational points on the curve x a -b y3 : 657z a
This special result is not proven here. See tables compiled by Stephens (1968). The following result of Silverman (1983) shows that there exist Thue equations with rank at least 4. PROPOSITION
7C. For certain cubics F and certain values of m, we have rankEm(Q) = > 4.
What about the torsion part of E(C)? Given a natural number n, there points P E E(C) with n P = O.
are
n 2
L E M M A 7D. Suppose E(Q) is as above. Given any integer d => 1, there are only finitely many points in E(C) which are torsion points and whose coordinates generate an algebraic number tleld of degree no greater than d. In other words, the set
U L number field [L:Q] < d
E(L)torsion
160
is t~nite. T h e proof follows Silverman (1983), but first we need some preliminaries. Suppose E ( Q ) and a p r i m e p are given. By reduction of the curve modulo p we m e a n the following. Consider the corresponding equation f(x, y) = O, which is an equation over Z, and reduce the coefficients modulo p to obtain the new equation ~(x, y) = 0 over the finite field Fp with p elements. We say t h a t we have a good reduction if the new curve is also an elliptic curve. In the previous section, we considered the equation y2 = f(x), where f was a cubic polynomial over Z with distinct roots. In this case if A f ~ 0, we have a good reduction. We also considered the cubic T h u e equation F(x, y) = m, with certain restrictions on F. Here, if A F ~ 0 and ~ ¢ 0, we have a good reduction. In these cases, if p is a sufficiently large prime, we will get a good reduction. This is in fact true in general. Now let P = (x, y) be a rational point on E. If neither x nor y has a factor of p in its denominator, then consider 7, ~. We have 7 ( ~ , ~ ) = 0. If p does occur in the d e n o m i n a t o r of either x or y, then homogenize the point P to get (x, y, z) E Z 3 with gcd(x, y, z) = 1. T h e n take the point (7, ~,3). E x a m p l e . Consider the equation 43x 3 - 2y 3 = 1, the point P = ( 1 / 3 , 2 / 3 ) , and the p r i m e p = 3. Homogenize to get P = (1, 2, 3) on the curve 43x s - 2y 3 = z 3. T h e n = ( 1 1 - 1 , 0 ) satisfies x 3 + y3 = z 3. P r o o f o f L e m m a 7 D . Let E ( Q ) and d > 1 be given. Choose a p r i m e p such that E has good reduction at p. Let L be any field with [L : Q] = d, and choose a prime ideal ~ of L lying above p. Take the curve E(L) to be the set of all points on E with coordinates in L. Reduce this m o d ~ to get E ( ~ v ), where ~V = D / ~ is a finite field and D is the ring of integers in L. From algebraic n u m b e r theory, we have card ~V ~ pd. So we have a m a p
E(L)
~ E(~ v ),
where EC~ v ) is the set of all points on the reduced curve with coordinates in ~ . Now let E(L)p denote the "prime to p torsion" of E(L) consisting of points P with the p r o p e r t y t h a t rnP = 0 for some integer m with p ]/ m. T h e n the m a p
E(L)p is injective. ) c
, E(~ v )
(For a p r o o f of this fact, see Silverman (1985, p.176).) ), so we have card(S(~v
)) __< ( c a r d e r ) 2
+ (carder)
We know t h a t
+ 1.
Using the injective m a p above and the bound on card ( ~ V ) gives a b o u n d card E(L)p < B(p, d, E). Given another p r i m e q with good reduction, we have card E(L)q < B(q, d, E).
161 But Wor E ( L ) C_ E ( L ) p ÷ E(L)q, so then card (Tor E ( L ) ) < B(d, E). T h e n for P e T o r E ( L ) , we have m P = 0 for some integer m with m < B ( d , E). Here m may be different for each point P , but for fixed m, the number of such points P is no greater t h a n rn 2. So the total number of such points P , i.e. for any rn < B(d, E), is <
(B(d,E)!) 2,
and the result is proved. Before giving the proof of Theorem 7A, we state one further lemma. L E M M A 7 E . Suppose F ( x , y ) = m z 3 is given where F ( x , y ) = a(x - oqy)(x o~2y)(x - (~3Y). T h e points ((~, 1, 0) (i = 1, 2, 3) are triple points on the curve.
-
-
P r o o f . T h e lemma is true, since on the line x = (~iy, the only point on the curve has z =O.
P r o o f o f T h e o r e m 7A. Given F , start with the curve E:
F ( x , y ) = z a.
162
As illustrated, the point P~ = (t, 1, F~V / ~ , I ) ) , with t E Z, FX/-~,I a ) • 0 lies on E. Now take the new curve Z, : F(x,y) = F(t, 1)z 3,
Pt
which contains the point Pt = (t, 1, 1). We have a mapping
E
,Et:(x,y,z),
,(x,y,z/~,l)),
and it is a group isomorphism if O = ( a l , 1,0) is the base point. We consider E as a curve over K = Q ( a l ) . T h e n P,' has coordinates in a cubic extension of K . By L e m m a 7D, for all but finitely many t, the point P,' is non-torsion on E. So except for finitely many t, the point Pt is non-torsion on Et. Unfortunately, the base point O is not defined over Q, but we want to consider rank Era(Q), where m = F(t, 1). So we take Pt to be the new base point of Et, and let Qt be the third point of intersection of the tangent line to Et at Pt. Now Pt, Qt E Et(Q), and with respect to this new base point, we claim that Qt is non-torsion. For suppose that Qt were torsion with respect to the base Pt. T h e n nQt = 0 for some integer n, which we write as an equivalence
n(Qt) ~ n(P,).
(7.1)
Since the original base point O is a triple point, we have Qt = - 2 P t , in the group operations with respect to base O. We write this as
(Or) + 2(t,,) ~ a(o). Then
n(Q,) + 2n(Pt) ~ 3n(O). Combining equivalences (7.1) and (7.2) gives 3n(P,) ~ an(O), so Pt is torsion with respect to the base O, a contradiction.
(7.2)
163
§8. L o w e r B o u n d s for the N u m b e r
o f Solutions o f Cubic T h u e E q u a t i o n s .
In (1933), Chowla studied equations of the form x a - ky a = m, Suppose k ¢ 0 is given and let Z ( m ) denote the n u m b e r of solutions to the equation. T h e n Chowla proved t h a t
z(m) = ~k(log log m)) as m --* oo. In other words, Z ( m ) > ck l o g l o g m for infinitely m a n y values of m. Mahler (1935) studied the more general cubic T h u e equations F ( x , y) = m. He showed t h a t
z~(m) = nF((log m)1/4). T h e latest result, which we state here, is due to Silverman (1983). THEOREM 8 A . Suppose F ( x , y) is a form of degree 3 without multiple factors. Suppose there is some integer mo • 0 such that F ( x , y) = m o has a rational solution and that the corresponding elliptic curve has Mordell-Weil rank R > O. Then i f Z F ( m ) is the n u m b e r of solutions of F ( x , y) = m , we have
z~(.~) = ~((log.~)R/('+2)). In the preceding section, we showed ( T h e o r e m 7A) t h a t there always exists an m0 with r a n k R __> 1, which gives the following result. COROLLARY
8 B . For any cubic form F as above, we have
zv(m) = av((log m)l/~). COROLLARY R = 3, so
8 C . Suppose F ( x , y) = x 3
+ y3. Then we can use m0 = 657 a n d
zF(m) = ~(0og m)3/5). COROLLARY
8 D . For certain cubic forms F, we can f~nd mo with R >=4, so
z~(m) : ~ ( ( l o g m)2/3). Corollaries 8C and 8D follow from Propositions 7B and 7C, respectively. tGiven f(m),g(m) > 0, we say f(m) = ~2(g(rn)) if there exists a constant e > 0 such that f(m) > cg(m) for infinitely many values of m.
164
Remarks (i) Nothing like this is known for deg F > 3. It is quite possible that in this case there are bounds which are independent of m. It is also possible that Siegel's conjecture (Ch. III, first paragraph of §7) is true for curves of genus g > 1. (ii) T h e function Z f ( r r t ) c o u n t s primitive as well as non-primitive solutions of F(x, y) = m. If PF(m) denotes the number of primitive solutions, then for all that we know, it is possible even in the cubic case that P F ( m ) is bounded independently of m. (iii) T h e numbers m in the proof of the theorem will be of the type m = mog 3 where g is large. It is conceivable that the number of solutions is bounded as m runs through cube-free integers. P r o o f . Let E0 be the curve given by the homogeneous equation
F(x, y) = moz 3 . For points P on E0, write (x(P), y(P), z(P)), where x(P), y(P), z(P) are co-prime integers. By hypothesis, this curve has Mordell-Weil rank R > 0. So there exist points P1,... ,Pn on E0(Q) which generate a free abelian group of rank R. Let an integer N >= 1 be given and consider the set ~ ( N ) of points niP1 -4- ... -4- nRPR, with 0 < n~ = N
(i = 1 , . . . ,R).
T h e n card ~ ( N ) = N R and all of the points of ~ ( N ) lie on the curve E0. If the base point O is chosen appropriately, then all of the non-torsion points P will have z(P) ~ O. This is true if no root o~i of F(x, 1) is rational. If some root a j is rational, then put O = (o~j, 1, 0), which is a triple point of intersection (of the curve and its tangent at O). Then if z(P) = O, we have f(x(P), y(P)) = O, and thus x(P) = o~y(P) for some root o~. Again P is a triple point of intersection. T h e n 3 P = O, a contradiction. Now that we know z(P) ~ 0 for the non-torsion points, we put m = re(N) = m0
(8.1)
H
z(P)3"
Pe~5 (Y)
We know that
( x ( P ) y(P) F \z(P)' z(P))
TI20
for P E ~ ( N ) , and therefore
F ~z(P)Qes
(N)
Y(P)
Qe6 (N)
)
qe6 (g)
Therefore we have at least N n integer solutions to f(x, y) = m.
165
Thus we need to obtain a lower bound for N R in terms of m. For a non-torsion point P = (x(P), y(P), z ( P ) ) we have
loglz(P)l < ho(P) = log(max(Ix(P)], ly(P)I, Iz(P)l))
<=ch(P), where ho(P) is the Mordell-Weil height and h(P) is the Neron-Tate height. Notice that the last inequality holds since P is non-torsion. For the point P = nlPa + . . . + nRPR, we have R
loglz(P)[ _-
~ PE6
loglz(P)l (N)
~=3c**N R+2 + log Ira0 ]. Combining this with our previous estimate for ZF(m), we get
ZF(m) ~ N R ~
C***(1og[ml) R/¢R+2),
as desired. R e m a r k . Since, as we have seen in Theorem 1C of Ch. III, the number of solutions of IF(x, y)[ _< m is of smaller order of magnitude than m, the average number of solutions of F(x, y) = m, as m varies, is zero. §9. U p p e r B o u n d s for R a t i o n a l P o i n t s o n C e r t a i n E l l i p t i c E q u a t i o n s in Terms of the Mordell-Weil Rank. Consider the Neron-Tate height h(P) of points P E E(Q). We remarked in section 6 that (i) h(P) > 0 if and only if P is non-torsion (ii) h(P) is a quadratic form. Suppose that Q is a torsion point. Then for any point P, and for n , m E Z,
h(nP + mQ) : an 2 + bnm + cm 2.
166
If gQ -- 0 where g is a positive integer, then h ( n P + mgQ) -- h(nP), so that bnmg + cm2g 2 -- 0 for m E Z, and therefore b = c = 0. In particular, h(P + Q) = h(P), or in other words, h ( P ) = h(P') if P - P ' is a torsion point. Therefore h ( P ) is defined on the factor group E ( Q ) / T o r s i o n . Say rank E ( Q ) = R and E ( Q ) = ZP1 @ . . . @ ZPR ® Torsion. Then h(nlP1 + . . . + nRPR) = Einj=l aij 7~i nj. By (ii), the quadratic form on the right here is positive if n l , . . . , nn lie in Z and are not all 0. In fact, it is known that this quadratic form is positive definite, i.e. it is positive if n l , . . . , nR lie in R and are not all 0. It is easily seen that this property does not depend on our choice of the base points P 1 , . . . PR, and it is usually expressed by stating that (iii) h ( P ) is a positive definite quadratic form on E(Q)/Worsion. A consequence is that h ( T t l P 1 -~- . . . --[-n R P R ) )> Cl(n 2 - ~ . . . -~ r ~ )
where c 1 ) 0. T h e n u m b e r of integers n l , . . . , a n with h(nlP1 + ... + nRPR) <= ~ is --< c2 (R/2 + 1. From Mazur's Theorem, card(Torsion) =< 16. As a consequence, the number of points P E E ( Q ) with h(P) <=~ where ~ >-_1 is
< c3(E) ~n/2. Now we will consider elliptic equations of the form
y2 = x 3 + D and
EmD :
y~ = X3 + EmD,
where D is given and m varies. Recall that h0 denotes the Mordell-Weil height. THEOREM 9 A . Let D 7~ 0 and ~ >= 1 be given. If m >=too(D) and m is sixth power ffee, then the number o[ P E EmD(Q) with
ho(P) < ~ l o g m is less than 16. (c4 V/-~)ra"k E'D (@ ,
where c4 is an absolute constant, In earlier sections, we used the Weierstrass type equation y2 = f ( x ) , where f ( x ) is a cubic polynomial. Here it will be necessary to use the more general Weierstrass form y2 _{_ a l x y ~- a a y =
x 3 -{- a2x 2 + a4x -{- a6.
167 It is easy to see that we m a y transform the second form into the first by the change of variables y' = y Jr ( a l x / 2 ) + (a3/2). In general, transformations of the form (9.1)
y=u3y ' +u2sx ' +t,
x=u2x ' +r,
where u ¢ 0, will transform a generalized Weierstrass equation into another equation of the same type. In fact, it can be shown that these are the only "rational" transformations with this ]property. We say that two Weierstrass equations over Z are equivalent if they ave related by a transformation of the type (9.1). Each general Weierstrass equation W is equivalent to a Weierstrass equation y2 = f ( x ) . We then set A ( W ) = 16 discr. (f); it is easily shown that this quantity depends on W only. If W has integral coefficients, f no longer necessarily does, but it m a y be shown that A ( W ) will be integral. Among all equivalent equations with integer coefficients we may pick one where ]A(W)[ is minimal. This discriminant is called Amln, and it may be associated with a Weierstrass equation of the more general form. One may check the algebra to see that A ( W ) = u12A(W ') for W, W' related by a transformation as above. Thus if some A ( W ) is not divisible by any twelfth power, then we know that amin = Z~(W). E x a m p l e . Consider our equation y~ = x z + D. Then (see below) A = --16.27D 2. If W : yZ = x 3 + a6b, then A ( W ) = -16.27a12b 2. Now let y = a3y I, x = a2x I. Then we have W t : y,2 = x,3 + b, and A ( W ' ) = - 1 6 . 2 7 b 2. LEMMA
9B. Consider the curves E D : y2 = x 3 _~ D
and E m D : y2 = x 3 + roD. I f m is sixth p o w e r free, then log Amin(E,~D) >= 2log I m l - 10log [6D I.
P r o o f . In the example, we said that A ( E D ) = --16.27 D 2 .
168
We m a y check this by considering the roots of f ( x ) = x 3 + D. T h e y are a l = 0~, 0~2 = a<, as = a< 2, where (~ = a ~ and < is a primitive cube root of unity. T h e n ((3~ 1 -- Ol2)(O~ 1 -- Ot3)((:~ 2 -- Ol3) = O~3(1 -- <)(1
-- <2)(<
__ <2)
= - D ( 1 - ~)a(1 + <)( = D(1 - <)a = D(1 - 2( + (2)(1 - () = - D ( 3 < ) ( 1 - <) --- - 3 D ( < - <2) = - 3 D ( 2 < + 1) = - 3 v ~ i
D,
since < = ( - 1 + iv/3)/2. T h e n discr f = - 2 7 D 2. We also have
A(EmD) = - - 1 6 . 2 7 m 2 D 2. For any t r a n s f o r m a t i o n allowed, we would have A = g12AI, where £ E Q*. So we get - 1 6 . 2 7 m 2 D 2 = A(EmD) = ~,12Amin(ZmD). Write rn = m l m 2 , where rnl is a product of primes p with p { 6D and rn2 is a p r o d u c t of primes p with p [ 6D. Since m is sixth power free, we have
I ami.(EmD) and m2 ~ (6D) 5. Then
Amin(EmD) ~: rn21 , and the result follows by taking logarithms and applying the inequality for rn2. LEMMA
9 C . Consider
ED: y2=xa+D. If P is a non-torsion point on ED, then h(P) > c5 log Amin(ED), where c5 > 0 is absolute. This result is due to Silverman (1981). Lang has conjectured t h a t the result is true for m o r e general elliptic curves in Weierstrass form. L E M M A 9 D . Let ho denote the Morde11-Wei1 height and h the Neron-Tate height. Then for P on a curve E in general Weierstrass form, [h(P) - h0(P)[ _-< c6 log [A(E)[.
169
This result is due to Zimmer (1976). P r o o f . (of Theorem 9A.) Recall, we want to count the number of P E EmD(Q) with
ho(P) < ( l o g m , where ~ and D are given. By Lemma 9D, we know that this is less than or equal to the number of P E E,~D(Q) having
h(P) < ¢logm + c6 log
(Emo).
Write P E E(Q) as P = P ' + P " where P " E Torsion and pt = niP1 + . . . + nRPR. By Mazur's Theorem, we have card (Torsion) < 16. Combining this with L e m m a 9B, we see that the number of P being counted is no greater than 16. card{P' e EmD(Q) : h(P') < ~ logm + 2c6 log m + cT(D)}. Now we are back to a problem in the Geometry of Numbers. We know that h(P') is a positive definite quadratic form F in R variables, and we need to count the number of points P~ with h(P ~) < v = ~ log m -E 2c6 log m -E cr(n). We also have for nonzero P':
H(P') >=c5 log
=
say, by Lemma 9C. We use exercise 2b of Chapter I. Let ]C be the set of all _x with F(x_) _<_ 1. We know that every integer point _x_ ~ 0 has F(__x) => vl. So the first minimum A~ satisfies .~l ~_- v f ~ • We count the number of points x_ with E(_x_) < v, i.e. the number of integer points in the set v/viC. By the exercise, the number of such points is =
-El
Since for m > mo(n) we have v __<(~+2c6+1) l o g m < cs ~ logm, and/11 ~_--C9 logm, we have V~/Ul < clo ~1/2, so that we obtain
(2C10 ~1/2 .Or I)R 5 (C4 ~l/2)R . Theorem 9A follows.
§I0.
Isogenies.
Let C1, C2 be curves in (2. We want to discuss maps from C1 to C2. A rational map is a map given by rational functions ¢, ¢ on C1 such that whenever (x, y) E Ca and ¢, ¢ are defined at (x, y), then (¢(x, y), ¢(x, y)) E C2. However, it is better to think of C1, C2 as curves in •2. Then we consider maps of the form
(x,y,z)
(¢l(x, y, z), ¢2(x,v,z),
y, z)),
170 where ¢1, ¢2, ¢3 are homogeneous polynomials of equal degree such that for (x, y, z) E C1 we don't have (¢1, ¢2, ¢3) identically (0, 0, 0), and we have ( ¢ l ( x , y , z ) , ¢2(x, y , z ) , ¢3(x, y, z)) E C2 whenever (¢~, ¢2, ¢3) # (0, 0, 0). More precisely, we consider equivalence classes of such functions (¢I, ¢2, ¢3). One says ~ ( ¢ 1t' ¢ ~ ' ¢t 3 ) t
(¢1,~2,¢3)
if the matrix
has rank =< 1 for every x_ E C1. E x a m p l e . Let C1 : p1 The rational map t :
C2 :
and
;
(
x 2 "q- y2 =
1.
t 2 + 1' t 2 T
can be viewed as the map (t,w) ~
(2tw, t 2 - w 2, t 2 + w2).
E x a m p l e . Let C1 : x 4 + y 4 = 1
and
C2: u 2-t-v 2 = 1 .
A map which takes C1 into C2 is
A non-constant map has a degree 6 which is defined as the largest integer such that points on C1 are mapped into a single point on C2. In the first example, one could find an inverse map, so a = 1. In the second example, 6 = 4. F a c t . If C1,C2 are non-singular curves in p2, then a rational map C1 ~ C2 is necessarily defined everywhere. If C1, C2 are irreducible and the map ¢ is not constant, then __¢is onto C2. (See, e.g. Silverman (1985), Ch. II, Prop. 2.1 and Theorem 2.3). An isogeny of elliptic curves El, E2 with respective base points O1,02 is defined as a non-constant rational map _¢ : E1 ~ E2 with _¢(01) = 02. By what we just said above, such a map is onto. THEOREM and E2.
IOA. An isogeny is a homomorphism of the groups belonging to E1
171
How would one begin to prove this? We have ( P + Q) + (O1)
"
(P) + (Q),
where ~,, is an equivalence relation on the divisors of El. We would need
(_¢(P + Q)) + (_¢(o~)) ~ (_¢(P)) + (i (Q)), where ,,~ is now the equivalence relation on divisors of E2. So we have a rational function f on E1 such that D i v f = (P) + (Q) - (O1) (P -~- Q), - -
and we need a rational function f . on E2 with D i v f , = (_¢ (P)) + (¢ (Q)) - (_¢ (O2)) - (_¢(P + Q)). If D = ~
c,(Pi)
is a divisor of El, then let t
i=1
on E2. So it would suffice if with every rational function f of El, we could associate a rational function ¢ ( f ) = f* on E2 such that Div (¢ (f)) = ¢ (Div f). It is proved in Algebraic Geometry that this can be done. (See Silverman (1985), Ch. II, Prop. 3.6). Now suppose the functions defining ¢ : E1 ~-~ E2 have rational coefficients. Then ¢: El(Q) --, E2(Q). Since ¢ is-a group homomorphism, it is precisely g : 1 where g is the degree, and the kernel is of order ~. It is a general fact that a homomorphism with finite kernel of free abelian groups preserves the rank. Consider the special case of a binary cubic form F ( x , y) = ax s + bx2y + cxy 2 + dy 3
with distinct linear factors. Take the curve Cm : F ( x , y ) = m .
172 E x e r c i s e 1 0 a . Consider b i n a r y cubic F f o r m s with c o m p l e x coefficients a n d nonzero d i s c r i m i n a n t along with n o n - s i n g u l a r linear m a p s T : C 2 ~ C 2. Let F o ( x , y) = x3 + y3. T h e n a n y such F is of the f o r m F(_x) = Fo(T(x=)) for a suitable T. G i v e n a b i n a r y cubic f o r m F(x,y)
= a z 3 + bx2y + c x y 2 + dy 3,
associate with it
= (3ac - b2)x 2 + (9ad - bc)xy + (3bd - c2)y 2 and
I
H(x,y)=
a,
Gy
- (27a2d - 9abc + 2b3)x 3 - 3(6ac 2 - b2c - 9 a b d ) x 2 y + 3(6b2d - bc 2 - 9 a c d ) x y 2 - (27ad 2 - 9bcd + 2c3)y 3. T h e new f o r m s G, H are called covariani f o r m s . We also i n t r o d u c e the n o t a t i o n FT(z_) = F(T(x=)).
If F
~ F T,
then G,
~ detT)2G T
and H ~ * ( d e t T ) 3 H T. Remark.
T h i s is easy to check, using the two special t r a n s f o r m a t i o n s w i t h m a t r i x
(o
C o n s i d e r the curve Jm : y 2 z = x 3 - 4 3 2 m 2 D z 3,
where D = discr F a n d F is as above. We have a m a p ¢:
Cm ,
, J,, : (x,y,z)
~ (-4zG(x,y),
4 H ( x , y ) , z 3).
To see t h a t such _¢ really m a p s into J m , look first at the special case w h e r e m = 1 a n d F ( x , y) = x 3 + y3. We h a v e C : x 3 ~ y3 ~ z 3
J : y 2 z = x 3 _ 4 3 2 ( - 2 7 ) z 3,
173
since D = - 2 7 . Then F = F0 = x s + y3 and G = 9xy and H = 27(x 3 - y3). We need 16H2z 3 = - 6 4 z 3 G 3 + 432.27z 9, i.e. we need H 2 = - 4 G 3 + 272z ~, which does hold. The general truth follows, since we have F = F0T in general and G, H have the necessary covariance properties. Under this map, a point on C,n with F ( x , y) = 0, z = 0 will be mapped onto (0, 4H(x, y), 0) = (0, 1, 0) since H ( x , y) # O. Also, if C,,, has any rational point, say O1, then _¢(O1) = 02 will be rational on Jm. Now _¢ defines an isogeny from Cm with base point O1 to J,,, with base point 02. §11. U p p e r B o u n d s Well Rank.
o n C u b i c T h u e E q u a t i o n s in T e r m s o f t h e M o r d e l l -
As in section 8, we consider curves of the type Cm : F ( x , y ) = m , where F is a form of degree 3 with no multiple factors. If Cm contains no rational points then we have a trivial upper bound. So suppose that Cm contains a rational point. Then we have an elliptic curve with some R = rank Cm(Q) = R, say. THEOREM l l A . When m > too(F) and m is cube-free and Cm contains a rational point, then the number of integer solutions of the cubic Thue equation F ( x, y) = rn is < cl+rank Cm(Q) where c is an absolute constant. This result is due to Silverman (1982), but there was an earlier attempted proof by Demjanenko (1974). Recall that Bombieri and Schmidt have given the bound Co 3 l+v, where v = v(m) is the number of distinct prime factors of m, and only the primitive solutions are counted. These two estimates are rather different and, at this point, no one has shown how they fit together. (See also C. Stewart (to appear)). Suppose there exists some form F such that as rn ranges over positive cube-free integers, then the number of solutions of F ( x , y) = m is unbounded as a function of m. If this is so, then rank (Cm(Q)) is unbounded, and this would prove the conjectured existence of elliptic curves of arbitrarily high rank. LEMMA
l l B . Consider the Thue equation F ( x , y) = m
174 of degree 3 and define
H * ( F ) -- (cont F) hK(ai) as in Chapter III, Section 2. Then the number of solutions to this Thue equation with max(lx[, [Y[) > 4 S H * ( F ) S m 8/d
is <
d . 2d " hK(a)d-2rn a-
<
lYld
for some root a of F ( x , 1). If lYl ~ Ixl, we get
a
Y --
4dH*(F)dm 1 <: yd/8 y3V~/2 < y3x/d/----'--~
By a result of Chapter III, the number of solutions of this last relation with y > H * ( F ) is<
h0 (_¢ (x, y, 1)) < 3 log I~1 + cx(F). By the lemma above, we have 4 8 H * ( F ) 8m8/3 with =< c2 exceptions. If x is not an exception, then
ho ( ¢ ( x , y , 1)) < S l o g m + c3(F) <=91ogre for m >_ c4(F). We apply Theorem 9A with - 4 3 2 D in place of D. Since m is sixth power free, the n u m b e r of rational points P on Jm with ho(P) <=91ogre is
=< c]+rank Jm = cl+rankC~ since dra is obtained from Cm by an isogeny. Since ¢ is at most six to one, we get an estimate for the n u m b e r o f integer points on Cm.
175 §12. M o r e g e n e r a l results. Our discussion would be incomplete without at least a mention of the following deep results. Their proofs, however, are beyond the scope of these Notes. Siegel in (1929) proved that the number of integer points (x, y) of any irreducible curve
/(x, y) = 0
(12.1)
of genus g > 0 is finite. Baker and Coates (1970), in the case g -- 1, gave effective bounds for the number and the size of such points. They accomplished this by constructing a suitable birational transformation to an elliptic curve y2 = f(x). Better bounds were recently achieved by Schmidt (to appear): If (12.1) defines an irreducible curve of genus 1, where f has rational coefficients, then the number of integer solutions x, y E 7. is
< cl(d)H c2(a), where H is the height of f, where d is its degree, and c~(d) is a polynomial in d. (E.g., a polynomial of degree 13, but this can surely be improved. Furthermore, possible integer solutions have Faltings (1983) proved Mordell's conjecture, that on a curve of genus g > 1, there are only finitely many rational points. Another proof, with ideas closer to diophantine approximations, was given by Vojta (to appear), with a more elementary version given by Bombieri (to appear). There is every hope that this will lead to bounds on the number of rational points. However, when g > 1, effective results on the size of integer points or rational points (the size of numerators and denominators) seem at present to be quite beyond reach.
V. Diophantine Equations in More than Two Variables. References: Evertse, Gy6ry, Stewart and Tijdeman (1988), Schmidt (1980) §1. The Subspace Theorem. T H E O R E M 1A. (Subspace Theorem, Schmidt (1972)). Suppose that L I , . . . , L~ are linearly independent linear forms in n variables with algebraic coefticients. Suppose 5 > 0 is given. Then the integer points x_ 7~ O_with
ILl(__x)... L,-,(_x_)l < Ix__1-6 lie in a finite number of proper subspaces of Qn. The reader may find a proof in Schmidt (1980).
C O R R O L L A R Y l B . Suppose a l , . . . ,otn are algebraic and 1 , a l , . . . ,an are linearly independent over Q. Then there are only finitely m a n y rational n-tuples ( ~ , t y , . . . , ~ , l y ) with y > 0 and a~ - Y
(9.1)
1 < yl+(1/n)+~,
(i ----1, " " ' n).
In the special case n = 1, we get Roth's Theoerem. Also, the exponent 1 + ( l / n ) is best possible by Dirichlet's Theorem (Theorem 1B of Chapter II). P r o o f . Multiplying together all of the inequalities in (9.1), then multiplying by yn+l gives y l ~ l y - x , I . . . I~ny - ~ n l < 1/y ~. Now ptlt x - - ( X l , . . . , Xn, y) E Z n+l a n d let X--~ ( X l , . . .
L~(X) = a , Y - X ,
, Xn, Y).
Let
(i = 1 , . . . , n )
and
L,-,+i(X) = ]I. Then we have
]L~(__z)... L,,+~(_z)l
< 1/y 6 <
l/l__x] '5/'~
if y is large. By the Subspace Theorem in n + 1 dimensions, the solutions lie in a finite number of subspaces. Let one such subspace be given by ClXl ~- . . . -~- CnXn -~- c . + l y = 0
with ci E Q. On this particular subspace we have (c1~1 + . . . + cn~n + cn+i)y = C,(~lY -- x,) + . . . + ~ n ( ~ y -- x~)
177
by the defining equation above. Let 7 = c l a l + . . . + cna~ + c,+1. Then 7 # 0 by the linear independence of 1 , a ] , . . . ,an, and also 7 is fixed for a given subspace. We have
M lyl < (Ic l + . . . + le.I)/y < Icll + . . . + le, ISo y is bounded and we are finished. One would like to make the Subspace Theorem more quantitative. Recall, Roth's Theorem is ineffective in the sense that it does not give estimates for x, y. It can be strengthened, as we have seen, to give bounds on the number of solutions. A similar result is true in this case as well. We can not estimate the coefficients of the defining equations of the subspaces (i.e., we cannot estimate their heights), but we can give a bound for the number of subspaces. T H E O R E M 1C. (Schmidt (1989a)). Let L 1 , . . . , L n be linearly independent linear forms with coemcients in an algebraic n u m b e r field of degree d. Consider the inequality
IL,(_z)... L,(z_)I < I det(L,,..., L,,)I where 0 < ~ < 1. Then there are proper subspaces $ 1 , . . . , St of Qn where t=
[(2d) 2'e"
~-2],
such that all integer solutions x__~ O_lie in the union of $ 1 , . . . , St and the ball
~ max((n!) s/~,
H(L1),... , H(Ln)).
Schlickewei (1977) generalized Sehmidt's Subspace Theorem to allow more general absolute values. T H E O R E M 1D. Let K be an algebraic number field and let S C M ( K ) be a finite set of absolute values which contains all of the non-Archimedean ones. For v E S, let L v l , . . . , Lvn be n linearly independent linear.forms in n variables with coefficients in K . Let ~ > 0 be given. Then the solutions of the inequality n
IX H IL"'()I7
y E S i=1
with x E D~c and x__~ 0_, where
[x[----
max
[xlJ)[,
l<_i<_n
l~j~deg K
lie in finitely m a n y proper subspaces of K n. (As always, D K is the ring of integers in K, and n. is the local degree). A quantitative result was proved by Schlickewei, (to appear (a). See also (b).)
178
THEOREM 1E. Let S C M ( Q ) be of finite cardinality s, and containing the Archimedean absolute value. Let K be a number field of degree d, and suppose that for each v E S we are given a fixed extension of [ • [ to K . For v E S, let L v l , . . . , L v , be n linearly independent linear forms in n variables and with coefficients in K . Consider the inequality H yES
[Lvi(x)lv <
[det(Lvl,...,Lv,)lv
Ix[
•
i=1
Then there are proper subspaces $1, . . . , §t ° f Q n, where t=
[8sd!F
such that all the solutions ~A E Z " lie in the union o f $ 1 , . . . , S t and the box ~_[ < max ((n!) s/8 H(Lvi)). --
-~
l~_i~_n
yES
In Theorems 1C and 1E, the reader should note that t, the bound on the number of subspaces, is independent of the linear forms. The following theorem will turn out to be equivalent to Theorem 1D. THEOREM 1D'. Le.t K , S be as above. For v E S, let Lvi (i = 1 , . . . ,n) be n linearly independent linear forms over K . Then solutions x_ E P n - I ( K ) to the inequality
yes i=l
"-~=[=~ ]
< HK(X)"+6
where ~ > O, lie in finitely m a n y proper subspaces. R e m a r k . It is reasonable to consider x E P n - I ( K ) , since both sides of the inequality are invariant under multiplying x by a scalar. LEMMA such that
1F. A n y element x_ of P n - I ( K ) has a set of coordinatesx_ = ( x l , . . .
(i) (ii)
, x,)
[__x[v~ 1 for v non-Archimedean, H
IxI~ >=1 / c l ( K ) ,
v non-Archimedean
(iii)
[X[v =< c2(g)[x[w for any Archimedean v, w.
R e m a r k . While (i) bounds Ixlv from above for v non-Archimedean, (ii) says that Ix_Iv can not be too small. Result (iii) says that all of the Archimedean absolute values are about the same. P r o o f . Consider the ideal 2(x_) generated by X l , . . . , Z n . There is some integral ideal P2 in the same ideal class as 2(x), and Af(g2) __<¢1 (K). After multiplying x__by some
179 E K , we get a new ~' with 3(g') = ~1. After a change of notation, write 3(~) = 92. Then x l , . . . , x , are integers in K , and so (i) holds. If (a) = gl~gl~' ... ~ ' and if (p) = g l q . ~ : . . . L ~ ~, then the absolute value associated with ~ has ]ely = p-"/*. Also if ffi(~) = pf, then n~ = e l . (We axe using rudimentary facts from algebraic number theory). Putting this together gives [a[; ~ = 91(V) -v. So if
J then On the other hand, •
~(~) = H
.
° . .
'~"),
v . '°(~'''
J
so we have
I I I~la °
= H
[~-----1¢$7J = ~'[(~(~Z--))--I'
J
v non-Archimedean
and (ii) holds. By Dirichlet's Unit Theorem, if ~ 0 ) , . . . , c~(rl) correspond to real embeddings of K into R and a ( n + l ) , . . . ,a(~l+~2),(~(~'+O,... ,~(~l+r~) correspond to the complex erabeddings, then given any C 1 , . . . ,Cr,+r2, there exists a unit e such that
le(°lc, 5 I~°)lci • c2(K) for 1 < i , j < r , + r2 and some constant c2(K). To prove (iii), we need
I~¢°1 5 ~2(K)I~_(J)I, and this can be obtained by multiplying x_ by a suitable unit e. We can now see that Theorem 1D implies Theorem 1D'. By the lemma, we have
1
=<
C1 ~"" /
II
I~-I~ ~
< 1,
v non-Archimedean
more generally
i < ,in i:__l~o< el(K) v~s =
i.
=
In view of this inequality, (9.2) implies that
II
fI
y E S i=1
IL:~(x-)l~ ~ < ca(K) _ = HK(~_)~"
180
By the lemma again, we have
c ~ K ) <=HK(X=) <--_-~=~d, where max
(J)
1
v
Archimedean
=
1 ~=JSrl + r2
Then
cs(K, 6)
nfi
yES
i=1
and T h e o r e m 1D implies that the solutions lie in finitely m a n y proper subspaces. E x e r c i s e l a . Show that Theorem 1D ~ implies T h e o r e m 1D.
§2. General S-unit Equations. We now r e t u r n to, and elaborate, on results stated at the beginning of Ch. IV, §3. Let K be a number field and S C M ( K ) a set of absolute values which contains all of the Archimedean ones. First, we consider equations of the form XO + X l + . . . -~- X n = O,
where x~ E Us (i = O , . . . , n ) . Evertse (1984), as well as Schlickewei and Van der Poorten (1982) have the following result. THEOREM
2 A . An S-unit equation of the form x0 + xl + . . . + xn = 0
has only tinitely many solutions x_ 6 P"(Us ) for which no non-trivial subsum vanishes, i.e. for which iEI
when ¢ # Z # {0,1,..., n}. Example. The equation 34 + 3 b + 5 c + 5 d = 0 is an S-unit equation if S = {oo, 3, 5}. However, we get infinitely many solutions whose subsums vanish. So we see that the condition about no non-trivial subsum vanishing is a necessary hypothesis for this result. Schlickewei had a slightly weaker version of this result before he obtained the general version above. Instead of requiring that no subsum vanish, he had imposed a condition about distinct primes in the summands.
181
COROLLARY 2B. Given coe~cients s o , . . . , an in K×, the equation
aoxo + . . . + a r i z . = 0 has at most finitely many solutions x E pn(Us ) for which no non-trivial subsum vanishes. P r o o f . If S is suitably enlarged, we have ~i E Us. Then set x~ = aixi. By the theorem, there are only finitely m a n y possibilities for x ~ for which no subsum vanishes. (Notice t h a t when S is enlarged, the theorem is strengthened since more S-units are allowed.) P r o o f ( o f Theorem 2A). For v ~ S, we have Ix, I. = a, since _~ is a n S - u n i t . by the product formula, we have
II
Ixil7" = 1
Then
(i = o , . . . , n),
yES
and
iii)
yES
Ix~l~ o
=
1.
/=0
Now let ~ = ( x o , . . . ,xn-1). Then
(jx,j. no_ l-I
vGS
i=0
\ I~1~ ) --
1 HK('z) n+' =
For each v C S, choose i(v) in 0 < i(v) < n - 1 such that
I=~l,, = Ix~o,)l,,, and restrict to solutions where the i(v) for v E S are fixed. There are n c~ra s such choices. Let the set of linear forms L , j (1 =<j < n) be the set { X o , X 1 , . . . , X , - I , (Xo + . . . + Xn-1)}\Xi(~). Then
v~S
j=l
--
and by the Subspace Theorem (version 1D'), the solutions x- lie in finitely m a n y subspaces. Say one such subspace is given by COX 0 + . . .
"~- Cn--l Xn-- 1 = O.
Let 3o be the set of i with ci # O. Then
(2.1)
0 iE~ o
182
is an S-unit equation. Let ;3 C ;30 with 3 # ¢ and consider solutions of
~
(2.2)
cixi = 0
for which no subsum vanishes. For every solution to (2.1), there is such a set ;3, and the n u m b e r of such sets ;3 is finite. Apply the case n' + 1 = 131 to the S-unit equation (2.2). (We are using induction as n). Up to proportionality, there are finitely m a n y solutions. So it suffices to study solutions where {xi}i~ ;~ is proportional to a fixed {u~}ie ;I, i.e. where
zi = ~u~
(i c 3).
R e t u r n to the given equation xi
= O,
i=o
which we can rewrite as i~
i~
If EiE~[ ui ~ O, this is an S-unit equation in n + 1 - 131 + 1 = ,~ + 2 - 131 _-< n variables, namely ~ and xi (i ~ ;3). By induction, we get finitely m a n y solutions for which no subsum vanishes. On the other hand, if ~ i e ~ ul = 0, then the subsum ~-'~4~;~ zi vanishes as well and we are not interested in such solutions. We have proved that there are only finitely many solutions to an S-unit equation for which no subsum vanishes. In the case where K = Q, a bound has been given recently. Given coefficients a 0 , . . . , an in Q×, the number of solutions x__E p n ( u s ) of aoxo -4-... -4- anXn = 0
for which no subsum vanishes is
< (Ss)226°+', °. This result is due to Schlickewei (to appear). Notice that the bound is independent of the coefficients a 0 , . . . ,an. The case for general K is done in Schlickewei (to appear (d)).
§3. N o r m Form Equations. Let K be a n u m b e r field with [ K : Q] = d and L ( X ) be a linear form, say L ( X ) = L ( X 1 , . . . , X n ) = a l X 1 -t- ... + a n X n with ai E K. Suppose that a l , . . . , a n are linearly independent over Q. T h e n n < d and L will not vanish on Q ' \ 0 . As usual, denote the embeddings of K into C by a ~-~ a (i) (i = 1 , . . . ,d) and write L ( i ) ( x ) = ot~i) X l -1-... -4- o~(i)Xn. We write d
9I(L(X)) = H i=1
L(i)(x)"
183
A norm form is any form F of the type F ( X ) = afl(L(X)) for some L as above and a E Q×. For a norm form F , we have F ( X ) E Q[X___],and F !is trivially decomposable over the algebraic numbers as a product of linear factors. By a norm form equation we mean an equation of the type F(=x) = m,
x_ E Z n,
where F is a norm form. W h e n d > n this is a generalization of the T h u e equation. For if n = 2, the linear form L(X, Y ) = X - a Y gives norm forms d
F(X,Y) =a H (X-a(i)Y). i=l
If deg a = d > 3, then F(x, y) = m is a Thue equation. As x_ runs through Z n, the linear expression L(x) will run through a set ff)I C K . This set ff)t is a free Z-module of rank n with basis a l , . . • , an. So we could rewrite the norm form equation as = m, where p E fiR. Let Q O:R be the set of products q# with q E Q, # E 9)t. T h e n Q giR consists of alXl+...+anXn withxi EQ ( i = 1 , . . . , n ) . Let E be a subfield of K and let ff)IE be the set of p E flY[ such that A~ E Q ~ for every A E E . T h e n ff~E is a submodule of ~ . If E C E ' , then fir E' C ff~s; and we have ~ = frye. T h e module ff~ is called degenerate if there is a field E C K with E ~ Q and E not imaginary quadratic such that ffy~E ~ {0}. We say that F is degenerate if the corresponding ~ is degenerate. E x a m p l e . Take K = Q(i, V~), which has d = 4, and take L(X, Y, Z) = X + i Y + i v ~ Z . Let E = Q(i). T h e n { x + i i t : x, it E Z} = fiRE. For i f x + i i t + i ~ z E ff~E, then we would need ix - It - v/2z E QffR, which forces z = 0. This does not show that ~ is degenerate, though, since E is imaginary quadratic. We could also take E = Q ( i ~ ) to see = + # {0}, or t ke E = Q(v ) to get __ { i I t + iv z} # {0}. So we see that ff~ is, in fact, degenerate. E x a m p l e . Suppose d = p where p is a prime > 2, and 9~ is a Z-module of rank n, where n < p. T h e only subfields of K are K and Q, so we just need to consider 92tK. Suppose # E K, p ~ 0. Notice that Q ~ is a vector space over Q of dimension n. As A runs through K , then A# runs through K , a vector space of dimension d. So K # = K D QffR and thus p ~ 9/I K. T h e n 92lK = {0} and 92t is not degenerate. E x a m p l e . I f n = d, then K -- Q ~ a n d f f R K = 9~. I f K ~ Q and K is not imaginary quadratic, then 9Jl is degenerate. Degeneracy is i m p o r t a n t , for if F is non-degenerate, then the norm form equation F ( x , It) = m has only tlnitely many solutions. (Schmidt (1972) and (1980) Lecture
184 Notes). On the other hand, if F is degenerate, there will exist some m so that F(x__) = m has infinitely m a n y solutions. Before justifying this last remark, we will consider the simplest case which has infinitely many solutions. E x a m p l e . Suppose a l , . . . ,ad form an integral basis for K . Take n = d and L ( X ) = OtlX 1 + . . . + oedXd. Consider the norm form equation ¢Yi(L(x_)) = 1. T h e n we seek solutions to the equation ~ ( e ) = 1 where e = a l x l + . . . + a n x , . Thus e is a unit. By Dirichlet's Theorem, we have infinitely many solutions unless K is Q or is imaginary quadratic. In general, suppose that there exists a subfield E with ffj~E ~ {0}. We claim that if A E E , then not only is / ~ E C Q ~ but /~j~E C Q~e~E. For if S E 9XE, then As E Q ffJI, so that rnA s E 9X for some positive integer rn. Given A' E E , we have A'rnAs E QDJI, since A'rnA E E. This shows that mAS E 9)IE, thus AS E QgJ~E. Now let D E be the set of A E E with , ~ S C ~j~S. T h e set D E has the following properties: (i) It is a ring containing 1. (ii) It contains a field basis of E over Q. For if A1,... , A~ is a field basis, then ,~i~lJ~E C -~ 9XE for some positive m E Z. Then r e a l , . . . , m.k~ E O E form a field basis of E over Q. (iii) There exists an ~ > 0, ~ E Z such that gD E contains only algebraic integers. For if S • 0 is in 9XE then ,ks E 97tE for every A E D E . T h e n ,k E ttz ~'j~E. But ~1ffjtE is a free module with only finitely many generators, so there exists an £ so that -~t*ffJtE contains only algebraic integers. T h e n ~D E C ~ 9XE contains only algebraic integers. Any subring of E which satisfies these three conditions is called an order of E. The set ~E of all the integers in E is an example of an order. It is a fact that may order D in E is contained in the maximal order, D E. See Borevich and Shafarevich (1966) for a more complete discussion. E x a m p l e . Take E = Q(v/2). Then D E consists of x + v/2y, with x, y E Z. Another example of an order consists of x + 2x/~y with x, y E Z. We call D E the ring of multipliers. Take CE to be the group of units of D E of norm 9~E/Q(e) = 1. By Dirichlet's T h e o r e m on units, this group is infinite unless E = Q or E is imaginary quadratic. Now suppose that S0 # 0 is in 9XE. For e E EE , we have 91(es0) = 92(S0) and es0 E ffYtE. T h e n if 92(S0) = m, the norm-form equation 9l(s)
= m,
s E 9n E
has infinitely m a n y solutions. So the condition of non-degeneracy is necessary to ensure the finiteness of the n u m b e r of solutions. E x a m p l e . Let K = Q(i, V/2) and F(x, y, z) = 92(x + iy + iV~z). T h e n ffJt: x + iy + iv/2z. Now let E = Q(v/2). We have ffJtE: iy + i v ~ z = i(y + V~Z) and O E = D E, the ring of integers in E. We know that g E is infinite and we have a unit e = V~ - 1. Any solution of fRE(y + V~z) = +1 gives a solution iy + iv/2z E 99t of 9I(iy + iv/2z) = 1. Let us start with a particular solution of 9tE(y + V/2z) = 9~E(S) = +1, say with #0 = 1 (so that Y0 = 1, z0 = 0). By multiplication with powers of e we
185 obtain further solutions. Setting #t = #0e* = et, we have pa = e = v ~ - 1 (so that yl = - 1 , Zl = 1), #2 = ( x / 2 - 1) 2 = 3 - 2 v ~ (so that y2 = 3, z: = - 2 ) , etc. The reader may find a further discussion, especially about the degenerate case, in Schmidt (Lecture Notes, 1980). We return to the ease where F is non-degenerate and consider bounds on the number of solutions. THEOREM 3 A . (Schmidt (1986b)). I f F ( X ) is a non-degenerate norm form of degree d with coefficients in Z, then the norm form equation F(_x_) = m,
x_ E Z "
has at m o s t
d 2a°'d= el(n, d, m ) solutions where
el(n, d, 1)
= 1 and
(.
1) ~ d,-l(md),
where w is the n u m b e r of distinct pr/me factors of m and dn-l(g) is the n u m b e r of ways ofwritingg=gl...g,_x withgi>0 (i=1,...,n-1). As was seen in Section 6 of Chapter III, the general case follows from the case m = 1. Thus, one may concentrate on the norm-form equation =
1
and the bound (3.1)
d 23°"d2.
In these Notes, we will not prove Theorem 3A and (3.1), but a variation. See Theorem 3B below. Incidentally (Schmidt (1989b)) has also proved another bound in place of (3.1), namely d(2,) "2"+4 " This second bound can probably be improved, removing one of the exponents by using some combinatorial techniques. The second bound is nicer for fixed n, since it grows only like a polynomial in terms of d. How can this be generalized to the degenerate cases7 T h e r e we would have finitely many solutions up to multiplication by certain units. An explicit bound so far has not been derived. Also, what about asymptotic estimates for the number of solutions? The inequality IF(x_] ~ m defines some n-dimensional set of volume c y m '~/d where cy depends on F only. Mahler (1934) has shown in the case n = 2, i.e. for the Thue case, that the number of solutions of this inequality is ~ c y m n/d as m ~ oc. R a m a c h a n d r a (1969) proved this asymptotic formula for a class of norm form equations.
186
Now we will specialize the n o r m forms F somewhat. Let F be a n o r m form given by F ( X ) = a L < I ) ( X ) . . . L
cr(L(J)) --_ L(iJ)
(1 < j =< t).
If, for instance, the Galois group o f / ( is Sd, the s y m m e t r i c group on d elements, then G is d times transitive. THEOREM 3 B . Let everything be as above. If G acts n - 1 times transitively and if a n y n a m o n g L 0 ) , . . . , L (d) are linearly independent, then the n u m b e r of solutions of F(x__) = 1 is <: d 2~°" " It follows t h a t forms as in T h e o r e m 3B are non-degenerate. This could also be shown directly. E x a m p l e . Let K = Q((~), where (~ is an algebraic integer of degree d > 2, and suppose G --- Sd. Take F(X)
=
¢ ~ ( X 1 --~ o ~ X 2 -~- . . . +
o~d-2Xd_l).
T h e n the hypotheses are satisfied T h e class of equations treated in T h e o r e m 3B includes T h u e equations. T h e r e m a i n d e r of this chapter will be devoted to the proof of T h e o r e m 3B. (There are some e x t r a technical difficulties for T h e o r e m 3A.)
§4. A R e d u c t i o n . Given two n o r m forms F , G we say F ~ G if F = G T for some T E SL(n,Z). Recall that G T ( x ) = G(TX). The n u m b e r of solutions of F(x__) = 1 is invariant under this equivalence relation. In C h a p t e r III, Section 2, for a prime p, we exhibited certain transformations To, T1,... , Tp with det Ti = p such that P
z " = (.J T,Z n. i=0
So instead of studying F(_x) = 1, we could s t u d y the equations FTJ(x__) = 1 (j = O, 1,... ,p). So what is the advantage of using Frj's? From C h a p t e r III, we know t h a t D ( F T) = (det T)III D(F), and then D(F r) >=pill. T h e advantage, then, is t h a t we can
187
consider forms whose semi-discriminant is large. T h e disadvantage is t h a t we now have p + 1 n o r m form equations to consider. Let N(n, d) be the m a x i m u m n u m b e r of solutions to F(x_) = 1 for all n o r m forms F of degree d in n variables of the type described in T h e o r e m 3B. Let N(n, d,p) be the corresponding b o u n d if we restrict to forms F with semi-discriminant D(F) >=pill. Then N(n, d) <=(p + 1)Y(n, d,p), which is L e m m a 2C of C h a p t e r III. By L e m m a 2B of the same chapter, we have
D(F) <=H*(F) Irl"/d. Recall that I is the set of all n-tuples i l , . . . ,in in 1 _-< ij <= d with L ( i l ) , . . . , L (i") linearly independent. (Under the hypothesis of T h e o r e m 3B, I consists of all such ntypes of distinct n u m b e r s . ) Combining this last inequality with D(F) >= prII, we see that we m a y restrict ourselves to forms with
H*(F) >=pd/~. PROPOSITION
4A.
If p
>
n lOn2, then
N(n,d,p)
< d 2a°'-l°n2
We m a y deduce the m a i n t h e o r e m (3A) from this proposition. Recall t h a t we need to show that the n u m b e r of solutions to F(x_) = 1 is
d2a0n. We pick a p r i m e p with n 10n2 < p < 2r~ 10n2.
T h e n we have
N(n,d) <=(p + 1)N(n,d,p) < d2a°"-l°n~2rt iOn2
_< d 23°" since d > n. We m a y restrict still further. For a n o r m form F = aL (1) ... L (d), we have height H*(F) = H(L) d cont F. In general, this height is not invariant under ~. We let ~ ( F ) = rain G~F
H*(G).
This m i n i m u m exists, since a m o n g all forms L with coefficients in a given n u m b e r field, there are only finitely m a n y forms with H(L) <=B for any b o u n d B. F u r t h e r m o r e , this .~(F) is invariant under ~.
188
So we restrict to norm forms F with D(F) = H*(F). We have seen that when D ( F ) >= plII, we get ~ ( F ) _>_ p~/n. In the proposition, p > n ' ° " ' , so the inequality becomes (4.1)
-9(F) > n ~°d'.
How does this apply to the linear form L? In counting solutions of F ( x ) = 1, we may suppose that cont F = 1. Otherwise, we would have no solutions. Then (4.1) implies H ( L ) > n ion. In the sections which follow, we will distinguish large and small solutions. Small solutions will be those with
151 _-< --~(F) 6"d"
=
H(L) 6,~d"+'.
The remaining solutions will be (:ailed large solutions. §5. A n A p p l i c a t i o n o f t h e G e o m e t r y o f N u m b e r s . 'LetL i _< d). Let
=
OqXl-}-...-}-o~nXn withoq E K, a n d w r i t e L (i) / a
=J
\
andA=a
A...Aa
o~i)x,--~
-. .
..a-a(1)X, ~, (1 =<
~(x)) • ~"(d)
=
=
,
(l ____j __<~)
txj
. Then A lies in Ce, w h e r e g = A ( L ) == 131 = x/I/3,1 ~ + . .
d)' If A = (/3,,... ,/3,), introduce + I/3,12
Then it is easily seen that A ( L T) = A(L) for T 6 S L ( n , Z). LEMMA
5A. Given the notation above,
v(,~) lal"/a/X(L) >= 2n~3/2
a./:(con t F)(._,)/dS(F),/~,
where V ( n ) is the volume of the unit ball in N". Notice that both sides are invariant under ~. The right-hand side depends only on F , but the left-hand side depends on how we write F = aL O) ... L (d). We could write instead F = a ' L ' 0 ) . . . L '(d), where L' = ,~L. Then a' = a/9l()~), but there is no simple way to express A(L') in terms of )~ and A(L). The matrix (a~ i)) (l_
189
has r a n k n, so a x , . . . , a n are linearly independent. combinations zx_ax + . . . + x n _ a ,
Let A be the set of all linear
where xi E Z. We will see t h a t A m a y be interpreted as an n-dimensional lattice in some Euclidean space. Suppose t h a t the field K has rl real embeddings and r2 pairs of complex conjugate embeddings. T h e n rl + 2r2 -- d. Suppose a,
~
a (0
is real for 1 < i < r l , and ,
~ o~(i),
ot ~
¢~(i+r2)
are complex conjugate embeddings for rl + 1 -< i -< rl + r2. Let E d be the space of vectors
where z l , . . . ,zrl are real and Zi,Zi+r2 are complex conjugates for rl + 1 --< i --< rl + r2. T h e n E d is a vector space over R of dimension d. Introduce an inner p r o d u c t
z . z' = zl-~ + . . . + zd-~'d and a n o r m on E d. E x e r c i s e 5a. Show that, with this inner product, E d is a Euclidean vector space. By inspection, a. E E d. T h e n we m a y interpret A as an n-dimensional lattice in E d, as mentioned earlier. Now we m a y consider successive minima. In other words, introduce P l , . . . , #n where p j is the least positive real n u m b e r such t h a t there are j linearly independent points of A with Ig[ =< Pj. Here [ I is the n o r m above.
If z_ = Ul a 1 + . . . + un =an E A, with components z (i), t h e n z (i) = L ( i ) ( U l , . . . , u , ) . We have tal I z O ) . . . z(d) I = I F ( u a , . . . , u , ) [ > c o n t F, unless ul . . . . .
Un = O. So every non-zero lattice point __zhas
[zCi) ...zCd)l => (cont Y)/lal. T h e A r i t h m e t i c - G e o m e t r i c Inequality gives
=
+... + Iz( )l
> Id > ~/-d ~/(cont F)/[a[.
2
190
We may conclude that (5.1)
•1 => ~
~/(cont
F)llal
.
Now suppose that b l , . . . , b is another basis of A. Say
~!1) ) r 3 b. =$
= mjl
[~!d)
a 1 "4- . . .
-1"- m j n
an,
r.7
where the matrix (mjk) is in
SL(n, Z). Introduce the row vectors
_,(,) = (,i'),... ,,(:)) and
tic,) = (t},),..., t(,)). Then we have 3 t(i)
= m j l o~i)
+ ...
+ m jr,
Ol(ni)
= a~i)mjl + . . . + a~i) m.i,, so t (1) =
~(i)Mt, where M t is the transpose of 2lz/. Let F M' (X) = F(Mt_A') = a 1-I ( ~ ( ' ) M ' X ) i=1 d
---- a I I (flCi)x) i=1
Since
F M' ,~ F, we have H'(F M') >=~(F),
so that
d
lal2 I-~ I-~(i)12 --> "~(F)2" i----1
Using the Arithmetic-Geometric Inequality, we get d
I_~(')1~ = d(.5(F)/lalfl/d. i----1
Recall that
~--
----3
191 Therefore
(5.2)
~
Ibsl ~ > d(D(F)llal) 21d,
j=l which says t h a t a basis b l , . . . ,b n can not be too small. Given our lattice A and any basis =bl,... ,b n of A, there are linearly independent lattice points g , . . . ,g such t h a t 1
=n
Ig, I =/i~, Ig2=l= w,..., where/i1, • • •
,/in
Ig,, I =/i,,
are the successive minima. For n = 2 these g. necessarily f o r m a basis,
but for n > 2, they are not necessarily a basis. However, one can show t h a t there is a basis b l , . . . , b with the p r o p e r t y t h a t
I:bjl< j/ij
(j = 1 , . . . , n).
E x e r c i s e 5 b . Verify this last statement. T h e reader m a y consult Cassels' text on the G e o m e t r y of N u m b e r s (1959). Given such a basis, we have
_
=
=It
j=l
/t n.
j=l
If we combine this with (5.2) we obtain
dl/2
/i. > n-777/2(S~(F)Ilal) lid. F r o m (5.1), we also h a d
#1 > d '/2 (cont F/lal) '/d
(j = 1 , . . . , n - 1).
Taking the p r o d u c t of these inequalities we see t h a t
d./2
~ x m . . . / i , > n3/2 lal,/d (cont F) (n-D/d ~ ( F ) x/d. By Minkowski's Second T h e o r e m (2E of C h a p t e r I), we have
#1 . . . / i n V ( n ) < 2 ~ d e t A , so t h a t
v(,~) det h :> 2" ,~i~ lalni~ d"l~ (toni F) ("-')/d.~(F)a/d.
192 But det A = I det ai_ =ajl1/2
= la, A...
^ a I = A(L),
a n d the p r o o f is complete.
§6. Products of Linear Forms. LEMMA 6 A . Suppose F ( X ) = a L ( 1 ) ( X ) . . . L ( d ) ( X ) is a n o r m f o r m with coefflcients in Z. Suppose x_ 6 Z n is a solution of
F(x=) = ~. Then there exist i x , . . . ,in with 1 <=il < . . . < i,, <=d such that
2"n312
IL(i,)(~)... L(i-)(z__)l < -
i det(L(i,),...
= (r,!)'/~ v ( ~ )
-
,L (i"))
ISS(F)-ll a.
T h e significant t e r m on the right h a n d side is Y) ( F ) -lId. W h y w o u l d s u c h a l e m m a be useful? Recall, in the case of the T h u e equation, if alx - a l y [ . . , ix - anyl = 1, t h e n laiy - x[ was small for some i. T h e n [aiy - x[ [ajy - x] was small for a n y j . We now have a n analogue. Here we have n variables, a n d we can find some n of t h e linear forms such t h a t their p r o d u c t is small at x. Proof.
We have a L (1) (__x)... L (d) (x_) = 1.
Introduce
V(X) = i(~)/L(x), so t h a t F ( X ) -~- V ( 1 ) ( X ) . . .
v(d)(x).
N o w a p p l y t h e l e m m a of Section 5 with a = 1, cont F -- 1, a n d V in place of L. We have cont F = 1 b e c a u s e we have an integer solution to F(x_) = 1. T h e l e m m a gives
V(n) d"12 Let
a. =3
\<,~")/L(")(x), Then
A ( v ) ~ = I~l ^ . . . _
^ ~ . I~ = IAI ~ _
__
> v(~)~ d" ~ (F) ~/d. 4 n n3
=
193
Now let D ( i l , . . . i , ) be the determinant of the submatrix of (a~0/L(0(x)) where i is among i l , . . . , i , . Then
IAI 2 =
~
IO(il, ''.
,in)l 2,
l=
and the number of n-tuples in the sum is (~) =< -fiT" Then some D ( i l , . . . ,
[D(i,,.." ,i.)12 >
i n ) satisfies
V(n) 2 n!
4" n 3 ~) (F)2/d'
and
i-)1 > V(n)(n!)a/2 J~ (F) 1/~
ID(il, • •. ,
=
2 n n3/2
But ID(iI, •
,i-)l = I d e t ( L ( i ' ) ' " " ' L ( i " ) ) l •"
iL(i,)(x__)... L(i")(x=)l '
so we can get an upper bound for [L(/')(x__)... L(i-)(x__)[ in terms of [ det(L("),..., We have [L(il)(x_).. L(i.)(_x)l < [det(L(it), ... ,5(i"))1 where
L('"))].
p = ~ ( F ) 1/d (n!) 1/2 V ( n ) 2 n n3/2
In the ease of the Thue equation, we used a Gap Principle to get a bound on the number of solutions• We will do a similar thing in the next section. §7. A G e n e r a l i z e d G a p P r i n c i p l e . We first present a simple argument in diophantine approximations. Consider rational approximations ~ to a real number a with O~--
< ~
py2 '
and say P > 4. Given such reduced and distinct x l / y a , . . . , x , , / y , , with ya < ... < y~, then < a--xJ-x + ~-< 2--, Yj - 1 Yj =
therefore
Yj - 1
P Yj - 1
P Y j > "~ Y./-1,
and so y~ > ( p / 2 ) v-1 .
194
T h e n u m b e r of such approximations in reduced form with 1 < y < B is logB < 1 + 2 logB =< 1 + log(P/2~----) = log P " We now want to generalize this. L E M M A 7A. Suppose L1,. •. , Ln are n > 1 linearly independent linear forms in n variables with complex coefficients. Suppose (n!) 4 <-- P = B
and put Q = (log B ) / l o g P.
Then the integer points x_ in the ball
I~l ~= B
with
ILl(X=)... Ln(=x)l _-< l d e t ( L 1 , . . . , L n ) I P -~
lie in the union of at most rt3nQ n-1
proper subspaces of Qn. P r o o f . The proof falls into two parts, the first being a reduction of the problem. (A) If L ( X ) = o q Z l + . . . + a n X n = a X and M = B X , then let ( L , M ) = ~ and ILl = v/(L, L). We will reduce to the case where L 1 , . . . , Ln are pairwise orthogonal, i.e. (Li, Lj) = 0 for i # j . If L i ( X ) = ~i X , then let
~"
:Z
= (-1)i-1
-% ^ " "
^ =~ ,' - 1
A ~i+1 A ... A
%,
which has n components since (n--a)n = n. We have (7.1)
o~. &. = 6ij det(~,
" '~n)'
where gij is the Kronecker symbol. This last statement is true because the coordinates of the &. are the minors of the matrix with rows ~ 1 " " ' ~-n' so we get the determinant ----2 expanded about the ith row. T h e assertion of the lemma is invariant under replacing Li by the multiple .hiLi. We choose .hi = l~il/(l~xl-...=..l~--nl) x / ( n - n and replace c~. ~
so that
.hio~
195
After such a transformation, we may suppose that 1~1 [ . . . . . Suppose that we restrict our attention to solutions where
]--&nI = 1.
ILn(x__)l = max ILi(x)l. -
1<=i<=.
=
It suffices to show that the n u m b e r of subspaces required under this restriction is
5 1-nan Qn-1. l"Z
Write = Cl-----~1 + ' ' "
+ Cn~--"n"
We have i=1
= cj d e t ( ~ l , . . . ' ~ n ) by (7.1). Looking at the left-hand side, we know that
I~n a jl <= l with equality when j = n. Thus Icjl 5 ICn[
(j = 1 , . . . ,n).
Put ,~' = -~_ /Cn -n
~n
I
Cl'
~1 + "'" + Cn-1 --~n--X + Cxn _ _
_
_
with Ic~l 5- 1, and put L ' ( X ) = _-' X. n
We have d e t ( L 1 , . . . , L n - 1 , Ln) = d e t ( L 1 , . . . , L n - l ,
L')
and L " is orthogonal to L 1 , . . . , Ln-1. Also
' 1 L . - I ( x=) + L.(~), L'(x_) = e'IL~(x_ ) + .. . + c,,_ _
so that
IL;(x__)I =< nlL.(x__)l, since leVI ~ 1 and ILn(x__)l = maxx_
196
Thus it suffices to show that when L , is orthogonal to L1,... , L , - 1 , then the solutions of
I~1 < B and
ILl(Z_)... L,(z__)[ < nl d e t ( L , , . . . , L , ) I P -1 lie in the union of not more than X n3n n
Qn-i
proper subspaces. This is the key reduction. We repeat this argument. As before, we may suppose that [~11 . . . . we restrict to solutions with
IL,_l(X__)l--
I_-&nl= 1 and
max IL,(x___)l. l_
Again, write ()t
---- C1 ~---I "{-
+ Cn--I ~='n--I '
where -----n ~ does not appear since -----n & -- l is orthogonal to ~(~n and the orthogonal complement ofa=~ is spanned by g ~ , . . . , g , _ . We have Icjl =< te,_~l (1 =< j <= n - 1), and we take O~ ~---n--I
:
n--I
Cn_
1
and L ' _ I ( £ )__
- ~ . _'X .
Then IL'_~(~_)I < (n - 1)1L.-1(~)1. Thus it suffices to show that when L,~ is orthogonal to Li onal to Li (i ~ n - 1), then the solutions of
I~1 5
(i # n) and L . - 1 is orthog-
B
and
ILl(X=)... L,(__x)[ < n(n - 1)l d e t ( L , , . . . , L , ) I P -1 lie in no more than
1 n(n -- 1) n 3n Qn-1
proper subspaces. Applying this argument repeatedly, we get the following reduction. If L 1 , . . . , Ln are pairwise orthogonal, then the solutions of
Ix_l < B
197
and [L,(x__)... L,(x__)l < n! [ d e t ( L 1 , . . . , L , ) I P -1
lie in at most 1 n3,Q,_ 1 (2n~)"-1 Q"-I < n-i proper subspaces. We may reduce even further by supposing that (Li, Li) = 6ii. To do so, just multiply by suitable factors to make ILi] = 1 for 1 -< i ~ n. T h e n ] d e t ( L 1 , . . . , Ln)l = 1. (B) Having completed the reduction process, we study solutions x_ to
ILx(x_)... L,(x__)[ < n! p - l , Igl < B and therefore
IL/(_z_)I < B
(i = 1,... ,n).
Put
C = (Pl(n!)2) 11("-1), so that
Iii(x=)... Ln(x=_)l < 1/(n[ cn-1),
and put
R = l o g ( n ! B " ) / l o g C. We subdivide the solutions into several classes. Either for some i in 1 -< i -< n - 1 we have ILi(=x)l < B C - R =
Bl-"/nl,
(Zd
or we have B C - ' ' - 1 _-< ILi(z__)l < B C -p'
(i = 1 , . . . , n - 1)
( E m ..... p._,),
for integers P l , . . - , P , - 1 with 0 =< Pi <=R. If x is a solution in one of the first classes, say Ei, then one of the first n - 1 forms, namely Li(x_), is really small at x_. T h e second set of classes covers the cases where this is not true. Let x i , . . . , x_, be solutions in some class El. Then I det(x_l,... , X n ) l = I d e t ( L 1 , . . . ,Ln)l
I d e t ( x l , . . - ,Xn)[
det Lk(xj)[ =ll
Bl_n=l.
n! T h e n d e t ( x l , . . . , x ) = 0 axtd any n solutions in this class are linearly dependent. In other words, all of the solutions in a given Ei must lie in a single subspace. So counting solutions in E a , . . . , E ~ - I , we have at most n - 1 subspaces.
198 Now consider solutions in one of the second classes. We have
ILKx=)... L.(x=)l < 1/(n[ ca-l), SO
IL.(x)l < Let
X_I,... ,X__nbe
Cpl+...+p,~-I n! B " - I
solutions in this class, called Ep~ ..... p,_~. T h e n
[det(~-l"'" '~") = [l__
< n! B " - 1 C - p l - ' ' ' - P " - I
Cpl+ ...+pn- 1
~-].,
n! B,,-1
and x l , . . . , x are linearly dependent, as before. In this case, how m a n y subspaces m a y we have? In other words, how m a n y classes can occur? We have p l , . . - , pn-1 with 0 =< p~ <= R, which gives no more t h a n (R + 1) n-1 possibilities. Recall t h a t R - log(n! B n) log C ' and by hypothesis n! B " < B "+1 . We also have
C~_ ((n~.)2) 1/(n-1)
> p1/(2.-2)
by hypothesis. Combining these gives log B n+l - (2n 2 - 2)Q, R < log p1/(2,,-2) and we have
( n + 1) "-1 < ((2~ ~ - 2)Q + 1) "-1 subspaces. All together, the total n u m b e r of subspaces is
< ~ - 1 + ((2~ 2 - 1 ) 0 ) "-1 =< (2n2Q) - - 1 , which is what we asserted.
199
§8. S m a l l S o l u t i o n s . In Section 6, we saw that IF(x_)] = 1 implies that for some il < ... < in, we have
(8.1)
]L(iD(x)... L(i")(x__)[ < I d e t ( L ( i l ) , . . . ,L(i"))lP-1 ,
where P = V(~)(n!) 112 2 n n3/2 ~5 ( F ) 1/d.
E x e r c i s e 8a. Show that (n!) 1/2 V(n) ) 1/(27r) n/2.
Using this exercise, we get
1
P>
~j (F) lid > ~ (F) 1/2d > (n!) 4,
where the last two inequalities follow since Y) (F) > n l°nd by (4.1). As we mentioned previously, small solutions will be those with
Ixt 5 ~ (F) 6rid". Let this bound be B. By Lemma 7A, the small solutions satisfying (8.1) lie in no more than n an (log B~ log p ) n - 1 subspaces, and we have nan(log B / log p)n-1
< na n ( 6 n d n log ~j ( F ) ) n-1
g
=
_< 12 n n 4n dn2-1
.
Counting the number of possibilities for the n-tuple 1 < il < . . . < in < d, which is
(:) =<
result.
we
PROPOSITION
8A. Under our hypotheses, the small solutions of
IF(x__)l = 1 lie in the union of at most 12 n
n 4n dn2+n
proper subspaces. Note that for small solutions we did not need non-degeneracy or the hypothesis of Theorem 3B, but only the fact that the matrix of L(1),... , L (r) is of rank n.
200
§9. L a r g e S o l u t i o n s . We will count the large solutions only in the special case considered in Theorem 3B. L E M M A 9A. Let L1, . . . , L , be n linearly independent linear forms in n variables with coefBcients in a n u m b e r tleld E. Using linear independence of L 1 , . . . , L , , write each variable as X i = "Til L1 + . . . +
( i = 1 , . . . ,n)
7i,L,
with 7ij E E. Then
I'T~il ILjl
<
( i , j = 1 , . . . ,n)
HE(L1)... HE(L,).
P r o o L In fact, for any absolute value v* E M ( E ) , we claim that lTijl." ILj]. * _<- H E ( L 1 ) . . . H E ( L , ) . Write (i = l , . . . ,n)
Li(X) = a. X and put
, ~ n~ if v ¢ v*, nv = ~ n v - l i f v = v * . For a E E ×, we have the product formula, namely t
I~1.. H
I-I:°=
~.
vEM(E)
Writing o~ --- d e t ( ~ l , . . . , ~ , ) , we get 1 Id e t ( ~ , . . . , ~ , ) [ , .
(9.1)
=
H
[ d e t ( - ~ l " " ' ~ - )lf~
vEM(E)
<
I'[
(1_%1~
~"'°
-
vEM(E)
__ U ~ ( ~ l ) . . . HE(gn)
I~ll,,'..where we have used Hadamard's Inequality. Now let a. = ( a i l , . . . , a i , ) , A = (aij), the a. are simply the rows of A, while -y. =
I-~,l,~"
'
C = (Tij)- Then A C = I. Notice that "
axe the columns of C. Thus, up to
k 3~-j/
sign,
7j
1 det A
~1 A . . .
A ---~j-1 A ---~j+l A . . . A (3/ .
201
Now we have
I-% Iv.... I-% Iv.
I~jl~-12jl~" ~ - ~et.41~ by
(9.1).
5 HE(al)...HE(a,)
Looking at the ith components of the vectors 7. on the left, we have =3
]Lj[,. 17ijl~* -_
9B. fix_ is a large solution in Theorem 3B, then there are indices 1 <-_
il < . . . < in ~ d such that
(9.2)
[L(i')(__x)... L(~-)(x_)[ < I det(L(~'),...,
L(i"))[
P r o o f . It is convenient to normalize the forms L (0. That is, let
M (0 (X) = L (0 ( X ) / I L ( O ], so that IM(i)I = 1 (i = 1 , . . . ,d). (Notice that the M (i) are no longer necessarily conjugate forms.) We have from F(_X-) = 1 that ]M(1)(X-)... M(d)(_X-)[ _-
1
lal IL(1)I... IL(d)l
1 -
<1.
H*(F) -
Without loss of generality,
IM(a)(x__)l<=...<=lM(a)(x=iI. Since any n linear forms in Theorem 3B are linear 1y in "epen d en t , we may express Xi = 7il L (1) + ... + ~inL ('*). Then by the preceding lemma,
I'ml [L(i)l
< HE(L(X)) ... HE(L('*)),
where E is a field containing all of the coefficients of L(a),... , L (n). Then deg E _-< d ' , and 17ijl [L(J)I --< H(L) ha". Now express the variables Xi as linear combinations of M ( 1 ) , . . . , M (n), say
Xi = Cil M O) + ... + ci'*M ('*),
202 where cij =
7ij IL(i) 1. Then
Ic~jl ~ H(L) nd"•
We can now obtain an estimate for IX], which we will use to complete the proof. We have
Ixl _-
Since
/M
,M(n..{..l)(x)...M(d)(x)l ~ ( ,x__] )d-n _ _ = n2H(L)n4. Then [M(1)(__x)...
M(a)(x__)l __<1
leads to
)
(n2H(L) nd" d--n
IM(~)(z__)... M(~)(N)I < \
Iz_-I
Combining this last inequality with
IM (o (x_)l = IL(O(_x)I/IL(O I,
we have
d--n
IL(1)(x__)...L<")(x__)l Idet(L0),.-.
< IL(1)[... IL<~)[ , L<'))l = I det(L(1),..., L(~))I
< \( n2H(L)2nd'~ ) since
IL(1)] ... IL(n)l I det(L(1), ... , L(n) l
I:___l ]
d--n
< H(L) nd")
from the proof of the last lemma ((9.1) and the fact that d e g E <_- d~). Now, large solutions satisfy Iz__l >
H(L) °"d°+' > ,~Od,,
(see the reduction of section 4), so IL(X)(_x_)... L(n)(_x)l
d-rt
I det(L(1), .-. , L(n))l d-n
1 =
IXI1/~ •
203 R e m a r k . We have not made full use of the restriction Ixl > H(L) s"dn+l for large solutions. Fuller use is made in the proof of T h e o r e m 3A, which however is not presented here. Now we apply the Subspace T h e o r e m (Theorem 1D) with 6 = 1/2. T h e conclusion is that there exist subspaces $1,... , St where
t = [(2dn)226"6-=] = (2dn) 4"226", such that solutions to (9.2) lie in the union of $ 1 , . . . , St and the ball Ix___l< max((n!) 8/~, H(L)). Since our large solutions can not lie in this ball, we have the following result. PROPOSITION 9 C . Under the hypothesis of Theorem 3B, the large solutions to F(x_) = 1 lie in the union of at most t proper subspaces, where t = (2d") 4"22e".
§10. P r o o f o f T h e o r e m 3 B . Combining the results of Sections 8 and 9 on small and large solutions, we see that the solutions to the norm form equation F ( x ) = 1 lie in the union of not more than (2d-)5-2=6" subspaces. We want to count the number of solutions. We suppose that S is one of the subspaces and that S is given by a parameter representation _x = T y, where y E Qn-1 and 'T : Q , - 1 __~ Qn. If T is properly chosen, as y_y_runs through Z "-1, then x will run through the integer points of S. Z,~-I T
To study solutions x_ E S, we need to study F(Ty_) = I. Letting L*(_y) = L(T(y)), we have
(10.1)
aL*(1)(y_)... L*(d)(y_) = 1,
204 a n o r m f o r m e q u a t i o n in n - 1 variables. This allows us to do a p r o o f b y induction. We w o u l d like to a p p l y T h e o r e m 3B to the new linear f o r m L*. B y hypothesis, G was n - 1 times transitive on L ( 1 ) , . . . , L (d), so t h e n G is n - 1 times transitive, hence n - 2 times transitive on L*(1),... , L *(d). Also, the r a n k of the f o r m s L*(1),... , L *(d) is n - 1. So t h e n there exist n - 1 a m o n g t h e m which are l i n e a r l y i n d e p e n d e n t . B y (n - 1)-transitivity~ a n y n - 1 a m o n g t h e m are linearly i n d e p e n d e n t . So b o t h h y p o t h e s e s are satisfied, therefore (10.1) has
_~ d 2s°("-1) solutions in S. M u l t i p l y i n g by the n u m b e r of subspaces we get
< dlOn.226'*
d2S0(--1)
d230"-lOn 2
solutions (with p l e n t y to spare). This proves P r o p o s i t i o n 4A, a n d therefore T h e o r e m 3B.
Epilogue. The abc-conjecture. Let a, b, c be non-zero integers with
a + b+ c = 0
and
gcd(a, b, c) = 1.
Put P=Hp. p}abc J. Oesterl& posed the following question. Is there an absolute constant el such that
m=(lal, Ibi,Icl)< pc~ ? Masser (1985) refined this question. He conjectured that for any e > 0, there exists a constant c2(e) such that
max(lal, Ibl, Icl) <
TM.
This is known as the abc-conjecture. We will discuss consequences of the abc-conjecture. Our discussion will follow, to a large extent, a paper due to Stewart and Tijdeman (1986). M. Hall Jr. (1971) conjectured that
[2:2 __ y3] > c3yl/2 for positive integers x, y with x 2 # ya. A weaker version of Hall's conjecture follows from the abe-conjecture. To see this, let d = gcd(x 2, y3), and then set a = x2/d, b = - y a / d , c = (ya _ x 2 ) / d . Then P=
H p (= piabc
• vlv
-
The abc-conjecture gives for any e > 0 that Ibl =
v31d<
c2(e)P 1+~
and
a = x2/d < c2(e)P 1+~. Multiplying these inequalities, we get
x2y3/d 2 <. c2(e)2p 2+2~
< C2(~)2 X2+2~y2+2~ ]y3 -- X212+2e d2+2~ and then
X2y3 < C2 (~)2 X2-}'2ey2+2e [y3 -- X212+2e.
206 Now we have
Ix 2 - d
i2+2, > ~
1
~-
2~ ~/1--2e.
If x =< 2y 2, then
IX2 __ y312+2, > ¢4(e) yl-6e, and therefore
Ix 2 - v31 > c5(~) yO-6,)/(2+2,). So, again for every e > 0,
IX2 -- y3 I > C6(~ ) y(1/2)--e. On the other hand, if x > 2y 2, we get i x 2 _ y3] __>y4. So a weak form of Hall's conjecture follows, namely
Ix 2 - d i > c,(,) y (1/2)-~. This has the following consequence concerning a particular elliptic equation y2 = x 3 + k,
k # 0,
called the Mordell equation. Hall's conjecture gives
Ikl= Iv= - ~ I > ~ ( ~ ) x (1/2)-'. Thus every solution to a given Mordell equation has
14 < ~?(~)Ikl2+~. Next we consider the Fermat conjecture. T h a t is, we look at the equation x n + yn = z n
where
n >=3,
gcd(x,y,z) = 1,
and
x , y , z > O.
To apply the abc-conjecture, let a = x n, b = yn, c = - z n. T h e n
P=
I I P <=xyz ~= z ~, Piabc
since z is the largest of the three integers. By the abc-conjecture, we have z " = Icl _-< C2(~)Px+c
~ C2(g)Z3+3~,
so
z n - ~ - 3 ' < c~.(,).
207
Now suppose that n > 4 and e < 1/3. Since n - 3 - 3e > 0 in this case, we get a bound on z as well as a bound on n. Thus, for sufficiently large n, Fermat's conjecture is correct. In other words, Fermat is correct with at most finitely many exceptional n. And for any given n, there are still only finitely m a n y solutions. T h e weaker Oesterl6 conjecture also has this consequence about the Fermat Conjecture. However, in t h a t case, we obtain a larger lower b o u n d on n, so we get more possible exceptions. We then also need Falting's Theorem (1983) to deal with small n. T h e following conjecture was made by Pillai (1945). Given integers A > 0, B > 0, C > 0, the equation Ax ~ - By m = C in integers x > 1, y > 1, n > 1, m > 1 and with (re, n) # (2, 2) has only a finite number of solutions. If m, n were fixed, this would be a special case of an algebraic diophantine equation, the superelliptic equation. T h e conjecture has been proved only for A = B = C = 1, by Tijdeman (1976). This special case x ~ _ ym = 1 ties in with Catalan's conjecture, namely that this equation has no solutions except 32 - 23 = 1. Tijdeman showed that all solutions have x, y, m, n < B for some explicit B. To apply the abc-conjecture to Pillal's conjecture, let d - - g c d ( A x ~, By"*, C). Set a = A x " / d , b = - B y m / d , and c = - C / d . Then P < ABCxy, and the abc-conjecture gives A x " / d , B y m / d <=c2(e)P a+~. So
max (x", ym) <=cs(A, B , C, e)(xy) '+~. Without loss of generality, x n < - - ym ,
SO
y'~ _-
If m => 3, then 1 + ( m / n ) =< 1 + (m/2) =< m,5 so (1 + (m/n))(1 + sufficiently small. T h e n
<
for e > 0
yml2 < ym-(l+(mln))(l+e) < cs(A, B , C, e) gives an u p p e r bound for ym, hence for y and for m. On the other hand, if m = 2, then n >=3 a n d ( l + ( m / n ) ) < l + ( m=/ 3 ) < 5 = gm, and the rest follows as before. A more general application is as follows. Tijdeman (1989) proved that for given non-zero integers A, B, C the diophantine equation A x n + B y m -_ C z t
208
has only finitely m a n y solutions in positive integers x > 1, y > 1, z > 1, n, m,£ subject to gcd(Ax, B y , Cz) = 1 and n -1 + m -1 +~-1 < 1. On the other hand, Hindry has shown that for each triple n, m , £ with n -1 + m -1 + £-1 => 1, there exist A, B, C such that the above equation has infinitely many solutions x, y, z with g.c.d. (Ax, By, C z ) = 1. Last, we consider S-unit equations over Q. Let K = Q and S = { o o , p l , . . . ,p~). Consider the equation x+y+z=O, where x,y, z E U s N Z
and
gcd(x,y,z)=l.
TO apply the abc-conjecture, let a = x, b = y, c = z. Then P ~ pl . . . pv
and Ix[, [Y], Iz[ =< c2(e) PI+~ The abe-conjecture may be an elusive goal. W h a t do we know in this direction? THEOREM abc-conjecture,
1.
(Stewart and Yu, (to appear)).
Under the hypothesis of the
max(lat, [b[, [cl) < e v~/3+e'/l°gl°gP This improves upon an earlier bound due to Stewart and Tijdeman (1986)). T H E O R E M 2. (Stewart and Tijdeman, 1986). In this setting the conjecture would be fMse without the e. In other words, it is not true that
lal, Ibl, lcl < CloP. More preciseIy, given 5 > O, there ,ure int~nitely many positive integers a, b, c with gcd(a,b,c) = 1 and a = b + c such that a > Pe(4-8)~, where P I-[ P. plabc We have the following generalization to n variables. Suppose al + a 2 + . . . + a ,
=0,
where the al are non-zero integers, gcd(ai, aj) = 1 for i # j , and no sub-sum vanishes. Let P=
II
P"
Plal ...an
The conjecture is that max(]al [ , . . . , [an[)< o n ( n , e)P n-2+~.
Bibliography A. Baker (1964). Rational Approximations to certain algebraic numbers. Proc. London Math. Soc. 4, 385-398. A. Baker (1968). Contributions to the theory of diophantine equations. Phil. Trans. Roy. Soc. London A263, 173-208. A. Baker and J. Coates (1970). Integer points on curves of genus 1. Proc. Camb. Phil. Soc. 68, 105-123. A. Baker and C. L. Stewart (1988). On effective approximations to cubic irrationals. New Advances in Transcendence Theory (Ed. by A. Baker). Camb. Univ. Press. H. F. Blichfeldt (1929). The minimum value of quadratic forms and the closest packing of spheres. Math. Ann. 101,665-608. E. Bombieri (1982). On the Thue-Siegel-Dyson Theorem. Acts Math. 148, 255-296. E. Bombieri (1985). On the Thue-Mahler equation. (Diophantine approximation and transcendence theory. Bonn 1985) Springer Lecture Notes 1290, 213-243. E. Bombieri. The Mordell Conjecture Revisited. (to appear). E. Sombieri and J. Mueller (1983). On effective measures of irrationality for ~ (x/-(a-~ and related numbers. J. reine angew. Math. 342, 173-196. E. Bombieri and W. M. Schmidt (1987). On Thue's equation. Inv. Math. 88, 69-81. E. Bombieri and J. Vaaler (1983). On Siegel's Lemma. Invent. Math. 73, 11-32. E. Bombieri and A. J. Van der Poorten (1988). Some quantitative results related to Roth's theorem. J. Austral. Math. Soc. (Series A) 45, 233-248. Z. I. Borevich and I. R. Shafarevich (1966). Number Theory. Academic Press. (Translated from 1964 Russian edition). J. W. S. Cassels (1957). An introduction to diophantine approximation. Tracts 45, Cambridge Univ. Press. J. W. S. Cassels (1959). An introduction to the Geometry of Numbers. Grundlehern 99. Berlin-G6ttinger-Heidelberg.
Cambridge Springer
S. Chowla (1933). Contributions to the analytic theory of numbers (II). J. Indian Math. Soc. 20, 120-128. G. V. Chudnovsky (1983). On the method of Thue-Siegel. Ann. Math. 117, 325-382. H. Davenport and K. F. Roth (1956). Rational approximations to algebraic numbers. Mathematiku 2, 160-167.
210
V. A. Dem'janenko (1974). On Tare height and the representation of a number by binary forms. Math. USSR Izv. 8, 463-476. M. Deuring (1973). Lectures on the :Theory of Algebraic Functions of One Variable. Springer Lecture Notes 314. L. G. P. Dirichlet (1842). Verallgemeinerung eines Satzes aus der Lehre yon den Kettenbr~chen nebst einigen Anwendungen auf die Theorie der Zahlen. S. B. Preuss. Akad. Wiss., 93-95. Y. Domar (1954). On the diophantine equation [Ax" - B y " I = 1, n > 5. Math. Scand. 2, 29-32. E. Dubois and G. Rhin (1976). Sur la majoration de formes lindares d coefficients algdbriques rdels et p-adiques. .Demonstration d' une conjecture de K. Mahler. C.R. Acad. Sci. Paris 282, S6rie A, 1211 F. J. Dyson (1947). The approximation to algebraic numbers by rationals. Acta Math. Acad. Sci. Hung. 9, 225-240. P. ErdSs, C. L. Stewart and R. Tijdeman (1988). many solutions. Compositio Match. 66, 37-56
Some diophantine equations with
P. ErdSs and P. Turan (1950). On the distributions of roots of polynomials. Ann. of Math. (2) 51, 105-119. H. Esnanlt and E. Viehweg (1984). Dyson's Lemma for polynomials in several variables (and the theorem of Roth). Invent. Math. 78, 445-490. J. H. Evertse (1982). On the equation ax n - by n = c. Compositio Math. 47, 288-315. J. H. Evertse (1983). Upper bounds for the numbers of solutions of Diophantine equations. Math Centrum tract 168, pp. 1-127, Amsterdam. J. Evertse (1984a). On equations in S-units and the Thue-Mahler equation. Invent. Math. 75, 561-584. J. H. Evertse (1984b). On sums of S-units and linear recurrences, Comp. Math. 53, 225-244. J . H . Evertse, K. GySry, C. L. Stewart and R. Tijdeman (1988). S-unit equations and
their applications. New Advances in Transcendence Theory. Camb. Univ. Press 110-174.
J. H. Evertse and J. H. Silverman. (1986) Uniform bounds for the number of solutions to Y " = f ( X ) . Math. Proc. Camb. Phil. Soc. 100, 237-248. G. Faltings (1983). Endlichkeitssgtze ]~r abelsche Varietgten gber ZahlkSrpern. Invent. Math. 73, 349-366.
211
N. I. Feldman (1971). An effective refinement of the exponent in Liouville's theorem (in Russian). Izv. Akad. Nauk. SSSR 35, 973-990. A. O. Gelfond (1952). Transcendental and Algebraic Numbers. transl. (1969), Dover Publications, New York.
(Russian).
English
P. M. Gruber and C. G. Lekkerkerker (1987). Geometry of Numbers. Second edition, North-Holland Mathematical Library, vol. 37. M. Hall Jr. (1971). The diophantine equation x s - y2 = k. In: Computers in Number Theory, A.O. Atkin and B.:I. Birch (eds.). Proc. Sci. Res. Council Atlas Symp. No. 2, Oxford, 1969, pp. 173-198. London: Academic Press. G. Hardy and E. M. Wright (1954). An introduction to the theory of numbers. 3rd ed. Oxford, Clarendon Press. A. Hurwitz (1891). Uber die angen~herte Darstellung der Irrationalzahlen dutch rationale Brgche. Math Ann. 39, 279-284. S. Hyyr5 (1964). Uber die Gleichung ax n - by n = c und das Catalansche Problem. Ann. Acad. Sci. Fenn Ser. AI, 355, 1-50. G.A. Kabatjanskii and V.I. Leven§tein (1978). Bounds for packings on the sphere and in space. (in Russian). Problemy Pereda~i Informacii 14, 3-25 A. Khintchine (1926). Zur metrischen Theorie der diophantisehen Approximationen. Math. Z. 24, 766-714. N. Koblitz (1977). p-adic Numbers, p-adic Analysis, and Zeta-Functions. Graduate Texts 58.
Springer
A. G. Khovansky (1981). Sur les racines complexes des systdmes d'dquations alg~briqeus comportant peu de refines. C.R. Acad. Sci. Paris 292, 937-940. S. Lang (1962). Diophantine Geometry. Interscience Publishers. D. J. Lewis and K. Mahler (1961). On the representation of integers by binary forms. Acta Arith. 6, 333-363. J. Liouville (1844). Sur des classes tr~s-gtendues de quantitds dont la irrationelles algdbriques. C. R. Acad. Sci. Paris 18, 883-885 and 910-911. H. Luckhardt (1989). Herbrand-Analysen zweier Beweise des Satzes yon Roth: polynomiale Anzahlschranken, The Journal of Symbolic Logic, 54 no. 1, 234-263. K. Mahler (1933). Zur Approximation algebraischer Zahlen L Uber den grhssten Primteiler bin~rer Formen. Math Ann. 107, 691-730. K. Mahler (1934). Zur Approximation algebraischer Zahlen. 111. Acta Math. 62, 91-166.
212 K. Mahler (1935). On the lattice points on curves of genus 1. Proc. London Math. Soc. (2) 39, 431-466. D. W. Masser (1985). Open problem~,~. Proc. Syrup. Analytic Number Th., W.W.L. Chen (ed). London: Imperial College. H. Minkowski (1896 & 1910). Geometric der Zahlen. Teubner: Leipzig u. Berlin. (The 1910 ed. prepared posthumously by Hilbert and Speiser). L. J. Mordell (1922). Note on the integer solutions of the equation Ey ~ = Ax 3 + B x ~ + Cx + D. Messenger Math. 51,169-171. L. J. Mordell (1922). On the rational solutions of the indeterminate equations of the third and fourth degrees. Proc. Camb. Phil. Soc. 21,179-192. L. J. MordeU (1934). On some arithmetical results in the geometry of numbers. Compositio Math. 1,248-253. J. Mueller (1987). Counting 8olution3 of lax r - byr[ < h. Quarterly J. Oxford (2) 32, 503-513. J. Mueller and W. M. Schmidt (19871). Trinomial Thue equations and inequalities. J. reine aug. Math. 379, 76-99 (1988). Thue's equation and a conjecture of Siegel. Acta Math. 160, 207-247. (1989). On the number of good rational approximations to algebraic numbers. Proc. A.M.S. 106, 859-866. T. NageU (1969). Quelques probl~mes relatifs aux unitds algdbriques. 115-127.
Ark. Mat. 8,
A. Ostrowski (1935). Untersuchungen zur arithmetischen Theorie der KSrper. Math. Zeit. 39, 269-404. S. S. Pillai (1945). On the equation 2 ~: - 3 ~ = 2 x + 3y. Bull. Calcutta Math. Soc. 37, 15-20. K. Ramachandra (1969). A lattice point problem for norm forms in several variables. J. Number Theory 1,534-555. D. Ridout (1958). The p-adic generalization of the Thue-Siegel-Roth Theorem. Mathematika 5, 40-48. J. Risler (1984/85). Complezitg et G~!omdtrie Rdelle. S6m. Bourbaki, no. 637, 89-100. C. A. Rogers (1964). Packing and Covering. Cambridge Tracts in Math. and Math. Phys. 54.
213 K. F. Roth (1955). Rational approximations to algebraic numbers. Mathematika 2, 1-20. S. Schanuel (1979). Heights in number fields. Bull. Soc. Math. France 107, 433-449. H. P. Schlickewei (1977a). The p-adic Thue-Siegel-Roth-Schmidt theorem. Arch. Math. 29, 267-270. H. P. Schlickewei (1977b). Ober die diophantische Gleichung xl + x2 + ... + xn = O. Acta. Arith. 33, 183-185. H. P. Schlickewei (to appear (a)). The number of subspaces occurring in the p-adic subspace theorem in diophantine approximation. J. f. d. reine und. ang. Math. H. P. Schlickewei (to appear (b)). The quantitative subspace theorem for number fields. H. P. Schlickewei (to appear (c)). An explicit upper bound for the number of solutions of the S-unit equation. H. P. Schlickewei (to appear (d)). S-unit equations over number fields. H. P. Schlickewei and A. J. Van der Poorten (1982). The growth conditions for recurrence sequences. Macquarie Univ. Math. Rep. 82-0041. North Ryde, Australia. W. M. Schmidt (1967). On heights of subspaees and diophantine approximations. Annals of Math. 85, 430-472. W. M. Schmidt (1968). Asymptotic formulae for point lattices of bounded determinant and subspaces of bounded height. Duke Math. J. 35, 327-339. W. M. Schmidt (1972). Norm form equations. Ann. of Math. 96, 526-551. W. M. Schmidt (1980). Diophantine approximation. Springer Lecture Notes in Mathematics 785. W. M. Schmidt (1985). Small zeros of quadratic forms. Trans. AMS 291, 87-102. W. M. Schmidt (1987). Thue equations with few coefficients. Transactions A.M.S. 303, 241-255. W. M. Schmidt (1989a). The Subspace Theorem in diophantine approximations. Compositio Math. 69, 121-173. W. M. Schmidt (1989b). The number of solutions of norm form equations. Transactions A.M.S. 317, 197-227. W. M. Sehmidt. Integer points on curves of genus I. Compositio Math. (to appear). T. N. Shorey and R. Tijdeman (1986). Exponential diophantine equations. Cambridge Univ. Press. C. L. Siegel (1921). Approximation algebraischer Zahlen. Math. Zeitschr. 10, 173-213.
214 C. L. Siegel (1926). The integer solutions of the equation y~ = ax" + bx n-1 + ... + k. J. London Math. Soc. 1, 66-68. C. L. Siegel (1929). Uber einige Anwendungen diophantischer Approzimationen. Abh. Preuss. Akad. d. Wiss., Math. Phys. Kl., Nr. 1 = Ges. Abh. I, 209-266. C. L. Siegel (1970). Einige ErlSuterungen zu Thue8 Untersuchungen ~ber Ann~herungswerte algebraischer Zahlen und diophantische Gleichungen. Nachr. Akad. Wiss. GSttingen, Math. phys. K1. Nr. 8, 169-195. J. H. Silverman (1981). £ower bound for the canonical height on elliptic curves. Duke Math. J. 48, 633-648. J. H. Silverman (1982). Integer points and the rank of Thue elliptic curves. Invent. Math. 66, 395-404. J. H. Silverman (1983). Integer points on curves of genus 1. J. London Math. Soc. (2) 28, 1-7. J. H. Silverman (1985). The Arithmetic of Elliptic Curves. Springer Graduate Texts in Mathematics 106. N. Stephens (1968). The Diophantine equation x a + y3 = Dz 3 and the conjectures of Birch and Swinnerton-Dyer. J. Reine Angew. Math. 231,121-162. C. L. Stewart (to appear). On the number of solutions of polynomial congruences and Thue equations. C. L. Stewart and R. Tijdeman (1986). On the Oesterld-Masser Conjecture. Monatsh. Math. 102, 251-257. A. Thue (1892). Om nogle geometr~:sk-taltheoretiske, Theoremer, Forhandl. ved de Skand. Naturforskeres 14, 352-353, in Danish; ~Yber die dichteste Zusammenstellung von Kreisen in einer Ebene, Skr. Vidensk.-Selsk., Christ. 1 (1919), 1-9. A. Thue (1909). Uber Ann~herungswerte algebrai~cher Zahlen. J. reine angew. Math. 135, 284-305. J. Thunder (to appear). Asymptotic .formulae for the number of subspaces of bounded height over number fields. R. Tijdeman (1976). On the equation of Catalan. Acta Arith. 29, 197-209. R. Tijdeman (1989). In Number Theory and Applications, ed. by R. A. Mollin, Kluwer, p. 234. J. Vaaler (1979). A geometric inequality with applications to linear forms. Pacific J. 83, 543-553. C. Viola (1985). On Dyson's Lemma. Ann. Sc. Norm. Sup. Pisa, Classe di Sc. IV, 12, 105-135.
215
P.
Vojta (to appear). Siegelb theorem in the compact case.
Annals of Math.
H. Zimmer (1976). On the di1~erence of the Well heigh~ and Ndron-Tare height. Math Zeit. 147, 35-51.
I n d e x of s o m e definitions Archimedean absolute value p. 19 base point p. 152 best approximation p. 71 Bezout's Theorem p. 148 binomial Thue equation p. 98 convex hull p. 94 covariant forms p. 169 decomposable form p. 90 degenerate module p. 18] density of lattice packing p. 7 Dirichlet's unit theorem p. 123 divisor p. 148 elliptic curve p. 152 equivalence of S-unit equations p. 133 fundamental parallelepiped p. 4 gap principle p. 57 genus p. 137, 151 Grassman coordinates p. 14 Hall's conjecture p. 203 height of a polynomial p. 23 height of a subspace p. 12, p. 33 Hermite's constant p. 9 ineffective p. 73 index of a polynomial at a point, p. 42 inhomogeneous minimum p. 70 large root p. 111 lattice p. 3 lattice packing p. 7 local parameter p. 149 Mordell equation p. 204 Mordell-Weil height p. 156 Mordell-Weil Theorem p. 156 Neron-Tate height p. 156 Newton points p. 101 Newton polygon p. 101 non-Archimedean absolute value p. 19 non-singular point p. 146 norm form p. 75 normalized form p. 81 order p. 182 Pillai's conjecture p. 205 primitive integer point p. 10 principal divisor p. 151
217
rank p. 156 rational function p. 148, 149 reduced form p. 81 Riemann-Roch Theorem p. 151 ring of multipliers p. 182 semi-discriminant p. 77 singular point p. 146 small root p. 111 successive minima p. 6 Thue inequality p. T4 Thue-Mahler equation p. 124 torsion p. 156 trinomial Thue equation p. 98 ultra-metric p. 91 Weierstrass equation p. 164 window of exponential width C p. 50 C-set p. 57 S-integer p. 123 S-unit p. 123 6-approximations 58 "y-set p. 57