_ R. As in Chapter 7 we find that Modification 1
$R \le R^2/q < r < 3H.$
This modification ensures that the factor R/q, which occurs in some error terms on the minor arcs, is less than unity.
Major and minor arcs
169
Modification 2 If $q > 16R$, then there are many rational numbers in $J_j$ with denominators near $q$ in order of magnitude. Let $Q$ be the next power of 2 following $r$. As explained in Chapter 6, Lemma 1.6.3 gives us at least $Q^2/16R^2$ rationals, each with denominator $q \le 32r < 96H$. We average over the approximations corresponding to these rationals, indexing the minor arc approximations by the rational $a/q$ rather than the index $j$. We use the notation $J(a/q)$ for a minor arc with the choice $a/q$ of rational approximation to $f'(x)$.
Modification 3
Suppose that all rational points in $J_j$ have $q \ge 3H$. Let $x$ be the midpoint of $J_j$. By Dirichlet's approximation theorem (Lemma 1.5.1) there are integers $b$ and $r$ with
\[ |rx - b| < \frac{1}{3H}, \qquad r < 3H - 1. \]
Then
\[ \frac{1}{2R^2} \le \left|x - \frac{b}{r}\right| \le \frac{1}{3Hr}. \]
Hence $r \le 2R^2/3H$, and the Farey arc $J(b/r)$ is a major arc. We assign the interval $J_j$ to the Farey arc $J(b/r)$. The major arcs are thus unions of consecutive intervals $J_j$, with total length
\[ \le \frac{2}{3Hr} + \frac{1}{R^2} \le \frac{5}{3Hr}. \]
We write $U(a/q)$ for the union of the arcs $U_j$ for which $J_j$ lies in $J(a/q)$, and $I(a/q)$ for the intersection of $U(a/q)$ with the interval $[M, M_2]$. We take the centre of the Farey arc $U(a/q)$ to be the integer $m = m_j$ within $U(a/q)$ for which $f'(m)$ is closest to $a/q$, so that
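\[ f'(m) = a/q + \Delta, \qquad (8.1.9) \]
with
\[ |\Delta| \le C_3T/2M^3 \asymp 1/NR^2 \ll 1/N^2. \qquad (8.1.10) \]
For this integer $m$ we put
\[ \mu = \mu_j = f''(m)/2. \]
It is convenient to write
\[ \Delta = -2\mu\nu. \]
We note that
\[ \mu \asymp 1/NR^2, \qquad |\nu| \le 1 + C_2C_3/M < 2, \]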
since we can suppose that $M$ is sufficiently large.
Lemma 8.1.1 (Counting Farey arcs) For positive $Q$, let $W(Q)$ be the number of Farey arcs $J_j$ on which $q_j$ lies in the range $Q \le q < 2Q$ (for some choice of $q_j$ when Modification 2 operates). For major arcs,
\[ \frac{N(R-1)^2}{4C_1C_2M} \le Q \le \frac{R^2}{H}, \qquad (8.1.11) \]
The exponential sum for the lattice point problem
170
and we have
\[ W(Q) \ll (M_2 - M)Q^2/NR^2 + 1. \qquad (8.1.12) \]
For minor arcs,
\[ R/2 < Q \le 96H, \qquad (8.1.13) \]
and we have
\[ W(Q) \ll (M_2 - M)R^2/NQ^2 + \min(R^2/Q,\ M/NQ). \qquad (8.1.14) \]
The implied constants are constructed from $C_1$ and $C_2$.
Proof This lemma is the analogue of Lemma 7.2.2. The term $+1$ in (8.1.12) also covers the case when there is a major arc whose centre is outside the range $M \le m \le M_2$. □
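Dirichlet's approximation theorem, quoted in Modification 3, can be checked numerically. The following is a minimal sketch under stated assumptions, not the book's construction: the helper name `dirichlet_approximation` and the brute-force search over denominators are chosen for illustration only, and exact rational arithmetic is used so that the strict inequality $|rx - b| < 1/n$ is tested without rounding error.

```python
from fractions import Fraction

def dirichlet_approximation(x, n):
    """Return (b, r) with 1 <= r <= n minimising |r*x - b| (hypothetical helper)."""
    x = Fraction(x)
    best_b, best_r, best_err = None, None, None
    for r in range(1, n + 1):
        b = round(r * x)              # nearest integer to r*x
        err = abs(r * x - b)
        if best_err is None or err < best_err:
            best_b, best_r, best_err = b, r, err
    return best_b, best_r

# Example with an arbitrary rational x and bound n = 20.
x = Fraction(314159, 100000)
b, r = dirichlet_approximation(x, 20)
```

In the text the theorem is applied with a bound of size about $3H$ and with $x$ the midpoint of an interval $J_j$; the sketch only illustrates the existence statement.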
8.2 THE MAJOR ARCS ESTIMATE
We treat the major arcs by Poisson summation and direct estimation. We consider the sum
\[ \sum_{h=H}^{H_2} \sum_{\substack{n \\ m+n \in I}} e(hf(m+n)), \]
where $I = I(a/q)$ is a major arc, centre $m$. Writing
\[ f(m+x) = f(m) + ax/q + g(x), \]
we have
\[ g'(x) = f'(m+x) - a/q = f'(m+x) - f'(m) - 2\mu\nu = x f''(m+\theta x) - 2\mu\nu \]
for some $\theta$ in $0 < \theta < 1$. Hence for $m+x$ in $I$
\[ g'(x) \ll \frac{1+|x|}{NR^2} \ll \frac{1}{NR^2}\cdot\frac{NR^2}{Hq} \ll \frac{1}{Hq}, \]
$g''(x) = f''(m+x) \ll 1/NR^2$. These orders of magnitude still hold if we allow $n$ to range wider so that $m+n$ can go up to $2N$ beyond the endpoints of $I(a/q)$ on either side. This allows us to use the same technical device as in the last chapter: on a minor arc $I_k$ we actually use the Taylor expansion corresponding to $I_{k\pm 2}$ so that $n$ is bounded away from zero. This saves a subdivision into cases according to the order of magnitude of $n$. The integer $n$ runs through an interval $I$ consisting of a major arc with one or both endpoints shifted by $N$ or $2N$. We can write the sum as
\[ \sum_{h=H}^{H_2} \sum_{n=N_1}^{N_2} e\Bigl(hf(m) + \frac{ahn}{q} + hg(n)\Bigr), \]
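where $N_1$ and $N_2$ are the limits of summation for which $m+n$ is in $I$. We have
\[ \sum_{n=N_1}^{N_2} e\Bigl(\frac{ahn}{q}\Bigr) e(hg(n)) = \sum_{\substack{q(A-1) \le l \le q(B+1) \\ l \equiv -ah \ (\mathrm{mod}\ q)}} \int_{N_1}^{N_2} e\Bigl(hg(x) - \frac{lx}{q}\Bigr)\,dx + O(\log(B - A + 2)), \]
where
\[ A = hg'(N_2), \qquad B = hg'(N_1), \]
with
\[ A, B \ll h/Hq \ll 1/q. \]
The sum is thus
\[ \sum_{\substack{|l| \le q + O(1) \\ l \equiv -ah \ (\mathrm{mod}\ q)}} \int_{N_1}^{N_2} e\Bigl(hg(x) - \frac{lx}{q}\Bigr)\,dx + O(1). \]
Since $hg'(x) \ll 1/q$, we can use the First Derivative Test (Lemma 5.1.2) to give
\[ \int_{N_1}^{N_2} e\Bigl(hg(x) - \frac{lx}{q}\Bigr)\,dx \ll \frac{q}{|l|}, \]
except for bounded values of $l$, where the Second Derivative Test (Lemma 5.1.3) gives the bound
\[ \ll \sup \frac{1}{\sqrt{h|g''(x)|}} \ll R\sqrt{\frac{N}{H}}. \]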
As $h$ runs from $H$ to $H_2$, the integer $l$ takes each value mod $q$ at most $O(H/q)$ times, and so
\[ \sum_{h=H}^{H_2} \sum_{n=N_1}^{N_2} e(hf(m+n)) \ll \frac{H}{q}\Bigl(q\log q + R\sqrt{\frac{N}{H}}\Bigr) \ll \frac{R}{q}\sqrt{NH}\,\log H. \]
Lemma 8.2.1 (The major arcs contribution) The major arcs contribute
\[ \ll \frac{(M_2 - M)R}{\sqrt{HN}}\log H + \min\Bigl(R\sqrt{HN}\,\log H,\ \frac{M}{R}\sqrt{\frac{H}{N}}\,\log H\Bigr) \]
to the sum (8.1.1).
Proof We divide the major arcs into ranges $Q \le q < 2Q$, where $Q$ is a power of 2. In the notation of Lemma 8.1.1, the major arcs contribute
\[ \ll \sum_Q W(Q)\,\frac{R}{Q}\sqrt{NH}\,\log H, \]
where $Q$ runs through powers of 2 in the range (8.1.11). Summing geometrical progressions gives the lemma. □
As in the last chapter, we can treat all arcs as major arcs. If $R \ge H/4$, then minor arcs have $q \le 3H \le 3H(4R/H)^2 = 48R^2/H$. Then (8.2.1) holds with
N2 -N1
H/4<-R
S«
(Mz -M)R log H+ min (HN)
M
(1 M2 l H3T« M2-M M + min
H , HT
J,
H
(_j)logH.
The implied constant is constructed from $C_1$ and $C_2$.

8.3 POISSON SUMMATION ON THE MINOR ARCS
After Lemma 8.2.2 we can assume that $R < H/4$. If $a/q$ and $b/r$ are two consecutive major arcs, then the rationals $(at + bu)/(qt + ru)$ with $(t, u)$ equal to $(1,1)$, $(1,2)$, $(1,3)$, $(2,1)$, and $(3,1)$ have
\[ R^2/H < qt + ru < 4R^2/H < R, \]
so they correspond to distinct minor arcs (on which Modification 1 applies). Thus any two major arcs are separated by at least five minor arcs. If $J_j$ is a minor arc, then $J_{j+2}$ and $J_{j-2}$ cannot both be major arcs or the flanks of major arcs. If, for example, $J_{j-2}$ is a minor arc, corresponding to the rational number $a/q$ and centre $m$, then we use the approximating polynomial about $m$ on the arc $I_j$. The variable $n$ runs through a range $N_1 \le n \le N_2$ with
\[ N_2 - N_1 < N, \qquad N_1 \ge N, \qquad N_2 < 3N. \]
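The five intermediate rationals $(at + bu)/(qt + ru)$ can be illustrated on a small example. This is a sketch under an assumption: "consecutive" is read here as adjacency in a Farey sequence, that is $bq - ar = 1$, and the function name `mediants` is hypothetical. Under that assumption each listed fraction is automatically in lowest terms and lies strictly between $a/q$ and $b/r$.

```python
from fractions import Fraction
from math import gcd

PAIRS = ((1, 1), (1, 2), (1, 3), (2, 1), (3, 1))

def mediants(a, q, b, r, pairs=PAIRS):
    """Numerator/denominator pairs (a*t + b*u, q*t + r*u) for the listed (t, u)."""
    return [(a * t + b * u, q * t + r * u) for t, u in pairs]

# Assumed example: 1/3 and 1/2 are adjacent, since 1*3 - 1*2 = 1.
a, q, b, r = 1, 3, 1, 2
fractions_between = mediants(a, q, b, r)
# fractions_between == [(2, 5), (3, 7), (4, 9), (3, 8), (4, 11)]
```

The denominators $qt + ru$ are what the text compares with $R^2/H$ and $4R^2/H$; the sketch does not model those size conditions.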
Lemma 8.3.1 (Polynomial approximation) In the notation above
\[ \sum_{h=H}^{H_2} \sum_{n=N_1}^{N_2} e(hf(m+n)) \ll \max \Bigl| \sum_{h=H_3}^{H_2} \sum_{n=N_3}^{N_2} e(hg(n)) \Bigr|, \]
where the maximum is over $H_3 \ge H$ and $N_3 \ge N_1$, and
\[ g(x) = f(m) + \frac{ax}{q} + \mu x^2. \]
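Proof We write
\[ k(x) = f(m+x) - g(x). \]
Then, for some $\theta$, $\varphi$ in $0 < \theta, \varphi < 1$,
\[ k(x) = -2\mu\nu x + \frac{x^3}{6} f^{(3)}(m + \theta x) \ll \frac{N}{NR^2} + \frac{N^3}{MNR^2} \ll \frac{1}{H}, \]
and
\[ k'(x) = -2\mu\nu + \frac{x^2}{2} f^{(3)}(m + \varphi x) \ll \frac{1}{NR^2} + \frac{N}{MR^2} \ll \frac{1}{HN}, \]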
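by (8.1.5) and (8.1.7). By partial summation in two dimensions (Lemma 5.2.2) we have
\[ \Bigl| \sum_{h=H}^{H_2} \sum_{n=N_1}^{N_2} e(hf(m+n)) \Bigr| \le C \max \Bigl| \sum_{h=H_3}^{H_2} \sum_{n=N_3}^{N_2} e(hg(n)) \Bigr|, \]
where
\[ C \ll \Bigl(1 + \frac{H}{H}\Bigr)\Bigl(1 + \frac{HN}{HN}\Bigr) + \frac{HN}{HN} \ll 1. \qquad \square \]
Taking moduli in this way means that we cannot expect any estimate better than
\[ \frac{M}{N}\sqrt{HN} \ll \frac{HM}{\sqrt{HN}} \]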
from the minor arcs. Lemma 8.3.2 (Poisson summation over n) In the notation above H2 N2
E F, e(hg(n)) _ H3 N3
2
F,
F
Vi-,i-
lmodq h=H3 1=- -a/h,D351/hsD2
+OI R (HN)logH), where
$D_i = 2\mu qN_i$, and the square root is taken with positive real part.
el hf(m) `
2
4hµq 2 )
Proof For fixed h we write
/3=hµ. By (8.1.3) and (8.1.8) we have
/3NSHµN5C2HNT/M35'and by (8.1.7) we have H << R2. By the truncated Poisson summation formula (Lemma 5.4.3), we have l N E e(agn + pn2) - !a)E a+' JN32e(- q + /3x2) dx q(A
N3
q(
!= -ah(modq)
a)
+ O(log(B -A + 2)), where B = 2 /3N2 = hD2/q.
A = 2 /3N3 = hD3/q,
We note that the sum over I has at most one term, since /3N < 4, and that the error term is O(log(2 + /3 N)), which is bounded. We write xh1
$= \dfrac{l}{2\beta q} = \dfrac{l}{2h\mu q}.$
Then
\[ \beta x^2 - lx/q = \beta(x - x_{hl})^2 - \beta x_{hl}^2. \]
There are now two cases, depending on the sign of $\mu$. We treat the case when $\mu$ (and $\beta$ also) is positive. If
N3+1/,5x,i,SN2-11F then we argue as in the proof of Lemma 5.5.2, writing f Nee( /3(x -x21)2 N3 2 e(-/31hl
- Gx21) dx 1
a
(JN+
f ')e(/3(x-xh1)2-dx
(1 N2
_
V-i-
e(-/3xh1)+0I
Nx21
+ xh1
(8.3.1)
- Ns
by the First Derivative Test. When µ is negative, then we still have (21131 In the ranges
N35xh1
N2-1//5xh1SN2,
we have
fNZe(/3(x -x21)2 - ax21) dx N3
2/3i
e(-Rx21) «
1
(8.3.2)
by the Second Derivative Test. In the ranges
N2 sxhl
N3 - 1/ xhl
j
e( /3(x -xhl)2
N3
- /3x21) dx «
(8.3.3)
by the Second Derivative Test, and in the ranges xhr
xh,>N2+1/vP
we have 1
13x',) dx <<
fN32e(a(x
1
1
(8.3.4)
a INZ -xh1l + lxhr -Nil
by the First Derivative Test.
In the error terms we note that for $l$ fixed, $h$ lies in some fixed residue class mod $q$. If $g \equiv h \pmod q$, then
I(g - h) Ixgi -xh11
1
>> H2µ
Nq
xH.
(8.3.5)
Since H << N and R << q, we find that Ixgl - xhIl >> R
I
N
-
When we sum h over a residue class modulo q, then we use (8.3.2) and (8.3.3) a bounded number of times, and (8.3.1) and (8.3.4) at a sequence of values xhl separated by a distance (8.3.5). The error terms sum to <<
E t
1+ VN
1
t«H/q t
I
g=h
-
Ixg, -xh,I
«E log H << gNjlogH<< Rq I
i
(HN)logH.
Lemma 8.3.3 (Poisson summation over h) In the notation above HZ
N2
E E e(hg(n)) h=H3 n=N3 L2
=E E 1=L,
x2(r)
1
(4µq(k - K))
+Ol R (HN)logN),
e
k-K Nq
iikl - abl q - q
q
q
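where
\[ L_1 = 2\mu qH_3N_3 \ge 2, \qquad L_2 = 2\mu qH_2N_2, \qquad (8.3.6) \]
and
\[ K_1(l) = \mu q \max\Bigl(N_3^2,\ \frac{l^2}{L_2^2}N_2^2\Bigr) \ge 64, \qquad (8.3.7) \]
\[ K_2(l) = \mu q \min\Bigl(N_2^2,\ \frac{l^2}{L_1^2}N_3^2\Bigr), \qquad (8.3.8) \]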
and the integer b and the real number K satisfy q
1
3
IKISmin(2 +O(H4).
qf(m)=b+K,
Proof In Lemma 8.3.2 we write H4 =H4(l) = max(H3,l/D2),
H5 =H5(I) = min(H2,1/D3). For each 1 the sum over h takes the form (
H45h5H5
h- -al We can take out the factor
I`
i
2h
1
Je
(h(b + K)
-
12
4hµg2
q
which is independent of h. We apply the inversion step (Lemma 5.5.3) with h = qn-al,
f(n =KnalKq--- 4Iaq 2 (qn-al) 12
M replaced by H/q, and T replaced by I2/4µg2H. When µ is positive, then f" is negative, and H5
1
E
(2hµ)
e(
qKh -
h - -a!
e(f(xk)-kxk-$) (2µ(gxk-al)) f"(xk)l 1
A
s
(H./(.12H\ +O
1
H)
q
2J + lo(B -A + 2)
(8.3.10)
where the endpoints A and B are given by + al
A=f'
12
= K+ max IzqHs2
R 12
D3
12
1
=K+ 4
4AqHzz'4M R )
ll
=K+max Lz µRNz,AgN3K1(l)+K, z
in the notation of this lemma and the preceding lemma. Similarly $B = K_2(l) + \kappa$.
Changing the limits of summation in the inversion step by a bounded distance $\kappa$ does not increase the order of magnitude of the error term. We take the sum over $k$ between $K_1(l)$ and $K_2(l)$. The stationary phase integral in the Poisson summation formula is part of
the contour integral for a Bessel function, Hi_J/2(x) in the notation of Jeffreys and Jeffreys (1962, article 21.02). In the main term Xk is given by 12
K+
4µq ( qxk - al )
2 =k,
(qxk - al)2 =12/4lcq(k - K). Thus alk
12
f(xk )-kx k= (k - K)x -q -- 4µg2(gxk-al) k al
=-(k-K) xk --q =-1 V(LjLq
akl
12
q
4µq2
(4µq(k - K)) 1
akl
K
q
3
The weight function is 1
2 µ( qxk - al )3
211(gxk-al)
12
1
(2µ(gxk-al)If"(xk)I)
1
12
12
4µq(k - K)
1
(4µq(k - K))
The factor $\sqrt{i}$ is cancelled by $e(-\tfrac18)$ when $\mu$ is positive, and the corresponding factor is cancelled by $e(\tfrac18)$ when $\mu$ is negative.
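The factor $\sqrt{i} = e(1/8)$ comes from the classical Fresnel evaluation $\int_{-\infty}^{\infty} e(\beta x^2)\,dx = e(1/8)/\sqrt{2\beta}$ for $\beta > 0$, where $e(t) = \exp(2\pi i t)$. The sketch below checks this numerically with a plain midpoint rule on a truncated range; the truncation point, the step count, and the function names are assumptions made for illustration, and the truncated integral differs from the limit by a tail of size about $1/(2\pi\beta X)$.

```python
import cmath
import math

def e(t):
    """e(t) = exp(2*pi*i*t)."""
    return cmath.exp(2j * math.pi * t)

def fresnel_integral(beta, X, steps=400_000):
    """Midpoint-rule approximation to the integral of e(beta*x^2) over [-X, X]."""
    h = 2.0 * X / steps
    total = 0j
    for i in range(steps):
        x = -X + (i + 0.5) * h
        total += e(beta * x * x)
    return total * h

approx = fresnel_integral(1.0, 20.0)
limit = e(1.0 / 8.0) / math.sqrt(2.0)   # equals (1 + i)/2 for beta = 1
```

Multiplying `limit` by $e(-1/8)$ leaves the positive real number $1/\sqrt{2\beta}$, which is the cancellation described in the text for positive $\mu$.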
In the error term we have
H 9
« Hl
µgzH 1
l2
V
µH) «
qR z
H)
RH <<
1VR2
Nq
<< 1.
The assumptions (8.1.3), (8.1.4), and (8.1.6) give
KIM >
qH qN2
> RZ - 6R 2
C2M3
2qHN
L1>2µgHN>
NR2
> 64,
>2.
The error term in (8.3.10) is
«
log N
V(HM)
We conclude the proof by replacing the factor (8.3.9) and summing over $l$. The range of summation for $k$ is non-empty only when $L_1 < l \le L_2$. The error term sums over $l$ to $\ll$
pgHNlogN q << R (HN) log N, (Hµ)
0
which absorbs the error term of Lemma 8.3.2.
8.4 THE LARGE SIEVE ON THE MINOR ARCS
We work backwards from the large sieve, in the hope of reducing the number of confusing multiple summations.
Lemma 8.4.1 (The large sieve) Let $a(k)$ and $b(j)$ be coefficients of modulus at most unity, indexed by the integer vectors $k = (k_1, k_2, l_1, l_2)$ with entries in some range $K \le k_i < 2K$, $L \le l_i < 2L$, and by minor arcs $J_j = J(a/q)$ with $a/q = a_j/q_j$, with denominators in some range $Q \le q < 2Q$. Let
\[ x^{(k)} = \Bigl(l_1k_1 - l_2k_2,\ l_1\sqrt{k_1} - l_2\sqrt{k_2},\ l_1 - l_2,\ \frac{l_1}{\sqrt{k_1}} - \frac{l_2}{\sqrt{k_2}}\Bigr), \]
\[ y^{(j)} = \Bigl(-\frac{a}{q},\ -\frac{1}{\sqrt{\mu q^3}},\ -\frac{ab}{q},\ \frac{\kappa}{2\sqrt{\mu q^3}}\Bigr), \]
where $b$, $\kappa$, $\mu$ come from the coefficients of the approximating polynomial on $J(a/q)$ as in Lemmas 8.3.1 and 8.3.3, and the values of $K$, $L$, and $Q$ are consistent with the construction above. Let $V \ge 1$ be a real number. Then
\[ \Bigl| \sum_k \sum_j a(k)b(j)\,e(x^{(k)} \cdot y^{(j)}) \Bigr|^2 \ll ABKL^4R^2NV/Q^3, \]
where A and B are the number of coincident pairs among the vectors x(k) and y(>>, respectively. Two vectors x(k) form a coincident pair if their corresponding entries differ by numerically less that 2
"iI)i
1, R
Two vectors y(>) form a coincident pair if their corresponding entries differ by numerically less than 1
1
1
FK
8KLV' 4L (2K) ' 4L ' 8L The result is still true for a family of sums Si of the form (8.1.1), taken over subintervals (depending on i) of H< h < 2H, and of M < m < 2M, formed with functions fi(x) satisfying (8.1.3) uniformly in i, when we replace b(j) by Ni, j), y(J) by y('°>), and the sum over j by a sum over i and j. The sequence of rational numbers aj/qj corresponding to the minor arcs may also depend on i.
Proof This is Lemma 5.6.6 with
X= (8KLV, 4L (2K) , 8L,16L/V), and 1
Y= 1,2max
(µq3)
1
,1,max
.
(l-+-q3)
We can take $Y_1 = Y_3 = 1$ since the rational numbers $a/q$ and $ab/q$ are only defined modulo one. By (8.1.3) and (8.1.4) we have
2) 3) <
3T ( Q3T
Q
3
We can suppose that K, L, and Q have the orders of magnitude of the sums in Lemma 8.3.3, so that R<< Q<< H,
Kx RQ,
Lx
2
After verifying that
L V11
r NR2) I` Q3
»
R4 NQ Q3 >> Q » H (H2Q2 R2 NR2
we deduce that X;Y + 1 << X;Y
for each i =1, ... , 4.
(8.4.1)
The numbers $k_i$, $l_i$ in Lemma 8.4.1 lie in fixed intervals; we actually require the range for $l$ to vary with the minor arc $J_j$, and that for $k$ to vary both with $j$ and with $l$. Moreover, the exponent $x^{(k)} \cdot y^{(j)}$ is not quite the same as in Lemma 8.3.3. We tackle these difficulties one by one.
As in the last chapter we use Lemma 5.2.3 to relate the sum over an arbitrary subinterval to sums over the whole interval. There is an alternative treatment using Rademacher's idea of decomposing the interval `in the 2-adic integers', that is, in terms of the binary digits of the endpoints. The combinatorics are intricate. This idea explains the logarithm factor that appears when we use Lemmas 5.2.3 or 5.2.4; it is the number of binary digits in the numbers involved. Rademacher's idea was combined with the large sieve by Uchiyama (1972).
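The remark about binary digits can be made concrete. The following is a minimal sketch of one way to split an integer interval into aligned dyadic blocks; it is an illustration only, not the construction of Lemma 5.2.3 or of Uchiyama's paper, and the function name `dyadic_blocks` is hypothetical. The number of blocks is at most twice the number of binary digits of the upper endpoint, which is the source of the logarithm factor mentioned in the text.

```python
def dyadic_blocks(a, b):
    """Split the integers a <= n < b (with 0 <= a) into aligned dyadic blocks.

    Each block has the form [m * 2**t, (m + 1) * 2**t) for integers m, t >= 0.
    """
    blocks = []
    while a < b:
        # Largest power of two dividing a; for a == 0 any power of two is allowed.
        size = (a & -a) if a > 0 else 1 << b.bit_length()
        while a + size > b:          # shrink until the block fits inside [a, b)
            size >>= 1
        blocks.append((a, a + size))
        a += size
    return blocks

blocks = dyadic_blocks(5, 22)
# blocks == [(5, 6), (6, 8), (8, 16), (16, 20), (20, 22)]
```

The block sizes first increase and then decrease, so a sum over an arbitrary subinterval is expressed through a logarithmic number of sums over complete dyadic ranges.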
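Lemma 8.4.2 (Large sieve over subintervals) Let $d(k, l)$ be any coefficients of modulus at most unity, and let
\[ g(j, k, l) = d(k, l)\,e\Bigl(kl\,y_1^{(j)} + l\,y_3^{(j)} + l\sqrt{k}\,y_2^{(j)} + \frac{l\,y_4^{(j)}}{\sqrt{k}}\Bigr). \]
Let $L_3(j)$ and $L_4(j)$ be sequences indexed by the minor arcs $J_j$ with $Q \le q_j < 2Q$, satisfying
\[ L \le L_3(j) \le L_4(j) < 2L \]
for some fixed powers of 2: $L$ and $Q$. Then we have
\[ \Bigl( \sum_j \Bigl| \sum_{k=K}^{2K-1} \sum_{l=L_3(j)}^{L_4(j)} g(j, k, l) \Bigr|^2 \Bigr)^2 \ll \frac{ABKL^4NR^2V}{Q^3} \log^4 N. \]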
Proof We apply Lemma 5.2.3 to the sum over 1, so that L4(j)
2K-1
2L-1 2K-1
1/2
E g(1, k, l) < f- 1/2 K(x)
L
1=L3(j) k=K
E L g(j,k,l)e(lx) L
K
with
K(x) = min(L, 1/2x). By repeated use of Cauchy's inequality we have 2 2
2K-1 L4(j)
F, K
5
E g(J,k,l)
L3(j)
I
f 1/2 K(x) dx) f 1/2 K(x)
1
1/2
1/2
2K-1 2L-1 L g(j,k,l)e(lx) L K
2
2K-1 2L-1 1/2
K(x)dx1 3f 1/2 K(x) 1/2
L g(j,k,l)e(lx) j
K
L
2
dx 2 2
dx.
Multiplying $g(j, k, l)$ by $e(lx)$ corresponds to changing $y_3$ to $y_3 + x$ in Lemma 8.4.1. Since all the $y$ vectors are changed in the same way, this does not affect $B$, the number of coincident pairs. The result follows from Lemma 8.4.1 and the estimate from Lemma 5.2.3:
\[ \int_{-1/2}^{1/2} K(x)\,dx \le 1 + \log L \ll \log N. \]
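The value $1 + \log L$ can be confirmed numerically. The sketch below assumes that the kernel written $\min(L, 1/2x)$ in the proof means $\min(L, 1/(2|x|))$ on $-1/2 \le x \le 1/2$; with that reading the integral splits as $2\bigl(L \cdot \tfrac{1}{2L} + \int_{1/2L}^{1/2} \tfrac{dx}{2x}\bigr) = 1 + \log L$. The function names and the midpoint rule are illustrative choices.

```python
import math

def kernel(x, L):
    """K(x) = min(L, 1/(2|x|)), an assumed reading of the text's min(L, 1/2x)."""
    return L if x == 0 else min(L, 1.0 / (2.0 * abs(x)))

def kernel_integral(L, steps=200_000):
    """Midpoint-rule approximation to the integral of K over [-1/2, 1/2]."""
    h = 1.0 / steps
    return h * sum(kernel(-0.5 + (i + 0.5) * h, L) for i in range(steps))

value = kernel_integral(64)
exact = 1.0 + math.log(64)
```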
Lemma 8.4.3 (Partial sums by Mellin transforms) In the notation above
\[ \sum_j \Bigl| \sum_{l=L_1(j)}^{L_2(j)} \sum_{k=K_1(j,l)}^{K_2(j,l)} g(k, l) \Bigr| \ll WL \log K + \Bigl( \frac{ABKL^4NR^2VW^2}{Q^3} \Bigr)^{1/4} \log^2 N, \]
where the sum is over approximating polynomials used on minor arcs with $q_j$ in the range $Q \le q_j < 2Q$, and $W$ is the number of such polynomials: $W = W(Q)$ when Modification 2 is not used, $W \ll Q^2W(Q)/R^2$ when Modification 2 is used. The limits of summation for $k$ and $l$ are those of Lemma 8.3.3; the argument $j$ has been added to indicate the dependence on the approximating polynomial. Here $K$ and $L$ are powers of 2 whose order of magnitude is given by (8.4.1).
Proof The formula (8.3.7) for $K_1(j, l)$ changes at
\[ l = L_3(j) = L_2(j)N_3(j)/N_2(j) = 2\mu_jq_jH_2(j)N_3(j), \]
and the formula (8.3.8) for $K_2(j, l)$ changes at
\[ l = L_4(j) = 2\mu_jq_jH_3(j)N_2(j). \]
Both $L_3(j)$ and $L_4(j)$ lie between $L_1(j)$ and $L_2(j)$, separating the range of summation for $l$ into three subintervals of the form $L_5(j) \le l \le L_6(j)$. On each subinterval, the range for $k$ is $K_5(j, l)$ to $K_6(j, l)$, where $K_5(j, l)$ is one of the two terms in (8.3.7), and $K_6(j, l)$ is one of the two terms in (8.3.8). Since $L_1(j)$, $L_2(j)$, $K_1(j, l)$, and $K_2(j, l)$ have order of magnitude given by (8.4.1), we can cover the ranges for $k$ and $l$ by a bounded number of intervals of the form $K \le k < 2K$ and $L \le l < 2L$, where $K$ and $L$ are powers of 2. A typical section of the sum has $l$ running from $L_7(j)$ to $L_8(j)$, with
\[ L_7(j) = \max(L, L_5(j)), \qquad L_8(j) = \min(2L - 1, L_6(j)), \]
and $k$ running from $K_7(j, l)$ to $K_8(j, l)$ with limits given by
\[ K_7(j, l) = \max(K, K_5(j, l)), \qquad K_8(j, l) = \min(2K - 1, K_6(j, l)). \]
We write
\[ G_j = \sum_{l=L_7(j)}^{L_8(j)} \sum_{k=K_7(j,l)}^{K_8(j,l)} g(k, l). \]
In Lemma 5.2.4 we have L8(i)
g(k, 1) 1 K6(j,l)s-K5(j,l)s Gi=E 2K-1 E ks )ds+o(LlogK). 12ari f l-L7(i) s k=K 1
Il
C
Let $c(j)$ be coefficients of unit modulus for which $c(j)G_j$ is real and positive. Then either
\[ \sum_j |G_j| = \sum_j c(j)G_j \ll WL \log K, \qquad (8.4.2) \]
1
1
or, by Cauchy's inequality, 2
E c(j)G1 i
<
(f Ids C
L
fc
S
i
2K-1 L8(j) g(k, 1) s c(j) E F, ks K6(j,l) K
L7(i)
2
dsl S
(8.4.3)
or the corresponding inequality with $K_6(j, l)$ replaced by $K_5(j, l)$. We have
\[ \int_C |ds/s| \ll \log K \ll \log N. \]
In the second integral on the right of (8.4.3), the modulus squared within the integrand can be estimated as
2K-1 2L-1 g(k,l)
EE K
K6(j,l)
ks
L
2 s
When
\[ K_6(j, l) = \mu_jq_jN_2(j)^2 \]
in (8.3.8), then we take a factor $(K_6(j, l)/K)^s$ of bounded modulus outside the sum over $k$ and $l$. When
\[ K_6(j, l) = \frac{\mu_jq_j\,l^2N_3(j)^2}{L_1(j)^2} = \frac{l^2}{4\mu_jq_jH_3(j)^2} \]
then we take out a factor $(L^2/\mu_jq_jH_3(j)^2K)^s$ of bounded modulus. The sum is now in the right form for Lemma 8.4.2 with $d(k, l)$ replaced by
\[ d(k, l)(K/k)^s, \quad\text{or}\quad d(k, l)(Kl^2/4kL^2)^s. \]
Thus 2
c(j)GG
<<
112
ABKL4NR2V 1
Q3
WlogeN.
(8.4.4)
I
Since either (8.4.2) or (8.4.4) must hold, we have the result. □
Now we have the ranges of summation correct, but not the exponent. There is no point in the argument at which we can use Lemma 5.2.2 without destroying the special form of $K(j, l)$ which is used in the last lemma. The method of last resort is to expand in power series.
Lemma 8.4.4 (Large sieve with gardening) In the notation of Lemmas 8.3.3 and 8.4.3,
\[ \sum_j \Bigl| \sum_{h=H_3}^{H_2} \sum_{n=N_3}^{N_2} e(hg_j(n)) \Bigr| \ll \frac{QW}{R}\sqrt{HN}\,\log N + \frac{R^2}{Q}\Bigl( \frac{ABKL^4NR^2VW^2}{Q^3} \Bigr)^{1/4} \log^2 N. \]
The corresponding result for a family of sums Si is H2
E
N2(j)
E e(hgij(n))
h=H3 n=N3(j)
HNQ2VW R2
<<
r R2 ABKL4NR1°VW212
loge N + Q
(
QZ
logo N,
where all the minor arcs with Q< q < 2Q for all the sums S; are counted in B and W.
Proof The exponential in Lemma 8.3.3 has the form
\[ e\bigl(kl\,y_1 + l\,y_3 + l\,y_2\sqrt{k - \kappa}\bigr). \]
Since $k \ge 32$, we can expand $(1 - \kappa/k)^{1/2}$ by the binomial series. The first two terms in the expansion give the required
\[ l\,y_2\sqrt{k} + l\,y_4/\sqrt{k}, \]
which we retain inside the exponential. There is a similar factor $1/\sqrt{1 - \kappa/k}$ in the weight function. We expand
\[ \Bigl(1 - \frac{\kappa}{k}\Bigr)^{-1/2} e\Bigl( \frac{l\,k^{1/2}}{\sqrt{\mu q^3}} \Bigl( \frac{\kappa^2}{8k^2} + \dots \Bigr) \Bigr) \]
as a double power series
\[ P(x, y) = \sum_r \sum_s b_{rs}\,x^r y^s \]
in the two variables
\[ x = \kappa/k, \qquad y = \kappa^2 l/\sqrt{\mu q^3k^3}. \]
We note that in Lemma 8.3.3 we have P
12
k_- min
16µ9H2,
(
qN2),
1
so that IK12
<_
IYI
,
I ( 16µgH2k) 5 4H
V
4H
4HR2
1
k9 5µ92N2 5 NQ2 S 16
pg3k3
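The double series converges geometrically, and
\[ \frac{x^r y^s}{\sqrt{4\mu qk}} \quad\text{is}\quad \frac{q\,\kappa^{r+2s}\,l^s}{2(\mu q^3)^{(s+1)/2}\,k^{r+(3s+1)/2}}, \]
which consists of powers of $k$ and $l$ times a factor depending on the minor arc.
The size of the leading term is
\[ \frac{1}{2\sqrt{\mu qk}} \asymp \sqrt{\frac{NR^2}{KQ}} \asymp \frac{R^2}{Q}. \]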
After taking out an order-of-magnitude factor, we use Lemma 8.4.3 on each term in the power series, and add up to get the second term in the upper bound of the lemma. The first term in Lemma 8.4.3 contributes
\[ \frac{R^2}{Q}\,LW \log K \ll HW \log N \ll \frac{QW}{R}\sqrt{HN}\,\log N, \]
the last expression being also the contribution of the error terms in Lemmas 8.4.1 and 8.4.2, which come from edge effects in the two-dimensional regions of summation. □
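The binomial expansion used in the proof of Lemma 8.4.4 can be checked numerically. The sketch below assumes the expansion $\sqrt{k - \kappa} = \sqrt{k}\,(1 - \kappa/k)^{1/2} = \sqrt{k} - \kappa/(2\sqrt{k}) - \kappa^2/(8k^{3/2}) - \dots$, so that the first two terms are the ones kept inside the exponential and the third is the leading term of the part that is expanded as a power series. The function names and the sample values of $k$ and $\kappa$ are illustrative.

```python
import math

def sqrt_two_terms(kappa, k):
    """First two binomial terms of sqrt(k - kappa): sqrt(k) - kappa/(2*sqrt(k))."""
    return math.sqrt(k) - kappa / (2.0 * math.sqrt(k))

def third_term(kappa, k):
    """Next term of the series: -kappa^2 / (8 * k^(3/2))."""
    return -kappa ** 2 / (8.0 * k ** 1.5)

k, kappa = 32, 0.5
exact = math.sqrt(k - kappa)
error_two = exact - sqrt_two_terms(kappa, k)                           # about -1.7e-4
error_three = exact - sqrt_two_terms(kappa, k) - third_term(kappa, k)  # about -1.4e-6
```

The rapid decay of the successive errors reflects the geometric convergence asserted in the proof when $\kappa/k$ is small.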
9 Exponential sums with a difference

9.1 MAJOR AND MINOR ARCS
In this chapter we consider the exponential sum formed with a central difference, introduced by Heath-Brown and Huxley (1990),
\[ \sum_{h=H}^{H_2} \sum_{m=M_1}^{M_2} e\Bigl( TF\Bigl(\frac{m+h}{M}\Bigr) - TF\Bigl(\frac{m-h}{M}\Bigr) \Bigr), \qquad (9.1.1) \]
where $H$, $H_2$, $M$, $M_1$, and $M_2$ are integers with $H_2 < 2H$, $M_1 \ge M + 2H - 1$, $M_2 \le 2M - 2H$, so that
\[ M < m \pm h < 2M. \]
An analogue of the method of Chapter 7 for a general two-dimensional sum does not seem possible, because of the scaling laws. In one dimension, an interval can be divided into $N$ parts, with linear dimensions smaller by a factor $1/N$. In two dimensions, the linear scaling factor is only $1/\sqrt{N}$. This increases the accuracy required from the rational approximations to derivatives of the exponent. But a function of two variables has $r + 1$ partial derivatives of each order $r$, and the more derivatives to approximate by rationals with a common denominator, the worse the accuracy. We can approach the sum (9.1.1) because it is a double sum involving a function of one variable, and subdivision is only necessary in the $m$-direction. The exponent is $hf(m, h)$, where
\[ f(x, y) = \frac{T}{y}\Bigl( F\Bigl(\frac{x+y}{M}\Bigr) - F\Bigl(\frac{x-y}{M}\Bigr) \Bigr) = \frac{2T}{M}F'\Bigl(\frac{x + \theta y}{M}\Bigr) \]
for some $\theta$ between $-1$ and $1$. The double sum in the last chapter is a linearized version of (9.1.1), with $T$ there corresponding to $2T$ here, $F$ there corresponding to $F'$ here. Again, the natural order of magnitude for $T$ is about $M^2$.
Exponential sums like (9.1.1) appear in the differencing step (Lemma 5.6.2) and in the treatment of short interval means (Lemma 5.6.4). The mean-to-max argument of Lemma 5.6.3 provides another route to estimating simple exponential sums.
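For orientation, the sum (9.1.1) can be evaluated directly for small parameters. The sketch below is illustrative only: the test function $F(x) = x^{3/2}$ and the parameter values are assumptions, not taken from the text, and the helper names are hypothetical. It also checks that the exponent equals $hf(m, h)$ with $f(x, y) = (T/y)\bigl(F((x+y)/M) - F((x-y)/M)\bigr)$.

```python
import cmath
import math

def e(t):
    """e(t) = exp(2*pi*i*t)."""
    return cmath.exp(2j * math.pi * t)

def F(x):
    # Hypothetical test function; the text only assumes derivative bounds on 1 <= x <= 2.
    return x ** 1.5

def f(x, y, T, M):
    """f(x, y) = (T/y) * (F((x + y)/M) - F((x - y)/M))."""
    return (T / y) * (F((x + y) / M) - F((x - y) / M))

def difference_sum(T, M, H, H2, M1, M2):
    """Direct evaluation of the double sum over H <= h <= H2, M1 <= m <= M2."""
    return sum(e(T * F((m + h) / M) - T * F((m - h) / M))
               for h in range(H, H2 + 1) for m in range(M1, M2 + 1))

# Assumed sample parameters with M1 = M + 2H - 1 and M2 = 2M - 2H.
T, M, H, H2 = 10_000.0, 100, 3, 5
M1, M2 = M + 2 * H - 1, 2 * M - 2 * H
S = difference_sum(T, M, H, H2, M1, M2)
S_via_f = sum(e(h * f(m, h, T, M))
              for h in range(H, H2 + 1) for m in range(M1, M2 + 1))
```

The trivial bound for $|S|$ is the number of terms, $(H_2 - H + 1)(M_2 - M_1 + 1)$; the estimates of this chapter aim to improve on it.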
We suppose that $F(x)$ satisfies inequalities of the form
\[ |F^{(r)}(x)| \ge 1/C_r, \qquad |F^{(r)}(x)| \le C_r, \qquad (9.1.2) \]
for $1 \le x \le 2$, the upper bound holding for $r = 2, 3, 4, 5$, and the lower bound
for r = 2, 3 always, and for r = 1 if M>> FT. We write d1 for d/dx and f1 for d1 f, etc., so that (9.1.2) gives Idi-1f(x,Y)I< C Mr
Id,-1f(x,Y)I < 2MT
r
for $M + 2H - 1 < x < 2M - 2H$ and the corresponding values of $r$. We suppose that $M$ is sufficiently large, and that integers $N$ and $R$ can be chosen with $1 \le R \le H \le N < M$ to satisfy
\[ 2R^2NT \ge C_3M^3 > 2(R - 1)^2NT, \qquad (9.1.3) \]
2C3{H + 1
(9.1.4)
\[ 64C_3H < N \le M/8, \qquad (9.1.5) \]
\[ HN^3T \ll M^4, \qquad (9.1.6) \]
\[ H^3 \ll N^2R. \qquad (9.1.7) \]
The conditions (9.1.3)-(9.1.6) correspond to the conditions (8.1.4)-(8.1.7) of
the last chapter. These four conditions are used to estimate terms in the Taylor expansions of Section 9.3. The extra condition (9.1.7) is used to estimate additional terms that arise in this chapter, and it can be relaxed a little.
The construction of Farey arcs is the same as in the last chapter, with $J_j$ the interval of values taken by
\[ f_1(x, 0) = \frac{2T}{M^2}F''\Bigl(\frac{x}{M}\Bigr) \]
for $x$ on the interval $U_j$. The length of $J_j$ lies between $1/R^2$ and $2C_3/R^2$. If values of $f_1(x, 0)$ outside the range $M \le x < 2M$ are needed, then we continue $F(x)$ as the sum of the first six terms of the Taylor series at the appropriate endpoint $x = 1$ or $2$. The centre of a Farey arc $U(a/q)$ is the integer $m$ in $U(a/q)$ for which $f_1(m, 0)$ is the closest to $a/q$. We put
\[ f_1(m, 0) = a/q - 2\mu\nu, \qquad \mu = f_{11}(m, 0)/2, \]
so that
\[ \mu \asymp 1/NR^2, \qquad |\nu| < 1 + C_3C_4/M < 2, \qquad (9.1.8) \]
when we suppose that $M$ is sufficiently large.
Lemma 9.1.1 (Counting Farey arcs) For positive $Q$, let $W(Q)$ be the number of Farey arcs $J_j$ on which $q_j$ lies in the range $Q \le q < 2Q$ (for some choice of $q_j$ when Modification 2 operates). For major arcs,
\[ \frac{N(R-1)^2}{4C_2C_3M} \le Q \le \frac{R^2}{H}, \]
and we have
\[ W(Q) \ll (M_2 - M)Q^2/NR^2 + 1. \]
For minor arcs,
\[ R/2 \le Q \le 96H, \]
and we have
\[ W(Q) \ll (M_2 - M)R^2/NQ^2 + \min(R^2/Q,\ M/NQ). \]
The implied constants are constructed from $C_2$ and $C_3$.
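
9.2 THE MAJOR ARCS ESTIMATE
We treat the major arcs as in the last chapter. We consider the sum
\[ \sum_{h=H}^{H_2} \Bigl| \sum_{\substack{n \\ m+n \in I}} e(hf(m+n, h)) \Bigr|, \qquad (9.2.1) \]
where $I = I(a/q)$ is a major arc centre $m$. Writing
\[ f(m+x, h) = f(m, 0) + ax/q + g(x, h), \]
we have
\[ g_1(x, h) = f_1(m+x, h) - a/q = f_1(m+x, h) - f_1(m, 0) - 2\mu\nu. \]
Now
\[ f_1(m+x, h) = \frac{T}{hM}\Bigl( F'\Bigl(\frac{m+x+h}{M}\Bigr) - F'\Bigl(\frac{m+x-h}{M}\Bigr) \Bigr) = \frac{2T}{M^2}F''\Bigl(\frac{m+x+\theta h}{M}\Bigr) \]
for some $\theta$ in $-1 < \theta < 1$. Thus
\[ g_1(x, h) \ll \frac{1 + |x| + h}{NR^2} \ll \frac{1}{NR^2}\cdot\frac{NR^2}{Hq} \ll \frac{1}{Hq}, \]
and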
$g_{11}(x, h) = f_{11}(m+x, h) \ll 1/NR^2$. Let $N_1$ and $N_2$ be the limits of summation for $n$ in (9.2.1), so that $N_2 - N_1 \ll NR^2/Hq$. The sum in (9.2.1) can be written as
\[ \sum_{h=H}^{H_2} \Bigl| \sum_{n=N_1}^{N_2} e\Bigl( hf(m, 0) + \frac{ahn}{q} + hg(n, h) \Bigr) \Bigr|. \]
We now continue by Poisson summation just as in Section 8.2 of the previous chapter to get the corresponding estimates.
Lemma 9.2.1 (The major arcs) The major arcs contribute
\[ \ll \frac{(M_2 - M)R}{\sqrt{HN}}\log H + \min\Bigl( R\sqrt{HN}\,\log H,\ \frac{M}{R}\sqrt{\frac{H}{N}}\,\log H \Bigr) \]
to the sum (9.1.1). □
Lemma 9.2.2 (The easy estimate) If there is a choice of $N$ and $R$ with $H/4 \le R \le H$ satisfying (9.1.3)-(9.1.6), then the sum $S$ of (9.1.1) is bounded by
S«
(M2-M)R (HN) (M2-M
M
9.3
g
tog H + min R (HN) to H,
( + min I`
M211
1
) jlog H
N)
log H.
H' HT J 1
POISSON SUMMATION ON THE MINOR ARCS
We continue to follow the previous chapter. In particular, if $J_j$ is a minor arc, then either $J_{j+2}$ or $J_{j-2}$ (or both) is a minor arc, and we use the approximating polynomial on one of these, with centre $m$. The minor arc $J_j$ corresponds to $x = m + n$ (or $m - n$) with $n$ in a range $N_1 \le n \le N_2$ with $N_2 - N_1 < N$, $N_1 \ge N$, $N_2 \le 3N$.
Lemma 9.3.1 (Polynomial approximation) In the notation above
\[ \sum_{h=H}^{H_2} \sum_{n=N_1}^{N_2} e(hf(m+n, h)) \ll \max \Bigl| \sum_{h=H_3}^{H_2} e\Bigl(\frac{\mu h^3}{3}\Bigr) \sum_{n=N_3}^{N_2} e(hg(n)) \Bigr|, \]
where the maximum is over $H_3 \ge H$ and $N_3 \ge N_1$, and
\[ g(x) = f(m, 0) + \frac{ax}{q} + \mu x^2. \]
Proof This lemma is more complicated to prove than its analogue in the last chapter, as the variable h occurs non-trivially. We put
k(x,h) =hf(m +x,h) -hg(x) - µh3/3. In order to estimate k2(x, h), we expand in powers of h first; T
'92hf(m+x, h)M2(F'(
m+x-h
m+x+h M
)-F,(
M
M2F/(mMx)+T F(3)(mMx
+MT
P(4)(m+J 6h)-F(4)(m+M Oh I`
11
J1
for some 0 in 0 < 0 < 1. Next we expand F and F(3) in powers of x to get 2T
2xT
m
m
a2hf(m+x,h)= M FP(M) + M2 3
+
6M4
(^)() +
T
m
+(x2+h2)M3 T
h2xT OI
M4
I,
for some ,rt in
m+3N+H <,rt< M M m
Thus we have
02hf(m +x, h) = f(m, 0) +xf1(m, 0) + (x2 + h2) f11(m, 0)
(N3T
H2NT H 3 T 1
+ 0+ M4 + M4 11 2
=f(m,0)+xl q -214v) =f(m,0)+
ax
+(x2+h2)µ+O(H),
by (9.1.6), (9.1.4), and (9.1.8). Hence
$k_2(x, h) \ll 1/H$. Similar calculations give
\[ k_1(x, h) \ll 1/N, \qquad k_{12}(x, h) \ll 1/HN. \]
We can now apply partial summation (Lemma 5.2.2) as in Lemma 8.3.1.
Lemma 9.3.2 (Poisson summation over n) In the notation above H2 N2
1
e(hg(n)) _ H3 N3
Imodq h=H3
(2hµ)
1= -ah,D351/h5D2
+0 (R (HN) log H) where
$D_i = 2\mu qN_i$, and the square root is taken with positive real part.
e hf(m,0) -
1
4hµg 2
0
Lemma 9.3.3 (Poisson summation over h) In the notation above h31 N2 H2
E e N-1 E e(hg(n))
h=H3
n=N3
3
K2(1)
L2
1=L1 k=K¢l)
xe-1
c(l2/(k - K)2) (\4µq(k- K))
J k(3012
a(b+K)l
(k-K)
µq
2
q
/
$+\,O\bigl(R\sqrt{HN}\,\log N\bigr)$, where $\theta(z)$ and $\phi(z)$ are algebraic functions, one at the origin, and regular inside the unit circle, and
\[ L_1 = 2\mu qH_3N_3 > 2, \qquad L_2 = 2\mu qH_2N_2, \qquad (9.3.1) \]
a H3 >KZ(1)=µq min a N3+H3,NZ z
Kl(1) = µq maxi`
12
12
N2 + H2 , N3 +
L1
12F
12
12
L1
L2
64,
(9.3.2) (9.3.3)
HZ
and the integer b and the real number K satisfy
gf(m,0)=b+K,
/1 Kminl 2 +OI
If the parameters satisfy
`
q
H
4
`
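\[ H^3 \ll N^2R, \qquad (9.3.4) \]
then we can also take
\[ K_1(l) = \mu q \max\Bigl( \frac{l^2}{L_2^2}N_2^2,\ N_3^2 \Bigr) \ge 64, \qquad (9.3.5) \]
\[ K_2(l) = \mu q \min\Bigl( N_2^2,\ \frac{l^2}{L_1^2}N_3^2 \Bigr). \qquad (9.3.6) \]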
Proof In Lemma 9.3.2 we write $H_4 = H_4(l) = \max(H_3, l/D_2)$, $H_5 = H_5(l) = \min(H_2, l/D_3)$. For each $l$ the sum over $h$ takes the form
H45h5 H5
h= -al
IF)
el+)
(h(b + K) q
h3
3
l2
4hµg2
1
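We can take out the factor
\[ e\Bigl(\frac{bh}{q}\Bigr) = e\Bigl(-\frac{abl}{q}\Bigr), \]
which is independent of $h$. We apply the inversion step (Lemma 5.5.3) with $h = qn - al$,
\[ f(x) = \kappa x - \frac{al\kappa}{q} + \frac{\mu(qx - al)^3}{3} - \frac{l^2}{4\mu q^2(qx - al)}, \]
$M$ replaced by $H/q$, and $T$ replaced by $l^2/4\mu q^2H$. When $\mu$ is positive, then
\[ f''(x) = 2\mu q^2(qx - al) - \frac{l^2}{2\mu(qx - al)^3} = -\frac{l^2}{2\mu(qx - al)^3}\Bigl( 1 - \frac{4\mu^2q^2(qx - al)^4}{l^2} \Bigr). \]
Since $qx - al$ is a value of $h$, we have
\[ \frac{2\mu q(qx - al)^2}{l} \le \frac{2\mu qH_5}{D_3} = \frac{H_5}{N_3} \le \frac{2H}{N} \le \frac{1}{32}, \]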
so f" (x) has the opposite sign to A. The inversion step gives 12 1 Kh µh3 h =H4
e
(2hµ)
4hµgz
3
q
h=- -al
I(f"(xk))I
(2µ(gxk-al))
ASkSB
+O
e(f(xk)-kxk- 8)
1
= L
1
H
(Hµ)
R
4µq22 H) 12
+log(B-A+2)
,
1
where the endpoints A and B are given by P H5 + al = K+µgH5z + A=f I
12
= K + max
2
4M RH5
R
k
4µgHz lz
pglz + µRHz D3 + z > 4µq D3 ) 12
= K + maX kAq--H,2 + µgH2 , µgN3 + 4µRN3 / Iz
K+maX
1z
+ Z µRH3 =K1(l)+K, Lzz µgNZ +NRHz,NRN3 L1
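in the notation of this lemma and the preceding lemma. Similarly
\[ B = K_2(l) + \kappa. \]
Changing the limits of summation in the inversion step by a bounded distance $\kappa$ does not increase the order of magnitude of the error term. We take the sum over $k$ between $K_1(l)$ and $K_2(l)$. In the main term $x_k$ is given by
\[ \kappa + \mu q(qx_k - al)^2 + \frac{l^2}{4\mu q(qx_k - al)^2} = k. \]
We put
\[ h = h_k = qx_k - al. \]
Then
\[ l^2/4\mu qh^2 + \mu qh^2 = k - \kappa, \qquad (9.3.7) \]
and
\[ h^4 - \frac{(k - \kappa)h^2}{\mu q} + \frac{l^2}{4\mu^2q^2} = 0, \]
\[ h^2 = \frac{k - \kappa}{2\mu q}\Bigl( 1 - \sqrt{1 - \frac{l^2}{(k - \kappa)^2}} \Bigr) = \frac{l^2}{2\mu q(k - \kappa)}\Bigl( 1 + \sqrt{1 - \frac{l^2}{(k - \kappa)^2}} \Bigr)^{-1}; \qquad (9.3.8) \]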
we take the unique root for which h x H. By (9.3.7) and (9.3.8) we have akl (k- K)h
f(xk) - kxk +
-= q
+
q
12
3
4µg2h
12
2µh3
2µg2h
3 4µ282h4
12
+
2µg2h
k-K
=-1
2µq
X 1+ = -1
µh3
3
312
2
1+ Vlk - K)
12
()
3(k- K)
µq3
)
2
1+
Al2
(k- K)2J
rI
2
8
(
(k- K)2 )
(9.3.9)
Similarly we have
I2µhf"(xk)I =
2 h2
1-
4µ g2h4 12
)
2µq(k-K) 1+ I11
r
412
(k- K)2
12 Z (k - tc)
1+
(k-K)2
12
= 4µq(k - K)/cp2(12/(k - K)2).
(9.3.10)
The error term in the inversion step has the same size as in Lemma 8.3.3.
Changing the range of summation for $k$ by $O(\mu qH^2)$ corresponds to changing the range of summation $H_4 \le h \le H_5$ by $O(\mu qH^3/K)$, up to an error term of the same size as the error term already present in Lemma 9.3.2. This corresponds to changing the sum on the left by
<<µH3 K
L
(µH)
<<
H3 IVZR
(HN) ,
which provides the condition (9.3.4). Thus when (9.3.4) holds, then we can take the limits of the sum over $k$ to be given by (9.3.5) and (9.3.6), not (9.3.2) and (9.3.3). □
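The stationary point relation (9.3.7) and the two closed forms for $h^2$ in (9.3.8) can be checked against each other numerically. The sketch below uses arbitrary sample values of $\mu$, $q$, $l$, $k$, $\kappa$ (assumptions for illustration, not values arising from the construction) and a hypothetical helper name; it takes the smaller root of the quadratic in $h^2$, corresponding to the root singled out in the text.

```python
import math

def stationary_h_squared(mu, q, l, k, kappa):
    """Both closed forms for h^2 solving l^2/(4*mu*q*h^2) + mu*q*h^2 = k - kappa."""
    s = math.sqrt(1.0 - (l / (k - kappa)) ** 2)
    first = (k - kappa) / (2.0 * mu * q) * (1.0 - s)
    second = l ** 2 / (2.0 * mu * q * (k - kappa)) / (1.0 + s)
    return first, second

# Assumed sample values.
mu, q, l, k, kappa = 1e-4, 7, 10, 80, 0.3
first, second = stationary_h_squared(mu, q, l, k, kappa)
lhs = l ** 2 / (4.0 * mu * q * second) + mu * q * second   # left side of the relation
```

The second form avoids the subtraction $1 - \sqrt{\,\cdot\,}$ of nearly equal quantities, so it is the numerically safer one when $l/(k - \kappa)$ is small.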

9.4 THE LARGE SIEVE ON THE MINOR ARCS
We follow the exposition of Section 8.4 of the previous chapter. The first three lemmas differ only in the notation.
Lemma 9.4.1 (The large sieve) Let $a(k)$ and $b(j)$ be coefficients of modulus at most unity, indexed by the integer vectors $k = (k_1, k_2, l_1, l_2)$ with entries in some range $K \le k_i < 2K$, $L \le l_i < 2L$, and by minor arcs $J_j = J(a/q)$ with $a/q = a_j/q_j$, with denominators in some range $Q \le q < 2Q$. Let
\[ x_1^{(k)} = l_1k_1 - l_2k_2, \]
x2k)=l1 kl0(li/ki) -12 k29(122/k2) x3k) 11
11
x4k> =
kl
e
2
(k1
- kl411 5,2 e
=11-12) 11
122
4123
1
( 122
- --k2e 121 + k25,2 e ` k2 1 k2 12
2
(k12
where $\theta(z)$ is the function in Lemma 9.3.3, and let
\[ y^{(j)} = \Bigl( -\frac{a}{q},\ -\frac{1}{\sqrt{\mu q^3}},\ -\frac{ab}{q},\ \frac{\kappa}{2\sqrt{\mu q^3}} \Bigr), \]
where $b$, $\kappa$, $\mu$ come from the coefficients of the approximating polynomial on $J(a/q)$ as in Lemmas 9.3.1 and 9.3.3, and the values of $K$, $L$, and $Q$ are consistent with the construction above, so that $K \ge 4L$. Let $V \ge 1$ be a real number. Then
\[ \Bigl| \sum_k \sum_j a(k)b(j)\,e(x^{(k)} \cdot y^{(j)}) \Bigr|^2 \ll ABKL^4R^2NV/Q^3, \]
where $A$ and $B$ are the number of coincident pairs among the vectors $x^{(k)}$ and
y(j), respectively. Two vectors x(k) form a coincident pair if their corresponding entries differ by numerically less than 1,
R `N / '1' R `N Two vectors y(j) form a coincident pair if their corresponding entries differ by numerically less than 1
1
1
8KLV' 4L (2K) max0(x)' 4L' 8LmaxIO(x) - 4x0'(x)I where the maxima are over the range 0 S x S 1/4. The result is still true for a family of sums Si of the form (9.1.1), taken over
subintervals (depending on $i$) of $H < h \le 2H$, and of $M + 2H - 1 < m < 2M - 2H$, formed with functions $f_i(x)$ satisfying (9.1.3) uniformly in $i$, when we replace $b(j)$ by $b(i, j)$, $y^{(j)}$ by $y^{(i,j)}$, and the sum over $j$ by a sum over $i$ and $j$. The sequence of rational numbers $a_j/q_j$ may also depend on $i$. □
Lemma 9.4.2 (Large sieve over subintervals) Let $d(k, l)$ be any coefficients of modulus at most unity, and let
\[ g(j, k, l) = d(k, l)\,e\Bigl( \sum_{r=1}^{4} x_r^{(k,l)} y_r^{(j)} \Bigr). \]
Let $L_3(j)$ and $L_4(j)$ be sequences indexed by the minor arcs $J_j$ with $Q \le q_j < 2Q$, satisfying
\[ L \le L_3(j) \le L_4(j) < 2L \]
for some fixed powers of 2: $L$ and $Q$. Then we have
\[ \Bigl( \sum_j \Bigl| \sum_{k=K}^{2K-1} \sum_{l=L_3(j)}^{L_4(j)} g(j, k, l) \Bigr|^2 \Bigr)^2 \ll \frac{ABKL^4NR^2V}{Q^3} \log^4 N, \]
in the notation of Lemma 9.4.1. □
Lemma 9.4.3 (Partial sums by Mellin transforms) In the notation above
\[ \sum_j \sum_{l=L_1(j)}^{L_2(j)} \sum_{k=K_1(j,l)}^{K_2(j,l)} g(j, k, l) \ll WL \log N + \Bigl( \frac{ABKL^4NR^2VW^2}{Q^3} \Bigr)^{1/4} \log^2 N, \]
where the sum is over approximating polynomials used on minor arcs with $q_j$ in the range $Q \le q_j < 2Q$, and $W$ is the number of such polynomials: $W = W(Q)$ when Modification 2 is not used, $W \ll Q^2W(Q)/R^2$ when Modification 2 is used. The limits of summation for $k$ and $l$ are (9.3.5) and (9.3.6) of Lemma 9.3.3; the argument $j$ has been added to indicate the dependence on the approximating polynomial. Here $K$ and $L$ are powers of 2 whose order of magnitude is given by (8.4.1) of the previous chapter.
In the notation of Lemmas 9.3.3
Lemma 9.4.4 (Large sieve with gardening) and 9.4.3 H2
( µh3 1
N2
el - )e(hj(n)) h=H3
`
W
«
R
n=N3
3
(HN) log N
R2 ABKL4NR2VW2
+Q
1/4
log2N.
Q3
The corresponding result for a family of sums Si is HZ
E h=H3
µh3) N2(j) e(-) F, 3
e(hg;j(n))
n=N3(j)
HNQ2VW R2
R2 I ABKL4NR10VW2
log2N+ Q
1
1/2
log4N,
Q7 )
where all the minor arcs with Q < q < 2Q for all the sums Si are counted in B and W.
Proof The exponential in Lemma 9.3.3 has the form
e(klyl +1y3 +ly2 (k- K) 0(12/(k- K)2)). Since k > 64, we can expand
(1- K/k) by the binomial series, and since
(9.3.1), (9.3.5), and (9.1.5) give 1
k<
2 µgH2 N2
µgN3
5
12H
3
N 516'
and $\theta(z)$ has no singularities inside the unit circle, we can expand $\theta(l^2/(k - \kappa)^2)$ as a convergent power series in $\kappa$ with coefficients involving $k$ and $l$. The constant term and the term in $\kappa$ are kept inside the exponential. For the remaining terms we expand the exponential, and we also expand
/(i-)
z `P
(k - K) z
These terms form a triple power series in the variables K/k, I2/k2, and K21/ (µg3k3) . As in Lemma 8.4.4 we see that each of these variables has modulus at most ,'-6) and that the triple power series has radius of convergence at least i in each variable. For the first result we use Lemma 9.4.3 separately on each term in the triple power series and add up. For the second result we use Cauchy's inequality in the form r=0 s=0 t=0
Erst
Er F, Ft s
(E E E2' +'+IlErst l2)
2r+s+t 1
r
s
t
where the expression E,st contains the sums over i, j, k, and 1. The factor 2r+s+t does not destroy the convergence of the triple power series. We complete the proof as in Lemma 8.4.4.
0
10 Exponential sums with modular form coefficients

10.1 MODULAR FORMS

Modular forms, and their generalization to automorphic functions, lie at the centre of mathematics, in the sense that they appear in combinatorics, group theory, number theory, algebraic geometry, geometric topology, functional analysis, and complex analysis. Swinnerton-Dyer once declared that the unsolved problems in mathematics remain unsolved because we have not yet connected them to modular forms.

A homogeneous modular form is a function of two complex variables $z_1$ and $z_2$ which is homogeneous:
\[ f(\lambda z_1, \lambda z_2) = \lambda^{-k} f(z_1, z_2) \tag{10.1.1} \]
(the integer $k$ is called the weight), and depends only on the lattice in the plane generated by $z_1$ and $z_2$, so that
\[ f(z_1, z_2) = f(w_1, w_2) \quad \text{if} \quad \begin{pmatrix} w_1 \\ w_2 \end{pmatrix} = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} z_1 \\ z_2 \end{pmatrix}, \tag{10.1.2} \]
where the matrix has integer entries and determinant one. It is not defined if the vectors corresponding to z1 and z2 lie in the same straight line. Modular
forms first appeared as constants of integration, independent of $z$, in the construction of the doubly periodic elliptic functions, which have $g(z + \omega_1) = g(z)$, $g(z + \omega_2) = g(z)$ for two complex numbers $\omega_1$ and $\omega_2$. What is important is the additive group of complex numbers $\omega$ with $g(z + \omega) = g(z)$ for all $z$, rather than any particular pair of generators for the group. There are generalizations in many directions. One that is essential to the
theory is to allow f(z1, z2) to take a finite number of different values as (z1, z2) runs through all pairs of generators of a given lattice in the plane. For
a matrix $M = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$, we write $f \mid M$ ('$f$ twisted by $M$', or '$f$ acted upon by $M$') for the function $f(az_1 + bz_2, cz_1 + dz_2)$. The integer matrices of determinant one for which $f \mid M = f$ form a subgroup of finite index in the group $SL(2, \mathbb{Z})$. There is a wealth of relevant algebra. The finitely many different functions $f \mid M$ give a finite-dimensional representation of $SL(2, \mathbb{Z})$, and certain linear combinations of them give the irreducible components of the representation.
Since $f(-z_1, -z_2) = (-1)^k f(z_1, z_2)$, if the integer $k$ is to be odd, then we must permit $f \mid M$ to take finitely many different values. It turns out that $k$ can be a half-integer without our losing all the algebraic structure.
A modular form and the subgroup of $SL(2, \mathbb{Z})$ which fixes it are said to have congruence level $N$ if $f(z_1, z_2) = f(w_1, w_2)$ whenever $(z_1 - w_1)/N$ and $(z_2 - w_2)/N$ are both lattice vectors. This property is equivalent to $f \mid M = f$ whenever $\det M = 1$ and $M$ is congruent to $I$, the $2 \times 2$ identity matrix, modulo $N$. The fundamental notion is commensurability. If $f$ is a modular form with congruence level $N$, and $M$ is an integer matrix with positive determinant $D$,
then $f \mid M$ is a modular form with congruence level $DN$ (in general, the subgroups which fix $f$ and $f \mid M$ are called commensurable with one another when they share a common subgroup of finite index). A suitable average of
the functions $f \mid M$ with matrices $M$ of the same determinant $D$ gives a modular form of level $N$ again. These averaging procedures are called Hecke operators.
Modular forms occur widely as the generating functions of interesting sequences of numbers. The generating function of a sequence $r(1), r(2), \ldots$ is the function
\[ R(t) = \sum_n r(n) t^n. \]
If the properties of $R(t)$ are known, then we can deduce properties of the coefficients. More generally, $R(t)$ may be related to a known function $S(t)$ by
\[ R(t) = \exp S(t), \qquad R(t) = S'(t)/S(t), \qquad R(t) = S(t)/(1 - t)^2, \]
for example, or the powers $t^n$ may be replaced by another sequence of functions. For modular forms the interesting coefficients appear in a Fourier expansion.
An inhomogeneous modular form is constructed from a homogeneous modular form by
\[ F(z_1/z_2) = z_2^k f(z_1, z_2) = f(z_1/z_2, 1). \]
The twisted function corresponding to $f \mid M$ is
\[ (F \mid M)(z) = (cz + d)^{-k} F\Bigl( \frac{az + b}{cz + d} \Bigr). \tag{10.1.3} \]
This more complicated formula is the price that we pay for eliminating one variable. Since $F \mid \begin{pmatrix} 1 & b \\ 0 & 1 \end{pmatrix}$ takes only finitely many values as the integer $b$ varies, then (whether $F$ has a congruence level or not) there is an integer $N$ with $F \mid \begin{pmatrix} 1 & N \\ 0 & 1 \end{pmatrix} = F$, so that $F(z + N) = F(z)$. For each fixed $y_0$ for which $F(z)$ has no singularity on the line $y = y_0$, there is a Fourier series
\[ F(x + iy_0) = \sum_n A_n e\Bigl( \frac{nx}{N} \Bigr) = \sum_n a(n) \exp \frac{2\pi i n (x + iy_0)}{N} \]
with
\[ a(n) = A_n \exp \frac{2\pi n y_0}{N}, \]
which converges absolutely. We can define
\[ G(z) = \sum_n a(n) e^{2\pi i n z/N}. \]
Then $F(z)$ is an analytic continuation of $G(z)$. The numbers $a(n)$ are called the Fourier coefficients of $F(z)$. When $F(z)$ is invariant over the full group, then these coefficients are canonical. When $F \mid M$ takes a finite number of different values for $M$ in $SL(2, \mathbb{Z})$, then there are finitely many Fourier expansions. There is a summation formula connecting a sum $\sum a(m)w(m)$ formed with a smooth weight function $w(x)$ with a sum $\sum b(n)\tilde{w}(n)$, where $\tilde{w}(y)$ is a transform of $w(x)$, and the numbers $b(n)$ are the Fourier coefficients of $F \mid \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$. In the case of functions $F(z)$ of integer weight and congruence level $N$, we believe that among the linear combinations of twists $F \mid M$ there is a simplest
function G(z) whose Fourier coefficients satisfy the multiplicative rule b(mn) = b(m)b(n) when the highest common factor (m, n) is one. The Hecke
operators should take G(z) to a multiple of itself, AG(z) say, and the eigenvalues A for all the Hecke operators should determine the Fourier expansion of G I M for every non-singular integer matrix M. The numbers
$b(m)e(am/q)$ are the Fourier coefficients of $G \mid \begin{pmatrix} 1 & a/q \\ 0 & 1 \end{pmatrix}$, with a summation
formula relating them to the Fourier coefficients of
GI
q I
q
Conversely, André Weil's Bestimmungssatz (1967, 1971) states that if a sequence of numbers $b(m)$ have summation formulae of the right type for the sums $\sum b(m)w(m)e(am/q)$, then they are the coefficients of a modular form of a certain weight and congruence level. For modular forms of half-integer weight, the Hecke operator of level $D$ is connected with $b(D^2)$, instead of with $b(D)$. The Fourier coefficients of modular forms without a congruence level await an algebraic interpretation.

There are many generalizations of modular forms to higher dimensions. The algebraic structure does not always allow us to define a sequence of numbers that correspond to the Fourier coefficients in the classical case. A surprising generalization is to eigenfunctions of the Laplacian on two-dimensional hyperbolic space, which can be modelled as the upper half-plane $H$, the complex numbers $z = x + iy$ with $y > 0$. The Möbius transformations $z \to Mz = (az + b)/(cz + d)$ are translations or rotations. The 'real-analytic modular functions' $F(z)$ have $(F \mid M)(z) = F(Mz)$ taking finitely many different values for the matrix $M$ in $SL(2, \mathbb{Z})$, and
\[ y^2 \Bigl( \frac{\partial^2 F}{\partial x^2} + \frac{\partial^2 F}{\partial y^2} \Bigr) = \lambda F \]
for some fixed $\lambda$, which plays a role similar to the weight.
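The multiplicative rule $b(mn) = b(m)b(n)$ for coprime $m$, $n$ can be checked by hand on the best-known example, which is not named in the text and is assumed here purely for illustration: the weight 12 cusp form $\Delta(z) = q\prod_{n \ge 1}(1 - q^n)^{24}$ with $q = e^{2\pi i z}$, whose Fourier coefficients are Ramanujan's $\tau(n)$. The helper names `tau_table` and `is_multiplicative` are hypothetical.

```python
from math import gcd


def tau_table(limit):
    """Return a list t with t[m] = tau(m) for 1 <= m <= limit.

    tau(m) is the coefficient of q^m in q * prod_{n>=1} (1 - q^n)^24.
    """
    # series[i] holds the coefficient of q^i in the truncated product.
    series = [0] * limit
    series[0] = 1
    for n in range(1, limit):
        for _ in range(24):
            # Multiply in place by (1 - q^n); going from high to low
            # degree keeps series[i - n] at its old value.
            for i in range(limit - 1, n - 1, -1):
                series[i] -= series[i - n]
    # The extra factor q shifts every exponent by one.
    return [0] + series


def is_multiplicative(b, limit):
    """Check b(mn) = b(m) b(n) whenever gcd(m, n) = 1 and mn <= limit."""
    return all(
        b[m * n] == b[m] * b[n]
        for m in range(1, limit + 1)
        for n in range(1, limit + 1)
        if m * n <= limit and gcd(m, n) == 1
    )


tau = tau_table(60)
print(tau[1:8])
print(is_multiplicative(tau, 60))
```

Exact integer arithmetic is used throughout, so the check is not affected by rounding; it verifies the rule only up to the chosen limit and is no substitute for the Hecke theory sketched above.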
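10.2 THE WILTON SUMMATION FORMULA

Let $F(z)$ and $G(z)$ be modular forms of weight $k$ with Fourier expansions (the old name is 'q-series')
\[ F(z) = \sum_m a(m) \exp 2\pi i m z, \qquad G(z) = \sum_l b(l) \exp \frac{2\pi i l z}{c}, \tag{10.2.1} \]
with
\[ \Bigl( G \mid \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} \Bigr)(z) = z^{-k} G\Bigl( -\frac{1}{z} \Bigr) = F(z). \tag{10.2.2} \]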
The idea of the Wilton summation formula is to express the coefficients a(m)
as averages of the coefficients $b(l)$. We treat the simplest case when $F(z)$ and $G(z)$ are cusp forms: this means that for any matrix $M$ in the modular group, the Fourier coefficients of $F \mid M$ are zero for $m \le 0$, and $(F \mid M)(z)$ tends to zero as $y$ tends to infinity. To justify convergence, we quote a fundamental result of Rankin (1939): the series corresponds to an integral involving $|F(z)|^2$ over a region in the $z$-plane.
Lemma 10.2.1 (Rankin's series) When $G(z)$ is a cusp form, then the series
\[ R(s) = \sum_{l=1}^{\infty} \frac{|b(l)|^2}{l^s} \]
converges for $\mathrm{Re}\, s > k$, and has an analytic continuation with a simple pole at $s = k$.

Corollary There is a constant $B$ with
\[ \sum_{l=1}^{L} \frac{|b(l)|^2}{l^k} \le B^2 \log 2L \tag{10.2.3} \]
for any positive integer $L \ge 1$.
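Proof of the corollary Since $(s - k)R(s)$ is continuous, let
\[ A = \max_{k \le s \le k+2} |(s - k)R(s)|, \]
where the maximum is taken along the real axis. We put $s = k + 1/(\log 2L)$, so that $l^{s-k} \le e$ for $l \le L$. Then
\[ A \log 2L \ge R(s) \ge \sum_{l=1}^{L} \frac{|b(l)|^2}{l^s} \ge \frac{1}{e} \sum_{l=1}^{L} \frac{|b(l)|^2}{l^k}, \]
which gives the corollary with $B^2 = eA$. □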
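The logarithmic growth in (10.2.3) can be watched numerically. The sketch below again assumes, for illustration only, that $b(l)$ is Ramanujan's $\tau(l)$ with $k = 12$ (an example not taken from the text), and prints the partial sums $\sum_{l \le L} |b(l)|^2/l^k$ next to their ratio to $\log 2L$. The names `tau_table` and `rankin_partial_sums` are hypothetical.

```python
from math import log


def tau_table(limit):
    """t[m] = tau(m), the coefficient of q^m in q * prod (1 - q^n)^24."""
    series = [0] * limit
    series[0] = 1
    for n in range(1, limit):
        for _ in range(24):
            # In-place multiplication by (1 - q^n), highest degree first.
            for i in range(limit - 1, n - 1, -1):
                series[i] -= series[i - n]
    return [0] + series


def rankin_partial_sums(b, k, limit):
    """Return S with S[L] = sum_{l <= L} |b(l)|^2 / l^k for 0 <= L <= limit."""
    sums = [0.0]
    total = 0.0
    for l in range(1, limit + 1):
        total += b[l] ** 2 / l ** k
        sums.append(total)
    return sums


tau = tau_table(200)
S = rankin_partial_sums(tau, 12, 200)
for L in (10, 50, 200):
    # The corollary asserts that the last column stays bounded by B^2.
    print(L, S[L], S[L] / log(2 * L))
```

The printed ratio only suggests a value for $B^2$ over the range computed; the corollary itself rests on the analytic continuation of $R(s)$.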
Lemma 10.2.2 (The Wilton summation formula) Let $F(z)$ and $G(z)$ be cusp forms of even weight $k$ satisfying (10.2.1) and (10.2.2). Let $g(x)$ be twice continuously differentiable, with $g(x) = 0$ for $x \le M$ or $x \ge M_2$, where $0 < M < M_2$. Then
\[ \sum_{m=M}^{M_2} a(m) g(m) = (-1)^{k/2}\, 2\pi \sum_{l=1}^{\infty} b(l) \int_M^{M_2} \Bigl( \frac{cx}{l} \Bigr)^{(k-1)/2} J_{k-1}\Bigl( 4\pi \sqrt{\frac{lx}{c}} \Bigr) g(x)\, dx. \]
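Proof We write $\mu = 1/2\pi M$ and put $h(x) = e^{2\pi\mu x} g(x)$. Let
\[ \hat{h}(x) = \int_M^{M_2} h(u) e(-ux)\, du \tag{10.2.4} \]
be the Fourier transform of $h(x)$. Then
\[ \int_{-\infty}^{\infty} \hat{h}(x) F(x + i\mu)\, dx = \int_{-\infty}^{\infty} \hat{h}(x) \sum_m a(m) e(mx) e^{-2\pi m\mu}\, dx = \sum_m a(m) h(m) e^{-2\pi m\mu} = \sum_m a(m) g(m). \]
We use (10.2.2) to obtain
\[ \sum_m a(m) g(m) = \int_{-\infty}^{\infty} \hat{h}(x) (x + i\mu)^{-k} G\Bigl( -\frac{1}{x + i\mu} \Bigr) dx = \int_{-\infty}^{\infty} \hat{h}(x) (x + i\mu)^{-k} \sum_l b(l)\, e\Bigl( -\frac{l}{c(x + i\mu)} \Bigr) dx. \tag{10.2.5} \]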
We integrate by parts twice and use the First Derivative Test (Lemma 5.1.2) to get
\[ \hat{h}(x) \ll 1/|x|^3 \]
for large $x$. Using Lemma 10.2.1 with $s = (2k + 3)/4$ and Hölder's inequality, we find that
\[ \sum_l \Bigl| b(l)\, e\Bigl( -\frac{l}{c(x + i\mu)} \Bigr) \Bigr| \ll \Bigl( \frac{c(x^2 + \mu^2)}{\mu} \Bigr)^{(2k+3)/4}. \]
Hence the sum and integral in (10.2.5) are absolutely convergent, and we may integrate first. We find that
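\[ \int_{-\infty}^{\infty} \hat{h}(x) (x + i\mu)^{-k} e\Bigl( -\frac{l}{c(x + i\mu)} \Bigr) dx = \int_{u=M}^{M_2} h(u) \int_{x=-\infty}^{\infty} (x + i\mu)^{-k} e\Bigl( -\frac{l}{c(x + i\mu)} - ux \Bigr) dx\, du \]
\[ = \int_{u=M}^{M_2} g(u) \int_{z=-\infty+i\mu}^{\infty+i\mu} e\Bigl( -\frac{l}{cz} - uz \Bigr) \frac{dz}{z^k}\, du. \]
The integrand has an essential singularity at $z = 0$, and is small in the half-plane $\mathrm{Im}\, z < 0$. We can deform the inner integral to a clockwise circle around the origin. The substitutions
\[ z = iv \sqrt{\frac{l}{cu}}, \qquad t = 4\pi \sqrt{\frac{lu}{c}} \]
give
\[ (-i)^{k-1} \Bigl( \frac{cu}{l} \Bigr)^{(k-1)/2} \int \exp\Bigl( \frac{t}{2} \Bigl( v - \frac{1}{v} \Bigr) \Bigr) v^{-k}\, dv. \tag{10.2.6} \]
The coefficient of $v^{k-1}$ in the expansion of the exponential in (10.2.6) is the Bessel coefficient $J_{k-1}(t)$.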
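The last step of the proof uses the generating function $\exp\bigl(\tfrac{t}{2}(v - 1/v)\bigr) = \sum_n J_n(t) v^n$. A minimal numerical sketch of that identity follows: it extracts the Laurent coefficient by averaging over the unit circle and compares it with the power series for $J_n(t)$. The function names and the sample values $n = 3$, $t = 3$ are illustrative assumptions, not taken from the text.

```python
import cmath
import math


def bessel_series(n, t, terms=40):
    """J_n(t) from its power series; adequate for moderate t only."""
    return sum(
        (-1) ** m / (math.factorial(m) * math.factorial(m + n))
        * (t / 2) ** (2 * m + n)
        for m in range(terms)
    )


def laurent_coefficient(n, t, points=256):
    """Coefficient of v^n in exp((t/2)(v - 1/v)).

    Uses (1 / 2 pi i) * contour integral of f(v) v^(-n-1) dv on |v| = 1,
    which equals the mean of f(v) v^(-n) over equally spaced points.
    """
    total = 0j
    for j in range(points):
        v = cmath.exp(2j * math.pi * j / points)
        total += cmath.exp(0.5 * t * (v - 1 / v)) * v ** (-n)
    return total / points


coefficient = laurent_coefficient(3, 3.0)
print(coefficient.real, bessel_series(3, 3.0))
```

In the proof the contour is a clockwise circle, which contributes the factor $-2\pi i$ in front of $J_{k-1}(t)$; the sketch uses the counterclockwise mean value, so only the coefficient itself is compared.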
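Lemma 10.2.3 (Twisted Wilton summation) Let $F(z)$ be a modular form invariant under the full modular group (so that $G(z) = F(z)$ in (10.2.2)), with Fourier coefficients $b(m)$. Let $a/q$ be a rational number and let $g(x)$ be as in Lemma 10.2.2. Then
\[ \sum_{m=M}^{M_2} b(m)\, e\Bigl( \frac{am}{q} \Bigr) g(m) = (-1)^{k/2}\, \frac{2\pi}{q} \sum_{l=1}^{\infty} b(l)\, e\Bigl( -\frac{\bar{a}l}{q} \Bigr) \int_M^{M_2} \Bigl( \frac{x}{l} \Bigr)^{(k-1)/2} J_{k-1}\Bigl( \frac{4\pi \sqrt{lx}}{q} \Bigr) g(x)\, dx, \]
where $\bar{a}$ is defined modulo $q$ by $a\bar{a} \equiv 1$.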
Proof We have
FI z+ q) =FI
(
alq) =FI
( aq a)(0
a1q) =F
aq
l0q)'
where as + qq = 1. The modular form G(z) with
a/q
G
0/=FI`0
(1
1
is given by aq
G=F
-0) =F 1/q -a
1/0gl(0
0
q
'
so
az - -
G(z) = q-'F
q
We now use the Wilton summation formula with $c = q^2$ and $a(m)$, $b(l)$ replaced by $b(m)e(am/q)$, $q^{-k}b(l)e(-\bar{a}l/q)$. □

The Wilton summation formula is simplest for analytic cusp forms of even weight. The first cases were Voronoi's formula (1904) for sums of the divisor function $d(m)$, and Hardy's formula (1915) for sums of the representation function $r(m)$ that counts the number of ways of writing $m = a^2 + b^2$ with $a > 0$, $b \ge 0$. These are cases of Lemma 6.2.1 in which there is a closed form for the Fourier coefficients. The divisor function corresponds to a special non-holomorphic modular form. We quote the twisted formula, using the notation of Jeffreys and Jeffreys (1962, chapter 21) for Bessel functions.
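Lemma 10.2.4 (Twisted Voronoi summation) Let $g(x)$ and $a/q$ be as in Lemma 10.2.2. Then
\[ \sum_{m=M}^{M_2} d(m)\, e\Bigl( \frac{am}{q} \Bigr) g(m) = \frac{1}{q} \int_M^{M_2} \Bigl( \log \frac{x}{q^2} + 2\gamma \Bigr) g(x)\, dx \]
\[ - \frac{2\pi}{q} \sum_{l=1}^{\infty} d(l)\, e\Bigl( -\frac{\bar{a}l}{q} \Bigr) \int_M^{M_2} Y_0\Bigl( \frac{4\pi \sqrt{lx}}{q} \Bigr) g(x)\, dx + \frac{2\pi}{q} \sum_{l=1}^{\infty} d(l)\, e\Bigl( \frac{\bar{a}l}{q} \Bigr) \int_M^{M_2} Kh_0\Bigl( \frac{4\pi \sqrt{lx}}{q} \Bigr) g(x)\, dx. \] □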
To obtain a reflection step we must truncate these series and approximate the Bessel functions. Our next lemma is an exercise in the use of Lemmas 5.5.5 and 5.5.6.
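Before the lemma, it may help to see how good the classical leading asymptotic $J_n(t) \approx \sqrt{2/\pi t}\, \cos(t - n\pi/2 - \pi/4)$ already is for large $t$. The sketch below compares it with a direct evaluation of Bessel's integral. This is the textbook asymptotic, stated here as an assumption for illustration; it is not the lemma's own formulation with its explicit error term, and the function names are hypothetical.

```python
import math


def bessel_j(n, t, points=4096):
    """J_n(t) from Bessel's integral (1/2pi) int_0^{2pi} cos(n x - t sin x) dx.

    The integrand is smooth and periodic, so an equally spaced sum is
    very accurate once points comfortably exceeds t + n.
    """
    h = 2 * math.pi / points
    return sum(
        math.cos(n * j * h - t * math.sin(j * h)) for j in range(points)
    ) / points


def leading_asymptotic(n, t):
    """Leading term of J_n(t) for large t."""
    return math.sqrt(2 / (math.pi * t)) * math.cos(
        t - n * math.pi / 2 - math.pi / 4
    )


for t in (50.0, 400.0, 1000.0):
    exact = bessel_j(3, t)
    approx = leading_asymptotic(3, t)
    # The difference should shrink roughly like t^(-3/2).
    print(t, exact, approx, abs(exact - approx))
```

With $n = k - 1$ and $t = 4\pi\sqrt{lu/c}$ the cosine becomes $(-1)^{k/2}\,\mathrm{Re}\, e(2\sqrt{lu/c} + 1/8)$ for even $k$, which is the shape of the main term in the next lemma.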
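Lemma 10.2.5 (Stokes' approximation) Let $I(l, u)$ be the integral
\[ I(l, u) = \int_{z=-\infty+i\mu}^{\infty+i\mu} e\Bigl( -\frac{l}{cz} - uz \Bigr) \frac{dz}{z^k}, \]
where $k \ge 2$ is an even integer. For $u \ge c/l$ we have
\[ I(l, u) = 2 \Bigl( \frac{c}{4lu} \Bigr)^{1/4} \Bigl( \frac{cu}{l} \Bigr)^{(k-1)/2} \mathrm{Re}\, e\Bigl( 2\sqrt{\frac{lu}{c}} + \frac{1}{8} \Bigr) + O\Bigl( \frac{1}{u^{3/2}} \Bigl( \frac{cu}{l} \Bigr)^{(2k+1)/4} + \Bigl( \frac{c}{l} \Bigr)^{k-1} \Bigr). \]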
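Proof We move the line of integration from $\mathrm{Im}\, z = \mu$ to $\mathrm{Im}\, z = \eta = 1/2\pi u$, and we put $z = \eta(x + i)$. Then
\[ I(l, u) = \int_{-\infty}^{\infty} \frac{1}{\eta^{k-1}(x + i)^k}\, e\Bigl( -\frac{l}{c\eta(x + i)} - u\eta(x + i) \Bigr) dx = \int_{-\infty}^{\infty} \frac{1}{\eta^{k-1}(x + i)^k}\, e^{1 - \lambda^2/(x^2+1)} \exp\Bigl( -\frac{i\lambda^2 x}{x^2 + 1} - ix \Bigr) dx, \tag{10.2.7} \]
where we have put $\lambda^2 = 2\pi l/c\eta = 4\pi^2 lu/c$.

This is an exponential integral with
\[ f(x) = -\frac{x}{2\pi} - \frac{\lambda^2 x}{2\pi(x^2 + 1)}, \qquad g(x) = \frac{1}{\eta^{k-1}(x + i)^k}\, e^{1 - \lambda^2/(x^2+1)}. \]
Strictly we must consider the real and imaginary parts of $g(x)$ separately, but since
\[ \Bigl| \frac{d^s}{dx^s}\, \mathrm{Re}\, g(x) \Bigr| = \Bigl| \mathrm{Re}\, \frac{d^s}{dx^s}\, g(x) \Bigr| \le \Bigl| \frac{d^s}{dx^s}\, g(x) \Bigr|, \]
the same order-of-magnitude estimates apply formally. We have
\[ f'(x) = -\frac{1}{2\pi} \Bigl( 1 - \frac{\lambda^2(x^2 - 1)}{(x^2 + 1)^2} \Bigr), \qquad f''(x) = -\frac{\lambda^2 x(x^2 - 3)}{\pi(x^2 + 1)^3}. \]
The equation $f'(x) = 0$ is a quadratic in $x^2$. For $\lambda \ge 2\pi$, the roots are $\pm x_1$, $\pm x_2$, where
\[ x_1 = 1 + \frac{2}{\lambda^2} + O\Bigl( \frac{1}{\lambda^4} \Bigr), \qquad x_2 = \lambda \Bigl( 1 - \frac{3}{2\lambda^2} + O\Bigl( \frac{1}{\lambda^4} \Bigr) \Bigr). \]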
The subinterval -25 x:!92 of the range of integration
in (10.2.7)
contributes O(uk-1 a-AZ/4)
(10.2.8)
We divide the rest of the range into blocks $[H, 2H]$ and $[-2H, -H]$, where $H$ is a power of 2. Let $H_0$ be the nearest power of 2 to $x_2$. We combine two pairs of consecutive blocks as $[-2H_0, -H_0/2]$ and $[H_0/2, 2H_0]$. The endpoint terms cancel between adjacent blocks. At $\pm 2$ the endpoint terms are
O
uk21
e A2/4 J, )
A
and they can be absorbed by (10.2.8).
For each block H5 Ixl 5 2H we have If1r)(x)I X A2/Hr, and
uk- 1/Hk+s
for H;-> A,
k 1/H k+3seA2/H' A2su-
for H5 A.
For H< x:5 2H, H < Ho we take A2
TxH
Ux
H3
uk-1e-A2/H2 Hk
MxH,
Nx
12.
The error term of Lemma 5.5.5 is M3U U u k-1 e- A2/H2 << N2T2 << H << Hk+1
(10.2.9)
For Ho/2 5 x :!g 2Ho we take
Uxuk-1/Ho,
TxHO,
MxNxH0.
The error term of Lemma 5.5.6 is MU << T3/2 <<
U HO <<
uk-1
Ak+1/z
(10.2.10)
For HSx52H,HO
Tx H,
uk-1
Ux
Hk
,
MxNxH.
The error term of Lemma 5.5.5 is uk-1
H3U « MU IT « A4 « A4Hk-3
(10.2.11)
For H > A2 we use the trivial estimate uk-1 << MU<<
(10.2.12)
WT-1
These terms (10.2.8), (10.2.9), (10.2.10), (10.2.11), and (10.2.12) sum to 1
K
llk- 1
1
k+1/2 + A2k_2
)
The second term is negligible for k > 4.
Since f"(x) is negative, we compute the main term for the range [Ho/2,2Ho] as
g(x2)e(f(x2)
-
(10.2.13)
s)/If" (x2)I
using the complex conjugate form of Lemma 5.5.6. In the exponent (
A
f(x2)= - -+0I
),
A
so that
e(f(x2)-'-'-g)=e(-2
I
lc
I
AJI.
The weight is
g(x2)/
I f"(x2)I
77
=
1+O
k-1xk
2 +0 z
exp 1
2
2
h(rtlA)'rl/2 k- ] k- 1/2
(i+oi_ii11 1
c
-(4lu )
]/4
(l)
(k-1)/2
I
/
1
`
`
II
h(qx2)(1+OI
We also have
h(gx2) = h(,qA) + 0(,q/A).
These approximations give another error term of size (10.2.10). The main term for the range $[-2H_0, -H_0/2]$ is the complex conjugate of (10.2.13). □

We can now give a truncated form of the Wilton formula. The condition $k \ge 4$ comes from Lemma 10.2.5, and can be removed if we use a different integral for the Bessel function.
Lemma 10.2.6 (The truncated Wilton summation formula) Let $F(z)$ and $G(z)$ be cusp forms of even weight $k \ge 4$ satisfying (10.2.1) and (10.2.2). Let $M$ and $M_2$ be positive integers with $M_2 \le 2M$. Let $g(x)$ be $s$ times continuously differentiable, and zero for $x \le M$ and $x \ge M_2$. Define the means $U_r$ for $r = 0, \ldots, s$ by
\[ \int_M^{M_2} |g^{(r)}(x)|\, dx = U_r (M_2 - M). \]
Let $K + 1$ be the power of 2 with
\[ K \le c M^{(s+1)/(s-1)} (U_s/U_0)^{2/(s-1)} < 2K + 1. \]
Then, if $M \ge c$ and $K \ge 1$, we have
\[ \sum_{m=M}^{M_2} a(m) g(m) = 2 \sum_{l=1}^{K} b(l) \int_M^{M_2} \Bigl( \frac{c}{4lx} \Bigr)^{1/4} \Bigl( \frac{cx}{l} \Bigr)^{(k-1)/2} g(x)\, \mathrm{Re}\, e\Bigl( 2\sqrt{\frac{lx}{c}} + \frac{1}{8} \Bigr) dx \]
\[ + O\Bigl( B c^{(k+1)/2} M^{k/2 - 1 + 1/(2s-2)} (M_2 - M)\, U_0^{(2s-3)/(2s-2)} U_s^{1/(2s-2)} (\log K)^{1/2} \Bigr), \]
where $B$ is the constant of Lemma 10.2.1.
Proof We write $I_k(l, u)$ for $I(l, u)$ in Lemma 10.2.5 to indicate the power of $z$. Then
\[ \frac{d}{du} I_k(l, u) = -2\pi i\, I_{k-1}(l, u). \]
We integrate by parts to get
\[ \int_M^{M_2} g(u) I_k(l, u)\, du = \frac{1}{(2\pi i)^s} \int_M^{M_2} g^{(s)}(u) I_{k+s}(l, u)\, du. \]
Using Stokes' approximation as an upper bound, we have
\[ \int_M^{M_2} g(u) I_k(l, u)\, du \ll \Bigl( \frac{cM}{l} \Bigr)^{(2k+2s-1)/4} \frac{1}{\sqrt{M}} \int_M^{M_2} |g^{(s)}(u)|\, du. \]
Lemma 10.2.1 and Hölder's inequality now give
\[ \sum_{l=K+1}^{\infty} b(l) \int_M^{M_2} g(u) I_k(l, u)\, du \ll B c^{(2k+2s-1)/4} K^{-(2s-3)/4} M^{(2k+2s-3)/4} (M_2 - M)\, U_s (\log K)^{1/2}. \tag{10.2.14} \]
We substitute Stokes' approximation into the terms with $l \le K$ in the Wilton formula. The error terms contribute
\[ \ll \sum_{l=1}^{K} |b(l)| \int_M^{M_2} |g(x)| \Bigl( \frac{cx}{l} \Bigr)^{(2k+1)/4} x^{-3/2}\, dx \ll B c^{(2k+1)/4} K^{1/4} M^{(2k-5)/4} (M_2 - M)\, U_0 (\log K)^{1/2}, \tag{10.2.15} \]
by Lemma 10.2.1 and Hölder's inequality again. The choice of $K$ makes both error terms (10.2.14) and (10.2.15) of the size stated in the lemma. □
10.3 FAREY ARCS

We present the ideas of Jutila (1987a et seq.) for the sum
\[ S = \sum_{m=M}^{M_2} b(m)\, e(f(m)), \tag{10.3.1} \]
formed with the Fourier coefficients $b(m)$ of a cusp form for the full modular group. Here
\[ f(x) = TF(x/M) \]
as usual, where $M$ and $T$ are size parameters, and $F(x)$ is a real function with $s$ continuous derivatives, where $s \ge 4$ is an even integer. We suppose that
\[ |f^{(r)}(x)| \le C_r T/M^r \tag{10.3.2} \]
for $r = 1, \ldots, s$, and
\[ |f^{(r)}(x)| \ge T/C_r M^r \tag{10.3.3} \]
for $r = 2$; for $T \ll M$ we also require (10.3.3) for $r = 1$. We use Dirichlet's approximation theorem (Lemma 1.5.1) to subdivide the sum. We choose an extended Farey sequence $\mathcal{F}(R)$. When $a_j/q_j$ and $a_{j+1}/q_{j+1}$ are consecutive fractions of $\mathcal{F}(R)$, then $(a_j + a_{j+1})/(q_j + q_{j+1})$ does not lie in $\mathcal{F}(R)$. Since $f''(x)$ is continuous with constant sign, then $f'(x)$ is monotone, and we can define a function $h(y)$ implicitly by
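\[ y = f'(h(y)). \]
Then
\[ \frac{C_2 T}{M^2} \Bigl| h\Bigl( \frac{a_j + a_{j+1}}{q_j + q_{j+1}} \Bigr) - h\Bigl( \frac{a_j}{q_j} \Bigr) \Bigr| \ge \Bigl| \frac{a_j + a_{j+1}}{q_j + q_{j+1}} - \frac{a_j}{q_j} \Bigr| = \frac{1}{q_j(q_j + q_{j+1})} \ge \frac{1}{2R^2}. \]
Thus
\[ \Bigl| h\Bigl( \frac{a_j + a_{j+1}}{q_j + q_{j+1}} \Bigr) - h\Bigl( \frac{a_j}{q_j} \Bigr) \Bigr| \ge N = \frac{M^2}{2C_2R^2T}. \]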
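The two arithmetic facts used in this step, that the mediant of consecutive fractions falls outside the sequence and that it lies at distance $1/q_j(q_j + q_{j+1}) \ge 1/2R^2$ from $a_j/q_j$, can be checked directly. The sketch below assumes the ordinary Farey sequence on $[0, 1]$ with denominators at most $R$; the book's extended Farey sequence $\mathcal{F}(R)$ may differ in detail, and the function names are hypothetical.

```python
from fractions import Fraction


def farey(R):
    """Fractions a/q in [0, 1] with 1 <= q <= R, in increasing order."""
    return sorted(
        {Fraction(a, q) for q in range(1, R + 1) for a in range(q + 1)}
    )


def check_neighbours(R):
    """Verify the mediant properties for every consecutive pair.

    Returns the number of fractions in the sequence.
    """
    seq = farey(R)
    for left, right in zip(seq, seq[1:]):
        q1, q2 = left.denominator, right.denominator
        mediant = Fraction(left.numerator + right.numerator, q1 + q2)
        # The mediant has denominator q1 + q2 > R, so it is not in the sequence.
        assert mediant.denominator > R
        # Distance from the left neighbour is exactly 1 / (q1 (q1 + q2)).
        assert mediant - left == Fraction(1, q1 * (q1 + q2))
        # Hence the distance is at least 1 / (2 R^2).
        assert q1 * (q1 + q2) <= 2 * R * R
    return len(seq)


print(check_neighbours(5))
print(check_neighbours(40))
```

Rational arithmetic via `fractions.Fraction` keeps every comparison exact, which matters because the gaps involved are of size $1/R^2$.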
We suppose that
\[ M/8 \ge N \ge R^2, \tag{10.3.4} \]
the second inequality being the reverse of our previous constructions, and
\[ R \ge 48C_2M/T, \tag{10.3.5} \]
which implies
\[ NR \le M/96C_2. \tag{10.3.6} \]
Let a j/qj with j = 1, ..., J be the fractions of 9(R) in the interval
f'(M+2N)
Let No=M+N,NJ=M2-N, and, for j=1,...,J-1 Nj=
h(qj+qj+1 where [[x]] denotes the nearest integer to x. We construct a smoothed characteristic function of the interval Nj_ 1 <x
forx>N,
1
&(x)=
2(1+sins+1
ZN)
0
J
for IxI
for x < -N.
and
wj(x) = w(x-N_1) - w(x-N). Since s is an even positive integer, the function wj(x) has s continuous derivatives, with
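\[ w_j(h(a_j/q_j)) = 1. \]

Lemma 10.3.1 (Smoothed Farey arcs) We have
\[ S = \sum_{j=1}^{J} \sum_{m=M}^{M_2} w_j(m) b(m)\, e(f(m)) + O\bigl( BM^{k/2} (N \log M)^{1/2} \bigr). \]

Proof We note that
\[ \sum_{j=1}^{J} w_j(x) = \begin{cases} 0 & \text{for } x \le M, \\ 1 & \text{for } M + 2N \le x \le M_2 - 2N, \\ 0 & \text{for } x \ge M_2. \end{cases} \]
By Lemma 10.2.1 we have
\[ S - \sum_{j=1}^{J} \sum_m b(m) w_j(m)\, e(f(m)) \ll \Bigl( \sum_{m=M}^{M+2N} + \sum_{m=M_2-2N}^{M_2} \Bigr) |b(m)| \ll (B^2 M^k N \log M)^{1/2}. \] □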
The bound obtained for the smoothing error using Lemma 10.2.1 is not very good. The bound for the individual Fourier coefficients (Deligne 1974) is
known only in the classical case. We define $F(z)$ to be a primitive modular form or newform (Atkin and Lehner 1970) if $F(z)$ cannot be derived from simpler functions by taking linear combinations of twists.

Lemma 10.3.2 (The Ramanujan hypothesis) Let $F(z) = \sum a(m) e(mz)$ be a newform of even weight $k$ and some congruence level, normalized by $a(1) = 1$. Then $a(n)$ is an algebraic integer for each $n$, and
\[ |a(n)| \le d(n)\, n^{(k-1)/2}. \tag{10.3.7} \]
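As a concrete instance of (10.3.7), one can test the bound on Ramanujan's $\tau(n)$, the coefficients of the weight 12 form $\Delta(z) = q\prod(1 - q^n)^{24}$, for which $\tau(1) = 1$. This example is an assumption chosen for illustration and is not discussed in the text; the helper names are hypothetical. Squaring both sides keeps the comparison in exact integer arithmetic.

```python
def tau_table(limit):
    """t[m] = tau(m), the coefficient of q^m in q * prod (1 - q^n)^24."""
    series = [0] * limit
    series[0] = 1
    for n in range(1, limit):
        for _ in range(24):
            # In-place multiplication by (1 - q^n), highest degree first.
            for i in range(limit - 1, n - 1, -1):
                series[i] -= series[i - n]
    return [0] + series


def divisor_count(n):
    """d(n), the number of positive divisors of n."""
    return sum(1 for d in range(1, n + 1) if n % d == 0)


def satisfies_bound(a, k, limit):
    """Check a(n)^2 <= d(n)^2 n^(k-1), i.e. |a(n)| <= d(n) n^((k-1)/2)."""
    return all(
        a[n] ** 2 <= divisor_count(n) ** 2 * n ** (k - 1)
        for n in range(1, limit + 1)
    )


tau = tau_table(80)
print(satisfies_bound(tau, 12, 80))
```

A finite check of this kind says nothing about the general statement, which is the deep theorem attributed above to Deligne (1974).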
There is a sense in which the Ramanujan hypothesis is the analogue for a certain quadratic polynomial of the Riemann hypothesis for the zeta function: see Ireland and Rosen (1982). For non-holomorphic modular forms, or modular forms without a congruence level, the assertion that a(n) is algebraic
is false, but the upper bound (10.3.7) may still hold. The Jacobi theta functions with weight $\tfrac{1}{2}$ satisfy Lemma 10.2.1 but not Lemma 10.3.2: most coefficients are zero, but the Rankin series has to diverge at $s = k$.
Fourth-power analogues of Lemma 10.2.1 have been established by Moreno and Shahidi (1983). Their result is known for more cases than the Ramanujan hypothesis.

Lemma 10.3.3 (The fourth-power Rankin series) The series
\[ \sum_{l=1}^{\infty} \frac{|b(l)|^4}{l^s} \]
converges for $\mathrm{Re}\, s > 2k - 1$, and has a pole of order 2 or 3 at $s = 2k - 1$. The pole has order 2 for forms of even weight and congruence level one.
Corollary
When the pole has order 2, then there is a constant B2 with L
1=1
I b(l)4
lzk- 1 5 BZ (log2L)z
for any positive integer L.
Either Lemma 10.3.2 or 10.3.3 can be used to improve the smoothing error in Lemma 10.3.1.
10.4 WILTON SUMMATION ON THE FAREY ARCS

We apply the truncated Wilton summation formula (Lemma 10.2.6), and evaluate the resulting exponential integrals.
Lemma 10.4.1 (Explicit Wilton summation) We have (when f" (x) > 0) M2
E wj(m)b(m)e(f(m)) j=1 m=M
_ E 1F, F, b(l)el - 8111 wj(xj + O BMk/2
I MZRZ
N3
)hil
M
(x, )e(gj± (xj ))
11/(2s-2)
+ RZ
)(logM)12),
)
(10.4.1)
where xji and x,-.I are the stationary phase points satisfying
f'(x)=a± 1 (I), qj qj V x
(10.4.2)
the functions g j W, g,-., ( x) are given by
ax
2
qj
qj
(IX)
g (x)=f(x)- '--+-
+8+8,
the weight functions hit (x), h7 (x) are given by
(x/l
)ck-1)/2
(f" (x) ± (l/x3)1/2/2qj)
(4lxgf )1/4
and the length of the reflected sum K1 is the smallest integer with
k1+1 _ 18M/R2. Proof First we fix j, and write a/q for aj/qj. We apply the truncated Wilton summation formula with a(m), b(l) replaced by
b(m)e(am/q), q-kb(l)e(-al/q) as in Lemma 10.2.3. The size of the cut-off K depends on the derivatives ax
dxr
wj(x)e f(x) - q
which are sums of terms involving r,
q
with
ro+r1 +2r2+
+tr1=r.
We have w,(ro)(x) << 1/Nro,
and
f'>(x) << 1/M`-2NR2
for i = 2,. .. , s, whilst when wa(x) is non-zero, then a
f'(x) -
NR <<
1
maxi f" (x)I <<
q q 4R By the order-of-magnitude assumption (10.3.4) we have
d w
We l f(x) -
-ax
.
l
1
dx' l q qR for r 5 s. We take the range of integration in Lemma 10.2.6 to be the support of w,(x), of length O(NR/q), so that we may replace M2 - M by O(NR/q) in the error estimate. Then 12/(s- 1)
K
qR)
and the error term is 1/(2s-2)
NVR-
O
(log M)1/2
BMk/2-1
( qR R
Vt
(10.4.3)
The main term is
x
a1
b(1)e -
I (I+(1) +I-(1)),
q
1=1
where
r I ±M = J
(X)(k-1)12
1
ax
2 (lx)
q
q
f(x) - - T
(41xq2 ) 1/4
T
dx.
We treat I+(1); I-(1) is similar. The exponent is stationary at xi given by (10.4.2). For x in the support of w1(x) we have
a.+a ' '+1
+Nmaxl f"(x)I
q1 + qi+ 1
a
1
q
qR
-+
NT
a
1
q
qR
+C2M25-+
1
+2R2.
For l z 18M/R2 we have
f'(x)
Ox I
2q q The weighted First Derivative Test (Lemma 5.5.5) gives (k-1)/2
1
M
(IM q 2)1/4
(1)
<<
I+(1)
q
q
Mk- /2 lk/2
f qs/2
TM M2
(N
M3
1
+ NZT q
(M) 11
!' 1) (-i--)
M 5/4 M +qs/2(l) M 5/4 M
(1)
NRZ Z
q2M
1V2,
312
By Lemma 10.2.1, K
K,+1
MKT3/2
al )I+ (l) <<
b(1)e(_
`
BM(k-1)/2(R q
(log M)112. (10.4.4)
)3/2
q
For 15 K1 we apply Lemma 5.5.6 with
a=x,, -M/4, 6 =xjl+M/4. If a< M, then we must define the function f(x) suitably in the range 3M/4 <x <M, where all the weights wa(x) are zero, and similarly if /3 > M2. We have
6/R> If'('_ 1) - f'(xx,)I >_ (N_1-xj,)max'I f"(x)I. We now see from (10.3.3) and (10.3.5) of the construction that N,_1 >- xj1-
M/4, and similarly that N.<xj+,+M/4. Hence the interval a <x < /3 includes the support of wj(x), and the endpoint terms in Lemma 5.5.6 are zero. The stationary phase term in Lemma 5.5.6 gives the explicit term in (10.4.1), and the second error term dominates the first. The error terms for I+(l) from Lemma 5.5.6 sum to K,
<< ,= Ib(1)
1
( (lMg2)1/4 l
(k
M
1)/2
I
1
M3
N2T3/2
BMk/2R3/2(log M)112 <<
(Nq)
,
(10.4.5)
where we have used Lemma 10.2.1 again. In each range $Q \le q < 2Q$ there are $O(MQ^2/NR^2 + 1)$ Farey arcs by Lemma 1.6.1. We obtain the error term in (10.4.1) by summing (10.4.3), (10.4.4), and (10.4.5) over the Farey arcs; we find that the contribution of (10.4.5) cannot dominate both the other error terms at once. □

10.5 A LARGE SIEVE ON THE FAREY ARCS
At this point in the argument we have been accustomed to using Lemma 5.6.6 with a bilinear exponent, and symmetry between the 'integer' vectors $x$ and the 'rational' vectors $y$. The integers were small, and the number of rationals $a_j/q_j$ was large. In this chapter we have $L$ large and $J$ small. We do not want a Hölder inequality replacing one variable $l$ by two variables $\sum l_i$ and $\sum l_i'$, where the $l_i$ are integers in the range of $l$. We treat the sum over $l$ and $j$ in Lemma 10.4.1 as a Type II sum in Lemma 5.6.1. Montgomery (1971, Lemmas 1.6 and 1.7) obtained all his large sieve results this way. We simplify the notation of Lemma 10.4.1, writing $K$ for $K_1$,
\[ g_j(l) = g_{jl}^{+}(x_{jl}^{+}) - \bar{a}_j l/q_j, \tag{10.5.1} \]
\[ h_j(l) = w_j(x_{jl}^{+})\, h_{jl}^{+}(x_{jl}^{+}). \tag{10.5.2} \]
Of course similar arguments apply to the sums involving $x_{jl}^{-}$. We divide the range for $l$ into blocks $L \le l \le 2L - 1$, and the range for $j$ into blocks with
\[ Q \le q_j < 2Q. \tag{10.5.3} \]
We write $\Sigma(Q)$ for a sum over $j$ restricted by (10.5.3), and $J(Q)$ for the number of Farey arcs satisfying (10.5.3).
We cannot separate the variables completely before we apply the large sieve.
Lemma 10.5.1 (The dependence on 1)
Consider l as a continuous variable with 0< 1< K. Then for j fixed with Q 5 qq < 2Q we have in (10.4.2)
dx dl
16C2NR2
(10.5.4)
15Q (IM)
Let u? be the value of x in (10.4.2) when 1 = 0. Then in (10.4.2) we have 16C2NR2Vi
(10.5.5)
15MQ2
We can sharpen (10.5.4) and (10.5.5) to
dx dl
+
1
f" (u; );'-
F
N2R4 O I M2QZ
u
!+ qi
qi
2ui2f"(u;2)
+0
(10.5.6)
I LN2R4
1
M5/2Q3
2qi
(10.5.7)
Proof The stationary phase point x is defined by (10.4.2): q1
1
f (x)= q-+ qj
V
1
(10.5.8)
x/
so
(x)+
1
1
2qI(i)\dxdl
2qt (1x)
(10.5.9)
From (10.3.5) we have
VK 5 /R2) (x3) 2Q 1
2q
MR
5
< - minlf" (x)I. 16CZM2 - 6
Hence
dx
dl S 2qj
16
1
(lx)
15minlf"(x)I'
which gives the bound (10.5.4). Define z by
x = (uj +z)2. Then we have
-u;l<
1
2V
i dx dl, fo dl
which gives (10.5.5). Expanding f'(x) by Taylor's theorem in (10.5.8) as far as the term in f (3)(x), we find that VT
z
2gjui f" (uf (
1+0
Izl
M
which gives (10.5.7), and (10.5.6) follows similarly from (10.5.9).
Lemma 10.5.2 (A large sieve)
The main terms in Lemma (10.4.1) satisfy 2
min(2L-1, K)
T(Q) b(l)hjU)e(gj(l)) j
1=L
J(Q)z )1/4
Mk-1NR2
Q
5AOB2
+E
(LI -
log M
4VB(O1(V),O2(V),03(V),L4(V))I, (10.5.10)
V
where V is summed through positive powers of 2 satisfying V ::g
LF (LM )1/4
.
(10.5.11)
Here
Q -(L ) LQ2
A2(V) =A2
MV2
NR2
LN2R4
+A2 M3Q2
LQ2 3/4 NR2 +A3 MQ2 ( M VVL 1
A3(V) =A3
04(V) =A4
MQ2 NR-TV--2
LQ 2
0
NR2
0 M) +A4 MQ2 VM
and B(O1, 02, 03, 04) denotes the number of pairs i, j with Q < q,, qj < 2Q for which
lizijil < All u;
q ,
u1
qj
(10.5.12)
02 min(02, A3)
if z1
< 04
zii = 0,
0,
if z,. = 0,
(10.5.13)
and
f " (u?)
f (u2)
if
(10.5.14)
where
zit = ai/gi - aj/q>, and
uj =
f'(u12)=ai/q1,
x+-o,
and A0,..., A4 are constants constructed from the constants C1.
Proof Using Lemma 5.6.1, we bound the square on the left of (10.5.10) by 2L-1 lb(l)12
lkh,(l)ht(1)e(gi(l) -ge(l))
1k
1
t
The first factor is O(B2 log L) by Lemma 10.2.1. We estimate the sum over 1 in the second factor using Lemmas 5.1.1 and 5.4.3. The weight function
H(l) = lkhi(l)h1(l) is allowed to depend on i and j. We need estimates for the size and total
variation of H(l) that are uniform in i and j. The bound for dxj,/dl in Lemma 10.5.1 shows that the weight function H(1) is f times a slowly varying function of 1, with
H(1)^
Mk 1NR2 Q
l M1 ,
and the total variation of H(l) has the same order of magnitude. After Lemma 5.1.1, we consider sums e(g,(1) - g,(1))
(10.5.15)
over subintervals of the range L < I < 2L. We have d ge(l)
dl
_ - gjV(x1l) -
-
and 1
xj
2qj
(13)
d2
dl2
ge(l)
qj
)
dx
lxj, 1
dl
1
1
2qj
,
By Lemma 10.5.1 we have kdl
a, 1 uj gi(1>+- -+-+
d
qj f
dl
32C2 NR
+q
71
2
(10.5.16)
15MQ2
A5N2R4
1
2ql
q1
aj
uj
1
ge(l) + q,
a uif, (ui)
2
MaQs
<_
M V(--)
(10.5.17)
and
d2
dl2
ui
1
g1(l) + 2q/
A6N2R4
5
13/2
(10.5.18)
M2Q3 (LM)
for some constants A5 and A6. In the sum (10.5.15), either u,
u
qj
1 15 qj
32A N2R4L
(10.5.19)
M5123 Q
or else the second derivative of the exponent in (10.5.15) has fixed order of magnitude, which we write as -gj(1))
2
Q
(L) 5
dl2
2
5 Q
(g;(1)
(LM
)
(10.5.20)
corresponding to u;
u!
qr
q11 -
'q
(10.5.21)
Q
for some i with 71 >
2A6LN2R4/M3Q2.
We call pairs i, j for which (10.5.19) holds `bad pairs', and other pairs `good
pairs'. For a bad pair, the first derivative (d/dl)(g;(l) -g,(l)) changes by at most 18A6N2R4 M2Q4
(LQ2 M
which is bounded, since
Q >> min(l, NR2/M).
For bad pairs, the truncated sum in Lemma 5.4.3 has a bounded number of terms. For good pairs Corollary 1 to Lemma 5.4.3 gives
(r
e(g,(1) -g1(1)) <<
1/4
(-)
+ (LM)"4
(10.5.22)
We use the Riesz interchange principle, counting how often the sum over I is large. If
(LM)1/4V
e(g;(1) -g1(l))
F
(10.5.23)
with V greater than some absolute constant A7, then either the second term dominates in (10.5.22) with (10.5.24)
-q << LQ2/MV2,
or the pair i, j is bad. Combining both cases, we have LQ
« v2 to
LQZ
N2R4 + MZQ4
(
(10.5.25)
M
which gives u;
u1
LN2R4
LQ2
q1
<<
Mv2 + M3Q2
(10.5.26)
q1
In both cases the truncated sum in Lemma 5.4.3 has a bounded number of terms. The fraction z11= a;/q; - a1/q1 is only defined modulo one, so we can take the largest term to be r = 0, and
e(g,(l) -g1(l)) << I
min(2L-1,K)
L
e(g;(y) -g1(y)) dy
+ 1.
The First Derivative Test (Lemma 5.1.2) gives mint
d
dl
(g1(l) -g'3(1))I
-
(Lm) 11 4
V
There are now two cases. If z;1 is non-zero, then we use (10.5.16) to get
Iz.. <
1/4
(LM) V
+
1
u1
-u1
q,
q1
+
NR2 MQ2
(10.5.27)
If the pair i, j is bad, then the middle term in (10.5.27) is dominated by the third term. If the pair i, j is good, then the middle term has size
Q
I L 1<<
QV
I
M 1
(10.5.28)
by (10.5.21) and (10.5.24). Since the left-hand side of (10.5.23) is trivially at most L, we deduce the upper bound (10.5.11) for V, which shows that the expression on the right of (10.5.28) dominates the first term on the right of (10.5.27), and we have
z;il «
z
(M)
Q
2
+ MQZ
(10.5.29)
in both good and bad cases.
If zii is zero (so that qi = qi), then we can argue in two ways. We use (10.5.16) to get 1
,r,
I
ui
ui
qr
qi
F I
NR2
« (LM)1/4V + MQ2
so that ui
qi
ui
qi
1 «VFL(
Q2 3/4 NR2 +MQ2 M
LQZ
1
M
(10.5.30) )
We can also substitute (10.5.25) into (10.5.17) to get 1
1
i (su2)
2qZu2"f"(u?)
2qiZu2f
<<
Q
4
(M) +
M2Q3
(M) ,
analogous to (10.5.29). We deduce that
l f" (u?)
MQ2
u? u?
<<
+
NR2V2
NR2 MQ2
L2 LQ2
(-k--).
(10.5.31)
Using (10.5.30), we can replace $u_i^2/u_j^2$ by $q_i^2/q_j^2 = 1$ in (10.5.31). We now see that when (10.5.23) holds with $V > A_7$, then the pair i, j is counted in $B(\Delta_1(V), \dots, \Delta_4(V))$ when $\Delta_1, \dots, \Delta_4$ are chosen to correspond to the implied constants in (10.5.29), (10.5.26), (10.5.30), and (10.5.31). The result (10.5.10) of the lemma follows when we split into cases in which the modulus of the sum (10.5.15) lies in ranges V to 2V.
Lemma 10.5.2 can be applied to a family of sums formed with functions $f_i(x)$ satisfying the inequalities (10.3.2)-(10.3.5) uniformly in i. We interpret the sum $\sum^{(Q)}$ as extending over Farey arcs from all sums of a family. As usual, we suppose that $f_i(x) = f(x, y_i)$, where $y_1, \dots, y_I$ are distinct points in $0 \le y \le 1$.
Lemma 10.5.3 (Large sieve for a family of sums) For a family of sums formed with functions
$$f(x, y_i) = T F(x/M, y_i),$$
where $F_{12}$ is continuous and bounded away from zero, let $B'(\Delta_1, \dots, \Delta_4)$ denote the number of pairs i, j counted in $B(\Delta_1, \dots, \Delta_4)$, but with
ar/qr $ aj/q,. Then we can modify (10.5.10), replacing the sum over V by 1QV
E v
vQ
B,(z
1(V ), A2(V ), i3(V ), i4(V ))
+J(Q)L +
QJ(Q)log I
(M )
minlyr -Yjl
(10.5.32)
Proof The analogue of (10.5.9) is
fll(xjr,Y) +
xr= -f12(xj,Y),
FX+3
Zqj
aY
so - ayxj+1X
M.
Let g,(l) in (10.5.16) be formed with parameter yr, gj(l) with parameter y, with a,/q; = aj/qj, but r 0 s. Then a
1
a(gt(1)-g;(l))I
=
qj
xjr
1
(l ) - qj V
l yr -Ysl
Q (LM)
I axjr
aYI
x t
1
I Y. -Ysl
Q
M
V1L
and the First Derivative Test gives
IEe(g,(1)-gj(l))I << IYrQYs)
(M)
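If $y_{i+1} - y_i \ge \delta$ for $i = 1, \dots, I - 1$, then $|y_r - y_s| \ge |r - s|\delta$, and for r fixed
$$\sum_{s \ne r} \frac{1}{|y_r - y_s|} \le \frac{2}{\delta} \sum_{n=1}^{I-1} \frac{1}{n} \le \frac{2}{\delta}(\log I + 1).$$ □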
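The spacing estimate that closes this proof is easy to test numerically. The following is a minimal sketch, assuming points whose consecutive gaps are at least a chosen `delta`; the helper names `spaced_points`, `reciprocal_gap_sum` and `harmonic_bound` are illustrative and do not come from the text.

```python
import math
import random


def spaced_points(count, delta, rng):
    """Return `count` increasing points whose consecutive gaps are >= delta."""
    ys = [0.0]
    for _ in range(count - 1):
        # Each gap is delta times a factor in [1, 2), so it is never below delta.
        ys.append(ys[-1] + delta * (1.0 + rng.random()))
    return ys


def reciprocal_gap_sum(ys, r):
    """Sum of 1/|y_r - y_s| over all s different from r."""
    return sum(1.0 / abs(ys[r] - y) for s, y in enumerate(ys) if s != r)


def harmonic_bound(count, delta):
    """The bound (2/delta)(log I + 1) with I = count."""
    return (2.0 / delta) * (math.log(count) + 1.0)


if __name__ == "__main__":
    rng = random.Random(0)
    ys = spaced_points(50, 0.01, rng)
    worst = max(reciprocal_gap_sum(ys, r) for r in range(len(ys)))
    print(worst, harmonic_bound(len(ys), 0.01))
```

The worst case is a point in the middle of an evenly spaced set, where both neighbours at each distance contribute; that is where the factor 2 comes from.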
10.6 TOWARDS A MEAN SQUARE RESULT
We start with the smoothed sum
$$S(t) = \sum_j \sum_m b(m) w_j(m) g(m) e(f(m)), \qquad (10.6.1)$$
where f(m) depends also on a parameter t by
$$f(m) = (T + t) F(m/M),$$
and g(x) is supposed bounded and of bounded variation. We write $S^{(Q)}(t)$ for the sum S(t) with the extra restriction (10.5.3), so that
$$|S(t)|^2 = \Big|\sum_Q S^{(Q)}(t)\Big|^2 \le \Big(\frac{\log R}{\log 2} + 1\Big) \sum_Q |S^{(Q)}(t)|^2,$$
where Q is summed through powers of 2 with $Q \le R$. We obtain short interval means for the sums $S^{(Q)}(t)$, but the length of the short interval depends on Q. Changing to uniform Diophantine approximation merely moves the difficulty elsewhere. When we consider a family of sums, then we subdivide the Farey arcs with $Q \le q_j < 2Q$ further into transversal sets. We use the same notation $\sum^{(Q)}$ for a typical transversal set, with
$$|S(t)|^2 \ll \log R \sum_Q |S^{(Q)}(t)|^2,$$
the sum being over all transversal sets for each Q. We write $S_i(t)$, $S_i^{(Q)}(t)$ to indicate that T takes the value $T_i$. We sketch how to obtain a short interval mean value for the smoothed sum
S(t), provided that (10.3.3) holds for r = 1 also. We start from a smooth version of Lemma 5.6.4.
Lemma 10.6.1 (Short interval mean with smoothing) Let am be a sequence of real numbers, and let f3(M)...... 3(M2) be any complex coefficients. Let 8 > 0 be real, s >- 4 be an even integer, and let M2
M2
A(t) _ E 13(m)e(amt),
B(x)
F, f3(m)X(x - am),
M
M
where X(x) is the smooth weight with s continuous derivatives X(x)-
?(1+coss+1 S)
forIxl<S,
0
for IxI >- S.
Then
f 1r/3S JA(t)12 dt a/3s
<_
Sf 2
I B(x)12 dx.
Proof We note that X(x) is positive, and rs J X(x) dx = S.
The Fourier transform j(t) of y(x) is given by The
X(t) = f e(-tx)X(x)dx.
For Itl < it/35 and btl 5 6, we have I txl 5 it/3, and s
Re X(t)
X(x) s dx >i f
- 26
We now use Parseval's identity for Fourier transforms:
f
I E /3(m)X(x-am)I dx= f 1E,6(m)X(-0e(ant)I dt flr/3S
2
E 13(m)e(a,,t) I $'(-t)12 dt,
-1r/3S
M
0
which gives the lemma.
We obtain a short interval mean
fUIScQ>(t)I2 dt <
S2
f IB(y)12 dy
(10.6.2)
by taking
$$\alpha_m = F(m/M), \qquad \delta = \pi/3U,$$
$$\beta_j(m) = w_j(m) b(m) g(m) e(\alpha_m T), \qquad \beta(m) = \sum_j{}^{(Q)} \beta_j(m).$$
Since (10.3.3) holds for r = 1, for fixed y, $\chi(y - \alpha_m)$ is non-zero for an interval of m of length between $2\delta M/C_1$ and $2C_1\delta M$. We take
$$\delta M \asymp NR/Q, \qquad U \asymp MQ/NR. \qquad (10.6.3)$$
Lemma 10.6.2 (A short interval mean) In the notation of Lemmas 10.4.1 and
10.5.2 f U IS(Q)(t)12 dt
-U
/M) k-1/2
« E(Q) j
RI L
`
L
log2 M f
]2
2L-1
2
b(1)e(gi(1)+r11)
d77
1=L
1111
( Q2 M )1/3
+ O B2Mk logz MI
Rz
QR
Q2 M2R2 Q6 M4R4 R6 N6 + R2 N3
where K(x) is the weight function of Lemma 5.2.3 with N replaced by L.
,
Proof There are finitely many j for which /3j(m)X(y - am) 0 0. Hence 2
R(m)x(y - a,,,)I
2
« (Q)
aj(m)X(y I
M
am)I
.
(10.6.4)
m
In the notation of Lemma 10.4.1
E wj(m)b(m)g(m)x(y - am)e(TF(m/M)) m
qJ
b(1)e
x
M
wj(xi )g(xj )x
,
+O BMk/2(log M)1/2 r Q 3/2 MR2
+I
R)
N2
+
M
(Q) (QR
R3/2
(NQ)
(10.6.5)
We break the sum over I into blocks L < 15 min(K, 2L - 1) as usual. We write
hj(1) = wj(xj,)g(xj+,)X(y - F(xj+,/M))hj, (x, ), and treat hj(l) as a weight function, with ( M\(k-1)/2 (NR2) hi (1) « L JII
(LMQ2)1/4
We use Lemma 5.1.1 to pass to a subinterval of L 515 2L - 1 with an unweighted sum, and Lemma 5.2.3 to pass from the subinterval to the whole interval, with an integration over a factor e(7 l). Cauchy's inequality gives 2
F, wj(m)b(m)g(m)x(y - am)e(TF(m/M))I M
<< log M B2Mk log M
N 2 R M 1/3 (M2Q (QR
+ F, F,
t
Q3 M2R4 R3 ) + R3 N4 + NQ
f K('q)
L
2L-1
x 1=L
NR2
2
b(I)e(gj(1) +'ql)
L
Q (LM)
(10.6.6)
in the notation (10.5.1). The bound (10.5.6) is uniform in y, but the sum on the left is non-zero only for y within a distance 6 of a value of F(m/M) on
the Farey arc, so y lies in an interval of length O(S) by (10.6.3). Integrating (10.6.6) with respect to y, we have fU IS(Q)(t)12 dt
-U
1
<<
M k-1 NR21og2 M
(Q)
F,
i
L E( L)
Q (LM)
2L-1 X f K(77)1
12
b(1)e(gj(1) + 711)
d07
[=L
r N2R M 1/3
+B2Mk log3 M M2Q
QR
Q3 M2R4 R3 R3 N4 + NQ I I
which gives the result of the lemma.
Lemma 10.6.2 has the sum over l inside the mean square. We use two lemmas on matrices.
Lemma 10.6.3 (Bilinear form duality) Let A be a complex matrix, and A* its complex conjugate transpose. Then the square matrices AA* and A*A have the same non-zero eigenvalues.
Proof If both
$$w = Av, \qquad A^*Av = \lambda v$$
are non-zero vectors, then
$$AA^*w = \lambda Av = \lambda w.$$
Lemma 10.6.4 (Gershgorin's bound) Each eigenvalue $\lambda$ of a square matrix $C = (c_{ij})$ satisfies
$$|\lambda - c_{ii}| \le \sum_{j \ne i} |c_{ij}|$$
for some row index i.
Proof Let i be the row in which the corresponding eigenvector v has its largest entry. Consider the ith row of Cv.
Lemma 10.6.5 (Dualized large sieve) 2L-1 F(Q) f1/2
i
-1/2
K(n)
In Lemma 10.6.2, for a family of sums, 2
b(1)e(g,(1) + ql)
d,7
!=L
(LM)114
(J(Q+ E VBa(O1(V ), ..., A4(V ))) + L
< < B2Lk loge L
V
+
Qlog I
L
min ly, -y1l
M
10j
where V is summed over powers of 2 as in Lemma 10.5.2, and $B_0(\Delta_1, \dots, \Delta_4)$ is the maximum over j of the number of pairs i, j counted in $B'(\Delta_1, \dots, \Delta_4)$. For a single sum we omit the term in log I.
Proof Let A be the J(Q) X L matrix corresponding to the sum in Lemma 10.5.2 or 10.5.3 with e(g,(1)) in row j, column 1. Then F(Q)
2L-1
2
F, b(1)e(gj(l) + r11) [=L
_
a,[,aj1 5A
b(11)e('711)b(l2)e(-7112) [,
Ib(l)12, [
1
12
where $\lambda$ is the largest eigenvalue of AA*. By Lemma 10.6.3, $\lambda$ is the largest eigenvalue of A*A, and by Lemma 10.6.4
$$\lambda \le \max_i \sum_j \Big|\sum_l e(g_i(l) - g_j(l))\Big|,$$
which we estimate as in Lemma 10.5.2 or 10.5.3. The bound is uniform in $\eta$, so we may integrate $|K(\eta)|$ to O(log L).
10.7 JUTILA'S THIRD METHOD
The most powerful form of Jutila's method for the mean square (1990a) uses the large sieve of Lemma 5.6.6. As in Section 2.4.5, we need some heavy gardening. The reward is to make the parameters N and R independent, so that we can choose better ranges. We use partial summations repeatedly.
Lemma 10.7.1 (Partial summation in a modulus squared integral) Let H(x, y) be differentiable for $M \le x \le N$, $0 \le y \le Y$. Then for any coefficients G(n, y) we have
$$\int_0^Y \Big|\sum_{M}^{N} G(n, y) H(n, y)\Big|^2 dy \le C \sup_{M \le K \le N} \int_0^Y \Big|\sum_{M}^{K} G(n, y)\Big|^2 dy,$$
where
$$C = 2 \sup_y |H(N, y)|^2 + 2(N - M) \sup_y \int_M^N |H_1(x, y)|^2\, dx.$$
Proof As in Lemma 5.2.1, we start from a partial integration identity:
$$\sum_{M}^{N} G(n, y) H(n, y) = H(N, y) \sum_{M}^{N} G(n, y) - \int_M^N H_1(x, y) \sum_{M}^{x} G(n, y)\, dx.$$
The lemma follows on squaring and applying Cauchy's inequality.
We write $\beta_m = F(m/M)$, and consider the sum
$$S(t) = \sum_{M}^{M_2} b(m) g(m) \chi(m) e(\beta_m t),$$
where g(m) is a smooth weight function with $g'(m) \ll 1/M$, and $\chi(m)$ is a smoothed characteristic function, zero for $m \le M$ or $m \ge M_2$, rising smoothly to one within a short distance of M and $M_2$, of the type used in Lemma 10.3.1. Lemma 10.7.1 lets us consider sums of the simpler form
$$S(t) = \sum_{M}^{M_2} c(m) e(\beta_m t), \qquad (10.7.1)$$
where $c(m) = b(m)\chi(m)$.
Lemma 10.7.2 (Short interval mean) Given U in $1 < U \le M$, there is an integer H in
$$M/4C_1U - 1 < H < C_1M/2U + 1, \qquad (10.7.2)$$
where $C_1$ is the constant of (10.3.2), with
$$\int_{T-U}^{T+U} |S(t)|^2\, dt \ll \frac{U^2}{M} \sum_m \Big|\sum_{n=m+1}^{m+H} c(n) e(\beta_n T)\Big|^2. \qquad (10.7.3)$$
Proof Lemma 5.6.4 gives fT+oIS(t)12 dt << AF, E c(n)c(m)e( 13,,T - /3mT)A(2A( Nn -13m)).
T-a
m
n
(10.7.4)
The expression on the right of (10.7.4) is itself a modulus squared integral, by Parseval's theorem for Fourier transforms: 2
-f
c,,e(/33T)I dy,
S
where S = 1/20. We put y = F(x/M). Since F'(x/M) is bounded, the right-hand side of (10.7.4) is 2
2
<< M f I
dx, n
where the sum is over n with
F(x/M) 5 F(n/M) 5 F(x/M) + S, a range of the form x 5 n 5 x+ z with
z = S/F' ( )
for some . We now average A over U5 A <- 2U, so z is given implicitly by
F(xMz) -F(M)
20'
and for x fixed
x+z
1
MF( M
1
dA
2 dz
)
Thus the integer in (10.7.3) has order U2 <<
M
2 dxd0
2u
ff C/
VU
x5n5x+z
U3
2
<< M2 f f I xSnSx+z
c(n)e( /3nT) I dx dz,
where z is integrated over a subinterval of the range M/4C1U 5 z S C1M/2U.
The integral over z can be bounded by the length of the range, O(M/U), times the supremum over z. For the value of z at which the maximum occurs, the sum over n has either [z] or [z] + 1 terms. Hence the lemma holds with
H given by [z] or [z] + 1.
We divide the range for m into intervals $I_j$ with length N, where
$$M/U \asymp 2H \le N \ll MU/T. \qquad (10.7.5)$$
In each interval $I_j$ we pick a value $m_j$ of m, and we put
$$\alpha_j = \frac{T}{M} F'\Big(\frac{m_j}{M}\Big).$$
Let $\chi_j(x)$ be a smooth weight function with
$$\chi_j(x) = 1 \text{ for } x = m + y,\ m \text{ in } I_j,\ 0 \le y \le H, \qquad \chi_j(x) = 0 \text{ for } |x - m_j| \ge 2N,$$
and
$$\chi_j^{(r)}(x) \ll 1/N^r$$
for $r \le s$. We put $c_j(n) = c(n)\chi_j(n)$.
Lemma 10.7.3 (linearizing locally) For each $a \ge 1$, let $K(a, \alpha)$ be the kernel
$$K(a, \alpha) = \Big|\sum_{h=1}^{a} e(\alpha h)\Big|^2.$$
Then for each interval $I_j$, there is a positive integer $H_j \le H$ such that
$$\int_{T-U}^{T+U} |S(t)|^2\, dt \ll \frac{U^2}{M} \sum_j \int_0^1 \Big|\sum_n c_j(n) e(\alpha n)\Big|^2 K(H_j, \alpha - \alpha_j)\, d\alpha. \qquad (10.7.6)$$
Proof We use Lemma 10.7.1 with G(h, m) = c(m + h)e((3mT + ajh), H(h, m) = e( ,6m+hT - (3mT - ajh), corresponding to
+y
H(x,y)=e(TF(x
)-TF(
M
x M)-ajY),
with
IH1(x,y)I=2Tr
T
MF'(x
+y
21rT N+H
5
)-aj
M M by (10.7.5). Hence, for some Hj s H, we have U22
T+U
fT_U
dt <<
M
maxIF"(x)I<< 1/H
2
Hj
L ME j mE
F, c(m + h)e(ajh) h=1
In the inner sum c(m + h) = cj(m + h), so we have H 2
c(m + h)e(ajh) < meld h=1
m h=1
2
c(m + h)e(ajh) Hj
Hj
_ F, F, cj(nl)cj(n2)e(ajnl - ajn2) E E 1 h,=1 h2=1
nl n2
nl-hl=n2-h2 Hj
Hj
cj(nl)cj(n2) E F, e(ajhl - ajh2)
_ nj
h,=1 h2=1
n2
hl-h2=n1-n2
2
= f 1 I L cj(n)e(an) 0
Hj
2
E e(ah - ajh) da, h=1
which gives the lemma.
We summarize the gardening steps. We can take the range of integration in (10.7.6) to be $\alpha_j - \tfrac12$ to $\alpha_j + \tfrac12$, and replace $K(H_j, \alpha - \alpha_j)$ by a bounded multiple of
$$K_j(\alpha) = \min(H^2, 1/\|\alpha - \alpha_j\|^2).$$
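The replacement of the kernel by a min-type majorant can be checked numerically. The sketch below assumes the kernel is the squared modulus of the geometric sum of e(αh) over h = 1, ..., H, with e(x) = exp(2πix), and that ‖·‖ denotes distance to the nearest integer; the function names are illustrative only.

```python
import cmath
import math


def e(x):
    """The additive character e(x) = exp(2*pi*i*x)."""
    return cmath.exp(2j * math.pi * x)


def kernel(H, alpha):
    """K(H, alpha) = |sum_{h=1}^{H} e(alpha*h)|^2, computed directly."""
    return abs(sum(e(alpha * h) for h in range(1, H + 1))) ** 2


def dist_to_nearest_integer(alpha):
    """The norm ||alpha||: distance from alpha to the nearest integer."""
    return abs(alpha - round(alpha))


def kernel_majorant(H, alpha):
    """min(H^2, 1/||alpha||^2), the shape of majorant used in the text."""
    d = dist_to_nearest_integer(alpha)
    if d == 0.0:
        return float(H * H)
    return min(float(H * H), 1.0 / (d * d))


if __name__ == "__main__":
    for alpha in (0.0, 0.01, 0.1, 0.25, 0.5):
        print(alpha, kernel(16, alpha), kernel_majorant(16, alpha))
```

Summing the geometric series gives the sharper bound 1/(4‖α‖²) for the second term, so the constant implied by "bounded multiple" can be taken to be one here.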
Let Do run through the numbers 1/2' with
252'=1/Do
(10.7.7)
Then Kj(a) < 1/Do implies I a - ajI S Do, and we deduce from (10.7.6) that 2
fT
2
n-° II da. (10.7.8)
E fa'
I S(012 dt << M E
j
Ao OZ0
n
o
In this form of the method, the parameter R is independent of N. We suppose that
$$H < R. \qquad (10.7.9)$$
By Dirichlet's approximation theorem (Lemma 1.5.1) for each $\alpha$ there is some fraction a/q with $q \le R$,
$$\alpha = \frac{a}{q} + \beta, \qquad |\beta| < \frac{1}{qR}.$$
We pick a smooth weight function (o(x) with 1
for IxI <- 1/R,
0
forIxIz2/R
and
dr)(x) << 1/Rr for r < s. We have fa,+n°
ct(n)e(an)12 da a,-o° I E n <<
/an
E Ia/q-a1I52G°
f I E ct(n)eI `q n
l2 I w(a) d/3; 1
+ /3n I /
(10.7.10)
q
here we have used the condition H < R in (10.7.8). We apply Lemma 10.2.6 to the sum over n, replacing the interval M :!g n < M2 by the support of Xj(x), whose length is O(N), with g(x) = X,(x)e( /3x),
f Ig(')(x)ldx <
(10.7.11)
The condition (10.7.11) is not critical, because Lemma 10.2.6 can be sharpened by taking more terms of the Stokes expansion in Lemma 10.2.5. We have fT+UIS(t)12 dt<
K
1
M Do
0
i
la/q-a Is2Go
f
x(k1)/2 r dl ci(x)Ree --+ <(-l) q
1
b(l)f
(lxg2)1/4
1=1
2
(lx) +(3x+8
dx
q
We use Cauchy's inequality to divide the sum over 1 into ranges L< 15 2L, and we divide the sum over a/q into blocks Q S q5 2Q. We expand out the modulus squared, destroying the symmetry, and consider 2L-1 2L-1 1 xy (k-1)/2
f
1
=L
n
=L
b(1)b(n) f f
M
q (lnxy)
2 (lx)
do
XeI - q + q
2
q
ci(x)cj(y)
(1n)
(/31
(ny) q
do. (10.7.13)
Integrating repeatedly by parts gives (non-uniformly in s)
fe(/3x-E3y)w(g)dR<<
(I
-yl
(10.7.14)
We work to an accuracy Ts. If s >- 2/8, (10.7.15) Ix -yI z QRT5, then the integral in (10.7.14) is O(1/T2) (with constant depending on S), which is quite negligible. Hence terms in (10.7.14) which satisfy (10.7.15) can be absorbed into the error term. We write
2 410 e
q
-2
2V
(ny) e
q
(
q
(f-V )+
2y q
(v - Yy)
When we take x and V as new variables of integration, then we find (with more difficulty) that the terms with I1-nI z
Q
(LM) Ts
can also be absorbed into the error term.
(10.7.16)
We can now set up a large sieve (Lemma 5.6.6) with
x(1, n) _ (1- n, f - v, F (a
a
(
yq>x,Y)=I- q,
2V 2 q
,q(v -v
and
X=
(min(io(
l
(LM) T8))l,min Y= 1,0(
(2L) O Q
/(jT6)v1(2L))
Q ,O(RTS)
1
The coincidence conditions corresponding to the third entries of the vectors x and y are almost trivial. For HN << M2/T,
H «R2,
(10.7.17)
the coincidence conditions take the forms (10.5.12) and (10.5.13). We take R2 << NT-S,
so that the parameter that corresponds to N in Chapter 7 is our H, not N.
This method leads to a mean square result for S(t) itself, not just the components S(Q)(t).
Part III The First Spacing Problem: `integer' vectors
11 The ruled surface method
11.1 PREPARATION: DIVISOR FUNCTIONS
Let h be the integer vector $(h_1, \dots, h_r)$, and let
$$x(\mathbf{h}) = x(h_1) + \dots + x(h_r).$$
The conditions for coincidence between the vectors x(h) and x(g) can be written as
$$g_1^2 + \dots + g_r^2 = h_1^2 + \dots + h_r^2, \qquad (11.1.1)$$
$$g_1^{3/2} + \dots + g_r^{3/2} = h_1^{3/2} + \dots + h_r^{3/2} + O(\delta H^{3/2}), \qquad (11.1.2)$$
$$g_1 + \dots + g_r = h_1 + \dots + h_r, \qquad (11.1.3)$$
$$g_1^{1/2} + \dots + g_r^{1/2} = h_1^{1/2} + \dots + h_r^{1/2} + O(\Delta H^{1/2}). \qquad (11.1.4)$$
We suppose that all the variables are contained within a range
$$H \le g_i, h_i \le 2H - 1. \qquad (11.1.5)$$
For the spacing problem in the large sieve of Chapter 6 we have r = 3, 4, 5, or 6 and $\Delta = \delta H$. We need to treat $\delta$ and $\Delta$ as independent in some of the proofs. We write $N_{2r}(\delta, \Delta)$ for the number of solutions of the equations (11.1.1)-(11.1.5).
The methods of this chapter are mostly elementary number theory. We want to count the number of integer solutions in bounded regions for certain sets of equations and inequalities. The difficulties are partly algebraic, partly analytic. A system of polynomial equations defines an algebraic variety in affine space. The geometry is dominated by two integers, the dimension and the genus of the variety. The size of the set of integer solutions is usually much smaller than the dimension of the variety would suggest. In the case of projective curves, there are only finitely many integer solutions unless the genus is zero or one (Faltings 1983) (an integer point on a projective curve corresponds to a rational point on an affine curve). Dimension and genus are not sufficient to determine the number of integer solutions. The following examples are all parametrizable curves of genus zero.
We give the number of solutions in the box $|x| \le N$, $|y| \le N$ for large N:
1. straight line y = x: $\asymp N$ solutions;
2. parabola $y = x^2$: $\asymp \sqrt{N}$ solutions;
3. parabola $3y + 2 = x^2$: no solutions;
4. circle $x^2 + y^2 = N^2$: $O(N^\epsilon)$ solutions for any $\epsilon > 0$;
5. hyperbola xy = N: $O(N^\epsilon)$ solutions;
6. hyperbola $x^2 - 2y^2 = 1$: $\asymp \log N$ solutions.
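The six counts can be reproduced by brute force for a moderate box. This is only an illustrative sketch: the function `curve_counts` and its dictionary keys are names chosen here, and the exhaustive search is practical only for small N.

```python
def count_points(predicate, N):
    """Count integer points (x, y) with |x| <= N, |y| <= N satisfying predicate."""
    return sum(
        1
        for x in range(-N, N + 1)
        for y in range(-N, N + 1)
        if predicate(x, y)
    )


def curve_counts(N):
    """Brute-force solution counts for the six genus-zero examples."""
    return {
        "line": count_points(lambda x, y: y == x, N),
        "parabola": count_points(lambda x, y: y == x * x, N),
        "parabola_mod3": count_points(lambda x, y: 3 * y + 2 == x * x, N),
        "circle": count_points(lambda x, y: x * x + y * y == N * N, N),
        "hyperbola": count_points(lambda x, y: x * y == N, N),
        "pell": count_points(lambda x, y: x * x - 2 * y * y == 1, N),
    }


if __name__ == "__main__":
    for name, count in curve_counts(100).items():
        print(name, count)
```

The third curve has no integer points at all because a square is never congruent to 2 modulo 3, while the Pell hyperbola gains four new points each time N passes the next solution 1, 3, 17, 99, ... of the recurrence.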
In practice, the nature of the problem seems to change as soon as we add an inequality. The algebraic geometry becomes less relevant. It may be that most solutions of the inequality satisfy not just the given inequality, but a corresponding equation, and they lie on an algebraic variety of smaller dimension. Many divide-and-conquer arguments go from an inequality to an equation in this way. For the equations (11.1.1)-(11.1.4), if $\delta$ and $\Delta$ are small enough, then we expect the diagonal solutions to predominate; these are the solutions with $g_1, \dots, g_r$ equal to $h_1, \dots, h_r$ in some order.
We start with two variables. Let d(m) denote the divisor function as in Section 2.1 of Chapter 2, the number of ways of writing m as a product of two positive integers.
Lemma 11.1.1 (Binary quadratic forms) Let n be a positive integer. Then the number of pairs of integers x, y (of either sign) with $x^2 - y^2 = n$ is 2d(n) if n is odd, 2d(n/4) if n is a multiple of four, and zero if n is twice an odd integer. The number of pairs of integers x, y with $x^2 - xy + y^2 = n$ is
$$6 \sum_{d \mid n} \chi(d),$$
where $\chi(d)$ is the Dirichlet character mod 3 taking the values $\chi(d) = 0, \pm 1$ according to $\chi(d) \equiv d \pmod 3$.
Corollary If a and b are integers with $a^2 < 3b$, then the number of triples of integers $h_1, h_2, h_3$ (of either sign) with
$$h_1 + h_2 + h_3 = a, \qquad h_1^2 + h_2^2 + h_3^2 = b$$
is
$$(6 - 3\chi(a^2)) \sum_{\substack{d \mid (3b - a^2) \\ d \text{ even}}} \chi\Big(\frac{d}{2}\Big). \qquad (11.1.6)$$
Proof If $x^2 - y^2 = n$, then x + y = e, x - y = d for some integers d and e of the same sign, and either d and e are both even, or d and e are both odd. Conversely, each such pair of d and e gives a solution. When n is odd, then any pair of integers with de = n gives a solution. When n is even, then if de = n, either d or e must be even. If n is twice an odd number, then the other factor must be odd, and we cannot construct a solution. If n is a multiple of four, then any pair of even integers d and e with de = n gives a solution.
The equation $x^2 - xy + y^2 = n$ is treated in the same way in principle, but the factorization is $x + \rho y = \delta$, $x + \rho^2 y = \epsilon$, where $\rho$ is a complex cube root of one, with $\rho^2 = -1 - \rho$, and $\delta$ and $\epsilon$ are algebraic integers of the form $a + b\rho$, where a and b are integers (see Hardy and Wright 1960, Chapter 12).
The lemma contains the theorem that every prime number of the form 6m + 1 can be written as $x^2 - xy + y^2$.
For the corollary we eliminate $h_3$ to get
$$2h_1^2 + 2h_2^2 + 2h_1h_2 - 2ah_1 - 2ah_2 = b - a^2,$$
and we note that the substitution
$$x = h_1 + 2h_2 - a, \qquad y = h_2 - h_1 \qquad (11.1.7)$$
makes
$$x^2 - xy + y^2 = 3h_1^2 + 3h_2^2 + 3h_1h_2 - 3ah_1 - 3ah_2 + a^2 = (3b - a^2)/2. \qquad (11.1.8)$$
Conversely, if x and y satisfy (11.1.8), and if a is a multiple of three, then $(x + y)^2$ is a multiple of three, and so is x + y, and we get integer values for $h_1$ and $h_2$ from (11.1.7). If a is not a multiple of three, then
$$2(x + y)^2 \equiv -a^2 \equiv 2a^2 \pmod 3.$$
Half the solutions have $x + y \equiv a \pmod 3$, and the other half have $x + y \equiv -a \pmod 3$. Only the solutions with $x + y \equiv -a$ give integer values for $h_1$ and $h_2$ in (11.1.7).
We deduce the formula (11.1.6).
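Both parts of Lemma 11.1.1 and its corollary can be compared with an exhaustive search for small parameters. The sketch below is a check under the reading of (11.1.6) as a sum of χ(d/2) over the even divisors d of 3b - a²; all function names are illustrative.

```python
from math import isqrt


def chi3(d):
    """The non-principal Dirichlet character mod 3: values 0, 1, -1."""
    r = d % 3
    return 0 if r == 0 else (1 if r == 1 else -1)


def num_divisors(n):
    return sum(1 for d in range(1, n + 1) if n % d == 0)


def count_difference_of_squares(n):
    """Brute-force count of integer pairs with x^2 - y^2 = n (n > 0)."""
    return sum(
        1
        for x in range(-n, n + 1)
        for y in range(-n, n + 1)
        if x * x - y * y == n
    )


def predicted_difference_of_squares(n):
    if n % 2 == 1:
        return 2 * num_divisors(n)
    if n % 4 == 0:
        return 2 * num_divisors(n // 4)
    return 0


def count_eisenstein_form(n):
    """Brute-force count of integer pairs with x^2 - x*y + y^2 = n."""
    bound = isqrt(4 * n // 3) + 1  # |x| and |y| are at most 2*sqrt(n/3)
    return sum(
        1
        for x in range(-bound, bound + 1)
        for y in range(-bound, bound + 1)
        if x * x - x * y + y * y == n
    )


def predicted_eisenstein_form(n):
    return 6 * sum(chi3(d) for d in range(1, n + 1) if n % d == 0)


def count_triples(a, b):
    """Brute-force count of (h1, h2, h3) with sum a and sum of squares b."""
    bound = isqrt(b)
    total = 0
    for h1 in range(-bound, bound + 1):
        for h2 in range(-bound, bound + 1):
            h3 = a - h1 - h2
            if h1 * h1 + h2 * h2 + h3 * h3 == b:
                total += 1
    return total


def predicted_triples(a, b):
    """Formula (11.1.6), valid when a^2 < 3b."""
    n2 = 3 * b - a * a
    s = sum(chi3(d // 2) for d in range(2, n2 + 1, 2) if n2 % d == 0)
    return (6 - 3 * chi3(a * a)) * s


if __name__ == "__main__":
    print(count_triples(0, 14), predicted_triples(0, 14))
```

For a = 0, b = 14 the triples are the twelve signed permutations of (3, -2, -1), which agrees with the formula since 3b - a² = 42 has even divisors 2, 6, 14, 42.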
In the next few lemmas we see that the sum over d in Lemma 11.1.1 is not too large, especially if we can average over n. Thus when $h_4, \dots, h_r$, $g_1, \dots, g_r$ have been chosen in (11.1.1) and (11.1.3), then the last three variables take care of themselves. We need the more general divisor function $d_r(n)$, the number of ways of writing the positive integer n as a product of r positive integers. We give the standard bounds. The averages in Lemma 11.1.3 may be replaced by asymptotic formulae. The leading terms in the asymptotic formulae may be obtained by elementary arguments as in Lemma 11.1.3, or by counting lattice points as in Chapter 2, or by using the Dirichlet series generating function
$$\sum_{n=1}^{\infty} \frac{d_r(n)}{n^s} = \zeta(s)^r.$$
Lemma 11.1.2 (Bounds for divisor functions) We have
$$d_r(mn) \le d_r(m) d_r(n), \qquad (11.1.9)$$
with equality if the highest common factor (m, n) is one, and
$$d_r(n) d_s(n) \le d_{rs}(n). \qquad (11.1.10)$$
For any $\delta > 0$, there is a constant $B(r, \delta)$ with
$$d_r(n) \le B(r, \delta) n^\delta. \qquad (11.1.11)$$
Proof If $m = k_1k_2 \cdots k_r$ and $n = l_1l_2 \cdots l_r$, then $mn = q_1q_2 \cdots q_r$, with $q_i = k_il_i$. If (m, n) = 1, then $q_i$ determines $k_i = (q_i, m)$ and $l_i = (q_i, n)$. If (m, n) > 1, then there are several sets of factors $k_1, \dots, k_r$ and $l_1, \dots, l_r$ which give the same products $q_1, \dots, q_r$. This proves (11.1.9).
If $n = k_1k_2 \cdots k_r = l_1l_2 \cdots l_s$, then we construct rs numbers $q_{ij}$ whose product is n as follows. For any factor e of n, we put $m_1(e) = (k_1, e)$, and, for $i \ge 2$,
$$m_i(e) = (k_1k_2 \cdots k_i, e)/(k_1k_2 \cdots k_{i-1}, e).$$
Then $q_{i1} = m_i(l_1)$, and, for $j \ge 2$,
$$q_{ij} = m_i(l_1l_2 \cdots l_j)/m_i(l_1l_2 \cdots l_{j-1}).$$
The fraction $q_{ij}$ has the form
$$\frac{(kk', ll')(k, l)}{(k, ll')(kk', l)}$$
with $k = k_1k_2 \cdots k_{i-1}$, $k' = k_i$, $l = l_1l_2 \cdots l_{j-1}$, $l' = l_j$. Let d = (k, l), k = du, l = dv. Then (u, v) = 1 and
$$q_{ij} = \frac{(uk', vl')}{(u, vl')(uk', v)} = \frac{(uk', vl')}{(u, l')(k', v)},$$
which is an integer. It is easy to check that the rs numbers $q_{ij}$ determine $k_1, \dots, k_r$ and $l_1, \dots, l_s$.
When n is expressed as a product of powers of distinct primes
$$n = p_1^{s_1} \cdots p_k^{s_k},$$
then
$$d_r(n) = \prod_i d_r(p_i^{s_i}).$$
For p a prime, $d_r(p^s)$ is the same as the number of ways of typing a sentence of s words on paper with r lines. This is the number of ways of typing r + s - 1 things, each either a word or a carriage return, with s words and r - 1 carriage returns. Thus
$$d_r(p^s) = \frac{(r + s - 1)!}{(r - 1)!\, s!} < 2^{r+s-1}.$$
If $p^\delta \ge 2^r$, then, for all $s \ge 1$,
$$d_r(p^s) < 2^{r+s-1} \le 2^{rs} \le p^{s\delta}.$$
There are finitely many primes with $p^\delta < 2^r$, and for each of these
$$\frac{(r + s - 1)!}{(r - 1)!\, s!} \cdot \frac{1}{p^{s\delta}} \to 0$$
as $s \to \infty$. Hence for these primes $d_r(p^s)/p^{s\delta}$ has a maximum at some integer $s_p \ge 0$. We have
$$\frac{d_r(n)}{n^\delta} \le \prod_{p < 2^{r/\delta}} \frac{d_r(p^{s_p})}{p^{s_p\delta}} = B(r, \delta),$$
attained when
$$n = \prod_{p < 2^{r/\delta}} p^{s_p}.$$ □
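The inequalities (11.1.9) and (11.1.10) and the binomial formula for prime powers can be spot-checked on small integers. The following is a minimal sketch; `divisor_function` is a name chosen here for $d_r(n)$, computed by the recursion $d_r(n) = \sum_{e \mid n} d_{r-1}(n/e)$.

```python
from functools import lru_cache
from math import comb, gcd


@lru_cache(maxsize=None)
def divisor_function(r, n):
    """d_r(n): the number of ordered factorizations of n into r positive integers."""
    if r == 1:
        return 1
    return sum(divisor_function(r - 1, n // e) for e in range(1, n + 1) if n % e == 0)


def submultiplicative_ok(r, m, n):
    """Check d_r(mn) <= d_r(m) d_r(n), with equality when gcd(m, n) = 1."""
    lhs = divisor_function(r, m * n)
    rhs = divisor_function(r, m) * divisor_function(r, n)
    if gcd(m, n) == 1:
        return lhs == rhs
    return lhs <= rhs


def product_bound_ok(r, s, n):
    """Check d_r(n) d_s(n) <= d_{rs}(n)."""
    return divisor_function(r, n) * divisor_function(s, n) <= divisor_function(r * s, n)


if __name__ == "__main__":
    print(divisor_function(2, 12), divisor_function(3, 12))
    print(divisor_function(3, 2 ** 4), comb(3 + 4 - 1, 4))
```

The last line illustrates the "words and carriage returns" count: $d_r(p^s)$ equals the binomial coefficient that chooses the positions of the s words among r + s - 1 symbols.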
Lemma 11.1.3 (Averages of divisor functions) For r and N positive integers we have
$$\sum_{n=1}^{N} d_r(n) \le N(\log N + 1)^{r-1}, \qquad (11.1.12)$$
$$\sum_{n=1}^{N} \frac{d_r(n)}{n} \le (\log N + 1)^r. \qquad (11.1.13)$$
For r = 2 (the Dirichlet divisor problem) we have
$$\sum_{n=1}^{N} d(n) = N\log N + (2\gamma - 1)N + 2\sum_{d \le \sqrt{N}} \rho\Big(\frac{N}{d}\Big) + O(1) = N\log N + (2\gamma - 1)N + O(\sqrt{N}), \qquad (11.1.14)$$
where $\gamma$ is Euler's constant.
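The averages in this lemma are cheap to tabulate. The sketch below builds $d_r(1), \dots, d_r(N)$ by repeated Dirichlet convolution with the constant function 1 and also evaluates the sum of d(n) by Dirichlet's hyperbola identity $2\sum_{e \le \sqrt N} [N/e] - [\sqrt N]^2$, a standard form of the divide-and-conquer step; the function names are illustrative.

```python
import math
from math import isqrt


def divisor_table(r, N):
    """Return a list t with t[n] = d_r(n) for 1 <= n <= N (t[0] is unused)."""
    table = [0] + [1] * N  # d_1(n) = 1
    for _ in range(r - 1):
        nxt = [0] * (N + 1)
        for e in range(1, N + 1):
            for m in range(e, N + 1, e):
                nxt[m] += table[m // e]  # d_{k+1}(m) = sum over e | m of d_k(m/e)
        table = nxt
    return table


def divisor_sum(r, N):
    return sum(divisor_table(r, N)[1:])


def weighted_divisor_sum(r, N):
    table = divisor_table(r, N)
    return sum(table[n] / n for n in range(1, N + 1))


def dirichlet_hyperbola(N):
    """Sum of d(n) for n <= N, counting pairs (e, f) with e*f <= N."""
    M = isqrt(N)
    return 2 * sum(N // e for e in range(1, M + 1)) - M * M


if __name__ == "__main__":
    N = 10000
    gamma = 0.5772156649015329  # Euler's constant
    main_term = N * math.log(N) + (2 * gamma - 1) * N
    print(dirichlet_hyperbola(N), main_term, math.sqrt(N))
```

Comparing the printed exact sum with the main term shows a discrepancy that is small relative to the square root of N, in line with (11.1.14).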
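Proof First we note that
$$\sum_{e=1}^{N} \frac{1}{e} \le 1 + \int_1^N \frac{dx}{x} = 1 + \log N.$$
We have
$$\sum_{n=1}^{N} d_r(n) = \sum_{n=1}^{N} \sum_{\substack{e_1, \dots, e_r \\ e_1 \cdots e_r = n}} 1 = \sum_{e_1=1}^{N} \cdots \sum_{e_{r-1}=1}^{N} \sum_{e_r \le N/e_1 \cdots e_{r-1}} 1 \le \sum_{e_1=1}^{N} \cdots \sum_{e_{r-1}=1}^{N} \frac{N}{e_1 \cdots e_{r-1}} \le N(\log N + 1)^{r-1}.$$
Similarly
$$\sum_{n=1}^{N} \frac{d_r(n)}{n} = \sum_{n=1}^{N} \sum_{\substack{e_1, \dots, e_r \\ e_1 \cdots e_r = n}} \frac{1}{e_1e_2 \cdots e_r} \le \Big(\sum_{e=1}^{N} \frac{1}{e}\Big)^r \le (\log N + 1)^r.$$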
For the asymptotic formulae we use a divide-and-conquer idea of Dirichlet. We arrange $e_1, \dots, e_r$ in order of increasing size (with a subdivision of r! cases) and sum the largest first. For r = 2 the method is quite palatable. We want to count the number of pairs of integers e, f with $ef \le N$, which is
E
E
e5FN e
=2er
E 1+ E E 1
1+ L
fyf<eSN/f
e:5 FN f=e
/
1
([1_e+)=2ee P(e)+2 e FI e -eJ.
Let M = [/]. Then, by the trapezium rule (Lemma 5.4.1)
2eE (e -e1 =N-1+M-M+2 fMi x -xdx+4 fM /
N3x)
dx
11
=N+2 j +4N f
x
( N
N
o' (x)
x
1
3
I
/N
dx+Ol
3 x
=NlogN+4IN+O(1), where I is the integral
I=f
o(x) x33
dx.
1
The simplest sum involving I is
e=2-+2M+Mfdx-+2fM v(x) x3 dx
M1
1
1
$= \log M + 2I + 1/2 + O(1/M)$. This is the limit which defines Euler's constant of integration $\gamma$ (about 0.5772). Thus $I = \gamma/2 - 1/4$. □
In counting arguments we often have to add up divisor functions at the integers in some set. If the set is fairly dense in an interval, then the divisor functions average to logarithm powers. If the set is thin, then it is better to use the number of elements in the set multiplied by the maximum of the divisor functions. This leads to a bound with an exponent $\epsilon$ which may be taken arbitrarily small, at the cost of increasing the order-of-magnitude constant. The aim in this account is to average divisor functions where possible, but not to put much work into minimizing the resulting logarithm power. To show what can be done, however, we quote without proof a result of Hall and Tenenbaum (1986, Theorem 3) on Hooley's divisor concentration functions. Let $\Delta_r(n)$ be the maximum over (real) $E_1, \dots, E_r$ of the number of ways of writing
$$n = e_1e_2 \cdots e_r$$
with the factors in ranges
$$E_i < e_i \le 2E_i$$
for i = 1, ..., r. This corresponds to knowing not only the product of r numbers, but also their individual orders of magnitude. Since there are fewer possibilities for the factors, the logarithm powers in the averages of $\Delta_r(n)$ are smaller than in the corresponding results for $d_r(n)$.
Lemma 11.1.4 (Average divisor concentration) For $t \ge 1$ a real number and N large we have
(ti(n))` <
N)r'-rr+c-1+s
n5N
where, for fixed r and t, log log log N 1/2 <<
log log N The implied constants depend on r, t, and S.
11.2 FAMILIES OF SOLUTIONS
The equations (11.1.1) and (11.1.3) correspond to the algebraic variety V with equations
$$x_1 + \dots + x_r = y_1 + \dots + y_r, \qquad (11.2.1)$$
$$x_1^2 + \dots + x_r^2 = y_1^2 + \dots + y_r^2 \qquad (11.2.2)$$
in 2r-dimensional affine space, a quadric generalizing the hyperbola in two dimensions. Since the defining equations are homogeneous, the variety V is a cone with vertex at the origin: if $(x_1, \dots, x_r, y_1, \dots, y_r)$ is a point on V, then so is $(tx_1, \dots, ty_r)$ for any t. Thus any integer multiple of an integer point on V is also an integer point on V. More usefully, V is a cylinder with axis along the vector (1, 1, ..., 1). If $(x_1, \dots, x_r, y_1, \dots, y_r)$ is a point on V, then so is $(x_1 + t, x_2 + t, \dots, y_r + t)$ for any t. This construction forms the basis of Watt's counting method (1989a, b).
The tangent hyperplane to the hypersurface (11.2.2) at a general point P $(x_1, \dots, y_r)$ intersects the hypersurface in an r-dimensional plane. The variety V is the intersection of the hypersurface with a hyperplane. There is an (r - 1)-dimensional plane through P lying in the variety V. We have found two lines in this plane, one joining P to the origin, the other in the direction of (1, 1, ..., 1). For $r \ge 4$, there are more lines through P in linearly independent directions. We are lucky to have a rational line in the direction of a vector with bounded height. There are inequalities of Minkowski (see Cassels 1959) relating the heights of the coefficients in the equations for a plane to the heights of the coordinates of a set of basis vectors for the plane. If one basis vector is short, then the geometric mean of the heights of the other basis vectors will be large. Our first two lemmas use these vector methods.
Lemma 11.2.1 (Three each side) When r = 3, then the number of solutions of (11.1.1), (11.1.3) and (11.1.5) is
$$\le 14H^3(\log H + 3).$$
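For small H the count in Lemma 11.2.1 can be computed exactly. The sketch below assumes the two equations are equality of the sums and of the sums of squares of three variables on each side, with all variables in the range H to 2H - 1; it groups triples by the pair (sum, sum of squares) instead of looping over all six variables. The names are illustrative.

```python
import math
from collections import Counter
from itertools import product


def signature(triple):
    """The pair (sum, sum of squares) that both sides must share."""
    return (sum(triple), sum(v * v for v in triple))


def count_solutions_r3(H):
    """Number of (g1, g2, g3, h1, h2, h3) in [H, 2H-1]^6 with equal signatures."""
    classes = Counter(signature(t) for t in product(range(H, 2 * H), repeat=3))
    return sum(c * c for c in classes.values())


def lemma_bound(H):
    return 14 * H ** 3 * (math.log(H) + 3)


def translate(triple, t):
    """Shift every coordinate by t, as in the cylinder construction."""
    return tuple(v + t for v in triple)


if __name__ == "__main__":
    for H in (1, 2, 4, 8, 12):
        print(H, count_solutions_r3(H), lemma_bound(H))
```

The pair g = (1, 4, 4), h = (2, 2, 5) is a non-diagonal solution, and adding the same integer t to all six entries gives another one, which is the cylinder property used to form families.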
Proof We put
$$k_i = g_i + h_i, \qquad l_i = g_i - h_i.$$
Then
$$(1, 1, 1) \cdot \mathbf{l} = l_1 + l_2 + l_3 = 0, \qquad \mathbf{k} \cdot \mathbf{l} = k_1l_1 + k_2l_2 + k_3l_3 = 0.$$
Case 1 The vector l is zero. There are H choices each for $h_1$, $h_2$ and $h_3$, which are equal to $g_1$, $g_2$ and $g_3$, so $H^3$ solutions in this case.
Case 2 The vector k is parallel to (1, 1, 1), so $k_1 = k_2 = k_3 = k$, say. Since $l_i \equiv k_i \pmod 2$, we have $3k \equiv 0 \pmod 2$, and k, $l_1$, $l_2$ and $l_3$ are all even. There are at most H possibilities each for k, $l_1$, and $l_2$, which determine $l_3$, so at most $H^3$ solutions in this case.
Case 3 The vector l is non-zero, and k is not parallel to (1, 1, 1). Then l is parallel to the vector cross product
$$(1, 1, 1) \times \mathbf{k} = (k_3 - k_2, k_1 - k_3, k_2 - k_1).$$
Let d be the highest common factor of $l_1$, $l_2$, and $l_3$, and let $l_i = dm_i$. Then, for some non-zero integer e, we have
$$k_3 - k_2 = em_1, \qquad k_1 - k_3 = em_2, \qquad k_2 - k_1 = em_3.$$
There are at most 4H choices each for $m_1$ and $m_2$, which determine $m_3$. When $m_1$, $m_2$, and $m_3$ are chosen, then there are at most H/M choices for d and 2H/M choices for e, where $M = \max |m_i|$. When $k_1$ has also been chosen, then we know all six original variables.
When M has been chosen, then $M = \pm m_1$, $\pm m_2$, or $\pm m_3$. If $M = -m_1$, say, then $m_2 + m_3 = M$. The number of solutions in case 3 is at most
$$6 \sum_{M=1}^{4H} \sum_{m_2=0}^{M} \frac{2H^2}{M^2} H = 12H^3 \sum_{M=1}^{4H} \frac{M + 1}{M^2} < 12H^3(\log H + 1 + \pi^2/6) < 12H^3(\log H + 3).$$ □
Lemma 11.2.2 (Equal sums of three squares) The number of integer solutions of
$$g_1^2 + g_2^2 + g_3^2 = h_1^2 + h_2^2 + h_3^2 \qquad (11.2.3)$$
with
$$0 \le g_i, h_i \le H$$
is $O(H^4)$.
ki=gi+hi,
li=gi - hi.
There is one solution with gi = hi = ki = 0 for all i. For the other solutions we let d be the highest common factor of k1, k2, and k3, and put ki = dmi. We consider m = (m1, m2, m3), an integer vector of length M_, where
M=mi +mZ +m3. The condition (11.2.3) becomes k111 + k212 + k313 = 0,
so that if I is the vector (11,12,13), then
We consider the vector m to be fixed. Let A be the lattice generated by m and the integer vectors I with I m = 0. If r is a vector of A, then for some integer t. Conversely, if r is an integer vector with M I r in, then r
is in A. Since the highest common factor of ml, m2, and m3 is one, then
there is an integer vector r with r in = 1. We deduce that r in takes all integer values. Thus A has index M in the full integer lattice, and the parallelepiped formed by the basis vectors of A has volume M. One basis vector is m of length VM-. The other two basis vectors, u2 and v3 say, are at right angles to in, and they form two sides of a parallelogram of area V. These vectors v2 and v3 span a two-dimensional lattice A2. The integer vector I lies in the lattice A2, and it has entries at most H in modulus. In order to apply Lemma 2.1.1, we make a change of coordinates in the plane of A2 so that v2 and v3 are the unit vectors. The vectors with IliI
the origin. If v is any non-zero vector of A2, then (H + 1)v has some coordinate numerically greater than H, and the image of (H + 1)v lies outside S. We see that S has diameter at most 2(H + 1), and perimeter O(H). By Lemma 2.1.1 the number of integer points in S is (
Or
Hz
+H
.
These integer points correspond to vectors I in A2 with Ilil 5 H.
For each power of 2, R say, there are $O(R^3)$ integer vectors m with
$$R^2 \le M = m_1^2 + m_2^2 + m_3^2 < 4R^2.$$
The number of choices of d, m and l is
$$O\Big(\sum_{d \le H} \sum_{R \ll H/d} R^3 \Big(\frac{H^2}{R} + H\Big)\Big),$$
where R is summed through powers of 2. This sum is
$$O\Big(\sum_{d \le H} H^2 \cdot \frac{H^2}{d^2}\Big) = O(H^4).$$ □
Lemma 11.2.3 (Families of solutions) The integer solutions of (11.2.1) and (11.2.2) form families: lines parametrized by
$$x_i = \xi_i + t, \qquad y_i = \eta_i + t, \qquad (11.2.4)$$
where $\xi_1, \dots, \xi_r$ and $\eta_1, \dots, \eta_r$ are fixed, and t is variable. When r = 4, then the number of families which include an integer solution $g_1, \dots, g_4, h_1, \dots, h_4$ in the range (11.1.5) is $O(H^4)$.
Proof We verify that if $\xi_1, \dots, \xi_r, \eta_1, \dots, \eta_r$ is a solution, then (11.2.4) also gives a solution. Now suppose that r = 4, and that $g_1, \dots, g_4, h_1, \dots, h_4$ is an integer solution. We put $t = g_4 + h_4$. Then $2h_4 - t = -(2g_4 - t)$, and
$$\sum_{i=1}^{3} (2g_i - t)^2 = \sum_{i=1}^{3} (2h_i - t)^2,$$
with $|2g_i - t|, |2h_i - t| \le 2H$. By Lemma 11.2.2 with H replaced by 2H, there are $O(H^4)$ possibilities for the integers $|2g_i - t|$, $|2h_i - t|$ with i = 1, 2, 3. For each of these, there are at most 64 ways of choosing the signs. Since
$$2h_4 - 2g_4 = \sum_{i=1}^{3} (2g_i - t) - \sum_{i=1}^{3} (2h_i - t),$$
we know $2g_4 - t$ and $2h_4 - t$, and $g_1, \dots, g_4, h_1, \dots, h_4$ are determined up to the additive constant t. □
Our aim is to show that an average family contains a bounded number of integer solutions of (11.1.1), (11.1.2), and (11.1.3). Some families contain many integer solutions, such as the families with $g_i = h_i$ for each i. We must count how many families give several integer solutions.
Lemma 11.2.4 (Families with many solutions) Suppose that $2 \le r \le 4$, and that $g_1 + t, \dots, g_r + t, h_1 + t, \dots, h_r + t$ gives a family of solutions to (11.1.1) and (11.1.3). Suppose that this family contains L solutions to (11.1.2) in the range (11.1.5), where
$$L > 3 \cdot 2^{2r}.$$
Then for each i = 1, ..., r we have
$$\prod_{j=1}^{r} (g_i - h_j), \quad \prod_{j=1}^{r} (h_i - g_j) \ll \eta H^r, \qquad (11.2.5)$$
with
$$\eta \ll \delta H/L.$$
If the L integer solutions of (11.1.2) also satisfy (11.1.4), then (11.2.5) is true with
$$\eta \ll \min(\delta H/L, \Delta).$$
Proof We write
$$D_s(x_1, \dots, x_r, y_1, \dots, y_r) = x_1^s + \dots + x_r^s - y_1^s - \dots - y_r^s,$$
where the exponent s is not necessarily an integer. The equation
$$D_{3/2}(g_1 + t, \dots, g_r + t, h_1 + t, \dots, h_r + t) = a$$
gives rise to a polynomial equation in t of degree at most $3 \cdot 2^{2r}$. Hence the solutions of the inequality
$$|D_{3/2}(g_1 + t, \dots, g_r + t, h_1 + t, \dots, h_r + t)| \le \delta H^{3/2}$$
form at most $3 \cdot 2^{2r}$ intervals on the real line. When $L > 3 \cdot 2^{2r}$, then some interval I of t corresponds to T + 1 integer solutions, where T is a positive integer with
$$T \ge L/3 \cdot 2^{2r} - 1.$$
We subtract a suitable integer from each of $g_1, \dots, g_r, h_1, \dots, h_r$ to normalize the family so that I includes the interval $0 \le t \le T$. This does not affect the differences $g_i - h_j$. By the mean value theorem applied to the interval $0 \le t \le T$, there is a real number $\tau$ in $0 < \tau < T$ for which
$$\tfrac{3}{2} T\, |D_{1/2}(g_1 + \tau, \dots, g_r + \tau, h_1 + \tau, \dots, h_r + \tau)| \le 2\delta H^{3/2}.$$
We write $x_i = \sqrt{g_i + \tau}$, $y_i = \sqrt{h_i + \tau}$. Then, for s = 1, ..., r, we have
$$x_1^s + \dots + x_r^s = y_1^s + \dots + y_r^s + O(\eta H^{s/2}),$$
where
$$\eta = \max\Big(\delta, \frac{4\delta H}{3T}\Big) = \frac{4\delta H}{3T},$$
since $T \le H - 1$. If the solutions in I also satisfy (11.1.4), then we can also take
$$\eta = \max(\delta, \Delta).$$
We denote the elementary symmetric functions of $r$ variables by
$$P_1(x_1,\dots,x_r) = x_1 + \dots + x_r,\quad \dots,\quad P_r(x_1,\dots,x_r) = x_1\cdots x_r.$$
By Newton's formulae for the sums of powers of roots of equations, we see that
$$P_s(x_1,\dots,x_r) = P_s(y_1,\dots,y_r) + O(\eta H^{s/2})$$
for $s = 1, 2, 3, 4$. Since
$$x^r + \sum_{s=1}^{r}(-1)^s P_s(y_1,\dots,y_r)\,x^{r-s} = \prod_{j=1}^{r}(x - y_j),$$
we deduce that
$$\prod_{j=1}^{r}(x_i - y_j) = x_i^r + \sum_{s=1}^{r}(-1)^s P_s(x_1,\dots,x_r)\,x_i^{r-s} + O(\eta H^{r/2}) = \prod_{j=1}^{r}(x_i - x_j) + O(\eta H^{r/2}) = O(\eta H^{r/2}).$$
Since $x_i$ and $y_j$ both have order of magnitude $\sqrt{H}$, we have
$$\prod_{j=1}^{r}(g_i - h_j) = \prod_{j=1}^{r}(x_i^2 - y_j^2) = O(\eta H^r).$$
Interchanging the roles of $x_i$ and $y_j$ gives the corresponding inequality with $g_i$ and $h_j$ interchanged. □
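The passage from power sums to elementary symmetric functions in this proof rests on Newton's formulae. As an illustration only, here is a small sketch of the exact recursion $k\,P_k = \sum_{i=1}^{k}(-1)^{i-1}P_{k-i}\,p_i$ (with $p_i$ the $i$th power sum and $P_0 = 1$), compared with the direct definition; the function names are hypothetical, and exact rational arithmetic is used, whereas the lemma applies the formulae to approximate equalities.

```python
from fractions import Fraction
from itertools import combinations
from math import prod


def power_sums(xs, r):
    """Return [p_1, ..., p_r] with p_s = sum of x^s."""
    return [sum(Fraction(x) ** s for x in xs) for s in range(1, r + 1)]


def elementary_from_power_sums(p):
    """Newton's formulae: recover [P_1, ..., P_r] from the power sums p."""
    e = [Fraction(1)]  # P_0 = 1
    for k in range(1, len(p) + 1):
        acc = sum((-1) ** (i - 1) * e[k - i] * p[i - 1] for i in range(1, k + 1))
        e.append(acc / k)
    return e[1:]


def elementary_direct(xs, s):
    """Sum of all products of s different variables."""
    return sum(prod(c) for c in combinations(xs, s))


xs = [2, 3, 5, 7]
from_newton = elementary_from_power_sums(power_sums(xs, 4))
direct = [elementary_direct(xs, s) for s in range(1, 5)]
print(from_newton == direct)
```

Because each $P_k$ is a polynomial in $p_1,\dots,p_k$, a small perturbation of the power sums perturbs the symmetric functions by a comparable amount, which is how the error term $O(\eta H^{s/2})$ is carried across.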
We can now deal with the case $r = 4$. This account is based on Watt (1989a). We give the simpler argument which loses a factor $H^{\epsilon}$. The proof has a divide-and-conquer structure. We divide the integer solutions into families, and count the number of families which have many solutions. This brings in another important idea, the Riesz interchange, which is the discrete analogue of the transformation
$$\int_{\alpha}^{\beta}|f(x)|\,dx = \int_{x=\alpha}^{\beta}\int_{y=0}^{|f(x)|}dy\,dx = \int_{y=0}^{\sup|f(x)|}\int_{S(y)}dx\,dy,$$
where $S(y)$ is the set of values of $x$ for which $|f(x)| \ge y$. This idea is useful when we can describe the set $S(y)$.
Theorem 11.2.5 The number of solutions of (11.1.1), (11.1.2), and (11.1.3) with $r = 4$ in the range (11.1.5) is
$$O(H^4 + \delta H^{5+\epsilon})$$
for any $\epsilon > 0$. The implied constant depends on $\epsilon$.
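The Riesz interchange described before Theorem 11.2.5 has a simple discrete form: for integer values, the sum of $|f(x)|$ equals the sum over levels $y \ge 1$ of the number of $x$ with $|f(x)| \ge y$. The following sketch only illustrates that identity; the function name and sample data are hypothetical.

```python
def layer_cake_sum(values):
    """Sum |v| by counting, for each level y >= 1, how many values have |v| >= y."""
    top = max((abs(v) for v in values), default=0)
    return sum(
        sum(1 for v in values if abs(v) >= y)  # size of the level set S(y)
        for y in range(1, top + 1)
    )


sample = [3, -1, 0, 4, 2]
direct_sum = sum(abs(v) for v in sample)
print(layer_cake_sum(sample), direct_sum)
```

In the proof that follows, the level sets are the blocks of families with between $L$ and $2L - 1$ solutions, so counting families at each dyadic level replaces counting solutions directly.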
Proof By Lemma 11.2.3, the solutions fall into families. There are $O(H^4)$ solutions belonging to the families that contain at most 768 solutions. Families with more than 768 solutions are divided into blocks according to the range $L$ to $2L - 1$ into which the number of solutions falls, where $L = 2^l$ is some power of 2. We note that there are at most $24H^4$ trivial solutions in which $g_1,\dots,g_4$ form a permutation of $h_1,\dots,h_4$. If (after renumbering) we have $g_4 = h_4$ and $g_3 = h_3$, then the solution must be trivial, since
$$g_1^2 + g_2^2 = h_1^2 + h_2^2, \qquad g_1 + g_2 = h_1 + h_2,$$
and so $g_1g_2 = h_1h_2$, and
$$(h_1 - g_1)(h_1 - g_2) = h_1^2 - (h_1 + h_2)h_1 + h_1h_2 = 0.$$
Then either $h_1 = g_1$ (whence $h_2 = g_2$), or $h_1 = g_2$ (whence $h_2 = g_1$). If one solution in a family is trivial, then all solutions are trivial. Hence we can refer to trivial or non-trivial families.
We estimate the number of non-trivial families with at least $L$ solutions of (11.1.3). We can normalize the family (renumbering if necessary) so that $g_i \ge 0$, $h_i \ge 0$, and $h_4 = 0$. By Lemma 11.2.4 with $r = 4$,
$$g_1g_2g_3g_4 \ll \eta H^4 \tag{11.2.6}$$
with $\eta \ll \delta H/L$. Moreover, if one of the $g_i$ is zero, then after renumbering we can suppose that $g_4$ is zero, and then Lemma 11.2.4 with $r = 3$ gives
$$g_1g_2g_3 \ll \eta H^3. \tag{11.2.7}$$
By Lemma 11.1.3, the number of sets of positive integers $g_1,\dots,g_4$ satisfying (11.2.6) is
$$O\Big(\sum_{n \ll \eta H^4} d_4(n)\Big) = O(\eta H^4\log^3 H).$$
The number of sets of positive integers $g_1$, $g_2$, and $g_3$ satisfying (11.2.7) is
$$O\Big(\sum_{n \ll \eta H^3} d_3(n)\Big) = O(\eta H^3\log^2 H).$$
There are also $O(H^2)$ triples satisfying (11.2.7) for which one of $g_1$, $g_2$, or $g_3$ is zero. When $g_1,\dots,g_4$ have been chosen, then
$$h_1^2 + h_2^2 + h_3^2 = g_1^2 + g_2^2 + g_3^2 + g_4^2, \qquad h_1 + h_2 + h_3 = g_1 + g_2 + g_3 + g_4.$$
Let
$$F = F(g_1,\dots,g_4) = \tfrac{3}{2}(g_1^2 + \dots + g_4^2) - \tfrac{1}{2}(g_1 + \dots + g_4)^2 = \sum_{i} g_i^2 - \sum_{i<j} g_ig_j. \tag{11.2.8}$$
By Lemma 11.1.2, if $F(g_1,\dots,g_4)$ is zero, then there is one choice for $h_1,\dots,h_4$. If $F(g_1,\dots,g_4) > 0$, then the number of solutions for $h_1$, $h_2$, and $h_3$ is
$$\le 6\sum_{d\mid F}\chi(d) \le 6d(F) \ll H^{\epsilon/2}.$$
If $F$ is negative, then there is no solution. The number of families containing at least $L$ solutions is thus
$$O(H^{\epsilon/2}(H^2 + \eta H^4)) = O(H^{2+\epsilon/2} + \delta H^{5+\epsilon/2}/L).$$
Summing $L$ through powers of 2, we find that the number of such non-trivial families is
$$O\Big(\sum_{L = 2^l \ll H} L\Big(H^{2+\epsilon/2} + \frac{\delta H^{5+\epsilon/2}}{L}\Big)\Big) = O(H^{3+\epsilon/2} + \delta H^{5+\epsilon/2}\log H).$$
We get the result on adding the number of trivial solutions, and using $H^{\epsilon/2}\log H \ll H^{\epsilon}$.
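The counting step for $h_1, h_2, h_3$ in the proof above can be tested by brute force on small inputs. The sketch below is an illustration under stated assumptions: it takes $F$ to be $\tfrac{3}{2}\sum g_i^2 - \tfrac{1}{2}(\sum g_i)^2$ as in (11.2.8), counts all integer triples (not only non-negative ones, which can only increase the count), and compares with the weaker bound $6d(F)$ rather than the character sum. All function names are hypothetical.

```python
from itertools import product
from math import isqrt


def F(g):
    """F of (11.2.8): (3/2) * sum(g_i^2) - (1/2) * (sum g_i)^2, always an integer."""
    s1 = sum(g)
    s2 = sum(x * x for x in g)
    return (3 * s2 - s1 * s1) // 2  # 3*s2 - s1^2 is even, so this is exact


def F_alternative(g):
    """Second form in (11.2.8): sum g_i^2 minus the sum of g_i * g_j over i < j."""
    n = len(g)
    cross = sum(g[i] * g[j] for i in range(n) for j in range(i + 1, n))
    return sum(x * x for x in g) - cross


def count_triples(g):
    """Count integer (h1, h2, h3) with the same sum and sum of squares as g."""
    a = sum(g)
    b = sum(x * x for x in g)
    bound = isqrt(b)
    count = 0
    for h1 in range(-bound, bound + 1):
        for h2 in range(-bound, bound + 1):
            h3 = a - h1 - h2
            if h1 * h1 + h2 * h2 + h3 * h3 == b:
                count += 1
    return count


def num_divisors(n):
    return sum(1 for d in range(1, n + 1) if n % d == 0)


def check(max_g=4):
    """Verify the three cases F < 0, F = 0, F > 0 for all g in [0, max_g]^4."""
    for g in product(range(max_g + 1), repeat=4):
        f = F(g)
        if f != F_alternative(g):
            return False
        c = count_triples(g)
        if f < 0 and c != 0:
            return False
        if f == 0 and c > 1:
            return False
        if f > 0 and c > 6 * num_divisors(f):
            return False
    return True


print(check())
```

The bound reflects the substitution $x = 3h_1 - A$, $y = 3h_2 - A$ with $A = g_1 + \dots + g_4$, which turns the two conditions into $x^2 + xy + y^2 = 3F$.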
Finally we give a limitation result, which shows that the last estimate is not too far from best possible.
Lemma 11.2.6 (Limitation result) We have
$$N_{2r}(\delta,\Delta) \ge \max\Big(r!\,(H - r)^r,\ \frac{\delta\Delta H^{2r-3}}{3(1 + \delta)(1 + \Delta)r^4}\Big).$$
Proof We can choose $h_1,\dots,h_r$ to be distinct in
$$H!/(H - r)! > (H - r)^r$$
ways, and then have $g_1,\dots,g_r$ a permutation of $h_1,\dots,h_r$ in $r!$ ways; these are diagonal solutions. For the second bound, we divide four-dimensional space into boxes $B_i$, and we let $n_i$ be the number of vectors $x(h)$ (in the notation of Lemma 7.4.2) in each box. Two vectors (distinct or not) falling into the same box give a solution of (11.1.1)-(11.1.5). Let $I$ be the number of boxes. Then Cauchy's inequality gives
$$\Big(\sum n_i\Big)^2 \le I\sum n_i^2$$
with
$$\sum n_i = H^r, \qquad \sum n_i^2 \le N_{2r}(\delta,\Delta).$$
The boxes have length 1 in the first and third coordinates, $\delta H^{3/2}$ in the second coordinate, and $\Delta H^{1/2}$ in the fourth coordinate. Since the $s$th coordinate runs through a range of length $(2^{s/2} - 1)rH^{s/2}$, we have
$$I \le (3rH^2 + 1)(rH + 1)\Big(\frac{r(\sqrt{8} - 1)H^{3/2}}{\delta H^{3/2}} + 1\Big)\Big(\frac{r(\sqrt{2} - 1)H^{1/2}}{\Delta H^{1/2}} + 1\Big) < \frac{3(\sqrt{8} - 1)(\sqrt{2} - 1)r^4H^3(1 + \delta)(1 + \Delta)}{\delta\Delta},$$
which gives the result.
11.3 A COMPARISON ARGUMENT
We want to compare the numbers $N_{10}(\delta,\Delta)$ for different $\delta$ and $\Delta$. First we need to bound the number of families in which each $g_i$ is close to the corresponding $h_i$.
Lemma 11.3.1 (Model families) Let $P_s(x_1,\dots,x_r)$ denote the elementary symmetric function of degree $s$, the sum of all products of $s$ different variables $x_i$ chosen from $x_1,\dots,x_r$. For a given family of solutions, define the model family with respect to $g_5$ as the family of solutions (not necessarily real) containing $\kappa_1,\dots,\kappa_4,g_5,h_1,\dots,h_5$, where $\kappa_1,\dots,\kappa_4$ are the unique complex numbers satisfying
$$P_r(\kappa_1,\dots,\kappa_4,g_5) = P_r(h_1,\dots,h_5) \tag{11.3.1}$$
for $r = 1,\dots,4$. The closeness of the model is defined as $\min|\kappa_i - g_j|$ for $i, j = 1,\dots,4$. The model families with respect to the other $g_i$ and $h_i$ are defined similarly. Then, for $d > 0$, the number of families containing solutions of (11.1.1), (11.1.3), and (11.1.5) for which $g_i \ne h_j$ for any $i$ and $j$, and
$$|P_r(g_1,\dots,g_5) - P_r(h_1,\dots,h_5)| \le \delta H^r \tag{11.3.2}$$
for $r = 3, 4, 5$, with some model family of closeness at most $d$, is of order
$$O(\delta dH^{5+\epsilon}) \tag{11.3.3}$$
for any $\epsilon > 0$, with constant depending on $\epsilon$.
Proof Equation (11.1.1) implies (11.3.2) with $r = 1$, and (11.1.1) and (11.1.3) together imply (11.3.2) with $r = 2$, in both cases with left-hand side zero. Since
$$(x - a_1)\cdots(x - a_5) = x^5 + \sum_{r=1}^{5}(-1)^r x^{5-r}P_r(a_1,\dots,a_5),$$
then we have
$$|(x - h_1)\cdots(x - h_5) - (x - g_1)\cdots(x - g_5)| \le 7\delta H^5$$
for $H \le x < 2H$. In particular
$$|(g_i - h_1)\cdots(g_i - h_5)| \le 7\delta H^5, \qquad |(h_j - g_1)\cdots(h_j - g_5)| \le 7\delta H^5,$$
for each $i$ and $j$. Since we are counting families, not individual solutions, then we can take $g_5$ to be fixed. The number of possibilities for $h_1,\dots,h_5$ is
$$\sum_{n \le 7\delta H^5} d_5(n) \ll \delta H^5\log^4 H \ll \delta H^{5+\epsilon/2}$$
by Lemma 11.1.3. The numbers $\kappa_1,\dots,\kappa_4$ in the model family are now determined. If $g_4$ is to be within a distance $d$ of one of the numbers $\kappa_1,\dots,\kappa_4$, then there are at most $4(2d + 1)$ choices for $g_4$. The conditions (11.1.1) and (11.1.3) fix $g_1 + g_2 + g_3$ and $g_1^2 + g_2^2 + g_3^2$. By Lemma 11.1.1, the number of possibilities for $g_1$, $g_2$, and $g_3$ is at most $6d(n)$ for some integer $n = O(H^2)$, and so at most $O(H^{\epsilon/2})$ by Lemma 11.1.2. This model accounts for $O(\delta dH^{5+\epsilon})$ families, and the other models differ only in the numbering of $g_i$ and $h_i$. □
Lemma 11.3.2 (Close families) The number of families with $g_i \ne h_j$ for any $i$ and $j$, satisfying (11.3.2) of Lemma 11.3.1 for $r = 3$, 4, and 5 with closeness greater than $d$ is
$$O(\delta^2H^{8+\epsilon}/d^2).$$
Irrespective of closeness, the total number of families satisfying these conditions is
$$O(\delta^{4/3}H^{6+\epsilon}).$$
The implied constants depend on $\epsilon$.
Proof The elementary symmetric functions in five variables satisfy
$$P_r(\kappa_1,\dots,\kappa_4,g_5) = P_r(h_1,\dots,h_5) = P_r(g_1,\dots,g_5) + O(\delta H^r)$$
for $r = 1,\dots,4$ (with exact equality for $r = 1$ and 2), so the symmetric polynomials in four variables satisfy
$$P_r(\kappa_1,\dots,\kappa_4) = P_r(g_1,\dots,g_4) + O(\delta H^r).$$
As in Lemma 11.3.1 we deduce that
$$(g_1 - \kappa_1)\cdots(g_1 - \kappa_4) \ll \delta H^4,$$
so that
$$d \le |g_1 - \kappa_j| \ll \delta^{1/4}H$$
for some $j$, and that
$$(\kappa_j - g_1)\cdots(\kappa_j - g_4) \ll \delta H^4,$$
so that
$$(\kappa_j - g_2)(\kappa_j - g_3)(\kappa_j - g_4) \ll \delta H^4/d.$$
For some $i = 2$, 3, or 4,
$$\kappa_j - g_i \ll (\delta H^4/d)^{1/3}.$$
We deduce that
$$g_1 - g_i \ll \delta^{1/4}H + (\delta H^4/d)^{1/3} \ll (\delta H^4/d)^{1/3} = D,$$
say. We renumber if necessary so that $g_1 - g_2 = O(D)$. Similarly $g_3 - g_i = O(D)$ for some $i = 1$, 2, or 4. If $i = 1$ or 2, then $g_1$, $g_2$, and $g_3$ are all within $O(D)$ of one another, and then $g_4$ is within $O(D)$ of some $g_i$ with $i = 1$, 2, or 3, and all four are within $O(D)$ of one another. In both cases $g_3 - g_4 = O(D)$.
Next we consider the model family with respect to $g_1$. We see that $g_5 - g_i = O(D)$ for some $i$, and either all five are within $O(D)$ of one another, or $g_5$ is within $O(D)$ of one of the pairs $g_1$, $g_2$ or $g_3$, $g_4$. We renumber if necessary so that $g_2 - g_5 = O(D)$. Finally we consider the model family with respect to $g_3$. We have $g_1$, $g_2$, and $g_5$ all within $O(D)$ of one another, and now $g_4$ must be within $O(D)$ of one of these three. Thus, all the $g_i$ must be within $O(D)$ of one another. By symmetry, all the $h_i$ are within $O(D)$ of one another, and $g_1,\dots,g_5$ have the same average as $h_1,\dots,h_5$, so that all ten integers lie within some interval of length $O(D)$.
We can take $g_5$ to be arbitrary, pick $h_1,\dots,h_5$ and $g_4$ in $O(D^6)$ ways, and then $g_1 + g_2 + g_3$ and $g_1^2 + g_2^2 + g_3^2$ are known. As in the previous lemma, there are $O(H^{\epsilon})$ choices for $g_1$, $g_2$, and $g_3$. This gives the first result of the lemma. This bound may be refined: after picking $h_1,\dots,h_5$ in $O(D^5)$ ways, we know the numbers $\kappa_1,\dots,\kappa_4$ of the model family with respect to $g_5$, so that $g_4$ lies in one of four intervals whose length can be bounded above. The second result of the lemma follows on combining the first result with (11.3.3) of the previous lemma, and choosing
$$d = [\delta^{1/3}H].$$
□
Our next lemma involves a technical condition which will be removed in the next chapter.
Lemma 11.3.3 (Comparison of families) Suppose that
$$4/H^2 < \delta \le 1/8H, \qquad 1/H \le \Delta < \delta H.$$
Let $N'_{10}(\delta,\Delta)$ be the number of solutions counted in $N_{10}(\delta,\Delta)$ for which
max(gl,...,gs,h1,...,h5) <min(g1,...,g5,h1,...,h5)+H/2. Let T be a real number with
2
N10(6, A) << 7 N10(ST 2,2 AT) + N100, B 6T), where B is a constant independent of H and T. The term N10 (6, B ST) may be replaced by
$$O(HN_8(\delta, B\delta T) + D^{4/3}T^{1/3}H^{7+\epsilon})$$
for any $\epsilon > 0$, where
$$D = \min(\delta T, \Delta).$$
The constant implied in the order-of-magnitude symbol depends on $\epsilon$.
Proof Let $g_1,\dots,g_5,h_1,\dots,h_5$ be a set of ten integers which contributes to $N'_{10}(\delta,\Delta)$. We write
$$D_s(t) = \sum_{i=1}^{5}(h_i + t)^s - \sum_{i=1}^{5}(g_i + t)^s. \tag{11.3.4}$$
Then $D_1(t)$ and $D_2(t)$ are identically zero, and
$$|D_{3/2}(0)| \le \delta H^{3/2}, \qquad |D_{1/2}(0)| \le \Delta H^{1/2}. \tag{11.3.5}$$
We consider only values of t for which H
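and H- 1. For some $\theta$, $\varphi$ in $0 < \theta, \varphi < 1$, the mean value theorem gives
$$D_{3/2}(t) = D_{3/2}(0) + \tfrac{3}{2}\,tD_{1/2}(0) + \tfrac{3}{8}\,t^2D_{-1/2}(\theta t),$$
$$D_{1/2}(t) = D_{1/2}(0) + \tfrac{t}{2}\,D_{-1/2}(\varphi t).$$
For $t$ in $I$ we have
$$|D_{-1/2}(t)| < 5/\sqrt{H}.$$
A little calculation shows that if $|t| \le T/2$, then
$$|D_{3/2}(t)| \le \delta T^2H^{3/2}, \tag{11.3.6}$$
$$|D_{1/2}(t)| \le 2\Delta TH^{1/2}. \tag{11.3.7}$$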
The inequalities (11.3.6) and (11.3.7) are weaker than those in the definition of $N'_{10}(\delta,\Delta)$, and we try to show that they have more integer solutions. The real line can be divided into a finite number of subintervals on which $D_{3/2}(t)$ and $D_{1/2}(t)$ are both monotone. If none of these intervals contains more than one integer solution with $t$ in $I$, then this family contributes a bounded number of solutions to $N'_{10}(\delta,\Delta)$, but at least $T/2$ solutions to $N_{10}(\delta T^2, 2\Delta T)$.
The difficult case is when a subinterval of $t$ on which $D_{3/2}(t)$ and $D_{1/2}(t)$ are monotone contains $l + 1 \ge 2$ integer solutions with $t$ in $I$. We consider the subinterval for which $l$ is maximal. The mean value theorem tells us that if $f(t)$ is a twice differentiable function with $|f(t)| \le E$ on an interval $(a, a + l)$, then there are points $\beta$ in the interval $(a, a + l/3)$ and $\gamma$ in the interval $(a + 2l/3, a + l)$ with
$$\tfrac{l}{3}\,|f'(\beta)| \le 2E, \qquad \tfrac{l}{3}\,|f'(\gamma)| \le 2E.$$
Repeating the process, we have
$$(\gamma - \beta)|f''(\xi)| \le 12E/l, \qquad |f''(\xi)| \le 36E/l^2,$$
for some $\xi$ between $\beta$ and $\gamma$. We use this with $f(t) = D_{3/2}(t)$, $E = \delta H^{3/2}$, so that
$$f'(t) = \tfrac{3}{2}D_{1/2}(t), \qquad f''(t) = \tfrac{3}{4}D_{-1/2}(t).$$
We can now rerun the argument of Lemma 11.3.1 with r = 5 to deduce that such symmetric function of /j,..., gs is equal to the same function of hl ,..., h5 up to a factor 1 + OOH 2/12).
In particular SH2 I D_s/2(l; A 5 CS
H-S/2.
(11.3.8)
12
The constant C., is difficult to compute. However, the numbers D_s/2 satisfy an eleven-term linear recurrence relation in s. Since linear recurrences grow (at worst) exponentially, we have <
CS
for some positive A, uniformly in S. The Taylor expansions of D3/2(6 + t) and Dl/2(6 + t) converge for
Iti 5 H/4A2, and
SltiH3/2
D3/2(6+t) «SH312+ SH3/2
Dl/2(e+t) <<
I
1
+
+
St2H3/2 12
(11.3.9)
5ItIH3/2 (11.3.10)
I2
If (11.3.11)
Iti 5 AlT
with A sufficiently small, then (11.3.6) holds with argument 6 + t in place of t, whilst (11.3.7) also holds at 6 + t provided that (11.3.12)
Al >_ SH.
When (11.3.12) is false, then by the mean value theorem there is some -1 in the interval with
D1/2(,q) «0vrH-,
D_ 1/2(,q) << pVrH /1,
(11.3.13)
and similarly, for s Z 1, D-,/2(77) s CS
OH
H-s/2
a result sharper than (11.3.8). We find that (11.3.11) again implies that (11.3.6) holds with argument 77+t in place of t, and that (11.3.10) may be replaced by D1/2(v1 + t) << 0VH + O1t1VH /11
and (11.3.7) holds with 77 + t in place of t provided that A is sufficiently small in (11.3.11). If (11.3.14) I
then at least half the values of t satisfying (11.3.11) lie in the interval I, and this family contributes 0(1) solutions to N10(S, 0), but some number >> IT of solutions to N10(ST2, 20T). The comparison fails if (11.3.14) is false. In this case (11.3.10) gives
D1/2(t) << STU for I tl <_ 1, so the family contributes a number >> I of solutions to N10(S, BST)
for some absolute constant $B$. We note that $B\delta T < B/4H$, so $B\delta T$ is smaller than $\Delta$ unless both $\delta$ and $\Delta$ are close to $1/H$ in order of magnitude. Considering in turn each family that contributes to $N_{10}(\delta,\Delta)$, we get the first result of the lemma.
We can also estimate directly the contribution to $N_{10}(\delta,\Delta)$ from families for which (11.3.13) is false. If some $g_i = h_j$, then, after renumbering, $g_5 = h_5$, and $g_1,\dots,g_4,h_1,\dots,h_4$ gives a solution counted in $N_8(\delta, B\delta T)$. These families contribute $O(HN_8(\delta, B\delta T))$ solutions; in this argument these families are trivial families. We apply Lemma 11.3.2 to non-trivial families. Consider the families for which $L \le l < 2L$, where $L$ is a power of 2 with $L > H/2T$. The elementary symmetric functions of $g_1,\dots,g_5$ and of $h_1,\dots,h_5$ satisfy (11.3.2) with $\delta$ taking some value
$$\delta_1 = O(\delta H^2/L^2).$$
Similarly, when (11.3.13) holds for $t$ in the interval $I$, then we can deduce bounds for the symmetric polynomials with $\delta = \delta_2$, where
$$\delta_2 = O(\Delta H/L).$$
By Lemma 11.3.2, these families contribute $O(L\delta_0^{4/3}H^{6+\epsilon})$ to $N_{10}(\delta,\Delta)$, where $\delta_0 = \min(\delta_1, \delta_2)$, which sums over $L$ to
$$O(\min(\delta^{4/3}T^{5/3}H^{7+\epsilon},\ \Delta^{4/3}T^{1/3}H^{7+\epsilon})).$$
□
12 The Hardy-Littlewood method
12.1 INTEGRALS THAT COUNT
We treat the First Spacing Problem by exponential sums, following Huxley and Kolesnik (1991). Let
$$S(\alpha,\beta,u,v) = \sum_{h=H}^{2H-1} e\Big(\alpha h^2 + \beta h + u\Big(\frac{h}{H}\Big)^{3/2} + v\Big(\frac{h}{H}\Big)^{1/2}\Big). \tag{12.1.1}$$
By coincidence detection (Lemma 5.6.5) in four dimensions, the counting number $N_{2r}(\delta,\Delta)$ of the previous chapter satisfies
$$N_{2r}(\delta,\Delta) \asymp \delta\Delta\int_{-1/2}^{1/2}\int_{-1/2}^{1/2}\int_{-1/2\delta}^{1/2\delta}\int_{-1/2\Delta}^{1/2\Delta}|S(\alpha,\beta,u,v)|^{2r}\,dv\,du\,d\beta\,d\alpha. \tag{12.1.2}$$
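The principle behind (12.1.2), that a moment of the exponential sum counts solutions, can be seen exactly in the simplest case $u = v = 0$. The sketch below is an illustration under assumptions: it takes $e(x) = \exp(2\pi i x)$, uses $r = 2$ and a tiny $H$, and replaces the integrals over $\alpha$ and $\beta$ by averages over grids of step $1/P$ and $1/Q$, which is exact for a trigonometric polynomial once $P$ and $Q$ exceed the largest possible differences. The function names and the grid sizes are hypothetical.

```python
import cmath
from itertools import product


def e(x):
    """The additive character e(x) = exp(2*pi*i*x)."""
    return cmath.exp(2j * cmath.pi * x)


def S(alpha, beta, u, v, H):
    """The sum (12.1.1) over H <= h <= 2H - 1."""
    return sum(
        e(alpha * h * h + beta * h + u * (h / H) ** 1.5 + v * (h / H) ** 0.5)
        for h in range(H, 2 * H)
    )


def fourth_moment(H, P, Q):
    """Average of |S(alpha, beta, 0, 0)|^4 over alpha = j/P, beta = k/Q."""
    total = 0.0
    for j in range(P):
        for k in range(Q):
            total += abs(S(j / P, k / Q, 0.0, 0.0, H)) ** 4
    return total / (P * Q)


def brute_force(H):
    """Count g1 + g2 = h1 + h2 and g1^2 + g2^2 = h1^2 + h2^2 with entries in [H, 2H)."""
    rng = range(H, 2 * H)
    return sum(
        1
        for g1, g2, h1, h2 in product(rng, repeat=4)
        if g1 + g2 == h1 + h2 and g1 * g1 + g2 * g2 == h1 * h1 + h2 * h2
    )


# For H = 3 the differences of sums lie in [-4, 4] and of squares in [-32, 32],
# so Q = 5 and P = 33 make the grid averages exact.
print(round(fourth_moment(3, 33, 5)), brute_force(3))
```

The variables $u$ and $v$ play the same role for the two inequality conditions, except that an integral over a finite range of $u$ or $v$ detects approximate rather than exact coincidence, which is why (12.1.2) is an order-of-magnitude statement.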
The Hardy-Littlewood method is simplest when there is only one condition defining the counting function on the left, and only the integration over a on the right. It has a divide-and-conquer structure.
D: Divide the range of integration into short intervals (Farey arcs), each labelled by a rational number $a/q$ approximating the variable of integration, $\alpha$.
A: Approximate; there are two cases. If the rational $a/q$ has small height, then the sum $S$ can be approximated by a simpler function. These arcs are the major arcs. If the rational $a/q$ has large height, then the approximation used on the major arcs disappears into the noise, but the modulus of $S$ cannot be too large. These arcs are the minor arcs.
C: Combine; again there are two cases. The approximate values of the integrals on the major arcs are carefully added; there may be cancellation between the contributions of different major arcs with the same denominator $q$ (this will not happen in (12.1.2), because the integrand is a modulus squared, real and positive). On the minor arcs we have an upper bound for $|S|$, so that we can estimate trivially. If possible, it is better to take some factors $|S|$ out at their maximum, and to leave $|S|^{2t}$, for some small integer $t$, inside the integral sign. The integral of $|S|^{2t}$ over the minor arcs cannot be greater than the integral of $|S|^{2t}$ over the whole range $-\tfrac{1}{2}$ to $\tfrac{1}{2}$, and this is the counting function for some simpler problem, which may have an elementary treatment.
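The Divide step can be made concrete with Dirichlet's approximation theorem: for every real $\alpha$ and integer $Q \ge 1$ there is a fraction $a/q$ with $q \le Q$ and $|\alpha - a/q| \le 1/qQ$. The sketch below finds such a fraction by direct search and labels the arc by the size of $q$. It is only an illustration: the function names and the threshold `q_major` are hypothetical, and a real application would choose the threshold as in (12.2.2).

```python
from math import pi, sqrt


def dirichlet_approximation(alpha, Q):
    """Return (a, q) with 1 <= q <= Q minimizing |q*alpha - a| (direct search)."""
    best_err, best_a, best_q = None, None, None
    for q in range(1, Q + 1):
        a = round(alpha * q)
        err = abs(alpha * q - a)
        if best_err is None or err < best_err:
            best_err, best_a, best_q = err, a, q
    return best_a, best_q


def classify_arc(alpha, Q, q_major):
    """Label alpha as lying on a major arc (small q) or a minor arc (large q)."""
    a, q = dirichlet_approximation(alpha, Q)
    return ("major" if q <= q_major else "minor"), a, q


for alpha in (0.5000001, sqrt(2) - 1, pi - 3):
    print(alpha, classify_arc(alpha, Q=50, q_major=5))
```

A point close to a fraction of small denominator, such as 0.5000001, lands on a major arc, while a badly approximable number such as $\sqrt{2} - 1$ is only well approximated with a large denominator and lands on a minor arc.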
The Hardy-Littlewood method first appeared in Hardy and Ramanujan's work (1918) on the partition function, where they use an integral like that in Section 10.2 of Chapter 10 to find the Fourier coefficients of a modular form with poles at the rational points a/q on the real axis. The contour could not cross the line of poles, which formed a natural boundary, but the part of the line of integration close to a rational number of small height still depended on the residue at the pole. Hardy and Littlewood explored the method in a long series of papers, for example (1923). Vinogradov made it his own. For an account by a modern tactician see Vaughan (1981). First we use (12.1.2) to remedy a technical defect in Lemma 11.3.3.
Lemma 12.1.1 (Restricted ranges) Let $N'_{2r}(\delta,\Delta)$ denote the number of $2r$-tuples of integers $g_1,\dots,g_r,h_1,\dots,h_r$ counted in $N_{2r}(\delta,\Delta)$ for which $M - m \le (H - 1)/2$, where
$$M = \max(g_1,\dots,g_r,h_1,\dots,h_r), \qquad m = \min(g_1,\dots,g_r,h_1,\dots,h_r).$$
Then
$$N_{2r}(\delta,\Delta) \ll N'_{2r}(\delta,\Delta).$$
Proof We write $S(\alpha,\beta,u,v)$ as $S_1 + S_2$, where $S_1$ is the sum in which $h$ runs from $H$ to $H + [(H - 1)/2]$, $S_2$ is the sum in which $h$ runs from $H + [(H + 1)/2]$ to $2H - 1$. Then
$$|S(\alpha,\beta,u,v)|^{2r} \le 2^{2r-1}\big(|S_1(\alpha,\beta,u,v)|^{2r} + |S_2(\alpha,\beta,u,v)|^{2r}\big).$$
Hence the integral (12.1.2) is bounded in terms of the corresponding integrals with $S$ replaced by $S_1$ or $S_2$. Each of these is bounded by a counting number like $N_{2r}(\delta,\Delta)$, but with the numbers $g_i$ and $h_i$ lying in only half of the original interval $[H, 2H)$. All solutions of these types are counted in $N'_{2r}(\delta,\Delta)$.
□
We can now replace $N'_{10}(\delta,\Delta)$ by $N_{10}(\delta,\Delta)$ in the conclusion of Lemma 11.3.3, which now says that a bound for $N_{10}(\delta,\Delta)$ for larger values of $\delta$ and $\Delta$ implies a weaker bound for $N_{10}(\delta,\Delta)$ for smaller values of $\delta$ and $\Delta$. The condition involving $\Delta$ is always satisfied if $\Delta \ge 5$. Thus we have
$$N_{10}(\delta,\Delta) \le \lim_{\Delta\to\infty} N_{10}(\delta,\Delta) \asymp \delta\int_{-1/2}^{1/2}\int_{-1/2}^{1/2}\int_{-1/2\delta}^{1/2\delta}|S(\alpha,\beta,u,0)|^{10}\,du\,d\beta\,d\alpha. \tag{12.1.3}$$
We use (12.1.3) for $\delta > 1/2H$. The integral over $u$ in (12.1.3) corresponds to
an inequality. For inequalities we often have a very simple form of the Hardy-Littlewood method, in which 0/1 gives a major arc, and all other fractions give minor arcs. We only want an upper bound for N10(6, A), not an asymptotic formula. We can use Lemma 5.6.5 in the reverse direction.
Lemma 12.1.2 (The major arc in the u-integration) We have
f f f 1/2
1/22
1/2
15
/2
1/
15/2
fl/2
fl/z
5
f IS(a,(3,u,v)I10dvdud/3da 5
f1s/z
1/2 -1/2 -15/2
IS(a, /3, u, 0)110 du d f3 d a x H7
Proof By (12.1.2) these integrals are the same order of magnitude as $N_{10}(15/2, 5) = N_{10}(\infty,\infty)$: in other words, no inequality conditions. By the limitation result (Lemma 11.2.6), the order of magnitude is at least $H^7$. Since $|S(\alpha,\beta,u,v)|$ cannot exceed the length of the sum, these integrals are at most $H^2$ times the corresponding eighth-power integrals, which have the order of magnitude of $N_8(\infty,\infty)$, which is $H^5$ by Lemmas 11.2.4 and 11.2.6. □
On the minor arcs we have to bound an exponential sum. Let $f(x)$ be the exponent in (12.1.1):
$$f(x) = \alpha x^2 + \beta x + u(x/H)^{3/2} + v(x/H)^{1/2}, \tag{12.1.4}$$
so that
$$f'(x) = 2\alpha x + \beta + \frac{3ux^{1/2}}{2H^{3/2}} + \frac{v}{2H^{1/2}x^{1/2}},$$
$$f''(x) = 2\alpha + \frac{3u}{4H^{3/2}x^{1/2}} - \frac{v}{4H^{1/2}x^{3/2}},$$
$$f^{(3)}(x) = \frac{3(vH - ux)}{8H^{3/2}x^{5/2}}.$$
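Lemma 12.1.3 (Minor arcs in the u-integration) Let $T = |u|$. Then
$$|S(\alpha,\beta,u,0)|^2 \ll \begin{cases} H^2/\sqrt{T} & \text{for } 1 \le T \le H,\\ H^{3/2} & \text{for } H < T \le H^{3/2},\\ HT^{1/3} & \text{for } H^{3/2} < T \le H^3/8.\end{cases}$$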
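Proof We use the differencing step (Lemma 5.6.2) to compare $S(\alpha,\beta,u,0)$ with the sums
$$\sum_{h=H+d}^{2H-1-d} e(f(h + d) - f(h - d)).$$
We have for some $\theta$ in $-1 < \theta < 1$ and $H \le x < 2H$,
$$\frac{d^2}{dx^2}\big(f(x + d) - f(x - d)\big) = 2d\,f^{(3)}(x + \theta d) \asymp \frac{dT}{H^3}.$$
By Corollary 1 to Lemma 5.4.3 (the truncated Poisson summation formula), we have
$$\sum_{h} e(f(h + d) - f(h - d)) \ll \sqrt{\frac{dT}{H}} + H\sqrt{\frac{H}{dT}}.$$
Lemma 5.6.2 gives
$$|S(\alpha,\beta,u,0)|^2 \ll \frac{H^2}{D} + \frac{H}{D}\sum_{d=1}^{D-1}\Big(\sqrt{\frac{dT}{H}} + H\sqrt{\frac{H}{dT}}\Big) \ll \frac{H^2}{D} + \sqrt{DHT} + H^2\sqrt{\frac{H}{DT}}.$$
The bounds follow on choosing $D$ to be $[H/2]$, $[H^2/2T]$, or $[H/2T^{1/3}]$ in the three cases. □
Theorem 12.1.4 For $\delta \ge 1/2H^{3/2}$ and any $\Delta$ we have
$$N_{10}(\delta,\Delta) \ll \delta H^{7+\epsilon}.$$
The implied constant depends on $\epsilon$.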
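The choice of the differencing length $D$ at the end of the proof of Lemma 12.1.3 can be checked numerically. The sketch below is an illustration under assumptions: it takes the three-term bound $H^2/D + \sqrt{DHT} + H^2\sqrt{H/DT}$ as the quantity to be balanced, ignores integer parts (so $D$ is a real number), and compares the result with the three cases claimed in the lemma. The function names are hypothetical.

```python
from math import sqrt


def differenced_bound(H, T, D):
    """Three-term upper bound for |S|^2 after differencing with length D."""
    return H * H / D + sqrt(D * H * T) + H * H * sqrt(H / (D * T))


def choose_D(H, T):
    """The three choices of D from the proof, without integer parts."""
    if T <= H:
        return H / 2
    if T <= H ** 1.5:
        return H * H / (2 * T)
    return H / (2 * T ** (1 / 3))


def claimed_bound(H, T):
    """The three cases in the statement of Lemma 12.1.3."""
    if T <= H:
        return H * H / sqrt(T)
    if T <= H ** 1.5:
        return H ** 1.5
    return H * T ** (1 / 3)


H = 10 ** 4
for T in (1, 100, 10 ** 4, 10 ** 5, 5 * 10 ** 5, 10 ** 8, 10 ** 11):
    ratio = differenced_bound(H, T, choose_D(H, T)) / claimed_bound(H, T)
    print(T, round(ratio, 3))
```

In each range the ratio stays below a fixed constant, which is all that the order-of-magnitude symbol in the lemma asserts.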
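Proof First we note that, for $D \ge 1$, as in (12.1.3), we have
$$\frac{1}{D}\int_{-1/2}^{1/2}\int_{-1/2}^{1/2}\int_{-D/2}^{D/2}|S(\alpha,\beta,u,0)|^8\,du\,d\beta\,d\alpha \ll H^4 + H^{5+\epsilon}/D$$
by Theorem 11.2.5. For a function $w(u)$ which is a decreasing function of $|u|$, the Riesz interchange principle gives
$$\int_{-1/2}^{1/2}\int_{-1/2}^{1/2}\int_{-1/2\delta}^{1/2\delta} w(u)|S(\alpha,\beta,u,0)|^8\,du\,d\beta\,d\alpha = \int_{-1/2}^{1/2}\int_{-1/2}^{1/2}\int_{-1/2\delta}^{1/2\delta}\int_{t=0}^{w(u)}|S(\alpha,\beta,u,0)|^8\,dt\,du\,d\beta\,d\alpha$$
$$= \int_{t=0}^{w(0)}\int_{-1/2}^{1/2}\int_{-1/2}^{1/2}\int_{-D(t)/2}^{D(t)/2}|S(\alpha,\beta,u,0)|^8\,du\,d\beta\,d\alpha\,dt \ll \int_{t=0}^{w(0)}\big(D(t)H^4 + H^{5+\epsilon}\big)\,dt, \tag{12.1.5}$$
where $D(t)$ is the largest value of $u$ for which $w(u) \ge t$. We take $w(u)$ to be the upper bound for $|S(\alpha,\beta,u,0)|^2$ in Lemma 12.1.3, for $|u| \ge 1$, and $H^2$ for $|u| < 1$. Thus, for $\delta \ge 1/2H^{3/2}$,
$$D(t) = \begin{cases} H^4/t^2 & \text{for } H^2 \ge t > H^{3/2},\\ 1/2\delta & \text{for } t \le H^{3/2}.\end{cases}$$
The integral in (12.1.5) is
$$\ll \int_{0}^{H^{3/2}}\frac{H^4}{2\delta}\,dt + \int_{H^{3/2}}^{H^2}\frac{H^8}{t^2}\,dt + H^{7+\epsilon} \ll H^{7+\epsilon},$$
and we deduce the result from (12.1.3) on multiplying by $\delta$. □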
We have not quoted Lemma 12.1.2 explicitly in establishing Theorem 12.1.4; the ideas of Lemma 12.1.2 are built into the proof.
12.2 THE MINOR ARCS
For the full four-variable integral (12.1.2) we do the integrations over $\alpha$ and $\beta$ first, using the Hardy-Littlewood method proper. We write
$$t = |u| + |v|,$$
and suppose that $t \ge 1$. By Dirichlet's approximation theorem (Lemma 1.5.1)
there is a rational number a/q with
a=
q
+O
Q
q
(12.2.1)
Lemma 12.2.1 (Minor arcs estimate) For $1 \le t \le H^{3/2}$, and $\alpha$ in the range (12.2.1), we have
$$|S(\alpha,\beta,u,v)| \ll H^{1/2}t^{1/6}\log H + \frac{H\log H}{q^{1/2}t^{1/6}}.$$
If lul<-U,IvI H/(tU)1 /3,
(12.2.2)
then
$$|S(\alpha,\beta,u,v)| \ll H^{1/2}U^{1/6}\log H.$$
Proof We use the differencing step (Lemma 5.6.2) to compare $S(\alpha,\beta,u,v)$ with the sums
$$\sum_{h} e(f(h + d) - f(h - d)). \tag{12.2.3}$$
We write
$$f(x + d) - f(x - d) = \frac{4adx}{q} + g(x),$$
where, for some $\theta$, $\varphi$ between $-1$ and 1,
$$g'(x) = f'(x + d) - f'(x - d) - \frac{4ad}{q} = 2d\,f''(x + \theta d) - \frac{4ad}{q}, \tag{12.2.4}$$
$$g''(x) = f''(x + d) - f''(x - d) = 2d\,f^{(3)}(x + \varphi d) = \frac{3d\,(vH - u(x + \varphi d))}{4H^{3/2}(x + \varphi d)^{5/2}}.$$
The order of magnitude of $g''(x)$ may be variable as $x$ goes from $H$ to $2H$. Hence we must divide the interval $H$ to $2H$ for $h$ into subintervals $J$ on which
$$|vH - ux| \asymp \rho Ht$$
for some $\rho = 2^{-r}$. We note that either $\rho \asymp 1$, or $|vH| \asymp |ux|$ and $u$ and $v$ have the same sign. In the second case $|u| \asymp |v|$. In both cases the interval $J$ has length between bounded multiples of $\rho H$. To avoid an infinite dissection, we put together the intervals (if any) on which $\rho \ll t^{1/3}/H$, and estimate the sum over these intervals as $O(t^{1/3})$, which is an allowable term.
We apply the differencing step separately on each interval $J$, so the conditions of summation in (12.2.3) are that the two numbers $h \pm d$ should both belong to $J$. The change in $g'(x)$ on $J$ is
$$O(\rho H\max|g''(x)|) = O\Big(\frac{\rho^2dt}{H^2}\Big).$$
The range of $d$ is 1 to $D$, where $D$ will be chosen with
$$\rho^2Dt \ll H^2, \tag{12.2.5}$$
and $D$ so small that $g'(x)$ changes by at most 1 on the interval $J$. We use the truncated Poisson summation formula (Lemma 5.4.3) with $B - A \le 1$, so the error term is bounded, and the sum over $r \equiv 4ad \pmod q$ has at most one term. In fact in (12.2.4)
$$g'(x) = 2d\,f''(x + \theta d) - \frac{4ad}{q} \ll \frac{dt}{H^2} + \frac{d}{qQ}.$$
The truncated Poisson summation formula gives
$$\sum_{h \pm d \in J} e(f(h + d) - f(h - d)) = \int e\Big(g(x) - \frac{rx}{q}\Big)\,dx + O(1), \tag{12.2.6}$$
where r is the unique integer (if any) with
r
- + y=g'()
r= 4ad(modq),
q
for some y with 1y1 < 1, and some in J. We call d a good value if r is non-zero mod q and rl
Ig'(x)-q
2
q
for x in J. Other values of d are called bad; for these ad
p2dT
q
H2
d +-. qQ
(12.2.7)
For good values of $d$ we use the First Derivative Test (Lemma 5.1.2). The integrals in (12.2.6) sum to
$$O\Big(\Big(1 + \frac{D}{q}\Big)\sum_{b=1}^{q-1}\frac{1}{\|b/q\|}\Big) = O((q + D)\log q). \tag{12.2.8}$$
For bad values of d we use the Second Derivative Test (Lemma 5.1.3) or the trivial estimate. The integral in (12.2.6) is Is
(H3
O(min( pH, 1/ min jg ((x)j )) = O min pH,
I pdT )
Whether d is bad depends on its size as well as on congruences mod q. If d lies in a range K 5 d < 2K, where K is a power of 2, then there are at most 2K/q multiples of q in the range, and
0(
K
gp2Kt
H2 + Q )
non-zero residue classes which can contain bad values of d. There are
0
gp2Kt K fl H2 +Q +q J
K
bad values of d in the range K 5 d < 2K, contributing
O (K+q)
+q(1+Q)
Ht
3K Pt
)
As K runs through powers of 2 not exceeding D, then this expression sums to the corresponding expression with K replaced by D.
When we add in the term pH corresponding to d = 0, and the terms (12.2.8) from good values of d, then the differencing step (Lemma 5.6.2) gives
e(f(h))IZ<<
D 3
X pH+(q+D)logq+(q+D)
i
Ht) +qI1+Q) T
J
(12.2.9)
1
We choose
$$D \asymp \rho H/t^{1/3},$$
so that
$$q + D \ll Q.$$
We can take $D \ge 1$, since the case $\rho \ll t^{1/3}/H$ has already been dealt with trivially. To satisfy (12.2.5) for all $\rho$, we require
$$t \ll H^{3/2}.$$
The right-hand side of (12.2.9) is
$$\ll t^{1/3}\Big(\rho H + Q\log Q + \rho Qt^{1/3} + \frac{H^2}{qt^{2/3}}\Big).$$
The lemma follows on taking the square root and summing over $O(\log H)$ values of $\rho$.
To simplify the notation we replace the limits of integration $|u| \le 1/2\delta$, $|v| \le 1/2\Delta$ in (12.1.2) by $|u| \le U$, $|v| \le V$, and we consider the integral
$$I = \int_{-1/2}^{1/2}\int_{-1/2}^{1/2}\int_{-U}^{U}\int_{-V}^{V}|S(\alpha,\beta,u,v)|^{10}\,dv\,du\,d\beta\,d\alpha. \tag{12.2.10}$$
We take (12.2.2) as the definition of minor arcs in the integration over $\alpha$ when $u$ and $v$ are fixed.
Lemma 12.2.2 (The minor arcs contribution) The contribution to the integral $I$ in (12.2.10) from the region with $\alpha$ on a minor arc and $t = |u| + |v| \ge 1$ is
$$\ll U^{4/3}VH^5\Big(1 + \frac{H^{1+\epsilon}}{U}\Big)\log^2 H.$$
Proof We have
$$J = \int_{-1/2}^{1/2}\int_{-1/2}^{1/2}\int_{-U}^{U}\int_{-V}^{V}|S(\alpha,\beta,u,v)|^8\,dv\,du\,d\beta\,d\alpha \ll UVN_8\Big(\frac{1}{2U},\frac{1}{2V}\Big) \ll UV\Big(1 + \frac{H^{1+\epsilon}}{U}\Big)H^4,$$
by (12.1.2) and Theorem 11.2.5. We take out two factors $|S|$ at their maximum from $I$, using Lemma 12.2.1, and then we have an eighth-power integral over a range which is a subset of the range for $J$: the standard minor arcs trick in the Hardy-Littlewood method.
12.3 THE MAJOR ARCS
We consider regions which are minor arcs in the integration over $u$, but major arcs in the integration over $\alpha$, with an approximation
$$\alpha = a/q + \lambda, \qquad \lambda \ll t^{1/3}/qH, \tag{12.3.1}$$
where $t = |u| + |v| \ge 1$,
$$q \le H/(tU)^{1/3}. \tag{12.3.2}$$
We put
$$\beta = (b + \kappa)/q, \qquad |\kappa| \le \tfrac{1}{2}. \tag{12.3.3}$$
We suppose that U >> H, since the case U << H corresponds to the case of a small, treated in Theorem 12.1.4.
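We write
$$f(x) = \frac{ax^2 + bx}{q} + g(x),$$
so that
$$g(x) = \lambda x^2 + \frac{\kappa x}{q} + u\Big(\frac{x}{H}\Big)^{3/2} + v\Big(\frac{x}{H}\Big)^{1/2},$$
$$g'(x) = 2\lambda x + \frac{\kappa}{q} + \frac{3ux^{1/2}}{2H^{3/2}} + \frac{v}{2H^{1/2}x^{1/2}},$$
$$g''(x) = 2\lambda + \frac{3u}{4H^{3/2}x^{1/2}} - \frac{v}{4H^{1/2}x^{3/2}}, \tag{12.3.4}$$
$$g^{(3)}(x) = \frac{3(vH - ux)}{8H^{3/2}x^{5/2}},$$
$$g^{(4)}(x) = \frac{3(3ux - 5vH)}{16H^{3/2}x^{7/2}}.$$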
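The chain of derivatives in (12.3.4) can be verified numerically by central differences. The sketch below is only a check of the formulas, writing `lam` and `kappa` for the small offsets in (12.3.1) and (12.3.3); the parameter values, the step size, and the function names are arbitrary illustrative choices.

```python
from math import isclose

# Arbitrary sample parameters (illustrative only).
PARAMS = dict(H=100.0, lam=1e-3, kappa=0.3, q=7.0, u=5.0, v=-3.0)


def g0(x, H, lam, kappa, q, u, v):
    return lam * x ** 2 + kappa * x / q + u * (x / H) ** 1.5 + v * (x / H) ** 0.5


def g1(x, H, lam, kappa, q, u, v):
    return 2 * lam * x + kappa / q + 3 * u * x ** 0.5 / (2 * H ** 1.5) + v / (2 * H ** 0.5 * x ** 0.5)


def g2(x, H, lam, kappa, q, u, v):
    return 2 * lam + 3 * u / (4 * H ** 1.5 * x ** 0.5) - v / (4 * H ** 0.5 * x ** 1.5)


def g3(x, H, lam, kappa, q, u, v):
    return 3 * (v * H - u * x) / (8 * H ** 1.5 * x ** 2.5)


def g4(x, H, lam, kappa, q, u, v):
    return 3 * (3 * u * x - 5 * v * H) / (16 * H ** 1.5 * x ** 3.5)


def central_difference(func, x, step=1e-3, **params):
    """Symmetric difference quotient approximating the derivative of func at x."""
    return (func(x + step, **params) - func(x - step, **params)) / (2 * step)


def derivatives_consistent(x=150.0, rel_tol=1e-6):
    """Check that each formula is the derivative of the previous one at x."""
    chain = [g0, g1, g2, g3, g4]
    return all(
        isclose(central_difference(lower, x, **PARAMS), higher(x, **PARAMS), rel_tol=rel_tol)
        for lower, higher in zip(chain, chain[1:])
    )


print(derivatives_consistent())
```

Note that the third derivative vanishes where $vH = ux$, which is the stationary point of $g''(x)$ used in the proof of Lemma 12.3.1.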
Lemma 12.3.1 (Poisson summation on the major arcs) On a major arc with t z 1, either (12.3.5) IS(a, f3, u, v)I «H1/2U1/6 log H, or 1
IS(a,
max
u, v) <<
rx
2H
I JH e(g(x) - 4) dx ,
(12.3.6)
with
u <
(12.3.7)
and
«
t
+
1 HgU1/3
1 V << H2 + H1/2gU1/2
(12.3.8)
HZ
Proof This is another 'divide and conquer' argument. We note that $g''(x)$ has one stationary point for $x > 0$, so that we can divide the range $H \le h < 2H$ into at most three intervals, on each of which $g'(x)$ is monotone. We subdivide further into intervals $J$ on which
$$\sigma K \le |g''(x)| \le 2\sigma K,$$
where $\sigma = 2^{-s}$ for some integer $s$, and
$$K = 2|\lambda| + \frac{t}{H^2} \ll \frac{tU^{1/3}}{qH}$$
by (12.3.2). As in Lemma 12.2.1, we must show that the length of $J$ tends to zero as $\sigma$ tends to zero. Since $x \asymp H$ and $t \ge 1$, we must have either $|g^{(3)}(x)| \gg 1/H^4$ or $|g^{(4)}(x)| \gg 1/H^5$ at each point of $J$. Suppose that $J$ has length $L$. In both cases we have $|g^{(3)}(x)| \gg L/H^5$, $|g''(x)| \gg L^2/H^5$ somewhere on $J$. The length of $J$ is $O(\sqrt{\sigma KH^5})$,
which tends to zero as promised. We estimate trivially on all intervals whose length is O(H1/2U1/6). This leaves O(log H) long intervals to consider. We apply Poisson summation (Lemma 5.4.6) to get ah2 + bh e
+g(h)
q
G(a, b - r; q) qA-rSgB+-',
f e(g(x) - rx) dx !
q
log(gB-qA+2)), with
qB+qA << gHsuplg"(x)I «vgHK, and by the Second Derivative Test (Lemma 5.1.3)
f e(g(x) - rx) dx <<
1
(o-K)
The error term is certainly O(HU log H), which is smaller than the right-hand side of (12.3.5). There are now two cases. Case 1
If
1, then the sum over r contributes QgHK
(vqK)
«H (K)
Intervals J for which this case holds give (12.3.5).
If aqHK is less than some sufficiently small constant, then there is at most one value of r in the sum, contributing Case 2
<<
VF
(
Either this expression is O(H1i2U1/6) or, for some bounded B, o-K < B/HgU1"3. This is the approximation step of the divide-and-conquer method.
(12.3.9)
We now combine information from the different intervals J. Since g" (x) has one maximum for x > 0, there are at most two intervals of x on which g" (x) S B/HgU'I3
Each interval corresponds to at most one value of r. For these values of r we
may extend the Fourier integral to the full range H< h < 2H, since this changes the integral by at most O(H'/ZU'/6 log H). Also, comparing (12.3.9) with the explicit expression for g" (x), we see that 1
t
(12.3.10)
<<
H2 + Hq U'/3 Since J has length at least H'/2U'/6, by the mean value theorem there is a point x in J with g(3)(x) << o K/H'/2U'/6. For this value of x we have u=
Hv
X
+ O(H3 Ig(3'(x)I) =
Hv
X
` +0 ( o.KH512 U1/6
JI
and
v 4h'/2X3/2 +O
g"(x)=2A+
UKH'/2 U1/6
so that
2A=
V
- 4H'/2X3/2 +O(
o-KH'/2 U1/6
In particular, by (12.3.9) Hv
u= X +0
(
H3/2 qU1/2
V
2A= - 4H'/2X3/2 + 0
H1/2gU1/2
which gives the bounds (12.3.7) and (12.3.8).
Lemma 12.3.2 (The major arcs integral) The regions which are minor arcs in the u-integration, but major arcs in the a-integration, contribute H' + )b0g2H << H'V'/4 + U
to the integral I in (12.2.10).
Proof The parts of the major arc on which the uniform bound (12.3.5) holds give the second term in the lemma, just as on the minor arcs. The remaining part corresponds to the centre of the major arc in a one-variable application of the Hardy-Littlewood method. Here it is twisted by the coordinates u and v. For a, b, and q fixed we must estimate
fflf qs
l g(x) If,e 2
r1
q1
to
dxl du du qK d1t,
over the region where (12.3.6) holds, which is a subset of the region defined by
V
A <
1
K<<1,
+ HqU1/2 '
(12.3.11)
H3/2
u<< V+
(12.3.12)
IvkSV.
qU1121
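The integer $r$ is determined by $a$, $b$, and $q$, so it is constant in the integration. We make a change of variable $x = Hy^2$, so that
$$g(x) - \frac{rx}{q} = \lambda H^2y^4 + uy^3 + \frac{(\kappa - r)}{q}Hy^2 + vy = G(\lambda H^2, u, (\kappa - r)H/q, v, y),$$
where, for a vector $\mathbf{v} = (A, u, B, v)$,
$$G(\mathbf{v}, y) = Ay^4 + uy^3 + By^2 + vy.$$
Then
$$\int_{H}^{2H} e\Big(g(x) - \frac{rx}{q}\Big)\,dx = 2H\,J(\lambda H^2, u, (\kappa - r)H/q, v),$$
where $J(\mathbf{v})$ is the integral
$$J(\mathbf{v}) = \int_{1}^{\sqrt{2}} e(Ay^4 + uy^3 + By^2 + vy)\,y\,dy.$$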
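The change of variable $x = Hy^2$ can be checked numerically, including the factor $2H$ coming from $dx = 2Hy\,dy$. The sketch below is an illustration under assumptions: $e(x) = \exp(2\pi i x)$, composite Simpson quadrature, and arbitrary small parameter values chosen so that the integrands oscillate slowly. The names `lam`, `kappa`, `simpson`, and `both_sides` are hypothetical.

```python
import cmath
from math import pi, sqrt


def e(x):
    """The additive character e(x) = exp(2*pi*i*x)."""
    return cmath.exp(2j * pi * x)


def simpson(func, a, b, n=4000):
    """Composite Simpson rule with n (even) subintervals, complex-valued integrand."""
    h = (b - a) / n
    total = func(a) + func(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(a + k * h)
    return total * h / 3


def both_sides(H, lam, kappa, q, r, u, v):
    """Return the x-integral over [H, 2H] and 2H times the y-integral over [1, sqrt 2]."""

    def phase_x(x):
        return lam * x * x + kappa * x / q + u * (x / H) ** 1.5 + v * (x / H) ** 0.5 - r * x / q

    A = lam * H * H
    B = (kappa - r) * H / q

    def phase_y(y):
        return A * y ** 4 + u * y ** 3 + B * y * y + v * y

    left = simpson(lambda x: e(phase_x(x)), H, 2 * H)
    right = 2 * H * simpson(lambda y: e(phase_y(y)) * y, 1.0, sqrt(2.0))
    return left, right


left, right = both_sides(H=10.0, lam=0.002, kappa=0.3, q=3, r=1, u=1.5, v=-0.7)
print(abs(left - right))
```

The substitution removes the fractional powers, so that the phase becomes the quartic polynomial $G(\mathbf{v}, y)$, whose derivatives are then controlled in (12.3.13) to (12.3.16).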
For positive L, let M(L, y) be the set of vectors v corresponding to K, A, u,v satisfying (12.3.11) and (12.3.12) for which
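$$|G'(\mathbf{v}, y)| = |4Ay^3 + 3uy^2 + 2By + v| \le L, \tag{12.3.13}$$
$$|G''(\mathbf{v}, y)| = |12Ay^2 + 6uy + 2B| \le L^2, \tag{12.3.14}$$
$$|G^{(3)}(\mathbf{v}, y)| = |24Ay + 6u| \le L^3, \tag{12.3.15}$$
$$|G^{(4)}(\mathbf{v}, y)| = 24|A| \le L^4, \tag{12.3.16}$$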
and let M(L) be the union of the sets M(L, y) for 1- Lk for some k in 15 k:5 4. We apply the kth derivative test (Lemma 5.1.4) to get J(v) << (1/Lk)1jk << 1/L.
If v lies in M(L), but not in M(L/2), then
1J(V)1
1
L
unless G(v, y) has more than one root in 1 - E
The major arcs
267
Let Vol M(L) denote the four-dimensional volume of the set M(L). Then M(L) corresponds to a set of points $(a/q + \lambda, (b + \kappa)/q, u, v)$ of volume $(\mathrm{Vol}\,M(L))/H^3$ on which $|S(\alpha, \beta, u, v)| \gg H/(L\sqrt{q})$; here a, b, and q are fixed. By the Riesz interchange principle, these parts of the major arc contribute
\[ \sum_L \frac{H^{10}}{L^{10}q^5}\,\frac{\mathrm{Vol}\,M(L)}{H^3}, \tag{12.3.17} \]
where L runs through powers of 2 in the range 1
T
>rH_
<< L << U1/6
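corresponding to $H^{1/2}U^{1/6} \ll |S(\alpha, \beta, u, v)| \ll H$. If L is less than one, then (12.3.13)-(12.3.16) give
\[ |u| \ll L^3, \qquad |v| \ll L. \]
Since we have $t = |u| + |v| > 1$, we conclude that L is bounded below when the set M(L) is non-empty. Since $G(\mathbf{v}, y)$ is a quartic polynomial, $G(\mathbf{v}, y + t)$ agrees with $G(R(t)\mathbf{v}, y)$ apart from the constant term, where R(t) is the linear map with
\[ R(t)\begin{pmatrix} A \\ u \\ B \\ v \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 4t & 1 & 0 & 0 \\ 6t^2 & 3t & 1 & 0 \\ 4t^3 & 3t^2 & 2t & 1 \end{pmatrix}\begin{pmatrix} A \\ u \\ B \\ v \end{pmatrix}. \]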
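The shift identity behind R(t) can be checked with exact rational arithmetic. The following is a minimal sketch, not part of the original text; the helper names `G`, `R`, `apply` and `shift_defect` are illustrative choices, and the sample vector is arbitrary.

```python
from fractions import Fraction

def G(vec, y):
    """Quartic G(v, y) = A y^4 + u y^3 + B y^2 + v y (no constant term)."""
    A, u, B, v = vec
    return A * y**4 + u * y**3 + B * y**2 + v * y

def R(t):
    """Rows of the shift matrix R(t), acting on column vectors (A, u, B, v)."""
    return [
        [1, 0, 0, 0],
        [4 * t, 1, 0, 0],
        [6 * t**2, 3 * t, 1, 0],
        [4 * t**3, 3 * t**2, 2 * t, 1],
    ]

def apply(matrix, vec):
    """Multiply a 4x4 matrix (list of rows) by a column vector."""
    return tuple(sum(m * x for m, x in zip(row, vec)) for row in matrix)

def shift_defect(vec, t, y):
    """G(v, y + t) - G(R(t) v, y); this should not depend on y."""
    return G(vec, y + t) - G(apply(R(t), vec), y)

vec = (Fraction(3), Fraction(-2), Fraction(5), Fraction(7))  # arbitrary sample
t = Fraction(1, 3)
defects = {shift_defect(vec, t, Fraction(n, 4)) for n in range(-8, 9)}
print(defects)  # one value only: the dropped constant term G(vec, t)
```

The set collapses to a single element because the only term that R(t) discards is the constant term of the shifted quartic.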
The set M(L, 0) is the intersection of a box, sides $2L^4$, $2L^3$, $2L^2$, $2L$ about the origin, and the box P defined by (12.3.11) and (12.3.12). The set M(L, y) therefore consists of a parallelepiped, volume $16L^{10}$, intersected with the box P. If $\mathbf{v}$ lies in M(L, y) then $R(-y)\mathbf{v}$ lies in M(L, 0), and
\[ |A| \le L^4, \tag{12.3.18} \]
\[ |u - 4Ay| \le L^3, \tag{12.3.19} \]
\[ |B - 3uy + 6Ay^2| \le L^2, \tag{12.3.20} \]
\[ |v - 2By + 3uy^2 - 4Ay^3| \le L. \tag{12.3.21} \]
Eliminating u and B between (12.3.19), (12.3.20), and (12.3.21), we have
Iv-4Ay31
We deduce that $\mathrm{Vol}\,M(L, y) \ll L^6\min(L^4, L^3 + V)$.
(12.3.22)
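Next we need a finite set of M(L, y) to cover M(L). We choose a sequence of points $y_i$ distant 2/L apart. The rows of the vector
\[ D(y)\begin{pmatrix} A \\ u \\ B \\ v \end{pmatrix} = \begin{pmatrix} 4y^3 & 3y^2 & 2y & 1 \\ 12y^2 & 6y & 2 & 0 \\ 24y & 6 & 0 & 0 \\ 24 & 0 & 0 & 0 \end{pmatrix}\begin{pmatrix} A \\ u \\ B \\ v \end{pmatrix} \]
correspond to the inequalities (12.3.13)-(12.3.16). The rows of
\[ D(y + t)\begin{pmatrix} A \\ u \\ B \\ v \end{pmatrix} = D(y)R(t)\begin{pmatrix} A \\ u \\ B \\ v \end{pmatrix} \]
give the same expression with y replaced by y + t. We also have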
\[ D(y + t) = E(t)D(y) = \begin{pmatrix} 1 & t & t^2/2 & t^3/6 \\ 0 & 1 & t & t^2/2 \\ 0 & 0 & 1 & t \\ 0 & 0 & 0 & 1 \end{pmatrix}D(y). \]
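Both descriptions of D(y + t), as the Taylor shift E(t)D(y) and as the coefficient shift D(y)R(t), can be confirmed numerically. This is a small sketch under the matrix definitions quoted above; the function names are illustrative and the sample values are arbitrary.

```python
from fractions import Fraction

def D(y):
    """Rows give G', G'', G''', G'''' at y as linear forms in (A, u, B, v)."""
    return [
        [4 * y**3, 3 * y**2, 2 * y, 1],
        [12 * y**2, 6 * y, 2, 0],
        [24 * y, 6, 0, 0],
        [24, 0, 0, 0],
    ]

def R(t):
    """Coefficient shift: G(v, y + t) and G(R(t) v, y) differ by a constant."""
    return [
        [1, 0, 0, 0],
        [4 * t, 1, 0, 0],
        [6 * t**2, 3 * t, 1, 0],
        [4 * t**3, 3 * t**2, 2 * t, 1],
    ]

def E(t):
    """Taylor shift acting on the column of derivatives."""
    return [
        [1, t, t**2 / 2, t**3 / 6],
        [0, 1, t, t**2 / 2],
        [0, 0, 1, t],
        [0, 0, 0, 1],
    ]

def matmul(X, Y):
    """Product of two 4x4 matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def identities_hold(y, t):
    """Check D(y + t) == E(t) D(y) and D(y + t) == D(y) R(t) exactly."""
    target = D(y + t)
    return matmul(E(t), D(y)) == target and matmul(D(y), R(t)) == target

print(identities_hold(Fraction(3, 2), Fraction(-2, 5)))
```

Using `Fraction` inputs keeps the entries $t^2/2$ and $t^3/6$ exact, so the comparison is an equality test rather than a floating-point tolerance check.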
If $|t| \le 1/L$ and $\mathbf{v}$ is in M(L, y), then the entries of $D(y + t)\mathbf{v}$ have absolute size at most
\[ L^k\left(1 + 1 + \tfrac12 + \tfrac16 + \cdots\right) \le eL^k \]
(for the kth entry), where e stands for the base of natural logarithms. Hence M(L, y) is a subset of M(eL, y + t), and we can cover M(L) by the sets $M(eL, y_i)$. There are at most L points $y_i$, so by (12.3.22)
\[ \mathrm{Vol}\,M(L) \ll L^7\min(L^4, L^3 + V). \tag{12.3.23} \]
If L is small, then all points of M(L) satisfy the bounds (12.3.11) and (12.3.12). Also, by looking at (12.3.15), we see that a point $\mathbf{v}$ with $24|A| \ge L^4/2$ cannot lie in both $M(L, y_i)$ and $M(L, y_{i+3})$. For small L we have
\[ \mathrm{Vol}\,M(L) \asymp L^{11}. \]
Substituting (12.3.23) in (12.3.17) and performing the summation gives
\[ \ll \frac{H^7}{q^5}\sum_L \min\left(L, 1 + \frac{V}{L^3}\right) \ll \frac{H^7V^{1/4}}{q^5} \]
when we sum L through powers of 2. Summing this bound over a, b, and q gives the result of the lemma. We can now give a result for medium-sized values of $\delta$.
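Theorem 12.3.3 When $\delta$ and $\Delta$ satisfy
\[ \delta H^{1+\epsilon} \ll 1, \qquad \delta\Delta^{9/16}H^{3/2-\epsilon} \gg 1, \tag{12.3.24} \]
then
\[ N_{10}(\delta, \Delta) \ll \delta\Delta^{3/4}H^7. \tag{12.3.25} \]
When
\[ \delta\Delta^{9/16}H^{3/2-\epsilon} \ll 1, \qquad \Delta H^{8/9} \gg 1, \tag{12.3.26} \]
then
\[ N_{10}(\delta, \Delta) \ll \Delta^{3/16}H^{11/2+\epsilon}. \]
Here $\epsilon > 0$ is arbitrary, but the implied constants depend on $\epsilon$.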
Proof We fit Lemmas 12.1.2, 12.2.2, and 12.3.2 together to estimate the integral in (12.1.2). We have
0 H'
1 N10(2U«
U(Ql/4 +
U4/3
Hs+e(1+
0
N
For
H
$N_{10}(1/2U, \Delta) \ll \Delta^{3/4}H^7/U$. We pick $U = 1/2\delta$ if possible, and, if not, then we pick $U \asymp \Delta^{9/16}H^{3/2+\epsilon}$ if possible, and use
\[ N_{10}(\delta, \Delta) \le N_{10}(1/2U, \Delta) \]
for $\delta < 1/2U$. □
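We can also apply these methods to the twelfth-power integral.
Theorem 12.3.4 When $\delta$ and $\Delta$ satisfy
\[ \delta H^{1+\epsilon} \ll 1, \qquad \delta(\Delta H^{2-\epsilon})^{3/5} \gg 1, \]
then
\[ N_{12}(\delta, \Delta) \ll \delta\Delta H^9. \]
When
\[ \delta(\Delta H^{2-\epsilon})^{3/5} \ll 1, \qquad \Delta H^{1/3-\epsilon} \gg 1, \]
then
\[ N_{12}(\delta, \Delta) \ll \Delta^{2/5}H^{39/5+\epsilon}. \]
Here $\epsilon > 0$ is arbitrary, but the implied constants depend on $\epsilon$.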
Proof The analogue of Lemma 12.1.2 is the bound $O(H^9)$ for the major arc in the u-integration. The analogue of Lemma 12.2.2 is the bound $\ll U^{5/3}VH^{7+\epsilon}(1 + H/U)$ for the minor arcs, and the analogue of Lemma 12.3.2 is the estimate $\ll H^9 + U^{5/3}VH^{7+\epsilon}(1 + H/U)$ for the major arcs, since the sum over L in Lemma 12.3.2 converges with $L^{12}$ in the denominator. The choices of U are analogous to those in the previous theorem. □
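12.4 EXTRAPOLATION
We use Lemma 11.3.3 to obtain results weaker than Theorem 12.3.3 but valid over longer ranges.
Theorem 12.4.1 For $\epsilon > 0$ there is a constant $C(\epsilon) > 0$ such that
\[ N_{10}(\delta, \delta H) \ll \delta^{7/4}H^{31/4} \tag{12.4.1} \]
for
\[ \frac{C(\epsilon)}{H^{33/25-\epsilon}} \le \delta \le \frac{1}{C(\epsilon)H^{1+\epsilon}}, \tag{12.4.2} \]
and
\[ N_{10}(\delta, \delta H) \ll \delta^{28/41}H^{260/41+\epsilon} \tag{12.4.3} \]
for
\[ \frac{1}{H^{28/15}} \le \delta \le \frac{C(\epsilon)}{H^{33/25-\epsilon}}, \tag{12.4.4} \]
and
\[ N_{10}(\delta, \delta H) \ll \delta^{67/119}H^{104/17+\epsilon} + H^5 \tag{12.4.5} \]
for
\[ \frac{4}{H^2} \le \delta \le \frac{1}{H^{28/15}}. \tag{12.4.6} \]
Corollary In all these ranges (12.4.2), (12.4.4), and (12.4.6) we have
\[ N_{10}(\delta, \delta H) \ll \delta H^7. \]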
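The exponents in Theorem 12.4.1 can be sanity-checked by writing $\delta = H^{-c}$ and comparing powers of H. The sketch below is an illustration only: it ignores $\epsilon$ and all implied constants, and reads the exponents as $\delta^{7/4}H^{31/4}$, $\delta^{28/41}H^{260/41}$, $\delta^{67/119}H^{104/17}$ and $\delta H^7$, with breakpoints $\delta = H^{-1}$, $H^{-33/25}$, $H^{-28/15}$, $H^{-2}$ (these readings are assumptions about the garbled displays).

```python
from fractions import Fraction as Fr

# Each bound is delta^a * H^b; on the curve delta = H^(-c) it equals H^(b - a*c).
BOUNDS = {
    "12.4.1": (Fr(7, 4), Fr(31, 4)),
    "12.4.3": (Fr(28, 41), Fr(260, 41)),
    "12.4.5": (Fr(67, 119), Fr(104, 17)),
    "corollary": (Fr(1), Fr(7)),
}

def h_exponent(name, c):
    """Exponent of H in the named bound when delta = H^(-c)."""
    a, b = BOUNDS[name]
    return b - a * c

c1, c2, c3, c4 = Fr(1), Fr(33, 25), Fr(28, 15), Fr(2)

checks = {
    # Neighbouring bounds should agree where their ranges meet.
    "continuity at 33/25": h_exponent("12.4.1", c2) == h_exponent("12.4.3", c2),
    "continuity at 28/15": h_exponent("12.4.3", c3) == h_exponent("12.4.5", c3),
    # Exponents are linear in c, so endpoints suffice for the corollary.
    "corollary on (12.4.2)": all(
        h_exponent("12.4.1", c) <= h_exponent("corollary", c) for c in (c1, c2)),
    "corollary on (12.4.4)": all(
        h_exponent("12.4.3", c) <= h_exponent("corollary", c) for c in (c2, c3)),
    "corollary on (12.4.6)": all(
        h_exponent("12.4.5", c) <= h_exponent("corollary", c) for c in (c3, c4)),
}
for label, ok in checks.items():
    print(label, ok)
```

At $c = 2$ the term $H^5$ in (12.4.5) equals $\delta H^7$ exactly, which matches the remark that the corollary has no margin to spare at that end.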
Proof The result (12.4.1) is contained in Theorem 12.3.3. For smaller values of $\delta$ we use the extrapolation formula of Lemma 11.3.3 (with N10 replaced by N10 using Lemma 12.1.1):
\[ N_{10}(\delta, \delta H) \ll \frac{1}{T}N_{10}(\delta T^2, 2\delta HT) + HN_8(\delta, B\delta T) + \delta^{4/3}T^{5/3}H^{7+\epsilon}. \tag{12.4.7} \]
By Theorem 11.2.5
\[ HN_8(\delta, B\delta T) \ll H^5, \]
which is negligible compared to (12.4.3), and gives the second term in (12.4.5). In order to apply Theorem 12.3.3, we choose T so that
\[ \delta T^2H^{1+\epsilon} \ll 1. \tag{12.4.8} \]
The second condition in (12.3.24) requires
\[ \delta^{25}T^{41}H^{33-\epsilon} \gg 1. \tag{12.4.9} \]
First we choose T so that (12.4.9) is just satisfied; this choice easily satisfies (12.4.8). The first term in (12.4.7) is
\[ \ll \delta T(\delta HT)^{3/4}H^7 \]
by (12.3.25) of Theorem 12.3.3, which gives the expression (12.4.3). The third term in (12.4.7) is smaller in the range (12.4.4). In the range (12.4.6) we use (12.3.26) of Theorem 12.3.3:
\[ \frac{1}{T}N_{10}(\delta T^2, 2\delta HT) \ll \delta^{3/16}H^{91/16+\epsilon}/T^{13/16}, \]
and equalize the first and third terms with the choice
\[ T \asymp 1/(\delta^{55}H^{63})^{1/119} \]
to get the first term in (12.4.5). The conditions (12.3.26) of Theorem 12.3.3 require the reverse of (12.4.9), which holds in the range (12.4.6), and
\[ \delta H^{17/9}T \gg 1, \]
which is easily found to be satisfied. The corollary holds with a margin to spare except at $\delta \asymp 1/H^2$. □
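The two choices of T in this proof amount to linear algebra on exponents. As a hedged sketch (ignoring $\epsilon$ and constants, and treating each term as a monomial $\delta^aH^bT^c$, which is an assumption made for illustration), the following reproduces the exponents 28/41, 260/41, 67/119 and 104/17.

```python
from fractions import Fraction as Fr

def balance_T(first, third):
    """Given two monomials (a, b, c) meaning delta^a H^b T^c, return (x, y)
    such that T = delta^x H^y makes the two monomials equal."""
    (a1, b1, c1), (a3, b3, c3) = first, third
    # delta^a1 H^b1 T^c1 = delta^a3 H^b3 T^c3
    #   =>  T^(c1 - c3) = delta^(a3 - a1) H^(b3 - b1)
    return (a3 - a1) / (c1 - c3), (b3 - b1) / (c1 - c3)

def substitute(term, t_exps):
    """Insert T = delta^x H^y into delta^a H^b T^c; return (delta exp, H exp)."""
    a, b, c = term
    x, y = t_exps
    return a + c * x, b + c * y

# Range (12.4.6): first term from (12.3.26), third term of (12.4.7).
first = (Fr(3, 16), Fr(91, 16), Fr(-13, 16))
third = (Fr(4, 3), Fr(7), Fr(5, 3))
t_small = balance_T(first, third)          # expected (-55/119, -63/119)
bound_small = substitute(first, t_small)   # expected (67/119, 104/17)

# Range (12.4.4): T with delta^25 T^41 H^33 = 1, first term from (12.3.25),
# namely (1/T) * delta T^2 * (delta H T)^(3/4) * H^7 = delta^(7/4) H^(31/4) T^(7/4).
t_mid = (Fr(-25, 41), Fr(-33, 41))
bound_mid = substitute((Fr(7, 4), Fr(31, 4), Fr(7, 4)), t_mid)

print(t_small, bound_small, bound_mid)
```

The helper treats the balancing step symbolically in the exponents, so the same two functions cover both ranges of $\delta$.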
13 The First Spacing Problem for the double sum
13.1 FAMILIES OF SOLUTIONS
The coincidence conditions for two integer vectors in the treatment of the double exponential sum of Chapter 8 are
\[ k_1l_1 + k_2l_2 = k_3l_3 + k_4l_4, \tag{13.1.1} \]
\[ l_1\sqrt{k_1} + l_2\sqrt{k_2} = l_3\sqrt{k_3} + l_4\sqrt{k_4} + O(\eta L\sqrt{K}), \tag{13.1.2} \]
\[ l_1 + l_2 = l_3 + l_4, \tag{13.1.3} \]
\[ l_1/\sqrt{k_1} + l_2/\sqrt{k_2} = l_3/\sqrt{k_3} + l_4/\sqrt{k_4} + O(\eta L/\sqrt{K}), \tag{13.1.4} \]
where $\eta$ is small, with all variables within ranges
\[ K \le k_i < 2K, \qquad L \le l_i < 2L. \tag{13.1.5} \]
If we can write
\[ k_i = g_i + h_i, \quad l_1 = g_1 - h_1, \quad l_2 = g_2 - h_2, \quad l_3 = h_3 - g_3, \quad l_4 = h_4 - g_4, \]
then (13.1.1) and (13.1.3) become the corresponding equations in Chapter 11 with r = 4, so the ruled surface method can be applied. The inequalities (13.1.2) and (13.1.4) are different from Chapter 11, and the range (13.1.5) is more restricted. We shall not use condition (13.1.4). We follow Watt's account (1990).
Lemma 13.1.1 (Families of solutions) The integer solutions of (13.1.1) and (13.1.3) form families: lines parametrized by
\[ k_i = m_i + t \tag{13.1.6} \]
with $l_1$, $l_2$, and $l_3$ fixed, $l_4 = l_1 + l_2 - l_3$, $m_1$, $m_2$, $m_3$, and $m_4 = 0$ fixed, and
\[ l_1m_1 + l_2m_2 = l_3m_3. \tag{13.1.7} \]
The number of families which contain integer solutions in the range (13.1.5) is $O(K^2L^2 + L^3)$.
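A quick numerical illustration of the family structure: once $l_1m_1 + l_2m_2 = l_3m_3$ and $l_4 = l_1 + l_2 - l_3$, every shift t gives a solution of (13.1.1) and (13.1.3). The sketch below uses an arbitrary small example chosen for illustration; the function name `family_defects` is not from the text.

```python
def family_defects(l, m, t):
    """Return the defects of (13.1.1) and (13.1.3) for k_i = m_i + t.

    l = (l1, l2, l3), m = (m1, m2, m3); l4 and m4 = 0 are implied.
    """
    l1, l2, l3 = l
    l4 = l1 + l2 - l3
    k1, k2, k3, k4 = (mi + t for mi in (*m, 0))
    eq1 = k1 * l1 + k2 * l2 - k3 * l3 - k4 * l4   # (13.1.1)
    eq3 = l1 + l2 - l3 - l4                        # (13.1.3)
    return eq1, eq3

l = (3, 5, 4)
m = (1, 1, 2)            # satisfies 3*1 + 5*1 == 4*2, i.e. (13.1.7)
results = {family_defects(l, m, t) for t in range(0, 50)}
print(results)           # every shift t gives the same pair of defects
```

Changing `m` so that (13.1.7) fails makes the first defect non-zero, which is the sense in which (13.1.7) characterises a family.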
Proof We verify at once that (13.1.6) and (13.1.7) together give a solution for all t. There are at most $L^3$ families with $m_1 = m_2 = m_3 = 0$, the diagonal solutions. For other families we write
\[ \mathbf{l} = (l_1, l_2, l_3), \qquad \mathbf{m} = (m_1, m_2, m_3) \]
as three-dimensional vectors, and put $\lambda = |\mathbf{l}|$, $\mu = |\mathbf{m}|$. We write d and e for the highest common factors $d = (l_1, l_2, l_3)$, $e = (m_1, m_2, m_3)$. We call families primitive if d and e are both one.
Case 1 Families with $\mu/e < \lambda/d$. For fixed $\mathbf{m}$, the vectors $\mathbf{l}/d$ lie in the plane lattice in $\mathbb{R}^3$ of integer vectors orthogonal to $\mathbf{m}/e$. The determinant of this lattice, the area of its unit cell, is $\mu/e$, because the lattice in $\mathbb{R}^3$ generated by $\mathbf{m}/e$ and the integer vectors orthogonal to it has index $|\mathbf{m}/e|^2$. By counting squares (Lemma 2.1.1), the number of vectors $\mathbf{l}/d$ satisfying (13.1.5) is
\[ \ll \frac{(L/d)^2}{\mu/e} + \frac{L}{d} \ll \frac{eL^2}{d^2\mu}. \]
For each power of 2, M, there are $O((M/e)^3)$ triples $\mathbf{m}$ with highest common factor e fixed, and $M \le \mu < 2M$. These contribute
\[ \ll \frac{M^3}{e^3}\cdot\frac{eL^2}{d^2M} = \frac{L^2M^2}{d^2e^2} \]
families, and summing M through powers of 2 gives $O(K^2L^2/d^2e^2)$ families with fixed common factors d and e.
Case 2 Families with $\mu/e \ge \lambda/d$. These give the same bound $O(K^2L^2/d^2e^2)$. In both cases we sum over d and e to get $O(K^2L^2)$ non-diagonal families. □
Lemma 13.1.2 (Counting primitive families) The number of primitive families with $l_1$, $l_2$, and $l_3$ fixed, and the ratio $m_2/m_1$ in a fixed interval of length $\Delta$, and with $m_1$ in a range $M \le |m_1| < 2M$, is
\[ \ll d(l_3) + \Delta M^2\sigma(l_3)/l_3^2, \]
where $\sigma(n)$ denotes the sum of the divisors of the integer n. Similarly, the number of primitive families with $m_1$, $m_2$, and $m_3$ fixed, $m_3 \ne 0$, and $l_1/l_2$ in a fixed interval of length $\Delta$, with $L \le l_1 < 2L$, is
\[ \ll d(|m_3|) + \Delta L^2\sigma(|m_3|)/m_3^2. \]
Proof Since the highest common factor $d = (l_1, l_2, l_3)$ is one for primitive families, then we can write (non-uniquely)
\[ l_3 = rs, \qquad (l_1, r) = 1, \qquad (l_2, s) = 1. \]
Suppose that
\[ l_1m_1 + l_2m_2 = l_3m_3, \qquad l_1n_1 + l_2n_2 = l_3n_3. \tag{13.1.8} \]
Then modulo s we have
\[ m_1n_2l_2 - m_2n_1l_2 \equiv -m_1n_1l_1 + m_1n_1l_1 \equiv 0 \pmod{s}, \]
so that
\[ m_1n_2 - m_2n_1 \equiv 0 \pmod{s}. \]
Similarly we get a congruence mod r, so that
\[ m_1n_2 - m_2n_1 \equiv 0 \pmod{l_3}. \]
If the ratios $m_2/m_1$ and $n_2/n_1$ are different, then
\[ \left|\frac{m_2}{m_1} - \frac{n_2}{n_1}\right| = \frac{|m_1n_2 - m_2n_1|}{|m_1n_1|} \ge \frac{l_3}{|m_1n_1|}. \tag{13.1.9} \]
Now let f be the highest common factor $f = (m_1, m_2)$. Since $e = (m_1, m_2, m_3) = 1$, (13.1.8) gives $f \mid l_3$. So each primitive family is associated with a factor f of $l_3$. Suppose that $n_1$, $n_2$, and $n_3$ in (13.1.8) are the parameters of a family associated with the same factor $f = (n_1, n_2)$ of $l_3$. Then we can sharpen (13.1.9) to
\[ \left|\frac{m_2}{m_1} - \frac{n_2}{n_1}\right| \ge \frac{l_3/f}{|(m_1/f)(n_1/f)|} = \frac{fl_3}{|m_1n_1|} \ge \frac{fl_3}{4M^2}. \]
Since $m_2/m_1$ must lie in a short interval, then there are
\[ \ll 1 + \Delta M^2/fl_3 \tag{13.1.10} \]
possible rational numbers $m_2/m_1$ (this argument corresponds to Lemma 1.2.3 with an extra congruence condition). When f and the ratio $m_2/m_1$ are known, then we know $m_1$ and $m_2$, and thus also $m_3$ from (13.1.8). The first result follows on summing (13.1.10) over all factors f of $l_3$, and using
\[ \sum_{f \mid l_3}\frac{1}{f} = \frac{1}{l_3}\sum_{fg = l_3} g = \frac{\sigma(l_3)}{l_3}. \]
The second assertion follows by interchanging the roles of $l_1$, $l_2$, and $l_3$ with those of $m_1$, $m_2$, and $m_3$. □
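The divisor identity used in the last step, $\sum_{f \mid n} 1/f = \sigma(n)/n$, follows from pairing each divisor f with its complement g = n/f. A minimal sketch that checks it exactly for small n (the function names are illustrative):

```python
from fractions import Fraction

def divisors(n):
    """All positive divisors of n, by trial division (fine for small n)."""
    return [f for f in range(1, n + 1) if n % f == 0]

def reciprocal_divisor_sum(n):
    """Sum of 1/f over the divisors f of n, as an exact fraction."""
    return sum(Fraction(1, f) for f in divisors(n))

def sigma(n):
    """Sum of the divisors of n."""
    return sum(divisors(n))

def identity_holds(n):
    """Check sum_{f | n} 1/f == sigma(n)/n."""
    return reciprocal_divisor_sum(n) == Fraction(sigma(n), n)

print(all(identity_holds(n) for n in range(1, 200)))
```

Summing (13.1.10) over f then gives a count of divisors from the term 1 and a multiple of $\sigma(l_3)/l_3$ from the term $\Delta M^2/fl_3$.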
Lemma 13.1.3 (Solutions in one family) The number of integer solutions of (13.1.1), (13.1.2), (13.1.3), and (13.1.5) within a particular family is at most
\[ \min\left(1 + O\left(\frac{\eta K^4}{|m_1m_2m_3|}\right), K\right), \tag{13.1.11} \]
and also at most
\[ \min\left(1 + O\left(\frac{\eta K^4}{|m_1(m_1 - m_2)(m_1 - m_3)|}\right), K\right). \tag{13.1.12} \]
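Proof We use (13.1.3) to eliminate $l_4$ from (13.1.1) and (13.1.2), getting
\[ (k_1 - k_4)l_1 + (k_2 - k_4)l_2 = (k_3 - k_4)l_3, \tag{13.1.13} \]
\[ (\sqrt{k_1} - \sqrt{k_4})l_1 + (\sqrt{k_2} - \sqrt{k_4})l_2 = (\sqrt{k_3} - \sqrt{k_4})l_3 + O(\eta L\sqrt{K}). \tag{13.1.14} \]
Eliminating $l_3$ between (13.1.13) and (13.1.14), we find that
\[ (\sqrt{k_1} - \sqrt{k_3})(\sqrt{k_1} - \sqrt{k_4})l_1 + (\sqrt{k_2} - \sqrt{k_3})(\sqrt{k_2} - \sqrt{k_4})l_2 = \frac{m_1(m_1 - m_3)l_1}{(\sqrt{k_1} + \sqrt{k_4})(\sqrt{k_1} + \sqrt{k_3})} + \frac{m_2(m_2 - m_3)l_2}{(\sqrt{k_2} + \sqrt{k_4})(\sqrt{k_2} + \sqrt{k_3})} = O(\eta LK). \tag{13.1.15} \]
Thus
\[ \frac{l_1}{l_2} = \frac{m_2(m_3 - m_2)}{m_1(m_1 - m_3)}Q + O\left(\frac{\eta K^2}{|m_1(m_1 - m_3)|}\right), \tag{13.1.16} \]
where
\[ Q = \frac{(\sqrt{k_1} + \sqrt{k_4})(\sqrt{k_1} + \sqrt{k_3})}{(\sqrt{k_2} + \sqrt{k_4})(\sqrt{k_2} + \sqrt{k_3})} = 1 + O\left(\frac{|m_1 - m_2|}{K}\right) \tag{13.1.17} \]
can be regarded as a function of t, depending on the family, with
\[ \frac{d}{dt}\sqrt{k_i} = \frac{1}{2\sqrt{k_i}}, \qquad \frac{d}{dt}\log(\sqrt{k_i} + \sqrt{k_j}) = \frac{1}{2\sqrt{k_ik_j}}, \]
so that
\[ 2\frac{d}{dt}\log Q = \frac{1}{\sqrt{k_1k_4}} + \frac{1}{\sqrt{k_1k_3}} - \frac{1}{\sqrt{k_2k_4}} - \frac{1}{\sqrt{k_2k_3}} = \frac{(\sqrt{k_2} - \sqrt{k_1})(\sqrt{k_3} + \sqrt{k_4})}{\sqrt{k_1k_2k_3k_4}}, \]
which has absolute value $\gg |m_1 - m_2|/K^2$. Hence within a fixed family, the values of t for which (13.1.16) holds form an interval of length
\[ O\left(\frac{\eta K^4}{|m_1(m_1 - m_2)(m_1 - m_3)|}\right), \]
which gives the bound (13.1.12). Similarly considering the values of $l_4/l_2$ leads to the bound (13.1.11). □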
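The monotonicity of Q along a family rests on a closed form for the derivative of log Q. The sketch below compares that closed form with a central finite difference. It assumes $k_i = m_i + t$ with $m_4 = 0$ and reads Q as the ratio $(\sqrt{k_1}+\sqrt{k_4})(\sqrt{k_1}+\sqrt{k_3}) / ((\sqrt{k_2}+\sqrt{k_4})(\sqrt{k_2}+\sqrt{k_3}))$; the sample values and function names are illustrative.

```python
import math

def ks(m, t):
    """k_i = m_i + t for i = 1, 2, 3 and k_4 = t (that is, m_4 = 0)."""
    m1, m2, m3 = m
    return m1 + t, m2 + t, m3 + t, t

def log_Q(m, t):
    """log of Q, written as a signed sum of logarithms."""
    k1, k2, k3, k4 = ks(m, t)
    s = math.sqrt
    return (math.log(s(k1) + s(k4)) + math.log(s(k1) + s(k3))
            - math.log(s(k2) + s(k4)) - math.log(s(k2) + s(k3)))

def closed_form(m, t):
    """Claimed d/dt log Q = (sqrt k2 - sqrt k1)(sqrt k3 + sqrt k4) / (2 sqrt(k1 k2 k3 k4))."""
    k1, k2, k3, k4 = ks(m, t)
    s = math.sqrt
    return (s(k2) - s(k1)) * (s(k3) + s(k4)) / (2 * s(k1 * k2 * k3 * k4))

def central_difference(m, t, h=0.1):
    """Symmetric finite-difference approximation to d/dt log Q."""
    return (log_Q(m, t + h) - log_Q(m, t - h)) / (2 * h)

m = (40.0, 10.0, 25.0)
t = 1000.0
print(closed_form(m, t), central_difference(m, t))
```

The sign of the derivative is that of $\sqrt{k_2} - \sqrt{k_1}$, so Q is strictly monotone in t whenever $m_1 \ne m_2$, which is what turns the error term in (13.1.16) into an interval of admissible t.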
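Lemma 13.1.4 (Solutions of all four equations) Suppose that $k_1, \ldots, k_4$, $l_1, \ldots, l_4$ satisfy (13.1.1)-(13.1.5). Define $\theta$ and $\varphi$ by
\[ l_1\sqrt{k_1} + l_2\sqrt{k_2} - l_3\sqrt{k_3} - l_4\sqrt{k_4} = \theta L\sqrt{K}, \]
\[ l_1/\sqrt{k_1} + l_2/\sqrt{k_2} - l_3/\sqrt{k_3} - l_4/\sqrt{k_4} = \varphi L/\sqrt{K} \]
(so that $\theta$ and $\varphi$ are $O(\eta)$). Then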
max rjIki-kjl<
fl Iki - kjl xIpIK4, joi and there are at most OK
mink,1+0
cp(m1 - m2)
D
solutions in the same family as the given solution. If
101 > 2K I cp1,
then,
for each i,
11Iki-kj1
joi
101K3.
Proof We write $\alpha = \sqrt{k_1}$, $\beta = \sqrt{k_2}$, $\gamma = \sqrt{k_3}$, $\delta = \sqrt{k_4}$, so that the system of equations becomes
MT
2
-13/7
cpL
-14/S where M is the van der Monde matrix a2 /32
y2 S2
with determinant and adjoint matrix given by
$\Delta = \det M = (\alpha - \beta)(\alpha - \gamma)(\alpha - \delta)(\beta - \gamma)(\beta - \delta)(\gamma - \delta)$, adj M =
1r1
'r2
Tr3
T4
'r10'11
X2012
'T30'13
'T40'14
7r1 X21
'T2 022
'T3 023
74 0'24
1T1 X31
r2 0`32
'r 3 X23
'T4 X24
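where
\[ \sigma_{11} = \beta + \gamma + \delta, \qquad \sigma_{12} = \beta\gamma + \gamma\delta + \delta\beta, \qquad \sigma_{13} = \beta\gamma\delta \]
are the symmetric functions of $\beta$, $\gamma$, and $\delta$, and $\sigma_{2j}$, $\sigma_{3j}$, and $\sigma_{4j}$ are the corresponding symmetric functions of $\gamma$, $\delta$, and $\alpha$, of $\delta$, $\alpha$, and $\beta$, and of $\alpha$, $\beta$, and $\gamma$, respectively. The products are
\[ \pi_1 = \Delta/f'(\alpha), \qquad \pi_2 = \Delta/f'(\beta), \]
and so on, where f(x) is the polynomial
\[ f(x) = (x - \alpha)(x - \beta)(x - \gamma)(x - \delta) \]
whose roots are $\alpha$, $\beta$, $\gamma$, and $\delta$. After this preparation, we have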
0
11/a 12/a
0
= (adj M)T 9LfK 0
-13/y -14/6
cpLFK
so that 11
aLV f'(a)
12=
f(/3) ((a+y+6)8+ay6co),
13
yLFK
f'( SL
14
f'( S)
((a+/3+6)8+a/i6cp), ((a+/3+ y)9+a/3ycp),
We deduce that
kl + k2 )(kl + k3)( kl + k4) f'(a)
(k1 - k2)(k1 - k3)(k1 - k4) = (
<< K2a/3y6 I cpI << riK4.
If 191 < K I kp1/9, then
(/3+y+6)I01<s (2K)
KI 'I 9
< V3
and similarly for the other expressions. We deduce that 11
l2 = - (a-
=
y)(a- 6)
+0(I coKI))
(y2-/32)(/32-62)(a+y)(a+3) r 1+0 (a2-y2)(a2-62)(/3+y)(/3+6) (M3 2
(m1 -m3)m1
Q(I 1+O
cpKi)),
0
cpK
in the notation of Lemma 13.1.3. Thus, by the argument of Lemma 13.1.3, there are at most K2
0
1
(P
KIm1-m 21
solutions in this family. Similarly, if 191 >- 2K I q'l, then
IaysDls
( (2K) )3191
2K
Iwl,
and
If'(a)1x K3/2191.
13.2 GOOD AND BAD FAMILIES
We estimate the total number of solutions to (13.1.1), (13.1.2), and (13.1.3) in the range (13.1.5) by summing the bound of Lemma 13.1.3 over all families. However, as in the proof of Lemma 11.1.3, we must take care to do the longest summation first. We renumber $k_1, \ldots, k_4$, preserving the pairing of $k_1$ with $k_2$ and $k_3$ with $k_4$. If max $k_i$ and min $k_i$ are in the same pair, then we make them $\{k_3, k_4\}$, and we number within each pair so that
\[ \min_{i=1,2}\ \min_{j=3,4}|k_i - k_j| = |k_2 - k_4|. \]
If max $k_i$ and min $k_i$ are in opposite pairs, then we make them $k_1$ and $k_4$ in either order, and number so that
\[ |k_2 - k_4| \le |k_1 - k_2|. \]
Writing $m_i = k_i - k_4$, $n_i = |m_i|$, we see that $m_1$, $m_2$, and $m_3$ have the same sign,
\[ n_1 \ge n_2, \qquad |n_1 - n_3| \ge n_2, \tag{13.2.1} \]
and that (13.1.13) becomes
\[ l_1n_1 + l_2n_2 = l_3n_3. \]
Since $L \le l_i < 2L$, we have
\[ l_1n_1 \le l_3n_3 \le (l_1 + l_2)n_1 = (l_3 + l_4)n_1, \qquad n_1/2 < n_3 \le 4n_1. \tag{13.2.2} \]
Families with $n_1 = n_3$ are now called trivial; by (13.2.1) this implies $n_2 = 0$. There are $O(KL^2)$ trivial families with $n_1 \ne 0$, and at most $L^3$ trivial families with $n_1 = 0$ (so that all the $k_i$ are equal). All non-trivial families have $n_1$ and $n_3$ non-zero by (13.2.2).
Now we let $A(K, L, \eta)$ be the total number of solutions of (13.1.1), (13.1.2), and (13.1.3) in the range (13.1.5), and $A'(K, L, \eta)$ be the number of non-trivial solutions with $k_1, \ldots, k_4$ ordered and e = 1, and $A^*(K, L, \eta)$ be the number of non-trivial ordered solutions with d = e = 1. Then clearly
L
A'(K,L,17) < >. A* K. d=1
Z,
1
d
For non-trivial solutions $e = (m_1, m_2, m_3)$ is defined. From the proof of Lemma 13.1.1, each value of e arises from $O(K^2L^2/e^2)$ families. Of these, the families with fewer than e solutions contribute
\[ \ll \sum_{e=1}^{K} e\,\frac{K^2L^2}{e^2} \ll K^2L^2\log K \]
solutions. If there are more than e solutions in the family, then by Lemma 13.1.3, the solutions form an interval of values of $k_4$, and so a proportion at least 1/2e of them have $e \mid k_4$, which implies $e \mid k_i$ for each i. There are eight ways of renumbering each ordered solution, so
(K A(K,L,71)
<< KL3 +K2L2 to g K+
1
eA*
d5L e5L
KL e
d
ll
(13.2.3) 1
Thus we need to consider $A^*(K, L, \eta)$, the number of ordered primitive solutions. As in Lemma 13.1.1 there will be two cases, depending on the relative sizes of K and L. Families for which the error term in (13.1.16) is always numerically less than a are called good. Since $1 < Q < 2$ and $|m_3| < 4|m_1|$, then
\[ |m_1 - m_3| \le 16|m_2| = 16n_2 \tag{13.2.4} \]
in good families. In particular, good non-trivial families have $m_2 \ne 0$. Other families are bad; for these $m_1(m_1 - m_3) \ll \eta K^2$.
Lemma 13.2.1 (Bad families) The number of bad non-trivial families is $O(\eta K^2L^2\log KL)$. They contribute $O(\eta K^3L^2\log KL)$ solutions.
Proof We count the number of bad families with M
(13.2.5)
which implies n2 = Im21 < 2N
by (13.2.1). By Lemma 13.1.2 with 0 << N/M, there` are O(I
d(l3)+
N
MM2v(13)/13 I
(13.2.6)
such families for each triple 11, 12, 13. We sum this bound over l1, 12, and 13 to 2L g
<< L' log L + L2MN E EE f2g2 f
13=L
8 fg=1i
«(L3 +L2MN)log L << L2MN log L +L4, (13.2.7) using Lemma 11.1.3 on the first sum, and a technique from its proof on the second sum. Similarly, by Lemma 13.1.2 with A << 1, there are
O(d(n3) +L2o-(n3)/nj) such families for each triple m1, m2, m3. We sum this bound over m1, m2 and m3 satisfying (13.2.5) and (13.2.6) to << MN 2 log M + L2MN log K <
(13.2.8)
One or other of the bounds (13.2.7) and (13.2.8) is $O(L^2MN\log KL)$. Summing M and N through powers of 2 with $N \le 4M$, $MN \ll \eta K^2$ gives $O(\eta K^2L^2\log KL)$ families, and each family gives at most K solutions. □
Lemma 13.2.2 (Good families with m fixed) Each good family with $m_3$ and $m_2$ in fixed ranges
\[ M \le |m_3| < 2M, \qquad N \le |m_2| < 2N, \qquad N \ll M, \tag{13.2.9} \]
contributes a bounded number of solutions of (13.1.1), (13.1.2), (13.1.3), and (13.1.5) unless
\[ M^2N \ll \eta K^4, \tag{13.2.10} \]
in which case these families contribute in total
N
N O( riK4 M log M+
3L2
ll
M)
(13.2.11)
solutions.
Proof By (13.2.2), (13.2.9) implies that
M/4
(13.2.12)
and (13.2.4) and (13.2.1) imply that
N51m3-m11532N.
(13.2.13)
We fix m1, m2, and m3 satisfying these conditions. By Lemma 13.1.2 with
_
z
71K 2 M + MN << K + MN from (13.1.16) and (13.1.17), the number of possible 11, 12, and 13 is
0 « I MI K m2I << d(n3) +
LZ n2n3) 3
M
2
(K
+ MN
As in Lemma 13.2.1, this bound sums over ml, m2, and m3 satisfying (13.2.9), (13.2.12), and (13.2.13) to
«MNZIog M+LZMNZ ( 1K+MN nK 2 By (13.1.11), the number of solutions in each family is K4
11
0 `min M2N2,K uniformly in l1, 12, and 13. The total number of solutions from such families is I
rtK4
OI M2N2
MN2 log M+
N ) + gK2L2 M K ,
L2MN2 1
K
which gives the` result of the lemma.
O
Lemma 13.2.3 (Good families with l fixed) Irrespective of (13.2.10), the good families with $m_3$ and $m_2$ in the ranges (13.2.9) contribute
\[ O\left(\eta K^3L^2\frac{N}{M} + KL^3\frac{N}{M}\log L\right) \tag{13.2.14} \]
solutions.
Proof We fix 11, 12, and l3. From (13.1.17)
I1m1(m1 - m3) + 12m2(m2 - m3) «OK2L,
(13.2.15)
where, by (13.2.4),
A=77+
m1(m1- m2)(m3 - m1) K3
M2N
«77+ K3
(13.2.16)
Eliminating m3 between (13.1.13) and (13.2.14) gives 11(11-13)m1 + 21112m1m2 + 13(13 -12)m2 <<
K2L2.
(13.2.17)
We multiply both sides by 1112 + V, where D is the discriminant of the quadratic, D=(1112)2-1112(13-11)(13-12)=111213(11+12-13) =11121314.
We can now factorize (13.2.17) as
(11(13 -11)m1 + (1112 + D)m2)((1112 + rD-)m1 + 12(13 -12)m2) «OK2L4. (13.2.18)
Now m1 and m2 have the same sign, and even if 13 - l2 is negative, then
1112+FD z2Ll2z2l2(12-13).
Hence the second factor on the left of (13.2.18) is at least L2 Im1I in order of magnitude, and so m2
11(11-13)
MI1
11 12 +Vi
AK2
qK2
«
m21
M2
+
N K
(13 . 2 . 19)
we note that (13.2.18) implies
LN LN -qK 2L LN + + << M , M M2 since the family is good. By Lemma 13.1.2 there are 11
K
-13 <<
(13 . 2 . 20)
N riK2 0'(13) (d(l3+_+IM2 ) 12 3
of these families, and by (13.1.11) of Lemma 13.1.3 they give O
-qK20'(13)1 + r1K4 M2N 0'(13)
M2N K
12
<
13
-qK30' (13) 123
1
solutions. We sum over $l_1$ satisfying (13.2.20), then over $l_2$ and $l_3$ to get the result of the lemma.
Theorem 13.2.4 The system of equations and inequalities (13.1.1), (13.1.2), (13.1.3), and (13.1.5) has $O(KL^3 + K^2L^2\log K + \eta K^3L^2\log K)$ integer solutions.
Proof First we estimate A*(K, L, -q). We count good families with m fixed unless K log K >> L2, (13.2.21) when the bound (13.2.14) of Lemma 13.2.3 is
OI gK3L2 M +KLM logK(KlogK)1/2) =O('qK3L2
M +K2L2 M (13.2.22)
1
When (13.2.21) is false, then the bound (13.2.11) of Lemma 13.2.2 is
(
O K2L2
N + K3L2 N M) M
also. Summing M and N over powers of 2 with N << M << K now gives O(K2L2 log K+ i1K3L2 log K)
(13.2.23)
The problem with a perturbing term
283
solutions from good families with M2N << -qK4. These terms dominate the contribution from good families with M2N >> qK4 and from bad families. Thus (13.2.23) is an upper bound for A*(K, L, r1). The theorem now follows from the reduction formula (13.2.3). O
13.3 THE PROBLEM WITH A PERTURBING TERM The coincidence conditions in the treatment of the exponential sum in Chapter 9 are more elaborate, but if the parameters of the construction satisfy
H5 <
(13.3.1)
then we have
L
K
L3
<<
R
XQ
S
IQ
I
Q)
<<
I
Hence for L < 1 < 2L, K5 k < 2K we have 13
k2 =19(0)+
01(0) + O
(12)
(R /(I)
3
= l + 24k312 +0 R (Q
)
and 2
k e(k2)
n
12
413
T372
+O
9 (k2)
R
(N)
Thus the coincidence conditions include (13.1.1), (13.1.3), and (13.1.4) to the same order of accuracy as before, but (13.1.2) is replaced by
\[ l_1\sqrt{k_1}\left(1 + \frac{l_1^2}{24k_1^2}\right) + l_2\sqrt{k_2}\left(1 + \frac{l_2^2}{24k_2^2}\right) = l_3\sqrt{k_3}\left(1 + \frac{l_3^2}{24k_3^2}\right) + l_4\sqrt{k_4}\left(1 + \frac{l_4^2}{24k_4^2}\right) + O(\eta L\sqrt{K}). \tag{13.3.2} \]
We regard (13.3.2) as a perturbed version of (13.1.2), and we try to count the solutions of the new system of equations by the same method.
Lemma 13.3.1 (A small perturbation) The system of equations and inequalities (13.1.1), (13.1.3), (13.3.2), and (13.1.5) has
\[ O((KL^4 + K^2L^2 + \eta K^3L^2)\log K) \]
integer solutions.
Proof We write (13.3.2) as
\[ l_1\sqrt{k_1} + l_2\sqrt{k_2} = l_3\sqrt{k_3} + l_4\sqrt{k_4} + O\left(\left(\eta + \frac{L^2}{K^2}\right)L\sqrt{K}\right), \]
and apply Theorem 13.2.4.
Finally we consider the coincidence conditions for the most complicated version of Jutila's method in Chapter 10. The argument is simpler, but we must keep track of the coefficients b(l). Lemma 13.3.2 (Two each side, with coefficients) Let b(l) be any sequence of complex coefficients, and let L be a positive integer. The number of solutions of
11-13=12- 14,
l1 -
12 -
13 =
14 + O(0F
with $L \le l_i < 2L$, counted with weight $b(l_1)b(l_2)b(l_3)b(l_4)$ is
\[ 2\left(\sum_{l=L}^{2L-1}|b(l)|^2\right)^2 + O\left((1 + \Delta L^2\log L)\sum_{l=L}^{2L-1}|b(l)|^4\right). \]
Proof The trivial solutions with $l_1 = l_2$ or $l_1 = l_3$ give
\[ 2\left(\sum_{l=L}^{2L-1}|b(l)|^2\right)^2 - \sum_{l=L}^{2L-1}|b(l)|^4. \]
For the other solutions we use
\[ |b(l_1)b(l_2)b(l_3)b(l_4)| \le \tfrac14\left(|b(l_1)|^4 + \cdots + |b(l_4)|^4\right). \]
By symmetry we can replace the weight by $|b(l_1)|^4$. As in Lemma 11.2.4, the solutions with $l_1$ fixed have
\[ (l_1 - l_3)(l_1 - l_4) = O(\Delta L^2). \]
By Lemma 11.1.3, there are $O(\Delta L^2\log L)$ choices for $l_3$ and $l_4$ different from $l_1$, and $l_2$ is given by $l_2 = l_1 - l_3 + l_4$.
Part IV The Second Spacing Problem: `rational' vectors
14 The First and Second Conditions
14.1 THE COINCIDENCE CONDITIONS
First we set up a uniform notation for the Second Spacing Problem in Chapters 7, 8, and 9.
Lemma 14.1.1 (The coincidence conditions) The conditions for the coincidence of two y vectors corresponding to rationals a/q and a'/q' in Lemma 7.4.2 and Lemma 8.4.1 imply four conditions of the form a
a'
S A1,
(14.1.1)
5 A2,
(14.1.2)
5 03,
(14.1.3)
IK-K'I5A4.
(14.1.4)
ab
a'b'
q
q
In Lemma 7.4.2 we have R4
01xNZQ2V,
R2
Q
02XNZ
R2
03XNQ,
A4XN.
R2
03_HQ' R2
Q A4x H
In Lemma 8.4.1 we have
01 x
R4
HNQ2V'
A2
HN'
In both cases we have the chain of inequalities R2
R2
Al S A1Vx n202 << A2 << A3 X and
O2PQ>> 1.
T A4 << A4,
Proof The condition 1
IY1-y1'1 s
2r(2H)2V
in Lemma 7.4.2 gives in case 1 4a'
4a
R4 <<
q
q'
N2Q2V'
and a'
54 q
4a
4a'
q
q'
q'
<4
4a
4a'
q
q'
and similarly in cases 0 and 2. We argue similarly for y3 -y3. The condition 2
2
1
I
(27µ'q'3)
(27µq3)
2r(2H)3"2
gives Q3
Vµ,q,3
-1 It
R2
R6
NR2 N3Q3
µq3
)
«
N2'
and (14.1.2) follows on multiplication by the conjugate factor 1 + 3) . The condition I
3K
3K'
1
5
1/(27µ'q'3)
(27µq3)
2r(2H)1/2
gives
K - K'
1
(27µ,q,3)
1
1
(27µ,q,3)
2 µq3)
2r(2H) /2
so that µ,q,3
IK - K'ls
4L27ttk'q,,3
+
A
«NR2
Q32
+
(
R2
NR NQ
H Q «N.
We argue similarly for the condition of Lemma 8.4.1.
In the case M>> FT, the rational number a/q is less than one, and so the numerator a runs through a shorter range. If M2 - M< M/C2C3, (14.1.5)
Magic matrices
289
then the numerator a lies in a range P< l al < 2P with
QT MZ,..NR2. MQ
P
(14
.
1 6) .
We lose no generality in supposing that (14.1.5) holds, since we can take a
finite subdivision of the sum S into ranges M< M< M2, M2 + 1 < M5 M3,... . On the second range we replace M in the inequalities by M2 + 1, and so on.
Lemma 14.1.2 (Inverting the fractions) In the case M >> T , the First Coincidence Condition (14.1.1) implies 1
(14.1.7)
where q and q' are defined by
qq + as =1,
q'q' + a'a' = 1.
Proof We have q
q'
1
1
a
a'
aq
a'q'
which gives (14.1.7) since lal > P, I ql >- Q.
O
14.2 MAGIC MATRICES One of Bombieri and Iwaniec's deepest insights was that the first condition (14.1.1) was linear in the action of the modular group on primitive integer vectors, described in Section 1.3 of the first chapter. To simplify the notation, we write r/q = l a/ql. Lemma 14.2.1 (The magic matrix) in their lowest teens with r
Suppose that r/q and r'/q' are rationals r'
liq
g111
where r and r' are defined by if = 1(mod q), r'r' = 1(mod q'). Then b
(q') - (c d)(q) for some integer matrix of determinant ad - be = 1 with
Icl < Digq'.
(14.2.1)
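The entries of the matrix are related by
\[ a = \frac{cr'}{q'} + \frac{q}{q'} = -\frac{bq}{r} + \frac{r'}{r}, \tag{14.2.2} \]
\[ d = -\frac{cr}{q} + \frac{q'}{q} = \frac{bq'}{r'} + \frac{r}{r'}. \tag{14.2.3} \]
Proof Since $\bar r$ is only defined mod q, and $\bar r'$ mod q', then we can suppose that $|\bar r'/q' - \bar r/q| \le \Delta_1$. Let $\bar q$ and $\bar q'$ be defined by $q\bar q + r\bar r = 1$, $q'\bar q' + r'\bar r' = 1$. Then
\[ \begin{pmatrix} r' & -\bar q' \\ q' & \bar r' \end{pmatrix}\begin{pmatrix} r & -\bar q \\ q & \bar r \end{pmatrix}^{-1} = \begin{pmatrix} r' & -\bar q' \\ q' & \bar r' \end{pmatrix}\begin{pmatrix} \bar r & \bar q \\ -q & r \end{pmatrix} = \begin{pmatrix} r'\bar r + q\bar q' & r'\bar q - r\bar q' \\ q'\bar r - q\bar r' & q'\bar q + r\bar r' \end{pmatrix}, \]
which defines the matrix. The bound for c is immediate, and the relations (14.2.2) and (14.2.3) are restatements of the relation in the lemma and of its inverse
\[ \begin{pmatrix} r \\ q \end{pmatrix} = \begin{pmatrix} d & -b \\ -c & a \end{pmatrix}\begin{pmatrix} r' \\ q' \end{pmatrix} = \begin{pmatrix} dr' - bq' \\ aq' - cr' \end{pmatrix}. \]
□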
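The construction in this proof is easy to carry out for concrete fractions. The following is a sketch only: it assumes the matrix entries $a = r'\bar r + q\bar q'$, $b = r'\bar q - r\bar q'$, $c = q'\bar r - q\bar r'$, $d = q'\bar q + r\bar r'$ (one reading of the garbled display), takes $\bar r$ as the least positive inverse of r mod q, and uses illustrative function names.

```python
from math import gcd

def bar_pair(r, q):
    """Return (rbar, qbar) with r*rbar + q*qbar == 1 for coprime r, q.

    rbar is the inverse of r modulo q; pow(r, -1, q) needs Python 3.8+.
    """
    assert gcd(r, q) == 1
    rbar = pow(r, -1, q)
    qbar = (1 - r * rbar) // q      # exact, since r*rbar = 1 (mod q)
    return rbar, qbar

def magic_matrix(r, q, r2, q2):
    """Integer matrix (a, b, c, d) carrying the vector (r, q) to (r2, q2)."""
    rb, qb = bar_pair(r, q)
    rb2, qb2 = bar_pair(r2, q2)
    a = r2 * rb + q * qb2
    b = r2 * qb - r * qb2
    c = q2 * rb - q * rb2
    d = q2 * qb + r * rb2
    return a, b, c, d

r, q, r2, q2 = 3, 7, 5, 12          # arbitrary fractions in lowest terms
a, b, c, d = magic_matrix(r, q, r2, q2)
print((a, b, c, d), a * d - b * c, (a * r + b * q, c * r + d * q))
```

Since $c = qq'(\bar r/q - \bar r'/q')$, the size of c measures exactly how close the two fractions $\bar r/q$ and $\bar r'/q'$ are, which is the content of (14.2.1).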
Next we classify magic matrices. In the first chapter we saw the classification by trace. When calculating Fourier coefficients of modular forms (see Kuznietsov 1980), the right classification is into the identity, the upper triangular matrices, and all other matrices. Our argument uses a third classification.
Type 3 Matrices with $|c| \ge BQ/P$ or $|b| \ge BP/Q$, where B is a sufficiently large constant.
Type 2 Upper and lower triangular matrices other than the identity.
Type 1 The identity matrix, and any matrix not already classified.
We write
\[ \delta M = M_2 - M, \tag{14.2.4} \]
or, for a family of sums with varying endpoints,
\[ \delta M = \max_i M_2(i) - M. \]
Lemma 14.2.2 (Type 1 matrices) The number of type 1 matrices is bounded (in terms of the constant B).
Proof Type 1 matrices other than the identity have
\[ 0 \ne |bc| \le B^2. \]
Then $|b|, |c| \le B^2$. If $bc \ne -1$, then
\[ 0 \ne |ad| \le B^2 + 1, \]
and $|a|, |d| \le B^2 + 1$. If bc = -1, then ad = 0. If a = 0, then (14.2.2) gives r' = bq, so b = 1, c = -1, r' = q, and (14.2.3) gives
\[ d = \frac{r}{q} + \frac{q'}{q}. \]
Hence d is positive and $d \le 4$. Similarly, if bc = -1 and d = 0, then a = 1, 2, 3, or 4. □
Lemma 14.2.3 (Type 2 matrices) In the construction of Chapter 7, upper triangular matrices have
\[ 1 \le |b| \le \min(1, 2\delta C_2C_3)P/Q. \]
Lower triangular matrices have
\[ 1 \le |c| \le \min(1, 8\delta C_2C_3)Q/P. \]
There are corresponding bounds in the constructions of Chapters 8 and 9.
Proof For an upper triangular matrix q' = q, r' = r + bq and $|r' - r| \le P$, so that
\[ |b| \le |r' - r|/q \le P/Q. \]
For a lower triangular matrix r' = r, q' = cr + q, and $|q' - q| \le Q$, so $|c| \le Q/P$ similarly. In the construction of Chapter 7, the fractions r/q and r'/q in the upper triangular case, or r/q and r/q' in the lower triangular case, are values of f''(x)/2. By the conditions (7.2.1) of Chapter 7, the values of f''(x)/2 lie in an interval of length $\le \delta C_3T/2M^2$, whose lower endpoint is $\ge T/2C_2M^2$. Hence we must have
\[ \frac{P}{Q} \ge \frac{T}{4C_2M^2}, \]
so that
\[ |b| = \left|\frac{r}{q} - \frac{r'}{q}\right| \le \frac{\delta C_3T}{2M^2} \le \frac{2\delta C_2C_3P}{Q}. \]
Similarly, for lower triangular matrices
\[ 2\delta C_2C_3P/Q \ge |r/q - r/q'| = |cr^2/qq'| \ge P^2|c|/4Q^2. \]
Lemma 14.2.4 (Type 3 matrices) For type 3 matrices
\[ |a| \asymp |d| \asymp \frac{P}{Q}|c| \asymp \frac{Q}{P}|b|. \]
The number of type 3 matrices with K5 I c I < 2K is 2
O min
Q
SZQZPZ
2
2
KP 2)
+ min K, 2+ 1+ Q
(
(
SQ )
log K
Proof We suppose that B z 8 in the definition of type 3 matrices. Then in (14.2.2) and (14.2.3), for I cl BQ/P we have B P r' q< 2< 4 s 4Q IcI 5 2q, Icl, q
r' 3r' 3P P IcI <- 2q, IcI <- lal s Zq, IcI s Q 4Q
IcI,
and similarly for Idl. Then I bcI s ladl + 1 s
3P
2
p
(Q IcI) + ( 8Q
2
IcI)
s
loP2 c2, Q2
and z
2
Ibcl > I adl - 1 >_ (
Q
IcI)
2
c2.
- (8Q ICI) 64Q2
We argue similarly when IbI z BP/Q.
To count type 3 matrices in the case $P \ge Q$, we note that if c and d are fixed, and $\bar c$ and $\bar d$ are integers with $c\bar c + d\bar d = 1$, then the relation ad - bc = 1 implies that
\[ a = \bar d + ct, \qquad b = -\bar c + dt \tag{14.2.5} \]
for some integer $t \ll P/Q$. When $K \le |c| < 2K$, then $|d| \asymp KP/Q$, so there are $O(K^2P^2/Q^2)$ type 3 matrices in this range. In the case P < Q we express c and d in terms of a, b, $\bar a$, and $\bar b$, where $a\bar a + b\bar b = 1$, and argue similarly. For small $\delta$, we write $A = f''(M)/2$.
By the conditions (7.2.1) of Chapter 7, or their analogues in the other chapters,
\[ r/q,\ r'/q' = A(1 + O(\delta)). \tag{14.2.6} \]
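Put $\theta = q/q'$. Adding (14.2.2) and (14.2.3) gives
\[ a + d - 2 = \theta + 1/\theta - 2 + O(\delta A|c|). \tag{14.2.7} \]
Since $\tfrac12 < \theta \le 2$, we have $0 \le \theta + 1/\theta - 2 \le \tfrac12$. If a + d = 2, then the magic matrix is a transvection with some eigenvector (w, v), a primitive integer vector, and, for some integer u,
\[ \begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} 1 + uvw & -uw^2 \\ uv^2 & 1 - uvw \end{pmatrix}. \]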
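The transvection form can be checked directly. The sketch below assumes the sign convention $c = uv^2$, $d = 1 - uvw$ (so $a = 1 + uvw$, $b = -uw^2$), which is one reading of the garbled display; the function names are illustrative.

```python
def transvection(u, v, w):
    """Matrix ((a, b), (c, d)) of trace 2 fixing the vector (w, v)."""
    return ((1 + u * v * w, -u * w * w),
            (u * v * v, 1 - u * v * w))

def act(matrix, vec):
    """Apply a 2x2 integer matrix to a column vector."""
    (a, b), (c, d) = matrix
    x, y = vec
    return (a * x + b * y, c * x + d * y)

def det(matrix):
    (a, b), (c, d) = matrix
    return a * d - b * c

def trace(matrix):
    (a, _), (_, d) = matrix
    return a + d

T = transvection(3, 2, 5)
print(T, det(T), trace(T), act(T, (5, 2)))
```

With this parametrization a lower-left entry in the range $K \le |c| < 2K$ forces $|u| < 2K/v^2$, which is how the count of choices for u arises.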
Substituting these values in (14.2.2), and using (14.2.6), we find that
=A+O SA+
-). 1
Icl
V
For K < Icl < 2K and fixed v and w, the number of choices for u is at most 2K/v2. The number of possible rationals w/v with v in a range
V
<<
1+(K+5A)V2,
(14.2.8)
by Lemma 1.2.3, giving a total of O(K min(1, A2) + (1 + SAK)log K)
(14.2.9)
transvection matrices with $K \le |c| < 2K$ when we sum V through powers of 2. The remaining type 3 matrices have $a + d \ne 2$, so (14.2.7) gives
\[ \delta AK \gg 1. \tag{14.2.10} \]
We conjugate the magic matrix into a matrix with smaller entries. Let J be the interval [(1 - S)A, (1 + S)A], and let 1/n be the rational number in J with least denominator. By Euclid's algorithm (Lemma 1.5.2) there are integers k and m with
kn-lm=e= ±1,
0<m
(14.2.11)
If m > n/2, then we replace k by l - k, m by n - m. If m > 0, then since
neither k/m nor (l - k)/(n - m) lies in J, but I/n, which is in J, lies between these two rationals, then we have
1-k
k
2SA<m -- n-m
1
m(n-m)'
so that SAmn < 1;
this inequality holds trivially if m = 0. We rewrite (14.2.6) as r
r'
1
9 , q, _ - + O(6A). n
(14.2.12)
Then (14.2.2) and (14.2.3) give
an - c1= n9+ O(SAnK) = O(SAnK), do + c1= n/O + O(SAnK) = O(SAnK). Hence by (14.2.11)
m
ce
ce
n
n
n
am - ck= -(an-cl)- - =m6- - +O(SAKm), and similarly ce
ck+dm =m/9+ - +O(SAKm). n
The magic matrix is conjugate to
d')
(c'
n)-1(c
=(k
l n1
(m
d)
akn + bmn - ckl - dim - akm - dm2 + ck2 + dkm
aln + bn2 - c12 - din
-aim - bmn + ckl + dkn
1
(an - cl)(ck + dm) - mn
c
-(am-ck)(ck+dm)+m2
(an-cl)(cl+dn)-n2 -(am-ck)(cl+dm)+mn
Here mn
a'
c
SAnK
+O
Icl <<
lm+
K n
+SAKm
K n
SAnlm+ - +SAKm << SAK,
by (14.2.12). Similar calculations give b' «S2A2Kn2, c' << K/n2, d' << SAK.
The trace is unchanged by conjugation, so a' + d' # f 2, and b' and c' are both non-zero. If 6An2 >> 1, then, for fixed c' and d', a' = d' + c't, b' = - c' + d't by (14.2.5) applied to the conjugate matrix, where t is an integer with (SAK S 2A2Kn2 t << minl Ic,l , Id'I
Summing over c' and d' gives SAK
O
1sc'«K/n2 52A2K 2 1 s c' 4 d'/SAn2
d'
=O(S2A2K2)
The Second Condition
295
possibilities for the conjugate matrix. If SAn2 << 1, then we count the number of choices of c' and d' when a' and b' are fixed, and we get the same bound. Adding the two cases gives the second term in the minimum. O
14.3 THE SECOND CONDITION
Whilst the First Condition is algebraic, the Second Condition is analytic. First we simplify it, using calculations like those following the inversion step (Lemma 5.5.3).
Lemma 14.3.1 (The Second Condition) The Second Coincidence Condition (14.1.2) implies
\[ \left| \frac{q'}{q} - \frac{h(a/q)}{h(a'/q')} \right| \ll \Delta_2, \tag{14.3.1} \]
where h(z) is a function constructed from the original function F(x) in the exponential sum and the size parameters T and M. In the case of a family of sums, corresponding to F(x, y) for different values $y = y_i$, then h(z) becomes h(z, y), h(a/q) becomes h(a/q, y), and h(a'/q') becomes h(a'/q', y'). We have
\[ \frac{h'(z)}{h(z)} \asymp \frac{M^2}{T} \asymp \frac{Q}{P}, \tag{14.3.2} \]
provided that $F^{(3)}$ and $F^{(4)}$ are non-zero in Chapters 7 and 9, or $F''$ and $F^{(3)}$ are non-zero in Chapter 8, and
\[ \frac{h'(z)}{h(z)} - \frac{1}{z} \asymp \frac{M^2}{T} \asymp \frac{Q}{P}, \tag{14.3.3} \]
provided that $F^{(3)}$ and $F''F^{(4)} - 3F^{(3)2}$ are non-zero in Chapters 7 and 9, or $F''$ and $F'F^{(3)} - 3F''^2$ are non-zero in Chapter 8. In Chapter 10 we have (14.3.2) and (14.3.3) with $M^2/T$ replaced by $M/T$, provided that $F''$ is non-zero for (14.3.2), and $F'$, $F''$, and $F' + 2xF''$ are non-zero for (14.3.3).
Proof For the simple exponential sum of Chapter 7, we first consider the inverse function g(z) of $f''(x)/2$ (in the notation of (7.1.2)). Let $f''(g(z)) = 2z$, so that
\[ g'(z) = 2/f^{(3)}(g(z)). \]
Then h(z) is defined by
\[ h^3(z) = f^{(3)}(g(z))/6, \]
and (14.1.2) becomes
\[ \left| \frac{h^3(z')q'^3}{h^3(z)q^3} - 1 \right| \le \Delta_2, \]
with z = a/q, z' = a'/q', from which (14.3.1) follows. For a family of sums with parameter y we write f(x, y), g(z, y), and h(z, y), and we replace $f''$ and $f^{(3)}$ by the corresponding partial derivatives with respect to x. We estimate the logarithmic derivatives from
\[ 3h^2(z)h'(z) = f^{(4)}(g(z))g'(z)/6, \]
with
\[ \frac{h'(z)}{h(z)} = \frac{2f^{(4)}(g(z))}{3f^{(3)}(g(z))f^{(3)}(g(z))} = \frac{2M^2}{3T}\,\frac{F^{(4)}(g(z)/M)}{(F^{(3)}(g(z)/M))^2}. \]
For the double exponential sum of Chapter 8, the construction is
\[ f'(g(z)) = z, \qquad h^3(z) = f''(g(z))/2, \]
with
\[ \frac{h'(z)}{h(z)} = \frac{f^{(3)}(g(z))}{3(f''(g(z)))^2} = \frac{M^2}{3T}\,\frac{F^{(3)}(g(z)/M)}{(F''(g(z)/M))^2}. \]
In Chapter 9 we have the corresponding formulae with f(x) replaced by f(x, 0), and differentiation replaced by partial differentiation with respect to x.
For Chapter 10 the construction is
\[ f'(g(z)) = z, \qquad h^2(z) = 1/g(z), \]
with
\[ \frac{h'(z)}{h(z)} = -\frac{1}{2g(z)f''(g(z))} = -\frac{M}{2T}\,\frac{1}{(g(z)/M)F''(g(z)/M)}. \]
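The logarithmic derivative in the Chapter 7 construction can be checked numerically. The following is a minimal Python sketch, not part of the original argument: it assumes the sample function f(x) = exp(x) (chosen only because its inverse is explicit; the book's f(x) is built from F, T and M), and compares a finite-difference value of h'/h with the closed form (2/3) f⁗/(f‴)², and of h'/h - 1/z with the combination (2/3)(f″f⁗ - 3f‴²)/(f‴²f″).

```python
import math

def g(z):
    # inverse of f''(x)/2 for the sample f(x) = exp(x): f''(g(z)) = 2z
    return math.log(2.0 * z)

def h(z):
    # h(z)^3 = f'''(g(z))/6, and f''' = exp for this sample function
    return (math.exp(g(z)) / 6.0) ** (1.0 / 3.0)

def log_derivative(func, z, step=1e-6):
    # central difference approximation to d/dz log func(z)
    return (math.log(func(z + step)) - math.log(func(z - step))) / (2.0 * step)

def closed_form(z):
    # (2/3) f''''(g) / f'''(g)^2; every derivative of exp equals exp
    e = math.exp(g(z))
    return (2.0 / 3.0) * e / (e * e)

def closed_form_minus_reciprocal(z):
    # (2/3) (f'' f'''' - 3 f'''^2) / (f'''^2 f''), evaluated at x = g(z)
    e = math.exp(g(z))
    return (2.0 / 3.0) * (e * e - 3.0 * e * e) / (e * e * e)

for z in (0.5, 1.0, 3.0):
    numeric = log_derivative(h, z)
    print(z, numeric, closed_form(z),
          numeric - 1.0 / z, closed_form_minus_reciprocal(z))
```

For this sample both columns agree to the accuracy of the finite difference; any other smooth f with non-vanishing second and third derivatives could be substituted, at the cost of inverting f''/2 numerically.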
Type 2 matrices can give rise to many coincidences; they are the most troublesome to control.
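Lemma 14.3.2 (Restrictions on type 2 matrices) When (14.3.2) holds, then upper triangular matrices giving coincidences between minor arcs of the same sum in a family of sums have
\[ b \ll \Delta_2 P/Q. \]
When (14.3.3) holds, then lower triangular matrices giving coincidences between minor arcs of the same sum in a family of sums have
\[ c \ll \Delta_2 Q/P. \]
Proof For upper triangular matrices
\[ \frac{r'}{q'} = \frac{r + bq}{q}, \]
so that q' = q, and (14.3.1) gives
\[ \Delta_2 \gg \left| \log h\left(\frac{r'}{q}\right) - \log h\left(\frac{r}{q}\right) \right| = \left| \frac{b\,h'(\xi)}{h(\xi)} \right| \]
for some $\xi$ between r/q and r'/q.
For lower triangular matrices
\[ \frac{r'}{q'} = \frac{r}{cr + q}, \]
so that r' = r, and (14.3.1) gives
\[ \Delta_2 \gg \left| \log\left( \frac{q'}{r}\,h\left(\frac{r}{q'}\right) \right) - \log\left( \frac{q}{r}\,h\left(\frac{r}{q}\right) \right) \right| = \left| c\,\frac{d}{dz}\log(z\,h(1/z)) \right| = |c| \left| \frac{1}{z} - \frac{h'(1/z)}{z^2 h(1/z)} \right|, \]
evaluated at some value $z = \xi$ between q/r and q'/r. □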
Lemma 14.3.3 (The domain of a type 3 matrix) If either (14.3.2) or (14.3.3) holds, then for all pairs of minor arc vectors which satisfy the First and Second Conditions with a given type 3 magic matrix (and with given parameters y and y' in the case of a family of sums), the rational numbers a/q and a'/q' lie in fixed intervals of length $O(\Delta_2/|c|)$, where c is the lower left entry of the magic matrix.
Proof We follow Kolesnik's account in Graham and Kolesnik (1991), which is non-constructive. Huxley and Watt (1988) gave an iterative construction of a nested family of intervals containing a/q. As in Lemma 14.2.1 we write
r/q, r'/q' for a/q, a'/q' to avoid confusion with the usual notation for matrices. In the previous lemma and Lemma 14.2.1 we have
\[ \Delta_2 \gg \left| \frac{q'h(r'/q')}{qh(r/q)} - 1 \right| = \left| \frac{(cz+d)\,h((az+b)/(cz+d))}{h(z)} - 1 \right|, \]
where z = r/q. The functions h(z) for which the right-hand side is constant are modular forms, and are undefined on the real axis, whilst our h(z) is continuous, differentiable, and non-zero. We have
\[ \log(cz+d) + \log h\left( \frac{az+b}{cz+d} \right) - \log h(z) \ll \Delta_2. \tag{14.3.4} \]
The derivative of the expression on the left of (14.3.4) is
\[ \frac{c}{cz+d} + \frac{1}{(cz+d)^2}\,\frac{h'}{h}\left( \frac{az+b}{cz+d} \right) - \frac{h'(z)}{h(z)}. \]
We need only consider values of z with
matrices is chosen so large that the term $c/(cz+d)$ dominates the values of $h'/h$ when (14.3.2) holds. We deduce that (14.3.4) holds for an interval of z of length $O(\Delta_2/|c|)$. In the case when (14.3.3) holds, then we write (14.3.4) as az + b h(z) log(az + b) + log h (cz + d) - log ( z
<<
2.
The derivative is c
h'
1
cz+d + (cz+d)2h
(
az + b
cz + d
cz+d
az+b
h'(z)
)-
h(z)
1
Z.
Again, we need only consider values of z with
\[ P/2Q \le |az+b| \le 2P/Q, \]
since az + b = r'/q when z = r/q. By Lemma 14.2.4 |a| is bounded below for type 3 matrices in terms of the constant B, and, as above, we see that (14.3.4) holds for an interval of z of length
\[ O\left( \frac{\Delta_2}{|a|}\,\frac{P}{Q} \right) = O\left( \frac{\Delta_2}{|c|} \right), \]
where we have used Lemma 14.2.4 again. □
Lemma 14.3.4 (Restrictions on type 3 transvections) When (14.3.2) holds, then type 3 matrices of the transvection form
\[ \begin{pmatrix} 1+uvw & -uw^2 \\ uv^2 & 1-uvw \end{pmatrix} \]
which give coincidences have
\[ |uv| \ll \Delta_2 Q. \]
Let $B_0$ be the implied constant in Lemma 14.3.3, and let $w_0/v_0$ be a rational number of least denominator for which, for some $\theta$ between $-B_0$ and $B_0$ and some x in $M \le x \le M_2$, we have
\[ \frac{w_0}{v_0} = \frac{f''(x)}{2} + \frac{\theta\Delta_2}{v_0^2} \]
in the construction of Chapter 7. Then the number of type 3 transvection matrices with $K \le |c| < 2K$ which fix $w_0/v_0$ (so that $K \ll \Delta_2 Q v_0$) is $O(K/v_0^2)$, and the number of those matrices which fix some other rational number (so that $K \ll \Delta_2 Q^2$) is $O((\delta KP/Q + \Delta_2)\log K/v_0)$, and
\[ v_0 \ge \max(1, O(Q/P)). \]
Proof Since the transvection matrix gives a trivial coincidence at w/v, then we have
\[ \frac{1}{qv} \le \left| \frac{r}{q} - \frac{w}{v} \right| \le \frac{B_0\Delta_2}{|u|v^2}, \]
which gives the first inequality. We deduce that the fixed point w/v of the transvection must lie within $B_0\Delta_2/v^2$ of some value of $f''(x)/2$. Also, if $K \le |uv^2| < 2K$, then w/v lies in an interval I of length $O(\delta A + \Delta_2/K)$, where A is as in Lemma 14.2.4. For each L, the number of rational points of denominator at most L in the interval I is
\[ \le 1 + O((\delta A + \Delta_2/K)L^2), \]
by Lemma 1.2.3. When w/v is chosen, then the number of non-zero choices for the integer u is at most $2K/v^2$. We have
\[ v_0 \gg \max(1, 1/A), \]
and the result follows by partial summation. □
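The defining properties of the transvection form in Lemma 14.3.4 (determinant 1, trace 2, and the fixed point w/v) can be verified with exact integer arithmetic. The sketch below is illustrative only; the function names and the sample values of u, v, w are hypothetical choices, not taken from the text.

```python
from fractions import Fraction

def transvection(u, v, w):
    # rows of the matrix ((1 + uvw, -u w^2), (u v^2, 1 - uvw))
    return ((1 + u * v * w, -u * w * w), (u * v * v, 1 - u * v * w))

def determinant(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def trace(m):
    return m[0][0] + m[1][1]

def act(m, r, q):
    # action of the matrix on the rational r/q, returned in lowest terms
    return Fraction(m[0][0] * r + m[0][1] * q, m[1][0] * r + m[1][1] * q)

m = transvection(3, 5, 2)          # sample values u = 3, v = 5, w = 2
print(determinant(m), trace(m), act(m, 2, 5))
```

The lower left entry is c = uv², which is why the ranges K ≤ |c| < 2K in the lemma translate into at most 2K/v² choices of u once w/v is fixed.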
Lemma 14.3.5 (Coincidences in the First and Second Conditions) If either $P \gg Q$ and (14.3.2) holds, or $P \ll Q$ and (14.3.3) holds, then (for minor arc vectors from the same sum in the case of a family of sums), the number of coincidences in the First and Second Conditions with $P \le |a| < 2P$, $Q \le q < 2Q$ is
\[ O(PQ + \Delta_2(P^2 + Q^2) + \Delta_1(\Delta_1 + \Delta_2)P^2Q^2). \tag{14.3.5} \]
If $\delta$ is small, but $\delta PQ \gg 1$, then we have the better estimate
O(SPQ + min(SO2, S2)(P2 + Q2)) + O(SL2 PQ loge PQ)
+ O min S A 1
PQ2 , vo
S01
QZ
2
vo
,
PQ3
A2 2 log Q) + O(5201(O1 + vo
O2)P2Q2),
)
(14.3.6)
where $v_0$ was defined in the previous lemma. The implied constant depends on the bounds for the derivatives of the function F(x).
Proof The rational numbers a/q lie in an interval of length $O(\delta P/Q)$. By Lemma 1.2.3 there are $O(\delta PQ)$ of them. There are $O(1 + \Delta_2 P/Q + \Delta_2 Q/P)$ magic matrices of types 1 and 2 by Lemmas 14.2.2 and 14.3.2, which gives the first two terms in the bounds. If $\delta \le \Delta_2$, then we can use Lemma 14.2.3, and replace $\Delta_2$ by $\delta$. For type 3 matrices we consider separately each range $K \le |c| < 2K$, where K is a power of 2 with $v_0 \ll K \ll \Delta_1 Q^2$.
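The counts of rationals in short intervals used here rest on a spacing fact: two distinct fractions with denominators at most L differ by at least 1/L², so an interval of length ℓ contains at most 1 + ℓL² of them. The following Python sketch illustrates that elementary bound on a sample interval; it is an illustration of the spacing argument under this assumption, not a restatement of Lemma 1.2.3, and the interval and the value of L are hypothetical.

```python
from fractions import Fraction
from math import ceil

def rationals_in_interval(left, right, max_denominator):
    # distinct rationals a/q with 1 <= q <= max_denominator in [left, right]
    found = set()
    for q in range(1, max_denominator + 1):
        a = ceil(left * q)
        while Fraction(a, q) <= right:
            found.add(Fraction(a, q))
            a += 1
    return found

def spacing_bound(left, right, max_denominator):
    # distinct fractions with denominators <= L are at least 1/L^2 apart
    return 1 + (right - left) * max_denominator ** 2

left, right, L = Fraction(1, 3), Fraction(2, 5), 20
points = rationals_in_interval(left, right, L)
print(len(points), spacing_bound(left, right, L))
```

The enumeration is quadratic in L and is meant only for small experiments; the bound itself is what the proof uses, with ℓ = O(δP/Q) or ℓ = O(Δ₂/K) and L = 2Q.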
In Lemma 14.3.3 the rational r/q lies in an interval of length $O(\Delta_2/K)$. By Lemma 1.2.3 there are
\[ \le 1 + O(\Delta_2 Q^2/K) \ll (\Delta_1 + \Delta_2)Q^2/K \tag{14.3.7} \]
possibilities for r/q. The first result (14.3.5) follows. For $\delta < 1$ we distinguish three cases. In the proof of Lemma 14.2.4, the number of type 3 matrices which are not transvections is $O(\delta^2 A^2 K^2)$. They give the last term in (14.3.6). For the transvection matrices we subtract one from (14.3.7), since a transvection matrix gives a trivial coincidence at w/v. For $K \ll \Delta_2 Q/P$ we replace (14.3.7) by the trivial estimate $O(\delta PQ)$ from Lemma 1.2.3. The number of type 3 transvections is estimated in the previous lemma. □
14.4 A FAMILY OF SUMS
A family of exponential sums formed with functions $F_i(x)$ is a very general situation. We suppose that
\[ F_i(x) = F(x, y_i), \]
where, for i = 1, ..., I, the points $y_i$ lie in $0 \le y < 1$ with
\[ y_{i+1} - y_i \ge 1/J; \tag{14.4.1} \]
this implies that $I \le J + 1$. In Lemma 14.3.1 the function h(z) in the Second Condition becomes h(z, y).
Lemma 14.4.1 (Shifting the parameter) Suppose that $(\partial/\partial y)\log h(z, y)$ is continuous and bounded away from zero. Then a particular matrix M gives a coincidence in the Second Condition for a particular minor arc vector for $O(\Delta_2 J + 1)$ values of the parameter $y_i$. The condition on h(z, y) corresponds to
\[ \left| \frac{d}{dx}\,\frac{F_{112}}{F_{111}} \right| \ge \frac{1}{C_0} \]
for some constant $C_0$ in the constructions of Chapters 7 and 9, and to
\[ \left| \frac{d}{dx}\,\frac{F_{12}}{F_{11}} \right| \ge \frac{1}{C_0} \]
in the constructions of Chapters 8 and 10.
Proof We write (14.3.1) as
\[ \left| \frac{q'}{q} - \frac{h(a/q, y)}{h(a'/q', y')} \right| \ll \Delta_2, \]
A family of sums
301
to display explicitly the dependence on the parameter. When the rational a/q, the value of y, and the magic matrix are fixed, then a'/q' is fixed, and $\log h(a'/q', y')$ lies within an interval of length $O(\Delta_2)$. The bound follows from the spacing condition (14.4.1). In the construction of Chapter 7 we have
\[ \frac{\partial}{\partial y}\log h(z, y) = \frac{1}{3}\,\frac{\partial}{\partial y}\log F_{111}\left( \frac{g(z, y)}{M}, y \right) = \frac{F_{1111}}{3F_{111}}\,\frac{g_2(z, y)}{M} + \frac{F_{1112}}{3F_{111}}, \]
and we eliminate $g_2$ by differentiating the definition
\[ \frac{T}{M^2}\,F_{11}\left( \frac{g(z, y)}{M}, y \right) = 2z \]
to get
\[ F_{111}\,g_2/M + F_{112} = 0. \]
Thus
\[ \frac{\partial}{\partial y}\log h(z, y) = \frac{F_{1112}}{3F_{111}} - \frac{F_{112}F_{1111}}{3F_{111}^2} = \frac{1}{3}\,\frac{\partial}{\partial x}\,\frac{F_{112}}{F_{111}}. \]
There is a similar calculation in the other cases. □
To control the triangular matrices we require further differentiability conditions to ensure the uniform continuity of the mixed partial derivative $h_{12}$. There is also a condition that the points $y_i$ lie in a shorter interval, analogous to (14.1.5). As with (14.1.5), no generality is lost, but we must subdivide the sum over i into a bounded number of shorter sums before applying the large sieve.
Lemma 14.4.2 (The domain of a type 2 matrix) Suppose that h(z, y) is non-zero and three times continuously differentiable. We state the conditions for upper and lower triangular matrices together: for P > Q put H(z, y) = h(z, y),
0=02, Z=P/Q, butforP
z' I a; a2H(z,Y)I :! Co I H(z,Y)I
(14.4.2)
for 05m<3,05n52,m+nS3,and COZm la; a2H(z,Y)I z I H(z,Y)I
(14.4.3)
form=0, n=1 andn=0, m=1, and COZ2 I H1(z, y)H12(z, y) -H11(z, y)H2(z, y)I Z IH(z, Y)I2 (14.4.4)
for the closed intervals K1 of z and K2 of y containing the rationals r/q (for
P z Q) or q/r (for P < Q) and the points yi, respectively. Here a1 and the suffix
1 in H1 denote partial differentiation with respect to the first variable, and similarly for a2 and the suffix 2. Suppose that the length of K2 is
D<1/40Co, and that
0
1
1-A<20C' Then for all pairs of minor arc vectors which satisfy the First and Second Conditions with a given upper triangular magic matrix and with given parameters
y and y', the rational numbers a/q and a'/q lie in fixed intervals of length
1 P2
X O2
IbI Q2
,
and similarly, with a given lower triangular magic matrix, the rational numbers a/q and a/q' lie in fixed intervals of length A2/ICI.
In the constructions of Chapters 7 and 9 the condition (14.4.4) corresponds to the non-vanishing of the determinants D1
=l Fl
1111
F11112
F1111 I
F1112
forP>-Q, and 3F1211 + 4F11F1111
3F11F111
F lt a
F11111
F1111
F111
F11112
F1112
F112
D2 =
for P < Q. In the construction of Chapter 8 we require the non-vanishing of the corresponding determinants with one less partial differentiation with respect to the first variable throughout. In the construction of Chapter 10 we require the non-vanishing of F112 for P z Q and D3 =
2xF11 + 4F1F11
F1
-2x
F1F111
F11
1
F1 F112
F12
0
forP- Q. For these matrices (14.3.1) becomes
I H(r/q + b, y') - H(r/q, y)1:5 A I H(r/q, y)I This gives
H(r/q + b, y') = B1H(r/q, y),
where 1-0
(14.4.5)
We approximate H(z', y) on a neighbourhood of the point (r/q, y)
in
terms of the partial derivatives. By the upper bound (14.4.2). log
H(z', y') H(z, y') + log H(z, y') H(z, y) Iz -z'I Z + CO l y -Y'15 CO
H(z', y')
5 I log
H(z, y)
Thus
H(r/q + b, y') = B2H(r/q + b, y) with log B215 COD.
We now have B1
log
log
B2
H(r/q + b, y)
H(r/q, y)
by the lower bound for H1 in (14.4.3). Thus
Z 'C2 D + Co log (1 and for Iz-z'I5b, Iy - y'I S D, log
H(z',y')
1
C1= 2COD + Co log
H(z, y)
1
1- 0 J .
(14.4.6)
In (14.4.5) we look at the interval of values of z for which (14.4.7) H(z + b, y') - H(z, y) = eH(z, y) for some e in -A 5 E :!g A. Let this interval have length E. By the mean
value theorem there is some z in the interval with
E
d H(z+b,y') dz
H(z, y)
5 2A,
so that
E
I H(z, y)I2 >- I H1(z + b, y')H(z, y) - H(z + b, y')H1(z, y)I
Rearranging the inequality, we have
IH1(z+b,y')-H1(z,y)IS
20 E,
0 IH(z,y)I+COZIH(z,y)I,
and
H1(z + b, y') - H1(z, y) = 'qH(z, y)/Z, with
InI <- (2 + CO) OZ/E.
(14.4.8)
For some 9 in 0 < 0 < 1, the left-hand side of (14.4.7) is
bH,(z+9b,y')-(y-y')H2(z,y'+ 0(y-y')), and, for some cp in 0 < cp < 1, the left-hand side of (14.4.8) is
(pb, y') - (y -y')H12(z, y' + (p(y -y')). Thus
b IH1(z + Bb, y')H12(z, y' + (P(y - y'))
-Hll(z + (pb, y')H2(z, y + 0(y -y')) I <- IH(z, y)I(A IH12(z, y' + (P(y -y'))I
+(2 + CO)
E
< A IH(z, Y)12.
IH2(z, y' + 0(y - y'))I) CO(3E CO)
exp COD.
(14.4.9)
Now for any z', y" with I z - z' 15 b, l y -y" I < D, by (14.4.2) we have b
IH(z,Y)I Zm
y)II/Zm,
< Clec'IH(z, for m < 2, n < 1, and m + n<- 2. The left-hand side of (14.4.9) is at least b I H1(z, y)H12(z, y) -H11(z,Y)H2(z,Y)I - 4bC1e2C, I H(z, y)I2/Z2 >
b
2b 0
IH(z, y)I 2
V
since C1 < 1/10CO. We deduce that
E-4Co(3+Co)
OZ2 b
which is the required result. The same argument works for lower triangular matrices when we write c for b. The lower bound for H1 is the condition (14.3.2) or (14.3.3) of Lemma 14.3.1, and the lower bound for H2 is the condition of Lemma 14.4.1. The condition (14.4.4) is new. In the case P< Q we find that H1H12 - H11H2
4M2
D1 1o/3
9T413
(F111)
T2/3 9M2
D2
and in the case P > Q H1H12-H11H2
10_3
(F111)
In the construction of Chapter 10 for P5 Q M F112 H H12 H11H2 4T2 x3(F11)" 1
and in the case P > Q 1
4M
H1H12 -H11H2
D3 x3(F11)3
Lemma 14.4.3 (Coincidences from a family of sums) Suppose that the family of sums is formed with functions F(x, y,) which satisfy (14.4.1) and the conditions of Lemmas 14.3.1, 14.3.5, and 14.4.1. Then the number of coincidences in the First and Second Conditions among minor arc vectors from the whole family
of sums withP> 1 in (14.2.4), then we have the alternative estimate
O(S(02J+ 1)IPQ + z2I2((P2 + Q2)log PQ + SPQ log2 PQ) + 5201(01 +A2 )j2p1Q2 )-
The implied constant is constructed from the bounds for the derivatives of the function F(x, y).
Proof The type 3 matrices are treated by Lemma 14.3.3 for each pair of values y and y' of the parameter, as in Lemma 14.3.5. For type 1 matrices we
fix a/q and y and use Lemma 14.4.1 to count the number of values of y'. We use Lemma 14.4.2 for type 2 matrices. For each y and y' the upper triangular matrices give
E
0 Q2 minA2
151b1« SP/Q
1 p2 SP Ibl Q2'
Q) + 1
= O(min(02 log PQ, S2)P2) +O(SP/Q) coincidences, and similarly for lower triangular matrices. The term O(SP/Q) may be omitted since SPQ >> 1 and A2 PQ >> 1 by Lemma 14.1.1.
In Lemma 10.6.5 we need the maximum number of coincidences involving a particular minor arc. Lemma 14.4.4 (Coincidences with a fixed minor arc) Under the hypotheses of Lemma 14.4.3, the number of coincidences involving a fixed minor arc with
P
Proof We treat the type 1 and 2 matrices in two ways. The first term in the minimum comes from Lemma 14.4.1. For the second estimate, we note that two minor arcs which coincide with the fixed minor arc for the same value of the parameter must coincide with one another for slightly larger values of $\Delta_1$ and $\Delta_2$. The matrix giving the coincidence is the product of a type 1 matrix and a type 2 matrix that obeys Lemma 14.3.2, since the parameter takes the same value for both minor arcs. For type 3 matrices we fix the parameter on the variable minor arc. The magic matrix has $c \ll \Delta_1 Q^2$, and r/q fixed, with
r
d
q'
q
c
cq
in (14.2.3). The integer d lies in an interval of length 3/2, so there are at most two possible values for d, and the magic matrix is fixed up to a triangular factor:
(a c
b1=r1 tao bol dJ
0
1)(c
d,
with $t \ll \Delta_2 P/Q$ by Lemma 14.3.2 again. The number of type 3 matrices which can give a coincidence at r/q is
\[ O\left( \Delta_1 Q^2\left( 1 + \Delta_2\frac{P}{Q} \right) \right) = O(\Delta_1 Q^2 + \Delta_1\Delta_2 PQ). \]
A similar argument using (14.2.2) gives
\[ O(\Delta_1 P^2 + \Delta_1\Delta_2 PQ), \]
and we take the minimum. □
15 Consecutive minor arcs
15.1 PARAMETRIZING RATIONAL POINTS
The elementary methods for the First Spacing Problem used the divide-and-conquer method, dividing the solutions of the inequalities into families parametrized by a real number t. The expressions in the different inequalities were related by differentiation with respect to t, and if one inequality held for a long interval of t, then the inequality corresponding to its derivative became sharper. The method for the Second Spacing Problem is again divide-and-conquer, dividing the solutions of the inequalities according to the magic matrix. The minor arc vectors are parametrized by rational numbers. We find a similar phenomenon of inequalities related by differentiation, and a long interval of solutions of one inequality giving a sharpening in the inequality corresponding to the derivative: a phenomenon as unexpected as it is beautiful. We give the details for the construction of Chapters 7 and 9. The construction of Chapter 8 differs only in the numbering of the derivatives and in the numerical factors in the Taylor series.
After Lemmas 14.1.1 and 14.1.2, the coincidence conditions for a pair of minor arcs $J(a/q)$, $J(a_1/q_1)$ with $Q \le q, q_1 < 2Q$ can be written as follows:
(')=( C the magic matrix M = (A
ICIStl1ggi;
)(),
(15.1.1)
n ) will be fixed in this discussion; N-1gi
5 02,
p43 ab a1b1 q
< 03,
(15.1.2)
(15.1.3)
q1
IK-K11S'&4 -
(15.1.4)
R2 R2 01 X n2 02 << 1 3 X n2 D4.
(15.1.5)
Here
in all cases.
Consecutive minor arcs
308
In (15.1.1) we have
C «D1Q2 <
a2
at D1Q2 +
42
q
,
jat
A,D<< -C+1. q
The set of possible magic matrices is independent of the order of magnitude Q. We consider an interval on the real line with endpoints e/r, f/s, consecutive fractions in some Farey sequence, with
\[ fr - es = 1. \tag{15.1.6} \]
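By Lemma 1.2.2, the rationals a/q in the interval can be written as
\[ a = eu + ft, \qquad q = ru + st \]
with (t, u) = 1. Writing
\[ M\begin{pmatrix} f & e \\ s & r \end{pmatrix} = \begin{pmatrix} f_1 & e_1 \\ s_1 & r_1 \end{pmatrix}, \]
we have
\[ \begin{pmatrix} a_1 \\ q_1 \end{pmatrix} = M\begin{pmatrix} eu + ft \\ ru + st \end{pmatrix} = \begin{pmatrix} e_1u + f_1t \\ r_1u + s_1t \end{pmatrix}. \]
We suppose that the Coincidence Conditions (15.1.2), (15.1.3), and (15.1.4) hold at a/q, but not necessarily at e/r or at f/s. Since (t, u) = 1, there are integers $\bar t$, $\bar u$ with
\[ t\bar t + u\bar u = 1. \tag{15.1.7} \]
We note the identities
\[ (eu + ft)(r\bar t - s\bar u) - (e\bar t - f\bar u)(ru + st) = 1, \tag{15.1.8} \]
and
\[ \frac{r\bar t - s\bar u}{ru + st} + \frac{\bar u}{t} = \frac{r}{t(ru + st)}. \tag{15.1.9} \]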
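The parametrization a = eu + ft, q = ru + st and the two identities that follow from it can be checked with exact arithmetic. The sketch below is illustrative, under the assumption (made in the reconstruction of this passage) that the auxiliary integers satisfy t·t̄ + u·ū = 1; the names `tbar`, `ubar` and the sample Farey pair 1/3, 1/2 are hypothetical choices for the demonstration.

```python
from fractions import Fraction
from math import gcd

def bezout(t, u):
    # extended Euclid: returns (tbar, ubar) with t*tbar + u*ubar == gcd(t, u)
    old_r, r = t, u
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        k = old_r // r
        old_r, r = r, old_r - k * r
        old_x, x = x, old_x - k * x
        old_y, y = y, old_y - k * y
    return old_x, old_y

def check_identities(e, r, f, s, t, u):
    # assumes f*r - e*s == 1 and gcd(t, u) == 1, with t and u positive
    tbar, ubar = bezout(t, u)
    a, q = e * u + f * t, r * u + s * t
    first = a * (r * tbar - s * ubar) - (e * tbar - f * ubar) * q
    second = (Fraction(r * tbar - s * ubar, q) + Fraction(ubar, t)
              - Fraction(r, t * q))
    return a, q, first, second

e, r, f, s = 1, 3, 1, 2            # 1/3 and 1/2 are adjacent: f*r - e*s = 1
for t, u in [(1, 1), (2, 3), (5, 7)]:
    a, q, first, second = check_identities(e, r, f, s, t, u)
    print(Fraction(a, q), gcd(a, q), first, second)
```

In each case `first` equals 1 and `second` equals 0, and a/q lies strictly between e/r and f/s in lowest terms, which is the content of the parametrization.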
We think of a/q and the coefficients of the Taylor polynomial as functions of t and u, so we write $b(t, u)$, $\kappa(t, u)$, $\lambda(t, u)$, $\mu(t, u)$, $\nu(t, u)$, and $m(t, u)$ for the quantities corresponding to a/q in the construction of Chapter 7:
\[ \frac{f''(m(t, u))}{2} = \frac{a}{q} + \lambda(t, u), \qquad \mu(t, u) = f^{(3)}(m(t, u))/6, \]
\[ q f'(m(t, u)) = b(t, u) + \kappa(t, u), \qquad \nu(t, u) = f^{(4)}(m(t, u))/24. \]
Parametrizing rational points
309
We use a suffix 1 for the corresponding quantities constructed from $a_1/q_1$. We use the letters b, m, $\kappa$, $\lambda$, $\mu$, $\nu$ (and these only) without argument for the values at t = 0, u = 1, which occur frequently in the formulae that follow. We introduce a new variable by
\[ n = m(t, u) - m, \qquad n_1 = m_1(t, u) - m_1. \]
How n and $n_1$ depend on t and u will be elucidated in (15.1.12) below. We take Taylor series about m and $m_1$ with n and $n_1$ as the variables of expansion. These expansions cover U consecutive minor arcs, with
\[ U \le R^2/rs, \qquad n \ll NU. \tag{15.1.10} \]
The Taylor series given are about the left-hand endpoint; similar expansions are possible about the right-hand endpoint, with sign changes which are left to the curious reader. The order of magnitude assumptions
\[ f^{(3)}(x) \asymp 1/NR^2, \qquad f^{(4)}(x) \ll |f^{(3)}(x)|/M \ll 1/MNR^2, \]
guide us as to which terms to keep, and which terms to treat as perturbations. The mean value theorem in the form
\[ f''(m + n) = f''(m) + nf^{(3)}(m) + O(n^2 \max |f^{(4)}(x)|) \]
gives
\[ \frac{2(eu + ft)}{ru + st} + 2\lambda(t, u) = \frac{2e}{r} + 2\lambda + 6n\mu + O\left( \frac{n^2}{MNR^2} \right), \]
so that, by (15.1.6),
\[ 3n\mu\left( 1 + O\left( \frac{n}{M} \right) \right) = \frac{t}{r(ru + st)} + \lambda(t, u) - \lambda. \tag{15.1.11} \]
Approximating further, we have n
3µr(ru + st)
+O( l+G - (t) +O 1+ M) ( M)
(15.1.12)
where G(x) is the continuous function
\[ G(x) = 1/3\mu r(rx + s). \tag{15.1.13} \]
We make constant use of (15.1.12) to pass from the discrete to the continuous. The mean value theorem for the first derivative gives
\[ f'(m + n) = f'(m) + nf''(m) + \frac{n^2}{2}f^{(3)}(m) + O(n^3 \max |f^{(4)}(x)|), \]
so that b(t, u) + K
ru+st
u)
2en
b+K
r
+
b+K
+
r
2en
+
n3
r
nt
+nA(t,u)+nA+ r(ru
2en
b+K
r
r
r +2nA+3n1µ+01 MNRZ)
r
n3
(
+0I
.
-
nt (15.1.14)
+r(ru+st)+ru+st
say, where we have used (15.1.11) to eliminate the non-arithmetic term in A.
The non-arithmetic residual term y in (15.1.14) depends on t and u. We analyse y later. In order of magnitude we have y n n3 ru + st - NR1 + MNR
n1 (1+)((t)+NR1(
1+M)) (15.1.15)
by (15.1.12). We shall see that y is negligible when the right-hand side of (15.1.15) is 0(1/N). We still need some further notation. We have +st)(b + 2en) + K(TU +st) + nt + b(t, u) + K(t, u) _ ( r r r bst 2nt K(ru +st) nt + r (fr-1)+ =(b+2en)u+ + +y
r
= bu + 2n(eu +ft) + H,
r
r
(15.1.16)
where
H=H(t,u)=
(bs - n)t + K(rU+st) +y. r r
(15.1.17)
When $|\kappa(t, u)| < \tfrac12$, we have
\[ b(t, u) = bu + 2n(eu + ft) + h, \qquad \kappa(t, u) = \eta, \]
with
\[ h = h(t, u) = [[H(t, u)]], \qquad \eta = \eta(t, u) = ((H(t, u))). \]
If $\kappa(t, u)$ is within O(Q/N) of $\pm\tfrac12$, then we have to consider two minor arc vectors with b(t, u) differing by one, and $\kappa(t, u)$ taking both values $\pm\tfrac12 + O(Q/N)$. The numbers h and $\eta$ are treated to correspond. We count a coincidence if either possible vector belonging to the arc J(a/q) coincides with a vector belonging to the arc $J(a_1/q_1)$. We can now state the subtle identity at the heart of this chapter.
Coincidence over a short interval
311
Lemma 15.1.1 (Rational coincidence conditions) The Third and Fourth Conditions at a/q can be put into the form n
n1
ru +st X
K - K1
u(h - h1)
t
t
(rlu +s1t) + rl
rlu+slt
r ru+st
'n1 - y1
+
t
r 5 L3,
I-i-7111SO4
(15.1.18)
(15.1.19)
Proof By (15.1.8) we have
(rt - su)b(t, u) = bu(rt - su) + 2n(eu + ft)(rt - su) + (rt - su)h = brut + sit - s) + 2n(ru + st)(et u) + 2n + (rt - su)h. Hence
rt-su
II
l nL+stb(t,u)J
=
-su ru+st +ru+st
((2n_bs
-r(2n-bs
r
-u
11
I`I1
-
ru+st +I1t(ru+st) t)hli1 2n - bs r(H - rl) uh (( ru+st + t(ru+st) t 2n-bs bs-n K r(y - q) ru +st
+-+ ru +st t t(ru +st)
+
n
K
uh
r(y- q)
ru +st
t
t
t(ru +st)
uh
t
We subtract two expressions of this type to get the Third Condition (15.1.3). The Fourth Condition (15.1.4) in this notation is 177 - 7111 «04,
so we can suppose that
Iii - -ql I
is less than 1, and that H and H1 are
approximately congruent modulo one. Thus we can write
[[H1-H]] = [[H1]] - [[H]] =h1-h, to get (15.1.18).
15.2 COINCIDENCE OVER A SHORT INTERVAL
We continue the approximation step of the divide-and-conquer method. Our task is now to show that coincidence is a property of the minor arc, not of the particular rational point a/q chosen. First we introduce a function g(x) depending on the two sets of reference fractions e/r, f/s and $e_1/r_1$, $f_1/s_1$. We may call it the Fortean function after Charles Fort (1973), the investigator of coincidences. The function g(x) plays the same role in this chapter as the function D3/2(t) in Chapter 11.
Lemma 15.2.1 (The Fortean function) Let
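\[ g(x) = \frac{G(x)}{r} - \frac{G_1(x)}{r_1} = \frac{1}{3\mu r^2(rx + s)} - \frac{1}{3\mu_1 r_1^2(r_1x + s_1)}. \tag{15.2.1} \]
Then g(x) and its derivatives can be written as
\[ g^{(k)}(x) = \frac{(-1)^k k!}{3\mu r^3}\left( \frac{1}{(s/r + x)^{k+1}} - \frac{\mu r^3}{\mu_1 r_1^3}\,\frac{r_1^{k+1}}{(r_1x + s_1)^{k+1}} \right), \]
and if the Second Condition holds at a/q = (eu + ft)/(ru + st), then, for $k \ge 0$ and x = u/t,
\[ g^{(k)}\left( \frac{u}{t} \right) \ll \frac{NR^2}{r^3}\left( \frac{rt}{Q} \right)^{k+1}\left( \Delta_2 + \frac{nQ^2}{NR^2}\Delta_1 \right). \tag{15.2.2} \]
The last factor may be simplified to $\Delta_2$ in the important case k = 2.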
Proof The first assertion follows from
\[ g^{(k)}(x) = \frac{(-1)^k r^{k-2}k!}{3\mu(rx + s)^{k+1}} - \frac{(-1)^k r_1^{k-2}k!}{3\mu_1(r_1x + s_1)^{k+1}}. \]
The Second Condition at a/q is
\[ \left| \frac{\mu_1(t, u)(r_1u + s_1t)^3}{\mu(t, u)(ru + st)^3} - 1 \right| \ll \Delta_2. \tag{15.2.3} \]
From the bound for the fourth derivative $f^{(4)}(x)$ we have
\[ \mu(t, u)/\mu = 1 + O(n/M), \tag{15.2.4} \]
and similarly for $\mu_1(t, u)$. By (15.1.1) applied to e/r and (eu + ft)/(ru + st) we have
\[ \frac{r_1u + s_1t}{ru + st} - \frac{r_1}{r} = \frac{Ct}{r(ru + st)} \ll \frac{Qt}{r}\Delta_1, \tag{15.2.5} \]
and (15.1.12) gives
\[ \frac{tR^2}{rQ} \asymp \frac{n}{N}. \tag{15.2.6} \]
We can write (15.2.5) as
\[ \frac{r_1u + s_1t}{ru + st} = \frac{r_1}{r}\left( 1 + O\left( \frac{n}{N}\,\frac{Q^2}{R^2}\Delta_1 \right) \right). \tag{15.2.7} \]
Combining (15.2.3), (15.2.4), and (15.2.7) gives (15.2.2), when we notice that
\[ \Delta_2 \gg N/M \tag{15.2.8} \]
by the condition $N^3 \ll MR^2$, $N \ll MT^{-1/4}$, which was (7.2.5). In the construction of Chapter 8, we find that (15.2.8) still holds because (8.1.5) gives $HN^2 \ll MR^2$, and $\Delta_2 \asymp R^2/HN$. The inequality (15.2.7) is not used for k = 2, when the powers of r and (ru + st) agree. □
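The derivative formula used in the proof of Lemma 15.2.1 can be sanity-checked numerically: a central difference of the k-th derivative should reproduce the (k+1)-th. The sketch below is illustrative only; the parameter values are hypothetical and carry no arithmetic meaning, and the function names are not from the text.

```python
from math import factorial

def fortean_derivative(k, x, mu, r, s, mu1, r1, s1):
    # k-th derivative of g(x) = 1/(3 mu r^2 (r x + s)) - 1/(3 mu1 r1^2 (r1 x + s1));
    # k = 0 returns g(x) itself
    scale = (-1) ** k * factorial(k)
    first = r ** (k - 2) / (3 * mu * (r * x + s) ** (k + 1))
    second = r1 ** (k - 2) / (3 * mu1 * (r1 * x + s1) ** (k + 1))
    return scale * (first - second)

def relative_gap(k, x, step=1e-5, **params):
    # compare a central difference of the k-th derivative with the (k+1)-th formula
    upper = fortean_derivative(k, x + step, **params)
    lower = fortean_derivative(k, x - step, **params)
    numeric = (upper - lower) / (2 * step)
    exact = fortean_derivative(k + 1, x, **params)
    return abs(numeric - exact) / abs(exact)

# hypothetical sample values, chosen only to exercise the formula
sample = dict(mu=0.01, r=3, s=2, mu1=0.011, r1=4, s1=3)
for k in range(3):
    print(k, relative_gap(k, 1.5, **sample))
```

The two terms of g nearly cancel when the Second Condition holds, which is why the lemma bounds the derivatives by the small factor in (15.2.2) rather than by the size of either term alone.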
Lemma 15.2.2 (The arc of coincidence) Let J(e/r) be a minor arc, with e/r the rational of least denominator subject to $r \ge R$. If the Coincidence Conditions hold at e/r, then they hold at (eu + ft)/(ru + st) whenever $n(t, u) \ll N$, but possibly weakened by a constant factor.
Proof The First Condition, with the particular magic matrix M, is assumed throughout this chapter. Since r may have order of magnitude smaller than Q, we write the Second, Third, and Fourth Conditions as Al r3
µr 3 -1<<
bs
b1s1
r
r1
2, Q
(15.2.9)
03,
(15.2.10)
K - K1 << n A4,
(15.2.11)
<<
r r
where $\Delta_2$, $\Delta_3$, and $\Delta_4$ are the sizes appropriate for denominators in the range $Q \le q < 2Q$. First we consider the Second Condition. We reverse part of the calculation of Lemma 15.2.1, deducing (15.2.3) from (15.2.9), (15.2.7), (15.2.4), and (15.2.8) when $n \ll N$.
In the Fourth Condition we have
(t r - b1sl bs
n1 )) _
-t( n
- Y1) + ru 11
1
+K1
ru+st
- rlu+s1t
st
r
(K- K1)
(15.2.12)
r1
r
By (15.2.10) and (15.2.6) II
( bs t r
- b,sl 1 II S t II bs - blsl II rl
r
rl
<<
Q
n RZ n A3 « N A 3 << N 04. r QZ
By (15.1.12), (15.2.1), and (15.2.2) with k = 0 n nll ul rrt n2
rr
tY-Y)=t8(t)+OIY(1+M) 1
NR2 rt2
n2
Q
(tR2)2QN rQ
t
r2 02 +
Qn
R2 N
n
n
NZD4+RZD4<< NL14.
<<
Since K1 is bounded, (15.2.7) gives
ru+st K1(
r
- r1u+s1t r
<<
Qn r
1
n QN
NVA2«N
R2
n n2«No4.
By (15.1.15)
y-yI
nQ /
n2
n
NR1+M)<< N04.
(15.2.13)
At this stage we have
st(K-K3)))+o(N114),
(15.2.14)
and the first term is O(04) by (15.2.11). Since n «N, we deduce that 71 -771«D4,
with the proviso that when q is close to ± 2, then we consider both choices of rl; one of them is close to i7i, the other is close to -,q1. Even in this case, the Fourth Condition holds at a/q for one of the two choices of 17. The Third Condition in the form (15.1.18) is still more complicated, but we have done much of the work already. Since 71- 71, and K - K1 are small, we can remove the fractional part brackets in (15.2.14), and deduce that
K-Kl t
r n r(rt-711) R2 GG - - A4 << 2 D4 «03, t(ru + st) Q Qt N
We have
H-H1=
ti-r bs
bls1 111
rl
+O(L4),
and (15.2.10) gives
bs/r - b1s1/r1= d + O(Q03/r) for some integer d. Thus t
h-h1= [{dt+o(-A3)II1
=dt+O R2 N 113) =dt+O(N L4).
Linearizing the Fourth Condition
315
If D4 is sufficiently small, then h - h1 = dt, and
u(h-hl)/t=ud, an integer. The other terms in (15.1.18) are treated as above. By (15.2.13)
r(y- y1) r tR2 « tQ rQ 04 «D3. t(ru +st) Since -71 - y, is bounded, (15.2.7) gives r1 (,q
yl) t(r1u +s1t)
r t(ru +st)
R2
n
r
<< tQ
NY
O2 <<
02 rGG
3.
QZ
Finally, by (15.1.12), (15.2.1), (15.2.2), and (15.2.6) n
nl
ru+st
r1u+slt
1
(1
u
n2
tg,(t)+OI NR2t << rQ2
n
n`
1
N
02(1+N)+Q+QA2 1
N
<< 2+Q+-L)2<< 3.
Thus the Third Condition (15.3.2) continues to hold at a/q, possibly weakened by a constant factor. □
After Lemma 15.2.2 we can, in a sense, drop Modification 1. If f/s and e/r are the two rational points of smallest denominator on a given minor arc, then, up to constant factors, the Coincidence Conditions of Lemma 14.1.1 hold for the y vectors corresponding to f/r and $f_1/r_1$ if and only if they hold for the y vectors corresponding to e/s and $e_1/s_1$.
15.3 LINEARIZING THE FOURTH CONDITION
After Lemma 15.2.2, we can suppose that the Coincidence Conditions (15.1.1)-(15.1.4) hold for a positive proportion of the rationals a/q with $Q \le q < 2Q$ which are values of $f''(x)/2$ on a particular minor arc, but with the '$\le$' inequality signs replaced by '$\ll$' signs, with some bounded implied constant. Our object is to remove the term in u from (15.1.18), and to replace the absolute value modulo one by an ordinary absolute value. We make a delicate approximation in the Fourth Condition (15.1.4), and use the results of Chapter 4.
Lemma 15.3.1 (Analytic coincidence conditions) Let J(a/q) be a minor arc,
on which a/q = (eu + ft)/(ru + st) is the rational number of smallest denominator subject to q >- R. If Q S q < 2Q, and the Coincidence Conditions hold at
a/q, and if
G -t Smin
N
,(MR2)1/3
(15.3.1)
B3Q
where B3 is a certain absolute constant, then x = u/t satisfies
II(a-Ox +/3-g(x)II«R2/rG(x),
(15.3.2)
la - c -g'(x)I << G(x)r/N2
(15.3.3)
for some integer c, with
a= K- K1,
(15.3.4)
(b + K)S
(b1 + K1)s1
r
r1
2A
2A1
(15.3.5)
+ 3µr
3µ1r1
Proof The rational values of f"(x)/2, a
eu + ft
e
q
ru+st
r
1
r(ru/t+s)'
lie in an interval of length x 1/R2, which corresponds to an interval in x = u/t of length (r2x2 +s2)/R2 for
t>> rQ/R2,
n>> N.
We impose the condition n 4 NU of (15.1.10), where U ::g R2/rs. By (15.1.12)
t x nrQ/NR2 << rQU/R2. As in (15.2.12), the Fourth Condition (15.1.4) is
r1u +s1t bs n n1) ru +st ---+ K1+y-y1 «A4. IIt\---)-t Kr rl r r b1s1 r1
r1
(15.3.6)
Three of the five terms on the left of (15.3.6) are linear in t and u. We combine the other two terms. By (15.1.14) we have
y(t, u)
b(t, u) + K(t, u)
b+K
2en
ru + st
ru + st
r
r
=2nA+3n2µ-
nt
nt r(ru + st ) n3
r(ru+st) +OMNRZ
= n(2A + 3µ(n - G)) + 0(n3/MNR2),
(15.3.7)
where we have written G for G(u/t). The error term in (15.3.7) is 0(1/N) when G(u/t) is less than the second term in the minimum in Lemma 15.3.1. We use (15.1.12) repeatedly to get
ru(+st =(2A+3µ(n-G))G+OIl N /I. Thus t
(
1
y(t, u) = 3µr (2A + 3µ(n - G)) + OI N )
=r(
M 3A)
When we substitute into (15.3.6), then the only non-linear term is
rG`t)+ 1G1(t) -tI
t
We write (15.3.6) as
Ilau + /3t - tg(u/t)II «04,
(15.3.8)
where a and /3 are defined by (15.3.4) and (15.3.5). Let xo be a value of u/t on the minor arc. We take a Taylor approximation to g(xo +y) with
y << (r2x2 +s2)/R2. In the remainder term we have, for some 9 in 0 < 0 < 1, ty2g"(xo + ey) <<
t(r2x2 +52)2 NR2 R4 a
r3
rt
3
(Q) A2
N
<< 4 _ R Q3 O2 « 041 by Lemma 15.2.1.
For values of u/t corresponding to a positive proportion of the minor arc under consideration, we have a linear approximation of the form
H-Hl = a'u + /3't+O(A4), with
a' = a -g'(xa),
(15.3.9)
and
R -g(xo) +xog'(x(,).
(15.3.10)
Then
IIa'u+/3'tII<<&
.
(15.3.11)
Let ao/qo be the rational number of smallest denominator which is a value of f" (x)/2 on the minor arc. By Lemma 1.6.3, for Q > B1 max(go, R2/qo)
(15.3.12)
there are R' rational values a/q of f" (x)/2 with Q1165 q < 2Q on the minor arc, lying between ao/qo and an adjacent fraction in some Farey sequence, and
R' Q2/B2R2. Here B1 and B2 are certain constants. By (15.1.12) and the assumption n >> N, the denominator t has fixed order of magnitude Q', say, with
t< Q' x rQG(u/t)/NR2, and u < 2Q/r, so that u Q _<M'> rQ'
t
We can rewrite (15.3.11) as
a'u
6'
v
t + R, - t 5 Q' for some integer v, where 6' «D4. We have set up the conditions of Theorem 4.3.3 with parameters a', (3',
6', M', N', and R'. Either R' ::g max(86'Q',1285'M'Q'2) + 1, (15.3.13) which is impossible if G(u/t) is less than the first term in the minimum in (15.3.1) of the lemma with B3 sufficiently large, or else all the solutions lie on a rational line ey = cx + d, with c, d, and e integers with no common factor, and l
ea
86'Q' rG « N2, - c sR'-1
(15 . 3 . 14)
l
and
1e13-dl <
126'M'Q' R, - 1
R2 << N r
(15.3.15)
.
By the construction in Lemma 1.6.3, the sequence of rationals u/t includes two consecutive Farey fractions u1/t1, u2/t2, so ev1 = cul + dt1,
eve = cue + dt2,
and
c = c(t1u2 - t2u1) = e(t1v2 - t2v1).
Hence e I c, and similarly e I d, and e must be 1. By (15.3.14) and (15.3.15)
Ia'u+(3't-cu-dtl <
206'M'Q'2
5
R' - 1
5 32'
since the second term in (15.3.13) gives the maximum provided that B1 is sufficiently large in (15.3.12). Thus we have
h-h1=[[H-H1]]=v=cu+dt. When we subtract cu from (15.3.8), then the nearest integer is a multiple of t, and so we can divide by t to get (15.3.2). In (15.1.18) we have
u
t(h-h1)=
c(1 - ft) t
C
+du= r - c+du.
(15.3.16)
The bounds (15.2.7) and (15.2.6) give X11
r1
t
r1u +s1t
r ru +st
1
rlu +slt
t
R2
n
r
<<
r ru +st
r1
I
A2 <<
Qt N
02 «A3.
Q2
When the Fourth Condition (15.1.19) holds, then
r6q1 - ri)
t(ru +st)
R2 N <
By (15.1.15) and (15.3.1)
yr t(nt + St)
n « rtt NR2
1+
n2
1
M < < Q<<
3.
The Second Condition in the form (15.1.18) has now been simplified to nl n K-K1-CI (15.3.17)
<< 03. ru + st
r1u + s1t
t
11
By (15.1.12), (15.2.1), (15.2.6), and (15.3.1) we have
n
nl
ru+st
r1u+sltl
(1 n211 Q(1+MJI tgl(t)+OI
1
G
u
B5 N2 R2
B4-A 25-
B5 2
2
<
B3 Q 1 B3 for some absolute constants B4 and B5, provided that we have chosen B1 Q
sufficiently large. Since
c = [[a -g'(xo)]], by (15.2.2) we have
c<<1+
NR2 r3
rt 12 R2
1-I Q
N2
n2r
<< 1+N3.
(15.3.18)
Then
K - K1 -C
1
InI R2
t
I«t+NZQ'
and
I(K-K1-C)/tl
the left of (15.3.17) is the absolute value modulo one of a number that is numerically less than one-half, so it is the ordinary absolute value. We deduce that
K-K1-C t and multiplying by t gives (15.3.3).
t
t << D3, g'(u)I
15.4 COINCIDENCE OVER A LONG INTERVAL

So far we have considered only one minor arc at a time. The parametrization was set up so that we could consider a whole interval of minor arcs, each corresponding to an interval of values of $x = u/t$. Our next lemma is one of interpretation rather than calculation.

Lemma 15.4.1 (The resonance curve) The conditions (15.3.2) and (15.3.3) of Lemma 15.3.1 mean that there is an integer point $(c, d)$ close to the curve $z = K(y)$. The function $K(y)$ is constructed from $g(x)$. We define $k(y)$ implicitly by
$$g'(k(y)) = \alpha - y, \tag{15.4.1}$$
and $K(y)$ by
$$z = K(y) = (\alpha - y)k(y) - g(k(y)) + \beta. \tag{15.4.2}$$
The functions $k(y)$ and $K(y)$ are algebraic, with finitely many branches, and each branch has finitely many maxima and minima. When a coincidence occurs at $(eu + ft)/(ru + st)$, then for $y$ defined by $k(y) = u/t$, and some integers $c$, $d$,
$$y = c + O(rG(x)/N^2), \tag{15.4.3}$$
$$z = d + O(R^2/rN). \tag{15.4.4}$$

Proof The equation (15.4.3) is (15.3.3), and (15.4.4) follows on substituting $x$ times (15.3.3) into (15.3.2).
We call z = K(y) the resonance curve because integer points on or close to it correspond to coincidences of vectors in the large sieve, and so to pairs of minor arcs which produce similar sums after Poisson summation; they may differ in length or by a phase factor. The resonance curve depends on the
magic matrix and on the two reference fractions e/r and f/s used to define the interval. The accuracy in (15.4.3) decreases as we go further from e/r. If the domain of the magic matrix is long, then we must subdivide it. We have a set of resonance curves forming short arcs in the plane, and we want to count
how often a resonance curve comes close to an integer point. In the First Spacing Problem in Chapter 12, most families do not contain an integer solution. We expect that most resonance curves do not approach an integer point.
The functions $g(x)$ and $K(y)$ are complementary functions in the sense of Chapter 4, with a shift of the origin to $y = \alpha$. We have
$$K'(y) = -k(y), \qquad K''(y) = -k'(y) = 1/g''(k(y)),$$
so that formally
$$y = \int dy = -\int g''(x)\,dx, \qquad K(y) = -\int k(y)\,dy = -\int x\,dy = \int x g''(x)\,dx.$$
The resonance curve has a cusp when $g''(x)$ changes sign.
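The relations $K'(y) = -k(y)$ and $K''(y) = 1/g''(k(y))$ can be checked numerically on a toy example. The sketch below assumes the illustrative choice $g(x) = e^x$ (so that $k(y) = \log(\alpha - y)$ in closed form) and arbitrary values of $\alpha$ and $\beta$. None of these choices come from the text; they only make the definitions (15.4.1) and (15.4.2) concrete.

```python
import math

ALPHA, BETA = 5.0, 0.25   # illustrative values only

def g(x):
    return math.exp(x)

def g2(x):
    # second derivative of g; for g = exp it is again exp
    return math.exp(x)

def k(y):
    # solves g'(k) = ALPHA - y as in (15.4.1); valid for y < ALPHA
    return math.log(ALPHA - y)

def K(y):
    # K(y) = (alpha - y) k(y) - g(k(y)) + beta as in (15.4.2)
    x = k(y)
    return (ALPHA - y) * x - g(x) + BETA

def diff(fn, y, h=1e-5):
    """Central first difference."""
    return (fn(y + h) - fn(y - h)) / (2 * h)

def diff2(fn, y, h=1e-4):
    """Central second difference."""
    return (fn(y + h) - 2 * fn(y) + fn(y - h)) / (h * h)
```

Comparing `diff(K, y)` with `-k(y)` and `diff2(K, y)` with `1 / g2(k(y))` at a point such as `y = 1.5` gives agreement up to discretization error.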
Lemma 15.4.2 (The long arc of coincidence) There is a constant $B_7$ such that, if there are $L$ minor arcs corresponding to rational numbers between $e/r$ and $f/s$ with $Q \le q < 2Q$, with
$$G(x) \le \min\left(\frac{N}{B_7\Delta_4}, (MR^2)^{1/3}\right) \tag{15.4.5}$$
on which the Coincidence Conditions hold, then there is a particular $a/q = (eu + ft)/(ru + st)$ for which the Coincidence Conditions hold in the stronger form
$$C \ll \min\left(\Delta_1Q^2, \frac{1}{L^3}\Delta_2R^2\right), \tag{15.4.6}$$
$$\frac{\mu_1(t,u)(r_1u + s_1t)^3}{\mu(t,u)(ru + st)^3} - 1 \ll \frac{\Delta_2}{L^2}, \tag{15.4.7}$$
and, for some integers $c$ and $d$,
$$\alpha - c - g'\left(\frac{u}{t}\right) \ll \frac{rG(u/t)}{LN^2}, \tag{15.4.8}$$
$$(\alpha - c)\frac{u}{t} + \beta - g\left(\frac{u}{t}\right) - d \ll \frac{R^2}{rG(u/t)}. \tag{15.4.9}$$
Proof We choose $B_7$ so that (15.4.5) implies (15.3.1). From the proof of Lemma 15.3.1, all the rational points at which the Coincidence Conditions hold correspond to points $y = u/t$, $z = v/t$ on a line $z = cy + d$. We can suppose that $L \ge 32$ (by absorbing powers of 2 into the order-of-magnitude constants in (15.4.6)-(15.4.8), if necessary, in order to cover the cases with $L \le 31$). We number these minor arcs $J_d$ in order, and pick points $x_i = u_i/t_i$ in $I$ with $Q \le ru_i + st_i < 2Q$, corresponding to minor arcs $J_{d(i)}$, with $i = 1, \ldots, 8$ for which $d(i+1) - d(i) \ge L/16$, so that $G(x_{i+1}) - G(x_i) \gg LN$. By Cauchy's mean value theorem, there are points $y_1, \ldots, y_4$ with $x_{2i-1} < y_i < x_{2i}$ for which
a-c-g'(yi)
R2
1
< LN rG(yi) ,
G'(y,) and thus
a-c-g'(yi)<<
R2
LN(ryi + s) '
a sharpening of (15.4.3) by a factor 1/L.
We repeat the process to eliminate a - c. There are points z1, z2 with Y21- 1
g"(zi) G'(zi)
R2
1
L2N2 rzi+s'
and thus R4
1
g" (zi) << L2
N(rzi + s)'s
From the formula for g"(x) in Lemma 15.2.1 we see that
s1/r1 +zi
s/r+zi
3
µr3
(rzi +S)3
µ1r1
NR2
R4
1
L2 N(rzi+s)3
1
R2
L2 N2
which implies that
µl(r1x+S1)3 µ(rx +S)3
R2 -1«LL2 N2 1
(15.4.10)
for x = z1 and x = z2. The left-hand side of (15.4.10) is monotone, since
r1x+s1
r1
rx+s
r
C
r(rx+s)'
where $C$ is an entry in the magic matrix $M$. Thus (15.4.10) holds for $z_1 \le x < z_2$, an interval corresponding to at least $L/32$ complete minor arcs. Giving $x$ a rational value $u/t$ in this interval, we have
$$\frac{\mu_1(r_1u + s_1t)^3}{\mu(ru + st)^3} - 1 \ll \frac{1}{L^2}\frac{R^2}{N^2} \ll \frac{\Delta_2}{L^2}. \tag{15.4.11}$$
Hence the left-hand side of (15.4.7) has order n
1
<
2
+
1
<<
j2
02
n3
+
1
l
j
«jy2
A 2,
by the second inequality of (15.3.1). We obtain the corresponding strengthening (15.4.6) of the First Condition on observing that r1z2 +Sl
r1z1 +S1
C
C
rz2+s
rz1+s
r(rz2+s)
r(rz1+s)
= 3µC(G(z2) - G(zl)) >> CL/R2. Thus (15.4.10) gives 1
C
R4
1
L3 N2 << L3 A2R2 <<
o
as asserted.
15.5 EXTENDING THE TAYLOR SERIES

We want the parametrization to take in as many consecutive minor arcs as possible. The upper bounds on G(x) are u
N2
N
in (15.3.1), which corresponds to the fact that we only use the Fourth Condition when applying Theorem 4.3.3, and
$$G(u/t) < (MR^2)^{1/3} \tag{15.5.1}$$
in (15.3.1), which allowed us to neglect the terms in the fourth derivative. This last condition can be relaxed, at the cost of further complication. Suppose that $f(x)$ is five times differentiable, with the natural bound $f^{(5)}(x) \ll |f^{(3)}(x)|/M^2 \ll 1/M^2NR^2$.
We take the Taylor polynomials in the first section to one degree higher, with explicit terms in $\nu = f^{(4)}(m)/24$. We replace (15.5.1) with the bound
$$n^2 \ll MR, \tag{15.5.2}$$
which implies that
$$n^4 f^{(5)}(m) \ll 1/N \tag{15.5.3}$$
in the Taylor series. The mean value theorem for the third derivative gives
$$f^{(3)}(m + n) = f^{(3)}(m) + nf^{(4)}(m) + O(n^2 \max|f^{(5)}(x)|),$$
so that
$$6\mu(t, u) = 6\mu + 24n\nu + O(n^2/M^2NR^2).$$
To eliminate µ(t, u) from the Second Condition, we write 1
1+4nv
1
p.(ru + st)3 1
11(n, +st)
-1 /A.
4nv
1-
3
-I
n2
+O m2
n2 1
+0
I
M
1 3
p.(ru +st)
-
4vt
NR2
3µ3r(ru +st)
4
+O M 3 I
Q
n2 11+ 1
M )) , (15.5.4)
where we have used (15.1.12). The Second Condition can be expressed as
$$\frac{1}{\mu(t,u)(ru + st)^3} - \frac{1}{\mu_1(t,u)(r_1u + s_1t)^3} \ll \frac{NR^2}{Q^3}\Delta_2; \tag{15.5.5}$$
the error term in (15.5.5) dominates the error term in (15.5.4) because we assume (15.5.2). We substitute (15.5.4) and the corresponding relation for µ1(t, u) to make the left-hand side of (15.5.5) into u
3
(NR2
u
Q3
1
n2
(M+MZ
where $h(x)$ is a correction to the Fortean function $g(x)$, defined by
$$h(x) = \frac{4\nu}{27\mu^3r^3(rx + s)^2} - \frac{4\nu_1}{27\mu_1^3r_1^3(r_1x + s_1)^2}. \tag{15.5.6}$$
The Second Condition implies u
g
(t)-h( _u )t«
NR2
t3NR2 3
_6T_
02
which corresponds to (15.2.2) of Lemma 15.2.1.
3A 2 ,
(ru/t _+d
(15.5.7)
Next we examine the mean value theorem for the second derivative:
$$f''(m + n) = f''(m) + nf^{(3)}(m) + \frac{n^2}{2}f^{(4)}(m) + O(n^3 \max|f^{(5)}(x)|),$$
which gives
$$\frac{2(eu + ft)}{ru + st} + 2\lambda(t, u) = \frac{2e}{r} + 2\lambda + 6n\mu + 12n^2\nu + O\left(\frac{n^3}{M^2NR^2}\right),$$
and
n-
t
2n v
3µr(ru +st)
t
=G(U)
n3
+O(1 + M2 )
Vt2
-
n3
(15.5.8)
9µ3r2(nt + st)
This gives n
nl
ru+st
rlu+slt
u u --g'(-) + -h'(-) +O(1 3), t t 4t t 3
1
(15.5.9)
where we have used (15.5.3) to estimate the error term. Finally we expand the first derivative by the mean value theorem:
$$f'(m + n) = f'(m) + nf''(m) + \frac{n^2}{2}f^{(3)}(m) + \frac{n^3}{6}f^{(4)}(m) + O(n^4 \max|f^{(5)}(x)|),$$
which gives
$$\frac{b(t, u) + \kappa(t, u)}{ru + st} = \frac{b + \kappa}{r} + \frac{2en}{r} + 2n\lambda + 3n^2\mu + 4n^3\nu + O\left(\frac{1}{N}\right)$$
by (15.5.4). From the definition (15.1.14) of y we have
y
ru +st
=2nA+3n2µ+4n3v-
nt
r(ru +st)
+O
1
(N)
= G(2A + 3µ(n - G) + 4n2v) + 0(1/N). We use (15.1.12) repeatedly to get
ru+st
=G(2A+3µ(n-G)+4G2v)+OI N1.
Thus t
2A 1
u y=-In-G(-) +-J + r t 3µ
4vt3 27µ3r3(nc +St)2
+0(i14). (15.5.10)
We can also substitute from (15.5.8) to obtain tat 2vt3
3µr
27µ3r3(ru +st)
Continuing as in Section 15.3, we have
yr t(ru +st)
u I +O(A3). -h'(4t t 1
y1r1
t(r1u +s1t)
(15.5.11)
We are now ready to consider the Fourth Condition. From (15.5.10) u 2ltt u nt nit 2A1t
Y- y1= r -
tg(t) + 3µr
r1
3µ1r1
+th(t) +O(04).
To linearize as in Section 15.3 we require
$$ty^2(g''(x) - h''(x)) \ll \Delta_4, \tag{15.5.12}$$
where $x$ is a value corresponding to the minor arc considered, and $y \ll (r^2x^2 + s^2)/R^2$. The estimate (15.5.7) has the same strength as Lemma 15.2.1, so (15.5.12) follows from the Second Condition in the form (15.5.7). We have a linear approximation of the same form as in Section 15.3 with
$$\alpha' = \alpha - g'(x_0) + h'(x_0), \qquad \beta' = \beta - g(x_0) + h(x_0) + x_0g'(x_0) - x_0h'(x_0).$$
If (15.3.11) and the first inequality of (15.3.1) hold, then we recover the bounds (15.3.14) and (15.3.15) for $\|\alpha'\|$ and $\|\beta'\|$. We can still divide by $t$ in the Fourth Condition. We turn to the Third Condition in the form (15.1.18). When the Fourth Condition holds, then the terms involving $\eta$ and $\eta_1$ are $O(\Delta_3)$, and the term in $u$ is given by (15.3.16). The left-hand side of (15.1.18) is
II
n
n1
ru +st
r1u +s1t 1
u
+
a-c t
+
u
1
-
yr t(ru +st)
tg'(t+th'(t+
y1r1
T(-r1u -+s it)
a-c t
(15.5.13)
+O(A3),
by (15.5.9) and (15.5.11). As in Section 15.3, we have
t
gl
( t ) I < 8'
and
I(a - c)/tI < s, if B1 and B3 are sufficiently large and the first inequality of (15.3.1) holds. By (15.5.6) and (15.5.3) we have u « -h'(-) t t 1
vt2
µ3r2(ru +st)3
I
u «-G2(-) <<MQ B1 t 1
1
and so if B1 is sufficiently large, then the sum of the three terms in (15.5.13) is less than one-half, and we may replace the absolute value modulo one in (15.5.13) by the ordinary absolute value. We can now reassert Lemma 15.3.1 in a modified form. Again, a similar calculation works for the construction of
Chapter 8, but with different numerical factors associated with the nth derivatives.
Lemma 15.5.1 (Higher analytic coincidence conditions) Let $J(a/q)$ be a minor arc, on which $a/q = (eu + ft)/(ru + st)$ is the rational number of smallest denominator subject to $q \ge R$. If $Q \le q < 2Q$, and the Coincidence Conditions hold at $a/q$, and if
$$G\left(\frac{u}{t}\right) < \min\left(\frac{N}{B_3\Delta_4}, (MR)^{1/2}\right), \tag{15.5.14}$$
where $B_3$ is a certain absolute constant, then $x = u/t$ satisfies
$$\|(\alpha - c)x + \beta - g(x) + h(x)\| \ll R^2/rG(x), \tag{15.5.15}$$
$$|\alpha - c - g'(x) + h'(x)| \ll rG(x)/N^2 \tag{15.5.16}$$
for some integer $c$, with $\alpha$ and $\beta$ as in Lemma 15.3.1.
Lemma 15.4.1 continues to hold with the modification that g(x) is replaced by the more accurate Fortean function g(x) - h(x) at all occurrences. There is one change in Lemma 15.4.2. Because we have a more accurate Fortean function, (15.4.10) in the proof of Lemma 15.4.2 is replaced by
µ1(t, u)(rlu +s1t)3
µ(t, u)(ru +st)
3
1
<
R2
1(
NZ+M
1+M n2
using (15.5.7). This is just what we want to prove (15.4.7) of the lemma, but not so convenient for (15.4.6). From the bound for the fourth derivative we have
µ(t1, u1) p.(t2,u2)
-1<
and similarly for $\mu_1(t, u)$. Because we have replaced the bound (15.3.1) by (15.5.14), we cannot absorb $O(LN/M)$ into $O(\Delta_2/L^2)$. The new lemma is as follows.
Lemma 15.5.2 (The very long arc of coincidence) There is a constant $B_7$ such that, if there are $L$ minor arcs corresponding to rational numbers between $e/r$ and $f/s$ with $Q \le q < 2Q$, with
$$G(x) < \min\left(\frac{N}{B_7\Delta_4}, (MR)^{1/2}\right) \tag{15.5.17}$$
on which the Coincidence Conditions hold, then there is a particular $a/q = (eu + ft)/(ru + st)$ for which the Coincidence Conditions hold in the stronger form
$$C \ll \min\left(\Delta_1Q^2, \frac{1}{L^3}\Delta_2R^2 + \frac{NR^2}{M}\right), \tag{15.5.18}$$
$$\frac{\mu_1(t,u)(r_1u + s_1t)^3}{\mu(t,u)(ru + st)^3} - 1 \ll \frac{\Delta_2}{L^2}, \tag{15.5.19}$$
and for some integers $c$ and $d$
$$\alpha - c - g'\left(\frac{u}{t}\right) + h'\left(\frac{u}{t}\right) \ll \frac{rG(u/t)}{LN^2}, \tag{15.5.20}$$
$$(\alpha - c)\frac{u}{t} + \beta - g\left(\frac{u}{t}\right) + h\left(\frac{u}{t}\right) - d \ll \frac{R^2}{rG(u/t)}. \tag{15.5.21}$$
16 The Third and Fourth Conditions

16.1 COUNTING COINCIDENT PAIRS OF MINOR ARCS

We put together the ideas of the two previous chapters. We want to count coincident pairs among the vectors y constructed from polynomial approximations on the minor arcs. The parameters of the method, M and T, and also H in Chapters 8 and 9 are immutable. The choice of N, the length of the short sums, was made at the outset, and determines R, and so which arcs are major, which are minor. Some fine tuning can be done before applying the large sieve. In each range $Q \le q \le 2Q$ for the denominator of the rational number a/q, we can subdivide into a finite number of cases in which $P \le a < (1 + \delta)P$ for some $\delta \gg 1$. We have also taken an extra parameter $V \ge 1$ in the First Condition in order to restrict the set of magic matrices.

The first type of magic matrix consists of the identity (and perhaps a bounded number of other matrices), which gives a coincidence on every Farey arc. For each magic matrix of types 2 and 3, our aim is to show that minor arcs on which the Second, Third, and Fourth Conditions hold, and the First Condition with this particular magic matrix, are widely separated. We construct a set of reference fractions as in Section 1.6 of Chapter 1, with consecutive reference fractions being about $U/R^2$ apart (where the integer U is a free parameter), so that the interval between two consecutive reference fractions corresponds to (parts of) at least U consecutive Farey arcs. We could make U depend on the length of the arc of coincidence, in the sense of Chapter 15, sharpening the counting argument, but not in the dominant term. We call the intervals between consecutive reference fractions reference intervals. The reference intervals will be chosen no longer than the domain of the magic matrix (in the sense of Chapter 14).

We apply Chapter 15 to the rationals lying in a reference interval. The reference fraction with the larger denominator will be e/r. By the construction of Chapter 1, if e/r is a left-hand endpoint, then there is another reference fraction f/s with f/s - e/r = 1/rs, and similarly if e/r is a right-hand endpoint. We have
r», R
U
1
RZ
rs
(16.1.1)
In the construction of Chapter 15 we have G(x) << NU. Hence we can satisfy
the conditions of Lemma 15.4.2 when (
1
N
r
NUSB1-min -, -,(MR2) 1/3
(16.1.2)
A2 A4
for some constant B1. We assume that R
1
2/3
U SB2mi n ( NA 2 )
1/3
MR2
1
'A4' (
(16 . 1 . 3)
N3 )
for some constant $B_2 \ge B_1$. By (16.1.1) we see that (16.1.3) implies (16.1.2) uniformly in the reference fraction e/r when $B_2$ is sufficiently large. We use the notation $\delta = (M_2 - M)/M$ of Chapter 14, and we suppose that
$$\delta M/N \gg U, \tag{16.1.4}$$
so that the reference intervals form a non-trivial subdivision of the range for $f''(x)/2$.

Lemma 16.1.1 (Type 2 matrices) The number of pairs of minor arcs that coincide in all four conditions, given by triangular matrices, is O min(U
A2,
62) R (P2 + Q2)
Proof We use the Riesz interchange principle. Instead of counting for each type 2 matrix how many coincidences it gives in the Second, Third, and Fourth Conditions, we consider, for each reference interval, how many matrices give many coincidences. There are $O(\delta M/NU)$ reference intervals, each corresponding to $\asymp U$ minor arcs. For a particular reference interval and a particular magic matrix, suppose that the number of minor arcs in the reference interval for which the matrix gives a coincidence lies in the range $L, \ldots, 2L - 1$. By Lemma 15.4.2, the Second Condition holds for one of these minor arcs with $\Delta_2$ replaced by $O(\Delta_2/L^2)$. By Lemma 14.3.2, we have $B \ll$
min(
02, 6
L2
C<<
Q
mint
j
26)
for upper and lower triangular matrices respectively. In the upper triangular case, these matrices give O
P 1 L-min(L0216L
coincidences. Summing L through powers of 2 with L << U, we find that there are at most P Q min(02, 6U)
0
coincidences for minor arcs in this reference interval given by upper triangular matrices. The result follows on summing over reference intervals. □

Lemma 16.1.2 (Type 3 matrices) The number of coincidences between pairs of minor arcs given by type 3 matrices is 1
P2R4
1
I W 0/3 )A2
11 2 r(+)62
If $\delta$ is small, then we can replace this bound by O
P2R4
r1
PR2 log2 R Q
1
Q2 + I U+ v2/3) A2
in the notation of Lemma 14.3.4.
`
Proof We argue as in the last lemma. Suppose that a type 3 matrix gives between $L$ and $2L - 1$ coincidences for the minor arcs belonging to some reference interval. Lemma 15.4.2 tells us that the Second Condition holds with $\Delta_2$ replaced by $O(\Delta_2/L^2)$. The domain of the magic matrix in Lemma 14.3.3 extends over $O(\Delta_2R^2/L^2|C|)$ consecutive minor arcs, spread over $O(1 + \Delta_2R^2/L^2U|C|)$ reference intervals. The number of magic matrices with $C$ in a range $K \le |C| < 2K$ is $O(K^2P^2/Q^2)$, and for $\delta$ small it is also
O
62K2P2 Q2
SKP log R Q
+
+tl2logR+
52K2P2
K
=0
v2
+
Q2
KP log R Q
in the notation of Lemma 14.3.4. Here K is a power of 2 satisfying
(1 1 r R21 K<< min(L11Q2,L12I'3)«02R2min( V,L). For fixed K and L, there are R2 DLU Or(L+2-k)
l 52K2P2
KPlogR 11
(16.1.5)
Q + 11 pairs of coincident minor arcs in this range. For fixed L with L << V 1/3 we sum K over powers of 2 with K<< A2R2/V to get
0I
(L+) V
r
Q2
PR2log2R
P2R4
S2L12
Q2V2 +
L12
QV
, 1
and then sum L over powers` of 2 with L << V'1' to get 1
0 (V5/3 +
1
P2R4 Q2
+L12
PR2V log R QV
pairs of coincident minor arcs. For fixed L with L >> V 1"3 we sum K in (16.1.5) over powers of two with K << O2 R2/L3 to get (
OL+
5 2 2 P2R4 Q2L6 +
L2
A2
U
and then sum L over powers of 2 with L >> 1
O(
1
) 82A2 SO2
P2R4 QZ
PR2log2R QL3
V113 2
+L)
to get PR2 loge R Q
pairs of coincident minor arcs. These two estimates give the second result of
the lemma. We can also replace S by 1, and drop the second term in the second bracket of (14.1.5) and the expressions that follow. This gives the other result of the lemma. Lemma 16.1.3 (Coincidences in all four conditions) Suppose that the conditions of Lemma 14.3.5 hold, and that the system of reference fractions is chosen
so that each reference interval corresponds to at least U and to at most C1U consecutive Farey arcs (where the constant C1 may be chosen), and that R
1
U< B3 min
2/3
MRZ
1
, Q4 ,
N3
A2N)
1'3
SM
,V2'3, N
,
(16.1.6)
where $B_3$ is a sufficiently large constant, depending on $C_1$. Then (for minor arc vectors from the same sum in the case of a family of sums) the number of coincidences between minor arcs which can be written as $J(a/q)$ with
$$P \le |a| < 2P, \qquad Q \le q < 2Q,$$
is
U02(P2+Q2)+ UA102P2Q2I).
(16.1.7)
I
We also have the estimate 2
O n2 (6PQ+minlU- A2) 82)(P2+Q2) + U O2PQ log2 R +
U 0102P2Q2)
(16.1.8)
in the notation of Lemma 14.3.4, and this is better when $\delta$ is small. Our next two bounds do not require the condition $U \le V^{2/3}$ in (16.1.6). For $\Delta_1$ so small that $4\Delta_1Q^2 \le 1$, the bound becomes
/1 l1
(PQ + min( U O2, 62)P2 I I,
(16.1.9)
and for Al so small that B401P2 < 1, where B4 is a constant constructed from the bounds for the derivatives of the function F(x), then the bound becomes rS R2 (16.1.10) O Q2I SPQ+mint UD2,S2Q211.
The implied constants are constructed from the derivatives of the function F(x).
Proof This follows from Lemmas 16.1.1 and 16.1.2 when we note that $\Delta_1Q^2 \asymp \Delta_2R^2/V$. If $4\Delta_1Q^2 < 1$, then all magic matrices are upper triangular, and if $\Delta_1P^2$ is sufficiently small, then all magic matrices are lower triangular. In both these cases the terms from Lemma 16.1.2 do not occur.

Lemma 16.1.4 (Coincidences from a family of sums) Suppose that $\delta PQ \gg 1$, and that the family of sums is formed with functions F(x, y,) satisfying (14.4.1), and the conditions of Lemmas 14.3.1, 14.3.5, and 14.4.1. Suppose that the system of reference fractions is chosen so that each reference interval corresponds to at least $U$ and to at most $C_1U$ consecutive Farey arcs (where the constant $C_1$ may be chosen), and that
R 2/3 1 mR U<_B5 min (-) , Q4,N2 1
1/2
M 2/3
SN1
SM
log M, N
(16.1.11)
where $B_5$ is a sufficiently large constant, depending on $C_1$. Then the number of coincidences between minor arcs which can be written as $J(a/q)$ with
$$P \le |a| < 2P, \qquad Q \le q < 2Q,$$
is 2
UA2J)+UAZIZ(PZ+QZ)logPQ+ UAl02I2P2Q2)J. (16.1.12) We also have the estimate involving 8: z
O QZ (ISPQI 1 + U OZJ) + U t2I2((P2 + Q2)log PQ +SPQ log2 PQ) 1
SZ
1
+ U Al A2J2p2Q2)
.
(16.1.13)
Our next two bounds do not require the condition U< V 2/3 in (16.1.11). For Al so small that 401Q2 < 1, the bound becomes / (RZ 1 OI Q2 SIPQI 1 + U A2J) + min( U O2 log PQ, 62) I2P211, (16.1.14) I
and for Al so small that B4 A, P2 S 1, where B4 is a constant constructed from the bounds for the derivatives of the function F(x, y), then the bound becomes
0 (Q21 SIPQ(1+ UA2J) +min(UA2logPQ,S2JI2Q2I). (16.1.15) The implied constants are constructed from the derivatives of the function F(x, y). Proof In the argument leading to the previous lemma, we use Lemma 15.5.2 in place of Lemma 15.4.2 and Lemmas 14.4.1 and 14.4.2 in place of Lemma 14.3.2 as in the proof of Lemma 14.4.3. Lemma 14.4.2 introduces a new case: triangular matrices with a large entry off the diagonal, with a domain smaller than the whole range for f"(x)/2. By analogy with (16.1.5) of Lemma 16.1.2, triangular matrices with K< Cl I5 2K - 1 have K << min
,
P DZ R2 L3 SQ
,
and they continue KLU
))
coincident pairs belonging to sets of between L and 2L - 1 coincidences in
intervals between consecutive reference fractions. We sum over L << (A2R
2/K)1/3 to get
0((K2A2R2)1/3
+ Q2R2/U),
and then over K << SQ/P to get 2Q
O
Q 2R2
(s
)1/3
+ U log Q R2
coincidences. The condition $U \ll (\Delta_2M/\delta N)^{2/3} \log M$ in (16.1.11) ensures that the first term in this bound can be omitted. We argue similarly for upper triangular matrices: the same condition is necessary. We have simplified the upper bounds by replacing the minimum denominator $v_0$ by its lower bound $v_0 \gg \max(1, Q/P)$. □
Lemma 16.1.5 (Limitation result) There is a constant B6 such that for S=
MZMM >_ Bb min
( i, Ql
(16.1.16)
the number of coincidences between minor arc vectors corresponding to rationals $a/q$ with $Q \le q < 4Q$ has order of magnitude
$$\gg \delta PQ + \delta \min(\delta, \Delta_2)\Delta_3\Delta_4(P^2 + Q^2) + \delta^2\Delta_1\Delta_2\Delta_3\Delta_4P^2Q^2. \tag{16.1.17}$$
Proof We consider the construction of Chapter 7 and the case $P \ge Q$. For $Q > P$ we count rationals $q/a$ and consider lower triangular matrices instead of upper triangular matrices. By Lemma 1.6.2, the number of possible rational numbers $a/q$ in the range of $f''(x)/2$ is
$$\frac{\delta TQ^2}{2C_3M^2} - 4Q,$$
where $C_3$ is the constant defined in Chapter 7. If $B_6$ is large enough in (16.1.16), then this number is $\gg \delta PQ$. The first term in (16.1.17) is the contribution of the identity matrix. The third term in (16.1.17) comes from the usual box argument. Since the numbers $\Delta_i$ are bounded, then we can divide the appropriate region of four-dimensional space into $K$ boxes, in such a way that two minor arc vectors in the same box must register a coincidence, and $K \ll 1/\Delta_1\Delta_2\Delta_3\Delta_4$. Suppose that $W_k$ vectors fall into the $k$th box. The number of coincidences is at least
$$\sum_k W_k^2 \ge \frac{1}{K}\Big(\sum_k W_k\Big)^2 \gg \Delta_1\Delta_2\Delta_3\Delta_4(\delta PQ)^2.$$
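The box argument can be illustrated with a small simulation. The sketch below drops random vectors into a grid of boxes and compares the number of ordered same-box pairs, $\sum_k W_k^2$, with the Cauchy-Schwarz lower bound $(\sum_k W_k)^2/K$. The box side lengths play the role of the tolerances $\Delta_i$; the data, names, and tolerances are purely illustrative assumptions.

```python
from collections import Counter
import random

def box_counts(vectors, side):
    """Count vectors per box; side[i] is the box width in coordinate i."""
    return Counter(tuple(int(x // s) for x, s in zip(v, side)) for v in vectors)

def coincidences_lower_bound(vectors, side, num_boxes):
    """Return (sum of W_k^2, Cauchy-Schwarz lower bound (sum W_k)^2 / K)."""
    counts = box_counts(vectors, side)
    # ordered pairs in the same box, each vector paired with itself included
    same_box_pairs = sum(w * w for w in counts.values())
    cauchy_bound = len(vectors) ** 2 / num_boxes
    return same_box_pairs, cauchy_bound
```

For example, with 500 random points of the unit cube in four dimensions and sides `(0.5, 0.5, 0.25, 0.25)` there are `K = 64` boxes, and the first returned value is never smaller than the second.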
To consider upper triangular matrices, we divide three-dimensional space corresponding to $a/q$ and the third and fourth entries of the minor arc vector into $K$ boxes, such that if two minor arc vectors with the same denominator $q$ fall into the same box, then there is a coincidence in the Second, Third, and Fourth Conditions. The number of boxes satisfies
$$K \ll \delta/\Delta_2\Delta_3\Delta_4$$
for $\delta \ge \Delta_2$, and
$$K \ll 1/\Delta_3\Delta_4$$
for $\delta < \Delta_2$. Suppose that $W_k(b, q)$ vectors corresponding to rational numbers $a/q$ with $a \equiv b \pmod q$ in the range of $f''(x)/2$ fall into the $k$th box. The number of coincidences is at least
$$\sum_k \sum_q \sum_{b \bmod q} W_k(b,q)^2 \ge \Big(K \sum_q \sum_{b \bmod q} 1\Big)^{-1}\Big(\sum_k \sum_q \sum_{b \bmod q} W_k(b,q)\Big)^2 \gg \frac{\delta^2P^2Q^2}{Q^2}\,\Delta_3\Delta_4 \min\Big(1, \frac{\Delta_2}{\delta}\Big).$$
We argue similarly with lower triangular matrices.
The form (16.1.17) of the result follows because the maximum of four expressions is at least one-quarter of their sum. □
16.2 SUMS WITH CONGRUENCE CONDITIONS

Investigations in number theory often produce sums of the form
$$\sum_m a(m)e(f(m)),$$
where $a(m)$ is a periodic function that depends only on the residue of $m$ modulo some fixed integer $k$. We can write the sum as
$$\sum_{l \bmod k} a(l) \sum_n e(f(kn + l)), \tag{16.2.1}$$
or, by the finite Fourier transform, as
$$\frac{1}{k} \sum_{l \bmod k} a(l) \sum_{h \bmod k} e\left(-\frac{hl}{k}\right) \sum_m e\left(f(m) + \frac{hm}{k}\right). \tag{16.2.2}$$
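The two decompositions can be compared numerically. The following sketch evaluates the original weighted sum, the splitting into residue classes as in (16.2.1), and the finite Fourier transform form as in (16.2.2), using $e(x) = \exp(2\pi i x)$. The weights, the phase function, and the summation range in the usage note are arbitrary illustrative choices, not taken from the text.

```python
import cmath
import math

def e(x):
    """e(x) = exp(2*pi*i*x)."""
    return cmath.exp(2j * math.pi * x)

def direct_sum(a, f, k, M, M2):
    """Sum of a(m) e(f(m)) for M <= m <= M2, with a periodic modulo k."""
    return sum(a[m % k] * e(f(m)) for m in range(M, M2 + 1))

def by_residue_classes(a, f, k, M, M2):
    """Form (16.2.1): group the terms by the residue l of m modulo k."""
    total = 0
    for l in range(k):
        total += a[l] * sum(e(f(m)) for m in range(M, M2 + 1) if m % k == l)
    return total

def by_fourier(a, f, k, M, M2):
    """Form (16.2.2): detect m = l (mod k) with additive characters."""
    total = 0
    for l in range(k):
        for h in range(k):
            twisted = sum(e(f(m) + h * m / k) for m in range(M, M2 + 1))
            total += a[l] * e(-h * l / k) * twisted
    return total / k
```

With, say, `a = [1.0, -1.0, 0.5, 2.0, 0.0]`, `f = lambda x: 7.3 * math.sqrt(x)` and the range 50 to 120, the three functions agree up to rounding error.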
When $m$ has size $m \asymp M$, then (16.2.1) gives a family of $k$ exponential sums with summand $n$ of size $n \asymp M/k$, and (16.2.2) gives a family of $k$ sums with summand $m \asymp M$. The sums are shorter in (16.2.1), but there may be cancellation in the sum over $l$ in (16.2.2) when we evaluate the finite Fourier transform. In each case we have a family of sums which differ by a very small perturbation, so small that the centres of the Farey arcs $J(a/q)$ are the same for each sum of the family. The corresponding minor arc vectors differ only in the terms from the first derivative. The sum (16.2.1) corresponds (with a change of notation) to the sums
$$S_h = \sum_{m=M}^{M_2} e(f(m + h/k)), \tag{16.2.3}$$
and the sum (16.2.2) to
$$S_h = \sum_{m=M}^{M_2} e(f(m) + hm/k). \tag{16.2.4}$$
We call (16.2.3) and (16.2.4) congruence families of sums. The first derivative in $S_0$, $f'(m) = (b + \kappa)/q$, corresponds to the first derivative in $S_h$, being
$$\frac{b + \kappa}{q} + \frac{2ah}{kq} + O\left(\frac{1}{NR^2}\right)$$
in (16.2.3), and
$$\frac{b + \kappa}{q} + \frac{h}{k}$$
in (16.2.4). Hence $\kappa$ is replaced by
$$\kappa + \frac{2ah}{k}$$
modulo one in the sum (16.2.3), and by
$$\kappa + \frac{hq}{k}$$
in the sum (16.2.4). The condition corresponding to $\delta PQ \gg 1$ is
$$\delta PQ \gg k. \tag{16.2.5}$$
Lemma 16.2.1 (Type 1 matrices acting on congruence families) For a congruence family of sums of the form (16.2.3) or (16.2.4), satisfying (16.2.5), the number of coincidences between pairs of minor arc vectors indexed by rational numbers $a/q$ with $P \le |a| < 2P$, $Q \le q < 2Q$ is $O(\delta(\Delta_4k + d(k))kPQ)$.

Proof We consider the sums (16.2.4). The division of the sum into Farey arcs, and the choice of rational approximations $a/q$ to $f''(x)/2$, is the same for each sum of the family. In the Fourth Condition, $\kappa$ in the sum $S_0$ corresponds to $((\kappa + hq/k))$ modulo one in the sum $S_h$. Let $l = (k, q)$. There are $k/l$ different numbers of the form $((\kappa + hq/k))$, spaced $l/k$ apart modulo one, each occurring $l$ times. The number of coincidences between different sums $S_h$ with the same $a/q$ (given by the identity matrix) is
$$O\left(l\left(1 + \frac{\Delta_4k}{l}\right)\right) = O(l + \Delta_4k).$$
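The count of distinct shifts is easy to verify by direct enumeration. The sketch below lists the fractional parts of $hq/k$ for $h = 0, \ldots, k-1$ with their multiplicities; with $l = (k, q)$ one expects $k/l$ distinct values, spaced $l/k$ apart, each occurring $l$ times. The helper name is illustrative and not from the text.

```python
from collections import Counter
from fractions import Fraction
from math import gcd

def shifted_values(k, q):
    """Fractional parts of h*q/k for h = 0, ..., k-1, with multiplicities."""
    return Counter(Fraction(h * q, k) % 1 for h in range(k))
```

For `k = 12` and `q = 8`, so that `l = 4`, the distinct values are 0, 1/3 and 2/3, and each occurs four times.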
For a type 1 matrix other than the identity, we argue similarly. If the vector $y(a/q, h)$ coincides with both $y(a'/q', h')$ and $y(a'/q', h'')$, then
$$\left\|\kappa + \frac{hq}{k} - \kappa' - \frac{h'q'}{k}\right\| \le \Delta_4,$$
and similarly for $h''$, so that
$$\left\|\frac{(h' - h'')q'}{k}\right\| \le 2\Delta_4,$$
and, for given $a/q$ and $h$, $a'/q'$ is fixed by the magic matrix, and there are $O(l' + \Delta_4k)$ choices for $h'$, where $l' = (k, q')$. For each factor $l$ of $k$, let $M(l)$ be the number of minor arc vectors with $l \mid q$, and let $N(l)$ be the number of minor arc vectors with $(k, q) = l$. Then, for $r \mid k$,
$$M(r) = \sum_{l \mid k,\ r \mid l} N(l),$$
and
$$\sum_{l \mid k} lN(l) = \sum_{l \mid k} N(l) \sum_{r \mid l} \varphi(r) = \sum_{r \mid k} \varphi(r)M(r).$$
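The divisor-sum identity rests on $\sum_{r \mid l} \varphi(r) = l$. As a sanity check, the sketch below computes both sides for a hypothetical list of denominators $q$ standing in for the minor arc vectors; the function names and the sample data are assumptions made for illustration.

```python
from math import gcd

def phi(n):
    """Euler's totient function by direct count."""
    return sum(1 for j in range(1, n + 1) if gcd(j, n) == 1)

def divisors(k):
    return [d for d in range(1, k + 1) if k % d == 0]

def check_identity(k, denominators):
    """Return (sum of l*N(l), sum of phi(r)*M(r)) over divisors of k."""
    # N(l): number of q with gcd(k, q) = l; M(r): number of q divisible by r
    N = {l: sum(1 for q in denominators if gcd(k, q) == l) for l in divisors(k)}
    M = {r: sum(1 for q in denominators if q % r == 0) for r in divisors(k)}
    lhs = sum(l * N[l] for l in divisors(k))
    rhs = sum(phi(r) * M[r] for r in divisors(k))
    return lhs, rhs
```

Calling `check_identity(12, range(20, 60))` returns two equal numbers.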
By Lemma 1.2.3 we have
r ram (7Q)2
M(r)=0I
+1J
WRY
=0(
SPQ
+1).
We multiply by $l + \Delta_4k$, and sum over the factors $l$ of $k$, to get
$$\sum_{l \mid k} O((l + \Delta_4k)N(l)) = O\Big(\sum_{r \mid k} \varphi(r)M(r) + \Delta_4kM(1)\Big) = O(\delta PQd(k) + k + \delta\Delta_4kPQ + \Delta_4k).$$
For the family of sums (16.2.3), we argue similarly with $l = (2a, k)$. The number of minor arc vectors with $r \mid 2a$ is $O(\delta PQ/r + 1)$ again, by Lemma 1.2.3. □
Lemma 16.2.2 (Coincidences for a congruence family) Suppose that (16.2.5) and the conditions of Lemma 16.1.3 hold. Then the number of coincidences between minor arcs among sums of a congruence family of the form (16.2.3) or (16.2.4) which can be written as $J(a/q)$ with $P \le |a| < 2P$, $Q \le q < 2Q$, is
O(Q ((A4k + d(k))kPQ + U Azkz(Pz + Q2) + U Al O2k2PZQz)) . (16.2.6) We also have the estimate z
O
(
Q2I
S(L14k+d(k))kPQ+min(Uz2,52)kz(P2+Qz) z
1
+ U O2k2PQ log2 R +
1
U A, A2k2PZQ2J
,
(16.2.7)
in the notation of Lemma 14.3.4, and this is better when $\delta$ is small. For $\Delta_1$ so small that $4\Delta_1Q^2 < 1$, the bound becomes
O(Qz (S(04k+d(k))kPQ+min(U
O21
82)k2P2) I,
(16.2.8)
and for $\Delta_1$ so small that $B_4\Delta_1P^2 < 1$, where $B_4$ is a constant constructed from the derivatives of the function $F(x)$ as in Lemma 16.1.3, then the bound becomes
(A4k+d(k))kPQ+min(UDz,Sz)
(16.2.9)
The implied constants are constructed from the derivatives of the function F(x).
Proof For type 1 matrices we use the previous lemma. For types 2 and 3 matrices, we argue as in Lemma 16.1.3 for each pair of values $h$ and $h'$.

16.3 ELIMINATING THE CENTRES OF THE ARCS

The construction of the approximating polynomial depends on the choice of a centre for the Farey arc. Surprisingly, the Coincidence Conditions can be restated in a slightly weaker form in which the centre $m$ of the Farey arc is not implicit. In the construction of Chapter 7 with $f(x) = TF(x/M)$ as usual, we define a function $g(y)$ by
$$2y = f''(g(y)). \tag{16.3.1}$$
The centre of the Farey arc $I(a/q)$ is the integer $m$ with
$$m = [[g(a/q)]]. \tag{16.3.2}$$
First we note that
µ=6.(3)(m)=bf(3)(g(a))(1+0(N1
)).
We can change the notation to
$$\mu = \tfrac{1}{6}f^{(3)}\left(g\left(\frac{a}{q}\right)\right) \tag{16.3.3}$$
with a negligible change in the y vector. Hence we can define $\mu$ by (16.3.3) in the Coincidence Conditions.
Kolesnik noticed that the implicit $m$ can be eliminated from the Fourth Condition, and, with some loss of accuracy, from the Third Condition also.

Lemma 16.3.1 (Semi-analytic coincidence conditions) The Coincidence Conditions of Lemma 14.1.1 imply
af'(g(q )) + 2gg(q) -a'f l (g(q )) 1
«01+A3+ 04++ q
_27g(4)
dal
(16.3.4)
NR'
q
and
$$\left\|qf'\left(g\left(\frac{a}{q}\right)\right) - 2ag\left(\frac{a}{q}\right) - q'f'\left(g\left(\frac{a'}{q'}\right)\right) + 2a'g\left(\frac{a'}{q'}\right)\right\| \ll \Delta_4 + \frac{q}{NR^2}. \tag{16.3.5}$$
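Proof By the mean value theorem we have
$$f'(m) = f'\left(g\left(\frac{a}{q}\right)\right) + \left(m - g\left(\frac{a}{q}\right)\right)f''\left(g\left(\frac{a}{q}\right)\right) + O(|f^{(3)}(x)|) = f'\left(g\left(\frac{a}{q}\right)\right) + \left(m - g\left(\frac{a}{q}\right)\right)\frac{2a}{q} + O(1/NR^2).$$
Hence
$$\kappa \equiv qf'(m) \equiv qf'\left(g\left(\frac{a}{q}\right)\right) - 2ag\left(\frac{a}{q}\right) + O\left(\frac{q}{NR^2}\right)$$
modulo one. Thus (for one of the two choices of $\kappa$ if $((\kappa))$ is close to $\pm\frac{1}{2}$) we have
$$\kappa = \left(\left(qf'\left(g\left(\frac{a}{q}\right)\right) - 2ag\left(\frac{a}{q}\right)\right)\right) + O\left(\frac{q}{NR^2}\right)$$
and so the Fourth Condition implies (16.3.5). Similarly we have
=a(f(M)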
q)
q af(g(a
m-g(a
)
+
q) a
q =af'((a))
2aa
aK
-+0(laI/NR2)
q
q
(2 2)
K
- `+O(lal/NRz) g-+ (m g(-)) q q q q aK lalz -2m--+O df - f (g(a9 )) q(g(a9 )) q + NR 1
We deduce that Zib
((q ))-((afl(g(q))+2q(g(q))
q
))+O(q+NRz)
modulo one. When we subtract two terms of this type in the Third Condition, then we have aK q
5 q'
a
a'
q
q'
+
K'
a
q(K-K') 501+14
a
q
by the First and Fourth Conditions. Hence the Third Condition implies (16.3.4).
Lemma 16.3.2 (The arc of analytic coincidence) Let
$$G(y) = f'(g(y)) - 2yg(y).$$
Suppose that the Third and Fourth Conditions hold at $e/r$. Define the rational $f/s$ by $es \equiv 1 \pmod r$, $1 \le s \le r - 1$ and $fr = es - 1$. Then, for $t$ and $u$ integers with $t \ge 0$, $u \ge 0$, $t + u \ge 1$, we have
$$\left\|(ru + st)G\left(\frac{eu + ft}{ru + st}\right) - (r'u + s't)G\left(\frac{e'u + f't}{r'u + s't}\right)\right\| \le \Delta_5,$$
where $f'/s'$ is constructed similarly from $e'/r'$, and
$$\Delta_5 \ll \frac{ru + st + tR^2/r}{N}\left(1 + \frac{t^2R^4}{r^2(ru + st)^2}\right).$$
If
$$\frac{e}{r} - \frac{eu + ft}{ru + st} = \frac{t}{r(ru + st)} \ll \frac{1}{R^2},$$
then we have
$$\Delta_5 \ll \frac{ru + st}{N}$$
and the Fourth Condition in the form (16.3.5) holds at $(eu + ft)/(ru + st)$. A similar result holds with $f/s$ constructed by $es \equiv -1 \pmod r$, $fr = es + 1$.
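The construction of the companion fraction $f/s$ and the spacing formula $e/r - (eu + ft)/(ru + st) = t/(r(ru + st))$ can be checked with exact rational arithmetic. The sketch below follows the first normalization of the lemma, $es \equiv 1 \pmod r$ and $fr = es - 1$. The function names are illustrative, and `pow(e, -1, r)` assumes Python 3.8 or later.

```python
from fractions import Fraction

def neighbour(e, r):
    """Return (f, s) with e*s = 1 (mod r), 1 <= s <= r - 1 and f*r = e*s - 1.

    Assumes gcd(e, r) == 1 and r >= 2.
    """
    s = pow(e, -1, r)          # modular inverse of e modulo r
    f = (e * s - 1) // r       # exact division, since e*s = 1 (mod r)
    return f, s

def point(e, r, f, s, u, t):
    """The rational (e*u + f*t)/(r*u + s*t) parametrized by integers u, t."""
    return Fraction(e * u + f * t, r * u + s * t)
```

For `e/r = 5/13` this gives `f/s = 3/8`, with `5/13 - 3/8 = 1/(13*8)`, and the point with `u = 2`, `t = 3` is `19/50`, at distance `3/(13*50)` from `5/13`.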
Proof The functions $f'(x)/2$, $-G(y)/2$ are complementary in the sense that their derivatives are inverse functions. We have
$$G'(y) = f''(g(y))g'(y) - 2g(y) - 2yg'(y) = -2g(y), \tag{16.3.6}$$
$$g'(y) = \frac{2}{f^{(3)}(g(y))}, \tag{16.3.7}$$
$$g''(y) = -\frac{4f^{(4)}(g(y))}{(f^{(3)}(g(y)))^3} \ll \frac{N^2R^4}{M} \ll \frac{R^6}{N}, \tag{16.3.8}$$
where we have used (7.2.5). The Third and Fourth Conditions at e/r in the form (16.3.4) and (16.3.5) give -f)g(r)-s1G(r,)
sG(e)+2(r R2
R2
N2
rN
+2(ers
s
1
r
1
Nr
r
NR 2
N
r
(
-f)g(r')II
s+
R2
r
and
rG(r) - rG(
r
r'
r
r «N+NR2 «N.
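Hence, for t and u non-negative,
$$\left\| (ru+st)\,G\!\left(\frac{e}{r}\right) + \frac{2t}{r}\,g\!\left(\frac{e}{r}\right) - (r'u+s't)\,G\!\left(\frac{e'}{r'}\right) - \frac{2t}{r'}\,g\!\left(\frac{e'}{r'}\right) \right\| \ll \frac{ru+st+tR^2/r}{N}. \tag{16.3.9}$$
By the mean value theorem and (16.3.6), we have
$$G\!\left(\frac{eu+ft}{ru+st}\right) = G\!\left(\frac{e}{r}\right) - \frac{t}{r(ru+st)}\,G'\!\left(\frac{e}{r}\right) + \frac{t^2}{2r^2(ru+st)^2}\,G''\!\left(\frac{e}{r}\right) + O\!\left(\frac{t^3}{r^3(ru+st)^3}\max|G^{(3)}(y)|\right)$$
$$= G\!\left(\frac{e}{r}\right) + \frac{2t}{r(ru+st)}\,g\!\left(\frac{e}{r}\right) - \frac{t^2}{r^2(ru+st)^2}\,g'\!\left(\frac{e}{r}\right) + O\!\left(\frac{t^3}{r^3(ru+st)^3}\max|g''(y)|\right).$$
Hence, by (16.3.7) and (16.3.8) we have
$$(ru+st)\,G\!\left(\frac{eu+ft}{ru+st}\right) = (ru+st)\,G\!\left(\frac{e}{r}\right) + \frac{2t}{r}\,g\!\left(\frac{e}{r}\right) - \frac{t^2}{3\mu r^2(ru+st)} + O\!\left(\frac{t^3}{r^3(ru+st)^2}\,\frac{R^6}{N}\right),$$
and we can write (16.3.9) as
$$\left\| (ru+st)\,G\!\left(\frac{eu+ft}{ru+st}\right) - (r'u+s't)\,G\!\left(\frac{e'u+f't}{r'u+s't}\right) + t^2\left(\frac{1}{3\mu r^2(ru+st)} - \frac{1}{3\mu' r'^2(r'u+s't)}\right) \right\| \ll \frac{ru+st+tR^2/r}{N} + \frac{t^3}{r^3(ru+st)^2}\,\frac{R^6}{N}. \tag{16.3.10}$$
The third term on the left of (16.3.10) involves the Fortean function of Lemma 15.2.1, and we find that
$$t^2\left(\frac{1}{3\mu r^2(ru+st)} - \frac{1}{3\mu' r'^2(r'u+s't)}\right) \ll \frac{R^4}{N}\,\frac{t^2(ru+st+tR^2/r)}{r^2(ru+st)^2}.$$
We deduce the result of the lemma. □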
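The complementary relations (16.3.6) and (16.3.7) used in this proof can be checked numerically on a concrete example. The sketch below is illustrative only: the sample function f(x) = x⁴/12 (so that f''(x)/2 = x²/2 and g(y) = √(2y) for y > 0) and the finite-difference helper `central_diff` are assumptions, not taken from the text.

```python
import math


def f1(x):
    """f'(x) for the sample function f(x) = x**4 / 12."""
    return x ** 3 / 3.0


def f3(x):
    """f'''(x) for the same sample function."""
    return 2.0 * x


def g(y):
    """Inverse function of f''(x)/2 = x**2 / 2 on x > 0."""
    return math.sqrt(2.0 * y)


def G(y):
    """G(y) = f'(g(y)) - 2 y g(y), as in Lemma 16.3.2."""
    return f1(g(y)) - 2.0 * y * g(y)


def central_diff(func, y, h=1e-5):
    """Symmetric finite-difference approximation to func'(y)."""
    return (func(y + h) - func(y - h)) / (2.0 * h)


for y in (0.5, 1.0, 2.0):
    # (16.3.6): G'(y) = -2 g(y)
    assert abs(central_diff(G, y) + 2.0 * g(y)) < 1e-6
    # (16.3.7): g'(y) = 2 / f'''(g(y))
    assert abs(central_diff(g, y) - 2.0 / f3(g(y))) < 1e-6
    # g inverts f''/2, so the derivatives of f'/2 and -G/2 are inverse functions
    x = g(y)
    assert abs(x ** 2 / 2.0 - y) < 1e-12
```

For this choice G(y) = -(2/3)(2y)^{3/2}, so the first assertion can also be confirmed by hand.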
We express the condition (16.3.5) in terms of rational points close to a curve y = h(x).
Lemma 16.3.3 (The Fortean function) The condition (16.3.5) implies that there is an integer c with (16.3.11)
where $\delta$ is a positive real number depending only on $\Delta_4$, and
$$h(x) = G(x) - (Cx+D)\,G\!\left(\frac{Ax+B}{Cx+D}\right)$$
is a three times differentiable function depending only on the magic matrix, with
$$h''\!\left(\frac{a}{q}\right) = -\frac{2}{3}\left(\frac{1}{\mu} - \frac{q^3}{\mu' q'^3}\right). \tag{16.3.12}$$
Proof We deduce (16.3.11) directly from (16.3.5) with c = [[qh(a/q)]]. For (16.3.12) we observe that
$$h''(x) = -\frac{4}{f^{(3)}(g(x))} + \frac{4}{(Cx+D)^3 f^{(3)}\!\left(g\!\left((Ax+B)/(Cx+D)\right)\right)},$$
and we substitute the relations $6\mu = f^{(3)}(g(a/q))$ (similarly for $\mu'$) and $Ca/q + D = q'/q$.
Lemma 16.3.4 (Linearizing the analytic condition) There is a constant $B_1$ such that, for $r < N/B_1$, the rational points (a/q, c/q) in the notation of Lemma 16.3.3 given by Lemma 16.3.2 with a/q = (eu + ft)/(ru + st) satisfying
$$\left|\frac{a}{q} - \frac{e}{r}\right| \le \frac{1}{R^2} \tag{16.3.13}$$
lie on a line y = lx + n, where l and n are integers with
h(
h,( re
r)
r
) -nl «
1
er)
N
The conclusion remains true if (16.3.13) is replaced by
$$\left|\frac{a}{q} - \frac{e}{r}\right| \le B_2\left(\min\left(\frac{N}{rR^6}, \frac{N}{r^2sR^4}\right)\right)^{1/3}$$
with a suitable constant $B_2$.
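Proof Since we assume the Second Condition, (16.3.12) gives
$h''(a/q) \ll \Delta_2 NR^2 \ll R^4/N$, and with x = a/q = e/r + z, we have
$$h(x) = h\!\left(\frac{e}{r} + z\right) = h\!\left(\frac{e}{r}\right) + z\,h'\!\left(\frac{e}{r}\right) + O\!\left(\frac{z^2R^4}{N}\right).$$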
In (4.3.1) we take
/
`
f(x)=ax+/3=xh'I er )+hI In the notation of (16.3.11)
/
`
1
YI +Yh'I Ye `
`
----R C
as
1 +z2R4
q
q
N
Let R' be the number of rational points with q < 2Q given by the construction of Lemma 16.3.2, and let
$M' = \min(1/R^2, 1/rs)$. We have set up the conditions of Lemma 4.3.1 with M and R replaced by
M' and R', and 8 <<
Q
(1 +Mr2R4) << Q
«O4.
If the points (a/q, c/q) do not all lie on a straight line, then Lemma 4.3.1 gives
$R' \ll 1 + M'Q^3/N$. By Lemma 1.6.3, for Q > 256r we have $R' \gg M'Q^2 \gg 1$. Choosing Q = 256r + 1 gives $r \gg N$, contradicting the assumption $r < N/B_1$ if the constant $B_1$ has been chosen sufficiently large. Hence the points do lie on a straight line, which we can write as lx + my + n = 0 with integer coefficients satisfying $m \le 0$, (l, m, n) = 1. As in the proof of Lemma 15.3.1, we see that m = -1. The inequalities for l and n follow from Lemma 4.3.2.
If we drop the condition $0 \le z < M'$, and replace it by $|z| \le M''$, with $M'' > 1/R^2$, then we must take $\delta \ll M''^2QR^4/N$. We still obtain a contradiction unless
$$M'Q^2 \ll R' \ll \delta M''Q^2 \ll Q^3M''^3R^4/N.$$
When we take Q = 256r + 1, then this gives
$$M'' \gg \frac{1}{R^2}\left(\frac{N}{r}\right)^{1/3}\left(\min\left(1, \frac{R^2}{rs}\right)\right)^{1/3},$$
and we deduce the last assertion of the lemma. □
Whilst the constant term n of the line y = lx + n is localized in Lemma 16.3.4, the gradient l is not determined accurately. In general h''(x) changes sign, and we are close to a point of inflection. If h''(x) (which must be small by the Second Condition) is bounded away from zero, then we may use Lemma 4.2.1 in place of Lemma 4.3.1.
Lemma 16.3.5 (A resonance curve for a triangular matrix) Suppose that the conditions of Lemmas 16.3.2 and 16.3.4 hold, and also that the magic matrix is triangular, and that the minor arcs come from sums with the same function F(x), satisfying the conditions of Lemma 14.3.1. Let k(y) be the function complementary to h(x), defined by
$$k(y) = y\,j(y) - h(j(y)),$$
where x = j(y) is the function inverse to y = h'(x). Then, in the upper triangular case,
$$|k(l) + n| \ll \Delta_4^2\,\frac{M(R^2+rs)^2}{|B|N^2R^4Q^2},$$
and, in the lower triangular case,
$$|k(l) + n| \ll \Delta_4^2\,\frac{(R^2+rs)^2}{|C|MQ^2}.$$
If the same rational line contains points corresponding to $L \ge 5$ different minor arcs, then we replace $(R^2 + rs)^2$ by $R^4/L^2$.
Proof For an upper triangular matrix (
I
B), the Second Condition in the
form (16.3.12) gives (aq)
h "(a)
=3
µ
-
1
µ
2 3
(f l3)(g(a/q +B)) 4B
=-3
f (3)(g(a/q))
f (4)(g(g)) (f(3)(g(, )))3
for some $\eta$. Upper triangular matrices occur only for $M < \sqrt{T}$, when we have assumed that $|F^{(4)}(x)|$ is non-zero, so that
$$h''\!\left(\frac{a}{q}\right) \asymp \lambda = \frac{|B|N^2R^4}{M}. \tag{16.3.14}$$
Lower triangular matrices $\begin{pmatrix} 1 & 0 \\ C & 1 \end{pmatrix}$ only occur for $M \ge \sqrt{T}$, when we have assumed that $3(F^{(3)})^2 - F''F^{(4)}$ is non-zero, and a similar calculation gives
$$h''\!\left(\frac{a}{q}\right) \asymp \lambda = |C|M. \tag{16.3.15}$$
We have set up the conditions of Lemma 4.2.1, with h(x) in place of f(x), $O(\Delta_4)$ in place of $\delta$, and the distance $\Delta$ between the rational points $a_i/q_i$ satisfying
$$\Delta \gg \min(1/R^2, 1/rs).$$
We have
$$|k(l) + n| \ll \frac{\Delta_4}{Q} + \frac{\Delta_4^2}{\lambda\Delta^2Q^2}.$$
The first term is negligible by Lemma 14.3.2, and the second term gives the bounds stated. If the Coincidence Conditions hold for L minor arcs, then we can take $\Delta \gg L/R^2$, and in (4.1.14) of Lemma 4.1.5 we can take both factors to be of size $\gg L/R^2$, so that
$$L^2/R^4 \ll \Delta_4/\lambda Q,$$
and we have
$$\Delta_4/Q \ll \Delta_4^2R^4/\lambda L^2Q^2,$$
and we still omit the term $\Delta_4/Q$. □
As in Chapter 15, we shall apply these lemmas with either e/r or f/s as the rational number of smallest denominator that is a value of f''(x)/2 on the minor arc. In the second case, e/r is the rational number of second smallest denominator, and (e - f)/(r - s) is outside the minor arc, but adjacent to f/s in some Farey sequence. In both cases we find that $rs \le 3R^2$.
The construction of Chapter 8 gives corresponding results, but with different derivatives and factorials in (16.3.1) and (16.3.3), and, for example, $\Delta_5 \ll (ru + st)/H$ in Lemma 16.3.2.
Part V Results and applications
17 Exponential sum theorems
17.1 THE SIMPLE EXPONENTIAL SUM
We resume the treatment of the simple exponential sum in one variable,
$$S = \sum_{m=M}^{M_2} e\!\left(TF\!\left(\frac{m}{M}\right)\right), \tag{17.1.1}$$
from Chapter 7, obtaining the results of Huxley (1993a), extended to short intervals and congruence families. We use the notation of Chapter 7, and assume all the restrictions made there on the parameters N (length of short sums) and R (expected size of the denominator). We have estimated the numbers A and B in Lemma 7.4.3.
Lemma 17.1.1 (Coincident `integer' vectors) Write A in Lemma 7.4.3 as $A_r$, where r = 3, 4, 5, or 6 is the exponent of Lemma 7.4.2. Then we have A3 <
R2
H5+
NZ
R2
forQ<<
NZH7
N1-E'
As H7+E
NZ
for Q>> N1-E.
The implied constants depend on E where appropriate.
Proof The coincidence conditions in Lemma 7.4.3 give the inequalities (11.1.1)-(11.1.5) of Chapter 11 with
$$H \asymp NQ/R^2, \qquad \delta \asymp R^2/N^2,$$
so that
$$\delta H^{3/2} \asymp (Q^3/NR^2)^{1/2}, \qquad \delta H^2 \asymp Q^2/R^2 \gg 1, \qquad \delta H \asymp Q/N.$$
We do not change the order of magnitude of $\delta$ by increasing $\delta$ to $4/H^2$ when $\delta$ is very small. The bounds quoted come from Lemma 11.2.1, Theorem 11.2.5, Theorem 12.4.1, and Theorem 12.1.4. □
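The order-of-magnitude relations in this proof are pure exponent bookkeeping, which can be mechanised. The sketch below is an illustration under the assumption that the parameters are H ≍ NQ/R² and δ ≍ R²/N²; the helpers `mono`, `mul` and `power` are hypothetical names that represent a monomial in N, Q, R by its exponents.

```python
from fractions import Fraction as Fr


def mono(**exps):
    """A monomial in N, Q, R stored as {symbol: exponent}; zero exponents dropped."""
    return {k: Fr(v) for k, v in exps.items() if Fr(v) != 0}


def mul(a, b):
    """Product of two monomials: add exponents symbol by symbol."""
    keys = set(a) | set(b)
    return mono(**{k: a.get(k, Fr(0)) + b.get(k, Fr(0)) for k in keys})


def power(a, p):
    """Raise a monomial to the rational power p."""
    return mono(**{k: v * Fr(p) for k, v in a.items()})


H = mono(N=1, Q=1, R=-2)       # H ≍ N Q / R^2
delta = mono(R=2, N=-2)        # delta ≍ R^2 / N^2

# delta * H^(3/2) ≍ (Q^3 / (N R^2))^(1/2)
assert mul(delta, power(H, Fr(3, 2))) == power(mono(Q=3, N=-1, R=-2), Fr(1, 2))
# delta * H^2 ≍ Q^2 / R^2
assert mul(delta, power(H, 2)) == mono(Q=2, R=-2)
# delta * H ≍ Q / N
assert mul(delta, H) == mono(Q=1, N=-1)
```

Since Q ≫ R on the minor arcs considered, the second identity is what gives δH² ≫ 1.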
Lemma 17.1.2 (Choosing parameter sizes) Suppose that
M2 5 (1 + 8)M
(17.1.2)
for some 8< 1, and that the conditions of Lemma 14.3.1 hold. Then we have in the Second Spacing Problem BV << 82L110204/3P2R2V
(17.1.3)
U= (N/B1Q)213, V= U3/2
(17.1.4)
when
(17.1.5)
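for some sufficiently large constant $B_1$, for a choice of N and R in the ranges
$$N \asymp \delta^{2/19}MT^{-17/57}, \qquad R \asymp \delta^{-1/19}MT^{-20/57}, \tag{17.1.6}$$
when
$$\delta \gg T^{-8/81}(\log T)^{38/27}, \tag{17.1.7}$$
and
$$\delta^{-24}T^{49} \ll M^{114} \ll \delta^{24}T^{65}. \tag{17.1.8}$$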
BV << 8010204'3P2R2V,
(17.1.9)
We also have
when U is given by (17.1.4), but V is chosen suitably with $\Delta_1Q^2 \asymp 1$, for a choice of N and R in the ranges
$$N \asymp M^{1/2}T^{-1/12}, \qquad R \asymp M^{5/4}T^{-11/24}, \tag{17.1.10}$$
with
$$\delta \gg T^{1/6}/M, \qquad T^{5/12} \ll M \ll T^{1/2}. \tag{17.1.11}$$
We also have (17.1.9) when U is given by (17.1.4), and V is chosen suitably with $\Delta_1P^2 \asymp 1$, for a choice of N and R in the ranges
$$N \asymp M^{3/2}T^{-7/12}, \qquad R \asymp M^{3/4}T^{-5/24}, \tag{17.1.12}$$
with
$$\delta \gg M/T^{5/6}, \qquad T^{1/2} \ll M \ll T^{7/12}. \tag{17.1.13}$$
Finally we have (17.1.9) with U given by (17.1.4), and V = 1, for a choice of N and R in the ranges
$$N \asymp R^2 \asymp M^{3/2}/T^{1/2} \tag{17.1.14}$$
with
$$\delta \gg M/T^{2/3}, \qquad T^{1/3} \ll M \ll T^{5/12}, \tag{17.1.15}$$
and also for a choice of N and R in the ranges
$$N \asymp M^{1/2}, \qquad R \asymp M^{5/4}T^{-1/2} \tag{17.1.16}$$
with
$$\delta \gg T^{1/3}/M, \qquad T^{7/12} \ll M \ll T^{2/3}. \tag{17.1.17}$$
When Modification 2 operates, then the bounds for BV must be multiplied by
Q2/R2. The implied constants are constructed from the derivatives of the function F(x), and the order-of-magnitude constants in the ranges for M and 6.
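The five parameter choices in Lemma 17.1.2 can be sanity-checked against the relation NR² ≍ M³/T used in the proof, and against each other at the boundaries of their M-ranges. The sketch below takes δ = 1 and reads the exponents as N ≍ MT^{-17/57}, R ≍ MT^{-20/57} in (17.1.6), and so on; these readings, and the names `CHOICES`, `n_r_squared` and `t_exponent`, are assumptions made for illustration.

```python
from fractions import Fraction as Fr

# Each entry: ((M-exponent, T-exponent) of N, (M-exponent, T-exponent) of R), delta = 1.
CHOICES = {
    "17.1.6":  ((Fr(1), Fr(-17, 57)), (Fr(1), Fr(-20, 57))),
    "17.1.10": ((Fr(1, 2), Fr(-1, 12)), (Fr(5, 4), Fr(-11, 24))),
    "17.1.12": ((Fr(3, 2), Fr(-7, 12)), (Fr(3, 4), Fr(-5, 24))),
    "17.1.14": ((Fr(3, 2), Fr(-1, 2)), (Fr(3, 4), Fr(-1, 4))),
    "17.1.16": ((Fr(1, 2), Fr(0)), (Fr(5, 4), Fr(-1, 2))),
}


def n_r_squared(choice):
    """Exponents (of M, of T) in the product N R^2 for the given choice."""
    (an, bn), (ar, br) = CHOICES[choice]
    return (an + 2 * ar, bn + 2 * br)


def t_exponent(pair, theta):
    """Exponent of T in M^a T^b when M = T^theta."""
    a, b = pair
    return a * theta + b


# Every choice satisfies N R^2 ≍ M^3 / T.
for name in CHOICES:
    assert n_r_squared(name) == (Fr(3), Fr(-1))

# Neighbouring choices agree where their M-ranges meet.
for left, right, theta in [
    ("17.1.14", "17.1.10", Fr(5, 12)),
    ("17.1.10", "17.1.12", Fr(1, 2)),
    ("17.1.12", "17.1.16", Fr(7, 12)),
]:
    for i in (0, 1):  # 0 compares N, 1 compares R
        assert t_exponent(CHOICES[left][i], theta) == t_exponent(CHOICES[right][i], theta)
```

The continuity at M = T^{5/12}, T^{1/2} and T^{7/12} is a useful cross-check when transcribing the exponents.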
Proof We substitute the values of $\Delta_j$ into Lemma 16.1.3. The choices (17.1.4) and (17.1.5) of U and V are valid in (16.1.6) of Lemma 16.1.3 if $N^5 \ll MR^4$
(17.1.18)
(an easy consequence of (17.1.6)), and if
S» (NS/M3Rz)1/3.
(17.1.19)
The bound (17.1.3) corresponds to the last term in (16.1.8) of Lemma 16.1.3, which dominates the first term in the same bound if 5MN"20/3R4Q5/3,
1 << 6/ 1l202/3PQ x
(17.1.20)
and dominates the second term if 1 << 6 i
m in (Pz , Q z)
SR4Q
x N3 mi n
( M2
N2R4 , 1
(17 . 1 . 21)
and dominates the thir d term if l og e
R «S20 PQ x 1
S 2MQRz
N4
(17 . 1 . 22)
Since N and R satisfy NR2 x M3/T, the choice (17.1.6) is extremal in (17.1.20). The condition (17.1.21) gives the simultaneous conditions (17.1.8), and the condition (17.1.22) gives the condition (17.1.7). If S = 1, then we use (16.1.7) of Lemma 16.1.3 in place of (16.1.8), in which the term that requires
(17.1.22) is absent. We find that (17.1.7) implies (17.1.9), and that the conditions (7.2.3)-(7.2.8) of Chapter 7 can be satisfied by a suitable choice of N and R in the range (17.1.6). In (16.1.9) of Lemma 16.1.3, the second term dominates if 1 <<
2 L4/3P/Q
x MN-11"3Q213
The extremal choice of N and R is (17.1.10). We find that $V \ge 1$ for $M \gg T^{5/12}$. The conditions (17.1.11) ensure that (17.1.18), (17.1.19), and the conditions of Chapter 7 can be satisfied by a suitable choice of N and R in the ranges (17.1.10). For $M \ll T^{5/12}$, we take V = 1, and choose N and R so that
$1 \asymp \Delta_1Q^2 \asymp R^4/N^2$, which puts N and R into the ranges (17.1.14). The conditions (17.1.15) ensure that (17.1.18), (17.1.19), and the conditions of Chapter 7 can be satisfied by a suitable choice of N and R in the ranges (17.1.14).
Similarly in (16.1.10) of Lemma 16.1.3, the second term dominates if 1 << L1204/3Q/P x M-1N-5/3Q2/3R4
The extremal choice of N and R is (17.1.12). We find that $V \ge 1$ for $M \ll T^{7/12}$. The conditions (17.1.13) ensure that (17.1.18), (17.1.19) and the conditions of Chapter 7 can be satisfied by a suitable choice of N and R in the ranges (17.1.12). For $M \gg T^{7/12}$, we take V = 1, and choose N and R so that $1 \asymp \Delta_1P^2 \asymp M^2/N^4$, which puts N and R into the ranges (17.1.16). The conditions (17.1.17) ensure that (17.1.18), (17.1.19) and the conditions of Chapter 7 can be satisfied by a suitable choice of N and R in the ranges (17.1.16). □
We can now substitute the estimates for A and B into Lemma 7.4.3, then the result into Lemma 7.4.2, and add up the cases. A complication is that the bound for $A_5$ changes when $Q \asymp N^{1-\epsilon}$, and that we must change the bound for B from Lemma 16.1.3, valid for $Q \ll N$, to Lemma 14.3.5, valid for all Q. Our next lemma appears as simple methodology, but it embodies one of the driving ideas of the method. It also prevents the exponent $\delta$ of Lemma 7.4.2 and the ratio $\delta$ of Lemma 16.1.3 from appearing in the same formula.
Lemma 17.1.3 (Normal-sized denominators) Suppose that Modification 2 is used in the choice of approximating polynomials on the minor arcs. Let $\alpha$ be the largest exponent of Q in any term in the bound for BV. There is a choice of $\delta$ in Lemma 7.4.2 for which the contribution of ranges corresponding to different powers of $Q = 2^k$ can be majorized by convergent geometric progressions, whose sum is in order of magnitude the bound obtained for the smallest size range $Q \asymp R$, provided that $\alpha < r + 7$ in the range $R \ll Q \ll N$, and $\alpha < r + 6$ in the range $Q \gg N$. The result still holds if the bound for ABV increases by a factor $Q_0^{\eta}$ as we pass
from Q = Q0 to 2Q0 (S N), provided that =Q1-2*1/(r+7-a)
R << Q1
(17.1.23)
If R >> Q,, then a valid upper bound is given by the order of magnitude for Q x R, multiplied by R'".
Proof In Lemma 7.2.2, the largest power of Q which occurs in W(Q) is Q-1. Hence in Lemma 7.4.2 there is a factor q-3r/2-1+s outside the expression from the large sieve. When Modification 2 operates, then we can insert a
factor R2/Q2, since the sum over a/q contains >> Q2/R2 approximating polynomials corresponding to each minor arc. In the expression ABH5NR2V 3
Q
(17.1.24)
of Lemma 7.4.3, we can ignore the factor 1 + Q/N for small Q. Since H NQ/R2, and the largest exponent of H in the bound for A is 2r - 3 + E, then the largest exponent of Q in (17.1.24) is 2r - 1 +a+ e for Q << N, 2r + a + e for Q >> N. We can now work out the largest exponent of Q in Lemma 7.4.2 as a + E
2
-$-
r + 7
2
for Q << N, and larger by z for Q >> N. Hence a positive choice of 6 is possible under the conditions given. If the upper bound for ABV has a discontinuity at Q0Yby a factor Qo, then we define an exponent 0 by (Qo/R)B ^ Q'7,
and insert an extra factor (Q/R)B into (17.1.24). We can still choose a if (17.1.23) holds: otherwise we use the larger estimate with the factor Q'I over the whole range Q >> R.
We can now give bounds for the simple exponential sum S of Chapter 7. Further complications arise when we consider a family of sums $S_i$.
Theorem 17.1.4 Let F(x) be a function four times continuously differentiable on the interval $1 \le x \le 2$, whose derivatives satisfy the following conditions:
$$|F^{(r)}(x)| \le C_1 \tag{17.1.25}$$
for r = 3, 4,
$$|F^{(r)}(x)| \ge 1/C_1 \tag{17.1.26}$$
for r = 3, where $C_1$ is a positive constant. Let $\delta \le 1$ be a positive real number, and let
$$S = \sum_{m=M}^{M_2} e\!\left(TF\!\left(\frac{m}{M}\right)\right),$$
where M and $M_2$ are positive integers, T is a large real number, and $M < M_2 \le (1 + \delta)M \le 2M < T$. Suppose that for some $C_2 \ge 1$ either case 1 or case 2 holds:
Case 1 $M \le C_2\sqrt{T}$ and (17.1.26) holds for r = 4 also;
Case 2 $M \ge C_2^{-1}\sqrt{T}$ and (17.1.25) and (17.1.26) hold for r = 2 also and
$$|F''(x)F^{(4)}(x) - 3(F^{(3)}(x))^2| \ge C_3 \tag{17.1.27}$$
for some positive constant $C_3$.
If M is sufficiently large in terms of $C_1$, then we have
$$S \ll \delta^{92/95}M^{1/2}T^{89/570}\log T \tag{17.1.28}$$
for
$$\delta^{-24}T^{49} \ll M^{114} \ll \delta^{24}T^{65},$$
and
$$\delta \gg T^{-8/81}(\log T)^{38/27}.$$
Secondly, we have
$$S \ll \delta^{9/10}\left(M^{7/20}T^{13/60} + M^{13/20}T^{11/120}\right)\log T \tag{17.1.29}$$
when either $\delta = 1$ or
$$\delta \gg M^{-1/4}T^{-11/108} + M^{1/4}T^{-67/216} + M^{3/4}T^{-13/24}, \tag{17.1.30}$$
for
$$T^{1/3+\epsilon} \ll M \ll T^{1/2},$$
and also for $M \ll T^{1/3+\epsilon}$ with log T replaced by $T^{\epsilon}$ in (17.1.29). Thirdly, we have
$$S \ll \delta^{9/10}\left(M^{7/20}T^{29/120} + M^{13/20}T^{1/15}\right)\log T \tag{17.1.31}$$
when either $\delta = 1$ or
$$\delta \gg M^{1/4}T^{-19/54} + M^{-1/4}T^{-13/216} + M^{-3/4}T^{5/24}, \tag{17.1.32}$$
for
$$T^{1/2} \ll M \ll T^{2/3-\epsilon},$$
and also for $M \gg T^{2/3-\epsilon}$ with log T replaced by $T^{\epsilon}$ in (17.1.31). The implied constants are constructed from $C_1$, $C_2$, $C_3$, from the implied constants in the various ranges for M, and, where appropriate, from $\epsilon$.
Proof The conditions on F(x) are those of Lemma 14.3.1. Lemmas 17.1.1 and 17.1.2 give the estimates for A and B in Lemma 7.4.3. The bound of Lemma 7.4.3 must be substituted into Lemma 7.4.2. Since the bound is uniform in $\eta$, we can perform the integration over $\eta$ in Lemma 7.4.2 to get an extra factor log N. We take r = 5, so that the bound for $A_5$ changes form at $Q_0 = N^{1-\epsilon}$ in Lemma 17.1.1. With a slight loss of accuracy, we change the bound for B from Lemma 16.1.3 to Lemma 14.3.5 for $Q > Q_0$. For $Q_0 < Q \ll N$ we have
$$\Delta_4 \gg N^{-\epsilon}, \qquad U \ll N^{2\epsilon/3}, \qquad \log PQ \ll N^{\epsilon/12},$$
and so the terms of Lemma 14.3.5 correspond to those of (16.1.7) and (16.1.8) of Lemma 16.1.3 with the choice of U in Lemma 17.1.2. We saw in Chapter
15 that the set of possible magic matrices is independent of Q. When there are no type 3 matrices for small Q in (16.1.9) and (16.1.10) of Lemma 16.1.3,
then we may omit the terms in Lemma 14.3.5 arising from type 3 matrices also. Hence as Q increases past the value $Q_0$, then the bounds for BV pass smoothly from those of Lemma 16.1.3 to those of Lemma 14.3.5 (with type 3 matrices omitted where appropriate), apart from a factor of the form $O(N^{\epsilon})$. We can now apply Lemma 17.1.3 to get $|S|^5 \ll$
( M2
-M
5 R 5/2 +
I
N)
(1A5
`+R5/2(W(R))4
logs N
/IBHSNV/R) log5 N,
where the bounds for H and BV are those from the range $Q \asymp R$, provided that
$$R \ll N^{1-\epsilon}. \tag{17.1.33}$$
If (17.1.33) is false, then we insert a factor $N^{\epsilon}$ after $A_5BV$ inside the square root. Hence either
$$S \ll \delta MN^{-3/2}R^{1/2}\log N, \tag{17.1.34}$$
or
$$S \ll \sqrt{MR/N}\,\log N, \tag{17.1.35}$$
or
$$S \ll R^{1/2}(W(R))^{4/5}(RH^{12}BV/N)^{1/10}\log N, \tag{17.1.36}$$
with log N replaced by $N^{\epsilon}$ if (17.1.33) does not hold. We suppose that
$$\delta \gg \min\left(\frac{1}{R}, \frac{NR}{M}\right) \tag{17.1.37}$$
to simplify the expression for W(R) in Lemma 7.2.2, so that (17.1.36) becomes
$$S \ll \delta^{4/5}M^{4/5}N^{3/10}R^{-3/5}(BV)^{1/10}\log N.$$
The estimate (17.1.3) of Lemma 17.1.2 gives
$$S \ll \frac{\delta M\log N}{N^{11/30}R^{2/15}}, \tag{17.1.38}$$
whilst (17.1.9) gives (17.1.38) with $\delta$ replaced by $\delta^{9/10}$.
The bounds (17.1.28), (17.1.29), and (17.1.31) are given by the various choices of N, R, and V in Lemma 17.1.2. The term (17.1.34) is always smaller than (17.1.38). The term (17.1.35) is smaller than (17.1.38) for large $\delta$. The choice (17.1.6) gives (17.1.28). The conditions on $\delta$ in Lemma 17.1.2, repeated here, imply (17.1.37), and they keep $\delta$ so large that the contribution (17.1.35) is negligible.
The two terms of (17.1.29) come from the choices (17.1.10) and (17.1.14) in Lemma 17.1.2. The first two terms in the lower bound for $\delta$ in (17.1.30) ensure that the term (17.1.35) cannot dominate. The third term comes from (17.1.37). The lower bounds for $\delta$ in (17.1.11) and (17.1.14) follow from (17.1.30). We have $R \ll Q_1$ in Lemma 17.1.3 provided that $M \gg T^{1/3+\epsilon}$ for some $\epsilon > 0$. For $M \ll T^{1/3+\epsilon}$, the bound (17.1.29) with a factor $T^{\epsilon}$ follows from Lemma 7.3.4.
Similarly, the two terms of (17.1.31) come from the choices (17.1.12) and (17.1.16) in Lemma 17.1.2. The first two terms in the lower bound for $\delta$ in (17.1.32) ensure that the term (17.1.35) cannot dominate. The third term comes from (17.1.37). The lower bounds for $\delta$ in (17.1.13) and (17.1.17) follow from (17.1.32). We have $R \ll Q_1$ in Lemma 17.1.3 provided that $M \ll T^{2/3-\epsilon}$ for some $\epsilon > 0$. For $M \gg T^{2/3-\epsilon}$, the bound (17.1.31) with a factor $T^{\epsilon}$ follows from Lemma 7.3.4. □
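The exponents in Theorem 17.1.4 can be recovered mechanically from the parameter choices of Lemma 17.1.2. The sketch below assumes δ = 1 and reads (17.1.38) as S ≪ M N^{-11/30} R^{-2/15} log N; that reading, the exponent table `RANGES` and the function `bound_exponents` are assumptions for illustration, not the author's own computation.

```python
from fractions import Fraction as Fr

# ((M-exponent, T-exponent) of N, (M-exponent, T-exponent) of R), with delta = 1.
RANGES = {
    "17.1.6":  ((Fr(1), Fr(-17, 57)), (Fr(1), Fr(-20, 57))),
    "17.1.10": ((Fr(1, 2), Fr(-1, 12)), (Fr(5, 4), Fr(-11, 24))),
    "17.1.12": ((Fr(3, 2), Fr(-7, 12)), (Fr(3, 4), Fr(-5, 24))),
    "17.1.14": ((Fr(3, 2), Fr(-1, 2)), (Fr(3, 4), Fr(-1, 4))),
    "17.1.16": ((Fr(1, 2), Fr(0)), (Fr(5, 4), Fr(-1, 2))),
}


def bound_exponents(choice):
    """Exponents (of M, of T) in M * N^(-11/30) * R^(-2/15) for the given choice."""
    (an, bn), (ar, br) = RANGES[choice]
    m_exp = 1 - Fr(11, 30) * an - Fr(2, 15) * ar
    t_exp = -Fr(11, 30) * bn - Fr(2, 15) * br
    return (m_exp, t_exp)


# (17.1.28): M^(1/2) T^(89/570)
assert bound_exponents("17.1.6") == (Fr(1, 2), Fr(89, 570))
# (17.1.29): M^(7/20) T^(13/60) + M^(13/20) T^(11/120)
assert bound_exponents("17.1.14") == (Fr(7, 20), Fr(13, 60))
assert bound_exponents("17.1.10") == (Fr(13, 20), Fr(11, 120))
# (17.1.31): M^(7/20) T^(29/120) + M^(13/20) T^(1/15)
assert bound_exponents("17.1.12") == (Fr(7, 20), Fr(29, 120))
assert bound_exponents("17.1.16") == (Fr(13, 20), Fr(1, 15))


def at(choice, theta):
    """Exponent of T in the bound when M = T^theta."""
    m_exp, t_exp = bound_exponents(choice)
    return m_exp * theta + t_exp


# At the ends of the range (17.1.8) with delta = 1, (17.1.28) meets (17.1.29) and (17.1.31).
assert at("17.1.6", Fr(49, 114)) == at("17.1.10", Fr(49, 114))
assert at("17.1.6", Fr(65, 114)) == at("17.1.12", Fr(65, 114))
```

The last two checks reflect the agreement between (17.1.28) and the neighbouring bounds at the ends of the range (17.1.8) when δ = 1.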
When $\delta = 1$, then the bounds of Theorem 17.1.4 give the five possible cases, depending on the size of (log M)/log T. At the end of the range (17.1.8), the bound (17.1.28) agrees with (17.1.29) or (17.1.31). However, for $\delta < 1$, the powers of $\delta$ are different. To fill the gap, we should consider two extra cases in Lemma 17.1.2.
There is a different incompleteness for small $\delta$, when the bounds (17.1.30) and (17.1.32) are determined by edge effects: the short interval may be dominated by a single major arc I(a/q) with q small. If such a major arc occurs, then we should treat the whole sum as a long major arc as in Section 7.5 of Chapter 7. If no such major arc occurs, then the bounds of Theorem 17.1.4 may be extended to smaller values of $\delta$.
Our next lemma is methodological: for fixed $\delta$ and T, a bound valid for one size range $M \asymp T^{\alpha}$ gives some information for other sizes of M. For reasons of notation, the value of $\delta$ is reduced by 1/M, compared with Theorem 17.1.4.
Lemma 17.1.5 (Extending ranges trivially) Let $\delta$ be given in $0 < \delta < 1$. Suppose that there are constants $C_4 \ge 2$, $C_5$ (which may depend on $\delta$), $\alpha$ and $\beta$ such that, for a class C of functions closed under the addition of linear terms and under linear changes of variable, we have in the notation of Theorem 17.1.4
$$|S| \le C_5T^{\beta}$$
whenever F(x) is in the class C, $M_2 < (1 + \delta)M$, and M lies in the range
$$T^{\alpha}/C_4 < M \le C_4T^{\alpha}.$$
Then, for $M \le T^{\alpha}$, we have
$$|S| \le C_5T^{\beta},$$
and, for $M > C_4T^{\alpha}$, we have
$$|S| \le 2C_5MT^{\beta-\alpha}.$$
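Proof For $M \le T^{\alpha}$ we pick an integer $q \ge 1$ so that
$$qM \le T^{\alpha} < (q + 1)M.$$
We note that $qM > T^{\alpha}/2 \ge T^{\alpha}/C_4$. For a = 0, ..., q - 1, let $F_a(x) = F(x) + aMx/T$. Then
$$\sum_{a \bmod q} e\!\left(TF_a\!\left(\frac{n}{qM}\right)\right) = e\!\left(TF\!\left(\frac{n}{qM}\right)\right)\sum_{a \bmod q} e\!\left(\frac{an}{q}\right).$$
The inner sum over a is q if $q \mid n$, zero if not. Hence
$$S = \frac{1}{q}\sum_{a \bmod q}\ \sum_{n=qM}^{qM_2} e\!\left(TF_a\!\left(\frac{n}{qM}\right)\right),$$
which gives the first result.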
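The averaging identity behind the first part of this proof can be confirmed numerically. The sketch below is illustrative: the sample function F(x) = x^{1.5}, the values T = 50, M = 7, M₂ = 12, q = 3 and the helper names are assumptions, and e(t) denotes exp(2πit) as usual.

```python
import cmath


def e(t):
    """The additive character e(t) = exp(2*pi*i*t)."""
    return cmath.exp(2j * cmath.pi * t)


def direct_sum(F, T, M, M2):
    """S = sum over M <= m <= M2 of e(T F(m/M))."""
    return sum(e(T * F(m / M)) for m in range(M, M2 + 1))


def averaged_sum(F, T, M, M2, q):
    """(1/q) * sum over a mod q of the longer sums formed with F_a(x) = F(x) + a M x / T."""
    total = 0
    for a in range(q):
        def Fa(x, a=a):
            return F(x) + a * M * x / T
        total += sum(e(T * Fa(n / (q * M))) for n in range(q * M, q * M2 + 1))
    return total / q


def F(x):
    return x ** 1.5  # sample phase function, chosen only for illustration


S_direct = direct_sum(F, 50.0, 7, 12)
S_averaged = averaged_sum(F, 50.0, 7, 12, 3)
# Terms with q not dividing n cancel in the average over a, leaving the original sum.
assert abs(S_direct - S_averaged) < 1e-9
```

Each inner sum has length parameter qM in place of M, which is how a bound for one size range of M is transferred to shorter sums.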
For M> C4T" we pick integers q and N with
(q- 1)N<M
T"
Here M = qN - b for some integer bin 0< b < q. We write q-b-1
S= E Ee(TF(gMa)) a=-b
n
JJ
Since $q \ge 2$, we have
$T^{\alpha}/C_4 < M/q < C_4T^{\alpha}$. The range for n begins at n = N, and runs through at most $\delta N$ consecutive integers. Hence, for each a, the inner sum has size at most $C_5T^{\beta}$, and
$$|S| \le qC_5T^{\beta} \le 2C_5(q - 1)T^{\beta} \le 2C_5\frac{M}{N}T^{\beta}.$$
We deduce the second result. □
This lemma would be extremely useful if it could be sharpened, with cancellation between the q sums over the different residue classes a modulo q. These sums form a congruence family in the sense of Chapter 16.
17.2 EXPONENTIAL SUMS WITH A PARAMETER
We consider a family of sums formed with a function F(x, y) for different values $y = y_i$, as in Section 14.4 of Chapter 14. For simplicity we only consider $\delta = 1$ in Lemma 16.1.4. Taking $\delta < 1$ would correspond to a family of sums over the same short interval. We do not have an analogue of Lemma 16.1.4 for the commoner case of a family of sums over different short intervals. Even with $\delta = 1$, there are ten important cases. We have not tried to optimize the logarithm power.
Lemma 17.2.1 (Choosing parameter sizes for a family of sums) Suppose that the conditions of Lemmas 14.3.1, 14.4.1, and 14.4.2 hold. Then we have in the Second Spacing Problem (17.2.1)
BV << I2L110202/3P2R2 V
when U and V are given by (17.1.4) and (17.1.5), for suitable choices of N and R in the following ranges: RX1
N x 12119MT - 17/57,
-1119MT- 20157
(17.2.2)
for T$ I33 j
1/57
» JI+( TT7
M2l
T l log M,
(17.2.3)
and
Nx
Rx
(J/I)-2/11MT_3/11,
(J/I)1/11MT-411
(17.2.4)
for T1/3 >>
J
1/57
(T8
>> f 133
M2 T + (Mz + 7 Jlog M,
(17.2.5)
1
We also have, for M << T 1/`2, (
BV<
JQ 7)
(17.2.6)
when U is given by (17.1.4), but V is chosen suitably with A1Q2 x 1, and N and R are suitably chosen, in the following cases. Let
IT l1 -t
T2/3
1
E1 = T8133 + -TUT + ( J + M2
.
When E1 x M2/IT << 1, then we choose N and R in the ranges R X I-118M514T-11/24. NX I1/4M112T-1/12,
(17 . 2 . 7)
(17.2.8)
When E1 x 1/J, then we take N x J114MT- 1/3,
R xJ-118MT-1/3.
(17.2.9)
When E1 x 1/T8"33, then we take
N ,x MT-311,
RX
MT-411.
(17.2.10)
When El x T2/3/M2 << 1, then we choose the ranges (17.1.14). ForM>> T1/2 we have
BV «I I +
JP
)iA2i/3P2R2v log PQ,
(17.2.11)
where U is given by (17.1.3), but V is chosen suitably with L1P2 x 1, and N and R are suitably chosen, in the following cases. Let
IM2 -1
M2
1
E2= T8133 +7.,43 +J+ 7
(17.2.12)
When E2 x T/IM2 << 1, then we choose N and R in the ranges R X I- 1/8M3/4T-5/2a N x 11/4M3/2T-7/12,
(17.2.13) When E2 x 1/J, then we use the ranges (17.2.9). When E2 x 1/T8/33, then we
use the ranges (17.2.10). When E2 xM2/T4/3 << 1, then we use the ranges (17.1.16).
When Modification 2 operates, then the bounds for BV must be multiplied by
Q2/R2. The implied constants are constructed from the derivatives of the function F(x), and from the order-of-magnitude constants in the ranges for M, I, and J.
Proof We substitute the values of A j into Lemma 16.1.4. The choices (17.1.4) and (17.1.5) of U and V are valid in (16.1.11) of Lemma 16.1.4 if
N4 z MR3 log M. (17.2.14) This is a weaker condition than (17.1.18) of Lemma 17.1.2, which was never critical. However, we save more in this lemma, as N can be taken larger, and R smaller, than in Lemma 17.1.2. When I is very large, then (17.2.4) becomes the critical condition, and we choose the ranges (17.2.10). There is an extra term involving A 2 J in (17.1.12), (17.1.14), and (17.1.15). The choices of ranges
(17.2.4) and (17.2.9) correspond to the case when this term is the second largest. The other choices of ranges for N and R correspond to cases in Lemma 17.1.2.
Theorem 17.2.2 Suppose that the function F(x, y) is six times differentiable for 1 < x < 2, 0 5 y 5 1, and for some constants C1 >_ 1, C2 > 0 1,91 a2F(x, y)I <_ C1
for 25r<6, 0<sS2, r+s<6, and 1/C15Ia{F(x,y)I
(17.2.15)
for r = 2, 3, and a1
a,F(x, y)
z
C2
I
for r = 3. Suppose also that either Case 1 or Case 2 holds:
(17 . 2 . 16)
Case 1 M << T 1/2 and (17.2.15) and (17.2.16) hold for r = 4;
Case 2 M >> T'I' and 101 >- C4,
13F1211 - F11F11111 >- C31
for some positive constants C31 CO where 3F11z1 + 4F11F1111
3F11F111
F121
F11111
F1111
F111
F11112
F1112
F112
A=
Let Si be the sum e (TF( m
M
mMZ(`) =MI(i)
y;)),
,
where M< MI(i) S M2(i) < 2M, and y,,..., y, lie in 0
yi+1 -yi ? 1/J for i < I - 1. Let e > 0 be arbitrary. Then in both cases, for T8157
T1/3 E»
111/19
(T
J
+ I >> I
Mz 11
Mz
+
T
JlogM,
(17.2.17)
we have
I IS;15 <
Also in case 1 with M << T 1/2 I
JM2 )1/2 1
+
i=1
(log T)
IT
(17.2.19)
when 1
Tz/3
E1 = T8133 + Mz
/
IT `-1
+ J+ Mz
JI
1
« TE,
(17.2.20)
and in case 2 for M >> T 1/2 1/2
r r JT L ISIS << IM5/2T5/6E3/8I 1 + IMz
i=1
1
(log
T)11/2,
(17.2.21)
11
when 1
Mz
(
E2 = T8,33 + T4/3 + I J +
IM2)-' T
I
1
« TE .
(17.2.22)
When the inequalities involving $\epsilon$ in (17.2.17), (17.2.20), and (17.2.22) do not hold, then the upper bounds hold with the power of log T replaced by $T^{\epsilon}$. The implied constants are constructed from $C_1, \ldots, C_4$, from $\epsilon$, and from the implied order-of-magnitude constants in the ranges for M, I, and J.
Proof We follow the proof of Theorem 17.1.4 with two simplifications: when we apply Lemma 7.4.2, then the first term is negligible, and we do not take a fifth root afterwards. □
17.3 A CONGRUENCE FAMILY OF SUMS
Theorem 17.2.2 does not cover the case of a congruence family of sums as defined in Chapter 16. An analogous result is true, with the same exponents appearing in the bounds. Theorem 17.3.1 Let F(x) satisfy the conditions of Theorem 17.1.4, and let k be a positive integer. Let the sums Sh, for h = 0,..., k - 1, be defined either by / M2
5,,= E el
3k,
f
S,, _ F, e f(m) +
hm )
, (17.3.2) k where M and M2 are positive integers, T is a large real number, and M5 M2 < 2M < T. Suppose that either case 1 or case 2 of Theorem 17.1.4 holds. If M is sufficiently large in terms of C1, then we have k-1 E IShls << (d(k))3/'19k16/19M5/2T89/114 logs T
m=M
0
for k/d(k) << T'/16 and 33
k
T49«M114«(d(k))33 T6s
d(k))
Secondly, we have in case 1 k-1 IShI' << (d(k))3/8k5/8M13/4T11/24 logs T F, 0
+ k(M7/4T13/12 + M5/2T16/21)log5 T
(17.3.3)
for k
d(k)
« min
/ T l
M2)
3/5 ,
T11 14 ,
and
T1/3+e <<M <
M314
TI/4
(17 . 3 . 4)
and also forM << T 1/3+, with log T replaced by T E in (17.3.3). Thirdly, we have
in case 2 k-1
E I SKIS << (d(k))3/$k5/8M7/4T29/24 logs T 0
+k( M1314 T113 + M512 T16/21)logs T
(17.3.5)
for k
d(k)
z min (
M2
3/s ,
T1/14,
T )
T1/2 M314 '
(17.3.6)
and T2/3- E'
T'/2 << M << and also for M >> T 2/ 3 - E with log T replaced by T E in (17.3.6). If the function
F(x) is five times continuously differentiable, with the bound (17.1.25) of Theorem 17.1.4 true for the fifth derivative also, then we may omit the terms M512T16/2' in (17.3.3) and (17.3.5), and the term T1114 in (17.3.4) and (17.3.6). The implied constants are constructed as in Theorem 17.1.4. Corollary Let X(m) be a proper Dirichlet character modulo k, and let M2
Sx= E X(m)e(f(m)) m=M
Then if case 1 holds for M << (kT) , not merely for M << FT, and case 2 holds for M << (T/k) , not merely for M << FT, then we have (d(k))3/95M1/2(k3T)89/570
SX <<
log T
for and
(k)4/i9 d(k)
T23/57 « M
Proof We follow the proof of Lemma 17.1.2 and Theorem 17.1.4, using Lemma 16.2.2 in place of Lemma 16.1.3. For the corollary, when M << (kT) then we write k-1 1 M2 hm 1 k l as in the proof of Lemma 5.4.4. When M >> (kT) , then we write h=0
M
k-1
X(h)
SX =
e(f(nk+h)),
(M-h)/ksns(M2-h)/k and the congruence family has the parameter M replaced by [M/k]. h=0
For larger k we can turn the Second Spacing Problem around, fixing two
minor arcs I(a/q) and I(a'/q') independently of h, and asking for how many pairs h and h' can coincidence occur between the two minor arcs. Another treatment is to consider EIShl as a two-dimensional exponential sum, and to apply a two-dimensional form of the differencing step, Lemma 5.6.2. For a family of the form (17.3.2), the parameter h disappears after the differencing step. The same happens for the sums (17.3.1), apart from edge effects, if k is small compared with the differencing parameter.
17.4 SUMS WITH T LARGE
Our method for estimating exponential sums works best when T is close to $M^2$ in order of magnitude. In the double sum, which we shall consider next, M is a length and T has the dimensions of area, so it is natural that T should be about $M^2$. In the simple exponential sum, T can take any size. The rounding error sums in numerical integration correspond to lattice point problems with T larger than $M^2$. For $T > M^2$ we may need more terms of the Taylor expansion of F(x). The problem becomes more delicate, and the saving that we get is smaller. Sums with T in size ranges between M and $M^2$ are also delicate. We use Poisson summation (Lemma 5.4.3), transforming them to sums of a length M', which is small compared with T, and obtain a saving using the bounds for sums with T large.
Our main tool is the differencing step of Lemma 5.6.2, used repeatedly. An analogue of the reduction step of Chapter 3 with r > 1 would be very useful. For any function g(x) we write
$$\Delta(d)g(x) = g(x + d) - g(x - d).$$
Lemma 5.6.2 expresses the exponential sum S of (17.1.1) formed with f(x) = TF(x/M) in terms of a weighted average of exponential sums formed with the function $\Delta(d)f(x)$. Unfortunately these include the `diagonal terms' with d = 0, in which every term is 1. The estimate for S must be at least as big as $M/\sqrt{D}$ in order of magnitude, where D is the largest value of the differencing parameter d. Let E be the average size of the upper bound for the non-diagonal terms with $d \ne 0$. Lemma 5.6.2 gives
$$S \ll \frac{M}{\sqrt{D}} + \sqrt{EM}.$$
We iterate by differencing repeatedly with $D = D_1, D_2, \ldots$, so
$$S \ll \frac{M}{\sqrt{D_1}} + \sqrt{E_1M}, \qquad E_1 \ll \frac{M}{\sqrt{D_2}} + \sqrt{E_2M},$$
where $E_2$ is the average size of the exponential sums formed with $\Delta(d_1)\Delta(d_2)f(x)$, and so on. Clearly we choose parameters so that
$$D_r \asymp M/E_r, \qquad D_{r+1} \asymp D_r^2. \tag{17.4.1}$$
In an R-step iteration of the form (17.4.1), we see from Theorem 6.1.1 that $E_R \gg \sqrt{M}$, so $D_R \ll \sqrt{M}$. In Theorem 17.2.2 the root mean fifth power bound is always $\gg M^{1/2}T^{47/330} \gg M^{157/220}$
in the non-trivial range $M \ll T^{2/3}$. For $M \gg T^{2/3}$, the bound is $\gg M^{1/2}T^{1/6} \gg M^{2/3}$.
Hence DR Z M113, and the estimate for the original sum using R differencing steps must be >> Ml- 1/3X28+'
For T large, we take R large, so we are far from the root mean square size
V.
Lemma 17.4.1 (Smoothing the differences) Let D, L, Q, and R be positive integers. Let A be a set of integers lying between L and 2L - 1. Suppose that for each 1 in A we are given R positive integers d1(l),... , dR(l) with
d1(l)d2(l) ... dR(l) =1, d1(l) + +dR(l)
If (r)(x)I S COT/Mr,
where CO is a constant, for 1 5 x 5 2, r< Q + R + 1. Then there is a function f(x, y), with the second derivative in y existing piecewise, with
f I X, L
-1) = 0(dl(l))... 0(dR(l))f(x)
for 1 in A, whilst a1 2f
(x, y) = 2RLa2(y + I) f ([t+r)(x) + 0
( 2 RDLT ) 11
MR+r+1
(17.4.2)
forr
$$\Delta(d_1)\cdots\Delta(d_R)f(x) = \int_{-d_R}^{d_R}\cdots\int_{-d_1}^{d_1} f^{(R)}(x + t_1 + \cdots + t_R)\,dt_1\cdots dt_R = 2^Rl\,f^{(R)}(x + \tau) = 2^Rl\,f^{(R)}(x) + O\!\left(2^RlDT/M^{R+1}\right)$$
for some $\tau$ with $|\tau| \le D$, and similarly for higher derivatives of f(x).
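The relation between the iterated symmetric difference and the R-th derivative can be seen exactly on polynomials. The sketch below is illustrative only (the helpers `difference` and `iterated_difference` are hypothetical names): for a cubic, two differencing steps give 2² d₁d₂ f''(x) with no error, while for a quartic the result equals 2² d₁d₂ f''(x + τ) for some shift τ bounded by d₁ + d₂.

```python
def difference(d, g):
    """Return the function Δ(d)g : x -> g(x + d) - g(x - d)."""
    return lambda x: g(x + d) - g(x - d)


def iterated_difference(f, ds):
    """Apply Δ(d) for each d in ds; the operators commute, so the order is immaterial."""
    g = f
    for d in ds:
        g = difference(d, g)
    return g


def cube(x):
    return x ** 3


def quartic(x):
    return x ** 4


# Cubic, R = 2: Δ(d1)Δ(d2)f(x) = 2^2 * d1 * d2 * f''(x) exactly, with f''(x) = 6x.
d1, d2, x = 2, 3, 5
assert iterated_difference(cube, [d1, d2])(x) == 4 * d1 * d2 * (6 * x)

# Cubic, R = 3: the third derivative is the constant 6, so the result is 2^3 * l * 6.
assert iterated_difference(cube, [1, 2, 3])(10) == 8 * (1 * 2 * 3) * 6

# Quartic, R = 2: the value is 4 * d1 * d2 * f''(x + tau) with |tau| <= d1 + d2.
d1, d2, x = 1, 2, 3
value = iterated_difference(quartic, [d1, d2])(x)
lower = 4 * d1 * d2 * 12 * (x - d1 - d2) ** 2
upper = 4 * d1 * d2 * 12 * (x + d1 + d2) ** 2
assert lower <= value <= upper
```

Integer arguments keep the arithmetic exact, so the equalities hold without any floating-point tolerance.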
We define f(x, y) on each interval
(1-L- 2)/Lsys(l-L+ 2)/L for 1= L, ... , 2L. If I is not in the set A, then
f(x,y) = 2R1(y+ 1)f (')(x). If 1 lies in the set A, then f (x, y) = 2R1(y + 1)f (R)(x)
+
L(y + 1) 1
(1- 4(Ly +L -1)2)z
X (0(d1) ... A(dR)f (x) - 2RIf (R)(x)) = 2R1(y + 1)f(R)(x) + 0(2RLDT/MR+1), and 02 f (x, y) = 2RLf (r)(x) + O(2RLDT/Mr+1)a2f(x,y)=O(2RLDT/MR+1)
Theorem 17.4.2 Let R be a positive integer. Let F(x) be a real function R + 6 times continuously differentiable for 1 <x S 2, with F(r)(x) 96 0
(17.4.3)
forr=R+2, R+3, d2
dx2
lo g F(r)
(17.4.4)
0
forr=R+2, 3(F(R+3))2 - F(R+2)F(R+4) 0 0, and
3(F(R+3))2 + 4F(R+2)F(R+4)
F(R+4) F(R+5)
3F (R + 2)F(R + 3)
(F(R+2))2
F(R+3)
F(R+2)
F(R+4)
F(R+3)
0.
I
I
Let the sum $S$ be defined as in Theorem 17.1.4, with $M$ sufficiently large. Let $Q = 2^R$. In the range $M \le N_6$ we also require (17.4.3) for $r = R + 4$ and (17.4.4) for $r = R + 3$. Then for any $\varepsilon > 0$ we have
$$S \ll M^{(86Q-13R-65)/(86Q-26)}\,T^{13/(86Q-26)}\,(\log T)^{3(5R^2+6)/(43Q-13)}$$
for $N_1 < M \le \min(N_2, N_3)$, and also for $M \le N_1$ when we replace the power of $\log T$ by $T^\varepsilon$,
$$S \ll M^{(428Q-49R-263)/(428Q-98)}\,T^{49/(428Q-98)}\,(\log T)^{33(5R^2+6)/(428Q-98)}$$
for $N_2 \le M \le N_4$, and also for $N_8 \le M \le N_{10}$ if either of these ranges is non-empty,
$$S \ll M^{(124Q-11R-46)/(124Q-4)}\,T^{11/(124Q-4)}\,(\log T)^{3(5R^2+6)/(31Q-1)}$$
for $\max(N_3, N_4) \le M \le N_5$,
$$S \ll M^{(712Q-89R-427)/(712Q-142)}\,T^{89/(712Q-142)}\,(\log T)^{57(5R^2+6)/(1424Q-284)}$$
for $N_5 \le M \le N_7$,
$$S \ll M^{(160Q-29R-118)/40(4Q-1)}\,T^{29/40(4Q-1)}\,(\log T)^{12(5R^2+6)/40(4Q-1)}$$
for $N_7 \le M \le \min(N_8, N_9)$,
$$S \ll M^{(68Q-4R-29)/(68Q-8)}\,T^{1/(17Q-2)}\,(\log T)^{3(5R^2+6)/(34Q-4)}$$
for $\max(N_9, N_{10}) \le M \le N_{11}$, and also for $M \ge N_{11}$ when we replace the logarithm power by $T^\varepsilon$. The ranges are defined by $N_1^{R+3-6\varepsilon} = T$
,
N25QR+139Q+26 = T75Q(logT)-15(5 R2+6)(Q-1) N337QR + 68Q + 13R + 52 = T37Q+13(log T)-16(5R2+6)(Q-1) N476QR + 138Q + 49R + 192 = T76Q+49(log T)-58(5R2+6XQ- 1)
N578QR+302Q+67R+268 = T178Q+67(log
T)-82(5R2+6)(Q-1)+356Q-71
r
N6267QR + 427Q + 18R + 143 = T267Q+18(log T)-57(5R2+6XQ-1)
T356Q - 31(logT)- 32(5R2 + 6XQ - 1) + 356Q - 71 N8254QR+388Q-49R-58 =
T 254Q-49 (log T) - 8(5R2 + 6)(Q - 1)
N937QR+54Q-2R+6 = T37Q-2(log
T)-4(5R2+6)(Q-1)
r
N 900QR + 124 Q + 41 =
T90Q(logT)-18(582+6XQ-1)
N2QR+2Q+4Qe = T2Q 11
The implied constants in the upper bounds are constructed from R and the derivatives of the function F(x), and, where appropriate, from E.
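The thresholds $N_i$ mark the values of $M$ at which two consecutive bounds of the theorem agree. As a hedged cross-check (ignoring the logarithm powers, and assuming only the $M$- and $T$-exponents printed in the first four bounds), the crossover exponents can be recomputed with exact rational arithmetic; the function names are hypothetical.

```python
from fractions import Fraction as Fr


def bound_exponents(k, R):
    """(M-exponent, T-exponent) of the k-th bound of Theorem 17.4.2, Q = 2**R.
    Only the first four bounds are tabulated here; log powers are ignored."""
    Q = 2 ** R
    table = {
        1: (Fr(86 * Q - 13 * R - 65, 86 * Q - 26), Fr(13, 86 * Q - 26)),
        2: (Fr(428 * Q - 49 * R - 263, 428 * Q - 98), Fr(49, 428 * Q - 98)),
        3: (Fr(124 * Q - 11 * R - 46, 124 * Q - 4), Fr(11, 124 * Q - 4)),
        4: (Fr(712 * Q - 89 * R - 427, 712 * Q - 142), Fr(89, 712 * Q - 142)),
    }
    return table[k]


def crossover_alpha(j, k, R):
    """alpha with M = T**alpha at which bounds j and k have equal size."""
    (a1, b1), (a2, b2) = bound_exponents(j, R), bound_exponents(k, R)
    return (b2 - b1) / (a1 - a2)


for R in (1, 2, 3):
    Q = 2 ** R
    # Bounds 1 and 2 meet where M**(75QR + 139Q + 26) = T**(75Q).
    print(R, crossover_alpha(1, 2, R), Fr(75 * Q, 75 * Q * R + 139 * Q + 26))
    # Bounds 3 and 4 meet where M**(178QR + 302Q + 67R + 268) = T**(178Q + 67).
    print(R, crossover_alpha(3, 4, R),
          Fr(178 * Q + 67, 178 * Q * R + 302 * Q + 67 * R + 268))
```

For $R = 1$ the second crossover is $423/1295$, which is one of the vertices listed in Table 17.1.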
Proof This is the $R$-step iteration with the parameters satisfying (17.4.1). We want an upper bound for the sum over $d_r \le D_r$, $r = 1, 2, \ldots, R$, of the exponential sums over subintervals of $M \le m < 2M$ formed with the functions $\Delta(d_1) \cdots \Delta(d_R)f(x)$. We write $l$ for the product $l = d_1d_2 \cdots d_R$, and let $S_l$ be the sum of largest modulus among the exponential sums with $d_1d_2 \cdots d_R = l$. The average $E_R$ in (17.4.1) satisfies
$$E_R \le \frac{1}{D_1D_2 \cdots D_R}\sum_{l=1}^{D_1D_2 \cdots D_R} d_R(l)\,|S_l|.$$
We use Hölder's inequality to get rid of the divisor function $d_R(l)$:
$$\sum_{l=L}^{2L-1} d_R(l)\,|S_l| \le \Bigl(\sum (d_R(l))^{5/4}\Bigr)^{4/5}\Bigl(\sum |S_l|^5\Bigr)^{1/5}$$
$$\le \Bigl(L^{3/8}\Bigl(\sum d_R^2(l)\Bigr)^{5/8}\Bigr)^{4/5}\Bigl(\sum |S_l|^5\Bigr)^{1/5} \ll L^{4/5}(\log L)^{(R^2-1)/2}\Bigl(\sum |S_l|^5\Bigr)^{1/5},$$
where we have used the bounds for divisor functions in Lemmas 11.1.2 and 11.1.3. The power of Log L can be reduced for the important range where 1 is close to D1D2 ... DR by using the fact that d1,..., dR lie in restricted ranges. The divisor function dR(l) can be replaced by a small multiple of the divisor concentration function 1 R(l), whose averages were estimated in Lemma 11.1.4.
We have now set up the hypotheses of Lemma 17.4.1 with $A$ the set of integers $l$ between $L$ and $2L - 1$ which occur as products $d_1d_2 \cdots d_R$. We can suppose that $D_1, \ldots, D_R$ are powers of 2 with $D_{r+1} = D_r^2$. Then $L \le D_1D_2 \cdots D_R = D_R^2/D_1$, so that the error term in Lemma 17.4.1 is negligible, provided that $D_1$ is sufficiently large. For Theorem 17.2.2 we need to know accurately the derivatives of $f(x, y)$ of order $r \le 5$, and an order-of-magnitude bound for a higher derivative of $f(x, y)$. This corresponds to taking $Q = 5$ in Lemma 17.4.1.
We apply Theorem 17.2.2 with $I = J = L$, and $y_l = l/L - 1$, and $f(x, y_l) = T'F(x/M, y_l)$, where $T' = LT/M^R$. We run $L$ through the powers of 2. The largest value of $L$ should dominate the sum: from the structure of the bounds in Theorem 17.2.2, we see that the largest value of $L$ dominates except possibly when the order of magnitude increases as we pass from (17.2.18) to (17.2.21), and when the extra factor $T'^{\varepsilon}$ appears for $M \gg T'^{2/3-\varepsilon}$. Since we do not intend to optimize the logarithm powers, we can remove the first discontinuity by replacing the fifth power of $\log T$ in (17.2.18) by the 11/2-th power. The discontinuity at $T'^{2/3-\varepsilon}$ can be overcome by using Lemma 17.1.5 for each value of $y_l$, to compare with sums of length $M'$, where $M'$ is smaller than $T'^{2/3-\varepsilon}$, but so large that $M'^2/T'^{4/3}$ is still the dominant term in (17.2.22). We can use a trivial comparison argument because there is no saving on the sum over the parameter in this range.
First we consider the case when (17.2.17) holds. Then $L < T'^{8/33}$, so the first term dominates in (17.2.18), and
$$\sum |S_l|^5 \ll L^{16/19}M^{5/2}T'^{89/114}(\log T)^{11/2}.$$
In the notation of (17.4.1),
$$E_R \ll M^{1/2}L^{-18/570}T'^{89/570}(\log T)^{(5R^2+6)/10},$$
and the largest value of $L$ satisfies
$$L^{Q/(2Q-2)} \asymp D_R \asymp M^{1/2}L^{18/570}T'^{-89/570}(\log T)^{-(5R^2+6)/10},$$
giving
$$\sqrt{D_1} \asymp D_R^{1/Q} \asymp M^{(89R+285)/(712Q-142)}\,T^{-89/(712Q-142)}\,(\log T)^{-57(5R^2+6)/(1424Q-284)}.$$
The conditions (17.2.17) correspond to $N_5 \le M \le N_7$, and the bound in this range is $S \ll M/\sqrt{D_1}$. For the largest value of $L$, the condition $M^2 \ll T'$ corresponds to $M \le N_6$. For smaller values of $M$, Case 1 will hold for large $L$ and Case 2 for small $L$. Hence we always need the hypotheses of Case 2 of Theorem 17.2.2. There are similar calculations in the other six cases. □
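The exponents of $M$ and $T$ in $\sqrt{D_1}$ follow from the relations in this case by elimination of $L$. The sketch below redoes that elimination with exact fractions, assuming $T' = LT/M^R$, $L \asymp D_R^{(2Q-2)/Q}$ and $\sqrt{D_1} = D_R^{1/Q}$ as in the proof; logarithm powers are deliberately ignored and the function name is hypothetical.

```python
from fractions import Fraction as Fr


def sqrt_D1_exponents(R):
    """Exponents (of M, of T) in sqrt(D_1), derived from
    D_R = M^(1/2) L^(18/570) T'^(-89/570) with T' = L T / M^R."""
    Q = 2 ** R
    m = Fr(1, 2) + Fr(89 * R, 570)       # M-exponent after expanding T'
    l = Fr(18, 570) - Fr(89, 570)        # L-exponent after expanding T'
    t = -Fr(89, 570)                     # T-exponent
    # Substitute L = D_R^((2Q-2)/Q) and collect the powers of D_R.
    scale = 1 - l * Fr(2 * Q - 2, Q)
    # sqrt(D_1) = D_R^(1/Q)
    return m / scale / Q, t / scale / Q


for R in (1, 2, 3):
    Q = 2 ** R
    print(R, sqrt_D1_exponents(R),
          (Fr(89 * R + 285, 712 * Q - 142), Fr(-89, 712 * Q - 142)))
```

The two printed pairs agree for each $R$, which is how the denominator $712Q - 142$ arises.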
FIG. 17.1
Table 17.1 Bounds for exponential sums

Steps    Exponent $\theta$          From $\alpha$ =      To $\alpha$ =

C        $(4 + 39\alpha)/60$        7/12                 517/873 = 0.5922...
         $(29 + 42\alpha)/120$      65/114               7/12 = 0.5833...
         $(89 + 285\alpha)/570$     49/114               65/114 = 0.5702...
         $(11 + 78\alpha)/120$      5/12                 49/114 = 0.4298...
         $(13 + 21\alpha)/60$       356/873              5/12 = 0.4167...

AC       $(4 + 103\alpha)/128$      12/31                356/873 = 0.4078...
         $(29 + 173\alpha)/280$     227/601              12/31 = 0.3871...
         $(89 + 908\alpha)/1282$    423/1295             227/601 = 0.3777...
         $(11 + 191\alpha)/244$     87/275               423/1295 = 0.3266...
         $(13 + 94\alpha)/146$      1424/4747            87/275 = 0.3164...

A^2C     $(4 + 235\alpha)/264$      120/419              1424/4747 = 0.3000...
         $(49 + 1351\alpha)/1614$   967/3428             120/419 = 0.2864...
         $(29 + 464\alpha)/600$     199/716              967/3428 = 0.2821...
         $(89 + 2243\alpha)/2706$   19/74                199/716 = 0.2779...
         $(11 + 428\alpha)/492$     161/646              19/74 = 0.2568...
         $(13 + 253\alpha)/318$     2848/12173           161/646 = 0.2492...
If $F'(x)$ is a monomial function in the sense of Chapter 3, with monomial part $x^{-s}$ for some $s > 0$, then the conditions of Theorems 17.1.4 and 17.4.2 hold for each $R$. We write $M = T^\alpha$. The bounds take the form $S \ll T^\theta(\log T)^\gamma$ for some $\gamma < 1$. The implied constant depends on the bounds for the derivatives of $F(x)$, but not on $\alpha$. In the first column of Table 17.1, C indicates Theorem 17.1.4, and $A^RC$ indicates Theorem 17.4.2 with $R$ differencing steps.
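The critical values of $\alpha$ inside one bracketed group of Table 17.1 are the points where two consecutive exponents $\theta = (a + b\alpha)/c$ agree. A minimal sketch of that computation follows. It assumes the exponents of five of the bounds of Theorem 17.4.2, taken in the order in which they appear in the table, and it treats the rows marked C as the formal case $R = 0$, $Q = 1$ of the same formulas (an assumption made for illustration, not a statement from the text). Function names are hypothetical.

```python
from fractions import Fraction as Fr


def table_rows(R):
    """(T-exponent, M-exponent) for five bounds, in table order, with Q = 2**R."""
    Q = 2 ** R
    raw = [
        (4, 68 * Q - 4 * R - 29, 68 * Q - 8),
        (29, 160 * Q - 29 * R - 118, 40 * (4 * Q - 1)),
        (89, 712 * Q - 89 * R - 427, 712 * Q - 142),
        (11, 124 * Q - 11 * R - 46, 124 * Q - 4),
        (13, 86 * Q - 13 * R - 65, 86 * Q - 26),
    ]
    return [(Fr(a, c), Fr(b, c)) for a, b, c in raw]


def vertices(R):
    """Values of alpha at which consecutive rows give the same exponent theta."""
    rows = table_rows(R)
    return [(t2 - t1) / (m1 - m2) for (t1, m1), (t2, m2) in zip(rows, rows[1:])]


print(vertices(0))  # 7/12, 65/114, 49/114, 5/12
print(vertices(1))  # 12/31, 227/601, 423/1295, 87/275
```

The printed fractions are the interior entries of the last two columns for the C and AC groups, so the graph of $\theta$ against $\alpha$ is continuous across each of these changes of dominant term.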
Rows of the table bracketed together indicate an upper bound with several terms. The last two columns give the range of $\alpha$ for which each term is used. There are two types of critical values of $\alpha$. Within a set of bracketed rows we meet values of $\alpha$ at which one term takes over from another as dominant term. These correspond to re-entrant (concave) vertices of the piecewise linear polygonal graph of $\theta$ against $\alpha$. Between two unbracketed rows we meet values of $\alpha$ at which the argument giving the best estimate changes. These correspond to convex vertices of the polygonal graph. Figure 17.1 (not to scale) illustrates the portion of the graph corresponding to $A^RC$. The abscissae $\alpha_i$ are given by $\alpha_i = (\log N_i)/(\log T)$. For values of $\alpha > 1/2$ not in the table we apply step B first. In Section 19.2 we sketch an iterative idea which cuts away some of the corners on the graph.

In many applications of exponential sums we have to consider different order-of-magnitude ranges for $M$ and $T$. The most important ranges are those in which $\alpha = (\log M)/(\log T)$ takes one of the values in the last column of Table 17.1 which is a convex vertex of the graph, and which lies above the lines joining other pairs of convex vertices. Examples are 49/114, 356/873, and 1424/4747; we call these the outstanding vertices. Joining consecutive outstanding vertices gives another polygonal line. Each side of this new polygon corresponds to a bound of the form
$$S \ll T^\beta \log T, \qquad \beta = (a + b\alpha)/c, \qquad (17.4.5)$$
which is valid for all $\alpha > 0$ (even for $\alpha \ge 1$). The classical notation states bounds as
$$S \ll (T/M)^\kappa M^\lambda, \qquad (17.4.6)$$
where the numbers $(\kappa, \lambda)$ form the exponent pair as in Lemma 7.3.4.

Table 17.2 Outstanding vertices

$\alpha$                                                      $\beta$
$(712Q(R - 1) + 865Q + 338)/(712QR + 865Q + 338)$             $(712QR + 865Q)/(1424QR + 1730Q + 676)$
$65/114$                                                      $503/1140$
$49/114$                                                      $141/380$
$712Q/(712QR + 865Q + 338)$                                   $(712Q - 169)/(712QR + 865Q + 338)$
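Table 17.3 Exponent pairs

$\kappa - \varepsilon$                                                        $\lambda - \varepsilon$
$\frac12 - \frac{169}{1424Q - 338} \cdot \frac{712R + 1577}{712}$             $\frac12 + \frac{169}{1424Q - 338}$
$7577/43860$                                                                  $28229/43860$
$89/570$                                                                      $187/285$
$6299/43860$                                                                  $29507/43860$
$\frac{169}{1424Q - 338}$                                                     $1 - \frac{169}{1424Q - 338} \cdot \frac{712R + 1577}{712}$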
Because of the logarithm, (17.4.5) corresponds to $\kappa = a/c + \varepsilon$, $\lambda = (a + b)/c + \varepsilon$, for any $\varepsilon > 0$. The integer $R$ in Tables 17.2 and 17.3 takes the values 1, 2, 3, ..., with $Q = 2^R$.
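The passage from two outstanding vertices to an exponent pair is a two-point line fit: writing $M = T^\alpha$ in (17.4.6) gives $\beta = \kappa + (\lambda - \kappa)\alpha$. The sketch below carries this out with exact fractions, using the vertex formula $\alpha = 712Q/(712QR + 865Q + 338)$, $\beta = (712Q - 169)/(712QR + 865Q + 338)$ from Table 17.2; the function names are hypothetical and the $\varepsilon$ from the logarithm is ignored.

```python
from fractions import Fraction as Fr


def vertex(R):
    """Outstanding vertex (alpha, beta) for R differencing steps, Q = 2**R."""
    Q = 2 ** R
    D = 712 * Q * R + 865 * Q + 338
    return Fr(712 * Q, D), Fr(712 * Q - 169, D)


def pair_through(p, q):
    """Exponent pair (kappa, lambda) of the line beta = kappa + (lambda - kappa) * alpha
    through the two points p and q."""
    (a1, b1), (a2, b2) = p, q
    slope = (b1 - b2) / (a1 - a2)          # lambda - kappa
    kappa = b1 - slope * a1
    return kappa, kappa + slope


v_low, v_high = (Fr(49, 114), Fr(141, 380)), (Fr(65, 114), Fr(503, 1140))
print(pair_through(v_low, v_high))         # (89/570, 187/285)
print(pair_through(vertex(1), v_low))      # (6299/43860, 29507/43860)
print(pair_through(vertex(1), vertex(2)))  # kappa = 169/(1424Q - 338) with Q = 2
```

The last line reproduces the general row of Table 17.3 for $R = 1$.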
18 Lattice points and area

18.1 EXPONENTIAL SUMS

The analytic method for estimating the discrepancy between the number of lattice points inside a closed curve and its area involves dividing the curve into five or more pieces, taking local coordinates $y = f(x)$ on each piece, and estimating the contribution $\sum \rho(f(m))$ that each piece of the curve makes to the discrepancy. The classical estimate is Corollary 2 to Lemma 5.4.3. We use the
Fourier expansion of Lemma 5.3.1, which leads to the sums considered in Chapter 8. Our first task is to use the estimates for the First Spacing Problem in Chapter 13 and the Second Spacing Problem in Chapter 16 to complete
the estimation of the exponential sum, and deduce the bounds for the rounding error sum. As in the last section, we require that certain derivatives and determinants of derivatives do not vanish. We shall show that the local coordinate systems can be chosen to satisfy these conditions. In Chapter 8 the sum is divided into Farey arcs. The arc $I(a/q)$ corresponds to a side of the Voronoi-Sierpinski polygon with the same gradient $a/q$. The saving over Lemma 5.4.3 comes from the large sieve and good estimates in the First and Second Spacing Problems. The three different approaches to the discrepancy of a closed curve in Chapters 2 and 6, and in this chapter, all analyse the curve in terms of tangents in rational directions. We obtain the results of Huxley (1993b), and new rounding error estimates improving those of Huxley (1991). Theorem 18.2.3 corresponds to the result of Iwaniec and Mozzochi (1988).
Lemma 18.1.1 (Choosing parameter sizes) Suppose that the conditions of Lemma 14.3.1 hold. In the Second Spacing Problem we have BV << Al 02 &4/3PZR2 V
(18.1.1)
U= (H/B1Q)2/3
(18.1.2)
with
for some sufficiently large constant B1, in the following cases.
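Case (a) We choose $V = U^{3/2}$, for a choice of $N$ and $R$ in the ranges
$$N \asymp \frac{M}{T}\left(\frac{M^{22}T^{18}}{H^{22}}\right)^{1/35}, \qquad R \asymp M\left(\frac{H^{11}}{M^{11}T^9}\right)^{1/35}, \qquad (18.1.3)$$
provided that
$$MT^{-3/8} \ll H \le B_2^{-1}MT^{-17/57}, \qquad (18.1.4)$$
and
$$H \gg \max(T^4/M^9, M^{11}/T^6). \qquad (18.1.5)$$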
Case (b) For $M \le B_3T^{1/2}$, we choose $V$ so that $\Delta_1Q^2 \asymp 1$. There are two subcases. If
$$H \ge T^{11/8}/(B_2M^3), \qquad (18.1.6)$$
then (18.1.1) holds for a choice of $N$ and $R$ in the ranges
$$N \asymp M/(H^5MT)^{1/7}, \qquad R \asymp M(H^5M/T^6)^{1/14}, \qquad (18.1.7)$$
provided that
$$M^{5/3}/T^{2/3} \ll H \le M^{1/2}/(B_2T^{1/12}). \qquad (18.1.8)$$
If (18.1.6) is reversed, then (18.1.1) holds for a choice of $N$ and $R$ in the ranges
$$N \asymp M^2/(HT^2)^{1/3}, \qquad R \asymp (HM^3/T)^{1/6}, \qquad (18.1.9)$$
provided that both
$$M^{3/5}/T^{1/5} \ll H \le M^{3/2}/(B_2T^{1/2}), \qquad (18.1.10)$$
and
$$H \le T^4/(B_2M^9). \qquad (18.1.11)$$
Case (c) For $M \ge B_3^{-1}T^{1/2}$, we choose $V$ so that $\Delta_1P^2 \asymp 1$. There are two subcases. If
$$H \ge B_2M^5/T^{21/8}, \qquad (18.1.12)$$
then (18.1.1) holds for a choice of $N$ and $R$ in the ranges
$$N \asymp (M^{18}/H^5T^7)^{1/7}, \qquad R \asymp (H^5M^3)^{1/14}, \qquad (18.1.13)$$
provided that
$$M^{1/3} \ll H \le M^{3/2}/(B_2T^{7/12}). \qquad (18.1.14)$$
If (18.1.12) is reversed, then (18.1.1) holds for a choice of $N$ and $R$ in the ranges
$$N \asymp (M^2/H)^{1/3}, \qquad R \asymp (HM^7/T^3)^{1/6}, \qquad (18.1.15)$$
provided that both
$$M^{7/5}/T^{3/5} \ll H \le M^{1/2}/B_2, \qquad (18.1.16)$$
and
$$H \le M^{11}/(B_2T^6). \qquad (18.1.17)$$
The parameter B3 may be chosen to be any positive real number. The constant B2 and the implied constant in (18.1.1) are constructed from the derivatives of F(x) and from B3. When Modification 2 is used, then we replace R2 in (18.1.1) by Q2.
Proof We substitute the values of IX; from Lemma 14.1.1:
_
A1 .
_
R4
A 2 ..
HNQ2V
R2
03 x
HN
R2
HQ
04 x
,
Q
H
The choice (18.1.2) of U is valid in (16.1.6) of Lemma 16.1.3 if H2N3 << MR4
(18.1.18)
(which implies (8.1.5)) and if the constant B in Lemma 16.1.3 is sufficiently large.
The last term in (16.1.7) of Lemma 16.1.3 dominates the other terms with V= U3"2, if 1
(18.1.19)
O2 04/3PQ x MR4Q513/H1113N3,
1 << 0 1 m in (P2 , Q 2)
« H Q mi n ( N
l
Z
R4
,
it .
(18 . 1 . 20)
The choice (18.1.3) is extremal in (18.1.19), and it satisfies (18.1.18). The conditions (18.1.5) imply (18.1.20). With V chosen so that O1Q2 x 1, the last term in (16.1.9) of Lemma 16.1.3
dominates the first term if
1«L1202/3P/QxMQ2"3/H513N2.
(18.1.21)
The extremal choice of N and R is (18.1.7), satisfying V >_ 1 when (18.1.6) holds. When (18.1.6) is false, then we choose V= 1, which puts N and R into the ranges (18.1.9), satisfying (18.1.21), and also satisfying (18.1.18) in the range (18.1.11). With V chosen so that A1P2 x 1, the last term in (16.1.10) of Lemma 16.1.3
dominates the first term if 1 « L)2 24 3Q/P X Q213R4/H513M.
(18.1.22)
The extremal choice of N and R is (18.1.13), and V>_ I when (18.1.12) holds.
When (18.1.12) is false, then we choose V= 1, which puts N and R into ranges (18.1.15), satisfying (18.1.22), and also (18.1.18) in the range (18.1.17). We have to satisfy (8.1.4)-(8.1.7), so that
R :s- H5 N/64C,
(18.1.23)
R >- 2C H _+1
(18.1.24)
,
N<M/10.
(18.1.25)
The requirement (18.1.23) gives the conditions (18.1.4), (18.1.8), (18.1.10), (18.1.14), and (18.1.16). We find that, up to a constant factor, (18.1.24) follows from (18.1.23) and the condition A1Q2 >> 1 which our construction always satisfies. The constant B2 is chosen large enough for (18.1.24) to hold with
the right constant of proportionality. Also (18.1.25) follows easily from (18.1.18) if $M$ and $T$ are sufficiently large. The most difficult verification is (18.1.18) for the choice (18.1.7), where we use both lower bounds (18.1.6) and (18.1.10) in different ranges. Similarly, both (18.1.12) and (18.1.14) are used to verify (18.1.18) for the range (18.1.13). □

Theorem 18.1.2 Let $F(x)$ be a real function three times continuously differentiable for $1 \le x \le 2$, and let $g(x)$, $G(x)$ be bounded functions of bounded variation on $1 \le x \le 2$. Let $C_0, \ldots, C_6$ be real numbers $\ge 1$. Let $H$, $M$, and $T$ be large parameters. Suppose that
$$|F^{(r)}(x)| \le C_r \qquad (18.1.26)$$
for $r = 1, 2, 3$,
$$|F^{(r)}(x)| \ge 1/C_r \qquad (18.1.27)$$
for $r = 1, 2$, and that either Case 1 or Case 2 holds:

Case 1 $M \le C_0T^{1/2}$ and (18.1.27) holds for $r = 3$ also,

Case 2 $M \ge C_0^{-1}T^{1/2}$ and $|F'F^{(3)} - 3F''^2| \ge 1/C_4$.

Let $S$ denote the sum
$$S = \sum_{H \le h \le 2H} g\!\left(\frac{h}{H}\right) \sum_{M \le m \le 2M} G\!\left(\frac{m}{M}\right) e\!\left(\frac{hT}{M}F\!\left(\frac{m}{M}\right)\right).$$
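Then there are positive constants $B_1$ and $B_2$, depending on $C_0, \ldots, C_6$, and on the functions $g(x)$ and $G(x)$, such that for $M$ in the range
$$C_5^{-1}T^{49/114} \le M \le C_5T^{65/114}$$
and $H$ satisfying
$$H \le B_1MT^{-17/57},$$
and also
$$H \ge C_6^{-1}T^4/M^9 \ \text{ for } M \le C_5^{-1}T^{7/16}, \qquad H \ge C_6^{-1}M^{11}/T^6 \ \text{ for } M \ge C_5T^{9/16},$$
we have
$$S \le B_2H\left(\frac{H}{M}\right)^{3/70}T^{23/70}(\log T)^{9/4}. \qquad (18.1.28)$$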
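In the range
$$C_5^{-1}T^{1/3} \le M \le C_5T^{7/16}, \qquad H \le \min(B_1M^{1/2}T^{-1/12},\ B_1M^{3/2}T^{-1/2},\ C_6T^4M^{-9}),$$
we have
$$S \le B_2H^{15/14}M^{3/14}T^{3/14}(\log T)^{9/4} + B_2H^{17/18}M^{-1/6}T^{7/18}(\log T)^{9/4}. \qquad (18.1.29)$$
In the range
$$C_5^{-1}T^{9/16} \le M \le C_5T^{2/3}, \qquad H \le \min(B_1M^{3/2}T^{-7/12},\ B_1M^{1/2},\ C_6M^{11}T^{-6}),$$
we have
$$S \le B_2H^{15/14}M^{-5/14}T^{1/2}(\log T)^{9/4} + B_2H^{17/18}M^{5/18}T^{1/6}(\log T)^{9/4}. \qquad (18.1.30)$$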
Proof We substitute into Lemma 8.4.4 and perform the summation over Q. In Chapter 13 we have 71 x R2/HN, 77K x Q/H, so Theorem 13.2.4 gives A << K2L2(1 + 77K)log K << K2L2 log K.
When Modification 2 applies, then Lemma 18.1.1 gives A2 04'3P2Q2V <<
BV
Q 2/3
M2Q2
2 (T:) KLNR 2
2
,
2
and Lemma 8.4.4 gives
E (E F, e(hg,(n)) j
h
n
QW <<
R
(HN) log N + Q
Q
HM
R
(HN)
<< -
1/4
R2 ( ABKL4NR2W 2
log N+
)
Q3
Q )1/3 HM(log N)9"4 (HN3R2)1
6
log2 N,
The sum over j counts each minor arc with Q 5 q < 2Q several (>> Q2/R2) times. When we allow for this, then the sum over ranges Q = 2" converges, and the range Q R dominates. The major arcs contribution from Lemma 8.2.1 is smaller, so
HM(lo N)9"4 S <<
g
(HN3R2
R
1/6
2/3 V
H)
)914,
(log M
(18.1.31)
and we save a factor (R/H)2"3 (up to a logarithm power) over the easy estimate of Lemma 8.2.2.
There is a useful methodological comment here. If all the conditions are satisfied except R5 H, then the bound (18.1.31) is weaker than Lemma 8.2.2. However, for small H we can dispense with the whole apparatus of Chapter 8. Using Corollary 1 to Lemma 5.4.3 for each value of h gives
$$S \ll (H^3T/M)^{1/2}(1 + M^2/HT),$$
(18.1.32)
which is better than (18.1.31) when
M2/T << H<< R.
If H «M2/T, then we use Lemma 5.4.1 to approximate the sum over m by an integral, which is estimated by the First Derivative Test (Lemma 5.1.2). This gives
S <<M2/T,
(18.1.33)
which is much better than (18.1.31). Hence we can omit from the hypotheses of Theorem 18.1.2 the lower bounds on $H$ in (18.1.4), (18.1.8), (18.1.10), (18.1.14), and (18.1.16) of Lemma 18.1.1. In the statement of the theorem we have collected cases, and worked out the ranges for $M$ in which each bound is used. We have renumbered the constants $B_i$ and $C_i$. The conditions on the function $F(x)$ derive from Lemma 14.3.1. We have introduced the weight functions $g(x)$ and $G(x)$, which do not appear in Chapter 8. Because the ranges of summation for $h$ and $m$ are independent, we can remove the weight functions by applying Lemma 5.1.1 to the outer summation, interchanging orders of summation, and applying Lemma 5.1.1 again. □
18.2 INTEGER POINTS AND ROUNDING ERROR

We apply Theorem 18.1.2 to the problems of lattice points and area considered in Chapter 2, and of integer points close to a curve considered in Chapter 3. The second problem is technically simpler.

Theorem 18.2.1 Let $F(x)$ be a real function with three continuous derivatives for $1 \le x \le 2$. Let $M$ (an integer) and $T$ (a real number) be large, and let $\delta$ satisfy $0 \le \delta < 1/2$. Let $C \ge 1$ be a constant. Suppose that for $1 \le x \le 2$ either Case 1 or Case 2 holds:

Case 1 The derivatives $F''(x)$ and $F^{(3)}(x)$ are non-zero and $M \le CT^{1/2}$.

Case 2 The derivatives $F'(x)$, $F''(x)$ and the expression $F'(x)F^{(3)}(x) - 3F''(x)^2$ are non-zero, and $M \ge C^{-1}T^{1/2}$.

Let $N(\delta)$ denote the number of solutions of
$$\left\|\frac{T}{M}F\!\left(\frac{m}{M}\right)\right\| \le \delta \qquad (18.2.1)$$
with $M \le m \le 2M - 1$. Then in both cases
$$N(\delta) \ll \delta M + T^{23/73}(\log T)^{315/146} \qquad (18.2.2)$$
for
$$T^{63/146}(\log T)^{63/292} \le M \le T^{83/146}(\log T)^{-63/292}, \qquad (18.2.3)$$
and in Case 1
$$N(\delta) \ll \delta M + M^{4/15}T^{1/5}(\log T)^{21/10} + M^{-4/17}T^{7/17}(\log T)^{81/34} \qquad (18.2.4)$$
for $T^{1/3} \le M \le T^{63/146}(\log T)^{63/292}$, and in Case 2
$$N(\delta) \ll \delta M + M^{-4/15}T^{7/15}(\log T)^{21/10} + M^{4/17}T^{3/17}(\log T)^{81/34} \qquad (18.2.5)$$
for $T^{83/146}(\log T)^{-63/292} \le M \le T^{2/3}$.

The implied constants are constructed from $C$ and the range of values taken by the derivatives of the function $F(x)$.
Proof We suppose first that S is not too small, so that when we use Lemma 5.3.2 in the form
N(S)<-
D1 2M 4 +D-Re D
dM
dT
m
(1-D)Ee(MF(M)),
(18.2.6)
then
$D = [1/8\delta] + 1$ is so small that Theorem 18.1.2 can be applied to each range $H \le h < \min(D, 2H)$, where $H$ runs through the powers of 2. The bounds of Theorem 18.1.2 are increasing functions of $H$, and the order of magnitude does not change when we pass from the case of $H$ large in (18.1.28) to the case of $H$ small in (18.1.29) or (18.1.30). The estimate for the sum in (18.2.6) is therefore dominated by the largest range. We want the term $2M/D$ to dominate in (18.2.6). In the case of $H$ large we require
$$D\left(\frac{D}{M}\right)^{3/70}T^{23/70}(\log T)^{9/4} \ll M,$$
which is true for $\delta \ge \delta_0$ given by
$$\delta_0 = M^{-1}T^{23/73}(\log T)^{315/146}.$$
This choice of D is valid since 23/73 = 0.3151... > 17/57 = 0.2982... . For S 5 So we use the estimate for N(SO) instead. The largest value of H remains in the range where (18.1.28) is valid when M lies in the range (18.2.3). For H small and M small we use (18.1.29) in place of (18.1.28). The term 2M/D dominates for S >_ 51 given by 7,3 81
= min
M11
1/ 15
7.
(lo8T)21/10,
7/ 17
(M3)
(logT)81/34
For $\delta \le \delta_1$ we use the bound for $N(\delta_1)$ instead. This case gives the bound (18.2.4). The bound (18.2.5) follows similarly from (18.1.30).
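The exponents in (18.2.2), (18.2.4), and (18.2.5) all come from the same step: a bound $S \ll H^aM^bT^c(\log T)^e$ is set equal to $M$ at $H = D$, and the resulting term $M/D$ is read off. A small sketch of that arithmetic, with a hypothetical function name and with the exponents of (18.1.28) to (18.1.30) as input:

```python
from fractions import Fraction as Fr


def error_term(a, b, c, e):
    """Given S(H) << H^a M^b T^c (log T)^e, let D be the largest H with S(D) << M.
    Return the exponents (of M, of T, of log T) in M/D."""
    return 1 - (1 - b) / a, c / a, e / a


L = Fr(9, 4)
# (18.1.28): S << H (H/M)^(3/70) T^(23/70) (log T)^(9/4)
print(error_term(Fr(73, 70), Fr(-3, 70), Fr(23, 70), L))
# (18.1.29): the two terms H^(15/14) M^(3/14) T^(3/14) and H^(17/18) M^(-1/6) T^(7/18)
print(error_term(Fr(15, 14), Fr(3, 14), Fr(3, 14), L))
print(error_term(Fr(17, 18), Fr(-1, 6), Fr(7, 18), L))
# (18.1.30): the two terms H^(15/14) M^(-5/14) T^(1/2) and H^(17/18) M^(5/18) T^(1/6)
print(error_term(Fr(15, 14), Fr(-5, 14), Fr(1, 2), L))
print(error_term(Fr(17, 18), Fr(5, 18), Fr(1, 6), L))
```

The five outputs are the exponent triples of $T^{23/73}(\log T)^{315/146}$ and of the four secondary terms in (18.2.4) and (18.2.5).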
Theorem 18.2.2 Let $F(x)$, $M$, and $T$ satisfy the conditions of Theorem 18.2.1. Let $R$ be the sum
$$R = \sum_{m=M}^{M_2} \rho\!\left(\frac{T}{M}F\!\left(\frac{m}{M}\right)\right),$$
formed with the row-of-teeth function $\rho(t)$, where $M_2$ is an integer in the range $M \le M_2 < 2M$. Then in both cases
$$R \ll T^{23/73}(\log T)^{315/146} \qquad (18.2.7)$$
for
$$T^{63/146}(\log T)^{63/292} \le M \le T^{83/146}(\log T)^{-63/292}, \qquad (18.2.8)$$
and in Case 1
$$R \ll M^{4/15}T^{1/5}(\log T)^{21/10} + M^{-1/5}T^{2/5}(\log T)^{81/40} \qquad (18.2.9)$$
for
$$T^{1/3} \le M \le T^{63/146}(\log T)^{63/292}, \qquad (18.2.10)$$
and in Case 2
$$R \ll M^{-4/15}T^{7/15}(\log T)^{21/10} + M^{1/5}T^{2/5}(\log T)^{81/40} \qquad (18.2.11)$$
for
$$T^{83/146}(\log T)^{-63/292} \le M \le T^{2/3}. \qquad (18.2.12)$$
The constants implied in the $\ll$ symbol depend on $C$ in the definition of Cases 1 and 2, and on the range of values taken by the derivatives of the function $F(x)$.
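Corollary If, in addition, $F(x)$ is defined for $1 - 1/2M \le x \le 2 + 1/2M$, and has two continuous derivatives in this wider range, then
$$\sum_{m=M}^{M_2}\left[\frac{T}{M}F\!\left(\frac{m}{M}\right)\right] - \int_{M-1/2}^{M_2+1/2}\left(\frac{T}{M}F\!\left(\frac{x}{M}\right) - \frac12\right)dx$$
satisfies the order-of-magnitude bounds for the sum $R$ given in Theorem 18.2.2.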
Proof We use the expansion for p(t) with tapered partial sums in Lemma 5.3.1:
(1 --+0 + min 1, D h=2-2D k(0) 27r ih 2D-2
p(t) =
k(h) e(ht)
1
D311t11311
(18.2.13)
ho0
instead of the simpler sum in Lemma 5.3.2 which we used in the last theorem; we have replaced H in Lemma 5.3.1 by D to show the analogy. We divide the sum over h into blocks of the form M2
H2
k(h)
1
hT
m
R(H)= F, E k(0) 27r ih e(M F(M)), m=M h=H I
(18.2.14)
where $H$ is a power of 2, and $H_2 = \min(2H - 1, 2D - 2)$. The result (18.1.28) of Theorem 18.1.2 gives
$$R(H) \ll (H/M)^{3/70}T^{23/70}(\log T)^{9/4}.$$
The sum over values of $H$ is a sum over powers of 2 in which the term with $H$ largest dominates. We have $R(H) \ll M/H$ provided that
$$H \ll MT^{-23/73}(\log T)^{-315/146}. \qquad (18.2.15)$$
We give $D$ the largest value consistent with (18.2.15), provided that the largest block is one for which (18.1.28) is valid, which happens in the range
$$C_1T^{63/146}(\log T)^{63/292} \le M \le C_1^{-1}T^{83/146}(\log T)^{-63/292}$$
for some constant $C_1$, which we discuss below. This estimate holds for all blocks if
$$C_2T^{7/16} \le M \le C_2^{-1}T^{9/16}$$
for some constant $C_2$. For $M \le C_2T^{7/16}$ we must consider blocks $R(H)$ with $H \ll T^4/M^9$ separately. For these (18.1.29) gives
$$R(H) \ll H^{1/14}M^{3/14}T^{3/14}(\log T)^{9/4} + H^{-1/18}M^{-1/6}T^{7/18}(\log T)^{9/4}. \qquad (18.2.16)$$
We can also use the Poisson summation bound (18.1.32), which gives
$$R(H) \ll (HT/M)^{1/2}. \qquad (18.2.17)$$
We change from (18.2.16) to (18.2.17) when $H \ll (M^3/T)^{1/5}(\log T)^{81/20}$. Summing $H$ through powers of 2, we have
$$\sum_H \min\bigl(H^{-1/18}M^{-1/6}T^{7/18}(\log T)^{9/4},\ (HT/M)^{1/2}\bigr) \ll M^{-1/5}T^{2/5}(\log T)^{81/40}.$$
This expression is smaller than (18.2.7) in the range (18.2.8). For
$$M \le C_1T^{63/146}(\log T)^{63/292}, \qquad (18.2.18)$$
all blocks fall into the range of validity of (18.1.29). The block with $H$ largest dominates. It contributes $O(M/H)$, provided that
$$H \le C_3\min\bigl((M^{11}/T^3)^{1/15}(\log T)^{-21/10},\ (M^3/T)^{1/2}\bigr). \qquad (18.2.19)$$
We give $D$ the largest value consistent with (18.2.19). The first choice in the minimum gives the first term in (18.2.9). The second term gives the minimum only when the first term in (18.2.9) has ceased to dominate. As $M$ passes through the value on the right of (18.2.18), then the order of magnitude of the estimate for the block $R(H)$ changes smoothly. By adjusting the constants in the upper bounds, we may change from the estimate (18.2.7) to (18.2.9) at $T^{63/146}(\log T)^{63/292}$, so that $C_1$ does not appear in the statement of the theorem. At the lower end of the range (18.2.10), the trivial estimate $M$ is better, and we may drop the constant $C_2$ similarly.

For $M \ge C_2^{-1}T^{9/16}$ we must consider blocks $R(H)$ with $H \ll M^{11}/T^6$ separately, using the estimates (18.1.30), (18.1.32), or (18.1.33). The discussion
runs entirely parallel to the case H << T 4/M9 above. Alternatively, Case 2 can be reduced to Case 1 by taking y as the independent variable on the curve y = (T/M)F(x/M), and counting lattice points in the y-direction; we find that O(T/M) is a trivial bound for the error when M5 T 2/3. The error term in (18.2.13) contributes M + r N(2'/D)
«
231
in the notation of Theorem 18.2.1, where the sum is over integers r with r >_ 0, 2'+ 1 D. The terms of the sum corresponds to the block sums R(H), and the corresponding estimates are the same in the range (18.2.8), and equal or smaller in the ranges (18.2.10) and (18.2.12). The corollary follows when we expand out the row-of-teeth function in the
sum $R$, and use
$$\int_{m-1/2}^{m+1/2} f(x)\,dx = f(m) + O(\max|f''(x)|) \qquad (18.2.20)$$
term by term with $f(x) = (T/M)F(x/M)$. The error term in (18.2.20) sums to $O(T/M^2)$, which can be absorbed by the bounds of the theorem for $M \ge T^{1/3}$.
The second term in (18.2.9) and (18.2.11) of Theorem 18.2.2 comes from block sums $R(H)$ with $H$ smaller than $D$, in ranges for which the saving given by the double exponential sum is not so great. We can treat these sums as a family of single exponential sums, provided that we have more information about higher derivatives of $F(x)$. We can extend the validity of (18.2.7) this way. Our next theorem states a uniform result which is useful for applications like the Dirichlet divisor problem, where the order of magnitude of the integer variable $m$ is not fixed.

Theorem 18.2.3 Let $F(x)$ be a real function with four continuous derivatives for $1 \le x \le 2$. Suppose that the derivatives $F''(x)$ and $F^{(3)}(x)$ and the expression
$$F''(x)F^{(4)}(x) - 3(F^{(3)}(x))^2 \qquad (18.2.21)$$
are non-zero. Let $C \ge 1$ be a constant. Then for $M \le CT^{1/2}$ and $M_2 < 2M$ we have
$$\sum_{m=M}^{M_2} \rho\!\left(\frac{T}{M}F\!\left(\frac{m}{M}\right)\right) \ll T^{23/73}(\log T)^{315/146}.$$
If in addition, $F(x)$ is defined for $1 - 1/2M \le x \le 2 + 1/2M$, and has two continuous derivatives in this wider range, then
$$\sum_{m=M}^{M_2}\left[\frac{T}{M}F\!\left(\frac{m}{M}\right)\right] = \int_{M-1/2}^{M_2+1/2}\left(\frac{T}{M}F\!\left(\frac{x}{M}\right) - \frac12\right)dx + O\bigl(T/M^2 + T^{23/73}(\log T)^{315/146}\bigr).$$
The constant implied in the upper bounds depends on $C$ and on the range of values of the derivatives of the function $F(x)$.

Corollary In Dirichlet's divisor problem we have
$$\sum_{t=1}^{T} d(t) = T\log T + (2\gamma - 1)T + O\bigl(T^{23/73}(\log T)^{461/146}\bigr).$$
Proof We modify the proof of Theorem 18.2.2 by treating blocks $R(H)$, with $H$ small, more elaborately. We choose $D$ to be the largest value consistent with (18.2.15). The block $R(H)$ is split into (at most) $H$ sums over $m$ in which $h$ is fixed, which we treat as single exponential sums with parameter $T' = hT/M$. For example, when $M \asymp T^{2/5}$, $h \asymp MT^{-3/10} \asymp M^{1/4}$, then $T' \asymp M^{7/4}$. When $M \asymp T^{1/2}$, $h \asymp MT^{-3/10} \asymp M^{2/5}$, then we have $T' \asymp M^{7/5}$. Hence $M$ is rather larger than $\sqrt{T'}$, so we begin with the inversion step (Lemma 5.5.3). This gives an exponential sum formed
with the complementary function F(y). For best results we should apply Theorem 17.4.2 with R = 1, that is, a differencing step, then Theorem 17.1.4. However, for the accuracy that we require, we can replace Theorem 17.1.4 by the simpler Lemma 7.3.4. In its turn, Lemma 7.3.4 can be replaced by another differencing step, followed by Corollary 1 to Lemma 5.4.3. The only, property
required of f(y) is that F(4)(y) = (F"F(4) -
be non-zero, as in the statement of the theorem. The non-vanishing of (18.2.21) corresponds to the condition in Theorem 17.1.4 that $F^{(3)}$ is non-zero, which allows the construction of the Farey arcs. We have a four-step van der Corput iteration (Graham and Kolesnik 1991), of type $BA^2B$, where $A$ denotes a differencing step, and $B$ denotes Poisson summation, that gives the exponent pair $(2/7, 4/7)$ in the sense of (17.4.6). The bound for the block $R(H)$ simplifies to
$$R(H) \ll \frac{1}{H}\sum_h \left(\frac{T'}{M}\right)^{2/7}M^{4/7} \ll (HT)^{2/7}, \qquad (18.2.22)$$
which is acceptable for $H \ll T^{15/146}(\log T)^{2205/292}$, and thus for the entire range (18.2.15) when
$$M \ll T^{61/146}(\log T)^{2835/292}. \qquad (18.2.23)$$
For slightly larger values of $M$ we use (18.2.22) when $H$ is small, and (18.2.16) when $H$ is large. We change from (18.2.16) to (18.2.22) when $H \ll M^{-21/43}T^{13/43}(\log T)^{567/86}$. Summing $H$ through powers of 2, we have
$$\sum_H \min\bigl(H^{-1/18}M^{-1/6}T^{7/18}(\log T)^{9/4},\ (HT)^{2/7}\bigr) \ll M^{-6/43}T^{16/43}(\log T)^{81/43}.$$
This term is acceptable for
$$M \gg T^{179/438}(\log T)^{-573/292}. \qquad (18.2.24)$$
Since $61/146 = 183/438 > 179/438$, the ranges (18.2.23) and (18.2.24) overlap, and the estimate (18.2.7) of Theorem 18.2.2 is valid for $M$ smaller than the range (18.2.8). The corollary on Dirichlet's divisor problem follows immediately from (11.1.14) of Lemma 11.1.3. □
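The exponent pair used in this proof can be recomputed from the two standard van der Corput processes on exponent pairs, $A(\kappa, \lambda) = (\kappa/(2\kappa + 2), (\kappa + \lambda + 1)/(2\kappa + 2))$ and $B(\kappa, \lambda) = (\lambda - 1/2, \kappa + 1/2)$, starting from the trivial pair $(0, 1)$. The sketch below is illustrative only; the function names and the convention that the rightmost letter of a word acts first are assumptions (the word used here is a palindrome, so the convention does not affect the result).

```python
from fractions import Fraction as Fr


def A(pair):
    """Differencing (Weyl-van der Corput) process on an exponent pair."""
    k, l = pair
    return k / (2 * k + 2), (k + l + 1) / (2 * k + 2)


def B(pair):
    """Poisson summation process on an exponent pair."""
    k, l = pair
    return l - Fr(1, 2), k + Fr(1, 2)


def apply_word(word, pair=(Fr(0), Fr(1))):
    """Apply a word in the letters A and B to a pair, rightmost letter first."""
    steps = {"A": A, "B": B}
    for letter in reversed(word):
        pair = steps[letter](pair)
    return pair


print(apply_word("B"))      # (1/2, 1/2)
print(apply_word("AB"))     # (1/6, 2/3)
print(apply_word("BAAB"))   # (2/7, 4/7)
```

With $(\kappa, \lambda) = (2/7, 4/7)$ and $T' = hT/M$, the single sums are bounded by $(T'/M)^{2/7}M^{4/7} = (hT)^{2/7}$, which is the quantity appearing in (18.2.22).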
18.3 LATTICE POINTS INSIDE A CLOSED CURVE
We are now technically equipped to return to the lattice point problem of Chapter 2. Accordingly $C$ is a simple closed curve made up from pieces $C_i$ on which the radius of curvature $\rho$ exists, is bounded and non-zero, and has at least one continuous derivative with respect to the tangent angle $\psi$. In Chapter 2 we approximated the enlarged curve $MC$ by a polygon whose sides had rational gradients. In Chapter 8 the enlarged piece $MC_i$ has equation $y = (T/M)F(x/M)$ in local coordinates, replacing the straight sides of the polygon by quadratic curves (splines).

For each piece $C_i$ of the curve, we set up a local coordinate system on which the hypotheses of Theorem 18.2.2 hold, and we use the automorphisms of the integer lattice to express the discrepancy in the new coordinates. We may have to subdivide the piece $C_i$ into smaller pieces, each with its own local coordinate system. This subdivision is finite, and independent of the enlargement factor $M$.

First we subdivide the discrepancy. In this section we use the name lattice to mean a group of points in the plane (under vector addition) with two specified generators, which are linearly independent. Lattice lines mean the straight lines through the points of the lattice in the direction of either of the two generators. The lattice lines divide the plane into parallelograms whose area is the determinant of the two generating vectors, called the determinant of the lattice. We call the centres of these parallelograms the midpoints of the lattice, and we call lines through the midpoints in the direction of either of the generators the midlines of the lattice. The midlines divide the plane into the second set of parallelograms whose centres are the lattice points. The integer lattice has the unit vectors (1, 0) and (0, 1) for generators.

The enlarged curve $MC$ is made of pieces $MC_i$, $i = 1, \ldots, J$. Let the endpoints of $MC_i$ be $P_i$ and $P_{i+1}$. Since the curve is closed, we have $P_{J+1} = P_1$. Let $Q_i$ be the midpoint of the lattice square of the integer lattice which contains $P_i$ (any of the possible squares if $P_i$ lies on a lattice line). We suppose that the tangent angle $\psi$ changes by at most $\pi/6$ on $MC_i$.

Case 1 All tangents to the arc $MC_i$ make an angle at least $\pi/6$ with the $y$-axis. Choose a point $S_i$ on the midline $y = y(Q_i)$ as follows. The arc $MC_i$ meets this line at most once. Let $S_i$ be the point of intersection if it exists. Otherwise let $S_i$ be the point of intersection of the midline $y = y(Q_i)$ with the tangent at $P_i$ (which cannot be parallel to the $y$-axis). Similarly choose $T_i$ on the midline $y = y(Q_{i+1})$ to lie on the arc $MC_i$ if possible, otherwise on the tangent at $P_{i+1}$ (Fig. 18.1).

Case 2 All tangents to the arc $MC_i$ make an angle at least $\pi/6$ with the $x$-axis. Choose $S_i$ and $T_i$ similarly on the midlines $x = x(Q_i)$, $x = x(Q_{i+1})$.

In both cases the distances $Q_iS_i$ and $Q_{i+1}T_i$ are at most $(1 + \sqrt{3})/2$. Let $U_i$ be the intersection of the midlines $x = x(Q_i)$, $y = y(Q_{i+1})$. The polygon $Q_1U_1Q_2U_2 \ldots Q_JU_J$ is a union of unit squares formed by midlines of the integer lattice, so that the area of the polygon equals the number of lattice points inside. The polygon may intersect itself; in this case some unit squares are counted negatively. In Case 1 we compare the number of lattice points and the area within the region $E_i$ bounded by the arc $S_iT_i$ and the midlines $Q_iU_i$, $y = y(Q_i)$ and $y = y(Q_{i+1})$. Again, lattice points and area below the base line $Q_iU_i$ are counted negatively. In Case 2 the region $E_i$ is bounded by the arc $S_iT_i$ and the midlines $U_iQ_{i+1}$, $x = x(Q_i)$ and $x = x(Q_{i+1})$.

FIG. 18.1

Note that when we fit consecutive arcs $C_i$ and $C_{i+1}$ together, then $S_{i+1}$ is not the same point as $T_i$. A bounded region about $P_{i+1}$ and $Q_{i+1}$ is overlapped or omitted. This causes a bounded error in the discrepancy at each point of subdivision of the curve. When we write $\Delta_i$ for the discrepancy of the set $E_i$, $\Delta$ for the discrepancy of the interior of $MC$, then
$$\Delta = \sum_i \Delta_i + O(J).$$
To avoid bad orientations of the curve $MC_i$ (such as horizontal or vertical), we rotate coordinates by an angle $\beta$ with a rational tangent $a/q$. The angle $\beta$ must be chosen from some interval $I$ of length $\delta$ ($< \pi/4$). If $I$ contains $n\pi/4$ for some integer $n$, then we choose $\beta = n\pi/4$, with $a^2 + q^2 \le 2$. If not, then some interval $I_1 = m\pi/2 \pm I$ is a subinterval of $(0, \pi/4)$. The values of $\tan\alpha$ for $\alpha$ in $I_1$ form an interval $I_2$ of length at least $\delta$. Let $d = [1/\delta] + 1$. The interval $dI_2$ has length greater than one, so it contains some integer $c$. We take $\beta_1 = \tan^{-1}(c/d)$, $\beta = \pm(\beta_1 - m\pi/2)$. Then $\tan\beta = a/q$ with
$$a^2 + q^2 \le c^2 + d^2 < 2d^2 \le 8/\delta^2; \tag{18.3.1}$$
this inequality also holds in the case $\beta = n\pi/4$.
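The choice of a rational tangent described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `rational_tangent` is hypothetical, the interval is assumed to lie inside $(0, \pi/4)$ already (the reduction by $m\pi/2$ is not performed), and floating-point arithmetic stands in for exact reasoning.

```python
import math
from fractions import Fraction


def rational_tangent(lo, hi):
    """Return (a, q) in lowest terms with lo < atan(a/q) < hi.

    Assumes 0 < lo < hi < pi/4 (hypothetical helper, not from the text).
    """
    delta = hi - lo
    d = math.floor(1 / delta) + 1           # d = [1/delta] + 1, so d * delta > 1
    # tan has derivative >= 1, so d*tan(lo) and d*tan(hi) differ by more than 1
    # and an integer c lies strictly between them.
    c = math.floor(d * math.tan(lo)) + 1    # smallest integer exceeding d * tan(lo)
    t = Fraction(c, d)                      # reduce c/d to lowest terms a/q
    return t.numerator, t.denominator


lo, hi = 0.30, 0.35
a, q = rational_tangent(lo, hi)
print(a, q, a * a + q * q, 8 / (hi - lo) ** 2)
```

The printed values let one compare $a^2 + q^2$ with the bound $8/\delta^2$ of (18.3.1) for this sample interval.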
Lemma 18.3.1 (Rotating rounding error sums) Let the curve $y = g(x)$ be continuously differentiable for $b \le x \le c$, let $a/q$ be a rational number in its lowest terms (or $a = 1$, $q = 0$), and let $\delta$ be a number such that no angle in the interval
$$\left[\tan^{-1}\frac aq - \frac\delta2,\ \tan^{-1}\frac aq + \frac\delta2\right]$$
is that of a normal to the curve. Define the function $v = h(u)$ implicitly by
$$au + qv = r\,g((qu - av)/r),$$
where $r = a^2 + q^2$. Then there are integers $e$, $f$, and $s$ with
$$f - e \ll q(c - b) + a|g(c) - g(b)| + 1/\delta,$$
and
Y, P(g(m)) _
m-b
f
h(n) - sn r p n=e+1
r
1
+OI r+ s + J
(f - e) r
I
1
We can take
$$\frac sr = \frac{a}{qr} - \frac{\bar a}{q} = \frac{\bar q}{a} - \frac{q}{ar}, \tag{18.3.2}$$
where $a\bar a + q\bar q = 1$.
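As a sanity check on the integer $s$, the sketch below assumes that (18.3.2) amounts to $s = a\bar q - q\bar a$ with $a\bar a + q\bar q = 1$, and verifies the congruence $as + q \equiv 0 \pmod r$ used later in the proof. The helper names `ext_gcd` and `multiplier` are hypothetical, and only positive coprime $a$, $q$ are considered.

```python
from fractions import Fraction
from math import gcd


def ext_gcd(x, y):
    """Return (g, u, v) with x*u + y*v = g = gcd(x, y)."""
    if y == 0:
        return x, 1, 0
    g, u, v = ext_gcd(y, x % y)
    return g, v, u - (x // y) * v


def multiplier(a, q):
    """Return (s, r, abar, qbar) for coprime positive a, q (assumed reading of (18.3.2))."""
    if gcd(a, q) != 1:
        raise ValueError("a/q must be in lowest terms")
    _, abar, qbar = ext_gcd(a, q)       # a*abar + q*qbar = 1
    r = a * a + q * q
    s = a * qbar - q * abar             # assumed integer form of (18.3.2)
    return s, r, abar, qbar


for a, q in [(1, 3), (2, 5), (3, 4), (7, 12)]:
    s, r, abar, qbar = multiplier(a, q)
    print(a, q, s, r, (a * s + q) % r)
```

The last printed column is the residue of $as + q$ modulo $r$, which should vanish if this reading of (18.3.2) is right.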
Proof The rounding error sum can be regarded as the discrepancy of a region $E$ bounded by lines $SQ$: $x = b - 1/2$, $TU$: $x = c + 1/2$, $QU$: $y = d + 1/2$, and the part of the curve $y = g(x)$ with $b - 1/2 \le x \le c + 1/2$; if necessary we extend the definition of $g(x)$ to be linear for $x < b$ and $x > c$, with continuous derivatives at $b$ and $c$. Figure 18.1 shows one possible configuration. Integer points and area below the line $y = d + 1/2$ are counted negatively. The three straight sides of $E$ are midlines of the integer lattice, which we shall call $\Lambda$.
We divide the lattice points $(m, n)$ of $\Lambda$ into $r = a^2 + q^2$ different classes, according to the residue of $m + in$ mod $q + ia$ in the Gaussian integers. Points of $\Lambda$ in the same class form the vertices of a system of squares with sides of length $\sqrt r$, in the directions of the vectors $(q, a)$ and $(-a, q)$, which are the directions of the new coordinate axes. The sides of these squares from all the $r$ classes of integer points form the lines of a square lattice $\Lambda_1$ of side $1/\sqrt r$. We pick points $V$ and $W$ to be midpoints of the lattice $\Lambda_1$ as close as possible to $S$ and $T$, respectively. There are now two cases. Either $SQ$ or $TU$ has length at most one (Fig. 18.2).
Case 1 $SQ$ is the short side. We take $U'$ to be the intersection of the midline through $V$ in the direction $(q, a)$ with the midline through $W$ in the direction $(-a, q)$.
Case 2 Otherwise $TU$ is short. We take $U'$ to be the intersection of the midline through $W$ in the direction $(q, a)$ with the midline through $V$ in the direction $(-a, q)$.
FIG. 18.2
In both cases we define $S'$ and $T'$ to be the intersections of the curve (extended linearly past $S$ and $T$ if necessary) with the midlines of $\Lambda_1$ through $V$ and $W$ in the direction $(-a, q)$. The intersections are unique because the vector $(-a, q)$ makes an angle at least $\delta/2$ with any tangent. We now have a region $E'$ bounded by the arc $S'T'$ and the lines $S'V$, $VU'$, and $U'T'$ (or $S'U'$, $U'W$, and $WT'$ in Case 2), which are midlines of $\Lambda_1$. As usual, if the curve crosses the line $VU'$, then some of the area and lattice points are counted negatively. The new region $E'$ differs from $E$ by the addition and subtraction of a bounded number of regions of three types.
Type 1 Regions about $S$ and $T$ which can be covered by rectangles with one side of length at most $1/\sqrt r$, the other side of length at most $(1 + \cot(\delta/2))/\sqrt r$. The discrepancy of a region of this type is trivially $\ll 1 + 1/\delta r$.
Type 2 Right-angled triangles formed by two midlines of $\Lambda$ and a midline of $\Lambda_1$. The discrepancy is given by a sum of the form
$$\sum_n \rho\left(\frac{a(n + 1/2)}{q}\right). \tag{18.3.3}$$
By the addition formula (Lemma 2.3.1) the summand in (18.3.3) sums to zero over any set of $q$ consecutive integers, so the whole sum is $O(q) = O(\sqrt r)$.
Type 3 Right-angled triangles formed by two midlines of $\Lambda_1$ and a midline of $\Lambda$. The discrepancy is given by two sums of the form
$$\sum_n \rho\left(\frac{a(n + 1/2) - b}{q}\right) \tag{18.3.4}$$
for some $b$, which are again $O(q)$.
Since discrepancy is additive on disjoint sets, the discrepancies of $E$ and $E'$ differ by $O(r + 1/\delta r)$. Our last task is to express the discrepancy of $E'$ as a rounding error sum.
We take new coordinates $u = qx + ay$, $v = qy - ax$. Integer values of $x$ and $y$ give the lattice whose $(u, v)$-coordinates are the multiples of $q + ia$ in the Gaussian integers. Since $a/q$ is assumed to be in its lowest terms, we have $(a, r) = (q, r) = 1$, and the points of $\Lambda$ can be characterized by
$$qu + av \equiv 0 \pmod r, \tag{18.3.5}$$
and the determinant of $\Lambda$ in $(u, v)$-coordinates is $r$. We note that (18.3.5) is equivalent to $v \equiv su \pmod r$, where $s$ is a solution of $as + q \equiv 0 \pmod r$; the formula (18.3.2) defines an integer satisfying this congruence. Suppose that the curve $MC_i$ is $v = h(u)$, and the other boundaries of $E'$ are midlines $u = e + 1/2$, $u = f + 1/2$, and $v = d + 1/2$. The number of points of $\Lambda$ in $E'$ is
$$\sum_{n=e+1}^{f}\left(\left[\frac{h(n) - sn}{r}\right] - \left[\frac{d + 1/2 - sn}{r}\right]\right) = \sum_{n=e+1}^{f}\left(\rho\left(\frac{h(n) - sn}{r}\right) - \rho\left(\frac{d + 1/2 - sn}{r}\right) + \frac{h(n) - d - 1/2}{r}\right).$$
The first term is the required rounding error sum, and the second is analogous to (18.3.4), and so is O(r). The third term is estimated as 1
fef+ +
1/2(h(u)-d-
2)du+O f -e)
max I
(18.3.6)
r J 1/2 r by (18.2.20). The integral in (18.3.6) is the area of E1 in (u, v)-coordinates, which is r times the area in (x, y)-coordinates.
o
Theorem 18.3.2 Let $D$ be a Euclidean plane domain with area $A$, bounded by a simple closed curve $C$ composed of finitely many pieces $C_i$ which are three times continuously differentiable in the following sense. The radius of curvature $\rho$ is continuous and non-zero on each piece $C_i$, and $\rho$ is continuously differentiable with respect to the tangent angle $\psi$. Let $E$ be a plane set obtained by expanding $D$ linearly by a factor $M > 2$, followed by a rigid motion. Then, for any embedding of $E$ in the Euclidean plane, the number of integer points in $E$ is
$$AM^2 + O(IM^{46/73}(\log M)^{315/146}),$$
where $I$ is a number depending on the curve $C$, but not on $M$ or on the position or orientation of $E$.
Theorem 18.3.3 In Theorem 18.3.2, suppose that the pieces $C_i$ are four times differentiable, in the sense that $\rho$ is twice continuously differentiable with respect to the tangent angle $\psi$. Suppose also that $\rho$ is either constant or has a finite number of maxima and minima on each piece $C_i$, and that $d\log\rho/d\psi$ has a bounded number of maxima and minima between any two consecutive extreme values of $\rho$ on $C_i$. Then we may take
$$I = \sum_i \sum_k \rho_{ik}^{46/73},$$
where the inner sum is over local maxima $\rho_{ik}$ (including endpoints, and counting constant values only once) of the radius of curvature on $C_i$, provided that $M$ is so large that the bounds $M > 1/\rho$ and
$$\frac{M^{40}}{(\log M)^{63}} \ge \left(1 + \frac{1}{\rho^2}\left(\frac{d\rho}{d\psi}\right)^2\right)^{103}$$
hold on each curve $C_i$.
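Proof of Theorems 18.3.2 and 18.3.3 We have to choose local coordinates on each piece $C_i$ so that the conditions of Theorems 18.2.1 and 18.2.2 apply. The new coordinates are $u$ and $v$ in Lemma 18.3.1, with
$$u = \sqrt r\,(x\cos\beta + y\sin\beta), \qquad v = \sqrt r\,(y\cos\beta - x\sin\beta),$$
so, for $(x, y)$ on the curve $MC_i$, we have
$$\frac{dv}{du} = \tan(\psi - \beta), \tag{18.3.7}$$
$$\frac{d^2v}{du^2} = \frac{1}{r^{1/2}\rho\cos^3(\psi - \beta)}, \tag{18.3.8}$$
$$\frac{d^3v}{du^3} = \frac{3\sin(\psi - \beta)}{r\rho^2\cos^5(\psi - \beta)} - \frac{1}{r\rho^3\cos^4(\psi - \beta)}\frac{d\rho}{d\psi} = \frac{3\sin(\psi - \beta - \lambda)}{r\rho^2\cos^5(\psi - \beta)\cos\lambda}, \tag{18.3.9}$$
where the angle $\lambda$ is defined by
$$\tan\lambda = \frac{1}{3\rho}\frac{d\rho}{d\psi}.$$
We also have
$$\frac{dv}{du}\frac{d^3v}{du^3} - 3\left(\frac{d^2v}{du^2}\right)^2 = -\frac{3\cos(\psi - \beta - \lambda)}{r\rho^2\cos^5(\psi - \beta)\cos\lambda}. \tag{18.3.10}$$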
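The trigonometric identities behind (18.3.9) and (18.3.10) are easy to get wrong by a sign, so a numerical spot check may be useful. The sketch below assumes the forms $d^2v/du^2 = 1/(r^{1/2}\rho\cos^3(\psi-\beta))$, the two-term expression for $d^3v/du^3$, and $\tan\lambda = (1/3\rho)\,d\rho/d\psi$; the function name and the sample values are hypothetical.

```python
import math


def derivative_forms(psi, beta, rho, drho, r):
    """Compare the two-term and closed forms; drho stands for d(rho)/d(psi)."""
    x = psi - beta
    c, s = math.cos(x), math.sin(x)
    lam = math.atan(drho / (3 * rho))                  # tan(lambda) = drho / (3 rho)
    d1 = math.tan(x)                                   # (18.3.7)
    d2 = 1 / (math.sqrt(r) * rho * c ** 3)             # (18.3.8)
    d3_two_terms = 3 * s / (r * rho ** 2 * c ** 5) - drho / (r * rho ** 3 * c ** 4)
    d3_closed = 3 * math.sin(x - lam) / (r * rho ** 2 * c ** 5 * math.cos(lam))
    combo = d1 * d3_two_terms - 3 * d2 ** 2            # left-hand side of (18.3.10)
    combo_closed = -3 * math.cos(x - lam) / (r * rho ** 2 * c ** 5 * math.cos(lam))
    return d3_two_terms, d3_closed, combo, combo_closed


print(derivative_forms(0.7, 0.2, 5.0, -2.0, 13))
```

With these sample values the two expressions in each pair should agree to rounding error, including the minus sign on the right-hand side of (18.3.10).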
We prove Theorem 18.3.2 by a convexity argument. The values of $\psi$ on a particular piece $C_i$ of the curve form a closed interval $K$. For any $\psi$ in $K$ there is an open interval of values of $\beta$ for which the expressions in (18.3.8) and (18.3.9) are bounded away from zero and infinity. We pick $\alpha(\psi)$ to be the midpoint of this interval. For fixed $\psi = \psi_0$, we take $\beta = \alpha(\psi_0)$ and consider (18.3.8) and (18.3.9) for this $\beta$. Since $\rho$ and $\lambda$ are continuous functions of $\psi$, there is an interval of values of $\psi$ about $\psi_0$, open relative to the interval $K$, on which neither expression (18.3.8) nor (18.3.9) is zero or infinite. We call this interval $I(\beta)$. Each $\psi$ in $K$ lies in its own open interval $I(\alpha(\psi))$. There is a finite subcollection $I_1, \ldots, I_R$ of these intervals which covers $K$; we suppose that the subcollection is minimal, and numbered in order, so that $I_r \cap I_{r+2}$ is empty, but $I_r \cap I_{r+1}$ is non-empty. We form closed intervals $J_r$ by taking the midpoint of $I_r \cap I_{r+1}$ as the common endpoint of $J_r$ and $J_{r+1}$. Then $J_1, \ldots, J_R$ are closed intervals, disjoint except for endpoints, that cover $K$, each corresponding to some angle $\beta = \alpha_r$. Let $\delta_r$ be the smallest distance from an endpoint of $J_r$ to the corresponding endpoint of $I_r$. The derivatives in (18.3.8) and (18.3.9) are finite and non-zero for
$$\psi \in J_r + [-\delta_r/4, \delta_r/4], \qquad |\beta - \alpha_r| < \delta_r/2. \tag{18.3.11}$$
Since the derivatives are continuous, their moduli lie within closed intervals not containing zero when the inequalities (18.3.11) hold.
We have found a subdivision of $K$ for which we can apply Lemma 18.3.1 to find a suitable angle $\beta$ with rational tangent, then Case 1 of Theorem 18.2.2 with $T = M^2$, perhaps with a shift of origin in the variable $u$. The enlargement factor $M$ enters only into the radius $\rho$, not into the angles $\psi$ and $\lambda$. The implied constant is constructed from the maxima and minima of the expressions (18.3.8) and (18.3.9) under the conditions (18.3.11), and from the angle $\beta$ chosen, by way of the integer $a^2 + q^2$. Since $a^2 + q^2$ can be bounded in terms
of the length of the interval for /3 in (18.3.11), independently of a the bound in Theorem 18.3.2 is uniform under rotation as well as translation. To prove Theorem 18.3.3 we estimate all the constants which depend on the shape of the curve, and apply Theorem 18.2.2 with order-of-magnitude parameters M' and T' corresponding to the lengths of the ranges for u and v/r, satisfying rT
d2v
d3v
rT'
due Mi3' du3 and if Case 2 of Theorem 18.2.2 is used,
(_ )___3 dv
d3v
s
r d2v I
Mi4' 2
(18 . 3 . 12)
r2T'2 (18 . 3 . 13)
M'6 To simplify the expression in (18.3.13) we note that a/q has a neighbour b/d du2 )
in the Farey sequence Aq) for which ad - bq = 1, and we can take s = -(ab+dq) in (18.3.2). Let y= tan '01d). Then ad - bq 1 tan(/3-y)= ab - dq s By (18.3.9) and (18.3.10), for s non-zero the left-hand side of (18.3.13) is
3sin(if-y-A) rp2 sin( /3 - y)coss(gr- 13)cos A In Theorem 18.2.2 we require order-of-magnitude relations cos A x lsin(tr - /3)sin(4r - /3 - A)I, and, if Case 2 is used, Isin( /3 - y)cos Al x Icos(/r- /3)sin(gi- y- A)I.
(18.3.14) (18.3.15) (18.3.16)
We subdivide the arc $C_i$ into smaller arcs $C_{ij}$ on which:
(1) The angle $\psi$ changes by at most $\pi/12$;
(2) The angle $\lambda$ changes by at most $\pi/12$;
(3) The quantity $\log\rho$ changes by at most $\log(21/20)$;
(4) The quantity $\log\cos\lambda$ changes by at most $\log(21/20)$.
There are two cases, depending on the angle $\lambda$.
Case 1 $|\lambda| \le 11\pi/24$ on $C_{ij}$. We want to choose $\beta$ so that
$$|\psi - \beta - k\pi/2| \ge \pi/24, \qquad |\psi - \beta - \lambda - l\pi/2| \ge \pi/24 \tag{18.3.17}$$
for all integers $k$ and $l$. Modulo $\pi/2$, the set of $\beta$ for which (18.3.17) holds forms one or two intervals of total length at least $\pi/12$. Hence there is a suitable interval for $\beta$ of length at least $\pi/24$. In Lemma 18.3.1 we take $\delta = \pi/24$, and (18.3.1) gives
$$r = a^2 + q^2 \le 8/\delta^2 = 4608/\pi^2 < 500.$$
We take $M'$ to be an integer, $T'$ to be real, with
$$M' = r^{1/2}\rho_0 + O(1), \qquad T' = \rho_0$$
for some value $\rho_0$ of the radius of curvature $\rho$ on the arc $MC_{ij}$. When we write the Cartesian equation of $MC_{ij}$ as
$$\frac{v - su}{r} = T'F\left(\frac{u}{M'}\right), \tag{18.3.18}$$
then $F''$ and $F^{(3)}$ are bounded away from zero and infinity, and Case 1 of
Theorem 18.2.2 applies. Case 2 For some iii on C,1 we have 0 < cos A 5 sin it/24. By construction cos A lies in some range -q ScosA52177/20,
so A is close to ±,7r/2. We treat the case when A is positive. We have
dt
IT
775
where
2
21,q
1177
(1-t2) 5-20 cosh/24 < 10
= min(21-7/20, sin a/24). Also
- f drlr53f tan AdA5[log p]c,,5log-, 10
7q c,j
21
20
c,1
and 4i changes by at most
-log-2120 5 200 771
10
777
Let i/ o, po refer to a particular point on the arc MCij. If 8 lies in the interval IT
2
-
37)
-<'o-13<
4
IT
7)
-
2
4
then IT
2
15717
200 -
-
a
5
IT
4371
2
200
Thus 437
- - a -A 5 200
1777)
200
and (18.3.15) holds with
cosAxcos(O -13)xsin(fi-/3-A)x17.
sin(/i-/3)xcos(+y-/3-A)x1. We take 6 = 77/2 in Lemma 18.3.1, and (18.3.1) gives
r = a 2 + q 2 <8/52
<32/7)2.
We take M' to be an integer, T' to be real with
M' = 7)2r1/2p0 + 0(1),
T=
7)3po,
for some value po of the radius of curvature on the arc MC;1, with
T'/M'2 x 1/pr. For r << 1/7) we can use Case 1 of Theorem 18.2.2. For r >> 1/7) we should use Case 2. However, the requirement (18.3.16) involves the angle y with
tan y = b/d, which depends non-uniformly on the angle /3, and so on the orientation of the arc. We choose a modulus k, and divide the sum over n into residue classes modulo k, putting n = km + c for c = 0,..., k - 1. This gives a congruence family of sums with the same T', but with M' now given by
M' = 712r1/2p0/k + 0(1),
and ((ns/r)) modulo one replaced by ((mks/r)). We choose
k x (711) << 115 and use Case 1. The rounding error sum in Lemma 18.3.1 is
123/73 (log T r)315/146 <<
po)315/146'
o
(18.3.19)
provided that n1o3p02o
>> (log
p0)63/2
The expression (18.3.19) has fixed order of magnitude on C.
(18.3.20) .
Summing over the different subdivisions C11 of C, gives
<< min(cos A)65/146p46/73(log p)315/146 cr
+fC(COS A)65/146p46/73(log p)315/146(I d l'
d
d
p
o
- logcos Oil
dir.
If p and A are monotone on C,, then each term is << max p46/73(log p)315/146 c;
In general we must subdivide $C_i$ into pieces on which $\rho$ and $\lambda$ are both monotone; the number of such subdivisions has the same order of magnitude as the number of maxima of $\rho$ on $C_i$. To complete the proof, we replace $\rho$ by $M\rho$, and bring the scale factors in $M$ outside the sum; the remainder terms in Lemma 18.3.1 are negligible. □
Theorem 18.3.3 would be more elegant if the error term was a power of the area $A$, not of the maximum radius of curvature. However, area in intrinsic coordinates is given by the functional
$$A = \frac14\int_0^{2\pi}\int_0^{2\pi} k(\theta - \phi)\rho(\theta)\rho(\phi)\,d\theta\,d\phi,$$
where $k(\theta)$ is the kernel function defined by
$$k(\theta) = (1 - \theta/\pi)\sin\theta \ \text{ for } 0 \le \theta < \pi, \qquad k(-\theta) = k(\theta), \qquad k(\theta + 2\pi) = k(\theta).$$
If $\rho(\theta)$ is allowed to be negative, as for the four-cusped hypocycloid with
$$x = -3\cos\psi - \cos 3\psi, \qquad y = 3\sin\psi - \sin 3\psi, \qquad \rho = -6\sin 2\psi,$$
then the area $A$ need not be positive. Hence the area $A$ is not a norm on periodic functions $\rho(\theta)$, and it cannot appear on the upper bound side of an inequality.
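The area functional can be tried out numerically. The sketch below assumes the reading $A = \frac14\int_0^{2\pi}\int_0^{2\pi} k(\theta-\phi)\rho(\theta)\rho(\phi)\,d\theta\,d\phi$ with $k(\theta) = (1-\theta/\pi)\sin\theta$ on $[0, \pi)$, extended evenly and periodically. It uses a plain midpoint rule with `n` nodes; both `n` and the function names are illustrative choices, not from the text.

```python
import math


def kernel(theta):
    """Even, 2*pi-periodic kernel equal to (1 - t/pi) * sin(t) on [0, pi]."""
    t = math.fmod(abs(theta), 2 * math.pi)
    if t > math.pi:
        t = 2 * math.pi - t
    return (1 - t / math.pi) * math.sin(t)


def intrinsic_area(rho, n=400):
    """Midpoint-rule value of (1/4) * double integral of k(theta - phi) rho(theta) rho(phi)."""
    h = 2 * math.pi / n
    vals = [rho((i + 0.5) * h) for i in range(n)]
    # The kernel depends only on the node difference, a multiple of h.
    kern = [kernel(m * h) for m in range(n)]
    total = 0.0
    for i in range(n):
        for j in range(n):
            total += kern[(i - j) % n] * vals[i] * vals[j]
    return 0.25 * total * h * h


circle = intrinsic_area(lambda t: 2.0)                        # circle of radius 2
hypocycloid = intrinsic_area(lambda t: -6 * math.sin(2 * t))  # rho = -6 sin(2 psi)
print(circle, 4 * math.pi)
print(hypocycloid, -6 * math.pi)
```

For the circle the functional returns approximately $\pi R^2$. For $\rho = -6\sin 2\psi$ it returns a negative value close to $-6\pi$, which illustrates why $A$ cannot serve as a norm.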
18.4 A FAMILY OF LATTICE POINT PROBLEMS
Theorems 18.1.2, 18.2.1, 18.2.2 can be extended to a family of functions $F_i(x)$ in the same way as Theorem 17.1.4. Theorem 18.4.1 extends Theorem 18.1.2.
It is complicated, because there are five independent parameters, and we have chosen to consider several cases rather than to impose more conditions on the parameters H and M. Weaker results can be obtained for larger H by the method of Lemma 17.1.5, making the product hm divisible by a suitable prime number p (Huxley 1990, Section 7).
Theorem 18.4.1 Let F(x, y) be a real function defined for 1 5 x5 2,0:5y< 1, for which the partial derivatives are continuous and satisfy I ai a2F(x, y)I <- C,
for r55, s52, r+sS5, and 1/C1 S I a,F(x, y)I
(18.4.1)
for r = 1, 2, and al-1 a2F(x,Y)
a1F,(x y)
a1
z C2
(18.4.2)
for r = 2, where C1 and C2 are positive constants. Suppose that either Case 1 or Case 2 holds: Case 1
(18.4.1) and (18.4.2) hold for r = 3 also.
Case 2 For some positive constants C31 C4
I3F2 -F1F111I2C3,
IAIzC4,
where A is the determinant 3F11 + 4F1F111
3F1F11
F1
F1111
F111
F11
F1112
F112
F12
Let y1, ..., y, be points in [0, 1] with yi+1 - y; z 1/J, where J5 T. Let g(x) and
G.(x) be bounded functions of bounded variation on [0, 1] (uniformly in i in the case of G,(x)). Let S. denote the sum Si
ig(H)
mM
G'(M)e(M
F`M'Y,)),
where H and M are positive integers, and T is a positive real number. Then there are constants C5 and C6 constructed from C11 ... , C4 such that if
C5T1/35MSC51T2/3
(18.4.3)
H< C6 min( M3/2/T1/2, M1/2, MT-3/11),
(18.4.4)
and then we have bounds of the form E 1S;I2 << EH2IT logs T,
(18.4.5)
where the constant in the upper bound is constructed from C1,. .. , C4, from the
bounds for the functions g(x) and G;(x), and from the extra constant C7 in Cases (b) and (c):
Case (a) In Cases 1 and 2, for
HT
J
M2 r HT2/5
T
(7) +I>_C61min M2+ T ,I
M
)
,
(18.4.6)
we have (18.4.5) with
E =I-4135(HIM)3135T- 12/35 + (J/I)4/21(H/M)1/21T- 8/21 +1-1/2 (HIM )11/4T3/4 + (J/I )1 /2(H/M)'/4T1/12.
(18.4.7)
Case (b) In Case 1 for M< CST 1/2, where C7 is a constant which may be chosen arbitrarily, we have (18.4.5) with
E
=1-2/7H1/7M3/7T-4/7(1
+JM2/IT)3/14
+(H/M)1/21T-s/21(1 +JM2/IT)1/2
+H-1/9M-1/3T-2/9(1 +JM2/IT)1/2 +1-1/2HT-1/2. (18.4.8) Case (c) In Case 2 for M >_ C7 1 T 1/2 we have (18.4.5) with /7M-5/7(1
E = I-2/7H1 +(HIM)112'T- 8/21(l
+JT/IM2)3/14
+JT/IM2)1/2
+H-1/9M5/9T-2/3(1 +JT/IM2)1
12
+I-1/2HM-2T1/2.
(18.4.9)
Proof We use Lemma 16.1.4 (with 8 = 1). The conditions on F(x) are those of Lemmas 14.3.1, 14.3.5, and 14.4.1. The choice (18.1.2) of U is valid, provided that O2A4PR2>> Q
holds, so that H2N2 << MR3,
(18.4.10) arising from the fifth term in the minimum in (16.1.11) of Lemma 16.1.4. The
third term in the same minimum is larger than the corresponding term in (16.1.6) of Lemma 16.1.3, and we may relax (18.1.18) to H4N6 << M3R7,
(18.4.11)
which is a consequence of (18.4.10) since R> 1,
OiQ2 >> 1.
(18.4.12)
The conditions (18.4.11) and (18.4.12), when combined with the other inequalities involving the parameters, entail the conditions (18.4.3) and (18.4.4) of the theorem.
Cases (a), (b), and (c) correspond to the three results (16.1.12), (16.1.14)
and (16.1.15) of Lemma 16.1.14. We summarize the choices of N which give
the various terms in the upper bound. If T is very large, then (18.4.12) is limiting, and we take NX
M2/(HT2)1/3
in Case (b). If M is very large, then (18.4.12) is limiting, and we take
N X (M2/H)1/3 in Case (c). If I is very large, then (18.4.11) is limiting, and we take
Nx (M11/H4T3)1/7 in Case (b) or Case (c). If H is very large, but I is small, then we take N x H in all three cases. If no parameter is too large, then we choose N so that the term in P2Q2 is
the same order of magnitude as the next largest term. In Case (a) this requires N x (I6m57/H22T17)1/35
for J/I small, Nx (I/J)2/7(M11/H4T3)1/7 for J/I large. In Case (b) this requires N x (I3M6/HST )1/7
for J/I small, and in Case (c) Nx (I3M18/HST 7)1/7
for J/I small. For J/I large we take Nx (J3M12/H5 T4)1/7 in Case (b) or Case (c).
O
We deduce an analogue of Theorem 18.2.2. To reduce the complications, we express the upper bound in terms of $J$, using $I \le J + 1$ to eliminate $I$, and we do not attempt to optimize the logarithm power.
Theorem 18.4.2 Let $F(x, y)$, $M$, $T$, and the points $y_i$ and functions $G_i(x)$ satisfy the conditions of Theorem 18.4.1. Let $R_i$ be the sum
$$R_i = \sum_{m=M}^{2M-1} G_i\left(\frac mM\right)\rho\left(\frac TM F\left(\frac mM, y_i\right)\right).$$
If either Case 1 holds with
$$C_5T^{1/3} \le M \le C_7T^{1/2},$$
or Case 2 holds with
$$C_7^{-1}T^{1/2} \le M \le C_5^{-1}T^{2/3},$$
then we have the bound
$$\sum_i R_i^2 \ll J(T^{26/43} + T^{4/5}M^{-2/5} + M^{2/5}T^{2/5})\log^6 T + \min(J^{65/73}T^{46/73},\ J^{11/15}M^{8/15}T^{2/5},\ J^{11/15}M^{-8/15}T^{14/15})\log^6 T.$$
The constant $C_7$ may be chosen arbitrarily. The constant in the upper bound is constructed from $C_1, \ldots, C_4$ and $C_7$.
18.5 ROUNDING ERROR AND INTEGRATION
The theorems of this chapter have been directed to approximating sums by integrals. Even the purest of mathematicians wants to know the approximate value of the resulting integrals. Some functions can be integrated in closed form to a function which is easily calculated, or uniformly approximated by a sequence of such functions. Usually the integral has to be approximated by a sum. The simplest method is given by the trapezium rule (Lemma 5.4.1): we put $f(x) = F(x/M)$, $a = M$, $b = 2M$ in Lemma 5.4.1 to get
$$\int_1^2 F(x)\,dx = \frac{1}{2M}F(1) + \frac1M\sum_{m=M+1}^{2M-1}F\left(\frac mM\right) + \frac{1}{2M}F(2) - \frac{1}{M^2}\int_1^2\sigma(Mx)F''(x)\,dx. \tag{18.5.1}$$
The remaining integral is $O(\max|F''(x)|/M^2)$. Writing down $4/3$ times (18.5.1) for $2M$ minus $1/3$ times (18.5.1) for $M$ gives Simpson's rule:
$$\int_1^2 F(x)\,dx = \frac{1}{6M}\sum_{m=2M}^{4M}a_mF\left(\frac{m}{2M}\right) + \frac{1}{3M^2}\int_1^2(\sigma(Mx) - \sigma(2Mx))F''(x)\,dx,$$
where $a_m$ is the finite sequence $(1, 4, 2, 4, 2, \ldots, 2, 4, 1)$, and the remaining integral is $O(\max|F^{(4)}(x)|/M^4)$. Further accelerations are possible. Some formulae have unequally spaced blocks of sampling points, such as
X-+-+. M 2M-6M m
1
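The passage from the trapezium rule to Simpson's rule can be illustrated as follows. This is a minimal sketch on the interval $[1, 2]$ with the main terms only (the remainder integrals are not modelled); the function names and the test function $F(x) = 1/x$ are illustrative assumptions.

```python
import math


def trapezium(F, M):
    """Main terms of the trapezium rule on [1, 2] with step 1/M."""
    inner = sum(F(m / M) for m in range(M + 1, 2 * M))
    return (F(1) / 2 + inner + F(2) / 2) / M


def simpson(F, M):
    """Simpson's rule on the nodes m/(2M), m = 2M..4M, weights (1, 4, 2, ..., 4, 1)/(6M)."""
    total = 0.0
    for m in range(2 * M, 4 * M + 1):
        if m in (2 * M, 4 * M):
            weight = 1
        elif (m - 2 * M) % 2 == 1:
            weight = 4
        else:
            weight = 2
        total += weight * F(m / (2 * M))
    return total / (6 * M)


F = lambda x: 1 / x
M = 10
richardson = (4 * trapezium(F, 2 * M) - trapezium(F, M)) / 3
print(trapezium(F, M), simpson(F, M), richardson, math.log(2))
```

The combination with weight $4/3$ on the finer grid and $-1/3$ on the coarser grid reproduces the Simpson weights, which is why the two printed Simpson values coincide up to rounding.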
If the function $F(x)$ has enough continuous derivatives, then we can find a formula of this type in which the remainder contains a high power of the step length $1/M$. However, the function values are not calculated to infinite precision, but to some accuracy $1/N$. For hand calculation $N = 10^n$ for some integer $n$. We round down to $n$ decimal digits by replacing all decimal digits after the $n$th digit following the decimal point by zero. This corresponds to replacing $NF(x)$ by its integer part. A similar rounding process is used for binary digits. Rounding down gives a systematic error. To avoid it, when adding a series of function values, we insert digits 5 (decimal) or 01 (binary) after the last digit retained unchanged. Computer arithmetic routines (see, for example, Logan and O'Hara 1983) usually take the integer part of $NF(x) + 1/2$, using the instructions 'shift right, jump if no carry, increase'. We call $\rho(t)$ the rounding error function because
$$\frac1N\left([NF(x)] + \frac12\right) = F(x) + \frac1N\rho(NF(x)), \qquad \frac1N\left[NF(x) + \frac12\right] = F(x) + \frac1N\rho\left(NF(x) + \frac12\right).$$
The sampling nodes of the integration formulae form a finite number of sets of $M$ points each of the form $(m + a)/M$, for $m = M, \ldots, 2M - 1$, so there are correction terms
$$\frac1N\sum_m\rho\left(NF\left(\frac{m + a}{M}\right)\right), \qquad \frac1N\sum_m\rho\left(NF\left(\frac{m + a}{M}\right) + \frac12\right).$$
O(Vi/N) in root mean square. If F(x) is a polynomial of degree d with integer coefficients, and N = Md, a= 0, then all the p-functions are evaluated at integers, and the error is systematic. On the other hand, F(x) = x F2, and N = M, a = 0, then, since % has the continued fraction 1 + 1/(2 + 1/(2 + )), the explicit discrepancy formula of Lemma 2.3.2 gives O((log N)/N). We take the basic rounding error sum to be
R = R(M, M', N; F(x)) = F_ pI NF(M)1,
(18.5.2)
M
where M' < 2M. We expect to find cancellation in the sum R except possibly
when certain derivatives or determinants of derivatives are zero. In the vanishing case, there may be a systematic bias when F(x) has a good approximation by an algebraic curve with small integer coefficients. By analogy with the linear case given above, the bias is expected to disappear
when F(x) is replaced by /3F(x) for some number G3 whose continued fraction has bounded partial quotients. The numbers 1, v, and 2V - 1 are independent in the sense that the ratio of any pair of them has an essentially different continued fraction. Near any point y, at most one of the functions
F(x), (I)F(x), and (2VI - 1)F(x) has a good algebraic approximation. So if a relevant derivative vanishes at x = y, then we should calculate fy+8RF(x)dx y-8
with 0 = 1, F, 2v - 1. We expect at most one of the calculations to be affected by a bias in the rounding error. Here 6 is chosen so small that the only critical point in the interval is y itself. Theorems 18.2.2 and 18.2.3 with T = MN give such bounds when N is close
to $M$ in order of magnitude. We use single exponential sums to get bounds for large $N$. Numerical analysts will protest rightly that these bounds are only theoretical: the construction of the constant in the upper bound from the upper and lower bounds for various derivatives is not given explicitly. There are certain excuses for not giving the construction.
1. The exponents of $M$ and $N$ are not final; they are just numbers thrown up by the method in its present form.
2. The exponent of $\log M$ has not been optimized, and if Lemma 11.1.4 is used to optimize the exponent, then there is a function $C(\delta)(\log M)^{\delta}$ to minimize, where the 'constant' $C(\delta)$ is implicit in Hall and Tenenbaum (1986).
3. There are many contributions to the error estimate with different orders of magnitude, involving different determinants in the derivatives of $F(x)$.
4. A complicated argument of $n$ steps usually gives a numerical constant like $2^n$, from repeated use of '$a + b \le 2\max(a, b)$'.
5. The formulae would become very much more complicated.
First we note some consequences of the addition formula for p(t), Lemma 2.3.1.
Lemma 18.5.1 (Rounding error identities)
For any integer q > 1 we have
q-1 (M-a M'-a
R(M,M',N;F)= L RI1 a=0
q
q
,N;F x+
a
i1 (18.5.3)
q - i
R(M,M',N;F)= F, a=0
1
N;F(x)+ NJ, q
(18.5.4)
and for p a prime number n-i ` aMxl pR(M, M', N; F) = F, R It pM, pM, N, F(x) + N /I
a=0
- R(pM, pM', pN; F) + R(M, M', pN; F ). (18.5.5)
Proof We obtain (18.5.3) by writing $m = nq + a$, $0 \le a < q$, and (18.5.4) follows at once from Lemma 2.3.1. For (18.5.5) we have
$$\sum_{a=0}^{p-1}\rho\left(t + \frac{an}{p}\right) = \begin{cases} p\rho(t) & \text{if } p \mid n,\\ \rho(pt) & \text{if not,}\end{cases}$$
pM'
L n=pM
PEP(NF(m))_pE1
M
M
-
pin
P-1 PM _I 7
n
pM
n=pM
)_
M PM
p(pNF(PM))
n
'
+ E p pNF
n
P
PM
n = pM
0
+
( PM)
pin
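The case distinction used in the proof of (18.5.5) rests on the addition formula for $\rho(t)$. A numerical illustration follows, assuming $\rho(t) = [t] - t + 1/2$ and a prime $p$; the function name `shifted_sum` and the sample values are hypothetical.

```python
import math


def rho(t):
    """Rounding error function rho(t) = [t] - t + 1/2."""
    return math.floor(t) - t + 0.5


def shifted_sum(t, n, p):
    """Sum of rho(t + a*n/p) over a = 0, ..., p - 1."""
    return sum(rho(t + a * n / p) for a in range(p))


t, p = 0.3, 5
print(shifted_sum(t, 3, p), rho(p * t))       # p does not divide n = 3
print(shifted_sum(t, 10, p), p * rho(t))      # p divides n = 10
```

When $p$ does not divide $n$ the shifts $an/p$ run through all residues modulo one, and the sum collapses to $\rho(pt)$. When $p$ divides $n$ every term equals $\rho(t)$.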
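We use the Fourier expansion of $\rho(t)$ to deduce bounds for rounding error sums from bounds for simple exponential sums.
Lemma 18.5.2 (Rounding error by simple exponential sums) Let $F(x)$ be a real function, and let $M$ and $M_2$ be integers with $M_2 < 2M$. Suppose that we have
$$S = \sum_{m=M}^{M_2}e\left(TF\left(\frac mM\right)\right) \ll \left(\frac TM\right)^{\kappa}M^{\lambda}(\log T)^{\mu}, \tag{18.5.6}$$
where $\kappa > 0$, uniformly for
$$N \le T \le MN\left(M^{\lambda}N^{\kappa}(\log MN)^{\mu}\right)^{-1/(\kappa+1)}. \tag{18.5.7}$$
Then
$$R = \sum_{m=M}^{M_2}\rho\left(NF\left(\frac mM\right)\right) \ll \left(M^{\lambda}N^{\kappa}(\log MN)^{\mu}\right)^{1/(\kappa+1)}. \tag{18.5.8}$$
If (18.5.6) holds for $M^{\alpha} \le T \le M^{\beta}$, then (18.5.8) holds for $M^{\alpha} \le N \ll M^{(\beta-1)(\kappa+1)+\lambda}(\log M)^{\mu}$.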
Proof We pick a positive integer H with
2H<M(M'NK(logMN)µ)
1),
and use Lemmas 5.3.1 and 5.3.2 to obtain
R« M
2HE-2
+
h1
Mz
h
e(hNF(M m=M
We apply the bound (18.5.6) term by term with T = hN.
(18.5.9)
The parameter h in (18.5.9) is not used in the proof of Lemma 18.5.2. However, we can fit the sum over h into the iteration of Theorem 17.4.2. Let E1 be the average size of the upper bound for the sums with h * 0. Then
R << M/H+E1. We difference repeatedly to get
El << M/ D2 + (E2M) , where E2
is the average size of exponential sums formed with A(d2)hNF(x/M) in the notation of Section 17.4, and so on. We write D1 for H, and we choose parameters so that
D,., 1 = Dr
Dr = M/Er ,
which is (17.4.1). The smoothing-the-difference argument of Lemma 17.4.1
goes through with $f'(x)$ replaced by $(N/2)F(x/M)$, and the parameter choices are the same, when we interpret $T$ as $MN/2$.
Theorem 18.5.3 Let $q$ be a positive integer. Let $F(x)$ be a real function $q + 5$ times continuously differentiable for $1 \le x \le 2$, with
$$F^{(r)}(x) \ne 0 \tag{18.5.10}$$
for $r = q + 1$, $q + 2$,
$$\frac{d^2}{dx^2}\log F^{(r)} \ne 0 \tag{18.5.11}$$
for $r = q + 1$,
$$3(F^{(q+2)})^2 - F^{(q+1)}F^{(q+3)} \ne 0,$$
and
$$\begin{vmatrix} 3(F^{(q+2)})^2 + 4F^{(q+1)}F^{(q+3)} & 3F^{(q+1)}F^{(q+2)} & (F^{(q+1)})^2 \\ F^{(q+3)} & F^{(q+2)} & F^{(q+1)} \\ F^{(q+4)} & F^{(q+3)} & F^{(q+2)} \end{vmatrix} \ne 0.$$
Let $R$ be the sum in (18.5.8), with $M$ sufficiently large. Define $Q$ by $Q = 2^q$. In the range $N \ge N_6$ we also require (18.5.10) for $r = q + 3$ and (18.5.11) for $r = q + 2$. Then, for any $\varepsilon > 0$ we have
$$R \ll$$
for $N_1 \ge N \ge \max(N_2, N_3)$, and also for $N \ge N_1$ when we replace the power of $\log N$ by $N^{\varepsilon}$,
$$R \ll M^{(214Q-49q-165)/(214Q-49)}N^{49/(214Q-49)}(\log N)^{33(5q^2+6)/(214Q-49)}$$
Table 18.1  Rounding error bounds

Exponent β              From α =        To α =
23(1 + α)/73            63/83           83/63 = 1.3175...
(7 + 3α)/15             83/63           4447/3252 = 1.3675...
(355 + 496α)/1396       4447/3252       837/608 = 1.3767...
(197 + 344α)/902        837/608         643/466 = 1.3798...
(68 + 43α)/171          643/466         3001/2068 = 1.4512...
(43 + 4α)/64            3001/2068       19/12 = 1.5833...
(62 + 29α)/140          19/12           374/227 = 1.6476...
(356 + 89α)/641         374/227         872/423 = 2.0615...
(80 + 11α)/122          872/423         188/87 = 2.1609...
(34 + 13α)/73           188/87          3323/1424 = 2.3336...
(107 + 4α)/132          3323/1424       299/120 = 2.4917...
(593 + 49α)/807         299/120         2461/967 = 2.5450...
(193 + 29α)/300         2461/967        3619/1393 = 2.5980...
(979 + 89α)/1353        3619/1393       2255/779 = 2.8947...
(193 + 11α)/246         2255/779        485/161 = 3.0124...
(107 + 13α)/159         485/161         9325/2848 = 3.2742...
(239 + 4α)/268          9325/2848       2473/720 = 3.4347...
(1400 + 49α)/1663       2473/720        7012/1983 = 3.5361...
(484 + 29α)/620         7012/1983       3356/939 = 3.5740...
(2314 + 89α)/2777       3356/939        5666/1491 = 3.8001...
(430 + 11α)/494         5666/1491       1214/309 = 3.9288...
(266 + 13α)/331         1214/309        24177/5696 = 4.2446...
(507 + 4α)/540          24177/5696      141/32 = 4.4062...
(3063 + 49α)/3375       141/32          3639/803 = 4.5318...
(1095 + 29α)/1260       3639/803        5169/1133 = 4.5622...
(5073 + 89α)/5625       5169/1133       2769/583 = 4.7496...
(915 + 11α)/990         2769/583        591/121 = 4.8843...
(597 + 13α)/675         591/121         59577/11392 = 5.2297...

Steps column, top to bottom, each label bracketing a group of rows: D; D, E; E; C(1), C(1), C(1); C(2), C(2), C(2); C(3), C(3), C(3); C(4), C(4), C(4).
for $N_2 \ge N \ge N_4$, and for $N_8 \ge N \ge N_{10}$ if either of these ranges is non-empty,
$$R \ll M^{(62Q-11q-33)/(62Q-2)}N^{11/(62Q-2)}(\log N)^{6(5q^2+6)/(31Q-1)}$$
for $\min(N_3, N_4) \ge N \ge N_5$,
$$R \ll M^{(356Q-89q-267)/(356Q-71)}N^{89/(356Q-71)}(\log N)^{57(5q^2+6)/(712Q-142)}$$
for $N_5 \ge N \ge N_7$,
$$R \ll M^{(80Q-29q-69)/20(4Q-1)}N^{29/20(4Q-1)}(\log N)^{3(5q^2+6)/5(4Q-1)}$$
for $N_7 \ge N \ge \max(N_8, N_9)$,
$$R \ll M^{(34Q-4q-21)/(34Q-4)}N^{2/(17Q-2)}(\log N)^{3(5q^2+6)/(17Q-2)}$$
for $\min(N_9, N_{10}) \ge N \ge N_{11}$, and also for $N \le N_{11}$ when we replace the logarithm power by $N^{\varepsilon}$. The ranges are defined by $N_1$
=M9+2-6E
N 2 5Q = M75gQ + 64Q + 26 (log M)15(5g 2 + 6xQ -1)
N37Q+13 = M37gQ+31Q+13q+39(log
M)16(5g2+6)(Q-1)
N76Q+49 = M76gQ+62Q+49q+ 143(log M)S8(5g2+6)(Q -1) 4
0
N7867 =M178gQ+124Q+67q+201(log
M)82(5g2+6xQ-1)-356Q+71
N6267Q+18 = M267gQ+160Q+18q+125(log
M)57(5g2+6xQ-1)
N356Q - 31 = M356gQ+ 196Q- 31q+49(log M)32(5g2+6XQ - 1) - 356Q+71
N254Q-49 = M254gQ+134Q-49q-9(log
N37Q-2 = M37gQ+17Q-2q+8(log
N Q =M90gQ+34Q+41(log
o
M)8(5g2+6)(Q- 1)
M)4(5g2+6)(Q- 1)
M)18(5g2+6xQ-1)
N 1Q =M2gQ+1+4QE
The implied constants in the upper bounds are constructed from q and the derivatives of the function F(x), and, where appropriate, from e.
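The exponents in Theorem 18.5.3 can be cross-checked against one another. With $N = M^{\alpha}$, each bound $M^{x}N^{y}$ becomes $M^{\beta}$ with $\beta = x + y\alpha$, and two consecutive bounds change over where their values of $\beta$ agree. The sketch below assumes $Q = 2^q$ and uses four of the bounds as read from the statement, labelled by the numerator of the $N$-exponent; the dictionary keys and function names are hypothetical.

```python
from fractions import Fraction


def bounds(q):
    """(M-exponent, N-exponent) of four bounds of Theorem 18.5.3, assuming Q = 2**q."""
    Q = 2 ** q
    return {
        "4": (Fraction(34 * Q - 4 * q - 21, 34 * Q - 4), Fraction(2, 17 * Q - 2)),
        "29": (Fraction(80 * Q - 29 * q - 69, 20 * (4 * Q - 1)), Fraction(29, 20 * (4 * Q - 1))),
        "89": (Fraction(356 * Q - 89 * q - 267, 356 * Q - 71), Fraction(89, 356 * Q - 71)),
        "11": (Fraction(62 * Q - 11 * q - 33, 62 * Q - 2), Fraction(11, 62 * Q - 2)),
    }


def crossover(first, second):
    """Value of alpha at which x1 + y1*alpha equals x2 + y2*alpha."""
    (x1, y1), (x2, y2) = first, second
    return (x1 - x2) / (y2 - y1)


b = bounds(1)
for lower, upper in [("4", "29"), ("29", "89"), ("89", "11")]:
    print(lower, upper, crossover(b[lower], b[upper]))
```

For $q = 1$ the three crossover values come out as $19/12$, $374/227$ and $872/423$, which agree with consecutive breakpoints in Table 18.1.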
We can now set up a table of the best bounds for the rounding error sum $R$, when all the appropriate derivatives and determinants of derivatives are non-zero. We put $N = M^{\alpha}$, $\alpha \ge 1$. The bound $R \ll M^{\beta}(\log M)^{\delta}$ (for some bounded $\delta$) is valid for $\alpha$ in the range indicated. We have not optimized the logarithm power; this enables us to change from one estimate to the next at a number of the form $M^{\gamma}$ rather than $M^{\gamma}(\log M)^{\delta}$. The suppressed order-of-magnitude constants are constructed from the range of values of the derivatives of the dimensionless function $F(x)$. If the rounding error bound behaved like a random variable, then we would have the upper bound with $\beta = 1/2$ 'almost surely'. We have shown that there is enough regularity in the rounding error to ensure some cancellation. In the first column of Table 18.1, $C(q)$ denotes a case of Theorem 18.5.3, $D$ denotes the double sum of Theorem 18.1.2, and $E$ denotes that Lemma 18.5.2 has been used with some bound for the simple exponential sum. In some ranges we use double sum bounds for $H$ large and simple sum bounds for $H$ small. The bracketing of rows follows the same rules as in Table 17.1.
When these bounds are used to estimate the number of integer points close to a curve, then small ranges of $H$ cannot dominate; Theorem 18.2.1 can be used on the whole range $\alpha < 643/466$. In certain ranges of $\alpha$, corresponding to corners of the graph of $\beta$ against $\alpha$, the iterative ideas sketched in Section 19.2 give a small improvement.
There is a rather different application to integrals in two dimensions. Let $g(x, y)$ be a smooth function of two variables, and let $D$ be a region bounded
by a piecewise smooth curve C. The simplest grid method of numerical integration is to take a square lattice of side 1/M, sum the values of g(x, y) at the lattice points within D, and divide by $M^2$ to approximate the integral of g(x, y) over D. By the Riesz interchange, this estimate corresponds to
$$\frac{1}{M^2}\left(t_0 N(t_0) + \int_{t_0}^{t_1} N(t)\,dt\right),$$
where $t_0$ and $t_1$ are the infimum and the supremum of g(x, y) on D, and N(t) is the number of lattice points inside the subset D(t) of D on which $g(x, y) \ge t$. Let C(t) be the boundary of D(t), made up piecewise of arcs of C
and contour lines g(x, y) = t. We can expect further cancellation using Theorem 18.4.2 on each component of D(t), if the curves C(t) change smoothly with t. In general, part of C(t) consists of an arc of the curve C. However, if g(x, y) vanishes on the boundary of D (the Dirichlet condition), then each arc of the curve C(t) changes with t. Calculations of this type were made by Huxley (1993b) for the case when the surface z = g(x, y) is a convex cone. The mean square rounding error is given by the term in $T^{26/43}$ in Theorem 18.4.2 whenever C has a radius of curvature which is non-zero, piecewise continuous, and twice continuously differentiable. However, a cone can be treated directly by Theorem 6.3.1, in which the exponent is 1, not 26/43. The Fourier series method seems to be the appropriate one in three dimensions and higher (Hlawka 1950).
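The interchange between the grid sum and the counting function N(t) can be checked numerically. The following is a minimal sketch, not taken from the text: the function names, the choice of the unit disc for D, and the test function $g = 1 - x^2 - y^2$ (which satisfies the Dirichlet condition) are all illustrative assumptions. Since N(t) is a step function, the integral is evaluated exactly from the sorted values of g.

```python
import math

def lattice_points(M, inside):
    """Points (i/M, j/M) of the square lattice of side 1/M lying in D.
    The search box [-1, 1]^2 is an assumption suited to the unit disc."""
    return [(i / M, j / M)
            for i in range(-M, M + 1)
            for j in range(-M, M + 1)
            if inside(i / M, j / M)]

def grid_sum(M, g, inside):
    """Direct grid estimate: sum of g over lattice points, divided by M^2."""
    return sum(g(x, y) for x, y in lattice_points(M, inside)) / M**2

def layer_cake(M, g, inside, t0):
    """The same estimate written as (t0*N(t0) + integral of N(t) dt) / M^2.

    N(t) counts lattice points with g >= t.  It is constant between
    consecutive sorted values of g, so the integral is a finite sum.
    """
    values = sorted(g(x, y) for x, y in lattice_points(M, inside))
    n = len(values)
    total = t0 * n                 # t0 * N(t0): every point has g >= t0
    prev = t0
    for k, v in enumerate(values):
        total += (n - k) * (v - prev)   # n - k points have g >= t on (prev, v]
        prev = v
    return total / M**2

# Illustrative choice: D the unit disc, g vanishing on its boundary.
disc = lambda x, y: x * x + y * y <= 1.0
g = lambda x, y: 1.0 - x * x - y * y
print(grid_sum(50, g, disc), layer_cake(50, g, disc, 0.0), math.pi / 2)
```

The two routines agree up to floating-point rounding, and both approximate the integral of g over the disc, which is $\pi/2$ for this choice.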
19 Further results
19.1 EXPONENTIAL SUMS WITH A DIFFERENCE
The three sections of this chapter are unified only by the idea of iteration. In the course of treating an exponential sum, we come to another exponential sum, or a problem of integer points close to a curve. The first idea is to use the differencing step of Lemma 5.6.2, which produces a double sum with a central difference as in Chapter 9. The end result is slightly weaker
than Theorem 17.1.4, but still better than any bound obtained by the van der Corput iteration in one variable. The result would remain weaker than Theorem 17.1.4 even if we had the best possible result in the First Spacing Problem, with Lemma 13.3.1 as sharp as Lemma 13.1.1. These results
improve those of Heath-Brown and Huxley (1990), and were partly announced in Huxley (1994b).
Theorem 19.1.1 Let F(x) be a real function, four times continuously differentiable for $1 \le x \le 2$, and let g(x), G(x) be bounded functions of bounded variation on $1 \le x \le 2$. Let $C_0, \ldots, C_7$ be real numbers $\ge 1$. Let H, M, and T be large parameters. Suppose that
$$|F^{(r)}(x)| \le C_r \qquad (19.1.1)$$
for r = 2, 3, 4,
$$|F^{(r)}(x)| \ge 1/C_r \qquad (19.1.2)$$
for r = 2, 3, and that either Case 1 or Case 2 holds:
Case 1 $M \le C_0T^{1/2}$, and (19.1.2) holds for r = 4 also.
Case 2 $M \ge C_0^{-1}T^{1/2}$, and $|F^{(2)}F^{(4)} - 3(F^{(3)})^2| \ge 1/C_5$.
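Let S denote the sum
$$S = \sum_{h=H}^{2H-1} g\left(\frac{h}{H}\right) \sum_{m} G\left(\frac{m}{M}\right) e\left(TF\left(\frac{m+h}{M}\right) - TF\left(\frac{m-h}{M}\right)\right).$$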
Then there are positive constants B1 and B2, depending on CO,..., C7, and on the functions g(x) and G(x), such that, for M in the range C6 'T 41/111 S
MS C6T 65/114
and H satisfying
H < B1MT 43/138,
(19.1.3)
and also
H>_C;1VIM' forM5C61T7/16, and also
H> C71M11/T6 for M> C6T9/11, we have H 87/140
H 13/70
<
S
I
T23/7o +
(M)
T18/35l (log T)9/4. (19.1.4)
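In the range $C_6^{-1}T^{1/3} \le M \le C_6T^{7/16}$,
$$H \le \min(B_1M^{13/19}T^{-10/57},\ B_1M^{9/7}T^{-3/7},\ C_7T^4M^{-9}), \qquad (19.1.5)$$
we have
$$S \le B_2\left(H^{15/14}M^{3/14}T^{3/14} + H^{17/18}M^{-1/6}T^{7/18} + H^{93/56}M^{-15/56}T^{5/14} + H^{107/72}M^{-19/24}T^{43/72}\right)(\log T)^{9/4}. \qquad (19.1.6)$$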
In the range $C_6^{-1}T^{9/16} \le M \le C_6T^{2/3}$,
$$H \le \min(B_1M^{25/19}T^{-28/57},\ B_1M^{5/7}T^{-1/7},\ C_7M^{11}T^{-6}), \qquad (19.1.7)$$
we have S
+H17/18M5/18T1/6 + H 107/72M- 13/72 T 7/24 )(log T)914. (19.1.8)
Proof We use Lemma 18.1.1 as in the proof of Theorem 18.1.2, making the same choices of N and R. There is a new condition $H^3 \ll N^2R$ from (9.3.4) of Lemma 9.3.3, which makes the upper bounds for H in (19.1.3), (19.1.5), and (19.1.7) smaller than in Lemma 18.1.1. The bound for A in Lemma 13.3.1 is
$$A \ll (K^2L^2 + KL^4)\log K,$$
which gives an extra factor $(1 + H^2/RN)^{1/4}$ in the bounds for S, with extra terms in (19.1.4), (19.1.6), and (19.1.8) which do not appear in Theorem 18.1.1.
O
In order to state our next theorem in generality, we need one of the two-variable forms of Lemma 5.2.1.
Lemma 19.1.2 (Partial summation in a modulus squared integral) Let g(x) be a real function of bounded variation V on the closed interval $a \le x \le b$, where a and b are integers. Let $f_n(t)$ be continuous functions for $\alpha \le t \le \beta$, $a \le n \le b$. Then 2
b
2
n
R
L afm(t) dt.
max f E g(n)fn(t) dt 5 2(Ig(a)I + V)2 n5b f'3 a a a
Proof We write g(x) as the difference of two positive monotone functions on $a \le x \le b$, g(x) = h(x) - k(x), with k(a) = 0 if g(a) > 0, and h(a) = 0 if g(a) < 0. We put n-+
Fn(t) = L.fn(t) a
for a 5 n 5 b, and hl(n) = h(n + 1) - h(n), for n = a + 1, ... , b, and
hl(a) = h(a). Then b
b
2
h(n)fn(t)) =
I
b
2
E hl(n)F,(t)I 5
hi(n))
a
a
(F',
a
a
hl(n) IFn(t)12) .
Integrating, we have 2
b
f13 a
b
E h(n).fn(t) dt
a
a
Now h(b) + k(b) = Ig(a)I + V, so the result follows from Cauchy's inequality, b
b
2
2
b
2IFlh(n)fn(t) +2 Ek(n)fn(t)I a
a
2 .
O
a
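The inequality of Lemma 19.1.2 can be tested numerically. The sketch below rests on two assumptions that are not in the text: the inner sums are taken to be the tail sums $F_n(t) = \sum_{m=n}^{b} f_m(t)$ (the reading for which the Cauchy-inequality argument of the proof gives the constant 2), and the integral over t is replaced by an average over sample points, which the same argument covers because it works for any positive measure. The variation V is computed from the integer samples g(a), ..., g(b). All function names are hypothetical.

```python
import cmath
import random

def e(x):
    """e(x) = exp(2 pi i x)."""
    return cmath.exp(2j * cmath.pi * x)

def check_partial_summation(g_vals, f, ts):
    """Return (lhs, rhs) of the inequality, averaged over the points ts.

    g_vals[i] stands for g(a + i) and f(i, t) for f_{a+i}(t).
    lhs: mean of |sum_i g_i f_i(t)|^2.
    rhs: 2 (|g(a)| + V)^2 times the largest mean of |tail sum|^2.
    """
    n = len(g_vals)
    V = sum(abs(g_vals[i + 1] - g_vals[i]) for i in range(n - 1))
    lhs = 0.0
    tail_sq = [0.0] * n
    for t in ts:
        terms = [f(i, t) for i in range(n)]
        lhs += abs(sum(gv * z for gv, z in zip(g_vals, terms))) ** 2
        tail = 0j
        for i in range(n - 1, -1, -1):   # tail sum f_i + ... + f_{n-1}
            tail += terms[i]
            tail_sq[i] += abs(tail) ** 2
    rhs = 2 * (abs(g_vals[0]) + V) ** 2 * max(tail_sq)
    return lhs / len(ts), rhs / len(ts)

# Illustrative data: random weights and the phases e(t * sqrt(100 + i)).
random.seed(1)
g_vals = [random.uniform(-1, 1) for _ in range(40)]
ts = [0.05 * k for k in range(400)]
lhs, rhs = check_partial_summation(
    g_vals, lambda i, t: e(t * (100 + i) ** 0.5), ts)
print(lhs <= rhs)
```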
We apply Theorem 19.1.1 to the mean square of a simple exponential sum when the parameter T is regarded as a continuous variable.
Theorem 19.1.3 Let M, T, and the functions F(x) and G(x) be as in Theorem 19.1.1, with (19.1.1) and (19.1.2) also holding for r = 1. Let S(t) be the exponential sum
$$S(t) = \sum_{m=M}^{2M-1} G\left(\frac{m}{M}\right) e\left(tF\left(\frac{m}{M}\right)\right),$$
and let $\Delta$ be any real number with
$$\Delta \gg T^{72/227}(\log T)^{315/227}.$$
Then for
$$T^{187/454}(\log T)^{279/454} \ll M \ll T^{267/454}(\log T)^{-279/454}, \qquad (19.1.9)$$
we have the mean square bound
$$\int_{T-\Delta}^{T+\Delta} |S(t)|^2\,dt \ll \Delta M. \qquad (19.1.10)$$
Corollary Under the hypotheses of the theorem we have
$$S(t) \ll M^{1/2}T^{36/227}(\log T)^{315/454}. \qquad (19.1.11)$$
Proof After Lemma 19.1.2 we only need consider the special case when G(x) is 1 for $1 \le x \le M_2/M$ (for some $M_2 < 2M$), otherwise zero. To obtain central differences, we let $S_1(t)$ and $S_2(t)$ denote the terms in S(t) with m odd and m even, respectively. Then, since $S(t) = S_1(t) + S_2(t)$, we have $|S(t)|^2 \le 2|S_1(t)|^2 + 2|S_2(t)|^2$.
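We apply Lemma 5.6.4 to the short interval means of $S_1(t)$ and $S_2(t)$ with $s_m = F(m/M)$, and t as the variable of integration. We have
$$\int_{T-\Delta}^{T+\Delta} |S_i(t)|^2\,dt \ll \Delta \sum_{n \equiv k \equiv i\ (\mathrm{mod}\ 2)} e\left(TF\left(\frac{k}{M}\right) - TF\left(\frac{n}{M}\right)\right) \Lambda\left(2\Delta F\left(\frac{k}{M}\right) - 2\Delta F\left(\frac{n}{M}\right)\right),$$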
where n and k are summed from M to M2. With the substitution k = m + h, n = m - h, the terms for which h = 0 are seen to be O(AM), whilst the other
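terms fall into ranges $H \le h < 2H$, or $-2H < h \le -H$, where H is some power of 2. The congruence condition is $m \equiv h + i \pmod 2$, which disappears when we sum the contributions of $S_1$ and $S_2$. We consider
$$\sum_{h=H}^{2H-1}\ \sum_{m=M+h}^{M_2-h} e\left(TF\left(\frac{m+h}{M}\right) - TF\left(\frac{m-h}{M}\right)\right) \Lambda\left(2\Delta F\left(\frac{m+h}{M}\right) - 2\Delta F\left(\frac{m-h}{M}\right)\right). \qquad (19.1.12)$$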
The expression (19.1.12) is of the type estimated in Theorem 19.1.1, except
that the variables are not separated in the weight function $\Lambda(x)$. On the major arcs we used Poisson summation over m and an upper bound for the resulting integrals. Any weight which has bounded variation in m for fixed h makes no difference to the upper bounds on the major arcs.
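The passage to central differences in the proof above can be illustrated without the weight function. In the sketch below (an illustration only, with hypothetical function names, $F(x) = x^{3/2}$ and the numerical parameters chosen arbitrarily) the sum S(t) over $M \le m \le M_2$ is split by the parity of m, and $|S_1|^2 + |S_2|^2$ is rewritten with k = m + h, n = m - h. Pairs (k, n) of equal parity correspond one-to-one to integer pairs (m, h), so no congruence condition remains after adding the two parity classes.

```python
import cmath

def e(x):
    """e(x) = exp(2 pi i x)."""
    return cmath.exp(2j * cmath.pi * x)

def parity_split(M, M2, t, F):
    """Return (S, S1, S2): the full sum and its odd-m and even-m parts."""
    s1 = sum(e(t * F(m / M)) for m in range(M, M2 + 1) if m % 2 == 1)
    s2 = sum(e(t * F(m / M)) for m in range(M, M2 + 1) if m % 2 == 0)
    return s1 + s2, s1, s2

def central_difference_sum(M, M2, t, F):
    """|S1|^2 + |S2|^2 as a double sum over m and h.

    With k = m + h and n = m - h both in [M, M2], the admissible range
    for m is M + |h| <= m <= M2 - |h|.
    """
    hmax = (M2 - M) // 2
    total = 0j
    for h in range(-hmax, hmax + 1):
        for m in range(M + abs(h), M2 - abs(h) + 1):
            total += e(t * (F((m + h) / M) - F((m - h) / M)))
    return total

F = lambda x: x ** 1.5          # illustrative choice of F
S, S1, S2 = parity_split(50, 90, 37.3, F)
D = central_difference_sum(50, 90, 37.3, F)
print(abs(D - (abs(S1) ** 2 + abs(S2) ** 2)))
```

The terms with h = 0 contribute the number of m in the range, which is the diagonal term of size O(M) mentioned in the proof.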
On the minor arcs we approximate the weight as a function of h only. Let n be the centre of the minor arc containing m. Then
$$2\Delta\left(F\left(\frac{m+h}{M}\right) - F\left(\frac{m-h}{M}\right) - F\left(\frac{n+h}{M}\right) + F\left(\frac{n-h}{M}\right)\right) = \frac{4\Delta h(m-n)F''(\xi)}{M^2} \ll \frac{\Delta HN}{M^2}$$
for some $\xi$. After dividing the sum (19.1.12) into Farey arcs, we can evaluate the $\Lambda$-function at the centre of the minor arc, so that it becomes a function of h only, with total error
$$O(\Delta H^2N/M) \ll \Delta M.$$
We now have a weight which is monotone in h, but depends on the minor arc. By Lemma 5.1.1 applied to the sum over h, we must estimate sums in which h runs through a subinterval of $H \le h < 2H$, depending on the minor arc. The next step in the argument, Lemma 9.3.1, involves taking a maximum over subintervals anyway, so the argument continues as before.
The bounds of Theorem 18.1.2 are increasing functions of H, and are dominated by the top range, which is $H \asymp M/\Delta$, since the $\Lambda$-function in (19.1.12) is $\Lambda(4\Delta hF'(\xi)/M)$ for some $\xi$, and F'(x) does not vanish. The bound of Theorem 19.1.3 is linear in $\Delta$, and the choice $\Delta \asymp T^{72/227}(\log T)^{315/227}$ just gives $S \ll M$ in (19.1.4) of Theorem 19.1.1. The bound (19.1.6) is smaller than (19.1.4) for $H^{31} \gg T^{19}/M^{39}$, and the bound (19.1.8) is smaller than (19.1.4) for $H^{31} \gg M^{101}/T^{51}$.
Thus we get (19.1.10) in part of the region where (19.1.4) or (19.1.6) is the appropriate bound when $H \asymp M/\Delta$. The range (19.1.9) where (19.1.10) is valid is longer than the range in which (19.1.4) in Theorem 19.1.1 is valid. The corollary follows from the mean-to-max argument of Lemma 5.6.3; we apply Theorem 19.1.3 with G(x) replaced by $G(x)\exp(\sigma F(x/M))$ for some $\sigma$ in $-1 < \sigma < 1$. Since $\Delta > 2$, most of the range of integration in (19.1.10) is not used; in fact we can replace the integral in (19.1.10) by a sum over $|S(t_i)|^2$ at a sequence of well-spaced points $t_i$ in the interval
$T - \Delta + 2 \le t_i \le T + \Delta - 2$, with $t_{i+1} - t_i \ge 4$. We also have forms of Theorems 19.1.1 and 19.1.3 for a family of exponential sums.
Theorem 19.1.4 Let F(x, y) be a real function defined for 1 :5x:5 2, 05 y5 1, for which the partial derivatives are continuous and satisfy t9 c9 F(x,Y)I S C1
for 25r56, s52, r+s56, and (19.1.13)
1/C1 5 I Br1F(x, y)I
for r = 2, 3, and
a;-' a2F(x, y) a' d,F(x, y)
(19.1.14)
- C2 2
for r = 3, where C1 and C2 are constants. Suppose that either Case 1 or Case 2 holds:
Case 1
(19.1.13) and (19.1.14) hold for r = 4 also.
Case 2 For some positive constants C31 C4 13F111
Io1
C31
- F11F1111
C4
where A is the determinant 3F1i1 + 4F11F1111
0=
3F11F111
F1i
F11111
F1111
F111
F11112
F1112
F112
Let Y1, ... , y, be points in [0,1] with y;+ 1- y; Z 1/J, where J:5 T. Let g(x) and G;(x) be bounded functions of bounded variation on [0,1] (uniformly in i in the case of G;(x)). Let Si denote the sum
2H-1 /h
S;= E gl -) H h=H
M-2H+1
E
G,
m=M+2H-1
m1 M
M)e(TF(I
m+h
m-h
,Y;) -TF( M 1M
,Y;
where H and M are positive integers, and T is a real number. Then there are constants $C_5$ and $C_6$ constructed from $C_1, \ldots, C_4$ such that if $C_5T^{1/3} \le M \le C_5^{-1}T^{2/3}$ and
$$H \le C_6 \min(M^{3/2}/T^{1/2},\ M^{1/2},\ MT^{-8/27}), \qquad (19.1.15)$$
then we have bounds of the form
(H IS;12 << D1/2E4/21H2I
121
IT13/21 logs T
M)
(H 25/21 IT41/42 logs T, +D1/2El1/42H2I H) M
(19.1.16)
provided that
E 5 C6M9/H9T8/3,
(19.1.17)
where the constants in the upper bounds are constructed from C1, ... , Ca1 from the bounds for the functions g(x) and G.(x), and from the extra constant C7 in Cases (b) and (c): Case (a)
We have (19.1.16) with
J
(HT)1/5
+I,
E=(I3M
D=1,
provided that Case 1 or Case 2 holds for F(x), and that T
E z C6 1 min M2 +
Case (b)
M2 HT 2/5 T ,(
5
.
M
We have (19.1.16) with
D=1+
/ HM5 11/2
JM2
IT
,
E=1+I D3I3T2)
TS
1/6
(H5M9)
provided that Case 1 holds and M < C7T1/2, Where C7 is a constant that may be chosen arbitrarily. Case (c)
We have (19.1.16) with
JT
D=1+IM2,
HTa
11/2
E=1+(D3I3M7)
( M19 \ 1/6
+I H5T9)
provided that Case 2 holds and M > C7T 1/2.
Proof We make the same parameter choices as in Theorem 18.4.1, but omitting the case $N \asymp H$, and imposing the condition $H^3 \ll N^2R$ from (9.3.4) of Lemma 9.3.3, which leads to the extra condition (19.1.17) and the third term in the minimum in (19.1.15). The extra condition is not essential. For $H^3 \gg N^2R$, the extra term in Lemma 9.3.3 becomes larger as the gardening becomes less effective, eventually dominating the contribution from the large sieve, so that the bound no longer has the form (19.1.16). The condition $H^3 \ll N^2R$, by eliminating the choice $N \asymp H$, allows us to collect terms in the form (19.1.16).
Theorem 19.1.5 Let M, T, the points $y_1, \ldots, y_I$, and the functions F(x) and $G_i(x)$ be as in Theorem 19.1.4, with the bounds (19.1.1) and (19.1.2) of Theorem 19.1.1 also holding for r = 1. Let S(t) be the exponential sum of Theorem 19.1.3, and let $\Delta$ be any real number with $\Delta \gg I^{-11/227}T^{72/227}(\log T)^{350/227}$.
Then for
J
1/227
T31
I <<
I134(log T)7o
and $I^{107/454}T^{187/454}(\log T)^{155/227} \ll M \ll I^{-107/454}T^{267/454}(\log T)^{-155/227}$
we have the mean square bound
$$\sum_{i=1}^{I} \int_{T_i-\Delta}^{T_i+\Delta} |S(t)|^2\,dt \ll I\Delta M, \qquad T_i = T(1 + y_i).$$
19.2 A MAJOR ARC ESTIMATE
We consider the major arc sum of Lemma 7.6.1, with length parameter $N' = \max(N_2, -N_3)$, which may be longer than the usual range for N in Chapter 7. The sums over r have
$$r \ll \mu qN'^2. \qquad (19.2.1)$$
Since r does not have constant order of magnitude, we divide the sum into blocks with $H \le r \le 2H - 1$, where H is some power of 2. As in Chapter 7, we can write
$$G(a, b + r; q) = e\left(-\frac{\overline{4a}\,(b + r)^2}{q}\right) G(a, 0; q) \qquad (19.2.2)$$
for q odd, and similarly in the other cases. The weight function remaining is
$$\frac{G(a, 0; q)}{(12\mu q^3(r - \kappa))^{1/4}} \ll \frac{1}{(\mu qH)^{1/4}}. \qquad (19.2.3)$$
As in Lemma 7.3.1, we have
h(gyr) - ryr
-2
(r- K)3 27µq3
>
H3
(µq3) .
(19.2.4)
We assume that this term dominates the term in r from (19.2.2). Since we can replace $\overline{4a}$ and $\overline{2a}\,b$ by integers congruent to them modulo q in the range -q/2 to q/2, we require
$$\mu q^3H < 1/C_1 \qquad (19.2.5)$$
for some sufficiently large constant $C_1$.
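The completing-the-square relation behind (19.2.2) is easy to confirm by direct computation. The sketch below assumes the definition $G(a, b; q) = \sum_{n \bmod q} e((an^2 + bn)/q)$, which is the standard quadratic Gauss sum but is an assumption about the notation of Chapter 7; with that definition the factor in front of G(a, 0; q) is $e(-\overline{4a}\,b^2/q)$, where $\overline{4a}$ is the inverse of 4a modulo the odd number q. The function names are hypothetical.

```python
import cmath
import math

def e(x):
    """e(x) = exp(2 pi i x)."""
    return cmath.exp(2j * cmath.pi * x)

def gauss_sum(a, b, q):
    """G(a, b; q) = sum over n mod q of e((a n^2 + b n)/q) (assumed definition)."""
    return sum(e(((a * n * n + b * n) % q) / q) for n in range(q))

def completed_square(a, b, q):
    """e(-inv(4a) b^2 / q) G(a, 0; q), for odd q coprime to a."""
    inv4a = pow(4 * a, -1, q)      # modular inverse, Python 3.8 or later
    return e(((-inv4a * b * b) % q) / q) * gauss_sum(a, 0, q)

def max_discrepancy(q_max):
    """Largest |G(a,b;q) - completed square| over odd q <= q_max, (a,q) = 1."""
    worst = 0.0
    for q in range(3, q_max + 1, 2):
        for a in range(1, q):
            if math.gcd(a, q) != 1:
                continue
            for b in range(q):
                diff = abs(gauss_sum(a, b, q) - completed_square(a, b, q))
                worst = max(worst, diff)
    return worst

print(max_discrepancy(15))
```

The same computation shows $|G(a, 0; q)| = q^{1/2}$ for odd q coprime to a, which is the size used in the estimate (19.2.3).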
The exponent (19.2.4) is a monomial function with monomial part $-2r^{3/2}/(27\mu q^3)^{1/2}$. The conditions of Theorem 17.4.2 hold for all $R \ge 1$, but in Theorem 17.1.4 (which corresponds to the case R = 0) the condition (17.1.27) fails. In the special case $F(x) = x^{3/2}$, coincident minor arcs I(a/q) and I(a'/q') have a' close to a (analogously, for $F(x) = x^3$, coincident minor arcs have q' close to q). We must argue differently; one way is to choose the parameters so that there are no non-trivial coincidences. The condition (17.1.27) of Theorem 17.1.4 is needed only for $N' \gg 1/\mu q^2$, a very long major arc. We can use any of the bounds in Table 17.1 when
$$\alpha = \frac{2\log H}{\log(H^3/\mu q^3)} \le \frac{1}{2}, \qquad (19.2.6)$$
which is a consequence of (19.2.5).
As the length H of the transformed sum increases, the exponent (19.2.4) increases like $H^{3/2}$, and the weight (19.2.3) decreases like $H^{-1/4}$. The bounds of Table 17.1 have the form $\beta \le \rho + \sigma\alpha$ with $\rho + \sigma \ge \frac{1}{2}$, so, within each range, the estimate is largest when H is largest. The exponents $\beta$ change continuously from one range to the next, and the logarithm powers are bounded. The largest value of H in (19.2.1), $H \asymp \mu qN'^2$, corresponds to the largest value of $\alpha$,
$$\alpha = \frac{\log \mu qN'^2}{\log \mu N'^3}, \qquad (19.2.7)$$
and a weight function
$$\frac{1}{(\mu qN')^{1/2}}.$$
The dominant range contributes
$$\ll \mu^{\rho+\sigma-\frac{1}{2}}\, q^{\sigma-\frac{1}{2}}\, N'^{\,3\rho+2\sigma-\frac{1}{2}}. \qquad (19.2.8)$$
Smaller values of H contribute something smaller. We can simplify the calculations by using a weaker bound for blocks with H smaller. Suppose that
$\alpha$ given by (19.2.7) is in a range where we find $A^RC$ in the first column of Table 17.1. We use Theorem 17.4.2 with this value of R (Theorem 17.1.4 if R = 0) even for small H. Again, within each range, the estimate is largest when H is largest. There is a case M < N, where an extra factor $T^{\varepsilon}$ appears in Theorem 17.4.2. The corresponding blocks are inside ranges in which we would use $A^QC$ with some Q > R, and (19.2.6) gives values of $\alpha$ strictly less than either (19.2.7) or the next change-over point in Table 17.1. Without the factor '$T^{\varepsilon}$', these blocks would give something smaller than (19.2.8) by a positive power of N'. The factor '$T^{\varepsilon}$' becomes $O(N'^{3\varepsilon})$, which does not
cancel out all the positive power of N'. We summarize the result of this discussion as a lemma.
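The exponents in (19.2.8) come from routine bookkeeping, which can be mechanised. The sketch below assumes, as in the discussion above, a bound of the shape $T'^{\rho}M'^{\sigma}$ for a transformed sum of length $M' = H$ whose exponent has size $T' \asymp (H^3/\mu q^3)^{1/2}$, multiplied by the weight $(\mu qH)^{-1/4}$, and evaluated at the top block $H \asymp \mu qN'^2$. Monomials in $(\mu, q, N')$ are stored as triples of exponents; the function name and the sample values of $\rho$ and $\sigma$ are illustrative and are not taken from Table 17.1.

```python
from fractions import Fraction as Fr

def dominant_block_exponents(rho, sigma):
    """Exponents of (mu, q, N') in weight * T'^rho * M'^sigma at H = mu q N'^2."""
    H = (Fr(1), Fr(1), Fr(2))                    # H = mu q N'^2
    mu_q3 = (Fr(1), Fr(3), Fr(0))                # mu q^3
    mu_q = (Fr(1), Fr(1), Fr(0))                 # mu q
    # T' = (H^3 / (mu q^3))^(1/2), which works out to mu N'^3
    T1 = tuple((3 * h - d) / 2 for h, d in zip(H, mu_q3))
    # weight = (mu q H)^(-1/4), which works out to (mu q N')^(-1/2)
    weight = tuple(-(x + h) / 4 for x, h in zip(mu_q, H))
    return tuple(rho * t + sigma * h + w for t, h, w in zip(T1, H, weight))

rho, sigma = Fr(1, 6), Fr(1, 2)                  # illustrative values only
print(dominant_block_exponents(rho, sigma))
```

For any $\rho$ and $\sigma$ the result is the triple $(\rho + \sigma - \tfrac12,\ \sigma - \tfrac12,\ 3\rho + 2\sigma - \tfrac12)$, in agreement with (19.2.8).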
Lemma 19.2.1 (Long major arc estimate) Let $R \ge 0$ be an integer, and let F(x) be a real function R + 6 times continuously differentiable (four times differentiable when R = 0), with $F^{(3)}(x)$ non-zero for $1 \le x \le 2$. Let M (an integer) and T (a real number) be large, and let $[m + N_3, m + N_2]$ be a subinterval of [M, 2M - 1]. Write $N' = \max(N_2, -N_3)$. Suppose that
$$f''(m) = 2a/q + O(T/M^3).$$
Let
µ =f'()/6. Then there are absolute constants CO and C2, depending on the function F(x), such that if M 1 11 q5 , 1/3,
N'S-min 3,M C2 µq
COT
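and if
$$\alpha = \frac{\log \mu qN'^2}{\log \mu N'^3}$$
has $\alpha < \frac{1}{2}$, and falls into a range $A^RC$ in Table 17.1, then
$$\sum_{n=N_3}^{N_2} e\left(TF\left(\frac{m+n}{M}\right)\right) \ll \mu^{\rho+\sigma-\frac{1}{2}}\, q^{\sigma-\frac{1}{2}}\, N'^{\,3\rho+2\sigma-\frac{1}{2}} (\log T)^{\theta} + \mu^{-1/3}q^{-1/2}\log T,$$
where $\rho$ and $\sigma$ are the coefficients of the linear bound $\beta = \rho + \sigma\alpha$ in Table 17.1 valid for this value of $\alpha$. The exponent $\theta$ depends on R, and is uniformly bounded.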
O
Lemma 19.2.1 is powerful when q is small. However, in order to treat all arcs as major, we must take N' NR/q for q/R. With these choices, r is bounded in (19.2.1) when q x R. Large q dominate in Lemma 7.3.4, so
the improved bound for small q does not lead to an improvement of Lemma 7.3.4.
19.3 EXPONENTIAL SUMS WITH A LARGE SECOND DERIVATIVE
In Part IV we saw that the conditions for the coincidence of two minor arc vectors by a fixed magic matrix implied that there was an integer point close to a curve, the resonance curve. This curve is given by a complementary function construction, with a cusp corresponding to the point where the error in the Second Condition changes sign. The number of integer points that might be close to the resonance curve was estimated trivially, by the length of
the curve. There are two cases where the error in the Second Condition has constant sign, and the resonance curve has non-vanishing curvature. These are the triangular matrices (type 2) for a simple sum, and the identity matrix (type 1) for a family of sums with a parameter. The construction of Chapter 15 gives a sequence of short resonance curves depending on the reference fractions. The construction of Section 16.3 gives one resonance curve, but it
may be longer than the underlying curve y =f'(x) in the construction of Chapter 7 or y =f(x) in the construction of Chapter 8. We can use Theorem 3.6.3 which does not require extra conditions on the derivatives, or Lemma 18.5.2 together with a bound for exponential sums. The second choice may lead to an iteration, with the original function f(x) transformed by differentiation, differencing, and taking the complementary function. The class of those monomial functions (in the sense of Chapter 3) for which f'(x) becomes proportional to a negative power of x as M and T tend to infinity is closed under these operations, and satisfies all our conditions about the size and non-vanishing of derivatives. We obtain the results of Huxley and Kolesnik (1996). We consider a family of resonance curves z = k(y, t) constructed from the
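upper triangular matrices $\begin{pmatrix} 1 & B \\ 0 & 1 \end{pmatrix}$ or the lower triangular matrices $\begin{pmatrix} 1 & 0 \\ C & 1 \end{pmatrix}$, where
$$B = \Delta_2Pt/Q, \qquad C = \Delta_2Qt/P.$$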
Since the Coincidence Conditions are symmetric, we can suppose for convenience that B > 0, C > 0, and so t > 0. We see from Lemma 14.3.2 that t is bounded. We use Lemma 16.3.5 with rs «R2. If a coincidence extends over L z 1 consecutive minor arcs, then Lemma 13.4.2 allows us to replace A2 by O(A2/L2) in Lemma 14.3.2, so that t << 1/L2. Both estimates (14.3.15) and (14.3.16) for h"(a/q) give MQt
Ax02
P
At this point we must distinguish the two constructions. The double sum has
an extra parameter H; for the simple sum we replace H and N in the following estimates. We use the non-linear relation HR2
R2
2 A2x04NQ x04NQ2)
in which H appears naked. The result of Lemma 16.3.5 becomes R2
1
Ik(l)+ni<8<
K1 X A2Mt X
Q4
D4PQ2t.
(19.3.1)
Since ak/ay is an argument of h(x), z = k(y, t) runs through a range of length K2 x A2
PMt
Qx
R4
taP2Qt.
Q4
For upper triangular matrices we take y as the independent variable, M' = K1, N' = K2. For lower triangular matrices we take z as the independent variable, M' = K2, N' = K1. In both cases Lemma 14.3.2 gives M' <<
02N'. Lemma 19.3.1 (Points close to resonance curves) Suppose that the conditions of Lemma 14.3.5 hold, and that we have a bound of the form
R' << SM' + E M'AIN'"I(log M'N')Ai
(19.3.2)
for R', the number of integer points (1, n) satisfying (19.3.1). Then for M << FT and V chosen so that 1 << 401Q2 S 1, we have in the Second Spacing Problem BV<< L1L2A304P2R2V r
+0102
I
Ra
";+A; 1
Q4
pz";+A;+1Q";+zA;+1V(logT)A', (19.3.3)
Aa I
for a choice of N and R in the construction of Chapter 8 with N in the range
MM
M2
1/3
Nxmin (HT2)1/3' H(T)
M2"i+4Ai(logT)A'
+
H1+I(;+A;TA;
I
provided that the conditions (8.1.4)-(8.1.7) hold, and for a choice of N and R in the construction of Chapter 7 with N in the range M3
Nxmin T
1/2
M4 1/6
,(T
1/(2+3"i+3A;)
M2"i+4A;(log T)A1
+
TA
provided that the conditions (7.2.3)-(7.2.5) hold. 1 For M >> FT and V chosen so that 1 << B4 i1 P2 5 1, where B4 is the constant of Lemma 16.1.4, we have (19.3.3) with P and Q interchanged for a choice of N and R in the construction of Chapter 8 with N in the range M2 1/3 M8/3 M4+4"i+2Ai(logT)A' 1/(1+2"+2A;)
N x min
H) ' HT +
l
H1+";+A;T2+"i
I
provided that the conditions (8.1.4)-(8.1.7) hold, and for a choice of N and R in the construction of Chapter 7 with N in the range Ai 1/(2+3Ki+3Aj) M4/3 M4+4" +2A' T Uj 1/2, T
Nxmin M
1/2 +
z+ ";
provided that the conditions (7.2.3)-(7.2.5) hold.
Proof In the upper triangular case the number of coincidences of between K and 2K consecutive minor arcs given by a fixed magic matrix is <<
hiK2t
zR2
+ E M,a;N'K,(log MIN')A, M
+
K;+A;
R°
(-
We sum this over O(A2P/K2Q) magic matrices to
P << a2 K2Q 0304 P
+ 02 K2Q
R2
1
Q2 PQ
K2
K;+A;
(R4
p2K;+A;QK;+2A,(log T)A'.
I Q4 04
We sum K through powers of 2 to get P2R2 <<
2A3A4
Q2
R4
+A2
K;+A;
R A41 l
P2Ki+A;+1QK;+2a;-1(logT)A,
17C
(19.3.4)
coincidences. The identity matrix contributes $O((R^2/Q^2)PQ)$ trivial coincidences. We want to make the ratio R/N small, but not so small that the trivial coincidences dominate. Either
$$H^3N^2 \ll MR^2,$$
or, for some i,
$$H^{1+\kappa_i+\lambda_i}N^{1+2\kappa_i+\lambda_i} \ll M^{2\kappa_i+\lambda_i}R^{2\lambda_i}(\log T)^{A_i}.$$
These inequalities give the second and third terms in the bounds for N. The first term comes from the requirement that $V \ge 1$ in the First Condition. The result (19.3.3) follows when we multiply (19.3.4) by $\Delta_1Q^2V$. We calculate similarly in the lower triangular case, using Lemma 3.1.1 to interchange the variables. □
Corollary 2 to Lemma 5.4.3 gives
$$\sum_{l} \rho(k(l) \pm \delta) \ll (M'N')^{1/3},$$
which gives (19.3.2) with i = 1, $\kappa_1 = \lambda_1 = \frac{1}{3}$, $A_1 = 0$. In this case (19.3.3)
becomes R
BV z A1A20304P2R2V+ 0102 «L 1L 2I /3P2R2V,
(Q
141
1/3
P2R2V
and we get (17.1.9) of Lemma 17.1.2, apart from the power of R/Q. To improve Lemmas 17.1.2 and 18.1.1, we must use bounds stronger than the corollaries to the truncated Poisson summation formula.
It is a nuisance to check the conditions of Chapters 7 and 8 in applying Lemma 19.3.1. The purpose of these conditions is to exclude unsuitable ranges of the parameters. In practice, a choice of N in Lemma 19.3.1 that leads to a good bound for exponential sums will fall into a suitable range. For this reason we have left unspecified the ranges for M in the iteration lemmas that follow.
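Lemma 19.3.2 (Iteration step for the simple sum) The choices of N in Lemma 19.3.1 give bounds for the sum S of Chapter 7:
$$S \ll M^{2/3}T^{1/12}\log T + M^{1/4}T^{1/4}\log T + \sum_i M^{(6+10\kappa_i+11\lambda_i)/(10+15\kappa_i+15\lambda_i)}\, T^{(2+3\kappa_i+2\lambda_i)/(20+30\kappa_i+30\lambda_i)} (\log T)^{1+A_i/(20+30\kappa_i+30\lambda_i)} + \sum_i M^{(9-5\kappa_i-\lambda_i)/20}\, T^{(3+3\kappa_i+\lambda_i)/20} (\log T)^{1+A_i/10}$$
for a range of M with $M \ll \sqrt{T}$, and
$$S \ll M^{1/3}T^{1/4}\log T + M^{3/4}\log T + \sum_i M^{(4+5\kappa_i+4\lambda_i)/(10+15\kappa_i+15\lambda_i)}\, T^{(4+8\kappa_i+9\lambda_i)/(20+30\kappa_i+30\lambda_i)} (\log T)^{1+A_i/(20+30\kappa_i+30\lambda_i)} + \sum_i M^{(11+5\kappa_i+\lambda_i)/20}\, T^{(1-\kappa_i)/10} (\log T)^{1+A_i/10}$$
for a range of M with $M \gg \sqrt{T}$.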
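Corollary A bound of the form (19.3.2) valid for a certain class of functions and a certain size range for (log N')/(log M') implies two other bounds of the same type, valid for related classes of curves and certain size ranges. In the first bound, the exponents $(\kappa_i, \lambda_i)$ are (1/13, 9/13), (1/5, 2/5), and two terms
$$\left(\frac{2+3\kappa+2\lambda}{22+33\kappa+32\lambda},\ \frac{14+23\kappa+24\lambda}{22+33\kappa+32\lambda}\right), \qquad \left(\frac{3+3\kappa+\lambda}{23+3\kappa+\lambda},\ \frac{12-2\kappa}{23+3\kappa+\lambda}\right)$$
corresponding to each term $M'^{\kappa}N'^{\lambda}$ in (19.3.2). In the second bound, the exponents $(\kappa_i, \lambda_i)$ are (1/5, 7/15), (0, 3/4), and two terms
$$\left(\frac{4+8\kappa+9\lambda}{24+38\kappa+39\lambda},\ \frac{12+18\kappa+17\lambda}{24+38\kappa+39\lambda}\right), \qquad \left(\frac{1-\kappa}{11-\kappa},\ \frac{13+3\kappa+\lambda}{22-2\kappa}\right)$$
corresponding to each term in (19.3.2).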
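The exponent pairs listed in the corollary can be checked against the bounds of Lemma 19.3.2 by exact rational arithmetic. The sketch below uses the observation (ours, not a statement of the proof) that every listed pair is obtained from a term $M^aT^b$ of the lemma by the rule $(\kappa', \lambda') = (b/(1+b),\ (a+b)/(1+b))$, and that the two forms of each general term agree when $M = T^{1/2}$. The function names are hypothetical.

```python
from fractions import Fraction as Fr

def pair_from_term(a, b):
    """Map a term M^a T^b to the pair (b/(1+b), (a+b)/(1+b))."""
    return (b / (1 + b), (a + b) / (1 + b))

def simple_sum_terms(kappa, lam, small_M=True):
    """Exponents (a, b) of the two general terms M^a T^b in Lemma 19.3.2.

    small_M=True is the range M << T^(1/2); False is M >> T^(1/2).
    """
    k, l = Fr(kappa), Fr(lam)
    d = 10 + 15 * k + 15 * l
    if small_M:
        return [((6 + 10 * k + 11 * l) / d, (2 + 3 * k + 2 * l) / (2 * d)),
                ((9 - 5 * k - l) / 20, (3 + 3 * k + l) / 20)]
    return [((4 + 5 * k + 4 * l) / d, (4 + 8 * k + 9 * l) / (2 * d)),
            ((11 + 5 * k + l) / 20, (1 - k) / 10)]

# Fixed terms of the lemma and the pairs they produce.
print(pair_from_term(Fr(2, 3), Fr(1, 12)))   # from M^(2/3) T^(1/12)
print(pair_from_term(Fr(1, 4), Fr(1, 4)))    # from M^(1/4) T^(1/4)
print(pair_from_term(Fr(1, 3), Fr(1, 4)))    # from M^(1/3) T^(1/4)
print(pair_from_term(Fr(3, 4), Fr(0)))       # from M^(3/4)
```

Applying `pair_from_term` to the output of `simple_sum_terms` reproduces the four general pairs of the corollary for any rational $\kappa$, $\lambda$.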
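Lemma 19.3.3 (Iteration step for the double sum) The choices of N in Lemma 19.3.1 give bounds for the sum S of Chapter 8:
$$S \ll HM^{1/3}T^{1/6}(\log T)^{9/4} + H^{2/3}T^{1/3}(\log T)^{9/4} + \sum_i H^{(4+9\kappa_i+9\lambda_i)/(4+8\kappa_i+8\lambda_i)}\, M^{(\kappa_i+2\lambda_i)/(2+4\kappa_i+4\lambda_i)}\, T^{(1+2\kappa_i+\lambda_i)/(4+8\kappa_i+8\lambda_i)} (\log T)^{9/4+A_i/(4+8\kappa_i+8\lambda_i)} + \sum_i H^{(12-\kappa_i-\lambda_i)/12}\, M^{-\kappa_i/2}\, T^{(3+4\kappa_i+\lambda_i)/12} (\log T)^{(9+A_i)/4}$$
for suitable ranges of H and M with $M \ll \sqrt{T}$, and
$$S \ll HM^{-1/3}T^{1/2}(\log T)^{9/4} + H^{2/3}M^{2/3}(\log T)^{9/4} + \sum_i H^{(4+9\kappa_i+9\lambda_i)/(4+8\kappa_i+8\lambda_i)}\, M^{-(2\kappa_i+3\lambda_i)/(2+4\kappa_i+4\lambda_i)}\, T^{(1+5\kappa_i+6\lambda_i)/(4+8\kappa_i+8\lambda_i)} (\log T)^{9/4+A_i/(4+8\kappa_i+8\lambda_i)} + \sum_i H^{(12-\kappa_i-\lambda_i)/12}\, M^{(4\kappa_i+\lambda_i)/6}\, T^{(1-\kappa_i)/4} (\log T)^{(9+A_i)/4}$$
for a range of H and M with $M \gg \sqrt{T}$.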
Corollary A bound of the form (19.3.2) valid for a certain class of functions and a certain size range for (log N')/(log M') implies another bound of the same type, valid for related classes of curves and certain size ranges. The exponents $(\kappa_i, \lambda_i)$ are (1/6, 1/2), (1/2, 0), and two terms
$$\left(\frac{1+2\kappa+\lambda}{4+9\kappa+9\lambda},\ \frac{1+5\kappa+6\lambda}{4+9\kappa+9\lambda}\right), \qquad \left(\frac{3+4\kappa+\lambda}{12-\kappa-\lambda},\ \frac{3-3\kappa}{12-\kappa-\lambda}\right)$$
corresponding to each term $M'^{\kappa}N'^{\lambda}$ in (19.3.2).
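The two maps of the corollary to Lemma 19.3.3 are simple enough to iterate by machine. The following sketch (function name hypothetical) implements them with exact fractions. Starting from the pair $\kappa = \lambda = 1/3$ supplied by Corollary 2 to Lemma 5.4.3, the second map gives (7/17, 3/17), and the first map applied to that pair gives (17/79, 35/79); both pairs occur among the entries of Table 19.1, though which rows they belong to is not asserted here.

```python
from fractions import Fraction as Fr

def double_sum_maps(kappa, lam):
    """Return the two new pairs produced from (kappa, lam) by the corollary."""
    k, l = Fr(kappa), Fr(lam)
    first = ((1 + 2 * k + l) / (4 + 9 * k + 9 * l),
             (1 + 5 * k + 6 * l) / (4 + 9 * k + 9 * l))
    second = ((3 + 4 * k + l) / (12 - k - l),
              (3 - 3 * k) / (12 - k - l))
    return first, second

start = (Fr(1, 3), Fr(1, 3))
step1 = double_sum_maps(*start)[1]      # second map
step2 = double_sum_maps(*step1)[0]      # first map
print(step1, step2)
```

Each new pair is the ratio of a T-exponent to the H-exponent of the corresponding term in Lemma 19.3.3: the first component uses the bound for $M \ll \sqrt{T}$ and the second the bound for $M \gg \sqrt{T}$.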
The proofs of Lemmas 19.3.2 and 19.3.3 follow those of Theorems 17.1.4
and 18.1.2. The corollary to Lemma 19.3.2 uses Lemma 18.5.2, and the corollary to Lemma 19.3.3 follows the proof of Theorem 18.2.1. The two cases
of Lemma 19.3.3 give the same exponents in (19.3.2) with M' and N' interchanged.
The new exponents from the iteration are not very sensitive to the values of K and A, and are useful only for a certain range of (log N')/(log M'). In practice this range is a short interval of the longer range in which the bounds are valid. We start the iteration from the bounds of Table 18.1, except that we use Theorem 18.2.1 in place of Theorem 18.2.2 and Lemma 18.5.2 for the
range $83/63 \le \alpha \le 643/466$. Moreover, we can use the analogue of Theorem 18.2.1 for a family of exponential sums, using t as the parameter. This is the only range in which we can use t as a parameter. The bounds lower down
Table 19.1 Bounds for integer points close to a curve Iterated
Steps
D DD DDDC
23
23
41
73
73
32 =
17
35
7
3
41
396348
79
79
17
17
32
309023
DCC
DC
69787
3118
6577
396348
49115058
146195
146195
14735
14735
309023
38213083 -
15603
40525
3143
8083
49115058
1590221
84191
84191
16837
16837
38213083
1229946 =
1251
3306
115
342
1590221
3799577
6829
6829
679
679
1229946
2859937 =
3118
6577
598
333
3799577
8366056
14735
14735
1589
1589
2859337
6185661 =
3143
8083
675
1571
8366056
269897
16837
16837
3363
3363
6185661
197842
342
4
43
269897
27
679
679
64
64
197842
19 =
598
333
29
62
27
1375774
1589
1589
140
140
19
939695 =
675
1571
4
43
1375774
44151
3365
3365
64
64
939695
29740 =
4
43
44151
19
64
64
29740
12
29
62
19
374
140
140
12
227
89
356
374
872
641
641
227
423 =
C
1.2812... 1 . 2826
...
1 .2853
...
1 .2929
...
1.32 88...
1
.3525
...
=1.3573.. .
115
CC
C
=
27548
DDCC
DDC
from
Exponents K, A
Range for (log N)/(log M) from to
1
.
4211 ...
1
.
4641 ...
1 . 4846
...
=1.5833...
=
1
.
6476 ...
2 0615 .
...
Table 18.1 already use Theorem 17.2.2, with a parameter from the differencing step or from the Fourier series for the row-of-teeth function. Tables 19.1 and 19.2 give some further bounds for integer points close to a curve and for exponential sums. In the first column D means the double sum, C the simple sum, and A the differencing step, so that DCC means Lemma 19.3.3 applied to the result of Lemma 19.3.2 applied to one of the bounds that uses Theorem 17.2.2, labelled C(1) in Table 18.1. As usual, rows bracketed together correspond to an upper bound with two or more terms. Estimates for exponential sums in other ranges can be obtained by the A (differencing)
Bounds for exponential sums
Table 19.2
Iterated
Steps
Exponent (3
C
from
Range for a = (log M)/(log T) from to
89 + 285a
106822
139817
570
246639
246639
= 0.5669...
2387 + 17972 a
115
342
1033325
106822
27290
679
679
2642746
246639 =
2819 + 19177«
598
CDC
0.4331...
333
699371
1033325
2642746 =
29855
1589
1589
1647930
11897 + 88442 a
675
1571
156527
699371
134680
3363
3363
370694
1647930
CCC
= 0.4244...
113 + 897a
4
43
263
156527
1345
64
64
638
370694
CC
CC
0.4288...
= 0.4222...
491 + 3624«
29
62
143
263
5530
140
140
349
638 -
569 + 1053a
29
62
307
143
2800
140
140
641
349
1273 + 2484«
89
356
68682
307
6410
641
641
171139
761
AC 1
0.4122...
= 0.4097.. . = 0.4034.. .
4 + 103a
12
68682
128
31
171139 =
29 + 173a
227
12
280
601
31 =
0.4011...
0.3871.. .
or B (reflection) steps as in Table 17.1. We get some improvement near the values of $\alpha$ for which the steps in Table 17.1 change from $A^rC$ to $A^{r+1}C$. For most values of $\alpha$, the saving obtained from the iteration is less than the saving from using Theorem 17.2.2 instead of Theorem 17.1.4 after a sequence
of differencing steps. The iteration works best for exponential sums with a = (log M)/(log T) near 5/12, where the third derivative is still small, but the second derivative has order of magnitude larger than one. This range of a provides the title for this section.
20 Sums with modular form coefficients
20.1 EXPONENTIAL SUMS
We obtain the main results of Jutila (1987a et seq.) for the sum
$$S = \sum_{m=M}^{M_2} b(m)e(f(m)) \qquad (20.1.1)$$
of Chapter 10, where E b(m)e(mz) is a cusp form of even weight k for the full modular group. Lemma 20.1.1 (Choosing parameter sizes) Suppose that the conditions of Lemma 14.3.1 hold. Then, in the notation of Lemma 10.5.1, we have R 13/2
B(01(V ), 02(V ), 03(V ), 04(V )) << -
P2Q2,
Q1
V
where V is summed through positive powers of 2 satisfying
V 5 L1 /(LM)1"4,
(20.1.2)
for a choice of N and R in the ranges
$$N \asymp MT^{-1/3}, \qquad R \asymp M^{1/2}T^{-1/3}. \qquad (20.1.3)$$
The implied constants are constructed from the derivatives of the function F(x).
Proof First we note that
0(V)<< 1
1
I1
M LQ2)
1
12
1
+ <<7 +PQ.
Type 3 matrices have B and C non-zero, and B << O1 P2, C << L 1 Q2. If AI(V) << 1/PQ, and type 3 matrices occur, then P x Q and B,C are bounded. We enlarge the definition of type 1 matrices to cover this case. Hence we can suppose that 01(V) « 1/V2 when type 3 matrices occur. Next
11
Q. 0z « 01(V) V!9M « 172+_
Type 2 matrices have B «O2P/Q, C z i 2Q/P. We can suppose similarly that 02(V) << 1/V2 when type 2 matrices occur. By Lemma 14.3.5, the sum is
K F, VI PQ+ y2 (P2 +Q2)+ VQ P2Q2) <
`
If
We have L3Q2 1/2 M
V2 N2R4 P2Q2 << M2Q4
R3
N2
« Q MR2 '
)
0
which motivates the choice (20.1.3)`.
Theorem 20.1.2 Let F(x) be a real function four times continuously differentiable on the interval $1 \le x \le 2$, whose derivatives satisfy the following conditions
$$|F^{(r)}(x)| \le C_1 \qquad (20.1.4)$$
for r = 2, 3, 4,
$$|F^{(r)}(x)| \ge 1/C_1 \qquad (20.1.5)$$
for r = 2, where $C_1$ is a positive constant. Let b(m) be the coefficients of a modular form of even weight k for the modular group, and let
$$S = \sum_{m=M}^{M_2} b(m)e\left(TF\left(\frac{m}{M}\right)\right),$$
where M and $M_2$ are positive integers, T is a large real number, and $M < M_2 < 2M$. Suppose that, for some $C_2 \ge 1$, either Case 1 or Case 2 holds:
Case 1 $M \le C_2T$;
Case 2 $M \ge C_2^{-1}T$ and (20.1.4) and (20.1.5) hold for r = 1 also and $|F'(x) + 2xF''(x)| \ge C_3$ for some positive constant $C_3$.
Then there is a constant $C_4$ constructed from $C_1, \ldots, C_3$ such that, for
$$C_4T^{2/3} \le M \le C_4^{-1}T^{4/3}, \qquad (20.1.6)$$
we have
$$S \ll BM^{k/2}T^{1/3}(\log M)^{1/2} + BM^{(k+1)/2}T^{-1/6}(\log M)^{1/2}, \qquad (20.1.7)$$
where B is the constant of Lemma 10.2.1. The implied constant is constructed from $C_1, \ldots, C_4$.
Remarks The second term in (20.1.7) appears when we pass from a smoothed sum in Lemma 10.3.1 (with $N \asymp MT^{-1/3}$) to the sum (20.1.1). This term may
be reduced by appealing to a stronger bound than Lemma 10.2.1, such as the
full result of Rankin (1939) or the results of Moreno and Shahidi (1983) (Lemma 10.3.3) or Deligne (1974) (Lemma 10.3.2).
Proof The conditions on F(x) are those of Lemma 14.3.1. In Case 1 we can subtract a linear term from F(x) which changes TF(m/M) by an integer, and so does not affect the sum S. Having done this, we suppose that (20.1.4) holds for r = 1 in both cases. In Lemma 10.5.2 the number of rationals satisfies J(Q) x PQ, and Lemma 20.1.1 bounds the block sum by 2
min(2L-1,K) (Q)
E 1=L
F, b(l)hj(l)e(gj(l)) i
(R
L3
5/2
«B2Mk 1NRI Q) P2Q2
Mk+1Q
L3
1/4
(Q )
log M
lMR2)1/4
log M.
N
Summing over the blocks L:!-. 1 < 2L, Q5 qj < 2Q, we have in Lemma 10.4.1
E E wj(m)b(m)e(f(m)) j
m M2R2
M314
(NR)
1/2
<< BMk12T1/6(log
+ N3 +
M 1/6
(-)
(log M)1/2
M)112,
by the choice (20.1.3) of $N$ and $R$. The lower bound in (20.1.6) ensures that $R \ge 1$, and the upper bound ensures (10.3.5). This gives the first term in (20.1.7) of the theorem. The second term in (20.1.7) is the smoothing error in Lemma 10.3.1.
When we consider a family of sums, then we do not use Lemma 14.4.3 because the assumption 02 PQ >> 1 may fail. We consider types of magic matrices separately. We take M << FT, so at most a bounded number of lower triangular matrices occur. We reclassify them as type 1. We define a set of Farey arcs to be a transversal set if, for any pair of indices i, j,
a')'M(a,) q1 qj for any type 1 matrix M except the identity. The Farey arcs can be divided into a bounded number of transversal sets.
Lemma 20.1.3 (Parameter sizes for a family of sums) Suppose that the conditions of Lemmas 14.3.1, 14.4.1, and 14.4.2 hold, and that the sum over j in Lemma 10.5.2 is restricted to a transversal set of Farey arcs for the sums of a family. Then the bound (10.5.32) of Lemma 10.5.3 is
JR2 (LM)V4
O
I2P2Q2 log IPQ
(20.1.8)
QZ
for (20.1.9)
T2/3 << M<< T,
and suitable choices of N and R in the following ranges:
R 1-1/3M112T-1/3.
Nx1213MT-1/3,
(20.1.10)
for T1/3 12/3
J2 T »IZT+M
(20.1.11)
and
Nx (J/I)-2 MT,
R 4 JrM- /IT
(20.1.12)
for J2
T
T 1/3
+
T1/3 >> 12T » 12-3
M
(20.1.13)
1
(20.1.14)
and
N x M2/T,
R
for J2 T 1/3 T + 12T . M >> 12/3
(20.1.15)
The implied constants are constructed from the derivatives of the function F(x), and from the implied constants in the ranges (20.1.9), (20.1.11), (20.1.13), and (20.1.15).
Proof Since the Farey arc indices i and j are limited to the same transversal set, only coincidences given by type 3 and upper triangular type 2 matrices
are counted in B'(O 1, ... A4). As in Lemma 20.1.1, type 3 matrices contribute
12p2Q2 LQ2 «01(01 + A2)I2P2Q2 <<
V4
By Lemma 14.4.2, an upper triangular matrix I o
M
.
i) gives coincidences
only when the first derivative a/q lies in a `short interval of length
O(O2P2/I bI Q2), which contains at most O(i,2P2/I bl) + 1 rationals by Lemma 1.2.3. There are two cases. As in Lemma 14.2.2, triangular matrices with
L Pz LQ2 1 5 Ibl 5 O2P2 5'117 M +A1 M
(for some constant A1) contribute
« O2I2P2 log PQ. Since Ibi z 1 and L5 18M/R2, if R is sufficiently large, then R Z 6 A1, and
P2 LQ2 A252A1Vz M , so this case contributes
12P2 LQ 2
« -F F M
log PQ.
The second case is that of triangular matrices with Ibl z 02P2, which give a bounded number of coincidences each. We use the conditions involving i3 and 04, with
03(V) «
1
(LQ2)31'4 1 M + PQ
V!L
1
PQ
LQ z
M
provided that N2>> MR2, and z
P
04(V) X VT
M LQ2
+ PQ
( M) >> i,3(V).
To avoid double suffices, we write yi for the value of the parameter on the Farey arc with index i. We have U.
-=1+O(03), ui and
f11(u3,Yi)
= 1 +O(D4).
f1i(uj ,Yy)
Since F12 and F112 are supposed bounded and non-zero, f11(u?,Yi)
f11(ui3, Yj)
1 XIYi -y,IX
f1(ui,Yi) z fl(u; ,Y;)
-1
Now
f1(ul,Y1) =fl(u),Yj) +b, so that bQ
i,Yi) P - fl(u),yj) .f1(uz
-1 «lyi-Yjl+L3
<< A4 + 03 << 04
and we have
b «A 4
P
2
Q2
l
2
M 1« V2
Q2
QV2
vM
we can drop the second term since b is a non-zero integer. This case contributes
of
I2P2
VT
v2
Hence
B'(A1(V).... ,04(V)) «
I2V2Q2 2
IogPQ (
M2 )
and in Lemma 10.5.3
E v
r
(LM)114V
B'(A1(V ), ..., A4(V )) +J(Q)L + Q2
<< IPQL +IJPQ log l (
QJ(q)log I min lyi -yjl
+I2P2Q2 log PQ )
(
Q2)
.
(20.1.16)
In the first two cases of the lemma, the third term on the right of (20.1.16) has the same order of magnitude as one of the other two terms in the extreme ranges $Q \asymp R$, $L \asymp M/R^2$.

Theorem 20.1.4 Let $F(x, y)$ be a real function defined for $1 \le x \le 2$, $0 \le y \le 1$, for which the partial derivatives are continuous and satisfy

$$|\partial_1^r \partial_2^s F(x, y)| \le C_1$$

for $r \le 4$, $s \le 2$, $r + s \le 4$, and

$$1/C_1 \le |\partial_1^r F(x, y)| \tag{20.1.17}$$

for $r = 1, 2$, and

$$|\partial_1^r \partial_2 F(x, y)| \ge C_2 \tag{20.1.18}$$

for $r = 1, 2$, where $C_1$ and $C_2$ are positive constants. Let $y_1, \ldots, y_I$ be points in $[0,1]$ with $y_{i+1} - y_i \ge 1/J$. Let $G_i(x)$ be bounded functions of bounded variation on $[0,1]$ (uniformly in $i$). Let $S_i$ denote the sum analogous to (20.1.1):

$$S_i = \sum_{m=M}^{2M-1} b(m)\, G_i\!\left(\frac{m}{M}\right) e\!\left(T F\!\left(\frac{m}{M}, y_i\right)\right),$$

where $T$ is a positive real number, and $M$ is a positive integer with

$$T^{2/3} \ll M \ll T. \tag{20.1.19}$$
Then we have
I ISi1 <<
BIMk/2T1/4(
T1/3
IZT
i=1
T "3 +AIMtk+ 1>/2
J2
I2/3
+
+
J2
T
-
1/4
T M)
log IM -1
log M,
I ZT
where A is the sum of the moduli of the coefficients when the modular form E b(m)e(mz) is expressed as a linear combination of newforms, and B is the Rankin constant of Lemma 10.2.1. The terms in A can be omitted if the functions G,(x), when defined as zero for x < 1 and x >_ 2, have G;:> continuous with
IG;S)(x)I <(CON/M)5
(20.1.20)
for i = 0, ... 4, for N as chosen in Lemma 20.1.3, and some constant Co. The implied constant is constructed from C0,. .. , C4, from the bounds for the functions G,(x), and from the implied constants in the ranges for M. Proof We modify the proof of Theorem 20.1.2, using Lemma 20.1.3 in place
of Lemma 20.1.1 on the smoothing error, and Lemma 10.3.2 in place of Lemma 10.3.1. The conditions on F(x) are those of Lemmas 14.3.1, 14.3.5, and 14.4.1.
O
The smoothed version of Theorem 20.1.4 can be generalized beyond classical cusp forms with a congruence level. There is a similar result for FT << M << T413, but the second and third terms are bigger, because the conditions involving 03 and 04 do not apply.
20.2 MEAN VALUE THEOREMS

We use the notation of Section 10.6, so $S_i^{(Q)}(t)$ is a smoothed sum over a transversal set.
Lemma 20.2.1 (Parameter sizes for the mean square) Suppose that the
conditions of Lemmas 14.3.1, 14.4.1, and 14.4.2 hold, and that S;Qky) in Lemma 10.6.5 is formed with M << T and
T =0+yi)T, where y 1, ... , y, are points in 0- 11J, and with weight functions gi(x) = Gi(x/M) satisfying (20.1.20) for i = 0,..., 4. Then, for
U x MQ/NR,
U >> T/J,
(20.2.1)
we have fu
UIsIQ)(t)I2 dt <
(20.2.2)
for a choice of N and R in the ranges (20.1.10), provided that
M>> (IT)2"3.
(20.2.3)
For a single sum, the logarithm power can be reduced to log4 M.
Corollary For a single smoothed sum we have f U IS(t)I2 dt << B2MkT2/3 log6 M -U for
UxT2/
M>> T2/3.
Proof We must estimate Bo(01,..., L4) in Lemma 10.6.5, the number of solutions within a fixed transversal set involving a fixed minor arc. As in Lemma 14.4.4, type 3 matrices contribute
<< t1Imin(P2,Q2)+A102IPQ<< i1IPQ, and, as in Lemma 20.1.1, we can take A1(V) << 11V2. When the parameter takes the same value on both Farey arcs, then there is
no contribution from upper triangular matrices. If we fix the parameter on the variable Farey arc, then a bounded number of upper triangular matrices occur, satisfying
pi
b«V2
F1 LQ 1 M 1
as in Lemma 20.1.3. Hence
Bo(i1(V),..., 04(V)) << IPQ/V2 +I, and the second term can be omitted if V>>P(LQ2/M)114.
Hence
VBo(O1(V),..., 04(V)) << IPQ +IP(LQ2/M)1'4 << IPQ. The expression in Lemma 10.6.5 is
B2Lklog2L L+
(L
1/4
IPQ+JIogI
We sum over blocks for L in Lemma 10.6.2 to get f UIS,Q>(t)12 dt
<<
M B2Mk(R2
IM3/2 Q 3/2
(R)
+ NR
Q
+ RJlogIlogoM.
The term in J may be omitted if I = 1. The Lemma follows at once, and the corollary follows since IS(t)12 << log R L IS,Q>(t)12 Q
in Section 10.6, the sum being over all transversal sets for each power of
o
2, Q.
Our next lemma is valid for exponential sums with arbitrary coefficients.

Lemma 20.2.2 (Large values of exponential sums) Let $F(x)$ be a real function, twice continuously differentiable for $1 \le x \le 2$, with $F'(x)$ and $F''(x)$ bounded away from zero. Let

$$S(t) = \sum_{m=M}^{2M-1} g(m)\, e\!\left(t F\!\left(\frac{m}{M}\right)\right),$$

and define $G$ by

$$G^2 M = \sum_{m=M}^{2M-1} |g(m)|^2.$$

Suppose that $t(1), \ldots, t(N)$ is a sequence of real numbers that satisfy

$$|S(t(n))| \ge G W \sqrt{M}$$

for all $n$, and

$$t(n+1) - t(n) \ge 1, \qquad t(N) \le t(1) + T.$$

Then

$$N \ll \frac{M \log M}{W^2} + \min\left(\frac{T \log M}{W^2},\ \frac{MT \log M}{W^6}\right).$$

The implied constants depend on the bounds for the derivatives of $F(x)$.
Proof We use Lemma 5.6.1 with $M_1 = M$, $M_2 = 2M - 1$, $N_1(m) = 1$, $N_2(m) = N$, $f(m, n) = t(n)F(m/M)$, and $h(m, n) = \eta(n)$, a complex number of unit modulus independent of $m$, chosen so that $\eta(n)S(t(n))$ is real and positive. The inner sum in Lemma 5.6.1 is

$$\sum_{m=M}^{2M-1} e\big(f(m, n_1) - f(m, n_2)\big) = \sum_{m=M}^{2M-1} e\!\left(\big(t(n_1) - t(n_2)\big) F\!\left(\frac{m}{M}\right)\right).$$

This sum is $M$ if $n_1 = n_2$. For $n_1 \ne n_2$, but $n_1 - n_2 \ll M$, the trapezium rule (Lemma 5.4.1) gives

$$\sum_{m} e\!\left(\big(t(n_1) - t(n_2)\big) F\!\left(\frac{m}{M}\right)\right) \ll 1 + \frac{M}{|t(n_1) - t(n_2)|} \ll \frac{M}{|n_1 - n_2|}.$$

For $|n_1 - n_2| \gg M$, Corollary 1 to Lemma 5.4.3 gives

$$\sum_{m} e\!\left(\big(t(n_1) - t(n_2)\big) F\!\left(\frac{m}{M}\right)\right) \ll \frac{M}{\sqrt{|t(n_1) - t(n_2)|}} + \sqrt{|t(n_1) - t(n_2)|} \ll \sqrt{M} + \sqrt{T}.$$

The double sum on the left of Lemma 5.6.1 is real and at least $GNW\sqrt{M}$, so Lemma 5.6.1 gives

$$(GNW\sqrt{M})^2 \ll G^2 M \left( \sum_{n} M + \sum_{n_1} \sum_{n_2 \ne n_1} \frac{M}{|n_1 - n_2|} + \sum_{n_1} \sum_{n_2} \sqrt{T} \right) \ll G^2 M \left(MN \log M + N^2 \sqrt{T}\right).$$

The term in $N^2$ can be omitted if $T < M$. Hence

$$N W^2 \le C\left(M \log M + N\sqrt{T}\right),$$

where $C$ is some constant constructed from the bounds for the derivatives of $F(x)$. If $T < M$, or if $T \le T_0 = W^4/4C^2$, then

$$N \le (2CM \log M)/W^2.$$

If not, then we put $T_1 = \max(M, T_0)$, and we divide the sequence $t(n)$ into at most $2T/T_1$ subsequences, in each of which $t(n) \le t(1) + T_1$. □

Lemma 20.2.2 was proved by Huxley (1972) for Dirichlet series, using the approximate functional equation for $\zeta(s)$ (Lemma 5.5.4). This simple proof, replacing the approximate functional equation by Poisson summation, was suggested by S. W. Graham; the more elaborate result in Huxley (1975), in which the sum is over Dirichlet characters as well as points $t(n)$, still seems to require special properties of Dirichlet series. The lemma turns on the Halász coefficients $\eta(n)$. The full Halász construction would be

$$h(m, n) = \eta(n)\, w(m),$$
where $\eta(n)$ has modulus unity, and $w(m)$ is a non-negative weight function which is one for $M \le m < 2M$. If $F(x)$ can be defined suitably for $x < 1$ and for $x \ge 2$, then we can use the weighted First Derivative Test and remove the logarithm factor.

Theorem 20.2.3 Let $F(x)$ be a real function four times continuously differentiable on the interval $1 \le x \le 2$, whose derivatives satisfy the following conditions:

$$|F^{(r)}(x)| \le C_1$$

for $r = 1, \ldots, 4$,

$$|F^{(r)}(x)| \ge 1/C_1$$

for $r = 1, 2$, where $C_1$ is a positive constant. Let $G(x)$ be four times continuously differentiable for all $x$, and zero outside the range $1 \le x \le 2$. Let $b(m)$ be the coefficients of a cusp form of even weight $k$ for the modular group, and let

$$S(t) = \sum_{m=M}^{2M-1} b(m)\, G\!\left(\frac{m}{M}\right) e\!\left(t F\!\left(\frac{m}{M}\right)\right),$$

where $M$ is a positive integer. Let $C_2$ be a positive constant, and let $T$ be a real number with $M \le C_2 T$. Then

$$\int_T^{2T} |S(t)|^6\, dt \ll$$
where B is the Rankin constant of Lemma 10.2.1. The implied constant is constructed from C1 and C2, and from the bounds for the derivatives of the weight function G(x).
Sketch proof We divide the range of integration into unit intervals, and for each power of 2, $V$, let $I(V)$ be the number of unit intervals on which

$$BM^{k/2}V \le \max |S(t)| \le 2BM^{k/2}V. \tag{20.2.4}$$

Then

$$\int_T^{2T} |S(t)|^6\, dt \le \sum_V I(V)(2V)^6 B^6 M^{3k}. \tag{20.2.5}$$

We apply Lemma 20.2.2 with $g(m) = b(m)G(m/M)$, so that $G^2 M \ll B^2 M^k$, and we form the sequence of points $t(n)$ by picking the values of $t$ at which $|S(t)|$ is maximum from alternate intervals satisfying (20.2.4). This ensures the condition $t(n+1) \ge t(n) + 1$ of Lemma 20.2.2, with $W \gg V$. For each power of 2, $V$, we have

$$I(V) \ll \frac{M \log M}{V^2} + \frac{MT \log M}{V^6}. \tag{20.2.6}$$

If the second term in (20.2.6) dominates, then

$$I(V)V^6 \ll MT \log M \ll T^2 \log M. \tag{20.2.7}$$

If the first term in (20.2.6) dominates, and if

$$I(V) \gg M^{3/2}/T, \tag{20.2.8}$$

then

$$I(V)V^6 \ll \frac{T^2}{M^3} \big(I(V)V^2\big)^3 \ll T^2 \log^3 M. \tag{20.2.9}$$

The difficult case is when

$$I = I(V) \ll M^{3/2}/T.$$

Using the mean-to-max argument of Lemma 5.6.3, we have

$$\max_{u \le t \le u+1} |S(t)|^2 \ll \max_{-1 \le \sigma \le 1} \int_{u-2}^{u+3} |S(\sigma, t)|^2\, dt,$$
where the sum $S(\sigma, t)$ has an extra factor $\exp(2\pi\sigma F(m/M))$ in each term. We can choose integers $R$ and $N$ by (20.1.10), and apply Lemma 20.2.1 with $U \ge 1$. We write $S(\sigma, t)$ as a sum of $O(\log R)$ sums $S_i^{(Q)}(\sigma, t)$ corresponding to fractions $a_j/q_j$ with $Q \le q_j < 2Q$, with the Diophantine approximation taken for the value of $t$ at the centre of the unit interval. The conditions on $G(x)$ ensure that we can write

$$G(x/M)\exp(2\pi\sigma F(x/M)) = w_j(x)\, g(x),$$

with $g(x)$ of bounded variation. There are two complications, one obvious, one less so. The intervals in (20.2.2) are essentially non-overlapping by (20.2.1), so that, if possible, we must cover several unit intervals counted in $I(V)$ with one interval $t_0 - U \le t \le t_0 + U$. However, the definition of $w_j(x)$ depends on the unit interval, in that $N$ changes by $O(QR)$ as $t$ changes by $U = O(MQ/NR)$. We can overcome this by partial summation, at the cost of a factor $\log M$. So Lemma 20.2.1 leads to the bound

$$IV^2 \ll (IT)^{2/3} \log^9 M,$$

or

$$I \ll T^2 V^{-6} \log^{27} M, \tag{20.2.10}$$

with one logarithm power extra for each factor $S(t)$ from the dissection into sums $S^{(Q)}$, and another from the partial summation. We use (20.2.7), (20.2.9), and (20.2.10) for $O(\log M)$ values of $V$, and $I(V) \ll T$ when $V$ is small.
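The dyadic bookkeeping behind (20.2.4) and (20.2.5) can be tried on a toy list of numbers. The sketch below is only an illustration under stated assumptions: the function names and the sample data are hypothetical, the values stand for the maxima of $|S(t)|$ on unit intervals after division by $BM^{k/2}$, and the half-open range $V \le x < 2V$ is used so that each value falls in exactly one class.

```python
import math
from collections import Counter


def dyadic_counts(values):
    """Return I(V): how many values x satisfy V <= x < 2V, for each power of two V."""
    counts = Counter()
    for x in values:
        if x > 0:
            v = 2.0 ** math.floor(math.log2(x))
            counts[v] += 1
    return counts


def sixth_moment_bounds(values):
    """Lower and upper bounds for sum(x**6) read off from the dyadic counts."""
    counts = dyadic_counts(values)
    lower = sum(i_v * v ** 6 for v, i_v in counts.items())
    upper = sum(i_v * (2 * v) ** 6 for v, i_v in counts.items())
    return lower, upper


# Hypothetical normalised maxima, one per unit interval.
maxima = [0.3, 1.0, 1.7, 2.5, 3.9, 4.0, 7.2, 0.9]
counts = dyadic_counts(maxima)
exact = sum(x ** 6 for x in maxima)
lower, upper = sixth_moment_bounds(maxima)
print(sorted(counts.items()))
print(lower <= exact <= upper)
```

The upper bound is the analogue of the right-hand side of (20.2.5); the argument in the text then bounds each product $I(V)V^6$ separately, so only $O(\log M)$ classes need to be considered.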
The third form of Jutila's method leads to a mean square result (Jutila 1989, 1990b).

Lemma 20.2.4 (Short intervals mean square) Suppose that the conditions of Theorem 20.2.3 hold. Let $U, t_1, \ldots, t_I$ be real numbers with $U \le t_i < T - U$, $t_{i+1} - t_i > 2U$. For any $\epsilon > 0$, however small, if

$$T^{2/3+\epsilon} \ll M \ll T, \qquad T^{1+\epsilon}/M^{1/2} \ll U \ll T^{2/3},$$

then

$$\sum_{i=1}^{I} \int_{t_i - U}^{t_i + U} |S(t)|^2\, dt \ll B_2 M^k T^{\epsilon} \big(IU + (IT)^{2/3}\big),$$

where $B_2$ is the Moreno-Shahidi constant of Lemma 10.3.3, and the implied constant depends also on $\epsilon$.

Both Theorem 20.2.3 and Lemma 20.2.4 imply slightly weaker forms of Theorem 20.1.2 by the mean-to-max argument of Lemma 5.6.3.
20.3 THE MODULAR FORM L-FUNCTION

Exponential sums with modular form coefficients appear in the modular form L-function

$$\varphi(s) = \sum_{m=1}^{\infty} \frac{b(m)}{m^{s+(k-1)/2}}, \tag{20.3.1}$$

which has been much studied lately. It is a Fourier transform of the modular form $F(z) = \sum b(m)e(mz)$ along the line from $0$ to $i\infty$. We hope that, whenever $F(z)$ corresponds to some algebraic or geometric object, then certain special values of $\varphi(s)$ will correspond not only to integrals involving the modular form, but to integrals on the geometric object in its intrinsic metric. Different structures can give the same L-function. The derivatives of modular forms can be used to construct new modular forms. We have normalized (20.3.1) so that $F(z)$ and $F'(z)$ have the same L-function, up to a factor $2\pi i$. Changing the sampling line $0$ to $i\infty$ corresponds to twisting the modular form (Atkin and Lehner 1970). When $F(z)$ is a newform, the simplest of a family of twisted forms, then $\varphi(s)$ factorizes as an Euler product of Dirichlet series, with each factor a sum over those $m$ which are powers of some fixed prime. From its construction $\varphi(s)$ has a functional equation $\varphi^*(s) = \varphi^*(1 - s)$, where

$$\varphi^*(s) = (2\pi)^{-s}\, \Gamma\big(s + (k-1)/2\big)\, \varphi(s). \tag{20.3.2}$$

The corresponding Dirichlet series constructed from non-holomorphic modular forms include $\zeta^2(s)$, the square of the Riemann zeta function. It makes sense to ask whether newform L-functions $\varphi(s)$ are non-zero in the region
$\operatorname{Re} s > \tfrac12$ and of size $O(t^{\epsilon})$ for $s = \sigma + it$, with $\sigma \ge \tfrac12$ fixed, as $t \to \infty$. Even a bound of the form $t^{A(\sigma)}$ would enable us to deduce the Wilton formula from (20.3.2). However, to bound $\varphi(s)$ we need an approximate functional equation, obtained by the Wilton formula and analytic continuation (see Jutila 1985).
Lemma 20.3.1 (Approximate functional equation) For $0 < \sigma < 1$, $t \ge 2$, and $x$, $y$ chosen with

$$xy = t^2/4\pi^2,$$

we have

$$\varphi(s) = \sum_{m \le x} \frac{b(m)}{m^{(k-1)/2+s}} + G(s) \sum_{l \le y} \frac{b(l)}{l^{(k+1)/2-s}} + O\big(B t^{1/2-\sigma} \log t\big),$$

where

$$G(s) = (2\pi)^{2s-1}\, \frac{\Gamma\big((k+1)/2 - s\big)}{\Gamma\big((k-1)/2 + s\big)},$$

and $B$ is the constant of Lemma 10.2.1. □
We deduce bounds for the modular form L-function.

Theorem 20.3.2 For $T$ large

$$\varphi(\tfrac12 + iT) = O\big(B T^{1/3} (\log T)^{3/2}\big),$$

where $B$ is the constant of Lemma 10.2.1.
Proof We split the sums over $l$ and $m$ in the approximate functional equation into blocks of the form $M \le m \le \min(2M - 1, x)$. For $M \ge C_4 T^{2/3}$ we use Lemma 5.1.1 to take out the factor $1/m^{k/2}$, and then use Theorem 20.1.2. For small $m$ we use Lemma 10.2.1. The factor $G(\tfrac12 + it)$ is the quotient of two complex conjugate numbers, and so it has unit modulus. □
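The last step of the proof, that $G(\tfrac12 + it)$ has unit modulus, can be checked numerically. The following is a minimal sketch, not part of the original argument: the helper `cgamma` is a standard Lanczos approximation to the gamma function written out here because the Python standard library has no complex gamma function, and the weight $k = 12$ is an arbitrary choice made for illustration.

```python
import cmath
import math

# Lanczos coefficients (g = 7, nine terms); a standard published choice.
_LANCZOS_G = 7
_LANCZOS_COEF = [
    0.99999999999980993, 676.5203681218851, -1259.1392167224028,
    771.32342877765313, -176.61502916214059, 12.507343278686905,
    -0.13857109526572012, 9.9843695780195716e-6, 1.5056327351493116e-7,
]


def cgamma(z):
    """Lanczos approximation to Gamma(z) for complex z with Re z > 1/2."""
    z = complex(z) - 1
    x = _LANCZOS_COEF[0]
    for i in range(1, len(_LANCZOS_COEF)):
        x += _LANCZOS_COEF[i] / (z + i)
    t = z + _LANCZOS_G + 0.5
    return math.sqrt(2 * math.pi) * t ** (z + 0.5) * cmath.exp(-t) * x


def G(s, k):
    """The factor G(s) of Lemma 20.3.1 for a form of weight k."""
    return ((2 * math.pi) ** (2 * s - 1)
            * cgamma((k + 1) / 2 - s) / cgamma((k - 1) / 2 + s))


# On the critical line the two gamma arguments are complex conjugates.
for t in (1.0, 5.0, 20.0):
    print(t, abs(G(0.5 + 1j * t, 12)))
```

On the line $\sigma = \tfrac12$ the arguments $(k+1)/2 - s$ and $(k-1)/2 + s$ are $k/2 - it$ and $k/2 + it$, and the power of $2\pi$ is purely imaginary, which is why each printed modulus should agree with 1 up to rounding.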
Theorem 20.3.2 was proved by Good (1982), with the same exponent $\tfrac13$, but a better logarithm power, using a Fourier expansion into non-holomorphic modular forms. Jutila's triumph (1987a) was the sixth-power moment.

Theorem 20.3.3 For $T$ large

$$\int_T^{2T} |\varphi(\tfrac12 + it)|^6\, dt = O\big(B^6 T^2 \log^{34} T\big), \tag{20.3.3}$$

where $B$ is the Rankin constant of Lemma 10.2.1. □

Remark The same bound holds for the integral over $-T$ to $T$; we fit together ranges on which $|\tfrac12 + it|$ has fixed order of magnitude.

Theorem 20.3.3 is deduced from Theorem 20.2.3. The sums in the approximate functional equation are smoothed by averaging over $x$ in Lemma 20.3.1, and then subdivided into $O(\log T)$ smoothed sums over ranges of the type $M$ to $2M - 1$. We use Hölder's inequality to sum over ranges for $m$. Similarly we deduce a mean square result from Lemma 20.2.4 (Jutila 1990a).

Theorem 20.3.4 For any $\epsilon > 0$ and $T$ sufficiently large we have

$$\int_{T - T^{2/3}}^{T + T^{2/3}} |\varphi(\tfrac12 + it)|^2\, dt = O\big(B_2 T^{2/3+\epsilon}\big),$$

where $B_2$ is the Moreno-Shahidi constant of Lemma 10.3.3, and the implied constant depends on $\epsilon$.
21

Applications to the Riemann zeta function

21.1 INTRODUCTION

The Riemann zeta function, $\zeta(s) = \sum 1/n^s$, is a generating function for the positive integers. It has a single pole as a function of $s$, at $s = 1$ where the series first diverges, because it is the generating function of an infinite set of positive coefficients. It has a functional equation (stated in Section 5.5 of Chapter 5) that relates its values at $s$ and $1 - s$; this, in a sense, is equivalent to the Poisson summation formula (Lemma 5.4.2). What makes it extraordinary is that $\log \zeta(s)$ is a generating function for the prime numbers. Euler's product over primes

$$\prod_p \frac{1}{1 - 1/p^s}$$

expands out as a sum of $1/n^s$, where each integer $n$ occurs as many times as it can be written as a product of primes, that is, exactly once. Euler's product is simply $\zeta(s)$. Taking logarithms and summing the geometric progressions, we have

$$\log \zeta(s) = \sum_p \log \frac{1}{1 - p^{-s}} = \sum_p \sum_{r=1}^{\infty} \frac{1}{r p^{rs}};$$

we use the convention that $p$ is a variable running through the prime numbers. Differentiation gives the meromorphic function

$$\frac{\zeta'(s)}{\zeta(s)} = -\sum_p \sum_{r=1}^{\infty} \frac{\log p}{p^{rs}}. \tag{21.1.1}$$

The right-hand side of (21.1.1) is usually written as $-\sum \Lambda(n)/n^s$, where $\Lambda(n)$ is $\log p$ when $n$ is a prime power $p^r$ (with $r \ge 1$), zero otherwise ($\Lambda$ was chosen by von Mangoldt as the initial of 'logarithm'; a clash of notation with the kernel function $\Lambda(x)$ of Chapter 5 seldom occurs). The coefficients $\Lambda(n)$ can be recovered using the Mellin transform integral ('Perron's formula'), of which Lemma 5.2.4 is a truncated version. Riemann conjectured, and von Mangoldt proved, that the integral is equal to the sum of the residues; both the integral and the sum are only conditionally convergent. The pole of $\zeta(s)$ at $s = 1$ gives the main term, telling us that $\Lambda(n)$ averages to one, so prime numbers of size $N$ are about $\log N$ apart on average, and the $k$th prime number is about $k \log k$: the prime number theorem of Hadamard and de la Vallée Poussin. Points where $\zeta(s)$ is zero also give poles of $\zeta'(s)/\zeta(s)$, and there is an extra pole from the factor $1/s$ in Lemma 5.2.4. The task of Hadamard and de la Vallée Poussin was to show that these zeros had real parts less than one; otherwise the residues would give oscillating terms as large as the main term.
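The statement that $\Lambda(n)$ averages to one can be tested on a small range. The following is a minimal sketch, with a hypothetical helper `von_mangoldt_table` and an arbitrary cut-off of $10^5$; it tabulates $\Lambda(n)$ by a sieve, checks the identity $\log n = \sum_{d \mid n} \Lambda(d)$ (which is what comparing coefficients in (21.1.1) gives) for one value of $n$, and prints the average of $\Lambda(n)$ up to the cut-off.

```python
import math


def von_mangoldt_table(limit):
    """Return a list lam with lam[n] = Lambda(n) for 0 <= n <= limit."""
    lam = [0.0] * (limit + 1)
    is_prime = [True] * (limit + 1)
    for p in range(2, limit + 1):
        if is_prime[p]:
            # Cross out the proper multiples of the prime p.
            for multiple in range(2 * p, limit + 1, p):
                is_prime[multiple] = False
            # Lambda(p^r) = log p for every power p^r up to the limit.
            power = p
            while power <= limit:
                lam[power] = math.log(p)
                power *= p
    return lam


limit = 100_000
lam = von_mangoldt_table(limit)

# Sum of Lambda(n) for n <= limit, divided by limit: the average value.
average = sum(lam) / limit
print(average)

# log n is the sum of Lambda(d) over the divisors d of n.
n = 360
divisor_sum = sum(lam[d] for d in range(1, n + 1) if n % d == 0)
print(divisor_sum, math.log(n))
```

A finite computation of this kind only illustrates the main term; it says nothing about the oscillating terms from the zeros, which are the real subject of the chapter.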
The reciprocal $1/\zeta(s)$ is the generating function for the combinatorial Möbius function $\mu(n)$. Just as $\Lambda(n)$ averages to one, so $\mu(n)$ averages to zero, with an accuracy determined by the residues at the point $s = 0$ and at the zeros of $\zeta(s)$. This is why Selberg's expression on the left of (5.6.10) is used both in the sieve and in the study of the zeta function. Riemann predicted with some confidence that all the zeros of $\zeta(s)$ (except those on the negative real axis), beyond the six or so that he had calculated, should have real part one-half, and so lie on the line of symmetry in the functional equation. This Riemann Hypothesis, if true, would imply that the Farey sequence is distributed as uniformly as possible along the real line, and that the prime numbers are distributed rather uniformly among the integers. Both these sequences are irregular over short intervals. In particular, the gap between consecutive prime numbers $p_n$ and $p_{n+1}$ would be bounded by the square root of $n$ times a power of $\log n$ if the Riemann Hypothesis were true. Probably the gaps are always much smaller; but to prove this would require further results on the small-scale distribution of the zeros of $\zeta(s)$ along the 'critical line' $\operatorname{Re} s = \tfrac12$.
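The averaging of $\mu(n)$ to zero can be watched in the same way. This sketch is illustrative only: `mobius_table` is a hypothetical helper, the cut-off $10^4$ is arbitrary, and the partial sums computed are those of the Mertens function $\sum_{n \le x} \mu(n)$.

```python
def mobius_table(limit):
    """Return a list mu with mu[n] = Moebius function of n for 0 <= n <= limit."""
    mu = [1] * (limit + 1)
    mu[0] = 0
    is_prime = [True] * (limit + 1)
    for p in range(2, limit + 1):
        if is_prime[p]:
            # Each prime factor flips the sign once.
            for multiple in range(p, limit + 1, p):
                if multiple > p:
                    is_prime[multiple] = False
                mu[multiple] = -mu[multiple]
            # A square factor p^2 makes mu vanish.
            for multiple in range(p * p, limit + 1, p * p):
                mu[multiple] = 0
    return mu


limit = 10_000
mu = mobius_table(limit)

mertens_100 = sum(mu[1:101])
mertens_limit = sum(mu[1:])
print(mertens_100, mertens_limit, mertens_limit / limit)

# The coefficients of zeta(s) * (1/zeta(s)) = 1: the sum of mu(d) over d | n is 0 for n > 1.
n = 360
print(sum(mu[d] for d in range(1, n + 1) if n % d == 0))
```

The divisor-sum check at the end is the coefficient form of the statement that $1/\zeta(s)$ is the generating function of $\mu(n)$.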
The Hardy-Littlewood generation of number theorists doubted the Riemann Hypothesis, not merely because they had failed to prove it, but because there was no apparent reason why it should be true. Even the computer evidence is covered by Littlewood's observation that the zeta function is unusually well-behaved near the real axis where $\log \log |s|$ is small.
The widely held view in the age of algebra is that the numbers $s(1 - s)$ formed from the zeros of $\zeta(s)$ will be eigenvalues, and thus real. Suggestions as to the self-adjoint operator to produce these eigenvalues range from an infinite matrix, through a differential operator, to a homomorphism of an infinite product of topological groups. All agree that it should be a natural construction in which the rational numbers are fixed, rather as the ground field is fixed by the Frobenius automorphism in a finite field. A suggestive approach is the Selberg Trace Formula. In the special case of the modular group (which fixes the rational numbers as a set), the prime numbers appear as a small correction term, corresponding to the term in the Riemann-von Mangoldt formula from the residues of zeros on the real axis. The coefficient sum in the formula is over units in real quadratic number fields. The main sum on the other side is over the other zeros of the Selberg zeta function, for which the Riemann Hypothesis is true. The related trace formulae for Fourier coefficients of modular forms have been worked out by Kuznetsov (1980) for the expansions of Chapter 10, which involve polar coordinates in hyperbolic space, and Good (1983) for Cartesian coordinates in hyperbolic space. They provide useful information for analytic number theory. The analogies with scattering operators in quantum mechanics (Lax and Phillips 1967) should only disconcert those who like their mathematics totally disconnected. But for prime number theorists, the Selberg Trace Formula resembles a glass of beer: the most attractive part turns out to be froth, disappearing when you digest it, and the body of the drink, however nourishing, has a bitter taste.

For all the loving care devoted to $\zeta(s)$ (see Titchmarsh 1951, and Ivić 1985, first conceived as a supplement to Titchmarsh) it is only one function. There are reasons to suspect that $\zeta(s)$ is extremal in the class of ordinary Dirichlet series with bounded coefficients, like Koebe's $f(z) = z/(1 - z)^2$ among the univalent functions. The Minakshisundaram-Pleijel zeta function, $\sum 1/\lambda_n^s$, where $\lambda_n$ is a sequence of eigenvalues, is an interesting but remote generalization. There is a select class of functions (usually called zeta functions if they have a single pole, L-functions if they are entire) with three properties:

D: a definition as a Dirichlet series;

E: an Euler product expansion;

F: a functional equation of the form $f^*(s) = g^*(1 - s)$, where $g(s)$ is another function of the same type, and the stars indicate that there is a normalizing factor involving phase shift factors, $s$th powers of constants, and gamma functions or the like.

The Riemann Hypothesis is not known to be false (or true!) for any function of this class. Selberg's zeta function has E and F, but not D, since it is only defined by the analogue of $\zeta'(s)/\zeta(s)$. Any direct definition of it as a series would be over the centre of the group ring of the modular group. The French school has its own motives for explaining the D, E, F properties. The same zeta function can serve several different motives. The Wilton summation formula for the Fourier coefficients of modular forms in Chapter 10 corresponds to D and F; that E holds was noticed by Ramanujan and first proved by Mordell.
While these mighty matters are mooted, we have to make the most of imperfect information, moving the Mellin contour past some of the poles, and estimating the integrand on the new contour. Useful information is:

(1) bounds for $\zeta(s)$;

(2) mean square results (or higher powers) on lines $\operatorname{Re} s = \sigma$;

(3) zero-density results, that there are very few zeros in some region bounded away from the critical line $\operatorname{Re} s = \tfrac12$.

Zero-density results use bounds of types 1 and 2, and ingenious arguments based on partial sums of the Dirichlet series for $1/\zeta(s)$ or $\zeta'(s)/\zeta(s)$. Present function-theoretic methods, alas, seem incapable of disproving the following null hypotheses:

(1) $|\zeta(s)|$ gets as large as $|t|^{1/8-\delta}$ for every $\delta > 0$;

(2) $\zeta(s)$ is zero in $\operatorname{Re} s > 1 - \delta$ for every $\delta > 0$;

(3) functions with the D, E, F properties can vanish on the positive real axis.
Littlewood (1912) observed that convexity arguments applied to $\log \zeta(s)$ show that off the real axis, for any $\epsilon > 0$,

$$\zeta(s) \ll T^{\epsilon} \tag{21.1.2}$$

inside any region with $|t| \le T$ to the right of the critical line that contains no zeros. Hence the Riemann Hypothesis implies the Lindelöf Hypothesis, that (21.1.2) holds for $\sigma \ge \tfrac12$ (with a constant depending on $\sigma$). The Lindelöf Hypothesis in turn implies the density hypothesis, that the residues from zeros with $\operatorname{Re} s = \tfrac12$ dominate the sum, and so

$$p_{n+1} - p_n \ll n^{1/2+\epsilon}.$$
21.2 THE ORDER OF MAGNITUDE IN THE CRITICAL STRIP

In complex function theory, knowing the maximum size of a function in a region is very useful. The series for $\zeta(s)$ converges for $\sigma = \operatorname{Re} s > 1$. The functional equation reflects the region $\sigma > 1$ into the region $\sigma < 0$. The remaining region is the critical strip $0 \le \sigma \le 1$. We use the approximate functional equation (Lemma 5.5.4) to translate bounds for exponential sums into bounds for $|\zeta(\sigma + it)|$ for $\sigma$ fixed, $t$ tending to infinity. First we quote a standard estimate (Jeffreys and Jeffreys 1962, art. 15.05).

Lemma 21.2.1 (Stirling's formula) Let $\delta > 0$ be fixed. Then

$$\log \Gamma(s) = (s - \tfrac12) \log s - s + \tfrac12 \log 2\pi + O\!\left(\frac{1}{|s|}\right)$$

as $|s| \to \infty$, uniformly in $-\pi + \delta \le \arg s \le \pi - \delta$. In particular, as $t \to +\infty$ with $\sigma$ fixed, we have

$$\log \Gamma(s) = (\sigma - \tfrac12) \log t - \frac{\pi t}{2} + \tfrac12 \log 2\pi + it(\log t - 1) + \frac{i\pi}{2}(\sigma - \tfrac12) + O\!\left(\frac{1}{t}\right).$$

There are corresponding approximations to the derivatives of $\log \Gamma(s)$. □
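Theorem 21.2.2 Let $\sigma$ be fixed in $\tfrac12 \le \sigma < 1$. Suppose that we have

$$\sum_{m=M}^{M_2} e\!\left(T F\!\left(\frac{m}{M}\right)\right) \ll M^{\lambda} T^{\mu} (\log T)^{\nu}, \tag{21.2.1}$$

with $F(x) = (\log x)/2\pi$, for some non-negative $\lambda$, $\mu$ and $\nu$ with

$$1 - \sigma \le \lambda \le \sigma, \tag{21.2.2}$$

uniformly in $M$, $M_2$, and $T$ satisfying $M_2 < 2M$ and

$$\frac{\mu}{1 - \sigma} \le \frac{\log M}{\log T} \le \frac12.$$

Then, as $|t| \to \infty$, we have

$$\zeta(\sigma + it) = O\big(|t|^{\mu} (\log |t|)^{\nu+1}\big).$$

Corollary We have

$$\zeta(\tfrac12 + it) = O\big(|t|^{89/570} \log^2 |t|\big).$$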
Proof By Stirling's formula (Lemma 21.2.1) we have r 21r l s
I-I
['(1 - s) x Itl 1/2
(21.2.3)
for s = o-+ it, Iti large. In the approximate functional equation (Lemma 5.5.4) we have +
I (s)I <<
log T
T1/2-°
where T = [ItI] + 2, and
N' =
[iiL}.
2irN
We can choose the integer N so that N and N' are less than
ItI
.
The trivial
estimates K 1
K
1ns
«K1-°
1
n. K'
1
nl-s << K
<
"log K',
are allowable for
K<< TK' << T('L+o-1/2)/° If µ > (1 - 0/2, then we can use the trivial estimates over the full range. Otherwise we have Tµ/(1-°)
<<
T(µ+°- 1/2)/° << T1/2,
and T here differs only trivially from the notation T = ItI in (21.2.1). The rest of the range is split into O(log T) blocks of the form M 5 n <- M2. We write n 1 1
n = n°M
M).
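We take out the factor $n^{\sigma} \asymp M^{\sigma}$ using Lemma 5.1.1, and estimate the resulting inner sum as $O(M^{\lambda} |t|^{\mu} (\log |t|)^{\nu})$. Since $\lambda \le \sigma$, this block contributes $O(|t|^{\mu} (\log |t|)^{\nu})$. Similarly we have

$$|t|^{1/2-\sigma} \left| \sum_{M \le n \le M_2} \frac{1}{n^{1-s}} \right| \ll \frac{|t|^{1/2-\sigma}}{M^{1-\sigma}}\, M^{\lambda} |t|^{\mu} (\log |t|)^{\nu} \ll |t|^{1/2-\sigma} \big(\sqrt{|t|}\big)^{\lambda+\sigma-1} |t|^{\mu} (\log |t|)^{\nu} \ll |t|^{\mu} (\log |t|)^{\nu},$$

where we have used both inequalities of (21.2.2). The result follows on putting together the bounds in different ranges.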
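The exponential sums in (21.2.1) are blocks of the Dirichlet series in disguise. Since $e(TF(m/M)) = (m/M)^{iT}$ when $F(x) = (\log x)/2\pi$, the block sum differs from $\sum m^{iT}$ only by the unimodular factor $M^{-iT}$. The sketch below checks this for one arbitrary choice of $M$, $M_2$ and $T$; the function names are hypothetical and the computation is an illustration, not an estimate of the true order of magnitude.

```python
import cmath
import math


def e(x):
    """The additive character e(x) = exp(2*pi*i*x)."""
    return cmath.exp(2j * math.pi * x)


def block_sum(M, M2, T):
    """Sum of e(T F(m/M)) over M <= m <= M2, with F(x) = log(x) / (2*pi)."""
    return sum(e(T * math.log(m / M) / (2 * math.pi)) for m in range(M, M2 + 1))


def dirichlet_block(M, M2, T):
    """Sum of m^(iT) over M <= m <= M2."""
    return sum(m ** (1j * T) for m in range(M, M2 + 1))


M, M2, T = 200, 399, 1000.0
s1 = block_sum(M, M2, T)
s2 = dirichlet_block(M, M2, T)

# The two sums have the same modulus; the trivial bound is the number of terms.
print(abs(s1), abs(s2), M2 - M + 1)
```

A bound of the shape $M^{\lambda}T^{\mu}(\log T)^{\nu}$ is a saving over the trivial bound $M_2 - M + 1$ printed last; the proof above only uses such a saving block by block.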
From Table 17.3 we deduce the further bounds 23208
43860
+ it)
«Itl 6299/43860 log
2
Itl,
(21.2.4)
and, for any integer R Z 0 and Q = 2R, 169(712R + 1577) 712(1424Q-338)
+ it
«Itl 169/(1424Q-338) log 22
(21.2.5)
These are values of $\sigma$ for which two outstanding vertices of the upper bound polygon in Table 17.1 correspond to sizes of $M$ that give contributions of the same order of magnitude. For intermediate values of $\sigma$, one range of $M$ dominates, and we have

$$\zeta(\sigma + it) \ll |t|^{\mu} \log^2 |t|,$$

where the exponent $\mu$ is obtained by linear interpolation between the exponents at values of $\sigma$ given in the corollary, (21.2.4), and (21.2.5).
21.3 THE MEAN SQUARE

Theorem 19.1.3 gives a bound for the short interval mean square of the zeta function on the critical line $\operatorname{Re} s = \tfrac12$.

Theorem 21.3.1 Let $T$ and $\Delta$ be large real numbers with

$$T^{72/227} (\log T)^{315/227} \le \Delta \le T.$$
Then
+it)12dt<< T
log2T.
Proof The upper bound is linear in A, so we need only prove it for A <- F. In (21.2.3) we can fix N' as
T -31 4'
N'
2orN
since the effect is to omit a bounded number of terms in the second sum, whose total contribution O(T-O/2) can be absorbed by the error are divided into blocks term in (21.2.3). The sums E 1/ns and E M:5; n 5 2M - 1 (the upper limit for n may be replaced by N or N'). In 1/nl_S
Theorem 19.1.3 we take F(x) = (log x)/21r, G(x) =1v (if the upper limit is N or N', not 2M - 1, then we take G(x) = 0 for x > N/M or N'/M). Then S(t) is larger by a factor VM_ than the sum we want, so when (19.1.9) of Theorem 19.1.3 holds, then the squared integral for each block is O(ff). Blocks with M small are estimated as in Theorem 18.2.3. There are O(log T) blocks, so Cauchy's inequality gives O(A log2 T) for the whole integral.
Much deeper arguments give the corresponding mean value theorem

$$\int_0^T |\zeta(\tfrac12 + it)|^2\, dt = T \log \frac{T}{2\pi} + (2\gamma - 1)T + E(T),$$

with an explicit estimate for the error term $E(T)$. We use the machinery of Heath-Brown (1979), following Heath-Brown and Huxley (1990).
Lemma 21.3.2 (Truncated mean square expansion) Suppose that 15 A S T1/3. Let t - 21rm2
E
g(t) = 2
ms (t/2a)
(m/n)"
G(t) = 2 m
n
m 1
(mn) ilogm/n
exp
02 log2 m/n 4
the conditions of summation in G(t) being
0 < I log m/ni 5 (log T)/0, mn <- t/tar.
(21.3.1) (21.3.2)
Then
g(t) = t log t/27r + (2y - 1)t + 0(1), and
+it)I2dt=g(4T)-g(2T)+G(4T)-G(2T)+0(t log2T). 2T
Theorem 21.3.3 For $T$ large

$$\int_0^T |\zeta(\tfrac12 + it)|^2\, dt = T \log \frac{T}{2\pi} + (2\gamma - 1)T + E(T),$$

with

$$E(T) \ll T^{72/227} (\log T)^{629/227}.$$
1/2
h2
which may be expanded as a power series in h/k. We take A = T72/227(log
T)175"227,
so that h is much smaller than k. The contribution of the first term in the power series will dominate the sum. We also have log
m
n = log
1+h/k 1- h/k
r (khlr) 1
=2
(21.3.3)
r odd
The weight A2log2 m/n
1
log m/n eXp(
4
and the condition (21.3.1) may be expressed by a condition 0 < I log m/nl 5 S,
(21.3.4)
and a weight w(S), and then by integrating the weighted sum over S. The condition (21.3.2) is
k2 - h2 5 2T/ir.
(21.3.5)
By (21.3.1) and (21.3.3), h2/k2 5 (log2 T)/402, so the upper bound for k in (21.3.5) is one of at most two consecutive integers. We may divide the double sum over h and k into two parts, and replace (21.3.5) on each by a constant upper bound for k. For fixed k, (21.3.4) is an upper bound for h of the form
h
dH dk
x S.
Having chosen the endpoints and centres of Farey arcs, we may find that the integer part of H(k) changes within a minor arc. If so, then the sum over a
minor arc must be divided into two or more parts which are estimated separately. The number of subdivisions is $O(1+\delta N)$, which is bounded when $HN \ll M$. We divide the sum into blocks of the form
$$H \le h \le \min(H_2, H(k)), \quad H_2 < 2H, \qquad M \le k \le M_2, \quad M_2 < 2M.$$
By (21.3.1) and (21.3.2) we can suppose that
$$M \ll \sqrt{T}, \qquad H \ll \frac{M\log T}{\Delta}.$$
A final preparation concerns the integer summands h and k. To obtain only terms in which k - h = 2n is an even integer, we divide the terms into two groups. In terms with h and k both even we take h/2 and k/2 as new integer variables, and
$$F(x) = (\log x)/2\pi \tag{21.3.6}$$
as usual. To obtain terms with h and k both odd, we take two sums in which h and k are unrestricted. In the first sum F(x) is still given by (21.3.6), whilst in the second sum we take
$$F(x) = \frac{\log x}{2\pi} - \frac{M^2x^2}{8T},$$
so that
$$TF\Bigl(\frac{k+h}{M}\Bigr) - TF\Bigl(\frac{k-h}{M}\Bigr) = \frac{T}{2\pi}\log\frac{k+h}{k-h} - \frac{hk}{2}.$$
When we subtract the second sum from the first, then we obtain $e(hk/2) = (-1)^{hk}$, and only terms in which h and k are both odd remain. We can now estimate the blocks of the sum as in Theorem 19.1.1. The weight function is
$$O\Bigl(\frac{1}{H}\exp\Bigl(-\frac{\Delta^2H^2}{4M^2}\Bigr)\Bigr),$$
so the block bound corresponding to (19.1.4) of Theorem 19.1.1 is
$$\ll \exp\Bigl(-\frac{\Delta^2H^2}{4M^2}\Bigr)\Bigl(\Bigl(\frac{H}{M}\Bigr)^{3/70}T^{23/70} + \Bigl(\frac{H}{M}\Bigr)^{87/140}T^{18/35}\Bigr)(\log T)^{9/4}.$$
For fixed M, this expression sums over blocks for H to
$$\ll \bigl(\Delta^{-3/70}T^{23/70} + \Delta^{-87/140}T^{18/35}\bigr)(\log T)^{9/4} \ll \Delta\log T,$$
and then over blocks for M to $O(\Delta\log^2 T)$. The conditions on M and H are those in Theorem 19.1.1. For M large, H small we can also use the bound
corresponding to (19.1.6) of Theorem 19.1.1, and for M small, the one-variable 'exponent-pair bound (2/7, 4/7)' as in Theorem 18.2.3. The parameter choices in Theorem 19.1.1 satisfy the condition $HN \ll M$ easily. Thus we find that $G(T) = O(\Delta\log^2 T)$. The theorem follows on summing the integral of $|\zeta(\tfrac12+it)|^2$ over the ranges T/2 to T, T/4 to T/2, ..., and a trivial range 0 to $T/2^r$, where r = [(log T)/(log 2)].
21.4 GAPS BETWEEN ZEROS
Even if the Riemann Hypothesis is false, there are zeros of $\zeta(s)$ on the critical line Re s = 1/2. The symmetrized function
$$\zeta^*(s) = \Gamma(s/2)\zeta(s)/\pi^{s/2} \tag{21.4.1}$$
is real on the real axis and on the critical line, which is also a symmetry axis. Selberg (1942) showed that the number of sign changes of $\zeta^*(s)$ along the critical line is comparable to the change of argument along the parallel line
Re s = 2, so that a positive proportion of the zeros lie on the critical line Re s = 1/2. As a first step towards the local distribution of the zeros and the gap between consecutive primes, we try to localize this result. The whole argument can be regarded as a divide-and-conquer proof that there are zeros of $\zeta(s)$ on the critical line.
We quote the first lemma (due to K. Ramachandra), which is purely function theory, from Balasubramanian (1978). The lemma does not use the properties E and F of the zeta function, and it applies to any Dirichlet series with leading term 1 and an analytic continuation for large t to the left of the line $\sigma = \tfrac12$, where it is bounded by a power of |t|.
Lemma 21.4.1 (Mean minimum modulus) Given $\varepsilon > 0$, however small, there is a $T_0$ depending on $\varepsilon$, such that for $T > T_0$ and $H \ge (\log T)^{\varepsilon}$, we have
$$\int_T^{T+H} |\zeta(\tfrac12+it)|\,dt \gg H.$$
Lemma 21.4.2 (Approximation to $\zeta^*(s)$) Let
$$N = N(t) = \bigl[\sqrt{|t|/2\pi}\,\bigr].$$
Then for $s = \tfrac12 + it$ with t large, we have
I'(s/2)
*(2 + it) = 2Re 1s/2
N1
ns + O
log t \ 1 (
ti/a
with
arg
r(s/2)
t( t _ 1 11 s/2ns = 2 log 2arn2 11 + 0 t /I
Proof This lemma follows at once from the definition of $\zeta^*(s)$, the approximate functional equation and Stirling's formula (Lemmas 5.5.4 and 21.2.1).
Lemma 21.4.3 (Repeated First Derivative Test) Let f(x) be real and r + 2 times continuously differentiable on the closed interval $[-\beta, \beta]$, with
$$f'(x) \ge \kappa > 0 \quad\text{on } [-\beta, \beta],$$
and for $2 \le k \le r+2$
$$|f^{(k)}(x)| < \beta^{1-k}|f'(x)|.$$
Then
$$\int_{-\beta}^{\beta} e(f(x))\cos^r\frac{\pi x}{2\beta}\,dx \ll \frac{\beta}{(\beta\kappa)^{r+1}}.$$
The implied constant depends on r.
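Proof We iterate the relation
$$\int_{-\beta}^{\beta} g(x)e(f(x))\cos^k\frac{\pi x}{2\beta}\,dx = \Bigl[\frac{g(x)e(f(x))}{2\pi if'(x)}\cos^k\frac{\pi x}{2\beta}\Bigr]_{-\beta}^{\beta} - \frac{1}{2\pi i}\int_{-\beta}^{\beta} e(f(x))\frac{d}{dx}\Bigl(\frac{g(x)}{f'(x)}\cos^k\frac{\pi x}{2\beta}\Bigr)dx$$
$$= \int_{-\beta}^{\beta} h(x)e(f(x))\cos^{k-1}\frac{\pi x}{2\beta}\,dx,$$
with
$$h(x) = \frac{1}{2\pi i}\Bigl(\frac{k\pi}{2\beta}\sin\frac{\pi x}{2\beta}\,\frac{g(x)}{f'(x)} + \cos\frac{\pi x}{2\beta}\,\frac{g(x)f''(x)}{f'(x)^2} - \cos\frac{\pi x}{2\beta}\,\frac{g'(x)}{f'(x)}\Bigr).$$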
The iteration ends when the integral on the right has no cosine factor, and we estimate each term at its maximum. Each term consists of numerical factors, sines and cosines, and a product
`(f(r+ 2) ..
°,+ z
f'
)
with
$a_1 + a_2 + \cdots + a_{r+2} = r+1$. Now suppose that $\zeta^*(\tfrac12+it)$ does not change sign between T and T + H, where $H \ge \log T$. Putting a positive weight function into Lemma 21.4.1, as in Lemma 5.1.1, we see that
$$\int_T^{T+H} |\zeta(\tfrac12+it)|\cos^r\Bigl(\frac{\pi}{H}\Bigl(t-T-\frac{H}{2}\Bigr)\Bigr)dt \gg H.$$
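Let
$$g(t) = \frac{1}{2\pi}\arg\frac{\Gamma(1/4+it/2)}{\pi^{1/4+it/2}}.$$
Then
$$|\zeta(\tfrac12+it)| = \pm e(g(t))\zeta(\tfrac12+it),$$
and the ambiguous sign does not change between T and T + H. Hence
$$\Bigl|\int_T^{T+H} e(g(t))\zeta(\tfrac12+it)\cos^r\Bigl(\frac{\pi}{H}\Bigl(t-T-\frac{H}{2}\Bigr)\Bigr)dt\Bigr| \gg H.$$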
By Lemma 21.4.1, we may replace g(t) by
f1(t) = with an error
-
(log 27r -
HN
1
H
1
T3/a
1
and replace e(g(t))t2 + it) by
2Ree(f1(t))
N
log T
1
(log T
2
N
+0(71/4)
7;T/ 2+ir
71/4
n
=
where
fn(t) =
1)
41r (log
with t
1
N
1
f,(t) = 41r log 2arn2 z
27r log n
We pick M < N and use Lemma 21.4.3 for n :!g M - 1 with f3 = H/2. The condition on the derivatives is satisfied if T N
H1ogM>1.
Then T+H T
M-1
Re
r
1
M-1
<< E 1
N/2
(( t- T-
H
1
n (H log 1
IN- ( <<1+ Hr
H dt 2) )
N/n)'+1
M-1
1
H-,+
N'
(N-M)r
1
1
Nr+1
rNHr .(N-n)r+1 )
« BrH ,
provided that
-
N-M>>
1/r
,
H( H
(21.4.2)
where B is a large constant. With this choice of M we must have M
N
1
n1/2+ir
>1
(21.4.3)
for some t between T and T + H. This is a short exponential sum formed with the functions 1
- log x,
F(x)
1
G(x) =
27r
,
TX
and parameters M (or N) and t, so that t x f(x)=--logM
We check that
f"(N)
M 2= ++O1
t
4irM2(NJ
2
1.
FT
The integer N is the centre of a major arc. We use Lemma 19.2.1 to treat the whole sum as a long major arc. Let
N' =N-M=MB. We have
µ=f(3)(N)/6 x 1/M, so that in Lemma 19.2.1 we have / 20-1 30-1 +OI
1 M).
After a partial summation step using Lemma 5.1.1, Lemma 19.2.1 gives M
1
«M1/2-v-v+9(sp+2v 112)(logT)B.
n1/2+U
Comparison with (21.4.3) gives
$$\theta > \frac{\rho+\sigma}{3\rho+2\sigma-1/2} - O\Bigl(\frac{\log\log M}{\log M}\Bigr). \tag{21.4.4}$$
After some experimentation with parameter choices, we suppose that $423/1295 \le \alpha \le 227/601$, corresponding to $0.6601\ldots \le \theta \le 0.7179\ldots$. In this
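range $\rho = 89/1282$, $\sigma = 908/1282$, so the rational number in (21.4.4) is 997/1442 = 0.6914... . We can choose $H = T^{\beta}$ in (21.4.2) with
$$\frac{\theta}{2} + \beta = \frac12 + O\Bigl(\frac1r\Bigr), \tag{21.4.5}$$
and (21.4.4) gives
$$\beta < \frac{445}{2884} + O\Bigl(\frac1r + \frac{\log\log T}{\log T}\Bigr). \tag{21.4.6}$$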
This is an upper bound for H. We conclude by discussing a logical trap. Could $\beta$ be larger than (21.4.5), so that $\alpha$ and $\theta$ are smaller, so small that $\alpha < 423/1295$, and the values of $\rho$
and $\sigma$ change? If $\beta = 3/20$, then $\theta$ is approximately 7/10, $\alpha$ is approximately 4/11, still in the range in which $\beta = (89 + 908\alpha)/1282$ in Table 17.1. Lemma 19.2.1 tells us that (21.4.3) cannot hold for large T; there must be a zero on the critical line with height t in $T \le t \le T + T^{3/20}$. So there cannot be a gap so large that $\beta$ corresponds to a value of $\alpha < 423/1295$. We have now proved a version of Karatsuba's theorem (1981).
Theorem 21.4.4 Let $\beta > 445/2884$ (= 0.1543...) be a real number. If T is sufficiently large (depending on $\beta$) then there is at least one zero $\rho = \tfrac12 + i\gamma$ of the Riemann zeta function with
$$T \le \gamma \le T + T^{\beta}.$$
21.5 THE TWELFTH-POWER MOMENT
Jutila's method for modular form L-functions applies equally to $\zeta^2(s)$, although the corresponding modular form is not a cusp form, and has a logarithmic singularity at $i\infty$. The summation formula in the untwisted case is the original Voronoi formula (1904). There is an extra non-oscillating term
corresponding to the singularity, and the Rankin series has a fourth-order pole at s = 1. We must replace B log M by $\log^4 M$ in the bounds of Chapter 10; we can see this directly from Lemmas 11.1.2 and 11.1.3. The proof of Theorem 20.3.3 can be modified to give a sixth-power moment for $\zeta^2(s)$. Theorem 21.5.1
For T large
LT
Theorem 21.5.1 was first proved by Heath-Brown (1978) (with a better logarithm power $\log^{17} T$), using Atkinson's formula (1949), which is essentially Wilton's formula for a non-holomorphic modular form whose Dirichlet series is $\zeta(s+it)\zeta(s-it)$. Another proof by Iwaniec (1980) uses Fourier expansions into non-holomorphic modular forms.
22 An application to number theory: prime integer points
22.1 PREPARATION
A favourite source of problems in number theory is to take two integer sequences, and to ask how many members they have in common. Popular sequences are the prime numbers, the Fibonacci numbers, the powers of 2 and the squares (or, more generally, the values of a suitable polynomial g(x)). Problems of this type involving prime numbers are tantalizingly difficult: even primes of the form $n^2 + 1$ are well out of reach. Gel'fond suggested
the sequence of integer parts of the values g(n) of a real function g(y) growing faster than linearly. Thus the point ([g(n)], n) is an integer point lying close to the curve x = g(y). Piatecki-Shapiro (1953) was able to show that there are infinitely many prime numbers of the form $[n^{\alpha}]$ for each $\alpha$ in the range $1 < \alpha < 12/11$. Deshouillers (1976) showed that the set of $\alpha > 1$ for which $[n^{\alpha}]$ takes only finitely many prime values has measure zero. It would
be very surprising if this exceptional set were not simply the integers. Piatecki-Shapiro's range for $\alpha$ has been extended many times, most recently by Liu and Rivat (1992) to $1 < \alpha < 15/13$. We prove the more general result suggested by Gel'fond. For convenience we work with the function f(x) inverse to x = g(y), where
$$f(x) = NF(x/M), \tag{22.1.1}$$
and we count the prime values of m in an interval $M \le m < 2M$ for which there is an integer n with
$$f(m) \le n < f(m+1). \tag{22.1.2}$$
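The Piatecki-Shapiro case $g(y) = y^{\alpha}$ mentioned above can be explored by brute force. The sketch below is only an illustration: the exponent 1.05 (inside the range $1 < \alpha < 12/11$), the search limit, and the function names are all assumptions chosen for the example, and floating-point powers are adequate only for small n.

```python
import math

def is_prime(n):
    """Trial division primality test, adequate for small n."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True

def primes_of_form_floor_power(alpha, n_max):
    """Pairs (n, p) with p = floor(n**alpha) prime, for 1 <= n <= n_max."""
    pairs = []
    for n in range(1, n_max + 1):
        p = math.floor(n ** alpha)
        if is_prime(p):
            pairs.append((n, p))
    return pairs

pairs = primes_of_form_floor_power(1.05, 2000)
print(len(pairs), pairs[:5])
```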
The conditions on F(x) will be of the type seen in our previous theorems. We require f(x) and g(y) to be monotone increasing, with
0
$$f(m+1) - f(m) + \rho(-f(m+1)) - \rho(-f(m)) \tag{22.1.3}$$
is one if some integer n satisfies (22.1.2), zero if not. There is a technical
simplification in counting prime numbers using von Mangoldt's weight $\Lambda(m)$, defined by
$$\Lambda(m) = \begin{cases}\log p & \text{if } m = p^a,\ a \ge 1,\ p \text{ prime},\\ 0 & \text{if } m \text{ is not a prime or a prime power.}\end{cases}$$
The basic property of these weights is
$$\sum_{d\mid m}\Lambda(d) = \log m. \tag{22.1.4}$$
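The basic property (22.1.4) of von Mangoldt's weight can be verified directly for small m. This is a minimal sketch using trial division; the function names are illustrative assumptions.

```python
import math

def von_mangoldt(m):
    """Lambda(m) = log p if m is a power p^a (a >= 1) of a prime p, else 0."""
    if m < 2:
        return 0.0
    p = 2
    while p * p <= m:
        if m % p == 0:
            while m % p == 0:
                m //= p
            # m was a prime power exactly when nothing is left over
            return math.log(p) if m == 1 else 0.0
        p += 1
    return math.log(m)  # no factor up to sqrt(m): m itself is prime

def divisor_sum_of_lambda(m):
    """Sum of Lambda(d) over the divisors d of m."""
    return sum(von_mangoldt(d) for d in range(1, m + 1) if m % d == 0)

for m in (1, 12, 97, 360):
    print(m, divisor_sum_of_lambda(m), math.log(m))
```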
We use the prime number theorem in a weak form (see Davenport 1967 and Hardy and Wright 1960 for the two standard proofs). Lemma 22.1.1 (Prime number theorem) For $P \ge 2$ we have
$$\sum_{m\le P}\Lambda(m) = P + O\Bigl(\frac{P}{\log P}\Bigr).$$
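As a numerical illustration of Lemma 22.1.1, the sum of $\Lambda(m)$ over $m \le P$ can be computed with a sieve and compared with P. The sketch below is an assumption-laden check, not part of the proof: the function name is illustrative, and the comparison with P/log P only shows that the observed error is small at these values of P.

```python
import math

def chebyshev_psi(P):
    """Sum of Lambda(m) over m <= P, using a sieve of Eratosthenes."""
    sieve = bytearray([1]) * (P + 1)
    sieve[0:2] = b"\x00\x00"
    total = 0.0
    for p in range(2, P + 1):
        if sieve[p]:
            for multiple in range(p * p, P + 1, p):
                sieve[multiple] = 0
            power = p
            while power <= P:  # each prime power p^a <= P contributes log p
                total += math.log(p)
                power *= p
    return total

for P in (1000, 10000):
    print(P, chebyshev_psi(P), P / math.log(P))
```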
Let I be a subinterval of $1 \le x \le 2$, chosen so that no combination of derivatives of F(x) occurring in the proof vanishes on I. We work with the integers in the interval MI, the set of points of I multiplied by M; let the endpoints of MI be $M_1$ and $M_2$. Let $\sum^{(2)}$ denote a sum over integers m for which (22.1.2) has a solution. Then
$$\sum_{MI}{}^{(2)}\Lambda(m) = \sum_{MI}\Lambda(m)\bigl(f(m+1)-f(m)\bigr) + \sum_{MI}\Lambda(m)\bigl(\rho(-f(m+1)) - \rho(-f(m))\bigr). \tag{22.1.5}$$
By the prime number theorem and partial summation, the first term in (22.1.5) becomes
$$f(M_2) - f(M_1) + O\Bigl(\frac{N}{\log M}\Bigr). \tag{22.1.6}$$
On the left of (22.1.5)
$$\sum_{m\in MI}{}^{(2)}\Lambda(m) = \sum_{p\in MI}{}^{(2)}\log p + O\Bigl(\sum_{m\le 2M,\ m\ \text{not prime}}\Lambda(m)\Bigr); \tag{22.1.7}$$
in this chapter p is a variable of summation restricted to prime numbers. The second sum in (22.1.7) is
$$\sum_{r\ge2}\ \sum_{p^r\le 2M}\log p \le \sum_{p\le(2M)^{1/2}}\log p\,\frac{\log 2M}{\log p} \ll M^{1/2}\log M \ll \frac{N}{\log M}$$
for $M \ll N^2/\log^4 N$.
The second term on the right of (22.1.5) contains $\rho(t)$ functions as in Chapter 18. It is still essentially a sum over prime numbers because we have
the coefficients $\Lambda(m)$. We use a sieve (in the original sense): a counting argument which puts more weight on the prime numbers, using Möbius inversion to remove many numbers which are not prime. Linnik's large sieve (1941) introduced generating function inequalities; Lemma 5.6.6 is a modern
version. Vinogradov showed that sieve ideas could be used to estimate exponential sums over prime numbers. Linnik and Vaughan gave different combinatorial arguments of the same general type. We follow Vaughan (1977). Let
$$\sigma(m) = \begin{cases}\rho(-f(m+1)) - \rho(-f(m)) & \text{for } m \text{ in } MI,\\ 0 & \text{otherwise,}\end{cases}$$
and let u be an integer with u + 1 a power of 2, and
$$u^3 < M. \tag{22.1.8}$$
Let $S_0$ be the finite sum
$$S_0 = \sum_{d\le u}\mu(d)\sum_m\Lambda(m)\sum_r\sigma(dmr).$$
We write $S_0 = S + S'$, where S consists of the terms with $dr \le u$, S' the terms with dr > u. By Möbius inversion we have
$$S = \sum_m\Lambda(m)\sigma(m),$$
the second sum on the right of (22.1.5). Next we write $S' = S_1 + S_1'$, where $S_1$ consists of the terms with m > u, $S_1'$ the terms with $m \le u$. We have
$$S_1 = \sum_{d\le u}\mu(d)\sum_{r>u/d}\ \sum_{m>u}\Lambda(m)\sigma(dmr) = \sum_{m>u}\Lambda(m)\sum_{n>u}K(n)\sigma(mn),$$
where
$$K(n) = \sum_{d\mid n,\ d\le u}\mu(d), \qquad |K(n)| \le \sum_{d\mid n}1.$$
In $S_1'$ we can drop the condition dr > u, since $\sigma(dmr)$ is zero when $dmr < u^2$. We write $S_1' = S_2 + S_3$, where $S_2$ consists of the terms with $dm \le u$, $S_3$ the terms with dm > u. We put s = dm,
$$\lambda(s) = \sum_{d\le u}\ \sum_{\substack{m\le u\\ dm=s}}\mu(d)\Lambda(m), \qquad |\lambda(s)| \le \log s.$$
Then
$$S_2 = \sum_{s\le u}\sum_r\lambda(s)\sigma(rs), \qquad S_3 = \sum_{u<s\le u^2}\sum_r\lambda(s)\sigma(rs).$$
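The decomposition above amounts to an identity between arithmetic functions: for N > u, the coefficient of $\sigma(N)$ in $S_0 - S_1 - S_2 - S_3$ should equal $\Lambda(N)$. The sketch below checks this numerically. It assumes the reading of the sums in which $S_0$ contributes $\sum_{d\mid N,\,d\le u}\mu(d)\log(N/d)$, $S_1$ contributes $\sum\Lambda(m)K(N/m)$ over $m > u$, $N/m > u$, and $S_2 + S_3$ contributes $\sum\mu(d)\Lambda(m)$ over $d, m \le u$ with $dm \mid N$; the helper names and the value u = 7 are illustrative assumptions.

```python
import math

def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]

def mobius(n):
    """Moebius function by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0          # square factor
            result = -result
        p += 1
    return -result if n > 1 else result

def von_mangoldt(m):
    if m < 2:
        return 0.0
    p = 2
    while p * p <= m:
        if m % p == 0:
            while m % p == 0:
                m //= p
            return math.log(p) if m == 1 else 0.0
        p += 1
    return math.log(m)

def K(n, u):
    """Truncated divisor sum of the Moebius function over d <= u."""
    return sum(mobius(d) for d in divisors(n) if d <= u)

def vaughan_rhs(N, u):
    """Coefficient of sigma(N) in S0 - S1 - (S2 + S3), for N > u."""
    divs = divisors(N)
    s0 = sum(mobius(d) * math.log(N // d) for d in divs if d <= u)
    s1 = sum(von_mangoldt(m) * K(N // m, u)
             for m in divs if m > u and N // m > u)
    s2_plus_s3 = sum(mobius(d) * von_mangoldt(m)
                     for d in divs if d <= u
                     for m in divisors(N // d) if m <= u)
    return s0 - s1 - s2_plus_s3

u = 7
worst = max(abs(vaughan_rhs(N, u) - von_mangoldt(N)) for N in range(u + 1, 300))
print(worst)  # rounding error only
```

The check uses N > u; in the text the stronger support condition on $\sigma$ makes the same cancellation automatic.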
Next by (22.1.4) we write $S_0$ as
$$S_0 = \sum_{d\le u}\sum_e\mu(d)\log e\ \sigma(de).$$
In Montgomery and Vaughan's classification the sums $S_0$ and $S_2$ are type
I, pure in the inner variable, which runs through a long interval (the monotone weight in $S_0$ can be removed using Lemma 5.1.1). The sums $S_1$ and $S_3$ are type II, and can be written using (22.1.8) as
$$\sum_{u<s\le u^2}\alpha(s)\sum_{r>u}\beta(r)\sigma(rs). \tag{22.1.9}$$
We can regard the coefficients $\sigma(rs)$ as forming a lacunary square matrix of size 2M - 1. For large M we elaborate the construction, picking z so that z + 1 is a power of 2 and u2Y2 5z < VIM-,
(22.1.10)
and
$$uz^2 \ge 2M. \tag{22.1.11}$$
Then $w = 2M/z^2$ lies in the range $2 \le w \le u$. We write $S_1 = S_{41} + S_{42} + S_4'$, where $S_{41}$ is the
terms with $m \le z$, $S_{42}$ is the terms with $n \le z$, and $S_4'$ is the terms with m, n > z. Thus
$$S_{41} = \sum_{u<m\le z}\Lambda(m)\sum_{n>u}K(n)\sigma(mn),$$
$$S_{42} = \sum_{u<n\le z}K(n)\sum_{m>u}\Lambda(m)\sigma(mn).$$
These are type II sums in which one summand runs through a short range. We expand $S_4'$ again as
$$S_4' = \sum_{d\le u}\mu(d)\sum_{r>z/d}\ \sum_{m>z}\Lambda(m)\sigma(dmr).$$
Let $S_5$ be the terms in $S_4'$ with $d \ge w$, $S_6$ the terms with d < w. In $S_5$ and $S_6$ we cannot combine d and m into one variable s, because the condition on r is r > z/d, depending on d. However $S_5$ and $S_6$ both have the more general form
$$\sum_d\sum_m a(d,m)\sum_{r>z/d}\sigma(dmr).$$
We treat $S_5$ as a type II sum, $S_6$ as a short type I sum. In $S_5$ we have
$$r > z/u > u, \qquad s = dm > wz = 2M/z.$$
In $S_6$ we have
$$r > z/w, \qquad s > z.$$
22.2 TYPE I DOUBLE SUMS
Long type I sums have the form
$$\sum_{s\le x}\alpha(s)\sum_r\beta(r)\sigma(rs), \tag{22.2.1}$$
where $\beta(r)$ is a monotone weight function. For $S_0$ we have x = u, $\alpha(s) = \mu(s)$, $\beta(r) = \log r$, and for $S_2$ we have $\alpha(s) = \lambda(s)$, $\beta(r) = 1$. For x > z, the part of
$S_6$ with $s \le x$ also has this form, with $\beta(r) = 1$, and
$$\alpha(s) = \sum_{d<w}\mu(d)\sum_{\substack{m>z\\ dm=s}}\Lambda(m), \qquad |\alpha(s)| \le \log s.$$
The inner sum over r in (22.2.1) is
$$\sum_{r=P_1(s)}^{P_2(s)}\beta(r)\bigl(\rho(-f(rs+1)) - \rho(-f(rs))\bigr), \tag{22.2.2}$$
taken over a range $P_1(s) \le r \le P_2(s)$ with $P_1(s) \asymp P_2(s) \asymp M/s$. By Lemma 5.4.3 Corollary 2 and Lemma 5.1.1, the sum in (22.2.2) is $\ll (P_2(s)N)^{1/3}\max|\beta(r)|$, and so the whole expression in (22.2.1) is
$$\ll \sum_{s\le x}\Bigl(\frac{MN}{s}\Bigr)^{1/3}\max|\alpha(s)\beta(r)| \ll (MNx^2)^{1/3}l$$
in all three cases. Here and in later order-of-magnitude estimates we write l for log M. The sum (22.2.1) can be swallowed by the expression in (22.1.6) for
$$x \ll \frac{N}{M}\cdot\frac{M^{1/2}}{l^3}. \tag{22.2.3}$$
The terms in $S_6$ with dm > x form a short type I sum. We suppose that x + 1 is a power of 2. We divide the ranges for r and q = dm into blocks of the
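form $P \le r \le 2P-1$, $Q \le q \le 2Q-1$, where P and Q are powers of 2. A typical block sum has the form
$$\sum_{q=Q}^{2Q-1}\ \sum_{d\le u}\mu(d)\sum_{\substack{m>z\\ dm=q}}\Lambda(m)\sum_{\substack{r=P_1(d,m)\\ qr\in MI}}^{P_2(d,m)}\sigma(qr),$$
and is bounded above by
$$\sum_{q=Q}^{2Q-1}\ \sum_{dm=q}\Lambda(m)\max_{\substack{P_4\ge P_3\\ qP_3,\ qP_4\in MI}}\Bigl|\sum_{r=P_3}^{P_4}\sigma(qr)\Bigr|.$$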
The sum over d and m is simply log q. To apply Lemma 5.3.3 we put
$$\sigma(qr) = \rho(-f(qr+1)) - \rho(-f(qr)) = \rho(z_r) - \rho(y_r).$$
We take the weights $w_r$ to be 1 for $P_3 \le r \le P_4$, 0 otherwise. Then Lemma 5.3.3 gives
p I Ewro(gr)I 4
H
H
wr(e(hy,.) - e(hzr))
h=1
2arih
+E
H H h=1
H H h=1
1
1
+- E I Ewre(hyr)I +- E I Ew,.e(hzr)I; (22.2.4) Y
we have gained some simplification by taking the sum over h outside the modulus sign. If
$$H \gg \frac{M}{N}l^3, \tag{22.2.5}$$
then the contribution $\ll PQl/H$ of the first term is small enough. In the second term we write
$$\frac{e(-hf(m)) - e(-hf(m+1))}{2\pi ih} = \int_{f(m)}^{f(m+1)}e(-ht)\,dt = e(-hf(m))\int_0^{f(m+1)-f(m)}e(-ht)\,dt.$$
There is some constant c with
f(m + 1) -f(m)
E wr(e(-hf(qr)) - e(-hf(qr+ 1))) r
fcN/m
we(-hf(qr)-ht)dt.
F, r
0
f(rs+1)-f(rs)z1
We suppose that f" does not vanish on the interval MI, so that the condition f(rs + 1) -f(rs) > t holds on a subinterval of P3 < r:5 P4. Hence the second term in (22.2.4) is bounded by
H N
<< F, -
P6
max
h=1 M IPs>P6)cIP3rP41
E e(-hf(qr))
.
P5
We can remove the minus sign by complex conjugation. In fact
P I Ewro(gr)I << H
1) H +(N-M +H h=1 - gP5,gP6EMI max
By (22.2.5) we can drop the term 1/H.
P6
e(-hf(gr)) P5
.
(22.2.6)
We treat the sums in (22.2.5) as exponential sums over r with a parameter
q; the range for h is too short for us to expect an extra saving. We apply Theorem 17.2.2 with P in place of M, Q in place of I, and F(xy) for F(x, y) (with y in the range $1 \le y \le 2$), and $T \asymp hN$. The results are much simplified if
$$Q \gg \max_h T^{8/33} \asymp (HN)^{8/33} \asymp M^{8/33}l^{8/11}; \tag{22.2.7}$$
we have assumed that H is chosen as small as (22.2.5) allows. Since Q does not satisfy (22.2.3), we have (22.2.7) provided that
$$M^{49} \ll N^{66}l^{-246}. \tag{22.2.8}$$
We note that
aiF(x, y) = y'F(')(xy), a1-la2F(x, y)
+xyr-1F(')(xy).
= (r-
The derivatives of F(u) which must not vanish are: F" ( u),
F (3)(u),
F(4)(u) 3F ( 3)( u) + uF (4)(u)
F(5)(u)
3F(3)(u)2 - F (u) F(4)(u), F(3)(u) F(4)(u) 3F (3)(u) 2F" (u) + uF(3) (u) 2F" (u) F(4)(u ), F(3) (u)
F(5)(u) 4F(4)(u)
F(4) (u) 3F (3)(U) + uF(4) (u)
4F (4 )( U) + uF( 5)(u)
3F (3)(U)2 + 4F" (u)F(4)(u)
3F " (u)F(3)(u)
F(5)(u) 4F (4)(U) + uF(5)(u)
F(4)(u) 3F (3) (U) + uF(4)(u)
1 3F (3)(U)2
0
-F " (u)2
F(5)(u)
F(4)(u)
F(3 )(u)
4F(4)(u)
3F (3)(U)
2F "(u)
F(4)(u) 3F (3)(U)
F" (u)2 F(3)(u) 2F" (u) + uF(3)(u)
The theor em gives 5
P6(q)
E e(hf(qr)) q
<< p5/2 QT 5/6 +eE 3/8,
r=PS(q)
with 1
E << T8/33+
T2/3 :i
P2
+74/3
when we combine the second and third cases. The limits of summation P5
and P6 also depend on h and t, but these are outside the summation over q anyway. We want to show that P6(q)
Q4
q
E e(hf(gr))
5
( 1 M. N
<
H N l3
r=P5(q)
which holds when T5/6+EE3/s <<
P5/2/H5120.
This inequality is satisfied for
M P» (NMM49/165+e + (N) 2
20/13
20/7
M1/3+E + (NM),
M4/21+F
(22.2.9)
22.3 TYPE II DOUBLE SUMS
The type II sums have the form
$$\sum_{u<s\le z}\alpha(s)\sum_{r>u}\beta(r)\sigma(rs).$$
We divide the sums into blocks of the form $P \le r \le 2P-1$, $Q \le s \le 2Q-1$, where P and Q are powers of 2, with
$$P \gg M/z \gg M^{1/2}. \tag{22.3.1}$$
Let $\alpha$ and $\beta$ satisfy
$$\sum_{s=Q}^{2Q-1}|\alpha(s)|^2 \le \alpha^2Q, \qquad \sum_{r=P}^{2P-1}|\beta(r)|^2 \le \beta^2P.$$
In $S_3$ we have $\alpha(s) = \lambda(s)$, $\beta(r) = 1$, so that we can take $\alpha = l = \log M$, $\beta = 1$. In $S_{41}$ we have $\alpha(s) = \Lambda(s)$, $\beta(r) = K(r)$, so we can take $\alpha = l$, $\beta^2 \ll l^3$. In $S_{42}$ we have $\alpha(s) = K(s)$, $\beta(r) = \Lambda(r)$, so that $\alpha^2 \ll l^3$, $\beta = l$.
We write S5 (with a change of notation) as
S5 = E µ(d) E A(m) w5d5u
m>z
F, or(dmq). q>z/d
We remove the condition dq > z using Lemma 5.2.4 on the sum over q, with K2 = 2Q - 1, K1 = (z + Z)/d. This lemma introduces factors 1/ds, 1/q5 into the block sum, and an error term << E I µ(d)I A(m) log Q «P12. w5d5u
m>z PSdm <2P
The coefficient /3(r) has
/3(r) _ L
w5d5u
µ(d) ds
E A(m), m>z
dm =r
10(01 < log r,
so we can take a = 1, f3 << 1. There is also a factor I from the integration over
s at the end. We apply the method of Lemma 5.3.3 with
Z. = -f(m + 1),
Y. = -f(m),
and wm zero for m not in the interval MI, 2Q-1
2P-1
wm = F, a(q) E f3(r) r=P
q=Q
for m in MI. In the argument leading to the third and fourth terms in the upper bound we use 2Q-1
IWmI s F,
2P-1
Ia(q)I F,
q=Q
I f3(r)I
r=P
for m in the interval MI. We continue as in the treatment of short type I sums, but we pay more attention to the limits of summation. Let J denote the
interval MI, and J(t) the subinterval of J on which f(x + 1) - f(x) Following the treatment of type I sums, we have 2Q-1
1 2Q-12P-1
2P-1
L a(q) F,
r=P
q=Q
f3(r)o- (gr)<< - F, E Ia(s)f3(r)I H q=Q r=P 2Q-1
H
IcN/M
q=Q
h=1 t-O
2Q-1 1
H
q=Q
H h=1 H
dt
E If3(r)Ie(hf(gr))
r=P
grEJ
2Q-1
+-H h=1 1
F f3(r)e(hf(gr))
r=P 2P-1
E Ia(q)I
+- F,
2P-1
E a(q)
2P-1
E I a(q)I
L 1 /3(r)I e(hf(gr + 1)) r=P
q=Q
.
(22.3.2)
greJ
We want to show that the block sum on the left of (22.3.2) is $\ll N/l^3$. The first term on the right of (22.3.2) is
$$\ll \frac{\alpha\beta PQ}{H} \ll \frac{\alpha\beta M}{H} \ll \frac{N}{l^3},$$
provided that we choose
$$H \asymp \alpha\beta\frac{M}{N}l^3, \tag{22.3.3}$$
a choice similar to that in (22.2.5).
The other three sums on the right of (22.3.2) have the same form. We take the second sum as an example. First we remove the coefficients /3(r) by a Cauchy inequality as in Lemma 5.6.1. 2Q-1
2
2P-1
E a(q)
F,
/3(r)e(hf(gr))
r=P
q=Q
2P-1
2Q-1 E a(q)e(hf(gr))
2P-1
E I (3(r)I
F,
P
P
(22.3.4)
.
Q
greJ(t)
The first factor on the right of (22.3.4) is 5 /3 2P. In the second factor we apply the differencing step (Lemma 5.6.2): 2Q-1 E a(q)e(hf(gr)) Q
2
2
Q+2D-1'2Q-1
Ia(q)I +2Re
D2
D-I (
d
1
D)
d=
Q
greJ(t)
2Q-d-1
x
E
a(q+d)a(q-d)e(hf(gr+dr)-hf(gr-dr))
.
(22.3.5)
q=Q+d gr±dreJ(t)
The first sum on the right of (22.3.5) contributes
Q+2D-1 D2
«2Q
to the right-hand side of (22.3.5), << a2/32P2Q2/D << a2/32M2/D
to the bound for the right-hand side of (22.3.4), and
« HN M a/3M VD _
r
a2/32M13
(22.3.6)
to the second sum in (22.3.2), provided that D << Q. We choose DxaaRa(N)2
112,
so that the bound in (22.3.6) is again << N/13.
(22.3.7)
In order to apply Lemma 5.4.3, Corollary 1 to the sum over r, we note that
$$\frac{\partial^2}{\partial r^2}\bigl(hf(qr+dr) - hf(qr-dr)\bigr) = h(q+d)^2f''(qr+dr) - h(q-d)^2f''(qr-dr)$$
$$= 2dh\bigl(2(q+\theta d)f''(qr+\theta dr) + r(q+\theta d)^2f^{(3)}(qr+\theta dr)\bigr), \tag{22.3.8}$$
for some $\theta$ in $-1 < \theta < 1$. If $2F''(u) + uF^{(3)}(u)$
is bounded away from zero, then the expression in (22.3.8) has order of magnitude
$$\frac{dhNQ}{M^2} \asymp \frac{dhN}{QP^2}.$$
Corollary 1 to Lemma 5.4.3 gives
$$\sum_r e\bigl(hf(qr+dr) - hf(qr-dr)\bigr) \ll \Bigl(\frac{dhN}{Q}\Bigr)^{1/2}.$$
Since
2I a(q+d)a(q-d)I.-I a(q-d)I2+Ice (q+d)I2, the non-trivial terms on the right of (22.3.5) sum to
«
Q Q
aD2 2
Q(DhN1/2 Q
contributing a2/32M2 (DHNP) 1/2 << « 2Pa ZQZ(DHN11/2 Q P M J
to the right-hand side of (22.3.4), and
DHN
M
M
(22.3.9)
) 1/4
to the second term in (22.3.2). The expression (22.3.9) is << N/13 for P>> a4R4DH5N 112>> (ap)13(M)
6
139
(22.3.10)
We also require Q >> D, so
Q>> (aa)4(N
2112
(22.3.11)
22.4 PRIME NUMBERS IN A SMOOTH SEQUENCE
We can now give a form of the Piatecki-Shapiro theorem.
Theorem 22.4.1 Let F(x) be a real monotone increasing function, six times
continuously differentiable on a closed subinterval I of [1, 2] on which the expressions F'(x), F''(x), and $2F''(x) + xF^{(3)}(x)$ are non-zero. Let $\varepsilon$ be positive, and let M be a large integer, N be a large real number with
$$N\max_I F'(x) < M < N^{\theta-\varepsilon} \tag{22.4.1}$$
with $\theta = 12/11$. Then, for some constant B depending on I, F(x), and on $\varepsilon$, but not on M or N, there are at least BN/log N integers n such that [g(n)] is a prime number p with $M \le p < 2M$, where x = g(y) is the function inverse to
y/N = F(x/M). If in addition the interval I is chosen so that none of the expressions
F(4)(x),
F(3)(x), F(4)(x) 3F(3)(x)
F(3)(x)
F(4)(x)
F(3)(x)
F" (x)
3F (3)(x)
2F"(x)
3F(3)(X)2 F(5 (x) 4F(4'(x)
F(5 (x) 4F(4)(x)
0
-F" (x)2
F(4)(x) 3F(3)(x)
2F"(x)
F(4)(x) 3F (3)(X)
F(3)(x)
vanishes on the interval I, then the result holds with $\theta = 3300/3019$ in (22.4.1).
Proof We follow the convention of writing $\varepsilon$ for any exponent which can be made arbitrarily small, so that exponents $\varepsilon$ in different bounds need not be
the same as one another, or the same as the $\varepsilon$ in the statement of the theorem. For the result with $\theta = 12/11$ we do not actually need the further dissection of $S_1$ into $S_{41}$, $S_{42}$, $S_5$, and $S_6$. All that we use about the type II sums is that $P \gg Q$, so that $P \gg M^{1/2}$. This can be achieved by interchanging q and r within each block of the sum if necessary. With the combinatorics of Section 22.1 we take z close to $M^{1/2}$, so that w is bounded, to achieve the same effect. For $P \gg M^{1/2}$, the necessary condition (22.3.10) holds for
$$M \ll N^{12/11-\varepsilon}, \tag{22.4.2}$$
and (22.3.11) holds for
$$u \gg \Bigl(\frac{M}{N}\Bigr)^2M^{\varepsilon}, \tag{22.4.3}$$
which is consistent with (22.1.8) and (22.1.10). The sum $S_6$ may be treated as
a type II sum. For the other type I sums, the condition (22.2.3) with x = u follows from (22.1.8). In (22.1.5) we have
$$\sum_{\substack{p\in MI\\ p\ \text{prime}}}{}^{(2)}\log p = f(M_2) - f(M_1) + O\Bigl(\frac{N}{\log M}\Bigr).$$
Since $\log p \ge \log M$ and $f(M_2) - f(M_1) \gg N$, we deduce the result. For the result with $\theta = 3300/3019$ we must choose z so that in (22.3.10)
M >>(N)
6
ME, (22.4.4)
and in (22.2.9) z3
z
M w >>
20/7
(M 1 2 M49/165+e + M )20/13 M1/3+e ,+,
N
N
M4 /21 + E
N (M) (22.4.5)
For M larger than the range (22.4.2) we take
()6_
to satisfy (22.4.4). When we substitute into (22.4.5), then the strongest condition is
$$M \ll N^{3300/3019-\varepsilon}, \tag{22.4.6}$$
and both (22.1.10) and (22.1.11) with
ux
Me
hold for M<
a weaker condition.
We used Vaughan's three-factor combinatorics to replace sums with the von Mangoldt function $\Lambda(m)$ by more tractable exponential sums. Using a five-factor identity of Heath-Brown (1983) lets us take $P \gg z$ in the short type I sums, increasing $\theta$ from 3300/3019 to 330/301.
Part VI Related work and further ideas
23 Related work
23.1 INTEGER POINTS CLOSE TO A CURVE
Bombieri and Pila (1989) have improved Swinnerton-Dyer's estimate for the
number of points on a convex curve when the curve is very smooth. The argument again takes a 'divide and conquer' form.
D. Divide the curve into arcs.
A. Show that for each arc there is some algebraic curve P(x, y) = 0 passing through all the integer points on the arc, where P(x, y) is a polynomial with integer coefficients.
C. Count the integer points on the corresponding arc of the algebraic curve.
With L = M = T, $\delta = 0$ in the notation of Chapter 3, the results take the form
$$R \le BM^{\theta+\varepsilon}, \tag{23.1.1}$$
where $\varepsilon$ is positive, and can be taken arbitrarily small. The constant B depends on $\varepsilon$. Irreducible algebraic curves (defined by P(x, y) = 0 where P(x, y) is an irreducible polynomial) are classified by an integer called the genus. Curves
with genus zero can be parameterized by rational functions x = f(t)/h(t), y = g(t)/h(t), where f(t), g(t), and h(t) are polynomials with integer coefficients. Suppose that we have the simplest such parameterization. Then (23.1.1) holds with $\theta = 0$ unless h(t) is a constant. When h(t) is a constant, then (23.1.1) holds with $\theta = 1/d$, where d = max(deg f, deg g). For genus 1, R is unbounded as M tends to infinity, but (23.1.1) holds with $\theta = 0$ and B depending on the curve as well as on $\varepsilon$. Swinnerton-Dyer uses the determinants
$$\Delta = \begin{vmatrix}1 & m_1 & n_1\\ 1 & m_2 & n_2\\ 1 & m_3 & n_3\end{vmatrix}.$$
If $(m_i, n_i)$ are integer points on or close to a very short arc of a curve, then $\Delta$
is numerically small. Since $\Delta$ is an integer, then $\Delta$ must be zero, and the three points must lie on a straight line. Bombieri and Pila use
$$\Delta = \begin{vmatrix}1 & m_1 & n_1 & m_1^2 & \cdots & m_1n_1^{d-1} & n_1^d\\ \vdots & & & & & & \vdots\\ 1 & m_D & n_D & m_D^2 & \cdots & m_Dn_D^{d-1} & n_D^d\end{vmatrix},$$
where D is the number of terms $m^in^j$ of degree at most d, which is the binomial coefficient
$$D = \binom{d+2}{2} = \frac{(d+1)(d+2)}{2}. \tag{23.1.2}$$
The conclusion from $\Delta = 0$ is that the integer points $(m_1, n_1), \ldots, (m_D, n_D)$ lie
on an algebraic curve defined by an equation P(x, y) = 0 of degree d with integer coefficients.
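Both determinants can be evaluated exactly for small examples. The sketch below is an illustration under stated assumptions: the helper names and the ordering of monomials by total degree are choices made here, and exact rational elimination stands in for any more efficient method. It confirms that D = 325 for d = 24, that the 3 × 3 determinant vanishes for collinear points, and that six integer points on the parabola $n = m^2$ give a vanishing 6 × 6 determinant for d = 2, since they lie on a conic.

```python
from fractions import Fraction
from math import comb

def monomial_count(d):
    """Number D of monomials m^i n^j of degree at most d."""
    return comb(d + 2, 2)

def exact_det(rows):
    """Determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in rows]
    size, det = len(a), Fraction(1)
    for c in range(size):
        pivot = next((r for r in range(c, size) if a[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            det = -det
        det *= a[c][c]
        for r in range(c + 1, size):
            factor = a[r][c] / a[c][c]
            for k in range(c, size):
                a[r][k] -= factor * a[c][k]
    return det

def collinearity_det(p1, p2, p3):
    """The 3 x 3 determinant with rows (1, m_i, n_i)."""
    return exact_det([[1, m, n] for (m, n) in (p1, p2, p3)])

def monomial_det(points, d):
    """The D x D determinant with rows of monomials m^i n^j, i + j <= d."""
    exps = [(t - j, j) for t in range(d + 1) for j in range(t + 1)]
    assert len(points) == len(exps) == monomial_count(d)
    return exact_det([[m ** i * n ** j for (i, j) in exps] for (m, n) in points])

parabola = [(x, x * x) for x in range(6)]          # all on the conic n = m^2
cubic = [(x, x ** 3) for x in (0, 1, 2, 3, -1, -2)]  # not on any conic
print(monomial_count(24), monomial_det(parabola, 2), monomial_det(cubic, 2))
```

For the cubic points a conic through all six would need the x-coordinates to sum to zero, which they do not, so that determinant is non-zero.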
Suppose that F(x) is infinitely differentiable and equal to the sum of its Taylor series (strictly, for each point a there is a neighbourhood of a on which the Taylor series at a converges and equals F(x)). Suppose that G(x) is an algebraic function (not necessarily defined over the rationals). If F(x) = G(x) for infinitely many values of x with $0 \le x \le 1$, then there is some
point at which all the Taylor coefficients of F(x) - G(x) are zero, so that F(x) is the same as G(x). Consider all implicit real equations
$$0 = P(x, y) = \sum_{i=0}^{d}\sum_{j=0}^{d-i}a_{ij}x^iy^j$$
of degree d, normalized by
$$\sum_i\sum_j a_{ij}^2 = 1.$$
The number of intersections of y = F(x) and P(x, y) = 0 in $0 \le x \le 1$ is a function $N(\alpha)$ of the coefficients $a_{ij}$, continuous except at sets of coefficients where the two curves touch. For each value $\alpha = \beta$ there is a neighbourhood of $\beta$ in D-dimensional space on which $N(\alpha) \le N(\beta)$. Since the unit sphere is compact, we see that there is a maximum number of intersections. This is an ineffective argument: we cannot find the maximum number of intersections explicitly. For given $\varepsilon > 0$, choosing d about $3/\varepsilon$ leads to (23.1.1) with $\theta = 0$, and the constant B depending heavily on the function F(x) as well as on $\varepsilon$.
If y = F(x) is a solution of Q(x, y) = 0, where Q(x, y) is an irreducible polynomial of degree e > d, then the curves P(x, y) = 0, Q(x, y) = 0 intersect in at most de points. Bombieri and Pila show by induction that if 0 < F'(x) < 1 for $0 \le x \le 1$, then (23.1.1) holds with $\theta = 1/e$, and B depending only on $\varepsilon$. Finally they give a uniform result for an arbitrary function F(x). If d and
D are related by (23.1.2), and if F(x) is D times continuously differentiable, then (23.1.1) holds with
$$\theta = \frac12 + \frac{8}{3(d+3)} \tag{23.1.3}$$
and with the constant B depending on the function F(x) only by way of an upper bound for the derivatives of F(x). There is a similar result with B depending on the number of zeros of $F^{(D)}(x)$. In principle the method extends to integer points at most $\delta$ from the curve. But $\delta$ must be extremely small for us to conclude that the determinant $\Delta$ is numerically less than unity. If d = 24, D = 325 (the first case in which (23.1.3) gives $\theta < 3/5$), then we require $\delta \ll M^{-52649}$.
Sargos has another approach to the problem of integer points close to a curve, based on the interpolation polynomial used in Lemma 3.2.2. The curve
is y = f(x), with $f^{(n)}$ non-zero on an interval of length L. For n = 2 we considered the integer points close to the curve as a polygon. For any n + 1 points, there is a curve y = g(x) passing through them, given by the interpolation polynomial, of degree at most n. Each set of n + 1 consecutive integer points close to the curve defines an interpolation polynomial curve, which we call the Sargos spline. A general integer point lies on n + 1 Sargos splines. A spline is minor if it has degree n, major if it has degree at most n - 1. The coefficients of a Sargos spline polynomial are rational, and the coefficient of $x^{n-1}$ in a major spline is a rational approximation a/q to $f^{(n-1)}(x)/(n-1)!$.
In the case n = 2 several integer points can lie on the same side of the polygon. Similarly several consecutive integer points can lie on the same Sargos spline. If n + 1 or more integer points lie close to the curve in a short interval, then they must lie on the same major spline.
Theorem 23.1.1 (Huxley and Sargos 1995) Let $\delta$, $\Delta$, and L be positive real numbers, with $\delta < \tfrac12$ and $L \ge 1$, and let $n \ge 3$. Let f(x) be a real function with n continuous derivatives on an interval I of length L, with
$$|f^{(n)}(x)| \asymp \Delta \tag{23.1.4}$$
on I. Let R be the number of pairs of integers (l, m) with m in I and
$$|l - f(m)| \le \delta.$$
Let C be any positive real number. Then
$$R \ll \delta^{2/n(n-1)}L + \Delta^{2/n(n+1)}L + \Bigl(\frac{\delta}{\Delta}\Bigr)^{1/n} + 1.$$
The implied constant depends on C, n, and on the implied constants in (23.1.4).
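The quantity R in Theorem 23.1.1 can be computed by brute force for a specific curve, which is useful for comparing against the bound. The sketch below uses exact rational arithmetic to avoid rounding at the boundary $|l - f(m)| = \delta$; the test curve $f(m) = m^3/8$, the interval, and the function name are illustrative assumptions only.

```python
from fractions import Fraction
import math

def count_close_points(f, m_values, delta):
    """Number of integer pairs (l, m) with m in m_values and |l - f(m)| <= delta."""
    count = 0
    for m in m_values:
        v = f(m)
        lowest = math.ceil(v - delta)    # smallest integer l with l >= f(m) - delta
        highest = math.floor(v + delta)  # largest integer l with l <= f(m) + delta
        count += max(0, highest - lowest + 1)
    return count

def cubic_over_eight(m):
    return Fraction(m ** 3, 8)

print(count_close_points(cubic_over_eight, range(8), Fraction(0)))
print(count_close_points(cubic_over_eight, range(8), Fraction(1, 8)))
```

With $\delta < \tfrac12$ each m contributes at most one l, so R is at most the number of integers in I.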
Theorem 23.1.1 improves Theorem 3.3.1 for large $\delta$; we have suppressed T
and M from the notation. In an iteration of the type in Section 3.4 which ends with an appeal to Lemma 3.1.2, we should replace the last iteration step by Theorem 23.1.1. More complicated results, in which an extra derivative is assumed not to vanish, will be found in Filaseta and Trifonov (1996) and Huxley and Trifonov (1995); they correspond to Theorems 3.3.1 and 3.6.3 respectively.
23.2 THE HARDY-LITTLEWOOD METHOD
Our guiding principle has been that in problems with both continuous and discrete aspects, behaviour at real numbers is related to the behaviour at nearby rational numbers. In the large sieve inequality the comparison goes the other way, that an average over the Farey sequence modulo one can be bounded in terms of averages over all real numbers modulo one.
The Hardy-Littlewood method is one of the great themes of number theory this century. In its simplest form we count the number r(n) of ways that an integer n is represented in a certain form. Classically we form the generating function $f(z) = \sum r(n)z^n$. Usually this series converges inside the unit circle and tends to infinity for z = re(a/q), where a/q is rational, and r tends to one from below. The unit circle is a natural boundary. We recover the coefficients from $2\pi i\,r(n) = \int f(z)z^{-n-1}\,dz$, where the integral goes around the origin along a circle of radius slightly less than one. The circle is divided
into Farey arcs I(a/q). When a/q has small height (major arcs), then I(a/q) contributes approximately $2\pi i$ times the residue at e(a/q), defined as
the limit of (z - e(a/q))f(z). When a/q has large height (minor arcs), then we use an upper bound for |f(z)|, and we count the contribution as an error term. The major arc contributions are evaluated approximately, so there may be cancellation between different arcs I(a/q), first between arcs with the same denominator q, then between the sets of Farey arcs with different denominators. After order-of-magnitude factors are taken out, these contributions form the so-called singular series, with terms indexed by the Farey sequence modulo one. The singular series converges in most applications; if r(n) is zero for congruence reasons, then the singular series is found to converge to zero. Often the series factorizes into factors associated with each prime p, which can be thought of as the measure of the set of solutions of a congruence.
A simplification is to truncate the generating function f(z) to a finite
sum, and work on the unit circle with $S(\alpha) = \sum r(n)e(n\alpha)$, so $r(n) = \int S(\alpha)e(-n\alpha)\,d\alpha$, integrated over the real numbers modulo one. When the generating function is a product, then the truncation takes place in each factor. Complex analysis is replaced by Fourier theory. Inequalities can be treated analogously. This is the form of the Hardy-Littlewood method used
in Chapter 12. Many complications were avoided in that chapter, as only an upper bound was required. There is a discrete form of the method, where the integral from 0 to 1 is replaced by the Riemann sum with step length 1/N for N sufficiently large. It seems to offer no advantage in practice.

When the generating function factorizes into several factors, then there is an elegant treatment on the minor arcs. Instead of multiplying an upper bound for each factor, we take some factors out at their maximum, leaving the integral of the modulus of the other factors on the minor arcs. Hölder's inequality estimates this integral in terms of one or more integrals of the modulus squared of exponential sums. The range of integration is extended from the minor arcs to the whole unit interval. The modulus-squared integral becomes the counting function for a simpler problem. In Waring's famous problem of expressing n as $x_1^k + \cdots + x_s^k$, this leads to Vinogradov's mean value theorem (see Vaughan 1981), an analogue of the First Spacing Problem in Part III.
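The truncated form of the method, in its discrete version with step length 1/N, can be illustrated numerically. The following is a minimal sketch only, under assumptions that are ours and not the text's: the representation problem is taken to be n = x^2 + y^2 with 0 <= x, y <= X, the function names are hypothetical, and N = 2X^2 + 1 is chosen so that no wrap-around modulo N occurs for 0 <= n <= 2X^2.

```python
import cmath


def e(x):
    """The usual abbreviation e(x) = exp(2*pi*i*x)."""
    return cmath.exp(2j * cmath.pi * x)


def count_two_squares(n, X):
    """Ordered pairs (x, y), 0 <= x, y <= X, with x^2 + y^2 = n.

    Uses the Riemann-sum form r(n) = (1/N) * sum_j S(j/N)^2 e(-n j/N),
    where S(alpha) = sum_x e(alpha x^2) is the truncated generating function.
    Assumes 0 <= n <= 2*X*X (illustrative restriction, avoids aliasing).
    """
    N = 2 * X * X + 1  # larger than every value taken by x^2 + y^2
    total = 0
    for j in range(N):
        alpha = j / N
        S = sum(e(alpha * x * x) for x in range(X + 1))
        total += S * S * e(-n * alpha)
    return round((total / N).real)


def count_directly(n, X):
    """The same count by brute force, for comparison."""
    return sum(1 for x in range(X + 1) for y in range(X + 1)
               if x * x + y * y == n)
```

For example, `count_two_squares(25, 5)` agrees with the direct count of the pairs (0, 5), (5, 0), (3, 4), (4, 3).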
In delicate investigations we must treat all arcs as major arcs, to get cancellation over different arcs I(a/q) with the same denominator q. The endpoints of I(a/q) depend on the neighbouring Farey fractions. When q is large, then for some values of a, the arc I(a/q) does not even appear: for a = 1 and q large, the rational 1/q lies in the arc I(0/1). In this version of the Hardy-Littlewood method, each Farey arc gives a main term, contributing to the singular series, a remainder term from the approximation, and a correction term involving the adjacent Farey fractions. Cancellation in the correction terms is called the Kloosterman refinement.

In the Hardy-Littlewood method for d simultaneous equations, there are d integrations. An equation in an algebraic number field of degree d corresponds to d simultaneous equations involving the coefficients with respect to a basis for the integers. The role of the rational numbers a/q is given to the characters $\sigma(\nu)$ of the additive group modulo an ideal I. The norm of the ideal corresponds to the denominator q. The characters can be expressed as $\sigma(\nu) = e(\operatorname{Trace} \alpha\nu)$, for some number $\alpha$ in the field; the denominator of $\alpha$ may involve a different ideal.
23.3 OTHER FAREY ARC ARGUMENTS

Iwaniec, lecturing in Göttingen in 1992, suggested a discrete form of the Hardy-Littlewood method with the Kloosterman corrections. He requires a weight function w(t) with w(t) = w(-t), w(0) = 0, and
$$\sum_{n=1}^{\infty} w(n) = 1.$$
If n is non-zero, then
$$\sum_{k \mid n} w(k) = \sum_{k \mid n} w\!\left(\frac{n}{k}\right)$$
by symmetry, whilst if n = 0, then
$$\sum_{k \mid n} w(k) = 1, \qquad \sum_{k \mid n} w\!\left(\frac{n}{k}\right) = 0.$$
Using Kronecker's notation $\delta(n)$ for the function which is 1 for n = 0, 0 for all other integers, we have
$$\delta(n) = \sum_{k \mid n} \left(w(k) - w\!\left(\frac{n}{k}\right)\right)
= \sum_{k=1}^{\infty} \frac{1}{k} \sum_{b \bmod k} e\!\left(\frac{bn}{k}\right) \left(w(k) - w\!\left(\frac{n}{k}\right)\right)$$
$$= \sum_{k=1}^{\infty} \frac{1}{k} \sum_{q \mid k}\ \sideset{}{^*}\sum_{a \bmod q} e\!\left(\frac{an}{q}\right) \left(w(k) - w\!\left(\frac{n}{k}\right)\right)
= \sum_{q=1}^{\infty}\ \sideset{}{^*}\sum_{a \bmod q} e\!\left(\frac{an}{q}\right) \sum_{k \equiv 0 \,(\mathrm{mod}\ q)} \frac{1}{k}\left(w(k) - w\!\left(\frac{n}{k}\right)\right),$$
where the star indicates a sum over a with (a, q) = 1, and $W_q(n)$ denotes the sum over k for fixed n and q, regarded as a weight function. To count solutions of
$$g_1 + \cdots + g_r = h_1 + \cdots + h_r,$$
we put
$$n = g_1 + \cdots + g_r - h_1 - \cdots - h_r,$$
and sum over $g_1, \ldots, g_r, h_1, \ldots, h_r$. Although the exponential e(an/q) factorizes, the weight function $W_q(n)$ does not. The weight $W_q(n)$ makes the expression a convolution of generating functions rather than a simple product.

Everest (1989) has discovered a major and minor arc phenomenon when studying the distribution of units in an algebraic number field. Let S be a finite set of prime ideals. An S-unit is an algebraic integer with no prime factors outside S. If S is an empty set, then we get the units of the field. The S-units form a group generated by a root of unity and a finite number (r, say) of generators of infinite order. Let e be the number of embeddings of the field into the real or complex numbers. For each S-unit $\alpha$, we form the
vector with e entries $\log|\sigma(\alpha)|$, where $\sigma$ runs through the embedding maps. The S-units are mapped to an additive group of vectors in an (e - 1)-dimensional subspace. The roots of unity map to the zero vector, and it can be proved that no other S-unit maps to zero. These vectors form the projection into (e - 1)-dimensional space of an r-dimensional lattice. The distribution is non-uniform: the r generators $v_1, \ldots, v_r$ have particular lengths and directions, so the shorter vectors have more integer multiples near the origin, and directions are favoured which correspond to linear combinations $a_1 v_1 + \cdots + a_r v_r$ for which $(a_1, \ldots, a_r)$ is a primitive integer vector of small height, the major directions. The major directions determine the approximate distribution of solutions of the S-unit equation
$$\beta_1 x_1 + \cdots + \beta_n x_n = 0,$$
where the variables $x_1, \ldots, x_n$ are S-units, and the constants $\beta_1, \ldots, \beta_n$ are integers of the field.

A recent paper of Baker and Harman (1991) makes an unexpected use of dissection into Farey arcs.

Theorem 23.3.1 (Baker and Harman) Let $\alpha$ be irrational, $\beta$ be real. There are infinitely many primes p with
$$\|\alpha p^k + \beta\| < 1/p^{\theta}$$
when k = 2, $\theta < 3/20$, or $k \ge 3$, $\theta < 1/(3 \times 2^{k-1})$.

Sketch proof

1. Approximate $\alpha$ by A/Q using Dirichlet's theorem (Lemma 1.5.1).
2. Express the condition in terms of exponential sums using Lemma 5.3.2.
3. Replace a sum over primes by a sum over integers m with weights $\Lambda(m)$, and expand $\Lambda(m)$ by a method similar to that in Chapter 22.
4. Since $x^k$ is a monomial, the factors can be grouped in various ways to give triple sums involving $e(Ahx^ky^k/Q)$.
5. For each y, approximate $Ay^k/Q$ by a/q in a Farey sequence of smaller order.
6. Difference k - 1 times in x using Lemma 5.6.2, to get sums involving e(bx/q), which can be evaluated, and are small for most values of b modulo q. The integer parameter b comes from combining h and the differencing parameters as in Lemma 17.4.1 and Theorem 18.5.3. □
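Iwaniec's expansion of Kronecker's function $\delta(n)$, described at the start of this section, can be checked numerically. The sketch below is illustrative only: the particular weight w (constant on 4 < |t| <= 8) and the function names are our assumptions, chosen so that w is even, w(0) = 0, and the sum of w(k) over positive integers k is 1.

```python
import cmath
from math import gcd

K = 8  # hypothetical cut-off for the support of the weight


def w(t):
    """Even weight with w(0) = 0 and sum_{k >= 1} w(k) = 1 (k = 5, 6, 7, 8)."""
    return 0.25 if 4 < abs(t) <= K else 0.0


def delta_divisor_form(n):
    """delta(n) as the sum over k | n of w(k) - w(n/k)."""
    if n == 0:
        # Every k >= 1 divides 0, and w(0/k) = w(0) = 0.
        return sum(w(k) for k in range(1, K + 1))
    return sum(w(k) - w(n / k) for k in range(1, abs(n) + 1) if n % k == 0)


def W(q, n, kmax):
    """The weight W_q(n): sum over k = 0 (mod q) of (w(k) - w(n/k)) / k."""
    return sum((w(k) - w(n / k)) / k for k in range(q, kmax + 1, q))


def delta_exponential_form(n):
    """delta(n) as a sum over q and reduced residues a mod q of e(an/q) W_q(n)."""
    kmax = max(K, abs(n))  # w(k) and w(n/k) both vanish for k beyond this
    total = 0
    for q in range(1, kmax + 1):
        ramanujan = sum(cmath.exp(2j * cmath.pi * a * n / q)
                        for a in range(1, q + 1) if gcd(a, q) == 1)
        total += ramanujan * W(q, n, kmax)
    return total
```

Both forms return 1 at n = 0 and 0 (up to rounding error in the exponential form) at every other integer tested.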
23.4 HIGHER DIMENSIONS

Much less is known of lattice point problems in three or more dimensions. It is easy to approximate a plane curve by a polygon, because the gradients correspond to the real line, which is ordered. A polyhedron with rational
normals that approximates a surface is more difficult to construct. However Kendall's Fourier series method used in Chapter 6 does extend to higher dimensions (Hlawka 1950; Herz 1962a, 1962b). The expected number of lattice points inside a smooth closed surface containing a volume V in n dimensions, expanded by a linear factor M, is $VM^n$. The easy estimate for the discrepancy D(M), corresponding to Lemma 2.1.1, is $O(M^{n-1})$, and the mean-to-max argument replaces the exponent n - 1 by n - 2 + 2/(n + 1). The limitation result corresponding to Theorem 6.3.1 states that the exponent must be at least (n - 1)/2. Krätzel and Nowak (1991, 1992) have improved the exponent by grouping the Fourier series terms to form suitable two-dimensional exponential sums.

Theorem 23.4.1 (Krätzel and Nowak) Let S be a closed surface in n dimensions with non-zero Gaussian curvature, with an infinitely differentiable one-one correspondence between points on the surface and their outward normal vectors. Let D(M) be the discrepancy of the interior of the M times enlarged surface MS. Then $D(M) \ll M^{\theta} (\log M)^A$ with $A \le 1$ and
$$\theta = \begin{cases} n - 2 + 8/(5n + 2) & \text{for } n = 3, \ldots, 6, \\ n - 2 + 3/2n & \text{for } n \ge 7. \end{cases}$$
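The discrepancy D(M) can be computed directly in a simple case. The following sketch assumes, purely for illustration, that the surface is the unit sphere in three dimensions and that M is a non-negative integer; the function names are hypothetical. It counts lattice points in the ball of radius M and subtracts the volume $4\pi M^3/3$.

```python
from math import isqrt, pi


def lattice_points_in_ball(M):
    """Number of integer points (x, y, z) with x^2 + y^2 + z^2 <= M^2."""
    count = 0
    for x in range(-M, M + 1):
        for y in range(-M, M + 1):
            rest = M * M - x * x - y * y
            if rest >= 0:
                # z ranges over -isqrt(rest), ..., isqrt(rest)
                count += 2 * isqrt(rest) + 1
    return count


def discrepancy(M):
    """D(M) = (number of lattice points) - (volume) for the ball of radius M."""
    return lattice_points_in_ball(M) - 4 * pi * M ** 3 / 3
```

For small radii the count is easy to confirm by hand: radius 1 gives 7 points and radius 2 gives 33 points.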
Exponential sums with two or more variables present new difficulties. Some difficulties already appear in the corresponding exponential integrals $\iint e(f(x, y))\,dx\,dy$. After we integrate by parts in x, the exponent for the integral over y depends on the limits of integration for x, and so on the shape of the region of integration. Even for rectangles, there is no saving in the y-integration if f(x, y) = g(x), a function of x alone. If f(x, y) = g(x + y), then we can use a repeated First Derivative Test as in Chapter 21, but if g' has a turning point, then the exponential integral can be evaluated using Lemma 5.5.2, and again there is no saving in the y-integration. The most favourable case is f(x, y) = g(x) + g(y), when the variables separate over a rectangle. If the integrand has a local maximum or minimum, then we should get an asymptotic formula like Lemma 5.5.2. Since $\iint e(x^2 + y^2)\,dx\,dy$ and its analogues in several variables do not converge at infinity, non-vanishing conditions are required for the third or higher derivatives, and the error term is proportionately worse than in Lemma 5.5.2.

The van der Corput iteration (see Graham and Kolesnik 1991) consists of two types of step, followed by a trivial estimate or an appeal to Theorem 17.2.2, with one summand as m, and another corresponding to the parameter. The differencing step involves f(x + h) - f(x - h), where h is a vector in a box with centre at the origin. The box may have sides of unequal length; if a side has length one, then the corresponding component of h is zero, and the corresponding variable is fixed. The reflection step is Poisson summation (Lemma 5.5.3) in a subset of the variables, and it introduces all the difficulties of estimating exponential integrals. Even for a rectangle, the reflected sum
may be over a region of complicated shape, as we saw in Chapter 8, and the error terms and the truncation error in the Poisson summation formula may add up to more than the estimate for the reflected sum. Another difficulty is that the reflected sum, when treated as a family of exponential sums with a parameter, may be non-standard. A standard sum is one with length M and the rth derivative of order $T/M^r$ for all required values of r. A sum becomes non-standard if the length is much shorter than M, or if certain derivatives are too small or too large. Krätzel (1988) explains this method for many classical problems.
23.5 EXPONENTIAL SUMS WITH MONOMIALS

Many applications of exponential sums in number theory involve monomial
functions in the sense of Chapter 3. When expressions like o-(dmr) in Chapter 22 are expanded as Fourier series, then the exponent in each term can be factorized as a product of powers of d, m, and r. The integer variables can be separated. They can often be recombined in ingenious ways, as in the proof of Theorem 17.2.2. A lesser advantage of monomial functions is that inverse functions and derivatives are easily found. Fouvry and Iwaniec (1989) have shown how useful the large sieve (Lemma
5.6.6) can be for exponential sums with monomial functions in several variables. As an example we sketch their Theorem 3, which is used by Liu and Rivat (1992) in the Piatecki-Shapiro problem of prime integer points close to a curve, considered in Chapter 22, when the curve is a power function y = x. That Fouvry and Iwaniec's method actually gives a slightly stronger result was noticed by Liu and Rivat. We indicate such an improvement at the end of the proof.
Theorem 23.5.1 (Fouvry and Iwaniec) For K, L, M positive integers, not all unity, T large and positive, $\alpha$, $\beta$, $\gamma$ non-zero real numbers, let
$$S = \sum_{k=K}^{2K-1} \sum_{l=L}^{2L-1} \sum_{m=M}^{2M-1} a(m)\, b(k, l)\, e\!\left(\frac{T m^{\alpha} k^{\beta} l^{\gamma}}{M^{\alpha} K^{\beta} L^{\gamma}}\right),$$
where a(m), b(k, l) are any coefficients. Then, for $\alpha \ne 1$ we have
$$S \ll (KLM)^{1/2} \left(\frac{T^{1/4} + M^{1/2}}{(KL)^{1/4}} + M^{1/5} + \frac{M^{2/5}}{T^{1/4}}\right) \max_m |a(m)| \left(\sum_k \sum_l |b(k, l)|^2\right)^{1/2} \log^2 KLM. \qquad (23.5.1)$$
The implied constant depends on $\alpha$, $\beta$, and $\gamma$.

Sketch proof

Step One The triple sum S is type II, so we apply Lemma 5.6.1 with the
double sum over k and l as the outer sum, followed by Lemma 5.6.2 applied to the inner sum over m. This produces a fourfold sum involving e(x(d, m)y(k, l)), with
$$x(d, m) = T\big((m + d)^{\alpha} - (m - d)^{\alpha}\big)/M^{\alpha}, \qquad y(k, l) = k^{\beta} l^{\gamma}/K^{\beta} L^{\gamma}.$$

Step Two We apply the large sieve (Lemma 5.6.6) in one dimension to blocks of terms $H \le d \le \min(2H - 1, D)$.

Step Three The Second Spacing Problem is to count the number of sets of integers $k_1, l_1, k_2, l_2$ with
$$k_1^{\beta} l_1^{\gamma} = k_2^{\beta} l_2^{\gamma} + O(\Delta K^{\beta} L^{\gamma}), \qquad (23.5.2)$$
where $\Delta \asymp M/HT$. Let $k = (k_1, k_2)$, $l = (l_1, l_2)$, $k_1/k_2 = r/s$, $l_2/l_1 = r'/s'$ in lowest terms. For fixed l there are $O(L^2/l^2)$ possibilities for r'/s', so r/s lies in an interval of length $O(\Delta)$ by (23.5.2). For k fixed there are now $O(1 + \Delta K^2/k^2)$ possibilities for r/s by Lemma 1.2.3, so we get
$$O\!\left(\frac{L^2}{l^2} + \frac{\Delta K^2 L^2}{k^2 l^2}\right) \qquad (23.5.3)$$
possibilities for r/s and r'/s'. By symmetry, we can replace (23.5.3) by
$$O\!\left(\frac{\Delta K^2 L^2}{k^2 l^2} + \min\!\left(\frac{K^2}{k^2}, \frac{L^2}{l^2}\right)\right),$$
which sums over k and l to give
$$B \ll \Delta K^2 L^2 + KL \log KLM, \qquad (23.5.4)$$
in the notation of Lemma 5.6.6.

Step Four The First Spacing Problem is to count the number of solutions of
$$(m_1 + d_1)^{\alpha} - (m_1 - d_1)^{\alpha} = (m_2 + d_2)^{\alpha} - (m_2 - d_2)^{\alpha} + O(\Delta H M^{\alpha - 1}). \qquad (23.5.5)$$
For small H we write
$$(m + d)^{\alpha} - (m - d)^{\alpha} = 2\alpha d m^{\alpha - 1} + O(H^3 M^{\alpha - 3}),$$
and the argument used for step 3 gives
$$A \ll HM \log KLM + \Delta H^2 M^2 + H^4.$$
For large H we write
$$(m + d)^{\alpha} - (m - d)^{\alpha} = 2\alpha d m^{\alpha - 1} + \frac{\alpha(\alpha - 1)(\alpha - 2)}{3}\, d^3 m^{\alpha - 3} + O(H^5 M^{\alpha - 5}). \qquad (23.5.6)$$
We fix $d_1$ and $d_2$, and put $\lambda = (d_1/d_2)^{1/(\alpha - 1)}$. Then (23.5.5) implies
$$\frac{m_2}{m_1} = \lambda\left(1 + \frac{\alpha - 2}{6}\,\frac{d_1^2}{m_1^2}\left(1 - \frac{1}{\lambda^{2\alpha}}\right)\right) + O\!\left(\Delta + \frac{H^4}{M^4}\right). \qquad (23.5.7)$$
Fouvry and Iwaniec study (23.5.7) by exponential sums. We can also use Farey sequence methods. For fixed $\lambda$, the fraction $m_2/m_1$ lies in a short interval. We take e/r to be the rational number of least height in this interval, and let f/s, f'/s' be its neighbours in the Farey sequence $\mathcal{F}(r)$. Now $m_2/m_1$ lies in one of two reference intervals. For $m_2/m_1$ between e/r and f/s, we have $m_1 = ru + st$, $m_2 = eu + ft$ for some integers t, u as in Lemma 1.2.2, but the common factor $(t, u) = (m_1, m_2)$ may be greater than one. We can also write
$$\lambda = (e + f\theta)/(r + s\theta),$$
where $\theta$ may be negative. Substituting in (23.5.7), we find that $(t - \theta u)(ru + st)$ lies in an interval whose length is proportional to $\lambda$. This is an integer point close to a curve problem. The trivial estimate $O((\delta + 1)(L + 1))$ (in the notation of Chapter 3) gives
$$O(\Delta M^2 + H^4/M^2 + M/r),$$
and by Lemma 1.2.3, we have $r \ll R$ for O(HR) pairs $d_1$, $d_2$, and so
$$A \ll HM \log KLM + \Delta H^2 M^2 + H^6/M^2. \qquad (23.5.8)$$
Fouvry and Iwaniec impose a condition $H \ll M^{3/5}$ to remove the last term in (23.5.8). If we retain this term, and optimize the choice of D, the upper bound for H, then the terms $M^{1/5} + M^{2/5}/T^{1/4}$ in (23.5.1) may be replaced by
$$\frac{M^{1/7} T^{1/14}}{(KL)^{1/14}} + M^{1/6} + \frac{M^{1/2}}{T^{1/5}} + \frac{M^{1/2}}{(KL)^{1/24} T^{1/6}}.$$
□
24 Further ideas

24.1 COMMENTS ON THE METHOD

Mathematics delights in general principles and specific examples. The ideas of Bombieri and Iwaniec (1986a) for a specific problem have grown to a general method. A part of mathematics may not die, but it can hibernate, either because all the main results appear to be known, or because the subject has become too complicated. The Bombieri-Iwaniec method is not yet in its final form: we end this book with some suggestions for improving or extending it.

Complications
There are two main complications: short sums (over a subinterval of M to 2M) and a family of sums indexed by values of a parameter. We have not tried to combine these in all possible ways, because of the large number of
cases. In applications the sizes of the parameters S, I, J, M, and T are known approximately, so most cases do not occur. We have said nothing about congruence families of double sums. Two particular questions arise:
1. What happens when different values of the parameter correspond to different short subintervals of M to 2M?
2. Is there a saving over two independent parameters which cannot be combined into one as in the proof of Lemma 17.4.1?

Problems of technique

Neither Spacing Problem is completely solved:
1. Find a good estimate for N12(81 0) in Chapter 12. 2. Can one handle the perturbed inequalities in Section 13.3? 3. The condition R2 >> N in Chapter 7 can probably be dropped, at the cost of writing f" (m)/2 = a/q + A, and carrying A through the Coincidence Conditions and all subsequent calculation. 4. Kolesnik's formulation of the Third and Fourth Conditions in Chapter 16
partly explains why the arguments of Chapter 15 work, but it loses accuracy in the Third Condition. Must accuracy be lost? The Kolesnik conditions are more elegant. Sometimes a method is first simplified, then extended in a way which was always possible, but looked impossibly difficult before.
5. Can one show that most magic matrices do not give a coincidence?
6. Can one show that most minor arcs do not give a coincidence, using a divide-and-conquer argument that is not based on magic matrices?

Extensions

It would be very interesting to find other sums which can be broken into Farey arcs this way. One such is
$$\sum_h \sum_k \sum_m e\big(hf(m + k) - hf(m - k)\big),$$
which corresponds to the mean square of a lattice point discrepancy. But estimates strong enough to improve Theorem 6.3.4 seem unlikely by this method.
24.2 SUBDIVISION WITHOUT ABSOLUTE VALUES

Dividing a sum of length M into about M/N short sums of length N or less, which are estimated in modulus, cannot give an upper bound better than
$$O\!\left(\sqrt{N} \cdot \frac{M}{N}\right) = O\!\left(\frac{M}{\sqrt{N}}\right).$$
By Theorem 6.1.1, the sum is almost always $O(M^{1/2 + \epsilon})$, so there must be cancellation between the sums over different Farey arcs. The Hardy-Littlewood method obtains this cancellation on the major arcs. With the Kloosterman cancellation, all Farey arcs can be treated as major arcs. The difficulty is transferred to estimating the Kloosterman corrections. To apply the same programme to the Bombieri-Iwaniec method for single sums, we need to know f(m) modulo one at the centres of the major arcs. To describe how they change, we will need more terms of the Taylor expansion of f(x). The double sum of Chapter 8 may be easier, as f(m) already occurs in the calculation.

In any argument analogous to the Kloosterman refinement in the Hardy-Littlewood method, the division into Farey arcs must be made arithmetically, corresponding to some Farey sequence $\mathcal{F}(R)$.

First method

If y lies between the consecutive Farey fractions $\phi_{r-1} = a_{r-1}/q_{r-1}$, $\phi_r = a_r/q_r$, then
$$q_{r-1}(a_r - q_r y) + q_r(q_{r-1} y - a_{r-1}) = 1$$
1/q,_ 1
and
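The Farey arc sum on $I_r = I(a_r/q_r)$ is now
$$\sum_{\substack{\phi_{r-1} \le f''(m) \le \phi_r \\ M \le m \le M_2}} q_r\big(q_{r-1} f''(m) - a_{r-1}\big)\, e(f(m)) + \sum_{\substack{\phi_r \le f''(m) \le \phi_{r+1} \\ M \le m \le M_2}} q_r\big(a_{r+1} - q_{r+1} f''(m)\big)\, e(f(m)) = \sum_{M \le m \le M_2} k_r\big(f''(m) - \phi_r\big)\, e(f(m)),$$
where $k_r(x)$ is a piecewise linear function with Fourier transform
$$\hat{k}_r(s) = -\frac{q_r}{4\pi^2 s^2}\left(q_{r+1}\, e\!\left(\frac{s}{q_r q_{r+1}}\right) + q_{r-1}\, e\!\left(\frac{-s}{q_r q_{r-1}}\right) - q_{r+1} - q_{r-1}\right).$$
When $M_r$ is the centre of the Farey arc $I_r$, and $m = M_r + n$, then
$$k_r\big(f''(m) - \phi_r\big) = \int \hat{k}_r(s)\, e\big(-s f''(m) + s\phi_r\big)\, ds.$$
The integral is approximately
$$\int \hat{k}_r(s)\, e(-\mu_r n s)\, ds,$$
in the notation of Chapter 7.

Second method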
Jutila (1992) has proposed taking overlapping Farey arcs I(a/q), with length N(q) depending on q, for all rationals a/q with $q \le Q$. For each integer m in the sum, let A(m) be the number of Farey arcs which contain m. We write A(m) = B(m) + D(m), where B(m) is the expected number, varying smoothly with m since $f^{(3)}(x)$ is non-zero, and D(m) is the discrepancy. If m is near the centre of a major arc, then A(m) = 1, and D(m) is large. On the minor arcs the discrepancy D(m) is smaller than B(m) in root mean square. Thus
$$\sum_{m=M}^{M_2} B(m)\, e(f(m)) = \sum_{q=R}^{Q} \sum_{(a, q) = 1} \sum_{m \in I(a/q)} e(f(m)) + E_1 + E_2,$$
where $E_1$ is the correction for m on major arcs, which we estimate as before, and $E_2$ is the correction for minor arcs. Using this idea, we can add together contributions from minor arcs with the same denominator and sums of the same length. A modification would be to take the Farey arcs so that the transformed sums had the same length H.

In either method, we must sum over a for fixed q to obtain a Kloosterman refinement. For fixed q we have a double sum over a and h. The limits of summation, and the numbers b, $\mu$, and $\kappa$, depend on a. We also have
expressions like $e(-\overline{4a}h^2/q)$. When we use a finite Fourier transform to take them outside the sum, then we meet Kloosterman sums
$$K(a, b; q) = \sideset{}{^*}\sum_{n \bmod q} e\!\left(\frac{an + b\bar{n}}{q}\right),$$
where the star indicates a condition (n, q) = 1, so that integers $\bar{n}$ and $\bar{q}$ exist with $n\bar{n} + q\bar{q} = 1$.

Lemma 24.2.1 (Factorizing the Kloosterman sum) If q factorizes as $q = \prod p_i^{c_i}$, and a/q, b/q have partial fractions
$$\frac{a}{q} = \sum_i \frac{a_i}{p_i^{c_i}}, \qquad \frac{b}{q} = \sum_i \frac{b_i}{p_i^{c_i}},$$
then
$$K(a, b; q) = \prod_i K(a_i, b_i; p_i^{c_i}).$$

The factorization in Lemma 24.2.1 is twisted, in the sense that a is (usually) not congruent to $a_i$ modulo $p_i$.

Lemma 24.2.2 (Weil's Kloosterman sum bound) The Kloosterman sum to a prime power modulus $p^c$ satisfies $|K(a, b; p^c)| \le 2p^{c/2}$, unless both a and b are multiples of p.
The cases $c \ge 2$ of Lemma 24.2.2 are elementary, but the case c = 1 depends on counting the number of points on an algebraic curve over finite fields.
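Lemmas 24.2.1 and 24.2.2 can be tested numerically for small moduli. The sketch below is only an illustration with hypothetical function names: it evaluates K(a, b; q) straight from the definition, builds a and b from chosen partial-fraction numerators for a modulus $q = q_1 q_2$ with coprime factors, and compares both sides of the factorization. It uses `pow(n, -1, q)` for the inverse modulo q, which requires Python 3.8 or later.

```python
import cmath
from math import gcd


def kloosterman(a, b, q):
    """K(a, b; q): sum of e((a n + b nbar)/q) over n mod q with (n, q) = 1."""
    total = 0
    for n in range(1, q + 1):
        if gcd(n, q) == 1:
            nbar = pow(n, -1, q)  # n * nbar = 1 (mod q)
            total += cmath.exp(2j * cmath.pi * (a * n + b * nbar) / q)
    return total


def combine_partial_fractions(a1, q1, a2, q2):
    """Numerator a with a/(q1 q2) = a1/q1 + a2/q2."""
    return a1 * q2 + a2 * q1


def factorization_defect(a1, b1, q1, a2, b2, q2):
    """|K(a, b; q1 q2) - K(a1, b1; q1) K(a2, b2; q2)| for coprime q1, q2."""
    a = combine_partial_fractions(a1, q1, a2, q2)
    b = combine_partial_fractions(b1, q1, b2, q2)
    lhs = kloosterman(a, b, q1 * q2)
    rhs = kloosterman(a1, b1, q1) * kloosterman(a2, b2, q2)
    return abs(lhs - rhs)
```

With $q_1 = 9$, $q_2 = 5$ the defect is zero up to rounding error, and for small odd primes p the values $|K(a, b; p)|$ with a, b not divisible by p stay below $2p^{1/2}$.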
The terms of the Kloosterman sum correspond to matrices
of the modular group, and Kloosterman sums occur in the Fourier coefficients of modular forms. Kuznietsov (1980) followed ideas of Selberg to express sums of Kloosterman sums explicitly in terms of the Fourier coefficients of modular forms.
Lemma 24.2.3 (Kuznietsov's trace formula) For a, b positive integers, g(x) a continuous piecewise differentiable function of compact support q 2
(ab))
g(2
(ab)
a )kk-1/2
1)k
°°
+k=1 E
21T
b (-)
K(a, b; q) + Sab f J.
x
g(x) dx x
2
(cb(G2k a) - Sab)(2k -1) f J2k-1
4 (ab) bt(a)bt(b)g(K,)
c
1
+
cosh in.1
j=1
(2)
x)
g(x) dx
T;t(a)Tj,(b)g(t)
?r J- (1 + 2it)( 1 - 2it)
dt,
where $G_{2k,a}$ is a modular form of weight 2k, the ath Poincaré series, and $c_n$ is its nth Fourier coefficient (suitably normalized), $b_j(n)$ is the nth Fourier coefficient of the jth non-holomorphic Maass wave form (suitably normalized), and $\tfrac{1}{4} + \kappa_j^2$ is the corresponding eigenvalue, and $\tau_{\alpha}(n)$ is a normalized divisor function
$$\tau_{\alpha}(n) = \sum_{de = n} \left(\frac{d}{e}\right)^{\alpha},$$
Sab is 1 for a = b, 0 otherwise, and (x2
1T i
g(t)= 2sinhirt f (J2it(2)
where JS(y) is the Bessel function. In the same notation q 2 vV-Wab)
-
g
2
K(-a, b; q)
(ab)
4 (ab) b,(a)b,(b)g(Kt) j=1
cosh?rK,
T,t(a)T;t(b)g(t)
1
+ 1rL
dt,
with
g(t)= 2 cosh1rt f Kh2it(x)g(x)dx, where
Khf(y) _
2
W
fo exp(-y cosh t)cosh st dt
IT
1
=
1
I
exp(-2(' +u))u-s
du
is a less familiar Bessel function, the Hankel function of pure imaginary argument (Jeffreys and Jeffreys 1962, cap. 21).
The two transforms $\hat{g}(t)$ and $\check{g}(t)$ are analogous, since we can express $Kh_s(y)$ in terms of $I_s(y) - I_{-s}(y)$. The Kloosterman sum is a discrete analogue of the definition of Bessel functions of integral order as integrals round the unit circle. We call Lemma 24.2.3 a trace formula, as the right-hand side can be regarded as the normalized trace of an intertwining operator in representation theory. The lemma is proved using the Fourier theory of functions on the hyperbolic plane invariant under the action of the modular group $PSL(2, \mathbb{Z})$ as rigid motions, and the Poincaré series coefficients enter as residues at trivial poles.

Bombieri and Iwaniec (1986a) suggest that a full solution of the Second Spacing Problem will involve Lemma 24.2.3. We have calculated directly with the matrices rather than with their Fourier theory. There are three integers a, b, and q in the Coincidence Conditions of Part IV. Perhaps 3 × 3 matrices will be needed. The representation theory of $SL(3, \mathbb{Z})$ is still being investigated.

Are we on the right road to showing that the discrepancy of a smooth closed curve of linear dimensions M is $O(M^{1/2 + \epsilon})$? Subdividing the curve according to rational gradients must be right. Perhaps the number of Farey arcs should be $M^{\epsilon}$, and we take about $1/\epsilon$ terms of the Taylor series. Alternatively, the y vectors may be so uniformly distributed that the contributions of different minor arcs cancel. The Kuznietsov trace formula is appropriate here because $SL(2, \mathbb{Z})$ is the automorphism group of the lattice points, a natural part of the problem. Should the representation theory of $SL(n, \mathbb{Z})$ for $n \ge 3$ appear in a two-dimensional problem? In all events, we have reached the final formula: `If you want any more, you must sing it yourself.'
References

The numbers flush right following a reference are the chapter or section numbers in which the reference is cited, or to which it is related.

Atkin, A. O. L. and Lehner, J. (1970). Hecke operators on $\Gamma_0(m)$. Math. Annalen, 185, 134-60. 10, 20
Atkinson, F. V. (1949). The mean value of the Riemann zeta function. Acta Math., 81, 353-76. 10, 21.5
Baker, R. C. (1986). Diophantine inequalities. Oxford University Press. 5
Baker, R. C. and Harman, G. (1991). On the distribution of apk modulo one. Mathematica, 38, 170-84.
23.3
Baker, R. C., Harman, G., and Rivat, J. (1994). The Piatecki-Shapiro Theorem. J. Number Theory, 50, 261-77. 22 Balasubramian, R. (1978). An improvement of a theorem of Titchmarsh on the mean square of I C(1/2 + it)I Proc. London Math. Soc., 36, 540-76. 21.4 Bombieri, E. (1987). Le grand crible dans la theorie analytique des nombres, 2nd edn. Asterisque, Paris. 5.6
Bombieri, E. and Iwaniec, H. (1986a). On the order of C(1/2 + it). Ann. Scuola Norm. Sup. Pisa Cl. Sci. (4),13,449-72. 5,7,8,14,16,17,21,24.2 Bombieri, E. and Iwaniec, H. (1986b). Some mean value theorems for exponential sums. Ann. Scuola Norm. Sup. Pisa Cl. Sci. (4) 13, 473-86.
7.5,12
Bombieri, E. and Pila, J. (1989). The number of integral points on arcs and ovals. Duke Math. J., 59, 337-57.
23.1
Branton, M. and Sargos, P. (1994). Points entiers au viosinage d'une courbe plane a tres faible courbure. Bull. des Sciences Maths. (2) 118, 15-28. 3, 23.1 Cassels, J. W. S. (1953). A new inequality with application on the theory of Diophantine approximation. Math. Annalen, 126, 108-18. 5,6 Cassels, J. W. S. (1959). An introduction to the geometry of numbers. Springer, Berlin. 11.2
Chaix, H. (1972). Demonstration elementaire d'un theoreme de van der Corput. Comptes RenduesAcad. Sci. Paris, A 275, 883-5.
2
Chandrasekharan, K. and Narasimhan, R. (1967). On lattice-points on a random sphere. Bull. Amer. Math. Soc., 73, 68-71. 6 Colin de Verdiere, Y. (1977). Nombre de points entiers dans une famille homothetique de domaines de R. Ann. Sci. Ec. Norm. Sup. (4)10, 559-76. 6 Corput, J. G. van der (1920). Uber Gitterpunkte in der Ebene. Math. Annalen, 81, 1-20. 2,5 Corput, J. G. van der (1921). Zahlentheoretische Abschatzungen. Math. Annalen, 84, 53-79. 5 Corput, J. G. van der (1922). Verscharfung der Abschatzung beim Teilerproblem. Math. Annalen, 87, 39-65. 3,5 Corput, J. G. van der (1923). Neue zahlentheoretische Abschatzungen. Math. Annalen, 89, 215-54.
5
Corput, J. G. van der (1935). Zur Methode der stationaren Phase I. Compositio Math., 1, 15-38. 5.5 Corput, J. G. van der (1936). Zur Methode der stationaren Phase II. Compositio Math., 3, 328-72. 5.5 Corput, J. G. van der and Pisot, C. (1939). Sur la discrepance modulo un I. Proc. Kon. Ned. Akad. Wetensch., 42, 476-85. 5.3, 5.6 Davenport, H. (1967). Multiplicative number theory. Markham, Chicago. 5.4,21,22.1 Deligne, P. (1974). La conjecture de Well. Inst. Hautes Etudes Sci. Publ. Math., 53, 273-307. 10.3, 20.1 Deshouillers, J. M. (1976). Nombres premiers de la forme [n`]. Comptes RenduesAcad. Sci. Paris, A 282, 131-3. 22.1 Deshouillers, J. M. (1985). Geometric aspect of Weyl sums. Elementary and Analytic Theory of Numbers. Banach Centre Pub. 17 (Polish Sci. Pub; Warsaw) 75-82. 2 Deschouillers, J. M. and Iwaniec, H. (1982). Kloosterman sums and Fourier coefficients of cusp forms. Inventiones Math., 70, 219-88. 24.2 Everest, G. R. (1989). A `Hardy-Littlewood' approach to the S-unit equation. Compositio Math., 70, 101-18. 23.3
Faltings, G. (1983). Endlichkeitssatze filr abelsche Varietaten fiber Zahlkorpern. Inventiones Math., 73, 349-66.
11.1
Filaseta, M. (1988). An elementary approach to short interval results for k-free numbers. J. Number Theory, 30, 208-25. 3.2 Filaseta, M. and Trifonov, O. (1996). The distribution of fractional parts with applications to gap results in number theory. Proc. London Math. Soc. To appear. 23.1 Ford, L. R. (1938). Fractions. Amer. Math. Monthly, 45, 586-601. 1.2 Fort, C. (1973). New lands. Ace, New York. 15.2
Fouvry, E. and Iwaniec, H. (1989). Exponential sums for monomials. J. Number Theory, 33, 311-33.
23.5
Gel'fond, A. O. and Linnik, Yu. V. (1965). Elementary methods in analytic number theory. Rand McNally, USA.
6.3
Good, A. (1982). The square mean of Dirichlet series associated with cusp forms. Mathematika, 29, 278-95. 20.3 Good, A. (1983). Local analysis of Selbeig's trace formula. Lecture Notes in Mathematics, 1040. Springer, Berlin. 21.1 Graham, S. W. and Kolesnik, G. (1991). Van der Corput's method for exponential sums. London Math. Soc. Lecture Notes, 126. Cambridge University Press. 3,5,7,14,17.4,18.2,19.2,23.4
Hall, R. R. and Tenenbaum, G. (1986). The average orders of Hooley's 0r functions, II. Compositio Math., 60, 163-86. Hall, R. R. (1988). Divisors. Cambridge University Press.
11.1,18.5 11.1
Halberstam, H. and Richert, H.-E. (1974). Sieve methods. Academic Press, London. 5.6
Hardy, G. H. (1915). On the expression of a number as the sum of two squares. QuarterlyJ. Math. (Oxford), 46, 263-83.
10.2
Hardy, G. H. and Landau, E. (1924). The lattice points of a circle. Proc. Royal Soc., A 105, 244-58.
10.2
Hardy, G. H. and Littlewood, J. E. (1923). Some problems of `partitio numerorum'; III: on the expression of a number as a sum of primes. Acta Math., 44, 1-70. 5.6,12.1
Hardy, G. H. and Ramanujan, S. (1918). Asymptotic formulae in combinatory analysis. Proc. London Math. Soc. (2) 17, 75-115. 7.1,12.1
Hardy, G. H. and Wright, E. M. (1960). An introduction to the theory of numbers, 4th 1,11, 22.1 edn. Oxford University Press. Heath-Brown, D. R. (1978). The twelfth power moment of the Riemann zeta function. 21.5
Quarterly J. Math. (Oxford), 29, 443-62.
Heath-Brown, D. R. (1979). The fourth power moment of the Riemann zeta function. 21.3 Proc. London Math. Soc., (3) 38, 385-422. Heath-Brown, D. R. (1983). The Pjateckii-Sapiro prime number theorem. J. Number Theory, 16, 242-66.
5.5, 22
Heath-Brown, D. R. and Huxley, M. N. (1990). Exponential sums with a difference. 9,19.1, 21.3 Proc. London Math. Soc. (3) 61, 227-50. Herz, C. S. (1962a). Fourier transforms related to convex sets. Annals of Maths., 75, 6.2, 23.4 81-92. Herz, C. S. (1962b). On the number of lattice points in a convex set. Amer. J. Math., 6.2, 23.4 84,126-32. Hlawka, E. (1950). Integrale auf konvexen Korpern. Monatshefte Math., 54, 1-36, 6.2,18.5, 23.4 81-99. Hlawka, E., Schoissengeier, J., and Taschner, R. J. (1991). Geometric and analytic number theory. Springer, Berlin. Hobson, E. W. (1957). Theory of functions of a real variable. Dover, New York.
23 6.3
Huxley, M. N. (1972). On the difference between consecutive primes. Inventiones 20.2 Math., 15, 164-70. Huxley, M. N. (1975). Large values of Dirichlet polynomials III. Acta Arithmetica, 26, 435-44. 20.2 Huxley, M. N. (1985). Introduction to Kloostermania, Elementary and Analytic Theory of Numbers. Banach Centre Pub. 17, (Polish Sci. Pub., Warsaw), 217-306. 24.2 Huxley, M. N. (1987). The area within a curve. Proc. Indian Acad. Sci. (Math. Sci.), 97, 2 111-16. Huxley, M. N. (1988). The fractional parts of a smooth sequence. Mathematika, 35, 292-6. 3 Huxley, M. N. (1989a). Exponential sums and the Rieman zeta function, in Theorie des Nombres, C.R. C.I. TN. Laval 1987. de Gruyter, Berlin, 417-23. 3,17 Huxley, M. N. (1989b). The integer points close to a curve. Mathematika, 36, 198-215. 3
Huxley, M. N. (1990). Exponential sums and lattice points. Proc. London Math. Soc. (3), 60, 471-502. Corrigenda, ibid. (3) 66, 70. 8, 18
Huxley, M. N. (1991). Exponential sums and rounding error. J. London Math. Soc. (2), 43, 367-84. 17.4, 18
Huxley, M. N. (1992a). A note on short exponential sums, in Proc. Amalfi Conf. Analytic Number Theory. University Press, Salerno, 217-29. 14, 17
Huxley, M. N. (1992b). Integer points in a domain with smooth boundary, in Séminaire de théorie des nombres Paris 1989-90. Birkhäuser, Basel, 93-111. 18.3
Huxley, M. N. (1993a). Exponential sums and the Riemann zeta function IV. Proc. London Math. Soc. (3), 66, 1-40. 4, 15, 17, 21.2
Huxley, M. N. (1993b). Exponential sums and lattice points II. Proc. London Math. Soc., (3), 66, 279-301. Corrigenda, ibid. (3), 68, 264. 18
Huxley, M. N. (1994a). The rational points close to a curve. Ann. Scuola Norm. Sup. Pisa Cl. Sci. (4) 21, 357-75. 4
Huxley, M. N. (1994b). A note on exponential sums with a difference. Bull. London Math. Soc., 26, 325-7. 13.3, 19.1, 21.3
Huxley, M. N. (1994c). On stationary phase integrals. Glasgow Math. J., 35, 354-62. 5.5
Huxley, M. N. (1995a). A mean value theorem for exponential sums. J. Austral. Math. Soc., A, 59, 304-7. 6.1
Huxley, M. N. (1995b). The mean lattice point discrepancy. Proc. Edinburgh Math. Soc., 38, 523-31. 6
Huxley, M. N. and Kolesnik, G. (1991). Exponential sums and the Riemann zeta function III. Proc. London Math. Soc. (3), 62, 449-68. Corrigenda, ibid. (3), 66, 302. 11,12
Huxley, M. N. and Kolesnik, G. (1996). Exponential sums with a large second derivative. To appear. 16.3, 19.3
Huxley, M. N. and Sargos, P. (1995). Points entiers au voisinage d'une courbe plane de classe C^n. Acta Arithmetica, 59, 359-66. 23.1
Huxley, M. N. and Trifonov, O. (1996). The square-full numbers in an interval. Proc. Cam. Philos. Soc., 119, 201-8. 3, 23.1
Huxley, M. N. and Watt, N. (1988). Exponential sums and the Riemann zeta function. Proc. London Math. Soc. (3), 57, 1-24. 1.6, 7, 14, 17
Huxley, M. N. and Watt, N. (1989a). The Hardy-Littlewood method for exponential sums. Coll. Math. Soc. János Bolyai, 51, Number Theory. Budapest 1987. North Holland, Amsterdam, 173-91. 7, 14, 17
Huxley, M. N. and Watt, N. (1989b). Exponential sums with a parameter. Proc. London Math. Soc. (3), 59, 233-52. 14.4, 17
Huxley, M. N. and Watt, N. (1994). The number of ideals in a quadratic field. Proc. Indian Acad. Sci. (Math. Sci.), 104, 157-65. 18, 24.1
Ireland, K. and Rosen, M. (1982). A classical introduction to modern number theory. Springer, Berlin. 10.3
Ivić, A. (1985). The Riemann zeta function. Wiley, New York. 21
Iwaniec, H. (1980). Fourier coefficients of cusp forms and the Riemann zeta function. Séminaire de Théorie des Nombres (Bordeaux) 1979/80. 21.5
Iwaniec, H. and Mozzochi, C. J. (1988). On the divisor and circle problems. J. Number Theory, 29, 60-93. 5, 8, 13, 18
Jarník, V. (1925). Über die Gitterpunkte auf konvexen Kurven. Math. Zeitschrift, 24, 500-18. 2, 3
Jeffreys, H. and Jeffreys, B. S. (1962). Methods of mathematical physics, 3rd edn. Cambridge University Press. 7.3, 8.3, 10.2, 21.2, 24.2
Jutila, M. (1985). On exponential sums involving the divisor function. J. reine angew. Math., 355, 173-90. 10, 20.3
Jutila, M. (1986). On the approximate functional equation for ζ²(s) and other Dirichlet series. Quarterly J. Math. (Oxford) (2), 37, 193-209. 5.5, 10, 20.3
Jutila, M. (1987a). Lectures on a Method in the Theory of Exponential Sums. Tata Institute Lectures in Maths. and Physics 80. Springer, Bombay. 10, 20
Jutila, M. (1987b). On exponential sums involving the Ramanujan function. Proc. Indian Acad. Sci. (Math. Sci.), 97, 157-66. 10, 20
Jutila, M. (1989). Mean value estimates for exponential sums, in C. R. Journées Arithmétiques Ulm 1987. Lecture Notes in Mathematics 1380. Springer, Berlin, 120-36. 10, 17.4, 20
Jutila, M. (1990a). The fourth power moment of the Riemann zeta function over a short interval. Coll. Math. Soc. János Bolyai, 51, Number Theory. Budapest 1987. North Holland, Amsterdam, 221-44. 10, 13.3, 14.4, 21
Jutila, M. (1990b). Mean value estimates for exponential sums II. Arch. Math., 55, 267-74. 10, 20
Jutila, M. (1990c). Exponential sums related to quadratic forms, in Proc. Canadian Number Theory Conference Banff 1988. de Gruyter, Berlin. 10, 20
Jutila, M. (1991). Mean value estimates for exponential sums with applications to L-functions. Acta Arith., 57, 93-114. 10, 20, 21
Jutila, M. (1992). Transformations of exponential sums, in Proc. Amalfi Conf. Analytic Number Theory. University Press, Salerno, 263-70. 10, 24.2
Karatsuba, A. A. (1981). On the distance between adjacent zeros of the Riemann zeta function lying on the critical line. Proc. Steklov Inst. Math., 3, 51-66, from Trudy Mat. Inst. Steklov, 157. 7.6, 21.4
Karatsuba, A. A. (1987). Approximation of exponential sums by shorter ones. Proc. Indian Acad. Sci. (Math. Sci.), 97, 167-78. 5.5
Kendall, D. G. (1948). On the number of lattice points inside a random oval. Quarterly J. Math. (Oxford), 19, 1-26. 6
Kendall, D. G. and Rankin, R. A. (1953). On the number of points of a given lattice in a random hypersphere. Quarterly J. Math. (Oxford) (2), 4, 178-89. 6.2, 23.4
Korobov, N. M. (1992). Exponential sums and their applications. Kluwer, Dordrecht. 5, 23
Krätzel, E. (1988). Lattice points. Deutscher Verlag Wiss., Berlin. 3, 5, 23.4
Krätzel, E. and Nowak, W. G. (1991). Lattice points in large convex bodies. Monatsh. Math., 112, 61-72. 23.4
Krätzel, E. and Nowak, W. G. (1992). Lattice points in large convex bodies II. Acta Arith., 62, 285-95. 23.4
Kuznietsov, N. V. (1980). Petersson's hypothesis for cusp forms of weight zero and Linnik's hypothesis; sums of Kloosterman sums. Mat. Sbornik, 111, 334-83. 14.2, 21.1, 24.2
Landau, E. (1915). Über die Gitterpunkte in einem Kreise (II). Göttinger Nachrichten (1915), 161-71. 10.2
Lax, P. D. and Phillips, R. S. (1967). Scattering theory for automorphic functions. Academic Press, London. 21.1
Linnik, Yu. V. (1941). The large sieve. Doklady A.N. S.S.S.R., 30, 292-4. 5.6, 22.1
Littlewood, J. E. (1912). Quelques conséquences de l'hypothèse que la fonction ζ(s) de Riemann n'a pas des zéros dans le demi-plan R(s) > 1/2. Comptes Rendus Acad. Sci. Paris, 154, 263-6. 21.1
Littlewood, J. E. (1986). Littlewood's miscellany (ed. B. Bollobás). Cambridge University Press. 7.4
Liu, H.-Q. (1993). On the number of Abelian groups of given order (supplement). Acta Arith., 64, 287-96. 23.5
Liu, H.-Q. and Rivat, J. (1992). On the Pjateckii-Sapiro prime number theorem. Bull. London Math. Soc., 24, 143-7. 22, 23.5
Logan, I. and O'Hara, F. (1983). The complete Spectrum ROM disassembly. Melbourne House, Richmond, Surrey. 18.5
Matthews, K. R. (1973). Bilinear forms and the large and small sieves. J. Number Theory, 5, 16-23. 5.6
Meurman, T. (1988). On exponential sums involving the Fourier coefficients of Maass wave forms. J. reine angew. Math., 384, 192-207. 10, 20
Meurman, T. (1990). On the order of the Maass L-functions on the critical line. Coll. Math. Soc. János Bolyai, 51, Number Theory. Budapest 1987. North Holland, Amsterdam, 325-54. 10, 20
Meurman, T. (1992). A simple proof of Voronoi's identity. Astérisque, 209, 265-74. 10
Montgomery, H. L. (1971). Topics in multiplicative number theory. Lecture Notes in Mathematics, 227. Springer, Berlin. 5, 21
Montgomery, H. L. and Vaughan, R. C. (1974). Hilbert's inequality. J. London Math. Soc. (2), 8, 73-82. 5
Moreno, C. J. and Shahidi, F. (1983). The fourth moment of the Ramanujan τ-function. Math. Annalen, 266, 233-9. 10.3, 20.1
Müller, W. and Nowak, W. G. (1991). Lattice points in planar domains: applications of Huxley's discrete Hardy-Littlewood method, in Number-Theoretic Analysis II. Springer Lecture Notes in Mathematics, 1452, pp. 139-64. Springer, Berlin. 23
Nowak, W. G. (1985a). On the average order of the lattice rest of a convex planar domain. Math. Proc. Cam. Philos. Soc., 98, 1-4. 6.3
Nowak, W. G. (1985b). An Ω-estimate for the lattice rest of a convex planar domain. Proc. Royal Soc. Edinburgh, 100A, 295-99. 6.3
Ogg, A. (1969). Modular forms and Dirichlet series. Benjamin, New York. 10
Piatecki-Shapiro, I. I. (1953). On the distribution of prime numbers in sequences of the form [f(n)]. Mat. Sbornik, 33, 559-66. 22
Phillips, E. (1933). The zeta function of Riemann: further developments of van der Corput's method. Quarterly J. Math. Oxford, 4, 209-25. 5
Rankin, R. A. (1939). Contributions to the theory of Ramanujan's function τ(n) and similar arithmetical functions II: the order of Fourier coefficients of integral modular forms. Proc. Cam. Phil. Soc., 35, 357-72. 10.2, 20.1
Rankin, R. A. (1977). Modular forms and functions. Cambridge University Press. 1.3, 10
Sargos, P. (1995). Points entiers au voisinage d'une courbe, sommes trigonométriques courtes et paires d'exposants. Proc. London Math. Soc., (3) 70, 285-312. 3, 19.3
Selberg, A. (1942). On the zeros of Riemann's zeta function on the critical line. Arch. for Math. og Naturvid., B 45, 101-14 (Collected Works I, 142-55). 21.4
Selberg, A. (1956). Harmonic analysis and discontinuous groups in weakly symmetric Riemannian spaces with applications to Dirichlet series. J. Indian Math. Soc., 20, 47-87 (Collected Works I, 423-63). 21.1, 24.2
Sierpinski, W. (1906). Sur un problème du calcul des fonctions asymptotiques. Prace Mat.-Fiz., 17, 77-118. 2
Swinnerton-Dyer, H. P. F. (1974). The number of lattice points on a convex curve. J. Number Theory, 6, 128-35. 2, 3
Titchmarsh, E. C. (1951). The theory of the Riemann zeta function. Oxford University Press. 5, 21
Trifonov, O. (1988). On the number of the lattice points in some two-dimensional domains. Doklady Bulgar. Akad. Nauk, 41, 11, 25-7. 5, 18.3
Uchiyama, S. (1972). The maximal large sieve. Hokkaido Maths. J. 1, 117-26. 8.4
Vaughan, R. C. (1977). Sommes trigonométriques sur les nombres premiers. Comptes Rendus Acad. Sci. Paris, A 285, 981-3. 22
Vaughan, R. C. (1981). The Hardy-Littlewood method. Cambridge University Press. 8, 12, 23.2
Vinogradov, I. M. (no date; circa 1954). The method of trigonometric sums in the theory of numbers (translated K. F. Roth and A. Davenport). Interscience, London. 5, 22
Voronoi, G. (1903). Sur un problème du calcul des fonctions asymptotiques. J. reine angew. Math., 126, 241-82. 2
Voronoi, G. (1904). Sur une fonction transcendante et ses applications à la sommation de quelques séries. Ann. École Norm. Sup. (3) 21, 207-67, 459-533. 6, 10, 21.5
Watt, N. (1988). An elementary treatment of a general Diophantine problem. Ann. Scuola Norm. Sup. Pisa Cl. Sci. (4) 15, 603-14. 11
Watt, N. (1989a). A problem on semicubical powers. Acta Arith., 52, 119-40. 11
Watt, N. (1989b). Exponential sums and the Riemann zeta function II. J. London Math. Soc., 39, 385-404. 11, 12, 17
Watt, N. (1990). A problem on square roots of integers. Periodica Math. Hung., 21, 55-64. 13
Watt, N. (1992). A hybrid bound for Dirichlet L-functions on the critical line, in Proc. Amalfi Conf. Analytic Number Theory. University Press, Salerno, 387-92. 16.2, 17.3
Weil, A. (1967). Über die Bestimmung Dirichletscher Reihen durch Funktionalgleichungen. Math. Annalen, 168, 149-56. 10.1
Weil, A. (1971). Dirichlet series and modular forms. Lecture Notes in Mathematics, 189, Springer, Berlin. 10.1
Wilton, J. R. (1929). A note on Ramanujan's arithmetic function τ(n). Proc. Cam. Phil. Soc., 25, 121-9. 10.2
Wilton, J. R. (1932). Voronoi's summation formula. Quart. J. Math. Oxford (3), 26-32. 10.2
Zaharescu, A. (1987). On a conjecture of Graham. J. Number Theory, 31, 80-7. 1.1
Index
analytic coincidence conditions 315-20, 327, 339-46 approximate functional equation 111-13, 432,
436,441-3,447-8 arc of coincidence 313, 321, 327-8, 341-2 area in intrinsic coordinates 393 Atkin-Lehner newforms 210, 435 Atkinson's summation formula 451
degenerate cases 81 Deligne, P. 210, 425 Deshouillers, J. M. 452 diagonal solutions 236 terms 363 differencing 45-6, 65 step 115-19, 135, 185, 257, 259-60, 363-5, 383, 406, 416, 421-2, 461, 473, 474 Dirichlet approximation theorem 16, 83, 229, 259, 473
Baker, R. C. 473 Balasubramanian, R. 447 Bombieri, E. 2, 3, 4, 125, 158, 160, 289, 467-9, 478, 483
Bombieri-Iwaniec method 142-196, 235-361, 372-383, 479 Borel, E. 7 Brun-Titchmarsh theorem 122-5 Burgess D. A. 135
boundary condition 405 characters 100-101, 135, 236, 362, 432 divisor problem 2, 28, 239-40, 382-83 pigeonhole principle 16, 42-3, 48, 56, 335 series 2, 237, 432-3, 435, 438-441, 447 discrepancy of lattice points 33-41,128-141, 372, 384-8, 398, 474, 479, 483 of sequences 97-8
divide-and-conquer 2, 42-3, 48, 56, 81, 83, 115, 143, 158, 236, 240, 255, 263-4, 307, 311, 447,
Cassels, J. W. S. 116, 242 Cauchy, A. 8 inequality discussed 114-16, 121-2, 125 centre of Farey arc 145, 169, 186, 339 Chaix, H. 41 circle method 143, 152, 255-71, 470-2 see also Hardy, G. H. problem 2, 31, 203 coincidence conditions 156-7, 216, 231, 235, 272, 283-4, 287-346, 349, 416, 426-9, 478,
479
divided differences 46, 65 divisor concentration 240-1, 367 function 29, 203, 237-41, 367 problem 2, 28, 239-40, 382-83 duality of points and lines 73-4, 137-8 of bilinear forms 122, 224
483
detection 119-20, 255 commensurable subgroups 198
Erdos-Turan theorem 96-7 Euler constant 239-40
complementary function 72, 110, 321, 341, 345, 383, 415
product 113, 435, 438-41 Everest, G. 472 exponent pairs 151-2, 369-70 exponential integrals defined 87 evaluated 104-14 exponential sums bounded 99,353-61,365-71,415-22 defined 87 non-standard 475
congruence family 336-9,349,357,361-3,392, 478
continued fraction 16-19, 398 Corput, J. G. van der 4, 41, 43, 93, 116, 125 iteration 43, 62, 116-17, 135, 152, 383, 406, 474
lemmas 87-120, 221-2, 225, 448 counting squares 1-2, 25-31, 41, 131, 167 cusp forms 200 Davenport, H. 4, 101, 453
Faltings, G. 235
families of solutions 241-48, 249-51, 272-83 of sums 179, 194, 219-20, 300-306, 333-4, 353, 357-61, 393-7, 410-13, 420, 424-9, 458, 478
Farey arcs 2, 143, 167-70, 185-7, 208-9, 255, 336-7, 372, 425-30, 445, 470, 473, 479-80, 483 sequence 8-10, 19-23, 208, 346, 439, 470, 477
see also major arcs, minor arcs
Fejer kernel 93-4 Filaseta, M. 45, 470 First Derivative Test stated 88, 104, 113-14, 448
First Spacing Problem 3, 56, 158, 235-85, 307,
372, 406, 471, 476-7 flanks of major arcs 151 Ford circles 10 Fortean function 312, 324, 327, 342, 343 Fouvry, E. 475, 477
gardening 152,166,183-4,195-6,225,228-30, 412
Gauss circle problem 2, 31, 203 sums 100-4, 149, 162 Gel'fond, A. 0. 41, 135, 452 generating function 198, 438, 470 Gershgorin's eigenvalue bound 224 Good, A. 436, 439 Graham, R. L. 8 Graham, S. W. 4, 432 with Kolesnik, G. 2, 43, 116, 125, 152, 297, 383, 474
Halasz, G. 433 Halberstam, H. 122 Hall, R. R. 240, 399 Hardy, G. H. 2, 31, 122, 143, 203, 256, 439, 453
Hardy-Littlewood circle method, see circle method Harman, G. 473 Heath-Brown, D. R. 2, 4, 125, 185, 406, 444, 451, 464
height 7 Herz, C. S. 128, 474 Hlawka, E. 128, 405, 474 Hobson, E. W. 130 Hooley, C. 4 divisor concentration 241-2, 367
intrinsic equation of curve 24-5, 384, 388-393 inversion step 108-10, 116-17, 149, 176, 191-2, 193, 383, 422, 474-5 Ireland, K. 210
iteration between rational points and lines 76-80 by major arc method 166, 413-15 of points close to curves 43, 48-53 of resonance curves 370, 415-22 van der Corput's 43, 62, 116-17, 137, 154, 383, 406, 474 Ivic, A. 4, 440 Iwaniec, H. 2, 3, 4, 125, 167, 372, 451, 471, 475, 477; see also
Bombieri-Iwaniec
Jarnik, W. 30-33, 57 Jeffreys, H. and B. S. 150, 177, 203, 441, 482 Jordan curve theorem 25 Jutila, M. 2, 4, 208, 225, 284, 423, 435, 436, 437, 451
Karatsuba, A. A. 125, 160, 451 Kendall, D. G. 135, 474 Kloosterman refinement 471, 479
sums 481-3 Koebe's function 440 Kolesnik, G. 2, 4, 255, 297, 339, 416, 474, 478, see also Graham S. W. Kratzel, E. 4, 43, 474, 475 Kuznietsov, N. V. 290, 439, 481-83 L-function 435-7, 440-1, 451 Landau, E. 31 large sieve 3, 120-125, 126, 166, 214-20, 224-5,372,470 applied 152-8, 178-84, 193-6, 231, 329, 475-6
lemmas 120-25, 224-5 large values results 126, 431-2 Lax, P. D. 440 limitation results 126, 248-9, 334-5, 474 Lindelof hypothesis 441 Linnik, Yu. V. 41, 122, 124, 135, 454 Liu, H.-Q. 452, 475 Logan, I. 398 magic matrix 289-308, 320-1, 329-335, 415-18, 423-8, 430 major arcs in Bombieri-Iwaniec method 148-51, 160-66, 168-72, 186-8, 377, 409, 480 flanks of 151 in Hardy-Littlewood method 143, 255, 256-7, 262-68, 270 long 160-66, 356, 413-15, 450 for S-suits 473 Sargos splines 469 as sides of polygon 55, 67-72 Mangoldt, H. von 438, 453, 464 Matthews, K. R. 125 mean-to-max 117-18, 122, 135, 136, 185, 410
Mellin transform 91, 117-18, 181-2, 195, 438-40 Minkowski, H. 242 minor arcs in Bombieri-Iwaniec method 143-9, 151, 168-70, 172-8, 186-7, 188-93, 336, 377, 410, 414, 480
in Hardy-Littlewood method 143, 255, 256-62, 270 Sargos splines 469
as sides of polygon 55, 67 Möbius function 13-14, 63-4, 123-5, 439, 454 transformation 199 Modification One 147, 149, 168, 172, 315 Two 147, 149, 156, 169, 181, 186, 195, 351, 352, 374, 376 Three 169
reflection step, see inversion step resonance curves 320-21, 345, 415-19 Riemann hypothesis 15, 210, 435-6, 439-40, 447
zeta function 2, 3, 64, 111-13, 160, 237, 432, 435, 438-451
Riesz interchange 218, 246, 267, 330-2, 405, 457
Rivat, J. 452, 475 Rosen, M. 210 row-of-glasses function 95 row-of-teeth function 2, 33, 93, 131, 167, 379-81 rounding error function 2, 33, 93, 131, 167, 379-81 sums 93-8, 131, 167, 363, 372, 379-83, 385-8,397-405
modular group 11-12, 197, 289, 439-40, 481, 483 forms 2, 3, 197-9, 256, 290, 435, 451, 481-83
Sargos splines 469
monomial function 50-2, 117, 165, 369, 414, 416,473,475-7 Montgomery, H. L. 115, 213, 455 Mordell, L. J. 440 Moreno, C. J. 210, 425, 437 Motohashi, Y. 4 Mozzochi, C. J. 2, 4, 167, 372
Second Spacing Problem 3, 158, 287-346, 363, 372, 476, 483 Selberg, A. 4, 481 sieve 122, 125, 439
Nowak, W. G. 4, 135, 474
Sierpinski, W. 41, 167, 372 sieve
trace formula 439-40 zeta function 439-40 Shahidi, F. 210, 425, 437 short interval means 118-19, 221-4, 226-7,
408-10,435,443-7 general 454
O'Hara, F. 398 partial summation lemmas 89-93, 181-2, 225, 408
Phillips, R. S. 440 Piatecki-Shapiro theorem 452, 463, 475 Pila, J. 467-9 Pisot, Ch. 116 Poisson summation formulae 93-104 used 2,103,109,110-12,149-50,173-8,
large, see large sieve Selberg's 122, 125, 439 simple asymptotic 63 simple asymptotic sieve 63 Simpson's rule 397 stakanchik 95 stationary phase integrals 105-8, 133-4, 150, 177, 212, 263
Swinnerton-Dyer, H. P. F. 4, 30, 43, 45-6, 49, 50, 54-62, 197, 467
189-93, 257, 260, 263-4, 320, 363, 381,
383,409,419,432,438,474-5 polynomial approximation 2, 142-6, 161, 173,
188-9,384,469 prime numbers 1, 15, 122-5, 438-9, 447, 452-64, 473
Tenenbaum, G. 240, 399 trapezium rule 31, 98, 128-9, 397 Trifonov, O. 4, 470 Type I double sums 115, 455-60 Type II double sums 115, 213, 455, 459-62, 475
Rademacher's binary decomposition 180 Ramachandra, K. 446 Ramanujan's hypothesis: bounds 210 hypothesis: multiplicativity 440 sum 13-14, 123, 125 work on partitions 143, 256 Rankin R. A. 4, 200, 210, 425, 433, 436, 451
reference fractions 19-20, 308, 312, 329-30, 332
Uchiyama, S. 180 uniform Diophantine approximation 19-23, 221 uniform distribution 12-15
van der Corput, see Corput, J. G. van der Vaughan, R. C. 115, 143, 152, 256, 454, 455, 464, 471
Vinogradov, I. M. 4, 41, 93, 143, 152, 256, 454,
Wright, E. M. 453
471
Voronoi, G. 4, 41, 128 polygon 2, 40-1, 167, 372 summation formula 128, 203, 451
x-vectors 154, 178, 193, 231 y-vectors 154, 178, 194, 231
Waring's problem 3, 152, 471 Watt, N. 2, 4, 20, 241, 246, 272, 297 Weil, A. 199, 481 Weyl's criterion 13, 96-7 Wilton summation 199, 200-10, 451 translated 440 used 210-13
Zaharescu, A. 8 zeta functions 440-1 Minakshisundaram-Pleijel 440 Riemann 2, 3, 64, 111-13, 160, 237, 432, 435, 438-451 Selberg 439-40