$\zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s}$ is the Riemann Zeta function.
$e(f(x)) = e^{2\pi i f(x)}$, $e_q(f(x)) = e^{2\pi i f(x)/q}$.
$S(a, \chi) = \sum_{n=1}^{m} \chi(n) e^{2\pi i a n/m}$ is a character sum. $\tau(\chi) = S(1, \chi)$.
$S(n, m) = \sum_{x=0}^{m-1} e^{2\pi i n x^2/m}$, $(n, m) = 1$, is a Gauss sum.
$S(q, f(x)) = \sum_{x=0}^{q-1} e_q(f(x))$.
Chapter 1. The Factorization of Integers
Throughout this chapter the small Latin letters $a, b, \dots, n, \dots, p, \dots, x, y, z$ represent integers. The main purpose of the chapter is to prove the Fundamental Theorem of Arithmetic (Theorem 5.3) and its various applications.
1.1 Divisibility
We call the numbers 1, 2, 3, ... the natural numbers and
..., -2, -1, 0, 1, 2, ...
the integers, so that the natural numbers are sometimes called the positive integers. It is clear that the sum, the difference, and the product of two integers are also integers. We say that the set of integers is "closed with respect to the three operations of addition, subtraction and multiplication".
Let $\alpha$ be a real number. We denote by $[\alpha]$ the greatest integer not exceeding $\alpha$. For example
$[3] = 3$, $[\sqrt{2}] = 1$, $[\pi] = 3$, $[-\pi] = -4$.
If $\alpha$ is positive, then $[\alpha]$ is simply the integer part of $\alpha$, and we always have
$[\alpha] \le \alpha < [\alpha] + 1$.
We now take $\alpha$ to be a rational number $a/b$, $b > 0$. Then we have
$0 \le \frac{a}{b} - \left[\frac{a}{b}\right] < 1$
or
$0 \le a - b\left[\frac{a}{b}\right] < b$,
giving
$a = b\left[\frac{a}{b}\right] + r$, $0 \le r < b$.
We have therefore proved:
Theorem 1.1. Let $a$ and $b$ be any two integers with $b > 0$. Then there exist integers $q$ and $r$ satisfying
$a = qb + r$, $0 \le r < b$. □
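The construction behind Theorem 1.1 can be tried numerically. The following is a minimal sketch, not part of the original text; the function name `divide` is chosen here for illustration. It takes $q = [a/b]$ exactly as in the argument above and uses exact rational arithmetic so that no rounding enters.

```python
import math
from fractions import Fraction


def divide(a: int, b: int) -> tuple[int, int]:
    """Return (q, r) with a == q*b + r and 0 <= r < b, for b > 0."""
    if b <= 0:
        raise ValueError("b must be positive")
    # q = [a/b], the greatest integer not exceeding a/b (exact, no floats)
    q = math.floor(Fraction(a, b))
    r = a - q * b
    return q, r


if __name__ == "__main__":
    for a, b in [(17, 5), (-17, 5), (-20, 5)]:
        q, r = divide(a, b)
        print(f"{a} = {q}*{b} + {r}")
```

For positive $b$ this agrees with Python's built-in `divmod(a, b)`, which also returns a nonnegative remainder.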
The number $r$ in the theorem is called the (nonnegative) remainder of $a$ when divided by $b$.
Definition. If the remainder of $a$ when divided by $b$ is zero, that is, if there exists an integer $c$ such that $a = bc$, then we say that $a$ is a multiple of $b$. We also say that $b$ divides $a$, and we write $b \mid a$, and we call $b$ a divisor of $a$. Clearly we always have $1 \mid a$, $b \mid 0$ and, for any $a \ne 0$, $a \mid a$. If $b$ does not divide $a$, then we write $b \nmid a$. Finally, if $a = bc$ and $b$ is neither $a$ nor 1, then we call $b$ a proper divisor of $a$.
Concerning divisibility we have the following obvious theorems:
Theorem 1.2. Suppose that $b \ne 0$, $c \ne 0$. Then
1) if $b \mid a$ and $c \mid b$, then $c \mid a$;
2) if $b \mid a$, then $bc \mid ac$;
3) if $c \mid d$ and $c \mid e$, then, for any $m$, $n$, $c \mid dm + en$. □
Theorem 1.3. If $b$ is a proper divisor of $a$, then $1 < |b| < |a|$. □
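Exercise 1. If $n$ is a positive integer, then
$\left[\frac{[n\alpha]}{n}\right] = [\alpha]$.
Exercise 2. If $n$ is a positive integer, then
$[\alpha] + \left[\alpha + \frac{1}{n}\right] + \cdots + \left[\alpha + \frac{n-1}{n}\right] = [n\alpha]$.
Exercise 3. Prove the inequality
$[2\alpha] + [2\beta] \ge [\alpha] + [\alpha + \beta] + [\beta]$.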
1.2 Prime Numbers and Composite Numbers
We divide the natural numbers into three classes:
(i) 1, the only number with exactly one natural number divisor, namely 1 itself.
(ii) $p$, numbers with exactly two natural number divisors, namely 1 and $p$ itself. In other words $p$ is an integer greater than 1 with no proper divisors.
(iii) $n$, numbers with proper divisors (so that $n$ has more than two divisors).
We call the numbers $p$ in the second class the prime numbers, and the numbers $n$ in the third class the composite numbers. We usually denote a prime number by the letter $p$. An integer is said to be even or odd according to whether it is divisible by 2 or not. Clearly even integers greater than 2 cannot be prime numbers.
Theorem 2.1. Every integer greater than 1 is a product of prime numbers.
Proof. Let $n > 1$. If $n$ is prime, then there is nothing to prove. Suppose now that $n$ is not prime and that $q_1$ is the least proper divisor. By Theorem 1.3, $q_1$ must be a prime number. Let $n = q_1 n_1$, $1 < n_1 < n$. If $n_1$ is prime, then the required result is proved; otherwise we let $q_2$ be the least prime divisor of $n_1$ giving
$n = q_1 q_2 n_2$, $1 < n_2 < n_1 < n$.
Continuing the argument we have $n > n_1 > n_2 > \cdots > 1$, and the process must terminate before $n$ steps so that eventually we have
$n = q_1 q_2 \cdots q_s$,
where $q_1, \dots, q_s$ are prime numbers. The theorem is proved. □
We can arrange the prime numbers in Theorem 2.1 as follows:
$n = p_1^{a_1} p_2^{a_2} \cdots p_k^{a_k}$, $a_1 > 0$, $a_2 > 0$, ..., $a_k > 0$, $p_1 < p_2 < \cdots < p_k$.
We call this the standard factorization of $n$.
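The proof of Theorem 2.1 is constructive: repeatedly split off the least divisor greater than 1, which is necessarily prime. A minimal sketch of that procedure follows; the function names are illustrative and not from the text, and trial division is only practical for small $n$.

```python
from collections import Counter


def least_divisor(n: int) -> int:
    """Smallest divisor of n greater than 1; it is necessarily prime."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n  # no divisor up to sqrt(n), so n itself is prime


def standard_factorization(n: int) -> list[tuple[int, int]]:
    """Return [(p_1, a_1), ..., (p_k, a_k)] with p_1 < ... < p_k and a_i > 0."""
    if n < 2:
        raise ValueError("n must be greater than 1")
    exponents = Counter()
    while n > 1:
        q = least_divisor(n)  # the prime q_1, q_2, ... of the proof
        exponents[q] += 1
        n //= q
    return sorted(exponents.items())


if __name__ == "__main__":
    print(standard_factorization(360))  # 360 = 2^3 * 3^2 * 5
```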
1.3 Prime Numbers
The first few prime numbers are
2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, ....
If $N$ is not too large, it is not difficult to determine all the prime numbers not exceeding $N$. The method is known as the sieve of Eratosthenes. If $n \le N$ and $n$ is not prime, then $n$ must be divisible by a prime not exceeding $\sqrt{N}$. We first list all the integers between 2 and $N$:
2, 3, 4, 5, ..., $N$.
We then successively remove the following:
(i) 4, 6, 8, 10, ..., that is even integers from $2^2$ onwards;
(ii) 9, 15, 21, 27, ..., that is multiples of 3 from $3^2$ onwards;
(iii) 25, 35, 55, 65, ..., that is multiples of 5 from $5^2$ onwards;
Continuing in this way we remove all those integers which are multiples of a prime not exceeding $\sqrt{N}$. The remaining numbers are all the prime numbers not exceeding $N$.
All existing tables of prime numbers are built up this way with small modifications to the method. The most accurate table of prime numbers is by Lehmer: List of prime numbers from 1 to 10,006,721, Carnegie Institution, Washington 165 (1914). Lehmer also published a factor table: Factor table for the first ten millions, Carnegie Institution, Washington 105 (1909).
An example of a 39 digit prime number is
$2^{127} - 1$ = 1701,41183,46046,92317,31687,30371,58841,05727,
and a 79 digit prime number is
$180(2^{127} - 1)^2 + 1$.
Up to the present (1981) the largest known prime is $2^{44497} - 1$, a number with 13395 digits. The following number
$2^{257} - 1$ = 231,58417,84746,32390,84714,19700,17375,81570,65399,69331,28112,80789,15168,01582,62592,79871
is known to be composite, but its prime factorization is not known. These facts can be established with the aid of computing machines and special methods. We shall describe some of these methods later (see §3.9 and §16.15), but we cannot go into the details concerning the actual computations. A table of prime numbers up to 5000 is given at the end of Chapter 3.
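The sieve of Eratosthenes described earlier in this section can be sketched as follows. This is an illustrative implementation, not taken from the text; the name `primes_up_to` is an assumption. It removes the multiples of each prime $p$ from $p^2$ onwards and stops once $p^2 > N$.

```python
def primes_up_to(N: int) -> list[int]:
    """All primes not exceeding N, by the sieve of Eratosthenes."""
    if N < 2:
        return []
    remaining = [True] * (N + 1)   # remaining[n] is True while n is not removed
    remaining[0] = remaining[1] = False
    p = 2
    while p * p <= N:              # only primes not exceeding sqrt(N) are needed
        if remaining[p]:
            for multiple in range(p * p, N + 1, p):  # multiples of p from p^2 onwards
                remaining[multiple] = False
        p += 1
    return [n for n in range(2, N + 1) if remaining[n]]


if __name__ == "__main__":
    print(primes_up_to(43))
```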
1.4 Integral Modulus
By a modulus we mean a set of integers which is closed with respect to the operations of addition and subtraction. In other words, if $m$ and $n$ are integers in a modulus, then $m \pm n$ also belong to the modulus. The modulus containing only the integer 0 is called the zero modulus. The set of all integers forms a modulus, as does the set of integers which are multiples of a fixed integer $k$. We shall presently be concerned with integral moduli.
Theorem 4.1. 1) The number 0 belongs to every modulus;
2) Let $a$, $b$ belong to a modulus and $m$, $n$ be any integers. Then $am + bn$ belongs to the modulus.
Proof. 1) Take any $a$ in the modulus. Then $0 = a - a$ belongs to the modulus.
2) If $a$ is in the modulus, then $2a = a + a$, $3a = 2a + a$, ..., $ma$ are also in the modulus. Similarly $nb$ belongs to the modulus and so the required result follows. □
Theorem 4.2. Let $a$, $b$ be any two integers. Then the set of numbers of the form $am + bn$ forms a modulus.
Proof. This is trivial. □
Theorem 4.3. Any nonzero modulus is the set of multiples of a fixed positive integer.
Proof. Let $d$ be the least positive integer in the modulus. We claim that every number in the modulus must be a multiple of $d$. For, suppose the contrary and let $n$ be a number in the modulus which is not a multiple of $d$. Then, by Theorem 1.1, there are integers $q$ and $r$ such that
$n = dq + r$, $1 \le r < d$.
From the definition of a modulus, we see that $r = n - dq$ belongs to the modulus, and this contradicts the defining minimal property of $d$. Therefore every member of the modulus is a multiple of $d$. It is also clear that every multiple of $d$ is in the modulus. The theorem is proved. □
Definition. Let $a$, $b$ be any two integers and consider the modulus of the set of numbers of the form $am + bn$. If this is not the zero modulus, then the number $d$ in the proof of Theorem 4.3 is called the greatest common divisor of $a$ and $b$, and is denoted by $(a, b)$.
Theorem 4.4. The greatest common divisor $(a, b)$ has the following properties:
(i) There exist integers $x$, $y$ such that $ax + by = (a, b)$;
(ii) Given integers $x$, $y$ we always have $(a, b) \mid ax + by$;
(iii) If $c \mid a$, $c \mid b$ then $c \mid (a, b)$.
Proof. (i) and (ii) are immediate consequences of Theorem 4.3, and (iii) follows from (i). □
Definition. If $(a, b) = 1$, then we say that $a$ and $b$ are coprime.
Note: We introduced the well known method of successive divisions known as the Euclidean algorithm in the proof of Theorem 4.3. The detailed explanation of this method was also published in our country in the year 1247.
Example. We take $a = 323$, $b = 221$. From Euclid's algorithm we first have
$323 = 221 \cdot 1 + 102$.
Note that 102 belongs to the modulus of numbers $ax + by$. Next
$221 = 102 \cdot 2 + 17$,
so that 17 also belongs to the modulus. Since $102 = 17 \cdot 6$ it follows that 17 is the least positive integer in the modulus, that is $17 = (323, 221)$. This method can be used to determine the integers $x$, $y$ in (i) of Theorem 4.4. In fact we have
$17 = 221 - 2 \cdot 102 = 221 - 2(323 - 221) = 3 \cdot 221 - 2 \cdot 323$
so that $x = -2$, $y = 3$. This ancient method is a fundamental pillar of elementary number theory.
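The back-substitution in the example can be mechanised. The sketch below is one common iterative form of the extended Euclidean algorithm; the function name `extended_gcd` is chosen here and does not come from the text. It carries along coefficients so that every remainder is expressed as $ax + by$, which is exactly the statement that each remainder belongs to the modulus.

```python
def extended_gcd(a: int, b: int) -> tuple[int, int, int]:
    """Return (d, x, y) with d = (a, b) and a*x + b*y == d, for a, b >= 0 not both 0."""
    # Invariants: old_r == a*old_x + b*old_y and r == a*x + b*y
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y


if __name__ == "__main__":
    d, x, y = extended_gcd(323, 221)
    print(d, x, y)  # the example in the text: 17 = (-2)*323 + 3*221
```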
1.5 The Fundamental Theorem of Arithmetic
Theorem 5.1. Let $p$ be a prime, and $p \mid ab$. Then either $p \mid a$ or $p \mid b$.
Proof. If $p \nmid a$, then $(a, p) = 1$. By Theorem 4.4, there are integers $x$, $y$ such that
$xa + yp = 1$,
and so
$x \cdot ab + yb \cdot p = b$.
But $p \mid ab$, so that $p \mid b$. □
Theorem 5.2. If $c > 0$ and $(a, b) = d$, then $(ac, bc) = dc$.
Proof. There are integers $x$, $y$ such that
$xa + yb = d$,
or
$xac + ybc = dc$,
and so $(ac, bc) \mid dc$. On the other hand, from $d \mid a$ we deduce that $cd \mid ca$; similarly $cd \mid cb$. Thus $dc \mid (ac, bc)$. The required result follows. □
Theorem 5.3. The standard factorization of a natural number $n$ is unique. In other words, there is only one way of writing $n$ as a product of prime numbers, apart from the ordering of the factors.
Proof. From Theorem 5.1 we see that if $p \mid abc \cdots f$, then $p$ must divide one of $a, b, c, \dots, f$. In particular if $a, b, c, \dots, f$ are all primes, then $p$ must be one of $a, b, c, \dots, f$. Suppose now that
$n = p_1^{a_1} \cdots p_k^{a_k} = q_1^{b_1} \cdots q_j^{b_j}$, $a_i > 0$, $b_i > 0$, $p_1 < \cdots < p_k$, $q_1 < \cdots < q_j$,
represent two standard factorizations of $n$. We conclude from the above that each $p$ must be a $q$, and each $q$ must be a $p$. Therefore $k = j$. Also from
$p_1 < p_2 < \cdots < p_k$, $q_1 < q_2 < \cdots < q_k$
we deduce that $p_i = q_i$. If $a_i > b_i$, then on dividing $n$ by $p_i^{b_i}$ we have
$p_1^{a_1} \cdots p_i^{a_i - b_i} \cdots p_k^{a_k} = p_1^{b_1} \cdots p_{i-1}^{b_{i-1}} p_{i+1}^{b_{i+1}} \cdots p_k^{b_k}$,
which is impossible since only the left hand side is a multiple of $p_i$. Similarly we cannot have $a_i < b_i$. The theorem is proved. □
It is appropriate to insert here the explanation of excluding the number 1 from the definition of a prime number. If 1 is treated as a prime, then we shall have no unique factorization, since we can insert any power of 1 in the factorization.
Exercise 1. Prove that $\log_{10} 2$ and $\sqrt{2}$ are irrational numbers.
Exercise 2. Let
$\log_{10} \frac{1025}{1024} = a$, $\log_{10} \frac{1024^2}{1023 \cdot 1025} = b$, $\log_{10} \frac{81^2}{80 \cdot 82} = c$, $\log_{10} \frac{125^2}{124 \cdot 126} = d$, $\log_{10} \frac{99^2}{98 \cdot 100} = e$.
Show that
$196 \log_{10} 2 = 59 + 5a + 8b - 3c - 8d + 4e$.
Express $\log_{10} 3$ and $\log_{10} 41$ in terms of $a$, $b$, $c$, $d$, $e$. Determine $\log_{10} 2$ to ten decimal places and discuss the practical application of the method. (Given $\log_e 10 = 2.3025850930$.)
1.6 The Greatest Common Factor and the Least Common Multiple
Let $x_1, \dots, x_n$ be any $n$ numbers. We denote by $\min(x_1, \dots, x_n)$ and $\max(x_1, \dots, x_n)$ the least and the greatest numbers among $x_1, \dots, x_n$ respectively. The following theorem is clear.
Theorem 6.1. Let $a$, $b$ be two positive integers with prime divisors $p_1, \dots, p_s$ so that we can write
$a = p_1^{a_1} \cdots p_s^{a_s}$, $b = p_1^{b_1} \cdots p_s^{b_s}$, $a_\nu \ge 0$, $b_\nu \ge 0$, $p_1 < p_2 < \cdots < p_s$.
Then
$(a, b) = p_1^{c_1} \cdots p_s^{c_s}$, $c_\nu = \min(a_\nu, b_\nu)$. □
Definition. Let $a$, $b$ be two positive integers. Integers which are divisible by both $a$ and $b$ are called common multiples of $a$ and $b$. The least of all the positive common multiples is called the least common multiple of $a$ and $b$. Since $ab$ is certainly a positive common multiple, the least common multiple always exists.
Theorem 6.2. Under the hypothesis of Theorem 6.1, the least common multiple of $a$, $b$ is given by
$e = p_1^{d_1} \cdots p_s^{d_s}$, $d_\nu = \max(a_\nu, b_\nu)$.
Proof. Clearly both $a$ and $b$ divide $e$. Moreover, if $e'$ is divisible by $a$, and $m_\nu$ denotes the power of $p_\nu$ in the factorization of $e'$, then $a_\nu \le m_\nu$. Therefore, if $e'$ is divisible by both $a$ and $b$, then $a_\nu \le m_\nu$ and $b_\nu \le m_\nu$, and hence $\max(a_\nu, b_\nu) \le m_\nu$. Therefore $e \mid e'$ and the theorem is proved. □
Theorem 6.3. Any common multiple of $a$, $b$ is a multiple of the least common multiple. □
Theorem 6.4. Let $[a, b]$ denote the least common multiple of $a$, $b$. Then
$[a, b](a, b) = ab$.
Proof. Let
$a = p_1^{a_1} \cdots p_s^{a_s}$, $b = p_1^{b_1} \cdots p_s^{b_s}$, $p_1 < p_2 < \cdots < p_s$.
Then
$(a, b) = p_1^{\min(a_1, b_1)} \cdots p_s^{\min(a_s, b_s)}$.
Also
$[a, b] = p_1^{\max(a_1, b_1)} \cdots p_s^{\max(a_s, b_s)}$.
Since we always have
$x + y = \max(x, y) + \min(x, y)$,
the theorem is proved. □
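Theorems 6.1, 6.2 and 6.4 can be checked on small numbers by working directly with prime exponents. The following sketch is illustrative only; the helper names are assumptions, and trial division is used for the factorization, so it is suitable only for small inputs.

```python
from collections import Counter


def prime_exponents(n: int) -> Counter:
    """Map each prime p dividing n to its exponent in n (n >= 1)."""
    exponents = Counter()
    p = 2
    while p * p <= n:
        while n % p == 0:
            exponents[p] += 1
            n //= p
        p += 1
    if n > 1:
        exponents[n] += 1
    return exponents


def gcd_and_lcm(a: int, b: int) -> tuple[int, int]:
    """Return ((a, b), [a, b]) using min and max of the prime exponents."""
    ea, eb = prime_exponents(a), prime_exponents(b)
    g = l = 1
    for p in set(ea) | set(eb):
        g *= p ** min(ea[p], eb[p])   # Theorem 6.1 (a missing prime has exponent 0)
        l *= p ** max(ea[p], eb[p])   # Theorem 6.2
    return g, l


if __name__ == "__main__":
    g, l = gcd_and_lcm(360, 84)
    print(g, l, g * l == 360 * 84)  # Theorem 6.4: [a, b](a, b) = ab
```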
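We now define inductively the greatest common factor and the least common multiple of $n$ integers as follows: Let $a_1, \dots, a_n$ be integers. The greatest common factor is the number
$(a_1, \dots, a_n) = ((a_1, \dots, a_{n-1}), a_n)$
and the least common multiple is the number
$[a_1, \dots, a_n] = [[a_1, \dots, a_{n-1}], a_n]$.
Theorem 6.5. Let
$a_i = p_1^{a_{i1}} \cdots p_s^{a_{is}}$ $(1 \le i \le n)$, $p_1 < p_2 < \cdots < p_s$.
Then
$(a_1, \dots, a_n) = p_1^{c_1} \cdots p_s^{c_s}$, $c_j = \min(a_{1j}, \dots, a_{nj})$,
$[a_1, \dots, a_n] = p_1^{d_1} \cdots p_s^{d_s}$, $d_j = \max(a_{1j}, \dots, a_{nj})$.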
Exercise 1. Prove the following two equations
Exercise 2. Prove the following two equations
$[a_1, \dots, a_n] = \frac{a_1 a_2 \cdots a_n}{(a_2 \cdots a_n,\ a_1 a_3 \cdots a_n,\ \dots,\ a_1 \cdots a_{n-1})}$
Exercise 3. Let $a_1, \dots, a_n$ be $n$ integers. Then $(a_1, \dots, a_n)$ is the least positive integer belonging to the modulus of integers of the form $a_1 x_1 + \cdots + a_n x_n$.
Exercise 4. Find $x$, $y$, $z$ such that
$6x + 15y + 20z = 17$.
Exercise 5. (Chinese publication (1372).) There is a certain sum of money in yuan. On division by seventy-seven, there is a remainder of negative fifty. On division by seventy-eight, there is no remainder. How much money is there? (Answer: 2106 yuan.)
1.7 The Inclusion-Exclusion Principle
Theorem 7.1. Let there be $N$ objects, and suppose that $N_\alpha$ of them possess the property $\alpha$, $N_\beta$ of them possess the property $\beta$, ..., $N_{\alpha\beta}$ of them possess both the properties $\alpha$ and $\beta$, ..., $N_{\alpha\beta\gamma}$ of them possess the three properties $\alpha$, $\beta$ and $\gamma$, .... Then the number of objects which do not possess any of the properties $\alpha, \beta, \gamma, \dots$ is given by
$N - N_\alpha - N_\beta - \cdots + N_{\alpha\beta} + \cdots - N_{\alpha\beta\gamma} - \cdots + \cdots - \cdots$. (A)
Proof. Let $P$ be an object which possesses $k$ of the properties $\alpha, \beta, \dots$. Then $P$ occurs exactly once in the full set of $N$ objects, $k$ times in the enumeration of the $N_\alpha, N_\beta, \dots$ objects,
$\binom{k}{2} = \frac{1}{2}k(k-1)$
times in the enumeration of the $N_{\alpha\beta}, \dots$ objects,
$\binom{k}{3} = \frac{1}{6}k(k-1)(k-2)$
times in the enumeration of the $N_{\alpha\beta\gamma}, \dots$ objects, .... If $k \ge 1$, then the number of times $P$ occurs in the enumeration (A) is
$1 - \binom{k}{1} + \binom{k}{2} - \binom{k}{3} + \cdots = (1 - 1)^k = 0$.
But if $k = 0$, then $P$ is one of those objects which do not possess any of the properties $\alpha, \beta, \gamma, \dots$, and it occurs exactly once in the enumeration (A). The theorem is proved. □
We now apply this principle as follows: For "property $\alpha$" we mean "not exceeding $a$", ....
Theorem 7.2. Let $a, b, \dots, k, l$ be nonnegative numbers. Then we have
$\max(a, b, \dots, k, l) = a + b + \cdots + k + l - \min(a, b) - \cdots - \min(k, l) + \min(a, b, c) + \cdots - \cdots + \cdots \pm \min(a, b, \dots, k, l)$.
Proof. We take the first $N$ $(> \max(a, b, \dots, k, l))$ positive integers. The number of integers without the properties $\alpha, \beta, \dots$ is $N - \max(a, b, \dots, k, l)$. The required result follows from Theorem 7.1. □
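The alternating sum in Theorem 7.2 is easy to evaluate by machine for small lists. The sketch below is an illustration only (the function name is not from the text): it sums $\pm\min$ over all nonempty sub-collections, with sign $+$ for an odd number of elements and $-$ for an even number.

```python
from itertools import combinations


def max_by_inclusion_exclusion(values: list[int]) -> int:
    """Evaluate the right-hand side of Theorem 7.2 for a nonempty list."""
    total = 0
    for size in range(1, len(values) + 1):
        sign = 1 if size % 2 == 1 else -1
        for subset in combinations(values, size):
            total += sign * min(subset)
    return total


if __name__ == "__main__":
    sample = [3, 7, 2, 7, 5]
    print(max_by_inclusion_exclusion(sample), max(sample))
```

The number of terms grows like $2^n$, so this is a check of the identity rather than a practical way to compute a maximum.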
Theorem 7.1 can also be used to prove the following two theorems:
Theorem 7.3.
$[a_1, \dots, a_n] = a_1 \cdots a_n (a_1, a_2)^{-1} \cdots (a_{n-1}, a_n)^{-1} (a_1, a_2, a_3) \cdots (a_1, \dots, a_n)^{(-1)^{n+1}}$. □
Theorem 7.4.
$(a_1, \dots, a_n) = a_1 \cdots a_n [a_1, a_2]^{-1} \cdots [a_{n-1}, a_n]^{-1} [a_1, a_2, a_3] \cdots [a_1, \dots, a_n]^{(-1)^{n+1}}$. □
Note: Exercises 1 and 2 in §1.6, and Theorems 7.3 and 7.4 establish a "principle of duality" whereby ( ) and [ ] can be interchanged.
Exercise. Let $a, b, \dots, k, l$ be positive integers. Determine the number of integers in $1, 2, \dots, n$ which are coprime with $a, b, \dots, k, l$.
1.8 Linear Indeterminate Equations
From Theorem 4.4 we have at once:
Theorem 8.1. A necessary and sufficient condition for the equation
$ax + by = n$
to have integer solutions in $x$, $y$ is that $(a, b) \mid n$. □
Theorem 8.2. Let $(a, b) = 1$, and $x_0$, $y_0$ be a set of solutions to
$ax + by = n$. (1)
Then every set of solutions to (1) is given by
$x = x_0 + bt$, $y = y_0 - at$.
Moreover, given any integer $t$, these are solutions to (1).
Proof. From $ax + by = n$ and $ax_0 + by_0 = n$ we have $a(x - x_0) + b(y - y_0) = 0$. Since $(a, b) = 1$ we deduce that $a \mid y - y_0$. Let $y = y_0 - at$, so that $x = x_0 + bt$. The required result follows from substituting these into (1). □
Theorem 8.3. Let $(a, b) = 1$, $a > 0$, $b > 0$. Then every integer greater than $ab - a - b$ is representable as $ax + by$ ($x \ge 0$, $y \ge 0$). Moreover, $ab - a - b$ is not representable as such.
Proof. From Theorem 8.2 we know that the solutions to the equation $n = ax + by$ take the form
$x = x_0 + bt$, $y = y_0 - at$.
We now select $t$ so that $x$ and $y$ are nonnegative. We can choose $t$ so that $0 \le y_0 - at < a$, or $0 \le y_0 - at \le a - 1$. From the hypothesis, we have
$(x_0 + bt)a = n - (y_0 - at)b > ab - a - b - (a - 1)b = -a$,
or
$x_0 + bt > -1$,
so that
$x = x_0 + bt \ge 0$.
Finally, suppose if possible that
$ab - a - b = ax + by$, $x \ge 0$, $y \ge 0$.
Then we have
$ab = (x + 1)a + (y + 1)b$.
Since $(a, b) = 1$, it follows that $a \mid y + 1$, $b \mid x + 1$, so that $y + 1 \ge a$, $x + 1 \ge b$ and hence
$ab = (x + 1)a + (y + 1)b \ge 2ab$,
which is impossible. □
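Theorem 8.3 can be checked by brute force for small coprime pairs. The sketch below is illustrative and its function names are assumptions. The search for non-representable integers is limited to $0 \le n < ab$, which relies on the theorem itself for the upper bound; the final loop in the usage part tests a stretch of larger integers separately.

```python
from math import gcd


def is_representable(n: int, a: int, b: int) -> bool:
    """True if n = a*x + b*y for some integers x >= 0, y >= 0 (a, b > 0)."""
    if n < 0:
        return False
    return any((n - a * x) % b == 0 for x in range(n // a + 1))


def largest_non_representable(a: int, b: int) -> int:
    """Largest integer below a*b that is not of the form a*x + b*y with x, y >= 0."""
    if a <= 0 or b <= 0 or gcd(a, b) != 1:
        raise ValueError("a and b must be positive and coprime")
    missing = [n for n in range(a * b) if not is_representable(n, a, b)]
    return max(missing, default=-1)  # -1 when every n >= 0 is representable


if __name__ == "__main__":
    for a, b in [(3, 5), (4, 7), (5, 11)]:
        print(a, b, largest_non_representable(a, b), a * b - a - b)
    # every integer in a range above ab - a - b should be representable
    print(all(is_representable(n, 4, 7) for n in range(18, 200)))
```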
The above theorem can be interpreted as follows: If $a > 0$, $b > 0$, $(a, b) = 1$, then $ab - a - b$ is the largest integer not representable as $ax + by$ ($x \ge 0$, $y \ge 0$). We can generalize this to the following problem: Let $a$, $b$, $c$ be three positive integers satisfying $(a, b, c) = 1$. Determine the largest integer not representable as $ax + by + cz$ ($x \ge 0$, $y \ge 0$, $z \ge 0$). This is an unsolved problem.
Exercise 1. Let $a > 0$, $b > 0$ and $(a, b) = 1$. Then the number of nonnegative solutions to the equation $ax + by = n$ is equal to
$\left[\frac{n}{ab}\right]$ or $\left[\frac{n}{ab}\right] + 1$.
(Hint: $[\alpha] - [\beta] = [\alpha - \beta]$ or $[\alpha - \beta] + 1$.)
Exercise 2. Let $a$, $b$, $c$ be positive integers satisfying $(a, b) = (b, c) = (c, a) = 1$. Determine the largest integer not representable as
$bcx + cay + abz$, $x \ge 0$, $y \ge 0$, $z \ge 0$.
(Answer: $2abc - ab - bc - ca$.)
Exercise 3. Determine the number of solutions to
$x + 2y + 3z = n$, $x \ge 0$, $y \ge 0$, $z \ge 0$.
(Hint: The required number is the coefficient of $x^n$ in the power series expansion for
$\frac{1}{(1 - x)(1 - x^2)(1 - x^3)}$.
The power series can be obtained by the method of partial fractions. Answer:
$\frac{(n + 3)^2}{12} - \frac{7}{72} + \frac{(-1)^n}{8} + \frac{2}{9}\cos\frac{2n\pi}{3}$.)
Exercise 4. (Ancient Chinese publication.) Cockerel one, five cents; chicken one, three cents; baby chicks three, one cent. One hundred cents are paid for one hundred birds. How many cockerels, chickens and baby chicks are there?
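1.9 Perfect Numbers
Theorem 9.1. Let $\sigma(n)$ denote the sum of the divisors of $n$. If $n = p_1^{a_1} \cdots p_s^{a_s}$, then
$\sigma(n) = \frac{p_1^{a_1 + 1} - 1}{p_1 - 1} \cdots \frac{p_s^{a_s + 1} - 1}{p_s - 1}$.
Proof. All the divisors of $n$ are of the form
$p_1^{x_1} \cdots p_s^{x_s}$, $0 \le x_\nu \le a_\nu$.
Therefore we have
$\sigma(n) = \sum_{x_1 = 0}^{a_1} \cdots \sum_{x_s = 0}^{a_s} p_1^{x_1} \cdots p_s^{x_s} = \sum_{x_1 = 0}^{a_1} p_1^{x_1} \sum_{x_2 = 0}^{a_2} p_2^{x_2} \cdots \sum_{x_s = 0}^{a_s} p_s^{x_s} = \frac{p_1^{a_1 + 1} - 1}{p_1 - 1} \cdots \frac{p_s^{a_s + 1} - 1}{p_s - 1}$. □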
An immediate consequence of this theorem is:
Theorem 9.2. If $(m, n) = 1$, then $\sigma(mn) = \sigma(m)\sigma(n)$. □
Note: $\sigma(n)$ is called an arithmetic function. An arithmetic function possessing the property of Theorem 9.2 is called a multiplicative function.
Definition. A positive integer $n$ is called a perfect number if $\sigma(n) = 2n$.
Examples of perfect numbers are:
$6 = 1 + 2 + 3$, $28 = 1 + 2 + 4 + 7 + 14$.
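The product formula of Theorem 9.1 gives a direct way to compute $\sigma(n)$ and to test small numbers for perfection. The following is a minimal sketch under the assumption that trial division is fast enough for the inputs used; the names `sigma` and `is_perfect` are chosen here for illustration.

```python
def sigma(n: int) -> int:
    """Sum of all positive divisors of n, via the product formula of Theorem 9.1."""
    total = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            a = 0
            while n % p == 0:
                n //= p
                a += 1
            total *= (p ** (a + 1) - 1) // (p - 1)  # 1 + p + ... + p^a
        p += 1
    if n > 1:
        total *= n + 1  # one remaining prime factor, to the first power
    return total


def is_perfect(n: int) -> bool:
    """True if sigma(n) = 2n."""
    return n > 0 and sigma(n) == 2 * n


if __name__ == "__main__":
    print([n for n in range(1, 10000) if is_perfect(n)])
    print(sigma(15) == sigma(3) * sigma(5))  # Theorem 9.2 with (3, 5) = 1
```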
Theorem 9.3. Let $p = 2^n - 1$ be prime. Then
$\frac{1}{2}p(p + 1) = 2^{n-1}(2^n - 1)$
is perfect. Moreover, every even perfect number is of this form.
Proof. 1) From Theorem 9.1 we have
$\sigma\left(\frac{1}{2}p(p + 1)\right) = \frac{2^n - 1}{2 - 1} \cdot \frac{p^2 - 1}{p - 1} = (2^n - 1)(p + 1) = p(p + 1)$.
2) Let $a$ be any even perfect number. Set
$a = 2^{n-1}u$, $n > 1$, $2 \nmid u$.
Then, by Theorem 9.2,
$2^n u = 2a = \sigma(a) = \frac{2^n - 1}{2 - 1}\sigma(u)$,
and so
$\sigma(u) = \frac{2^n u}{2^n - 1} = u + \frac{u}{2^n - 1}$.
But $u$ and $u/(2^n - 1)$ are both divisors of $u$. Since $\sigma(u)$ is the sum of all the divisors of $u$, it follows that $u$ has only two divisors, so that $u$ is prime and $u/(2^n - 1) = 1$. The theorem is proved. □
Exercise 1. Verify that $\sigma(m) = \sigma(n) = m + n$ has the following three solutions $(m, n)$: ...; (9363584, 9437056).
Exercise 2. Prove that if a positive integer is the product of its proper divisors, then it must be a cube of a prime or a product of two distinct primes.
1.10 Mersenne Numbers and Fermat Numbers
Whether there exists an odd perfect number is a famous difficult problem. From the previous section we see that the determination of even perfect numbers is reduced to the determination of Mersenne primes, that is prime numbers of the form $2^n - 1$, since there is now a one-to-one correspondence between Mersenne primes and even perfect numbers. Whether there exist infinitely many Mersenne primes is another difficult unsolved problem in number theory.
Theorem 10.1. If $n > 1$ and $a^n - 1$ is prime, then $a = 2$ and $n$ is prime.
Proof. If $a > 2$, then $(a - 1) \mid (a^n - 1)$ so that $a^n - 1$ cannot be prime. Again, if $a = 2$ and $n = kl$, where $k$ is a proper divisor of $n$, then $(2^k - 1) \mid (2^n - 1)$ so that $2^n - 1$ cannot be prime. □
The problem of the primality of $2^n - 1$ is thus reduced to that of $2^p - 1$ where $p$ is prime. We usually write
$M_p = 2^p - 1$
for a Mersenne prime. Up to the present (1981) $M_p$ has been proved prime for
$p$ = 2, 3, 5, 7, 13, 17, 19, 31, 61, 89, 107, 127, 521, 607, 1279, 2203, 2281, 3217, 4253, 4423, 9689, 9941, 11213, 19937, 21701, 23209, 44497
so that there are 27 perfect numbers known to us.
Similarly to the Mersenne numbers, there are the so-called Fermat numbers.
Theorem 10.2.
If $2^m + 1$ is prime, then $m = 2^n$.
Proof. If $m = qr$, where $q > 1$ is an odd divisor of $m$, then we have
$2^{qr} + 1 = (2^r)^q + 1 = (2^r + 1)(2^{r(q-1)} - \cdots + 1)$
and $1 < 2^r + 1 < 2^{qr} + 1$, so that $2^m + 1$ cannot be prime. □
Let
$F_n = 2^{2^n} + 1$.
We call $F_n$ a Fermat number, and the first five Fermat numbers
$F_0 = 3$, $F_1 = 5$, $F_2 = 17$, $F_3 = 257$, $F_4 = 65537$
are all primes. On this evidence Fermat conjectured that $F_n$ is prime for all $n$. However, in 1732, Euler showed that
$F_5 = 2^{2^5} + 1 = 641 \times 6700417$
so that Fermat's conjecture is false.
Note: The divisibility of $F_5$ by 641 can be proved as follows: Let $a = 2^7$, $b = 5$ so that $a - b^3 = 3$, $1 + ab - b^4 = 1 + 3b = 2^4$. Therefore
$2^{2^5} + 1 = (2a)^4 + 1 = (1 + ab - b^4)a^4 + 1 = (1 + ab)a^4 + (1 - a^4 b^4)$
and this must be divisible by $1 + ab = 2^4 + 5^4 = 641$.
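The statements about the first Fermat numbers are small enough to verify directly. The sketch below is illustrative: `fermat_number` and `is_prime` are names chosen here, and primality is tested by plain trial division, which is adequate only up to $F_4$.

```python
def fermat_number(n: int) -> int:
    """F_n = 2^(2^n) + 1."""
    return 2 ** (2 ** n) + 1


def is_prime(n: int) -> bool:
    """Trial division; adequate for the small values used here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


if __name__ == "__main__":
    print([fermat_number(n) for n in range(5)])
    print(all(is_prime(fermat_number(n)) for n in range(5)))
    print(fermat_number(5) == 641 * 6700417)       # Euler's factorization
    a, b = 2 ** 7, 5
    print(1 + a * b, 1 + a * b - b ** 4)            # the quantities in the Note
```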
It has been found that many Fermat numbers Fn are composite, but no Fermat prime has been found apart from the first five numbers. Therefore Fermat's conjecture has been a most unfortunate one, and indeed it is now conjectured that there are only finitely many Fermat primes. There is an interesting geometry problem associated with Fn , namely that Gauss proved that if Fn is prime, then a regular polygon with Fn sides can be constructed using only straight edge and compass.
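1.11 The Prime Power in a Factorial
Theorem 11.1. Let $p$ be a prime number. Then the (exact) power of $p$ that divides $n!$ is given by
$\left[\frac{n}{p}\right] + \left[\frac{n}{p^2}\right] + \left[\frac{n}{p^3}\right] + \cdots$.
(There are only finitely many nonzero terms in this series.)
Proof. From
$n! = 1 \cdot 2 \cdots (p - 1) \cdot p \cdot (p + 1) \cdots (2p) \cdots (p - 1)p \cdots p^2 \cdots$
we see that there are $\left[\frac{n}{p}\right]$ multiples of $p$, $\left[\frac{n}{p^2}\right]$ multiples of $p^2$, and so on. The theorem follows. □
Theorem 11.2. The number
$\binom{n}{r} = \frac{n!}{r!(n - r)!}$
is an integer.
Proof. We use the fact that $[\alpha] - [\beta]$ is either $[\alpha - \beta]$ or $[\alpha - \beta] + 1$. From Theorem 11.1 we see that the power of $p$ in $\binom{n}{r}$ is
$\sum_m \left(\left[\frac{n}{p^m}\right] - \left[\frac{r}{p^m}\right] - \left[\frac{n - r}{p^m}\right]\right)$,
a nonnegative integer. □
Example. If $n = 1000$, $p = 3$, then
$\left[\frac{1000}{3}\right] = 333$, $\left[\frac{1000}{3^2}\right] = \left[\frac{333}{3}\right] = 111$, $\left[\frac{1000}{3^3}\right] = 37$, $\left[\frac{1000}{3^4}\right] = 12$, $\left[\frac{1000}{3^5}\right] = 4$, $\left[\frac{1000}{3^6}\right] = 1$.
Therefore the exact power of 3 which divides 1000! is
$333 + 111 + 37 + 12 + 4 + 1 = 498$.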
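The sum in Theorem 11.1 translates directly into a short loop. This is a minimal sketch; the function name is an assumption and not from the text.

```python
def prime_power_in_factorial(n: int, p: int) -> int:
    """Exact power of the prime p dividing n!: [n/p] + [n/p^2] + [n/p^3] + ..."""
    exponent = 0
    power = p
    while power <= n:            # later terms [n/p^m] are all zero
        exponent += n // power
        power *= p
    return exponent


if __name__ == "__main__":
    print(prime_power_in_factorial(1000, 3))  # the worked example: 498
```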
Exercise 1. Determine the exact power of 7 which divides 10000!.
Exercise 2. Determine the exact power of 5 which divides $\binom{1000}{500}$.
Exercise 3. Prove that if $r + s + \cdots + t = n$, then
$\frac{n!}{r!\, s! \cdots t!}$
is an integer. Prove further that if $n$ is prime and $\max(r, s, \dots, t) < n$, then the above number is a multiple of $n$.
1.12 Integral Valued Polynomials
Definition. By an integral valued polynomial we mean a polynomial $f(x)$ in the variable $x$ which only takes integer values whenever $x$ is an integer.
Example. Polynomials with integer coefficients are integral valued polynomials. The polynomial
$\binom{x}{r} = \frac{x(x - 1) \cdots (x - r + 1)}{r!}$
is an integral valued polynomial.
We shall write $\Delta f(x)$ for $f(x + 1) - f(x)$.
Theorem 12.1. $\Delta\binom{x}{r} = \binom{x}{r - 1}$.
Proof.
$\Delta\binom{x}{r} = \frac{(x + 1)x \cdots (x - r + 2)}{r!} - \frac{x(x - 1) \cdots (x - r + 1)}{r!} = \frac{x \cdots (x - r + 2)}{r!}\bigl((x + 1) - (x - r + 1)\bigr) = \binom{x}{r - 1}$. □
Theorem 12.2. Every integral valued polynomial of degree $k$ can be written as
$a_k\binom{x}{k} + a_{k-1}\binom{x}{k-1} + \cdots + a_1\binom{x}{1} + a_0$,
where $a_k, \dots, a_0$ are integers. Moreover, given any set of integers $a_k, \dots, a_0$, the above is an integral valued polynomial.
Proof. Any polynomial $f(x)$ of degree $k$ can be written as
$f(x) = \alpha_k\binom{x}{k} + \alpha_{k-1}\binom{x}{k-1} + \cdots + \alpha_1\binom{x}{1} + \alpha_0$.
Now
$\Delta f(x) = \alpha_k\binom{x}{k-1} + \alpha_{k-1}\binom{x}{k-2} + \cdots + \alpha_1$.
Writing $\Delta^2 f(x)$ for $\Delta(\Delta f(x))$, and $\Delta^r f(x) = \Delta(\Delta^{r-1} f(x))$ we see that
$f(0) = \alpha_0$, $(\Delta f(x))_{x=0} = \alpha_1$, ..., $(\Delta^r f(x))_{x=0} = \alpha_r$, ....
If $f(x)$ is integral valued, then so are $\Delta f(x)$, $\Delta^2 f(x)$, .... Therefore $f(0)$, $(\Delta f(x))_{x=0}$, ..., $(\Delta^r f(x))_{x=0}$, ... are all integers; that is $\alpha_k, \dots, \alpha_0$ are integers. The last part of the theorem is trivial. □
The same method can be used to prove:
Theorem 12.3. Let $f(x)$ be an integral valued polynomial. A necessary and sufficient condition for $f(x)$ to be a multiple of $m$ for every integer $x$ is that
$m \mid (a_k, \dots, a_0)$,
where $a_k, \dots, a_0$ are integers given in Theorem 12.2. □
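The proof of Theorem 12.2 shows that the coefficient of $\binom{x}{r}$ is the $r$-th difference of $f$ at 0, so the coefficients can be read off from the values $f(0), f(1), \dots, f(k)$. The sketch below illustrates this; the function name and the sample polynomial are assumptions made for the example.

```python
from fractions import Fraction
from math import comb


def binomial_basis_coefficients(f, k: int) -> list:
    """Return [a_0, ..., a_k] with f(x) = sum a_r * C(x, r), for deg f <= k.

    a_r is the r-th forward difference of f evaluated at x = 0.
    """
    values = [f(x) for x in range(k + 1)]
    coefficients = []
    for _ in range(k + 1):
        coefficients.append(values[0])
        # replace the row of values by the row of its differences
        values = [values[i + 1] - values[i] for i in range(len(values) - 1)]
    return coefficients


if __name__ == "__main__":
    # sample: f(x) = x(x+1)(2x+1)/6, whose ordinary coefficients are not integers
    f = lambda x: Fraction(x * (x + 1) * (2 * x + 1), 6)
    a = binomial_basis_coefficients(f, 3)
    print(a)
    print(all(f(x) == sum(a[r] * comb(x, r) for r in range(4)) for x in range(10)))
```

Here all the differences come out as integers, in line with Theorem 12.2, even though the coefficients of the powers of $x$ are fractions.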
Theorem 12.4 (Fermat). Let $p$ be a prime number. Then, for any integer $x$, $x^p - x$ is a multiple of $p$.
Proof. If $p = 2$, then the result follows at once from $x^2 - x = x(x - 1)$. Assume therefore that $p > 2$, and let $f(x) = x^p - x$. Now $f(0) = 0$ and
$\Delta f(x) = (x + 1)^p - x^p - (x + 1) + x = \binom{p}{1}x^{p-1} + \binom{p}{2}x^{p-2} + \cdots + \binom{p}{p-1}x$,
where the coefficients (by Exercise 11.3) are all integers which are multiples of $p$. With $x = 0$, we see that $f(1)$ is a multiple of $p$; with $x = 1$, we see that $f(2)$ is a multiple of $p$; and so on. Therefore $f(x)$ is always a multiple of $p$ if $x \ge 0$. If $x$ is a negative integer, we can deduce the result from
$x^p - x = -[(-x)^p - (-x)]$.
The theorem is proved. □
Exercise 1. Generalize Theorems 12.2 and 12.3 to several variables.
Exercise 2. Prove that $n(n + 1)(2n + 1)$ is a multiple of 6.
Exercise 3. Prove that, as $m$ and $n$ run through the set of all positive integers,
$m + \frac{1}{2}(m + n - 1)(m + n - 2)$
also runs through the whole set of positive integers, and with no repetition.
Exercise 4. Prove that if a polynomial of degree $k$ takes integer values for $k + 1$ successive integers, then it must be an integral valued polynomial.
Exercise 5. If $f(-x) = -f(x)$, then we call $f(x)$ an odd polynomial. Prove that an odd integral valued polynomial can be written as
ao
X(X+l) + allx (x) 1 + a 2 "2 3 + ... + am mx(x+ml) 2m  1 '
where at. ... ,am are integers.
1.13 The Factorization of Polynomials
Theorem 13.1. Let $g(x)$ and $h(x)$ be two polynomials with integer coefficients:
$g(x) = a_l x^l + \cdots + a_0$, $a_l \ne 0$,
$h(x) = b_m x^m + \cdots + b_0$, $b_m \ne 0$,
and
$g(x)h(x) = c_{l+m}x^{l+m} + \cdots + c_0$.
Then
$(a_l, \dots, a_0)(b_m, \dots, b_0) = (c_{l+m}, \dots, c_0)$.
Proof. We may assume without loss that $(a_l, \dots, a_0) = 1$, $(b_m, \dots, b_0) = 1$. Suppose that $p \mid (c_{l+m}, \dots, c_0)$ and
$p \mid (a_l, \dots, a_{u+1})$, $p \nmid a_u$,
$p \mid (b_m, \dots, b_{v+1})$, $p \nmid b_v$.
From the definition we have
$c_{u+v} = \sum_{s+t=u+v} a_s b_t$,
and apart from the term $a_u b_v$, each term is a multiple of $p$. Since $p \nmid a_u b_v$, it follows that $p \nmid c_{u+v}$, and so $p \nmid (c_{l+m}, \dots, c_0)$, contradicting our assumption. Therefore no prime can divide $(c_{l+m}, \dots, c_0)$. □
Definition. Let $f(x)$ be a polynomial with rational coefficients. Suppose that there are two nonconstant polynomials $g(x)$ and $h(x)$ with rational coefficients such that $f(x) = g(x)h(x)$. Then $f(x)$ is said to be reducible. Irreducible means not reducible.
Example. $x^2 - 2$ and $x^2 + 1$ are irreducible polynomials, whereas $3x^2 + 8x + 4$ is reducible and the factorization is $(3x + 2)(x + 2)$.
Theorem 13.2 (Gauss). Let $f(x)$ be a polynomial with integer coefficients. If $f(x) = g(x)h(x)$ where $g(x)$ and $h(x)$ are polynomials with rational coefficients, then there exists a rational number $\gamma$ such that
$\gamma g(x)$, $\frac{1}{\gamma}h(x)$
have integer coefficients.
Proof. We may assume that the greatest common factor of the coefficients of $f(x)$ is 1. There are integers $M$, $N$ such that
$Mg(x) = a_l x^l + \cdots + a_0$, $a_i$ integer;
$Nh(x) = b_m x^m + \cdots + b_0$, $b_i$ integer;
$MNf(x) = c_{l+m}x^{l+m} + \cdots + c_0$.
From our assumption and Theorem 13.1 we have
$MN = (c_{l+m}, \dots, c_0) = (a_l, \dots, a_0)(b_m, \dots, b_0)$.
Let
$\gamma = \frac{M}{(a_l, \dots, a_0)} = \frac{(b_m, \dots, b_0)}{N}$
and the required result follows. □
Theorem 13.3 (Eisenstein). Let $f(x) = c_n x^n + \cdots + c_0$ be a polynomial with integer coefficients. If $p \nmid c_n$, $p \mid c_i$ $(0 \le i < n)$ and $p^2 \nmid c_0$, then $f(x)$ is irreducible.
Proof. Suppose, if possible, that $f(x)$ is reducible. By Theorem 13.2 we have that
$f(x) = g(x)h(x)$,
$g(x) = a_l x^l + \cdots + a_0$, $h(x) = b_m x^m + \cdots + b_0$, $l + m = n$, $l > 0$, $m > 0$,
where $a_j$ and $b_k$ are integers. From $c_0 = a_0 b_0$ and $p \mid c_0$ we see that either $p \mid a_0$ or $p \mid b_0$. Suppose that $p \mid a_0$. Then, from $p^2 \nmid a_0 b_0 = c_0$ we deduce that $p \nmid b_0$. Next, the coefficients for $g(x)$ cannot all be a multiple of $p$, since otherwise $p \mid c_n$. We can therefore suppose that $p \mid (a_0, \dots, a_{r-1})$, $p \nmid a_r$, $1 \le r \le l$. From $c_r = a_r b_0 + \cdots + a_0 b_r$ we deduce that $p \nmid c_r$. But $r \le l < n$ and so we have a contradiction. The theorem is proved. □
As a corollary we have:
Theorem 13.4. $x^m - p$ is irreducible, so that $\sqrt[m]{p}$ is an irrational number. □
Theorem 13.5. The polynomial
$\frac{x^p - 1}{x - 1} = x^{p-1} + \cdots + x + 1$
is irreducible.
Proof. Write $x = y + 1$ so that we have
$\frac{1}{y}\bigl((y + 1)^p - 1\bigr) = y^{p-1} + py^{p-2} + \binom{p}{2}y^{p-3} + \cdots + p$.
It is easy to see that each coefficient, apart from the first, is a multiple of $p$, and that the constant term is not a multiple of $p^2$. □
Exercise. Prove that the following polynomials are irreducible:
Notes
1.1. Up to the present there are 27 known Mersenne primes, namely $M_p = 2^p - 1$ where
$p$ = 2, 3, 5, 7, 13, 17, 19, 31, 61, 89, 107, 127, 521, 607, 1279, 2203, 2281, 3217, 4253, 4423, 9689, 9941, 11213, 19937, 21701, 23209, 44497.
The twelfth Mersenne prime, namely $M_{127}$, was found by Lucas in 1876 and the remaining fifteen have been found since 1952 with the aid of electronic computers. Thus $M_{44497}$ is the largest known prime with 13395 digits which was discovered in 1979 (see [54]).
1.2. It is known that any odd perfect number must (i) exceed $10^{50}$ (see [26]), (ii) have a prime factor exceeding 100110 (see [27]).
Chapter 2. Congruences
2.1 Definition
Let $m$ be a natural number. If $a - b$ is a multiple of $m$, then we say that $a$ and $b$ are congruent mod $m$, and we write $a \equiv b \pmod{m}$. If $a$, $b$ are not congruent mod $m$, then we write $a \not\equiv b \pmod{m}$.
Example. $31 \equiv -9 \pmod{10}$. If $a$, $b$ are integers, then we always have $a \equiv b \pmod{1}$.
The notion of congruence occurs frequently and even in our daily lives; for example we may consider the days of the week as a congruence problem with modulus 7. Again in the ancient calendar in our country we count the years with respect to the modulus 60. Indeed our country made some significant contribution to the theory of congruence. For example, the Chinese remainder theorem originates from ancient publications concerning solutions to problems such as the following:
There is a certain number. When divided by three this number has remainder two; when divided by five, it has remainder three; when divided by seven, it has remainder two. What is the number?
With our notation here, the number concerned is an integer $x$ such that
$x \equiv 2 \pmod{3}$, $x \equiv 3 \pmod{5}$, $x \equiv 2 \pmod{7}$.
The problem is therefore a problem of the solutions to simultaneous congruences.
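For moduli as small as 3, 5 and 7 the ancient problem can be settled by direct search. The sketch below is not the method of the Chinese remainder theorem itself, only an exhaustive check over one full period; the function name is an assumption.

```python
def solve_congruences(residues: list[int], moduli: list[int]) -> list[int]:
    """All x with 0 <= x < product of moduli satisfying x ≡ r_i (mod m_i) for every i."""
    period = 1
    for m in moduli:
        period *= m
    return [
        x for x in range(period)
        if all((x - r) % m == 0 for r, m in zip(residues, moduli))
    ]


if __name__ == "__main__":
    print(solve_congruences([2, 3, 2], [3, 5, 7]))
    print((31 - (-9)) % 10 == 0)  # the example 31 ≡ -9 (mod 10)
```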
2.2 Fundamental Properties of Congruences
Theorem 2.1. (i) $a \equiv a \pmod{m}$ (reflexive); (ii) If $a \equiv b \pmod{m}$, then $b \equiv a \pmod{m}$ (symmetric); (iii) If $a \equiv b$, $b \equiv c \pmod{m}$, then $a \equiv c \pmod{m}$ (transitive). □
These three properties show that being congruent is an equivalence relation. The set of integers can then be partitioned into equivalence classes so that integers in each class are congruent among themselves, and two integers from different classes are not congruent. We call these equivalence classes residue classes. It is clear that, for the modulus $m$, we have precisely $m$ residue classes: the classes whose members have remainder $r = 0, 1, 2, \dots, m - 1$ when divided by $m$. If we select one member from each residue class, then the set of numbers formed is called a complete residue system.
Theorem 2.2. If $a \equiv b$, $a_1 \equiv b_1 \pmod m$, then we have
$$a + a_1 \equiv b + b_1, \qquad a - a_1 \equiv b - b_1, \qquad aa_1 \equiv bb_1 \pmod m. \qquad \square$$
Theorem 2.2 has the following interpretation: Let $A$, $B$ be any two residue classes from which we select any representatives $a$, $b$. Denote by $C$ the residue class which contains $a + b$ (or $a - b$ or $ab$). Then $C$ depends on $A$, $B$ but not on the representatives $a$, $b$. In other words, the sum of any two integers from $A$, $B$ must belong to $C$. We can therefore define $C$ to be the sum of the two classes $A$, $B$ and we denote it by $C = A + B$. Similarly we can define $A - B$ and $A \cdot B$. We see from Theorem 2.2 that, with respect to residue classes mod $m$, the operations of addition, subtraction and multiplication are closed. We note that division is not always possible; for example $3 \cdot 2 \equiv 1 \cdot 2$, $2 \equiv 2 \pmod 4$, but $3 \not\equiv 1 \pmod 4$. However we do have the following:

Theorem 2.3. If $ac \equiv bd$, $c \equiv d \pmod m$ and $(c, m) = 1$, then $a \equiv b \pmod m$.

Proof. From $(a - b)c + b(c - d) = ac - bd \equiv 0 \pmod m$, we have $m \mid (a - b)c$. But $(c, m) = 1$, so that $m \mid a - b$. □
We denote by $O$ the residue class of all multiples of $m$. Then $A + O = A$ and $A \cdot O = O$. Again, if we let $I$ be the residue class of integers with remainder 1 when divided by $m$, then $A \cdot I = A$. From our example and Theorem 2.3 we see that from $A \cdot B = A \cdot C$ we may not deduce that $B = C$; but if the members of $A$ are coprime with $m$ (Note: if $A$ has one member which is coprime with $m$, then every member must also be coprime with $m$), then we have $B = C$. If we take $m$ to be a prime number, then apart from the class $O$, every class is coprime with $m$. Therefore, for a prime modulus, the operations of addition, subtraction, multiplication and division are closed, except that we cannot divide by the class $O$.
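2.3 Reduced Residue System
As we said earlier, if a residue class $A$ contains an element which is coprime with $m$, then every element of $A$ is coprime with $m$, and we call $A$ a class coprime with $m$. If $A$ and $m$ are coprime, then we can, by Theorem 2.3, define $B/A$. In particular, we write $A^{-1}$ for $I/A$. For example:
$$\begin{array}{c|ccccc} A & 0 & 1 & 2 & 3 & 4 \\ \hline A^{-1} & \times & 1 & 3 & 2 & 4 \end{array} \pmod 5$$
$$\begin{array}{c|cccccc} A & 0 & 1 & 2 & 3 & 4 & 5 \\ \hline A^{-1} & \times & 1 & \times & \times & \times & 5 \end{array} \pmod 6$$
$$\begin{array}{c|ccccccc} A & 0 & 1 & 2 & 3 & 4 & 5 & 6 \\ \hline A^{-1} & \times & 1 & 4 & 5 & 2 & 3 & 6 \end{array} \pmod 7$$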
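The inverse tables can be regenerated mechanically. The following is a minimal sketch, assuming Python 3.8 or later (for `pow` with a negative exponent); the function name `inverse_table` is illustrative and not part of the text.

```python
from math import gcd

def inverse_table(m):
    """Map each residue a = 0..m-1 to its inverse mod m, or None when (a, m) != 1."""
    table = {}
    for a in range(m):
        # pow(a, -1, m) computes the modular inverse; it only exists when (a, m) = 1
        table[a] = pow(a, -1, m) if gcd(a, m) == 1 else None
    return table

for m in (5, 6, 7):
    row = ["x" if v is None else str(v) for v in inverse_table(m).values()]
    print(f"mod {m}:", " ".join(row))
```

An "x" in the printed row plays the role of the sign "×" (undefined) in the tables.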
The sign "×" in the table means "undefined".

Definition. We denote by $\varphi(m)$ the number of residue classes (mod $m$) coprime with $m$. This function $\varphi(m)$ is called Euler's function. If we select one member of each residue class coprime with $m$:
$$a_1, a_2, \ldots, a_{\varphi(m)},$$
then we call this set of integers a reduced residue system.

Example. $\varphi(1) = 1$, $\varphi(2) = 1$, $\varphi(3) = 2$, $\varphi(4) = 2$.
We may also describe $\varphi(m)$ as the number of positive integers not exceeding $m$ and coprime with $m$. If $m = p$ is a prime, then $\varphi(p) = p - 1$.

Theorem 3.1. Let $a_1, a_2, \ldots, a_{\varphi(m)}$ be a reduced residue system, and suppose that $(k, m) = 1$. Then $ka_1, ka_2, \ldots, ka_{\varphi(m)}$ is also a reduced residue system.

Proof. Clearly we have $(ka_i, m) = 1$, so that each $ka_i$ represents a residue class coprime with $m$. If $ka_i \equiv ka_j \pmod m$, then, since $(k, m) = 1$, we have $a_i \equiv a_j \pmod m$. Therefore the members $ka_i$ represent distinct residue classes. The theorem is proved. □
Theorem 3.2 (Euler). If $(k, m) = 1$, then $k^{\varphi(m)} \equiv 1 \pmod m$.

Proof. From Theorem 3.1 we have
$$\prod_{v=1}^{\varphi(m)} (ka_v) \equiv \prod_{v=1}^{\varphi(m)} a_v \pmod m.$$
Since $(m, a_i) = 1$, it follows that $k^{\varphi(m)} \equiv 1 \pmod m$. □

Taking $m = p$ we have Fermat's theorem (Theorem 1.12.4).
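Theorems 3.1 and 3.2 are easy to check numerically for small moduli. The sketch below counts coprime residues directly; the names `phi` and `reduced_residues` and the sample values $m = 20$, $k = 7$ are assumptions chosen for illustration.

```python
from math import gcd

def reduced_residues(m):
    """The reduced residue system consisting of 1 <= a <= m with (a, m) = 1."""
    return [a for a in range(1, m + 1) if gcd(a, m) == 1]

def phi(m):
    """Euler's function, computed by direct counting (fine for small m)."""
    return len(reduced_residues(m))

m, k = 20, 7                     # illustrative values with (k, m) = 1
rrs = reduced_residues(m)
# Theorem 3.1: multiplication by k permutes the reduced residue classes
print(sorted(k * a % m for a in rrs) == rrs)
# Theorem 3.2: k^phi(m) = 1 (mod m)
print(pow(k, phi(m), m) == 1)
```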
Theorem 3.3. Let $p$ be a prime. Then, for all integers $a$, we have $a^p \equiv a \pmod p$. □

2.4 The Divisibility of $2^{p-1} - 1$ by $p^2$
In 1828 Abel asked whether there are primes $p$ and integers $a$ such that $a^{p-1} \equiv 1 \pmod{p^2}$. According to Jacobi, if $p \le 37$, then the above has the solutions $p = 11$, $a = 3$ or $9$; $p = 29$, $a = 14$; and $p = 37$, $a = 18$. Recent research work on Fermat's last theorem has added some impetus to this problem. We have the following result concerning Fermat's last theorem: Let $p$ be an odd prime. If there are integers $x, y, z$ such that $x^p + y^p + z^p = 0$, $p \nmid xyz$, then
$$2^{p-1} \equiv 1 \pmod{p^2} \tag{1}$$
and
$$3^{p-1} \equiv 1 \pmod{p^2}, \tag{2}$$
and more recently we know also that $n^{p-1} \equiv 1 \pmod{p^2}$ for $n = 2, 3, \ldots, 47$. We do not know if there exists a prime $p$ such that both (1) and (2) hold.

Definition. If $a^{p-1} \equiv 1 \pmod{p^2}$, then we call $a$ a Fermat solution. It is clear that the product of two Fermat solutions is a Fermat solution, and the product of a Fermat solution and a non-Fermat solution is a non-Fermat solution. In the prime factorization of a non-Fermat solution there must be a prime divisor which is a non-Fermat solution.
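The definition translates directly into a one-line test. The following sketch uses the hypothetical helper names `is_fermat_solution` and `fermat_solutions_below_p`; it recovers the solutions $a = 3, 9$ quoted for $p = 11$ and confirms the quoted pairs for $p = 29$ and $p = 37$.

```python
def is_fermat_solution(a, p):
    """True when a^(p-1) = 1 (mod p^2)."""
    return pow(a, p - 1, p * p) == 1

def fermat_solutions_below_p(p):
    """Fermat solutions a in the range 2 <= a <= p - 1."""
    return [a for a in range(2, p) if is_fermat_solution(a, p)]

print(fermat_solutions_below_p(11))          # [3, 9]
print(is_fermat_solution(14, 29), is_fermat_solution(18, 37))
# The product of two Fermat solutions is again a Fermat solution:
print(is_fermat_solution(3 * 9, 11))
```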
Theorem 4.1. Let $a$, $b$ be two Fermat solutions with respect to $p$. Then there does not exist $q$ such that $qp = a \pm b$, $p \nmid q$.

Proof. From the definition we have
$$a^p \equiv a, \qquad b^p \equiv b \pmod{p^2}. \tag{3}$$
If $qp = a \pm b$, $p \nmid q$, then $a^p = (\mp b + qp)^p \equiv \mp b^p \pmod{p^2}$, giving $a^p \pm b^p \equiv 0 \pmod{p^2}$. Substituting this into (3) yields $a \pm b = qp \equiv 0 \pmod{p^2}$, which is a contradiction. □

Theorem 4.2. 3 is a Fermat solution with respect to 11.

Proof. We have $3^5 = 243 \equiv 1 \pmod{11^2}$ so that $3^{10} \equiv 1 \pmod{11^2}$. □
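Theorem 4.3. 2 is a Fermat solution with respect to 1093.

Proof. Let $p = 1093$. Then $3^7 = 2187 = 2p + 1$, so that
$$3^{14} \equiv 4p + 1 \pmod{p^2}; \tag{4}$$
also $2^{14} = 16384 = 15p - 11$, $2^{28} \equiv -330p + 121 \pmod{p^2}$, so that
$$3^2 \cdot 2^{28} \equiv -2970p + 1089 \equiv -2969p - 4 \equiv 310p - 4 \pmod{p^2},$$
$$3^2 \cdot 2^{28} \cdot 7 \equiv 2170p - 28 \equiv -16p - 28 \pmod{p^2}.$$
Therefore
$$3^2 \cdot 2^{26} \cdot 7 \equiv -4p - 7 \pmod{p^2}.$$
From the binomial theorem we have
$$3^{14} \cdot 2^{182} \cdot 7^7 \equiv (-4p - 7)^7 \equiv -7 \cdot 4p \cdot 7^6 - 7^7 \pmod{p^2},$$
and hence
$$3^{14} \cdot 2^{182} \equiv -4p - 1 \pmod{p^2}. \tag{5}$$
From (4) and (5) we have
$$2^{182} \equiv -1 \pmod{p^2}.$$
Therefore
$$2^{1092} \equiv 1 \pmod{p^2}. \qquad \square$$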
Theorem 4.4. 3 is a non-Fermat solution with respect to 1093.

Proof. If 3 were a Fermat solution, then so would $3^7$ be one. Since $-1$ is clearly a Fermat solution, and $3^7 - 1 = 2p$, we obtain the required contradiction from Theorem 4.1. □
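The arithmetic behind Theorems 4.3 and 4.4 can be confirmed with modular exponentiation. This is only a numerical check of the stated congruences, not a replacement for the proofs; the helper name `is_fermat_solution` is an assumption.

```python
def is_fermat_solution(a, p):
    """True when a^(p-1) = 1 (mod p^2)."""
    return pow(a, p - 1, p * p) == 1

p = 1093
p2 = p * p
print(3**7 == 2 * p + 1)                 # the starting identity of the proof
print(2**14 == 15 * p - 11)
print(pow(2, 182, p2) == p2 - 1)         # 2^182 = -1 (mod p^2)
print(is_fermat_solution(2, p))          # Theorem 4.3
print(is_fermat_solution(3, p))          # Theorem 4.4 says this is False
```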
Theorem 4.5. There exists no prime $p < 100$ which satisfies (1) and (2) simultaneously.

Proof. Suppose that 2 and 3 are both Fermat solutions. Then $2^l$, $3^m$ and $2^l 3^m$ are all Fermat solutions, and of course 1 is also a Fermat solution. The theorem now follows from Theorem 4.1 and the following calculations:
$$\begin{array}{lllll}
2 = 3 - 1, & 3 = 2 + 1, & 5 = 2 + 3, & 7 = 2^2 + 3, & 11 = 2 + 3^2, \\
13 = 2^2 + 3^2, & 17 = 2^3 + 3^2, & 19 = 2^4 + 3, & 23 = -2^2 + 3^3, & 29 = 2 + 3^3, \\
31 = 2^2 + 3^3, & 37 = 2^6 - 3^3, & 41 = 2^5 + 3^2, & 43 = 2^4 + 3^3, & 47 = 2^4 \cdot 3 - 1, \\
53 = 2 \cdot 3^3 - 1, & 59 = 2^5 + 3^3, & 61 = 2^6 - 3, & 67 = 2^6 + 3, & 71 = 2^3 \cdot 3^2 - 1, \\
73 = 2^6 + 3^2, & 79 = -2 + 3^4, & 83 = 2 + 3^4, & 89 = 2^3 + 3^4, & 97 = 2^4 + 3^4. \qquad \square
\end{array}$$
Recently Lehmer has proved that if $p \le 253{,}747{,}889$, then there must exist $m \le 47$ such that $m^{p-1} \not\equiv 1 \pmod{p^2}$. This makes some contribution towards Fermat's last theorem.
2.5 The Function $\varphi(m)$
Theorem 5.1. Let $(m, m') = 1$, and let $x$ run over a complete residue system mod $m$, and $x'$ run over a complete residue system mod $m'$. Then $mx' + m'x$ runs over a complete residue system mod $mm'$.

Proof. Consider the $mm'$ numbers $mx' + m'x$. If
$$mx' + m'x \equiv my' + m'y \pmod{mm'},$$
then
$$mx' \equiv my' \pmod{m'}, \qquad m'x \equiv m'y \pmod m.$$
From $(m, m') = 1$ we have $x' \equiv y' \pmod{m'}$, $x \equiv y \pmod m$. The theorem is proved. □

Theorem 5.2. Let $(m, m') = 1$, and let $x$ run over a reduced residue system mod $m$, and $x'$ run over a reduced residue system mod $m'$. Then $mx' + m'x$ runs over a reduced residue system mod $mm'$.

Proof. 1) We first prove that $mx' + m'x$ is coprime with $mm'$. Suppose otherwise. Then there exists a prime $p$ such that $p \mid (mm', mx' + m'x)$. If $p \mid m$, then $p \mid m'x$. Since $(m, m') = 1$, it follows that $p \nmid m'$ and so $p \mid x$. Thus $p \mid (m, x)$, which is impossible. The case $p \mid m'$ is similar.
2) We next prove that every integer $a$ coprime with $mm'$ must be congruent mod $mm'$ to an integer of the form $mx' + m'x$, $(x, m) = (x', m') = 1$. By Theorem 5.1 there are integers $x, x'$ such that $a \equiv mx' + m'x \pmod{mm'}$. We now prove that $(x, m) = (x', m') = 1$. If $(x, m) = d \ne 1$, then $(a, m) = (mx' + m'x, m) = (m'x, m) = (x, m) = d \ne 1$, which contradicts the hypothesis. Similarly we must have $(x', m') = 1$.
3) We have already proved in Theorem 5.1 that the numbers $mx' + m'x$ are incongruent. Therefore the theorem is proved. □
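We have in fact proved:

Theorem 5.3. If $(m, m') = 1$, then $\varphi(mm') = \varphi(m)\varphi(m')$. □

A multiplicative function is completely determined by the values it takes at the prime powers. Thus, if the standard factorization of $m$ is given by
$$m = p_1^{l_1} \cdots p_s^{l_s}, \qquad p_1 < p_2 < \cdots < p_s,$$
then, from Theorem 5.3, we have
$$\varphi(m) = \prod_{i=1}^{s} \varphi(p_i^{l_i}).$$

Theorem 5.4. We have
$$\varphi(p^l) = p^l\left(1 - \frac{1}{p}\right)$$
and
$$\varphi(m) = m \prod_{p \mid m}\left(1 - \frac{1}{p}\right),$$
where $p$ runs over the distinct prime divisors of $m$.

Proof. Consider the integers in the interval $1 \le n \le p^l$. There are precisely $p^{l-1}$ integers which are multiples of $p$ and the others are coprime with $p$, so that
$$\varphi(p^l) = p^l - p^{l-1} = p^l\left(1 - \frac{1}{p}\right).$$
The second equation in the theorem follows from this and the multiplicative property of the function. □

Example: $\varphi(300) = \varphi(2^2 \cdot 3 \cdot 5^2) = 2^2 \cdot 3 \cdot 5^2 \left(1 - \tfrac12\right)\left(1 - \tfrac13\right)\left(1 - \tfrac15\right) = 80$.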
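The product formula of Theorem 5.4 gives a practical way to compute $\varphi(m)$ once the prime divisors are known. Below is a minimal sketch using trial division; the helper names `prime_factors` and `phi` are assumptions, and trial division is only adequate for small $m$.

```python
def prime_factors(m):
    """Distinct prime divisors of m, found by trial division."""
    primes, d = [], 2
    while d * d <= m:
        if m % d == 0:
            primes.append(d)
            while m % d == 0:
                m //= d
        d += 1
    if m > 1:
        primes.append(m)
    return primes

def phi(m):
    """phi(m) = m * prod(1 - 1/p) over primes p | m, kept in exact integers."""
    result = m
    for p in prime_factors(m):
        result = result // p * (p - 1)   # multiply by (1 - 1/p) without fractions
    return result

print(phi(300))   # 80, as in the example
```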
Exercise 1. Prove that $\sum_{d \mid m} \varphi(d) = m$, where in the sum, $d$ runs over all the positive divisors of $m$.

Exercise 2. Let $P$ be the product of the distinct prime divisors of $(m, n)$. Prove that
$$\varphi(mn) = \varphi(m)\varphi(n)\frac{P}{\varphi(P)}.$$

Exercise 3. Use Theorem 1.7.1 to prove Theorem 5.4.
2.6 Congruences
We first discuss the solubility of the congruence
$$ax + b \equiv 0 \pmod m, \tag{1}$$
and the number of incongruent solutions. The congruence (1) is equivalent to the equation $ax + b = my$, where we seek integer solutions $x, y$. This indeterminate equation has already been discussed in §1.8, and we shall now advance one step further. If $(a, m) = 1$, then we can choose $x_0, y_0$ according to Theorem 1.4.4 so that $ax_0 + my_0 = 1$. Thus $x = -bx_0$ is a solution to (1), and we now proceed to show that this solution is unique. If $ax' + b \equiv 0 \pmod m$ and $ax + b \equiv 0 \pmod m$, then $a(x - x') \equiv 0 \pmod m$. Since $(a, m) = 1$, we have $x \equiv x' \pmod m$. This proves that there is only one residue class whose members satisfy (1); in other words, there is only one solution $x$ to (1) satisfying $0 \le x < m$. If $(a, m) = d > 1$, then $d$ must divide $b$, or else there is no solution. We then have
$$\frac{a}{d}x + \frac{b}{d} \equiv 0 \pmod{\frac{m}{d}}. \tag{2}$$
We have already proved that (2) has a unique solution $x_1$ satisfying $0 \le x_1 < m/d$, and $x = x_1 + (m/d)t$ are all solutions to (2). Therefore
$$x_1, \quad x_1 + \frac{m}{d}, \quad \ldots, \quad x_1 + (d - 1)\frac{m}{d}$$
are all the incongruent (mod $m$) solutions to (1). We have therefore proved the following:

Theorem 6.1. If $(a, m) \mid b$, then there are $(a, m)$ incongruent (mod $m$) solutions to (1). Otherwise (1) has no solution. □

Theorem 6.2. A necessary and sufficient condition for the congruence $a_1x_1 + \cdots + a_nx_n + b \equiv 0 \pmod m$ to have a solution $(x_1, \ldots, x_n)$ is that $(a_1, \ldots, a_n, m) \mid b$. If this condition is satisfied, then the number of incongruent (mod $m$) solutions is $m^{n-1}(a_1, \ldots, a_n, m)$.

Proof. The case $n = 1$ is settled by Theorem 6.1. We now proceed by induction. Let $(a_1, \ldots, a_n, m) = d$ and $(a_1, \ldots, a_{n-1}, m) = d_1$, so that $(d_1, a_n) = d$. From Theorem 6.1 we know that there are $d \cdot (m/d_1)$ solutions to
$$a_nx_n + b \equiv 0 \pmod{d_1}, \qquad 0 \le x_n < m.$$
Corresponding to a solution $x_n$ we set $a_nx_n + b = b_1d_1$. From the induction hypothesis, the number of solutions to the congruence $a_1x_1 + \cdots + a_{n-1}x_{n-1} + b_1d_1 \equiv 0 \pmod m$ is $m^{n-2}(a_1, \ldots, a_{n-1}, m) = m^{n-2}d_1$. Therefore the total number of solutions is given by
$$\frac{md}{d_1} \cdot m^{n-2}d_1 = m^{n-1}d$$
as required. □
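The argument leading to Theorem 6.1 is constructive. As a rough sketch (the function name `solve_linear_congruence` is an assumption, and `pow(a, -1, m)` from Python 3.8+ stands in for the Euclidean algorithm step of the text):

```python
from math import gcd

def solve_linear_congruence(a, b, m):
    """All x in [0, m) with a*x + b = 0 (mod m), in increasing order."""
    d = gcd(a, m)
    if b % d != 0:
        return []                              # no solution unless (a, m) | b
    a1, b1, m1 = a // d, b // d, m // d        # reduced congruence (2)
    inv = pow(a1, -1, m1) if m1 > 1 else 0     # inverse of a/d modulo m/d
    x1 = (-b1 * inv) % m1                      # unique solution of (2) in [0, m/d)
    return [x1 + t * m1 for t in range(d)]     # the d incongruent solutions mod m

print(solve_linear_congruence(6, 3, 9))    # [1, 4, 7]
print(solve_linear_congruence(2, 1, 4))    # []
```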
2.7 The Chinese Remainder Theorem
Theorem 7.1. Let $m$ be the least common multiple of $m_1$ and $m_2$. The condition for the solubility of the simultaneous congruences
$$x \equiv a_1 \pmod{m_1}, \qquad x \equiv a_2 \pmod{m_2}$$
is
$$(m_1, m_2) \mid a_1 - a_2. \tag{1}$$
If (1) holds, then the solution is unique mod $m$.

Proof. 1) Let $(m_1, m_2) = d$. If the simultaneous congruences have a solution, then $x \equiv a_1$, $x \equiv a_2 \pmod d$ and hence $d \mid a_1 - a_2$. 2) If $d \mid a_1 - a_2$, then the solutions to $x \equiv a_1 \pmod{m_1}$ are given by $x = a_1 + m_1y$. Substituting this into the second congruence gives $a_1 + m_1y \equiv a_2 \pmod{m_2}$. From the proof of Theorem 6.1 this congruence has a unique solution $y$ mod $m_2/d$. Therefore the simultaneous congruences have a unique solution $x$ mod $m$. □
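The proof of Theorem 7.1 can be followed step by step in code: write $x = a_1 + m_1y$ and solve the resulting linear congruence for $y$. The sketch below makes that explicit; the names `crt_pair` and `crt` are assumptions, and moduli need not be coprime.

```python
from math import gcd

def crt_pair(a1, m1, a2, m2):
    """Solve x = a1 (mod m1), x = a2 (mod m2).
    Returns (x, lcm) with 0 <= x < lcm, or None when (m1, m2) does not divide a1 - a2."""
    d = gcd(m1, m2)
    if (a1 - a2) % d != 0:
        return None
    lcm = m1 // d * m2
    n = m2 // d
    # x = a1 + m1*y with (m1/d)*y = (a2 - a1)/d (mod m2/d)
    y = ((a2 - a1) // d * pow(m1 // d, -1, n)) % n if n > 1 else 0
    return ((a1 + m1 * y) % lcm, lcm)

def crt(congruences):
    """Fold crt_pair over a list of (remainder, modulus) pairs."""
    x, m = 0, 1
    for a, mod in congruences:
        step = crt_pair(x, m, a, mod)
        if step is None:
            return None
        x, m = step
    return x, m

print(crt([(2, 3), (3, 5), (2, 7)]))   # (23, 105)
```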
Theorem 7.2. If $(m_i, m_j) = 1$ $(1 \le i < j \le n)$, then the simultaneous congruences
$$x \equiv a_i \pmod{m_i}, \qquad 1 \le i \le n,$$
have a unique solution mod $m_1 \cdots m_n$.

Proof. Apply mathematical induction to Theorem 7.1. □
Let us now discuss the ancient method of solutions to this type of problem. We already stated the problem of "What is the number?" in §1. The solution to this problem was published as a song in 1593, and it goes as follows:

"Three people walking together, 'tis rare that one be seventy,
Five cherry blossom trees, twenty one branches bearing flowers,
Seven disciples reunite for the half-moon,
Take away (multiple of) one hundred and five and you shall know."

We recall that the problem was to solve the simultaneous congruences $x \equiv 2 \pmod 3$, $x \equiv 3 \pmod 5$, $x \equiv 2 \pmod 7$. The meaning of the song here is as follows: Multiply by 70 the remainder of $x$ when divided by 3, multiply by 21 the remainder of $x$ when divided by 5, multiply by 15 (the number of days in half a Chinese (synodic) month) the remainder of $x$ when divided by 7. Add the three results together, and then subtract a suitable multiple of 105 and you shall have the required smallest solution. For our specific example, we have
$$2 \times 70 + 3 \times 21 + 2 \times 15 = 233$$
and on subtracting twice 105 we have the required solution 23.

How do we explain this ancient method of solution, and in particular where do 70, 21, 15 come from? The answer is as follows: 70 is a multiple of 5 and 7 which has remainder 1 when divided by 3. 21 is a multiple of 3 and 7 which has remainder 1 when divided by 5. 15 is a multiple of 3 and 5 which has remainder 1 when divided by 7. It follows that $70a + 21b + 15c$ must have remainders $a$, $b$ and $c$ when divided by 3, 5 and 7 respectively. We may further investigate how they obtained 70, 21 and 15. They had to solve
$$x \equiv 0 \pmod{m_1}, \qquad x \equiv 0 \pmod{m_2}, \qquad x \equiv 1 \pmod{m_3};$$
that is, how did they find $x = m_1m_2y$, where $y$ satisfies $m_1m_2y \equiv 1 \pmod{m_3}$? The answer is that they used their own version of the Euclidean algorithm to solve the indeterminate equation
$$m_1m_2y - m_3z = 1.$$
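The three multipliers in the song can be recomputed for any pairwise coprime moduli. In this sketch the function name `song_coefficients` is an assumption, and Python's built-in modular inverse replaces the ancient version of the Euclidean algorithm mentioned above.

```python
def song_coefficients(moduli):
    """For pairwise coprime moduli, return (coefficients, product): the i-th
    coefficient is a multiple of all other moduli with remainder 1 mod m_i."""
    total = 1
    for m in moduli:
        total *= m
    coeffs = []
    for m in moduli:
        rest = total // m              # product of the other moduli
        y = pow(rest, -1, m)           # solves rest * y = 1 (mod m)
        coeffs.append(rest * y)
    return coeffs, total

coeffs, total = song_coefficients([3, 5, 7])
print(coeffs, total)                                   # [70, 21, 15] 105
remainders = [2, 3, 2]
print(sum(r * c for r, c in zip(remainders, coeffs)) % total)   # 23
```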
The following exercises are all from ancient Chinese publications. Exercises 2,3, 4 are dated 1275. Exercise 1. Replace 3, 5, 7 by 3, 7, 11 and determine the three numbers which correspond to 70, 21, 15. Exercise 2. Seven with remainder one, eight with remainder two, nine with remainder three. What is the number? Exercise 3. Eleven with left over three, twelve with left over two, thirteen with left over one. What is the number? Exercise 4. Two with left over one, five with left over two, seven with left over three, nine with left over four. What is the number? Exercise 5. There is a number. It has no remainder when divided by five. It has a remainder ten when divided by seven hundred and fifteen. It has a remainder one hundred and forty when divided by two hundred and forty seven. It has a remainder two hundred and forty five when divided by three hundred and ninety one. It has a remainder one hundred and nine when divided by one hundred and eighty seven. May we ask what is the number? (Answer: Ten thousand and twenty.)
2.8 Higher Degree Congruences
Let $m$ be a fixed natural number, and let $f(x) = a_nx^n + \cdots + a_0$ be a polynomial with integer coefficients. We now discuss the congruence
$$f(x) \equiv 0 \pmod m. \tag{1}$$
If $x_0$ is a solution, then $x_0 + mt$ is also a solution. This means that if $x_0$ satisfies (1), then each member of the residue class represented by $x_0$ also satisfies (1). Therefore, when we speak of the number of solutions to (1) we mean the number of incongruent solutions. The number of solutions to a higher degree congruence is quite irregular. For example:
1. The congruence $x^3 - x = (x - 1)x(x + 1) \equiv 0 \pmod 6$ has six solutions.
2. The congruence $x^2 + 1 \equiv 0 \pmod 3$ has no solution.
3. The congruence $(x - 1)(x - p - 1) \equiv 0 \pmod{p^2}$ has $p$ solutions, namely $1, p + 1, 2p + 1, \ldots, (p - 1)p + 1$.

We see therefore that the solutions to higher degree congruences are difficult and complicated. The following theorem helps a little.
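For small moduli the three examples can be checked by exhaustive search. The helper `count_solutions` below is an illustrative assumption; it simply tries every residue, so it is only suitable for small $m$.

```python
def count_solutions(f, m):
    """Number of incongruent solutions of f(x) = 0 (mod m), by brute force."""
    return sum(1 for x in range(m) if f(x) % m == 0)

print(count_solutions(lambda x: x**3 - x, 6))        # 6
print(count_solutions(lambda x: x**2 + 1, 3))        # 0
p = 5                                                # any prime works here
print(count_solutions(lambda x: (x - 1) * (x - p - 1), p * p))   # p
```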
Theorem 8.1. Let $(m_1, m_2) = 1$. Then the number of solutions to the congruence
$$f(x) \equiv 0 \pmod m, \qquad m = m_1m_2, \tag{2}$$
is the product of the numbers of solutions to the congruences
$$f(x) \equiv 0 \pmod{m_1}, \tag{3}$$
$$f(x) \equiv 0 \pmod{m_2}. \tag{4}$$
If
$$m = p_1^{l_1} \cdots p_s^{l_s} \qquad (p_1 < p_2 < \cdots < p_s)$$
is the standard prime factorization of $m$, then the number of solutions to (2) is the product of the numbers of solutions to the $s$ congruences
$$f(x) \equiv 0 \pmod{p_i^{l_i}}, \qquad 1 \le i \le s.$$

Proof. It is clear that each solution to (2) is also a solution to (3) and (4). Conversely, let $c_1$ and $c_2$ be solutions to (3) and (4) respectively, and let $c$ be a solution of $c \equiv c_1 \pmod{m_1}$ and $c \equiv c_2 \pmod{m_2}$. The solution $c$ exists and is unique mod $m$ according to the Chinese remainder theorem. Moreover, this $c$ satisfies (2) because $m_1 \mid f(c)$, $m_2 \mid f(c)$ so that $m \mid f(c)$. □
2.9 Higher Degree Congruences to a Prime Power Modulus
Theorem 9.1. Let $p$ be a prime number. The number of solutions (including repeated ones) to the congruence
$$f(x) = a_nx^n + \cdots + a_0 \equiv 0 \pmod p \tag{1}$$
does not exceed $n$.

Proof. We can assume that $p \nmid a_n$. The theorem becomes trivial if (1) has no solutions. If $a$ is a solution, then we can write
$$f(x) = (x - a)f_1(x) + r_1,$$
where we see that $p \mid r_1$ by substituting $a$ for $x$. Therefore $f(x) \equiv (x - a)f_1(x) \pmod p$. If $a$ is also a solution to $f_1(x) \equiv 0 \pmod p$, then we have similarly that $f_1(x) \equiv (x - a)f_2(x) \pmod p$, and in this case we call $a$ a repeated solution to $f(x) \equiv 0 \pmod p$. If $f(x) \equiv (x - a)^hg_1(x) \pmod p$ where $g_1(a) \not\equiv 0 \pmod p$, then we call $a$ a repeated solution of order $h$ to $f(x) \equiv 0 \pmod p$. From our proof so far, we see that the degree of $g_1(x)$ is $n - h$. Suppose now that $b$ is another solution. Then
$$f(b) \equiv (b - a)^hg_1(b) \equiv 0 \pmod p.$$
Since $p \nmid (b - a)$, it follows that $g_1(b) \equiv 0 \pmod p$. If $b$ is a repeated solution of order $k$ to $g_1(x) \equiv 0 \pmod p$, then we have, as before,
$$f(x) \equiv (x - a)^h(x - b)^kg_2(x) \pmod p, \qquad g_2(b) \not\equiv 0 \pmod p.$$
Proceeding in this way we have
$$f(x) \equiv (x - a)^h(x - b)^k \cdots (x - c)^lg(x) \pmod p,$$
where $g(x)$ is a polynomial of degree $n - h - k - \cdots - l$ and $g(x) \equiv 0 \pmod p$ has no solution. The theorem is proved. □

Since $1, 2, \ldots, p - 1$ are solutions to $x^{p-1} \equiv 1 \pmod p$ we see that
$$x^{p-1} - 1 \equiv (x - 1)(x - 2) \cdots (x - (p - 1)) \pmod p. \tag{2}$$
Substituting $x = 0$ into this, and noting that $p - 1$ is even if $p > 2$, we have:

Theorem 9.2 (Wilson). If $p$ is a prime, then $(p - 1)! \equiv -1 \pmod p$. □
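Wilson's theorem is easy to test directly for small primes. A minimal sketch, with the function name `wilson_holds` chosen for illustration:

```python
from math import factorial

def wilson_holds(n):
    """True when (n - 1)! = -1 (mod n)."""
    return factorial(n - 1) % n == n - 1

print([p for p in range(2, 30) if wilson_holds(p)])
```

For the composite values 4 and 9 the congruence fails, which the test below also records.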
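Theorem 9.3. Let $f'(x) = na_nx^{n-1} + \cdots + 2a_2x + a_1$. If $f(x) \equiv 0$, $f'(x) \equiv 0 \pmod p$ have no common solution, then the two congruences $f(x) \equiv 0 \pmod{p^l}$ and $f(x) \equiv 0 \pmod p$ have the same number of solutions.

Proof. We prove this by induction on $l$, the case $l = 1$ being trivial. Let $x_1$ be a solution to $f(x) \equiv 0 \pmod{p^{l-1}}$, so that
$$f(x_1 + p^{l-1}y) \equiv f(x_1) + p^{l-1}yf'(x_1) \pmod{p^l},$$
because $(x + p^{l-1}y)^n \equiv x^n + np^{l-1}yx^{n-1} \pmod{p^l}$. But $p \nmid f'(x_1)$ so that there exists a unique $y$ (mod $p$) such that
$$\frac{f(x_1)}{p^{l-1}} + yf'(x_1) \equiv 0 \pmod p,$$
that is, $f(x_1 + p^{l-1}y) \equiv 0 \pmod{p^l}$. □

Theorem 9.4. The congruence $x^{p-1} \equiv 1 \pmod{p^l}$ has $p - 1$ solutions.

Proof. This is an immediate consequence of Theorem 9.3. □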
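The induction step in the proof of Theorem 9.3 is effectively an algorithm for lifting a root from one power of $p$ to the next. The sketch below follows that step; the names `poly_eval`, `poly_deriv` and `lift_roots` are assumptions, and the code presumes the hypothesis of the theorem (no common root of $f$ and $f'$ mod $p$).

```python
def poly_eval(coeffs, x, m):
    """Evaluate a_n x^n + ... + a_0 mod m, with coeffs = [a_n, ..., a_0]."""
    value = 0
    for c in coeffs:
        value = (value * x + c) % m
    return value

def poly_deriv(coeffs):
    """Coefficients of the formal derivative, highest degree first."""
    n = len(coeffs) - 1
    return [c * (n - i) for i, c in enumerate(coeffs[:-1])]

def lift_roots(coeffs, p, l):
    """Roots of f mod p^l obtained by lifting each root mod p, one power at a time."""
    roots = [x for x in range(p) if poly_eval(coeffs, x, p) == 0]
    deriv = poly_deriv(coeffs)
    mod = p
    for _ in range(l - 1):
        lifted = []
        for x in roots:
            fx = poly_eval(coeffs, x, mod * p) // mod     # f(x)/p^k reduced mod p
            # unique y mod p with f(x)/p^k + y f'(x) = 0 (mod p)
            y = (-fx * pow(poly_eval(deriv, x, p), -1, p)) % p
            lifted.append(x + mod * y)
        roots, mod = lifted, mod * p
    return sorted(roots)

# Theorem 9.4 with p = 5, l = 2: x^4 = 1 (mod 25) has 4 solutions
print(lift_roots([1, 0, 0, 0, -1], 5, 2))   # [1, 7, 18, 24]
```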
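2.10 Wolstenholme's Theorem
Theorem 10.1. Let $p$ be a prime number greater than 3, and denote by $\frac{1}{s}$ an integer $s^*$ such that $ss^* \equiv 1 \pmod{p^2}$. Then we have
$$1 + \frac12 + \frac13 + \cdots + \frac{1}{p - 1} \equiv 0 \pmod{p^2}.$$

Proof. Let
$$(x - 1)(x - 2) \cdots (x - (p - 1)) = x^{p-1} - s_1x^{p-2} + \cdots + s_{p-1}, \tag{1}$$
so that $s_{p-1} = (p - 1)!$. Since
$$(x - 1)(x - 2) \cdots (x - (p - 1)) \equiv x^{p-1} - 1 \pmod p, \tag{2}$$
it follows that
$$p \mid (s_1, s_2, \ldots, s_{p-2}). \tag{3}$$
We set $x = p$ in (1). Then
$$(p - 1)! = p^{p-1} - s_1p^{p-2} + \cdots - s_{p-2}\,p + s_{p-1},$$
or
$$p^{p-2} - s_1p^{p-3} + \cdots + s_{p-3}\,p - s_{p-2} = 0.$$
Since $p > 3$, we have, by (3), that
$$s_{p-2} \equiv 0 \pmod{p^2},$$
or
$$p^2 \mid (p - 1)!\left(1 + \frac12 + \cdots + \frac{1}{p - 1}\right),$$
or
$$1^* + 2^* + \cdots + (p - 1)^* \equiv 0 \pmod{p^2},$$
as required. □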
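The final congruence of the proof can be checked by summing modular inverses. This is a small sketch assuming Python 3.8+; the function name `harmonic_sum_mod_p2` is an assumption.

```python
def harmonic_sum_mod_p2(p):
    """1* + 2* + ... + (p-1)* mod p^2, where s* is the inverse of s mod p^2."""
    p2 = p * p
    return sum(pow(s, -1, p2) for s in range(1, p)) % p2

for p in (5, 7, 11, 13):
    print(p, harmonic_sum_mod_p2(p))    # each sum is 0 mod p^2
print(3, harmonic_sum_mod_p2(3))        # the hypothesis p > 3 is needed
```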
Chapter 3. Quadratic Residues
3.1 Definitions and Euler's Criteria
Definition 1. Let $m$ be an integer greater than 1, and suppose that $(m, n) = 1$. If $x^2 \equiv n \pmod m$ is soluble, then we call $n$ a quadratic residue mod $m$; otherwise we call $n$ a quadratic nonresidue mod $m$. We can now divide the set of integers coprime with $m$ into two classes: the class of quadratic residues and the class of quadratic nonresidues.

Example. The numbers 1, 2, 4 are quadratic residues and 3, 5, 6 are quadratic nonresidues mod 7.

Definition 2 (Legendre's symbol). Let $p$ be an odd prime, and suppose that $p \nmid n$. We let
$$\left(\frac{n}{p}\right) = \begin{cases} 1 & \text{if } n \text{ is a quadratic residue mod } p, \\ -1 & \text{if } n \text{ is a quadratic nonresidue mod } p. \end{cases}$$
It is easy to see that if $n \equiv n' \pmod p$ and $p \nmid n$, then $\left(\frac{n}{p}\right) = \left(\frac{n'}{p}\right)$.

Theorem 1.1. Let $p > 2$. There are $\frac12(p - 1)$ quadratic residues and $\frac12(p - 1)$ quadratic nonresidues mod $p$.

Proof. If
$$x^2 \equiv n \pmod p \tag{1}$$
is soluble, then there are at most two solutions. From $(p - x)^2 \equiv (-x)^2 = x^2 \equiv n \pmod p$, we see that one of the roots of (1) must satisfy
$$1 \le x \le \tfrac12(p - 1). \tag{2}$$
That is, if (1) is soluble, there must be a solution satisfying (2). Also $1^2, 2^2, \ldots, \left(\frac12(p - 1)\right)^2$ are incongruent numbers because $a^2 - b^2 = (a - b)(a + b)$ and neither of these factors, being nonzero and smaller than $p$ in absolute value, is a multiple of $p$. The theorem is proved. □
Theorem 1.2 (Euler's Criterion). Let $p$ be an odd prime. Then we have
$$n^{\frac12(p-1)} \equiv \left(\frac{n}{p}\right) \pmod p.$$

Proof. 1) If $\left(\frac{n}{p}\right) = 1$, then there exists $x$ such that $x^2 \equiv n \pmod p$, and so
$$n^{\frac12(p-1)} \equiv x^{p-1} \equiv 1 \pmod p.$$
2) From Theorem 2.9.1 we know that there are at most $\frac12(p - 1)$ solutions to $n^{\frac12(p-1)} \equiv 1 \pmod p$. Combining with 1) we see that this congruence actually has $\frac12(p - 1)$ solutions, that is, the quadratic residues mod $p$, and no other.
3) We have
$$p \mid (n^{p-1} - 1) = \left(n^{\frac12(p-1)} - 1\right)\left(n^{\frac12(p-1)} + 1\right).$$
Therefore, if $p \nmid \left(n^{\frac12(p-1)} - 1\right)$, then
$$n^{\frac12(p-1)} + 1 \equiv 0 \pmod p.$$
The theorem is proved. □
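Euler's criterion gives a direct way to compute Legendre's symbol by fast exponentiation. The sketch below assumes $p$ is an odd prime not dividing $n$; the function name `legendre` is illustrative.

```python
def legendre(n, p):
    """Legendre symbol (n/p) via Euler's criterion, for an odd prime p with p not dividing n."""
    if n % p == 0:
        raise ValueError("p must not divide n")
    return 1 if pow(n, (p - 1) // 2, p) == 1 else -1

print([legendre(n, 7) for n in range(1, 7)])   # [1, 1, -1, 1, -1, -1]
```

The output matches the example: 1, 2, 4 are residues and 3, 5, 6 are nonresidues mod 7.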
We have, as a consequence of this theorem:

Theorem 1.3. If $p \nmid mn$, then
$$\left(\frac{m}{p}\right)\left(\frac{n}{p}\right) = \left(\frac{mn}{p}\right). \qquad \square$$

Thus, $\left(\frac{n}{p}\right)$ is a multiplicative function of $n$. We also deduce:

Theorem 1.4. (i) The product of two quadratic residues is a quadratic residue. (ii) The product of two quadratic nonresidues is a quadratic residue. (iii) The product of a quadratic residue and a quadratic nonresidue is a quadratic nonresidue. □
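3.2 The Evaluation of Legendre's Symbol
From Theorem 1.3 we see that the evaluation of Legendre's symbol reduces to the evaluation of
$$\left(\frac{-1}{p}\right), \qquad \left(\frac{2}{p}\right), \qquad \left(\frac{q}{p}\right),$$
where $q$ is an odd prime. For if
$$n = \pm 2^{l} q_1^{l_1} \cdots q_s^{l_s}, \qquad 2 < q_1 < \cdots < q_s,$$
then
$$\left(\frac{n}{p}\right) = \left(\frac{\pm 1}{p}\right)\left(\frac{2}{p}\right)^{l}\left(\frac{q_1}{p}\right)^{l_1} \cdots \left(\frac{q_s}{p}\right)^{l_s}.$$
Taking $n = -1$ in Theorem 1.2 we have
$$\left(\frac{-1}{p}\right) \equiv (-1)^{\frac{p-1}{2}} \pmod p,$$
and since both sides of the congruence must be $\pm 1$, we have

Theorem 2.1. If $p > 2$, then
$$\left(\frac{-1}{p}\right) = (-1)^{\frac12(p-1)}. \qquad \square$$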
In other words, $-1$ is a quadratic residue or nonresidue mod $p$, according to whether $p \equiv 1$ or $3 \pmod 4$. It follows from this that the odd prime divisors of $x^2 + 1$ must be congruent to 1 (mod 4).

Theorem 2.2 (Gauss's Lemma). Let $p > 2$, $p \nmid n$. Denote by $m$ the number of least positive residues of the $\frac12(p - 1)$ numbers $n, 2n, \ldots, \frac12(p - 1)n$ (mod $p$) which exceed $p/2$. Then
$$\left(\frac{n}{p}\right) = (-1)^m.$$

Example 1. $p = 7$, $n = 10$. We have $10, 20, 30 \equiv 3, 6, 2 \pmod 7$. There is exactly one least positive residue which exceeds $\frac72$. Therefore $m = 1$ and $\left(\frac{10}{7}\right) = -1$.

Example 2. $p = 11$, $n = 2$. We have the residues 2, 4, 6, 8, 10 (mod 11), and there are three which exceed $\frac{11}{2}$. Therefore $\left(\frac{2}{11}\right) = -1$.
Proof of Theorem 2.2. Let $l = \frac12(p - 1) - m$, and let $a_1, \ldots, a_l$ be those residues which are less than $p/2$, and $b_1, \ldots, b_m$ be those residues which are greater than $p/2$. Then
$$\prod_{s=1}^{l} a_s \prod_{t=1}^{m} b_t \equiv \prod_{k=1}^{\frac12(p-1)} kn = \left(\frac{p - 1}{2}\right)!\, n^{\frac12(p-1)} \pmod p. \tag{1}$$
Since $1 \le p - b_t \le \frac12(p - 1)$ it follows that $a_s$ and $p - b_t$ are $\frac12(p - 1)$ integers in the interval from 1 to $\frac12(p - 1)$. We now prove that they are distinct by proving $a_s \ne p - b_t$. Suppose, if possible, $a_s + b_t = p$. Then there are integers $x, y$ such that
$$xn + yn \equiv 0 \pmod p, \qquad 1 \le x \le \tfrac12(p - 1), \quad 1 \le y \le \tfrac12(p - 1),$$
or $x + y \equiv 0 \pmod p$, which is impossible. Therefore
$$\prod_{s=1}^{l} a_s \prod_{t=1}^{m} (p - b_t) = \left(\frac{p - 1}{2}\right)!.$$
From (1) we see that the left hand side of this equation is
$$\equiv (-1)^m \prod_{s=1}^{l} a_s \prod_{t=1}^{m} b_t \equiv (-1)^m n^{\frac12(p-1)} \left(\frac{p - 1}{2}\right)! \pmod p.$$
Therefore
$$n^{\frac12(p-1)} \equiv (-1)^m \pmod p.$$
From Euler's criterion we see that $\left(\frac{n}{p}\right) \equiv (-1)^m \pmod p$, and so $\left(\frac{n}{p}\right) = (-1)^m$. □
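Gauss's lemma can be verified by counting the residues that exceed $p/2$ and comparing with Euler's criterion. The function names below are assumptions made for this sketch.

```python
def gauss_m(n, p):
    """Number of least positive residues of n, 2n, ..., ((p-1)/2)n mod p exceeding p/2."""
    half = (p - 1) // 2
    return sum(1 for k in range(1, half + 1) if (k * n) % p > half)

def legendre_by_gauss(n, p):
    """(n/p) = (-1)^m as in Gauss's lemma."""
    return -1 if gauss_m(n, p) % 2 else 1

def legendre_by_euler(n, p):
    """(n/p) from Euler's criterion, for comparison."""
    return 1 if pow(n, (p - 1) // 2, p) == 1 else -1

print(gauss_m(10, 7), legendre_by_gauss(10, 7))    # 1 -1  (Example 1)
print(gauss_m(2, 11), legendre_by_gauss(2, 11))    # 3 -1  (Example 2)
```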
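If we take $n = 2$ in Theorem 2.2, then
$$2, \ 2 \cdot 2, \ 2 \cdot 3, \ \ldots, \ \tfrac12(p - 1) \cdot 2$$
are already in the interval from 0 to $p$. We can now determine the number of integers $k$ satisfying $\frac{p}{2} < 2k < p$, or $\frac{p}{4} < k < \frac{p}{2}$, which gives
$$m = \left[\frac{p}{2}\right] - \left[\frac{p}{4}\right].$$
Let $p = 8a + r$, $r = 1, 3, 5, 7$. Then
$$m = 2a + \left[\frac{r}{2}\right] - \left[\frac{r}{4}\right] \equiv 0, 1, 1, 0 \pmod 2.$$
Therefore we have:

Theorem 2.3. If $p > 2$, then
$$\left(\frac{2}{p}\right) = (-1)^{\frac18(p^2-1)}. \qquad \square$$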
In other words 2 is a quadratic residue or nonresidue mod $p$, according to whether $p \equiv \pm 1$ or $\pm 3 \pmod 8$. It follows from this that every odd prime divisor of $x^2 - 2$ must be congruent to $\pm 1$ (mod 8).

Exercise. Let $n$ be a positive integer such that $4n + 3$ and $8n + 7$ are primes. Prove that $2^{4n+3} - 1 = M_{4n+3}$ is composite. Use this to prove the following concerning Mersenne numbers:
$$23 \mid M_{11}, \quad 47 \mid M_{23}, \quad 167 \mid M_{83}, \quad 263 \mid M_{131}, \quad 359 \mid M_{179}, \quad 383 \mid M_{191}, \quad 479 \mid M_{239}, \quad 503 \mid M_{251}.$$
3.3 The Law of Quadratic Reciprocity
Theorem 3.1. Let $p$, $q$ be two distinct odd primes. Then
$$\left(\frac{p}{q}\right)\left(\frac{q}{p}\right) = (-1)^{\frac12(p-1) \cdot \frac12(q-1)}.$$
In other words, if $p \equiv q \equiv 3 \pmod 4$, then exactly one of the two congruences $x^2 \equiv p \pmod q$, $x^2 \equiv q \pmod p$ is soluble. Otherwise the two congruences are either both soluble or both insoluble. This is the famous and important Law of Quadratic Reciprocity in elementary number theory which was discovered by Legendre and proved by Gauss, who named it "the queen of number theory". The later research work on algebraic number theory by Kummer, Eisenstein, Hilbert, Takagi, Artin, Furtwängler seems to justify the name.

Proof. We do not, for the moment, exclude the case $q = 2$, and we suppose that $p$, $q$ are distinct primes. When $1 \le k \le \frac12(p - 1)$ we can write
$$kq = q_kp + r_k, \qquad q_k = \left[\frac{kq}{p}\right], \qquad 1 \le r_k \le p - 1.$$
Let
$$a = \sum_{s=1}^{l} a_s, \qquad b = \sum_{t=1}^{m} b_t,$$
where $a_s$ and $b_t$ are defined in the previous section. Then we have
$$\sum_{k=1}^{\frac12(p-1)} r_k = a + b. \tag{1}$$
We saw in the proof of Gauss's lemma that $a_s$, $p - b_t$ are the same as $1, 2, \ldots, \frac12(p - 1)$. Therefore
$$\frac{p^2 - 1}{8} = 1 + 2 + \cdots + \frac12(p - 1) = a + mp - b, \tag{2}$$
and
$$\frac{p^2 - 1}{8}\,q = \sum_{k=1}^{\frac12(p-1)} kq = p\sum_{k=1}^{\frac12(p-1)} q_k + \sum_{k=1}^{\frac12(p-1)} r_k = p\sum_{k=1}^{\frac12(p-1)} q_k + a + b. \tag{3}$$
Subtracting (2) from (3), we have
$$\frac{p^2 - 1}{8}(q - 1) = p\sum_{k=1}^{\frac12(p-1)} q_k - mp + 2b,$$
or
$$\frac{p^2 - 1}{8}(q - 1) \equiv \sum_{k=1}^{\frac12(p-1)} q_k - m \pmod 2. \tag{4}$$
1) (Alternative proof of Theorem 2.3). We take $q = 2$ so that $q_k$ are all 0, and hence
$$\frac{p^2 - 1}{8} \equiv -m \pmod 2.$$
2) Let $q > 2$. Then
$$m \equiv \sum_{k=1}^{\frac12(p-1)} q_k \pmod 2.$$
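Therefore
$$\left(\frac{q}{p}\right) = (-1)^m = (-1)^{\sum_{k=1}^{\frac12(p-1)} \left[\frac{kq}{p}\right]}.$$
Similarly we have
$$\left(\frac{p}{q}\right) = (-1)^{\sum_{l=1}^{\frac12(q-1)} \left[\frac{lp}{q}\right]},$$
so that
$$\left(\frac{p}{q}\right)\left(\frac{q}{p}\right) = (-1)^{\sum_{k=1}^{\frac12(p-1)} \left[\frac{kq}{p}\right] + \sum_{l=1}^{\frac12(q-1)} \left[\frac{lp}{q}\right]}.$$
If we can prove that
$$\sum_{k=1}^{\frac12(p-1)} \left[\frac{kq}{p}\right] + \sum_{l=1}^{\frac12(q-1)} \left[\frac{lp}{q}\right] = \frac{p - 1}{2} \cdot \frac{q - 1}{2},$$
or even only that the two sides are congruent (mod 2), then the theorem will follow. It suffices therefore to prove the following lemma.

Lemma.
$$\sum_{k=1}^{\frac12(p-1)} \left[\frac{kq}{p}\right] + \sum_{l=1}^{\frac12(q-1)} \left[\frac{lp}{q}\right] = \frac{p - 1}{2} \cdot \frac{q - 1}{2}.$$

Proof. Consider the rectangle with vertices $(0, 0)$, $(0, \frac12q)$, $(\frac12p, 0)$, $(\frac12p, \frac12q)$.

(Figure: the rectangle with its diagonal from the origin.)

The diagonal from the origin does not pass through any lattice point (a point with integer coordinates). This is because if $(x, y)$ is a lattice point on the diagonal, then $xq - yp = 0$ and so $p \mid x$, $q \mid y$, showing that $(x, y)$ must lie outside the rectangle. The total number of lattice points in the rectangle is $\frac12(p - 1) \cdot \frac12(q - 1)$. The numbers of lattice points in the two triangular regions below and above the diagonal are respectively
$$\sum_{k=1}^{\frac12(p-1)} \left[\frac{kq}{p}\right], \qquad \sum_{l=1}^{\frac12(q-1)} \left[\frac{lp}{q}\right].$$
The lemma is therefore proved. □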
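Both the lattice point lemma and the reciprocity law itself can be tested on small primes. The sketch below uses the assumed names `floor_sum`, `lemma_holds`, `legendre` and `reciprocity_holds`; Legendre's symbol is computed with Euler's criterion.

```python
def floor_sum(q, p):
    """Sum of [kq/p] for k = 1, ..., (p-1)/2."""
    return sum((k * q) // p for k in range(1, (p - 1) // 2 + 1))

def lemma_holds(p, q):
    """The lattice point identity for distinct odd primes p, q."""
    return floor_sum(q, p) + floor_sum(p, q) == ((p - 1) // 2) * ((q - 1) // 2)

def legendre(n, p):
    """Legendre symbol via Euler's criterion (p an odd prime, p not dividing n)."""
    return 1 if pow(n, (p - 1) // 2, p) == 1 else -1

def reciprocity_holds(p, q):
    """Theorem 3.1: the product is -1 exactly when p = q = 3 (mod 4)."""
    sign = -1 if (p % 4 == 3 and q % 4 == 3) else 1
    return legendre(p, q) * legendre(q, p) == sign

primes = [3, 5, 7, 11, 13, 17, 19, 23]
pairs = [(p, q) for p in primes for q in primes if p != q]
print(all(lemma_holds(p, q) for p, q in pairs))
print(all(reciprocity_holds(p, q) for p, q in pairs))
```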
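Example 1. Determine those primes $p > 3$ of which 3 is a quadratic residue.

From the law of quadratic reciprocity we have
$$\left(\frac{3}{p}\right) = \left(\frac{p}{3}\right)(-1)^{\frac{p-1}{2}}.$$
Now
$$\left(\frac{p}{3}\right) = \begin{cases} \left(\frac13\right) = 1, & \text{if } p \equiv 1 \pmod 3, \\ \left(\frac23\right) = -1, & \text{if } p \equiv 2 \pmod 3; \end{cases} \qquad (-1)^{\frac{p-1}{2}} = \begin{cases} 1, & \text{if } p \equiv 1 \pmod 4, \\ -1, & \text{if } p \equiv -1 \pmod 4. \end{cases}$$
It follows from the Chinese remainder theorem that
$$\left(\frac{3}{p}\right) = \begin{cases} 1, & \text{if } p \equiv \pm 1 \pmod{12}, \\ -1, & \text{if } p \equiv \pm 5 \pmod{12}. \end{cases}$$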
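Example 2. Determine those primes $p \ne 5$ of which 5 is a quadratic residue.

From the law of quadratic reciprocity we have $\left(\frac{5}{p}\right) = \left(\frac{p}{5}\right)$, and
$$\left(\frac15\right) = 1, \qquad \left(\frac25\right) = (-1)^{\frac{5^2-1}{8}} = -1, \qquad \left(\frac35\right) = \left(\frac{-2}{5}\right) = -1, \qquad \left(\frac45\right) = 1,$$
so that
$$\left(\frac{5}{p}\right) = \begin{cases} 1, & \text{if } p \equiv \pm 1 \pmod 5, \\ -1, & \text{if } p \equiv \pm 2 \pmod 5. \end{cases}$$

Example 3. Determine those primes $p$ of which 10 is a quadratic residue.

From Example 2 and the Chinese remainder theorem we have
$$\left(\frac{10}{p}\right) = \begin{cases} 1, & \text{if } p \equiv \pm 1, \pm 3, \pm 9, \pm 13 \pmod{40}, \\ -1, & \text{if } p \equiv \pm 7, \pm 11, \pm 17, \pm 19 \pmod{40}. \end{cases}$$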
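Example 4. Determine the solubility of $x^2 \equiv -1457 \pmod{2389}$.

Here $p = 2389$ is a prime. Since $-1457 = -31 \times 47$ it follows from
$$\left(\frac{-1}{p}\right) = 1, \qquad \left(\frac{31}{p}\right) = \left(\frac{p}{31}\right) = \left(\frac{2}{31}\right) = 1,$$
$$\left(\frac{47}{p}\right) = \left(\frac{p}{47}\right) = \left(\frac{39}{47}\right) = \left(\frac{3}{47}\right)\left(\frac{13}{47}\right) = -\left(\frac{47}{3}\right)\left(\frac{47}{13}\right) = -\left(\frac{2}{3}\right)\left(\frac{8}{13}\right) = -\left(\frac{2}{3}\right)\left(\frac{2}{13}\right) = -1,$$
that $\left(\frac{-1457}{2389}\right) = -1$, so that the congruence is not soluble.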
Exercise 1. Show that (;3) = 1,
G~) =
 1.
Exercise 2. Show that $\left(\frac{195}{1901}\right) = -1$, $\left(\frac{74}{101}\right) = -1$, $\left(\frac{365}{1847}\right) = 1$.
1,
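Exercise 3. Show that
$$\text{if } p \equiv \pm 1 \text{ or } \pm 5 \pmod{24}, \text{ then } \left(\frac{6}{p}\right) = 1; \qquad \text{if } p \equiv \pm 7 \text{ or } \pm 11 \pmod{24}, \text{ then } \left(\frac{6}{p}\right) = -1.$$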
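3.4 Practical Methods for the Solutions
Although the theory above is simple and beautiful, it is nevertheless rather negative. By this we mean the following. If, following our theory, the congruence is insoluble, then the problem is finished. However, if the congruence is soluble, we may further ask for the actual solutions to the congruence, and the method does not give us the solutions. In actual fact, when $p$ is large, the determination of the solutions to $x^2 \equiv n \pmod p$ is no easy matter. However, if $p \equiv 3 \pmod 4$ or $p \equiv 5 \pmod 8$, then we have the following methods.

1) $p \equiv 3 \pmod 4$. Since $\left(\frac{n}{p}\right) = 1$, we have $n^{\frac12(p-1)} \equiv 1 \pmod p$, and so
$$\left(n^{\frac14(p+1)}\right)^2 \equiv n \pmod p.$$
This gives us a solution.

2) $p \equiv 5 \pmod 8$. From $\left(\frac{n}{p}\right) = 1$, we have
$$n^{\frac12(p-1)} - 1 \equiv 0 \pmod p.$$
Now $n$ satisfies
$$n^{\frac14(p-1)} \equiv 1 \pmod p \qquad \text{or} \qquad n^{\frac14(p-1)} \equiv -1 \pmod p.$$
From the first congruence we have
$$\left(n^{\frac18(p+3)}\right)^2 \equiv n \pmod p.$$
From the second congruence we have
$$\left(n^{\frac18(p+3)}\right)^2 \equiv -n \pmod p. \tag{1}$$
Since $p \equiv 1 \pmod 4$, Wilson's theorem gives
$$-1 \equiv (p - 1)! \equiv \left(1 \cdot 2 \cdots \tfrac12(p - 1)\right)^2 = \left(\left(\tfrac12(p - 1)\right)!\right)^2 \pmod p,$$
so that
$$\left(n^{\frac18(p+3)}\left(\tfrac12(p - 1)\right)!\right)^2 \equiv n \pmod p.$$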
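The two cases $p \equiv 3 \pmod 4$ and $p \equiv 5 \pmod 8$ can be turned into a short routine. This is a sketch under the assumption that $n$ is a quadratic residue mod the prime $p$; the name `sqrt_mod_p` is hypothetical, and the factorial used in the second case is only practical for small $p$.

```python
from math import factorial

def sqrt_mod_p(n, p):
    """One solution x of x^2 = n (mod p) for a prime p = 3 (mod 4) or p = 5 (mod 8)."""
    n %= p
    if p % 4 == 3:
        x = pow(n, (p + 1) // 4, p)
    elif p % 8 == 5:
        x = pow(n, (p + 3) // 8, p)
        if x * x % p != n:
            # here x^2 = -n, so multiply by a square root of -1, namely ((p-1)/2)!
            x = x * (factorial((p - 1) // 2) % p) % p
    else:
        raise ValueError("p = 1 (mod 8) is not covered by these formulas")
    if x * x % p != n:
        raise ValueError("n is not a quadratic residue mod p")
    return x

print(sqrt_mod_p(3, 13), sqrt_mod_p(10, 13))   # 9 6
x = sqrt_mod_p(73, 127)
print(x, 127 - x)                              # the two roots of x^2 = 73 (mod 127)
```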
3) $p \equiv 1 \pmod 8$. This is a more difficult case. When $p$ is not too large, we usually use the method of successive eliminations. The congruence $x^2 \equiv n \pmod p$ is equivalent to the indeterminate equation $x^2 = n + py$. We may assume that $0 < n < p$ and $0 < x < p/2$, so that $0 \le y < p/4$. Take an integer $e > 2$ coprime with $p$, let $n_1, n_2, \ldots$ be the quadratic nonresidues mod $e$, and let $v_1, v_2, \ldots$ be the solutions to
$$n + py \equiv n_1, n_2, \ldots \pmod e.$$
If $y \equiv v_i \pmod e$, then $py + n$ is a quadratic nonresidue mod $e$, and is therefore not a square. We may therefore discard those $y \equiv v_i \pmod e$. We may further discard more values of $y$ by choosing different values of $e$ until the number of trials is small enough not to be troublesome.

Example. Solve $x^2 \equiv 73 \pmod{127}$.

We try to solve $x^2 = 127y + 73$ where $1 \le y \le 31$. We take $e = 3$, $n_1 = 2$. From $73 + 127y \equiv 2 \pmod 3$, that is $y \equiv 1 \pmod 3$, we see that the remaining values for $y$ are:
2, 3, 5, 6, 8, 9, 11, 12, 14, 15, 17, 18, 20, 21, 23, 24, 26, 27, 29, 30.
We next take $e = 5$, $n_1 = 2$, $n_2 = 3$. From $127y + 73 \equiv 2, 3 \pmod 5$, we have $v_1 \equiv 2$, $v_2 \equiv 0 \pmod 5$ and so the remaining values for $y$ are now
3, 6, 8, 9, 11, 14, 18, 21, 23, 24, 26, 29.
We next take $e = 7$, $n_1 = 3$, $n_2 = 5$, $n_3 = 6$. From the congruences $127y + 73 \equiv 3, 5, 6 \pmod 7$, or $y + 3 \equiv 3, 5, 6 \pmod 7$, we have $y \equiv 0, 2, 3 \pmod 7$, so that we are left with only the six values
6, 8, 11, 18, 26, 29
for the trials. In fact $73 + 8 \times 127 = 1089 = 33^2$ so that $x \equiv \pm 33 \pmod{127}$ are the solutions.

Note. In this method, having taken $e$ and $e'$, there is no need to take $ee'$. Again, having taken an odd $e$, there is no need to take $2e$.
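The eliminations in the example can be reproduced with a small sieve. In this sketch the names `nonresidues` and `eliminate` are assumptions, and the range $1 \le y \le [p/4]$ is taken from the example.

```python
from math import isqrt

def nonresidues(e):
    """Residues mod e that are not squares mod e."""
    squares = {x * x % e for x in range(e)}
    return {r for r in range(e) if r not in squares}

def eliminate(n, p, moduli):
    """Values 1 <= y <= p//4 for which n + p*y is a square modulo every e in moduli."""
    candidates = list(range(1, p // 4 + 1))
    for e in moduli:
        bad = nonresidues(e)
        candidates = [y for y in candidates if (n + p * y) % e not in bad]
    return candidates

survivors = eliminate(73, 127, [3, 5, 7])
print(survivors)                                   # [6, 8, 11, 18, 26, 29]
for y in survivors:
    value = 73 + 127 * y
    if isqrt(value) ** 2 == value:                 # a genuine perfect square
        print(y, isqrt(value))                     # 8 33
```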
All we discuss here is related to the work of Gauss. We see therefore that this "Prince of mathematics" is not only a theoretician, but also an expert problem solver.
3.5 The Number of Roots of a Quadratic Congruence
Theorem 5.1. Let $l > 0$, $p \nmid n$. If $p > 2$, then the congruence $x^2 \equiv n \pmod{p^l}$ has $1 + \left(\frac{n}{p}\right)$ solutions. If $p = 2$, then we have the following three cases. 1) $l = 1$. There is one root. 2) $l = 2$. There are two or no roots depending on whether $n \equiv 1$ or $3 \pmod 4$. 3) $l > 2$. There are four or no roots depending on whether $n \equiv 1$ or $n \not\equiv 1 \pmod 8$.

Proof. We first discuss the three cases associated with $p = 2$. 1) This is trivial. 2) The congruence $x^2 \equiv 1 \pmod 4$ has the solutions $\pm 1 \pmod 4$ and the congruence $x^2 \equiv 3 \pmod 4$ has no solution. 3) If $x^2 \equiv n \pmod{2^l}$ is soluble, then $x$ must be odd, say $2k + 1$. Since
$$(2k + 1)^2 = 4k(k + 1) + 1 = 8 \cdot \frac{k(k + 1)}{2} + 1 \equiv 1 \pmod 8,$$
it follows that the congruence is not soluble if $n \not\equiv 1 \pmod 8$. Suppose now that $n \equiv 1 \pmod 8$. When $l = 3$, there are clearly the four roots 1, 3, 5, 7. We now proceed by induction on $l$. Let $a$ satisfy $a^2 \equiv n \pmod{2^{l-1}}$. Then
$$(a + 2^{l-2}b)^2 \equiv a^2 + 2^{l-1}ab \equiv a^2 + 2^{l-1}b \pmod{2^l}.$$
We take $b = (n - a^2)/2^{l-1}$. Then $a + 2^{l-2}b$ is a solution with respect to mod $2^l$. Therefore a solution to $x^2 \equiv n \pmod{2^l}$ certainly exists. Let $x_1$ be a solution, and let $x_2$ be any solution. Then $x_1^2 - x_2^2 = (x_1 - x_2)(x_1 + x_2) \equiv 0 \pmod{2^l}$, and since $x_1 - x_2$, $x_1 + x_2$ are both even it follows that $\frac12(x_1 - x_2) \cdot \frac12(x_1 + x_2) \equiv 0 \pmod{2^{l-2}}$. But $\frac12(x_1 - x_2)$ and $\frac12(x_1 + x_2)$ must be of opposite parity, since otherwise their sum $x_1$ cannot be odd. Therefore we have either $x_1 \equiv x_2 \pmod{2^{l-1}}$ or $x_1 \equiv -x_2 \pmod{2^{l-1}}$, and this means that $x_2 = \pm x_1 + k2^{l-1}$ ($k = 0$ or 1). Hence there are at most four solutions to $x^2 \equiv n \pmod{2^l}$. Since $\pm x_1$, $\pm x_1 + 2^{l-1}$ are actually incongruent solutions we see that the congruence has exactly four solutions. When $p > 2$ and $l = 1$, the result is trivial, and the remaining part of the theorem follows from Theorem 2.9.3. □
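The counts in Theorem 5.1 can be confirmed by brute force for small prime powers. The helper `count_square_roots` below is an illustrative assumption and simply tries every residue.

```python
def count_square_roots(n, m):
    """Number of incongruent solutions of x^2 = n (mod m), by exhaustive search."""
    return sum(1 for x in range(m) if (x * x - n) % m == 0)

print(count_square_roots(1, 2), count_square_roots(1, 4), count_square_roots(3, 4))
print(count_square_roots(17, 32), count_square_roots(5, 32))    # n = 1 or not 1 (mod 8)
print(count_square_roots(2, 7**3), count_square_roots(3, 7**3)) # 1 + (n/7)
```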
From the results of Chapter 2 we can determine the number of solutions to a quadratic congruence to any integer modulus m.
3.6 Jacobi's Symbol
Throughout this section $m$ denotes a positive odd integer.

Definition. Let the standard factorization of $m$ be $p_1 \cdots p_t$, where the $p_r$ may be repeated. If $(n, m) = 1$ then we define Jacobi's symbol by
$$\left(\frac{n}{m}\right) = \prod_{r=1}^{t} \left(\frac{n}{p_r}\right).$$

Examples. $\left(\frac{1}{m}\right) = 1$. If $(a, m) = 1$, then $\left(\frac{a^2}{m}\right) = 1$.

Note: If $\left(\frac{n}{m}\right) = 1$, it does not follow that $x^2 \equiv n \pmod m$ is soluble.
Theorem 6.1. Let m and m' be positive odd integers. (i) If n
= n'
(modm) and
(n, m) = 1, then (:) = (:). (ii) If(n, m) = (n, m') = 1, then (:) (;,) = (m:'). (iii) If(n,m) = (n',m) = 1, then (:)(:) = (:'). Theorem 6.2. (
:n 1)
D
= (  l)t(ml).
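Proof. It suffices to prove that
$$\sum_{i=1}^{t} \frac{p_i - 1}{2} \equiv \frac{\prod_{i=1}^{t} p_i - 1}{2} \pmod 2,$$
which certainly holds when $t = 1$. Given any two odd integers $u$, $v$ we always have
$$\frac{u-1}{2} + \frac{v-1}{2} \equiv \frac{uv-1}{2} \pmod 2 \qquad (1)$$
(or $(u-1)(v-1) \equiv 0 \pmod 4$). It follows by induction that
$$\sum_{i=1}^{t} \frac{p_i-1}{2} = \sum_{i=1}^{t-1} \frac{p_i-1}{2} + \frac{p_t-1}{2} \equiv \frac{\prod_{i=1}^{t-1} p_i - 1}{2} + \frac{p_t-1}{2} \equiv \frac{\prod_{i=1}^{t} p_i - 1}{2} \pmod 2. \quad \Box$$

Theorem 6.3. $\left(\frac{2}{m}\right) = (-1)^{\frac{1}{8}(m^2-1)}$.

Proof. This is similar to the above, except that we replace (1) by
$$\frac{u^2v^2 - 1}{8} \equiv \frac{u^2-1}{8} + \frac{v^2-1}{8} \pmod 2. \quad \Box$$

Theorem 6.4. Let $m$, $n$ be coprime positive odd integers. Then
$$\left(\frac{m}{n}\right)\left(\frac{n}{m}\right) = (-1)^{\frac{m-1}{2} \cdot \frac{n-1}{2}}.$$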
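Proof. Let $m = \prod p$, $n = \prod q$. Then
$$\left(\frac{m}{n}\right)\left(\frac{n}{m}\right) = \prod_{p}\prod_{q} \left(\frac{p}{q}\right)\left(\frac{q}{p}\right) = \prod_{p}\prod_{q} (-1)^{\frac{p-1}{2} \cdot \frac{q-1}{2}} = (-1)^{\frac{m-1}{2} \cdot \frac{n-1}{2}},$$
where we have used (1). □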
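In using Legendre's symbol we must always ensure that the denominator is a prime. In using Jacobi's symbol, however, we can avoid the factorization process. For example:
$$\left(\frac{383}{443}\right) = -\left(\frac{443}{383}\right) = -\left(\frac{60}{383}\right) = -\left(\frac{2^2}{383}\right)\left(\frac{15}{383}\right) = -\left(\frac{15}{383}\right) = \left(\frac{383}{15}\right) = \left(\frac{8}{15}\right) = \left(\frac{2}{15}\right) = 1.$$

If we delete the condition that $m$, $n$ are positive in Theorem 6.4, then we have:

Theorem 6.5. Let $m$, $n$ be coprime odd integers. If $m$, $n$ are both negative, then
$$\left(\frac{m}{|n|}\right)\left(\frac{n}{|m|}\right) = -(-1)^{\frac{m-1}{2} \cdot \frac{n-1}{2}}.$$
Otherwise, the required value is $(-1)^{\frac{m-1}{2} \cdot \frac{n-1}{2}}$. □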
Example. Determine the solubility of $x^2 \equiv -286 \pmod{4272943}$.

Here $p = 4272943$ is a prime, and we have to evaluate $\left(\frac{-286}{p}\right)$. Since
$$\left(\frac{-1}{p}\right) = -1, \qquad \left(\frac{2}{p}\right) = 1,$$
we have
$$\left(\frac{-286}{p}\right) = \left(\frac{-1}{p}\right)\left(\frac{2}{p}\right)\left(\frac{143}{p}\right) = -\left(\frac{143}{p}\right).$$
We now determine $\left(\frac{143}{p}\right)$ as follows. We have

4272943 = 29880 × 143 + 103*,
143 = 2 × 103 − 63,
103 = 2 × 63 − 23,
63 = 2 × 23 + 17*,
23 = 2 × 17 − 11,
17 = 2 × 11 − 5*,
11 = 2 × 5 + 1,

where each step with a * denotes a change of sign. Therefore $\left(\frac{143}{p}\right) = (-1)^3 = -1$. Thus $\left(\frac{-286}{p}\right) = 1$, and the congruence is soluble. Gauss determined the solutions as $\pm 1493445$.
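The evaluation of a Jacobi symbol without factorization can be sketched in code. The function below uses the standard binary reduction (remove factors of 2 with Theorem 6.3, then swap with Theorem 6.4); it is one common way to organise the computation, not necessarily the exact sequence of steps used in the examples above, and the name `jacobi` is illustrative.

```python
def jacobi(n, m):
    """Jacobi symbol (n/m) for a positive odd m; returns 0 when gcd(n, m) > 1."""
    if m <= 0 or m % 2 == 0:
        raise ValueError("m must be a positive odd integer")
    n %= m
    sign = 1
    while n != 0:
        # Remove factors of 2 using (2/m) = (-1)^((m^2 - 1)/8).
        while n % 2 == 0:
            n //= 2
            if m % 8 in (3, 5):
                sign = -sign
        # Reciprocity: the sign changes only when both are 3 (mod 4).
        n, m = m, n
        if n % 4 == 3 and m % 4 == 3:
            sign = -sign
        n %= m
    return sign if m == 1 else 0


p = 4272943
assert jacobi(383, 443) == 1
assert jacobi(-286, p) == 1
# The solution attributed to Gauss does satisfy the congruence.
assert (1493445**2 + 286) % p == 0

# (2/15) = 1 although x^2 = 2 (mod 15) has no solution.
assert jacobi(2, 15) == 1
assert all((x * x - 2) % 15 != 0 for x in range(15))
```

The last two assertions illustrate the Note following the definition: a Jacobi symbol equal to 1 does not guarantee solubility when the modulus is composite.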
3.7 Two Terms Congruences

Let $p$ be prime. We now discuss the congruence $x^k \equiv n \pmod p$.

Theorem 7.1. The congruence
$$x^k \equiv 1 \pmod p \qquad (1)$$
has $(k, p-1)$ roots.

Proof. 1) Let $d = (k, p-1)$ and let $s$, $t$ be integers such that $sk + t(p-1) = d$. Any root of (1) then satisfies
$$x^d = (x^k)^s (x^{p-1})^t \equiv 1 \pmod p, \qquad (2)$$
and conversely, since $d \mid k$.
2) It suffices to prove that (2) has $d$ roots. From Theorem 2.9.1 the number of roots for (2) certainly cannot exceed $d$. Also, there are $p - 1$ roots to $x^{p-1} \equiv 1 \pmod p$. Again, by Theorem 2.9.1 the number of roots for
$$\frac{x^{p-1} - 1}{x^d - 1} = (x^d)^{\frac{p-1}{d} - 1} + \cdots + x^d + 1 \equiv 0 \pmod p$$
does not exceed $p - 1 - d$, so that the number of roots for (2) must be at least $d$. The theorem is proved. □

Theorem 7.2. Either the congruence $x^k \equiv n \pmod p$, $p \nmid n$, has no solution or it has $(k, p-1)$ solutions.

Proof. If $x_0$ is a solution, then $(x_0^{-1}x)^k \equiv x^k x_0^{-k} \equiv 1 \pmod p$. The required result follows from Theorem 7.1. □

Theorem 7.3. If $x$ runs over a reduced set of residues mod $p$, then $x^k$ takes $(p-1)/(k, p-1)$ different values.

Proof. From Theorem 7.2 we see that there are $(k, p-1)$ distinct residues whose $k$th powers have the same residue mod $p$. The $p - 1$ residues are now partitioned into $(p-1)/(k, p-1)$ classes, and there is a one-to-one correspondence between these classes and the values of $x^k$. □
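Theorems 7.1 to 7.3 can be checked by exhaustive search for a small prime. The sketch below assumes the prime $p = 31$ purely for illustration; the helper `kth_power_roots` is a hypothetical name.

```python
from math import gcd


def kth_power_roots(k, n, p):
    """All x in 1..p-1 with x^k congruent to n (mod p), by exhaustive search."""
    return [x for x in range(1, p) if pow(x, k, p) == n % p]


p = 31
for k in range(1, 2 * p):
    d = gcd(k, p - 1)
    # Theorem 7.1: x^k = 1 has exactly (k, p-1) roots.
    assert len(kth_power_roots(k, 1, p)) == d
    # Theorem 7.2: x^k = n has either no root or (k, p-1) roots.
    assert all(len(kth_power_roots(k, n, p)) in (0, d) for n in range(1, p))
    # Theorem 7.3: x^k takes (p-1)/(k, p-1) distinct values.
    assert len({pow(x, k, p) for x in range(1, p)}) == (p - 1) // d
```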
Definition. Let $h$ be an integer with $(h, n) = 1$. The least positive integer $l$ such that $h^l \equiv 1 \pmod n$ is called the order of $h \pmod n$.

Theorem 7.4. If $h^m \equiv 1 \pmod n$, then $l \mid m$.

Proof. Suppose the contrary. Then there are integers $q$, $r$ such that $m = ql + r$, $0 < r < l$. Now $h^r \equiv h^m (h^l)^{-q} \equiv 1 \pmod n$, which contradicts the definition of $l$. □

Theorem 7.5. Let $l \mid p - 1$, and denote by $\varphi(l)$ the number of incongruent integers with order $l$. Then $\varphi(l)$ is Euler's function.

Proof. We first establish certain properties of $\varphi(l)$.

1) If $(l_1, l_2) = 1$, then $\varphi(l_1 l_2) = \varphi(l_1)\varphi(l_2)$. Let $h_1$ and $h_2$ be integers with orders $l_1$ and $l_2$ respectively, and let $l$ be the order of $h_1h_2$. From $1 \equiv (h_1h_2)^{ll_2} \equiv h_1^{ll_2} \pmod p$ and Theorem 7.4 we see that $l_1 \mid ll_2$. Since $(l_1, l_2) = 1$, we have $l_1 \mid l$, and similarly $l_2 \mid l$. Therefore $l = l_1l_2$, that is, the order of $h_1h_2$ is $l_1l_2$. Thus, given any $h_1$, $h_2$ with orders $l_1$, $l_2$, we can construct $h_1h_2$ whose order is $l_1l_2$.

We now prove that if we do not have $h_1 \equiv h_1'$, $h_2 \equiv h_2' \pmod p$, then $h_1h_2 \not\equiv h_1'h_2' \pmod p$. For if $h_1h_2 \equiv h_1'h_2' \pmod p$, then $h_1h_1'^{-1} \equiv h_2'h_2^{-1} \pmod p$. But the order of $h_1h_1'^{-1}$ divides $l_1$ and the order of $h_2'h_2^{-1}$ divides $l_2$, so that $h_1h_1'^{-1} \equiv h_2'h_2^{-1} \equiv 1 \pmod p$, which contradicts our assumption.

Conversely, if $h$ is an integer with order $l_1l_2$ where $(l_1, l_2) = 1$, then $h_1 = h^{l_2}$, $h_2 = h^{l_1}$ are integers with orders $l_1$, $l_2$. Therefore $\varphi(l_1)\varphi(l_2) = \varphi(l_1l_2)$.

2) If $q$ is prime, then $\varphi(q^t) = q^t - q^{t-1}$. The number of roots of $x^{q^t} - 1 \equiv 0 \pmod p$ is $q^t$. If $x$ satisfies this congruence and its order is not $q^t$, then it must satisfy $x^{q^{t-1}} - 1 \equiv 0 \pmod p$. But the number of roots of this congruence is $q^{t-1}$. Therefore $\varphi(q^t) = q^t - q^{t-1}$.

That $\varphi(l)$ is Euler's function follows from the two properties in 1) and 2). □
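As an illustration of Theorem 7.5, one can tabulate the orders of all reduced residues modulo a small prime and compare the counts with Euler's function. This is a sketch under the assumption $p = 37$; `order` and `euler_phi` are illustrative helper names.

```python
from math import gcd


def order(h, n):
    """Least positive l with h^l congruent to 1 (mod n); needs gcd(h, n) = 1."""
    if gcd(h, n) != 1:
        raise ValueError("h must be coprime to n")
    l, power = 1, h % n
    while power != 1 % n:
        power = power * h % n
        l += 1
    return l


def euler_phi(l):
    """Euler's function, computed directly from its definition."""
    return sum(1 for a in range(1, l + 1) if gcd(a, l) == 1)


p = 37
counts = {}
for h in range(1, p):
    l = order(h, p)
    counts[l] = counts.get(l, 0) + 1

# For every divisor l of p - 1 there are exactly phi(l) residues of order l.
for l in range(1, p):
    if (p - 1) % l == 0:
        assert counts[l] == euler_phi(l)
```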
3.8 Primitive Roots and Indices

From Theorem 7.5 we see that there are $\varphi(p-1)$ incongruent numbers with order $p - 1 \pmod p$.

Definition 1. A positive integer whose order is $p - 1$ is called a primitive root of $p$.

Let $g$ be a primitive root of $p$. Then $g^0, g^1, \ldots, g^{p-2}$ are incongruent (mod $p$).

Definition 2. Corresponding to each integer $n$ not divisible by $p$, there exists $a$ such that
$$n \equiv g^a \pmod p, \qquad 0 \le a < p - 1.$$
We call $a$ the index of $n \pmod p$ and we denote it by $\operatorname{ind}_g n$ or simply $\operatorname{ind} n$. If $b$ is such that $n \equiv g^b \pmod p$, then $b \equiv \operatorname{ind} n \pmod{p-1}$.
The function ind is similar to the logarithm function in that it has the following properties:
1) $\operatorname{ind} nm \equiv \operatorname{ind} m + \operatorname{ind} n \pmod{p-1}$, $p \nmid mn$;
2) $\operatorname{ind} n^l \equiv l \operatorname{ind} n \pmod{p-1}$, $p \nmid n$.

Note: We do not define $\operatorname{ind} n$ when $p \mid n$; this is similar to not defining $\log 0$.

Definition 3. Let $p \nmid n$. If the congruence
$$x^k \equiv n \pmod p \qquad (1)$$
is soluble, then we call $n$ a $k$th power residue mod $p$; otherwise we call $n$ a $k$th power nonresidue.

Theorem 8.1. A necessary and sufficient condition for $n$ to be a $k$th power residue mod $p$ is that $(k, p-1)$ divides $\operatorname{ind} n$.

Proof. Let $a = \operatorname{ind} n$ and $y = \operatorname{ind} x$. Then (1) is equivalent to $ky \equiv a \pmod{p-1}$, and a necessary and sufficient condition for this to be soluble is that $(k, p-1)$ divides $a$. □

"Base interchange formula". It is clear that the index depends on the primitive root chosen. Let $g_1$ be another primitive root and $g_1 \equiv g^b \pmod p$. Then, with $a = \operatorname{ind}_{g_1} n$, we have $n \equiv g_1^a \equiv (g^b)^a \pmod p$ or
$$\operatorname{ind}_g n \equiv \operatorname{ind}_g g_1 \cdot \operatorname{ind}_{g_1} n \pmod{p-1}.$$
This is similar to the base interchange formula for the logarithm function. We list the least primitive roots for all the primes up to 5000 at the end of this chapter.
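The properties of the index can be illustrated with a small table. The sketch below assumes $p = 13$ with the primitive roots $g = 2$ and $g_1 = 6$, chosen only for illustration; `index_table` is a hypothetical helper that raises an error if the supplied base is not a primitive root.

```python
from math import gcd


def index_table(g, p):
    """Map each n in 1..p-1 to ind_g n, for a primitive root g of the prime p."""
    table, power = {}, 1
    for a in range(p - 1):
        table[power] = a
        power = power * g % p
    if len(table) != p - 1:
        raise ValueError("g is not a primitive root of p")
    return table


p, g = 13, 2
ind = index_table(g, p)

# Logarithm-like rule: ind(nm) = ind n + ind m (mod p - 1).
for n in range(1, p):
    for m in range(1, p):
        assert ind[n * m % p] == (ind[n] + ind[m]) % (p - 1)

# Theorem 8.1 with k = 4: n is a 4th power residue iff (4, p-1) divides ind n.
k = 4
d = gcd(k, p - 1)
kth_powers = {pow(x, k, p) for x in range(1, p)}
for n in range(1, p):
    assert (n in kth_powers) == (ind[n] % d == 0)

# Base interchange: ind_g n = ind_g g1 * ind_{g1} n (mod p - 1).
g1 = 6
ind1 = index_table(g1, p)
for n in range(1, p):
    assert ind[n] == ind[g1] * ind1[n] % (p - 1)
```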
3.9 The Structure of a Reduced Residue System

Let $m$ be a natural number. We ask whether there exists $g$ such that $g^0, g^1, g^2, \ldots, g^{\varphi(m)-1} \pmod m$ form a reduced residue system. If $g$ exists, then we call it a primitive root of $m$.

Theorem 9.1. A necessary and sufficient condition for $m$ to have a primitive root is that $m = 2, 4, p^l$ or $2p^l$, where $p$ is an odd prime.

Proof. 1) Let the standard factorization of $m$ be
$$m = p_1^{l_1} \cdots p_s^{l_s}.$$
From Euler's theorem, any integer $a$ not divisible by $p_i$ must satisfy
$$a^{\varphi(p_i^{l_i})} \equiv 1 \pmod{p_i^{l_i}}.$$
Let $l$ be the least common multiple of $\varphi(p_1^{l_1}), \ldots, \varphi(p_s^{l_s})$, so that $a^l \equiv 1 \pmod m$ whenever $(a, m) = 1$. Therefore there can be no primitive root if $l < \varphi(m)$. If $p > 2$, then $\varphi(p^l)$ is even, so that $m$ cannot have two distinct odd prime divisors. If $m$ has a primitive root, then $m$ must be of the form $2^l$, $p^l$ or $2^c p^l$. If $c \ge 2$, then $\varphi(2^c) = 2^{c-1}$ is also even, and so $2^c p^l$ cannot have primitive roots. Therefore $m$ must be of the form $2^l$, $p^l$ or $2p^l$.

2) $m = 2^l$. If $l = 1$, then 1 is a primitive root. If $l = 2$, then 3 is a primitive root. Let $l \ge 3$. We prove by induction that for all odd $a$, we have
$$a^{2^{l-2}} \equiv 1 \pmod{2^l}.$$
This is easy, since if $a^{2^{l-3}} = 1 + 2^{l-1}k$, then
$$a^{2^{l-2}} = 1 + 2^{l}k + 2^{2l-2}k^2 \equiv 1 \pmod{2^l}.$$
Therefore there is no primitive root for $m = 2^l$ ($l > 2$).

3) $m = p^l$. The case $l = 1$ has already been settled in §8. Let $g$ be a primitive root of $p$. If $g^{p-1} - 1 \not\equiv 0 \pmod{p^2}$, then we take $r = g$; if $g^{p-1} - 1 \equiv 0 \pmod{p^2}$, then we take $r = g + p$. We then have
$$r^{p-1} - 1 \not\equiv 0 \pmod{p^2},$$
because in the second case $(g+p)^{p-1} - 1 \equiv (p-1)pg^{p-2} \not\equiv 0 \pmod{p^2}$. Therefore such an $r$ is a primitive root of $p^2$. Let $r^{p-1} - 1 = kp$, $p \nmid k$. Since
$$(1 + kp^{s+1})^p \equiv 1 + kp^{s+2} \pmod{p^{s+3}}, \qquad s \ge 0,$$
we can prove as before that
$$r^{p^s(p-1)} \equiv 1 + kp^{s+1} \pmod{p^{s+2}}.$$
Hence
$$r^{p^{l-2}(p-1)} \equiv 1 + kp^{l-1} \pmod{p^l}, \qquad l \ge 2. \qquad (1)$$
If the order of $r$ is $e$, then $e \mid (p-1)p^{l-1} = \varphi(p^l)$. Since $r$ is a primitive root of $p$, we see that $(p-1) \mid e$. We deduce from (1) that $e = \varphi(p^l)$; that is, $r$ is a primitive root of $p^l$.

4) $m = 2p^l$. We take $g$ to be a primitive root of $p^l$. If $g$ is odd, then $g$ is also a primitive root of $2p^l$; if $g$ is even, then $g + p^l$ is a primitive root of $2p^l$. □

Theorem 9.2. Let $l > 2$. Then the order of 5 with respect to the modulus $2^l$ is $2^{l-2}$.
Proof. We first prove that, for $a \ge 3$,
$$5^{2^{a-3}} \equiv 1 + 2^{a-1} \pmod{2^a}.$$
This clearly holds when $a = 3$, and we now proceed by induction. We have
$$5^{2^{a-2}} = \left(5^{2^{a-3}}\right)^2 \equiv (1 + 2^{a-1} + k2^a)^2 \equiv 1 + 2^a \pmod{2^{a+1}}.$$
Therefore $5^{2^{l-3}} \not\equiv 1 \pmod{2^l}$ and $5^{2^{l-2}} \equiv 1 \pmod{2^l}$. That is, the order of 5 is $2^{l-2}$ (mod $2^l$). □
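Theorems 9.1 and 9.2 lend themselves to a small numerical check. The sketch below searches exhaustively for primitive roots of every modulus below 200 and compares the outcome with the shape $2, 4, p^l, 2p^l$; the bound 200 and the helper names (`order`, `has_primitive_root`, `of_allowed_form`) are assumptions made for illustration.

```python
from math import gcd


def order(h, n):
    """Least positive e with h^e congruent to 1 (mod n); assumes gcd(h, n) = 1."""
    e, power = 1, h % n
    while power != 1 % n:
        power = power * h % n
        e += 1
    return e


def has_primitive_root(m):
    """Exhaustive search for g whose powers give a reduced residue system mod m."""
    units = [a for a in range(1, m + 1) if gcd(a, m) == 1]
    return any(order(a, m) == len(units) for a in units)


def of_allowed_form(m):
    """True when m is 2, 4, p^l or 2p^l with p an odd prime."""
    if m in (2, 4):
        return True
    if m % 2 == 0:
        m //= 2
    if m % 2 == 0 or m == 1:
        return False
    p = next(d for d in range(3, m + 1, 2) if m % d == 0)  # least prime factor
    while m % p == 0:
        m //= p
    return m == 1


# Theorem 9.1 for all moduli 2 <= m < 200.
for m in range(2, 200):
    assert has_primitive_root(m) == of_allowed_form(m)

# Theorem 9.2: the order of 5 modulo 2^l is 2^(l-2) for l > 2.
for l in range(3, 12):
    assert order(5, 2**l) == 2 ** (l - 2)
```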
Theorem 9.3. Let $l > 2$. Then, given any odd $a$, there exists $b$ such that
$$a \equiv (-1)^{\frac{a-1}{2}} 5^b \pmod{2^l}, \qquad b \ge 0.$$

Proof. If $a \equiv 1 \pmod 4$, then by Theorem 9.2, $5^b$ ($0 \le b < 2^{l-2}$) gives $2^{l-2}$ distinct numbers mod $2^l$; moreover they are all congruent to 1 (mod 4). Therefore there must be an integer $b$ such that $a \equiv 5^b \pmod{2^l}$. If $a \equiv 3 \pmod 4$, then $-a \equiv 1 \pmod 4$, and the required result follows from the above. □

Theorem 9.4. Let $m = 2^l \cdot p_1^{l_1} \cdots p_s^{l_s}$ (standard factorization) with $l \ge 0$, $l_1 > 0, \ldots, l_s > 0$. We define $\delta$ to be 0 or 1 or 2 according to whether $l = 0, 1$ or $l = 2$ or $l > 2$ respectively. Then the reduced residue system of $m$ can be represented by the products of powers of $s + \delta$ numbers.

Proof. 1) Suppose that $m = m'm''$, $(m', m'') = 1$. Let $a_1, \ldots, a_{\varphi(m')}$ be a reduced residue system mod $m'$ with $a_i \equiv 1 \pmod{m''}$ (this is always possible). Let $b_1, \ldots, b_{\varphi(m'')}$ be a reduced residue system mod $m''$ with $b_j \equiv 1 \pmod{m'}$. Then the $a_ib_j$ represent a reduced residue system mod $m'm''$, and their number is $\varphi(m'm'')$. Also, if $a_ib_j \equiv a_sb_t \pmod{m'm''}$, then $a_i \equiv a_s \pmod{m'}$, $b_j \equiv b_t \pmod{m''}$.
2) From Theorems 9.1 and 9.3 we know that the reduced residue system mod $m$, where $m = p^l$ ($p > 2$), consists of the powers of a single number. If $m = 2^l$ where $l > 1$, then the reduced residue system consists of the products of powers of $\delta$ numbers. Combining this with 1), the theorem is proved. □
This theorem points out an important principle. In group theory this result is known as the Fundamental Theorem of Abelian groups.

Exercise. Prove that if $k < p$, $n = kp^2 + 1$ and
$$2^k \not\equiv 1, \qquad 2^{n-1} \equiv 1 \pmod n,$$
then $n$ is a prime number.

Hints: (i) First prove that $n$ has a prime divisor congruent to 1 (mod $p$). Let $d$ be the least positive integer such that $2^d \equiv 1 \pmod n$. Deduce that $d \nmid k$, $d \mid n - 1$ and $p \mid d$. Then obtain the conclusion from $p \mid d \mid \varphi(n)$. (ii) Deduce from $n = kp^2 + 1 = (up + 1)(vp + 1)$ that $n$ cannot be composite.

Note: Taking $p = 2^{127} - 1$, $k = 180$, Miller and Wheeler proved, with the aid of a computer, that $180(2^{127} - 1)^2 + 1$ is prime. (Nature 168 (1951), 838).
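The hypotheses of the exercise are easy to test for the Miller and Wheeler number with modern integer arithmetic. The sketch below checks $k < p$, $2^{n-1} \equiv 1 \pmod n$ and also $2^k \not\equiv 1 \pmod n$, the condition that the hint's step $d \nmid k$ relies on. It only verifies these congruences; the primality conclusion rests on the argument of the exercise, and the primality of $p = 2^{127} - 1$ is taken as known.

```python
p = 2**127 - 1          # a Mersenne prime, assumed known to be prime
k = 180
n = k * p**2 + 1

assert k < p
assert pow(2, k, n) != 1       # needed for the step "d does not divide k"
assert pow(2, n - 1, n) == 1   # Fermat-type condition of the exercise
print(len(str(n)))             # number of decimal digits of n
```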
The least primitive roots for primes less than 5000. An asterisk indicates that 10 is a primitive root.
p1
g
p
p1
g
p
p1
g
3 5 7* 11 13 17* 19* 23* 29* 31 37 41 43 47* 53 59* 61* 67 71 73 79 83 89 97* 101 103 107 109* 113* 127 131* 137 139 149* 151 157 163 167* 173 179* 181* 191 193* 197 199 211 223* 227 229* 233* 239
2 22 2·3 2·5 22.3 24 2.3 2 2·11 22.7 2·3·5 22.3 2 23 .5 2·3·7 2·23 22.13 2·29 22.3.5 2·3·11 2·5·7 23 .3 2 2·3·13 2·41 23 .11 25 .3 22.5 2 2·3·17 2·53 22.3 3 24 .7 2.3 2.7 2·5·13 23 ·17 2·3·23 22.37 2.3.5 2 22.3.13 2.3 4 2·83 22.43 2·89 22.3 2.5 2·5·19 26 .3 22.7 2 2.3 2·11 2·3·5·7 2·3·37 2·113 22.3.19 23 ·29 2·7·17
2 2 3 2 2 3 2 5 2 3 2 6 3 5 2 2 2 2 7 5 3 2 3 5 2 5 2 6 3 3 2 3 2 2 6 5 2 5 2 2 2 19 5 2 3 2 3 2 6 3 7
241 251 257* 263* 269* 271 277 281 283 293 307 311 313* 317 331 337* 347 349 353 359 367* 373 379* 383* 389* 397 401 409 419* 421 431 433* 439 443 449 457 461* 463 467 479 487* 491* 499* 503* 509* 521 523 541* 547 557 563
24 .3.5 2.5 3 23 2·131 22 ·67 2.3 3 .5 22.3.23 23 .5.7 2·3·47 22.73 "2.3 2.17 2·5·31 23 .3.13 22.79 2·3·5·11 24.3.7 2·173 22.3.29 25 ·11 2·179 2·3·61 22.3.31 2.3 3 .7 2·191 22.97 22.3 2·11 24.5 2 23 .3.17 2·11·19 22.3.5.7 2·5·43 24.3 3 2·3·73 2·13·17 26 .7 23 .3.19 22.5.23 2·3·7·11 2·233 2·239 2.3 5 2.5.7 2 2·3·83 2·251 22 ·127 22.5.13 2.3 2.29 22.3 3 .5 2·3·7·13 22 ·139 2·281
7 6 3 5 2 6 5 3 3 2 5 17 10 2 3 10 2 2 3 7 6 2 2 5 2 5 3 21 2 2 7 5 15 2 3 13 2 3 2 13 3 2 7 5 2 3 2 2 2 2 2
569 571* 577* 587 593* 599 601 607 613 617 619* 631 641 643 647* 653 659* 661 673 677 683 691 701* 709* 719 727* 733 739 743* 751 757 761 769 773 787 797 809 811* 821* 823* 827 829 839 853 857* 859 863* 877 881 883 887*
23 .71 2·3·5·19 26 .3 2 2·293 24 .37 2·13 ·23 23 .3.5 2 2·3·101 22.3 2·17 23 .7.11 2·3·103 2.3 2.5.7 27 .5 2·3·107 2·17·19 22 ·163 2·7·47 22.3.5.11 25 .3.7 22.13 2 2·11·31 2·3·5·23 22.5 2·7 22.3.59 2·359 2.3.11 2 22.3.61 2.3 2.41 2·7·53 2.3.5 3 22.3 3 .7 22.5.19 28 .3 22 ·193 2·3·131 22 ·199 23 ·101 2.3 4.5 22.5.41 2·3·137 2·7·59 22.3 2.23 2·419 22.3.71 23 ·107 2·3·11·13 2·431 22.3.73 24.5.11 2.3 2.72 2·443
3 3 5 2 3 7 7 3 2 3 2 3 3 11 5 2 2 2 5 2 5 3 2 2 11 5 6 3 5 3 2 6 11 2 2 2 3 3 2 3 2 2 11 2 3 2 5 2 3 2 5
p
p1
g
p
p1
g
p
p1
g
907 911 919 929 937* 941* 947 953* 967 971* 977* 983* 991 997 1009 1013 1019* 1021* 1031 1033* 1039 1049 1051* 1061 1063* 1069* 1087* 1091* 1093 1097* 1103* 1109* 1117 1123 1129 1151 1153* 1163 1171* 1181* 1187 1193* 1201 1213* 1217* 1223* 1229* 1231 1237 1249 1259* 1277 1279
2·3·151 2'5'7·13 2'3 3 '17 25 ·29 23 '3 2'13 22'5.47 2'11·43 23 '7'17 2·3·7·23 2·5'97 24 ,61 2·491 2.3 2511 22'3'83 24 '3 2'7 22 ·11·23 2'509 22. 3· 5 ·17 2'5·103 23 .3.43 2·3·173 23 ·131 2'3'5 2'7 22'5'53 2'3 2'59 22'3'89 2'3·181 2'5·109 22'3'7'13 23 ·137 2'19·29 22 ·277 22'3 2'31 2·3·11·17 23 '3'47 2'5 2'23 27 .3 2 2'7·83 2'3 2'5'13 22'5'59 2·593 22 ·149 24 '3'5 2 22.3'101 26 ·19 2·13·47 22,307 2·3·5·41 22'3'103 25 .3.13 2·17'37 22'11.29 2.3 2'71
2 17 7 3 5 2 2 3 5 6 3 5 6 7 11 3 2 10 14 5 3 3 7 2 3 6 3 2 5 3 5 2 2 2 11 17 5 5 2 7 2 3 11 2 3 5 2 3 2 7 2 2 3
1283 1289 1291* 1297* 1301* 1303* 1307 1319 1321 1327* 1361 1367* 1373 1381* 1399 1409 1423 1427 1429* 1433* 1439 1447* 1451 1453 1459 1471 1481 1483 1487* 1489 1493 1499 1511 1523 1531* 1543* 1549* 1553* 1559 1567* 1571* 1579* 1583* 1597 1601 1607* 1609 1613 1619* 1621* 1627 1637 1657
2·641 23 '7.23 2'3'5'43 24 '3 4 22. 52 ·13 2·3·7·31 2·653 2·659 23 .3.5.11 2'3·13·17 24 '5'17 2·683 22'7 3 22'3'5'23 2·3·233 27'11 2'3 2'79 2·23'31 22. 3· 7 ·17 23 ·179 2'719 2'3'241 2'5 2.29 22. 3.11 2 2'3 6 2.3.5'7 2 23 '5'37 2·3·13·19 2'743 24 .3'31 22'373 2'7·107 2·5·151 2'761 2.3 2. 5 ·17 2·3·257 22'3 2'43 24 '97 2·19·41 2'3 3 '29 2·5·157 2'3'263 2·7·113 22. 3· 7 ·19 26 ,5 2 2 ·11· 73 23 '3'67 22'13'31 2·809 22'3 4 '5 2·3·271 22,409 23 ,3 2 ,23
2 6 2 10 2 6 2 13 13 3 3 5 2 2 13 3 3 2 6 3 7 3 2 2 5 6 3 2 5 14 2 2 11 2 2 5 2 3 19 3 2 3 5 11 3 5 7 3 2 2 3 2 11
1663* 1667 1669 1693 1697* 1699 1709* 1721 1723 1733 1741* 1747 1753 1759 1777* 1783* 1787 1789* 1801 1811* 1823* 1831 1847* 1861* 1867 1871 1873* 1877 1879 1889 1901 1907 1913* 1931 1933 1949* 1951 1973 1979* 1987 1993* 1997 1999 2003 2011 2017* 2027 2029* 2039 2053 2063* 2069* 2081
2·3·277 2'7 2'17 22'3'139 22'3 2'47 25 '53 2·3·283 22'7'61 23 '5'43 2·3'7·41 22'433 22'3'5'29 2· 32·97 22'3'73 2'3'293 24 '3'37 2.3 4 '11 2·19'47 22'3'149 23 '3 2.5 2 2·5·181 2·911 2'3·5·61 2·13'71 22'3'5'31 2'3'311 2'5·11'17 24 '3 2'13 22'7.67 2·3'313 25 '59 22. 32'19 2'953 23 ·239 2·5·193 22'3'7'23 22 ·487 2'3'5 2'13 22'17'29 2·23·43 2·3·331 22'3'83 22'499 2.3 3 .37 2·7·11·13 2·3·5·67 25 .3 2.7 2 ·1013 22'3'13 2 2·1019 22.3 3 '19 2·1031 22 ·11·47 25 .5'13
3 2 2 2 3 3 3 3 3 2 2 2 7 6 5 10 2 6 11 6 5 3 5 2 2 14 10 2 6 3 2 2 3 2 5 2 3 2 2 2 5 2 3 5 3 5 2 2 7 2 5 2 3
p
pI
g
p
pI
g
p
pI
g
2083 2087 2089 2099* 211l 21l3* 2129 2131 2137* 2141* 2143* 2153* 2161 2179* 2203 2207* 2213 2221* 2237 2239 2243 2251* 2267 2269* 2273* 2281 2287 2293 2297* 2309* 2311 2333 2339* 2341* 2347 2351 2357 2371* 2377 2381 2383* 2389* 2393 2399 2411* 2417* 2423* 2437* 2441 2447* 2459* 2467 2473*
2·3·347 2·7·149 23 .3 2.29 2·1049 2·5·21l 26 .3 ·Il 24.7.19 2·3·5·71 23 .3.89 22.5.107 2.3 2.7.17 23 ·269 24.3 3 .5 2.3 2.1l 2 2·3·367 2·1l03 22.7.79 22.3.5.37 22.13.43 2·3·373 2·19· 59 2.3 2.5 3 2·11·103 22.34.7 25 ·71 23 .3.5.19 2.3 2.127 22.3.191 23 .7.41 22.577 2·3·5·7·1l 22. II· 53 2·7·167 22.3 2.5.13 2·3·17·23 2.5 2 .47 22 .19.31 2·3·5·79 23 .3 3 ·Il 22.5.7.17 2·3·397 22.3.199 23 .13.23 2· 11·109 2·5·241 24 ·151 2·7·173 22 .3.7.29 23 .5.61 2 ·1223 2 ·1229 2.3 2.137 23 .3.103
2 5 7 2 7 5 3 2 10 2 3 3 23 7 5 5 2 2 2 3 2 7 2 2 3 7 19 2 5 2 3 2 2 7 3 13 2 2 5 3 5 2 3 II 6 3 5 2 6 5 2 2 5
2477 2503 2521 2531 2539* 2543* 2549* 2551 2557 2579* 2591 2593* 2609 2617* 2621* 2633* 2647 2657* 2659 2663* 2671 2677 2683 2687* 2689 2693 2699* 2707 271l 2713* 2719 2729* 2731 2741* 2749 2753* 2767* 2777* 2789* 2791 2797 2801 2803 2819* 2833* 2837 2843 2851* 2857 2861* 2879 2887 2897*
22.619 2.3 2.139 23 .3 2.5.7 2·5·1l·23 2.3 3 .47 2·31·41 4.7 2.13 2.3.5 3 .17 22.3 2.71 2·1289 2·5·7·37 25 .3 4 24.163 23 .3.109 22.5.131 23 .7.47 2.3 3 .7 2 25 .83 2·3·443 2·1l 3 2·3·5·89 22.3.223 2.3 2.149 2·17·79 27 .3.7 22.673 2·19·71 2·3·1l·41 2·5·271 23 .3.113 2.3 2.151 23 .11.31 2·3·5·7·13 22.5.137 22.3.229 26 .43 2·3·461 23 .347 22.17.41 2.3 2.5.31 22.3.233 24.5 2.7 2·3·467 2 ·1409 24.3.59 22 ·709 2.7 2.29 2.3.5 2.19 23 .3.7.17 22.5. II· 13 2 ·1439 2·3·13·37 24.181
2 3 17 2 2 5 2 6 2 2 7 7 3 5 2 3 3 3 2 5 7 2 2 5 19 2 2 2 7 5 .3 3 3 2 6 3 3 3 2 6 2 3 2 2 5 2 2 2 II 2 7 5 3
2903* 2909* 2917 2927* 2939* 2953 2957 2963 2969 2971* 2999 3001 301l* 3019* 3023* 3037 3041 3049 3061 3067 3079 3083 3089 3109 3119 3121 3137* 3163 3167* 3169 3181 3187 3191 3203 3209 3217 3221* 3229 3251* 3253 3257* 3259* 3271 3299*' 3301* 3307 3313* 3319 3323 3329 3331* 3343* 3347
2·1451 22.727 22.3 6 2·7·1l·19 2·13·1l3 23 .3 3 .41 22.739 2 ·1481 23 .7.53 2.3 3 .5 ·Il 2 ·1499 23 .3.5 3 2·5·7·43 2·3·503 2 ·151l 22·3·1l·23 25 .5.19 23 .3.127 22.3 2.5.17 2·3·7·73 2.3 4.19 2·23·67 24 ·193 22.3.7.37 2·1559 24.3.5.13 26 .7 2 2·3·17·31 2·1583 22·3 2.1l 22.3.5.53 2.3 3 .59 2·5·1l·29 2·1601 23 .401 24.3.67 22.5.7.23 22.3.269 2.5 3 .13 22.3.271 23 .11.37 2·3·181 2·3·5·109 2·17·97 22.3. 52. II 2·3·19·29 24.3 2.23 2·3·7·79 2·1l·151 28 .13 2.3 2 .5.37 2·3·557 2·7·239
5 2 5 5 2 13 2 2 3 10 17 14 2 2 5 2 3 II 6 2 6 2 3 6 7 7 3 3 5 7 7 2 II 2 3 5 10 6 6 2 3 3 3 2 6 2 10 6 2 3 3 5 2
p
p1
g
P
p1
g
3359 3361 3371* 3373 3389* 3391 3407* 3413 3433* 3449 3457 3461* 3463* 3467 3469* 3491 3499 3511 3517 3527* 3529 3533 3539* 3541 3547 3557 3559 3571* 3581* 3583 3593* 3607* 3613 3617* 3623* 3631 3637 3643 3659* 3671 3673* 3677 3691 3697 3701* 3709* 3719 3727* 3733 3739 3761 3767 3769
2·23·73 25 '3.5.7 2·5·337 22.3.281 22. 7 .11 2 2·3·5·113 2·13·131 22'853 23 • 3 ·11·13 23 ·431 27 ,3 3 22'5'173 2·3·577 2·1733 22. 3 .17 2 2'5'349 2·3·11·53 2'3 3 '5'13 22'3'293 2·41·43 23 ,3 2,7 2 22 ·883 2·29·61 22'3.5'59 2'3 2.197 22.7.127 2·3'593 2·3·5'7·17 22.5.179 2.3 2.199 23 '449 2·3·601 22'3'7'43 25 '113 2'1811 2.3'5.11 2 22. 32. 101 2·3·607 2·31'59 2·5'367 23 '3 3 '17 22'919 2'3 2'5'41 24 .3'7.11 22'5 2'37 22 • 32·103 2.11.13 2 2· 34 ·23 22'3'311 2·3·7·89 24 '5'47 2'7·269 23 '3'157
11 22 2 5 3 3 5 2 5 3 7 2 3 2 2 2 2 7 "2 5 17 2 2 7 2 2 3 2 2 3 3 5 2 3 5 21 2 2 2 13 5 2 2 5 2 2 7 3 2 7 3 5 7
3779* 3793 3797 3803 3821* 3823 3833* 3847* 3851* 3853 3863* 3877 3881 3889 3907 3911 3917 3919 3923 3929 3931 3943* 3947 3967* 3989* 4001 4003 4007* 4013 4019* 4021 4027 4049 4051* 4057* 4073* 4079 4091* 4093 4099 4111 4127 4129 4133 4139* 4153* 4157 4159 4177* 4201 4211* 4217* 4219*
2 '1889 24 '3'79 22 ·13· 73 2 ·1901 22'5'191 2'3'7 2'13 23 '479 2·3·641 2· 52. 7·11 22. 32·107 2·1931 22'3'17'19 23 '5'97 24 '3 5 2.3 2'7'31 2·5·17·23 22 ·11· 89 2·3·653 2·37'53 23 ·491 2·3·5·131 2'3 3 '73 2 ·1973 2·3·661 22 ·997 25 .5 3 2·3·23·29 2·2003 22'17'59 2'7 2'41 22'3'5'67 2·3·11·61 24 .11.23 2.3 4 '5 2 23 • 3 '13 2 23 '509 2·2039 2'5·409 22.3.11'31 2·3·683 2·3·5·137 2·2063 25 • 3 ·43 22 ·1033 2·2069 23 .3.173 22 ·1039 2.3 3 . 7'11 24 .3 2'29 23 '3'5 2'7 2'5'421 23 '17'31 2·3·19'37
2 5 2 2 3 3 3 5 2 2 5 2 13 11 2 13 2 3 2 3 2 3 2 6 2 3 2 5 2 2 2 3 3 10 5 3 11 2 2 2 17 5 13 2 2 5 2 3 5 11 6 3 2
p
4229* 4231 4241 4243 4253 4259* 4261* 4271 4273 4283 4289 4297 4327* 4337* 4339* 4349* 4357 4363 4373 4391 4397 4409 4421* 4423* 4441 4447* 4451* 4457* 4463* 4481 4483 4493 4507 4513 4517 4519 4523 4547 4549 4561 4567* 4583* 4591 " 4597 4603 4621 4637 4639 4643 4649 4651* 4657 4663
p1
g
22'7'151 2'3 2'5'47 24 '5'53 2·3· 7 ·101 22 ·1063 2'2129 22'3.5'71 2'5'7·61 24 .3.89 2·2141 26 '67 23 '3'179 2·3'7·103 24 .271 2'3 2.241 22 '1087 22. 32.11 2 2·3·727 22 ·1093 2·5·439 22'7'157 23 ·19·29 22. 5·13 ·17 2· 3 ·11· 67 23 • 3· 5· 37, 2'3 2'13'19 2'5 2'89 23 '557 2·23·97 27 '5'7 2'3 3 '83 22 ·1123 2·3'751 25 .3'47 22 ·1129 2.3 2.251 2·7'17·19 2'2273 22'3'379 24 .3.5'19 2'3'761 2·29'79 2· 33 • 5 ·17 22'3'383 2·3 ·13'59 22 .3.5'7.11 22'19'61 2'3'773 2·11·211 23 '7'83 2'3'5 2'31 24 .3'97 2.3 2'7.37
2 3 3 2 2 2 2 7 5 2 3 5 3 3 10 2 2 2 2 14 2 3 3 3 21 3 2 3 5 3 2 2 2 7 2 3 5 2 6 11 3 5 11
5 2 2 2 3 5 3 3 15 3
p        p − 1              g
4673*    2^6·73             3
4679     2·2339             11
4691*    2·5·7·67           2
4703*    2·2351             5
4721     2^4·5·59           6
4723     2·3·787            2
4729     2^3·3·197          17
4733     2^2·7·13^2         5
4751     2·5^3·19           19
4759     2·3·13·61          3
4783*    2·3·797            6
4787     2·2393             2
4789     2^2·3^2·7·19       2
4793*    2^3·599            3
4799     2·2399             7
4801     2^6·3·5^2          7
4813     2^2·3·401          2
4817*    2^4·7·43           3
4831     2·3·5·7·23         3
4861     2^2·3^5·5          11
4871     2·5·487            11
4877     2^2·23·53          2
4889     2^3·13·47          3
4903     2·3·19·43          3
4909     2^2·3·409          6
4919     2·2459             13
4931*    2·5·17·29          6
4933     2^2·3^2·137        2
4937*    2^3·617            3
4943*    2·7·353            7
4951     2·3^2·5^2·11       6
4957     2^2·3·7·59         2
4967*    2·13·191           5
4969     2^3·3^3·23         11
4973     2^2·11·113         2
4987     2·3^2·277          2
4993     2^7·3·13           5
4999     2·3·7^2·17         3
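Entries of the table can be recomputed with a few lines of code. The sketch below uses the standard criterion that $g$ is a primitive root of $p$ exactly when $g^{(p-1)/q} \not\equiv 1 \pmod p$ for every prime $q$ dividing $p - 1$; the function names are illustrative, and trial division is assumed to be adequate for primes of this size.

```python
def prime_factors(n):
    """Set of prime divisors of n, found by trial division."""
    factors, d = set(), 2
    while d * d <= n:
        while n % d == 0:
            factors.add(d)
            n //= d
        d += 1
    if n > 1:
        factors.add(n)
    return factors


def is_primitive_root(g, p):
    """True when g has order p - 1 modulo the prime p."""
    if g % p == 0:
        return False
    return all(pow(g, (p - 1) // q, p) != 1 for q in prime_factors(p - 1))


def least_primitive_root(p):
    """Smallest positive primitive root of the prime p."""
    return next(g for g in range(1, p) if is_primitive_root(g, p))


small_primes = [3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]
print([least_primitive_root(p) for p in small_primes])
# Primes in the list for which 10 is a primitive root (the asterisked entries).
print([p for p in small_primes if is_primitive_root(10, p)])
```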
Chapter 4. Properties of Polynomials
4.1 The Division of Polynomials

We consider polynomials $f(x)$ with rational coefficients and we denote by $\partial^\circ f$ the degree of the polynomial.

Definition 1.1. Let $f(x)$ and $g(x)$ be two polynomials with $g(x)$ not identically zero. If there is a polynomial $h(x)$ such that $f(x) = g(x)h(x)$, then we say that $g(x)$ divides $f(x)$, and we write $g(x) \mid f(x)$ or $g \mid f$. If $g(x)$ does not divide $f(x)$, then we write $g \nmid f$.

Clearly we have the following: (i) $f \mid f$; (ii) if $f \mid g$ and $g \mid f$, then $f$ and $g$ differ only by a constant factor, and we call them associated polynomials; (iii) if $f \mid g$ and $g \mid h$, then $f \mid h$; (iv) if $f \mid g$, then $\partial^\circ f \le \partial^\circ g$. If $f \mid g$ and $g \nmid f$, then we call $f$ a proper divisor of $g$ and it is easy to see that, in this case, $\partial^\circ f < \partial^\circ g$.

Theorem 1.1. Let $f(x)$ and $g(x)$ be any two polynomials with $g(x)$ not identically zero. Then there are two polynomials $q(x)$ and $r(x)$ such that $f = q \cdot g + r$, where either $r = 0$ or $\partial^\circ r < \partial^\circ g$.

Proof. We prove this by induction on the degree of $f$. If $\partial^\circ f < \partial^\circ g$, then we can take $q = 0$, $r = f$. If $\partial^\circ f \ge \partial^\circ g$, we let
$$f = \alpha_n x^n + \cdots, \qquad g = \beta_m x^m + \cdots, \qquad \partial^\circ f = n, \quad \partial^\circ g = m,$$
so that
$$\partial^\circ\left(f - \frac{\alpha_n}{\beta_m} x^{n-m} g\right) < n.$$
From the induction hypothesis, there are two polynomials $h(x)$ and $r(x)$ such that
$$f - \frac{\alpha_n}{\beta_m} x^{n-m} g = hg + r,$$
where either $r = 0$ or $\partial^\circ r < \partial^\circ g$. We now put
$$q = h + \frac{\alpha_n}{\beta_m} x^{n-m},$$
so that $f = qg + r$ as required. □
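The induction in the proof of Theorem 1.1 is the usual long-division procedure: repeatedly subtract $\frac{\alpha_n}{\beta_m} x^{n-m} g$ to lower the degree. A minimal sketch with exact rational arithmetic follows; representing a polynomial as a list of coefficients, lowest degree first, is an assumption of this example, as is the name `poly_divmod`.

```python
from fractions import Fraction


def poly_divmod(f, g):
    """Return (q, r) with f = q*g + r and r = [] (zero) or deg r < deg g.

    Polynomials are coefficient lists over the rationals, lowest degree first.
    """
    r = [Fraction(c) for c in f]
    g = [Fraction(c) for c in g]
    while g and g[-1] == 0:
        g.pop()
    if not g:
        raise ZeroDivisionError("g must not be identically zero")
    while r and r[-1] == 0:
        r.pop()
    q = [Fraction(0)] * max(len(r) - len(g) + 1, 0)
    while len(r) >= len(g):
        shift = len(r) - len(g)          # n - m
        coeff = r[-1] / g[-1]            # alpha_n / beta_m
        q[shift] = coeff
        for i, c in enumerate(g):        # subtract coeff * x^shift * g
            r[i + shift] -= coeff * c
        while r and r[-1] == 0:
            r.pop()
    return q, r


# x^3 - 1 divided by 2x + 1.
q, r = poly_divmod([-1, 0, 0, 1], [1, 2])
print(q, r)
```

For this input the quotient is $\frac{1}{2}x^2 - \frac{1}{4}x + \frac{1}{8}$ and the remainder is $-\frac{9}{8}$.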
Definition 1.2. By an ideal we mean a set $I$ of polynomials satisfying the following conditions: (i) If $f, g \in I$, then $f + g \in I$; (ii) If $f \in I$ and $h$ is any polynomial, then $fh \in I$.

Example. The multiples of a fixed polynomial $f(x)$ form an ideal.

Theorem 1.2. Given any ideal $I$, there exists a polynomial $f \in I$ such that any polynomial in $I$ is a multiple of $f$; that is, $I$ is the ideal of the set of multiples of $f$.

Proof. Let $f$ be a polynomial in $I$ with the least degree. If $g$ is a polynomial in $I$ which is not a multiple of $f$, then, according to Theorem 1.1, there are polynomials $q(x)$ and $r(x)$ ($r \ne 0$) such that
$$g = qf + r, \qquad \partial^\circ r < \partial^\circ f.$$
Since $f \in I$, it follows from (ii) that $qf \in I$, and hence from (i) that $g - qf \in I$, that is, $r \in I$. But this contradicts the minimal degree property of $f$. The theorem is proved. □

Definition 1.3. Let $f$ and $g$ be two polynomials. Consider the set of polynomials of the form $mf + ng$ where $m$, $n$ are polynomials. From Theorem 1.2 we see that this set is identical with the set of polynomials which are multiples of a polynomial $d$. We call this polynomial $d$ the greatest common divisor of $f$ and $g$, and we write $(f, g) = d$. For the sake of uniqueness we shall take the leading coefficient of $(f, g)$ to be 1, that is, $(f, g)$ is a monic polynomial.

Theorem 1.3. The greatest common divisor $(f, g)$ has the following properties: (i) There are two polynomials $m$, $n$ such that $(f, g) = mf + ng$; (ii) For every pair of polynomials $m$, $n$ we have $(f, g) \mid mf + ng$; (iii) If $l \mid f$ and $l \mid g$, then $l \mid (f, g)$. □

Definition 1.4. If $(f, g) = 1$, then we say that $f$ and $g$ are coprime.

Theorem 1.4. Let $p$ be an irreducible polynomial. If $p \mid fg$, then either $p \mid f$ or $p \mid g$.

Proof. If $p \nmid f$, then $(f, p) = 1$. Thus, from Theorem 1.3 there are polynomials $m$, $n$ such that $mf + np = 1$, so that $mfg + ngp = g$. Since $p \mid fg$, it follows that $p \mid g$. □
4.2 The Unique Factorization Theorem

Theorem 2.1. Any polynomial can be factorized into a product of irreducible polynomials. If associated polynomials are treated as identical, then, apart from the ordering of the factors, this factorization is unique. □

The theorem can be proved by mathematical induction on the degree of the polynomial.

Theorem 2.2. Let $f(x)$ and $g(x)$ be two polynomials with rational coefficients, and let $f(x)$ be irreducible. Suppose that $f(x) = 0$ and $g(x) = 0$ have a common root. Then $f(x) \mid g(x)$.

Proof. Since $f$ and $g$ have a common zero, it follows that $(f, g) \ne 1$. Let $d(x)$ be the greatest common factor of $f(x)$ and $g(x)$. Then $d(x)$ and $f(x)$ are associated polynomials, because $f(x)$ is irreducible. Therefore $f(x) \mid g(x)$. □

From this theorem we deduce the following: If $f(x)$ is an irreducible polynomial of degree $n$, then the zeros
$$\vartheta^{(1)}, \ldots, \vartheta^{(n)}$$
are distinct. Moreover, if $\vartheta^{(i)}$ is a zero of another polynomial $g(x)$ with rational coefficients, then the other $n - 1$ numbers are also zeros of $g(x)$.

Theorem 2.3. Let $f$ and $g$ be monic polynomials:
$$f = p_1^{a_1} \cdots p_s^{a_s}, \qquad g = p_1^{b_1} \cdots p_s^{b_s}, \qquad a_\nu \ge 0, \; b_\nu \ge 0,$$
where the $p_\nu$ are distinct irreducible monic polynomials. Then
$$(f, g) = p_1^{c_1} \cdots p_s^{c_s},$$
where $c_\nu = \min(a_\nu, b_\nu)$. □

Definition 2.1. Let $f$ and $g$ be two polynomials. Polynomials which are divisible by both $f$ and $g$ are called common multiples of $f$ and $g$. Those common multiples which have the least degree are called the least common multiples, and we denote by $[f, g]$ the monic least common multiple.

Theorem 2.4. Under the same hypothesis as Theorem 2.3 we have
$$[f, g] = p_1^{d_1} \cdots p_s^{d_s},$$
where $d_\nu = \max(a_\nu, b_\nu)$. □

From this we deduce:

Theorem 2.5. A least common multiple divides every common multiple. □

Theorem 2.6. Let $f$, $g$ be monic polynomials. Then $fg = [f, g](f, g)$. □
4.3 Congruences

Let $m(x)$ be a polynomial. If $m(x) \mid f(x) - g(x)$, then we say that $f(x)$ is congruent to $g(x)$ modulo $m(x)$ and we write
$$f(x) \equiv g(x) \pmod{m(x)}.$$
With respect to any modulus $m(x)$ we have: (i) $f \equiv f \pmod m$; (ii) if $f \equiv g \pmod m$, then $g \equiv f \pmod m$; (iii) if $f \equiv g$, $g \equiv h \pmod m$, then $f \equiv h \pmod m$; (iv) if $f \equiv g$, $f_1 \equiv g_1 \pmod m$, then $f \pm f_1 \equiv g \pm g_1$, $ff_1 \equiv gg_1 \pmod m$.

Being congruent is an equivalence relation which partitions the set of polynomials into equivalence classes. From (iv) we see that addition and multiplication can be defined on these classes. We denote by 0 the class whose members are divisible by $m(x)$. If $m(x)$ is irreducible we can even define division on the set of equivalence classes (except by 0, of course). Specifically, if $f(x)$ is not a multiple of $m(x)$, then there are polynomials $a(x)$, $b(x)$ such that $a(x)f(x) + b(x)m(x) = 1$, which means that there is a polynomial $a(x)$ such that $a(x)f(x) \equiv 1 \pmod{m(x)}$. We state this as a theorem.
Theorem 3.1. Let $m(x)$ be irreducible. Then any nonzero equivalence class has a reciprocal. That is, if $A$ is a nonzero equivalence class, then there exists a class $B$ such that for any polynomials $f(x)$ and $g(x)$ in $A$ and $B$ respectively we have $f(x)g(x) \equiv 1 \pmod{m(x)}$. □

We now give an example to illustrate the ideas in this section. Let $m(x) = x^2 + 1$, an irreducible polynomial. Each equivalence class contains a unique polynomial $ax + b$ which we may take as the representative. The addition and subtraction of classes is given by $ax + b \pm (a_1x + b_1) = (a \pm a_1)x + (b \pm b_1)$. Multiplication is given by
$$(ax + b)(a_1x + b_1) = aa_1x^2 + (ab_1 + a_1b)x + bb_1 \equiv (ab_1 + a_1b)x + bb_1 - aa_1 \pmod{x^2 + 1}.$$
Using the ordered pair $(a, b)$ to denote the class containing $ax + b$ we then have
$$(a, b) \pm (a_1, b_1) = (a \pm a_1, b \pm b_1),$$
$$(a, b)(a_1, b_1) = (ab_1 + ba_1, bb_1 - aa_1).$$
From
$$(ax + b)(-ax + b) \equiv a^2 + b^2 \pmod{x^2 + 1},$$
we see that the inverse of $(a, b)$ is $\left(\frac{-a}{a^2 + b^2}, \frac{b}{a^2 + b^2}\right)$. In other words we have the arithmetic of the complex number $ai + b$.

Extending the idea here, if $m(x)$ is a monic polynomial of degree $n$, then each equivalence class possesses a unique polynomial with degree less than $n$, say
$$a_0 + a_1x + \cdots + a_{n-1}x^{n-1},$$
and the arithmetic of the congruence modulo $m(x)$ becomes the arithmetic of these polynomials. The sum of two such polynomials is obtained by adding the corresponding coefficients, and the product is the ordinary product polynomial reduced modulo $m(x)$.

Exercise 1. Let $\alpha_1, \alpha_2, \alpha_3$ be distinct. Determine a quadratic polynomial $f(x)$ satisfying $f(\alpha_1) = \beta_1$, $f(\alpha_2) = \beta_2$, $f(\alpha_3) = \beta_3$.

Answer: The Lagrange interpolation formula
$$f(x) = \beta_1 \frac{(x - \alpha_2)(x - \alpha_3)}{(\alpha_1 - \alpha_2)(\alpha_1 - \alpha_3)} + \beta_2 \frac{(x - \alpha_3)(x - \alpha_1)}{(\alpha_2 - \alpha_3)(\alpha_2 - \alpha_1)} + \beta_3 \frac{(x - \alpha_1)(x - \alpha_2)}{(\alpha_3 - \alpha_1)(\alpha_3 - \alpha_2)}.$$
Exercise 2. Let $m_1(x)$ and $m_2(x)$ be two nonassociated irreducible polynomials. Let $f_1(x)$ and $f_2(x)$ be two given polynomials. Prove that there exists a polynomial $f(x)$ such that $f(x) \equiv f_i(x) \pmod{m_i(x)}$, $i = 1, 2$.
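The arithmetic of classes modulo $x^2 + 1$ described in this section can be written out directly on ordered pairs. In the sketch below a pair `(a, b)` stands for the class of $ax + b$, as in the text; the function names `mul` and `inverse` are illustrative, and exact fractions are used so that reciprocals stay rational.

```python
from fractions import Fraction


def mul(u, v):
    """Product of the classes (a, b) and (a1, b1) modulo x^2 + 1."""
    (a, b), (a1, b1) = u, v
    return (a * b1 + b * a1, b * b1 - a * a1)


def inverse(u):
    """Reciprocal of a nonzero class (a, b), from (ax + b)(-ax + b) = a^2 + b^2."""
    a, b = u
    norm = Fraction(a * a + b * b)
    return (-a / norm, b / norm)


u, v = (1, 2), (3, 4)            # the classes of x + 2 and 3x + 4
print(mul(u, v))                 # matches (2 + i)(4 + 3i) = 5 + 10i
print(mul(u, inverse(u)))        # the class of the constant 1
```

The pair `(a, b)` corresponds to the complex number $b + ai$, so `mul((1, 2), (3, 4))` gives `(10, 5)`, in agreement with $(2 + i)(4 + 3i) = 5 + 10i$.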
4.4 Integer Coefficients Polynomials

It is clear that the set of integer coefficients polynomials is closed with respect to addition, subtraction and multiplication. A set of integer coefficients polynomials is called an ideal if (i) $f + g$ belongs to the set whenever $f$ and $g$ belong to the set, (ii) $fg$ belongs to the set whenever $f$ belongs to the set and $g$ is any integer coefficients polynomial.

Theorem 4.1. (Hilbert) Every ideal $A$ possesses a finite number of polynomials $f_1, \ldots, f_n$ with the following property: Every polynomial $f \in A$ is representable as $f = g_1f_1 + \cdots + g_nf_n$ where $g_1, \ldots, g_n$ are integer coefficients polynomials.

Proof. 1) Denote by $B$ the set of leading coefficients of members of $A$. We claim that $B$ forms an integral modulus. To see this, we observe that if $a, b \in B$, where $f(x) = ax^n + \cdots$, $g(x) = bx^m + \cdots$, then by (ii) we know that $f(x)x^m$, $g(x)x^n \in A$ so that
$$f(x)x^m \pm g(x)x^n = (a \pm b)x^{m+n} + \cdots$$
are in $A$. Therefore $a \pm b \in B$, which proves our claim. From Theorem 1.4.3 members of $B$ are multiples of an integer $d$. Let the corresponding polynomial with leading coefficient $d$ be
$$f_1(x) = dx^l + d_1x^{l-1} + \cdots.$$

2) Let $f \in A$. Then there are two polynomials $q(x)$ and $r(x)$ such that $f(x) = q(x)f_1(x) + r(x)$ where $\partial^\circ r < \partial^\circ f_1$ or $r = 0$. This is certainly so if the degree of $f$ is less than that of $f_1$. If $f(x) = ax^n + \cdots + a_n$ ($n \ge l$), then by 1) we see that $d \mid a$, and
$$f(x) - \frac{a}{d}x^{n-l}f_1(x)$$
is a polynomial with degree at most $n - 1$. If the degree here is greater than or equal to $l$, then its leading coefficient is again divisible by $d$. Continuing the argument we see that our claim is valid.

3) If every member of $A$ has degree at least $l$, then the theorem is proved. Otherwise we let $d'$ be the greatest common divisor of the leading coefficients of members of $A$ whose degrees are less than $l$, and we let
$$f_2 = d'x^{l'} + d_1'x^{l'-1} + \cdots \qquad (d \mid d')$$
be the corresponding polynomial in $A$. From the above, we see that members of $A$ whose degree lies between $l'$ and $l$ can be written as $f(x) = q(x)f_2(x) + r(x)$ where $\partial^\circ r < \partial^\circ f_2$ or $r = 0$. Continuing this argument the theorem is proved. □
4.5 Polynomial Congruences with a Prime Modulus

In this section all the polynomials have integer coefficients and $p$ is a fixed prime number.

Definition 5.1. If the corresponding coefficients of two polynomials $f(x)$ and $g(x)$ differ by multiples of $p$, then we say that $f(x)$ and $g(x)$ are congruent modulo $p$, and we write $f(x) \equiv g(x) \pmod p$. By the degree $\partial^\circ f$ of $f(x)$ modulo $p$ we mean the highest degree of a term of $f(x)$ whose coefficient is not a multiple of $p$.

For example, $7x^2 + 16x + 9 \equiv 2x + 2 \pmod 7$, and $\partial^\circ(7x^2 + 16x + 9) = 1 \pmod 7$. But with respect to the modulus 3, $\partial^\circ(7x^2 + 16x + 9) = 2$.

Clearly we have
(i) $f(x) \equiv f(x) \pmod p$;
(ii) if $f \equiv g \pmod p$, then $g \equiv f \pmod p$;
(iii) if $f \equiv g$, $g \equiv h \pmod p$, then $f \equiv h \pmod p$;
(iv) if $f \equiv g$, $f_1 \equiv g_1 \pmod p$, then $f \pm f_1 \equiv g \pm g_1$ and $ff_1 \equiv gg_1 \pmod p$.

We note particularly that

$$(f(x))^p \equiv f(x^p) \pmod p.$$

Definition 5.2. Let $f(x)$ and $g(x)$ be polynomials with $g(x)$ not identically zero mod $p$. If there is a polynomial $h(x)$ such that $f(x) \equiv h(x)g(x) \pmod p$, then we say that $g(x)$ divides $f(x)$ modulo $p$. We call $g(x)$ a divisor of $f(x)$ modulo $p$, and we write $g(x) \mid f(x) \pmod p$.

Example. From $x^5 + 3x^4 - 4x^3 + 2 \equiv (2x^2 - 3)(3x^3 - x^2 + 1) \pmod 5$ we see that $2x^2 - 3 \mid x^5 + 3x^4 - 4x^3 + 2 \pmod 5$.

We have the following:
(i) $f(x) \mid f(x) \pmod p$;
(ii) if $f(x) \mid g(x)$ and $g(x) \mid f(x) \pmod p$, then $f(x)$ and $g(x)$ differ only by a constant factor; that is, there exists an integer $a$ such that $f(x) \equiv ag(x) \pmod p$. In this case we say that $f(x)$ and $g(x)$ are associated modulo $p$. It is easy to see that every polynomial has $p - 1$ associates
modulo $p$. Moreover, there is a unique monic associated polynomial.
(iii) If $f \mid g$, $g \mid h \pmod p$, then $f \mid h \pmod p$.
(iv) Let $f(x)$ and $g(x)$ be two polynomials with $g(x)$ not identically zero modulo $p$. Then there are two polynomials $q(x)$ and $r(x)$ such that $f(x) \equiv q(x)g(x) + r(x) \pmod p$, where either $\partial^\circ r < \partial^\circ g$, or $r(x) \equiv 0 \pmod p$.

Definition 5.3. If a polynomial $f(x)$ cannot be factorized into a product of two polynomials with smaller degrees mod $p$, then we say that $f(x)$ is an irreducible polynomial mod $p$, or that $f(x)$ is prime mod $p$.

Example. We take $p = 3$. There are three nonassociated linear polynomials, namely $x$, $x + 1$, $x + 2$, which are irreducible. There are nine nonassociated quadratic polynomials, namely $x^2$, $x^2 + x$, $x^2 + 2x$, $x^2 + 1$, $x^2 + x + 1$, $x^2 + 2x + 1$, $x^2 + 2$, $x^2 + x + 2$, $x^2 + 2x + 2$. Of these there are 6 (those of the form $(x + a)(x + b)$) which are reducible, and the three irreducible ones are $x^2 + 1$, $x^2 + x + 2$, $x^2 + 2x + 2$.

We note that if a polynomial is irreducible mod $p$ (and its leading coefficient is not a multiple of $p$), then it is irreducible in the ordinary sense, and from this we deduce that $x^2 + 2x + 2$ has no rational zeros. The determination of the number of irreducible polynomials mod $p$ of degree $n$ is an interesting problem which we shall solve in §9.

Theorem 5.1. Any polynomial can be written as a product of irreducible polynomials mod $p$, and this product representation is unique apart from associates and ordering of the factors. □

We can define, similarly to §1, the greatest common divisor and the least common multiple. If we denote by $(f, g)$ the monic greatest common divisor, then we have

Theorem 5.2. Given polynomials $f(x)$ and $g(x)$, there are polynomials $m(x)$ and $n(x)$ such that $m(x)f(x) + n(x)g(x) \equiv (f(x), g(x)) \pmod p$. □
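The example with $p = 3$ can be checked by brute force. The following is a minimal sketch, not part of the original text: polynomials are stored as coefficient lists (lowest degree first), and the helper names `poly_mul_mod`, `monic_polys` and `irreducible_monics` are introduced here purely for illustration.

```python
from itertools import product

def poly_mul_mod(f, g, p):
    """Multiply two coefficient lists (lowest degree first) modulo p."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] = (out[i + j] + a * b) % p
    return out

def monic_polys(deg, p):
    """All monic polynomials of the given degree with coefficients in 0..p-1."""
    return [list(c) + [1] for c in product(range(p), repeat=deg)]

def irreducible_monics(deg, p):
    """Monic polynomials of degree `deg` that are not a product of two
    monic polynomials of smaller positive degree modulo p."""
    reducible = set()
    for d in range(1, deg // 2 + 1):
        for f in monic_polys(d, p):
            for g in monic_polys(deg - d, p):
                reducible.add(tuple(poly_mul_mod(f, g, p)))
    return [f for f in monic_polys(deg, p) if tuple(f) not in reducible]

# Monic irreducible quadratics mod 3, as coefficient lists [c0, c1, 1].
print(irreducible_monics(2, 3))
```

Each list `[c0, c1, 1]` stands for $x^2 + c_1x + c_0$, so the three lists printed correspond to $x^2 + 1$, $x^2 + x + 2$ and $x^2 + 2x + 2$.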
4.6 On Several Theorems Concerning Factorizations

Definition 6.1. Let $f(x) = a_nx^n + a_{n-1}x^{n-1} + \cdots$ be a polynomial. The polynomial

$$na_nx^{n-1} + (n-1)a_{n-1}x^{n-2} + \cdots$$

is called the derivative of $f(x)$ and is denoted by $f'(x)$.

Clearly we have $(f(x) + g(x))' = f'(x) + g'(x)$, and it is not difficult to prove that $(f(x)g(x))' = f'(x)g(x) + g'(x)f(x)$.

Definition 6.2. If a polynomial $f(x)$ is divisible by the square of a nonconstant polynomial mod $p$, then we say that $f(x)$ has repeated factors mod $p$.

For example, $x^5 + x^4 - x^3 - x^2 + x + 1$ has the repeated factor $(x^2 + 1)^2$ modulo 3.
Theorem 6.1. A necessary and sufficient condition for $f(x)$ to have repeated factors is that the degree of $(f(x), f'(x))$ is at least 1. □

Theorem 6.2. If $p \nmid n$, then $x^n - 1$ has no repeated factors mod $p$. □

Theorem 6.3. Let $(m, n) = d$. Then $(x^m - 1, x^n - 1) = x^d - 1$. □
Theorem 6.4. Let (m, n) = d. Then
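4.7 Double Moduli Congruences

Definition 7.1. Let $p$ be a prime number and $\varphi(x)$ be a polynomial. If $f_1(x) - f_2(x)$ is a multiple of $\varphi(x)$ mod $p$, then we say that $f_1$ and $f_2$ are congruent to the double moduli $p$, $\varphi(x)$ and we write

$$f_1(x) \equiv\equiv f_2(x) \quad (\operatorname{modd}\ p, \varphi(x)).$$

For example, $x^5 + 3x^4 + x^2 + 4x + 3 \equiv\equiv 0 \ (\operatorname{modd}\ 5,\ 2x^2 - 3)$.

Double moduli congruences have the following properties:
1) $f(x) \equiv\equiv f(x) \ (\operatorname{modd}\ p, \varphi(x))$;
2) if $f \equiv\equiv g \ (\operatorname{modd}\ p, \varphi)$, then $g \equiv\equiv f \ (\operatorname{modd}\ p, \varphi)$;
3) if $f \equiv\equiv g$ and $g \equiv\equiv h \ (\operatorname{modd}\ p, \varphi)$, then $f \equiv\equiv h \ (\operatorname{modd}\ p, \varphi)$;
4) if $f \equiv\equiv g$ and $f_1 \equiv\equiv g_1 \ (\operatorname{modd}\ p, \varphi)$, then $f \pm f_1 \equiv\equiv g \pm g_1$ and $ff_1 \equiv\equiv gg_1 \ (\operatorname{modd}\ p, \varphi)$;
5) suppose that the degree of $\varphi(x)$ (mod $p$) is $n$. Then every polynomial is congruent to one of the following polynomials

$$a_1x^{n-1} + a_2x^{n-2} + \cdots + a_n, \qquad 0 \le a_i \le p - 1. \tag{1}$$

It is clear that there are $p^n$ polynomials in (1), no two of them are congruent $(\operatorname{modd}\ p, \varphi(x))$, and any polynomial must be congruent to one of them $(\operatorname{modd}\ p, \varphi(x))$.

Definition 7.2. We call the $p^n$ polynomials in (1) a complete residue system $(\operatorname{modd}\ p, \varphi(x))$. By discarding those polynomials which are not coprime with $\varphi(x)$ we have a reduced residue system $(\operatorname{modd}\ p, \varphi(x))$.

Theorem 7.1. Let $(g(x), \varphi(x)) = 1$. Then, as $f(x)$ runs through a complete (or reduced) residue system $(\operatorname{modd}\ p, \varphi(x))$, so does $f(x)g(x)$.

Proof. If $g(x)f_1(x) \equiv\equiv g(x)f_2(x) \ (\operatorname{modd}\ p, \varphi(x))$, then from $(g(x), \varphi(x)) = 1$ we deduce that $f_1(x) \equiv\equiv f_2(x) \ (\operatorname{modd}\ p, \varphi(x))$. The required result follows easily from this. □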
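4.8 Generalization of Fermat's Theorem

Let $p$ be a prime number, and let $\varphi(x)$ be an irreducible polynomial of degree $n$ mod $p$.

Theorem 8.1. Given any polynomial $f(x)$, we have

$$(f(x))^{p^n} \equiv\equiv f(x) \quad (\operatorname{modd}\ p, \varphi(x)), \tag{1}$$

and in particular, we have

$$x^{p^n} \equiv\equiv x \quad (\operatorname{modd}\ p, \varphi(x)). \tag{2}$$

Proof. Let $f_1(x), \ldots, f_{p^n-1}(x)$ be a reduced residue system $(\operatorname{modd}\ p, \varphi(x))$. If $f(x)$ is not a multiple of $\varphi(x)$ mod $p$, then $f(x)f_1(x), \ldots, f(x)f_{p^n-1}(x)$ is also a reduced residue system. Therefore

$$\prod_{i=1}^{p^n-1} f_i(x) \equiv\equiv \prod_{i=1}^{p^n-1} \bigl(f(x)f_i(x)\bigr) \quad (\operatorname{modd}\ p, \varphi(x)),$$

or

$$\bigl((f(x))^{p^n-1} - 1\bigr)\prod_{i=1}^{p^n-1} f_i(x) \equiv\equiv 0 \quad (\operatorname{modd}\ p, \varphi(x)),$$

and hence

$$(f(x))^{p^n-1} \equiv\equiv 1 \quad (\operatorname{modd}\ p, \varphi(x)).$$

Multiplying by $f(x)$ gives (1); if $f(x)$ is a multiple of $\varphi(x)$ mod $p$, then (1) is trivial. □

This theorem is a generalization of Fermat's theorem in Chapter 1. We note that (2) is a special case of (1), but we observe that (1) can also be deduced from (2), since

$$(f(x))^{p^n} \equiv f(x^{p^n}) \equiv\equiv f(x) \quad (\operatorname{modd}\ p, \varphi(x)).$$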
Exercise. Generalize Euler's theorem in Chapter 2.

Theorem 8.2. Any irreducible polynomial of degree $n$ must divide $x^{p^n-1} - 1$ (mod $p$). □
Theorem 8.3. The number of roots of $f(X) \equiv\equiv 0 \ (\operatorname{modd}\ p, \varphi(x))$ does not exceed the degree of $f(X)$.

Proof. Let $g(x)$ be a root of the congruence, and let

$$f(X) = a_nX^n + a_{n-1}X^{n-1} + \cdots + a_0,$$

so that

$$f(X) - f(g(x)) = a_n\bigl(X^n - (g(x))^n\bigr) + a_{n-1}\bigl(X^{n-1} - (g(x))^{n-1}\bigr) + \cdots = (X - g(x))h(X).$$

If $g_1(x)$ is another root distinct from $g(x)$, then $h(g_1(x)) \equiv\equiv 0 \ (\operatorname{modd}\ p, \varphi(x))$, and the required result follows. □

Theorem 8.4. $x^{p^n-1} - 1$ is not divisible by any irreducible polynomial of degree greater than $n$, mod $p$.

Proof. Let $\psi(x)$ be an irreducible polynomial with degree $m > n$, mod $p$, and suppose, if possible, that $x^{p^n} \equiv\equiv x \ (\operatorname{modd}\ p, \psi(x))$. There are $p^m$ incongruent polynomials $f(x)$ $\operatorname{modd}\ p, \psi(x)$. From $(f(x))^p \equiv f(x^p) \pmod p$ we deduce that $(f(x))^{p^n} \equiv f(x^{p^n}) \equiv\equiv f(x) \ (\operatorname{modd}\ p, \psi(x))$. This means that the number of roots of $X^{p^n} \equiv\equiv X \ (\operatorname{modd}\ p, \psi(x))$, being $p^m$, exceeds $p^n$. This is impossible by Theorem 8.3, so that the theorem is proved. □
Theorem 8.5. Let $\psi(x)$ be an irreducible polynomial of degree $l$, mod $p$. If $\psi(x) \mid x^{p^n} - x$ (mod $p$), then $l \mid n$.

Proof. From Theorem 8.2 and the hypothesis, we have

$$\psi(x) \mid \bigl(x^{p^n-1} - 1,\ x^{p^l-1} - 1\bigr) \pmod p,$$

and from Theorem 6.3,

$$\psi(x) \mid x^{p^d-1} - 1 \pmod p, \qquad d = (n, l).$$

Moreover, from Theorem 8.4 we see that $l \le d = (n, l)$, so that $l = d$, and hence $l \mid n$. □

Exercise. Let $\psi(x)$ and $\varphi(x)$ be irreducible polynomials mod $p$. Then a necessary and sufficient condition for the solubility of $\psi(X) \equiv\equiv 0 \ (\operatorname{modd}\ p, \varphi(x))$ is that $\partial^\circ\psi \mid \partial^\circ\varphi$. Prove further that if it is soluble then it can be factorized into a product of linear factors.
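4.9 Irreducible Polynomials mod p

Theorem 9.1. The product of all the irreducible polynomials of degree $n$ (mod $p$) is equal to

$$\frac{(x^{p^n} - x)\prod_{q_1,q_2}\bigl(x^{p^{n/q_1q_2}} - x\bigr)\cdots}{\prod_{q_1}\bigl(x^{p^{n/q_1}} - x\bigr)\prod_{q_1,q_2,q_3}\bigl(x^{p^{n/q_1q_2q_3}} - x\bigr)\cdots} \pmod p,$$

where $q_1, q_2, \ldots$ run over the distinct prime divisors of $n$.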
Proof. By Theorem 6.1 the polynomial $x^{p^n} - x$ has no repeated factors, so that it can be factorized into a product of various distinct irreducible polynomials of the form

where $\psi(x) \mid x^{p^d} - x$, $d \mid n$. We now apply the inclusion-exclusion principle of §1.7. We already know that $x^{p^n} - x$ is a product of various irreducible polynomials of degree $m$ where $m \mid n$. We exclude all those polynomials whose degrees divide $n/q_1$; but those polynomials whose degrees divide $n/q_1q_2$ have been excluded twice, so that we have to reinclude them, and so on. □

Theorem 9.2. The total number of irreducible polynomials of degree $n$ (mod $p$) is equal to
$$\frac{1}{n}\Bigl(p^n - \sum_{q_1} p^{n/q_1} + \sum_{q_1,q_2} p^{n/q_1q_2} - \sum_{q_1,q_2,q_3} p^{n/q_1q_2q_3} + \cdots\Bigr).$$

Here the sums are over the distinct prime divisors $q_i$ of $n$.

Proof. The degree of the polynomial in Theorem 9.1 is

$$N = p^n - \sum_{q_1} p^{n/q_1} + \cdots, \tag{1}$$

and each of its factors has degree $n$, so that the result follows. □

Let $n = q_1^{l_1} \cdots q_s^{l_s}$, where $q_i$ are the distinct prime divisors of $n$. Now
Therefore $N > 0$, so that we have:

Theorem 9.3. There always exists an irreducible polynomial of degree $n$ (mod $p$). □
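The counting formula of Theorem 9.2 is easy to evaluate numerically. The following is a small sketch under the assumption that the irreducible polynomials being counted are the monic ones (one per class of associates); the function names `prime_divisors` and `count_irreducible` are introduced here for illustration only.

```python
from itertools import combinations
from math import prod

def prime_divisors(n):
    """Distinct prime divisors of n, by trial division."""
    out, d = [], 2
    while d * d <= n:
        if n % d == 0:
            out.append(d)
            while n % d == 0:
                n //= d
        d += 1
    if n > 1:
        out.append(n)
    return out

def count_irreducible(n, p):
    """Evaluate (1/n)(p^n - sum p^(n/q1) + sum p^(n/q1 q2) - ...)."""
    qs = prime_divisors(n)
    total = 0
    for k in range(len(qs) + 1):
        for subset in combinations(qs, k):
            # prod(()) == 1, so the empty subset contributes p^n
            total += (-1) ** k * p ** (n // prod(subset))
    return total // n

for n in range(1, 7):
    print(n, count_irreducible(n, 2), count_irreducible(n, 3))
```

For $p = 3$, $n = 2$ the formula gives $(9 - 3)/2 = 3$, in agreement with the three irreducible quadratics found in the example of §5.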
4.10 Primitive Roots

The content of this section is very similar to §3.8, and we shall therefore omit the details. Let $(f(x), \varphi(x)) = 1$. Suppose that there exists a polynomial $g(x)$ such that $(g(x))^m \equiv\equiv f(x) \ (\operatorname{modd}\ p, \varphi(x))$. Then we call $f(x)$ an $m$th residue $\operatorname{modd}\ p, \varphi(x)$. A polynomial $f(x)$ is, or is not, a quadratic residue according to whether

$$(f(x))^{\frac12(p^n-1)} \equiv\equiv 1 \quad (\operatorname{modd}\ p, \varphi(x)),$$

or

$$(f(x))^{\frac12(p^n-1)} \equiv\equiv -1 \quad (\operatorname{modd}\ p, \varphi(x)).$$
Definition. The least positive integer $l$ satisfying $(f(x))^l \equiv\equiv 1 \ (\operatorname{modd}\ p, \varphi(x))$ is called the order of $f(x)$.

As before, it can be proved that $l$ divides $p^n - 1$, and that there are precisely $\varphi(l)$ polynomials having order $l$ (here $\varphi(l)$ denotes Euler's function). There are therefore $\varphi(p^n - 1)$ polynomials with order $p^n - 1$, and these polynomials are called the primitive roots $(\operatorname{modd}\ p, \varphi(x))$. If $f(x)$ is a primitive root, then $(f(x))^v$, $v = 1, 2, \ldots, p^n - 1$, represent all the nonzero incongruent polynomials, $\operatorname{modd}\ p, \varphi(x)$.

It is not difficult to prove that the product $\prod_v (X - f_v(x))$, where $f_v$ runs over all the primitive roots, is equal to

$$\frac{(X^{p^n-1} - 1)\prod_{q_1,q_2}\bigl(X^{(p^n-1)/q_1q_2} - 1\bigr)\cdots}{\prod_{q_1}\bigl(X^{(p^n-1)/q_1} - 1\bigr)\cdots}, \tag{1}$$

where $q_i$ runs over all the distinct prime divisors of $p^n - 1$.

Exercise. Prove that the product of all the nonzero incongruent polynomials is congruent to $-1$ $(\operatorname{modd}\ p, \varphi(x))$.
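The statements of §8 and of this section can be checked on a small case. The sketch below is an illustration only, with choices that are not in the text: it takes $p = 3$ and $\varphi(x) = x^2 + 1$ (irreducible mod 3, see §5), and stores the residue $a + bx$ as the pair `(a, b)`. The helper names are hypothetical.

```python
from math import gcd

P = 3  # prime modulus; phi(x) = x^2 + 1 is irreducible mod 3

def mul(u, v):
    """(a + b x)(c + d x), reduced using x^2 = -1, coefficients mod P."""
    a, b = u
    c, d = v
    return ((a * c - b * d) % P, (a * d + b * c) % P)

def power(u, e):
    """u raised to the e-th power by repeated multiplication."""
    result = (1, 0)
    for _ in range(e):
        result = mul(result, u)
    return result

def order(u):
    """Least positive l with u^l = 1 (u must be nonzero)."""
    k, t = 1, u
    while t != (1, 0):
        t = mul(t, u)
        k += 1
    return k

elements = [(a, b) for a in range(P) for b in range(P)]
nonzero = [u for u in elements if u != (0, 0)]
q = P ** 2  # p^n with n = 2

# Theorem 8.1: f^(p^n) is congruent to f for every residue f.
fermat_ok = all(power(u, q) == u for u in elements)

# Primitive roots: residues of order p^n - 1; their number is Euler's phi(p^n - 1).
primitive = [u for u in nonzero if order(u) == q - 1]
euler_phi = sum(1 for k in range(1, q) if gcd(k, q - 1) == 1)

# Product of all nonzero residues (compare with the exercise above).
product_all = (1, 0)
for u in nonzero:
    product_all = mul(product_all, u)

print(fermat_ok, len(primitive), euler_phi, product_all)
```

Here `(2, 0)` represents the constant $2 \equiv -1 \pmod 3$. A numerical check of this kind is of course not a proof of the exercise.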
4.11 Summary

We may summarize the discussions of this chapter in the language of modern algebra or abstract algebra. We have a set of objects which we denote by $R$. The number of objects in $R$ may be finite or infinite.

1. If we can define the operations of addition and subtraction in $R$, and these operations are closed in $R$, then we call $R$ an integral modulus. For example: the set of even integers forms an integral modulus; the set of polynomials with even integer coefficients forms an integral modulus. An integral modulus is also known as an Abelian group.

2. If we can define the operations of addition, subtraction and multiplication which are closed in $R$, then we call $R$ a ring. For example: the set of integers forms a ring; the set of polynomials with integer coefficients forms a ring.

3. By an ideal $E$, we mean a subset of a ring $R$ which satisfies the following conditions: i) if $a, b \in E$, then $a - b \in E$; ii) if $a \in E$ and $r \in R$, then $ar \in E$. For example: the subset of even integers forms an ideal in the ring of integers. In the ring of polynomials with integer coefficients, we may form the ideal of polynomials having the form $f(x)(x^2 + 1) + 2g(x)x$, where $f$ and $g$ run over all polynomials with integer coefficients.

4. If in $R$ we can define the operations of addition, subtraction, multiplication and division (except by 0), and these operations are closed in $R$, then we call $R$ a field.
For example: the set of rational numbers forms a field. The residue classes modulo a fixed irreducible polynomial form a field, which is known as an algebraic extension field in modern algebra. Next, take a prime number $p$ and an irreducible polynomial $\varphi(x)$ of degree $n$. The residue classes with respect to the double modulus $p$ and $\varphi(x)$ form a field with $p^n$ elements.

Students who master the various concrete examples discussed in this chapter will find it easier to learn the abstract concepts of modern algebra.
Chapter 5. The Distribution of Prime Numbers
In this chapter we give some basic results concerning the distribution of prime numbers. The reader will only require some knowledge of the calculus: this chapter is a first introduction to analytic number theory and we shall omit all the deeper investigations.
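5.1 Order of Infinity

In the discussion of the distribution of prime numbers we must understand the notion of the comparison of the order of growth between two functions. We often use the symbols

$$\ll, \qquad O, \qquad o, \qquad \sim,$$

the meanings of which we shall now give. Let $n$ be a positive integer which tends to infinity (or $x$ a continuous variable which tends to infinity). Let $\varphi$ be a positive function of $n$ (or $x$) and let $f$ be any function. If there exists a number $A$, independent of the variable, such that

$$|f| \le A\varphi,$$

then we write $f \ll \varphi$ or $f = O(\varphi)$. Also $f = o(\varphi)$ and $f \sim \varphi$ mean that

$$\lim_{n \to \infty} \frac{f(n)}{\varphi(n)} = 0 \text{ and } 1 \text{ respectively} \qquad \Bigl(\text{or } \lim_{x \to \infty} \frac{f(x)}{\varphi(x)} = 0 \text{ and } 1 \text{ respectively}\Bigr).$$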
We have the following examples:

$$\sin x \ll 1, \qquad x + \frac1x \ll x \ll x + \frac1x, \qquad x + \frac1x = O(x^2),$$

$$x + \sin x \sim x, \qquad x + \sin x = x + O(1).$$

Naturally "$x$ tending to infinity" may be replaced by "$x$ tending to $l$" where $l$ is a finite number. For example, as $x \to 0$, we have

$$x^2 = O(x), \qquad \sin x \sim x, \qquad 1 + x \sim 1.$$

However, unless otherwise stated, we shall assume that the variable is tending to infinity. It is easy to verify the following properties:
(i) $\varphi \ll \varphi$;
(ii) if $f \ll \varphi$ and $\varphi \ll \psi$, then $f \ll \psi$;
(iii) if $f \ll \varphi$ and $g \ll \psi$, then $f + g \ll \varphi + \psi$ and $fg \ll \varphi\psi$.
The properties (ii) and (iii) still hold if we replace $\ll$ by $o(\ )$. We also have
(iv) $\varphi \sim \varphi$;
(v) if $\psi \sim \varphi$, then $\varphi \sim \psi$;
(vi) if $\varphi \sim \psi$ and $\psi \sim \chi$, then $\varphi \sim \chi$;
(vii) if $\psi \sim \varphi$ and $\psi_1 \sim \varphi_1$, then $\psi\psi_1 \sim \varphi\varphi_1$.
5.2 The Logarithm Function

The logarithm function $\log x$ frequently enters in the discussion of the distribution of prime numbers. We assume the reader already knows the definition of $\log x$ and we shall recall the following simple properties. Since

$$e^x = 1 + x + \cdots + \frac{x^n}{n!} + \frac{x^{n+1}}{(n+1)!} + \cdots,$$

it follows that for positive $x$ and for all $n$

$$\frac{e^x}{x^n} > \frac{x}{(n+1)!}.$$

Since, for any fixed $n$, the right-hand side tends to infinity as $x \to \infty$, it follows that $e^x$ grows faster than any fixed power of $x$. We can therefore write $x^n = o(e^x)$. If $\alpha$ is positive, then $x^\alpha = O(x^{[\alpha]+1}) = o(e^x)$. Since $\log x$ is the inverse function of $e^x$, on substituting $\log y$ for $x$ in the above, we see that $(\log y)^\alpha = o(y)$, or

$$\log y = o(y^{1/\alpha}).$$

In other words $\log x$ grows slower than any fixed positive power of $x$. It is easy to see that $\log\log x$ is even smaller than $\log x$.
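Theorem 2.1.

$$\sum_{1 \le n \le x} \frac1n \sim \log x.$$

Proof. The result follows at once from

$$\log x = \int_1^x \frac{dt}{t} \le \sum_{1 \le n \le x} \frac1n \le 1 + \int_1^x \frac{dt}{t} = 1 + \log x. \qquad \square$$

Theorem 2.2. Let

$$\operatorname{li} x = \int_2^x \frac{dt}{\log t}.$$

Then

$$\operatorname{li} x \sim \frac{x}{\log x}.$$

Proof. We have

$$\lim_{x \to \infty} \frac{\operatorname{li} x}{x/\log x} = \lim_{x \to \infty} \frac{(\operatorname{li} x)'}{(x/\log x)'} = \lim_{x \to \infty} \frac{\dfrac{1}{\log x}}{\dfrac{1}{\log x} - \dfrac{1}{\log^2 x}} = 1. \qquad \square$$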
5.3 Introduction

The distribution of prime numbers is the most interesting branch of number theory. The various conjectures and theorems are mostly the result of empirical observations. We now consider several problems and the ancient conjectures associated with them.

(i) Let $\pi(x)$ denote the number of primes not exceeding $x$. Then we have the following table:

x             π(x)        x/log x     li x        π(x)/li x    π(x)/x
1000          168         145         178         0.94...      0.1680
10000         1229        1086        1246        0.98...      0.1229
50000         5133        4621        5167        0.993...     0.1026
100000        9592        8686        9630        0.996...     0.0959
500000        41538       38103       41606       0.9983...    0.0830
1000000       78498       72382       78628       0.9983...    0.0785
2000000       148933      137848      149055      0.9991...    0.0745
5000000       348513      324149      348638      0.9996...    0.0697
10000000      664579      620417      664918      0.9994...    0.0665
20000000      1270607     1189676     1270905     0.9997...    0.0635
90000000      5216954     4913897     5217810     0.99983...   0.0580
100000000     5761455     5428613     5762209     0.99986...   0.0576
1000000000    50847478    48254630    50849235    0.99996...   0.0508

The table suggests:

1) There are infinitely many primes; that is, $\pi(x) \to \infty$.

2) However, there are relatively few primes compared with all the integers. That is, almost all numbers are not primes in the sense that

$$\frac{\pi(x)}{x} \to 0.$$

3) The number of primes not exceeding $x$ is asymptotically $\operatorname{li} x$; that is,

$$\pi(x) \sim \operatorname{li} x \sim \frac{x}{\log x}.$$

We note that 3) implies 1) and 2).

4) The best approximation to $\pi(x)$ is $\operatorname{li} x$.

5) $\pi(x) < \operatorname{li} x$.

In this chapter our deepest result is Chebyshev's theorem which states that

$$\frac{x}{\log x} \ll \pi(x) \ll \frac{x}{\log x}.$$

This result implies 1) and 2). The statement 3) is the famous prime number theorem which we shall prove in Chapter 9. The problem raised in 4) belongs to a difficult branch of analytic number theory and its discussion is outside the scope of this book. Finally, despite the convincing evidence from the table, 5) is actually false; this was proved by Littlewood.

(ii) We know that
$$5, 13, 17, 29, \ldots, 10006721$$

are all primes congruent to 1 mod 4. A natural question is whether there are infinitely many such primes. Associated with this problem Dirichlet's theorem gives the following general answer: Let $a$, $b$ be coprime integers. Then there are infinitely many primes of the form $an + b$. In this chapter we shall only discuss particular examples of this theorem, the proof of which is given in Chapter 9.

(iii) We have

$$6 = 3 + 3, \quad 8 = 3 + 5, \quad 10 = 5 + 5, \quad 12 = 5 + 7, \quad 14 = 7 + 7,$$

$$16 = 3 + 13, \quad 18 = 5 + 13, \quad 20 = 7 + 13, \quad 22 = 3 + 19, \quad 24 = 5 + 19.$$
This suggests the following: Every even integer greater than 4 must be the sum of two odd prime numbers. This is the famous Goldbach's problem. If this problem is settled to be true, then we can deduce that every odd integer greater than 7 must be the sum of three odd primes. This is because if $n$ is an odd integer greater than 7, then $n - 3$ is an even integer greater than 4, so that $n - 3 = p_1 + p_2$ or $n = 3 + p_1 + p_2$. The unsolved Goldbach's problem is extremely difficult. I. M. Vinogradov proved that every sufficiently large odd integer is the sum of three primes. The author proved that "almost all" even numbers are the sum of two primes. V. Brun proved that every sufficiently large even integer is the sum of two numbers each having at most 9 prime factors. (See Notes.)

(iv) We also note that

$$3, 5;\ 5, 7;\ 11, 13;\ 17, 19;\ 29, 31;\ \ldots;\ 10016957, 10016959;\ \ldots;\ 10^9 + 7, 10^9 + 9;\ \ldots$$

are all pairs of primes having difference 2; we call such pairs prime twins. More specifically we know that there are 1224 pairs less than 100,000 and 8164 pairs less than 1,000,000. At present (1957) the largest pair known to us is 1000000009649, 1000000009651. From the evidence here it is natural to conjecture that there are infinitely many pairs of prime twins. This too is a famous unsolved problem. (See Notes.)

In the theory of numbers we shall always have more unsolved problems than solved ones. For example we also have

$$5, 7, 11;\ 11, 13, 17;\ 17, 19, 23;\ \ldots;\ 101, 103, 107;\ \ldots;\ 10014491, 10014493, 10014497;\ \ldots$$

all primes. It is conjectured that there are infinitely many primes $p$ such that $p + 2$, $p + 6$ are also primes. Advancing even further:

(v) We can verify that $n^2 - n + 17$ is always prime when $0 \le n \le 16$, and that $n^2 - n + 41$ is always prime when $0 \le n \le 40$. We now suggest the following interesting problem: Let $N$ be any given number. Can we always find a prime $p$ such that

$$n^2 - n + p$$

is always prime when $0 \le n \le N$? This too is an unsolved problem and, in the author's view, it is even more difficult than (iii) and (iv). If this problem is solved affirmatively, then (iv) can also be settled. Let us see why. In order for the polynomial $n^2 - n + p$ to (successively) take prime values, $n$ must be restricted to the integers from 0 to $p - 1$. We now construct a sequence of polynomials $n^2 - n + p_i$ with the following property: when $0 \le n \le p_{i-1}$, the number $n^2 - n + p_i$ is always prime. We note that if (v) is solved, then this construction is certainly possible. Now taking $n = 1$ and 2 will give $p_i$, $p_i + 2$ both primes, and taking $n = 1, 2, 3$ will give $p_i$, $p_i + 2$, $p_i + 6$ all primes. This shows that (iv) follows as a consequence of (v).

(vi) Another difficult unsolved problem is whether there are infinitely many primes of the form $n^2 + 1$. We know that

$$2, 5, 17, 37, \ldots, 65537, \ldots$$

are all primes of this form and it is conjectured that there are infinitely many such primes. (See Notes.)

(vii) Let $p_n$ denote the $n$th prime. We may ask about the distribution of the values $p_n - p_{n-1}$. From (iv) we see that $p_n - p_{n-1}$ may be as small as 2, but what about its maximum value, that is, an order estimate for $p_n - p_{n-1}$ as $n \to \infty$?

(viii) The so-called Bertrand's postulate states that there always exists a prime in any interval from $n$ to $2n$. This is comparatively easy and we shall prove it in §7. A more delicate conjecture is that "there always exists a prime in any interval from $n^2$ to $(n + 1)^2$." This is a difficult unsolved problem.
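Several of the empirical statements in (iii) and (v) are easy to test by machine for small values. The following is a minimal sketch using trial division; the function names `is_prime`, `prime_run` and `goldbach_pair` are introduced here for illustration and are not from the text. A finite check of this kind says nothing about the unsolved problems themselves.

```python
def is_prime(m):
    """Trial-division primality test, adequate for small m."""
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True

def prime_run(p):
    """Largest N such that n*n - n + p is prime for every 0 <= n <= N
    (returns -1 if the value at n = 0, namely p itself, is not prime)."""
    n = 0
    while is_prime(n * n - n + p):
        n += 1
    return n - 1

def goldbach_pair(m):
    """Return odd primes (p, q) with p <= q and p + q = m, or None."""
    for p in range(3, m // 2 + 1, 2):
        if is_prime(p) and is_prime(m - p):
            return (p, m - p)
    return None

print(prime_run(17), prime_run(41))
print([goldbach_pair(m) for m in range(6, 26, 2)])
```

The loop in `prime_run` always stops, since for $n = p$ the value $n^2 - n + p = p^2$ is composite.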
5.4 The Number of Primes is Infinite

Theorem 4.1. The number of primes is infinite; that is, $\pi(x) \to \infty$ as $x \to \infty$.

Proof. Let $2, 3, \ldots, p$ be all the primes not exceeding $p$ and let

$$q = 2 \cdot 3 \cdots p + 1.$$

Then $q$ is not a multiple of $2, 3, \ldots, p$ and hence either $q$ is prime or $q$ is divisible by a prime between $p$ and $q$. Therefore there always exists a prime greater than $p$, and so it follows that the number of primes is not finite. □

This method can be generalized to give the following:

Theorem 4.2. Let $f(x)$ be any polynomial with integer coefficients. Then the numbers

$$f(1),\ f(2),\ f(3),\ \ldots$$

contain infinitely many distinct prime divisors.

Proof. Let

$$f(x) = a_0x^n + a_1x^{n-1} + \cdots + a_n, \qquad n \ge 1.$$

If $a_n = 0$, then our sequence of numbers contains all the primes as divisors. We assume therefore that $a_n \ne 0$.

Suppose that the sequence of numbers has only finitely many prime divisors $p_1, p_2, \ldots, p_v$. We consider $f(p_1 \cdots p_v a_n y)$, a polynomial in $y$ with all the coefficients a multiple of $a_n$. Let

$$f(p_1 \cdots p_v a_n y) = a_n g(y),$$

where

$$g(y) = 1 + A_1y + A_2y^2 + \cdots + A_ny^n$$

is a polynomial with integer coefficients such that $p_1, \ldots, p_v$ divide $A_1, A_2, \ldots, A_n$. If there exists an integer $y_0$ such that $g(y_0) \ne \pm 1$, then $g(y_0)$ must contain a prime divisor distinct from $p_1, \ldots, p_v$, and so the theorem follows at once. But $g(y) = \pm 1$ has at most $2n$ solutions, so that the theorem is proved. □

A different method of proof of Theorem 4.1 was given by Euler. This method, which we give below, opens the door for analytic number theory.

Theorem 4.3. The series $\sum_p 1/p$ is divergent; here the summation is over all primes $p$. Therefore the number of primes is infinite.
We first prove:

Theorem 4.4 (Euler's identity). Let $f(n)$ be defined for all positive integers $n$, and $f(n)$ not identically zero. Suppose that

$$f(nn') = f(n)f(n') \quad \text{whenever} \quad (n, n') = 1.$$

Then we have the following identity

$$\sum_{n=1}^{\infty} f(n) = \prod_p \bigl(1 + f(p) + f(p^2) + \cdots\bigr);$$

the condition for the validity of this identity is either

(i) $\sum_{n=1}^{\infty} |f(n)|$ converges,

or

(ii) $\prod_p \bigl(1 + |f(p)| + |f(p^2)| + \cdots\bigr)$ converges.

Moreover, if $f(nn') = f(n)f(n')$ for all $n$, $n'$, then, subject to the same convergence conditions, we have that

$$\sum_{n=1}^{\infty} f(n) = \prod_p \frac{1}{1 - f(p)}.$$

Proof. We have, for all $n$, $f(1)f(n) = f(n)$, and there exists $n$ such that $f(n) \ne 0$. Therefore $f(1) = 1$.
1) Suppose that the series

$$\sum_{n=1}^{\infty} |f(n)| \tag{1}$$

converges. Now consider

$$P(x) = \prod_{p \le x} \bigl(1 + f(p) + f(p^2) + \cdots\bigr).$$

For any $p$, the series $\sum_{n=1}^{\infty} |f(p^n)|$ is part of the series (1), so that it must also converge. This means that $P(x)$ is a finite product of absolutely convergent series. Therefore

$$P(x) = {\sum}' f(n),$$

where the summation is over all integers $n$ having only prime factors $\le x$. Let

$$S = \sum_{n=1}^{\infty} f(n),$$

so that

$$|S - P(x)| \le \sum_{n > x} |f(n)|.$$

When $x \to \infty$, $|S - P(x)| \to 0$, so that $P(x) \to S$. Using this result on $|f(n)|$ we see that

$$\prod_p \bigl(1 + |f(p)| + |f(p^2)| + \cdots\bigr)$$

converges (to the sum of the series (1)).

2) Suppose that

$$\prod_p \bigl(1 + |f(p)| + |f(p^2)| + \cdots\bigr)$$

converges to $P$. Then

$$P \ge \prod_{p \le x} \bigl(1 + |f(p)| + |f(p^2)| + \cdots\bigr) = {\sum}' |f(n)| \ge \sum_{n \le x} |f(n)|.$$

Therefore

$$\sum_{n=1}^{\infty} |f(n)|$$

converges. From our result in 1) we see that the first part of the theorem is proved. The last part follows from

$$1 + f(p) + f(p^2) + \cdots = 1 + f(p) + (f(p))^2 + \cdots = \frac{1}{1 - f(p)}. \qquad \square$$
Proof of Theorem 4.3. We put $f(n) = 1/n$ in the above theorem. If $\sum 1/p$ converges, then we deduce that

$$\prod_p \Bigl(1 - \frac1p\Bigr) \quad \text{and} \quad \prod_p \Bigl(1 - \frac1p\Bigr)^{-1}$$

converge, and we infer from the theorem that

$$\sum_{n=1}^{\infty} \frac1n$$

also converges, which is impossible. Therefore Theorem 4.3 is proved. □

From $0 < 1 - \frac1p < 1$ we deduce:

Theorem 4.5. $\prod_p \bigl(1 - \frac1p\bigr)$ diverges to zero. □
Exercise 1. Prove that there are infinitely many primes of the form 6n  1. Exercise 2. Prove that there are infinitely many primes of the form 4n  1. Exercise 3. Prove that
(Note that $\sum_{n=1}^{\infty} \dfrac{1}{n^2} = \dfrac{\pi^2}{6}$.)
5.5 Almost all Integers are Composite

Theorem 5.1.

$$\lim_{n \to \infty} \frac{\pi(n)}{n} = 0.$$

That is, the ratio of the number of primes in $1, 2, \ldots, n$ and $n$ tends to zero as $n$ tends to infinity, or almost all integers are composite.

Proof. We prove the slightly more general result

$$\lim_{x \to \infty} \frac{\pi(x)}{x} = 0,$$

where $x$ tends to infinity through all real numbers. We first observe the following useful and simple fact. The number of positive integers not exceeding $x$ that are divisible by $a$ is $[x/a]$. Here $[\xi]$ denotes the integer part of $\xi$.
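Denote by $w(x, r)$ the number of positive integers not exceeding $x$ and not divisible by the first $r$ primes $2, 3, 5, \ldots, p_r$. Then, from Theorem 1.7.1 we have

$$w(x, r) = [x] - \sum_{1 \le i \le r} \Bigl[\frac{x}{p_i}\Bigr] + \sum_{1 \le i < j \le r} \Bigl[\frac{x}{p_ip_j}\Bigr] - \cdots$$

(it is not difficult to give a direct proof of this). Clearly then

$$\pi(x) \le w(x, r) + r,$$

so that

$$\pi(x) < x - \sum \frac{x}{p_i} + \sum \frac{x}{p_ip_j} - \cdots + r + 2^r = x\prod_{i=1}^{r} \Bigl(1 - \frac{1}{p_i}\Bigr) + r + 2^r < x\prod_{i=1}^{r} \Bigl(1 - \frac{1}{p_i}\Bigr) + 2^{r+1}.$$

From Theorem 4.5 we know that, as $r \to \infty$,

$$\prod_{i=1}^{r} \Bigl(1 - \frac{1}{p_i}\Bigr) \to 0.$$

Let $\varepsilon > 0$. We can take $r = r(\varepsilon)$ so that

$$\prod_{i=1}^{r} \Bigl(1 - \frac{1}{p_i}\Bigr) < \frac{\varepsilon}{2},$$

and therefore, for sufficiently large $x$,

$$\pi(x) < \varepsilon x.$$

The theorem is proved. □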
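The inclusion-exclusion formula for $w(x, r)$ used in the proof above can be compared with a direct count. This is a small sketch for integer $x$ only; the names `w_formula` and `w_direct` are introduced here for illustration.

```python
from itertools import combinations
from math import prod

def w_formula(x, primes):
    """[x] - sum [x/p_i] + sum [x/(p_i p_j)] - ... over the given primes."""
    total = 0
    for k in range(len(primes) + 1):
        for subset in combinations(primes, k):
            # prod(()) == 1, so the empty subset contributes [x]
            total += (-1) ** k * (x // prod(subset))
    return total

def w_direct(x, primes):
    """Count 1 <= m <= x divisible by none of the given primes."""
    return sum(1 for m in range(1, x + 1) if all(m % p for p in primes))

first_three = [2, 3, 5]
print(w_formula(100, first_three), w_direct(100, first_three))
```

With $x = 100$ and $r = 3$ both counts agree, and the bound $\pi(x) \le w(x, r) + r$ can be read off: $\pi(100) = 25$ does not exceed the count plus 3.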
5.6 Chebyshev's Theorem

The theorem in this section is an important result in elementary number theory and one should try to give the most elementary argument to prove it.

Theorem 6.1. When $n \ge 2$ we have

$$\frac18 \le \frac{\pi(n)H(n)}{n} < 6,$$

where

$$H(n) = \sum_{v=2}^{n} \frac1v.$$

That is, $\pi(n)$ is about the same as the reciprocal of the average of $\bigl(\tfrac12, \tfrac13, \tfrac14, \ldots, \tfrac1n\bigr)$.
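We shall require the following two lemmas:

Lemma 1. When $k \ge 0$ we have

$$\pi(2^{k+1}) \le 2^k.$$

Proof. When $x > 9$ we have, by considering even and odd numbers, that $\pi(x) \le x/2$. Also $\pi(2) = 1 = 2^0$, $\pi(4) = 2 = 2^1$, $\pi(8) = 4 = 2^2$. □

Lemma 2. When $l > 0$ we have

$$\tfrac12 l \le H(2^l) \le l.$$

Proof. We have

$$H(2^l) = \Bigl(\frac12 + \frac13\Bigr) + \Bigl(\frac14 + \frac15 + \frac16 + \frac17\Bigr) + \cdots + \Bigl(\frac{1}{2^{l-1}} + \cdots + \frac{1}{2^l - 1}\Bigr) + \frac{1}{2^l}$$

$$\le \Bigl(\frac12 + \frac12\Bigr) + \Bigl(\frac14 + \frac14 + \frac14 + \frac14\Bigr) + \cdots + \Bigl(\frac{1}{2^{l-1}} + \cdots + \frac{1}{2^{l-1}}\Bigr) + \frac{1}{2^l} \le l,$$

and, grouping instead as $\frac12 + \bigl(\frac13 + \frac14\bigr) + \cdots + \bigl(\frac{1}{2^{l-1}+1} + \cdots + \frac{1}{2^l}\bigr)$, each of the $l$ groups is at least $\frac12$. □

Proof of Theorem 6.1. We first prove that

$$\prod_{n < p \le 2n} p \;\Bigm|\; \binom{2n}{n} \;\Bigm|\; \prod_{p^r \le 2n < p^{r+1}} p^r. \tag{1}$$

(i) Any prime in the interval from $n$ to $2n$ must divide $(2n)!$ but not $n!$, so that the left hand side of the formula holds.

(ii) The power of $p$ in $\binom{2n}{n}$ is

$$\sum_{m=1}^{r} \Bigl(\Bigl[\frac{2n}{p^m}\Bigr] - 2\Bigl[\frac{n}{p^m}\Bigr]\Bigr) \le r,$$

since each term in the sum is at most 1. This proves the right hand side of the formula (1). From (1) we now have

$$n^{\pi(2n) - \pi(n)} < \prod_{n < p \le 2n} p \le \binom{2n}{n} \le \prod_{p^r \le 2n < p^{r+1}} p^r \le (2n)^{\pi(2n)}, \qquad n \ge 1. \tag{2}$$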
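Since

$$\binom{2n}{n} = \frac{2n(2n-1)\cdots(n+1)}{n(n-1)\cdots 1} = 2\Bigl(2 + \frac{1}{n-1}\Bigr)\cdots\Bigl(2 + \frac{v}{n-v}\Bigr)\cdots\Bigl(2 + \frac{n-1}{1}\Bigr) \ge 2^n$$

and

$$\binom{2n}{n} \le (1 + 1)^{2n} = 2^{2n},$$

we deduce from (2) that

$$n^{\pi(2n) - \pi(n)} < 2^{2n}, \qquad 2^n \le (2n)^{\pi(2n)}, \qquad n \ge 1. \tag{3}$$

Let $n = 2^k$, $k = 0, 1, 2, \ldots$, so that we have

$$2^{k(\pi(2^{k+1}) - \pi(2^k))} < 2^{2^{k+1}}, \qquad 2^{2^k} \le 2^{(k+1)\pi(2^{k+1})}, \qquad k \ge 0,$$

or

$$k\bigl(\pi(2^{k+1}) - \pi(2^k)\bigr) < 2^{k+1}, \qquad 2^k \le (k+1)\pi(2^{k+1}). \tag{4}$$

From Lemma 1 we have

$$(k+1)\pi(2^{k+1}) - k\pi(2^k) < 2^{k+1} + \pi(2^{k+1}) \le 3 \cdot 2^k, \qquad k \ge 0.$$

Taking $k = 0, 1, \ldots, k$ and adding the corresponding results we have

$$(k+1)\pi(2^{k+1}) < 3(2^0 + 2^1 + \cdots + 2^k) < 3 \cdot 2^{k+1}, \qquad k \ge 0. \tag{5}$$

From (4) and (5) we have

$$\frac12 \cdot \frac{2^{k+1}}{k+1} \le \pi(2^{k+1}) < 3 \cdot \frac{2^{k+1}}{k+1}, \qquad k \ge 0. \tag{6}$$

Let $n$ be an integer greater than 1 and choose $k$ so that

$$2^{k+1} \le n < 2^{k+2}, \qquad k \ge 0.$$

From Lemma 2 we have

$$\pi(n) \le \pi(2^{k+2}) < 3\,\frac{2^{k+2}}{k+2} \le 6\,\frac{2^{k+1}}{H(2^{k+2})} \le 6\,\frac{n}{H(n)}, \tag{7}$$

and

$$\pi(n) \ge \pi(2^{k+1}) \ge \frac12 \cdot \frac{2^{k+1}}{k+1} = \frac18 \cdot \frac{2^{k+2}}{\frac12(k+1)} \ge \frac18 \cdot \frac{2^{k+2}}{H(2^{k+1})} \ge \frac18 \cdot \frac{n}{H(n)}. \tag{8}$$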
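This holds for all $n \ge 2$. Therefore

$$\frac18 \le \frac{\pi(n)H(n)}{n} < 6. \qquad \square$$

Theorem 6.2. When $n \ge 2$ we have

$$\frac18 \cdot \frac{n}{\log n} \le \pi(n) \le 12\,\frac{n}{\log n}.$$

Proof. When $n \ge 2$, we have

$$\log\frac{n}{2} = \int_2^n \frac{dt}{t} < \frac12 + \frac13 + \cdots + \frac1n < \int_1^n \frac{dt}{t} = \log n.$$

When $n \ge 4$, we have

$$\log\frac{n}{2} \ge \tfrac12 \log n.$$

Also

$$\tfrac12 \log 3 \le \tfrac12 + \tfrac13$$

(and trivially $\tfrac12 \log 2 \le \tfrac12$), so that $\tfrac12 \log n \le H(n) < \log n$ for all $n \ge 2$, and the required result follows from the previous theorem. □

We note, of course, that Theorem 4.1 and Theorem 5.1 are consequences of this theorem.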
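The bounds of Theorem 6.2 are far from sharp, as a numerical check shows. The sketch below computes $\pi(n)$ with a sieve of Eratosthenes and tests the two inequalities for $2 \le n \le 10^5$; the function name `prime_counts` and the choice of limit are assumptions made here for illustration.

```python
from math import log

def prime_counts(limit):
    """Return a list c with c[m] = number of primes <= m, for 0 <= m <= limit."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0] = sieve[1] = 0
    for i in range(2, int(limit ** 0.5) + 1):
        if sieve[i]:
            # strike out the multiples of i starting from i*i
            sieve[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    counts, running = [0] * (limit + 1), 0
    for m in range(limit + 1):
        running += sieve[m]
        counts[m] = running
    return counts

LIMIT = 100_000
pi = prime_counts(LIMIT)

# Check (1/8) n/log n <= pi(n) <= 12 n/log n on the whole range.
bounds_hold = all(
    n / (8 * log(n)) <= pi[n] <= 12 * n / log(n) for n in range(2, LIMIT + 1)
)
print(pi[1000], pi[10_000], pi[100_000], bounds_hold)
```

The first three printed values can be compared with the $\pi(x)$ column of the table in §3.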
5.7 Bertrand's Postulate

Bertrand's postulate was first proved by Chebyshev.

Theorem 7.1. Given any real $x \ge 1$, there exists a prime in the interval $x$ to $2x$.

Proof. 1) We begin by giving a good estimate for the binomial coefficient

$$\binom{2n}{n} = \frac{(2n)!}{n!\,n!},$$

namely, for $n \ge 5$, we have that

$$\frac{1}{2n}\,2^{2n} < \binom{2n}{n} < \frac14\,2^{2n}. \tag{1}$$

The left hand side inequality follows from

$$2n\binom{2n}{n} = \frac21 \cdot \frac31 \cdot \frac42 \cdot \frac52 \cdots \frac{2n-2}{n-1} \cdot \frac{2n-1}{n-1} \cdot \frac{2n}{n} \cdot \frac{2n}{n} > 2^{2n},$$

and we shall use induction for the right hand side inequality in (1). When $n = 5$, we have

$$\binom{2n}{n} = 252 < 256 = \frac14\,2^{10}.$$

Since

$$\binom{2(n+1)}{n+1} = \frac{(2n)!\,(2n+1)(2n+2)}{(n!)^2(n+1)(n+1)} < 4\binom{2n}{n},$$
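the inductive argument is complete.

2) Let $b \ge 10$. We denote by $\{\xi\}$ the least integer $\ge \xi$, and we set

$$a_1 = \Bigl\{\frac{b}{2}\Bigr\}, \quad a_2 = \Bigl\{\frac{b}{2^2}\Bigr\}, \quad \ldots, \quad a_k = \Bigl\{\frac{b}{2^k}\Bigr\}, \quad \ldots.$$

We then have

$$a_1 \ge a_2 \ge \cdots \ge a_k \ge \cdots,$$

and

$$a_k < \frac{b}{2^k} + 1 = 2\,\frac{b}{2^{k+1}} + 1 \le 2a_{k+1} + 1.$$

Since both the outsides are integers we have

$$a_k \le 2a_{k+1}. \tag{2}$$

Let $m$ be the greatest integer such that $a_m \ge 5$, so that $a_{m+1} < 5$, and hence, by (2), $a_m < 10$. Since $2a_1 \ge b$, the $m$ intervals

$$a_m < \eta \le 2a_m, \quad a_{m-1} < \eta \le 2a_{m-1}, \quad \ldots, \quad a_1 < \eta \le 2a_1$$

cover the whole interval $10 < \eta \le b$. Therefore

$$\prod_{10 < p \le b} p \le \prod_{a_1 < p \le 2a_1} p \cdot \prod_{a_2 < p \le 2a_2} p \cdots \prod_{a_m < p \le 2a_m} p.$$

From

$$\prod_{n < p \le 2n} p \le \binom{2n}{n} < 2^{2(n-1)}, \qquad n \ge 5,$$

we have

$$\prod_{10 < p \le b} p \le 2^{2(a_1 - 1 + a_2 - 1 + \cdots + a_m - 1)}. \tag{3}$$

3) We already proved earlier that the power of $p$ in $\binom{2n}{n}$ does not exceed $r$, where $r$ is the greatest integer satisfying $p^r \le 2n$. It follows that if $p > \sqrt{2n}$, then $p^2$ does not divide $\binom{2n}{n}$.

We further observe that, when $n \ge 3$, the primes $p$ satisfying $\tfrac23 n < p \le n$ cannot divide $\binom{2n}{n}$. This is because $3p > 2n$, so that only $p$ and $2p$, and not other multiples of $p$, may occur among the divisors of $(2n)!$, whereas $p^2$ clearly is a divisor of $(n!)^2$. Therefore such a prime $p$ cannot divide $\binom{2n}{n}$. (This is the most important point in this proof.) Collecting our results we have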
(~n)::;:; p!3~pr ~~~~fnP n
n
(2n)
n
p
fo ~ 10),
From (1) and (3) we see that, for n ~ 50 (so that
n 121" n
22n < (2n)~~+1
< (2n)~~ +
n
p
fo < p :s::; in
p.
p
n < p ~ 2n
p.
(4)
n
If there is no prime number between nand 2n, then
or (5)
But this is clearly impossible ifn is sufficiently large. We now determine an explicit bound for the validity of this inequality. We use n ::;:; 2"1 (this can be proved by induction) to give (6)
From (5) we have (using n ~ 50) that
that is $(2n)^{1/3} < 20$, or $n < \tfrac12 \cdot 20^3 = 4000$. Thus (5) can hold only if $n < 4000$ and we have therefore proved that, if $n \ge 4000$, there is always a prime $p$ satisfying $n < p \le 2n$.

$$2, 3, 5, 7, 13, 23, 43, 83, 163, 317, 631, 1259, 2503, 4001 \tag{7}$$

is a chain of prime numbers, each one being smaller than twice its predecessor. Now, given any $n$ ($1 \le n < 4000$) we can select the smallest prime $p$ in (7) which exceeds $n$, and we denote by $p'$ its predecessor. Then we have

$$p' \le n < p < 2p' \le 2n.$$

The proof of the theorem is complete. □
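The final step of the proof, which handles $1 \le n < 4000$ with the chain (7), can be verified directly. This is a minimal sketch; the names `CHAIN`, `is_prime` and `bertrand_prime` are introduced here for illustration.

```python
CHAIN = [2, 3, 5, 7, 13, 23, 43, 83, 163, 317, 631, 1259, 2503, 4001]

def is_prime(m):
    """Trial-division primality test."""
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True

def bertrand_prime(n):
    """Smallest member of the chain (7) exceeding n, for 1 <= n < 4000."""
    return min(q for q in CHAIN if q > n)

# Every member is prime and smaller than twice its predecessor.
chain_ok = all(is_prime(q) for q in CHAIN) and all(
    b < 2 * a for a, b in zip(CHAIN, CHAIN[1:])
)

# For each n below 4000 the selected prime lies in the interval (n, 2n].
interval_ok = all(n < bertrand_prime(n) <= 2 * n for n in range(1, 4000))

print(chain_ok, interval_ok)
```

For $n = 1$ the selected prime is 2, which has no predecessor in the chain; the check $n < p \le 2n$ still holds there because $p = 2 = 2n$.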
Theorem 7.2. There exist two positive constants $\alpha$ and $\beta$ such that

$$\alpha\,\frac{n}{\log n} < \pi(2n) - \pi(n) < \beta\,\frac{n}{\log n}, \qquad n \ge 2.$$

Proof. The right hand side inequality in the theorem follows at once from Theorem 6.2. We now prove the remaining inequality. The theorem is trivial if $n < 4000$. Suppose then that $n \ge 4000$. From (4) and (6) we have that

$$\prod_{n < p \le 2n} p > 2^{\frac13(2n - 19(2n)^{2/3})} \ge 2^{\frac23 n(1 - 19/20)} = 2^{\frac{1}{30}n}.$$

From

$$\prod_{n < p \le 2n} p < (2n)^{\pi(2n) - \pi(n)},$$

we have

$$\pi(2n) - \pi(n) > \frac{\log 2}{30} \cdot \frac{n}{\log 2n},$$

and the theorem is proved. □
Note: Although Theorem 7.1 settles Bertrand's postulate, it is not a very sharp result. Deep analytic methods can be used to give much better results concerning the gaps between successive primes, but these are beyond the scope of this book. Exercise. Use differential calculus to determine the bound for the validity of (5).
5.8 Estimation of a Sum by an Integral

Theorem 8.1. Let $f(x)$ be increasing and nonnegative for $x \ge a$. Then, for $\xi \ge a$, we have

$$\Bigl|\sum_{a \le n \le \xi} f(n) - \int_a^{\xi} f(x)\,dx\Bigr| \le f(\xi).$$
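Proof. We set $b = [\xi]$. Then

$$\int_a^b f(x)\,dx = \sum_{i=a}^{b-1} \int_i^{i+1} f(x)\,dx \quad \begin{cases} \ge \sum_{i=a}^{b-1} f(i), \\[2pt] \le \sum_{i=a}^{b-1} f(i+1), \end{cases}$$

or

$$f(a) + \cdots + f(b-1) \le \int_a^b f(x)\,dx \le f(a+1) + \cdots + f(b);$$

also

$$0 \le \int_b^{\xi} f(x)\,dx \le f(\xi),$$

and so the theorem follows. □

Example 1. Let $\lambda \ge 0$, $f(x) = x^\lambda$. Then

$$\Bigl|\sum_{a \le n \le \xi} n^\lambda - \frac{\xi^{\lambda+1} - a^{\lambda+1}}{\lambda+1}\Bigr| \le \xi^\lambda.$$

From Example 1, we have, for $\lambda \ge 0$,

$$\sum_{1 \le n \le \xi} n^\lambda = \frac{\xi^{\lambda+1}}{\lambda+1} + O(\xi^\lambda). \tag{1}$$

This implies that

$$\sum_{1 \le n \le \xi} n^\lambda \sim \frac{\xi^{\lambda+1}}{\lambda+1}.$$

Example 2. Let $f(x) = \log x$, $\xi \ge 1$ and $T(\xi) = \sum_{n \le \xi} \log n$. Then we have

$$\Bigl|T(\xi) - \int_1^{\xi} \log x\,dx\Bigr| \le \log\xi,$$

or

$$|T(\xi) - \xi\log\xi + \xi - 1| \le \log\xi. \tag{2}$$

In particular, if $\xi$ is an integer $n$, then

$$n\log n - n + 1 - \log n \le \log n! \le n\log n - n + 1 + \log n,$$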
or (3) Exercise 1. Let ~ be an integer. Determine one further dominating term in (1); that is, find c so that the following holds for 2 ~ 1 : ~H1
I
nA =  1";n";~ 2+1
+ ce + O(e 1 ).
Exercise 2. Use Theorem 8.1 to study the sum
I
log10gn.
Concerning decreasing functions we have:

Theorem 8.2. Let $f(x)$ be decreasing and nonnegative for $x \ge a$. Then the limit
$$\lim_{N\to\infty}\left(\sum_{n=a}^{N} f(n) - \int_a^{N} f(x)\,dx\right) = \alpha \tag{4}$$
exists, and $0 \le \alpha \le f(a)$. Moreover, if $f(x) \to 0$ as $x \to \infty$, then for $\xi \ge a + 1$, we have
$$\left|\sum_{a\le n\le\xi} f(n) - \int_a^{\xi} f(v)\,dv - \alpha\right| \le f(\xi - 1).$$
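Proof. Let
$$g(\xi) = \sum_{a\le n\le\xi} f(n) - \int_a^{\xi} f(x)\,dx.$$
Then
$$g(n) - g(n+1) = -f(n+1) + \int_n^{n+1} f(x)\,dx \ge -f(n+1) + f(n+1) = 0.$$
Also
$$g(N) = \sum_{n=a}^{N-1}\left(f(n) - \int_n^{n+1} f(x)\,dx\right) + f(N) \ge \sum_{n=a}^{N-1}\bigl(f(n) - f(n)\bigr) + f(N) = f(N) \ge 0, \tag{5}$$
so that $g(n)$ is a decreasing function, and
$$0 \le g(n) \le g(a) = f(a).$$
Therefore $g(n)$ has a limit which we denote by $\alpha$, so that $0 \le \alpha \le f(a)$.

Suppose now that $f(x) \to 0$ as $x \to \infty$. Then
$$\begin{aligned}
g(\xi) - \alpha &= \sum_{a\le n\le\xi} f(n) - \int_a^{\xi} f(x)\,dx - \lim_{N\to\infty}\left(\sum_{n=a}^{N} f(n) - \int_a^{N} f(x)\,dx\right)\\
&= \sum_{n=a}^{[\xi]} f(n) - \int_a^{[\xi]} f(x)\,dx - \int_{[\xi]}^{\xi} f(x)\,dx - \lim_{N\to\infty}\left(\sum_{n=a}^{N} f(n) - \int_a^{N} f(x)\,dx\right)\\
&= -\int_{[\xi]}^{\xi} f(x)\,dx - \lim_{N\to\infty}\left(\sum_{n=[\xi]+1}^{N} f(n) - \int_{[\xi]}^{N} f(x)\,dx\right)\\
&= -\int_{[\xi]}^{\xi} f(x)\,dx + \lim_{N\to\infty}\sum_{n=[\xi]+1}^{N}\int_{n-1}^{n}\bigl(f(x) - f(n)\bigr)\,dx\\
&\quad\begin{cases}\ \le \displaystyle\lim_{N\to\infty}\sum_{n=[\xi]+1}^{N}\int_{n-1}^{n}\bigl(f(n-1) - f(n)\bigr)\,dx = f([\xi]) \le f(\xi - 1),\\[6pt] \ \ge \displaystyle-\int_{[\xi]}^{\xi} f(x)\,dx \ge -(\xi - [\xi])\,f([\xi]) \ge -f(\xi - 1),\end{cases}
\end{aligned}$$
and so the theorem is proved. □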
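Example 3. We take $a = 1$, $f(x) = 1/x$. Then the number $\alpha$ is known as Euler's constant, and is usually denoted by $\gamma$. Therefore $0 \le \gamma \le 1$, and
$$\sum_{1\le n\le\xi}\frac{1}{n} = \log\xi + \gamma + O\!\left(\frac{1}{\xi}\right). \tag{6}$$

Example 4. Let $0 < \sigma \ne 1$, $f(x) = x^{-\sigma}$. Then there is a constant $\alpha = \alpha(a, \sigma)$ which depends on $a$ and $\sigma$, such that when $a \ge 1$ we have
$$\left|\sum_{a\le n\le\xi}\frac{1}{n^{\sigma}} - \frac{\xi^{1-\sigma} - a^{1-\sigma}}{1-\sigma} - \alpha\right| \le \frac{1}{(\xi-1)^{\sigma}}. \tag{7}$$
From this we deduce the following: If $\sigma > 1$, then the series
$$\sum_{n=1}^{\infty}\frac{1}{n^{\sigma}}$$
converges, and when $\xi \ge 1$ we have
$$\sum_{n>\xi}\frac{1}{n^{\sigma}} = \frac{1}{(\sigma-1)\,\xi^{\sigma-1}} + O\!\left(\frac{1}{\xi^{\sigma}}\right). \tag{8}$$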
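The two estimates (6) and (8) can be checked numerically. The sketch below is an illustration only: it assumes the known numerical value of Euler's constant and the known value $\pi^2/6$ of $\sum 1/n^2$, and the function names are hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649015329  # known numerical value of Euler's constant

def harmonic_minus_log(N):
    """Return sum_{n<=N} 1/n - log N; by (6) this equals gamma + O(1/N)."""
    return math.fsum(1.0 / n for n in range(1, N + 1)) - math.log(N)

def tail_of_inverse_squares(xi):
    """Return sum_{n>xi} 1/n^2, using the value pi^2/6 of the full series."""
    head = math.fsum(1.0 / (n * n) for n in range(1, int(xi) + 1))
    return math.pi ** 2 / 6 - head

for N in (10, 1000, 100000):
    # first column shrinks like 1/N, second like 1/N^2 (case sigma = 2 of (8))
    print(N, harmonic_minus_log(N) - EULER_GAMMA, tail_of_inverse_squares(N) - 1.0 / N)
```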
The four results (1), (3), (6), (8) are used very frequently and the reader is advised to remember them.

Exercise 1. Prove that, for $\xi \ge 2$,
$$\sum_{1\le n\le\xi}\frac{\log n}{n} = \frac{1}{2}\log^2\xi + c_1 + O\!\left(\frac{\log\xi}{\xi}\right).$$

Exercise 2. Prove that, for $\xi \ge 2$,
$$\sum_{2\le n\le\xi}\frac{1}{n\log n} = \log\log\xi + c_2 + O\!\left(\frac{1}{\xi\log\xi}\right).$$
5.9 Consequences of Chebyshev's Theorem

The letters $c_1, c_2, \ldots$ used in this section represent absolute constants.

Theorem 9.1. There exists a constant $c_1$ such that, for $\xi \ge 1$,
$$\left|\sum_{p\le\xi}\frac{\log p}{p} - \log\xi\right| < c_1.$$
Here $\sum_{p\le\xi}$ represents the summation over all primes $p$ not exceeding $\xi$.
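Proof. 1) We assume first that $\xi = x$ is an integer. From Theorem 1.11.1, we have
$$T(x) = \log x! = \log\prod_{p\le x} p^{\left[\frac{x}{p}\right] + \left[\frac{x}{p^2}\right] + \cdots} = \sum_{p\le x}\left(\left[\frac{x}{p}\right] + \left[\frac{x}{p^2}\right] + \cdots\right)\log p.$$
From
$$\frac{x}{p} - 1 < \left[\frac{x}{p}\right] \le \left[\frac{x}{p}\right] + \left[\frac{x}{p^2}\right] + \cdots \le \frac{x}{p} + \frac{x}{p^2} + \cdots \le \frac{x}{p} + \frac{x}{p(p-1)},$$
we have
$$x\sum_{p\le x}\frac{\log p}{p} - \sum_{p\le x}\log p < T(x) \le x\sum_{p\le x}\frac{\log p}{p} + x\sum_{p\le x}\frac{\log p}{p(p-1)}. \tag{1}$$
From Theorem 6.2, we have
$$\sum_{p\le x}\log p \le \log x\cdot\pi(x) \le c_2 x.$$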
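We also have
$$\sum_{p\le x}\frac{\log p}{p(p-1)} \le \sum_{2\le n\le x+1}\frac{\log n}{(n-1)^2} \le \sum_{n=1}^{\infty}\frac{\log(n+1)}{n^2} = c_3,$$
so that we now have, from (1), that
$$\left|T(x) - x\sum_{p\le x}\frac{\log p}{p}\right| \le c_4 x.$$
From Example 8.2 we have
$$|T(x) - x\log x| < c_5 x.$$
But
$$\left|x\sum_{p\le x}\frac{\log p}{p} - x\log x\right| \le \left|T(x) - x\sum_{p\le x}\frac{\log p}{p}\right| + |T(x) - x\log x|,$$
so that
$$\left|\sum_{p\le x}\frac{\log p}{p} - \log x\right| < c_6.$$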
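2) Let $\xi$ be real. Then
$$\sum_{p\le\xi}\frac{\log p}{p} = \sum_{p\le[\xi]}\frac{\log p}{p}.$$
From our earlier result we have
$$\left|\sum_{p\le\xi}\frac{\log p}{p} - \log[\xi]\right| < c_6.$$
But
$$|\log[\xi] - \log\xi| = \int_{[\xi]}^{\xi} d(\log t) = \int_{[\xi]}^{\xi}\frac{dt}{t} \le \int_{[\xi]}^{\xi} dt \le 1,$$
so that
$$\left|\sum_{p\le\xi}\frac{\log p}{p} - \log\xi\right| < c_6 + 1.$$
The theorem is proved. □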
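A small numerical experiment may help to visualise Theorem 9.1. The sketch below (an illustration, with hypothetical helper names) sieves the primes up to an integer $x$ and prints the deviation $\sum_{p\le x}(\log p)/p - \log x$, which the theorem asserts is bounded.

```python
import math

def primes_upto(n):
    """Sieve of Eratosthenes: the list of primes p <= n (for n >= 1)."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0] = sieve[1] = 0
    for i in range(2, math.isqrt(n) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(2, n + 1) if sieve[i]]

def deviation(x):
    """sum_{p<=x} (log p)/p - log x for an integer x >= 1."""
    return math.fsum(math.log(p) / p for p in primes_upto(x)) - math.log(x)

for x in (10, 100, 10**4, 10**5):
    print(x, deviation(x))
```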
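Theorem 9.2. There exists a constant $c_7$ such that, for $\xi \ge 2$,
$$\sum_{p\le\xi}\frac{1}{p} = \log\log\xi + c_7 + O\!\left(\frac{1}{\log\xi}\right).$$

Proof. Let
$$S(n) = \sum_{p\le n}\frac{\log p}{p}, \qquad n \ge 1,$$
so that, by Theorem 9.1,
$$S(n) = \log n + r_n, \qquad r_n = O(1).$$
Therefore
$$\begin{aligned}
\sum_{p\le\xi}\frac{1}{p} &= \sum_{p\le\xi}\frac{\log p}{p}\cdot\frac{1}{\log p} = \sum_{2\le n\le\xi}\frac{S(n) - S(n-1)}{\log n}\\
&= \sum_{2\le n\le\xi}\frac{\log n - \log(n-1)}{\log n} + \sum_{2\le n\le\xi}\frac{r_n - r_{n-1}}{\log n} = \Sigma_1 + \Sigma_2.
\end{aligned} \tag{2}$$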
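Now the function
$$f(x) = -\frac{\log\left(1 - \frac{1}{x}\right)}{\log x}, \qquad x \ge 2,$$
is decreasing, and $f(x) \to 0$ as $x \to +\infty$. Therefore, by Theorem 8.2, we have
$$\Sigma_1 = \sum_{2\le n\le\xi} f(n) = \int_2^{\xi} f(x)\,dx + c_8 + O\!\left(\frac{1}{\xi\log\xi}\right).$$
Since
$$f(x) = \frac{1}{x\log x} + O\!\left(\frac{1}{x^2\log x}\right),$$
the integral
$$\int_2^{\infty}\left(f(x) - \frac{1}{x\log x}\right)dx$$
converges to $c_9$, so that
$$\begin{aligned}
\Sigma_1 &= \int_2^{\xi}\frac{dx}{x\log x} + c_8 + \int_2^{\xi}\frac{-\log\left(1 - \frac{1}{x}\right) - \frac{1}{x}}{\log x}\,dx + O\!\left(\frac{1}{\xi\log\xi}\right)\\
&= \log\log\xi + c_{10} + O\!\left(\frac{1}{\xi\log\xi}\right),
\end{aligned} \tag{3}$$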
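where we have used
$$\int_{\xi}^{\infty}\frac{dx}{x^2\log x} = O\!\left(\frac{1}{\xi\log\xi}\right).$$
Next, from the convergence of the positive terms series
$$\sum_{n=2}^{\infty}\left(\frac{1}{\log n} - \frac{1}{\log(n+1)}\right)$$
and $r_n = O(1)$ we deduce that the series
$$\sum_{n=2}^{\infty} r_n\left(\frac{1}{\log n} - \frac{1}{\log(n+1)}\right)$$
converges to $c_{11}$. Also
$$\sum_{n>\xi} r_n\left(\frac{1}{\log n} - \frac{1}{\log(n+1)}\right) = O\!\left(\sum_{n>\xi}\left|\frac{1}{\log n} - \frac{1}{\log(n+1)}\right|\right) = O\!\left(\sum_{n>\xi}\frac{1}{n\log^2 n}\right) = O\!\left(\frac{1}{\log\xi}\right).$$
Therefore
$$\begin{aligned}
\Sigma_2 &= \sum_{2\le n\le\xi} r_n\left(\frac{1}{\log n} - \frac{1}{\log(n+1)}\right) + O\!\left(\frac{1}{\log\xi}\right)\\
&= \sum_{n=2}^{\infty} r_n\left(\frac{1}{\log n} - \frac{1}{\log(n+1)}\right) - \sum_{n>\xi} r_n\left(\frac{1}{\log n} - \frac{1}{\log(n+1)}\right) + O\!\left(\frac{1}{\log\xi}\right)\\
&= c_{11} + O\!\left(\frac{1}{\log\xi}\right).
\end{aligned} \tag{4}$$
From (2), (3) and (4) we arrive at
$$\sum_{p\le\xi}\frac{1}{p} = \log\log\xi + c_{10} + c_{11} + O\!\left(\frac{1}{\log\xi}\right) = \log\log\xi + c_7 + O\!\left(\frac{1}{\log\xi}\right).$$
The theorem is proved. □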
Theorem 9.3. There exists a constant $c_{12}$ such that, for $\xi \ge 2$,
$$\prod_{p\le\xi}\left(1 - \frac{1}{p}\right) = \frac{c_{12}}{\log\xi} + O\!\left(\frac{1}{\log^2\xi}\right).$$
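Proof. Since
$$\sum_{p>\xi}\left(\log\left(1 - \frac{1}{p}\right) + \frac{1}{p}\right) = O\!\left(\sum_{p>\xi}\frac{1}{p^2}\right) = O\!\left(\sum_{n>\xi}\frac{1}{n^2}\right) = O\!\left(\frac{1}{\xi}\right),$$
it follows from the previous theorem that
$$\begin{aligned}
\log\prod_{p\le\xi}\left(1 - \frac{1}{p}\right) &= \sum_{p\le\xi}\log\left(1 - \frac{1}{p}\right) = -\sum_{p\le\xi}\frac{1}{p} + \sum_{p\le\xi}\left[\log\left(1 - \frac{1}{p}\right) + \frac{1}{p}\right]\\
&= -\log\log\xi - c_7 + O\!\left(\frac{1}{\log\xi}\right) + \sum_{p\ge 2}\left(\log\left(1 - \frac{1}{p}\right) + \frac{1}{p}\right) - \sum_{p>\xi}\left(\log\left(1 - \frac{1}{p}\right) + \frac{1}{p}\right)\\
&= -\log\log\xi + c_{13} + O\!\left(\frac{1}{\log\xi}\right),
\end{aligned}$$
where
$$c_{13} = -c_7 + \sum_{p\ge 2}\left(\log\left(1 - \frac{1}{p}\right) + \frac{1}{p}\right).$$
Therefore
$$\prod_{p\le\xi}\left(1 - \frac{1}{p}\right) = e^{-\log\log\xi + c_{13} + O\left(\frac{1}{\log\xi}\right)} = \frac{e^{c_{13}}}{\log\xi}\,e^{O\left(\frac{1}{\log\xi}\right)} = \frac{c_{12}}{\log\xi}\left(1 + O\!\left(\frac{1}{\log\xi}\right)\right) \qquad (c_{12} = e^{c_{13}}),$$
where we have used
$$e^{O\left(\frac{1}{\log\xi}\right)} = 1 + O\!\left(\frac{1}{\log\xi}\right).$$
The theorem is proved. □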
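The constants in Theorems 9.2 and 9.3 can be estimated by direct computation. The following sketch is only an illustration with hypothetical helper names; it relies on the classical facts (not proved in this section) that $c_7 \approx 0.2615$ and $c_{12} = e^{-\gamma} \approx 0.5615$, where $\gamma$ is Euler's constant.

```python
import math

def primes_upto(n):
    """Sieve of Eratosthenes: the list of primes p <= n (for n >= 1)."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0] = sieve[1] = 0
    for i in range(2, math.isqrt(n) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(2, n + 1) if sieve[i]]

def mertens_quantities(x):
    """Return (sum_{p<=x} 1/p - log log x,  log x * prod_{p<=x} (1 - 1/p))."""
    ps = primes_upto(x)
    second = math.fsum(1.0 / p for p in ps) - math.log(math.log(x))
    third = math.exp(math.fsum(math.log1p(-1.0 / p) for p in ps)) * math.log(x)
    return second, third

for x in (100, 10**4, 10**5):
    print(x, mertens_quantities(x))
```

The first component should approach $c_7$ and the second $c_{12}$ as $x$ grows.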
Theorems 9.2 and 9.3 are quantitative elaborations of Theorems 4.3 and 4.5.

Exercise 1. Let $p_n$ denote the $n$th prime. Prove that there are constants $c_1, c_2$ such that
$$c_1 n\log n < p_n < c_2 n\log n, \qquad n \ge 2.$$

Exercise 2. Prove that there exists a positive constant $c$ such that
$$\varphi(n) > \frac{cn}{\log\log n}, \qquad n \ge 3.$$

Exercise 3. Prove that the infinite series
$$\sum_{p}\frac{1}{p(\log\log p)^{h}}$$
converges or diverges according to whether $h > 1$ or $h \le 1$. Here $\sum_p$ represents the summation over all the prime numbers.
5.10 The Number of Prime Factors of n

Let $n$ be a positive integer. We denote by $\omega(n)$ the number of distinct prime factors of $n$ and by $\Omega(n)$ the total number of prime factors of $n$. That is, if $n = p_1^{a_1}\cdots p_s^{a_s}$, then
$$\omega(n) = s, \qquad \Omega(n) = a_1 + \cdots + a_s. \tag{1}$$
If $n$ is a prime, then $\omega(n) = \Omega(n) = 1$; but as $n$ tends to infinity through powers of 2, then
$$\Omega(n) = \frac{\log n}{\log 2} \to \infty;$$
and if $n = p_1 p_2\cdots p_s$ is the product of the first $s$ primes, then as $n \to \infty$, $\omega(n) = s \to \infty$. Thus the behaviours of $\omega(n)$ and $\Omega(n)$ are rather irregular and there is certainly no asymptotic formula for them. However, we do have the following:
Theorem 10.1. There are positive constants $c_1, c_2$ such that
$$\sum_{n\le x}\omega(n) = x\log\log x + c_1 x + o(x), \tag{2}$$
$$\sum_{n\le x}\Omega(n) = x\log\log x + c_2 x + o(x). \tag{3}$$
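Proof. 1) We have
$$\sum_{n\le x}\omega(n) = \sum_{n\le x}\sum_{p\mid n} 1 = \sum_{p\le x}\left[\frac{x}{p}\right] = x\sum_{p\le x}\frac{1}{p} + O(\pi(x)),$$
and so (2) follows from Theorem 9.2 and Theorem 6.2.

2) We have
$$\sum_{n\le x}\Omega(n) = \sum_{n\le x}\sum_{p^m\mid n} 1 = \sum_{p^m\le x}\left[\frac{x}{p^m}\right],$$
and, by Theorem 6.2,
$$\sum_{\substack{p^m\le x\\ m\ge 2}} 1 \le \sum_{p^2\le x}\left[\frac{\log x}{\log p}\right] \le \frac{\log x}{\log 2}\sum_{p^2\le x} 1 = \frac{\log x}{\log 2}\,\pi(\sqrt{x}) = o(x).$$
Therefore
$$\sum_{n\le x}\Omega(n) = \sum_{n\le x}\omega(n) + x\sum_{\substack{p^m\le x\\ m\ge 2}}\frac{1}{p^m} + o(x).$$
But the series
$$\sum_{m\ge 2}\sum_{p}\frac{1}{p^m} = \sum_{p}\left(\frac{1}{p^2} + \frac{1}{p^3} + \cdots\right) = \sum_{p}\frac{1}{p(p-1)} = c$$
converges, so that
$$\sum_{n\le x}\Omega(n) = \sum_{n\le x}\omega(n) + x(c + o(1)) + o(x) = x\log\log x + c_2 x + o(x). \qquad □$$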
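The averages in Theorem 10.1 are easy to tabulate. The sketch below is an illustration with hypothetical function names: it computes $\omega(n)$ and $\Omega(n)$ for all $n \le x$ with a sieve and prints how far their means lie from $\log\log x$. By the proof above, the difference of the two means approximates $c = \sum_p 1/(p(p-1))$.

```python
import math

def prime_factor_counts(x):
    """Return the lists (omega, Omega), indexed by n = 0..x."""
    omega = [0] * (x + 1)
    big_omega = [0] * (x + 1)
    for p in range(2, x + 1):
        if omega[p] == 0:              # no smaller prime divides p, so p is prime
            q = p
            while q <= x:              # q runs over the powers p, p^2, p^3, ...
                for m in range(q, x + 1, q):
                    big_omega[m] += 1
                    if q == p:
                        omega[m] += 1
                q *= p
    return omega, big_omega

def mean_excess(x):
    """Return (mean of omega - log log x, mean of Omega - log log x) over n <= x."""
    omega, big_omega = prime_factor_counts(x)
    ll = math.log(math.log(x))
    return sum(omega) / x - ll, sum(big_omega) / x - ll

for x in (10**3, 10**4, 10**5):
    print(x, mean_excess(x))
```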
Theorem 10.2 (Hardy-Ramanujan). Let $\varepsilon > 0$, and let $f(n)$ denote either $\omega(n)$ or $\Omega(n)$. Then the number of positive integers $n \le x$ satisfying
$$|f(n) - \log\log n| > (\log\log n)^{\frac{1}{2}+\varepsilon} \tag{4}$$
is $o(x)$, as $x \to \infty$.
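Proof (Turán). Since $\log\log x - 1 < \log\log n \le \log\log x$ when $x^{1/e} < n \le x$, and the number of positive integers $n \le x^{1/e}$ is $[x^{1/e}] = o(x)$, it suffices to prove that the number of positive integers $n \le x$ satisfying
$$|f(n) - \log\log x| > (\log\log x)^{\frac{1}{2}+\varepsilon} \tag{5}$$
is $o(x)$ as $x \to \infty$.

Next, from $\Omega(n) \ge \omega(n)$, and by (2) and (3),
$$\sum_{n\le x}(\Omega(n) - \omega(n)) = O(x),$$
so that the number of positive integers $n \le x$ satisfying $\Omega(n) - \omega(n) > (\log\log x)^{1/2}$ is
$$O\!\left(\frac{x}{(\log\log x)^{1/2}}\right) = o(x).$$
Therefore we need only consider the case $f(n) = \omega(n)$.

We consider a pair $p, q$ of distinct prime divisors of $n$ ($p, q$ and $q, p$ are treated as two different pairs). Each $p$ may take $\omega(n)$ values and for each fixed $p$, $q$ may take $\omega(n) - 1$ values. Therefore we have
$$\omega(n)(\omega(n) - 1) = \sum_{\substack{pq\mid n\\ p\ne q}} 1 = \sum_{pq\mid n} 1 - \sum_{p^2\mid n} 1.$$
Summing over $n = 1, 2, \ldots, [x]$ we have
$$\sum_{n\le x}\omega^2(n) - \sum_{n\le x}\omega(n) = \sum_{pq\le x}\left[\frac{x}{pq}\right] - \sum_{p^2\le x}\left[\frac{x}{p^2}\right]. \tag{6}$$
Since
$$\sum_{p^2\le x}\left[\frac{x}{p^2}\right] \le x\sum_{p}\frac{1}{p^2} = O(x)$$
and
$$\sum_{pq\le x}\left[\frac{x}{pq}\right] = x\sum_{pq\le x}\frac{1}{pq} + O(x),$$
it follows from (2) and (6) that
$$\sum_{n\le x}\omega^2(n) = x\sum_{pq\le x}\frac{1}{pq} + O(x\log\log x). \tag{7}$$
Now
$$\left(\sum_{p\le\sqrt{x}}\frac{1}{p}\right)^2 \le \sum_{pq\le x}\frac{1}{pq} \le \left(\sum_{p\le x}\frac{1}{p}\right)^2,$$
and $\sum_{p\le\xi} 1/p = \log\log\xi + O(1)$, so that both the outer terms in the above are
$$(\log\log x + O(1))^2 = (\log\log x)^2 + O(\log\log x).$$
It now follows from (7) that
$$\sum_{n\le x}\omega^2(n) = x(\log\log x)^2 + O(x\log\log x), \tag{8}$$
and so
$$\begin{aligned}
\sum_{n\le x}(\omega(n) - \log\log x)^2 &= \sum_{n\le x}\omega^2(n) - 2\log\log x\sum_{n\le x}\omega(n) + [x](\log\log x)^2\\
&= x(\log\log x)^2 + O(x\log\log x) - 2\log\log x\,(x\log\log x + O(x)) + (x + O(1))(\log\log x)^2\\
&= O(x\log\log x).
\end{aligned} \tag{9}$$
Given any $\delta > 0$, if there are $\delta x$ positive integers $n \le x$ such that (5) holds, then
$$\sum_{n\le x}(\omega(n) - \log\log x)^2 \ge \delta x(\log\log x)^{1+2\varepsilon}, \tag{10}$$
which contradicts (9). Therefore the number of positive integers $n \le x$ such that (5) holds is $o(x)$, and the theorem is proved. □

From this we see that
$$\omega(n) \sim \log\log n \qquad\text{and}\qquad \Omega(n) \sim \log\log n$$
for almost all $n$.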
5.11 A Prime Representing Function Theorem 11.1 (Miller). There exists a fixed number 2~o
then
[!Xn ]
is always prime.
= !Xl>""
!X
such that
if
~
x such
Proof. We construct a sequence of primes $\{p_n\}$ by induction: Take $p_1 = 3$. By Theorem 7.1 there exists a prime $p_{n+1}$ satisfying
$$2^{p_n} < p_{n+1} < 2^{p_n+1}.$$
If $p_{n+1} + 1 = 2^{p_n+1}$, then $p_{n+1} = 2^{p_n+1} - 1$ cannot be prime (because it has the divisor $2^{\frac{1}{2}(p_n+1)} - 1$). Therefore
$$2^{p_n} < p_{n+1} < p_{n+1} + 1 < 2^{p_n+1}.$$
Using logarithm base 2 we define $\log^{(n)} x = \log^{(n-1)}(\log x)$. Consider the sequences
$$u_n = \log^{(n)} p_n, \ \ldots$$
From $p_n < \log p_{n+1} < \log(p_{n+1} + 1)$
5.12 On Primes in an Arithmetic Progression

We saw in the exercises in §5 that there are infinitely many primes of the form $4n - 1$ and $6n - 1$. This suggests the following: If $a$ and $b$ are coprime integers, then there are infinitely many primes of the form $an + b$. This is the famous Dirichlet's theorem which we shall prove in Chapter 9. Here we study the following special situation.

We assume that $a, b$ are positive and that $b$ is fixed. We observe that if, given any $a$, there is always a prime of the form $an + b$ ($n > 0$), then Dirichlet's theorem follows. For if there exists $n$ such that $an + b = p_1$ ($> b$) is prime, and (replacing $a$ by $ap_1$) there exists $n$ such that $ap_1 n + b = p_2$ ($> p_1$) is prime, and so on, then there are infinitely many primes of the form $an + b$.

Theorem 12.1. Let $k > 1$. Then there are infinitely many primes of the form $kn + 1$.

From what we said earlier it suffices to prove that there always exists a prime of the form $kn + 1$.
The roots of the equation $x^k = 1$ are given by
$$e^{2\pi i a/k}, \qquad a = 0, 1, \ldots, k - 1.$$
Let
$$F_n(x) = \prod_{\substack{1\le a\le n\\ (a,n)=1}}\left(x - e^{2\pi i a/n}\right),$$
where the product is over a reduced set of residues $a$ mod $n$. Clearly we have
$$x^k - 1 = \prod_{n\mid k} F_n(x),$$
where the product is over the divisors $n$ of $k$, since each root on the left hand side must occur on the right hand side, and conversely without any repetition. Let
$$x^k - 1 = F_k(x)\,G_k(x),$$
where $G_k(x)$ is the least common multiple of the various polynomials $x^n - 1$ ($n \mid k$, $n < k$), and its leading coefficient is 1. Therefore $G_k(x)$ is an integer coefficient polynomial, and by Theorem 1.13.2 we see that $F_k(x)$ is also an integer coefficient polynomial. If $x$ is an integer not equal to $\pm 1$, then
$$F_k(x)\,G_k(x) = x^k - 1 \ne 0;$$
that is, $F_k(x)$ and $G_k(x)$ are nonzero integers.

Lemma 1. Let $n$ be a proper divisor of $k$. Then for all integers $x \ne \pm 1$, we have
$$\left(x^n - 1,\ \frac{x^k - 1}{x^n - 1}\right) \Bigm|\ k.$$

Proof. Let $x^n - 1 = y$, $k = nd$. Then
$$\frac{x^k - 1}{x^n - 1} = \frac{(y+1)^d - 1}{y} = y^{d-1} + \binom{d}{1}y^{d-2} + \cdots + \binom{d}{2}y + d \equiv d \pmod{y}. \qquad □$$
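Lemma 2. Let $x$ be an integer not equal to $\pm 1$. Then each common prime divisor of $F_k(x)$ and $G_k(x)$ must be a divisor of $k$.

Proof. Let $p \mid (F_k(x), G_k(x))$. From
$$p \mid G_k(x) = \prod_{\substack{n\mid k\\ n<k}} F_n(x),$$
we see that $p \mid F_n(x)$ for some $n$ ($n \mid k$, $n < k$), so that $p \mid x^n - 1$. Again, from $p \mid F_k(x)$, we have
$$p \ \Bigm|\ \frac{x^k - 1}{x^n - 1}.$$
Therefore
$$p \ \Bigm|\ \left(x^n - 1,\ \frac{x^k - 1}{x^n - 1}\right)$$
and the required result follows from Lemma 1. □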
Proof of Theorem 12.1. Let $x = ky$. Then $x^k - 1 \equiv -1 \pmod{k}$, so that $F_k(x)$, being a divisor of $x^k - 1$, is coprime with $k$. We can select $y$ such that $F_k(x) \ne \pm 1$; this is possible because the equation $F_k(x) = \pm 1$ has only finitely many solutions. There must be a prime divisor $p$ of $F_k(x)$, and, by Lemma 2, $p$ does not divide $G_k(x)$. In other words, for each proper divisor $n$ of $k$,
$$x^n \not\equiv 1 \pmod{p}. \tag{1}$$
But
$$x^k \equiv 1 \pmod{p}.$$
We now prove that $k \mid p - 1$. Suppose otherwise. Then there are integers $s$ and $t$ such that
$$(k, p - 1) = sk + t(p - 1).$$
That is, corresponding to $n = (k, p - 1)$, we have
$$x^n \equiv (x^k)^s (x^{p-1})^t \equiv 1 \pmod{p},$$
which contradicts (1). Therefore $p \equiv 1 \pmod{k}$; that is, there exists a prime of the form $kn + 1$. As we already observed, this proves the theorem. □

Exercise. Prove that there are infinitely many primes of the form $8n + 5$.

Hint: Consider $q = 3^2\cdot 5^2\cdot 7^2\cdots p^2 + 2^2$, and prove that each prime $p$ of the form $x^2 + y^2$ must be congruent to 1 (mod 4).
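The argument for Theorem 12.1 can be tried out on small cases. The sketch below is an illustration, not the book's method of computation: it evaluates $F_k(x)$ at an integer through the standard identity $F_k(x) = \prod_{d\mid k}(x^d - 1)^{\mu(k/d)}$ (the Möbius function $\mu$ is introduced in Chapter 6), takes $x = k$ (that is, $y = 1$), and lists the prime divisors, each of which should be congruent to 1 mod $k$. All function names are hypothetical.

```python
def mobius(n):
    """Moebius function mu(n) by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0          # divisible by a prime square
            result = -result
        p += 1
    return -result if n > 1 else result

def cyclotomic_value(k, x):
    """Value F_k(x) for an integer x >= 2, as an exact integer quotient."""
    num = den = 1
    for d in range(1, k + 1):
        if k % d == 0:
            mu = mobius(k // d)
            if mu == 1:
                num *= x ** d - 1
            elif mu == -1:
                den *= x ** d - 1
    return num // den

def prime_factors(n):
    """Set of prime divisors of n >= 1 by trial division."""
    factors, p = set(), 2
    while p * p <= n:
        while n % p == 0:
            factors.add(p)
            n //= p
        p += 1
    if n > 1:
        factors.add(n)
    return factors

for k in range(2, 11):
    value = cyclotomic_value(k, k)          # x = k*y with y = 1
    print(k, value, sorted(prime_factors(value)))
```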
Notes

5.1. There has been much progress towards the Goldbach problem in recent years using sieve methods. Perhaps the most exciting is the following result of J. R. Chen [19], [20]. Let $n$ be a sufficiently large even integer and denote by $P_n(1, 2)$ the number of primes $p \le n$ such that either $n - p$ is a prime or a product of two primes. Then
$$P_n(1, 2) > 0.67\prod_{p>2}\left(1 - \frac{1}{(p-1)^2}\right)\prod_{\substack{p\mid n\\ p>2}}\frac{p-1}{p-2}\cdot\frac{n}{\log^2 n}.$$
It follows, of course, from this that every sufficiently large even integer is a sum of a prime and an integer having at most two prime factors. The proof of Chen's theorem is given in the book "Sieve Methods" by H. Halberstam and H. E. Richert [28] where there is also a comprehensive bibliography.

5.2. Concerning the prime twins problem J. R. Chen [20] also proved that there are infinitely many primes $p$ such that $p + 2$ is either a prime or has two prime factors.

5.3. H. Iwaniec (unpublished) has proved that there are infinitely many integers $n$ such that $n^2 + 1$ is either a prime or has two prime factors.

5.4. The principle of the "large sieve" was invented by Yu. Linnik and A. Renyi, and was substantially developed by K. F. Roth [50] and E. Bombieri [9] (see also the books by H. L. Montgomery [44] and E. Bombieri [10]). From his result Bombieri deduced the following theorem on the average value of $\pi(x; k, l)$: Given any $A > 0$, there exists $B = B(A) > 0$ such that
$$\sum_{k\le x^{1/2}/\log^B x}\ \max_{(l,k)=1}\left|\pi(x; k, l) - \frac{\operatorname{li} x}{\varphi(k)}\right| = O\!\left(\frac{x}{\log^A x}\right).$$
(A. I. Vinogradov [59] independently proved a slightly weaker result.)

5.5. There has also been much recent work on the distribution of $d_n = p_{n+1} - p_n$ where $p_n$ is the $n$th prime number. For example, H. L. Montgomery [44] proved that $d_n = O(p_n^{3/5+\varepsilon})$, where $\varepsilon$ is any positive number, with the implied constant depending on $\varepsilon$; M. N. Huxley [31] improved this to $d_n = O(p_n^{7/12+\varepsilon})$ and very recently this has been improved to $d_n = O(p_n^{11/20+\varepsilon})$ by H. Halberstam, D. R. Heath-Brown and H. E. Richert. We observe that $d_n = 2$ whenever $p_n, p_{n+1}$ are prime twins. Concerning unconditional lower bounds for $d_n$, E. Bombieri and H. Davenport [11] proved that
$$E = \liminf_{n\to\infty}\frac{d_n}{\log p_n} \le \frac{1}{8}(2 + \sqrt{3}) = 0.46650\ldots,$$
and this has been improved to $E \le \frac{1}{4}\left(\frac{\pi}{4} + 1\right) = 0.4463\ldots$ (see [32]).

5.6. Besides the problems on the distribution of primes mentioned in the text there is also the problem of the least prime in an arithmetic progression, that is the estimate of the least prime $P(k, l)$ in the arithmetic progression $kn + l$ ($n = 1, 2, \ldots$) where $k, l$ are coprime positive integers. S. Chowla has conjectured that $P(k, l) = O(k^{1+\varepsilon})$ and Yu. Linnik was the first to prove that there is an absolute constant $c$ such that $P(k, l) = O(k^c)$. Later C. T. Pan [45] gave a computable estimate for the value of $c$, and the present best estimate gives $c < 15$ which is due to J. R. Chen (unpublished).
5.7. In 1922 G. H. Hardy and J. E. Littlewood conjectured that every sufficiently large integer is the sum of two squares and a prime. This was proved by Yu. Linnik [40] using rather complicated methods. However there is now a simpler proof, based on E. Bombieri's mean value theorem for $\pi(x; k, l)$, of this conjecture (see P. D. T. A. Elliot and H. Halberstam [23]). Many of the problems mentioned in these notes are also discussed in the author's book [30].
Chapter 6. Arithmetic Functions
6.1 Examples of Arithmetic Functions

Definition 1. By an arithmetic function $f(n)$ we mean a function whose domain is the set of positive integers.

Examples. Any sequence $a_n$ is an arithmetic function. Specifically we can have $n!$, $\sin n$, $d(n) = \sum_{d\mid n} 1$ or $r(n)$ where $r(n)$ is the number of solutions to the equation $n = x^2 + y^2$.

Definition 2. Let $f(n)$ be an arithmetic function such that if $(a, b) = 1$, then
$$f(ab) = f(a)f(b). \tag{1}$$
Then we call $f(n)$ a multiplicative function. If (1) holds regardless of the condition $(a, b) = 1$, then we say that $f(n)$ is completely multiplicative.

From this definition we see that if $f(n)$ is a multiplicative function and if $p_1, \ldots, p_r$ are distinct prime numbers, then
$$f(p_1^{l_1}\cdots p_r^{l_r}) = f(p_1^{l_1})\cdots f(p_r^{l_r}),$$
so that $f(n)$ is determined by the values it takes at the prime powers. Moreover, if $f(n)$ is completely multiplicative, then
$$f(p_1^{l_1}\cdots p_r^{l_r}) = f(p_1)^{l_1}\cdots f(p_r)^{l_r},$$
so that $f(n)$ is determined by the values it takes at the primes. It is clear that the product of two multiplicative functions is multiplicative and the product of two completely multiplicative functions is completely multiplicative.

Example 1. The function
$$\Delta(n) = \begin{cases} 1, & \text{if } n = 1,\\ 0, & \text{if } n \ne 1,\end{cases}$$
is completely multiplicative.

Example 2. The function $E_{\lambda}(n) = n^{\lambda}$ is completely multiplicative.
Example 3. The Möbius function is defined by:
$$\mu(n) = \begin{cases} 1, & \text{if } n = 1,\\ (-1)^r, & \text{if } n \text{ is the product of } r \text{ distinct primes},\\ 0, & \text{if } n \text{ is divisible by a prime square}.\end{cases}$$
It is easy to see that
$$\mu(1) = 1,\ \mu(2) = -1,\ \mu(3) = -1,\ \mu(4) = 0,\ \mu(5) = -1,\ \mu(6) = 1,$$
$$\mu(7) = -1,\ \mu(8) = 0,\ \mu(9) = 0,\ \mu(10) = 1,\ \mu(11) = -1,\ \ldots.$$
Here $\mu(n)$ is multiplicative, but not completely multiplicative.

Example 4. The number of positive integers not exceeding $n$ and coprime with $n$ is denoted by $\varphi(n)$, and it is called Euler's function. This function is also multiplicative, but not completely multiplicative.

Example 5. The divisor function $d(n) = \sum_{d\mid n} 1$ is also multiplicative, but not completely multiplicative. More generally, the function $\sigma_{\lambda}(n) = \sum_{d\mid n} d^{\lambda}$ is multiplicative. We note that $\sigma_0(n) = d(n)$.

Example 6. Von Mangoldt's function is defined by:
$$\Lambda(n) = \begin{cases} \log p, & \text{if } p \text{ is the only prime factor of } n,\\ 0, & \text{otherwise}.\end{cases}$$
We have
$$\Lambda(1) = 0,\ \Lambda(2) = \log 2,\ \Lambda(3) = \log 3,\ \Lambda(4) = \log 2,\ \Lambda(5) = \log 5,$$
$$\Lambda(6) = 0,\ \Lambda(7) = \log 7,\ \Lambda(8) = \log 2,\ \Lambda(9) = \log 3,\ \Lambda(10) = 0,\ \ldots$$
and we see that $\Lambda(n)$ is not multiplicative.

Example 7. We define
$$\Lambda_1(n) = \begin{cases} \dfrac{1}{m}, & \text{if } n \text{ is the } m\text{th power of a prime},\\[4pt] 0, & \text{otherwise}.\end{cases}$$
We have
$$\Lambda_1(1) = 0,\ \Lambda_1(2) = 1,\ \Lambda_1(3) = 1,\ \Lambda_1(4) = \tfrac{1}{2},\ \Lambda_1(5) = 1,$$
$$\Lambda_1(6) = 0,\ \Lambda_1(7) = 1,\ \Lambda_1(8) = \tfrac{1}{3},\ \Lambda_1(9) = \tfrac{1}{2},\ \Lambda_1(10) = 0,\ \ldots,$$
and that $\Lambda_1(n)$ is not multiplicative.

Example 8. Let $p$ be a fixed prime number. If $p^a \,\|\, n$, we define $V_p(n) = p^{-a}$. This function is completely multiplicative and it is not difficult to prove that
$$V_p(n + m) \le \max(V_p(n), V_p(m)).$$

Example 9. Let $r(n)$ denote the number of solutions to the equation $n = x^2 + y^2$. We shall prove in §7 that $\frac{1}{4}r(n)$ is a multiplicative function. However, from $r(3) = 0$, $r(9) = 4$ we see that it is not completely multiplicative.
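The values listed in Example 3 can be reproduced by a direct implementation of the definition. The following is a minimal sketch (the function name `mobius` is an assumption of this illustration) using trial division.

```python
def mobius(n):
    """Moebius function: 1 at n = 1, (-1)^r for a product of r distinct
    primes, and 0 when n is divisible by a prime square."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    return -result if n > 1 else result

values = [mobius(n) for n in range(1, 12)]
print(values)
# multiplicative on coprime arguments, but not completely multiplicative:
print(mobius(2) * mobius(3) == mobius(6), mobius(2) * mobius(2) == mobius(4))
```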
6.2 Properties of Multiplicative Functions

Theorem 2.1. Let $f(n)$ be a multiplicative function which is not identically zero. Then $f(1) = 1$.

Proof. Let $f(a) \ne 0$. From $f(a) = f(a)f(1)$ we deduce that $f(1) = 1$. □

Theorem 2.2. Let $g(n)$ and $h(n)$ be multiplicative functions. Then the function
$$f(n) = \sum_{d\mid n} g(d)\,h\!\left(\frac{n}{d}\right) = \sum_{d\mid n} g\!\left(\frac{n}{d}\right)h(d) \tag{1}$$
is also multiplicative.

Proof. The second equation in (1) follows from the substitution $d' = n/d$. Suppose that $(a, b) = 1$. Then
$$f(ab) = \sum_{d\mid ab} g(d)\,h\!\left(\frac{ab}{d}\right).$$
Let $u = (a, d)$, $v = (b, d)$ so that $uv = d$ and hence
$$f(ab) = \sum_{u\mid a}\sum_{v\mid b} g(uv)\,h\!\left(\frac{ab}{uv}\right) = \sum_{u\mid a} g(u)\,h\!\left(\frac{a}{u}\right)\sum_{v\mid b} g(v)\,h\!\left(\frac{b}{v}\right) = f(a)f(b). \qquad □$$

Theorem 2.3. Let $f(n)$ be a multiplicative function which is not identically zero. Then
$$\sum_{d\mid n}\mu(d)f(d) = \prod_{p\mid n}(1 - f(p)), \tag{2}$$
where $p$ runs through the prime divisors of $n$.

Proof. We put $g(n) = \mu(n)f(n)$, $h(n) = 1$ in Theorem 2.2, so that the left hand side of (2) is a multiplicative function. It is clear that the right hand side of (2) is also multiplicative. It follows that we only need to prove (2) when $n = 1$ and $n = p^l$, and these two cases can be verified easily. □
6.3 The Mobius Inversion Formula
Theorem 2.4. Let j(n) be multiplicative. Then
j(m, n»j([m, n]) = f(m)f(n) , where [m, n] is the least common multiple of m and n. Proof Let
Then f(m) = f(plt') ... f(p!s), f(n) = f(p~l) ... f(p~s),
Since f(i)f(pr) = f(pmaX(l,r)f(pmin(I,r), the theorem follows.
0
6.3 The Möbius Inversion Formula

Theorem 3.1. Let $n > 0$. We have
$$\sum_{d\mid n}\mu(d) = \sum_{d\mid n}\mu(n/d) = \Delta(n) = \begin{cases} 1, & \text{if } n = 1,\\ 0, & \text{if } n \ne 1.\end{cases}$$

Proof. This follows from taking $f(d) = 1$ in Theorem 2.3. □
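Theorem 3.2. Let $0 < \eta_0 \le \eta_1$ and let $h(k)$ be a completely multiplicative function which is not identically zero. If for any $\eta$ satisfying $\eta_0 \le \eta \le \eta_1$ we have
$$g(\eta) = \sum_{1\le k\le\eta_1/\eta} f(k\eta)\,h(k), \tag{1}$$
then
$$f(\eta) = \sum_{1\le k\le\eta_1/\eta}\mu(k)\,g(k\eta)\,h(k); \tag{2}$$
the converse also holds.

Proof. From (1) we have
$$\sum_{1\le k\le\eta_1/\eta}\mu(k)g(k\eta)h(k) = \sum_{1\le k\le\eta_1/\eta}\mu(k)h(k)\sum_{1\le m\le\eta_1/(k\eta)} f(mk\eta)h(m).$$
Let $mk = r$. From Theorem 3.1 we have
$$\begin{aligned}
\sum_{1\le k\le\eta_1/\eta}\mu(k)g(k\eta)h(k) &= \sum_{1\le k\le\eta_1/\eta}\mu(k)\sum_{\substack{1\le r\le\eta_1/\eta\\ k\mid r}} f(r\eta)\,h(k)\,h\!\left(\frac{r}{k}\right)\\
&= \sum_{1\le r\le\eta_1/\eta} f(r\eta)h(r)\sum_{k\mid r}\mu(k)\\
&= \sum_{1\le r\le\eta_1/\eta} f(r\eta)h(r)\Delta(r) = f(\eta)h(1) = f(\eta),
\end{aligned}$$
which proves (2). Suppose instead that (2) holds. Then
$$\begin{aligned}
\sum_{1\le k\le\eta_1/\eta} f(k\eta)h(k) &= \sum_{1\le k\le\eta_1/\eta} h(k)\sum_{1\le m\le\eta_1/(k\eta)}\mu(m)g(mk\eta)h(m)\\
&= \sum_{1\le k\le\eta_1/\eta}\ \sum_{\substack{1\le r\le\eta_1/\eta\\ k\mid r}}\mu(r/k)\,g(r\eta)\,h(k)\,h(r/k)\\
&= \sum_{1\le r\le\eta_1/\eta} g(r\eta)h(r)\sum_{k\mid r}\mu(r/k)\\
&= \sum_{1\le r\le\eta_1/\eta} g(r\eta)h(r)\Delta(r) = g(\eta),
\end{aligned}$$
which proves (1). □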
We can extend this theorem as follows:

Theorem 3.3. Let $\xi_0 \ge 1$ and let $H(k)$ be a completely multiplicative function which is not identically zero. If for all real $\xi$ satisfying $1 \le \xi \le \xi_0$ we have
$$G(\xi) = \sum_{1\le k\le\xi} F(\xi/k)\,H(k), \tag{3}$$
then we have, for such $\xi$,
$$F(\xi) = \sum_{1\le k\le\xi}\mu(k)\,G(\xi/k)\,H(k); \tag{4}$$
the converse also holds.

Proof. Let $f(\eta) = F(1/\eta)$ and $g(\eta) = G(1/\eta)$. Then from (3) and (4) we have
$$g(\eta) = G(1/\eta) = \sum_{1\le k\le 1/\eta} F\!\left(\frac{1}{\eta k}\right)H(k) = \sum_{1\le k\le 1/\eta} f(\eta k)H(k),$$
$$f(\eta) = F(1/\eta) = \sum_{1\le k\le 1/\eta}\mu(k)\,G\!\left(\frac{1}{\eta k}\right)H(k) = \sum_{1\le k\le 1/\eta}\mu(k)\,g(\eta k)H(k).$$
These are just formulae (1) and (2) with $\eta_1 = 1$, $1/\xi_0 = \eta_0$. □
Theorem 3.4. When
~ ~
I we have
II
J1.r)
1 H"~
Proof In (3) we set
F(~)
~ ~
(5)
= =I H(k) I
If I
I ~ l.
so that GW
I
=
J1.(k)
1"kq
=
[~].
[t]·
(6)
< 2, then (5) clearly holds. Suppose now that ~
IxI
k= 1
J1.(k) k
From (4) we have
~
2, and let x
= [~]. T~en
11=1 I J1.(k)(~[~])1 k
k= 1
k
=IIJ1.(k)(~[~])I~ k k k=2
II=xl.
k=2
Therefore
xl I k=l
J1.(k) k
and the required result follows.
I~ I + (x 
1)
=
x,
D
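Both the identity (6) and the bound (5) can be confirmed with exact arithmetic. The sketch below is an illustration with hypothetical function names; it uses Python's `fractions.Fraction` so that no rounding enters the comparison.

```python
from fractions import Fraction

def mobius(n):
    """Moebius function mu(n) by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    return -result if n > 1 else result

def floor_identity(x):
    """sum_{k<=x} mu(k) [x/k], which (6) says equals 1."""
    return sum(mobius(k) * (x // k) for k in range(1, x + 1))

def mu_over_k_sum(x):
    """Exact value of sum_{k<=x} mu(k)/k."""
    return sum(Fraction(mobius(k), k) for k in range(1, x + 1))

for x in (1, 2, 10, 100):
    print(x, floor_identity(x), float(mu_over_k_sum(x)))
```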
6.4 The Möbius Transformation

Another consequence of Theorem 3.3 is the following:

Theorem 4.1. Let $h(k)$ be a completely multiplicative function which is not identically zero, and let $n_0$ be a positive integer. If for all $n$ satisfying $1 \le n \le n_0$, we have
$$g(n) = \sum_{d\mid n} f(d)\,h\!\left(\frac{n}{d}\right), \tag{1}$$
then, for such $n$, we have
$$f(n) = \sum_{d\mid n}\mu(d)\,g\!\left(\frac{n}{d}\right)h(d); \tag{2}$$
the converse also holds.

Proof. We define $F(\xi)$ by setting $F(\xi) = f(\xi)$ when $\xi$ is an integer and $F(\xi) = 0$ if $\xi$ is not an integer, and we define $G(\xi)$ similarly. We can rewrite (1) and (2) as
$$G(n) = g(n) = \sum_{d\mid n} f(d)\,h\!\left(\frac{n}{d}\right) = \sum_{k\mid n} f\!\left(\frac{n}{k}\right)h(k) = \sum_{1\le k\le n} F\!\left(\frac{n}{k}\right)h(k)$$
and
$$F(n) = f(n) = \sum_{d\mid n}\mu(d)\,g\!\left(\frac{n}{d}\right)h(d) = \sum_{d\mid n}\mu(d)\,G\!\left(\frac{n}{d}\right)h(d) = \sum_{1\le d\le n}\mu(d)\,G\!\left(\frac{n}{d}\right)h(d).$$
From the definition of $F(\xi)$ and $G(\xi)$ these two formulae can also be written as
$$G(\xi) = \sum_{1\le k\le\xi} F\!\left(\frac{\xi}{k}\right)h(k), \qquad F(\xi) = \sum_{1\le k\le\xi}\mu(k)\,G\!\left(\frac{\xi}{k}\right)h(k).$$
Here $\xi$ satisfies $1 \le \xi \le n_0$. Conversely (1) and (2) can be deduced from these formulae. The theorem now follows from Theorem 3.3 with $\xi_0 = n_0$. □

Definition. If
$$g(n) = \sum_{d\mid n} f(d) = \sum_{d\mid n} f\!\left(\frac{n}{d}\right),$$
then we call $g(n)$ the Möbius transform of $f(n)$. We also call $f(n)$ the inverse Möbius transform of $g(n)$.

From Theorem 4.1 we have
$$f(n) = \sum_{d\mid n}\mu(d)\,g\!\left(\frac{n}{d}\right) = \sum_{d\mid n}\mu\!\left(\frac{n}{d}\right)g(d).$$
From Theorem 2.2 we see that the Möbius transform, and the inverse Möbius transform, of a multiplicative function is multiplicative.

Example 1. From Theorem 3.1 we see that $\Delta(n)$ is the Möbius transform of $\mu(n)$.

Example 2. From $\sigma_{\lambda}(n) = \sum_{d\mid n} d^{\lambda}$, we see that $\sigma_{\lambda}(n)$ is the Möbius transform of the multiplicative function $E_{\lambda}(n) = n^{\lambda}$, and therefore $\sigma_{\lambda}(n)$ is a multiplicative function. Since
$$\sigma_{\lambda}(p^l) = \sum_{m=0}^{l} p^{m\lambda} = \frac{p^{\lambda(l+1)} - 1}{p^{\lambda} - 1} \qquad (\lambda \ne 0),$$
we deduce that if $n = \prod_v p_v^{l_v}$, then
$$\sigma_{\lambda}(n) = \prod_v\frac{p_v^{\lambda(l_v+1)} - 1}{p_v^{\lambda} - 1}.$$
In particular, when $\lambda = 0$, we have
$$d(n) = \sigma_0(n) = \prod_v (l_v + 1),$$
which we already proved in an earlier exercise.

Example 3. The function $E_0(n) = 1$ is the Möbius transform of $\Delta(n)$.

Example 4. Let $n$ be fixed and let the integers $1, 2, \ldots, a, \ldots, n$ be partitioned into distinct classes according to the value of the greatest common divisor $(n, a)$. If $d = (n, a)$, then we can write $n = dk$ and $1 = (k, a/d)$. Now the number of integers $a$ satisfying $1 = (k, a/d)$ is precisely $\varphi(k) = \varphi(n/d)$, so that
$$n = \sum_{d\mid n}\varphi\!\left(\frac{n}{d}\right) = \sum_{d\mid n}\varphi(d).$$
In other words, the function $E_1(n) = n$ is the Möbius transform of $\varphi(n)$.

Theorem 4.2.
$$\varphi(n) = n\sum_{d\mid n}\frac{\mu(d)}{d}. \qquad □$$
Example 5. More generally we denote by
n = TIvP~v, we have
=
n).
I Jl(~) = n). TI (1  ~). din
d
pin
P
We leave the verification for this to the reader.

Example 6. Consider a prime modulus $p$. Let the polynomial $x^{p^n} - x$ be factorized into a product of irreducible factors. If $m$ is the degree of one of its factors, then we know that $m \mid n$. Conversely any irreducible polynomial of degree $m$ must be one of its factors. Denote by
That is, the function pn is the Mobius transform of n
=
I
Jl(m)pn/m,
min
which gives another proof of Theorem 4.9.2.
Example 7. We seek the Möbius transform of $\Lambda(n)$. Let $n = p_1^{l_1}\cdots p_r^{l_r}$ be the standard factorization of $n$. Then
$$\begin{aligned}
\sum_{d\mid n}\Lambda(d) &= \sum_{s_1=0}^{l_1}\cdots\sum_{s_r=0}^{l_r}\Lambda(p_1^{s_1}\cdots p_r^{s_r})\\
&= \sum_{s_1=1}^{l_1}\Lambda(p_1^{s_1}) + \cdots + \sum_{s_r=1}^{l_r}\Lambda(p_r^{s_r})\\
&= \sum_{s_1=1}^{l_1}\log p_1 + \cdots + \sum_{s_r=1}^{l_r}\log p_r\\
&= l_1\log p_1 + \cdots + l_r\log p_r = \log n,
\end{aligned}$$
that is, $\log n$ is the Möbius transform of $\Lambda(n)$.

Example 8. Since $\Lambda(n)$ is the inverse Möbius transform of $\log n$, it follows that
$$\Lambda(n) = \sum_{d\mid n}\mu(d)\log\frac{n}{d} = \log n\sum_{d\mid n}\mu(d) - \sum_{d\mid n}\mu(d)\log d = \Delta(n)\log n - \sum_{d\mid n}\mu(d)\log d = -\sum_{d\mid n}\mu(d)\log d.$$
Since $\Delta(n)\log n$ is always zero, it follows that $\Lambda(n)$ is the Möbius transform of $-\mu(n)\log n$.

Collecting our results we have the following table, where $g(n)$ represents the Möbius transform of $f(n)$.

f(n)              g(n)
−μ(n) log n       Λ(n)
Λ(n)              log n
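The two rows of the table can be verified numerically. The sketch below is an illustration with hypothetical helper names: `transform` forms the Möbius transform $\sum_{d\mid n} f(d)$ by brute force, and the comparisons use floating point, so agreement is only up to rounding.

```python
import math

def mobius(n):
    """Moebius function mu(n) by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    return -result if n > 1 else result

def von_mangoldt(n):
    """Lambda(n): log p if n is a power of the prime p, and 0 otherwise."""
    if n < 2:
        return 0.0
    p = 2
    while n % p:
        p += 1                    # p becomes the smallest prime factor of n
    while n % p == 0:
        n //= p
    return math.log(p) if n == 1 else 0.0

def transform(f, n):
    """Moebius transform of f at n: the sum of f(d) over the divisors d of n."""
    return sum(f(d) for d in range(1, n + 1) if n % d == 0)

for n in (8, 12, 30, 49):
    print(n, transform(von_mangoldt, n), math.log(n),
          transform(lambda d: -mobius(d) * math.log(d), n), von_mangoldt(n))
```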
Exercise 1. Let $g(n)$ and $g_1(n)$ be the Möbius transforms of $f(n)$ and $f_1(n)$ respectively. Prove that
$$\sum_{d\mid n} g(d)\,f_1\!\left(\frac{n}{d}\right) = \sum_{d\mid n} f(d)\,g_1\!\left(\frac{n}{d}\right).$$

Exercise 2. Evaluate the inverse Möbius transform of $g(n)g_1(n)$.

Exercise 3. The Möbius transform of the Möbius transform of $f(n)$ is given by
$$\sum_{a\mid n} f(a)\,d\!\left(\frac{n}{a}\right).$$

Exercise 4. Use the method of Example 6 to prove formula (1) of §10, Chapter 4.
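6.5 The Divisor Function

Theorem 5.1. We have, for all positive integers $m, n$,
$$d(mn) \le d(m)d(n).$$

Proof. If $p$ is a prime, then
$$d(p^{a+b}) = a + b + 1 \le (a + 1)(b + 1) = d(p^a)\,d(p^b).$$
Since $d(n)$ is a multiplicative function, the result follows. □

Theorem 5.2. Let $\varepsilon > 0$. Then
$$d(n) = O(n^{\varepsilon}). \tag{1}$$
Here the $O$-constant depends on $\varepsilon$.

Proof. Let $n = \prod_{p\mid n} p^a$ be the standard factorization of $n$. We have
$$\frac{d(n)}{n^{\varepsilon}} = \prod_{p\mid n}\frac{a + 1}{p^{a\varepsilon}}.$$
If $p^{\varepsilon} \ge 2$, then $p^{a\varepsilon} \ge 2^a \ge a + 1$. Therefore
$$\frac{d(n)}{n^{\varepsilon}} \le \prod_{\substack{p\mid n\\ p^{\varepsilon}<2}}\frac{a + 1}{p^{a\varepsilon}} \le \prod_{\substack{p\mid n\\ p^{\varepsilon}<2}}\frac{a + 1}{a\varepsilon\log 2} \le \prod_{p^{\varepsilon}<2}\frac{2}{\varepsilon\log 2},$$
and the required result follows. □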
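Theorem 5.3. Let $q$ be a nonnegative integer and $\xi \ge 2$. Then
$$\sum_{1\le n\le\xi}(d(n))^q = O\bigl(\xi(\log\xi)^{2^q - 1}\bigr), \tag{2}$$
$$\sum_{1\le n\le\xi}\frac{(d(n))^q}{n} = O\bigl((\log\xi)^{2^q}\bigr). \tag{3}$$

Proof. We first prove (3) by induction on $q$. We know that the result holds when $q = 0$, and we now assume that it holds when $q$ is replaced by $q - 1$. Then
$$\sum_{1\le n\le\xi}\frac{(d(n))^q}{n} = \sum_{1\le n\le\xi}\frac{(d(n))^{q-1}}{n}\sum_{u\mid n} 1.$$
Let $n = uv$ and using $d(uv) \le d(u)d(v)$ we see that
$$\sum_{1\le n\le\xi}\frac{(d(n))^q}{n} \le \sum_{1\le u\le\xi}\frac{(d(u))^{q-1}}{u}\sum_{1\le v\le\xi/u}\frac{(d(v))^{q-1}}{v} = O\bigl((\log\xi)^{2^{q-1}}(\log\xi)^{2^{q-1}}\bigr) = O\bigl((\log\xi)^{2^q}\bigr).$$
To prove (2) we again use induction on $q$:
$$\begin{aligned}
\sum_{1\le n\le\xi}(d(n))^q &= \sum_{1\le n\le\xi}(d(n))^{q-1}\sum_{u\mid n} 1 = \sum_{1\le u\le\xi}\ \sum_{\substack{1\le n\le\xi\\ u\mid n}}(d(n))^{q-1}\\
&\le \sum_{1\le u\le\xi}(d(u))^{q-1}\sum_{1\le v\le\xi/u}(d(v))^{q-1}\\
&\le \xi\sum_{1\le u\le\xi}\frac{(d(u))^{q-1}}{u}\,O\bigl((\log\xi)^{2^{q-1}-1}\bigr)\\
&= O\bigl(\xi(\log\xi)^{2^q - 1}\bigr). \qquad □
\end{aligned}$$

This theorem can be made much sharper. We give only a very important special case as an example.

Theorem 5.4.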
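If $\xi \ge 1$, then
$$\sum_{1\le n\le\xi} d(n) = \xi\log\xi + (2\gamma - 1)\xi + O(\sqrt{\xi}),$$
where $\gamma$ is Euler's constant.

Proof. We have
$$\sum_{1\le n\le\xi} d(n) = \sum_{1\le n\le\xi}\sum_{u\mid n} 1 = \sum_{uv\le\xi} 1.$$
In other words $\sum_{1\le n\le\xi} d(n)$ is the number of lattice points in the first quadrant which lie below the rectangular hyperbola $uv = \xi$. By a lattice point we mean a point with integer coordinates. By erecting two perpendiculars to the axes passing through the point $(\sqrt{\xi}, \sqrt{\xi})$ the region concerned is divided into a square together with two regions each having the same number of lattice points inside. That is
$$\begin{aligned}
\sum_{1\le n\le\xi} d(n) &= \sum_{uv\le\xi} 1 = [\sqrt{\xi}]^2 + 2\sum_{u=1}^{[\sqrt{\xi}]}\ \sum_{[\sqrt{\xi}]<v\le\xi/u} 1\\
&= 2\sum_{u=1}^{[\sqrt{\xi}]}\left[\frac{\xi}{u}\right] - [\sqrt{\xi}]^2 = 2\xi\sum_{u=1}^{[\sqrt{\xi}]}\frac{1}{u} - \xi + O(\sqrt{\xi}).
\end{aligned}$$
Since
$$\sum_{u=1}^{[\sqrt{\xi}]}\frac{1}{u} = \frac{1}{2}\log\xi + \gamma + O\!\left(\frac{1}{\sqrt{\xi}}\right),$$
it follows that
$$\sum_{1\le n\le\xi} d(n) = \xi\log\xi + (2\gamma - 1)\xi + O(\sqrt{\xi}). \qquad □$$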
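The lattice-point argument in the proof translates directly into a fast way of computing $\sum_{n\le x} d(n)$. The sketch below (hypothetical function names, with the numerical value of Euler's constant taken as known) compares the direct count, the count obtained from the square-plus-two-wings decomposition, and the main term of Theorem 5.4.

```python
import math

EULER_GAMMA = 0.5772156649015329  # known numerical value of Euler's constant

def divisor_sum_direct(x):
    """Count lattice points uv <= x row by row: sum over u of [x/u]."""
    return sum(x // u for u in range(1, x + 1))

def divisor_sum_hyperbola(x):
    """Same count from the decomposition: 2 * sum_{u <= sqrt x} [x/u] - [sqrt x]^2."""
    s = math.isqrt(x)
    return 2 * sum(x // u for u in range(1, s + 1)) - s * s

def main_term(x):
    """x log x + (2*gamma - 1) x."""
    return x * math.log(x) + (2 * EULER_GAMMA - 1) * x

for x in (100, 10**4, 10**6):
    total = divisor_sum_hyperbola(x)
    print(x, total, total - main_term(x), math.sqrt(x))
```

The third column should remain small compared with $\sqrt{x}$ in the fourth, as the theorem predicts.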
Exercise 1. Prove that, for $\xi \ge 2$,
$$\sum_{1\le n\le\xi}\frac{d(n)}{n} = \frac{1}{2}\log^2\xi + 2\gamma\log\xi + c + O\bigl(\xi^{-1/2}\log\xi\bigr).$$

Exercise 2. Prove that, for any positive $\varepsilon$, we have
$$\sigma(n) = O(n^{1+\varepsilon}).$$

Exercise 3. Prove that, for $\xi \ge 2$,
$$\sum_{1\le n\le\xi}\sigma(n) = \frac{1}{12}\pi^2\xi^2 + O(\xi\log\xi).$$
(The reader may use the result $\sum_{n=1}^{\infty} 1/n^2 = \pi^2/6$, a formula which will be proved in Exercise 8.7.1.)
6.6 Two Theorems Related to Asymptotic Densities

Definition 1. Let there be a set of positive integers, and denote by $N(x)$ the number of elements in the set not exceeding $x$. Suppose that
$$\lim_{x\to\infty}\frac{N(x)}{x} = \alpha.$$
Then we say that the set has asymptotic density $\alpha$.

Examples. The set of odd positive integers has asymptotic density $\frac{1}{2}$. The set of all perfect squares has asymptotic density 0.

In this section we shall use the result
$$\sum_{n=1}^{\infty}\frac{\mu(n)}{n^2} = \frac{6}{\pi^2}, \tag{1}$$
the proof of which is given in Exercise 8.7.1.

Definition 2. A positive integer which is not divisible by any prime square is called a squarefree number.

The set of squarefree numbers has asymptotic density $6/\pi^2$. More precisely we have
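Theorem 6.1. Let $Q(x)$ denote the number of squarefree numbers not exceeding $x$. Then, as $x \to \infty$,
$$Q(x) = \frac{6x}{\pi^2} + O(\sqrt{x}). \tag{2}$$

Proof. We partition the set of positive integers not exceeding $x$ into subsets according to their largest square divisor $q^2$. The number of positive integers not exceeding $x$ having largest square divisor $q^2$ is $Q(x/q^2)$ so that
$$[x] = \sum_{q=1}^{[\sqrt{x}]} Q\!\left(\frac{x}{q^2}\right).$$
Let $x = y^2$. Then
$$[y^2] = \sum_{1\le q\le y} Q\!\left(\left(\frac{y}{q}\right)^2\right).$$
From Theorem 3.3 we have
$$\begin{aligned}
Q(y^2) &= \sum_{1\le k\le y}\mu(k)\left[\frac{y^2}{k^2}\right] = y^2\sum_{1\le k\le y}\frac{\mu(k)}{k^2} + \sum_{1\le k\le y} O(1)\\
&= \frac{6}{\pi^2}y^2 + y^2\,O\!\left(\sum_{k>y}\frac{1}{k^2}\right) + O(y)\\
&= \frac{6}{\pi^2}y^2 + O(y),
\end{aligned}$$
where we have used formula (5.8.8). The required result follows. □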
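The formula $Q(x) = \sum_{k\le\sqrt{x}}\mu(k)[x/k^2]$ obtained in the proof can be tested against a direct count. The sketch below is an illustration with hypothetical function names.

```python
import math

def mobius(n):
    """Moebius function mu(n) by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    return -result if n > 1 else result

def squarefree_count(x):
    """Q(x) computed as the sum over k <= sqrt(x) of mu(k) * [x / k^2]."""
    return sum(mobius(k) * (x // (k * k)) for k in range(1, math.isqrt(x) + 1))

def squarefree_count_direct(x):
    """Q(x) by testing every n <= x (n is squarefree exactly when mu(n) != 0)."""
    return sum(1 for n in range(1, x + 1) if mobius(n) != 0)

for x in (100, 1000, 10**4):
    print(x, squarefree_count(x), 6 * x / math.pi ** 2, math.sqrt(x))
```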
We can restate Theorem 6.1 as:

Theorem 6.2. If $x \ge 1$, then
$$\sum_{n\le x}|\mu(n)| = \frac{6x}{\pi^2} + O(\sqrt{x}). \tag{3}$$
□

The number of pairs of integers $x, y$ satisfying $1 \le x \le y \le n$ is equal to $n(n + 1)/2$. Let us denote by $\Phi(n)$ the number of those pairs satisfying $(x, y) = 1$. We can prove that
$$\lim_{n\to\infty}\frac{\Phi(n)}{\frac{1}{2}n(n + 1)} = \frac{6}{\pi^2}.$$
We can interpret this result by saying that the probability that two given integers are coprime is $6/\pi^2$. Here we prove a sharper theorem.
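Theorem 6.3.
$$\Phi(n) = \sum_{m\le n}\varphi(m) = \frac{3n^2}{\pi^2} + O(n\log n).$$

Proof. We have
$$\begin{aligned}
\sum_{m=1}^{n}\varphi(m) &= \sum_{m=1}^{n} m\sum_{d\mid m}\frac{\mu(d)}{d} = \sum_{dd'\le n} d'\mu(d) = \sum_{d=1}^{n}\mu(d)\sum_{d'=1}^{[n/d]} d'\\
&= \frac{1}{2}\sum_{d=1}^{n}\mu(d)\left(\left[\frac{n}{d}\right]^2 + \left[\frac{n}{d}\right]\right) = \frac{1}{2}n^2\sum_{d=1}^{n}\frac{\mu(d)}{d^2} + O(n\log n)\\
&= \frac{n^2}{2}\sum_{d=1}^{\infty}\frac{\mu(d)}{d^2} + O\!\left(n^2\sum_{d=n+1}^{\infty}\frac{1}{d^2}\right) + O(n\log n)\\
&= \frac{3n^2}{\pi^2} + O(n) + O(n\log n) = \frac{3n^2}{\pi^2} + O(n\log n)
\end{aligned}$$
as required. □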
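Theorem 6.3 can be explored numerically with a sieve for Euler's function. The sketch below is an illustration (the function names are assumptions of this sketch) that prints $\sum_{m\le n}\varphi(m)$ next to the main term $3n^2/\pi^2$.

```python
import math

def totients_upto(n):
    """Return the list phi[0..n] of values of Euler's function (phi[0] is unused)."""
    phi = list(range(n + 1))
    for p in range(2, n + 1):
        if phi[p] == p:                      # untouched so far, hence p is prime
            for m in range(p, n + 1, p):
                phi[m] -= phi[m] // p        # multiply phi[m] by (1 - 1/p)
    return phi

def totient_sum(n):
    """Phi(n): the sum of phi(m) over 1 <= m <= n."""
    return sum(totients_upto(n)[1:])

for n in (10, 100, 1000, 10**4):
    print(n, totient_sum(n), 3 * n * n / math.pi ** 2)
```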
6.7 The Representation of Integers as a Sum of Two Squares

We first introduce the function
$$\chi(n) = \begin{cases} 0, & \text{if } 2 \mid n,\\ (-1)^{\frac{1}{2}(n-1)}, & \text{if } 2 \nmid n.\end{cases}$$
It is easy to verify that $\chi(n)$ is multiplicative. We write
$$\delta(n) = \sum_{d\mid n}\chi(d),$$
the Möbius transform of $\chi(n)$, so that $\delta(n)$ is also multiplicative. If $n = \prod_{p\mid n} p^l$ is the standard factorization of $n$, then
$$\delta(n) = \prod_{p\mid n}\bigl(1 + \chi(p) + \chi(p^2) + \cdots + \chi(p^l)\bigr).$$
Using the function $\chi(n)$ we can restate Theorem 3.5.1 as follows:

Theorem 7.1. Let $V(n)$ denote the number of solutions to the congruence $x^2 \equiv -1 \pmod{n}$. Then
$$V(n) = \begin{cases} 0, & \text{if } 4 \mid n,\\ \prod_{p\mid n}(1 + \chi(p)), & \text{if } 4 \nmid n.\end{cases}$$
In the product here $p$ runs through all the distinct prime divisors of $n$. □

It is not difficult to deduce this theorem from Theorem 3.5.1 and Theorem 2.8.1. The main aim of this section is to prove:

Theorem 7.2. Let $r(n)$ denote the number of solutions to the equation $n = x^2 + y^2$ in integers $x, y$. Then $r(n) = 4\delta(n)$.

We shall require two auxiliary results for the proof of this theorem.
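Before the proof, Theorem 7.2 can be checked by brute force for small $n$. The following sketch is an illustration with hypothetical function names: `r` counts integer solutions of $n = x^2 + y^2$ directly, and `delta` sums $\chi$ over the divisors of $n$.

```python
import math

def chi(n):
    """chi(n) = 0 for even n, and (-1)^((n-1)/2) for odd n."""
    if n % 2 == 0:
        return 0
    return 1 if n % 4 == 1 else -1

def delta(n):
    """delta(n): the sum of chi(d) over the divisors d of n."""
    return sum(chi(d) for d in range(1, n + 1) if n % d == 0)

def r(n):
    """Number of integer pairs (x, y), signs and order counted, with x^2 + y^2 = n."""
    count = 0
    bound = math.isqrt(n)
    for x in range(-bound, bound + 1):
        y_squared = n - x * x
        y = math.isqrt(y_squared)
        if y * y == y_squared:
            count += 1 if y == 0 else 2      # y and -y are distinct unless y = 0
    return count

print([(n, r(n), 4 * delta(n)) for n in (1, 2, 3, 5, 9, 25, 65)])
```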
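Theorem 7.3. We have the identity
$$(x_1^2 + x_2^2)(y_1^2 + y_2^2) = (x_1y_1 + x_2y_2)^2 + (x_1y_2 - x_2y_1)^2.$$

Proof. Direct multiplication gives the result at once. □

Exercise 1. Prove the identity:
$$\begin{aligned}
(x_1^2 + x_2^2 + x_3^2 + x_4^2)(y_1^2 + y_2^2 + y_3^2 + y_4^2) ={}& (x_1y_1 + x_2y_2 + x_3y_3 + x_4y_4)^2\\
&+ (x_1y_2 - x_2y_1 + x_3y_4 - x_4y_3)^2\\
&+ (x_1y_3 - x_3y_1 + x_4y_2 - x_2y_4)^2\\
&+ (x_1y_4 - x_4y_1 + x_2y_3 - x_3y_2)^2.
\end{aligned}$$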
Exercise 2. Prove the identity: (xi
+ x~ + x~ + x~ + x; + x~ + x~ + x~) x (yi + y~ + y~ + y~ + Y; + y~ + y~ + y~) = (X1Yl + XzY2 + X3Y3 + X4Y4 + XsYs + X6Y6 + X7Y7 + xsYS)2 + (X1Y2  X2Yl  X3Y4 + X4Y3  XSY6 + X6YS  X7YS + XSY7)2 + (X1Y3 + X2Y4  X3Yl  X4Y2 + XSY7  X6YS  X7YS + xSY6f + (X1Y4  X2Y3 + X3Y2  X4Yl  XsYs  X6Y7 + X7Y6 + xsYs)2 + (X1Ys + X2Y6  X3Y7 + X4YS  XSYl  X6Y2 + X7Y3  XSY4)2 + (X1Y6  X2YS + X3YS + X4Y7 + XSY2  X6Yl  X7Y4  XSY3)2 + (X1Y7 + X2YS + X3YS  X4Y6  XSY3 + X6Y4  X7Yl  XSY2)2 + (X1Ys  X2Y7  X3Y6  X4YS + XSY4 + X6Y3 + X7Yz  XSY1)2.
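Before attempting a proof by direct multiplication, it can help to confirm the sign patterns of the two identities numerically. The sketch below is an illustration only (the helper names are hypothetical): it evaluates both sides at random integer points, which does not prove the identities but would expose a misplaced sign at once.

```python
import random

def four_square(x, y):
    """The four bilinear forms on the right-hand side of Exercise 1."""
    x1, x2, x3, x4 = x
    y1, y2, y3, y4 = y
    return (
        x1*y1 + x2*y2 + x3*y3 + x4*y4,
        x1*y2 - x2*y1 + x3*y4 - x4*y3,
        x1*y3 - x3*y1 + x4*y2 - x2*y4,
        x1*y4 - x4*y1 + x2*y3 - x3*y2,
    )

def eight_square(x, y):
    """The eight bilinear forms on the right-hand side of Exercise 2."""
    x1, x2, x3, x4, x5, x6, x7, x8 = x
    y1, y2, y3, y4, y5, y6, y7, y8 = y
    return (
        x1*y1 + x2*y2 + x3*y3 + x4*y4 + x5*y5 + x6*y6 + x7*y7 + x8*y8,
        x1*y2 - x2*y1 - x3*y4 + x4*y3 - x5*y6 + x6*y5 - x7*y8 + x8*y7,
        x1*y3 + x2*y4 - x3*y1 - x4*y2 + x5*y7 - x6*y8 - x7*y5 + x8*y6,
        x1*y4 - x2*y3 + x3*y2 - x4*y1 - x5*y8 - x6*y7 + x7*y6 + x8*y5,
        x1*y5 + x2*y6 - x3*y7 + x4*y8 - x5*y1 - x6*y2 + x7*y3 - x8*y4,
        x1*y6 - x2*y5 + x3*y8 + x4*y7 + x5*y2 - x6*y1 - x7*y4 - x8*y3,
        x1*y7 + x2*y8 + x3*y5 - x4*y6 - x5*y3 + x6*y4 - x7*y1 - x8*y2,
        x1*y8 - x2*y7 - x3*y6 - x4*y5 + x5*y4 + x6*y3 + x7*y2 - x8*y1,
    )

def identity_holds(form, k, trials=200, seed=0):
    """Compare (sum x_i^2)(sum y_i^2) with the sum of squared forms at random points."""
    rng = random.Random(seed)
    for _ in range(trials):
        x = [rng.randint(-50, 50) for _ in range(k)]
        y = [rng.randint(-50, 50) for _ in range(k)]
        lhs = sum(t * t for t in x) * sum(t * t for t in y)
        rhs = sum(t * t for t in form(x, y))
        if lhs != rhs:
            return False
    return True

if __name__ == "__main__":
    print(identity_holds(four_square, 4), identity_holds(eight_square, 8))
```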
Theorem 7.4. Let $n > 1$ be such that the congruence
\[ l^2 \equiv -1 \pmod{n} \tag{1} \]
has a solution. Then there exists a unique pair of integers $x, y$ satisfying
\[ x^2 + y^2 = n, \quad x > 0, \quad y > 0, \quad (x,y) = 1, \quad y \equiv lx \pmod{n}. \tag{2} \]
Proof. Clearly if (2) is soluble, then so is (1). A necessary condition for (1) to be soluble is that $n$ is representable as
\[ n = 2^{a} p_1^{\lambda_1} \cdots p_s^{\lambda_s}, \]
where $a = 0$ or $1$, and $p_i$ ($i = 1, 2, \ldots, s$) is a prime $\equiv 1 \pmod 4$. We now use induction to prove the theorem.

1) We consider first the case $n = p^{\lambda}$. If $\lambda = 1$, then from $l^2 + 1 \equiv 0 \pmod p$ we see that when $(x,p) = 1$, we have $x^2l^2 + x^2 \equiv 0 \pmod p$. We shall presently choose $y$ and $x$ so that $x^2l^2 \equiv y^2 \pmod p$, and $x^2 < p$, $y^2 < p$. Let $x$ and $y$ take the values $0, 1, \ldots, [\sqrt p]$, and consider the various differences $xl - y$. Since there are $([\sqrt p] + 1)^2 > p$ such differences, there must be two which are congruent mod $p$. Let $x_1 l - y_1 \equiv x_2 l - y_2 \pmod p$, or $(x_1 - x_2) l \equiv y_1 - y_2 \pmod p$, and we can assume that $x_1 - x_2 > 0$ so that $x_1 - x_2 < \sqrt p$, $|y_1 - y_2| < \sqrt p$, and this then gives our desired $x$ and $y$. For this pair $x, y$ we have $x^2 + y^2 = tp$, and it is easy to see that $t = 1$, $(x,y) = 1$. The congruence $y \equiv mx \pmod p$ is soluble, and from $x^2(1 + m^2) \equiv 0 \pmod p$ we see that $m \equiv \pm l$. If $m = l$, then we take the pair $(x,y)$, while if $m = -l$, then we take the pair $(y,x)$.

Now assume that $p \ne 2$ and that the theorem holds for $n = p^{\lambda}$. Let $(-l)^2 \equiv -1 \pmod{p^{\lambda+1}}$ so that there exist $u, v$ such that
\[ u^2 + v^2 = p^{\lambda}, \quad u > 0, \quad v > 0, \quad (u,v) = 1, \quad v \equiv -lu \pmod{p^{\lambda}}. \]
When $n = p^{\lambda+1}$, we have
\[ p^{\lambda+1} = (xu + yv)^2 + (xv - yu)^2 = X^2 + Y^2 \quad (X > 0,\ Y > 0). \]
First we have $(X,Y) = 1$, since otherwise $p \mid (X,Y)$, but
\[ X \equiv xu + yv \equiv xu - l^2 xu \equiv xu(1 - l^2) \not\equiv 0 \pmod p, \]
which is impossible. Next, because $(X,p) = 1$, the congruence $Xm \equiv Y \pmod{p^{\lambda+1}}$ is soluble. Thus $X^2 + X^2m^2 \equiv 0 \pmod{p^{\lambda+1}}$ or $1 + m^2 \equiv 0 \pmod{p^{\lambda+1}}$. From Theorem 2.9.3 this congruence has only two solutions, so that $m = \pm l$. The desired result follows from the discussion in the case $\lambda = 1$.
2) Let $n = ab$, $a > 1$, $b > 1$, $(a,b) = 1$, and suppose that
\[ l^2 \equiv -1 \pmod n, \]
\[ u^2 + v^2 = a, \quad u > 0, \quad v > 0, \quad (u,v) = 1, \quad v \equiv lu \pmod a, \]
\[ x^2 + y^2 = b, \quad x > 0, \quad y > 0, \quad (x,y) = 1, \quad y \equiv lx \pmod b. \]
From Theorem 7.3 we have
\[ n = ab = (xv + yu)^2 + (xu - yv)^2 = X^2 + Y^2. \]
(If $xu - yv > 0$, then let $xu - yv = Y$; otherwise we let $xu - yv = -Y$.) We now prove the following:

(i) $(X,Y) = 1$. Let $p \mid (X,Y)$. Then
\[ xv + yu = ps, \quad xu - yv = pt, \]
or
\[ x(u^2 + v^2) = p(sv + tu), \quad y(u^2 + v^2) = p(su - tv). \]
Since $(x,y) = 1$, we must have $p \mid (u^2 + v^2)$, that is $p \mid a$. Similarly $p \mid b$. But this contradicts $(a,b) = 1$.

(ii) $X \equiv lY \pmod n$. From our assumption we have
\[ xv + yu \equiv lxu - lyv \equiv l(xu - yv) \pmod a, \]
\[ xv + yu \equiv -lyv + lxu \equiv l(xu - yv) \pmod b. \]
Since $(a,b) = 1$, it follows that $X \equiv lY \pmod n$.

3) Uniqueness. Suppose that there are two pairs $(X,Y)$, $(X',Y')$ both satisfying the conditions. Then
\[ n^2 = (XX' + YY')^2 + (XY' - YX')^2. \]
But
\[ XX' + YY' \equiv XX'(1 + l^2) \equiv 0 \pmod n, \]
so that
\[ XX' + YY' = n, \quad XY' - YX' = 0. \]
From $XY' - YX' = 0$, we have
\[ \frac{X}{X'} = \frac{Y}{Y'} = C, \]
so that $X^2 + Y^2 = C^2(X'^2 + Y'^2)$ giving $C = \pm 1$. Also from $X > 0$, $X' > 0$ we see that $C = 1$. The proof of our theorem is complete. □
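Proof of Theorem 7.2. From Theorem 7.1 and Theorem 7.4 we see that the number of solutions to $x^2 + y^2 = n$, $(x,y) = 1$ is $4V(n)$. We now consider the equation $x^2 + y^2 = n$, and we partition the various solutions into sets according to $(x,y) = d$. The number of solutions satisfying $(x,y) = d$ is equal to the number of solutions satisfying
\[ \Big(\frac{x}{d}\Big)^2 + \Big(\frac{y}{d}\Big)^2 = \frac{n}{d^2}, \quad \Big(\frac xd, \frac yd\Big) = 1, \]
that is $4V(n/d^2)$. Therefore
\[ r(n) = 4 \sum_{d^2 \mid n} V\Big(\frac{n}{d^2}\Big) = 4 \sum_{d \mid n} V\Big(\frac{n}{d}\Big) \lambda(d), \]
where $\lambda(d) = 1$ or $0$ according to whether $d$ is a square or not. Since $V(n)$ and $\lambda(n)$ are both multiplicative it follows that $r(n)/4$ is multiplicative. Since $\delta(n)$ is also multiplicative the theorem will follow if we show that $r(n) = 4\delta(n)$ when $n = p^m$. Now, if $2 \mid m$, then
\[
\frac{r(p^m)}{4} = V(p^m) + V(p^{m-2}) + \cdots + V(p^2) + V(1) =
\begin{cases}
0 + \cdots + 0 + 1 = 1, & \text{if } p = 2, \\
0 + \cdots + 0 + 1 = 1, & \text{if } p \equiv 3 \pmod 4, \\
2 + \cdots + 2 + 1 = \frac m2 \cdot 2 + 1 = m + 1, & \text{if } p \equiv 1 \pmod 4,
\end{cases}
\]
and if $2 \nmid m$, then
\[
\frac{r(p^m)}{4} = V(p^m) + V(p^{m-2}) + \cdots + V(p) =
\begin{cases}
1, & \text{if } p = 2, \\
0, & \text{if } p \equiv 3 \pmod 4, \\
m + 1, & \text{if } p \equiv 1 \pmod 4.
\end{cases}
\]
On the other hand we have
\[
\delta(p^m) = 1 + \chi(p) + \cdots + \chi(p^m) =
\begin{cases}
1 + 0 + 0 + \cdots + 0 = 1, & \text{if } p = 2, \\
1 - 1 + \cdots + 1 = 1, & \text{if } p \equiv 3 \pmod 4,\ 2 \mid m, \\
1 - 1 + \cdots - 1 = 0, & \text{if } p \equiv 3 \pmod 4,\ 2 \nmid m, \\
1 + 1 + \cdots + 1 = m + 1, & \text{if } p \equiv 1 \pmod 4.
\end{cases}
\]
The theorem is proved. □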
Theorem 7.5. Denote by $A$ and $B$ the number of divisors of $n$ which are congruent to 1 and 3 (mod 4) respectively. Then $r(n) = 4(A - B)$.

Proof. This is an immediate consequence of Theorem 7.2. □

Theorem 7.6. Let $\varepsilon > 0$. Then $r(n) = O(n^{\varepsilon})$.

Proof. Since $r(n) \le 4d(n)$, the required result follows from Theorem 5.2. □
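Theorems 7.2 and 7.5 can be compared with a direct count for small $n$. The following sketch is illustrative only and its function names are not from the text. It counts the integer solutions of $x^2 + y^2 = n$ by brute force and compares the result with $4(A - B) = 4\sum_{d \mid n} \chi(d)$.

```python
from math import isqrt

def chi(n):
    """The character of this section: 0 for even n, +1 or -1 as n is 1 or 3 mod 4."""
    if n % 2 == 0:
        return 0
    return 1 if n % 4 == 1 else -1

def r_bruteforce(n):
    """Number of integer pairs (x, y), signs and order counted, with x^2 + y^2 = n (n >= 1)."""
    count = 0
    for x in range(-isqrt(n), isqrt(n) + 1):
        rest = n - x * x
        y = isqrt(rest)
        if y * y == rest:
            count += 1 if y == 0 else 2      # y = 0 once, otherwise both y and -y
    return count

def r_from_divisors(n):
    """4 * (A - B), where A, B count the divisors of n congruent to 1, 3 mod 4."""
    return 4 * sum(chi(d) for d in range(1, n + 1) if n % d == 0)

if __name__ == "__main__":
    for n in (1, 2, 5, 9, 25, 65, 325):
        print(n, r_bruteforce(n), r_from_divisors(n))
```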
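6.8 The Methods of Partial Summation and Integration

Theorem 8.1 (Abel). Let $a \le b$ and let $n$ vary in $a \le n \le b$. Let $\psi_n$ and $\varepsilon_n$ be complex numbers and
\[ s_n = \sum_{m=a}^{n} \psi_m. \]
Then
\[ \Big| \sum_{n=a}^{b} \psi_n \varepsilon_n \Big| \le \max_{a \le n \le b} |s_n| \Big( \sum_{m=a}^{b-1} |\varepsilon_m - \varepsilon_{m+1}| + |\varepsilon_b| \Big). \tag{1} \]

Proof. Let $s_{a-1} = 0$. Then
\[
\sum_{n=a}^{b} \psi_n \varepsilon_n = \sum_{n=a}^{b} (s_n - s_{n-1}) \varepsilon_n = \sum_{n=a}^{b} s_n \varepsilon_n - \sum_{n=a}^{b-1} s_n \varepsilon_{n+1} = \sum_{n=a}^{b-1} s_n(\varepsilon_n - \varepsilon_{n+1}) + s_b \varepsilon_b,
\]
so that
\[ \Big| \sum_{n=a}^{b} \psi_n \varepsilon_n \Big| \le \max_{a \le n \le b} |s_n| \Big( \sum_{n=a}^{b-1} |\varepsilon_n - \varepsilon_{n+1}| + |\varepsilon_b| \Big). \]
□

Theorem 8.2. In the previous theorem if $\varepsilon_n$ is a positive decreasing sequence, then
\[ \Big| \sum_{n=a}^{b} \psi_n \varepsilon_n \Big| \le \max_{a \le n \le b} |s_n| \, \varepsilon_a. \tag{2} \]
□

We now apply this to the following: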
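Abel's inequality can be illustrated numerically. The sketch below is an illustration under stated assumptions: the sequences $\psi_n = e^{2\pi i \alpha n}$ and $\varepsilon_n = 1/n$ are chosen here as an example and do not come from the text. It computes both sides of (1); for a positive decreasing $\varepsilon_n$ the bracket collapses to $\varepsilon_a$, which is Theorem 8.2.

```python
import cmath

def max_partial_sum(psi):
    """max |s_n| over the partial sums s_n = psi_a + ... + psi_n."""
    s, best = 0, 0.0
    for p in psi:
        s += p
        best = max(best, abs(s))
    return best

def abel_bound(psi, eps):
    """Return (|sum psi_n eps_n|, right-hand side of Abel's inequality (1))."""
    lhs = abs(sum(p * e for p, e in zip(psi, eps)))
    variation = sum(abs(eps[n] - eps[n + 1]) for n in range(len(eps) - 1))
    rhs = max_partial_sum(psi) * (variation + abs(eps[-1]))
    return lhs, rhs

if __name__ == "__main__":
    alpha = 0.3                                   # illustrative choice
    psi = [cmath.exp(2j * cmath.pi * alpha * n) for n in range(1, 2001)]
    eps = [1 / n for n in range(1, 2001)]
    print(abel_bound(psi, eps))
```

The left-hand side stays well below the bound because the partial sums of $e^{2\pi i \alpha n}$ are bounded while $1/n$ decreases.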
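Theorem 8.3. If $s > 0$, then
\[ \Big| \sum_{n \ge a} \frac{\chi(n)}{n^s} \Big| \le \frac{1}{a^s}, \]
so that the series $\sum_{n=1}^{\infty} \chi(n)/n^s$ converges when $s > 0$.

Proof. We have
\[ \chi(a) + \chi(a+1) + \chi(a+2) + \chi(a+3) = 0, \]
so that
\[ \Big| \sum_{n=a}^{b} \chi(n) \Big| \le 1. \]
From Theorem 8.2 we deduce that
\[ \Big| \sum_{n=a}^{b} \frac{\chi(n)}{n^s} \Big| \le \frac{1}{a^s}. \]
Since the right hand side is independent of $b$, the theorem follows. □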
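The bound in Theorem 8.3 can be checked directly on finite blocks of the series. This is a small sketch with illustrative names; it also evaluates a long partial sum for $s = 1$, which approaches $\pi/4$ (the classical Leibniz series).

```python
from math import pi

def chi(n):
    """0 for even n, +1 or -1 according as n is 1 or 3 mod 4."""
    if n % 2 == 0:
        return 0
    return 1 if n % 4 == 1 else -1

def chi_block_sum(s, a, b):
    """sum_{n=a}^{b} chi(n) / n^s."""
    return sum(chi(n) / n ** s for n in range(a, b + 1))

if __name__ == "__main__":
    for s in (0.1, 0.5, 1.0):
        for a in (1, 10, 57):
            # second printed value is the bound 1 / a^s from Theorem 8.3
            print(s, a, abs(chi_block_sum(s, a, a + 5000)), a ** -s)
    print(4 * chi_block_sum(1, 1, 200001), pi)
```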
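Note: In the next section we shall require
\[ \sum_{n=1}^{\infty} \frac{\chi(n)}{n} = 1 - \frac13 + \frac15 - \frac17 + \cdots = \frac{\pi}{4}. \]
This can be proved using the series expansion for $\tan^{-1} x$ in ordinary calculus.

Analogous to Theorems 8.1 and 8.2 we have:

Theorem 8.4. Let $\xi \le \eta$ and let $x$ vary in $\xi \le x \le \eta$. Suppose that $f(x)$ and $g(x)$ are continuous and $g(x)$ is differentiable. Let
\[ f_1(x) = \int_{\xi}^{x} f(t)\,dt. \]
Then
\[ \Big| \int_{\xi}^{\eta} f(x) g(x)\,dx \Big| \le \max_{\xi \le x \le \eta} |f_1(x)| \Big( \int_{\xi}^{\eta} |g'(x)|\,dx + |g(\eta)| \Big). \]
Moreover, if $g'(x) \le 0$ and $g(x) > 0$, then
\[ \Big| \int_{\xi}^{\eta} f(x) g(x)\,dx \Big| \le g(\xi) \max_{\xi \le x \le \eta} |f_1(x)|. \]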
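Proof. From integration by parts we have
\[ \int_{\xi}^{\eta} f(x) g(x)\,dx = \int_{\xi}^{\eta} g(x)\,df_1(x) = g(\eta) f_1(\eta) - \int_{\xi}^{\eta} f_1(x) g'(x)\,dx, \]
and hence
\[ \Big| \int_{\xi}^{\eta} f(x) g(x)\,dx \Big| \le \max_{\xi \le x \le \eta} |f_1(x)| \Big( |g(\eta)| + \int_{\xi}^{\eta} |g'(x)|\,dx \Big). \]
The last part of the theorem is also clear. □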
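Example. Let $a > 0$. Prove that
\[ \Big| \int_{a}^{\infty} \cos x^2\,dx \Big| = \Big| \int_{a^2}^{\infty} \frac{\cos y}{2\sqrt{y}}\,dy \Big| \le \frac{1}{2a} \max_{a^2 \le \eta} \Big| \int_{a^2}^{\eta} \cos y\,dy \Big| \le \frac{1}{a}. \]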
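6.9 The Circle Problem

Theorem 9.1.
\[ \sum_{0 \le n \le x} r(n) = \pi x + O(\sqrt{x}). \]

Proof. From Theorem 7.2 we have
\[ \sum_{1 \le n \le x} r(n) = 4 \sum_{1 \le n \le x} \sum_{d \mid n} \chi(d) = 4 \sum_{1 \le d \le x} \chi(d) \sum_{\substack{1 \le n \le x \\ d \mid n}} 1 = 4 \sum_{1 \le d \le x} \chi(d) \Big[ \frac{x}{d} \Big]. \]
Here we divide the sum into two parts. From Theorem 8.3 we have
\[ 4 \sum_{1 \le d \le \sqrt{x}} \chi(d) \Big[ \frac{x}{d} \Big] = 4x \sum_{1 \le d \le \sqrt x} \frac{\chi(d)}{d} + O(\sqrt{x}) = 4x \sum_{d=1}^{\infty} \frac{\chi(d)}{d} + O(\sqrt{x}) = \pi x + O(\sqrt{x}); \]
the other part is
\[ 4 \sum_{\sqrt{x} < d \le x} \chi(d) \Big[ \frac{x}{d} \Big], \]
and from Theorem 8.2 we have, since $[x/d]$ is positive and decreasing in $d$,
\[ \Big| \sum_{\sqrt{x} < d \le x} \chi(d) \Big[ \frac{x}{d} \Big] \Big| \le \sqrt{x}. \]
The theorem is proved. □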
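The identity at the start of the proof can be verified exactly for small $x$. The sketch below is an illustration with hypothetical function names: it counts lattice points in the disc $u^2 + v^2 \le x$ directly and via $1 + 4\sum_{d \le x} \chi(d)[x/d]$, where the leading 1 accounts for the origin ($n = 0$).

```python
from math import isqrt, pi, sqrt

def chi(n):
    if n % 2 == 0:
        return 0
    return 1 if n % 4 == 1 else -1

def lattice_count(x):
    """Number of integer pairs (u, v) with u^2 + v^2 <= x, for an integer x >= 0."""
    r = isqrt(x)
    return sum(2 * isqrt(x - u * u) + 1 for u in range(-r, r + 1))

def divisor_form(x):
    """1 + 4 * sum_{d <= x} chi(d) * [x / d]; the 1 counts the origin."""
    return 1 + 4 * sum(chi(d) * (x // d) for d in range(1, x + 1))

if __name__ == "__main__":
    for x in (10, 100, 1000, 10000):
        count = lattice_count(x)
        # last column: the error measured in units of sqrt(x)
        print(x, count, divisor_form(x), round((count - pi * x) / sqrt(x), 4))
```

The last column stays bounded, in line with the $O(\sqrt{x})$ error term.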
Another proof of the theorem is the following: Clearly $\sum_{0 \le n \le x} r(n)$ is the number of pairs of integers $u, v$ satisfying $u^2 + v^2 \le x$. In other words the sum is the number of lattice points inside the circle with centre at the origin and radius $\sqrt{x}$. This circle has area $\pi x$. We partition the plane into unit squares with orthogonal lines passing through the lattice points. To each point $(u,v)$ in our circle we assign the square whose four corners have the coordinates $(u,v)$, $(u+1,v)$, $(u,v+1)$, $(u+1,v+1)$. These squares must lie inside the circle $u^2 + v^2 = (\sqrt{x} + \sqrt{2})^2$ and they include the circle $u^2 + v^2 = (\sqrt{x} - \sqrt{2})^2$. Therefore
\[ \pi(\sqrt{x} - \sqrt{2})^2 \le \sum_{0 \le n \le x} r(n) \le \pi(\sqrt{x} + \sqrt{2})^2, \]
and the required result follows at once. We observe that this second proof can be used as a proof for
\[ 1 - \frac13 + \frac15 - \frac17 + \cdots = \frac{\pi}{4}. \]

Concerning the problem of the number of lattice points inside a closed curve, the Czech mathematician M. V. Jarnik proved the following:

Theorem 9.2. Let $l \ge 1$ be the length of a rectifiable simple closed curve and let $A$ be the area of the region bounded by the curve. If $N$ is the number of lattice points inside the curve, then
\[ |A - N| < l. \]

Proof (Steinhaus). We first prove the following two simple lemmas.

Lemma 1. Let $C$ be a rectifiable curve inside a unit square with the two end points on the boundary of the square. If $C$ crosses the two diagonals of the square, then its length must be at least 1.

Proof. If the two end points are on the opposite sides of the square, then the result follows at once. Suppose next that the two end points are on two adjacent sides of the square as shown in the diagram.

[Diagram: a unit square with the curve $C$ joining two adjacent sides.]

It is easy to see that the length of $C$ is again at least 1. A similar argument applies when the two end points are on the same side of the square.

Lemma 2. Let $C$ be a rectifiable curve inside a unit square with the two end points on the boundary of the square so that the square is partitioned into two regions. Suppose that $C$ does not pass through the centre of the square, and denote by $\Delta$ the region which does not contain the centre. Then the area of $\Delta$ must be less than the length of $C$.
Proof. We consider separately the cases shown in the following diagrams:

[Diagrams: five cases for the positions of the end points $\alpha$, $\beta$ of $C$ on the boundary of the square.]

Let $A$ be the area of the region $\Delta$ and $l$ be the length of $C$. In the first two cases it is easy to see that every point of $C$ is of distance at most $l$ from the base line $\alpha\beta$ so that $\Delta$ must lie inside a rectangle with sides 1 and $l$ and hence $A < l$. In the remaining three cases we see from Lemma 1 that $l \ge 1$ and so $A < 1 \le l$.

We can now proceed to prove the theorem. Denote by $\Sigma$ the region inside the curve. We form a net of unit squares in the plane with the lines
\[ x = m + \tfrac12, \quad y = n + \tfrac12 \quad (m, n = 0, \pm 1, \pm 2, \ldots). \]
Let $Q_1, Q_2, \ldots, Q_k$ be those squares which contain part of the boundary of $\Sigma$, let $C_i$ be the part of the curve in $Q_i$, let $\mathcal{Q}_i$ be the intersection of $Q_i$ and $\Sigma$, and define
\[ N_i = \begin{cases} 1, & \text{if } \mathcal{Q}_i \text{ contains a lattice point}, \\ 0, & \text{otherwise}. \end{cases} \]
We let $A_i$ be the area of $\mathcal{Q}_i$, $l_i$ the length of $C_i$, so that our theorem will follow if we can prove that $|A_i - N_i| < l_i$. Now the case when the whole of $\Sigma$ lies inside a $Q$ follows at once since $l \ge 1$. We can assume therefore that $C_i$ is made up of a number of sections of the curve and $Q_i$ is partitioned into regions $D_i^{(s)}$.
If the lattice point does not lie in any $D_i^{(s)}$ so that it lies on $C_i$, then $N_i = 0$, $0 < A_i < 1$ and $l_i \ge 1$ so that our required result follows. If the lattice point lies inside a $D_i^{(s)}$ we denote by $A_i^{(s)}$ the area of $D_i^{(s)}$. If $D_i^{(s)}$ is not in $\Sigma$, then $N_i = 0$, $A_i \le 1 - A_i^{(s)}$; if $D_i^{(s)}$ is in $\Sigma$, then $N_i = 1$, $1 - A_i \le 1 - A_i^{(s)}$ and, from Lemma 2, we have $1 - A_i^{(s)} < l_i$. The theorem is proved. □

It is clear that Theorem 9.1 is an immediate consequence of Theorem 9.2.

Exercise 1. Find the asymptotic formula for the number of lattice points inside an ellipse with centre at the origin.

Exercise 2. Prove that the number of lattice points inside the sphere $u^2 + v^2 + w^2 \le x$ is given by
\[ \tfrac43 \pi x^{3/2} + O(x). \]

Exercise 3. Generalize the previous exercise to a sphere in $n$ dimensions.

Exercise 4. Determine the order of $\sum_{n \le x} r^2(n)$.

Exercise 5. The number of lattice points inside the circle $u^2 + v^2 \le x$ with coprime coordinates is given by
\[ \frac{6}{\pi} x + O(\sqrt{x} \log x). \]
6.10 Farey Sequence and Its Applications

The Farey sequence was discovered well over a hundred years ago, but its significance in number theory has been revealed only in modern times.

Definition 1. By the Farey sequence of order $n$ we mean the fractions in the interval from 0 to 1, whose denominators are $\le n$, arranged in ascending order of magnitude. That is, they are numbers of the form
\[ \frac{a}{b}, \quad 0 \le a \le b \le n, \quad (a,b) = 1, \]
arranged into an increasing sequence. We denote by $\mathfrak{F}_n$ the Farey sequence of order $n$.

Example: $\mathfrak{F}_7$ is the sequence
\[ \frac01, \frac17, \frac16, \frac15, \frac14, \frac27, \frac13, \frac25, \frac37, \frac12, \frac47, \frac35, \frac23, \frac57, \frac34, \frac45, \frac56, \frac67, \frac11. \]

The total number of fractions in $\mathfrak{F}_n$ is $1 + \sum_{m=1}^{n} \varphi(m)$. These fractions divide the interval $0 \le x \le 1$ into $\sum_{m=1}^{n} \varphi(m)$ parts, and $\mathfrak{F}_{n+1}$ is obtained from adding the $\varphi(n+1)$ numbers
\[ \frac{a}{n+1}, \quad (a, n+1) = 1, \quad 0 < a < n+1, \]
to $\mathfrak{F}_n$.
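Definition 1 translates directly into a short program. The sketch below is illustrative, and the names `farey` and `phi` are not from the text. It builds $\mathfrak{F}_n$ from the definition using exact rational arithmetic, which makes it easy to confirm the count $1 + \sum_{m \le n} \varphi(m)$ and to experiment with neighbouring terms.

```python
from fractions import Fraction
from math import gcd

def farey(n):
    """Farey sequence of order n: all reduced a/b with 0 <= a <= b <= n, in increasing order."""
    terms = {Fraction(a, b) for b in range(1, n + 1) for a in range(b + 1)}
    return sorted(terms)          # Fraction reduces automatically, the set removes repeats

def phi(m):
    """Euler's function by direct count (adequate for small m)."""
    return sum(1 for a in range(1, m + 1) if gcd(a, m) == 1)

if __name__ == "__main__":
    f7 = farey(7)
    print(", ".join(f"{t.numerator}/{t.denominator}" for t in f7))
    print(len(f7), 1 + sum(phi(m) for m in range(1, 8)))
```

Checking consecutive terms $a/b < a'/b'$ of the output shows $ba' - ab' = 1$ and $b + b' > n$ in every case, which is the content of Theorem 10.2.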
Theorem 10.1. Let $\xi$ be an irrational number, $0 < \xi < 1$. Let $a_n/b_n$, $a_n'/b_n'$ be two successive Farey fractions of order $n$ satisfying
\[ \frac{a_n}{b_n} < \xi < \frac{a_n'}{b_n'}. \]
Then (i) $a_n/b_n$ is an increasing function of $n$, while $a_n'/b_n'$ is a decreasing function of $n$, and (ii) $b_n$ and $b_n'$ are increasing and unbounded functions of $n$.

Proof. We note that every rational number in the interval $[0,1]$ is a term in a Farey sequence. The theorem follows at once from the definition of a Farey sequence of order $n$. □

Theorem 10.2. Let $a/b$, $a'/b'$ be two successive terms in $\mathfrak{F}_n$. Then $b + b' \ge n + 1$. If $a/b < a'/b'$, then $ba' - ab' = 1$.
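Proof. Since $(a,b) = 1$, there are integers $x, y$ such that
\[ bx - ay = 1, \quad n - b < y \le n. \tag{1} \]
It follows at once that
\[ y > 0, \quad (x,y) = 1, \quad \frac{x}{y} = \frac{a}{b} + \frac{1}{by} > \frac{a}{b}. \]
It suffices to prove that $x/y = a'/b'$. This is because we can then deduce that $x = a'$, $y = b'$, $ba' - ab' = 1$ and $b + b' > n$. Suppose that $x/y \ne a'/b'$. Then
\[ \frac{a}{b} < \frac{a'}{b'} < \frac{x}{y}. \]
From this we deduce that
\[ \frac{x}{y} - \frac{a}{b} = \Big( \frac{x}{y} - \frac{a'}{b'} \Big) + \Big( \frac{a'}{b'} - \frac{a}{b} \Big) \ge \frac{1}{b'y} + \frac{1}{b'b} = \frac{b + y}{ybb'} > \frac{n}{ybb'} \ge \frac{1}{by}. \]
But we have, from (1),
\[ \frac{x}{y} - \frac{a}{b} = \frac{1}{by}, \]
giving a contradiction. The theorem is proved. □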
Theorem 10.3. Suppose that $a/b < a''/b'' < a'/b'$ are three successive Farey fractions. Then
\[ \frac{a''}{b''} = \frac{a + a'}{b + b'}. \]

Proof. From Theorem 10.2 we have $a''b - b''a = 1$ and $a'b'' - b'a'' = 1$, and so, on subtraction, $a''(b + b') - b''(a + a') = 0$. The required result follows. □

Definition 2. Let $a/b$ and $a'/b'$ be two successive Farey fractions. Then we call $(a + a')/(b + b')$ the mediant of the two fractions.

Theorem 10.4. The mediant lies between the two fractions $a/b$ and $a'/b'$, and the distances from them are
\[ \frac{1}{b(b + b')} \quad \text{and} \quad \frac{1}{b'(b + b')} \]
respectively.

Proof. We assume that $a/b < a'/b'$. Then
\[ \frac{a'}{b'} - \frac{a + a'}{b + b'} = \frac{ba' - ab'}{b'(b + b')} = \frac{1}{b'(b + b')} > 0, \]
\[ \frac{a + a'}{b + b'} - \frac{a}{b} = \frac{a'b - ab'}{b(b + b')} = \frac{1}{b(b' + b)} > 0. \]
□

Theorem 10.5. Let $\xi$ be a real number, $0 < \xi < 1$. Then there always exists $a/b$ in $\mathfrak{F}_n$ such that
\[ \Big| \xi - \frac{a}{b} \Big| \le \frac{1}{b(n+1)}, \quad 0 < b \le n. \]

Proof. We partition the interval $(0,1)$ into subintervals by the points in $\mathfrak{F}_n$ together with their mediants. Now $\xi$ must be in one of these subintervals one of whose end points is $a/b$ while the other is $(a + a')/(b + b')$. Therefore we have
\[ \Big| \xi - \frac{a}{b} \Big| \le \frac{1}{b(b + b')} \le \frac{1}{b(n+1)}. \]
The theorem is proved. □
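The proof of Theorem 10.5 is constructive, and the two Farey neighbours of a given $\xi$ can be found by repeatedly inserting mediants. The sketch below makes assumptions that should be stated plainly: $\xi$ is supplied as a floating-point number standing in for an irrational, the function names are illustrative, and the bound checked is $1/(b(n+1))$.

```python
from math import sqrt

def farey_neighbours(xi, n):
    """Successive terms a/b < xi < c/d of the Farey sequence of order n, found by
    mediant insertion starting from 0/1 and 1/1 (xi assumed irrational, 0 < xi < 1)."""
    a, b, c, d = 0, 1, 1, 1
    while b + d <= n:                 # the mediant still has denominator <= n
        p, q = a + c, b + d
        if p < xi * q:                # mediant lies to the left of xi
            a, b = p, q
        else:
            c, d = p, q
    return (a, b), (c, d)

def best_error_ratio(xi, n):
    """min over the two neighbours of |xi - a/b| * b * (n + 1); Theorem 10.5 gives <= 1."""
    return min(abs(xi - p / q) * q * (n + 1) for p, q in farey_neighbours(xi, n))

if __name__ == "__main__":
    xi = sqrt(2) - 1
    for n in (1, 5, 10, 50, 100):
        print(n, farey_neighbours(xi, n), round(best_error_ratio(xi, n), 4))
```

The loop keeps $bc - ad = 1$ throughout and stops once $b + d > n$, so by Theorem 10.2 the two returned fractions are adjacent in $\mathfrak{F}_n$.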
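Theorem 10.6. Let $\xi$ and $\eta$ be any two real numbers, $\eta \ge 1$. There always exists a rational number $a/b$ such that
\[ \Big| \xi - \frac{a}{b} \Big| \le \frac{1}{b\eta}, \quad 0 < b \le \eta. \]

Proof. We can assume that $0 < \xi < 1$ and the required result follows at once on taking $n = [\eta]$ in Theorem 10.5. □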
Theorem 10.7. Let $\xi$ be any real number. There always exists a rational number $a/b$ such that
\[ \Big| \xi - \frac{a}{b} \Big| < \frac{1}{b^2}. \tag{2} \]
If $\xi$ is irrational, then there are infinitely many such $a/b$ satisfying this inequality.

Proof. Clearly we need only examine the case when $\xi$ is irrational, $0 < \xi < 1$. Let $a_n/b_n$, $a_n'/b_n'$ be two successive terms in $\mathfrak{F}_n$ satisfying
\[ \frac{a_n}{b_n} < \xi < \frac{a_n'}{b_n'}. \]
From the proof of Theorem 10.5 we see that one of these must satisfy the inequality (2). Our theorem now follows from Theorem 10.1. □

Theorem 10.8. Let $\xi$ be any irrational number. Then there exist infinitely many rational numbers $a/b$ such that
\[ \Big| \xi - \frac{a}{b} \Big| < \frac{1}{\sqrt{5}\, b^2}. \tag{3} \]
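Proof. We can assume without loss that $0 < \xi < 1$. Let $a/b$ and $a'/b'$ be two successive Farey fractions of order $n$ satisfying $a/b < \xi < a'/b'$. Let $\omega = b'/b$ and we consider separately the following two cases.

1) Suppose that $\omega > (1 + \sqrt5)/2$ or $\omega < (\sqrt5 - 1)/2$. Then, from Theorem 10.2, we have
\[ \frac{a'}{b'} - \frac{a}{b} = \frac{1}{bb'} = \frac{1}{b^2\omega}. \]
Since
\[ \frac{1}{\omega} - \frac{1}{\sqrt5}\Big( 1 + \frac{1}{\omega^2} \Big) = -\frac{1}{\sqrt5\,\omega^2}(\omega^2 - \sqrt5\,\omega + 1) = -\frac{1}{\sqrt5\,\omega^2}\big(\omega - \tfrac12(\sqrt5 + 1)\big)\big(\omega - \tfrac12(\sqrt5 - 1)\big) < 0, \]
we have
\[ \frac{a'}{b'} - \frac{a}{b} < \frac{1}{\sqrt5\, b^2}\Big( 1 + \frac{1}{\omega^2} \Big) = \frac{1}{\sqrt5}\Big( \frac{1}{b^2} + \frac{1}{b'^2} \Big), \]
that is
\[ \frac{a}{b} + \frac{1}{\sqrt5\, b^2} > \frac{a'}{b'} - \frac{1}{\sqrt5\, b'^2}. \]
Therefore the two intervals
\[ \Big( \frac{a}{b},\ \frac{a}{b} + \frac{1}{\sqrt5\, b^2} \Big) \quad \text{and} \quad \Big( \frac{a'}{b'} - \frac{1}{\sqrt5\, b'^2},\ \frac{a'}{b'} \Big) \]
overlap, and so one of them must contain $\xi$, giving
\[ \Big| \xi - \frac{a}{b} \Big| < \frac{1}{\sqrt5\, b^2} \quad \text{or} \quad \Big| \xi - \frac{a'}{b'} \Big| < \frac{1}{\sqrt5\, b'^2}. \tag{4} \]

2) Suppose that $(\sqrt5 - 1)/2 < \omega < (1 + \sqrt5)/2$. Then
\[ b + b' > \tfrac12(\sqrt5 + 1) b, \quad b + b' > \tfrac12(\sqrt5 + 1) b'. \]
Therefore we can deal with the intervals
\[ \Big( \frac{a}{b},\ \frac{a + a'}{b + b'} \Big) \quad \text{and} \quad \Big( \frac{a + a'}{b + b'},\ \frac{a'}{b'} \Big) \]
with the method in 1). That is, there are three possibilities; apart from the two situations in (4) we also have
\[ \Big| \xi - \frac{a + a'}{b + b'} \Big| < \frac{1}{\sqrt5\,(b + b')^2}. \]
Therefore, given any $n$, there always exist $a, b$ such that (3) holds. Since $\xi$ is irrational, $b$ and $b'$ tend to infinity with $n$ according to Theorem 10.1, and so our theorem is proved. □

Exercise. Prove that the denominators of two successive Farey fractions are different.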
6.11 Vinogradov's Method of Estimating Sums of Fractional Parts

Let $\{\alpha\}$ be the fractional part of $\alpha$; that is $\{\alpha\} = \alpha - [\alpha]$. The purpose of this section is to study sums of the form
\[ \sum_{A < x \le B} \{f(x)\}. \]
We shall apply the results in the next section.

Theorem 11.1. Let $m > 0$, $(a,m) = 1$, $h \ge 0$ and $c$ be real. Suppose that
\[ c \le \psi(x) \le c + h \quad \text{for} \quad x = 0, 1, \ldots, m - 1, \]
and let
\[ S = \sum_{x=0}^{m-1} \Big\{ \frac{ax + \psi(x)}{m} \Big\}. \]
Then
\[ |S - \tfrac12 m| \le h + \tfrac12. \]
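Proof. Clearly we have
\[ |S - \tfrac12 m| = \Big| \sum_{x=0}^{m-1} \Big( \Big\{ \frac{ax + \psi(x)}{m} \Big\} - \frac12 \Big) \Big| \le \tfrac12 m. \]
The theorem therefore follows at once if $m \le 2h + 1$. Suppose now that $m > 2h + 1$. Let $r$ be the least non-negative residue of $ax + [c]$ mod $m$. We then have
\[ S = \sum_{r=0}^{m-1} \Big\{ \frac{r + \Phi(r)}{m} \Big\}, \tag{1} \]
where $\Phi(r) = \psi(x) - [c]$. Hence
\[ \{c\} \le \Phi(r) \le \{c\} + h. \tag{2} \]
If $0 \le r < m - [h + \{c\}]$, then
\[ 0 \le \{c\} \le r + \Phi(r) \le m - [h + \{c\}] - 1 + \{c\} + h < m, \]
or
\[ 0 \le \frac{r + \Phi(r)}{m} < 1; \]
therefore
\[ \Big\{ \frac{r + \Phi(r)}{m} \Big\} = \frac{r + \Phi(r)}{m}, \]
or
\[ \frac{r}{m} + \frac{\{c\}}{m} \le \Big\{ \frac{r + \Phi(r)}{m} \Big\} \le \frac{r}{m} + \frac{h + \{c\}}{m}. \tag{3} \]
If $m - [h + \{c\}] \le r < m$, let $r = m - s$. Then for $s = 1, 2, \ldots, [h + \{c\}]$, we have: if $\Phi(r) < s$, then $r + \Phi(r) < m$ and
\[ \frac{r}{m} + \frac{\{c\}}{m} \le \Big\{ \frac{r + \Phi(r)}{m} \Big\} \le \frac{r}{m} + \frac{h + \{c\}}{m}; \tag{4} \]
and if $s \le \Phi(r)$, then $m \le r + \Phi(r) < 2m$ and
\[ \frac{r}{m} + \frac{\{c\}}{m} - 1 \le \Big\{ \frac{r + \Phi(r)}{m} \Big\} \le \frac{r}{m} + \frac{h + \{c\}}{m} - 1. \tag{5} \]
From (4) and (5) we have
\[ \frac{r}{m} + \frac{\{c\}}{m} - 1 \le \Big\{ \frac{r + \Phi(r)}{m} \Big\} \le \frac{r}{m} + \frac{h + \{c\}}{m}. \tag{6} \]
Now from (1), (3) and (6) we arrive at
\[ \{c\} - (h + \{c\}) \le S - \sum_{r=0}^{m-1} \frac{r}{m} \le h + \{c\}, \]
and hence
\[ -h \le S - \tfrac12(m - 1) \le h + 1. \]
The theorem is proved. □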
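Theorem 11.1 can be tested on random data. The following sketch is illustrative only; the ranges for $m$, $c$ and $h$ are arbitrary choices made here, not values from the text. It draws a modulus $m$, a multiplier $a$ coprime to $m$, and values $\psi(x)$ in $[c, c+h]$, then compares $|S - \tfrac12 m|$ with $h + \tfrac12$.

```python
import random
from math import floor, gcd

def frac(t):
    """Fractional part {t} = t - [t]."""
    return t - floor(t)

def fractional_sum(a, m, psi):
    """S = sum_{x=0}^{m-1} {(a x + psi(x)) / m}, with psi given as a list of length m."""
    return sum(frac((a * x + psi[x]) / m) for x in range(m))

def random_instance(rng, m_max=60):
    """A random instance (a, m, h, psi) satisfying the hypotheses of Theorem 11.1."""
    m = rng.randint(1, m_max)
    a = rng.choice([k for k in range(1, m + 1) if gcd(k, m) == 1])
    c = rng.uniform(-5, 5)
    h = rng.uniform(0, 4)
    psi = [rng.uniform(c, c + h) for _ in range(m)]
    return a, m, h, psi

if __name__ == "__main__":
    rng = random.Random(2024)
    for _ in range(5):
        a, m, h, psi = random_instance(rng)
        print(m, a, round(abs(fractional_sum(a, m, psi) - m / 2), 4), round(h + 0.5, 4))
```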
Theorem 11.2. Let $m$ be an integer, $A > 2$, $1 \le m \le A^{1/3}$, $(a,m) = 1$, $k \ge 1$. Suppose that
\[ S = \sum_{x=M}^{M+m-1} \{f(x)\}, \]
where $f(x)$ has a continuous second derivative in $M \le x \le M + m - 1$ and satisfies
\[ f'(M) = \frac{a}{m} + \frac{\theta}{m^2}, \quad (a,m) = 1, \quad |\theta| < 1, \]
\[ \frac{1}{A} \le |f''(x)| \le \frac{k}{A}. \]
Then
\[ |S - \tfrac12 m| \le \tfrac12(k + 5). \]

Proof. From the mean value theorem of differential calculus we have
\[ f(M + y) = f(M) + y f'(M) + \frac{y^2}{2} f''(M + \theta' y), \quad |\theta'| < 1. \]
In Theorem 11.1 we take
\[ \psi(y) = m\Big( f(M) + \frac{\theta}{m^2} y + \frac12 y^2 f''(M + \theta' y) \Big). \]
From the continuity of $f''(x)$ and from $|f''(x)| \ge 1/A$ we see that $f''(x)$ does not change sign. We can therefore assume without loss that $f''(x) > 0$. Then we have
\[ m\Big( f(M) - \frac{m}{m^2} \Big) < \psi(y) < m\Big( f(M) + \frac{m}{m^2} + \frac12 m^2 \frac{k}{A} \Big), \]
or
\[ m f(M) - 1 < \psi(y) < m f(M) + 1 + \tfrac12 k. \]
The result follows from taking $c = m f(M) - 1$ and $h = 2 + k/2$ in Theorem 11.1. □
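Theorem 11.3. Let $k \ge 1$ and let $f(x)$ have a continuous second derivative in $M \le x \le M + m$, and
\[ \frac{1}{A} \le |f''(x)| \le \frac{k}{A}. \]
Then
\[ S = \sum_{x=M}^{M+m-1} \{f(x)\} = \tfrac12 m + O(\Delta), \]
where
\[ \Delta = (k^2 m \log A + kA) A^{-1/3}. \]

Proof. We take $\tau = A^{1/3}$, $M = M_1$. We see from Theorem 10.6 that there exist $a_1, m_1, \theta_1$ such that
\[ f'(M_1) = \frac{a_1}{m_1} + \frac{\theta_1}{m_1 \tau}, \quad (a_1, m_1) = 1, \quad 0 < m_1 \le \tau, \quad |\theta_1| \le 1. \tag{7} \]
From Theorem 11.2 we have
\[ \sum_{x=M_1}^{M_1+m_1-1} \{f(x)\} = \tfrac12 m_1 + \tfrac12 \theta'(k + 5), \quad |\theta'| \le 1. \]
We next take $M_2 = M_1 + m_1$ and again from Theorem 10.6 there exist $a_2, m_2, \theta_2$ such that
\[ f'(M_2) = \frac{a_2}{m_2} + \frac{\theta_2}{m_2 \tau}, \quad (a_2, m_2) = 1, \quad 0 < m_2 \le \tau, \quad |\theta_2| \le 1, \]
and
\[ \sum_{x=M_2}^{M_2+m_2-1} \{f(x)\} = \tfrac12 m_2 + \tfrac12 \theta''(k + 5), \quad |\theta''| \le 1. \]
Continuing this way, if after $s$ steps we have
\[ 0 \le M + m - 1 - M_{s+1} < \tau, \]
then
\[ \big| S - \tfrac12(m_1 + \cdots + m_s) - \tfrac12(M + m - M_{s+1}) \big| \le \frac{s}{2}(k + 5) + \tfrac12(M + m - M_{s+1}), \]
or (since $M_{s+1} = M + m_1 + \cdots + m_s$)
\[ |S - \tfrac12 m| < \tfrac12 s(k + 5) + \tfrac12(\tau + 1). \tag{8} \]
We now have to estimate $s$. Suppose that $0 < q \le \tau$, $(p,q) = 1$. If $p, q$ are given, we can estimate how many $m_1, \ldots, m_s$ are equal to $q$. From $|f''(x)| \ge 1/A$ and its continuity we know that $f''(x)$ does not change sign. It follows that the set of values $x$ satisfying
\[ \frac{p}{q} - \frac{1}{q\tau} \le f'(x) \le \frac{p}{q} + \frac{1}{q\tau} \tag{9} \]
forms an interval. Let $x_1, x_2$ be any two points in the interval, so that
\[ |f'(x_2) - f'(x_1)| \le \frac{2}{q\tau}. \]
Hence
\[ \Big| \int_{x_1}^{x_2} f''(t)\,dt \Big| \le \frac{2}{q\tau}, \]
and so
\[ \frac{1}{A} |x_2 - x_1| \le \frac{2}{q\tau}. \]
This shows that the length of the interval of values $x$ which satisfies (9) is at most $2A/(q\tau)$. It follows that the number of $m_i$ which are equal to $q$ is at most $2A/(q^2\tau) + 1$. Next, for fixed $q$, we estimate the number of values $p$ which satisfy (9). Suppose that $p_1 > p_2$ and
\[ \frac{p_1}{q} - \frac{1}{q\tau} \le f'(x_1) \le \frac{p_1}{q} + \frac{1}{q\tau}, \quad \frac{p_2}{q} - \frac{1}{q\tau} \le f'(x_2) \le \frac{p_2}{q} + \frac{1}{q\tau}. \]
Then
\[ \Big| \int_{x_2}^{x_1} f''(t)\,dt \Big| = |f'(x_1) - f'(x_2)| \ge \frac{p_1}{q} - \frac{p_2}{q} - \frac{2}{q\tau}, \]
and so
\[ \frac{km}{A} \ge \frac{p_1 - p_2}{q} - \frac{2}{q\tau}, \]
and hence
\[ p_1 - p_2 + 1 \le \frac{kmq}{A} + \frac{2}{\tau} + 1. \]
This shows that the number of $p$ is at most
\[ \frac{kmq}{A} + \frac{2}{\tau} + 1. \]
Collecting our results we see that if we write $f'(M_i)$ as in formula (7), then the number of fractions $a_i/m_i$ whose denominator $m_i$ is $q$ is
\[ \le \Big( \frac{2A}{q^2\tau} + 1 \Big)\Big( \frac{kmq}{A} + \frac{2}{\tau} + 1 \Big) = km\Big( \frac{2}{q\tau} + \frac{q}{A} \Big) + \Big( \frac{2A}{q^2\tau} + 1 \Big)\Big( 1 + \frac{2}{\tau} \Big). \]
Summing over $q = 1, 2, \ldots, [\tau]$ we see that
\[ s \le km\Big( \frac{2\log\tau + 2}{\tau} + \frac{\tau^2 + \tau}{2A} \Big) + O\Big( \frac{A}{\tau} \Big) = O\big( km A^{-1/3} \log A + A^{2/3} \big). \]
The theorem follows from substituting this into (8). □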
6.12 Application of Vinogradov's Theorem to Lattice Point Problems

We already proved in Theorem 9.1 that the number $R(x)$ of lattice points inside the circle $u^2 + v^2 \le x$ satisfies $R(x) = \pi x + O(\sqrt{x})$. In this section we shall prove the following sharper result.

Theorem 12.1 (Sierpinski). Let $x \ge 2$. Then
\[ R(x) = \pi x + O(x^{1/3} \log x). \]
This result is not the best known. Using more complicated analytic tools the author proved in 1942 that, for $\varepsilon > 0$,
\[ R(x) = \pi x + O(x^{13/40 + \varepsilon}). \]
(See Note 6.1.) A famous problem in number theory is the conjecture that
\[ R(x) = \pi x + O(x^{1/4 + \varepsilon}). \]
We require the following result for the proof of Theorem 12.1.

Theorem 12.2. Let $f(x)$ have a continuous second derivative in the interval $Q \le x \le R$, and let
\[ \sigma(x) = \int_0^x \big( \tfrac12 - \{t\} \big)\,dt. \]
Then
\[ \sum_{Q < x \le R} f(x) = \int_Q^R f(x)\,dx + \big(\tfrac12 - \{R\}\big) f(R) - \big(\tfrac12 - \{Q\}\big) f(Q) - \sigma(R) f'(R) + \sigma(Q) f'(Q) + \int_Q^R \sigma(x) f''(x)\,dx. \]
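Proof. Let $x_1$ be an integer, $Q \le \alpha < \beta \le R$, $x_1 < \alpha < \beta < x_1 + 1$. From integration by parts we have
\[
\begin{aligned}
-\int_{\alpha}^{\beta} f(x)\,dx &= \int_{\alpha}^{\beta} f(x) \frac{d}{dx}\big(\tfrac12 - \{x\}\big)\,dx \\
&= \big(\tfrac12 - \{\beta\}\big) f(\beta) - \big(\tfrac12 - \{\alpha\}\big) f(\alpha) - \sigma(\beta) f'(\beta) + \sigma(\alpha) f'(\alpha) + \int_{\alpha}^{\beta} \sigma(x) f''(x)\,dx.
\end{aligned}
\tag{1}
\]
Letting $\alpha \to x_1$, $\beta \to x_1 + 1$ we have
\[ -\int_{x_1}^{x_1+1} f(x)\,dx = -\tfrac12 f(x_1 + 1) - \tfrac12 f(x_1) + \int_{x_1}^{x_1+1} \sigma(x) f''(x)\,dx. \]
From this it follows that
\[ -\int_{[Q]+1}^{[R]} f(x)\,dx = -\sum_{[Q]+1 \le x \le [R]} f(x) + \tfrac12 f([Q] + 1) + \tfrac12 f([R]) + \int_{[Q]+1}^{[R]} \sigma(x) f''(x)\,dx. \tag{2} \]
If in (1) we let $\alpha = Q$, $\beta \to [Q] + 1$, then
\[ -\int_{Q}^{[Q]+1} f(x)\,dx = -\tfrac12 f([Q] + 1) - \big(\tfrac12 - \{Q\}\big) f(Q) + \sigma(Q) f'(Q) + \int_{Q}^{[Q]+1} \sigma(x) f''(x)\,dx. \tag{3} \]
Similarly we have
\[ -\int_{[R]}^{R} f(x)\,dx = \big(\tfrac12 - \{R\}\big) f(R) - \tfrac12 f([R]) - \sigma(R) f'(R) + \int_{[R]}^{R} \sigma(x) f''(x)\,dx. \tag{4} \]
The required formula is obtained by adding (2), (3) and (4). □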
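Proof of Theorem 12.1. By considering the diagram associated with the circle problem it is easy to see that
\[ R(x) = 1 + 4[\sqrt{x}] + 8 \sum_{0 < u \le \sqrt{x/2}} \big[ \sqrt{x - u^2} \big] - 4\Big[ \sqrt{\frac{x}{2}} \Big]^2. \tag{5} \]
Clearly we have
\[ \sum_{0 < u \le \sqrt{x/2}} \big[ \sqrt{x - u^2} \big] = \sum_{0 < u \le \sqrt{x/2}} \sqrt{x - u^2} - \sum_{0 < u \le \sqrt{x/2}} \big\{ \sqrt{x - u^2} \big\} = I_1 - I_2. \]
Let us estimate $I_1$. Take $f(u) = \sqrt{x - u^2}$ so that from Theorem 12.2 we have
\[
\begin{aligned}
I_1 &= \int_0^{\sqrt{x/2}} \sqrt{x - u^2}\,du + \Big( \frac12 - \Big\{ \sqrt{\frac x2} \Big\} \Big)\sqrt{\frac x2} - \frac12 \sqrt{x} + \sigma\Big( \sqrt{\frac x2} \Big) - x \int_0^{\sqrt{x/2}} \frac{\sigma(u)\,du}{(x - u^2)^{3/2}} \\
&= \frac{\pi}{8} x + \frac{x}{4} + \Big( \frac12 - \Big\{ \sqrt{\frac x2} \Big\} \Big)\sqrt{\frac x2} - \frac12 \sqrt{x} + O(1).
\end{aligned}
\]
From Theorem 11.3 we have
\[ I_2 = \frac12 \sqrt{\frac x2} + O(x^{1/3} \log x). \]
The theorem follows from substituting these estimates into (5). □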
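Formula (5) is an exact identity and can be confirmed by machine for integer $x$. The sketch below is an illustration with hypothetical names: it compares the right-hand side of (5) with a direct lattice point count, using the fact that $[\sqrt{x/2}]$ equals the integer square root of $[x/2]$ when $x$ is an integer.

```python
from math import isqrt

def lattice_count(x):
    """Number of integer pairs (u, v) with u^2 + v^2 <= x, for an integer x >= 0."""
    r = isqrt(x)
    return sum(2 * isqrt(x - u * u) + 1 for u in range(-r, r + 1))

def sierpinski_form(x):
    """Right-hand side of (5):
    1 + 4[sqrt x] + 8 * sum_{0 < u <= sqrt(x/2)} [sqrt(x - u^2)] - 4 [sqrt(x/2)]^2."""
    half = isqrt(x // 2)              # [sqrt(x/2)] for integer x
    inner = sum(isqrt(x - u * u) for u in range(1, half + 1))
    return 1 + 4 * isqrt(x) + 8 * inner - 4 * half * half

if __name__ == "__main__":
    for x in (1, 2, 5, 100, 12345):
        print(x, lattice_count(x), sierpinski_form(x))
```

The formula needs only about $\sqrt{x/2}$ terms, whereas the direct count sums over $2[\sqrt{x}] + 1$ columns.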
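A similar problem to the circle problem is the Dirichlet divisor problem. We already proved in Theorem 5.4 that
\[ \sum_{1 \le n \le \xi} d(n) = \xi \log \xi + (2\gamma - 1)\xi + O(\xi^{1/2}). \]
Here we prove:

Theorem 12.3 (Voronoi). If $\xi \ge 2$, then
\[ \sum_{1 \le n \le \xi} d(n) = \xi \log \xi + (2\gamma - 1)\xi + O(\xi^{1/3} \log^2 \xi). \]
With reference to this problem Yin has improved the result by replacing the error term with $O(\xi^{13/40 + \varepsilon})$. Again the conjecture is that it should be $O(\xi^{1/4 + \varepsilon})$.

Proof. From the proof of Theorem 5.4 we have
\[ \sum_{1 \le n \le \xi} d(n) = 2 \sum_{1 \le u \le \sqrt{\xi}} \Big[ \frac{\xi}{u} \Big] - [\sqrt{\xi}]^2. \tag{6} \]
We take $f(u) = 1/u$ and from Theorem 12.2 we have
\[
\sum_{1 \le u \le \sqrt{\xi}} \frac1u = \lim_{\varepsilon \to 0} \sum_{1 - \varepsilon < u \le \sqrt{\xi}} \frac1u = \int_1^{\sqrt{\xi}} \frac{du}{u} + \big( \tfrac12 - \{\sqrt{\xi}\} \big) \xi^{-1/2} + \frac12 + \sigma(\sqrt{\xi})\, \xi^{-1} + 2 \int_1^{\sqrt{\xi}} \sigma(x) x^{-3}\,dx.
\]
We note that
\[
\begin{aligned}
\int_1^{\infty} \sigma(x) x^{-3}\,dx &= \frac12 \int_1^{\infty} \big( \tfrac12 - \{x\} \big) x^{-2}\,dx = \frac12 \sum_{n=1}^{\infty} \int_0^1 \frac{\tfrac12 - x}{(n + x)^2}\,dx \\
&= \frac14 - \frac12 \sum_{n=1}^{\infty} \Big\{ \log(n + 1) - \log n - \frac{1}{n + 1} \Big\} = -\frac14 + \frac12 \gamma,
\end{aligned}
\]
and so we have
\[ 2 \sum_{1 \le u \le \sqrt{\xi}} \frac{\xi}{u} = \xi \log \xi + 2\big( \tfrac12 - \{\sqrt{\xi}\} \big) \xi^{1/2} + 2\gamma \xi + O(1). \tag{7} \]
We now estimate
\[ S = \sum_{1 \le u \le \sqrt{\xi}} \Big\{ \frac{\xi}{u} \Big\}. \]
We take $t_0$ so that $[\sqrt{\xi}]\, 2^{-t_0 - 1} \le \xi^{1/3} \le [\sqrt{\xi}]\, 2^{-t_0}$. Then clearly we have
\[ S = \sum_{t=0}^{t_0} \ \sum_{[\sqrt{\xi}] 2^{-t-1} < u \le [\sqrt{\xi}] 2^{-t}} \Big\{ \frac{\xi}{u} \Big\} + O(\xi^{1/3}). \]
From Theorem 11.3 (replacing $m$ by $[\sqrt{\xi}]\, 2^{-t-1}$, and $A$ by $[\sqrt{\xi}]^3 \xi^{-1} 2^{-(3t+1)}$), we have
\[ \sum_{[\sqrt{\xi}] 2^{-t-1} < u \le [\sqrt{\xi}] 2^{-t}} \Big\{ \frac{\xi}{u} \Big\} = \frac12 [\sqrt{\xi}]\, 2^{-t-1} + O(\xi^{1/3} \log \xi). \]
Therefore
\[ S = \tfrac12 [\sqrt{\xi}] + O(\xi^{1/3} \log^2 \xi). \tag{8} \]
Noting that $[\sqrt{\xi}]^2 = \xi - 2\{\sqrt{\xi}\}\xi^{1/2} + O(1)$, we see that the theorem follows from (6), (7) and (8). □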
6.13 Ω-Results

A number of famous problems in number theory are concerned with the accuracy of various asymptotic formulae; that is the problem of reducing the size of the error term in the formula. These results are generally called O-results, and our Theorem 12.1 and Theorem 12.3 are such examples. On the other hand we may also estimate how large the error term must be; that is we can prove that some error terms cannot be smaller than a certain order. These types of results are called Ω-results. In §12 we mentioned that the O-term in Theorem 12.1 is conjectured to be $O(x^{1/4 + \varepsilon})$. Here we prove that if $\varepsilon > 0$, then the formula
\[ R(x) = \pi x + O(x^{1/4 - \varepsilon}) \]
does not hold. Actually we shall prove a very general result. In this section $K, K_1, K_2, K_3$ represent absolute constants. At various places we may use the same symbol to denote different constants, but this should not cause any confusion.

Theorem 13.1 (Erdős-Fuchs). Let $c > 0$ and let $a_1, a_2, \ldots$ be integers satisfying $0 \le a_1 \le a_2 \le \cdots$. Let $f(n)$ denote the number of solutions to the equation $a_i + a_j = n$, and
\[ r(x) = \sum_{n \le x} f(n), \]
so that $r(x)$ is the number of pairs of integers $a_i, a_j$ satisfying $a_i + a_j \le x$. Then the formula
\[ r(n) = cn + o\big( n^{1/4} \log^{-1/2} n \big) \tag{1} \]
cannot hold.
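We shall first deal with the following auxiliary results.

Theorem 13.2. Let $a_n$ be real numbers such that
\[ \psi(\vartheta) = \sum_{n=-\infty}^{\infty} a_n e^{in\vartheta} \]
converges uniformly, and that $\sum_{n=-\infty}^{\infty} a_n^2$ converges. Then
\[ \frac{1}{2\pi} \int_{-\pi}^{\pi} |\psi(\vartheta)|^2\,d\vartheta = \sum_{n=-\infty}^{\infty} a_n^2. \]

Proof. Clearly we have
\[ |\psi(\vartheta)|^2 = \sum_{n=-\infty}^{\infty} \sum_{m=-\infty}^{\infty} a_n a_m e^{i(n-m)\vartheta}. \]
The required result follows from integrating term by term over $-\pi$ to $\pi$. □

Theorem 13.3. Let $b_n \ge 0$ and let $\varphi(z) = \sum_{n=0}^{\infty} b_n z^n$ be convergent for $|z| < 1$. If $0 < \alpha < \pi$, $z = re^{i\vartheta}$ ($0 < r < 1$), then we have
\[ \int_{-\alpha}^{\alpha} |\varphi(z)|^2\,d\vartheta \ge \frac{\alpha}{3\pi} \int_{-\pi}^{\pi} |\varphi(z)|^2\,d\vartheta. \]

Proof. We introduce the function
\[ q(\vartheta) = \begin{cases} 1 - \dfrac{|\vartheta|}{\alpha}, & \text{when } |\vartheta| \le \alpha, \\ 0, & \text{when } \alpha < |\vartheta| \le \pi. \end{cases} \]
Then we have
\[ \int_{-\alpha}^{\alpha} |\varphi(z)|^2\,d\vartheta \ge \int_{-\pi}^{\pi} |q(\vartheta)|^2 |\varphi(z)|^2\,d\vartheta = \sum_{m,n} b_n b_m r^{n+m} \int_{-\pi}^{\pi} |q(\vartheta)|^2 e^{i(n-m)\vartheta}\,d\vartheta. \]
When $m \ne n$, we have
\[ \int_{-\pi}^{\pi} |q(\vartheta)|^2 e^{i(n-m)\vartheta}\,d\vartheta = 2 \int_0^{\alpha} \Big( 1 - \frac{\vartheta}{\alpha} \Big)^2 \cos (n-m)\vartheta\,d\vartheta = \frac{4}{\alpha(n-m)^2} \Big( 1 - \frac{\sin (n-m)\alpha}{\alpha(n-m)} \Big) \ge 0, \]
while when $m = n$,
\[ \int_{-\pi}^{\pi} |q(\vartheta)|^2\,d\vartheta = \frac{2}{3}\alpha, \]
and therefore we have
\[ \int_{-\alpha}^{\alpha} |\varphi(z)|^2\,d\vartheta \ge \frac{2\alpha}{3} \sum_{n=0}^{\infty} b_n^2 r^{2n} = \frac{\alpha}{3\pi} \int_{-\pi}^{\pi} |\varphi(z)|^2\,d\vartheta. \]
□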
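Theorem 13.4. Suppose that $|z| < 1$ and let
\[ (1 - z)^{-r} = \sum_{n=0}^{\infty} \gamma_n z^n. \]
Then there exist constants $c, C$ such that
\[ 0 < c < \frac{\gamma_n}{n^{r-1}} < C < \infty. \]

Proof. From the binomial theorem we have
\[ \gamma_n = \frac{r(r + 1) \cdots (r + n - 1)}{1 \cdot 2 \cdots n}. \]
Since
\[ \int_{v - \frac12}^{v + \frac12} \log t\,dt = \int_0^{\frac12} \{ \log(v + t) + \log(v - t) \}\,dt = \log v + O\Big( \frac{1}{v^2} \Big), \]
it follows that
\[
\begin{aligned}
\sum_{l=1}^{n} \log(r + l - 1) &= \sum_{l=1}^{n} \int_{r + l - \frac32}^{r + l - \frac12} \log t\,dt + O\Big( \sum_{l=1}^{n} \frac{1}{(r + l - 1)^2} \Big) = \int_{r - \frac12}^{r - \frac12 + n} \log t\,dt + O(1) \\
&= (r - \tfrac12 + n) \log(r - \tfrac12 + n) - (r - \tfrac12 + n) + O(1) \\
&= (r - \tfrac12 + n) \log n - n + O(1)
\end{aligned}
\]
and
\[ \log n! = \sum_{l=1}^{n} \log l = (\tfrac12 + n) \log n - n + O(1), \]
\[ \log \gamma_n = (r - 1) \log n + O(1), \]
and the theorem follows. □

Theorem 13.5. If $b_n = o(n^{1/2} \log^{-1} n)$, then when $0 < r < 1$ we have
\[ \sum_{n=0}^{\infty} b_n r^n = o\Big( (1 - r)^{-3/2} \log^{-1} \frac{1}{1 - r} \Big) \]
as $r \to 1$.

Proof. From the hypothesis we have
\[ \sum_{n=0}^{\infty} b_n r^n \le K \sum_{n \le (1-r)^{-1/2}} n^{1/2} r^n + \varepsilon_1(r) \log^{-1} \frac{1}{1 - r} \sum_{n > (1-r)^{-1/2}} n^{1/2} r^n, \]
where $\varepsilon_1(r) \to 0$ as $r \to 1$. In the first sum there are at most $(1 - r)^{-1/2}$ terms, each of which is at most $(1 - r)^{-1/4}$, so that the sum is at most $(1 - r)^{-3/4}$. From Theorem 13.4 the second sum is
\[ \le \varepsilon(r) \log^{-1} \frac{1}{1 - r}\, (1 - r)^{-3/2}. \]
Together we have
\[ \sum_{n=0}^{\infty} b_n r^n \le K(1 - r)^{-3/4} + \varepsilon(r) \log^{-1} \frac{1}{1 - r}\, (1 - r)^{-3/2} = o\Big( \log^{-1} \frac{1}{1 - r}\, (1 - r)^{-3/2} \Big). \]
□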
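Theorem 13.6. Let $f(x)$ and $g(x)$ be two continuous real functions in the interval $(a,b)$. Then
\[ \Big| \int_a^b f(x) g(x)\,dx \Big| \le \Big( \int_a^b f^2(x)\,dx \int_a^b g^2(x)\,dx \Big)^{1/2}. \]

Proof. Let $\lambda$ be any real number and consider
\[ \lambda^2 \int_a^b f^2(x)\,dx + 2\lambda \int_a^b f(x) g(x)\,dx + \int_a^b g^2(x)\,dx = \int_a^b (\lambda f(x) + g(x))^2\,dx \ge 0. \]
The discriminant of the quadratic expression cannot be positive and so the theorem follows. □

Proof of Theorem 13.1. Suppose that $\tfrac12 < r < 1$, $z = re^{i\vartheta}$, $1 - r < \alpha < \pi/2$. Let
\[ g(z) = \sum_{k=1}^{\infty} z^{a_k}, \]
so that we have at once
\[ g^2(z) = \sum_{n=0}^{\infty} f(n) z^n \]
and
\[ (1 - z)^{-1} g^2(z) = \sum_{n=0}^{\infty} r(n) z^n. \]
If formula (1) holds, then
\[ (1 - z)^{-1} g^2(z) = c \sum_{n=0}^{\infty} n z^n + h(z) = cz(1 - z)^{-2} + h(z), \tag{2} \]
where
\[ h(z) = \sum_{n=0}^{\infty} v_n z^n, \quad v_n = o\big( n^{1/4} \log^{-1/2} n \big). \]
We shall now derive a contradiction. From (2) we have
\[ \int_{-\alpha}^{\alpha} |g(z)|^2\,d\vartheta = \int_{-\alpha}^{\alpha} \big| cz(1 - z)^{-1} + (1 - z) h(z) \big|\,d\vartheta \le c \int_{-\pi}^{\pi} |1 - z|^{-1}\,d\vartheta + \int_{-\alpha}^{\alpha} |1 - z|\,|h(z)|\,d\vartheta, \tag{3} \]
and from Theorem 13.2 and Theorem 13.4 we have
\[ \int_{-\pi}^{\pi} |1 - z|^{-1}\,d\vartheta \le K \log \frac{1}{1 - r}. \]
From Theorems 13.6 and 13.5 we have
\[
\begin{aligned}
\int_{-\alpha}^{\alpha} |1 - z|\,|h(z)|\,d\vartheta &\le \Big( \int_{-\alpha}^{\alpha} |1 - z|^2\,d\vartheta \int_{-\pi}^{\pi} |h(z)|^2\,d\vartheta \Big)^{1/2} \\
&= \Big( \big( 2\alpha(1 + r^2) - 4r \sin\alpha \big) \int_{-\pi}^{\pi} |h(z)|^2\,d\vartheta \Big)^{1/2} \\
&\le \Big\{ \big( 2\alpha(1 - r)^2 + 4r(\alpha - \sin\alpha) \big)\, \varepsilon(r) (1 - r)^{-3/2} \log^{-1} \frac{1}{1 - r} \Big\}^{1/2} \\
&\le \varepsilon(r)\, \alpha^{3/2} (1 - r)^{-3/4} \log^{-1/2} \frac{1}{1 - r},
\end{aligned}
\]
where $\varepsilon(r) \to 0$ as $r \to 1$. Therefore, from (3) we arrive at
\[ \int_{-\alpha}^{\alpha} |g(z)|^2\,d\vartheta \le K_1 \log \frac{1}{1 - r} + \varepsilon(r)\, \alpha^{3/2} (1 - r)^{-3/4} \log^{-1/2} \frac{1}{1 - r}. \tag{4} \]
On the other hand, from Theorem 13.3, we have
\[ \int_{-\alpha}^{\alpha} |g(z)|^2\,d\vartheta \ge \frac{\alpha}{3\pi} \int_{-\pi}^{\pi} |g(z)|^2\,d\vartheta \ge \frac{\alpha}{3\pi} \cdot 2\pi \sum_{k=1}^{\infty} r^{2a_k} = \frac{2\alpha}{3}\, g(r^2). \]
From (2) and Theorem 13.4 we have
\[
\begin{aligned}
g^2(r^2) &= cr^2(1 - r^2)^{-1} + (1 - r^2) h(r^2) \\
&= cr^2(1 - r^2)^{-1} + (1 - r^2)\, O\Big( \sum n^{1/4} r^{2n} \Big) \\
&> K(1 - r)^{-1} - O\big( (1 - r)^{-1/4} \big) > K(1 - r)^{-1}.
\end{aligned}
\]
Therefore
\[ \int_{-\alpha}^{\alpha} |g(z)|^2\,d\vartheta > K_2\, \alpha (1 - r)^{-1/2}. \tag{5} \]
We take $K_2 c_1 > 1 + K_1$ and let $\alpha = c_1 (1 - r)^{1/2} \log(1/(1 - r))$. Then from (4) and (5), we arrive at
\[ K_2 c_1 \log \frac{1}{1 - r} < K_1 \log \frac{1}{1 - r} + \varepsilon(r)\, c_1^{3/2} \log \frac{1}{1 - r}, \]
which is a contradiction when $r$ is sufficiently close to 1. Our theorem is proved. □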
6.14 Dirichlet Series

A Dirichlet series is a series of the form
\[ F(s) = \sum_{n=1}^{\infty} \frac{f(n)}{n^s}. \]
Here we call $F(s)$ the generating function of $f(n)$. This book does not discuss the fundamental properties of Dirichlet series. Instead we only deal with the various formulae and their transformations. We do not even discuss the region of convergence for the series. If $f(n)$ is a multiplicative function, then
\[ F(s) = \prod_p \Big( 1 + \frac{f(p)}{p^s} + \frac{f(p^2)}{p^{2s}} + \cdots \Big), \]
where $p$ runs over all the primes. Also if $f(n)$ is completely multiplicative, then
\[ F(s) = \prod_p \Big( 1 - \frac{f(p)}{p^s} \Big)^{-1}. \]
If
\[ G(s) = \sum_{n=1}^{\infty} \frac{g(n)}{n^s}, \]
then
\[ F(s)G(s) = \sum_{l=1}^{\infty} \frac{f(l)}{l^s} \sum_{m=1}^{\infty} \frac{g(m)}{m^s} = \sum_{n=1}^{\infty} \frac{1}{n^s} \sum_{d \mid n} f(d)\, g\Big( \frac{n}{d} \Big). \]
Therefore $F(s)G(s)$ is the generating function of $\sum_{d \mid n} f(d) g(n/d)$. We can use this to derive Theorem 4.2. Let
\[ \zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s}. \]
This is the famous Riemann zeta function in analytic number theory. We have the product formula
\[ \zeta(s) = \prod_p \Big( 1 - \frac{1}{p^s} \Big)^{-1}. \tag{1} \]
Therefore
\[ \frac{1}{\zeta(s)} = \prod_p \Big( 1 - \frac{1}{p^s} \Big) = \sum_{n=1}^{\infty} \frac{\mu(n)}{n^s}. \tag{2} \]
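On the level of coefficients, multiplying two Dirichlet series is the operation $h(n) = \sum_{d \mid n} f(d) g(n/d)$, and identities such as (2) become finite checks. The following sketch is illustrative (the helper names are not from the text). It implements this convolution and verifies that $1 * 1 = d$, that $\mu * 1$ is the coefficient sequence of the constant series 1, and that $\mu * n = \varphi$.

```python
from math import gcd

def dirichlet_convolution(f, g, N):
    """Return [None, h(1), ..., h(N)] with h(n) = sum over d | n of f(d) * g(n / d)."""
    h = [None] + [0] * N
    for d in range(1, N + 1):
        fd = f(d)
        for k in range(1, N // d + 1):
            h[d * k] += fd * g(k)
    return h

def mobius(n):
    """Moebius function by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:            # p^2 divides the original n
                return 0
            result = -result
        p += 1
    return -result if n > 1 else result

def one(n):
    return 1

def identity(n):
    return n

def num_divisors(n):
    return sum(1 for d in range(1, n + 1) if n % d == 0)

def phi(n):
    return sum(1 for a in range(1, n + 1) if gcd(a, n) == 1)

if __name__ == "__main__":
    N = 12
    print(dirichlet_convolution(one, one, N)[1:])          # d(n),   from zeta(s)^2
    print(dirichlet_convolution(mobius, one, N)[1:])       # 1,0,0,...: zeta(s) * (1/zeta(s)) = 1
    print(dirichlet_convolution(mobius, identity, N)[1:])  # phi(n), from zeta(s-1)/zeta(s)
```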
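If $g(n)$ is the Möbius transform of $f(n)$, then their generating functions $G(s)$ and $F(s)$ are related by $G(s) = \zeta(s)F(s)$. The inverse Möbius transform theorem then becomes $F(s) = G(s)/\zeta(s)$. We also have
\[ \sum_{n=1}^{\infty} \frac{d(n)}{n^s} = \zeta^2(s), \tag{3} \]
and
\[ \sum_{n=1}^{\infty} \frac{|\mu(n)|}{n^s} = \prod_p \Big( 1 + \frac{1}{p^s} \Big) = \prod_p \frac{1 - p^{-2s}}{1 - p^{-s}} = \frac{\zeta(s)}{\zeta(2s)}. \tag{4} \]
Taking the logarithmic derivative of (1) we have
\[ \frac{\zeta'(s)}{\zeta(s)} = -\sum_p \frac{\log p}{p^s} \Big( 1 - \frac{1}{p^s} \Big)^{-1} = -\sum_p \log p \sum_{m=1}^{\infty} \frac{1}{p^{ms}} = -\sum_{n=2}^{\infty} \frac{\Lambda(n)}{n^s}. \tag{5} \]
Since
\[ \zeta'(s) = -\sum_{n=2}^{\infty} \frac{\log n}{n^s}, \tag{6} \]
these two formulae give a new proof of the Möbius transform relationship between $\log n$ and $\Lambda(n)$. Now
\[ \log \zeta(s) = -\sum_p \log\Big( 1 - \frac{1}{p^s} \Big) = \sum_p \sum_{m=1}^{\infty} \frac{1}{m p^{ms}} = \sum_{n=1}^{\infty} \frac{\Lambda_1(n)}{n^s}. \tag{7} \]
Also
\[ \zeta''(s) = \sum_{n=1}^{\infty} \frac{\log^2 n}{n^s}. \]
From
\[ \sum_{n=1}^{\infty} \frac{\Lambda(n) \log n}{n^s} = \Big( \frac{\zeta'(s)}{\zeta(s)} \Big)' \]
and
\[ \sum_{n=1}^{\infty} \frac{1}{n^s} \Big( \sum_{d \mid n} \Lambda(d) \Lambda\Big( \frac{n}{d} \Big) \Big) = \Big( \frac{\zeta'(s)}{\zeta(s)} \Big)^2, \]
using
\[ \frac{\zeta''(s)}{\zeta(s)} = \frac{d}{ds} \frac{\zeta'(s)}{\zeta(s)} + \Big( \frac{\zeta'(s)}{\zeta(s)} \Big)^2 \tag{8} \]
we arrive at
\[ \sum_{d \mid n} \mu(d) \log^2 \frac{n}{d} = \sum_{d \mid n} \Lambda(d) \Lambda\Big( \frac{n}{d} \Big) + \Lambda(n) \log n. \]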
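The results in §8 can also be expressed as follows. Let
\[ L(s) = \sum_{n=1}^{\infty} \frac{\chi(n)}{n^s}. \]
Then we have
\[ \sum_{n=1}^{\infty} \frac{r(n)}{n^s} = 4L(s)\zeta(s). \tag{9} \]
In the study of analytic number theory we study the analytic properties of $F(s)$ and use these properties to derive results concerning the function $f(n)$.

Exercise 1. Discuss the region of convergence for the series in (1)-(9).

Exercise 2. Establish the following:
\[ \frac{\zeta^3(s)}{\zeta(2s)} = \sum_{n=1}^{\infty} \frac{d(n^2)}{n^s} \quad (s > 1), \]
\[ \frac{\zeta^4(s)}{\zeta(2s)} = \sum_{n=1}^{\infty} \frac{(d(n))^2}{n^s} \quad (s > 1), \]
\[ \frac{\zeta(s - 1)}{\zeta(s)} = \sum_{n=1}^{\infty} \frac{\varphi(n)}{n^s} \quad (s > 2), \]
\[ \zeta(s)\zeta(s - a) = \sum_{n=1}^{\infty} \frac{\sigma_a(n)}{n^s}, \quad s > \max(1, a + 1), \]
\[ \frac{\zeta(s)\zeta(s - a)\zeta(s - b)\zeta(s - a - b)}{\zeta(2s - a - b)} = \sum_{n=1}^{\infty} \frac{\sigma_a(n)\sigma_b(n)}{n^s}, \quad s > \max(1, a + 1, b + 1, a + b + 1). \]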
6.15 Lambert Series

Definition. We call
$$F(x)=\sum_{n=1}^{\infty}f(n)\frac{x^{n}}{1-x^{n}} \tag{1}$$
a Lambert series. Here $F(x)$ is the generating function of $f(n)$.
Expanding (1) into a power series we have
$$F(x)=\sum_{n=1}^{\infty}f(n)\sum_{m=1}^{\infty}x^{mn}=\sum_{n=1}^{\infty}g(n)x^{n},$$
where $g(n)=\sum_{d\mid n}f(d)$. Thus if $g(n)$ is the Möbius transform of $f(n)$, then $g(n)$ is the coefficient of the power series whose sum is the Lambert series generating function of $f(n)$. We now take $g(n)=\Delta(n)$, giving
$$x=\sum_{n=1}^{\infty}\frac{\mu(n)x^{n}}{1-x^{n}}. \tag{2}$$
Again, if we take $g(n)=n$, then
$$\sum_{n=1}^{\infty}nx^{n}=\frac{x}{(1-x)^{2}},$$
so that
$$\sum_{n=1}^{\infty}\frac{\varphi(n)x^{n}}{1-x^{n}}=\frac{x}{(1-x)^{2}}. \tag{3}$$
A similar method gives
$$\sum_{n=1}^{\infty}d(n)x^{n}=\frac{x}{1-x}+\frac{x^{2}}{1-x^{2}}+\frac{x^{3}}{1-x^{3}}+\cdots \tag{4}$$
and
$$\sum_{n=1}^{\infty}r(n)x^{n}=4\left(\frac{x}{1-x}-\frac{x^{3}}{1-x^{3}}+\frac{x^{5}}{1-x^{5}}-\cdots\right). \tag{5}$$
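Formulas (3) and (4) are statements about power series coefficients, so they can be checked term by term. The sketch below expands a Lambert series up to $x^{N}$; the names `lambert_coefficients` and `phi` are illustrative assumptions, not notation from the text.

```python
from math import gcd

def lambert_coefficients(f, N):
    """Coefficients g(1..N) of sum_n f(n) x^n / (1 - x^n) as a power series."""
    g = [0] * (N + 1)
    for n in range(1, N + 1):
        # x^n / (1 - x^n) = x^n + x^(2n) + x^(3n) + ...
        for k in range(n, N + 1, n):
            g[k] += f(n)
    return g[1:]

def phi(n):
    """Euler's function, by direct counting."""
    return sum(1 for a in range(1, n + 1) if gcd(a, n) == 1)

N = 30
# (3): f = phi gives g(n) = n, the coefficients of x / (1 - x)^2
print(lambert_coefficients(phi, N) == list(range(1, N + 1)))
# (4): f = 1 gives g(n) = d(n)
print(lambert_coefficients(lambda n: 1, 12))
```

The inner loop is exactly the expansion used above: $f(n)$ contributes to the coefficient of $x^{k}$ whenever $n \mid k$, so $g(k)=\sum_{d\mid k}f(d)$.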
Notes 6.1. The present best result on the circle problem is R(x)
=
7tX
+ O(xH+,)
by J. R. Chen [17]. There is a similar result for the Dirichlet's divisor problem (see Yin [65J and G. A. Kolesnik [34J). 6.2. Concerning the ,Qresult with respect to the divisor problem, H. E. Richert [49J has proved the following: Let O(n (n = 1,2,3, ... ) be a complex sequence such that
n:::=;x
holds for some (j > O. Then, given any e > 0 and any constant e, the following asymptotic formula cannot hold:
L mn:::;:;:x
OCmOC n
=
x log x
+ ex + O(x}').
Chapter 7. Trigonometric Sums and Characters
7.1 Representation of Residue Classes

Let $m$ be a positive integer. We have seen that the set of integers can be partitioned into residue classes
$$A_{0},A_{1},\ldots,A_{m-1},$$
where $A_{s}$ is the set of integers congruent to $s$ mod $m$. We can define the operation of addition on these residue classes by
$$A_{s}+A_{t}=A_{u},\qquad u=\begin{cases}s+t,&\text{if }s+t<m,\\ s+t-m,&\text{if }s+t\ge m.\end{cases}$$
This definition satisfies properties associated with groups. Within the theory of groups there is a representation theory whereby more abstract objects are given concrete representations, and this theory has very useful applications (for example, in electronics). In this section we discuss the method of representing residue classes which form an additive group. To replace the more abstract notion of a residue class we assign to each $A_{u}$ a complex number $\varepsilon_{u}$, bearing in mind that the representation should have the property that if
$$A_{s}+A_{t}=A_{u}, \tag{1}$$
then
$$\varepsilon_{s}\varepsilon_{t}=\varepsilon_{u}. \tag{2}$$
An immediate candidate for such a representation is
$$\varepsilon_{u}=e^{2\pi iu/m}.$$
The advantages of this representation are: (i) integers belonging to the same residue class are assigned the same number; that is, if $u=v+km$, then $\varepsilon_{u}=\varepsilon_{v}$; (ii) if $u+v\equiv w \pmod m$, then $\varepsilon_{u}\varepsilon_{v}=\varepsilon_{w}$.

After giving this representation the abstract notion of adding residue classes becomes the concrete one of multiplication of complex numbers. Thus it is possible that some results on congruences can be obtained from the results in trigonometric sums. This is the underlying reason for the important place occupied by research in trigonometric sums in the theory of numbers. Let $a$ be any integer. Then
$$\varepsilon_{u}^{a}=e^{2\pi iau/m}$$
also possesses the properties (i) and (ii), and so there are $m$ different representations. We now prove that there are no other representations. Let $\eta_{u}$ be any complex number with the above properties. Then from $mu\equiv0 \pmod m$ we have $\eta_{u}^{m}=\eta_{0}$. But $\eta_{0}^{2}=\eta_{0}$, so that if $\eta_{0}\neq0$, then $\eta_{0}=1$ and we see that $\eta_{u}$ must be an $m$th root of unity. If we let $\eta_{1}=e^{2\pi ia/m}$, then
$$\eta_{u}=\eta_{1}^{u}=e^{2\pi iau/m}.$$
If $\eta_{0}=0$, then $\eta_{u}=0$; that is, all the representations are zero, which we exclude from our discussion.
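Theorem 1.1. We have, according to whether $m$ divides $n$ or not,
$$\frac{1}{m}\sum_{a=0}^{m-1}\varepsilon_{a}^{\,n}=1\ \text{ or }\ 0,$$
that is
$$\frac{1}{m}\sum_{a=0}^{m-1}e^{2\pi ian/m}=1\ \text{ or }\ 0.$$

Proof. If $m\mid n$, then the theorem is obvious. If $m\nmid n$, then
$$\sum_{a=0}^{m-1}\varepsilon_{a}^{\,n}=\frac{1-\varepsilon_{1}^{mn}}{1-\varepsilon_{1}^{n}}=0. \qquad\square$$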
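From this theorem we see that the number of solutions to the congruence
$$f(x_{1},\ldots,x_{n})\equiv N \pmod m,\qquad 0\le x_{\nu}<m,$$
can be represented by
$$\frac{1}{m}\sum_{x_{1}=0}^{m-1}\cdots\sum_{x_{n}=0}^{m-1}\ \sum_{a=0}^{m-1}e^{2\pi ia(f(x_{1},\ldots,x_{n})-N)/m}.$$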
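As an illustration of this counting formula, the sketch below compares the exponential sum with a direct count for a small congruence. The function names and the sample polynomial $x^{2}+y^{2}$ are assumptions chosen for the example only.

```python
import cmath
from itertools import product

def count_by_exponential_sum(f, nvars, N, m):
    """(1/m) * sum over x_1..x_n and a of exp(2*pi*i*a*(f(x) - N)/m)."""
    total = 0
    for xs in product(range(m), repeat=nvars):
        r = (f(*xs) - N) % m          # only the residue mod m matters
        total += sum(cmath.exp(2j * cmath.pi * a * r / m) for a in range(m))
    return round((total / m).real)

def count_directly(f, nvars, N, m):
    """Number of x in [0, m)^n with f(x) congruent to N mod m."""
    return sum(1 for xs in product(range(m), repeat=nvars)
               if (f(*xs) - N) % m == 0)

f = lambda x, y: x * x + y * y
print(count_by_exponential_sum(f, 2, 1, 7), count_directly(f, 2, 1, 7))
```

By Theorem 1.1 the inner sum over $a$ equals $m$ when $f(x)\equiv N$ and $0$ otherwise, which is why the two counts agree; the rounding only removes floating point noise.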
After giving this representation the problem of congruence is now given an analytic interpretation. For the system of integers we have:
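Theorem 1.2. We have, according to whether $n$ is $0$ or not,
$$\int_{0}^{1}e^{2\pi inx}\,dx=1\ \text{ or }\ 0. \qquad\square$$

From this theorem we see that the number of sets of integer solutions to the equation
$$f(x_{1},\ldots,x_{n})=N,\qquad a_{\nu}\le x_{\nu}\le b_{\nu},$$
is equal to
$$\sum_{a_{1}\le x_{1}\le b_{1}}\cdots\sum_{a_{n}\le x_{n}\le b_{n}}\int_{0}^{1}e^{2\pi i(f(x_{1},\ldots,x_{n})-N)\alpha}\,d\alpha.$$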
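Example 1. Fermat's problem is to prove: when $k\ge3$, for every positive integer $P$,
$$\int_{0}^{1}\left(\sum_{x=1}^{P}e^{2\pi ix^{k}\alpha}\right)^{2}\left(\sum_{x=1}^{P}e^{-2\pi ix^{k}\alpha}\right)d\alpha=0.$$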
Example 2. Goldbach's problem is to prove:
$$\int_{0}^{1}\left(\sum_{p\le2N}e^{2\pi ip\alpha}\right)^{2}e^{-4\pi iN\alpha}\,d\alpha>0.$$
Of course, in these two examples, the new representations do not assist the solutions of the problems.

Exercise 1. Let $(m,n)=1$,
$$S=\sum_{x=0}^{m-1}\sum_{y=0}^{m-1}\xi(x)\eta(y)e^{2\pi ixyn/m},\qquad\sum_{x=0}^{m-1}|\xi(x)|^{2}=X_{0},\qquad\sum_{y=0}^{m-1}|\eta(y)|^{2}=Y_{0}.$$
Show that
$$|S|\le\sqrt{mX_{0}Y_{0}}.$$
7.2 Character Functions

We already know that multiplication is closed within a reduced residue system. That is, if
$$A_{a_{1}},A_{a_{2}},\ldots,A_{a_{\varphi(m)}}$$
are the residue classes mod $m$ corresponding to $(a_{u},m)=1$, then
$$A_{a_{u}}A_{a_{v}}$$
is still one of these classes. We now ask if there is also a representation for these classes.

Definition. By a character $\chi(n)$ mod $m$ we mean a function of $n$, defined when $(n,m)=1$, which satisfies the following: 1) $\chi(1)\neq0$; 2) if $a\equiv b \pmod m$, then $\chi(a)=\chi(b)$; 3) $\chi(ab)=\chi(a)\chi(b)$.
Sometimes it is convenient to add: if $(n,m)>1$, then $\chi(n)=0$.

Example. $\chi(n)=1$ is clearly a character. We call this the principal character and we denote it by $\chi_{0}$.

We can deduce from the definition that $\chi(1)=1$, that $\bar\chi(n)$ is also a character, and that the product of two characters is a character. As an example we first take $m=p$, a prime number. Take $g$ to be a primitive root mod $p$. Then the function
$$\chi_{a}(n)=e^{2\pi ia\operatorname{ind}n/(p-1)}$$
is a character because it has the following properties: 1) $\chi_{a}(1)=1\neq0$; 2) if $n\equiv n' \pmod p$, then $\operatorname{ind}n\equiv\operatorname{ind}n' \pmod{p-1}$, so that $\chi_{a}(n)=\chi_{a}(n')$; 3)
$$\chi_{a}(nn')=e^{2\pi ia\operatorname{ind}(nn')/(p-1)}=e^{2\pi ia(\operatorname{ind}n+\operatorname{ind}n')/(p-1)}=\chi_{a}(n)\chi_{a}(n').$$
More specifically, when $p$ is an odd prime we take $a=(p-1)/2$ so that
$$\chi_{\frac12(p-1)}(n)=e^{\pi i\operatorname{ind}n}=\left(\frac{n}{p}\right).$$
That is, the Legendre symbol is a character. From the above we see that there are $p-1$ characters mod $p$, and it is not difficult to prove that there are only $p-1$ distinct characters. We now generalize our discussion to the following:
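1) $m=p^{l}$ where $p$ is an odd prime. From Theorem 3.9.1 there exists a primitive root $g$ mod $p^{l}$, so that if $p\nmid n$ we can define $\operatorname{ind}n$, that is
$$n\equiv g^{\operatorname{ind}n} \pmod{p^{l}}.$$
From this we can obtain
$$\chi_{a}(n)=e^{2\pi ia\operatorname{ind}n/\varphi(p^{l})}.$$
Clearly $\chi_{a}(1)=1$, and there exists a character
$$\chi_{1}(n)=e^{2\pi i\operatorname{ind}n/\varphi(p^{l})}$$
with the property: if $n\not\equiv1 \pmod{p^{l}}$, then $\chi_{1}(n)\neq1$.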
2) $m=2^{l}$.

2.1) $l=1$. There is only the principal character.

2.2) $l=2$. Besides the principal character there is the character
$$\chi(1)=1,\qquad\chi(3)=-1.$$

2.3) $l>2$. By Theorem 3.9.3, when $n$ is odd, there is an integer $b$ such that
$$n\equiv(-1)^{\frac12(n-1)}5^{b} \pmod{2^{l}},\qquad b\ge0.$$
We now define
$$\chi_{a,c}(n)=(-1)^{\frac12(n-1)a}\,e^{2\pi icb/2^{l-2}}.$$
Here $a$ may take two distinct values mod 2 and $c$ may take $2^{l-2}$ distinct values mod $2^{l-2}$, so that there are $2\cdot2^{l-2}=\varphi(2^{l})$ characters. The character $\chi_{1,1}(n)$ has the following property: if $\chi_{1,1}(n)=1$, then $n\equiv1 \pmod{2^{l}}$ or $n\equiv-5^{2^{l-3}} \pmod{2^{l}}$. When $n\equiv-5^{2^{l-3}} \pmod{2^{l}}$, we have $\chi_{0,1}(n)=-1\neq1$. That is, if $n\not\equiv1 \pmod{2^{l}}$ then we can select a character $\chi_{a,c}(n)\neq1$.

3) The general case. Let
$$m=p_{1}^{l_{1}}\cdots p_{s}^{l_{s}},\qquad l_{v}>0,$$
be the standard factorization for $m$. Let a character mod $p_{v}^{l_{v}}$ be $\chi^{(v)}(n)$, so that
$$\chi(n)=\prod_{v=1}^{s}\chi^{(v)}(n) \tag{1}$$
is a character mod $m$. There are thus $\prod_{v}\varphi(p_{v}^{l_{v}})=\varphi(m)$ characters constructed in this way.
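Conversely, if the modulus of a character $\chi(n)$ is
$$m=k_{1}k_{2}\cdots k_{v},$$
where the $k_{i}$ are pairwise coprime, then there exist characters $\chi_{i}(n)$ mod $k_{i}$ $(i=1,\ldots,v)$ such that
$$\chi(n)=\chi_{1}(n)\cdots\chi_{v}(n).$$
In order to understand this we need only prove the case $v=2$. From the Chinese remainder theorem, given any $n$, we can find $n_{1}$ and $n_{2}$ such that
$$n_{1}\equiv n \pmod{k_{1}},\qquad n_{1}\equiv1 \pmod{k_{2}},$$
$$n_{2}\equiv1 \pmod{k_{1}},\qquad n_{2}\equiv n \pmod{k_{2}}.$$
We define
$$\chi_{1}(n)=\chi(n_{1}),\qquad\chi_{2}(n)=\chi(n_{2}),$$
and it is not difficult to prove that $\chi_{1}(n)$ is a character mod $k_{1}$ and $\chi_{2}(n)$ is a character mod $k_{2}$. From the definition of $n_{1}$ and $n_{2}$, we have
$$n\equiv n_{1}n_{2} \pmod{k_{1}k_{2}},$$
so that
$$\chi(n)=\chi(n_{1})\chi(n_{2}).$$
Therefore
$$\chi(n)=\chi_{1}(n)\chi_{2}(n).$$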
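Theorem 2.1. The $\varphi(m)$ characters so constructed are all distinct.

Proof. Suppose that
$$\prod_{v=1}^{s}\chi^{(v)}(n)=\prod_{v=1}^{s}\chi_{1}^{(v)}(n).$$
From the fact that $\chi^{(v)}(n)/\chi_{1}^{(v)}(n)$ is also a character mod $p_{v}^{l_{v}}$ it suffices to prove that if
$$\prod_{v=1}^{s}\chi^{(v)}(n)$$
is the principal character, then $\chi^{(v)}(n)$ is the principal character mod $p_{v}^{l_{v}}$. Take
$$n\equiv1 \pmod{p_{v}^{l_{v}}},\quad1\le v\le s-1,\qquad n\equiv a \pmod{p_{s}^{l_{s}}},$$
and we see that for all $a$ $(p_{s}\nmid a)$,
$$\chi^{(s)}(a)=1,$$
that is, $\chi^{(s)}$ is the principal character mod $p_{s}^{l_{s}}$. The theorem is proved. $\square$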
Theorem 2.2. If $n\not\equiv1 \pmod m$, then we can select, from among the $\varphi(m)$ characters, a $\chi(n)$ such that $\chi(n)\neq1$.

Proof. From the hypothesis there must exist a prime $p_{v}$ such that $n\not\equiv1 \pmod{p_{v}^{l_{v}}}$, and from earlier there exists $\chi^{(v)}(n)\neq1$. If $\mu\neq v$ we take $\chi^{(\mu)}$ to be the principal character, and now
$$\chi(n)=\prod_{v=1}^{s}\chi^{(v)}(n)$$
is the required character. $\square$

Theorem 2.3.
$$\sum_{n}\chi(n)=\begin{cases}\varphi(m),&\text{if }\chi=\chi_{0},\\0,&\text{if }\chi\neq\chi_{0},\end{cases}$$
where the sum is over a complete set of residues mod $m$.

Proof. The theorem is obvious if $\chi=\chi_{0}$. When $\chi\neq\chi_{0}$, there must be an integer $a$ such that $(a,m)=1$ and $\chi(a)\neq1$. From
$$\chi(a)\sum_{n}\chi(n)=\sum_{n}\chi(an)=\sum_{n}\chi(n),$$
or
$$(\chi(a)-1)\sum_{n}\chi(n)=0,$$
the theorem follows. $\square$
Theorem 2.4. Let $c$ denote the total number of characters mod $m$. Then
$$\sum_{\chi}\chi(n)=\begin{cases}c,&\text{if }n\equiv1 \pmod m,\\0,&\text{if }n\not\equiv1 \pmod m,\end{cases}$$
where the sum is over all the characters.

Proof. From $n^{\varphi(m)}\equiv1 \pmod m$ we deduce that
$$(\chi(n))^{\varphi(m)}=1,$$
so that the number of characters $c$ is finite. If $n\equiv1 \pmod m$, then the theorem is obvious. If $n\not\equiv1 \pmod m$, from Theorem 2.2 there is a character $X$ such that $X(n)\neq1$. From
$$X(n)\sum_{\chi}\chi(n)=\sum_{\chi}X(n)\chi(n)=\sum_{\chi}\chi(n),$$
we have
$$(X(n)-1)\sum_{\chi}\chi(n)=0,$$
and the theorem is proved. $\square$
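For a prime modulus the characters $\chi_{a}(n)=e^{2\pi ia\operatorname{ind}n/(p-1)}$ can be tabulated explicitly, and Theorems 2.3 and 2.4 then become finite checks. The following is a sketch for an odd prime $p$ only; `primitive_root` and `characters` are illustrative helper names, and the primitive root is found by brute force.

```python
import cmath

def primitive_root(p):
    """Smallest primitive root of an odd prime p, by brute force."""
    for g in range(2, p):
        if len({pow(g, k, p) for k in range(1, p)}) == p - 1:
            return g
    raise ValueError("no primitive root found; p is assumed to be an odd prime")

def characters(p):
    """The p - 1 characters chi_a(n) = exp(2*pi*i*a*ind(n)/(p-1)) mod p."""
    g = primitive_root(p)
    ind = {pow(g, k, p): k for k in range(p - 1)}   # index table

    def make(a):
        def chi(n):
            if n % p == 0:
                return 0                            # convention: chi(n) = 0 if p | n
            return cmath.exp(2j * cmath.pi * a * ind[n % p] / (p - 1))
        return chi

    return [make(a) for a in range(p - 1)]

p = 11
chars = characters(p)
# Theorem 2.3: the sum over n is phi(p) = p - 1 for chi_0 and 0 otherwise
print([round(abs(sum(chi(n) for n in range(1, p + 1))), 9) for chi in chars])
# Theorem 2.4: the sum over characters is c = p - 1 at n = 1 and 0 otherwise
print([round(abs(sum(chi(n) for chi in chars)), 9) for n in range(1, p)])
```

Here `chars[0]` is the principal character, and the count of characters is $p-1=\varphi(p)$, in line with Theorem 2.5.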
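Theorem 2.5. The total number of characters is $c=\varphi(m)$.

Proof. From Theorems 2.3 and 2.4 we have
$$\varphi(m)=\sum_{\chi}\sum_{n}\chi(n)=\sum_{n,\chi}\chi(n)=\sum_{n}\sum_{\chi}\chi(n)=c. \qquad\square$$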
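Definition. We call (1) the standard factorization of a character. More specifically we let
$$\chi(n,p^{l})=e^{2\pi i\operatorname{ind}n/\varphi(p^{l})},\qquad\chi_{1}(n,2^{l})=(-1)^{\frac12(n-1)},\qquad\chi_{2}(n,2^{l})=e^{2\pi ib/2^{l-2}}$$
(the definition of $b$ is given in Theorem 3.9.3). Let $m=2^{a}\prod_{p_{v}}p_{v}^{l_{v}}$ be the standard factorization of $m$. Then any character $\chi(n)$ mod $m$ has the factorization:
$$\chi(n)=\begin{cases}\displaystyle\prod_{p_{v}}(\chi(n,p_{v}^{l_{v}}))^{c_{v}},&\text{if }a=0,1,\\[2mm]\displaystyle(\chi_{1}(n,2^{a}))^{c_{0}}\prod_{p_{v}}(\chi(n,p_{v}^{l_{v}}))^{c_{v}},&\text{if }a=2,\\[2mm]\displaystyle(\chi_{1}(n,2^{a}))^{c_{0}}(\chi_{2}(n,2^{a}))^{c_{0}'}\prod_{p_{v}}(\chi(n,p_{v}^{l_{v}}))^{c_{v}},&\text{if }a\ge3,\end{cases}$$
$(c_{0}=0,1,\ \ldots)$.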
Exercise 1. If X =I Xo, then for any two positive integers u and v (v
~
u) we have

Exercise 2. If $(l,m)=1$, then
$$\sum_{\chi}\frac{\chi(n)}{\chi(l)}=\begin{cases}\varphi(m),&\text{when }n\equiv l \pmod m,\\0,&\text{when }n\not\equiv l \pmod m.\end{cases}$$
7.3 Types of Characters

Definition. A character $\chi(n)$ mod $m$ is said to be primitive if, for every divisor $M$ of $m$, $0<M<m$, there exists an integer $a$ satisfying
$$a\equiv1 \pmod M,\qquad(a,m)=1,\qquad\chi(a)\neq1.$$
A character which is not primitive is called an improper character.

Example 1. If $m>1$, then the principal character mod $m$ is improper, since 1 is a divisor of $m$.

Example 2. If $m=p$, then any non-principal character mod $p$ is primitive.

Example 3. If $m=p^{l}$ $(l>1)$ and $p$ is an odd prime, then a necessary and sufficient condition for the character
$$\chi_{a}(n)=e^{2\pi ia\operatorname{ind}n/\varphi(m)}$$
to be improper is that $p\mid a$. Thus, every improper character mod $p^{l}$ induces a character mod $p^{l-1}$.

Example 4. $m=2^{l}$. If $l=1$, then there is only the principal character. If $l=2$, then the non-principal character
$$\chi(1)=1,\qquad\chi(3)=-1$$
is primitive. When $l\ge3$, if
$$\chi_{a,c}(n)=(-1)^{\frac12(n-1)a}\,e^{2\pi icb/2^{l-2}}$$
is an improper character, then
$$\chi_{a,c}(n)=\chi_{a,c}(n+2^{l-1}),$$
and the converse also holds. That is
$$(-1)^{\frac12(n-1)a}e^{2\pi icb/2^{l-2}}=(-1)^{\frac12a(n-1+2^{l-1})}e^{2\pi icb'/2^{l-2}}=(-1)^{\frac12a(n-1)}e^{2\pi icb'/2^{l-2}},$$
or
$$c(b-b')\equiv0 \pmod{2^{l-2}},$$
where the definition of $b'$ is given by
$$n+2^{l-1}\equiv(-1)^{\frac12(n-1)}5^{b'} \pmod{2^{l}}.$$
From
$$n+2^{l-1}\equiv n+n2^{l-1}\equiv n(1+2^{l-1})\equiv n5^{2^{l-3}} \pmod{2^{l}},$$
we have
$$b'\equiv b+2^{l-3} \pmod{2^{l-2}}.$$
That is, a necessary and sufficient condition for $\chi_{a,c}(n)$ to be primitive is that $2\nmid c$.
Let us take a more specific example. When $l=3$,
$$\chi_{a,c}(n)=(-1)^{\frac12(n-1)a+cb},$$
where $b=0,1,1,0$ when $n=1,3,5,7$. If $c=1$, then
$$\chi_{a,1}(1)=1,\qquad\chi_{a,1}(3)=-(-1)^{a},\qquad\chi_{a,1}(5)=-1,\qquad\chi_{a,1}(7)=(-1)^{a}$$
are primitive characters, and we can simply write them as
$$\chi_{0,1}(n)=\left(\frac{2}{n}\right)\qquad\text{and}\qquad\chi_{1,1}(n)=\left(\frac{-2}{n}\right).$$
When $c=0$, $a=1$,
$$\chi_{1,0}(1)=1,\qquad\chi_{1,0}(3)=-1,\qquad\chi_{1,0}(5)=1,\qquad\chi_{1,0}(7)=-1$$
is an improper character, that is $\chi_{1,0}(n)=\left(\frac{-1}{n}\right)$.

In the character representation in §2 we have
$$\chi(n)=\prod_{v}\chi^{(v)}(n).$$
If one of the characters $\chi^{(v)}(n)$ is improper, then $\chi(n)$ itself is also improper. Conversely, if $\chi(n)$ is an improper character, then at least one of the characters $\chi^{(v)}(n)$ is improper. We next investigate the situation under which there is a real valued primitive character. If a character is real, then each of its factor characters is also real. When $p$ is an odd prime, in
$$(\chi(n,p^{l}))^{c_{v}}$$
the value $c_{v}$ must be a multiple of $\varphi(p^{l})/2$. If this character is also primitive, then from Example 3, $l$ must be equal to 1. Suppose that
$$(\chi_{1}(n,2^{l}))^{c_{0}}(\chi_{2}(n,2^{l}))^{c_{0}'}$$
is a real character. Then we must have
$$2^{l-3}\mid c_{0}'.$$
If this character is also primitive, then from Example 4, we must have $l\le3$. Therefore there can be no real primitive character if $l>3$. There cannot be any real primitive character either if $l=1$. For if $m=2m'$, $2\nmid m'$, then from
$$n\equiv n' \pmod{m'},\qquad(n,m)=1,\qquad(n',m)=1$$
we deduce that $n\equiv n' \pmod m$, giving $\chi(n)=\chi(n')$, so that $\chi(n)$ is improper. Summarizing, the possibility for the existence of a real primitive character occurs when
$$m=2^{a}p_{1}\cdots p_{s},$$
where the $p_{i}$ are distinct odd primes and $a=0,2,3$. Moreover, if the character is primitive, then $c_{v}=\varphi(p)/2$ or
$$(\chi(n,p))^{\frac12(p-1)}=e^{\pi i\operatorname{ind}n}=\left(\frac{n}{p}\right).$$
Thus, if $a=0$, then the real primitive character is the Jacobi symbol
$$\left(\frac{n}{m}\right),\qquad(n,m)=1.$$
If $a=2$, then the real primitive character is
$$(-1)^{\frac12(n-1)}\left(\frac{n}{m/4}\right),\qquad(n,m)=1,$$
and if $a=3$, then there are two types of real primitive character:
$$(-1)^{\frac18(n^{2}-1)}\left(\frac{n}{m/8}\right),\qquad(n,m)=1,$$
$$(-1)^{\frac12(n-1)+\frac18(n^{2}-1)}\left(\frac{n}{m/8}\right)=(-1)^{\frac18((n+2)^{2}-9)}\left(\frac{n}{m/8}\right),\qquad(n,m)=1.$$
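7.4 Character Sums

Let
$$S(a,\chi)=\sum_{n=1}^{m}\chi(n)e^{2\pi ian/m}.$$

Theorem 4.1. Let $(m_{1},m_{2})=1$, $m=m_{1}m_{2}$, and let $\chi$ be factorized into
$$\chi(n)=\chi_{1}(n)\chi_{2}(n),$$
where $\chi_{1}(n)$ is a character mod $m_{1}$ and $\chi_{2}(n)$ is a character mod $m_{2}$. Then
$$S(a,\chi)=\chi_{1}(m_{2})\chi_{2}(m_{1})S(a,\chi_{1})S(a,\chi_{2}).$$

Proof. Let $n=m_{1}n_{2}+m_{2}n_{1}$. Then as $n_{1},n_{2}$ run over the complete sets of residues mod $m_{1}$, mod $m_{2}$ respectively, $n$ runs over the complete set of residues mod $m_{1}m_{2}$. Therefore
$$S(a,\chi)=\chi_{1}(m_{2})\chi_{2}(m_{1})\sum_{n_{1}=1}^{m_{1}}\sum_{n_{2}=1}^{m_{2}}\chi_{1}(n_{1})\chi_{2}(n_{2})e^{2\pi ia(m_{1}n_{2}+m_{2}n_{1})/m_{1}m_{2}}=\chi_{1}(m_{2})\chi_{2}(m_{1})S(a,\chi_{1})S(a,\chi_{2}). \qquad\square$$

Thus the study of character sums mod $m$ is reduced to that of character sums to a prime power modulus.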
Theorem 4.2. Let $m=p^{l}$. If $p\mid a$ and $\chi$ is a primitive character, or if $p\nmid a$ and $\chi$ is an improper character (but we exclude the case $l=1$, $\chi=\chi_{0}$), then
$$S(a,\chi)=0.$$

Proof. We make the substitution
$$n=x(1+p^{l-1}y).$$
When $1\le x\le p^{l-1}$, $p\nmid x$ and $1\le y\le p$, the number $n$ runs over the reduced residue system mod $p^{l}$, and conversely. Therefore
$$S(a,\chi)=\sum_{\substack{x=1\\ p\nmid x}}^{p^{l-1}}\chi(x)e^{2\pi iax/p^{l}}\sum_{y=1}^{p}\chi(1+p^{l-1}y)e^{2\pi iaxy/p}.$$
If $\chi(n)$ is improper, then $\chi(1+p^{l-1}y)=1$, so that
$$S(a,\chi)=\begin{cases}0,&\text{if }p\nmid a,\\ \displaystyle p\sum_{\substack{x=1\\ p\nmid x}}^{p^{l-1}}\chi(x)e^{2\pi iax/p^{l}},&\text{if }p\mid a.\end{cases}$$
If $\chi(n)$ is primitive, then there exists $u$ such that $\chi(1+p^{l-1}u)\neq1$; now $p\mid a$ and so from
$$\chi(1+p^{l-1}u)\sum_{y=1}^{p}\chi(1+p^{l-1}y)=\sum_{y=1}^{p}\chi(1+p^{l-1}(y+u))=\sum_{y=1}^{p}\chi(1+p^{l-1}y),$$
we have
$$\sum_{y=1}^{p}\chi(1+p^{l-1}y)=0.$$
Therefore $S(a,\chi)=0$ also. $\square$

We shall write $\tau(\chi)=S(1,\chi)$.
If $(a,m)=1$, then
$$\chi(a)S(a,\chi)=\sum_{n=1}^{m}\chi(an)e^{2\pi ian/m}=S(1,\chi).$$

Theorem 4.3. Let
$$c_{q}(n)=\sum_{(a,q)=1}e^{2\pi ian/q},$$
where $a$ runs over a reduced set of residues mod $q$. Then

1) $c_{q}(n)$ is a multiplicative function of $q$; that is, if $(q_{1},q_{2})=1$, then
$$c_{q_{1}}(n)c_{q_{2}}(n)=c_{q_{1}q_{2}}(n);$$

2)
$$c_{p^{l}}(n)=\begin{cases}p^{l}-p^{l-1},&\text{if }p^{l}\mid n,\\-p^{l-1},&\text{if }p^{l}\nmid n,\ p^{l-1}\mid n,\\0,&\text{if }p^{l-1}\nmid n;\end{cases}$$

3)
$$c_{q}(n)=\sum_{d\mid(q,n)}\mu\!\left(\frac{q}{d}\right)d.$$

Proof. 1) can be proved by the substitution $a=q_{1}a_{2}+q_{2}a_{1}$ with the familiar method described earlier. 2) follows from
$$c_{p^{l}}(n)=\sum_{a=1}^{p^{l}}e^{2\pi ian/p^{l}}-\sum_{a=1}^{p^{l-1}}e^{2\pi ian/p^{l-1}}.$$
3) follows from 1) and 2). $\square$

Theorem 4.4. If $\chi(n)$ is a primitive character mod $m$, then
$$|\tau(\chi)|=\sqrt m.$$

Proof. First consider the case $m=p^{l}$. We have easily
$$|\tau(\chi)|^{2}=\tau(\chi)\overline{\tau(\chi)}=\sum_{n=1}^{p^{l}}\chi(n)e^{2\pi in/p^{l}}\sum_{q=1}^{p^{l}}\bar\chi(q)e^{-2\pi iq/p^{l}}$$
$$=\sum_{n=1}^{p^{l}}\chi(n)e^{2\pi in/p^{l}}\sum_{q=1}^{p^{l}}\bar\chi(nq)e^{-2\pi inq/p^{l}}=\sum_{q=1}^{p^{l}}\bar\chi(q)\sum_{\substack{n=1\\ p\nmid n}}^{p^{l}}e^{2\pi i(1-q)n/p^{l}}.$$
If $p^{l-1}\nmid(q-1)$, then from Theorem 4.3, the inner sum on the right hand side in the above is 0. We need therefore only examine the situation when $p^{l-1}\mid(q-1)$, that is $q=1+p^{l-1}u$, $0\le u\le p-1$. But now clearly
$$|\tau(\chi)|^{2}=p^{l}-p^{l-1}-\sum_{u=1}^{p-1}\bar\chi(1+p^{l-1}u)p^{l-1}=p^{l}-p^{l-1}\sum_{u=1}^{p}\bar\chi(1+p^{l-1}u).$$
Now if $\chi(n)$ is primitive, then there exists $v$ such that $\chi(1+p^{l-1}v)\neq0,1$, so that $\bar\chi(1+p^{l-1}v)\neq0,1$. From
$$\bar\chi(1+p^{l-1}v)\sum_{u=1}^{p}\bar\chi(1+p^{l-1}u)=\sum_{u=1}^{p}\bar\chi(1+p^{l-1}(u+v))=\sum_{u=1}^{p}\bar\chi(1+p^{l-1}u),$$
we have
$$\sum_{u=1}^{p}\bar\chi(1+p^{l-1}u)=0.$$
Therefore the case $m=p^{l}$ is proved, and the general case follows at once from Theorem 4.1. $\square$
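Part 2) of Theorem 4.3 and the modulus in Theorem 4.4 lend themselves to a direct numerical check. The sketch below does this for a few prime powers, and for the Legendre symbol as a sample primitive character mod $p$; all function names are illustrative and not taken from the text.

```python
import cmath
from math import gcd, sqrt

def ramanujan_sum(q, n):
    """c_q(n): sum of exp(2*pi*i*a*n/q) over a reduced residue system mod q."""
    s = sum(cmath.exp(2j * cmath.pi * a * n / q)
            for a in range(1, q + 1) if gcd(a, q) == 1)
    return round(s.real)

def prime_power_formula(p, l, n):
    """The value of c_{p^l}(n) stated in Theorem 4.3, part 2)."""
    if n % p ** l == 0:
        return p ** l - p ** (l - 1)
    if n % p ** (l - 1) == 0:
        return -p ** (l - 1)
    return 0

def legendre(x, p):
    """Legendre symbol (x/p) for an odd prime p, by Euler's criterion."""
    r = pow(x, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

def tau_legendre(p):
    """tau(chi) = S(1, chi) for chi = Legendre symbol mod p."""
    return sum(legendre(n, p) * cmath.exp(2j * cmath.pi * n / p)
               for n in range(1, p))

print(all(ramanujan_sum(p ** l, n) == prime_power_formula(p, l, n)
          for p in (2, 3, 5) for l in (1, 2, 3) for n in range(1, 130)))
print([round(abs(tau_legendre(p)) ** 2, 6) for p in (3, 5, 7, 11, 13)])
```

The second line printed should reproduce the primes themselves, in agreement with $|\tau(\chi)|^{2}=p$.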
We see therefore that
$$\tau(\chi)=\varepsilon\sqrt m,\qquad|\varepsilon|=1.$$
However, the determination of $\varepsilon$ is no easy matter. For real primitive characters we know much more, and in the next section we shall determine $\varepsilon$ when $\chi$ is a real primitive character.

Theorem 4.5. Let $\chi$ be a real primitive character. Then, for odd $m$, we have
$$\tau(\chi)=\begin{cases}\pm\sqrt m,&\text{if }m\equiv1 \pmod4,\\ \pm i\sqrt m,&\text{if }m\equiv3 \pmod4.\end{cases}$$

Proof. This is similar to the proof of Theorem 4.4. If $m=p$, then
$$(\tau(\chi))^{2}=\sum_{q=1}^{p-1}\chi(q)\sum_{n=1}^{p-1}e^{2\pi i(1+q)n/p}=\chi(-1)p.$$
We already have
$$\chi(-1)=\left(\frac{-1}{p}\right)=(-1)^{\frac12(p-1)},$$
so that the theorem follows. $\square$

7.5 Gauss Sums

The trigonometric sum
$$S(n,m)=\sum_{x=0}^{m-1}e^{2\pi ix^{2}n/m},\qquad(n,m)=1,$$
is the famous Gauss sum. In this formula the summation can be taken over any complete set of residues mod $m$.

Theorem 5.1. If $(m,m')=1$, then
$$S(n,mm')=S(nm',m)S(nm,m').$$

Proof. Let $x=my+m'z$. Then
$$S(n,mm')=\sum_{x=1}^{mm'}e^{2\pi ix^{2}n/mm'}=\sum_{y=1}^{m'}\sum_{z=1}^{m}e^{2\pi in(my+m'z)^{2}/mm'}=\sum_{y=1}^{m'}e^{2\pi imny^{2}/m'}\sum_{z=1}^{m}e^{2\pi im'nz^{2}/m},$$
and hence the result. $\square$

We see that in order to evaluate a Gauss sum we need only deal with the case $m=p^{l}$.

Theorem 5.2. Let
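$$\delta=\begin{cases}1,&\text{when }p\text{ is an odd prime},\\2,&\text{when }p=2.\end{cases}$$
Then, for $l\ge2\delta$, we have
$$S(n,p^{l})=pS(n,p^{l-2}).$$

Proof. Let $x=y+p^{l-\delta}z$. Then, from $2(l-\delta)\ge l$, we have
$$S(n,p^{l})=\sum_{y=1}^{p^{l-\delta}}\sum_{z=1}^{p^{\delta}}e^{2\pi i(y^{2}+2p^{l-\delta}yz)n/p^{l}}=\sum_{y=1}^{p^{l-\delta}}e^{2\pi iy^{2}n/p^{l}}\sum_{z=1}^{p^{\delta}}e^{4\pi iyzn/p^{\delta}}$$
$$=p^{\delta}\sum_{\substack{y=1\\ p\mid y}}^{p^{l-\delta}}e^{2\pi iy^{2}n/p^{l}}=p^{\delta}\sum_{x=1}^{p^{l-\delta-1}}e^{2\pi ix^{2}n/p^{l-2}}.$$
When $p>2$, this is what is required. When $p=2$, then from
$$p\sum_{x=1}^{p^{l-3}}e^{2\pi ix^{2}n/p^{l-2}}=\sum_{x=1}^{p^{l-2}}e^{2\pi ix^{2}n/p^{l-2}},$$
the result also follows. $\square$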
From this theorem we see that the crucial points in the evaluation of a Gauss sum rest on the determination of
$$S(n,2),\qquad S(n,4),\qquad S(n,8)$$
and $S(n,p)$, $p$ an odd prime.

Theorem 5.3. If $2\nmid n$, then
$$S(n,2)=0,\qquad S(n,4)=2(1+i^{n}),\qquad S(n,8)=4e^{\frac{\pi i}{4}n}.$$

Proof. Clearly we have
$$S(n,2)=1+e^{\frac{2\pi i}{2}n}=1-1=0,$$
$$S(n,4)=1+e^{\frac{2\pi i}{4}n}+e^{\frac{2\pi i}{4}4n}+e^{\frac{2\pi i}{4}9n}=1+i^{n}+1+i^{n}=2(1+i^{n}),$$
$$S(n,8)=2\left(1+e^{\frac{2\pi i}{8}n}+e^{\frac{2\pi i}{8}4n}+e^{\frac{2\pi i}{8}9n}\right)=4e^{\frac{2\pi i}{8}n}=4e^{\frac{\pi i}{4}n}. \qquad\square$$
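Theorem 5.4. If $p$ is an odd prime, then
$$S(n,p)=\left(\frac{n}{p}\right)S(1,p)=\left(\frac{n}{p}\right)\tau(\chi).$$
Here $\chi(a)=\left(\frac{a}{p}\right)$.

Proof. The number of solutions to the congruence
$$x^{2}\equiv u \pmod p$$
is
$$1+\left(\frac{u}{p}\right),$$
and therefore
$$\sum_{x=1}^{p}e^{2\pi ix^{2}n/p}=\sum_{u=1}^{p}\left(1+\left(\frac{u}{p}\right)\right)e^{2\pi iun/p}=\sum_{u=1}^{p}\left(\frac{u}{p}\right)e^{2\pi iun/p}=\left(\frac{n}{p}\right)\sum_{v=1}^{p}\left(\frac{v}{p}\right)e^{2\pi iv/p},$$
which is the required result. $\square$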
Theorem 5.5.
$$S(1,p)=\begin{cases}\sqrt p,&\text{if }p\equiv1 \pmod4,\\ i\sqrt p,&\text{if }p\equiv3 \pmod4.\end{cases}$$

Proof. From the above theorem and Theorem 4.5 we have
$$S(1,p)=\begin{cases}\pm\sqrt p,&\text{if }p\equiv1 \pmod4,\\ \pm i\sqrt p,&\text{if }p\equiv3 \pmod4,\end{cases}$$
which, combining into a single formula, gives
$$\tfrac12(1+i^{p})(1-i)S(1,p)=\pm\sqrt p.$$
If we can prove that
$$\Re\{\tfrac12(1+i^{p})(1-i)S(1,p)\}>-\sqrt p,$$
where $\Re\{x\}$ represents the real part of $x$, then the theorem will follow. Now it is easy to see that
$$S(1,p)-1=\sum_{x=1}^{p-1}e^{2\pi ix^{2}/p}=\sum_{x=1}^{\frac12(p-1)}\left(e^{2\pi ix^{2}/p}+e^{2\pi i(p-x)^{2}/p}\right)=2\sum_{x=1}^{\frac12(p-1)}e^{2\pi ix^{2}/p}. \tag{1}$$
Let $f(x)$ be any function. Then
$$\sum_{x=1}^{\frac12(p-1)}f(x)+\sum_{x=1}^{\frac12(p-1)}f\!\left(\frac{p}{2}-x\right)=\sum_{x=1}^{p-1}f\!\left(\frac{x}{2}\right).$$
This formula clearly holds because the first term on the left hand side is merely the sum of those terms on the right hand side when $x$ is even, and the second term is the sum on the right hand side when $x$ is odd. We take $f(x)=e^{2\pi ix^{2}/p}$ and note that $f\!\left(\frac{p}{2}-x\right)=i^{p}e^{2\pi ix^{2}/p}$. Then, from (1), we have
$$\tfrac12(1+i^{p})(S(1,p)-1)=\sum_{x=1}^{p-1}e^{2\pi ix^{2}/4p}=W+Z, \tag{2}$$
where
$$W=\sum_{x\le\sqrt p}e^{2\pi ix^{2}/4p},\qquad Z=\sum_{\sqrt p<x\le p-1}e^{2\pi ix^{2}/4p}. \tag{3}$$
From (2) we have
$$\tfrac12(1+i^{p})(1-i)S(1,p)-\tfrac12(1+i^{p})(1-i)=(1-i)(W+Z).$$
Since $\Re\{\tfrac12(1+i^{p})(1-i)\}$ is 1 or 0, it follows that
$$\Re\{\tfrac12(1+i^{p})(1-i)S(1,p)\}\ge\Re\{(1-i)(W+Z)\}\ge\Re\{(1-i)W\}-\sqrt2\,|Z|. \tag{4}$$
From cos x
+ sin x ;::::
1 when 0
9l{(1  i)W}
~
~
x
n12, we deduce that
nx2 nx2) 1 r L  ( cos+ sin ;:::: [vPJ;:::: yp. 2p 2p 2
=
(5)
Jp On the other hand, if we write in Z, x:S;;
nx 2p
= cosec,
Wx
then (6)
Therefore, from (3) and (6) we have pl
L
2iZ =
x~q+
(v x 
Vx 
dwx,
1
that is
21Z1
=
Pil
I
viwx 
+ VplWp 
Wx + l )
VqWq+ll
x~q+l
pl
I
~
(Wx 
Wx+l)
+ Wp + W q + l = 2wq + l
x~q+l
r:
2p q+l
~~2vp
(because
Wx
(7)
is decreasing). From (4), (5) and (7) we finally have
The theorem is therefore proved.
0
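Summarizing we have the following result:

Theorem 5.6. If $m$ is odd, then
$$S(n,m)=\begin{cases}\left(\dfrac{n}{m}\right)\sqrt m,&\text{if }m\equiv1 \pmod4,\\[2mm] i\left(\dfrac{n}{m}\right)\sqrt m,&\text{if }m\equiv3 \pmod4.\end{cases}$$

Proof. We use induction on the number of distinct prime divisors of $m$. If $m=p^{l}$, then we have by Theorems 5.2 and 5.4, that
$$S(n,p^{l})=\begin{cases}p^{\frac l2},&\text{if }2\mid l,\\[1mm] p^{\frac12(l-1)}S(n,p)=\left(\dfrac{n}{p}\right)p^{\frac12(l-1)}S(1,p),&\text{if }2\nmid l,\end{cases}$$
$$=\begin{cases}p^{\frac l2},&\text{if }2\mid l,\\[1mm]\left(\dfrac{n}{p}\right)p^{\frac l2},&\text{if }2\nmid l,\ p\equiv1 \pmod4,\\[2mm] i\left(\dfrac{n}{p}\right)p^{\frac l2},&\text{if }2\nmid l,\ p\equiv3 \pmod4.\end{cases}$$
Moreover, from Theorem 5.1 and the induction hypothesis, we have
$$S(n,mm')=S(nm',m)S(nm,m')=\left(\frac{nm'}{m}\right)i^{\left(\frac{m-1}{2}\right)^{2}}\sqrt m\cdot\left(\frac{nm}{m'}\right)i^{\left(\frac{m'-1}{2}\right)^{2}}\sqrt{m'}$$
$$=\left(\frac{m'}{m}\right)\left(\frac{m}{m'}\right)\left(\frac{n}{mm'}\right)i^{\left(\frac{m-1}{2}\right)^{2}+\left(\frac{m'-1}{2}\right)^{2}}\sqrt{mm'}=\begin{cases}\left(\dfrac{n}{mm'}\right)\sqrt{mm'},&\text{if }mm'\equiv1 \pmod4,\\[2mm] i\left(\dfrac{n}{mm'}\right)\sqrt{mm'},&\text{if }mm'\equiv3 \pmod4.\end{cases}$$
(Here we have used the law of quadratic reciprocity.) $\square$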
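Theorem 5.7. If $2\nmid n$, then
$$S(n,2^{l})=\begin{cases}0,&\text{if }l=1,\\(1+i^{n})2^{\frac l2},&\text{if }l\text{ is even},\\2^{\frac{l+1}{2}}e^{\frac{\pi i}{4}n},&\text{if }l>1\text{ and odd}.\end{cases}$$

Proof. From Theorem 5.3 we see that the result holds when $l=1,2,3$. When $l>3$ the result follows from Theorems 5.2 and 5.3. $\square$

Theorem 5.8. Let $\chi(n)$ be a real primitive character mod $m$. Then
$$\tau(\chi)=\begin{cases}\sqrt m,&\text{if }\chi(-1)=1,\\ i\sqrt m,&\text{if }\chi(-1)=-1.\end{cases}$$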
Proof From §3 we know that m can be written as m = 2am', where a = 0, 2, 3 and m' is a product of distinct primes; moreover 1) ifa=O,then x(n) = ( : ) .
(n,m) = 1;
2) if a = 2, then nl(n) x(n) = (  1)2m' ,
(n,m)
= 1;
3) if a = 3, then x(n) = (_l)t
or
(n,m) = 1.
Here (:) and (;,) are Jacobi's symbols. We now consider the three separate cases. 1) a = O. Let m = PI ... Ps and we use induction on s. When s = 1 the result follows from Theorems 5.4 and 5.5. Let s> 1 and put m = Plm'. Then, from Theorem 4.1 we have
where Xl> X2 have the moduli PI, m' respectively, and x(n) = XI(n)X2(n). Therefore, from Theorem 3.6.4 and the induction hypothesis, we have
{fi:} ifi: . {P} iP ~~ {fi:} {P} ifi: iP
r(x) = ( m')(PI)  . PI m' = (
1)
2
2'
•
== 1 (mod 4) if m == 3 (mod 4) 2) a = 2. Let m = 22m'. If m' = 1, then x(l) = 1, if m
{ Jp1m' =Fm, = iJplm' = iFm,
or
X(l)=l,
or
X(  1) =  1.
X(3)
=  1 so that
4
I
r(x) =
x(n)e21tin/4
= e21ti /4  e61ti /4 = 2i.
n= I
If m' > 1, then from Theorem 4.1 and 1)
m'1(4)
r(x) = ( 1)2 m' 2i
.{P=i
== 1 (mod 4) if m' == 3 (mod 4) if m'
Fm, ip=Fm,
3) a
or
X(  1)
=  1,
or
X(  1)
=
1.
= 3. Let m = 23 m'. When m' = 1, we have B
r(x)=
I
n= I
.
x(n)e 21t1n /B =
{e 21ti /B _ e61ti /B _ el 01ti/8 + eI41ti/8 = j8, if X(  1) = 1,
.
.
.
.
e21t '/B + e6",/B  el 0",/8  eI41t ,/8 = ij8, if X(  1) =
Suppose that m' > 1. If x(n) = ( 1)t
 1.
T(X)
= (
1)~m'21)(~,)j8
{P=fo, iP=ifo,
=I m' = 3
if m'
(mod 4)
or
X(I)=I,
if
(mod 4)
or
X(  I)
If x(n) = (
I}Hnl)+~n21)(;,), then
= (
I}Hm'1)+~m'21)(~,}j8
T(X)
.{P=i fo , ip=fo,
= 
1.
if m'
=I
(mod 4)
or
X(  I) =  I,
if m'
=3
(mod 4)
or
X(  I)
Collecting I), 2) and 3) the theorem is proved.
=
1.
D
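The closed forms in Theorems 5.6 and 5.7 can be compared with the defining sum $S(n,m)=\sum_{x=0}^{m-1}e^{2\pi ix^{2}n/m}$. The sketch below is an illustration under the standing assumption $(n,m)=1$; the Jacobi symbol is computed with the usual binary algorithm, and the function names are not from the text.

```python
import cmath
from math import gcd, sqrt

def gauss_sum(n, m):
    """S(n, m), computed directly from the definition."""
    return sum(cmath.exp(2j * cmath.pi * ((x * x * n) % m) / m) for x in range(m))

def jacobi(a, m):
    """Jacobi symbol (a/m) for odd m > 0."""
    a %= m
    result = 1
    while a:
        while a % 2 == 0:            # pull out factors of 2 using (2/m)
            a //= 2
            if m % 8 in (3, 5):
                result = -result
        a, m = m, a                  # quadratic reciprocity
        if a % 4 == 3 and m % 4 == 3:
            result = -result
        a %= m
    return result if m == 1 else 0

def predicted_odd(n, m):
    """Theorem 5.6: (n/m) sqrt(m) or i (n/m) sqrt(m), for odd m."""
    unit = 1 if m % 4 == 1 else 1j
    return unit * jacobi(n, m) * sqrt(m)

def predicted_power_of_two(n, l):
    """Theorem 5.7, for odd n."""
    if l == 1:
        return 0
    if l % 2 == 0:
        return (1 + 1j ** n) * 2 ** (l // 2)
    return 2 ** ((l + 1) // 2) * cmath.exp(1j * cmath.pi * n / 4)

worst = max(abs(gauss_sum(n, m) - predicted_odd(n, m))
            for m in (3, 5, 7, 9, 15, 21, 45)
            for n in range(1, m) if gcd(n, m) == 1)
print(worst)  # should be at rounding-error level
```

For composite odd $m$ this also exercises the multiplicative step of Theorem 5.1 implicitly, since the direct sum knows nothing about the factorization of $m$.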
7.6 Character Sums and Trigonometric Sums

We have seen in the previous section the relationship between Gauss sums and character sums. We now proceed to establish certain relationships between trigonometric sums and character sums.

Theorem 6.1. Let $p$ be a prime, and $d\mid p-1$. Then a necessary and sufficient condition for an integer $x$ to be a $d$th power nonresidue mod $p$ is that
$$\frac{1}{d}\sum_{a=1}^{d}e^{2\pi ia\operatorname{ind}x/d}=0;$$
otherwise the formula is equal to 1.

Proof. By Theorem 3.8.1 whether $x$ is a $d$th power residue or not depends on whether $d\mid\operatorname{ind}x$ or $d\nmid\operatorname{ind}x$. Using trigonometric sums this means that
$$\frac{1}{d}\sum_{a=1}^{d}e^{2\pi ia\operatorname{ind}x/d}=\begin{cases}1,&\text{if }x\text{ is a }d\text{th power residue mod }p,\\0,&\text{if }x\text{ is a }d\text{th power nonresidue mod }p.\end{cases} \qquad\square$$
Theorem 6.2. Let $p$ be a prime, $p\nmid a$, $(p-1,k)=d$. Then
$$\sum_{x=1}^{p}e^{2\pi iax^{k}/p}=\sum_{b=1}^{d-1}S(a,\chi^{b}),$$
where $\chi(u)=e^{2\pi i\operatorname{ind}u/d}$.

Proof. The congruence $x^{k}\equiv u \pmod p$ has either no root, or $d=(p-1,k)$ roots. Therefore, from Theorem 6.1 we have
$$\sum_{x=1}^{p}e^{2\pi iax^{k}/p}=1+\sum_{u=1}^{p-1}e^{2\pi iau/p}\sum_{b=1}^{d}e^{2\pi ib\operatorname{ind}u/d}=1+\sum_{b=1}^{d}\sum_{u=1}^{p-1}e^{2\pi iau/p}\chi^{b}(u)$$
$$=1+\sum_{u=1}^{p-1}e^{2\pi iau/p}+\sum_{b=1}^{d-1}\sum_{u=1}^{p-1}e^{2\pi iau/p}\chi^{b}(u)=\sum_{b=1}^{d-1}S(a,\chi^{b}). \qquad\square$$

From Theorem 4.4 we see that $|S(a,\chi^{b})|=\sqrt p$, so that we have:

Theorem 6.3. Let $d=(k,p-1)$. Then
$$\left|\sum_{x=1}^{p}e^{2\pi iax^{k}/p}\right|\le(d-1)\sqrt p.$$
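The bound of Theorem 6.3 is easy to test for small primes. The sketch below evaluates the sum directly for all $a$ with $p\nmid a$; `power_sum` and `bound` are hypothetical helper names used only for this illustration.

```python
import cmath
from math import gcd, sqrt

def power_sum(a, k, p):
    """sum over x = 1..p of exp(2*pi*i*a*x^k/p)."""
    return sum(cmath.exp(2j * cmath.pi * ((a * pow(x, k, p)) % p) / p)
               for x in range(1, p + 1))

def bound(k, p):
    """(d - 1) * sqrt(p) with d = (k, p - 1)."""
    return (gcd(k, p - 1) - 1) * sqrt(p)

for p in (7, 13, 31):
    for k in (2, 3, 4, 6):
        worst = max(abs(power_sum(a, k, p)) for a in range(1, p))
        print(p, k, round(worst, 4), round(bound(k, p), 4))
```

When $d=1$ the map $x\mapsto x^{k}$ permutes the residues mod $p$, so the sum vanishes and the bound $0$ is attained; for $k=2$ the sum is a Gauss sum and the bound $\sqrt p$ is again attained.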
Exercise. Study the trigonometric sum
$$\sum_{x=0}^{m-1}e^{2\pi ix^{k}n/m},\qquad(n,m)=1,$$
by following Theorems 5.1 and 5.2.
7.7 From Complete Sums to Incomplete Sums

Theorem 7.1. Let $g(x)$ be periodic with period $q$, and
$$g(x)=\begin{cases}1,&\text{if }0\le x<m,\\0,&\text{if }m\le x<q.\end{cases}$$
Then $g(x)$ is representable as
$$g(x)=\frac{m}{q}+\frac{1}{q}\sum_{n=1}^{q-1}e^{2\pi inx/q}\,\frac{1-e^{-2\pi inm/q}}{1-e^{-2\pi in/q}}.$$

Proof. Clearly
$$g(x)=\frac{1}{q}\sum_{n=0}^{q-1}e^{2\pi inx/q}\sum_{t=0}^{m-1}e^{-2\pi int/q}=\frac{m}{q}+\frac{1}{q}\sum_{n=1}^{q-1}e^{2\pi inx/q}\,\frac{1-e^{-2\pi inm/q}}{1-e^{-2\pi in/q}}. \qquad\square$$

Theorem 7.2. Let $\alpha$ be a real number and
$$S=\sum_{n=q'}^{q''-1}e^{2\pi in\alpha}.$$
Then
$$|S|\le\min\left(q''-q',\ \frac{1}{2\langle\alpha\rangle}\right),$$
where
$$\langle\alpha\rangle=\min(\alpha-[\alpha],\ [\alpha]+1-\alpha).$$

Proof. Clearly we have $|S|\le q''-q'$. If $\alpha\neq[\alpha]$, we let $Q=q''-q'$ so that
$$|S|=\left|\sum_{n=0}^{Q-1}e^{2\pi in\alpha}\right|=\left|\frac{1-e^{2\pi iQ\alpha}}{1-e^{2\pi i\alpha}}\right|\le\frac{2}{|1-e^{2\pi i\alpha}|}=\frac{1}{|\sin\pi\alpha|}\le\frac{1}{2\langle\alpha\rangle}$$
(when $0\le\xi\le\frac12$, $\sin\pi\xi\ge2\xi$, so that $|\sin\pi\alpha|\ge2\langle\alpha\rangle$). $\square$
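Theorem 7.3. If $2\nmid q$, then
$$\left|\sum_{x=0}^{m-1}e^{2\pi ix^{2}/q}-\frac{m}{q}\sum_{x=0}^{q-1}e^{2\pi ix^{2}/q}\right|\le\sqrt q\log q.$$

Proof. Clearly we can assume that $m\le q$. From Theorem 7.1 we have
$$\sum_{x=0}^{m-1}e^{2\pi ix^{2}/q}=\sum_{x=0}^{q-1}e^{2\pi ix^{2}/q}g(x)=\frac{m}{q}\sum_{x=0}^{q-1}e^{2\pi ix^{2}/q}+\frac{1}{q}\sum_{n=1}^{q-1}\sum_{x=0}^{q-1}e^{2\pi i(x^{2}+nx)/q}\,\frac{1-e^{-2\pi inm/q}}{1-e^{-2\pi in/q}}.$$
From the formula for a Gauss sum we have
$$\left|\sum_{x=0}^{q-1}e^{2\pi i(x^{2}+nx)/q}\right|=\left|\sum_{x=0}^{q-1}e^{2\pi i(x+\frac12n)^{2}/q}\right|\le\sqrt q,$$
where $\frac12$ represents the solution to the congruence $2x\equiv1 \pmod q$, so that
$$\left|\sum_{x=0}^{m-1}e^{2\pi ix^{2}/q}-\frac{m}{q}\sum_{x=0}^{q-1}e^{2\pi ix^{2}/q}\right|\le\frac{1}{\sqrt q}\sum_{n=1}^{q-1}\frac{1}{2\langle n/q\rangle}\le\frac{1}{\sqrt q}\sum_{n=1}^{\frac12(q-1)}\frac{q}{n}=\sqrt q\sum_{n=1}^{\frac12(q-1)}\frac{1}{n}$$
$$<\sqrt q\sum_{n=1}^{\frac12(q-1)}\left(-\log\left(1-\frac{1}{2n}\right)+\log\left(1+\frac{1}{2n}\right)\right)=\sqrt q\sum_{n=1}^{\frac12(q-1)}\bigl(\log(2n+1)-\log(2n-1)\bigr)=\sqrt q\log q. \qquad\square$$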
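Theorem 7.4 (Pólya). Let $p$ be an odd prime, $1\le m\le p$, and $\chi$ be a non-principal character mod $p$. Then
$$\left|\sum_{x=0}^{m-1}\chi(x)\right|<\sqrt p\log p.$$

Proof. From Theorem 7.1 we have
$$\sum_{x=0}^{m-1}\chi(x)=\sum_{x=0}^{p-1}\chi(x)g(x)=\frac{m}{p}\sum_{x=0}^{p-1}\chi(x)+\frac{1}{p}\sum_{n=1}^{p-1}\frac{1-e^{-2\pi inm/p}}{1-e^{-2\pi in/p}}\sum_{x=0}^{p-1}\chi(x)e^{2\pi inx/p}.$$
From Theorem 2.3, Theorem 4.4 and Theorem 7.2 we have
$$\left|\sum_{x=0}^{m-1}\chi(x)\right|\le\frac{1}{p}\sum_{n=1}^{p-1}\left|\frac{1-e^{-2\pi inm/p}}{1-e^{-2\pi in/p}}\right|\left|\sum_{x=0}^{p-1}\chi(x)e^{2\pi inx/p}\right|\le\frac{\sqrt p}{p}\sum_{n=1}^{p-1}\frac{1}{2\langle n/p\rangle}<\sqrt p\log p. \qquad\square$$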
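As a small experiment with Theorem 7.4, one can take $\chi$ to be the Legendre symbol and record the largest incomplete sum. This is a sketch for that one character only; the helper names are assumptions, and the list of primes is arbitrary.

```python
from math import log, sqrt

def legendre(x, p):
    """Legendre symbol (x/p) for an odd prime p, by Euler's criterion."""
    r = pow(x, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

def max_incomplete_sum(p):
    """max over m of |chi(1) + ... + chi(m)|, chi = Legendre symbol mod p."""
    best = running = 0
    for x in range(1, p):
        running += legendre(x, p)
        best = max(best, abs(running))
    return best

for p in (3, 5, 7, 11, 13, 101, 103, 997):
    print(p, max_incomplete_sum(p), round(sqrt(p) * log(p), 3))
```

In each row the middle column should stay below the last one; the experiment says nothing about how sharp the factor $\log p$ is.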
This theorem has the following application:

Theorem 7.5. Let $p$ be an odd prime and $d\mid(p-1)$, $d>1$. Then there is always a $d$th power nonresidue mod $p$ which is less than $\sqrt p\log p$.

Proof. Let $R$ represent the number of $d$th power residues not exceeding $m$. Then
$$R=\sum_{x=1}^{m}\frac{1}{d}\sum_{a=1}^{d}e^{2\pi ia\operatorname{ind}x/d}=\frac{1}{d}\sum_{a=1}^{d}\sum_{x=1}^{m}\chi^{a}(x),$$
where $\chi(x)=e^{2\pi i\operatorname{ind}x/d}$. From Theorem 7.4, we have
$$\left|R-\frac{m}{d}\right|<\frac{d-1}{d}\sqrt p\log p, \tag{1}$$
and so
$$R<\frac{m}{d}+\frac{d-1}{d}\sqrt p\log p.$$
Now if $m=\sqrt p\log p$, then
$$R<\frac{m}{d}+\frac{d-1}{d}m=m,$$
so that a $d$th power nonresidue less than $\sqrt p\log p$ exists. $\square$

In particular there must be a quadratic nonresidue less than $\sqrt p\log p$. The determination of the smallest exponent $\delta$ such that the least quadratic nonresidue is $O(p^{\delta})$ is a famous difficult problem. The result of Vinogradov is:
Theorem 7.6. For sufficiently large $p$ the least quadratic nonresidue does not exceed
$$p^{\frac{1}{2\sqrt e}}\log^{2}p.$$

Proof. Let
$$m=\sqrt p\log^{2}p,\qquad T=p^{\frac{1}{2\sqrt e}}\log^{2}p,$$
and suppose that $1,2,\ldots,T$ are all quadratic residues. Since every quadratic nonresidue must have a prime divisor which is also a quadratic nonresidue, it follows that every quadratic nonresidue not exceeding $m$ must have a prime divisor $q$ satisfying $T<q\le m$. Therefore, denoting by $N$ the number of quadratic nonresidues not exceeding $m$, we have
$$N\le\sum_{T<q\le m}\left[\frac{m}{q}\right]\le m\sum_{T<q\le m}\frac{1}{q},$$
and hence, by Theorem 5.9.2,
$$N<m\log\frac{\log m}{\log T}+O\!\left(\frac{m}{\log T}\right)=m\left(\frac12+\log\frac{1+\dfrac{4\log\log p}{\log p}}{1+\dfrac{4\sqrt e\log\log p}{\log p}}\right)+O\!\left(\frac{m}{\log T}\right)$$
$$=m\left(\frac12-\frac{4(\sqrt e-1)\log\log p}{\log p}\right)+O\!\left(\frac{m}{\log T}\right).$$
From (1) we have
$$N=\frac{m}{2}+O(\sqrt p\log p)=\frac{m}{2}+O\!\left(\frac{m}{\log p}\right),$$
so that
$$\frac{m}{2}+O\!\left(\frac{m}{\log p}\right)<m\left(\frac12-\frac{4(\sqrt e-1)\log\log p}{\log p}\right)+O\!\left(\frac{m}{\log p}\right),$$
that is
$$\log\log p=O(1),$$
which is impossible if $p$ is sufficiently large. The theorem is therefore proved. $\square$
7.8 Applications of the Character Sum $\sum_{x=1}^{p}\left(\frac{x^{2}+ax+b}{p}\right)$

Theorem 8.1. The number of integers $a$ such that $a$ and $a+1$ are both quadratic residues mod $p$ is
$$\frac14\left(p-4-\left(\frac{-1}{p}\right)\right).$$

Before we prove this theorem we have to evaluate a sum first.

Theorem 8.2. Let $p>2$, $a^{2}-4b\not\equiv0 \pmod p$. Then
$$\sum_{x=1}^{p}\left(\frac{x^{2}+ax+b}{p}\right)=-1,$$
where in the formula the value 0 is given to those terms in which $p\mid x^{2}+ax+b$.
Proof. We can assume that $a=0$, since otherwise we can use the substitution $y=x+a/2$. Now suppose that $p\nmid b$. From Euler's criterion we have
$$\sum_{x=1}^{p}\left(\frac{x^{2}+b}{p}\right)\equiv\sum_{x=1}^{p}(x^{2}+b)^{\frac12(p-1)} \pmod p. \tag{1}$$
Let $g$ be a primitive root of $p$. If $0<c<p-1$, then
$$\sum_{x=1}^{p}x^{c}\equiv\sum_{v=0}^{p-2}g^{cv}=\frac{1-g^{c(p-1)}}{1-g^{c}}\equiv0 \pmod p.$$
Substituting this into (1) yields
$$\sum_{x=1}^{p}\left(\frac{x^{2}+b}{p}\right)\equiv\sum_{x=1}^{p}x^{p-1}\equiv\sum_{x=1}^{p-1}1\equiv-1 \pmod p.$$
Clearly
$$\left|\sum_{x=1}^{p}\left(\frac{x^{2}+b}{p}\right)\right|\le p,$$
so that
$$\sum_{x=1}^{p}\left(\frac{x^{2}+b}{p}\right)=-1\ \text{ or }\ p-1.$$
Since
$$\sum_{x=1}^{p}\left(\frac{x^{2}+b}{p}\right)=\left(\frac{b}{p}\right)+2\sum_{x=1}^{\frac12(p-1)}\left(\frac{x^{2}+b}{p}\right)\equiv1 \pmod2,$$
we have
$$\sum_{x=1}^{p}\left(\frac{x^{2}+b}{p}\right)=-1. \qquad\square$$
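Proof of Theorem 8.1. The number of integers $a$ with the property stated in the theorem can be represented by
$$\frac14\sum_{a=1}^{p-2}\left(1+\left(\frac{a}{p}\right)\right)\left(1+\left(\frac{a+1}{p}\right)\right)=\frac14\sum_{a=1}^{p-2}\left(1+\left(\frac{a}{p}\right)+\left(\frac{a+1}{p}\right)+\left(\frac{a(a+1)}{p}\right)\right)$$
$$=\frac14\left(p-2-\left(\frac{-1}{p}\right)-\left(\frac{1}{p}\right)-1\right)=\frac14\left(p-4-\left(\frac{-1}{p}\right)\right)$$
(because $\sum_{a=1}^{p}\left(\frac{a}{p}\right)=0$). $\square$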
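Both Theorem 8.2 and the count in Theorem 8.1 can be confirmed by brute force for small primes. The following sketch assumes $p$ is an odd prime and counts $a$ in the range $1\le a\le p-2$ used in the proof; the function names are illustrative.

```python
def legendre(x, p):
    """Legendre symbol (x/p) for an odd prime p; returns 0 when p | x."""
    r = pow(x, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

def quadratic_character_sum(a, b, p):
    """sum over x = 1..p of ((x^2 + a x + b)/p)."""
    return sum(legendre((x * x + a * x + b) % p, p) for x in range(1, p + 1))

def consecutive_residue_pairs(p):
    """Number of a in 1..p-2 with a and a+1 both quadratic residues."""
    return sum(1 for a in range(1, p - 1)
               if legendre(a, p) == 1 and legendre(a + 1, p) == 1)

def predicted_pairs(p):
    """(p - 4 - (-1/p)) / 4, as in Theorem 8.1."""
    return (p - 4 - legendre(p - 1, p)) // 4

for p in (7, 11, 13, 17, 19, 23):
    print(p, consecutive_residue_pairs(p), predicted_pairs(p))
```

For $p=7$ the residues are $1,2,4$ and the only consecutive pair is $(1,2)$, which matches the predicted value $1$.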
From Theorem 8.1 we deduce at once:

Theorem 8.3. If $p\ge7$, then there must be a pair of consecutive integers which are both quadratic residues. $\square$

Similarly we can prove:

Theorem 8.4. The number of integers $a$ such that $a$ and $a+1$ are both quadratic nonresidues mod $p$ is
$$\frac14\left(p-2+\left(\frac{-1}{p}\right)\right),$$
so that, if $p\ge5$, then there must be a pair of consecutive integers which are both quadratic nonresidues. $\square$

Theorem 8.5. There are $\frac12(p-1)$ integers $a$ such that $a$ and $a+1$ are neither both quadratic residues nor both quadratic nonresidues.

Proof. The theorem follows at once from
$$(p-2)-\frac14\left(p-4-\left(\frac{-1}{p}\right)\right)-\frac14\left(p-2+\left(\frac{-1}{p}\right)\right)=\frac12(p-1). \qquad\square$$

Note: The problem concerning three consecutive quadratic residues involves the study of the character sum
$$\sum_{x=1}^{p}\left(\frac{x(x+1)(x+2)}{p}\right),$$
which is outside the scope of this book. However, we have the following application of character sums involving cubic polynomials.
Theorem 8.6 (Jacobsthal). Let $p$ be a prime $\equiv1 \pmod4$. Then a solution to the equation
$$p=X^{2}+Y^{2}$$
in integers $X$, $Y$ is given by $2X=S(r)$, $2Y=S(u)$, where
$$\left(\frac{r}{p}\right)=1,\qquad\left(\frac{u}{p}\right)=-1$$
and
$$S(k)=\sum_{x=1}^{p-1}\left(\frac{x(x^{2}+k)}{p}\right).$$
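Proof. Since
$$S(k)=\sum_{x=1}^{\frac12(p-1)}\left(\frac{x(x^{2}+k)}{p}\right)+\sum_{y=1}^{\frac12(p-1)}\left(\frac{(p-y)((p-y)^{2}+k)}{p}\right)=2\sum_{x=1}^{\frac12(p-1)}\left(\frac{x(x^{2}+k)}{p}\right),$$
we see that $X$ and $Y$ are actually integers. Also, if $p\nmid t$, then
$$\left(\frac{t}{p}\right)^{3}S(k)=\sum_{x=1}^{p-1}\left(\frac{tx((tx)^{2}+t^{2}k)}{p}\right)=\sum_{x=1}^{p-1}\left(\frac{x(x^{2}+t^{2}k)}{p}\right)=S(t^{2}k).$$
Now consider
$$\frac{p-1}{2}\bigl((S(r))^{2}+(S(u))^{2}\bigr)=\sum_{t=1}^{\frac12(p-1)}(S(rt^{2}))^{2}+\sum_{t=1}^{\frac12(p-1)}(S(ut^{2}))^{2}=\sum_{k=1}^{p-1}(S(k))^{2}$$
$$=\sum_{x=1}^{p-1}\sum_{y=1}^{p-1}\sum_{k=1}^{p-1}\left(\frac{xy(x^{2}+k)(y^{2}+k)}{p}\right).$$
From Theorem 8.2 we see that the innermost sum here is
$$=\begin{cases}-2\left(\dfrac{xy}{p}\right),&\text{if }x\not\equiv\pm y \pmod p,\\[2mm] p-2,&\text{if }x\equiv\pm y \pmod p.\end{cases}$$
Therefore
$$\sum_{k=1}^{p-1}(S(k))^{2}=2(p-1)(p-2)-2\sum_{\substack{x\not\equiv\pm y\\ (\mathrm{mod}\ p)}}\left(\frac{xy}{p}\right)=2p(p-1)-2\sum_{x=1}^{p-1}\sum_{y=1}^{p-1}\left(\frac{xy}{p}\right)=2p(p-1).$$
Collecting our results we have
$$(S(r))^{2}+(S(u))^{2}=4p. \qquad\square$$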
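Jacobsthal's theorem gives an explicit, if slow, way to write a prime $p\equiv1 \pmod4$ as a sum of two squares. The sketch below takes $r=1$ and the least quadratic nonresidue for $u$, and returns absolute values; these choices and the function names are assumptions made for the illustration.

```python
def legendre(x, p):
    """Legendre symbol (x/p) for an odd prime p; returns 0 when p | x."""
    r = pow(x, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

def jacobsthal_sum(k, p):
    """S(k) = sum over x = 1..p-1 of (x (x^2 + k) / p)."""
    return sum(legendre((x * (x * x + k)) % p, p) for x in range(1, p))

def two_squares(p):
    """X, Y >= 0 with X^2 + Y^2 = p, for a prime p = 1 mod 4."""
    r = 1                                           # 1 is always a residue
    u = next(k for k in range(2, p) if legendre(k, p) == -1)
    x = abs(jacobsthal_sum(r, p)) // 2              # S(k) is always even
    y = abs(jacobsthal_sum(u, p)) // 2
    return x, y

for p in (5, 13, 17, 29, 37, 41):
    x, y = two_squares(p)
    print(p, (x, y), x * x + y * y == p)
```

Each $S(k)$ costs $p-1$ Legendre symbol evaluations, so this is a verification of the theorem rather than a practical algorithm.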
7.9 The Problem of the Distribution of Primitive Roots
Theorem 9.1. Let $p$ be an odd prime and $p \nmid n$. If $n$ is not a primitive root mod $p$, then
$$\sum_{k \mid p-1} \frac{\mu(k)}{\varphi(k)} \sum_{\substack{a=1\\ (a,k)=1}}^{k} e^{2\pi i a\,\mathrm{ind}\,n/k} = 0. \qquad (1)$$
Proof. The inner sum on the left hand side of (1) is a multiplicative function of $k$, as are the functions $\mu(k)$ and $\varphi(k)$. Therefore the left hand side of (1) is equal to
$$\prod_{q \mid p-1}\Biggl(1 + \frac{\mu(q)}{\varphi(q)} \sum_{\substack{a=1\\ (a,q)=1}}^{q} e^{2\pi i a\,\mathrm{ind}\,n/q}\Biggr),$$
where $q$ runs over the prime divisors of $p-1$. If $n$ is not a primitive root, then $(\mathrm{ind}\,n,\ p-1) > 1$, and so there exists a prime divisor $q$ of $p-1$ which divides $\mathrm{ind}\,n$. For this prime number we have
$$1 + \frac{\mu(q)}{\varphi(q)} \sum_{\substack{a=1\\ (a,q)=1}}^{q} e^{2\pi i a\,\mathrm{ind}\,n/q} = 1 + \frac{-1}{q-1}\,(q-1) = 0.$$
The theorem is proved. □
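The vanishing in (1) can be checked by direct computation for a small prime. This is a sketch only: the helper names are hypothetical, indices are taken with respect to the least primitive root found by brute force, and the sum is evaluated in floating point.

```python
import cmath
from math import gcd


def mobius(k):
    """Moebius function by trial division."""
    result, d = 1, 2
    while d * d <= k:
        if k % d == 0:
            k //= d
            if k % d == 0:
                return 0  # a square divides the original k
            result = -result
        d += 1
    return -result if k > 1 else result


def euler_phi(k):
    return sum(1 for a in range(1, k + 1) if gcd(a, k) == 1)


def multiplicative_order(n, p):
    order, power = 1, n % p
    while power != 1:
        power = power * n % p
        order += 1
    return order


def index_table(p):
    """ind n with respect to the least primitive root of the odd prime p."""
    g = next(a for a in range(2, p) if multiplicative_order(a, p) == p - 1)
    table, power = {}, 1
    for j in range(p - 1):
        table[power] = j
        power = power * g % p
    return table


def theorem_9_1_sum(n, p):
    """Left hand side of (1) for p not dividing n."""
    ind = index_table(p)[n % p]
    total = 0
    for k in range(1, p):
        if (p - 1) % k:
            continue
        inner = sum(cmath.exp(2j * cmath.pi * a * ind / k)
                    for a in range(1, k + 1) if gcd(a, k) == 1)
        total += mobius(k) / euler_phi(k) * inner
    return total
```

For $p = 13$ the sum is numerically zero for every $n$ that is not a primitive root; for the primitive roots the same product formula gives $\prod_{q \mid p-1} q/(q-1) = 3$, which is why the sum can be used to pick out primitive roots.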
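Theorem 9.2. Let $p$ be an odd prime, $1 \le A < p$. If $\chi(n)$ is a nonprincipal character mod $p$, then
$$\frac{1}{A+1}\left|\sum_{a=0}^{A}\sum_{n=-a}^{a}\chi(n)\right| \le p^{\frac12} - \frac{A+1}{p^{\frac12}}. \qquad (2)$$
Proof. We already have
$$|\tau(\chi)| = \left|\sum_{h=1}^{p-1}\chi(h)e^{2\pi i h/p}\right| = p^{\frac12}.$$
If $p \nmid n$, then
$$\chi(n)\tau(\bar\chi) = \chi(n)\sum_{h=1}^{p-1}\bar\chi(h)e^{2\pi i h/p} = \chi(n)\sum_{h=1}^{p-1}\bar\chi(nh)e^{2\pi i nh/p} = \sum_{h=1}^{p-1}\bar\chi(h)e^{2\pi i nh/p}.$$
If we multiply the left hand side of (2) by $|\tau(\bar\chi)|$, then
$$\frac{1}{A+1}\left|\sum_{a=0}^{A}\sum_{n=-a}^{a}\sum_{h=1}^{p-1}\bar\chi(h)e^{2\pi i nh/p}\right| = \frac{1}{A+1}\left|\sum_{h=1}^{p-1}\bar\chi(h)\left(\frac{\sin (A+1)\pi h/p}{\sin \pi h/p}\right)^{2}\right|, \qquad (3)$$
where we have used the formula
$$\sum_{a=0}^{A}\sum_{n=-a}^{a} e^{2\pi i nh/p} = \left(\frac{\sin (A+1)\pi h/p}{\sin \pi h/p}\right)^{2}, \qquad (4)$$
the proof of which is not difficult. From (3) and (4) we arrive at
$$\frac{p^{\frac12}}{A+1}\left|\sum_{a=0}^{A}\sum_{n=-a}^{a}\chi(n)\right| \le \frac{1}{A+1}\sum_{h=1}^{p-1}\left(\frac{\sin (A+1)\pi h/p}{\sin \pi h/p}\right)^{2} = \frac{1}{A+1}\sum_{h=1}^{p-1}\sum_{a=0}^{A}\sum_{n=-a}^{a}e^{2\pi i nh/p}$$
$$= \frac{1}{A+1}\sum_{a=0}^{A}\sum_{n=-a}^{a}\left(\sum_{h=1}^{p}e^{2\pi i nh/p} - 1\right) = p - (A+1). \qquad \square$$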
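Both formula (4) and the bound (2) are easy to test numerically. The sketch below takes the Legendre symbol as the nonprincipal character (an assumption made only for illustration; the theorem covers every nonprincipal character), and all function names are hypothetical.

```python
import math


def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    a %= p
    if a == 0:
        return 0
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1


def smoothed_character_sum(p, A):
    """sum_{a=0}^{A} sum_{n=-a}^{a} chi(n) with chi the Legendre symbol mod p."""
    return sum(legendre(n, p) for a in range(A + 1) for n in range(-a, a + 1))


def bound(p, A):
    """(A + 1) times the right hand side of (2)."""
    return (A + 1) * (math.sqrt(p) - (A + 1) / math.sqrt(p))


def kernel_as_sum(p, A, h):
    """Left hand side of (4); the imaginary parts cancel, so only cosines remain."""
    return sum(math.cos(2 * math.pi * n * h / p)
               for a in range(A + 1) for n in range(-a, a + 1))


def kernel_closed_form(p, A, h):
    """Right hand side of (4)."""
    return (math.sin((A + 1) * math.pi * h / p) / math.sin(math.pi * h / p)) ** 2
```

The double sum in (4) is a Fejér-type kernel, which is what makes the right hand side of (3) nonnegative term by term.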
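Theorem 9.3. Let $h(p)$ denote the primitive root mod $p$ with the least absolute value. Then
$$|h(p)| < 2^{m}p^{\frac12},$$
where $m$ is the number of distinct prime divisors of $p-1$.
Proof. Let $p > 2$. From Theorem 9.1 we have
$$0 = \sum_{k \mid p-1}\frac{\mu(k)}{\varphi(k)}\sum_{\substack{u=1\\ (u,k)=1}}^{k}\ \sum_{a=0}^{|h(p)|-1}\ {\sum_{n=-a}^{a}}' e^{2\pi i u\,\mathrm{ind}\,n/k},$$
where $\sum'$ means that we omit the term $n = 0$. On the right hand side of this equation the term $k = 1$ is equal to
$$\sum_{a=0}^{|h(p)|-1}\ {\sum_{n=-a}^{a}}' 1 = \sum_{a=0}^{|h(p)|-1} 2a = |h(p)|^{2} - |h(p)|.$$
For those terms in which $k \ne 1$ we use Theorem 9.2, taking $A = |h(p)| - 1$, so that
$$\left|\sum_{a=0}^{|h(p)|-1}\sum_{n=-a}^{a}\chi(n)\right| \le |h(p)|\,p^{\frac12} - \frac{|h(p)|^{2}}{p^{\frac12}},$$
where $\chi(n) = e^{2\pi i u\,\mathrm{ind}\,n/k}$. Therefore
$$|h(p)|^{2} - |h(p)| \le \left(|h(p)|\,p^{\frac12} - \frac{|h(p)|^{2}}{p^{\frac12}}\right)\sum_{k \mid p-1}\frac{|\mu(k)|}{\varphi(k)}\,\varphi(k) = 2^{m}\left(|h(p)|\,p^{\frac12} - \frac{|h(p)|^{2}}{p^{\frac12}}\right).$$
That is
$$|h(p)| \le \frac{2^{m}p^{\frac12} + 1}{1 + 2^{m}/p^{\frac12}} < 2^{m}p^{\frac12}. \qquad \square$$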
From Theorem 9.3 we immediately deduce: Theorem 9.4.
If p == 1 (mod 4),
then we have the primitive root
Proof. We have to prove that $|h(p)|$ is a primitive root. Suppose otherwise, so that $-|h(p)|$ is now a primitive root. But
$$|h(p)|^{l} \equiv 1 \pmod p,$$
where $l < p-1$ is the order of $|h(p)|$, and hence
$$(h(p))^{2l} \equiv 1 \pmod p.$$
From $-|h(p)|$ being a primitive root we see that $2l = p-1$, whence
$$|h(p)|^{\frac{p-1}{2}} \equiv 1 \pmod p.$$
This means that $|h(p)|$ is a quadratic residue, and since $-1$ is a quadratic residue, it follows that $-|h(p)|$ is also a quadratic residue. This contradicts $-|h(p)|$ being a primitive root. □
Theorem 9.5. The least positive primitive root mod $p$ satisfies
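Proof. We take $A = [(g(p) - 1)/2]$. Then
$$0 = \sum_{k \mid p-1}\frac{\mu(k)}{\varphi(k)}\sum_{\substack{u=1\\ (u,k)=1}}^{k}\ \sum_{a=0}^{A}\ \sum_{n=A+1-a}^{A+1+a} e^{2\pi i u\,\mathrm{ind}\,n/k},$$
and here the term $k = 1$ on the right hand side is equal to
$$\sum_{a=0}^{A}\ \sum_{n=A+1-a}^{A+1+a} 1 = \sum_{a=0}^{A}(2a+1) = (A+1)^{2};$$
while for the terms which correspond to $k \ne 1$, Theorem 9.2 gives
$$\left|\sum_{a=0}^{A}\ \sum_{n=A+1-a}^{A+1+a} e^{2\pi i u\,\mathrm{ind}\,n/k}\right| \le (A+1)\,p^{\frac12} - \frac{(A+1)^{2}}{p^{\frac12}}.$$
Therefore, similarly to the proof of Theorem 9.4, we have
$$(A+1)^{2} \le 2^{m}\left((A+1)\,p^{\frac12} - \frac{(A+1)^{2}}{p^{\frac12}}\right),$$
$$\frac12(g(p) - 1) < A + 1 \le \frac{2^{m}p^{\frac12}}{1 + 2^{m}/p^{\frac12}},$$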
which gives
7.10 Trigonometric Sums Involving Polynomials
The main purpose of this section is to prove the following:
Theorem 10.1. Let $f(x)$ denote a polynomial with integer coefficients,
$$f(x) = a_kx^k + \dots + a_1x + a_0.$$
If $(a_k, \dots, a_1, q) = 1$, then
$$S(q, f(x)) = \sum_{x=1}^{q} e^{2\pi i f(x)/q} = O\bigl(q^{1-\frac1k+\varepsilon}\bigr),$$
where $\varepsilon$ is any positive number, and the constant involved in the O-symbol depends only on $k$ and $\varepsilon$.
Since $|e^{2\pi i a_0/q}| = 1$, we can always assume, without loss of generality, that $f(0) = 0$. We now divide the proof of the theorem into several stages.
Theorem 10.2. If $(q_1, q_2) = 1$, then
$$S(q_1q_2, f(x)) = S\bigl(q_1, f(q_2x)/q_2\bigr)\,S\bigl(q_2, f(q_1x)/q_1\bigr).$$
Proof. Let $x = q_1y + q_2z$, so that as $y$ and $z$ run over the complete sets of residues mod $q_2$ and mod $q_1$ respectively, $x$ runs over a complete set of residues mod $q_1q_2$. Clearly we have
$$f(q_1y + q_2z) \equiv f(q_1y) + f(q_2z) \pmod{q_1q_2},$$
so that
$$S(q_1q_2, f(x)) = \sum_{x=1}^{q_1q_2} e^{2\pi i f(x)/q_1q_2} = \sum_{y=1}^{q_2}\sum_{z=1}^{q_1} e^{2\pi i f(q_1y)/q_1q_2}\cdot e^{2\pi i f(q_2z)/q_1q_2} = S\bigl(q_2, f(q_1x)/q_1\bigr)\,S\bigl(q_1, f(q_2x)/q_2\bigr). \qquad \square$$
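The factorization in Theorem 10.2 can be confirmed numerically for a small example. This is a sketch under the standing assumption $f(0) = 0$; the polynomial is stored as its coefficient list $[a_1, a_2, \dots, a_k]$, and the names `exp_sum` and `scaled_coeffs` are illustrative only.

```python
import cmath


def exp_sum(q, coeffs):
    """S(q, f) = sum_{x=1}^{q} e^{2 pi i f(x)/q}, where f(x) = sum_j coeffs[j-1] * x**j."""
    total = 0
    for x in range(1, q + 1):
        fx = sum(a * x ** j for j, a in enumerate(coeffs, start=1))
        total += cmath.exp(2j * cmath.pi * (fx % q) / q)
    return total


def scaled_coeffs(coeffs, c):
    """Coefficients of f(c x)/c, which are integers because f has no constant term."""
    return [a * c ** (j - 1) for j, a in enumerate(coeffs, start=1)]


# Example: f(x) = x^3 + 2x with q1 = 5, q2 = 7.
coeffs, q1, q2 = [2, 0, 1], 5, 7
lhs = exp_sum(q1 * q2, coeffs)
rhs = exp_sum(q1, scaled_coeffs(coeffs, q2)) * exp_sum(q2, scaled_coeffs(coeffs, q1))
```

Here `lhs` and `rhs` agree up to floating-point rounding, which is the reduction to prime-power moduli used in the rest of the section.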
From this theorem we see that our discussion should centre on the case $q = p^{l}$.
Lemma 1. Let $f(x)$ be a polynomial with integer coefficients and let $\alpha$ be a root of $f(x) \equiv 0 \pmod p$ with multiplicity $m$. Let $p^{u} \,\|\, f(px + \alpha)$* and $g(x) = p^{-u}f(px + \alpha)$. Then the congruence
$$g(x) \equiv 0 \pmod p$$
has at most $m$ roots.
* We use the symbol $p^{u} \,\|\, a$ to represent $p^{u} \mid a$ and $p^{u+1} \nmid a$. We also write $p^{u} \,\|\, f(x)$ if $p^{u}$ divides all the coefficients of $f(x)$ and $p^{u+1}$ does not.
Proof. We can assume, without loss of generality, that $\alpha = 0$. We then have
$$f(x) = x^{m}f_1(x) + pf_2(x),$$
where $f_1(x)$ and $f_2(x)$ are polynomials with integer coefficients, $f_1(0) \not\equiv 0 \pmod p$, and the degree of $f_2(x)$ is less than $m$. Thus
$$f(px) = p^{m}x^{m}f_1(px) + pf_2(px).$$
Since $p^{m+1}$ does not divide $p^{m}f_1(0)$, the coefficient of $x^{m}$, it follows that $u \le m$. Also the degree of $p^{-u}f(px)$ is at most $m$ (mod $p$), so that the lemma follows. □
Lemma 2. Let $f(x) = a_kx^k + \dots + a_1x$ be a polynomial with integer coefficients, $p \nmid (a_k, \dots, a_1)$ and $p^{t} \,\|\, (ka_k, \dots, 2a_2, a_1)$. Suppose that $\mu$ is a root of
$$f'(x) \equiv 0 \pmod{p^{t+1}}, \qquad 0 \le x < p,$$
and that $p^{\sigma} \,\|\, (f(\mu + px) - f(\mu))$. Then
$$1 \le \sigma \le k.$$
Proof. Suppose that $\sigma \ge k + 1$. Then, by hypothesis,
$$p^{\sigma} \,\Big|\, \frac{p^{h}}{h!}f^{(h)}(\mu), \qquad 1 \le h \le k.$$
That is, given any $h$ ($1 \le h \le k$) we always have
$$p^{k+1} \,\Big|\, \frac{p^{h}}{h!}f^{(h)}(\mu),$$
and so
$$p \,\Big|\, \frac{f^{(h)}(\mu)}{h!}.$$
It follows that $p \mid a_k$, $p \mid a_{k-1}$, ..., $p \mid a_1$. This contradicts the hypothesis $p \nmid (a_k, \dots, a_1)$. □
Fundamental Lemma. If $p \nmid (a_k, \dots, a_1)$, then
$$|S(p^{l}, f(x))| < C(k)\,p^{l(1-\frac1k)}.$$
Proof. We use mathematical induction to prove this lemma. We first prove the case $l = 1$ (Mordell). We can clearly assume that $p > k$. Denote by $N$ the number of solutions to the set of congruences
$$x_1^{h} + \dots + x_k^{h} \equiv y_1^{h} + \dots + y_k^{h} \pmod p, \qquad 1 \le x_i,\ y_i \le p, \quad h = 1, 2, \dots, k. \qquad (1)$$
We shall simplify the notation by writing $\sum_x$ for $\sum_{x=1}^{p}$ and $e_p(f(x))$ for $e^{2\pi i f(x)/p}$. Then, from Theorem 1.1, we have
$$\sum_{a_k}\cdots\sum_{a_1}\Bigl|\sum_x e_p(a_kx^k + \dots + a_1x)\Bigr|^{2k} = \sum_{a_k}\cdots\sum_{a_1}\ \sum_{x_1}\cdots\sum_{x_k}\ \sum_{y_1}\cdots\sum_{y_k} e_p\bigl(a_k(x_1^k + \dots + x_k^k - y_1^k - \dots - y_k^k) + \cdots\bigr) = p^{k}N.$$
Applying the theory of symmetric functions we have, from (1), that
$$(X - x_1)\cdots(X - x_k) \equiv (X - y_1)\cdots(X - y_k) \pmod p.$$
Therefore $x_1, \dots, x_k$ and $y_1, \dots, y_k$ only differ by ordering mod $p$, so that $N \le k!\,p^{k}$, that is
$$\sum_{a_k}\cdots\sum_{a_1}\Bigl|\sum_x e_p(a_kx^k + \dots + a_1x)\Bigr|^{2k} \le k!\,p^{2k}. \qquad (2)$$
For any $\lambda$ ($\not\equiv 0 \pmod p$) and any $\mu$, we have
$$|S(p, f(x))| = |S(p, f(\lambda x + \mu) - f(\mu))|.$$
All the sums of this form occur on the left hand side of (2). We now take it that any two polynomials with the same coefficients reduced mod $p$ are equal mod $p$. We shall then determine the number of distinct polynomials $f(\lambda x + \mu) - f(\mu)$ ($\lambda = 1, \dots, p-1$; $\mu = 0, 1, \dots, p-1$). We can assume, without loss of generality, that $p \nmid a_k$. If $f(\lambda x + \mu) - f(\mu)$ is identical to $f(x)$ mod $p$, then
$$a_k\lambda^{k} \equiv a_k, \qquad ka_k\lambda^{k-1}\mu + a_{k-1}\lambda^{k-1} \equiv a_{k-1} \pmod p.$$
By Theorem 2.9.1 the congruence $\lambda^{k} \equiv 1 \pmod p$ has at most $k$ roots, and each fixed $\lambda$ determines $\mu$ uniquely. Therefore the number of polynomials of the form $f(\lambda x + \mu) - f(\mu)$ which are identical with $f(x)$ mod $p$ is at most $k$. In other words there are at least $p(p-1)/k$ distinct polynomials $f(\lambda x + \mu) - f(\mu)$. Therefore
$$\frac{p(p-1)}{k}\,|S(p, f(x))|^{2k} \le k!\,p^{2k},$$
that is
$$|S(p, f(x))| \le \left(\frac{k \cdot k!\,p^{2k}}{p(p-1)}\right)^{\frac{1}{2k}} \le (2k \cdot k!)^{\frac{1}{2k}}\,p^{1-\frac1k}.$$
Now suppose that I > 1, pI II (kak, ... , 2a2, al), and that f.ll' ... ,f.lr are distinct roots of f'(x) == 0
o ~ x
(mod pI + 1),
with mUltiplicities mt. . .. ,mr respectively. Let ml see that m ~ k  1. We now prove that 'IS(pl, j(x))1
+ ... + mr = m, and it is easy to
~ k 2 max(l, m)p(IH.
From hypothesis, p,t(ak,"" al), pI I(kak, ... ,2a2, al) so that necessarily i~k.
1) 1< 2(t
+ 1).
Since I> 1 it follows that t
IS(pl, j(x)) I ~ pi
~ pl(li). p(21+
~
1 and hence
1+ ~ pl(liM2++)i ~ ppl(li),
so that the theorem is established. 2) I ~ 2(t + 1). We write p
S(pl,j(X)) =
L:
L:
v:::::: 1 0 ~ x ~ pI  1 x=v(modp)
,
epl(j(x)) =
p
L:
v=1
Sv'
If V is not one of the Jli, then we let
Fromf'(y)
Sv =
i= 0 (modp' + 1) and Theorem
I
l.l, we have
I
L
epl(fix» =
O:;;;;X
O~y
x=v(modp)
y=v(modp)
epl(f(y)  pltIf'(y)z)
o:::;:;:z
(3) O~y
O~z
y=v(modp)
If v = Jli' then, with Ui defined by Lemma 2, we have pI
pll
I
Sill =
epl(fix» =
I
epl(fiJli
+ py»
x=1 x=ll;(modp) pII
=
epl(fiJli»
I
epH;(pUI(fiJli
+ py) 
fiJl;))·
y=l
Let gi = pU;(fiJli
+ py) 
fiJli». Then, by Lemma 2,
ISIl;1 = pU;IIS(plUI, gi(x»1 ~ pU;(I~ )IS(plU;, gi(x»I.
(4)
From (3) and (4) we have
IS(P',fix»1 ~
±pU;(l~~S(P'U;,gi(X»I.
i= 1
If I ~ max (u 1, . . . ,u,), then, by the induction hypothesis, Lemma I, and the formula above, we have
IS(P',fix»1 ~
±miPu;(1~)k2p(lU;)(I~)
< mk2pl(1H
i= I
If I < max (U1> ... , u,), then 1< k and
i= I
The proof of the fundamental lemma is now complete.
0
Proof of Theorem 10.1. Let q = pIli . .. p!s, where P1> ... ,Ps are distinct prime numbers. From Theorem 10.2 we have S(q,fix» =
OS plq
(
I
p,
fiqx/P'»)
I.
qP
and so from the fundamental lemma, I
IS(q,f(x»1 ~ C~lk.
I
'
We can assume that C1 > 1 and so from Theorem 6.6.2
The theorem is proved.
D
Notes
7.1. As a corollary of his proof of the Riemann hypothesis for functions in finite algebraic fields, A. Weil deduced the following result on complete trigonometric sums with $q = p$, a prime number. When $(a_k, \dots, a_1) = 1$ and $f(x) = a_kx^k + \dots + a_1x$ we have
$$\left|\sum_{x=1}^{p} e^{2\pi i f(x)/p}\right| \le (k-1)\sqrt{p}.$$
See the author's book [30].
7.2. Applying A. Weil's proof of the analogue of the Riemann hypothesis in finite algebraic fields, D. A. Burgess [12] has improved on G. Pólya's estimate for character sums. He proved the following: Let $\varepsilon$, $\delta$ be any two positive numbers and let $p$ be a large prime number. Then, for any integers $N$, $H$ with $H > p^{1/4+\delta}$, we have
$$\left|\sum_{n=N+1}^{N+H}\left(\frac{n}{p}\right)\right| < \varepsilon H,$$
where $\left(\frac{n}{p}\right)$ is the Kronecker symbol. He also used this to give an estimate for $n_2(p)$, the least quadratic nonresidue mod $p$, namely $n_2(p) = O(p^{\frac{1}{4\sqrt e}+\varepsilon})$. Burgess's method can be generalized and extended to give estimates for the least primitive root $h(p)$ and the least $d$-th power nonresidue $n_d(p)$: $h(p) = O(p^{\frac14+\varepsilon})$ (see D. A. Burgess [13] and Y. Wang [62]), $n_d(p) = O(p^{1/A+\varepsilon})$, $A = 4e^{1-1/d}$ ($d \ge 2$); $n_d(p) = O(p^{B})$, $B = (\log\log d + 2)/4\log d$ ($d > e^{33}$) (see Y. Wang [63]).
Chapter 8. On Several Arithmetic Problems Associated with the Elliptic Modular Function

8.1 Introduction
The following four important functions frequently occur in the theory of elliptic modular functions:
$$q_0 = \prod_{n=1}^{\infty}(1 - q^{2n}), \qquad q_1 = \prod_{n=1}^{\infty}(1 + q^{2n}), \qquad q_2 = \prod_{n=1}^{\infty}(1 + q^{2n-1}), \qquad q_3 = \prod_{n=1}^{\infty}(1 - q^{2n-1}).$$
Following the tradition in the theory of elliptic modular functions we use $q$ to represent the variable, which can be real or complex and which satisfies $|q| < 1$. The four infinite products then clearly converge. We do not give any deep discussion on the properties of the elliptic modular function in this chapter. Indeed we do not even define an elliptic modular function and instead we shall study the following associated arithmetic problems: the partition of integers, the sum of four squares, and the transformation of power series related to $q_0, q_1, q_2, q_3$. The problems of convergence arising in the chapter are very simple and any reader familiar with advanced calculus can easily supply the details. (In §8 we also use n-dimensional multiple integration.) We shall therefore omit all discussions on convergence in this chapter. The following is the first and simplest relationship between $q_1, q_2, q_3$.
Theorem 1.1. If $|q| < 1$, then
$$q_1q_2q_3 = 1.$$
Proof. We have
$$q_2q_3 = \prod_{n=1}^{\infty}(1 - q^{2(2n-1)}).$$
We rearrange the terms in $q_1$ by taking out all the powers of 2 from $2n$, giving
$$q_1 = \prod_{n=1}^{\infty}(1 + q^{2(2n-1)})\prod_{n=1}^{\infty}(1 + q^{4(2n-1)})\prod_{n=1}^{\infty}(1 + q^{8(2n-1)})\cdots.$$
From this we see that
$$q_1q_2q_3 = \prod_{n=1}^{\infty}(1 - q^{2(2n-1)})\prod_{n=1}^{\infty}(1 + q^{2(2n-1)})\prod_{n=1}^{\infty}(1 + q^{4(2n-1)})\prod_{n=1}^{\infty}(1 + q^{8(2n-1)})\cdots$$
$$= \prod_{n=1}^{\infty}(1 - q^{4(2n-1)})\prod_{n=1}^{\infty}(1 + q^{4(2n-1)})\prod_{n=1}^{\infty}(1 + q^{8(2n-1)})\cdots$$
$$= \prod_{n=1}^{\infty}(1 - q^{8(2n-1)})\prod_{n=1}^{\infty}(1 + q^{8(2n-1)})\cdots = \cdots = 1. \qquad \square$$
The theorem can also be proved from the equation
$$q_0q_1q_2q_3 = \prod_{n=1}^{\infty}(1 - q^{n})\prod_{n=1}^{\infty}(1 + q^{n}) = \prod_{n=1}^{\infty}(1 - q^{2n}) = q_0.$$
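Theorem 1.1 is easy to confirm numerically by truncating the four products. The sketch below assumes a real $q$ with $|q| < 1$; the function name and the truncation length are arbitrary choices made for illustration.

```python
def truncated_products(q, terms=200):
    """Return approximations to (q0, q1, q2, q3) using the first `terms` factors."""
    q0 = q1 = q2 = q3 = 1.0
    for n in range(1, terms + 1):
        q0 *= 1 - q ** (2 * n)
        q1 *= 1 + q ** (2 * n)
        q2 *= 1 + q ** (2 * n - 1)
        q3 *= 1 - q ** (2 * n - 1)
    return q0, q1, q2, q3
```

With, say, $q = 0.5$ the product `q1 * q2 * q3` differs from 1 only by rounding error, since the neglected factors are of size $q^{400}$.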
8.2 The Partition of Integers
Let $n$ be a positive integer. Any collection of positive integers whose sum is equal to $n$ is said to form a partition of $n$. For example:
$$5 = 4+1 = 3+2 = 3+1+1 = 2+2+1 = 2+1+1+1 = 1+1+1+1+1,$$
so that there are 7 partitions of 5. We denote by $p(n)$ the number of partitions of $n$, so that in the above example we have $p(5) = 7$. If we restrict to those partitions of $n$ in which each term in the partition does not exceed $r$, then we denote by $p_r(n)$ the number of such partitions. For example, $p_3(5) = 5$.
Theorem 2.1. If $|q| < 1$, then
$$1 + \sum_{n=1}^{\infty}p_r(n)q^{n} = \frac{1}{(1-q)(1-q^{2})\cdots(1-q^{r})}.$$
Proof. The right hand side of the equation above is equal to
$$(1 + q + q^{2} + \dots + q^{x_1} + \cdots)\times(1 + q^{2} + (q^{2})^{2} + \dots + (q^{2})^{x_2} + \cdots)\times(1 + q^{3} + (q^{3})^{2} + \dots + (q^{3})^{x_3} + \cdots)\times\cdots\times(1 + q^{r} + (q^{r})^{2} + \dots + (q^{r})^{x_r} + \cdots),$$
and the coefficient of $q^{n}$ is the number of nonnegative integer solutions to
$$x_1 + 2x_2 + 3x_3 + \dots + rx_r = n,$$
which is $p_r(n)$. □
We can prove similarly:
Theorem 2.2. If $|q| < 1$, then
$$\frac{1}{q_0q_3} = \frac{1}{(1-q)(1-q^{2})(1-q^{3})\cdots} = 1 + \sum_{n=1}^{\infty}p(n)q^{n}. \qquad \square$$
Theorem 2.3. Let $q(n)$ be the number of partitions of $n$ into odd integers. Then
$$\frac{1}{q_3} = \frac{1}{(1-q)(1-q^{3})(1-q^{5})\cdots} = 1 + \sum_{n=1}^{\infty}q(n)q^{n}. \qquad \square$$
Theorem 2.4. The coefficient of $q^{n}$ in the expansion of $q_1q_2$ is the number of partitions of $n$ into unequal parts. □
The reader should have no difficulty with the proofs of the above three theorems. From Theorem 1.1 together with the results of Theorems 2.3 and 2.4 we have
Theorem 2.5. The number of partitions of $n$ into unequal parts is equal to the number of partitions of $n$ into odd parts. □
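The generating functions of Theorems 2.1 to 2.5 translate directly into coefficient computations. The following sketch (function names are illustrative, not from the text) multiplies out the truncated power series one factor at a time.

```python
def count_with_parts(n_max, parts):
    """Coefficients of prod 1/(1 - q^part): partitions using the allowed parts, repeats allowed."""
    coeffs = [1] + [0] * n_max
    for part in parts:
        for n in range(part, n_max + 1):
            coeffs[n] += coeffs[n - part]
    return coeffs


def count_unequal_parts(n_max):
    """Coefficients of prod (1 + q^part): partitions into unequal parts."""
    coeffs = [1] + [0] * n_max
    for part in range(1, n_max + 1):
        # Go downwards so that each part is used at most once.
        for n in range(n_max, part - 1, -1):
            coeffs[n] += coeffs[n - part]
    return coeffs
```

For example, `count_with_parts(5, range(1, 6))[5]` is $p(5) = 7$ and `count_with_parts(5, [1, 2, 3])[5]` is $p_3(5) = 5$, while the lists for odd parts and for unequal parts coincide, as Theorem 2.5 asserts.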
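8.3 Jacobi's Identity
Theorem 3.1. If $|q| < 1$, $z \ne 0$, then
$$\prod_{n=1}^{\infty}\bigl((1 - q^{2n})(1 + q^{2n-1}z)(1 + q^{2n-1}z^{-1})\bigr) = 1 + \sum_{n=1}^{\infty}q^{n^{2}}(z^{n} + z^{-n}) = \sum_{n=-\infty}^{\infty}q^{n^{2}}z^{n}. \qquad (1)$$
Proof. The two series are clearly equal. Let
$$\varphi_m(z) = \prod_{n=1}^{m}\{(1 + q^{2n-1}z)(1 + q^{2n-1}z^{-1})\} = X_0 + (z + z^{-1})X_1 + \dots + (z^{m} + z^{-m})X_m, \qquad (2)$$
where $X_0, X_1, \dots, X_m$ are independent of $z$. The coefficient of $z^{m}$ is clearly
$$X_m = q^{1+3+\dots+(2m-1)} = q^{m^{2}}. \qquad (3)$$
Also
$$\varphi_m(q^{2}z) = \prod_{n=1}^{m}\{(1 + q^{2n+1}z)(1 + q^{2n-3}z^{-1})\} = \frac{(1 + q^{2m+1}z)(1 + q^{-1}z^{-1})}{(1 + qz)(1 + q^{2m-1}z^{-1})}\,\varphi_m(z),$$
that is
$$(qz + q^{2m})\,\varphi_m(q^{2}z) = (1 + q^{2m+1}z)\,\varphi_m(z).$$
Substituting (2) into here and comparing the coefficient of $z^{1-n}$ we see that
$$X_n = \frac{q^{2n-1}(1 - q^{2m-2n+2})}{1 - q^{2m+2n}}\,X_{n-1},$$
or
$$X_n = \frac{(1 - q^{2m-2n+2})(1 - q^{2m-2n+4})\cdots(1 - q^{2m})}{(1 - q^{2m+2n})(1 - q^{2m+2n-2})\cdots(1 - q^{2m+2})}\,q^{n^{2}}X_0.$$
From (3) we have
$$X_0 = \frac{(1 - q^{4m})(1 - q^{4m-2})\cdots(1 - q^{2m+2})}{(1 - q^{2})(1 - q^{4})\cdots(1 - q^{2m})},$$
so that when $0 \le n \le m$,
$$X_n = \frac{q^{n^{2}}}{(1 - q^{2})(1 - q^{4})\cdots(1 - q^{2m})}\,X_n',$$
where
$$X_n' = \frac{(1 - q^{2m-2n+2})(1 - q^{2m-2n+4})\cdots(1 - q^{2m})\,(1 - q^{2m+2})\cdots(1 - q^{4m})}{(1 - q^{2m+2n})(1 - q^{2m+2n-2})\cdots(1 - q^{2m+2})}. \qquad (4)$$
It follows that (2) can be written as
$$(1 - q^{2})(1 - q^{4})\cdots(1 - q^{2m})\,\varphi_m(z) = X_0' + \sum_{n=1}^{m}q^{n^{2}}(z^{n} + z^{-n})X_n'. \qquad (5)$$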
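As $m \to \infty$, $X_n' \to 1$ so that the identity in the theorem follows. However we still have to justify the process of taking the limit of the individual terms in the series. Let
$$u_{0,m} = X_0, \qquad u_{n,m} = \begin{cases}(z^{n} + z^{-n})X_n, & \text{if } 1 \le n \le m,\\ 0, & \text{if } n > m,\end{cases}$$
so that
$$\varphi_m(z) = \sum_{n=0}^{\infty}u_{n,m}. \qquad (6)$$
As $m \to \infty$, the term $u_{n,m} \to u_n$ where
$$u_n = \frac{q^{n^{2}}(z^{n} + z^{-n})}{\prod_{k=1}^{\infty}(1 - q^{2k})} \qquad (n > 0).$$
After cancelling the common factors in (4), $X_n'$ is a product of distinct factors $1 - q^{2k}$, so we have
$$|X_n'| < \prod_{k=1}^{\infty}(1 + |q|^{2k}) = K_1 \quad \text{(say)}$$
and
$$\frac{1}{|(1 - q^{2})(1 - q^{4})\cdots(1 - q^{2m})|} < \prod_{k=1}^{\infty}\frac{1}{1 - |q|^{2k}} = K_2 \quad \text{(say)},$$
so that
$$|u_{n,m}| < K_1K_2\,|q|^{n^{2}}(|z|^{n} + |z|^{-n}) = v_n.$$
Now $v_n$ is independent of $m$ and as $n \to \infty$,
$$\frac{v_{n+1}}{v_n} = \frac{|q|^{2n+1}(|z|^{n+1} + |z|^{-n-1})}{|z|^{n} + |z|^{-n}} < |q|^{2n+1}(|z| + |z|^{-1}) \to 0,$$
so that $\sum v_n$ converges. This shows that the series (6) is uniformly convergent and therefore
$$\varphi_m(z) \to \sum_{n=0}^{\infty}u_n.$$
This completes the justification of taking the limit term by term. □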
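Jacobi's identity (1) can be checked numerically by truncating both sides. This is a sketch for real $q$ and $z$ only; the function names and truncation lengths are arbitrary assumptions.

```python
def jacobi_product(q, z, terms=200):
    """Truncation of prod (1 - q^{2n})(1 + q^{2n-1} z)(1 + q^{2n-1} / z)."""
    value = 1.0
    for n in range(1, terms + 1):
        value *= (1 - q ** (2 * n)) * (1 + q ** (2 * n - 1) * z) * (1 + q ** (2 * n - 1) / z)
    return value


def jacobi_series(q, z, terms=60):
    """Truncation of 1 + sum_{n >= 1} q^{n^2} (z^n + z^{-n})."""
    return 1.0 + sum(q ** (n * n) * (z ** n + z ** (-n)) for n in range(1, terms + 1))
```

The factor $q^{n^{2}}$ decays so quickly that a few dozen terms of the series already agree with the product to machine precision.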
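There are a number of interesting special examples included in Theorem 3.1. Taking $z = \pm 1$ and $z = q$ separately we have:
Theorem 3.2. When $|q| < 1$,
$$q_0q_2^{2} = \sum_{n=-\infty}^{\infty}q^{n^{2}}, \qquad q_0q_3^{2} = \sum_{n=-\infty}^{\infty}(-1)^{n}q^{n^{2}},$$
and
$$q_0q_1^{2} = \sum_{n=0}^{\infty}q^{n^{2}+n}. \qquad \square$$
Replacing $q$ by $q^{\frac32}$ and taking $z = -q^{\frac12}$ we have
$$\prod_{n=1}^{\infty}\bigl((1 - q^{3n})(1 - q^{3n-1})(1 - q^{3n-2})\bigr) = \sum_{n=-\infty}^{\infty}(-1)^{n}q^{\frac32 n^{2}}(q^{\frac12})^{n} = \sum_{n=-\infty}^{\infty}(-1)^{n}q^{\frac12(3n^{2}+n)},$$
and we deduce at once Euler's identity:
Theorem 3.3. If $|q| < 1$, then
$$\prod_{n=1}^{\infty}(1 - q^{n}) = \sum_{n=-\infty}^{\infty}(-1)^{n}q^{\frac12 n(3n+1)} = 1 + \sum_{n=1}^{\infty}(-1)^{n}\bigl(q^{\frac12 n(3n-1)} + q^{\frac12 n(3n+1)}\bigr) = 1 - q - q^{2} + q^{5} + q^{7} - q^{12} - q^{15} + \cdots. \qquad \square$$
Again, replacing $q$ by $q^{\frac12}$ and $z$ by $q^{\frac12}$, we have
$$\prod_{n=1}^{\infty}(1 - q^{n})(1 + q^{n})(1 + q^{n-1}) = \sum_{n=-\infty}^{\infty}q^{\frac12(n^{2}+n)},$$
giving:
Theorem 3.4. If $|q| < 1$, then
$$q_0q_1q_2 = \sum_{n=0}^{\infty}q^{\frac12 n(n+1)}. \qquad \square$$
Note: The exponent $\frac12 n(n+1)$ is commonly called a triangular number. From Theorem 1.1, we can restate Theorem 3.4 as:
Theorem 3.5. If $|q| < 1$, then
$$\frac{q_0}{q_3} = \frac{(1 - q^{2})(1 - q^{4})\cdots}{(1 - q)(1 - q^{3})\cdots} = \sum_{n=0}^{\infty}q^{\frac12 n(n+1)}.$$
We now prove:
Theorem 3.6. If $|q| < 1$, then
$$\prod_{n=1}^{\infty}(1 - q^{n})^{3} = \sum_{n=-\infty}^{\infty}(-1)^{n}n\,q^{\frac12 n(n+1)} = 1 - 3q + 5q^{3} - 7q^{6} + \cdots.$$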
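Proof. We replace $q$ and $z$ in Theorem 3.1 by $q^{\frac12}$ and $q^{\frac12}\zeta$ respectively, giving
$$\prod_{n=1}^{\infty}\bigl((1 - q^{n})(1 + q^{n}\zeta)(1 + q^{n-1}\zeta^{-1})\bigr) = \sum_{n=-\infty}^{\infty}q^{\frac12 n(n+1)}\zeta^{n},$$
or
$$\prod_{n=1}^{\infty}\bigl((1 - q^{n})(1 + q^{n}\zeta)(1 + q^{n}\zeta^{-1})\bigr) = \frac{\zeta}{\zeta + 1}\sum_{n=-\infty}^{\infty}q^{\frac12 n(n+1)}\zeta^{n}.$$
We now study the situation when $\zeta \to -1$. Clearly
$$\lim_{\zeta \to -1}\prod_{n=1}^{\infty}\bigl((1 - q^{n})(1 + q^{n}\zeta)(1 + q^{n}\zeta^{-1})\bigr) = \Bigl(\prod_{n=1}^{\infty}(1 - q^{n})\Bigr)^{3}.$$
From
$$\sum_{n=-\infty}^{\infty}(-1)^{n}q^{\frac12 n(n+1)} = \sum_{n=0}^{\infty}(-1)^{n}q^{\frac12 n(n+1)} + \sum_{n=-\infty}^{-1}(-1)^{n}q^{\frac12 n(n+1)} = \sum_{n=0}^{\infty}(-1)^{n}q^{\frac12 n(n+1)} + \sum_{m=0}^{\infty}(-1)^{m+1}q^{\frac12 m(m+1)} = 0,$$
we have
$$\frac{\zeta}{\zeta + 1}\sum_{n=-\infty}^{\infty}q^{\frac12 n(n+1)}\zeta^{n} = \frac{\zeta}{\zeta + 1}\sum_{n=-\infty}^{\infty}q^{\frac12 n(n+1)}\bigl(\zeta^{n} - (-1)^{n}\bigr) = \zeta\sum_{n=-\infty}^{\infty}q^{\frac12 n(n+1)}\,\frac{\zeta^{n} - (-1)^{n}}{\zeta + 1}.$$
Now
$$\lim_{\zeta \to -1}\frac{\zeta^{n} - (-1)^{n}}{\zeta + 1} = n(-1)^{n-1},$$
so that
$$\lim_{\zeta \to -1}\frac{\zeta}{\zeta + 1}\sum_{n=-\infty}^{\infty}q^{\frac12 n(n+1)}\zeta^{n} = \sum_{n=-\infty}^{\infty}(-1)^{n}n\,q^{\frac12 n(n+1)}.$$
The theorem therefore follows. (We have taken term by term limits twice, which is allowed since the series can be proved to be uniformly convergent.) □
Exercise 1. Prove that when $|q| < 1$,
$$\prod_{n=0}^{\infty}\bigl((1 - q^{5n+1})(1 - q^{5n+4})(1 - q^{5n+5})\bigr) = \sum_{n=-\infty}^{\infty}(-1)^{n}q^{\frac12 n(5n+3)},$$
$$\prod_{n=0}^{\infty}\bigl((1 - q^{5n+2})(1 - q^{5n+3})(1 - q^{5n+5})\bigr) = \sum_{n=-\infty}^{\infty}(-1)^{n}q^{\frac12 n(5n+1)}.$$
Exercise 2. Prove
$$q(1 - q^{24})(1 - q^{2\cdot 24})(1 - q^{3\cdot 24})\cdots = q^{1^{2}} - q^{5^{2}} - q^{7^{2}} + q^{11^{2}} + q^{13^{2}} - q^{17^{2}} - \cdots,$$
$$q\bigl((1 - q^{8})(1 - q^{2\cdot 8})(1 - q^{3\cdot 8})\cdots\bigr)^{3} = q^{1^{2}} - 3q^{3^{2}} + 5q^{5^{2}} - 7q^{7^{2}} + \cdots.$$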
8.4 Methods of Representing Partitions
Theorem 4.1. If $|q| < 1$, then
$$(1 + aq)(1 + aq^{3})(1 + aq^{5})\cdots = 1 + \frac{aq}{1 - q^{2}} + \frac{a^{2}q^{4}}{(1 - q^{2})(1 - q^{4})} + \dots + \frac{a^{m}q^{m^{2}}}{(1 - q^{2})\cdots(1 - q^{2m})} + \cdots.$$
Proof. Let $F(a)$ represent the left hand side of the equation above, and let
$$F(a) = 1 + c_1a + c_2a^{2} + \cdots.$$
From
$$(1 + aq)F(aq^{2}) = (1 + aq)(1 + aq^{3})(1 + aq^{5})\cdots = F(a),$$
by comparing the coefficients of $a^{n}$, we see that
$$c_nq^{2n} + c_{n-1}q^{2n-1} = c_n,$$
so that
$$c_m = \frac{q^{2m-1}}{1 - q^{2m}}\,c_{m-1} = \frac{q^{1+3+\dots+(2m-1)}}{(1 - q^{2})(1 - q^{4})\cdots(1 - q^{2m})} = \frac{q^{m^{2}}}{(1 - q^{2})(1 - q^{4})\cdots(1 - q^{2m})}.$$
This proves the theorem. □
On taking separately $a = 1$ and $a = q$ in this theorem we deduce the following two theorems:
Theorem 4.2. If $|q| < 1$, then
$$q_2 = (1 + q)(1 + q^{3})(1 + q^{5})\cdots = 1 + \frac{q}{1 - q^{2}} + \frac{q^{4}}{(1 - q^{2})(1 - q^{4})} + \dots + \frac{q^{m^{2}}}{(1 - q^{2})(1 - q^{4})\cdots(1 - q^{2m})} + \cdots. \qquad \square$$
Theorem 4.3. If $|q| < 1$, then
$$q_1 = (1 + q^{2})(1 + q^{4})(1 + q^{6})\cdots = 1 + \frac{q^{2}}{1 - q^{2}} + \frac{q^{6}}{(1 - q^{2})(1 - q^{4})} + \dots + \frac{q^{m^{2}+m}}{(1 - q^{2})(1 - q^{4})\cdots(1 - q^{2m})} + \cdots. \qquad \square$$
Replacing $q$ by $q^{\frac12}$ in Theorem 4.3 we have:
Theorem 4.4. If $|q| < 1$, then
$$(1 + q)(1 + q^{2})(1 + q^{3})\cdots = 1 + \frac{q}{1 - q} + \frac{q^{3}}{(1 - q)(1 - q^{2})} + \cdots.$$
Theorem 4.5. If $|q| < 1$, then
$$\frac{1}{(1 - aq)(1 - aq^{2})(1 - aq^{3})\cdots} = 1 + \frac{aq}{1 - q} + \frac{a^{2}q^{2}}{(1 - q)(1 - q^{2})} + \frac{a^{3}q^{3}}{(1 - q)(1 - q^{2})(1 - q^{3})} + \cdots.$$
Proof. Denote the left hand side of the above equation by $F(a)$. Then
$$F(aq) = \frac{1}{(1 - aq^{2})(1 - aq^{3})\cdots} = (1 - aq)F(a).$$
Substituting the expansion
$$F(a) = 1 + \sum_{m=1}^{\infty}c_ma^{m}$$
into this equation we have
$$c_mq^{m} = c_m - c_{m-1}q,$$
or
$$c_m = \frac{q}{1 - q^{m}}\,c_{m-1}.$$
Therefore
$$c_m = \frac{q^{m}}{(1 - q)(1 - q^{2})\cdots(1 - q^{m})}. \qquad \square$$
A special case of this theorem is:
Theorem 4.6. If $|q| < 1$, then
$$\frac{1}{q_0q_3} = 1 + \frac{q}{1 - q} + \frac{q^{2}}{(1 - q)(1 - q^{2})} + \frac{q^{3}}{(1 - q)(1 - q^{2})(1 - q^{3})} + \cdots. \qquad \square$$
If we replace $q$ and $a$ by $q^{2}$ and $q^{-1}$ respectively in Theorem 4.5, then we have:
Theorem 4.7. If $|q| < 1$, then
$$\frac{1}{q_3} = 1 + \frac{q}{1 - q^{2}} + \frac{q^{2}}{(1 - q^{2})(1 - q^{4})} + \frac{q^{3}}{(1 - q^{2})(1 - q^{4})(1 - q^{6})} + \cdots. \qquad \square$$
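The expansions in Theorems 4.1 and 4.5 can be compared numerically with the corresponding products. The sketch below uses real $a$ and $q$ with $|q| < 1$ and $|aq| < 1$; the function names and truncation lengths are illustrative assumptions.

```python
def product_4_1(a, q, terms=200):
    """Truncation of (1 + a q)(1 + a q^3)(1 + a q^5)..."""
    value = 1.0
    for n in range(1, terms + 1):
        value *= 1 + a * q ** (2 * n - 1)
    return value


def series_4_1(a, q, terms=60):
    """Truncation of sum_m a^m q^{m^2} / ((1 - q^2)...(1 - q^{2m}))."""
    total, denominator = 1.0, 1.0
    for m in range(1, terms + 1):
        denominator *= 1 - q ** (2 * m)
        total += a ** m * q ** (m * m) / denominator
    return total


def product_4_5(a, q, terms=400):
    """Truncation of 1 / ((1 - a q)(1 - a q^2)(1 - a q^3)...)."""
    value = 1.0
    for n in range(1, terms + 1):
        value /= 1 - a * q ** n
    return value


def series_4_5(a, q, terms=400):
    """Truncation of sum_m a^m q^m / ((1 - q)...(1 - q^m))."""
    total, denominator = 1.0, 1.0
    for m in range(1, terms + 1):
        denominator *= 1 - q ** m
        total += (a * q) ** m / denominator
    return total
```

Setting `a = 1` or `a = q` in `series_4_1` gives the series of Theorems 4.2 and 4.3.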
8.5 Graphical Method for Partitions
Let a partition of $n$ be
$$n = a_1 + a_2 + a_3 + \dots + a_s,$$
where the $a_i$ are arranged in descending order of magnitude, that is
$$a_1 \ge a_2 \ge \dots \ge a_s.$$
We construct a diagram where there are $a_1$ points in the first row, $a_2$ points in the second row, etc. The points are equally spaced apart in each row and the first points of the rows form the first column. Such a diagram is called the graph of the partition. For example

• • • • • • •
• • • •
• • •
• • •
•

is the graph of the partition 18 = 7 + 4 + 3 + 3 + 1. Clearly we can also read a graph vertically as columns rather than rows, giving another partition which is known as the conjugate partition. For the above graph the conjugate partition is 18 = 5 + 4 + 4 + 2 + 1 + 1 + 1. By reading graphs vertically or horizontally we have the following theorem:
Theorem 5.1. The number of partitions of $n$ into parts not exceeding $m$ is equal to the number of partitions of $n$ into not more than $m$ parts. □
Graphical methods can be used to prove more complicated theorems. For example:
Alternative proof of Theorem 4.2. Clearly the coefficient of $q^{n}$ in the expansion
$$(1 + q)(1 + q^{3})(1 + q^{5})\cdots$$
is the number $r(n)$ of partitions of $n$ into unequal odd parts. For example:
$$15 = 11 + 3 + 1 = 9 + 5 + 1 = 7 + 5 + 3.$$
We now reconstruct the graph for the partition 15 = 11 + 3 + 1 as follows, bending each part into a right angle about its middle point:

• • • • • •
• • •
• • •
•
•
•

Since each part in the partition is odd and the parts are unequal, the resulting diagram is still a graph for a partition. But this graph has a special property: it reads the same horizontally and vertically. Such a graph is called a self-conjugate graph and its corresponding partition a self-conjugate partition. Therefore each partition into unequal odd parts corresponds to a self-conjugate partition, and conversely. Therefore $r(n)$ is the number of self-conjugate partitions of $n$. Denote by $t$ the number of points on the side of the largest square in a self-conjugate graph ($t = 3$ in the diagram). Then, corresponding to each fixed $t$, the number of self-conjugate partitions is equal to the number of partitions of $(n - t^{2})/2$ into not more than $t$ parts. This is the same as the coefficient of $q^{n}$ in the expansion
$$\frac{q^{t^{2}}}{(1 - q^{2})(1 - q^{4})\cdots(1 - q^{2t})}.$$
We have therefore
$$(1 + q)(1 + q^{3})(1 + q^{5})\cdots = \sum_{t=0}^{\infty}\frac{q^{t^{2}}}{(1 - q^{2})(1 - q^{4})\cdots(1 - q^{2t})},$$
where the term $t = 0$ in the series is 1. This is Theorem 4.2. □
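Reading a graph by columns, and the correspondence used in the proof above, can both be made concrete. The following is a sketch with hypothetical function names; partitions are represented as lists of parts in descending order and are enumerated by brute force, so it is only meant for small $n$.

```python
def conjugate(partition):
    """Read the graph of a partition (parts in descending order) by columns."""
    if not partition:
        return []
    return [sum(1 for part in partition if part > column)
            for column in range(partition[0])]


def partitions(n, largest=None):
    """Generate all partitions of n as descending lists with parts <= largest."""
    if largest is None:
        largest = n
    if n == 0:
        yield []
        return
    for first in range(min(n, largest), 0, -1):
        for rest in partitions(n - first, first):
            yield [first] + rest


def count_self_conjugate(n):
    return sum(1 for part in partitions(n) if conjugate(part) == part)


def count_unequal_odd(n):
    return sum(1 for part in partitions(n)
               if all(x % 2 == 1 for x in part) and len(set(part)) == len(part))
```

For example, `conjugate([7, 4, 3, 3, 1])` gives `[5, 4, 4, 2, 1, 1, 1]`, the conjugate partition of 18 quoted in the text, and the two counting functions agree for every small $n$.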
Exercise 1. Prove
$$\frac{1}{(1 - q)(1 - q^{2})(1 - q^{3})\cdots} = 1 + \frac{q}{(1 - q)^{2}} + \frac{q^{4}}{(1 - q)^{2}(1 - q^{2})^{2}} + \cdots.$$
Exercise 2. Use the graphical method to prove Theorem 4.4. Suggestion: Shift each row by one unit successively in the graph for the partition with unequal parts, for example 19 = 7 + 5 + 4 + 2 + 1.

[Figure: the graph of 19 = 7 + 5 + 4 + 2 + 1 with each successive row shifted one unit to the right, together with a dividing line]

Now read the partition to the right of the line.
Another application of the graphical method is to give a proof of Theorem 3.3. This theorem clearly can be restated as follows:
Theorem 5.2. Denote by $E(n)$ the number of partitions of $n$ into an even number of unequal parts (even partitions), and by $U(n)$ the number of partitions of $n$ into an odd number of unequal parts (odd partitions). Then
$$E(n) - U(n) = \begin{cases}0, & \text{if } n \ne \tfrac12 k(3k \pm 1),\\ (-1)^{k}, & \text{if } n = \tfrac12 k(3k \pm 1).\end{cases}$$
Proof (Franklin). In a graph of a partition of $n$ we take the point in the extreme top right hand corner and draw a 45° line towards the bottom left hand corner, the end point of this line being a point of the graph. We denote this line by $\sigma$. We also denote by $\beta$ the bottom line joining all the points in the last row.

[Figure 1: the graph of a partition with the lines $\sigma$ and $\beta$ marked]

We can move the line $\beta$ to the top right hand corner of the diagram so that it is just to the right of, and parallel to, $\sigma$ (we use $O$ to indicate this operation). We can also move $\sigma$ to below, and parallel to, $\beta$ (we use $\Omega$ to indicate this operation). Corresponding to these operations $O$ or $\Omega$ we may obtain another graph for a partition of $n$, but it is possible that the resulting diagram cannot be a graph for a partition (the graphs of partitions are drawn in descending order for rows). In Figure 1, after the operation $O$ we obtain Figure 2, whereas after the operation $\Omega$ we obtain Figure 3, and according to our rules, Figure 2 is a graph for a partition whereas Figure 3 is not.

[Figures 2 and 3: the diagrams obtained from Figure 1 by the operations $O$ and $\Omega$ respectively]

We now discuss the three separate cases, where $\beta$ and $\sigma$ also denote the numbers of points on the respective lines:
1) $\beta < \sigma$. From Figure 1 we see that $O$ is always possible while $\Omega$ is impossible.
2) $\beta > \sigma$. Here $O$ is always impossible, and $\Omega$ is possible unless $\beta$ and $\sigma$ meet, and $\beta = \sigma + 1$ (Figure 4). In the exceptional situation we have a partition with two equal parts, contrary to what we require.

[Figures 4 and 5: the two exceptional graphs, in which $\beta$ and $\sigma$ meet]

3) $\beta = \sigma$. Here $O$ is possible apart from the situation when $\beta$ and $\sigma$ meet (Figure 5), which becomes impossible for $O$. $\Omega$ is always impossible.
From the above we see that if, for a partition of $n$, one of the operations $O$ and $\Omega$ is possible and the other is impossible, then we can obtain an even (or odd) partition from an odd (or even) partition. That is, we can establish a bijection between the even and odd partitions. However, corresponding to Figure 4 and Figure 5, such a bijection cannot be established. In the first case, $n$ must be of the form
$$n = (k + 1) + (k + 2) + \dots + 2k = \tfrac12(3k^{2} + k),$$
while in the second case
$$n = k + (k + 1) + \dots + (2k - 1) = \tfrac12(3k^{2} - k).$$
In either case we clearly have
$$E(n) - U(n) = (-1)^{k}. \qquad \square$$
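Theorem 5.2 can be confirmed by counting even and odd partitions into unequal parts directly. This is a sketch with illustrative function names; the counts are built up one part at a time, each new part switching the parity of the number of parts.

```python
def even_odd_unequal_counts(n_max):
    """Return (E, U): E[n], U[n] count partitions of n into an even / odd number of unequal parts."""
    even = [1] + [0] * n_max  # the empty partition of 0 has an even number of parts
    odd = [0] * (n_max + 1)
    for part in range(1, n_max + 1):
        # Descending n ensures that each part is used at most once.
        for n in range(n_max, part - 1, -1):
            even[n], odd[n] = even[n] + odd[n - part], odd[n] + even[n - part]
    return even, odd


def pentagonal_sign(n):
    """(-1)^k if n = k(3k - 1)/2 or k(3k + 1)/2 for some k >= 1, else 0."""
    k = 1
    while k * (3 * k - 1) // 2 <= n:
        if n in (k * (3 * k - 1) // 2, k * (3 * k + 1) // 2):
            return (-1) ** k
        k += 1
    return 0
```

For $n = 5$ the unequal partitions are $5$, $4+1$ and $3+2$, so $E(5) - U(5) = 2 - 1 = 1 = (-1)^{2}$, in agreement with $5 = \tfrac12 \cdot 2 \cdot (3\cdot 2 - 1)$.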
8.6 Estimates for p(n)
In this section we first use the simplest algebraic method to give the roughest estimates for $p(n)$ and we then use a slightly deeper method to determine an asymptotic formula for $\log p(n)$. However the still deeper method of applying Tauberian theory to determine the asymptotic formula for $p(n)$, and the even deeper method of applying results in modular function theory and analytic number theory to obtain the expansion for $p(n)$, are outside the scope of this book. Following the successive improvements of the results that can be obtained it is easy to judge the various levels of depth in the methods used. Theorem 6.1. If n > 1, then
Proof. 1) We first prove the inequality on the left hand side. From the integers $1, 2, \dots, [\sqrt n]$ we select any $r$ of them $a_1, a_2, \dots, a_r$ and form the partition
$$n = a_1 + a_2 + \dots + a_r + (n - a_1 - a_2 - \dots - a_r). \qquad (1)$$
Since
$$a_1 + a_2 + \dots + a_r \le 1 + 2 + \dots + [\sqrt n] < n,$$
we see that (1) is a partition of $n$. The total number of ways of selecting these partitions is
$$1 + \binom{[\sqrt n]}{1} + \binom{[\sqrt n]}{2} + \dots + \binom{[\sqrt n]}{r} + \dots = (1 + 1)^{[\sqrt n]} = 2^{[\sqrt n]},$$
so that the left hand side inequality in the theorem follows.
2) We next prove the right hand inequality. Consider the graph of a partition of $n$:

[Figure: the graph of a partition of $n$, with the largest square in the top left hand corner]

Let $r$ be the side of the largest square in the top left hand corner in the diagram. The top right hand corner of the diagram has at most $n - r^{2}$ points forming a partition with at most $r$ rows, and a similar interpretation can be given to the bottom left hand corner. If $r$ is fixed, there are at most $n^{r}$ partitions for the top right hand corner and similarly for the bottom left hand corner. Since clearly $r \le [\sqrt n]$ we have
$$p(n) \le \sum_{r=1}^{[\sqrt n]}n^{2r} < \sqrt n\;n^{2[\sqrt n]} < n^{3[\sqrt n]}. \qquad \square$$
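Theorem 6.2.
$$\lim_{n \to \infty}\frac{\log p(n)}{n^{\frac12}} = \pi\sqrt{\frac{2}{3}}.$$
We shall require some preparation for the proof of this theorem.
Theorem 6.3.
$$np(n) = \sum_{lk \le n}l\,p(n - lk).$$
Proof. Suppose that $|q| < 1$ and let
$$f(q) = \frac{1}{(1 - q)(1 - q^{2})(1 - q^{3})\cdots} = 1 + \sum_{l=1}^{\infty}p(l)q^{l}.$$
Taking the logarithmic derivative of the product formula for $f(q)$ we have
$$\frac{f'(q)}{f(q)} = \sum_{l=1}^{\infty}\frac{lq^{l-1}}{1 - q^{l}} = \frac{1}{q}\sum_{l=1}^{\infty}l(q^{l} + q^{2l} + q^{3l} + \cdots) = \frac{1}{q}\sum_{l=1}^{\infty}\sum_{k=1}^{\infty}lq^{lk}.$$
Differentiating $f(q)$ from the series expansion we have
$$\sum_{n=1}^{\infty}np(n)q^{n} = qf'(q) = f(q)\sum_{l=1}^{\infty}\sum_{k=1}^{\infty}lq^{lk} = \Bigl(1 + \sum_{v=1}^{\infty}p(v)q^{v}\Bigr)\sum_{l=1}^{\infty}\sum_{k=1}^{\infty}lq^{lk}.$$
The theorem follows from comparison of coefficients, with the convention $p(0) = 1$. □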
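The recurrence of Theorem 6.3 can be checked against partition numbers computed independently from the generating function. This is a sketch with illustrative function names, using the convention $p(0) = 1$.

```python
def partition_numbers(n_max):
    """p(0), ..., p(n_max) from the product 1/((1 - q)(1 - q^2)...)."""
    p = [1] + [0] * n_max
    for part in range(1, n_max + 1):
        for n in range(part, n_max + 1):
            p[n] += p[n - part]
    return p


def recurrence_rhs(n, p):
    """sum over all l, k >= 1 with l*k <= n of l * p(n - l*k)."""
    return sum(l * p[n - l * k]
               for l in range(1, n + 1)
               for k in range(1, n // l + 1))
```

For $n = 2$ the right hand side is $p(1) + p(0) + 2p(0) = 4 = 2\,p(2)$. Grouping the terms by the value of $lk$ shows that the recurrence can also be written as $np(n) = \sum_{j=1}^{n}\sigma(j)\,p(n - j)$, where $\sigma(j)$ is the sum of the divisors of $j$.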
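Theorem 6.4. If $n > v > 0$, then
$$\frac{v}{2\sqrt n} < \sqrt n - \sqrt{n - v} < \frac{v}{2\sqrt n} + \frac{v^{2}}{2n^{\frac32}}.$$
Proof. This follows from the inequality
$$1 - \frac{x}{2} - \frac{x^{2}}{2} < (1 - x)^{\frac12} < 1 - \frac{x}{2},$$
valid for $0 < x < 1$, on taking $x = v/n$. □
Theorem 6.5.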
If 0 <
e X (1  e X )2
1 x2
Cl
(and later
D
x < 1, then Cl<
Here
O<x
C2, C3, ••• )
1 x2
<.
represents a positive constant.
Proof From
the inequality on the right hand side follows at once. Since 1
= X (1 + O(x 2 )),
;c~,
etx _ e
2 X
we have
which establishes the inequality for the left han.d side in the theorem.
D
Theorem 6.6. Let oc be positive. Then n 2n
co
2 60c To be accurate here,
C2
Proof From I~ 1 Ix'
=
C2Jn
co
_
I I
<
le a1kn
k= 1 1= 1
n2n
t <2·
60c
depends on oc. x/(1  X)2, we see that the double sum is equal to eaknt
co
k~l
(1 
e akn
(2)
f)2·
From the inequality on the right hand side in Theorem 6.5 we see that this sum here is less than 1
co
k~l (ockn t )2
n
= oc 2
Separating the sum (2) into two sums co
k
=1
k
= [J~] + 1
co
1
n2n
k~l k 2 = 6oc 2 •
and applying the left hand inequality of Theorem 6.5, we have [J;;]
1
II> I
k t 2 + O(Jn) (oc n ) n [J;;] I =2" I2"+O(Jn) oc k=1 k k= 1
= ~:~ + 0 ( n
;2) +
I _ k>Jn
O(Jn)
n 2n + O(Jn). 60c
=2
Applying the right hand inequality of Theorem 6.5, we have
I2 = o(n I _;2) =
O(Jn).
k>Jn
Collecting these results, our theorem is proved. Proof of Theorem 6.2. Let c = I) We first prove
0
nA. (3)
When n = 1, (3) clearly holds. We now use induction on n. From Theorem 6.3 and the induction hypothesis we see that np(n) <
I
lec(nIk)t
Ik"'n
(using Theorem 6.4) lk~n
00
< ecnt
00
I I
le clk /(2n t )
k=I/=1
n2 n .'< ecnt ___2 = necn > (using Theorem 6.6)
6(c/2)
which proves (3). 2) We next prove: Given any positive e there exists A (= A(e)) such that I p(n) > _e(Cs)n t . A
We use induction on n, but the choice of A will not be made clear until later. From Theorems 6.3 and 6.4 together with the induction hypothesis we see that (4)
Since e X
;?;
I  x, the double sum is
I
;?;
Pk 2 ) I  I (c  e) ,2 n'
let(ce)lknt (
Ik';n
ce
=Il' I2 2n'
(say).
For any positive t we always have e X
I
let(ce)lknt
Ik>n
=
(5)
O(x t ), so that
(ni I h= o(n I: I: =0
11
it (lk)
~~t)
Ik>n
h
11itkit)
1= 1 k= 1
= O(n it ),
if t> 8.
(6)
From this and Theorem 6.6 we have 2n 1 n
II> 3(c 
e)
2n 2 n
C3Jn
2 
2n 2 n
(I
I)
In
= 3c2 + 3 (c _ e)2  c2  C3 n (7)
(using
I
I
2
(c  e)
2"=2 c
J x
3
dx>2ec 3 ).
ce
On the other hand, by the binomial theorem and Theorem 6.5,
I2 = I
k 2 [3e t (ce)lknt
Ik';n n
~
co
I
I
k 2
k=1
1 3 e·t(ce)lknt
1=1 e  t(c e)kn  t
n
~ 12
I
k 2
k=1
=0 ( n
(le
I
t(c
e)kn
1)4
t(c
e)kn
t)2 ).
n
k= 1
(l 
e
We divide the sum in the bracket into two parts: n
I = I k=1
k,.j~
+
I
j~
(8)
In the first part t(e  e)knt < te, and when x < te,
f x
I  e X
=
etdt > etcx,
o
which gives L _ (1 _ k ..
Jn
et~ce)knt)2 = 0 (n
L k ..
Jn
:2) =
O(n).
In the second part t(e  e)kn t ~ t(e  e) and
so that
From this and (8) we see that L2 = O(n2).
(9)
Collecting (4), (5), (7), (9) we have I np(n) > _e(ce)n\(l A
+ 2ee 1 )n 
e4Jn).
When
e4 )2 (2ee
n>  1 we have I
p(n) > _e(ce)n t .
(10)
A
When n :::; e;(2ee 1 ) 2 we take A large enough so that (10) holds. The theorem is proved. 0
8.7 The Problem of Sums of Squares
Let $r_s(n)$ denote the number of sets of integer solutions $(x_1, \dots, x_s)$ to the equation
$$x_1^{2} + \dots + x_s^{2} = n.$$
From Theorem 6.7.5 we already have
$$r_2(n) = 4\sum_{u \mid n}(-1)^{\frac12(u-1)},$$
where $u$ runs over the odd divisors of $n$. This theorem is clearly equivalent to the following:
Theorem 7.1. If $|q| < 1$, then
$$q_0^{2}q_2^{4} = \Bigl(\sum_{n=-\infty}^{\infty}q^{n^{2}}\Bigr)^{2} = 1 + 4\left(\frac{q}{1 - q} - \frac{q^{3}}{1 - q^{3}} + \frac{q^{5}}{1 - q^{5}} - \cdots\right). \qquad (1)\qquad \square$$
We now prove:
Theorem 7.2. If $|q| < 1$, then
$$q_0^{4}q_2^{8} = \Bigl(\sum_{n=-\infty}^{\infty}q^{n^{2}}\Bigr)^{4} = 1 + 8{\sum_{m}}'\frac{mq^{m}}{1 - q^{m}},$$
where $\sum'$ means summation over all positive integers $m$ not divisible by 4. In other words
$$r_4(n) = 8{\sum_{m \mid n}}'m,$$
where $m$ runs over the divisors of $n$ not divisible by 4.
Before we prove this theorem we shall need some preparation. Let
$$u_r = \frac{q^{r}}{1 - q^{r}},$$
so that
$$u_r(1 + u_r) = \frac{q^{r}}{(1 - q^{r})^{2}}. \qquad (2)$$
Theorem 7.3.
$$\sum_{m=1}^{\infty}u_m(1 + u_m) = \sum_{n=1}^{\infty}nu_n.$$
Proof. From formula (2) we have
$$\sum_{m=1}^{\infty}u_m(1 + u_m) = \sum_{m=1}^{\infty}\frac{q^{m}}{(1 - q^{m})^{2}} = \sum_{m=1}^{\infty}\sum_{n=1}^{\infty}nq^{mn} = \sum_{n=1}^{\infty}nu_n. \qquad \square$$
Theorem 7.4.
$$\sum_{m=1}^{\infty}(-1)^{m-1}u_{2m}(1 + u_{2m}) = \sum_{n=1}^{\infty}(2n - 1)u_{4n-2}.$$
Proof. From formula (2) we have
$$\sum_{m=1}^{\infty}(-1)^{m-1}\frac{q^{2m}}{(1 - q^{2m})^{2}} = \sum_{m=1}^{\infty}(-1)^{m-1}\sum_{r=1}^{\infty}rq^{2mr} = \sum_{r=1}^{\infty}r\sum_{m=1}^{\infty}(-1)^{m-1}q^{2mr} = \sum_{r=1}^{\infty}\frac{rq^{2r}}{1 + q^{2r}}$$
$$= \sum_{r=1}^{\infty}\left(\frac{rq^{2r}}{1 - q^{2r}} - \frac{2rq^{4r}}{1 - q^{4r}}\right) = \sum_{n=1}^{\infty}\frac{(2n - 1)q^{4n-2}}{1 - q^{4n-2}}. \qquad \square$$
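Theorem 7.5. Let $\theta$ be real and not an even multiple of $\pi$. Then
$$\Bigl(\tfrac14\cot\tfrac12\theta + u_1\sin\theta + u_2\sin 2\theta + \cdots\Bigr)^{2} = \Bigl(\tfrac14\cot\tfrac12\theta\Bigr)^{2} + C_0 + \sum_{k=1}^{\infty}C_k\cos k\theta, \qquad (3)$$
where
$$C_0 = \frac12\sum_{n=1}^{\infty}nu_n, \qquad C_k = u_k\bigl(1 + u_k - \tfrac12 k\bigr), \quad k \ge 1.$$
Proof. The left hand side of formula (3) is equal to
$$\Bigl(\tfrac14\cot\tfrac12\theta\Bigr)^{2} + \frac12\sum_{n=1}^{\infty}u_n\cot\tfrac12\theta\,\sin n\theta + \sum_{m=1}^{\infty}\sum_{n=1}^{\infty}u_mu_n\sin m\theta\,\sin n\theta.$$
Now
$$\tfrac12\cot\tfrac12\theta\,\sin n\theta = \tfrac12 + \cos\theta + \dots + \cos(n-1)\theta + \tfrac12\cos n\theta,$$
$$2\sin m\theta\,\sin n\theta = \cos(m - n)\theta - \cos(m + n)\theta,$$
so that the formula is equal to
$$\Bigl(\tfrac14\cot\tfrac12\theta\Bigr)^{2} + \sum_{n=1}^{\infty}u_n\Bigl(\tfrac12 + \cos\theta + \dots + \cos(n-1)\theta + \tfrac12\cos n\theta\Bigr) + \frac12\sum_{m=1}^{\infty}\sum_{n=1}^{\infty}u_mu_n\bigl(\cos(m - n)\theta - \cos(m + n)\theta\bigr).$$
From this we have
$$C_0 = \frac12\sum_{n=1}^{\infty}(u_n + u_n^{2}),$$
$$C_k = \frac12u_k + \sum_{n=k+1}^{\infty}u_n + \frac12\sum_{m-n=k}u_mu_n + \frac12\sum_{n-m=k}u_mu_n - \frac12\sum_{n+m=k}u_mu_n,$$
where $m \ge 1$, $n \ge 1$. From Theorem 7.3 we see that
$$C_0 = \frac12\sum_{n=1}^{\infty}nu_n,$$
and
$$C_k = \frac12u_k + \sum_{l=1}^{\infty}u_{k+l} + \sum_{l=1}^{\infty}u_lu_{k+l} - \frac12\sum_{l=1}^{k-1}u_lu_{k-l}.$$
Now
$$u_{k+l} + u_lu_{k+l} = u_k(u_l - u_{k+l})$$
and
$$u_lu_{k-l} = u_k(1 + u_l + u_{k-l}),$$
so that
$$C_k = u_k\Bigl(\tfrac12 + u_1 + \dots + u_k - \tfrac12(k - 1) - (u_1 + \dots + u_{k-1})\Bigr) = u_k\bigl(1 + u_k - \tfrac12 k\bigr). \qquad \square$$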
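Theorem 7.6.
$$\Big(\frac14 + \sum_{n=0}^{\infty} u_{4n+1} - \sum_{n=0}^{\infty} u_{4n+3}\Big)^2 = \frac1{16} + \frac12\sum_{4 \nmid n} n u_n.$$

Proof. In Theorem 7.5 we take $\theta = \pi/2$, giving
$$\Big(\frac14 + \sum_{n=0}^{\infty} u_{4n+1} - \sum_{n=0}^{\infty} u_{4n+3}\Big)^2 = \frac1{16} + C_0 + \sum_{m=1}^{\infty} (-1)^m C_{2m}$$
$$= \frac1{16} + \frac12\sum_{n=1}^{\infty} n u_n + \sum_{m=1}^{\infty} (-1)^m u_{2m}(1 + u_{2m} - m)$$
$$= \frac1{16} + \frac12\sum_{m=1}^{\infty} (2m-1) u_{2m-1} + \sum_{m=1}^{\infty} (-1)^m u_{2m}(1 + u_{2m}) + 2\sum_{m=1}^{\infty} (2m-1) u_{4m-2}$$
$$= \frac1{16} + \frac12\sum_{m=1}^{\infty} (2m-1) u_{2m-1} + \sum_{m=1}^{\infty} (2m-1) u_{4m-2} \quad \text{(by Theorem 7.4)}$$
$$= \frac1{16} + \frac12\sum_{4 \nmid n} n u_n.$$
□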
Theorem 7.2 now follows easily from Theorem 7.1 and Theorem 7.6. From Theorem 7.2 we deduce at once:

Theorem 7.7. $r_4(n)/8$ is a multiplicative function. □

Theorem 7.8 (Lagrange). Every positive integer is the sum of four squares. □
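Lagrange's theorem guarantees that a representation as a sum of four squares exists for every positive integer, so a plain exhaustive search must always succeed. The following sketch (the function name and the ordering convention $a \ge b \ge c \ge d \ge 0$ are choices made here, not taken from the text) returns one such representation.

```python
from math import isqrt


def four_squares(n):
    """Return (a, b, c, d) with a >= b >= c >= d >= 0 and
    a^2 + b^2 + c^2 + d^2 == n, or None if the search fails."""
    for a in range(isqrt(n), -1, -1):
        rest_a = n - a * a
        for b in range(min(a, isqrt(rest_a)), -1, -1):
            rest_b = rest_a - b * b
            for c in range(min(b, isqrt(rest_b)), -1, -1):
                rest_c = rest_b - c * c
                d = isqrt(rest_c)
                if d <= c and d * d == rest_c:
                    return (a, b, c, d)
    return None


if __name__ == "__main__":
    for n in (7, 15, 23, 2024):
        print(n, four_squares(n))
```

The search is exponential in nothing but still cubic in $\sqrt{n}$, so it is meant for small $n$ only; it illustrates existence, not an efficient algorithm.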
Apart from these we also have the following application:

Theorem 7.9 (Jacobi). $q_2^8 - q_3^8 = 16 q q_1^8$.

If we substitute the representation formulae in §1 into this identity then we have
$$\prod_{n=1}^{\infty} (1 + q^{2n-1})^8 - \prod_{n=1}^{\infty} (1 - q^{2n-1})^8 = 16 q\prod_{n=1}^{\infty} (1 + q^{2n})^8.$$
(Jacobi called this result "aequatio identica satis abstrusa".)

Proof. From
$$(q_0 q_3^2)^4 = \sum_{n=0}^{\infty} r_4(n)(-1)^n q^n$$
and
$$(2 q_0 q_1^2)^4 = \Big(\sum_{n=-\infty}^{\infty} q^{n(n+1)}\Big)^4,$$
we see that our required identity is equivalent to
$$\sum_{n=0}^{\infty} r_4(n)\big(1 - (-1)^n\big) q^n = q\Big(\sum_{n=-\infty}^{\infty} q^{n(n+1)}\Big)^4.$$
Let $s_4(n)$ denote the number of solutions to
$$x_1(x_1 + 1) + \cdots + x_4(x_4 + 1) = n - 1, \tag{4}$$
where $n$ must be odd. Thus our theorem has the following arithmetical interpretation: if $n$ is odd, then $s_4(n)$ is equal to $2 r_4(n)$. We multiply equation (4) by 4 and from completing squares we have
$$(2x_1 + 1)^2 + \cdots + (2x_4 + 1)^2 = 4n.$$
The $r_4(4n)$ solutions to the Diophantine equation
$$y_1^2 + y_2^2 + y_3^2 + y_4^2 = 4n$$
have only two types: (i) $y_1, y_2, y_3, y_4$ all odd, (ii) $y_1, y_2, y_3, y_4$ all even. From this it follows that
$$r_4(4n) = s_4(n) + r_4(n).$$
From Theorem 7.2 we have
$$r_4(4n) = 8\sum_{m \mid 2n} m = 8\sum_{m \mid n}(m + 2m) = 3\Big(8\sum_{m \mid n} m\Big) = 3 r_4(n),$$
and hence
$$s_4(n) = 2 r_4(n).$$
The theorem is proved. □
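The arithmetical content of this proof is that, for odd $n$, the number of ways of writing $4n$ as a sum of four odd squares $(2x_1+1)^2 + \cdots + (2x_4+1)^2$ is twice $r_4(n)$, and that $r_4(4n) = 3r_4(n)$. A small enumeration makes this concrete; the helper names below are illustrative only.

```python
from math import isqrt


def count_four_squares(total, values):
    """Count 4-tuples drawn from `values` whose squares sum to `total`."""
    pairs = [0] * (total + 1)
    for x in values:
        for y in values:
            v = x * x + y * y
            if v <= total:
                pairs[v] += 1
    return sum(pairs[k] * pairs[total - k] for k in range(total + 1))


def r4(n):
    """Number of representations of n as a sum of four integer squares."""
    b = isqrt(n)
    return count_four_squares(n, range(-b, b + 1))


def odd_representations(n):
    """Number of representations of 4n as a sum of four odd squares."""
    b = isqrt(4 * n)
    odd = [y for y in range(-b, b + 1) if y % 2 != 0]
    return count_four_squares(4 * n, odd)


if __name__ == "__main__":
    for n in (1, 3, 5, 7, 9):
        print(n, odd_representations(n), 2 * r4(n), r4(4 * n), 3 * r4(n))
```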
Exercise 1. Use the following method to prove that
$$\frac{1}{1^2} + \frac{1}{2^2} + \frac{1}{3^2} + \frac{1}{4^2} + \cdots = \frac{\pi^2}{6}.$$
Obtain the asymptotic formula
for the number A(x) of lattice points inside the four dimensional sphere
Find another representation for $A(x)$ with Theorem 7.2 and compare the results.

Note. From this exercise and (6.14.2) we deduce at once that
$$\sum_{n=1}^{\infty} \frac{\mu(n)}{n^2} = \frac{6}{\pi^2}.$$
Exercise 2. Show that
Exercise 3. Use the identity
$$(1 - \cos n\theta)\cot^2\tfrac12\theta = (2n - 1) + 4(n - 1)\cos\theta + 4(n - 2)\cos 2\theta + \cdots + 4\cos(n - 1)\theta + \cos n\theta$$
to prove that
$$\Big\{\frac18\cot^2\tfrac12\theta + \frac1{12} + \frac{x}{1 - x}(1 - \cos\theta) + \frac{2x^2}{1 - x^2}(1 - \cos 2\theta) + \frac{3x^3}{1 - x^3}(1 - \cos 3\theta) + \cdots\Big\}^2$$
$$= \Big(\frac18\cot^2\tfrac12\theta + \frac1{12}\Big)^2 + \frac1{12}\Big\{\frac{1^3 x}{1 - x}(5 + \cos\theta) + \frac{2^3 x^2}{1 - x^2}(5 + \cos 2\theta) + \frac{3^3 x^3}{1 - x^3}(5 + \cos 3\theta) + \cdots\Big\}.$$
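8.8 Density

Let $r_s(n, q)$ denote the number of solutions to
$$x_1^2 + \cdots + x_s^2 \equiv n \pmod q. \tag{1}$$
Consider the substitution
$$x_1^2 + \cdots + x_s^2 = y.$$
There can be $q^s$ values on the left hand side and $q$ values on the right hand side. This means that corresponding to one value of $y$ there are, on average, $q^{s-1}$ solutions. We now consider the ratio between the number of solutions and the average number
$$\Delta_q(n) = \frac{r_s(n, q)}{q^{s-1}}.$$
Let
$$\partial_p(n) = \lim_{l \to \infty} \Delta_{p^l}(n);$$
we call this the $p$-density of the congruence (1). We also define
$$\partial_\infty(n) = \lim_{\delta \to 0}\frac{1}{2\delta}\int\cdots\int_{n - \delta \le x_1^2 + \cdots + x_s^2 \le n + \delta} dx_1 \cdots dx_s,$$
which we call the real density of the congruence (1). We now calculate the values of the various densities.

Theorem 8.1. When $s$ is even the real density is equal to
$$\partial_\infty(n) = \frac{\pi^{s/2}}{(\frac{s}{2} - 1)!}\, n^{\frac{s}{2} - 1}. \tag{2}$$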
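The ratio between the number $r_s(n, q)$ of solutions of $x_1^2 + \cdots + x_s^2 \equiv n \pmod q$ and the average number $q^{s-1}$ can be computed exactly for small moduli. The sketch below (names are illustrative; the symbol `delta` here simply means this ratio) builds the solution counts by repeated convolution of the distribution of squares modulo $q$.

```python
from fractions import Fraction


def count_solutions(n, q, s):
    """Number of (x_1, ..., x_s) mod q with x_1^2 + ... + x_s^2 = n (mod q)."""
    # squares[r] = number of residues x mod q with x*x = r (mod q)
    squares = [0] * q
    for x in range(q):
        squares[x * x % q] += 1
    counts = [0] * q
    counts[0] = 1  # the empty sum is 0
    for _ in range(s):
        new = [0] * q
        for a, ca in enumerate(counts):
            if ca:
                for b, cb in enumerate(squares):
                    if cb:
                        new[(a + b) % q] += ca * cb
        counts = new
    return counts[n % q]


def delta(n, q, s):
    """Ratio of the solution count to the average count q^(s-1)."""
    return Fraction(count_solutions(n, q, s), q ** (s - 1))


if __name__ == "__main__":
    for q in (3, 9, 27, 5, 25):
        print(q, delta(1, q, 4))
```

For $s = 4$ and an odd prime $p$ not dividing $n$, the printed ratios stay constant along the powers of $p$, which is the stabilisation that makes the limit over $p^l$ meaningful.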
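Proof. We have, with polar coordinates,
$$\iint_{x^2 + y^2 \le 1} (1 - x^2 - y^2)^{a-1}\,dx\,dy = \int_0^{2\pi} d\theta\int_0^1 (1 - \rho^2)^{a-1}\rho\,d\rho = \frac{\pi}{a}.$$
We next use induction to prove the result:
$$V_s = \int\cdots\int_{1 - x_1^2 - \cdots - x_s^2 \ge 0} dx_1 \cdots dx_s = \frac{\pi^{s/2}}{(\frac{s}{2})!}.$$
Let
$$x_\nu = y_\nu\sqrt{1 - x_1^2 - x_2^2} \qquad (\nu = 3, \ldots, s).$$
Then
$$V_s = V_{s-2}\iint_{1 - x_1^2 - x_2^2 \ge 0} (1 - x_1^2 - x_2^2)^{\frac{s-2}{2}}\,dx_1\,dx_2 = \frac{\pi}{s/2}\,V_{s-2} = \frac{\pi^{s/2}}{(\frac{s}{2})!}.$$
We then have
$$\partial_\infty(n) = \lim_{\delta \to 0}\frac{1}{2\delta}\Big(\int\cdots\int_{x_1^2 + \cdots + x_s^2 \le n + \delta} dx_1 \cdots dx_s - \int\cdots\int_{x_1^2 + \cdots + x_s^2 \le n - \delta} dx_1 \cdots dx_s\Big) = V_s\,\frac{d}{dn}\,n^{s/2} = \frac{\pi^{s/2}}{(\frac{s}{2} - 1)!}\,n^{\frac{s}{2} - 1}.$$
□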
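In order to determine the $p$-density we shall require the following preparation. Let
$$A_{p^l}(n) = \frac{1}{p^{ls}}\sum_{\substack{a=1 \\ p \nmid a}}^{p^l}\Big(\sum_{x=1}^{p^l} e^{2\pi i a x^2/p^l}\Big)^s e^{-2\pi i a n/p^l}.$$

Theorem 8.2.
$$\sum_{m=0}^{l} A_{p^m}(n) = \Delta_{p^l}(n), \qquad \Delta_{p^l}(n) = \frac{r_s(n, p^l)}{p^{(s-1)l}}.$$

Proof.
$$\sum_{m=0}^{l} A_{p^m}(n) = \sum_{m=0}^{l}\frac{1}{p^{ls}}\sum_{\substack{a=1 \\ p^{l-m} \| a}}^{p^l}\Big(\sum_{x=1}^{p^l} e^{2\pi i a x^2/p^l}\Big)^s e^{-2\pi i a n/p^l}$$
$$= \frac{1}{p^{ls}}\sum_{a=1}^{p^l}\Big(\sum_{x=1}^{p^l} e^{2\pi i a x^2/p^l}\Big)^s e^{-2\pi i a n/p^l}$$
$$= \frac{1}{p^{(s-1)l}}\cdot\frac{1}{p^l}\sum_{a=1}^{p^l}\Big(\sum_{x=1}^{p^l} e^{2\pi i a x^2/p^l}\Big)^s e^{-2\pi i a n/p^l}$$
$$= \frac{r_s(n, p^l)}{p^{(s-1)l}} = \Delta_{p^l}(n).$$
□

Theorem 8.3. Let $s = 4r$ and $p$ be an odd prime. Then
$$A_{p^l}(n) = p^{-2rl}\sum_{\substack{a=1 \\ p \nmid a}}^{p^l} e^{2\pi i a n/p^l}.$$

Proof. From Theorem 7.5.6 we know that if $p \nmid a$, then
$$\Big(\sum_{x=1}^{p^l} e^{2\pi i a x^2/p^l}\Big)^4 = p^{2l},$$
so that
$$A_{p^l}(n) = p^{-2rl}\sum_{\substack{a=1 \\ p \nmid a}}^{p^l} e^{-2\pi i a n/p^l}.$$
On replacing $a$ by $-a$ the theorem follows. □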
Theorem 8.4. Let s = 4r. Then
A2(n)
0,
=
Proof From Theorem 7.5.3 we have A2(n) that if 2,ra, then
= 0.
Also, from Theorem 7.5.7 we see if 21/, if 21'1.
From (l
7ti
+ ia)4 = 
4, (e4"a)4
=  I, we have
(xtl e27tiax2/2Jyr
and Theorem 8.4 follows. Theorem 8.5. Let s
oin)
=
=
= ( 1),2 2r(/+l),
0
4r, p =F 2, ptlln. Then
(l  p2r)
L p(2rl)1 = 1=0
(1  p2r)(pt)(2rl)U2r_l(pt),
where
Proof From Theorem 8.3 and Theorem 7.4.4 we have co
co
1= 1
t+ 1
L p2rl+1  L
=
1=0
=
p2rl+ll
1= 1
L p(2rl)I(1 
D
p2r).
1=0
Theorem 8.6. Suppose that s
= 4r and let 2tlln. Then if c=O, if c>O,2,./'r, if c>O,2Ir. D
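The reader can supply the proof which is similar to Theorem 8.5.

Definition. Let
$$\delta_s(n) = \partial_\infty(n)\prod_p \partial_p(n).$$

Theorem 8.7. If $s = 4$, then
$$\delta_4(n) = r_4(n) = 8\sum_{\substack{d \mid n \\ 4 \nmid d}} d.$$

Proof. Let $n = 2^t n'$, $2 \nmid n'$. Then from
$$\frac{1}{\zeta(s)} = \prod_p (1 - p^{-s})$$
and Theorem 8.5 we have
$$\prod_{p > 2} \partial_p(n) = \frac43\cdot\frac{1}{\zeta(2)}\,n'^{-1}\sigma(n') = \frac{8}{\pi^2}\,n'^{-1}\sigma(n').$$
Also, from Theorem 8.1, we have
$$\partial_\infty(n) = \pi^2 n,$$
so that
$$\partial_\infty(n)\prod_{p > 2} \partial_p(n) = 8\cdot 2^t\sigma(n').$$
If $n$ is odd, then the theorem is proved. If $n$ is even, then, from Theorem 8.6, we have
$$\partial_2(n) = \frac{3}{2^t}, \qquad \delta_4(n) = 24\,\sigma(n').$$
The theorem is proved. □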
D
If s = 8, then = 16(  l)n I
bs(n)
( l)d d 3 .
din
Proof Let n = 2tn', 2,rn'. Then
}]2 op(n) = 1516 ,(4)1 n'3u3(n') = 96n4n'3 u3 (n'). Also, from Theorem 8.1, we have
n 3 oo(n) = _n 6' 4
so that oo(n)
n op(n) = 16 . 23tu3(n'). p>2
Also o2(n)
=
(l  2 3 (t+l). 15)(1  t)I;
hence When n is even
I ( l)dd 3 =
 u3(n') + 23u3(n') + 23.2 u3(n') + ... + 2 3 . uj(n') t
din
= 
The theorem is proved.
2U3(n')
+
23 (t+ 1) _ 1 23 _ 1 u3(n')
D
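The expression $16(-1)^n\sum_{d \mid n}(-1)^d d^3$ of Theorem 8.8 can be compared with a direct count of representations by eight squares. The sketch below obtains the counts as coefficients of the eighth power of the truncated series $\sum q^{x^2}$; the function names are illustrative, and the comparison is a numerical check only.

```python
def theta_series(limit):
    """Coefficients of sum over integers x of q^(x^2), up to q^limit."""
    coeffs = [0] * (limit + 1)
    coeffs[0] = 1
    x = 1
    while x * x <= limit:
        coeffs[x * x] = 2  # contributions of +x and -x
        x += 1
    return coeffs


def multiply(a, b, limit):
    """Product of two power series truncated at q^limit."""
    out = [0] * (limit + 1)
    for i, ai in enumerate(a):
        if ai:
            for j, bj in enumerate(b):
                if i + j > limit:
                    break
                out[i + j] += ai * bj
    return out


def r_s_table(s, limit):
    """table[n] = number of representations of n by s integer squares."""
    theta = theta_series(limit)
    result = [1] + [0] * limit
    for _ in range(s):
        result = multiply(result, theta, limit)
    return result


def theorem_8_8(n):
    """16 * (-1)^n * sum over divisors d of n of (-1)^d * d^3."""
    return 16 * (-1) ** n * sum((-1) ** d * d ** 3
                                for d in range(1, n + 1) if n % d == 0)


if __name__ == "__main__":
    table = r_s_table(8, 12)
    for n in range(1, 13):
        print(n, table[n], theorem_8_8(n))
```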
Exercise 1. Prove the following: Let s = 2r. If r is even, then
If r is odd, then L(r)6s(2 tn')
= (( ~, 1) + ( r 1)2(1r)(d 1))n'lrPr1(n'),
where
1:
L(r) =
n= 1
X(7) , n
and x(n) = 0, 1,0,  1 when n == 0, 1,2,3 (mod 4). Also
pt(n) = L (~)qt. q qln
Exercise 2. Prove that <>2(n)
=
2r2(n).
Exercise 3. Prove that
8.9 A Summary of the Problem of Sums of Squares

In the previous section we proved that $r_4(n) = \delta_4(n)$, but is this a mere coincidence? Actually we can prove that, for $3 \le s \le 8$, we have $r_s(n) = \delta_s(n)$, and that this is no longer true if $s > 8$. Up to the present $r_s(n)$ has been explicitly evaluated for $s \le 24$. For example: $r_3(n)$
16 = n!X2(n)K(  4n) 1t
f1
(I +  + ... +;=t 1
1
P
p21n
P
where the definition of"C is p2tln, p2(t+ 1),tn, K(  4n)
=
I Lco (_ 4n) , m=l
m
m
and if 4 an == 7 (mod 8), if 4 an == 3 (mod 8), if 4 an == 1,2,5,6 (mod 8),
and here the definition of a is 4a ln, 4a +1,rn.
where u1't(n)
= I(
1)dd 1 1,
din
and $\tau(n)$ is the coefficient in the power series expansion
$$q\big((1 - q)(1 - q^2)\cdots\big)^{24} = \sum_{n=1}^{\infty} \tau(n) q^n,$$
and if $n/2$ is not an integer, then $\tau(n/2) = 0$. From Theorem 3.6 we have
$$\big((1 - q)(1 - q^2)(1 - q^3)\cdots\big)^3 = \sum_{n=0}^{\infty} (-1)^n (2n + 1) q^{\frac12 n(n+1)},$$
so that
«
T(n) =
lY'(2x1
+ 1) + ... + (
ly8(2xs
+ 1»
txdxl + 1)+· .. +tx8(t8+ 1)=n1
I Y;+"'+yi=8n
s
I ( l)t(Yi1)Yi' i=l
2,j'Yl'''Y8
The following table records the mathematicians who did the evaluations:

s                                r_s(n)
2, 4, 6, 8                       Jacobi, 1828
3                                Dirichlet
5, 7                             Eisenstein, Smith, Minkowski
10, 12                           Liouville, 1864, 1866
14, 16, 18                       Glaisher, 1907
20, 22, 24                       Ramanujan, 1916
9, 11, 13, 15, 17, 19, 21, 23    Lomadze, 1949
Chapter 9. The Prime Number Theorem
9.1 Introduction

The main aim of this chapter is to prove the following formula:
$$\pi(x) \sim \frac{x}{\log x}. \tag{1}$$
Here $\pi(x)$ denotes the number of primes not exceeding $x$, and the formula (1) is the famous prime number theorem. In this chapter we shall give two proofs. The first proof makes use of some rather deep analytic tools (the reader needs to know a little advanced calculus and complex function theory) but is relatively straightforward, the fundamental idea being due to N. Wiener. Although the other proof does not require much analytic knowledge and can indeed be classified as an elementary proof, it is more difficult to understand. This proof is due to Erdős and Selberg. One of the difficult problems in the long history of prime number theory is the search for an "elementary proof" of the prime number theorem and success came in 1949. In the following sections we do not give a direct proof of the formula (1). Instead we prove two formulae, each of which is equivalent to (1). Suppose that $x > 0$. Let
$$\vartheta(x) = \sum_{p \le x} \log p, \tag{2}$$
$$\psi(x) = \sum_{n \le x} \Lambda(n) = \sum_{p^m \le x} \log p. \tag{3}$$
In formula (3) $\Lambda(n)$ is the von Mangoldt function of Example 6 in §6.1. $\vartheta(x)$ and $\psi(x)$ are called Chebyshev's functions. It is easy to see that
$$\psi(x) = \vartheta(x) + \vartheta(x^{1/2}) + \vartheta(x^{1/3}) + \cdots \tag{4}$$
and
$$\psi(x) = \sum_{p \le x}\Big[\frac{\log x}{\log p}\Big]\log p, \tag{5}$$
where $[\xi]$ denotes the integer part of $\xi$.

Theorem 1.1. We have
$$\varlimsup_{x \to \infty}\frac{\pi(x)}{x(\log x)^{-1}} = \varlimsup_{x \to \infty}\frac{\vartheta(x)}{x} = \varlimsup_{x \to \infty}\frac{\psi(x)}{x} \tag{6}$$
and
$$\varliminf_{x \to \infty}\frac{\pi(x)}{x(\log x)^{-1}} = \varliminf_{x \to \infty}\frac{\vartheta(x)}{x} = \varliminf_{x \to \infty}\frac{\psi(x)}{x}. \tag{7}$$
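Proof. From (4) and (5) we derive easily
$$\vartheta(x) \le \psi(x) \le \sum_{p \le x}\frac{\log x}{\log p}\log p = \pi(x)\log x,$$
so that
$$\varlimsup_{x \to \infty}\frac{\vartheta(x)}{x} \le \varlimsup_{x \to \infty}\frac{\psi(x)}{x} \le \varlimsup_{x \to \infty}\frac{\pi(x)}{x(\log x)^{-1}}.$$
Now let $0 < \alpha < 1$, $x > 1$. Then
$$\vartheta(x) \ge \sum_{x^\alpha < p \le x}\log p \ge \{\pi(x) - \pi(x^\alpha)\}\log x^\alpha \ge \alpha\{\pi(x) - x^\alpha\}\log x.$$
Since $\lim_{x \to \infty}\dfrac{\log x}{x^{1-\alpha}} = 0$, it follows that
$$\varlimsup_{x \to \infty}\frac{\vartheta(x)}{x} \ge \alpha\varlimsup_{x \to \infty}\frac{\pi(x)}{x(\log x)^{-1}}$$
holds for any positive $\alpha$ less than 1. Therefore
$$\varlimsup_{x \to \infty}\frac{\vartheta(x)}{x} \ge \varlimsup_{x \to \infty}\frac{\pi(x)}{x(\log x)^{-1}}.$$
Collecting our results we have
$$\varlimsup_{x \to \infty}\frac{\pi(x)}{x(\log x)^{-1}} = \varlimsup_{x \to \infty}\frac{\vartheta(x)}{x} = \varlimsup_{x \to \infty}\frac{\psi(x)}{x}.$$
Formula (7) can be proved similarly. □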
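The three quantities compared in Theorem 1.1, namely $\pi(x)\log x/x$, $\vartheta(x)/x$ and $\psi(x)/x$, are easy to tabulate. The following sketch uses a simple sieve (the function names are chosen here for illustration) and also exhibits the chain $\vartheta(x) \le \psi(x) \le \pi(x)\log x$ used in the proof.

```python
from math import log


def primes_up_to(n):
    """Sieve of Eratosthenes; returns the list of primes not exceeding n."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i::i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(n + 1) if sieve[i]]


def chebyshev(x):
    """Return (pi(x), theta(x), psi(x)) for an integer x >= 2."""
    primes = primes_up_to(x)
    theta = sum(log(p) for p in primes)
    psi = 0.0
    for p in primes:
        power = p
        while power <= x:  # each prime power p^m <= x contributes log p
            psi += log(p)
            power *= p
    return len(primes), theta, psi


if __name__ == "__main__":
    for x in (10 ** 3, 10 ** 4, 10 ** 5):
        count, theta, psi = chebyshev(x)
        print(x, count * log(x) / x, theta / x, psi / x)
```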
From Theorem 1.1 and Theorem 5.6.2 we have

Theorem 1.2. There exist constants $c_i > 0$ $(i = 1, 2, 3, 4)$ such that for $x \ge 2$,
$$c_1 x \le \vartheta(x) \le c_2 x \tag{8}$$
and
$$c_3 x \le \psi(x) \le c_4 x. \tag{9}$$
Also from Theorem 1.1 we see at once that in order to prove formula (1) we need only prove that
$$\psi(x) \sim x \tag{10}$$
or
$$\vartheta(x) \sim x. \tag{11}$$
Before we prove formula (10) we need some preparation.
9.2 The Riemann $\zeta$-Function

From now on we write $s = \sigma + it$ for a complex number with $\sigma$ and $t$ real. The function defined by the series
$$\zeta(s) = \sum_{n=1}^{\infty}\frac{1}{n^s} \qquad (\sigma > 1) \tag{1}$$
is called the Riemann $\zeta$-function. Let $a > 1$. When $\sigma \ge a$, because
$$\Big|\sum_{n=N}^{\infty}\frac{1}{n^s}\Big| \le \sum_{n=N}^{\infty}\frac{1}{n^\sigma} \le \sum_{n=N}^{\infty}\frac{1}{n^a},$$
we see that the series for $\zeta(s)$ is uniformly convergent. Since $a$ is any real number greater than 1, it follows that $\zeta(s)$ is an analytic function in the half plane $\sigma > 1$.
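Theorem 2.1. Let
$$h(s) = \zeta(s) - \frac{1}{s - 1}.$$
Then $h(s)$ is analytic in the half plane $\sigma > 0$, and
$$|h(s)| \le \frac{|s|}{\sigma} \qquad (\sigma > 0).$$

Proof. Let
$$f_n(s) = n^{-s} - \int_n^{n+1} u^{-s}\,du,$$
so that, for $\sigma > 1$,
$$\zeta(s) - \frac{1}{s - 1} = \sum_{n=1}^{\infty} f_n(s). \tag{2}$$
Since
$$|n^{-s} - u^{-s}| = \Big|\int_n^u s v^{-s-1}\,dv\Big| \le |s|\int_n^{n+1} v^{-\sigma-1}\,dv \qquad (n \le u \le n + 1),$$
we have
$$|f_n(s)| = \Big|\int_n^{n+1} (n^{-s} - u^{-s})\,du\Big| \le |s|\int_n^{n+1} v^{-\sigma-1}\,dv.$$
Suppose that $0 < a \le \sigma \le b$, $-T \le t \le T$. Then
$$\Big|\sum_{n=N}^{\infty} f_n(s)\Big| \le \sqrt{b^2 + T^2}\int_N^{\infty} v^{-a-1}\,dv = \frac{\sqrt{b^2 + T^2}}{a N^a},$$
so that the series $\sum_{n=1}^{\infty} f_n(s)$ is uniformly convergent in $0 < a \le \sigma \le b$, $-T \le t \le T$. Since $a$ can be arbitrarily near 0, and $b$, $T$ can be arbitrarily large it follows that $h(s) = \sum_{n=1}^{\infty} f_n(s)$ is analytic in the half plane $\sigma > 0$. From this we see that (2) can be used as an analytic continuation for $\zeta(s)$ into the half plane $\sigma > 0$, and $s = 1$ is the only simple pole with residue 1. From (2) we derive at once
$$\Big|\zeta(s) - \frac{1}{s - 1}\Big| = \Big|\sum_{n=1}^{\infty} f_n(s)\Big| \le |s|\int_1^{\infty} v^{-\sigma-1}\,dv = \frac{|s|}{\sigma} \qquad (\sigma > 0).$$
The theorem is proved. □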
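The series $h(s) = \sum_n f_n(s)$ with $f_n(s) = n^{-s} - \int_n^{n+1}u^{-s}\,du$ can be summed numerically, which gives a concrete feel for the continuation and for the bound $|h(s)| \le |s|/\sigma$. The sketch below truncates the series at an assumed number of terms, so for small $\sigma$ the value is only approximate; the closed form of the integral requires $s \ne 1$.

```python
from math import pi


def f_n(n, s):
    """n^(-s) minus the integral of u^(-s) over [n, n+1]; requires s != 1."""
    integral = ((n + 1) ** (1 - s) - n ** (1 - s)) / (1 - s)
    return n ** (-s) - integral


def h_partial(s, terms=100000):
    """Partial sum of the series for h(s) = zeta(s) - 1/(s - 1)."""
    return sum(f_n(n, s) for n in range(1, terms + 1))


if __name__ == "__main__":
    # For s = 2 the exact value is zeta(2) - 1 = pi^2/6 - 1.
    print(h_partial(2), pi ** 2 / 6 - 1)
    # A point inside the strip 0 < sigma < 1, compared with the bound |s|/sigma.
    s = complex(0.5, 14.0)
    print(abs(h_partial(s)), abs(s) / s.real)
```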
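Theorem 2.2. In the half plane $\sigma \ge 1$, $\zeta(s) \ne 0$.

Proof. When $\sigma > 1$ the series $\sum_{n=1}^{\infty} (1/n^s)$ converges absolutely so that from Theorem 5.4.4
$$\zeta(s) = \prod_p \Big(1 - \frac{1}{p^s}\Big)^{-1}; \tag{3}$$
here the product is over all primes $p$. Since each factor in the product is non-zero and the product converges absolutely, it follows that $\zeta(s) \ne 0$ when $\sigma > 1$. Since $\zeta(s)$ has a pole at $s = 1$ we are left to prove: when $t \ne 0$,
$$\zeta(1 + it) \ne 0.$$
Now consider the function
$$\varphi_\varepsilon(t) = \zeta^3(1 + \varepsilon)\,\zeta^4(1 + \varepsilon + it)\,\zeta(1 + \varepsilon + 2it) \qquad (\varepsilon > 0,\ t \ne 0).$$
From (3) we know that
$$|\varphi_\varepsilon(t)| = \prod_p a_p,$$
where
$$a_p = \Big|1 - \frac{1}{p^{1+\varepsilon}}\Big|^{-3}\Big|1 - \frac{1}{p^{1+\varepsilon+it}}\Big|^{-4}\Big|1 - \frac{1}{p^{1+\varepsilon+2it}}\Big|^{-1},$$
so that
$$\log a_p = \sum_{m=1}^{\infty}\frac{1}{m}\,p^{-(1+\varepsilon)m}\big(3 + 4\cos(mt\log p) + \cos(2mt\log p)\big).$$
From $3 + 4\cos\theta + \cos 2\theta = 2(1 + \cos\theta)^2 \ge 0$, we have $\log a_p \ge 0$, that is
$$|\varphi_\varepsilon(t)| \ge 1. \tag{4}$$
Suppose that $\zeta(1 + it) = 0$. Then
$$\zeta(1 + \varepsilon + it) = \int_1^{1+\varepsilon}\zeta'(\sigma + it)\,d\sigma = O(\varepsilon).$$
From Theorem 2.1, we have
$$\varepsilon\,\zeta(1 + \varepsilon) = O(1),$$
so that, for any small $\varepsilon$, we have
$$\varphi_\varepsilon(t) = O(\varepsilon),$$
and this contradicts (4). □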
Theorem 2.3. Let
$$\frac{\zeta'(s)}{\zeta(s)} + \frac{1}{s - 1} = g(s).$$
When $\sigma \ge 1$, $g(s)$ has a continuous first derivative.

Proof. Differentiating the function $h(s)$ in Theorem 2.1 we have
$$\zeta'(s) = -\frac{1}{(s - 1)^2} + h'(s).$$
Here $h'(s)$ is infinitely differentiable in $\sigma > 0$. Also from Theorem 2.2, we see that
$$\frac{1}{\zeta(s)} = \frac{s - 1}{1 + (s - 1)h(s)}$$
is regular in the half plane $\sigma \ge 1$, so that $1 + (s - 1)h(s) \ne 0$ in the same half plane. Therefore
$$\frac{\zeta'(s)}{\zeta(s)} = \frac{\Big(-\dfrac{1}{(s - 1)^2} + h'(s)\Big)(s - 1)}{1 + (s - 1)h(s)} = -\frac{1}{s - 1} + g(s),$$
and here $g(s)$ has the required property stated in the theorem. □
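9.3 Several Lemmas

Theorem 3.1. If $f(x)$ has a continuous first derivative, then
$$\int_a^b f(x) e^{ixt}\,dx = O\Big(\frac{1}{t}\Big). \tag{1}$$

Proof. From integration by parts we have
$$\int_a^b f(x) e^{ixt}\,dx = \frac{1}{it}\Big\{\big[f(x) e^{ixt}\big]_a^b - \int_a^b f'(x) e^{ixt}\,dx\Big\} = O\Big(\frac{1}{t}\Big).$$
□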
Theorem 3.2.
$$\int_{-\infty}^{\infty}\frac{\sin x}{x}\,dx = \pi. \tag{2}$$

Proof. Let
$$J = \int_0^{\infty} e^{-kx}\,\frac{\sin\alpha x}{x}\,dx \qquad (1 \le \alpha \le 2,\ 0 \le k \le 1).$$
Fix $k > 0$ so that the integrand is now a continuous function of $\alpha$ and $x$, and the partial derivative with respect to $\alpha$ is $e^{-kx}\cos\alpha x$, which is also continuous. From the convergence of the integral
$$\int_0^{\infty} e^{-kx}\,dx$$
we see that the integral
$$\int_0^{\infty} e^{-kx}\cos\alpha x\,dx$$
converges uniformly in $1 \le \alpha \le 2$. We can therefore differentiate $J$ under the integral sign giving
$$\frac{\partial J}{\partial\alpha} = \int_0^{\infty} e^{-kx}\cos\alpha x\,dx = \frac{k}{k^2 + \alpha^2}.$$
Here the right hand side is obtained from integrating by parts twice. From integration formulae we have
$$J = \tan^{-1}\frac{\alpha}{k}.$$
With $\alpha$ fixed, when $0 \le k \le 1$, $J$ is uniformly convergent so that $J$ is continuous for $0 \le k \le 1$. Therefore
$$\int_0^{\infty}\frac{\sin\alpha x}{x}\,dx = \lim_{k \to 0+} J = \lim_{k \to 0+}\tan^{-1}\frac{\alpha}{k} = \frac{\pi}{2}.$$
Taking in particular $\alpha = 1$, we have
$$\int_{-\infty}^{\infty}\frac{\sin x}{x}\,dx = 2\int_0^{\infty}\frac{\sin x}{x}\,dx = \pi.$$
□
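Theorem 3.3. Let $a < 0 < b$. If $f(x)$ has a continuous second derivative, then
$$\lim_{\omega \to \infty}\frac{1}{\pi}\int_a^b f(x)\,\frac{\sin\omega x}{x}\,dx = f(0). \tag{3}$$

Proof. We consider
$$\int_a^b \big(f(x) - f(0)\big)\frac{\sin\omega x}{x}\,dx.$$
At the point 0, $(f(x) - f(0))/x$ has a continuous first derivative so that from Theorem 3.1 we have
$$\lim_{\omega \to \infty}\int_a^b \frac{f(x) - f(0)}{x}\,\sin\omega x\,dx = 0,$$
that is
$$\lim_{\omega \to \infty}\frac{1}{\pi}\int_a^b f(x)\,\frac{\sin\omega x}{x}\,dx = f(0)\lim_{\omega \to \infty}\frac{1}{\pi}\int_a^b \frac{\sin\omega x}{x}\,dx = f(0)\,\frac{1}{\pi}\lim_{\omega \to \infty}\int_{a\omega}^{b\omega}\frac{\sin x}{x}\,dx = f(0)\,\frac{1}{\pi}\int_{-\infty}^{\infty}\frac{\sin x}{x}\,dx,$$
and the result follows from Theorem 3.2. □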
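Theorem 3.4. Let $\lambda > 0$, and
$$K_\lambda(x) = \begin{cases} 1 - \dfrac{|x|}{2\lambda}, & \text{if } |x| \le 2\lambda, \\[1ex] 0, & \text{if } |x| > 2\lambda. \end{cases}$$
Then
$$\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} K_\lambda(t) e^{ixt}\,dt = k_\lambda(x), \tag{4}$$
where
$$k_\lambda(x) = \begin{cases} \dfrac{2}{\sqrt{2\pi}\,\lambda}\Big(\dfrac{\sin\lambda x}{x}\Big)^2, & \text{if } x \ne 0, \\[1ex] \dfrac{2\lambda}{\sqrt{2\pi}}, & \text{if } x = 0. \end{cases}$$

Proof. It is easy to see that
$$k_\lambda(x) = \frac{2}{\sqrt{2\pi}}\int_0^{2\lambda}\Big(1 - \frac{t}{2\lambda}\Big)\cos xt\,dt. \tag{5}$$
If $x = 0$, then clearly
$$k_\lambda(x) = \frac{1}{\sqrt{2\pi}}\cdot 2\lambda.$$
If $x \ne 0$, then integration by parts gives the required result at once. □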
f . 00
K;.(x)
1 =./he
k;.(t)e,xt dt.
00
(6)
In particular, with A = 1, x
= 0, we have 00
(7) 00
Proof We first consider the integral
f . ro
lew) =
1 fo
f ro
k;.(t)e,xtdt =
2 fo
k;.(t)cosxtdt.
0
ro
From (5) we have ro 2;'
lew)
= ~ f f (1 o
;A) cos utcosxtdudt
0 ro
2;'
=
~f
A) du f (cos(u + x)t + cos(u 
( 1  2U
o 2).
= ~f
o
(1  ~)(sin(U ++ 2A
7t
x)t) dt
u
x)w x
+ sin(u 
X)w)dU. u x
o
If x> 2A we have lim ro _ oo lew) = 0 from Theorem 3.1; if 0 < x < 2A we see from Theorem 3.1 and Theorem 3.3 that in the above formula the limit of the first term is o and the limit of the second term is 1  X/2A. Since the integral in (6) is a continuous function of x, we see that K;.(U) = 0, K;.(O) = 1. The theorem is proved. D Theorem 3.6. Letf(t) ~ 0(0 ~ t ~ 00), andforany T > 0, the interval 0 ~ t ~ Tcan be divided into afinite number of sections in each of whichf(t) is continuous. Suppose further that, for any e > 0, the integral 00
converges. Then 00
00
lim f e''f(t) dt = ff(t) dt.
,0
o
(8)
o
Proof Since f(t) ~ 0, S~ f(t) dt increases with respect to T so that S~ f(t) dt exists either as a finite number or 00. Now
226
9. The Prime Number Theorem 00
00
f e'1j(t) dt
~f
f(t) dt,
o
o
so that 00
00
lim f e''l'(t) dt ,"'0
~ ff(t) dt.
o
o
On the other hand T
00
~f
f e'1j(t) dt
T
~ e,T f f(t) dt,
e''l'(t) dt
o
o
o
so that T
00
~ ff(t) dt.
lim f e'1j(t) dt ,"'0
Letting T +
00
o
o
00
00
we have lim f e''l'(t) dt ,"'0
~ ff(t) dt, o
o
and the theorem is proved.
0
9.4 A Tauberian Theorem

Definition. If $f(x)$ is defined in $-\infty < x < \infty$ and satisfies
$$\varliminf_{\substack{y - x \to 0 \\ x \to \infty}}\{f(y) - f(x)\} \ge 0 \qquad (y > x), \tag{1}$$
then we say that $f(x)$ is a slowly decreasing function.

Theorem 4.1. Let $f(x)$ be a slowly decreasing function satisfying $|f(x)| < M$ $(-\infty < x < \infty)$. If
$$\lim_{x \to \infty}\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} k_\lambda(x - t) f(t)\,dt = l$$
holds for every $\lambda > 0$, then $f(x) \to l$ $(x \to \infty)$.
Proof From Theorem 3.5 we have
f
f
co
1 ~ Y 2n
co
k;.(x  t)dt
= 1
n
co
sin 2 u  1, 2du u
co
so that, without loss of generality we .can suppose that I = o. If f(x) 0, then there exists 0 > 0 and a sequence (xn)(xn + 00) such that j(xn) <  0 (n = 1, 2, ... ) or j(xn) > 0. Assume without loss that j(xn) > 0 (n = 1,2, ... ). (The casej(xn) <  0 can be proved in the same way.) Since f(x) is slowly decreasing, there exists Xo = xo(o) and 11 = 11(0) such that
+
o "2
j(y)  f(x) ~
holds. Take a particular x in (xn). Then f(y)
From (2), when x
f
o >"2 ~
(2)
Xo and x in (x n), we have
co
~
k;.(x
+ 11 
t)f(t)dt
00
x+2q
f ~ f fo f 2fo ~f of
~ 2y~ 2n
x
k;.(X+I1 t )dt
~
y 2n
x
f
k ;.(X+I1 t )dt
co
co
k;.(x
+ 11 
t) dt
x+2q
f
x+q
=
_0_
xq
k;.(x  u)du 
xq
k;.(v)dv  ; .
o
;'q
sin2 d w w2M w2 n
=
n
o
o
+ (A. + 00),
2
fo
co
co
q
=
~
f
f
k;.(v)dv
q
co
sin 2 w dw w2
f co
k;.(x  u)du 
~
fo
x+q
k;.(xu)du
so that there exists a suitably large A. o such that 00
1 r::L y2n
f kAO(x + 1'/ 
0 t)f(t)dt >4
00
Let x increase without bound in (xn ) so that 00
lim x+oo xe{Xn}
f
1 r::L y 2n
kAO
(x
+ 1'/ 
0 t)f(t) dt ~ , 4
 00
which contradicts our supposition. Therefore f(x) proved. 0
+
0 and the theorem is
Theorem 4.2 (Ikehara). Let $h(t)$ be non-decreasing in $0 \le t < \infty$, and suppose that for any finite $T$, $h(t)$ has only a finite number of discontinuities in $0 \le t \le T$. Suppose also that the integral
$$f(s) = \int_0^{\infty} e^{-st} h(t)\,dt \qquad (\sigma > 1) \tag{3}$$
converges, and that given any finite $a > 0$, there exists a constant $A$ such that
$$\lim_{\sigma \to 1+0}\Big(f(s) - \frac{A}{s - 1}\Big) = g(t) \tag{4}$$
uniformly in $|t| \le a$, where $g(t)$ has a continuous derivative. Then
$$\lim_{t \to \infty} e^{-t} h(t) = A. \tag{5}$$
Proof Let a(t)  {
~
eth(t)
(t
o
(t
0),
A(t)  {
A
(t ~ 0)
o
(t < 0).
We now prove the following: 1) For any A. > 0, the integral
fo f 00
IA(x) =
kix  t)(a(t)  A(t))dt
(6)
00
exists; 2) (7) X+ 00
and 3) a(t)  A(t) is a bounded slowly decreasing function. The theorem will then follow from these three points and Theorem 4.1.
Consider the integral
fo f 00
I;jx)
=
k;.(x  t)(a(t)  A(t))eet dt.
00
From our hypothesis this integral exists for any e > 0, A > O. From Theorem 3.4 and the uniform convergence of
f 00
(a(t)  A(t))e(e+iy)t dt
00
in Iyl :::; 2A, it follows that
f f f
h.(x)
= 2~
(a(t)  A(t))eet dt
f
2A
21n
KA(y)ei(xt)y dy
u
00
=
f 2A
00
00
K;.(y)e ixy dy
2A
(a(t)  A(t))e(e+iy)t dt
00
2A
= _I ~
K;.(y)eiXY(fll
+ e + iy) 
~)dY. e+ry .
2A
From (4) we have
f 2A
. hmh.(x) .... 0
= 1 2n
g(y)K;.(y)e iXY dy.
(8)
2A
From Theorem 3.1 we have lim 1imh.(x)
=
O.
(9)
x 00 £0
On the other hand, from Theorem 3.6, we have 00
limh.(x) = lim .... 0
e ...
~(f kA(x 
ov2n
fo f fo f
t)a(t)e· t dt  A
o
00
=
k;.(x  t)e· t dt)
0
fo f 00
kA(x  t)a(t)dt 
o
f 00
kA(x  t)dt
0
00
=
00
k;.(x  t)(a(t)  A(t))dt
= I;.(x),
and so from (8) we see that /;.(x) exists. This proves 1), and now 2) follows from (9). Finally we prove 3). From the definition of A(t) we see that it suffices to prove that a(t) is a bounded slowly decreasing function. From (7) we have
f
f
00
~
lim xoo
V 2n
00
k;.(x  t)a(t)dt = lim xoo
~
V 2n
00
00
f;'x
A
=lim
fo xoo
k;.(x  t)A(t)dt
A(Sin U)2  n u
du
00
f 00
=A 
n
(sinu)2   du=A, u
00
so that there exists Xo such that, when x
fo f
;?; xo,
00
k..{x  t)a(t)dt < A
+ 1;
00
that is
f ( t)2 ( "It) 00
sin t
a x
dt < n(A
+ 1)
00
Since the integrand is nonnegative, substituting x
+ 2/fi for
x, we have
J~
f ei~tya(x+ ~Ddt
(x;?;xo)·
J~
From our hypothesis eta(t) is an increasing function of t so that J~
a(x)e 3/fi
f ei~ty dt < n(A + 1) J~
Letting A. +
00
we have at once a(x)
~
A
+1
When x < xo, h(x) is bounded and this implies that a(x) is bounded  00 < x < 00. Now let (j > O. We have
III
so that lim {a(x
+ J) 
~
a(x)}
O.
x'" 00
b ... O
This means that a(x) is slowly decreasing. The theorem is proved.
0
9.5 The Prime Number Theorem

In this section we apply Ikehara's theorem to prove the prime number theorem. We do not give a direct proof of the prime number theorem; instead we prove the equivalent theorem (see §1):

Theorem 5.1. $\psi(x) \sim x$.

Proof. From the definition of $\psi(x)$ we see that $\psi(e^t)$ is a non-negative non-decreasing function of $t$ with only finitely many discontinuities in the interval $0 \le t \le T$. When $\sigma > 1$ we have, from Theorem 1.2 and formula (6.14.5), that
$$\int_0^{\infty} e^{-st}\psi(e^t)\,dt = \int_1^{\infty} u^{-(1+s)}\psi(u)\,du = \sum_{n=1}^{\infty}\int_n^{n+1} u^{-(1+s)}\psi(u)\,du = \sum_{n=1}^{\infty}\sum_{m \le n}\Lambda(m)\int_n^{n+1} u^{-(s+1)}\,du$$
$$= \frac{1}{s}\sum_{n=1}^{\infty}\big(n^{-s} - (n + 1)^{-s}\big)\sum_{m \le n}\Lambda(m) = \frac{1}{s}\lim_{N \to \infty}\sum_{n=1}^{N}\big(n^{-s} - (n + 1)^{-s}\big)\sum_{m \le n}\Lambda(m)$$
$$= \frac{1}{s}\lim_{N \to \infty}\Big\{\sum_{n=1}^{N}\Lambda(n) n^{-s} - \Big(\sum_{m \le N}\Lambda(m)\Big)(N + 1)^{-s}\Big\} = \frac{1}{s}\sum_{n=1}^{\infty}\frac{\Lambda(n)}{n^s} = -\frac{1}{s}\cdot\frac{\zeta'(s)}{\zeta(s)} \qquad (\sigma > 1).$$
From Theorem 2.3 we see that the function
$$-\frac{1}{s}\,\frac{\zeta'(s)}{\zeta(s)} - \frac{1}{s - 1} = -\frac{1}{s}\Big(\frac{\zeta'(s)}{\zeta(s)} + \frac{1}{s - 1}\Big) - \frac{1}{s}$$
has a continuous derivative in $\sigma \ge 1$, so that for any $a > 0$ the function is uniformly continuous in $1 \le \sigma \le 2$, $|t| \le a$, and therefore there is a continuously differentiable function $g(t)$ satisfying
$$\lim_{\sigma \to 1+0}\Big(-\frac{1}{s}\,\frac{\zeta'(s)}{\zeta(s)} - \frac{1}{s - 1}\Big) = g(t)$$
in $|t| \le a$ uniformly. From Theorem 4.2 we see that
$$\lim_{t \to \infty} e^{-t}\psi(e^t) = 1.$$
Let $e^t = x$. Then
$$\lim_{x \to \infty}\frac{\psi(x)}{x} = 1,$$
which proves the theorem. □
Exercise 1. Let $p_n$ be the $n$-th prime number. Prove that the prime number theorem is equivalent to
$$\lim_{n \to \infty}\frac{p_n}{n\log n} = 1.$$

Exercise 2. Use the prime number theorem to deduce that
$$M(x) = \sum_{n \le x}\mu(n) = o(x).$$
Exercise 4. Let n
= p~! ... p%k and define w(n)
= k,
Let 1tk(X)
8k (x) =
I
I
=
1,
'tk(X)
I
=
1,
n~x
n~x
co(n) = Q(n) = k
Q(n)=k
log (p 1
••• Pk),
pl ••• Pk~X
o
(x)
=
I
1.
pl ••• Pk~X
k
(Note: Here the sum is over all primes Pi>'" ,Pk satisfying Pi ••• Pk :::; x; the same set of primes Pi>'" ,Pk with a different ordering is treated differently.) Prove: kx(loglogX)kl 0' (x) '" k
logx
(k
~
2),
(k ~ 2),
x(loglogX)kl
1tk(X) '" 'tk(X) '" ~:
(kl)!logx
(k ~ 2).
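9.6 Selberg's Asymptotic Formula

Throughout §§6-8 we use the letters $q$ and $r$ to represent prime numbers.

Theorem 6.1 (Selberg). Let $x \ge 1$. Then
$$\vartheta(x)\log x + \sum_{p \le x}\vartheta\Big(\frac{x}{p}\Big)\log p = 2x\log x + O(x) \tag{1}$$
and
$$\sum_{p \le x}\log^2 p + \sum_{pq \le x}\log p\log q = 2x\log x + O(x). \tag{2}$$

We first prove the following:

Lemma. Let $F(x)$ and $G(x)$ be two functions defined for $x \ge 1$ and satisfying
$$G(x) = \log x\sum_{1 \le n \le x} F\Big(\frac{x}{n}\Big).$$
Then
$$\sum_{n \le x}\mu(n) G\Big(\frac{x}{n}\Big) = F(x)\log x + \sum_{n \le x} F\Big(\frac{x}{n}\Big)\Lambda(n).$$

Proof. We have, from §6.4, $\Lambda(n) = \sum_{d \mid n}\mu(d)\log\dfrac{n}{d}$, so that
$$\sum_{n \le x}\mu(n) G\Big(\frac{x}{n}\Big) = \sum_{n \le x}\mu(n)\log\frac{x}{n}\sum_{m \le x/n} F\Big(\frac{x}{mn}\Big) = \sum_{l \le x} F\Big(\frac{x}{l}\Big)\sum_{n \mid l}\mu(n)\Big(\log\frac{x}{l} + \log\frac{l}{n}\Big)$$
$$= \sum_{l \le x} F\Big(\frac{x}{l}\Big)\log\frac{x}{l}\sum_{n \mid l}\mu(n) + \sum_{l \le x} F\Big(\frac{x}{l}\Big)\Lambda(l) = F(x)\log x + \sum_{l \le x} F\Big(\frac{x}{l}\Big)\Lambda(l).$$
□

Proof of Theorem 6.1. Let $\gamma$ be Euler's constant. From §5.8 we have
$$\sum_{n \le x}\frac{1}{n} = \log x + \gamma + O\Big(\frac{1}{x}\Big).$$
Also
$$\sum_{n \le x}\log n = \int_1^x \log t\,dt + O(\log x) = x\log x - x + O(\log x).$$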
We apply the lemma with
= "'(x)  x + y + 1
F(x)
so that G(x)
= logx l";~";X ",(~)
xlogx

n~x~ + (y + l)xlogx + O(logx)
= 0(log2 x) = O(yIx).
From the lemma we have F(x)logx
+I
n~x
F(~)A(n) = o( I J~) = O(x). n n
(4)
n~x
From Theorem 5.9.1 we have A(n)
I 
n~x
logx
=
n
+ 0(1).
(5)
Therefore, from (3), (4), (5) and Theorem 1.2 we have "'(x) log x
+ n~x "'(~)A(n)
=xlogx+x = 2xlogx
A(n)   ( y + l)logx(y+ I) I A(n) +O(x) n~x n n~x
I
+ O(x).
(6)
From Theorem 1.2 we have
n~x "'(~)A(n) =
o(
Jx .9(~}Ogp
I
logp log q) =
pr.tqP~x a~2,p;;>l
= 0 (x
m~x A(m)A(n)  p~xlogplogq
o( I
=
I
logp
p(%~x a~2
logp ) P,,;J;p(p  1)
I
=
IOgq)
qP~xlprx. P~l
(7)
O(x)
and "'(x) = .9(x) = .9(x)
+ .9(xt) + ... + .9 (x[:::!J) + O(logx . .9(xt)) =
Formula (I) now follows from (6), (7) and (8).
.9(x)
+ O(xt log x).
(8)
Also from 9(x)logx I log2p= I logplog::'= I logp(I p';;x p';;x P p';;x x
~+O(l))
n~p
1
I I
=
n:::=;x
n
= o(x
formula (2) follows at once.
logp
+ 0(9(x))
x p:S::;;
I ~) + O(x) =
n~xn
O(x),
0
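Selberg's formula states that $\vartheta(x)\log x + \sum_{p \le x}\vartheta(x/p)\log p = 2x\log x + O(x)$. Since the error term is only smaller than the main term by a factor of order $1/\log x$, the ratio of the two sides approaches 1 slowly. The following sketch (function names are illustrative) evaluates the left side exactly from a prime table.

```python
from bisect import bisect_right
from math import log


def primes_up_to(n):
    """Sieve of Eratosthenes; returns the list of primes not exceeding n."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i::i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(n + 1) if sieve[i]]


def selberg_left_side(x):
    """theta(x)*log(x) + sum over primes p <= x of theta(x/p)*log(p)."""
    primes = primes_up_to(x)
    prefix = [0.0]  # prefix[k] = sum of log p over the first k primes
    for p in primes:
        prefix.append(prefix[-1] + log(p))

    def theta(y):
        return prefix[bisect_right(primes, y)]

    total = theta(x) * log(x)
    for p in primes:
        total += theta(x / p) * log(p)
    return total


if __name__ == "__main__":
    for x in (10 ** 3, 10 ** 4, 10 ** 5):
        print(x, selberg_left_side(x) / (2 * x * log(x)))
```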
9.7 Elementary Proof of the Prime Number Theorem

Let
$$R(x) = \vartheta(x) - x. \tag{1}$$
We know from Theorem 1.1 that the prime number theorem is equivalent to
$$\lim_{x \to \infty}\frac{R(x)}{x} = 0. \tag{2}$$
Before we prove (2) we first establish the following lemmas.

Lemma 1. If $x \ge 3$, then
$$\sum_{pq \le x}\frac{\log p\log q}{pq} = \frac12\log^2 x + O(\log x),$$
$$\sum_{pq \le x}\frac{\log p\log q}{pq\log pq} = \log x + O(\log\log x),$$
$$\sum_{p \le x}\frac{\log p}{p\log\dfrac{2x}{p}} = O(\log\log x).$$
Proof Let A(n) = Ip,;;.logp/p. From Theorem 5.9.1 we have A(n) = logn where r. = 0(1). Therefore "
L...
pq';;x
logp log q "logp" log q "logp x    = L...   L...   = L. logpq
p';;x P
x
q
p';;x p
p
+ O(logx)
q~P
x = I (A(n)  A(n  1)) log + O(logx) n~x
n
+ r.
I
A(n)
n~xl
=I
n~x
{lOg~ n
logn .lOg(1
109_x_} + O(logx) n+1
+~) + n
o( I
n~x
109(1
+ ~)) + O(logx) n
1
= 2log2 X + O(logx). Using the same method we have, by partial summations, logplogq
I
pq ~ x
pq logpq
= logx + O(loglogx).
Also from
I
1 2x nlogn
n "'x ~
1 1 1( 1 1 ) =I+Ilogx n"'x n n"'x n 2x logx ~
f
logn
~
x
I 1
=
n~x n
du 2 ulog u
+ 0(1)
2x n
f
=
1
2x u
f
~
I
x
~n~x 2 du
ulog u
x
+ 0(1) = du + 0(1) = ulogu
2
O(loglogx),
2
we have
I
logp
I
=
'" p log2x p
'"
p~x
I
n~x
=
(A(n)  A(n 
1 1»2x
logn
n~x
{logn log(n  I)} _1_ 1og2x n
o( I
1
'" nlog2x n
+
o( I 'n n~x
2x
logn
12x ) logn+1
) = O(loglogx).
n~x
The lemma is proved.
0
Lemma 2.
$$\vartheta(x) + \sum_{pq \le x}\frac{\log p\log q}{\log pq} = 2x + O\Big(\frac{x}{\log x}\Big) \qquad (x \ge 2).$$
Proof Let
I
B(n) =
logplogq,
=
C(n)
pq~n
I
log2p.
p~n
Then we have 8(x)
+ =
'" logp log q L.:. pq~x logpq
I
C(n)  C(n  I) n~x logn
= C([x]) log[x]
= 2x
+ L
B(n)  B(n  1) n~x logn
+ B([x]) + I log[x]
{C(n)
{_l_ _ logn
1 } log(n + I)
IOg( 1 +~)
+ o (_x_) + I
(2nlogn
n~xl
logx
+ B(n)}
n~xl
+ O(n))
lognlog(n
n
+ 1)
=2X+O(_X), log x
and the lemma is proved.
0
Lemma 3.
$$R(x)\log x = \sum_{pq \le x}\frac{\log p\log q}{\log pq}\,R\Big(\frac{x}{pq}\Big) + O(x\log\log x) \qquad (x \ge 3).$$
Proof From Lemma 1 and Lemma 2 we have
x) logp I 8 (  logp=2x I   Ilogp I p~x p p~x p p~x
log q log r x logqr
qr~p
+
o(x I
~
IOgP)
p~x
=
2xlogx 
2x plogp
logqlogr 8 (x) qr~x log qr qr
I
+ O(xloglogx).
Substituting this into Selberg's asymptotic formula (that is, formula (6.1)), we have 8(x) log x = I
pq~x
logplogq 8 (x) logpq pq
+ O(x log log x).
The result follows from substituting (1) into this and applying Lemma 1. □

Lemma 4.
$$|R(x)| \le \frac{1}{\log x}\sum_{n \le x}\Big|R\Big(\frac{x}{n}\Big)\Big| + O\Big(\frac{x\log\log x}{\log x}\Big) \qquad (x \ge 3).$$
Proof Substituting (1) into formula (6.1) we have
I R(~)IOgp + O(x),
R(x)logx = 
p"'x
p
so that from Lemma 3 we see that
2IR(x)llogx:::;; I IR(~)IIOgp + I logplogq p"'x p pq"'x logpq
IR(~)I + O(x log log x). pq
From Lemma 2 and partial summations, and noting that Iial  Ihl! :::;; la  hi, we see that
2IR(x)llogx:::;;
)1)
I (Ilogp+ I 10gpIOgq)(IR(~)I_IR(_x n"'xl p"'n pq"'n logpq n n+I
I
+0 (
:::;; 2
p"'x
+ I 10gpIOgq) + O(xloglogx) pq"'x logpq
n"'~l n(IR(~)IIRC: 1)1)
I (~) II
+0( ~2
logp
I
I _n R n",xllog2n n
(n I
R x) +0
..:: n"'x I
+ o(x
R (_x ) n+1
II)
+ O(x log log x)
I n ((x) 8  8 (( n",x_llog2n x))) n n+ 1
1_))
I _n_(~ _ _
n",x_llog2n n
n+ I
+O(xloglogx).
From Theorem 1.2 we have
I 
n ((x) (x)) 8  8 n n+1
n",x_llog2n
=
I
2"'n"xl
= o(x
I
(_n _ n_1_) =
8 (~) n log2n
n"'x nlogn
1 ) log2(n  I)
+ O(x)
O(xloglogx),
so that
n~JR(~)1 + O(xloglogx),
2IR(x)llogx:::;; 2 and the required result follows. Lemma 5.
If x >
0
I, then
I
n~x
8(n) 2
n
= logx + 0(1),
and
L 8(~) =
xlogx
n
n~x
+ O(x).
Proof Since
L ~= L ~ L ~=~+O(~)+O(~), X
p";"";xn
";;'pn
">xn
P
P
we have
L
I
8(n)
L 2" L logp = p~x L logp p:::=;n~x L n2 n p~n
2 =
n:::=;x
n
n:::=;x
L IOgp(~p + O(~) + O(~)) = P x
=
logx
+ 0(1)
p";x
and
L logp . (~+ 0(1») =
=
P
p";x
Lemma 6. logn L R(n) = "";x

n
xlogx
+ O(X). 0
I (x) + O(X). L R(n)R "";x n
n
Proof From Selberg's formula (that is (6.2» and partial summations we have
x x IOg2 plog + L logplogqlog = 2xlogx + O(X). p";x P pq";x pq Substituting
L
log~ = L ~ + O(~), P
p";"";x
n
P
x log= pq
into the above formula and interchanging the summations we have
L I L log2p + L I L logp L logq = 2xlogx + O(x);
n~x
n
p~n
n~x
n
p~n
x q~;
that is logn L 8(n) + L
(x)
I 8(n)8  = 2xlogx + O(x). n n:S;xn n The required result follows from substituting (I) into this formula and then apply Lemma 5. 0 n~x
240
9. The Prime Number Theorem
Lemma 7. Let 0 < u < 1 and suppose that there· exists Xo such that, for x > Xo, IR(x)1 < ux.
(3)
Then there exists Xu such that, when x > xu, the interval subinterval (y, eby) with the property that
IR;Z) I< u ~ u when y ~
Z
~ eby. Here
2
«(1 
U)16 X,
x) contains a
,
c5 = u(1  u)/32.
Proof From Lemma 6 we have logn I In~xnR(n) ~
~R(n)R(~) n
I
x n
+II
~R(n)R(~)1 n
n<xo n
xo~n~~
+
x
~R(n)R(;)
I
+ O(x)

Xo
1 n
 + O(x) = x
u 2 xlogx
+ O(x),
xo~n~~
so that when x >
Xl,
Ix,,t,;;;x lO! n R(n) I< u (x + x') logx + O(x), 2
where x' = (l  U)16 X • Suppose that R(n) does not change sign in (x' (x' ~ y ~ x) so that
~
n
~
x). Then there must exist y
R(Y) I I logn < u (x + x')logx + O(x). Iy x'';;;n';;;x 2
From (l 
U)16
1 u < I
+ 15u '
we see that
I< u IR(y) y <
+ x' x  x'
2X
u(l
+ 3u) 4
+0
(_1_) < log x
(x>
u(l
+ 7u) + 0 8
(_1_) log x
(4)
Xl)'
But if R(n) changes sign in (x', x), then clearly there exists y (x' IR(Y)I = O(1ogy) so that (4) still holds.
~
y
~
x) such that
When I < Y < Y' we have, by Lemma 2,
L
y
logp
~ 2(y' 
I)'
+ 0 ( y'1
y)
ogy
From (I) we have IR(y')  R(y) I < (y'  y)
Let x'
~
YI, Y2 ~
Xl,
+
o(L).
(5)
logy'
and YI satisfying (4) and
From (4) and (5) we see that
I I< I· +II  1+0(1) R(Y2) Y2
IR(YI) YI
< 0'(1
YI Y2
YI Y2
+ 30') . eb + (e b 
logx
I)
4
+0
(
I) . logx
Since eb < 1/(1  J) (0 < J < I) we have
IR(Y2) Y2
1< + _1_J +(_1__ I) + J + (I) < + < 0'(1
30') .
0(_1_)
I 
4
0'(3
50')
8
I 
+0
0'
~gx
log X
0'2
4
WhenYI ~«(1 +70')/(1 + 15O'))xwe have ebYI <x,sothatwecantakeY=YI' When YI > «(1 + 70')/(1 + 15O'))x, then ebYI>
10'
+ 150'
I
x>
Xl
so that we can take Y = ebYI' The lemma is proved.
0
Proof of the prime number theorem. We already know that there exist e > 0 and x~ such that, for X > Xo, 8(x) > ex
(this is Theorem 1.2). From Selberg's formula, we have 8(x)
= 2x  _1_
L 8(~)IOgp + o(~) p log x
= 2x 
L 8(~)IOgp
logx p~x
_1_ logx
x
p~;
xo
p
(6)
::;;; 2x  ex log x logx
I+ 0 (
logx
x
L
logp)
+ 0 (x) logx
xo'
= (2  e)x + o(~) < (2  ~)x log x 2
(x > xo, e > 0).
From (I) we have
IR(x)1 < O'o(x)
(x> XO, 0'0
II ~ I,
=
0 < 0'0 <
I).
Let
From Lemma 7 there exists x"o > Xo such that when x > X"o' any interval W 1 , C) ('::;;; C ::;;; x/x"o) will contain a subinterval (Yv, e6yv) so that when Yv ::;;; n ::;;; e6y.,
From Lemma 4 we have
IR(x) I < _ I logx
Lx
IR(~)I +_1. L IR(~)I+o(~) n logx n logx x

n:S;Xao
Xao
I + 0'0+0'~   x "L.
"L... x l~n~
n
2
log x
Xao
"L... +0 I (x) yv~n~edy n Jlogx
x ~v~_
Xao
n f(Yv,edyv)
O'~) Xlogx
L
x
{v~_
(I )) +0 ( x )x
( <5+0 ; ,
Xao
< O'oX 
(0'0  O'~) x <5logx (x)   + 0 2 logx log' Jlogx
~ log
(1  uo)2Uo) x
< Uo ( 1 
1024 log 1 ~ Uo (1 
< Uo ( 1 
where
UI
+ 0 (X) Jlogx
UO)3) x + 0 (X)
1024
Jlogx
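$< \sigma_0$. Repeating the above we arrive at
$$|R(x)| < \sigma_n x \qquad (x > x_{\sigma_n}),$$
where
$$\sigma_n = \sigma_{n-1}\left(1 - \frac{(1-\sigma_{n-1})^3}{2000}\right) \le \sigma_{n-1}\left(1 - \frac{(1-\sigma_0)^3}{2000}\right) \le \cdots \le \sigma_0\left(1 - \frac{(1-\sigma_0)^3}{2000}\right)^{n},$$
so that $\lim_{n\to\infty} \sigma_n = 0$ and the required result is proved. □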
9.8 Dirichlet's Theorem

Theorem 8.1 (Dirichlet). Let $k > 0$, $l > 0$, $(k, l) = 1$. Then there are infinitely many primes of the form $kn + l$.

In this section we prove a stronger version of Theorem 8.1, namely:

Theorem 8.2. Let $k > 0$, $l > 0$, $(k, l) = 1$. Then
$$\sum_{\substack{p \le x \\ p \equiv l \pmod{k}}} \frac{\log p}{p} = \frac{1}{\varphi(k)} \log x + O(1).$$
Here the summation is over all primes not exceeding $x$ which are of the form $kn + l$, and the constant implied by the $O$-symbol depends on $k$.

We shall require the following lemmas for the proof of Theorem 8.2. If $\chi$ is a nonprincipal character, we write
$$L(\chi) = \sum_{n=1}^{\infty} \frac{\chi(n)}{n}, \qquad L_1(\chi) = \sum_{n=1}^{\infty} \frac{\chi(n)\log n}{n}. \tag{1}$$
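Lemma 1. Let $\chi$ be a nonprincipal real character. Then $L(\chi) \ne 0$.

Proof. Let
$$F(n) = \sum_{d \mid n} \chi(d).$$
Since
$$F(p^l) = \begin{cases} 1 + 1 + \cdots + 1 = l + 1, & \text{if } \chi(p) = 1, \\ 1 - 1 + \cdots + 1 = 1, & \text{if } \chi(p) = -1,\ l \text{ even}, \\ 1 - 1 + \cdots - 1 = 0, & \text{if } \chi(p) = -1,\ l \text{ odd} \end{cases}$$
(and $F(p^l) = 1$ when $\chi(p) = 0$), and $F(n)$ is multiplicative, we have
$$F(n) \ge \begin{cases} 1, & \text{if } n \text{ is a perfect square}, \\ 0, & \text{otherwise}, \end{cases}$$
so that
$$G(x) = \sum_{n \le x} \frac{F(n)}{n^{1/2}} \ge \sum_{1 \le m \le \sqrt{x}} \frac{1}{m} \to +\infty.$$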
On the other hand, when $\chi$ is a nonprincipal character, we have
$$\sum_{x \le n \le y} \frac{\chi(n)\log n}{n^{\delta}} = O\!\left(\frac{\log x}{x^{\delta}}\right) \qquad (\delta > 0,\ x > 1). \tag{2}$$
(This can be proved from Exercise 7.2.1 and Theorem 6.8.2.) From formula (5.8.4) we now have
$$G(x) = \sum_{n \le x} \frac{1}{n^{1/2}} \sum_{d \mid n} \chi(d) = \sum_{dd' \le x} \frac{\chi(d)}{d^{1/2} d'^{1/2}} = 2\sqrt{x} \sum_{d \le \sqrt{x}} \frac{\chi(d)}{d} + O(1) = 2\sqrt{x}\, L(\chi) + O(1).$$
If $L(\chi) = 0$, then $G(x) = O(1)$, which is impossible. The lemma is proved. □
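The two facts used in this proof can be checked numerically for one concrete character. The following is only an illustrative sketch: it assumes the nonprincipal real character mod 4 (here called `chi4`, a name introduced for this example), for which $L(\chi) = \pi/4$, and confirms that $F(n) \ge 1$ on squares, $F(n) \ge 0$ otherwise, and that $G(x) - 2\sqrt{x}\,L(\chi)$ stays bounded.

```python
from math import isqrt, pi, sqrt

def chi4(n):
    """Nonprincipal real character mod 4 (assumed example): values 0, 1, 0, -1."""
    return (0, 1, 0, -1)[n % 4]

def F(n):
    """F(n) = sum of chi(d) over all divisors d of n."""
    return sum(chi4(d) for d in range(1, n + 1) if n % d == 0)

def G(x):
    """G(x) = sum over n <= x of F(n) / sqrt(n)."""
    return sum(F(n) / sqrt(n) for n in range(1, x + 1))

def is_square(n):
    return isqrt(n) ** 2 == n

# L(chi4) = 1 - 1/3 + 1/5 - ... = pi/4 (Leibniz series).
L_CHI4 = pi / 4

def bounded_difference(x):
    """G(x) - 2*sqrt(x)*L(chi); the proof shows this is O(1)."""
    return G(x) - 2 * sqrt(x) * L_CHI4
```

Since $L(\chi) \ne 0$ here, $G(x)$ grows like $2\sqrt{x}\,L(\chi)$, in agreement with the lower bound $\sum_{m \le \sqrt{x}} 1/m \to \infty$.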
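Lemma 2.
$$L_1(\chi) \sum_{n \le x} \frac{\mu(n)\chi(n)}{n} = \begin{cases} O(1), & \text{if } L(\chi) \ne 0, \\ -\log x + O(1), & \text{if } L(\chi) = 0. \end{cases}$$

Proof. In Theorem 6.3.3 we let $H(n) = \chi(n)$, $F(x) = x$. Then from
$$G(x) = \sum_{1 \le n \le x} F\!\left(\frac{x}{n}\right) H(n) = x \sum_{1 \le n \le x} \frac{\chi(n)}{n} = x L(\chi) + O(1),$$
we have
$$x = F(x) = \sum_{1 \le n \le x} \mu(n)\, G\!\left(\frac{x}{n}\right) H(n) = x L(\chi) \sum_{1 \le n \le x} \frac{\chi(n)\mu(n)}{n} + O(x),$$
giving
$$L(\chi) \sum_{n \le x} \frac{\mu(n)\chi(n)}{n} = O(1).$$
If $L(\chi) \ne 0$, then
$$\sum_{n \le x} \frac{\mu(n)\chi(n)}{n} = O(1)$$
and the result follows at once. If $L(\chi) = 0$, then in Theorem 6.3.3 we let $F(x) = x \log x$, $H(n) = \chi(n)$ so that
$$G(x) = \sum_{n \le x} F\!\left(\frac{x}{n}\right) H(n) = x \sum_{n \le x} \frac{\chi(n)}{n} \log\frac{x}{n} = L(\chi)\, x \log x - L_1(\chi)\, x + O(\log x) = -L_1(\chi)\, x + O(\log x).$$
From Example 5.8.2 we have
$$\sum_{n \le x} \log\frac{x}{n} = O(x)$$
so that
$$x \log x = -L_1(\chi)\, x \sum_{n \le x} \frac{\mu(n)\chi(n)}{n} + O(x).$$
The proof of the lemma is complete. □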
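Lemma 3.
$$\sum_{p \le x} \frac{\chi(p)\log p}{p} = \begin{cases} O(1), & \text{if } L(\chi) \ne 0, \\ -\log x + O(1), & \text{if } L(\chi) = 0. \end{cases}$$

Proof.
$$\begin{aligned}
\sum_{p \le x} \frac{\chi(p)\log p}{p} &= \sum_{n \le x} \frac{\chi(n)\Lambda(n)}{n} + O(1) \\
&= \sum_{n \le x} \frac{\chi(n)}{n} \sum_{d \mid n} \mu(d) \log\frac{n}{d} + O(1) \\
&= \sum_{dd' \le x} \frac{\chi(d)\chi(d')}{dd'}\, \mu(d) \log d' + O(1) \\
&= \sum_{d \le x} \frac{\mu(d)\chi(d)}{d} \sum_{d' \le x/d} \frac{\chi(d')\log d'}{d'} + O(1)
\end{aligned}$$
and the required result follows from Lemma 2. □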
Lemma 4. Suppose that $\chi$ is a nonprincipal character. Then $L(\chi) \ne 0$.

Proof. Let $N$ be the number of nonprincipal characters $\chi$ mod $k$ such that $L(\chi) = 0$, and let $\sum_{(\chi)}$ represent the sum over all characters mod $k$. Then, from Lemma 3 together with Theorem 7.2.4 and Theorem 7.2.5, we have
$$\varphi(k) \sum_{\substack{p \le x \\ p \equiv 1 \pmod{k}}} \frac{\log p}{p} = \sum_{(\chi)} \sum_{p \le x} \frac{\chi(p)\log p}{p} = \sum_{\substack{p \le x \\ p \nmid k}} \frac{\log p}{p} + \sum_{\chi \ne \chi_0} \sum_{p \le x} \frac{\chi(p)\log p}{p} = (1 - N)\log x + O(1).$$
But since
$$\varphi(k) \sum_{\substack{p \le x \\ p \equiv 1 \pmod{k}}} \frac{\log p}{p} \ge 0$$
we must have $0 \le N \le 1$. Now if $\chi$ is a complex character with $L(\chi) = 0$, then we must have $L(\bar\chi) = 0$ also, so that $N \ge 2$, which is impossible. But if $\chi$ is a real character we know from Lemma 1 that $L(\chi) \ne 0$, so that $N = 0$. The lemma is proved. □

Proof of Theorem 8.2. From Lemma 3 and Lemma 4 we have, for every nonprincipal character $\chi$,
$$\sum_{p \le x} \frac{\chi(p)\log p}{p} = O(1),$$
so that, by Exercise 7.2.2,
$$\varphi(k) \sum_{\substack{p \le x \\ p \equiv l \pmod{k}}} \frac{\log p}{p} = \sum_{(\chi)} \bar\chi(l) \sum_{p \le x} \frac{\chi(p)\log p}{p} = \log x + O(1),$$
and the desired result follows. □
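Theorem 8.2 can be illustrated numerically. The sketch below is not part of the proof: it sieves the primes up to a bound and compares $\sum_{p \le x,\ p \equiv l \pmod{k}} \log p / p$ with $\log x / \varphi(k)$. The function names (`primes_up_to`, `phi`, `weighted_prime_sum`, `deviation`) are introduced here for illustration only.

```python
from math import gcd, log

def primes_up_to(n):
    """All primes <= n (n >= 1) by the sieve of Eratosthenes."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            # Cross out the multiples i*i, i*i + i, ... of the prime i.
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(n + 1) if sieve[i]]

def phi(k):
    """Euler's function: the number of 1 <= a <= k with (a, k) = 1."""
    return sum(1 for a in range(1, k + 1) if gcd(a, k) == 1)

def weighted_prime_sum(x, k, l):
    """Sum of log(p)/p over primes p <= x with p congruent to l mod k."""
    return sum(log(p) / p for p in primes_up_to(x) if p % k == l % k)

def deviation(x, k, l):
    """The O(1) term of Theorem 8.2: the sum minus log(x)/phi(k)."""
    return weighted_prime_sum(x, k, l) - log(x) / phi(k)
```

For a fixed pair $(k, l)$ the value `deviation(x, k, l)` should remain bounded as `x` grows, although the bound depends on $k$, as the theorem states.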
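Exercise. Suppose that $(k, l) = 1$, $1 \le l \le k$. Prove that
$$\lim_{x \to \infty} \frac{\pi(x; k, l)}{x / (\varphi(k)\log x)} = 1.$$
Suggestions: 1) Let
$$\vartheta_l(x) = \sum_{\substack{p \le x \\ p \equiv l \pmod{k}}} \log p, \qquad \psi_l(x) = \sum_{\substack{n \le x \\ n \equiv l \pmod{k}}} \Lambda(n),$$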
so that 1'
1m
n(x; k, I)
x+oo
1' 8/(x)
1'l/I/(x)
xoo
xoo
= 1m= 1m;
X
X
X
(()(k) log x
lim n(x;k,/) = lim 8/(x) = lim l/I/(x). xoo x xoo X xoo X (()(k) log x 2) Prove:
IJl(d)10g2~ = A(n)logn + IA(d)A(~). din
d
d
din
Summing with respect to n over 1 :::; n :::; x, n == 1 (mod k) we have 10g2 P +
I
2
I
logp log q = xlogx (()(k)
p~x
pq,""x
p=/(modk)
pq=/(modk)
+ O(x)
and 8 i (x) log x
+I p,""x
where p satisfies pp == 1 (modk).
10gP8Ip(~) = _2xlogx + O(x), p ({)(k)
3) Let X
+ Rl(x),
81(x) =  k (()( )
so that
=
RI(X)logx ,
4)
IRI(x)1 :;;;
5)
6) If 0 <
I
logn
logplogq ( x) Rift;; logpq pq
I
1
({)(k)logx l';a';k (a,k)= 1
Rl(n) n';x n (T
I
pq~x
= 
I
n';x
< 1 and there exists
IRa (x)  I+ 0 (x log log X) . n
n';x
I 1
+ O(x log log x).
logx
(x) +
I
R,.{n)Rp n ap=l(modk) n
Xo
such that when
x
>
Xo
O(x).
we have
(TX
IRI(x)1 <  k ' (()( )
then there exists Xu such that, when x > xu, the interval «1  (T)16 X, x) contains a subinterval (y, eOy) (J = (T(l  (T)/32) such that when y :;;; z :;;; eOy we have RI(Z) I < _1_ . (T + (T2. Z (()(k) 2
I
7) First use Theorem 8.2 to show that $\sigma_0$ and $x_0$ exist such that $0 < \sigma_0 < 1$ and, when $x > x_0$,
$$|R_l(x)| < \frac{\sigma_0}{\varphi(k)}\, x.$$
Then use this together with 4) and 6) to prove that
$$\lim_{x \to \infty} \frac{R_l(x)}{x} = 0.$$
Notes 9.1. The present best result on the error term of the prime number theorem, namely n(x)
~
= Ii x + O(xec(logX)'),
c a positive constant,
is due to I. M. Vinogradov and N. M. Korobov and is based on estimates of trigonometric sums.
9.2. In recent years a number of mathematicians have obtained error term estimates in Selberg's elementary proof of the prime number theorem. For example:
$$\pi(x) = \operatorname{li} x + O\!\left(\frac{x}{\log^{A} x}\right),$$
where $A$ is any constant, however large, and the $O$-constant depends on $A$ (see E. Bombieri [8] and E. Wirsing [64]). An even better estimate is given by H. Diamond and G. J. Steinig [21].
Chapter 10. Continued Fractions and Approximation Methods
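10.1 Simple Continued Fractions

By a finite continued fraction we mean an expression
$$a_0 + \cfrac{1}{a_1 + \cfrac{1}{a_2 + \cdots + \cfrac{1}{a_N}}}.$$
We shall see that, as $N \to \infty$, the expression here tends to a definite number; we call the infinite continued fraction a continued fraction. It is convenient to denote the above expression by
$$a_0 + \frac{1}{a_1 +}\, \frac{1}{a_2 +} \cdots \frac{1}{a_N} \qquad \text{or} \qquad [a_0, a_1, \ldots, a_N].$$
It is easy to see that
$$[a_0] = \frac{a_0}{1}, \qquad [a_0, a_1] = \frac{a_1 a_0 + 1}{a_1}.$$
In general, we let $[a_0, a_1, \ldots, a_n] = p_n/q_n$, $0 \le n \le N$, where $p_n$, $q_n$ are polynomials in $a_0, a_1, \ldots, a_n$. These polynomials are linear in any one $a_i$, and the denominator $q_n$ is independent of $a_0$. We call $p_n/q_n$ the $n$th convergent of $[a_0, a_1, \ldots, a_N]$.

Theorem 1.1. The convergents satisfy the following:
$$p_0 = a_0, \quad p_1 = a_1 a_0 + 1, \quad p_n = a_n p_{n-1} + p_{n-2} \quad (2 \le n \le N),$$
$$q_0 = 1, \quad q_1 = a_1, \quad q_n = a_n q_{n-1} + q_{n-2} \quad (2 \le n \le N). \qquad □$$

Theorem 1.2. The convergents satisfy the following:
$$p_n q_{n-1} - p_{n-1} q_n = (-1)^{n-1} \qquad (n \ge 1), \tag{1}$$
or
$$\frac{p_n}{q_n} - \frac{p_{n-1}}{q_{n-1}} = \frac{(-1)^{n-1}}{q_n q_{n-1}},$$
and
$$p_n q_{n-2} - p_{n-2} q_n = (-1)^{n} a_n \qquad (n \ge 2). \tag{2}$$
□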
Definition. Let $a_0$ be an integer, and $a_1, a_2, \ldots$ be positive integers. Then
$$a_0 + \frac{1}{a_1 +}\, \frac{1}{a_2 +} \cdots$$
is called a simple continued fraction.

We shall only deal with simple continued fractions in this chapter. From Theorems 1.1 and 1.2 we deduce at once:

Theorem 1.3. (i) If $n > 1$, then $q_n \ge q_{n-1} + 1$, so that $q_n \ge n$.
(ii)
$$\frac{p_{2n+1}}{q_{2n+1}} < \frac{p_{2n-1}}{q_{2n-1}}, \qquad \frac{p_{2n}}{q_{2n}} > \frac{p_{2n-2}}{q_{2n-2}}.$$
(iii) Every convergent of a simple continued fraction is a reduced fraction. □
Let $\alpha$ be a real number. We take $a_0 = [\alpha]$ and we let $\alpha'_1 = 1/(\alpha - [\alpha])$. We then take $a_1 = [\alpha'_1]$ and we let $\alpha'_2 = 1/(\alpha'_1 - [\alpha'_1])$. We continue in this way by taking $a_n = [\alpha'_n]$ and defining $\alpha'_{n+1} = 1/(\alpha'_n - [\alpha'_n])$. It is clear that if this process terminates, then $\alpha$ must be a rational number. Conversely, if $\alpha$ is a rational number $p/q$ where $(p, q) = 1$, then $a_0 = [p/q]$ and
$$\frac{p}{q} - a_0 = \frac{1}{\alpha'_1}, \qquad 0 \le \frac{1}{\alpha'_1} < 1,$$
or
$$p = a_0 q + r_1, \qquad 0 \le r_1 < q,$$
and similarly at each later step, with $q$ and $r_1$ in place of $p$ and $q$.

We see therefore that if $\alpha$ is rational, then the evaluation of its continued fraction is similar to the Euclidean algorithm, so that we have:

Theorem 1.4. Every rational number is representable as a finite continued fraction. □
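The correspondence with the Euclidean algorithm can be made concrete. The following is a minimal sketch, with function names chosen here for illustration: `continued_fraction` produces the partial quotients of $p/q$ by repeated division with remainder, and `convergents` rebuilds $p_n/q_n$ from the recurrences of Theorem 1.1, using the formal starting values $p_{-1} = 1$, $q_{-1} = 0$ (a common convention, assumed here, that reproduces $p_1 = a_1 a_0 + 1$ and $q_1 = a_1$).

```python
def continued_fraction(p, q):
    """Partial quotients [a0, a1, ..., aN] of p/q, q > 0, by the Euclidean algorithm."""
    quotients = []
    while q != 0:
        a, r = divmod(p, q)      # p = a*q + r with 0 <= r < q, so a = [p/q]
        quotients.append(a)
        p, q = q, r              # continue with the reciprocal of the fractional part
    return quotients

def convergents(quotients):
    """Pairs (p_n, q_n) from p_n = a_n p_{n-1} + p_{n-2}, q_n = a_n q_{n-1} + q_{n-2}."""
    p_prev, q_prev = 1, 0        # formal values p_{-1}, q_{-1}
    p, q = quotients[0], 1       # p_0 = a_0, q_0 = 1
    result = [(p, q)]
    for a in quotients[1:]:
        p, p_prev = a * p + p_prev, p
        q, q_prev = a * q + q_prev, q
        result.append((p, q))
    return result
```

For instance, `continued_fraction(355, 113)` follows the divisions $355 = 3 \cdot 113 + 16$, $113 = 7 \cdot 16 + 1$, $16 = 16 \cdot 1$, and the last pair returned by `convergents` is the original fraction in lowest terms.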
An immediate problem is the uniqueness of this representation. From $a + 1/1 = a + 1$ we see at once that there is no uniqueness. In other words, if $a_n > 1$, then $[a_0, \ldots, a_n] = [a_0, \ldots, a_{n-1}, a_n - 1, 1]$; if $a_n = 1$, then $[a_0, \ldots, a_n] = [a_0, \ldots, a_{n-1} + 1]$. Therefore each rational number has two representations, one with $n$ odd, and the other with $n$ even. If $\alpha$ is irrational, then the above method gives an infinite sequence $a_0, a_1, a_2, \ldots, a_n, \ldots$. For example, we have

$$\pi = [3, 7, 15, 1, 292, 1, 1, 1, 2, 1, 3, 1, 14, 2, 1, 1, 2, 2, 2, 2, 1, 84, 2, 1, 1, 15, 3, 13, 1, 4, 2, 6, 6, 99, 1, \ldots].$$

Theorem 1.5. Let $\alpha_n = [a_0, a_1, \ldots, a_n]$. Then $\lim_{n \to \infty} \alpha_n$ exists.
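Proof. We have $\alpha_n = p_n/q_n$, and by Theorem 1.3 (ii), $\alpha_{2n+1} < \alpha_{2n-1}$, $\alpha_{2n} > \alpha_{2n-2}$. Next from Theorem 1.2 (1), $\alpha_1 \ge \alpha_{2n+1} \ge \alpha_{2n} \ge \alpha_2$, so that $\lim \alpha_{2n}$ and $\lim \alpha_{2n+1}$ exist. Finally, from Theorem 1.2 and Theorem 1.3 (i), we have $|\alpha_{2n} - \alpha_{2n-1}| = 1/(q_{2n} q_{2n-1}) \le 1/(2n(2n-1))$ so that $\lim \alpha_{2n} = \lim \alpha_{2n-1}$. □

Exercise 1. Prove that
$$p_n = \begin{vmatrix}
a_0 & 1 & 0 & \cdots & 0 & 0 \\
-1 & a_1 & 1 & \cdots & 0 & 0 \\
0 & -1 & a_2 & \cdots & 0 & 0 \\
\vdots & & & \ddots & & \vdots \\
0 & 0 & 0 & \cdots & a_{n-1} & 1 \\
0 & 0 & 0 & \cdots & -1 & a_n
\end{vmatrix}$$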
and that $q_n$ is the determinant above with the first row and first column omitted.

Exercise 2. The sequence $(u_n) = (1, 1, 2, 3, 5, 8, 13, \ldots)$, where $u_1 = u_2 = 1$, $u_{i+1} = u_{i-1} + u_i$ ($i > 1$), is called the Fibonacci sequence. Prove that (i) $u_{n+2}/u_{n+1}$ is the $n$th convergent of $(1 + \sqrt{5})/2$; (ii) in the continued fraction $[a_0, a_1, \ldots]$, if $a_i = 2$ ($i > 0$) and $a_n = 1$ ($n \ne i$), then for $m > i$ we have
$$\frac{p_m}{q_m} = \frac{u_{i+1} u_{m-i+3} + u_i u_{m-i+1}}{u_i u_{m-i+3} + u_{i-1} u_{m-i+1}}.$$

Exercise 3. A synodic month is the period of time between two new moons, and is 29.5306 days. When projected onto the star sphere, the path of the moon intersects the ecliptic (the path of the sun) at the ascending and the descending nodes. A draconic month is the period of time for the moon to return to the same node, and is 27.2123 days. Show that solar and lunar eclipses occur in cycles with a period of 18 years 10 days.
10.2 The Uniqueness of a Continued Fraction Expansion

Definition. We call $\alpha'_n = [a_n, a_{n+1}, \ldots]$ the $(n+1)$th complete quotient of $[a_0, a_1, \ldots, a_n, \ldots]$.
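Theorem 2.1. We have
$$\alpha = \alpha'_0, \qquad \alpha = \frac{\alpha'_1 a_0 + 1}{\alpha'_1}, \qquad \alpha = \frac{\alpha'_n p_{n-1} + p_{n-2}}{\alpha'_n q_{n-1} + q_{n-2}} \quad (n \ge 2).$$
If $\alpha$ is rational, then this holds up to $n = N$.

Proof. Use mathematical induction. □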
Theorem 2.2. We always have $a_n = [\alpha'_n]$, except when $\alpha$ is rational and $a_N = 1$, in which case we have $a_{N-1} = [\alpha'_{N-1}] - 1$. Therefore there are only two representations to a rational number.

Proof. We have $\alpha'_n = a_n + 1/\alpha'_{n+1}$. If $\alpha$ is irrational or if $\alpha$ is rational and $n \ne N - 1$, then $\alpha'_{n+1} > 1$ so that $a_n < \alpha'_n < a_n + 1$, as required. If $\alpha$ is rational and $n = N - 1$, $\alpha'_{n+1} = 1$, then $a_n = [\alpha'_n] - 1$. □

Theorem 2.3. The representation of an irrational number by a simple continued fraction is unique.

Proof. Suppose that $\alpha = [a_0, a_1, a_2, \ldots] = [b_0, b_1, b_2, \ldots]$. Certainly we have $a_0 = [\alpha] = b_0$, and similarly $a_1 = b_1$. Suppose now that $a_k = b_k$ for $k < n$, and we have to prove that $a_n = b_n$. From $\alpha = [a_0, \ldots, a_{n-1}, \alpha'_n] = [a_0, \ldots, a_{n-1}, \beta'_n]$, we have
$$\alpha = \frac{\alpha'_n p_{n-1} + p_{n-2}}{\alpha'_n q_{n-1} + q_{n-2}} = \frac{\beta'_n p_{n-1} + p_{n-2}}{\beta'_n q_{n-1} + q_{n-2}},$$
so that $(\alpha'_n - \beta'_n)(p_{n-1} q_{n-2} - p_{n-2} q_{n-1}) = 0$. From Theorem 1.2 we deduce that $\alpha'_n = \beta'_n$ and therefore $a_n = [\alpha'_n] = [\beta'_n] = b_n$. □

Theorem 2.4. We have
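$$q_n \alpha - p_n = \frac{(-1)^n \delta_n}{q_{n+1}}, \qquad 0 < \delta_n < 1,$$
and $\delta_n / q_{n+1}$ is a decreasing function of $n$. (If $\alpha$ is rational, then this holds only for $1 \le n \le N - 2$, and $\delta_{N-1} = 1$.)

Proof. We have
$$\alpha = \frac{\alpha'_{n+1} p_n + p_{n-1}}{\alpha'_{n+1} q_n + q_{n-1}},$$
so that
$$\alpha - \frac{p_n}{q_n} = \frac{\alpha'_{n+1} p_n + p_{n-1}}{\alpha'_{n+1} q_n + q_{n-1}} - \frac{p_n}{q_n} = \frac{-(p_n q_{n-1} - q_n p_{n-1})}{q_n (\alpha'_{n+1} q_n + q_{n-1})} = \frac{(-1)^n}{q_n (\alpha'_{n+1} q_n + q_{n-1})},$$
and hence
$$\delta_n = \frac{q_{n+1}}{\alpha'_{n+1} q_n + q_{n-1}} = \frac{a_{n+1} q_n + q_{n-1}}{\alpha'_{n+1} q_n + q_{n-1}}.$$
From this we see that $0 < \delta_n < 1$ except when $a_{n+1} = \alpha'_{n+1}$. Also, from $\alpha'_n = a_n + 1/\alpha'_{n+1}$ we have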
1
;::: rx~+lqn
+ qn1
(an+1
+ l)qn + qn1
In the last inequality, equality sign holds only when rxn+ 1 rational and n = N  1. 0
= rx~+ p
that is when rx is
From this theorem we deduce: Theorem 2.5.
If $\alpha$ is irrational, then $\lim_{n \to \infty} p_n/q_n = \alpha$. □

Theorem 2.6. We have
$$\left| \alpha - \frac{p_n}{q_n} \right| \le \frac{1}{q_n q_{n+1}} < \frac{1}{q_n^2},$$
with the equality sign only when $\alpha$ is rational and $n = N - 1$. □
10.3 The Best Approximation

Let $\alpha$ be a real number. Among the rational numbers with denominators not exceeding $N$, there is one which is closest to $\alpha$, and we call it the best rational approximation to $\alpha$. We now prove that the convergents $p_n/q_n$ are the best rational approximations to $\alpha$.

Theorem 3.1. Suppose that $n \ge 1$, $0 < q \le q_n$ and $p/q \ne p_n/q_n$. Then $|p_n/q_n - \alpha| < |p/q - \alpha|$.

Proof. It suffices to prove that $|p_n - q_n \alpha| < |p - q\alpha|$. (i) If $\alpha = [\alpha] + \tfrac12$, then $p_1/q_1 = \alpha$ and the result follows at once. (ii) If $\alpha < [\alpha] + \tfrac12$, then the result holds when $n = 0$, and if $\alpha > [\alpha] + \tfrac12$, then the result holds when $n = 1$. We now assume as induction hypothesis that the result holds for $n - 1$, and proceed to prove by induction. If $q \le q_{n-1}$, then from the induction hypothesis $|p_{n-1} - q_{n-1}\alpha| < |p - q\alpha|$, so that we may assume that $q_n \ge q > q_{n-1}$. If $q = q_n$, then
Also
If $q_{n+1} = 2$, then $n = 1$, and $a_1 = a_2 = 1$, giving
$$\alpha = a_0 + \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{a_3 + \cdots}}}$$
which shows that $a_0 + \tfrac12 < \alpha < a_0 + 1$, and our required result clearly holds. We may therefore assume that $q_{n+1} > 2$, that is
and so
II!..q 
IX
I ~ II!..  Pn 1IPn q
qn
qn
IX
I ~ ~ Ipn qn
qn
IX
I > Ipn qn
IX
I·
We may now assume that $q_n > q > q_{n-1}$. Let us write $u p_n + v p_{n-1} = p$, $u q_n + v q_{n-1} = q$, so that $u(p_n q_{n-1} - p_{n-1} q_n) = p q_{n-1} - q p_{n-1}$. From Theorem 1.2 we have $u = \pm(p q_{n-1} - q p_{n-1})$, and similarly $v = \pm(p q_n - q p_n)$. The numbers $u$, $v$ cannot be zero, and in fact from $q_n > q = u q_n + v q_{n-1}$ we see that they are of opposite signs. Now from Theorem 2.4, $p_n - q_n \alpha$ and $p_{n-1} - q_{n-1} \alpha$ have opposite signs, and therefore $u(p_n - q_n \alpha)$ and $v(p_{n-1} - q_{n-1} \alpha)$ have the same sign. Finally from $p - q\alpha = u(p_n - q_n\alpha) + v(p_{n-1} - q_{n-1}\alpha)$ we see that $|p - q\alpha| > |p_{n-1} - q_{n-1}\alpha| > |p_n - q_n\alpha|$. □
Example. From $\pi = [3, 7, 15, 1, 292, 1, 1, \ldots]$ we obtain the convergents
$$\frac{3}{1},\ \frac{22}{7},\ \frac{333}{106},\ \frac{355}{113},\ \frac{103993}{33102},\ \frac{104348}{33215},\ \ldots.$$
In the year 500 A.D. Chao Jung-Tze obtained both the crude estimate 22/7 and the good estimate 355/113 (this is more than a thousand years earlier than the earliest European record due to Otto). More interesting still, the two estimates of Chao belong to the family of best approximations to $\pi$; in other words there is no fraction with denominator less than 113 which is closer to $\pi$ than 355/113 is. From Theorem 2.6 we have
$$\left| \pi - \frac{355}{113} \right| \le \frac{1}{113 \times 33102} < \frac{1}{10^6}.$$
In fact $355/113 = 3.1415929\ldots$ whereas $\pi = 3.1415926\ldots$.
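The claims in this example can be verified with exact rational arithmetic. In the sketch below, `PI` is a rational stand-in for $\pi$ built from its first 35 decimals (an assumption of the example; it is far more accurate than any comparison made here requires), and the helper names are introduced only for illustration.

```python
from fractions import Fraction

# Rational stand-in for pi, correct to about 35 decimal places (assumption of this sketch).
PI = Fraction("3.14159265358979323846264338327950288")

def convergents(quotients):
    """Convergents p_n/q_n of [a0, a1, ...] via the recurrences of Theorem 1.1."""
    p_prev, q_prev = 1, 0            # formal values p_{-1}, q_{-1}
    p, q = quotients[0], 1
    out = [Fraction(p, q)]
    for a in quotients[1:]:
        p, p_prev = a * p + p_prev, p
        q, q_prev = a * q + q_prev, q
        out.append(Fraction(p, q))
    return out

def best_with_denominator_below(alpha, bound):
    """Closest fraction to alpha among all p/q with 1 <= q < bound."""
    candidates = (Fraction(round(alpha * q), q) for q in range(1, bound))
    return min(candidates, key=lambda f: abs(alpha - f))

pi_convergents = convergents([3, 7, 15, 1, 292, 1, 1])
closest_below_113 = best_with_denominator_below(PI, 113)
```

Comparing `abs(PI - Fraction(355, 113))` with `abs(PI - closest_below_113)` confirms that no fraction with denominator less than 113 comes closer to $\pi$ than 355/113 does.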
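10.4 Hurwitz's Theorem

Theorem 4.1. Of any two consecutive convergents to $\alpha$, at least one of them satisfies the inequality $|\alpha - p/q| < 1/(2q^2)$.

Proof. By Theorem 2.4, $\alpha$ lies between $p_n/q_n$ and $p_{n+1}/q_{n+1}$ so that we have
$$\left| \frac{p_{n+1}}{q_{n+1}} - \frac{p_n}{q_n} \right| = \left| \alpha - \frac{p_n}{q_n} \right| + \left| \alpha - \frac{p_{n+1}}{q_{n+1}} \right|.$$
If the theorem is false, then
$$\frac{1}{q_n q_{n+1}} = \left| \frac{p_{n+1}}{q_{n+1}} - \frac{p_n}{q_n} \right| \ge \frac{1}{2q_n^2} + \frac{1}{2q_{n+1}^2},$$
giving $(q_{n+1} - q_n)^2 \le 0$, which is impossible if $n > 0$. □

It follows from this theorem that, if $\alpha$ is irrational, then there are infinitely many rational numbers $p/q$ such that $|\alpha - p/q| < 1/(2q^2)$.

Theorem 4.2 (Hurwitz). Of any three consecutive convergents to $\alpha$, at least one of them satisfies the inequality
$$\left| \alpha - \frac{p}{q} \right| < \frac{1}{\sqrt{5}\, q^2}.$$

Proof. Let $\beta_{n+1} = q_{n-1}/q_n$, so that, by Theorem 2.1,
$$\left| \alpha - \frac{p_n}{q_n} \right| = \frac{1}{q_n^2 (\alpha'_{n+1} + \beta_{n+1})}.$$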
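We proceed to prove that
$$\alpha'_i + \beta_i \le \sqrt{5} \tag{1}$$
cannot hold for three consecutive values $i = n - 1, n, n + 1$. Suppose that (1) holds for $i = n - 1$ and $i = n$. From $\alpha'_{n-1} = a_{n-1} + 1/\alpha'_n$ and
$$\frac{1}{\beta_n} = \frac{q_{n-1}}{q_{n-2}} = a_{n-1} + \beta_{n-1}$$
we have
$$\frac{1}{\alpha'_n} + \frac{1}{\beta_n} = \alpha'_{n-1} + \beta_{n-1} \le \sqrt{5},$$
giving
$$1 = \alpha'_n \cdot \frac{1}{\alpha'_n} \le (\sqrt{5} - \beta_n)\left(\sqrt{5} - \frac{1}{\beta_n}\right).$$
Thus $\beta_n + 1/\beta_n \le \sqrt{5}$. Since $\beta_n$ is a rational number we must have strict inequality, and since $\beta_n < 1$, we deduce easily that $\beta_n > \tfrac12(\sqrt{5} - 1)$. Similarly, if (1) holds for $i = n$ and $i = n + 1$, then $\beta_{n+1} > \tfrac12(\sqrt{5} - 1)$, and we arrive at
$$a_n = \frac{1}{\beta_{n+1}} - \beta_n < \sqrt{5} - \beta_{n+1} - \beta_n < \sqrt{5} - (\sqrt{5} - 1) = 1,$$
which is impossible. The theorem is proved. □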
As a corollary to this, we have

Theorem 4.3. Given any irrational number $\alpha$, there are infinitely many convergents $p/q$ satisfying the inequality
$$\left| \alpha - \frac{p}{q} \right| < \frac{1}{\sqrt{5}\, q^2}. \qquad □$$
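Hurwitz's theorem lends itself to a quick numerical check. The sketch below is an illustration only: it uses Python's `decimal` module at an assumed working precision of 80 digits, and tests every window of three consecutive convergents for two sample numbers, $(\sqrt{5} - 1)/2 = [0, 1, 1, 1, \ldots]$ and $\sqrt{2} = [1, 2, 2, 2, \ldots]$. The function names are introduced here for the example.

```python
from decimal import Decimal, getcontext

getcontext().prec = 80               # assumed precision; ample for the sizes used below
SQRT5 = Decimal(5).sqrt()

def convergents(quotients):
    """Pairs (p_n, q_n) built from the recurrences of Theorem 1.1."""
    p_prev, q_prev = 1, 0
    p, q = quotients[0], 1
    out = [(p, q)]
    for a in quotients[1:]:
        p, p_prev = a * p + p_prev, p
        q, q_prev = a * q + q_prev, q
        out.append((p, q))
    return out

def satisfies_hurwitz(alpha, p, q):
    """True when |alpha - p/q| < 1 / (sqrt(5) * q^2)."""
    return abs(alpha - Decimal(p) / Decimal(q)) < 1 / (SQRT5 * q * q)

def every_window_of_three_has_one(alpha, quotients):
    """Check Theorem 4.2 on all triples of consecutive convergents."""
    flags = [satisfies_hurwitz(alpha, p, q) for p, q in convergents(quotients)]
    return all(any(flags[i:i + 3]) for i in range(len(flags) - 2))

golden = (SQRT5 - 1) / 2             # continued fraction [0, 1, 1, 1, ...]
root2 = Decimal(2).sqrt()            # continued fraction [1, 2, 2, 2, ...]
```

For the first number the inequality holds only for some of the convergents, which is consistent with the constant $\sqrt{5}$ being sharp for numbers equivalent to $(\sqrt{5} - 1)/2$.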
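Theorem 4.4. The number $\sqrt{5}$ in the above theorem is best possible in the sense that if $A > \sqrt{5}$, then there exists a real number $\alpha$ such that $|\alpha - p/q| < 1/(Aq^2)$ has only finitely many solutions.

Proof. In fact we set $\alpha = \tfrac12(\sqrt{5} - 1)$. Let us assume that
$$\frac12(\sqrt{5} - 1) = \frac{p}{q} + \frac{\delta}{q^2}, \qquad |\delta| < \frac{1}{A} < \frac{1}{\sqrt{5}}.$$
Then
$$\frac{\delta}{q} - \frac{\sqrt{5}}{2}\, q = -\left(\frac12 q + p\right),$$
and on squaring
$$\frac{\delta^2}{q^2} - \sqrt{5}\,\delta = \left(\frac12 q + p\right)^2 - \frac54 q^2 = pq - q^2 + p^2.$$
For sufficiently large $q$, the left hand side here is less than 1 in magnitude, so that $pq - q^2 + p^2 = 0$ or $(2p + q)^2 = 5q^2$, which is impossible. □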
10.5 The Equivalence of Real Numbers

Definition 5.1. Two real numbers $\xi$ and $\eta$ are said to be equivalent if there are integers $a, b, c, d$ such that
$$\xi = \frac{a\eta + b}{c\eta + d}, \qquad ad - bc = \pm 1. \tag{1}$$
The relationship between $\xi$ and $\eta$ here is called a modular transformation.

Example 1. $\xi = a + \eta$ and $\xi = 1/\eta$ are modular transformations.

Example 2. $\xi = [a, \eta] = a + 1/\eta$ is also a modular transformation.
Example 3. We may view $\alpha = [a_0, a_1, \ldots, a_{n-1}, \alpha'_n]$ as $n$ successive modular transformations of Example 2. The resulting modular transformation is given by
$$\alpha = \frac{p_{n-1}\alpha'_n + p_{n-2}}{q_{n-1}\alpha'_n + q_{n-2}}.$$

The following properties belong to equivalence: (i) Every real number is equivalent to itself; (ii) if $\xi$ is equivalent to $\eta$, then $\eta$ is equivalent to $\xi$; (iii) if $\xi$ is equivalent to $\eta$, and $\eta$ is equivalent to $\zeta$, then $\xi$ is equivalent to $\zeta$. To see (iii) we set $\xi = (a\eta + b)/(c\eta + d)$ and $\eta = (a_1\zeta + b_1)/(c_1\zeta + d_1)$. Then
$$\xi = \frac{(aa_1 + bc_1)\zeta + (ab_1 + bd_1)}{(ca_1 + dc_1)\zeta + (cb_1 + dd_1)},$$
where
$$(aa_1 + bc_1)(cb_1 + dd_1) - (ab_1 + bd_1)(ca_1 + dc_1) = (ad - bc)(a_1 d_1 - b_1 c_1) = \pm 1.$$

Definition 5.2. We call this last transformation above the product of the two previous transformations.

It is easy to see that every rational number is equivalent to 0, so that we have

Theorem 5.1. Any two rational numbers are equivalent. □
e
e
e
Proof From = [ao, al>"" akl> '1], we have = (PkI'1 + Pk2)/(qkl'1 + qk2) and we see that the condition c > d > 0 is necessary. The sufficiency of the condition can be proved by induction on d. D
e
Theorem 5.3. A necessary and sufficient condition for two irrational numbers and '1 to be equivalent is that = [ao, al>' .. , am, co, Cl>' .. ] and '1 = [b o, bl> ... , bn , co, Cl,' .. J. In other words their continued fractions expansions are eventually identical.
e
Proof I) Let W = [co, Cl>' .. J. Then
$$\xi = [a_0, a_1, \ldots, a_m, \omega] = \frac{\omega p_m + p_{m-1}}{\omega q_m + q_{m-1}}.$$
Thus $\omega$ and $\xi$ are equivalent. Similarly $\omega$ and $\eta$ are equivalent, and hence $\xi$ and $\eta$ are equivalent.

2) Let $\xi$ and $\eta$ be equivalent, and $\eta = (a\xi + b)/(c\xi + d)$, $ad - bc = \pm 1$. We may assume that $c\xi + d > 0$. We expand $\xi$ into continued fractions:
$$\xi = [a_0, \ldots, a_k, a_{k+1}, \ldots] = [a_0, \ldots, a_{k-1}, \alpha'_k] = (\alpha'_k p_{k-1} + p_{k-2})(\alpha'_k q_{k-1} + q_{k-2})^{-1}.$$
It follows that
$$\eta = (P\alpha'_k + R)(Q\alpha'_k + S)^{-1},$$
where $P = a p_{k-1} + b q_{k-1}$, $R = a p_{k-2} + b q_{k-2}$, $Q = c p_{k-1} + d q_{k-1}$, $S = c p_{k-2} + d q_{k-2}$; $P, Q, R, S$ are integers satisfying $PS - QR = \pm 1$. From Theorem 2.4 we have
$$p_{k-1} = \xi q_{k-1} + \frac{\delta}{q_{k-1}}, \qquad p_{k-2} = \xi q_{k-2} + \frac{\delta'}{q_{k-2}}, \qquad |\delta| < 1,\ |\delta'| < 1,$$
so that
$$Q = (c\xi + d) q_{k-1} + \frac{c\delta}{q_{k-1}}, \qquad S = (c\xi + d) q_{k-2} + \frac{c\delta'}{q_{k-2}}.$$
From $c\xi + d > 0$, $q_{k-2} \ge k - 2$ and by Theorem 1.3, $q_{k-1} \ge q_{k-2} + 1$ we see that $Q > S > 0$ when $k$ is sufficiently large. It follows from Theorem 5.2 that $\eta = [b_0, \ldots, b_n, \alpha'_k]$ and the necessity of the condition is established. □
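Denote by $M(\alpha)$ the greatest number such that, for any $\varepsilon > 0$, the inequality
$$\left| \alpha - \frac{p}{q} \right| \le \frac{1}{(M(\alpha) - \varepsilon) q^2}$$
has infinitely many solutions. For example $M((\sqrt{5} - 1)/2) = \sqrt{5}$. Let
$$\alpha - \frac{p_i}{q_i} = \frac{1}{\lambda_i q_i^2};$$
then
$$\lambda_i = (-1)^i \left( \alpha'_{i+1} + \frac{q_{i-1}}{q_i} \right),$$
and
$$\frac{q_i}{q_{i-1}} = a_i + \frac{q_{i-2}}{q_{i-1}} = a_i + \cfrac{1}{a_{i-1} + \cfrac{q_{i-3}}{q_{i-2}}} = \cdots.$$
Therefore
$$M(\alpha) = \varlimsup_{i \to \infty} |\lambda_i| = \varlimsup_{i \to \infty} \left( \alpha'_{i+1} + \frac{q_{i-1}}{q_i} \right).$$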
If $\alpha$ and $\beta$ are equivalent, then $a_i = b_i$ for all large $i$. We have therefore proved the following

Theorem 5.4. If $\alpha$ and $\beta$ are equivalent, then $M(\alpha) = M(\beta)$. □

We see therefore that if $A > \sqrt{5}$ and if $\alpha$ is equivalent to $(\sqrt{5} - 1)/2$, then the inequality $|\alpha - p/q| < 1/(Aq^2)$ has only finitely many solutions. We may now ask for the value of $M(\alpha)$ when $\alpha$ is not equivalent to $(\sqrt{5} - 1)/2$. We have the following result: If $\alpha$ is not equivalent to $(\sqrt{5} - 1)/2$, then $M(\alpha) \ge \sqrt{8}$. Specifically, for such $\alpha$, the inequality $|\alpha - p/q| < 1/(\sqrt{8}\, q^2)$ has infinitely many solutions. Also, if $\alpha$ is equivalent to $1 + \sqrt{2}$, then $M(\alpha) = \sqrt{8}$. For the general situation, we need the following:

Definition 5.3. By a Markoff number we mean a positive integer $u$ such that there are integers $v, w$ satisfying $u^2 + v^2 + w^2 = 3uvw$.

The first eleven Markoff numbers are 1, 2, 5, 13, 29, 34, 89, 169, 194, 233, 433. (We shall prove in the next chapter that the number of Markoff numbers is infinite.) It can be proved that if
$$\alpha_u = \frac{1}{2u}\left( \sqrt{9u^2 - 4} + u + \frac{2v}{w} \right),$$
where $u, v, w$ are related by the definition of the Markoff number $u$, then $M(\alpha_u) = \sqrt{9u^2 - 4}/u$. Furthermore if $\alpha$ is not equivalent to $\alpha_u$ for $1 \le u \le v$, then the inequality
$$\left| \alpha - \frac{p}{q} \right| < \frac{1}{M(\alpha_v)\, q^2}$$
has infinitely many solutions. It follows from this that if $\alpha$ is not a root of a quadratic equation with rational coefficients, then $M(\alpha) \ge \sqrt{9u^2 - 4}/u$ for all $u$, and hence $M(\alpha) \ge 3$. Finally, if $0 < m_1 < m_2 < \cdots$ and
$$\alpha = [2, 2, \underbrace{1, 1, \ldots, 1}_{m_1}, 2, 2, \underbrace{1, \ldots, 1}_{m_2}, 2, 2, \underbrace{1, 1, \ldots, 1}_{m_3}, \ldots],$$
then $M(\alpha) = 3$. The proofs of these facts are outside the scope of this book.
10.6 Periodic Continued Fractions

Definition. A continued fraction is said to be periodic if there exist $k$ and $L$ such that $a_l = a_{l+k}$ for $l \ge L$. We call $k$ the period and we write
$$[a_0, a_1, \ldots, a_{L-1}, \dot{a}_L, \ldots, \dot{a}_{L+k-1}].$$
For example, we have $\sqrt{2} = [1, \dot{2}]$, $\sqrt{3} = [1, \dot{1}, \dot{2}]$, $\sqrt{5} = [2, \dot{4}]$ and $\sqrt{7} = [2, \dot{1}, 1, 1, \dot{4}]$. In fact we have the following well known result.

Theorem 6.1. A necessary and sufficient condition for a number to have a periodic continued fraction expansion is that it is a root of an irreducible quadratic with rational coefficients. □
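10.7 Legendre's Criterion

We saw that if $p/q$ is a convergent of $\alpha$, then $|\alpha - p/q| < 1/q^2$. The converse of this is not true. We shall now determine a necessary and sufficient condition for $p/q$ to be a convergent of $\alpha$. Let
$$\alpha - \frac{p}{q} = \frac{\varepsilon\vartheta}{q^2}, \qquad \varepsilon = \pm 1,$$
and let
$$\frac{p}{q} = [a_0, \ldots, a_{n-1}] = \frac{p_{n-1}}{q_{n-1}}.$$
We can choose $n$ so that $(-1)^{n-1} = \varepsilon$, and we can now write
$$\alpha - \frac{p_{n-1}}{q_{n-1}} = \frac{\varepsilon\vartheta}{q_{n-1}^2}.$$
We define $\beta$ by
$$\alpha = \frac{p_{n-1}\beta + p_{n-2}}{q_{n-1}\beta + q_{n-2}},$$
so that
$$\frac{\varepsilon\vartheta}{q_{n-1}^2} = \frac{p_{n-1}\beta + p_{n-2}}{q_{n-1}\beta + q_{n-2}} - \frac{p_{n-1}}{q_{n-1}} = \frac{(-1)^{n-1}}{q_{n-1}(q_{n-1}\beta + q_{n-2})}.$$
Therefore
$$\vartheta = \frac{q_{n-1}}{q_{n-1}\beta + q_{n-2}}.$$
Solving this for $\beta$ we have $\beta = (q_{n-1} - \vartheta q_{n-2})/(\vartheta q_{n-1})$, and since $0 < \vartheta < 1$ we see that $\beta > 0$. Now $\alpha = [a_0, \ldots, a_{n-1}, \beta]$. If $\beta \ge 1$, then $\beta = \alpha'_n$ ($= [a_n, a_{n+1}, \ldots]$), and this means that $p/q$ is a convergent of $\alpha$. If $0 < \beta < 1$, then $[a_{n-1} + 1/\beta] = a_{n-1} + c$, $c > 0$, so that $\alpha = [a_0, \ldots, a_{n-2}, a_{n-1} + c, \ldots]$ and we see that $[a_0, \ldots, a_{n-1}]$ is not a convergent. Therefore the required necessary and sufficient condition is that $\beta \ge 1$; in other words we have:

Theorem 7.1 (Legendre). Let $\varepsilon\vartheta = q^2\alpha - pq$, $\varepsilon = \pm 1$, $0 < \vartheta < 1$, and let $p/q = [a_0, \ldots, a_{n-1}]$, $(-1)^{n-1} = \varepsilon$. Then, a necessary and sufficient condition for $p/q$ to be a convergent of $\alpha$ is that
$$\vartheta \le \frac{q_{n-1}}{q_{n-1} + q_{n-2}}. \qquad □$$

Since the right hand side of the above inequality exceeds $\tfrac12$ we deduce at once the following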
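Theorem 7.2. If a rational number $p/q$ satisfies $|\alpha - p/q| < 1/(2q^2)$, then it is a convergent of $\alpha$. □

Theorem 7.3. Let $p, q$ be positive integers satisfying $|p^2 - \alpha^2 q^2| < \alpha$. Then $p/q$ is a convergent of $\alpha$.

Proof. Let $\alpha^2 q^2 - p^2 = \varepsilon\delta\alpha$, $\varepsilon = \pm 1$, $0 \le \delta < 1$. Then $\alpha q - p = \varepsilon\delta\alpha/(\alpha q + p)$, so that
$$\vartheta = \varepsilon q(\alpha q - p) = \frac{\delta\alpha q}{\alpha q + p} = \frac{\delta\alpha q_{n-1}}{\alpha q_{n-1} + p_{n-1}}, \qquad (-1)^{n-1} = \varepsilon.$$
From Theorem 7.1 we see that it suffices to prove that
$$\frac{\delta\alpha q_{n-1}}{\alpha q_{n-1} + p_{n-1}} < \frac{q_{n-1}}{q_{n-1} + q_{n-2}},$$
or that $\delta\alpha(q_{n-1} + q_{n-2}) < \alpha q_{n-1} + p_{n-1}$. Now this inequality clearly holds when $n = 2$ so that it suffices to establish $\alpha q_{n-1} - p_{n-1} < \alpha(q_{n-1} - q_{n-2})$ for $n > 2$. But $\alpha q_{n-1} - p_{n-1} = \varepsilon\delta\alpha/(\alpha q_{n-1} + p_{n-1})$, and by Theorem 1.3 we have
$$q_{n-1} - q_{n-2} \ge 1 > \frac{1}{\alpha q_{n-1} + p_{n-1}}.$$
The theorem is proved. □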
10.8 Quadratic Indeterminate Equations

In this and the next sections $d$ denotes a positive integer which is not a perfect square. We consider the equation
$$x^2 - dy^2 = l, \qquad 0 < |l| < \sqrt{d},$$
in the integer unknowns $x$ and $y$.

Theorem 8.1. In the continued fraction expansion of $\sqrt{d}$, the numbers $\alpha'_n$ must take the form
$$\alpha'_n = \frac{\sqrt{d} + P_n}{Q_n},$$
where $P_n$ and $Q_n$ are integers.

Proof. We use induction on $n$. First, $\sqrt{d} - [\sqrt{d}] = 1/\alpha'_1$ so that the required result holds by setting $P_1 = [\sqrt{d}]$ and $Q_1 = d - [\sqrt{d}]^2$. We now assume as induction hypothesis that $\alpha'_n = (\sqrt{d} + P_n)/Q_n$. Since $\alpha'_n = a_n + 1/\alpha'_{n+1}$ we have to find two integers $P_{n+1}$, $Q_{n+1}$ such that
$$\frac{\sqrt{d} + P_n}{Q_n} = a_n + \frac{Q_{n+1}}{\sqrt{d} + P_{n+1}},$$
and
$$d - P_{n+1}^2 \equiv 0 \pmod{Q_{n+1}}. \tag{1}$$
This means that we have to find $P_{n+1}$, $Q_{n+1}$ so that
$$d + P_n P_{n+1} = a_n Q_n P_{n+1} + Q_n Q_{n+1}, \tag{2}$$
$$P_n + P_{n+1} = a_n Q_n \tag{3}$$
and (1) hold. On subtracting $P_{n+1}$ times equation (3) from (2) we have
$$d - P_{n+1}^2 = Q_n Q_{n+1}. \tag{4}$$
If (4) holds, then (1) also holds; and (2) also follows from (3) and (4). It remains therefore to find $P_{n+1}$, $Q_{n+1}$ so that (3) and (4) are satisfied. We solve (3) for $P_{n+1}$. From $P_n^2 \equiv P_{n+1}^2 \pmod{Q_n}$, together with $d - P_n^2 \equiv 0 \pmod{Q_n}$ (which is (1) at the previous step, and holds for $n = 1$ by the choice of $Q_1$), we see that $d - P_{n+1}^2 \equiv 0 \pmod{Q_n}$ so that there exists $Q_{n+1}$ satisfying (4). The theorem is proved. □

Theorem 8.2. The equation $x^2 - dy^2 = (-1)^n Q_n$ is always soluble. If $l \ne (-1)^n Q_n$ and $|l| < \sqrt{d}$, then the equation $x^2 - dy^2 = l$ has no solution.
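Proof. We have
$$\sqrt{d} = \frac{p_{n-1}\alpha'_n + p_{n-2}}{q_{n-1}\alpha'_n + q_{n-2}} = \frac{p_{n-1}(\sqrt{d} + P_n) + p_{n-2} Q_n}{q_{n-1}(\sqrt{d} + P_n) + q_{n-2} Q_n},$$
and since $\sqrt{d}$ is irrational, we have, on clearing the denominators,
$$q_{n-1} P_n + q_{n-2} Q_n = p_{n-1}, \qquad d q_{n-1} = p_{n-1} P_n + p_{n-2} Q_n.$$
On subtracting $q_{n-1}$ times the second equation from $p_{n-1}$ times the first equation we have
$$p_{n-1}^2 - d q_{n-1}^2 = (p_{n-1} q_{n-2} - p_{n-2} q_{n-1}) Q_n = (-1)^n Q_n.$$
The last part of the theorem follows from Theorem 7.3. □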
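The recurrences (3) and (4) in the proof of Theorem 8.1 give an exact integer procedure for expanding $\sqrt{d}$, and Theorem 8.2 can be checked along the way. The sketch below is illustrative; the function name `sqrt_expansion` and the row layout are assumptions of this example. It uses $a_n = [(\sqrt{d} + P_n)/Q_n] = [([\sqrt{d}] + P_n)/Q_n]$, which holds because $P_n$ and $Q_n > 0$ are integers.

```python
from math import isqrt

def sqrt_expansion(d, steps):
    """Rows (n, a_n, P_n, Q_n, p_{n-1}, q_{n-1}) for sqrt(d), n = 1..steps."""
    a0 = isqrt(d)
    if a0 * a0 == d:
        raise ValueError("d must not be a perfect square")
    P, Q = a0, d - a0 * a0           # P_1 = [sqrt d], Q_1 = d - [sqrt d]^2
    p_prev, q_prev = 1, 0            # formal values p_{-1}, q_{-1}
    p, q = a0, 1                     # p_0, q_0
    rows = []
    for n in range(1, steps + 1):
        a = (a0 + P) // Q            # a_n = [alpha'_n]
        rows.append((n, a, P, Q, p, q))
        p, p_prev = a * p + p_prev, p
        q, q_prev = a * q + q_prev, q
        P = a * Q - P                # equation (3): P_{n+1} = a_n Q_n - P_n
        Q = (d - P * P) // Q         # equation (4): Q_{n+1} = (d - P_{n+1}^2) / Q_n
    return rows

def check_theorem_8_2(d, steps):
    """Verify p_{n-1}^2 - d q_{n-1}^2 = (-1)^n Q_n for n = 1..steps."""
    return all(p * p - d * q * q == (-1) ** n * Q
               for n, a, P, Q, p, q in sqrt_expansion(d, steps))
```

For $d = 7$ the partial quotients produced are $1, 1, 1, 4, 1, 1, 1, 4, \ldots$, matching the periodic expansion of $\sqrt{7}$ quoted in Section 10.6.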
Theorem 8.3. Let $k$ be the period of the continued fraction expansion for $\sqrt{d}$. Let $n > L$ and $p_{n-1}^2 - d q_{n-1}^2 = (-1)^n Q_n$. Then $p_{n-1+lk}^2 - d q_{n-1+lk}^2 = (-1)^{n+lk} Q_n$.

Proof. This follows at once from
$$\frac{\sqrt{d} + P_n}{Q_n} = \frac{\sqrt{d} + P_{n+lk}}{Q_{n+lk}}. \qquad □$$

10.9 Pell's Equation

We shall now consider Pell's equation
$$x^2 - dy^2 = 1. \tag{1}$$
From Theorem 8.3, there exists $Q$ such that the equation $x^2 - dy^2 = Q$ has infinitely many solutions. If we partition these solutions into $Q^2$ classes mod $|Q|$, then there must be one class with at least two solutions. That is, there are integers $x_1, y_1$ and $x_2, y_2$ such that
$$x_1^2 - d y_1^2 = x_2^2 - d y_2^2 = Q, \qquad x_1 > 0,\ y_1 > 0, \qquad x_2 > 0,\ y_2 > 0,$$
and
$$x_1 \equiv x_2 \pmod{|Q|}, \qquad y_1 \equiv y_2 \pmod{|Q|}.$$
We now show that
$$x = \frac{x_1 x_2 - d y_1 y_2}{Q}, \qquad y = \frac{x_1 y_2 - x_2 y_1}{Q}$$
are solutions to Pell's equation (1). First, we have
$$x_1 x_2 - d y_1 y_2 \equiv x_1^2 - d y_1^2 = Q \equiv 0 \pmod{|Q|},$$
$$x_1 y_2 - x_2 y_1 \equiv x_1 y_1 - x_1 y_1 \equiv 0 \pmod{|Q|},$$
so that $x, y$ are integers. Secondly we have
$$Q^2(x^2 - dy^2) = (x_1 x_2 - d y_1 y_2)^2 - d(x_1 y_2 - x_2 y_1)^2 = (x_1^2 - d y_1^2)(x_2^2 - d y_2^2) = Q^2,$$
so that $x, y$ are solutions to (1). Finally, they are not the trivial solutions $x = \pm 1$, $y = 0$, because $y = 0$ implies $x_1 y_2 - x_2 y_1 = 0$, and from $(x_1, y_1) = (x_2, y_2) = 1$ we deduce that $x_1 = x_2$, $y_1 = y_2$ contrary to our assumption. We have therefore proved

Theorem 9.1. Pell's equation $x^2 - dy^2 = 1$ has a nontrivial solution. □

From Theorem 7.3 we see that $x/y = p_{n-1}/q_{n-1}$ must be a convergent of $\sqrt{d}$, and from Theorem 8.2 we know that there exists $n$ such that $(-1)^n Q_n = 1$.

Theorem 9.2. Let $n$ be the least positive integer satisfying $(-1)^n Q_n = 1$. Then all the solutions to the equation $x^2 - dy^2 = 1$ are given by
$$x + \sqrt{d}\, y = \pm (p_{n-1} + \sqrt{d}\, q_{n-1})^m, \qquad m = 0, \pm 1, \pm 2, \ldots.$$
Proof. Let $\varepsilon = p_{n-1} + \sqrt{d}\, q_{n-1} > 1$. Because $\pm 1/(x + \sqrt{d}\, y) = \pm(x - \sqrt{d}\, y)$, it suffices to show that all positive solutions to $x^2 - dy^2 = 1$ are given by $x + y\sqrt{d} = \varepsilon^m$ ($m > 0$). Let $(x, y)$ be such a solution, so that $x + y\sqrt{d} > 1$. We may choose $m$ so that $\varepsilon^m \le x + y\sqrt{d} < \varepsilon^{m+1}$ or $1 \le \varepsilon^{-m}(x + y\sqrt{d}) < \varepsilon$. Let
$$\varepsilon^m = x_0 + y_0\sqrt{d}, \qquad X + Y\sqrt{d} = \varepsilon^{-m}(x + y\sqrt{d}) = (x_0 - y_0\sqrt{d})(x + y\sqrt{d}),$$
and we shall prove that $X + Y\sqrt{d} = 1$. Since $\sqrt{d}$ is irrational, it follows that
$$(x_0 + y_0\sqrt{d})(x - y\sqrt{d}) = X - Y\sqrt{d}.$$
On multiplying the equations together we have $X^2 - dY^2 = 1$. Suppose now that $1 < X + \sqrt{d}\, Y < \varepsilon$. Then
$$0 < \varepsilon^{-1} < (X + \sqrt{d}\, Y)^{-1} = X - \sqrt{d}\, Y < 1.$$
We deduce easily that
$$2X = (X + \sqrt{d}\, Y) + (X - \sqrt{d}\, Y) > 1 + \varepsilon^{-1} > 0,$$
$$2\sqrt{d}\, Y = (X + \sqrt{d}\, Y) - (X - \sqrt{d}\, Y) > 1 - 1 = 0.$$
It follows from these that
$$X > 0, \qquad Y > 0, \qquad X + \sqrt{d}\, Y < p_{n-1} + \sqrt{d}\, q_{n-1}.$$
Now $x = \sqrt{1 + dy^2}$ increases with $y$, so that $x + \sqrt{d}\, y$ increases as $y$ increases. We deduce from the above that $Y < q_{n-1}$ and $X < p_{n-1}$, so that $X/Y$ is a convergent with denominator less than $q_{n-1}$. This is impossible; therefore $X + \sqrt{d}\, Y = 1$. □

We see from the above that the equation $x^2 - dy^2 = 1$ is always soluble, but the equation $x^2 - dy^2 = -1$ may have no solution. For example, since $x^2 \equiv 0, 1 \pmod 4$ so that $x^2 - 3y^2 \equiv x^2 + y^2 \equiv 0, 1, 2 \pmod 4$, we see that the equation $x^2 - 3y^2 = -1$ is insoluble. In fact this example shows that $x^2 - dy^2 = -1$ is insoluble whenever $d \equiv 3 \pmod 4$. However if $x_0, y_0$ satisfy $x_0^2 - d y_0^2 = -1$, then, by defining $x_1, y_1$ with $x_1 + \sqrt{d}\, y_1 = (x_0 + \sqrt{d}\, y_0)^2$ we see that $x_1^2 - d y_1^2 = 1$. It is not difficult to prove that if $x^2 - dy^2 = -1$ is soluble, then all the solutions to $x^2 - dy^2 = \pm 1$ are given by $\pm(p_{n-1} + \sqrt{d}\, q_{n-1})^m$ where $n$ is the least positive integer satisfying $(-1)^n Q_n = -1$.
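Theorem 9.2 suggests a direct way to compute solutions of Pell's equation. The following sketch (function names are assumptions of this example) walks through the convergents of $\sqrt{d}$ until $p^2 - dq^2 = 1$ first occurs, then generates further positive solutions by multiplying out powers of $p + q\sqrt{d}$. It also confirms the congruence argument that $x^2 - 3y^2 = -1$ is impossible mod 4.

```python
from math import isqrt

def pell_fundamental(d):
    """Least positive solution (x, y) of x^2 - d*y^2 = 1, d not a perfect square."""
    a0 = isqrt(d)
    if a0 * a0 == d:
        raise ValueError("d must not be a perfect square")
    P, Q = a0, d - a0 * a0           # P_1, Q_1 as in Theorem 8.1
    p_prev, q_prev = 1, 0            # formal values p_{-1}, q_{-1}
    p, q = a0, 1                     # p_0, q_0
    while p * p - d * q * q != 1:
        a = (a0 + P) // Q            # next partial quotient
        p, p_prev = a * p + p_prev, p
        q, q_prev = a * q + q_prev, q
        P = a * Q - P
        Q = (d - P * P) // Q
    return p, q

def pell_solutions(d, count):
    """First `count` positive solutions, from powers of x1 + y1*sqrt(d)."""
    x1, y1 = pell_fundamental(d)
    x, y = x1, y1
    out = []
    for _ in range(count):
        out.append((x, y))
        # (x + y sqrt d)(x1 + y1 sqrt d) = (x x1 + d y y1) + (x y1 + y x1) sqrt d
        x, y = x * x1 + d * y * y1, x * y1 + y * x1
    return out

# Residues of x^2 - 3y^2 mod 4; the residue 3 (that is, -1) never appears.
residues_mod_4 = {(x * x - 3 * y * y) % 4 for x in range(4) for y in range(4)}
```

The loop in `pell_fundamental` terminates because, by Theorems 8.2 and 9.1, some convergent of $\sqrt{d}$ gives the value 1.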
10.10 Chebyshev's Theorem and Khintchin's Theorem

Let $\vartheta$ be an irrational number. According to Theorem 2.4 there are infinitely many integers $x, y$ satisfying
$$|x\vartheta - y| < \frac{1}{x}, \qquad y = [x\vartheta], \qquad (x, y) = 1. \tag{1}$$
It follows at once from this that, if $\varepsilon > 0$, then there exists an integer $x$ such that $x\vartheta$ differs from $[x\vartheta]$ by less than $\varepsilon$. In other words the number 0 is a limit point of the point set
$$x\vartheta - [x\vartheta], \qquad x = 1, 2, 3, \ldots. \tag{2}$$
An immediate problem arising from this is the determination of the set of limit points of the point set (2). For this Chebyshev has proved that each point in the interval $(0, 1)$ is a limit point of the point set (2). In fact he proved the following stronger result.

Theorem 10.1. Let $\vartheta$ be any irrational number and $\beta$ be any real number. Then there are infinitely many integers $x, y$ satisfying
$$|\vartheta x - y - \beta| < \frac{3}{x}. \tag{3}$$
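Proof. By Theorem 2.4 there are infinitely many integers $p, q > 0$ such that
$$\vartheta = \frac{p}{q} + \frac{\delta}{q^2}, \qquad |\delta| < 1, \qquad (p, q) = 1. \tag{4}$$
For fixed $q$ and $\beta$ we may choose an integer $t$ such that $|q\beta - t| \le \tfrac12$, so that
$$\beta = \frac{t}{q} + \frac{\delta'}{2q}, \qquad |\delta'| \le 1. \tag{5}$$
Since $p, q$ are coprime, there exist integers $x, y$ such that
$$\frac{q}{2} \le x < \frac{3q}{2}, \qquad px - qy = t. \tag{6}$$
From (4) and (5) we have
$$|\vartheta x - y - \beta| = \left| \frac{xp}{q} + \frac{x\delta}{q^2} - y - \frac{t}{q} - \frac{\delta'}{2q} \right| = \left| \frac{x\delta}{q^2} - \frac{\delta'}{2q} \right| < \frac{x}{q^2} + \frac{1}{2q}.$$
Since $q > 2x/3$, we have $|\vartheta x - y - \beta| < 9/(4x) + 3/(4x) = 3/x$. Since $q$ can be arbitrarily large, and $x \ge q/2$ by (6), the theorem is proved. □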
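The proof is constructive, and its steps can be followed in code. The sketch below takes $\vartheta = \sqrt{2}$ and an arbitrary sample value $\beta = 0.3$ (both choices are assumptions of this example), and for each convergent $p/q$ it picks $t$, solves $px - qy = t$ with $q/2 \le x < 3q/2$ as in (6), and checks inequality (3). It relies on `pow(p, -1, q)` for the modular inverse, which needs Python 3.8 or later.

```python
from decimal import Decimal, getcontext

getcontext().prec = 60               # assumed working precision for this sketch

def chebyshev_pair(p, q, beta):
    """Integers (x, y) with q/2 <= x < 3q/2 and p*x - q*y = t, t nearest to q*beta."""
    t = int((q * beta).to_integral_value())   # |q*beta - t| <= 1/2
    x = (t * pow(p, -1, q)) % q               # p*x = t (mod q), 0 <= x < q
    if 2 * x < q:
        x += q                                # move x into [q/2, 3q/2)
    y = (p * x - t) // q                      # exact, since p*x - t is divisible by q
    return x, y

def sqrt2_convergents(count):
    """Convergents (p, q) of sqrt(2) with q >= 2: (3, 2), (7, 5), (17, 12), ..."""
    p, q = 1, 1
    out = []
    for _ in range(count):
        p, q = p + 2 * q, p + q               # next convergent of [1, 2, 2, 2, ...]
        out.append((p, q))
    return out

theta = Decimal(2).sqrt()
beta = Decimal("0.3")
pairs = [chebyshev_pair(p, q, beta) for p, q in sqrt2_convergents(15)]
errors = [(x, abs(theta * x - y - beta)) for x, y in pairs]
```

Each entry of `errors` pairs a value of $x$ with $|\vartheta x - y - \beta|$, which the theorem bounds by $3/x$.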
According to this theorem there exists a constant $c$ such that given any irrational $\vartheta$ and any real $\beta$ the inequality
$$|\vartheta x - y - \beta| < \frac{c}{x} \tag{7}$$
has infinitely many integer solutions in $x > 0$ and $y$. In (3) we have $c = 3$, and we see from Theorem 4.4 that $c$ must be at least $1/\sqrt{5}$. Khintchin has proved the following

Theorem 10.2. Let $\vartheta$ be irrational, $\beta$ be real and $\varepsilon > 0$. Then the inequality
$$|x\vartheta - y - \beta| < \frac{1 + \varepsilon}{\sqrt{5}\, x} \tag{8}$$
has infinitely many solutions in integers $x > 0$, and $y$.

Proof. By Theorem 4.3 there are infinitely many coprime pairs of integers $p, q$ such that $\vartheta = p/q + \delta/q^2$, where $0 < |\delta| < 1/\sqrt{5}$. We may assume that $\delta > 0$ since otherwise we can replace $\vartheta$, $\beta$ by $-\vartheta$, $-\beta$. Let $\xi_1$, $\xi_2$ be real numbers satisfying $\xi_2 - \xi_1 \ge 1$, and we shall specify them later. We can choose $x, y$ such that
$$px - qy = [q\beta], \qquad \xi_1 q \le x < \xi_2 q. \tag{9}$$
Then we have
Ix8  y  131
=
I~x + bx _ q
q2
y _ [qf3] _ ~ q q
I (10)
where
r = qf3 
[qf3]. We want to show that
_~ ~ ~ (x; _r) < ~, or r2
1
J5
x 2b ~ q2
xr
r2
r2
1
J5.
~+<+
4b
q
4b
4b
1) Let us first assume that r2 ~ 4b/J5 so that the left hand side in the above is positive and we have to show that
or
Let
~1= ~(2fl+J:: ~);
~2 = ~ (2fl + J:: ~). We now examine how we can make
~2 
~1 ~
1. On simplifying
~ (J:: + ~  J:: ~) > I (the left hand side is merely ~2  ~1) we obtain .. 2 < ~ + <5 2 • Since the numbers involved in the simplification are positive we see that the result is established if 4<5/)5 ::;:; .. 2 < ~ + <5 2 ; that is if
2J7s::;:; .. <Ji+<5
2
•
We are left with the two cases .. 2 < 4<5/)5 and J~ + <5 2 2) Suppose that .. 2 < 4<5/)5. From .. > 0 we have
::;:; ..
< 1 to consider.
<~ ~C~+J:> j} ~Hs>L Let 1'/ > 0, and take ~1 = 1'/, ~2 = 1'/ + ~ so that ~2 number x in (9) exists, and by our assumption
~ (x; ..)
=

~1
=
~
> 1. Therefore the
(xf  2flY :; >  ~.
On the other hand we take y = ax + b so that, as x varies in an interval, y2 attains the maximum value at the end points of the interval, and therefore
3) Suppose that ~ ~ 't
or I 
't
~
't
< 1. From c5 < I/fi we have
Ji J fiY + c5 2 >
(I 
+ 2c5 (I 
fi)
+ c5 2 = I 
fi
+ c5
< I/fi  c5. Let '1 > 0. We may specify x and y such that px  qy = [qP]
+ I,
'1q
~
x < (l
+ '1)q,
and similarly to (10) we have Ix.9 _ y _
PI = Ixc5 + I q2
q
't I = ~q (xc5q + (l  't»)
I{ I} I I (l + '1)2 <  (I + '1)c5 +   c5 ~ (I + '1) < =cq fi q fi xfi Since '1 is arbitrary, the theorem is proved.
D
Exercise. Let $\vartheta$ be an irrational number such that, given any $\varepsilon > 0$, there always exist integers $x, y$ satisfying $|x\vartheta - y| < \varepsilon/x$. Prove that if $\delta > 0$ and $\beta$ is real, then there exist integers $x, y$ such that $|x\vartheta - y - \beta| < (1+\delta)/3x$.

10.11 Uniform Distributions and the Uniform Distribution of $n\vartheta$ (mod 1)

Chebyshev's theorem in the last section states that the point set $\{x\vartheta\} = x\vartheta - [x\vartheta]$, $x = 1, 2, 3, \ldots$ is dense in the interval $(0,1)$, in the sense that each point in $(0,1)$ is a limit point of the set. We may ask about the distribution of this point set in the interval $(0,1)$. In other words, if $(a,b)$ is a subinterval of $(0,1)$, then as $x$ takes the values $1, 2, \ldots, n$, does the interval $(a,b)$ receive the "correct proportion" of points? Let us define precisely what we mean by the "correct proportion".

Definition. Let $P_i$ $(i = 1, 2, 3, \ldots)$ be a point set in the interval $(0,1)$. Let $0 \le a < b \le 1$, and for each positive integer $n$ denote by $N_n(a,b)$ the number of points $P_1, P_2, \ldots, P_n$ that lie in the interval $(a,b)$. If $\lim_{n\to\infty} N_n(a,b)/n = b - a$ always holds, then we say that the point set $P_i$ $(i = 1, 2, 3, \ldots)$ is uniformly distributed in $(0,1)$.

We shall now prove the following

Theorem 11.1. Let $\vartheta$ be irrational. Then the point set $\{x\vartheta\} = x\vartheta - [x\vartheta]$, $x = 1, 2, 3, \ldots$, is uniformly distributed in $(0,1)$.
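The definition can be illustrated numerically. The following is a minimal sketch, not part of the original text: the function name `fraction_in_interval`, the choice $\vartheta = \sqrt2$ and the interval $(0.25, 0.60)$ are assumptions made only for illustration, and floating point arithmetic stands in for exact fractional parts.

```python
import math

def fraction_in_interval(theta, a, b, n):
    """Return N_n(a, b) / n for the points {x * theta}, x = 1, ..., n."""
    count = 0
    for x in range(1, n + 1):
        frac = (x * theta) % 1.0      # fractional part {x * theta}
        if a < frac < b:
            count += 1
    return count / n

if __name__ == "__main__":
    theta = math.sqrt(2)              # an irrational number (illustrative choice)
    for n in (100, 10_000, 100_000):
        # The printed ratios should approach b - a = 0.35 as n grows.
        print(n, fraction_in_interval(theta, 0.25, 0.60, n))
```

For a rational $\vartheta$ the same function returns ratios that settle on a value depending on the denominator rather than on $b - a$.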
Proof. Let $(a,b)$ be any subinterval of $(0,1)$. By Theorem 4.1 there are infinitely many pairs of integers $p$, $q > 0$ such that

$$\vartheta = \frac pq + \frac{\delta}{q^2}, \qquad |\delta| < 1, \qquad (p,q) = 1.$$

Let $u, v$ be integers satisfying

$$\frac{u-1}{q} < a \le \frac{u}{q} < \frac{v}{q} \le b < \frac{v+1}{q};$$

let $n = rq + s$, $0 \le s < q$, and let $0 \le j < r$. Consider now a complete residue system (mod $q$), namely $jq, jq+1, \ldots, jq+q-1$. It is easy to see that

$$\{(jq+k)\vartheta\} = \left\{\frac{kp}{q} + \frac{j\delta}{q} + \frac{k\delta}{q^2}\right\} = \left\{\frac{kp + [j\delta]}{q} + \frac{\delta'}{q}\right\}, \qquad |\delta'| < 2.$$

Since $[j\delta]$ does not depend on $k$, as $k$ runs over $0, 1, \ldots, q-1$ the numbers $pk + [j\delta]$ run over a complete residue system (mod $q$). Therefore, among the $q$ numbers $\{(jq+k)\vartheta\}$, those that lie in the interval $(a,b)$ must number more than $v-u-4$ and less than $v-u+6$. It follows that as $x$ takes the values $1, 2, \ldots, n$ the number of numbers $\{x\vartheta\}$ that lie in the interval $(a,b)$ exceeds

$$r(v-u-4) = \frac nq (v-u-4) - \frac sq (v-u-4) \ge n(b-a) - \frac6q\,n - (v-u-4),$$

but is less than

$$(r+1)(v-u+6) \le n\left(\frac{v-u}{q} + \frac6q\right) + v-u+6 \le n(b-a) + \frac6q\,n + (v-u+6).$$

Let $\varepsilon > 0$. We choose $q > 12/\varepsilon$ and then choose $n > 2(q+6)/\varepsilon$. Since $v - u \le q$, it follows that we have

$$n(b-a) - n\varepsilon \le N_n(a,b) \le n(b-a) + n\varepsilon.$$

This proves that $\lim_{n\to\infty} N_n(a,b)/n = b - a$. □
10.12 Criteria for Uniform Distributions

Theorem 12.1 (Weyl). A necessary and sufficient condition for the sequence $(x_n)$, $0 \le x_n \le 1$, to be uniformly distributed in $(0,1)$ is that the equation

$$\lim_{n\to\infty} \frac{f(x_1) + \cdots + f(x_n)}{n} = \int_0^1 f(x)\,dx \tag{1}$$

holds for every Riemann integrable function $f(x)$ in $(0,1)$.
Proof. We first establish the necessity of the condition (1).

1) Let $f(x)$ be defined to be $c$ or $0$ according to whether $a \le x \le b$ or not. Then clearly

$$\lim_{n\to\infty} \frac{f(x_1) + \cdots + f(x_n)}{n} = c \lim_{n\to\infty} \frac{N_n(a,b)}{n} = c(b-a),$$

and

$$\int_0^1 f(x)\,dx = c(b-a).$$

Therefore the equation (1) holds for this function $f(x)$.

2) The equation (1) is linear in the sense that if it holds for $f_1, \ldots, f_k$, then it holds for $c_1f_1 + \cdots + c_kf_k$. From 1) we see that (1) holds for all step functions.

3) It is a simple exercise to show that if $f$ is Riemann integrable, and $\varepsilon > 0$, then there are two step functions $\varphi_\varepsilon(x)$, $\Phi_\varepsilon(x)$ such that $\varphi_\varepsilon(x) \le f(x) \le \Phi_\varepsilon(x)$ and

$$\int_0^1 \bigl(\Phi_\varepsilon(t) - \varphi_\varepsilon(t)\bigr)\,dt < \varepsilon.$$

From 2) we see that (1) holds for $\varphi_\varepsilon(x)$ and $\Phi_\varepsilon(x)$, so that

$$\int_0^1 \varphi_\varepsilon(t)\,dt = \lim_{n\to\infty} \frac{\varphi_\varepsilon(x_1) + \cdots + \varphi_\varepsilon(x_n)}{n} \le \liminf_{n\to\infty} \frac1n \bigl(f(x_1) + \cdots + f(x_n)\bigr)$$

$$\le \limsup_{n\to\infty} \frac1n \bigl(f(x_1) + \cdots + f(x_n)\bigr) \le \lim_{n\to\infty} \frac{\Phi_\varepsilon(x_1) + \cdots + \Phi_\varepsilon(x_n)}{n} = \int_0^1 \Phi_\varepsilon(t)\,dt.$$

Since

$$\int_0^1 \varphi_\varepsilon(t)\,dt \le \int_0^1 f(x)\,dx \le \int_0^1 \Phi_\varepsilon(x)\,dx,$$

it follows that the upper and the lower limits of $(f(x_1) + \cdots + f(x_n))/n$ both differ from $\int_0^1 f(x)\,dx$ by less than $\varepsilon$. Since $\varepsilon$ is arbitrary, the limit exists and

$$\lim_{n\to\infty} \frac{f(x_1) + \cdots + f(x_n)}{n} = \int_0^1 f(x)\,dx.$$

The necessity part of the theorem is therefore proved. The sufficiency part is easy: we let $f(x) = 1$ or $0$ according to whether $a \le x \le b$ or otherwise. Then equation (1) becomes $\lim_{n\to\infty} N_n(a,b)/n = b - a$. □

It is very difficult to make a direct application of this theorem. This is because we have to verify (1) for the whole family of Riemann integrable functions. Actually the proof of the theorem shows that we need only restrict ourselves to step functions, and in fact it suffices to have the basis for a linear space of functions which includes the Riemann integrable functions as limits. This is embodied in the next theorem.

Theorem 12.2 (Weyl). Under the hypothesis of Theorem 12.1, another necessary and sufficient condition is that the equation (1) holds for the functions $f(x) = e^{2\pi i m x}$ $(m = \pm1, \pm2, \ldots)$. In other words, a necessary and sufficient condition for the sequence $(x_n)$, $0 \le x_n \le 1$, to be uniformly distributed in $(0,1)$ is that

$$\lim_{n\to\infty} \left|\frac1n \sum_{\nu=1}^{n} e^{2\pi i m x_\nu}\right| = 0$$

holds for all $m \ne 0$.

Proof. There is no need to prove the necessity part. For the sufficiency part we define

$$g(x) = \begin{cases} 1, & \text{if } 0 \le x < a, \\ 0, & \text{if } a \le x \le 1. \end{cases}$$

Then

$$\lim_{n\to\infty} \frac{g(x_1) + \cdots + g(x_n)}{n} = \lim_{n\to\infty} \frac{N_n(0,a)}{n}.$$

It is clear that we need only prove that

$$\lim_{n\to\infty} \frac{g(x_1) + \cdots + g(x_n)}{n} = a.$$
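We now construct a continuous periodic function $g_{\eta,\delta}(x)$ with period 1 to approximate $g(x)$. We define

$$g_{\eta,\delta}(x) = \begin{cases} (x - \eta + \delta)/\delta, & \text{if } \eta - \delta \le x \le \eta, \\ 1, & \text{if } \eta \le x \le a - \eta, \\ -(x - a + \eta - \delta)/\delta, & \text{if } a - \eta \le x \le a - \eta + \delta, \\ 0, & \text{if } a - \eta + \delta \le x \le \eta - \delta + 1. \end{cases}$$

Here $0 < \delta \le \tfrac12 \min(a, 1-a)$, $0 \le \eta \le \delta$. Clearly

$$g_{\delta,\delta}(x) \le g(x) \le g_{0,\delta}(x).$$

Since $g_{\eta,\delta}(x)$ is continuous, it follows that

$$g_{\eta,\delta}(x) = \sum_{n=-\infty}^{\infty} c_n e^{2\pi i n x},$$

where

$$c_0 = \int_{\eta-\delta}^{\eta-\delta+1} g_{\eta,\delta}(x)\,dx = a + \delta - 2\eta;$$

and when $n \ne 0$,

$$c_n = \int_{\eta-\delta}^{\eta-\delta+1} g_{\eta,\delta}(x)\,e^{-2\pi i n x}\,dx.$$

It follows, on integrating by parts, that

$$|c_n| \le \frac{1}{\delta(\pi n)^2}$$

and

$$S_{\eta,\delta}(x) = \frac{g_{\eta,\delta}(x_1) + \cdots + g_{\eta,\delta}(x_k)}{k} = \frac1k \sum_{j=1}^{k} \sum_{n=-\infty}^{\infty} c_n e^{2\pi i n x_j} = \sum_{n=-\infty}^{\infty} c_n\,\frac1k \sum_{j=1}^{k} e^{2\pi i n x_j}.$$

Thus we have

$$S_{\eta,\delta}(x) - c_0 = \sum_{n \ne 0} c_n\,\frac1k \sum_{j=1}^{k} e^{2\pi i n x_j}.$$

We observe that

$$\left|\sum_{|n|>N} c_n\,\frac1k \sum_{j=1}^{k} e^{2\pi i n x_j}\right| \le \frac{2}{\delta\pi^2} \sum_{n>N} \frac{1}{n^2}.$$

Let $\varepsilon > 0$ and choose $N$ so that the right hand side of this inequality is less than $\varepsilon$. With $N$ fixed we see from

$$\lim_{k\to\infty} \frac1k \sum_{j=1}^{k} e^{2\pi i n x_j} = 0$$

that for all large $k$,

$$\left|\sum_{\substack{|n| \le N \\ n \ne 0}} c_n\,\frac1k \sum_{j=1}^{k} e^{2\pi i n x_j}\right| < \varepsilon.$$

Thus, given any pair of fixed $\eta, \delta$ we have

$$|S_{\eta,\delta}(x) - (a + \delta - 2\eta)| < 2\varepsilon$$

for all large $k$, or

$$\lim_{k\to\infty} S_{\eta,\delta}(x) = a + \delta - 2\eta.$$

Now let

$$S(x) = \frac{g(x_1) + \cdots + g(x_k)}{k}.$$

From $S_{\delta,\delta}(x) \le S(x) \le S_{0,\delta}(x)$ we deduce that

$$a - \delta \le \liminf_{k\to\infty} S(x) \le \limsup_{k\to\infty} S(x) \le a + \delta$$

for any $\delta$. Therefore $\lim_{k\to\infty} S = a$ as required. □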
For a clearer description of uniform distribution it is best to use the unit circle to represent the interval. Let $\varepsilon_n = e^{2\pi i x_n}$, $n = 1, 2, \ldots$ so that the sequence $(x_n)$ in $0 \le x_n \le 1$ is now transformed into a sequence on the unit circle. An advantage of using this description is the removal of the special properties of the end points 0, 1 in the interval $(0,1)$. Take any arc of the unit circle with length $2\pi\alpha$ $(\alpha < 1)$. Then any uniformly distributed sequence will have the proportion $\alpha$ of its points on this arc. Moreover, since $e^{2\pi i x_n} = e^{2\pi i (x_n + d)}$ for every integer $d$, it does not even matter if the sequence $(x_n)$ lies outside the interval $(0,1)$. In other words we may define uniform distribution of $f(x)$, mod 1 by the uniform distribution of the fractional parts of $f(x)$ in $(0,1)$. A necessary and sufficient condition for the uniform distribution of $f(x)$, mod 1 is then

$$\lim_{n\to\infty} \frac1n \sum_{x=1}^{n} e^{2\pi i m f(x)} = 0, \qquad m \ne 0.$$

An interpretation of this condition is that the centre of gravity of the sequence of points $e^{2\pi i m f(x)}$, $x = 1, 2, \ldots$, $(m \ne 0)$ is the centre of the circle. It is clear that if $f(x)$ is uniformly distributed mod 1, then so is $mf(x)$ for any nonzero integer $m$. The most interesting unsolved problem concerning this is whether $e^x$ is uniformly distributed mod 1.
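The centre of gravity interpretation can be tried out numerically. This is only an illustrative sketch under stated assumptions: `weyl_mean` is a hypothetical helper name, the sequences $f(x) = x\sqrt2$ and $f(x) = x/4$ are chosen for contrast, and floating point sums replace exact values.

```python
import cmath
import math

def weyl_mean(f, m, n):
    """Centre of gravity (1/n) * sum_{x=1}^{n} exp(2*pi*i*m*f(x))."""
    total = sum(cmath.exp(2j * math.pi * m * f(x)) for x in range(1, n + 1))
    return total / n

if __name__ == "__main__":
    irrational = lambda x: x * math.sqrt(2)   # uniformly distributed mod 1
    rational = lambda x: x / 4                # takes only four values mod 1
    for m in (1, 2, 3, 4):
        print(m,
              abs(weyl_mean(irrational, m, 10_000)),   # small for every m != 0
              abs(weyl_mean(rational, m, 10_000)))     # equals 1 when 4 divides m
```

For $f(x) = x\vartheta$ the sum is a geometric series, so its modulus stays bounded by $1/|\sin \pi m\vartheta|$ and the mean tends to 0 like $1/n$.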
Theorem 12.3. A necessary and sufficient condition for the uniform distribution of $f(x)$, mod 1 is that

$$\lim_{n\to\infty} \frac1n \sum_{x=1}^{n} \{f(x) + a\} = \frac12.$$

Proof. Necessity. Let $f(x)$ be uniformly distributed, mod 1. Then $f(x) + a$ is also uniformly distributed, mod 1. Therefore we need only establish the case $a = 0$. Let $x_m = \{f(m)\}$. Then, by Theorem 12.1, we have

$$\lim_{n\to\infty} \frac1n \sum_{x=1}^{n} \{f(x)\} = \int_0^1 x\,dx = \frac12.$$

Sufficiency. Let $0 \le b \le 1$. Then

$$\frac1n \sum_{x=1}^{n} \{f(x) + 1 - b\} = \frac1n \sum\nolimits_1 \bigl(\{f(x)\} + 1 - b\bigr) + \frac1n \sum\nolimits_2 \bigl(\{f(x)\} - b\bigr),$$

where in $\sum_1$, $x$ runs through those integers $1, 2, \ldots, n$ such that $\{f(x)\} < b$, and in $\sum_2$, $x$ runs through those integers $1, 2, \ldots, n$ such that $\{f(x)\} \ge b$. We see therefore that

$$\frac1n \sum_{x=1}^{n} \{f(x) + 1 - b\} = \frac1n \sum_{x=1}^{n} \{f(x)\} + \frac1n N_n(0,b) - b.$$

Letting $n \to \infty$ and observing the hypothesis we see that

$$\lim_{n\to\infty} \frac1n N_n(0,b) = b$$

as required. □
Chapter 11. Indeterminate Equations
11.1 Introduction

By indeterminate equations we mean equations in which the number of unknowns exceeds the number of equations given, and in which the unknowns are subject to further constraints such as being integers, or positive integers, or rationals etc. Apart from equations of the first and second degrees, the discussion on indeterminate equations is very scattered. The complicated nature of the subject is illustrated by the fact that Volume II of Dickson's History of Number Theory devotes over eight hundred pages to such equations. The study of these equations has a long history. In the third century Diophantus attempted a systematic study, and in fact nowadays indeterminate equations are often called Diophantine equations. In our country indeterminate equations have an even longer history; for example Soon Go gave the general solution of $x^2 + y^2 = z^2$ in integers $x, y, z$ much earlier than the west.
11.2 Linear Indeterminate Equations

From Theorem 2.6.2 we see that a necessary and sufficient condition for the equation $a_1x_1 + a_2x_2 + \cdots + a_nx_n = N$ to have a solution is that $(a_1, \ldots, a_n) \mid N$. Suppose now that $a_1 > 0, \ldots, a_n > 0$, $(a_1, \ldots, a_n) = 1$. We ask for the asymptotic formula for the number of solutions to the equation

$$a_1x_1 + a_2x_2 + \cdots + a_nx_n = N, \qquad x_\nu \ge 0 \quad (\nu = 1, 2, \ldots, n). \tag{1}$$

Theorem 2.1. Let $(a_1, \ldots, a_n) = 1$, and denote by $A(N)$ the number of solutions to (1). Then we have

$$\lim_{N\to\infty} \frac{A(N)}{N^{n-1}} = \frac{1}{(n-1)!\,a_1 \cdots a_n}.$$

Proof. 1) Since $(a_1, \ldots, a_n) = 1$, the number $A(N)$ is the coefficient of $x^N$ in the power series for

$$f(x) = \frac{1}{1 - x^{a_1}} \cdot \frac{1}{1 - x^{a_2}} \cdots \frac{1}{1 - x^{a_n}}.$$

Let $1, \zeta_1, \zeta_2, \ldots, \zeta_t$ be the roots of $(1 - x^{a_1}) \cdots (1 - x^{a_n}) = 0$, with multiplicities $n, l_1, l_2, \ldots, l_t$ respectively. Since $(a_1, \ldots, a_n) = 1$ we have $l_i \le n - 1$ $(i = 1, 2, \ldots, t)$. We have, by partial fractions,

$$f(x) = \frac{A_n}{(1-x)^n} + \cdots + \frac{A_1}{1-x} + \frac{B_{l_1}}{(\zeta_1 - x)^{l_1}} + \cdots + \frac{B_1}{\zeta_1 - x} + \cdots \tag{2}$$

where $A, B, \ldots, P$ are constants.

2) Denote by $\psi(N)$ the coefficient of $x^N$ in the power series expansion of

$$\frac{A}{(\alpha - x)^l} = A\alpha^{-l}\left(1 - \frac{x}{\alpha}\right)^{-l}.$$

Then, by the binomial theorem expansion, we have

$$\psi(N) = A\alpha^{-l}\,\frac{(-l)(-l-1)\cdots(-l-N+1)}{N!}\left(-\frac1\alpha\right)^N = A\alpha^{-l}\,\frac{(N+l-1)(N+l-2)\cdots(N+1)}{(l-1)!}\left(\frac1\alpha\right)^N,$$

so that

$$\lim_{N\to\infty} \frac{\psi(N)\,\alpha^{l+N}}{N^{l-1}} = \frac{A}{(l-1)!}. \tag{3}$$

Applying this to the various terms in (2) and observing that $l_i \le n - 1$ we see that

$$\lim_{N\to\infty} \frac{A(N)}{N^{n-1}} = \frac{A_n}{(n-1)!},$$

and from (2) we have

$$A_n = \lim_{x\to1} \frac{(1-x)^n}{(1 - x^{a_1}) \cdots (1 - x^{a_n})} = \frac{1}{a_1 \cdots a_n}. \qquad □$$

Theorem 2.2. Equation (1) is always soluble if $N$ is sufficiently large. □

Exercise. Let $(a,b) = 1$, $a > 0$, $b > 0$. Show that the number of solutions to $ax + by = N$, $x \ge 0$, $y \ge 0$ is given by

$$\frac{N - (bl + am)}{ab} + 1$$

where $l$ and $m$ are the least nonnegative solutions to $bl \equiv N \pmod a$ and $am \equiv N \pmod b$ respectively.
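Both the asymptotic formula of Theorem 2.1 and the closed form in the exercise lend themselves to a numerical check. The sketch below is an illustration only; the helper names `count_solutions`, `leading_term` and `two_variable_count` are hypothetical, and the coefficient sets are arbitrary examples. The count is read off as the coefficient of $x^N$ in the product of the geometric series.

```python
from math import factorial, prod

def count_solutions(coeffs, N):
    """A(N): coefficient of x^N in prod 1 / (1 - x^a), a in coeffs."""
    ways = [1] + [0] * N
    for a in coeffs:
        # Multiplying the series by 1 / (1 - x^a) is a running sum with step a.
        for total in range(a, N + 1):
            ways[total] += ways[total - a]
    return ways[N]

def leading_term(coeffs, N):
    """N^(n-1) / ((n-1)! a_1 ... a_n), the main term in Theorem 2.1."""
    n = len(coeffs)
    return N ** (n - 1) / (factorial(n - 1) * prod(coeffs))

def two_variable_count(a, b, N):
    """The closed form of the exercise for ax + by = N with (a, b) = 1."""
    l = next(t for t in range(a) if (b * t - N) % a == 0)
    m = next(t for t in range(b) if (a * t - N) % b == 0)
    return (N - (b * l + a * m)) // (a * b) + 1

if __name__ == "__main__":
    coeffs = (2, 3, 5)
    for N in (30, 300, 3000):
        print(N, count_solutions(coeffs, N) / leading_term(coeffs, N))
    print([two_variable_count(3, 5, N) for N in range(16)])
```

The printed ratios drift towards 1, while the two-variable formula can be compared directly with the series coefficient.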
11.3 Quadratic Indeterminate Equations

We shall solve the equation

$$ax^2 + bxy + cy^2 + dx + ey + f = 0. \tag{1}$$

We write $D = b^2 - 4ac$. If $D = 0$, then we multiply (1) by $4a$ giving

$$(2ax + by)^2 + 4adx + 4aey + 4af = 0,$$

which is not a difficult equation to solve. Let $2ax + by = t$ so that

$$t^2 + 2(2ae - bd)y + 4af = -2dt, \qquad (t + d)^2 = 2(bd - 2ae)y + d^2 - 4af.$$

The number $t$ can be obtained from the congruence $(t + d)^2 \equiv d^2 - 4af \pmod{2(bd - 2ae)}$, and so $x, y$ can be solved.

We now assume that $D \ne 0$. Multiplying (1) by $D^2$ we have

$$a(Dx)^2 + b(Dx)(Dy) + c(Dy)^2 + dD(Dx) + eD(Dy) + fD^2 = 0. \tag{2}$$

Substituting $Dx = x' + 2cd - be$, $Dy = y' + 2ae - bd$ into (2) we have

$$a(x' + 2cd - be)^2 + b(x' + 2cd - be)(y' + 2ae - bd) + c(y' + 2ae - bd)^2 + dD(x' + 2cd - be) + eD(y' + 2ae - bd) + fD^2 = 0,$$

or

$$ax'^2 + bx'y' + cy'^2 = k, \tag{3}$$

where

$$-k = a(2cd - be)^2 + b(2cd - be)(2ae - bd) + c(2ae - bd)^2 + dD(2cd - be) + eD(2ae - bd) + fD^2.$$

We see therefore that whether (1) is soluble depends on whether (3) has solutions satisfying

$$x' \equiv be - 2cd, \qquad y' \equiv bd - 2ae \pmod D.$$

Our first priority is therefore to solve (3).
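The reduction from (1) to (3) rests on the linear terms cancelling after the shift. A small numerical check of this, written as a sketch: the function names are hypothetical, and the sample coefficients are arbitrary integers chosen for illustration.

```python
def reduce_to_homogeneous(a, b, c, d, e, f):
    """Return (D, p, q, k) with D*x = x' + p, D*y = y' + q and
    a*x'^2 + b*x'*y' + c*y'^2 = k equivalent to equation (1)."""
    D = b * b - 4 * a * c
    p = 2 * c * d - b * e
    q = 2 * a * e - b * d
    k = -(a * p * p + b * p * q + c * q * q + d * D * p + e * D * q + f * D * D)
    return D, p, q, k

def check_point(coeffs, x, y):
    """Verify a*x'^2 + b*x'*y' + c*y'^2 - k == D^2 * (left side of (1))."""
    a, b, c, d, e, f = coeffs
    D, p, q, k = reduce_to_homogeneous(*coeffs)
    xp, yp = D * x - p, D * y - q
    lhs = a * xp * xp + b * xp * yp + c * yp * yp - k
    rhs = D * D * (a * x * x + b * x * y + c * y * y + d * x + e * y + f)
    return lhs == rhs

if __name__ == "__main__":
    coeffs = (3, -8, 7, -4, 2, -109)      # sample integer coefficients
    print(reduce_to_homogeneous(*coeffs))
    print(all(check_point(coeffs, x, y) for x in range(-5, 6) for y in range(-5, 6)))
```

Because the identity holds for every integer pair, a point solves (1) exactly when its image solves (3).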
11.4 The Solutions to $ax^2 + bxy + cy^2 = k$

We shall solve

$$ax^2 + bxy + cy^2 = k. \tag{1}$$

Let $d = b^2 - 4ac$. We shall assume that $d$ is not a perfect square, and that $(a, b, c) = 1$. We need only find those solutions satisfying $(x, y) = 1$, and we call these the proper solutions.

Theorem 4.1. Let $x, y$ be a proper solution to (1). Then there are two uniquely determined integers $s$ and $r$ satisfying

$$xs - yr = 1, \tag{2}$$

and the integer

$$l = (2ax + by)r + (bx + 2cy)s$$

satisfies

$$l^2 \equiv d \pmod{4k}, \qquad 0 \le l < 2k.$$

Proof. Let $r_0, s_0$ be a solution to (2). Then the general solution to (2) is $r = r_0 + hx$, $s = s_0 + hy$ where $h$ is any integer. Thus

$$l = (2ax + by)r_0 + (bx + 2cy)s_0 + 2h(ax^2 + bxy + cy^2) = l_0 + 2hk,$$

so that we may choose a unique $h$ such that $0 \le l < 2k$. Finally we have

$$l^2 = [(2ax + by)r + (bx + 2cy)s]^2 = 4(ar^2 + brs + cs^2)(ax^2 + bxy + cy^2) + (b^2 - 4ac)(xs - yr)^2 \equiv d \pmod{4k}. \tag{3}$$

□
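The number $l$ attached to a proper solution can be computed directly from the proof. The following sketch assumes $x, y > 0$ and $k > 0$; the helper names `ext_gcd` and `associated_l` are hypothetical, and the sample form $x^2 - 15y^2$ with $k = 61$ is used only as an illustration.

```python
def ext_gcd(a, b):
    """Return (g, u, v) with a*u + b*v == g == gcd(a, b), for a, b >= 0."""
    if b == 0:
        return a, 1, 0
    g, u, v = ext_gcd(b, a % b)
    return g, v, u - (a // b) * v

def associated_l(a, b, c, x, y):
    """Return (l, k) for the proper solution (x, y) of a*x^2 + b*x*y + c*y^2 = k."""
    k = a * x * x + b * x * y + c * y * y
    g, u, v = ext_gcd(x, y)
    assert g == 1 and k > 0, "sketch assumes a proper solution and k > 0"
    s, r = u, -v                                  # x*s - y*r = 1
    l0 = (2 * a * x + b * y) * r + (b * x + 2 * c * y) * s
    return l0 % (2 * k), k                        # normalise to 0 <= l < 2k

if __name__ == "__main__":
    d = 0 * 0 - 4 * 1 * (-15)                     # discriminant of x^2 - 15 y^2
    for x, y in ((11, 2), (14, 3)):
        l, k = associated_l(1, 0, -15, x, y)
        print((x, y), l, (l * l - d) % (4 * k) == 0)
```

Changing $r_0, s_0$ to another solution of (2) alters $l_0$ by a multiple of $2k$, which is why the reduction modulo $2k$ gives a value that depends only on $(x, y)$.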
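Theorem 4.2. Let $(x_1, y_1)$ and $(x_2, y_2)$ be two proper solutions corresponding to the same number $l$ in the previous theorem. Then we have

$$2ax_1 + (b + \sqrt d)y_1 = \bigl(2ax_2 + (b + \sqrt d)y_2\bigr)\,\frac{t + u\sqrt d}{2}, \tag{4}$$

where $t$ and $u$ are integers satisfying

$$t^2 - du^2 = 4. \tag{5}$$

Conversely, if $(x_2, y_2)$ is a proper solution, then the numbers $x_1, y_1$ defined by (4) also give a proper solution and both solutions correspond to the same number $l$.

Proof. 1) We first show that

$$t = \frac{(2ax_1 + by_1)(2ax_2 + by_2) - dy_1y_2}{2ak}, \qquad u = -\frac{x_1y_2 - x_2y_1}{k} \tag{6}$$

are the suitable integers; that is, we show that $t$ and $u$ are integers satisfying (4) and (5). From

$$\frac{t \pm u\sqrt d}{2} = \frac{(2ax_1 + by_1)(2ax_2 + by_2) - dy_1y_2 \mp 2a(x_1y_2 - x_2y_1)\sqrt d}{4ak} = \frac{(2ax_1 + by_1 \pm \sqrt d\,y_1)(2ax_2 + by_2 \mp \sqrt d\,y_2)}{4ak}$$

$$= \frac{(2ax_1 + by_1 \pm \sqrt d\,y_1)(2ax_2 + by_2 \mp \sqrt d\,y_2)}{(2ax_2 + by_2 + \sqrt d\,y_2)(2ax_2 + by_2 - \sqrt d\,y_2)} = \frac{2ax_1 + by_1 \pm \sqrt d\,y_1}{2ax_2 + by_2 \pm \sqrt d\,y_2},$$

we see that (4) follows. Next from

$$\frac{t^2 - du^2}{4} = \frac{t + \sqrt d\,u}{2} \cdot \frac{t - \sqrt d\,u}{2} = 1,$$

we see that $t$ and $u$ satisfy (5). Also, with $r_1, s_1$ denoting the integers of Theorem 4.1 that belong to $(x_1, y_1)$,

$$2ax_1 + by_1 = (2ax_1 + by_1)(s_1x_1 - r_1y_1) = (2ax_1 + by_1)s_1x_1 - ly_1 + (bx_1 + 2cy_1)s_1y_1 \equiv -ly_1 \pmod{2k}. \tag{7}$$

Similarly we have

$$2ax_2 + by_2 \equiv -ly_2 \pmod{2k}.$$

Therefore

$$2a(x_1y_2 - x_2y_1) \equiv 0 \pmod{2k}, \qquad (b + l)(x_1y_2 - x_2y_1) \equiv 0 \pmod{2k}.$$

Similarly we have

$$2c(x_1y_2 - x_2y_1) \equiv 0 \pmod{2k}, \qquad (b - l)(x_1y_2 - x_2y_1) \equiv 0 \pmod{2k}.$$

But

$$(2a, b + l, b - l, 2c) = (2a, 2b, 2c, b + l) \le 2,$$

so that

$$x_1y_2 - x_2y_1 \equiv 0 \pmod k.$$

This shows that $u$ is an integer. Therefore $t^2$ is an integer, and since $t$ is rational, $t$ itself must be an integer.

2) Suppose that

$$2ax_1 + (b + \sqrt d)y_1 = \bigl(2ax_2 + (b + \sqrt d)y_2\bigr)\left(\frac{t + u\sqrt d}{2}\right),$$

and $t^2 - du^2 = 4$. Then

$$x_1 = \frac{t - bu}{2}x_2 - cuy_2, \qquad y_1 = aux_2 + \frac{t + bu}{2}y_2.$$

Let $r_1, s_1$ correspond to the solution $x_1, y_1$. Then

$$r_2 = \frac{t + bu}{2}r_1 + cus_1, \qquad s_2 = -aur_1 + \frac{t - bu}{2}s_1$$

correspond to the solution $x_2, y_2$, because

$$x_2s_2 - y_2r_2 = x_1s_1 - y_1r_1 = 1.$$

Finally, let $l_1$ and $l_2$ correspond to $(x_1, y_1)$ and $(x_2, y_2)$ respectively. Then

$$l_1 = (2ar_1 + bs_1)\left(\frac{t - bu}{2}x_2 - cuy_2\right) + (br_1 + 2cs_1)\left(aux_2 + \frac{t + bu}{2}y_2\right)$$

$$= \left\{2a\left(r_1\frac{t + bu}{2} + s_1cu\right) + b\left(s_1\frac{t - bu}{2} - r_1au\right)\right\}x_2 + \left\{b\left(r_1\frac{t + bu}{2} + s_1cu\right) + 2c\left(s_1\frac{t - bu}{2} - r_1au\right)\right\}y_2$$

$$= (2ar_2 + bs_2)x_2 + (br_2 + 2cs_2)y_2 = l_2.$$

The theorem is proved. □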
We shall now separate our discussion into two cases depending on the sign of $d$.

Theorem 4.3. Suppose that $d < 0$. Let

$$w = \begin{cases} 2, & \text{if } d < -4, \\ 4, & \text{if } d = -4, \\ 6, & \text{if } d = -3. \end{cases}$$

Then there are $w$ proper solutions to (1) that correspond to the same $l$.

Proof. From Theorem 4.2 we see that it suffices to show that the equation $t^2 - du^2 = 4$ has $w$ solutions. If $d < -4$, then clearly $t = \pm2$, $u = 0$ are the only solutions, so that $w = 2$. If $d = -4$, then $t^2 + 4u^2 = 4$ has the four solutions $t = \pm2$, $u = 0$ and $t = 0$, $u = \pm1$. Finally if $d = -3$, then $t^2 + 3u^2 = 4$ has the six solutions $t = \pm1$, $u = \pm1$; $t = \pm2$, $u = 0$. □
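The three values of $w$ can be confirmed by listing the integer solutions of $t^2 - du^2 = 4$ for a few negative $d$. This is a small sketch; `unit_solutions` is a hypothetical helper name, and the search range for $u$ relies on $|u| \le 2$ whenever $d \le -1$.

```python
from math import isqrt

def unit_solutions(d):
    """All integer pairs (t, u) with t^2 - d*u^2 = 4, for negative d."""
    assert d < 0
    found = set()
    for u in range(-2, 3):            # d*u^2 >= -4 forces |u| <= 2
        t_squared = 4 + d * u * u
        if t_squared < 0:
            continue
        t = isqrt(t_squared)
        if t * t == t_squared:
            found.add((t, u))
            found.add((-t, u))
    return sorted(found)

if __name__ == "__main__":
    for d in (-3, -4, -7, -8, -20):
        print(d, len(unit_solutions(d)), unit_solutions(d))
```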
Theorem 4.4. Let $d > 0$. Then all the solutions to the equation $x^2 - dy^2 = 4$ can be obtained as follows: Let $x_0, y_0$ be a solution in which $x_0 + y_0\sqrt d$ is least $(x_0 > 0, y_0 > 0)$. Then all the solutions are given by

$$\frac{x + y\sqrt d}{2} = \pm\left(\frac{x_0 + y_0\sqrt d}{2}\right)^n, \qquad n = 0, \pm1, \pm2, \ldots.$$

Proof. Since the equation $x^2 - dy^2 = 1$ does possess a solution we see that $x_0, y_0$ exist. The rest of the proof is the same as that in Theorem 10.9.2. □
Let

$$\varepsilon = \frac{x_0 + y_0\sqrt d}{2}, \qquad \bar\varepsilon = \frac{x_0 - y_0\sqrt d}{2}.$$

Definition. Let $d > 0$. By a primary solution to (1) we mean a solution which satisfies

$$2ax + (b - \sqrt d)y > 0, \qquad 1 \le \left|\frac{2ax + (b + \sqrt d)y}{2ax + (b - \sqrt d)y}\right| < \varepsilon^2.$$

If we write $L = 2ax + (b + \sqrt d)y$, $\bar L = 2ax + (b - \sqrt d)y$, then the condition above becomes

$$\bar L > 0, \qquad 1 \le \left|\frac{L}{\bar L}\right| < \varepsilon^2.$$

Theorem 4.5. Let $d > 0$. If the equation (1) has proper primary solutions which correspond to the same $l$, then it has a unique proper primary solution.

Proof. From Theorem 4.2 we know that if $x_0, y_0$ is a proper primary solution to (1), then, on denoting by $L_0$ the associated number $L$, every proper solution of (1) corresponding to the same $l$ can be represented by $L = \pm L_0\varepsilon^n$. We have

$$\frac{L}{\bar L} = \frac{L_0}{\bar L_0}\,\varepsilon^{2n},$$

so that $1 \le |L/\bar L| < \varepsilon^2$ only when $n = 0$, and in this case $\bar L = \bar L_0 > 0$. □

When $d > 0$ we set $w = 1$. We can now generalize the definition of a primary solution: when $d > 0$, the definition is as given previously; when $d < 0$ any proper solution is also called a primary solution. Combining Theorems 4.3 and 4.5 we now have

Theorem 4.6. If, corresponding to the same $l$, the equation (1) has proper primary solutions, then there are $w$ proper primary solutions. □

Theorem 4.5 suggests that in solving $ax^2 + bxy + cy^2 = k$ there is no need to search for integer points on the whole hyperbola. The primary solution occurs in a finite part of the hyperbola, and having obtained the primary solution we may use the formula $L = \pm L_0\varepsilon^n$ to find all the other solutions. That is, if $\varepsilon$ is known, all the solutions can be obtained in a finite number of steps. Specifically, from

$$L_0\bar L_0 = 4ak, \qquad \bar L_0 > 0, \qquad 1 \le \left|\frac{L_0}{\bar L_0}\right| < \varepsilon^2,$$

we see that

$$L_0^2 = |L_0\bar L_0| \left|\frac{L_0}{\bar L_0}\right| < 4|ak|\,\varepsilon^2,$$

or

$$|\bar L_0| \le |L_0| < 2\varepsilon\sqrt{|ak|},$$

giving, since $2\sqrt d\,y = L_0 - \bar L_0$,

$$|y| \le 2\varepsilon\sqrt{|ak|/d}.$$

That is, we need only find a solution which satisfies $0 < y \le 2\varepsilon\sqrt{|ak|/d}$ and the rest can be obtained from $L = \pm L_0\varepsilon^n$. When $a > 0$, $k > 0$ we deduce from $\bar L > 0$ and $L\bar L > 0$ that $L > 0$, and whence $\bar L < L$ so that

$$0 < 2\sqrt d\,y = L - \bar L \le L = \sqrt{L\bar L \cdot \frac{L}{\bar L}} \le \varepsilon\sqrt{4ak}.$$

Therefore

$$0 < y \le \varepsilon\sqrt{ak/d}.$$

In the actual evaluation of the solutions, this result is better than the previous bound.

Exercise 1. Prove that, under the same hypothesis, the substitution

$$x_1 = \frac{t - bu}{2}x - cuy, \qquad y_1 = aux + \frac{t + bu}{2}y$$

transforms the form $ax^2 + bxy + cy^2$ into $ax_1^2 + bx_1y_1 + cy_1^2$.
11.5 Method of Solution

From the above we see that we have to solve the equation $ax^2 + bxy + cy^2 = k$. We now discuss the case when $d$ is positive and not a perfect square. We then have the equation $(2ax + by)^2 - dy^2 = 4ak$. We therefore consider

$$x^2 - dy^2 = \sigma k, \qquad k > 0, \quad \sigma = \pm1. \tag{1}$$

If $k < \sqrt d$, then by Theorem 10.8.3, all the solutions to (1) can be obtained from the continued fractions expansion for $\sqrt d$, and from periodicity this involves only a finite number of steps. We now show that if $k > \sqrt d$, then we can still reduce it to the case when $k < \sqrt d$.

Suppose that $x, y$ is a proper solution to (1). Then there are $x_1, y_1$ such that

$$xy_1 - yx_1 = 1. \tag{2}$$

Multiplying (1) by $x_1^2 - dy_1^2$ we have

$$(xx_1 - dyy_1)^2 - d(xy_1 - yx_1)^2 = \sigma k(x_1^2 - dy_1^2),$$

or

$$(xx_1 - dyy_1)^2 - d = \sigma k(x_1^2 - dy_1^2).$$

Let $x_0, y_0$ be a solution to (2). Then all the solutions to (2) are given by $x_1 = x_0 + tx$, $y_1 = y_0 + ty$ so that

$$xx_1 - dyy_1 = xx_0 - dyy_0 + (x^2 - dy^2)t = xx_0 - dyy_0 + \sigma tk.$$

We may therefore choose $t$ so that

$$|xx_1 - dyy_1| \le \frac k2.$$

Let $|xx_1 - dyy_1| = l$. Then

$$x_1^2 - dy_1^2 = \frac{l^2 - d}{\sigma k} = \eta h, \qquad \eta = \pm1, \quad h > 0.$$

Therefore

$$h = \frac{|l^2 - d|}{k} \le \max\left(\frac k4, \frac dk\right) < k.$$

From this we see that from a solution to (1) we arrived at a similar equation with a number $k$ which is smaller. If this number is still greater than $\sqrt d$ we can repeat the argument. This suggests the following procedure. We first solve for all those $l$ satisfying $l^2 \equiv d \pmod k$, $0 \le l \le k/2$, and we let them be $l_1, l_2, \ldots, l_t$. Set $(l_i^2 - d)/\sigma k = \eta_ih_i$, $\eta_i = \pm1$, $h_i > 0$ and solve the system $x_i^2 - dy_i^2 = \eta_ih_i$ $(1 \le i \le t)$. Suppose that $h_i < \sqrt d$. Then we use the method of continued fractions. Let $x_i, y_i$ be a solution. Then

$$x = \frac{-\sigma dy_i \pm l_ix_i}{\eta_ih_i}, \qquad y = \frac{-\sigma x_i \pm l_iy_i}{\eta_ih_i} \tag{3}$$

is a solution to (1). This is because from

$$\eta_ih_i(x + \sqrt d\,y) = (x_i + \sqrt d\,y_i)(-\sigma\sqrt d \pm l_i)$$

we have $x^2 - dy^2 = \sigma k$ at once. Further, if $x, y$ in (3) are integers, then they are solutions to (1).

If $h_i > \sqrt d$, then we proceed to obtain a specific solution to $x_i^2 - dy_i^2 = \eta_ih_i$. Then all the solutions to (1) can be obtained. We illustrate this with an example.

Example. We wish to solve

$$x^2 - 15y^2 = 61. \tag{4}$$

We first solve $l^2 \equiv 15 \pmod{61}$, $0 \le l \le \tfrac{61}{2}$. This means solving $l^2 = 15 + 61h$, $l^2 \le 900$, or finding $h$ so that $15 + 61h$ is a square. Letting $h$ run over $0 \le h \le [900/61] = 14$ we see that there is only one suitable $h$, namely $h = 10$, $l = 25$. We now have to solve

$$x_1^2 - 15y_1^2 = 10. \tag{5}$$

Observing that $10 > \sqrt{15}$ we now consider $l^2 = 15 + 10h$, $l \le \tfrac{10}{2} = 5$. This gives $h = 1$, $l = 5$ so that we have to solve

$$x_2^2 - 15y_2^2 = 1. \tag{6}$$

From the method of continued fractions, the solutions to (6) are given by $x_2 + \sqrt{15}\,y_2 = \pm(4 + \sqrt{15})^n$. Therefore $x_1 + \sqrt{15}\,y_1 = \pm(4 + \sqrt{15})^n(5 \pm \sqrt{15})$ and so

$$x + \sqrt{15}\,y = \pm(4 + \sqrt{15})^n(5 \pm \sqrt{15})(25 \pm \sqrt{15})/10.$$

Here the three signs $\pm$ are independent so that either

$$x + \sqrt{15}\,y = \pm(4 + \sqrt{15})^n(14 \pm 3\sqrt{15})$$

or

$$x + \sqrt{15}\,y = \pm(4 + \sqrt{15})^n(11 \pm 2\sqrt{15}).$$

Alternatively we can use the inequality at the end of §4, that is $0 < y \le \varepsilon\sqrt{ak/d}$. For this example we have $0 < y \le 7$ and we can construct the following table

    y              1    2    3    4    5    6    7
    15(2y - 1)    15   45   75  105  135  165  195
    15y^2         15   60  135  240  375  540  735
    61 + 15y^2    76  121  196  301  436  601  796

Observe that in the second row of this table each term increases by 30, and in the third row the $i$th term is the sum of the $(i-1)$th term and the $i$th term of the second row. The only squares in the last row are $121 = 11^2$ and $196 = 14^2$, which give the solutions $(x, y) = (11, 2)$ and $(14, 3)$ found above.
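The tabular search can be mechanised. The sketch below treats only the special form $x^2 - Dy^2 = k$ with $k > 0$ (so $a = 1$, $b = 0$, $c = -D$ and $d = 4D$), under the bound $0 < y \le \varepsilon\sqrt{ak/d}$ quoted in the example; the function name `primary_solutions` is hypothetical, and the least solution of $t^2 - du^2 = 4$ is found by plain trial, which is adequate only for small $D$.

```python
import math

def primary_solutions(D, k):
    """Solutions (x, y) of x^2 - D*y^2 = k with 0 < y <= eps*sqrt(k/d), d = 4*D."""
    d = 4 * D
    # Least solution of t^2 - d*u^2 = 4 with u > 0, found by trial on u.
    u = 1
    while True:
        t_squared = d * u * u + 4
        t = math.isqrt(t_squared)
        if t * t == t_squared:
            break
        u += 1
    eps = (t + u * math.sqrt(d)) / 2
    bound = math.floor(eps * math.sqrt(k / d))
    solutions = []
    for y in range(1, bound + 1):
        x_squared = k + D * y * y
        x = math.isqrt(x_squared)
        if x * x == x_squared:        # the entry k + D*y^2 of the table is a square
            solutions.append((x, y))
    return solutions

if __name__ == "__main__":
    print(primary_solutions(15, 61))
```

For $D = 15$, $k = 61$ the trial loop stops at $t = 8$, $u = 1$, so $\varepsilon = 4 + \sqrt{15}$ and the search covers $y = 1, \ldots, 7$, matching the table.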
Exercise 1. Solve the following indeterminate equations.

(a) $3x^2 - 8xy + 7y^2 - 4x + 2y = 109$,

(b) $3xy + 2y^2 - 4x - 3y = 12$,

(c) $9x^2 - 12xy + 4y^2 + 3x + 2y = 12$,

(d) $x^2 - 8xy - 17y^2 + 72y - 75 = 0$.

Exercise 2. Let $k < \sqrt d$. Show that the solutions to $ax^2 + bxy + cy^2 = k$ can be obtained from the continued fractions expansions of the roots of the equation $ax^2 + bx + c = 0$. Try and generalize the results in this section.
11.6 Generalization of Soon Go's Theorem

Let us consider the equation $x^2 + y^2 = z^2$. If $(x, y) = d > 1$, then $d$ also divides $z$. We may therefore assume that $(x, y) = 1$, and we need only consider positive solutions. Next, if $x, y$ are both odd, then $x^2 + y^2 \equiv 2 \pmod 4$, so that $z^2$ is divisible by 2 but not by 4; since this is impossible we see that $x$ and $y$ must be of opposite parity. We shall assume that $x$ is even.

Theorem 6.1. The solutions of the equation $x^2 + y^2 = z^2$ satisfying $x > 0$, $y > 0$, $z > 0$, $(x, y) = 1$, $2 \mid x$ are given by

$$x = 2ab, \qquad y = a^2 - b^2, \qquad z = a^2 + b^2,$$

where $a, b$ are coprime integers of opposite parity satisfying $a > b > 0$. There is a one to one correspondence between $(x, y, z)$ and $(a, b)$. □
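The parametrization of Theorem 6.1 translates directly into a generator of primitive solutions. A minimal sketch follows; the function name `primitive_triples` and the cut-off parameter `limit` on $z$ are assumptions made for illustration.

```python
from math import gcd

def primitive_triples(limit):
    """Triples (x, y, z) = (2ab, a^2 - b^2, a^2 + b^2) with z <= limit,
    where a > b > 0 are coprime and of opposite parity."""
    triples = []
    a = 2
    while a * a + 1 <= limit:
        for b in range(1, a):
            if (a - b) % 2 == 1 and gcd(a, b) == 1:
                x, y, z = 2 * a * b, a * a - b * b, a * a + b * b
                if z <= limit:
                    triples.append((x, y, z))
        a += 1
    return sorted(triples, key=lambda triple: triple[2])

if __name__ == "__main__":
    for x, y, z in primitive_triples(30):
        print(x, y, z, x * x + y * y == z * z)
```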
On putting $\xi = x/z$, $\eta = y/z$ the equation $x^2 + y^2 = z^2$ becomes $\xi^2 + \eta^2 = 1$ and we deduce from Theorem 6.1 that the unit circle $\xi^2 + \eta^2 = 1$ has infinitely many rational points given by

$$\xi = \frac{2ab}{a^2 + b^2}, \qquad \eta = \frac{a^2 - b^2}{a^2 + b^2}.$$

We generalize the problem and ask if every second degree conic possesses infinitely many rational points. The answer is no; for example the hyperbola $\xi^2 - 3\eta^2 = 2$ has no rational points. For if we put $\xi = x/z$, $\eta = y/z$, $(x, y, z) = 1$ then we have $x^2 - 3y^2 = 2z^2$, so that $x^2 \equiv 2z^2 \pmod 3$, which implies $3 \mid x$ and $3 \mid z$, and whence $3 \mid y$, contradicting $(x, y, z) = 1$. However, we do have the following:
Theorem 6.2. Let a second degree conic, not a pair of straight lines, have rational coefficients. If the conic has one rational point, then it has infinitely many rational points.

Proof. We may assume that the conic passes through the origin; otherwise we can translate the origin to the rational point concerned. The conic can be written as $S_2(\xi, \eta) + S_1(\xi, \eta) = 0$, where $S_i(\xi, \eta)$ is homogeneous in $\xi$ and $\eta$ with degree $i$. If $S_1(\xi, \eta) \equiv 0$, then the original conic is a pair of straight lines, and if $S_2(\xi, \eta) \equiv 0$, then the original conic is a straight line. Therefore $S_1(\xi, \eta)$ and $S_2(\xi, \eta)$ are not identically zero. Now put $\eta = \zeta\xi$ so that $\xi S_2(1, \zeta) + S_1(1, \zeta) = 0$, giving

$$\xi = -\frac{S_1(1, \zeta)}{S_2(1, \zeta)}, \qquad \eta = -\frac{\zeta S_1(1, \zeta)}{S_2(1, \zeta)}.$$

There are therefore infinitely many rational points. □
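The proof is constructive: every rational slope $\zeta$ through the known point gives a further rational point. The sketch below follows that recipe with exact rational arithmetic; the function names are hypothetical, the conic is passed as its six coefficients $(A, B, C, D, E, F)$ of $A\xi^2 + B\xi\eta + C\eta^2 + D\xi + E\eta + F$, and the unit circle with base point $(-1, 0)$ is only an example.

```python
from fractions import Fraction

def conic_value(coef, x, y):
    A, B, C, D, E, F = coef
    return A * x * x + B * x * y + C * y * y + D * x + E * y + F

def point_from_slope(coef, base, slope):
    """Second intersection of the conic with the line of the given slope
    through the rational point `base` lying on the conic."""
    A, B, C, D, E, F = coef
    x0, y0 = base
    # After moving the origin to `base`, S1 is the gradient part and S2 the quadratic part.
    s1 = (2 * A * x0 + B * y0 + D) + (B * x0 + 2 * C * y0 + E) * slope   # S1(1, slope)
    s2 = A + B * slope + C * slope * slope                               # S2(1, slope)
    if s2 == 0:
        return None                   # the line meets the conic only at `base`
    xi = -Fraction(s1) / s2
    return (x0 + xi, y0 + xi * slope)

if __name__ == "__main__":
    circle = (1, 0, 1, 0, 0, -1)      # xi^2 + eta^2 - 1 = 0
    for n in range(1, 6):
        print(point_from_slope(circle, (-1, 0), Fraction(n, 7)))
```

With slope $1/2$ the circle example returns the point $(3/5, 4/5)$.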
Theorem 6.3. Let $A, B, C$ be rational numbers, not all zero. Suppose that $B^2 - 4AC$ is a square. Then the conic

$$A\xi^2 + B\xi\eta + C\eta^2 + D\xi + E\eta + F = 0 \tag{1}$$

has infinitely many rational points. In other words, if the asymptotes of a hyperbola have rational points, then the hyperbola has infinitely many rational points; a parabola has infinitely many rational points.

Proof. Write $L^2 = B^2 - 4AC$, so that

$$A\xi^2 + B\xi\eta + C\eta^2 = A\left(\left(\xi + \frac{B}{2A}\eta\right)^2 + \left(\frac CA - \frac{B^2}{4A^2}\right)\eta^2\right) = A\left(\xi + \frac{B}{2A}\eta - \frac{L}{2A}\eta\right)\left(\xi + \frac{B}{2A}\eta + \frac{L}{2A}\eta\right).$$

If $L \ne 0$ we set

$$\xi' = \xi + \frac{B - L}{2A}\eta, \qquad \eta' = \xi + \frac{B + L}{2A}\eta,$$

and solving for $\xi$ and $\eta$ and substituting into (1) we have

$$A\xi'\eta' + D'\xi' + E'\eta' + F' = 0,$$

which gives

$$\xi' = -\frac{E'\eta' + F'}{A\eta' + D'}.$$

Therefore (1) has infinitely many rational points. If $L = 0$ we set $\xi' = \xi + B\eta/2A$, $\eta' = -\eta$, giving $A\xi'^2 + D'\xi' + E'\eta' + F' = 0$. If $E' \ne 0$, then $\eta' = -(A\xi'^2 + D'\xi' + F')/E'$ so that there are infinitely many rational points. If $E' = 0$, then the original curve is not a second degree conic. □
Note: Theorems 6.2 and 6.3 raise the following problem. Let

$$f(x_1, x_2, \ldots, x_n) = 0 \tag{2}$$

be a homogeneous second degree equation in $x_1, x_2, \ldots, x_n$ with integer coefficients, not factorizable into a product of linear terms. We ask if there are infinitely many lattice points satisfying (2). We see from Theorem 6.2 that if $n \ge 3$ and if (2) has a nonzero lattice point, then there are infinitely many lattice points. But when does it have a lattice point? For example: $x_1^2 + x_2^2 + \cdots + x_n^2 = 0$ certainly has no nonzero lattice point. We therefore have to assume that $f(\xi_1, \ldots, \xi_n) = 0$ has other real solutions. It can be proved that, under this assumption, and for $n \ge 5$, the equation (2) has integer solutions, and indeed infinitely many solutions (this is Meyer's theorem). The result does not hold when $n = 4$. For if $x_1^2 + x_2^2 + x_3^2 - 7x_4^2 = 0$, then we may assume that $(x_1, x_2, x_3, x_4) = 1$. Now from $x_1^2 + x_2^2 + x_3^2 + x_4^2 \equiv 0 \pmod 8$, and $x^2 \equiv 0, 1, 4 \pmod 8$ we can deduce that $2 \mid (x_1, x_2, x_3, x_4)$ which is a contradiction.
and if n has no odd prime divisors, then n = 2k (k ~ 2) and
The case n = 4 can be settled using Fermat's method of infinite descent. In fact we have Theorem 7.1. The equation X4
+ y4 = Z4 has no positive integer solutions.
D
11.8 Markoff's Equation

We introduced in §10.5 Markoff's equation

$$x^2 + y^2 + z^2 = 3xyz \tag{1}$$

and we stated the relationship between Markoff numbers and continued fractions. We shall now study this equation.

Theorem 8.1. Let $x_0, y_0, z_0$ be a solution to (1). Then so is $x_0, y_0, 3x_0y_0 - z_0$.

Proof.

$$x_0^2 + y_0^2 + (3x_0y_0 - z_0)^2 = x_0^2 + y_0^2 + z_0^2 - 6x_0y_0z_0 + 9x_0^2y_0^2 = -3x_0y_0z_0 + 9x_0^2y_0^2 = 3x_0y_0(3x_0y_0 - z_0). \qquad □$$

Theorem 8.2. Every solution of (1) can be generated from Theorem 8.1 with $x = y = z = 1$ as an initial solution.

Proof. 1) If $x = y = z$, then clearly $x = y = z = 1$.

2) If $x = y \ne z$, then $2x^2 + z^2 = 3x^2z$. Hence $x^2 \mid z^2$ or $x \mid z$. Let $z = wx$ so that $2 + w^2 = 3wx$ $(w > 0)$ and hence $w \mid 2$, giving $w = 1$ or 2. But $x \ne z$ so that $w = 2$ giving $x = 1$, $y = 1$, $z = 2$ and this is a solution generated by $(1, 1, 1)$ from Theorem 8.1.

3) We can now assume that $x < y < z$. If we can establish that $3xy - z < z$, then we can reduce the value of $x + y + z$, so that after a finite number of successive steps $x, y, z$ cannot be all different, which means that we have reduced the present case to 1) or 2). This is what we shall prove. From $z^2 - 3xyz + x^2 + y^2 = 0$ we have

$$2z = 3xy \pm \sqrt{9x^2y^2 - 4(x^2 + y^2)}.$$

If

$$2z = 3xy - \sqrt{9x^2y^2 - 4(x^2 + y^2)},$$

then from

$$8x^2y^2 - 4(x^2 + y^2) = 4x^2(y^2 - 1) + 4y^2(x^2 - 1) > 0$$

we see that

$$2z < 3xy - xy = 2xy,$$

or

$$z < xy.$$

But

$$3xyz = x^2 + y^2 + z^2 < 3z^2,$$

so that $xy < z$, giving a contradiction. Therefore

$$2z = 3xy + \sqrt{9x^2y^2 - 4(x^2 + y^2)} > 3xy, \qquad \text{that is} \qquad 3xy - z < z,$$

as required. □
Example. Starting with $(1, 1, 1)$ we have $(1, 1, 2)$ and then $(1, 2, 5)$; $(1, 5, 13)$; $(2, 5, 29)$. Continuing we have the following table for $x \le y \le z < 1000$.

    z    2   5  13  29  34  89  169  194  233  433  610  985
    y    1   2   5   5  13  34   29   13   89   29  233  169
    x    1   1   1   2   1   1    2    5    1    5    1    2
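The table can be regenerated by applying the step of Theorem 8.1 to each of the three coordinates in turn, starting from $(1, 1, 1)$. The following is a sketch only; the name `markov_triples` and the bound parameter `limit` are assumptions, and triples are stored sorted so that permutations are not counted twice.

```python
def markov_triples(limit):
    """All solutions x <= y <= z < limit of x^2 + y^2 + z^2 = 3xyz
    reachable from (1, 1, 1) by the replacement z -> 3xy - z and its analogues."""
    seen = set()
    stack = [(1, 1, 1)]
    while stack:
        triple = stack.pop()
        if triple in seen:
            continue
        seen.add(triple)
        x, y, z = triple
        neighbours = ((3 * y * z - x, y, z), (x, 3 * x * z - y, z), (x, y, 3 * x * y - z))
        for neighbour in neighbours:
            neighbour = tuple(sorted(neighbour))
            if neighbour[2] < limit and neighbour not in seen:
                stack.append(neighbour)
    return sorted(seen, key=lambda t: (t[2], t[1], t[0]))

if __name__ == "__main__":
    for x, y, z in markov_triples(1000):
        print(x, y, z, x * x + y * y + z * z == 3 * x * y * z)
```

The search also lists the starting solution $(1, 1, 1)$ itself, so it returns one more triple than the table has columns.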
Note: Observe that this is also a method of descent. Fortunately there is no more descent after $x = y = z = 1$. We see therefore that Fermat's method of infinite descent can be used either to prove that there is no solution, or to prove that there are infinitely many solutions.

Exercise 1. Generalize the discussion here to the equation $x_1^2 + x_2^2 + \cdots + x_n^2 = nx_1x_2 \cdots x_n$.

Exercise 3. Show that the equation $2x^4 - y^4 = z^4$ has infinitely many solutions.

11.9 The Equation $x^3 + y^3 + z^3 + w^3 = 0$
The number 1729 is the smallest positive integer representable as the sum of two cubes in two different ways. That is, $1729 = 10^3 + 9^3 = 12^3 + 1^3$. There are other numbers having this property, for example: $2^3 + 34^3 = 15^3 + 33^3$, $9^3 + 15^3 = 2^3 + 16^3$. In fact we even have $70^3 + 560^3 = 198^3 + 552^3 = 315^3 + 525^3$, $121170^3 + 969360^3 = 545275^3 + 908775^3 = 342738^3 + 955512^3 = 336455^3 + 956305^3$, and $3^3 + 4^3 + 5^3 = 6^3$, $1^3 + 6^3 + 8^3 = 9^3$. The solutions to the equation $x^3 + y^3 + z^3 + w^3 = 0$ present a very interesting problem. Unfortunately we still have not obtained a formula for all the solutions. The Euler-Binet formula below provides all the rational solutions.

Theorem 9.1. The rational solutions to the equation

$$W^3 + 3W(X^2 + Y^2 + Z^2) + 6XYZ = 0$$

are given by

$$W = -6\rho abc, \qquad X = \rho a(a^2 + 3b^2 + 3c^2), \qquad Y = \rho b(a^2 + 3b^2 + 9c^2), \qquad Z = 3\rho c(a^2 + b^2 + 3c^2).$$

Here $(a, b, c) = 1$, and $\rho$ is a rational number.

Proof. We rewrite the given equation as

$$\begin{vmatrix} W & -Z & Y \\ 3Z & W & -X \\ -3Y & 3X & W \end{vmatrix} = 0,$$

so that there must be integers $a, b, c$ not all 0 and $(a, b, c) = 1$ such that

$$Wa + 3Zb - 3Yc = 0, \qquad -Za + Wb + 3Xc = 0, \qquad Ya - Xb + Wc = 0.$$

Solving these for $X, Y, Z, W$, the required result follows. □

Let

$$W = \tfrac12(\alpha + \beta + \gamma + \delta), \quad X = \tfrac12(\alpha + \beta - \gamma - \delta), \quad Y = \tfrac12(\alpha - \beta + \gamma - \delta), \quad Z = \tfrac12(\alpha - \beta - \gamma + \delta), \tag{1}$$

so that

$$(\alpha + \beta + \gamma + \delta)^3 + 3(\alpha + \beta + \gamma + \delta)\bigl[(\alpha + \beta - \gamma - \delta)^2 + (\alpha - \beta + \gamma - \delta)^2 + (\alpha - \beta - \gamma + \delta)^2\bigr] + 6(\alpha + \beta - \gamma - \delta)(\alpha - \beta + \gamma - \delta)(\alpha - \beta - \gamma + \delta) = 0,$$

or

$$\alpha^3 + \beta^3 + \gamma^3 + \delta^3 = 0. \tag{2}$$

Solving (1) we have

$$\alpha = \tfrac12(W + X + Y + Z), \quad \beta = \tfrac12(W + X - Y - Z), \quad \gamma = \tfrac12(W - X + Y - Z), \quad \delta = \tfrac12(W - X - Y + Z),$$

and the solutions to (2) can be obtained from Theorem 9.1.

Theorem 9.2. Given any positive integer $r$, there exists a number $N$ which can be represented as a sum of two cubes in $r$ ways.

Proof. Let $\xi_1, \eta_1$ be two fixed rational numbers. Set

$$X = \frac{\xi_1(\xi_1^3 + 2\eta_1^3)}{\xi_1^3 - \eta_1^3}, \qquad Y = \frac{\eta_1(2\xi_1^3 + \eta_1^3)}{\xi_1^3 - \eta_1^3},$$

$$\xi_2 = \frac{X(X^3 - 2Y^3)}{X^3 + Y^3}, \qquad \eta_2 = \frac{Y(2X^3 - Y^3)}{X^3 + Y^3},$$

so that

$$\xi_1^3 + \eta_1^3 = X^3 - Y^3 = \xi_2^3 + \eta_2^3. \tag{3}$$
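The formulas of this section can be checked with exact arithmetic. The sketch below is illustrative only: the function names are hypothetical, $\rho = 2$ is chosen so that the halves in the change of variables stay integral, and the rational starting values for the two-cube substitution are arbitrary.

```python
from fractions import Fraction

def euler_binet(a, b, c, rho=1):
    """W, X, Y, Z of Theorem 9.1 for the parameters a, b, c and the factor rho."""
    W = -6 * rho * a * b * c
    X = rho * a * (a * a + 3 * b * b + 3 * c * c)
    Y = rho * b * (a * a + 3 * b * b + 9 * c * c)
    Z = 3 * rho * c * (a * a + b * b + 3 * c * c)
    return W, X, Y, Z

def four_cubes(a, b, c):
    """Integers (alpha, beta, gamma, delta) whose cubes sum to zero."""
    W, X, Y, Z = euler_binet(a, b, c, rho=2)      # rho = 2 makes all four values even
    return ((W + X + Y + Z) // 2, (W + X - Y - Z) // 2,
            (W - X + Y - Z) // 2, (W - X - Y + Z) // 2)

def two_cube_chain(xi, eta):
    """(X, Y, xi2, eta2) with xi^3 + eta^3 = X^3 - Y^3 = xi2^3 + eta2^3."""
    X = xi * (xi ** 3 + 2 * eta ** 3) / (xi ** 3 - eta ** 3)
    Y = eta * (2 * xi ** 3 + eta ** 3) / (xi ** 3 - eta ** 3)
    xi2 = X * (X ** 3 - 2 * Y ** 3) / (X ** 3 + Y ** 3)
    eta2 = Y * (2 * X ** 3 - Y ** 3) / (X ** 3 + Y ** 3)
    return X, Y, xi2, eta2

if __name__ == "__main__":
    print(four_cubes(1, 1, 1), sum(v ** 3 for v in four_cubes(1, 1, 1)))
    X, Y, xi2, eta2 = two_cube_chain(Fraction(3), Fraction(1))
    print(X ** 3 - Y ** 3, xi2 ** 3 + eta2 ** 3)
```

With $a = b = c = 1$ the first function returns $(29, -27, -15, -11)$, that is, $29^3 = 27^3 + 15^3 + 11^3$.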
We then have
< < <±. Yfd~l
Suppose that 0
e
O<~~= Y
2Yf 1
Then 3 (Yf1)2
~1
<~(~)2 <~e2.
4 3 1 + ~ (~)
~1
4
4
2 ~1
Therefore XjY > ~d2Yf1 >
te, or YjX < 2e. Also
1 ~2 I ~(fY ~ (fY X Yf2  2 Y = 1 _
<
3(Y)
3
4X
< 2e,
so that
~ 1~2Yf2 ~I +~I~~I < 2e, 1 Yf2~2 ~I 4Yf1 2Y 2 Y 2Yf1 and hence ~2
~1
1
Yf2 <8e.
>2e>Yf2 4Yf1 8e'
~2
Continuing this way we have
< 2 e, 1 Yf3~3 ~I 4Yf2
1~4 ~I < 2 e, Yf4 4Yf3 provided that 2 (S1)e < ±. 4
7
~S+l _~I < 21+3(s1) e, I Yfs+ 4Yfs
... ,
1
3
Therefore, by taking ""(~r> Yfr) such that
Yfd~l
very small, there are pairs of numbers
(~1>
Yf1),
and the ratios
~,4~2 , ... ,4r are nearly equal. Therefore follows. 0
1
£'_
Yf1
Yf2
Yfr
~slYfs
are distinct ratios and the required result
Exercise 1. Show that rational solutions to $\alpha^3 + \beta^3 + \gamma^3 + \delta^3 = 0$ can be obtained from
$$\alpha = \sigma\left(-(\xi - 3\eta)(\xi^2 + 3\eta^2) + 1\right), \qquad \beta = \sigma\left((\xi + 3\eta)(\xi^2 + 3\eta^2) - 1\right),$$
$$\gamma = \sigma\left((\xi^2 + 3\eta^2)^2 - (\xi + 3\eta)\right), \qquad \delta = -\sigma\left((\xi^2 + 3\eta^2)^2 - (\xi - 3\eta)\right),$$
where $\sigma$, $\xi$ and $\eta$ are rational numbers. If $\sigma = 1$ and $\xi, \eta$ are integers, it then follows that $x^3 + y^3 + w^3 + z^3 = 0$ has infinitely many integer solutions. By considering $\alpha = 1$, $\beta = 12$, $\gamma = -10$, $\delta = -9$ show that this method does not give all the solutions.

Exercise 2. Verify that $y^{12} = (9x^4)^3 + (3xy^3 - 9x^4)^3 + (y^4 - 9x^3y)^3$, and hence show that $5^{12} = 9^3 + 366^3 + 580^3 = 144^3 + 606^3 + 265^3$.

Exercise 3. Use Exercise 2 to prove that there exists $n$ such that the number of nonnegative solutions to $n = x^3 + y^3 + z^3$ exceeds $\frac12 n^{1/12}$.

Exercise 4. Prove that
$$(3a^2 + 5ab - 5b^2)^3 + (4a^2 - 4ab + 6b^2)^3 + (5a^2 - 5ab - 3b^2)^3 = (6a^2 - 4ab + 4b^2)^3.$$
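The identities in Exercises 1, 2 and 4 can be spot-checked numerically before attempting a proof. This is only a sanity check on small integer values, written as a sketch; the function names are illustrative.

```python
def exercise_1(xi, eta, sigma=1):
    """(alpha, beta, gamma, delta) from the parametrization in Exercise 1."""
    q = xi * xi + 3 * eta * eta
    alpha = sigma * (-(xi - 3 * eta) * q + 1)
    beta = sigma * ((xi + 3 * eta) * q - 1)
    gamma = sigma * (q * q - (xi + 3 * eta))
    delta = -sigma * (q * q - (xi - 3 * eta))
    return alpha, beta, gamma, delta

def exercise_2(x, y):
    """The three cube roots whose cubes sum to y^12 in Exercise 2."""
    return 9 * x**4, 3 * x * y**3 - 9 * x**4, y**4 - 9 * x**3 * y

def exercise_4(a, b):
    """(left side, right side) of the identity in Exercise 4."""
    lhs = ((3 * a * a + 5 * a * b - 5 * b * b) ** 3
           + (4 * a * a - 4 * a * b + 6 * b * b) ** 3
           + (5 * a * a - 5 * a * b - 3 * b * b) ** 3)
    rhs = (6 * a * a - 4 * a * b + 4 * b * b) ** 3
    return lhs, rhs

def cube_sum(values):
    return sum(v**3 for v in values)

grid = [(u, v) for u in range(-4, 5) for v in range(-4, 5)]
```

Taking $\xi = \eta = 1$ in Exercise 1 gives $9^3 + 15^3 + 12^3 = 18^3$.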
11.10 Rational Points on a Cubic Surface

The cubic surfaces discussed in this section are nondegenerate. On dividing equation (2) in §9 by $\delta^3$ and setting $\xi = -\alpha/\delta$, $\eta = -\beta/\delta$, $\zeta = -\gamma/\delta$, we have
$$\xi^3 + \eta^3 + \zeta^3 = 1. \tag{1}$$
In other words, from our results in §9, the cubic surface (1) has infinitely many rational points. We shall generalize this to the most common cubic surfaces. Before we introduce the difficult method involved we first consider some special examples.

Theorem 10.1. Let $A, B, C$ be rational numbers with $C \ne 0$. Then the cubic surface
$$\zeta^2 = \xi^3 + A\xi + B + C\eta^2 \tag{2}$$
has infinitely many rational points.

Proof. We substitute
$$\xi = \eta^2 + T\eta, \qquad \zeta = \eta^3 + \lambda\eta^2 + \mu\eta + \nu \tag{3}$$
into (2) giving
$$(\eta^3 + \lambda\eta^2 + \mu\eta + \nu)^2 = (\eta^2 + T\eta)^3 + A(\eta^2 + T\eta) + B + C\eta^2. \tag{4}$$
On comparing the coefficients of $\eta^6, \eta^5, \eta^4, \eta^3$ we have
$$2\lambda = 3T, \qquad \lambda^2 + 2\mu = 3T^2, \qquad 2\lambda\mu + 2\nu = T^3,$$
giving
$$\lambda = \frac32 T, \qquad \mu = \frac38 T^2, \qquad \nu = -\frac{1}{16}T^3.$$
Substituting this into (4) we have the quadratic equation
$$L\eta^2 + M\eta + N = 0, \tag{5}$$
where
$$L = A + C - \mu^2 - 2\lambda\nu = A + C + \frac{3}{64}T^4, \qquad M = AT - 2\mu\nu = AT + \frac{3}{64}T^5, \qquad N = B - \nu^2 = B - \frac{1}{256}T^6.$$
The discriminant of (5) is
$$\Delta = M^2 - 4LN = \left[\left(\frac{3}{64}\right)^2 + \frac{3}{64}\cdot\frac{1}{64}\right]T^{10} + \cdots = \frac{3}{1024}T^{10} + \cdots$$
so that $\Delta$ cannot be the square of a polynomial with rational coefficients. Therefore the solutions to (5) are given by
$$\eta = \beta_1 \pm \beta_2\sqrt{\Delta}.$$
Substituting this into (3) we have
$$\xi = \alpha_1 \pm \alpha_2\sqrt{\Delta}, \qquad \zeta = \gamma_1 \pm \gamma_2\sqrt{\Delta},$$
where
$$\alpha_2 = (2\beta_1 + T)\beta_2 = \frac{LT - M}{2L^2} = \frac{CT}{2L^2} \ne 0.$$
Let
$$\frac{\xi - \alpha_1}{\alpha_2} = \frac{\eta - \beta_1}{\beta_2} = \frac{\zeta - \gamma_1}{\gamma_2} = \sigma, \tag{6}$$
and substitute this into (2). We then have a cubic equation in $\sigma$, with rational functions in $T$ as coefficients, and the leading coefficient $\alpha_2^3$ is not zero. Since we already know that $\pm\sqrt{\Delta}$ are two of the roots, the remaining root $\sigma_0$ must be a rational function of $T$. Substituting this into (6) we see that $\xi, \eta, \zeta$ can be represented by rational functions of $T$. However, we still have to prove that the $\xi, \eta, \zeta$ so obtained are not constants since otherwise we may not have infinitely many rational points. If $\eta$ is a nonzero constant, then $\xi = \eta^2 + T\eta$ is not a constant, and if $\eta = 0$,
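then, by (3), $\xi = 0$ and $\zeta = \nu = -T^3/16$ and we see from (2) that this is not possible. Therefore if we substitute $\sigma_0$ into (6), then $\xi, \eta, \zeta$ are all rational functions of $T$ and they cannot be all constants. The theorem is proved. □

Theorem 10.2. Let $f(\xi, \eta)$ be a cubic polynomial with rational coefficients which cannot be transformed into a polynomial with a single variable by a linear transformation. Then the cubic surface
$$\zeta^2 = f(\xi, \eta) \tag{7}$$
has infinitely many rational points.

Proof. Denote by $f_3(\xi, \eta)$ the homogeneous cubic part of $f(\xi, \eta)$.

1) If $f_3(\xi, 1) = 0$ has a rational root $a$ (similar method for the case of $f_3(1, \eta)$), then $f(\xi + a\eta, \eta) = g(\xi, \eta)$ does not contain an $\eta^3$ term. Therefore after the substitution $\xi \to \xi + a\eta$, $\eta \to \eta$, $\zeta \to \zeta$ the equation (7) becomes
$$\zeta^2 = L_1\eta^2 + L_2\eta + L_3, \tag{8}$$
where $L_i$ is a polynomial in $\xi$ of degree $i$. Let $L_1 = \alpha\xi + \beta$. If $\alpha \ne 0$, then we may take $\xi$ such that $\alpha\xi + \beta = \delta^2 \ne 0$, and the theorem follows from Theorem 6.3. If $\alpha = 0$ and $\beta = 0$, then (8) is linear in $\eta$ and the theorem follows by solving for $\eta$. We now suppose that $\alpha = 0$, $\beta \ne 0$ so that (8) can be written as
$$\zeta^2 = \alpha_1\xi^3 + \alpha_2\xi^2\eta + \beta_1\xi^2 + \beta_2\xi\eta + \beta\eta^2 + \cdots, \tag{9}$$
where $\cdots$ represents the linear terms in $\xi$ and $\eta$.

Suppose that $\alpha_2 \ne 0$. Let $\alpha_1\xi + \alpha_2\eta = \lambda$ so that
$$\zeta^2 = \lambda\xi^2 + \beta_1\xi^2 + \beta_2\xi\,\frac{\lambda - \alpha_1\xi}{\alpha_2} + \beta\left(\frac{\lambda - \alpha_1\xi}{\alpha_2}\right)^2 + \cdots = \left(\lambda + \beta_1 - \beta_2\alpha_1/\alpha_2 + \beta\alpha_1^2/\alpha_2^2\right)\xi^2 + \cdots.$$
We take $\lambda = 1 - \beta_1 + \beta_2\alpha_1/\alpha_2 - \beta\alpha_1^2/\alpha_2^2$, so that
$$\zeta^2 = \xi^2 + \cdots.$$
By Theorem 6.3 this surface has infinitely many rational points.

Suppose next that $\alpha_2 = 0$. Then $\alpha_1 \ne 0$, since otherwise $\zeta^2 = f(\xi, \eta)$ is not a cubic surface. Therefore
$$\zeta^2 = \alpha_1\xi^3 + \beta_1\xi^2 + \beta_2\xi\eta + \beta\eta^2 + \cdots = \beta\eta^2 + (\beta_2\xi + \gamma)\eta + f(\xi) = \beta\left(\eta + \frac{\beta_2}{2\beta}\xi + \frac{\gamma}{2\beta}\right)^2 + g(\xi).$$
Here $g(\xi)$ is a cubic polynomial in $\xi$ with leading coefficient $\alpha_1$. Replacing $\eta + (\beta_2\xi + \gamma)/2\beta$ by $\eta$ we have $\zeta^2 = \beta\eta^2 + g(\xi)$. On multiplication by $\alpha_1^2$ and the simple substitution $\xi' = \alpha_1\xi + \lambda$, $\zeta' = \alpha_1\zeta$ this equation is reduced to (2) and so the theorem follows from Theorem 10.1.

2) Suppose that $f_3(\xi, \eta)$ has no linear rational factor. The equation (7) can be written as
$$\zeta^2 = \alpha\xi^3 + f_1(\eta)\xi^2 + f_2(\eta)\xi + f_3(\eta),$$
where $f_i$ is a polynomial in $\eta$ with degree $i$. Replacing $\xi$ by $\xi - f_1(\eta)/3\alpha$ we have a new equation of the same shape with no $\xi^2$ term. If we multiply both sides by $\alpha^2$ and replace $\alpha\zeta$, $\alpha\xi$ by $\zeta$, $\xi$, then we have
$$\zeta^2 = \xi^3 + (A\eta^2 + B\eta + C)\xi + D\eta^3 + E\eta^2 + F\eta + G. \tag{10}$$
As in Theorem 10.1 we substitute
$$\xi = \eta^2 + T\eta, \qquad \zeta = \eta^3 + \lambda\eta^2 + \mu\eta + \nu \tag{11}$$
into (10) giving
$$(\eta^3 + \lambda\eta^2 + \mu\eta + \nu)^2 = (\eta^2 + T\eta)^3 + (A\eta^2 + B\eta + C)(\eta^2 + T\eta) + D\eta^3 + E\eta^2 + F\eta + G. \tag{12}$$
We choose $\lambda, \mu, \nu$ in (12) so that the coefficients for $\eta^5, \eta^4, \eta^3$ are zero, giving a quadratic equation $L\eta^2 + M\eta + N = 0$. The reader can verify that $L \ne 0$ so that we have $\eta = \beta_1 \pm \beta_2\sqrt{\Delta}$, where $\beta_1, \beta_2$ and $\Delta$ are rational functions of $T$. Substituting this into (11) we have
$$\xi = \alpha_1 \pm \alpha_2\sqrt{\Delta}, \qquad \zeta = \gamma_1 \pm \gamma_2\sqrt{\Delta}.$$
If $\Delta = 0$, then the result follows at once. If $\Delta \ne 0$, we let
$$\frac{\xi - \alpha_1}{\alpha_2} = \frac{\eta - \beta_1}{\beta_2} = \frac{\zeta - \gamma_1}{\gamma_2} = \sigma$$
and obtain from (10) a cubic equation in $\sigma$ with the nonzero leading coefficient
$$\alpha_2^3 + A\alpha_2\beta_2^2 + D\beta_2^3.$$
Two of the roots of this cubic equation are already known to be $\pm\sqrt{\Delta}$ so that its third root $\sigma_0$ is a rational function of $T$. This means that
$$\xi = \alpha_1 + \alpha_2\sigma_0, \qquad \eta = \beta_1 + \beta_2\sigma_0, \qquad \zeta = \gamma_1 + \gamma_2\sigma_0$$
lie on the cubic surface (10). That $\xi, \eta, \zeta$ are not all constants can be proved as before. □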
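The coefficient bookkeeping in the proof of Theorem 10.1 can be verified with exact arithmetic. The sketch below assumes that surface (2) has the form $\zeta^2 = \xi^3 + A\xi + B + C\eta^2$, which is the form consistent with the values of $L$ and $M$ stated in that proof; the function names and the sample values of $A, B, C, T$ are illustrative.

```python
from fractions import Fraction as F

def coefficients(A, B, C, T):
    """lambda, mu, nu chosen to cancel eta^5, eta^4, eta^3, then L, M, N of (5)."""
    lam = F(3, 2) * T
    mu = F(3, 8) * T**2
    nu = -F(1, 16) * T**3
    L = A + C - mu**2 - 2 * lam * nu
    M = A * T - 2 * mu * nu
    N = B - nu**2
    return lam, mu, nu, L, M, N

def residual(A, B, C, T, eta):
    """zeta^2 - (xi^3 + A xi + B + C eta^2) after the substitution (3)."""
    lam, mu, nu, _, _, _ = coefficients(A, B, C, T)
    xi = eta**2 + T * eta
    zeta = eta**3 + lam * eta**2 + mu * eta + nu
    return zeta**2 - (xi**3 + A * xi + B + C * eta**2)

A, B, C, T = F(2), F(-1), F(3), F(5, 7)
lam, mu, nu, L, M, N = coefficients(A, B, C, T)
```

After the substitution only a quadratic in $\eta$ survives, so `residual` should equal $-(L\eta^2 + M\eta + N)$ for every $\eta$.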
Theorem 10.3. Let $S_2(\xi, \eta, \zeta)$ and $T_2(\xi, \eta, \zeta)$ be homogeneous quadratics in $\xi, \eta, \zeta$. Then the cubic surface (13) has infinitely many rational points.

Proof. Let us denote the left hand side of (13) by $f(\xi, \eta, \zeta)$. Then we have
$$f(\xi, \eta, \zeta) = (\alpha_1 + \alpha_2\zeta)\xi^2 + (\beta_1 + \beta_2\zeta)\xi\eta + (\gamma_1 + \gamma_2\zeta)\eta^2 + g(\xi, \eta, \zeta), \tag{14}$$
where $g(\xi, \eta, \zeta)$ is linear in $\xi$ and $\eta$. In Theorem 6.3 we take $A = \alpha_1 + \alpha_2\zeta$, $B = \beta_1 + \beta_2\zeta$, $C = \gamma_1 + \gamma_2\zeta$ so that
$$B^2 - 4AC = (\beta_1 + \beta_2\zeta)^2 - 4(\alpha_1 + \alpha_2\zeta)(\gamma_1 + \gamma_2\zeta).$$
If $\alpha_2 \ne 0$ (or $\gamma_2 \ne 0$), then take $\zeta = -\alpha_1/\alpha_2$ (or $\zeta = -\gamma_1/\gamma_2$) so that $B^2 - 4AC$ is a square. If $\alpha_2 = \gamma_2 = 0$ and $\beta_2 \ne 0$, then $\delta^2 = (\beta_1 + \beta_2\zeta)^2 - 4\alpha_1\gamma_1$ also has rational roots, by Theorem 6.3.

We must also note the following. On the substitution $\zeta = -\alpha_1/\alpha_2$ into (14), all the coefficients for $\xi^2, \xi\eta, \eta^2$ may be zero. In this case, if the coefficients for $\xi$ and $\eta$ are not both zero, then $\xi$ (or $\eta$) can be represented as a rational function of $\eta$ and $-\alpha_1/\alpha_2$ (or $\xi$ and $-\alpha_1/\alpha_2$), and the theorem then clearly holds. If the coefficients for $\xi$ and $\eta$ are both zero, then
$$f(\xi, \eta, \zeta) = (\alpha_1 + \alpha_2\zeta)\left(\xi^2 + A\xi\eta + B\eta^2 + (C + D\zeta)\xi + (E + F\zeta)\eta + G + H\zeta + I\zeta^2\right) + K.$$
If we put $\zeta = 0$ in (13), then $f(\xi, \eta, 0)$ is a homogeneous quadratic in $\xi$ and $\eta$ so that $C = E = 0$ and $\alpha_1 G + K = 0$ in the above, giving

Observe that
$$(\xi + \lambda\zeta)^2 + A(\xi + \lambda\zeta)(\eta + \mu\zeta) + B(\eta + \mu\zeta)^2 + D(\xi + \lambda\zeta)\zeta + F(\eta + \mu\zeta)\zeta = \xi^2 + A\xi\eta + B\eta^2 + (2\lambda + A\mu + D)\xi\zeta + (A\lambda + 2B\mu + F)\eta\zeta + \cdots.$$
If $A^2 \ne 4B$, then we may choose $\lambda$ and $\mu$ so that
$$2\lambda + A\mu + D = 0, \qquad A\lambda + 2B\mu + F = 0.$$
We may therefore assume that $f$ has the form (15), with $g(0) = 0$. Let $\xi = X/Z^2$, $\eta = Y/Z^2$, $\zeta = 1/Z$. Then
$$X^2 + AXY + BY^2 + Z^4\left(\alpha_1 + \frac{\alpha_2}{Z}\right)g\left(\frac1Z\right) = 0.$$
Since $g(0) = 0$, it follows that $Z^4(\alpha_1 + \alpha_2/Z)g(1/Z)$ is a cubic in $Z$, and the theorem is reduced to Theorem 10.2. If $A^2 = 4B$, then we write $\xi' = \xi + A\eta/2$, $\eta' = \eta$ so that $f(\xi, \eta, \zeta)$ in (15) is linear in $\eta'$ and the theorem again follows.

Next, if $\alpha_2 = \beta_2 = \gamma_2 = 0$ and $\beta_1^2 \ne 4\alpha_1\gamma_1$, then, under the transformation $\xi \to \xi + \lambda_1\zeta + \lambda_2\zeta^2$, $\eta \to \eta + \mu_1\zeta + \mu_2\zeta^2$, (14) becomes

where $f(\zeta) = A\zeta^4 + B\zeta^3 + C\zeta^2 + D\zeta$. The further substitution $\xi = X/Z^2$, $\eta = Y/Z^2$, $\zeta = 1/Z$ gives

After a further linear substitution this can be reduced to Theorem 10.2. Finally, if $\alpha_2 = \beta_2 = \gamma_2 = 0$ and $\beta_1^2 = 4\alpha_1\gamma_1$, then the linear substitution $\xi' = \alpha_1\xi + \beta_1\eta/2$, $\eta' = \eta$, $\zeta' = \zeta$ will make the left hand side of (14) linear in $\eta'$ and so the theorem is proved. □

Theorem 10.4. If a nondegenerate cubic surface has a rational point, then it has infinitely many rational points.
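Proof. We may assume that the surface passes through the origin so that it can be written as
$$S_3(\xi, \eta, \zeta) + S_2(\xi, \eta, \zeta) + S_1(\xi, \eta, \zeta) = 0, \tag{16}$$
where $S_i(\xi, \eta, \zeta)$ are homogeneous in $\xi, \eta, \zeta$ with degree $i$.

1) If $S_1(\xi, \eta, \zeta) \equiv 0$, then $S_3(\xi, \eta, \zeta) + S_2(\xi, \eta, \zeta) = 0$, so that, on putting $\xi = \alpha\zeta$, $\eta = \beta\zeta$,
$$\zeta S_3(\alpha, \beta, 1) + S_2(\alpha, \beta, 1) = 0,$$
giving $\zeta = -S_2(\alpha, \beta, 1)/S_3(\alpha, \beta, 1)$. Observe that if $S_3(\alpha, \beta, 1) \equiv 0$, then the original surface is not a cubic, and if $S_2(\alpha, \beta, 1) \equiv 0$, then the cubic surface is a degenerate one.

2) If $S_1(\xi, \eta, \zeta) \not\equiv 0$, then under the transformation $S_1(\xi, \eta, \zeta) \to \zeta$, we have
$$S_3(\xi, \eta, \zeta) + S_2(\xi, \eta, \zeta) + \zeta = 0.$$
If $S_3(\xi, \eta, 0)$ and $S_2(\xi, \eta, 0)$ are not both identically zero, then we let $\zeta = 0$ giving
$$S_3(\xi, \eta, 0) + S_2(\xi, \eta, 0) = 0.$$
If $S_2(\xi, \eta, 0) \equiv 0$, then $S_2(\xi, \eta, \zeta) = \zeta L_1(\xi, \eta, \zeta)$. We let $Z = 1/\zeta$, $X = \xi/\zeta$, $Y = \eta/\zeta$ so that
$$S_3(X, Y, 1) + ZL_1(X, Y, 1) + Z^2 = 0,$$
which gives
$$\left(2Z + L_1(X, Y, 1)\right)^2 = L_1(X, Y, 1)^2 - 4S_3(X, Y, 1),$$
and this is included in Theorem 10.2 so that the required result follows. If $S_3(\xi, \eta, 0) \equiv 0$, we let $S_3(\xi, \eta, \zeta) = \zeta T_2(\xi, \eta, \zeta)$, and this reduces to Theorem 10.3. The theorem is proved. □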
Notes

11.1. The problem of the existence of solutions to the famous equation
$$x^2 = y^n + 1$$
has been settled by K. Chao [16]. He proved that, apart from $n = 3$, $x = \pm 3$, $y = 2$, there are no integer solutions.
Chapter 12. Binary Quadratic Forms
12.1 The Partitioning of Binary Quadratic Forms into Classes

Definition. For fixed integers $a, b, c$ the homogeneous quadratic polynomial
$$F = F(x, y) = ax^2 + bxy + cy^2$$
is called a binary quadratic form, or simply a form, and is denoted by $\{a, b, c\}$. The integer $d = b^2 - 4ac$ is called the discriminant of the form. It is easy to see that
$$d \equiv 0 \text{ or } 1 \pmod 4.$$
Theorem 1.1. A necessary and sufficient condition for $F$ to be factorized into a product of two linear forms with integer coefficients is that $d$ is a perfect square.

Proof. 1) Let $d$ be a perfect square, and $a \ne 0$. Then the equation
$$ax^2 + bx + c = a\left\{\left(x + \frac{b}{2a}\right)^2 - \frac{d}{4a^2}\right\} = 0$$
has rational roots, and therefore, by Theorem 1.13.2, the form can be factorized into a product of two linear forms with integer coefficients. If $a = 0$, then clearly $F(x, y) = (bx + cy)y$.

2) If $ax^2 + bxy + cy^2 = (rx + sy)(tx + uy)$, then
$$d = b^2 - 4ac = (st + ru)^2 - 4rt \cdot su = (st - ru)^2.$$
The theorem is proved. □
We shall assume from now on that $d$ is not a perfect square. If $d < 0$, $a > 0$, then
$$4aF(x, y) = (2ax + by)^2 - dy^2$$
and so $F(x, y) \ge 0$ for all $x, y$, and $F(x, y) = 0$ if and only if $x = y = 0$. We call such a form a positive definite form. If $d < 0$, $a < 0$, then $F \le 0$ for all $x, y$, and we call the form a negative definite form. Since a negative definite form becomes a positive definite form on multiplication by $-1$, we shall only deal with positive definite forms which we shall simply call definite forms.

If $d > 0$, then
$$F(1, 0) = a, \qquad F(b, -2a) = ab^2 - b \cdot b \cdot 2a + c \cdot 4a^2 = -da.$$
If $a \ne 0$, then the two values here have different signs. If $c \ne 0$ we can similarly choose two values which have different signs. If $a = c = 0$, then
$$F(1, 1) = b, \qquad F(1, -1) = -b$$
again have different signs. Thus when $d > 0$, the form $F(x, y)$ can take both positive and negative values, and we therefore call such a form an indefinite form.

Definition. Let the substitution with integer coefficients
$$x = rX + sY, \qquad y = tX + uY \qquad (ru - st = 1)$$
transform $F(x, y)$ into $G(X, Y)$. We say that $F$ is transformed into $G$ via $\begin{pmatrix} r & s \\ t & u \end{pmatrix}$. The two forms $F$ and $G$ are then said to be equivalent, and we write $F \sim G$ to denote this.

More specifically, let $F = \{a, b, c\}$ and $G = \{a_1, b_1, c_1\}$. Then we have
$$a_1 = ar^2 + brt + ct^2, \tag{1}$$
$$b_1 = 2ars + b(ru + st) + 2ctu = 2ars + b(1 + 2st) + 2ctu, \tag{2}$$
$$c_1 = as^2 + bsu + cu^2, \tag{3}$$
and we derive at once
$$b_1^2 - 4a_1c_1 = \left(2ars + b(ru + st) + 2ctu\right)^2 - 4(ar^2 + brt + ct^2)(as^2 + bsu + cu^2) = (b^2 - 4ac)(ru - st)^2 = b^2 - 4ac = d.$$
We see therefore that equivalent forms have the same discriminant. Also, if $d < 0$, $a > 0$, then $a_1 = F(r, t) \ge 0$. Since $a_1 = 0$ implies $r = t = 0$ which is impossible we see that $a_1 > 0$. In other words forms which are equivalent to a positive definite form are themselves positive definite.
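The transformation formulas (1) to (3) and the invariance of the discriminant can be illustrated with a few lines of code. This is a minimal sketch; the helper names and the sample form $\{2, 1, 3\}$ with the substitution $r = 2$, $s = t = u = 1$ are illustrative choices, not taken from the text.

```python
def discriminant(form):
    a, b, c = form
    return b * b - 4 * a * c

def evaluate(form, x, y):
    a, b, c = form
    return a * x * x + b * x * y + c * y * y

def transform(form, r, s, t, u):
    """Apply x = rX + sY, y = tX + uY and return {a1, b1, c1} from (1)-(3)."""
    a, b, c = form
    a1 = a * r * r + b * r * t + c * t * t
    b1 = 2 * a * r * s + b * (r * u + s * t) + 2 * c * t * u
    c1 = a * s * s + b * s * u + c * u * u
    return a1, b1, c1

F_form = (2, 1, 3)
r, s, t, u = 2, 1, 1, 1          # ru - st = 1
G_form = transform(F_form, r, s, t, u)

# G(X, Y) must agree with F(rX + sY, tX + uY) at every integer point.
same_values = all(
    evaluate(G_form, X, Y) == evaluate(F_form, r * X + s * Y, t * X + u * Y)
    for X in range(-5, 6) for Y in range(-5, 6)
)
```

Here $G = \{13, 17, 6\}$, and both forms have discriminant $-23$.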
Theorem 1.2. (i) F ~ F (reflexive). (ii) If F ~ G, then G ~ F (symmetric). (iii) If F ~ G, G ~ H, then F ~ H (transitive).
□

We omit the simple proof for this theorem. The relation of being equivalent partitions the set of forms with discriminant $d$ into classes, so that all the forms in one class are equivalent among themselves, and two forms from two different classes are not equivalent. It is clear that forms from the same class represent identical sets of integers. For if $k = G(X, Y)$, then $k = F(rX + sY, tX + uY)$.
12.2 The Finiteness of the Number of Classes

Theorem 2.1. In every class of forms there is always one which satisfies the condition
$$|b| \le |a| \le |c|.$$

Proof. Let $a$ be an integer with the least absolute value from the set of nonzero integers representable by forms in the class concerned. Let $\{a_0, b_0, c_0\}$ be any form in the class. Then there exist $r, t$ such that
$$a = a_0r^2 + b_0rt + c_0t^2$$
and $(r, t) = 1$, since otherwise $a/(r, t)^2$ is also representable by $\{a_0, b_0, c_0\}$, and $|a|/(r, t)^2 < |a|$, which is impossible. We can fix $s$ and $u$ so that $ru - st = 1$. Then $\{a_0, b_0, c_0\}$ is transformed into $\{a, b', c'\}$ via $\begin{pmatrix} r & s \\ t & u \end{pmatrix}$. Now the transformation $\begin{pmatrix} 1 & h \\ 0 & 1 \end{pmatrix}$ transforms $\{a, b', c'\}$ into $\{a, b, c\}$ where $b = 2ah + b'$. We can choose $h$ so that $|b| \le |a|$. Since $c$ is representable by $\{a, b, c\}$, and this form also belongs to the class containing $\{a_0, b_0, c_0\}$ it follows that $|c| \ge |a|$. (Note that $c \ne 0$, because $c = 0$ implies that $d$ is a perfect square.) □
Theorem 2.2. The number of classes is finite.

Proof. 1) $d > 0$ (indefinite). From Theorem 2.1 we have
$$|ac| \ge b^2 = d + 4ac > 4ac,$$
so that $ac < 0$. Also
$$4a^2 \le 4|ac| = -4ac = d - b^2 \le d$$
so that
$$|a| \le \frac{\sqrt d}{2},$$
and hence, by Theorem 2.1,
$$|b| \le \frac{\sqrt d}{2}.$$
There are therefore only finitely many possible values for $a$ and $b$. Since $c = (b^2 - d)/4a$, the required result follows.

2) $d < 0$ (definite). Assuming that $a > 0$ we have, from Theorem 2.1,
$$4a^2 \le 4ac = b^2 - d \le a^2 + |d|$$
so that
$$a \le \sqrt{\frac{|d|}{3}}.$$
As before the required result follows from Theorem 2.1. □
Theorem 2.3. The number of classes of positive definite forms with discriminant $d$ is equal to the number of sets of integers $a, b, c$ satisfying
$$b^2 - 4ac = d, \qquad -a < b \le a < c \quad \text{or} \quad 0 \le b \le a = c. \tag{I}$$

Proof. 1) By Theorem 2.1 there is, in any class, always a form which satisfies
$$|b| \le a \le c$$
(because $a, c$ are positive). Compared with (I), this leaves the following extra forms to deal with:
$$-a = b, \quad a < c \qquad \text{and} \qquad -a \le b < 0, \quad a = c.$$
We now prove that
$$\{a, -a, c\} \sim \{a, a, c\} \qquad \text{and} \qquad \{a, -b, a\} \sim \{a, b, a\}.$$
Since $\{a, -a, c\}$ is transformed into $\{a, a, c\}$ via $\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$, and $\{a, -b, a\}$ is transformed into $\{a, b, a\}$ via $\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$, we see that, in any class, there is always a form which satisfies (I).

2) We next prove that no two distinct forms satisfying (I) are equivalent. That is, if $\{a, b, c\} \sim \{a', b', c'\}$ and both satisfy (I), then $a = a'$, $b = b'$, $c = c'$. We can assume that $a' \le a$. Let $\{a, b, c\}$ be transformed into $\{a', b', c'\}$ via $\begin{pmatrix} r & s \\ t & u \end{pmatrix}$. Then
$$a' = ar^2 + brt + ct^2, \tag{2}$$
$$b' = 2ars + b(ru + st) + 2ctu. \tag{3}$$
From the former we have
$$a \ge a' \ge ar^2 - a|rt| + at^2 = a(|r| - |t|)^2 + a|rt| \ge a|rt|, \tag{4}$$
that is $|rt| \le 1$. If $|rt| = 1$, then $a = a'$. Otherwise $rt = 0$, and then
$$a \ge a' = ar^2 + ct^2 \ge a$$
so that $a = a'$ also.

Suppose first that $c > a$. Then $t$ must be zero, since otherwise from $ct^2 > at^2$ and (4) we deduce that $a > a$ which is impossible. Therefore $t = 0$, $ru = 1$. Now from (3), we have
$$b' = 2ars + b \equiv b \pmod{2a}.$$
Since $-a < b \le a$ and $-a = -a' < b' \le a' = a$ we arrive at $b = b'$, and hence $c = c'$ at once. The same conclusion can be obtained if we assume that $c' > a'$ $(= a)$. It remains to consider the case $a = a' = c = c'$. Here we must have $b = \pm b'$, and from $b \ge 0$ and $b' \ge 0$ we arrive at $b = b'$. □
Note. The case of the indefinite forms is not this easy.
Definition. We call a form which satisfies (I) a reduced form.

Exercise 1. Verify the following table of all the reduced forms for $0 < -d \le 20$.

    d      reduced forms {a, b, c}
    -3     {1, 1, 1}
    -4     {1, 0, 1}
    -7     {1, 1, 2}
    -8     {1, 0, 2}
    -11    {1, 1, 3}
    -12    {1, 0, 3}, {2, 2, 2}
    -15    {1, 1, 4}, {2, 1, 2}
    -16    {1, 0, 4}, {2, 0, 2}
    -19    {1, 1, 5}
    -20    {1, 0, 5}, {2, 2, 3}

Exercise 2. Prove that when $d = -48$ there are four reduced forms:
$$\{1, 0, 12\}, \quad \{2, 0, 6\}, \quad \{3, 0, 4\}, \quad \{4, 4, 4\}.$$
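Condition (I) together with the bound $3a^2 \le |d|$ (from $|d| = 4ac - b^2 \ge 4a^2 - a^2$) makes the reduced forms of a given negative discriminant easy to enumerate. The following sketch lists them by direct search; the function name `reduced_forms` is illustrative.

```python
def reduced_forms(d):
    """All reduced forms (a, b, c) of discriminant d < 0, using condition (I)."""
    forms = []
    a = 1
    while 3 * a * a <= -d:
        for b in range(-a + 1, a + 1):        # -a < b <= a
            numerator = b * b - d             # must equal 4ac
            if numerator % (4 * a) != 0:
                continue
            c = numerator // (4 * a)
            # (I): either a < c, or a = c with b >= 0
            if c > a or (c == a and b >= 0):
                forms.append((a, b, c))
        a += 1
    return forms

table = {d: reduced_forms(d) for d in (-3, -4, -7, -8, -11, -12, -15, -16, -19, -20)}
```

The enumeration includes imprimitive forms such as $\{2, 2, 2\}$ and $\{4, 4, 4\}$, exactly as in the two exercises.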
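12.3 Kronecker's Symbol

Definition. Let $m > 0$, $d \equiv 0$ or $1 \pmod 4$ and $d$ not a perfect square. The Kronecker's symbol $\left(\frac{d}{m}\right)$ is defined by
$$\left(\frac{d}{p}\right) = 0, \quad \text{if } p \mid d;$$
$$\left(\frac{d}{2}\right) = \begin{cases} 1, & \text{if } d \equiv 1 \pmod 8, \\ -1, & \text{if } d \equiv 5 \pmod 8; \end{cases}$$
$$\left(\frac{d}{p}\right) = \text{Legendre's symbol} \quad (p \text{ odd prime}, \ p \nmid d).$$
If $m = \prod_{r=1}^{n} p_r$ where $p_r$ are primes, then
$$\left(\frac{d}{m}\right) = \prod_{r=1}^{n} \left(\frac{d}{p_r}\right).$$
The following are very easy to prove:

(i) If $(d, m) > 1$, then $\left(\frac{d}{m}\right) = 0$.

(ii) If $(d, m) = 1$, then $\left(\frac{d}{m}\right) = \pm 1$.

(iii) If $m_1 > 0$, $m_2 > 0$, then $\left(\frac{d}{m_1 m_2}\right) = \left(\frac{d}{m_1}\right)\left(\frac{d}{m_2}\right)$.

Theorem 3.1. If $m > 0$, $(m, d) = 1$, then the Kronecker's symbol is given by
$$\left(\frac{d}{m}\right) = \begin{cases} \left(\dfrac{m}{|d|}\right), & \text{when } d \text{ is odd}, \\[2ex] \left(\dfrac{2}{m}\right)^b (-1)^{\frac{u-1}{2}\cdot\frac{m-1}{2}} \left(\dfrac{m}{|u|}\right), & \text{when } d = 2^b u, \ 2 \nmid u. \end{cases}$$
Here $\left(\frac{m}{|d|}\right)$, $\left(\frac{2}{m}\right)$, $\left(\frac{m}{|u|}\right)$ are all Jacobi symbols.

Proof. 1) Let $d$ be odd. From the definition of the Kronecker's symbol and Theorem 3.6.5 we have
$$\left(\frac{d}{m}\right) = \left(\frac{m}{|d|}\right).$$
2) Let $d = 2^b u$, $2 \nmid u$. Then $b \ge 2$, and $m$ is odd, so that
$$\left(\frac{d}{m}\right) = \left(\frac{2}{m}\right)^b \left(\frac{u}{m}\right) = \left(\frac{2}{m}\right)^b (-1)^{\frac{u-1}{2}\cdot\frac{m-1}{2}} \left(\frac{m}{|u|}\right). \qquad □$$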
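From this theorem we deduce that
$$\left(\frac{d}{m}\right) = \left(\frac{d}{n}\right) \quad \text{whenever } m > 0, \ n > 0 \text{ and } m \equiv n \pmod{|d|}.$$
Therefore we have:

Theorem 3.2. The Kronecker's symbol $\left(\frac{d}{m}\right)$ is a real character mod $|d|$.

Theorem 3.3. Suppose that $m > 0$, $n > 0$ and $m \equiv -n \pmod{|d|}$. Then
$$\left(\frac{d}{m}\right) = \begin{cases} \left(\dfrac{d}{n}\right), & \text{if } d > 0, \\[2ex] -\left(\dfrac{d}{n}\right), & \text{if } d < 0. \end{cases}$$

Proof. Since
$$\left(\frac{d}{m}\right) = \left(\frac{d}{n}\right)\left(\frac{d}{|d| - 1}\right),$$
it follows from Theorem 3.1 that, when $d$ is odd,
$$\left(\frac{d}{|d| - 1}\right) = \left(\frac{|d| - 1}{|d|}\right) = \left(\frac{-1}{|d|}\right) = (-1)^{\frac{|d| - 1}{2}} = \begin{cases} 1, & \text{if } d > 0, \\ -1, & \text{if } d < 0. \end{cases}$$
When $d$ is even, we let $d = 2^b u$, $2 \nmid u$, $b \ge 2$. Then, from Theorem 3.1, we have
$$\left(\frac{d}{|d| - 1}\right) = \left(\frac{2}{|d| - 1}\right)^b (-1)^{\frac{u-1}{2}\cdot\frac{|d| - 2}{2}} \left(\frac{|d| - 1}{|u|}\right) = (-1)^{\frac{u-1}{2} + \frac{|u| - 1}{2}} = \begin{cases} 1, & \text{if } d > 0, \\ -1, & \text{if } d < 0. \end{cases}$$
The theorem is proved. □

Theorem 3.4. Let $k > 0$ and $(d, k) = 1$. The number of solutions to the congruence
$$x^2 \equiv d \pmod{4k} \tag{1}$$
is equal to
$$2\sum_{f \mid k} \left(\frac{d}{f}\right),$$
where the sum is over all positive squarefree divisors $f$ of $k$.

If $x$ is a solution to (1) then so is $x + 2k$. Hence, by the theorem, the number of solutions to
$$x^2 \equiv d \pmod{4k}, \qquad 0 \le x < 2k$$
is equal to
$$\sum_{f \mid k} \left(\frac{d}{f}\right).$$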
Proof. 1) Let $d$ be odd, so that $d \equiv 1 \pmod 4$ and $(d, 4k) = 1$. From Theorem 3.5.1 we know that the number of solutions to the congruence $x^2 \equiv d \pmod{p^l}$ is
$$\begin{cases} 2, & \text{if } p = 2, \ l = 2, \\ 2\left(1 + \left(\dfrac{d}{2}\right)\right), & \text{if } p = 2, \ l > 2, \\ 1 + \left(\dfrac{d}{p}\right), & \text{if } p > 2. \end{cases}$$
From Theorem 2.8.1 we now see that the number of solutions to (1) is
$$2\prod_{p \mid k}\left(1 + \left(\frac{d}{p}\right)\right) = 2\sum_{f \mid k}\left(\frac{d}{f}\right).$$

2) Let $d$ be even, so that $d \equiv 0 \pmod 4$, and hence $k$ is odd. The congruence $x^2 \equiv d \equiv 0 \pmod 4$ has two solutions, and the congruence $x^2 \equiv d \pmod{p^l}$ has $1 + \left(\frac{d}{p}\right)$ solutions. Therefore, by Theorem 2.8.1, the number of solutions to (1) is
$$2\prod_{p \mid k}\left(1 + \left(\frac{d}{p}\right)\right) = 2\sum_{f \mid k}\left(\frac{d}{f}\right). \qquad □$$
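The definition of the Kronecker symbol translates directly into code, and Theorem 3.4 can then be compared with a brute-force count. This is a sketch only: it factors $m$ by trial division and evaluates the Legendre symbol by Euler's criterion, and all function names are illustrative.

```python
from math import gcd

def prime_factors(n):
    """Prime factors of n > 0 with multiplicity, by trial division."""
    factors, p = [], 2
    while p * p <= n:
        while n % p == 0:
            factors.append(p)
            n //= p
        p += 1
    if n > 1:
        factors.append(n)
    return factors

def kronecker(d, m):
    """Kronecker symbol (d/m) for m > 0 and d = 0 or 1 (mod 4), from the definition."""
    result = 1
    for p in prime_factors(m):
        if d % p == 0:
            return 0
        if p == 2:
            result *= 1 if d % 8 == 1 else -1
        else:
            # Euler's criterion: the power is 1 for residues and p - 1 otherwise
            result *= 1 if pow(d % p, (p - 1) // 2, p) == 1 else -1
    return result

def squarefree_divisors(k):
    divisors = [1]
    for p in sorted(set(prime_factors(k))):
        divisors += [f * p for f in divisors]
    return divisors

def count_roots(d, k):
    """Number of x mod 4k with x^2 = d (mod 4k), by brute force."""
    return sum(1 for x in range(4 * k) if (x * x - d) % (4 * k) == 0)

def theorem_3_4(d, k):
    return 2 * sum(kronecker(d, f) for f in squarefree_divisors(k))

def check(d, limit):
    """Compare both counts for every k < limit that is coprime to d."""
    return all(count_roots(d, k) == theorem_3_4(d, k)
               for k in range(1, limit) if gcd(d, k) == 1)
```

For example, with $d = -4$ and $k = 5$ both counts are 4, the solutions being $x \equiv 4, 6, 14, 16 \pmod{20}$.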
12.4 The Number of Representations of an Integer by a Form

Definition. If $(a, b, c) = 1$, then we call $\{a, b, c\}$ a primitive form. If $(a, b, c) = g > 1$, then we say that $\{a, b, c\}$ is imprimitive. Clearly
$$\left\{\frac{a}{g}, \frac{b}{g}, \frac{c}{g}\right\}$$
is a primitive form with discriminant $d/g^2$. Also, if
$$\{a, b, c\} \sim \{a_1, b_1, c_1\}$$
then the two forms are either both primitive or both imprimitive.

We denote by $h(d)$ the number of classes of primitive forms with discriminant $d$. Clearly the number of classes of forms with discriminant $d$ is equal to
$$\sum_{g} h\left(\frac{d}{g^2}\right),$$
where $g$ runs over the positive integers for which $d/g^2$ is again a discriminant.

From each class of primitive forms we select a representative (for definite forms we consider the primitive positive definite forms) giving a representative system which we denote by
$$F_1, \ldots, F_{h(d)}.$$

Theorem 4.1. Let $k > 0$, $(k, d) = 1$, and denote by $\psi(k)$ the total number of primary solutions to
$$k = F_1(x, y), \quad \ldots, \quad k = F_{h(d)}(x, y).$$
Then
$$\psi(k) = w\sum_{n \mid k}\left(\frac{d}{n}\right).$$
(For the definitions of primary solution and $w$, see §4 in the previous chapter.)

Proof. We begin by considering the solutions to the congruence
$$l^2 \equiv d \pmod{4k}, \qquad 0 \le l < 2k.$$
For a given solution $l$ we can determine an integer $m$ from $l^2 - 4km = d$. This then gives a form $\{k, l, m\}$ which is easily seen to be primitive and with discriminant $d$. Therefore $\{k, l, m\}$ is equivalent to one and only one $F_i$. Also, from Theorem 11.4.3, we know that there are $w$ proper primary solutions corresponding to each $l$. Therefore the total number of proper primary solutions to
$$k = F_1(x, y), \quad \ldots, \quad k = F_{h(d)}(x, y)$$
is
$$w\sum_{f \mid k}\left(\frac{d}{f}\right).$$
Also the total number of primary solutions is
$$\psi(k) = w\sum_{\substack{g^2 \mid k \\ g > 0}} \ \sum_{f \mid \frac{k}{g^2}} \left(\frac{d}{f}\right)$$
(since $(k, d) = 1$, so that $(k/g^2, d) = 1$). Since $(g^2, d) = 1$ it follows that
$$\psi(k) = w\sum_{\substack{g^2 \mid k \\ g > 0}} \ \sum_{f \mid \frac{k}{g^2}} \left(\frac{d}{fg^2}\right) = w\sum_{n \mid k}\left(\frac{d}{n}\right).$$
(This is because any integer $n$ can be written as $fg^2$ where $f$ is squarefree and $g > 0$. Also $g^2 \mid k$, $f \mid (k/g^2)$ and $n \mid k$ are equivalent.) □

Consider now the following application of the theorem. It is easy to prove that $h(-4) = 1$ so that $\psi(k)$ is the number of solutions to $k = x^2 + y^2$. Therefore:

Theorem 4.2. The number of solutions to $x^2 + y^2 = k$ is equal to four times the difference between the number of divisors of $k$ which are congruent to 1 (mod 4) and the number which are congruent to 3 (mod 4). □

This agrees completely with Theorem 6.7.5.

Exercise 1. Let $m$ be odd. The number of solutions to $x^2 + 2y^2 = 2^l m$ is $2\sigma$ where $\sigma$ is the difference between the number of divisors of $m$ which are congruent to 1 or 3 (mod 8) and the number which are congruent to 5 or 7 (mod 8).

Exercise 2. The number of solutions to $x^2 + xy + y^2 = k$ is $6E(k)$ where $E(k)$ is the number of divisors of $k$ of the form $3h + 1$ minus the number of divisors of the form $3h + 2$.

Exercise 3. Let $m$ be odd and consider the number of solutions to the equation $x^2 + 3y^2 = 2^l m$. If $l$ is odd, then this number is zero; if $l = 0$, then this number is $2E(m)$; if $l$ is positive and even, then this number is $6E(m)$. Here $E(m)$ has the same definition as earlier.

Exercise 4. If $m$ is odd, then the equation $x^2 + 3y^2 = 4m$ has $E(m)$ positive odd solutions.

Exercise 5. Let $m$ be odd and consider the number of solutions to the equation $x^2 + 4y^2 = 2^k m$. When $k = 0$, this number is $2E$; when $k = 1$, this number is 0; when $k \ge 2$, this number is $4E$. Here $E$ is the number of divisors of $m$ congruent to 1 (mod 4) minus the number of divisors of $m$ congruent to 3 (mod 4).

Exercise 6. Denote by $e(n)$ the number of divisors of $n$ congruent to 1, 2, 4 (mod 7) minus the number of those congruent to 3, 5, 6 (mod 7). The number of solutions to $x^2 + xy + 2y^2 = n > 0$ is then $2e(n)$.

Exercise 7. If $m$ is odd, then $e(2^a m) = (a + 1)e(m)$. Let $3 \nmid t$. If $b$ is odd, then $e(3^b t) = 0$ and if $b$ is even, then $e(3^b t) = e(t)$.

Exercise 8. Let $m$ be positive and odd. The numbers of solutions to $m = x^2 + 7y^2$ and $2m = x^2 + 7y^2$ are $2e(m)$ and 0 respectively. The number of solutions to $4k = x^2 + 7y^2$ is $2e(k)$.

Exercise 9. Let $m$ be positive and odd. Then there are $e(m)$ positive integer solutions to $x^2 + 7y^2 = 8m$.

Exercise 10. The number of solutions to $x^2 + xy + 3y^2 = m > 0$ is twice the difference between the number of divisors of $m$ congruent to 1, 3, 4, 5, 9 (mod 11) and the number of those congruent to 2, 6, 7, 8, 10 (mod 11).
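Divisor-count formulas of this kind are convenient to test by exhaustive search. The sketch below counts representations by a positive definite form directly and compares the result with Theorem 4.2 and with Exercises 2 and 6; the function names are illustrative, and the search bounds come from $4aF = (2ax + by)^2 + |d|y^2$ and its analogue with $a$ and $c$ exchanged.

```python
from math import isqrt

def count_representations(a, b, c, k):
    """Number of integer pairs (x, y) with a x^2 + b x y + c y^2 = k (d < 0, a > 0)."""
    d = b * b - 4 * a * c
    y_max = isqrt(4 * a * k // -d)     # |d| y^2 <= 4 a k
    x_max = isqrt(4 * c * k // -d)     # |d| x^2 <= 4 c k
    return sum(
        1
        for x in range(-x_max, x_max + 1)
        for y in range(-y_max, y_max + 1)
        if a * x * x + b * x * y + c * y * y == k
    )

def divisor_difference(k, modulus, plus, minus):
    """(# divisors of k in the residue set plus) - (# divisors in the set minus)."""
    total = 0
    for n in range(1, k + 1):
        if k % n == 0:
            if n % modulus in plus:
                total += 1
            elif n % modulus in minus:
                total -= 1
    return total

def theorem_4_2(k):
    return 4 * divisor_difference(k, 4, {1}, {3})
```

For $k = 25$ the formula gives $4 \cdot 3 = 12$, matching the twelve points $(\pm 5, 0)$, $(0, \pm 5)$, $(\pm 3, \pm 4)$, $(\pm 4, \pm 3)$.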
12.5 The Equivalence of Forms mod q

Let $q$ be a prime number. Suppose that there is a substitution with integer coefficients
$$x = rX + sY, \qquad y = tX + uY, \qquad (ru - st, q) = 1 \tag{1}$$
such that
$$ax^2 + bxy + cy^2 \equiv a_1X^2 + b_1XY + c_1Y^2 \pmod q. \tag{2}$$
Then we say that the two forms $\{a, b, c\}$ and $\{a_1, b_1, c_1\}$ are equivalent mod $q$. If we denote by $d$ and $d_1$ the discriminants for $\{a, b, c\}$ and $\{a_1, b_1, c_1\}$, then clearly
$$d_1 \equiv d(ru - st)^2 \pmod q. \tag{3}$$
From (3) we see that if $\{a, b, c\}$ and $\{a_1, b_1, c_1\}$ are equivalent mod $p$, then
$$\left(\frac{d}{p}\right) = \left(\frac{d_1}{p}\right).$$

Let us take $q$ to be a prime $p > 2$. Suppose that the discriminant of $\{a, b, c\}$ is $d$ where $p \nmid d$. Then $\{a, b, c\}$ must be equivalent mod $p$ to a form $\{a_1, 0, c_1\}$. This is because $p \nmid (a, b, c)$, and if $p \nmid a$ then letting
$$X \equiv x + \frac{b}{2a}y, \qquad Y \equiv y \pmod p$$
we have
$$ax^2 + bxy + cy^2 \equiv a\left(x + \frac{b}{2a}y\right)^2 - \frac{d}{4a}y^2 \equiv aX^2 - \frac{d}{4a}Y^2 \pmod p,$$
and similarly if $p \nmid c$; if $p \mid (a, c)$, then taking $x = X + Y$, $y = X - Y$ we have
$$ax^2 + bxy + cy^2 \equiv bxy \equiv bX^2 - bY^2 \pmod p.$$
Therefore we can assume from now on that $p \mid b$ and $p \nmid ac$.

Lemma 1. If $p \nmid ac$, then there are $x, y$ such that $ax^2 + cy^2 \equiv 1 \pmod p$.

Proof. Let $x, y$ run over $0, 1, \ldots, p - 1$ separately. Then $ax^2$ and $1 - cy^2$ separately take $(p + 1)/2$ distinct values. Therefore there are $x, y$ such that
$$ax^2 \equiv 1 - cy^2 \pmod p$$
as required. □
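The pigeonhole argument of Lemma 1 is constructive enough to turn into a search. The sketch below tabulates the values $1 - cy^2$ and looks for a matching $ax^2$; the function name `solve_lemma_1` is illustrative.

```python
def solve_lemma_1(a, c, p):
    """Find (x, y) with a x^2 + c y^2 = 1 (mod p), for an odd prime p not dividing ac."""
    # Map each value of 1 - c y^2 (mod p) to one y that attains it.
    targets = {}
    for y in range(p):
        targets.setdefault((1 - c * y * y) % p, y)
    # The two value sets each have (p + 1)/2 elements, so they must overlap.
    for x in range(p):
        y = targets.get((a * x * x) % p)
        if y is not None:
            return x, y
    return None

def lemma_holds(p):
    """Check the lemma for every pair a, c not divisible by p."""
    for a in range(1, p):
        for c in range(1, p):
            solution = solve_lemma_1(a, c, p)
            if solution is None:
                return False
            x, y = solution
            if (a * x * x + c * y * y) % p != 1:
                return False
    return True
```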
Let $1 \equiv ar^2 + ct^2 \pmod p$ and let $s, u$ be any pair of integers satisfying $p \nmid ru - st$. With $s, u$ fixed, we let
$$b_1 \equiv 2ars + 2ctu, \qquad c_1 \equiv as^2 + cu^2 \pmod p$$
so that $\{a, 0, c\} \sim \{1, b_1, c_1\}$ mod $p$. If $d_1$ is the discriminant of the second form, then from our discussions we have
$$\{1, b_1, c_1\} \sim \left\{1, 0, -\frac{d_1}{4}\right\} \sim \{1, 0, -d_1\} \pmod p.$$
Summarizing we have:

Theorem 5.1. Let the discriminant of $\{a, b, c\}$ be $d$, and $p > 2$, $p \nmid d$. Let $r$ be any quadratic nonresidue mod $p$. Then
$$\{a, b, c\} \sim \{1, 0, -1\} \sim \{0, 1, 0\} \pmod p$$
if $\left(\frac{d}{p}\right) = 1$, and
$$\{a, b, c\} \sim \{1, 0, -r\} \pmod p$$
if $\left(\frac{d}{p}\right) = -1$. Also $\{1, 0, -1\}$ and $\{1, 0, -r\}$ cannot be equivalent mod $p$. □

Corollary. If $p$ is an odd prime that does not divide $d$, then any two forms with discriminant $d$ must be equivalent mod $p$. □

When $q = 2$ and the forms have odd discriminants we have:
Theorem 5.2. Any form with an odd discriminant must be equivalent mod 2 to exactly one of the following: $\{0, 1, 0\}$, $\{1, 1, 1\}$. More specifically, we have
$$\{a, b, c\} \sim \{0, 1, 0\} \pmod 2 \quad \text{if } 2 \mid ac;$$
$$\{a, b, c\} \sim \{1, 1, 1\} \pmod 2 \quad \text{if } 2 \nmid ac.$$

Proof. Since $2 \nmid d$ it follows that $2 \nmid b$. Consequently if $2 \nmid ac$, then
$$ax^2 + bxy + cy^2 \equiv x^2 + xy + y^2 \pmod 2;$$
if $2 \mid ac$, then either $2 \mid a$ or $2 \mid c$. But if $2 \mid a$ then
$$ax^2 + bxy + cy^2 \equiv xy + cy^2 \equiv y(x + cy) \pmod 2$$
so that $\{a, b, c\} \sim \{0, 1, 0\}$ (mod 2), and similarly if $2 \mid c$. Finally $\{0, 1, 0\}$ and $\{1, 1, 1\}$ cannot be equivalent mod 2 so that the theorem is proved. □

Corollary. Any two forms with the same odd discriminant must be equivalent mod 2. □

We next consider the case when $p$ divides the discriminant of the forms.

Lemma 2. Let $n$ be any given integer. Then there are two integers $x, y$ such that $(x, y) = 1$ and $(F(x, y), n) = 1$.

Proof. Let $q$ be any prime number. Since $F(x, y)$ is a primitive form, $q \nmid (a, b, c)$. If $q \nmid a$, then $q \nmid F(1, 0)$; if $q \nmid c$, then $q \nmid F(0, 1)$; if $q \mid (a, c)$ and $q \nmid b$, then $q \nmid F(1, 1)$. Therefore the lemma follows if $n = q$.

Let $q_1, \ldots, q_t$ be all the distinct prime divisors of $n$. From the above, there are integers $x_i, y_i$ such that $q_i \nmid F(x_i, y_i)$. From the Chinese remainder theorem there are two integers $X, Y$ such that
$$X \equiv x_i \pmod{q_i}, \qquad Y \equiv y_i \pmod{q_i}, \qquad i = 1, 2, \ldots, t.$$
Clearly we have
$$(F(X, Y), n) = 1.$$
Now let $x = X/(X, Y)$, $y = Y/(X, Y)$. Then $(x, y) = 1$ and
$$(F(x, y), n) = 1. \qquad □$$

Consider now $p > 2$, $p \mid d$ where $d$ is the discriminant of the form $\{a, b, c\}$. Since $p \nmid (a, c)$ we may assume that $p \nmid a$. It is easily seen that
$$\{a, b, c\} \sim \{a, 0, 0\} \pmod p.$$
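Theorem 5.3. Let $p > 2$ and let the forms $\{a, b, c\}$ and $\{a_1, b_1, c_1\}$ have discriminants $d$ and $d_1$ respectively where $p \mid d$, $p \mid d_1$. A necessary and sufficient condition for $\{a, b, c\}$ and $\{a_1, b_1, c_1\}$ to be equivalent mod $p$ is that
$$\left(\frac{k}{p}\right) = \left(\frac{k_1}{p}\right),$$
where $k$ and $k_1$ are any integers representable by $\{a, b, c\}$ and $\{a_1, b_1, c_1\}$ respectively and satisfying $(k, d) = 1$, $(k_1, d_1) = 1$.

Proof. That $k$ and $k_1$ exist follows from Lemma 2. Let $k \equiv ax^2 + bxy + cy^2 \pmod p$, $(k, p) = 1$. Then
$$4ak \equiv (2ax + by)^2 - dy^2 \equiv (2ax + by)^2 \pmod p.$$
Thus $\left(\frac{k}{p}\right)$ is constant and is equal to $\left(\frac{a}{p}\right)$. Suppose now that $\{a, b, c\}$ and $\{a_1, b_1, c_1\}$ are equivalent mod $p$. Then, from the definition of equivalence,
$$\left(\frac{k}{p}\right) = \left(\frac{a}{p}\right) = \left(\frac{a_1}{p}\right) = \left(\frac{k_1}{p}\right).$$
Conversely, if $\left(\frac{k}{p}\right) = \left(\frac{k_1}{p}\right)$ then
$$\left(\frac{a}{p}\right) = \left(\frac{a_1}{p}\right)$$
so that there is an integer $z$ such that
$$a \equiv a_1z^2 \pmod p$$
and hence
$$\{a, b, c\} \sim \{a, 0, 0\} \sim \{a_1, 0, 0\} \sim \{a_1, b_1, c_1\} \pmod p. \qquad □$$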
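The key observation in this proof, that the Legendre symbol of a represented value coprime to $p$ does not depend on the value when $p$ divides the discriminant, can be seen on a small example. The sketch below uses the two forms $\{1, 1, 4\}$ and $\{2, 1, 2\}$ of discriminant $-15$ with $p = 3$ and $p = 5$; these sample forms and the function names are illustrative choices.

```python
def legendre(k, p):
    """Legendre symbol (k/p) for an odd prime p, via Euler's criterion."""
    t = pow(k % p, (p - 1) // 2, p)
    return -1 if t == p - 1 else t

def character_values(form, p, bound=8):
    """Set of symbols (k/p) over represented values k not divisible by p."""
    a, b, c = form
    values = set()
    for x in range(-bound, bound + 1):
        for y in range(-bound, bound + 1):
            k = a * x * x + b * x * y + c * y * y
            if k % p != 0:
                values.add(legendre(k, p))
    return values
```

Each call returns a single value, equal to $\left(\frac{a}{p}\right)$, so by the theorem the two forms are inequivalent mod 3 and mod 5.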
It remains to consider the situation when $p = 2$ and $2 \mid d$. We first introduce the following symbols:
$$\delta(k) = (-1)^{\frac{k-1}{2}}, \quad \text{if } \frac{d}{4} \equiv 0 \text{ or } 3 \pmod 4;$$
$$\varepsilon(k) = (-1)^{\frac{k^2-1}{8}}, \quad \text{if } \frac{d}{4} \equiv 0 \text{ or } 2 \pmod 8;$$
$$\delta(k)\varepsilon(k) = (-1)^{\frac{k-1}{2} + \frac{k^2-1}{8}}, \quad \text{if } \frac{d}{4} \equiv 0 \text{ or } 6 \pmod 8;$$
where $k$ is an odd integer representable by $\{a, b, c\}$. Since $2 \mid d$ implies $2 \mid b$ we shall assume that $b = 0$ and consider $d = -4ac$.

Theorem 5.4. A necessary and sufficient condition for two forms satisfying $\frac{d}{4} \equiv 3 \pmod 4$ to be equivalent mod 4 is that they should have the same $\delta$.

Proof. Since $d = -4ac$, it follows that $ac \equiv 1 \pmod 4$, that is $a \equiv c \pmod 4$. If $2 \nmid k$ and $k$ is representable as
$$k = ax^2 + cy^2$$
then, since $x, y$ must have opposite parity, it follows that $k \equiv a \pmod 4$ and hence $\delta(k) = \delta(a)$. The theorem can easily be deduced from this. □

The same method can be used to prove the following theorems:

Theorem 5.5. A necessary and sufficient condition for two forms satisfying $\frac{d}{4} \equiv 2 \pmod 8$ to be equivalent mod 8 is that they should have the same $\varepsilon$. □

Theorem 5.6. A necessary and sufficient condition for two forms satisfying $\frac{d}{4} \equiv 6 \pmod 8$ to be equivalent mod 8 is that they should have the same $\delta\varepsilon$. □

Theorem 5.7. A necessary and sufficient condition for two forms satisfying $\frac{d}{4} \equiv 0 \pmod 4$ to be equivalent mod 4 is that they should have the same $\delta$. □

Theorem 5.8. A necessary and sufficient condition for two forms satisfying $\frac{d}{4} \equiv 0 \pmod 8$ to be equivalent mod 8 is that they should have the same $\delta$ and $\varepsilon$. □

Exercise 1. Any two forms satisfying $\frac{d}{4} \equiv 2 \pmod 4$ are equivalent mod 4.

Exercise 2. Any two forms satisfying $\frac{d}{4} \equiv 1 \pmod 4$ are equivalent mod 4.

Exercise 3. Any forms satisfying $\frac{d}{4} \equiv 1 \pmod 4$ must be equivalent mod 8 to exactly one of

Deduce also that any two forms with the same discriminant $d$ which satisfies $\frac{d}{4} \equiv 1 \pmod 4$ must be equivalent mod 8.

Exercise 4. Let $q$ be any positive integer. A necessary and sufficient condition for two quadratic forms to be equivalent mod $q$ is that they have the same character system (see Definition 1 in the next section).
12.6 The Character System for a Quadratic Form and the Genus It follows at once from the definitions that any two quadratic forms which are equivalent are also equivalent mod q for any q.
Definition 1. Let PI>' .. ,Ps be the odd prime divisors of d. If (k, 2d) = 1 and k is representable by F(x,y) then, from the previous section, we see that
(~) , J(k), e(k), J(k)e(k)
(1)
do not depend on k. We call them the character system for F(x,y). Since two equivalent quadratic forms have the same character system we can speak of the character system of an equivalence class of forms. Definition 2. If two quadratic forms with the same discriminant d have the same values for each of the characters, then we say that they belong to the same genus. It is easily seen that a genus is formed from various equivalence classes offorms. We shall prove that each genus has the same number of equivalence classes. Since this fact falls more naturally in the study of ideals in a quadratic field we do not give the proof here. The importance of the notion of genus comes from the discussion of the representation of integers by quadratic forms. Let F(x, y) be a fixed quadratic primitive form. We now discuss the Diophantine equation k = F(x,y).
(2)
If h(d) = 1, then this problem can be solved with Theorem 4.1. But if h(d) ¥ 1, then we only have certain incomplete results from Theorem 4.1. For example if ljJ(k) = 0, then (2) has no solutions; but if ljJ(k) ¥ 0, is (2) soluble then? If it is soluble, then how many solutions are there? These questions cannot be answered by Theorem 4.1. The introduction of the notion of genus helps partly to answer these questions. Example 1. d
=  96. There are four positive definite reduced primitive forms: {1,0,24},{3,0,8},{4,4,7},{5,2,5}.
12.6 The Character System for a Quadratic Form and the Genus
315
From Theorem 4.1 we only know that if k is representable by these four forms, then the total number of solutions is t/J(k)
=2L (96) , nlk
n
where n runs over all the positive divisors of k. In order to calculate the character system we first select k coprime with d and representable by the forms. We take k = 1,11,7,5
and obtain
Form {1,0,24} {3, 0, 8} {4,4,7} {5,2,5}
(~)
o(k)
B(k)
+1
+1
+1
 1 +1  1
 1  1 +1
+1
 1  1
This table shows that each genus has one equivalence class. Therefore, when $k \equiv 1, 11, 7, 5 \pmod{12}$, $\psi(k)$ represents the number of solutions of the first, the second, the third and the fourth form respectively. More specifically, if $k \equiv 1 \pmod{12}$, then $\psi(k) = 2\sum_{n \mid k}\left(\frac{-96}{n}\right)$ represents the number of solutions to $x^2 + 24y^2 = k$. At the same time we have proved that this equation has no solution if $k \equiv 11, 7, 5 \pmod{12}$.

Example 2. $d = -15$. There are two positive definite reduced primitive forms:
$$\{1,1,4\},\ \{2,1,2\}.$$
Taking $k = 1$ and $17$ will give
$$\left(\frac{1}{3}\right) = \left(\frac{1}{5}\right) = 1 \quad\text{and}\quad \left(\frac{17}{3}\right) = \left(\frac{17}{5}\right) = -1.$$
We can then perform the calculations for $k \equiv 1, 4 \pmod{15}$ and $k \equiv 2, 8 \pmod{15}$. We conclude that if $k \equiv 7, 11, 13$ or $14 \pmod{15}$, then $k$ is not representable by either of the two forms. If $k \equiv 1, 4 \pmod{15}$ then there are $2\sum_{n \mid k}\left(\frac{-15}{n}\right)$ ways to represent $k$ by $\{1,1,4\}$; if $k \equiv 2, 8 \pmod{15}$, then there is the same number of ways to represent $k$ by $\{2,1,2\}$.

From these two examples we see that if each genus contains only one equivalence class, then the number of solutions to (2) is completely determined when $(k, 2d) = 1$. We tabulate all the discriminants $d > -400$ in which each genus has only one equivalence class in the following table, where we have also included all the positive definite reduced primitive forms.
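The claims of Example 1 can be checked by brute force. The Python sketch below is an illustration and not part of the original text: the helper names `jacobi`, `kronecker`, `count_solutions` and `psi` are hypothetical, and the count is assumed to include every integer pair $(x, y)$ with $F(x,y) = k$.

```python
from math import gcd, isqrt

def jacobi(a, n):
    """Jacobi symbol (a/n) for odd n > 0."""
    a %= n
    result = 1
    while a:
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        a, n = n, a
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0

def kronecker(d, n):
    """Kronecker symbol (d/n) for n > 0, with (d/2) = 0, +1, -1."""
    sign = 1
    while n % 2 == 0:
        n //= 2
        if d % 2 == 0:
            return 0
        if d % 8 in (3, 5):
            sign = -sign
    return sign * jacobi(d, n)

def count_solutions(form, k):
    """Number of integer pairs (x, y) with a x^2 + b x y + c y^2 = k (d < 0)."""
    a, b, c = form
    m = 4 * a * c - b * b              # m = |d|
    xmax = isqrt(4 * c * k // m) + 1   # from |d| x^2 <= 4 c k
    ymax = isqrt(4 * a * k // m) + 1   # from |d| y^2 <= 4 a k
    return sum(1
               for x in range(-xmax, xmax + 1)
               for y in range(-ymax, ymax + 1)
               if a * x * x + b * x * y + c * y * y == k)

def psi(d, k):
    """2 * sum of (d/n) over the positive divisors n of k (w = 2 assumed)."""
    return 2 * sum(kronecker(d, n) for n in range(1, k + 1) if k % n == 0)

FORMS = [(1, 0, 24), (3, 0, 8), (4, 4, 7), (5, 2, 5)]   # d = -96

for k in range(1, 60):
    if gcd(k, 6) == 1:
        print(k, k % 12, [count_solutions(f, k) for f in FORMS], psi(-96, k))
```

For each printed $k$ only the form matching the residue of $k$ mod 12 should contribute, and its count should agree with the last column.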
Exercise. Study, as in the examples, the cases $d = -20, -24, -32, -35, -51, -75$.
-d     positive definite reduced primitive forms
3      {1,1,1}
4      {1,0,1}
7      {1,1,2}
8      {1,0,2}
11     {1,1,3}
12     {1,0,3}
15     {1,1,4}  {2,1,2}
16     {1,0,4}
19     {1,1,5}
20     {1,0,5}  {2,2,3}
24     {1,0,6}  {2,0,3}
27     {1,1,7}
28     {1,0,7}
32     {1,0,8}  {3,2,3}
35     {1,1,9}  {3,1,3}
36     {1,0,9}  {2,2,5}
40     {1,0,10}  {2,0,5}
43     {1,1,11}
48     {1,0,12}  {3,0,4}
51     {1,1,13}  {3,3,5}
52     {1,0,13}  {2,2,7}
60     {1,0,15}  {3,0,5}
64     {1,0,16}  {4,4,5}
67     {1,1,17}
72     {1,0,18}  {2,0,9}
75     {1,1,19}  {3,3,7}
84     {1,0,21}  {2,2,11}  {3,0,7}  {5,4,5}
88     {1,0,22}  {2,0,11}
91     {1,1,23}  {5,3,5}
96     {1,0,24}  {3,0,8}  {4,4,7}  {5,2,5}
99     {1,1,25}  {5,1,5}
100    {1,0,25}  {2,2,13}
112    {1,0,28}  {4,0,7}
115    {1,1,29}  {5,5,7}
120    {1,0,30}  {2,0,15}  {3,0,10}  {5,0,6}
123    {1,1,31}  {3,3,11}
132    {1,0,33}  {2,2,17}  {3,0,11}  {6,6,7}
147    {1,1,37}  {3,3,13}
148    {1,0,37}  {2,2,19}
160    {1,0,40}  {4,4,11}  {5,0,8}  {7,6,7}
163    {1,1,41}
168    {1,0,42}  {2,0,21}  {3,0,14}  {6,0,7}
180    {1,0,45}  {2,2,23}  {5,0,9}  {7,4,7}
187    {1,1,47}  {7,3,7}
192    {1,0,48}  {3,0,16}  {4,4,13}  {7,2,7}
195    {1,1,49}  {3,3,17}  {5,5,11}  {7,1,7}
228    {1,0,57}  {2,2,29}  {3,0,19}  {6,6,11}
232    {1,0,58}  {2,0,29}
235    {1,1,59}  {5,5,13}
240    {1,0,60}  {3,0,20}  {4,0,15}  {5,0,12}
267    {1,1,67}  {3,3,23}
280    {1,0,70}  {2,0,35}  {5,0,14}  {7,0,10}
288    {1,0,72}  {4,4,19}  {8,0,9}  {8,8,11}
312    {1,0,78}  {2,0,39}  {3,0,26}  {6,0,13}
315    {1,1,79}  {5,5,17}  {7,7,13}  {9,9,11}
340    {1,0,85}  {2,2,43}  {5,0,17}  {10,10,11}
352    {1,0,88}  {4,4,23}  {8,0,11}  {8,8,13}
372    {1,0,93}  {2,2,47}  {3,0,31}  {6,6,17}
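The rows of the table can be regenerated mechanically. The following Python sketch is an illustration only: the function name `reduced_forms` is hypothetical, and it assumes the usual reduction conditions $-a < b \le a \le c$, with $b \ge 0$ when $a = c$, which is consistent with the entries listed in the table.

```python
from math import gcd

def reduced_forms(d):
    """Reduced primitive positive definite forms (a, b, c) with b^2 - 4ac = d < 0.

    Assumed reduction conditions: -a < b <= a <= c, and b >= 0 if a == c.
    """
    forms = []
    a = 1
    # |b| <= a <= c forces 3 a^2 <= 4ac - b^2 = |d|
    while 3 * a * a <= -d:
        for b in range(-a + 1, a + 1):
            if (b * b - d) % (4 * a):
                continue                      # c would not be an integer
            c = (b * b - d) // (4 * a)
            if c < a or (c == a and b < 0):
                continue                      # not reduced
            if gcd(gcd(a, abs(b)), c) == 1:   # keep primitive forms only
                forms.append((a, b, c))
        a += 1
    return forms

for d in (-15, -84, -96, -163):
    print(d, reduced_forms(d))
```

The search is finite because the reduction conditions bound $a$ by $\sqrt{|d|/3}$.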
12.7 The Convergence of the Series K(d)

Let
$$K(d) = \sum_{n=1}^{\infty}\left(\frac{d}{n}\right)\frac{1}{n}. \tag{1}$$
This is a very important series. Since
$\left(\frac{d}{n}\right)$ is a real character mod $|d|$, it follows from Theorem 7.2.3 that
$$\sum_{n=1}^{|d|}\left(\frac{d}{n}\right) = 0.$$
Moreover we see from Theorem 6.8.2 that the series $K(d)$ is convergent.

Theorem 7.1.
$$\lim_{\tau\to\infty}\frac{1}{\tau}\sum_{\substack{1\le k\le\tau\\ (k,d)=1}}\ \sum_{n\mid k}\left(\frac{d}{n}\right) = \frac{\varphi(|d|)}{|d|}\,K(d).$$
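Proof. Let $A(\tau; d, n)$ denote the number of positive integers not exceeding $\tau/n$ and coprime with $d$. Then
$$\frac{1}{\tau}\sum_{\substack{1\le k\le\tau\\ (k,d)=1}}\ \sum_{n\mid k}\left(\frac{d}{n}\right)
= \frac{1}{\tau}\sum_{n=1}^{\infty}\left(\frac{d}{n}\right)\sum_{\substack{1\le k\le\tau\\ (k,d)=1\\ n\mid k}} 1
= \frac{1}{\tau}\sum_{n=1}^{\infty}\left(\frac{d}{n}\right)\sum_{\substack{1\le k\le\tau/n\\ (k,d)=1}} 1
= \sum_{n=1}^{\infty}\left(\frac{d}{n}\right)\frac{A(\tau;d,n)}{\tau}. \tag{2}$$
Since $A(\tau; d, n)$ does not increase as $n$ increases, and
$$\frac{A(\tau;d,n)}{\tau} \le \frac{1}{n},$$
it follows from Theorem 6.8.2 that the series (2) converges uniformly in $\tau$. Also, for fixed $n$, we have
$$\lim_{\tau\to\infty}\frac{A(\tau;d,n)}{\tau} = \frac{\varphi(|d|)}{|d|}\cdot\frac{1}{n}.$$
Therefore
$$\lim_{\tau\to\infty}\frac{1}{\tau}\sum_{\substack{1\le k\le\tau\\ (k,d)=1}}\ \sum_{n\mid k}\left(\frac{d}{n}\right)
= \sum_{n=1}^{\infty}\left(\frac{d}{n}\right)\lim_{\tau\to\infty}\frac{A(\tau;d,n)}{\tau}
= \frac{\varphi(|d|)}{|d|}\sum_{n=1}^{\infty}\left(\frac{d}{n}\right)\frac{1}{n}. \qquad\square$$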
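Theorem 7.1 can be illustrated numerically for $d = -4$. The Python sketch below is an illustration under stated assumptions: it uses the explicit values of $\left(\frac{-4}{n}\right)$ and the classical Leibniz value $K(-4) = \pi/4$, neither of which is taken from the text above, and the function names are hypothetical. The inner count is exactly $A(\tau; d, n)$ from the proof.

```python
from math import pi

def chi_minus4(n):
    """The character (-4/n): 0 for even n, +1 if n = 1 (mod 4), -1 if n = 3 (mod 4)."""
    if n % 2 == 0:
        return 0
    return 1 if n % 4 == 1 else -1

def average_divisor_sum(tau):
    """(1/tau) * sum over odd k <= tau of sum_{n | k} (-4/n), arranged as in (2)."""
    total = 0
    for n in range(1, tau + 1):
        c = chi_minus4(n)
        if c == 0:
            continue
        # A(tau; -4, n): odd integers j with j <= tau / n
        total += c * ((tau // n + 1) // 2)
    return total / tau

# phi(4)/4 * K(-4), assuming K(-4) = pi/4
predicted_limit = (2 / 4) * (pi / 4)

for tau in (10, 1000, 200000):
    print(tau, average_divisor_sum(tau), predicted_limit)
```

The averages should settle near $\pi/8$ as $\tau$ grows.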
12.8 The Number of Lattice Points Inside a Hyperbola and an Ellipse

Theorem 8.1. Let $m > 0$ and let there be an ellipse centred at the origin, or a hyperbola centred at the origin (the two curves of the hyperbola together with two lines passing through the origin). Denote by $I$ the (finite) area of the region. Magnify the original figure by $\sqrt{\tau}$ (that is, replace $\xi$ and $\eta$ by $\xi\sqrt{\tau}$ and $\eta\sqrt{\tau}$), and denote by $V(\tau)$ the number of lattice points in the magnified figure whose coordinates satisfy
$$\xi \equiv \xi_0 \pmod m, \qquad \eta \equiv \eta_0 \pmod m.$$
Then
$$\lim_{\tau\to\infty}\frac{V(\tau)}{\tau} = \frac{I}{m^2}.$$

Proof. We form a net in the original figure with the orthogonal lines
$$\xi = \frac{\xi_0 + rm}{\sqrt{\tau}}, \qquad \eta = \frac{\eta_0 + sm}{\sqrt{\tau}} \qquad (r, s = 0, \pm 1, \pm 2, \ldots).$$
This gives a net of squares with side length $m/\sqrt{\tau}$. Denote by $W(\tau)$ the number of squares whose "southwest corners" lie inside the ellipse or the hyperbola. Then clearly
$$V(\tau) = W(\tau).$$
Since the area of each square in the net is $m^2/\tau$ it follows at once from the fundamental theorem of calculus that
$$\lim_{\tau\to\infty}\frac{m^2}{\tau}\,W(\tau) = I,$$
and hence the required result. $\square$
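A quick numerical experiment makes the statement of Theorem 8.1 concrete. The Python sketch below is an illustration, not the author's method: it takes the unit disc $\xi^2 + \eta^2 \le 1$ (area $I = \pi$) as the region, and the function name `count_points` and the sample values $m = 3$, $(\xi_0, \eta_0) = (1, 2)$ are arbitrary choices.

```python
from math import isqrt, pi

def count_points(tau, m, x0, y0):
    """V(tau): lattice points with x^2 + y^2 <= tau, x = x0 and y = y0 (mod m)."""
    r = isqrt(tau)
    count = 0
    for x in range(-r, r + 1):
        if (x - x0) % m:
            continue
        ymax = isqrt(tau - x * x)        # largest |y| allowed for this x
        for y in range(-ymax, ymax + 1):
            if (y - y0) % m == 0:
                count += 1
    return count

m, x0, y0 = 3, 1, 2
for tau in (100, 10000, 90000):
    print(tau, count_points(tau, m, x0, y0) / tau, pi / m ** 2)
```

The ratio $V(\tau)/\tau$ should approach $I/m^2 = \pi/9$, whatever residues are chosen.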
12.9 The Limiting Average

Denote by $\psi(k, F)$ the number of proper representations of $k$ by $F$, and let
$$H(\tau, F) = \sum_{\substack{1\le k\le\tau\\ (k,d)=1}} \psi(k, F), \qquad \tau > 1.$$
The aim of this section is to evaluate
$$\lim_{\tau\to\infty}\frac{1}{\tau}\,H(\tau, F).$$
Theorem 9.1. As $x, y$ both run over a complete residue system mod $|d|$, there are precisely $|d|\varphi(|d|)$ sets of $x, y$ such that $F(x, y)$ is coprime with $d$.

Proof. It suffices to prove that if $p^l \,\|\, d$, $l > 0$, then there are $p^l\varphi(p^l)$ sets of $x, y$ in a complete residue system mod $p^l$ such that $p \nmid F(x,y)$. For let the standard factorization for $|d|$ be $\prod_i p_i^{l_i}$. Then, since $(d, F(x,y)) = 1$ and $p_i \nmid F(x,y)$ for every $i$ are equivalent, it follows from the Chinese remainder theorem that, as $x, y$ run over a complete residue system mod $|d|$, there are
$$\prod_{p^l \| d} p^l\varphi(p^l) = |d|\varphi(|d|)$$
sets of $x, y$ for which $F(x,y)$ is coprime with $d$.

Since $(a, b, c) = 1$ and $p \mid d = b^2 - 4ac$, we have $p \nmid (a, c)$. We now assume that $p \nmid a$.

1) Suppose that $p > 2$. Since $(p, 4a) = 1$, it follows from
$$4aF = (2ax + by)^2 - dy^2 \not\equiv 0 \pmod p$$
that
$$2ax + by \not\equiv 0 \pmod p,$$
and conversely. For any given value of $y$ (there are $p^l$ values) there are $p - 1$ distinct values for $x$ mod $p$, because $p \nmid 2a$. There are thus $p^{l-1}(p - 1) = \varphi(p^l)$ values for $x$ mod $p^l$. The required result is proved.

2) Suppose that $p = 2$. Now $2 \mid d$ implies $2 \mid b$. The condition
$$ax^2 + bxy + cy^2 \equiv 1 \pmod 2$$
becomes
$$ax + cy \equiv 1 \pmod 2.$$
Since corresponding to each value of $y$ (there are $2^l$ values) there are $2^{l-1}$ values of $x$ (mod $2^l$) which satisfy the above equation, the theorem is proved. $\square$

Theorem 9.2. We have
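$$\lim_{\tau\to\infty}\frac{H(\tau, F)}{\tau} =
\begin{cases}
\dfrac{2\pi}{\sqrt{|d|}}\,\dfrac{\varphi(|d|)}{|d|}, & \text{if } d < 0,\\[2ex]
\dfrac{\log\varepsilon}{\sqrt{d}}\,\dfrac{\varphi(d)}{d}, & \text{if } d > 0.
\end{cases}$$

Proof. If $d < 0$, we let $U(\tau) = U(\tau, F, x_0, y_0)$ denote the number of solutions to
$$0 \le F(x,y) \le \tau, \qquad x \equiv x_0 \pmod{|d|}, \qquad y \equiv y_0 \pmod{|d|}.$$
If $d > 0$, then we let $U(\tau) = U(\tau, F, x_0, y_0)$ denote the number of solutions to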
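$$0 \le F(x,y) \le \tau, \qquad \bar{L} > 0, \qquad 1 \le \left|\frac{L}{\bar{L}}\right| < \varepsilon^2, \qquad x \equiv x_0 \pmod{|d|}, \qquad y \equiv y_0 \pmod{|d|}.$$
Here the definitions for $L$, $\bar{L}$, $\varepsilon$ are the same as §11.4. Let $x_0, y_0$ both run over the complete residue system mod $|d|$ such that $(F(x_0, y_0), d) = 1$. Then
$$\sum_{\substack{(x_0, y_0)\\ (F(x_0,y_0),\,d)=1}} U(\tau) = \sum_{\substack{1\le k\le\tau\\ (k,d)=1}} \psi(k, F) = H(\tau, F),$$
and hence
$$\lim_{\tau\to\infty}\frac{H(\tau, F)}{\tau} = \lim_{\tau\to\infty}\frac{1}{\tau}\sum_{\substack{(x_0, y_0)\\ (F(x_0,y_0),\,d)=1}} U(\tau).$$
By Theorem 9.1 we see that our theorem follows if we can prove that, for each set of $x_0, y_0$, we have
$$\lim_{\tau\to\infty}\frac{U(\tau)}{\tau} =
\begin{cases}
\dfrac{2\pi}{\sqrt{|d|}}\cdot\dfrac{1}{d^2}, & \text{if } d < 0,\\[2ex]
\dfrac{\log\varepsilon}{\sqrt{d}}\cdot\dfrac{1}{d^2}, & \text{if } d > 0.
\end{cases}$$
Also, by Theorem 8.1, we need now only evaluate the area for the ellipse $F(x,y) \le 1$ $(d < 0)$, and the area for the hyperbola
$$0 \le F(x,y) \le 1, \qquad \bar{L} > 0, \qquad 1 \le \left|\frac{L}{\bar{L}}\right| < \varepsilon^2 \qquad (d > 0).$$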
1) Suppose that $d < 0$. It is well known that the area of the ellipse $ax^2 + bxy + cy^2 \le 1$ is $2\pi/\sqrt{|d|}$. The theorem is therefore proved.

2) Suppose that $d > 0$, and we may assume that $a > 0$. Here
$$L = 2ax + (b + \sqrt{d})y, \qquad \bar{L} = 2ax + (b - \sqrt{d})y,$$
so that
$$L\bar{L} = 4a(ax^2 + bxy + cy^2),$$
and hence $L > 0$. The required area for the hyperbola is
$$I = \iint dx\,dy,$$
where the integration is over $L\bar{L} \le 4a$, $\bar{L} > 0$, $1 \le L/\bar{L} < \varepsilon^2$. We make the substitution
$$\rho = \frac{L}{2\sqrt{a}}, \qquad \sigma = \frac{\bar{L}}{2\sqrt{a}},$$
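whose Jacobian has the value
$$\begin{vmatrix} \dfrac{\partial\rho}{\partial x} & \dfrac{\partial\rho}{\partial y}\\[2ex] \dfrac{\partial\sigma}{\partial x} & \dfrac{\partial\sigma}{\partial y} \end{vmatrix}
= \begin{vmatrix} \sqrt{a} & \dfrac{b + \sqrt{d}}{2\sqrt{a}}\\[2ex] \sqrt{a} & \dfrac{b - \sqrt{d}}{2\sqrt{a}} \end{vmatrix}
= -\sqrt{d}.$$
Therefore
$$I = \frac{1}{\sqrt{d}}\iint d\rho\,d\sigma,$$
where the integration is over $\rho\sigma \le 1$, $\sigma > 0$, $\sigma \le \rho < \varepsilon^2\sigma$. This is the region formed by the two straight lines from the points $(1, 1)$ and $(\varepsilon, 1/\varepsilon)$ to $(0, 0)$ together with the rectangular hyperbola joining the points $(1, 1)$ and $(\varepsilon, 1/\varepsilon)$. Therefore
$$\begin{aligned}
\sqrt{d}\,I &= \int_0^1 d\rho\int_{\rho/\varepsilon^2}^{\rho} d\sigma + \int_1^{\varepsilon} d\rho\int_{\rho/\varepsilon^2}^{1/\rho} d\sigma\\
&= \int_0^1\left(\rho - \frac{\rho}{\varepsilon^2}\right)d\rho + \int_1^{\varepsilon}\left(\frac{1}{\rho} - \frac{\rho}{\varepsilon^2}\right)d\rho\\
&= \int_0^1 \rho\,d\rho - \frac{1}{\varepsilon^2}\int_0^{\varepsilon}\rho\,d\rho + \int_1^{\varepsilon}\frac{d\rho}{\rho} = \log\varepsilon.
\end{aligned}$$
This gives
$$I = \frac{\log\varepsilon}{\sqrt{d}},$$
and the theorem is proved. $\square$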
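Both Theorem 9.1 and the case $d < 0$ of Theorem 9.2 lend themselves to a direct numerical check. The Python sketch below is an illustration only: the function names are hypothetical, and `H` counts all integer solutions of $1 \le F(x,y) \le \tau$ with $(F(x,y), d) = 1$, which is the quantity obtained in the proof by summing $U(\tau)$ over the admissible residue classes.

```python
from math import gcd, isqrt, pi, sqrt

def phi(n):
    """Euler's function, by direct count."""
    return sum(1 for j in range(1, n + 1) if gcd(j, n) == 1)

def coprime_pairs(form):
    """Number of (x, y) mod |d| with F(x, y) coprime with d (Theorem 9.1)."""
    a, b, c = form
    m = abs(b * b - 4 * a * c)
    return sum(1 for x in range(m) for y in range(m)
               if gcd(a * x * x + b * x * y + c * y * y, m) == 1)

def H(tau, form):
    """Integer pairs (x, y) with 1 <= F(x, y) <= tau and (F(x, y), d) = 1, d < 0."""
    a, b, c = form
    m = 4 * a * c - b * b                 # m = |d|
    xmax = isqrt(4 * c * tau // m) + 1
    ymax = isqrt(4 * a * tau // m) + 1
    count = 0
    for x in range(-xmax, xmax + 1):
        for y in range(-ymax, ymax + 1):
            k = a * x * x + b * x * y + c * y * y
            if 1 <= k <= tau and gcd(k, m) == 1:
                count += 1
    return count

def limit_d_negative(form):
    """The value 2*pi*phi(|d|) / (sqrt(|d|) * |d|) from Theorem 9.2."""
    a, b, c = form
    m = 4 * a * c - b * b
    return 2 * pi * phi(m) / (sqrt(m) * m)

print(coprime_pairs((1, 1, 1)), 3 * phi(3))
print(H(10000, (1, 0, 1)) / 10000, limit_d_negative((1, 0, 1)))
```

For $F = x^2 + y^2$ the predicted limit is $\pi/2$, since exactly the points with $x^2 + y^2$ odd are counted.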
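12.10 The Class Number: An Analytic Expression

Theorem 10.1.
$$h(d) =
\begin{cases}
\dfrac{w\sqrt{|d|}}{2\pi}\,K(d), & \text{if } d < 0,\\[2ex]
\dfrac{\sqrt{d}}{\log\varepsilon}\,K(d), & \text{if } d > 0.
\end{cases}$$

Proof. Let
$$F_1, \ldots, F_{h(d)}$$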
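be a representative system. From Theorem 4.1 we have
$$\sum_{F} H(\tau, F) = \sum_{F}\sum_{\substack{1\le k\le\tau\\ (k,d)=1}} \psi(k, F) = \sum_{\substack{1\le k\le\tau\\ (k,d)=1}} \psi(k) = w\sum_{\substack{1\le k\le\tau\\ (k,d)=1}}\ \sum_{n\mid k}\left(\frac{d}{n}\right).$$
From Theorem 7.1 and Theorem 9.2 we have
$$h(d)\,\frac{\varphi(|d|)}{|d|}\left\{\begin{matrix} \dfrac{2\pi}{\sqrt{|d|}}\\[2ex] \dfrac{\log\varepsilon}{\sqrt{d}} \end{matrix}\right\} = w\,\frac{\varphi(|d|)}{|d|}\,K(d) \qquad \left\{\begin{matrix} \text{if } d < 0,\\[2ex] \text{if } d > 0, \end{matrix}\right.$$
as required. $\square$

Therefore our problem becomes that of the determination of the sum of the series
$$K(d) = \sum_{n=1}^{\infty}\left(\frac{d}{n}\right)\frac{1}{n}.$$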
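For $d < 0$ the formula of Theorem 10.1 can be tried out with partial sums of $K(d)$. The Python sketch below is a rough illustration and not a practical algorithm: the helper names are hypothetical, the values $w = 6, 4, 2$ for $d = -3$, $d = -4$, $d < -4$ are assumed from Theorem 4.1, and the series is simply truncated after a whole number of periods mod $|d|$.

```python
from math import pi, sqrt

def jacobi(a, n):
    """Jacobi symbol (a/n) for odd n > 0."""
    a %= n
    result = 1
    while a:
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        a, n = n, a
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0

def kronecker(d, n):
    """Kronecker symbol (d/n) for n > 0."""
    sign = 1
    while n % 2 == 0:
        n //= 2
        if d % 2 == 0:
            return 0
        if d % 8 in (3, 5):
            sign = -sign
    return sign * jacobi(d, n)

def K_partial(d, periods=2000):
    """Partial sum of K(d) over the first periods * |d| terms."""
    m = abs(d)
    # (d/n) depends only on n mod |d|; the residue 0 contributes nothing
    chi = [0] + [kronecker(d, r) for r in range(1, m)]
    return sum(chi[n % m] / n for n in range(1, m * periods + 1))

def class_number_estimate(d):
    """w * sqrt(|d|) / (2 pi) * K(d) for d < 0, with K(d) truncated."""
    w = 6 if d == -3 else 4 if d == -4 else 2
    return w * sqrt(-d) / (2 * pi) * K_partial(d)

for d in (-3, -4, -15, -23, -96):
    print(d, class_number_estimate(d))
```

The printed values should lie close to integers, which can be compared with the number of reduced primitive forms of each discriminant.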
12.11 The Fundamental Discriminants

Definition. By a fundamental discriminant we mean a discriminant $d$ which has no odd prime square divisor, and $d$ is odd or $d \equiv 8$ or $12 \pmod{16}$.

For example: $5, 8, 12, 13, 17, 21, 24, 28, 29, \ldots$ are fundamental discriminants.

Theorem 11.1. Each discriminant $d$ is uniquely expressible as $fm^2$ where $f$ is a fundamental discriminant.

Proof. 1) If $d$ is odd, then we let $m^2$ be the largest square that divides $d$. Write $d = fm^2$ for the required result.

2) If $d$ is even, then we first write $d = qr^2$ where $r^2$ is the largest square that divides $d$. Clearly $2 \mid r$. If $q \equiv 1 \pmod 4$, then $q$ is a fundamental discriminant. If $q \equiv 2$ or $3 \pmod 4$, then we take $f = 4q$ so that from $4q \equiv 8$ or $12 \pmod{16}$ we see that $f$ is a fundamental discriminant.

3) Uniqueness. Let $d = fm^2$, $m > 0$ and $f$ be a fundamental discriminant. If $f$ is odd, then $f$ has no square divisor so that $m^2$ is the largest square divisor of $d$. If $f$ is even, then $f \equiv 8$ or $12 \pmod{16}$, hence $4 \nmid f/4$ and therefore $(2m)^2$ is the largest square divisor of $d$. From this we see that the uniqueness property follows. $\square$
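The proof of Theorem 11.1 is constructive, and its steps can be sketched in Python as follows. This is an illustration only: the function names are hypothetical, and the input is assumed to be a discriminant, that is, a non-square integer $d \equiv 0$ or $1 \pmod 4$.

```python
from math import isqrt

def largest_square_root(n):
    """Largest r > 0 such that r*r divides n (n != 0)."""
    n = abs(n)
    r = isqrt(n)
    while n % (r * r):
        r -= 1
    return r

def fundamental_decomposition(d):
    """Return (f, m) with d = f * m^2 and f a fundamental discriminant."""
    r = largest_square_root(d)
    q = d // (r * r)                  # exact division: d = q * r^2
    if d % 2 == 1 or q % 4 == 1:
        return q, r                   # cases 1) and 2) with q = 1 (mod 4)
    return 4 * q, r // 2              # q = 2 or 3 (mod 4): f = 4q, m = r/2

for d in (-96, -75, -16, -12, 20, 45):
    print(d, fundamental_decomposition(d))
```

In the last branch $r$ is even, as noted in step 2) of the proof, so `r // 2` is exact.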
Theorem 11.2. Let $d = fm^2$ be the representation in Theorem 11.1. Then
$$K(d) = \prod_{p \mid m}\left(1 - \left(\frac{f}{p}\right)\frac{1}{p}\right)K(f).$$
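Proof. We have
$$K(d) = \sum_{n=1}^{\infty}\left(\frac{d}{n}\right)\frac{1}{n} = \sum_{n=1}^{\infty}\left(\frac{m^2 f}{n}\right)\frac{1}{n} = \sum_{\substack{n=1\\ (m,n)=1}}^{\infty}\left(\frac{f}{n}\right)\frac{1}{n}.$$
Let the standard factorization of $m$ be $p_1^{l_1}\cdots p_s^{l_s}$. Then from Theorem 1.7.1 we have
$$K(d) = K(f) - \sum_{p_i \mid m}\left(\frac{f}{p_i}\right)\frac{1}{p_i}\,K(f) + \cdots = \prod_{p \mid m}\left(1 - \left(\frac{f}{p}\right)\frac{1}{p}\right)K(f). \qquad\square$$

We see from this theorem that we need only determine the values for $K(f)$.

Exercise. Show that if $d$ is a fundamental discriminant then $\left(\frac{d}{n}\right)$ is a real primitive character mod $|d|$.

12.12 The Class Number Formula

We now assume that $d$ is a fundamental discriminant. Let
$$\sqrt{d} =
\begin{cases}
+\sqrt{|d|}, & \text{if } d \text{ is positive},\\
i\sqrt{|d|}, & \text{if } d \text{ is negative}.
\end{cases}$$

Theorem 12.1. If $0 < \varphi < 2\pi$, then
$$\sum_{n=1}^{\infty}\frac{\sin n\varphi}{n} = \frac{\pi}{2} - \frac{\varphi}{2},$$
and
$$\sum_{n=1}^{\infty}\frac{\cos n\varphi}{n} = -\log\left(2\sin\frac{\varphi}{2}\right).$$

Proof. From $0 < \varphi < 2\pi$ we have*
$$\sum_{n=1}^{\infty}\frac{e^{in\varphi}}{n} = -\log\left(1 - e^{i\varphi}\right)
= -\log\left(2\sin\frac{\varphi}{2}\right) + i\arctan\left(\cot\frac{\varphi}{2}\right)
= -\log\left(2\sin\frac{\varphi}{2}\right) + i\left(\frac{\pi}{2} - \frac{\varphi}{2}\right).$$
The required results follow from taking the real and imaginary parts of the equation. $\square$

* The rigorous proof of this requires Abel's theorem.

Theorem 12.2.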
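If $d$ is a fundamental discriminant, then
$$K(d) =
\begin{cases}
-\dfrac{1}{\sqrt{d}}\displaystyle\sum_{r=1}^{d-1}\left(\frac{d}{r}\right)\log\sin\frac{\pi r}{d}, & \text{if } d > 0,\\[3ex]
-\dfrac{\pi}{|d|^{3/2}}\displaystyle\sum_{r=1}^{|d|-1}\left(\frac{d}{r}\right) r, & \text{if } d < 0.
\end{cases}$$

Proof. From character sums we have
$$\sum_{r=1}^{|d|-1}\left(\frac{d}{r}\right)e^{2\pi i nr/|d|} = \left(\frac{d}{n}\right)\sqrt{d}.$$
(If $d$ is a fundamental discriminant, then $\left(\frac{d}{n}\right)$ is a primitive character.) Therefore
$$\sqrt{d}\,K(d) = \sum_{n=1}^{\infty}\left(\frac{d}{n}\right)\sqrt{d}\,\frac{1}{n}
= \sum_{n=1}^{\infty}\frac{1}{n}\sum_{r=1}^{|d|-1}\left(\frac{d}{r}\right)e^{2\pi i nr/|d|}
= \sum_{r=1}^{|d|-1}\left(\frac{d}{r}\right)\sum_{n=1}^{\infty}\frac{1}{n}\,e^{2\pi i nr/|d|}.$$

1) If $d > 0$, then on taking the real parts of the above equation we have
$$\sqrt{d}\,K(d) = \sum_{r=1}^{d-1}\left(\frac{d}{r}\right)\sum_{n=1}^{\infty}\frac{1}{n}\cos\frac{2\pi nr}{d}
= -\sum_{r=1}^{d-1}\left(\frac{d}{r}\right)\log\left(2\sin\frac{\pi r}{d}\right)
= -\sum_{r=1}^{d-1}\left(\frac{d}{r}\right)\log\sin\frac{\pi r}{d}$$
(since $\log 2\sum_{r=1}^{d-1}\left(\frac{d}{r}\right) = 0$).

2) If $d < 0$, then on taking the imaginary parts we have
$$\sqrt{|d|}\,K(d) = \sum_{r=1}^{|d|-1}\left(\frac{d}{r}\right)\sum_{n=1}^{\infty}\frac{1}{n}\sin\frac{2\pi rn}{|d|}
= \sum_{r=1}^{|d|-1}\left(\frac{d}{r}\right)\left(\frac{\pi}{2} - \frac{\pi r}{|d|}\right)
= -\frac{\pi}{|d|}\sum_{r=1}^{|d|-1}\left(\frac{d}{r}\right) r. \qquad\square$$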
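The two closed forms of Theorem 12.2 can be compared with partial sums of the series for small fundamental discriminants. The Python sketch below is an illustration under stated assumptions: the characters $\left(\frac{-4}{n}\right)$ and $\left(\frac{5}{n}\right)$ are written out by hand rather than computed, and the function names are hypothetical.

```python
from math import log, pi, sin, sqrt

def chi_minus4(n):
    """(-4/n): 0, 1, 0, -1 according to n mod 4."""
    return (0, 1, 0, -1)[n % 4]

def chi_5(n):
    """(5/n): +1 for n = 1, 4 (mod 5), -1 for n = 2, 3 (mod 5), 0 if 5 | n."""
    return (0, 1, -1, -1, 1)[n % 5]

def K_series(chi, modulus, periods=20000):
    """Partial sum of sum chi(n)/n over a whole number of periods."""
    return sum(chi(n) / n for n in range(1, modulus * periods + 1))

def K_closed(chi, d):
    """The finite expressions for K(d) in Theorem 12.2."""
    m = abs(d)
    if d > 0:
        return -sum(chi(r) * log(sin(pi * r / m)) for r in range(1, m)) / sqrt(m)
    return -pi * sum(chi(r) * r for r in range(1, m)) / m ** 1.5

print(K_series(chi_minus4, 4), K_closed(chi_minus4, -4), pi / 4)
print(K_series(chi_5, 5), K_closed(chi_5, 5))
```

For $d = -4$ the closed form reduces to $-\frac{\pi}{8}(1 - 3) = \frac{\pi}{4}$, the Leibniz series.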
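From Theorem 12.2 and Theorem 10.1 we deduce at once:

Theorem 12.3. Let $d$ be a fundamental discriminant. Then for $d > 0$, we have
$$\varepsilon^{h(d)} = \frac{\prod_{t}\sin\dfrac{\pi t}{d}}{\prod_{s}\sin\dfrac{\pi s}{d}},$$
and for $d < 0$, we have
$$h(d) = \frac{w}{2|d|}\left(\sum_{t} t - \sum_{s} s\right),$$
where $s$ runs over those $r$ $(0 < r < |d|)$ satisfying $\left(\frac{d}{r}\right) = 1$, and $t$ those $r$ satisfying $\left(\frac{d}{r}\right) = -1$. $\square$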
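Theorem 12.3 gives finite expressions that are easy to evaluate. The Python sketch below is an illustration only: the helper names are hypothetical, the values of $w$ are assumed as in Theorem 4.1, and in the case $d > 0$ the comparison value $(3 + \sqrt{5})/2$ for $d = 5$ assumes that $\varepsilon = (t + u\sqrt{d})/2$ with $(t, u)$ the least positive solution of $t^2 - du^2 = 4$, which is taken here to be the $\varepsilon$ of §11.4.

```python
from math import pi, sin, sqrt

def jacobi(a, n):
    """Jacobi symbol (a/n) for odd n > 0."""
    a %= n
    result = 1
    while a:
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        a, n = n, a
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0

def kronecker(d, n):
    """Kronecker symbol (d/n) for n > 0."""
    sign = 1
    while n % 2 == 0:
        n //= 2
        if d % 2 == 0:
            return 0
        if d % 8 in (3, 5):
            sign = -sign
    return sign * jacobi(d, n)

def class_number_negative(d):
    """h(d) = w / (2|d|) * (sum of t - sum of s) for fundamental d < 0."""
    w = 6 if d == -3 else 4 if d == -4 else 2
    m = -d
    s_sum = sum(r for r in range(1, m) if kronecker(d, r) == 1)
    t_sum = sum(r for r in range(1, m) if kronecker(d, r) == -1)
    h, remainder = divmod(w * (t_sum - s_sum), 2 * m)
    assert remainder == 0, "d is assumed to be a fundamental discriminant"
    return h

def sine_ratio(d):
    """prod_t sin(pi t/d) / prod_s sin(pi s/d), equal to eps^h(d) for d > 0."""
    num = den = 1.0
    for r in range(1, d):
        chi = kronecker(d, r)
        if chi == -1:
            num *= sin(pi * r / d)
        elif chi == 1:
            den *= sin(pi * r / d)
    return num / den

for d in (-3, -4, -7, -15, -23, -84, -163):
    print(d, class_number_negative(d))
print(sine_ratio(5), (3 + sqrt(5)) / 2)
```

The values for $d < 0$ can be compared with the number of reduced primitive forms of the same discriminant.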
Theorem 12.4. Let $d$ be a negative fundamental discriminant. Then
$$h(d) = \frac{w}{2\left(2 - \left(\frac{d}{2}\right)\right)}\sum_{r=1}^{[|d|/2]}\left(\frac{d}{r}\right).$$
Proof By Theorem 12.1 we have, for 2n <