Vol. 105, No. 1
DUKE MATHEMATICAL JOURNAL
© 2000
AN ANALOGUE OF SERRE’S CONJECTURE FOR GALOIS REPRESENTATIONS AND HECKE EIGENCLASSES IN THE mod p COHOMOLOGY OF GL(n, Z)

AVNER ASH and WARREN SINNOTT

1. Introduction. Let p be a prime number and F an algebraic closure of the finite field F_p with p elements. Let n and N denote positive integers, with N prime to p. We are interested in representations ρ of the Galois group G_Q = Gal(Q̄/Q) into GL(n, F) that are unramified at all finite primes not dividing pN. (We shall say ρ is unramified outside pN.) In this paper, representation will always mean continuous, semisimple representation.

We choose for each prime l not dividing pN a Frobenius element Frob_l in G_Q. We also fix a complex conjugation Frob∞ ∈ G_Q. For every prime q, we fix a decomposition group G_q with its filtration by its ramification subgroups G_{q,i}. We denote the whole inertia group G_{q,0} by I_q.

Our aim is to make a conjecture about when such a representation should be attached to a cohomology class of a congruence subgroup of level N of GL(n, Z). We then present the evidence for the conjecture that we have been able to obtain.

Set Γ_0(N) to be the subgroup of SL(n, Z) consisting of those matrices whose first row is congruent to (∗, 0, ..., 0) modulo N. Let S_N be the subsemigroup of the integral matrices in GL(n, Q) whose first row is congruent to (∗, 0, ..., 0) modulo N and whose determinant is positive and prime to N.

We denote by ℋ(N) the F-algebra of double cosets Γ_0(N)\S_N/Γ_0(N). It is commutative. This algebra acts on the cohomology and homology of Γ_0(N) with coefficients in any FS_N-module. When a double coset is acting on cohomology, we call it a Hecke operator. The Hecke algebra ℋ(N) contains all double cosets of the form Γ_0(N)D(l, k)Γ_0(N), where D(l, k) is the diagonal matrix with k l’s followed by (n − k) 1’s, and l is a prime not dividing N. We use the notation T(l, k) for the corresponding Hecke operator.

Definition 1.1. Let 𝒱 be an ℋ(pN)-module, and suppose v ∈ 𝒱 is an eigenvector for the action of ℋ(pN) with T(l, k)v = a(l, k)v for some a(l, k) ∈ F, for all k = 0, ..., n, and for all l prime to pN. Let ρ be a representation ρ : G_Q → GL(n, F)
Received 2 June 1999. Revision received 12 November 1999. 2000 Mathematics Subject Classification. Primary 11F80; Secondary 11F75. Ash’s research partially supported by National Science Foundation grant DMS-9531675.
unramified outside pN such that
\[
\sum_{k=0}^{n} (-1)^k \, l^{k(k-1)/2} \, a(l,k) \, X^k = \det\bigl(I - \rho(\mathrm{Frob}_l)\,X\bigr)
\]
for all l not dividing pN. Then we say that ρ is attached to v.

For example, theorems of Eichler, Shimura, Deligne, and Deligne-Serre imply that if n = 2 and 𝒱 is the Hecke module of classical modular forms mod p of some positive weight k, then there exists a ρ attached to any Hecke eigenform v ∈ 𝒱. Such a ρ is always “odd”; that is, det ρ(Frob∞) = −1.

Conversely, the weak form of Serre’s conjecture [Se2] states that given a mod p 2-dimensional odd Galois representation ρ of the Galois group G_Q, there exists a mod p modular Hecke eigenform f with ρ attached. In the strong form of Serre’s conjecture, formulas for a weight (greater than or equal to 2), level, and nebentype character are given in terms of ρ, and it is conjectured that f may be taken to have that weight, level, and nebentype character. For a survey of the current state of Serre’s conjecture, see [Da], [Di], and [E].

The purpose of this paper is to explore an analogue of Serre’s conjecture for general n, especially n = 3. We make a conjecture in the niveau 1 case and present some experimental evidence for it. B. Gross has made a similar conjecture in [G4] for finite arithmetic groups. The case of niveau greater than 1 is under study.

Our conjectures use cohomology classes instead of modular forms. By the Eichler-Shimura theorem, the modular form f in Serre’s conjecture can be replaced by a Hecke eigenclass in the cohomology of a congruence subgroup of SL(2, Z). When n = 2, our conjecture will imply Serre’s conjecture in the case of niveau 1.

By way of background, we first state Conjecture 1.2, to which our Serre-type conjecture is a partial converse. Conjecture 1.2 is a slightly weaker version of [A2, Conjecture B], restated here in a form suitable for this paper. As proved in [A4], Conjecture B is true when n = 1, 2: by class field theory when n = 1, and by the theorems of Eichler, Shimura, and Deligne mentioned above when n = 2.
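The relation in Definition 1.1 can be made concrete with a small computation. The following is a minimal sketch, not taken from the paper: it assumes that the eigenvalues of ρ(Frob_l) lie in the prime field F_p and are given as integers, and it solves the defining identity for the numbers a(l, k). The function names and the sample eigenvalues (those of 1 ⊕ ω ⊕ ω^2 at l = 2 with p = 5) are illustrative assumptions.

```python
from itertools import combinations
from math import prod


def elementary_symmetric(values, k, p):
    """k-th elementary symmetric polynomial of `values`, reduced mod p."""
    return sum(prod(c) for c in combinations(values, k)) % p


def hecke_eigenvalues_from_frobenius(eigs, l, p):
    """Return [a(l,0), ..., a(l,n)] determined by
        sum_k (-1)^k l^{k(k-1)/2} a(l,k) X^k = det(I - rho(Frob_l) X),
    given the eigenvalues `eigs` of rho(Frob_l), assumed to lie in F_p.
    Requires l prime to p (so that l is invertible mod p) and Python >= 3.8.
    """
    n = len(eigs)
    result = []
    for k in range(n + 1):
        # det(I - A X) = sum_k (-1)^k e_k(eigenvalues) X^k
        e_k = elementary_symmetric(eigs, k, p)
        scale = pow(l, k * (k - 1) // 2, p)
        result.append(e_k * pow(scale, -1, p) % p)
    return result


# Hypothetical example: rho = 1 + omega + omega^2, p = 5, l = 2,
# so Frob_l has eigenvalues 1, l, l^2 mod p.
print(hecke_eigenvalues_from_frobenius([1, 2, 4], 2, 5))
```

Run in the other direction, the same identity is how a table of computed Hecke eigenvalues can be compared against a candidate Galois representation, one prime l at a time.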
Let V be a right F[GL(n, Z/N)]-module, assumed to be finite dimensional over F. We view V as an FS_N-module by making the elements of S_N act via their reductions modulo N.

Conjecture 1.2. Suppose β ∈ H^*(Γ_0(N), V) is an eigenclass for the action of the Hecke algebra ℋ(pN). Then there exists a representation ρ : G_Q → GL(n, F) unramified outside pN attached to β.

We state our analogues to the weak and strong forms of Serre’s conjecture as Conjectures 2.1 and 2.2. Note that they include the case where ρ is reducible. In fact, Serre’s original strong conjecture could have been made in the reducible case also, with a few modifications necessitated by the fact that there is no modular form on SL(2, Z) of level 1 and weight 2. For example, if N = 1 and ρ is a sum of the trivial
character and the cyclotomic character mod p, the predicted weight has to be raised from 2 to p + 1.

In Section 3, we discuss some theoretical evidence for Conjectures 2.1 and 2.2, that is, cases where it can be proven that they hold: (1) certain monomial ρ when n = p − 1; (2) the tensor product of two odd 2-dimensional Galois representations; (3) symmetric squares of odd 2-dimensional Galois representations; and (4) 3-dimensional reducible ρ’s attached to cohomology classes that restrict nontrivially to the boundary of the Borel-Serre compactification M of the locally symmetric space for the relevant congruence subgroup of SL(3, Z).

We also have numerical evidence produced by computer, which we present in Section 4. So far, due to the difficulty of calculating cohomology groups and Hecke actions, the only computations we have been able to carry through involve reducible ρ’s; compare the discussion in Remark (1) after Conjecture 2.2. (Similarly, the numerical evidence for Conjecture 1.2 presented in [AM] and [AAC] also pertains to reducible ρ’s, because of the difficulty of searching for the predicted extensions of Q when the degree is large.)

Our most interesting experimental evidence concerns ρ’s that are sums of an even 2-dimensional representation and a 1-dimensional representation. In these cases, our conjecture gives a cohomological reciprocity law for the even 2-dimensional representations. These even representations are not covered by Serre’s conjecture. The corresponding cohomology classes in the cohomology of GL(3, Z) must restrict to 0 on the Borel-Serre boundary of M.

Implicit in the definition of “attached” (see Definition 1.1) is a choice of a normalization of the Satake isomorphism, called η in [G3]. In our case, η is the following one-parameter subgroup:
\[
\eta(t) = \begin{pmatrix}
t^{n-1} & 0 & \cdots & 0 & 0 \\
0 & t^{n-2} & \cdots & 0 & 0 \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & \cdots & 0 & 1
\end{pmatrix}.
\]
According to [G2], we should then expect that ρ(Frob∞ ) = ±η(−1). This is verified in practice in the examples found in [AM], [AAC], and [A5]. We include this parity requirement in Conjectures 2.1 and 2.2. It is the analogue of the oddness condition in Serre’s conjecture. When ρ is reducible, we have found experimentally that a finer parity requirement on ρ(Frob∞ ) is needed. This requirement is formulated below as the strict parity condition. We should stress that almost all of the mod p cohomology classes for which we test the conjectures lift to torsion classes in characteristic 0 cohomology, and therefore they have no obvious connection with automorphic cohomology or automorphic representations.
Acknowledgments. We thank B. Gross for explaining his work to us, H. Jacquet for telling us of Ramakrishnan’s work, and B. Mazur for many helpful conversations. Thanks also to G. Allison and especially E. Conrad for continuing help with the computations. Thanks to J. Quer for his assistance with Â_4-extensions.

2. Statement of the Serre-type conjectures. Let ω denote the cyclotomic character mod p of G_Q. For any nonnegative integer g, let V_g denote the right GL(n, F)-module of homogeneous polynomials of degree g in n variables (column vectors) with coefficients in F.

Irreducible GL(n, F_p)-modules are parametrized by “good” n-tuples of integers (b_1, b_2, ..., b_n), where good means that 0 ≤ b_1 − b_2, ..., b_{n−1} − b_n ≤ p − 1 and 0 ≤ b_n ≤ p − 2. We denote the corresponding module by F(b_1, b_2, ..., b_n). It is the restriction to GL(n, F_p) of the unique simple submodule of the dual Weyl module of GL(n, F) with highest weight (b_1, b_2, ..., b_n). The representation F(b_1, b_2, ..., b_n) can always be embedded into V_g for g = b_1 + p b_2 + ··· + p^{n−1} b_n (see [DW]). In particular, F(g, 0, ..., 0) embeds into V_g whenever (g, 0, ..., 0) is good.

If (a_1, a_2, ..., a_n) is any n-tuple of integers, we denote by (a_1, a_2, ..., a_n)′ a good n-tuple (b_1, b_2, ..., b_n) which is congruent to (a_1, a_2, ..., a_n) modulo p − 1. If none of the consecutive differences a_i − a_{i+1} is divisible by p − 1, then (a_1, a_2, ..., a_n)′ is uniquely determined. In the ambiguous cases, we must interpret statements involving (a_1, a_2, ..., a_n)′ as saying that they are true for some choice of (b_1, b_2, ..., b_n) as above.

When n = 2 and ρ is irreducible, the ambiguous cases are those in which a discrimination must be made between a Galois representation’s being “peu ramifiée” and “très ramifiée” in Serre’s sense [Se2]. Since we have no numerical evidence for such cases when n > 2, we do not pursue this discrimination here.
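The passage from an arbitrary n-tuple to a good n-tuple congruent to it modulo p − 1 is a finite enumeration. The sketch below is illustrative only (the function name `good_tuples` is an assumption, not notation from the paper). It lists every good tuple in the congruence class, so the ambiguous cases show up as lists with more than one entry.

```python
from itertools import product


def good_tuples(a, p):
    """All good n-tuples (b_1, ..., b_n) congruent to `a` modulo p - 1.

    Good means 0 <= b_i - b_{i+1} <= p - 1 for i < n and 0 <= b_n <= p - 2.
    """
    m = p - 1
    b_last = a[-1] % m  # the unique residue in 0 .. p-2
    # Each consecutive difference is determined mod p - 1; a difference that
    # is 0 mod p - 1 may be taken to be either 0 or p - 1 (the ambiguous case).
    diff_choices = []
    for i in range(len(a) - 1):
        d = (a[i] - a[i + 1]) % m
        diff_choices.append([0, m] if d == 0 else [d])
    results = []
    for diffs in product(*diff_choices):
        b = [b_last]
        for d in reversed(diffs):  # rebuild b_{n-1}, ..., b_1 from b_n
            b.append(b[-1] + d)
        results.append(tuple(reversed(b)))
    return results


print(good_tuples((2, 1, 0), 5))  # no difference divisible by 4: one tuple
print(good_tuples((7, 3, 0), 5))  # a_1 - a_2 = 4 is divisible by 4: two tuples
```

The second call exhibits the ambiguity described above: when some a_i − a_{i+1} is divisible by p − 1, more than one good tuple lies in the class.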
In Proposition 2.7, we shall see that this ambiguity also comes into play at level 1 in another way when ρ is reducible.

Conjecture 2.1. Let ρ : G_Q → GL(n, F) be a representation (as always, continuous and semisimple) which is unramified outside pN. Assume that ρ(Frob∞) has eigenvalues ±(1, −1, 1, −1, ...). Then there exist an integer N′ divisible only by the primes dividing pN, an FS_{pN}-module V on which the S_{pN}-action factors through reduction modulo pN, and an ℋ(pN)-eigenclass β ∈ H^*(Γ_0(N′), V) such that ρ is attached to β.

This is the weak form of our conjecture. Note that we allow N′ to be divisible by p in this form. The necessity of the parity condition on ρ(Frob∞) is strongly indicated by the experimental evidence below. In addition, from [AAC, Theorem 4.3 and its proof], we can list all possible Galois representations that can be attached to a Hecke eigenclass in the cohomology of GL(3, Z) with coefficients in an FS_p-module in which the action factors through reduction modulo p, where p = 5, 7. They all satisfy the parity condition on ρ(Frob∞). There are strong indications that the same is true for p = 3, 11.
We next give a stronger conjecture in which we predict precise analogues of the level, nebentype character, and weight in Serre’s conjecture. We only venture to predict the weight when ρ restricted to the inertia subgroup I_p at p is niveau 1 in the sense of Serre [Se2].

Level. For each prime q dividing N (in particular not equal to p), let G denote the image of a chosen decomposition group at q under ρ, and let G_i be its ramification subgroups. Denote the cardinality of G_i by g_i. Imitating the definition of the Artin conductor (as Serre does in [Se2]), we set
\[
f_q = \sum_{i=0}^{\infty} \frac{g_i}{g_0} \dim_F M/M^{G_i},
\]
where M is the vector space F^n viewed as a G-module under ρ. We then set
\[
N(\rho) = \prod_{q \mid N} q^{f_q}.
\]
Note that if ρ ≈ σ_1 ⊕ ··· ⊕ σ_r with σ_i irreducible for all i, then
\[
N(\rho) = \prod_{i} N(\sigma_i).
\]
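The conductor exponent f_q is a finite sum once the orders g_i of the ramification groups and the dimensions of their fixed subspaces are known. The following is a minimal sketch under that assumption; the function names and the local data in the examples are hypothetical and chosen only to illustrate the arithmetic, not taken from the paper’s computations.

```python
from fractions import Fraction


def conductor_exponent(n, ramification):
    """f_q = sum_i (g_i / g_0) * dim(M / M^{G_i}) for an n-dimensional M.

    `ramification` lists (g_i, dim of the subspace fixed by G_i) for
    i = 0, 1, ..., stopping before G_i becomes trivial (trivial groups
    contribute 0 to the sum).
    """
    g0 = ramification[0][0]
    return sum(Fraction(g_i, g0) * (n - fixed_dim)
               for g_i, fixed_dim in ramification)


def level(n, local_data):
    """N(rho) = product over ramified primes q of q^{f_q}.

    `local_data` maps each prime q to its ramification list.
    """
    result = 1
    for q, ramification in local_data.items():
        f_q = conductor_exponent(n, ramification)
        if f_q.denominator != 1:
            raise ValueError(f"non-integral conductor exponent at q = {q}")
        result *= q ** f_q.numerator
    return result


# Hypothetical tame example: n = 3, inertia image of order 3 at q = 277
# fixing a 2-dimensional subspace, so f_q = 1.
print(level(3, {277: [(3, 2)]}))
# Hypothetical wild example: n = 2, g_0 = 6, g_1 = 3, nothing fixed.
print(level(2, {3: [(6, 0), (3, 0)]}))
```

Exact rational arithmetic is used so that a non-integral total, which would signal inconsistent input data, is detected rather than rounded away.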
Nebentype character. Again following [Se2], we factor det ρ into εω^d for some integer d, where ε is unramified at p. By class field theory, we can view ε also as an F^×-valued Dirichlet character ε : (Z/N(ρ))^× → F^× of conductor dividing N(ρ), and as such we call it ε(ρ).

Let N = N(ρ). We have a homomorphism from S_N to (Z/N)^× sending a matrix to its upper left-hand entry modulo N. We use this to pull back ε to a character from S_N to F^×, and we denote by F_ε the 1-dimensional space on which S_N acts via ε. If V is any GL(n, Z/p)-module, we denote by V(ε) the (S_N ∩ GL(n, Z_p))-module V ⊗ F_ε, where the action on the first factor is via reduction modulo p of the matrices in S_N ∩ GL(n, Z_p).

Weight. The natural generalization of the weight is an irreducible GL(n, Z/p)-module F(b_1, ..., b_n). Every such module can be embedded into V_g for some g. When n = 2, the Eichler-Shimura theorem relates the cohomology of V_g to the modular forms of weight g + 2.

When ρ is reducible and p is odd, there is an additional requirement needed in our stronger conjecture which we shall call the strict parity condition. We consider ρ as a given representation, not merely up to equivalence. Suppose ρ is isomorphic to the direct sum of irreducible representations σ_1, ..., σ_k of degrees d_1, ..., d_k, respectively. We may assume there exists a standard Levi subgroup L of GL(n) of type (d_1, ..., d_k) such that the image of ρ lies in L(F). Standard means that if E = {e_1, ..., e_n} is the standard basis of n-space, then there exists a partition of E into k parts of sizes d_1, ..., d_k, respectively, such that L is the simultaneous stabilizer of the k subspaces, each of which is spanned by the basis vectors in one part of the partition.
Definition: The strict parity condition. With L and ρ as above, we say that ρ satisfies the strict parity condition if and only if p = 2 or ρ(Frob∞) is conjugate inside L(F) to the diagonal matrix ± diag(1, −1, 1, −1, ...).

Conjecture 2.2. Continue the hypotheses of Conjecture 2.1. Assume that ρ satisfies the strict parity condition. Set N = N(ρ) and ε = ε(ρ). Let I_p denote the chosen inertia subgroup at p, as above, and suppose that ρ | I_p is conjugate inside L(F) to a matrix of the form
\[
\rho|_{I_p} \sim \begin{pmatrix}
\omega^{a_1} & * & \cdots & * & * \\
0 & \omega^{a_2} & \cdots & * & * \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & \cdots & 0 & \omega^{a_n}
\end{pmatrix}
\]
for integers a_1, ..., a_n. Then we may take N′ = N and V = F(a_1 − (n − 1), a_2 − (n − 2), ..., a_n)′(ε).

Remarks. (1) Our computer programs actually compute homology. In fact, one can show that H_k(Γ_0(N), V) and H^k(Γ_0(N), V) are isomorphic as Hecke modules. If the degree ∗ of the homology class to which ρ is associated is the virtual cohomological dimension of GL(n, Z) and if p is prime to the orders of all torsion elements of Γ_0(N), then we may replace V by V_g for any g such that V embeds into V_g. This is because in this case H_∗(Γ_0(N), V) embeds into H_∗(Γ_0(N), V_g) as a Hecke module. For example, if (a_1 − (n − 1), a_2 − (n − 2), ..., a_n)′ = (b_1, ..., b_n), we can take g = b_1 + b_2 p + ··· + b_n p^{n−1}.

We use this in our experimental testing when n = 3 and ∗ = 3 = vcd(GL(3, Z)). The torsion primes of GL(3, Z) are 2 and 3, and we will always have p > 3 in our examples. We compute the homology of a congruence subgroup Γ of GL(3, Z) using a cellulation of a deformation retract of minimal dimension of the Borel-Serre compactification M of the locally symmetric space X/Γ, where X is the symmetric space of SL(3, R).

For testing the conjectures, we concentrate on the degree 3 homology classes. This is because cohomology Hecke eigenclasses that restrict nontrivially to the Borel-Serre boundary ∂M are known (see [AAC]) to satisfy Conjecture 2.1 with reducible attached Galois representations. (It would still be of interest to verify Conjecture 2.2 for them.) On the other hand, those cohomology classes that restrict to 0 on ∂M (the interior part of the cohomology) only occur in degrees 2 or 3. The degree 2 and degree 3 interior parts are dual to each other by Lefschetz duality. So for testing the conjectures on the interior cohomology, it is enough to look at the degree 3 space. See [AAC] and [AT] for details, where the quotient of homology dual to the interior cohomology is called the quasicuspidal homology. See also Proposition 2.8.

Since 3 is the top dimension of the retract of M used to compute the homology, it is much easier to work in that dimension, and our programs were written only to compute H_3.
Also, we worked with the coefficient systems easiest to handle algorithmically. Hence, our computer programs only compute cohomology with coefficients in V_g for variable g. In fact, models for the irreducible modules F(b_1, ..., b_n) are not known in general. However, it is a result of Doty and Walker [DW] that the irreducible GL(n, F_p)-module F(b_1, b_2, ..., b_n) can be embedded in V_g where g = b_1 + b_2 p + ··· + b_n p^{n−1}. This is not always the minimal embedding degree, and certainly not the minimal degree in which F(b_1, b_2, ..., b_n) occurs as a subquotient of V_g in general. We do not know what the minimal g should be in general.

Embedding in V_g does cause computational problems when g is too big, as happens with irreducible representations ρ. Conjecture 2.2 will predict some weight F(a, b, c). By twisting we can always ensure that c = 0. Reducible representations give us an extra degree of freedom that also lets us take b to be small. That is why all our numerical examples below involve reducible ρ.

As pointed out to us by Tiep, usually there are workable modules of dimension smaller than that of V_g for g = a + pb in which we can embed a given F(a, b, 0). For example, F(a, b, 0) is always a submodule of V_a ⊗ V_b. If we are willing to work with subquotients (although we are not guaranteed that the package of Hecke eigenvalues attached to an interior cohomology class with coefficients in a subquotient of V will also appear in H^3(V)), then we can use the fact that F(a, b, 0) is always a subquotient of V_{a−b} ⊗ V_b^∗, where the star denotes F-dual. We hope to use these smaller models in some future calculations.

(2) Our assumption on the restriction of ρ to inertia at p corresponds to the niveau 1 case of Serre’s conjecture. We have no experimental evidence for higher niveaux, nor are we certain what the conjecture should look like in those cases.
(3) If the image of ρ | I_p is not the whole upper triangular subgroup (for instance, if it is contained in the diagonal subgroup), then there may be more than one ordering possible for the characters along the diagonal. One may then choose the ordering that leads to the smallest possible value of g. Other orderings would give companion forms, as they are called when n = 2 (see [G2]). Unfortunately, in the cases we have at hand, the g for the companions is too large for us to test their existence.

(4) Let n = 3. In the case where ρ is the sum of a 1-dimensional and a 2-dimensional representation, the strict parity condition of the conjecture is based on computer examples discussed in Section 4. When ρ is the sum of three 1-dimensional representations, or for larger n, our conjecture is based on extrapolating from the 1 + 2 case.

(5) If ρ(Frob∞) has the eigenvalues ±(1, −1, 1, ...), the strict parity condition can always be obtained by conjugating ρ and choosing L appropriately. However, this will affect the order of the powers of ω along the diagonal in ρ | I_p, and hence the weight.

(6) Following Lemma 2.3, we prove that Conjectures 2.1 and 2.2 are stable under twisting by powers of ω and under replacing ρ with its contragredient.

To state Lemma 2.3, let χ : G_Q → GL(1, F) be a character of conductor dividing N, and let χ also denote the corresponding Dirichlet character χ : (Z/N)^× → F^×. For
any FS_N-module V as in Conjecture 1.2, let V(χ) denote the tensor product of V and the 1-dimensional FS_N-module on which S_N acts via the determinant reduced modulo N composed with χ. We also let V(i) denote the tensor product of V and the 1-dimensional FS_{pN}-module on which S_{pN} acts via the ith power of the determinant modulo p.

Lemma 2.3. A class β ∈ H^*(Γ_0(N), V) is an ℋ(pN)-eigenclass to which ρ is attached if and only if β ∈ H^*(Γ_0(N), V(χ)(i)) is an ℋ(pN)-eigenclass to which ρ ⊗ χω^i is attached.

Proof. Since Γ_0(N) is in the kernel of the determinant, it sees V and V(χ)(i) as the same module, so we can view β also as a Hecke eigenclass in H^*(Γ_0(N), V(χ)(i)). It is easy to check that ρ ⊗ χω^i is attached to this avatar of β. (The elements of S_{pN} do see the difference between V and V(χ)(i).)

Lemma 2.4. We have (F(c_1, ..., c_n)′)(1)(ε) = F(c_1 + 1, ..., c_n + 1)′(ε).

Proof. First, assume ε = 1. Let F(c_1, ..., c_n)′ = F(b_1, ..., b_n), where (b_1, ..., b_n) is a good n-tuple. Then (F(c_1, ..., c_n)′)(1) = F(b_1 + 1, ..., b_n + 1) if b_n < p − 2, and (F(c_1, ..., c_n)′)(1) = F(b_1 − p + 2, ..., b_n − p + 2) if b_n = p − 2. In either case, the parameters on the right-hand side form a good n-tuple that is congruent modulo p − 1 to (c_1 + 1, ..., c_n + 1). Hence, the right-hand side, in either case, equals F(c_1 + 1, ..., c_n + 1)′. To get the general result, just tensor both sides with F_ε.

Remark. On the left-hand side of Lemma 2.4, we can make any choice of F(···)′ if there is an ambiguity, and then the proof tells what choice to make on the right-hand side.

Lemma 2.5. Suppose that the representation ρ : G_Q → GL(n, F) is attached to the class β ∈ H^*(Γ_0(N), F(c_1, ..., c_n)′(ε)). Then ρ ⊗ ω is attached to β now viewed as a class in H^*(Γ_0(N), F(c_1 + 1, ..., c_n + 1)′(ε)).

Proof. By Lemma 2.3, ρ ⊗ ω is attached to β viewed as a class in H^*(Γ_0(N), F(c_1, ..., c_n)′(1)(ε)).
Then we are finished by Lemma 2.4.

Proposition 2.6. Conjectures 2.1 and 2.2 are stable under twisting by ω.

Proof. Stability of Conjecture 2.1 under twisting follows immediately from Lemma 2.3. As for Conjecture 2.2, its stability follows from Lemma 2.5.

A small bit of evidence for the conjectures when n = 2 comes from the following proposition.

Proposition 2.7. Conjectures 2.1 and 2.2 are true for reducible representations when n = 2.

Proof. By assumption, ρ is semisimple. So if it is also reducible, there are two characters ψ and φ from G_Q to F^× such that ρ = ψ ⊕ φ. We can write ψ = xω^a and
φ = yω^b, where x, y are unramified at p and have prime-to-p conductors N and M, respectively. By Lemma 2.5, we may assume that b = 0, and we may also assume that 2 ≤ a ≤ p. Conjecture 2.2 predicts that ρ is attached to some eigenclass β in H^1(Γ_0(NM), F(a − 1, 0)′(xy)).

We view x and y as the reductions modulo the prime π above p of characters X and Y into a suitable finite extension A of Z_p. Letting χ denote the cyclotomic character for p, we have that ψ and φ are the reductions of Xχ^a and Y. Since X, Y are unramified at p, they may be identified with A-valued Dirichlet characters having primitive conductors N and M, respectively. From the parity condition on ρ(Frob∞), we have that XY(−1) = (−1)^{a+1}. Then by [W, Proposition 1], there exists an Eisenstein series F_{X,Y} of weight a + 1, level NM, and nebentype XY, which is a Hecke eigenform with Xχ^a ⊕ Y as attached Galois representation. This Eisenstein series corresponds à la Eichler-Shimura (cf. [AS1, Theorem 2.3]) to a cohomology eigenclass in H^1(Γ_0(NM), Sym^{a−1}(A^2)(XY)). The reduction modulo π of this class is the desired β.

Conjectures 2.1 and 2.2 are compatible with duality in the following sense. For any group G and F[G]-module V, let V^∗ denote the dual module, that is, Hom(V, F), where g in G acts on a functional f by (fg)(v) = f(vg^{−1}).

Proposition 2.8. Let ρ be attached to a Hecke eigenclass α in H^i(Γ_0(N), V(ε)), where V is an irreducible GL(n, Z/p)-module over F. Let σ be defined by σ(g) = ^tρ(g^{−1}) ⊗ ω^{n−1}. Then σ is attached to a Hecke eigenclass β in H^i(Γ_0(N), V^∗(ε^{−1})).

Proof. If the invariants of ρ are N, ε, and V, then the invariants of σ are N, ε^{−1}, and V^∗. This is a simple exercise, given the facts that V^∗ is isomorphic to the contragredient of V, because V is irreducible (see [AT, Lemma 4.6]), and that the contragredient of F(a_1, ..., a_n)′ is F(−a_n, ..., −a_1)′. Let κ be the character of ℋ supported by α.
Define λ to be the character of ℋ given by λ(s) = κ(s^{−1}). Let m be the matrix diag(N, 1, 1). Consider the outer automorphism of GL(3) that sends x to m^{−1} · ^tx^{−1} · m. It preserves Γ_0(N) and induces an isomorphism of F-vector spaces from H^i(Γ_0(N), V(ε)) to H^i(Γ_0(N), V^∗(ε^{−1})). It sends α to a Hecke eigenclass β supporting λ. It is a straightforward but amusing exercise to show that σ is attached to β.

Remark. Using Lefschetz duality (as in the proof of [AT, Theorem 5.1]), provided α is an interior class, one can also show the existence of a Hecke eigenclass γ in H^j(Γ_0(N), V^∗(ε^{−1})) supporting λ, where i + j = n(n + 1)/2 − 1 is the dimension of the locally symmetric space X/Γ_0(N).

For absolutely irreducible representations when n = 2, we have the following proposition.

Proposition 2.9. Conjectures 2.1 and 2.2 are compatible with Serre’s conjecture for irreducible representations when n = 2.
Proof. The proof is easily checked using the fact that F(a, b) embeds into V_{a+pb}, and using the Eichler-Shimura isomorphism between cohomology with coefficients in V_g and modular forms of weight g + 2.

3. Theoretical evidence. First, let n = p − 1. Conjecture 2.1 is true for the following type of ρ, as proved in [A5]. Let K be the cyclotomic extension of Q obtained by adjoining a primitive pth root of unity. Let θ denote an F-valued ray class character on K viewed as a homomorphism from G_K to F^×. Then let ρ : G_Q → GL(n, F) denote the induced representation from θ, where n = p − 1.

The weight, level, and nebentype character that come in also support Conjecture 2.2. If we assume that ρ | I_p has the form stated in Conjecture 2.2, it follows from [A5, Lemma 5.3] that ρ | I_p ≡ diag(ω^{n−1}, ..., ω, 1), so the predicted weight is F(0, ..., 0)′. Indeed, at the end of [A5, Section 5], it is shown that ρ is attached to a cohomology class with trivial coefficients in this case. As for the level and nebentype character appearing in [A5], they are consistent with Conjecture 2.2, but cannot be compared exactly because a different kind of congruence subgroup is used in that paper, rather than Γ_0(N). (This is because in [A5] we work above the virtual cohomological dimension, and the congruence subgroup we use must have p-torsion.)

Next, let n = 4. Let f_1, f_2 be classical holomorphic modular newforms of even weights k_1 > k_2 for SL(2, Z). Let π_i be the corresponding automorphic representations on GL(2, A), where A denotes the adèle group of Q. By the work of D. Ramakrishnan [R] we know that there exists an automorphic representation Π on GL(4, A) which lifts the tensor product π_1 ⊗ π_2 from GL(2) × GL(2) to GL(4). Each f_i corresponds to a Hecke eigenclass in H^1(SL(2, Z), F(k_i − 2, 0)) with a mod p Galois representation σ_i attached to it. From [Cl, pp. 112–113], we see that a certain Tate twist of Π has an infinity type with (g, K)-cohomology with a certain finite-dimensional coefficient system E over C. Taking a lattice Λ ⊂ E and reducing modulo p, we get the existence of a cohomology Hecke eigenclass β in H^*(SL(4, Z), Λ/pΛ), which has attached to it the Galois representation ρ = σ_1 ⊗ σ_2 : G_Q → GL(4, F).

This verifies Conjecture 2.1 for ρ, and the same reasoning applies when the f_i have nontrivial level and nebentype. However, in the case we are discussing, we have gone further and verified Conjecture 2.2. That is, we have determined that the weight is correctly predicted in case the σ_i are niveau 1 (so that ρ also has niveau 1). To be more precise, we have checked that the weight is correct up to a twist; to determine the twist would require following through all the normalizations involved in the parametrizations of automorphic representations by maps into the L-groups of GL(2) and GL(4). We also have to assume that k_1 and k_2 are small compared with p so that Λ/pΛ is irreducible. Interestingly, the requirement that k_1 and k_2 be distinct is necessitated by the fact that otherwise Π∞ is not “regular” in the sense of [Cl] and, therefore, is not known to be connected with cohomology.
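The parity hypothesis of Conjecture 2.1 can be checked directly in the tensor product case, since the eigenvalues of (σ_1 ⊗ σ_2)(Frob∞) are the pairwise products of the eigenvalues of σ_1(Frob∞) and σ_2(Frob∞). The sketch below is illustrative only: it assumes p is odd, represents the eigenvalues as the integers +1 and −1, and uses hypothetical function names.

```python
from collections import Counter
from itertools import product


def tensor_eigenvalues(eigs1, eigs2):
    """Eigenvalues of A (x) B: all pairwise products of eigenvalues."""
    return [x * y for x, y in product(eigs1, eigs2)]


def satisfies_parity(eigs):
    """True if the eigenvalues, as a multiset, are +/-(1, -1, 1, -1, ...).

    Assumes p is odd, so that +1 and -1 are distinct.
    """
    alternating = [(-1) ** i for i in range(len(eigs))]
    target = Counter(alternating)
    negated = Counter(-x for x in alternating)
    return Counter(eigs) in (target, negated)


odd = [1, -1]        # complex conjugation under an odd 2-dimensional sigma
even_scalar = [1, 1]  # complex conjugation acting trivially

rho_eigs = tensor_eigenvalues(odd, odd)
print(rho_eigs, satisfies_parity(rho_eigs))
print(satisfies_parity(tensor_eigenvalues(even_scalar, even_scalar)))
```

For two odd representations the product has eigenvalues 1 and −1 each with multiplicity two, which is the pattern ±(1, −1, 1, −1) required when n = 4.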
For the rest of this section and for the remainder of this paper, we set n = 3. We conclude this section by discussing two series of examples that support Conjecture 2.1. The first applies to symmetric square lifts from GL(2) to GL(3). The second applies to reducible representations of type 1 + 2, where the 2-dimensional representation is odd. We expect this type of representation to be attached to cohomology classes that come from the boundary of the symmetric space. A fuller discussion of these examples, as of the preceding GL(4) example, will be more appropriate for a separate paper.

(i) Symmetric squares. Suppose π is a cuspidal irreducible automorphic representation for GL(2, A) generated by some holomorphic newform f of integer weight k greater than 1, level N, and nebentype ε. Thus, by a theorem of Eichler and Shimura, f corresponds to a cohomology Hecke eigenclass in H^1(Γ_0(N), V_{k−2}(ε)). Let σ denote the attached Galois representation G_Q → GL(2, F) for a suitable F.

Let A : GL(2) → GL(3) be the symmetric squares homomorphism, that is, the adjoint representation of GL(2) on the 2-by-2 matrices of trace 0 multiplied by the determinant. Let Π be the symmetric square lifting of π (see [GJ]). As discussed in [AS1] and [AT], the automorphic representation Π for GL(3, A) has cohomological infinity type. That is, Π∞ has (g, K)-cohomology with a well-defined finite-dimensional coefficient system E over C. Taking a lattice Λ ⊂ E and reducing modulo p, this allows us to deduce the existence of a cohomology Hecke eigenclass β_f in H^3(Γ_0(N′), (Λ/pΛ)(ε′)) for suitable N′ and ε′, which has attached to it the Galois representation G_Q → GL(3, F) given by A ∘ σ.

Now do this in reverse. Suppose we are given ρ : G_Q → GL(3, F), which is the symmetric square of an irreducible 2-dimensional representation σ and satisfies the hypotheses of Conjecture 2.1. So ρ = A ∘ σ. In particular, σ must be odd.
Then, by Serre’s conjecture, σ is attached to some f mod p, and so by the preceding paragraph, ρ will be attached to the corresponding β_f. Thus, Conjecture 2.1 holds for ρ if Serre’s conjecture holds for σ.

In [AT] we show that if σ is niveau 1 and if 0 < k − 2 < (p − 1)/2, or if k − 2 = (p − 1)/2 and p = 29, 37, or 41, then Λ/pΛ can be replaced by the weight predicted for ρ by Conjecture 2.2, which is a twist of F(2k − 2, k − 2, 0).

(ii) Borel-Serre boundary. Let X be the symmetric space for SL(3, R) and M the Borel-Serre compactification of X/Γ for an arithmetic subgroup Γ of SL(3, Z). Assume that p > 3. Then the cohomology of Γ with trivial coefficients is canonically isomorphic to the cohomology of M, and we can consider the restriction map of cohomology induced by the inclusion of the boundary of M into M. From [LS], we see that the cohomology of the boundary of M can be given in terms of classical modular forms of weights 2 and 3 and Dirichlet characters. Therefore, those boundary classes which are Hecke eigenclasses, and hence also those Hecke eigenclasses in the cohomology of Γ that restrict nontrivially to the boundary, have attached p-adic Galois representations that are reducible. (Compare the statement and proof of [AAC, Theorem 3.1], which states that any Hecke eigenclass in the
cohomology of the boundary with any admissible mod p coefficient module has an attached reducible Galois representation.) We can reduce the p-adic representation modulo p to get an attached ρ.

Remark. In the case where ρ is a sum of irreducible representations of dimensions 1 and 2, respectively, the 2-dimensional representation must be odd, since it is coming from a classical modular form. Therefore, to conform with Conjecture 2.2 in predicting the weight, we must take the strict parity condition into account, although there is no parity condition on the 1-dimensional character.

It remains to be seen if one can prove that every sum of a 1-dimensional and an odd 2-dimensional representation, and every sum of three 1-dimensional representations, which obeys the strict parity condition, is attached to a boundary cohomology class of the predicted weight, nebentype, and level.

4. Computer generated examples

Examples with N > 1. In [AM], we computed a number of cohomology Hecke eigenclasses with their Hecke eigenvalues for l = 2, 3, ..., 97 for the congruence subgroups Γ_0(N) of SL(3, Z) for prime N up to about 250, trivial nebentype character, and trivial coefficients Z/p. In some of these cases, we were able to find a Galois representation ρ that appeared to be attached to the eigenclass in the sense that the Hecke polynomial for every l ≤ 97 equaled the characteristic polynomial of Frob_l. In these cases, ρ was reducible. (Cases in which the data predicted an irreducible ρ had the image of ρ so large that we were unable to find it.)

In each of these cases, we have gone back now and checked that ρ(Frob∞) has eigenvalues (1, −1, 1) and that ρ restricted to inertia at p has eigencharacters along the diagonal (ω^2, ω, 1) in conformity with Conjecture 2.2, in particular, the strict parity condition. These cases include the examples where p = 3 and the image of the 2-dimensional component is Â_4, and the example where p = 5 and ρ is a sum of three characters.
In these cases, the predicted level and nebentype characters can also be checked to be correct.

One additional case that did not appear in [AM] deserves mention. There is a totally real Â_4-extension of Q unramified outside 277. This leads to a test of Conjecture 2.2 for p = 277 and N = 1 as discussed below. However, Â_4 is isomorphic to SL(2, Z/3), so the same extension gives rise to σ : G_Q → SL(2, F), where now p = 3. Set ρ = σ ⊕ ω with L equal to the intersection of the stabilizers of the spaces of row vectors (0, ∗, 0) and column vectors ^t(0, ∗, 0). The strict parity condition is satisfied if we put ω in the middle. Then ρ | I_p is conjugate in L(F) to diag(1, ω, 1) = diag(ω^2, ω, 1). The predicted weight is thus the trivial module F(0, 0, 0). It is easy to see that the predicted level is N = 277 and that the predicted nebentype character is trivial. We asked Mark McConnell to run his programs to find the Hecke module H^3(Γ_0(277), Z/3). Indeed, there was exactly one interior class (up to scalar multiples), and its Hecke eigenvalues (for primes l ≠ 3, l ≤ 97) confirmed Conjecture 2.2 in this case.
GALOIS REPRESENTATIONS AND COHOMOLOGY
We have included an example of a calculation showing the equality of a Hecke polynomial at l and the characteristic polynomial of Frob_l for p = 277, g = 90, l = 5 at the end of the discussion of the Â_4 cases in the next section.

Examples with N = 1. Suppose that we are given an irreducible representation σ : G_Q → GL(2, F) with the following properties:
(1) σ is unramified outside p;
(2) the image of σ has order relatively prime to p;
(3) σ(Frob_∞) is central, where Frob_∞ is a complex conjugation in G_Q;
(4) σ(I_p) has order dividing p − 1.
Because of condition (3), Serre's conjecture does not apply to σ. However, if we let ρ = ω^j σ ⊕ ω^k, for suitably chosen integers j and k, we obtain a representation to which Conjectures 2.1 and 2.2 apply, with n = 3 and N = 1; precisely, if σ(Frob_∞) = 1, we should choose j and k to have opposite parity, while if σ(Frob_∞) = −1, we should choose j and k to have the same parity. In this section, we consider numerical evidence for Conjectures 2.1 and 2.2 for representations ρ obtained in this way.

The prescription in 2.2 tells us that we should replace ρ by a conjugate representation corresponding to the embedding of GL(2) × GL(1) into GL(3), whose image L is the Levi subgroup of the form
\[
\begin{pmatrix} * & & * \\ & * & \\ * & & * \end{pmatrix}.
\]
Then we will have
\[
\rho(\mathrm{Frob}_\infty) \sim_L \pm \begin{pmatrix} 1 & & \\ & -1 & \\ & & 1 \end{pmatrix}
\quad\text{and}\quad
\rho|_{I_p} \sim_L \begin{pmatrix} \omega^a & & \\ & \omega^b & \\ & & \omega^c \end{pmatrix},
\]
for certain integers a, b, and c. In these expressions, "∼_L" represents conjugacy in L, so that b ≡ k mod p − 1, while a mod p − 1 and c mod p − 1 can be found by examining ω^j σ | I_p. Conjecture 2.2 then predicts that we should find a Hecke eigenclass corresponding to ρ in H^*(SL(3, Z), F(a − 2, b − 1, c)). By Remark (1) after Conjecture 2.2, there should be a Hecke eigenclass corresponding to ρ in H^3(SL(3, Z), V_g(F_p)), where g = (a − 2) + (b − 1)p + cp^2 (assuming that a, b, and c have been chosen so that (a − 2, b − 1, c) is a "good" triple). The existence of an eigenclass for the predicted value of g is already a partial confirmation of the conjecture. We can usually go further and compare the characteristic polynomials of ρ(Frob_l) with the characteristic polynomials predicted by Conjecture 2.1, for a number of small primes l, and in
this way find additional confirmation that Conjectures 2.1 and 2.2 are correct. The computations of cohomology and Hecke action use the programs described in [AAC].

Let σ̃ : G_Q → PGL(2, F) be the projective representation corresponding to σ, let G be the image of σ̃, and let K be the fixed field of the kernel of σ̃. Then K will be a Galois extension of Q, unramified outside of p, with a Galois group isomorphic to G. Furthermore, K is totally real, since σ̃(Frob_∞) = 1. Since G has order prime to p, it can be lifted to characteristic zero. Hence, G can be realized as an irreducible subgroup of PGL(2, C) and so is isomorphic to A_5, S_4, A_4, or a dihedral group.

Thus we organize our search for representations σ satisfying conditions (1)–(4) above by reversing these steps, as follows. Choose a group G from the list above, and search for totally real Galois extensions K/Q unramified outside a single prime p ∤ #G with Gal(K/Q) ≅ G. Given such a K, choose a projective representation σ̃ : G_Q → PGL(2, F) whose kernel has K as a fixed field, and consider the problem of lifting σ̃ to a representation σ : G_Q → GL(2, F) unramified outside p. The following lemma (and its proof) is apparently due to Serre; we heard about it from R. Taylor. The technique of using twisting by a character to adjust the ramification is due to Tate (see [Se1, Section 6]).

Lemma 4.1. Let σ̃ : G_Q → PGL(2, F) be a continuous projective representation that is totally real (i.e., σ̃(Frob_∞) = 1) and unramified outside the prime p. Then there is a unimodular lifting σ : G_Q → SL(2, F) of σ̃ which is unramified outside p.

Proof. We show first that a unimodular lifting exists; then we investigate the ramification. Since F is algebraically closed, the map SL(2, F) → PGL(2, F) is surjective, and its kernel is ±1; hence the obstruction to lifting σ̃ : G_Q → PGL(2, F) is the class of σ̃ in H^2(G_Q, ±1). From class field theory, we have a short exact sequence of local and global Brauer groups
\[
0 \longrightarrow H^2\bigl(G_{\mathbf{Q}}, \bar{\mathbf{Q}}^{\times}\bigr) \longrightarrow \bigoplus_v H^2\bigl(G_{\mathbf{Q}_v}, \bar{\mathbf{Q}}_v^{\times}\bigr) \longrightarrow \mathbf{Q}/\mathbf{Z} \longrightarrow 0,
\]
which yields, taking the kernel of multiplication by 2, the exact sequence
\[
0 \longrightarrow H^2\bigl(G_{\mathbf{Q}}, \pm 1\bigr) \longrightarrow \bigoplus_v H^2\bigl(G_{\mathbf{Q}_v}, \pm 1\bigr) \longrightarrow \tfrac{1}{2}\mathbf{Z}/\mathbf{Z}.
\]
Here, v runs over all the places of Q, and G_{Q_v} = Gal(Q̄_v/Q_v) is identified with a decomposition group of v in G_Q. Hence, it suffices to show that the restriction σ̃_v of σ̃ to G_{Q_v} can be lifted for each v. If v = ∞, then σ̃_v is the trivial representation, since σ̃(Frob_∞) = 1, and so σ̃_v can be lifted to the trivial representation. If v is a finite prime not equal to p, then σ̃_v factors through the maximal unramified extension of Q_v. Therefore, σ̃_v can be lifted to an unramified representation of G_{Q_v}
by choosing an element of SL(2, F) which maps onto σ̃_v(Frob_v), where Frob_v is a Frobenius automorphism in G_{Q_v}. It follows that σ̃_p must also lift, by the short exact sequence above. Hence, there is a representation σ : G_Q → SL(2, F) which lifts σ̃.

It is clear that σ is uniquely determined up to multiplication by a quadratic character of G_Q. The image of σ is not isomorphic to G, since A_5, S_4, and A_4 have no faithful 2-dimensional representations, and the dihedral groups have no faithful unimodular 2-dimensional representations. Thus, the fixed field of the kernel of σ is a quadratic extension of K and σ may be ramified at other primes besides p. However, multiplying σ by a suitable quadratic character of G_Q will produce a unimodular lifting unramified outside p. This can be seen as follows. Suppose that q ≠ p is a prime which is ramified for σ. Then σ(I_q) is the subgroup ±I of SL(2, F), where I is the 2 × 2 identity matrix. Let G_q be the decomposition group of q in G_Q containing I_q. Since σ(G_q)/σ(I_q) is cyclic and σ(I_q) is central, σ(G_q) is abelian. Let N be the commutator subgroup of G_q; then N ⊆ I_q, and I_q/N may be identified with Gal(Q_q(µ_{q^∞})/Q_q), which in turn can be identified with Gal(Q(µ_{q^∞})/Q). (Here µ_{q^∞} is the group of roots of unity of order a power of q.) Hence, there is a quadratic character χ_q of G_Q, unramified outside q, such that σ(τ) = χ_q(τ)I, for τ ∈ I_q. (A simpler argument is available for q odd. Then σ is tamely ramified at q, and I_q has a unique character of order 2, so that we may take χ_q to be the quadratic character of conductor q.) Let
\[
\chi = \prod_q \chi_q,
\]
the product taken over all rational primes q ≠ p which are ramified for σ. Then χσ is unramified outside p.

Finally, once σ has been found, we check that condition (4) is satisfied. (It always is in the dihedral and A_4 cases.)

Suppose then that σ is a unimodular representation satisfying conditions (1)–(4). Then σ | I_p = ω^d ⊕ ω^{−d} for some integer d. Let ρ = ω^j σ ⊕ ω^k, where the integers j and k satisfy the parity condition needed to guarantee that the eigenvalues of ρ(Frob_∞) are either (1, −1, 1) or (−1, 1, −1). Then we have ρ | I_p ≅ ω^{j+d} ⊕ ω^{j−d} ⊕ ω^k. The strict parity condition then requires that we take b ≡ k mod p − 1 (a and c may be taken to be congruent to j + d and j − d in either order). In the examples that follow, we choose j and k so as to make g small. This means that we take j = ±d and k = 1 or 2 (the parity of k is fixed by the choice of j); this allows us to take c = 0 and b = 1 or 2.
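The recipe for choosing j and k, and the resulting weight g = (a − 2) + (b − 1)p + cp^2, can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `predicted_weight` and its argument conventions are hypothetical, it assumes σ | I_p = ω^d ⊕ ω^{−d} with j = ±d as in the text, and it does not check that (a − 2, b − 1, c) is a "good" triple.

```python
def predicted_weight(p, d, j, frob_inf_sign):
    """Return ((a, b, c), g) for rho = omega^j * sigma + omega^k (hypothetical helper).

    Assumptions (not checked): p is an odd prime, sigma restricted to I_p is
    omega^d + omega^(-d), and the resulting triple is "good".
    """
    if frob_inf_sign not in (1, -1):
        raise ValueError("frob_inf_sign must be +1 or -1")
    if j not in (d, -d):
        raise ValueError("this sketch only covers the choice j = +d or j = -d")
    # j and k have the same parity when sigma(Frob_inf) = -1, opposite parity otherwise.
    same_parity = (frob_inf_sign == -1)
    k = 2 if ((j % 2 == 0) == same_parity) else 1
    # The exponents j + d and j - d modulo p - 1; one of them is 0 because j = +-d.
    exponents = sorted([(j + d) % (p - 1), (j - d) % (p - 1)])
    c, a = exponents          # c = 0, a = the nonzero exponent
    b = k
    g = (a - 2) + (b - 1) * p + c * p * p
    return (a, b, c), g


# Values of (p, d) taken from the examples treated later in this section.
print(predicted_weight(277, 92, -92, 1))
print(predicted_weight(163, 54, -54, -1))
```

The two calls use d = (p − 1)/3 for the Â_4 examples with p = 277 (totally real) and p = 163 (totally complex).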
Examples of dihedral type. We take G to be a dihedral group, so that we are looking for a Galois extension K/Q that is totally real and unramified outside a prime p, and for which Gal(K/Q) is dihedral and of order prime to p. Since G has even order, p must be odd. Let k be the maximal abelian extension of Q in K. Then, since k is unramified outside p, k/Q is cyclic; it follows that [k : Q] = 2 and that the relative degree [K : k] is odd, since G is dihedral. Since k is also totally real, we have p ≡ 1 mod 4 and k = Q(√p). Let T be an inertia group of p in G. Since p is tamely ramified in K, T is cyclic; since T surjects onto Gal(k/Q), T ∩ Gal(K/k) must be trivial. So T has order 2, and K/k is unramified at p. Hence, the dihedral extensions we are looking for are in one-to-one correspondence with the set of pairs (p, C), where p is a prime ≡ 1 mod 4 and C is a nontrivial cyclic quotient of the class group of Q(√p). (It is known that the class number of Q(√p) is odd and prime to p, so no further restrictions on C are needed; see [Sl].)

We fix such a dihedral extension K, and we let σ̃ : G_Q → PGL(2, F) be a projective representation corresponding to K. In this case, σ̃ lifts to a representation σ : G_Q → GL(2, F) whose image is also isomorphic to G, so that σ is again totally real and unramified outside p. We have σ | I_p ≅ ω^{(p−1)/2} ⊕ 1, since the image of I_p under σ has order 2 and det σ = ω^{(p−1)/2} (the determinant of any of the 2-dimensional irreducible representations of G is the nontrivial 1-dimensional character of G). So we may take ρ = σ ⊕ ω, for which we will have a = (p − 1)/2, b = 1, c = 0, so that g = a − 2 = (p − 5)/2.

There are 6 primes p ≡ 1 mod 4, p < 1000 for which the class number h of Q(√p) is greater than 1; in each of these cases, h itself is prime, so we obtain a unique dihedral extension unramified outside p as follows.

    p      h    g
    229    3    112
    257    3    126
    401    5    198
    577    7    286
    733    3    364
    761    3    378
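As a quick consistency check on the table, the following Python sketch recomputes g = (p − 5)/2 from a = (p − 1)/2, b = 1, c = 0 and counts the faithful 2-dimensional representations of the dihedral group of order 2h as (h − 1)/2, which equals φ(h)/2 for prime h. The names `DIHEDRAL_TABLE`, `dihedral_weight`, and `faithful_reps` are hypothetical and introduced only for this illustration; the class numbers are copied from the table, not computed.

```python
# (p, h, g) rows copied from the table above.
DIHEDRAL_TABLE = [
    (229, 3, 112),
    (257, 3, 126),
    (401, 5, 198),
    (577, 7, 286),
    (733, 3, 364),
    (761, 3, 378),
]


def dihedral_weight(p):
    """Weight g = (a - 2) + (b - 1)p + c p^2 with a = (p - 1)/2, b = 1, c = 0."""
    a, b, c = (p - 1) // 2, 1, 0
    return (a - 2) + (b - 1) * p + c * p * p


def faithful_reps(h):
    """Number of faithful 2-dimensional representations of the dihedral group
    of order 2h, for an odd prime h (phi(h)/2 = (h - 1)/2)."""
    return (h - 1) // 2


for p, h, g in DIHEDRAL_TABLE:
    print(p, h, g, dihedral_weight(p) == g, faithful_reps(h))
```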
The dihedral group of order 2h has φ(h)/2 2-dimensional faithful representations. Hence, for p = 229, 257, 733, and 761, there is a unique choice of σ, while for p = 401 there are two choices for σ, and for p = 577 there are three. Using the programs described in [AAC], we find a unique interior Hecke eigenclass for p = 229 in weight g = 112, a unique interior Hecke eigenclass for p = 257 in weight g = 126, and a unique interior Hecke eigenclass for p = 733 in weight g = 364. For p = 401, we find a 2-dimensional interior Hecke eigenspace in weight g = 198, and for p = 577, there is a 3-dimensional interior Hecke eigenspace in weight 286. Moreover, as stated in [AAC], the Hecke action on the eigenclasses for p = 229 and 257 was computed for a number of small primes l (l ≤ 13 for p = 229; l ≤ 19 for p = 257) and in each case was found to be consistent with the representation ρ = σ ⊕ ω. (G ≅ S_3 in these cases, so σ is the unique irreducible 2-dimensional representation of G.) Finally, we
also computed the Hecke action for l ≤ 7 for the classes for p = 401, 577, and 733, again with results consistent with Conjecture 2.2. The final pair in the table above (p = 761, g = 378) is just out of reach of our computer programs at present.

Examples of A_4 type. We take G to be A_4 and look for totally real Galois extensions K/Q, unramified outside a single prime p, with Galois group isomorphic to A_4. An analysis similar to that given in the dihedral case shows that such extensions are in one-to-one correspondence with the set of pairs (p, C), where p is prime, p ≡ 1 mod 3, and C is a Galois stable quotient of the ideal class group of the cubic subfield of Q(ζ_p) isomorphic to the Klein four-group. These fields can be found by looking for monic quartic polynomials f(x) ∈ Z[x] with Galois group A_4, four real roots, and field discriminant the square of a prime p. A search of the Bordeaux tables [B] finds nine such extensions in the range p < 1000; the values of p for these extensions are p = 163, 277, 349, 397, 547, 607, 709, 853, and 937. For each of these primes, there is a unique field K. Note that the ramification index of p in K is 3.

Let p be one of these primes, and let K be the associated A_4-extension. Let σ̃ : G_Q → PGL(2, F) be a projective representation corresponding to K; σ̃ is unique (up to conjugacy). By Lemma 4.1, there are exactly two liftings of σ̃ to SL(2, F) that are unramified outside p; if σ : G_Q → SL(2, F) is one of these liftings, ω^{(p−1)/2} σ is the other.

Let K̂ be the fixed field of the kernel of σ, and let e be the ramification index of K̂ at p. Then e is either 6 or 3 (depending on whether K̂/K is ramified at p or not). Since σ(I_p) is abelian, we have σ | I_p ≅ ω^{(p−1)/e} ⊕ ω^{−(p−1)/e}. It follows that replacing σ by ω^{(p−1)/2} σ switches the cases e = 6 and e = 3; hence, both cases do in fact occur, and we may suppose that σ has been chosen so that e is equal to 3 and K̂/K is unramified at p.

The image of σ is isomorphic to Â_4, the double cover of A_4. The field K̂ is either totally real or totally complex, and to test Conjectures 2.1 and 2.2 we need to know which is the case. We do this as follows. Given the field K, described as the splitting field of a monic quartic polynomial f(x) ∈ Z[x], we obtained from Jordi Quer an element z ∈ K, written as a polynomial with integer coefficients in two of the roots a, b of f (since K = Q(a, b), for any two roots of f), with the property that K(√z) is an Â_4-extension of Q. Quer's programs are based on the method of Crespo [C] for constructing Â_n-extensions.

Lemma 4.2. The set of Â_4-extensions L of Q containing K is in one-to-one correspondence with the set of square-free integers: each such L can be written as L = K(√(zm)) for some unique square-free integer m.
Proof. From the short exact sequence
\[
1 \longrightarrow \{\pm 1\} \longrightarrow K^{\times} \longrightarrow (K^{\times})^2 \longrightarrow 1,
\]
induced by squaring, we obtain the exact sequence
\[
1 \longrightarrow H^1\bigl(G, (K^{\times})^2\bigr) \longrightarrow H^2\bigl(G, \pm 1\bigr),
\]
where G = Gal(K/Q) ≅ A_4; and from
\[
1 \longrightarrow (K^{\times})^2 \longrightarrow K^{\times} \longrightarrow K^{\times}/(K^{\times})^2 \longrightarrow 1,
\]
we obtain the exact sequence
\[
\mathbf{Q}^{\times} \longrightarrow \bigl(K^{\times}/(K^{\times})^2\bigr)^{G} \longrightarrow H^1\bigl(G, (K^{\times})^2\bigr) \longrightarrow 1.
\]
Splicing these together gives us an exact sequence
\[
\mathbf{Q}^{\times} \longrightarrow \bigl(K^{\times}/(K^{\times})^2\bigr)^{G} \longrightarrow H^2\bigl(G, \pm 1\bigr).
\]
By direct calculation, we see that an element x ∈ K^× that represents a nontrivial class in (K^×/(K^×)^2)^G maps to the class in H^2(G, ±1) of the group extension
\[
1 \longrightarrow \{\pm 1\} \longrightarrow \mathrm{Gal}\bigl(K(\sqrt{x})/\mathbf{Q}\bigr) \longrightarrow G \longrightarrow 1.
\]
(Here we have identified Gal(K(√x)/K) with {±1}.) Now H^2(G, ±1) has order 2, and the class of Â_4, as an extension of A_4 by ±1, corresponds to the nontrivial class in H^2(G, ±1). Hence, the element z provided by Quer lies in (K^×/(K^×)^2)^G and maps to the nontrivial class in H^2(G, ±1). If x ∈ (K^×/(K^×)^2)^G is any element which maps to the nontrivial class in H^2(G, ±1), then x can be written in the form zmy^2, where m is a square-free integer and y ∈ K. The uniqueness of m results from the fact that Q^× ∩ (K^×)^2 = (Q^×)^2, since K has no quadratic subfields.

Thus the extension K̂ is of the form K(√(zm)) for some unique square-free integer m; of course, we are free to modify z by a square as well, if convenient.

To modify z so as to find a generator for K̂/K, we proceed as follows. Since K(√z) is Galois over Q, it follows that z^{σ−1} is a square in K, for every σ ∈ G. Let F = Q(a); then [K : F] = 3, which is odd, so that r = N_{K/F} z will generate the same quadratic extension of K as z. Next, we find the largest rational integer N that divides r in the ring of integers O_F of F, and set s = r/N. Then K(√s) will be a possibly different Â_4-extension.

We now observe that K(√s)/K is unramified at all primes except possibly those lying above p or 2. Indeed, if q ≠ p is an odd rational prime, then q does not divide s in O_F; hence, there is some prime 𝔮 of F dividing q which does not divide s. Hence, F(√s)/F is unramified at 𝔮, and so K(√s)/K is unramified at all primes of K above 𝔮, and therefore at all primes above q.
To decide whether K(√s)/K is unramified at the primes above p, we examine the divisibility of s by the primes above p in F; in fact, in all the examples we treat below, s turns out to be prime to p. This is easily detected by calculating N_{F/Q}(s) and seeing that this integer is not divisible by p.

At this point we can say that K̂ is either K(√s) or K(√(−s)), since the only freedom we have left is to change s to −s. To decide which, we have to consider ramification at 2. As with odd primes, there is a prime 𝔭 dividing 2 in F which does not divide s. We have the following lemma.

Lemma 4.3. The following are equivalent:
(1) K(√s)/K is unramified at all primes of K above 2;
(2) F(√s)/F is unramified at 𝔭;
(3) s ≡ u^2 mod 𝔭^2 for some u ∈ O_F.

Proof. (1) ⇐⇒ (2) is immediate, since K(√s)/Q is Galois and K is unramified at 2. To show that (3) ⇒ (2), we change s by a square and suppose that s ≡ 1 mod 𝔭^2; then F(√s)/F is also generated by the roots of the polynomial x^2 + x + (1 − s)/4, hence is unramified at 𝔭. (Note that (1 − s)/4 is integral at 𝔭 because 2 is unramified in F.) To show that (2) ⇒ (3), we first observe that (O_F/𝔭^2)^× ≅ F(𝔭)^× × F(𝔭)^+ (with F(𝔭) = O_F/𝔭), from which it follows that
\[
s \text{ is a square in } (O_F/\mathfrak{p}^2)^{\times} \iff s^{q-1} \equiv 1 \bmod \mathfrak{p}^2,
\]
where q = #F(𝔭) = N𝔭. Now suppose that F(√s)/F is unramified at 𝔭. If 𝔭 splits in F(√s), then s is a square in F(𝔭) and (3) holds. On the other hand, if 𝔭 remains prime, then the Artin symbol of 𝔭 for the extension F(√s)/F is the nontrivial automorphism φ of F(√s)/F; we have the congruence
\[
\varphi(\sqrt{s}) = -\sqrt{s} \equiv (\sqrt{s})^{q} \bmod \mathfrak{p},
\]
which implies on squaring that s ≡ s^q (mod 𝔭^2), so that (3) holds.

The method for checking whether s or −s satisfies condition (3) varies from case to case. Often, the simplest thing to do is to modify s by a square in K so that it becomes prime to 2 and then consider s mod 4. (This amounts to considering condition (3) for all the primes dividing 2 at once.) In any case, once we have found t = ±s such that K̂ = K(√t), we determine whether K̂ is totally real or totally complex by determining whether t is totally positive or totally negative.

The standard part of these calculations goes as follows, using the number theory package PARI-GP (version 1.39) [BB]. We have to supply the quartic polynomial f(x), the expression for z as a polynomial in two of the roots (a and b) of f, and the prime p.
    f=????;
    galois(f)
    sturm(f)
    factor(discf(f))
    sqrt(disc(f)/discf(f))
    a=mod(x,f);
    ff=f/(x-a);
    b=[0,1,0;0,0,1;-coeff(ff,0),-coeff(ff,1),-coeff(ff,2)];
    z=????;
    r=det(z);
    s=r/content(r);
    p=????;
    mod(norm(s),p)

The first three commands compute the Galois group of f, the number of real roots of f, and the factorization of the discriminant of the field F = Q(a); these are simply to confirm that we do indeed have a totally real A_4-extension of Q ramified only at a single prime. The next command calculates the index [O_F : Z[a]]. We view the second root b as the companion matrix (with entries from F) of the polynomial f(x)/(x − a), which is an irreducible cubic over F, so that the norm r = N_{K/F} z can be computed by taking the determinant of the matrix of z. We obtain s by removing from r the greatest common divisor of its coefficients as a polynomial (of degree at most 3) in a. The last command checks that s is prime to p (provided that mod(norm(s),p) is nonzero, which is true for the cases examined below).

When [O_F : Z[a]] > 1, s may still be divisible by an integer dividing this index; in the cases treated below, s is sometimes still divisible by 2. This can be detected by checking to see if s/2 is an algebraic integer:

    content(char(s/2,x))

If s/2 is an integer in F, then the content of its characteristic polynomial will be 1, in which case we replace s by s/2. Of course we must then test the new s to see whether it is still divisible by 2.

We illustrate with p = 277. The field F is generated by a root a of the polynomial f(x) = x^4 − x^3 − 16x^2 + 3x + 1, and the value of z obtained from Quer is

    z = 9417 + 787a − 51a^2 − 133a^3 + 1359b − 627ab − 468a^2 b + 114a^3 b − 345b^2 − 216ab^2 + 57a^2 b^2.

The PARI commands above confirm that the splitting field of f is a totally real A_4-extension of Q ramified only at 277, determine that the index [O_F : Z[a]] equals 4, and find that

    s = −150700a^3 + 193305a^2 + 2034265a + 347661.

Then s is an element of the field F such that K(√s) is an Â_4-extension of Q, and such that K(√s)/K is unramified at all primes except possibly 2 and 277. In fact, K(√s)/K is unramified at 277, since N_{F/Q}(s) = 10483965209607696 is not divisible by 277.
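The norm and trace of s can be recomputed without PARI. The sketch below is only an illustration in plain Python, not the authors' procedure: it represents multiplication by a on the basis 1, a, a^2, a^3 of F by the companion matrix of f, evaluates s at that matrix, and reads off N_{F/Q}(s) as the determinant and the trace of s as the matrix trace. All function and variable names are hypothetical.

```python
# f(x) = x^4 - x^3 - 16x^2 + 3x + 1: coefficients of degree 0..3 (leading 1 implicit).
F_LOW = [1, 3, -16, -1]
# s = 347661 + 2034265 a + 193305 a^2 - 150700 a^3, coefficients of degree 0..3.
S_COEFFS = [347661, 2034265, 193305, -150700]
# Norm of s as reported in the text.
REPORTED_NORM = 10483965209607696


def companion(low):
    """Matrix of multiplication by a on the basis 1, a, ..., a^(n-1)."""
    n = len(low)
    mat = [[0] * n for _ in range(n)]
    for i in range(1, n):
        mat[i][i - 1] = 1           # a * a^(i-1) = a^i
    for i in range(n):
        mat[i][n - 1] = -low[i]     # a * a^(n-1), reduced using f(a) = 0
    return mat


def matmul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def evaluate(coeffs, mat):
    """Return coeffs[0]*I + coeffs[1]*mat + coeffs[2]*mat^2 + ... exactly."""
    n = len(mat)
    power = [[int(i == j) for j in range(n)] for i in range(n)]
    total = [[0] * n for _ in range(n)]
    for c in coeffs:
        total = [[total[i][j] + c * power[i][j] for j in range(n)]
                 for i in range(n)]
        power = matmul(power, mat)
    return total


def det(mat):
    """Exact integer determinant by Laplace expansion along the first row."""
    if len(mat) == 1:
        return mat[0][0]
    result = 0
    for j, entry in enumerate(mat[0]):
        if entry:
            minor = [row[:j] + row[j + 1:] for row in mat[1:]]
            result += (-1) ** j * entry * det(minor)
    return result


def trace(mat):
    return sum(mat[i][i] for i in range(len(mat)))


S_MATRIX = evaluate(S_COEFFS, companion(F_LOW))
NORM_S = det(S_MATRIX)
TRACE_S = trace(S_MATRIX)
# Compare the computed norm with the reported one, and reduce it mod 277.
print(NORM_S, NORM_S == REPORTED_NORM, NORM_S % 277, TRACE_S)
```

Laplace expansion is used for the determinant only because the matrix is 4 × 4 and the arithmetic must stay in exact integers; for larger degrees a fraction-free elimination would be the more sensible choice.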
In this case, we need still to investigate whether s might be divisible by 2 (since Z[a] has index 4 in O_F). To see whether s/2 is integral, we calculate the content of its characteristic polynomial as follows:

    content(char(s/2,x))

This yields the value 1/4. So s is not divisible by 2 and thus is not divisible by at least one prime of F above 2. Using PARI, the following facts are easy to discover. Since N(a + 1) = 16, there is a prime 𝔭 above 2 which divides a + 1, so that 𝔭 also divides a − 1. The number (a − 1)^2/(a + 1) is an algebraic integer with odd norm; hence a + 1 is in fact divisible by 𝔭^2. Note that s ≡ a^2 + a + 1 mod 4; so s ≡ 1 mod (4, a + 1), and therefore s ≡ 1 mod 𝔭^2. By Lemma 4.3, K(√s) is unramified at all primes dividing 2, and therefore K̂ = K(√s). Finally, since the trace of s is 3775974, it follows that s is totally positive, and so K̂ is totally real.

Similar calculations in PARI with the remaining primes yield the following table.

    p     163   277   349   397   547   607   709   853   937
    K̂     −     +     −     −     −     +     −     −     −
Here we have used + to stand for a totally real field, and we have used − to stand for a totally complex field.

Let σ : G_Q → SL(2, F) be the unimodular representation associated to K̂. To get a 3-dimensional representation to which Conjectures 2.1 and 2.2 apply, we form ρ = ω^j σ ⊕ ω^k where j and k have different parity if σ(Frob_∞) = 1, and j and k have the same parity if σ(Frob_∞) = −1. Since K̂/K is unramified at p, the ramification index of p in K̂/Q is 3. Hence, σ | I_p ≅ ω^{(p−1)/3} ⊕ ω^{−(p−1)/3}.

Suppose that σ(Frob_∞) = 1. Thus K̂ is totally real, and so p = 277 or 607. Then we take j = −(p − 1)/3 and k = 1. For ρ = ω^{−(p−1)/3} σ ⊕ ω, we take a = (p − 1)/3, b = 1, and c = 0, giving (according to Conjecture 2.2) g = (p − 1)/3 − 2; so g = 90 if p = 277, and g = 200 if p = 607. Using the programs discussed in [AAC], we find a unique 1-dimensional interior Hecke eigenspace in H^3(SL(3, Z), V_90(F_277)) and in H^3(SL(3, Z), V_200(F_607)). Furthermore, we have computed the eigenvalues of the Hecke operators T(l, k) for l = 2, 3, 5, 7, 11, 13 and k = 1, 2 and found that the characteristic polynomials of Frobenius predicted by our conjecture are those arising from the representation ρ = ω^{−(p−1)/3} σ ⊕ ω.

We can also take j = (p − 1)/3 and k = 1, yielding a = 2(p − 1)/3, b = 1, c = 0, and so (according to Conjecture 2.2) g = 2(p − 1)/3 − 2. So g = 182 if p = 277, and g = 402 if p = 607. The first case is small enough for calculation; we find a unique 1-dimensional interior Hecke eigenspace in H^3(SL(3, Z), V_182(F_277)), and we can verify that Conjecture 2.2 correctly predicts the characteristic polynomials of Frobenius for l = 2, 3, 5, 7.
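The comparison of a Hecke polynomial with a characteristic polynomial of Frobenius for p = 277, l = 5, g = 90 can be replayed with modular arithmetic alone. The Python sketch below is illustrative only. It takes as given the eigenvalues 122 and 251 of T(5, 1) and T(5, 2) that this section quotes from [AAC], the eigenvalue 5^90 of T(5, 3), the trace −1 of σ on elements of order 3, and the pattern 1 − a(l,1)X + l·a(l,2)X^2 − l^3·a(l,3)X^3 for the Hecke polynomial; the helper name `polymul_mod` is hypothetical. Negative exponents in the three-argument `pow` require Python 3.8 or later.

```python
P, L = 277, 5

# Hecke side: eigenvalues a(5, k) for k = 0, 1, 2, 3 on the eigenclass in weight g = 90.
eigenvalues = [1, 122, 251, pow(L, 90, P)]
# Coefficient of X^k is (-1)^k * l^(k(k-1)/2) * a(l, k), reduced mod p.
hecke_poly = [(-1) ** k * pow(L, k * (k - 1) // 2, P) * eigenvalues[k] % P
              for k in range(4)]


def polymul_mod(u, v, p):
    """Multiply two polynomials (coefficient lists, low degree first) mod p."""
    out = [0] * (len(u) + len(v) - 1)
    for i, ui in enumerate(u):
        for j, vj in enumerate(v):
            out[i + j] = (out[i + j] + ui * vj) % p
    return out


# Galois side: rho = omega^(-92) sigma + omega, with omega(Frob_5) = 5,
# tr sigma(Frob_5) = -1 and det sigma = 1.
w = pow(L, -92, P)                               # omega^(-92)(Frob_5)
sigma_factor = [1, (-(-1) * w) % P, (w * w) % P]  # 1 - tr*w*X + w^2*X^2
omega_factor = [1, (-L) % P]                      # 1 - 5X
galois_poly = polymul_mod(sigma_factor, omega_factor, P)

print(hecke_poly)
print(galois_poly)
```

Both lists come out as [1, 155, 147, 251], that is, 1 + 155X + 147X^2 + 251X^3 in F_277[X].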
We illustrate these calculations with p = 277, l = 5, and ρ = ω^{−92} σ ⊕ ω. We consider the Hecke operators T(5, k) (k = 0, 1, 2, 3) acting on the unique 1-dimensional interior Hecke eigenspace in H^3(SL(3, Z), V_90(F_277)). The Hecke operator T(5, 0) is the identity, and has eigenvalue 1; T(5, 3) arises from the scalar matrix diag(5, 5, 5), and has eigenvalue 5^90 (mod 277). According to [AAC], the eigenvalues of the Hecke operators T(5, 1) and T(5, 2) are 122 and 251 (mod 277), respectively, so that Conjecture 2.2 predicts that (in F_277[X])
\[
\det\bigl(1 - \rho(\mathrm{Frob}_5)X\bigr) = 1 - 122X + 5 \cdot 251X^2 - 5^3 \cdot 5^{90}X^3 = 1 + 155X + 147X^2 + 251X^3.
\]
We check this by calculating det(1 − ρ(Frob_5)X) as follows. Recall that K is the splitting field of f(X) = 1 + 3X − 16X^2 − X^3 + X^4 and that F is the field generated by a root of f. Since f(X) factors mod 5 into the irreducible factors X + 2 and X^3 + 2X^2 + 3, 5 factors in F into a prime of degree 1 and a prime of degree 3, and from this it is easy to see that Frob_5 is a 3-cycle in Gal(K/Q) ≅ A_4. Since K̂ = K(√s), with s = −150700a^3 + 193305a^2 + 2034265a + 347661 ≡ 1 mod 5, the primes above 5 in K split in K̂. Now, Â_4 has a unique unimodular 2-dimensional representation, so we may identify σ with this representation. We find tr(σ(τ)) = −1 for any element of order 3 in Â_4, so that for ρ = ω^{−(p−1)/3} σ ⊕ ω = ω^{−92} σ ⊕ ω, we have
\[
\det\bigl(1 - \rho(\mathrm{Frob}_5)X\bigr) = \bigl(1 - 5^{-92}(-1)X + 5^{-184}X^2\bigr)(1 - 5X) = 1 + 155X + 147X^2 + 251X^3,
\]
confirming the prediction of Conjecture 2.2.

Suppose that σ(Frob_∞) = −1, so that K̂ is totally complex, and p is one of the primes 163, 349, and so on. We take j = −(p − 1)/3 and k = 2, so that ρ = ω^{−(p−1)/3} σ ⊕ ω^2, giving a = (p − 1)/3, b = 2, c = 0, and g = (p − 1)/3 − 2 + p. So g = 215 if p = 163; g = 463 if p = 349, and so on. Note that g is odd in these cases, while the programs in [AAC] were written under the assumption that g was even. However, the modifications needed to treat odd g are minor, and after the publication of [AAC], Allison and Conrad modified their programs to compute H^3(SL(3, Z), V_g) with odd g. The first case (p = 163, g = 215) is small enough for calculation; we do in fact find a unique 1-dimensional interior Hecke eigenspace in H^3(SL(3, Z), V_215(F_163)), and we have verified that Conjecture 2.2 correctly predicts the characteristic polynomials of Frobenius for l = 2, 3, 5, 7. The second case (p = 349, g = 463) was too large for calculation of the action of the Hecke operators, but we were able to verify that there is a unique 1-dimensional interior Hecke eigenspace in H^3(SL(3, Z), V_463(F_349)).

Finally, we mention several examples that led us to the strict parity condition of Conjecture 2.2 (these examples also have odd values of g).
• Let p = 163, and let σ be the Â_4-representation associated to p above. We have σ(Frob_∞) = −1 and σ | I_p ≅ ω^54 ⊕ ω^{−54}. If we take ρ = ω^{−52} σ ⊕ 1, then ρ | I_p
≅ ω^2 ⊕ ω^56 ⊕ 1; if we drop the strict parity condition, we would be able to take a = 56, b = 2, c = 0, which would lead to g = 217. However, H^3(SL(3, Z), V_217(F_163)) = 0.
• Let p = 277, and let σ be the Â_4-representation associated to p above. We have σ(Frob_∞) = 1 and σ | I_p = ω^92 ⊕ ω^{−92}. If we take ρ = ω^{−91} σ ⊕ 1, then ρ | I_p ≅ ω ⊕ ω^93 ⊕ 1; if we drop the strict parity condition, we would be able to take a = 93, b = 1, c = 0, which would lead to g = 91. However, H^3(SL(3, Z), V_91(F_277)) = 0.

The remaining types. The only examples of types S_4 and A_5 of which we know give values of g that are too large to verify at present. For example, there is a totally real S_4-extension ramified at only one prime with prime discriminant p = 2777; in this case, σ cuts out an Ŝ_4-extension K̂/Q ramified only at 2777. The ramification index of p in K̂ is 4, so that σ | I_p ≅ ω^{(p−1)/4} ⊕ ω^{−(p−1)/4} = ω^694 ⊕ ω^{−694}. Hence, if K̂ is totally real, we may take ρ = ω^694 σ ⊕ ω, leading to a predicted value of g = 1386. If K̂ is totally complex, the best we can do is to take ρ = ω^694 σ ⊕ ω^2, leading to a predicted value of g = 4163.

References

[AAC] G. Allison, A. Ash, and E. Conrad, Galois representations, Hecke operators and the mod p cohomology of GL(3, Z) with twisted coefficients, Experiment. Math. 7 (1998), 361–390.
[A1] A. Ash, Farrell cohomology of GL(n, Z), Israel J. Math. 67 (1989), 327–336.
[A2] A. Ash, Galois representations attached to mod p cohomology of GL(n, Z), Duke Math. J. 65 (1992), 235–255.
[A3] A. Ash, "Galois representations and cohomology of GL(n, Z)" in Séminaire de Théorie des Nombres (Paris, 1989–90), Progr. Math. 102, Birkhäuser, Boston, 1992, 9–22.
[A4] A. Ash, Galois representations and Hecke operators associated with the mod p cohomology of GL(1, Z) and GL(2, Z), Proc. Amer. Math. Soc. 125 (1997), 3209–3212.
[A5] A. Ash, Monomial Galois representations and Hecke eigenclasses in the mod p cohomology of GL(p − 1, Z), Math. Ann. 315 (1999), 263–280.
[AM] A. Ash and M. McConnell, Experimental indications of three-dimensional Galois representations from the cohomology of SL(3, Z), Experiment. Math. 1 (1992), 209–223.
[AS1] A. Ash and G. Stevens, Cohomology of arithmetic groups and congruences between systems of Hecke eigenvalues, J. Reine Angew. Math. 365 (1986), 192–220.
[AS2] A. Ash and G. Stevens, Modular forms in characteristic l and special values of their L-functions, Duke Math. J. 53 (1986), 849–868.
[AT] A. Ash and P. H. Tiep, Modular representations of GL(3, F_p), symmetric squares, and mod-p cohomology of GL(3, Z), J. Algebra 222 (1999), 376–399.
[B] Laboratoire A2X, UFR de Mathématiques et Informatique, Université Bordeaux I, Tables of Number Fields (May 15, 1998), available from ftp://megrez.math.ubordeaux.fr/pub/numberfields/, (January 22, 1999). (The tables of fields of degree 4 are based on tables compiled by J. Buchmann, D. Ford, and M. Pohst.)
[BB] C. Batut, K. Belabas, D. Bernardi, H. Cohen, and M. Olivier, PARI-GP, available from http://www.parigp-home.de/.
[Br] K. Brown, Cohomology of Groups, Grad. Texts in Math. 87, Springer, New York, 1982.
[Cl] L. Clozel, "Motifs et formes automorphes" in Automorphic Forms, Shimura Varieties and L-functions (Ann Arbor, Mich., 1988), Vol. 1, Perspect. Math. 10, Academic Press, Boston, 1990.
[C] T. Crespo, Explicit constructions of Â_n type fields, J. Algebra 127 (1989), 452–461.
[CR] C. Curtis and I. Reiner, Methods of Representation Theory, Vol. 1, Pure Appl. Math., Wiley, New York, 1981.
[Da] H. Darmon, "Serre's conjectures" in Seminar on Fermat's Last Theorem (Toronto, 1993–94), CMS Conf. Proc. 17, Amer. Math. Soc., Providence, 1995.
[Di] F. Diamond, "The refined conjecture of Serre" in Elliptic Curves, Modular Forms and Fermat's Last Theorem (Hong Kong, 1993), Ser. Number Theory 1, International Press, Cambridge, Mass., 1995, 22–37.
[DW] S. Doty and G. Walker, The composition factors of F_p[x_1, x_2, x_3] as a GL(3, p)-module, J. Algebra 147 (1992), 411–441.
[E] B. Edixhoven, "Serre's conjectures" in Modular Forms and Fermat's Last Theorem (Boston, 1995), Springer, New York, 1997, 209–242.
[GJ] S. Gelbart and H. Jacquet, A relation between automorphic representations of GL(2) and GL(3), Ann. Sci. École Norm. Sup. (4) 11 (1978), 471–542.
[G1] B. Gross, A tameness criterion for Galois representations associated to modular forms (mod p), Duke Math. J. 61 (1990), 445–517.
[G2] B. Gross, Modular forms (mod p) and Galois representations, Internat. Math. Res. Notices 1998, 865–875.
[G3] B. Gross, "On the Satake isomorphism" in Galois Representations in Arithmetic Algebraic Geometry (Durham, 1996), London Math. Soc. Lecture Note Ser. 254, Cambridge Univ. Press, Cambridge, 1998, 223–237.
[G4] B. Gross, Algebraic modular forms, Israel J. Math. 113 (1999), 61–93.
[LS] R. Lee and J. Schwermer, Cohomology of arithmetic subgroups of SL_3 at infinity, J. Reine Angew. Math. 330 (1982), 100–131.
[R] D. Ramakrishnan, Modularity of the Rankin-Selberg L-series and multiplicity one for SL(2), to appear in Ann. of Math. (2).
[Se1] J.-P. Serre, "Modular forms of weight one and Galois representations" in Algebraic Number Fields: L-functions and Galois properties (Durham, 1975), Academic Press, London, 1977, 193–268.
[Se2] J.-P. Serre, Sur les représentations modulaires de degré 2 de Gal(Q̄/Q), Duke Math. J. 54 (1987), 179–230.
[Sl] I. Sh. Slavutskii, Upper bounds and numerical calculation of the number of ideal classes of real quadratic fields, Amer. Math. Soc. Transl. Ser. 2 82 (1969), 67–72.
[W] J. Weisinger, Some results on classical Eisenstein series and modular forms over function fields, Ph.D. thesis, Harvard University, 1977.
Department of Mathematics, Ohio State University, 231 West 18th Avenue, Columbus, Ohio 43210, USA
Vol. 105, No. 1
DUKE MATHEMATICAL JOURNAL
© 2000
FIBER BROWNIAN MOTION AND THE “HOT SPOTS” PROBLEM RICHARD F. BASS and KRZYSZTOF BURDZY
1. Introduction. The main purpose of this article is to give a stronger counterexample to the “hot spots” conjecture than the one presented in Burdzy and Werner [7]. Along the way we define and partly analyze a new process, which we call fiber Brownian motion, and which may have some interest of its own. Consider a Euclidean domain that has a discrete spectrum for the Laplacian with Neumann boundary conditions, for example, a bounded domain with Lipschitz boundary. Recall that the first Neumann eigenfunction is constant. The hot spots conjecture says that the maximum of the second Neumann eigenfunction is attained at a boundary point. Burdzy and Werner [7] constructed a domain where the second eigenfunction attains its maximum inside the domain but its minimum lies on the boundary. If ϕ is an eigenfunction, so is −ϕ, and hence the maximum and minimum are indistinguishable in the context of this problem. Hence the counterexample of [7] leaves open the following question. Question 1.1. Must at least one of the extrema of the second Neumann eigenfunction be attained on the boundary of the domain? The uncertainty about the answer to this question is underscored by the nature of Burdzy and Werner’s counterexample, which cannot be easily modified to solve Question 1.1. The second author learned (in a private communication) about a different counterexample, obtained by D. Jerison and N. Nadirashvili, which seems to have the same property—that the minimum of the eigenfunction lies on the boundary. The answer to Question 1.1 is given in the following statement. Theorem 1.2. There exists a bounded Lipschitz domain in the plane such that its second Neumann eigenvalue is simple, and both extrema of the corresponding eigenfunction are attained only at interior points of the domain. We believe that each extremum of the eigenfunction in Theorem 1.2 is attained at a single point, but we do not prove this. Our example is based on an idea of W. 
Werner’s, involving two triangles whose vertices are connected by very thin tubes. The hot spots conjecture was proposed by J. Rauch at a conference in 1974. The only published statement of the conjecture is contained in a book by B. Kawohl [11].

Received 15 January 1999. Revision received 12 January 2000. 2000 Mathematics Subject Classification. Primary 35P99, 60J65; Secondary 60J60, 60J45. Authors’ research partially supported by National Science Foundation grant number DMS-9700721.
Bañuelos and Burdzy [2] proved the conjecture for a class of planar domains, and Burdzy and Werner [7] found a counterexample. The introduction to the paper of Bañuelos and Burdzy [2] contains a discussion of many aspects of the conjecture. Jerison and Nadirashvili [10] have some new results on the hot spots problem in convex planar domains with two axes of symmetry. The fiber Brownian motion is a process that moves like 2-dimensional Brownian motion in a part of its state space, but it evolves like a 1-dimensional Brownian motion if it happens to be on one of many “fibers” in its state space. The process is obtained as a weak limit of reflected Brownian motions in domains with very thin tubes. The main property of fiber Brownian motion used in the present article is that its motion along the fibers is not affected by their curvature, loosely speaking. This holds only for some families of fibers and leads to the natural question about the behavior in a general family of fibers. We limit our analysis to the families of fibers needed to prove Theorem 1.2. Many elements of the construction of the domain in Theorem 1.2 and also many elements of the argument are adapted from [6], [2], and [7]. Two types of couplings, synchronous and mirror, were used by Bañuelos and Burdzy [2] to prove some results about hot spots. We would like to emphasize that the simpler synchronous coupling does not work in the present paper, so we use the more interesting mirror coupling, whose properties relevant to this problem were first analyzed by Burdzy and Kendall [6]. The letter c with a subscript denotes a finite and strictly positive constant whose exact value is unimportant. We renumber the constants in each proof. In a number of places explicit constants appear, such as 223, 17, and so on. These numbers were convenient for presenting the proofs, but are not otherwise important. Acknowledgments. We are grateful to M. van den Berg and W. D. 
Evans for very useful advice on spectral theory. We would also like to thank the referee for helpful suggestions on the presentation of our results. 2. Overview of proof. Since our argument is rather technical, we review here some known facts and techniques related to the hot spots problem and explain how they motivated our project. We also outline the main components of our proofs. The precursor of the counterexample given in [7] was an idea of W. Werner’s, which consists of two large triangles whose respective vertices are connected by very thin tubes. Consider the heat problem in this region with Neumann boundary conditions. We can think about heat and cold as two substances that avoid or annihilate each other. For this reason, if the initial temperature distribution is such that one of the triangles is hot and the other is cold, it seems natural that the first will stay warmer than the second forever. The center of the warmer triangle will be the hottest spot because the center is the point that lies at the greatest distance from the tubes. By symmetry, the center of the other triangle will be the coldest spot. An eigenfunction
Figure 2.1
expansion translates this property of the heat equation solution into the fact that the second eigenfunction for the Neumann Laplacian attains its extrema at the centers of the triangles. We now explain how this sketch of an argument may be implemented in a rigorous way when the triangles and tubes are on the surface of a 2-dimensional manifold embedded in 3-dimensional space. Consider a cylinder of finite height in $\mathbb{R}^3$ whose base is an equilateral triangle with each vertex truncated. The sides of the cylinder consist of six rectangles, three of which are very narrow and three of which are wider. The domain Ᏸ1 consists of part of the surface of this cylinder, namely, the top, bottom, and the three narrow rectangles on the sides; see the shaded area in Figure 2.1. The reason for distorting the Werner idea in this way is to generate more symmetries. We parenthetically note that the hot spots conjecture has not been studied in the context of manifolds, as it is evidently false: the surface of an (American) football with a small hole far from its endpoints seems to be an obvious counterexample, although proving this assertion might not be quite trivial. We consider a manifold only to illustrate the method and some of the technical problems associated with it. Split the surface Ᏸ1 in Figure 2.1 into six isometric parts as indicated in Figure 2.2. Let one of them be called E; unfolded, E appears as in Figure 2.3. It is not hard to show that if the tube is thin enough, then the second Neumann eigenfunction $\varphi_2^E$ for E is simple and symmetric (see proof of Lemma 6.7). Consider the heat problem in E with the initial temperature equal to 1 on the left half and −1 on the right half of E. Using a coupling argument as in [2], we can show that the left part will always have a higher temperature than the right part of E. An eigenfunction expansion then can be used to prove that the maximum of $\varphi_2^E$ must take place at the left vertex of E (cf. Lemma 4.1).
Figure 2.2
We can combine the six copies of E to form Ᏸ1 and by symmetry construct the second eigenfunction $\varphi_2^{\mathcal{D}_1}$ for Ᏸ1 from copies of $\varphi_2^E$. The extrema of $\varphi_2^{\mathcal{D}_1}$ are the same as those of $\varphi_2^E$, and so they lie in the interior of Ᏸ1. We can see that the second eigenvalue for Ᏸ1 must be simple, and we thus have an example where neither extremum of the unique second eigenfunction takes place on the boundary.
Figure 2.3
When we try to construct a similar counterexample in a Euclidean domain, we are confronted with the lack of convenient symmetries. Burdzy and Werner [7] considered a domain invariant under rotation by the angle 2π/3, consisting of two bulky parts connected by thin tubes. (Their example is too complicated to be faithfully represented by a small picture.) The price they had to pay was that one of the extrema of the second eigenfunction had to lie on the exterior boundary of the set. This left the complete resolution of the hot spot conjecture unsettled, as one could make an argument that the original conjecture should have been interpreted as saying that at least one of the extrema of the second eigenfunction must lie on the boundary. The effective indistinguishability between a second eigenfunction ϕ and −ϕ makes this interpretation possible.
Figure 2.4
Our paper aims at giving a complete negative answer to the hot spots conjecture by going back to the original idea of Werner. Let us consider two equilateral triangles that are the same size, disjoint, and have their lower side on the x-axis. We would like to connect respective vertices with thin tubes in such a way that all three tubes are isometric. This, of course, is not possible. We could connect the respective vertices by curves of the same length, but as far as the eigenvalues and eigenfunctions are concerned, we would be then in the same situation as just having two disjoint triangles. Probabilistically, we can interpret this as saying that a Brownian motion cannot enter the curves with positive probability. We instead connect the two triangles by a collection of curves or fibers (in fact, an uncountable collection of them). The concept is illustrated in Figure 2.4, with an enlargement of a neighborhood of a vertex illustrated in Figure 2.5.
Figure 2.5
A large number of obstacles are placed in the upper part of Figure 2.5 so that a Brownian path can freely move in the lower part but is forced to move in an
almost vertical fashion between the obstacles. When the maximum distance between the obstacles goes to zero, the reflected Brownian motion in this domain will converge in distribution to a fiber Brownian motion, a process that switches between 2-dimensional and 1-dimensional behavior. If we call the set consisting of two triangles plus the connecting curves (fibers) Ᏸ2 , we can break Ᏸ2 up into six isomorphic subsets and proceed as in the discussion of Ᏸ1 and E above. The difficulty is that we no longer have an open set. We therefore approximate Ᏸ2 by a domain Ᏸ3 , which consists of the two triangles together with a large number of very thin tubes connecting neighborhoods of the respective vertices. All the tubes in Ᏸ3 are approximately the same length. Much of the work in this paper is in constructing Brownian motion on Ᏸ2 as the limit of Brownian motion in domains of the form of Ᏸ3 (see Sections 3 and 4) and showing that those spectral properties of Ᏸ3 that are needed are close to the corresponding ones for Ᏸ2 (see Section 6). We note that there exist at least two easy constructions of fiber Brownian motion, as noticed by several of our colleagues. One of them uses a probabilistic technique of time-change and the other construction is based on excursion theory. We are forced to use a rather complicated approximation construction of fiber Brownian motion by the fact that, in the end, we have to consider a domain with very thin but not infinitely thin tubes. Section 5 is a description of the coupling that is needed for our version of the Bañuelos and Burdzy results. We use two types of couplings of Brownian paths, synchronous (see Figure 2.6) and mirror (see Figure 2.7).
Figure 2.6
These two basic coupling ideas are combined in a complicated way to obtain a new coupling that will always keep one of the particles in a slice (one-sixth) of Ᏸ2 to the left of the other particle, where “left” refers to a suitable partial order. The order of presentation of our results is the following. We start with the approximation construction of fiber Brownian motion in Sections 3 and 4—these sections are independent of the rest of the paper and may have some interest of their own. Section
Figure 2.7
5 is devoted to the construction of a coupling and is also self-contained, but it has no relevance outside the scope of this paper. The bulk of Section 6 is devoted to various potential theoretic estimates which show, among other things, that we have uniform convergence for several crucial quantities when the domain Ᏸ2 containing the fiber bundle is approximated by domains Ᏸ3 with thin tubes. The proof of our main result, Theorem 1.2, finishes the paper.

3. Brownian motion in thin tubes

Remark 3.1. Several times we need results about reflecting Brownian motion in a Lipschitz domain. A reference for these is Bass and Hsu [5]. That paper concerns domains in $\mathbb{R}^d$ with $d \ge 3$. However, given a domain $D$ in $\mathbb{R}^2$ or a harmonic function $h(x_1, x_2)$ on $D$, we can easily derive the needed results from those in [5] by considering $\tilde D = D \times \mathbb{R}$, $\tilde h(x_1, x_2, x_3) = h(x_1, x_2)$, and $\tilde X_t = (X_t, W_t)$, where $W_t$ is a 1-dimensional Brownian motion independent of a 2-dimensional Brownian motion $X_t$. Since the domains considered in this section have polygonal boundaries, we do not need the full strength of the results of [5]. We mostly only sketch the proofs, as elementary arguments are sufficient to support our claims.

For $\varepsilon \in (0, 1/2)$ and $\theta \in [0, \pi/3)$, define $V(\varepsilon, \theta)$ to be the domain bounded below by the curve $y = (\tan\theta)|x|$, above by the curve $y = (\tan\theta)|x| + \varepsilon/\cos\theta$, to the right by the line $y = -(\cot\theta)(x - 1) + \tan\theta$, and to the left by the line $y = (\cot\theta)(x + 1) + \tan\theta$. We use $\Gamma$ to denote the last two pieces of the boundary and refer to them as the sides of $V(\varepsilon, \theta)$. Let $X_t(\varepsilon, \theta)$ denote Brownian motion in $V(\varepsilon, \theta)$ with absorption on the sides of $V(\varepsilon, \theta)$ and normal reflection on the remainder of $\partial V(\varepsilon, \theta)$. When it is clear, we write $X$ instead of $X(\varepsilon, \theta)$ and $V$ instead of $V(\varepsilon, \theta)$. We begin by showing tightness for the family $\{X_t(\varepsilon, \theta)\}_{\varepsilon\in(0,1/2),\ \theta\in[0,\pi/3)}$.
Lemma 3.2. Let $\varepsilon_n \to 0$, and let $\theta_n$ be a sequence in $[0, \pi/3]$. Let $\eta > 0$. There exists $t$ independent of $n$ such that
\[
P^x\Bigl(\sup_{s\le t} \bigl|X_s(\varepsilon_n, \theta_n) - x\bigr| > \eta\Bigr) < \eta, \qquad x \in V(\varepsilon_n, \theta_n),\ n \ge 1.
\]

Proof. Clearly we only need to consider $n$ large enough so that $\varepsilon_n < \eta/8$. Let $H_t^1$ be the component of $X_t(\varepsilon_n, \theta_n)$ in the direction of the vector $(-1, \tan\theta)$ killed when the $x$ component is greater than $-\eta/4$, and let $H_t^2$ be the component of $X_t(\varepsilon_n, \theta_n)$ in the direction $(1, \tan\theta)$ killed when the $x$ component of $X_t(\varepsilon_n, \theta_n)$ is less than $\eta/4$. By a rotation, it is clear that both $H_t^1$ and $H_t^2$ are Brownian motions killed on hitting certain endpoints. No matter what $x$ is, for $X_t(\varepsilon_n, \theta_n)$ to move more than $\eta$ in time $t$, either $H_t^1$ or $H_t^2$ must move at least $\eta/4$ in time $t$. We can make that probability as small as we like by taking $t$ small enough.

Let $B(x, r)$ denote the ball of radius $r$ centered at $x$. Let
\[
\sigma^{X(\varepsilon,\theta)}(x, r) = \inf\{t : X_t(\varepsilon, \theta) \in \partial B(x, r)\} \wedge \inf\{t : X_t \text{ hits } \Gamma\};
\]
we write $\sigma(x, r)$ when the process is understood.

Lemma 3.3. There exists $c_1$ not depending on $\varepsilon$ or $\theta$ such that $E^x \sigma(x, r) \le c_1 r^2$ for all $x \in V(\varepsilon, \theta)$.

Proof. Let $X_t$ be represented as $(X_t^1, X_t^2)$ in the usual coordinate system. Then the process $\tilde X_t = (|X_t^1|, X_t^2)$ is a reflected Brownian motion in the intersection $\tilde V(\varepsilon, \theta)$ of $V(\varepsilon, \theta)$ and the right half-plane, with killing on $\Gamma$. Elementary estimates for Brownian motion, using Brownian scaling, show that for any starting point $x \in \tilde V(\varepsilon, \theta)$, the probability that $\tilde X_t$ does not hit $\partial B(x, r) \cup \Gamma$ within the first $r^2$ units of time is less than some $p < 1$, independent of $\varepsilon$ and $\theta$. By the Markov property, the probability of not hitting $\partial B(x, r) \cup \Gamma$ within the first $k r^2$ units of time is less than $p^k$. This easily implies the lemma, since then $E^x \sigma(x, r) \le r^2 \sum_{k \ge 0} p^k = r^2/(1 - p)$. We note that when $r \le 6\varepsilon$, the result follows from [5] and scaling; see Remark 3.1.

Lemma 3.4. There exists $c_1$ such that if $\delta \in (0, 1/2)$, then
\[
E^0 \int_0^\infty 1_{B(0,\delta)}(X_s)\, ds \le c_1 \delta,
\]
where $c_1$ is independent of $\varepsilon$ and $\theta$.
Proof. From Lemma 3.3, starting in $B(0, \delta)$, the expected time before the process hits $\partial B(0, 6\delta)$ is at most $c_2 \delta^2$. By standard Brownian estimates, starting at $y \in \partial B(0, 6\delta)$ there is probability at least $c_3 \delta$ that the process will exit $V$ through the sides before returning to $B(0, \delta)$. By the results in [5],
\[
C_1 = \sup_{y \in B(0,\delta)} E^y \int_0^\infty 1_{B(0,\delta)}(X_s)\, ds < \infty.
\]
If
\[
C_2 = \sup_{y \notin B(0,6\delta)} E^y \int_0^\infty 1_{B(0,\delta)}(X_s)\, ds,
\]
then we have $C_1 \le c_2 \delta^2 + C_2$. Let $\tau$ be the lifetime of the process, that is, the time when it hits $\Gamma$. For any $y \notin B(0, 6\delta)$,
\[
E^y \int_0^\infty 1_{B(0,\delta)}(X_s)\, ds
= E^y\Bigl[ E^{X(\sigma(0,\delta))} \int_0^\infty 1_{B(0,\delta)}(X_s)\, ds;\ \sigma(0, \delta) < \tau \Bigr]
\le C_1 P^y\bigl(\sigma(0, \delta) < \tau\bigr) \le C_1 (1 - c_3 \delta),
\]
so that $C_2 \le C_1(1 - c_3\delta)$. We obtain
\[
C_1 \le c_2 \delta^2 + C_1 (1 - c_3 \delta),
\]
which, since $C_1 < \infty$, rearranges to $c_3 \delta C_1 \le c_2 \delta^2$ and yields $C_1 \le c_2 \delta/c_3$. This completes the proof.

Next we want to show the Hölder continuity of harmonic functions in $V$. Saying a function is harmonic in a subdomain of $V$ means that it is harmonic in the interior of the subdomain and that it has normal derivative zero on the portion of $\partial V$ at which $X_t$ is reflecting.

Lemma 3.5. Suppose for some $z$ and $r$ the function $h$ is harmonic in $B(z, 2r) \cap V(\varepsilon, \theta)$. Assume that $B(z, 2r) \cap \Gamma = \emptyset$. There exist $c_1$ and $\alpha$ not depending on $r$, $\varepsilon$, or $\theta$ such that
\[
|h(x) - h(y)| \le c_1 \Bigl(\frac{|x - y|}{r}\Bigr)^{\alpha} \sup_{B(z,2r)\cap V} |h|, \qquad x, y \in B(z, r) \cap V.
\]

Proof. Let $\operatorname{Osc}_A h = \sup_A h - \inf_A h$. Suppose $h$ is harmonic in $B(z, 2r) \cap V$. We will show that there exists $\rho < 1$ independent of $z$, $r$, $\varepsilon$, and $\theta$ such that
\[
\operatorname*{Osc}_{B(z,r)\cap V} h \le \rho \operatorname*{Osc}_{B(z,2r)\cap V} h, \tag{3.1}
\]
and the result follows easily from this, with $\alpha = -\log\rho/\log 2$. By looking at $Ah + B$ for suitable constants $A$ and $B$, it suffices to prove (3.1) when $\sup_{B(z,2r)\cap V} h = 1$ and $\inf_{B(z,2r)\cap V} h = 0$. Note that $V \cap \partial B(z, r)$ consists of either one, two, or three arcs. At least one of these arcs, say, $\gamma$, will have length greater than $(\varepsilon/2) \wedge (\pi r/2)$. An arc of this size is quite a large target for Brownian motion, and standard arguments can be used to show that
\[
P^y\bigl(X_t \text{ hits } \gamma \text{ before exiting } B(z, 2r)\bigr) \ge c_3, \qquad y \in B(z, r). \tag{3.2}
\]
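The step from the oscillation bound (3.1) to the Hölder estimate of Lemma 3.5 is standard. The following is a sketch of the iteration, assuming that (3.1) may be applied to balls centered at a point $x \in B(z, r) \cap V$ at every scale $2^{-k} r$; the bookkeeping of constants is ours and is not taken from the paper.

```latex
% Assumptions (ours): x, y \in B(z,r) \cap V, and (3.1) is applied to balls centered at x.
% Since B(x,r) \subset B(z,2r), h is harmonic in B(x,r) \cap V and these balls do not meet \Gamma.
\begin{align*}
\operatorname*{Osc}_{B(x,2^{-k}r)\cap V} h
  &\le \rho^{k} \operatorname*{Osc}_{B(x,r)\cap V} h
   \le 2\rho^{k} \sup_{B(z,2r)\cap V} |h|, \qquad k \ge 0, \\
\rho^{k} &= \bigl(2^{-k}\bigr)^{\alpha}, \qquad \alpha = -\log\rho/\log 2 .
\end{align*}
% If |x-y| < r, pick k \ge 0 with 2^{-(k+1)} r \le |x-y| < 2^{-k} r, so that 2^{-k} \le 2|x-y|/r:
\[
|h(x) - h(y)| \le 2\rho^{k} \sup_{B(z,2r)\cap V} |h|
  \le 2^{1+\alpha} \Bigl(\frac{|x-y|}{r}\Bigr)^{\alpha} \sup_{B(z,2r)\cap V} |h| .
\]
% If |x-y| \ge r the bound is trivial, since |h(x) - h(y)| \le 2 \sup |h|.
```

Under these assumptions one may take $c_1 = 2^{1+\alpha}$ in Lemma 3.5, and $\alpha$ depends only on $\rho$.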
Pick a point $z_0 \in \gamma$, and, by looking at $1 - h$ if necessary, we may suppose $h(z_0) \ge 1/2$. By the Harnack inequality for functions harmonic with respect to reflecting Brownian motion (see [5, Theorem 3.9]), $h \ge c_2$ on $\gamma$. Since $h$ is harmonic, $h(y) \ge c_3 (\inf_\gamma h) \ge c_2 c_3$ for all $y \in B(z, r)$. We then have
\[
\operatorname*{Osc}_{B(z,r)\cap V} h = \sup_{B(z,r)\cap V} h - \inf_{B(z,r)\cap V} h \le 1 - c_2 c_3 = (1 - c_2 c_3) \operatorname*{Osc}_{B(z,2r)\cap V} h,
\]
which is (3.1) with $\rho = 1 - c_2 c_3$.

Recall here that the Hausdorff distance between sets $A$ and $B$ is defined to be $\inf\{s : A \subset B^s,\ B \subset A^s\}$, where $B^s = \{x : \operatorname{dist}(x, B) < s\}$.

Theorem 3.6. Suppose $\varepsilon_n$ is a sequence tending to 0, $\theta_n \in [0, \pi/3]$. There exists a subsequence $n_k$ with the following properties.
(a) $V(\varepsilon_{n_k}, \theta_{n_k})$ converges in the Hausdorff metric to a set $V$ which is symmetric about the $y$-axis and which in the first quadrant is a line segment starting from the origin.
(b) There exists a strong Markov process $(P^z, X_t)$ on $V$ such that whenever $x_n \in V(\varepsilon_n, \theta_n) \to x \in V$, then the $P^{x_{n_k}}$-law of $X(\varepsilon_{n_k}, \theta_{n_k})$ converges weakly to the $P^x$-law of $X_t$.
(c) Let $\Lambda : V \to \mathbb{R}$ be defined by $\Lambda(x_1, x_2) = \operatorname{sgn}(x_1) \operatorname{dist}((x_1, x_2), (0, 0))$. Then $\Lambda(X_t)$ has the law of a 1-dimensional Brownian motion killed on hitting $\sec\theta$ or $-\sec\theta$.

Proof. If we start by taking a subsequence such that $\theta_{n_j}$ converges, say, to $\theta_0$, then (a) is routine. To prove (b), we follow [4, Section 6, Propositions 6.3–6.8]. Here we give a sketch. Tightness has been shown in Lemma 3.2. With the estimates of Lemmas 3.3 and 3.5, we then can prove equicontinuity of potentials as follows (cf. [4, Section 5]). Let us write $V_n$ for $V(\varepsilon_n, \theta_n)$, $X_t^n$ for $X_t(\varepsilon_n, \theta_n)$, and let $\tau_n$ be the first time $X_t^n$ exits $V_n$ through the sides $\Gamma$. By Lemma 3.3, $c_1 = \sup_{x \in V_n} E^x \tau_n < \infty$. If $g$ is a bounded function, let $R_n g(x) = E^x \int_0^{\tau_n} g(X_s^n)\, ds$. Note that $|R_n g(x)| \le c_1 \|g\|_\infty$. If $z$ is fixed, then for $x \in B(z, r)$,
\[
R_n g(x) = E^x \int_0^{\sigma(z,2r)} g(X_s^n)\, ds + E^x R_n g\bigl(X^n(\sigma(z, 2r))\bigr).
\]
A similar formula holds for $R_n g(y)$ for every $y \in B(z, r)$. The first term on the right-hand side is less than $c_2 (2r)^2 (\sup_{B(z,2r)} |g|)$ by Lemma 3.3. The second term is harmonic in $B(z, 2r)$, and so by Lemma 3.5
\[
\bigl| E^x R_n g\bigl(X^n(\sigma(z, 2r))\bigr) - E^y R_n g\bigl(X^n(\sigma(z, 2r))\bigr) \bigr| \le c_3 |x - y|^{\alpha} r^{-\alpha} \|R_n g\|_\infty, \qquad x, y \in B(z, r).
\]
Therefore, we have
\[
|R_n g(x) - R_n g(y)| \le c_4 \bigl(r^2 + |x - y|^{\alpha} r^{-\alpha}\bigr) \|g\|_\infty, \qquad x, y \in B(z, r).
\]
Letting $r = c_5 \sqrt{|x - y|}$, where $c_5$ is chosen so that $r > 2|x - y|$, we see that $R_n g$ is Hölder continuous with a modulus of continuity that depends only on the $L^\infty$ bound on $g$. Using the resolvent equation
\[
R_n^\lambda g = \sum_{i=0}^{\infty} (-\lambda)^i R_n^{i+1} g,
\]
where $R_n^\lambda g(x) = E^x \int_0^{\tau_n} e^{-\lambda s} g(X_s^n)\, ds$, we see that as long as $\lambda < c_1^{-1}$, then $R_n^\lambda g$ is Hölder continuous with a modulus depending only on $\|g\|_\infty$.

Now take a countable dense subset $\{g_i\}$ of the continuous functions on $B(0, 100)$, and take a countable dense subset $\{\lambda_m\}$ of $(0, c_1^{-1})$. For every fixed $i$ and $m$, in view of the uniform Hölder continuity of the functions in the sequence $\{R_n^{\lambda_m} g_i\}_n$, we can find a subsequence $n_j$ such that the sequence $\{R_{n_j}^{\lambda_m} g_i\}_j$ converges. By a diagonalization process, passing to a subsequence if necessary, we obtain a single sequence $n_j$ such that $R_{n_j}^{\lambda_m} g_i$ converges for each $i$ and $m$. By approximating functions $g$ with $g_i$ and reals $\lambda$ with $\lambda_m$, we can show that $R_{n_j}^{\lambda} g$ converges for each continuous $g$ and each $\lambda \in (0, c_1^{-1})$. If we call the limit $R^\lambda g$, this is enough to show that there is a limit that is a strong Markov process; see [4] for details. The fact that the laws of $X^n$ starting at $x$ are tight for each $x$ is sufficient to prove that the limit process is continuous; again see [4] for details.

It remains to prove (c). For the process $X_t$ when it is away from the origin, it is not hard to see that the limit is a Brownian motion; we thus have a diffusion on $V$, and under the map $\Lambda$ we have a diffusion on $[-\sec\theta, \sec\theta]$. It is obvious that starting at the origin, the law of $\Lambda(X_t)$ is the same as the law of $-\Lambda(X_t)$, so $\Lambda(X_t)$ is on natural scale. By Lemma 3.4, letting $\varepsilon \to 0$, we obtain for every $\delta \in (0, 1/2)$,
\[
E^0 \int_0^\infty 1_{[-\delta,\delta]}\bigl(\Lambda(X_s)\bigr)\, ds \le c_2 \delta.
\]
Next we let $\delta \to 0$ to see that
\[
E^0 \int_0^\infty 1_{\{0\}}\bigl(\Lambda(X_s)\bigr)\, ds = 0. \tag{3.3}
\]
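The passage to (3.3) is a monotonicity argument. As a brief sketch, writing $\Lambda$ for the map of Theorem 3.6(c) (the symbol is our reading of the notation) and $c_2$ for the constant in the preceding bound:

```latex
% \{0\} \subset [-\delta,\delta] for every \delta > 0, so the occupation time of \{0\}
% is dominated by that of [-\delta,\delta].
\[
0 \le E^0 \int_0^\infty 1_{\{0\}}\bigl(\Lambda(X_s)\bigr)\, ds
  \le E^0 \int_0^\infty 1_{[-\delta,\delta]}\bigl(\Lambda(X_s)\bigr)\, ds
  \le c_2\, \delta \xrightarrow[\delta \to 0]{} 0 .
\]
```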
This shows that $\Lambda(X_t)$ does not spend positive time at the origin; hence, the speed measure for $\Lambda(X_t)$ is the same as that of Brownian motion.

Remark 3.7. If we translate the family $V(\varepsilon, \theta)$ by an arbitrary vector and rotate it by an arbitrary angle, Theorem 3.6 still holds for the new family, by the invariance of Brownian motion under such transformations.
Figure 3.1. The set I(v, n). The figure is not to scale.
We now construct a family of 1-dimensional Brownian motions that connect a horizontal line segment with a vertical line segment by fibers of the same length. This is not hard to do, but we also want our Brownian motions to be the limit of reflecting Brownian motion in a planar domain. For $v \in [0, 1]$ define $u(v)$ to be larger than $v$ and to satisfy
\[
21 + 4(1 - v) + 2\sqrt{(u(v) - v)^2 + (99 + v)^2} = 223.
\]
Define the curve $\gamma_v$ as follows. The curve begins at $P_0(v) = (v, 0)$, rises vertically to $P_1(v) = (v, 21 - v)$, goes horizontally to $P_2(v) = (3 - v, 21 - v)$, goes down diagonally to $P_3(v) = (102, 21 - u(v))$, goes up diagonally to $P_4(v) = (201 + v, 21 - v)$, and then goes horizontally to $P_5(v) = (203, 21 - v)$. It is easy to see that each curve $\gamma_v$ has length 223 and that the curves are piecewise linear and are pairwise disjoint.

For $n > 1$, define $L_1(v, n)$ to be the line that is parallel to the line segment that goes from $P_2$ to $P_3$, lies above that line segment, and is a distance $2^{-n}$ from that line segment. Similarly, let $L_2(v, n)$ be the line that is parallel to and a distance $2^{-n}$ above the line segment that goes from $P_3$ to $P_4$. Let $L_3(v, n)$ be the line $\{y = 21 - v + 2^{-n}\}$. Let $R_2(v, n)$ be the point of intersection of $L_1(v, n)$ and the line $L_3(v, n)$; let $R_3(v, n)$ be the point of intersection of $L_1(v, n)$ and $L_2(v, n)$; and let $R_4(v, n)$ be the point of intersection of $L_2(v, n)$ and the line $L_3(v, n)$. Let $\psi_v$ be the curve that starts at $R_0(v, n) = (v - 2^{-n}, 0)$, rises vertically to $R_1(v, n) = (v - 2^{-n}, 21 - v + 2^{-n})$, goes horizontally to $R_2(v, n)$, goes diagonally to $R_3(v, n)$, goes diagonally to $R_4(v, n)$, and then goes horizontally to $R_5(v, n) = (203, 21 - v + 2^{-n})$. Define $I(v, n)$ to be the domain bounded by $\gamma_v$, $\psi_v$, the line $y = 0$, and the line $x = 203$. Let
\[
D_n = \bigcup_{i=1}^{2^n} I\Bigl(\frac{i}{2^n}, n\Bigr).
\]
Some elementary geometry shows that $I(i/2^n, n)$ and $I(j/2^n, n)$ are disjoint if $i \ne j$. Define $X_t(n)$ to be Brownian motion in $D_n$ with absorption on the portion of the boundary that is a subset of $y = 0$ and $x = 203$, and with normal reflection everywhere else that the normal derivative is defined. Set $D = \bigcup_{x \in [0,1]} \gamma_x$.
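The relation defining $u(v)$ can be solved in closed form. The computation below is a sketch under the assumption that the relation reads $21 + 4(1 - v) + 2\sqrt{(u(v) - v)^2 + (99 + v)^2} = 223$; this reading of the displayed equation is ours, and the closed form is not stated in the paper.

```latex
% Assumed defining relation: 21 + 4(1-v) + 2\sqrt{(u(v)-v)^2 + (99+v)^2} = 223, with u(v) \ge v.
\begin{align*}
2\sqrt{(u(v) - v)^2 + (99 + v)^2} &= 223 - 21 - 4(1 - v) = 198 + 4v, \\
(u(v) - v)^2 &= (99 + 2v)^2 - (99 + v)^2 = v\,(198 + 3v), \\
u(v) &= v + \sqrt{v\,(198 + 3v)} .
\end{align*}
```

Under this reading, $u(0) = 0$ and $u(1) = 1 + \sqrt{201} \approx 15.18 < 21$, and $u$ is increasing on $[0, 1]$, so the second coordinate $21 - u(v)$ of $P_3(v)$ remains positive for every $v \in [0, 1]$.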
Theorem 3.8. There exists a subsequence $n_k$ with the following properties.
(a) There exists a strong Markov process $(P^z, X_t)$ on $D$ such that whenever $x_{n_k} \to x \in D$, then the $P^{x_{n_k}}$-law of $X(n_k)$ converges weakly to the $P^x$-law of $X$.
(b) Starting at $z \in \gamma_x$, the $P^z$-law of $X_t$ is that of a Brownian motion on the curve $\gamma_x$.

Proof. There were four ingredients to the proof of Theorem 3.6: (i) a tightness estimate; (ii) an estimate on $E^y \sigma(x, r)$; (iii) the Hölder continuity of harmonic functions; and (iv) the identification of the limit. Note that each of these four ingredients is a local property. Since our domain $D_n$ is the union of the sets $I(i/2^n, n)$, none of which is accessible from any of the other $I(j/2^n, n)$ by a Brownian motion, and each $I(i/2^n, n)$ is the union of sets that are rotations of the $V(\varepsilon, \theta)$'s considered above, we apply the methods of Theorem 3.6 together with Remark 3.7 to obtain our proof.

Remark 3.9. We have connected a horizontal line segment with a vertical one. The same technique with only minor modifications allows us to connect a horizontal line segment with one of any angle whatsoever.

4. Fiber Brownian motion in a rectangle. Let $D = [-1, 1] \times [0, 1]$. In this section we want to construct a process that behaves like planar Brownian motion in the right half of $D$ and moves only horizontally in the left half of $D$. Martin Barlow and Ron Pyke pointed out to us that one way of doing this is to take 2-dimensional Brownian motion in $D$ and time-change it by the inverse of the additive functional that increases according to the amount of time spent in the right half of $D$. We take the limit approach here because of the application we have in mind.

Define $D_n$ to be $[-1, 1] \times [0, 1]$ with the line segments $[-1, 0] \times \{i/2^n\}$ removed for $i = 0, 1, \ldots, 2^n$. Let $X_t(n)$ be reflecting Brownian motion in $D_n$ with absorption on the lines $x = -1$ and $x = 1$. First we show tightness for the family of processes $\{X_t(n)\}_{n \ge 1}$.

Lemma 4.1. Given $\eta > 0$, there exists $t$ such that $P^x(\sup_{s \le t} |X_s(n) - x| > \eta) < \eta$ for all $x \in D_n$ and all $n$.

Proof. Since the horizontal component of $X_t(n)$ is clearly a 1-dimensional Brownian motion, we can focus on the vertical component. Let $L_i = [0, 1] \times \{i/2^n\}$. Let $T_1 = \inf\{t : X_t(n) \in L_i \text{ for some } i\}$, and define $T_{i+1}$ to be the first time $X_t(n)$ is in some $L_j$ other than the one $X_{T_i}(n)$ is in. By symmetry, $X_{T_i}(n)$ is a symmetric random walk. Let $N$ be the number of steps it takes this random walk to move vertically either plus or minus $\eta$. Note that $T_{i+1} - T_i$ is stochastically larger than the time it takes a standard 1-dimensional Brownian motion to move $1/2^n$. Also, the $T_i$ are independent of $N$. So the time for $X_t(n)$ to move vertically $\pm\eta$ is stochastically larger than the time for a standard Brownian motion to move $\pm\eta$. Tightness is an immediate consequence.

The estimate $E^y \sigma(x, r) \le c_1 r^2$ is easy because the horizontal component is a 1-dimensional Brownian motion, and we have the estimate for such a process.
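The one-dimensional estimate invoked here can be made explicit. The following sketch uses optional stopping for the martingale $B_t^2 - t$; the symbols $B$ and $\tau_r$ are our notation, the starting point $y$ is assumed to lie in $B(x, r)$, and the absorption at $x = \pm 1$ is ignored because it can only shorten $\sigma(x, r)$.

```latex
% B: standard 1-dimensional Brownian motion started at 0 (the horizontal displacement of the process).
% \tau_r: first time |B_t| = r; it is finite almost surely.
\begin{align*}
E\bigl[B_{t\wedge\tau_r}^2\bigr] &= E[t\wedge\tau_r]
  && \text{(optional stopping for } B_t^2 - t\text{)}, \\
E[\tau_r] &= \lim_{t\to\infty} E[t\wedge\tau_r]
  = \lim_{t\to\infty} E\bigl[B_{t\wedge\tau_r}^2\bigr] = r^2
  && \text{(monotone and bounded convergence)}, \\
E^y \sigma(x, r) &\le E[\tau_{2r}] = 4r^2
  && \text{(a horizontal displacement of } 2r \text{ from } y \in B(x,r) \text{ leaves } B(x,r)\text{)} .
\end{align*}
```

Under these assumptions the constant $c_1 = 4$ suffices, uniformly in $n$.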
Let $Q((x_1, x_2), r) = [x_1 - r, x_1 + r] \times [x_2 - r, x_2 + r]$. For $x, y \in D$, define $d_n(x, y) = \inf\{\operatorname{length}(\gamma)\}$, where the infimum is over all curves $\gamma$ connecting $x$ and $y$ and contained in the interior of $D_n$ (except, possibly, for $x$ and $y$).

Lemma 4.2. Suppose $h$ is harmonic in $Q(z, 2r) \cap D_n$. There exist $c_1$ and $\alpha$ not depending on $z$, $r$, or $n$ such that for every connected component $\widetilde{Q}$ of $Q(z, r) \cap D_n$,
\[
|h(x) - h(y)| \le c_1 \Bigl(\frac{d_n(x, y)}{r}\Bigr)^{\alpha} \sup_{Q(z,2r)\cap D_n} |h|, \qquad x, y \in \widetilde{Q}.
\]
Proof. The proof uses coupling. We review the basic ideas and techniques of couplings at the beginning of Section 5, where they play a crucial role. The applications of couplings to estimate harmonic functions in the current proof are quite standard, so we outline only the main steps of the argument.

Fix some connected component $\widetilde{Q}$ of $Q(z, r) \cap D_n$, and suppose $x, y \in \widetilde{Q}$. We start two reflected Brownian motions $W^x$ and $W^y$ in $D_n$, with $W^x$ started at $x$ and $W^y$ started at $y$. We define them on a common probability space so that they meet at a certain time $T$ (coupling time). We argue that the coupling time is small with large probability, for a suitably chosen pair $W^x$ and $W^y$.

If $Q(z, 3r/2)$ is disjoint from the vertical axis, we can rather easily couple $W^x$ and $W^y$ before they leave $Q(z, 3r/2)$. This is because when $Q(z, 3r/2)$ lies to the right of the $y$-axis, the processes $W^x$ and $W^y$ behave like standard Brownian motions until they leave the set. If $Q(z, 3r/2)$ lies to the left of the $y$-axis, a mirror coupling of reflected Brownian motions in a strip gives us the desired result. In both cases, the processes can be coupled within $Q(z, 3r/2)$ with probability greater than $p_1 > 0$, independent of $n$, $z$, or $r$.

Now consider the case when $Q(z, 3r/2)$ intersects the vertical axis. Let $B_1$ and $B_2$ be the balls with radii $r/8$ and $r/16$ and common center located $7r/4$ units to the right of $z$. With probability $p_2 > 0$, each of the processes $W^x$ and $W^y$ will hit $B_2$ within the first $r^2$ units of time, before leaving $Q(z, 2r)$, and then stay inside $B_1$ for the next $2r^2$ units of time. Hence, assuming independent evolution of the processes at the initial stage, we see that with probability $p_2^2$ they will be in $B_1$ at the same time, say, $S$, before exiting $Q(z, 2r)$. After time $S$ we coordinate their motions so that they meet at some time $T$, before leaving $Q(z, 2r)$, with probability $p_3 > 0$. In all cases, we can couple the processes with probability $p_4 > 0$, at some time $T$ less than the exit time $\tau_{Q(z,2r)}$ from $Q(z, 2r)$.

Now suppose $h$ is harmonic in $Q(z, 2r)$. Let $P^{(x,y)}$ be the joint law of $(W^x, W^y)$. By looking at $Ah + B$ for suitable $A$ and $B$, we may suppose $\sup_{Q(z,2r)} h = 1$ and $\inf_{Q(z,2r)} h = 0$. Let $\tau_{Q(z,2r)}$ be the first time the process exits $Q(z, 2r)$. We have
\[
h(x) = E^{(x,y)} h\bigl(W^x_{T\wedge\tau_{Q(z,2r)}}\bigr)
= E^{(x,y)}\bigl[h(W^x_T);\ T < \tau_{Q(z,2r)}\bigr] + E^{(x,y)}\bigl[h\bigl(W^x_{\tau_{Q(z,2r)}}\bigr);\ \tau_{Q(z,2r)} < T\bigr].
\]
We have a similar expression for $h(y)$. Note that on $(T < \tau_{Q(z,2r)})$, we have $W^x_T = W^y_T$. So, taking the difference and using the fact that $h \ge 0$,
\[
h(x) - h(y) \le \Bigl(\sup_{Q(z,2r)} h\Bigr) P^{(x,y)}\bigl(T > \tau_{Q(z,2r)}\bigr) \le 1 - c_2.
\]
Similarly, $h(y) - h(x) \le 1 - c_2$. Therefore the oscillation of $h$ in $Q(z, r)$ is less than $1 - c_2$. Just as in Lemma 3.5, this oscillation estimate easily implies the result.

Let $\tau_n^* = \inf\{t : X_t(n) \in \{-1, 1\} \times [0, 1]\}$, and let
\[
R_n^\lambda g(x) = E^x \int_0^{\tau_n^*} e^{-\lambda s} g(X_s(n))\, ds.
\]
We often write $R_n g$ for $R_n^0 g$.
Lemma 4.3. There exist $c_1$ and $\alpha$ not depending on $n$ such that
\[
|R_n g(x) - R_n g(y)| \le c_1\, d_n(x, y)^{\alpha}\, \|g\|_\infty, \qquad x, y \in D_n.
\]
Proof. This follows from Lemma 4.2 and the estimate $E^y \sigma(x, r) \le c_2 r^2$, just as in the proof of Theorem 3.6.
For fixed $x, y \in D$, the function $n \mapsto d_n(x, y)$ is nondecreasing. Let $d_\infty(x, y) = \lim_{n\to\infty} d_n(x, y)$. Since $d_n(x, y) \le d_\infty(x, y)$ for all $x$, $y$, and $n$, Lemmas 4.1–4.3 hold if we replace the Euclidean metric and $d_n$-metric with $d_\infty$-metric in their statements. To use the method of Barlow and Bass [4] to construct a strong Markov process on $D$, it is necessary to use the Ascoli-Arzelà theorem. However, $D$ with the metric $d_\infty$ is not separable. We thus need the following lemma.
Lemma 4.4. Suppose $g$ is continuous on $[-1, 1] \times [0, 1]$ with respect to the Euclidean metric. Then $R_n g$ is continuous on $D_n$ with respect to the same metric. The modulus of continuity of $R_n g$ depends on $g$, but it does not depend on $n$.
Proof. Consider arbitrary $x$ and $v$ in $D_n$. Let $y$ be a point with the same first coordinate as $x$ and the same second coordinate as $v$. Lemma 4.3 provides an effective estimate for $|R_n g(v) - R_n g(y)|$ because $d_n(v, y)$ is the same as the Euclidean distance between $v$ and $y$. Our goal is to find an estimate for $|R_n g(v) - R_n g(x)|$. Using the triangle inequality and an estimate for $|R_n g(v) - R_n g(y)|$ reduces the problem to estimating $|R_n g(x) - R_n g(y)|$. If the first coordinate of $x$ is positive, Lemma 4.3 again provides the desired estimate. It remains to consider the case when $x, y \in D_n \cap ([-1, 0) \times [0, 1])$ and $y$ lies directly above $x$. We use a coupling, as in the proof of Lemma 4.2. Let $W^x$ and $W^y$ be Brownian motions in $D_n$ started at $x$ and $y$, respectively, with absorption on the left and right sides of $D_n$ and reflection on the remainder of the boundary of $D_n$. We construct $W^x$ and $W^y$ so that their horizontal components are equal and their vertical components
are independent. Let $T = \inf\{t : W^x_t \in \{-1, 0\} \times [0, 1]\}$. We have
\[ R_n g(x) = E^{(x,y)} \int_0^T g(W^x_s)\, ds + E^{(x,y)} R_n g(W^x_T), \tag{4.1} \]
and similarly for $y$. Note that $R_n g(w) = 0$ for $w \in \{-1\} \times [0, 1]$. Hence, if $W^x_T \in \{-1\} \times [0, 1]$, then $R_n g(W^x_T) = R_n g(W^y_T) = 0$ and so $R_n g(W^x_T) - R_n g(W^y_T) = 0$. In the case when $W^x_T \in \{0\} \times [0, 1]$, we estimate the quantity $R_n g(W^x_T) - R_n g(W^y_T)$ using Lemma 4.3. Note that for $t \le T$, $|W^x_t - W^y_t| \le |x - y| + 2 \cdot 2^{-n}$ since the two horizontal components are equal and the width of the tubes is $2^{-n}$. We see that
\[ \bigl|E^{(x,y)} R_n g(W^x_T) - E^{(x,y)} R_n g(W^y_T)\bigr| \le \sup_{|w-z| \le |x-y| + 2^{-n+1}} |R_n g(w) - R_n g(z)| \le c_2 \bigl(|x - y| + 2^{-n+1}\bigr)^\alpha \|g\|_\infty. \tag{4.2} \]
It is easy to see that for some $c_3 < \infty$ independent of $n$, we have $E^{(x,y)} T < c_3$. Then
\[ \Bigl|E^{(x,y)} \int_0^T g(W^x_s)\, ds - E^{(x,y)} \int_0^T g(W^y_s)\, ds\Bigr| \le \sup_{|w-z| \le |x-y| + 2^{-n+1}} |g(w) - g(z)|\; E^{(x,y)} T \le c_3 \sup_{|w-z| \le |x-y| + 2^{-n+1}} |g(w) - g(z)|. \tag{4.3} \]
Using (4.1)–(4.3), we obtain
\[ |R_n g(x) - R_n g(y)| \le c_2 \bigl(|x - y| + 2^{-n+1}\bigr)^\alpha \|g\|_\infty + c_3 \sup_{|w-z| \le |x-y| + 2^{-n+1}} |g(w) - g(z)|. \]
This proves the lemma.
In the next theorem, the topology on $D$ is the Euclidean one.
Theorem 4.5. There exists a subsequence $n_k$ with the following properties.
(a) There exists a strong Markov process $(P^z, X_t)$ on $D$ such that whenever $x_n \to x \in D$, then the $P^{x_{n_k}}$-law of $X_t(n_k)$ converges weakly to the $P^x$-law of $X_t$.
(b) For every $x \in D$, the horizontal component of the process with distribution $P^x$ is a Brownian motion. When the horizontal component is positive, the process behaves like a planar Brownian motion; when the horizontal component is negative, the vertical component does not change. The process is killed when its horizontal component reaches $1$ or $-1$.
Proof. Since the horizontal component of every $P^x_n$-process is a Brownian motion for each $n$, this is true in the limit as well. It is therefore clear that
\[ M = \sup_n \sup_{x \in D_n} E^x \tau_n^* < \infty. \]
By the resolvent equation, if $\lambda < 1/M$,
\[ R_n^\lambda g = R_n^0 g - \lambda R_n^0 R_n^0 g + \lambda^2 R_n^0 R_n^0 R_n^0 g - \cdots. \]
By Lemma 4.4, each term in the sum is continuous with a modulus of continuity depending on $g$ but not on $n$. Moreover, the $i$th summand is bounded in absolute value by $\lambda^i M^{i+1} \|g\|_\infty$, so the series converges uniformly, independently of $n$. Hence, for $\lambda < 1/M$, $R_n^\lambda g$ is continuous with a modulus depending on $g$ but not on $n$. With this observation and Lemma 4.1, the other parts of the theorem are proved as in the proof of Theorem 3.6. Note that the $P^x$-process spends zero time on the vertical axis because its horizontal component is a Brownian motion.
5. Coupling of fiber Brownian motions. We start this section with a review of those properties of the mirror coupling for reflecting Brownian motions that are relevant to our argument. These aspects of the idea were originally developed by Burdzy and Kendall [6] and later applied by Bañuelos and Burdzy [2].
We start with the coupling of two Brownian motions in $\mathbb{R}^2$. Suppose that $x, y \in \mathbb{R}^2$ are symmetric with respect to a line $M$. Let $X_t$ be a Brownian motion starting from $x$, and let $\tau$ be the hitting time of $M$ by $X$. We let $Y_t$ be the mirror image of $X_t$ with respect to $M$ for $t \le \tau$, and we let $Y_t = X_t$ for $t > \tau$. The process $Y_t$ is a Brownian motion starting from $y$. The pair $(X_t, Y_t)$ is a mirror coupling of Brownian motions.
Next we turn to the mirror coupling of reflected Brownian motions in a half-plane Ᏼ, starting from $x, y \in$ Ᏼ. Let $M$ be the line of symmetry for $x$ and $y$. The case when $M$ is parallel to ∂Ᏼ is essentially a 1-dimensional problem, so we focus on the case when $M$ intersects ∂Ᏼ. By performing rotation and translation, if necessary, we may suppose that Ᏼ is the upper half-plane and $M$ passes through the origin. We write $x = (r^x, \theta^x)$ and $y = (r^y, \theta^y)$ in polar coordinates. The points $x$ and $y$ are at the same distance from the origin, so $r^x = r^y$. Suppose without loss of generality that $\theta^x < \theta^y$.
We first generate a 2-dimensional Bessel process $R_t$ starting from $r^x$. Then we generate two coupled 1-dimensional processes on the “half-circle” as follows. Let $\tilde\Theta^x_t$ be a 1-dimensional Brownian motion starting from $\theta^x$. Let $\tilde\Theta^y_t = -\tilde\Theta^x_t + \theta^x + \theta^y$. Let $\Theta^x_t$ be reflected Brownian motion on $[0, \pi]$, constructed from $\tilde\Theta^x_t$ by the means of the Skorokhod equation. Thus $\Theta^x_t$ solves the stochastic differential equation $d\Theta^x_t = d\tilde\Theta^x_t + dL_t$, where $L_t$ is a continuous nondecreasing process that increases only when $\Theta^x_t$ is equal to $0$ or $\pi$, and $\Theta^x_t$ is always in the interval $[0, \pi]$. Thus $\Theta^x_t$ is constructed in such a way that the difference $\Theta^x_t - \tilde\Theta^x_t$ is constant on every interval of time on which $\Theta^x_t$ does not hit zero or $\pi$. The analogous reflected process obtained from $\tilde\Theta^y_t$ is denoted $\widehat\Theta^y_t$. Let $\tau^\Theta$ be the smallest $t$ with $\Theta^x_t = \widehat\Theta^y_t$.
Then we let $\Theta^y_t = \widehat\Theta^y_t$ for $t \le \tau^\Theta$ and $\Theta^y_t = \Theta^x_t$ for $t > \tau^\Theta$. We define a clock by $\sigma(t) = \int_0^t R_s^{-2}\, ds$. Then $X_t = (R_t, \Theta^x_{\sigma(t)})$ and $Y_t = (R_t, \Theta^y_{\sigma(t)})$ are reflected Brownian motions in Ᏼ with normal reflection; we can prove this using the same ideas as in the discussion of the skew-product decomposition for 2-dimensional Brownian motion presented by Itô and McKean [9]. Moreover, $X_t$ and $Y_t$ behave like free Brownian motions coupled by the mirror coupling as long as they are both strictly inside Ᏼ. The processes stay together after the first time they meet. We call $(X_t, Y_t)$ a mirror coupling of reflecting Brownian motions. The two processes $X_t$ and $Y_t$ in the upper half-plane remain at the same distance from the origin.
Suppose now that Ᏼ is an arbitrary half-plane, and suppose that $x$ and $y$ belong to Ᏼ. Let $M$ be the line of symmetry for $x$ and $y$. Then an analogous construction yields a pair of reflecting Brownian motions starting from $x$ and $y$ such that the distance from $X_t$ to $M \cap$ ∂Ᏼ is always the same as for $Y_t$. Let $M_t$ be the line of symmetry for $X_t$ and $Y_t$. Note that $M_t$ may move, but only in a continuous way, while the point $M_t \cap$ ∂Ᏼ will never move. We call $M_t$ the mirror and the point $H = M_t \cap$ ∂Ᏼ the hinge. The absolute value of the angle between the mirror and the normal vector to ∂Ᏼ at $H$ can only decrease.
The next level of generality is to consider a mirror coupling of reflected Brownian motions in a polygonal domain Ᏸ. Suppose that $X_t$ and $Y_t$ start from $x$ and $y$ inside Ᏸ, and let $I$ be the side of Ᏸ which is hit first by one of the particles. Let $K$ be the straight line containing $I$. Since the process that hits $I$ does not “feel” the shape of Ᏸ except for the direction of $I$, it follows that the two processes will remain at the same distance from the hinge $H_t = M_t \cap K$. The mirror $M_t$ can move, but the hinge $H_t$ will remain constant as long as $I$ remains the side of ∂Ᏸ where the reflection takes place.
The hinge $H_t$ will jump when the reflection location moves from $I$ to another side of Ᏸ. The hinge $H_t$ may from time to time lie outside ∂Ᏸ, if Ᏸ is not convex.
We continue by defining a set $D$ that carries a fiber Brownian motion and is amenable to coupling techniques. The first ingredient is a group Ᏻ containing six elements, generated by (a) reflection with respect to the horizontal axis and (b) rotation around $(0, 0)$ by the angle $2\pi/3$. We define a point-to-set mapping ᐀$x = \{\sigma(x), \sigma \in$ Ᏻ$\}$. Typically, ᐀$x$ contains six points. The meaning of ᐀$K$ for a set $K$ is self-evident. Fix some $w \in (0, 1/100)$ whose value will be chosen later. Consider points
\[ A_0 = (0, 0), \quad A_1 = (1, \sqrt{3}), \quad A_2 = (5, w), \quad A_3 = (20, w), \quad A_4 = (20, 0). \]
Let $D_L^1$ be the closed bounded set whose boundary is a polygonal line with consecutive vertices $A_0$, $A_1$, $A_2$, $A_3$, $A_4$, and $A_0$. Figure 5.1 presents a schematic drawing of the set $D_L^1$. Let $D_L =$ ᐀$D_L^1$, and let $D_R$ be the set that is symmetric to $D_L$ with respect to the line ᏸ $= \{(x, y) \in \mathbb{R}^2 : x = 30\}$. By abuse of notation, ᏸ$x$ and ᏸ$Q$ denote the point and set symmetric to $x$ and $Q$, respectively, with respect to ᏸ.
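[Figure 5.1. Schematic drawing of the set $D_L^1$, with the vertices $A_0$, $A_1$, $A_2$, $A_3$, $A_4$, the segment $S_L^1$, and the region $G_L^1$ labeled. The figure is not to scale.]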
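The point-to-set mapping ᐀ generated by the group Ᏻ can be made concrete with a few lines of code. The following is a minimal sketch, assuming points are given as coordinate pairs; the function names `reflect`, `rotate`, and `orbit`, and the rounding tolerance used to merge coinciding images, are illustrative choices and not part of the paper.

```python
import math

def reflect(p):
    """Reflection with respect to the horizontal axis."""
    x, y = p
    return (x, -y)

def rotate(p, k=1):
    """Rotation around (0, 0) by the angle 2*pi*k/3."""
    a = 2.0 * math.pi * k / 3.0
    x, y = p
    return (x * math.cos(a) - y * math.sin(a),
            x * math.sin(a) + y * math.cos(a))

def orbit(p, digits=9):
    """Images of p under the six group elements (rotations and
    reflection-then-rotation), merged after rounding to `digits` places."""
    images = set()
    for k in range(3):
        for q in (rotate(p, k), rotate(reflect(p), k)):
            # Adding 0.0 normalizes a possible -0.0 produced by rounding.
            images.add((round(q[0], digits) + 0.0, round(q[1], digits) + 0.0))
    return images
```

For a generic point the orbit has six elements, while a point on one of the three reflection axes of the group (for example, a point on the horizontal axis, or the vertex $A_1 = (1, \sqrt{3})$) has only three images.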
The set $D$ that we are trying to construct is the union of $D_L$, $D_R$, and three polygonal “tubes” $T_1$, $T_2$, and $T_3$, which we describe in a somewhat informal way. Let $S_L^1$ be the line segment between $A_3$ and $(20, -w)$. This line segment lies on the boundary of $D_L$. Let $S_L^2$ and $S_L^3$ be the images of $S_L^1$ by the rotations by angles $2\pi/3$ and $4\pi/3$. Let $S_R^k =$ ᏸ$S_L^k$, $k = 1, 2, 3$. The tubes $T_k$, $k = 1, 2, 3$, are disjoint closed polygonal sets, symmetric with respect to ᏸ, whose interiors are also disjoint from $D_L$ and $D_R$. The tube $T_k$ shares the piece of the boundary $S_L^k$ with $D_L$ and $S_R^k$ with $D_R$. Let $G_L^1 = \{(x, y) \in D_L : x \ge 14\}$, and let $G_L^2$ and $G_L^3$ be obtained by rotating $G_L^1$ by angles $2\pi/3$ and $4\pi/3$. Let $G_R^k =$ ᏸ$G_L^k$ and $G_k = G_L^k \cup G_R^k \cup T_k$ for $k = 1, 2, 3$, and $F = \bigcup_{k=1,2,3} G_k$.
The results of the previous sections (see especially the construction of $D_n$ preceding Theorem 3.8) show that the $T_k$’s can be chosen so that the following is true. For every $k = 1, 2, 3$, there exists a family of disjoint polygonal curves $\{\gamma_x^k\}_{x \in S_L^k}$ with the following properties:
(i) $\gamma_x^k \cap S_L^k = \{x\}$;
(ii) for every $x$ and $k$, the curve $\gamma_x^k$ is ᏸ-symmetric;
(iii) $\bigcup_{x \in S_L^k} \gamma_x^k = G_k$;
(iv) the set $\gamma_x^k \cap D_L$ is a line segment perpendicular to $S_L^k$;
(v) all curves $\gamma_x^k$, for all $x$ and $k$, have the same length, which we denote by $\alpha$.
Moreover, we can choose the $\gamma_x^k$’s so that the fiber Brownian motion defined in $D = D_L \cup D_R \cup \bigcup_{k=1,2,3} T_k$ is the limit of reflecting Brownian motions in a sequence of open approximating domains. We are more precise about this point later on.
It may be necessary for the $\gamma_x^k$’s, especially for the $\gamma_x^1$’s, to make several turns in order to satisfy condition (v). This is fine as long as we have the approximation property mentioned in the last paragraph. Note that although we can choose the $\gamma_x^2$’s and $\gamma_x^3$’s so that they are symmetric with respect to the horizontal axis, no such symmetry can exist between the $\gamma_x^2$’s and $\gamma_x^1$’s. This fundamental lack of symmetry in the picture is the main reason for using fiber Brownian motion instead of the usual reflected Brownian motion in an open domain.
Let $\tilde D = D_L^1 \cup D_R^1 \cup \bigcup_{x \in D_L^1 \cap S_L^1} \gamma_x^1$.
We define a quantity representing the position of a point inside one of the long tubes in $D$. Every point $y \in F \cap D_R$ belongs to a unique fiber $\gamma_x^k$. Let $\rho(y)$ denote its distance from ᏸ along $\gamma_x^k$. For $x \in D_R \setminus F$, we let $\rho(x)$ be the sum of $\alpha/2$ and the
Euclidean distance from $x$ to $\bigcup_{k=1,2,3} S_R^k$. So far, we have defined $\rho(x)$ for $x \in D_R$. For $x \in D_L$, we let $\rho(x) = -\rho($ᏸ$x)$. For $a > 0$, we let $F_a = \{x \in D : -a - \alpha/2 < \rho(x) < a + \alpha/2\}$, $F_a^c = D \setminus F_a$, and $\tilde F_a = \tilde D \cap F_a$.
We consider a fiber Brownian motion in $\tilde D$, that is, a process that is a reflected Brownian motion in $\tilde D \setminus F$ and moves like a 1-dimensional Brownian motion along the curves $\gamma_x^1$. A construction of such a process can be achieved by combining the ideas presented in Sections 2 and 4. Section 2 explains how to define approximating domains with thin tubes that can bend and such that in the limit we obtain a family of polygonal lines and a 1-dimensional Brownian motion along them. Section 3 shows how to define the transition between 2-dimensional and 1-dimensional behavior for the Brownian motion. Since the transitions are local in nature, the results obtained in that section for a rectangle can be generalized in an obvious way to $\tilde D$.
Next we introduce four Markov transition mechanisms for a pair of Brownian particles $X_t$ and $Y_t$ in $\tilde D$. Later we build a process that from time to time changes its transition mechanism, and we speak about the four “modes” of particle motion.
(A) If $X_t$ and $Y_t$ are in this mode, they behave like a pair of independent fiber Brownian motions.
(B) This mode is defined only if both particles stay in $\tilde D \setminus F$. They behave like reflected Brownian motions in this set, related by mirror coupling. The processes are stopped when one of the particles hits the boundary of $F$.
(C) The third mode of motion is defined as long as both particles stay in $\tilde F_{15}$. The particles move in such a way that $\rho(X_t) - \rho(Y_t)$ remains constant. They are stopped when one of them exits from $\tilde F_{15}$. There are many two-particle processes with this property.
For definiteness, we assume that when a particle is in $D_L \cap (\tilde F_{15} \setminus F)$, its component perpendicular to the long side of this set is independent of the motion of the other particle, and the same is true when a particle is in $D_R \cap (\tilde F_{15} \setminus F)$.
(D) In this mode we have $X_t = Y_t$.
We leave a formal proof of existence of processes satisfying (A)–(D) to the reader. To each of the modes of motion (A)–(D) we associate a stopping time:
$T^A$: the first time $t$ such that $|\rho(X_t) - \rho(Y_t)| \le 6$;
$T^B$: the first time $t$ such that $|\rho(X_t) - \rho(Y_t)| \ge 7$ or $X_t = Y_t$;
$T^C$: the first time $t$ such that one of the processes $X_t$ or $Y_t$ hits $\tilde D \setminus \tilde F_{15}$;
$T^D$: infinity.
Our coupled processes move according to the following rules. We assume that the starting location $(X_0, Y_0) = (x_0, y_0)$ is confined to a subset $Q$ of $\tilde D \times \tilde D$, to be described below. The processes start moving in one of the modes (A)–(C), depending on $(x_0, y_0)$. If the mode is (A), the particles move until the time $T^A$, and then they switch to either (B) or (C), according to the following rule. If both particles are in $\tilde F_{15}$, they switch to mode (C); otherwise, they switch to (B). Whenever the particles are in mode (B), they wait until $T^B$, and then they switch to (A) if $|\rho(X_t) - \rho(Y_t)|$ attains the value 7. Otherwise, if the particles meet at some time, the processes switch
to mode (D). If they are in mode (C), then they wait until $T^C$ and switch at this time to (B). Finally, they never make a transition out of mode (D). In other words, we have the following possible transitions:
(A) → (B), (A) → (C), (B) → (A), (B) → (D), (C) → (B).
We adopt the convention that the angle $\angle M$ formed by a straight line $M$ with the horizontal axis belongs to $[0, \pi)$.
Now we will specify the set $Q \subset \tilde D \times \tilde D$ of all possible starting positions for the pair of processes $(X_t, Y_t)$ and the corresponding starting modes. If $|\rho(x) - \rho(y)| \ge 7$, then $(x, y) \in Q$ and the starting mode is (A). If $|\rho(x) - \rho(y)| = 6$ and $x, y \in \tilde F_{15}$, then we also have $(x, y) \in Q$ and we assume that if $(X_0, Y_0) = (x, y)$, then the processes start to move in mode (C). Let $M^{x,y}$ be the line of symmetry of the points $x$ and $y$ with the convention that the line is vertical if $x = y$. The starting mode is (B) and $(x, y) \in Q$ if $\angle M^{x,y} \in [\pi/3, 5\pi/6]$, $|\rho(x) - \rho(y)| < 7$, and $M^{x,y} \cap \tilde D \subset \tilde F_{10}$, but it is not true that $|\rho(x) - \rho(y)| = 6$ and $x, y \in \tilde F_{15}$. (In that case, the starting mode is (C).) The diagonal of $\tilde D \times \tilde D$ is also a subset of $Q$. If the processes $X_t$ and $Y_t$ start from the same point, they will stay in mode (D) forever.
It is clear that each individual particle $X_t$ and $Y_t$ moves like a fiber Brownian motion on any interval that has only a finite number of transitions between the modes (A)–(D). We show that there is only a finite number of transitions on any finite time interval. Note that in any sequence of mode transitions, mode (B) is visited over and over again, with the only excursions from this mode taking the form (B) → (A) → (B), (B) → (A) → (C) → (B), or the only infinite excursion (B) → (D). It is clear that there exist $t_0, p_0 > 0$ such that for any points $x$ and $y$ satisfying $|\rho(x) - \rho(y)| \ge 7$, and the processes $(X_t, Y_t)$ starting from $(x, y)$, the first time $t$ with the property $|\rho(X_t) - \rho(Y_t)| = 6$ will be greater than $t_0$ with probability greater than $p_0$. Hence, the processes will spend at least $t_0$ units of time in mode (A), with probability greater than $p_0$, each time they enter that mode. A standard argument now shows that there will be only a finite number of transitions (A) → (B) on any finite time interval.
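The switching rules between the modes (A)–(D) amount to a small directed graph, and the claim about excursions from mode (B) can be checked by enumeration. The snippet below is only an illustrative sketch: the names `TRANSITIONS` and `excursions_from_b` are hypothetical, and the graph encodes nothing beyond the transitions listed in the text.

```python
# Allowed mode transitions, as described in the text.
TRANSITIONS = {
    "A": {"B", "C"},  # at T^A: to (C) if both particles are in F~_15, else to (B)
    "B": {"A", "D"},  # at T^B: to (A) if |rho(X) - rho(Y)| reaches 7, to (D) if X = Y
    "C": {"B"},       # at T^C
    "D": set(),       # mode (D) is never left
}

def excursions_from_b(max_len=10):
    """List all paths that leave (B) and either return to (B) or end in (D)."""
    found = []
    stack = [("B", nxt) for nxt in TRANSITIONS["B"]]
    while stack:
        path = stack.pop()
        last = path[-1]
        if last in ("B", "D"):
            # The excursion is complete: back in (B) or absorbed in (D).
            found.append(path)
            continue
        if len(path) < max_len:
            for nxt in sorted(TRANSITIONS[last]):
                stack.append(path + (nxt,))
    return sorted(found)
```

Calling `excursions_from_b()` returns exactly the three excursion types named above: (B) → (A) → (B), (B) → (A) → (C) → (B), and (B) → (D).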
We argue that if the processes Xt and Yt start from a point (X0 , Y0 ) ∈ Q, then they will never leave the set Q. As a part of this argument, we show that if the processes start from a point in Q, they will never hit F while being in mode (B). Such an
event would cause the particles to stop, according to the definition of mode (B). We included such a possibility in the definition of mode (B) so that we would not have to explain what a mirror coupling does when one of the particles enters a fiber $\gamma_x^1$.
Suppose that the particles start from an initial position in $Q$, in mode (A). Hence, they start from $(x, y)$ such that $|\rho(x) - \rho(y)| \ge 7$, and they stay in this mode until time $s$ defined by $T^A$, that is, such that $|\rho(X_s) - \rho(Y_s)| = 6$. Until that time, $|\rho(X_t) - \rho(Y_t)| \ge 6$, by the continuity of paths of $X$ and $Y$. All points $(x, y)$ with $|\rho(x) - \rho(y)| \ge 6$ belong to $Q$, so the process $(X_t, Y_t)$ stays in $Q$ as long as it is in mode (A).
Assume now that the processes start from $x, y \in \tilde F_{15}$ with $|\rho(x) - \rho(y)| = 6$, in mode (C). According to (C) and $T^C$, as long as the processes stay in mode (C), we have $|\rho(X_t) - \rho(Y_t)| = 6$ and $X_t, Y_t \in \tilde F_{15}$. Thus, the processes stay in $Q$ when they are in mode (C).
The case of mode (D) is trivial, so it remains to discuss mode (B). Suppose that the processes start from a position in $Q$ corresponding to (B). Without loss of generality we may assume that the particles are initially in $\tilde F_{10} \cap D_L$. Note that the initial position $M_0$ of the mirror is such that $\rho(z) \le -4 - \alpha/2$ for all $z \in M_0 \cap \tilde D$. As long as none of the particles touches $F$, the angle $\angle M_t$ of the mirror stays within $[\pi/3, 5\pi/6]$ by the results of [2], and it is easy to see that the particles remain in $Q$. We argue that none of the particles can hit $F$ if they are in mode (B). Suppose otherwise. Let $t_0$ be the first time one of the particles hits $F$. Note that $|\rho(X_{t_0}) - \rho(Y_{t_0})| \le 7$ and so $M_{t_0} \cap \tilde D$ is not a subset of $\tilde F_{10}$. Let $t_1$ be the last time before $t_0$ such that $M_{t_1} \cap \tilde D \subset \tilde F_{10}$. Since $|\rho(X_{t_1}) - \rho(Y_{t_1})| \le 7$, for some $t_2 \in (t_1, t_0)$, both particles stay in $D_L^1 \cap \tilde F_{15}$ in the interval $(t_1, t_2)$. However, as long as both processes stay in $(D_L^1 \cap \tilde F_{15}) \setminus F$, the interval $J_t = \{a : \rho(z) = a \text{ for some } z \in M_t \cap \tilde D\}$ can only decrease by the results of [2]. This contradicts the definition of $t_1$ and proves our claim.
Next we introduce a partial order $\prec$ on $D_L \setminus F$. We start by considering points $x, y \in D_L^1$. We say that $x \prec y$ if and only if $\angle M^{x,y} \in [\pi/3, 5\pi/6]$ and $\rho(x) < \rho(y)$. We extend the definition to $v, z \in D_L \setminus F$ by declaring that $v \prec z$ if and only if for some $x, y \in D_L^1$ we have $x \prec y$, $v \in$ ᐀$x$, and $z \in$ ᐀$y$.
Lemma 5.1. Let $u(t, x)$ be the Neumann heat equation solution in $D$ with the initial condition $f(x)$, so that $u(t, x) = E^x f(X_t)$, where the expectation is taken with respect to fiber Brownian motion in $D$. Consider the initial condition $f(x) = 1$ for $x$ to the left of ᏸ and $f(x) = -1$ for all other $x$. The corresponding solution is monotone relative to $\prec$ in the sense that, for all $x, y \in D_L \setminus F$ such that $x \prec y$, we have $u(t, x) \ge u(t, y)$ for all $t \ge 0$.
Proof. Consider any $x, y \in D_L \setminus F$ with $x \prec y$. Find $v, z \in \tilde D$ such that $x \in$ ᐀$v$ and $y \in$ ᐀$z$. Run a process $(X_t, Y_t)$ in $\tilde D \times \tilde D$, starting from $(v, z)$, according to the coupling recipe described in this section. The point of the complicated coupling construction was that $\rho(X_t) \le \rho(Y_t)$ for all $t \ge 0$, with probability 1. Since $f(x_1) \ge f(x_2)$ whenever $\rho(x_1) \le \rho(x_2)$, this implies
that
\[ E^v f(X_t) \ge E^z f(Y_t). \]
With a slight abuse of notation, for any $x_1 \in D_L$, let ᐀$^{-1} x_1$ denote the unique point $x_2$ in $\tilde D$ with the property that $x_1 \in$ ᐀$x_2$. For $x_1 \in D_R$, we let ᐀$^{-1} x_1 =$ ᏸ᐀$^{-1}$ᏸ$x_1$. Finally, we extend the mapping ᐀$^{-1}$ to all points in $D$ in such a way that (i) every continuous path in $D$ is mapped onto a continuous path in $\tilde D$, and (ii) for every $x_1 \in D$ we have $\rho($᐀$^{-1} x_1) = \rho(x_1)$. Note that $f($᐀$^{-1} x_1) = f(x_1)$ for every $x_1 \in D$.
Let $X_t^*$ be a fiber Brownian motion in $D$ with $X_0^* = x$. It is evident that the process ᐀$^{-1} X_t^*$ is a fiber Brownian motion in $\tilde D$, and so it has the same distribution as $X_t$, starting from $X_0 = v =$ ᐀$^{-1} x$. Since $f($᐀$^{-1} X_t^*) = f(X_t^*)$ for every $t$,
\[ E^x f(X_t^*) = E^x f(᐀^{-1} X_t^*) = E^v f(X_t). \]
If $Y_t^*$ denotes a fiber Brownian motion in $D$ starting from $Y_0^* = y$, we obtain in an analogous way
\[ E^y f(Y_t^*) = E^y f(᐀^{-1} Y_t^*) = E^z f(Y_t). \]
Hence,
\[ u(t, x) = E^x f(X_t^*) = E^v f(X_t) \ge E^z f(Y_t) = E^y f(Y_t^*) = u(t, y). \]
6. Eigenvalue and eigenfunction estimates. In the first few lemmas of this section, we write $D'$ to denote a Lipschitz domain such that $D' \cap F^c = F^c$, where $F^c$ denotes $D \setminus F$.
Lemma 6.1. Suppose $D'$ is a Lipschitz domain whose intersection with $F^c$ agrees with $F^c$, and let $p_t(x, y)$ be the transition density for Brownian motion reflected in $D'$. There exists $c_1$ depending on $t_0$ but not on the shape of $D' \cap F$, such that $p_{t_0}(x, x) \le c_1$ for $x \in F_2^c$.
Proof. Let $G = D' \cap (\partial F_1)$ and $H = D' \cap (\partial F_{7/4})$. Let $T_G = \inf\{t : X_t \in G\}$, and define $T_H$ similarly. Fix some $t_0 > 0$. The set $F_{3/2} \setminus F_{1/2}$ is a Lipschitz domain so, by the proof of [5, Theorem 3.5] and Remark 3.1, there exists $c_2 > 0$ such that the process started at $z \in G$ will move at least $1/8$ away from the boundary before hitting $\partial F_{3/2} \cup \partial F_{1/2}$, with probability greater than $c_2$. This, the strong Markov property, and the support theorem for Brownian motion (see [3, Theorem I.6.6]) imply that there exists $p_1 < 1$ such that $P^z(T_H < t_0) < p_1$ for all $z \in G$.
Let $\bar p_t(x, y)$ be the transition density for reflecting Brownian motion in $D' \setminus F_{1/2}$ with absorption on $G$. To get from $x$ to $y$, the paths either go from $x$ to $y$ without hitting $G$ or else they first hit $G$. So for $x, y \in F_{1/2}^c$,
\[ p_t(y, x) = \bar p_t(y, x) + \int_0^t \int_G P^y\bigl(X_{T_G} \in dz,\, T_G \in ds\bigr)\, p_{t-s}(z, x). \]
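If $t \le t_0$, $x \in F_2^c$, and $y \in F_{1/2}^c$, we have
\[ p_t(y, x) \le \bar p_t(y, x) + \sup_{0 \le r \le t}\, \sup_{z \in G} p_r(z, x) \int_0^t \int_G P^y\bigl(X_{T_G} \in dz,\, T_G \in ds\bigr) \le \bar p_t(y, x) + \sup_{0 \le r \le t}\, \sup_{z \in G} p_r(z, x). \tag{6.1} \]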
For $z \in G$ and $x \in F_2^c$,
\[ p_r(z, x) = \int_0^r \int_H P^z\bigl(X_{T_H} \in dw,\, T_H \in ds\bigr)\, p_{r-s}(w, x) \le \sup_{0 \le u \le r}\, \sup_{w \in H} p_u(w, x)\; P^z(T_H < r). \tag{6.2} \]
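Taking the supremum over $t \le t_0$ and $y \in H$ in (6.1) and applying (6.2), we obtain for $x \in F_2^c$,
\[ \sup_{t \le t_0}\, \sup_{y \in H} p_t(y, x) \le \sup_{t \le t_0}\, \sup_{y \in H} \bar p_t(y, x) + \sup_{0 \le r \le t_0}\, \sup_{z \in G} p_r(z, x) \le \sup_{t \le t_0}\, \sup_{y \in H} \bar p_t(y, x) + \sup_{t \le t_0}\, \sup_{w \in H} p_t(w, x)\; \sup_{z \in G} P^z(T_H < t_0). \]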
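Since $\bar p_t(x, y) \le c_2 t^{-1} \exp(-c_3 |x - y|^2 / t)$ by [5, Theorem 3.1], we obtain
\[ \sup_{t \le t_0}\, \sup_{y \in H}\, \sup_{x \in F_2^c} \bar p_t(y, x) \le c_4, \]
and, therefore, recalling that $P^z(T_H < t_0) < p_1$ for $z \in G$,
\[ \sup_{t \le t_0}\, \sup_{y \in H} p_t(y, x) \le c_4 + p_1 \sup_{t \le t_0}\, \sup_{y \in H} p_t(y, x), \]
or
\[ \sup_{t \le t_0}\, \sup_{y \in H} p_t(y, x) \le \frac{c_4}{1 - p_1} = c_5, \]
for $x \in F_2^c$. By (6.2), for $x \in F_2^c$,
\[ \sup_{0 \le r \le t_0}\, \sup_{z \in G} p_r(z, x) \le \sup_{0 \le r \le t_0}\, \sup_{w \in H} p_r(w, x). \]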
We apply this inequality together with (6.1), but this time with $y = x \in F_2^c$, to obtain
\[ p_{t_0}(x, x) \le \bar p_{t_0}(x, x) + \sup_{0 \le r \le t_0}\, \sup_{z \in G} p_r(z, x) \le c_6 + \sup_{0 \le r \le t_0}\, \sup_{w \in H} p_r(w, x) \le c_7. \]
This completes the proof.
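Next we want to obtain a modulus of continuity result for $\lambda$-resolvents in $D' \setminus F_1$. We set $R^\lambda g(x) = E^x \int_0^\infty e^{-\lambda t} g(X_t)\, dt$ for $\lambda > 0$. We obtain a modulus of continuity for $R^\lambda g$ depending only on $\lambda$ and the suprema of $|g|$ and $|R^\lambda g|$ on $D' \setminus F_1$ and not on the shape of $D' \cap F_1$.
Lemma 6.2. For every $\lambda > 0$ there exist $c_1$ and $\zeta$ such that if $x, y \in D' \setminus F_1$, then
\[ |R^\lambda g(x) - R^\lambda g(y)| \le c_1 |x - y|^\zeta \bigl(\|g\|_{L^\infty(D' \setminus F_1)} + \|R^\lambda g\|_{L^\infty(D' \setminus F_1)}\bigr). \]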
Proof. It suffices to fix arbitrarily small $\varepsilon > 0$ and prove the lemma only for $x, y \in D' \setminus F_1$ whose distance apart is less than $\varepsilon$. Find an $\varepsilon > 0$ and two disjoint small balls $K_1$ and $K_2$ in $D' \setminus F_1$ with the property that for every pair of points $x, y \in D' \setminus F_1$ with $|x - y| < \varepsilon$, the following is true for one of the balls, for example, ball $K_1$. We have $x \in D' \setminus (F_1 \cup K_1)$, and if we let $\gamma = \operatorname{dist}(x, K_1)/4$, then both $x$ and $y$ are in the ball of radius $\gamma/4$ about $x$. We now fix one of the balls, say, $K_1$, and prove the lemma for all $x$ and $y$ satisfying the above condition. The lemma can be proved for all other pairs $x$ and $y$ by repeating the argument for $K_2$ in place of $K_1$.
Let $T_{K_1}$ be the hitting time of $K_1$ and $R_1^\lambda g(x) = E^x \int_0^{T_{K_1}} e^{-\lambda t} g(X_t)\, dt$ for $\lambda \ge 0$. Suppose $h$ is a bounded function. We have
\[ R_1^0 h(x) = E^x \int_0^{T_\gamma} h(X_t)\, dt + E^x R_1^0 h(X_{T_\gamma}), \]
where $T_\gamma = \inf\{t : X_t \notin B(x, \gamma)\}$, and similarly for $R_1^0 h(y)$. Then
\[ |R_1^0 h(x) - R_1^0 h(y)| \le 2 \|h\|_{L^\infty(D' \setminus F_1)} \sup_{z \in B(x, \gamma)} E^z T_\gamma + \bigl|E^x R_1^0 h(X_{T_\gamma}) - E^y R_1^0 h(X_{T_\gamma})\bigr|. \]
Since the amount of time for a reflecting Brownian motion in a Lipschitz domain to leave a ball of radius $\gamma$ is bounded by a constant times $\gamma^2$ (cf. [5, Corollary 3.3]), the first term on the right-hand side is bounded by $c_2 \gamma^2 \|h\|_{L^\infty(D' \setminus F_1)}$. The function $z \mapsto E^z R_1^0 h(X_{T_\gamma})$ is harmonic with respect to reflecting Brownian motion, so by [5, Corollary 3.8] and scaling, there exists $\zeta_1 \in (0, 1)$ such that the second term on the right-hand side is bounded by
\[ c_3 \frac{|x - y|^{\zeta_1}}{\gamma^{\zeta_1}} \|R_1^0 h\|_{L^\infty(D' \setminus F_1)}. \]
Choosing $\gamma = |x - y|^{\zeta_1/(2+\zeta_1)}$ and adding, we see that
\[ |R_1^0 h(x) - R_1^0 h(y)| \le c_4 |x - y|^\zeta \bigl(\|h\|_{L^\infty(D' \setminus F_1)} + \|R_1^0 h\|_{L^\infty(D' \setminus F_1)}\bigr), \tag{6.3} \]
where $\zeta = 2\zeta_1/(2 + \zeta_1)$.
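The choice of $\gamma$ is the one that balances the two bounds, and the value of $\zeta$ can be verified by a direct computation of exponents. The short derivation below is only a check of the arithmetic, using the same symbols as in the proof.

```latex
% With \gamma = |x-y|^{\zeta_1/(2+\zeta_1)}, the first bound becomes
\gamma^2 = |x-y|^{2\zeta_1/(2+\zeta_1)},
% and the second bound becomes
\frac{|x-y|^{\zeta_1}}{\gamma^{\zeta_1}}
  = |x-y|^{\zeta_1 - \zeta_1^2/(2+\zeta_1)}
  = |x-y|^{(2\zeta_1 + \zeta_1^2 - \zeta_1^2)/(2+\zeta_1)}
  = |x-y|^{2\zeta_1/(2+\zeta_1)}.
% Both terms therefore carry the common exponent
\zeta = \frac{2\zeta_1}{2+\zeta_1}.
```

Since $\zeta_1 \in (0, 1)$, the resulting exponent satisfies $0 < \zeta < 2/3$.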
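Now let $g$ be bounded, and let $h = -(1/2)\Delta R_1^\lambda g$ so that $R_1^0 h = R_1^\lambda g$. Since $(\lambda - (1/2)\Delta) R_1^\lambda$ is the identity, $h = g - \lambda R_1^\lambda g$. So both $h$ and $R_1^0 h$ are bounded in $L^\infty$ norm by
\[ (1 + \lambda) \bigl(\|g\|_{L^\infty(D' \setminus F_1)} + \|R_1^\lambda g\|_{L^\infty(D' \setminus F_1)}\bigr). \]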
Applying (6.3) to $R_1^0 h = R_1^\lambda g$, we obtain our estimate.
Recall that $D'$ denotes a Lipschitz domain. Such a domain has a discrete spectrum for the half Laplacian with Neumann boundary conditions. Let $\mu_k$ be the sequence of all eigenvalues, ordered and repeated, if necessary. Let $\varphi_k$ be the corresponding eigenfunctions. Note that $\mu_1 = 0$ and $\varphi_1$ is a constant function.
Corollary 6.3. If $\mu_2 < c_1$, then one can find a modulus of continuity for $\varphi_2$ in $D' \setminus F_2$ which does not depend on $D'$ but only on $c_1$.
Proof. By Lemma 6.1,
\[ c_2 \ge p_1(x, x) = \sum_i e^{-\mu_i} \varphi_i(x) \varphi_i(x), \]
so
\[ \varphi_2(x)^2 \le c_2 e^{\mu_2} \le c_3. \]
Now $\varphi_2 = (1 + \mu_2) R^1 \varphi_2$, since $\varphi_2$ is an eigenfunction. So on $D' \setminus F_2$ we have
\[ |R^1 \varphi_2(x)| = (1 + \mu_2)^{-1} |\varphi_2(x)| \le c_3^{1/2}. \]
By Lemma 6.2, $R^1 \varphi_2$ has a modulus of continuity not depending on $D'$. Using $\varphi_2 = (1 + \mu_2) R^1 \varphi_2$, we thus obtain a modulus of continuity for $\varphi_2$.
Recall that $D'$ is a Lipschitz domain whose intersection with $F^c$ agrees with $F^c$. Let $f$ be a bounded function, and let $u$ be the solution to the heat equation in $D'$ with initial condition $u(0, x) = f(x)$. The solution is given by
\[ u(t, x) = \int p_t(x, y) f(y)\, dy, \]
where $p_t(x, y)$ are transition densities for the reflected Brownian motion in $D'$. The eigenfunction expansion of $p_t(x, y)$ yields
\[ u(t, x) = \sum_i a_i e^{-\mu_i t} \varphi_i(x), \]
where $a_i = \int f(y) \varphi_i(y)\, dy$.
Lemma 6.4. Suppose $D'$ is as above and there exist $b_1, b_2$ such that $\mu_2 < b_1 < b_2 < \mu_3$. There exist $\gamma > 0$ and $c_1$ depending only on $b_1, b_2$ such that for all $x \in D' \setminus F_1$ and $t \ge 1$,
\[ \bigl|u(t, x) - a_1 - a_2 e^{-\mu_2 t} \varphi_2(x)\bigr| \le c_1 \|f\|_2\, e^{-(\mu_2 + \gamma) t}. \]
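Proof. By the Cauchy-Schwarz inequality,
\[ \bigl|u(t, x) - a_1 - a_2 e^{-\mu_2 t} \varphi_2(x)\bigr| = \Bigl|\sum_{i>2} a_i e^{-\mu_i t} \varphi_i(x)\Bigr| \le \Bigl(\sum_{i>2} a_i^2\Bigr)^{1/2} \Bigl(\sum_{i>2} e^{-2\mu_i t} \varphi_i(x)^2\Bigr)^{1/2}. \]
By Bessel’s inequality, $\sum_i a_i^2 \le \|f\|_2^2$. We take $\gamma = (\mu_3 - \mu_2)/2$. Using Lemma 6.1 and the fact that $\mu_3 > b_2 > b_1 > \mu_2$, we obtain
\[ \sum_{i>2} e^{-2\mu_i t} \varphi_i(x)^2 \le e^{-(2\mu_2 + 2\gamma) t} \sum_{i>2} e^{-2\gamma t} \varphi_i(x)^2 \le e^{-(2\mu_2 + 2\gamma) t} \sum_{i>2} e^{-2\gamma} \varphi_i(x)^2 \le e^{-(2\mu_2 + 2\gamma) t}\, p_{2\gamma}(x, x) \le c_2 e^{-(2\mu_2 + 2\gamma) t}. \]
Hence,
\[ \bigl|u(t, x) - a_1 - a_2 e^{-\mu_2 t} \varphi_2(x)\bigr| \le \bigl(\|f\|_2^2\bigr)^{1/2} \bigl(c_2 e^{-(2\mu_2 + 2\gamma) t}\bigr)^{1/2} = c_2^{1/2} \|f\|_2\, e^{-(\mu_2 + \gamma) t}. \]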
This completes the proof.
Lemma 6.5. Suppose that $u(t, x)$ is the solution of the heat equation in $D'$ with $u(0, x) = f(x)$, where $\|f\|_2 < c_1$. For a fixed $t > 0$, we can find a modulus of continuity for $x \mapsto u(t, x)$ on $D' \setminus F_2$ which depends only on $c_1$ and $t$.
Proof. Fix some $\lambda > 0$, and let
\[ h(x) = \sum_i a_i (\lambda + \mu_i) e^{-\mu_i t} \varphi_i(x), \]
where $a_i = \int f(y) \varphi_i(y)\, dy$. Since the $\varphi_i$ are eigenfunctions, $R^\lambda \varphi_i = (\lambda + \mu_i)^{-1} \varphi_i$, and hence $u(t, x) = R^\lambda h(x)$. Here $R^\lambda$ is the $\lambda$-resolvent. Note that $(\lambda + \mu)^2 e^{-\mu t}$ is bounded for $\mu \ge 0$ by a constant depending only on $t$. This implies that
\[ \sum_i (\lambda + \mu_i)^2 e^{-2\mu_i t} \varphi_i(x)^2 \le c_2 \sum_i e^{-\mu_i t} \varphi_i(x)^2 = c_2\, p_t(x, x). \]
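By Lemma 6.1, $p_t(x, x)$ is bounded on $D' \setminus F_2$ by $c_3$. By Bessel’s inequality, $\sum_i a_i^2 \le \|f\|_2^2 \le c_1^2$. Putting these estimates together and using the Cauchy-Schwarz inequality,
\[ |h(x)| \le \Bigl(\sum_i (\lambda + \mu_i)^2 e^{-2\mu_i t} \varphi_i(x)^2\Bigr)^{1/2} \Bigl(\sum_i a_i^2\Bigr)^{1/2} \le (c_2 c_3)^{1/2} c_1. \]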
By Lemma 6.2, $u(t, x) = R^\lambda h(x)$ is continuous on $D' \setminus F_2$ with a modulus dependent only on $c_1$ and $t$.
Lemma 6.6. For every open set $M$ and real numbers $a_1, a_2, b_1, b_2 > 0$, there exists $\varepsilon > 0$ with the following property. If $\varphi$ satisfies $(1/2)\Delta\varphi = -\mu\varphi$ in $M$ with $b_1 < \mu < b_2$ and if $a_1 < \varphi(x) < a_2$, then $a_2 > a_1 + \varepsilon$.
Proof. Pick a point $x_0 \in M$, and let $\gamma = \operatorname{dist}(x_0, \partial M)/2$. Let $T$ be the first time the Brownian motion starting from $x_0$ leaves $B(x_0, \gamma)$, and suppose $x \in B(x_0, \gamma)$. Since $(1/2)\Delta\varphi = -\mu\varphi$, by Itô’s formula,
\[ \varphi(x) = E^x \varphi(X_T) + \mu E^x \int_0^T \varphi(X_s)\, ds. \tag{6.4} \]
Now suppose $a_1 < \varphi < a_2$ on $M$ and $a_2 - a_1 = \varepsilon$ is small. The oscillation of the left-hand side of (6.4) over the set $B(x_0, \gamma)$ is less than $\varepsilon$; that is,
\[ \sup_{x \in B(x_0, \gamma)} \varphi(x) - \inf_{x \in B(x_0, \gamma)} \varphi(x) \le \varepsilon. \]
The oscillation of the first term on the right-hand side of (6.4) is also less than $\varepsilon$. We write the second term on the right-hand side of (6.4) as
\[ \mu E^x \int_0^T (\varphi - a_1)(X_s)\, ds + a_1 \mu E^x T. \]
The first term has oscillation less than $\mu \varepsilon E^x T \le \mu \varepsilon c_1 \gamma^2 \le c_1 b_2 \varepsilon \gamma^2$. But the oscillation of the second term is greater than $a_1 \mu$ times the oscillation of $E^x T$, and hence is bigger than $c_1 a_1 b_1 \gamma^2$. Thus, the oscillation of the right-hand side of (6.4) is larger than $c_1 a_1 b_1 \gamma^2 - c_1 b_2 \varepsilon \gamma^2 - \varepsilon$, and this is larger than the oscillation of the left-hand side, $\varepsilon$, if $\varepsilon$ is sufficiently small. This is a contradiction. The proof is complete.
Recall the set $D$ and various elements of its construction from Section 5. By the results of Sections 2 and 3, there exists a decreasing sequence of open sets $\tilde D_n$ such that reflected Brownian motions in the $\tilde D_n$’s converge to the fiber Brownian motion in $D$. Every domain $\tilde D_n$ is obtained by removing a finite number of line segments and
polygons from D. We can approximate every domain D˜ n by a sequence of Lipschitz domains such that reflected Brownian motions in those domains converge to the reflected Brownian motion in the D˜ n . Finding such a sequence of Lipschitz domains is rather easy in comparison with the construction presented in Sections 2 and 3, so we leave the details to the reader. Using the diagonal method, we see that there exists a sequence of Lipschitz domains Dn such that reflected Brownian motions in the Dn ’s converge to the fiber Brownian motion in D. Moreover, we can construct Dn ’s so that Dn ∩ F c = F c for every n. Let {µnk }k≥1 be the sequence of Neumann eigenvalues for Dn , ordered according to size and repeated, if necessary. The corresponding eigenfunctions are denoted ϕkn . Recall that w is the width of the fiber bundles connecting the two large subdomains of D. Lemma 6.7. We can find w ∈ (0, 1/100) and constants c1 < c2 < c3 such that for large n, c1 < µn2 < c2 < c3 < µn3 . Proof. We first prove that for any c2 > 0 we can have µn2 < c2 by choosing sufficiently small w > 0. The proof is almost identical to that of [7, Lemma 1]. Let A5 be the point at the intersection of the line containing A1 and A2 and the horizontal axis. For x ∈ DL1 , define a function f (x) as follows: (i) f (x) = 0 if |x| > 5 or if dist(x, A5 ) < 10w; (ii) f (x) = log(dist(x, A5 )/10w)/ log(1/10w) if both |x| ≤ 5 and dist(x, A5 ) ∈ [10w, 1]; (iii) f (x) = 1 if |x| < 5 and dist(x, A5 ) > 1. We extend f into a continuous function on Dn which is constant on every set ᐀x ⊂ DL , which is antisymmetric with respect to ᏸ, that is, f (x) = −f (ᏸx) for x ∈ DL , and which vanishes outside DL ∪DR . The function f satisfies the Neumann boundary conditions on ∂Dn . It is easy to check that when w → 0, the integrals (over Dn ) of |f | and |f |2 remain bounded above and bounded away from 0, and |∇f |2 → 0 when w → 0. 
Since the function $f$ is orthogonal to the constant function 1 (i.e., to the lowest eigenfunction), we have
$$\mu_2^n \le \frac{\int_{D_n} |\nabla f|^2}{\int_{D_n} |f|^2}.$$
In view of the previous remarks, the quantity on the right-hand side can be made smaller than $c_2$ by choosing small $w$. Note that the estimate does not depend on $n$.

Next we prove that the second Neumann eigenvalue $\mu_2^n$ in $D_n$ is simple if $w$ is small. It is easy to see that there exists $\xi < 10^{-3}$ with the following property. Suppose that $U$ is a subdomain of $F_{16}^c$ whose diameter is less than $\xi$, and consider Brownian motion in $U$, reflected on $\partial U \cap \partial D$ and killed on $\partial U \setminus \partial D$. Then with probability greater than 1/2, the Brownian motion in $U$ is killed within the first time unit of its
BASS AND BURDZY
motion, independent of the starting point. This estimate is independent of $w < 1/100$ and of the shape of $U$. The estimate implies that the probability of survival for more than $t$ units of time is less than $(1/2)^t$, for integer $t > 0$. This implies that the first mixed eigenvalue in $U$, with Neumann boundary conditions on $\partial U \cap \partial D$ and Dirichlet conditions on $\partial U \setminus \partial D$, must be greater than $c_4 = (\log 2)/2$. Now assume that $w$ is so small that $c_2 < c_4$. The nodal line of any second eigenfunction $\varphi_2^n$ divides $D_n$ into two subdomains by the Courant nodal line theorem (see either [1, p. 112] or [8, p. 19]). None of those subdomains can lie inside $F_{16}^c$ and have diameter less than $\xi$ because $\varphi_2^n$ truncated to such a subdomain $U$ is the first mixed eigenfunction in $U$ with eigenvalue $\mu_2^n$. This eigenvalue would have to be greater than $c_4$ and less than $c_2$, a contradiction.

Consider any second eigenfunction $\varphi_2^n$ and the function $\tilde\varphi_2^n(x) = \varphi_2^n(x) - \varphi_2^n(\mathcal{L}x)$. If $\varphi_2^n$ is not $\mathcal{L}$-symmetric, then $\tilde\varphi_2^n$ is a nonzero antisymmetric eigenfunction, that is, such that $\tilde\varphi_2^n(x) = -\tilde\varphi_2^n(\mathcal{L}x)$. Hence, there exists either a symmetric or an antisymmetric eigenfunction. In either case, there exists an eigenfunction whose nodal line is $\mathcal{L}$-symmetric. We consider such an eigenfunction in the next paragraph.

We show that the nodal line $K$ of $\varphi_2^n$ cannot intersect $F_{17}^c$. In view of what we have just proved, if $K$ intersects $F_{17}^c$, then the diameter of $K \cap F_{16}^c$ must be greater than $\xi$. For a fixed $\xi > 0$, there is a $p > 0$ such that for every starting point $x \in D_n$, for large $n$, the reflected Brownian motion in $D_n$ hits either $K \cap F_{16}^c$ or $\mathcal{L}(K \cap F_{16}^c)$ within the first unit of time with probability $p$. This follows from the fact that the reflected Brownian motions in $D_n$ converge weakly to a fiber Brownian motion in $D$ and the motion along the tubes connecting $D_L$ and $D_R$ has the character of a 1-dimensional Brownian motion. The same estimate applies to the process in $D_n$.
Hence, the probability of not hitting the nodal line of $\varphi_2^n$ within the first $t$ units of time for integer $t > 0$ is less than $(1 - p)^t$. This translates to the following statement about eigenvalues. The first mixed eigenvalue in a nodal domain of $\varphi_2^n$ must be greater than $c_5 = -(\log(1-p))/2$. In other words, $\mu_2^n > c_5$, where $c_5$ does not depend on $w$. By choosing sufficiently small $w$, we would have $\mu_2^n < c_2 < c_5/2$. This would be a contradiction, which shows that for small $w$, the nodal line of $\varphi_2^n$ does not intersect $F_{17}^c$. By [7, Lemma 3], $\mu_2^n$ is simple; that is, $\varphi_2^n$ is unique up to a multiplicative constant.

By uniqueness and the analysis above, the eigenfunction $\varphi_2^n$ must be either symmetric or antisymmetric. Suppose first that it is symmetric. One of its two nodal subdomains cannot intersect $F_{17}^c$, since the nodal line $K$ cannot do that. An argument completely analogous to the ones given above shows that the first mixed eigenvalue in such a subdomain must be larger than $c_2$, which is a contradiction. We conclude that $\varphi_2^n$ is antisymmetric, and so $K_1 = \mathcal{L} \cap D_n$ is a part of the nodal line. However, the nodal line of the second eigenfunction divides the domain into only two subdomains, so $K_1$ is in fact the whole nodal set of $\varphi_2^n$.

For a fixed $w > 0$, there exists $p_1 > 0$ such that for large $n$ and all $x \in D_n$, reflected Brownian motion in $D_n$ can hit $K_1$ within one unit of time with probability greater
than $p_1$. As before, this translates into a lower bound for the first mixed eigenvalue for the domain to the left of $K_1$, and this in turn gives a lower bound for the second eigenvalue $\mu_2^n$. This completes the proof of the claim that $\mu_2^n > c_1$.

Next we turn to the analysis of $\varphi_3^n$. By the same argument as in the case of $\varphi_2^n$, there exists either a symmetric or an antisymmetric third eigenfunction. Suppose that $\varphi_3^n$ is antisymmetric. Then $K_1$ is a part of the nodal set for $\varphi_3^n$, and the number of nodal subdomains must be even. The number must not exceed three, so there must be two nodal subdomains. Hence, $K_1$ is the whole nodal set of $\varphi_3^n$. This and the antisymmetry of both $\varphi_3^n$ and $\varphi_2^n$ imply that $\int \varphi_2^n \varphi_3^n \ne 0$. This contradicts the orthogonality of the eigenfunctions, proving that there is no antisymmetric third eigenfunction. It follows that there exists a symmetric third eigenfunction.

Our earlier arguments show that if the nodal line of $\varphi_3^n$ intersects $F_{17}^c$, then the eigenvalue $\mu_3^n$ can be bounded below, and we are done. If the nodal line does not intersect $F_{17}^c$, then there is a nodal subdomain which is disjoint from $F_{17}^c$. The usual arguments show that the Brownian motion in such a subdomain, with reflection on $\partial D_n$ and killing on the rest of the boundary, has to be killed fairly quickly, and this leads to a lower bound on the first mixed eigenvalue, independent of $w$. This completes the proof of $\mu_3^n > c_3 > c_2$.

Remark 6.8. It is not hard to see that the domains $D_n$ may be constructed in such a way that, for any fixed $a, b, c \in [-15 - \alpha/2, 15 + \alpha/2]$ such that $a < b < c$, the probability that the reflected Brownian motion in $D_n$ starting from any point $x \in D_n$ with $\rho(x) = b$ will hit the set $\{y \in D_n : \rho(y) = a\}$ before hitting $\{y \in D_n : \rho(y) = c\}$ converges to $(c - b)/(c - a)$ as $n \to \infty$, uniformly in $x \in D_n$ with $\rho(x) = b$.

Lemma 6.9. There exists $c_1 > 1$ such that for sufficiently small $w$ and large $n$,
$$\sup_{x \in D_n \cap F_{13}^c} \varphi_2^n > c_1 \sup_{x \in D_n \cap F_{10}} \varphi_2^n.$$
Proof. Let $G_n$ be the part of $D_n$ between $\partial F_{13}$ and $\mathcal{L}$. Choose $t_1$ and $\lambda$ with the following property. The probability that reflected Brownian motion in $G_n$ starting from any point of $\partial F_{10}$ will hit $\partial F_{13}$ for the first time after time $t$ but before hitting $\mathcal{L}$ is less than $\exp(-\lambda t)$ for all $t \ge t_1$. Recall that $\alpha$ denotes the length of a fiber $\gamma_x^k$ in $D$. We choose $t_1$ so large that
$$\sum_{k=0}^{\infty} \exp\bigl(-\lambda(t_1 + k)\bigr) \exp\Bigl(\frac{\lambda(t_1 + k + 1)}{2}\Bigr) \le c_2 = \frac{11.5 + \alpha/2}{12 + \alpha/2} - \frac{11 + \alpha/2}{12 + \alpha/2}.$$
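The condition on $t_1$ is a geometric series in disguise, so it can be checked in closed form. The sketch below is illustrative only: the function names and the sample values of $\lambda$ and $\alpha$ are assumptions, not quantities taken from the paper.

```python
import math


def tail_sum_closed_form(lam, t1):
    """Closed form of sum_{k>=0} exp(-lam*(t1+k)) * exp(lam*(t1+k+1)/2).

    Each term equals exp(lam/2) * exp(-lam*(t1+k)/2), a geometric series in k
    with ratio exp(-lam/2).
    """
    return math.exp(lam / 2) * math.exp(-lam * t1 / 2) / (1.0 - math.exp(-lam / 2))


def tail_sum_partial(lam, t1, terms=2000):
    """Direct partial sum; the two exponents are merged to avoid overflow."""
    return sum(
        math.exp(-lam * (t1 + k) + lam * (t1 + k + 1) / 2) for k in range(terms)
    )


def smallest_integer_t1(lam, alpha):
    """Smallest positive integer t1 for which the series is at most c2.

    Here c2 = (11.5 + alpha/2)/(12 + alpha/2) - (11 + alpha/2)/(12 + alpha/2),
    which simplifies to 0.5 / (12 + alpha/2).
    """
    c2 = 0.5 / (12.0 + alpha / 2.0)
    t1 = 1
    while tail_sum_closed_form(lam, t1) > c2:
        t1 += 1
    return t1, c2
```

Since the series decays like $e^{-\lambda t_1/2}$, a suitable $t_1$ always exists once $\lambda > 0$ is fixed; the proof needs only existence, not the smallest value.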
Choose w sufficiently small so that the eigenvalue µn2 is so small that µn2 < λ/2 and exp(µn2 t1 ) < (13 + α/2)/(12 + α/2). Let u(t, x) = ϕ2n (x) exp(µn2 t). This function is parabolic for reflected Brownian motion Xt in Gn , stopped at the hitting time of ᏸ or ∂F13 . The averaging property of parabolic functions with respect to the process says that the value of ϕ2n (x) = u(0, x)
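for any $x \in F_{10}$ can be computed as $E_x u(\tau, X_\tau)$, where $\tau$ is the first hitting time of $\mathcal{L} \cup \partial F_{13}$. Note that $u(t, x) = 0$ for $x \in \mathcal{L}$, so we concentrate on $\partial F_{13}$. By Remark 6.8, for sufficiently large $n$ and all $x \in D_n \cap F_{10}$, we have
$$P_x\bigl(X_\tau \in \partial F_{13}\bigr) \le \frac{11 + \alpha/2}{13 + \alpha/2},$$
so
$$E_x\, u(\tau, X_\tau) \mathbf{1}_{\{\tau < t_1\}} \le \frac{11 + \alpha/2}{13 + \alpha/2} \sup_{y \in \partial F_{13}} \varphi_2^n(y) \cdot \frac{13 + \alpha/2}{12 + \alpha/2} \le \frac{11 + \alpha/2}{12 + \alpha/2} \sup_{y \in \partial F_{13}} \varphi_2^n(y).$$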
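The other contribution to the expectation comes from the hitting after time $t_1$. Recall that we have chosen $t_1$ and $\lambda$ in such a way that the probability that reflected Brownian motion in $G_n$ starting from any point of $\partial F_{10}$ will hit $\partial F_{13}$ for the first time after time $t$ but before hitting $\mathcal{L}$ is less than $\exp(-\lambda t)$, for all $t \ge t_1$. Hence,
$$\begin{aligned}
E_x\, u(\tau, X_\tau) \mathbf{1}_{\{\tau \ge t_1\}}
&\le \sup_{y \in \partial F_{13}} \varphi_2^n(y) \sum_{k=0}^{\infty} \sup_{t \le t_1 + k + 1} e^{\lambda t/2}\, P_x\bigl(\tau \in [t_1 + k, t_1 + k + 1]\bigr) \\
&\le \sup_{y \in \partial F_{13}} \varphi_2^n(y) \sum_{k=0}^{\infty} \exp\bigl(-\lambda(t_1 + k)\bigr) \exp\Bigl(\frac{\lambda(t_1 + k + 1)}{2}\Bigr) \\
&\le \sup_{y \in \partial F_{13}} \varphi_2^n(y) \cdot c_2 \\
&\le \sup_{y \in \partial F_{13}} \varphi_2^n(y) \cdot \Bigl(\frac{11.5 + \alpha/2}{12 + \alpha/2} - \frac{11 + \alpha/2}{12 + \alpha/2}\Bigr).
\end{aligned}$$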
The two estimates yield for $x \in D_n \cap F_{10}$,
$$\varphi_2^n(x) = E_x\, u(\tau, X_\tau) \le \frac{11.5 + \alpha/2}{12 + \alpha/2} \sup_{y \in \partial F_{13}} \varphi_2^n(y).$$
This directly implies the inequality in the lemma.

Proof of Theorem 1.2. Choose $c_1 > 1$, $n$ and $w$ as in Lemma 6.9, so that
$$\sup_{x \in D_n \cap F_{13}^c} \varphi_2^n > c_1 \sup_{x \in D_n \cap F_{10}} \varphi_2^n.$$
We assume without loss of generality that the $L^2$-norm of $\varphi_2^n$ is equal to 1. It follows that there exists $c_2 > 0$ such that, for $n$ and $w$ as in Lemma 6.9, we must have $\varphi_2^n(x) > c_2$ for some $x \in F_{13}^c$. Fix some $w$ as above, and then use Lemma 6.7 to find $c_3 > 0$ such that $\mu_2^n > c_3$ for sufficiently large $n$.
Recall from the proof of Lemma 6.7 that the $\varphi_2^n$ are antisymmetric. If one of these eigenfunctions has its maximum (and, by antisymmetry, its minimum) inside the domain $D_n$, then we are done. We assume that the opposite is true and show how this assumption leads to a contradiction.

Indeed, let us assume that the maxima of $\varphi_2^n$ occur at points $x_n$ on the boundary of $F_{13}^c \cap D_L$. Recall from Section 5 the partial order $\prec$ among the points of $D$. Let $M$ be the set of all points $x \in D$ which satisfy $x \prec x_n$ for every $n$. It is not hard to see that $M$ contains a nondegenerate ball around the origin. Lemma 6.7 gives us a lower and an upper bound for $\mu_2^n$. Now use Lemma 6.6 to find an $\varepsilon > 0$ such that if $\varphi_2^n(x) \in [c_2/2, c_2/2 + \varepsilon_1]$ for all $x \in M$ then $\varepsilon_1 > \varepsilon$.

Let $u_n(t, x)$ be the solution to the Neumann heat problem in $D_n$ with the initial temperature equal to 1 to the left of $\mathcal{L}$ and equal to $-1$ elsewhere. Note that the $L^2$ norm of the initial condition $u_n(0, x)$ is bounded by some $c_4$, independent of $n$. The bounds in Lemma 6.7 allow us to apply Lemma 6.4. We see that for some $c_5$ and $\gamma$,
$$\bigl|u_n(t, x) - a_1^n - a_2^n e^{-\mu_2^n t} \varphi_2^n(x)\bigr| \le c_5 c_4 e^{-(\mu_2^n + \gamma)t} = c_6 e^{-(\mu_2^n + \gamma)t} \tag{6.5}$$
for all $t \ge 1$ and $x \in F_{13}^c$.

Recall from Corollary 6.3 that the $\varphi_2^n$ are equicontinuous on $F_{13}^c \cap D_L$. This and the fact that every $\varphi_2^n$ has $L^2$-norm equal to 1 imply that
$$a_2^n = \int_{D_n} u_n(0, x) \varphi_2^n(x)\, dx = \int_{D_n} \bigl|\varphi_2^n(x)\bigr|\, dx > c_7 > 0,$$
where $c_7$ is independent of $n$. Let $t_0 > 1$ be so large that $c_6 e^{-\gamma t_0}/c_7 < \varepsilon/16$.

Recall that $u(t, x)$ denotes the Neumann heat equation solution in $D$, that is, $u(t, x) = E_x u(0, X_t)$, where the expectation is taken with respect to the fiber Brownian motion in $D$. We take the same initial condition as in the case of $u_n(t, x)$, that is, the initial value $u(0, x)$ is equal to 1 to the left of $\mathcal{L}$ and equal to $-1$ on the remaining part of $D$. By Lemma 6.5, passing to a subsequence if necessary, we have that $u_n(t_0, \cdot)$ converges to $u(t_0, \cdot)$, uniformly on $M$. Find sufficiently large $n$ such that $|u_n(t_0, x) - u(t_0, x)| < c_7 (\varepsilon/8) \exp(-\mu_2^n t_0)$ for $x \in M$. For any $x \in M$, we have $x \prec x_n$, so $u(t_0, x) \ge u(t_0, x_n)$ by Lemma 5.1. We have
$$\begin{aligned}
0 \le u_n(t_0, x_n) - u_n(t_0, x)
&\le \bigl|u_n(t_0, x_n) - u(t_0, x_n)\bigr| + \bigl(u(t_0, x_n) - u(t_0, x)\bigr) + \bigl|u_n(t_0, x) - u(t_0, x)\bigr| \\
&\le 2 c_7 \frac{\varepsilon}{8} \exp\bigl(-\mu_2^n t_0\bigr).
\end{aligned}$$
This and (6.5) imply that
$$\varphi_2^n(x_n) - \varphi_2^n(x) \le \frac{2 c_6 e^{-(\mu_2^n + \gamma) t_0}}{a_2^n e^{-\mu_2^n t_0}} + \frac{2 c_7 (\varepsilon/8) \exp\bigl(-\mu_2^n t_0\bigr)}{a_2^n e^{-\mu_2^n t_0}} \le \frac{2 c_6 e^{-\gamma t_0}}{c_7} + \frac{\varepsilon}{4} \le \frac{3\varepsilon}{8}.$$
From this, we obtain $|\varphi_2^n(x) - \varphi_2^n(y)| \le 3\varepsilon/4$ for all $x, y \in M$. This, however, contradicts the definition of $\varepsilon$.

References

[1] C. Bandle, Isoperimetric Inequalities and Applications, Monogr. Stud. Math. 7, Pitman, Boston, 1980.
[2] R. Bañuelos and K. Burdzy, On the “hot spots” conjecture of J. Rauch, J. Funct. Anal. 164 (1999), 1–33.
[3] R. F. Bass, Probabilistic Techniques in Analysis, Probab. Appl., Springer, New York, 1995.
[4] R. F. Bass and M. T. Barlow, The construction of Brownian motion on the Sierpinski carpet, Ann. I. H. Poincaré Probab. Statist. 25 (1989), 225–257.
[5] R. F. Bass and P. Hsu, Some potential theory for reflecting Brownian motion in Hölder and Lipschitz domains, Ann. Probab. 19 (1991), 486–508.
[6] K. Burdzy and W. Kendall, Efficient Markovian couplings: Examples and counterexamples, to appear in Ann. Appl. Probab.
[7] K. Burdzy and W. Werner, A counterexample to the “hot spots” conjecture, Ann. of Math. (2) 149 (1999), 309–317.
[8] I. Chavel, Eigenvalues in Riemannian Geometry, Pure Appl. Math. 115, Academic Press, Orlando, 1984.
[9] K. Itô and H. P. McKean, Diffusion Processes and Their Sample Paths, Springer, Berlin, 1974.
[10] D. Jerison and N. Nadirashvili, The “hot spots” conjecture for domains with two axes of symmetry, preprint, 1999.
[11] B. Kawohl, Rearrangements and Convexity of Level Sets in PDE, Lecture Notes in Math. 1150, Springer, Berlin, 1985.
Bass: Department of Mathematics, University of Connecticut, Storrs, Connecticut 06269, USA; [email protected]
Burdzy: Department of Mathematics, University of Washington, Box 354350, Seattle, Washington 98195-4350, USA; [email protected]
Vol. 105, No. 1
DUKE MATHEMATICAL JOURNAL
© 2000
OSCILLATION AND VARIATION FOR THE HILBERT TRANSFORM
JAMES T. CAMPBELL, ROGER L. JONES, KARIN REINHOLD, and MÁTÉ WIERDL

1. Introduction. For each $\varepsilon > 0$, let
$$H_\varepsilon f(x) = \frac{1}{\pi} \int_{|t| > \varepsilon} \frac{f(x - t)}{t}\, dt.$$
The Hilbert transform, $Hf(x)$, is defined by
$$Hf(x) = \lim_{\varepsilon \to 0^+} H_\varepsilon f(x).$$
It is well known that this limit exists a.e. for all $f \in L^p$, $1 \le p < \infty$. In this paper, we will consider the oscillation and variation of this family of operators as $\varepsilon$ goes to zero, which gives extra information on their convergence as well as an estimate on the number of $\lambda$-jumps they can have. For earlier results on oscillation and variation operators in analysis and ergodic theory, including some historical remarks and applications, the reader may look in [2], [3], [5], [4], and [6].

For each fixed sequence $(t_i) \searrow 0$, we define the oscillation operator
$$\mathcal{O}(H_* f)(x) = \Bigl( \sum_{i=1}^{\infty} \sup_{t_{i+1} \le \varepsilon_{i+1} < \varepsilon_i \le t_i} \bigl| H_{\varepsilon_i} f(x) - H_{\varepsilon_{i+1}} f(x) \bigr|^2 \Bigr)^{1/2}$$
and the variation operator
$$\mathcal{V}_\varrho(H_* f)(x) = \sup_{(\varepsilon_i) \searrow 0} \Bigl( \sum_{i=1}^{\infty} \bigl| H_{\varepsilon_i} f(x) - H_{\varepsilon_{i+1}} f(x) \bigr|^\varrho \Bigr)^{1/\varrho}.$$
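To make the definitions concrete, here is a small sketch that evaluates the truncated Hilbert transform in closed form for the indicator function of $[0,1]$ and computes the variation sum along one fixed decreasing sequence. The choice of test function, the helper names, and the sample sequences are assumptions made for illustration; a single sequence only gives a lower bound for the supremum in the variation operator.

```python
import math


def truncated_hilbert_indicator(x, eps):
    """H_eps f(x) for f = indicator of [0, 1], computed in closed form.

    f(x - t) = 1 exactly when t lies in [x - 1, x], so H_eps f(x) is
    (1/pi) times the integral of dt/t over [x - 1, x] intersected with {|t| > eps}.
    """
    total = 0.0
    # Part of [x - 1, x] lying in t > eps.
    lo, hi = max(x - 1.0, eps), x
    if hi > lo:
        total += math.log(hi / lo)
    # Part of [x - 1, x] lying in t < -eps (both endpoints are negative here).
    lo, hi = x - 1.0, min(x, -eps)
    if hi > lo:
        total += math.log(abs(hi) / abs(lo))
    return total / math.pi


def variation_along_sequence(x, eps_seq, rho):
    """Variation sum of eps -> H_eps f(x) along the given decreasing sequence."""
    values = [truncated_hilbert_indicator(x, e) for e in eps_seq]
    total = sum(abs(a - b) ** rho for a, b in zip(values, values[1:]))
    return total ** (1.0 / rho)
```

For a point at distance at least 1 from the support, every truncation level below 1 gives the same value, so the variation along such a sequence is zero; near the support the truncated values move with the truncation level and the variation is positive.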
The main results of this paper are the following two theorems.

Theorem 1.1. The oscillation operator $\mathcal{O}(H_* f)(x)$ satisfies $\|\mathcal{O}(H_* f)\|_p \le c_p \|f\|_p$ for $1 < p < \infty$ and $m\{x : \mathcal{O}(H_* f)(x) > \lambda\} \le (c/\lambda)\|f\|_1$.

Received 13 July 1999. Revision received 6 January 2000.
2000 Mathematics Subject Classification. Primary 42B20, 42B25; Secondary 42A50, 40A30.
Jones’s work partially supported by National Science Foundation grant number DMS-9531526.
Wierdl’s work partially supported by National Science Foundation grant number DMS-9500577.
Theorem 1.2. If $\varrho > 2$, then the variation operator $\mathcal{V}_\varrho(H_* f)(x)$ satisfies $\|\mathcal{V}_\varrho(H_* f)\|_p \le c(p, \varrho)\|f\|_p$ for $1 < p < \infty$ and $m\{x : \mathcal{V}_\varrho(H_* f)(x) > \lambda\} \le (c(\varrho)/\lambda)\|f\|_1$.

The first issue to address is whether the operator $\mathcal{V}_\varrho$ is well defined. Since we are taking a supremum over such a large set of sequences, it is not obvious that the resulting operator is measurable. To deal with this, we first restrict $(\varepsilon_k)$ to lie in some finite set, prove a norm inequality that is independent of the set, and then enlarge the set. In this way, we obtain the result with $(\varepsilon_k)$ restricted to a countable dense set, say, the rationals. The final result follows from the continuity properties of the operators involved.

It is useful to have a second notation for the oscillation and variation of the family of operators. Since both $\mathcal{O}(H_* f)$ and $\mathcal{V}_\varrho(H_* f)$ are seminorms on the family $H_\varepsilon f$, we often denote $\mathcal{O}(H_* f)(x)$ by $\|H_* f(x)\|_{\mathcal{O}}$ and $\mathcal{V}_\varrho(H_* f)(x)$ by $\|H_* f(x)\|_{v_\varrho}$. See [4] for further discussion of these seminorms.

More generally, if $W_\varepsilon$ is a family of operators, we consider the oscillation operator
$$\mathcal{O}(W_* f)(x) = \|W_* f(x)\|_{\mathcal{O}} = \Bigl( \sum_{i=1}^{\infty} \sup_{t_{i+1} \le \varepsilon_{i+1} < \varepsilon_i \le t_i} \bigl| W_{\varepsilon_i} f(x) - W_{\varepsilon_{i+1}} f(x) \bigr|^2 \Bigr)^{1/2},$$
where $(t_i)$ is a fixed decreasing sequence, and the variation operator
$$\mathcal{V}_\varrho(W_* f)(x) = \|W_* f(x)\|_{v_\varrho} = \sup_{(\varepsilon_i) \searrow 0} \Bigl( \sum_{i=1}^{\infty} \bigl| W_{\varepsilon_i} f(x) - W_{\varepsilon_{i+1}} f(x) \bigr|^\varrho \Bigr)^{1/\varrho}.$$
We also study the $\lambda$-jump operator
$$\Lambda(W_\varepsilon, f, \lambda, x) = \sup\bigl\{ n \ge 0 : \text{there exist } s_1 < t_1 \le s_2 < t_2 \le \cdots \le s_n < t_n \text{ with the property that } |W_{t_i} f(x) - W_{s_i} f(x)| > \lambda,\ i = 1, 2, \ldots, n \bigr\},$$
which gives the number of $\lambda$-jumps of the family $W_\varepsilon f$. Clearly, for a convergent family, the number of $\lambda$-jumps must be finite a.e. The size of $\Lambda(W_\varepsilon, f, \lambda, x)$ gives us information about how the family $W_\varepsilon$ converges. We prove the following theorem.

Theorem 1.3. If $\varrho > 2$, then the operator $\Lambda(H_\varepsilon, f, \lambda, x)$ satisfies
$$\bigl\| \Lambda(H_\varepsilon, f, \lambda, \cdot)^{1/\varrho} \bigr\|_p \le c(p, \varrho)\|f\|_p$$
for $1 < p < \infty$, and if $n \ge 1$, we have
$$m\bigl\{x : \Lambda(H_\varepsilon, f, \lambda, x) > n\bigr\} \le \frac{c(\varrho)}{\lambda n^{1/\varrho}} \|f\|_1.$$
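Define
$$V_k(H_* f)(x) = \sup_{(\varepsilon_i) \searrow 0} \Bigl( \sum_{1/2^k < \varepsilon_{i+1} < \varepsilon_i \le 1/2^{k-1}} \bigl| H_{\varepsilon_i} f(x) - H_{\varepsilon_{i+1}} f(x) \bigr|^2 \Bigr)^{1/2},$$
where the supremum is taken over all decreasing sequences $(\varepsilon_i)$. Define the “short variation operator”
$$S_V(H_* f)(x) = \Bigl( \sum_{k=-\infty}^{\infty} \bigl( V_k(H_* f)(x) \bigr)^2 \Bigr)^{1/2}.$$
As before, it is sometimes convenient to denote this operator by $\|H_* f(x)\|_{sV}$. For a family of operators, $W_\varepsilon$, define
$$V_k(W_* f)(x) = \sup_{(\varepsilon_i) \searrow 0} \Bigl( \sum_{1/2^k < \varepsilon_{i+1} < \varepsilon_i \le 1/2^{k-1}} \bigl| W_{\varepsilon_i} f(x) - W_{\varepsilon_{i+1}} f(x) \bigr|^2 \Bigr)^{1/2}$$
and
$$S_V(W_* f)(x) = \Bigl( \sum_{k=-\infty}^{\infty} \bigl( V_k(W_* f)(x) \bigr)^2 \Bigr)^{1/2},$$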
where again the supremum is taken over all decreasing sequences $(\varepsilon_i)$. We also prove the following theorem.

Theorem 1.4. The short variation operator $S_V$ satisfies $\|S_V(H_* f)\|_p \le c_p \|f\|_p$ for $1 < p < \infty$ and $m\{x : S_V(H_* f)(x) > \lambda\} \le (c/\lambda)\|f\|_1$.

The paper is organized as follows. In Section 2, we prove some general results for oscillation and variation norms of certain types of convolution operators. These are used in Section 3 to show that the oscillation and variation norms of the Hilbert transform are bounded operators on $L^p$, $1 < p < \infty$. In Section 5, we prove that the oscillation and variation norms satisfy a weak type (1,1) inequality if the gaps are “short.” In Section 6, we show that the “long variation,” that is, gaps between dyadic terms, satisfies a weak (1,1) result. Finally, in Section 7, we complete the proof of Theorems 1.1, 1.2, and 1.3 by combining the results for long variation and short variation, thus obtaining results for the full variation.

Remark 1.5. If we wanted to consider truncation both near zero and near infinity, we could argue as in the case of truncation only near zero. That follows since
$$\int_{a < |t| < b} \frac{f(x - t)}{t}\, dt = \int_{|t| > a} \frac{f(x - t)}{t}\, dt - \int_{|t| > b} \frac{f(x - t)}{t}\, dt.$$
Hence, we can control truncation in both directions by a sum of two operators, each of which controls the truncation only on the small end.
Remark 1.6. The inequalities in Theorems 1.1 and 1.2 are not a consequence of a general convergence principle such as the Banach principle. Consider the following simple example. For $n \ge 1$, let
$$T_{2n} f(x) = \frac{1}{\ln(n + 1)} f(x) \qquad\text{and}\qquad T_{2n+1} f(x) = \frac{-1}{\ln(n + 1)} f(x).$$
It is clear that $T_n f(x)$ converges to zero a.e. for all $f \in L^p$ for any $p$, $1 \le p \le \infty$. However, for any $\varrho > 0$, we have $\mathcal{V}_\varrho(T_* f)(x) = \infty$ and $\mathcal{O}(T_* f)(x) = \infty$ whenever $f(x) \ne 0$.

Remark 1.7. In general, to obtain a variation result, we need to assume $\varrho > 2$. This already occurs in the case of martingales (see [7]) and in the case of differentiation operators. We could also define the oscillation and short variation operators with exponents $\varrho \ge 2$, but these are dominated by the 2-oscillation (2-short variation, resp.), so there is no need to do so. For $\varrho < 2$, the $\varrho$-oscillation operator fails to be a bounded operator on any $L^p$ (with $t_i = 1/2^i$). See [1] for the case of differentiation. The argument presented there can be applied to the case considered here as well.

Remark 1.8. Throughout this paper, $c$ and $C$, sometimes with additional parameters, represent constants. However, they may represent different constants from one occurrence to the next.

Remark 1.9. There are higher dimensional analogs of the results mentioned in this paper. In particular, certain Calderón-Zygmund singular integrals in $\mathbb{R}^d$ satisfy similar estimates. These operators will be considered in a subsequent paper.

2. Oscillation and variation norms for certain convolution operators. In this section, we prove some lemmas about the oscillation and variation norms of certain kinds of convolution operators. These are used in the later sections.

Lemma 2.1. Let $B^+(x) = \chi_{[0,1]}(x)$, and let $D_t^+(x) = (1/t) B^+(x/t)$. Define $D_t^+ f(x) = D_t^+ * f(x)$. Thus, $D_t^+$ is the standard differentiation operator.
(1) For any fixed sequence $(t_i) \searrow 0$, the operator $\mathcal{O}(D_*^+)$ is weak type (1,1) and strong type $(p, p)$ for $1 < p < \infty$.
(2) For $\varrho > 2$, the operator $\mathcal{V}_\varrho(D_*^+)$ is weak type (1,1) and strong type $(p, p)$ for $1 < p < \infty$.
(3) The operator $S_V(D_*^+)$ is weak type (1,1) and strong type $(p, p)$ for $1 < p < \infty$.

Proof. For the special case $p = 2$, see Bourgain [2]. For other values of $p$, $1 \le p < \infty$, see [4] for a proof in the discrete case.
(See [4, Theorem 2.13] for p = 2, [4, Theorem 3.6] for the weak (1,1) result, and [4, Theorem 4.4] for the BMO result that allows the interpolation to get the remaining values of p.) The discrete case easily implies the continuous case. To see this, first consider the operators with the additional restriction that, for a fixed choice of n ∈ Z+ and a fixed J ∈ Z+ , the differentiation lengths are selected from the set {r/2n : r ∈ Z}, and at most J terms are in the sum
defining the operators. In this case, the discrete results imply the desired norm bounds for the operators associated with the differentiation averages. Moreover, the constants (although possibly not the best constants) that are obtained do not depend on $n$ or $J$. Letting $n \to \infty$, we obtain the result in the continuous parameter case, for a fixed $J$. Now let $J \to \infty$ to obtain the stated results.

Remark 2.2. We will need to apply Lemma 2.1 with $\chi_{[0,1]}$ replaced by $(1/2)\chi_{[-1,1]}$ or by $\chi_{[-1,0]}$. There is no change in the conclusion of the lemma. See [4], where it is clear the arguments can be adapted to these other cases as well.

Remark 2.3. The above result will be proved (in higher dimensions) in [6]. The proof there will be much more geometric than the Fourier transform proof given in [4]. For $p > 1$, the higher dimensional result can be obtained from the above $d = 1$ result. For $p = 1$, more work is required, and indeed, it was the $p = 1$ case that motivated the geometric arguments contained in [6].

Although in the present paper we need the following result only for the case $d = 1$, we state and prove it in the more general case since (assuming the higher dimensional result in [6]) the additional details in the proof are trivial.

Lemma 2.4. Let $\Phi$ be a radial function on $\mathbb{R}^d$, such that
(a) $\lim_{|x| \to \infty} \Phi(x) = 0$;
(b) if $\phi(|x|) = \Phi(x)$, then $\phi(t)$ is differentiable and $\int_0^\infty |\phi'(t)| t^d\, dt = A < \infty$.
Let $\Phi_t(x) = (1/t^d)\Phi(x/t)$ and $\Phi_t f(x) = \Phi_t * f(x)$.
(1) For any fixed sequence $(t_i) \searrow 0$, the operator $\mathcal{O}(\Phi_*)$ is weak type (1,1) and strong type $(p, p)$ for $1 < p < \infty$.
(2) For $\varrho > 2$, the operator $\mathcal{V}_\varrho(\Phi_*)$ is weak type (1,1) and strong type $(p, p)$ for $1 < p < \infty$.
(3) The operator $S_V(\Phi_*)$ is weak type (1,1) and strong type $(p, p)$ for $1 < p < \infty$.

Proof. First, note that
$$\phi(s) = -\int_0^\infty \chi_{[0,t]}(s) \phi'(t)\, dt.$$
Consequently, we have
$$\Phi(x) = -\int_0^\infty \chi_{[0,t]}(|x|) \phi'(t)\, dt.$$
Let $B(x)$ denote the indicator function of the unit ball in $\mathbb{R}^d$, and, as in Lemma 2.1, write $D_t(x) = (1/t^d) B(x/t)$ and $D_t f(x) = D_t * f(x)$. Then
$$\Phi_s(x) = \frac{1}{s^d} \phi\Bigl(\frac{|x|}{s}\Bigr) = -\int_0^\infty \frac{1}{(st)^d} \chi_{[0,t]}\Bigl(\frac{|x|}{s}\Bigr) \phi'(t) t^d\, dt = -\int_0^\infty D_{st}(x) \phi'(t) t^d\, dt.$$
Hence,
$$\Phi_s f(x) = -\int_0^\infty D_{st} f(x) \phi'(t) t^d\, dt.$$
We now follow the argument in Bourgain [2, Lemma 3.28], using his technique with modifications that allow us to obtain the result for all $p$, $1 < p < \infty$. We have
$$\begin{aligned}
\|\Phi_* f(x)\|_{v(\varrho)}
&= \sup_{(s_j), J} \Bigl( \sum_{j=1}^{J} \bigl| \Phi_{s_j} f(x) - \Phi_{s_{j-1}} f(x) \bigr|^\varrho \Bigr)^{1/\varrho} \\
&\le \sup_{(s_j), J} \Bigl( \sum_{j=1}^{J} \Bigl| \int_0^\infty D_{s_j t} f(x) \phi'(t) t^d\, dt - \int_0^\infty D_{s_{j-1} t} f(x) \phi'(t) t^d\, dt \Bigr|^\varrho \Bigr)^{1/\varrho} \\
&\le \sup_{(s_j), J} \int_0^\infty \Bigl( \sum_{j=1}^{J} \bigl| D_{s_j t} f(x) - D_{s_{j-1} t} f(x) \bigr|^\varrho \Bigr)^{1/\varrho} |\phi'(t)| t^d\, dt \\
&\le \int_0^\infty \sup_{(s_j), J} \Bigl( \sum_{j=1}^{J} \bigl| D_{s_j t} f(x) - D_{s_{j-1} t} f(x) \bigr|^\varrho \Bigr)^{1/\varrho} |\phi'(t)| t^d\, dt \\
&\le \int_0^\infty \|D_* f(x)\|_{v(\varrho)} |\phi'(t)| t^d\, dt \\
&\le \|D_* f(x)\|_{v(\varrho)} \int_0^\infty |\phi'(t)| t^d\, dt \\
&= A \|D_* f(x)\|_{v(\varrho)}.
\end{aligned}$$
Using this pointwise estimate, we have
$$\bigl\| \|\Phi_* f\|_{v(\varrho)} \bigr\|_p \le A \bigl\| \|D_* f\|_{v(\varrho)} \bigr\|_p \le c(p, \varrho) A \|f\|_p.$$
Using the same pointwise estimate, we get
$$m\bigl\{x : \|\Phi_* f(x)\|_{v(\varrho)} > \lambda\bigr\} \le m\bigl\{x : A\|D_* f(x)\|_{v(\varrho)} > \lambda\bigr\} \le \frac{c(\varrho) A}{\lambda} \|f\|_1.$$
The results for $\mathcal{O}$ and $S_V$ are obtained using the same argument. Since the details are almost the same, we do not include them.
Corollary 2.5. Let $\Phi_t$ be as in Lemma 2.4. Then, for any $\varrho > 2$ and $n \ge 1$, we have
$$m\bigl\{x : \Lambda(\Phi_t, f, \lambda, x) > n\bigr\} \le \frac{c(\varrho) A}{\lambda n^{1/\varrho}} \|f\|_1.$$
Proof. Note that $\lambda \bigl(\Lambda(\Phi_t, f, \lambda, x)\bigr)^{1/\varrho} \le \|\Phi_* f(x)\|_{v(\varrho)}$. Now the result follows from Lemma 2.4.

We now have the following two simple corollaries regarding the variation norm of the Poisson integral and the Gaussian kernel. In both cases, the proof just involves checking the hypothesis of Lemma 2.4, and this is an easy computation.

Corollary 2.6. Let $P(x) = c_d (1 + |x|^2)^{-(d+1)/2}$, and let $P_t(x) = (1/t^d) P(x/t)$. Define $P_t f(x) = P_t * f(x)$.
(1) For any fixed sequence $(t_i) \searrow 0$, the operator $\mathcal{O}(P_*)$ is weak type (1,1) and strong type $(p, p)$ for $1 < p < \infty$.
(2) For $\varrho > 2$, the operator $\mathcal{V}_\varrho(P_*)$ is weak type (1,1) and strong type $(p, p)$ for $1 < p < \infty$.
(3) The operator $S_V(P_*)$ is weak type (1,1) and strong type $(p, p)$ for $1 < p < \infty$.

Corollary 2.7. Let $\Phi(x) = (4\pi)^{-d/2} e^{-|x|^2/4}$, and let $\Phi_t(x) = (1/t^d)\Phi(x/t)$. Define $\Phi_t f(x) = \Phi_t * f(x)$.
(1) For any fixed sequence $(t_i) \searrow 0$, the operator $\mathcal{O}(\Phi_*)$ is weak type (1,1) and strong type $(p, p)$ for $1 < p < \infty$.
(2) For $\varrho > 2$, the operator $\mathcal{V}_\varrho(\Phi_*)$ is weak type (1,1) and strong type $(p, p)$ for $1 < p < \infty$.
(3) The operator $S_V(\Phi_*)$ is weak type (1,1) and strong type $(p, p)$ for $1 < p < \infty$.
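As an illustration of the "easy computation" behind these corollaries, here is the check of hypothesis (b) of Lemma 2.4 for the Poisson kernel in the case $d = 1$, where the normalising constant is $c_1 = 1/\pi$. This worked example is a sketch added for the reader; it covers only $d = 1$.

```latex
% Poisson kernel on the line: \phi(t) = \frac{1}{\pi (1 + t^2)}, which tends to 0 at infinity.
\[
  \phi'(t) = -\frac{2t}{\pi (1 + t^{2})^{2}} ,
  \qquad
  A = \int_{0}^{\infty} |\phi'(t)|\, t \, dt
    = \frac{2}{\pi} \int_{0}^{\infty} \frac{t^{2}}{(1 + t^{2})^{2}} \, dt
    = \frac{2}{\pi} \cdot \frac{\pi}{4}
    = \frac{1}{2} .
\]
% Equivalently, integrating by parts for a decreasing \phi with t\phi(t) \to 0:
\[
  \int_{0}^{\infty} |\phi'(t)|\, t \, dt = \int_{0}^{\infty} \phi(t) \, dt = \frac{1}{2} ,
\]
% which is half the total mass of the kernel.
```

The integration by parts form shows that in $d = 1$ any decreasing, integrable radial profile with $t\phi(t) \to 0$ at both ends satisfies hypothesis (b), with $A$ equal to half the mass of the kernel.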
Corollary 2.6 has the following corollary.

Corollary 2.8. Let $Q_y$ denote the conjugate Poisson kernel; that is, let $Q_y(x) = (1/\pi)(x/(x^2 + y^2))$. Define $Q_y f(x) = Q_y * f(x)$.
(1) For any fixed sequence $(t_i) \searrow 0$, the operator $\mathcal{O}(Q_*)$ is weak type (1,1) and strong type $(p, p)$ for $1 < p < \infty$.
(2) For $\varrho > 2$, the operator $\mathcal{V}_\varrho(Q_*)$ is weak type (1,1) and strong type $(p, p)$ for $1 < p < \infty$.
(3) The operator $S_V(Q_*)$ is weak type (1,1) and strong type $(p, p)$ for $1 < p < \infty$.

Proof. Note that $Q_y f(x) = P_y(Hf)(x)$, and recall that $\bigl\| \|P_* g\|_{v_\varrho} \bigr\|_p \le c(\varrho, p)\|g\|_p$. From this we have
$$\bigl\| \|Q_* f\|_{v_\varrho} \bigr\|_p = \bigl\| \|P_* Hf\|_{v_\varrho} \bigr\|_p \le c(\varrho, p)\|Hf\|_p \le c(\varrho, p)\|f\|_p.$$
The same argument works for the operators $\mathcal{O}(Q_* f)$ and $S_V(Q_* f)$.

Remark 2.9. In the case $d = 1$, the same argument shows that if we have a convolution kernel that is supported on $[0, \infty)$, then the conclusions of Lemma 2.4 still
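hold. The only change is to use the one-sided differentiation operator in place of the symmetric differentiation operator used above.

We also need two special variants of the above results. The proofs are exactly the same as above.

Lemma 2.10. Let $\phi(t)$ be supported on $[1, \infty)$. Assume $\lim_{t \to \infty} \phi(t) = 0$ and $\int_0^\infty t|\phi'(t)|\, dt = A < \infty$. Let $\phi_y(t) = (1/y)\phi(t/y)$. Define $\phi_y f(x) = \phi_y * f(x)$.
(1) For any fixed sequence $(t_i) \searrow 0$, the operator $\mathcal{O}(\phi_*)$ is weak type (1,1) and strong type $(p, p)$ for $1 < p < \infty$.
(2) For $\varrho > 2$, the operator $\mathcal{V}_\varrho(\phi_*)$ is weak type (1,1) and strong type $(p, p)$ for $1 < p < \infty$.
(3) The operator $S_V(\phi_*)$ is weak type (1,1) and strong type $(p, p)$ for $1 < p < \infty$.

Proof. Write
$$\phi(t) = -\int_1^\infty \chi_{[1,u]}(t) \phi'(u)\, du = -\int_1^\infty \chi_{[0,u]}(t) \phi'(u)\, du + \int_1^\infty \chi_{[0,1]}(t) \phi'(u)\, du.$$
Then we have
$$\phi_y(t) = -\int_1^\infty \frac{1}{yu} \chi_{[0,yu]}(t)\, u \phi'(u)\, du + \frac{1}{y} \int_1^\infty \chi_{[0,y]}(t) \phi'(u)\, du = \phi_y^1(t) + \phi_y^2(t).$$
Hence, we get
$$\|\phi_* f\|_{v_\varrho} \le \|\phi_*^1 f\|_{v_\varrho} + \|\phi_*^2 f\|_{v_\varrho}.$$
Let $D_y$ denote the one-sided differentiation; that is, $D_y f(x) = (1/y) \int_0^y f(x - t)\, dt$. For a suitable choice of $(y_i)$, we will have
$$\begin{aligned}
\|\phi_*^1 f(x)\|_{v_\varrho}
&\le 2 \Bigl( \sum_{k=1}^{\infty} \bigl| \phi_{y_k}^1 * f(x) - \phi_{y_{k+1}}^1 * f(x) \bigr|^\varrho \Bigr)^{1/\varrho} \\
&\le 2 \Bigl( \sum_{k=1}^{\infty} \Bigl| \int_{-\infty}^{\infty} f(x - t) \int_1^\infty \Bigl( \frac{1}{y_k u} \chi_{[0, y_k u]}(t) - \frac{1}{y_{k+1} u} \chi_{[0, y_{k+1} u]}(t) \Bigr) u \phi'(u)\, du\, dt \Bigr|^\varrho \Bigr)^{1/\varrho} \\
&= 2 \Bigl( \sum_{k=1}^{\infty} \Bigl| \int_1^\infty \bigl( D_{u y_k} f(x) - D_{u y_{k+1}} f(x) \bigr) u \phi'(u)\, du \Bigr|^\varrho \Bigr)^{1/\varrho} \\
&\le 2 \int_1^\infty \Bigl( \sum_{k=1}^{\infty} \bigl| D_{y_k u} f(x) - D_{y_{k+1} u} f(x) \bigr|^\varrho \Bigr)^{1/\varrho} u |\phi'(u)|\, du \qquad \text{(by Minkowski's inequality)} \\
&\le 2 \int_1^\infty \|D_* f(x)\|_{v_\varrho}\, u |\phi'(u)|\, du \\
&\le c \|D_* f(x)\|_{v_\varrho}.
\end{aligned}$$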
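Since we already know that $\bigl\| \|D_* f\|_{v_\varrho} \bigr\|_p \le c(p, \varrho)\|f\|_p$ and that $m\{x : \|D_* f(x)\|_{v_\varrho} > \lambda\} \le (c(\varrho)/\lambda)\|f\|_1$, we are done with this piece. The argument for the second piece, $\|\phi_*^2 f\|_{v_\varrho}$, is similar. The oscillation operator and the short variation operator are handled in the same way. The details are left for the reader.

Lemma 2.11. Let $\psi(t)$ be supported and continuous on $[0, 1]$. Assume that $\psi(t)$ is differentiable on $(0,1)$ and that $\int_0^1 u |\psi'(u)|\, du = B < \infty$. Let $\psi_y(t) = (1/y)\psi(t/y)$. Define $\psi_y f(x) = \psi_y * f(x)$.
(1) For any fixed sequence $(t_i) \searrow 0$, the operator $\mathcal{O}(\psi_*)$ is weak type (1,1) and strong type $(p, p)$ for $1 < p < \infty$.
(2) For $\varrho > 2$, the operator $\mathcal{V}_\varrho(\psi_*)$ is weak type (1,1) and strong type $(p, p)$ for $1 < p < \infty$.
(3) The operator $S_V(\psi_*)$ is weak type (1,1) and strong type $(p, p)$ for $1 < p < \infty$.

Proof. Clearly $\psi(1)\chi_{[0,1]}(t) - \psi(t) = \int_t^1 \psi'(s)\, ds$. Hence,
$$\psi(t) = \psi(1)\chi_{[0,1]}(t) - \int_t^1 \psi'(s)\, ds = \psi(1)\chi_{[0,1]}(t) - \int_0^1 \chi_{[0,s]}(t) \psi'(s)\, ds.$$
For the dilation $\psi_y(t)$, we have
$$\begin{aligned}
\psi_y(t) &= \frac{1}{y} \psi(1) \chi_{[0,1]}\Bigl(\frac{t}{y}\Bigr) - \frac{1}{y} \int_0^1 \chi_{[0,s]}\Bigl(\frac{t}{y}\Bigr) \psi'(s)\, ds \\
&= \frac{\psi(1)}{y} \chi_{[0,y]}(t) - \int_0^1 \frac{1}{ys} \chi_{[0,ys]}(t)\, s \psi'(s)\, ds \\
&= \psi_y^1(t) + \psi_y^2(t).
\end{aligned}$$
We split the variation norm up into two pieces. The term $\|\psi_*^1 f\|_{v_\varrho}$ is just $\psi(1)$ times the variation norm of the differentiation operator and hence satisfies the required estimate. For the second term, we argue as before. We have
$$\begin{aligned}
\|\psi_*^2 f(x)\|_{v_\varrho}
&\le 2 \Bigl( \sum_{k=1}^{\infty} \bigl| \psi_{y_k}^2 * f(x) - \psi_{y_{k+1}}^2 * f(x) \bigr|^\varrho \Bigr)^{1/\varrho} \\
&= 2 \Bigl( \sum_{k=1}^{\infty} \Bigl| \int_{-\infty}^{\infty} f(x - t) \int_0^1 \Bigl( \frac{1}{y_k u} \chi_{[0, y_k u]}(t) - \frac{1}{y_{k+1} u} \chi_{[0, y_{k+1} u]}(t) \Bigr) u \psi'(u)\, du\, dt \Bigr|^\varrho \Bigr)^{1/\varrho} \\
&= 2 \Bigl( \sum_{k=1}^{\infty} \Bigl| \int_0^1 \bigl( D_{u y_k} f(x) - D_{u y_{k+1}} f(x) \bigr) u \psi'(u)\, du \Bigr|^\varrho \Bigr)^{1/\varrho} \\
&\le 2 \int_0^1 \Bigl( \sum_{k=1}^{\infty} \bigl| D_{y_k u} f(x) - D_{y_{k+1} u} f(x) \bigr|^\varrho \Bigr)^{1/\varrho} u |\psi'(u)|\, du \\
&\le 2 \int_0^1 \|D_* f(x)\|_{v_\varrho}\, u |\psi'(u)|\, du \\
&\le c \|D_* f(x)\|_{v_\varrho}.
\end{aligned}$$
As before, we already know the operator $\|D_* f(x)\|_{v_\varrho}$ satisfies the required estimates. The same argument works for both the oscillation operator and the short variation operators.

3. The $L^p$ result for the Hilbert transform. In this section, we use the results of the prior section to establish the $L^p$ inequalities, $1 < p < \infty$, for Theorems 1.1 and 1.2. In particular, we prove that for $1 < p < \infty$,
$$\|\mathcal{O}(H_* f)\|_p \le c_p \|f\|_p;$$
that for $\varrho > 2$ and $1 < p < \infty$,
$$\bigl\| \|H_* f\|_{v_\varrho} \bigr\|_p \le c \|f\|_p;$$
and that for $1 < p < \infty$,
$$\bigl\| \|H_* f\|_{sV} \bigr\|_p \le c \|f\|_p.$$
Proof. We actually prove only the result for the variation operator, the proofs for the other two operators being exactly the same. Let
$$Q_y f(x) = \frac{1}{\pi} \int_{-\infty}^{\infty} f(x - t) \frac{t}{t^2 + y^2}\, dt$$
denote the conjugate Poisson kernel.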
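By the triangle inequality, we have
$$\|H_* f(x)\|_{v_\varrho} \le \|H_* f(x) - Q_* f(x)\|_{v_\varrho} + \|Q_* f(x)\|_{v_\varrho}.$$
By Corollary 2.8, we see it is enough to control $\|H_* f(x) - Q_* f(x)\|_{v_\varrho}$. Fix $f \in L^p(\mathbb{R})$, $x \in \mathbb{R}$, and $y > 0$. We write
$$H_y f(x) - Q_y f(x) = \frac{1}{\pi} \int_{|t| > y} f(x - t) \Bigl( \frac{1}{t} - \frac{t}{t^2 + y^2} \Bigr) dt - \frac{1}{\pi} \int_{|t| < y} f(x - t) \frac{t}{t^2 + y^2}\, dt.$$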
4. Some necessary lemmas. In this section, we state some of the key lemmas that we need in the proof of the weak type (1,1) results. The first lemma is the Calderón-Zygmund decomposition. This decomposition allows us to break up a function into two pieces. One piece can be handled by $L^2$ techniques. The second piece, which is supported on a small set, has mean value zero on each of the intervals that make up the support.

Lemma 4.1 (Calderón-Zygmund decomposition). Any $f \in L^1(\mathbb{R})$ can be written in the form $f = g + b$, where
(1) $\|g\|_{L^2}^2 \le \|f\|_{L^1}$;
(2) $b = \sum_j b_j$, where each $b_j$ satisfies
(a) for some $n$, $b_j$ is supported in a dyadic interval $B_j \in L_n$;
(b) $\int_{\mathbb{R}} b_j(x)\, dx = 0$ for each $j$;
(c) $\|b_j\|_{L^1} \le 3|B_j|$;
(d) $\sum_j |B_j| \le \|f\|_{L^1}$,
where $|B_j|$ denotes the length of the interval $B_j$.

Since the Calderón-Zygmund decomposition is a standard tool in harmonic analysis, we do not include a proof. See Stein [8, pp. 17–19] or Torchinsky [9, pp. 84–85] for details.

We also need the following almost orthogonality lemma. The use of almost orthogonality is a standard tool to prove $L^2$ inequalities. Our use of the lemma to prove an $L^1$ inequality is somewhat unusual. We use Lemma 4.2 and the $L^2$ information it provides to handle the function $b$ that occurs in the Calderón-Zygmund decomposition. See [6] for a similar application of this technique.

Lemma 4.2. Let $(d_n)$ be a sequence, and let $(h_{k,n})$ be a double sequence of vectors in a normed space $(B, \|\cdot\|)$. Let $(\sigma(j))_{j \in \mathbb{Z}}$ be a sequence of positive numbers with $w = \sum_j \sigma(j) < \infty$. Suppose that
$$\|h_{k,n}\| \le \sigma(n - k)\, d_n \tag{4.1}$$
for every $n, k$. Then we have
$$\sum_k \Bigl\| \sum_n h_{k,n} \Bigr\|^2 \le w^2 \cdot \sum_n d_n^2.$$
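The inequality in Lemma 4.2 can be sanity-checked numerically in the simplest setting. The sketch below assumes a scalar example that is not taken from the paper: $\sigma(j) = 2^{-|j|}$ (so $w = 3$), nonnegative numbers $d_n$, and $h_{k,n} = \sigma(n-k)\,d_n$, which makes the hypothesis an equality. The function names and the truncation window `pad` are illustrative choices.

```python
def sigma(j):
    """Illustrative weight sequence sigma(j) = 2^{-|j|}, summable over all integers."""
    return 2.0 ** (-abs(j))


# w = sum over all integers j of 2^{-|j|} = 1 + 2 * (1/2 + 1/4 + ...) = 3
W = 3.0


def both_sides(d, pad=60):
    """Return (lhs, rhs) of the almost orthogonality inequality.

    Scalar case h_{k,n} = sigma(n - k) * d[n] with d[n] >= 0. The sum over k is
    truncated to a finite window, which can only make the left side smaller.
    """
    lhs = 0.0
    for k in range(-pad, len(d) + pad):
        inner = sum(sigma(n - k) * d[n] for n in range(len(d)))
        lhs += inner * inner
    rhs = W * W * sum(x * x for x in d)
    return lhs, rhs
```

In this scalar case the statement is an instance of Young's convolution inequality, $\|\sigma * d\|_{\ell^2} \le \|\sigma\|_{\ell^1} \|d\|_{\ell^2}$, which is why the constant is $w^2$.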
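Proof. Clearly,
$$\Bigl\| \sum_n h_{k,n} \Bigr\| \le \sum_n \|h_{k,n}\| = \sum_n \sigma^{1/2}(n - k) \cdot \sigma^{-1/2}(n - k) \cdot \|h_{k,n}\|.$$
Applying Cauchy's inequality and (4.1), we have
$$\Bigl\| \sum_n h_{k,n} \Bigr\| \le \Bigl( \sum_n \sigma(n - k) \Bigr)^{1/2} \Bigl( \sum_n \sigma(n - k) \cdot d_n^2 \Bigr)^{1/2} = \Bigl( w \cdot \sum_n \sigma(n - k) \cdot d_n^2 \Bigr)^{1/2}.$$
Hence,
$$\sum_k \Bigl\| \sum_n h_{k,n} \Bigr\|^2 \le w \cdot \sum_k \sum_n \sigma(n - k) \cdot d_n^2 = w \cdot \sum_n d_n^2 \sum_k \sigma(n - k) = w^2 \cdot \sum_n d_n^2,$$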
the desired result. 5. Weak type results—short variation. In this section, we prove the following proposition. Proposition 5.1. The short variation operator SV (H∗ ) is a weak type (1,1) operator. Recall that Vk f (x) = sup
(i ) 0
1/2k <i+1 <i ≤1/2k−1
H f (x) − H f (x)2 i i+1
1/2 ,
where the supremum is taken over all decreasing sequences (i ). Fix f ∈ L1 and x ∈ R. Select a decreasing sequence (i ) = (i (x)) such that for each k ≥ 1 and for this (now fixed) sequence we have Vk f (x) ≤ 2
1/2k <i+1 <i ≤1/2k−1
H f (x) − H f (x)2 i i+1
1/2 .
For this fixed sequence (i ), define vk f (x) =
1/2k <i+1 <i ≤1/2k−1
H f (x) − H f (x)2 i i+1
1/2 .
(5.1)
Using these (now fixed) choices, we dominate the operator SV f (x)2 by four times the sum over k of the squares of the variation operators vk f (x). That is, SV f (x) ≤ 2
∞
vk f (x)
2
1/2 .
(5.2)
k=1
Note that vk is sublinear; that is, vk (h+g)(x) ≤ vk (h)(x)+vk (g)(x) for functions h and g (and for countable sums as well). Using the Calderón-Zygmund decomposition, write f = g +b, where g and b have the properties in the lemma.
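As a rough illustration of the decomposition $f = g + b$ used here, the following Python sketch runs the usual dyadic stopping-time construction at height 1 on a function sampled on $[0,1)$. The name `cz_decompose`, the sampling grid, and the height parameter `lam` are assumptions made for illustration and are not taken from the paper. The constants this construction yields (for instance $|g| \le 2$ when the whole interval is not selected, and $\|b_j\|_{L^1} \le 4|B_j|$) are those of this particular sketch and are not tuned to match the constants stated in Lemma 4.1.

```python
from fractions import Fraction


def cz_decompose(f, lam=Fraction(1)):
    """Dyadic Calderon-Zygmund decomposition of samples f on [0, 1).

    f has length 2**m; sample i is the value of the function on [i/N, (i+1)/N).
    Returns (g, b, bad), where bad lists the selected dyadic intervals as
    half-open index ranges (lo, hi). This is an illustrative sketch only.
    """
    n = len(f)
    assert n > 0 and n & (n - 1) == 0, "length must be a power of two"
    f = [Fraction(v) for v in f]
    bad = []

    def visit(lo, hi):
        # Average of |f| over the dyadic interval [lo, hi).
        avg_abs = sum(abs(v) for v in f[lo:hi]) / (hi - lo)
        if avg_abs > lam:
            bad.append((lo, hi))      # maximal dyadic interval with large average
        elif hi - lo > 1:
            mid = (lo + hi) // 2
            visit(lo, mid)
            visit(mid, hi)

    visit(0, n)

    g = list(f)                       # g = f off the selected intervals
    b = [Fraction(0)] * n
    for lo, hi in bad:
        mean = sum(f[lo:hi]) / (hi - lo)
        for i in range(lo, hi):
            g[i] = mean               # g is the average of f on each B_j
            b[i] = f[i] - mean        # b_j has mean value zero on B_j
    return g, b, bad


if __name__ == "__main__":
    samples = [0, 0, 0, 8, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0]
    g, b, bad = cz_decompose(samples)
    print("selected intervals:", bad)
    print("g:", [str(v) for v in g])
    print("b:", [str(v) for v in b])
```

Exact rational arithmetic is used so that the mean-zero property of each $b_j$ can be checked without rounding error.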
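Remark 5.2. If the estimate
\[ m\{ x : S_V(H_* f)(x) > 1 \} \le c \|f\|_{L^1} \]
is established, then $S_V(H_*)$ satisfies the usual weak (1,1) inequality; that is, there exists a constant $c$, independent of $f$ and $\lambda > 0$, so that
\[ m\{ x : S_V(H_* f)(x) > \lambda \} \le \frac{c}{\lambda} \|f\|_{L^1} . \]
This follows from the fact that for any $\lambda > 0$ and $f \in L^1$, $S_V(H_*(\lambda f)) = \lambda S_V(H_* f)$. Now apply the result with $f$ replaced by $f/\lambda$.

We have
\[ m\Bigl\{ x : \Bigl( \sum_{k=1}^{\infty} v_k f(x)^2 \Bigr)^{1/2} > 1 \Bigr\} \le m\Bigl\{ x : \Bigl( \sum_{k=1}^{\infty} v_k g(x)^2 \Bigr)^{1/2} > \frac12 \Bigr\} + m\Bigl\{ x : \Bigl( \sum_{k=1}^{\infty} v_k b(x)^2 \Bigr)^{1/2} > \frac12 \Bigr\} . \tag{5.3} \]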
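For the first term we use the fact that we already know a strong (2,2) result for $S_V$ and the fact that $g \in L^\infty$ with $\|g\|_\infty \le 2$. We have
\[ m\Bigl\{ x : \Bigl( \sum_{k=1}^{\infty} v_k g(x)^2 \Bigr)^{1/2} > \frac12 \Bigr\} \le 4 \Bigl\| \Bigl( \sum_{k=1}^{\infty} v_k g(x)^2 \Bigr)^{1/2} \Bigr\|_2^2 \le c \|g\|_2^2 \le c \int_{-\infty}^{\infty} 2 |g(x)| \,dx \le c \|f\|_1 . \]
To handle the second term, we first need to introduce some new notation. For each $i$, let $\tilde B_i$ denote the interval with the same center as $B_i$, but with three times the length. Let $\tilde B = \bigcup_i \tilde B_i$. We have
\[ m\Bigl\{ x : \Bigl( \sum_{k=1}^{\infty} v_k b(x)^2 \Bigr)^{1/2} > \frac12 \Bigr\} \le m\Bigl\{ x \in \tilde B^c : \Bigl( \sum_{k=1}^{\infty} v_k b(x)^2 \Bigr)^{1/2} > \frac12 \Bigr\} + m(\tilde B) . \]
Since $m(\tilde B) \le 3 \sum_i m(B_i) \le 3 \|f\|_1$, it will be enough to study
\[ m\Bigl\{ x \in \tilde B^c : \Bigl( \sum_{k=1}^{\infty} v_k b(x)^2 \Bigr)^{1/2} > \frac12 \Bigr\} . \]
We have
\[ m\Bigl\{ x \in \tilde B^c : \Bigl( \sum_{k=1}^{\infty} v_k b(x)^2 \Bigr)^{1/2} > \frac12 \Bigr\} \le 4 \int_{\tilde B^c} \sum_{k=1}^{\infty} v_k b(x)^2 \,dx . \]
It will now be enough to prove
\[ \int_{\tilde B^c} \sum_k |v_k b|^2 \le C \cdot \sum_j |B_j| . \tag{5.4} \]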
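Define $L_n$ to be the collection of intervals of the form $[r/2^n, (r+1)/2^n)$ for some integer $r$. Let us write $b = \sum_n h_n$, where
\[ h_n = \sum_{B_j \in L_n} b_j \quad\text{and}\quad \chi_n = \sum_{B_j \in L_n} \mathbf{1}_{B_j} . \]
Note that
\[ \sum_n \|\chi_n\|_{L^2}^2 = \sum_n \|\chi_n\|_{L^1} = \sum_j |B_j| . \]
We will prove
\[ \int_{\tilde B^c} |v_k h_n|^2 \le C \cdot 2^{-|n-k|} \cdot \|\chi_n\|_{L^2}^2 . \tag{5.5} \]
Then, by applying the almost orthogonality lemma, Lemma 4.2 (with $h_{k,n} = v_k h_n$ regarded as an element of $L^2(\tilde B^c)$, $d_n = \|\chi_n\|_{L^2}$, and $\sigma(j) = C^{1/2} 2^{-|j|/2}$), we conclude that
\[ \int_{\tilde B^c} \sum_k \Bigl| v_k \Bigl( \sum_n h_n \Bigr) \Bigr|^2 \le \sum_k \int_{\tilde B^c} \Bigl( \sum_n v_k h_n \Bigr)^2 \le C \sum_n \|\chi_n\|_2^2 . \]
Since $\sum_n \|\chi_n\|_2^2 = \sum_j |B_j|$, this is (5.4). We consider two cases, $n \le k$ and $n > k$.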
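The almost orthogonality bound of Lemma 4.2 can be sanity-checked numerically. The sketch below uses a scalar family $h_{k,n} = \sigma(n-k)\, d_n$, which satisfies the hypothesis with equality, and the decay profile $\sigma(j) = 2^{-|j|/2}$ suggested by (5.5). The function names and the choice of $\sigma$ are assumptions made for this illustration; the check covers only finitely many indices and proves nothing.

```python
def sigma(j):
    """Hypothetical decay profile sigma(j) = 2^{-|j|/2} (an assumption for this demo)."""
    return 2.0 ** (-abs(j) / 2.0)


def lemma_4_2_sides(d):
    """Return (lhs, rhs) of Lemma 4.2 for the scalar family h[k][n] = sigma(n - k) * d[n].

    Indices outside range(len(d)) are treated as contributing zero, so the
    hypothesis |h[k][n]| <= sigma(n - k) * |d[n]| holds for all integer k, n.
    """
    size = len(d)
    ratio = 2.0 ** -0.5
    # w = sum of sigma(j) over all integers j = 1 + 2 * (ratio + ratio^2 + ...)
    w = 1.0 + 2.0 * ratio / (1.0 - ratio)
    lhs = 0.0
    for k in range(size):
        row_sum = sum(sigma(n - k) * d[n] for n in range(size))
        lhs += row_sum ** 2            # | sum_n h[k][n] |^2
    rhs = w ** 2 * sum(x * x for x in d)
    return lhs, rhs


if __name__ == "__main__":
    lhs, rhs = lemma_4_2_sides([1.0, 0.5, 2.0, 0.0, 3.0, 1.5])
    print("left side:", lhs, "right side:", rhs)
```

Summing only over $k$ in a finite range can only decrease the left side, so the inequality of the lemma still has to hold for the truncated sums computed here.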
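Case 1: $n \le k$. Note that we only need to consider $x \in \tilde B^c$. We have
\[ v_k h_n(x)^2 = \sum_i \bigl| H_{\varepsilon_i} h_n(x) - H_{\varepsilon_{i+1}} h_n(x) \bigr|^2 = \sum_i \Bigl| \int_{\varepsilon_{i+1} < |t| \le \varepsilon_i} \frac{h_n(x-t)}{t} \,dt \Bigr|^2 , \]
where for each $i$ we have $1/2^{k+1} \le \varepsilon_{i+1} < \varepsilon_i < 1/2^k$. If $x \in \tilde B^c$ and $n \le k$, then every $t$ in the above integral satisfies $|t| < 2^{-k} \le 2^{-n}$, while $x$ lies at distance at least $2^{-n}$ from each $B_j \in L_n$. Hence $h_n(x-t) = 0$, so that $v_k h_n(x) = 0$.

Case 2: $n > k$. Let $I_{\varepsilon_i}$ denote the interval centered at the origin of length $2\varepsilon_i$, and let $\mathcal{A}_i = I_{\varepsilon_i} \setminus I_{\varepsilon_{i+1}}$. We split the sum defining $v_k h_n(x)$ into two sums. The first consists of integration over those elements $B_j \in L_n$ which are entirely contained within the interior of some $x - \mathcal{A}_i$; the second consists of the $B_j \in L_n$ which intersect the boundary of some $x - \mathcal{A}_i$. It may help the reader to think of $n$ as being much larger than $k$, and to visualize the small intervals of $L_n$ as either being caught entirely inside a large annulus $x - \mathcal{A}_i$ or intersecting the boundary of a given $x - \mathcal{A}_i$. With these conventions, we may write $|v_k h_n(x)|^2$ as
\[ v_k h_n(x)^2 = \sum_i \Bigl| \int_{t \in \mathcal{A}_i} \frac{h_n(x-t)}{t} \,dt \Bigr|^2 = \sum_i \Bigl| \sum_{\substack{B_j \in L_n \\ B_j \subset \operatorname{int}(x - \mathcal{A}_i)}} \int_{x-t \in B_j} \frac{h_n(x-t)}{t} \,dt + \sum_{\substack{B_j \in L_n \\ B_j \cap (x - \partial\mathcal{A}_i) \ne \emptyset}} \int_{\substack{x-t \in B_j \\ t \in \mathcal{A}_i}} \frac{h_n(x-t)}{t} \,dt \Bigr|^2 . \tag{5.6} \]
Consequently, we have
\[ v_k h_n(x) \le \Bigl( \sum_i \Bigl| \sum_{\substack{B_j \in L_n \\ B_j \subset \operatorname{int}(x - \mathcal{A}_i)}} \int_{x-t \in B_j} \frac{h_n(x-t)}{t} \,dt \Bigr|^2 \Bigr)^{1/2} + \Bigl( \sum_i \Bigl| \sum_{\substack{B_j \in L_n \\ B_j \cap (x - \partial\mathcal{A}_i) \ne \emptyset}} \int_{\substack{x-t \in B_j \\ t \in \mathcal{A}_i}} \frac{h_n(x-t)}{t} \,dt \Bigr|^2 \Bigr)^{1/2} = \text{Interior Sum} + \text{Boundary Sum} . \]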
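We estimate the Interior Sum first. Note that $\int_{B_j} h_n(x) \,dx = 0$ for each $B_j \in L_n$. Hence, if $B_j \subset \operatorname{int}(x - \mathcal{A}_i)$ and $y_j$ is any point with $x - y_j \in B_j$, we have
\[ \int_{x-t \in B_j} \frac{h_n(x-t)}{t} \,dt = \int_{x-t \in B_j} \Bigl( \frac1t - \frac1{y_j} \Bigr) h_n(x-t) \,dt , \]
so that
\[ \text{Interior Sum} = \Bigl( \sum_i \Bigl| \sum_{\substack{B_j \in L_n \\ B_j \subset \operatorname{int}(x - \mathcal{A}_i)}} \int_{x-t \in B_j} \Bigl( \frac1t - \frac1{y_j} \Bigr) h_n(x-t) \,dt \Bigr|^2 \Bigr)^{1/2} . \]
If $x - t \in B_j$, then $|t| < 1/2^k$, since $B_j \subset x - \mathcal{A}_i \subset x - (I_{1/2^k} \setminus I_{1/2^{k+1}})$. But the length of $B_j$ is $1/2^n$, so we have that for any $t$ with $x - t \in B_j$, $|(1/t) - (1/y_j)| \le C \cdot 2^{-n}/2^{-2k} = C \cdot 2^{k-n} 2^k$. Therefore, we may estimate the Interior Sum as
\[ \text{Interior Sum} \le C \cdot 2^{k-n} 2^k \Bigl( \sum_i \Bigl( \sum_{\substack{B_j \in L_n \\ B_j \subset \operatorname{int}(x - \mathcal{A}_i)}} \int_{x-t \in B_j} |h_n(x-t)| \,dt \Bigr)^2 \Bigr)^{1/2} \le C \cdot 2^{k-n} 2^k \cdot \sum_i \sum_{\substack{B_j \in L_n \\ B_j \subset \operatorname{int}(x - \mathcal{A}_i)}} \int_{x-t \in B_j} |h_n(x-t)| \,dt . \tag{5.7} \]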
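The index $i$ in the last row above enumerates the sets $\mathcal{A}_i$, which are all disjoint subsets of $I_{1/2^k}$, and the sum will be larger only if we expand the index to include all of $I_{1/2^k}$. We have
\begin{align*}
\text{Interior Sum} &\le C \cdot 2^{k-n} 2^k \sum_i \sum_{\substack{B_j \in L_n \\ B_j \subset \operatorname{int}(x - \mathcal{A}_i)}} \int_{x-t \in B_j} |h_n(x-t)| \,dt \\
&\le C \cdot 2^{k-n} 2^k \sum_{\substack{B_j \in L_n \\ B_j \subset \operatorname{int}(x - I_{1/2^k})}} \int_{x-t \in B_j} |h_n(x-t)| \,dt \\
&\le C \cdot 2^{k-n} 2^k \sum_{\substack{B_j \in L_n \\ B_j \subset \operatorname{int}(x - I_{1/2^k})}} \int_{B_j} |h_n(t)| \,dt \\
&\le C \cdot 2^{k-n} 2^k \sum_{\substack{B_j \in L_n \\ B_j \subset \operatorname{int}(x - I_{1/2^k})}} |B_j| \\
&\le C \cdot 2^{k-n} D_{1/2^k} \chi_n(x) ,
\end{align*}
where $D_\varepsilon f(x)$ denotes the differentiation average $(1/(2\varepsilon)) \int_{-\varepsilon}^{\varepsilon} f(x-t) \,dt$. Hence, we have
\[ \int_{\tilde B^c} (\text{Interior Sum})^2 \le \int_{\tilde B^c} \bigl( C \cdot 2^{k-n} D_{1/2^k} \chi_n(x) \bigr)^2 \le C \cdot 2^{k-n} \int_{\tilde B^c} \bigl( D_{1/2^k} \chi_n(x) \bigr)^2 \le C \cdot 2^{k-n} \|\chi_n\|_{L^2}^2 , \tag{5.8} \]
giving a good estimate for the Interior Sum.

Now we consider the Boundary Sum. Recall that the Boundary Sum is given by
\[ \text{Boundary Sum} = \Bigl( \sum_i \Bigl| \sum_{\substack{B_j \in L_n \\ B_j \cap (x - \partial\mathcal{A}_i) \ne \emptyset}} \int_{\substack{x-t \in B_j \\ t \in \mathcal{A}_i}} \frac{h_n(x-t)}{t} \,dt \Bigr|^2 \Bigr)^{1/2} . \]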
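Define $P_i$ as
\[ P_i = \bigl\{ t \in \mathcal{A}_i : \text{there exists } B_j \in L_n \text{ with } x - t \in B_j \text{ and } B_j \cap (x - \partial\mathcal{A}_i) \ne \emptyset \bigr\} . \]
Write the square of the Boundary Sum as
\[ (\text{Boundary Sum})^2 = \sum_i \Bigl| \sum_{\substack{B_j \in L_n \\ B_j \cap (x - \partial\mathcal{A}_i) \ne \emptyset}} \int_{\substack{x-t \in B_j \\ t \in \mathcal{A}_i}} \frac{h_n(x-t)}{t} \,dt \Bigr|^2 = \sum_i \Bigl| \int_{t \in P_i} \frac{h_n(x-t)}{t} \,dt \Bigr| \cdot \Bigl| \int_{t \in P_i} \frac{h_n(x-t)}{t} \,dt \Bigr| . \tag{5.9} \]
We estimate the factor
\[ \Bigl| \int_{t \in P_i} \frac{h_n(x-t)}{t} \,dt \Bigr| \tag{5.10} \]
in two ways. First, we use the relative size of $P_i$. The length of $\mathcal{A}_i$ is at most $1/2^{k-1}$, while the length of $B_j \in L_n$ is $1/2^n$. Since this is the Boundary Sum, the $B_j$'s must contain the edges of $\mathcal{A}_i$. Hence, there are only four possible $B_j$ that we need to consider, and we have
\[ |P_i| \le \frac{4}{2^n} . \tag{5.11} \]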
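For $t \in P_i$, $|1/t| = O(|t|^{-1}) = O(2^k)$ so that
\[ \Bigl| \int_{t \in P_i} \frac{h_n(x-t)}{t} \,dt \Bigr| \le C \cdot 2^k \cdot \int_{t \in P_i} |h_n(x-t)| \,dt \le C \cdot 2^k \cdot \sum_{B_j \cap (x - \partial\mathcal{A}_i) \ne \emptyset} |B_j| \le C \cdot 2^k \cdot |P_i| \le C \cdot 2^k \cdot 2^{-n} = C \cdot 2^{k-n} , \tag{5.12} \]
where the inequality
\[ \int_{t \in P_i} |h_n(x-t)| \,dt \le C \cdot \sum_{B_j \cap (x - \partial\mathcal{A}_i) \ne \emptyset} |B_j| \]
follows from Condition 2(c) of Lemma 4.1. The second way we will estimate the factor (5.10) is to simply apply the estimate $|1/t| = O(|t|^{-1}) = O(2^k)$ for $t \in P_i$:
\[ \Bigl| \int_{t \in P_i} \frac{h_n(x-t)}{t} \,dt \Bigr| \le C \cdot 2^k \cdot \int_{t \in P_i} |h_n(x-t)| \,dt . \tag{5.13} \]
Consequently, we have
\begin{align*}
(\text{Boundary Sum})^2 &\le \sum_i \Bigl| \int_{t \in P_i} \frac{h_n(x-t)}{t} \,dt \Bigr| \cdot \Bigl| \int_{t \in P_i} \frac{h_n(x-t)}{t} \,dt \Bigr| \\
&\le C \cdot 2^{k-n} \cdot 2^k \cdot \sum_i \int_{t \in P_i} |h_n(x-t)| \,dt \\
&\le C \cdot 2^{k-n} \cdot 2^k \cdot \int_{t \in I_{1/2^k} \setminus I_{1/2^{k+1}}} |h_n(x-t)| \,dt \qquad \text{(sum over the entire annulus)} \\
&\le C \cdot 2^{k-n} \cdot D_{1/2^k} |h_n|(x) , \tag{5.14}
\end{align*}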
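where (as before) $D_\varepsilon f(x)$ denotes the differentiation operator $(1/(2\varepsilon)) \int_{-\varepsilon}^{\varepsilon} f(x-t) \,dt$. Hence,
\begin{align*}
\int_{\tilde B^c} (\text{Boundary Sum})^2 &\le C \cdot 2^{k-n} \int_{\tilde B^c} D_{1/2^k} |h_n|(x) \,dx \\
&\le C \cdot 2^{k-n} \int_{\mathbb{R}} D_{1/2^k} |h_n|(x) \,dx \\
&\le C \cdot 2^{k-n} \int_{\mathbb{R}} |h_n(x)| \,dx \\
&\le C \cdot 2^{k-n} \sum_{B_j \in L_n} |B_j| \\
&\le C \cdot 2^{k-n} \|\chi_n\|_{L^2}^2 , \tag{5.15}
\end{align*}
giving a good estimate for the Boundary Sum. We have
\[ \int_{\tilde B^c} |v_k h_n|^2 \le \int_{\tilde B^c} \bigl( \text{Interior Sum} + \text{Boundary Sum} \bigr)^2 \le C \cdot \int_{\tilde B^c} \bigl( (\text{Interior Sum})^2 + (\text{Boundary Sum})^2 \bigr) \le C \cdot 2^{k-n} \|\chi_n\|_{L^2}^2 , \tag{5.16} \]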
as required.

6. Weak type (1,1) for the long variation. Let $\tilde H_k f(x) = H_{2^{-k}} f(x) = \int_{|t| > 2^{-k}} (f(x-t)/t) \,dt$. Define
\[ \tilde V_\varrho f(x) = \sup_{(n_k)} \Bigl( \sum_{k=1}^{\infty} \bigl| \tilde H_{n_k} f(x) - \tilde H_{n_{k+1}} f(x) \bigr|^\varrho \Bigr)^{1/\varrho} , \]
where the sup is over all increasing sequences of integers, $(n_k)$. Note that we can start with $n_1$ a negative integer. We refer to $\tilde V_\varrho$ as the long variation operator and to $V_\varrho$ as the full variation operator. Our goal is to prove the following lemma.

Lemma 6.1. The operator $\tilde V_\varrho$ is weak type (1,1) for $\varrho > 2$.

Remark 6.2. In the argument that follows, we will use the fact that we are only looking at the truncated Hilbert transform where we truncate at powers of 2. The argument does not work with more general truncation. That is why we also needed to consider the "short variation operator" in the earlier section.

Remark 6.3. To obtain control of the full variation operator, we need to assume that $\varrho > 2$. However, the only place this is used in the proof is in proving the $L^2$ result for the long variation. For the remainder of the argument, $\varrho = 2$ is enough.
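For a finite list of values (for example, the numbers $\tilde H_n f(x)$ for finitely many $n$ at a fixed $x$), the supremum defining the $\varrho$-variation can be computed by dynamic programming over the last index of the chain. The following Python sketch does this; the function name `rho_variation` and the restriction to finite sequences are assumptions made for illustration and are not part of the paper.

```python
def rho_variation(values, rho):
    """rho-variation of a finite sequence.

    Computes the supremum over increasing index chains n_1 < n_2 < ... of
    (sum_k |values[n_k] - values[n_{k+1}]|**rho) ** (1/rho)
    with an O(len(values)**2) dynamic program.
    """
    if rho < 1:
        raise ValueError("rho must be at least 1")
    # best[j] is the largest chain sum among chains whose last index is j;
    # a chain consisting of the single index j contributes 0.
    best = [0.0] * len(values)
    for j in range(len(values)):
        for i in range(j):
            candidate = best[i] + abs(values[j] - values[i]) ** rho
            if candidate > best[j]:
                best[j] = candidate
    return max(best, default=0.0) ** (1.0 / rho)


if __name__ == "__main__":
    data = [0, 1, 0, 1]
    for rho in (1, 2, 3):
        print(rho, rho_variation(data, rho))
```

For $\varrho > 1$ a chain may profit from skipping indices: for the values 0, 1, 2 and $\varrho = 2$, the single jump from 0 to 2 contributes 4, while the two unit jumps contribute only 2.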
Proof. We will apply the Calderón-Zygmund decomposition, Lemma 4.1. Using the notation in Lemma 4.1, we need to study the functions $g$ and $b$. The "good function" $g$ is handled as in the case of the short variation operator in Proposition 5.1 since we already have an $L^2$ result for the full variation and hence, in particular, for the long variation. Following the argument in the proof of Proposition 5.1, and using the notation there, we see that it is enough to prove that
\[ \int_{\tilde B^c} V_\varrho(\tilde H_* b)(x) \,dx \le c \sum_i |B_i| . \]
Since the $\ell^1$ norm dominates the $\ell^\varrho$ norm for $\varrho \ge 1$, it is in fact enough to prove that
\[ \int_{\tilde B^c} \sup_{(n_k)} \sum_{k=1}^{\infty} \bigl| \tilde H_{n_k} b(x) - \tilde H_{n_{k+1}} b(x) \bigr| \,dx \le c \sum_j |B_j| . \]
This certainly follows if we prove for each $j$ that
\[ \int_{\tilde B_j^c} \sup_{(n_k)} \sum_{k=1}^{\infty} \bigl| \tilde H_{n_k} b_j(x) - \tilde H_{n_{k+1}} b_j(x) \bigr| \,dx \le c |B_j| . \]
Hence, proving this is our goal. For the remainder of the argument, fix $j$. Assume that $x \in \tilde B_j^c$. There are two cases.

Case 1. There may be some exponent $k_0$ so that $x - 1/2^{k_0} \in B_j$ or $x + 1/2^{k_0} \in B_j$, depending on which side of $B_j$ the point $x$ happens to be. Note that because $x \in \tilde B_j^c$, and because we are considering only dyadic truncation, there is at most one such $k_0$, and there may be none. Hence, for $x \in \tilde B_j^c$, if $k_0$ is defined, we have
\[ \sup_{(n_k)} \sum_{k=1}^{\infty} \bigl| \tilde H_{n_k} b_j(x) - \tilde H_{n_{k+1}} b_j(x) \bigr| \le \bigl| \tilde H_{k_0-1} b_j(x) - \tilde H_{k_0} b_j(x) \bigr| + \bigl| \tilde H_{k_0} b_j(x) - \tilde H_{k_0+1} b_j(x) \bigr| . \]
We will consider only the first term. The other term is handled in the same way. We have
\[ \bigl| \tilde H_{k_0-1} b_j(x) - \tilde H_{k_0} b_j(x) \bigr| \le \int_{|t| > 2^{-k_0+1}} \Bigl| \frac{b_j(x-t)}{t} \Bigr| \,dt \le \frac{c}{d(x, B_j)} \|b_j\|_1 \le \frac{c |B_j|}{d(x, B_j)} , \]
where $d(x, B_j)$ denotes the distance from $x$ to the set $B_j$.
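Let $m_j$ denote the midpoint of $B_j$. For each $n$, let $x_n^+$ and $x_n^-$ be defined by $x_n^+ - m_j = 2^n$ and $x_n^- - m_j = -2^n$. Let
\[ A_n = \Bigl[ x_n^+ - \frac{|B_j|}{2} ,\; x_n^+ + \frac{|B_j|}{2} \Bigr] \cup \Bigl[ x_n^- - \frac{|B_j|}{2} ,\; x_n^- + \frac{|B_j|}{2} \Bigr] . \]
Note that $k_0$ is defined only for $x \in A_n$ for some $n$. Further, $x \in \tilde B_j^c$ implies $2^n \ge |B_j|$. Let $A = \bigcup_{2^n \ge |B_j|} A_n$. Using the above estimates, we have
\[ \int_A \bigl| \tilde H_{k_0-1} b_j(x) - \tilde H_{k_0} b_j(x) \bigr| \,dx \le c |B_j| \sum_{2^n \ge |B_j|} \frac{|B_j|}{2^n} \le c |B_j| , \]
as required.

Case 2. We now deal with the points $x \in \tilde B_j^c \cap A^c$. However, for such $x$, there is only one term involved. For that term, one of the integrals does not contribute at all, since it starts beyond the support of $b_j$, and the other involves integration over all of $B_j$. For such $x$, using that $b_j$ has mean value zero, we have
\begin{align*}
\mathcal{V}_\varrho(H_* b_j)(x) = |H b_j(x)| &= \Bigl| \int_{B_j} \frac{b_j(t)}{x-t} \,dt \Bigr| \\
&\le \int_{B_j} \Bigl| \frac{b_j(t)}{x-t} - \frac{b_j(t)}{x-m_j} \Bigr| \,dt \\
&\le \frac{c |B_j|}{d(x, B_j)^2} \int_{B_j} |b_j(t)| \,dt \\
&\le \frac{c |B_j|}{d(x, B_j)^2} \|b_j\|_1 \\
&\le \frac{c |B_j|^2}{d(x, B_j)^2} .
\end{align*}
We can estimate the integral of this expression over the relevant set of $x$'s by
\[ \int_{\tilde B_j^c \cap A^c} \sup_{(n_k)} \sum_{k=1}^{\infty} \bigl| \tilde H_{n_k} b_j(x) - \tilde H_{n_{k+1}} b_j(x) \bigr| \,dx \le \int_{\tilde B_j^c} \frac{c |B_j|^2}{d(x, B_j)^2} \,dx \le c |B_j| , \]
as required.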
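7. Completion of the proof. We are now in a position to complete the proof of Theorems 1.1, 1.2, and 1.3.

Proof of Theorem 1.2. Using Lemma 6.1, combined with the results about the "short variation operator," Proposition 5.1, we show that $\mathcal{V}_\varrho(H_*)$ is a weak type (1,1) operator. The idea is to introduce new operators, $R_t f(x)$, defined by $R_t f(x) = \tilde H_k f(x)$, where $2^{-(k+1)} < t \le 2^{-k}$. Then we have
\begin{align*}
V_\varrho f(x) &= \sup_{(y_k)} \Bigl( \sum_{k=1}^{\infty} \bigl| H_{y_k} f(x) - H_{y_{k+1}} f(x) \bigr|^\varrho \Bigr)^{1/\varrho} \\
&\le \sup_{(y_k)} \Bigl( \sum_{k=1}^{\infty} \bigl| H_{y_k} f(x) - R_{y_k} f(x) + R_{y_k} f(x) - R_{y_{k+1}} f(x) + R_{y_{k+1}} f(x) - H_{y_{k+1}} f(x) \bigr|^\varrho \Bigr)^{1/\varrho} \\
&\le \sup_{(y_k)} \Bigl( \sum_{k=1}^{\infty} \bigl| H_{y_k} f(x) - R_{y_k} f(x) + R_{y_{k+1}} f(x) - H_{y_{k+1}} f(x) \bigr|^\varrho \Bigr)^{1/\varrho} + \sup_{(y_k)} \Bigl( \sum_{k=1}^{\infty} \bigl| R_{y_k} f(x) - R_{y_{k+1}} f(x) \bigr|^\varrho \Bigr)^{1/\varrho} .
\end{align*}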
The first term is controlled by the weak (1,1) result for the short variation. The second term is controlled by $\tilde V_\varrho f$, which, by Lemma 6.1, is a weak (1,1) operator. We now have that $\mathcal{V}_\varrho(H_*)$ is weak (1,1) and strong $(p, p)$ for $1 < p < \infty$, completing the proof of Theorem 1.2.

Proof of Theorem 1.1. The idea is the same as in the proof above. We already have a strong $(p, p)$ inequality. Thus we only need to prove a weak (1,1) result. Again, the Calderón-Zygmund decomposition is used. We already have control of the "good function," $g$. For the "bad function," $b$, we note that the only place we used the fact that $\varrho > 2$ in the study of $\mathcal{V}_\varrho$ was in the proof of the $L^2$ result. The rest of the argument for the oscillation operator is exactly the same as the corresponding argument for the variation operator. The details are left to the reader.

Proof of Theorem 1.3. It is clear that $\lambda \, \Lambda(H_\varepsilon, f, \lambda, x)^{1/\varrho} \le V_\varrho(H_* f)(x)$. Consequently, we have
\[ \bigl\| \lambda \, \Lambda(H_\varepsilon, f, \lambda, \cdot)^{1/\varrho} \bigr\|_p \le \bigl\| V_\varrho(H_* f) \bigr\|_p \le c(p, \varrho) \|f\|_p . \]
Thus, $\| \Lambda(H_\varepsilon, f, \lambda, \cdot)^{1/\varrho} \|_p \le (c(p, \varrho)/\lambda) \|f\|_p$. For the weak (1,1) result, we note that
\[ m\{ x : \Lambda(H_\varepsilon, f, \lambda, x) > n \} \le m\{ x : \lambda \, \Lambda(H_\varepsilon, f, \lambda, x)^{1/\varrho} > \lambda n^{1/\varrho} \} \le m\{ x : V_\varrho(H_* f)(x) > \lambda n^{1/\varrho} \} \le \frac{c(\varrho)}{\lambda n^{1/\varrho}} \|f\|_1 . \]

Fix $\alpha < \beta$. Define the number of upcrossings $N(H_\varepsilon, f, \alpha, \beta, x)$ by
\[ N(H_\varepsilon, f, \alpha, \beta, x) = \max\bigl\{ n : \text{there exist } s_1 < t_1 < s_2 < t_2 < \cdots < s_n < t_n \text{ such that } H_{s_i} f(x) < \alpha ,\ H_{t_i} f(x) > \beta ,\ i = 1, 2, \ldots, n \bigr\} . \]
Theorem 1.3 has the following immediate corollary regarding upcrossings.

Corollary 7.1. Let $\alpha < \beta$. If $\varrho > 2$, we have
\[ \bigl\| N(H_\varepsilon, f, \alpha, \beta, \cdot)^{1/\varrho} \bigr\|_p \le \frac{c(\varrho)}{\beta - \alpha} \|f\|_p \]
for $1 < p < \infty$, and
\[ m\{ x : N(H_\varepsilon, f, \alpha, \beta, x) > n \} \le \frac{c(\varrho)}{(\beta - \alpha) n^{1/\varrho}} \|f\|_1 . \]
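For a finite list of values taken along an increasing parameter, the upcrossing count defined above can be computed greedily: wait for a value below $\alpha$, then for a later value above $\beta$, and repeat. The Python sketch below implements this for finite sequences; the function name `upcrossings` and the finite setting are assumptions made for illustration, not the paper's construction.

```python
def upcrossings(values, alpha, beta):
    """Count upcrossings of the band (alpha, beta) by a finite sequence.

    Returns the largest n for which there are indices
    s_1 < t_1 < s_2 < t_2 < ... < s_n < t_n with
    values[s_i] < alpha and values[t_i] > beta for every i.
    Taking the earliest admissible index at each step is optimal.
    """
    if not alpha < beta:
        raise ValueError("need alpha < beta")
    count = 0
    below_seen = False        # True once a value below alpha has been found
    for v in values:
        if not below_seen:
            if v < alpha:
                below_seen = True
        elif v > beta:
            count += 1        # one upcrossing completed
            below_seen = False
    return count


if __name__ == "__main__":
    print(upcrossings([0, 2, 0, 2, 1, 3], alpha=0.5, beta=1.5))
```

Each upcrossing forces a jump larger than $\beta - \alpha$ between two of the values, which is the mechanism by which a variation bound controls the number of upcrossings.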
Theorem 1.3 and Corollary 7.1 suggest the following conjectures.

Conjecture 7.2. The estimates in Theorem 1.3 can be improved to $\| \Lambda(H_\varepsilon, f, \lambda, \cdot)^{1/2} \|_p \le (c(p)/\lambda) \|f\|_p$, and $m\{ x : \Lambda(H_\varepsilon, f, \lambda, x) > n \} \le (c/(\lambda n^{1/2})) \|f\|_1$.

Conjecture 7.3. The estimates in Corollary 7.1 can be improved to $\| N(H_\varepsilon, f, \alpha, \beta, \cdot) \|_p \le (c(p)/((\beta - \alpha) n)) \|f\|_p$, and $m\{ x : N(H_\varepsilon, f, \alpha, \beta, x) > n \} \le (c/((\beta - \alpha) n)) \|f\|_1$.

In the case of analogous upcrossing and $\lambda$-jump operators for martingales, differentiation averages, and ergodic averages, we know the improvements conjectured above are possible. However, our current techniques do not allow us to prove these conjectures for the Hilbert transform.

References
[1] M. Akcoglu, R. L. Jones, and P. Schwartz, Variation in probability, ergodic theory and analysis, Illinois J. Math. 42 (1998), 154–177.
[2] J. Bourgain, Pointwise ergodic theorems for arithmetic sets, Inst. Hautes Études Sci. Publ. Math. 69 (1989), 5–45.
[3] R. Jones, Ergodic theory and connections with analysis and probability, New York J. Math. 3A (1997), 31–67, available from http://nyjm.albany.edu:8000/nyjm.html.
[4] R. Jones, R. Kaufmann, J. Rosenblatt, and M. Wierdl, Oscillation in ergodic theory, Ergodic Theory Dynam. Systems 18 (1998), 889–935.
[5] R. Jones, I. Ostrovskii, and J. Rosenblatt, Square functions in ergodic theory, Ergodic Theory Dynam. Systems 16 (1996), 267–305.
[6] R. Jones, J. Rosenblatt, and M. Wierdl, Oscillation in ergodic theory: Higher dimensional results, preprint, http://moni.msci.memphis.edu/~mw/high-final.pdf.
[7] J. Qian, The p-variation of partial sum processes and the empirical process, Ann. Probab. 26 (1998), 1370–1383.
[8] E. M. Stein, Singular Integrals and Differentiability Properties of Functions, Princeton Math. Ser. 30, Princeton Univ. Press, Princeton, 1970.
[9] A. Torchinsky, Real-Variable Methods in Harmonic Analysis, Pure Appl. Math. 123, Academic Press, Orlando, 1986.
Campbell: Department of Mathematical Sciences, University of Memphis, Dunn Hall Room 373, Memphis, Tennessee 38152, USA
Jones: Department of Mathematics, DePaul University, 2219 North Kenmore, Chicago, Illinois 60614, USA
Reinhold: Department of Mathematics, University at Albany, State University of New York, 1400 Washington Avenue, Albany, New York 12222, USA
Wierdl: Department of Mathematical Sciences, University of Memphis, Dunn Hall Room 373, Memphis, Tennessee 38152, USA
Vol. 105, No. 1
DUKE MATHEMATICAL JOURNAL
© 2000
THE RAMANUJAN PROPERTY FOR REGULAR CUBICAL COMPLEXES BRUCE W. JORDAN and RON LIVNÉ
Received 11 January 2000. 2000 Mathematics Subject Classification. Primary 68R05, 11R80, 11R52. Jordan supported in part by grants from National Science Foundation and Professional Staff Congress–City University of New York. Both authors supported by a joint Binational Israel-USA Foundation grant.

0. Introduction. Ramanujan graphs were defined by Lubotzky, Phillips, and Sarnak in [15] as regular graphs whose adjacency matrices, or their Laplacians, have eigenvalues satisfying some "best possible" bounds. Such graphs possess many interesting properties. In this paper, we give a higher-dimensional generalization of this theory to regular cubical complexes. By definition, $(\mathbf{r} = (r_1, \ldots, r_g))$-regular complexes are cell complexes locally isomorphic to the (ordered) product of $g$ regular trees, with the $j$th tree of regularity $r_j \ge 3$. Each cell is an $i$-cube (i.e., an $i$-dimensional cube) with $0 \le i \le g$. Throughout each $(g-1)$-cube, exactly one of the tree factors, say the $j$th one, is constant, and there are $r_j$ $g$-cubes passing through it. When $g = 1$, we simply have an $r$-regular graph. The spaces of $i$-cochains $C^i(X)$ (with real or complex coefficients) of a finite cubical complex $X$ are inner product vector spaces with an orthonormal basis corresponding to the characteristic functions of the $i$-cells. There are partial boundary operators $\partial_j = \partial_{j,i} \colon C^i(X) \to C^{i+1}(X)$ for $1 \le j \le g$. With these we get the adjoint operators $\partial_j^* = \partial_{j,i}^* \colon C^{i+1}(X) \to C^i(X)$ and, hence, the partial Laplacians
\[ \square_j = \square_{j,i} = \partial_{j,i}^* \partial_{j,i} + \partial_{j,i-1} \partial_{j,i-1}^* \colon C^i(X) \longrightarrow C^i(X) . \]
Each $\square_{j,i}$ is a selfadjoint nonnegative operator. For $i$ fixed they all commute and one gets a combinatorial harmonic theory (cf. [21]). When $X$ is infinite, these notions extend to $L^2$-cochains. When $X = \Delta$ is an $\mathbf{r} = (r_1, \ldots, r_g)$-regular product of trees, Kesten's 1-dimensional results (see [13]) extend, and we get that each $\lambda$ in the spectrum of $r_j \operatorname{Id} - \square_j$ acting on $L^2$-cochains of $\Delta$ satisfies $|\lambda| \le 2\sqrt{r_j - 1}$. As in the 1-dimensional case, we say that an $(r_1, \ldots, r_g)$-regular cubical complex $X$ is Ramanujan if the eigenvalues of $r_j \operatorname{Id} - \square_j$ on $X$ are $\pm r_j$ or if they satisfy the same properties for each $j$. One justification for this definition in the 1-dimensional case is the Alon-Boppana result, which shows that these bounds are essentially the best possible ones for the trivial local system. We generalize this result under a natural hypothesis. Another parallel with the 1-dimensional case is that when $X$ is finite, connected, and uniformized
JORDAN AND LIVNÉ
by a lattice $\Gamma$ in a product $\prod_{1 \le j \le g} G_j$ of $p$-adic $SL_2$'s, the Ramanujan property is equivalent to the following condition. For a $(g-1)$-cube $\sigma$ in the universal covering complex, which is constant in the $j$th direction, let $\Gamma_\sigma$ be the stabilizer of $\sigma$ in $\Gamma$. Then, no nontrivial, unramified complementary series representations should appear in $L^2(\Gamma_\sigma \backslash G_j)$ for any $j$ and $\sigma$. A fundamental problem in the 1-dimensional case is to explicitly construct an infinite family of Ramanujan graphs of a fixed regularity. All such known families have regularities of the form $k = q + 1$ with $q$ a power of a prime. The standard examples in [15] depend, in fact, on the Ramanujan-Petersson conjectures for weight-2 holomorphic cusp forms on $GL_2(\mathbb{Q})$. These were reduced to the Weil bounds for curves by Eichler, Shimura, and Igusa. They only give examples with $q$ a prime. The case of any prime power was subsequently handled in [16] via function fields. Ramanujan local systems were defined and constructed in [12] with $q$ a prime. This construction requires Deligne's results on the Ramanujan-Petersson conjecture for cusp forms of higher weight. They depend on the Weil bounds for higher-dimensional varieties. In the higher-dimensional case, a product of Ramanujan graphs always gives a Ramanujan cubical complex, and the problem is to construct irreducible examples. In this work, we construct infinite families of Ramanujan regular cubical complexes, with fixed regularities $(r_1, \ldots, r_g)$, where $r_j = q_j + 1$ and the $q_j$'s are any prime powers. As in the examples of [12] and [17], our construction uses quaternion algebras, this time over totally real number fields. It reduces the Ramanujan property to the Ramanujan-Petersson conjecture for certain holomorphic Hilbert modular forms. The Ramanujan-Petersson conjecture seems not to be known for all holomorphic Hilbert modular forms.
In particular, the results of [3] do not help us because they establish the bounds outside an unspecified finite set of primes, which a priori may depend on the form. However, in the literature, this conjecture is proved under additional hypotheses that still permit us to construct infinite families of examples with arbitrary prime powers for regularities. If all the qj ’s are powers of the same prime, one could give, as before, an alternate construction using function fields. As in [12], we also construct Ramanujan local systems on our examples. Among the interesting features of our examples is the fact that their cohomology vanishes except in the top dimension g (and zero). In addition, one can bound the girth from below as in the 1-dimensional case (see [14, Theorem 7.3.7]). As in [12], this theory is valid for Hermitian local systems over cubical complexes. We similarly give examples of Ramanujan local systems on cubical complexes. Acknowledgments. We would like to thank Y. Glasner and P. Sarnak for their generous help and interest. 1. The harmonic theory of regular cubical complexes 1.0. Cubical complexes. In this article, we study cell complexes locally isomorphic to a finite product of regular trees, in which the order and the regularities of the factors are globally constant.
RAMANUJAN CUBICAL COMPLEXES
Let $g$ be the dimension of such a complex $X$, and let $r_j$ be the regularity of the $j$th factor tree ($1 \le j \le g$). We call such a complex $X$ an $(r_1, \ldots, r_g)$-regular cubical complex. Each cell in $X$ is a cube of dimension less than or equal to $g$. The zero-dimensional cells are called vertices and the 1-dimensional cells are called edges. For each subset $I \subset \{1, \ldots, g\}$, the $I$-cubes, or cubes of direction $I$ of $X$, are the products of edges from the factors $i \in I$ with vertices from the factors not in $I$. Each such cube has $2^{|I|}$ orientations, and we denote the set of oriented $I$-cubes by $\Sigma_I$. There are bottom and top maps $\mathrm{bot}_j, \mathrm{top}_j \colon \Sigma_I \to \Sigma_{I - \{j\}}$ and inversion maps $\mathrm{inv}_j \colon \Sigma_I \to \Sigma_I$ for any $j \in I$. These are subject to the following axioms:
(1) $\{\mathrm{inv}_j\}_{j \in I}$ generate a group, isomorphic to the group $(\mathbb{Z}/2\mathbb{Z})^I$ of maps from $I$ to $\mathbb{Z}/2\mathbb{Z}$, which acts simply transitively on the orientations of each $I$-cube;
(2) $\mathrm{top}_j \mathrm{inv}_{j'} = \mathrm{inv}_{j'} \mathrm{top}_j$ and $\mathrm{bot}_j \mathrm{inv}_{j'} = \mathrm{inv}_{j'} \mathrm{bot}_j$ for $j \ne j'$;
(3) $\mathrm{top}_j \mathrm{inv}_j = \mathrm{bot}_j$ (and, hence, also $\mathrm{bot}_j \mathrm{inv}_j = \mathrm{top}_j$) for all $j$;
(4) any oriented $I$-cube is the $j$th top of precisely $r_j$ oriented $I \cup \{j\}$-cubes for $j \notin I$.
Geometrically, these combinatorial conditions mean that $X$ is locally isomorphic to the ordered product $\Delta = \prod_j \Delta_j$ of $g$ regular trees of respective regularities $r_j$. In particular, when $X$ is connected, these conditions hold if and only if the universal cover $\tilde X$ is isomorphic to $\Delta$, with the covering transformations preserving the order of the tree factors. The "if" part is clear, and the "only if" part holds because $\Delta$ is simply connected (in fact, explicitly contractible) and locally isomorphic to $X$. Hence, $\Delta$ is isomorphic to $\tilde X$ by the uniqueness of the universal cover. The directions are preserved under the covering map, and, hence, the deck transformations form, indeed, a subgroup of $\prod_j \operatorname{Aut}(\Delta_j)$. We set $\mathrm{Ver} = \Sigma_\emptyset$, $\mathrm{Edo}_j = \Sigma_{\{j\}}$, and $\mathrm{Edo} = \bigcup_j \mathrm{Edo}_j$. For an oriented edge $e$ of direction $\{j\}$, we write $\mathrm{bot}_j(e) = o(e)$, $\mathrm{top}_j(e) = t(e)$, and $\mathrm{inv}_j(e) = \bar e$. We refer to these as the origin, the terminal vertex, and the opposite edge, respectively. If necessary, we indicate the dependence of all these objects on $X$.

Example 1.1.
(1) The unit cube in $\mathbb{R}^g$ is the product of $g$ intervals, which are 1-regular trees.
(2) The tiling of $\mathbb{R}^g$ by unit cubes with integer vertices is the product of $g$ lines, viewed as 2-regular trees.
(3) The graphs in the sense of [20] which are $k$-regular are precisely the $k$-regular cubical complexes (of dimension 1).
(4) A finite product of regular cubical complexes is a regular cubical complex.

A connected regular cubical complex is called irreducible if it has no finite unramified cover by a product of regular cubical complexes of positive dimension. In many important cases, the vertices of an $(r_1, \ldots, r_g)$-regular cubical complex $X$ come with parities: these are maps $p_j$ from the vertices to $\{0, 1\}$ which satisfy $p_j(\mathrm{top}_i e) = p_j(\mathrm{bot}_i e)$ if and only if $i \ne j$ for any edge $e$ of direction $\{i\}$ of $X$, $1 \le i, j \le g$. On a complex with parities, we give each cube a canonical orientation
by agreeing that its bottommost vertex has all its parities equal to zero. In the 1-dimensional case, we recover the notion of a bipartite graph.

1.1. Local systems. A real/complex local system, or a flat vector bundle, on a regular cubical complex $X$ depends only on the cells of dimension less than or equal to 2. (For the 1-dimensional case, see, e.g., [12].) In the general case, a local system $\mathcal{L}$ on $X$ consists of a real/complex vector space $\mathcal{L}(v)$ for any vertex $v$ of $X$. In addition, for any oriented edge $e$, we get a linear isomorphism $\mathcal{L}_e \colon \mathcal{L}(o(e)) \to \mathcal{L}(t(e))$ so that $\mathcal{L}_{\bar e} = \mathcal{L}_e^{-1}$. The $\mathcal{L}_e$'s must also satisfy the flatness condition: for any 2-cell of direction $\{j, j'\}$ in $X$, we require
\[ \mathcal{L}_{\mathrm{top}_j} \mathcal{L}_{\mathrm{bot}_{j'}} = \mathcal{L}_{\mathrm{top}_{j'}} \mathcal{L}_{\mathrm{bot}_j} . \tag{1} \]
The local system is metrized if each fiber $\mathcal{L}(v)$ is a finite-dimensional (positive-definite) inner product space and all transition maps $\mathcal{L}_e$ are isometries. The (metrized) trivial local system $\mathcal{T}_V$, for a finite-dimensional (inner product) vector space $V$, has all fibers $V$ and all transition maps the identity. Over a contractible space, all local systems admit a trivialization, that is, an isomorphism to some $\mathcal{T}_V$. The notions of maps and direct sums make sense for (metrized) local systems over a fixed space (see, e.g., [12]). A local system is irreducible if it is not the direct sum of nonzero local systems. Every metrized local system is the direct sum of irreducible ones, and a sublocal system has an orthogonal complement. Let $X_1$, $X_2$ be regular cubical complexes with (metrized) local systems $\mathcal{L}_i$ on $X_i$. The product $X = X_1 \times X_2$ is naturally a regular cubical complex on which we have a (metrized) local system $\mathcal{L} = \mathcal{L}_1 \boxtimes \mathcal{L}_2$, called the external product of the $\mathcal{L}_i$'s. It is irreducible if and only if both $\mathcal{L}_i$'s are irreducible. If $X_1$ and $X_2$ are connected, then any irreducible local system on $X$ is an external product. Let $X$ be a connected regular cubical complex, and let $v_0$ be a base vertex. The universal cover $\tilde X$ is a product of regular trees $\{\Delta_j\}_{1 \le j \le g}$, and the fundamental group $\Gamma = \pi_1(X, v_0)$ is discrete in $\prod_j \operatorname{Aut}(\Delta_j)$, which acts properly on $\tilde X$ with quotient $X$. (Metrized) local systems $\mathcal{L}$ on $X$ are equivalent to (orthogonal or unitary) representations $\rho_{\mathcal{L}}$ of $\Gamma$ on $\mathcal{L}(v_0)$. We reconstruct $\mathcal{L}$ as $\Gamma \backslash (\tilde X \times V)$. A similar construction is possible in the disconnected case. If $\Gamma$ acts freely on a regular cubical complex $\tilde X$ preserving directions, and $\rho$ is an (orthogonal or unitary) representation of $\Gamma$ on $V$, then the quotient $X = \Gamma \backslash \tilde X$ is again a regular cubical complex, and we get a (metrized) local system $\mathcal{L} = \Gamma \backslash (\tilde X \times V)$. This way, we get all local systems on $X$ whose pullback to $\tilde X$ is trivial.

1.2. Cochains. Let $X$ be an $(r_1, \ldots, r_g)$-regular cubical complex, and let $\mathcal{L}$ be a local system on $X$. For a subset $I \subset \{1, \ldots, g\}$ and $\sigma \in \Sigma_I$, we denote by $o(\sigma)$ the iterated bottom vertex $(\prod_{j \in I} \mathrm{bot}_j)(\sigma)$, and for $j \in I$ let $e_j(\sigma)$ be the oriented edge of direction $\{j\}$ defined by $e_j(\sigma) = (\prod_{j' \ne j} \mathrm{bot}_{j'})(\sigma)$. By the combinatorial conditions, these are well defined (i.e., they do not depend on the order of the operations). Now set
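$\mathcal{L}(\sigma) = \mathcal{L}(o(\sigma))$ and $\mathcal{L}_{\sigma,j} = \mathcal{L}_{e_j(\sigma)}$. By definition, the space of $I$-cochains $C^I(X, \mathcal{L})$ is the space of completely alternating collections
\[ s = \{ s(\sigma) \}_{\sigma \in \Sigma_I} \in \prod_{\sigma \in \Sigma_I} \mathcal{L}(\sigma) . \]
The condition of being completely alternating is vacuous for zero-cochains. Otherwise, a collection $s$ as above is completely alternating if for any $j \in I$ we have $s(\mathrm{inv}_j(\sigma)) = -\mathcal{L}_{\sigma,j} s(\sigma)$. For $\mathcal{L} = \mathcal{T}_V$, we view the $I$-cochains as maps from $\Sigma_I$ to $V$. For an integer $0 \le i \le g$, put $C^i(X, \mathcal{L}) = \bigoplus_{|I| = i} C^I(X, \mathcal{L})$. The partial boundary operators $\partial_j = \partial_{j,i} \colon C^i(X, \mathcal{L}) \to C^{i+1}(X, \mathcal{L})$ are defined by linearity and by the formula
\[ \partial_j s(\sigma) = \begin{cases} \mathcal{L}_{\sigma,j}^{-1} s(\mathrm{top}_j(\sigma)) - s(\mathrm{bot}_j(\sigma)) & \text{for } \sigma \in \Sigma_I \text{ with } |I| = i + 1 \text{ and } j \in I , \\ 0 & \text{for } \sigma \in \Sigma_I \text{ with } |I| = i + 1 \text{ and } j \notin I \end{cases} \]
for any $s \in C^i(X, \mathcal{L})$. These are well defined and satisfy $\partial_j \partial_{j'} = \partial_{j'} \partial_j$ for all $j$, $j'$. Clearly, $\partial_j^2 = 0$. We denote by $\partial_{j,I}$ the restriction of $\partial_j$ to $C^I(X, \mathcal{L})$. If $j \in I$, then $\partial_{j,I} = 0$. As in [19, Section II.1], the total boundary of $s \in C^I(X, \mathcal{L})$ is defined by $d(s) = d_I(s) = \sum_{j \notin I} (-1)^{\alpha_I(j)} \partial_{j,I} s$, where $\alpha_I(j)$ is the place of $j$ in $I \cup \{j\}$. As usual, $d^2 = 0$. We get the spaces of cohomology with respect to $\partial_j$ and $d$ by the usual formulas $H_j^i(X, \mathcal{L}) = \operatorname{Ker} \partial_{j,i} / \operatorname{Im} \partial_{j,i-1}$ and $H^i(X, \mathcal{L}) = \operatorname{Ker} d_i / \operatorname{Im} d_{i-1}$. We can, moreover, break $H_j^i(X, \mathcal{L})$ as a direct sum of subspaces $H_j^I(X, \mathcal{L})$ with $|I| = i$ defined by restricting $\partial_j$ to $C^I(X, \mathcal{L})$. When $\mathcal{L}$ is the trivial local system $\mathcal{T}_{\mathbb{R}}$, it is usually omitted from the notation. The operators $\partial_j^* = \partial_{j,i}^* \colon C^{i+1}(X, \mathcal{L}) \to C^i(X, \mathcal{L})$ are defined by
\[ \partial_{j,i}^* s(\sigma) = \sum_{\mathrm{top}_j(\tau) = \sigma} \mathcal{L}_{\tau,j}^{-1} s(\tau) . \]
These operators satisfy the analogous relations $\partial_j^* \partial_{j'}^* = \partial_{j'}^* \partial_j^*$ and $(\partial_j^*)^2 = 0$. We let $\partial_{j,I}^*$ denote the restriction of $\partial_{j,i}^*$ to $C^{I \cup \{j\}}(X, \mathcal{L})$ for $j \notin I$. Finally, for $t \in C^I(X, \mathcal{L})$, we define $d^*(t) = \sum_{j \in I} (-1)^{\alpha_{I - \{j\}}(j)} \partial_{j, I - \{j\}}^* t$. Again, $(d^*)^2 = 0$.

1.3. Hodge theory. From now on we consider only metrized local systems. Let $C_2^I(X, \mathcal{L})$ denote the Hilbert space completion of the space of $I$-cochains with coefficients in $\mathcal{L}$, with respect to the pre-Hilbert norm
\[ \|s\|^2 = 2^{-|I|} \sum_{\sigma \in \Sigma_I} \|s(\sigma)\|^2 . \]
We let $C_2^i(X, \mathcal{L})$ be the orthogonal sum of the corresponding $C_2^I$'s. It is clear that the operators previously defined (e.g., $\partial_{j,i}$, $d_i^*$, etc.) induce bounded operators, denoted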
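by the same letters, on the corresponding spaces $C_2^*(X, \mathcal{L})$. In particular, we have $\|\partial_{j,i}\|^2 \le r_j$ and $\|\partial_{j,i}^*\|^2 \le r_j$. To see this, notice that for $|I| = i$ and $s \in C^I(X, \mathcal{L})$, we have $\partial_{j,i} s = 0$ if $j$ is in $I$. Otherwise, set $I' = I \cup \{j\}$. Then
\begin{align*}
\|\partial_{j,i} s\|^2 &= 2^{-i-1} \sum_{\sigma \in \Sigma_{I'}} \bigl\| \mathcal{L}_{\sigma,j}^{-1} s(\mathrm{top}_j(\sigma)) - s(\mathrm{bot}_j(\sigma)) \bigr\|^2 \\
&\le 2^{-i+1} \sum_{\sigma \in \Sigma_{I'}} \bigl\| s(\mathrm{bot}_j(\sigma)) \bigr\|^2 \\
&= r_j 2^{-i} \sum_{\tau \in \Sigma_I} \|s(\tau)\|^2 \\
&= r_j \|s\|^2
\end{align*}
(compare with [12, Lemma 1.2]). This implies the bound $\|\partial_{j,I}\|^2 \le r_j$. Since the $C^I(X, \mathcal{L})$'s are mutually orthogonal, the same bound holds for the direct sum operator $\partial_{j,i}$. The case of $\partial_{j,i}^*$ is similar, and we omit it. It is routine and easy to verify that $\partial_{j,I}^*$ is the adjoint of $\partial_{j,I}$.

The Laplacians $\square_{j,i} \colon C^i(X, \mathcal{L}) \to C^i(X, \mathcal{L})$ are defined as usual by the formula $\square_{j,i} = \partial_{j,i-1} \partial_{j,i-1}^* + \partial_{j,i}^* \partial_{j,i}$. On each $C^I(X, \mathcal{L})$, this simplifies to $\partial_{j, I - \{j\}} \partial_{j, I - \{j\}}^*$ if $j \in I$ and to $\partial_{j,I}^* \partial_{j,I}$ if $j \notin I$. These are commuting, bounded, selfadjoint, and nonnegative operators. The Laplacian $\square_{\mathrm{tot},i}$ is defined by
\[ \square_{\mathrm{tot},i} = \square_{1,i} + \square_{2,i} + \cdots + \square_{g,i} = d_{i-1} d_{i-1}^* + d_i^* d_i . \]
The restriction of $\square_{j,i}$ ($\square_{\mathrm{tot},i}$) to $C^I(X, \mathcal{L})$ will be denoted by $\square_{j,I}$ ($\square_{\mathrm{tot},I}$). All of these are also bounded, selfadjoint, and nonnegative. By definition, the space of $\bullet$-harmonic forms for $\bullet = i$ or $I$, $\mathcal{H}^\bullet(X, \mathcal{L}) \subset C^\bullet(X, \mathcal{L})$, is $\operatorname{Ker} \square_{\mathrm{tot},\bullet}$. We, likewise, set $\mathcal{H}_j^\bullet(X, \mathcal{L}) = \operatorname{Ker} \square_{j,\bullet}$. We now have the following routine proposition.

Proposition 1.2. Let $X$ be a finite regular cubical complex, and let $\mathcal{L}$ be a metrized local system on $X$. Then
(1) $\mathcal{H}^i(X, \mathcal{L}) = \operatorname{Ker} d_i \cap \operatorname{Ker} d_{i-1}^*$ and $\mathcal{H}_j^i(X, \mathcal{L}) = \operatorname{Ker} \partial_{j,i} \cap \operatorname{Ker} \partial_{j,i-1}^*$. Moreover, $\mathcal{H}_j^I(X, \mathcal{L}) = \operatorname{Ker} \partial_{j,I}$ if $j \notin I$ and $\mathcal{H}_j^I(X, \mathcal{L}) = \operatorname{Ker} \partial_{j, I - \{j\}}^*$ if $j \in I$.
(2) We have orthogonal sum decompositions (the Hodge decomposition)
\[ C_2^i(X, \mathcal{L}) = \mathcal{H}^i(X, \mathcal{L}) \oplus \operatorname{Im} d_{i-1} \oplus \operatorname{Im} d_i^* , \qquad C_2^i(X, \mathcal{L}) = \mathcal{H}_j^i(X, \mathcal{L}) \oplus \operatorname{Im} \partial_{j,i-1} \oplus \operatorname{Im} \partial_{j,i}^* , \]
and
\[ C_2^I(X, \mathcal{L}) = \begin{cases} \mathcal{H}_j^I(X, \mathcal{L}) \oplus \operatorname{Im} \partial_{j,I}^* & \text{for } j \notin I , \\ \mathcal{H}_j^I(X, \mathcal{L}) \oplus \operatorname{Im} \partial_{j, I - \{j\}} & \text{for } j \in I . \end{cases} \]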
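2. The spectrum of the Laplacian. Our main interest is in the eigenvalues of the $\square_{j,i}$'s. First, we have the following lemma.

Lemma 2.1. For every subset $I \subset \{1,\dots,g\}$, the $I$-cochains $C^I(X,\mathcal{L})$ are preserved by $\square_{j,i}$ for all $j$ with $i = |I|$. Moreover, for $j \notin I$, the maps $\partial_j$ and $\partial^*_j$ induce isomorphisms between the $\lambda$-eigenspace of $\square_j$ on $C^I(X,\mathcal{L})$ and on $C^{I\cup\{j\}}(X,\mathcal{L})$ for any $\lambda \neq 0$.

Proof. Suppose $\square_j c = \lambda c$, $\square_j c' = \lambda c'$ for $c \in C^I(X,\mathcal{L})$, and $c' \in C^{I\cup\{j\}}(X,\mathcal{L})$ with $j \notin I$. Then, $\square_j\partial_j c = \partial_j\partial^*_j\partial_j c = \lambda\partial_j c$ and, likewise, $\square_j\partial^*_j c' = \lambda\partial^*_j c'$. Since $\partial^*_j\partial_j$ and $\partial_j\partial^*_j$ are both multiplication by $\lambda \neq 0$ on all such $c$, $c'$, respectively, the lemma follows.

The study of the nonzero eigenvalues and eigenspaces of $\square_j$ on any $C^I(X,\mathcal{L})$ can be reduced to the case $j \notin I$. In fact, it is also possible to reduce the study of the full diagram
\[ C^I(X,\mathcal{L}) \;\underset{\partial^*_j}{\overset{\partial_j}{\rightleftarrows}}\; C^{I\cup\{j\}}(X,\mathcal{L}) \]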
to the case when $X$ is an $r_j$-regular graph and $I$ is empty. To do this, we make for convenience the following hypothesis:

PAR: $X$ has parities $p_j$ for all $1 \le j \le g$.

Under hypothesis PAR, we define for any $j \notin I$ a graph $\mathrm{Gr}_j(X) = \mathrm{Gr}_{j,I}(X)$ and a (metrized) local system $\mathcal{L} = \mathcal{L}_{j,I}(X)$, as follows. The vertices of $\mathrm{Gr}_j(X)$ are the cubes of type $I$ of $X$; its (oriented) edges are the (oriented) cubes of type $I \cup \{j\}$ (with the $I$ part of the orientation all zeros). For an $I$-cube $\sigma$ of $X$, we define $\mathcal{L}_{j,I}(X)(\sigma) = \mathcal{L}(v_\sigma)$, where $v_\sigma$ is the bottommost vertex of $\sigma$. The transition maps are then those induced from $\mathcal{L}$. The compatibilities of the corresponding operators $\partial_j$, $\partial^*_j$, and $\square_j$ with $\partial$, $\partial^*$, and $\square$ are obvious from the definitions. In any mention of these graphs, hypothesis PAR will be assumed.

Let $\mathrm{Gr}$ be a locally finite graph, and let $\mathcal{L}$ be a local system on $\mathrm{Gr}$. The star operator $S(\mathrm{Gr},\mathcal{L}) : C^0(\mathrm{Gr},\mathcal{L}) \to C^0(\mathrm{Gr},\mathcal{L})$ is defined to be the map sending a cochain $s \in C^0(\mathrm{Gr},\mathcal{L})$ to the cochain $S(\mathrm{Gr},\mathcal{L})(s)$ given by
\[ S(\mathrm{Gr},\mathcal{L})(s)(v) = \sum_{\{e \,\mid\, t(e)=v\}} \mathcal{L}_e^{-1}\, s(o(e)). \]
In the case of the graph $\mathrm{Gr} = \mathrm{Gr}_{j,I}(X)$, we call $S_{j,I} = S(\mathrm{Gr}_{j,I}(X),\mathcal{L})$ the $j$th star operator and we put $S_{j,i} = \bigoplus_{|I|=i} S_{j,I}$. The operator $S_{j,I}$ then acts on the space $C^I(X,\mathcal{L}) = C^0(\mathrm{Gr}_{j,I}(X),\mathcal{L})$, and, accordingly, $S_{j,i}$ acts on the space $C^i(X,\mathcal{L})$. Clearly, the norm of $S_{j,I}$ is bounded by $r_j$. Since $\square_{j,I}$ can be viewed as $r_j - S_{j,I}$ (compare with [12, Proposition 2.20]), we get the following proposition.
JORDAN AND LIVNÉ
Proposition 2.2. The norm of $\square_{j,I}$ restricted to $C^I(X,\mathcal{L})$ is bounded by $2r_j$ and its spectrum is contained in $[0, 2r_j]$.

As a corollary, the same bound is valid for the norm of $\square_{j,i}$ (use the orthogonal decomposition of $C^i(X,\mathcal{L})$ into the $C^I(X,\mathcal{L})$'s, separately treating the cases $j \in I$ and $j \notin I$).

In dimension 1, the classical result of Alon and Boppana (see, e.g., [14, Proposition 4.5.4]) shows that the nontrivial eigenvalues of the star operator for the trivial local system are essentially bounded in terms of the norm of the star operator on the universal covering regular tree (see also [13] or [4]).

Proposition 2.3 (Alon and Boppana). Let $\mathrm{Gr}_n$ be a family of finite $r$-regular connected graphs whose number of vertices goes to $\infty$ with $n$. Let $L^{2,0}(\mathrm{Gr}_n)$ denote the space of zero-sum $L^2$ zero-cochains. Then
\[ \liminf_{n\to\infty} \bigl\| S(\mathrm{Gr}_n)|_{L^{2,0}(\mathrm{Gr}_n)} \bigr\| \ge 2\sqrt{r-1}. \]
In the higher-dimensional case, set $\bullet = i$ or $I$ and define for $j \notin I$:
\[ \mu_{j,\bullet}(X,\mathcal{L}) = \max\bigl\{ |\mu| \;\bigm|\; \mu \text{ is an eigenvalue of } S_{j,\bullet} \text{ different from } \pm r_j \bigr\}. \]
Here, $\mathcal{L}$ is any metrized local system on $X$. Put also $\mu_{j,\bullet}(X) = \mu_{j,\bullet}(X,\mathcal{T})$. In particular, $\mu_{j,\bullet}(X,\mathcal{L})$ lies in the interval $[-r_j, r_j]$. The Alon-Boppana result now implies the following $g$-dimensional version.

Proposition 2.4. Let $X_n$ be a sequence of $(r_1,\dots,r_g)$-regular connected cubical complexes. If the number of vertices of each connected component of $\mathrm{Gr}_{j,I}(X_n)$ tends to $\infty$ with $n$, then $\liminf_{n\to\infty} \mu_{j,I}(X_n) \ge 2\sqrt{r_j-1}$.

In analogy with the 1-dimensional case (see [12, Definition 2.25]), we state the following.

Definition 2.5. A local system $\mathcal{L}$ on a regular cubical complex $X$ is Ramanujan if $\mu_{j,I}(X,\mathcal{L}) \le 2\sqrt{r_j-1}$ for all $j$ and $I$, as above. We say that $X$ is Ramanujan if the trivial local system $\mathcal{T}_{\mathbb{C}}$ on $X$ is Ramanujan.

It is of course possible to define the notion of a $(j,I)$-Ramanujan system, so that $\mathcal{L}$ is Ramanujan if and only if it is $(j,I)$-Ramanujan for all $(j,I)$ ($j \notin I$). Many examples are known in the 1-dimensional case (see, e.g., [14]). Since the external tensor product of Ramanujan local systems is Ramanujan, we get many examples in the higher-dimensional case as products, or, more generally, by pulling back a product system from a finite unramified cover. We say that a connected regular cubical complex $X$ is reducible if a finite unramified cover of it is a product of cubical complexes of lower dimensions. Otherwise, we say that $X$ is irreducible.
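The 1-dimensional case of Definition 2.5 can be checked numerically on a small graph. The following is a minimal sketch, not part of the paper: it assumes the trivial local system, so that the star operator is just the adjacency matrix, and it uses the Petersen graph ($r = 3$) as a hypothetical test case. The helper names `jacobi_eigenvalues`, `petersen_adjacency`, `mu`, and `is_ramanujan` are illustrative choices, and the eigenvalues are computed by a plain cyclic Jacobi iteration to keep the sketch free of external dependencies.

```python
import math


def jacobi_eigenvalues(matrix, sweeps=100, tol=1e-22):
    """Eigenvalues of a real symmetric matrix via cyclic Jacobi rotations."""
    n = len(matrix)
    a = [[float(x) for x in row] for row in matrix]
    for _ in range(sweeps):
        # Sum of squares of the off-diagonal entries; stop once negligible.
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                diff = a[q][q] - a[p][p]
                # Rotation angle (|theta| <= pi/4) that zeroes the (p, q) entry.
                theta = math.pi / 4 if diff == 0 else 0.5 * math.atan(2 * a[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # update columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):  # update rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
    return sorted(a[i][i] for i in range(n))


def petersen_adjacency():
    """Adjacency matrix of the Petersen graph (3-regular, 10 vertices)."""
    n = 10
    a = [[0] * n for _ in range(n)]

    def join(u, v):
        a[u][v] = a[v][u] = 1

    for i in range(5):
        join(i, (i + 1) % 5)          # outer 5-cycle
        join(5 + i, 5 + (i + 2) % 5)  # inner pentagram
        join(i, 5 + i)                # spokes
    return a


def mu(adjacency, r, eps=1e-8):
    """Largest |eigenvalue| of the star operator other than +r and -r."""
    eigs = jacobi_eigenvalues(adjacency)
    return max(abs(x) for x in eigs if abs(abs(x) - r) > eps)


def is_ramanujan(adjacency, r):
    """Check mu <= 2*sqrt(r - 1), with a small numerical tolerance."""
    return mu(adjacency, r) <= 2 * math.sqrt(r - 1) + 1e-9


if __name__ == "__main__":
    A = petersen_adjacency()
    print("mu =", mu(A, 3), "bound =", 2 * math.sqrt(2))
    print("Ramanujan:", is_ramanujan(A, 3))
```

For a nontrivial local system one would replace the 0/1 entries by the transition maps, which this sketch does not attempt.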
When $X$ is connected, we write $X$ as a quotient $X = \Gamma\backslash\Delta$ of a product $\Delta = \prod_j \Delta_j$ of trees. Then $X$ is reducible if and only if the following two conditions are satisfied:
(a) $\Delta$ is a product $\Delta = \Delta' \times \Delta''$ with $\Delta'$, $\Delta''$ products over complementary subsets of the $\Delta_j$'s;
(b) a subgroup of finite index of $\Gamma$ is compatibly a product $\Gamma' \times \Gamma''$, so that the corresponding finite unramified cover of $X$ is $\Gamma'\backslash\Delta' \times \Gamma''\backslash\Delta''$.

A metrized local system is always an orthogonal sum of irreducible local systems. The irreducible (metrized) local systems over a product (connected) complex are precisely the external tensor products of irreducible ones. The challenge is then to find Ramanujan local systems over irreducible cubical complexes. Even when a graph or a cubical complex is Ramanujan, there are local systems on it which are not: A generic deformation of the transition maps of the trivial local system is an example.

Let $F_1,\dots,F_g$ be nonarchimedean local fields, and set $G_j = \mathrm{PGL}_2(F_j)$ and $G = \prod_j G_j$. Let $G^+$ be the elements of $G$ whose $j$th components have determinants with even valuations for each $j$. The standard trees $\Delta_j$ associated to the $G_j$'s are $(r_j = q_j + 1)$-regular with $q_j$ the cardinality of the residue field of $F_j$. Set $\Delta = \prod_j \Delta_j$. For a discrete subgroup $\Gamma$ of $G^+$, the quotient complex $\Gamma\backslash\Delta$ satisfies hypothesis PAR. For any $I \subset \{1,\dots,g\}$, $G$ acts on $\Delta_I = \prod_{j\in I}\Delta_j$, on $\Delta^I = \prod_{j\notin I}\Delta_j$, and on $\Delta = \Delta_I \times \Delta^I$. For an $I$-cube $\sigma \in \Sigma_I(\Delta_I)$, let $\Gamma_\sigma$ denote the projection to $G^I = \prod_{j\notin I} G_j$ of the stabilizer of $\sigma$. We now have the following proposition.

Proposition 2.6. (1) Let $\Gamma$ be a discrete, cocompact, torsion-free subgroup of $G$. Let $\mathcal{L}$ be the (metrized) local system on $X = \Gamma\backslash\Delta$ corresponding to a unitary representation $\rho$ of $\Gamma$ on a (finite-dimensional) space $V$. Then $\mathcal{L}$ is $(j,I)$-Ramanujan for $j \notin I$ if and only if the following condition holds. For any $I$-cube $\sigma$ of $\Delta^{\{j\}}$, no (nontrivial) representations of $G$ of the unramified complementary series appear in $L^2(\Gamma_\sigma\backslash G_j \times V)$, where $G_j$ acts through its right action on the $G_j$-factor.
(2) The local system $\mathcal{L}$ is Ramanujan on $X$ if and only if no nontrivial representations of $G$ of the unramified complementary series appear in $L^2(\Gamma_\sigma\backslash G_j \times V)$, for any cube $\sigma$ of $\Delta^{\{j\}}$ of direction $I^j = \{1,\dots,g\} - \{j\}$.

Proof. (1) After the choice of a maximal unramified compact subgroup of $G$, the unramified vectors in $L^2(\Gamma_\sigma\backslash G_j \times V)$ can be identified with $L^2(\Gamma_\sigma\backslash \mathrm{Ver}\,\Delta_j \times V)$, and the action of the $j$th factor Hecke operator $T_{v_j}$ corresponds to the action of the $j$th star operator (compare Step 1 in the proof of [12, Theorem 3.4]). Our claim now follows from Satake's reformulation of the Ramanujan-Petersson conjecture (see [18]).

(2) For $j \notin I \subset \{1,\dots,g\}$, an $I^j$-cube has $2^{g-1-|I|}$ faces of direction $I$. Hence, the diagram
\[ C^I(X,\mathcal{L}) \;\underset{\partial^*_j}{\overset{\partial_j}{\rightleftarrows}}\; C^{I\cup\{j\}}(X,\mathcal{L}) \]
embeds (in $2^{g-1-|I|}$ ways) into the corresponding diagram with $I$ replaced by $I^j$. Hence, to verify the Ramanujan property it suffices to verify the $(j,I^j)$-Ramanujan
property for all $1 \le j \le g$. For any $I$ as above, we have the decomposition into connected components $\mathrm{Gr}_{j,I}(X,\mathcal{L}) = \coprod_\sigma \Gamma_\sigma\backslash\Delta_{I\cup\{j\}}$, the union taken over representatives $\sigma$ of $\Gamma\backslash\Sigma_I(\Delta^{\{j\}})$. It suffices, therefore, to verify the Ramanujan property for $\mathcal{L}$ over each of these components for $I = I^j$. Our claim follows from the first part.

3. Ramanujan local systems arising from quaternion algebras. Let $F$ be a totally real field of degree $d$ over $\mathbb{Q}$ with ring of integers $\mathcal{O}_F$. Let $\infty_i$, $i = 1,\dots,d$, be the infinite places of $F$. For an algebraic group $H$ over $F$, let $H_F$ be the $F$-rational points of $H$, let $H_{\mathbb{A}}$ be the adelic points of $H$, and let $H^f$ denote the finite-adelic points of $H$. For any finite set of finite places $v_1,\dots,v_n$ of $F$, let $H^{f,v_1,\dots,v_n}$ denote the finite-adelic points of $H$ without the $v_1,\dots,v_n$ component.

Let $B$ be a totally definite quaternion algebra over $F$ with reduced norm $\mathrm{Nm} = \mathrm{Nm}_{B/F}$. (A general reference for quaternion algebras over totally real fields is [22].) Let $G$ be the algebraic group over $F$ associated with the multiplicative group $B^\times$. Let $q_v$ denote the cardinality of the residue field of the completion $F_v$ of $F$ at a nonarchimedean place $v$. With $\mathbb{H}$ denoting the Hamilton quaternions, let $\mathrm{Symm}^k$ denote the $k$th symmetric power of the 2-dimensional representation of $\mathbb{H}^\times$ obtained by identifying $\mathbb{H} \otimes_{\mathbb{R}} \mathbb{C}$ with $\mathrm{Mat}_{2\times 2}(\mathbb{C})$. We have $G_\infty := \prod_i G(F_{\infty_i}) \cong (\mathbb{H}^\times)^d$. The irreducible representations of $G_\infty$ are then $\mathrm{Symm}^{k,s} = \bigotimes_i (\mathrm{Nm}_{B_{\infty_i}/\mathbb{R}})^{s_i}\,\mathrm{Symm}^{k_i}$ with $k = (k_1,\dots,k_d)$ and
$s = (s_1,\dots,s_d)$, the $s_i$ being any complex numbers. We denote by $V^{k,s}$ the space on which $\mathrm{Symm}^{k,s}$ acts. They can be unitarized if and only if the center of $G_\infty$ acts through a unitary character, and this happens if and only if $2s_i + k_i$ is imaginary for each $i$. In this event, there is a $G_\infty$-invariant positive-definite Hermitian inner product on $V^{k,s}$ which is unique up to a scalar. In what follows, we also assume that $s = -k/2$. Then an element $(z_1,\dots,z_d)$ in the center of $G_\infty$ acts via $\prod_i \operatorname{sign}(z_i)^{k_i}$. In particular, an element $x$ of the center $F^\times$ of $G_F \subset G_\infty$ acts via $\operatorname{sign}(\mathrm{Nm}_{F/\mathbb{Q}}(x)^k)$ if all the $k_i$'s have the same parity $k$.

Let $Z$ be the center of $G$, viewed as an algebraic group over $F$. Let $v_j$, $j = 1,\dots,g$, be distinct nonarchimedean places of $F$, where $B$ is unramified, and fix identifications
\[ B_{v_j} \cong \mathrm{Mat}_{2\times 2}(F_{v_j}). \tag{2} \]
Set $G_j = \mathrm{GL}_2(F_{v_j})$, $G = \prod_j G_j$, and $K_j = Z_{v_j}\mathrm{GL}_2(\mathcal{O}_{F,v_j})$. We view the vertices of the ($(r_j = q_{v_j}+1)$-regular) tree $\Delta_j$ of $G_j$ as $\mathrm{Ver}\,\Delta_j = G_j/K_j$, and we set $\Delta = \prod_j \Delta_j$. For a compact open subgroup $K^0$ of $G^{f,v_1,\dots,v_g}$, set $\Gamma = \Gamma(K^0) = G_F \cap K^0$. Then $\Gamma$, divided by its center, acts discretely on $\Delta$, and if $K^0$ is sufficiently small, the action is free. We can, therefore, use the notation and results of the previous section.
Recall that the Ramanujan-Petersson conjecture asserts that the eigenvalues of the Hecke operator $T_{v_j}$ are bounded by $2\sqrt{q_{v_j}}$ on the automorphic forms on $\mathrm{GL}_2(F)$ which are unramified at $v_j$. We now have the following theorem.

Theorem 3.1. (1) The cubical complex $X = \Gamma\backslash\Delta$ is irreducible, and the graph $\mathrm{Gr}_{j,I}(X)$ is connected for any $j \notin I \subseteq \{1,\dots,g\}$.
(2) Suppose that every element in $\Gamma$ which fixes a point on $\Delta$ acts trivially on $V^{k,-k/2}$. Then the formula $\mathcal{L} = \mathcal{L}^{k,-k/2} = \Gamma\backslash(\Delta \times V^{k,-k/2})$ defines a local system on $X = \Gamma\backslash\Delta$, which is metrized, irreducible, and nontrivial unless all the $k_j$ are zero.
(3) The cohomology groups $H^*(X,\mathcal{L})$ vanish except in dimensions zero and $g$.
(4) When $\mathcal{L}$ defines a local system on $X$, it is Ramanujan, provided the Ramanujan-Petersson conjecture holds for holomorphic automorphic forms on $\mathrm{GL}_2(F)$ of weight $(k_1+2,\dots,k_g+2)$.

Proof. (1) By the property of strong approximation, the projection of $\Gamma$ to each strictly partial $G_I$ is dense, and the $G$-action on the (unoriented) $I$-cubes of $\Delta^{\{j\}}$ is transitive. The first fact implies that $X$ is irreducible, and the second implies that $\mathrm{Gr}_{j,I}(X)$ is connected.
(2) The given condition implies that $\mathcal{L}$ as defined is a local system on $X$. (In our case, it simply means that if $z$ is a central element in $\Gamma$, then $\prod_{1\le i\le g}(\operatorname{sign}_{\infty_i} z)^{k_i} = 1$.) Since $s = -k/2$, $\rho^{k,-k/2}$ is unitary and, hence, $\mathcal{L}$ is metrized. The Eichler-Kneser strong approximation theorem implies that the image of $\Gamma$ in $G_\infty$ contains a Zariski dense subgroup of the group $(G^1)_\infty \cong \mathrm{SU}(2)^d$ of the norm-1 elements of $G_\infty$, implying the irreducibility of $\mathcal{L}^{k,-k/2}$. The nontriviality statement is clear.
(3) This is a consequence of representation-theoretic results of Garland and Casselman (see [2, Chapter 13, Proposition 3.6(i)]).
(4) Hypothesis PAR holds since the elements of $\Gamma$ have norm 1. We deduce the Ramanujan property from bounds on Hecke eigenvalues as in [12, Theorem 3.4.2].
Define a finite $r$-regular cubical complex $Y = Y(K^0)$ and a local system $\mathcal{L}_Y$ on $Y$ by
\[ Y = G_F\backslash\bigl(\Delta \times G^{f,v_1,\dots,v_g}/K^0\bigr) \quad\text{and}\quad \mathcal{L}_Y = G_F\backslash\bigl(\Delta \times G^{f,v_1,\dots,v_g}/K^0 \times V^{k,-k/2}\bigr) \tag{3} \]
with the diagonal $G_F$ action. The connected components of $Y$ are in bijection with the idèle class group $F^\times\backslash\mathbb{A}_F^{\times,f}/\mathrm{Nm}_{B/F}(KK^0)$ by the Eichler-Kneser strong approximation theorem. Moreover, the natural inclusion of $\Delta$ into $\Delta \times G^{f,v_1,\dots,v_g}$ exhibits $X$ as a connected component of $Y$, with $\mathcal{L}$ the restriction of $\mathcal{L}_Y$ to $X$. By Proposition 2.6(2), the Ramanujan property is equivalent to bounds for the “interesting” eigenvalues of $S_{j,I^j}$ on each $C^0(\mathrm{Gr} := \mathrm{Gr}_{j,I^j}(Y),\mathcal{L}_Y)$, $1 \le j \le g$. First, observe that the stabilizer $U^j$ of a fixed $I^j$-cube $\sigma$ of $\Delta^{\{j\}}$ in $G$ is the product $\prod_{i\in I^j} U_i$ of the stabilizers $U_i$ of each $i$th edge factor of $\sigma$. Set $K = K^0 K_j U^j$. Then
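we are reduced to studying the eigenvalues of the star operator $S_{j,I^j}$ on
\[ C^0(\mathrm{Gr},\mathcal{L}_Y) = G_F\backslash\bigl(\mathrm{Ver}(\Delta_j) \times G^{f,v_j}/K^0U^j \times V^{k,-k/2}\bigr) = G_F\backslash\bigl(G^f/K^0K_jU^j \times V^{k,-k/2}\bigr). \]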
Let $\mathcal{B}^{k,-k/2}(G)$ be the space of continuous maps $\varphi : G^f \to V^{k,-k/2}$ satisfying $\varphi(g^f x) = \rho^{k,-k/2}(g_\infty)\varphi(x)$, for any $g = g_\infty g^f \in (G_\infty \times G^f) \cap G_F$, with $G^f$ acting on it through right translations. We view $C^0(\mathrm{Gr},\mathcal{L}_Y)$ as the $K$-invariants of $\mathcal{B}^{k,-k/2}(G)$, with $S_{j,I^j}$ corresponding to the Hecke operator $T_{v_j}$. We express this space as a space of automorphic forms for $B^\times$. First, we decompose under the action of $Z^f$,
\[ \mathcal{B}^{k,-k/2}(G)^K = \bigoplus_{\omega\in\Lambda(G,K^0)} \mathcal{B}^{k,-k/2}(G,\omega)^K \]
with the sum over the set $\Lambda(G,K^0)$ of characters of $Z_{\mathbb{A}}$ trivial on $Z_F$, on the connected component of the identity of $Z_\infty$, and on $K \cap Z^f$. In particular, it is a finite set. As in [11, Section 1.1], each $\mathcal{B}^{k,-k/2}(G,\omega)$ is closely related to the space $\mathcal{A}(G,\omega)$ of automorphic forms on $B^\times$ in the sense of Jacquet and Langlands [10, Chapter 14]. (By the compactness of $G^1_\infty$, these can be viewed as the complex-valued continuous functions $f : G_F\backslash G_{\mathbb{A}} \to \mathbb{C}$, which are right $G_{\mathbb{A}}$-finite and which satisfy $f(gz) = \omega(z)f(g)$ for any $g \in G_{\mathbb{A}}$ and $z \in Z_{\mathbb{A}}$.) For an irreducible representation $\rho$ of $G^1_\infty$, which is isomorphic to $\rho^{k,-k/2}|_{G^1_\infty}$, the Peter-Weyl theory furnishes an isomorphism
\[ \mathcal{B}^{k,-k/2}(G,\omega) \cong \operatorname{Hom}_{G^1_\infty}\bigl(V^{k,-k/2},\mathcal{A}(G,\omega)\bigr) \cong \operatorname{Hom}_{G_\infty}\bigl(V^{k,-k/2},\mathcal{A}(G,\omega)\bigr). \]
The second isomorphism holds because both $\omega$ and $\rho^{k,-k/2}$ are trivial on the connected component of $Z_\infty \cong \prod_i F^\times_{\infty_i}$. The spaces involved are nonzero if and only if the central character of $\rho^{k,s}$ agrees with $\omega$ on $Z_\infty \cong \prod_i F^\times_{\infty_i}$. In the opposite direction, we identify the $\rho$-isotypical part $\mathcal{A}(G,\omega)_\rho$ of $\mathcal{A}(G,\omega)$ with $V^{k,-k/2} \otimes_{\mathbb{C}} \mathcal{B}^{k,-k/2}(G,\omega)$; see, for example, [11, Section 1.1] for explicit formulas for these isomorphisms (in the case $F = \mathbb{Q}$, but the generalization is immediate). These isomorphisms are $G^f$-equivariant. Therefore, we need to study the eigenvalues of $T_{v_j}$ on $\mathcal{A}(G,\omega)^K$ for each $\omega \in \Lambda(G,K^0)$.

Let $\mathcal{A}(G,\omega)^K_1$ be the space of functions in $\mathcal{A}(G,\omega)^K$ factorizing through the norm, and let $\mathcal{A}(G,\omega)^K_2$ be its orthogonal complement. Then
\[ \mathcal{A}(G,\omega)^K = \mathcal{A}(G,\omega)^K_1 \oplus \mathcal{A}(G,\omega)^K_2. \]
Moreover, $\mathcal{A}(G,\omega)^K_1$ is the sum of character spaces $V(\chi)$, each 1-dimensional, with $\chi$ going over the idèle class characters trivial on $\mathrm{Nm}_{B/F}(K)$ and satisfying $\chi^2 = \omega$. On
the corresponding space, $T_{v_j}$ acts as the scalar $\lambda_\chi = (q_{v_j}+1)\chi(\pi_{v_j})$, where $\pi_{v_j}$ is a uniformizer for $v_j$. Hence, $\lambda_\chi^2 = r_j^2\,\omega(\pi_{v_j})$, and since $\pi_{v_j}$ is in $Z_{v_j}$, we get $\lambda_\chi = \pm r_j$.

The elements of $\mathcal{A}(G,\omega)^K_2$ lift Hecke equivariantly to cusp forms on $\mathrm{GL}_2(F)$ by the Eichler-Shimizu-Jacquet-Langlands theory. The lift is injective, and its image is characterized by square integrability at each place $v$ at which $B$ is ramified. When $v$ is infinite, the square integrability is essentially the same as holomorphy. Moreover, the weights correspond as follows. To $\mathrm{Symm}^k$ at an infinite place correspond forms of weight $k+2$ at that place on $\mathrm{GL}_2(F)$. Hence, we have reduced the Ramanujan property for $X$ to the Ramanujan-Petersson conjecture for holomorphic forms on $\mathrm{GL}_2(F)$ of the type claimed.

Remark 3.2. (1) We need only special cases of the holomorphic Ramanujan-Petersson conjecture. Let $B/F$, $v_1,\dots,v_g$, $K^0$, and $k$ be as above. To prove that $\mathcal{L}^{k,-k/2}$ is a Ramanujan local system on $X$ only requires that the eigenvalues of $T_{v_1},\dots,T_{v_g}$ on those automorphic forms on $\mathrm{GL}_2(F)$ which are lifts of $\rho^{k,-k/2}$-isotypical forms on $\mathcal{A}(G,\omega)^K$ with $\omega \in \Lambda(G,K^0)$ satisfy the Ramanujan-Petersson bound. These are the automorphic forms on $\mathrm{GL}_2(F)$ of level dividing $K^0$, of $\infty$-type as indicated, and of central character in $\Lambda(G,K^0)$ which are, moreover, square integrable at all places where $B$ ramifies. However, the Ramanujan-Petersson conjecture seems not to be known in this generality.
(2) Fortunately, when $B$ is ramified at some finite place $v$ (which always happens if $[F : \mathbb{Q}]$ is odd), then the Ramanujan-Petersson conjecture is known. If $F = \mathbb{Q}$, the automorphic form can be lifted to $\mathrm{GL}_2(\mathbb{Q})$, where the Ramanujan-Petersson conjecture was proved by Deligne to follow from the Weil conjectures (see [7]), which he subsequently proved in [8]. Otherwise, let $B'/F$ be a quaternion algebra ramified precisely at the places where $B$ is, except for $\infty_1$ and $v$. Then our form can be lifted to $B'$; Carayol [5] showed that such forms occur in the cohomology of a local system on a Shimura curve with good reduction at $v$. The Ramanujan-Petersson conjecture then follows from the Weil-Deligne bounds (see [9]).
(3) More cases of the Ramanujan-Petersson conjecture can undoubtedly be proved by the techniques of [3] or of [1]. The scope of results that this might yield is unclear.

As a result, we get the following theorem.

Theorem 3.3. In the notation of Theorem 3.1, we assume in addition that $B$ is ramified in at least one finite place and that all the $k_i$'s are of the same parity. Then $\mathcal{L}^{k,-k/2}$ is Ramanujan on $X$.

4. Explicit arithmetic examples. Let $p_j^{f_j}$ be $g$ powers of rational primes $p_j$, not necessarily distinct, and put $r_j = p_j^{f_j} + 1$. Our results suffice to give the following theorem.
Theorem 4.1. There exist infinitely many irreducible Ramanujan local systems over infinitely many irreducible $(r_1,\dots,r_g)$-regular complexes.

Proof. Let $\Delta_j$ be an $r_j$-regular tree, and put $\Delta = \prod_j \Delta_j$. Let $d$ be an integer
\[ d \ge \max_{\{p \text{ a rational prime}\}} \sum_{\{j \,\mid\, p_j = p\}} f_j. \]
There exists a totally real number field $F$ of degree $d$ over $\mathbb{Q}$, and there exist pairwise distinct finite places $v_j$ of $F$ whose residue fields have $p_j^{f_j}$ elements. There exists a totally definite quaternion algebra $B/F$ that is unramified at all the $v_j$'s and ramified over at least one finite prime (again, this is automatic if $d$ is odd). Viewing each $\Delta_j$ as the tree associated to $B^\times_{v_j} \cong \mathrm{GL}_2(F_{v_j})$, we get an action of $B^\times$ modulo its center $F^\times$ on $\Delta$. Choose an order $\mathcal{M}$ in $B$, which for simplicity we take to contain $\mathcal{O}_F$, and set $S = \{v_1,\dots,v_g\}$. Let $\mathcal{O}_{F,S}$ and $\mathcal{M}_S$ be the localizations at $S$, namely, the elements of $F$ and $B$ that are integral outside of $S$. For an ideal $N$ of $\mathcal{O}_F$ prime to $S$, let $\Gamma(N)$ be the principal congruence subgroup in $\mathcal{M}_S$, namely, the kernel of the (surjective) reduction $\mathcal{M}_S^\times \to (\mathcal{M}/N\mathcal{M})^\times$ modulo $N$. Suppose that $N$ is sufficiently small. Then $\Gamma(N)$, divided by its center, acts freely on $\Delta$.

Now take $k = (k_1,\dots,k_g)$, where the $k_j$'s are nonnegative integers of the same parity. Then $\mathcal{L} = \Gamma(N)\backslash(\Delta \times V^{k,-k/2})$ is a local system on $X(N) = \Gamma(N)\backslash\Delta$ if the condition in Theorem 3.1(2) is satisfied. In our case, it means that if the parity of the $k_i$'s is odd, then there are no elements in $\mathcal{O}^\times_{F,S}$ congruent to 1 modulo $N$ whose norm to $\mathbb{Q}$ is negative. For example, this holds if either the $k_i$'s are even or if $-1$ is not in the subgroup of $(\mathbb{Z}/(N \cap \mathbb{Z}))^\times$ generated by the $p_j^{f_j}$'s. Assuming this, $\mathcal{L}$ is a Ramanujan local system on $X(N)$ by Theorem 3.3. If we fix the quaternion algebra and vary the level $N$, we thus get an infinite family of complexes of the same regularities with growing number of vertices, thereby proving the theorem.

In certain cases, the construction takes a particularly simple form in which, among other things, the $\mathrm{Gr}_j$'s are all Cayley graphs on the same group. (The general case is not much worse; see [12, Section 2.6] for the prototypical examples.)
Namely, let us assume there is an ideal $N_0 \neq 0$ of $\mathcal{O}_F$, prime to the $v_j$'s (we allow $N_0 = \mathcal{O}_F$) such that the following condition holds.

Condition 4.2. (1) Every ideal of $F$ has a totally positive generator equivalent to 1 mod $N_0$.
(2) The class number of $B$ is 1.
(3) The units $\mathcal{M}^\times$ of a maximal order $\mathcal{M}$ of $B$ surject onto $(\mathcal{M}/N_0\mathcal{M})^\times$ with the kernel being contained in the center $\mathcal{O}_F^\times$ of $\mathcal{M}^\times$.

We then have the following two propositions.
Proposition 4.3. (1) For each $1 \le j \le g$, there are exactly $r_j$ (principal) ideals $P_{j,i}$, $1 \le i \le r_j$, of $\mathcal{M}$ whose norm to $F$ is the prime ideal $v_j$. This ideal has a totally positive generator, say $\pi_j$, which is equivalent to 1 mod $N_0$. We next choose generators $\Phi_{j,i} \equiv 1 \bmod N_0\mathcal{M}$ for $P_{j,i}$ whose norm is $\pi_j$ and whose image $\Phi^*_{j,i}$ under the main involution $x \mapsto x^*$ of $B$ is some $\Phi_{j,i'}$ for some $1 \le i' \le r_j$ ($i' = i$ may happen).
(2) For every permutation $\sigma$ of $\{1,\dots,g\}$ and any sequence of indices $i_1,\dots,i_g$ with $1 \le i_j \le r_j$, there is a (unique) sequence $i'_1,\dots,i'_g$ with $1 \le i'_j \le r_j$ and a (unique) unit $u \in \mathcal{O}_F^\times$, satisfying $u \equiv 1 \bmod N_0$ such that
\[ \Phi_{\sigma(1),i_1}\cdots\Phi_{\sigma(g),i_g} = u\,\Phi_{1,i'_1}\cdots\Phi_{g,i'_g}. \tag{4} \]
Proof. We use the notation of Section 3. Fix generators $\pi_j$ of $v_j$ as required. Let $\mathcal{M}$ be a maximal order in $B$. For each finite prime $v$ of $F$, let $\mathcal{M}_v$ be the completion of $\mathcal{M}$ at $v$, and set $K_v = \mathcal{M}_v^\times$ and $K^0 = \prod K_v$, the product taken over the finite $v$'s not among the $v_j$'s. Fix the identifications in (2) so that $\mathcal{M}_{v_j} = \mathrm{Mat}_{2\times 2}(\mathcal{O}_{F,v_j})$.

The assumption that $F$ and $B$ have class number 1 implies, using the Eichler-Kneser strong approximation theorem, that the complex $Y(K^0)$ of (3) coincides with the complex $X = \Gamma(K^0)\backslash\Delta$ and that it has one vertex. Therefore, there are elements $\Phi_{j,i} \in \Gamma(K^0)$, for $1 \le j \le g$ and $1 \le i \le r_j$, mapping the vertex of $\Delta$ fixed by $K$ to its $r_j$ neighbors of direction $\{j\}$. Multiplying by an element in the center, we assume that these elements are in $\mathcal{M}$ and not divisible by any of the $v_j$. Then the norm of $\Phi_{j,i}$ must be a generator of $v_j$. Multiplying by a unit in $\mathcal{M}$, we assume that each $\Phi_{j,i} \equiv 1 \bmod N_0\mathcal{M}$. Multiplying by a unit in the center $\mathcal{O}_F$, we further assume that, in addition, the norm is $\pi_j$. This is because there are units in $\mathcal{O}_F$ which have arbitrary signs at the infinite places of $F$ and which are equivalent to 1 mod $N_0$. The result gives the required $\Phi_{j,i}$'s.

For the second part, we see from the action on the complex that $\Phi_{\sigma(1),i_1}\cdots\Phi_{\sigma(g),i_g} = \Phi_{1,i'_1}\cdots\Phi_{g,i'_g}\,u$ for some unit $u$ of $\mathcal{M}$. Then $u$ must be equivalent to 1 mod $N_0$ and, therefore, in the center.

To formulate the second proposition, we set some notation first. For a subset $J$ of $\{1,\dots,g\}$, we denote by $\prod_{j\in J} g_j$ the product $g_{j_1}\cdots g_{j_n}$, where the $g$'s are elements in any semigroup and the $j_i$'s are the elements of $J$ in increasing order. Let $N_1$ be a prime ideal of $\mathcal{O}_F$, prime to the $v_j$'s and to $N_0$, and set $N = N_0N_1$. Let $A$ be the subgroup of $(\mathcal{O}_F/N_1)^\times$ generated by the images modulo $N_1$ of the $\pi_j$'s. Let $B$ be the subgroup of scalars in $(\mathcal{M}/N_1\mathcal{M})^\times$ generated by the images modulo $N_1\mathcal{M}$ of the $\pi_j$'s and by those units of $\mathcal{O}_F$ which are congruent to 1 modulo $N_0$. Let $B'$ be the subgroup of scalars in $(\mathcal{M}/N_1\mathcal{M})^\times$ generated by the images modulo $N_1\mathcal{M}$ of the $\pi_j$'s and by those units of $\mathcal{O}_F$ which are congruent to 1 modulo $N_0$ and whose norm to $\mathbb{Q}$ is positive (namely, 1).

The groups
\[ H = \{g \in (\mathcal{M}/N_1\mathcal{M})^\times \mid \mathrm{Nm}(g) \in A\}/B \quad\text{and}\quad H' = \{g \in (\mathcal{M}/N_1\mathcal{M})^\times \mid \mathrm{Nm}(g) \in A\}/B' \]
are isomorphic to the groups $\mathrm{SL}_2(\mathcal{O}_F/N_1)$, $\mathrm{PSL}_2(\mathcal{O}_F/N_1)$, or $\mathrm{PGL}_2(\mathcal{O}_F/N_1)$. The latter case occurs if and only if at least one of the $\pi_j$'s is a square modulo $N_1$. Moreover, $H$ is a quotient of $H'$, and the kernel
subgroup has order 1 if −1 is in the subgroup of (Z/(N0 ∩Z))× generated by the norms NmF /Q πj mod N0 ∩ Z and has order 2 otherwise. If the kernel is 1, we assume that the parity of the ki ’s is even. For x ∈ ᏹS× let x and x denote its respective reductions into H and H . These make sense because ᏹS /N0 ᏹS is isomorphic to ᏹ/N0 ᏹ. Choose a section s : H → H . With this notation, we have the following proposition. Proposition 4.4. (1) The vertices of X(N) are the elements of H . (2) The (oriented) cubes of direction J ⊂ {1, . . . , g} of X(N) are the pairs (v, I ), where v ∈ H = Ver X(N) and I is a |J |-tuple (ij )j ∈J with 1 ≤ ij ≤ rj . (3) For j ∈ J ⊂ 1, . . . , g, the j th bottom of the J -cube (v, I ) is the J -cube (v8j,ij , I ), where J = J − {j }, and I = (ik )k∈J is the |J | = (|J | − 1)-tuple characterized by 8k,ik = u8j,ij 8k,ik k∈J
k∈J
with u a unit u ∈ ᏻ× F , as in Proposition 4.3(2). With the same notation, invj (v, I ) is the J -cube (v8j,ij , I ) with I characterized by ∗ 8k,ik = u 8k,ik 8j,i j k∈I
u
k∈J
∈ ᏻ× F.
for an appropriate (4) The fiber of ᏸ over each v ∈ H is ᏸ(v) = V k,−k/2 . To describe the transition maps, let e be an edge of direction {j }, say, e = (v, i)(= (v, {i})), with 1 ≤ i ≤ rj . Then bot j e = v and topj e = v8j,i . Set < = 1 if s(topj e) = s(v)8j,i and < = −1 otherwise. Then ᏸ(e) : ᏸ(bot j e) → ᏸ(topj e) is given by (5)
< k1 ρ k,−k/2 8j,i .
Proof. Let K 0 (N ) be the principal congruence subgroup of level N in K 0 . Using the Eichler-Kneser strong approximation theorem, we see that the complex Y (K 0 (N)) is described as (K 0 ) \ ( × K 0 /CK 0 (N)), where C is the center of (K 0 ). Then X(N) is the connected component of Y (K 0 (N)) lying under × {1}. On the other hand, K 0 /CK 0 (N ) " (ᏹ/N ᏹ)× /B = H . Since (K 0 ) acts transitively on the set Ver × (ᏹ/N0 ᏹ)× , part (1) follows readily. The rest is routine. We directly show (without using ) that the transition maps ᏸ(e) as in (4) satisfy the flatness condition in (1). First, notice that v = topj2 topj1 σ indeed coincides with topj1 topj2 σ for any 2-cell σ = (v, ij1 , ij2 ) with 1 ≤ ij ≤ rj for j = j1 , j2 . Hence, the lifts of v and of v are related by s(v ) = < 8j1 ,ij1 8j2 ,ij2 s(v) = < 8j2 ,ij 8j1 ,ij v, ˜ 2
1
where the product of the signs corresponding to < and < is the image in H of u from equation (4). Condition (1) is an immediate consequence.
The girth of a cubical complex is the length, namely, the number of edges, of the shortest homotopically nontrivial closed path. Set $q = \max_j p_j^{f_j}$ and $n = \mathrm{Nm}_{F/\mathbb{Q}} N_1$. We now have the following generalization of [14, Theorem 7.3.7].

Proposition 4.5. We assume that the group $H$ of $X(N)$ is $\mathrm{SL}_2(\mathcal{O}_F/N_1)$. Then the girth of $X(N)$ is at least $2\log_q(n^2/4^d)$. Put differently,
\[ \operatorname{girth} X(N) \ge \tfrac{4}{3}\log_q \#\mathrm{Ver}\,X(N) - \text{constant}. \]
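To get a feel for the size of the bound in Proposition 4.5, the expression $2\log_q(n^2/4^d)$ can simply be evaluated. The sketch below is illustrative only: the function name `girth_lower_bound` and the sample values $q = 5$, $n = 13$, $d = 1$ are assumptions chosen for the example, not data taken from the paper, and the code does not check that such parameters actually arise from a quaternion algebra satisfying the hypotheses.

```python
import math


def girth_lower_bound(q, n, d):
    """Evaluate 2 * log_q(n**2 / 4**d), the bound stated in Proposition 4.5.

    q: the largest residue field cardinality max_j p_j**f_j (must be > 1)
    n: the absolute norm of the prime ideal N_1 (must be positive)
    d: the degree of the totally real field F over Q (must be >= 1)
    """
    if q <= 1 or n <= 0 or d < 1:
        raise ValueError("expected q > 1, n > 0 and d >= 1")
    return 2 * math.log(n ** 2 / 4 ** d, q)


if __name__ == "__main__":
    # Hypothetical sample values: F = Q (d = 1), q = 5, n = 13.
    print(girth_lower_bound(5, 13, 1))
```

The bound is only informative when $n^2 > 4^d$; for smaller $n$ the expression is nonpositive and says nothing about the girth.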
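Proof. As in the 1-dimensional case, the girth is $\min \operatorname{dist}(x,\gamma x)$, the minimum taken over all vertices $x$ of the universal cover $\Delta$ and noncentral elements $\gamma$ of $\Gamma = \Gamma(K^0)$. Fix such $\gamma$ and $x$ where the minimum occurs. Then the distance is, by definition, the sum of the distances $\operatorname{dist}_j$ of the projections to the $j$th tree factor for $1 \le j \le g$. Moreover, $\operatorname{dist}_j \ge -2\operatorname{val}_{v_j}\operatorname{Tr}_{B/F}\gamma$, exactly as in [14, Lemma 7.3.2]. We have $N_1^2 \mid \mathrm{Nm}_{B/F}(\gamma - 1) = 2 - \operatorname{Tr}_{B/F}\gamma$. Also, since $G_{\infty_i} \cong \mathrm{SU}(2)$, we have $|\operatorname{Tr}_{B/F}\gamma|_{\infty_i} \le 2$ for all $1 \le i \le d$. Taking norms to $\mathbb{Q}$ gives that $\mathrm{Nm}_{F/\mathbb{Q}}(2 - \operatorname{Tr}_{B/F}\gamma)$ is of the form $n^2 m/\prod_j p_j^{l_jf_j}$ for some $m$, necessarily $\ge 1$, where $l_j = -\operatorname{val}_{v_j}(2 - \operatorname{Tr}_{B/F}\gamma)$. Hence,
\[ \bigl|\mathrm{Nm}_{F/\mathbb{Q}}(2 - \operatorname{Tr}_{B/F}\gamma)\bigr| \le \prod_{1\le i\le d}\bigl(2 + |\operatorname{Tr}_{B/F}\gamma|_{\infty_i}\bigr) \le 4^d, \]
so that $n^2/4^d \le \prod_j p_j^{l_jf_j} \le q^{\sum_j l_j}$. Therefore,
\[ \operatorname{girth} = \sum_j \operatorname{dist}_j \ge -2\sum_j \min\bigl(0, \operatorname{val}_{v_j}\operatorname{Tr}_{B/F}\gamma\bigr) = 2\sum_{l_j>0} l_j \ge 2\sum_j l_j \ge 2\log_q\frac{n^2}{4^d}. \tag{6} \]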
On the other hand, the cardinality of $\mathrm{Ver}\,X(N)$ is at least that of $\mathrm{PSL}_2(\mathcal{O}_F/N_1)$, which is at least a positive constant $c$ times $n^3$. This implies the last assertion.

Example 4.6. In [22, Chapter 5], one finds examples satisfying Condition 4.2. These suffice to give many types of examples. However, the regularities are generally assumed to satisfy some congruence conditions besides being primes or prime powers of a specific type. To get rid of such restrictions, it appears necessary to consider the general cases, whose finite description is somewhat messier and, hence, not discussed here.

(A) Take $F = \mathbb{Q}$ and let $B_2$ have discriminant 2. When $S$ consists of $g = 1$ prime $p \equiv 1 \bmod 4$, one gets the Lubotzky-Phillips-Sarnak graphs (see [15]). The nontrivial local systems over them are handled in [12]. Notice that the sign $\varepsilon$ in (5) is mistakenly missing there. The case of $g = 2$ distinct primes $p \equiv q \equiv 1 \bmod 4$, without
local systems, is studied for different reasons in [17]. The resulting complexes are $(p+1, q+1)$-regular.

(B) The case $F = \mathbb{Q}$, with $B_3$ of discriminant 3 and $N_0 = 2$, satisfies Condition 4.2.

(C) The case $F = \mathbb{Q}$, with $B_{13}$ of discriminant 13 and $N_0 = 1$, satisfies Condition 4.2. It is used to get 3-regular Ramanujan graphs in [6].

(D) To get regularities other than $p+1$, one must use fields other than $\mathbb{Q}$. The simplest case occurs when $F = \mathbb{Q}(\sqrt{5})$ and $B_F = B_2 \otimes F$, with $B_2$ as in (A), and $N_1 = 2\mathcal{O}_F$. Then $B_F$ is totally definite but unramified at all the finite places. If we assume the Ramanujan-Petersson conjecture for the relevant automorphic forms as in Theorem 3.1, then this gives irreducible Ramanujan local systems over infinitely many $(p^2+1)$-regular graphs for all primes $p \equiv \pm 2 \bmod 5$. It also gives irreducible Ramanujan local systems over infinitely many $(p+1, p+1)$-regular complexes for all primes $p \equiv \pm 1 \bmod 5$. Of course, one can combine regularities to get higher-dimensional examples. Unfortunately, Theorem 3.3 does not apply, and we have not found an adequate reference in the literature to prove the Ramanujan-Petersson conjecture in this case.

(E) For our last example, we take $F = \mathbb{Q}(\cos 2\pi/7)$ and set $B_F = B_2 \otimes F$ with $B_2$, as before. Then $B_F$ is ramified precisely at the prime 2 of $F$ and at the three infinite places of $F$. Here, Theorem 3.3 applies. This example allows us, therefore, to get infinitely many Ramanujan graphs of regularity $p^3+1$ (and Ramanujan local systems on them) for any $p$ that is not a square modulo 7 (see [16] for more general examples coming from function fields (with the trivial local system)). Our present example also gives infinitely many $(p+1, p+1)$-regular and $(p+1, p+1, p+1)$-regular Ramanujan complexes and Ramanujan local systems on them for any $p$ that is a square modulo 7. Again, such different regularities may be combined to form higher-dimensional examples.

References

[1] D. Blasius and J. D. Rogawski, Motives for Hilbert modular forms, Invent. Math. 114 (1993), 55–87.
[2] A. Borel and N. R. Wallach, Continuous Cohomology, Discrete Subgroups, and Representations of Reductive Groups, Ann. of Math. Stud. 94, Princeton Univ. Press, Princeton, 1980.
[3] J.-L. Brylinski and J.-P. Labesse, Cohomologie d’intersection et fonctions L de certaines variétés de Shimura, Ann. Sci. École Norm. Sup. (4) 17 (1984), 361–412.
[4] M. W. Buck, Expanders and diffusers, SIAM J. Algebraic Discrete Methods 7 (1986), 282–304.
[5] H. Carayol, Sur les représentations ℓ-adiques associées aux formes modulaires de Hilbert, Ann. Sci. École Norm. Sup. 19 (1986), 409–468.
[6] P. Chiu, Cubic Ramanujan graphs, Combinatorica 12 (1992), 275–285.
[7] P. Deligne, Formes modulaires et représentations ℓ-adiques, Séminaire Bourbaki 1968/69, exp. no. 355, Lecture Notes in Math. 179, Springer, Berlin, 1971, 139–172.
[8] P. Deligne, La conjecture de Weil, I, Inst. Hautes Études Sci. Publ. Math. 43 (1974), 273–307.
[9] P. Deligne, La conjecture de Weil, II, Inst. Hautes Études Sci. Publ. Math. 52 (1980), 137–252.
[10] H. Jacquet and R. P. Langlands, Automorphic Forms on GL(2), Lecture Notes in Math. 114, Springer, Berlin, 1970.
[11] B. W. Jordan and R. Livné, Integral Hodge theory and congruences between modular forms, Duke Math. J. 80 (1995), 419–484.
[12] B. W. Jordan and R. Livné, Ramanujan local systems on graphs, Topology 36 (1997), 1007–1024.
[13] H. Kesten, Symmetric random walks on groups, Trans. Amer. Math. Soc. 92 (1959), 336–354.
[14] A. Lubotzky, Discrete Groups, Expanding Graphs and Invariant Measures, Progr. Math. 125, Birkhäuser, Basel, 1994.
[15] A. Lubotzky, R. Phillips, and P. Sarnak, Ramanujan graphs, Combinatorica 8 (1988), 261–277.
[16] M. Morgenstern, Existence and explicit constructions of q + 1 regular Ramanujan graphs for every prime power q, J. Combin. Theory Ser. B 62 (1994), 44–62.
[17] S. Mozes, Actions of Cartan subgroups, Israel J. Math. 90 (1995), 253–294.
[18] I. Satake, “Spherical functions and Ramanujan conjecture” in Algebraic Groups and Discontinuous Subgroups (Boulder, Colo., 1965), Amer. Math. Soc., Providence, 1966, 258–264.
[19] J.-P. Serre, Homologie singulière des espaces fibrés: Applications, Ann. of Math. (2) 54 (1951), 425–505.
[20] J.-P. Serre, Arbres, amalgames, SL2, Astérisque 46, Soc. Math. France, Paris, 1977.
[21] U. Stuhler, Über die Kohomologie einiger arithmetischer Varietäten, I, Math. Ann. 273 (1986), 685–699.
[22] M.-F. Vignéras, Arithmétique des algèbres de quaternions, Lecture Notes in Math. 800, Springer, Berlin, 1980.
Jordan: Department of Mathematics, City University of New York, Baruch College, Box G-0930, 17 Lexington Avenue, New York, New York 10010, USA Livné: Mathematics Institute, Hebrew University of Jerusalem, Givat Ram, Jerusalem 91904, Israel; [email protected]
Vol. 105, No. 1
DUKE MATHEMATICAL JOURNAL
© 2000
CORRECTION TO “HÖLDER FOLIATIONS” CHARLES PUGH, MICHAEL SHUB, and AMIE WILKINSON
A. Török has pointed out to us the need for a better proof of [1, Theorem B]. Accordingly, the first two full paragraphs on [1, p. 539] should be replaced with the following argument.

We are trying to show that the subfoliation of the center unstable leaves by the strong unstable leaves is of class $C^1$. Let $W$ denote the disjoint union of the center unstable leaves: $W = \bigsqcup W^{cu}(p)$. It is a nonseparable manifold of class $C^1$. Partial hyperbolicity implies that its tangent bundle $TW = E^{cu}$ is continuous. The restriction of $TM$ to $W$ is a $C^1$ bundle $T_W M$ that contains the $C^0$ subbundle $TW$. Since $f$ is a diffeomorphism of class $C^2$, the tangent map $Tf : T_W M \to T_W M$ is a $C^1$ bundle isomorphism. As in the proof of Theorem A (see [1, pp. 527–538]), approximate $E^u$, $E^{cs}$ by smooth bundles $\tilde{E}^u$, $\tilde{E}^{cs}$, and express $Tf$ with respect to the splitting $TM = \tilde{E}^u \oplus \tilde{E}^{cs}$ as

$$\begin{pmatrix} A & B \\ C & K \end{pmatrix}.$$

Let $\mathcal{P}(1)$ be the bundle over $W$ whose fiber at $p$ is the set of linear maps $P : \tilde{E}^u_p \to \tilde{E}^{cs}_p$ such that $\|P\| \le 1$. The linear graph transform sends $P$ to

$$Tf(P) = (C + KP) \circ (A + BP)^{-1}.$$

It is a bundle map that covers the identity on $W$, contracts fibers by approximately $\|K\|\,\|A^{-1}\| \doteq \|T^c f\| / m(T^u f)$, and contracts the base, at worst, by approximately $m(A) \doteq m(T^c f)$. The unique invariant section $p \mapsto P_p$ of $\mathcal{P}(1)$ of $Tf$ has graph $P_p = E^u_p$. Center bunching implies that

$$\frac{\text{fiber contraction}}{\text{base contraction}} \doteq \frac{\|T^c f\|}{m(T^u f)}\, m(T^c f)^{-1} < 1.$$

Received 16 August 1999. 2000 Mathematics Subject Classification. Primary 37D30.
So fiber contraction dominates base contraction, and the invariant section is of class $C^1$. That is, $E^u$ is a $C^1$ bundle over the $C^1$ manifold $W$. Since $E^u$ is tangent to the foliation $\mathcal{W}^u$, it is integrable. Frobenius’s theorem states that the foliation tangent to a $C^k$ integrable subbundle of $TW$ is of class $C^k$, in the sense that there is a $C^k$ atlas of foliation charts covering the manifold $W$. Strictly speaking, the proof requires that the underlying manifold be of class $C^{k+1}$, so we need to recheck the result in the case of the $C^1$ manifold $W$.

Locally, $W^{cu}(p)$ is the graph of a $C^1$ function $g : E^{cu}_p \to E^s_p$. The linear projection $\pi : E^{cu}_p \times E^s_p \to E^{cu}_p$ restricts to a $C^1$ diffeomorphism

$$\pi_p : W^{cu}(p) \to E^{cu}_p, \qquad \pi_p : (x, g(x)) \mapsto x.$$

The tangent to $\pi$ gives a $C^1$ bundle surjection $T\pi : T_{W^{cu}(p)} M \to TE^{cu}_p$. The restriction of $T\pi$ to $E^u|_{W^{cu}(p)}$ agrees with $T\pi_p$, which implies that

$$T\pi_p : E^u|_{W^{cu}(p)} \to T\pi_p\big(E^u|_{W^{cu}(p)}\big)$$

is a $C^1$ bundle isomorphism. The latter bundle is $C^1$ and is integrated by the foliation $\pi_p(\mathcal{W}^u)$. Since $E^{cu}(p)$ is smooth (being a plane), we can apply Frobenius’s theorem to conclude that the foliation $\pi_p(\mathcal{W}^u)$ is $C^1$. Therefore, the foliation $\mathcal{W}^u|_{\mathcal{W}^{cu}(p)}$ is also of class $C^1$.

References

[1] C. Pugh, M. Shub, and A. Wilkinson, Hölder foliations, Duke Math. J. 86 (1997), 517–546.
Pugh: Department of Mathematics, University of California at Berkeley, Berkeley, California 94720-3840, USA Shub: IBM, T. J. Watson Research Center, Yorktown Heights, New York 10598, USA; [email protected] Wilkinson: Department of Mathematics, Northwestern University, 2033 Sheridan Road, Evanston, Illinois 60208-2730, USA; [email protected]
Vol. 105, No. 1
DUKE MATHEMATICAL JOURNAL
© 2000
ON A REFINEMENT OF WARING’S PROBLEM VAN H. VU
§1. Introduction

§1.1. The problem and the result. In this paper $\mathbb{N}_0$ denotes the set of nonnegative integers. A subset $S$ of $\mathbb{N}_0$ is a basis of order $r$ if every positive integer can be represented as the sum of $r$ elements in $S$. The most trivial basis is $\mathbb{N}_0$ itself, while the most interesting ones are probably the sets of $k$th powers ($k = 2, 3, \dots$). Waring’s classical problem (first solved by Hilbert [Hil]) asserts that for any fixed $k$ and $s$ sufficiently large, every positive integer can be represented as a sum of $s$ $k$th powers. For instance, every positive integer is a sum of four squares, nine cubes, and so on. Using the Hardy-Littlewood circle method, one can actually estimate the number of representations. The following theorem is classical (see [Vau] and [Nat2], for instance).

Theorem 1.1. For any fixed $k \ge 2$, there is a constant $s_1(k)$ such that if $s > s_1(k)$, then $R^s_{\mathbb{N}_0^k}(n)$, the number of representations of $n$ as a sum of $s$ $k$th powers, satisfies

$$R^s_{\mathbb{N}_0^k}(n) = \Theta\big(n^{s/k-1}\big)$$

for every positive integer $n$.

Theorem 1.1 (proved by Vinogradov and also many others) shows that the set $\mathbb{N}_0^k$ of $k$th powers is not only a basis but also a very rich one; that is, the number of representations of $n$ is huge for all $n$. (Theorem 1.1 also holds for $k = 1$ as a trivial fact.) A natural question is whether $\mathbb{N}_0^k$ contains a subset $X$ that is a thin basis (sometimes we call $X$ a subbasis of $\mathbb{N}_0^k$); that is, for every positive integer $n$, $R^s_X(n)$ is positive but small. The study of thin bases was started by Rohrbach and Sidon in the 1930s and has since then attracted considerable attention from both combinatorialists and number theorists (see [Erd], [EN], [CEN], [Ruz], [Nat1], [Zöl1], [Zöl2], [Wir], [Spe], [ER], [ET], and [HR]).

How small? one may wonder. A very old, but still unsolved, conjecture of Erdős and Turán [ET] states that if $X$ is a basis of order 2, then $\limsup_{n\to\infty} R^2_X(n) = \infty$. Since this conjecture is commonly believed to be true even for arbitrary order, the best we can hope for is to prove that there exists $X \subset \mathbb{N}_0^k$ such that $R^s_X(n)$ is a positive but slowly increasing function in $n$. The objective of this paper is to prove the following theorem.

Received 25 May 1999. Revision received 24 January 2000. 2000 Mathematics Subject Classification. Primary 11P05; Secondary 05D40. Author’s work supported by a grant from the state of New Jersey.
Theorem 1.2 (Main theorem). For any fixed $k \ge 2$, there is a constant $s_2(k)$ such that if $s > s_2(k)$, then $\mathbb{N}_0^k$ contains a subset $X$ such that $R^s_X(n) = \Theta(\log n)$ for every positive integer $n \ge 2$ and $R^s_X(1) = 1$.
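To make the counting function in Theorems 1.1 and 1.2 concrete, the following minimal sketch counts representations of $n$ as a sum of $s$ $k$th powers of nonnegative integers. It is only an illustration: the function name `count_representations` is hypothetical, and the choice to count unordered representations (multisets of bases) is an assumption made here to match the convention "not counting permutations" used later for $R^s_X(n)$.

```python
from functools import lru_cache


def count_representations(n: int, s: int, k: int) -> int:
    """Count multisets {x_1, ..., x_s} of nonnegative integers with
    x_1^k + ... + x_s^k = n (permutations are not counted separately)."""

    @lru_cache(maxsize=None)
    def count(remaining: int, parts: int, largest: int) -> int:
        # Ways to write `remaining` as a sum of `parts` k-th powers whose
        # bases are chosen in non-increasing order, each at most `largest`.
        if parts == 0:
            return 1 if remaining == 0 else 0
        total = 0
        x = 0
        while x <= largest and x ** k <= remaining:
            total += count(remaining - x ** k, parts - 1, x)
            x += 1
        return total

    # Largest base whose k-th power does not exceed n.
    bound = 0
    while (bound + 1) ** k <= n:
        bound += 1
    return count(n, s, bound)
```

For instance, with $k = 2$ and $s = 4$ every positive integer has at least one representation (the four squares statement quoted above), while 7 has none with $s = 3$.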
A simpler question is whether there is a subbasis $X$ of $\mathbb{N}_0^k$ with small density. This question has been investigated by several researchers for the case $k = 2$ (see [EN], [CEN], [Zöl1], [Zöl2], [Wir], and [Spe]). Denote by $X(m)$ the number of elements of $X$ not exceeding $m$. Choi, Erdős, and Nathanson [CEN] proved that $\mathbb{N}_0^2$ contains a subbasis $X$ of order 4, where $X(m) \le m^{1/3+\epsilon}$. Improving this result, Zöllner [Zöl1], [Zöl2] showed that for any $s \ge 4$ there is a subbasis $X \subset \mathbb{N}_0^2$ of order $s$ satisfying $X(m) \le m^{1/s+\epsilon}$ for arbitrary positive $\epsilon$. Prior to this paper, the strongest result in this direction was due to Wirsing [Wir], who, sharpening Zöllner’s theorem, proved that for any $s \ge 4$ there is a subbasis $X \subset \mathbb{N}_0^2$ of order $s$ satisfying $X(m) = O(m^{1/s} \log^{1/s} m)$. It is easy to see, via the pigeon-hole principle, that this result is best possible, up to the log term. A short proof of Wirsing’s result for the case $s = 4$ was given by Spencer [Spe].

For $k \ge 3$, little has been known. In 1980, Nathanson [Nat1] proved that $\mathbb{N}_0^k$ contains a subbasis with density $o(m^{1/k})$. ($\mathbb{N}_0^k$ itself has density $m^{1/k}$.) In the same paper, he posed the following question.

Question 1.3. Let $k \ge 2$ and $s$ be fixed, positive integers, where $s$ is sufficiently large compared to $k$. What is the smallest density of a subbasis of order $s$ of $\mathbb{N}_0^k$? Can it be $m^{1/s+o(1)}$?

It is clear that the density $m^{1/s+o(1)}$ is best possible up to the $o(1)$ term. A simple corollary of the main theorem affirmatively answers Nathanson’s question by showing that this density can be achieved. Fix $k \ge 2$, and consider the subbasis $X \subset \mathbb{N}_0^k$ described in the main theorem. By the pigeon-hole principle, it is trivial that $X(m) = O(m^{1/s} \log^{1/s} m) = m^{1/s+o(1)}$. We mention here that, although this density result is a corollary of the main theorem, it can be proved directly with considerably less effort (see Theorem 4.1). However, the proof does make use of the machinery developed for the main theorem.

Finally, let us discuss the case $k = 1$. For this case, the statement of the main theorem was proved by Erdős and Tetali [ETe]. The key advantage of having $k = 1$ is that $\mathbb{N}_0 = \mathbb{N}_0^1$ is a trivial basis, and therefore the problem involves no number theoretical difficulty. The method developed in the present paper allows us (with a very simple proof) to extend Erdős-Tetali’s result to a more general kind of representation (see Theorem 4.4). On the other hand, some ideas from [ETe], especially the one used in Lemma 1.5, prove useful in our situation.

The rest of the paper is organized as follows. In the next subsection, we describe our main ideas. We highly recommend reading this part carefully before moving to the detailed proof. The key ingredients of the proof are two lemmas: one is a new large
deviation bound and the other is an estimate for the number of roots of the equation $x_1^k + \cdots + x_s^k = n$ in a restricted region. In the last part of this section (see §1.3), we present the first key lemma and some other probabilistic tools; §2 is devoted to the proof of the second key lemma. The proof of the main theorem is presented in §3; §4 contains several related results, including the generalization of Erdős-Tetali’s result and a short proof of the existence of a subbasis with small density answering Nathanson’s question.

Notation. We use standard asymptotic notation such as $o$, $O$, $\Theta$, $\omega$, $\Omega$, mostly under the assumption that $n \to \infty$. For $0 \le a_j \le b_j$, $j = 1, \dots, s$, the box $\prod_{j=1}^s [a_j, b_j]$ is the set of all $s$-tuples $(x_1, \dots, x_s)$, where $x_j \in \mathbb{N}_0 \cap [a_j, b_j]$. A box is empty if no such tuple exists. Throughout, $E(F)$ denotes the expectation of a random variable $F$ and $\Pr(A)$ is the probability of an event $A$. We call a random variable $t$ a $\{0, 1\}$-random variable if it takes only two values 0 or 1. By saying that “$A(n)$ holds for all sufficiently large $n$,” we always mean that there is a constant $N_0$ such that for all $n \ge N_0$ the event $A(n)$ holds.

§1.2. The main ideas. Our proof is probabilistic. This approach was introduced by Erdős [Erd] and remains the only effective way to produce a thin basis. A random subset $X \subset \mathbb{N}_0^k$ is defined by choosing, for each $x \in \mathbb{N}_0$, $x^k$ with probability $p_x = c x^{-1+k/s} \log^{1/s} x$ ($p_0 = 0$), where $c$ is a positive constant to be determined later. Let $t_x$ denote the characteristic random variable representing the choice of $x^k$: $t_x = 1$ if $x^k$ is chosen and 0 otherwise. Thus $\Pr(t_x = 1) = p_x$ and $\Pr(t_x = 0) = 1 - p_x$, and the $t_x$’s are mutually independent. The number of representations of $n$ (not counting permutations), restricted to $X$, is a random variable and can be expressed as

$$R^s_X(n) = \sum_{x_1^k + \cdots + x_s^k = n} \ \prod_{j=1}^{s} t_{x_j} =: G\big(t_1, \dots, t_{n^{1/k}}\big). \tag{1.1}$$
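The random construction and the counting function in (1.1) can be simulated directly. The sketch below is illustrative only: the helper names (`selection_probability`, `sample_bases`, `restricted_count`) are hypothetical, and clipping $p_x$ to at most 1 is an assumption added here so that the formula always yields a valid probability for small $x$.

```python
import math
import random


def selection_probability(x: int, k: int, s: int, c: float) -> float:
    """p_x = c * x^(-1 + k/s) * (log x)^(1/s), clipped to [0, 1] (assumption)."""
    if x <= 1:
        # p_0 = 0 by definition; the formula also gives p_1 = 0 since log 1 = 0.
        return 0.0
    p = c * x ** (-1.0 + k / s) * math.log(x) ** (1.0 / s)
    return min(1.0, p)


def sample_bases(limit: int, k: int, s: int, c: float, rng: random.Random) -> set:
    """Independently keep each base x in [0, limit] with probability p_x.
    The random set X of k-th powers is {x^k : x in the returned set}."""
    return {x for x in range(limit + 1)
            if rng.random() < selection_probability(x, k, s, c)}


def restricted_count(n: int, s: int, k: int, bases: set) -> int:
    """R^s_X(n): number of multisets of s chosen bases whose k-th powers sum to n."""
    usable = sorted(x for x in bases if x ** k <= n)

    def count(remaining: int, parts: int, start: int) -> int:
        if parts == 0:
            return 1 if remaining == 0 else 0
        total = 0
        # Bases are taken in non-decreasing order so permutations are not recounted.
        for idx in range(start, len(usable)):
            power = usable[idx] ** k
            if power > remaining:
                break
            total += count(remaining - power, parts - 1, idx)
        return total

    return count(n, s, 0)
```

Passing a seeded `random.Random` instance keeps an experiment reproducible; the count restricted to a sampled set can never exceed the count over all bases.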
In order to prove the main theorem, we show that almost surely (a.s.) $X$ satisfies $R^s_X(n) = \Theta(\log n)$ for every sufficiently large $n$ ($n \ge N_0$ for some constant $N_0$). To obtain the result for all $n$, simply add (if needed) $0, 1^k, 2^k, \dots, \lfloor N_0^{1/k} \rfloor^k$ to $X$. This guarantees that every $n \in \mathbb{N}_0$ has at least one representation in $X$. Moreover, for large $n$, it turns out that adding a few elements to $X$ does not change the order of magnitude of $R^s_X(n)$.

The plan of the proof is as follows. For each (sufficiently large) $n$, we show that
(A) $E(R^s_X(n)) = \Theta(\log n)$;
(B) there is a small constant $\epsilon$ such that with probability at least $1 - O(n^{-2})$,

$$1 - \epsilon \le \frac{R^s_X(n)}{E\big(R^s_X(n)\big)} \le 1 + \epsilon.$$
(A) and (B) together yield that there is a constant $N_1$ such that for any $n \ge N_1$, $R^s_X(n) = \Theta(\log n)$ with probability at least $1 - O(n^{-2})$. Since $\sum_{n=N_1}^{\infty} n^{-2}$ converges, one can use the classical Borel-Cantelli lemma (see Lemma 1.6) to deduce that $R^s_X(n) = \Theta(\log n)$ holds a.s. for all $n \ge N_0$, where $N_0$ is some other constant, completing the proof.

There are two serious problems we need to solve in order to carry out our plan. The first problem is to estimate $E(R^s_X)$. Following (1.1), we have

$$E\big(R^s_X(n)\big) = \sum_{x_1^k + \cdots + x_s^k = n} c^s \prod_{j=1}^{s} x_j^{-1+k/s} \log^{1/s} x_j. \tag{1.2}$$
To see that the right-hand side of (1.2) has order $\Theta(\log n)$, one may argue as follows. A typical solution $(x_1, \dots, x_s)$ of $\sum_{j=1}^s x_j^k = n$ should satisfy $x_j = \Theta(n^{1/k})$ for all $j$. Thus, a typical term in the sum has order $\Theta(n^{-s/k+1} \log n)$. On the other hand, by Theorem 1.1, the number of terms is $\Theta(n^{s/k-1})$, and we would be done by taking the product. The trouble is that there could be many nontypical solutions with much larger contributions to the sum. For instance, assume that $x = (x_1, \dots, x_s)$ is a solution where $1 \le x_j \le P_j$ and some of the $P_j$’s are considerably smaller than $n^{1/k}$ (e.g., $P_1 = n^{\epsilon}$ with $\epsilon \ll 1/k$). The contribution of the term corresponding to $x$ is at least $\Omega\big(\prod_{j=1}^s P_j^{-1+k/s}\big)$, which is significantly larger than the contribution of a typical term. To overcome this difficulty, we need an upper bound on the number of solutions of the equation $\sum_{j=1}^s x_j^k = n$, restricted to a box of type $\prod_{j=1}^s [1, P_j]$, for arbitrary $P_1, \dots, P_s$. Such a bound is given in Lemma 2.1, one of the main lemmas of this paper, using the circle method. From the number theoretical point of view, this lemma may be of independent interest.

The second difficulty is to prove a concentration result on $R^s_X(n)$. As shown in (1.1), $R^s_X$ is a sum of random variables of form $\prod_{j=1}^s t_{x_j}$. Since these random variables are not independent, it is not trivial to show that their sum is strongly concentrated. The solution to this problem is a general concentration result, found recently by the present author in [Vu4]. This concentration result (see Lemma 1.4) gives very good large deviation bounds for multivariate polynomials with positive coefficients. To apply the result, we need to exploit the fact that $R^s_X(n)$ can be expressed as a polynomial in $t_1, \dots, t_{n^{1/k}}$.

§1.3. Probabilistic lemmas. Assume that $t_1, \dots, t_m$ are independent (but not necessarily identically distributed) $\{0, 1\}$-random variables. We are talking about probability in the product space generated by the $t_i$’s.
Consider a polynomial F (t) = F (t1 , . . . , tm ) of degree s; we say that F is positive if its coefficients are positive. Furthermore, F is homogeneous of degree s if each of its monomials has degree s. Finally, we say that F is normal if its coefficients are at most 1. For each set A of at most s indices (with possible repetitions), let ∂A (F ) denote the partial derivative of
$F$ with respect to the indices in $A$; furthermore, let $E_A(F)$ denote the expectation of the function $\partial_A(F)$.

Example. Let $F = t_1^2 t_2^2 t_3 + t_4^5$, and let $A_1 = \{1, 2\}$, $A_2 = \{1, 1, 3\}$. Then $\partial_{A_1}(F) = 4 t_1 t_2 t_3$ and $\partial_{A_2}(F) = 2 t_2^2$. If the expectation of $t_i$ is $p_i$, then $E_{A_1}(F) = 4 p_1 p_2 p_3$ and $E_{A_2}(F) = 2 p_2$. Furthermore, $F$ is normal and homogeneous with degree 5.

The following concentration result is proved in [Vu2].

Lemma 1.4. For any positive constants $s$, $k$, $\alpha$, $\epsilon$, there is a constant $g = g(s, k, \alpha, \epsilon)$ such that the following hold. If $F(t_1, \dots, t_n)$ is a positive, normal, homogeneous polynomial of degree $s$ satisfying the following two conditions
• $E(F) > g \log n$,
• $E_A(F) < n^{-\alpha}$, for all $A$, $1 \le |A| \le s - 1$,
then $\Pr\big(|F - E(F)| > \epsilon E(F)\big) < n^{-2k}$.

Concentration in product spaces is a fundamental and well-studied topic in probability theory. A wonderful survey by Talagrand [Tal] gives an insightful look on this subject and provides a wide range of results based on the following phenomenon: “A smooth function concentrates strongly around its expected value.” The usual way of defining smoothness is to require that the function in question has a small Lipschitz coefficient (in other words, changing the value of one variable should not change the value of the function significantly). This definition of smoothness, unfortunately, is not usable in our case. Consider the function $G$ in (1.1). If a number $x$ participates in $\mu(x)$ representations of $n$, then by changing $t_x$, in the worst case, $G$ could change by $\mu(x)$. On the other hand, by Theorem 1.1, $\mu(x)$ is at least a positive power of $n$ for most $x$’s. So the Lipschitz coefficient of $G$ is too large for us to apply standard concentration results. Lemma 1.4 arises from a new way of looking at “smoothness.” We say that a function $F$ is smooth if each partial derivative (of any positive order) of $F$ has small expectation.
It turns out that this definition of smoothness is sufficient to guarantee strong concentration for an important class of functions: positive polynomials with low degrees. The first concentration result of this type was proved in [KV1] by Kim and the present author, in a somewhat different terminology. Many other results, including Lemma 1.4, were proved in [Vu2] and [Vu4]. Thanks to the very mild assumption on smoothness, these results can be applied even when the function in question has a larger Lipschitz coefficient. This feature is exactly what we need in the present case. In the last few years, these new concentration results have found many applications in various areas, ranging from the theory of random graphs to finite geometry (see [KV1], [KV2], [Vu1], [Vu3], and [Vu4]). For instance, one of these results plays a significant role in the solution of a long-standing problem of Segre in finite geometry (see [Seg] and [KV2]).
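The quantities $\partial_A(F)$ and $E_A(F)$ from the Example in this subsection can be checked mechanically. The following is a small sketch under stated assumptions: the polynomial encoding (a dictionary from monomials to coefficients) and the function names are hypothetical choices made for illustration, and the expectation uses the fact that $t^e = t$ for a $\{0,1\}$-valued variable together with independence.

```python
# A polynomial is a dict mapping a monomial to its coefficient, where a
# monomial is a tuple of (variable index, exponent) pairs sorted by index.

def derivative(poly, index):
    """Partial derivative of `poly` with respect to the variable t_index."""
    result = {}
    for monomial, coeff in poly.items():
        exponents = dict(monomial)
        e = exponents.get(index, 0)
        if e == 0:
            continue  # the monomial does not involve t_index
        exponents[index] = e - 1
        new_monomial = tuple(sorted((i, p) for i, p in exponents.items() if p > 0))
        result[new_monomial] = result.get(new_monomial, 0) + coeff * e
    return result


def partial(poly, indices):
    """Apply the partial derivative for every index in `indices` (with repetition)."""
    for index in indices:
        poly = derivative(poly, index)
    return poly


def expectation(poly, probs):
    """E(poly) for independent {0,1}-variables with Pr(t_i = 1) = probs[i]."""
    total = 0.0
    for monomial, coeff in poly.items():
        term = coeff
        for i, _exponent in monomial:
            term *= probs[i]  # E(t_i^e) = p_i for any exponent e >= 1
        total += term
    return total


# F = t_1^2 t_2^2 t_3 + t_4^5, as in the Example.
F = {((1, 2), (2, 2), (3, 1)): 1, ((4, 5),): 1}
```

With this encoding, `partial(F, [1, 2])` is the single monomial $4 t_1 t_2 t_3$ and `partial(F, [1, 1, 3])` is $2 t_2^2$, in agreement with the Example.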
To present the second probabilistic lemma, we need some new notation. Let $Y$ be a random sequence defined by choosing each number $m \in \mathbb{N}_0$ with probability $p_m$. Similar to §1.2, we denote by $t_m$ the random variable representing the choice of $m$. For all $l$ and $n$, let

$$Q^l(n) = \Big\{ x = (x_1, \dots, x_l) \ \Big|\ \sum_{j=1}^{l} x_j = n,\ x_j \in \mathbb{N}_0 \Big\},$$

and let $Q^l_Y(n)$ be the restriction of $Q^l(n)$ on $Y$,

$$Q^l_Y(n) = \Big\{ x = (x_1, \dots, x_l) \ \Big|\ \sum_{j=1}^{l} x_j = n,\ x_j \in \mathbb{N}_0 \cap Y \Big\}.$$
Here and in the rest of the paper, we think of $x$ as both a sequence and a set. Two $l$-sets $x$ and $x'$ are disjoint if $x \cap x' = \emptyset$. We denote by $\mathrm{Disj}(Q^l_Y(n))$ the maximum $h$ such that $Q^l_Y(n)$ contains $h$ pairwise disjoint tuples. Observe that $R^l_Y(n) = |Q^l_Y(n)|$ by definition.

Lemma 1.5. Let $Y$ be a random sequence defined as above. Assume that there is a positive constant $\alpha$ and a positive integer $l$ such that for all $j \le l$ and for all $n$, $E(R^j_Y(n)) = O(n^{-\alpha})$. Then there are constants $K_1$ and $K_2$ such that the following statements hold:
• with probability at least 0.9, $\mathrm{Disj}(Q^j_Y(n)) \le K_1$ for every $1 \le j \le l$ and every $n$;
• with probability at least 0.9, $R^j_Y(n) \le K_2$ for every $1 \le j \le l$ and every $n$.

Proof. By the assumption of the lemma, there is a positive constant $c$ such that

$$E\big(R^j_Y(n)\big) = E\Big(\sum_{x \in Q^j(n)} \ \prod_{x_i \in x} t_{x_i}\Big) \le c n^{-\alpha}$$

for all $j \le l$ and $n$. For a fixed number $h$, let $\mathcal{F}_h$ be the family of all sets of $h$ pairwise disjoint $j$-tuples in $Q^j(n)$; each element $A$ of $\mathcal{F}_h$ consists of $h$ mutually disjoint sets $\mathbf{x}_1, \dots, \mathbf{x}_h$. The probability that $\mathrm{Disj}(Q^j_Y(n)) \ge h$ is therefore at most

$$E\Big(\sum_{A \in \mathcal{F}_h} \ \prod_{\mathbf{x}_m \in A} \ \prod_{x_i \in \mathbf{x}_m} t_{x_i}\Big) \le \Big(E\Big(\sum_{x \in Q^j(n)} \ \prod_{x_i \in x} t_{x_i}\Big)\Big)^{h} \le c^h n^{-\alpha h}.$$
Set $K_1 = 2/\alpha$; then $\Pr\big(\mathrm{Disj}(Q^j_Y(n)) \ge K_1\big) = O(n^{-2})$ for all $n$. By Borel-Cantelli’s lemma (see Lemma 1.6), the first statement follows. To prove the second statement, we make use of Erdős-Rado’s sunflower lemma [ERa], following an elegant idea in [ETe]. The sunflower lemma asserts that if $e_1, \dots, e_m$ are subsets of size $r$ of the same ground set, and $m > (K - 1)^r r!$, then
there are $K$ subsets $e_{j_1}, \dots, e_{j_K}$ such that the intersections $e_{j_i} \cap e_{j_{i'}}$ are the same for all $1 \le i \ne i' \le K$. (These $K$ subsets form a sunflower.) Set $K_2 = (K_1 - 1)^l\, l! + 1$. By the sunflower lemma, if $R^j_Y(n) \ge K_2$, then there are $K_1$ $j$-tuples $x_1, \dots, x_{K_1} \in Q^j_Y(n)$ and a set $U$ such that

$$x_i \cap x_{i'} = U$$

for all $1 \le i \ne i' \le K_1$. It then follows that $\mathrm{Disj}(Q^{j'}_Y(n')) > K_1$, where $j' = j - |U|$ and $n' = n - \sum_{y \in U} y$. We finish by applying the first statement.

Lemma 1.6 (Borel-Cantelli). Let $A_i$ be a sequence of events in a probability space. If the series $\sum_{i=1}^{\infty} \Pr(A_i)$ converges, then with probability 1, at most a finite number of the events $A_i$ can occur.

We omit the simple proof of this lemma. For a detailed discussion about the use of Borel-Cantelli’s lemma in infinite product spaces, we refer to Halberstam and Roth [HR].

§2. Number of solutions in a fixed box. In this section, we give an upper bound for the number of solutions of the equation $\sum_{j=1}^s x_j^k = n$, restricted to a box $\prod_{j=1}^s [1, P_j]$. The (rather involved) proof of this upper bound (see Lemma 2.1) is independent of the rest of the proof of the main theorem, so the reader may want to skip the details and check them later after reading the proof of the main theorem.

§2.1. The result. For a fixed sequence $P_1, \dots, P_s$, let $\mathrm{Root}(P_1, \dots, P_s)$ denote the number of solutions of the equation

$$\sum_{j=1}^{s} x_j^k = n$$

in the box $\prod_{j=1}^s [1, P_j]$.

Lemma 2.1. For a fixed positive integer $k \ge 2$, there exists a constant $s_3(k) = O(8^k k^2)$ such that the following holds. For any constant $s > s_3(k)$, there is a positive constant $\delta = \delta(k, s)$ such that for every sequence $P_1, \dots, P_s$ of positive integers,

$$\mathrm{Root}(P_1, \dots, P_s) = O\Big(n^{-1} \prod_{j=1}^{s} P_j + \prod_{j=1}^{s} P_j^{1-k/s-\delta}\Big)$$

for all $n$.

Let us first give the reader some feeling about this upper bound. Theorem 1.1 asserts that there are $O(n^{s/k-1})$ roots in the big box $\prod_{j=1}^s [0, n^{1/k}]$, which has volume $n^{s/k}$. On the other hand, the box $\prod_{j=1}^s [1, P_j]$ has volume $\prod_{j=1}^s P_j$, so by averaging we
may expect that the number of roots in this smaller box is $O\big(n^{-1} \prod_{j=1}^s P_j\big)$. This (to a certain extent) explains the first term in the bound. The second term can be seen as an error term that comes from analytic arguments. However, this error term is negligible if and only if all $P_j$’s are close to $n^{1/k}$. It is easy to deduce from the proof of Lemma 2.1 that there is a positive constant $\eta = \eta(k, s)$ such that if $P_s^{1-\eta} \le P_i \le P_s = n^{1/k}$ for all $1 \le i < s$, then $\mathrm{Root}(P_1, \dots, P_s) = \Theta\big(n^{-1} \prod_{j=1}^s P_j\big)$ (see Corollary 2.11). This statement contains Theorem 1.1 as a special case where all $P_j = n^{1/k}$. For a general sequence $P_1, \dots, P_s$, one cannot hope for a lower bound, since a very degenerate box, even with a big volume, may contain no roots.

The proof of Lemma 2.1 is presented in the rest of this section. To this end, we set $K = 2^{k-1}$ and denote by $P$ the product $\prod_{j=1}^s P_j$ and by $e(\alpha)$ the function $e^{2\pi i \alpha}$. Without loss of generality, we may also assume that $2 \le P_1 \le P_2 \le \cdots \le P_s$. (In fact, we may also assume that $P_1$ is sufficiently large, whenever needed.) It is trivial that if $P_s < (n/s)^{1/k}$, then the equation has no roots, so we may assume that $P_s = \Theta(n^{1/k})$.

We use the circle method to prove Lemma 2.1. Letting $f_j(\alpha) = \sum_{m=1}^{P_j} e(\alpha m^k)$, we have

$$\mathrm{Root}(P_1, \dots, P_s) = \int_0^1 \prod_{j=1}^{s} f_j(\alpha)\, e(-\alpha n)\, d\alpha. \tag{2.1}$$
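Identity (2.1) can be checked numerically on small boxes. The sketch below is an illustration under stated assumptions, not part of the proof: the function names are hypothetical, and the integral is replaced by an average over the $N$ points $\alpha = t/N$, which is exact here because the integrand is a trigonometric polynomial whose frequencies are integers of absolute value less than $N$.

```python
import cmath
from itertools import product


def e(numerator: int, denominator: int) -> complex:
    """e(numerator/denominator) = exp(2*pi*i*numerator/denominator)."""
    return cmath.exp(2j * cmath.pi * (numerator % denominator) / denominator)


def root_brute_force(n: int, k: int, P) -> int:
    """Count solutions of x_1^k + ... + x_s^k = n with 1 <= x_j <= P_j."""
    boxes = (range(1, p + 1) for p in P)
    return sum(1 for xs in product(*boxes) if sum(x ** k for x in xs) == n)


def root_circle(n: int, k: int, P) -> int:
    """Evaluate the right-hand side of (2.1) by averaging over alpha = t/N."""
    # N exceeds both n and the largest possible value of the sum of powers.
    N = max(n, sum(p ** k for p in P)) + 1
    total = 0j
    for t in range(N):
        term = e(-t * n, N)
        for p in P:
            # f_j(t/N) = sum over 1 <= m <= P_j of e(t * m^k / N)
            term *= sum(e(t * m ** k, N) for m in range(1, p + 1))
        total += term
    return round((total / N).real)
```

For example, with $k = 2$ and the box $[1,3] \times [1,4] \times [1,5]$, both functions return the same count for every target $n$ up to $50$.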
In particular, when $P_1 = P_2 = \cdots = P_s = n^{1/k}$, we deal with the special integral $\int_0^1 f_1^s(\alpha)\, d\alpha$. This integral was estimated by Vinogradov and several other authors (see [Vau], for instance) leading to Theorem 1.1. The main obstacle in the estimate of the more general integral in (2.1) is the possible irregularity of the sequence $P_1, \dots, P_s$. Since the sequence is arbitrary, the elements $P_j$ can be very large or very small compared to each other. Therefore, the functions $f_j(\alpha)$ can have totally different behavior, and for this reason most arguments used for the case that the $P_j$ are all equal cannot be repeated.

As is well known, the circle method first breaks the unit interval into major and minor arcs and then uses different arguments to estimate the integral over these arcs. The crucial step here is to find a proper definition for these arcs. Because of the possible irregularity of the sequence $P_1, \dots, P_s$, a good definition has to take into account some information about the distribution of the $P_j$’s. This leads to a somewhat subtle definition presented below. The reader will see that this definition is, in fact, the very heart of our proof.

§2.2. Definition of the major and minor arcs. Until the end of the proof we set

$$\tau = \frac{3.5\nu}{k}, \qquad \chi = \frac{(26K + 4)\nu}{k}, \qquad \nu = \frac{1}{2(26K + 17)}.$$

The reader should not be puzzled by these strange looking constants as they are defined to satisfy several inequalities presented later in the proof. Let $s_3(k) = dK^3 k^2$,
where $d$ is a constant such that for all $s > s_3(k)$ the following holds:

$$\frac{\tau \nu s}{\sum_{j=1}^{s} 1/j} \ge K(k + 1).$$

Lemma 2.2. There is $1 \le j \le s$ such that $P_j \ge P^{(1/s)(1-\tau) + K(k+1)/(\nu s (s-j+1))}$.

Proof. If the inequality in the lemma did not hold for any $j$, then, taking the product over all $j$, we would obtain

$$P < P^{(1-\tau) + \sum_{j=1}^{s} (1/(s-j+1))(K(k+1)/\nu s)} = P^{(1-\tau) + \sum_{j=1}^{s} (1/j)(K(k+1)/\nu s)},$$

which would imply

$$\tau < \sum_{j=1}^{s} \frac{1}{j} \cdot \frac{K(k+1)}{\nu s},$$

a contradiction.

Let $l$ be the smallest $j$ satisfying the inequality in the previous lemma. It follows that

$$P_l^{(s-l+1)\nu} \ge P^{K(k+1)/s} \tag{2.2}$$

and

$$P_l \ge P^{(1/s)(1-\tau)}. \tag{2.3}$$

Now we are ready to define the arcs. Set $\rho = P_l^{\nu}$, $B = P_l^{k-\nu}$, and $\beta = B^{-1}$. For all $1 \le a \le q \le \rho$ and $(a, q) = 1$, let $I_{a,q}$ denote the interval $(a/q - \beta, a/q + \beta)$. The intervals $I_{a,q}$ are called the major arcs, and we denote by $M$ their (disjoint) union. To avoid half-intervals we consider the unit interval $[\beta, 1 + \beta]$ instead of $[0, 1]$. The set $m = [\beta, 1 + \beta] \setminus M$ forms the minor arcs.

§2.3. Some technical lemmas. Lemmas 2.3, 2.5, 2.7, and the second statement of Lemma 2.6 are classical results, and we refer the reader to [Vau] or [Nat2] for their proofs. To this end, $a$ and $q$ are always integers.

Lemma 2.3 (Weyl’s inequality). Suppose that $(a, q) = 1$ and $|\alpha - a/q| \le q^{-2}$. Then for any $Q$,

$$\Big|\sum_{m=1}^{Q} e(\alpha m^k)\Big| \le Q^{1+o(1)} \big(q^{-1} + Q^{-1} + qQ^{-k}\big)^{1/K}.$$

Since the term $o(1)$ in the exponent of $Q$ on the right-hand side plays no role, we ignore it in later applications for the sake of a clearer presentation.
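Lemma 2.4. Let $f(\alpha) = \sum_{m=1}^{Q} e(\alpha m^k)$. Assume that $I$ is an interval of length $1 > |I| \ge Q^{-k+\epsilon}$; then

$$\frac{1}{|I|} \int_I |f(\alpha)|\, d\alpha \le 4Q^{1-\epsilon/4K}.$$

Proof. Set $\mu = Q^{-k+\epsilon/4}$. For all $1 \le q \le Q^{\epsilon/4}$ and $a$, let $J_{a,q}$ denote the interval $[a/q - \mu, a/q + \mu]$. Let $\mathcal{M} = I \cap (\cup_{a,q} J_{a,q})$ and $\mathcal{M}' = I \setminus \mathcal{M}$. The total length of $\mathcal{M}$ is at most $2\mu Q^{\epsilon/2} = 2Q^{-k+3\epsilon/4}$. It is clear that

$$\frac{1}{|I|} \int_{\mathcal{M}} |f(\alpha)|\, d\alpha \le Q^{k-\epsilon} \cdot 2Q^{-k+3\epsilon/4} \cdot Q \le 2Q^{1-\epsilon/4}. \tag{2.4}$$

On the other hand, if $\alpha \in \mathcal{M}'$, then by the pigeon-hole principle (or by Dirichlet’s lemma), there is $a, q$ such that $1 \le q \le Q^{k-\epsilon/4}$ and $|\alpha - a/q| \le q^{-1} Q^{-k+\epsilon/4}$. By the definition of $\mathcal{M}'$, it follows that $q \ge Q^{\epsilon/4}$. Applying Weyl’s inequality, we obtain

$$|f(\alpha)| \le Q \big(q^{-1} + Q^{-1} + qQ^{-k}\big)^{1/K} \le 2Q^{1-\epsilon/4K},$$

which yields

$$\frac{1}{|I|} \int_{\mathcal{M}'} |f(\alpha)|\, d\alpha \le 2Q^{1-\epsilon/4K}. \tag{2.5}$$

The claim follows from (2.4) and (2.5).

Lemma 2.5. Let $v_j(\gamma) = \sum_{m=1}^{P_j^k} (1/k) m^{1/k-1} e(\gamma m)$; then for all $\gamma$, $|\gamma| \le 1/2$,

$$|v_j(\gamma)| \le \min\big(P_j, |\gamma|^{-1/k}\big).$$

Lemma 2.6. Let $S(a, q) = \sum_{m=1}^{q} e(am^k/q)$, and let

$$\mathfrak{S}(t, n, Q) = \sum_{q \le Q} \ \sum_{\substack{a=1 \\ (a,q)=1}}^{q} \big(q^{-1} S(a, q)\big)^{t}\, e\Big(-\frac{an}{q}\Big).$$

Then
• for any $t > 3K$, $|\mathfrak{S}(t, n, \rho)| \le \sum_{q=1}^{\rho} q\, |q^{-1} S(a, q)|^{t} = O(1)$;
• $|\mathfrak{S}(s, n, \rho)| = \Theta(1)$.

Proof. To prove the first statement we need only show that

$$\sum_{q=1}^{\rho} q\, \big|q^{-1} S(a, q)\big|^{t} = O(1).$$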
By Weyl’s inequality, $|q^{-1} S(a, q)| = O(q^{-1/K})$, which implies that $q\, |q^{-1} S(a, q)|^{t} = O(q^{-t/K+1}) = O(q^{-2})$. Now the statement follows from the convergence of the series $\sum_{q=1}^{\infty} q^{-2}$. The upper bound in the second statement follows from the first statement and the fact that $s > 3K$. For the proof of the lower bound, we refer to [Vau].

Consider $\alpha \in I = I_{a,q}$, an interval in $M$ centered at $a/q$, and let $V_j(\alpha) = q^{-1} S(a, q) v_j(\gamma)$, where $\gamma = \alpha - a/q$. For all $j = 1, \dots, s$, set $D_j(\alpha) = f_j(\alpha) - V_j(\alpha)$. Note that $V_j$ and $D_j$ are defined over the major arcs only.

Lemma 2.7. Assume that $\alpha \in I_{a,q}$; then for all $j$,

$$D_j(\alpha) = O\big(q (1 + \beta P_j^k)\big).$$

§2.4. The contribution of the minor arcs

Claim 2.8. There is a constant $\delta = \delta(k, s) > 0$ such that

$$\int_{m} g(\alpha) e(-\alpha n)\, d\alpha = O\big(P^{1-k/s-\delta}\big).$$

Proof. Using Dirichlet’s lemma, we have, by the definition of the minor arcs, that for all $\alpha \in m$, there is $a, q$, $(a, q) = 1$, $\rho \le q \le B$, such that $|\alpha - a/q| \le q^{-1} \beta \le q^{-2}$. By Weyl’s inequality, for all $j \ge l$, we have

$$|f_j(\alpha)| \le P_j \big(q^{-1} + P_j^{-1} + qP_j^{-k}\big)^{1/K} = O\big(P_j q^{-1/K}\big).$$

On the other hand, for all $j < l$, $|f_j(\alpha)| \le P_j$. Thus

$$\Big|\int_{m} g(\alpha) e(-\alpha n)\, d\alpha\Big| \le \max_{\alpha \in m} |g(\alpha)| = O\Big(\prod_{j=1}^{l-1} P_j \prod_{j=l}^{s} P_j q^{-1/K}\Big) = O\big(P q^{-(s-l+1)/K}\big).$$

Since $q \ge \rho = P_l^{\nu}$, by (2.2) we have

$$q^{-(s-l+1)/K} \le P_l^{-\nu(s-l+1)/K} \le P^{-(k+1)/s}. \tag{2.6}$$

Setting $\delta = 1/s$ completes the proof.
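The appeal to Dirichlet’s lemma in the proof of Claim 2.8 can be made concrete. The sketch below is a brute-force illustration only: the function name `dirichlet_approximation` is hypothetical, and exact rational arithmetic is used as a simplifying assumption so that the inequality $|\alpha - a/q| \le 1/(qB)$ (the bound $q^{-1}\beta$ with $\beta = B^{-1}$) can be tested without rounding error.

```python
from fractions import Fraction


def dirichlet_approximation(alpha: Fraction, bound: int):
    """Return coprime (a, q) with 1 <= q <= bound and |alpha - a/q| <= 1/(q*bound).

    Dirichlet's lemma guarantees that such a pair exists, so the search
    over q = 1, ..., bound always succeeds.
    """
    for q in range(1, bound + 1):
        a = round(alpha * q)  # nearest integer to q * alpha
        if abs(alpha - Fraction(a, q)) <= Fraction(1, q * bound):
            reduced = Fraction(a, q)  # Fraction stores the value in lowest terms
            return reduced.numerator, reduced.denominator
    raise AssertionError("unreachable: contradicts Dirichlet's lemma")
```

In the setting of the claim, a point of the minor arcs cannot have its denominator $q$ below $\rho$, which is what places $q$ in the range $\rho \le q \le B$.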
§2.5. The contribution of the major arcs. Lemma 2.1 follows from Claim 2.8 and the following.

Claim 2.9. There is a positive constant $\delta = \delta(k, s)$ such that for every sequence $2 \le P_1 \le P_2 \le \cdots \le P_s = \Theta(n^{1/k})$ and $P = \prod_{j=1}^s P_j$,

$$\Sigma_M = \Big|\int_M g(\alpha) e(-\alpha n)\, d\alpha\Big| = O\big(P n^{-1} + P^{1-k/s-\delta}\big).$$

Proof. The key problem here is again the possible irregularity of the sequence $P_1, \dots, P_s$. This forces us to split the proof into several cases, according to the order of magnitude of $P_l$ and $P_s$.

Case A: $P_l \ge P^{(1/s)(1+\tau)}$. Consider

$$\Sigma_M = \Big|\int_M g(\alpha) e(-\alpha n)\, d\alpha\Big| \le \sum_{I \in M} \int_I |g(\alpha)|\, d\alpha, \tag{2.7}$$

where the summation is taken over the disjoint intervals contained in $M$. Since $M$ contains at most $\rho^2$ intervals of length $2\beta$ and $|g(\alpha)| \le P$ for all $\alpha$, it follows that

$$\Sigma_M \le \rho^2 \beta P = O\big(P_l^{-k+3\nu} P\big)$$

by the definitions of $\rho$ and $\beta$. Using the lower bound on $P_l$, we have

$$\Sigma_M = O\big(P^{1-(1/s)(k-3\nu)(1+\tau)}\big) = O\big(P^{1-k/s-\delta}\big),$$

with $\delta = (1/s)(k\tau - 3\nu - 3\tau\nu) > 0$ by the definitions of $\tau$ and $\nu$ (see §2.2). This completes the proof for this case.

Case B: $P_l \le P^{(1/s)(1+\tau)}$ and $P_s \ge P^{(1/s)(1+\chi)}$. Again consider (2.7). In order to upper bound $\int_I |g(\alpha)|\, d\alpha$, let us notice that

$$\int_I |g(\alpha)|\, d\alpha = \int_I \prod_{j=1}^{s} |f_j(\alpha)|\, d\alpha \le \frac{P}{P_s} \int_I |f_s(\alpha)|\, d\alpha. \tag{2.8}$$

Moreover, the upper bound on $P_l$, the lower bound on $P_s$, and the definition of $\beta$ together imply

$$|I| = 2\beta > \beta \ge P^{-(k/s)(1+\tau)} \ge P_s^{-k+\epsilon_0}, \tag{2.9}$$

with $\epsilon_0 = k(\chi - \tau)/(\chi + 1)$. Applying Lemma 2.4, we obtain

$$\int_I |f_s(\alpha)|\, d\alpha \le |I|\, P_s^{1-\epsilon_0/4K} \le 2\beta P_s^{1-\epsilon_0/4K}. \tag{2.10}$$
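Combining (2.7), (2.8), and (2.10) with the trivial fact that $M$ contains at most $\rho^2$ intervals yields

$$\Sigma_M = O\Big(\rho^2\, \frac{P}{P_s}\, P_s^{1-\epsilon_0/4K} \beta\Big) = O\big(P P_s^{-\epsilon_0/4K} P_l^{-k+3\nu}\big). \tag{2.11}$$

By the lower bounds on $P_s$ and on $P_l$ (see (2.3)), it follows from (2.11) that

$$\Sigma_M = O\big(P \times P^{(-1/s)((1-\tau)(k-3\nu) + (\epsilon_0/4K)(1+\chi))}\big) = O\big(P^{1-k/s} \times P^{-(1/s)((k(\chi-\tau)/4K) - \tau k - 3\nu)}\big) = O\big(P^{1-k/s-\delta}\big)$$

for $\delta = (1/s)((k(\chi - \tau)/4K) - \tau k - 3\nu)$. Again, by the definitions of $\nu$, $\chi$, and $\tau$ (see §2.2), $\delta > 0$. This completes the proof for the present case.

Case C: $P_l \le P^{(1/s)(1+\tau)}$ and $P_s \le P^{(1/s)(1+\chi)}$. This is the most complex case and requires a more delicate analysis. First observe that

$$\int_I g(\alpha) e(-\alpha n)\, d\alpha = \sum_{i=0}^{s} \int_I \Big(\prod_{j=1}^{i-1} f_j(\alpha)\Big) D_i(\alpha) \Big(\prod_{j=i+1}^{s} V_j(\alpha)\Big) e(-\alpha n)\, d\alpha,$$

where $D_0 = \prod_{j=1}^{-1} f_j = \prod_{j=1}^{0} f_j = 1$ and $D_i$, $V_i$’s are defined as in §2.2. Setting

$$L_i(\alpha) = \Big(\prod_{j=1}^{i-1} f_j(\alpha)\Big) D_i(\alpha) \Big(\prod_{j=i+1}^{s} V_j(\alpha)\Big),$$

we have

$$\Big|\int_I g(\alpha) e(-\alpha n)\, d\alpha\Big| \le \sum_{i=0}^{s} \Big|\int_I L_i(\alpha) e(-\alpha n)\, d\alpha\Big|. \tag{2.12}$$

Setting $S_i = \sum_{I \in M} \big|\int_I L_i e(-\alpha n)\, d\alpha\big|$, it follows that

$$\Sigma_M = \Big|\int_M g(\alpha) e(-\alpha n)\, d\alpha\Big| \le \sum_{i=0}^{s} S_i. \tag{2.13}$$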
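The decomposition used at the start of Case C is a telescoping identity: with $D_i = f_i - V_i$ for $i \ge 1$ and $D_0 = 1$, the terms $L_0, \dots, L_s$ sum to $\prod_{j=1}^{s} f_j$ pointwise. The following sketch checks this on arbitrary numbers; the function name `telescoping_terms` and the sample values are hypothetical and stand in for the values $f_j(\alpha)$, $V_j(\alpha)$ at a fixed $\alpha$.

```python
from math import prod


def telescoping_terms(f, V):
    """Return [L_0, ..., L_s] where, for i >= 1,
    L_i = (f_1 ... f_{i-1}) * (f_i - V_i) * (V_{i+1} ... V_s),
    and L_0 = V_1 ... V_s (empty product of f's and D_0 = 1)."""
    s = len(f)
    terms = [prod(V)]  # L_0
    for i in range(1, s + 1):
        head = prod(f[: i - 1])       # f_1 ... f_{i-1}
        D_i = f[i - 1] - V[i - 1]     # D_i = f_i - V_i
        tail = prod(V[i:])            # V_{i+1} ... V_s
        terms.append(head * D_i * tail)
    return terms
```

Summing the returned list recovers `prod(f)`, which is why the integral of $g = \prod_j f_j$ over each major arc splits into the $s + 1$ integrals bounded in (2.12).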
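Let $i_0$ be the smallest index $i$ such that $P_i \ge P^{1/2s}$. By the upper bound $P_i \le P_s \le P^{(1/s)(1+\chi)} \le P^{(13/10)/s}$, it is easy to prove that $i_0$ cannot be very large. For instance, one can show

$$i_0 \le \frac{s}{2}. \tag{2.14}$$

Consider the case $i \ge i_0$. Since $|f_j(\alpha)| \le P_j$ and $|V_j(\alpha)| \le P_j$ for all $j$ and $\alpha$, it follows from Lemma 2.7 that

$$|L_i(\alpha)| \le \frac{P}{P_i}\, |D_i(\alpha)| = O\big(P P_i^{-1} q (1 + \beta P_i^k)\big). \tag{2.15}$$

Since $P^{1/2s} \le P_i \le P_s \le P^{(1/s)(1+\chi)}$, (2.15) together with the bounds on $P_l$ gives

$$L_i(\alpha) = O\big(P \times P^{-(1/s)(1/2 - \nu(1+\tau) - k(1+\chi) + (k-\nu)(1-\tau))}\big). \tag{2.16}$$

Therefore,

$$\begin{aligned} S_i = \sum_{I \in M} \Big|\int_I L_i e(-\alpha n)\, d\alpha\Big| &= O\big(\rho^2 \beta P \times P^{-(1/s)(1/2 - \nu(1+\tau) - k(1+\chi) + (k-\nu)(1-\tau))}\big) \\ &= O\big(P \times P^{-(1/s)(1/2 - 3\nu(1+\tau) - k(1+\chi) + 2(k-\nu)(1-\tau))}\big) \\ &= O\big(P^{1-k/s} \times P^{-(1/s)(1/2 - 3\nu(1+\tau) - k\chi - 2k\tau - 2\nu)}\big) \\ &= O\big(P^{1-k/s-\delta}\big), \end{aligned} \tag{2.17}$$

where $\delta = (1/s)((1/2) - 3\nu(1 + \tau) - k\chi - 2k\tau - 2\nu) > 0$ (again by the definitions of $\nu$, $\tau$, and $\chi$).

Now we deal with the case $i < i_0$. By the definition of $V_j$, we have

$$\begin{aligned} \int_I L_i(\alpha) e(-\alpha n)\, d\alpha &= O\Big(\prod_{j=1}^{i} P_j\, \Big|\int_I \prod_{j=i+1}^{s} V_j(\alpha) e(-\alpha n)\, d\alpha\Big|\Big) \\ &= O\Big(\prod_{j=1}^{i} P_j\, \Big|\big(q^{-1} S(a, q)\big)^{s-i} e\Big(-\frac{an}{q}\Big)\Big| \times \Big|\int_{-\beta}^{\beta} \prod_{j=i+1}^{s} v_j(\gamma) e(-\gamma n)\, d\gamma\Big|\Big). \end{aligned} \tag{2.18}$$

By (2.14), $i < i_0 < s/2$, so $s - i > s/2 > 3K$. Lemma 2.6 applies and gives

$$\sum_{q \le \rho} \ \sum_{\substack{1 \le a \le q \\ (a,q)=1}} \big|q^{-1} S(a, q)\big|^{s-i} = O(1). \tag{2.19}$$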
Equation (2.19) implies

$$S_i = O\Big(\prod_{j=1}^{i} P_j\, \Big|\int_{-\beta}^{\beta} \prod_{j=i+1}^{s} v_j(\gamma) e(-\gamma n)\, d\gamma\Big|\Big). \tag{2.20}$$

It now remains to bound $T_i = \prod_{j=1}^{i} P_j\, \big|\int_{-\beta}^{\beta} \prod_{j=i+1}^{s} v_j(\gamma) e(-\gamma n)\, d\gamma\big|$. In order to do this, we have to distinguish two cases. To begin, let us set $\epsilon = 1/s^2$.

Case C1: $P_1 \ge P_s^{1-\epsilon}$. Since $P_1$ is the smallest among the $P_j$’s, this case deals with those sequences $P_1, \dots, P_s$ whose elements are very close to each other. In particular, it contains the special case $P_1 = \cdots = P_s = n^{1/k}$ considered in Theorem 1.1. Because $P_1 \le P^{1/s}$, it follows that $P_s \le P^{(1/s)(1-\epsilon)^{-1}} \le P^{(1/s)(1+2\epsilon)}$. Notice that, by the definition of $P_l$ (see the sentence preceding (2.2)), we have $P_l = P_1$. It is also clear that $i_0 = 1$. So, in this case, we only have to look at

$$T_0 = \Big|\int_{-\beta}^{\beta} \prod_{j=1}^{s} v_j(\gamma) e(-\gamma n)\, d\gamma\Big|.$$

The very small gap between $P_1$ and $P_s$ allows us to apply the classical treatment of $T_0$ for the case that all $P_j$’s are equal to $n^{1/k}$. By the triangle inequality,

$$T_0 \le \Big|\int_{-1/2}^{1/2} \prod_{j=1}^{s} v_j(\gamma) e(-\gamma n)\, d\gamma\Big| + \Big|\int_{\beta}^{1/2} \prod_{j=1}^{s} v_j(\gamma) e(-\gamma n)\, d\gamma\Big| + \Big|\int_{-1/2}^{-\beta} \prod_{j=1}^{s} v_j(\gamma) e(-\gamma n)\, d\gamma\Big| = U_1 + U_2 + U_3. \tag{2.21}$$

Using Lemma 2.5, we have

$$U_2 \le \int_{\beta}^{1/2} \gamma^{-s/k}\, d\gamma = O\big(\beta^{-s/k+1}\big) = O\big(B^{s/k-1}\big). \tag{2.22}$$

Since

$$B = P_l^{k-\nu} \le P_s^{k-\nu} \le P^{(1/s)(k-\nu)(1+2\epsilon)} \le P^{(1/s)(k-\epsilon')},$$

where $\epsilon' = \nu - 2\epsilon > 0$, it follows from (2.22) that

$$U_2 = O\big(P^{(1/s)(k-\epsilon')(s/k-1)}\big) = O\big(P^{1-k/s-\epsilon'(s/k-1)/s}\big) = O\big(P^{1-k/s-\delta}\big), \tag{2.23}$$
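with $\delta = \epsilon'(s/k-1)/s > 0$. By symmetry, $U_3 = U_2 = O(P^{1-k/s-\delta})$. Finally, $U_1$ can be written as
\[
U_1 = k^{-s} \sum_{\substack{1 \le m_j \le P_j^k \\ m_1 + \cdots + m_s = n}} \prod_{j=1}^{s} m_j^{1/k-1}. \tag{2.24}
\]
It is clear that if $\sum_{j=1}^{s} m_j = n$, then one of the $m_j$'s should be at least $n/s$. Let $j_0$ be the smallest index such that $P_{j_0}^k \ge n/s$; it follows that
\[
\begin{aligned}
U_1 &\le \sum_{j=j_0}^{s} \Bigl(\frac{n}{s}\Bigr)^{1/k-1} \prod_{i \ne j} \sum_{m=1}^{P_i^k} m^{1/k-1} \\
&= O\Bigl(\sum_{j=j_0}^{s} n^{-1} P_j \prod_{i \ne j} P_i\Bigr) = O\Bigl(\sum_{j \ge j_0} P n^{-1}\Bigr) = O\bigl(P n^{-1}\bigr).
\end{aligned} \tag{2.25}
\]
Equations (2.25) and (2.23) together complete the proof of Claim 2.9 for the present case.

Case C2: $P_1 \le P_s^{1-\epsilon}$. Let $P_0 = 2$ and $P_{s+1} = \infty$. Set
\[
H_{i,r} = \prod_{j=1}^{i} P_j \int_{P_r^{-k}}^{P_{r-1}^{-k}} \prod_{j=i+1}^{s} |v_j(\gamma)|\,d\gamma
\]
for all $0 \le i \le s$ and $1 \le r \le s+1$.

Lemma 2.10. For all $0 \le i \le s$ and $1 \le r \le s+1$,
\[
H_{i,r} = O\Bigl(P^{1-k/s} \Bigl(\frac{P_1}{P_s}\Bigr)^{k/s} \log P\Bigr).
\]
Proof. We make use of Lemma 2.5, which asserts that
\[
|v_j(\gamma)| \le \min\bigl(P_j, |\gamma|^{-1/k}\bigr) \tag{2.26}
\]
for all $j$ and $|\gamma| \le 1/2$. On the other hand, for any $\gamma$ in the interval $(P_r^{-k}, P_{r-1}^{-k})$,
\[
P_{r-1} < \gamma^{-1/k} < P_r. \tag{2.27}
\]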
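Consider two cases.

(a) Let $r \ge i$. Using (2.26) and (2.27), we have
\[
H_{i,r} \le \prod_{j=1}^{r-1} P_j \int_{P_r^{-k}}^{P_{r-1}^{-k}} \gamma^{-(s-r+1)/k}\,d\gamma \le \prod_{j=1}^{r-1} P_j \; P_{\bar r}^{\,s-r-1-k}\, G_{s,r,k} = \mathcal{A},
\]
where
• $P_{\bar r} = P_r$ if $s-r-1-k < 0$ and $P_{\bar r} = P_{r-1}$ otherwise;
• $G(s,r,k) = \log(P_r/P_{r-1})^k = O(\log P)$ if $s-r-1-k = 0$ and $G(s,r,k) = 1$ otherwise.

(b) Let $r < i$. We have
\[
H_{i,r} \le \prod_{j=1}^{i} P_j \int_{P_r^{-k}}^{P_{r-1}^{-k}} \gamma^{-(s-i)/k}\,d\gamma \le \prod_{j=1}^{r-1} P_j \; P_{\tilde r}^{\,s-i-1-k}\, K_{s,i,k} = \mathcal{B},
\]
where
• $P_{\tilde r} = P_r$ if $s-i-1-k < 0$ and $P_{\tilde r} = P_{r-1}$ otherwise;
• $K(s,r,k) = \log(P_r/P_{r-1})^k = O(\log P)$ if $s-i-1-k = 0$ and $K(s,r,k) = 1$ otherwise.

It now remains to show that
\[
\max(\mathcal{A}, \mathcal{B}) = O\Bigl(P^{1-k/s} \Bigl(\frac{P_1}{P_s}\Bigr)^{k/s} \log P\Bigr). \tag{2.28}
\]
Equation (2.28) follows fairly easily from the definitions of $\mathcal{A}$ and $\mathcal{B}$, and we invite the reader to work out the details.

To complete the proof of case C2, it suffices to show that there is a constant $\delta > 0$ such that for all $i \le i_0$,
\[
T_i = \prod_{j=1}^{i} P_j \Bigl|\int_{-\beta}^{\beta} \prod_{j=i+1}^{s} v_j(\gamma)e(-\gamma n)\,d\gamma\Bigr| = O\bigl(P^{1-k/s-\delta}\bigr).
\]
It follows from Lemma 2.10 and the assumption $P_1 \le P_s^{1-\epsilon}$ that
\[
T_i \le \prod_{j=1}^{i} P_j \int_{-\beta}^{\beta} \prod_{j=i+1}^{s} |v_j(\gamma)|\,d\gamma = O\Bigl(P^{1-k/s} \Bigl(\frac{P_1}{P_s}\Bigr)^{k/s} \log P\Bigr) \tag{2.29}
\]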
\[
\phantom{T_i} = O\bigl(P^{1-k/s} P^{-\epsilon k/s^2} \log P\bigr) = O\bigl(P^{1-k/s-\delta}\bigr),
\]
with $\delta = \epsilon k/2s^2 > 0$. This proves (2.29) and completes the proof of Claim 2.9.

Remark. In case C1, it is possible to prove that $M$ has order $\Theta(n^{-1}P) + O(P^{1-k/s-\delta})$. To show this, it suffices to note that $U_1$ has order $\Theta(n^{-1}P)$. We obtain the following corollary.

Corollary 2.11. There are positive constants $\epsilon$ and $\delta$ such that if $P_1 \ge P_s^{1-\epsilon}$, then
\[
\mathrm{Root}(P_1, \dots, P_s) = \Theta\bigl(n^{-1}P\bigr) + O\bigl(P^{1-k/s-\delta}\bigr).
\]
As already mentioned in §2.1, this corollary implies Theorem 1.1, which is the special case $P_1 = \cdots = P_s = n^{1/k}$.

§3. Proof of the main theorem. In this section, $k$ and $s$ are fixed, where $s > s_2(k) = s_3(k)k$ and where $s_3(k)$ is the threshold in Lemma 2.1. Consider the random sequence $X$ defined in §1.2. A solution $(x_1, \dots, x_s)$ is always ordered; that is, $x_1 \le x_2 \le \cdots \le x_s$. The proof of the main theorem follows the plan described in §1.3, with one additional twist. A solution $(x_1, \dots, x_s)$ is called small if $x_1 \le n^{1/ks}$; otherwise it is normal. Let $S_{\mathrm{normal}}(n)$ (resp., $S_{\mathrm{small}}(n)$) denote the set of normal (resp., small) solutions
\[
S_{\mathrm{normal}}(n) = \Bigl\{ x = (x_1, \dots, x_s) \Bigm| \sum_{j=1}^{s} x_j^k = n,\ x_j \in \mathbb{N}_0,\ x_1 \ge n^{1/ks} \Bigr\}, \tag{3.1}
\]
\[
S_{\mathrm{small}}(n) = \Bigl\{ x = (x_1, \dots, x_s) \Bigm| \sum_{j=1}^{s} x_j^k = n,\ x_j \in \mathbb{N}_0,\ x_1 < n^{1/ks} \Bigr\}. \tag{3.2}
\]
$S_{\mathrm{normal},X}(n)$ is the restriction of $S_{\mathrm{normal}}(n)$ to $X$ ($x \in S_{\mathrm{normal},X}(n)$ if and only if for all $x_j \in x$, $x_j \in X$). Similarly, $S_{\mathrm{small},X}(n)$ is the restriction of $S_{\mathrm{small}}(n)$. Furthermore, $R^s_{\mathrm{normal},X}(n) = |S_{\mathrm{normal},X}(n)|$ is the number of normal representations of $n$ in $X$. Defining $R^s_{\mathrm{small},X}$ analogously, we have
\[
R^s_X(n) = R^s_{\mathrm{normal},X}(n) + R^s_{\mathrm{small},X}(n).
\]
The main theorem follows from the following two claims.

Claim 3.1. A.s. $R^s_{\mathrm{normal},X}(n) = \Theta(\log n)$ for every sufficiently large $n$.

Claim 3.2. There is a positive constant $C$ such that with probability at least 0.8, $R^s_{\mathrm{small},X}(n) \le C$ for every $n$.
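The split of ordered solutions into normal and small ones can be illustrated by brute force for tiny parameters. The following is only a sketch: the function names are hypothetical, the enumeration is exponential in $s$, and the parameters used ($k = 2$, $s = 2$) lie far below the thresholds required by the main theorem. The comparison $x_1 \ge n^{1/ks}$ is carried out exactly in integers as $x_1^{ks} \ge n$.

```python
from itertools import combinations_with_replacement


def ordered_solutions(n, k, s):
    """All tuples x1 <= ... <= xs of non-negative integers with x1^k + ... + xs^k = n."""
    # Largest integer b with b**k <= n; no entry of a solution can exceed it.
    b = 0
    while (b + 1) ** k <= n:
        b += 1
    return [x for x in combinations_with_replacement(range(b + 1), s)
            if sum(v ** k for v in x) == n]


def split_normal_small(n, k, s):
    """Split the ordered solutions according to (3.1) and (3.2)."""
    normal, small = [], []
    for x in ordered_solutions(n, k, s):
        # x1 >= n^(1/(k*s)) is equivalent to x1^(k*s) >= n for integers.
        if x[0] ** (k * s) >= n:
            normal.append(x)
        else:
            small.append(x)
    return normal, small


if __name__ == "__main__":
    normal, small = split_normal_small(25, 2, 2)
    print(normal)  # [(3, 4)]
    print(small)   # [(0, 5)]
```

For $n = 25$ the two ordered representations as a sum of two squares are $(0, 5)$ and $(3, 4)$; only the second has $x_1 \ge 25^{1/4}$, so it is the only normal one.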
Before presenting the proofs of these claims, let us point out why we need to define small and normal representations. The reason is that Lemma 1.4 cannot be applied directly to the polynomial $G$ in (1.1) which represents $R^s_X(n)$. To apply Lemma 1.4, it is required that all partial derivatives (of any order) of the polynomial involved have expectation $O(n^{-\alpha})$. On the other hand, if $(x_1, \dots, x_s)$ is a representation of $n$ with $x_1$ very small (a constant, say), then $E(\partial^{s-1}G/\partial t_{x_2} \cdots \partial t_{x_s}) = p_{x_1} = \Theta(1)$, which is too large for us. The introduction of normal representations resolves the situation. By definition, every element of a normal representation is a positive power of $n$; this guarantees that if $F$ is the restriction of $G$ to normal representations, then bad situations as above do not occur. This enables us to use Lemma 1.4 to prove Claim 3.1. Furthermore, Claim 3.2 asserts that by restricting the problem to normal representations we do not lose anything, because the contribution of small representations in $R^s_X(n)$ is bounded by a constant.

Proof of Claim 3.1. First we write $R^s_{\mathrm{normal},X}(n)$ as a polynomial as
\[
R^s_{\mathrm{normal},X}(n) = \sum_{x \in S_{\mathrm{normal}}(n)} \prod_{x_j \in x} t_{x_j} = F\bigl(t_1, \dots, t_{n^{1/k}}\bigr).
\]
To apply Lemma 1.4 to the polynomial $F$, we need to estimate the expectations of $F$ and its derivatives. These estimates follow from the following lemmas, which, in turn, are simple corollaries of Lemma 2.1.

Lemma 3.3. If $l > s_3(k)$, then for all $m$,
\[
Y_{l,m} = \sum_{m = x_1^k + \cdots + x_l^k} \prod_{j=1}^{l} x_j^{-1+k/l} = O(1).
\]
Proof. We first break $Y_{l,m}$ into dyadic subsums. Let $\mathcal{P}$ be the set of all $l$-tuples $\{P_1, \dots, P_l\}$, where $P_j \in \{2, 4, \dots, 2^t\}$ and $t$ is the smallest integer such that $2^t \ge m^{1/k}$. If $A = \{P_1, \dots, P_s\} \in \mathcal{P}$, let $\sigma_A$ denote the subsum of $Y_{l,m}$ taken over all $s$-tuples $(x_1, \dots, x_s)$ satisfying $P_j/2 \le x_j \le P_j$ for all $1 \le j \le s$. It follows that
\[
Y_{l,m} \le \sum_{A \in \mathcal{P}} \sigma_A. \tag{3.3}
\]
Let $P_A = \prod_{j=1}^{l} P_j$. It is clear that each term in $\sigma_A$ is of order $\Theta\bigl((\prod_{j=1}^{l} P_j)^{-1+k/l}\bigr) = \Theta\bigl(P_A^{-1+k/l}\bigr)$. By Lemma 2.1, the number of terms in $\sigma_A$ is $O\bigl(P_A m^{-1} + P_A^{1-k/l-\delta}\bigr)$. Therefore,
\[
\sigma_A = O\bigl(P_A^{k/l} m^{-1} + P_A^{-\delta}\bigr). \tag{3.4}
\]
Notice that if $P_A < (m/l)^{1/k}$, then $\sigma_A = 0$, because in this case there is no solution in the box $\prod_{j=1}^{s} [1, P_j]$. This and (3.4) imply
\[
\sigma_A = O\bigl(P_A^{k/l} m^{-1} + m^{-\eta}\bigr),
\]
for some positive constant $\eta$. Thus
\[
\sum_{A \in \mathcal{P}} \sigma_A = O\Bigl(m^{-1} \sum_{A \in \mathcal{P}} P_A^{k/l}\Bigr) + O\bigl(m^{-\eta} \log^s m\bigr). \tag{3.5}
\]
The second term on the right-hand side of (3.5) is $o(1)$. Moreover, since the summand in the first term runs over all dyadic boxes, we have
\[
\sum_{A \in \mathcal{P}} P_A^{k/l} = O(m), \tag{3.6}
\]
completing the proof.

Lemma 3.4. There is a positive constant $\mu$ such that $E(R^l_X(m)) = O(m^{-\mu})$ for all $1 \le l < s$ and all $m$.

Proof. It suffices to show that there is a constant $\mu > 0$ such that
\[
Z_{l,m} = \sum_{m = x_1^k + \cdots + x_l^k} \prod_{j=1}^{l} x_j^{-1+k/s} = O\bigl(m^{-\mu}\bigr), \tag{3.7}
\]
since the contribution of the log term is negligible. Suppose $l > s_3(k)$. Then Lemma 3.3 and the fact that if $\sum_{j=1}^{l} x_j^k = m$ then $x_l$ (the largest term) should be at least $(m/l)^{1/k}$ yield
\[
Z_{l,m} \le \Bigl(\frac{m}{l}\Bigr)^{(1/k)(k/s-k/l)} Y_{l,m} = O\bigl(m^{-(s-l)/sl}\bigr), \tag{3.8}
\]
and we are done. If $l \le s_3(k)$, then, by the fact that $s > s_2(k) = k s_3(k)$, we get $l/s < 1/k$. Using $x_l \ge (m/l)^{1/k}$, we obtain
\[
Z_{l,m} = O\Bigl(m^{(1/k)(-1+k/s)} \Bigl(\sum_{x=1}^{m^{1/k}} x^{-1+k/s}\Bigr)^{l-1}\Bigr). \tag{3.9}
\]
This is an upper bound since on the right-hand side we sum up over all possible $s$-tuples where $1 \le x_j \le m^{1/k}$, $1 \le j < l$. Moreover,
\[
\sum_{x=1}^{m^{1/k}} x^{-1+k/s} \approx \int_{1}^{m^{1/k}} x^{-1+k/s}\,dx = O\bigl(m^{1/s}\bigr), \tag{3.10}
\]
so
\[
Z_{l,m} = O\bigl(m^{-(1/k-1/s-(l-1)/s)}\bigr) = O\bigl(m^{-(1/k-l/s)}\bigr), \tag{3.11}
\]
completing the proof.

Lemma 3.5. For every sufficiently large $n$,
\[
E\bigl(R^s_{\mathrm{normal},X}(n)\bigr) = \Theta(\log n).
\]
Proof. Notice that $\prod_{j=1}^{s} \log^{1/s} x_j \le \log n$ for all solutions $(x_1, \dots, x_s)$ of $x_1^k + \cdots + x_s^k = n$. Thus, the upper bound $O(\log n)$ follows from Lemma 3.3. To prove the lower bound, notice that by convexity the contribution of a term $(x_1, \dots, x_s)$, where $x_j \ne 0, 1$, is
\[
\prod_{j=1}^{s} c\,x_j^{-1+k/s} \log^{1/s} x_j \ge c^s \Bigl(\frac{n}{s}\Bigr)^{(s/k)(k/s-1)} \log\frac{n}{s}.
\]
By Theorem 1.1, there are $\Theta(n^{s/k-1})$ terms that do not contain either 0 or 1. This completes the proof.

Before continuing, let us make an important remark: by increasing the constant $c$ we can assume that the ratio $E(R^s_{\mathrm{normal},X}(n))/\log n$ is arbitrarily large, whenever needed.

Now we apply Lemma 1.4 to $F = R^s_{\mathrm{normal},X}(n)$. By Lemma 3.5, there are positive constants $C_1$ and $C_2$ such that
\[
C_1 \log n \le E(F) \le C_2 \log n. \tag{3.12}
\]
Consider a set $A = \{i_1, \dots, i_r\}$, where $i_j \in \{1, 2, \dots, n^{1/k}\}$ and $r \le s-1$. Let $m = n - \sum_{y \in A} y^k$ and $l = s - |A|$. Consider the partial derivative with respect to $A$,
\[
\partial_A F = \frac{\partial^{|A|} F}{\partial t_{i_1} \cdots \partial t_{i_r}}. \tag{3.13}
\]
By the definition of $F$ and $Z_{l,m}$ (see (3.7)),
\[
E\bigl(\partial_A F\bigr) = E\Bigl(\sum_{\substack{x \in S_{\mathrm{normal}}(n) \\ A \subset x}} \prod_{x_j \in x \setminus A} t_{x_j}\Bigr) = O\bigl(Z_{l,m} \log n\bigr) = O\bigl(m^{-\mu}\bigr) \tag{3.14}
\]
for some positive constant $\mu$. Recalling that $S_{\mathrm{normal}}(n)$ consists of solutions with
$x_1^k \ge n^{1/s}$, it follows that if $|A| < s$, then $m \ge x_1^k \ge n^{1/s}$. This and (3.14) yield that
\[
E\bigl(\partial_A F\bigr) = O\bigl(n^{-\mu/s}\bigr) = O\bigl(n^{-\alpha}\bigr) \tag{3.15}
\]
for all $A$, $1 \le |A| \le s-1$, $\alpha = \mu/s$. Equations (3.12) and (3.15) provide the sufficient conditions to apply Lemma 1.4. Let $\epsilon$ be a small positive constant ($\epsilon = 0.1$, say), and set $\alpha$ as in (3.15). By the remark right after the proof of Lemma 3.5, we can assume that $C_1$ is sufficiently large so that
\[
C_1 > g(s, k, \alpha, \epsilon), \tag{3.16}
\]
where $g = g(s, k, \alpha, \epsilon)$ is a parameter in Lemma 1.4. Lemma 1.4 yields (notice that in the present case we have only $n^{1/k}$ random variables)
\[
\Pr\bigl(|F - E(F)| > \epsilon E(F)\bigr) \le n^{-2}. \tag{3.17}
\]
Since $\sum_{n=1}^{\infty} n^{-2}$ converges, (3.17) and Borel-Cantelli's lemma complete the proof.
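The expectation $E(F)$ that enters (3.12) can be computed exactly for very small parameters, which may help in reading the argument. The sketch below rests on an assumption that is not restated in this section: the base $x$ is taken to be included independently with probability $p_x = \min(1, c\,x^{-1+k/s} \log^{1/s} x)$ for $x \ge 2$ and $p_x = 0$ for $x < 2$, modelled on the terms appearing in the proof of Lemma 3.5; the precise definition of $X$ is in §1.2. All function names are hypothetical, and the parameters are far below the thresholds of the main theorem.

```python
import math
import random
from itertools import combinations_with_replacement


def max_base(n, k):
    """Largest integer b with b**k <= n."""
    b = 0
    while (b + 1) ** k <= n:
        b += 1
    return b


def inclusion_probability(x, k, s, c):
    """Assumed rule p_x = min(1, c * x^(-1+k/s) * log(x)^(1/s)); zero for x < 2."""
    if x < 2:
        return 0.0
    return min(1.0, c * x ** (-1.0 + k / s) * math.log(x) ** (1.0 / s))


def normal_solutions(n, k, s):
    """Ordered solutions of x1^k + ... + xs^k = n with x1 >= n^(1/(k*s))."""
    bases = range(max_base(n, k) + 1)
    return [x for x in combinations_with_replacement(bases, s)
            if sum(v ** k for v in x) == n and x[0] ** (k * s) >= n]


def expected_normal_count(n, k, s, c):
    """E(F): each solution contributes the product of p_v over its distinct entries."""
    total = 0.0
    for x in normal_solutions(n, k, s):
        term = 1.0
        for v in set(x):  # an indicator variable squared equals itself
            term *= inclusion_probability(v, k, s, c)
        total += term
    return total


def sampled_normal_count(n, k, s, c, rng):
    """One random draw of the set of bases, then the number of normal solutions inside it."""
    chosen = {x for x in range(max_base(n, k) + 1)
              if rng.random() < inclusion_probability(x, k, s, c)}
    return sum(1 for x in normal_solutions(n, k, s) if all(v in chosen for v in x))


if __name__ == "__main__":
    n, k, s = 200, 2, 3
    print(normal_solutions(n, k, s))
    print(expected_normal_count(n, k, s, c=1.0))
    print(sampled_normal_count(n, k, s, c=1.0, rng=random.Random(0)))
```

Increasing `c` increases every inclusion probability until it is capped at 1, which is the mechanism behind the remark that the ratio $E(R^s_{\mathrm{normal},X}(n))/\log n$ can be made large.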
Proof of Claim 3.2. First we need to estimate $E(R^s_{\mathrm{small},X}(n))$.

Lemma 3.6. There is a positive constant $\mu$ such that $E(R^s_{\mathrm{small},X}(n)) = O(n^{-\mu})$.

Proof. Recall that if $x \in S_{\mathrm{small}}(n)$, then $x_1 \le n^{1/ks}$,
\[
\begin{aligned}
E\bigl(R^s_{\mathrm{small},X}(n)\bigr) &= O\Bigl(\log n \sum_{x \in S_{\mathrm{small}}(n)} \prod_{x_j \in x} x_j^{k/s-1}\Bigr) \\
&= O\Bigl(\log n \sum_{x=1}^{n^{1/ks}} x^{-1+k/s} \max_{n - n^{1/s} \le m \le n} Z_{s-1,m}\Bigr).
\end{aligned}
\]
By (3.8) and the fact that $m > n/2$, $Z_{s-1,m} = O(n^{-1/s(s-1)})$. Moreover, similar to (3.10),
\[
\sum_{x=1}^{n^{1/ks}} x^{-1+k/s} = O\bigl(n^{1/s^2}\bigr).
\]
Together we have
\[
E\bigl(R^s_{\mathrm{small},X}(n)\bigr) = O\bigl(n^{-(1/s(s-1) - 1/s^2)} \log n\bigr) = O\bigl(n^{-\mu}\bigr),
\]
where $\mu = (1/s(s-1) - 1/s^2)/2 > 0$.

Define $\mathrm{Disj}(S_{\mathrm{small},X}(n))$ as in §1.3. Since $E(R^s_{\mathrm{small},X}(n)) = O(n^{-\mu})$, repeating the proof of the first statement of Lemma 1.5, we can show that there exists a constant $K_1$ so that with probability at least 0.9, $\mathrm{Disj}(S_{\mathrm{small},X}(n)) \le K_1$ for all $n$. Moreover,
by Lemmas 1.5 and 3.4, there is a constant $K_2$ such that with probability at least 0.9, $R^l_X(n) \le K_2$ for all $l < s$ and all $n$.

Now we can finish the proof by an argument similar to the one used in the proof of the second statement of Lemma 1.5. Assuming that $R^s_{\mathrm{small},X}(n) > K_3$, where $K_3 = (K-1)^s s!$ and $K = \max(K_1, K_2) + 1$, we can find a sunflower $x^1, \dots, x^K$. This implies that either $\mathrm{Disj}(S_{\mathrm{small},X}(n)) \ge K$ or $R^l_X(m) \ge K$ for some $l < s$ and $m < n$, according to whether $\bigcap_{i=1}^{K} x^i = \emptyset$ or not. These events can happen with probability at most 0.2.

Claims 3.1 and 3.2 yield that there is a sequence $X$ and a finite number $N_0$ such that $R^s_X(n) = \Theta(\log n)$ for all $n \ge N_0$. To be completely done, we need to take care of the small (between 1 and $N_0$) integers as well. Notice that, using Lemma 1.5 together with Lemma 3.4, we can also require that $R^l_X(n) = O(1)$ for all $l < s$ and all $n$. So we may assume that $X$ satisfies the following two properties:
• $R^s_X(n) = \Theta(\log n)$ for all $n \ge N_0$;
• $R^l_X(n) = O(1)$ for all $l < s$ and $n$.
The second property of $X$ implies that adding finitely many numbers to $X$ would change $R^s_X(n)$ by at most a constant for all sufficiently large $n$. On the other hand, $X' = X \cup \{0, 1^k, \dots, \lfloor N_0^{1/k} \rfloor^k\}$ is a basis, since by Theorem 1.1 all small numbers (at most $N_0$) now have at least one representation in $X'$. Thus $X'$ is a basis satisfying the statement of the main theorem.

§4. Some related results. In this section we consider several results related to the main theorem which can be obtained fairly easily from Lemma 1.4. For instance, we give a direct answer to the question of Nathanson (see §1.1) without invoking the main theorem. We also briefly mention an application of Lemma 1.5 which leads to the generalization of a classical result of Erdős and Rényi (see [ER] and [HR]). But let us start with a few questions of our own.

The threshold function $s_2(k)$ in the main theorem is of order $\Theta(K^3 k^3) = \Theta(2^{3k} k^3)$, which is exponential in $k$. On the other hand, it is well known that the threshold $s_1(k)$ in Theorem 1.1 is much smaller, $s_1(k) = O(k \log k)$.

Question 1. What is the smallest $s_2(k)$ for which the main theorem holds?

The value of $s_2(k)$ depends mainly on the threshold $s_3(k)$ in Lemma 2.1. In the proof of Lemma 2.1, we need to set $s_3(k) = \Theta(2^{3k} k^2)$. We believe that, using more advanced techniques from the circle method, one should be able to reduce $s_3(k)$ to some value around $s_1(k)$ or at least to a polynomial in $k$. Since a rigorous proof of this might be significantly more technical and, on the other hand, since finding small $s_2(k)$ and $s_3(k)$ is not the primary goal of this paper, we do not try to investigate this matter here.

Question 2. What is the smallest $s_3(k)$ for which Lemma 2.1 holds? Can $s_3(k) = s_1(k)$?
Theorem 4.1 below gives a direct answer to Nathanson's question without appealing to the main theorem. Moreover, the threshold on $s$ in Theorem 4.1 is somewhat better than the threshold in the main theorem.

Theorem 4.1. For any $k \ge 2$ and $s$ such that $2s > s_1(k)k$, there is a subset $X$ of $\mathbb{N}_0^k$ such that for all positive integers $n$,
\[
R^s_X(n) = \Theta(\log n)
\]
and
\[
X(m) = O\bigl(m^{1/s} \log^{1/s} m\bigr).
\]
Moreover, if $k = 2$, then the statement holds for all $s \ge 4$.

Proof. Consider the random sequence $X$ used in the proof of the main theorem. Using Chernoff's bound ($X(m)$ is a sum of independent random variables and thus is strongly concentrated around its mean, which is of order $O(m^{1/s} \log^{1/s} m)$) combined with Borel-Cantelli's lemma, it is easy to show that a.s. $X$ has the required density. The proof that a.s. $X$ is a basis relies entirely on Lemma 1.4. Since we need only a lower bound on $R^s_X(n)$, we do not have to use Lemma 2.1. We call a pair $(k, s)$ exceptional if $k = 2$ and $s = 4$ or 5. For a nonexceptional pair $(k, s)$, we say that a solution $(x_1, \dots, x_s)$ is big if $x_1 \ge n^{1/k-\epsilon}$ for $\epsilon = 1/ks^2$. (Again a solution $x_1, \dots, x_s$ is ordered; i.e., $x_1 \le x_2 \le \cdots \le x_s$.) If $(k, s)$ is exceptional, then any solution is big. A solution that is not big is small. Define $S_{\mathrm{big}}(n)$ and $R^s_{\mathrm{big},X}(n)$ accordingly. It is sufficient to show that a.s.
\[
R^s_{\mathrm{big},X}(n) = \Theta(\log n) \tag{4.1}
\]
for every sufficiently large $n$. Following the proof of the main theorem, let $H$ be the restriction of $G$ (see (1.1)) onto the set of big solutions. To apply Lemma 1.4, we only need to estimate the expectations of $H$ and its partial derivatives. These estimates are elementary and do not require the difficult Lemma 2.1.

Lemma 4.2. We have $E(H) = \Theta(\log n)$.

Proof. It is enough to show that the number of big solutions is $\Theta(n^{s/k-1})$, since a solution's contribution in the expectation of $H$ is of order $\Theta(n^{-s/k+1} \log n)$. If $s - 1 > s_1(k)$ ($s_1(k)$ is the threshold in Theorem 1.1), then we can show that the number of small solutions is negligible. Fixing $x_1$, there are $O(n^{(s-1)/k-1})$ solutions to the equation $x_2^k + \cdots + x_s^k = n - x_1^k$. On the other hand, $x_1^k$ can have only $n^{1/k-\epsilon}$ values. Therefore, there are only $O(n^{s/k-1-\epsilon}) = o(n^{s/k-1})$ small solutions. So the statement follows from Theorem 1.1. The only cases when $s - 1 \le s_1(k)$ are the exceptional cases: $k = 2$, $s = 4$ or 5. For the case $s = 4$, it is well known that the number of representations of $n$ as a sum
of four squares is $\Theta(n)$. The statement for these cases becomes trivial since every solution is big by definition.

Lemma 4.3. There is a positive constant $\mu$ such that $E_A(H) = O(n^{-\mu})$ for all $A$, $1 \le |A| \le s-1$.

Proof. First assume that $k > 2$. Let $m = n - \sum_{y \in A} y^k$, and let $l = s - |A|$. We have
\[
E_A(H) = O\Bigl(\log n \sum_{\substack{m = y_1^k + \cdots + y_l^k \\ y_j \ge n^{1/k-\epsilon}}} \prod_{j=1}^{l} y_j^{-1+k/s}\Bigr). \tag{4.2}
\]
If $l > s_1(k)$, then there are $O(n^{l/k-1})$ terms in the sum. On the other hand, each term is at most $n^{(1/k-\epsilon)(-1+k/s)l} = n^{-l/k+l/s+\epsilon l(1-k/s)}$. Thus,
\[
E_A(H) = O\bigl(n^{-(1-l/s-\epsilon l(1-k/s))} \log n\bigr) = O\bigl(n^{-\mu}\bigr), \tag{4.3}
\]
where $\mu = (1 - l/s - \epsilon l) > 0$. The proof for the case $l \le s_1(k)$ relies on the fact that the number of solutions of $x^k + y^k = n$ is $n^{o(1)}$. It follows that the number of terms in the sum is $O(n^{(l-2)/k+o(1)})$. So,
\[
E_A(H) = O\bigl(n^{(l-2)/k+o(1)-l/k+l/s+\epsilon l(1-k/s)} \log n\bigr) = O\bigl(n^{-\mu}\bigr), \tag{4.4}
\]
where $\mu = 2/k - l/s - \epsilon l > 0$, completing the proof. For $k = 2$, the situation is even simpler. Since $l < s$, $\mu = 2/k - l/s - \epsilon l > 1/s - \epsilon l > 0$, and the statement follows directly from (4.4).

Applying Lemma 1.4, it follows that, with probability at least $1 - O(n^{-2})$, $R^s_{\mathrm{big},X}(n) = \Theta(\log n)$. By Borel-Cantelli's lemma, (4.1) holds a.s. for all large $n$. To obtain (4.1) for every $n$ one can repeat the last argument of the proof of the main theorem.
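The exponent bookkeeping in the proof of Lemma 4.3 can be checked with exact rational arithmetic. The sketch below assumes $\epsilon = 1/ks^2$, as in the definition of big solutions, and verifies two things for a range of small parameters: the expansion of the exponent $(1/k-\epsilon)(-1+k/s)l$ used before (4.3), and the positivity of $\mu = 1 - l/s - \epsilon l$ for $1 \le l \le s-1$. The function names are hypothetical.

```python
from fractions import Fraction


def epsilon(k, s):
    """The constant epsilon = 1/(k*s^2) from the definition of big solutions."""
    return Fraction(1, k * s * s)


def term_exponent(k, s, l):
    """Exponent of n in the bound n^((1/k - eps)(-1 + k/s) l) for one term of (4.2)."""
    return (Fraction(1, k) - epsilon(k, s)) * (Fraction(k, s) - 1) * l


def expanded_exponent(k, s, l):
    """The same exponent written as -l/k + l/s + eps*l*(1 - k/s)."""
    return -Fraction(l, k) + Fraction(l, s) + epsilon(k, s) * l * (1 - Fraction(k, s))


def mu_many_terms(k, s, l):
    """mu = 1 - l/s - eps*l, the exponent gained in (4.3)."""
    return 1 - Fraction(l, s) - epsilon(k, s) * l


if __name__ == "__main__":
    for k in range(2, 5):
        for s in range(k + 1, 13):
            for l in range(1, s):
                assert term_exponent(k, s, l) == expanded_exponent(k, s, l)
                assert mu_many_terms(k, s, l) > 0
    print(mu_many_terms(2, 4, 3))  # 5/32
```

Since $1 - k/s \le 1$ whenever $k < s$, the quantity $1 - l/s - \epsilon l(1-k/s)$ in (4.3) is at least the value returned by `mu_many_terms`, which is why positivity of the latter suffices.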
Remark. Theorem 4.1 can also be proved using Janson's inequality [Jan]. However, the computation is a little bit more complicated.

A generalization of Erdős-Tetali's result (case $k = 1$). Our proof of the main theorem also gives a new proof for the result of Erdős and Tetali, which verifies the statement of the main theorem for $k = 1$. The key advantage of $k = 1$ is that, in this case, Lemma 2.1 becomes a triviality, and the proof can be obtained by repeating the arguments in §3 with a few nominal changes. (In fact the computation becomes much easier for this case.) Furthermore, our machinery allows us to generalize the result in the following way. Let $a_1, \dots, a_s$ be fixed positive integers with $\gcd(a_1, \dots, a_s) = 1$. Consider the following type of representations:
\[
n = a_1 x_1 + \cdots + a_s x_s. \tag{4.5}
\]
Theorem 4.4. For any $s \ge 2$, there is a subset $X \subset \mathbb{N}_0$ such that
\[
R^s_X(n) = \Theta(\log n)
\]
for every positive integer $n$, where $R^s_X(n)$ is the number of representations of type (4.5) of $n$ in $X$.

Erdős-Tetali's result is the special case where all $a_j = 1$. To prove Theorem 4.4, instead of the difficult Lemma 2.1, we need only the following straightforward statement.

Lemma 4.5. For all sequences $P_1, \dots, P_s$, the number of solutions of (4.5) in the box $\prod_{j=1}^{s} [1, P_j]$ is $O(n^{-1} \prod_{j=1}^{s} P_j)$.

It looks plausible that the main theorem could also be generalized in this direction. However, due to space limitations, we do not investigate this matter.

Sidon's question and an extension of a theorem of Erdős and Rényi. In 1932, Sidon [Sid], in connection with his work in Fourier analysis, investigated power series of type $\sum_{i=1}^{\infty} z^{a_i}$, when $(\sum_{i=1}^{\infty} z^{a_i})^h$ has bounded coefficients. This leads to the study of sequences $S = (a_i)_1^{\infty}$ such that for all $m$,
\[
R^h_S(m) \le g
\]
for some fixed constant $g$. A set $S$ is of type $B_h(g)$ if it satisfies the above inequality for all $m$. One classical question in this area is the following (see [PS]). Let $S$ be a sequence of type $B_h(g)$. How fast can $S(n)$ grow, where $S(n)$ is the number of elements of $S$ not exceeding $n$? In [ER], Erdős and Rényi gave an answer to this question for the case $h = 2$. This result was discussed in great detail by Halberstam and Roth [HR].

Theorem 4.6. For any positive constant $\epsilon$, there is $g = g(\epsilon)$ such that there is an infinite sequence $S$ of type $B_2(g)$ satisfying $S(n) > n^{1/2-\epsilon}$ for sufficiently large $n$.

The result is best possible up to the $\epsilon$ term. Erdős-Rényi's proof uses moment computation and is technical. Our Lemma 1.5 yields a simple proof for the following theorem, which generalizes Theorem 4.6 to arbitrary $h$.

Theorem 4.7. For any positive integer $h \ge 2$ and any positive constant $\epsilon$, there is $g = g(\epsilon, h)$ such that there is an infinite sequence $S$ of type $B_h(g)$ satisfying $S(n) > n^{1/h-\epsilon}$ for all sufficiently large $n$.
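For a finite set, the property of being of type $B_h(g)$ can be tested directly. The sketch below assumes that $R^h_S(m)$ counts representations of $m$ as a sum of $h$ elements of $S$ up to ordering, with repetition allowed; if ordered representations are meant instead, the counts change by a bounded factor. The helper names are hypothetical.

```python
from collections import Counter
from itertools import combinations_with_replacement


def representation_counts(S, h):
    """Map each m to the number of multisets of h elements of S that sum to m."""
    counts = Counter()
    for combo in combinations_with_replacement(sorted(S), h):
        counts[sum(combo)] += 1
    return counts


def is_type_B(S, h, g):
    """True if no integer has more than g representations as a sum of h elements of S."""
    return max(representation_counts(S, h).values()) <= g


if __name__ == "__main__":
    mian_chowla = [1, 2, 4, 8, 13, 21, 31, 45]   # initial terms of a classical Sidon sequence
    print(is_type_B(mian_chowla, 2, 1))           # True
    print(is_type_B([1, 2, 3, 4], 2, 1))          # False, since 5 = 1 + 4 = 2 + 3
    print(is_type_B([1, 2, 3, 4], 2, 2))          # True
```

The first example is a set in which all pairwise sums are distinct, that is, a set of type $B_2(1)$ under the convention above.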
We leave the details for the reader as an exercise. Several related problems that can be solved by the technique developed in this paper will be discussed in a future paper.

Acknowledgments. Most of this work was completed while the author was at the Institute for Advanced Study in Princeton, New Jersey. I thank P. Sarnak for several useful discussions and P. Tetali for his careful reading of the manuscript. I also thank the referee for useful comments.

References

[CEN] S. L. G. Choi, P. Erdős, and M. Nathanson, Lagrange's theorem with N^{1/3} squares, Proc. Amer. Math. Soc. 79 (1980), 203–205.
[Erd] P. Erdős, "Problems and results in additive number theory" in Colloque sur la Théorie des Nombres (Bruxelles, 1955), Masson and Cie, Paris, 1956, 127–137.
[EN] P. Erdős and M. Nathanson, "Lagrange's theorem and thin subsequences of squares" in Contribution to Probability, ed. J. Gani and V. K. Rohatgi, Academic Press, New York, 1981, 3–9.
[ERa] P. Erdős and R. Rado, Intersection theorems for systems of sets, J. London Math. Soc. 35 (1960), 85–90.
[ER] P. Erdős and A. Rényi, Additive properties of random sequences of positive integers, Acta Arith. 6 (1960), 83–110.
[ESS] P. Erdős, A. Sárközy, and V. T. Sós, Problems and results on additive properties of general sequences, III, Studia Sci. Math. Hungar. 22 (1987), 53–63.
[ETe] P. Erdős and P. Tetali, Representations of integers as the sum of k terms, Random Structures Algorithms 1 (1990), 245–261.
[ET] P. Erdős and P. Turán, On a problem of Sidon in additive number theory, and on some related problems, J. London Math. Soc. 16 (1941), 212–215.
[HR] H. Halberstam and K. F. Roth, Sequences, 2d ed., Springer, New York, 1983.
[Hil] D. Hilbert, Beweis für die Darstellbarkeit der ganzen Zahlen durch eine feste Anzahl n-ter Potenzen (Waringsches Problem), Nachrichten von der Königlichen Gesellschaft der Wissenschaften zu Göttingen, mathematisch-physikalische Klasse aus dem Jahre 1909, 17–36; Math. Ann. 67 (1909), 281–300.
[Jan] S. Janson, Poisson approximation for large deviations, Random Structures Algorithms 1 (1990), 221–229.
[KV1] J. H. Kim and V. H. Vu, Concentration of multivariate polynomials and its applications, Combinatorica 20 (2000), 417–434.
[KV2] J. H. Kim and V. H. Vu, Small complete arcs on projective planes, submitted.
[Kol] M. Kolountzakis, "Some applications of probability to additive number theory and harmonic analysis" in Number Theory (New York, 1991–1995), ed. D. V. Chudnovsky, G. V. Chudnovsky, and M. B. Nathanson, Springer, New York, 1996, 229–251.
[Nat1] M. Nathanson, "Waring's problem for sets of density zero" in Analytic Number Theory (Philadelphia, 1980), ed. M. Knopp, Lecture Notes in Math. 899, Springer, Berlin, 1981, 301–310.
[Nat2] M. Nathanson, Additive Number Theory: The Classical Bases, Grad. Texts in Math. 164, Springer, New York, 1996.
[PS] C. Pomerance and A. Sárközy, "Combinatorial number theory" in Handbook of Combinatorics, Vol. 1, ed. R. Graham, M. Grötschel, and L. Lovász, Elsevier, Amsterdam, 1995, 967–1018.
[Ruz] I. Ruzsa, A just basis, Monatsh. Math. 109 (1990), 145–151.
[Seg] B. Segre, Le geometrie di Galois, Ann. Mat. Pura Appl. (4) 48 (1959), 1–96.
[Sid] S. Sidon, Ein Satz über trigonometrische Polynome und seine Anwendung in der Theorie der Fourier-Reihen, Math. Ann. 106 (1932), 536–539.
[Spe] J. Spencer, "Four squares with few squares" in Number Theory (New York, 1991–1995), ed. D. V. Chudnovsky, G. V. Chudnovsky, and M. B. Nathanson, Springer, New York, 1996, 295–297.
[Tal] M. Talagrand, A new look at independence, Ann. Probab. 24 (1996), 1–34.
[Vau] R. C. Vaughan, The Hardy-Littlewood Method, Cambridge Tracts in Math. 80, Cambridge Univ. Press, Cambridge, 1981.
[Vu1] V. H. Vu, On some simple degree conditions that guarantee the upper bound on the chromatic (choice) number of random graphs, J. Graph Theory 31 (1999), 201–226.
[Vu2] V. H. Vu, On the concentration of multivariate polynomials with small expectation, Random Structures and Algorithms 16 (2000), 344–363.
[Vu3] V. H. Vu, New bounds on nearly perfect matchings of hypergraphs: Higher codegrees do help, Random Structures and Algorithms 17 (2000), 29–63.
[Vu4] V. H. Vu, Some new concentration results and applications, manuscript.
[Wir] E. Wirsing, Thin subbases, Analysis 6 (1986), 285–308.
[Zöl1] J. Zöllner, Über eine Vermutung von Choi, Erdős und Nathanson, Acta Arith. 45 (1985), 211–213.
[Zöl2] J. Zöllner, Der Vier-Quadrate-Satz und ein Problem von Erdős und Nathanson, Ph.D. thesis, Johannes Gutenberg-Universität, Mainz, 1984.

Microsoft Corporation, One Microsoft Way, Redmond, Washington 98052, U.S.A; [email protected]
Vol. 105, No. 1
DUKE MATHEMATICAL JOURNAL
© 2000
TWO-BODY SHORT-RANGE SYSTEMS IN A TIME-PERIODIC ELECTRIC FIELD

JACOB SCHACH MØLLER

1. Introduction. In this paper, we treat the scattering problem for two $\nu$-dimensional particles interacting through a short-range potential and placed in an external time-periodic electric field. The Hamiltonian for such a system is
\[
H(t) = \frac{p_1^2}{2m_1} + \frac{p_2^2}{2m_2} - q_1 \mathcal{E}(t) \cdot x_1 - q_2 \mathcal{E}(t) \cdot x_2 + v(x_2 - x_1) \quad \text{on } L^2(\mathbb{R}^{2\nu}). \tag{1.1}
\]
Here, $m_i$ and $q_i$, $i \in \{1, 2\}$, are the masses and the charges of the two particles, and $x_1$, $x_2$ denote their positions. The electric field $\mathcal{E}$ is periodic with some period $T > 0$; that is, $\mathcal{E}(t+T) = \mathcal{E}(t)$ almost everywhere. The short-range potential $v$ will be allowed to have an explicit time-dependence as long as this dependence is periodic with the same period as the field.

Recently, asymptotic completeness for many-body systems in constant electric fields has been proved for large classes of potentials; see [AT], [HMS1], and [HMS2]. For a treatment of propagation estimates for such systems, see [A]. All these results rely on well-known techniques that use local commutator estimates to obtain spectral and scattering information. By controlling the energy along the time evolution, one can apply some of these techniques to time-dependent problems. This has been done in [SS] and has been applied in [Si2]. In [G1] and [Z], time-boundedness of the kinetic energy plays an essential role, and in [HL], the problem of bounding the kinetic energy is treated for repulsive potentials using positive commutator techniques.

In the present problem, however, one readily observes that the energy is generally not bounded in time. In fact, for very simple examples like $\mathcal{E}(t) = 1/2 + \cos(t)$ ($\nu = 1$ and $v = 0$), one finds that the expectation value of the energy oscillates with an amplitude that grows like $t^2$. On the other hand, the expectation of $x_2 - x_1$ grows like $(q_2/m_2 - q_1/m_1)t^2$ as one would expect from the constant field problem. Consequently, it is natural to suggest that completeness and absence of bound states (the meaning of which is discussed later) hold here as well, provided the particles have different charge to mass ratio. In order to ensure the growth of $x_2 - x_1$, we make the crucial assumption that the field has nonzero mean.

Received 13 April 1999. 2000 Mathematics Subject Classification. Primary 81U05; Secondary 81Q10.
Møller partially supported by Rejselegat for Matematikere and by Training and Mobility of Researchers grant number FMRX-960001.
In order to circumvent the problem of controlling the energy, we adopt a method due to Howland [H1], which is motivated by a similar procedure in Hamiltonian mechanics, where one also faces the problem of nonconservation of energy in time-dependent problems. The idea is to include time as a space variable and to introduce a new momentum variable $\tau$, conjugate to time. One defines a new time-independent Hamiltonian $\hat H = \tau + H$ as a function on $\mathbb{R}^\nu_q \times \mathbb{R}_t \times \mathbb{R}^\nu_p \times \mathbb{R}_\tau$, from the original Hamiltonian $H$, to obtain the system
\[
\begin{aligned}
\dot q(s) &= \nabla_p \hat H\bigl(t(s), \tau(s), q(s), p(s)\bigr), & q(0) &= q_0, \\
\dot p(s) &= -\nabla_q \hat H\bigl(t(s), \tau(s), q(s), p(s)\bigr), & p(0) &= p_0, \\
\dot t(s) &= 1, & t(0) &= t_0, \\
\dot \tau(s) &= -\partial_t \hat H\bigl(t(s), \tau(s), q(s), p(s)\bigr), & \tau(0) &= \tau_0.
\end{aligned}
\]
Obviously, the $(q, p)$ part of the solution to this extended system is nothing more than time-translates of the solution to the physical system, with the initial value $t_0$. In [H1], the scattering problems of the quantized versions of $H$ and $\hat H$ are related, and in [Ya1] and [H2], the idea is applied to periodic systems. It is shown that the ranges of wave operators are connected in the same way the spectral subspaces of the monodromy operator $U(T, 0)$ are connected with those of $\hat H$. This observation makes it possible to transfer asymptotic completeness statements between the two settings. (The monodromy operator is the unitary operator that evolves the physical system through one period.)

Assuming periodicity enables one to compactify the extra space variable. The resulting Hamiltonian $\hat H$ is called the Floquet Hamiltonian. Compactification makes it possible to show that the potential is relatively compact with respect to the Floquet Hamiltonian; see Lemma 4.6. This point was used first by Yajima in [Ya1] to treat two-body systems with time-periodic potentials and was pointed out to the author by his supervisor E. Skibsted. It can be used to obtain positive commutator estimates locally in energy.
This idea has recently been applied in [Yo] to obtain propagation estimates for the Floquet Hamiltonian of time-periodic two-body Schrödinger operators. The idea of this paper is, thus, to treat the spectral and scattering problems for the time-independent Floquet Hamiltonian where one can apply local commutator estimates to obtain results and then in the end make the connection to the physical problem. More precisely, we prove that the Floquet Hamiltonian has no bound states, which, due to an argument by Yajima given in [Ya1], implies that the monodromy operator has empty point spectrum. This is what we mean by absence of bound states. Furthermore, we prove that the wave operators exist and are unitary. The results hold under some regularity assumptions on the potential which do not include singularities. We do, however, obtain some partial results for potentials with weak singularities. For a treatment of the usual two-body problem with time-dependent potentials that does not use local commutator methods, see [KY]. For two-body systems in a constant electric field, see [G2] and [JY].
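The classical extended system described above can be sketched numerically. The example below is an illustration only, under assumptions not made in the text: a single classical particle with hypothetical charge `Q` and mass `M` in dimension one, no potential, and the field $\mathcal{E}(t) = 1/2 + \cos t$ from the introduction, so that $\hat H = \tau + p^2/2M - Q\,\mathcal{E}(t)\,x$. A hand-written fourth-order Runge-Kutta step integrates $(t, \tau, x, p)$; along the flow $\hat H$ should stay constant while the physical energy does not.

```python
import math

Q, M = 1.0, 1.0  # hypothetical charge and mass, chosen only for illustration


def field(t):
    """Example field from the introduction, with period 2*pi and mean 1/2."""
    return 0.5 + math.cos(t)


def field_derivative(t):
    return -math.sin(t)


def rhs(state):
    """Hamilton's equations for H_hat = tau + p^2/(2M) - Q*field(t)*x."""
    t, tau, x, p = state
    dt = 1.0                               # dH_hat/dtau
    dtau = Q * field_derivative(t) * x     # -dH_hat/dt
    dx = p / M                             # dH_hat/dp
    dp = Q * field(t)                      # -dH_hat/dx
    return (dt, dtau, dx, dp)


def rk4_step(state, h):
    k1 = rhs(state)
    k2 = rhs(tuple(v + 0.5 * h * d for v, d in zip(state, k1)))
    k3 = rhs(tuple(v + 0.5 * h * d for v, d in zip(state, k2)))
    k4 = rhs(tuple(v + h * d for v, d in zip(state, k3)))
    return tuple(v + h * (a + 2.0 * b + 2.0 * c + d) / 6.0
                 for v, a, b, c, d in zip(state, k1, k2, k3, k4))


def flow(state, s_final, steps):
    """Integrate the extended system from s = 0 to s = s_final."""
    h = s_final / steps
    for _ in range(steps):
        state = rk4_step(state, h)
    return state


def extended_hamiltonian(state):
    t, tau, x, p = state
    return tau + p * p / (2.0 * M) - Q * field(t) * x


def physical_energy(state):
    t, _, x, p = state
    return p * p / (2.0 * M) - Q * field(t) * x


if __name__ == "__main__":
    final = flow((0.0, 0.0, 0.0, 0.0), 20.0, 20000)
    print("x(20)           =", final[2])
    print("closed form     =", (Q / M) * (20.0 ** 2 / 4.0 + 1.0 - math.cos(20.0)))
    print("H_hat at s = 20 =", extended_hamiltonian(final))
    print("energy at s = 20=", physical_energy(final))
```

With zero initial data, integrating $\dot p = Q\,\mathcal{E}(t)$ and $\dot x = p/M$ by hand gives $x(s) = (Q/M)(s^2/4 + 1 - \cos s)$, which shows the quadratic growth of the position mentioned in the introduction; the numerical solution can be compared against this formula.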
The case of time-periodic electric fields with zero mean includes the AC-Stark effect $\mathcal{E}(t) = \mu \sin t$, $\mu \in \mathbb{R}^\nu$. By virtue of the Avron-Herbst formula (see Proposition 4.3), this falls under the category of two-body systems with time-periodic potentials, and its scattering theory is, therefore, covered by [KY], [Ya1], and [Yo]. In [Ya1] and [H2], the authors did not work explicitly with the Floquet Hamiltonian but rather with its resolvent, which can be expressed in terms of the physical flow. In this paper, however, we wish to utilize specific properties of the Stark Hamiltonian in our analysis of its Floquet Hamiltonian.

In Section 2, we phrase the assumptions that we impose on the potential and on the electric field, and we state the main results of this paper. In Section 3, we elaborate on the work done by Howland in [H2] and Yajima in [Ya1], and we prove some abstract results on the structure of Floquet Hamiltonians. Some preliminary results are derived in Section 4, and in Section 5 we apply a combination of ideas used in [Si1] and [HMS1] to obtain absence of bound states for the Floquet Hamiltonian. In Section 6, we prove a Mourre estimate and apply it to obtain absence of singular continuous spectrum for the Floquet Hamiltonian, as well as a pointwise propagation estimate for the momentum operator which we use to get a minimal acceleration estimate. Finally, in Section 7, we prove the existence and completeness of wave operators for the Floquet Hamiltonian, and we argue, as in [Ya1] and [H2], to obtain the main result. Appendix A is devoted to a treatment of absolutely continuous vector-valued functions and the derivative on an interval. In Appendix B, the time-dependent Schrödinger equation for Hamiltonians defined almost everywhere is discussed.

Acknowledgments. The central ideas of this paper were developed during a stay at the University of Tokyo. I wish to thank Professor H. Kitada for discussions on absence of bound states. The abstract theory and the removal of the time-dependency from the electric field were done during a stay at the University of Virginia. A discussion that I had with I.
Herbst inspired the transformation that is used to move the time-dependency. The treatment of singularities was added when the paper was included in my Ph.D. thesis, and comments from the committee led to the two appendices. Finishing touches were made at Université Paris-Sud. Finally, I would like to thank my supervisor E. Skibsted and the committee, H. H. Andersen, J. Dereziński, and A. Jensen, for corrections and suggestions.

2. Assumptions, notation, and results. We work within the following framework. Let $X$ be a real, finite-dimensional vector space equipped with an inner product. We denote by $x$ the operator of multiplication with the identity function on $X$, and $p$ is the momentum operator $-i\nabla_x$. We write $\langle x\rangle$ for $(1+|x|^2)^{1/2}$, and we use the same abbreviation for parameters as well as for other selfadjoint operators on $\mathcal{H} = L^2(X)$. Let $\nu = \dim X$. In this section, $T > 0$ denotes a common period for the field and for the potential.
138
JACOB MØLLER
Assumption 2.1. The potentials $V_t \in C^2(X)$ form a family of real-valued functions, periodic with period $T$. The family and its distributional derivative with respect to $t$ satisfy
\[ \sup_{t\in\mathbb{R}} |V_t(x)| + \sup_{t\in\mathbb{R}} |\nabla_x V_t(x)| = o(1) \quad\text{and}\quad \sup_{t\in\mathbb{R}} |\partial_t V_t(x)| + \sup_{t\in\mathbb{R}} |\partial_x^\alpha V_t(x)| = O(1), \]
for $|\alpha| = 2$. We say that $V_t$ is short-range if it satisfies Assumption 2.1 and the decay assumption
\[ \sup_{t\in\mathbb{R}} |V_t(x)| = O\bigl(\langle x\rangle^{-1/2-\epsilon}\bigr), \]
for some $\epsilon > 0$. (The symbol $\epsilon$ is used in this connection only.)

Assumption 2.2. The potentials $V_t = V^t_{\mathrm{reg}} + V^t_{\mathrm{sing}}$ are real-valued, $V^t_{\mathrm{reg}}$ satisfies Assumption 2.1, and $V^t_{\mathrm{sing}} = 0$ if $\nu < 3$. The singular part is periodic with period $T$, $\bigcup_{t\in\mathbb{R}} \operatorname{supp}(V^t_{\mathrm{sing}}) \subset X$ is compact, and there exists $p > \nu$ such that $V^t_{\mathrm{sing}}$ and its first-order distributional derivatives satisfy
\[ \sup_{t\in\mathbb{R}} \|V^t_{\mathrm{sing}}\|_{L^p(X)} + \sup_{t\in\mathbb{R}} \|\partial_t V^t_{\mathrm{sing}}\|_{L^q(X)} + \sup_{t\in\mathbb{R}} \|\nabla_x V^t_{\mathrm{sing}}\|_{L^q(X)} < \infty, \]
where $q = 2\nu p/(\nu + 4p)$ if $\nu \ge 5$ and $q > 2p/(p+1)$ if $\nu \in \{3, 4\}$.

The specific form of this assumption is chosen such that the result on existence of evolutions in [Ya2] applies and such that the class of potentials satisfying Assumption 2.2 is invariant under the type of transformations given in Lemma 4.4.

Assumption 2.3. The electric field satisfies $E \in L^1_{\mathrm{loc}}(\mathbb{R}; X)$, $E(t+T) = E(t)$ a.e., and $\int_0^T E(t)\,dt \neq 0$.

We consider the Hamiltonians
\[ H_0(t) = p^2 - E(t)\cdot x \quad\text{and}\quad H(t) = H_0(t) + V_t. \tag{2.1} \]
In Section 4, we prove that under Assumptions 2.2 and 2.3 the time-dependent Schrödinger equation corresponding to $H_0(t)$ and $H(t)$ can be solved uniquely in the sense of Definition B.7 (with $\mathcal{H}_1 = \mathcal{D}(p^2) \cap \mathcal{D}(\langle x\rangle)$). We write $U_0(t,s)$ and $U(t,s)$ for the solutions. The main results of this paper are the following two theorems.

Theorem 2.4 (Absence of bound states). Assume $V_t$ satisfies Assumption 2.1 and $E$ satisfies Assumption 2.3. Then the monodromy operator $U(T,0)$ has purely absolutely continuous spectrum.

Under Assumption 2.2 we prove that the eigenfunctions of the Floquet Hamiltonian vanish in a half-space determined by the field.
TIME-PERIODIC ELECTRIC FIELDS
139
Theorem 2.5 (Asymptotic completeness). Assume $V_t$ is short-range and $E$ satisfies Assumption 2.3. Then the wave operators
\[ W_\pm(s) = \operatorname{s-lim}_{t\to\pm\infty} U^*(t,s)\,U_0(t,s) \tag{2.2} \]
exist for all $s \in \mathbb{R}$ and are unitary. Furthermore, $W_\pm(s+T) = W_\pm(s)$ and
\[ U(s+T, s)\,W_\pm(s) = W_\pm(s)\,U_0(s+T, s) \tag{2.3} \]
for all $s \in \mathbb{R}$.

The Hamiltonian presented in (1.1) takes the form of (2.1) in the center of mass frame. The configuration space becomes $X = \{x \in \mathbb{R}^{2\nu} : m_1 x_1 + m_2 x_2 = 0\}$ with the inner product $x\cdot y = 2m_1 x_1\cdot y_1 + 2m_2 x_2\cdot y_2$. The orthogonal projection onto $X$ is
\[ \pi = \frac{1}{m_1+m_2}\begin{pmatrix} m_2 I_\nu & -m_2 I_\nu \\ -m_1 I_\nu & m_1 I_\nu \end{pmatrix}, \]
where $I_\nu$ is the $(\nu\times\nu)$-identity matrix, and we find
\[ E(t) = \pi\begin{pmatrix} (q_1/2m_1)\,\mathcal{E}(t) \\ (q_2/2m_2)\,\mathcal{E}(t) \end{pmatrix} = \frac{1}{2(m_1+m_2)}\Bigl(\frac{q_1}{m_1} - \frac{q_2}{m_2}\Bigr)\begin{pmatrix} m_2\,\mathcal{E}(t) \\ -m_1\,\mathcal{E}(t) \end{pmatrix}. \]
Thus, we have the following corollary.

Corollary 2.6. Assume $q_1/m_1 \neq q_2/m_2$, the potential $v$ is short-range (with $X = \mathbb{R}^\nu$), and the electric field $\mathcal{E}$ satisfies Assumption 2.3 (with $X = \mathbb{R}^\nu$). Then the wave operators in (2.2) exist and they are unitary. Furthermore, the periodicity and intertwining relation (2.3) are satisfied.

3. The Floquet Hamiltonian. Let $\{U(t,s)\}_{t,s\in\mathbb{R}}$ be a family of strongly continuous unitary operators on a separable Hilbert space $\mathcal{H}$. Assume the family satisfies the Chapman-Kolmogorov equations
\[ U(t,r)\,U(r,s) = U(t,s) \quad\text{for all } t, r, s \in \mathbb{R} \tag{3.1} \]
and is periodic; that is, it satisfies the periodicity condition
\[ U(t+T, s+T) = U(t,s) \quad\text{for all } t, s \in \mathbb{R} \tag{3.2} \]
for some $T > 0$, which we call the period. We note that $U(t,t) = I$ and $U^*(t,s) = U(s,t)$ for all $t, s \in \mathbb{R}$. For any unitary operator $V$ on $\mathcal{H}$, we define the set
\[ \mathcal{D}_V = \bigl\{\psi \in AC^2([0,T]; \mathcal{H}) : \psi(0) = V\psi(T)\bigr\}, \]
and the selfadjoint operator $\tau_V$ as $-i\,d/dt$ with domain $\mathcal{D}_V$. See Appendix A for a discussion of absolutely continuous functions and the derivatives $\tau_V$. We define a family of operators pointwise on $C^0([0,T]; \mathcal{H})$ by
\[ [\hat U(s)\psi](t) = U(t, t-s)\,\psi\bigl(t - s - [t-s]\bigr), \tag{3.3} \]
where $[s]$ denotes the largest multiple of $T$ smaller than or equal to $s$. By boundedness, we extend to a family of unitary operators on
\[ \hat{\mathcal{H}} = L^2([0,T]; \mathcal{H}). \]
It is easily verified that $\{\hat U(s)\}_{s\in\mathbb{R}}$ is a strongly continuous group with $\hat U(0) = I$. We have the following proposition.

Proposition 3.1 (The Floquet Hamiltonian). The selfadjoint operator that generates $\hat U(s)$ is $\hat H = U\tau_{U(T,0)}U^*$ with domain $U\mathcal{D}_{U(T,0)}$, where $U$ is the unitary operator defined pointwise by $[U\psi](t) = U(t,0)\psi(t)$. For $\lambda \in \mathbb{C}$, $\operatorname{Im}\lambda \neq 0$, the resolvent of $\hat H$ is given by
\[ \bigl[(\hat H - \lambda)^{-1}\psi\bigr](t) = iU(t,0)\Bigl[\int_0^t e^{i(t-s)\lambda}\,U(0,s)\psi(s)\,ds + \bigl(e^{-i\lambda T}U(0,T) - I\bigr)^{-1}\int_0^T e^{i(t-s)\lambda}\,U(0,s)\psi(s)\,ds\Bigr]. \]
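As a plausibility check ahead of the proof, the resolvent formula can be recovered by a formal variation-of-constants computation. The sketch below assumes that $\hat H$ acts as $-i\partial_t + H(t)$ on $T$-periodic functions and that $i\partial_t U(t,0) = H(t)U(t,0)$; neither is assumed in Proposition 3.1 itself. The names $\varphi$ (the solution) and $g$ (an auxiliary function) are introduced here for illustration only.

```latex
% Formal derivation: solve (\hat H - \lambda)\varphi = \psi with \varphi(0) = \varphi(T).
\begin{align*}
&\text{Ansatz } \varphi(t) = U(t,0)\,g(t)
  \;\Longrightarrow\; \partial_t g(t) = i\lambda\,g(t) + i\,U(0,t)\psi(t), \\
&g(t) = e^{i\lambda t} g(0) + i\int_0^t e^{i(t-s)\lambda}\,U(0,s)\psi(s)\,ds, \\
&\varphi(T) = \varphi(0)
  \;\Longrightarrow\; \bigl(e^{-i\lambda T}U(0,T) - I\bigr)\,g(0)
   = i\int_0^T e^{-is\lambda}\,U(0,s)\psi(s)\,ds, \\
&\varphi(t) = i\,U(t,0)\Bigl[\int_0^t e^{i(t-s)\lambda}\,U(0,s)\psi(s)\,ds
   + \bigl(e^{-i\lambda T}U(0,T) - I\bigr)^{-1}
     \int_0^T e^{i(t-s)\lambda}\,U(0,s)\psi(s)\,ds\Bigr].
\end{align*}
```

The inverse in the third line exists because $U(0,T)$ is unitary while $|e^{-i\lambda T}| \neq 1$ when $\operatorname{Im}\lambda \neq 0$.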
Proof. Compute for $f \in C^0([0,T]; \mathcal{H})$ using (3.1)–(3.3):
\begin{align*}
[\hat U(s)f](t) &= U(t,0)\,U(0,t-s)\,f\bigl(t-s-[t-s]\bigr) \\
&= U(t,0)\,U\bigl(-[t-s],0\bigr)\,[U^*f]\bigl(t-s-[t-s]\bigr) \\
&= U(t,0)\,U(0,T)^{[t-s]/T}\,[U^*f]\bigl(t-s-[t-s]\bigr).
\end{align*}
By Proposition A.8, we find that $\hat U(s) = \exp(-isU\tau_{U(T,0)}U^*)$, which shows that the generator $\hat H$ of $\hat U(s)$ equals $U\tau_{U(T,0)}U^*$. The resolvent formula is now a consequence of (A.2).

Let $S(t)$ be a strongly continuous periodic family of unitary operators on $\mathcal{H}$. Then $S(t)U(t,s)S^*(s)$ satisfies (3.1) and (3.2) and has a Floquet Hamiltonian. We say that $S(t)$ is a periodic change of coordinates.

Lemma 3.2. Let $U_1(t,s)$ and $U_2(t,s)$ be periodic families of unitary operators. Suppose there exists a periodic change of coordinates $S(t)$ such that $U_1(t,s) = S(t)U_2(t,s)S^*(s)$. Then the Floquet Hamiltonians $\hat H_1$ and $\hat H_2$ satisfy $\hat H_1 = S\hat H_2 S^*$, where the unitary operator $S$ is given by $[Sf](t) = S(t)f(t)$.
Proof. We write $S_0$ for the unitary operator on $\hat{\mathcal{H}}$ given by $(S_0 f)(t) = S(0)f(t)$, and we compute, using the assumption, Proposition 3.1, and (A.3),
\begin{align*}
S\hat H_2 S^* &= SU_2\,\tau_{U_2(T,0)}\,U_2^*S^* \\
&= SU_2S_0^*\,S_0\tau_{U_2(T,0)}S_0^*\,S_0U_2^*S^* \\
&= U_1\,\tau_{S(0)U_2(T,0)S(0)^*}\,U_1^* = \hat H_1.
\end{align*}

We now present a result on the spectral structure of the Floquet Hamiltonian, which was observed by Yajima in [Ya1]. It follows from Propositions 3.1 and A.9.

Proposition 3.3. Let $U(t,s)$ be a periodic family of unitary operators. The Floquet Hamiltonian satisfies
\begin{align*}
\lambda \in \sigma_{pp}(\hat H) &\iff e^{-i\lambda T} \in \sigma_{pp}(U(T,0)), & \mathcal{H}_{pp}(\hat H) &= U L^2\bigl([0,T]; \mathcal{H}_{pp}(U(T,0))\bigr), \\
\lambda \in \sigma_{ac}(\hat H) &\iff e^{-i\lambda T} \in \sigma_{ac}(U(T,0)), & \mathcal{H}_{ac}(\hat H) &= U L^2\bigl([0,T]; \mathcal{H}_{ac}(U(T,0))\bigr), \\
\lambda \in \sigma_{sc}(\hat H) &\iff e^{-i\lambda T} \in \sigma_{sc}(U(T,0)), & \mathcal{H}_{sc}(\hat H) &= U L^2\bigl([0,T]; \mathcal{H}_{sc}(U(T,0))\bigr).
\end{align*}

In the following, $H(t)$ denotes a time-periodic family of Hamiltonians which fits the requirements given in Definition B.7 (for some $\mathcal{H}_1 \subset \mathcal{H}$). Time-periodicity means that the following identity holds a.e.:
\[ H(t+T)\varphi = H(t)\varphi \quad\text{for all } \varphi \in \mathcal{H}_1. \tag{3.4} \]
Suppose we have a family of unitary operators $U(t,s)$ which solves the time-dependent Schrödinger equation in the sense of Definition B.7. For such a solution, (3.1) is given by Corollary B.6 and (3.2) follows from (3.4) and Theorem B.4; see Appendix B for details. We introduce some assumptions that enable us to interpret $\hat H$ as the operator sum of $\tau_I$ and $H(t)$.

Assumption 3.4. Let $H(t)$ be a time-periodic family of Hamiltonians. Suppose there exist a selfadjoint operator $B$ on $\mathcal{H}$ and a family of unitary operators $U(t,s)$ which solves the time-dependent Schrödinger equation with $\mathcal{H}_1 = \mathcal{D}(B)$. Suppose, furthermore, that $H(t)(B-i)^{-1} \in L^2([0,T]; \mathcal{B}(\mathcal{H}))$.

Proposition 3.5. Let $H(t)$ be a time-periodic family of Hamiltonians. Suppose Assumption 3.4 is satisfied for some $B$. Then
\[ \mathcal{D}_0 = \Bigl\{\psi \in \mathcal{D}_I : \psi([0,T]) \subset \mathcal{D}(B) \text{ and } \sup_{t\in[0,T]} \|B\psi(t)\| < \infty\Bigr\} \]
is a core for $\hat H$. The operator $\tau_I + H$ defined on $\mathcal{D}_0$ by $[(\tau_I + H)\psi](t) = [\tau_I\psi](t) + H(t)\psi(t)$ is essentially selfadjoint and its closure equals $\hat H$.
Proof. Consider the set $\operatorname{span}\{e^{im(2\pi/T)t}\psi : m \in \mathbb{Z},\ \psi \in \mathcal{D}(B)\}$, which is dense in $\hat{\mathcal{H}}$. Since this set is contained in $\mathcal{D}_0$, we find that $\mathcal{D}_0$ is dense. By (3.1), $U(t,t-s) = U(t,0)U(0,t-s)$, and this, together with Assumption 3.4 and (B.3), shows that the operator $(B+i)U(t,t-s)(B+i)^{-1}$ is bounded uniformly in $s, t \in [0,T]$. This implies by (3.3) that
\[ \sup_{s,t\in[0,T]} \bigl\|B[\hat U(s)\psi](t)\bigr\| < \infty \tag{3.5} \]
for $\psi \in \mathcal{D}_0$. By (3.1) and Proposition B.3(ii) and (iii) we find that
\[ t \mapsto U(t,t-s)\,\psi\bigl(t-s-[t-s]\bigr) \in AC([0,T]; \mathcal{H}) \]
for $\psi \in \mathcal{D}_0$. Furthermore, we have
\[ \bigl(U(t,t-s) - U(s,0)\bigr)\psi\bigl(t-s-[t-s]\bigr) = \int_s^t \Bigl\{ i\bigl[U(r,r-s)H(r-s) - H(r)U(r,r-s)\bigr]\psi\bigl(r-s-[r-s]\bigr) + U(r,r-s)(\partial\psi)\bigl(r-s-[r-s]\bigr)\Bigr\}\,dr. \]
The last part of Assumption 3.4 and (3.5) now combine to prove that the derivative of $\hat U(s)\psi$ is square-integrable; thus, we get $\hat U(s)\mathcal{D}_0 \subset \mathcal{D}_0$. This inclusion shows that $\hat H|_{\mathcal{D}_0}$ has no other selfadjoint extension than $\hat H$ and is, therefore, essentially selfadjoint with closure equaling $\hat H$. We compute
\[ [\hat H\psi](t) = H(t)\psi(t) + [\tau_I\psi](t) \]
for $\psi \in \mathcal{D}_0$, which concludes the proof.

Example 3.6. If $H(t) = H$ is independent of time, one readily finds Assumption 3.4 satisfied with $\mathcal{D} = \mathcal{D}(B) = \mathcal{D}(H)$ and $B = H$. In this case the conclusion of Proposition 3.5 is trivial since $\hat H$ equals the closure of $H\otimes I + I\otimes\tau_I$ on $\mathcal{D}(H)\otimes\{\psi \in AC([0,T]; \mathbb{C}) : \psi(0) = \psi(T)\}$.

Example 3.7. For $\alpha < 1$ we consider $H(t) = (t-[t])^{-\alpha}$, where $[t]$ is the integer part of $t$, as an operator on $\mathcal{H} = \mathbb{C}$ ($H(0) = 0$). For $s, t \in [0,1]$, we find $U(t,s) = e^{i(s^{1-\alpha} - t^{1-\alpha})/(1-\alpha)}$ with $\mathcal{D} = \mathbb{C}$. For $\alpha < 1/2$ we find Assumption 3.4 satisfied with $B = 1$. When $\alpha \ge 1/2$ we find that any domain on which we should be able to write $\hat H$ as a sum of $H(t)$ and $\tau_I$ has to have its functions vanish at zero at some slow polynomial rate. Since functions in such a domain are also periodic, they must vanish at both endpoints. This is the classical example of an operator with deficiency indices equal to 1 and a whole range of selfadjoint extensions indexed by the unitary operators on $\mathbb{C}$; see also Proposition A.7.
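The scalar model of Example 3.7 is simple enough to probe numerically. The following is a minimal sketch, not part of the original argument: the function names `propagator`, `schrodinger_residual`, and `h_norm_squared` are hypothetical, and the truncation parameter `eps` is an illustrative assumption. It checks that the stated $U(t,s)$ has modulus one, satisfies (3.1), and solves $i\partial_t U = H(t)U$ away from $t = 0$, and it shows that $\int_\epsilon^1 |H(t)|^2\,dt$ stays bounded as $\epsilon \to 0$ only for $\alpha < 1/2$, which is the square-integrability required in Assumption 3.4 with $B = 1$.

```python
import cmath
import math

def propagator(t, s, alpha):
    """U(t, s) = exp(i (s^(1-alpha) - t^(1-alpha)) / (1 - alpha)) for 0 <= s, t <= 1, alpha < 1."""
    return cmath.exp(1j * (s ** (1 - alpha) - t ** (1 - alpha)) / (1 - alpha))

def schrodinger_residual(t, s, alpha, h=1e-6):
    """|i dU/dt - H(t) U| with H(t) = t^(-alpha), using a central difference in t."""
    dU = (propagator(t + h, s, alpha) - propagator(t - h, s, alpha)) / (2 * h)
    return abs(1j * dU - t ** (-alpha) * propagator(t, s, alpha))

def h_norm_squared(alpha, eps):
    """Exact value of the integral of t^(-2 alpha) over [eps, 1]."""
    if alpha == 0.5:
        return -math.log(eps)
    return (1.0 - eps ** (1 - 2 * alpha)) / (1 - 2 * alpha)

if __name__ == "__main__":
    print(abs(propagator(0.7, 0.2, 0.3)))          # modulus of U
    print(schrodinger_residual(0.5, 0.2, 0.3))     # small residual
    for alpha in (0.25, 0.5, 0.75):
        print(alpha, h_norm_squared(alpha, 1e-12)) # bounded only for alpha < 1/2
```

The cutoff integral is evaluated in closed form rather than by quadrature so that the blow-up at $\alpha \ge 1/2$ is not obscured by discretization error.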
4. Preliminary Results. In this section, we solve the time-dependent Schrödinger equation for the Hamiltonians $H_0(t)$ and $H(t)$, and we prove that multiplication by the indicator function $F(|x| < R)$ is $\hat H$-compact. Furthermore, we construct a unitary transformation that can be used to move the time-dependency of the field into the potential. We end the section by proving that the potential is relatively bounded with respect to the free Floquet Hamiltonian with relative bound zero. Assumption 2.2 is assumed to be satisfied throughout this section. The periodicity assumption on the potential is not required for existence of evolutions, that is, Lemma 4.1 and Proposition 4.3. For simplicity, we take $T = 1$ in the following.

Lemma 4.1. For $c \in C^1(\mathbb{R}; X)$, we write $G_c(t) = p^2 + W_t^c$, where $W_t^c(x) = V_t(x - c(t))$. There exists a family of unitary operators, $V_c(t,s)$, which solves the time-dependent Schrödinger equation for $G_c(t)$ (in the sense of Definition B.7) with $\mathcal{H}_1 = \mathcal{D}(p^2)$.

Proof. Under the given assumptions we find that $G_c(t)$ is selfadjoint on $\mathcal{D}(p^2)$ for all $t$, and by [Ya2, Theorem 1.3], we obtain a solution to (B.2) and (B.4) with the stated property.

We introduce a time-dependent coordinate change (see Definition B.1 for the set $\mathcal{U}_{1,2}$).
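Lemma 4.2. Assume $E \in L^1_{\mathrm{loc}}(\mathbb{R}; X)$. Let
\[ b(t) = \int_0^t E(r)\,dr, \qquad a(t) = \int_0^t b(r)^2\,dr, \qquad\text{and}\qquad c(t) = 2\int_0^t b(r)\,dr, \]
and define a time-dependent family of unitary operators on $\mathcal{H}$ by
\[ T(t) = e^{ia(t)}\exp\bigl(ib(t)\cdot x\bigr)\exp\bigl(-ic(t)\cdot p\bigr). \]
Then $T \in \mathcal{U}_{1,2}$, with $\mathcal{H}_1 = \mathcal{D}(p^2)\cap\mathcal{D}(\langle x\rangle)$. Furthermore,
\begin{align*}
T^*(t)\,x\,T(t) &= x + c(t) & &\text{and} & T^*(t)\,p\,T(t) &= p + b(t), \\
T(t)\,x\,T^*(t) &= x - c(t) & &\text{and} & T(t)\,p\,T^*(t) &= p - b(t).
\end{align*}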
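To get a feel for the functions in Lemma 4.2, the sketch below evaluates $b$, $a$, and $c$ by trapezoid quadrature in the scalar case $\nu = 1$. It is an illustration under stated assumptions: the sample field $E(t) = E_0 + \mu\sin(2\pi t)$, the parameter values, and the helper names `integrate` and `coordinate_change_functions` are all hypothetical. It makes visible that $b(t+1) - b(t) = E_0$, so $b$ and $c$ are not periodic when the field has nonzero mean.

```python
import math

def integrate(f, t, n=400):
    """Trapezoid approximation of the integral of f over [0, t]."""
    if t == 0.0:
        return 0.0
    h = t / n
    inner = sum(f(k * h) for k in range(1, n))
    return h * (0.5 * (f(0.0) + f(t)) + inner)

def coordinate_change_functions(E):
    """Return b, a, c as defined in Lemma 4.2 for a scalar field E (the case nu = 1)."""
    def b(t):
        return integrate(E, t)
    def a(t):
        return integrate(lambda r: b(r) ** 2, t)
    def c(t):
        return 2.0 * integrate(b, t)
    return b, a, c

# Hypothetical sample field with period 1 and mean E0.
E0, mu = 1.0, 0.5
E = lambda t: E0 + mu * math.sin(2.0 * math.pi * t)
b, a, c = coordinate_change_functions(E)

# b gains E0 over each period, so b (and hence c) is not periodic when E0 != 0.
drift = b(1.3) - b(0.3)

if __name__ == "__main__":
    print("b(1.3) - b(0.3) =", drift)
    print("c(1) =", c(1.0))
```

For this sample field the closed form is $c(1) = E_0 + \mu/\pi$, which the quadrature reproduces to within its discretization error.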
Using this coordinate change, we get the following generalization of the Avron-Herbst formula.

Proposition 4.3 (Avron-Herbst). Assume $E \in L^1_{\mathrm{loc}}(\mathbb{R}; X)$. Then the evolutions
\[ U_0(t,s) = T(t)\exp\bigl(i(s-t)p^2\bigr)T^*(s) \quad\text{and}\quad U(t,s) = T(t)\,V_c(t,s)\,T^*(s) \]
are solutions to the time-dependent Schrödinger equation for $H_0(t)$ and $H(t)$, respectively (with $\mathcal{H}_1 = \mathcal{D}(p^2)\cap\mathcal{D}(\langle x\rangle)$).
Proof. Let $\mathcal{H}_1 = \mathcal{D}(p^2)\cap\mathcal{D}(\langle x\rangle)$. By Lemma 4.1, we find $V_c(t,s)\mathcal{H}_1 \subset \mathcal{D}(p^2)$ and $V_c(\cdot,s), V_c(s,\cdot) \in L^\infty_{\mathrm{loc}}(\mathbb{R}; \mathcal{B}(\mathcal{D}(p^2)))$. Compute for $\psi \in \mathcal{H}_1$
\[ V_c^*(t,s)\,\langle x\rangle\,V_c(t,s)\psi = \langle x\rangle\psi + \int_s^t V_c^*(r,s)\bigl(\nabla\langle x\rangle\cdot p + p\cdot\nabla\langle x\rangle\bigr)V_c(r,s)\psi\,dr, \tag{4.1} \]
which shows that $V_c(\cdot,s), V_c(s,\cdot) \in \mathcal{U}_{1,2}$. The result now follows from Lemma 4.2, Proposition B.3(iii), and a simple computation. (See the computation for the free evolution in [CFKS, Theorem 7.5].)

We remark that one can prove existence of wave operators using the Avron-Herbst formula in conjunction with Cook's method and a simple stationary phase argument. For proofs of existence of long-range wave operators using this transformation, see [G2] and [JY].

If $E$ in $L^1_{\mathrm{loc}}(\mathbb{R}; X)$ is also periodic, we get, due to Proposition 4.3 and the discussion in Section 3, Floquet Hamiltonians $\hat H_0$ and $\hat H$ on $\hat{\mathcal{H}} = L^2([0,1]; \mathcal{H})$. Note that due to periodicity we find (as discussed in Section 3) that $U_0(t,s)$ and $U(t,s)$ satisfy (3.2).

We now introduce another change of variable which is time-periodic. It was inspired by a discussion the author had with I. Herbst during a stay at the University of Virginia. Let
\[ E_0 = \int_0^1 E(t)\,dt. \]
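Lemma 4.4. Let $E \in L^1_{\mathrm{loc}}(\mathbb{R}; X)$ satisfy $E(t+1) = E(t)$ a.e. and $E_0 \neq 0$. We define continuous periodic functions
\begin{align*}
\tilde b(t) &= \int_0^t \bigl(E(s) - E_0\bigr)\,ds - \tilde b_0, & \tilde b_0 &= \int_0^1\!\!\int_0^t \bigl(E(s) - E_0\bigr)\,ds\,dt, \\
\tilde c(t) &= 2\int_0^t \tilde b(s)\,ds - \tilde c_0, & \tilde c_0 &= \frac{E_0}{|E_0|^2}\int_0^1 \Bigl[\tilde b(t)^2 + 2\int_0^t E_0\cdot\tilde b(s)\,ds\Bigr]dt, \\
\tilde a(t) &= \int_0^t \bigl(\tilde b(s)^2 + E_0\cdot\tilde c(s)\bigr)\,ds,
\end{align*}
and a strongly continuous periodic family of unitary operators on $\mathcal{H}$ by
\[ \tilde T(t) = e^{i\tilde a(t)}\exp\bigl(i\tilde b(t)\cdot x\bigr)\exp\bigl(-i\tilde c(t)\cdot p\bigr). \]
We have $\tilde T \in \mathcal{U}_{1,2}$ with $\mathcal{H}_1 = \mathcal{D}(p^2)\cap\mathcal{D}(\langle x\rangle)$.

This change of coordinates can be used to move the oscillating part of the electric field into the potential. We write
\[ H_0^{\mathrm{const}} = p^2 - E_0\cdot x \quad\text{and}\quad H^{\mathrm{const}}(t) = H_0^{\mathrm{const}} + W_t^{\tilde c} \]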
and present another Avron-Herbst-type result. Note that $W_t^{\tilde c}$ also satisfies Assumption 2.2; therefore, the time-dependent Schrödinger equation for $H^{\mathrm{const}}(t)$ has a solution $\tilde U(t,s)$ by Proposition 4.3.

Proposition 4.5. Let $E \in L^1_{\mathrm{loc}}(\mathbb{R}; X)$ satisfy $E(t+1) = E(t)$ a.e. and $E_0 \neq 0$. Then
\[ U_0(t,s) = \tilde T(t)\exp\bigl(i(s-t)H_0^{\mathrm{const}}\bigr)\tilde T^*(s) \quad\text{and}\quad U(t,s) = \tilde T(t)\,\tilde U(t,s)\,\tilde T^*(s). \]
Furthermore, we have
\[ \hat H_0 = \tilde T\hat H_0^{\mathrm{const}}\tilde T^* \quad\text{and}\quad \hat H = \tilde T\hat H^{\mathrm{const}}\tilde T^*, \]
where $[\tilde T\psi](t) = \tilde T(t)\psi(t)$.

Proof. The proof of the first part follows from Proposition B.3(iii), Lemma 4.4, Proposition 4.3, and Theorem B.4, noting that the transformations at the end of Lemma 4.2 apply here as well. The last part follows from Lemma 3.2.

This result is used to reduce the proof of Theorem 2.4 to the case where the electric field is constant. Notice that $\tilde T(0) = \exp(-i\tilde b_0\cdot x)\exp(i\tilde c_0\cdot p)$ is not necessarily the identity operator. The constants $\tilde b_0$ and $\tilde c_0$ are chosen such that the family $\tilde T(t)$ becomes time-periodic.

We introduce some notation. We write $F(s < R)$ for the characteristic function of the set $(-\infty, R)$. We denote by $\chi(s < R)$ a smooth version of $F(s < R)$. More precisely, $\chi : \mathbb{R} \to [0,1]$ is smooth, $\chi(s < R) = 0$ for $s > 2R$, and $\chi(s < R) = 1$ for $s < R$. Furthermore, we write $F(s \ge R) = 1 - F(s < R)$ and $\chi(s > R) = 1 - \chi(s < R)$.

Lemma 4.6. Suppose $E \in L^1_{\mathrm{loc}}(\mathbb{R}; X)$ and $E(t+1) = E(t)$ a.e. Then the operator $F(|x| < R)(\hat H_0 - \lambda)^{-1}$ is compact for any $R > 0$ and $\lambda \in \mathbb{C}$ with $\operatorname{Im}\lambda \neq 0$.

Proof. The idea is to use the resolvent formula of Proposition 3.1, the Avron-Herbst formula, and the known integral kernel for the free propagator to approximate the operator
\[ \chi(|x| < R)\,(\hat H_0 - \lambda)^{-1}\,\chi(|x| < R) \]
by Hilbert-Schmidt operators. This implies the result by the first resolvent formula. It is sufficient to consider $\operatorname{Im}\lambda < 0$. For the first term in the formula for the resolvent given in Proposition 3.1, we get
\begin{align*}
& i\chi(|x| < R)\,U_0(t,0)\int_0^t e^{i(s-t)\lambda}\,U_0(s,0)^*\,\chi(|x| < R)f(s)\,ds \\
&\quad = iT(t)\,\chi(|x + c(t)| < R)\int_0^t e^{i(s-t)\lambda}\exp\bigl(i(s-t)p^2\bigr)\chi(|x + c(s)| < R)\,T^*(s)f(s)\,ds \\
&\quad = iT(t)\int_0^1\!\!\int_X F\bigl(s \in [0, t-\delta]\bigr)\,\chi(|x + c(t)| < R)\,\bigl(4\pi(t-s)\bigr)^{-\nu/2} \\
&\qquad\qquad \times e^{i(s-t)\lambda + i|x-y|^2/4(t-s)}\,\chi(|y + c(s)| < R)\,[T^*f](s; y)\,ds\,dy + O_\delta(t).
\end{align*}
Here, $O_\delta$ denotes a function in $\hat{\mathcal{H}}$ satisfying $\|O_\delta\| \le \sqrt{\delta}\,e^{-\operatorname{Im}(\lambda)\delta}\|f\|$, and $T^*$ is the adjoint of the operator $T$ given by $[Tf](t) = T(t)f(t)$. Since $T$ is unitary and $\delta$ is arbitrary, we conclude compactness for the first of the two terms.

For the second term, we use the Neumann series to write
\[ \bigl(e^{-i\lambda}U_0(1,0)^* - I\bigr)^{-1} = -\sum_{n=0}^\infty \bigl(e^{-i\lambda}U_0(1,0)^*\bigr)^n, \]
which is convergent by the choice of $\lambda$. We can use this together with (3.1) and (3.2) to write the second term in the resolvent formula as
\[ -iF(|x| < R)\sum_{n=0}^{N(\delta)}\int_0^1 e^{i(t-s-n)\lambda}\,U_0(t+n, s)\,F(|x| < R)f(s)\,ds + O_\delta(t). \]
To treat this term, we cut out a $\delta$ piece as for the first term, we apply the Avron-Herbst formula, and we use the integral kernel for the free propagator to rewrite the remaining part. As before, we find that each term in the sum above is compact. This concludes the proof.

If $E$ is in $L^2_{\mathrm{loc}}(\mathbb{R}; X)$ and periodic, we find by (4.1), Lemma 4.1, and Proposition 4.3 that the operators $H_0(t)$ and $H(t)$ satisfy Assumption 3.4 with $B = p^2 + \langle x\rangle$. Thus, Proposition 3.5 implies that the operators $\hat H_0$ and $\hat H$ are essentially selfadjoint on the domain
\[ \mathcal{D}_0 = \Bigl\{\psi \in \mathcal{D}_I : \psi([0,1]) \subset \mathcal{D}(\langle x\rangle)\cap\mathcal{D}(p^2) \text{ and } \sup_{0\le t\le 1}\bigl\|(p^2 + \langle x\rangle)\psi(t)\bigr\| < \infty\Bigr\}. \]
Let $V_{\mathrm{reg}}$, $V_{\mathrm{sing}}$, and $V$ denote the multiplication operators defined by
\[ [V_{\mathrm{reg}}\psi](t) = V^t_{\mathrm{reg}}\psi(t), \qquad [V_{\mathrm{sing}}\psi](t) = V^t_{\mathrm{sing}}\psi(t), \qquad\text{and}\qquad [V\psi](t) = V_t\psi(t), \]
for $\psi \in \hat{\mathcal{H}}$. Note that $V = V_{\mathrm{reg}} + V_{\mathrm{sing}}$. As operators on $\mathcal{D}_0$, we have $\hat H_0 = \tau + p^2 - E(t)\cdot x$ and $\hat H = \hat H_0 + V$.

Lemma 4.7. Suppose $E \in L^2_{\mathrm{loc}}(\mathbb{R}; X)$ and $E(t+1) = E(t)$ a.e. Then the operator $V_{\mathrm{sing}}$ is $\hat H_0$-bounded with relative bound zero.
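Proof. The proof is very similar to a proof given by Yajima in [Ya1, Lemma 3.1] and we, therefore, skip some of the details. Notice that by assumption $\mathcal{D}(p^2) \subset \mathcal{D}(V^t_{\mathrm{sing}})$ for all $t$ and, hence, $\mathcal{D}_0 \subset \mathcal{D}(V_{\mathrm{sing}})$. We compute for $\psi \in \mathcal{D}_0$ and $\operatorname{Im}\lambda > 0$ (a.e. with respect to $t$):
\begin{align*}
\bigl[V_{\mathrm{sing}}(\hat H_0 - \lambda)^{-1}V_{\mathrm{sing}}\psi\bigr](t) &= i\int_0^\infty e^{is\lambda}\bigl[V_{\mathrm{sing}}\hat U_0(s)V_{\mathrm{sing}}\psi\bigr](t)\,ds \\
&= i\int_{-\infty}^t e^{i(t-s)\lambda}\,V^t_{\mathrm{sing}}U_0(t,s)V^s_{\mathrm{sing}}\,\psi\bigl(s - [s]\bigr)\,ds \tag{4.2} \\
&= I_1(t) + I_2(t),
\end{align*}
where
\begin{align*}
I_1(t) &= i\sum_{n=-\infty}^{-2}\int_0^1 e^{i(t-s-n)\lambda}\,V^t_{\mathrm{sing}}U_0(t, s+n)V^s_{\mathrm{sing}}\,\psi(s)\,ds, \\
I_2(t) &= i\int_{-1}^t e^{i(t-s)\lambda}\,V^t_{\mathrm{sing}}U_0(t,s)V^s_{\mathrm{sing}}\,\psi\bigl(s - [s]\bigr)\,ds.
\end{align*}
At this point, we use the Avron-Herbst formula to make the substitution $U_0(t,s) = T(t)\exp(i(s-t)p^2)T^*(s)$ and rewrite
\begin{align*}
I_1(t) &= iT(t)\sum_{n=-\infty}^{-2}\int_0^1 e^{i(t-s-n)\lambda}\,V^{0,t}_{\mathrm{sing}}\exp\bigl(i(s+n-t)p^2\bigr)V^{n,s}_{\mathrm{sing}}\,T^*(s)\psi(s)\,ds, \\
I_2(t) &= iT(t)\int_{-1}^t e^{i(t-s)\lambda}\,V^{0,t}_{\mathrm{sing}}\exp\bigl(i(s-t)p^2\bigr)V^{0,s}_{\mathrm{sing}}\,T^*(s)\psi(s)\,ds,
\end{align*}
where $V^{k,t}_{\mathrm{sing}}(x) = V^t_{\mathrm{sing}}(x - c(t+k))$ for $k \le 0$. Note that $\|V^{k,t}_{\mathrm{sing}}\|_{\tilde p} = \|V^t_{\mathrm{sing}}\|_{\tilde p} < \infty$ for all $2 \le \tilde p \le p$ and $k \le 0$. One can now proceed exactly as in [Ya1] and use a result by Kato, namely,
\[ \bigl\|f\exp(-itp^2)\,g\,u\bigr\| \le \bigl(4\pi|t|\bigr)^{-\nu/\tilde p}\,\|f\|_{\tilde p}\,\|g\|_{\tilde p}\,\|u\| \]
for $f, g \in L^{\tilde p}(X)$, $\tilde p \ge 2$, and $u \in \mathcal{D}(g)$. The unitary transformation $T(t)$ disappears after an application of this inequality to $I_1$ and $I_2$. The problem is now exactly the same as the one considered in [Ya1] and we omit the remaining steps. We, thus, obtain $I_1, I_2 \in \hat{\mathcal{H}}$, and
\[ \|I_1\|^2 + \|I_2\|^2 \le C(\operatorname{Im}\lambda)\,\|f\|^2, \]
where $C(r) \to 0$ as $|r| \to \infty$. The same procedure applies to the case $\operatorname{Im}\lambda < 0$. The result now follows from (4.2), Proposition 3.5, and the first resolvent formula.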
We are now in a position where we can begin the spectral analysis of the Floquet Hamiltonian.

5. Absence of bound states. In this section, we prove that the monodromy operator $U(1,0)$ has no bound states (for nonsingular potentials). This is equivalent to proving absence of bound states for $\hat H$ as noted by Yajima in [Ya1, Section 4]; see Proposition 3.3. We restrict attention to constant electric fields $E(t) = E_0$, in view of Proposition 4.5, and we use the notation $p_0 = \omega_0\cdot p$, where $\omega_0 = E_0/|E_0|$. We work under the assumption that $V_t$ satisfies Assumption 2.2. For $\theta > 0$, we write $\chi_\theta(r) = \chi(r/\theta < 1)$ and define
\[ G_\theta(r) = \int_0^r \chi_\theta(s)\,ds. \]
This function will be used to regularize an exponential weight. The fact that $\langle x\rangle^{-1/2}p$ is relatively bounded with respect to the Hamiltonian $H(t)$ is used in both [Si2] and [HMS1] to prove absence of bound states in the case of constant fields. This is, however, not true for the Floquet Hamiltonian. We introduce the following operator family in order to circumvent the technical problems arising from this:
\[ P_R = i\Bigl(\frac{p_0}{R} + i\Bigr)^{-1}, \qquad R > 0, \]
which satisfies
\[ \operatorname{s-lim}_{R\to\infty} P_R = I \quad\text{and}\quad \operatorname{s-lim}_{R\to\infty} \frac{p_0}{R}P_R = 0. \tag{5.1} \]
Pick $d > 0$ such that
\[ \sup_{0\le t\le T}\bigl|\nabla V^t_{\mathrm{reg}}(x)\bigr| < \frac{|E_0|}{2} \quad\text{and}\quad x \notin \bigcup_{0\le t\le T}\operatorname{supp} V^t_{\mathrm{sing}} \qquad\text{for } \omega_0\cdot x > d. \]
We write in the following, unless otherwise noted, $\chi_\theta = \chi_\theta(\omega_0\cdot x - d)$ and $G_\theta = G_\theta(\omega_0\cdot x - d)$, and we introduce
\[ A_\theta = \tfrac12\bigl(\chi_\theta p_0 + p_0\chi_\theta\bigr). \]
We write $\hat H_1 = \hat H_0 + V_{\mathrm{reg}}$ and compute the following commutators (as forms on $\mathcal{D}_0$):
\[ i\bigl[\hat H_1, P_R\bigr] = \frac{i}{R}\,P_R\,\omega_0\cdot\bigl(E_0 - \nabla V_{\mathrm{reg}}\bigr)P_R \tag{5.2} \]
and
\[ i\bigl[\hat H_1, e^{\alpha G_\theta}\bigr] = \bigl(2\alpha A_\theta + i\alpha^2\chi_\theta^2\bigr)e^{\alpha G_\theta}. \tag{5.3} \]
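For readers who wish to check (5.3), here is a short formal verification. It is a sketch that assumes only the commutation relation $[p_0, f] = -if'$ for functions $f$ of $\omega_0\cdot x - d$ (the prime denoting the derivative in that variable), the identity $G_\theta' = \chi_\theta$, and the fact that $\tau$, $E_0\cdot x$, and $V_{\mathrm{reg}}$ commute with $e^{\alpha G_\theta}$; domain questions are ignored.

```latex
% i[p^2, f] = p\cdot\nabla f + \nabla f\cdot p with f = e^{\alpha G_\theta},
% and \nabla f = \alpha\,\chi_\theta\,e^{\alpha G_\theta}\,\omega_0.
\begin{align*}
i\bigl[\hat H_1, e^{\alpha G_\theta}\bigr]
 &= i\bigl[p^2, e^{\alpha G_\theta}\bigr]
  = \alpha\bigl(p_0\,\chi_\theta\,e^{\alpha G_\theta}
      + \chi_\theta\,e^{\alpha G_\theta}\,p_0\bigr) \\
 &= \alpha\bigl(p_0\chi_\theta + \chi_\theta p_0\bigr)e^{\alpha G_\theta}
    - \alpha\,\chi_\theta\bigl[p_0, e^{\alpha G_\theta}\bigr] \\
 &= 2\alpha A_\theta\,e^{\alpha G_\theta}
    + i\alpha^2\chi_\theta^2\,e^{\alpha G_\theta},
\end{align*}
% using [p_0, e^{\alpha G_\theta}] = -i\alpha\,\chi_\theta\,e^{\alpha G_\theta} in the last step.
```

The same relation $[p_0, f] = -if'$ also gives the operator identity $A_\theta = p_0\chi_\theta + \tfrac{i}{2}\chi_\theta'$.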
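Lemma 5.1. Suppose $V_t$ satisfies Assumption 2.2. Let $\psi$ be an eigenfunction for $\hat H$. For all $\alpha > 0$, we have $e^{\alpha\omega_0\cdot x}\psi \in \hat{\mathcal{H}}$.

Proof. The proof is inspired by Sigal's ideas in [Si2]. Assume there exists $\alpha > 0$ such that $e^{\alpha\omega_0\cdot x}\psi \notin \hat{\mathcal{H}}$. For functions $\varphi \in \hat{\mathcal{H}}$, we abbreviate
\[ \varphi_{\theta,R} = \frac{e^{\alpha G_\theta}P_R\varphi}{\|e^{\alpha G_\theta}P_R\varphi\|} \quad\text{and}\quad \varphi_\theta = \frac{e^{\alpha G_\theta}\varphi}{\|e^{\alpha G_\theta}\varphi\|}. \]
For expectation values, we write $\langle A\rangle_\varphi = \langle\varphi, A\varphi\rangle$. Using the choice of $d$, we estimate for $\varphi \in \mathcal{D}_0$
\[ \bigl\langle i[\hat H_1, p_0]\bigr\rangle_{\varphi_{\theta,R}} \ge \frac{|E_0|}{2} - \|\nabla V_{\mathrm{reg}}\|_\infty\frac{\|\varphi\|^2}{\|e^{\alpha G_\theta}P_R\varphi\|^2}. \tag{5.4} \]
On the other hand, we can use (5.3) to compute
\begin{align*}
\bigl\langle i[\hat H_1, p_0]\bigr\rangle_{\varphi_{\theta,R}} &= -4\alpha\operatorname{Re}\langle A_\theta p_0\rangle_{\varphi_{\theta,R}} + \alpha^2\bigl\langle i[\chi_\theta^2, p_0]\bigr\rangle_{\varphi_{\theta,R}} \\
&\quad - 2\|e^{\alpha G_\theta}P_R\varphi\|^{-1}\operatorname{Re}\bigl\langle ie^{\alpha G_\theta}\bigl([\hat H_1, P_R] + P_R(\hat H - V_{\mathrm{sing}})\bigr)\varphi,\ p_0\varphi_{\theta,R}\bigr\rangle.
\end{align*}
To estimate the first term, we notice that by construction
\[ A_\theta = p_0\chi_\theta + \frac{i}{2}\chi_\theta', \]
which in turn yields
\[ 2\operatorname{Re}(A_\theta p_0) = 2p_0\chi_\theta p_0 - \frac12\chi_\theta''. \]
This gives the identity
\begin{align*}
\bigl\langle i[\hat H_1, p_0]\bigr\rangle_{\varphi_{\theta,R}} &= -4\alpha\langle p_0\chi_\theta p_0\rangle_{\varphi_{\theta,R}} + \bigl\langle\alpha\chi_\theta'' - 2\alpha^2\chi_\theta\chi_\theta'\bigr\rangle_{\varphi_{\theta,R}} \\
&\quad - 2\|e^{\alpha G_\theta}P_R\varphi\|^{-1}\operatorname{Re}\bigl\langle ie^{\alpha G_\theta}\bigl([\hat H_1, P_R] + P_R(\hat H - V_{\mathrm{sing}})\bigr)\varphi,\ p_0\varphi_{\theta,R}\bigr\rangle. \tag{5.5}
\end{align*}
We rewrite the term containing $V_{\mathrm{sing}}$ using $V_{\mathrm{sing}} = \chi_\theta V_{\mathrm{sing}}$:
\[ \bigl\langle e^{\alpha G_\theta}P_RV_{\mathrm{sing}}\varphi,\ p_0\varphi_{\theta,R}\bigr\rangle = \bigl\langle e^{\alpha G_\theta}P_RV_{\mathrm{sing}}\varphi,\ \chi_\theta p_0\varphi_{\theta,R}\bigr\rangle + \frac1R\bigl\langle B_\theta(R)V_{\mathrm{sing}}\varphi,\ p_0\varphi_{\theta,R}\bigr\rangle, \]
where $B_\theta(R)$ denote bounded operators satisfying $\sup_{R\ge1}\|B_\theta(R)\| < \infty$. We estimate the first term using the Cauchy-Schwarz inequality and $\chi_\theta^2 \le \chi_\theta$:
\[ \bigl|\bigl\langle e^{\alpha G_\theta}P_RV_{\mathrm{sing}}\varphi,\ p_0\varphi_{\theta,R}\bigr\rangle\bigr| \le \frac{\|e^{\alpha G_\theta}P_RV_{\mathrm{sing}}\varphi\|^2}{2\alpha\|e^{\alpha G_\theta}P_R\varphi\|} + 2\alpha\|e^{\alpha G_\theta}P_R\varphi\|\,\langle p_0\chi_\theta p_0\rangle_{\varphi_{\theta,R}} + \frac1R\bigl|\bigl\langle B_\theta(R)V_{\mathrm{sing}}\varphi,\ p_0\varphi_{\theta,R}\bigr\rangle\bigr|. \]
By this inequality, (5.2), and Lemma 4.7, we can estimate (5.5):
\begin{align*}
\bigl\langle i[\hat H_1, p_0]\bigr\rangle_{\varphi_{\theta,R}} &\le \bigl\langle\alpha\chi_\theta'' - 2\alpha^2\chi_\theta\chi_\theta'\bigr\rangle_{\varphi_{\theta,R}} - 2\|e^{\alpha G_\theta}P_R\varphi\|^{-1}\operatorname{Re}\bigl\langle ie^{\alpha G_\theta}P_R\hat H\varphi,\ p_0\varphi_{\theta,R}\bigr\rangle \\
&\quad + \frac{\|e^{\alpha G_\theta}P_RV_{\mathrm{sing}}\varphi\|^2}{\alpha\|e^{\alpha G_\theta}P_R\varphi\|^2} + \frac1R\bigl|\bigl\langle B^2_\theta(R)(\hat H_0 + i)\varphi,\ p_0\varphi_{\theta,R}\bigr\rangle\bigr|.
\end{align*}
First, we combine this estimate with (5.4) and take the limit $\varphi \to \psi$ in the graph norm of $\hat H$. This is possible due to the fact that $\mathcal{D}_0$ is a core for $\hat H$, and by assumption, the limit of $\operatorname{Re}\{\langle ie^{\alpha G_\theta}P_R\hat H\varphi, p_0\varphi_{\theta,R}\rangle\}$ is equal to zero. Secondly, we use both limits in (5.1) to take $R$ to infinity. In this way, we obtain the inequality
\[ \frac{|E_0|}{2} \le \bigl\langle\alpha\chi_\theta'' - 2\alpha^2\chi_\theta\chi_\theta'\bigr\rangle_{\psi_\theta} + \frac{\|\nabla V_{\mathrm{reg}}\|_\infty\|\psi\|^2 + (1/\alpha)\|V_{\mathrm{sing}}\psi\|^2}{\|e^{\alpha G_\theta}\psi\|^2}. \]
We have thus obtained a contradiction since $|\chi_\theta^{(k)}| = O(\theta^{-k})$ and $\|e^{\alpha G_\theta}\psi\| \to \infty$ as $\theta \to \infty$ by assumption.

Lemma 5.2. Let $\alpha > 0$ and $R > 2\alpha$. The following hold.
(i) For any $\varphi \in \hat{\mathcal{H}}$ we have $\|e^{\alpha G_\theta}P_R\varphi\| \le (1/(1-\alpha/R))\,\|e^{\alpha G_\theta}\varphi\|$.
(ii) Let $\psi \in \hat{\mathcal{H}}$ satisfy $e^{2\alpha\omega_0\cdot x}\psi \in \hat{\mathcal{H}}$. Then
\[ e^{\alpha\omega_0\cdot x}P_R\psi \in \hat{\mathcal{H}} \quad\text{and}\quad \lim_{\theta\to\infty} e^{\alpha G_\theta}P_R\psi = e^{\alpha(\omega_0\cdot x - d)}P_R\psi. \]

Proof. We first prove (i). Compute the commutator
\[ i\bigl[e^{\alpha G_\theta}, P_R\bigr] = \frac{-i\alpha}{R}\,P_R\,\chi_\theta\,e^{\alpha G_\theta}P_R. \tag{5.6} \]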
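This gives the estimate
\[ \|e^{\alpha G_\theta}P_R\varphi\| \le \|e^{\alpha G_\theta}\varphi\| + \frac{\alpha}{R}\|e^{\alpha G_\theta}P_R\varphi\| \]
for any $\varphi \in \hat{\mathcal{H}}$. Rearrangement gives (i).

In order to prove (ii), we first show that the family $\{\psi_\theta\}_{\theta>0} = \{e^{\alpha G_\theta}P_R\psi\}_{\theta>0}$ is Cauchy. Secondly, we verify that its limit is as expected. We use (5.6) again to estimate
\[ \|\psi_{\theta_1} - \psi_{\theta_2}\| \le \bigl\|(e^{\alpha G_{\theta_1}} - e^{\alpha G_{\theta_2}})\psi\bigr\| + \frac{\alpha}{R}\Bigl(\|\psi_{\theta_1} - \psi_{\theta_2}\| + \bigl\|(\chi_{\theta_1} - \chi_{\theta_2})\psi_{\theta_1}\bigr\|\Bigr). \]
We rewrite this to obtain
\[ \|\psi_{\theta_1} - \psi_{\theta_2}\| \le \frac{1}{1-\alpha/R}\Bigl(\bigl\|(e^{\alpha G_{\theta_1}} - e^{\alpha G_{\theta_2}})\psi\bigr\| + \frac{\alpha}{R}\bigl\|(\chi_{\theta_1} - \chi_{\theta_2})\psi_{\theta_1}\bigr\|\Bigr). \]
We can estimate the second term using the Cauchy-Schwarz inequality and the proof of (i) with $G_\theta$ replaced by $2G_{\theta_1}$,
\[ \bigl\|(\chi_{\theta_1} - \chi_{\theta_2})\psi_{\theta_1}\bigr\|^2 \le \frac{1}{1-2\alpha/R}\,\bigl\|(\chi_{\theta_1} - \chi_{\theta_2})^2P_R\psi\bigr\|\,\bigl\|e^{2\alpha G_{\theta_1}}\psi\bigr\|. \]
The fact that the sets $\operatorname{supp}(\chi_{\theta_1} - \chi_{\theta_2})$ and $\operatorname{supp}(e^{\alpha G_{\theta_1}} - e^{\alpha G_{\theta_2}})$ are contained in $\{x : \omega_0\cdot x \ge \min\{\theta_1, \theta_2\} + d\}$, together with the two bounds
\[ |\chi_{\theta_1} - \chi_{\theta_2}| \le 2 \quad\text{and}\quad \bigl|e^{\alpha G_{\theta_1}} - e^{\alpha G_{\theta_2}}\bigr|\,e^{-\alpha\omega_0\cdot x} \le 2e^{\alpha d}, \]
shows, in conjunction with the choice of $\psi$, that $\{\psi_\theta\}_{\theta>0}$ is Cauchy. The limit $\lim_{\theta\to\infty}\psi_\theta$ thus exists. The Lebesgue theorem on monotone convergence yields $e^{\alpha\omega_0\cdot x}P_R\psi \in \hat{\mathcal{H}}$, and a simple argument completes the proof of (ii).

We now apply Lemma 5.1 using an idea from [HMS1] to obtain absence of bound states.

Proposition 5.3. Let $\psi$ be an eigenfunction for $\hat H$. Then $\psi$ vanishes on the set $\{x \in X : \omega_0\cdot x > d\}$. If $V_{\mathrm{sing}} = 0$, then $\sigma_{pp}(\hat H) = \emptyset$.

Proof. Let $\lambda$ be an eigenvalue for $\hat H$ with corresponding eigenfunction $\psi$. First we compute, as in [HMS1], for $\phi \in P_R\mathcal{D}_0$ using (5.3):
\begin{align*}
\bigl\|e^{\alpha G_\theta}(\hat H_0 - \lambda)\phi\bigr\|^2 &= \bigl\|2\alpha A_\theta e^{\alpha G_\theta}\phi\bigr\|^2 + \bigl\|(\hat H_0 - \lambda + \alpha^2\chi_\theta^2)e^{\alpha G_\theta}\phi\bigr\|^2 \\
&\quad + 2\alpha\bigl\langle i[p^2 - E_0\cdot x - \alpha^2\chi_\theta^2, A_\theta]\bigr\rangle_{e^{\alpha G_\theta}\phi} \tag{5.7} \\
&\ge 2\alpha|E_0|\langle\chi_\theta\rangle_{e^{\alpha G_\theta}\phi} + \bigl\langle 4\alpha^3\chi_\theta^2\chi_\theta' - \alpha\chi_\theta^{(3)}\bigr\rangle_{e^{\alpha G_\theta}\phi} + 2\alpha\langle p_0\chi_\theta'p_0\rangle_{e^{\alpha G_\theta}\phi}.
\end{align*}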
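On the other hand, substitute $\phi = P_R\varphi$, $\varphi \in \mathcal{D}_0$, and compute
\[ \bigl\|e^{\alpha G_\theta}(\hat H_0 - \lambda)P_R\varphi\bigr\|^2 = \bigl\|e^{\alpha G_\theta}\bigl([\hat H_1, P_R] - V_{\mathrm{reg}}P_R + P_R(\hat H - \lambda - V_{\mathrm{sing}})\bigr)\varphi\bigr\|^2. \]
We combine this with (5.7), take the limit $\varphi \to \psi$ in the graph norm of $\hat H$, and use (5.2), the choice of $d$, and Lemma 5.2(i):
\begin{align*}
& 2\Bigl(\frac{\|E_0 - \nabla V_{\mathrm{reg}}\|_\infty}{R - \alpha} + \|V_{\mathrm{reg}}\|_\infty\Bigr)^2\bigl\|e^{\alpha G_\theta}P_R\psi\bigr\|^2 + 2\Bigl(\frac{1}{1-\alpha/R}\Bigr)^2\|V_{\mathrm{sing}}\psi\|^2 \\
&\quad \ge 2\alpha|E_0|\langle\chi_\theta\rangle_{e^{\alpha G_\theta}P_R\psi} + \bigl\langle 4\alpha^3\chi_\theta^2\chi_\theta' - \alpha\chi_\theta^{(3)}\bigr\rangle_{e^{\alpha G_\theta}P_R\psi} + 2\alpha\langle p_0\chi_\theta'p_0\rangle_{e^{\alpha G_\theta}P_R\psi}.
\end{align*}
We apply Lemma 5.2(ii) and the fact that $|\chi_\theta^{(k)}| = O(\theta^{-k})$ and take the limit $\theta \to \infty$ on both sides to obtain
\[ \Bigl(\frac{\|E_0 - \nabla V_{\mathrm{reg}}\|_\infty}{R - \alpha} + \|V_{\mathrm{reg}}\|_\infty\Bigr)^2\bigl\|e^{\alpha\omega_0\cdot x}P_R\psi\bigr\|^2 + \Bigl(\frac{e^{\alpha d}}{1-\alpha/R}\Bigr)^2\|V_{\mathrm{sing}}\psi\|^2 \ge \alpha|E_0|\,\bigl\|e^{\alpha\omega_0\cdot x}P_R\psi\bigr\|^2. \tag{5.8} \]
Here, we used the fact that $p_0e^{\alpha G_\theta}P_R\psi$ is bounded uniformly with respect to $\theta$, as can be seen from (5.6) and Lemmas 5.1 and 5.2(i). We estimate using Lemmas 5.1 and 5.2(i) and (ii):
\[ \bigl\|e^{\alpha\omega_0\cdot x}(I - P_R)\psi\bigr\| \le \bigl\|(I - P_R)e^{\alpha\omega_0\cdot x}\psi\bigr\| + \frac{\alpha}{R - \alpha}\bigl\|e^{\alpha\omega_0\cdot x}\psi\bigr\|, \]
which by Lemma 5.1 and (5.1) implies
\[ e^{\alpha\omega_0\cdot x}P_R\psi \to e^{\alpha\omega_0\cdot x}\psi \quad\text{as } R \to \infty. \]
Taking $R$ to infinity in (5.8) thus gives for all $\alpha > 0$,
\[ \|V_{\mathrm{reg}}\|_\infty^2\bigl\|e^{\alpha\omega_0\cdot x}\psi\bigr\|^2 + e^{\alpha d}\|V_{\mathrm{sing}}\psi\|^2 \ge \alpha|E_0|\,\bigl\|e^{\alpha\omega_0\cdot x}\psi\bigr\|^2. \]
From this inequality, we conclude the statement of Proposition 5.3.

By Propositions 3.3 and 4.5, this proves that the monodromy operator has no pure point spectrum. If one could prove a unique continuation theorem for the differential operator $\tau + p^2$, see [ABG], one could conclude absence of bound states for singular potentials as well.

6. Mourre estimate. We write $\eta_\delta$ for any smooth function $\eta : \mathbb{R} \to [0,1]$ satisfying that $\eta = 0$ on the complement of $[-2\delta, 2\delta]$ and $\eta = 1$ on $[-\delta, \delta]$. We combine the result on absence of bound states with Proposition 4.5 to obtain the "squeezing rule" (cf. [HMS1, Proposition 3.7] for a corresponding result that played a crucial role in the treatment of the many-body constant field problem). In this section, we work under Assumption 2.1.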
Proposition 6.1 (Squeezing rule). Suppose $E$ satisfies Assumption 2.3. Let $\lambda \in \mathbb{R}$. Then we have for any $R > 0$,
\[ \lim_{\delta\to0}\bigl\|F(|x| < R)\,\eta_\delta(\hat H - \lambda)\bigr\| = 0. \]

Proof. Let $0 < \delta < 1/2$. Since, by Propositions 4.5 and 5.3, $\lambda \in \mathbb{R}$ is not an eigenvalue for $\hat H$, we see that $\operatorname{s-lim}_{\delta\to0}\eta_\delta(\hat H - \lambda) = 0$. This, combined with Lemma 4.6, gives
\[ F(|x| < R)\,\eta_\delta(\hat H - \lambda) = F(|x| < R)\,\eta_1(\hat H - \lambda)\,\eta_\delta(\hat H - \lambda) \to 0 \]
in norm as $\delta \to 0$.

Keeping Proposition 4.5 in mind, we assume in the remaining part of this section that $E(t) = E_0 \neq 0$. We write as in Section 5: $p_0 = \omega_0\cdot p$, $\omega_0 = E_0/|E_0|$, and compute as a form on $\mathcal{D}_0$,
\[ i[\hat H, p_0] = |E_0| - \omega_0\cdot\nabla V. \]
Since $\mathcal{D}_0 \subset \mathcal{D}(p_0)\cap\mathcal{D}(\hat H)$, we find by Proposition 3.5 that $\mathcal{D}(p_0)\cap\mathcal{D}(\hat H)$ is dense in $\mathcal{D}(\hat H)$. Clearly, $\exp(isp_0)\mathcal{D}_0 \subset \mathcal{D}_0$ and $\|\hat H\exp(isp_0)\psi\| \le C(\|\hat H\psi\| + \|\psi\|)$, uniformly in $|s| \le 1$ for $\psi \in \mathcal{D}_0$, which implies the same estimate for $\psi \in \mathcal{D}(\hat H)$. These remarks, together with [Mo, Proposition II.1] (applied to the set $\mathcal{D}_0$), show that the form $i[\hat H, p_0]$ is represented on $\mathcal{D}(\hat H)$ by the operator on the right-hand side. This argument, Assumption 2.1, and the squeezing rule imply the following proposition.

Proposition 6.2 (Mourre estimate). Let $\lambda \in \mathbb{R}$. Then for all $e < |E_0|$ there exists $\delta > 0$ such that
\[ \eta_\delta(\hat H - \lambda)\,i[\hat H, p_0]\,\eta_\delta(\hat H - \lambda) \ge e\,\eta_\delta^2(\hat H - \lambda). \]

From the abstract theory of Mourre [Mo], we get the limiting absorption principle, which implies the following corollary. By Proposition 4.5, it holds for $E$ satisfying Assumption 2.3. (Note that we have verified above the technical assumptions used in [Mo].) In conjunction with Proposition 3.3, this completes the proof of Theorem 2.4.

Corollary 6.3. The spectrum of $\hat H$ is purely absolutely continuous.

In the time-independent case, one would now proceed in standard fashion to obtain an integral propagation estimate for $p_0$ from the limiting absorption principle (local smoothness). Since $\langle x_0\rangle^{-1/2}p_0$ is bounded relative to $p^2 - E_0\cdot x$, this implies an integral propagation estimate for $x_0$, which in turn yields an easy proof of asymptotic completeness. In the present case, however, $\langle x_0\rangle^{-1/2}p_0$ is not $\hat H_0$-bounded, and instead we choose to proceed via pointwise propagation estimates.
Corollary 6.4. There exist $\kappa > 0$ and $\rho > 0$ such that
\[ \Bigl\|F\Bigl(\frac{p_0}{s} < \kappa\Bigr)\exp(-is\hat H)f(\hat H)\langle p_0\rangle^{-1}\Bigr\| = O\bigl(\langle s\rangle^{-\rho}\bigr) \quad\text{as } s \to \pm\infty, \]
for any $f \in C_0^\infty(\mathbb{R})$.

Proof. Let $A(s) = p_0 - es$ for some $0 < e < |E_0|$. Its Heisenberg derivative is
\[ DA(s) = i[\hat H, A(s)] + \partial_sA(s) = i[\hat H, p_0] - e. \]
By the Mourre estimate, one can apply [Sk, Corollary 2.5] to obtain the result. Notice that, as in [HMS2, Appendix A], one has to remove the lower boundedness assumption on the Hamiltonian by making a slight modification of [Sk, Lemma 2.10].

Before we present the applications of this propagation estimate, we give a technical lemma, which is needed.

Lemma 6.5. Let $f \in C_0^\infty(\mathbb{R})$. Then $f(\hat H)\mathcal{D}_I \subset \mathcal{D}_I$, $f(\hat H)\mathcal{D}(p_0) \subset \mathcal{D}(p_0)$, and $f(\hat H)$ is, furthermore, bounded as an operator on these domains equipped with their respective graph norms.

Proof. Since $p_0$ is a conjugate operator to $\hat H$, we know from abstract theory (see, e.g., [Mø, Lemma 5.1(ii)], applied with $n = 0$) that the stated result holds for $p_0$. As for $\tau_I$, it is easy to see that $\exp(-is\tau_I)\mathcal{D}_0 \subset \mathcal{D}_0$ and $\|\hat H\exp(-is\tau_I)\psi\| \le C(\|\hat H\psi\| + \|\psi\|)$, uniformly in $|s| \le 1$ for $\psi \in \mathcal{D}_0$ and, hence, also for $\psi \in \mathcal{D}(\hat H)$ by Proposition 3.5. Furthermore, $\mathcal{D}_I\cap\mathcal{D}(\hat H)$ contains $\mathcal{D}_0$ and is, therefore, dense in $\mathcal{D}(\hat H)$. We compute
\[ i[\tau_I, \hat H] = V_t', \]
as a form on $\mathcal{D}_0$ and, hence, on $\mathcal{D}(\tau_I)\cap\mathcal{D}(\hat H)$ by [Mo, Proposition II.1]. The desired result now follows from [Mø] again.

In the following, we abbreviate $x_0 = \omega_0\cdot x$ and assume for simplicity that the $\rho$ appearing in Corollary 6.4 is smaller than $1/2$.

Proposition 6.6. There exists $\theta > 0$ such that
\[ \Bigl\|F\Bigl(\frac{|x_0|}{\langle s\rangle^2} < \theta\Bigr)\exp(-is\hat H)f(\hat H)\langle p_0\rangle^{-1}\langle\tau_I\rangle^{-1}\Bigr\| = O\bigl(\langle s\rangle^{-\rho}\bigr) \quad\text{as } s \to \pm\infty \]
for any $f \in C_0^\infty(\mathbb{R})$.

Proof. This proof is based on ideas used to solve an analogous problem in [HMS2, Appendix A] combined with the regularization procedure of Section 5.
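Let $\varphi \in \hat{\mathcal{H}}$, $f \in C_0^\infty(\mathbb{R})$, $\psi = f(\hat H)\langle p_0\rangle^{-1}\langle\tau_I\rangle^{-1}\varphi$, and $\psi_R = B_R\psi$, where
\[ B_R = i\Bigl(\frac{p^2 + \langle x\rangle}{R} + i\Bigr)^{-1} \quad\text{for } R > 0. \]
By Lemma 6.5, we find that
\[ \psi \in \mathcal{D}_I \quad\text{and}\quad \psi_R \in \mathcal{D}_0. \]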
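We abbreviate $\chi(s) = \chi(|x_0|/\langle s\rangle^2 < \theta)\exp(-is\hat H)\psi \in \mathcal{D}(p_0) \cap \mathcal{D}(\langle x\rangle)$ and $\chi_R(s) = \chi(|x_0|/\langle s\rangle^2 < \theta)\exp(-is\hat H)\psi_R \in \mathcal{D}_0$. (Note that we verified in the proof of Proposition 3.5 that $\exp(-is\hat H)\mathcal{D}_0 \subset \mathcal{D}_0$ and it is easy to check that $\exp(-is\hat H)\mathcal{D}(p_0) \subset \mathcal{D}(p_0)$.) We can now compute (with the convention that all constants $C > 0$ are independent of $R > 0$), writing $o(1)$ for errors converging to 0 as $R \to \infty$,
\[ \|p\chi_R(s)\|^2 = \big\langle (\hat H + E_0\cdot x - V - \tau_I)\chi_R(s), \chi_R(s)\big\rangle \le \theta C_1 \langle s\rangle^2 \|\chi_R(s)\|^2 + \|\tau_I\chi_R(s)\|\,\|\chi_R(s)\| + C_2\|\varphi\|^2 + o(1), \tag{6.1} \]
where the error term comes from the commutators $i[\hat H, B_R]$ and $i[p_0, B_R]$ (see (5.1)). Since $\|\tau_I\chi_R(s)\| \le C\langle s\rangle\|\varphi\|$, we get
\[ \|p\chi_R(s)\|^2 \le \theta C_3\langle s\rangle^2\|\chi_R(s)\|^2 + C_4\langle s\rangle\|\varphi\|^2 + o(1). \tag{6.2} \]
On the other hand, we can estimate
\begin{align*}
\|p_0\chi_R(s)\|^2 &\ge \delta^2\langle s\rangle^2 \Big\|\chi\Big(\frac{p_0^2}{\langle s\rangle^2} > \delta^2\Big)\chi_R(s)\Big\|^2 \\
&\ge \delta^2\langle s\rangle^2\|\chi_R(s)\|^2 - \delta^2\langle s\rangle^2 \Big\|\chi\Big(\frac{p_0^2}{\langle s\rangle^2} < \delta^2\Big)\chi_R(s)\Big\|\,\|\chi_R(s)\| \\
&\ge \tfrac{1}{2}\delta^2\langle s\rangle^2\|\chi_R(s)\|^2 - C_5\langle s\rangle^2 \Big\|\chi\Big(\frac{p_0^2}{\langle s\rangle^2} < \delta^2\Big)\chi_R(s)\Big\|^2 .
\end{align*}
Combining this estimate with (6.2) and taking the limit $R \to \infty$ give
\[ \big(\tfrac{1}{2}\delta^2 - \theta C_3\big)\langle s\rangle^2\|\chi(s)\|^2 \le C_5\langle s\rangle^2 \Big\|\chi\Big(\frac{p_0^2}{\langle s\rangle^2} < \delta^2\Big)\chi(s)\Big\|^2 + C_4\langle s\rangle\|\varphi\|^2 . \tag{6.3} \]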
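As in [HMS2, Appendix A], we have
\[ \Big[\chi\Big(\frac{p_0^2}{\langle s\rangle^2} < \delta^2\Big),\ \chi\Big(\frac{|x|}{\langle s\rangle^2} < \theta\Big)\Big] = O\big(\langle s\rangle^{-3}\big). \]
By choosing $0 < \delta < \kappa/4$, we can use Corollary 6.4 to estimate (6.3) further:
\[ \big(\tfrac{1}{2}\delta^2 - \theta C_3\big)\langle s\rangle^2\|\chi(s)\|^2 \le C_6\langle s\rangle^{2-2\rho}\|\varphi\|^2, \]
which yields the stated result by choosing $\theta < \delta^2/(2C_3)$.

By the estimate (6.1) and Proposition 6.6, we get (since $p_0B_R \to p_0$ strongly on the domain of $p_0$ as $R \to \infty$) the following corollary.

Corollary 6.7. Let $f \in C_0^\infty(\mathbb{R})$. We have the estimate
\[ \Big\| p_0\, \chi\Big(\frac{|x_0|}{\langle s\rangle^2} < \theta\Big) \exp(-is\hat H)\, f(\hat H)\, \langle p_0\rangle^{-1}\langle\tau_I\rangle^{-1} \Big\| = O\big(\langle s\rangle^{1-\rho}\big). \]

7. Asymptotic completeness. First, we prove completeness for the Floquet Hamiltonians.

Proposition 7.1. Assume $V_t$ is short range and $E(t)$ satisfies Assumption 2.3. Then the wave operators
\[ \hat W_\pm = \operatorname*{s-lim}_{s\to\pm\infty} \exp(is\hat H)\exp(-is\hat H_0) \]
exist and are unitary.

Proof. By Proposition 4.5, it is sufficient to prove the result for constant nonzero fields. We only prove the existence of
\[ \operatorname*{s-lim}_{s\to\pm\infty} \exp(is\hat H_0)\exp(-is\hat H) \]
since the existence of wave operators follows in the same way. By a standard argument this implies that the wave operators are unitary. Furthermore, we restrict attention to the limit at $+\infty$.

Let $\varphi \in \hat{\mathcal{H}}$, $f \in C_0^\infty(\mathbb{R})$, and $\psi = f(\hat H)\langle p_0\rangle^{-1}\langle\tau_I\rangle^{-1}\varphi$. For $\delta > 0$, we write $\chi_\delta(s)$ for $\chi(|x_0|/\langle s\rangle^2 > \delta)$ and compute, using Proposition 6.6,
\begin{align*}
\exp(is\hat H_0)\exp(-is\hat H)\psi &= \exp(is\hat H_0)\chi_{\theta/2}(s)\exp(-is\hat H)\psi + o(1) \\
&= \int_0^s \exp(ir\hat H_0)\Big( i[\hat H_0, \chi_{\theta/2}(r)] + \chi_{\theta/2}(r)V + \partial_r\chi_{\theta/2}(r)\Big)\exp(-ir\hat H)\psi\, dr.
\end{align*}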
Here (as a form on $\mathcal{D}_0$),
\[ i[\hat H_0, \chi_{\theta/2}(r)] = \langle r\rangle^{-2} b_1(r)\, p_0\, \chi_\theta(r) + O\big(\langle r\rangle^{-4}\big) \tag{7.1} \]
and $\partial_r\chi_{\theta/2}(r) = \langle r\rangle^{-1} b_2(r)\chi_\theta(r)$, where $b_1$ and $b_2$ denote bounded functions from $\mathbb{R}_+$ into $\mathbb{R}$. By [Mø, Lemma 2.2] applied with $H = p_0$, $A = \hat H_0$, $\tilde H = \chi_{\theta/2}(r)$, and = $\mathcal{D}_0$, we find that (7.1) holds as a form on $\mathcal{D}(\hat H) \cap \mathcal{D}(p_0)$ as well. Proposition 6.6 and Corollary 6.7 now yield the existence of the integral.

Proof of Theorem 2.5. We restrict attention to the case $s = 0$ (since $W_\pm(s) = U(s, 0)W_\pm(0)U_0(0, s)$). We know from the Avron-Herbst formula and a standard stationary phase argument that the physical wave operators
\[ W_\pm(0) = \operatorname*{s-lim}_{t\to\pm\infty} U(0, t)U_0(t, 0) \]
exist. As in [Ya1], one can compute $\hat W_\pm = U W_\pm(0) U_0^*$, which implies
\[ \operatorname{Range} \hat W_\pm = U L^2\big([0, 1]; \operatorname{Range} W_\pm(0)\big). \]
This shows by Proposition 7.1 that $W_\pm(0)$ are unitary. (See also [H2, Corollary 4.1].) The remaining statements follow from (3.1) and (3.2) and from the existence of wave operators.

Appendices

Appendix A. Absolutely continuous vector-valued functions and the derivative. In this appendix, we discuss absolutely continuous vector-valued functions as well as different realizations of the derivative on an interval. We restrict attention to separable Hilbert spaces since that is all we need in this paper and it makes the exposition simpler because the different notions of measurability coincide. See [RS] and [T] for some material on measurability and integration. We just mention that integrals of vector-valued and operator-valued functions are weak and strong, respectively.

In the case where $\mathcal{H}$ is finite-dimensional, being absolutely continuous is equivalent to being an indefinite integral; see, for example, [R]. In general, however, this is not so (although being an indefinite integral implies absolute continuity in the usual sense). In this section, we work with indefinite integrals since these functions form natural domains for the derivative.
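Definition A.1. The space of absolutely continuous functions on the real line is defined by
\[ AC(\mathbb{R}; \mathcal{H}) = \Big\{ f : \mathbb{R} \to \mathcal{H} : \exists\, g \in L^1_{\mathrm{loc}}(\mathbb{R}; \mathcal{H}) \text{ and } \psi \in \mathcal{H} \text{ s.t. } f(t) = \int_0^t g(s)\, ds + \psi \Big\}. \]
Note that $AC(\mathbb{R}; \mathcal{H}) \subset C^0(\mathbb{R}; \mathcal{H})$, the space of continuous $\mathcal{H}$-valued functions. It is easy to check that the map
\[ (g, \psi) \longmapsto \int_0^t g(s)\, ds + \psi \]
from $L^1_{\mathrm{loc}}(\mathbb{R}; \mathcal{H}) \times \mathcal{H}$ onto $AC(\mathbb{R}; \mathcal{H})$ is one-to-one. The following definition is, therefore, good.

Definition A.2. The derivative $\partial : AC(\mathbb{R}; \mathcal{H}) \to L^1_{\mathrm{loc}}(\mathbb{R}; \mathcal{H})$ is given for $g \in L^1_{\mathrm{loc}}(\mathbb{R}; \mathcal{H})$ and $\psi \in \mathcal{H}$ by
\[ \partial\Big( \int_0^t g(s)\, ds + \psi \Big) = g. \]
We have the following proposition, as in the case $\mathcal{H} = \mathbb{C}$.

Proposition A.3. Let $f \in AC(\mathbb{R}; \mathcal{H})$. Then the limit
\[ g(t) = \lim_{h\to 0} \frac{1}{h}\big( f(t + h) - f(t) \big) \]
exists a.e. and $g = \partial f$.

Proof. Write $f(t) = \int_0^t \partial f(s)\, ds + f(0)$. Compute
\[ \frac{1}{h}\big( f(t + h) - f(t) \big) - \partial f(t) = \frac{1}{h}\int_t^{t+h} \big( \partial f(s) - \partial f(t) \big)\, ds \]
and estimate
\[ \Big\| \frac{1}{h}\big( f(t + h) - f(t) \big) - \partial f(t) \Big\| \le \frac{1}{|h|}\int_t^{t+|h|} u_t(s)\, ds, \]
where $u_t(s) = \|\partial f(s) - \partial f(t)\|$ is in $L^1_{\mathrm{loc}}(\mathbb{R})$ for almost every $t$. The limit on the right-hand side thus exists and equals $u_t(t) = 0$ a.e. This concludes the proof.

We now consider absolutely continuous functions from $\mathbb{R}$ into the bounded operators between separable Hilbert spaces $\mathcal{H}_1$ and $\mathcal{H}_2$. We write $\|\cdot\|_1$ and $\|\cdot\|_2$ for their respective norms and $\|\cdot\|_{1,2}$ for the norm on $\mathcal{B}(\mathcal{H}_1; \mathcal{H}_2)$.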
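Definition A.4. We say $B \in AC(\mathbb{R}; \mathcal{B}(\mathcal{H}_1; \mathcal{H}_2))$ if $t \mapsto (B\psi)(t) := B(t)\psi \in AC(\mathbb{R}; \mathcal{H}_2)$ for all $\psi \in \mathcal{H}_1$ and $\sup_{\|\psi\|_1 \le 1} \|\partial(B\psi)(\cdot)\|_2 \in L^1_{\mathrm{loc}}(\mathbb{R})$.

If $B \in AC(\mathbb{R}; \mathcal{B}(\mathcal{H}_1; \mathcal{H}_2))$, there exists a family $\partial B \in L^1_{\mathrm{loc}}(\mathbb{R}; \mathcal{B}(\mathcal{H}_1; \mathcal{H}_2))$ such that $(\partial B\psi)(t) = (\partial B)(t)\psi$ a.e. Compute for $t > s$,
\[ \|B(t) - B(s)\|_{1,2} \le \int_s^t \sup_{\|\psi\|_1 \le 1} \|(\partial B)(r)\psi\|_2\, dr. \]
By assumption, the right-hand side converges to 0 as $t \to s$, and the map $t \mapsto B(t)$ is, therefore, continuous. Again, one can verify that the map
\[ (A, B_0) \longmapsto \int_0^t A(s)\, ds + B_0 \]
from $L^1_{\mathrm{loc}}(\mathbb{R}; \mathcal{B}(\mathcal{H}_1; \mathcal{H}_2)) \times \mathcal{B}(\mathcal{H}_1; \mathcal{H}_2)$ onto $AC(\mathbb{R}; \mathcal{B}(\mathcal{H}_1; \mathcal{H}_2))$ is one-to-one, which justifies the following.

Definition A.5. The derivative $\partial : AC(\mathbb{R}; \mathcal{B}(\mathcal{H}_1; \mathcal{H}_2)) \to L^1_{\mathrm{loc}}(\mathbb{R}; \mathcal{B}(\mathcal{H}_1; \mathcal{H}_2))$ is given for $A \in L^1_{\mathrm{loc}}(\mathbb{R}; \mathcal{B}(\mathcal{H}_1; \mathcal{H}_2))$ and $B_0 \in \mathcal{B}(\mathcal{H}_1; \mathcal{H}_2)$ by
\[ \partial\Big( \int_0^t A(s)\, ds + B_0 \Big) = A. \]
We similarly have, copying the proof of Proposition A.3, the following proposition.

Proposition A.6. Let $B \in AC(\mathbb{R}; \mathcal{B}(\mathcal{H}_1; \mathcal{H}_2))$. Then the limit
\[ A(t) = \lim_{h\to 0} \frac{1}{h}\big( B(t + h) - B(t) \big) \]
exists a.e. and $A = \partial B$.

Let $f, g \in AC(\mathbb{R}; \mathcal{H})$. Since $f, g$ are indefinite integrals, one can verify that $t \mapsto \langle f(t), g(t)\rangle$ is an absolutely continuous function in the ordinary sense and, hence, is itself an indefinite integral. This argument shows that
\[ f, g \in AC(\mathbb{R}; \mathcal{H}) \implies \langle f(\cdot), g(\cdot)\rangle \in AC(\mathbb{R}). \tag{A.1} \]
We turn to the analysis of the Hilbert space $L^2([0, T]; \mathcal{H})$, $T > 0$, and consider the space of absolutely continuous functions with square integrable derivative
\[ AC^2([0, T]; \mathcal{H}) = \big\{ f \in AC([0, T]; \mathcal{H}) : \partial f \in L^2([0, T]; \mathcal{H}) \big\} \]
and the operators $\tau_0 \subset \tau_V \subset \tau_*$, $V \in \mathcal{U}(\mathcal{H})$ (the unitary operators on $\mathcal{H}$), which are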
different realizations of $-i\partial$ on the interval, namely, with the respective domains
\[ \mathcal{D}_* = AC^2([0, T]; \mathcal{H}), \qquad \mathcal{D}_0 = \{ f \in \mathcal{D}_* : f(0) = f(T) = 0 \}, \]
and
\[ \mathcal{D}_V = \{ f \in \mathcal{D}_* : f(0) = Vf(T) \}. \]
The following result can be verified as in [RS] (for the case $\mathcal{H} = \mathbb{C}$).

Proposition A.7. We have
(i) $\tau_*$ is closed;
(ii) $\tau_0$ is closed and symmetric;
(iii) $\tau_V$ is selfadjoint for any $V \in \mathcal{U}(\mathcal{H})$;
(iv) $\sigma(\tau_0) = \emptyset$ and $\sigma_{pp}(\tau_*) = \mathbb{C}$;
(v) the adjoint of $\tau_0$ equals $\tau_*$;
(vi) the spectra of the $\tau_V$ are periodic with period $2\pi/T$.
Furthermore, the resolvent of $\tau_V$ is given pointwise by
\[ (\tau_V - \lambda)^{-1} f(t) = i\int_0^t e^{i\lambda(t-s)} f(s)\, ds + i\, e^{i\lambda t}\big( V^* - e^{i\lambda T} \big)^{-1} \int_0^T e^{i\lambda(T-s)} f(s)\, ds \tag{A.2} \]
for $\lambda \in \mathbb{C}$, $\operatorname{Im}\lambda \ne 0$. (One checks directly that the right-hand side $g$ satisfies $-ig' - \lambda g = f$ and $g(0) = Vg(T)$.)

Let $B \in \mathcal{B}(\mathcal{H})$. We lift $B$ to a bounded operator on $L^2([0, T]; \mathcal{H})$ by $(Bf)(t) = Bf(t)$ for $f \in C^0([0, T]; \mathcal{H})$ and extend it by continuity. The following identity is easily verified:
\[ S^*\tau_V S = \tau_{S^*VS} \quad \text{for } S \in \mathcal{U}(\mathcal{H}). \tag{A.3} \]
Here we have used the same symbol for the operator itself and for its lifting. If there is cause for confusion, we write B for the lifted operator. We have, for example,
\[ \sigma( B) = \sigma(B), \tag{A.4} \]
which follows since resolvents of $B$ map the subspace of constant functions into itself. In fact, $( B - z)^{-1} = (B - z)^{-1}$.

We now determine the flow generated by $\tau_V$. Let $f \in C^0([0, T]; \mathcal{H})$ and define
\[ U_V(s)f(t) = V^{-[t-s]/T} f\big( t - s - [t - s] \big), \]
where $[s]$ is the largest multiple of $T$ smaller than $s$. This is clearly a one-parameter strongly continuous unitary group.

Proposition A.8. Let $V \in \mathcal{U}(\mathcal{H})$ and $s \in \mathbb{R}$. Then $\exp(-is\tau_V) = U_V(s)$.
Proof. It is sufficient to prove that the generator of $U_V$ coincides with $\tau_V$ on $f \in \mathcal{D}_V \cap C^\infty([0, T]; \mathcal{H})$, which by (A.2) is a core for $\tau_V$. We compute, for $0 < s < T$,
\[ \Big\| \frac{i}{s}\big( U_V(s)f - f \big) - \tau_V f \Big\|^2 = \int_0^s \Big\| \frac{1}{s}\big( Vf(T + t - s) - f(t) \big) + f'(t) \Big\|^2 dt + \int_s^T \Big\| \frac{1}{s}\big( f(t - s) - f(t) \big) + f'(t) \Big\|^2 dt. \]
By the choice of $f$ and the Lebesgue theorem on dominated convergence, we see that the right-hand side converges to zero as $s$ tends to zero, which proves the result.

We end the appendix with the following structure result.

Proposition A.9. Let $V \in \mathcal{U}(\mathcal{H})$. Then
λ ∈ σpp (τV ) ⇐⇒ e−iλT ∈ σpp (V ),
Ᏼpp (τV ) = L2 [0, T ]; Ᏼpp (V ) ,
λ ∈ σac (τV ) ⇐⇒ e−iλT ∈ σac (V ),
Ᏼac (τV ) = L2 [0, T ]; Ᏼac (V ) ,
λ ∈ σsc (τV ) ⇐⇒ e−iλT ∈ σsc (V ),
Ᏼsc (τV ) = L2 [0, T ]; Ᏼsc (V ) .
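As a plausibility check on the first equivalence, one can write down candidate eigenfunctions of $\tau_V$ explicitly. The sketch below uses only that $\tau_V$ acts as $-i\partial$ on functions satisfying $f(0) = Vf(T)$; the names $\varphi$ and $f_\lambda$ are introduced here for illustration and are not part of the original text.

```latex
% Illustration only: \varphi and f_\lambda are hypothetical names.
Let $\varphi \in \mathcal{H}$, $\varphi \neq 0$, let $\lambda \in \mathbb{R}$, and set
\[
  f_\lambda(t) = e^{i\lambda t}\varphi , \qquad t \in [0, T].
\]
Then $-i\partial f_\lambda = \lambda f_\lambda$, and the boundary condition defining
$\mathcal{D}_V$ reads
\[
  f_\lambda(0) = V f_\lambda(T)
  \iff \varphi = e^{i\lambda T} V\varphi
  \iff V\varphi = e^{-i\lambda T}\varphi .
\]
So $f_\lambda$ is an eigenfunction of $\tau_V$ with eigenvalue $\lambda$ exactly when
$\varphi$ is an eigenvector of $V$ with eigenvalue $e^{-i\lambda T}$.
Replacing $\lambda$ by $\lambda + 2\pi k/T$, $k \in \mathbb{Z}$, does not change
$e^{-i\lambda T}$, in line with the $2\pi/T$-periodicity of the spectrum of $\tau_V$.
```

This only exhibits one family of eigenfunctions; that there are no others, and the statements about the continuous parts, are what the proof establishes.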
Proof. By Proposition A.8 we find that $\exp(-iT\tau_V) = V$. This shows the equivalence of pure point spectra and it implies the result for the pure point subspace. It also shows that for a Borel set $F \subset S^1$ (the unit circle) and $f \in L^2([0, T]; \mathcal{H})$,
\[ \big\langle f, P_F\big( \exp(-iT\tau_V) \big) f \big\rangle = \int_0^T \big\langle f(t), P_F(V)f(t) \big\rangle\, dt, \]
where PF denotes the characteristic function for the set F. This identity shows the stated identity for the absolutely continuous subspace, and since the spectral subspaces decompose the Hilbert space, we find the identity for the singular continuous subspaces as well. The equivalence of the last two spectra now follows from this discussion and from (A.4). Appendix B. The time-dependent Schrödinger equation. Let {H (t)}t∈R be a family of selfadjoint operators on a separable Hilbert space Ᏼ which satisfies that there exists a dense subspace with ⊂ ∩t∈R Ᏸ H (t) . We wish to discuss the time-dependent Schrödinger equation corresponding to the family H (t), in particular, what a solution is supposed to satisfy. The following suggestion is a natural one: A (two-parameter) family of unitary operators {U (t, s)}t,s∈R is a solution to the time-dependent Schrödinger equation if (1) U (t, s) ⊂ for all t, s ∈ R;
(2) for any ϕ ∈ the map t → U (t, s)ϕ admits a pointwise derivative almost everywhere and its derivative satisfies the vector-valued differential equation
\[ i\frac{d}{dt} U(t, s)\varphi = H(t)U(t, s)\varphi, \qquad U(s, s) = I, \tag{B.1} \]
almost everywhere.

Solutions to this equation are not unique and we, therefore, need to discuss which one we consider (if any exist). Let be as above. A natural class of unitary families would be those for which U ϕ ∈ AC(R; Ᏼ) for all ϕ ∈ . It is, however, not clear why this family should be stable under composition, which makes it difficult to work with. Instead, we have a slightly weaker result that we choose to supply since it covers what is needed in the present paper.

We consider two Hilbert spaces, $\mathcal{H}_1$ and $\mathcal{H}_2$, equipped with inner products $\langle\cdot,\cdot\rangle_1$ and $\langle\cdot,\cdot\rangle_2$. We assume $\mathcal{H}_1 \subset \mathcal{H}_2$ is a dense subspace, and
\[ \|\psi\|_2 \le C\|\psi\|_1, \qquad \psi \in \mathcal{H}_1, \]
˜ 1,2 for the for some C > 0. (Compared to above = Ᏼ1 and Ᏼ = Ᏼ2 .) We write space of symmetric operators with domain containing Ᏼ1 and we say H1 ∼ H2 if ˜ 1,2 / ∼. Since a symmetric operator H ∈ 1,2 H1|Ᏼ1 = H2|Ᏼ1 and define 1,2 = is closable, we find that H as an operator from Ᏼ1 into Ᏼ2 is closed and, hence, bounded by the closed graph theorem. Thus, there is a canonical inclusion of 1,2 into Ꮾ(Ᏼ1 ; Ᏼ2 ). In the norm sup ψ 1 ≤1 H ψ 2 , the space 1,2 is complete and the inclusion into Ꮾ(Ᏼ1 ; Ᏼ2 ) is an isometry. We consider operator families H from the space L1loc (R; 1,2 ), which we identify with a subspace of L1loc (R; Ꮾ(Ᏼ1 ; Ᏼ2 )). We are interested in solutions U (·, s) to the evolution equation i∂U = H U,
U (s, s) = I.
(B.2)
Solutions are sought in the set $\mathcal{U}_{1,2}$ given by the following definition.

Definition B.1. The set $\mathcal{U}_{1,2}$ consists of $U : \mathbb{R} \to \mathcal{U}(\mathcal{H}_2)$, which are measurable and satisfy that $U|_{\mathcal{H}_1} \in AC(\mathbb{R}; \mathcal{B}(\mathcal{H}_1; \mathcal{H}_2))$ and $U\psi, U^*\psi \in L^\infty_{\mathrm{loc}}(\mathbb{R}; \mathcal{H}_1)$ for all $\psi \in \mathcal{H}_1$.

Note that being a solution to (B.2) is consistent with being an element of $\mathcal{U}_{1,2}$. All operators are identified with operators on the large Hilbert space $\mathcal{H}_2$ and adjoints are taken with respect to the inner product $\langle\cdot,\cdot\rangle_2$. Note that families in $\mathcal{U}_{1,2}$ are strongly continuous and by the uniform boundedness principle, we have
\[ \sup_{|t| \le T} \big( \|U(t)\|_{1,1} + \|U^*(t)\|_{1,1} \big) < \infty, \tag{B.3} \]
for any $T > 0$.
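To make the evolution equation (B.2) concrete, here is a minimal scalar sketch. It assumes the simplest possible setting, $\mathcal{H}_1 = \mathcal{H}_2 = \mathbb{C}$ with equal norms and a real-valued locally integrable function $h$ in place of $H$; this choice and the symbol $h$ are illustrative assumptions, not a case treated in the text.

```latex
% Illustration only: the scalar setting and the function h are assumptions.
Let $h \in L^1_{\mathrm{loc}}(\mathbb{R})$ be real-valued and put
\[
  U(t, s) = \exp\Big( -i\int_s^t h(r)\, dr \Big), \qquad t, s \in \mathbb{R}.
\]
Then $|U(t, s)| = 1$, $U(s, s) = 1$, and
\[
  U(t, s) = 1 - i\int_s^t h(r)\, U(r, s)\, dr ,
\]
so $U(\cdot, s)$ is an indefinite integral with $i\partial U = hU$ almost everywhere.
Its adjoint is the complex conjugate
\[
  \overline{U(t, s)} = \exp\Big( i\int_s^t h(r)\, dr \Big) = U(s, t),
  \qquad
  i\partial_t \overline{U(t, s)} = -\overline{U(t, s)}\, h(t),
\]
and additivity of the integral gives $U(t, s)\,U(s, r) = U(t, r)$.
```

In this commutative case every step is explicit; the point of the appendix is that the same structure survives when $H(t)$ is an unbounded, non-commuting family.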
Instead of (B.2) one could seek solutions U (s, ·) from ᐁ1,2 to the equation i∂U = −U H,
U (s, s) = I
(B.4)
and, in fact, we have the following result.

Lemma B.2. We have
(i) suppose $U(\cdot, s) \in \mathcal{U}_{1,2}$ solves (B.2). Then $U^*(\cdot, s) \in \mathcal{U}_{1,2}$ and it solves (B.4);
(ii) suppose $U(s, \cdot) \in \mathcal{U}_{1,2}$ solves (B.4). Then $U^*(s, \cdot) \in \mathcal{U}_{1,2}$ and it solves (B.2).

Proof. We only verify (i) since (ii) can be proved in similar fashion. Since $U(\cdot, s)$ is in $AC(\mathbb{R}; \mathcal{B}(\mathcal{H}_1; \mathcal{H}_2))$ and solves (B.2), we find
\[ iU(t, s) = iI + \int_s^t H(r)U(r, s)\, dr. \]
Due to the construction of the integral (see Appendix A), we thus find, for $\psi, \varphi \in \mathcal{H}_1$,
\[ \langle \psi, iU(t, s)\varphi \rangle = \langle -i\psi, \varphi \rangle + \int_s^t \langle U^*(r, s)H(r)\psi, \varphi \rangle\, dr, \]
which implies that
\[ -iU^*(t, s) = -iI + \int_s^t U^*(r, s)H(r)\, dr. \]
Hence U ∗ (·, s) ∈ ᐁ1,2 and it solves (B.4). The following proposition shows that the set ᐁ1,2 has a group structure. Proposition B.3. Let U1 , U2 ∈ ᐁ1,2 and ψ ∈ AC(R; Ᏼ2 ) ∩ L∞ loc (R; Ᏼ1 ). Then we have (i) U1∗ ∈ ᐁ1,2 , Ᏼ1 ⊂ Ᏸ((∂U1 )∗ ) and ∂U1∗ = (∂U1 )∗|Ᏼ1 ; (ii) U1 ψ ∈ AC(R; Ᏼ2 ) ∩ L∞ loc (R; Ᏼ1 ) and ∂U1 ψ = (∂U1 )ψ + U1 ∂ψ; (iii) U2 U1 ∈ ᐁ1,2 and ∂(U2 U1 ) = (∂U2 )U1 + U2 ∂U1 . Proof. Compute for ψ, ϕ ∈ Ᏼ1 , using (A.1) and Proposition A.6, 0 = ∂ ψ, ϕ 2 = ∂ U1 (·)ψ, U1 (·)ϕ 2 = (∂U1 )(·)ψ, U1 (·)ϕ 2 + U1 (·)ψ, (∂U1 )(·)ϕ 2 , which shows that U1∗ (i∂U1 ) ∈ L1loc (R; 1,2 ). For ψ, ϕ ∈ Ᏼ1 , we thus have ψ, (∂U1 )(·)ϕ 2 = U1∗ (·)ψ, U1∗ (·)(∂U1 )(·)ϕ 2 = − U1∗ (·)(∂U1 )(·)U1∗ (·)ψ, ϕ 2 .
This implies that $\mathcal{H}_1 \subset \mathcal{D}((\partial U_1)^*(t))$ for almost all $t$ and
\[ (\partial U_1)^*|_{\mathcal{H}_1} = -U_1^*(\partial U_1)U_1^*|_{\mathcal{H}_1}. \tag{B.5} \]
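As in the proof of Lemma B.2, this shows that $U_1^* \in \mathcal{U}_{1,2}$. Let $U_1$ and $\psi$ be as in the statement of the proposition. Compute for $\varphi \in \mathcal{H}_1$, using (A.1) and the above,
\[ \langle \varphi, U_1(t)\psi(t) \rangle = \int_0^t h(s)\, ds + \langle \varphi, U_1(0)\psi(0) \rangle, \]
where by Proposition A.6,
\[ h = \partial\langle U_1^*(\cdot)\varphi, \psi(\cdot) \rangle = \langle (\partial U_1)^*(\cdot)\varphi, \psi(\cdot) \rangle + \langle U_1^*(\cdot)\varphi, (\partial\psi)(\cdot) \rangle = \langle \varphi, (\partial U_1)(\cdot)\psi(\cdot) + U_1(\cdot)(\partial\psi)(\cdot) \rangle. \]
By construction of the integral and (B.3), this concludes the proof of (ii), and (iii) follows directly from (ii).

Notice that by (B.5) the maps
\[ G_L(U) = (i\partial U)U^* \quad \text{and} \quad G_R(U) = U^*(i\partial U) \]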
both map ᐁ1,2 into L1loc (R; 1,2 ), and, hence, U is a solution to (B.2) with H = GL (U ) and to (B.4) with H = GR (U ). Theorem B.4 (Uniqueness). Let s ∈ R and H ∈ L1loc (R; 1,2 ). (i) There can be at most one solution U (·, s) ∈ ᐁ1,2 to (B.2). (ii) There can be at most one solution U (s, ·) ∈ ᐁ1,2 to (B.4). Proof. We restrict attention to (i). Assume there exist U1 (·, s), U2 (·, s) ∈ ᐁ1,2 such that GL (U1 (·, s)) = GL (U2 (·, s)). Compute using Proposition B.3(i), (iii), and (B.5): ∂ U2∗ U1 = ∂U2∗ U1 + U2∗ (∂U1 ) = −U2∗ (∂U2 )U2∗ U1 + U2∗ (∂U1 )U1∗ U1 = −iU2∗ GL (U2 )U1 + iU2∗ GL (U1 )U1 = 0. This shows that U2 (·, s)∗ U1 (·, s) = U0 (s), a unitary operator, and completes the proof since the left-hand side equals the identity at t = s. We have the following easy consequences of Lemma B.2, Proposition B.3, and Theorem B.4.
Corollary B.5. Let U (t, s) be a family of unitary operators. The following three statements are equivalent.
(i) U (·, s), U (s, ·) ∈ ᐁ1,2 and solve both (B.2) and (B.4).
(ii) U (·, s) ∈ ᐁ1,2 solves (B.2) and U ∗ (t, s) = U (s, t).
(iii) U (s, ·) ∈ ᐁ1,2 solves (B.4) and U ∗ (t, s) = U (s, t).

Corollary B.6 (Chapman-Kolmogorov). Suppose the family U (t, s) satisfies either one of the three conditions in Corollary B.5. Then U (t, s)U (s, r) = U (t, r) for all t, s, r ∈ R.

In the light of Theorem B.4 and Corollary B.5, we employ the following definition of what a solution to the time-dependent Schrödinger equation is.

Definition B.7. Let H (t) be a family of Hamiltonians for which there exists a Hilbert space Ᏼ1 ⊂ Ᏼ2 = Ᏼ such that H ∈ L1loc (R; 1,2 ). A family of unitary operators U (t, s) is said to solve the time-dependent Schrödinger equation if
(i) U (·, s), U (s, ·) ∈ ᐁ1,2 ;
(ii) U (·, s) solves (B.2) and U (s, ·) solves (B.4).

For other purposes, one might want to assume the Hamiltonians H (t) to be essentially selfadjoint on Ᏼ1 for almost all t instead of just symmetric. This is, however, not necessary in our situation. In construction procedures, one often considers a corresponding integral equation, and invariance of some domain is a typical byproduct; see [T] and [Ya1]. In [DG, Appendix B.3], a situation similar to ours is considered and it is shown how one can use control of certain commutators to obtain invariance of a domain under the flow. Their approach is independent of a construction procedure. These two observations indicate that Definition B.7 is not too restrictive.

References

[A] T. Adachi, Propagation estimates for N-body Stark Hamiltonians, Ann. Inst. H. Poincaré Phys. Théor. 62 (1995), 409–428.
[AT] T. Adachi and H. Tamura, Asymptotic completeness for long-range many-particle systems with Stark effect, II, Comm. Math. Phys. 174 (1996), 537–559.
[ABG] W. O. Amrein, A.-M. Berthier, and V. Georgescu, Lp-inequalities for the Laplacian and unique continuation, Ann. Inst. Fourier (Grenoble) 31 (1981), 153–168.
[CFKS] H. L. Cycon, R. G. Froese, W. Kirsch, and B. Simon, Schrödinger Operators with Application to Quantum Mechanics and Global Geometry, Texts Monogr. Phys., Springer, Berlin, 1987.
[DG] J. Dereziński and C. Gérard, Scattering theory of classical and quantum N-particle systems, Texts Monogr. Phys., Springer, Berlin, 1997.
[G1] G. M. Graf, Phase space analysis of the charge transfer model, Helv. Phys. Acta 63 (1990), 107–138.
[G2] G. M. Graf, A remark on long-range Stark scattering, Helv. Phys. Acta 64 (1991), 1167–1174.
[HMS1] I. Herbst, J. S. Møller, and E. Skibsted, Spectral analysis of N-body Stark Hamiltonians, Comm. Math. Phys. 174 (1995), 261–294.
[HMS2] I. Herbst, J. S. Møller, and E. Skibsted, Asymptotic completeness for N-body Stark Hamiltonians, Comm. Math. Phys. 174 (1996), 509–535.
[H1] J. S. Howland, Stationary scattering theory for time-dependent Hamiltonians, Math. Ann. 207 (1974), 315–335.
[H2] J. S. Howland, Scattering theory for Hamiltonians periodic in time, Indiana Univ. Math. J. 28 (1979), 471–494.
[HL] M. J. Huang and R. B. Lavine, Boundedness of kinetic energy for time-dependent Hamiltonians, Indiana Univ. Math. J. 38 (1989), 189–210.
[JY] A. Jensen and K. Yajima, On the long range scattering for Stark Hamiltonians, J. Reine Angew. Math. 420 (1991), 179–193.
[KY] H. Kitada and K. Yajima, A scattering theory for time-dependent long-range potentials, Duke Math. J. 49 (1982), 341–376.
[Mø] J. S. Møller, An abstract radiation condition and applications to N-body systems, to appear in Rev. Math. Phys.
[Mo] E. Mourre, Absence of singular continuous spectrum for certain selfadjoint operators, Comm. Math. Phys. 78 (1980/1981), 391–408.
[RS] M. Reed and B. Simon, Methods of Modern Mathematical Physics, I: Functional Analysis, 2d ed., Academic Press, New York, 1980.
[R] H. L. Royden, Real Analysis, 3d ed., Macmillan, New York, 1988.
[Si1] I. M. Sigal, Stark effect in multielectron systems: Nonexistence of bound states, Comm. Math. Phys. 122 (1989), 1–22.
[Si2] I. M. Sigal, On long-range scattering, Duke Math. J. 60 (1990), 473–496.
[SS] I. M. Sigal and A. Soffer, Local decay and velocity bounds for time-independent and time-dependent Hamiltonians, preprint, 1987.
[Sk] E. Skibsted, Propagation estimates for N-body Schrödinger operators, Comm. Math. Phys. 142 (1991), 67–98.
[T] H. Tanabe, Equations of Evolution, Monogr. Stud. Math. 6, Pitman, Boston, 1979.
[Ya1] K. Yajima, Scattering theory for Schrödinger equations with potentials periodic in time, J. Math. Soc. Japan 29 (1977), 729–743.
[Ya2] K. Yajima, Existence of solutions for Schrödinger evolution equations, Comm. Math. Phys. 110 (1987), 415–426.
[Yo] K. Yokoyama, Mourre theory for time-periodic systems, Nagoya Math. J. 149 (1998), 193–210.
[Z] L. Zielinski, Scattering for a dispersive charge transfer model, Ann. Inst. H. Poincaré Phys. Théor. 67 (1997), 339–386.

Université Paris-Sud, Département de Mathématiques, Orsay, France; Moeller.Jacob@math.u-psud.fr
Vol. 105, No. 1
DUKE MATHEMATICAL JOURNAL
© 2000
INTERIOR REGULARITY OF THE COMPLEX MONGE-AMPÈRE EQUATION IN CONVEX DOMAINS

ZBIGNIEW BŁOCKI

Received 2 June 1999. Revision received 2 February 2000.
2000 Mathematics Subject Classification. Primary 32W20; Secondary 35J60.
Author's work supported in part by the Committee for Scientific Research grant number 2 PO3A 003 13.

0. Introduction. For $C^2$-smooth plurisubharmonic (psh) functions, we consider the complex Monge-Ampère equation
\[ \det(u_{i\bar j}) = \psi, \tag{0.1} \]
where $u_{i\bar j} = \partial^2 u/\partial z_i\partial\bar z_j$, $i, j = 1, \dots, n$. The main result of this paper is the following theorem.

Theorem A. Let $\Omega$ be a bounded, convex domain in $\mathbb{C}^n$. Assume that $\psi$ is a $C^\infty$ function in $\Omega$ such that $\psi > 0$ and $|D\psi^{1/n}|$ is bounded. Then there exists a $C^\infty$-psh solution $u$ of (0.1) in $\Omega$ with $\lim_{z\to\partial\Omega} u(z) = 0$.

The theory of fully nonlinear elliptic operators of second order can be applied to the operator $(\det(u_{i\bar j}))^{1/n}$. It follows in particular that if $u$ is strictly psh and $C^{2,\alpha}$ for some $\alpha \in (0, 1)$, then $\det(u_{i\bar j}) \in C^{k,\beta}$ implies $u \in C^{k+2,\beta}$, where $k = 1, 2, \dots$, and $\beta \in (0, 1)$ (see, e.g., [9, Lemma 17.16]). Therefore, to prove Theorem A, it is enough to show existence of a solution that is $C^{2,\alpha}$ in every $\Omega' \Subset \Omega$, where $\alpha \in (0, 1)$ depends on $\Omega'$. We obtain this assuming only that $\psi^{1/n}$ is positive and Lipschitz in $\Omega$ (see Theorem 4.1).

In a special case of a polydisc, we also allow nonzero boundary values.

Theorem B. Let $P$ be a polydisc in $\mathbb{C}^n$. Assume that $\psi$ is a $C^\infty$ function in $P$ such that $\psi > 0$ and $|D^2\psi^{1/n}|$ is bounded. Let $f$ be a $C^{1,1}$ function on the boundary $\partial P$ such that $f$ is subharmonic on every analytic disc embedded in $\partial P$. Then (0.1) has a $C^\infty$-psh solution in $P$ such that $\lim_{\zeta\to z} u(\zeta) = f(z)$ for $z \in \partial P$.

In Section 5, we explain what we precisely mean by saying that a function is $C^{1,1}$ on a (nonsmooth) set $\partial P$. In particular, all functions that are extendable to a $C^{1,1}$ function in an open neighborhood of $\partial P$ are allowed.

Usually, the Dirichlet problem for the complex Monge-Ampère operator is considered on smooth, strictly pseudoconvex domains in $\mathbb{C}^n$. For these, the existence of (weak) continuous solutions was proved in [1], whereas smooth solutions were obtained, for example, in [5], [10], and [11]. Here, however, we do not assume any regularity of the boundary. In case of the real Monge-Ampère operator, a result corresponding to Theorem A is due to Pogorelov, and a proof without gaps can be found in [6, Theorem 7] (see also [7]).

To prove Theorem A, we need interior $C^1$, $C^2$, and $C^{2,\alpha}$ a priori estimates for the solutions of (0.1). One of the main problems in the complex case was to derive a $C^1$-estimate, whereas in the real case it is trivial (because for any convex function $u$ on $\Omega$, vanishing on $\partial\Omega$, we have $|Du(x)| \le -u(x)/\operatorname{dist}(x, \partial\Omega)$). We do it in Section 2 (Theorem 2.1), and this is the only point where we need the assumption that $\Omega$ is convex. We suspect that Theorem A should hold in a broader class of hyperconvex domains.

An interior $C^2$-estimate for the complex Monge-Ampère equation is proved in [14]. However, it gives an $L^\infty$-bound only for $\Delta u$ and not for $|D^2u|$; therefore, we cannot use the $C^{2,\alpha}$-estimate from [15]. In Section 3, we adapt the methods of [16] for the real Monge-Ampère equation and get an interior $C^{2,\alpha}$-estimate of solutions of (0.1) using only the upper bounds of $\Delta u$ and $|D\psi^{1/n}|$. To show Theorem A, we could have used a result from [13] instead of Theorem 3.1, but this would not give Theorem 4.1 in its full generality.

In the proofs of the above theorems, we use a notion of a generalized solution of (0.1) introduced in [1]. The solutions obtained in Theorems A and B are unique, even among continuous psh functions.

Acknowledgments. Parts of this paper were written both during my stay at the Mid Sweden University in Sundsvall and at the Mathematical Institute of the Polish Academy of Sciences while on leave from the Jagiellonian University. I would like to thank all three institutions. I am also grateful to S. Kołodziej for helpful discussions on the subject.

1. Preliminaries.
If $u$ is a continuous psh function, then we can uniquely define a nonnegative Borel measure $Mu$ in such a way that
(i) if $u_j \to u$ locally uniformly, then $Mu_j \to Mu$ weakly;
(ii) $Mu = \det(u_{i\bar j})\, d\lambda$ if $u$ is $C^2$
(see, e.g., [1]). Bedford and Taylor [1] solved the Dirichlet problem for the operator $M$ in strictly pseudoconvex domains. This result was generalized in [2] (see also [3]) for the class of hyperconvex domains.

Theorem 1.1. Let $\Omega$ be a bounded, hyperconvex domain in $\mathbb{C}^n$. Assume that $\psi$ is nonnegative, continuous, and bounded in $\Omega$. Let $f$ be continuous on $\partial\Omega$ and such that it can be continuously extended to a psh function on $\Omega$. Then there exists a solution of the following Dirichlet problem:
\[ \begin{cases} u \text{ psh on } \Omega, \text{ continuous on } \overline{\Omega}, \\ Mu = \psi \text{ on } \Omega, \\ u = f \text{ on } \partial\Omega. \end{cases} \tag{1.1} \]
We recall that a domain is called hyperconvex if it admits a bounded psh exhaustion function. In particular, all bounded convex domains are hyperconvex.

In [1] Bedford and Taylor also proved the following comparison principle, which implies in particular the uniqueness of the solution of (1.1) in an arbitrary bounded domain in $\mathbb{C}^n$.

Proposition 1.2. Let $\Omega$ be a bounded domain in $\mathbb{C}^n$. If $u, v$ are psh in $\Omega$, continuous on $\overline{\Omega}$, and such that $u \le v$ on $\partial\Omega$ and $Mu \ge Mv$ in $\Omega$, then $u \le v$ in $\Omega$.

The following regularity result can be also found in [1].

Theorem 1.3. Let $\Omega = B$ be a Euclidean ball in $\mathbb{C}^n$. Assume that $f$ is $C^{1,1}$ on $\partial B$ and $\psi^{1/n}$ is $C^{1,1}$ on $\overline{B}$ (i.e., it is $C^{1,1}$ inside $B$ and the second derivative is bounded there). Then a solution of (1.1) is $C^{1,1}$ in $B$. Moreover, for any $B' \Subset B$, we have
\[ \|D^2u\|_{B'} \le C, \]
where $C$ depends only on $n$, $\|D^2f\|_{\partial B}$, $\|D^2\psi^{1/n}\|_B$, $\operatorname{dist}(B', \partial B)$, and the radius of $B$.

In Section 5, we prove a similar result for a polydisc in $\mathbb{C}^n$. The following theorem was proved in [5].

Theorem 1.4. Assume that $\Omega$ is strictly pseudoconvex with $C^\infty$ boundary, $\psi$ is $C^\infty$ on $\overline{\Omega}$, $\psi > 0$, and $f$ is $C^\infty$ on $\partial\Omega$. Then $u$, the solution of (1.1), is $C^\infty$ on $\overline{\Omega}$.

It is well known that
\[ \big( M(u_1 + u_2) \big)^{1/n} \ge (Mu_1)^{1/n} + (Mu_2)^{1/n}, \qquad u_1, u_2 \text{ psh and } C^2. \tag{1.2} \]
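A small worked instance of (1.2) may help fix the conventions. The sketch below takes $n = 2$ and two quadratic psh functions chosen here purely for illustration, and identifies $Mu$ with its density $\det(u_{i\bar j})$, as is legitimate for $C^2$ functions.

```latex
% Illustration only: u_1 and u_2 are example functions chosen for this check.
In $\mathbb{C}^2$ let
\[
  u_1(z) = |z_1|^2 + 4|z_2|^2, \qquad u_2(z) = 4|z_1|^2 + |z_2|^2 .
\]
Their complex Hessians are the constant matrices
\[
  \big( (u_1)_{i\bar j} \big) = \begin{pmatrix} 1 & 0 \\ 0 & 4 \end{pmatrix},
  \qquad
  \big( (u_2)_{i\bar j} \big) = \begin{pmatrix} 4 & 0 \\ 0 & 1 \end{pmatrix},
\]
so $\det\big((u_1)_{i\bar j}\big) = \det\big((u_2)_{i\bar j}\big) = 4$ and
\[
  (Mu_1)^{1/2} + (Mu_2)^{1/2} = 2 + 2 = 4 .
\]
On the other hand $u_1 + u_2 = 5|z|^2$ has Hessian $5I$, hence
\[
  \big( M(u_1 + u_2) \big)^{1/2} = \sqrt{25} = 5 \ge 4 .
\]
If instead $u_2 = c\,u_1$ with $c > 0$, both sides equal $(1 + c)(Mu_1)^{1/2}$,
so (1.2) holds with equality when the Hessians are proportional.
```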
The above inequality does not make sense if $u_1$ and $u_2$ are just continuous, since then $Mu_1$ and $Mu_2$ are only measures. However, we can generalize it as follows (see [3, Theorem 3.11]).

Proposition 1.5. Let $u_1$ and $u_2$ be psh and continuous with $Mu_1 \ge \psi_1$, $Mu_2 \ge \psi_2$, where $\psi_1$ and $\psi_2$ are continuous and nonnegative. Then
\[ M(u_1 + u_2) \ge \big( \psi_1^{1/n} + \psi_2^{1/n} \big)^n. \]
The following $C^2$-estimate was proved by F. Schulz [14].

Theorem 1.6. Let $\Omega$ be a bounded, hyperconvex domain in $\mathbb{C}^n$, and let $u$ be a $C^3$ psh function in $\Omega$ with $\lim_{z\to\partial\Omega} u(z) = 0$. Assume, moreover, that for some positive constants $K_0$, $K_1$, $b$, $B_0$, and $B_1$, we have
\[ |u| \le K_0, \quad |Du| \le K_1 \quad \text{and} \quad b \le \psi \le B_0, \quad |D\psi| \le B_1 \]
in $\Omega$, where $\psi = \det(u_{i\bar j})$. Then for any $\varepsilon > 0$, there exists a constant $C$, depending only on $n$, $\varepsilon$, $b$, $B_0$, $B_1$, $K_0$, $K_1$ and on the upper bound for the volume of $\Omega$ such that
\[ \Delta u\,(-u)^{2+\varepsilon} \le C \]
in $\Omega$.

In the proof of Theorem B, instead of applying Theorems 1.4 and 1.6, we use the following proposition.

Proposition 1.7. Let $\Omega$ be a bounded domain in $\mathbb{C}^n$. Assume that $u$ is a psh function in a neighborhood of $\overline{\Omega}$ and such that, for a positive constant $K$ and $h$ sufficiently small, it satisfies the estimate
\[ u(z + h) + u(z - h) - 2u(z) \le K|h|^2, \qquad z \in \Omega. \]
Then $u$ is $C^{1,1}$ in $\Omega$ and $|D^2u| \le K$.

This result was essentially proved in [1, pp. 34–35]. The arguments from [1] were simplified in [8], and we present Demailly's proof for the convenience of the reader.

Proof of Proposition 1.7. Let $u_\varepsilon = u * \rho_\varepsilon$ denote the standard regularizations of $u$. Then for $z \in \Omega_\varepsilon := \{ z \in \Omega : \operatorname{dist}(z, \partial\Omega) > \varepsilon \}$ and $h$ sufficiently small, we have
\[ u_\varepsilon(z + h) + u_\varepsilon(z - h) - 2u_\varepsilon(z) \le K|h|^2. \]
This implies that
\[ D^2u_\varepsilon.h^2 \le K|h|^2. \tag{1.3} \]
Since $u_\varepsilon$ is psh, we have
\[ D^2u_\varepsilon.h^2 + D^2u_\varepsilon.(ih)^2 = 4\sum_{j,k=1}^n \frac{\partial^2 u_\varepsilon}{\partial z_j\partial\bar z_k} h_j\bar h_k \ge 0. \]
Therefore, by (1.3),
\[ D^2u_\varepsilon.h^2 \ge -D^2u_\varepsilon.(ih)^2 \ge -K|h|^2. \]
This implies that $|D^2u_\varepsilon| \le K$ on $\Omega_\varepsilon$, and the proposition follows.

2. A $C^1$-estimate in convex domains. In this section we prove the following interior a priori gradient estimate for the complex Monge-Ampère operator in convex domains.
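Theorem 2.1. Let $u$ be psh and continuous in a bounded, convex domain $\Omega$ in $\mathbb{C}^n$ with $\lim_{z\to\partial\Omega} u(z) = 0$. Assume, moreover, that $Mu = \psi$ is continuous and $\psi^{1/n}$ is Lipschitz in $\Omega$ with a constant $K_1$. Then for any $\Omega' \Subset \Omega$, $u$ is Lipschitz in $\Omega'$ with the constant
\[ \tilde K = D^2\Big( \frac{2K_0}{d} + K_1\Big( 1 + \frac{D}{d} \Big) \Big), \]
where $D = \operatorname{diam}\Omega$, $d = \operatorname{dist}(\Omega', \partial\Omega)$, and $K_0 = \sup_\Omega \psi^{1/n}$.

In the proof of Theorem 2.1, we use the following elementary lemma.

Lemma 2.2. Assume that $\Omega$ is a bounded convex domain in $\mathbb{R}^n$ containing the origin. Then, if $0 < \alpha < 1$, we have
\[ \operatorname{dist}(\alpha\Omega, \partial\Omega) = (1 - \alpha)\operatorname{dist}(0, \partial\Omega). \]
Proof. The inequality "$\le$" is clear. To prove the reverse, we take $x, y \in \partial\Omega$. We have to show that $|x - \alpha y| \ge (1 - \alpha)d$, where $d := \operatorname{dist}(0, \partial\Omega)$. Let $l$ be the line passing through $x$ and $y$. If $0$, $x$, and $y$ form an acute-angled triangle, then
\[ |x - \alpha y| \ge |x - \alpha x| \ge (1 - \alpha)d. \]
Otherwise, from the convexity of $\Omega$, it follows that $d \le \operatorname{dist}(0, l)$ and, consequently,
\[ |x - \alpha y| \ge (1 - \alpha)\operatorname{dist}(0, l) \ge (1 - \alpha)d. \]
Proof of Theorem 2.1. We may assume that $\Omega'$ is convex. Fix $a, b \in \Omega'$ with $|a - b| < d$. It is enough to show that
\[ u(b) - u(a) \le \tilde K|a - b|. \tag{2.1} \]
For $z \in \Omega$, put
\[ T(z) := \Big( 1 - \frac{|a - b|}{d} \Big)(z - a) + b. \]
Then $T(a) = b$ and, by Lemma 2.2,
\[ \operatorname{dist}\Big( \Big( 1 - \frac{|a - b|}{d} \Big)(\Omega - a),\ \partial\Omega - a \Big) = \frac{|a - b|}{d}\operatorname{dist}(a, \partial\Omega) \ge |a - b|, \]
and it follows that $T(\Omega) \subset \Omega$. Moreover, simple calculation shows that
\[ |T(z) - z| \le \Big( 1 + \frac{D}{d} \Big)|a - b|, \qquad z \in \Omega, \]
and, since $\psi^{1/n}$ is Lipschitz,
\[ \psi^{1/n}(T(z)) \ge \psi^{1/n}(z) - K_1\Big( 1 + \frac{D}{d} \Big)|a - b|. \tag{2.2} \]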
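The "simple calculation" behind the bound on $|T(z) - z|$, and the way (2.2) follows from it, can be spelled out as follows. This is only an unpacking of the definitions of $T$, $D$, $d$, and $K_1$ given in the text.

```latex
% Unpacks the displacement estimate for T and its use in (2.2).
From the definition of $T$,
\[
  T(z) - z = (b - a) - \frac{|a - b|}{d}\,(z - a), \qquad z \in \Omega .
\]
Since $z, a \in \Omega$ we have $|z - a| \le \operatorname{diam}\Omega = D$, hence
\[
  |T(z) - z| \le |a - b| + \frac{|a - b|}{d}\,|z - a|
             \le \Big( 1 + \frac{D}{d} \Big)|a - b| .
\]
The Lipschitz condition on $\psi^{1/n}$ with constant $K_1$ then gives
\[
  \psi^{1/n}(T(z)) \ge \psi^{1/n}(z) - K_1\,|T(z) - z|
                   \ge \psi^{1/n}(z) - K_1\Big( 1 + \frac{D}{d} \Big)|a - b| ,
\]
which is (2.2). Note that $\psi^{1/n}(T(z))$ is defined because $T(\Omega) \subset \Omega$.
```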
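For $z \in \Omega$, put
\[ v(z) := u(T(z)) + \frac{\tilde K}{D^2}\big( |z - a|^2 - D^2 \big)|a - b|. \]
(It is well defined because $T(\Omega) \subset \Omega$.) The function $v$ is psh, continuous, and negative on $\Omega$. From Proposition 1.5 and (2.2), we infer that
\begin{align*}
Mv &\ge \Big[ \Big( 1 - \frac{|a - b|}{d} \Big)^2 \psi^{1/n}(T(z)) + \frac{\tilde K}{D^2}|a - b| \Big]^n \\
&\ge \Big[ \Big( 1 - \frac{2|a - b|}{d} \Big) \psi^{1/n}(T(z)) + \frac{\tilde K}{D^2}|a - b| \Big]^n \\
&\ge \Big[ \psi^{1/n}(T(z)) + \Big( \frac{\tilde K}{D^2} - \frac{2K_0}{d} \Big)|a - b| \Big]^n \\
&\ge \Big[ \psi^{1/n}(z) + \Big( \frac{\tilde K}{D^2} - \frac{2K_0}{d} - K_1\Big( 1 + \frac{D}{d} \Big) \Big)|a - b| \Big]^n = \psi(z).
\end{align*}
The comparison principle now implies that $v \le u$; thus
\[ u(a) \ge v(a) = u(b) - \tilde K|a - b|, \]
and we get (2.1).

3. A $C^{2,\alpha}$-estimate and local regularity. The aim of this section is to show the following result.

Theorem 3.1. Let $u$ be a $C^4$-psh function in an open $\Omega \subset \mathbb{C}^n$. Assume that for some positive $K_0$, $K_1$, $K_2$, $b$, $B_0$, and $B_1$, we have
\[ |u| \le K_0, \quad |Du| \le K_1, \quad \Delta u \le K_2 \quad \text{and} \quad b \le \psi \le B_0, \quad |D\psi^{1/n}| \le B_1 \]
in $\Omega$, where $\psi = \det(u_{i\bar j})$. Let $\Omega' \Subset \Omega$. Then there exist $\alpha \in (0, 1)$ depending only on $n$, $K_0$, $K_1$, $K_2$, $b$, $B_0$, $B_1$ and a positive constant $C$ depending, besides those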
quantities, on dist( , ∂ ) such that 2 D u α ≤ C. C ( ) We use similar methods, as in other papers on nonlinear elliptic operators, especially the methods in [16]. Note that if we knew that |D 2 u| ≤ K2 , then Theorem 3.1 would be a consequence of [15]. On the other hand, if we additionally assumed that |D 2 ψ 1/n | ≤ B2 , then from [13, Theorem 1] we would get the estimate D(u) ≤ C, and Theorem 3.1 would follow from the Schauder estimates. It is interesting to generalize Theorem 3.1 to arbitrary, continuous psh functions u (since u ∈ L∞ , u would have to be at least in W 2,p for every p < ∞). In the proof of Theorem 3.1, we need the following fact from the matrix theory. Lemma 3.2. Let λ and + be such that 0 < λ < + < +∞. By S[λ, +] we denote the set of positive Hermitian matrices in Cn×n with eigenvalues in [λ, +]. Then we can find unit vectors γ1 , . . . , γN in Cn and λ∗ , +∗ depending only on n, λ, and + such that 0 < λ∗ < +∗ < +∞. For every A = (aij ) ∈ S[λ, +], we can write A=
N
β k γk ⊗ γ k ,
that is, aij =
k=1
N
βk γki γ kj ,
k=1
where $\beta_1, \dots, \beta_N \in [\lambda^*, \Lambda^*]$. The set $\{\gamma_1, \dots, \gamma_N\}$ can be chosen so that it contains a given finite subset of the unit sphere in $\mathbb{C}^n$, for example, the set of the coordinate unit vectors.

The proof of Lemma 3.2 for real symmetric matrices can be found, for example, in [9, Lemma 17.13], and it readily extends to the case of Hermitian matrices.

Proof of Theorem 3.1. If we consider constants depending only on the quantities used in the assumption, we say that those constants are under control, and we usually denote them by $C_1$, $C_2$, etc. Let $a^{i\bar j}$ denote the $(i,j)$-cominor of the matrix $(u_{i\bar j})$, so that $a^{k\bar l} = \partial \det(u_{i\bar j})/\partial u_{k\bar l}$. If we set $u^{i\bar j} := a^{i\bar j}/\psi$, then we have $(u^{i\bar j})^T = (u_{i\bar j})^{-1}$. If we differentiate both sides of the equation $u^{i\bar j} u_{i\bar k} = \delta_{jk}$ with respect to $z_p$ and solve a suitable system of linear equations, we obtain
\[ (u^{i\bar j})_p = -u^{i\bar l}\, u^{k\bar j}\, u_{k\bar l p}. \]
Since $\psi_p = a^{k\bar l} u_{k\bar l p}$, we get
\[ (a^{i\bar j})_p = \psi\bigl(u^{i\bar j} u^{k\bar l} - u^{i\bar l} u^{k\bar j}\bigr) u_{k\bar l p}. \]
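The cominor relations above can be sanity-checked on a single example. The following Python sketch is only an illustration for $n = 2$: the Hermitian matrix and all helper names are assumptions chosen for the check, not part of the paper. It confirms that the cominors are the partial derivatives of the determinant, that $(u^{i\bar j})^T$ inverts $(u_{i\bar j})$, and that $a^{i\bar j}u_{i\bar j} = n\psi$.

```python
# Illustrative check for n = 2; the matrix and all helper names are assumptions.

def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def cominors2(m):
    """a[k][l] = d det(m) / d m[k][l] for a 2x2 matrix."""
    return [[m[1][1], -m[1][0]],
            [-m[0][1], m[0][0]]]

def numerical_cominor(m, k, l, step=1.0):
    """det is affine in each single entry, so one finite difference is exact."""
    bumped = [row[:] for row in m]
    bumped[k][l] += step
    return (det2(bumped) - det2(m)) / step

# A positive Hermitian matrix playing the role of (u_{i j-bar}); eigenvalues 1 and 4.
u = [[2 + 0j, 1 + 1j],
     [1 - 1j, 3 + 0j]]
psi = det2(u)                      # psi = det(u_{i j-bar})
a = cominors2(u)                   # cominors a^{i j-bar}
u_upper = [[a[i][j] / psi for j in range(2)] for i in range(2)]

# sum_i u^{ij} u_{ik} should equal delta_{jk}
product = [[sum(u_upper[i][j] * u[i][k] for i in range(2)) for k in range(2)]
           for j in range(2)]

# a^{ij} u_{ij} should equal n * psi with n = 2
contraction = sum(a[i][j] * u[i][j] for i in range(2) for j in range(2))
```

The finite difference is exact here because the determinant is affine in each individual entry, so no step-size tuning is needed.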
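Therefore,
\[ (a^{i\bar j_0})_i = (a^{i_0\bar j})_{\bar j} = 0 \qquad (3.1) \]
for every $i_0, j_0 = 1, \dots, n$. Take $\gamma \in \mathbb{C}^n$, $|\gamma| = 1$, and for arbitrary function $v$ denote $v_\gamma = \sum_p v_p \gamma_p$. The operator $F(A) := (\det A)^{1/n}$ is concave on the set of nonnegative Hermitian matrices. If we differentiate the equation $F((u_{i\bar j})) = \psi^{1/n}$ with respect to $\gamma$ and $\bar\gamma$, we obtain
\[ F_{u_{i\bar j},u_{k\bar l}}\, u_{i\bar j\gamma}\, u_{k\bar l\bar\gamma} + F_{u_{i\bar j}}\, u_{i\bar j\gamma\bar\gamma} = (\psi^{1/n})_{\gamma\bar\gamma}. \]
Since $F_{u_{i\bar j}} = (1/n)\psi^{-1+1/n} a^{i\bar j}$ and since $F$ is concave, by (3.1) we have
\[ (a^{i\bar j} u_{\gamma\bar\gamma i})_{\bar j} = a^{i\bar j} u_{\gamma\bar\gamma i\bar j} \ge n\psi^{1-1/n} (\psi^{1/n})_{\gamma\bar\gamma} = \psi_{\gamma\bar\gamma} - \Bigl(1 - \frac{1}{n}\Bigr)\psi^{-1}|\psi_\gamma|^2, \]
and we arrive at the estimate
\[ (a^{i\bar j} u_{\gamma\bar\gamma i})_{\bar j} \ge -C_1 + \sum_{s=1}^{2n} \frac{\partial f^s}{\partial x_s}, \qquad (3.2) \]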
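where $\|f^s\|_{L^\infty(\Omega)} \le C_2$. From the assumptions of the theorem, it follows that the eigenvalues of the matrix $(u_{i\bar j})$ are in $[\lambda, \Lambda]$, where $\lambda, \Lambda > 0$ are under control. By Lemma 3.2, there are unit vectors $\gamma_1, \dots, \gamma_N$ such that for $z, w \in \Omega$ we can write
\[ a^{i\bar j}(w)\bigl(u_{i\bar j}(w) - u_{i\bar j}(z)\bigr) = \sum_{k=1}^{N} \beta_k(w)\bigl(u_{\gamma_k\bar\gamma_k}(w) - u_{\gamma_k\bar\gamma_k}(z)\bigr), \]
where $\beta_k(w) \in [\lambda^*, \Lambda^*]$ and $\lambda^*, \Lambda^* > 0$ are under control. It is a consequence of the inequality between geometric and arithmetic means that for any nonnegative Hermitian matrices $A, B \in \mathbb{C}^{n\times n}$ we have
\[ \frac{1}{n}\operatorname{trace}(AB^T) \ge (\det A)^{1/n} (\det B)^{1/n}. \]
Therefore,
\[ a^{i\bar j}(w)\, u_{i\bar j}(z) \ge n\, \psi(w)^{1-1/n}\, \psi(z)^{1/n}. \]
Since $a^{i\bar j}(w)\, u_{i\bar j}(w) = n\psi(w)$, we conclude that
\[ \sum_{k=1}^{N} \beta_k(w)\bigl(u_{\gamma_k\bar\gamma_k}(w) - u_{\gamma_k\bar\gamma_k}(z)\bigr) \le C_3|z - w| \qquad (3.3) \]
since $|D\psi^{1/n}| \le B_1$.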
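The trace inequality used to obtain (3.3) is easy to test numerically. The sketch below is a hedged illustration only: it restricts to $2\times 2$ matrices, generates nonnegative Hermitian matrices as $GG^*$ from random complex $G$ (an assumption made for sampling), and uses hypothetical helper names. It evaluates the gap $\frac{1}{n}\operatorname{trace}(AB^T) - (\det A)^{1/n}(\det B)^{1/n}$ with $n = 2$, which should never be negative beyond rounding error.

```python
import math
import random

def random_psd_2x2(rng):
    """Return G G^* for a random complex 2x2 matrix G (Hermitian, nonnegative)."""
    g = [[complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(2)]
         for _ in range(2)]
    return [[sum(g[i][k] * g[j][k].conjugate() for k in range(2))
             for j in range(2)] for i in range(2)]

def det2(m):
    """Determinant of a Hermitian 2x2 matrix (real up to rounding)."""
    return (m[0][0] * m[1][1] - m[0][1] * m[1][0]).real

def trace_a_bt(a, b):
    """trace(A B^T) = sum over i, j of a_ij * b_ij."""
    return sum(a[i][j] * b[i][j] for i in range(2) for j in range(2)).real

def amgm_gap(a, b):
    """(1/2) trace(A B^T) - (det A)^(1/2) (det B)^(1/2); expected to be >= 0."""
    geometric = math.sqrt(max(det2(a), 0.0) * max(det2(b), 0.0))
    return trace_a_bt(a, b) / 2 - geometric

rng = random.Random(0)
gaps = [amgm_gap(random_psd_2x2(rng), random_psd_2x2(rng)) for _ in range(200)]
identity = [[1 + 0j, 0j], [0j, 1 + 0j]]
```

For $A = B = I$ the gap vanishes, which matches the equality case of the arithmetic-geometric mean inequality.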
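Fix $z_0 \in \Omega$ and denote $B_R = B(z_0, R)$ for $R < 1$ such that $0 < 4R < \operatorname{dist}(z_0, \partial\Omega)$. Set $M_{k,R} = \sup_{B_R} u_{\gamma_k\bar\gamma_k}$ and $m_{k,R} = \inf_{B_R} u_{\gamma_k\bar\gamma_k}$. By (3.2) and the weak Harnack inequality (see [9, Theorem 8.18]), it follows that
\[ R^{-2n}\int_{B_R} \bigl(M_{k,4R} - u_{\gamma_k\bar\gamma_k}\bigr)\, d\lambda \le C_4\bigl(M_{k,4R} - M_{k,R} + R\bigr). \qquad (3.4) \]
Summing (3.4) over $k \ne k_0$, where $k_0$ is fixed, we obtain
\[ R^{-2n}\int_{B_R} \sum_{k\ne k_0} \bigl(M_{k,4R} - u_{\gamma_k\bar\gamma_k}\bigr)\, d\lambda \le C_4\bigl(\omega(4R) - \omega(R) + R\bigr), \qquad (3.5) \]
where $\omega(R) = \sum_{k=1}^{N} (M_{k,R} - m_{k,R})$. By (3.3) for $z \in B_{4R}$, $w \in B_R$, we have
\[
\begin{aligned}
\beta_{k_0}(w)\bigl(u_{\gamma_{k_0}\bar\gamma_{k_0}}(w) - u_{\gamma_{k_0}\bar\gamma_{k_0}}(z)\bigr) &\le C_3|z - w| + \sum_{k\ne k_0} \beta_k(w)\bigl(u_{\gamma_k\bar\gamma_k}(z) - u_{\gamma_k\bar\gamma_k}(w)\bigr) \\
&\le C_5 R + \Lambda^* \sum_{k\ne k_0} \bigl(M_{k,4R} - u_{\gamma_k\bar\gamma_k}(w)\bigr).
\end{aligned}
\]
Thus,
\[ u_{\gamma_{k_0}\bar\gamma_{k_0}}(w) - m_{k_0,4R} \le \frac{1}{\lambda^*}\Bigl(C_5 R + \Lambda^* \sum_{k\ne k_0} \bigl(M_{k,4R} - u_{\gamma_k\bar\gamma_k}(w)\bigr)\Bigr), \]
and (3.5) gives
\[ R^{-2n}\int_{B_R} \bigl(u_{\gamma_{k_0}\bar\gamma_{k_0}} - m_{k_0,4R}\bigr)\, d\lambda \le C_6\bigl(\omega(4R) - \omega(R) + R\bigr). \]
This, coupled with (3.4), easily implies that
\[ \omega(R) \le C_7\bigl(\omega(4R) - \omega(R) + R\bigr); \]
hence $\omega(R) \le \delta\omega(4R) + R$, where $\delta \in (0,1)$ is under control. In an elementary way (see [9, Lemma 8.23]), we deduce that for any $\mu \in (0,1)$,
\[ \omega(R) \le \frac{1}{\delta}\Bigl(\frac{R}{R_0}\Bigr)^{(1-\mu)(-\log\delta)/\log 4} \omega(R_0) + \frac{1}{1-\delta} R^{\mu} R_0^{1-\mu}, \]
where $0 < R < R_0 < \min\{1, \operatorname{dist}(z_0, \partial\Omega)\}$. Therefore, if we choose $\mu$ so that $(1-\mu)(-\log\delta)/\log 4 \le \mu$, we obtain $\omega(R) \le CR^{\alpha}$, where $\alpha \in (0,1)$ is under control and $C$ depends additionally on $\operatorname{dist}(z_0, \partial\Omega)$.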
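The passage from the recursive inequality $\omega(R) \le \delta\omega(4R) + R$ to a Hölder-type decay can be illustrated numerically. The following Python sketch is a hedged illustration, not the argument of [9, Lemma 8.23] itself: it takes equality in the recursion along the radii $R_0/4^k$ (an assumed model sequence), and compares the result with the bound $\frac{1}{\delta}(R/R_0)^{(1-\mu)(-\log\delta)/\log 4}\omega(R_0) + \frac{1}{1-\delta}R^{\mu}R_0^{1-\mu}$. The function names and the sample parameters are hypothetical.

```python
import math

def worst_case_oscillation(delta, omega0, r0, steps):
    """omega at radii r0 / 4**k, taking equality in omega(R) <= delta*omega(4R) + R."""
    values = [omega0]
    for k in range(1, steps + 1):
        values.append(delta * values[-1] + r0 / 4 ** k)
    return values

def holder_exponent(delta, mu):
    """Exponent (1 - mu)(-log delta) / log 4 appearing in the decay estimate."""
    return (1 - mu) * (-math.log(delta)) / math.log(4)

def holder_bound(delta, mu, omega0, r0, r):
    """Right-hand side of the decay estimate at radius r."""
    decay = (r / r0) ** holder_exponent(delta, mu)
    return decay * omega0 / delta + (r ** mu) * (r0 ** (1 - mu)) / (1 - delta)

delta, mu, omega0, r0 = 0.5, 0.5, 1.0, 0.5
values = worst_case_oscillation(delta, omega0, r0, steps=20)
bounds = [holder_bound(delta, mu, omega0, r0, r0 / 4 ** k) for k in range(21)]
```

With these sample values the exponent is $1/4 \le \mu$, so the model sequence decays at least like $R^{1/4}$, in line with the conclusion $\omega(R) \le CR^{\alpha}$.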
Since $\gamma_1, \dots, \gamma_N$ can be chosen so that they contain the coordinate vectors, we deduce that $\|\Delta u\|_{C^{\alpha}(\Omega')} \le C$ for some $\alpha \in (0,1)$ under control. The conclusion of the theorem follows from the Schauder estimates.

We now prove the following local regularity of the Monge-Ampère operator.

Theorem 3.3. Assume that $u$ is a $C^{1,1}$-psh function such that $Mu$ is $C^\infty$ and $Mu > 0$. Then $u$ is $C^\infty$.

Proof. We may assume that $u$ is defined in a neighborhood of a Euclidean ball $\bar B$. There is a sequence $f_j \in C^\infty(\partial B)$ decreasing to $u$ on $\partial B$ and such that $\|D^2 f_j\|_{\partial B} \le C_1$. Theorem 1.4 gives $u_j \in C^\infty(\bar B)$, $u_j$ psh in $B$ such that $Mu_j = Mu$, and $u_j = f_j$ on $\partial B$. By the comparison principle, $u_j$ is decreasing to $u$ in $B$. From Theorem 1.3 it follows that for every $B' \Subset B$ there is $C_2$ such that $\|D^2 u_j\|_{B'} \le C_2$. Thus, by Theorem 3.1, for every $B'' \Subset B$ we can find $\alpha \in (0,1)$ and $C_3$ such that $\|D^2 u_j\|_{C^{\alpha}(B'')} \le C_3$. It follows that $u \in C^{2,\alpha}(B'')$, which finishes the proof.

4. Proof of Theorem A. As mentioned in the introduction, Theorem A is an immediate consequence of the following result.

Theorem 4.1. Let $\Omega$ be a bounded, convex domain in $\mathbb{C}^n$. Assume that $\psi$ is a positive function in $\Omega$ such that $\psi^{1/n}$ is (globally) Lipschitz in $\Omega$, and let $u$ be the (unique) solution of (1.1) with $f = 0$. Then for every $\Omega' \Subset \Omega$ there exists $\alpha \in (0,1)$ such that $u \in C^{2,\alpha}(\Omega')$.

Proof. Let $\Omega''$ be a convex domain such that $\Omega' \Subset \Omega'' \Subset \Omega$, and let $\Omega_j$ be a sequence of smooth strictly convex domains such that $\Omega_j \Subset \Omega_{j+1}$ and $\bigcup_{j=1}^{\infty} \Omega_j = \Omega$. Then one can find functions $\psi_j$, which are positive, $C^\infty$ in a neighborhood of $\bar\Omega_j$ and such that $\lim_{j\to\infty} \|\psi_j - \psi\|_{\Omega_j} = 0$, and $\|D\psi_j^{1/n}\|_{\Omega_j} \le C_1$. (The functions $\psi_j$ can be chosen as $\psi * \rho_\varepsilon$, the standard regularizations of $\psi$, where $\varepsilon$ is sufficiently small.) Theorem 1.4 provides $C^\infty$ functions $u_j$ on $\bar\Omega_j$, psh in $\Omega_j$ with $u_j = 0$ on $\partial\Omega_j$, and $Mu_j = \psi_j$. We claim that the sequence $u_j$ tends locally uniformly to $u$ in $\Omega$. The following two inequalities can be easily deduced from superadditivity of the complex Monge-Ampère operator and from the comparison principle:
\[ u(z) + \bigl(|z - z_0|^2 - D^2\bigr)\|\psi_j - \psi\|_{\Omega_j}^{1/n} \le u_j(z), \qquad z \in \Omega_j, \]
and
\[ u_j(z) + \bigl(|z - z_0|^2 - D^2\bigr)\|\psi_j - \psi\|_{\Omega_j}^{1/n} \le u(z) + \|u\|_{\partial\Omega_j}, \qquad z \in \Omega_j. \]
Here, $z_0$ is a fixed point of $\Omega$ and $D = \operatorname{diam}\Omega$. This implies that
\[ \|u - u_j\|_{\Omega_j} \le \|u\|_{\partial\Omega_j} + D^2\|\psi_j - \psi\|_{\Omega_j}^{1/n}, \]
and the right-hand side converges to 0 as $j \to \infty$.
We claim that the sequence uj is uniformly bounded in . Choose a and b so that max u < a < b < 0. For j big enough, we have ⊂ uj < a ⊂ {u < a} ⊂ uj < b ⊂ {u < b} ⊂ j . By Theorem 2.1, applied to convex domains j , there is C2 such that for every j , Duj ≤ C2 . {u 0, there exists C3 such that 2+ε uj b − uj ≤ C3 on uj < b . Therefore,
uj
C3 , (b − a)2+ε which proves the claim. Now, from Theorem 3.1, it follows that there exists α ∈ (0, 1) such that Dju C α ( ) ≤ C4 ; hence, u ∈ C 2,α ( ).
≤
We conjecture that Theorem 4.1 (as well as Theorem A) holds if $\Omega$ is only hyperconvex. It would be sufficient if we knew that the sequence $|Du_j|$ is locally bounded in $\Omega$, where $u_j$ is the sequence constructed in the proof of Theorem 4.1. This would require a counterpart of Theorem 2.1 for nonconvex domains.

Theorem A implies the following analogue of the local regularity of the real Monge-Ampère operator.

Theorem 4.2. Let $u$ be a convex function defined on an open subset of $\mathbb{C}^n$ such that its graph contains no line segment. Suppose that $Mu$ is positive and $C^\infty$. Then $u$ is $C^\infty$.

Proof. Let $\Omega$ denote a domain where $u$ is defined. Fix $z_0 \in \Omega$. Let $T$ be an affine function such that $T \le u$ and $T(z_0) = u(z_0)$. Since the graph of $u$ contains no line segment, one can easily show that for some $\varepsilon > 0$ the convex domain $\{u - T - \varepsilon < 0\}$ is relatively compact in $\Omega$. Now we apply Theorem A to this domain. By the uniqueness of the Dirichlet problem, we conclude that $u$ must be smooth in some neighborhood of $z_0$.

5. Interior regularity in a polydisc. Throughout this section, $P$ denotes the unit polydisc in $\mathbb{C}^n$; that is, $P = \Delta^n = \{z \in \mathbb{C}^n : |z_j| < 1,\ j = 1, \dots, n\}$. Similarly as before, our starting point in proving Theorem B is Theorem 1.1. In order to use it, we need the following proposition.

Proposition 5.1. Let $f$ be a continuous function on $\partial P$. Then the following are equivalent:
(i) $f$ is subharmonic on every disc embedded in $\partial P$;
(ii) $f$ can be continuously extended to a psh function on $P$.

Proof. (ii)⇒(i) is clear. To show the converse, define
\[ u := \sup\{v : v \text{ psh on } P,\ v^* \le f \text{ on } \partial P\}. \]
Here $v^*$ denotes the upper regularization of $v$ which is defined on $\bar P$; the lower regularization is denoted by $v_*$. By a result from [17] (see also [3, Theorem 1.5]), it is enough to show that $u_* = u^* = f$ on $\partial P$. By the classical potential theory, we can find a harmonic function $h$ on $P$, continuous on $\bar P$ and such that $h = f$ on $\partial P$. Therefore, $u \le h$, and it remains to show that $u_* \ge f$ on $\partial P$. Take any $\varepsilon > 0$ and $w \in \partial P$. We assume that $w = (1, 0, \dots, 0)$. For $z \in \bar P$ and $A$ positive, we can define
\[ v(z) := f(1, z_2, \dots, z_n) + A(\operatorname{Re} z_1 - 1) - \varepsilon. \]
Then $v$ is continuous on $\bar P$, psh on $P$, and we claim that for $A$ big enough, $v \le f$ on $\partial P$. We can find positive $r$ such that $f(1, z_2, \dots, z_n) - \varepsilon \le f(z)$ if $|z_1 - 1| \le r$ and $z \in \partial P$. Therefore, it is enough to take $A$, which is not smaller than
\[ \sup_{z\in\partial P,\ |z_1 - 1|\ge r} \frac{f(1, z_2, \dots, z_n) - f(z) - \varepsilon}{1 - \operatorname{Re} z_1}. \]
Eventually, $u_*(w) \ge v(w) \ge f(w) - \varepsilon$, which completes the proof.

In case of a bidisc, Theorem 1.1 was earlier proved in [12] with probabilistic methods. In fact, similarly as in [12], if $\Omega = P$, then the assumption in Theorem 1.1 that $\psi$ is bounded can be relaxed. One can allow nonnegative, continuous $\psi$ with
\[ \psi(z) \le \frac{C}{(1 - |z_1|)^{\beta} \cdots (1 - |z_n|)^{\beta}}, \qquad z \in P, \]
for some positive $C$ and $\beta < 2$. This arises from the subsolution
\[ u(z) = -(1 - |z_1|^2)^{\varepsilon} \cdots (1 - |z_n|^2)^{\varepsilon}, \]
where $0 < \varepsilon \le 1/n$; then
\[ Mu(z) = \varepsilon^n (1 - |z_1|^2)^{n\varepsilon - 2} \cdots (1 - |z_n|^2)^{n\varepsilon - 2} (1 - \varepsilon|z|^2). \]

Before stating the main result of this section, we explain the notation. We say that a function is $C^{1,1}$ on $\bar P$ if it is $C^{1,1}$ on $P$ and its second derivative is (globally) bounded. By saying that a function is $C^{1,1}$ on $\partial P$, we mean that it is continuous on $\partial P$, $C^{1,1}$ on the $(2n-1)$-real-dimensional manifold
\[ R := \bigcup_{j=1}^{n} \Delta^{j-1} \times \partial\Delta \times \Delta^{n-j}, \]
and the second derivative is bounded on $R$.
In order to prove Theorem B, we show the following counterpart of Theorem 1.3 for a polydisc.

Theorem 5.2. Assume that $\psi \ge 0$ is such that $\psi^{1/n} \in C^{1,1}(\bar P)$. Let $f$ be $C^{1,1}$ on $\partial P$ and subharmonic on every disc embedded in $\partial P$. Then a solution of (1.1) is $C^{1,1}$ on $P$.

Note that, contrary to Theorem 3.1, we do not assume here that $\psi > 0$. We conjecture that for arbitrary bounded, hyperconvex domain $\Omega$ in $\mathbb{C}^n$, if $f = 0$ and $\psi \ge 0$, $\psi^{1/n} \in C^{1,1}(\bar\Omega)$, then a solution of (1.1) belongs to $C^{1,1}(\Omega)$. The analogous problem can be stated for the real Monge-Ampère operator and bounded, convex domains in $\mathbb{R}^n$. By [11], the answer in both the complex and real case is positive if $\Omega$ is $C^{3,1}$ strictly pseudoconvex (resp., convex); we then get a solution in $C^{1,1}(\bar\Omega)$. However, we cannot expect global boundedness of the second derivatives in general because if, for example, $\psi = 1$, then all eigenvalues of the complex (resp., real) Hessian of $u$ would be bounded away from zero. This would imply in particular that there are no analytic discs (resp., line segments) in $\partial\Omega$, but this is allowed in general.

Proof of Theorem 5.2. The proof is similar to the proof of [1, Proposition 6.6]. Let $D$ be open and relatively compact in $P$. Define
\[ T_{a,h}(z) = T(a, h, z) := \Bigl(\frac{h_1 + (1 - |a_1|^2 - \bar a_1 h_1) z_1}{1 - |a_1|^2 - a_1\bar h_1 + \bar h_1 z_1}, \dots, \frac{h_n + (1 - |a_n|^2 - \bar a_n h_n) z_n}{1 - |a_n|^2 - a_n\bar h_n + \bar h_n z_n}\Bigr). \]
Then $T$ is $C^\infty$-smooth in a neighborhood of the set $\{(a, h, z) : a \in \bar D,\ |h| \le d/2,\ z \in \bar P\}$, where $d = \operatorname{dist}(D, \partial P)$. Moreover, $T_{a,h}$ is a holomorphic automorphism of $P$ mapping $a$ to $a + h$ and such that $T_{a,0}(z) = z$. For $a \in D$, $|h| < d/2$, and $z \in \bar P$, put
\[ v(z) := \frac{u(T_{a,h}(z)) + u(T_{a,-h}(z))}{2} - K_1|h|^2 + K_2\bigl(|z|^2 - n\bigr)|h|^2. \]
We claim that if $K_1$ and $K_2$ are big enough, then for all $a$, $h$, and $z$ we have $v \le u$. By the comparison principle, it is enough to show that $v \le u$ on $\partial P$ and $Mv \ge Mu$ on $P$. Since $T_{a,h}$ maps $R$ onto $R$, it is easy to see that if we take
\[ K_1 := \frac{1}{2}\Bigl\|\frac{\partial^2}{\partial h^2} f(T(a, h, z))\Bigr\|_{\{a\in\bar D,\ |h|\le d/2,\ z\in R\}}, \]
then $v \le u$ on $R$.
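The stated properties of $T_{a,h}$ can be checked numerically coordinate by coordinate. The sketch below is an illustration under an assumption: each coordinate of $T_{a,h}$ is read as the Möbius map $z_j \mapsto \bigl(h_j + (1 - |a_j|^2 - \bar a_j h_j) z_j\bigr) / \bigl(1 - |a_j|^2 - a_j\bar h_j + \bar h_j z_j\bigr)$. The function names and the sample points are hypothetical. It tests that $T_{a,h}(a) = a + h$, that $T_{a,0}$ is the identity, and that each coordinate maps the unit circle to itself.

```python
import cmath

def disc_map(a, h, z):
    """One coordinate of T_{a,h}: a Mobius map of the unit disc sending a to a + h."""
    numerator = h + (1 - abs(a) ** 2 - a.conjugate() * h) * z
    denominator = 1 - abs(a) ** 2 - a * h.conjugate() + h.conjugate() * z
    return numerator / denominator

def polydisc_map(a, h, z):
    """T_{a,h} applied coordinatewise to a point z of the closed polydisc."""
    return tuple(disc_map(aj, hj, zj) for aj, hj, zj in zip(a, h, z))

# Sample data for n = 2 (arbitrary illustrative values).
a = (0.3 + 0.1j, -0.2j)
h = (0.1 - 0.05j, 0.05 + 0.1j)
zero_shift = (0j, 0j)
z = (0.4 - 0.3j, -0.1 + 0.6j)

image_of_a = polydisc_map(a, h, a)            # expected: a + h
identity_image = polydisc_map(a, zero_shift, z)  # expected: z

# Moduli of the images of points on the unit circle, for each coordinate.
circle_moduli = [abs(disc_map(aj, hj, cmath.exp(1j * 0.1 * m)))
                 for aj, hj in zip(a, h) for m in range(63)]
```

Each coordinate has the form $(\alpha z + \beta)/(\bar\beta z + \bar\alpha)$ with $\alpha = 1 - |a_j|^2 - \bar a_j h_j$ and $\beta = h_j$, which is why the unit circle is preserved whenever $|\alpha| > |\beta|$.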
Since both functions are continuous, the inequality holds on $\partial P$. From Proposition 1.5, we infer
\[ Mv \ge \Bigl(\frac{\psi^{1/n}(T_{a,h}(z))\,|T'_{a,h}(z)|^{2/n} + \psi^{1/n}(T_{a,-h}(z))\,|T'_{a,-h}(z)|^{2/n}}{2} + K_2|h|^2\Bigr)^n, \]
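where by $T'$ we mean the Jacobian of $T$. Therefore, we have $Mv \ge Mu = \psi$ if
\[ K_2 = \frac{1}{2}\Bigl\|\frac{\partial^2}{\partial h^2}\Bigl(\psi^{1/n}(T_{a,h}(z))\,|T'_{a,h}(z)|^{2/n}\Bigr)\Bigr\|_{\{a\in\bar D,\ |h|\le d/2,\ z\in P\}}. \]
Eventually, $v \le u$ and
\[ u(a) \ge v(a) \ge \frac{u(a + h) + u(a - h)}{2} - (K_1 + nK_2)|h|^2, \qquad a \in D,\ |h| < \frac{d}{2}. \]
The theorem follows from Proposition 1.7.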
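It is clear from the proof that, similarly as in Theorem 1.3, we have an interior a priori estimate for $D^2 u$ in Theorem 5.2. Theorem B can be deduced from Theorems 5.2 and 3.3. The assumption that $\psi > 0$ in Theorem B is essential, as the following example shows.

Example. Let $P = \Delta^2$ be the unit bidisc. The function $f(z, w) = (\operatorname{Re} z)^2 (\operatorname{Re} w)^2$ is separately subharmonic; thus, by Proposition 5.1 and Theorem 1.1, the function
\[ u := \sup\{v : v \text{ psh in } \Delta^2,\ v^* \le f \text{ on } \partial(\Delta^2)\} \]
is psh in $\Delta^2$, continuous on $\bar\Delta^2$, $u = f$ on $\partial(\Delta^2)$, and $Mu = 0$ in $\Delta^2$. By Theorem 5.2, $u$ is $C^{1,1}$ in $\Delta^2$. Note that for any $z, w \in \mathbb{C}$, we have
\[ 4\operatorname{Re} z \operatorname{Re} w - (1 - |z|^2)(1 - |w|^2) = |z + w|^2 - |1 - zw|^2. \]
Thus, $\{|z + w| = |1 - zw|\} \cap \partial(\Delta^2) \subset \{\operatorname{Re} z \operatorname{Re} w = 0\}$. It is easy to check that the set $\{|z + w| = |1 - zw|\} \cap \bar\Delta^2$ can be foliated by analytic discs with boundaries in $\partial(\Delta^2)$ and that $u = 0$ on $\{|z + w| \le |1 - zw|\} \cap \bar\Delta^2$. For $\varepsilon \in (0,1)$, set
\[ v_\varepsilon(z, w) = \frac{\varepsilon^2}{4}\Bigl(\frac{|z + w|^2}{|\varepsilon + 1 - zw|^2} - 1\Bigr) = \frac{\varepsilon^2\bigl(4\operatorname{Re} z \operatorname{Re} w - (1 - |z|^2)(1 - |w|^2) - 2\varepsilon(1 - \operatorname{Re}(zw)) - \varepsilon^2\bigr)}{4|\varepsilon + 1 - zw|^2}. \]
Then $v_\varepsilon$ is psh in $\Delta^2$, continuous on $\bar\Delta^2$, and $v_\varepsilon(z, w) \le \operatorname{Re} z \operatorname{Re} w$ there. Therefore, we have $(\max\{0, v_\varepsilon\})^2 \le u$ and $v_\varepsilon \le \sqrt{u}$. For $t \in (\sqrt{2} - 1, 1)$, an elementary calculation gives
\[ \sqrt{u(t, t)} \ge \sup_{\varepsilon\in(0,1)} \frac{\varepsilon^2}{4}\Bigl(\frac{(2t)^2}{(\varepsilon + 1 - t^2)^2} - 1\Bigr) = \frac{1}{4}\Bigl((2t)^{2/3} - (1 - t^2)^{2/3}\Bigr)^3, \]
since the supremum is attained for $\varepsilon$ with $(\varepsilon + 1 - t^2)^3 = (2t)^2 (1 - t^2)$. For $t \in (0, 1)$,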
we thus have
\[ u(t, t) \begin{cases} = 0 & \text{if } t \le \sqrt{2} - 1, \\ \ge 2^{-4}\bigl((2t)^{2/3} - (1 - t^2)^{2/3}\bigr)^6 & \text{if } t \ge \sqrt{2} - 1, \end{cases} \]
and we conclude that $u$ is not $C^6$. We conjecture that, in fact, $u$ is not even $C^2$.

References
[1] E. Bedford and B. A. Taylor, The Dirichlet problem for a complex Monge-Ampère equation, Invent. Math. 37 (1976), 1–44.
[2] Z. Błocki, On the $L^p$ stability for the complex Monge-Ampère operator, Michigan Math. J. 42 (1995), 269–275.
[3] Z. Błocki, The complex Monge-Ampère operator in hyperconvex domains, Ann. Scuola Norm. Sup. Pisa Cl. Sci. (4) 23 (1996), 721–747.
[4] Z. Błocki, “On the regularity of the complex Monge-Ampère operator” in Complex Geometric Analysis (Pohang, 1997), Contemp. Math. 222, Amer. Math. Soc., Providence, 1999, 181–189.
[5] L. Caffarelli, J. J. Kohn, L. Nirenberg, and J. Spruck, The Dirichlet problem for non-linear second-order elliptic equations, II, Complex Monge-Ampère, and uniformly elliptic, equations, Comm. Pure Appl. Math. 38 (1985), 209–252.
[6] S. Y. Cheng and S. T. Yau, On the regularity of the Monge-Ampère equation $\det((\partial^2 u/\partial x^i \partial x^j)) = F(x, u)$, Comm. Pure Appl. Math. 33 (1977), 41–68.
[7] S. Y. Cheng and S. T. Yau, “The real Monge-Ampère equation and affine flat structures” in Beijing Symposium on Differential Geometry and Differential Equations (Beijing, 1980), Science Press, Beijing, 1982, 339–370.
[8] J.-P. Demailly, Potential theory in several complex variables, preprint, 1991.
[9] D. Gilbarg and N. S. Trudinger, Elliptic Partial Differential Equations of Second Order, Grundlehren Math. Wiss. 244, Springer, Berlin, 1983.
[10] N. V. Krylov, Smoothness of the payoff function for a controllable diffusion process in a domain (in Russian), Izv. Akad. Nauk SSSR Ser. Mat. 53 (1989), 66–96; English translation in Math. USSR-Izv. 34 (1990), 65–95.
[11] N. V. Krylov, On analogues of the simplest Monge-Ampère equation, C. R. Acad. Sci. Paris Sér. I Math. 318 (1994), 321–325.
[12] N. Levenberg and M. Okada, On the Dirichlet problem for the complex Monge-Ampère operator, Michigan Math. J. 40 (1993), 507–526.
[13] D. Riebesehl and F. Schulz, A priori estimates and a Liouville theorem for complex Monge-Ampère equations, Math. Z. 186 (1984), 57–66.
[14] F. Schulz, A $C^2$-estimate for solutions of complex Monge-Ampère equations, J. Reine Angew. Math. 348 (1984), 88–93.
[15] F. Schulz, Über nichtlineare, konkave elliptische Differentialgleichungen, Math. Z. 191 (1986), 429–448.
[16] J. B. Walsh, Continuity of envelopes of plurisubharmonic functions, J. Math. Mech. 18 (1968), 143–148.
[17] R. Wang and J. Jiang, Another approach to the Dirichlet problem for equations of Monge-Ampère type, Northeastern Math. J. 1 (1985), 27–40.
Jagiellonian University, Institute of Mathematics, Reymonta 4, 30-059 Kraków, Poland; [email protected]
Vol. 105, No. 1
DUKE MATHEMATICAL JOURNAL
© 2000
CORRECTION TO “THE GROSS-KOHNEN-ZAGIER THEOREM IN HIGHER DIMENSIONS” RICHARD E. BORCHERDS
J. Bruinier pointed out the following gap in the proof of the main theorem of [B2]. The paper defines certain principal Heegner divisors in terms of the divisors of automorphic forms with unitary characters. Unfortunately, if these characters are allowed to have infinite order as in [B2], then there are sometimes “too many” principal divisors (see [F]) and the Heegner-divisor class group collapses. So the definition of principal Heegner divisors in [B2] should include the condition that the unitary character of these automorphic forms must have finite order. But then we have to show that the infinite-product automorphic forms used in the proof have this property. This can be shown as follows.

For $O_{2,n}(\mathbb{R})$ with $n > 2$ this follows because these Lie groups have no almost simple factors of real rank 1, and if $G$ is a lattice in a connected Lie group with no simple factors of rank 1, then the abelianization of $G$ is finite (see [M, Proposition 6.19, p. 333]). So any character of $G$ has finite order.

For the cases $n = 1$ and $n = 2$ we use the embedding trick (see [B1, Lemma 8.1]) to see that if $f$ is an infinite product of $O_{2,n}(\mathbb{R})$, then $f$ is the restriction of an infinite product $g$ of $O_{2,24+n}(\mathbb{R})$. The infinite product $g$ is not necessarily single valued; however, a look at the proof of [B1, Lemma 8.1] shows that if $f$ is constructed from a vector-valued modular form with integral coefficients, then $g^{24}$ has zeros and poles of integral order and is therefore a meromorphic automorphic form for some unitary character. By the previous paragraph this character has finite order, and therefore so does the character of $f$.

Another minor correction is that [B2, Theorem 3.1] should have the condition of ρ = σk on ∩ K added to it.

Received 8 May 2000.
2000 Mathematics Subject Classification. Primary 11G40; Secondary 11F11, 11G05, 14G10.
Author supported by National Science Foundation grant number DMS-9970611.

References
[B1] Richard E. Borcherds, Automorphic forms with singularities on Grassmannians, Invent. Math. 132 (1998), 491–562.
[B2] Richard E. Borcherds, The Gross-Kohnen-Zagier theorem in higher dimensions, Duke Math. J. 97 (1999), 219–233.
[F] John D. Fay, Fourier coefficients of the resolvent for a Fuchsian group, J. Reine Angew. Math. 293/294 (1977), 143–203.
[M] G. A. Margulis, Discrete Subgroups of Semisimple Lie Groups, Ergeb. Math. Grenzgeb. (3) 17, Springer, Berlin, 1991.
Mathematics Department, University of California at Berkeley, Berkeley, California 94720-3840, USA; [email protected]; http://www.math.berkeley.edu/˜reb
Vol. 105, No. 2
DUKE MATHEMATICAL JOURNAL
© 2000
EXOTIC PROJECTIVE STRUCTURES AND QUASI-FUCHSIAN SPACE
KENTARO ITO

Received 25 August 1998. Revision received 13 December 1999.
2000 Mathematics Subject Classification. Primary 30F40; Secondary 57M50.

1. Introduction. Let $S$ be an oriented closed surface of genus $g > 1$. A projective structure on $S$ is a maximal system of local coordinates modeled on the Riemann sphere $\hat{\mathbb{C}}$, whose transition functions are Möbius transformations. For a given projective structure on $S$, we have a pair $(f, \rho)$ of a local homeomorphism $f$ from the universal cover $\tilde S$ of $S$ to $\hat{\mathbb{C}}$, called a developing map, and a group homomorphism $\rho$ of $\pi_1(S)$ into $\mathrm{PSL}_2(\mathbb{C})$, called a holonomy representation. Let $P(S)$ denote the space of all (marked) projective structures on $S$, and let $V(S)$ denote the space of all conjugacy classes of representations of $\pi_1(S)$ into $\mathrm{PSL}_2(\mathbb{C})$. Holonomy representations give a mapping $\mathrm{hol} : P(S) \to V(S)$, which is called the holonomy mapping. It is known that the map hol is a local homeomorphism (see [13]).

The quasi-Fuchsian space $QF(S)$ is the subspace of $V(S)$ consisting of faithful representations whose holonomy images are quasi-Fuchsian groups. In this paper, we investigate the subset $Q(S) = \mathrm{hol}^{-1}(QF(S))$ of $P(S)$. We say an element of $Q(S)$ is standard if its developing map is injective; otherwise, it is exotic. The set of standard projective structures with fixed underlying complex structure is well known as the image of the Teichmüller space under Bers embedding (see [5]). On the other hand, the existence of exotic projective structures was first shown by Maskit [21]. More investigations of exotic projective structures are found in [11], [12], [13], [25], [30], and [32].

As we see in Proposition 2.3, each connected component of $Q(S)$ is biholomorphically isomorphic to $QF(S)$. Moreover, as a consequence of the result of Goldman [12], the connected components of $Q(S)$ are in one-to-one correspondence with the set $\mathcal{ML}_{\mathbb{Z}}(S)$ of integral points of measured laminations. (See Section 2.4 for a precise definition.) We denote by $\mathcal{Q}_\lambda$ the component of $Q(S)$ corresponding to $\lambda \in \mathcal{ML}_{\mathbb{Z}}(S)$, where $\mathcal{Q}_0$ is the component consisting of all standard projective structures. Recently, McMullen [25, Appendix A] discovered the following phenomenon.

Theorem 1.1 (McMullen). There exists a sequence of exotic projective structures that converges to a point of the relative boundary $\partial\mathcal{Q}_0 = \overline{\mathcal{Q}_0} - \mathcal{Q}_0$ of $\mathcal{Q}_0$.

This phenomenon deeply depends on the following phenomenon in the theory of Kleinian groups: There is a sequence of quasi-Fuchsian groups whose algebraic limit is properly contained in the geometric limit. Such a sequence of quasi-Fuchsian
groups used in the proof of Theorem 1.1 is essentially constructed by Anderson and Canary [2]. Related topics can be found in [7], [8], and [18]. Theorem 1.1 brings up naturally the following questions.
(1) Can we characterize the points of $\partial\mathcal{Q}_0$ which are limits of exotic projective structures?
(2) Can we characterize how a sequence of exotic projective structures can converge to a point of $\partial\mathcal{Q}_0$?

As for the first question, we remark that the holonomy image of the limit projective structure constructed in Theorem 1.1 is a regular b-group. Moreover, we show that any element of $\partial\mathcal{Q}_0$ whose holonomy image is a degenerate group without accidental parabolics cannot be an accumulation point of exotic projective structures (see Corollary 3.5). This result has been announced by Matsuzaki already. We discuss this topic in Section 3.

In this paper, we are mainly concerned with the second question. Our first main result shows that there exists a sequence in any exotic component $\mathcal{Q}_\lambda$ which converges to a point of $\partial\mathcal{Q}_0$.

Theorem A. For any $\lambda \in \mathcal{ML}_{\mathbb{Z}}(S)$, we have $\overline{\mathcal{Q}_0} \cap \overline{\mathcal{Q}_\lambda} \ne \emptyset$. In particular, the closure of $Q(S)$ in $P(S)$ is connected.

The proof depends on the following observation: For a converging sequence of exotic projective structures, which component of $Q(S)$ contains the sequence is closely related to how the algebraic limit is contained in the geometric limit of corresponding holonomy representations. In Section 5, we develop a technique to construct some sequences of representations with the same algebraic limit but with mutually distinct geometric limits. Using this technique, we can extend Theorem A to the following form.

Theorem B. For any finite set $\{\lambda_i\}_{i=1}^{m}$ of $\mathcal{ML}_{\mathbb{Z}}(S)$ satisfying $i(\lambda_j, \lambda_k) = 0$ for all $j, k \in \{1, \dots, m\}$, we have $\overline{\mathcal{Q}_0} \cap \overline{\mathcal{Q}_{\lambda_1}} \cap \cdots \cap \overline{\mathcal{Q}_{\lambda_m}} \ne \emptyset$, where $i(\cdot, \cdot)$ denotes the geometric intersection number.

Since the holonomy mapping $\mathrm{hol} : P(S) \to V(S)$ is a local homeomorphism, the complexity of $Q(S)$ at $\partial\mathcal{Q}_0$ is inherited by the complexity of $\partial QF(S)$. In fact, Theorem 1.1 implies that the closure of $QF(S)$ in $V(S)$ is not a manifold with boundary (see [25, Theorem A.1]). This shows the advantage of considering the space of projective structures in order to investigate the quasi-Fuchsian space. As a consequence of Theorem B, we obtain the following theorem.

Theorem C. For any positive integer $n \in \mathbb{N}$, there exists a point $[\rho]$ of $\partial QF(S)$ such that $U \cap QF(S)$ consists of more than $n$ components for any sufficiently small neighborhood $U$ of $[\rho]$.

Theorems A and B can be viewed as the projective structure analogues of the works of Anderson and Canary [2] and Anderson, Canary, and McCullough [3] in
characterizing when components of the set of discrete faithful representations of finitely generated group $G$ into $\mathrm{PSL}_2(\mathbb{C})$ have intersecting closures. Theorem C also can be viewed as the analogue of their work because Theorem C describes how the closure of a unique component of the set of discrete faithful representations intersects itself.

In Section 2, we provide detailed definitions and basic properties of the spaces and maps with which we are concerned. In Section 3, we investigate the relationship between sequences of exotic projective structures and algebraic and geometric limits of their holonomy representations. Sections 4 and 5 are devoted to the proofs of Theorems A and B, respectively.

Acknowledgments. The author would like to express his gratitude to Hiroshige Shiga and Katsuhiko Matsuzaki for their encouragement and useful suggestions. He also appreciates the referee’s valuable comments.

2. Notation and basic facts. A Kleinian group $G$ is a discrete subgroup of $\mathrm{PSL}_2(\mathbb{C})$, which acts on the hyperbolic space $\mathbb{H}^3$ as isometries, and on the sphere at infinity $S^2_\infty = \hat{\mathbb{C}}$ as conformal automorphisms. The region of discontinuity $\Omega(G)$ is the largest open subset of $\hat{\mathbb{C}}$ on which $G$ acts properly discontinuously, and the limit set $\Lambda(G)$ of $G$ is its complement $\hat{\mathbb{C}} - \Omega(G)$. The quotient manifold $N_G = (\mathbb{H}^3 \cup \Omega(G))/G$ is called the Kleinian manifold of $G$. A quasi-Fuchsian group is a Kleinian group whose limit set is a Jordan curve and which contains no element interchanging the two components of its region of discontinuity. A quasi-Fuchsian group is obtained by a quasi-conformal deformation of a Fuchsian group.

2.1. Beltrami differentials. For a given Kleinian group $G$ with $\Omega(G) \ne \emptyset$, a measurable function $\mu$ on $\hat{\mathbb{C}}$ is called a Beltrami differential for $G$ if $\mu(g(z))\,\overline{g'(z)} = \mu(z)\, g'(z)$ holds for a.e. $z \in \hat{\mathbb{C}}$ and for all $g \in G$. The space of all Beltrami differentials $\mu$ for $G$ with essential sup-norm satisfying $\|\mu\|_\infty < 1$ is denoted by $\mathrm{Belt}(G)_1$.
For a given element µ ∈ Belt(G)1 , there exists a unique quasi-conformal map w : C→ C satisfying the Beltrami equation wz¯ = µwz and fixing 0, 1, and ∞. Throughout this paper, this normalized quasi-conformal map with the Beltrami coefficient µ is denoted by wµ . For more information about the quasi-conformal map, see [20]. The quasiconformal map wµ induces a group isomorphism µ of G into PSL2 (C) satisfying wµ ◦g = µ (g)◦wµ for all g ∈ G. For a G-invariant open set U ⊂ (G), we denote by Belt(U, G)1 the subset of Belt(G)1 consisting of all elements with support in U . 2.2. Teichmüller space. Let S be an oriented closed surface of genus g > 1. The Teichmüller space T (S) consists of pairs (f, X), where X is a Riemann surface and f : S → X is an orientation preserving diffeomorphism. Two pairs, (f1 , X1 ) and (f2 , X2 ), represent the same point in T (S), if there is a holomorphic isomorphism
h : X1 → X2 such that h ◦ f1 is isotopic to f2 . It is known that the space T (S) is a (3g − 3)-dimensional complex manifold, diffeomorphic to a cell. There is another, but equivalent, definition of the Teichmüller space. We fix a Fuchsian group $ acting on the upper half-plane H = {z ∈ C : Im z > 0} such that S = H/ $. Two elements µ, ν ∈ Belt(H, $)1 are called equivalent if wµ | ∂H = wν | ∂H, or equivalently, µ = ν . The Teichmüller space T ($) (or T (S)) is the space of equivalence classes [µ] of elements µ in Belt(H, $)1 . For each t = [µ] ∈ T (S), let $t denote the quasi-Fuchsian group µ ($), whose region of discontinuity is a union of Ht = wµ (H) and Ht∗ = wµ (H∗ ), where H∗ = {z ∈ C : Im z < 0} is the lower half-plane. The quasi-conformal map wµ | H : H → Ht descends to a quasiconformal map gt : S = H/ $ → St = Ht / $t such that the pair (gt , St ) represents t ∈ T (S). 2.3. The space of projective structures. For a given projective structure on S, we obtain a local homeomorphism f : S→ C by lifting the structure to the universal cover S of S and continuing the coordinates analytically. This map f is called a developing map of the projective structure. A developing map f induces a group homomorphism ρ : π1 (S) → PSL2 (C) satisfying f ◦ g = ρ(g) ◦ f for all g ∈ π1 (S), which is called a holonomy representation. This pair (f, ρ) is called a projective pair. Note that a projective structure determines a projective pair (f, ρ) uniquely up to the action of PSL2 (C); the action is defined by (f, ρ) −→ A ◦ f, A ◦ ρ ◦ A−1 for A ∈ PSL2 (C). For each t ∈ T (S), let B2 (Ht , $t ) denote the space of holomorphic quadratic differentials for $t on Ht , whose element is a holomorphic function ϕ on Ht satisfying ϕ(γ (z))γ (z)2 = ϕ(z) for all γ ∈ $t , z ∈ Ht . The space B2 (Ht , $t ) is a (3g −3)-dimensional complex vector space. A projective structure determines naturally its underlying complex structure. 
The set of all projective structures with an underlying complex structure $S_t = H_t/\Gamma_t$ is parametrized by $B_2(H_t, \Gamma_t)$ as follows. For a projective structure on $S_t$, let $f : H_t \to \hat{\mathbb{C}}$ be its developing map. Then we assign an element $S(f) \in B_2(H_t, \Gamma_t)$ to this projective structure, where $S(f)$ is the Schwarzian derivative of $f$ defined by
\[ S(f) = \Bigl(\frac{f''}{f'}\Bigr)' - \frac{1}{2}\Bigl(\frac{f''}{f'}\Bigr)^2. \]
Conversely, for an element $\varphi \in B_2(H_t, \Gamma_t)$, there is a holomorphic map $f : H_t \to \hat{\mathbb{C}}$ satisfying $S(f) = \varphi$, which descends to a projective structure on $S_t$. Here and hereafter, with this identification, we regard a pair $(t, \varphi)$ as a marked projective structure, where $t \in T(S)$ and $\varphi \in B_2(H_t, \Gamma_t)$. Let $P(S)$ denote the holomorphic cotangent bundle over $T(S)$ with projection $\pi : P(S) \to T(S)$. Then, each fiber $\pi^{-1}(t)$ over $t \in T(S)$ is identified with $B_2(H_t, \Gamma_t)$.
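The role of the Schwarzian derivative in this parametrization rests on the standard fact that it vanishes exactly on Möbius transformations, so post-composing a developing map with an element of $\mathrm{PSL}_2(\mathbb{C})$ does not change $S(f)$. The Python sketch below is a small illustration of the vanishing, using the equivalent form $S(f) = f'''/f' - \tfrac{3}{2}(f''/f')^2$ and closed-form derivatives; the function names and sample coefficients are hypothetical.

```python
import cmath

def schwarzian(d1, d2, d3):
    """S(f) = (f''/f')' - (1/2)(f''/f')^2 = f'''/f' - (3/2)(f''/f')^2,
    evaluated from the values d1 = f', d2 = f'', d3 = f''' at one point."""
    return d3 / d1 - 1.5 * (d2 / d1) ** 2

def mobius_derivatives(a, b, c, d, z):
    """First three derivatives of f(z) = (a z + b) / (c z + d) at z."""
    det = a * d - b * c
    w = c * z + d
    return det / w ** 2, -2 * c * det / w ** 3, 6 * c * c * det / w ** 4

# A Mobius transformation has vanishing Schwarzian derivative.
point = 0.5 + 0.2j
s_mobius = schwarzian(*mobius_derivatives(2, 1j, 1, 3, point))

# For f(z) = exp(z) all derivatives coincide, so S(f) = 1 - 3/2 = -1/2.
e = cmath.exp(point)
s_exp = schwarzian(e, e, e)

# For f(z) = z^2: f' = 2z, f'' = 2, f''' = 0, so S(f) = -3 / (2 z^2).
s_square = schwarzian(2 * point, 2, 0)
```

The exponential and the square are included only as contrast cases in which the Schwarzian derivative is not zero.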
We regard P (S) as the space of marked projective structures. For any projective structure (t, ϕ) ∈ P (S), let ft,ϕ : Ht → C denote its developing map, and let ρ¯t,ϕ : $t → PSL2 (C) denote its holonomy representation satisfying ft,ϕ ◦γ = ρ¯t,ϕ (γ )◦ft,ϕ for any γ ∈ $t . A representation ρt,ϕ = ρ¯t,ϕ ◦ µ of π1 (S) ∼ = $ into PSL2 (C) is also called a holonomy representation, where µ ∈ Belt(H, $)1 is a representative of t ∈ T (S). It is known that, for any projective structure (t, ϕ), the holonomy image ρt,ϕ (π1 (S)) is a non-abelian subgroup of PSL2 (C). A sequence {ρn } of representations of π1 (S) into PSL2 (C) is said to converge algebraically to ρ if ρn (g) converges to ρ(g) in PSL2 (C) for any g ∈ π1 (S). Let V (S) denote the space of all conjugacy classes [ρ] of representations ρ : π1 (S) → PSL2 (C) such that ρ(π1 (S)) is non-abelian. A sequence {[ρn ]} in V (S) converges to [ρ] ∈ V (S) if there is a sequence of representatives of {[ρn ]} which converges algebraically to a representative of [ρ]. It is known that V (S) is (6g −6)-dimensional complex manifold (see, e.g., [23, Theorem 4.21]). The holonomy map hol : P (S) −→ V (S) is defined by hol(t, ϕ) = [ρt,ϕ ]. The basic fact is that the holonomy map is a holomorphic local homeomorphism (see [13], [9], and [14]). 2.4. Grafting. Let denote the set of homotopy classes of nontrivial simple closed curves on S. By abuse of notation, we also denote a representative of C ∈ by C. Let ᏹᏸZ (S) denote the set of integral points of measured laminations on S. Namely, each element λ ∈ ᏹᏸZ (S) is written as a formal summation nj Cj , where {nj } are positive integers and {Cj } are mutually distinct disjoint elements in . We shall contain the “zero” measured lamination in ᏹᏸZ (S). Let X be a Riemann surface marked by S. A canonical projective structure on S is provided by the Fuchsian uniformization X = H/G, where H is the upper half-plane and G is a Fuchsian group acting on H. 
For any element λ = Σj nj Cj of ℳℒZ(S), we can construct a new projective structure on S by cutting X along each Cj and “grafting” a projective annulus at each cut locus. More precisely, let lX(Cj) denote the hyperbolic length of the geodesic representative of Cj on X, and let
\[ A_j = (\mathbb{C} - i\mathbb{R}_+) \big/ \langle z \mapsto e^{l_X(C_j)} z \rangle \]
be the annulus equipped with its natural projective structure. Then a new projective structure on S is obtained by cutting X along the geodesic representative of each Cj and inserting nj copies of Aj. This new projective structure is said to be obtained by grafting along λ. (See [12], [17], [25], and [32] for more information.)

We explain this grafting operation in the context of complex analysis for later use. The following construction is due to Maskit [21], and our explanation is based on that of Gallo [11]. An element (t, ϕ) ∈ P(S) is called a Fuchsian projective structure if ft,ϕ is injective and ρt,ϕ(π1(S)) is a Fuchsian group.
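The grafting annulus Aj can be made concrete in logarithmic coordinates. The following is only a sketch, under the assumptions that the removed ray is read as the closed ray {iy : y ≥ 0} (so that the origin is removed as well) and that the branch of the logarithm with π/2 < arg z < 5π/2 is used on its complement; the symbols w, l, and the strip name \Sigma_{2\pi} are introduced here purely for illustration.

```latex
% Sketch: the grafting annulus in logarithmic coordinates.
% Assumptions (not from the source): l := l_X(C_j) > 0, and
% w = \log z with \pi/2 < \arg z < 5\pi/2 on \mathbb{C} - i\mathbb{R}_{\ge 0}.
\begin{align*}
  w = \log z \;&:\; \mathbb{C} - i\mathbb{R}_{\ge 0}
      \;\xrightarrow{\;\cong\;}\;
      \Sigma_{2\pi} = \{\, w : \pi/2 < \operatorname{Im} w < 5\pi/2 \,\}, \\
  z \mapsto e^{l} z \quad &\longleftrightarrow \quad w \mapsto w + l, \\
  A_j \;&\cong\; \Sigma_{2\pi} \big/ \langle\, w \mapsto w + l \,\rangle .
\end{align*}
```

In this picture Aj is a flat cylinder of circumference l and height 2π, so each inserted copy of Aj adds a full turn 2π to the angle across the geodesic representative of Cj, and nj copies add 2πnj.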
KENTARO ITO
For simplicity, we first explain the case where λ ∈ ℳℒZ(S) is a simple closed curve with weight 1. Take a Fuchsian projective structure (t, ϕ) and an element C ∈ 𝒮. We denote the holonomy image ρt,ϕ(π1(S)) by G. We may assume that the image ft,ϕ(Ht) of the developing map coincides with the upper half-plane H and that the imaginary axis iR+ projects onto C via the covering map H → St = H/G. Choose α ∈ (0, π/2) so that Bα = {z ∈ H : π/2 − α < arg z < π/2 + α} projects onto a collar about C in St. Take a C1 homeomorphism
\[ v : \left[ \frac{\pi}{2} - \alpha,\ \frac{\pi}{2} + \alpha \right] \longrightarrow \left[ \frac{\pi}{2} - \alpha,\ \frac{5\pi}{2} + \alpha \right] \]
satisfying v(π/2 − α) = π/2 − α, v(π/2 + α) = 5π/2 + α, and v′(π/2 − α) = v′(π/2 + α) = 1.

We define a local homeomorphism W : H → Ĉ as follows. Let g ∈ G be a generator for the stabilizer of iR+ in G, and set
\[ D_\alpha = \bigcup_{h \in G/\langle g \rangle} h(B_\alpha). \]
Let W(z) = z for z ∈ H − Dα, and let W(z) = re^{iv(θ)} for z = re^{iθ} ∈ Bα. For z ∈ h(Bα) with some h ∈ G, let W(z) = h ◦ W ◦ h⁻¹(z). One can easily verify that W ◦ h = h ◦ W for all h ∈ G and, hence, that µ = Wz¯/Wz ∈ Belt(H, G)1.

Now let µ̂ ∈ Belt(Ht, Γt)1 be the pullback f*t,ϕ(µ) of µ via the developing map ft,ϕ, which is defined by
\[ f_{t,\varphi}^{*}(\mu) = (\mu \circ f_{t,\varphi})\, \frac{\overline{f_{t,\varphi}'}}{f_{t,\varphi}'} . \]
Let ν ∈ Belt(H, Γ)1 be a representative of t ∈ T(S). Let t′ ∈ T(S) be the equivalence class of the Beltrami coefficient ν′ of the quasi-conformal map wµ̂ ◦ wν : Ĉ → Ĉ. Since W ◦ ft,ϕ ◦ (wµ̂)⁻¹ is locally conformal on Ht′ = wµ̂(Ht), we can take its Schwarzian derivative ϕ′ ∈ B2(Ht′, Γt′). Now we obtain a new projective structure (t′, ϕ′) with surjective developing map ft′,ϕ′, which makes the following diagram commute:
\[
\begin{array}{ccccc}
H & \xrightarrow{\ w_{\nu}\ } & H_{t} & \xrightarrow{\ f_{t,\varphi}\ } & \hat{\mathbb{C}} \\
\big\downarrow{\scriptstyle \mathrm{id}} & & \big\downarrow{\scriptstyle w_{\hat\mu}} & & \big\downarrow{\scriptstyle W} \\
H & \xrightarrow{\ w_{\nu'}\ } & H_{t'} & \xrightarrow{\ f_{t',\varphi'}\ } & \hat{\mathbb{C}}.
\end{array}
\]
Moreover, one can see that ρt′,ϕ′ = ρt,ϕ from the fact that W ◦ h = h ◦ W for all h ∈ G. It is shown in [11, Lemma 3.1] that the element (t′, ϕ′) ∈ P(S) does not depend on the choice of α, v, and ν. The projective structure (t′, ϕ′) is said to be obtained from (t, ϕ) by grafting along C and is denoted by GrC(t, ϕ).

An important remark is that, for (t′, ϕ′) = GrC(t, ϕ), the subset ft′,ϕ′⁻¹(Λ(G))/Γt′ in St′ consists of two simple closed curves, each of which is homotopic to C (see [12]). This can be seen as follows. First, note that the limit set Λ(G) of G = ρt′,ϕ′(π1(S)) coincides with R ∪ {∞}. Since ft,ϕ⁻¹(Bα) projects onto a collar about C in St via the covering map Ht → St, the set wµ̂(ft,ϕ⁻¹(Bα)) also projects onto a collar about C in St′ via the covering map Ht′ → St′. Since the developing map ft′,ϕ′ maps wµ̂(ft,ϕ⁻¹(Bα)) onto the multi-sheeted domain {z ∈ C : π/2 − α < arg z < 5π/2 + α}, the intersection wµ̂(ft,ϕ⁻¹(Bα)) ∩ ft′,ϕ′⁻¹(Λ(G)) consists of two connected components, each of which projects onto a simple closed curve homotopic to C.

The grafting operation can be naturally extended to ℳℒZ(S). For example, if λ = nC ∈ ℳℒZ(S), we only have to replace v by a map v : [π/2 − α, π/2 + α] → [π/2 − α, 2nπ + π/2 + α]. Let Grλ(t, ϕ) denote the projective structure obtained from (t, ϕ) by grafting along λ ∈ ℳℒZ(S). As we have observed, the grafting operation does not change holonomy representations. Conversely, Goldman [12] showed that all projective structures with Fuchsian holonomy are obtained by grafting.
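Theorem 2.1 (Goldman). For any Fuchsian projective structure (t, ϕ),
\[ \mathrm{hol}^{-1}\big(\mathrm{hol}(t,\varphi)\big) = \big\{ \mathrm{Gr}_\lambda(t,\varphi) \big\}_{\lambda \in \mathcal{ML}_{\mathbb{Z}}(S)} . \]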
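The map v of Section 2.4 is only required to be a C1 homeomorphism with prescribed endpoint values and endpoint derivatives. One admissible choice is sketched below, purely as an illustration: the cubic s, the normalized variable x, and the name v_n are assumptions made here, not notation from the source. The second display counts the rays in the preimage of the limit set R ∪ {∞} inside the collar.

```latex
% Sketch: an explicit angle map for grafting along nC (illustrative choice).
% Assumptions: s is the cubic "smoothstep"; 0 < \alpha < \pi/2; n \ge 1.
\begin{align*}
  s(x) &= 3x^2 - 2x^3, \qquad
  x = \frac{\theta - (\pi/2 - \alpha)}{2\alpha} \in [0,1], \\
  v_n(\theta) &= \theta + 2n\pi\, s(x), \\
  v_n(\pi/2 - \alpha) &= \pi/2 - \alpha, \qquad
  v_n(\pi/2 + \alpha) = 2n\pi + \pi/2 + \alpha, \\
  v_n'(\theta) &= 1 + \frac{n\pi}{\alpha}\, 6x(1-x) \;\ge\; 1, \qquad
  v_n'(\pi/2 \pm \alpha) = 1 .
\end{align*}
% W(re^{i\theta}) = re^{i v_n(\theta)} is real exactly when v_n(\theta) \in \pi\mathbb{Z}.
% Because 0 < \alpha < \pi/2, the admissible multiples of \pi are:
\begin{align*}
  \pi/2 - \alpha \;<\; k\pi \;<\; 2n\pi + \pi/2 + \alpha
  \quad\Longleftrightarrow\quad k \in \{1, 2, \dots, 2n\}.
\end{align*}
```

Since v_n is strictly increasing, each of these 2n values of k is attained on exactly one ray in Bα. For n = 1 this gives the two curves homotopic to C described in Section 2.4.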
2.5. Quasi-conformal deformations of projective structures. Let AH(S) denote the subset of V(S) consisting of discrete faithful representations. The quasi-Fuchsian space QF(S) is the subset of AH(S) consisting of faithful representations whose images are quasi-Fuchsian groups. It is known that AH(S) is a closed subset in V(S) (see [15, Theorem 1]) and that the interior int AH(S) of AH(S) coincides with QF(S) (see [31, Theorem A]). (It is conjectured that the closure of QF(S) equals AH(S); this is the so-called Bers-Thurston conjecture.) We denote the subset hol⁻¹(AH(S)) of P(S) by K(S) and hol⁻¹(QF(S)) by Q(S). Then, since the holonomy map hol : P(S) → V(S) is a local homeomorphism, one obtains int K(S) = Q(S).

We now introduce the notion of a quasi-conformal deformation of a projective structure with quasi-Fuchsian holonomy, which was developed by Shiga and Tanigawa [30]. Fix an element (t, ϕ) ∈ Q(S), and denote its holonomy image ρt,ϕ(π1(S)) by G. For each µ ∈ Belt(G)1, we take the pullback µ̂ = f*t,ϕ(µ) ∈ Belt(Ht, Γt)1 of µ, which determines a new point t′ ∈ T(S) in the same manner as described in Section 2.4. Since wµ ◦ ft,ϕ ◦ (wµ̂)⁻¹ is locally conformal on Ht′, we can take its Schwarzian derivative ϕ′ ∈ B2(Ht′, Γt′). Now we obtain a new projective structure (t′, ϕ′), a quasi-conformal deformation of (t, ϕ), satisfying
\[ f_{t',\varphi'} = w_\mu \circ f_{t,\varphi} \circ \big( w_{\hat\mu}|_{H_t} \big)^{-1} : H_{t'} \longrightarrow \hat{\mathbb{C}}, \qquad \rho_{t',\varphi'} = \Theta_\mu \circ \rho_{t,\varphi} : \pi_1(S) \longrightarrow \mathrm{PSL}_2(\mathbb{C}), \]
where Θµ is the group isomorphism of G into PSL2(C) induced by wµ. We define a map Ψ̃t,ϕ : Belt(G)1 → P(S) by Ψ̃t,ϕ(µ) = (t′, ϕ′). Two elements µ, ν ∈ Belt(G)1 are said to be equivalent if Θµ is PSL2(C)-conjugate to Θν. The quotient space of Belt(G)1 by this equivalence relation can be naturally identified with the quasi-Fuchsian space QF(S).
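The local conformality used in the construction of Section 2.5 follows from the chain rule and the standard composition formula for Beltrami coefficients. A short verification is sketched below; the abbreviations f, h, and u are introduced only for this sketch, and ζ denotes the variable of w_µ.

```latex
% Sketch: why w_\mu \circ f_{t,\varphi} \circ (w_{\hat\mu})^{-1} is locally conformal.
% Abbreviations (assumptions of this sketch): f = f_{t,\varphi} (holomorphic,
% locally injective), \hat\mu = f^*(\mu), h = w_\mu \circ f, u = w_{\hat\mu}.
\begin{align*}
  h_{\bar z} &= \big( (w_\mu)_{\bar\zeta} \circ f \big)\, \overline{f'}, \qquad
  h_{z} = \big( (w_\mu)_{\zeta} \circ f \big)\, f', \\
  \mu_h = \frac{h_{\bar z}}{h_z}
    &= (\mu \circ f)\, \frac{\overline{f'}}{f'} = f^{*}(\mu) = \hat\mu = \mu_u .
\end{align*}
% Composition formula for Beltrami coefficients, applied to h \circ u^{-1}:
\begin{align*}
  \mu_{h \circ u^{-1}} \circ u
  = \frac{u_z}{\overline{u_z}} \cdot
    \frac{\mu_h - \mu_u}{1 - \overline{\mu_u}\, \mu_h}
  = 0 .
\end{align*}
```

The same computation with W in place of w_µ gives the corresponding statement in Section 2.4, since µ = Wz¯/Wz there.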
Lemma 2.2 (cf. [30]). The map Ψ̃t,ϕ descends to a map Ψt,ϕ : QF(S) → P(S).

Proof. For any two equivalent elements µ, ν ∈ Belt(G)1, we show that Ψ̃t,ϕ(µ) = Ψ̃t,ϕ(ν). The same argument as in [10] shows that there is a path cτ (τ ∈ [0, 1]) in Belt(G)1 joining µ and ν and contained in the equivalence class [µ] of µ. Note that hol ◦ Ψ̃t,ϕ(cτ) is constant on τ ∈ [0, 1]. Since the map hol is a local homeomorphism, it follows that Ψ̃t,ϕ(cτ) is constant on τ ∈ [0, 1] and that Ψ̃t,ϕ(µ) = Ψ̃t,ϕ(ν).

Using the map in Lemma 2.2, we can show the following.

Proposition 2.3. For any connected component 𝒬 of Q(S), hol|𝒬 : 𝒬 → QF(S) is a biholomorphic map. Moreover, Ψt,ϕ = (hol|𝒬)⁻¹ holds for any (t, ϕ) ∈ 𝒬. Therefore, the map Ψt,ϕ does not depend on the choice of (t, ϕ) ∈ 𝒬.

Proof. To show that hol|𝒬 is biholomorphic, it suffices to show that hol|𝒬 is bijective, since the map hol is a local biholomorphism. Fix an element (t, ϕ) ∈ 𝒬. Note that Ψt,ϕ(QF(S)) ⊂ 𝒬, since QF(S) is connected and Ψt,ϕ is continuous. It can be easily seen from the definition that (hol|𝒬) ◦ Ψt,ϕ is the identity map of QF(S). Therefore we only have to show that Ψt,ϕ ◦ (hol|𝒬) is the identity map of 𝒬. To this end, we show that the subset 𝒬′ = {(s, ψ) ∈ 𝒬 : (s, ψ) = Ψt,ϕ ◦ hol(s, ψ)} of 𝒬 is nonempty, open, and closed. Since Ψt,ϕ ◦ (hol|𝒬) is continuous, 𝒬′ is closed. Moreover, Ψt,ϕ([ρt,ϕ]) = (t, ϕ) implies (t, ϕ) ∈ 𝒬′, and hence 𝒬′ ≠ ∅. Take (s, ψ) ∈ 𝒬′ and a neighborhood U of (s, ψ) in 𝒬 such that hol|U is injective. Let V be a neighborhood of (s, ψ) contained in U and satisfying Ψt,ϕ ◦ hol(V) ⊂ U. Note that, for any (s′, ψ′) ∈ V, hol(Ψt,ϕ ◦ hol(s′, ψ′)) = (hol ◦ Ψt,ϕ) ◦ hol(s′, ψ′) = hol(s′, ψ′). Since both Ψt,ϕ ◦ hol(s′, ψ′) and (s′, ψ′) are contained in U and hol|U is injective, we have Ψt,ϕ ◦ hol(s′, ψ′) = (s′, ψ′) for all (s′, ψ′) ∈ V. Therefore 𝒬′ is open.

2.6. Components of Q(S). Take a Fuchsian projective structure (t, ϕ).
By Proposition 2.3, each connected component of Q(S) contains a unique projective structure whose holonomy representation coincides with [ρt,ϕ]. But, by Theorem 2.1, these projective structures are exactly the elements of {Grλ(t, ϕ)}λ∈ℳℒZ(S). Therefore we obtain the decomposition of Q(S) into its connected components:
\[ Q(S) = \bigsqcup_{\lambda \in \mathcal{ML}_{\mathbb{Z}}(S)} \mathcal{Q}_\lambda , \]
where 𝒬λ is the component containing Grλ(t, ϕ). Note that this suffix does not depend on the choice of Fuchsian projective structure (t, ϕ). Recall that an element of Q(S) is called standard if its developing map is injective; otherwise, it is called exotic. Since any element of 𝒬λ is obtained from Grλ(t, ϕ) by a quasi-conformal deformation, one can easily see that any element of 𝒬0 is a standard projective structure and that any element of 𝒬λ (λ ≠ 0) is an exotic projective structure. Note that, since Grλ(t, ϕ) (λ ≠ 0) has a surjective developing map, any exotic projective structure has a surjective developing map. Moreover, in the following lemma, we characterize the component of Q(S) in which an element (t, ϕ) of Q(S) is contained.

Lemma 2.4. Take an element (t, ϕ) ∈ Q(S), and denote ρt,ϕ(π1(S)) by G. Then (t, ϕ) is contained in the component 𝒬λ corresponding to λ = Σj nj Cj ∈ ℳℒZ(S) if and only if the subset ft,ϕ⁻¹(Λ(G))/Γt of St is a disjoint union of simple closed curves, consisting of 2nj curves homotopic to Cj for each j.

3. Exotic projective structures and limits of representations. In this section, we investigate the relationship between sequences of exotic projective structures and algebraic and geometric limits of their holonomy representations. We begin with the definition of geometric convergence of Kleinian groups.

Definition 3.1. Let X be a locally compact Hausdorff space. We denote by 𝒞(X) the set of all closed subsets of X. A sequence {An} of closed subsets of X converges to a closed subset A ⊂ X in the Hausdorff topology on 𝒞(X) if every element x ∈ A is the limit of a sequence {xn ∈ An} and if every accumulation point of every sequence {xn ∈ An} lies in A. A sequence of Kleinian groups {Gn} is said to converge geometrically to a group Ĝ if {Gn} converges to Ĝ in the Hausdorff topology on 𝒞(PSL2(C)).

We recall some basic facts on the convergence of representations. Let {ρn} be a sequence of discrete faithful representations of π1(S) into PSL2(C) which converges algebraically to ρ∞.
Then ρ∞ is also a discrete faithful representation (see [15, Theorem 1]). Moreover, there is a subsequence of {Gn = ρn(π1(S))} converging geometrically to a Kleinian group Ĝ which contains G∞ = ρ∞(π1(S)) (see [16, Proposition 3.8]). The following theorem is due to Kerckhoff and Thurston [18, Corollary 2.2].

Theorem 3.2 (Kerckhoff-Thurston). Let {ρn : π1(S) → PSL2(C)} be an algebraically convergent sequence of faithful representations whose images {Gn = ρn(π1(S))} are quasi-Fuchsian groups. Assume that {Gn} converges geometrically to Ĝ. Then {Λ(Gn)} converges to Λ(Ĝ) in the Hausdorff topology on 𝒞(Ĉ).

Lemma 3.3 plays an important role in this paper, especially in the proof of Theorem A. Since the situation under which we consider Lemma 3.3 is somewhat complicated, we first describe it. Let {(tn, ϕn)} be a sequence in Q(S) converging to an element (t, ϕ) in K(S). Take a sequence of projective pairs {(ftn,ϕn, ρtn,ϕn)} and a projective pair (ft,ϕ, ρt,ϕ) such that {ρtn,ϕn} converges algebraically to ρt,ϕ. Put Gn = ρtn,ϕn(π1(S)) and G∞ = ρt,ϕ(π1(S)). Moreover, we assume that {Gn} converges geometrically to a Kleinian group Ĝ. Since {tn} converges to t in T(S), one can take a smooth quasi-conformal map ωn : St → Stn such that ωn ◦ gt is homotopic to gtn, where gt : S → St and gtn : S → Stn are markings for t and tn, respectively, and the maximal dilatation
\[ \frac{1 + \| (\omega_n)_{\bar z} / (\omega_n)_z \|_\infty}{1 - \| (\omega_n)_{\bar z} / (\omega_n)_z \|_\infty} \]
of ωn tends to 1 as n → ∞. In this situation, we have the following lemma.

Lemma 3.3. The sequence {ωn⁻¹(ftn,ϕn⁻¹(Λ(Gn))/Γtn)} converges to ft,ϕ⁻¹(Λ(Ĝ))/Γt in the Hausdorff topology on 𝒞(St).
Proof. We first observe that {ftn,ϕn} converges to ft,ϕ locally uniformly in Ht. Since {(tn, ϕn)} converges to (t, ϕ), {tn} converges to t in T(S) and {ϕn} converges to ϕ locally uniformly in Ht. Therefore, one can take a sequence of projective pairs {(fˇtn,ϕn, ρˇtn,ϕn)} and a projective pair (fˇt,ϕ, ρˇt,ϕ) such that {fˇtn,ϕn} converges to fˇt,ϕ locally uniformly in Ht. Choose an element An ∈ PSL2(C) so that fˇtn,ϕn = An ◦ ftn,ϕn holds. Then, since both {ρtn,ϕn} and {ρˇtn,ϕn = An ◦ ρtn,ϕn ◦ An⁻¹} are algebraically convergent sequences, {An} converges to some element A ∈ PSL2(C). Therefore, {ftn,ϕn} also converges to ft,ϕ locally uniformly in Ht.

Take a sequence of lifts ω̃n : Ht → Htn of ωn that converges to the identity locally uniformly in Ht. Then {ftn,ϕn ◦ ω̃n} also converges to ft,ϕ locally uniformly in Ht. Since {Λ(Gn)} converges to Λ(Ĝ) in the Hausdorff topology on 𝒞(Ĉ) (see Theorem 3.2), one can easily check that {ω̃n⁻¹(ftn,ϕn⁻¹(Λ(Gn)))} converges to ft,ϕ⁻¹(Λ(Ĝ)) in the Hausdorff topology on 𝒞(Ht). This implies that the sequence {ωn⁻¹(ftn,ϕn⁻¹(Λ(Gn))/Γtn)} converges to ft,ϕ⁻¹(Λ(Ĝ))/Γt in the Hausdorff topology on 𝒞(St).

Remark. Lemma 3.3 implies that the shape of ft,ϕ⁻¹(Λ(Ĝ))/Γt restricts the shape of ftn,ϕn⁻¹(Λ(Gn))/Γtn and hence, by Lemma 2.4, the component of Q(S) in which (tn, ϕn) is contained. This is the fundamental idea of the proof of Theorem A.
A Kleinian group G is called geometrically finite if it has a finite-sided fundamental domain in H3. A Kleinian group G is said to be a b-group if it has only one simply connected invariant component of Ω(G), which is denoted by Ω0(G). A geometrically finite b-group is said to be regular. A degenerate group is a b-group with Ω0(G) = Ω(G). For a b-group G, take a Riemann mapping f : Ω0(G) → H, which induces a group isomorphism χf : G → fGf⁻¹. An accidental parabolic element g in G is a parabolic element such that χf(g) is a loxodromic element in fGf⁻¹.

Let U(S) denote the subset of P(S) consisting of all projective structures whose developing maps are injective. Then U(S) is closed in P(S), contains 𝒬0, and is contained in K(S). Since int AH(S) = QF(S), one can easily see that int U(S) = 𝒬0 and that ∂𝒬0 ⊂ ∂U(S). (For a subset A of a topological space X, we denote its relative boundary Ā − A by ∂A, where Ā is the closure of A.) Note that, for an element (t, ϕ) in ∂U(S), its holonomy image G = ρt,ϕ(π1(S)) is a b-group with invariant component ft,ϕ(Ht) = Ω0(G) (see [19]).

Using Lemma 3.3, we can characterize a sequence of exotic projective structures converging to an element of ∂U(S) by algebraic and geometric limits of their holonomy representations.

Proposition 3.4. In the same situation as in Lemma 3.3, with the additional assumption that (t, ϕ) ∈ ∂U(S), the following are equivalent:
(1) (tn, ϕn) are exotic projective structures for large enough n;
(2) Ω0(G∞) ∩ Λ(Ĝ) ≠ ∅.

Remark. The “(2) ⇒ (1)” part of Proposition 3.4 is due to McMullen [25]. In fact, he constructs a sequence of representations satisfying (2) and applies the “(2) ⇒ (1)” part to show Theorem 1.1. Later, we explain his arguments more precisely.

Proof of Proposition 3.4. Recall that any exotic projective structure has a surjective developing map. Hence, a projective structure (tn, ϕn) ∈ Q(S) is exotic if and only if its developing map ftn,ϕn is surjective. Therefore the condition (1) is equivalent to the condition that ftn,ϕn⁻¹(Λ(Gn))/Γtn ≠ ∅ for large enough n. Using Lemma 3.3, it turns out to be equivalent to the condition ft,ϕ⁻¹(Λ(Ĝ))/Γt ≠ ∅, which is equivalent to the condition ft,ϕ(Ht) ∩ Λ(Ĝ) ≠ ∅. Since ft,ϕ(Ht) = Ω0(G∞) holds for (t, ϕ) ∈ ∂U(S), we complete the proof.

As a consequence of the “(1) ⇒ (2)” part of Proposition 3.4, we have the following assertion due to Matsuzaki (by way of oral communication; see also [23, Section 7.4]).

Corollary 3.5 (Matsuzaki). If a projective structure (t, ϕ) in ∂U(S) is an accumulation point of exotic projective structures, its holonomy image ρt,ϕ(π1(S)) contains accidental parabolics.

Proof. Assume that there is a sequence of exotic projective structures {(tn, ϕn)} converging to a projective structure (t, ϕ) ∈ ∂U(S) whose holonomy image contains no accidental parabolics. Then the following theorem, due to Thurston (see [27, Corollary 6.1]), implies that the geometric limit Ĝ of {Gn = ρtn,ϕn(π1(S))} coincides with the algebraic limit G∞ = ρt,ϕ(π1(S)). Since Ω0(G∞) ∩ Λ(G∞) = ∅, this contradicts Proposition 3.4.

Theorem 3.6 (Thurston). Let {ρn} be a sequence of discrete faithful representations of π1(S) into PSL2(C) converging algebraically to ρ∞. Assume that Gn = ρn(π1(S)) and G∞ = ρ∞(π1(S)) contain no accidental parabolics. Then {Gn} converges geometrically to G∞.

We remark that a b-group with no accidental parabolics is a degenerate group (see [5]). Therefore, Corollary 3.5 implies that an element of ∂U(S) whose holonomy image is a degenerate group without accidental parabolics cannot be an accumulation point of exotic projective structures. Moreover, we obtain the following corollaries.
Corollary 3.7. The subset of ∂𝒬0 of projective structures that cannot be accumulation points of exotic projective structures is dense in ∂𝒬0.

Proof. An argument similar to that in [5, p. 598] and [24, p. 221] shows that the subset of ∂𝒬0 of projective structures whose holonomy images contain no accidental parabolics is dense in ∂𝒬0. Then the assertion follows immediately from Corollary 3.5.

Corollary 3.8. For any t0 ∈ T(S), there exists a projective structure (t0, ϕ) in ∂𝒬0 whose holonomy image ρt0,ϕ(π1(S)) is a regular b-group, such that there is no sequence of exotic projective structures converging to (t0, ϕ).

Proof. Fix t0 ∈ T(S), and consider the subspace T = 𝒬0 ∩ B2(Ht0, Γt0) of B2(Ht0, Γt0). The space T coincides with the image of the so-called Bers embedding of the Teichmüller space into B2(Ht0, Γt0) (see [29]). Note that the boundary ∂T of T in B2(Ht0, Γt0) is contained in ∂𝒬0 ∩ B2(Ht0, Γt0). It is known by McMullen [24] that the subset ∂′T ⊂ ∂T of projective structures whose holonomy images are regular b-groups is dense in ∂T. Let (t0, ψ) be a point in ∂T such that ρt0,ψ(π1(S)) is a degenerate group, and take a sequence {(t0, ϕn)} in ∂′T converging to (t0, ψ). Now suppose that the assertion were false. Then, by a diagonal argument, one can take a sequence of exotic projective structures converging to (t0, ψ). This contradicts Corollary 3.5.

4. The proof of Theorem A. We first recall some basic facts about the quasi-Fuchsian space. Take a faithful representation ρ : π1(S) → PSL2(C) whose image G = ρ(π1(S)) is a quasi-Fuchsian group. Then the Kleinian manifold NG = (H3 ∪ Ω(G))/G is homeomorphic to S × [0, 1], and ∂NG = Ω(G)/G consists of two Riemann surfaces X1 and X2. We assume that ∂NG is equipped with the orientation induced from that of Ĉ. Then ∂NG = X1 ∪ X2, combined with markings induced from ρ, determines a point in T(S) × T(S̄), where S̄ denotes S with its orientation reversed. Moreover, this assignment induces a holomorphic bijection (see [4]):
\[ \mathrm{qf} : T(S) \times T(\bar S) \longrightarrow QF(S). \]
A subset Bt = qf({t} × T(S̄)) of QF(S) for some t ∈ T(S) is called a vertical Bers slice. On the other hand, B^t = qf(T(S) × {t}) for some t ∈ T(S̄) is called a horizontal Bers slice. Note that the boundary of any vertical Bers slice is contained in hol(∂𝒬0), while the boundary of any horizontal Bers slice is not. The mapping class group Mod(S) is the group consisting of isotopy classes of orientation-preserving homeomorphisms of S. Recall that Mod(S) acts naturally on T(S) and on T(S̄).

We devote the rest of this section to the proof of the following theorem.

Theorem A. For any λ ∈ ℳℒZ(S), we have
\[ \overline{\mathcal{Q}_0} \cap \overline{\mathcal{Q}_\lambda} \neq \emptyset . \]
In particular, the closure of Q(S) in P(S) is connected.
The second statement follows easily from the first. Hence, we concentrate our attention on the first statement.

4.1. Proof for special case. Here, we prove Theorem A for the case where λ is a simple closed curve C ∈ 𝒮 of weight 1. Our aim is to show that there exists a sequence in 𝒬C which converges to an element of ∂𝒬0. This proceeds as follows.

(1) We review the proof of Theorem 1.1. Let (u, v) be any pair of Riemann surfaces in T(S) × T(S̄), and let τ ∈ Mod(S) be the Dehn twist around C. We will see that the sequence {[ρn] = qf(τ^n u, τ^{2n} v)} converges algebraically to a point [ρ∞] on the boundary of some vertical Bers slice Bt. Let (t, ϕ) be a point of ∂𝒬0 such that hol(t, ϕ) = [ρ∞]. Since the holonomy map hol : P(S) → V(S) is a local homeomorphism, one can take a sequence {(tn, ϕn)} in Q(S) converging to (t, ϕ) and satisfying hol(tn, ϕn) = [ρn]. For large enough n, (tn, ϕn) turns out to be exotic, and the proof of Theorem 1.1 is completed. In the following steps, we show that (tn, ϕn) is contained in 𝒬C for large enough n.

(2) Let {(ftn,ϕn, ρtn,ϕn)} and (ft,ϕ, ρt,ϕ) be projective pairs corresponding to {(tn, ϕn)} and (t, ϕ), respectively, such that {ρtn,ϕn} converges algebraically to ρt,ϕ. Let {ωn : St → Stn} be a sequence of quasi-conformal maps as in Lemma 3.3. Let Ĝ be the geometric limit of the sequence {Gn = ρtn,ϕn(π1(S))}. In Lemma 4.1, we see that the subset ft,ϕ⁻¹(Λ(Ĝ))/Γt in St consists of two components and is contained in an annulus whose core is homotopic to C. Recall that {ωn⁻¹(ftn,ϕn⁻¹(Λ(Gn))/Γtn)} converges to ft,ϕ⁻¹(Λ(Ĝ))/Γt in the Hausdorff topology on 𝒞(St) by Lemma 3.3. Therefore, one may expect that ftn,ϕn⁻¹(Λ(Gn))/Γtn consists of two simple closed curves, each of which is homotopic to C, for large enough n. If this is true, (tn, ϕn) is contained in 𝒬C for large enough n by Lemma 2.4, and the proof is completed. We justify the above expectation in the following steps.

(3) It is easy to see that all components of ftn,ϕn⁻¹(Λ(Gn))/Γtn are simple closed curves homotopic to C for large enough n (see Lemma 4.2). Hence, all that we have to show is that no component of ftn,ϕn⁻¹(Λ(Gn))/Γtn merges with another component as n tends to ∞.

(4) In Lemma 4.3, we see that any two components of ftn,ϕn⁻¹(Λ(Gn))/Γtn are separated by some annulus whose core is homotopic to C and whose modulus is larger than m0, where m0 does not depend on n.

(5) We will see that the hyperbolic distance between the two boundary components of an annulus in a Riemann surface can be estimated from below by using the modulus of the annulus (see Lemma 4.4). Hence, the hyperbolic distance between any two components of ωn⁻¹(ftn,ϕn⁻¹(Λ(Gn))/Γtn) is bounded below by some positive constant L > 0 that does not depend on n. Therefore, one can see that ftn,ϕn⁻¹(Λ(Gn))/Γtn consists of two simple closed curves, each of which is homotopic to C, for large enough n, and one can complete the proof of Theorem A for the case that λ is a simple closed curve.

We now fill in the details.
Step 1. Theorem 1.1 is due to McMullen [25, Appendix A]. In our sketch of this proof, we make use of a variation of Thurston’s hyperbolic Dehn surgery theorem, which is due to Comar [8] (see also [2, Theorem 2.2] and [6]). Related arguments can be found in [7] and [18]. The “adding twist” technique, discovered by Anderson and Canary [2], also plays an important role in this proof.

Sketch of proof of Theorem 1.1. Let Ĝ be a geometrically finite Kleinian group whose Kleinian manifold NĜ = (H3 ∪ Ω(Ĝ))/Ĝ is homeomorphic to S × [0, 1] − C × {1/2}. The existence of such a Kleinian group Ĝ is guaranteed by Thurston’s geometrization theorem (see [26]). Here, the tubular neighborhood of C × {1/2} corresponds to the rank-two cusp end of NĜ. We fix a basis γ̂, δ̂ for the fundamental group of the rank-two cusp so that γ̂ is homotopic to C × {0} and δ̂ is trivial in S × [0, 1]. By performing (n, 1) Dehn filling on the cusp (n ∈ N), we obtain a sequence of representations {βn : Ĝ → PSL2(C)} which satisfies the following conditions (see [8]):
• Gn = βn(Ĝ) is a quasi-Fuchsian group;
• the kernel of βn is normally generated by γ̂^n δ̂;
• {Gn} converges geometrically to Ĝ;
• {βn} converges algebraically to the identity representation of Ĝ.

Let f0 be the natural inclusion map S → S × {1/4} ⊂ NĜ, and denote by (f0)∗ the induced group homomorphism of π1(S) into Ĝ. Then we obtain a sequence of faithful representations ρn = βn ◦ (f0)∗ of π1(S) onto quasi-Fuchsian groups Gn. By modifying βn slightly, if necessary, we may assume that ∂NGn is conformally isomorphic to ∂NĜ for all n. Then the above representations are expressed as
\[ [\rho_n] = \mathrm{qf}(u, \tau^{n} v), \]
where (u, v) ∈ T(S) × T(S̄) is the complex structure on ∂NĜ combined with trivial markings, and τ ∈ Mod(S) is the Dehn twist around C.

Now we add a twist to f0. More precisely, we construct an immersion fC : S → NĜ which is homotopic to f0 in S × [0, 1] but not in S × [0, 1] − C × {1/2}, in the following way. Let A be a tubular neighborhood of C in S. Then the map fC|(S − A) is defined by fC(x) = (x, 1/4), and the map fC|A is defined so that fC(A) wraps once around the tubular neighborhood of C × {1/2} (see Figure 1). This immersion fC is called the wrapping map associated to C. Again, we obtain a sequence of faithful representations ρn = βn ◦ (fC)∗ of π1(S) onto quasi-Fuchsian groups Gn, which can be expressed as
\[ [\rho_n] = \mathrm{qf}(\tau^{n} u, \tau^{2n} v). \]

[Figure 1. The wrapping map fC]

The sequence {ρn} converges algebraically to ρ∞ = (fC)∗. We denote ρ∞(π1(S)) by G∞. We can show that G∞ is a regular b-group and, moreover, that [ρ∞] lies on the boundary of some vertical Bers slice Bt (see Lemma 4.1(1)). Therefore, there exists an element (t, ϕ) ∈ ∂𝒬0 with hol(t, ϕ) = [ρ∞]. Since the holonomy map hol : P(S) → V(S) is a local homeomorphism, one can take a sequence {(tn, ϕn)} in Q(S) converging to (t, ϕ) and satisfying hol(tn, ϕn) = [ρn]. Since fC : S → NĜ is not homotopic to a map into ∂NĜ, G∞ does not represent the fundamental group of either component of ∂NĜ, and, hence, Ω0(G∞) ∩ Λ(Ĝ) ≠ ∅ (see also Lemma 4.1(2)). Therefore, Proposition 3.4 implies that (tn, ϕn) are exotic for large enough n. This completes the proof of Theorem 1.1.

Step 2. Our aim is to show that the sequence {(tn, ϕn)} constructed above is contained in 𝒬C for large enough n. To this end, we examine the shape of ft,ϕ⁻¹(Λ(Ĝ))/Γt ⊂ St in the following lemma.

Lemma 4.1. (1) G∞ is a regular b-group. Moreover, [ρ∞] lies on the boundary of some vertical Bers slice Bt.
(2) The subset ft,ϕ⁻¹(Λ(Ĝ))/Γt in St consists of two connected components and is contained in an annulus whose core is homotopic to C (see Figure 2).
[Figure 2. The subset ft,ϕ⁻¹(Λ(Ĝ))/Γt in St]
Proof. During this proof, the reader is advised to refer to Figure 3.

(1) Since G∞ is a finitely generated subgroup of the geometrically finite Kleinian group Ĝ with Ω(Ĝ) ≠ ∅, it is also geometrically finite (see, e.g., [23, Theorem 3.11]). Moreover, since G∞ has a parabolic element corresponding to C, it is not a quasi-Fuchsian group. Therefore, to show that G∞ is a regular b-group, we have only to show that Ω(G∞) has a simply connected invariant component.

Let A be a closed annular neighborhood of C in S. By deforming the wrapping map fC : S → NĜ in its homotopy class, we may assume that fC maps S − A onto (S − C) × {0} ⊂ ∂NĜ and that fC(int A) ⊂ int NĜ. We take a lift f̃C : S̃ → H3 ∪ Ω(Ĝ) of fC satisfying
\[ \tilde f_C \circ g = \rho_\infty(g) \circ \tilde f_C \quad \text{for all } g \in \pi_1(S), \]
where π1(S) is regarded as the covering transformation group of the universal covering map p : S̃ → S. Since the map fC is π1-injective, we may assume that the map f̃C is an embedding, and, hence, the image f̃C(S̃) of S̃ is simply connected.

Fix a component Ã0 of Ã = p⁻¹(A). Let g0 ∈ π1(S) be a generator for the stabilizer of Ã0 in π1(S). Put γ = ρ∞(g0), and take an element δ ∈ Ĝ such that ⟨γ, δ⟩ is a rank-two parabolic subgroup of Ĝ which is conjugate to ⟨γ̂, δ̂⟩ in Ĝ. Since γ stabilizes f̃C(Ã0), each of the two components C̃1, C̃2 of f̃C(∂Ã0) forms a simple closed curve together with the common fixed point of ⟨γ, δ⟩. Note that both C̃1 and C̃2 project onto C ⊂ S × {0} ⊂ ∂NĜ via the covering map Ω(Ĝ) → ∂NĜ. Since fC(A) wraps once around a tubular neighborhood of C × {1/2} in NĜ, we may assume that C̃2 = δ(C̃1) holds. Let R be the crescent-like domain in Ĉ lying between C̃1 and C̃2, and let D be the domain that is cut out of H3 by f̃C(Ã0) and is facing R. Note that R ∩ f̃C(S̃ − Ã) = ∅ and that D is precisely invariant under the subgroup ⟨γ⟩ of G∞; that is, γ^l(D) = D for any l ∈ Z and D ∩ g(D) = ∅ for any g ∈ G∞ − ⟨γ⟩.

[Figure 3. Λ(Ĝ) and Ω0(G∞) (the shaded part), with the crescent R and the curves C̃1, C̃2, C̃3]
Therefore f̃C is homotopic to an embedding F : S̃ → Ĉ satisfying
\[ F \circ g = \rho_\infty(g) \circ F \quad \text{for all } g \in \pi_1(S), \]
and the homotopy is constant on S̃ − Ã. Since G∞ acts on F(S̃) properly discontinuously, there is a component E of Ω(G∞) containing F(S̃). One can easily see that E coincides with F(S̃) because F : S̃ → Ĉ descends to a homeomorphism S → F(S̃)/G∞ ⊂ E/G∞. Since F(S̃) is simply connected, G∞ is a regular b-group with F(S̃) = Ω0(G∞).

A result of Abikoff [1] implies that any representation whose image is a regular b-group lies on the boundary of some (vertical or horizontal) Bers slice. Since F : S̃ → Ĉ descends to an orientation-preserving homeomorphism S → Ω0(G∞)/G∞ ⊂ NG∞ which induces the representation [ρ∞], one can see that [ρ∞] lies on the boundary of a vertical Bers slice Bt for some t ∈ T(S).

(2) We first remark that the limit set Λ(Ĝ) is connected since each component of Ω(Ĝ) is simply connected; the latter can be seen from the fact that each component of ∂NĜ is incompressible in NĜ. From the above argument, we have
\[ F(\tilde A) = \bigcup_{g \in G_\infty/\langle\gamma\rangle} g(R). \]
Since
\[ \Omega_0(G_\infty) - \bigcup_{g \in G_\infty/\langle\gamma\rangle} g(R) = F(\tilde S) - F(\tilde A) \subset \Omega(\hat G), \]
we have
\[ \Omega_0(G_\infty) \cap \Lambda(\hat G) = \bigcup_{g \in G_\infty/\langle\gamma\rangle} \big( g(R) \cap \Lambda(\hat G) \big) = \bigcup_{g \in G_\infty/\langle\gamma\rangle} g\big( R \cap \Lambda(\hat G) \big). \]
Therefore, we concentrate our attention on R ∩ Λ(Ĝ). Let G′ be a subgroup of Ĝ representing the fundamental group of S × {0} ⊂ ∂NĜ. By conjugating G′ in Ĝ, if necessary, we may assume that C̃1 ⊂ Ω0(G′) and that C̃2 ⊂ Ω0(δG′δ⁻¹). Note that Ω0(G′) ∩ Ω0(δG′δ⁻¹) = ∅, since δ ∉ G′. One can easily see that there exists a unique lift C̃3 of C ⊂ S × {1} ⊂ ∂NĜ contained in R and terminating in the common fixed point of ⟨γ, δ⟩ at both ends. The curve C̃3 divides R into two crescent-like domains R1 and R2. Since C̃1, C̃2, and C̃3 are contained in distinct components of Ω(Ĝ), Rj ∩ Λ(Ĝ) ≠ ∅ for j = 1, 2. Moreover, since Λ(Ĝ) is connected, Rj ∩ Λ(Ĝ), j = 1, 2, is also connected. Therefore, the subset F⁻¹(Λ(Ĝ))/π1(S) in S consists of two connected components and is contained in the annulus A whose core is homotopic to C. Since F : S̃ → Ĉ can be regarded as the developing map of the projective structure on S corresponding to (t, ϕ) ∈ P(S), we obtain the assertion.

Step 3. Let {λn} be the sequence in ℳℒZ(S) satisfying (tn, ϕn) ∈ 𝒬λn.

Lemma 4.2. λn ∈ {kC : k ∈ N} ⊂ ℳℒZ(S) for large enough n.
Proof. For simplicity, we denote ft,ϕ⁻¹(Λ(Ĝ))/Γt by Λ̂ and ωn⁻¹(ftn,ϕn⁻¹(Λ(Gn))/Γtn) by Λn. Take an open annulus A in St whose core is homotopic to C and which contains Λ̂. Then K = St − A is a compact set. Now suppose that there exists a subsequence {nj} such that λnj ∉ {kC : k ∈ N}. Then, from the characterization of Λnj in Lemma 2.4, one can easily see that Λnj ∩ K ≠ ∅ for all j. Since K is compact, any sequence {znj ∈ Λnj ∩ K} has an accumulation point z∞ ∈ K. On the other hand, the Hausdorff convergence Λn → Λ̂ (see Lemma 3.3) implies that z∞ ∈ Λ̂, which contradicts Λ̂ ∩ K = ∅.
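Step 4 below assigns the modulus α/(2l) to certain sector quotients. With the normalization m = (2π)⁻¹ log c for the round annulus {1 < |z| < c} that is used there, this value can be checked as follows. The symbols θ1, θ2, h, w, and ζ are introduced only for this sketch, and the generator is normalized as g(z) = e^l z with l > 0.

```latex
% Sketch: modulus of a sector quotient.
% Assumptions: S = \{\theta_1 < \arg z < \theta_2\}, h := \theta_2 - \theta_1,
% g(z) = e^{l} z with l > 0, and w = \log z on S.
\begin{align*}
  S / \langle g \rangle \;&\cong\;
    \{\, w : \theta_1 < \operatorname{Im} w < \theta_2 \,\}
    \big/ \langle\, w \mapsto w + l \,\rangle, \\
  \zeta = \exp\!\Big( \frac{-2\pi i\,(w - i\theta_1)}{l} \Big)
    \;&\Longrightarrow\;
    |\zeta| = \exp\!\Big( \frac{2\pi(\operatorname{Im} w - \theta_1)}{l} \Big)
    \in \big( 1,\; e^{2\pi h / l} \big), \\
  m\big( S / \langle g \rangle \big)
    &= \frac{1}{2\pi} \log e^{2\pi h / l} = \frac{h}{l} .
\end{align*}
% Each component of B_\alpha - B_{\alpha/2} is a sector of angular width \alpha/2:
\begin{align*}
  h = \frac{\alpha}{2} \;\Longrightarrow\; m = \frac{\alpha}{2l},
  \qquad
  \frac{\alpha}{2l} + \frac{\alpha}{2l} = \frac{\alpha}{l} .
\end{align*}
```

The map ζ is invariant under w ↦ w + l and injective on the quotient, so it identifies the sector quotient with a round annulus. The sum α/l is the constant m0 obtained in the proof of Lemma 4.3.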
Step 4. For an annulus A with a conformal structure, the modulus m(A) of A is defined by m(A) = (2π )−1 log c when A is conformally equivalent to a round annulus {z ∈ C : 1 < |z| < c}. We will make use of the monotonicity of moduli: If an annulus A contains disjoint essential annuli A1 and A2 , then m(A1 ) + m(A2 ) ≤ m(A) (see [20, Lemma 6.3]). Lemma 4.3. There exists a positive constant m0 > 0, independent of n, satisfying the following: Any curve on Stn joining any two connected components of ft−1 ((Gn ))/ $tn traverses some annulus whose core is homotopic to C and whose n ,ϕn modulus is larger than m0 . Proof. Fix a positive integer k ∈ N. We consider a sequence {(sn , ψn )} = {6kC ([ρn ])} in ᏽkC , where −1 : QF(S) −→ ᏽkC . 6kC = hol | ᏽkC We will show that there exists a positive constant m0 , independent of k and n, satisfying the following: Any curve in Ssn joining any two connected components of fs−1 ((Gn ))/ $sn traverses an annulus whose core is homotopic to C and whose n ,ψn modulus is larger than m0 . Once its existence has been shown, since (tn , ϕn ) = 6kn C ([ρn ]) for λn = kn C, we obtain the assertion in Lemma 4.3. The proof depends on the particular form of [ρn ] = qf(τ n u, τ 2n v). For simplicity, we assume that [ρ0 ] = qf(u, v) is a Fuchsian representation. (The following argument, with a slight modification, works out without this assumption.) Let (p, φ) denote the projective structure in ᏽ0 such that [ρp,φ ] = [ρ0 ]. Then (s0 , ψ0 ) = Gr kC (p, φ). To obtain (s0 , ψ0 ) from (p, φ), we perform the same construction as described in Section 2.4. We use the same notation and normalization as in Seciton 2.4; for example, G0 = ρp,φ (π1 (S)) is a Fuchsian group acting on H, g ∈ G0 is a generator of cyclic subgroup which stabilizes Bα , and so on. In addition, we prepare some notation. Let Bα∗ = {z ∈ H∗ : 3π/2 −α < arg z < 3π/2 +α} be the complex conjugation of Bα , put
α = Bα ∪ Bα∗ , and set D α = h∈G /g h(B α ). B 0 Recall that fs−1 ((G0 ))/ $s0 ⊂ Ss0 consists of 2k simple closed curves, each of 0 ,ψ0 α )/ $s0 contains 2k + 1 which is homotopic to C. Moreover, observe that fs−1 (D 0 ,ψ0 annular domains A1 , . . . , A2k+1 , each of whose core is homotopic to C. There exists
exactly one connected component of fs−1 ((G))/ $s0 lying between Aj and Aj +1 0 ,ψ0 for every j ∈ {1, . . . , 2k}. Therefore, any curve on Ss0 joining any two components of fs−1 ((G0 ))/ $s0 traverses some Aj . Note that the developing map fs0 ,ψ0 induces a 0 ,ψ0 (∗)
(∗)
natural conformal isomorphism ξj : Aj → Bα /g for all j , where Bα is Bα or Bα∗ . α/2 such that the One can take an element µn ∈ Belt(G0 )1 with supp(µn ) ⊂ D quasi-conformal map wµn : C → C induces the quasi-conformal deformation [ρn ] = qf(τ n u, τ 2n v) of [ρ0 ] = qf(u, v). (See [33] for an explicit description of µn .) This element µn ∈ Belt(G0 )1 also induces the quasi-conformal deformation (sn , ψn ) of (s0 , ψ0 ), as described in Section 2.5. Recall that the quasi-conformal map wµˆ n | Hs0 : Hs0 → Hsn with the Beltrami coefficient µˆ n = fs∗0 ,ψ0 (µn ) descends to a quasiconformal map gµˆ n : Ss0 → Ssn . (∗) (∗) The two components of ξj−1 ((Bα − Bα/2 )/g) ⊂ Aj are denoted by Aj and Aj , each of which is an essential subannulus in Aj . An easy calculation reveals that m(Aj ) = m(Aj ) = α/2l for all j , where l is the exponent of g, that is, g(z) = el z. Since gµˆ n is conformal on Aj ∪ Aj , combining with the monotonicity of moduli, we have α m gµˆ n (Aj ) ≥ m gµˆ n Aj + m gµˆ n Aj = m Aj + m Aj = . l Put m0 = α/ l. Then any curve on Ssn = gµˆ n (Ss0 ) joining any two connected com((Gn ))/ $sn = gµˆ n (fs−1 ((G0 ))/ $s0 ) traverses some gµˆ n (Aj ) ponents of fs−1 n ,ψn 0 ,ψ0 whose core is homotopic to C and whose modulus is larger than m0 . The proof of Lemma 4.3 is complete. Step 5. The following lemma implies that the hyperbolic distance between two boundary components of an annulus in a Riemann surface can be estimated below by using the modulus of the annulus. The essential tool in this proof is Grötzsch’s module theorem (see [20]). For t ∈ T (S), we denote the hyperbolic distance of z1 , z2 ∈ St by dt (z1 , z2 ) and the hyperbolic length of the closed geodesic representing C ∈ by lt (C). Lemma 4.4. Let t ∈ T (S) and C ∈ . Let A ⊂ St be an annular domain such that ∂A consists of two simple closed curves C1 , C2 , each of which is homotopic to C. 
Then there is a positive constant $I = I(l_t(C), m(A)) > 0$ that depends only on $l_t(C)$ and $m(A)$ such that
$$d_t(C_1, C_2) > I(l_t(C), m(A)),$$
where $d_t(C_1, C_2) = \inf\{d_t(z_1, z_2) : z_j \in C_j\ (j = 1, 2)\}$.

Proof. Let $\mathbb{H} = \{z \in \mathbb{C} : \operatorname{Im} z > 0\}$ denote the upper half-plane. Take a holomorphic covering map $p_1 : \mathbb{H} \to S_t$ so that the imaginary axis $i\mathbb{R}_+$ projects onto a simple closed curve homotopic to $C$. Let $\tilde{A}$ denote the connected component of $p_1^{-1}(A)$ that connects zero and $\infty$. Then $\tilde{A}$ is stabilized by a cyclic group $\langle g \rangle$, which is generated by $g(z) = z \exp(l_t(C))$. Since $m(\mathbb{H}/\langle g \rangle) = \pi / l_t(C)$, there is a holomorphic covering map $p_2 : \mathbb{H} \to R = \{w \in \mathbb{C} : 1 < |w| < \exp(2\pi^2 / l_t(C))\}$ whose covering transformation group is $\langle g \rangle$. Let $\lambda_R(w)|dw|$ denote the complete hyperbolic metric on $R$, and let $d_R(\cdot, \cdot)$ denote the distance with respect to $\lambda_R(w)|dw|$. Note that $A$ is conformally equivalent to $A' = p_2(\tilde{A})$ and that $d_t(C_1, C_2) = d_R(C_1', C_2')$, where $C_1'$ and $C_2'$ are components of $\partial A'$. One can easily see that there is a positive constant $I_1 = I_1(l_t(C))$ which depends only on $l_t(C)$ such that $\lambda_R(w)|dw| > I_1(l_t(C))|dw|$. Therefore, we obtain
$$d_R(C_1', C_2') > I_1(l_t(C))\, d_e(C_1', C_2'), \tag{4.1}$$
where $d_e(\cdot, \cdot)$ is the distance with respect to the Euclidean metric $|dw|$.

We can assume that $C_1'$ (resp., $C_2'$) is the inner (resp., outer) component of $\partial A'$. Take $w_1 \in C_1'$ and $w_2 \in C_2'$ satisfying $d_e(w_1, w_2) = d_e(C_1', C_2')$. Let $D \subset \mathbb{C}$ be a connected component of $\mathbb{C} - C_2'$ that contains zero. Take a Riemann mapping $h : E = \{z \in \mathbb{C} : |z| < 1\} \to D$ such that $h(0) = 0$ and $h^{-1}(w_1) = r > 0$. Now, Koebe's one-quarter theorem and Koebe's distortion theorem (see, e.g., [28, p. 9]) imply that there is a positive constant $I_2 = I_2(1 - r)$ such that
$$d_e(C_1', C_2') > I_2(1 - r). \tag{4.2}$$
Let $\mu(r)$ denote the modulus of the domain $\{z \in \mathbb{C} : |z| < 1,\ z \notin [0, r]\}$ for $0 < r < 1$. Then Grötzsch's module theorem asserts that $m(\mathcal{A}) \le \mu(r)$ for any annular domain $\mathcal{A} \subset E = \{z \in \mathbb{C} : |z| < 1\}$ which separates zero and $r$ from $\partial E$ (see [20, Chapter II] for more information). Since $h^{-1}(A')$ separates zero and $r$ from $\partial E$, Grötzsch's theorem implies that $m(A') \le \mu(r)$. Since $\mu$ is a monotone decreasing function, there is a positive constant $I_3 = I_3(m(A'))$ such that
$$1 - r > I_3(m(A')). \tag{4.3}$$
From the inequalities (4.1)–(4.3), we obtain
$$d_R(C_1', C_2') > I_1(l_t(C))\, I_2(I_3(m(A'))).$$
Since $d_R(C_1', C_2') = d_t(C_1, C_2)$ and $m(A') = m(A)$, we obtain the assertion.
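The value $m(\mathbb{H}/\langle g \rangle) = \pi / l_t(C)$ used in this proof can be checked directly. The following is a sketch of that standard computation; the abbreviation $l = l_t(C)$ and the auxiliary coordinate $\zeta$ are introduced here only for illustration and do not appear in the text.

```latex
% Sketch: modulus of the quotient annulus H/<g>, where g(z) = e^{l} z and l = l_t(C).
% Requires amsmath (align*) and amssymb (\mathbb).
\begin{align*}
\zeta = \log z &:\ \mathbb{H} \to \{\zeta : 0 < \operatorname{Im}\zeta < \pi\},
  \qquad g \ \text{becomes}\ \zeta \mapsto \zeta + l, \\
w = \exp\!\Bigl(-\tfrac{2\pi i}{l}\,\zeta\Bigr) &:\
  |w| = \exp\!\Bigl(\tfrac{2\pi}{l}\operatorname{Im}\zeta\Bigr)
  \in \bigl(1,\ e^{2\pi^{2}/l}\bigr), \\
m\bigl(\mathbb{H}/\langle g\rangle\bigr)
  &= \frac{1}{2\pi}\,\log e^{2\pi^{2}/l} = \frac{\pi}{l}.
\end{align*}
```

The same computation applied to a sector of angular width $\theta$ instead of the half-plane gives modulus $\theta / l$, which is consistent with the value $\alpha/2l$ quoted for sectors of width $\alpha/2$ in the proof of Lemma 4.3.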
Using Lemma 4.4, we can finally prove the next lemma and complete the proof of Theorem A for the special case. Lemma 4.5. We have (tn , ϕn ) ∈ ᏽC for large enough n. Proof. Since maximal dilatations of quasi-conformal maps ωn : St → Stn tend to 1, there exists a positive constant m1 > 0 such that moduli of ωn −1 ◦ gµˆ n (Aj ) exceed m1 for sufficiently large n. Since any curve in St joining any two components of
n = ωn −1 (ft−1 ((Gn ))/ $tn ) traverses some ωn −1 ◦ gµˆ n (Aj ), Lemma 4.4 implies n ,ϕn that there exists a positive constant L > 0 such that the hyperbolic distance of any two components of n is bounded below by L for sufficiently large n. Then, from Lemmas 3.3 and 4.1(2), one can easily see that n consists of two connected components, each of which is homotopic to C for sufficiently large n. Therefore, from Lemma 2.4, we obtain the assertion. 4.2. Proof for general case. Take an arbitrary element λ = lj =1 nj Cj ∈ ᏹᏸZ (S), be a geometrically finite Kleinian group whose Kleinian manifold NG and let G is homeomorphic to S × [0, 1] − ∪lj =1 Cj × {1/2}. Let fλ : S → NG be an immersion such that S minus every annular neighborhood of Cj is mapped into S × {1/4} by inclusion and the image of each annular neighborhood of Cj is wrapping nj -times around Cj × {1/2}. This immersion is called the wrapping map associated to λ. Now we perform simultaneous (n, 1) Dehn filling on all the cusps to obtain a → Gn } as before. Moreover, sequence of quasi-Fuchsian representations {βn : G we obtain a sequence {ρn = βn ◦ (fλ )∗ } of representations converging algebraically to ρ∞ = (fλ )∗ and the corresponding sequence {(tn , ϕn )} of projective structures converging to (t, ϕ) ∈ ∂ ᏽ0 . Then we show, in the same manner as in the proof of −1 $t in St consists of disjoint unions of 2nj Lemma 4.1, that the subset ft,ϕ ((G))/ connected components contained in annuli whose cores are homotopic to Cj for all j . Again, we show that (tn , ϕn ) ∈ ᏽλ for large enough n. Finally, we have completed the proof of Theorem A. 4.3. A remark. Although it is difficult to understand exactly the shape of the closed set ᏽ0 ∩ ᏽλ , one can see that it is not a compact subset of P (S). Proposition 4.6. For any λ ∈ ᏹᏸZ (S), ᏽ0 ∩ ᏽλ is not compact in P (S). Proof. We prove this, again, for the case that λ is a simple closed curve C ∈ be the same as in the proof of Theorem A for the special case. of weight 1. 
Let $\hat{G}$ be as in the proof of Theorem A for the special case, and let $A$ be an annular domain in a component $S \times \{1\}$ of the conformal boundary $\partial N_{\hat{G}}$ whose core is homotopic to $C$. Observe that $A$ can be conformally embedded in $S_t$, where $t \in T(S)$ is the underlying complex structure of the limit point $(t, \varphi) \in \partial \mathcal{Q}_0$ constructed in the proof of Theorem A. If we deform the conformal structure on $S \times \{1\}$ so that $m(A) \to \infty$, then the hyperbolic length of $C$ on $S \times \{1\}$ tends to 0. Then $t$ diverges in $T(S)$, and, hence, $(t, \varphi) \in \partial \mathcal{Q}_0$ diverges in $P(S)$.

5. The proofs of Theorems B and C. For simple closed curves $C_1, C_2$ on $S$, the geometric intersection number $i(C_1, C_2)$ is the minimum number of points in which representatives of $C_1$ and $C_2$ must intersect. This can be naturally extended to $\mathcal{ML}_{\mathbb{Z}}(S)$. The main aim of this section is to prove Theorem B.

Theorem B. For any finite set $\{\lambda_i\}_{i=1}^{m} \subset \mathcal{ML}_{\mathbb{Z}}(S)$ satisfying $i(\lambda_j, \lambda_k) = 0$ for all $j, k \in \{1, \dots, m\}$, we have $\overline{\mathcal{Q}_0} \cap \overline{\mathcal{Q}_{\lambda_1}} \cap \cdots \cap \overline{\mathcal{Q}_{\lambda_m}} \neq \emptyset$.
To prove this theorem, we first prepare Lemma 5.1, which provides us with a method to construct some sequences of representations with the same algebraic limit but with mutually distinct geometric limits. For a Kleinian group $G$, let $M_G = \mathbb{H}^3 / G$ denote the interior of the Kleinian manifold $N_G = (\mathbb{H}^3 \cup \Omega(G)) / G$. Take an element $\lambda = \sum_{j=1}^{l} n_j C_j \in \mathcal{ML}_{\mathbb{Z}}(S)$ and an ordered set of positive integers $k = \{k_1, \dots, k_l\}$. For this pair $(\lambda, k)$, we define a new element $\lambda / k$ of $\mathcal{ML}_{\mathbb{Z}}(S)$ by
$$\frac{\lambda}{k} = \sum_{j=1}^{l} \left[ \frac{n_j}{k_j} \right] C_j,$$
where [nj /kj ] is the largest integer that does not exceed nj /kj . be a geLemma 5.1. Let λ = lj =1 nj Cj be an element of ᏹᏸZ (S), and let G is homeomorphic to S × [0, 1] − ometrically finite Kleinian group such that NG
l j =1 Cj × {1/2}. Then, for any ordered set of positive integers k = {k1 , . . . , kl }, such that there exists a geometrically finite Kleinian group G
l • NG is also homeomorphic to S × [0, 1] − j =1 Cj × {1/2}, and • the wrapping map fλ : S → MG associated to λ and the wrapping map fλ : S → MG associated to λ = λ/k induce the same representation up to conjugation in PSL2 (C); that is, [(fλ )∗ ] = [(fλ )∗ ]. Proof. Take an element λ¯ = lj =1 mj Cj ∈ ᏹᏸZ (S) such that mj = nj − [nj / kj ]kj for all j . Note that 0 ≤ mj < kj hold for all j . We consider the wrapping map ¯ and denote (fλ¯ )∗ (π1 (S)) by G. Recall that G is a regular fλ¯ : S → MG associated to λ b-group and that MG is homeomorphic to the interior of S × [0, 1]. Via the canonical projection p1 : MG → MG , the map fλ¯ is lifted to an embedding f λ¯ : S → MG , which is a homotopy equivalence. for the fundamental group of each rank-two We now choose a basis γj , δj ⊂ G cusp in MG corresponding to Cj ×{1/2} as before, but with an additional assumption that γj is contained in G. We take a subgroup = G, δ k1 , . . . , δ kl G l 1 Since mj < kj holds for all j , one can take Jordan domains {Dj , D }l of G. j j =1 in (G) such that • both Dj and Dj are stabilized by γj , • δj kj maps ∂Dj onto ∂Dj and the interior of Dj onto the exterior of Dj , • {Dj , Dj }lj =1 project onto 2l disjoint cusp neighborhoods in ∂NG via the covering map (G) → ∂NG . Now one can apply the Klein-Maskit combination Theorem II [22, Theorem E.5(xi)]
l inductively to show that NG is homeomorphic to S × [0, 1] − j =1 Cj × {1/2}.
Moreover, note that the canonical projection p2 : MG → MG maps the end of MG corresponding to S × {0} homeomorphically onto the end of MG corresponding to S × {0}. Therefore, p2 ◦ fλ¯ is homotopic to the inclusion map f0 : S → MG onto S × {1/4}. We now consider the canonical projection p3 : MG → MG . Note that p3 ◦ f0 is homotopic to fλ¯ . Observe that the restriction of p3 on the tubular neighborhood of Cj ×{1/2} in MG is a kj -fold covering map on the tubular neighborhood of Cj ×{1/2} in MG . We now perform a surgery to obtain the wrapping map fλ : S → MG from f0 : S → MG , as before. Then p3 ◦ fλ is homotopic to the wrapping map fλ : S → → G is MG associated to λ. Therefore [(p3 ◦ fλ )∗ ] = [(fλ )∗ ], and since (p3 )∗ : G the inclusion map, we have [(fλ )∗ ] = [(fλ )∗ ]. Now Theorem B follows immediately from Lemma 5.1. satisfying i(λj , λk ) = 0 Proof of Theorem B. For any finite set {λi }m i=1 ⊂ ᏹᏸZ (S) for all j, k ∈ {1, . . . , m}, one can easily find an element λ = lj =1 nj Cj ∈ ᏹᏸZ (S) (i)
(i)
and ordered sets of positive integers ki = {k1 , . . . , kl } satisfying λi = λ/ki for all i. and G i (i = 1, . . . , m) such that the By Lemma 5.1, there exist Kleinian groups G wrapping maps fλ : S → MG and fλi : S → MG i induce the same representations [(fλ )∗ ] = [(fλi )∗ ] for all i. Take the element (t, ϕ) ∈ ∂ ᏽ0 such that hol(t, ϕ) = [(fλ )∗ ]. By performing simultaneous (n, 1) Dehn filling on all the cusps in MG i , (i)
(i)
we obtain a sequence {(tn , ϕn )} in ᏽλi converging to (t, ϕ). Hence, we obtain the assertion. As a consequence of Theorem B, we obtain the following theorem.
Theorem C. For any positive integer $n \in \mathbb{N}$, there exists a point $[\rho] \in \partial QF(S)$ such that $U \cap QF(S)$ consists of more than $n$ components for any sufficiently small neighborhood $U$ of $[\rho]$.

Proof. For any positive integer $n \in \mathbb{N}$, a finite set $\{kC\}_{k=1}^{n}$ in $\mathcal{ML}_{\mathbb{Z}}(S)$ satisfies the condition in Theorem B. Since the holonomy map hol : $P(S) \to V(S)$ is a local homeomorphism, we obtain the assertion.

References

[1] W. Abikoff, On boundaries of Teichmüller spaces and on Kleinian groups, III, Acta Math. 134 (1975), 211–237.
[2] J. W. Anderson and R. D. Canary, Algebraic limits of Kleinian groups which rearrange the pages of a book, Invent. Math. 126 (1996), 205–214.
[3] J. W. Anderson, R. D. Canary, and D. McCullough, On the topology of deformation spaces of Kleinian groups, preprint, http://front.math.ucdavis.edu/math.GT/9806079.
[4] L. Bers, Simultaneous uniformization, Bull. Amer. Math. Soc. 66 (1960), 94–97.
[5] L. Bers, On boundaries of Teichmüller spaces and on Kleinian groups, I, Ann. of Math. (2) 91 (1970), 570–600.
[6] F. Bonahon and J. P. Otal, Variétés hyperboliques à géodésiques arbitrairement courtes, Bull. London Math. Soc. 20 (1988), 255–261.
[7] J. F. Brock, “Iteration of mapping classes on a Bers slice: Examples of algebraic and geometric limits of hyperbolic 3-manifolds” in Lipa’s Legacy (New York, 1995), Contemp. Math. 211, Amer. Math. Soc., Providence, 1997, 81–106.
[8] T. D. Comar, Hyperbolic Dehn surgery and convergence of Kleinian groups, Ph.D. dissertation, Univ. of Michigan, 1996.
[9] C. J. Earle, “On variation of projective structures” in Riemann Surfaces and Related Topics (Stony Brook, N.Y., 1978), Ann. of Math. Studies 97, Princeton Univ. Press, Princeton, 1981, 87–99.
[10] C. J. Earle and C. T. McMullen, “Quasiconformal isotopies” in Holomorphic Functions and Moduli, Vol. I (Berkeley, Calif., 1986), Math. Sci. Res. Inst. Publ. 10, Springer, New York, 1987, 143–154.
[11] D. M. Gallo, Deforming real projective structures, Ann. Acad. Sci. Fenn. Math. 22 (1997), 3–14.
[12] W. M. Goldman, Projective structures with Fuchsian holonomy, J. Differential Geom. 25 (1987), 297–326.
[13] D. A. Hejhal, Monodromy groups and linearly polymorphic functions, Acta Math. 135 (1975), 1–55.
[14] J. H. Hubbard, “The monodromy of projective structures” in Riemann Surfaces and Related Topics (Stony Brook, N.Y., 1978), Ann. of Math. Studies 97, Princeton Univ. Press, Princeton, 1981, 257–275.
[15] T. Jørgensen, On discrete groups of Möbius transformations, Amer. J. Math. 98 (1976), 739–749.
[16] T. Jørgensen and A. Marden, Algebraic and geometric convergence of Kleinian groups, Math. Scand. 66 (1990), 47–72.
[17] Y. Kamishima and S. P. Tan, “Deformation spaces on geometric structures” in Aspects of Low Dimensional Manifolds, Adv. Stud. in Pure Math. 20, Kinokuniya, Tokyo, 1992, 263–299.
[18] S. P. Kerckhoff and W. P. Thurston, Noncontinuity of the action of the modular group at Bers’ boundary of Teichmüller space, Invent. Math. 100 (1990), 25–47.
[19] I. Kra, Deformations of Fuchsian groups, Duke Math. J. 36 (1969), 537–546.
[20] O. Lehto and K. I. Virtanen, Quasiconformal Mappings in the Plane, 2d ed., Springer, New York, 1973.
[21] B. Maskit, On a class of Kleinian groups, Ann. Acad. Sci. Fenn. Ser. A I Math. 442 (1969), 1–8.
[22] B. Maskit, Kleinian Groups, Grundlehren Math. Wiss. 287, Springer, New York, 1988.
[23] K. Matsuzaki and M. Taniguchi, Hyperbolic Manifolds and Kleinian Groups, Oxford Math. Monographs, Oxford Univ. Press, New York, 1998.
[24] C. T. McMullen, Cusps are dense, Ann. of Math. (2) 133 (1991), 217–247.
[25] C. T. McMullen, Complex earthquakes and Teichmüller theory, J. Amer. Math. Soc. 11 (1998), 283–320.
[26] J. W. Morgan, “On Thurston’s uniformization theorem for three-dimensional manifolds” in The Smith Conjecture (New York, 1979), Pure Appl. Math. 112, Academic Press, Orlando, 1984, 37–125.
[27] K. Ohshika, “Divergent sequences of Kleinian groups” in The Epstein Birthday Schrift, Geom. Topol. Monogr. 1, Geom. Topol., Coventry, 1998, 419–450, http://www.maths.warwick.ac.uk/gt/gtmono.html.
[28] C. Pommerenke, Boundary Behavior of Conformal Maps, Grundlehren Math. Wiss. 299, Springer, Berlin, 1992.
[29] H. Shiga, Projective structures on Riemann surfaces and Kleinian groups, J. Math. Kyoto Univ. 27 (1987), 433–438.
[30] H. Shiga and H. Tanigawa, Projective structures with discrete holonomy representations, Trans. Amer. Math. Soc. 351 (1999), 813–823.
[31] D. Sullivan, Quasiconformal homeomorphisms and dynamics, II: Structural stability implies hyperbolicity for Kleinian groups, Acta Math. 155 (1985), 243–260.
[32] H. Tanigawa, Grafting, harmonic maps and projective structures on surfaces, J. Differential Geom. 47 (1997), 399–419.
[33] S. Wolpert, The Fenchel-Nielsen deformation, Ann. of Math. (2) 115 (1982), 501–528.
Graduate School of Mathematics, Nagoya University, Chikusa-ku, Nagoya 464-8602, Japan; [email protected]
Vol. 105, No. 2
DUKE MATHEMATICAL JOURNAL
© 2000
QUANTUM DEFORMATION OF WHITTAKER MODULES AND THE TODA LATTICE A. SEVOSTYANOV
Introduction. In 1978 Kostant suggested the Whittaker model of the center of the universal enveloping algebra U (g) of a complex simple Lie algebra g. An essential role in this construction is played by a nonsingular character χ of the maximal nilpotent subalgebra n+ ⊂ g. The main result is that the center of U (g) is isomorphic to a commutative subalgebra in U (b− ), where b− ⊂ g is the opposite Borel subalgebra. This observation is used in the theory of principal series representations of the corresponding Lie group G and in the proof of complete integrability of the quantum Toda lattice. The goal of this paper is to generalize Kostant’s construction to quantum groups. An obvious obstruction is the fact that the subalgebra in Uh (g) generated by positive root generators (subject to the quantum Serre relations) does not have nonsingular characters. In order to overcome this difficulty we use a family of new realizations of quantum groups introduced in [13]. The modified quantum Serre relations allow for nonsingular characters, and we are able to construct the Whittaker model of the center of Uh (g). Using the Whittaker model of the center of Uh (g), we introduce quantum deformations of Whittaker modules. The new Whittaker model is also applied to the deformed quantum Toda lattice recently studied by Etingof [6]. We give new proofs of his results which resemble the original Kostant’s proofs for the quantum Toda lattice. The paper is organized as follows. Section 1 contains a review of Kostant’s results on the Whittaker model and Whittaker modules [11], [10]. In order to create a pattern for proofs in the quantum group case, we recall most of Kostant’s proofs. The central part of the paper is Section 2. There we discuss properties of new realizations of finite-dimensional quantum groups and present the Whittaker model of the center of Uh (g). In Section 2.4 we introduce quantum deformed Whittaker modules. 
Section 2.6 contains a discussion of the deformed quantum Toda lattice.

Received 8 November 1999. Revision received 20 December 1999.
2000 Mathematics Subject Classification. Primary 17B37; Secondary 17B10.

1. Whittaker modules. In this section we recall the Whittaker model of the center of the universal enveloping algebra $U(\mathfrak{g})$, where $\mathfrak{g}$ is a complex simple Lie algebra.

1.1. Notation. Fix the notation used throughout the text. Let $G$ be a connected simply connected finite-dimensional complex simple Lie group, and let $\mathfrak{g}$ be its Lie algebra. Fix a Cartan subalgebra $\mathfrak{h} \subset \mathfrak{g}$, and let $\Delta$ be the set of roots of $(\mathfrak{g}, \mathfrak{h})$. Choose an ordering in the root system. Let $\alpha_i$, $i = 1, \dots, l$, $l = \operatorname{rank}(\mathfrak{g})$, be the simple roots, and let $\Delta_+ = \{\beta_1, \dots, \beta_N\}$ be the set of positive roots. Denote by $\rho$ a half of the sum of positive roots, $\rho = \frac{1}{2} \sum_{i=1}^{N} \beta_i$. Let $H_1, \dots, H_l$ be the set of simple root generators of $\mathfrak{h}$. Let $a_{ij}$ be the corresponding Cartan matrix. Let $d_1, \dots, d_l$ be coprime positive integers such that the matrix $b_{ij} = d_i a_{ij}$ is symmetric. There exists a unique nondegenerate invariant symmetric bilinear form $(\ ,\ )$ on $\mathfrak{g}$ such that $(H_i, H_j) = d_j^{-1} a_{ij}$. It induces an isomorphism of vector spaces $\mathfrak{h} \simeq \mathfrak{h}^*$ under which $\alpha_i \in \mathfrak{h}^*$ corresponds to $d_i H_i \in \mathfrak{h}$. We denote by $\alpha^\vee$ the element of $\mathfrak{h}$ that corresponds to $\alpha \in \mathfrak{h}^*$ under this isomorphism. The induced bilinear form on $\mathfrak{h}^*$ is given by $(\alpha_i, \alpha_j) = b_{ij}$.

Let $W$ be the Weyl group of the root system $\Delta$. $W$ is the subgroup of $GL(\mathfrak{h})$ generated by the fundamental reflections $s_1, \dots, s_l$,
$$s_i(h) = h - \alpha_i(h) H_i, \qquad h \in \mathfrak{h}.$$
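As an illustration of this notation (not taken from the paper), the data can be written out for $\mathfrak{g} = \mathfrak{sl}_3$. The sketch below assumes the usual convention $\alpha_j(H_i) = a_{ij}$, which agrees with the identification of $\alpha_i$ with $d_i H_i$ described above.

```latex
% Illustration only: the notation of Section 1.1 for g = sl_3 (l = 2, N = 3).
% Requires amsmath (align*, pmatrix).
\begin{align*}
(a_{ij}) &= \begin{pmatrix} 2 & -1 \\ -1 & 2 \end{pmatrix}, \qquad
  d_1 = d_2 = 1, \qquad b_{ij} = a_{ij}, \\
\Delta_+ &= \{\alpha_1,\ \alpha_2,\ \alpha_1 + \alpha_2\}, \qquad
  \rho = \tfrac12\bigl(\alpha_1 + \alpha_2 + (\alpha_1 + \alpha_2)\bigr)
       = \alpha_1 + \alpha_2, \\
s_1(H_1) &= H_1 - \alpha_1(H_1)H_1 = -H_1, \qquad
  s_1(H_2) = H_2 - \alpha_1(H_2)H_1 = H_1 + H_2, \\
(s_1 H_2, s_1 H_2) &= (H_1, H_1) + 2(H_1, H_2) + (H_2, H_2)
  = 2 - 2 + 2 = (H_2, H_2).
\end{align*}
```

The last line checks, in this one case, that the reflection $s_1$ leaves the bilinear form on $\mathfrak{h}$ unchanged.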
The action of $W$ preserves the bilinear form $(\ ,\ )$ on $\mathfrak{h}$.

Let $\mathfrak{b}_+$ be the positive Borel subalgebra, and let $\mathfrak{b}_-$ be the opposite Borel subalgebra; let $\mathfrak{n}_+ = [\mathfrak{b}_+, \mathfrak{b}_+]$ and $\mathfrak{n}_- = [\mathfrak{b}_-, \mathfrak{b}_-]$ be their nilradicals. Let $H = \exp \mathfrak{h}$, $N_+ = \exp \mathfrak{n}_+$, $N_- = \exp \mathfrak{n}_-$, $B_+ = H N_+$, and $B_- = H N_-$ be the Cartan subgroup, the maximal unipotent subgroups, and the Borel subgroups of $G$ which correspond to the Lie subalgebras $\mathfrak{h}$, $\mathfrak{n}_+$, $\mathfrak{n}_-$, $\mathfrak{b}_+$, and $\mathfrak{b}_-$, respectively.

We identify $\mathfrak{g}$ and its dual by means of the canonical invariant bilinear form. Then the coadjoint action of $G$ on $\mathfrak{g}^*$ is naturally identified with the adjoint one. We also identify $\mathfrak{n}_+^* \cong \mathfrak{n}_-$, $\mathfrak{b}_+^* \cong \mathfrak{b}_-$.

Let $\mathfrak{g}_\beta$ be the root subspace corresponding to a root $\beta \in \Delta$, $\mathfrak{g}_\beta = \{x \in \mathfrak{g} \mid [h, x] = \beta(h)x \text{ for every } h \in \mathfrak{h}\}$; $\mathfrak{g}_\beta \subset \mathfrak{g}$ is a 1-dimensional subspace. It is well known that for $\alpha \neq -\beta$ the root subspaces $\mathfrak{g}_\alpha$ and $\mathfrak{g}_\beta$ are orthogonal with respect to the canonical invariant bilinear form. Moreover, $\mathfrak{g}_\alpha$ and $\mathfrak{g}_{-\alpha}$ are nondegenerately paired by this form. Root vectors $X_\alpha \in \mathfrak{g}_\alpha$ satisfy the following relations:
$$[X_\alpha, X_{-\alpha}] = (X_\alpha, X_{-\alpha})\, \alpha^\vee.$$

1.2. The Whittaker model. In this section we introduce the Whittaker model of the center of the universal enveloping algebra $U(\mathfrak{g})$. We start by recalling the classical result of Chevalley which describes the structure of the center. Let $Z(\mathfrak{g})$ be the center of $U(\mathfrak{g})$. The standard filtration $U_k(\mathfrak{g})$ in $U(\mathfrak{g})$ induces a filtration $Z_k(\mathfrak{g})$ in $Z(\mathfrak{g})$. The following important theorem may be found, for instance, in [1, Chapter 8, §8, no. 3, Corollary 1 and no. 5, Theorem 2].

Theorem (Chevalley). One can choose elements $I_k \in Z_{m_k + 1}(\mathfrak{g})$, $k = 1, \dots, l$, where $m_k$ are called the exponents of $\mathfrak{g}$, such that $Z(\mathfrak{g}) = \mathbb{C}[I_1, \dots, I_l]$ is a polynomial algebra in $l$ generators.

The adjoint action of $G$ on $\mathfrak{g}$ naturally extends to $S(\mathfrak{g})$. Let $S(\mathfrak{g})^G$ be the algebra of $G$-invariants in $S(\mathfrak{g})$. Clearly, $\operatorname{Gr} Z(\mathfrak{g}) \cong S(\mathfrak{g})^G$. In particular, $S(\mathfrak{g})^G \cong \mathbb{C}[\hat{I}_1, \dots, \hat{I}_l]$, where $\hat{I}_i = \operatorname{Gr} I_i$, $i = 1, \dots, l$. The elements $\hat{I}_i$, $i = 1, \dots, l$, are called fundamental invariants.

Following Kostant we realize the center $Z(\mathfrak{g})$ of the universal enveloping algebra $U(\mathfrak{g})$ as a subalgebra in $U(\mathfrak{b}_-)$. Let
$$\chi : \mathfrak{n}_+ \longrightarrow \mathbb{C}$$
be a character of $\mathfrak{n}_+$. Since $\mathfrak{n}_+ = \sum_{i=1}^{l} \mathbb{C} X_{\alpha_i} \oplus [\mathfrak{n}_+, \mathfrak{n}_+]$, it is clear that $\chi$ is completely determined by the constants $c_i = \chi(X_{\alpha_i})$, $i = 1, \dots, l$, and $c_i$ are arbitrary. In [11] $\chi$ is called nonsingular if $c_i \neq 0$ for all $i$.

Let $f = \sum_{i=1}^{l} X_{-\alpha_i} \in \mathfrak{n}_-$ be a regular nilpotent element. From the properties of the invariant bilinear form (see Section 1.1), it follows that $(f, [\mathfrak{n}_+, \mathfrak{n}_+]) = 0$, $(f, X_{\alpha_i}) = (X_{-\alpha_i}, X_{\alpha_i})$, and hence the map $x \mapsto (f, x)$, $x \in \mathfrak{n}_+$, is a nonsingular character of $\mathfrak{n}_+$. Recall that in our choice of root vectors no normalization was made. But now given a nonsingular character $\chi : \mathfrak{n}_+ \to \mathbb{C}$ we say that $f$ corresponds to $\chi$ in case
$$\chi(X_{\alpha_i}) = (X_{-\alpha_i}, X_{\alpha_i}).$$
Conversely, if $\chi$ is nonsingular, there is a unique choice of $f$ so that $f$ corresponds to $\chi$. In this case $\chi(x) = (f, x)$ for every $x \in \mathfrak{n}_+$.

Naturally, the character $\chi$ extends to a character of the universal enveloping algebra $U(\mathfrak{n}_+)$. Let $U_\chi(\mathfrak{n}_+)$ be the kernel of this extension so that one has a direct sum $U(\mathfrak{n}_+) = \mathbb{C} \oplus U_\chi(\mathfrak{n}_+)$. Since $\mathfrak{g} = \mathfrak{b}_- \oplus \mathfrak{n}_+$, we have a linear isomorphism $U(\mathfrak{g}) = U(\mathfrak{b}_-) \otimes U(\mathfrak{n}_+)$ and, hence, the direct sum
$$U(\mathfrak{g}) = U(\mathfrak{b}_-) \oplus I_\chi, \tag{1.2.1}$$
where $I_\chi = U(\mathfrak{g}) U_\chi(\mathfrak{n}_+)$ is the left-sided ideal generated by $U_\chi(\mathfrak{n}_+)$. For any $u \in U(\mathfrak{g})$, let $u^\chi \in U(\mathfrak{b}_-)$ be its component in $U(\mathfrak{b}_-)$ relative to the decomposition (1.2.1). Denote by $\rho_\chi$ the linear map $\rho_\chi : U(\mathfrak{g}) \to U(\mathfrak{b}_-)$ given by $\rho_\chi(u) = u^\chi$. Let $W(\mathfrak{b}_-) = \rho_\chi(Z(\mathfrak{g}))$.

Theorem A [11, Theorem 2.4.2]. The map
$$\rho_\chi : Z(\mathfrak{g}) \longrightarrow W(\mathfrak{b}_-) \tag{1.2.2}$$
is an isomorphism of algebras. In particular,
$$W(\mathfrak{b}_-) = \mathbb{C}[I_1^\chi, \dots, I_l^\chi], \qquad I_i^\chi = \rho_\chi(I_i), \quad i = 1, \dots, l,$$
is a polynomial algebra in $l$ generators.
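For orientation, here is a sketch of the map $\rho_\chi$ in the smallest case $\mathfrak{g} = \mathfrak{sl}_2$. The basis $e, f, h$ with $[e, f] = h$, $[h, e] = 2e$, $[h, f] = -2f$, the normalization of the Casimir element $C$, and the constant $c = \chi(e)$ are conventions chosen here for illustration; they are not fixed by the text.

```latex
% Illustration only: the Whittaker model for sl_2, with n_+ = C e and b_- = C h + C f.
% Requires amsmath (align*) and amssymb (\mathfrak, \mathbb).
\begin{align*}
C &= ef + fe + \tfrac12 h^2 = 2fe + h + \tfrac12 h^2, \qquad
  Z(\mathfrak{sl}_2) = \mathbb{C}[C], \\
C &= \bigl(2cf + h + \tfrac12 h^2\bigr) + 2f(e - c), \qquad
  2f(e - c) \in I_\chi, \\
\rho_\chi(C) &= C^\chi = 2cf + h + \tfrac12 h^2 \in U(\mathfrak{b}_-), \qquad
  W(\mathfrak{b}_-) = \mathbb{C}[C^\chi].
\end{align*}
```

Since $e - c$ lies in $U_\chi(\mathfrak{n}_+)$, the second line is the decomposition (1.2.1) of $C$, and the third line reads off its $U(\mathfrak{b}_-)$-component.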
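Proof. First, we show that the map (1.2.2) is an algebra homomorphism. If $u, v \in Z(\mathfrak{g})$, then $u^\chi v^\chi \in U(\mathfrak{b}_-)$ and
$$uv - u^\chi v^\chi = (u - u^\chi)v + u^\chi(v - v^\chi).$$
Since $(u - u^\chi)v = v(u - u^\chi)$, the right-hand side of the last equality is an element of $I_\chi$. This proves $u^\chi v^\chi = (uv)^\chi$.

By definition the map (1.2.2) is surjective. We have to prove that it is injective. Let $U(\mathfrak{g})^{\mathfrak{h}}$ be the centralizer of $\mathfrak{h}$ in $U(\mathfrak{g})$. Clearly, $Z(\mathfrak{g}) \subseteq U(\mathfrak{g})^{\mathfrak{h}}$. From the Poincaré-Birkhoff-Witt theorem, it follows that every element $z \in U(\mathfrak{g})^{\mathfrak{h}}$ may be uniquely written as
$$z = \sum_{p, q \in \mathbb{N}^N,\ \langle p \rangle = \langle q \rangle} X_{-\beta_1}^{p_1} \cdots X_{-\beta_N}^{p_N}\, \varphi_{p,q}\, X_{\beta_1}^{q_1} \cdots X_{\beta_N}^{q_N},$$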
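where $\langle p \rangle = \sum_{i=1}^{N} p_i \beta_i \in \mathfrak{h}^*$ and $\varphi_{p,q} \in U(\mathfrak{h})$. Now recall that $\chi(X_{\beta_i}) = 0$ if $\beta_i$ is not a simple root, and we easily obtain
$$\rho_\chi(z) = \sum_{\langle p \rangle = \langle q \rangle \neq 0} X_{-\beta_1}^{p_1} \cdots X_{-\beta_N}^{p_N}\, \varphi_{p,q} \prod_{i=1}^{l} c_i^{\,q_{j_i}} + \varphi_{0,0},$$
where the sum is taken over those $p, q$ with $q_j = 0$ whenever $\beta_j$ is not a simple root, and $j_i$ denotes the index for which $\beta_{j_i} = \alpha_i$.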
Let $z \in Z(\mathfrak{g})$. One knows that the map
$$Z(\mathfrak{g}) \longrightarrow U(\mathfrak{h}), \qquad z \longmapsto \varphi_{0,0},$$
called the Harish-Chandra homomorphism, is injective (see [3, (c), p. 232]). It follows that the map (1.2.2) is also injective.

Remark 1.2.1. The first part of the proof of Theorem A used only the fact that $v \in Z(\mathfrak{g})$. Therefore, $\rho_\chi(uv) = \rho_\chi(u)\rho_\chi(v)$ for any $u \in U(\mathfrak{g})$, $v \in Z(\mathfrak{g})$.

Definition A. The algebra $W(\mathfrak{b}_-)$ is called the Whittaker model of $Z(\mathfrak{g})$.

Next we equip $U(\mathfrak{b}_-)$ with a structure of a left $U(\mathfrak{n}_+)$-module in such a way that $W(\mathfrak{b}_-)$ is realized as the space of invariants with respect to this action. Let $Y_\chi$ be the left $U(\mathfrak{g})$-module defined by
$$Y_\chi = U(\mathfrak{g}) \otimes_{U(\mathfrak{n}_+)} \mathbb{C}_\chi,$$
where $\mathbb{C}_\chi$ denotes the 1-dimensional $U(\mathfrak{n}_+)$-module defined by $\chi$. Obviously $Y_\chi$ is
just the quotient module U (g)/Iχ . From (1.2.1) it follows that the map (1.2.3)
U (b− ) −→ Yχ ,
v −→ v ⊗ 1
is a linear isomorphism. It is convenient to carry the module structure of Yχ to U (b− ). For u ∈ U (g), v ∈ U (b− ) the induced action u ◦ v has the form (1.2.4)
u ◦ v = (uv)χ .
The restriction of this action to U(n+) may be changed by tensoring with the 1-dimensional U(n+)-module defined by −χ. That is, U(b−) becomes a U(n+)-module where if x ∈ U(n+) and v ∈ U(b−), then one puts (1.2.5)
x · v = x ◦ v − χ(x)v.
Lemma A [11, Lemma 2.6.1]. Let v ∈ U (b− ) and x ∈ U (n+ ). Then x · v = [x, v]χ . Proof. By definition, x · v = (xv)χ − χ(x)v. Then we have xv = [x, v] + vx, and hence x · v = ([x, v])χ + (vx)χ − χ(x)v. But clearly (vx)χ = vχ(x). Thus, x · v = ([x, v])χ . The action (1.2.5) may be lifted to an action of the unipotent group N+ . Consider the space U (b− )N+ of N+ invariants in U (b− ) with respect to this action. Clearly, W (b− ) ⊆ U (b− )N+ . Theorem B [11, Theorems 2.4.1, 2.6]. Suppose that the character χ is nonsingular. Then the space of N+ invariants in U (b− ) with respect to the action (1.2.5) is isomorphic to W (b− ); that is, (1.2.6)
$$U(\mathfrak{b}_-)^{N_+} \cong W(\mathfrak{b}_-).$$
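As a consistency check of Lemma A and of the inclusion $W(\mathfrak{b}_-) \subseteq U(\mathfrak{b}_-)^{N_+}$ in a small case, take $\mathfrak{g} = \mathfrak{sl}_2$ with basis $e, f, h$ ($[e, f] = h$, $[h, e] = 2e$, $[h, f] = -2f$) and $\chi(e) = c$. These conventions, and the element $v$ below (the image under $\rho_\chi$ of the Casimir element $ef + fe + \tfrac12 h^2$), are assumptions made for this sketch only.

```latex
% Illustration only: the generator of W(b_-) for sl_2 is annihilated by the action (1.2.5).
% Requires amsmath (align*).
\begin{align*}
v &= 2cf + h + \tfrac12 h^2, \\
[e, v] &= 2c\,[e, f] + [e, h] + \tfrac12 [e, h^2]
        = 2ch - 2e + (-2he + 2e) = 2ch - 2he, \\
e \cdot v &= [e, v]^\chi = 2ch - 2h\,\chi(e) = 0.
\end{align*}
```

The middle line uses $[e, h^2] = [e, h]h + h[e, h] = -4he + 4e$, and the last line uses $he = h(e - c) + ch$ with $h(e - c) \in I_\chi$.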
1.3. Whittaker modules. In this section we recall basic facts on Whittaker modules (see [11]). Let V be a U (g)-module. The action is denoted by uv for u ∈ U (g), v ∈ V . Let χ : n+ → C be a nonsingular character of n+ (see Section 1.2). A vector w ∈ V is called a Whittaker vector (with respect to χ) if xw = χ(x)w for all x ∈ U (n+ ). A Whittaker vector w is called a cyclic Whittaker vector (for V ) if U (g)w = V . A U (g)-module V is called a Whittaker module if it contains a cyclic Whittaker vector.
If V is any U (g)-module, we let UV (g) be the annihilator of V . Then UV (g) defines a central ideal ZV (g) by putting (1.3.1)
ZV (g) = Z(g) ∩ UV (g).
Now assume that V is a Whittaker module for U (g) and w ∈ V is a cyclic Whittaker vector. Let Uw (g) ⊆ U (g) be the annihilator of w. Thus UV (g) ⊆ Uw (g), where Uw (g) is a left ideal and UV (g) is a two-sided ideal in U (g). One has V = U (g)/Uw (g) as U (g)-modules so that V is determined up to equivalence by Uw (g). Clearly, Iχ = U (g)Uχ (n+ ) ⊆ Uw (g) and U (g)ZV (g) ⊆ Uw (g). The following theorem says that, up to equivalence, V is determined by the central ideal ZV (g). Theorem C [11, Theorem 3.1]. Let V be any U (g)-module that admits a cyclic Whittaker vector w, and let Uw (g) be the annihilator of w. Then (1.3.2)
Uw (g) = U (g)ZV (g) + Iχ .
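The proof of Theorem C is based on Lemma B. We use the notation of Section 1.2. If $X \subseteq U(\mathfrak{g})$, let $X^\chi = \rho_\chi(X)$. Note that $U_w(\mathfrak{g})$ is stable under the map $u \mapsto \rho_\chi(u)$. We recall also that, by Theorem A, $\rho_\chi$ induces an algebra isomorphism $Z(\mathfrak{g}) \to W(\mathfrak{b}_-)$, where $W(\mathfrak{b}_-) = Z(\mathfrak{g})^\chi$. Thus, if $Z_*$ is any ideal in $Z(\mathfrak{g})$, then $W_*(\mathfrak{b}_-) = Z_*^\chi$ is an isomorphic ideal in $W(\mathfrak{b}_-)$. But $(U(\mathfrak{g})Z_*)^\chi = U(\mathfrak{b}_-)W_*(\mathfrak{b}_-)$ by Remark 1.2.1. Thus by (1.2.1) one has the direct sum
$$U(\mathfrak{g})Z_* + I_\chi = U(\mathfrak{b}_-)W_*(\mathfrak{b}_-) \oplus I_\chi. \tag{1.3.3}$$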
Lemma B [11, Lemma 3.1]. Let X = {v ∈ U (b− ) | (x · v)w = 0 for all x ∈ n+ }, where x · v is given by (1.2.5). Then (1.3.4)
X = U (b− )WV (b− ) + W (b− ),
where WV (b− ) = ZV (g)χ . Furthermore, if we denote Uw (b− ) = Uw (g) ∩ U (b− ), then (1.3.5)
Uw (b− ) = U (b− )WV (b− ).
Proof of Theorem C. Let u ∈ Uw (g). We wish to show that u ∈ U (g)ZV (g) + Iχ . But by (1.3.3) it suffices to show that uχ ∈ U (b− )WV (b− ). Since uχ ∈ Uw (b− ), the result follows from (1.3.5). Now one can determine, up to equivalence, the set of all Whittaker modules. They are naturally parametrized by the set of all ideals in the center Z(g). Theorem D [11, Theorem 3.2]. Let V be any Whittaker module for U (g), the universal enveloping algebra of a simple Lie algebra g. Let UV (g) be the annihilator
of V , and let Z(g) be the center of U (g). Then the correspondence (1.3.6)
V −→ ZV (g),
where $Z_V(\mathfrak{g}) = U_V(\mathfrak{g}) \cap Z(\mathfrak{g})$, sets up a bijection between the set of all equivalence classes of Whittaker modules and the set of all ideals in $Z(\mathfrak{g})$.

Proof. Let $V_i$, $i = 1, 2$, be two Whittaker modules. If $Z_{V_1}(\mathfrak{g}) = Z_{V_2}(\mathfrak{g})$, then clearly $V_1$ is equivalent to $V_2$ by (1.3.2). Thus the map (1.3.6) is injective on equivalence classes. Conversely, let $Z_*$ be any ideal in $Z(\mathfrak{g})$, and let $L = U(\mathfrak{g})Z_* + I_\chi$. Then $V = U(\mathfrak{g})/L$ is a Whittaker module, where we can take $U_w(\mathfrak{g}) = L$. But then $L = U(\mathfrak{g})Z_V(\mathfrak{g}) + I_\chi$ by Theorem C. By (1.3.2) this implies $Z_V(\mathfrak{g})^\chi = Z_*^\chi$. However, by Theorem A, $\rho_\chi$ is injective on $Z(\mathfrak{g})$. Therefore, $Z_V(\mathfrak{g}) = Z_*$. Hence, the map (1.3.6) is surjective.

Now consider the subalgebra $Z(\mathfrak{g})U(\mathfrak{n}_+)$ in $U(\mathfrak{g})$. By [10, Theorem 2.1], $Z(\mathfrak{g})U(\mathfrak{n}_+) \cong Z(\mathfrak{g}) \otimes U(\mathfrak{n}_+)$. Now let $Z_*$ be any ideal in $Z(\mathfrak{g})$, and regard $Z(\mathfrak{g})/Z_*$ as a $Z(\mathfrak{g})$-module. Equip $Z(\mathfrak{g})/Z_*$ with a structure of $(Z(\mathfrak{g}) \otimes U(\mathfrak{n}_+))$-module by $(u \otimes v)y = \chi(v)uy$, where $u \in Z(\mathfrak{g})$, $v \in U(\mathfrak{n}_+)$, and $y \in Z(\mathfrak{g})/Z_*$. We denote this module by $(Z(\mathfrak{g})/Z_*)_\chi$. The following result is another way of expressing Theorem D.

Theorem E [11, Theorem 3.3]. Let $V$ be any $U(\mathfrak{g})$-module. Then $V$ is a Whittaker module if and only if one has an isomorphism
$$V \cong U(\mathfrak{g}) \otimes_{Z(\mathfrak{g}) \otimes U(\mathfrak{n}_+)} (Z(\mathfrak{g})/Z_*)_\chi \tag{1.3.7}$$
of $U(\mathfrak{g})$-modules. Furthermore, in such a case, the ideal $Z_*$ is unique and is given by $Z_* = Z_V(\mathfrak{g})$, where $Z_V(\mathfrak{g})$ is defined by (1.3.1).

Proof. If $1_*$ is the image of 1 in $Z(\mathfrak{g})/Z_*$, then the annihilator in $Z(\mathfrak{g}) \otimes U(\mathfrak{n}_+)$ of $1_*$ is $U(\mathfrak{n}_+)Z_* + Z(\mathfrak{g})U_\chi(\mathfrak{n}_+)$. Thus, the annihilator in $U(\mathfrak{g})$ of $1 \otimes 1_* = w$ in the right-hand side of (1.3.7) is $U(\mathfrak{g})Z_* + I_\chi$. The result then follows from Theorem D since $w$ is clearly a cyclic generator of this module.

Now one can determine all the Whittaker vectors in a Whittaker module and discuss the question of irreducibility for Whittaker modules.

Theorem F [11, Theorem 3.4]. Let $V$ be any $U(\mathfrak{g})$-module with a cyclic Whittaker vector $w \in V$. Then any $v \in V$ is a Whittaker vector if and only if $v$ is of the form $v = uw$, where $u \in Z(\mathfrak{g})$. Thus the space of all Whittaker vectors in $V$ is a cyclic $Z(\mathfrak{g})$-module that is isomorphic to $Z(\mathfrak{g})/Z_V(\mathfrak{g})$.

Proof. Obviously, if $v = uw$ for $u \in Z(\mathfrak{g})$, then $v$ is a Whittaker vector. Conversely, let $v \in V$ be a Whittaker vector. Write $v = uw$ for $u \in U(\mathfrak{g})$. Then clearly $v = u^\chi w$,
218
A. SEVOSTYANOV
so that we can assume u ∈ U (b− ). But now if x ∈ n+ , then xuw = χ(x)uw. But also uxw = χ (x)uw. Thus, [x, u]w = 0, and hence [x, u]χ w = 0. But x · u = [x, u]χ by Lemma A. Thus, in the notation of Lemma B, one has u ∈ X. But then by Lemma B one can write u = u1 + u2 , where u1 ∈ Uw (b− ) and u2 ∈ W (b− ). But then u1 w = 0. χ Thus, v = u2 w. But now u2 = u3 , where u3 ∈ Z(g) by Theorem A. But then v = u3 w, which proves the theorem. If V is any U (g)-module, then EndU V denotes the algebra of operators on V which commute with the action of U (g). If πV : U (g) → End V is the representation defining the U (g)-module structure on V , then clearly πV (Z(g)) ⊆ EndU V . Furthermore, it is also clear that πV (Z(g)) ∼ = Z(g)/ZV (g). Theorem G [11, Theorem 3.5]. Assume that V is a Whittaker module. Then EndU V = πV (Z(g)). In particular, one has an isomorphism EndU V ∼ = Z(g)/ZV (g). Note that EndU V is commutative. Proof. Let w ∈ V be a cyclic Whittaker vector. If α ∈ EndU V , then αw is a Whittaker vector. But then by Theorem F there exists u ∈ Z(g) such that αw = uw. For any v ∈ U (g) one has αvw = vαw = vuw = uvw. Thus, α = πV (u). Now one can describe all irreducible Whittaker modules. A homomorphism ξ : Z(g) −→ C is called a central character. Given a central character ξ , let Zξ (g) = Ker ξ so that Zξ (g) is a typical central ideal in Z(g). If V is any U (g)-module, one says that V admits an infinitesimal character, and ξ is its infinitesimal character if ξ is a central character such that uv = ξ(u)v for all u ∈ Z(g), v ∈ V . Recall that by Dixmier’s theorem any irreducible U (g)-module admits an infinitesimal character. Given a central character ξ , let Cξ,χ be the 1-dimensional (Z(g)⊗U (n+ ))-module defined so that if u ∈ Z(g), v ∈ U (n+ ), and y ∈ Cξ,χ , then u⊗vy = ξ(u)χ(v)y. Also let Yξ,χ = U (g) ⊗Z(g)⊗U (n+ ) Cξ,χ . It is clear that Yξ,χ admits an infinitesimal character, and ξ is that character. Theorem H [11, Theorem 3.6.1]. 
Let $V$ be any Whittaker module for $U(\mathfrak g)$, the universal enveloping algebra of a simple Lie algebra $\mathfrak g$. Then the following conditions are equivalent.
(1) $V$ is an irreducible $U(\mathfrak g)$-module.
(2) $V$ admits an infinitesimal character.
(3) The corresponding ideal given by Theorem D is a maximal ideal.
(4) The space of Whittaker vectors in $V$ is 1-dimensional.
(5) All nonzero Whittaker vectors in $V$ are cyclic vectors.
(6) The centralizer $\operatorname{End}_U V$ reduces to constants $\mathbb C$.
(7) $V$ is isomorphic to $Y_{\xi,\chi}$ for some central character $\xi$.

Proof. The equivalence of (2), (3), (4), and (6) follows from Theorems F and G. This is also equivalent to (5) since (5) implies that $Z(\mathfrak g)/Z_V(\mathfrak g)$ is a field by Theorem F. One gets the equivalence with (7) by Theorem E. It remains to relate (2)–(7) with (1). But (1) implies (2) by Dixmier's theorem. The proof of the equivalence of (2)–(7) with (1) may be found in [11].

2. Quantum deformation of the Whittaker modules. Let $\mathfrak g$ be a complex simple Lie algebra, and let $U_h(\mathfrak g)$ be the standard quantum group associated with $\mathfrak g$. In this section we construct a generalization of the Whittaker model $W(\mathfrak b_-)$ for $U_h(\mathfrak g)$. Let $U_h(\mathfrak n_+)$ be the subalgebra of $U_h(\mathfrak g)$ corresponding to the nilpotent Lie subalgebra $\mathfrak n_+$. $U_h(\mathfrak n_+)$ is generated by simple positive root generators of $U_h(\mathfrak g)$ subject to the quantum Serre relations. It is easy to show that $U_h(\mathfrak n_+)$ has no nonsingular characters (taking nonvanishing values on all simple root generators) except for the simplest case of $\mathfrak{sl}(2)$.

Our first main result is a family of new realizations of the quantum group $U_h(\mathfrak g)$, one for each Coxeter element in the corresponding Weyl group (see also [13]). The counterparts of $U(\mathfrak n_+)$, which naturally arise in these new realizations of $U_h(\mathfrak g)$, do have nonsingular characters. Using these new realizations we can immediately formulate a quantum group version of Definition A. We also prove counterparts of Theorems A and B for $U_h(\mathfrak g)$. Finally, we define quantum group generalizations of the Toda Hamiltonians. In the spirit of quantum harmonic analysis, these new Hamiltonians are difference operators. An alternative definition of these Hamiltonians was recently given in [6].

2.1. Quantum groups. In this section we recall some basic facts about quantum groups. We follow the notation of [2].
Let $h$ be an indeterminate, and let $\mathbb C[[h]]$ be the ring of formal power series in $h$. We consider $\mathbb C[[h]]$-modules equipped with the so-called $h$-adic topology. For every such module $V$, this topology is characterized by requiring that $\{h^n V \mid n \ge 0\}$ is a base of the neighbourhoods of zero in $V$ and that translations in $V$ are continuous. It is easy to see that, for modules equipped with this topology, every $\mathbb C[[h]]$-module map is automatically continuous.

A topological Hopf algebra over $\mathbb C[[h]]$ is a complete $\mathbb C[[h]]$-module $A$ equipped with a structure of $\mathbb C[[h]]$-Hopf algebra (see [2, Definition 4.3.1]); the algebraic tensor products entering the axioms of the Hopf algebra are replaced by their completions in the $h$-adic topology. We denote by $\mu$, $\imath$, $\Delta$, $\varepsilon$, and $S$ the multiplication, the unit, the comultiplication, the counit, and the antipode of $A$, respectively.

The standard quantum group $U_h(\mathfrak g)$ associated to a complex finite-dimensional simple Lie algebra $\mathfrak g$ is the algebra over $\mathbb C[[h]]$ topologically generated by elements
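$H_i$, $X_i^+$, $X_i^-$, $i = 1, \ldots, l$, and with the following defining relations:
$$
\begin{aligned}
&[H_i, H_j] = 0, \qquad [H_i, X_j^\pm] = \pm a_{ij} X_j^\pm, \\
&X_i^+ X_j^- - X_j^- X_i^+ = \delta_{i,j} \frac{K_i - K_i^{-1}}{q_i - q_i^{-1}},
\end{aligned} \tag{2.1.1}
$$
where $K_i = e^{d_i h H_i}$, $e^h = q$, $q_i = q^{d_i} = e^{d_i h}$, and the quantum Serre relations
$$
\sum_{r=0}^{1-a_{ij}} (-1)^r \begin{bmatrix} 1 - a_{ij} \\ r \end{bmatrix}_{q_i} (X_i^\pm)^{1-a_{ij}-r} X_j^\pm (X_i^\pm)^r = 0, \qquad i \neq j,
$$
where
$$
\begin{bmatrix} m \\ n \end{bmatrix}_q = \frac{[m]_q!}{[n]_q!\,[m-n]_q!}, \qquad [n]_q! = [n]_q \cdots [1]_q, \qquad [n]_q = \frac{q^n - q^{-n}}{q - q^{-1}}.
$$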
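The q-numbers and q-binomial coefficients entering the quantum Serre relations are easy to tabulate. The following is a minimal sketch, assuming a numerical value of $q$ (an exact rational, $q \neq \pm 1$) in place of the formal parameter $e^h$; the function names are illustrative and not taken from the paper.

```python
from fractions import Fraction


def q_int(n, q):
    """Symmetric q-number [n]_q = (q^n - q^-n) / (q - q^-1)."""
    q = Fraction(q)
    return (q**n - q**(-n)) / (q - 1 / q)


def q_factorial(n, q):
    """[n]_q! = [n]_q [n-1]_q ... [1]_q, with [0]_q! = 1."""
    result = Fraction(1)
    for k in range(1, n + 1):
        result *= q_int(k, q)
    return result


def q_binomial(m, n, q):
    """q-binomial coefficient [m]_q! / ([n]_q! [m-n]_q!)."""
    return q_factorial(m, q) / (q_factorial(n, q) * q_factorial(m - n, q))


def serre_coefficients(a_ij, q_i):
    """Coefficients (-1)^r [1-a_ij choose r]_{q_i}, r = 0..1-a_ij,
    of the monomials X_i^(1-a_ij-r) X_j X_i^r in the Serre relation."""
    m = 1 - a_ij
    return [(-1) ** r * q_binomial(m, r, q_i) for r in range(m + 1)]
```

For $a_{ij} = -1$ the list has three entries $1, -[2]_{q_i}, 1$, which reproduces the familiar cubic Serre relation $X_i^2 X_j - (q_i + q_i^{-1}) X_i X_j X_i + X_j X_i^2 = 0$.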
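$U_h(\mathfrak g)$ is a topological Hopf algebra over $\mathbb C[[h]]$ with comultiplication defined by
$$
\begin{aligned}
\Delta_h(H_i) &= H_i \otimes 1 + 1 \otimes H_i, \\
\Delta_h(X_i^+) &= X_i^+ \otimes K_i + 1 \otimes X_i^+, \\
\Delta_h(X_i^-) &= X_i^- \otimes 1 + K_i^{-1} \otimes X_i^-,
\end{aligned}
$$
antipode defined by
$$
S_h(H_i) = -H_i, \qquad S_h(X_i^+) = -X_i^+ K_i^{-1}, \qquad S_h(X_i^-) = -K_i X_i^-,
$$
and counit defined by
$$
\varepsilon_h(H_i) = \varepsilon_h(X_i^\pm) = 0.
$$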
We also use the weight-type generators defined by
$$
Y_i = \sum_{j=1}^{l} d_i (a^{-1})_{ij} H_j,
$$
and the elements $L_i = e^{hY_i}$. They commute with the root vectors $X_i^\pm$ as follows:
$$
L_i X_j^\pm L_i^{-1} = q_i^{\pm\delta_{ij}} X_j^\pm. \tag{2.1.2}
$$
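The commutation rule (2.1.2) can be checked directly from the relations (2.1.1). The derivation below is a sketch that uses only the definition of $Y_i$ and the identity $e^{A} B e^{-A} = e^{\operatorname{ad} A} B$, which we assume to hold in the $h$-adically complete algebra:

```latex
\begin{aligned}
[Y_i, X_j^{\pm}]
  &= \sum_{k=1}^{l} d_i (a^{-1})_{ik} [H_k, X_j^{\pm}]
   = \pm \sum_{k=1}^{l} d_i (a^{-1})_{ik} a_{kj} X_j^{\pm}
   = \pm d_i \delta_{ij} X_j^{\pm}, \\
L_i X_j^{\pm} L_i^{-1}
  &= e^{h \operatorname{ad} Y_i} X_j^{\pm}
   = e^{\pm h d_i \delta_{ij}} X_j^{\pm}
   = q_i^{\pm \delta_{ij}} X_j^{\pm}.
\end{aligned}
```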
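The Hopf algebra $U_h(\mathfrak g)$ is a quantization of the standard bialgebra structure on $\mathfrak g$; that is, $U_h(\mathfrak g)/hU_h(\mathfrak g) = U(\mathfrak g)$, $\Delta_h = \Delta \pmod h$, where $\Delta$ is the standard comultiplication on $U(\mathfrak g)$, and
$$
\frac{\Delta_h - \Delta_h^{\mathrm{opp}}}{h} \pmod h = \delta,
$$
where $\delta : \mathfrak g \to \mathfrak g \otimes \mathfrak g$ is the standard cocycle on $\mathfrak g$. Recall that
$$
\delta(x) = (\operatorname{ad}_x \otimes 1 + 1 \otimes \operatorname{ad}_x)\, 2r_+, \qquad r_+ \in \mathfrak g \otimes \mathfrak g,
$$
$$
r_+ = \frac{1}{2} \sum_{i=1}^{l} Y_i \otimes H_i + \sum_{\beta \in \Delta_+} (X_\beta, X_{-\beta})^{-1} X_\beta \otimes X_{-\beta}. \tag{2.1.3}
$$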
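Here $X_{\pm\beta} \in \mathfrak g_{\pm\beta}$ are root vectors of $\mathfrak g$. The element $r_+ \in \mathfrak g \otimes \mathfrak g$ is called a classical r-matrix. The following proposition describes the algebraic structure of $U_h(\mathfrak g)$.

Proposition 2.1.1 [2, Proposition 6.5.5]. Let $\mathfrak g$ be a finite-dimensional complex simple Lie algebra, and let $U_h(\mathfrak h)$ be the subalgebra of $U_h(\mathfrak g)$ topologically generated by the $H_i$, $i = 1, \ldots, l$. Then there is an isomorphism of algebras $\varphi : U_h(\mathfrak g) \to U(\mathfrak g)[[h]]$ over $\mathbb C[[h]]$ such that $\varphi = \mathrm{id} \pmod h$ and $\varphi|_{U_h(\mathfrak h)} = \mathrm{id}$.

Proposition 2.1.2 [2, Proposition 6.5.7]. If $\mathfrak g$ is a finite-dimensional complex simple Lie algebra, the center $Z_h(\mathfrak g)$ of $U_h(\mathfrak g)$ is canonically isomorphic to $Z(\mathfrak g)[[h]]$, where $Z(\mathfrak g)$ is the center of $U(\mathfrak g)$.

Corollary 2.1.3 [2, Corollary 6.5.6]. If $\mathfrak g$ is a finite-dimensional complex simple Lie algebra, then the assignment $V \mapsto V[[h]]$ is a one-to-one correspondence between the finite-dimensional irreducible representations of $\mathfrak g$ and indecomposable representations of $U_h(\mathfrak g)$ which are free and of finite rank as $\mathbb C[[h]]$-modules. Furthermore, for every such $V$ the action of the generators $H_i \in U_h(\mathfrak g)$, $i = 1, \ldots, l$, on $V[[h]]$ coincides with the action of the root generators $H_i \in \mathfrak h$, $i = 1, \ldots, l$.

The representations of $U_h(\mathfrak g)$ defined in Corollary 2.1.3 are called finite-dimensional representations of $U_h(\mathfrak g)$. For every finite-dimensional representation $\pi_V : \mathfrak g \to \operatorname{End} V$, we denote the corresponding representation of $U_h(\mathfrak g)$ in the space $V[[h]]$ by the same letter.

$U_h(\mathfrak g)$ is a quasitriangular Hopf algebra; that is, there exists an invertible element $\mathcal R \in U_h(\mathfrak g) \otimes U_h(\mathfrak g)$, called a universal R-matrix, such that
$$
\Delta_h^{\mathrm{opp}}(a) = \mathcal R\, \Delta_h(a)\, \mathcal R^{-1} \qquad \text{for all } a \in U_h(\mathfrak g), \tag{2.1.4}
$$
where $\Delta^{\mathrm{opp}} = \sigma\Delta$, $\sigma$ is the permutation in $U_h(\mathfrak g)^{\otimes 2}$, $\sigma(x \otimes y) = y \otimes x$, and
$$
(\Delta_h \otimes \mathrm{id})\mathcal R = \mathcal R_{13}\mathcal R_{23}, \qquad (\mathrm{id} \otimes \Delta_h)\mathcal R = \mathcal R_{13}\mathcal R_{12}, \tag{2.1.5}
$$
where $\mathcal R_{12} = \mathcal R \otimes 1$, $\mathcal R_{23} = 1 \otimes \mathcal R$, $\mathcal R_{13} = (\sigma \otimes \mathrm{id})\mathcal R_{23}$. From (2.1.4) and (2.1.5) it follows that $\mathcal R$ satisfies the quantum Yang-Baxter equation
$$
\mathcal R_{12}\mathcal R_{13}\mathcal R_{23} = \mathcal R_{23}\mathcal R_{13}\mathcal R_{12}. \tag{2.1.6}
$$
For every quasitriangular Hopf algebra we also have (see [2, Proposition 4.2.7])
$$
(S \otimes \mathrm{id})\mathcal R = \mathcal R^{-1} \quad \text{and} \quad (S \otimes S)\mathcal R = \mathcal R. \tag{2.1.7}
$$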
We explicitly describe the element $\mathcal R$. First, following [9] we recall the construction of root vectors of $U_h(\mathfrak g)$. We use the so-called normal ordering in the root system $\Delta_+ = \{\beta_1, \ldots, \beta_N\}$ (see [14]).

Definition 2.1.1. An ordering of the root system $\Delta_+$ is called normal if all simple roots are written in an arbitrary order, and for any three roots $\alpha, \beta, \gamma$ such that $\gamma = \alpha + \beta$ we have either $\alpha < \gamma < \beta$ or $\beta < \gamma < \alpha$.

To construct root vectors we apply the following inductive algorithm. Let $\alpha, \beta, \gamma \in \Delta_+$ be positive roots such that $\gamma = \alpha + \beta$, $\alpha < \beta$, and $[\alpha, \beta]$ is the minimal segment including $\gamma$; that is, the segment has no other roots $\alpha', \beta'$ such that $\gamma = \alpha' + \beta'$. Suppose that $X_\alpha^\pm$, $X_\beta^\pm$ have already been constructed. Then we define
$$
\begin{aligned}
X_\gamma^+ &= X_\alpha^+ X_\beta^+ - q^{(\alpha,\beta)} X_\beta^+ X_\alpha^+, \\
X_\gamma^- &= X_\beta^- X_\alpha^- - q^{-(\alpha,\beta)} X_\alpha^- X_\beta^-.
\end{aligned} \tag{2.1.8}
$$
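As an illustration of the inductive rule (2.1.8), consider $\mathfrak g = \mathfrak{sl}(3)$ with the normal ordering $\alpha_1 < \alpha_1 + \alpha_2 < \alpha_2$. The sketch below assumes the normalization $(\alpha_i, \alpha_i) = 2$, so that $(\alpha_1, \alpha_2) = -1$; this normalization is our assumption for the example and is not fixed in the passage above.

```latex
\begin{aligned}
X_{\alpha_1+\alpha_2}^{+} &= X_1^{+} X_2^{+} - q^{-1} X_2^{+} X_1^{+}, \\
X_{\alpha_1+\alpha_2}^{-} &= X_2^{-} X_1^{-} - q\, X_1^{-} X_2^{-}.
\end{aligned}
```

Modulo $h$ both expressions reduce to ordinary commutators, $[X_1^+, X_2^+]$ and $[X_2^-, X_1^-]$, which are root vectors of $\mathfrak{sl}(3)$ for the roots $\pm(\alpha_1 + \alpha_2)$.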
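Proposition 2.1.4. For $\beta = \sum_{i=1}^{l} m_i \alpha_i$, $m_i \in \mathbb N$, $X_\beta^\pm$ is a polynomial in the noncommutative variables $X_i^\pm$ homogeneous in each $X_i^\pm$ of degree $m_i$.

The root vectors $X_\beta$ satisfy the following relations:
$$
[X_\alpha^+, X_\alpha^-] = a(\alpha) \frac{e^{h\alpha^\vee} - e^{-h\alpha^\vee}}{q - q^{-1}},
$$
where $a(\alpha) \in \mathbb C[[h]]$. They commute with elements of the subalgebra $U_h(\mathfrak h)$ as follows:
$$
[H_i, X_\beta^\pm] = \pm\beta(H_i) X_\beta^\pm, \qquad i = 1, \ldots, l. \tag{2.1.9}
$$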
Note that, by construction, Xβ+ (mod h) = Xβ ∈ gβ , Xβ− (mod h) = X−β ∈ g−β are root vectors of g. This implies that a(α) (mod h) = (Xα , X−α ).
Let $U_h(\mathfrak n_+)$, $U_h(\mathfrak n_-)$ be the $\mathbb C[[h]]$-subalgebras of $U_h(\mathfrak g)$ topologically generated by the $X_i^+$ and by the $X_i^-$, respectively. Now using the root vectors $X_\beta^\pm$ we can construct a topological basis of $U_h(\mathfrak g)$. Define for $r = (r_1, \ldots, r_N) \in \mathbb N^N$,
$$
(X^+)^r = (X_{\beta_1}^+)^{r_1} \cdots (X_{\beta_N}^+)^{r_N}, \qquad (X^-)^r = (X_{\beta_1}^-)^{r_1} \cdots (X_{\beta_N}^-)^{r_N},
$$
and for $s = (s_1, \ldots, s_l) \in \mathbb N^l$,
$$
H^s = H_1^{s_1} \cdots H_l^{s_l}.
$$
Proposition 2.1.5 [9, Proposition 3.3]. The elements $(X^+)^r$, $(X^-)^t$, and $H^s$, for $r, t \in \mathbb N^N$, $s \in \mathbb N^l$, form topological bases of $U_h(\mathfrak n_+)$, $U_h(\mathfrak n_-)$, and $U_h(\mathfrak h)$, respectively, and the products $(X^+)^r H^s (X^-)^t$ form a topological basis of $U_h(\mathfrak g)$. In particular, multiplication defines an isomorphism of $\mathbb C[[h]]$-modules:
$$
U_h(\mathfrak n_-) \otimes U_h(\mathfrak h) \otimes U_h(\mathfrak n_+) \longrightarrow U_h(\mathfrak g).
$$
An explicit expression for $\mathcal R$ may be written by making use of the q-exponential
$$
\exp_q(x) = \sum_{k=0}^{\infty} \frac{x^k}{(k)_q!},
$$
where
$$
(k)_q! = (1)_q \cdots (k)_q, \qquad (n)_q = \frac{q^n - 1}{q - 1}.
$$
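The q-exponential is easy to experiment with as a truncated power series. The sketch below assumes a numerical rational value of $q$ (with $q \neq 0, 1$) and checks, order by order, the identity $\exp_q(x)\exp_{q^{-1}}(-x) = 1$. This identity is a standard fact of q-calculus that is not stated in the text; it is included here only as a consistency check, and the function names are illustrative.

```python
from fractions import Fraction


def q_number(n, q):
    """Non-symmetric q-number (n)_q = (q^n - 1) / (q - 1)."""
    q = Fraction(q)
    return (q**n - 1) / (q - 1)


def q_exp_coefficients(q, order):
    """Coefficients of x^0, ..., x^order in exp_q(x) = sum_k x^k / (k)_q!."""
    coeffs = []
    factorial = Fraction(1)
    for k in range(order + 1):
        if k > 0:
            factorial *= q_number(k, q)
        coeffs.append(1 / factorial)
    return coeffs


def inverse_check(q, order):
    """Coefficients of exp_q(x) * exp_{1/q}(-x) up to x^order."""
    a = q_exp_coefficients(q, order)
    b_plain = q_exp_coefficients(1 / Fraction(q), order)
    b = [(-1) ** k * c for k, c in enumerate(b_plain)]  # substitute x -> -x
    return [sum(a[i] * b[n - i] for i in range(n + 1)) for n in range(order + 1)]
```

The product series returned by `inverse_check` should have constant term 1 and vanishing higher coefficients, which is why inverses of q-exponential factors are again q-exponentials with $q$ replaced by $q^{-1}$.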
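Now the element $\mathcal R$ may be written as (see [9, Theorem 8.1]):
$$
\mathcal R = \exp\Bigl[h \sum_{i=1}^{l} (Y_i \otimes H_i)\Bigr] \prod_{\beta} \exp_{q_\beta^{-1}}\bigl[(q - q^{-1})\, a(\beta)^{-1} X_\beta^+ \otimes X_\beta^-\bigr], \tag{2.1.10}
$$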
where $q_\beta = q^{(\beta,\beta)}$; the product is over all the positive roots of $\mathfrak g$, and the order of the terms is such that the $\alpha$-term appears to the left of the $\beta$-term if $\alpha < \beta$ with respect to the normal ordering of $\Delta_+$.

Remark 2.1.2. The r-matrix $r_+ = (1/2)h^{-1}(\mathcal R - 1 \otimes 1) \pmod h$, which is the classical limit of $\mathcal R$, coincides with the classical r-matrix (2.1.3).

2.2. Nonsingular characters and quantum groups. In this section, following [13], we recall the construction of quantum counterparts of the principal nilpotent Lie subalgebras of complex simple Lie algebras and of their nonsingular characters. Subalgebras of $U_h(\mathfrak g)$ which resemble the subalgebra $U(\mathfrak n_+) \subset U(\mathfrak g)$ and possess nonsingular
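characters naturally appear in the Coxeter realizations of $U_h(\mathfrak g)$ defined in [13] as follows. Denote by $S_l$ the symmetric group of $l$ elements. To any element $\pi \in S_l$ we associate a Coxeter element $s_\pi$ by the formula $s_\pi = s_{\pi(1)} \cdots s_{\pi(l)}$. Let $U_h^{s_\pi}(\mathfrak g)$ be the associative algebra over $\mathbb C[[h]]$ with generators $e_i, f_i, H_i$, $i = 1, \ldots, l$, subject to the relations
$$
\begin{aligned}
&[H_i, H_j] = 0, \qquad [H_i, e_j] = a_{ij} e_j, \qquad [H_i, f_j] = -a_{ij} f_j, \\
&e_i f_j - q^{c_{ij}^\pi} f_j e_i = \delta_{i,j} \frac{K_i - K_i^{-1}}{q_i - q_i^{-1}}, \qquad K_i = e^{d_i h H_i}, \\
&\sum_{r=0}^{1-a_{ij}} (-1)^r q^{r c_{ij}^\pi} \begin{bmatrix} 1 - a_{ij} \\ r \end{bmatrix}_{q_i} (e_i)^{1-a_{ij}-r} e_j (e_i)^r = 0, \qquad i \neq j, \\
&\sum_{r=0}^{1-a_{ij}} (-1)^r q^{r c_{ij}^\pi} \begin{bmatrix} 1 - a_{ij} \\ r \end{bmatrix}_{q_i} (f_i)^{1-a_{ij}-r} f_j (f_i)^r = 0, \qquad i \neq j,
\end{aligned} \tag{2.2.1}
$$
where $c_{ij}^\pi = \bigl(\frac{1+s_\pi}{1-s_\pi}\alpha_i, \alpha_j\bigr)$ are matrix elements of the Cayley transform of $s_\pi$ in the basis of simple roots.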
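Theorem 2.2.1 [13, Theorem 4]. For every solution $n_{ij} \in \mathbb C$, $i, j = 1, \ldots, l$, of equations
$$
d_j n_{ij} - d_i n_{ji} = c_{ij}^\pi, \tag{2.2.2}
$$
there exists an algebra isomorphism $\psi_{\{n\}} : U_h^{s_\pi}(\mathfrak g) \to U_h(\mathfrak g)$ defined by the formulas
$$
\psi_{\{n\}}(e_i) = X_i^+ \prod_{p=1}^{l} L_p^{n_{ip}}, \qquad \psi_{\{n\}}(f_i) = \prod_{p=1}^{l} L_p^{-n_{ip}} X_i^-, \qquad \psi_{\{n\}}(H_i) = H_i.
$$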
We call the algebra $U_h^{s_\pi}(\mathfrak g)$ the Coxeter realization of the quantum group $U_h(\mathfrak g)$ corresponding to the Coxeter element $s_\pi$. Let $U_h^{s_\pi}(\mathfrak n_+)$ be the subalgebra in $U_h^{s_\pi}(\mathfrak g)$ generated by $e_i$, $i = 1, \ldots, l$.

Proposition 2.2.2 [13, Proposition 2]. The map $\chi_h^{s_\pi} : U_h^{s_\pi}(\mathfrak n_+) \to \mathbb C[[h]]$ defined on generators by $\chi_h^{s_\pi}(e_i) = c_i$, $c_i \in \mathbb C[[h]]$, $c_i \neq 0$, is a character of the algebra $U_h^{s_\pi}(\mathfrak n_+)$.

The proof of this proposition given in [13] is based on the following lemma.

Lemma 2.2.3 [13, Lemma 3]. The matrix elements of $(1 + s_\pi)/(1 - s_\pi)$ are of the form
$$
c_{ij}^\pi = \Bigl(\frac{1 + s_\pi}{1 - s_\pi}\alpha_i, \alpha_j\Bigr) = \varepsilon_{ij}^\pi b_{ij}, \tag{2.2.3}
$$
where
$$
\varepsilon_{ij}^\pi = \begin{cases} -1, & \pi^{-1}(i) < \pi^{-1}(j), \\ 0, & i = j, \\ 1, & \pi^{-1}(i) > \pi^{-1}(j). \end{cases}
$$
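Lemma 2.2.3 can be verified numerically for small root systems. The sketch below is restricted, as an assumption, to simply-laced types, where the Cartan matrix coincides with the Gram matrix $b_{ij} = (\alpha_i, \alpha_j)$ of the simple roots; indices are 0-based, the tuple `order` plays the role of $(\pi(1), \ldots, \pi(l))$, and the product $s_{\pi(1)} \cdots s_{\pi(l)}$ is read as a composition of operators (rightmost reflection applied first). All function names are illustrative.

```python
from fractions import Fraction
from itertools import permutations


def identity(n):
    return [[Fraction(int(r == c)) for c in range(n)] for r in range(n)]


def mat_mul(a, b):
    n = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]


def mat_inv(a):
    """Exact Gauss-Jordan inverse over the rationals."""
    n = len(a)
    aug = [[Fraction(x) for x in row] + identity(n)[r] for r, row in enumerate(a)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pv = aug[col][col]
        aug[col] = [x / pv for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def reflection(cartan, i):
    """Matrix of s_i in the basis of simple roots: s_i(a_j) = a_j - a_ij a_i."""
    m = identity(len(cartan))
    for j in range(len(cartan)):
        m[i][j] -= cartan[i][j]
    return m


def cayley_entries(cartan, order):
    """c_ij = ((1 + s)/(1 - s) a_i, a_j) for s = s_{order[0]} ... s_{order[-1]}."""
    n = len(cartan)
    s = identity(n)
    for i in order:
        s = mat_mul(s, reflection(cartan, i))
    one = identity(n)
    plus = [[one[r][c] + s[r][c] for c in range(n)] for r in range(n)]
    minus = [[one[r][c] - s[r][c] for c in range(n)] for r in range(n)]
    cay = mat_mul(plus, mat_inv(minus))
    # Column i of cay holds the coordinates of C(a_i); pair it with a_j.
    return [[sum(cay[k][i] * cartan[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def expected_entries(cartan, order):
    """Right-hand side eps_ij * b_ij of (2.2.3)."""
    pos = {i: p for p, i in enumerate(order)}  # pos[i] = pi^{-1}(i)
    n = len(cartan)
    def eps(i, j):
        return 0 if i == j else (-1 if pos[i] < pos[j] else 1)
    return [[eps(i, j) * cartan[i][j] for j in range(n)] for i in range(n)]


A2 = [[2, -1], [-1, 2]]
A3 = [[2, -1, 0], [-1, 2, -1], [0, -1, 2]]
```

For example, `cayley_entries(A2, (0, 1))` gives the antisymmetric matrix with $c_{12} = 1$ and $c_{21} = -1$, in agreement with $\varepsilon_{12}^\pi b_{12} = (-1)(-1)$.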
Now we study the algebraic structure of $U_h^{s_\pi}(\mathfrak g)$. Denote by $U_h^{s_\pi}(\mathfrak n_-)$ the subalgebra in $U_h^{s_\pi}(\mathfrak g)$ generated by $f_i$, $i = 1, \ldots, l$. From defining relations (2.2.1) it follows that the map $\bar\chi_h^{s_\pi} : U_h^{s_\pi}(\mathfrak n_-) \to \mathbb C[[h]]$ defined on generators by $\bar\chi_h^{s_\pi}(f_i) = c_i$, $c_i \in \mathbb C[[h]]$, $c_i \neq 0$, is a character of the algebra $U_h^{s_\pi}(\mathfrak n_-)$.

Let $U_h^{s_\pi}(\mathfrak h)$ be the subalgebra in $U_h^{s_\pi}(\mathfrak g)$ generated by $H_i$, $i = 1, \ldots, l$. Define $U_h^{s_\pi}(\mathfrak b_\pm) = U_h^{s_\pi}(\mathfrak n_\pm)U_h^{s_\pi}(\mathfrak h)$.

We construct a Poincaré-Birkhoff-Witt basis for $U_h^{s_\pi}(\mathfrak g)$. It is convenient to introduce an operator $K \in \operatorname{End} \mathfrak h$ such that
$$
KH_i = \sum_{j=1}^{l} \frac{n_{ij}}{d_i} Y_j. \tag{2.2.4}
$$
In particular, we have
$$
\frac{n_{ji}}{d_j} = (KH_j, H_i).
$$
Equation (2.2.2) is equivalent to the following equation for the operator $K$:
$$
K - K^* = \frac{1 + s_\pi}{1 - s_\pi}.
$$
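Equation (2.2.2) always has solutions. One explicit choice, given here as a sketch that is not taken from the text, uses only the antisymmetry $c_{ji}^\pi = -c_{ij}^\pi$, which holds because $s_\pi$ is orthogonal and the Cayley transform of an orthogonal operator is skew-symmetric:

```latex
n_{ij} = \frac{c_{ij}^{\pi}}{2 d_j}
\quad\Longrightarrow\quad
d_j n_{ij} - d_i n_{ji}
  = \frac{c_{ij}^{\pi}}{2} - \frac{c_{ji}^{\pi}}{2}
  = \frac{c_{ij}^{\pi} + c_{ij}^{\pi}}{2}
  = c_{ij}^{\pi}.
```

Any other solution differs from this one by a matrix $m_{ij}$ with $d_j m_{ij} = d_i m_{ji}$, which corresponds to changing $K$ by a self-adjoint operator and leaves $K - K^*$ unchanged.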
Proposition 2.2.4. (i) For any solution of equation (2.2.2) and any normal ordering of the root system $\Delta_+$, the root vectors $e_\beta = \psi_{\{n\}}^{-1}(X_\beta^+ e^{hK\beta^\vee})$ and $f_\beta = \psi_{\{n\}}^{-1}(e^{-hK\beta^\vee} X_\beta^-)$, $\beta \in \Delta_+$, lie in the subalgebras $U_h^{s_\pi}(\mathfrak n_+)$ and $U_h^{s_\pi}(\mathfrak n_-)$, respectively.
(ii) Moreover, the elements $e^r = e_{\beta_1}^{r_1} \cdots e_{\beta_N}^{r_N}$, $f^t = f_{\beta_1}^{t_1} \cdots f_{\beta_N}^{t_N}$, and $H^s = H_1^{s_1} \cdots H_l^{s_l}$ for $r, t \in \mathbb N^N$, $s \in \mathbb N^l$ form topological bases of $U_h^{s_\pi}(\mathfrak n_+)$, $U_h^{s_\pi}(\mathfrak n_-)$, and $U_h^{s_\pi}(\mathfrak h)$, and the products $f^t H^s e^r$ form a topological basis of $U_h^{s_\pi}(\mathfrak g)$. In particular, multiplication defines an isomorphism of $\mathbb C[[h]]$-modules:
$$
U_h^{s_\pi}(\mathfrak n_-) \otimes U_h^{s_\pi}(\mathfrak h) \otimes U_h^{s_\pi}(\mathfrak n_+) \longrightarrow U_h^{s_\pi}(\mathfrak g).
$$
Proof. Let $\beta = \sum_{i=1}^{l} m_i \alpha_i \in \Delta_+$ be a positive root, and let $X_\beta^+ \in U_h(\mathfrak g)$ be the corresponding root vector. Then $\beta^\vee = \sum_{i=1}^{l} m_i d_i H_i$, and so $K\beta^\vee = \sum_{i,j=1}^{l} m_i n_{ij} Y_j$. Now the proof of the first statement follows immediately from Proposition 2.1.4, commutation relations (2.1.2), and the definition of the isomorphism $\psi_{\{n\}}$. The second assertion is a consequence of Proposition 2.1.5.
Now we choose a normal ordering of the root system $\Delta_+$ in such a way that $\chi_h^{s_\pi}(e_\beta) = 0$ and $\bar\chi_h^{s_\pi}(f_\beta) = 0$ if $\beta$ is not a simple root.

Proposition 2.2.5. Choose a normal ordering of the root system $\Delta_+$ such that the simple roots are written in the following order: $\alpha_{\pi(1)}, \ldots, \alpha_{\pi(l)}$. Then $\chi_h^{s_\pi}(e_\beta) = 0$ and $\bar\chi_h^{s_\pi}(f_\beta) = 0$ if $\beta$ is not a simple root.

Proof. We consider the case of positive root generators. The proof for negative root generators is similar to that for the positive ones. The root vectors $X_\beta^+$ are defined in terms of iterated q-commutators (see (2.1.8)). Therefore, it suffices to verify that for $i < j$,
$$
\chi_h^{s_\pi}\bigl(e_{\alpha_{\pi(i)}+\alpha_{\pi(j)}}\bigr) = \chi_h^{s_\pi}\Bigl(\psi_{\{n\}}^{-1}\bigl(\bigl(X_{\pi(i)}^+ X_{\pi(j)}^+ - q^{(\alpha_{\pi(i)},\alpha_{\pi(j)})} X_{\pi(j)}^+ X_{\pi(i)}^+\bigr) e^{hK(d_{\pi(i)}H_{\pi(i)}+d_{\pi(j)}H_{\pi(j)})}\bigr)\Bigr) = 0.
$$
From (2.2.4) and commutation relations (2.1.2), we obtain that
$$
\begin{aligned}
&\psi_{\{n\}}^{-1}\bigl(\bigl(X_{\pi(i)}^+ X_{\pi(j)}^+ - q^{(\alpha_{\pi(i)},\alpha_{\pi(j)})} X_{\pi(j)}^+ X_{\pi(i)}^+\bigr) e^{hK(d_{\pi(i)}H_{\pi(i)}+d_{\pi(j)}H_{\pi(j)})}\bigr) \\
&\qquad = q^{-d_{\pi(j)} n_{\pi(i)\pi(j)}} \bigl(e_{\pi(i)} e_{\pi(j)} - q^{b_{\pi(i)\pi(j)} + d_{\pi(j)} n_{\pi(i)\pi(j)} - d_{\pi(i)} n_{\pi(j)\pi(i)}} e_{\pi(j)} e_{\pi(i)}\bigr).
\end{aligned} \tag{2.2.5}
$$
Now using equation (2.2.2) and Lemma 2.2.3 the combination $b_{\pi(i)\pi(j)} + d_{\pi(j)} n_{\pi(i)\pi(j)} - d_{\pi(i)} n_{\pi(j)\pi(i)}$ may be represented as $b_{\pi(i)\pi(j)} + \varepsilon_{\pi(i)\pi(j)}^\pi b_{\pi(i)\pi(j)}$. But $\varepsilon_{\pi(i)\pi(j)}^\pi = -1$ for $i < j$, and therefore the right-hand side of (2.2.5) takes the form $q^{-d_{\pi(j)} n_{\pi(i)\pi(j)}} [e_{\pi(i)}, e_{\pi(j)}]$. Clearly,
$$
\chi_h^{s_\pi}\bigl(e_{\alpha_{\pi(i)}+\alpha_{\pi(j)}}\bigr) = q^{-d_{\pi(j)} n_{\pi(i)\pi(j)}} \chi_h^{s_\pi}\bigl([e_{\pi(i)}, e_{\pi(j)}]\bigr) = 0.
$$
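2.3. Quantum deformation of the Whittaker model. In this section we define a quantum deformation of the Whittaker model $W(\mathfrak b_-)$. Our construction is similar to the one described in Section 1.2; the quantum group $U_h^{s_\pi}(\mathfrak g)$, the subalgebra $U_h^{s_\pi}(\mathfrak n_+)$, and characters $\chi_h^{s_\pi} : U_h^{s_\pi}(\mathfrak n_+) \to \mathbb C[[h]]$ serve as natural counterparts of the universal enveloping algebra $U(\mathfrak g)$, of the subalgebra $U(\mathfrak n_+)$, and of nonsingular characters $\chi : U(\mathfrak n_+) \to \mathbb C$, respectively.

Let $U_h^{s_\pi}(\mathfrak n_+)_{\chi_h^{s_\pi}}$ be the kernel of the character $\chi_h^{s_\pi} : U_h^{s_\pi}(\mathfrak n_+) \to \mathbb C[[h]]$ so that one has a direct sum
$$
U_h^{s_\pi}(\mathfrak n_+) = \mathbb C[[h]] \oplus U_h^{s_\pi}(\mathfrak n_+)_{\chi_h^{s_\pi}}.
$$
From Proposition 2.2.4 we have a linear isomorphism $U_h^{s_\pi}(\mathfrak g) = U_h^{s_\pi}(\mathfrak b_-) \otimes U_h^{s_\pi}(\mathfrak n_+)$ and hence the direct sum
$$
U_h^{s_\pi}(\mathfrak g) = U_h^{s_\pi}(\mathfrak b_-) \oplus I_{\chi_h^{s_\pi}}, \tag{2.3.1}
$$
where $I_{\chi_h^{s_\pi}} = U_h^{s_\pi}(\mathfrak g)\,U_h^{s_\pi}(\mathfrak n_+)_{\chi_h^{s_\pi}}$ is the left-sided ideal generated by $U_h^{s_\pi}(\mathfrak n_+)_{\chi_h^{s_\pi}}$.

For any $u \in U_h^{s_\pi}(\mathfrak g)$, let $u^{\chi_h^{s_\pi}} \in U_h^{s_\pi}(\mathfrak b_-)$ be its component in $U_h^{s_\pi}(\mathfrak b_-)$ relative to the decomposition (2.3.1). Denote by $\rho_{\chi_h^{s_\pi}}$ the linear map
$$
\rho_{\chi_h^{s_\pi}} : U_h^{s_\pi}(\mathfrak g) \longrightarrow U_h^{s_\pi}(\mathfrak b_-)
$$
given by $\rho_{\chi_h^{s_\pi}}(u) = u^{\chi_h^{s_\pi}}$.

Denote by $Z_h^{s_\pi}(\mathfrak g)$ the center of $U_h^{s_\pi}(\mathfrak g)$. From Proposition 2.1.2 and Theorem 2.2.1, we obtain that $Z_h^{s_\pi}(\mathfrak g) \cong Z(\mathfrak g)[[h]]$. In particular, $Z_h^{s_\pi}(\mathfrak g)$ is freely generated as a commutative topological algebra over $\mathbb C[[h]]$ by $l$ elements $I_1, \ldots, I_l$. Let $W_h(\mathfrak b_-) = \rho_{\chi_h^{s_\pi}}(Z_h^{s_\pi}(\mathfrak g))$.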
Theorem $A_h$. The map
$$
\rho_{\chi_h^{s_\pi}} : Z_h^{s_\pi}(\mathfrak g) \longrightarrow W_h(\mathfrak b_-) \tag{2.3.2}
$$
is an isomorphism of algebras. In particular, $W_h(\mathfrak b_-)$ is freely generated as a commutative topological algebra over $\mathbb C[[h]]$ by $l$ elements $I_i^{\chi_h^{s_\pi}} = \rho_{\chi_h^{s_\pi}}(I_i)$, $i = 1, \ldots, l$.
Proof. This proof is similar to that of Theorem A in the classical case.

Remark 2.3.3. Similarly to Remark 1.2.1 we have
$$
\rho_{\chi_h^{s_\pi}}(uv) = \rho_{\chi_h^{s_\pi}}(u)\,\rho_{\chi_h^{s_\pi}}(v)
$$
for any $u \in U_h^{s_\pi}(\mathfrak g)$, $v \in Z_h^{s_\pi}(\mathfrak g)$.

Definition $A_h$. The algebra $W_h(\mathfrak b_-)$ is called the Whittaker model of $Z_h^{s_\pi}(\mathfrak g)$.

Next we equip $U_h^{s_\pi}(\mathfrak b_-)$ with a structure of a left $U_h^{s_\pi}(\mathfrak n_+)$-module in such a way that $W_h(\mathfrak b_-)$ is identified with the space of invariants with respect to this action. Following Lemma A in the classical case, we define this action by
$$
x \cdot v = [x, v]^{\chi_h^{s_\pi}}, \tag{2.3.3}
$$
where $v \in U_h^{s_\pi}(\mathfrak b_-)$ and $x \in U_h^{s_\pi}(\mathfrak n_+)$.

Consider the space $U_h^{s_\pi}(\mathfrak b_-)^{U_h^{s_\pi}(\mathfrak n_+)}$ of $U_h^{s_\pi}(\mathfrak n_+)$-invariants in $U_h^{s_\pi}(\mathfrak b_-)$ with respect to this action. Clearly, $W_h(\mathfrak b_-) \subseteq U_h^{s_\pi}(\mathfrak b_-)^{U_h^{s_\pi}(\mathfrak n_+)}$.
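Theorem $B_h$. Suppose that $\chi_h^{s_\pi}(e_i) \neq 0 \pmod h$ for $i = 1, \ldots, l$. Then the space of $U_h^{s_\pi}(\mathfrak n_+)$-invariants in $U_h^{s_\pi}(\mathfrak b_-)$ with respect to the action (2.3.3) is isomorphic to $W_h(\mathfrak b_-)$; that is,
$$
U_h^{s_\pi}(\mathfrak b_-)^{U_h^{s_\pi}(\mathfrak n_+)} \cong W_h(\mathfrak b_-). \tag{2.3.4}
$$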
The proof of this theorem, as well as the proofs of many statements in this paper, is based on the following lemma.

Lemma 2.3.1. Let $V$ be a complete $\mathbb C[[h]]$-module, and let $A, B \subset V$ be two closed subspaces. Denote by $p : V \to V/hV$ the canonical projection. Suppose that $B \subseteq A$, $p(A) = p(B)$, and for any $a \in A$, $b \in B$ such that $a - b = hc$, $c \in V$, we have $c \in A$. Then $A = B$.

Proof. Let $a \in A$. Since $p(A) = p(B)$, one can find an element $b_0 \in B$ such that $a - b_0 = ha_1$, $a_1 \in A$. Applying the same procedure to $a_1$, one can find elements $b_1 \in B$, $a_2 \in A$ such that $a_1 - b_1 = ha_2$, that is, $a - b_0 - hb_1 = 0 \pmod{h^2}$. We can continue this process. Finally, we obtain an infinite sequence of elements $b_i \in B$ such that $a - \sum_{i=0}^{p} h^i b_i = 0 \pmod{h^{p+1}}$ for every $p \ge 0$. Since the subspace $B$ is closed in the $h$-adic topology, the series $\sum_{i=0}^{\infty} h^i b_i$ converges to an element of $B$, and this element is $a$. Therefore, $a \in B$. This completes the proof.

Proof of Theorem $B_h$. Let $p : U_h^{s_\pi}(\mathfrak g) \to U_h^{s_\pi}(\mathfrak g)/hU_h^{s_\pi}(\mathfrak g) = U(\mathfrak g)$ be the canonical projection. Note that $p(U_h^{s_\pi}(\mathfrak n_+)) = U(\mathfrak n_+)$, $p(U_h^{s_\pi}(\mathfrak b_-)) = U(\mathfrak b_-)$, and for every $x \in U_h^{s_\pi}(\mathfrak n_+)$, $\chi_h^{s_\pi}(x) \pmod h = \chi(p(x))$ for some nonsingular character $\chi : U(\mathfrak n_+) \to \mathbb C$. Therefore, $p(\rho_{\chi_h^{s_\pi}}(x)) = \rho_\chi(p(x))$ for every $x \in U_h^{s_\pi}(\mathfrak g)$, and hence, by Theorem $A_h$, $p(W_h(\mathfrak b_-)) = W(\mathfrak b_-)$. Using Lemma A and the definition of action (2.3.3), we also obtain that $p\bigl(U_h^{s_\pi}(\mathfrak b_-)^{U_h^{s_\pi}(\mathfrak n_+)}\bigr) = U(\mathfrak b_-)^{N_+} = W(\mathfrak b_-)$. Now Theorem $B_h$ follows immediately from Lemma 2.3.1 applied to $V = U_h^{s_\pi}(\mathfrak g)$, $A = U_h^{s_\pi}(\mathfrak b_-)^{U_h^{s_\pi}(\mathfrak n_+)}$, $B = W_h(\mathfrak b_-)$.

2.4. Quantum deformations of Whittaker modules. In this section we define quantum deformations of Whittaker modules. The construction of these modules is similar to that for Lie algebras (see Section 1.3).

Fix a Coxeter element $s_\pi \in W$. Let $V_h$ be a $U_h^{s_\pi}(\mathfrak g)$-module, which is also free as a $\mathbb C[[h]]$-module. The action is denoted by $uv$ for $u \in U_h^{s_\pi}(\mathfrak g)$, $v \in V_h$. Let $\chi_h^{s_\pi} : U_h^{s_\pi}(\mathfrak n_+) \to \mathbb C[[h]]$ be a nonsingular character of $U_h^{s_\pi}(\mathfrak n_+)$ (see Proposition 2.2.2).
We assume that $\chi_h^{s_\pi}(e_i) \neq 0 \pmod h$, $i = 1, \ldots, l$. We call a vector $w_h \in V_h$ a Whittaker vector (with respect to $\chi_h^{s_\pi}$) if
$$
xw_h = \chi_h^{s_\pi}(x)w_h
$$
for all x ∈ Uhsπ (n+ ). A Whittaker vector wh is called a cyclic Whittaker vector (for Vh ) if Uhsπ (g)wh = Vh . A Uhsπ (g)-module Vh is called a Whittaker module if it contains a cyclic Whittaker vector.
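Remark 2.4.4. Note that in this case $V = V_h/hV_h$ is naturally a Whittaker module for $U(\mathfrak g)$, $w = w_h \pmod h \in V$ being a cyclic Whittaker vector for $V$ with respect to the nonsingular character $\chi$ of $U(\mathfrak n_+)$ defined by $\chi(e_i \pmod h) = \chi_h^{s_\pi}(e_i) \pmod h$.

If $V_h$ is any $U_h^{s_\pi}(\mathfrak g)$-module, we let $U_{h,V}^{s_\pi}(\mathfrak g)$ be the annihilator of $V_h$. Then $U_{h,V}^{s_\pi}(\mathfrak g)$ defines a central ideal $Z_{h,V}^{s_\pi}(\mathfrak g)$ by putting
$$
Z_{h,V}^{s_\pi}(\mathfrak g) = Z_h^{s_\pi}(\mathfrak g) \cap U_{h,V}^{s_\pi}(\mathfrak g). \tag{2.4.1}
$$
Now assume that $V_h$ is a Whittaker module for $U_h^{s_\pi}(\mathfrak g)$ and $w_h \in V_h$ is a cyclic Whittaker vector. Let $U_{h,w}^{s_\pi}(\mathfrak g) \subseteq U_h^{s_\pi}(\mathfrak g)$ be the annihilator of $w_h$. Thus, $U_{h,V}^{s_\pi}(\mathfrak g) \subseteq U_{h,w}^{s_\pi}(\mathfrak g)$, where $U_{h,w}^{s_\pi}(\mathfrak g)$ is a left ideal and $U_{h,V}^{s_\pi}(\mathfrak g)$ is a two-sided ideal in $U_h^{s_\pi}(\mathfrak g)$. One has $V_h = U_h^{s_\pi}(\mathfrak g)/U_{h,w}^{s_\pi}(\mathfrak g)$ as $U_h^{s_\pi}(\mathfrak g)$-modules so that $V_h$ is determined up to equivalence by $U_{h,w}^{s_\pi}(\mathfrak g)$. Clearly, $I_{\chi_h^{s_\pi}} = U_h^{s_\pi}(\mathfrak g)\,U_h^{s_\pi}(\mathfrak n_+)_{\chi_h^{s_\pi}} \subseteq U_{h,w}^{s_\pi}(\mathfrak g)$ and $U_h^{s_\pi}(\mathfrak g)Z_{h,V}^{s_\pi}(\mathfrak g) \subseteq U_{h,w}^{s_\pi}(\mathfrak g)$.

Theorem $C_h$, similar to Theorem C for Lie algebras, says that, up to equivalence, $V_h$ is determined by the central ideal $Z_{h,V}^{s_\pi}(\mathfrak g)$.

Theorem $C_h$. Let $V_h$ be any $U_h^{s_\pi}(\mathfrak g)$-module that admits a cyclic Whittaker vector $w_h$, and let $U_{h,w}^{s_\pi}(\mathfrak g)$ be the annihilator of $w_h$. Then
$$
U_{h,w}^{s_\pi}(\mathfrak g) = U_h^{s_\pi}(\mathfrak g)Z_{h,V}^{s_\pi}(\mathfrak g) + I_{\chi_h^{s_\pi}}. \tag{2.4.2}
$$
Proof. Denote by $p : U_h^{s_\pi}(\mathfrak g) \to U_h^{s_\pi}(\mathfrak g)/hU_h^{s_\pi}(\mathfrak g) \cong U(\mathfrak g)$ the canonical projection. Then we have $p(U_{h,w}^{s_\pi}(\mathfrak g)) = U_w(\mathfrak g)$, where $U_w(\mathfrak g)$ is the annihilator of the Whittaker vector $w = w_h \pmod h$ in the Whittaker module $V = V_h/hV_h$ for $U(\mathfrak g)$ (see Remark 2.4.4). On the other hand from Theorem C we have $U_w(\mathfrak g) = U(\mathfrak g)Z_V(\mathfrak g) + I_\chi$. But clearly $p\bigl(U_h^{s_\pi}(\mathfrak g)Z_{h,V}^{s_\pi}(\mathfrak g) + I_{\chi_h^{s_\pi}}\bigr) = U(\mathfrak g)Z_V(\mathfrak g) + I_\chi$. Now the result of the theorem follows immediately from Lemma 2.3.1 applied to $A = U_{h,w}^{s_\pi}(\mathfrak g)$ and $B = U_h^{s_\pi}(\mathfrak g)Z_{h,V}^{s_\pi}(\mathfrak g) + I_{\chi_h^{s_\pi}}$.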
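We also have the following quantum counterpart of Lemma B. We use the notation of Section 2.3. If $X \subseteq U_h^{s_\pi}(\mathfrak g)$, let $X^{\chi_h^{s_\pi}} = \rho_{\chi_h^{s_\pi}}(X)$. Note that $U_{h,w}^{s_\pi}(\mathfrak g)$ is stable under the map $u \mapsto \rho_{\chi_h^{s_\pi}}(u)$. We recall also that, by Theorem $A_h$, $\rho_{\chi_h^{s_\pi}}$ induces an algebra isomorphism $Z_h^{s_\pi}(\mathfrak g) \to W_h(\mathfrak b_-)$, where $W_h(\mathfrak b_-) = Z_h^{s_\pi}(\mathfrak g)^{\chi_h^{s_\pi}}$. Thus, if $Z_{h,*}$ is any ideal in $Z_h^{s_\pi}(\mathfrak g)$, then $W_{h,*}(\mathfrak b_-) = Z_{h,*}^{\chi_h^{s_\pi}}$ is an isomorphic ideal in $W_h(\mathfrak b_-)$. But $\bigl(U_h^{s_\pi}(\mathfrak g)Z_{h,*}\bigr)^{\chi_h^{s_\pi}} = U_h^{s_\pi}(\mathfrak b_-)W_{h,*}(\mathfrak b_-)$ by Remark 2.3.3. Thus by (2.3.1) one has the direct sum
$$
U_h^{s_\pi}(\mathfrak g)Z_{h,*} + I_{\chi_h^{s_\pi}} = U_h^{s_\pi}(\mathfrak b_-)W_{h,*}(\mathfrak b_-) \oplus I_{\chi_h^{s_\pi}}. \tag{2.4.3}
$$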
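Lemma $B_h$. Let $X = \{v \in U_h^{s_\pi}(\mathfrak b_-) \mid (x \cdot v)w_h = 0 \text{ for all } x \in U_h^{s_\pi}(\mathfrak n_+)\}$, where $x \cdot v$ is given by (2.3.3). Then
$$
X = U_h^{s_\pi}(\mathfrak b_-)W_{h,V}(\mathfrak b_-) + W_h(\mathfrak b_-), \tag{2.4.4}
$$
where $W_{h,V}(\mathfrak b_-) = Z_{h,V}^{s_\pi}(\mathfrak g)^{\chi_h^{s_\pi}}$. Furthermore, if we denote $U_{h,w}^{s_\pi}(\mathfrak b_-) = U_{h,w}^{s_\pi}(\mathfrak g) \cap U_h^{s_\pi}(\mathfrak b_-)$, then
$$
U_{h,w}^{s_\pi}(\mathfrak b_-) = U_h^{s_\pi}(\mathfrak b_-)W_{h,V}(\mathfrak b_-). \tag{2.4.5}
$$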
Proof. This is similar to that of Theorem $C_h$: One should apply Lemma 2.3.1 to $A = X$ and $B = U_h^{s_\pi}(\mathfrak b_-)W_{h,V}(\mathfrak b_-) + W_h(\mathfrak b_-)$.

Now one can determine, up to equivalence, the set of all Whittaker modules for $U_h^{s_\pi}(\mathfrak g)$. They are naturally parametrized by the set of all ideals in the center $Z_h^{s_\pi}(\mathfrak g)$.

Remark 2.4.5. The proofs of Theorems $D_h$, $E_h$, $F_h$, and $G_h$ below are based on Theorems $A_h$, $C_h$, and Lemma $B_h$ and are completely similar to the proofs of Theorems D, E, F, and G in the classical case (see Section 1.3). We do not reproduce these proofs in this section.

Theorem $D_h$. Let $V_h$ be any Whittaker module for $U_h^{s_\pi}(\mathfrak g)$. Let $U_{h,V}^{s_\pi}(\mathfrak g)$ be the annihilator of $V_h$, and let $Z_h^{s_\pi}(\mathfrak g)$ be the center of $U_h^{s_\pi}(\mathfrak g)$. Then the correspondence
$$
V_h \longrightarrow Z_{h,V}^{s_\pi}(\mathfrak g), \tag{2.4.6}
$$
where $Z_{h,V}^{s_\pi}(\mathfrak g) = U_{h,V}^{s_\pi}(\mathfrak g) \cap Z_h^{s_\pi}(\mathfrak g)$, sets up a bijection between the set of all equivalence classes of Whittaker modules and the set of all ideals in $Z_h^{s_\pi}(\mathfrak g)$.
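Now consider the subalgebra $Z_h^{s_\pi}(\mathfrak g)U_h^{s_\pi}(\mathfrak n_+)$ in $U_h^{s_\pi}(\mathfrak g)$. Using the arguments applied in the proof of Theorem $A_h$, it is easy to show that $Z_h^{s_\pi}(\mathfrak g)U_h^{s_\pi}(\mathfrak n_+) \cong Z_h^{s_\pi}(\mathfrak g) \otimes U_h^{s_\pi}(\mathfrak n_+)$. Now let $Z_{h,*}$ be any ideal in $Z_h^{s_\pi}(\mathfrak g)$, and regard $Z_h^{s_\pi}(\mathfrak g)/Z_{h,*}$ as a $Z_h^{s_\pi}(\mathfrak g)$-module. Equip $Z_h^{s_\pi}(\mathfrak g)/Z_{h,*}$ with a structure of $(Z_h^{s_\pi}(\mathfrak g) \otimes U_h^{s_\pi}(\mathfrak n_+))$-module by $(u \otimes v)y = \chi_h^{s_\pi}(v)uy$, where $u \in Z_h^{s_\pi}(\mathfrak g)$, $v \in U_h^{s_\pi}(\mathfrak n_+)$, $y \in Z_h^{s_\pi}(\mathfrak g)/Z_{h,*}$. We denote this module by $(Z_h^{s_\pi}(\mathfrak g)/Z_{h,*})_{\chi_h^{s_\pi}}$.

The following result is another way of expressing Theorem $D_h$.

Theorem $E_h$. Let $V_h$ be any $U_h^{s_\pi}(\mathfrak g)$-module. Then $V_h$ is a Whittaker module if and only if one has an isomorphism
$$
V_h \cong U_h^{s_\pi}(\mathfrak g) \otimes_{Z_h^{s_\pi}(\mathfrak g) \otimes U_h^{s_\pi}(\mathfrak n_+)} \bigl(Z_h^{s_\pi}(\mathfrak g)/Z_{h,*}\bigr)_{\chi_h^{s_\pi}} \tag{2.4.7}
$$
of $U_h^{s_\pi}(\mathfrak g)$-modules. Furthermore, in such a case the ideal $Z_{h,*}$ is unique and is given by $Z_{h,*} = Z_{h,V}^{s_\pi}(\mathfrak g)$, where $Z_{h,V}^{s_\pi}(\mathfrak g)$ is defined by (2.4.1).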
Note that the question of reducibility for $U_h^{s_\pi}(\mathfrak g)$-modules that are free as $\mathbb C[[h]]$-modules does not make any sense since if $V_h$ is such a module, then $hV_h$ is a proper subrepresentation. However, it is natural to study indecomposable modules for $U_h^{s_\pi}(\mathfrak g)$. Below we describe all indecomposable Whittaker modules for $U_h^{s_\pi}(\mathfrak g)$.

First, we determine all the Whittaker vectors in a Whittaker module for $U_h^{s_\pi}(\mathfrak g)$.

Theorem $F_h$. Let $V_h$ be any $U_h^{s_\pi}(\mathfrak g)$-module with a cyclic Whittaker vector $w_h \in V_h$. Then any $v \in V_h$ is a Whittaker vector if and only if $v$ is of the form $v = uw_h$, where $u \in Z_h^{s_\pi}(\mathfrak g)$. Thus the space of all Whittaker vectors in $V_h$ is a cyclic $Z_h^{s_\pi}(\mathfrak g)$-module that is isomorphic to $Z_h^{s_\pi}(\mathfrak g)/Z_{h,V}^{s_\pi}(\mathfrak g)$.

If $V_h$ is any $U_h^{s_\pi}(\mathfrak g)$-module, then $\operatorname{End}_{U_h} V_h$ denotes the algebra of operators on $V_h$ which commute with the action of $U_h^{s_\pi}(\mathfrak g)$. If $\pi_V : U_h^{s_\pi}(\mathfrak g) \to \operatorname{End} V_h$ is the representation defining the $U_h^{s_\pi}(\mathfrak g)$-module structure on $V_h$, then clearly $\pi_V(Z_h^{s_\pi}(\mathfrak g)) \subseteq \operatorname{End}_{U_h} V_h$. Furthermore, it is also clear that $\pi_V(Z_h^{s_\pi}(\mathfrak g)) \cong Z_h^{s_\pi}(\mathfrak g)/Z_{h,V}^{s_\pi}(\mathfrak g)$.

Theorem $G_h$. Let $V_h$ be a Whittaker module. Then $\operatorname{End}_{U_h} V_h = \pi_V(Z_h^{s_\pi}(\mathfrak g))$. In particular, one has an isomorphism
$$
\operatorname{End}_{U_h} V_h \cong Z_h^{s_\pi}(\mathfrak g)/Z_{h,V}^{s_\pi}(\mathfrak g).
$$
Note that $\operatorname{End}_{U_h} V_h$ is commutative.

Now one can describe all indecomposable Whittaker modules for $U_h^{s_\pi}(\mathfrak g)$. A homomorphism
$$
\xi_h : Z_h^{s_\pi}(\mathfrak g) \longrightarrow \mathbb C[[h]]
$$
is called a central character. Given a central character $\xi_h$, let $Z_{h,\xi_h}^{s_\pi}(\mathfrak g) = \operatorname{Ker} \xi_h$ so that $Z_{h,\xi_h}^{s_\pi}(\mathfrak g)$ is a typical central ideal in $Z_h^{s_\pi}(\mathfrak g)$.

If $V_h$ is any $U_h^{s_\pi}(\mathfrak g)$-module, one says that $V_h$ admits an infinitesimal character, and $\xi_h$ is its infinitesimal character if $\xi_h$ is a central character such that $uv = \xi_h(u)v$ for all $u \in Z_h^{s_\pi}(\mathfrak g)$, $v \in V_h$.

Given a central character $\xi_h$, let $\mathbb C[[h]]_{\xi_h,\chi_h^{s_\pi}}$ be the 1-dimensional $(Z_h^{s_\pi}(\mathfrak g) \otimes U_h^{s_\pi}(\mathfrak n_+))$-module defined so that if $u \in Z_h^{s_\pi}(\mathfrak g)$, $v \in U_h^{s_\pi}(\mathfrak n_+)$, $y \in \mathbb C[[h]]_{\xi_h,\chi_h^{s_\pi}}$, then $(u \otimes v)y = \xi_h(u)\chi_h^{s_\pi}(v)y$. Also let
$$
Y_{\xi_h,\chi_h^{s_\pi}} = U_h^{s_\pi}(\mathfrak g) \otimes_{Z_h^{s_\pi}(\mathfrak g) \otimes U_h^{s_\pi}(\mathfrak n_+)} \mathbb C[[h]]_{\xi_h,\chi_h^{s_\pi}}.
$$
It is clear that $Y_{\xi_h,\chi_h^{s_\pi}}$ admits an infinitesimal character and $\xi_h$ is that character.
Theorem $H_h$. Let $V_h$ be any Whittaker module for $U_h^{s_\pi}(\mathfrak g)$. Then the following conditions are equivalent.
(1) $V_h$ is an indecomposable $U_h^{s_\pi}(\mathfrak g)$-module.
(2) $V_h$ admits an infinitesimal character.
(3) The corresponding ideal given by Theorem $D_h$ is a maximal ideal.
(4) The space of Whittaker vectors in $V_h$ is 1-dimensional (over $\mathbb C[[h]]$).
(5) The centralizer $\operatorname{End}_{U_h} V_h$ reduces to $\mathbb C[[h]]$.
(6) $V_h$ is isomorphic to $Y_{\xi_h,\chi_h^{s_\pi}}$ for some central character $\xi_h$.
Proof. The equivalence of (2), (3), (4), and (5) follows from Theorems $F_h$ and $G_h$. One gets the equivalence with (6) by Theorem $E_h$. It remains to relate (2)–(6) with (1).

First, we prove that (1) implies (4). Suppose that $V_h$ is indecomposable. Then the corresponding Whittaker module $V = V_h/hV_h$ for $U(\mathfrak g)$ is irreducible, and hence by Theorem H the space of Whittaker vectors of $V$ is 1-dimensional. We denote this space by $\mathrm{Wh}(V)$. Let $w_h$ be a cyclic Whittaker vector for $V_h$. Denote $\mathrm{Wh}'(V_h) = \mathbb C[[h]]w_h$. Then clearly $\mathrm{Wh}'(V_h) \pmod h = \mathrm{Wh}(V)$. On the other hand, if $\mathrm{Wh}(V_h)$ is the space of all Whittaker vectors in $V_h$, then $\mathrm{Wh}(V_h) \pmod h = \mathrm{Wh}(V)$. Now from Lemma 2.3.1 applied to $A = \mathrm{Wh}(V_h)$ and $B = \mathrm{Wh}'(V_h)$ it follows that $\mathrm{Wh}(V_h) = \mathrm{Wh}'(V_h) = \mathbb C[[h]]w_h$.

Now we prove that (6) implies (1). Assume that (6) is satisfied. Then, by construction, the corresponding Whittaker module $V = V_h/hV_h$ for $U(\mathfrak g)$ is isomorphic to $Y_{\xi,\chi}$ for some central character $\xi$. Now suppose that $V_h$ is decomposable. Then $V$ must be reducible, which is impossible by Theorem H. This completes the proof of Theorem $H_h$.

2.5. Coxeter realizations of quantum groups and Drinfeld twist. In this section we show that the Coxeter realizations $U_h^{s_\pi}(\mathfrak g)$ of the quantum group $U_h(\mathfrak g)$ are connected with quantizations of some nonstandard bialgebra structures on $\mathfrak g$. At the quantum level, changing bialgebra structure corresponds to the so-called Drinfeld twist. We consider a particular class of such twists described in the following proposition.

Proposition 2.5.1 [2, Proposition 4.2.13]. Let $(A, \mu, \imath, \Delta, \varepsilon, S)$ be a Hopf algebra over a commutative ring. Let $\mathcal F$ be an invertible element of $A \otimes A$ such that
$$
\begin{aligned}
\mathcal F_{12}(\Delta \otimes \mathrm{id})(\mathcal F) &= \mathcal F_{23}(\mathrm{id} \otimes \Delta)(\mathcal F), \\
(\varepsilon \otimes \mathrm{id})(\mathcal F) &= (\mathrm{id} \otimes \varepsilon)(\mathcal F) = 1.
\end{aligned} \tag{2.5.1}
$$
Then $v = \mu(\mathrm{id} \otimes S)(\mathcal F)$ is an invertible element of $A$ with
$$
v^{-1} = \mu(S \otimes \mathrm{id})(\mathcal F^{-1}).
$$
Moreover, if we define $\Delta^{\mathcal F} : A \to A \otimes A$ and $S^{\mathcal F} : A \to A$ by
$$
\Delta^{\mathcal F}(a) = \mathcal F\Delta(a)\mathcal F^{-1}, \qquad S^{\mathcal F}(a) = vS(a)v^{-1},
$$
then $(A, \mu, \imath, \Delta^{\mathcal F}, \varepsilon, S^{\mathcal F})$ is a Hopf algebra denoted by $A^{\mathcal F}$ and called the twist of $A$ by $\mathcal F$.

Corollary 2.5.2 [2, Corollary 4.2.15]. Suppose that $A$ and $\mathcal F$ are as in Proposition 2.5.1, but assume in addition that $A$ is quasitriangular with universal R-matrix $\mathcal R$. Then $A^{\mathcal F}$ is quasitriangular with universal R-matrix
$$
\mathcal R^{\mathcal F} = \mathcal F_{21}\mathcal R\mathcal F^{-1}, \tag{2.5.2}
$$
where $\mathcal F_{21} = \sigma\mathcal F$.
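Fix a Coxeter element $s_\pi \in W$, $s_\pi = s_{\pi(1)} \cdots s_{\pi(l)}$. Consider the twist of the Hopf algebra $U_h(\mathfrak g)$ by the element
$$
\mathcal F = \exp\Bigl(-h \sum_{i,j=1}^{l} \frac{n_{ji}}{d_j} Y_i \otimes Y_j\Bigr) \in U_h(\mathfrak h) \otimes U_h(\mathfrak h), \tag{2.5.3}
$$
where $n_{ij}$ is a solution of the corresponding equation (2.2.2). This element satisfies conditions (2.5.1), and so $U_h(\mathfrak g)^{\mathcal F}$ is a quasitriangular Hopf algebra with the universal R-matrix $\mathcal R^{\mathcal F} = \mathcal F_{21}\mathcal R\mathcal F^{-1}$, where $\mathcal R$ is given by (2.1.10). We explicitly calculate the element $\mathcal R^{\mathcal F}$. Substituting (2.1.10) and (2.5.3) into (2.5.2) and using (2.1.9), we obtain
$$
\begin{aligned}
\mathcal R^{\mathcal F} = {} & \exp\Bigl[h\Bigl(\sum_{i=1}^{l} Y_i \otimes H_i + \sum_{i,j=1}^{l} \Bigl(-\frac{n_{ij}}{d_i} + \frac{n_{ji}}{d_j}\Bigr) Y_i \otimes Y_j\Bigr)\Bigr] \\
& \times \prod_{\beta} \exp_{q_\beta^{-1}}\bigl[(q - q^{-1})\, a(\beta)^{-1} X_\beta^+ e^{hK\beta^\vee} \otimes e^{-hK^*\beta^\vee} X_\beta^-\bigr],
\end{aligned}
$$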
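where $K$ is defined by (2.2.4).

Equip $U_h^{s_\pi}(\mathfrak g)$ with the comultiplication given by $\Delta_{s_\pi}(x) = (\psi_{\{n\}}^{-1} \otimes \psi_{\{n\}}^{-1})\Delta_h^{\mathcal F}(\psi_{\{n\}}(x))$. Then $U_h^{s_\pi}(\mathfrak g)$ becomes a quasitriangular Hopf algebra with the universal R-matrix $\mathcal R^{s_\pi} = (\psi_{\{n\}}^{-1} \otimes \psi_{\{n\}}^{-1})\mathcal R^{\mathcal F}$. Using equation (2.2.2) and Lemma 2.2.3 this R-matrix may be written as
$$
\begin{aligned}
\mathcal R^{s_\pi} = {} & \exp\Bigl[h\Bigl(\sum_{i=1}^{l} Y_i \otimes H_i + \sum_{i=1}^{l} \frac{1 + s_\pi}{1 - s_\pi} H_i \otimes Y_i\Bigr)\Bigr] \\
& \times \prod_{\beta} \exp_{q_\beta^{-1}}\bigl[(q - q^{-1})\, a(\beta)^{-1} e_\beta \otimes e^{h((1+s_\pi)/(1-s_\pi))\beta^\vee} f_\beta\bigr].
\end{aligned} \tag{2.5.4}
$$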
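The element $\mathcal R^{s_\pi}$ may be also represented in the form
\[
\mathcal R^{s_\pi} = \exp\Bigl[h\sum_{i=1}^{l} Y_i\otimes H_i\Bigr]
\times \prod_{\beta}\exp_{q_\beta^{-1}}\Bigl[(q-q^{-1})\,a(\beta)^{-1}\,e_\beta e^{-h\frac{1+s_\pi}{1-s_\pi}\beta^\vee}\otimes f_\beta\Bigr]
\times \exp\Bigl[h\sum_{i=1}^{l}\frac{1+s_\pi}{1-s_\pi}H_i\otimes Y_i\Bigr]. \tag{2.5.5}
\]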
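The comultiplication $\Delta_{s_\pi}$ is given on generators by
\[
\begin{aligned}
\Delta_{s_\pi}(H_i) &= H_i\otimes 1 + 1\otimes H_i,\\
\Delta_{s_\pi}(e_i) &= e_i\otimes e^{hd_i\frac{2}{1-s_\pi}H_i} + 1\otimes e_i,\\
\Delta_{s_\pi}(f_i) &= f_i\otimes e^{-hd_i\frac{1+s_\pi}{1-s_\pi}H_i} + e^{-hd_iH_i}\otimes f_i.
\end{aligned}
\]
Note that the Hopf algebra $U_h^{s_\pi}(\mathfrak g)$ is a quantization of the bialgebra structure on $\mathfrak g$ defined by the cocycle
\[
\delta(x) = (\mathrm{ad}_x\otimes 1 + 1\otimes\mathrm{ad}_x)\,2r_+^{s_\pi},\qquad r_+^{s_\pi}\in\mathfrak g\otimes\mathfrak g, \tag{2.5.6}
\]
where $r_+^{s_\pi} = r_+ + \frac12\sum_{i=1}^{l}\frac{1+s_\pi}{1-s_\pi}H_i\otimes Y_i$, and $r_+$ is given by (2.1.3).

We also need the following property of the antipode $S^{s_\pi}$ of $U_h^{s_\pi}(\mathfrak g)$.

Proposition 2.5.3. The square of the antipode $S^{s_\pi}$ is an inner automorphism of $U_h^{s_\pi}(\mathfrak g)$ given by
\[
(S^{s_\pi})^2(x) = e^{2h\rho^\vee}\,x\,e^{-2h\rho^\vee},
\]
where $\rho^\vee = \sum_{i=1}^{l} Y_i$.

Proof. First, observe that by Proposition 2.5.1 the antipode of $U_h^{s_\pi}(\mathfrak g)$ has the form
\[
S^{s_\pi}(x) = \psi_{\{n\}}^{-1}\bigl(v\,S_h(\psi_{\{n\}}(x))\,v^{-1}\bigr),
\qquad\text{where}\qquad
v = \exp\Bigl(h\sum_{i,j=1}^{l}\frac{n_{ji}}{d_j}\,Y_iY_j\Bigr).
\]
Therefore, $(S^{s_\pi})^2(x) = \psi_{\{n\}}^{-1}\bigl(v\,S_h(v^{-1})\,S_h^2(\psi_{\{n\}}(x))\,S_h(v)\,v^{-1}\bigr)$. Note that $S_h(v) = v$, and hence $(S^{s_\pi})^2(x) = \psi_{\{n\}}^{-1}\bigl(S_h^2(\psi_{\{n\}}(x))\bigr)$. Finally, observe that from explicit formulas for the antipode of $U_h(\mathfrak g)$ it follows that $S_h^2(x) = e^{2h\rho^\vee}xe^{-2h\rho^\vee}$. This completes the proof.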
In conclusion, we note that using Corollary 2.1.3 and the isomorphism $\psi_{\{n\}}$ one can define finite-dimensional representations of $U_h^{s_\pi}(\mathfrak g)$.

2.6. Quantum deformation of the Toda lattice. Recall that one of the main applications of the algebra $W(\mathfrak b_-)$ is the quantum Toda lattice (see [12]). Let $\chi : \mathfrak n_- \to \mathbb C$ be a nonsingular character of the opposite nilpotent subalgebra $\mathfrak n_-$. We denote the character of $N_-$ corresponding to $\chi$ by the same letter. The algebra $U(\mathfrak b_-)$ naturally acts by differential operators in the space $C^\infty(\mathbb C_\chi\otimes_{N_-}B_-)$. This space may be identified with $C^\infty(H)$. Let $D_1,\dots,D_l$ be the differential operators on $C^\infty(H)$ which correspond to the elements $I_1^\chi,\dots,I_l^\chi \in W(\mathfrak b_-)$. Denote by $\varphi$ the operator of multiplication in $C^\infty(H)$ by the function $\varphi(e^h) = e^{\rho(h)}$, where $h\in\mathfrak h$. The operators $M_i = \varphi D_i\varphi^{-1}$, $i = 1,\dots,l$, are called the quantum Toda Hamiltonians. Clearly, they commute with each other. In particular, if $I$ is the quadratic Casimir element, then the corresponding operator $M$ is the well-known second-order differential operator
\[
M = \sum_{i=1}^{l}\partial_i^2 + \sum_{i=1}^{l}\chi(X_{\alpha_i})\,\chi(X_{-\alpha_i})\,e^{-\alpha_i(h)} + (\rho,\rho),
\]
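where $\partial_i = \partial/\partial y_i$, and $y_i$, $i = 1,\dots,l$, is an orthonormal basis of $\mathfrak h$.

Using the algebra $W_h(\mathfrak b_-)$, we construct quantum group analogues of the Toda Hamiltonians. A slightly different approach was recently proposed in [6].

Denote by $A$ the space of linear functions on $\mathbb C[[h]]_{\chi_h^{s_\pi}}\otimes_{U_h^{s_\pi}(\mathfrak n_-)}U_h^{s_\pi}(\mathfrak b_-)$, where $\mathbb C[[h]]_{\chi_h^{s_\pi}}$ is the 1-dimensional $U_h^{s_\pi}(\mathfrak n_-)$-module defined by $\chi_h^{s_\pi}$. Note that we have a linear isomorphism $\mathbb C[[h]]_{\chi_h^{s_\pi}}\otimes_{U_h^{s_\pi}(\mathfrak n_-)}U_h^{s_\pi}(\mathfrak b_-) \cong U_h^{s_\pi}(\mathfrak h)$. Therefore, $A = U_h^{s_\pi}(\mathfrak h)^*$. The algebra $U_h^{s_\pi}(\mathfrak b_-)$ naturally acts on $\mathbb C[[h]]_{\chi_h^{s_\pi}}\otimes_{U_h^{s_\pi}(\mathfrak n_-)}U_h^{s_\pi}(\mathfrak b_-)$ by multiplication from the right. This action induces a $U_h^{s_\pi}(\mathfrak b_-)$-action in the space $A$. We denote this action by $L$, $L : U_h^{s_\pi}(\mathfrak b_-) \to \mathrm{End}\,A$. Clearly, this action generates an action of the algebra $W_h(\mathfrak b_-)$ on $A$.

To construct deformed Toda Hamiltonians, we use certain elements in $W_h(\mathfrak b_-)$. These elements may be described as follows. Let $\mu : U_h^{s_\pi}(\mathfrak g) \to \mathbb C[[h]]$ be a map such that $\mu(uv) = \mu(vu)$. By Proposition 2.5.3, $(S^{s_\pi})^2(x) = e^{2h\rho^\vee}xe^{-2h\rho^\vee}$. Hence from [5, Remark 1] (see also [7]) it follows that $(\mathrm{id}\otimes\mu)\bigl(\mathcal R_{21}^{s_\pi}\mathcal R^{s_\pi}(1\otimes e^{2h\rho^\vee})\bigr)$, where $\mathcal R_{21}^{s_\pi} = \sigma\mathcal R^{s_\pi}$, is a central element. In particular, for any finite-dimensional $\mathfrak g$-module $V$, the element
\[
C_V = (\mathrm{id}\otimes\mathrm{tr}_V)\bigl(\mathcal R_{21}^{s_\pi}\mathcal R^{s_\pi}(1\otimes e^{2h\rho^\vee})\bigr), \tag{2.6.1}
\]
where $\mathrm{tr}_V$ is the trace in $V[[h]]$, is central in $U_h^{s_\pi}(\mathfrak g)$.

Using formulas (2.5.4) and (2.5.5), we can easily compute elements $\rho_{\chi_h^{s_\pi}}(C_V) \in W_h(\mathfrak b_-)$. For every finite-dimensional $\mathfrak g$-module $V$, we have
\[
\begin{aligned}
\rho_{\chi_h^{s_\pi}}(C_V) = (\mathrm{id}\otimes\mathrm{tr}_V)\Bigl(\,& e^{t_0}\prod_{\beta}\exp_{q_\beta^{-1}}\Bigl[(q-q^{-1})\,a(\beta)^{-1}\,f_\beta\otimes e_\beta e^{-h\frac{1+s_\pi}{1-s_\pi}\beta^\vee}\Bigr]\\
&\times e^{t_0}\prod_{\beta}\exp_{q_\beta^{-1}}\Bigl[(q-q^{-1})\,a(\beta)^{-1}\,\chi_h^{s_\pi}(e_\beta)\otimes e^{h\frac{1+s_\pi}{1-s_\pi}\beta^\vee}f_\beta\Bigr](1\otimes e^{2h\rho^\vee})\Bigr),
\end{aligned}
\tag{2.6.2}
\]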
where $t_0 = h\sum_{i=1}^{l}(Y_i\otimes H_i)$.

We denote by $W_h^{\mathrm{Rep}}(\mathfrak b_-)$ the subalgebra in $W_h(\mathfrak b_-)$ generated by the elements $\rho_{\chi_h^{s_\pi}}(C_V)$, where $V$ runs through all finite-dimensional representations of $\mathfrak g$. Note that, for every finite-dimensional $\mathfrak g$-module $V$, $\rho_{\chi_h^{s_\pi}}(C_V)$ is a polynomial in noncommutative elements $f_i$, $e^{hx}$, $x\in\mathfrak h$.

Now we realize elements of $W_h^{\mathrm{Rep}}(\mathfrak b_-)$ as difference operators. Let $H_h \subset U_h^{s_\pi}(\mathfrak h)$ be the subgroup generated by elements $e^{hx}$, $x\in\mathfrak h$. A difference operator on $A$ is an operator $T$ of the form $T = \sum f_iT_{x_i}$ (a finite sum), where $f_i\in A$, and for every $y\in H_h$, $T_xf(y) = f(ye^{hx})$, $x\in\mathfrak h$.

Proposition 2.6.1 [6, Proposition 3.2]. For any $Y\in U_h^{s_\pi}(\mathfrak b_-)$, which is a polynomial in noncommutative elements $f_i$, $e^{hx}$, $x\in\mathfrak h$, the operator $L(Y)$ is a difference operator on $A$. In particular, the operators $L(I)$, $I\in W_h^{\mathrm{Rep}}(\mathfrak b_-)$, are mutually commuting difference operators on $A$.

Proof. It suffices to verify that $L(f_i)$ are difference operators on $H_h$. Indeed,
\[
L(f_i)f(e^{hx}) = f(e^{hx}f_i) = e^{-h\alpha_i(x)}f(f_ie^{hx}) = \chi_h^{s_\pi}(f_i)\,e^{-h\alpha_i(x)}f(e^{hx}).
\]
This completes the proof.

Let $\iota : H_h \to U_h^{s_\pi}(\mathfrak h)$ be the canonical embedding. Denote $A_h = \iota^*(A)$. Let $T$ be a difference operator on $A$. Then one can define a difference operator $\iota^*(T)$ on the space $A_h$ by $\iota^*(T)f(y) = Tf(\iota(y))$.

Let $D_i^h = \iota^*\bigl(L(\rho_{\chi_h^{s_\pi}}(C_{V_i}))\bigr)$, where $V_i$, $i = 1,\dots,l$, are the fundamental representations of $\mathfrak g$. Denote by $\varphi_h$ the operator of multiplication in $A_h$ by the function $\varphi_h(e^{hx}) = e^{h\rho(x)}$, where $x\in\mathfrak h$. The operators $M_i^h = \varphi_hD_i^h\varphi_h^{-1}$, $i = 1,\dots,l$, are called the quantum deformed Toda Hamiltonians.

From now on, we suppose that $\pi = \mathrm{id}$ and that the ordering of positive roots $\Delta_+$ is fixed as in Proposition 2.2.5. We denote $s_{\mathrm{id}} = s$. Now using formula (2.6.2) we outline computation of the operators $M_i^h$. This computation is simplified by the following lemma.

Lemma 2.6.2 [6, Lemma 5.2]. Let $X = f_{\gamma_1}\cdots f_{\gamma_n}$. If the roots $\gamma_1,\dots,\gamma_n$ are not all simple, then $L(X) = 0$. Otherwise, if $\gamma_i = \alpha_{k_i}$, then
\[
\iota^*(L(X))f(e^{hy}) = e^{-h(\sum_i\alpha_{k_i},\,y)}\,f(e^{hy})\prod_i\chi_h^{s}(f_{k_i}).
\]
Proof. This follows immediately from Proposition 2.2.5 and the arguments used in the proof of Proposition 2.6.1.

Using this lemma we obtain that if $\beta$ is not a simple root, then the term in (2.6.2) containing the root vector $f_\beta$ gives a trivial contribution to the operators $L(\rho_{\chi_h^{s}}(C_{V_i}))$. Note also that by Proposition 2.2.5 $\chi_h^{s}(e_\beta) = 0$ if $\beta$ is not a simple root. Therefore, from formula (2.6.2), we have
\[
\begin{aligned}
L\bigl(\rho_{\chi_h^{s}}(C_{V_i})\bigr) = L\Bigl((\mathrm{id}\otimes\mathrm{tr}_{V_i})\Bigl(\,& e^{t_0}\prod_{i}\exp_{q^{-2d_i}}\Bigl[(q_i-q_i^{-1})\,f_i\otimes e_ie^{-hd_i\frac{1+s}{1-s}H_i}\Bigr]\\
&\times e^{t_0}\prod_{i}\exp_{q^{-2d_i}}\Bigl[(q_i-q_i^{-1})\,\chi_h^{s}(e_i)\otimes e^{hd_i\frac{1+s}{1-s}H_i}f_i\Bigr](1\otimes e^{2h\rho^\vee})\Bigr)\Bigr).
\end{aligned}
\tag{2.6.3}
\]
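The shift operators $T_x$ of Proposition 2.6.1 can be experimented with in a toy model. The following is only a sketch under stated assumptions, not part of the construction above: the Cartan subalgebra is modelled by integer pairs, the product $e^{hy}e^{hx}$ is written additively as $y+x$, and the hypothetical constant `Q = 2` stands in for $e^{h}$ so that all values stay exact. The names `shift`, `mult`, `alpha` and `potential` are illustrative. The sketch checks that shifts commute and that a shift passes through the multiplier $e^{-h\alpha(y)}$ at the cost of the scalar $e^{-h\alpha(x)}$, which is the elementary rule behind the proof of Proposition 2.6.1.

```python
from fractions import Fraction

Q = Fraction(2)  # assumption: exact stand-in for e^h


def shift(x):
    """Return the operator T_x, acting by (T_x f)(y) = f(y + x)."""
    def operator(f):
        return lambda y: f(tuple(a + b for a, b in zip(y, x)))
    return operator


def mult(g):
    """Return the operator of multiplication by the function g."""
    def operator(f):
        return lambda y: g(y) * f(y)
    return operator


def alpha(y):
    """Toy simple root of sl(2): alpha(y) = y_1 - y_2."""
    return y[0] - y[1]


def potential(y):
    """Stand-in for the function y -> e^{-h alpha(y)}."""
    return Q ** (-alpha(y))


def sample(y):
    """An arbitrary test function with exact values."""
    return Fraction(y[0] * y[0] + 3 * y[1])


x, z, y = (1, 0), (0, 2), (2, 5)

# Shifts commute with each other.
commute_lhs = shift(x)(shift(z)(sample))(y)
commute_rhs = shift(z)(shift(x)(sample))(y)

# Moving T_x past the multiplier produces the scalar Q^{-alpha(x)}.
twisted_lhs = shift(x)(mult(potential)(sample))(y)
twisted_rhs = Q ** (-alpha(x)) * mult(potential)(shift(x)(sample))(y)
```

A finite sum of products `mult(g)` and `shift(x)` is then a difference operator in the sense used above.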
In particular, let $\mathfrak g = sl(n)$, and let $V_1 = V$ be the fundamental representation of $sl(n)$. Then direct calculation gives
\[
M_1^h f(e^{hy}) = \Bigl(\sum_{j=1}^{n}T_j^2 - (q-q^{-1})^2\sum_{i=1}^{n-1}\chi_h^{s}(e_i)\,\chi_h^{s}(f_i)\,e^{-h(y,\alpha_i)}\,T_{i+1}T_i\Bigr)f(e^{hy}),
\]
where $T_i = T_{\omega_i}$, and $\omega_i$ are the weights of $V$. The last expression coincides with [6, formula (5.7)].

References

[1] N. Bourbaki, Éléments de mathématique, fasc. 38: Groupes et algèbres de Lie, chapitres 7, 8, Actualités Sci. Indust. 1364, Hermann, Paris, 1975.
[2] V. Chari and A. Pressley, A Guide to Quantum Groups, Cambridge Univ. Press, Cambridge, 1994.
[3] J. Dixmier, Algèbres enveloppantes, Cahiers Sci. 37, Gauthier-Villars, Paris, 1974.
[4] V. G. Drinfeld, “Quantum groups” in Proceedings of the International Congress of Mathematicians (Berkeley, Calif., 1986), Vol. 1, Amer. Math. Soc., Providence, 1987, 798–820.
[5] V. G. Drinfeld, On almost cocommutative Hopf algebras, Leningrad Math. J. 1 (1990), 321–342.
[6] P. Etingof, “Whittaker functions on quantum groups and q-deformed Toda operators” in Differential Topology, Infinite-Dimensional Lie Algebras, and Applications, Amer. Math. Soc. Transl. Ser. 2 194, Amer. Math. Soc., Providence, 1999, 9–25.
[7] L. D. Faddeev, N. Yu. Reshetikhin, and L. Takhtajan, Quantization of Lie groups and Lie algebras, Leningrad Math. J. 1 (1990), 193–225.
[8] V. G. Kac, Infinite-Dimensional Lie Algebras, 3d ed., Cambridge Univ. Press, Cambridge, 1990.
[9] S. M. Khoroshkin and V. N. Tolstoy, Universal R-matrix for quantized (super) algebras, Comm. Math. Phys. 141 (1991), 599–617.
[10] B. Kostant, Lie group representations on polynomial rings, Amer. J. Math. 85 (1963), 327–404.
[11] B. Kostant, On Whittaker vectors and representation theory, Invent. Math. 48 (1978), 101–184.
[12] B. Kostant, “Quantization and representation theory” in Representation Theory of Lie Groups (Oxford, 1977), London Math. Soc. Lecture Notes Ser. 34, Cambridge Univ. Press, Cambridge, 1979, 287–316.
[13] A. Sevostyanov, Regular nilpotent elements and quantum groups, Comm. Math. Phys. 204 (1999), 1–16.
[14] D. P. Želobenko, Extremal cocycles on Weyl groups (in Russian), Funktsional. Anal. i Prilozhen. 21, no. 3 (1987), 11–21; English translation in Funct. Anal. Appl. 21, no. 3 (1987), 183–192.
Max-Planck-Institut für Mathematik, Box 7280, D-53072 Bonn, Germany; sevastia@mpim-bonn.mpg.de
Vol. 105, No. 2
DUKE MATHEMATICAL JOURNAL
© 2000
REMARKS ON QUIVER VARIETIES G. LUSZTIG
0. Introduction

0.1. In several respects, the quiver varieties introduced by Nakajima [N1] are analogous to the variety $\mathcal B_e$ of Borel subalgebras in a complex simple Lie algebra containing a given nilpotent element $e$. It would be very interesting to define a canonical basis of the equivariant K-theory of a quiver variety analogous to the (conjectural) one proposed in [L1] and [L3] for $\mathcal B_e$. (This would give something very close to the canonical basis of the modified quantized affine enveloping algebra, whose geometric interpretation has been elusive until now.) In this paper, we take (what we hope is) a small step in this direction. All the ingredients used for $\mathcal B_e$ (except one) make sense for a quiver variety, and here we concentrate on finding the missing ingredient, the analogue of an opposition involution. For this purpose, it is essential to use the model of a quiver variety introduced in [L2], not as an orbit space (as in [N1]), but as a kind of Grassmannian. More precisely, it appears as the variety of submodules of a projective module (of finite dimension over $\mathbb C$) over the Gelfand-Ponomarev algebra [GP] corresponding to a Coxeter graph of type A, D, or E.

In §1 we show that the duals of these projective modules are again projective, and the resulting duality allows us to construct an analogue of the opposition involution. In §2 we speculate on how the canonical basis may be defined in our case. In §3 we study a partition of a quiver variety.

1. New symmetries of quiver varieties

1.1. Assume that we are given two finite sets $I$, $H$ with $I \ne \emptyset$, two maps $H \to I$ denoted as $h \mapsto h'$ and $h \mapsto h''$ such that $h' \ne h''$ for all $h\in H$, and an involution $h \mapsto \bar h$ of $H$ such that $(\bar h)' = h''$ for all $h\in H$. We say that $i, j\in I$ are joined if there exists $h\in H$ such that $h' = i$, $h'' = j$. Thus $I$ becomes the set of vertices of a graph.

The path algebra $\mathcal F$ associated to $(I, H)$ is the associative $\mathbb C$-algebra with 1 defined by the generators $e_i$ ($i\in I$), $h$ ($h\in H$) and the relations $e_ie_j = \delta_{i,j}e_i$ for $i, j\in I$, $\sum_i e_i = 1$, $e_{h''}h = h = he_{h'}$ for $h\in H$. It follows that $h_1h_2 = 0$ for $h_1, h_2\in H$ such that $h_1' \ne h_2''$.

Received 29 September 1999. Revision received 19 January 2000. 2000 Mathematics Subject Classification. Primary 20G42; Secondary 16G20. Author’s work supported by the National Science Foundation.
For $j\in I$, we define $s_j : \mathbb Z[I] \to \mathbb Z[I]$ by $s_j\bigl(\sum_i\nu_ii\bigr) = \sum_i\nu_i'i$, where $\nu_i' = \nu_i$ for $i \ne j$ and $\nu_j' = -\nu_j + \sum_{h;\,h'=j}\nu_{h''}$. Let $W$ be the subgroup of $\mathrm{Aut}(\mathbb Z[I])$ generated by $\{s_j \mid j\in I\}$. A subset $J$ of $I$ is said to be discrete if there is no $h\in H$ such that $h'\in J$, $h''\in J$. For such $J$, we set $s_J = \prod_{j\in J}s_j \in W$ (a product of commuting involutions).

1.2. Throughout this paper we assume that $W$ is finite and that $(I, H)$ is connected, in the sense that $e_i\mathcal Fe_j \ne 0$ for any $i, j$ in $I$. Then there is a unique partition $I = I^1 \sqcup I^{-1}$ such that $I^1$ and $I^{-1}$ are discrete. For $\delta\in\{1,-1\}$, we set $s_\delta = s_{I^\delta}\in W$. There is a unique element $w_0\in W$ and a unique involution $i \mapsto i^*$ of $I$ such that $w_0(i) = -i^*$ (equality in $\mathbb Z[I]$) for all $i\in I$. There is a unique involution $h \mapsto h^*$ of $H$ such that $(h^*)' = (h')^*$, $(h^*)'' = (h'')^*$. This involution commutes with $\bar{\ } : H \to H$.

1.3. Following [GP], we define $P$ to be the quotient algebra of $\mathcal F$ by the two-sided ideal generated by the elements
\[
\sum_{h\in H;\,h''=i}h\bar h
\]
(one for each $i\in I$). We define for any $n\in\mathbb Z$ a subspace $P^n$ as follows. If $n \ge 1$, $P^n$ is the subspace of $P$ spanned by the images of the products $h_1h_2\cdots h_n$ with $h_1, h_2, \dots, h_n\in H$; let $P^0$ be the subspace of $P$ spanned by $\{e_i \mid i\in I\}$; for $n < 0$, we set $P^n = 0$. We have $P = \oplus_nP^n$ as a vector space. There is a unique (involutive) algebra antiautomorphism $\iota : P \to P$ such that $\iota(e_i) = e_{i^*}$ for all $i$ and $\iota(h) = \bar h^*$ for all $h\in H$. We also write $\iota(x) = \bar x^*$ for $x\in P$.

1.4. Let $\mathcal C^0$ be the category whose objects are $I$-graded finite-dimensional $\mathbb C$-vector spaces $V = \oplus_{i\in I}V_i$ and whose morphisms are linear maps compatible with the grading. For $V\in\mathcal C^0$, we set
\[
|V| = \sum_i\dim V_i\;i \in \mathbb Z[I].
\]
For $k\in I$ and $n\in\mathbb N$, we regard $P^ne_k$ as an object of $\mathcal C^0$ with $i$-component $e_iP^ne_k$. Let $c$ be the Coxeter number of $W$. Let $c' = c - 2$. We have $c' \ge 0$.
Proposition 1.5. Let $k\in I$. Define $\delta\in\{1,-1\}$ by the condition that $k\in I^\delta$. In $\mathbb Z[I]$, we have

(a) $|P^0e_k| \xrightarrow{\,s_{-\delta}\,} |P^0e_k| + |P^1e_k| \xrightarrow{\,s_{\delta}\,} |P^1e_k| + |P^2e_k| \xrightarrow{\,s_{-\delta}\,} \cdots \longrightarrow |P^{c'-1}e_k| + |P^{c'}e_k| \longrightarrow |P^{c'}e_k|$, where we apply alternately $s_{-\delta}$, $s_\delta$;

(b) $|P^ne_k| = 0$ for $n > c'$;

(c) $|P^0e_k| = k$;

(d) $|P^{c'}e_k| = k^*$.

This is contained in [L2, §4]. A related result can be found in [GP].

Corollary 1.6 [GP]. We have $\dim P < \infty$.
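The recursion in Proposition 1.5(a) is easy to run by hand or by machine. The following is a minimal sketch, assuming the graph is of type $A_n$ with vertices $1,\dots,n$ joined consecutively, so that the two discrete classes are the odd and the even vertices and $i^* = n+1-i$; the function name `dimension_vectors` is illustrative and not taken from the text. Starting from $|P^0e_k| = k$, it applies $s_{-\delta}$ and $s_\delta$ alternately and reads off each $|P^ue_k|$ as a difference of consecutive terms.

```python
def dimension_vectors(n, k):
    """Return the list of |P^u e_k| (as integer vectors) for type A_n.

    Assumption: vertices 1..n form a path, so vertices of equal parity
    are never joined and the parity classes are the discrete subsets.
    """
    def reflect(vec, parity):
        # Apply s_J for J = {j : j % 2 == parity}; the s_j commute on J.
        out = list(vec)
        for j in range(1, n + 1):
            if j % 2 == parity:
                neighbours = sum(
                    vec[i - 1] for i in (j - 1, j + 1) if 1 <= i <= n
                )
                out[j - 1] = -vec[j - 1] + neighbours
        return out

    current = [1 if i == k else 0 for i in range(1, n + 1)]  # |P^0 e_k| = k
    layers = [current]
    total = current
    parity = (k + 1) % 2  # start with s_{-delta}: the class not containing k
    while True:
        total = reflect(total, parity)  # equals |P^u e_k| + |P^{u+1} e_k|
        nxt = [t - x for t, x in zip(total, layers[-1])]
        if not any(nxt):  # reached u = c', next layer vanishes
            break
        layers.append(nxt)
        parity = 1 - parity
    return layers


# Example: for A_3 and k = 2 the layers are 2, 1 + 3, 2.
layers_a3 = dimension_vectors(3, 2)
```

In this model the last layer is the basis vector at $n+1-k$, in line with (d), and the number of layers is $c'+1 = n$.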
1.7. For $i\in I$, we set $\mathcal L_i = e_{i^*}P^{c'}e_i$. Clearly, $\iota : P \to P$ restricts to an involution of $\mathcal L_i$. Since $\dim\mathcal L_i = 1$ (see Proposition 1.5(d)), there exists a unique $\kappa_i\in\{1,-1\}$ such that $\bar x^* = \kappa_ix$ for all $x\in\mathcal L_i$. For example, if $(I, H)$ is of type A and the distances from $i\in I$ to the two extremal vertices of the graph are $a$, $b$ (so that $a + b = |I| - 1$), then $\kappa_i = (-1)^{ab}$.

1.8. For a $\mathbb C$-vector space $V$, we denote by $V^*$ the dual vector space; for a linear map $\phi : V \to V'$, we denote by ${}^t\phi : V'^* \to V^*$ the transpose of $\phi$. If $M$ is a $P$-module, then $M^*$ is naturally a $P$-module: $p\in P$ acts on $M^*$ as ${}^t\iota(p)$, where $\iota(p) : M \to M$. Clearly,

(a) if $M$, $M'$ are $P$-modules and $f : M \to M'$ is $P$-linear, then ${}^tf : M'^* \to M^*$ is $P$-linear.

If $M$ is a $P$-module of finite dimension over $\mathbb C$, let $\mathrm{Gr}_P(M)$ be the set of all $P$-submodules of $M$. If $V\in\mathrm{Gr}_P(M)$, then the annihilator $V^\perp$ of $V$ in $M^*$ belongs to $\mathrm{Gr}_P(M^*)$. Similarly, if $V'\in\mathrm{Gr}_P(M^*)$, then the annihilator $V'^\perp$ of $V'$ in $M$ belongs to $\mathrm{Gr}_P(M)$. It is clear that $V \mapsto V^\perp$ and $V' \mapsto V'^\perp$ are inverse bijections between $\mathrm{Gr}_P(M)$, $\mathrm{Gr}_P(M^*)$. Note also that $\mathrm{Gr}_P(M)$, $\mathrm{Gr}_P(M^*)$ are naturally projective varieties, and the bijections above are in fact isomorphisms of algebraic varieties.

Proposition 1.9. (a) For $i\in I$, let $\mathbb C_i$ be the vector space $\mathbb C$ with the $P$-module structure in which $h$ acts as zero for any $h\in H$ and $e_j$ acts as $\delta_{i,j}$ for any $j\in I$. Then $(\mathbb C_i)_{i\in I}$ is a set of representatives for the isomorphism classes of simple $P$-modules.

(b) For $k\in I$, the left ideal $Pe_k$ of $P$ is an indecomposable projective $P$-module. Moreover, $(Pe_k)_{k\in I}$ is a set of representatives for the isomorphism classes of indecomposable projective $P$-modules.

We prove (a). By Proposition 1.5(b), $P^+ = \oplus_{n\ge1}P^n$ is a nilpotent two-sided ideal of $P$. Hence, the simple $P$-modules are the same as the simple $P/P^+$-modules and (a) follows.

We prove (b). From $P = \oplus_kPe_k$, we see that each $Pe_k$ is a projective $P$-module. Let $M$ be a $P$-submodule of $Pe_k$ such that $M \not\subset P^+$. Then we can find $m\in M$ such that $m - e_k\in P^+$. We show by descending induction on $r$ that, for any $p\in P^re_k$, we have $p\in M$. For $r > c'$, this is clear from Proposition 1.5(b). In the general case, we have $pm - p = p(m - e_k) \in \sum_{r';\,r'>r}P^{r'}e_k$. Hence, by the induction hypothesis, we have $pm - p\in M$. Since $pm\in M$, we have $p\in M$. This gives the induction step. We see that if $M$ is a $P$-submodule of $Pe_k$ such that $(Pe_k)/M$ is simple, then $M$ is contained in $P^+$, hence, in $P^+e_k$. Thus, $Pe_k$ has exactly one simple quotient $P$-module,
namely, $(Pe_k)/(P^+e_k) \cong \mathbb C_k$. This shows that the $P$-module $Pe_k$ is indecomposable. Therefore, (b) follows from (a).

1.10. Let $D\in\mathcal C^0$. Following [L2, §4.2], we define inductively an object $D^u = \oplus_{i\in I}D_i^u \in \mathcal C^0$ for any $u\in\mathbb Z$ and a linear map $\varphi_h^{u+1,u} : D_{h'}^u \to D_{h''}^{u+1}$ for any $h\in H$ and $u\in\mathbb Z$ as follows. For $u < 0$, we set $D^u = 0$. We set $D^0 = D$. For $u < 0$ and $h\in H$, we set $\varphi_h^{u+1,u} = 0$. Assume that for some $u_0\in\mathbb N$, $D^u$ is already defined for $u \le u_0$ and $\varphi_h^{u+1,u} : D_{h'}^u \to D_{h''}^{u+1}$ is already defined for $u < u_0$ and $h\in H$. We set
\[
D_i^{u_0+1} = \operatorname{coker}\Bigl(D_i^{u_0-1} \xrightarrow{\;a\;} \bigoplus_{h\in H;\,h'=i}D_{h''}^{u_0}\Bigr),
\]
where $a$ is the linear map whose $h$-component is $\varphi_h^{u_0,u_0-1}$. For $\tilde h\in H$, let $\varphi_{\tilde h}^{u_0+1,u_0} : D_{\tilde h'}^{u_0} \to D_{\tilde h''}^{u_0+1}$ be the composition
\[
D_{\tilde h'}^{u_0} \longrightarrow \bigoplus_{h\in H;\,h'=\tilde h''}D_{h''}^{u_0} \longrightarrow D_{\tilde h''}^{u_0+1}.
\]
(The first map is the identity isomorphism of $D_{\tilde h'}^{u_0}$ onto the direct summand corresponding to $h = \bar{\tilde h}$; the second map is the canonical one.) This completes the inductive definition of $D^u$, $\varphi_h^{u+1,u}$.

Let $b : \bigoplus_{h\in H;\,h'=i}D_{h''}^{u_0} \to D_i^{u_0+1}$ be the canonical (surjective) map. Note that the $h$-component of $b$ is just $\varphi_{\bar h}^{u_0+1,u_0}$.

Let $D^\dagger = \oplus_{u\in\mathbb Z}D^u$. Then $D^\dagger$ is an $\mathcal F$-module as follows. For $i, j\in I$ and $x\in D_i^u$, we have $e_jx = \delta_{i,j}x$; for $h\in H$ and $x\in D_i^u$, we have $hx = \varphi_h^{u+1,u}(x)\in D_{h''}^{u+1}$ if $h' = i$ and $hx = 0$ if $h' \ne i$. Clearly, this factors through a $P$-module structure on $D^\dagger$.

Let $D^\heartsuit = \oplus_{k\in I}Pe_k\otimes D_k$. Then $D^\heartsuit$ is a (projective) $P$-module, in fact the direct sum of the (projective) $P$-modules $Pe_k\otimes D_k$, where $P$ acts only on the first factor. Using Proposition 1.9(b), we see that all projective $P$-modules of finite dimension over $\mathbb C$ are obtained in this way. We set
\[
D^{\heartsuit,u} = \oplus_{k\in I}P^ue_k\otimes D_k,\qquad
D_i^\heartsuit = e_i\bigl(\oplus_{k\in I}Pe_k\otimes D_k\bigr),\qquad
D_i^{\heartsuit,u} = e_i\bigl(\oplus_{k\in I}P^ue_k\otimes D_k\bigr) = D_i^\heartsuit\cap D^{\heartsuit,u}.
\]
Then $D^\heartsuit = \oplus_iD_i^\heartsuit = \oplus_uD^{\heartsuit,u} = \oplus_{i,u}D_i^{\heartsuit,u}$.

Proposition 1.11. (a) There is a unique isomorphism of $P$-modules $\sigma : D^\heartsuit \to D^\dagger$ such that $\sigma(e_k\otimes d) = d$ for any $k\in I$ and any $d\in D_k = D_k^0$. This restricts for any $u\in\mathbb Z$ to an isomorphism of vector spaces $D^{\heartsuit,u} \to D^u$.

(b) If $u > c'$, then $D^u = 0$.

(c) For any $k\in I$, the isomorphism in (a) restricts to an isomorphism $b_k : \mathcal L_k\otimes D_k \to D_{k^*}^{c'}$.

(d) For any $i\in I$ and any $u\in[0, c']$, we have
\[
\dim D_i^{u-1} - \dim\oplus_{h;\,h'=i}D_{h''}^u + \dim D_i^{u+1} = 0.
\]
The first assertion of (a) is contained in [L2, Theorem 4.5]; the second assertion of (a) follows from the first, using the definitions. Now (b) and (c) follow from (a), using Proposition 1.5. We prove (d). Using (a), we see that it suffices to show that
\[
\dim e_iP^{u-1}e_k - \sum_{h;\,h'=i}\dim e_{h''}P^ue_k + \dim e_iP^{u+1}e_k = 0
\]
for any $i$, $k$. This is equivalent to the equality $s\bigl(|P^{u-1}e_k| + |P^ue_k|\bigr) = |P^ue_k| + |P^{u+1}e_k|$ for any $k\in I$ (see Proposition 1.5(a)), where $s = s_{(-1)^{u-1}\delta}$ and $k\in I^\delta$. This completes the proof.

Corollary 1.12. For any $u\in[0, c']$ and any $i\in I$, the sequence
\[
0 \longrightarrow D_i^{u-1} \xrightarrow{\;a\;} \oplus_{h;\,h'=i}D_{h''}^u \xrightarrow{\;b\;} D_i^{u+1} \longrightarrow 0
\]
is exact. (Here $a$, $b$ are as in §1.10.)

The exactness of the sequence above follows from the definition, except for the injectivity of $a$. The injectivity of $a$ then follows from Proposition 1.11(d).

1.13.
˜ ∈ Ꮿ0 be defined by Let D ˜ i = Dc∗ ∗ D i
˜ u , φ˜ u+1,u , and D ˜ † be defined in terms of D ˜ in the same way that for all i ∈ I . Let D h i u+1,u Dui , φh , and D† were defined in terms of D. We define for any i, u an isomorphism of vector spaces ∗ ˜u − γi,u : D → Dic∗−u i
such that, for any u, h, the diagram Ᏸu,h : ˜ u D h
γh ,u
t φ c −u,c −u−1 h¯ ∗
φ˜ hu+1,u
˜ u+1 D h
∗ / Dc −u (h )∗
γh ,u+1
∗ / Dc −u−1 (h )∗
˜ u = 0 = (Dc∗ −u )∗ (see Proposition 1.11) and we is commutative. If u < 0, we have D i i ˜ u = (Dc∗ −u )∗ (by definition) and we take γi,u to take γi,u = 0. If u = 0, we have D i i be the identity map. Assume now that u ∈ [0, c ] and that the maps γi,u are already defined whenever u ≤ u and that the diagrams Ᏸu ,h are commutative whenever u < u. To define γi,u+1 , we consider the following commutative diagram. 0
/D ˜ u−1 i γi,u−1
0
/ Dc∗ −u+1 ∗ i
a˜
/ ⊕h; h =i D ˜ u h
b˜
⊕h γh ,u
tb
−u ∗ / ⊕ Dc∗ h; h =i h
ta
/D ˜ u+1 i
/0
/ Dc∗ −u−1 ∗ i
/0
˜ Here the upper horizontal maps are the exact sequence as in Corollary 1.12 (for D instead of D); the lower horizontal maps are the transpose of the exact sequence as in Corollary 1.12; the left and middle vertical maps are known from the induction hypothesis; the right vertical map is not yet defined, but we define it uniquely by the condition that the new square that arises is commutative. This map is by definition γi,u+1 . This completes the inductive definition of the maps γi,u for u ≤ c + 1. For ˜ u = 0 = (Dc∗ −u )∗ and we set γi,u = 0. This completes the u ≥ c + 1, we have D i i definition of the maps γi,u . Taking the direct sum of the isomorphisms γi,u over all i, u, we obtain an isomor˜† → (D† )∗ . From the definitions, it is clear that this is an isomorphism phism γ : D ˜ ˜ † is regarded as a P-module, as in §1.10 (replacing D by D), of P-modules, where D † ∗ † and the P-module structure on (D ) is obtained from that of D , as in §1.8. ˜ † → C, given by [y, y ] = (γ y )(y), is It follows that the bilinear pairing D† × D ˜ u are orthogonal unless u = c − u, perfect. Under this pairing, the summands Dui , D i ∗ i = i . Moreover, from the definition of γ , we have ˜ †, h ∈ H ; (a) [h¯ ∗ y, y ] = [y, hy ] for any y ∈ D† , y ∈ D ˜ 0 = (Dc∗ )∗ . (b) [y, y ] = y (y) if y ∈ Dci ∗ , y ∈ D i i Proposition 1.14. Let k ∈ I . Let ( , ) : (Pek ) × (Pek ) → ᏸk be the C-bilinear pairing given by (y, y ) = 0 for any y ∈ Pu ek , y ∈ Pu ek with u + u = c , (y, y ) = y¯ ∗ y for any y ∈ Pc −u ek , y ∈ Pu ek . (a) The pairing ( , ) is perfect, and (x¯ ∗ y, y ) = (y, xy ) for any y ∈ Pek , y ∈ Pek , x ∈ P. (b) For y, y in Pek , we have (y, y ) = κk (y , y). 
˜ in the case Using §1.13 and the isomorphism Proposition 1.11(a) for D and for D, ∗ ˜k =ᏸ , D ˜ k = 0 for k = k), we see that where Dk = C, Dk = 0 for k = k (hence, D k there exists a perfect bilinear pairing (Pek )×((Pek )⊗ ᏸk∗ ) → C, denoted by [[y, y ]], such that the summands ei Pu ek , (ei Pu ek ) ⊗ ᏸk∗ are orthogonal unless u = c − u,
i = i ∗ and such that [[h¯ ∗ y, y ]] = [[y, hy ]] for any y ∈ Pek , y ∈ (Pek ) ⊗ ᏸk∗ , h ∈ H ; [[y, y ]] is given by the obvious pairing ᏸk × ᏸk∗ → C if y ∈ ek ∗ Pc ek = ᏸk , y ∈ (ek P0 ek ) ⊗ ᏸk∗ = ᏸk∗ . We deduce that there exists a perfect bilinear pairing (Pek ) × (Pek ) → ᏸk , denoted by (y, y ), such that the summands ei Pu ek , ei Pu ek are orthogonal unless u = c −u, i = i ∗ and such that (c) (h¯ ∗ y, y ) = (y, hy ) for any y ∈ Pek , y ∈ Pek , h ∈ H ; (d) (y, y ) is given by the obvious pairing ᏸk × C → ᏸk if y ∈ ek ∗ Pc ek = ᏸk , y ∈ ek P0 ek = C. To prove (a), it is enough to show that (e) (y, y ) = y¯ ∗ y for any y ∈ ei ∗ Pc −u ek , y ∈ ei Pu ek . We argue by induction on u. If u < 0 or if u = 0 ∗ but i = k , we have y = 0 = y and (e) holds. If u = 0 and i = k ∗ , then (e) holds by (d). Assume now that u > 0. Then y is a C-linear combination of elements of the form hy˜ with h ∈ H and y˜ ∈ eh Pu−1 ek . It is then enough to show that (y, hy) ˜ = y¯ ∗ hy˜ ˜ = y¯ ∗ hy. ˜ This follows from the induction hyor, equivalently (by (c)), that (h¯ ∗ y, y) pothesis. Thus (e) is proved and (a) follows. We prove (b). By the perfectness of ( , ), there is a unique C-linear map t : Pek → Pek such that (y , y) = (t (y), y ) for all y, y ∈ Pek . From the defining properties of ( , ), we see that t maps ei Pu ek into itself for any i, u. In particular, t (ek ) = cek for some c ∈ C. Using (a), we see that for any y, y ∈ Pek , x ∈ P, we have t (xy), y = (y , xy) = x¯ ∗ y , y = t (y), x¯ ∗ y = xt (y), y ; hence, t (xy) = xt (y). Thus t is P-linear. For any y ∈ ᏸk , y = 0, we have (t (ek ), y ) = (y , ek ) = c(ek , y ); hence, cek ∗ y = ∗ y¯ ek . By the definition of κi , we have y¯ ∗ = κk y ; hence, cy = κk y and c = κk . Thus t (ek ) = κk ek . Since t is P-linear, it follows that t (y) = κk y for all y ∈ Pek . This proves (b). 1.15. We regard (Pek )∗ ⊗ ᏸk as a P-module in which P acts only on the first factor. 
Let τk : Pek → (Pek )∗ ⊗ ᏸk be the C-linear map to which y ∈ Pek associates the linear map Pek → ᏸk , y → (y, y ) for all y ∈ Pek . Corollary 1.16. The map τk is a P-linear isomorphism. Let x ∈ P, y ∈ Pek . For all y ∈ Pek , we have τk (xy)(y ) = xy, y = y, x¯ ∗ y = τk (y) x¯ ∗ y = xτk (y)(y ); hence, τk (xy) = xτk (y). The corollary is proved.
1.17. Let us choose for each k ∈ I an isomorphism of vector spaces ck : Dk → ᏸk∗ ∗ ∗ ⊗ Dk . (Such ci exists since ᏸi is 1-dimensional.) We define an isomorphism ∗
→ D♥ C : D♥ − as the direct sum over k ∈ I of the isomorphisms
∗ ∗ τk ⊗ck Pek ⊗ Dk −−−−→ Pek ⊗ ᏸk ⊗ ᏸk∗ ⊗ D∗k = Pek ⊗ D∗k .
From Corollary 1.16, we see that C is P-linear. We now apply the same construction to ck instead of ck , where ck is the map κk (1⊗t ck )
Dk = ᏸk∗ ⊗ ᏸk ⊗ Dk −−−−−−→ ᏸk∗ ⊗ D∗k . (Here κk is as in §1.7.) Then, instead of C, we obtain an isomorphism of P-modules C : D♥ → (D♥ )∗ . Lemma 1.18. We have C = t C. It is enough to show that t τk = κk τk , where τk is the map
∗ ∗ τk ⊗1 Pek ⊗ ᏸk∗ −−−→ Pek ⊗ ᏸk ⊗ ᏸk∗ = Pek .
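This follows easily from Proposition 1.14(b).

1.19. For two $\mathbb C$-vector spaces $V$, $V'$, let $\mathrm{Iso}(V, V')$ be the set of all isomorphisms $V \to V'$. Let
\[
G_D = \prod_{i\in I}\mathrm{Iso}(D_i, D_i),\qquad G_D^1 = \prod_{i\in I}\mathrm{Iso}(D_i, \mathcal L_i^*\otimes D_i^*).
\]
Then $\tilde G_D = G_D \sqcup G_D^1$ is an (algebraic) group with the following multiplication law:
\[
\begin{aligned}
\bigl((g_i), (g_i')\bigr) &\mapsto (g_ig_i') \in G_D &&\text{if } (g_i)\in G_D,\ (g_i')\in G_D,\\
\bigl((g_i), (f_i')\bigr) &\mapsto \bigl((1\otimes{}^tg_i^{-1})f_i'\bigr) \in G_D^1 &&\text{if } (g_i)\in G_D,\ (f_i')\in G_D^1,\\
\bigl((f_i), (g_i')\bigr) &\mapsto (f_ig_i') \in G_D^1 &&\text{if } (f_i)\in G_D^1,\ (g_i')\in G_D,\\
\bigl((f_i), (f_i')\bigr) &\mapsto \bigl(\kappa_i(1\otimes{}^tf_i^{-1})f_i'\bigr) \in G_D &&\text{if } (f_i)\in G_D^1,\ (f_i')\in G_D^1.
\end{aligned}
\]
(Here we do not use the definition of $\kappa_i$.)

Similarly, for two $P$-modules $M$, $M'$, let $\mathrm{Iso}_P(M, M')$ be the set of all $P$-linear isomorphisms $M \to M'$. Let
\[
\mathcal G = \mathrm{Iso}_P(D^\heartsuit, D^\heartsuit),\qquad \mathcal G^1 = \mathrm{Iso}_P\bigl(D^\heartsuit, (D^\heartsuit)^*\bigr).
\]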
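Then $\tilde{\mathcal G} = \mathcal G \sqcup \mathcal G^1$ is an (algebraic) group with the following multiplication law:
\[
\begin{aligned}
(g, g') &\mapsto gg' \in \mathcal G &&\text{if } g\in\mathcal G,\ g'\in\mathcal G,\\
(g, f') &\mapsto {}^tg^{-1}f' \in \mathcal G^1 &&\text{if } g\in\mathcal G,\ f'\in\mathcal G^1,\\
(f, g') &\mapsto fg' \in \mathcal G^1 &&\text{if } f\in\mathcal G^1,\ g'\in\mathcal G,\\
(f, f') &\mapsto {}^tf^{-1}f' \in \mathcal G &&\text{if } f\in\mathcal G^1,\ f'\in\mathcal G^1.
\end{aligned}
\]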
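That the four-case rule of §1.19 is associative can be checked mechanically in a small model. The sketch below makes several simplifying assumptions that are not in the text: $I$ has a single vertex, $D$ is $\mathbb C^2$ with rational entries, $\mathcal L^*$ is identified with $\mathbb C$, and elements of $G_D$ and $G_D^1$ are both stored as $2\times2$ matrices tagged with a degree 0 or 1. All function names are illustrative. With these conventions $(1\otimes{}^tg^{-1})f'$ becomes the matrix product of the inverse transpose of $g$ with $f'$.

```python
from fractions import Fraction
from itertools import product


def mat(a, b, c, d):
    """Build a 2x2 matrix with exact rational entries."""
    return ((Fraction(a), Fraction(b)), (Fraction(c), Fraction(d)))


def matmul(a, b):
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def transpose(a):
    return tuple(zip(*a))


def inverse(a):
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return ((a[1][1] / det, -a[0][1] / det), (-a[1][0] / det, a[0][0] / det))


def scale(c, a):
    return tuple(tuple(c * x for x in row) for row in a)


def multiply(x, y, kappa):
    """Group law on tagged pairs (degree, matrix), one vertex, toy model."""
    (dx, a), (dy, b) = x, y
    if dx == 0 and dy == 0:
        return (0, matmul(a, b))                      # g g'
    if dx == 0 and dy == 1:
        return (1, matmul(inverse(transpose(a)), b))  # (1 (x) tg^-1) f'
    if dx == 1 and dy == 0:
        return (1, matmul(a, b))                      # f g'
    return (0, scale(kappa, matmul(inverse(transpose(a)), b)))  # kappa tf^-1 f'


SAMPLES = [mat(1, 1, 0, 1), mat(2, 0, 1, 1), mat(0, 1, -1, 3)]
ELEMENTS = [(deg, m) for deg in (0, 1) for m in SAMPLES]


def is_associative(kappa):
    """Check (xy)z == x(yz) on all triples of sample elements."""
    return all(
        multiply(multiply(x, y, kappa), z, kappa)
        == multiply(x, multiply(y, z, kappa), kappa)
        for x, y, z in product(ELEMENTS, repeat=3)
    )
```

In this model associativity holds for `kappa` equal to 1 or -1 and fails for other scalars such as 2, which matches the remark that only $\kappa_i^2 = 1$ is needed for the group law.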
˜ D → Ᏻ˜ as follows. If (gi ) ∈ GD , then 1(gi ) ∈ Ᏻ is the We define a map 1 : G automorphism of the P-module D♥ whose restriction to (Pek ) ⊗ Dk is 1 ⊗ gk for any k ∈ I . If (ci ) ∈ G1D , then 1(ci ) ∈ Ᏻ1 is the isomorphism C, as defined in §1.17. ˜ D → Ᏻ˜ is a group homomorphism. Lemma 1.20. The map 1 : G ˜ D or Ᏻ˜ , define x by xx = x x = 1. We have For x in G (a) 1(x ) = 1(x) ˜ D . (This is obvious when x ∈ GD , and it follows from Lemma 1.18 for any x ∈ G 1 when x ∈ GD .) We must show that (b) 1(xx ) = 1(x)1(x ) ˜ D . There are four cases to consider. for any x, x ∈ G Case 1: x ∈ GD , x ∈ GD . Case 2: x ∈ GD , x ∈ G1D . Case 3: x ∈ G1D , x ∈ GD . Case 4: x ∈ G1D , x ∈ G1D . In Cases 1 and 2, formula (b) follows directly from the definitions. Assume now that x, x are as in Case 3. By Case 2, applied to x , x , we have 1(x x ) = 1(x )1(x ). Using (a) three times, we deduce 1(xx ) = 1(x ) 1(x) . Applying to both sides, we deduce 1(xx ) = 1(x)1(x ), as desired. Assume next that x, x are as in Case 4. By Case 2, applied to xx , x , we have 1(x) = 1(xx )1(x ); hence, 1(xx ) = 1(x)1(x ) . Using (a), we deduce 1(xx ) = 1(x)1(x ), as desired. The lemma is proved. 1.21. It is easy to see that 1 is an imbedding of algebraic groups. Hence, via 1, ˜ D with a we may identify GD with a closed subgroup of Ᏻ and we may identify G ˜ closed subgroup of Ᏻ. Let x ∈ Ᏻ. Then x maps P+ D♥ = ⊕k (P+ ek ) ⊗ Dk into itself. Hence, it induces a P-linear automorphism of D♥ /(⊕k (P+ ek ) ⊗ Dk ) = D, that is, an element of GD . This defines a (surjective) homomorphism of algebraic groups Ᏻ → GD . Let U be the kernel of this homomorphism. Then U consists of all x ∈ Ᏻ such that x(ek ⊗d) = ek ⊗ d mod ⊕l (P+ el ) ⊗ Dl for all k ∈ I , d ∈ Dk . Since x is P-linear, it follows that for any u ≥ 0, any k ∈ I , and any ξ ∈ Pu ek ⊗ Dk we have xξ = ξ mod ⊕l Pu+1 el ⊗ Dl .
It follows that x is unipotent. Thus U is a unipotent algebraic group. Since GD is reductive, it follows that U is the unipotent radical of Ᏻ. Thus GD may be identified with the reductive quotient of Ᏻ. 1.22. For any g ∈ Ᏻ˜ , V ∈ Gr P (D♥ ), we define g ◦ V ∈ Gr P (D♥ ) as follows. If g ∈ Ᏻ, then g : D♥ → D♥ maps the P-submodule V onto a P-submodule g(V ); we set g ◦ V = g(V ). If g ∈ Ᏻ1 , then g : D♥ → (D♥ )∗ maps the P-submodule V onto a ♥ ∗ P-submodule g(V ) of (D ) ; we set g ◦ V = g(V )⊥ . Note that (g, V ) → g ◦ V defines an action of Ᏻ˜ on Gr P (D♥ ) (in fact, an algebraic group action). This follows from the equality (g(U ))⊥ = t g −1 (U ⊥ ) for any g ∈ Ᏻ˜ and for any U ∈ Gr P (D♥ ). 1.23. Combining Lemma 1.20 with §1.22, we see that (x, V ) → 1(x) ◦ V is an ˜ D on Gr P (D♥ ). algebraic group action of G 1.24. Let Bili be the set of all nonsingular forms Di ⊗ Di → ᏸi∗ . We identify G1D with i Bili by associating to (ci ) ∈ G1D the system of forms ( , )i given by the composition ci ⊗1
1⊗ω
Di ⊗ Di −−−→ ᏸi∗ ⊗ D∗i ⊗ Di −−−→ ᏸi∗ , where ω : D∗i ⊗ Di → C is the obvious duality map. The form corresponding in this way to (ci ) ∈ G1D (as in §1.17) is given by (d, d )i = κi (d , d)i for all d, d ∈ Di . Let ᐁ be the set of all x ∈ G1D such that the corresponding ( , )i satisfies (d, d )i = (d , d)i for all i and all d, d ∈ Di or, equivalently, such that x = κx. (Here κ ∈ GD denotes the element whose i-component is κi times the identity.) 1.25. We consider the C∗ -action λ → ρλ on D♥ , where ρλ (y) = λu y for y ∈ This action is not compatible with the P-module structure, but satisfies
D♥,u .
ρλ (zy) = λu zρλ (y) for z ∈ Pu , y ∈ D♥ , λ ∈ C∗ . Hence, if V ∈ Gr P (D♥ ), we have ρλ (V ) ∈ Gr P (D♥ ). Thus C∗ acts on Gr P (D♥ ). From the definitions, we see that if x ∈ GD , then 1(x)ρλ = ρλ 1(x); if x ∈ G1D , then 1(x)ρλ = λc t ρλ−1 1(x). (In the last equality, the exponent c comes from the fact that c enters the definition ˜ D , we have of γi,u (see §1.13).) Hence, if V ∈ Gr P (D♥ ) and x ∈ G 1(x) ◦ ρλ (V ) = ρλ 1(x) ◦ V . ˜ D -action on Gr P (D♥ ) commute. Thus the C∗ -action and G 2. Speculation on a canonical basis ¯ = 0 for 2.1. Let D, V ∈ Ꮿ0 . Let 9 : H → C∗ be a function such that 9(h) + 9(h) all h. Nakajima [N1] associates to this datum a smooth algebraic variety :sD,V ; it is
an open set in the variety consisting of all (x, p, q), where x = (xh )h∈H , xh ∈ Hom Vh , Vh , pi ∈ Hom Di , Vi , p = (pi ), qi ∈ Hom Vi , Di q = (qi ), satisfy h; h =i 9(h)xh xh¯ −pi qi = 0 for all i. (Nakajima actually assumes that 9 takes only the values ±1, but, following the conventions of [L2], we do not assume this.) Then GV acts naturally and freely on :sD,V so that the quotient GV \ :sD,V is a welldefined (symplectic) manifold. Nakajima also shows that, imposing the condition q = 0, one obtains a Lagrangian subvariety LD,V of GV \ :sD,V . Let L˜ D = V (GV \:sD,V ), LD = V LD,V ; here V runs over a set of representatives for the isomorphism classes of objects in Ꮿ0 (in both unions, all but finitely many terms are empty). Then L˜ D is smooth and LD is a closed subvariety of L˜ D . We assume that 9(h) = h 9, where i 9 ∈ C∗ (i ∈ I ) is a constant for i ∈ I 1 and is minus that constant for i ∈ I −1 . In this case, LD is independent of the choice of i 9. It may be interpreted as the variety consisting of all P-module structures on V (with ei acting as δi,j on Vj ), together with a map p : D → V in Ꮿ0 whose image generates V as a P-module, modulo the natural action of GV . We define (a) {V ⊂ Gr P (D♥ ) | codimei D♥ (ei V ) = dim Vi ∀i} → LD,V as follows. Given V in the first set, we choose an isomorphism D♥ /V → V in Ꮿ0 , we ♥ use it to carry the P-module structure of D /V to V, we define p as the composition D → D♥ → D♥ /V → V (the first map is the inclusion, the second map is the canonical one), and we set q = 0. We thus get a point of LD,V . The map (a) thus defined is an isomorphism [L2, Theorem 2.26]. Taking disjoint union over all V ∈ Ꮿ0 (up to isomorphism), we obtain an identification (b) Gr P (D♥ ) = LD . 2.2. For any algebraic variety X with an algebraic action of a torus R , we denote by CohR (X ) the abelian category of coherent R -equivariant sheaves on X . Let KR (X ) be the Grothendieck group of CohR (X ). 
This is naturally a module over R , the representation ring of R . 2.3. Let D ∈ Ꮿ0 . Let X = Gr P (D♥ ). The group GD acts naturally on X (see §1.23). However, we want to restrict this action to a maximal torus R of GD . More precisely, we fix a direct sum decomposition r0 r (a) D = ⊕r=1 D 0 (in Ꮿ ), where r D ∈ Ꮿ0 satisfies |r D| = ir ∈ Z[I ] for some ir ∈ I . Define R to be the subgroup (torus) of GD consisting of all (gi ) which map each summand r D into itself. In addition to the action of the torus R, we consider the action (see §1.25) of C∗ ˜ on X. This commutes with the R-action; hence, we obtain an R-action on X, where ∗ ˜ R = R ×C .
We identify C∗ = Z[v, v −1 ], where v is an indeterminate that corresponds to the identity representation C∗ → C∗ . Then we have canonically
R˜ = R ⊗ C∗ = R ⊗ Z v, v −1 = R v, v −1 . Hence, the R˜ -module KR˜ (X) is naturally a Z[v, v −1 ]-module. Moreover, for any character τ : R → C∗ and for any ξ ∈ KR˜ (X), the product τ ξ ∈ KR˜ (X) is well defined by the R -module structure. Let x → x † be the involution of the ring R defined by τ → τ −1 for all τ as above. We seek a canonical basis B of the Z[v, v −1 ]-module KR˜ (X), which is preserved by the maps ξ → τ ξ for any character τ : R → C∗ . Because of the analogy between the variety X and the varieties Ꮾe (see §0.1), we expect that B should have a definition along the lines of [L3, §5.11]. Thus we expect that ±B should be defined as the set of all ξ ∈ KR˜ (X) which satisfy ˜ ) = ξ, β(ξ (ξ %ξ ) ∈ 1 + v −1 R [v −1 ], where β˜ : KR˜ (X) → KR˜ (X) is a certain R -linear involution that is antilinear with respect to the ring involution of Z[v, v −1 ] given by v → v −1 , and (%) : KR˜ (X) × KR˜ (X) → R˜ is a certain pairing that is R -linear in the first variable, R -semilinear (with respect to †) in the second variable, and Z[v, v −1 ]-bilinear. ˜ (%). The provisional form As in [L3], we first consider some provisional form of β, ˜ of β is DX (Serre-Grothendieck duality; see [L1, §6.10]). The provisional form of (%) is (:) : KR˜ (X) × KR˜ (X) −→ R˜ given by (F : F ) = p∗ (F ⊗L˜ F ), where LD
⊗L : KR˜ (X) × KR˜ (X) −→ KR˜ (X) L˜ D
is the tor-product [L1, §6.4] relative to the smooth variety L˜ D and its closed subvarieties X, X; p is the map from X to the point. The provisional form of β˜ is not quite correct. It satisfies the antilinearity requirement but is not R -linear. Similarly, (:) is not quite correct. It is R -linear in both variables. For this reason, we (as in [L1]) add some correction factors that will remedy these defects. Thus we define β˜ = T −1 C ∗ DX , where C is an involution of X and T is an automorphism of KR˜ (X). Similarly, we set (F %F ) = v −2 dim X1 (T C ∗ F : F )† if F , F are supported by the same component X1 of X and (F %F ) = 0 if F , F are supported by different components of X. In analogy with [L1], we take T to be the action of w0 in the braid group action attached to the natural module structure on KR˜ (X) over the quantized enveloping algebra given by (I, H ). (We are not precise about how to normalize T ; this remains to be done.) Such a module structure has been defined by Nakajima [N2]. (Note, however, that he looks at KGD ×C∗ (X) instead of KR˜ (X).)
Therefore the essential missing ingredient is the involution C : X → X. We take C to be the action on X of an element in the subset R 1 of G1D (see §1.19) consisting of all (ci ) which map r Dir onto ᏸi∗r ⊗ r D∗ir for all r. One checks easily that ˜ D; R ∪ R 1 is a subgroup of G 1 κ ∈ R, R ⊂ ᐁ (see §1.24); yxy = x for all x ∈ R 1 , y ∈ R; x 2 = κ for all x ∈ R 1 . It follows that C ∗ : KR˜ (X) → KR˜ (X) is a well-defined involution, independent of the choice of element of R 1 . (Here we also use the fact that C commutes with the C∗ -action (see §1.25).) Note that, with this definition of C ∗ , β˜ and (%) have the expected linearity properties. 2.4. Now the action of R˜ on X = LD extends naturally to an action of R˜ on L˜ D . We define (:) : KR˜ (X) × KR˜ L˜ D −→ R˜ by (F : F ) = p∗ (F ⊗L˜ F ), where LD
⊗L : KR˜ (X) × KR˜ L˜ D −→ KR˜ (X) L˜ D
is the tor-product [L1, §6.4] relative to the smooth variety L˜ D and its closed subvarieties X, L˜ D ; p is the map from X to the point. Let (%) : KR˜ (X) × KR˜ L˜ D −→ R˜ be given by (F %F ) = v −2 dim X1 (T C ∗ F : F )† if F is supported by some component X1 of X and F is supported by the unique component of L˜ D that contains X1 ; (F %F ) = 0 if F is supported by some component X1 of X and F is supported by a component of L˜ D that does not contain X1 . ˜ =1 One can expect that for any b ∈ B there is a unique b˜ ∈ KR˜ (L˜ D ) such that (b%b) ˜ and (b %b) = 0 for any b ∈ B such that b is not of the form ±τ b for some character τ : R → C∗ . Moreover, {b˜ | b ∈ B} should be a (signed) basis of the Z[v, v −1 ]-module KR˜ (L˜ D ). The elements of this signed basis are expected to be ± vector bundles on L˜ D (rather than differences of vector bundles). 3. A partition 3.1. Using again the analogy between Gr P (D♥ ) and the variety Ꮾe (see §0.1), we define (by analogy with [DLP, Proposition 3.2]) a partition of Gr P (D♥ ) into locally closed smooth subvarieties by prescribing the relative position with a fixed, canonically defined (incomplete) flag in D♥ .
3.2. Let $D \in \mathcal{C}^0$. Let $(M_i^u)_{i\in I;\,u\in\mathbb{Z}}$ be a collection of $\mathbb{C}$-vector spaces such that $\dim M_i^u = \dim D_i^{\heartsuit,u}$. Let $M_i = \oplus_u M_i^u$. Assume that we are given $\mathbb{C}$-subspaces $V_i^u \subset M_i^u$.
Consider the vector space $E$ consisting of all $(f_h^{u',u})_{h\in H;\,u,u'\in\mathbb{Z};\,u'>u}$, where $f_h^{u',u} \in \operatorname{Hom}(M_{h'}^u, M_{h''}^{u'})$ maps $V_{h'}^u$ into $V_{h''}^{u'}$. Let $E_0$ be the open subset of $E$ consisting of those $(f_h^{u',u})$ such that for any $i, u$, the map $M_i^{u-1} \to \oplus_{h;\,h'=i} M_{h''}^u$ is injective for $u \in [0, c']$ and the map $\oplus_{h;\,h'=i} M_{h''}^u \to M_i^{u+1}$ is surjective for $u \in [0, c']$ (the maps are $f_h^{u,u-1}$, $f_{\bar h}^{u+1,u}$). Let $F$ be the vector space consisting of all $(g_i^{u',u})_{i\in I;\,u,u'\in\mathbb{Z};\,u'-u\ge 2}$, where $g_i^{u',u} \in \operatorname{Hom}(M_i^u, M_i^{u'})$ maps $V_i^u$ into $V_i^{u'}$. Define $\tau : E_0 \to F$ by $(f_h^{u',u}) \mapsto (g_i^{u',u})$, where
$$g_i^{u',u} = \sum_{h,v;\,h'=i;\,u'>v>u} f_{\bar h}^{u',v} f_h^{v,u} : M_i^u \longrightarrow M_i^{u'}.$$
Proposition 3.3. The map $\tau$ is a submersion (of smooth manifolds).
Let $(f_h^{u',u}) \in E_0$. The tangent map from the tangent space $E$ of $E_0$ at $(f_h^{u',u})$ to the tangent space $F$ of $F$ at $\tau(f_h^{u',u})$ sends $(\xi_h^{u',u}) \in E$ to $(\gamma_i^{u',u}) \in F$, where
(a) $\gamma_i^{u',u} = \sum_{h,v;\,h'=i;\,u'>v>u} \bigl(\xi_{\bar h}^{u',v} f_h^{v,u} + f_{\bar h}^{u',v} \xi_h^{v,u}\bigr)$.
We must show that this tangent map is surjective. Assume that an element (γiu ,u ) ∈ F is given. We prove that the equations (a) have some solution (ξhu ,u ) ∈ E. For any i, u we choose a subspace Wiu of Miu such that Miu = Viu ⊕ Wiu . We write
γiu ,u = γiu ,u ⊕ γiu ,u , where u ,u γi
u ,u γi
: Viu −→ Viu ,
: Wiu −→ Miu .
Similarly, we can write the unknown (ξhu ,u ) as ξhu ,u = ξhu ,u ⊕ ξhu ,u , where u ,u ξh
: Vhu −→ Vhu ,
u ,u ξh
: Whu −→ Mhu ,
and we write (fhu ,u ) as fhu ,u = fhu ,u ⊕ fhu ,u , where u ,u fh
: Vhu −→ Vhu ,
u ,u fh
: Whu −→ Mhu .
Then to solve (a), it is enough to solve (a1) and (a2) (with unknowns ξhu ,u , ξhu ,u ): (a1) γiu ,u = h,v; h =i; u >v>u ( ξh¯u ,v fhv,u + fh¯u ,v ξhv,u ), (a2) γiu ,u = h,v; h =i; u >v>u ( ξh¯u ,v fhv,u + ξh¯u ,v fhv,u + fh¯u ,v ξhv,u ). For any n ≥ 0, we consider statements An , An . An : If γiu ,u ∈ Hom(Viu , Viu ) are given for all i, u, u ; 2 ≤ u − u ≤ n + 1, then
there exist ξhu ,u : Vhu → Vhu for all h, u, u ; 1 ≤ u − u ≤ n such that (a1) holds for all i, u, u ; 2 ≤ u − u ≤ n + 1. An : If γ˜iu ,u ∈ Hom(Wiu , Miu ) are given for all i, u, u ; 2 ≤ u − u ≤ n + 1, then
there exist ξhu ,u : Whu → Mhu for all h, u, u ; 1 ≤ u − u ≤ n such that u ,u γ˜i
=
h,v; h =i; u >v>u
ξh¯u ,v fhv,u + fh¯u ,v ξhv,u
holds for all i, u, u ; 2 ≤ u − u ≤ n + 1. Note that A0 , A0 are trivially true. If An , An hold for all n ≥ 0, then clearly (a1), (a2) can be solved. We first solve (a1) using An , then we solve (a2), with u ,u u ,v v,u γ˜i = γiu ,u − ξh¯ fh , h,v; h =i; u >v>u
using An . It is therefore enough to prove the following statement. Assume that n ≥ 1 and that An−1 , An−1 hold. Then An , An hold. We choose a solution of An−1 and a solution of An−1 . Replacing (for u − u = n + 1)
γ u ,u i
by u ,u γi −
h,v; h =i; u >v,v−u≥2
u ,v v,u ξh¯ fh −
h,v; h =i; u −v≥2,v>u
u ,v v,u fh¯ ξh ,
and γ˜iu ,u by u ,u γ˜i −
h,v; h =i; u >v; v−u≥2
u ,v v,u ξh¯ fh −
h,v; h =i; u −v≥2; v>u
fh¯u ,v ξhv,u ,
we see that to prove An , An , it is enough to prove A˜ n , A˜ n : A˜ n : If γiu ,u ∈ Hom(Viu , Viu ) are given for all i, u, u ; u − u = n + 1, then there
exist ξhu ,u : Vhu → Vhu for all h, u, u ; u − u = n such that u ,u+1 u+1,u u ,u γi = ξh¯ fh + fh¯u ,u −1 ξhu −1,u h; h =i
holds for all i, u, u ; u − u = n + 1;
A˜ n : If γ˜iu ,u ∈ Hom(Wiu , Miu ) are given for all i, u, u ; u − u = n + 1, then there
exist ξhu ,u : Whu → Mhu for all h, u, u ; u − u = n such that u ,u+1 u+1,u u ,u γ˜i = ξh¯ fh + fh¯u ,u −1 ξhu −1,u h; h =i
holds for all i, u, u ; u − u = n + 1. For any m ∈ (−∞, c ], we consider the following statement: : If γ u ,u ∈ Hom(V u , V u ) are given for all i, u, u ; u − u = n + 1, u < m, Bn,m i i i
then there exist ξhu ,u : Vhu → Vhu for all h, u, u ; u − u = n, u ≤ m such that u ,u+1 u+1,u u ,u γi = ξh¯ fh + fh¯u ,u −1 ξhu −1,u h; h =i
holds for all i, u, u ; u − u = n + 1, u < m. ˜ ˜ is trivial; if Bn,c If m ≤ 0, then Bn,m −1 holds, then An holds. Hence, to prove An , it is holds, then Bn,m holds. Consider enough to prove that, if m ∈ [1, c − 1] and Bn,m−1 the following statement: : If δ m+n,m−1 ∈ Hom(V m−1 , V m+n ) are given for all i, then there exist B˜ n,m i i i m+n,m ξ : Vhm → Vhm+n for all h such that h m+n,m m,m−1 δin+m,m−1 = ξh¯ fh h; h =i
for all i. and B˜ n,m hold, then Bn,m holds. We now prove that B˜ n,m holds. Clearly, if Bn,m−1 m+n,m for all h is the same as having the maps bi (as below) for Having the maps ξh¯ all i in such a way that bi ai is prescribed bi
ai
Vim−1 −→ ⊕j ; j,i joined Vjm −→ Vim+n . But ai is injective (it is the restriction of a map Mim−1 → ⊕j ; j,i joined Mjm with
components fhm,m−1 , which is injective since (fhu ,u ) ∈ E0 ); hence, the set of all such (bi ) is isomorphic to a vector space. In particular, it is nonempty. We see that B˜ n,m holds. Hence, An holds. For any m ∈ [1, ∞), we consider the following statement: : If γ˜ u ,u ∈ Hom(W u , M u ) are given for all i, u, u ; u − u = n + 1, u > m, Bn,m i i i
then there exist ξhu ,u : Whu → Mhu for all h, u, u ; u − u = n, u ≥ m such that u ,u+1 u+1,u u ,u γ˜i = ξh¯ fh + fh¯u ,u −1 ξhu −1,u h; h =i
holds for all i, u, u ; u − u = n + 1, u > m.
is trivial; if B holds, then A ˜ n holds. Hence, to prove A˜ n , it is If m ≥ c , then Bn,m n,1 holds. Consider enough to prove that, if m ∈ [1, c − 1] and Bn,m+1 holds, then Bn,m the following statement: : If δ m+1,m−n ∈ Hom(W m−n , M m+1 ) are given for all i, then there exist B˜ n,m i i i ξ m,m−n : W m−n → M m for all h such that h h h
δim+1,m−n =
h; h =i
fh¯m+1,m ξhm,m−n
for all i. and B˜ n,m hold, then Bn,m holds. We now prove that B˜ n,m holds. Clearly, if Bn,m+1 m,m−n for all h is the same as having the maps ci (as below) for Having the maps ξh all i in such a way that di ci is prescribed di
ci
Wim−n −→ ⊕j ; j,i joined Mjm −→ Mim+1 .
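Here $d_i$ has components $f_{\bar h}^{m+1,m}$. But $d_i$ is surjective since $(f_h^{u',u}) \in E_0$; hence, the set of all such $(c_i)$ is isomorphic to a vector space. In particular, it is nonempty. We see that $\tilde B''_{n,m}$ holds. Hence, $A''_n$ holds. The proposition is proved.
Corollary 3.4. The fibre $\tau^{-1}(0)$ is a smooth manifold of pure dimension
$$\sum_{h;\,u'>u} \dim W_{h'}^u \dim W_{h''}^{u'} - \sum_{i;\,u'-u\ge 2} \dim W_i^u \dim W_i^{u'} + \sum_{i;\,u'\ge u>0} \dim M_i^u \dim V_i^{u'}.$$
From Proposition 3.3, we see that $\tau^{-1}(0)$ is a smooth manifold of pure dimension
$$\begin{aligned}
\dim E - \dim F &= \sum_{h;\,u'>u} \bigl(\dim V_{h'}^u \dim V_{h''}^{u'} + \dim W_{h'}^u \dim M_{h''}^{u'}\bigr) - \sum_{i;\,u'-u\ge 2} \bigl(\dim V_i^u \dim V_i^{u'} + \dim W_i^u \dim M_i^{u'}\bigr) \\
&= \sum_{h;\,u'>u} \bigl(\dim M_{h'}^u \dim V_{h''}^{u'} + \dim W_{h'}^u \dim W_{h''}^{u'}\bigr) - \sum_{i;\,u'-u\ge 2} \bigl(\dim M_i^u \dim V_i^{u'} + \dim W_i^u \dim W_i^{u'}\bigr).
\end{aligned}$$
For any $i \in I$ and $u \in [0, c']$, we substitute
$$\sum_{h;\,h'=i} \dim M_{h''}^u = \dim M_i^{u+1} + \dim M_i^{u-1},$$
and we get
$$\begin{aligned}
\dim E - \dim F &= \sum_{h;\,u'>u} \dim W_{h'}^u \dim W_{h''}^{u'} + \sum_{i;\,u\in[0,c'];\,u'>u} \bigl(\dim M_i^{u+1} \dim V_i^{u'} + \dim M_i^{u-1} \dim V_i^{u'}\bigr) \\
&\quad - \sum_{i;\,u'-u\ge 2} \bigl(\dim M_i^u \dim V_i^{u'} + \dim W_i^u \dim W_i^{u'}\bigr) \\
&= \sum_{h;\,u'>u} \dim W_{h'}^u \dim W_{h''}^{u'} - \sum_{i;\,u'-u\ge 2} \dim W_i^u \dim W_i^{u'} + \sum_{i;\,u\in[1,c'];\,u'\ge u} \dim M_i^u \dim V_i^{u'} \\
&\quad + \sum_{i;\,u\in[0,c'-1];\,u'-u\ge 2} \dim M_i^u \dim V_i^{u'} - \sum_{i;\,u'-u\ge 2} \dim M_i^u \dim V_i^{u'}.
\end{aligned}$$
The corollary follows.
3.5.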
Consider the vector space $E^1$ consisting of all $(f_h^{u+1,u})_{h\in H;\,u\in\mathbb{Z}}$, where $f_h^{u+1,u} \in \operatorname{Hom}(M_{h'}^u, M_{h''}^{u+1})$ maps $V_{h'}^u$ into $V_{h''}^{u+1}$. Let $E_0^1$ be the open subset of $E^1$ consisting of those $(f_h^{u+1,u})$ such that for any $i, u$, the map $M_i^{u-1} \to \oplus_{h;\,h'=i} M_{h''}^u$ is injective for $u \in [0, c']$ and the map $\oplus_{h;\,h'=i} M_{h''}^u \to M_i^{u+1}$ is surjective for $u \in [0, c']$ (the maps are $f_h^{u,u-1}$, $f_{\bar h}^{u+1,u}$). Let $F^1$ be the vector space consisting of all $(g_i^{u+2,u})_{i\in I;\,u\in\mathbb{Z}}$, where $g_i^{u+2,u} \in \operatorname{Hom}(M_i^u, M_i^{u+2})$ maps $V_i^u$ into $V_i^{u+2}$. Define $\tau^1 : E_0^1 \to F^1$ by $(f_h^{u+1,u}) \mapsto (g_i^{u+2,u})$, where
$$g_i^{u+2,u} = \sum_{h;\,h'=i} f_{\bar h}^{u+2,u+1} f_h^{u+1,u} : M_i^u \longrightarrow M_i^{u+2}.$$
Proposition 3.6. The map $\tau^1$ is a submersion (of smooth manifolds).
Let $(f_h^{u+1,u}) \in E_0^1$. The tangent map from the tangent space $E^1$ of $E_0^1$ at $(f_h^{u+1,u})$ to the tangent space $F^1$ of $F^1$ at $\tau^1(f_h^{u+1,u})$ sends $(\xi_h^{u+1,u}) \in E^1$ to $(\gamma_i^{u+2,u}) \in F^1$, where
(a) $\gamma_i^{u+2,u} = \sum_{h;\,h'=i} \bigl(\xi_{\bar h}^{u+2,u+1} f_h^{u+1,u} + f_{\bar h}^{u+2,u+1} \xi_h^{u+1,u}\bigr)$.
We must show that this tangent map is surjective. Assume that an element (γiu+2,u ) ∈ F 1 is given. We must prove that the equations (a) have some solution (ξhu+1,u ) ∈ E 1 . Let Wiu be as in Proposition 3.3. We write γiu+2,u = γiu+2,u ⊕ γiu+2,u , where u+2,u γi
: Viu −→ Viu+2 ,
u+2,u γi
: Wiu −→ Miu+2 .
Similarly, we can write the unknown (ξhu+1,u ) as ξhu+1,u = ξhu+1,u ⊕ ξhu+1,u , where u+1,u ξh
: Vhu −→ Vhu+1 ,
u+1,u ξh
: Whu −→ Mhu+1 ,
and we write (fhu+1,u ) as fhu+1,u = fhu+1,u ⊕ fhu+1,u , where u+1,u fh
: Vhu −→ Vhu+1 ,
u+1,u fh
: Whu −→ Mhu+1 .
Then to solve (a), it is enough to solve (a1) and (a2) below (with unknowns ξhu+1,u , ξ u+1,u ): h (a1) γiu+2,u = h; h =i ( ξh¯u+2,u+1 fhu+1,u + fh¯u+2,u+1 ξhu+1,u ), u+2,u+1 f u+1,u + ξ u+2,u+1 f u+1,u + f u+2,u+1 (a2) γiu+2,u = h; h =i ( ξh¯ h h h¯ h¯ ξ u+1,u ). h
The fact that these equations can be solved follows from the fact that $A'_1$, $A''_1$ in Proposition 3.3 can be solved. The proposition is proved.
Corollary 3.7. The fibre $(\tau^1)^{-1}(0)$ is a smooth manifold of pure dimension
$$\sum_{h;\,u} \dim W_{h'}^u \dim W_{h''}^{u+1} - \sum_{i;\,u} \dim W_i^u \dim W_i^{u+2} + \sum_{i;\,u>0} \dim M_i^u \dim V_i^u.$$
From Proposition 3.6, we see that $(\tau^1)^{-1}(0)$ is a smooth manifold of pure dimension
$$\begin{aligned}
\dim E^1 - \dim F^1 &= \sum_{h;\,u} \bigl(\dim V_{h'}^u \dim V_{h''}^{u+1} + \dim W_{h'}^u \dim M_{h''}^{u+1}\bigr) - \sum_{i;\,u} \bigl(\dim V_i^u \dim V_i^{u+2} + \dim W_i^u \dim M_i^{u+2}\bigr) \\
&= \sum_{h;\,u} \bigl(\dim M_{h'}^u \dim V_{h''}^{u+1} + \dim W_{h'}^u \dim W_{h''}^{u+1}\bigr) - \sum_{i;\,u} \bigl(\dim M_i^u \dim V_i^{u+2} + \dim W_i^u \dim W_i^{u+2}\bigr).
\end{aligned}$$
As in the proof of Corollary 3.4, this equals
$$\sum_{h;\,u} \dim W_{h'}^u \dim W_{h''}^{u+1} + \sum_{i;\,u\in[0,c']} \bigl(\dim M_i^{u+1} \dim V_i^{u+1} + \dim M_i^{u-1} \dim V_i^{u+1}\bigr) - \sum_{i;\,u} \bigl(\dim M_i^u \dim V_i^{u+2} + \dim W_i^u \dim W_i^{u+2}\bigr).$$
The corollary follows.
3.8.
Let $E$ be the set of all collections $(f_h^{u',u})_{h\in H;\,u,u'\in\mathbb{Z};\,u'>u}$, where $f_h^{u',u} \in \operatorname{Hom}(M_{h'}^u, M_{h''}^{u'})$ are such that
(a) $\sum_{h,v;\,h'=i;\,u'>v>u} f_{\bar h}^{u',v} f_h^{v,u} = 0 : M_i^u \to M_i^{u'}$ for any $i$ and any $u, u'$ with $u' - u \ge 2$;
(b) for any $i \in I$ and any $u$ such that $0 \le u \le c'$, the sequence $0 \to M_i^{u-1} \to \oplus_{h;\,h'=i} M_{h''}^u \to M_i^{u+1} \to 0$ is exact (the middle maps have components $f_h^{u,u-1}$, $f_{\bar h}^{u+1,u}$, resp.).
Let $P$ be the set of all collections $(g_i^{u',u})_{i\in I;\,u,u'\in\mathbb{Z};\,u'\ge u}$, where $g_i^{u',u} \in \operatorname{Hom}(M_i^u, M_i^{u'})$ are such that $g_i^{u,u}$ are isomorphisms for all $i, u$. Then $P$ is an (algebraic) group under the multiplication $(g_i^{u',u})(\tilde g_i^{u',u}) = ({}'g_i^{u',u})$, where
$${}'g_i^{u',u} = \sum_{v;\,u'\ge v\ge u} g_i^{u',v} \tilde g_i^{v,u}.$$
$P$ acts on $E$ by $(g_i^{u',u})(f_h^{u',u}) = ({}'f_h^{u',u})$, where
$$\sum_{v;\,u'>v\ge u} {}'f_h^{u',v} g_{h'}^{v,u} = \sum_{v;\,u'\ge v>u} g_{h''}^{u',v} f_h^{v,u}$$
for all $h$ and $u' > u$. (This formula defines ${}'f_h^{u',u}$ by induction on $u' - u$.)
Let $E^1$ be the set of all collections $(f_h^{u+1,u})_{h\in H;\,u\in\mathbb{Z}}$, where $f_h^{u+1,u} \in \operatorname{Hom}(M_{h'}^u, M_{h''}^{u+1})$ are such that (b) holds and (a) holds for $u' - u = 2$. Let $L$ be the set of all collections $(g_i^{u,u})_{i\in I;\,u\in\mathbb{Z}}$, where $g_i^{u,u} \in \operatorname{Hom}(M_i^u, M_i^u)$ is an isomorphism for any $i, u$. Then $L$ is an (algebraic) group under the multiplication $(g_i^{u,u})(\tilde g_i^{u,u}) = (g_i^{u,u} \tilde g_i^{u,u})$. It acts on $E^1$ by $(g_i^{u,u})(f_h^{u+1,u}) = ({}'f_h^{u+1,u})$, where ${}'f_h^{u+1,u} = g_{h''}^{u+1,u+1} f_h^{u+1,u} (g_{h'}^{u,u})^{-1}$.
Lemma 3.9. Let (ψhu+1,u ) ∈ E 1 . Let 1 be the set of all (fhu ,u ) ∈ E such that (a) fhu+1,u = ψhu+1,u for all h, u. Then 1 is isomorphic to CN , where N = i,u,u ; u >u>0 dim Miu dim Miu .
For n ∈ [1, c ], let 1n be the set of all collections (fhu ,u )h∈H ; u,u ∈Z; 1≤u −u≤n , where
fhu ,u ∈ Hom(Mhu , Mhu ) are such that §3.8(a) holds whenever 2 ≤ u − u ≤ n + 1 and (a) holds. For n ∈ [2, c ] and m ∈ [n − 1, c ], let 1nm be the set of all collections
fhu ,u
h∈H ; u,u ∈Z; 1≤u −u≤n−1 or u −u=n,u ≤m
,
where fhu ,u ∈ Hom(Mhu , Mhu ) are such that §3.8(a) holds whenever 2 ≤ u − u ≤ n + 1 or u − u = n + 1, u ≤ m and (a) holds. For n ≥ 2, m ∈ [n, c ], let πmn : 1nm → 1nm−1 be defined by forgetting the appropri
ate coordinates. Let Y be the fibre of πmn at (fhu ,u ) ∈ 1nm−1 . Then
ci = −
h,v; h =i; m>v>m−n−1,v=m−n
fh¯m,v fhv,m−n−1
is well defined, and Y may be identified with the set of all collections (fhm,m−n )h∈H , , Mhm ) are such that where fhm,m−n ∈ Hom(Mhm−n h; h =i
fh¯m,m−n ψhm−n,m−n−1 = ci : Mim−n−1 −→ Mim
for all i. To give (fhm,m−n )h∈H as above is the same as giving the maps bi (as below) for all i in such a way that bi ai = ci : ai
bi
Mim−n−1 −→ ⊕j ; j,i joined Mjm−n −→ Mim . Here ai has coordinates ψhm−n,m−n−1 . Since m − n ∈ [0, c ], the map ai is injective (by the definition of E 1 ); hence, the set of all such (bi ) may be identified with i
Hom Sim−n , Mim ,
where Sim−n is a fixed subspace of ⊕j ; j,i joined Mjm−n , which is complementary to the image of ai . Since Sim−n is canonically isomorphic to i Mim−n+1 , we see that Y is identified with the vector space i Hom(Mim−n+1 , Mim ). It follows that (b) 1nm ∼ = 1nm−1 × CNn,m , where Nn,m = dim i Hom(Mim−n+1 , Mim ). Using repeatedly (b) and the obvious isomorphisms 1nc ∼ = 1n ,
1nn−1 ∼ = 1n−1 ,
1c ∼ = 1,
11 = E 1 ,
we see that 1 ∼ = CN , where Nn,m = N= n,m; 2≤n≤m≤c
i,n,m; 2≤n≤m≤c
dim Mim−n+1 dim Mim .
The lemma follows.
We identify E 1 with the subset of E consisting of all (fhu ,u ) such that
3.10.
fhu ,u = 0 whenever u − u ≥ 2.
Lemma 3.11. We keep the notation of Lemma 3.9. Let U be the subgroup of P consisting of all (giu ,u ) such that giu,u = 1 for all i, u. Let 1 be the U -orbit of (ψhu+1,u ) in E (under the restriction of the P -action). Then (a) 1 ∼ = CN , (b) 1 = 1.
Let U0 be the stabilizer of (ψhu+1,u ) in U . Let (giu ,u ) ∈ U . The condition that
u ,u
(gi
) ∈ U0 is
(c) ψhu +1,u ghu ,u = ghu+1,u+1 ψhu+1,u for all h and u > u. Let n ∈ [1, c − 1]. We prove the following statement for any m ∈ [0, c − n − 1]. Cn,m : Assume that we are given elements giu+n,u ∈ Hom(Miu , Miu+n ) for each i ∈ I and each u ∈ [0, m] so that (c) holds for all h and u ∈ [0, m − 1], u = u + n. Then there are unique elements gim+n+1,m+1 ∈ Hom(Mim+1 , Mim+n+1 ) for each i ∈ I which, together with the previous elements, satisfy (c) for all h and u ∈ [0, m], u = u + n. Let ch = ψhm+n+1,m+n ghm+n,m . For any i, we have m+n+1,m+n m+n,m m,m−1 ch ψhm,m−1 = ψh gh ψh¯ ¯ h; h =i
h; h =i
=
h; h =i
ψhm+n+1,m+n ψhm+n,m+n−1 gim+n−1,m−1 = 0. ¯
Thus the kernel of (ch )h =i : ⊕h; h =i Mhm → Mim+n+1 contains the image of )h =i : Mim−1 → ⊕h; h =i Mhm , which is the same as the kernel of (ψhm,m−1 ¯
ψhm+1,m
h =i
: ⊕h; h =i Mhm −→ Mim+1 .
Since the last map is surjective, it follows that there is a unique linear map Mim+1 → Mim+n+1 (called gim+n+1,m+1 ) such that gim+n+1,m+1 ψhm+1,m h =i = (ch )h =i . This map has the required properties. Thus Cn,m is proved.
The properties Cn,m for n ∈ [1, c − 1], m ∈ [0, c − n − 1] show that the solutions of the equations (c) are in bijection with the set of collections (giu ,0 ), where giu ,0 : Mi0 → Miu for u ∈ [1, c ]. Thus U0 ∼ = CN0 , where N0 =
i,u ; u >0
dim Mi0 dim Miu .
It follows that dim U /U0 = N. Since U is a unipotent group, (a) follows. We prove (b). It is clear that 1 ⊂ 1. Now 1 ∼ = CN by (a) and 1 ∼ = CN by Lemma 3.9. It is well known that any injective map of algebraic varieties CN → CN is bijective. It follows that 1 = 1. Proposition 3.12. (a) The L-action on E 1 is transitive. (b) The P -action on E is transitive. We prove (a). Let (fhu+1,u ) ∈ E 1 . This makes M into a P-module, generated by ⊕i Mi0 . Hence, there is a canonical surjective P-linear map ψ : ⊕k (Pek ⊗ Mk0 ) → M. Since these have the same dimension, ψ is an isomorphism. It restricts to ⊕k (ei Pu ek ⊗ Miu . Now let (f˜hu+1,u ) ∈ E 1 . This gives rise in the same way to a P-linear Mk0 ) → ˜ −1 : M → M belongs to L and one isomorphism ψ˜ : ⊕k (Pek ⊗ Mk0 ) → M. Then ψψ can check that it carries (fhu+1,u ) to (f˜hu+1,u ). This proves (a). We prove (b). By Lemma 3.11, the P -orbit of any element of E contains some element of E 1 . Thus (b) follows from (a). Proposition 3.13. (a) E 1 is irreducible of dimension i; u>0 dim Miu dim Miu . (b) E is irreducible of dimension i; u ≥u>0 dim Miu dim Miu . The fact that E, E 1 are irreducible follows from Proposition 3.12. If in §3.2 we take Viu = Miu , then τ −1 (0) = E, (τ 1 )−1 (0) = E 1 ; hence, the dimension formulas in (a) and (b) are special cases of Corollaries 3.4 and 3.7. 3.14. Assume that we are given d = (diu )i; u∈Z , where diu ∈ N, diu ≤ dim Miu for all i, u. Let ᏼd be the set of all collections (Vi )i∈I , where Vi is a subspace of Mi such that Vi ∩ (Miu ⊕ Miu+1 ⊕ · · · ) has dimension diu + diu+1 + · · · for any i, u. Let ᏼd be the set of all collections (Viu )i; u∈Z , where Viu is a subspace of Miu of dimension diu . In Lemma 3.15, we fix (Viu ) ∈ ᏼd , and we let Wiu be a complement of Viu in Miu . Lemma 3.15. (a) ᏼd is a homogeneous P -manifold of dimension i; u ≥u
dim Miu dim Wiu − dim Wiu dim Wiu .
(b) ᏼd is a homogeneous L-manifold of dimension dim Miu dim Wiu − dim Wiu dim Wiu . i; u
The homogeneity statements are standard. The dimension formulas are obtained by taking the difference of dim P (or dim L) and the dimension of a corresponding isotropy group dim ᏼd = dim Miu dim Miu − dim Viu dim Viu + dim Wiu dim Miu , i; u ≥u
dim ᏼd
=
i; u
i; u ≥u
dim Miu dim Miu −
i; u
dim Viu dim Viu + dim Wiu dim Miu .
The lemma follows.
3.16. If f = (fhu ,u ) ∈ E and (Vi ) ∈ ᏼd , we write f ' (Vi ) whenever ⊕u fhu : Mh → Mh carries Vh into Vh for any h. If f = (fhu+1,u ) ∈ E 1 and (Viu ) ∈ ᏼd , we write f ' (Viu ) whenever fhu : Mhu → Mhu+1 carries Vhu into Vhu+1 for any h. Let Z be the variety consisting of all pairs (f, (Vi )) ∈ E × ᏼd such that f ' (Vi ). Let Z 1 be the variety consisting of all pairs (f, (Viu )) ∈ E 1 × ᏼd such that f ' (Viu ). For any f ∈ E, let Zf = {(Vi ) ∈ ᏼd | f ' (Vi )}. For any f ∈ E 1 , let Zf1 = Viu ∈ ᏼd | f ' Viu . In the following result we fix (Viu ) ∈ ᏼd . We choose a complement Wiu of Viu in Miu , and we set Vi = ⊕u Viu , Wi = ⊕u Wiu . Then (Vi ) ∈ ᏼd and Wi is a complement of Vi in Mi . Proposition 3.17. (a) For any f ∈ E, Zf is a smooth variety of pure dimension D − D0 , where 2 1 dim Wh dim Wh + dim Mi0 dim Wi − dim Wi , 2 h i i 1 u+1 D0 = dim Whu dim Whu − dim Wiu dim Wi . 2 D=
h; u
i; u
(b) For any f ∈ E 1 , Zf1 is a smooth variety of pure dimension h; u
dim Whu dim Whu+1 +
−
i
dim Mi0 dim Wi0
i; u
dim Wiu dim Wiu+2 −
i; u
dim Wiu dim Wiu .
We prove (a). The second projection pr 2 : Z → ᏼd can be regarded as a P equivariant map, with ᏼd a P -homogeneous space. The fibre of pr 2 over (Vi ) is the variety τ −1 (0) in Corollary 3.4; hence, it is smooth of pure dimension. It follows that Z is smooth of pure dimension equal to dim τ −1 (0) + dim ᏼd . The first projection pr 1 : Z → E can again be regarded as a P -equivariant map with E a P -homogeneous space (see Proposition 3.12). Hence, Zf = pr −1 1 (f ) is smooth of pure dimension dim Z − dim E = dim τ −1 (0) + dim ᏼd − dim E. Using Corollary 3.4, Proposition 3.13, and Lemma 3.15, we obtain
dim Zf =
h; u >u
+
i; u ≥u>0
=
−
dim Miu dim Viu +
dim Whu dim Whu +
i; u,u
dim Wiu dim Wiu
i; u ≥u
dim Miu dim Wiu − dim Wiu dim Wiu
dim Miu dim Miu
h; u >u
i; u −u≥2
i; u ≥u>0
−
dim Whu dim Whu −
i; u ≥0
dim Wiu dim Wiu +
dim Mi0 dim Wiu
i; u =u+1
dim Wiu dim Wiu
1 1 = dim Whu dim Whu − dim Whu dim Whu 2 2 h; u h; u ,u 0 u + dim Mi dim Wi − dim Wiu dim Wiu i; u ≥0
i; u,u
+
u
i; u =u+1
dim Wiu dim Wi .
This implies (a). We prove (b). As in the proof of (a), we see, using the two projections Z 1 → ᏼd , 1 Z → E 1 , that Zf1 is smooth of pure dimension dim(τ1−1 )(0) + dim ᏼd − dim E 1 . Using Corollary 3.7, Proposition 3.13, and Lemma 3.15, we obtain dim Zf1 =
h; u
+
dim Whu dim Whu+1 −
i; u
This implies (b).
i; u
dim Wiu dim Wiu+2 +
dim Miu dim Wiu − dim Wiu dim Wiu
−
i; u>0
i; u>0
dim Miu dim Viu
dim Miu dim Miu .
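3.18. In the remainder of this section, we assume that $M_i^u = D_i^{\heartsuit,u}$ for all $i, u$. For any $h, u$, let $f_h^{u+1,u} \in \operatorname{Hom}(M_{h'}^u, M_{h''}^{u+1})$ be left multiplication by $h$ in the obvious P-module structure on $D^\heartsuit$. Then $f = (f_h^{u+1,u})$ satisfies conditions 3.8(a) and (b) (condition 3.8(a) just expresses the fact that the relations of P are satisfied; condition 3.8(b) follows from Proposition 1.11 and Corollary 1.12). Thus $f \in E^1$. Let $\operatorname{Gr}(D^\heartsuit)$ be the variety consisting of all $\mathbb{C}$-subspaces $V$ of $D^\heartsuit$ such that $V = \oplus_i V_i$, where $V_i = V \cap D_i^\heartsuit$. We have a partition $\operatorname{Gr}(D^\heartsuit) = \sqcup_d \operatorname{Gr}(D^\heartsuit, d)$, where $d$ runs over all $(d_i^u)$ as in §3.14 and $\operatorname{Gr}(D^\heartsuit, d)$ consists of all $\mathbb{C}$-subspaces $V = \oplus_i V_i$ such that $(V_i) \in \mathcal{P}_d$. The pieces of this partition are just the orbits of the natural action of $P$ on $\operatorname{Gr}(D^\heartsuit)$. By intersecting these pieces with $\operatorname{Gr}_P(D^\heartsuit)$ (a subvariety of $\operatorname{Gr}(D^\heartsuit)$), we obtain a partition
(a) $\operatorname{Gr}_P(D^\heartsuit) = \sqcup_d \operatorname{Gr}_P(D^\heartsuit, d)$,
where $\operatorname{Gr}_P(D^\heartsuit, d) = \operatorname{Gr}_P(D^\heartsuit) \cap \operatorname{Gr}(D^\heartsuit, d)$. This last intersection is just the variety $Z_f$ in §3.16; hence, it is smooth of pure dimension given by Proposition 3.17.
3.19. The $\mathbb{C}^*$-action on $D^\heartsuit$ (see §1.25) induces a $\mathbb{C}^*$-action on $\operatorname{Gr}(D^\heartsuit)$, which restricts to the $\mathbb{C}^*$-action on $\operatorname{Gr}_P(D^\heartsuit)$ in §1.25. This action is given by a 1-parameter subgroup of $L$. In particular, it preserves each $P$-orbit on $\operatorname{Gr}(D^\heartsuit)$. Hence, each piece $\operatorname{Gr}_P(D^\heartsuit, d)$ is stable under the $\mathbb{C}^*$-action. Let $\operatorname{Gr}(D^\heartsuit, d)^{\mathbb{C}^*}$ be the fixed point set of the $\mathbb{C}^*$-action on $\operatorname{Gr}(D^\heartsuit, d)$ (a smooth projective variety). Let $s : \operatorname{Gr}(D^\heartsuit, d) \to \operatorname{Gr}(D^\heartsuit, d)^{\mathbb{C}^*}$ be defined by $s(V) = \lim_{\lambda\to\infty} \rho_\lambda(V)$. Then $s$ is easily seen to be a vector bundle such that the $\mathbb{C}^*$-action is linear on each fibre, with weights strictly less than zero. Since $\operatorname{Gr}_P(D^\heartsuit, d)$ is a closed, smooth $\mathbb{C}^*$-stable subvariety of $\operatorname{Gr}(D^\heartsuit, d)$, it follows that $s$ restricts to a map
$$\operatorname{Gr}_P(D^\heartsuit, d) \longrightarrow \operatorname{Gr}_P(D^\heartsuit, d)^{\mathbb{C}^*} = \operatorname{Gr}_P(D^\heartsuit, d) \cap \operatorname{Gr}(D^\heartsuit, d)^{\mathbb{C}^*},$$
which is itself an affine space bundle. Note that $\operatorname{Gr}_P(D^\heartsuit, d)^{\mathbb{C}^*}$ may be identified with $Z_f^1$ (see §3.16). It is a smooth projective variety of pure dimension given by Proposition 3.17.
3.20. The decomposition 2.3(a) gives rise to a decomposition of P-modules $D^\heartsuit = \oplus_{r=1}^{r_0} ({}^rD)^\heartsuit$, and it is clear that the fixed point set of the $R$-action on $\operatorname{Gr}_P(D^\heartsuit)^{\mathbb{C}^*}$ may be naturally identified with $\prod_{r=1}^{r_0} \operatorname{Gr}_P(({}^rD)^\heartsuit)^{\mathbb{C}^*}$. In particular,
$$\chi\bigl(\operatorname{Gr}_P(D^\heartsuit)^{\mathbb{C}^*}\bigr) = \prod_r \chi\bigl(\operatorname{Gr}_P(({}^rD)^\heartsuit)^{\mathbb{C}^*}\bigr),$$
where $\chi$ denotes the Euler characteristic. It follows that
(a) $\chi(\operatorname{Gr}_P(D^\heartsuit)) = \prod_r \chi(\operatorname{Gr}_P(({}^rD)^\heartsuit))$.
We also see that each connected component of $\operatorname{Gr}_P(D^\heartsuit)^{\mathbb{C}^*}$ has an action of $R$ whose fixed point set is a union of components of $\prod_{r=1}^{r_0} \operatorname{Gr}_P(({}^rD)^\heartsuit)^{\mathbb{C}^*}$.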
3.21. In this subsection, we assume that $|D| = k$ for some $k$. Let $\delta \in \{1, -1\}$ be such that $k \in I^\delta$ (see §1.2). We have $e_i P^u e_k = 0$ if $i \notin I^{(-1)^u\delta}$. Hence, $D_i^{\heartsuit,u} = 0$ if $i \notin I^{(-1)^u\delta}$. It follows that in Proposition 3.17(a) we have $\dim W_{h'}^u \dim W_{h''}^u = 0$ for all $h, u$ and $\dim W_i^u \dim W_i^{u+1} = 0$ for all $i, u$; hence, $D_0 = 0$. In particular, $\dim Z_f = D$ depends only on $\dim W_i$ (not on $\dim W_i^u$).
The connected components $C$ of $\operatorname{Gr}_P(D^\heartsuit)^{\mathbb{C}^*}$ can be described explicitly in the following cases.
Type A: each $C$ is a point.
Type D: each $C$ is a point or a product of projective lines.
Type $E_6$: each $C$ is a point; a product of 1, 2, or 3 projective lines; or a projective plane with 0, 1, 2, or 3 points blown up.
It would be interesting to explicitly describe $C$ for types $E_7$, $E_8$.
References
[DLP] C. De Concini, G. Lusztig, and C. Procesi, Homology of the zero-set of a nilpotent vector field on a flag manifold, J. Amer. Math. Soc. 1 (1988), 15–34.
[GP] I. M. Gelfand and V. A. Ponomarev, Model algebras and representations of graphs (in Russian), Funktsional. Anal. i Prilozhen. 13 (1979), no. 3, 1–12.
[L1] G. Lusztig, Bases in equivariant K-theory, Represent. Theory 2 (1998), 298–369, http://www.ams.org/ert/.
[L2] G. Lusztig, On quiver varieties, Adv. Math. 136 (1998), 141–182.
[L3] G. Lusztig, Bases in equivariant K-theory, II, Represent. Theory 3 (1999), 281–356, http://www.ams.org/ert/.
[N1] H. Nakajima, Quiver varieties and Kac-Moody algebras, Duke Math. J. 91 (1998), 515–560.
[N2] H. Nakajima, Quiver varieties and finite dimensional representations of quantum affine algebras, preprint, 1999.
Department of Mathematics, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
Vol. 105, No. 2
DUKE MATHEMATICAL JOURNAL
© 2000
PAIR CORRELATION OF VALUES OF RATIONAL FUNCTIONS (mod p) FLORIN P. BOCA and ALEXANDRU ZAHARESCU
1. Introduction. The problem of understanding the distribution of fractional parts of polynomials has a long history. Weyl [19] proved that, for any polynomial $f(x) = \alpha_d x^d + \alpha_{d-1} x^{d-1} + \cdots + \alpha_1 x$ having at least one irrational coefficient, the sequence of fractional parts $(\{f(n)\})_{n\in\mathbb{N}}$ is equidistributed in the unit interval $[0, 1)$. Since then, many other results on the distribution of fractional parts of polynomials, particularly on "small" fractional parts, have been obtained (see, e.g., Schmidt [15] and Baker [1]). A different aspect of the random behavior of such a sequence has attracted attention recently, namely, the distribution of the spacings between members of the sequence. This problem came up in the context of the distribution of spacings between energy levels of integrable systems (see Berry and Tabor [2]). If $f(x) = \alpha x$, the spacings are essentially those of the energy levels of a two-dimensional harmonic oscillator (see Pandey, Bohigas, and Giannoni [10] and Bleher [3], [4]). In this case the sequence is not random; in fact, for any $\alpha$ and $N$ the consecutive spacings of the first $N$ members of $(\{\alpha n\})_{n\in\mathbb{N}}$ take at most three values (see Sós [16] and Swierczkovski [17]). In the more interesting case $f(x) = \alpha x^d$, $d \ge 2$, the pair correlation problem was investigated by Rudnick and Sarnak [12], who proved that for almost all $\alpha$ the pair correlation function exists and is "Poissonian."
The pair correlation function $R_2(x)$ is defined as follows. Let $x = (x_n)_{n\in\mathbb{N}}$ be a sequence of real numbers equidistributed in the unit interval $[0, 1)$. We take the first $N$ members of the sequence, which provide us with a partition of $[0, 1)$ into small intervals of length $1/N$ on average. Then we count the number of normalized differences between members of the sequence which fall (mod $\mathbb{Z}$) in a given interval $I = [a, b]$. Thus, we consider the quantity
$$R_2(N, x, I) := \frac{1}{N}\,\#\bigl\{1 \le i \ne j \le N;\ N(x_i - x_j) \in I + N\mathbb{Z}\bigr\}. \tag{1.1}$$
Note that for $I = [-s, s]$ one gets
$$R_2\bigl(N, x, [-s, s]\bigr) = \frac{1}{N}\,\#\Bigl\{1 \le i \ne j \le N;\ \|x_i - x_j\| \le \frac{s}{N}\Bigr\},$$
where $\|\cdot\|$ denotes the distance to the nearest integer. This is the formula that defined the pair correlation in [12].
Received 26 August 1999. Revision received 20 January 2000. 2000 Mathematics Subject Classification. Primary 11J71; Secondary 11S05, 11J83. Boca's work supported by an Engineering and Physical Sciences Research Council Advanced Fellowship.
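The count in (1.1) can be sketched directly. The following is a minimal illustration under stated assumptions, not part of the paper: the function name `pair_correlation` and the sample sequence `grid` are hypothetical, and exact rationals are used so that the membership test $N(x_i - x_j) \in I + N\mathbb{Z}$ is not affected by rounding.

```python
import math
from fractions import Fraction

def pair_correlation(xs, a, b):
    """Return R_2(N, x, [a, b]) from (1.1), where N = len(xs)."""
    n = len(xs)
    count = 0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            d = n * (xs[i] - xs[j])
            # d lies in [a, b] + N*Z iff some integer k satisfies
            # (d - b)/N <= k <= (d - a)/N.
            if math.ceil((d - b) / n) <= math.floor((d - a) / n):
                count += 1
    return count / n

# Four equally spaced points: the normalized differences are the
# nonzero integers in [-3, 3], so none of them is 0 modulo N = 4.
grid = [Fraction(k, 4) for k in range(4)]
r_wide = pair_correlation(grid, -1, 1)
r_zero = pair_correlation(grid, 0, 0)
```

The double loop is quadratic in $N$, which is adequate for small experiments; for a sequence such as the fractional parts of $\alpha n^2$ one would pass floating-point values and accept the resulting rounding at the interval endpoints.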
If $\lim_{N\to\infty} R_2(N, x, I)$ exists, then as a function of $I$ it is a measure on $\mathbb{R}$, called the pair correlation measure associated to the sequence $(x_n)_{n\in\mathbb{N}}$. The pair correlation function is the density of this measure; that is,
$$\lim_{N\to\infty} R_2\bigl(N, x, [a, b]\bigr) = \int_a^b R_2(t)\,dt.$$
If $R_2(t)$ exists and $R_2(t) = 1$, the pair correlation of the given sequence is called Poissonian. This is what one would expect to obtain for numbers $x_n$ randomly chosen in the interval $[0, 1)$. A smooth form of (1.1) is to take a test function $h \in C_c^\infty(\mathbb{R})$ and set
$$R_2(N, x, h) := \frac{1}{N} \sum_{1 \le i \ne j \le N}\ \sum_{k\in\mathbb{Z}} h\bigl(N(x_i - x_j + k)\bigr).$$
∞ The Poisson case occurs when R2 (N, x, h) → −∞ h(t) dt as N → ∞. For comparison, the correlation of the imaginary parts of zeros of the Riemann zeta function is not Poissonian (see Montgomery [7] and Rudnick and Sarnak [11]). In that case the pair correlation conjecture of Montgomery asserts that R2 (t) exists and R2 (t) = 1 − (sin π t/π t)2 . For an overview of some spacing distribution problems, see Sarnak [14]. Significant results on the m-level correlation of ({αn2 })n∈N were obtained by Rudnick, Sarnak, and Zaharescu in [13]. In that paper, rational approximations b/q for α were used to reduce the spacing distribution problem for the original sequence to a similar problem for sequences of form {b (mod q)}1≤n≤N , where q 1/2+ε N q 1−ε , ε > 0 and q → ∞. At least for q prime, the problem is related to estimates on the number of points of certain algebraic curves over finite fields, for which one can apply the results of Weil [18]. In this paper we study the pair correlation for sequences of form {r(n) (mod p)}1≤n≤N , where r(X) is an arbitrary rational function with integer coefficients. If p ≥ 2 is an integer and x = (xn )n is a sequence of rational numbers with denominators coprime with p and I an interval, then one defines 1 p R2 p, N, x, I := # 1 ≤ i = j ≤ N ; ∃h ∈ Z ∩ I, xi − xj = h (mod p) . N N (1.2) It is easily seen that for xn ∈ Z one has
R2 p, N, x, I = R2 N,
xn p
,I .
(1.3)
n
For r rational function with integer coefficients one lets R2 p, N, r, I = R2 p, N, (r(n))n , I .
(1.4)
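As an illustration of definitions (1.1) and (1.2), the following Python sketch computes both counts by brute force, so that identity (1.3) can be checked on a small integer sequence. The function names and the use of exact `Fraction` arithmetic are assumptions made for this illustration and do not come from the paper.

```python
from fractions import Fraction
from math import ceil, floor


def r2_unit_interval(xs, a, b):
    """R_2(N, x, [a, b]) as in (1.1), for a list xs of Fractions (N = len(xs))."""
    n = len(xs)
    count = 0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            d = n * (xs[i] - xs[j])
            # d lies in [a, b] + nZ iff some integer k satisfies
            # (d - b)/n <= k <= (d - a)/n.
            if floor((d - a) / n) >= ceil((d - b) / n):
                count += 1
    return Fraction(count, n)


def r2_mod_p(xs, p, a, b):
    """R_2(p, N, x, [a, b]) as in (1.2), for a list xs of integers."""
    n = len(xs)
    lo, hi = Fraction(p) * a / n, Fraction(p) * b / n  # endpoints of (p/N) I
    count = 0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            d = (xs[i] - xs[j]) % p
            # Some integer h = d + k p lies in [lo, hi] iff an integer k
            # satisfies (lo - d)/p <= k <= (hi - d)/p.
            if floor((hi - d) / p) >= ceil((lo - d) / p):
                count += 1
    return Fraction(count, n)


# Small example: x_n = 7 n^2 with p = 101 and N = 20 (values chosen arbitrarily).
p, N = 101, 20
xs = [7 * n * n for n in range(1, N + 1)]
lhs = r2_mod_p(xs, p, -1, 1)
rhs = r2_unit_interval([Fraction(x, p) for x in xs], -1, 1)
```

Both functions run in $O(N^2)$ time, which is adequate only for experimentation with small $N$.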
PAIR CORRELATION OF VALUES OF RATIONAL FUNCTIONS (mod p)
For $p^{3/4+\delta} \ll N \ll p^{1-\delta}$ one can show, following the method described above, that the pair correlation is Poissonian. However, this method fails for $N$ small with respect to $p$. We fix a rational function $r(X) = f(X)/g(X)$, with $f, g$ relatively prime polynomials in $\mathbb{Z}[X]$, and we consider for large prime numbers $p$ sequences of the form $\{br(n) \pmod p\}_{1\le n\le N}$, where $N$ is small with respect to $p$; more precisely, $p^{\delta} \ll N \ll p^{1/c(r)-\delta}$. In what follows we take $c(r) = 3\deg(g) + \deg(f) + 2$ if $\deg(g) \ge 1$, and $c(r) = \deg(r)$ if $r$ is a nonlinear polynomial. Although we are not able to treat each sequence $\{br(n) \pmod p\}_{1\le n\le N}$ individually, we present a method that enables us to show that the pair correlation is Poissonian for almost all the residues $b \pmod p$. More precisely, we prove in Section 3 the following theorem.

Theorem 1.1. Fix $\delta > 0$, a rational function $r(X)$ with integer coefficients which is not a linear polynomial, and $h \in C_c^\infty(\mathbb{R})$. Then there exists $\delta_1 = \delta_1(\delta, r, h) > 0$, and for each $p$ there exists an exceptional set $\mathcal{E}(\delta, r, h, p)$ of residues $b \pmod p$ having at most $p^{1-\delta_1}$ elements such that, for any $b$ in the complementary set of $\mathcal{E}(\delta, r, h, p)$ and any integer $N$ with $p^{\delta} < N < p^{1/c(r)-\delta}$, one has
$$\Bigl| R_2(p, N, br, h) - \int_{\mathbb{R}} h(t)\,dt \Bigr| \ll_{\delta,r,h} p^{-\delta_1}.$$
Since this can handle the case when $N$ is small, Theorem 1.1 has as a corollary the following result of Rudnick and Sarnak [12].

Corollary 1.2. Let $f \in \mathbb{Z}[X]$ be a polynomial with $d = \deg(f) \ge 2$. Then for almost all real numbers $\alpha$ the pair correlation of the sequence of fractional parts $(\{\alpha f(n)\})_{n\in\mathbb{N}}$ is Poissonian.

In Section 4 we use the averaging technique from the proof of Theorem 1.1 in a more subtle way, fixing a rational function $r(X) \in \mathbb{F}_p(X)$ and performing an average along the sequence of fractional parts of $r(n)$. Thus we look at small pieces $\mathcal{N}_m = (\{r(n)\})_{m<n\le m+N}$ [...] $p^{3/4+\delta}$. By averaging over $m$ we can deal with smaller values of $N$, similar to those from Theorem 1.1. What is a little surprising is that $M$ itself can be made smaller than $p^{3/4+\delta}$. In fact, any $M > p^{1/2+\delta}$ will do. More precisely, with $R_2(\mathcal{N}_m, I)$ equal to
$$\frac{1}{N}\,\#\Bigl\{m < x \ne y \le m+N;\ g(x), g(y) \ne 0,\ \exists h \in \mathbb{Z} \cap \frac{p}{N} I,\ r(x) - r(y) = h \pmod p\Bigr\}, \tag{1.5}$$
we prove the following result.
Theorem 1.3. Fix $1/2 < \theta < 1$, $0 < 8\delta < 2\theta - 1$, $D \ge 1$. Let also $\delta_1 > 0$. Then, for any large prime $p$, any $M \in [p^{\theta}, p]$, and any rational function $r(X) = f(X)/g(X) \in \mathbb{F}_p(X)$, $\deg(f), \deg(g) \le D$, $r(X)$ not a polynomial of degree less than or equal to 3, there exists an exceptional set $\mathcal{E} = \mathcal{E}(\theta, \delta, \delta_1, D, p, M) \subset \{1, \dots, M\}$ with
$$\#\mathcal{E} \ll_{\theta,\delta,D} M p^{2\delta_1 - \delta} \tag{1.6}$$
and such that for any $m \in \{1, \dots, M\} \setminus \mathcal{E}$, any $N \in [p^{2\delta}, p^{c(\theta)-\delta}]$, where $c(\theta) = \theta/2 - 1/4$ as in Theorem 4.1, any $s > 0$, and any interval $I \subset [-s, s]$, one has
$$\bigl| R_2(\mathcal{N}_m, I) - |I| \bigr| \le s p^{-\delta_1}. \tag{1.7}$$

The fact that $M$ can be taken to be small in the above result gives hope that one might be able to obtain along the same lines a corresponding result for the pair correlation of $\alpha n^2$ (mod p) for any irrational number $\alpha$. Technically, the proof of Theorem 1.3 gets more complicated when $r(X)$ is a second degree polynomial. In Section 5 we treat this case for the more general modulus $q$. The attempt turns out to be successful, and we conclude with the following theorem.

Theorem 1.4. For any irrational number $\alpha$ and any $\sigma \ge 5$, there exist two sequences of integers $M_j, N_j \to \infty$ with
$$\lim_j \frac{\log M_j}{\log N_j} = \sigma$$
such that the pair correlation of ({αn2 })Mj
Lemma 2.1. Assume that $g = g_1 \cdots g_t$, where $g_1, \dots, g_t$ are primitive integral polynomials with discriminants $D_1, \dots, D_t \ne 0$, and that $d \ge 1$ is an integer. Then the number of solutions $x \pmod d$ of the congruence $g(x) = 0 \pmod d$ is at most
$$d^{1-1/t}\bigl(\deg(g)\max(D_1^2, \dots, D_t^2)\bigr)^{\omega(d)}.$$

Proof. We first assume $d = p^{\alpha}$, $p$ prime, and notice that
$$\#\{x \bmod p^{\alpha};\ g(x) = 0 \bmod p^{\alpha}\} \le \max_{1\le j\le t} \#\{x \bmod p^{\alpha};\ p^{[(\alpha+t-1)/t]} \mid g_j(x)\}. \tag{2.1}$$
Let $0 \le \beta < \alpha$. Writing $x = x' + p^{\beta} x''$, $x' \in \{0, 1, \dots, p^{\beta}-1\}$, $x'' \in \{0, 1, \dots, p^{\alpha-\beta}-1\}$, it is clear that $g_j(x) = g_j(x') \pmod{p^{\beta}}$; hence
$$\#\{x \bmod p^{\alpha};\ p^{\beta} \mid g_j(x)\} = p^{\alpha-\beta}\,\#\{x' \bmod p^{\beta};\ g_j(x') = 0 \bmod p^{\beta}\}.$$
By a result of Nagell and Ore (see [9, Theorem 54]) the right-hand side of this equality is at most $p^{\alpha-\beta}\deg(g_j)D_j^2$; thus
$$\#\{x \bmod p^{\alpha};\ p^{[(\alpha+t-1)/t]} \mid g_j(x)\} \le p^{\alpha-[(\alpha+t-1)/t]}\deg(g)D_j^2,$$
which we combine with (2.1) to get
$$\#\{x \bmod p^{\alpha};\ g(x) = 0 \bmod p^{\alpha}\} \le p^{\alpha-[(\alpha+t-1)/t]}\deg(g)\max(D_1^2, \dots, D_t^2) \le p^{\alpha-\alpha/t}\deg(g)\max(D_1^2, \dots, D_t^2).$$
We conclude the proof using the Chinese remainder theorem.

Let $N$ and $q$ be (large) integers, and let $A, B \ge 1$ be integers such that $\max(4A, 4B) \le q$. Let $r(X) = f(X)/g(X)$, where $f, g \in \mathbb{Z}[X]$. We denote by $\mathcal{L}(N, q, A, B)$ the number of integral 6-tuples $(x_1, y_1, x_2, y_2, t, v)$ such that
$$1 \le x_1, y_1, x_2, y_2 \le N,\quad r(x_1) \ne r(y_1),\ r(x_2) \ne r(y_2),\quad g(x_j) \ne 0,\ g(y_j) \ne 0, \tag{2.2}$$
$$|t| \in [A, 2A],\quad |v| \in [B, 2B], \tag{2.3}$$
$$v\bigl(r(x_2) - r(y_2)\bigr) = t\bigl(r(x_1) - r(y_1)\bigr) \pmod q. \tag{2.4}$$
For $\delta > 0$ we consider the quantity $N^2 \max(N, A)\max(N, B)\,q^{-\delta}$, regarded as a function of $(N, q, A, B, \delta)$.
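The counting function defined by (2.2)-(2.4) can be explored by brute force for small parameters. The sketch below is only an illustration under stated assumptions: the modulus $q$ is taken to be prime, values $x$ with $g(x) \equiv 0 \pmod q$ are discarded so that $r(x)$ can be reduced mod $q$, and the name `count_L` is hypothetical.

```python
def count_L(f, g, N, q, A, B):
    """Count 6-tuples (x1, y1, x2, y2, t, v) satisfying (2.2)-(2.4) for r = f/g.

    Assumptions of this sketch: q is prime, and only x with g(x) invertible
    mod q are kept, so that r(x) mod q makes sense.
    """
    def r_mod(x):
        # r(x) reduced mod q, using the modular inverse of g(x) (Python 3.8+).
        return f(x) * pow(g(x), -1, q) % q

    xs = [x for x in range(1, N + 1) if g(x) % q != 0]
    # Pairs with r(x) != r(y) as rational numbers, tested by cross-multiplying.
    pairs = [(x, y) for x in xs for y in xs if f(x) * g(y) != f(y) * g(x)]
    diffs = [(r_mod(x) - r_mod(y)) % q for x, y in pairs]
    ts = [sign * k for k in range(A, 2 * A + 1) for sign in (1, -1)]
    vs = [sign * k for k in range(B, 2 * B + 1) for sign in (1, -1)]

    count = 0
    for d1 in diffs:          # r(x1) - r(y1) mod q
        for d2 in diffs:      # r(x2) - r(y2) mod q
            for t in ts:
                for v in vs:
                    if (v * d2 - t * d1) % q == 0:
                        count += 1
    return count


# Example with r(X) = 1/X, N = 4, q = 11 (arbitrary small values, 4A, 4B <= q).
value = count_L(lambda x: 1, lambda x: x, 4, 11, 1, 2)
```

Swapping the roles of $(x_1, y_1, t, A)$ and $(x_2, y_2, v, B)$ leaves the congruence (2.4) unchanged, so the count is symmetric in $A$ and $B$; this gives a simple sanity check.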
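For $g(X)$ as in Lemma 2.1 or $r \in \mathbb{Z}[X]$, we let
$$c(r) = \begin{cases} 3\deg(g) + \deg(f) + 2, & \text{if } r(X) = f(X)/g(X) \text{ and } g \text{ nonconstant},\\ \deg(r), & \text{if } r \in \mathbb{Z}[X] \text{ and } \deg(r) \ge 2, \end{cases}$$
and
$$\alpha(r) = \begin{cases} 2\bigl(t + \deg(g)\bigr), & \text{if } r(X) = f(X)/g(X) \text{ and } g \text{ is as in Lemma 2.1},\\ 1, & \text{if } r \in \mathbb{Z}[X],\ \deg(r) \ge 2. \end{cases}$$
We also consider $a(\delta)$ and $b(\delta)$ such that
$$\alpha(r)\delta < a(\delta) < b(\delta) < \frac{1-\delta}{c(r)}. \tag{2.5}$$

Proposition 2.2. Assume that $\deg(g) \ge 1$, $\delta > 0$, and $a(\delta)$, $b(\delta)$ are as in (2.5). Then, for any large integers $N$ and $p$, with $p$ prime, such that
$$p^{a(\delta)} \le N \le p^{b(\delta)}, \tag{2.6}$$
one has
$$\mathcal{L} = \mathcal{L}(N, p, A, B) \ll_{\delta,r} N^2 \max(N, A)\max(N, B)\,p^{-\delta}, \tag{2.7}$$
uniformly in $A$ and $B$.

Proof. We let
$$a_1 = v\,g(x_1)g(y_1)\bigl(f(x_2)g(y_2) - f(y_2)g(x_2)\bigr), \tag{2.8}$$
$$a_2 = t\,g(x_2)g(y_2)\bigl(f(x_1)g(y_1) - f(y_1)g(x_1)\bigr). \tag{2.9}$$
The congruence (mod $p$) from (2.4) is equivalent to
$$a_1 = a_2 \pmod p. \tag{2.10}$$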
It is clear that
$$0 \ne |a_1| \ll BN^{c(r)-2} \ll p^{2-\delta} \quad\text{and}\quad 0 \ne |a_2| \ll AN^{c(r)-2} \ll p^{2-\delta}, \tag{2.11}$$
and therefore $\tau(a_1), \tau(a_2) \ll p^{\varepsilon_3}$. It follows that for each pair $(a_1, a_2)$ as in (2.10) and (2.11), the number of 6-tuples $(x_1, y_1, x_2, y_2, t, v)$ which satisfy (2.8) and (2.9) is $\ll p^{6\varepsilon_3} \ll p^{\varepsilon_2}$. This is clear when $g$ is nonconstant. Note for further use in the proof of Lemma 2.3 that this is also true when $r \in \mathbb{Z}[X]$. In this case $\deg(f) \ge 2$ and the number of admissible values for $f(x_1) - f(y_1)$ and $f(x_2) - f(y_2)$ is $\ll p^{\varepsilon_3}$, and so is the number of admissible values for $x_1 - y_1$ as a divisor of $f(x_1) - f(y_1)$ and for $x_2 - y_2$ as a divisor of $f(x_2) - f(y_2)$. Since the equation $f(x + h) - f(x) = c$, with $h$ a fixed nonzero value of $x_j - y_j$, has at most $\deg(f) - 1$ solutions, it follows that the number of 6-tuples $(x_1, y_1, x_2, y_2, t, v)$ corresponding to $(a_1, a_2)$ is $\ll p^{\varepsilon_2}$ in this case too.
Case 1: $\max(A, B)N^{c(r)-2} \ge p^{1-\varepsilon_1}$ for some $\varepsilon_1 > 0$ (to be determined later). Assume, for instance,
$$AN^{c(r)-2} \ge p^{1-\varepsilon_1}. \tag{2.12}$$
For fixed $a_1$, the number of $a_2$ which satisfy (2.10) and the second part of (2.11) is $\ll AN^{c(r)-2}/p + 1$. But the number of $a_1$ given by (2.8) with $x_1, y_1, x_2, y_2$, and $v$ as in (2.2) and (2.3) is $\ll BN^4$ ($g(x_1)$ takes at most $N$ values, etc.); thus the number of admissible pairs $(a_1, a_2)$ is $\ll BN^4 + ABN^{c(r)+2}/p$ and the total number of admissible 6-tuples $(x_1, y_1, x_2, y_2, t, v)$ is
$$\ll BN^4 p^{\varepsilon_2} + \frac{ABN^{c(r)+2}}{p^{1-\varepsilon_2}}.$$
The proof is complete in this case since, from (2.12), $BN^4 p^{\varepsilon_2} \le ABN^2/p^{\delta}$ and, from (2.6), $ABN^{c(r)+2}/p^{1-\varepsilon_2} \le ABN^2/p^{\delta}$, and $ABN^2/p^{\delta}$ is at most the right-hand side of (2.7), provided that $0 < \varepsilon_1 + \varepsilon_2 < 1 - \delta - c(r)b(\delta)$, which is our choice for $\varepsilon_1$ and $\varepsilon_2$.

Case 2: $\max(A, B)N^{c(r)-2} \le p^{1-\varepsilon_1}$. In this case, for large $N$ and $p$, $|a_1| < p/2$ and $|a_2| < p/2$; hence the congruence (2.10) transforms into the equality
$$v\,g(x_1)g(y_1)\bigl(f(x_2)g(y_2) - f(y_2)g(x_2)\bigr) = t\,g(x_2)g(y_2)\bigl(f(x_1)g(y_1) - f(y_1)g(x_1)\bigr). \tag{2.13}$$
Since $\gcd(f(x_2), g(x_2))$ divides, for any integer $x_2$, the resultant $R = R(f, g) \in \mathbb{Z}$ and, by (2.13), $g(y_2)$ divides $v\,g(x_1)g(y_1)g(x_2)f(y_2)$, it follows that, for a fixed quadruple $(v, x_1, y_1, x_2)$, the number of admissible $y_2$ is $\ll \tau(v\,g(x_1)g(y_1)g(x_2)) \ll p^{\varepsilon_4}$. Combining this with the fact that $t$, if it exists, is uniquely determined by $v, x_1, y_1, x_2, y_2$, we get
$$\mathcal{L} \ll p^{\varepsilon_4} \cdot \#\{\text{admissible } (v, x_1, y_1, x_2)\} \ll BN^3 p^{\varepsilon_4}.$$
On the other hand, the right-hand side of (2.7) is $\ge ABN^2/p^{\delta} \ge BN^3 p^{\varepsilon_4}$ if $A \ge Np^{\varepsilon_4+\delta}$, so (2.7) holds in that range, and similarly if $B \ge Np^{\varepsilon_4+\delta}$. As a result we can assume in the remainder of the proof
$$\max(A, B) \le Np^{\varepsilon_4+\delta}. \tag{2.14}$$
Using again the previous argument ($t$ determined by the other five, $\tau(v\,g(x_1)g(y_1)g(x_2)) \ll p^{\varepsilon_4}$, and $g(y_2)$ divides $v\,g(x_1)g(y_1)g(x_2)f(y_2)$) and (2.14), it follows that the number of admissible 6-tuples $(x_1, y_1, x_2, y_2, t, v)$ with $x_2 \le N/p^{\varepsilon_4+\delta}$ is
$$\ll p^{\varepsilon_4}\,\frac{BN^3}{p^{\varepsilon_4+\delta}} = \frac{BN^3}{p^{\delta}},$$
which is at most the right-hand side of (2.7).
So we may and do assume for the remainder of the proof that both (2.14) and min(x2 , y2 ) ≥ hold. If we denote d = gcd g(x2 ), vg(x1 )g(y1 )
N p ε4 +δ
(2.15)
d = gcd g(y2 ), vg(x1 )g(y1 ) ,
and
then (2.13) provides g(x2 )g(y2 ) | dd f (x2 )g(y2 ) − f (y2 )g(x2 ) .
(2.16)
Denote by h ∈ Q[X] the remainder of the division of f by g. Then there exists L = L(f, g) ∈ Z∗ such that Lh ∈ Z[X] and, by (2.16), g(x2 )g(y2 ) | z = Ldd h(x2 )g(y2 ) − h(y2 )g(x2 ) = 0. (2.17) Let C > 0 such that a(δ)−(2ε4 +2δ) deg(g) > 2C > 2δt (for ε4 > 0 small enough). By Lemma 2.1 there exists a constant C(g) > 0 such that the number ᏺ(N, d) of integers 1 ≤ x2 ≤ N with d | g(x2 ) is at most C(g)ω(d) Nd −1/t if N > d. For N ≤ d we have the trivial bound ᏺ(N, d) ≤ deg(g). Since t → C(g)ω(d) is a multiplicative function, we get C(g)ω(d) ≤ τ (d [C(g)]+1 ) d ε5 and, therefore, Nd ε5 −1/t , if N > d, (2.18) ᏺ(N, d) deg(g), if N ≤ d. But d is a divisor of vg(x1 )g(y1 ); hence it is allowed to take a number of values which is pε6 , and we conclude, using the fact that y2 takes q ε4 values, that the number of admissible 6-tuples (x1 , y1 , x2 , y2 , t, v) with d ≥ N, respectively, N ≥ d ≥ p C , is BN 2 p ε6 +ε4 ≤
BN 3 ≤ , pδ
respectively, BN 2 p ε4 +ε6
N p C(1/t−ε5 )
≤
BN 3 ≤ , pδ
for appropriate choices of ε4 , ε5 , and ε6 . A similar argument sorts out the case d ≥ p C . Finally, for d, d ≤ p C , (2.17) provides |z| N 2 deg(g)−1 p 2C . On the other hand, by (2.15), the first part of (2.6), and (2.5), we gather g(x2 )g(y2 )
N 2 deg(g) p (2ε4 +2δ) deg(g)
≥ N 2 deg(g)−1 p 2C |z|.
Hence, for large p and N, we get |z| < |g(x2 )g(y2 )|, which contradicts (2.17).
Lemma 2.3. Assume that r = f ∈ Z[X], d = deg(f ) ≥ 2, 0 < δ < 1, and, for large N and p, pa(δ) ≤ N ≤ p b(δ) , with a(δ) and b(δ) as in (2.5). Then ᏸ = ᏸ N, p, A, B δ,r = N, p, A, B, δ uniformly in A and B. Proof. In this case we simply take a1 = v f (x2 ) − f (y2 ) , and we get
0 = |a1 | BN d ,
a2 = t f (x1 ) − f (y1 ) , 0 = |a2 | AN d .
Since a2 takes AN 2 values and the congruence a1 = a2 (mod p) with |a1 | BN d has a number of solutions which is [BN d /p] + 1 ≤ BN d /p + 1, it follows that the number of admissible pairs (a1 , a2 ) is ABN d+2 /p + AN 2 . Arguing as in the proof of Proposition 2.2, we conclude that the number of admissible 6-tuples (x1 , y1 , x2 , y2 , t, v) is in this case
ABN d+2 + AN 2 p ε2 , p 1−ε2
which is $\le ABN^2/p^{\delta} + AN^3/p^{\delta}$, and hence at most a constant times the right-hand side of (2.7), if we choose $0 < \varepsilon_2 < \min\bigl(a(\delta) - \delta,\ 1 - \delta - d\,b(\delta)\bigr)$.

We estimate now
$$S(p, N, r, I) = \sum_{b \pmod p} \bigl| R_2(p, N, br, I) - |I| \bigr|^2,$$
where $R_2(p, N, br, I)$ is as in (1.4), and prove the following proposition.

Proposition 2.4. Let $s > 0$, and assume that $r$ and $\delta$ are as in Proposition 2.2 or Lemma 2.3, $a(\delta)$ and $b(\delta)$ are as in (2.5), and $I \subset [-s, s]$ is an interval. Then, for any large integers $N$ and $p$, with $p$ prime, such that $p^{a(\delta)} \le N \le p^{b(\delta)}$, one has
$$S = S(p, N, r, I) \ll_{\delta,r} s^2 p^{1-\delta}. \tag{2.19}$$
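For small primes the sum $S(p, N, r, I)$ of Proposition 2.4 can be evaluated directly, which may help in getting a feel for the size of the exceptional set of residues $b$. The following sketch assumes that $r$ is a polynomial given as a Python callable and that $I = [a, b]$; the function names are hypothetical, and the quadratic-time count is suitable only for experimentation.

```python
from fractions import Fraction
from math import ceil, floor


def r2_mod_p(values, p, a, b):
    """R_2(p, N, x, [a, b]) as in (1.2), for a list of integers (N = len(values))."""
    n = len(values)
    lo, hi = Fraction(p) * a / n, Fraction(p) * b / n  # endpoints of (p/N) I
    count = 0
    for i in range(n):
        for j in range(n):
            if i != j:
                d = (values[i] - values[j]) % p
                # Is there an integer h = d + k p in [lo, hi]?
                if floor((hi - d) / p) >= ceil((lo - d) / p):
                    count += 1
    return Fraction(count, n)


def variance_sum(r, p, N, a, b):
    """S(p, N, r, I): sum over residues c mod p of |R_2(p, N, c r, I) - |I||^2."""
    length = Fraction(b) - Fraction(a)
    total = Fraction(0)
    for c in range(p):
        values = [c * r(n) % p for n in range(1, N + 1)]
        total += (r2_mod_p(values, p, a, b) - length) ** 2
    return total


# Example: r(X) = X^3, p = 31, N = 5, I = [-1, 1] (arbitrary small values).
S = variance_sum(lambda n: n ** 3, 31, 5, -1, 1)
```

The residue $b = 0$ always contributes $R_2 = N - 1$ when $0 \in I$, since all differences vanish; such degenerate residues are among those a bound like (2.19) forces to be rare.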
Proof. By the definition of $R_2$ we gather
$$R_2(p, N, br, I) = \frac{1}{N} \sum_{h\in(p/N)I} \nu(N, b, h),$$
where
$$\nu(N, b, h) = \#\{1 \le x \ne y \le N;\ g(x), g(y) \ne 0,\ b(r(x) - r(y)) = h \pmod p\} = \frac{1}{p} \sum_{\substack{1\le x\ne y\le N\\ g(x),g(y)\ne 0}}\ \sum_{t \pmod p} e\Bigl(\frac{t\bigl(b(r(x) - r(y)) - h\bigr)}{p}\Bigr).$$
Hence
$$R_2(p, N, br, I) = \frac{1}{Np} \sum_{h\in(p/N)I}\ \sum_{\substack{1\le x\ne y\le N\\ g(x),g(y)\ne 0}}\ \sum_{t \pmod p} e\Bigl(\frac{t\bigl(b(r(x) - r(y)) - h\bigr)}{p}\Bigr) = \frac{1}{Np} \sum_{t \pmod p} F(t)G(tb),$$
where
$$G(tb) = \sum_{\substack{1\le x\ne y\le N\\ g(x),g(y)\ne 0}} e\Bigl(\frac{tb\bigl(r(x) - r(y)\bigr)}{p}\Bigr)$$
and where
$$F(t) = \sum_{h\in(p/N)I} e\Bigl(-\frac{th}{p}\Bigr) \tag{2.20}$$
is a geometric progression. Thus, for $t \ne 0 \pmod p$,
$$|F(t)| \ll \min\Bigl(\frac{p|I|}{N}, \frac{p}{|t|}\Bigr) \le \frac{(2s+1)p}{\max(N, |t|)}. \tag{2.21}$$
From now on it is assumed that $-p/2 < t < p/2$. We put the contribution of terms with $t \ne 0$ into
$$S' = \frac{1}{N^2 p^2} \sum_{b \pmod p} \Bigl| \sum_{0\ne t \pmod p} F(t)G(tb) \Bigr|^2,$$
and we assume that we can prove
$$S' \ll_{\delta,r} s^2 p^{1-\delta}. \tag{2.22}$$
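The geometric-sum estimate (2.21) for $F(t)$ can be checked numerically. The sketch below assumes the standard convention $e(x) = \exp(2\pi i x)$ and uses the explicit inequality $|F(t)| \le \min(\#\{h\},\, p/(2|t|))$ for $0 < |t| \le p/2$, which is one concrete form of the bound; the constants are those of this elementary inequality, not necessarily the ones implicit in the paper.

```python
import cmath
from math import ceil, floor, pi


def F(t, p, N, a, b):
    """F(t) = sum of e(-t h / p) over integers h in (p/N)[a, b], as in (2.20)."""
    lo, hi = ceil(p * a / N), floor(p * b / N)
    return sum(cmath.exp(-2j * pi * t * h / p) for h in range(lo, hi + 1))


def explicit_bound(t, p, N, a, b):
    """min(number of terms, p / (2|t|)), valid for 0 < |t| <= p/2."""
    terms = floor(p * b / N) - ceil(p * a / N) + 1
    return min(terms, p / (2 * abs(t)))


# Example values (arbitrary): p = 101, N = 10, I = [-1, 1].
p, N, a, b = 101, 10, -1, 1
worst_ratio = max(abs(F(t, p, N, a, b)) / explicit_bound(t, p, N, a, b)
                  for t in range(1, (p - 1) // 2 + 1))
```

The second term in the minimum comes from $|\sin(\pi t/p)| \ge 2|t|/p$ for $|t| \le p/2$, applied to the closed form of the geometric sum.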
The contribution of t = 0 to R2 is |I |(1 + O(1/N)); hence 1 R2 p, N, br, I − |I | = Np
F (t)G(tb) + O
0 =t (mod p)
s . N
Since N 2 = o(p 1−δ ), this gives R2 p, N, br, I − |I |2 S= b (mod p)
s = S + O N
b (mod p)
1 F (t)G(tb) . Np 0 =t (mod p)
(2.23)
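But, by the Cauchy-Schwarz inequality and (2.22),
$$\sum_{b \pmod p} \Bigl| \frac{1}{Np} \sum_{0\ne t \pmod p} F(t)G(tb) \Bigr| \le p^{1/2} S'^{1/2} \ll s\,p^{1-\delta/2},$$
which we compare with (2.23) to get
$$S = S' + O\Bigl(\frac{s^2 p^{1-\delta/2}}{N}\Bigr) = S' + o\bigl(p^{1-\delta}\bigr). \tag{2.24}$$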
This shows that the contribution of terms with t = 0 to R2 is negligible, and it is enough actually to prove (2.22). For we write S = Using
0 =t1 ,t2 (mod p)
F (t1 )F (t2 )
G(t1 b)G(t2 b).
b (mod p)
G(t1 b)G(t2 b)
b (mod p)
=
1 N 2 p2
1≤x1 ,y1 ,x2 ,y2 ≤N, xj =yj ,g(xj ),g(yj ) =0 b (mod p)
t1 b r(x1 ) − r(y1 ) − t2 b r(x2 ) − r(y2 ) e p
1 ≤ x1 , y1 , x2 , y2 ≤ N ; xj = yj , g(xj ), g(yj ) = 0, = p# , t1 r(x1 ) − r(y1 ) = t2 r(x2 ) − r(y2 ) (mod p)
we gather S =
1 N 2p
F (t1 )F (t2 ).
1≤x1 ,y1 ,x2 ,y2 ≤N 0 =t1 ,t2 (mod p) xj =yj ,g(xj ),g(yj ) =0 t1 (r(x1 )−r(y1 ))=t2 (r(x2 )−r(y2 )) (mod p)
Since for large N and p we have N c(r) ≤ p, it follows that if r(x1 ) = r(y1 ) and t1 (r(x1 )−r(y1 )) = t2 (r(x2 )−r(y2 )) (mod p), 0 = t1 , t2 (mod p), then r(x2 ) = r(y2 ). Hence S = SI + SI I , where 1 S = 2 N p
I
SI I =
1 N 2p
1≤x1 =y1 ≤N 1≤x2 =y2 ≤N 0 =t (mod p) r(x1 )=r(y1 ) r(x2 )=r(y2 )
2 F (t) ,
F (t1 )F (t2 ).
1≤x1 ,y1 ,x2 ,y2 ≤N 0 =t1 ,t2 (mod p) g(xj ),g(yj ) =0, r(xj ) =r(yj ) t1 (r(x1 )−r(y1 ))=t2 (r(x2 )−r(y2 )) (mod p)
The set {(x, y) ∈ N2 ; x = y, r(x) = r(y)} is finite. Hence S I r
1 N 2p
|F (t)|2 r
0 =t (mod p)
s2p N 2 log2 p
δ,r s 2 p 1−δ .
Next, we estimate S I I . The inequality N c(r) ≤ p shows that if r(xj ) = r(yj ), 1 ≤ xj , yj ≤ N, then r(xj ) = r(yj ) (mod p). Hence, since p is prime, the congruence t1 r(x1 ) − r(y1 ) = t2 r(x2 ) − r(y2 )
(mod p)
determines uniquely, for instance, t2 in terms of t1 as t2 = t1 u (mod p), where −1 u = r(x1 ) − r(y1 ) r(x2 ) − r(y2 )
(mod p).
With (2.21) we gather SI I
s2 N 2p
1≤x1 ,y1 ,x2 ,y2 ≤N 0 =t g(xj ),g(yj ) =0 r(xj ) =r(yj )
where v = tu (mod p) and |v| < p/2.
p2 , max N, |t| max N, |v| (mod p)
If we denote
H (u) =
1 , max N, |t| max N, |v|
0 =t (mod p) v=tu (mod p) 0 =|v|
then II S p N2
H (u).
(2.25)
1≤x1 ,y1 ,x2 ,y2 ≤N g(xj ),g(yj ) =0 r(xj ) =r(yj )
Remark 1. For any u we have, by Cauchy’s inequality, 1/2 1/2 1 1 H (u) ≤ 2 · 2 0 =t (mod p) max N, |t| 0 =t (mod p) max N, |v| =
0 =t (mod p)
=
0 =|t|≤N
1
2
max N, |t|
1 + N2
N<|t|≤p/2
This implies |S I I | p1+δ /N 2 factor of N.
1 1 . 2 N |t|
1≤x1 ,x2 ,y1 ,y2 ≤N
1/N = Np1+δ , so we lose by a
Remark 2. We may try something different: for any u there are at most N 3 quadruples (x1 , y1 , x2 , y2 ) such that u = (r(x1 ) − r(y1 ))(r(x2 ) − r(y2 ))−1 since any three of x1 , x2 , y1 , y2 will determine the fourth one. Hence |S I I | Now u (mod p)
p 1+δ 3 N N2
u (mod p)
H (u) = Np 1+δ
1 1 · max N, |t| max N, |v| (mod p) 0 =v (mod p)
H (u) = 0 =t
= 0 =t
H (u).
u (mod p)
2 1 log2 p, max N, |t| (mod p)
so we lose again by a factor of N .
The previous remarks show that, in order to succeed, we need to study carefully the terms in H (u) for which |t| and |v| are simultaneously small and see how many of these do correspond to quadruples (x1 , y1 , x2 , y2 ) of numbers from {1, 2, . . . , N }. To get S I I δ,r s 2 p 1−δ we need to show
H (u) δ,r s 2 N 2 p −δ .
(2.26)
1≤x1 ,x2 ,y1 ,y2 ≤N g(xj ),g(yj ) =0 r(xj ) =r(yj )
To prove (2.26) we break [1, p/2], the range of summation for t in H (u), into dyadic intervals of the form [A, 2A], and we break the range for v into intervals [B, 2B] of the same kind. For 1 ≤ A ≤ p/4, 1 ≤ B ≤ p/4, we let
HA,B (u) =
1 . max N, |t| max N, |v|
|t|∈[A,2A] |v|∈[B,2B] v=tu (mod p)
Then it is clear that, for any u, H (u) ≤ log2 p
sup 1≤A,B≤p/4
HA,B (u).
Note also that if L(A, B, u) denotes the number of pairs (t, v) with |t| ∈ [A, 2A], |v| ∈ [B, 2B], and v = tu (mod p), then HA,B (u)
L(A, B, u) . max(N, A) max(N, B)
Hence (2.26) is implied by the following statement. For any 1 ≤ A, B ≤ p/4, 1≤x1 ,x2 ,y1 ,y2 ≤N g(xj ),g(yj ) =0 r(xj ) =r(yj )
L(A, B, u) δ,r
s 2 N 2 max(N, A) max(N, B) p δ log2 p
.
(2.27)
But the left-hand side of (2.27) equals the left-hand side of (2.7), and the right-hand side of (2.27) equals $\log^{-2} p$ times the right-hand side of (2.7); hence (2.26) holds true under our hypotheses.

3. Proof of Theorem 1.1 and Corollary 1.2. Theorem 1.1 follows immediately from the next result by approximating $h$ by step functions in the uniform norm.
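Theorem 3.1. Let $\delta > 0$, and let $a(\delta)$, $b(\delta)$ be as in (2.5). Then there exists $\delta_1 = \delta_1(\delta) > 0$ such that, for any large prime $p$, $p \ge p(\delta)$, there exists a set $\mathcal{B}_p = \mathcal{B}_p(\delta, r)$ of residue classes (mod $p$) which contains at most $p^{1-\delta_1}$ elements such that, for all integers $N$ with $p^{a(\delta)} < N < p^{b(\delta)}$, all $s > 0$, all intervals $I \subset [-s, s]$, and all $b$ in the complementary set $\mathcal{B}_p^c$ of $\mathcal{B}_p$, one has
$$\bigl| R_2(p, N, br, I) - |I| \bigr| \le s p^{-\delta_1}.$$

Proof. We pick $\theta, \delta_0, \delta_0', \delta_1 > 0$ such that
$$0 < \theta < 1,\qquad 0 < \delta - (1-\theta)b(\delta),\qquad \delta_1 < \delta_0 < \delta_0' < (1-\theta)a(\delta),$$
$$1 + 2\delta_0' - \delta + (1-\theta)b(\delta) < 1 - \delta_1.$$
We write
$$\mathcal{I} = \mathcal{I}(p, \delta) = \bigl[p^{a(\delta)}, p^{b(\delta)}\bigr] \cap \mathbb{Z} = \bigcup_{i=1}^{K_p} \mathcal{I}_i,$$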
where Ᏽi = [Ni , Ni + Niθ ) ∩ Z ∩ Ᏽ, with Ni integers. If we also let this cover be minimal, that is, take Ni+1 to be the smallest integer that is greater than or equal to Ni + Niθ , then Ni+1 ≥ 1 + Niθ−1 > 1 + p (θ−1)b(δ) . Ni Hence
K pb(δ)−a(δ) > 1 + p −(1−θ)b(δ) p ,
which provides b(δ) > Kp log 1 + p −(1−θ)b(δ) Kp p −(1−θ)b(δ) , so we may assume Kp δ b(δ)p (1−θ)b(δ) .
(3.1)
Denote Mi = sup Ᏽi . For each N ∈ Ᏽ and ε > 0, consider (p, ε, N ), respectively, ᏸ(p, ε, N), the set of residue classes b (mod p) with the property that there exist
s0 > 0 and an interval I0 = I0 (p, N) ⊂ [−s0 , s0 ] such that
R2 p, N, br, I0 − |I0 | > s0 p −ε , and, respectively,
R2 p, N, br, I0 − |I0 | < −s0 p −ε .
Then
(p, ε, N)c = b (mod p); R2 p, N, br, I − |I | ≤ sp −ε , ∀s > 0, ∀I ⊂ [−s, s] ,
ᏸ(p, ε, N)c = b (mod p); R2 p, N, br, I − |I | ≥ −sp −ε , ∀s > 0, ∀I ⊂ [−s, s] , Ᏻ(p, ε) =
!
(p, ε, N )c ∩ ᏸ(p, ε, N )c
N∈Ᏽ
= b (mod p); R2 p, N, br, I − |I | ≤ sp −ε ,
∀s > 0, ∀I ⊂ [−s, s], ∀N ∈ Ᏽ .
Consider also (p, ε) =
(p, ε, N ),
ᏸ(p, ε) =
N∈Ᏽ
ᏸ(p, ε, N ), N∈Ᏽ
Ꮾ(p, ε) = (p, ε) ∪ ᏸ(p, ε) = Ᏻ(p, ε)c .
By Proposition 2.4, we gather, for all 1 ≤ i ≤ Kp , # p, δ0 , Mi δ,r p 1−δ+2δ0 .
(3.2)
By the very definition of R2 (p, N, br, I ) we see that, for all intervals I and all k ≥ 0, N NR2 p, N, br, I ≤ (N + k)R2 p, N + k, br, I ; N +k hence, for all 0 ≤ k ≤ N θ , N N R2 p, N + k, br, I − |I | ≥ R2 p, N, br, I − |I | N +k N +k N N N = R2 p, N, br, I − |I | N +k N +k N +k
−
(3.3)
2kN + k 2 |I |. (N + k)2
If we also assume b ∈ (p, δ0 , N ), then it follows by (3.3) and the choice for δ0 and δ0 that, for p > p(δ), there exist s0 > 0 and an interval I0 = I0 (p, N) ⊂ [−s0 , s0 ]
such that, for all 0 ≤ k ≤ N θ , N +k N +k R2 p, N + k, br, I0 − |I0 | N N N θ−1 (2N + k) 1 R2 p, N, br, I0 − |I0 | − |I0 | 2 N s0 p −δ0 > − 12s0 N θ−1 2 s0 p −δ0 − 12s0 p −δ0 > 2 > 2s0 p −δ0 N +k s0 p −δ0 ; > N
>
hence b ∈ (p, δ0 , Mi ). Therefore, for p > p1 (δ), we have
p, δ0 , N ⊂ p, δ0 , Mi ,
1 ≤ i ≤ Kp ,
N∈Ᏽi
leading to
Kp
p, δ0 ⊂
p, δ0 , Mi , i=1
which we combine with (3.2) and (3.1) to get, for p > p2 (δ),
#(p, δ0 ) ≤ p 1−δ+2δ0 +(1−θ)b(δ) . Using the inequality N −k N I ≤ R2 p, N, br, I , R2 p, N − k, br, N N −k one proves in a similar way, for p > p3 (δ),
#ᏸ(p, δ0 ) ≤ p 1−δ+2δ0 +(1−θ)b(δ) . Summarizing, if we put Ꮾp (δ) = Ꮾ(p, δ0 ), then, for p > p4 (δ),
#Ꮾp (δ) ≤ 2p 1−δ+2δ0 +(1−θ)b(δ) < p 1−δ1 and for all b ∈ Ꮾp (δ)c , all s > 0, and all intervals I ⊂ [−s, s], 2 sup R2 p, N, br, I − |I | ≤ sp −δ0 < sp −δ1 .
N∈Ᏽ
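Proof of Corollary 1.2. It suffices to prove that for all $s > 0$ the set of $\alpha$ which satisfy
$$\lim_N R_2\bigl(N, (\{\alpha f(n)\})_n, I\bigr) = |I| \quad\text{for all intervals } I \subset [-s, s] \tag{3.4}$$
has full Lebesgue measure. Fix $s > 0$ and $\varepsilon > 0$. Let $\delta$ be such that $0 < \delta < 2/(3d+3)$; hence
$$d - \frac{2}{3\delta} = -1 - c < -1.$$
Choose $a(\delta)$, $b(\delta)$ such that
$$\delta < a(\delta) < \frac{3\delta}{2} < b(\delta) < \frac{1-\delta}{d}.$$
We keep the notation from the proof of Theorem 3.1. Note first that there exists a sequence of primes $(p_k)_k$ such that
$$p_k^{a(\delta)} < p_{k+1}^{a(\delta)} < p_k^{3\delta/2} \quad\text{and}\quad \sum_k p_k^{-\delta_1} < \varepsilon. \tag{3.5}$$
This can be easily achieved by taking, for instance, $1 < \lambda_1 < \lambda_2 < 3\delta/2a(\delta)$, then choosing $p_1$ such that $\varepsilon\bigl(p_1^{\lambda_1\delta_1} - 1\bigr) > 1$ and $p_1^{\lambda_2-\lambda_1} - 1 > 1$. Then, inductively, $[p_k^{\lambda_1}, p_k^{\lambda_2})$ contains at least a new prime $p_{k+1}$ (since $p_k^{\lambda_2} \ge 2p_k^{\lambda_1}$) and $p_{k+1}^{-1} \le p_1^{-k\lambda_1}$, so
$$\sum_k p_k^{-\delta_1} \le \sum_k p_1^{-k\delta_1\lambda_1} = \frac{1}{p_1^{\delta_1\lambda_1} - 1} < \varepsilon.$$
The proof of Theorem 3.1 provides us, for $p > p(\delta)$, with sets $\mathcal{B}_p = \mathcal{B}_p(\delta) = \mathcal{B}(p, \delta_0)$ such that $\#\mathcal{B}_p \le p^{1-\delta_1}$ and, for all $t > 0$ and all intervals $I \subset [-t, t]$,
$$\bigl| R_2(p, N, bf, I) - |I| \bigr| \le t p^{-\delta_1}, \quad \forall N \in \bigl[p^{a(\delta)}, p^{b(\delta)}\bigr],\ \forall b \in \mathcal{B}_p^c. \tag{3.6}$$
The Lebesgue measure of
$$\mathcal{E}_p = \bigcup_{b\in\mathcal{B}_p} \Bigl[\frac{b}{p} - \frac{1}{2p}, \frac{b}{p} + \frac{1}{2p}\Bigr) \subset [0, 1)$$
is then
$$\mu(\mathcal{E}_p) \le \frac{1}{p} \cdot |\mathcal{B}_p| \le p^{-\delta_1}.$$
We take
$$\mathcal{E} = \bigcup_k \mathcal{E}_{p_k} \quad\text{and}\quad \mathcal{F} = \mathcal{E}^c = [0, 1) \setminus \mathcal{E}.$$
Then $\mu(\mathcal{E}) \le \sum_k p_k^{-\delta_1} < \varepsilon$; hence $\mu(\mathcal{F}) > 1 - \varepsilon$. Since
$$\mathcal{F} = \bigcap_k \mathcal{E}_{p_k}^c \subset \bigcap_k \bigcup_{b\in\mathcal{B}_{p_k}^c} \Bigl[\frac{b}{p_k} - \frac{1}{2p_k}, \frac{b}{p_k} + \frac{1}{2p_k}\Bigr) \subset [0, 1),$$
it follows that for all $\alpha \in \mathcal{F}$ and all $k$, there exists $b_k \in \mathcal{B}_{p_k}^c$ such that
$$\alpha \in \Bigl[\frac{b_k}{p_k} - \frac{1}{2p_k}, \frac{b_k}{p_k} + \frac{1}{2p_k}\Bigr) \quad\text{in } [0, 1);$$
that is, for all $k$,
$$\Bigl| \alpha - \frac{b_k}{p_k} \Bigr| \le \frac{1}{2p_k},$$
which provides further, for all $N \le p_k^{3\delta/2}$,
$$\max_{1\le n\le N} \Bigl| \alpha f(n) - \frac{b_k f(n)}{p_k} \Bigr| \ll_{\delta,r} \frac{N^d}{p_k} \le N^{d-2/3\delta} = N^{-1-c}. \tag{3.7}$$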
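Let $I_0 \subset [-s, s]$. We compare $R_2(N, (\{\alpha f(n)\})_n, I_0)$ with $R_2(N, (\{bf(n)/p\})_n, I_0)$ for $\alpha \in \mathcal{F}$. Firstly, remark that, since $r \in \mathbb{Z}[X]$, (1.3) and (1.4) provide
$$R_2(p, N, bf, I_0) = R_2\Bigl(N, \Bigl(\frac{bf(n)}{p}\Bigr)_n, I_0\Bigr). \tag{3.8}$$
Also, we see from the very definition of $R_2(N, (x_n)_n, I)$ that if $(x_n)_n$ and $(y_n)_n$ are such that $\max_{1\le n\le N} \|x_n - y_n\| \le N^{-1-c}$, $c > 0$, then there exist intervals $I_0' \subset I_0 \subset I_0''$ with $|I_0'| = |I_0| - 2N^{-c}$, $|I_0''| = |I_0| + 2N^{-c}$ such that
$$R_2\bigl(N, (y_n)_n, I_0'\bigr) \le R_2\bigl(N, (x_n)_n, I_0\bigr) \le R_2\bigl(N, (y_n)_n, I_0''\bigr). \tag{3.9}$$
If we take $x_n = \alpha f(n)$, $y_n = b_k f(n)/p_k$, $n \le N$, then (3.8) and (3.9) provide, for all $k$ and $N \le p_k^{3\delta/2}$,
$$R_2(p_k, N, b_k f, I_0') = R_2\Bigl(N, \Bigl(\frac{b_k f(n)}{p_k}\Bigr)_n, I_0'\Bigr) \le R_2\bigl(N, (\{\alpha f(n)\})_n, I_0\bigr) \le R_2\Bigl(N, \Bigl(\frac{b_k f(n)}{p_k}\Bigr)_n, I_0''\Bigr) = R_2(p_k, N, b_k f, I_0''). \tag{3.10}$$
But $b_k \in \mathcal{B}_{p_k}^c$; hence by (3.6) we get, for all $N \in [p_k^{a(\delta)}, p_k^{3\delta/2}]$,
$$R_2(p_k, N, b_k f, I_0') = |I_0'| + O_{\delta,f}\bigl(sN^{-\delta_1}\bigr) = |I_0| - 2N^{-c} + O_{\delta,f}\bigl(sN^{-\delta_1}\bigr),$$
$$R_2(p_k, N, b_k f, I_0'') = |I_0| + 2N^{-c} + O_{\delta,f}\bigl(sN^{-\delta_1}\bigr).$$
Summarizing, we have proved
$$R_2\bigl(N, (\{\alpha f(n)\})_n, I_0\bigr) = |I_0| + O_{\delta,f}\bigl(\max(N^{-c}, sN^{-\delta_1})\bigr) \quad\text{for all } N \in \bigl[p_k^{a(\delta)}, p_k^{3\delta/2}\bigr].$$
Since $\bigcup_k [p_k^{a(\delta)}, p_k^{3\delta/2}] = [p_1^{a(\delta)}, \infty)$, the previous equality holds for all $\alpha \in \mathcal{F}$ and all $N$, allowing us to conclude that
$$\lim_N R_2\bigl(N, (\{\alpha f(n)\})_n, I_0\bigr) = |I_0| \quad\text{for all } \alpha \in \mathcal{F}.$$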
4. Proof of Theorem 1.3

Theorem 4.1. Fix $1/2 < \theta < 1$, $0 < 8\delta < 2\theta - 1$, $D \ge 1$. Denote $c(\theta) = \theta/2 - 1/4$. Then, for any large prime number $p$, any $M \in [p^{\theta}, p]$, any $N \in [p^{2\delta}, p^{c(\theta)-\delta}]$, any rational function $r(X) = f(X)/g(X) \in \mathbb{F}_p(X)$, $\deg(f), \deg(g) \le D$, $r(X)$ not a polynomial of degree less than or equal to 3, any $s > 0$, and any interval $I \subset [-s, s]$, one has
$$S = S(p, r, M, N) = \sum_{1\le m\le M} \bigl| R_2(\mathcal{N}_m, I) - |I| \bigr|^2 \ll_{\theta,\delta,D} \frac{s^2 M}{p^{\delta}}. \tag{4.1}$$
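Proof. From (1.5) we obtain
$$R_2(\mathcal{N}_m, I) = \frac{1}{N} \sum_{h\in(p/N)I} \nu(m, N, r, p, h),$$
where
$$\nu(m, N, r, p, h) = \#\{m < x \ne y \le m + N;\ g(x), g(y) \ne 0,\ r(x) - r(y) = h \pmod p\}.$$
Hence
$$R_2(\mathcal{N}_m, I) = \frac{1}{Np} \sum_{h\in(p/N)I}\ \sum_{m<x\ne y\le m+N}\ \sum_{t \pmod p} e\Bigl(\frac{t\bigl(r(x) - r(y) - h\bigr)}{p}\Bigr) = \frac{1}{Np} \sum_{t \pmod p} F(t)G_m(t), \tag{4.2}$$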
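where $F(t)$ is given by (2.20),
$$G_m(t) = \sum_{m<x\ne y\le m+N} e\Bigl(\frac{t\bigl(r(x) - r(y)\bigr)}{p}\Bigr),$$
and we take $-p/2 < t < p/2$. The contribution of $t = 0$ to $R_2$ is $|I|(1 + O(1/N))$. So, arguing as in the beginning of the proof of Proposition 2.4, we find
$$S = S' + O_{\theta,\delta,D}\Bigl(\frac{s^2}{N} \cdot \frac{M}{p^{\delta/2}}\Bigr) = S' + o_{\theta,\delta,D}\Bigl(\frac{M}{p^{\delta}}\Bigr),$$
provided that
$$S' \ll_{\theta,\delta,D} s^2 M p^{-\delta}, \tag{4.3}$$
where
$$S' = \sum_{1\le m\le M} \frac{1}{N^2 p^2} \Bigl| \sum_{0\ne t \pmod p} F(t)G_m(t) \Bigr|^2 = \frac{1}{N^2 p^2} \sum_{0\ne t_1,t_2 \pmod p} F(t_1)\overline{F(t_2)} \sum_{1\le m\le M} G_m(t_1)\overline{G_m(t_2)}. \tag{4.4}$$
So it suffices to prove (4.3). The inner sum in the last term of (4.4) equals
$$\sum_{1\le m\le M}\ \sum_{m<x_j,y_j\le m+N} e\Bigl(\frac{t_1\bigl(r(x_1) - r(y_1)\bigr) - t_2\bigl(r(x_2) - r(y_2)\bigr)}{p}\Bigr). \tag{4.5}$$
At this point we perform the change of variables $x_1 = \tilde x_1 + m$, $x_2 = \tilde x_2 + m$, $y_1 = \tilde y_1 + m$, and $y_2 = \tilde y_2 + m$ so that $\tilde x_1, \tilde x_2, \tilde y_1$, and $\tilde y_2$ vary in the same interval $\{1, \dots, N\}$. Also, moving the summation over $m$ inside, it follows that (4.5) equals
$$\sum_{\substack{1\le \tilde x_j\ne \tilde y_j\le N\\ j=1,2}}\ \sum_{1\le m\le M} e\Bigl(\frac{t_1\bigl(r(\tilde x_1 + m) - r(\tilde y_1 + m)\bigr) - t_2\bigl(r(\tilde x_2 + m) - r(\tilde y_2 + m)\bigr)}{p}\Bigr). \tag{4.6}$$
Here, the inner sum is an incomplete exponential sum of a rational function $\tilde r$, where
$$\tilde r(X) = t_1\bigl(r(\tilde x_1 + X) - r(\tilde y_1 + X)\bigr) - t_2\bigl(r(\tilde x_2 + X) - r(\tilde y_2 + X)\bigr). \tag{4.7}$$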
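To complete the sum we write it as
$$\sum_{1\le m\le M} e\Bigl(\frac{\tilde r(m)}{p}\Bigr) = \sum_{m \pmod p} e\Bigl(\frac{\tilde r(m)}{p}\Bigr) \cdot \chi_{[1,M]}(m), \tag{4.8}$$
where $\chi_{[1,M]}$ is the characteristic function of the set $\{1, 2, \dots, M\}$. Next, we expand $\chi_{[1,M]}$ in a Fourier series on $\mathbb{F}_p$:
$$\chi_{[1,M]}(m) = \sum_{z \pmod p} \hat\chi_{[1,M]}(z)\, e\Bigl(\frac{mz}{p}\Bigr), \tag{4.9}$$
where the Fourier coefficients are given by
$$\hat\chi_{[1,M]}(z) = \frac{1}{p} \sum_{w \pmod p} \chi_{[1,M]}(w)\, e\Bigl(-\frac{wz}{p}\Bigr) = \frac{1}{p} \sum_{1\le w\le M} e\Bigl(-\frac{wz}{p}\Bigr).$$
This is a geometric series that is bounded as
$$\bigl| \hat\chi_{[1,M]}(z) \bigr| \ll \begin{cases} M/p & \text{if } z = 0,\\ 1/|z| & \text{if } z \ne 0, \end{cases} \tag{4.10}$$
for any $z \in \{-(p-1)/2, \dots, (p-1)/2\}$; hence
$$\sum_{|z|\le(p-1)/2} \bigl| \hat\chi_{[1,M]}(z) \bigr| \ll \frac{M}{p} + \log p \ll \log p. \tag{4.11}$$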
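The completion step (4.8)-(4.10) can be verified numerically for a small prime. The sketch below assumes $e(x) = \exp(2\pi i x)$ and an odd prime $p$; the helper names are hypothetical. It computes the Fourier coefficients of $\chi_{[1,M]}$, reconstructs the characteristic function via (4.9), and allows a check of the explicit form $|\hat\chi_{[1,M]}(z)| \le 1/(2|z|)$ of (4.10) for $z \ne 0$.

```python
import cmath
from math import pi


def e(x):
    """Additive character e(x) = exp(2 pi i x)."""
    return cmath.exp(2j * pi * x)


def chi_hat(z, M, p):
    """Fourier coefficient of the indicator of {1, ..., M} on Z/pZ."""
    return sum(e(-w * z / p) for w in range(1, M + 1)) / p


def chi_from_fourier(m, M, p):
    """Right-hand side of (4.9), summing z over -(p-1)/2, ..., (p-1)/2 (p odd)."""
    half = (p - 1) // 2
    return sum(chi_hat(z, M, p) * e(m * z / p) for z in range(-half, half + 1))


# Example values (arbitrary): p = 31, M = 10.
p, M = 31, 10
reconstructed = [chi_from_fourier(m, M, p) for m in range(1, p + 1)]
```

Summing the bound over $z$ produces the harmonic sum responsible for the $\log p$ in (4.11).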
By (4.9), the sum from (4.8) equals
$$\sum_{m \pmod p} e\Bigl(\frac{\tilde r(m)}{p}\Bigr) \sum_{z \pmod p} \hat\chi_{[1,M]}(z)\, e\Bigl(\frac{mz}{p}\Bigr) = \sum_{z \pmod p} \hat\chi_{[1,M]}(z) \sum_{m \pmod p} e\Bigl(\frac{\tilde r(m) + mz}{p}\Bigr). \tag{4.12}$$
Here, the inner sum is a complete exponential sum for which we can apply the Bombieri-Weil inequality (see Bombieri [5]). In the convenient form given by Moreno and Moreno [8] it says that if the rational function (in $m$) $\tilde r(m) + mz$ is not constant, then
$$\Bigl| \sum_{m \pmod p} e\Bigl(\frac{\tilde r(m) + mz}{p}\Bigr) \Bigr| \ll_D p^{1/2}. \tag{4.13}$$
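To see a square-root bound of the type (4.13) in action, one can evaluate complete sums for a polynomial example. The sketch below covers only the polynomial special case, where the classical Weil bound gives $|\sum_{m \bmod p} e(f(m)/p)| \le (\deg f - 1)\sqrt{p}$ for $\deg f < p$ with leading coefficient prime to $p$; the implied constant in (4.13) for general rational functions is different. The function name and the example polynomial are assumptions of this illustration.

```python
import cmath
from math import pi, sqrt


def complete_sum(coeffs, z, p):
    """Sum over m mod p of e((f(m) + m z)/p).

    coeffs lists the integer coefficients of f, lowest degree first.
    """
    total = 0
    for m in range(p):
        value = sum(c * pow(m, k, p) for k, c in enumerate(coeffs)) + m * z
        total += cmath.exp(2j * pi * (value % p) / p)
    return total


# Example: f(m) = m^3 + 2m modulo p = 101, twisted by every linear term m z.
p = 101
coeffs = [0, 2, 0, 1]
largest = max(abs(complete_sum(coeffs, z, p)) for z in range(p))
weil = (len(coeffs) - 2) * sqrt(p)  # (deg f - 1) * sqrt(p) with deg f = 3
```

Because $f(m) + mz$ remains a cubic for every $z$, the same bound applies uniformly in $z$, which is what the argument following (4.13) relies on.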
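Let us consider first only those $\tilde r$ that are not linear polynomials. Then (4.13) holds true for all $z \pmod p$, and from (4.12) and (4.11) we derive
$$\Bigl| \sum_{1\le m\le M} e\Bigl(\frac{\tilde r(m)}{p}\Bigr) \Bigr| \ll_D p^{1/2} \sum_{|z|\le(p-1)/2} \bigl| \hat\chi_{[1,M]}(z) \bigr| \ll_D p^{1/2} \log p. \tag{4.14}$$
As a result, the contribution to (4.4) of the 6-tuples $(t_1, t_2, \tilde x_1, \tilde y_1, \tilde x_2, \tilde y_2)$, with $1 \le \tilde x_j, \tilde y_j \le N$, $\tilde x_j \ne \tilde y_j$, and such that the corresponding $\tilde r$ is not a linear polynomial, is
$$\ll_D \frac{N^4}{N^2 p^2} \cdot p^{1/2} \log p \cdot \Bigl( \sum_{0\ne t \pmod p} |F(t)| \Bigr)^2,$$
which is, using (2.21),
$$\ll_{\theta,\delta,D} s^2 N^2 p^{1/2} \log^3 p \ll s^2 M p^{-\delta}. \tag{4.15}$$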
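Now let $t_1, t_2, \tilde x_1, \tilde x_2, \tilde y_1$, and $\tilde y_2$ be such that the corresponding $\tilde r$ is a linear polynomial. Assume first that $r(X) = \sum_{k=0}^{d} a_{d-k} X^k$, $a_k \in \mathbb{Z}$, $a_0 \ne 0 \pmod p$, and $d \ge 5$. Then the degree of
$$\tilde r(X) = \sum_{0\le k\le d} a_{d-k} \Bigl( t_1\bigl((X + \tilde x_1)^k - (X + \tilde y_1)^k\bigr) - t_2\bigl((X + \tilde x_2)^k - (X + \tilde y_2)^k\bigr) \Bigr)$$
is at most $d - 1$ and the coefficients of $X^{d-1}$, $X^{d-2}$, and $X^{d-3}$ are, respectively,
$$a_0 d \bigl( t_1(\tilde x_1 - \tilde y_1) - t_2(\tilde x_2 - \tilde y_2) \bigr),$$
$$a_0 \binom{d}{2} \bigl( t_1(\tilde x_1^2 - \tilde y_1^2) - t_2(\tilde x_2^2 - \tilde y_2^2) \bigr) + a_1 (d-1) \bigl( t_1(\tilde x_1 - \tilde y_1) - t_2(\tilde x_2 - \tilde y_2) \bigr),$$
$$a_0 \binom{d}{3} \bigl( t_1(\tilde x_1^3 - \tilde y_1^3) - t_2(\tilde x_2^3 - \tilde y_2^3) \bigr) + a_1 \binom{d-1}{2} \bigl( t_1(\tilde x_1^2 - \tilde y_1^2) - t_2(\tilde x_2^2 - \tilde y_2^2) \bigr) + a_2 (d-2) \bigl( t_1(\tilde x_1 - \tilde y_1) - t_2(\tilde x_2 - \tilde y_2) \bigr).$$
All these three coefficients must vanish; hence
$$t_1(\tilde x_1 - \tilde y_1) = t_2(\tilde x_2 - \tilde y_2) \pmod p,\quad t_1(\tilde x_1^2 - \tilde y_1^2) = t_2(\tilde x_2^2 - \tilde y_2^2) \pmod p,\quad t_1(\tilde x_1^3 - \tilde y_1^3) = t_2(\tilde x_2^3 - \tilde y_2^3) \pmod p. \tag{4.16}$$
Since $t_1, t_2, \tilde x_1 - \tilde y_1$, and $\tilde x_2 - \tilde y_2$ are nonzero (mod $p$), we readily get from (4.16)
$$\tilde x_1 + \tilde y_1 = \tilde x_2 + \tilde y_2 \quad\text{and}\quad \tilde x_1 \tilde y_1 = \tilde x_2 \tilde y_2. \tag{4.17}$$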
It follows that either x˜1 = x˜2 , y˜1 = y˜2 or x˜1 = y˜2 , y˜1 = x˜2 . From the first equality in (4.16) we also derive t1 = ±t2 . In conclusion, r˜ is not a linear polynomial unless t1 = ±t2 , and, moreover, if this holds true, then the contribution of such 4-tuples (x˜1 , x˜2 , y˜1 , y˜2 ) in (4.6) is bounded by 2N 2 M. Therefore the contribution of these terms to (4.4) is 2N 2 M N 2 p2
0 =t1 (mod p)
M |F (t1 )| · |F (t1 )| + |F (−t1 )| 2 p
|F (t)|2 .
0 =t (mod p)
Using (2.21), the above is bounded by (2s + 1)2 M
0 =|t|
1
max N, |t|
2
s2M , N
which is smaller than the right-hand side of (4.3). This proves (4.3) in the case when r(X) is a polynomial of degree greater than or equal to 5. Consider now the case when r(X) is not polynomial and d = deg(g) = f (X)/g(X) ≥ 1. Assume also that y˜2 = max x˜1 , y˜1 , x˜2 , y˜2 , and fix a root α of g in Fp . Consider the set ᏹ = β ∈ Fp ; g(β) = 0, β − α ∈ Fp ⊂ α + Fp . The bijection
0, 1, . . . , p − 1
a −→ α + a ∈ α + Fp
yields an order relation on α +Fp . Also, ᏹ defines a partition of α +Fp with endpoints in ᏹ. Then #ᏹ ≤ d and there exists β ∈ ᏹ such that β − 1, β − 2, . . . , β − [p/d] are not roots of g. It is clear that β − y˜2 is a pole for one of r(x˜1 + X), r(y˜1 + X), or r(x˜2 + X). In other words, there exists x∗ ∈ {x˜1 , y˜1 , x˜2 } such that g(x∗ − y˜2 + β) = 0. But 0 ≤ y˜2 − x∗ < y˜2 ≤ N < [p/d]; hence y˜2 = x∗ ∈ {x˜1 , y˜1 , x˜2 }. It follows that y˜2 = y˜1 or y˜2 = x˜1 . Assume first that y˜2 = y˜1 and t1 = t2 . Then r˜ (X) = t1 r(x˜1 + X)−t2 r(x˜2 +X)+(t2 −t1 )r(y˜2 +X) and, since r˜ has no poles and β − y˜2 is a pole of r(y˜2 + X), it follows that β + x˜1 − y˜2 or β + x˜2 − y˜2 is a root of g, which contradicts 0 < y˜2 − xj < [p/d], j = 1, 2. Therefore y˜2 = y˜1 implies t1 = t2 . In a similar way, y˜2 = x˜1 implies t1 = −t2 . In both situations the proof proceeds as above. When deg(r) = 4, r˜ is a linear polynomial if and only if t1 (x˜1 − y˜1 ) = t2 (x˜2 − y˜2 ) (mod p), t1 (x˜12 − y˜12 ) = t2 (x˜22 − y˜22 ) (mod p).
(4.18)
By (4.18) x˜1 + y˜1 = x˜2 + y˜2 (mod p), and since N p 1/4 , we get x˜1 + y˜1 = x˜2 + y˜2 . Case 1: max(|t1 |, |t2 |) p 1−δ/4 /N. In this case the congruences from (4.18) become equalities. As a result, if we keep (t1 , x˜1 , y˜1 ) fixed, it follows that x˜2 − y˜2 takes, as a divisor of t1 (x˜1 − y˜1 ), at most p δ/4 values. Therefore, the number of pairs
(x˜2 , y˜2 ) which satisfy (4.18) is p δ/4 and, for such a pair, t2 = ut1 takes at most pδ/4 values too, where u = (x˜1 − y˜1 )(x˜2 − y˜2 )−1 (mod p). Using also
max
0 =t (mod p)
0 =u (mod p)
|F (t)| · |F (ut)|
0 =t (mod p)
s 2 p 2 log p s 2 p2 N|t| N
r˜ (m) e ≤ M, 1≤m≤M p
and the trivial estimate
it follows that the contribution of such 6-tuples (t1 , t2 , x˜1 , y˜1 , x˜2 , y˜2 ) to (4.4) is δ,D
2 2 s 2 Mp δ/2 log p 1 M 2 δ/2 s p log p = δ,D δ . · N Mp · 2 2 N N p N p
Case 2: $\max(|t_1|, |t_2|) \gg p^{1-\delta/4}/N$, for instance, $|t_2| \gg p^{1-\delta/4}/N$. In this case there exists an integer $a$ with $|a| \ll N$ such that
\[ t_1(\tilde x_1 - \tilde y_1) + ap = t_2(\tilde x_2 - \tilde y_2). \tag{4.19} \]
For a fixed quadruple $(a, t_1, \tilde x_1, \tilde y_1)$ we have at most $p^{\delta}$ choices for $t_2$ and $\tilde x_2 - \tilde y_2$; hence the number of pairs $(\tilde x_2, \tilde y_2)$ which satisfy (4.19) and $\tilde x_2 + \tilde y_2 = \tilde x_1 + \tilde y_1$ is $\ll p^{\delta}$ and the contribution of such 6-tuples $(t_1, t_2, \tilde x_1, \tilde y_1, \tilde x_2, \tilde y_2)$ to (4.4) is $\ll_{\delta,D}$
1 N 2 p2
· N 3 Mp δ
0 =|t|
MNp δ−2
0 =|t|
MN p M δ. p
2 2δ−1
max
0 =u (mod p) p/2>|ut|p1−ε /N
|F (t)| · |F (ut)|
p pN · |t| p 1−δ
log p
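Proof of Theorem 1.3. We let
\[ \mathcal{E} = \big\{1 \le m \le M;\ \exists s_0 > 0,\ \exists I_0 \subset [-s_0, s_0] \text{ interval such that } \big|R_2(\mathcal{N}_m, I_0) - |I_0|\big| \ge s_0 p^{-\delta_1}\big\}. \]
Then Theorem 3.1 provides
\[ s_0^2 p^{-2\delta_1}\, \#\mathcal{E} \le \sum_{1 \le m \le M} \big|R_2(\mathcal{N}_m, I_0) - |I_0|\big|^2 \ll_{\theta,\delta,D} s_0^2 M p^{-\delta}, \]
which implies (1.2). Property (1.3) follows from the definition of $\mathcal{E}$.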
292
BOCA AND ZAHARESCU
In the case deg(r) = 3, we consider only M = p and obtain the following proposition.

Proposition 4.2. Assume that $r \in \mathbb{F}_p[X]$ has degree 3. Then for all $\delta > 0$, all large primes $p$, all integers $N \in [p^{2\delta}, p^{1/4-\delta}]$, all $s > 0$, and all intervals $I \subset [-s, s]$, one has
\[ S = S(p, r, N) = \sum_{1 \le m \le p} \big|R_2(\mathcal{N}_m, I) - |I|\big|^2 \ll_{\delta} s^2 p^{1-\delta}. \]
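Proof. We take $r(X) = aX^3 + bX^2 + cX + d$, with $a, b, c, d \in \mathbb{Z}$, $a \ne 0 \pmod p$. Then
\[ \tilde r(X) = (3aX^2 + 2bX + c)\big[t_1(\tilde x_1 - \tilde y_1) - t_2(\tilde x_2 - \tilde y_2)\big] + (3aX + b)\big[t_1(\tilde x_1^2 - \tilde y_1^2) - t_2(\tilde x_2^2 - \tilde y_2^2)\big]. \]
As in the proof of Theorem 4.1, we find $S = S' + o_{\delta}(p^{1-\delta})$, provided that we know $S' \ll_{\delta,r} s^2 p^{1-\delta}$, where
\[ S' = \frac{1}{N^2 p^2} \sum_{0 \ne t_1, t_2 \pmod p} F(t_1)F(t_2) \sum_{\substack{1 \le \tilde x_1, \tilde y_1, \tilde x_2, \tilde y_2 \le N \\ \tilde x_j \ne \tilde y_j}}\ \sum_{1 \le m \le p} e\Big(\frac{\tilde r(m)}{p}\Big). \]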
The 6-tuples $(t_1, t_2, \tilde x_1, \tilde y_1, \tilde x_2, \tilde y_2)$ for which $\tilde r(X)$ is not a linear polynomial are dealt with as in the proof of Theorem 4.1. In the sequel we consider only 6-tuples with $t_1(\tilde x_1 - \tilde y_1) = t_2(\tilde x_2 - \tilde y_2) \pmod p$. Their contribution to $S'$ is
\[ S'' = \frac{1}{N^2 p^2} \sum_{0 \ne t_1, t_2 \pmod p} F(t_1)F(t_2) \sum_{\substack{1 \le x_1, y_1, x_2, y_2 \le N,\ x_j \ne y_j \\ t_1(x_1 - y_1) = t_2(x_2 - y_2) \pmod p}}\ \sum_{1 \le m \le p} e\Big(\frac{(3ma + b)\, t_1 (x_1 - y_1)(x_1 + y_1 - x_2 - y_2)}{p}\Big). \]
With the change of variables xj + yj = lj , xj − yj = kj , j = 1, 2, S equals
1 N 2 p2
F (t1 )F (t2 )
0 =t1 ,t2 (mod p)
×
0<|kj |
bt1 k1 (l1 − l2 ) e p
1≤m≤p
3mat1 k1 (l1 − l2 ) . e p
Since p > 6N, the inner sum above is zero unless l1 = l2 , providing |S |
|F (t1 )| · |F (t2 )|.
0 =t1 ,t2 (mod p) 0<|kj |
We proceed as in the proof of Proposition 2.4, and we break $[1, p-1]$ into $[\log_2 p] + 1$ dyadic intervals $[A, 2A]$. We have to prove that if $1 \le A < 2A < p/2$, $1 \le B < 2B < p/2$, then
\[ \#\big\{(t_1, t_2, k_1, k_2);\ t_1 \in [A, 2A],\ t_2 \in [B, 2B],\ a_1 = a_2 \pmod p\big\} \ll N \max(N, A) \max(N, B)\, p^{-\delta} \log^{-2} p, \tag{4.20} \]
where we put $a_1 = t_1 k_1$, $a_2 = t_2 k_2$. It is clear that $|a_1| \ll AN$, $|a_2| \ll BN$, so for fixed $a_1$ there exist at most $[BN/p] + 1$ solutions $a_2$ of the congruence $a_1 = a_2 \pmod p$. It follows that the number of admissible pairs $(a_1, a_2)$ is $\ll ABN^2/p + AN$; hence the left-hand side of (4.20) is
\[ \ll p^{\varepsilon}\Big(\frac{ABN^2}{p} + AN\Big). \]
But $p^{\varepsilon} ABN^2/p \ll NAB/(p^{\delta} \log^2 p)$ and $p^{\varepsilon} AN \ll N^2 A/(p^{\delta} \log^2 p)$ provided that we take $0 < \varepsilon < \delta$.

5. The case deg(r) = 2. We start with an averaging result, without assuming that p is prime.

Theorem 5.1. Let $\alpha, \beta > 0$ be such that $2\alpha + \beta = 1$, and let small $\delta > 0$ be fixed. Let $q$ be a large integer, let $N$ and $M$ be nonnegative integers, and let $r(X) = bX^2 + cX + d \in \mathbb{Z}[X]$ be such that $\gcd(b, q) = 1$ and
\[ M \ge q^{2\alpha - 11\delta}, \qquad N^2 \le q^{1-3\delta}, \qquad \text{and} \qquad N^2 q^{1-\alpha-3\delta} \le M \le q, \tag{5.1} \]
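\[ \Big\|\frac{bk}{q}\Big\| > \frac{q^{4\delta}}{N} \quad \text{for all } 1 \le k \le q^{6\delta}, \tag{5.2} \]
\[ \Big\|\frac{2bk}{q}\Big\| > q^{-1+\beta+20\delta} \quad \text{for all } 1 \le |k| \le q^{\alpha+20\delta}. \tag{5.3} \]
Then, for any interval $I \subset [-s, s]$, one has
\[ S = S(q, r, M, N) = \sum_{1 \le m \le M} \big|R_2(\mathcal{N}_m, I) - |I|\big|^2 \ll_{\delta} \frac{s^2 M}{q^{\delta}}. \]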
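The nearest-integer hypotheses (5.2) and (5.3) can be tested numerically for concrete parameters. The following is a minimal Python sketch, assuming that $\|x\|$ denotes the distance from $x$ to the nearest integer; the function names are illustrative and do not come from the paper.

```python
def dist_to_nearest_integer(num, q):
    """Return ||num/q||, the distance from num/q to the nearest integer."""
    rem = num % q
    return min(rem, q - rem) / q


def satisfies_5_2(b, q, N, delta):
    """Check ||b*k/q|| > q^(4*delta)/N for all integers 1 <= k <= q^(6*delta)."""
    bound = q ** (4 * delta) / N
    k_max = int(q ** (6 * delta))
    return all(dist_to_nearest_integer(b * k, q) > bound
               for k in range(1, k_max + 1))


def satisfies_5_3(b, q, alpha, beta, delta):
    """Check ||2*b*k/q|| > q^(-1+beta+20*delta) for 1 <= |k| <= q^(alpha+20*delta).

    Only k > 0 is scanned, because ||-x|| = ||x||.
    """
    bound = q ** (-1 + beta + 20 * delta)
    k_max = int(q ** (alpha + 20 * delta))
    return all(dist_to_nearest_integer(2 * b * k, q) > bound
               for k in range(1, k_max + 1))
```

For instance, $b = 1$ violates (5.2) whenever $N \le q$, since then $\|1/q\| = 1/q \le q^{4\delta}/N$; the conditions are only met when the multiples of $b/q$ stay away from integers.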
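Proof. We write, as in the proof of Proposition 2.4,
\[ R_2(\mathcal{N}_m, I) = \frac{1}{Nq} \sum_{t \pmod q} F(t) G_m(t), \]
where
\[ F(t) = \sum_{h \in (q/N)I} e\Big(-\frac{th}{q}\Big), \qquad G_m(t) = \sum_{m < x \ne y \le m+N} e\Big(\frac{t\,(r(x) - r(y))}{q}\Big), \]
and we make steady use of (2.21). From now on we always take $|t| \le q/2$. We gather, as in the proof of Proposition 2.4,
\[ S = S' + o\Big(\frac{M}{q^{\delta}}\Big), \tag{5.4} \]
provided that
\[ S' \ll_{\delta} \frac{s^2 M}{q^{\delta}}, \tag{5.5} \]
where we let
\[ S' = \frac{1}{N^2 q^2} \sum_{1 \le m \le M} \Big|\sum_{0 \ne t \pmod q} F(t) G_m(t)\Big|^2. \tag{5.6} \]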
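The Fourier expansion of $R_2$ used in this proof can be checked by brute force for small parameters. The sketch below is an illustration under stated assumptions, not the authors' computation: it assumes that $R_2(\mathcal{N}_m, I)$ equals $1/N$ times the number of ordered pairs $x \ne y$ in $(m, m+N]$ with $r(x) - r(y) \equiv h \pmod q$ for some integer $h \in (q/N)I$, it takes $I = [-s, s]$, and it uses a hypothetical quadratic $r$.

```python
import cmath
import math


def e(x):
    """The additive character e(x) = exp(2*pi*i*x)."""
    return cmath.exp(2j * math.pi * x)


def r(x, b=3, c=1, d=1):
    """A hypothetical quadratic r(X) = b*X^2 + c*X + d."""
    return b * x * x + c * x + d


def shifts(N, q, s):
    """Integers h in (q/N) * [-s, s]; assumed to be distinct modulo q."""
    return range(math.ceil(-s * q / N), math.floor(s * q / N) + 1)


def pair_correlation_direct(m, N, q, s):
    """Count pairs x != y in (m, m+N] with r(x) - r(y) = h (mod q), divided by N."""
    residues = {h % q for h in shifts(N, q, s)}
    window = range(m + 1, m + N + 1)
    count = sum(1 for x in window for y in window
                if x != y and (r(x) - r(y)) % q in residues)
    return count / N


def pair_correlation_fourier(m, N, q, s):
    """Evaluate (1/(N*q)) * sum over t (mod q) of F(t) * G_m(t)."""
    window = range(m + 1, m + N + 1)
    diffs = [(r(x) - r(y)) % q for x in window for y in window if x != y]
    total = 0j
    for t in range(q):
        F = sum(e(((-t * h) % q) / q) for h in shifts(N, q, s))
        G = sum(e(((t * d) % q) / q) for d in diffs)
        total += F * G
    return (total / (N * q)).real
```

With $q = 101$, $N = 10$, $m = 5$, and $s = 1$ the two functions agree up to rounding error, which reflects the orthogonality relation $\sum_{t \pmod q} e(t(d-h)/q) = q$ if $d \equiv h \pmod q$ and $0$ otherwise.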
In the sequel we prove (5.5). For any function $\psi : \mathbb{R} \to \mathbb{R}$ which is periodic (mod 1), we put
\[ S(\psi) = \frac{1}{N^2 q^2} \sum_{m \pmod q} \psi\Big(\frac{m}{q}\Big) \Big|\sum_{0 \ne t \pmod q} F(t) G_m(t)\Big|^2. \tag{5.7} \]
If $\psi$ is smooth, then
\[ \psi(t) = \sum_{l \in \mathbb{Z}} c_l(\psi)\, e(lt), \tag{5.8} \]
with Fourier coefficients given by
\[ c_l(\psi) = \int_0^1 \psi(t)\, e(-lt)\, dt. \tag{5.9} \]
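We see, as in the proof of Proposition 2.4 and Section 4, that
\[
\begin{aligned}
S(\psi) &= \frac{1}{N^2 q^2} \sum_{l \in \mathbb{Z}} c_l(\psi) \sum_{\substack{0 \ne t_1, t_2 \pmod q,\ |t_j| \le q/2 \\ 1 \le x_1 \ne y_1 \le N \\ 1 \le x_2 \ne y_2 \le N}} F(t_1)F(t_2) \\
&\qquad \times e\Big(\frac{c\,[t_1(x_1 - y_1) - t_2(x_2 - y_2)] + b\,[t_1(x_1^2 - y_1^2) - t_2(x_2^2 - y_2^2)]}{q}\Big) \\
&\qquad \times \sum_{m \pmod q} e\Big(\frac{ml}{q} + \frac{2mb\,[t_1(x_1 - y_1) - t_2(x_2 - y_2)]}{q}\Big) \\
&= \frac{1}{N^2 q} \sum_{l \in \mathbb{Z}} c_l(\psi) \sum_{\substack{0 \ne t_1, t_2 \pmod q,\ |t_j| \le q/2 \\ 1 \le x_j \ne y_j \le N \\ 2b(t_1(x_1 - y_1) - t_2(x_2 - y_2)) + l = 0 \pmod q}} F(t_1)F(t_2) \\
&\qquad \times e\Big(\frac{c\,[t_1(x_1 - y_1) - t_2(x_2 - y_2)] + b\,[t_1(x_1^2 - y_1^2) - t_2(x_2^2 - y_2^2)]}{q}\Big).
\end{aligned}
\tag{5.10}
\]
Consider a fixed smooth function $H$ such that
\[ H(t) \ge 0 \quad \text{for all } t, \tag{5.11} \]
\[ H(t) \ge 1 \quad \text{for all } t \in [0, 1], \tag{5.12} \]
\[ H(t) = 0 \quad \text{for } |t| \ge 2. \tag{5.13} \]
Then let $T$ be a (large) positive number, and let $\psi$ be the periodic function (mod 1) such that
\[ \psi(t) = H(Tt) \quad \text{for } |t| \le \tfrac{1}{2}, \tag{5.14} \]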
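with Fourier coefficients
\[
\begin{aligned}
c_l(\psi) &= \int_0^1 \psi(t)\, e(-lt)\, dt = \int_{-1/2}^{1/2} H(Tt)\, e(-lt)\, dt \\
&= \frac{1}{T} \int_{-T/2}^{T/2} H(y)\, e\Big(-\frac{ly}{T}\Big)\, dy = \frac{1}{T} \int_{\mathbb{R}} H(y)\, e\Big(-\frac{ly}{T}\Big)\, dy.
\end{aligned}
\tag{5.15}
\]
In particular,
\[ c_0(\psi) = \frac{1}{T} \int_{\mathbb{R}} H(y)\, dy = \frac{\|H\|_1}{T} > 0. \tag{5.16} \]
Note that, from (5.11), (5.15), and (5.16),
\[ |c_l(\psi)| \le c_0(\psi), \qquad l \in \mathbb{Z}. \tag{5.17} \]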
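The bound (5.17) can be illustrated numerically. The sketch below is only an example under assumptions: the bump function `H` is one hypothetical choice satisfying (5.11) through (5.13), and the Fourier coefficients are approximated by a midpoint rule rather than computed exactly.

```python
import cmath
import math


def H(t):
    """A hypothetical smooth bump: H >= 0, H >= 1 on [0, 1], H = 0 for |t| >= 2."""
    d = 1.0 - t * t / 4.0
    if d <= 0.0:
        return 0.0
    return math.exp(4.0 / 3.0 - 1.0 / d)


def fourier_coefficient(l, T, steps=4000):
    """Midpoint-rule approximation of c_l(psi) for psi(t) = H(T*t) on [-1/2, 1/2]."""
    h = 1.0 / steps
    total = 0j
    for i in range(steps):
        t = -0.5 + (i + 0.5) * h
        total += H(T * t) * cmath.exp(-2j * math.pi * l * t)
    return total * h
```

Because $H \ge 0$, the triangle inequality gives $|c_l(\psi)| \le c_0(\psi)$ for the discretized sums as well, so the numerical check mirrors the argument in the text; a value such as $T = 8$ keeps the support of $H(Tt)$ inside $[-1/2, 1/2]$.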
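We integrate by parts $r$ times in (5.15) and get, for all $l \ne 0$,
\[
\begin{aligned}
|c_l(\psi)| &= \Big|\frac{1}{T} \cdot \Big(\frac{T}{2\pi l}\Big)^{r} \int_{\mathbb{R}} H^{(r)}(y)\, e\Big(-\frac{ly}{T}\Big)\, dy\Big| \le \frac{1}{T} \cdot \Big|\frac{T}{2\pi l}\Big|^{r} \|H^{(r)}\|_1 \\
&= c_0(\psi) \cdot \frac{\|H^{(r)}\|_1}{\|H\|_1} \cdot \Big|\frac{T}{2\pi l}\Big|^{r} = C(r)\, c_0(\psi) \cdot \Big|\frac{T}{l}\Big|^{r}.
\end{aligned}
\tag{5.18}
\]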
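For each $\delta > 0$ we take $r = r(\delta)$ such that $(r - 3)\delta > 3$. We also take $T = q/M$ with $M$ and $N$ as in (5.1). Therefore $\psi = H_{q/M}$ in the sequel and (5.18) provides
\[
\begin{aligned}
\sum_{|l| \ge q^{1+\delta}/M} |c_l(\psi)| &\le C(r)\, c_0(\psi) \cdot \frac{q^r}{M^r} \sum_{|l| \ge q^{1+\delta}/M} \frac{1}{|l|^r} \\
&\ll C(r)\, c_0(\psi) \cdot \frac{q^r}{M^r} \cdot \Big(\frac{q^{1+\delta}}{M}\Big)^{-r+1} \\
&= C(r)\, c_0(\psi) \cdot \frac{q^r}{M^r} \cdot \frac{M^{r-1}}{q^{(1+\delta)(r-1)}} \\
&= C(r)\, c_0(\psi)\, \frac{q^{1-\delta(r-1)}}{M} \le C(r)\, c_0(\psi)\, q^{1-\delta(r-1)} = C(r)\, \|H\|_1\, M q^{-\delta(r-1)}.
\end{aligned}
\tag{5.19}
\]
By (5.16), (5.7), and (5.12), $0 \le S' \le S(\psi)$; therefore (5.5) follows from
\[ S(\psi) \ll_{\delta} \frac{s^2 M}{q^{\delta}}. \tag{5.20} \]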
To prove (5.20) we write $S(\psi) = S_I(\psi) + S_{II}(\psi) + S_{III}(\psi)$, where $S_I(\psi)$, $S_{II}(\psi)$, and $S_{III}(\psi)$ are defined by the last equality in (5.10), when summing over $|l| \ge q^{1+\delta}/M$, $0 < |l| < q^{1+\delta}/M$, and, respectively, $l = 0$. From the definition of $S_I(\psi)$, (2.21), (5.19), and the choice of $r$, we get
\[
S_I(\psi) \ll \frac{s^2 N^4}{N^2 q} \Big(\sum_{0 \ne |t| \le q/2} \frac{q}{|t|}\Big)^{2} \sum_{|l| \ge q^{1+\delta}/M} |c_l(\psi)| \ll_{\delta} s^2 N^2 M q^{1-(r-1)\delta} \log^2 q \le s^2 M q^{3-(r-2)\delta} \le s^2 M q^{-\delta}.
\tag{5.21}
\]
The estimates for $|S_{II}(\psi)|$ and $|S_{III}(\psi)|$ are proved in the next two lemmas.

Lemma 5.2. With the assumptions from Theorem 5.1 and $\psi = H_{q/M}$, one has
\[ |S_{II}(\psi)| \ll_{\delta} \frac{s^2 M}{q^{\delta}}. \tag{5.22} \]
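Proof. We set $k_1 = x_1 - y_1$, $k_2 = x_2 - y_2$. By the definition of $S_{II}(\psi)$, (5.17), and (5.16),
\[ |S_{II}(\psi)| \le \frac{M}{q^2} \sum_{0 < |l| < q^{1+\delta}/M}\ \sum_{\substack{0 \ne t_1, t_2 \pmod q \\ |t_j| \le q/2,\ 1 \le |k_j| \le N \\ 2b(t_1 k_1 - t_2 k_2) + l = 0 \pmod q}} |F(t_1)| \cdot |F(t_2)|. \tag{5.23} \]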
With the dyadic interval argument from the proof of Proposition 2.4, (5.22) follows from (5.23) and from
\[
\#\Big\{(l, t_1, t_2, k_1, k_2);\ l, t_j, k_j \in \mathbb{Z},\ 0 < |l| < \frac{q^{1+\delta}}{M},\ |t_1| \in [A, 2A],\ |t_2| \in [B, 2B],\ 1 \le |k_1|, |k_2| \le N,\ 2b(t_1 k_1 - t_2 k_2) + l = 0 \pmod q\Big\} \ll \frac{\max(N, A) \max(N, B)}{q^{\delta} \log^2 q}
\tag{5.24}
\]
for all $1 \le A, B \le q/4$, uniformly in $A$ and $B$. We first consider the case $\max(|t_1|, |t_2|) \ge q^{\alpha+6\delta}/N$. Assume, for instance, $BN \ge q^{\alpha+6\delta}$, and set $a_1 = t_1 k_1$, $a_2 = t_2 k_2$. Then $|a_1| \le 2AN$ and $|a_2| \le 2BN$. For fixed $l$ and $a_1$, the number of $a_2$ such that $|a_2| \le 2BN$ and $2b(a_1 - a_2) + l = 0 \pmod q$ is
\[ \ll \frac{BN}{q} + 1 \ll \frac{BN}{q^{\alpha+6\delta}}. \]
Thus, for each $l$, the number of pairs $(a_1, a_2)$ with $|a_1| \le 2AN$, $|a_2| \le 2BN$, and $2b(a_1 - a_2) + l = 0 \pmod q$ is $\ll ABN^2/q^{\alpha+6\delta}$. Moreover, since the number of divisors of each (fixed) $a_1$ and $a_2$ is $\ll q^{\delta/2}$, we have the left-hand side of (5.24)
q 1+δ ABN 2 δ AB N 2 q 1−α−4δ · ·q = δ · . M q α+6δ q M
(5.25)
But the right-hand side of (5.25) is $\ll AB/(q^{\delta} \log^2 q)$, since $N^2 q^{1-\alpha-4\delta} \le M q^{-\delta} \le M/\log^2 q$, so (5.24) is proved in this case. In the remaining situation we have $\max(|t_1|, |t_2|) \le q^{\alpha+6\delta}/N$, so $|t_1 k_1 - t_2 k_2| \le 2q^{\alpha+6\delta}$, and (5.3) provides
\[ \Big\|\frac{2b(t_1 k_1 - t_2 k_2)}{q}\Big\| = \Big\|\frac{l}{q}\Big\| > q^{-2\alpha+20\delta}. \]
On the other hand,
\[ 0 < |l| < \frac{q^{1+\delta}}{M} \le \frac{q^{1+\delta}}{q^{2\alpha-11\delta}} = q^{\beta+12\delta}; \]
hence $\|l/q\| = |l|/q$ and the previous inequality yields $|l| > q^{\beta+20\delta}$; thus $q^{1+\delta}/M > q^{\beta+20\delta}$ and $M < q^{2\alpha-19\delta}$, which is a contradiction.
Lemma 5.3. With the assumptions from Theorem 5.1 and $\psi = H_{q/M}$, one has
\[ |S_{III}(\psi)| \ll_{\delta} \frac{s^2 M}{q^{\delta}}. \]
Proof. By the very definition of SI I I (ψ) and (5.16), SI I I (ψ) =
=
MH 1 N 2q 2
MH 1 N 2q 2
F (t1 )F (t2 )
0 =t1 ,t2 (mod q), |tj |≤q/2 1≤xj =yj ≤N 2t1 (x1 −y1 )=2t2 (x2 −y2 ) (mod q)
0 =t1 ,t2 (mod q), |tj |≤q/2 1≤|k1 |,|k2 |
and so |SI I I (ψ)|
M N 2q 2
bt1 (x1 − y1 ) x1 + y1 − x2 − y2 ×e q bt1 k1 (l1 − l2 ) , F (t1 )F (t2 ) e q
2
|F (t1 )| · |F (t2 )| · min N 2 ,
0 =t1 ,t2 (mod q), |tj |≤q/2 1≤|k1 |,|k2 |
1 . 2bt1 k1 /q2 (5.26)
We first evaluate the contribution of (t1 , t2 ) with |t1 | ≥ Nq 3δ to the right-hand side of (5.26) and show that 1 q2
Nq 3δ ≤|t1 |≤q/2 0<|t2 |
|F (t1 )| · |F (t2 )| δ
s2 . qδ
(5.27)
With the dyadic interval argument from Proposition 2.4, it suffices to prove (we put $a_1 = 2t_1 k_1$, $a_2 = 2t_2 k_2$)
\[
\#\big\{(t_1, k_1, t_2, k_2);\ t_j, k_j \in \mathbb{Z},\ |t_1| \in [A, 2A],\ |t_2| \in [B, 2B],\ 0 < |k_j| < N,\ a_1 = a_2 \pmod q\big\} \ll_{\delta} \frac{\max(N, A) \max(N, B)}{q^{\delta} \log^2 q}
\tag{5.28}
\]
for all $1 \le B \le q/4$, $Nq^{3\delta} \le A \le q/4$.
The number of integer pairs $(a_1, a_2)$ with $|a_1| \le 2AN$, $|a_2| \le 2BN$, $a_1 = a_2 \pmod q$, is
\[ \ll \frac{ABN^2}{q} + BN. \]
Since $t_1, k_1$ are divisors of $a_1$, and since $t_2, k_2$ are divisors of $a_2$, the left-hand side of (5.28) is
\[ \ll q^{\delta}\Big(\frac{ABN^2}{q} + BN\Big). \tag{5.29} \]
But $N^2 \ll q^{1-3\delta}$; hence $q^{\delta} ABN^2/q \ll AB/(q^{\delta} \log^2 q)$. Also, $A \gg Nq^{3\delta}$ implies $q^{\delta} BN \ll AB/(q^{\delta} \log^2 q)$ and (5.28). We see in a similar way that the contribution of terms $(t_1, t_2)$ with $|t_2| \ge Nq^{3\delta}$ to the right-hand side of (5.26) is $\ll_{\delta} s^2 M/q^{\delta}$. As a result, we may assume
\[ \max(|t_1|, |t_2|) \le Nq^{3\delta}, \tag{5.30} \]
which provides $|2t_1 k_1 - 2t_2 k_2| \le 4N^2 q^{3\delta} < q$, forcing $2t_1 k_1 = 2t_2 k_2$ $(= a)$. Therefore, it suffices to prove $S_{IV}$
1 = 2 2 N q
1 |F (t1 )| · |F (t2 )| · min N , ba/q2 2
|a|≤2N 2 q 3δ 0<|tj |≤N 2 q 3δ 0<|kj |
δ
s2 . qδ
(5.31) We put
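\[ \mathcal{A} = \Big\{a \in \mathbb{Z};\ |a| \le 2N^2 q^{3\delta},\ \Big\|\frac{ab}{q}\Big\| < \frac{q^{3\delta}}{N}\Big\}, \]
and we prove
\[ \#\mathcal{A} \le N^2 q^{-2\delta}. \tag{5.32} \]
If (5.32) were not true, we would find $a', a'' \in \mathcal{A}$, $a' \ne a''$, such that
\[ |a' - a''| \le \frac{4N^2 q^{3\delta}}{N^2 q^{-2\delta}} + 1 < q^{6\delta}. \tag{5.33} \]
On the other hand, since $a', a'' \in \mathcal{A}$, we have
\[ \Big\|\frac{(a' - a'')\, b}{q}\Big\| < \frac{2q^{3\delta}}{N} < \frac{q^{4\delta}}{N}. \tag{5.34} \]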
It is clear that (5.33) and (5.34) contradict (5.2). Therefore, (5.32) holds true, and (5.31) follows from SI V ≤ SV + SV I , where SV =
1 2 N q2
|F (t1 )| · |F (t2 )| · N 2
a∈Ꮽ 0<|tj |≤N 2 q 3δ 0<|kj |
SV I =
1 N 2q 2
|F (t1 )| · |F (t2 )| ·
|a|≤2N 2 q 3δ 0<|tj |≤N 2 q 3δ a∈ /Ꮽ 0<|kj |
#Ꮽ δ q 2 qδ N 2 1 · q · ≤ · 2δ = δ , 2 2 2 q q N N q
1 ba/q2
N 2 q 3δ δ q 2 N 2 1 · q · 2 · 6δ = 2δ . 2 2 N q N q q
The proof of Theorem 5.1 is now complete. For $M = q$ the estimate of $S(q, r, M, N)$ reduces to $l = 0$. Indeed, we see as in (5.10) that
\[
S' = \frac{1}{N^2 q} \sum_{\substack{0 \ne t_1, t_2 \pmod q \\ 1 \le x_j \ne y_j \le N \\ 2(t_1(x_1 - y_1) - t_2(x_2 - y_2)) = 0 \pmod q}} F(t_1)F(t_2)\, e\Big(\frac{c\,[t_1(x_1 - y_1) - t_2(x_2 - y_2)]}{q}\Big)\, e\Big(\frac{b\,[t_1(x_1^2 - y_1^2) - t_2(x_2^2 - y_2^2)]}{q}\Big).
\]
This sum can be estimated as in the proof of Lemma 5.3, getting the following corollary.

Corollary 5.4. Let $\delta > 0$ be fixed, let $q$ be a large integer, and let $r(X) = bX^2 + cX + d \in \mathbb{Z}[X]$ be such that $\gcd(b, q) = 1$. Then, for all positive integers $N$ such that $N^2 \le q^{1-3\delta}$ and
\[ \Big\|\frac{bk}{q}\Big\| > \frac{q^{4\delta}}{N} \quad \text{for all } 1 \le k \le q^{6\delta}, \]
one has
\[ S(q, r, N) = \sum_{m \pmod q} \big|R_2(\mathcal{N}_m, I) - |I|\big|^2 \ll_{\delta} s^2 q^{1-\delta}, \]
for all $s > 0$ and all intervals $I \subset [-s, s]$.
[Figure 5.1. The set B, drawn on axes running from 1 to 6, with marked points (2, 5), (9/4, 4), (5/2, 2), (15/4, 2), and (9/4, 1).]
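We consider the following subset of $[2, \infty] \times [1, \infty)$:
\[
B = \Big\{(\kappa, \sigma);\ \sigma \ge \max\Big(2, \min\Big(\frac{1}{\kappa - 2}, \frac{4\kappa + 2}{3\kappa - 4}\Big)\Big)\Big\} \cup \Big\{(\kappa, \sigma);\ 1 \le \sigma \le \min\Big(\frac{2\kappa}{3} - \frac{1}{2}, 2\Big)\Big\}.
\tag{5.35}
\]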
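Membership in the set $B$ is easy to test mechanically. The following Python sketch assumes the reading of (5.35) as $\sigma \ge \max(2, \min(1/(\kappa-2), (4\kappa+2)/(3\kappa-4)))$ or $1 \le \sigma \le \min(2\kappa/3 - 1/2, 2)$, with $1/(\kappa - 2)$ treated as $+\infty$ at $\kappa = 2$; the function name `in_B` is illustrative only.

```python
def in_B(kappa, sigma):
    """Return True if (kappa, sigma) lies in the set B of (5.35), as read above."""
    if kappa < 2 or sigma < 1:
        return False
    # Lower boundary of the first piece; 1/(kappa - 2) is +infinity at kappa = 2.
    first_bound = (4 * kappa + 2) / (3 * kappa - 4)
    if kappa > 2:
        first_bound = min(1 / (kappa - 2), first_bound)
    in_first_piece = sigma >= max(2.0, first_bound)
    # Upper boundary of the second piece.
    in_second_piece = sigma <= min(2 * kappa / 3 - 0.5, 2.0)
    return in_first_piece or in_second_piece
```

The five points marked in Figure 5.1, namely (2, 5), (9/4, 4), (5/2, 2), (15/4, 2), and (9/4, 1), all lie on the boundary of $B$ under this reading, which is a useful consistency check on the formula.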
Theorem 5.5. For all (κ, σ ) ∈ B and all irrational numbers ξ of Diophantinetype κ, there exist two sequences of positive integers Mj , Nj → ∞ such that (i) limj R2 ({ξ n2 }Mj 0 small. We choose sequences (bj , qj ) of integers such that gcd(bj , qj ) = 1 and ξ − bj < 1 ; (5.36) qj qjκ−ε 3/4+ε
then let Nj = [qj
σ (3/4+ε)
] and Mj = [qj ]. From (5.36) we gather 2 2 bj n2 bj 3σ /2+2σ ε−κ+ε n ξ− < Mj + Nj ξ − qj . max 1≤n≤Mj +Nj qj qj
−3/4−ε
By the choice of σ this is qj
, so 2 n2 bj max n ξ− 1≤n≤Mj +Nj q j
1 Nj1+ε
.
(5.37)
We combine (5.37) with Theorem 5.6 to conclude that the pair correlation of {n2 ξ }Mj
j σ (3/4+ε) Mj = [qj ],
Then we again let and we argue as above. Assume now that the approximation order of ξ is 2 ≤ κ < ∞. We check that (5.2) is fulfilled by qj and bj which satisfy gcd(qj , bj ) = 1 and (5.36), by r(X) = bj X 2 4(κ+1)δ and Nj ≥ qj . This is clear since, for large j and 1 ≤ a ≤ qj4δ , we have # # # # # abj # # abj # abj 1 # # # # aξ − # ≤ aξ − ≤ aξ − < ; qj # # qj # qj qjκ−ε−4δ hence
# # # abj # 1 # # # q # > aξ − κ−ε−4δ qj j ≥ ≥ >
Cξ 1 − κ−ε−4δ κ−1 a qj Cξ (κ−1)4δ qj qj4δ Nj
−
1 qjκ−ε−4δ
>
Cξ (κ−1)4δ 2qj
.
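We now check that (5.3) is fulfilled for $\kappa < 3$. Let $\alpha, \beta > 0$ be such that $2\alpha + \beta = 1$ (to be chosen precisely later), and assume that $|z| \le q_j^{\alpha+20\delta}$, $0 \ne |l| \le q_j^{\beta+20\delta}$, and $2b_j z = l \pmod{q_j}$. Then
\[ \Big\|2z\xi - \frac{l}{q_j}\Big\| = \Big\|2z\xi - \frac{2z b_j}{q_j}\Big\| \le \Big|2z\xi - \frac{2z b_j}{q_j}\Big| \le \frac{2|z|}{q_j^{\kappa-\varepsilon}} \le \frac{2}{q_j^{\kappa-\alpha-\varepsilon-20\delta}}. \]
So, since $1 - \beta < \kappa - \alpha$, we get, for large $j$,
\[ \|2z\xi\| \le \frac{|l|}{q_j} + \frac{2}{q_j^{\kappa-\alpha-\varepsilon-20\delta}} \le \frac{1}{q_j^{1-\beta-20\delta}} + \frac{2}{q_j^{\kappa-\alpha-\varepsilon-20\delta}} < \frac{2}{q_j^{1-\beta-20\delta}} = \frac{2}{q_j^{2\alpha-20\delta}}. \tag{5.38} \]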
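On the other hand,
\[ \|2z\xi\| \ge \frac{C_{\xi}}{|2z|^{\kappa-1}} \ge \frac{C_{\xi}}{q_j^{(\kappa-1)(\alpha+20\delta)}}. \tag{5.39} \]
By (5.38) and (5.39),
\[ (\kappa - 3)\alpha + 20\kappa\delta \ge 0, \tag{5.40} \]
which is a contradiction for $2 \le \kappa < 3$ (by choosing $\varepsilon > 0$ and $\delta > 0$ small enough). Now let $\sigma \ge \max(4, (4\kappa + 2)/(3\kappa - 4))$. Then $2\sigma/(3\sigma - 4) \le \min(1, \kappa\sigma/(2\sigma + 1))$. We need to find $0 < \theta < 1$ and $0 < \alpha < 1/2$ such that
\[ 2\theta + \frac{\theta}{\sigma} < \kappa - \varepsilon \tag{5.41} \]
and
\[ \frac{2\theta}{\sigma} < 1 - 3\delta, \qquad \theta > 2\alpha - 11\delta, \qquad \frac{2\theta}{\sigma} + 1 - \alpha - 3\delta < \theta. \tag{5.42} \]
This can be achieved for small $\varepsilon$ and $\delta$ by $2\sigma(1 - 4\delta)/(3\sigma - 4) < \theta < (\kappa - \varepsilon)\sigma/(2\sigma + 1)$ and $\alpha = \theta/2 + \delta$. Then we let $M_j = [q_j^{\theta}]$ and $N_j = [q_j^{\theta/\sigma}]$. The inequalities from (5.42) show that (5.1) is fulfilled, while (5.41) shows that there exists $c > 0$ such that
\[ M_j^2 N_j^{1+c} \ll q_j^{\kappa-\varepsilon}. \tag{5.43} \]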
All the assumptions of Theorem 5.1 are now fulfilled; hence, for all s > 0 and all intervals I ⊂ [−s, s], we have bj n2 R 2 qj M
1≤m≤Mj
If we let Mj 2 ≤ m ≤ Mj ; Ᏹ=
2 C(δ)s 2 Mj , I − |I | ≤ . δ q <m≤M +N j j j j
(5.44)
∃ s0 > 0, ∃ I0 ⊂ [−s0 , s0 ] interval such that
2C(δ)1/2 s , bj n2 0 , I0 − |I0 | ≥ R2 δ/2 qj Mj <m≤Mj +Nj q j
then, by (5.43), 4C(δ)s02 qjδ
· #Ᏹ ≤
C(δ)s02 qjδ
.
Hence #Ᏹ ≤ Mj /4 and there is Mj ∈ [Mj /2, Mj ] such that, for all s > 0 and all intervals I ⊂ [−s, s], 2 1/2 n b R2 j − |I | ≤ 2C(δ) s . , I (5.45) δ/2 qj M <m≤M +Nj q j
j
j
By (5.41) we have, for all n ∈ (Mj , Mj + Nj ], # # # 2 bj n2 # 2 bj n2 20Mj2 20 1 #ξ n − # ≤ ξ n − ≤ ≤ κ−2θ−2ε < 1+c . # qj # qj qjκ−ε qj Nj Now let s > 0 be fixed, and let I0 ⊂ [−s, s] be an interval. With the previous inequality, (3.9) provides intervals I0 ⊂ I0 ⊂ I0 such that |I0 | = |I0 | − and
bj n2 R2 qj
1 , Njc
|I0 | = |I0 | +
Mj
, I0 ≤ R2
1 , Njc
(5.46)
ξ n2 M
j
bj n2 ≤ R2 qj
j
Mj
(5.47)
, I0 .
We can now combine (5.47) with (5.45) and get, for j large, 2 n b 1 j , I0 − |I0 | + c R2 ξ n2 M
≤
2C(δ)1/2 (s + 1)
and R2
ξn
2
Mj
1/2 qj
, I0 − |I0 | ≥ R2
≥−
j
+
1 Njc
n2
bj qj
2C(δ)1/2 s 1/2 qj
Mj
−
1 . Njc
, I0 − |I0 | −
1 Njc
Finally, by (5.45) and the previous two inequalities we conclude that lim R2 ξ n2 M
j
j
j
(5.48)
Assume now that $\sigma > \max(2, 1/(\kappa - 2))$. Then, for all $\varepsilon > 0$, there exist sequences $(b_j)_j$, $(q_j)_j$ which satisfy $\gcd(b_j, q_j) = 1$ and (5.36). We take $M_j = q_j$ and $N_j = [q_j^{1/\sigma}]$. Then
\[ \Big\|n^2\xi - \frac{n^2 b_j}{q_j}\Big\| < \frac{4q_j^2}{q_j^{\kappa-\varepsilon}} = \frac{4}{q_j^{\kappa-2-\varepsilon}} < \frac{1}{N_j^{1+c}}, \qquad 1 \le n \le M_j + N_j, \tag{5.49} \]
for an appropriate choice of $c > 0$. Since $\sigma > 2$ we also have $N_j^2 \le q_j^{1-3\delta}$ for small $\delta > 0$. Therefore, we can apply Corollary 5.4 and get, as in the first case, $M_j' \in [q_j/2, q_j]$ such that (5.48) holds true for all intervals $I_0$. Note that $M_j'$ satisfy the requirements for $M_j$ in the statement of Theorem 1.4.

References

[1] R. C. Baker, Diophantine Inequalities, London Math. Soc. Monogr. (N.S.), Clarendon Press, Oxford, 1986.
[2] M. V. Berry and M. Tabor, Level clustering in the regular spectrum, Proc. Royal Soc. London Ser. A 356 (1977), 375–394.
[3] P. M. Bleher, The energy level spacing for two harmonic oscillators with golden mean ratio of frequencies, J. Statist. Phys. 61 (1990), 869–876.
[4] P. M. Bleher, The energy level spacing for two harmonic oscillators with generic ratio of frequencies, J. Statist. Phys. 63 (1991), 261–283.
[5] E. Bombieri, On exponential sums in finite fields, Amer. J. Math. 88 (1966), 71–105.
[6] G. Casati, I. Guarneri, and F. M. Izraĭlev, Statistical properties of the quasi-energy spectrum of a simple integrable system, Phys. Lett. A 124 (1987), 263–266.
[7] H. L. Montgomery, "The pair correlation of zeros of the zeta function" in Analytic Number Theory (St. Louis, Mo., 1972), Proc. Sympos. Pure Math. 24, Amer. Math. Soc., Providence, 1973, 181–193.
[8] C. J. Moreno and O. Moreno, Exponential sums and Goppa codes, I, Proc. Amer. Math. Soc. 111 (1991), 523–531.
[9] T. Nagell, Introduction to Number Theory, 2d ed., Chelsea, New York, 1964.
[10] A. Pandey, O. Bohigas, and M. J. Giannoni, Level repulsion in the spectrum of two-dimensional harmonic oscillators, J. Phys. A 22 (1989), 4083–4088.
[11] Z. Rudnick and P. Sarnak, Zeros of principal L-functions and random matrix theory, Duke Math. J. 81 (1996), 269–322.
[12] Z. Rudnick and P. Sarnak, The pair correlation function of fractional parts of polynomials, Comm. Math. Phys. 194 (1998), 61–70.
[13] Z. Rudnick, P. Sarnak, and A. Zaharescu, The spacings between the numbers $n^2\alpha$ (mod 1), in preparation.
[14] P. Sarnak, "Quantum chaos, symmetry and zeta functions, I, II" in Current Developments in Mathematics (Cambridge, Mass., 1997), International Press, Boston, 1999, 127–144, 145–159.
[15] W. M. Schmidt, Small Fractional Parts of Polynomials, CBMS Regional Conf. Ser. in Math. 32, Amer. Math. Soc., Providence, 1977.
[16] V. Sós, On the distribution mod 1 of the sequence nα, Ann. Univ. Sci. Budapest. Eötvös Sect. Math. 1 (1958), 127–134.
[17] S. Swierczkowski, On successive settings of an arc on the circumference of a circle, Fund. Math. 46 (1959), 187–189.
[18] A. Weil, Sur les courbes algébriques et les variétés qui s'en déduisent, Actualités Sci. Indust. 1041, Publ. Inst. Math. Univ. Strasbourg 7 (1945), Hermann, Paris, 1948.
[19] H. Weyl, Über die Gleichverteilung von Zahlen mod. Eins, Math. Ann. 77 (1916), 313–352.
Boca: Institute of Mathematics of the Romanian Academy, Post Office Box 1-764, Bucharest RO-70700, Romania; Current: School of Mathematics, Cardiff University, Senghennydd Road, Cardiff CF2 4YH, United Kingdom; [email protected] Zaharescu: Institute of Mathematics of the Romanian Academy, Post Office Box 1-764, Bucharest RO-70700, Romania; Current: Department of Mathematics, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, USA; [email protected]
Vol. 105, No. 2
DUKE MATHEMATICAL JOURNAL
© 2000
SURFACES WITH PRESCRIBED GAUSS CURVATURE SAGUN CHANILLO and MICHAEL KIESSLING 1. Introduction. Let Sg = (R2 , g) denote a conformally flat surface over R2 with metric given by ds 2 = g ij dxi dxj = e2u(x) dx1 2 + dx2 2 , (1.1) where u is a real-valued function of the isothermal coordinates x = (x1 , x2 ) ∈ R2 . If u is given, the Gauss curvature function K for Sg is then explicitly given by K(x) = −e−2u(x) u(x), where is the Laplacian for the standard metric on R2 . The quantity (u) ≡ K(x)e2u(x) dx, R2
(1.2)
(1.3)
where dx denotes Lebesgue measure on R2 , is called the integral curvature of the surface (sometimes called total curvature). We say that Sg is a classical surface over R2 if u ∈ C 2 (R2 ). Clearly, K ∈ C 0 (R2 ) in that case. The inverse problem, namely, to prescribe K and to find a surface Sg pointwise conformal to R2 for which K is the Gauss curvature, renders (1.2) a semilinear elliptic partial differential equation (PDE) for the unknown function u. The problem of prescribing Gaussian curvature thus amounts to studying the existence, uniqueness or multiplicity, and classification of solutions u of (1.2) for the given K. A particularly interesting aspect of the classification problem is the question under which conditions radial symmetry of the prescribed Gauss curvature function K implies radial symmetry of the classical surface Sg = (R2 , g) and under which conditions radial symmetry is broken. Notice that the inverse problem may not have a solution. In particular, when considered on S2 instead of R2 , there are so many obstructions to finding a solution u to (the analog of) (1.2) for the prescribed K that Nirenberg was prompted many years ago to raise the question, Which real-valued functions K are Gauss curvatures of some surface Sg over S2 ? For Nirenberg’s problem, see [4], [6], [9], [10], [11], [33], [36], [38], [44], [45], [47], [48], [50], and [51]. For related works on other compact 2-manifolds, see, for example, [26] and [60]. In this work, we are interested in the prescribed Received 7 June 1999. Revision received 14 January 2000. 2000 Mathematics Subject Classification. Primary 53C21; Secondary 60K35. Chanillo’s work supported by National Science Foundation grant DMS-9623079. Kiessling’s work supported by National Science Foundation grant DMS-9623220. 309
310
CHANILLO AND KIESSLING
Gauss curvature problem on R2 . There is considerable literature on this problem, for example, [2], [3], [12], [17], [18], [19], [21], [22], [46], [49], and [59]. We study here the existence problem of surfaces for a large class of K via a novel approach. We also mention an existence-of-solutions result for a monotonically decreasing K that is unbounded below and positive at the origin. Moreover, we study the question of radial symmetry of classical surfaces, which correspond to classical solutions of (1.2), for monotonically decreasing K and for nonpositive K. The problem of nonpositive prescribed Gaussian curvature K is already fairly well understood (see [1], [21], [22], [52], [53], and [61]). In particular, [21, Theorem III] characterizes any Sg with compactly supported K and finite integral curvature uniquely by its integral curvature and by an entire harmonic function H to which u is asymptotic at infinity. If the entire harmonic function is constant and K radially symmetric, then u is radially symmetric, by uniqueness. Cheng and Ni’s [21, Theorem II] characterizes any Sg with K ∼ −C|x|− when |x| → ∞, > 2, and finite integral curvature uniquely by its integral curvature alone, so that u is radially symmetric if K is. This theorem is extended in [22] to K satisfying an integrability condition and C|x|−m ≤ |K(x)| ≤ C|x|m as |x| → ∞. Our Theorem 2.1 generalizes [21, Theorem III], as well as Cheng and Ni’s [21, Theorem II] and its sequel in [22], to a larger class of K satisfying mild integrability conditions without pointwise asymptotic bounds or even compact support for K. Our existence results follow as corollaries from our probabilistic Theorem 8.4, which applies to nonnegative K as well as nonpositive ones. We prove our Theorem 8.4 using the methods developed in [37], [8], [40], and [41]; see also [38]. For nonpositive radial K, the radial symmetry of u then follows from our uniqueness Theorem 2.2, which we prove in its dual version Theorem 9.1. 
Prescribing Gaussian curvature K that is somewhere strictly positive is a much richer problem and is less well understood. Existence results are available in [3], [18], [19], [46], and [59]; note [18] regarding [3]. The question of radial symmetry of u has been studied by various authors for decreasing K under various additional conditions, see [12], [15], [16], [17], and [54]. As already emphasized above, our Theorem 2.1 establishes existence of u also for nonnegative K, under mild integrability conditions on K rather than prescribed asymptotic behavior or pointwise bounds, as employed in [3], [18], [19], [46], and [59]. We also announce an existence result of a radial surface with positive integral curvature for a radial continuous K that is positive at the origin and diverges logarithmically to −∞ as |x| → ∞ (see Proposition 2.4). In our proof of Proposition 2.4, we actually do not prescribe K but, inspired by [39], we consider a system of equations whose solutions determine both K and u, and we use scattering theory and gradient flow techniques to control it. This system case is of independent interest, and details will be published elsewhere. The radial symmetry of surfaces with K positive somewhere does not follow simply by uniqueness. In Section 2, we list various nonradial surfaces with radial Gauss
PRESCRIBED GAUSS CURVATURE
311
curvature. We extract from this discussion a set of conditions on K and g which rule out the various nonradial surfaces we found. In particular, we demand that K be radial decreasing. We formulate a conjecture that, under this set of conditions, any classical surface for the corresponding prescribed Gauss curvature K is radially symmetric about some point. We then state in Section 3, and subsequently prove in Sections 4–7, using the method of moving planes (see [30] and [49]), Theorems 3.2 and 3.3 on radial symmetry of classical surfaces. Our symmetry theorems require a slightly stronger set of conditions than those formulated in our conjecture. However, our conditions are considerably weaker than those used in [12], [15], [16], and [17]. In particular, we impose no pointwise bounds near infinity on positive K. We also allow K to be unbounded below, but with some growth conditions near infinity, allowing logarithmic as well as power law growth of |K|. Our existence-of-solutions Theorem 2.1 and Proposition 2.4 establish that solutions exist under these conditions on K and thus verify that our radial symmetry theorems cover more cases than the earlier symmetry results listed above. After submission of our work, existence results when K is positive somewhere and satisfies 0 ≥ K(x) ≥ −C|x| as |x| → ∞, with 0 < < 2, appeared in [20]. Of these surfaces, those that also satisfy the hypotheses on K listed in Proposition 3.1 are radially symmetric by Theorems 3.2 and 3.3. 2. Broken symmetry and a symmetry conjecture. We say that Sg is radially symmetric about some point x ∗ ∈ R2 if the associated solution u of (1.2) satisfies u(x − x ∗ ) = u((x − x ∗ )) for any ∈ SO(2). We say that u is nonradial if no such point exists. We now collect a list of examples of nonradial surfaces from which we extract conditions on K and g under which one can hope to assert the radial symmetry of u. 
Clearly, u cannot be radially symmetric about some point if K is not radially symmetric about the same point. Without loss, we choose the point about which K is radially symmetric to be the origin; that is, we demand that K(x) = K(x).
(2.1)
A few moments of reflection reveal that some further conditions on $K(x)$ and $u(x)$ are needed; for without further conditions, examples of nonradially symmetric surfaces having a Gauss curvature $K$ satisfying (2.1) are readily found. In particular, if $K$ satisfying (2.1) is compactly supported, then solutions $u$ of (1.2) which display some nonconstant entire harmonic behavior near infinity are asserted to exist (for nonpositive $K$) in [21, Theorem III]. Our first theorem, proved in Section 8, generalizes [21, Theorem III], as well as [21, Theorem II] and its extension in [22], to a much wider class of sufficiently "concentrated" $K$ that have well-defined sign. We define the sign $\sigma(K)$ of the function $K$ by: $\sigma(K) = +1$ if $K \not\equiv 0$, $K(x) \ge 0$ for all $x \in \mathbb{R}^2$; $\sigma(K) = -1$ if $K \not\equiv 0$, $K(x) \le 0$ for all $x \in \mathbb{R}^2$; $\sigma(K) = 0$ if $K(x) \equiv 0$. For other $K$, $\sigma(K)$ does not exist.

Theorem 2.1. Assume $K \in L^{\infty}(\mathbb{R}^2)$ has well-defined sign $\sigma(K)$. Furthermore, assume that for some entire harmonic function $H : \mathbb{R}^2 \to \mathbb{R}$ and for all $0 < \gamma < 2$, $K$ satisfies
\[ \int_{B_1(y)} |y - x|^{-\gamma} |K(x)| e^{2H(x)}\, dx \longrightarrow 0 \quad \text{as } |y| \longrightarrow \infty, \tag{2.2} \]
where $B_R(y) \subset \mathbb{R}^2$ is the open ball of radius $R$ centered at $y$. Given the same $H$, assume also that $K$ satisfies
\[ \int_{\mathbb{R}^2} |K(x)| e^{2H(x)} |x|^{q}\, dx < \infty \tag{2.3} \]
for some $q > 0$. If $K \le 0$, define
\[ \kappa^*(K, H) = -2\pi \sup_{q > 0} \{q : \text{(2.3) is true}\}. \tag{2.4} \]
Then, for any such $K$, $H$, and any $\kappa$ satisfying
\[ \kappa \in \begin{cases} (\kappa^*, 0) & \text{if } K \not\equiv 0,\ K \le 0, \\ \{0\} & \text{if } K \equiv 0, \\ (0, 4\pi) & \text{if } K \not\equiv 0,\ K \ge 0, \end{cases} \tag{2.5} \]
there exists a solution $u = U_{H,\kappa} \in W^{2,p}_{\mathrm{loc}} \cap L^{\infty}_{\mathrm{loc}}$ of (1.2) for the prescribed Gaussian curvature function $K$, having integral curvature
\[ \int_{\mathbb{R}^2} K(x) e^{2U_{H,\kappa}(x)}\, dx = \kappa \tag{2.6} \]
and having asymptotic behavior given by
\[ U_{H,\kappa}(x) = H(x) - \frac{\kappa}{2\pi} \ln |x| + o(|\ln |x||) \quad \text{as } |x| \longrightarrow \infty. \tag{2.7} \]
Moreover, if K ∈ C 0,α , then UH,κ is a classical solution. If K ∈ C 0,α also satisfies (2.1) and H is nonconstant, then UH,κ generates a classical surface that is asymptotic to a nonradial entire harmonic surface, hence breaking radial symmetry. We remark that if K ∈ C 0,α satisfying (2.1) is also decreasing, then all the conclusions of Theorem 2.1 hold without imposing (2.2). Surfaces that are asymptotic to some nonradial entire harmonic surface (entire harmonic surfaces for K ≡ 0) can be eliminated by the mild integrability condition u+ ∈ L1 BR (y), dx , uniformly in y, (2.8)
where u+ (x) := max{u(x), 0}. For the category of K ≤ 0 covered in Theorem 2.1, which, in addition, satisfy (2.1), condition (2.8) already eliminates all nonradial solutions u of (1.2) with finite integral curvature. Indeed, we have the following theorem. Theorem 2.2. Under the hypotheses stated in Theorem 2.1, if K ≤ 0, then the solution UH,κ is unique. Moreover, if K ≤ 0 also satisfies (2.1) and u satisfies (2.8), then UH,κ is radially symmetric and decreasing. It remains to discuss K that are strictly positive somewhere. In that case, among the Sg that satisfy (2.1) and (2.8), one finds nonradial surfaces that are periodic about the origin of the Euclidean plane, having fundamental period 2π/n, with n > 1. We illustrate this with the following examples, taken from [12] (see also [54]). For x = 0, we introduce the usual polar coordinates (r, θ) of x, that is, r = |x| > 0 and tan θ = x2 /x1 , with θ ∈ [0, 2π). Let N denote the natural numbers. For n ∈ N, let K(x) = K (n) (x), with K (n) (x) = 4n2 |x|2(n−1) .
(2.9)
Clearly, $K^{(n)} \in C^\infty(\mathbb{R}^2)$. Let $y \in \mathbb{R}^2$ be chosen arbitrarily, except that $y \ne 0$, and let $\theta_0$ be the polar angle coordinate of $y$. Let $\zeta \in \mathbb{R}$. Then $u(\,\cdot\,) = U_\zeta^{(n)}(\,\cdot\,; y)$, with
$$U_\zeta^{(n)}(x; y) = -\ln\Bigl(1 - 2\frac{|x|^n}{|y|^n}\cos\bigl(n(\theta - \theta_0)\bigr)\tanh\zeta + \frac{|x|^{2n}}{|y|^{2n}}\Bigr) - \ln\bigl(|y|^n\cosh\zeta\bigr), \tag{2.10}$$
is a $C^\infty(\mathbb{R}^2)$ solution of (1.2) for the Gaussian curvature function (2.9). The integral curvature of the surface described by (2.10) is given by
$$\int_{\mathbb{R}^2} K^{(n)}(x)\, e^{2U_\zeta^{(n)}(x; y)}\,dx = 4\pi n, \tag{2.11}$$
independently of $\zeta$ and $y$. For $\zeta = 0$ and all $n \in \mathbb{N}$, the solution (2.10) is radially symmetric about the origin. For $\zeta \ne 0$, if $n = 1$ so that (2.9) reduces to a constant, $K^{(1)} = 4$, the solution (2.10) is periodic about the origin with fundamental period $2\pi$, yet it is radially symmetric about and decreasing away from the point $x^* = \tanh(\zeta)\,y$. For $\zeta \ne 0$ and $n > 1$, in which cases $K^{(n)}$ increases monotonically with $|x|$, the solution (2.10) is periodic about the origin with fundamental period $2\pi/n$, hence nonradial about any point; see Figure 1. This last family of nonradial surfaces is eliminated by admitting only monotonically decreasing radial $K$, that is, those $K$ satisfying
$$K(x) \le K(y) \quad \text{whenever } |x| \ge |y|. \tag{2.12}$$
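The claim that (2.10) solves (1.2) for the curvature (2.9) can be sanity-checked numerically. The sketch below assumes that (1.2) reads $-\Delta u = K e^{2u}$ (the form used in (4.12)); the function names, the finite-difference step, and the sample points are illustrative choices, not part of the paper.

```python
import math

def K_n(x1, x2, n):
    """Gauss curvature (2.9): K^(n)(x) = 4 n^2 |x|^(2(n-1))."""
    r = math.hypot(x1, x2)
    return 4 * n * n * r ** (2 * (n - 1))

def U(x1, x2, n, zeta, y):
    """Candidate solution (2.10) with parameters n, zeta and base point y != 0."""
    r, theta = math.hypot(x1, x2), math.atan2(x2, x1)
    ry, theta0 = math.hypot(y[0], y[1]), math.atan2(y[1], y[0])
    q = (r / ry) ** n
    d = 1 - 2 * q * math.cos(n * (theta - theta0)) * math.tanh(zeta) + q * q
    return -math.log(d) - math.log(ry ** n * math.cosh(zeta))

def residual(x1, x2, n, zeta, y, h=1e-4):
    """Five-point finite-difference value of -Laplace(U) - K e^{2U} (assumed form of (1.2))."""
    u0 = U(x1, x2, n, zeta, y)
    lap = (U(x1 + h, x2, n, zeta, y) + U(x1 - h, x2, n, zeta, y)
           + U(x1, x2 + h, n, zeta, y) + U(x1, x2 - h, n, zeta, y) - 4 * u0) / (h * h)
    return -lap - K_n(x1, x2, n) * math.exp(2 * u0)

if __name__ == "__main__":
    # The residual should be small compared with K e^{2U}, which is of order one here.
    print(residual(0.3, -0.7, 2, 1.0, (1.0, 0.0)))
    # Rotating x by 2*pi/n (here n = 2, i.e. x -> -x) should leave U unchanged.
    print(U(0.3, -0.7, 2, 1.0, (1.0, 0.0)) - U(-0.3, 0.7, 2, 1.0, (1.0, 0.0)))
```

A small residual at a few sample points is only a consistency check of the formula, not a proof; it does, however, catch sign or exponent errors in a transcription of (2.10).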
Among the $S_g$ that satisfy (2.1), (2.8), and (2.12), we still find nonradial surfaces, namely, when
$$K(x) = K_0, \quad \text{with } K_0 = \text{const} > 0, \tag{2.13}$$
[Figure 1: contour plot; horizontal axis $x_1$, vertical axis $x_2$.]
Figure 1. Level curves $e^{2u(x)} = 2^a$, $a \in \{-5, -4, \dots, 0, 1\}$, for $u$ given by (2.10) with $n = 2$, $|y| = 1$, $\theta_0 = 0$, and $\zeta = 1$; $\max e^{2u} \approx 2.57$ is taken at the centers of the two islands. For $|x|$ large, the conformal factor $e^{2u(x)} \sim C|x|^{-8}$ and the level curves become circular.
in which case (1.2) is the conformally invariant Liouville equation [42]. Besides the radially symmetric entire solutions obtained with $n = 1$ in (2.10), this equation has entire classical solutions that are periodic along a Cartesian coordinate direction. Let $y \in \mathbb{R}^2$ be an arbitrary fixed point, and let $v \in \mathbb{R}^2$ and $v^\perp \in \mathbb{R}^2$ be two fixed vectors that are orthogonal with respect to the Euclidean inner product, that is, $\langle v, v^\perp \rangle = 0$, having identical lengths given by $|v| = |v^\perp| = K_0^{1/2}$. Let $\zeta \in \mathbb{R}$. Then $u(\,\cdot\,) = U_\zeta(\,\cdot\,; y)$, with
$$U_\zeta(x; y) = -\ln\bigl(\cosh(\zeta)\cosh\langle v, x - y\rangle - \sinh(\zeta)\sin\langle v^\perp, x - y\rangle\bigr), \tag{2.14}$$
is a nonradial $C^\infty(\mathbb{R}^2)$ solution of (1.2) for the Gauss curvature function (2.13) (see also [12]). For $\zeta = 0$, the solution is translation invariant along $v^\perp$, while for $\zeta \ne 0$, it is periodic along $v^\perp$ with period $2\pi/\sqrt{K_0}$. See Figure 2. Since $\exp(2U_\zeta(\,\cdot\,; y)) \notin L^p(\mathbb{R}^2, dx)$ for all $p$ except $p = \infty$, the surface corresponding to (2.14) has integral curvature $+\infty$, as does any surface that is periodic or invariant along a fixed direction. To rule out translation invariant surfaces and those that are periodic along a fixed Cartesian direction of the Euclidean plane, we could impose the integrability condition $\int_{\mathbb{R}^2} \exp(2u(x))\,dx < \infty$. However, it suffices to impose the milder, and more natural, restriction that the surface's Gauss curvature is absolutely integrable; that is,
$$\int_{\mathbb{R}^2} |K(x)|\, e^{2u(x)}\,dx < \infty, \tag{2.15}$$
which reduces to $\int_{\mathbb{R}^2} \exp(2u(x))\,dx < \infty$ if $K = \text{const} > 0$.
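The two properties claimed for (2.14), namely that it solves the Liouville equation and that it is periodic along the second vector with period $2\pi/\sqrt{K_0}$, can be checked numerically. This is a sketch under the assumption that (1.2) reads $-\Delta u = K e^{2u}$; the names `v_perp` and `liouville_residual` and all sample values are hypothetical.

```python
import math

def U(x, y, v, v_perp, zeta):
    """Candidate solution (2.14); v and v_perp are orthogonal with |v| = |v_perp| = sqrt(K0)."""
    dx = (x[0] - y[0], x[1] - y[1])
    a = v[0] * dx[0] + v[1] * dx[1]            # <v, x - y>
    b = v_perp[0] * dx[0] + v_perp[1] * dx[1]  # <v_perp, x - y>
    return -math.log(math.cosh(zeta) * math.cosh(a) - math.sinh(zeta) * math.sin(b))

def liouville_residual(x, y, v, v_perp, zeta, K0, h=1e-4):
    """Five-point finite-difference value of -Laplace(U) - K0 e^{2U}."""
    u0 = U(x, y, v, v_perp, zeta)
    lap = (U((x[0] + h, x[1]), y, v, v_perp, zeta)
           + U((x[0] - h, x[1]), y, v, v_perp, zeta)
           + U((x[0], x[1] + h), y, v, v_perp, zeta)
           + U((x[0], x[1] - h), y, v, v_perp, zeta) - 4 * u0) / (h * h)
    return -lap - K0 * math.exp(2 * u0)

if __name__ == "__main__":
    K0 = 2.0
    s = math.sqrt(K0)
    v, v_perp = (s, 0.0), (0.0, s)      # illustrative orthogonal pair of length sqrt(K0)
    y, x, zeta = (0.1, -0.2), (0.4, 0.3), 0.8
    print(liouville_residual(x, y, v, v_perp, zeta, K0))
    period = 2 * math.pi / s            # shift along the unit vector in direction v_perp
    print(U(x, y, v, v_perp, zeta) - U((x[0], x[1] + period), y, v, v_perp, zeta))
```

Because the solution is periodic in one direction, $e^{2u}$ cannot be integrable over the plane, which is the reason condition (2.15) excludes these surfaces.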
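[Figure 2: contour plot; horizontal axis $x_1$, vertical axis $x_2$.]
Figure 2. Level curves $e^{2u} = 2^a$, $a \in \{-6, -5, \dots, 0\}$, with $u$ given by (2.14), with $\zeta = 1$, $y = -v^\perp$, $K_0 = 1$, $x_1 = \langle x, v^\perp\rangle$, and $x_2 = \langle x, v\rangle$; $\max e^{2u} \approx 1.22$ is taken at the centers of the islands. For $|\langle v, x\rangle|$ large, $e^{2u(x)} \sim Ce^{-|\langle v, x\rangle|}$ and level curves become straight lines.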
We summarize the various conditions on $S_g$ as follows.

Definition 2.3. For each $K \in C^{0,\alpha}(\mathbb{R}^2)$ satisfying (2.12), we denote by $S_K$ the set of classical surfaces $S_g$ with Gauss curvature $K$ being absolutely integrable, (2.15), and with metric (1.1) satisfying (2.8).

Notice that there exist $K$ for which the set $S_K$ is empty. Thus, since $K$ satisfies (2.12), no entire solutions of (1.2) exist if $K < 0$ everywhere (see [52]). In particular, entire solutions in $\mathbb{R}^2$ with $K = \text{const} < 0$ do not exist (see [1], [52], and [61]). Moreover, if $K(x) \sim -C|x|^p$ for $p \ge 2$ (irrespective of whether $K(x) \le 0$ for $|x| < R$ or not), then it follows from an easy application of Pokhozaev's identity that $S_K$ is empty. On the other hand, if $K \ge 0$ everywhere, then there are plenty of radially symmetric surfaces in $S_K$, which follows from Theorem 2.1 with $H \equiv \text{const}$. Furthermore, we note that $S_K$ is not empty for certain radial $K$ that are unbounded below, by the following proposition.

Proposition 2.4. There exist continuous $K(x)$ satisfying (2.12) and $K(x) \sim -C\ln|x|$ as $|x| \to \infty$ for which $S_K$ contains radial surfaces with finite positive integral curvature.

The proof of Proposition 2.4, which uses ideas from scattering theory similar to those in [39], together with gradient flow techniques, is of independent interest and will be published elsewhere. All known examples of surfaces in $S_K$ are radially symmetric, and we could not conceive of any counterexample to radial symmetry. Hence, we conjecture that all surfaces in $S_K$ are radially symmetric. More precisely, our conjecture reads as follows.
Conjecture 2.5. Any classical surface $S_g \in S_K$ is equipped with a radially symmetric nonexpansive metric, in the sense that the conformal factor $e^{2u}$ is radially symmetric and decreasing about some point.

Presumably, Conjecture 2.5 can even be widened to include certain $K$ that are not everywhere decreasing (see [12] and [54] for examples). However, it is currently not clear how to prove even Conjecture 2.5 without some additional technical conditions. In the ensuing sections we will first state and then prove radial symmetry theorems for $S_K$ under conditions that are weaker than those used in previous theorems, yet slightly stronger than those stated in Conjecture 2.5. In the next section we state precisely our main symmetry results, assess the territory covered by them, and also compare them to existing results.

3. Symmetry theorems for radially decreasing K. To state our new symmetry results for $K \in C^{0,\alpha}(\mathbb{R}^2)$ satisfying (2.12), we define
$$\kappa_*(K) = \pi \inf\Bigl\{ q > 0 : \int_{\mathbb{R}^2} |K(x)|\,(1 + |x|)^{-q}\,dx < \infty \Bigr\}. \tag{3.1}$$
The significance of $\kappa_*(K)$ is that of an explicit lower bound to the integral curvature.

Proposition 3.1. Let $K \in C^{0,\alpha}(\mathbb{R}^2)$ satisfy (2.12). If $K$ is unbounded below, then let $K$ also satisfy one of the following two conditions, either (1) there exists some $C > 0$ such that
$$|K(x)| \le C \inf_{y \in B_1(x)} |K(y)| \quad \text{as } |x| \to \infty, \tag{3.2}$$
uniformly in $x$ (this condition is satisfied, e.g., if $K \sim -C|x|^{\ell}$, any $\ell > 0$); or (2) there exist some finite $P \ge 1$ and $C > 0$ such that
$$|K(x)| \le C\,(\ln|x|)^P \quad \text{as } |x| \to \infty. \tag{3.3}$$
Let $K$ be the Gauss curvature function for a surface $S_g \in S_K$. Then the integral curvature of $S_g$ is bounded below by
$$\int_{\mathbb{R}^2} K(x)\, e^{2u(x)}\,dx \ge \kappa_*(K). \tag{3.4}$$
We now state two theorems on radial symmetry of surfaces in $S_K$, distinguishing the case in which the integral curvature strictly exceeds $\kappa_*(K)$ from the case in which it equals $\kappa_*(K)$. Theorem 3.2 verifies Conjecture 2.5, under the hypotheses of Proposition 3.1, for all integral curvatures strictly greater than $\kappa_*(K)$. By Proposition 3.1, this covers the spectrum of potential integral curvature values all the way down to its lower bound (3.4), but not including it. This signals that the borderline case of equality in (3.4) is critical. The critical case is dealt with in Theorem 3.3, where we assert the radial symmetry and decrease of $u$ under an additional hypothesis that is mildly stronger than (2.15).
Theorem 3.2 (the sub-critical case). Under the assumptions stated in Proposition 3.1, all surfaces $S_g \in S_K$ with integral curvature strictly greater than $\kappa_*(K)$ are equipped with a radially symmetric, nonexpansive metric (1.1); that is, there exists a point $x^* \in \mathbb{R}^2$ such that $u$ in (1.1) is radially symmetric and decreasing about $x^*$,
$$u(x - x^*) \le u(y - x^*) \quad \text{whenever } |x - x^*| \ge |y - x^*|. \tag{3.5}$$
Moreover, if $K \not\equiv \text{const}$, then $x^* = 0$, and if $K \equiv \text{const}$, then $x^*$ is arbitrary.

Theorem 3.3 (the critical case). Under the assumptions stated in Proposition 3.1, a surface $S_g \in S_K$ whose integral curvature equals $\kappa_*(K)$ is equipped with a radially symmetric, nonexpansive metric (1.1) (in the sense of (3.5)) provided
$$\int_{\mathbb{R}^2} (\ln|x|)^2\, |K(x)|\, e^{2u(x)}\,dx < \infty. \tag{3.6}$$
In that case, if $K \not\equiv 0$, then $x^* = 0$, and if $K \equiv 0$, then $x^*$ is arbitrary.

With reference to Conjecture 2.5, the foremost question now is how much of $S_K$ is actually covered by Theorems 3.2 and 3.3, and how much remains uncharted territory. A priori, Theorems 3.2 and 3.3 leave us anywhere in between the following extreme scenarios. In the best conceivable case, all surfaces with critical integral curvature satisfy (3.6), and then Theorems 3.2 and 3.3 taken together would prove Conjecture 2.5 completely. In the worst conceivable case, all surfaces have critical integral curvature, and none satisfies (3.6); thus, Theorems 3.2 and 3.3 would be empty. To assess the situation, we need to address the question whether for any $K$ there exists a critical surface $S_g$ such that inequality (3.4) is an equality, and if so, whether any such critical $S_g$ satisfies (3.6). Notice that (3.6) is needed only for those $K$ for which there exists a critical surface, that is, a surface for which (3.4) is an equality.

Inequality (3.4) is certainly an equality in the trivial case $K \equiv 0$, where the integral curvature vanishes and $\kappa_*(0) = 0$. Of course, (3.6) is trivially satisfied when $K \equiv 0$; hence this case is covered by Theorem 3.3. If $K \not\equiv 0$ decreases to zero at least as fast as $C|x|^{-2-\epsilon}$, possibly having compact support, then the integral curvature is strictly positive, by (1.3), while $\kappa_*(K) = 0$. Obviously inequality (3.4) is strict in these cases; hence Theorem 3.2 covers all possible surfaces for each such $K$. We remark that by Theorem 2.1, with $H \equiv \text{const}$, it follows for such decreasing $K$ that surfaces do exist for all integral curvature values in the open interval $(0, 4\pi)$. Together with the strict positivity of the integral curvature, this implies for these $K$ that $\kappa_*(K) = 0$ is the infimum of the set of integral curvatures for surfaces $S_g \in S_K$. For Gauss curvature functions $K = K_0 > 0$, with $K_0$ a constant, we have $\kappa_*(K_0) = 2\pi$, while the integral curvature equals $4\pi$ for all solutions $u$ of (1.2), (2.8), (2.12), and (2.15) (see [13] and [15]).
Not only is inequality (3.4) strict in these cases, but κ∗ (K0 ) is not even the best constant in the sense of an optimal lower bound to the integral curvature. Clearly, the cases K = K0 > 0, with K0 a constant, are entirely covered by Theorem 3.2.
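For curvature functions with a pure power-law tail, the infimum in (3.1) can be written in closed form, which reproduces the values of $\kappa_*$ quoted above. The sketch below rests on an assumption that is not part of the paper's general setting: $|K(x)| \sim C|x|^{-p}$ as $|x| \to \infty$ with $C > 0$ and $K$ locally bounded (negative $p$ models polynomial growth of $|K|$). The function name is hypothetical.

```python
import math

def kappa_star_power_tail(p):
    """kappa_*(K) from (3.1), ASSUMING |K(x)| ~ C |x|^(-p) at infinity and K locally bounded.

    In the plane, |K(x)| (1 + |x|)^(-q) ~ |x|^(-p-q) is integrable at infinity
    exactly when p + q > 2, so the infimum over q > 0 is max(2 - p, 0).
    """
    return math.pi * max(2.0 - p, 0.0)

if __name__ == "__main__":
    print(kappa_star_power_tail(0.0) / math.pi)   # constant K0 > 0: kappa_* = 2*pi
    print(kappa_star_power_tail(2.5) / math.pi)   # decay faster than |x|^(-2): kappa_* = 0
    print(kappa_star_power_tail(-1.0) / math.pi)  # |K| growing like |x|: kappa_* = 3*pi
```

Under this assumption, constant positive curvature gives $2\pi$ and any decay faster than $|x|^{-2}$ gives $0$, in agreement with the two cases discussed in the text.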
The situation seems less clear when, as $|x| \to \infty$, $K$ behaves like $C|x|^{-p}$ or like $-C|x|^{p}$, with $p < 2$. In these cases, explicit existence statements for surfaces in $S_K$ with critical integral curvature (equality in (3.4)) are currently not available. We remark that surfaces with critical integral curvature do exist when $K \le 0$ and $K(x) \sim -|x|^{-\ell}$ as $|x| \to \infty$, with $\ell > 2$. While these surfaces are radially symmetric by a uniqueness argument, it is nevertheless quite interesting to register that they do not satisfy (3.6)! The metric (1.1) of these surfaces is equipped with a conformal factor $e^{2U}$, where $U$ is the maximal solution of Cheng and Ni (see [21, Theorem II, p. 723]). Cheng and Ni's result signals the possible existence of surfaces with critical curvature in $S_K$ to which our Theorem 3.3 does not apply. We summarize this state of affairs with the following list of interesting open questions.

Open Problems 3.4. Do there exist radially decreasing $K \not\equiv 0$ for which there exist solutions of (1.2), (2.8), and (2.12) whose integral curvature equals $\kappa_*(K)$? If the answer to the previous question is positive, is (3.6) a genuine condition, in the sense that there exist surfaces in $S_K$ violating (3.6)? In case the answer to that question is also positive, is Conjecture 2.5 false for some of these surfaces?

Incidentally, the above discussion also points to a related open question, which, though less directly relevant to our inquiry into radial symmetry, is an interesting problem in itself. To this end, we introduce the notion of a least integrally curved surface in $S_K$, and, with an eye toward the discussion above, also the notion of when such a surface is critical.

Definition 3.5. A surface $S_g \in S_K$ is called least integrally curved if its integral curvature equals $\kappa(K)$, where $\kappa(K)$ is defined as the infimum of the set of integral curvatures for which there exists a surface $S_g \in S_K$, given $K$. A least integrally curved surface is called critical if $\kappa(K) = \kappa_*(K)$.

Open Problems 3.6. Find and classify all $K$ for which there exists a least integrally curved surface in $S_K$; with reference to Problems 3.4, determine which of those surfaces are critical!

We now return to the question of radial symmetry and to our strategy of proof for Theorems 3.2 and 3.3. We use the technique of the moving planes (see [30] and [43]), adapted to the setting in two-dimensional Euclidean space (where it is more proper to speak of moving lines) so that it is possible to move in the lines from “spatial infinity.” Because of the logarithmic divergence of solutions at infinity, this part is more delicate than in higher dimensions, in particular when $K > 0$. Various authors have previously applied this method to the problem under consideration here. Hence, before we enter the details of our proof, we briefly explain the way in which Theorems 3.2 and 3.3 go beyond existing results.

Radial symmetry of surfaces with strictly positive, constant Gauss curvature function (2.13) and finite integral curvature (2.15) was proven by Chen and Li [15]. In
[15], a radial “comparison function” was invented that made it possible to overcome the “problem at infinity.” In this case, the result allows one to compute explicitly all surfaces, which are given by (2.10) with $n = 1$. This result was also obtained, with two different alternate methods, in [23] and [13]. In [12], the method of [15] was extended to a wider class of surfaces with monotone decreasing, bounded Gauss curvature functions, given certain integrability conditions. The following was proven in [12].

Theorem 3.7. Let $K$ be the bounded Gauss curvature function of a classical surface $S_g$, with metric given by (1.1), and assume that (2.8), (2.15), and (2.12) are satisfied. Let $K^+$ denote the positive part of $K$. Then any surface $S_g$ whose integral curvature satisfies
$$\int_{\mathbb{R}^2} K(x)\, e^{2u(x)}\,dx > \pi\Bigl(3 + \limsup_{|x| \to \infty} \frac{\ln K^+(x)}{\ln|x|}\Bigr) \tag{3.7}$$
is radial; more precisely, there exists a point $x^* \in \mathbb{R}^2$ such that (3.5) holds.

Remark 3.8. The proof of Theorem 3.7 is contained in [12, proof of Theorem P1].

Clearly, Theorem 3.7 falls short of proving Conjecture 2.5 because $K$ is assumed bounded in Theorem 3.7 and, furthermore, because there exist surfaces with radial decreasing and bounded Gauss curvature function whose integral curvatures violate (3.7). For instance, consider the special case of (2.3), where $K > 0$ satisfies the growth condition
$$\lim_{|x| \to \infty} \frac{\ln K(x)}{\ln|x|} = -m < -2. \tag{3.8}$$
Theorem 3.7 asserts the radial symmetry of surfaces with integral curvature greater than $\pi(3 - m)_+$ ((3.7) with “lim sup” now “lim”). Surfaces with integral curvature in the interval $(0, \pi(3 - m)_+]$, which, by Theorem 2.1, exist for $m \in (2, 3)$, are not covered by Theorem 3.7. On the other hand, by Proposition 3.1, $\kappa_*(K) = 0$ for $K > 0$ satisfying (2.12) and (2.3), while the integral curvature is strictly positive because $K > 0$. Hence, Theorem 3.2 applies and asserts the radial symmetry of all surfaces in $S_K$ with nonnegative radially decreasing Gauss curvature functions $K$ satisfying (2.3), including, as a special case, the $K$ that satisfy (3.8). Closer inspection of the proof of Theorem 3.7 (see Remark 3.8) reveals that the origin of the $3\pi$ in (3.7), versus the $2\pi$ that is required to cover all surfaces for the $K$ satisfying (3.8), traces back to our using the comparison function of [15]. That comparison function, while well-suited for constant and for certain monotonically decreasing Gauss curvature functions $K$, does not suit radial decreasing $K$ in general. One main technical innovation of the present paper is the systematic construction of a new, radial comparison function that proves itself nearly optimal for handling the
problem at infinity. We also obtain better control of solutions $u$ of (1.2) near infinity, which allows us to forgo some technical contraptions used in [12].

Other, heuristic, comparison functions have been explored in the literature. Chen and Li [16] use a translation invariant comparison function rather than a radial one, and they require the stronger condition that $e^{2u} \in L^1(\mathbb{R}^2)$, thereby restricting integral curvatures to values greater than $2\pi$, to prove that all corresponding surfaces with strictly positive, radially symmetric decreasing $K$ are given by radially symmetric and decreasing solutions $u$ of (1.2). This result of [16] is contained in our Theorems 3.2 and 3.3. Furthermore, because of its stronger conditions on $u$, it intersects with, but does not subsume, [12, Theorem 3.11]. For example, consider the Gauss curvature function $K(x) = K_\gamma(x)$, with
$$K_\gamma(x) = 4\gamma \exp\bigl(2(1 - \gamma)\,U_0^{(1)}(x; y)\bigr), \tag{3.9}$$
where $U_0^{(1)}(x; y)$ is the special case $\zeta = 0$ and $n = 1$ in (2.10), with $y \ne 0$ arbitrary, and $0 < \gamma \le 1$. All $K_\gamma$ are radially decreasing, and we have $K_\gamma(x) \sim C|x|^{-4(1-\gamma)}$. Clearly,
$$u(x) = \gamma\, U_0^{(1)}(x; y) \tag{3.10}$$
is a radial, decreasing solution of (1.2) for $K$ given by (3.9). A classical radial surface described by (3.10) has integral curvature
$$\int_{\mathbb{R}^2} K_\gamma(x)\, e^{2\gamma U_0^{(1)}(x; y)}\,dx = 4\pi\gamma \in (0, 4\pi], \tag{3.11}$$
independently of $y$. When $\gamma \le 1/2$, our examples (3.10) violate Chen and Li's condition that $e^{2u} \in L^1$. Nevertheless, for $K$ given by (3.9), solutions of (1.2) that satisfy (2.8) and (2.15) also satisfy condition (3.7) in Theorem 3.7, irrespective of $\gamma$; hence radial symmetry follows by Theorem 3.7 (cf. also [12, Theorem V2]). Incidentally, $\kappa_*(K_\gamma) = 2\pi(2\gamma - 1)_+ < 4\pi\gamma$, and so none of these surfaces is critical. Hence, the radial symmetry of these surfaces follows by Theorem 3.2 as well.

Finally, a nonsymmetric comparison function (a sum of a radial and a translation invariant function) is used in [17] to prove the radial symmetry of surfaces with radial decreasing Gauss curvature function $K$, having finite integral curvature, under stronger conditions on $K$ than in Theorems 3.2 and 3.3, namely, that $K$ be strictly positive and decay slower than exponentially.

This concludes our discussion of the radial symmetry theorems. The next three sections are devoted to the proof of Theorems 3.2 and 3.3. In Section 8 we prove Theorem 2.1, and in Section 9 we prove Theorem 2.2.
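The value $4\pi\gamma$ in (3.11) and the strict inequality $\kappa_*(K_\gamma) < 4\pi\gamma$ can be confirmed by a short quadrature. The sketch uses the identity $K_\gamma e^{2\gamma U_0^{(1)}} = 4\gamma\, e^{2U_0^{(1)}} = 4\gamma |y|^2/(|y|^2 + |x|^2)^2$, which follows from (3.9) and (2.10) with $\zeta = 0$, $n = 1$; the function names, the substitution $r = |y|\tan t$, and the grid size are illustrative choices.

```python
import math

def integral_curvature(gamma, y_norm=1.0, n=20000):
    """Midpoint-rule approximation of the left side of (3.11) in polar coordinates."""
    total = 0.0
    dt = (math.pi / 2) / n
    for i in range(n):
        t = (i + 0.5) * dt
        r = y_norm * math.tan(t)                    # r = |y| tan t maps [0, inf) to [0, pi/2)
        density = 4 * gamma * y_norm ** 2 / (y_norm ** 2 + r * r) ** 2
        dr_dt = y_norm / math.cos(t) ** 2
        total += 2 * math.pi * r * density * dr_dt * dt
    return total

def kappa_star_gamma(gamma):
    """kappa_*(K_gamma) = 2*pi*(2*gamma - 1)_+ as quoted in the text."""
    return 2 * math.pi * max(2 * gamma - 1, 0.0)

if __name__ == "__main__":
    for gamma in (0.25, 0.5, 0.75, 1.0):
        # ratio to 4*pi*gamma should be close to 1; second column is kappa_* / (4*pi*gamma)
        print(gamma, integral_curvature(gamma) / (4 * math.pi * gamma),
              kappa_star_gamma(gamma) / (4 * math.pi * gamma))
```

The second ratio stays below one for every $\gamma \in (0, 1]$, which is the numerical counterpart of the statement that none of these surfaces is critical.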
4. Asymptotics. To prepare the proofs of our theorems, we need to gather some facts about the asymptotic behavior of the solutions $u$ of (1.2). In the following lemma, $X = x/|x|^2$ denotes the Kelvin transform of $x$ (and likewise $Y = y/|y|^2$).

Lemma 4.1. Let $u$ be a classical solution of (1.2) satisfying (2.8). Assume that (2.15) holds and that $K$ satisfies (2.12). Then $u$ satisfies the integral equation
$$u(x) - u(0) = -\frac{1}{2\pi}\Bigl(\int_{\mathbb{R}^2} K(y)\, e^{2u(y)}\,dy\Bigr)\ln|x| - \frac{1}{2\pi}\int_{\mathbb{R}^2} \ln|X - Y|\, K(y)\, e^{2u(y)}\,dy \tag{4.1}$$
for all $x$.

Proof. By hypothesis, $K$ is monotone decreasing. We distinguish the cases with $K \ge 0$ from those where $K < 0$ for $|x| > R$. In case $K$ becomes negative somewhere, say, for $|x| > R$, then outside the disk $B_R(0)$ the function $u$ is subharmonic, and so is $u_+$. Hence, for concentric disks $B_{1/2}(y)$ and $B_1(y)$, we have
$$\|u_+\|_{L^\infty(B_{1/2}(y))} \le C\,\|u_+\|_{L^1(B_1(y))} \tag{4.2}$$
for some constant $C$ that is independent of $y$. Our hypothesis (2.8) guarantees that the right-hand side in (4.2) is bounded by a constant; hence we have a uniform $L^\infty$ bound for $u_+$ outside a disk, and this implies a uniform $L^\infty$ bound for $u_+$ in all $\mathbb{R}^2$. In case $K \ge 0$, since $K$ is decreasing and we are assuming that $u$ is a classical solution so that $K$ is continuous, we automatically have $K \in L^\infty$. Then, by examining Brezis and Merle [7, Theorem 2] (see also [14]), we again conclude that $u_+$ is uniformly bounded above. With $u_+ \in L^\infty$, we now proceed, as in the proof of [12, Lemma 1, p. 224], to get
$$u(x) = u(0) - \frac{1}{2\pi}\int_{\mathbb{R}^2} \bigl(\ln|x - y| - \ln|y|\bigr)\, K(y)\, e^{2u(y)}\,dy. \tag{4.3}$$
Pulling out the contribution $\propto \ln|x|$ from the integral, noting that
$$\ln\frac{|x - y|}{|x||y|} = \ln\Bigl|\frac{x}{|x|^2} - \frac{y}{|y|^2}\Bigr|, \tag{4.4}$$
and recalling the definition of the Kelvin transform, gives us (4.1).

Proof of Proposition 3.1. Let $|x| \ge 4$. We define, for given $x$, the set
$$D_x = \Bigl\{ y : \frac{|x|}{2} \le |y| \le 2|x| \text{ and } |x - y| \le 4 \Bigr\} \tag{4.5}$$
and split $\mathbb{R}^2$ accordingly into $\mathbb{R}^2 = D_x \cup D_x^C$, where $D_x^C$ is the complement of $D_x$ in $\mathbb{R}^2$. Moreover, for $y \in D_x^C$, we use the decomposition $D_x^C = E_x \cup F_x \cup G_x$, with
$$E_x = \{ y : 2|y| \le |x| \}, \tag{4.6}$$
$$F_x = \{ y : |y| \ge 2|x| \}, \tag{4.7}$$
$$G_x = \{ y : |y| \le 2|x| \le 4|y| \text{ and } |x - y| \ge 4 \}. \tag{4.8}$$
Recall (4.4). Let $I_\Lambda$ denote the indicator function of the set $\Lambda$. It is now readily verified that, with positive generic constants $C$,
$$\ln|X - Y| = \ln\frac{|x - y|}{|x||y|} \le \begin{cases} C\ln|x| + C\ln|x - y|, & y \in D_x, \\ C + C\ln|y|\, I_{E_x} + C\ln|x|\, I_{F_x \cup G_x}, & y \in D_x^C. \end{cases} \tag{4.9}$$
In each of these regions, the corresponding inequality in (4.9) follows by an application of the triangle inequality, paying attention to the a priori bounds on $x$, $y$, and $x - y$. Thus, with positive generic constants $C$,
$$\begin{aligned} \frac{1}{\ln|x|}\int_{\mathbb{R}^2} \ln\frac{|x - y|}{|x||y|}\, |K(y)|\, e^{2u(y)}\,dy \le{} & \frac{C}{\ln|x|}\int_{|y| \le 1} \ln|y|\,dy + C\int_{1 \le |y| \le |x|/2} \frac{\ln|y|}{\ln|x|}\, |K(y)|\, e^{2u(y)}\,dy \\ & + C\int_{|y| \ge 2|x|} |K(y)|\, e^{2u(y)}\,dy + C\int_{|y - x| \le 4} \frac{\ln|x - y|}{\ln|x|}\, |K(y)|\, e^{2u(y)}\,dy. \end{aligned} \tag{4.10}$$
The first term on the right obviously goes to zero as $|x| \to \infty$. The second integral on the right goes to zero as $|x| \to \infty$, by the dominated convergence theorem and because $Ke^{2u} \in L^1(\mathbb{R}^2)$. The third integral on the right goes to zero as $|x| \to \infty$ because $Ke^{2u} \in L^1(\mathbb{R}^2)$. For the fourth integral on the right, we need to distinguish two cases, (i) $K \in L^\infty$ and (ii) $K \notin L^\infty$. As for case (i), since $u_+ \in L^\infty$, we have $Ke^{2u} \in L^\infty$, and so
$$\frac{1}{\ln|x|}\int_{|y - x| \le 4} \ln|x - y|\, |K(y)|\, e^{2u(y)}\,dy \le \frac{C}{\ln|x|}\int_{|y - x| \le 4} \ln|x - y|\,dy \le \frac{C}{\ln|x|} \to 0 \quad \text{as } |x| \to \infty. \tag{4.11}$$
As for case (ii), since then $K(x) < 0$ for $|x| > R$, we have
$$-\Delta u(x) = K(x)\, e^{2u(x)} \le 0 \quad \text{for } |x| > R; \tag{4.12}$$
hence $u(x)$ is subharmonic for $|x| > R$. Thus, for $|x_0| \ge R + 1$, we have
$$u(x_0) \le \frac{1}{\pi}\int_{B_1(x_0)} u(y)\,dy. \tag{4.13}$$
By Jensen's inequality [34],
$$e^{2u(x_0)} \le \frac{1}{\pi}\int_{B_1(x_0)} e^{2u(y)}\,dy; \tag{4.14}$$
hence
$$|K(x_0)|\, e^{2u(x_0)} \le \frac{1}{\pi}\int_{B_1(x_0)} |K(x_0)|\, e^{2u(y)}\,dy. \tag{4.15}$$
Now, by hypothesis, either (3.2) or (3.3) holds. If (3.2) holds, then $|K(x_0)| \le C|K(y)|$ for all $y$ in $B_1(x_0)$; hence
$$\int_{B_1(x_0)} |K(x_0)|\, e^{2u(y)}\,dy \le C\int_{B_1(x_0)} |K(y)|\, e^{2u(y)}\,dy \le C, \tag{4.16}$$
where the second estimate holds by (2.15). It follows once again that $Ke^{2u} \in L^\infty$, and so we are back to (4.11). If (3.3) holds, then, writing $|K| = |K|^{1/p}|K|^{1/q}$ with $p = P$, $1/p + 1/q = 1$, we have, by Hölder's inequality [34],
$$\int_{|y - x| \le 4} \ln|x - y|\, |K(y)|\, e^{2u(y)}\,dy \le \Bigl(\int_{|y - x| \le 4} |K(y)|\, e^{2qu(y)}\,dy\Bigr)^{1/q} \Bigl(\int_{|y - x| \le 4} \bigl|\ln|x - y|\bigr|^p\, |K(y)|\,dy\Bigr)^{1/p}. \tag{4.17}$$
Since $u_+ \in L^\infty$, and since (3.3) holds, we now have
$$\frac{1}{\ln|x|}\int_{|y - x| \le 4} \ln|x - y|\, |K(y)|\, e^{2u(y)}\,dy \le C\Bigl(\int_{|y - x| \le 4} |K(y)|\, e^{2u(y)}\,dy\Bigr)^{1/q} \Bigl(\int_{|y - x| \le 4} \bigl|\ln|x - y|\bigr|^p\,dy\Bigr)^{1/p} \to 0 \quad \text{as } |x| \to \infty, \tag{4.18}$$
and this completes the estimates on the fourth integral in (4.10). In total, by Lemma 4.1 and our estimates on the last integral in (4.1), we conclude that for any $\epsilon > 0$ there exist $C(\epsilon)$ and $R(\epsilon)$ such that
$$e^{2u(x)} \le C\,|x|^{-\frac{1}{\pi}\int_{\mathbb{R}^2} K e^{2u}\,dy + \epsilon} \quad \text{for } |x| > R(\epsilon). \tag{4.19}$$
Recalling now the definition of $\kappa_*$, Proposition 3.1 follows.
Lemma 4.2. Let $u$ be a classical solution of (1.2) satisfying (2.8). Assume that (2.15) holds and that $K$ satisfies (2.12). Moreover, if $K$ is unbounded below, assume that either (3.2) or (3.3) holds. Finally, if the integral curvature equals $\kappa_*(K)$, let (3.6) be satisfied. Then, uniformly in $x$,
$$\lim_{|x| \to \infty}\Bigl[u(x) - u(0) + \frac{1}{2\pi}\Bigl(\int_{\mathbb{R}^2} K(y)\, e^{2u(y)}\,dy\Bigr)\ln|x|\Bigr] = \frac{1}{2\pi}\int_{\mathbb{R}^2} \ln|y|\, K(y)\, e^{2u(y)}\,dy. \tag{4.20}$$

Proof. By Proposition 3.1, the integral curvature is at least $\kappa_*(K)$. If it equals $\kappa_*(K)$, then (3.6) is satisfied, by hypothesis, and this implies that $\int_{\mathbb{R}^2} \ln|y|\, K(y)\, e^{2u(y)}\,dy$ exists. If it is strictly greater than $\kappa_*(K)$, then, by (4.19) and the definition of $\kappa_*(K)$, the existence of $\int_{\mathbb{R}^2} \ln|y|\, K(y)\, e^{2u(y)}\,dy$ follows once again. By inspecting the estimates of the proof of Proposition 3.1, we now conclude, once again by dominated convergence, that
$$\lim_{X \to 0}\int_{\mathbb{R}^2} \ln|X - Y|\, K(y)\, e^{2u(y)}\,dy = -\int_{\mathbb{R}^2} \ln|y|\, K(y)\, e^{2u(y)}\,dy. \tag{4.21}$$
Lemma 4.2 follows.

5. Global results. With the help of Lemma 4.2, and noting (2.12) and (2.15), we now see that the asymptotic behavior of $u$ implies that the integral curvature of $S_g \in S_K$ is strictly positive if $K(x) < 0$ for $|x| > R$. In addition, it follows trivially from the definition of the integral curvature that it is nonnegative if $K \ge 0$, with equality holding if and only if $K \equiv 0$. We summarize this as the following lemma.

Lemma 5.1. Let $u$ be a classical solution of (1.2) satisfying (2.8). Assume (2.15) holds. In addition, assume that $K$ satisfies (2.12). If the integral curvature equals $\kappa_*(K)$, let (3.6) be satisfied. Then the integral curvature of $S_g$ is nonnegative,
$$\int_{\mathbb{R}^2} K(x)\, e^{2u(x)}\,dx \ge 0, \tag{5.1}$$
with “=” holding if and only if $K \equiv 0$.

We also need an angular average of $u$. In the following, we set $r = |x|$, and we identify points in $\mathbb{R}^2$ with points in $\mathbb{C}$. We define the radial function
$$\bar{u}(r) = \frac{1}{2\pi}\int_0^{2\pi} u\bigl(re^{i\theta}\bigr)\,d\theta, \tag{5.2}$$
which is well defined for all $r \ge 0$ because $u$ is a classical solution. Similarly, we define $\bar{K}(r)$. Notice that $\bar{K}(|x|) = K(x)$.

Lemma 5.2. Let $u$ be a classical solution of (1.2) satisfying (2.8) and (2.15), with $K$ satisfying (2.12). If the integral curvature equals $\kappa_*(K)$, let (3.6) be satisfied. Let $\bar{u}$ be defined by (5.2). Then there exists a positive constant $c(u) < \infty$ such that
$$u(x) - \bar{u}(|x|) \le c(u) \tag{5.3}$$
for all $x$, and $c(u)$ is the smallest such $c$.
Proof. For $|x| \le R$, the statement is trivial, since $u$ is a classical solution. For $|x| > R$, the statement follows from Lemma 4.2.

Lemma 5.3. Let $u$ be a classical solution of (1.2) satisfying (2.8) and (2.15), with $K$ satisfying (2.12). Let $\bar{u}$ be defined by (5.2). Then we have
$$\int_{\mathbb{R}^2} |K(x)|\, e^{2\bar{u}(|x|)}\,dx < \infty. \tag{5.4}$$
If (3.6) holds, then we also have
$$\int_{\mathbb{R}^2} (\ln|x|)^2\, |K(x)|\, e^{2\bar{u}(|x|)}\,dx < \infty. \tag{5.5}$$

Proof. By Jensen's inequality,
$$e^{2\bar{u}(r)} \le \frac{1}{2\pi}\int_0^{2\pi} e^{2u(re^{i\theta})}\,d\theta. \tag{5.6}$$
Upon multiplying (5.6) by $2\pi r|\bar{K}(r)|$ and then integrating over $r$, we get
$$\int_0^{2\pi}\!\!\int_0^\infty |\bar{K}(r)|\, e^{2\bar{u}(r)}\, r\,dr\,d\theta \le \int_{\mathbb{R}^2} |K(x)|\, e^{2u(x)}\,dx, \tag{5.7}$$
which now shows that (5.4) holds because of (2.15). Similarly, if (3.6) holds, then we can multiply (5.6) by $2\pi r(\ln r)^2|\bar{K}(r)|$ and subsequently integrate the result over $r$ to get
$$\int_0^{2\pi}\!\!\int_0^\infty |\bar{K}(r)|\, e^{2\bar{u}(r)}\, (\ln r)^2\, r\,dr\,d\theta \le \int_{\mathbb{R}^2} |K(x)|\, e^{2u(x)}\, (\ln|x|)^2\,dx, \tag{5.8}$$
which shows that (5.5) now holds because of (3.6).

6. The comparison function. In this section, we construct a comparison function for $u$, a classical solution of (1.2) satisfying (2.8) and (2.15), with $K$ satisfying (2.12). In case the integral curvature equals $\kappa_*(K)$, we assume that (3.6) is satisfied. Recall that $\bar{u}$ is defined by (5.2). We first introduce a function $g : [0, \infty) \to \mathbb{R}$, given by
$$g(r) = r\int_r^\infty |\bar{K}(s)|\, e^{2\bar{u}(s)}\, s(\ln s)^2\,ds - r\ln r\int_r^\infty |\bar{K}(s)|\, e^{2\bar{u}(s)}\, s\ln s\,ds \tag{6.1}$$
if $r > 0$, while $g(0)$ is given by continuous extension to $r = 0$. Notice that $g$ is well defined for $r \ge 0$, for, by Lemma 5.3, the integrals are well defined for all $r \ge 0$, and $r\ln r$ has a removable singularity at $r = 0$.
Lemma 6.1. The function $g$ defined in (6.1) is the unique $C^2(\mathbb{R}^+)$ solution of the inhomogeneous Euler equation
$$r^2 g''(r) - r g'(r) + g(r) = |\bar{K}(r)|\, e^{2\bar{u}(r)}\, r^3\ln r, \tag{6.2}$$
under the asymptotic condition
$$g(r) = o(r) \quad \text{as } r \to \infty. \tag{6.3}$$
Furthermore, $g$ is eventually positive,
$$g(r) \ge 0 \quad \text{if } r > 1, \tag{6.4}$$
and $g$ vanishes at $r = 0$,
$$g(0) = 0. \tag{6.5}$$

Proof. Inserting (6.1) into (6.2), one verifies that (6.1) is a particular solution of (6.2). Moreover, since $|\bar{K}| \ge 0$ and $r < s$, when $r > 1$ we have the bounds $0 < (\ln r)(\ln s) < (\ln s)^2$, which imply
$$0 \le g(r) \le r\int_r^\infty |\bar{K}(s)|\, e^{2\bar{u}(s)}\, s(\ln s)^2\,ds \quad \text{for } r > 1. \tag{6.6}$$
The first inequality in (6.6) states positivity (6.4), and both together prove (6.3), for clearly
$$0 \le \lim_{r \to \infty}\frac{g(r)}{r} \le \lim_{r \to \infty}\int_r^\infty |\bar{K}(s)|\, e^{2\bar{u}(s)}\, s(\ln s)^2\,ds = 0, \tag{6.7}$$
the last step as a consequence of Lemma 5.3. Moreover, since $g(0)$ is defined by $g(0) = \lim_{r \to 0} g(r)$, (6.5) holds because of Lemma 5.3 and $r\ln r \to 0$ for $r \to 0$. The general solution of (6.2) is obtained by adding to this particular solution the general solution of the homogeneous problem, $Ar + Br\ln r$, with $A$, $B$ constants. By (6.3), we conclude that $A = B = 0$, and thus also uniqueness is shown.

Let $\alpha > 0$, and define $R(\alpha)$ as the smallest $R > 0$ such that $r - \alpha g(r) > e$ for all $r > R$. By (6.5), (6.3), and the continuity of $r \mapsto r - \alpha g(r)$, it follows that a positive $R(\alpha)$ exists and that $R(\alpha) - \alpha g(R(\alpha)) = e$. We now introduce the family of radial functions $f_\alpha : \mathbb{R}^2 \setminus B_{R(\alpha)} \to \mathbb{R}$, given by
$$f_\alpha(x) = \ln\bigl(|x| - \alpha g(|x|)\bigr). \tag{6.8}$$
Clearly, $f_\alpha(x) > 1$ for $|x| > R(\alpha)$, and $f_\alpha(x) = 1$ for $|x| = R(\alpha)$. We also introduce
$$\alpha^*(u) = 2e^{2c(u)}, \tag{6.9}$$
where $c(u)$ is defined in Lemma 5.2.
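The two facts used in the proof of Lemma 6.1, that (6.1) is a particular solution of the Euler equation (6.2) and that $Ar + Br\ln r$ solves the homogeneous equation, can be checked numerically. In the sketch below the weight $|\bar K(s)|e^{2\bar u(s)}$ is replaced by a hypothetical stand-in profile $e^{-s}$, the upper limit of integration is truncated at an assumed cutoff `S_MAX`, and derivatives are taken by central differences; none of these choices comes from the paper.

```python
import math

def simpson(f, a, b, n=4000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def phi(s):
    """Hypothetical stand-in for |K_bar(s)| e^{2 u_bar(s)}."""
    return math.exp(-s)

S_MAX = 60.0  # assumed cutoff; the tail beyond it is negligible for this profile

def g(r):
    """Formula (6.1) with the stand-in weight phi."""
    A = simpson(lambda s: phi(s) * s * math.log(s) ** 2, r, S_MAX)
    B = simpson(lambda s: phi(s) * s * math.log(s), r, S_MAX)
    return r * A - r * math.log(r) * B

def euler_lhs(func, r, h=1e-2):
    """Central-difference value of r^2 f'' - r f' + f, the left side of (6.2)."""
    f0 = func(r)
    d1 = (func(r + h) - func(r - h)) / (2 * h)
    d2 = (func(r + h) - 2 * f0 + func(r - h)) / (h * h)
    return r * r * d2 - r * d1 + f0

if __name__ == "__main__":
    r = 2.0
    print(euler_lhs(g, r), phi(r) * r ** 3 * math.log(r))            # the two values should agree closely
    print(euler_lhs(lambda t: 3 * t + 5 * t * math.log(t), r))       # homogeneous solution: near zero
```

The check also illustrates why the condition $g(r) = o(r)$ pins down the solution: any nonzero multiple of $r$ or $r\ln r$ added to $g$ would violate it.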
Lemma 6.2. Given $u$ and $\alpha > \alpha^*(u)$, the function $f_\alpha$ defined in (6.8) satisfies the partial differential inequality
$$\Delta f_\alpha(x) + 2K(x)\, e^{2u(x)} f_\alpha(x) < 0 \tag{6.10}$$
for all $x$ satisfying $|x| > \max\{1, R(\alpha)\}$.

Proof. In the following, $g'(r) = \partial_r g(r)$, and so on. Recall that $r = |x|$. By explicit calculation we find
$$\begin{aligned} \frac{\Delta f_\alpha(x)}{f_\alpha(x)} &= \frac{\alpha\bigl(-r^2 g''(r) + r g'(r) - g(r)\bigr) + \alpha^2\bigl(r g(r) g''(r) - r g'(r)^2 + g(r) g'(r)\bigr)}{r\bigl(r - \alpha g(r)\bigr)^2 \ln\bigl(r - \alpha g(r)\bigr)} \\ &= -\frac{\alpha\,|\bar{K}(r)|\, e^{2\bar{u}(r)}}{\bigl(1 - \alpha g(r)/r\bigr)\bigl(1 + \ln\bigl(1 - \alpha g(r)/r\bigr)/\ln r\bigr)} - \frac{\alpha^2\bigl(g(r) - r g'(r)\bigr)^2}{r^2\bigl(r - \alpha g(r)\bigr)^2 \ln\bigl(r - \alpha g(r)\bigr)} \\ &< -\alpha\,|K(x)|\, e^{2\bar{u}(|x|)} \quad \text{for } r > R(\alpha), \end{aligned} \tag{6.11}$$
the last step by the facts that $r > 1$ and $\alpha g(r) > 0$ for $r > 1$, and $r > R(\alpha)$ and $1 - \alpha g/r > 1/r$ for $r > R(\alpha)$. By (6.11), Lemma 5.2, and $\alpha > \alpha^*(u)$ defined in (6.9), we now have
$$\begin{aligned} \Delta f_\alpha(x) + 2K(x)\, e^{2u(x)} f_\alpha(x) &< -\bigl(\alpha\,|K(x)|\, e^{2\bar{u}(|x|)} - 2K(x)\, e^{2u(x)}\bigr) f_\alpha(|x|) \\ &\le -\bigl(\alpha\,|K(x)|\, e^{-2c(u)} - 2K(x)\bigr)\, e^{2u(x)} f_\alpha(|x|) \\ &\le 0 \end{aligned} \tag{6.12}$$
for all $x$ satisfying $|x| > \max\{1, R(\alpha)\}$.

7. Proof of symmetry Theorems 3.2 and 3.3. In the following, we always understand that $S_g \in S_K$, that $u$ is the associated solution of (1.2), and that (3.6) is assumed to be satisfied in case the integral curvature equals $\kappa_*(K)$. Moreover, if $K$ is unbounded below, it is also assumed that either (3.2) or (3.3) holds. By Lemma 4.2, $u(x) \to -\infty$ as $|x| \to \infty$. Therefore, and since $u$ is a classical solution, $u$ has a global maximum, say, at $x^*$. Since $K$ satisfies (2.12), if $x \mapsto u(x)$ solves (1.2), then so does $x \mapsto u(\mathcal{R}x)$ for any rotation $\mathcal{R} \in SO(2)$. Therefore, after at most a rotation, we can assume that our solution $u$ has a global maximum at the point $x^* = (-|x^*|, 0)$, with $|x^*| \ge 0$. We now introduce the family of straight lines
$$T_\lambda = \{ x \in \mathbb{R}^2 \mid x_1 = \lambda \} \tag{7.1}$$
and the half-plane “left of $T_\lambda$,”
$$\Sigma_\lambda = \{ x : x_1 < \lambda \}. \tag{7.2}$$
We denote the reflection of $x$ at $T_\lambda$ by
$$x^{(\lambda)} = (2\lambda - x_1, x_2). \tag{7.3}$$

Lemma 7.1. For $x_1 \le \lambda \le 0$, and in particular for $x \in \Sigma_\lambda$, with $\lambda \le 0$, we have
$$K(x) \le K\bigl(x^{(\lambda)}\bigr). \tag{7.4}$$

Proof. $K$ satisfies (2.12), and $|x^{(\lambda)}| \le |x|$ whenever $x_1 \le \lambda \le 0$.

We next introduce $u_\lambda(x) = u(x^{(\lambda)})$, and also
$$v_\lambda(x) = u_\lambda(x) - u(x). \tag{7.5}$$
Clearly, $v_\lambda$ is well defined on $\mathbb{R}^2$.

Lemma 7.2. For all $\lambda \in \mathbb{R}$, $v_\lambda$ vanishes on $T_\lambda$ and at infinity; that is,
$$\lim_{|x| \to \infty} v_\lambda(x) = 0 \tag{7.6}$$
uniformly in $|x|$.

Proof. Notice that on $T_\lambda$ we have $x^{(\lambda)} = x$; hence $v_\lambda(x) = 0$ for $x \in T_\lambda$. The vanishing of $v_\lambda$ at infinity is a consequence of Lemma 4.2.

We next resort to our comparison function $f_\alpha$. We pick any $\alpha > \alpha^*(u)$ and introduce the function $w_\lambda : \Sigma_\lambda \cup T_\lambda \to \mathbb{R}$, defined as
$$w_\lambda(x) = \begin{cases} \dfrac{v_\lambda(x)}{f_\alpha(x)}, & |x| \ge R(\alpha), \\[1ex] v_\lambda(x), & |x| \le R(\alpha). \end{cases} \tag{7.7}$$
Notice that $w_\lambda$ is twice continuously differentiable at all $x$ with $|x| \ne R(\alpha)$, and it is continuous as a function of $x \in \Sigma_\lambda \cup T_\lambda$, with any $\lambda$. It vanishes for $|x| \to \infty$ as well as for $x \in T_\lambda$. Therefore, if $w_\lambda(x) < 0$ for some $x \in \Sigma_\lambda$, then $w_\lambda$ will have a global negative minimum in $\Sigma_\lambda$. Lemma 7.3 allows us to initialize the moving planes argument and also to finalize it.

Lemma 7.3. For each $u$, there exists an $R(u) > 0$ such that, if $x_* \in \Sigma_\lambda$ is a minimum point for $w_\lambda$, and $w_\lambda(x_*) < 0$, then $|x_*| < R(u)$, independently of $\lambda$.

Proof. We begin by observing that, in the flat case $K \equiv 0$, $u = \text{const}$, and $v_\lambda \equiv 0$ for all $\lambda$, so that the claim is trivially true.
In the nonflat case where $K \not\equiv 0$, we prove Lemma 7.3 by contradiction. Thus, assume that no such $R(u)$ exists. Then, for any $R$, we can find a $\lambda \le 0$ such that $|x_*| > R$, where $x_* \in \Sigma_\lambda$ is a minimum point for $w_\lambda$ with $w_\lambda(x_*) < 0$. In particular, we may choose $R > \max\{1, R(\alpha)\}$. At such a minimum point $x_* \in \Sigma_\lambda$, we have $\nabla w_\lambda(x_*) = 0$ and $\Delta w_\lambda(x_*) \ge 0$, and, of course, $w_\lambda(x_*) < 0$. First, notice that the reflected function $u_\lambda$ satisfies the PDE
$$-\Delta u_\lambda(x) = K\bigl(x^{(\lambda)}\bigr)\, e^{2u_\lambda(x)}. \tag{7.8}$$
Taking the difference between (7.8) and (1.2), we get
$$-\Delta v_\lambda(x) = K\bigl(x^{(\lambda)}\bigr)\, e^{2u_\lambda(x)} - K(x)\, e^{2u(x)}. \tag{7.9}$$
By the mean value theorem, there exists a number $\psi_\lambda(x)$ between $u(x)$ and $u(x^{(\lambda)})$ such that
$$e^{2u_\lambda(x)} - e^{2u(x)} = 2v_\lambda(x)\, e^{2\psi_\lambda(x)}. \tag{7.10}$$
By (7.10) and Lemma 7.1, we see that $v_\lambda$ satisfies the partial differential inequality
$$\Delta v_\lambda(x) + 2K(x)\, e^{2\psi_\lambda(x)} v_\lambda(x) \le 0 \tag{7.11}$$
for all $x \in \Sigma_\lambda$. With the help of (7.11), we now easily find that $w_\lambda$ satisfies the partial differential inequality
$$\Delta w_\lambda(x) + 2\,\frac{\nabla f_\alpha(x)}{f_\alpha(x)} \cdot \nabla w_\lambda(x) + \Bigl(\frac{\Delta f_\alpha(x)}{f_\alpha(x)} + 2K(x)\, e^{2\psi_\lambda(x)}\Bigr) w_\lambda(x) \le 0 \tag{7.12}$$
for all $x \in \Sigma_\lambda$ for which $|x| > \max\{1, R(\alpha)\}$. Now, by assumption, $w_\lambda(x_*) < 0$, with $|x_*| > \max\{1, R(\alpha)\}$, and since $f_\alpha(x) > 1$ for $|x| > \max\{1, R(\alpha)\}$, we also have $v_\lambda(x_*) < 0$, and this means that $u_\lambda(x_*) < u(x_*)$. But then $\psi_\lambda(x_*) \le u(x_*)$. Making use of this and of $\nabla w_\lambda(x_*) = 0$, from (7.12) we now obtain the inequality
$$\Delta w_\lambda(x_*) + \Bigl(\frac{\Delta f_\alpha(x_*)}{f_\alpha(x_*)} + 2K(x_*)\, e^{2u(x_*)}\Bigr) w_\lambda(x_*) \le 0. \tag{7.13}$$
Using now Lemma 6.2, recalling that $\alpha > \alpha^*(u)$, in combination with $w_\lambda(x_*) < 0$, we see that (7.13) implies that $\Delta w_\lambda(x_*) < 0$. But this is a contradiction to $\Delta w_\lambda(x_*) \ge 0$. Hence, $w_\lambda$ has no strictly negative minimum outside the disk $B_{R(u)}$ with $R(u) = \max\{1, R(\alpha^*(u))\}$. This concludes the proof of Lemma 7.3.

Corollary 7.4. For each $u$, when $\lambda < -R(u)$, then $v_\lambda(x) \ge 0$ for $x \in \Sigma_\lambda$.

Proof. Assume $v_\lambda(x_*) < 0$ for some $x_* \in \Sigma_\lambda$, with $\lambda < -R(u)$. Then, since $w_\lambda = v_\lambda/f_\alpha$ for all $x \in \Sigma_\lambda$ with $\lambda < -R(u)$, and since $f_\alpha > 1$ for all $x \in \Sigma_\lambda$ with
$\lambda < -R(u)$, we conclude that $w_\lambda(x_*) < 0$ for $x_* \in \Sigma_\lambda$ with $\lambda < -R(u)$. But then, since $w_\lambda \to 0$ as $|x| \to \infty$ and $w_\lambda = 0$ on $T_\lambda$, we see that $w_\lambda$ attains a negative minimum for some $x_* \in \Sigma_\lambda$, with $\lambda < -R(u)$. This is a contradiction to Lemma 7.3.

Recall the maximum principle (MP) and the Hopf maximum principle (HMP) (see [31]).

MP: Let $\Delta v(x) + \sum_i b_i(x)\,\partial_{x_i} v(x) + c(x)\, v(x) \le 0$ in $\Omega \subset \mathbb{R}^n$, and let $v \ge 0$. If $v(\hat{x}) = 0$ for at least one $\hat{x} \in \mathrm{int}(\Omega)$, then $v \equiv 0$ in all of $\Omega$.

HMP: Under the same assumptions as in MP, if $v \not\equiv 0$ in $\Omega$ and $\partial\Omega$ is smooth with $v|_{\partial\Omega} \equiv 0$, then $\partial v/\partial\nu < 0$, where $\partial v/\partial\nu$ is the exterior normal derivative on $\partial\Omega$.

Notice that no sign condition is being imposed on $c(x)$ as the minimum of $v$ is zero.

We are now ready for the moving lines. The arguments in our ensuing proof of Theorems 3.2 and 3.3 are a straightforward modification of those in the proof of [12, Theorem P1]. For the convenience of the reader, we give the complete argument instead of listing where to modify the arguments of [12].

Proof of Theorems 3.2 and 3.3. By Corollary 7.4, $v_\lambda(x) \ge 0$ for $\lambda < -R(u)$, independently of $\lambda$. We now slide the line $T_\lambda$ to the right until we reach a critical value $\lambda_0$, which is the largest value of $\lambda$ for which $v_\lambda(x) \ge 0$, $x \in \Sigma_\lambda$.

Claim A. We have $v_\lambda(x) > 0$ for $x \in \Sigma_\lambda$ with $\lambda < \lambda_0$, and $\partial_{x_1} u > 0$ for $x_1 < \lambda_0$.

Claim B. We have $\lambda_0 = -|x^*|$.

Proof of Claim A. We begin by establishing the first assertion in Claim A. Suppose, for $\lambda < \lambda_0$, that $v_\lambda(x) = 0$ at some point $x \in \Sigma_\lambda$. Since $v_\lambda(x) \ge 0$ for $x \in \Sigma_\lambda$, if $v_\lambda(x) = 0$, the minimum of $v_\lambda(x)$ is achieved in $\Sigma_\lambda$. Since (7.11) holds and $v_\lambda(x) \ge 0$, we can apply the MP and deduce $v_\lambda(x) \equiv 0$ in $\Sigma_\lambda$. This means for $\lambda = \lambda_0 - \delta$, some $\delta > 0$, that $u(\lambda_0 - 2\delta, x_2) = u(\lambda_0, x_2)$. But $v_\lambda(x) \ge 0$; thus $u(x^{(\lambda)}) \ge u(x)$, which implies $\partial_{x_1} u \ge 0$ for $x_1 \le \lambda_0$. This fact, together with the fact $u(\lambda_0 - 2\delta, x_2) = u(\lambda_0, x_2)$, yields $\partial_{x_1} u = 0$ for $\lambda_0 - 2\delta \le x_1 \le \lambda_0$. In particular, $\partial_{x_1} u = 0$ when $x_1 = \lambda_0 - 2\delta$.
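By the HMP and the MP, we have $v_\lambda \equiv 0$ if and only if $\partial_{x_1} v_\lambda = 0$ on $T_\lambda$. Now, $\partial_{x_1} v_\lambda = -2\,\partial_{x_1} u$ for $x_1 = \lambda$. But, since $\partial_{x_1} u = 0$ when $x_1 = \lambda_0 - 2\delta$, we see $\partial_{x_1} v_{\lambda_0 - 2\delta} = 0$ for $x_1 = \lambda_0 - 2\delta$ or, which is the same, on $T_{\lambda_0 - 2\delta}$. Now the HMP says $v_{\lambda_0 - 2\delta} \equiv 0$. We may repeat this procedure indefinitely and thus deduce that $u$ is independent of $x_1$. This is a contradiction, and so the first assertion of Claim A is proved.

As for the second assertion of Claim A, note that since $v_\lambda > 0$ in $\Sigma_\lambda$ for $\lambda < \lambda_0$ and $v_\lambda = 0$ on $T_\lambda$, by the HMP we get $\partial_{x_1} v_\lambda < 0$ on $T_\lambda$. Since for $x_1 = \lambda$ we have $\partial_{x_1} u = -(1/2)\,\partial_{x_1} v_\lambda$, we also have $\partial_{x_1} u > 0$ for $x_1 = \lambda$, with $\lambda < \lambda_0$. So Claim A is proved.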
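The identity relating the normal derivative of $v_\lambda$ on $T_\lambda$ to $\partial_{x_1} u$, used twice in the proof of Claim A, can be checked directly. The following is a sketch under the usual moving-line conventions, which are assumed here because they are not restated in this section: $x^{(\lambda)} = (2\lambda - x_1, x_2)$ is the reflection of $x$ at $T_\lambda = \{x_1 = \lambda\}$, $u_\lambda(x) = u(x^{(\lambda)})$, and $v_\lambda = u_\lambda - u$.

```latex
\begin{align*}
v_\lambda(x_1, x_2) &= u(2\lambda - x_1, x_2) - u(x_1, x_2), \\
\partial_{x_1} v_\lambda(x_1, x_2)
  &= -(\partial_{x_1} u)(2\lambda - x_1, x_2) - (\partial_{x_1} u)(x_1, x_2), \\
\partial_{x_1} v_\lambda(\lambda, x_2)
  &= -2\,(\partial_{x_1} u)(\lambda, x_2).
\end{align*}
```

Under these conventions $v_\lambda$ vanishes on $T_\lambda$, and the exterior normal of the half-plane to the left of $T_\lambda$ points in the positive $x_1$ direction, so a negative normal derivative of $v_\lambda$ corresponds to $\partial_{x_1} u > 0$ on $T_\lambda$.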
Proof of Claim B. From the second assertion in Claim A, we see $u$ is strictly increasing for $x_1 < \lambda_0$. By a rotation, we arranged that the maximum of $u$ is at $(-|x^*|, 0)$. It follows that $\lambda_0 \le -|x^*|$. Thus, to prove Claim B, we need to rule out the case $\lambda_0 < -|x^*|$. Assume $\lambda_0 < -|x^*|$. There are two possibilities. Either $v_{\lambda_0} \not\equiv 0$ or $v_{\lambda_0} \equiv 0$.

We will first rule out the case $\lambda_0 < -|x^*|$ and $v_{\lambda_0} \not\equiv 0$. Indeed, since $v_{\lambda_0}(x) \equiv 0$ for $x \in T_{\lambda_0}$ and for $|x| \to \infty$, but $v_{\lambda_0} \not\equiv 0$ in $\Sigma_{\lambda_0}$, and since $v_{\lambda_0}$ satisfies (7.11), then by the MP we get $v_{\lambda_0} > 0$ for $x_1 < \lambda_0$. Hence, by the HMP, $\partial_{x_1} v_{\lambda_0} < 0$ when $x_1 = \lambda_0$. On the other hand, by definition of $\lambda_0$, there exists a sequence of numbers $\lambda_k$, decreasing to $\lambda_0$, such that $v_{\lambda_k} < 0$ somewhere in $\Sigma_{\lambda_k}$, hence also $w_{\lambda_k} < 0$ there, and $\lambda_0 < \lambda_k < -|x^*|$. Notice that $w_{\lambda_k}$ is well defined for $\lambda_k < -|x^*|$. Let $x_k$ be a minimum point for $w_{\lambda_k}$. Then $w_{\lambda_k}(x_k) < 0$ and $\nabla w_{\lambda_k}(x_k) = 0$. As Lemma 7.3 implies $|x_k| < R(u)$, independently of $\lambda$, there exists a subsequence $x_{k_j} \to x_*$ such that $\nabla w_{\lambda_0}(x_*) = 0$ and $w_{\lambda_0}(x_*) \le 0$ for $x_* = (A, B)$, $A \le \lambda_0$. This is a contradiction. Thus, our claim is proved in this case.

We now rule out the case $\lambda_0 < -|x^*|$ and $v_{\lambda_0} \equiv 0$. Indeed, in that case $u(x^{(\lambda_0)}) = u(x)$ for $x \in \Sigma_{\lambda_0}$. But $u$ attains its maximum at $(-|x^*|, 0)$, and by Claim A, $\partial_{x_1} u > 0$ for $x_1 < \lambda_0$. Since $\lambda_0 < -|x^*|$, it follows that $\partial_{x_1} u = 0$ at $(|x^*| + 2\lambda_0, 0)$, which again is a contradiction. Thus, $\lambda_0 = -|x^*|$.

Recall that $\lambda_0$ is the largest value of $\lambda$ for which $v_\lambda(x) \ge 0$, $x \in \Sigma_\lambda$. Hence, $v_{-|x^*|}(x) \ge 0$ for $x \in \Sigma_{-|x^*|}$; thus $u_{-|x^*|}(x) \ge u(x)$. We may now repeat this argument by sliding the line $T_\lambda$ in from $x_1 = \infty$ to get $u_{-|x^*|}(x) \le u(x)$. Putting the two inequalities together, we conclude that $u_{-|x^*|}(x) = u(x)$. This now implies that $u$ is symmetric with respect to $T_{-|x^*|}$. Moreover, from the arguments involving the HMP, we see that any solution is also decreasing away from $T_{-|x^*|}$. Recall that $(-|x^*|, 0)$ is the point of global maximum of $u$.
Finally, if $x^* \ne 0$, then, since $u$ satisfies (1.2), and since $K$ is radially symmetric about $(0, 0)$, we conclude that $K$ is a constant. But if $K$ is a constant, then if $u(x)$ is a solution of (1.2), so is $u(x + x^*)$ for any fixed $x^*$. Thus, by a simple translation of the origin to $x^*$, we can assume that our solution is in fact symmetric with respect to, and decreasing away from, $T_0$. On the other hand, if $K \not\equiv \mathrm{const}$, then $x^* = 0$, and again our solution is symmetric with respect to, and decreasing away from, $T_0$. But if $x^* = 0$, then we can repeat our moving line argument with any direction other than the $x_1$ direction; thus, we come to the conclusion that $u$ is symmetric about, and decreasing away from, any straight line through the origin. This now means that $u$ is radially symmetric about and decreasing away from the origin, modulo a translation in the case that $K \equiv \mathrm{const}$. This completes the proof of Theorems 3.2 and 3.3.

8. Proof of existence Theorem 2.1.

We begin with the remark that, in the special case of identically vanishing Gauss curvature, our Theorem 2.1 is obviously true. Hence, in the rest of this section we assume that the Gauss curvature is not identically zero.
In the following, we prove a probabilistic theorem which implies Theorem 2.1 as an immediate corollary. Incidentally, the proof also provides us with an algorithm for the construction (in principle at least) of nonradial surfaces. We use the methods developed in [37] (see also [8]), [40], and [41]. For applications to Nirenberg’s problem, see [38].

We first introduce some probabilistic notation and terminology. In the following, $x_1, x_2, \dots$ denote points in $\mathbb{R}^2$, not Cartesian components of $x$. Let $\mathbb{N}$ denote the natural numbers. For each $N \in \mathbb{N}$, we denote the probability measures on $\mathbb{R}^{2N}$ by $P(\mathbb{R}^{2N})$. For $B^{(N)} \in P(\mathbb{R}^{2N})$, we denote the associated Radon measure by $B^{(N)}(dx_1 \cdots dx_N)$. A measure $B^{(N)} \in P(\mathbb{R}^{2N})$ is called absolutely continuous with respect to a measure $C^{(N)} \in P(\mathbb{R}^{2N})$, written $dB^{(N)} \ll dC^{(N)}$, if there exists a positive $dC^{(N)}$-integrable function $f(x_1, \dots, x_N)$, called the density of $B^{(N)}$ with respect to $C^{(N)}$, such that $dB^{(N)} = f(x_1, \dots, x_N)\, dC^{(N)}$. By $P^s(\mathbb{R}^{2N})$, we denote the exchangeable probabilities, that is, the subset of $P(\mathbb{R}^{2N})$ whose elements are permutation symmetric in $x_1, \dots, x_N$. The $n$th marginal measure of $B^{(N)} \in P^s(\mathbb{R}^{2N})$, $n < N$, is an element of $P^s(\mathbb{R}^{2n})$, given by
\[ B_n^{(N)}(dx_1 \cdots dx_n) = \int_{\mathbb{R}^{2N-2n}} B^{(N)}(dx_1 \cdots dx_n\, dx_{n+1} \cdots dx_N). \tag{8.1} \]
By $\Omega \equiv (\mathbb{R}^2)^{\mathbb{N}}$, we denote the infinite Cartesian product of the exchangeable $\mathbb{R}^2$-valued infinite sequences. By $P^s(\Omega)$, we denote the permutation symmetric probability measures on $\Omega$. The de Finetti-type result of Hewitt and Savage [35] states that each $\mu \in P^s(\Omega)$ is uniquely presentable as a convex superposition of product measures; that is, for each $\mu \in P^s(\Omega)$, there exists a unique probability measure $\nu(dB \mid \mu)$ on $P(\mathbb{R}^2)$, such that
\[ \mu_n(dx_1 \cdots dx_n) = \int_{P(\mathbb{R}^2)} \nu(dB \mid \mu)\, B^{\otimes n}(dx_1 \cdots dx_n), \qquad n \in \mathbb{N}, \tag{8.2} \]
where $B^{\otimes n}(dx_1 \cdots dx_n) \equiv B(dx_1) \otimes \cdots \otimes B(dx_n)$ and $\mu_n$ denotes the $n$th marginal measure of $\mu$. For de Finetti’s original work, see [29] (see also [27], [24], and [25]). We remark that (8.2) coincides with the extremal decomposition for the convex set $P^s(\Omega)$, an application of the Krein-Milman theorem. For details, see [35].

To $B \in P(\mathbb{R}^2)$, we assign the energy
\[ \mathcal{E}(B) \equiv \frac12\, B^{\otimes 2}\bigl(\ln|x - y|\bigr) = \frac12 \int_{\mathbb{R}^2}\int_{\mathbb{R}^2} \ln|x - y|\, B(dx)\, B(dy), \tag{8.3} \]
whenever the integral on the right exists. We denote by $P_{\mathcal{E}}(\mathbb{R}^2)$ the subset of $P(\mathbb{R}^2)$ for which $\mathcal{E}(B)$ exists. For $\mu \in P^s(\Omega)$, the mean energy of $\mu$ is defined as
\[ e(\mu) = \frac12\, \mu_2\bigl(\ln|x - y|\bigr), \tag{8.4} \]
whenever the integral on the right exists. Proposition 8.1, proved in [57], characterizes the subset of $P^s(\Omega)$ for which (8.4) is well defined.
Proposition 8.1. The mean energy of $\mu$, (8.4), is well defined for those $\mu$ whose decomposition measure $\nu(dB \mid \mu)$ is concentrated on $P_{\mathcal{E}}(\mathbb{R}^2)$, and in that case it is given by
\[ e(\mu) = \int_{P_{\mathcal{E}}(\mathbb{R}^2)} \nu(dB \mid \mu)\, \mathcal{E}(B). \tag{8.5} \]

Let $\Upsilon : \mathbb{R}^2 \to \mathbb{R}^+$ be an $L^\infty$ function, $\Upsilon \not\equiv 0$. For some entire harmonic function $H$, which may be constant, and all $0 < \gamma < 2$, we assume $\Upsilon$ satisfies
\[ \int_{B_1(y)} \Upsilon(x)\, e^{2H(x)}\, |x - y|^{-\gamma}\, dx \longrightarrow 0 \quad \text{as } |y| \longrightarrow \infty. \tag{8.6} \]
Moreover, we assume that for the same harmonic function $H$ and some $q > 0$, $\Upsilon$ satisfies
\[ \int_{\mathbb{R}^2} \Upsilon(x)\, e^{2H(x)}\, |x|^q\, dx < \infty, \tag{8.7} \]
and we define
\[ q^*(\Upsilon, H) = \sup\Bigl\{ q > 0 : \int_{\mathbb{R}^2} \Upsilon(x)\, e^{2H(x)}\, |x|^q\, dx < \infty \Bigr\}. \tag{8.8} \]
Given such $H$ and $\Upsilon$, we now define the a priori measure
\[ \tau(dx) = \Upsilon(x)\, e^{2H(x)}\, dx \tag{8.9} \]
on $\mathbb{R}^2$. Since $\Upsilon$ satisfies (8.7), the integral
\[ M^{(1)} = \int_{\mathbb{R}^2} \tau(dx) \tag{8.10} \]
exists and is called the mass of $\tau$. The probability measure associated to $\tau$, given by
\[ \mu^{(1)}(dx) = \frac{1}{M^{(1)}}\, \tau(dx), \tag{8.11} \]
is thus clearly absolutely continuous with respect to $dx$. For each $B^{(N)}(dx_1 \cdots dx_N) \in P(\mathbb{R}^{2N})$, its entropy with respect to the probability measure $\mu^{(1)}(dx_1) \otimes \cdots \otimes \mu^{(1)}(dx_N) \equiv \mu^{(1)\otimes N}(dx_1 \cdots dx_N)$ is defined as
\[ S^{(N)}\bigl(B^{(N)}\bigr) = -\int_{\mathbb{R}^{2N}} \ln\!\left( \frac{dB^{(N)}}{d\mu^{(1)\otimes N}} \right) B^{(N)}(dx_1 \cdots dx_N) \tag{8.12} \]
if $B^{(N)}$ is absolutely continuous with respect to $d\tau^{\otimes N}$, and provided the integral in (8.12) exists. In all other cases, $S^{(N)}(B^{(N)}) = -\infty$. In particular, if $\mu_n$ is the $n$th marginal measure of a $\mu \in P^s(\Omega)$, then the entropy of $\mu_n$, $n \in \{1, \dots\}$, is given by $S^{(n)}(\mu_n)$, with $S^{(n)}$ defined as in (8.12) with $B^{(n)} = \mu_n$. We also define $S^{(0)}(\mu_0) = 0$. For each $\mu \in P^s(\Omega)$, the sequence $n \mapsto S^{(n)}(\mu_n)$ enjoys the following useful properties, proofs of which are found in [55, Section 2, proof of Proposition 1] (see also [28] and [37]).
Nonpositivity of $S^{(n)}(\mu_n)$: For all $n$,
\[ S^{(n)}(\mu_n) \le 0. \tag{8.13} \]
Monotonic decrease of $S^{(n)}(\mu_n)$: If $n < n'$, then
\[ S^{(n')}(\mu_{n'}) \le S^{(n)}(\mu_n). \tag{8.14} \]
Strong subadditivity of $S^{(n)}(\mu_n)$: For $n', n'' \le n$, and with $S^{(-m)}(\mu_{-m}) \equiv 0$ for $m > 0$,
\[ S^{(n)}(\mu_n) \le S^{(n')}(\mu_{n'}) + S^{(n'')}(\mu_{n''}) + S^{(n - n' - n'')}(\mu_{n - n' - n''}) - S^{(n' + n'' - n)}(\mu_{n' + n'' - n}). \tag{8.15} \]
As a consequence of the subadditivity (8.15) of $S^{(n)}(\mu_n)$, the limit
\[ s(\mu) = \lim_{n \to \infty} \frac{1}{n}\, S^{(n)}(\mu_n) \tag{8.16} \]
exists whenever $\inf_n n^{-1} S^{(n)}(\mu_n) > -\infty$; otherwise, $s(\mu) = -\infty$. The quantity $s(\mu)$ given in (8.16) is called the mean entropy of $\mu \in P^s(\Omega)$. The mean entropy is an affine function (see [55]). This entails the following useful representation, proved in [55].

Proposition 8.2. The mean entropy of $\mu$, (8.16), is given by
\[ s(\mu) = \int_{P(\mathbb{R}^2)} \nu(dB \mid \mu)\, S^{(1)}(B). \tag{8.17} \]
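Next, identifying each $x_k \in \mathbb{R}^2$ with the corresponding $z_k \in \mathbb{C}$, we recall the definition of the alternant $\Delta^{(N)}(x_1, \dots, x_N)$,
\[ \Delta^{(N)}(x_1, \dots, x_N) = \prod_{1 \le i < j \le N} (z_i - z_j). \tag{8.18} \]
Clearly,
\[ \bigl|\Delta^{(N)}(x_1, \dots, x_N)\bigr| = \prod_{1 \le i < j \le N} |x_i - x_j|. \tag{8.19} \]
We also recall the definition of $q^* > 0$ in (8.8) and define
\[ \beta^*(\Upsilon, H) = -2q^*. \tag{8.20} \]
For $\beta \in (\beta^*, 4)$, and $N \in \mathbb{N}$, we now introduce the probability measure $\mu^{(N)}$ on $\mathbb{R}^{2N}$ by
\[ \mu^{(N)}(dx_1 \cdots dx_N) \equiv \frac{1}{M^{(N)}(\beta)}\, \bigl|\Delta^{(N)}(x_1, \dots, x_N)\bigr|^{-\beta/N} \prod_{1 \le \ell \le N} \tau(dx_\ell) \tag{8.21} \]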
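if $N > 1$, and $\mu^{(N)} \equiv \mu^{(1)}$ given in (8.11) if $N = 1$. Lemma 8.3 asserts that (8.21) is well defined for all $N \in \mathbb{N}$, and all $\beta \in (\beta^*, 4)$.

Lemma 8.3. For all $\beta \in (\beta^*, 4)$, the measure (8.21) satisfies $d\mu^{(N)} \ll d\tau^{\otimes N}$. Moreover, for the associated density we have $d\mu^{(N)}/d\tau^{\otimes N} \in L^p(\mathbb{R}^{2N}, d\tau^{\otimes N})$, with $p \in [1, \beta^*/\beta)$ when $\beta < 0$, $p \in [1, \infty]$ when $\beta = 0$, and $p \in [1, 4/\beta)$ when $\beta > 0$.

Proof. First, if $\beta = 0$, or $N = 1$, the claim is obviously true. If $N > 1$ and $\beta \in (\beta^*, 0)$, we make use of the inequality
\[ |x_i - x_j| \le \bigl(|x_i| + 2\bigr)\bigl(|x_j| + 2\bigr), \tag{8.22} \]
valid for any two $x_i \in \mathbb{R}^2$ and $x_j \in \mathbb{R}^2$. Inequality (8.22) is a consequence of the triangle inequality $|x_i - x_j| \le |x_i| + |x_j|$, the fact that $|x| < |x| + 2$, and finally the fact that $r + s < sr$ when both $r > 2$ and $s > 2$. To verify this last inequality, use that $2 + r < 2r$ whenever $r > 2$, so that when $r > 2$ and $s = 2 + \varepsilon > 2$, we have $r + s = r + 2 + \varepsilon < 2r + \varepsilon = (2 + \varepsilon) r - \varepsilon r + \varepsilon = sr - \varepsilon(r - 1) < sr$. With the help of (8.22), we now have for $\beta < 0$,
\[ M^{(N)}(\beta) = \int_{\mathbb{R}^{2N}} \bigl|\Delta^{(N)}(x_1, \dots, x_N)\bigr|^{-\beta/N} \prod_{1 \le k \le N} \tau(dx_k) \le \int_{\mathbb{R}^{2N}} \prod_{1 \le i \le N} \bigl(2 + |x_i|\bigr)^{-\beta/2}\, \tau(dx_i) = \left( \int_{\mathbb{R}^2} \bigl(2 + |x|\bigr)^{-\beta/2}\, \tau(dx) \right)^{N}. \tag{8.23} \]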
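The last integral exists, by hypothesis (8.7). This proves $d\mu^{(N)} \ll d\tau^{\otimes N}$ for $\beta \in (\beta^*, 0)$.

If $N > 1$ and $\beta \in (0, 4)$, we use the inequality between arithmetic and geometric means (see [34]), permutation invariance (twice), and Hölder’s inequality (see [34]) to get
\begin{align*}
M^{(N)}(\beta) &= \int_{\mathbb{R}^{2N}} \bigl|\Delta^{(N)}(x_1, \dots, x_N)\bigr|^{-\beta/N} \prod_{1 \le k \le N} \tau(dx_k) \\
&\le \int_{\mathbb{R}^{2N}} \frac{1}{N} \sum_{1 \le i \le N}\ \prod_{\substack{1 \le j \le N \\ j \ne i}} |x_i - x_j|^{-\beta/2} \prod_{1 \le k \le N} \tau(dx_k) \\
&= \int_{\mathbb{R}^2} \int_{\mathbb{R}^{2(N-1)}} \prod_{2 \le j \le N} |x_1 - x_j|^{-\beta/2}\, \tau(dx_j)\, \tau(dx_1) \tag{8.24} \\
&= \int_{\mathbb{R}^2} \left( \int_{\mathbb{R}^2} |x - y|^{-\beta/2}\, \tau(dy) \right)^{N-1} \tau(dx) \\
&\le \left( \sup_x \int_{\mathbb{R}^2} |x - y|^{-\beta/2}\, \tau(dy) \right)^{N-1} M^{(1)}.
\end{align*}
In the last step, we used that for $\beta \in (0, 4)$, we have
\[ \int_{\mathbb{R}^2} |x - y|^{-\beta/2}\, \tau(dy) < M^{(1)} + K_\beta(x), \tag{8.25} \]
with $K_\beta : x \mapsto \int_{B_1(x)} |x - y|^{-\beta/2}\, \tau(dy) \in C^0(\mathbb{R}^2)$ (because $\Upsilon \in L^\infty$) and $K_\beta(x) \to 0$ for $|x| \to \infty$ (by hypothesis (8.6)). This proves $d\mu^{(N)} \ll d\tau^{\otimes N}$ for $\beta \in (0, 4)$. By repeating now the same chains of estimates with $p\beta$ in place of $\beta$, one concludes that $d\mu^{(N)}/d\tau^{\otimes N} \in L^p(\mathbb{R}^{2N}, d\tau^{\otimes N})$ for all $p \in [1, 4/\beta)$ when $\beta > 0$, respectively, all $p \in [1, \beta^*/\beta)$ when $\beta < 0$.

We now come to the main theorem of this section. It addresses the limiting behavior of $\mu_n^{(N)}$ as $N \to \infty$, with $n$ arbitrary but fixed.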
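Returning briefly to the proof of Lemma 8.3, the arithmetic-geometric mean step behind (8.24) can be spelled out as follows. This is only a sketch; it assumes that the modulus of the alternant is the product over unordered pairs as in (8.19), so that each pair occurs twice in the double product over $i$ and $j \ne i$.

```latex
\begin{align*}
\bigl|\Delta^{(N)}(x_1,\dots,x_N)\bigr|^{2}
  &= \prod_{1\le i\le N}\ \prod_{j\neq i} |x_i - x_j| , \\
\bigl|\Delta^{(N)}(x_1,\dots,x_N)\bigr|^{-\beta/N}
  &= \prod_{1\le i\le N} \Bigl( \prod_{j\neq i} |x_i - x_j|^{-\beta/2} \Bigr)^{1/N}
  \le \frac{1}{N} \sum_{1\le i\le N}\ \prod_{j\neq i} |x_i - x_j|^{-\beta/2} .
\end{align*}
```

The inequality is the comparison of the geometric and arithmetic means of the $N$ nonnegative numbers $\prod_{j \ne i} |x_i - x_j|^{-\beta/2}$, $i = 1, \dots, N$. By permutation invariance of $\tau^{\otimes N}$, all $N$ terms of the sum have the same integral, which reduces the estimate to the single term $i = 1$ appearing in (8.24).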
Theorem 8.4. The sequence of probability measures $N \mapsto \mu_n^{(N)}(dx_1 \cdots dx_n)$ is the union of weakly convergent subsequences in the sense that there exist disjoint sequences $E_\ell = \{N_\ell(k)\}_{k \in \mathbb{N}}$, $E_\ell \cap E_{\ell'} = \emptyset$, for $\ell \ne \ell'$, such that for each $\ell$, the map $k \mapsto \mu_n^{(N_\ell(k))}(dx_1 \cdots dx_n)$ converges weakly in the sense of probability measures, with densities with respect to $d\tau^{\otimes n}$ converging weakly in $L^p(\mathbb{R}^{2n}, d\tau^{\otimes n})$, for all $p \in [1, \infty)$. Let $\mu_n^\ell$ denote the weak limit point of such a subsequence. Then there exists a unique $\mu^\ell \in P^s(\Omega)$ (of which $\mu_n^\ell$ is the $n$th marginal), and $\mu^\ell$ has its decomposition measure $\nu(dB \mid \mu^\ell)$ concentrated on the subset of $P(\mathbb{R}^2) \cap \bigcap_{p > 1} L^p(\mathbb{R}^2, d\tau)$, whose elements minimize the functional
\[ \mathcal{F}_\beta(B) = \beta\, \mathcal{E}(B) - S^{(1)}(B). \tag{8.26} \]

Remark 8.5. Notice that Theorem 8.4 asserts that $\mathcal{F}_\beta$ does have a minimizer $B_\beta \in P_{\mathcal{E}}$. If it can be shown that (8.26) has a unique minimizer, say, $B_\beta$, then in fact we have convergence to a product measure,
\[ \lim_{N \to \infty} \mu_n^{(N)}(dx_1 \cdots dx_n) = \prod_{1 \le k \le n} B_\beta(dx_k), \tag{8.27} \]
weakly in $P(\mathbb{R}^{2n}) \cap L^p(\mathbb{R}^{2n}, d\tau^{\otimes n})$ for any $p \in [1, \infty)$.

Before we prove Theorem 8.4, we show that Theorem 2.1 is a corollary of Theorem 8.4.
Proof of Theorem 2.1. Assume that all hypotheses of Theorem 8.4 are fulfilled. Then (8.26) has a solution for all $\beta \in (\beta^*, 4)$. The minimizers of (8.26) are of the form $B(dx) = \rho(x)\, dx$, with $\rho$ satisfying the Euler-Lagrange equation
\[ \rho(x) = \frac{\Upsilon(x) \exp\bigl( -\beta \int_{\mathbb{R}^2} \ln|x - y|\, \rho(y)\, dy + 2H(x) \bigr)}{\int_{\mathbb{R}^2} \Upsilon(x) \exp\bigl( -\beta \int_{\mathbb{R}^2} \ln|x - y|\, \rho(y)\, dy + 2H(x) \bigr)\, dx}. \tag{8.28} \]
Recall that $\Upsilon \ge 0$, by hypothesis. If $\beta \in (0, 4)$, we now identify $\Upsilon$ with a (positive) Gauss curvature function, $K \equiv \Upsilon$, and if $\beta \in (\beta^*, 0)$, we identify $-\Upsilon$ with a (negative) Gauss curvature function, $K \equiv -\Upsilon$. In either case, $K$ satisfies the hypotheses of Theorem 2.1. We also identify $\beta\pi$ with the integral Gauss curvature,
\[ \kappa = \beta\pi, \tag{8.29} \]
and we notice that $\beta^*\pi = \kappa^*$, as defined in (2.4). We now pick a corresponding solution of (8.28), say, $\rho_{H,\beta}$, which exists by Theorem 8.4. With the help of this $\rho_{H,\beta}$, we define, for all $x \in \mathbb{R}^2$, the function
\[ U_{H,\kappa}(x) = H(x) - \frac{\beta}{2} \int_{\mathbb{R}^2} \ln|x - y|\, \rho_{H,\beta}(y)\, dy + U_0, \tag{8.30} \]
the constant $U_0$ being uniquely determined by the requirement that
\[ \int_{\mathbb{R}^2} K(x)\, e^{2U_{H,\kappa}(x)}\, dx = \kappa. \tag{8.31} \]
By Theorem 8.4, $\rho_{H,\beta} \in L^p(\mathbb{R}^2, d\tau)$ for all $p \in [1, \infty)$; hence $U_{H,\kappa} \in W^{2,p}_{\mathrm{loc}} \cap L^\infty_{\mathrm{loc}}$. With $\Delta \ln|x - y| = 2\pi\, \delta(x - y)$, it now follows that $u(x) = U_{H,\kappa}(x)$ is a distributional solution of (1.2) for the prescribed Gauss curvature function $K$, with $K$ satisfying (2.2) and (2.3), and $u$ satisfying the asymptotics (2.7). For the subset of $K \in C^{0,\alpha}(\mathbb{R}^2)$, we can bootstrap to $U_{H,\kappa} \in C^{2,\alpha}(\mathbb{R}^2)$ by using elliptic regularity, thus obtaining an entire classical solution of (1.2). For the further subset of $K$ satisfying also (2.1), this classical solution obviously breaks the radial symmetry if $H \not\equiv \mathrm{const}$. Finally, for the further subset of $K$ satisfying (2.12), a straightforward estimate shows that (2.2) is redundant. This concludes the proof of Theorem 2.1.
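The step from the Euler-Lagrange equation (8.28) to equation (1.2) can be checked by a short computation. The sketch below is an illustration for $\beta \ne 0$ under the normalizations stated in the proof; the shorthand $Z$ for the denominator of (8.28) is introduced here only for convenience, and (1.2) is taken in the form $-\Delta u = K e^{2u}$, consistent with (7.8).

```latex
\begin{align*}
Z &:= \int_{\mathbb{R}^2} \Upsilon(x)\,
      \exp\Bigl(-\beta\!\int_{\mathbb{R}^2}\ln|x-y|\,\rho_{H,\beta}(y)\,dy + 2H(x)\Bigr)\,dx , \\
\rho_{H,\beta}(x) &= \frac{1}{Z}\,\Upsilon(x)\,e^{2U_{H,\kappa}(x) - 2U_0}
      && \text{by (8.28) and (8.30)} , \\
\kappa = \beta\pi &= \int_{\mathbb{R}^2} K\, e^{2U_{H,\kappa}}\,dx = \pm\, Z\,e^{2U_0}
      && \text{by (8.31), with } K = \pm\Upsilon , \\
\rho_{H,\beta}(x) &= \frac{1}{\beta\pi}\,K(x)\,e^{2U_{H,\kappa}(x)} , \\
-\Delta U_{H,\kappa}(x) &= -\Delta H(x) + \frac{\beta}{2}\cdot 2\pi\, \rho_{H,\beta}(x)
      = K(x)\,e^{2U_{H,\kappa}(x)} .
\end{align*}
```

Here the upper sign refers to $\beta \in (0, 4)$ and the lower sign to $\beta \in (\beta^*, 0)$, and $\Delta H = 0$ because $H$ is harmonic. The last line uses $\Delta \ln|x - y| = 2\pi\, \delta(x - y)$ in the sense of distributions.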
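We now prepare the proof of Theorem 8.4. Let $M(\mathbb{R}^{2N})$ denote the subset of $P(\mathbb{R}^{2N})$ whose elements are absolutely continuous with respect to $d\tau^{\otimes N}$, having density $dB^{(N)}/d\tau^{\otimes N} \in \bigcap_{p > 1} L^p(\mathbb{R}^{2N}, d\tau^{\otimes N})$. On $M(\mathbb{R}^{2N})$, we define the functional
\[ \mathcal{F}_\beta^{(N)}\bigl(B^{(N)}\bigr) = \beta\, B^{(N)}\bigl(\ln|\Delta^{(N)}|\bigr) - N\, S^{(N)}\bigl(B^{(N)}\bigr). \tag{8.32} \]

Lemma 8.6. For each $\beta \in (\beta^*, 4)$, the functional (8.32) takes its unique minimum at the probability measure (8.21); that is,
\[ \min_{B^{(N)} \in M(\mathbb{R}^{2N})} \mathcal{F}_\beta^{(N)}\bigl(B^{(N)}\bigr) = \mathcal{F}_\beta^{(N)}\bigl(\mu^{(N)}\bigr). \tag{8.33} \]
Moreover,
\[ \mathcal{F}_\beta^{(N)}\bigl(\mu^{(N)}\bigr) = -N \ln \mu^{(1)\otimes N}\bigl( |\Delta^{(N)}|^{-\beta/N} \bigr). \tag{8.34} \]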
For $\beta \ge 4$, and for $\beta < \beta^*$, (8.32) is unbounded below.

Proof. Since $\ln|\Delta^{(N)}| \in L^p(\mathbb{R}^{2N}, d\tau^{\otimes N})$ for all $p \in [1, \infty)$ by Lemma 8.3, the integral $\mathcal{F}_\beta^{(N)}(\mu^{(N)})$ is well defined for $\beta \in (\beta^*, 4)$. The identity (8.34) is readily verified by explicit computation. In turn, the Gibbs variational principle (8.33) is just convex duality (see [56]), verified by the standard convexity argument (cf. [28, proof of Proposition I.4.1]). Thus, rewriting (8.32) relative to its value (8.34) at $\mu^{(N)}$ as
\[ \mathcal{F}_\beta^{(N)}\bigl(B^{(N)}\bigr) - \mathcal{F}_\beta^{(N)}\bigl(\mu^{(N)}\bigr) = N \int_{\mathbb{R}^{2N}} \frac{dB^{(N)}}{d\mu^{(N)}} \ln\!\left( \frac{dB^{(N)}}{d\mu^{(N)}} \right) \mu^{(N)}(dx_1 \cdots dx_N) \tag{8.35} \]
and using now $x \ln x \ge x - 1$, with equality if and only if $x = 1$, we find that the right-hand side of (8.35) is nonnegative, with equality holding if and only if $B^{(N)} = \mu^{(N)}$. This proves Lemma 8.6 for $\beta \in (\beta^*, 4)$.

Now, let $\beta \ge 4$, or let $\beta < \beta^*$. Assume that $M^{(N)}(\beta)$ is finite. Then, by (8.34) and by the Gibbs variational principle (8.33), we have $\min_B \mathcal{F}_\beta^{(N)}(B^{(N)}) = -N \ln M^{(N)}(\beta)$. However, a simple scaling argument shows that $M^{(N)}(\beta \ge 4) > C$ for any $C$, and similarly we have $M^{(N)}(\beta < \beta^*) > C$ for any $C$, by definition of $\beta^*$. This verifies the unboundedness below of (8.32) for $\beta \ge 4$ and $\beta < \beta^*$.

Lemmas 8.6 and 8.3 entail the following lemma.

Lemma 8.7. The function $\beta \mapsto F(\beta)$ defined by
\[ F(\beta) \equiv \inf_{B \in M(\mathbb{R}^2)} \mathcal{F}_\beta(B) \tag{8.36} \]
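is continuous for all $\beta \in (\beta^*, 4)$.

Proof. The Gibbs variational principle (8.33) evaluated with a trial product measure $B^{(N)} = B^{\otimes N} \in P(\mathbb{R}^{2N})$, with $B \in P(\mathbb{R}^2) \cap L^p(\mathbb{R}^2, d\tau)$ for some $p > 1$, gives us
\[ \frac{1}{N^2}\, \mathcal{F}_\beta^{(N)}\bigl(\mu^{(N)}\bigr) \le \frac{1}{N^2}\, \mathcal{F}_\beta^{(N)}\bigl(B^{\otimes N}\bigr) = \Bigl(1 - \frac{1}{N}\Bigr) \beta\, \mathcal{E}(B) - S^{(1)}(B) \tag{8.37} \]
for all $B \in P(\mathbb{R}^2) \cap L^p(\mathbb{R}^2, d\tau)$, $p > 1$, and $N > 1$. Now, by (8.23) and (8.24), the left-hand side in (8.37) is uniformly bounded below. Letting $N \to \infty$ in (8.37), we obtain a lower bound for $\mathcal{F}_\beta(B)$, uniformly over $P(\mathbb{R}^2) \cap L^p(\mathbb{R}^2, d\tau)$, $p > 1$, for each $\beta \in (\beta^*, 4)$. Thus,
\[ \beta\, \mathcal{E}(B) - S^{(1)}(B) \ge \limsup_{N \to \infty} \frac{1}{N^2}\, \mathcal{F}_\beta^{(N)}\bigl(\mu^{(N)}\bigr) \ge \liminf_{N \to \infty} \frac{1}{N^2}\, \mathcal{F}_\beta^{(N)}\bigl(\mu^{(N)}\bigr) \ge f_0(\beta), \tag{8.38} \]
with
\[ f_0(\beta) = \begin{cases} -\ln \sup_x \displaystyle\int_{\mathbb{R}^2} |x - y|^{-\beta/2}\, \mu^{(1)}(dy) & \text{for } \beta \ge 0, \\[2ex] -\ln \displaystyle\int_{\mathbb{R}^2} \bigl(2 + |x|\bigr)^{-\beta/2}\, \mu^{(1)}(dx) & \text{for } \beta \le 0. \end{cases} \tag{8.39} \]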
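Recalling (8.26), this proves that $\mathcal{F}_\beta$ is bounded below for $\beta \in (\beta^*, 4)$. Having a lower bound, continuity of $F$ now follows from the definition of $F$. Assume that $F$ is discontinuous at $\beta_0 \in (\beta^*, 4)$. Without loss of generality, we can assume $F(\beta_0^-) > F(\beta_0^+)$. (The reverse case $F(\beta_0^-) < F(\beta_0^+)$ is treated essentially verbatim.) Now let $\beta = \beta_0 + \varepsilon$. Clearly, for each $\varepsilon$ we can find a minimizing sequence $\{B_k\}_{k \in \mathbb{N}}$ (depending on $\varepsilon$) such that $\mathcal{F}_{\beta_0 + \varepsilon}(B_k) < F(\beta_0 + \varepsilon) + \delta$ if $k > M(\delta)$. Pick a sufficiently small $\delta$, and select a $B_* \in \{B_k\}_{k > M(\delta)}$. Insert this $B_*$ into $\mathcal{F}_{\beta_0 - \varepsilon}$. Using $\mathcal{F}_\beta = \beta\, \mathcal{E} - S^{(1)}$, we find, for any $\varepsilon$ and $\delta$,
\begin{align*}
F(\beta_0 - \varepsilon) &\le \mathcal{F}_{\beta_0 - \varepsilon}(B_*) = \mathcal{F}_{\beta_0 + \varepsilon}(B_*) - 2\varepsilon\, \mathcal{E}(B_*) \tag{8.40} \\
&\le F(\beta_0 + \varepsilon) + \delta - 2\varepsilon\, \mathcal{E}(B_*).
\end{align*}
Letting $\varepsilon \to 0$ and $\delta \to 0$, we obtain $F(\beta_0^-) \le F(\beta_0^+)$, which is a contradiction.

Taking the infimum over $B$ in (8.38) and noting Lemma 8.7 gives the following proposition.

Proposition 8.8. For all $\beta \in (\beta^*, 4)$,
\[ \limsup_{N \to \infty} \frac{1}{N^2}\, \mathcal{F}_\beta^{(N)}\bigl(\mu^{(N)}\bigr) \le F(\beta). \tag{8.41} \]
Proposition 8.8 is complemented by a sharp estimate in the opposite direction.

Proposition 8.9. For all $\beta \in (\beta^*, 4)$,
\[ \liminf_{N \to \infty} \frac{1}{N^2}\, \mathcal{F}_\beta^{(N)}\bigl(\mu^{(N)}\bigr) \ge F(\beta). \tag{8.42} \]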
To prove Proposition 8.9, we need to prove that the sequence of the $n$th marginal measures $\mu_n^{(N)}$ is not “leaking at $\infty$” as $N \to \infty$. When $\beta > 0$, we also need to show that the sequences of the densities $d\mu_n^{(N)}/d\tau^{\otimes n}$ of these marginal measures are uniformly in $L^p(\mathbb{R}^{2n}, d\tau^{\otimes n})$ for $N > N_n(\beta)$. However, since it gives a priori regularity, we prove uniform $L^p$ bounds for all $\beta \in (\beta^*, 4)$. We remark that when $\Upsilon$ is radially symmetric decreasing or has compact support, then many of the following proofs simplify considerably, some to trivialities. However, since we work with a minimal set of assumptions on $\Upsilon$, it is unavoidable that the ensuing estimates become somewhat more technical.
We begin by deriving bounds on the expected value of $\ln|\Delta^{(N)}|$ with respect to $\mu^{(N)}$ which, using permutation symmetry, can be written in terms of $\mu_2^{(N)}$,
\[ \mu^{(N)}\bigl(\ln|\Delta^{(N)}|\bigr) = \frac12\, N(N - 1)\, \mu_2^{(N)}\bigl(\ln|x - y|\bigr). \tag{8.43} \]

Lemma 8.10. For each $\beta \in (\beta^*, 4)$, there exist constants $\overline{C}(\beta)$ and $\underline{C}(\beta)$, independent of $N$, such that for all $N \ge 2$ we have the estimates
\[ \overline{C}(\beta) \ge \beta\, \mu^{(1)\otimes 2}\bigl(\ln|x - y|\bigr) \ge \beta\, \mu_2^{(N)}\bigl(\ln|x - y|\bigr) \ge \underline{C}(\beta). \tag{8.44} \]

Proof. The first inequality in (8.44) is implied by our hypotheses (8.6) and (8.7), which enter our definitions of $\tau$ (8.9) and $\mu^{(1)}$ (8.11). To obtain the second inequality, we study the functions $\beta \mapsto f_N(\beta)$, $N > 1$, given by
\[ f_N(\beta) = -\frac{2}{N - 1}\, \ln \mu^{(1)\otimes N}\bigl( |\Delta^{(N)}|^{-\beta/N} \bigr) \tag{8.45} \]
for $\beta \in (\beta^*, 4)$. Jensen’s inequality [34] with respect to $\mu^{(1)\otimes N}$ applied in (8.45) gives us
\[ f_N(\beta) \le \beta\, \mu^{(1)\otimes 2}\bigl(\ln|x - y|\bigr). \tag{8.46} \]
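On the other hand, $N(N - 1) f_N(\beta) = 2\mathcal{F}_\beta^{(N)}(\mu^{(N)})$. Therefore, by Lemma 8.6, (8.35), definition (8.34), and the negativity of $S^{(N)}$ (see (8.13)), we have
\[ f_N(\beta) = \beta\, \mu_2^{(N)}\bigl(\ln|x - y|\bigr) - \frac{2}{N - 1}\, S^{(N)}\bigl(\mu^{(N)}\bigr) \ge \beta\, \mu_2^{(N)}\bigl(\ln|x - y|\bigr). \tag{8.47} \]
The second estimate in (8.44) is proved.

To prove the third estimate in (8.44), we note that for any $\beta \in (\beta^*, 4)$, there exists a small $\varepsilon > 0$ such that $(1 + \varepsilon)\beta \in (\beta^*, 4)$. By Jensen’s inequality with respect to $\mu^{(N)}$,
\[ M^{(N)}\bigl((1 + \varepsilon)\beta\bigr) \ge M^{(N)}(\beta)\, \exp\Bigl( -\frac12 (N - 1)\, \varepsilon\beta\, \mu_2^{(N)}\bigl(\ln|x - y|\bigr) \Bigr). \tag{8.48} \]
Dividing (8.48) by $(M^{(1)})^N$, taking the logarithm, and then multiplying by $-2/(N - 1)$ gives
\[ f_N\bigl((1 + \varepsilon)\beta\bigr) \le f_N(\beta) + \varepsilon\beta\, \mu_2^{(N)}\bigl(\ln|x - y|\bigr). \tag{8.49} \]
Now, $f_N(\beta)$ is bounded above and below independently of $N$, $N > 1$, for
\[ 2F(\beta) \ge \bigl(1 - N^{-1}\bigr) f_N(\beta) \ge 2 f_0(\beta), \qquad \beta \in (\beta^*, 4), \tag{8.50} \]
and since $1 - N^{-1} \to 1$. The first inequality in (8.50) is Proposition 8.8; the second is (8.38). With the help of (8.50), from (8.49) we now obtain, for $N > 1$,
\begin{align*}
\beta\, \mu_2^{(N)}\bigl(\ln|x - y|\bigr) &\ge \frac{1}{\varepsilon} \Bigl( f_N\bigl((1 + \varepsilon)\beta\bigr) - f_N(\beta) \Bigr) \\
&\ge \frac{2}{\bigl(1 - N^{-1}\bigr)\varepsilon} \Bigl( f_0\bigl((1 + \varepsilon)\beta\bigr) - F(\beta) \Bigr) \tag{8.51} \\
&\ge \underline{C}(\beta)
\end{align*}
uniformly in $N$, for all $\beta \in (\beta^*, 4)$.

We next prove a hybrid bound, which for $N = 1$ reduces to the first inequality in (8.44).

Lemma 8.11. For each $\beta \in (\beta^*, 4)$, $N \ge 1$, there is an $N$-independent $\widetilde{C}(\beta)$ such that
\[ \beta\, \mu^{(1)} \otimes \mu_1^{(N)}\bigl(\ln|x - y|\bigr) \le \widetilde{C}(\beta). \tag{8.52} \]

Proof. For $\beta = 0$, the statement is obvious. For $\beta \in (\beta^*, 0)$, we have
\begin{align*}
\beta\, \mu^{(1)} \otimes \mu_1^{(N)}\bigl(\ln|x - y|\bigr) &\le \beta\, \mu_1^{(N)}\Bigl( \int_{B_1(x)} \ln|x - y|\, \mu^{(1)}(dy) \Bigr) \\
&\le \mu_1^{(N)}\bigl( \widetilde{C}(\beta) \bigr) \tag{8.53} \\
&= \widetilde{C}(\beta).
\end{align*}
The first estimate in (8.53) is obvious, since $\beta \in (\beta^*, 0)$. The second estimate follows from the fact that $K_{\log} : x \mapsto \int_{B_1(x)} \ln|x - y|\, \mu^{(1)}(dy) \in C^0(\mathbb{R}^2)$ (because $\Upsilon \in L^\infty$), with $K_{\log}(x) \to 0$ as $|x| \to \infty$ (by $|\ln|x - y|| < |x - y|^{-\gamma}$ on $B_1(x)$, $\gamma \in (0, 2)$, followed by (8.6)).

For $\beta \in (0, 4)$, we use (8.22) to estimate
\[ \mu^{(1)} \otimes \mu_1^{(N)}\bigl(\ln|x - y|\bigr) \le \mu^{(1)}\bigl(\ln(2 + |x|)\bigr) + \mu_1^{(N)}\bigl(\ln(2 + |y|)\bigr). \tag{8.54} \]
By (8.7),
\[ \mu^{(1)}\bigl(\ln(2 + |x|)\bigr) = C_1 < \infty. \tag{8.55} \]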
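As to estimating $\mu_1^{(N)}(\ln(2 + |y|))$, if $\beta \in (0, 2)$, we can pick $p \in (1, 2/\beta)$ and apply Hölder’s inequality with respect to $\tau(dx_1)$ followed by obvious $L^\infty$ estimates to get the upper bound $\mu_1^{(N)}(\ln(2 + |y|)) \le C(\beta)\, M^{(N-1)}(\beta')/M^{(N)}(\beta)$, where $\beta' = (1 - N^{-1})\beta$ and where
\[ C(\beta) = \left( \int_{\mathbb{R}^2} \bigl(\ln(2 + |y|)\bigr)^{p^*}\, \tau(dy) \right)^{1/p^*} \left( \sup_N \sup_{x \in \mathbb{R}^2} \int_{\mathbb{R}^2} |x - y|^{-p\beta'}\, \tau(dy) \right)^{1/p} < \infty. \tag{8.56} \]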
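We subsequently estimate the ratio of $M$’s uniformly in $N$ in the manner used below, but when $\beta \in [2, 4)$, Hölder’s inequality does not lead to $L^\infty$ functions and so this road is then blocked. However, noting that for $q \in (0, q^*)$ we have, by (8.7),
\[ \int_{\mathbb{R}^2} \exp\bigl( q \ln(2 + |y|) \bigr)\, \tau(dy) = C_2 < \infty, \tag{8.57} \]
we can use convex duality (see [56]) for “exp” to get, for any $q \in (0, q^*)$ and all $\beta \in (0, 4)$,
\begin{align*}
\mu_1^{(N)}\bigl(\ln(2 + |y|)\bigr) - \frac{M^{(N-1)}(\beta')}{M^{(N)}(\beta)} \int_{\mathbb{R}^2} \exp\bigl( q \ln(2 + |y|) \bigr)\, \tau(dy)
&\le -\frac{1}{q} \Bigl( 1 + \ln q + \beta'\, \mu_2^{(N)}\bigl(\ln|x - y|\bigr) \Bigr) \tag{8.58} \\
&\le C^*(\beta).
\end{align*}
In (8.58), $C^*(\beta)$ is independent of $N$, by Lemma 8.10. Hence, it now remains to estimate $M^{(N-1)}(\beta')/M^{(N)}(\beta)$ from above uniformly in $N$, for each $\beta \in (0, 4)$.

To carry out this last step, we regularize $M^{(N)}$ and prove an $N$-independent upper bound on the “regularized ratio of $M$’s” which is independent of the regularization parameter. We regularize $\ln|x - y|$ by
\[ -V_\varepsilon(x, y) \equiv \pi^{-2} \varepsilon^{-4} \int_{B_\varepsilon(x)} \int_{B_\varepsilon(y)} \ln|\xi - \eta|\, d\xi\, d\eta. \]
Let $\mathcal{H}_\varepsilon$ denote the Hilbert space obtained by completing the $C_0^\infty(\mathbb{R}^2)$ functions with vanishing integral, $\int_{\mathbb{R}^2} f(x)\, dx = 0$, with respect to the positive definite inner product
\[ \langle f, f \rangle_\varepsilon \equiv N^{-1} \beta \int_{\mathbb{R}^2} \int_{\mathbb{R}^2} f(x)\, V_\varepsilon(x, y)\, f(y)\, dx\, dy. \tag{8.59} \]
If $B^1 \equiv B_{1/\sqrt{\pi}}(0)$ denotes the disk of area 1 centered at the origin, and $\delta_y(x)$ is the Dirac measure on $\mathbb{R}^2$ concentrated at $y$, we note that
\[ \delta_y^Q(x) \equiv \delta_y(x) - \chi_{B^1}(x) \in \mathcal{H}_\varepsilon. \tag{8.60} \]
Accordingly,
\[ \delta_{(N)}^Q(x) \equiv \sum_{k=1}^{N} \delta_{x_k}^Q(x) \in \mathcal{H}_\varepsilon \tag{8.61} \]
as well. We now define
\[ W_\varepsilon(x) \equiv \int_{B^1} V_\varepsilon(x, y)\, dy - \frac12 \int_{B^1} \int_{B^1} V_\varepsilon(x, y)\, dx\, dy \tag{8.62} \]
and write
\[ \tilde\tau(dx) = e^{\beta W_\varepsilon(x)}\, \tau(dx). \tag{8.63} \]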
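Note that, unless $q^* > \beta$, $\tilde\tau$ does not have finite mass, but that does not cause a problem. We write $M_\varepsilon^{(N)}(\beta)$ for $M^{(N)}(\beta)$ with $-\ln|x - y|$ replaced by $V_\varepsilon(x, y)$. With (8.59) to (8.63), we have the identity
\[ M_\varepsilon^{(N)}(\beta) = e^{-(1/2)\beta V_\varepsilon(0, 0)} \int_{\mathbb{R}^{2N}} e^{(1/2) \langle \delta_{(N)}^Q,\, \delta_{(N)}^Q \rangle_\varepsilon} \prod_{\ell = 1}^{N} \tilde\tau(dx_\ell). \tag{8.64} \]
We now use Gaussian functional integrals (see [32]) to rewrite (8.64). Minlos’s theorem (see [32]) asserts that $N^{-1}\beta V_\varepsilon(x, y)$ is the covariance “matrix” of a Gaussian probability measure with mean zero; that is, there exists a Gaussian average $\mathrm{Ave}(\,\cdot\,)$ on a space of linear functionals $S$ on $\mathcal{H}_\varepsilon$, with $\mathrm{Ave}(\phi(x)) = 0$ and $\mathrm{Ave}(\phi(x)\phi(y)) = N^{-1}\beta V_\varepsilon(x, y)$, where $\phi(x)$ is shorthand for $S(\delta_x^Q)$. Using the generating function (see [32]),
\[ \mathrm{Ave}\bigl( e^{S(f)} \bigr) = e^{(1/2) \langle f, f \rangle_\varepsilon}, \tag{8.65} \]
with $f = \delta_{(N)}^Q$ given in (8.61), then integrating over $\mathbb{R}^{2N}$ with respect to $\tilde\tau^{\otimes N}$, we obtain
\[ M_\varepsilon^{(N)}(\beta) = e^{-(1/2)\beta V_\varepsilon(0, 0)}\, \mathrm{Ave}\left( \Bigl( \int_{\mathbb{R}^2} e^{\phi(x)}\, \tilde\tau(dx) \Bigr)^{N} \right). \tag{8.66} \]
Jensen’s inequality in the form $\langle F^N \rangle \ge \langle F^{N-1} \rangle^{N/(N-1)}$ applied to the right-hand side of (8.66) now gives, in terms of the $M_\varepsilon$’s,
\[ M_\varepsilon^{(N)}(\beta) \ge M_\varepsilon^{(N-1)}(\beta') \bigl( M_\varepsilon^{(N-1)}(\beta') \bigr)^{1/(N-1)} \tag{8.67} \]
for all $\varepsilon$. Hence, we let $\varepsilon \to 0$, and then $N \to \infty$ to obtain
\begin{align*}
\limsup_{N \to \infty} \frac{M^{(N-1)}(\beta')}{M^{(N)}(\beta)} &\le \limsup_{N \to \infty} \bigl( M^{(N-1)}(\beta') \bigr)^{-1/(N-1)} \\
&\le \frac{1}{M^{(1)}}\, e^{\inf_B \mathcal{F}_\beta(B)} \\
&\le \frac{1}{M^{(1)}}\, e^{\mathcal{F}_\beta(\mu^{(1)})} \tag{8.68} \\
&= \frac{1}{M^{(1)}} \exp\Bigl( \frac12\, \beta\, \mu^{(1)\otimes 2}\bigl(\ln|x - y|\bigr) \Bigr).
\end{align*}
By Lemma 8.10, the right-hand side of (8.68) exists and is obviously $N$-independent. Combining (8.57), (8.58), and (8.68), we have
\[ \mu_1^{(N)}\bigl(\ln(2 + |x|)\bigr) \le \widetilde{C}_2(\beta) \tag{8.69} \]
independently of $N$. By (8.54), (8.55), and (8.69), and setting $\widetilde{C}(\beta) = \widetilde{C}_1(\beta) + \widetilde{C}_2(\beta)$, Lemma 8.11 is proved also for $\beta \in (0, 4)$.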
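The form of Jensen's inequality invoked for (8.67), and the exponent bookkeeping that goes with it, can be made explicit. The following is a sketch only: the symbol $F$ stands for a generic nonnegative random variable under the Gaussian average and $g$ is an auxiliary convex function, both introduced here for illustration; $\beta' = (1 - N^{-1})\beta$ is as defined before (8.56).

```latex
\begin{align*}
g(t) &:= t^{N/(N-1)} \quad \text{is convex on } [0,\infty) \text{ for } N \ge 2, \\
\mathrm{Ave}\bigl(F^{N}\bigr)
  &= \mathrm{Ave}\Bigl( g\bigl(F^{N-1}\bigr) \Bigr)
  \ \ge\ g\Bigl( \mathrm{Ave}\bigl(F^{N-1}\bigr) \Bigr)
  = \Bigl( \mathrm{Ave}\bigl(F^{N-1}\bigr) \Bigr)^{N/(N-1)} , \\
\frac{\beta'}{N-1} &= \frac{\beta}{N} , \qquad
\beta'\,\frac{N}{N-1} = \beta .
\end{align*}
```

The first relation in the last line says that the covariance $N^{-1}\beta V_\varepsilon$ is the same for $N$ points at parameter $\beta$ and for $N - 1$ points at parameter $\beta'$. The right-hand side of (8.67) equals $(M_\varepsilon^{(N-1)}(\beta'))^{N/(N-1)}$, which is where the exponent $N/(N-1)$ from Jensen's inequality reappears.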
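We now prepare for uniform $L^p$ bounds.

Lemma 8.12. For each $n \in \mathbb{N}$, $\beta \in (\beta^*, 4)$, there exist $N_n(\beta) \in \mathbb{N}$ and $C(n, \beta) > 0$, such that for $N > N_n$, the Radon-Nikodym derivative of $\mu_n^{(N)}$ with respect to $\tau^{\otimes n}$ is bounded by
\[ \frac{d\mu_n^{(N)}}{d\tau^{\otimes n}}(x_1, \dots, x_n) \le C(n, \beta)\, \bigl|\Delta^{(n)}(x_1, \dots, x_n)\bigr|^{-\beta/N}. \tag{8.70} \]

Proof. When $\beta = 0$, this is trivial. When $\beta \ne 0$, we begin by writing
\[ \frac{d\mu_n^{(N)}}{d\tau^{\otimes n}}(x_1, \dots, x_n) = \frac{1}{M^{(N)}(\beta)}\, G(x_1, \dots, x_n)\, \bigl|\Delta^{(n)}(x_1, \dots, x_n)\bigr|^{-\beta/N}, \tag{8.71} \]
where
\[ G(x_1, \dots, x_n) = \int_{\mathbb{R}^{2(N-n)}} \prod_{1 \le i \le n < j \le N} |x_i - x_j|^{-\beta/N} \prod_{n < k < \ell \le N} |x_k - x_\ell|^{-\beta/N} \prod_{n < j \le N} \tau(dx_j). \tag{8.72} \]
Let $[[\,\cdot\,]]$ denote integer part. We define
\[ N_n(\beta) = \begin{cases} \Bigl[\Bigl[ \dfrac{2\beta^* - \beta}{\beta^* - \beta}\, n \Bigr]\Bigr] & \text{if } \beta \in (\beta^*, 0), \\[2ex] \Bigl[\Bigl[ \dfrac{8 - \beta}{4 - \beta}\, n \Bigr]\Bigr] & \text{if } \beta \in (0, 4). \end{cases} \tag{8.73} \]
Given $\beta \in (\beta^*, 4)$, let $N > N_n(\beta)$. Then, by Hölder’s inequality,
\begin{align*}
G(x_1, \dots, x_n) \le{} & \left( \int_{\mathbb{R}^{2(N-n)}} \prod_{1 \le i \le n < j \le N} |x_i - x_j|^{-\beta/2n} \prod_{n < j \le N} \tau(dx_j) \right)^{2n/N} \\
& \times \left( \int_{\mathbb{R}^{2(N-n)}} \prod_{n < i < j \le N} |x_i - x_j|^{-\beta/(N - 2n)} \prod_{n < k \le N} \tau(dx_k) \right)^{1 - 2n/N}. \tag{8.74}
\end{align*}
As for the first factor on the right-hand side of (8.74), permutation symmetry gives
\[ \int_{\mathbb{R}^{2(N-n)}} \prod_{1 \le i \le n < j \le N} |x_i - x_j|^{-\beta/2n} \prod_{n < j \le N} \tau(dx_j) = \left( \int_{\mathbb{R}^2} \prod_{1 \le i \le n} |x_i - x|^{-\beta/2n}\, \tau(dx) \right)^{N - n}. \tag{8.75} \]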
By the arithmetic-geometric mean inequality and permutation invariance, we have
\[ \int_{\mathbb{R}^2} \prod_{1 \le i \le n} |x_i - x|^{-\beta/2n}\, \tau(dx) \le \frac{1}{n} \sum_{1 \le i \le n} \int_{\mathbb{R}^2} |x_i - x|^{-\beta/2}\, \tau(dx). \tag{8.76} \]
For the right-hand side of (8.76), we have the estimates
\[ \frac{1}{n} \sum_{1 \le i \le n} \int_{\mathbb{R}^2} |x_i - x|^{-\beta/2}\, \tau(dx) \le \begin{cases} \sup_y \displaystyle\int_{\mathbb{R}^2} |y - x|^{-\beta/2}\, \tau(dx) & \text{if } \beta \ge 0, \\[2ex] C_n \displaystyle\int_{\mathbb{R}^2} \bigl(2 + |x|\bigr)^{-\beta/2}\, \tau(dx) & \text{if } \beta < 0, \end{cases} \tag{8.77} \]
where $C_n = \max_{i \in \{1, \dots, n\}} (2 + |x_i|)^{|\beta|/2}$. By (8.75), (8.76), and (8.77), the first term on the right-hand side of (8.74) is bounded by the $2n(1 - n/N)$th power of the right-hand side of (8.77), hence uniformly with respect to $N$.

As for the second factor on the right-hand side of (8.74), we split off the $(-2n/N)$th power. We set $\alpha(N) = (N - n)/(N - 2n)$. Since $N > N_n$, we have $1 < \alpha(N) < 4/\beta$ if $\beta > 0$, and $1 < \alpha(N) < \beta^*/\beta$ if $\beta < 0$. We also have $\alpha(N) \to 1$ as $N \to \infty$. Proceeding as in the proof of Lemma 8.7, we find that
\begin{align*}
\limsup_{N \to \infty} \left( \int_{\mathbb{R}^{2N - 2n}} \prod_{n < i < j \le N} |x_i - x_j|^{-\beta/(N - 2n)} \prod_{n < k \le N} \tau(dx_k) \right)^{-2n/N}
&= \limsup_{N \to \infty} \Bigl( M^{(N-n)}\bigl(\alpha(N)\beta\bigr) \Bigr)^{-2n/N} \tag{8.78} \\
&\le \left( \frac{e^{F(\beta)}}{M^{(1)}} \right)^{2n},
\end{align*}
which implies an $N$-independent bound on $(M^{(N-n)}(\alpha(N)\beta))^{-2n/N}$. Feeding (8.75), (8.76), (8.77), and (8.78) back into (8.74), we see that $G(x_1, \dots, x_n) \le C M^{(N-n)}(\alpha(N)\beta)$. This already proves that the density (8.71) will eventually (i.e., for large $N$) be in any $L^p(\mathbb{R}^{2n}, d\tau^{\otimes n})$, $p < \infty$.

To prove that $d\mu_n^{(N)}/d\tau^{\otimes n} \in L^p(\mathbb{R}^{2n}, d\tau^{\otimes n})$ uniformly in $N$, it remains to estimate the ratio $M^{(N-n)}(\alpha(N)\beta)/M^{(N)}(\beta)$ from above, independently of $N$. We once again can apply the Gaussian functional integral method used in the proof of Lemma 8.11. Since $\alpha(N)\beta$ occurs in the argument of $M^{(N-n)}$ instead of $(1 - nN^{-1})\beta$, besides Jensen’s inequality (now pulling a power $N/(N - n)$ out of the average), we also need a “change of covariance formula” (see [32]). However, having proved Lemmas 8.10 and 8.11 already, a more direct way follows.
Using Jensen’s inequality twice in a self-explanatory way, we obtain
M (N−n) α(N )β n(n − 1) 1 (1)⊗2 1 exp ≤ β µ ln |x − y| n N 2 M (N) (β) M (1)
n (N−n),α (1) (8.79) × exp n 1 − β µ ⊗ µ1 ln |x − y| N
n+1 (N−n),α α(N)β µ2 ln |x − y| , × exp − n 1 − N where µ(N−n),α stands for (8.21) with α(N)β in place of β. The first exponential factor on the right-hand side of (8.79) is bounded above uniformly in N because β µ(1)⊗2 (ln |x − y|) ≤ C(β) independently of N, by the first inequality in Lemma 8.10; as for the second exponential factor on the right-hand side of (8.79), by reidentifying N → N − n and β → αβ, the N-independent upper bound in Lemma (N−n),α 8.11 gives β µ(1) ⊗ µ1 (ln |x − y|) ≤ C(α(N )β). Since α(N) → 1 as N → ∞, the second exponential factor on the right-hand side of (8.79) is bounded above uniformly in N. As for the third exponential factor on the right-hand side of (8.79), since β ∗ < α(N )β < 4, and since (8.44) holds for all β ∈ (β ∗ , 4), by Lemma 8.10 we (N−n),α have β µ2 (ln |x − y|) ≥ C(α(N)β). Again, since α(N) → 1, we now see that also the third exponential factor on the right-hand side of (8.79) is bounded above uniformly in N . This proves Lemma 8.12. Lemma 8.12 establishes that for each triple n ∈ N, β ∈ (β ∗ , 4), p ∈ [1, ∞) there ⊗n ∈ Lp (R2n , dτ ⊗n ) uniformly in n (β, p) ( > Nn (β)) such that dµ(N) exists a N n /dτ n (β, p). Hence, the sequence N → µ(N) N when N > N is Lp (R2n , dτ ⊗n )-weakly n n (β, p), for each p ∈ [1, ∞). compact when N > N (N) However, a weak Lp limit point of µn need not be a probability measure. Since (N) R2 is unbounded, some partial mass of the marginals µn of (8.21) could escape to infinity when N → ∞. We now show that this does not happen by proving tightness (N) of the sequences. Recall (see [5]) that the sequence of probability measures µn is (N) n tight if, for each * # 1, there exists R(*) such that µn (BR(*) ) > 1−*, independent of N, where BnR ⊂ R2n is the n-fold Cartesian product of the ball BR ⊂ R2 that is centered at the origin, having radius R. (N)
Lemma 8.13. For each $n$, the sequence $\{\mu_n^{(N)}\}_{N\ge n}$ given by (8.21) is tight.
Proof. Since our marginal measures are permutation symmetric and consistent, in the sense that $\mu_n^{(N)}(dx^n) = \mu_m^{(N)}(dx^n \otimes \mathbb{R}^{2(m-n)})$ for $m > n$, it suffices to prove tightness for $n = 1$. It follows from the definition of $\mu^{(1)}$ that the map $y \mapsto h(y) \equiv \int_{\mathbb{R}^2} \ln|y-x|\,\mu^{(1)}(dx) + C$ is continuous and independent of $N$. The constant $C$ is chosen so that $h(y) > 0$. Moreover, we have $h(y) \to \infty$ as $|y| \to \infty$, uniformly in $y$. Therefore, and by
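Lemma 8.11, for each positive $\varepsilon \ll 1$, we can find $R(\varepsilon)$, independent of $N$, such that for all $N$,
$$\inf_{x_1 \notin B_{R(\varepsilon)}} h(x_1) \ge \frac{1}{\varepsilon}\,\mu_1^{(N)}\bigl(h(x_1)\bigr). \tag{8.80}$$
Let $I_\Lambda$ denote the indicator function of the set $\Lambda$. We then have the chain of estimates
$$\begin{aligned}
\mu_1^{(N)}\bigl(h(x_1)\bigr) &\ge \mu_1^{(N)}\bigl(h(x_1)\, I_{\mathbb{R}^2\setminus B_{R(\varepsilon)}}\bigr) \\
&\ge \frac{1}{\varepsilon}\,\mu_1^{(N)}\bigl(h(x_1)\bigr)\,\mu_1^{(N)}\bigl(I_{\mathbb{R}^2\setminus B_{R(\varepsilon)}}\bigr) \\
&= \frac{1}{\varepsilon}\,\mu_1^{(N)}\bigl(h(x_1)\bigr)\Bigl(1 - \mu_1^{(N)}\bigl(B_{R(\varepsilon)}\bigr)\Bigr).
\end{aligned} \tag{8.81}$$
Dividing (8.81) by $\varepsilon^{-1}\mu_1^{(N)}(h(x_1))$ and rearranging terms slightly gives us
$$\mu_1^{(N)}\bigl(B_{R(\varepsilon)}\bigr) \ge 1 - \varepsilon, \tag{8.82}$$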
independent of N . The proof is complete. To prove Proposition 8.9, we also need a lower bound on the mean entropy. Lemma 8.14. For each β ∈ (β ∗ , 4), there exists a C(β), independent of N, such that 1 (N) (N) µ ≥ C(β). N
(8.83)
(N)
Proof. By definition (8.32) of Ᏺβ (B(N) ), 1 (N) (N) 1 (N) (N) 1 (N) µ µ = β 2 ln − 2 Ᏺβ µ(N) . N N N
(8.84)
The bound (8.83) now follows from Proposition 8.8, (8.43), and Lemma 8.10. Proof of Proposition 8.9. By Lemma 8.13, the sequence of probability measures (N) {µn | N = n, n + 1, . . . } is tight in P (R2n ) for all n. Therefore (see [5]) we can (N (k)) select a subsequence k → N (k) ∈ N, k ∈ N such that for each n ∈ N, µn V µn ∈ P (R2n ), as k → ∞. Since the marginals are consistent (in the sense defined above in the proof of tightness), by Kolmogorov’s existence theorem (see [5, p. 228 ff.] and [28, p. 301 ff.]), the infinite family of marginals {µn }n∈N now defines a unique µ ∈ P s (=). Furthermore, for β ∗ < β < 4, we have as a corollary of Lemma 8.12 that, (N) for any n and any p ∈ [1, ∞), the sequence {µn | N = n, n + 1, . . . } is eventually in a ball {g : gLp (R2n ) ≤ T }, where T (n, β, p) is independent of N. Therefore, as k → ∞, after at most selecting a subsubsequence (also denoted by k → N (k) ∈ N,
$k \in \mathbb{N}$), we have that $d\mu_n^{(N(k))}/d\tau^{\otimes n} \rightharpoonup d\mu_n/d\tau^{\otimes n}$, weakly in $L^p(\mathbb{R}^{2n}, d\tau^{\otimes n})$, for any $p \in [1, \infty)$. We first study convergence of energy. By (8.43), we have
(N (k)) 1 (N (k)) 1 (N (k)) = 1 − 1 µ2 µ ln ln |x − y| . (8.85) 2 N (k) 2 N (k) Since ln |x −y| ∈ Lq (R4 , dτ ⊗2 ), 1/q +1/p = 1, by weak Lp (R4 , dτ ⊗2 ) convergence (N (k)) of µ2 , 1 (N (k)) 1 ln |x − y| −→ µ2 µ2 ln |x − y| = e µ . 2 2
(8.86)
Furthermore, since 1 − N (k)−1 → 1 as k → ∞, we have 1 µ(N (k)) ln (N (k)) = e µ . 2 k→∞ N (k) lim
(8.87)
We now turn to the entropy. We define m = N (k)−[[N (k)/n]]n. By subadditivity (8.15) and negativity (8.13) we have, for any n < N (k), 1 N (k) 1 1 (k)) + (k)) (N (k)) µ(N (k)) ≤ (n) µ(N (m) µ(N n m N (k) N (k) n N (k) 1 N (k) (k)) . ≤ (n) µ(N n N (k) n (8.88) Clearly, N (k)−1 [[N (k)/n]] → n−1 . Moreover, for each n, weak upper semicontinuity of (n) (see [57]) gives us (k)) ≤ (n) µ . lim sup (n) µ(N n n k→∞
Therefore, for all n, lim sup k→∞
1 (N (k)) (N (k)) 1 (n) µ ≤ µn . Nk n
(8.89)
Recalling (8.16) and Lemma 8.14, we see that s(µ) exists. Hence, n → ∞ in (8.89) gives lim sup k→∞
1 (N (k)) µ(N (k)) ≤ s µ N (k)
for each convergent subsequence µ(N (k)) V µ .
(8.90)
Pulling the estimates (8.87) and (8.90) together, we find, for any β ∈ (β ∗ , 4), lim inf k→∞
1 (N (k)) (N (k)) Ᏺ µ ≥ βe µ −s µ . β N2 (k)
Recalling Propositions 8.1 and 8.2, and finally using Lemma 8.7, we have ν dB | µ Ᏺβ (B) ≥ F (β). βe µ − s µ = P (R2 )
(8.91)
(8.92)
By (8.92) and (8.91), the proof of Proposition 8.9 is complete. We remark that, when β < 0, Proposition 8.9 can be proved without Lp estimates. Indeed, when β < 0, then (8.91) follows already, with (8.87) replaced by 1 1 (N (k)) lim sup µ2 µ2 ln |x − y| = e µ , ln |x − y| ≤ 2 k→∞ 2
(8.93)
which holds by the weak upper semicontinuity of ln |x −y| and the weak convergence (N) of µ2 in the sense of measures (see [37] and [41]). Also, the entropy estimate in (N) the proof of Proposition 8.9 holds by just such weak convergence of µ2 . However, without Lp estimates one loses the a priori information on the regularity of the solutions of (8.28). We now prove our main existence theorem. Proof of Theorem 8.4. Combining Propositions 8.8 and 8.9, we conclude that 1 (N) (N) Ᏺ µ = F (β). N→∞ N 2 β lim
(8.94)
Recalling (8.91) and (8.92), we see that (8.94) implies ν dB µ Ᏺβ (B) = F (β)
(8.95)
P (R2 )
for every limit point µ of µ(N) . Equation (8.95) in turn implies that the decomposition measure ν(dB | µ ) is concentrated on the minimizers of Ᏺβ (B); for otherwise we would have ν dB µ Ᏺβ (B) > F (β), P (R2 )
by Lemma 8.7, which contradicts (8.95). The proof of Theorem 8.4 is complete. We now are also in the position to vindicate Remark 8.5. By the tightness and weak Lp compactness, the sequence {µ(N) , N = 1, 2, . . . } is a union of weakly convergent subsequences in Lp . If the minimizer Bβ is unique, the set of limit points of {µ(N) , N ∈ N} consists of the single product measure µ = Bβ⊗N .
9. Proof of uniqueness Theorem 2.2 for K ≤ 0. We conclude this paper with a proof of Theorem 2.2. We do this by proving the dual version, that is, uniqueness of solutions of (8.28) when $\beta < 0$.
Theorem 9.1. For $\beta < 0$, the solution $\rho_{\beta,H}$ of the fixed point equation (8.28) is unique.
Proof. We introduce operator notation for (8.28); thus
$$\rho = \mathcal{P}(\rho), \tag{9.1}$$
where $\mathcal{P}$ indicates that the right side is a probability density over $\mathbb{R}^2$. Now assume that for given $\beta < 0$ and $H$ entire harmonic, two solutions of (9.1) exist, say, $\rho^{(1)}$ and $\rho^{(2)}$. Then $\rho_{2,1} \equiv \rho^{(2)} - \rho^{(1)} \in H_0^{-1}(\mathbb{R}^2)$. In particular, $\int_{\mathbb{R}^2} \rho_{2,1}\,dx = 0$, and
$$-\int_{\mathbb{R}^2}\int_{\mathbb{R}^2} \rho_{2,1}(x)\,\ln|x-y|\,\rho_{2,1}(y)\,dx\,dy \ge 0, \tag{9.2}$$
with equality holding if and only if $\rho_{2,1} \equiv 0$ (cf. [58]). For $\lambda \in [0,1]$, we define the interpolation density $\rho_\lambda = \rho^{(1)} + \lambda\rho_{2,1}$. Expected value with respect to $\mathcal{P}(\rho_\lambda)$ is denoted by
$$\langle f\rangle(\lambda) = \int_{\mathbb{R}^2} f(x)\,\mathcal{P}(\rho_\lambda)(x)\,dx. \tag{9.3}$$
We use (8.28) for one of the $\rho_{2,1}$ in the left-hand side of (9.2) and, with the abbreviation
$$U_{2,1}(x) = \int_{\mathbb{R}^2} \ln|x-y|\,\rho_{2,1}(y)\,dy, \tag{9.4}$$
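find that
$$\begin{aligned}
-\int_{\mathbb{R}^2}\int_{\mathbb{R}^2} \rho_{2,1}(x)\,\ln|x-y|\,\rho_{2,1}(y)\,dx\,dy
&= -\int_{\mathbb{R}^2} U_{2,1}(x)\,\bigl(\mathcal{P}(\rho^{(2)}) - \mathcal{P}(\rho^{(1)})\bigr)(x)\,dx \\
&= -\int_{\mathbb{R}^2} U_{2,1}(x) \int_0^1 \frac{d}{d\lambda}\,\mathcal{P}(\rho_\lambda)(x)\,d\lambda\,dx \\
&= \beta \int_0^1 \Bigl(\bigl\langle U_{2,1}^2\bigr\rangle(\lambda) - \bigl\langle U_{2,1}\bigr\rangle^2(\lambda)\Bigr)\,d\lambda.
\end{aligned} \tag{9.5}$$
Since $\beta < 0$, the last term in (9.5) is less than or equal to zero, and it vanishes if and only if $U_{2,1} \equiv \text{const}$. By (9.5) and (9.2), we conclude that $\rho_{2,1} \equiv 0$. Uniqueness is proved.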
Corollary 9.2. If β < 0, H ≡ const, and ϒ is radially symmetric, then the unique solution ρβ,H of (8.28) is radially symmetric as well. Gauss's theorem then implies that the corresponding solution of (1.2), UH,κ given in (8.30), is radially decreasing.
The proof of Corollary 9.2 is trivial. Theorem 9.1 and Corollary 9.2 prove Theorem 2.2.

References
[1] L. V. Ahlfors, An extension of Schwarz's lemma, Trans. Amer. Math. Soc. 43 (1938), 359–364.
[2] T. Aubin, Meilleures constantes dans le théorème d'inclusion de Sobolev et un théorème de Fredholm non linéaire pour la transformation conforme de la courbure scalaire, J. Funct. Anal. 32 (1979), 148–174.
[3] P. Aviles, Conformal complete metrics with prescribed nonnegative Gaussian curvature in $\mathbb{R}^2$, Invent. Math. 83 (1986), 519–544.
[4] W. Beckner, Sharp Sobolev inequalities on the sphere and the Moser-Trudinger inequality, Ann. of Math. (2) 138 (1993), 213–242.
[5] P. Billingsley, Convergence of Probability Measures, J. Wiley and Sons, New York, 1968.
[6] T. Branson, Group representations arising from Lorentz conformal geometry, J. Funct. Anal. 74 (1987), 199–291.
[7] H. Brezis and F. Merle, Uniform estimates and blow-up behavior for solutions of $-\Delta u = V(x)e^u$ in two dimensions, Comm. Partial Differential Equations 16 (1991), 1223–1253.
[8] E. Caglioti, P. L. Lions, C. Marchioro, and M. Pulvirenti, A special class of stationary flows for two-dimensional Euler equations: A statistical mechanics description, Comm. Math. Phys. 143 (1992), 501–525.
[9] E. Carlen and M. Loss, Competing symmetries, the logarithmic HLS inequality, and Onofri's inequality on $S^n$, Geom. Funct. Anal. 2 (1992), 90–104.
[10] S.-Y. A. Chang and P. Yang, Prescribing Gaussian curvature on $S^2$, Acta Math. 159 (1987), 215–259.
[11] S.-Y. A. Chang and P. Yang, Conformal deformation of metrics on $S^2$, J. Differential Geom. 27 (1988), 259–296.
[12] S. Chanillo and M. K.-H. Kiessling, Rotational symmetry of solutions of some nonlinear problems in statistical mechanics and geometry, Comm. Math. Phys. 160 (1994), 217–238.
[13] S. Chanillo and M. K.-H. Kiessling, Conformally invariant systems of nonlinear PDE of Liouville type, Geom. Funct. Anal. 5 (1995), 924–947.
[14] S. Chanillo and Y. Y. Li, Continuity of solutions of uniformly elliptic equations in $\mathbb{R}^2$, Manuscripta Math. 77 (1992), 415–433.
[15] W. Chen and C. Li, Classification of solutions of some nonlinear elliptic equations, Duke Math. J. 63 (1991), 615–622.
[16] W. Chen and C. Li, Qualitative properties of solutions to some nonlinear elliptic equations in $\mathbb{R}^2$, Duke Math. J. 71 (1993), 427–439.
[17] K.-S. Cheng and C.-S. Lin, On the asymptotic behavior of solutions of the conformal Gaussian curvature equations in $\mathbb{R}^2$, Math. Ann. 308 (1997), 119–139.
[18] K.-S. Cheng and C.-S. Lin, Compactness of conformal metrics with positive Gaussian curvature in $\mathbb{R}^2$, Ann. Scuola Norm. Sup. Pisa Cl. Sci. (4) 26 (1998), 31–45.
[19] K.-S. Cheng and C.-S. Lin, On the conformal Gaussian curvature equations in $\mathbb{R}^2$, J. Differential Equations 146 (1998), 226–250.
[20] K.-S. Cheng and C.-S. Lin, Conformal metrics in $\mathbb{R}^2$ with prescribed Gaussian curvature with positive total curvature, Nonlinear Anal. 38 (1999), 775–783.
[21] K.-S. Cheng and W.-M. Ni, On the structure of the conformal Gaussian curvature equation on $\mathbb{R}^2$, Duke Math. J. 62 (1991), 721–737.
[22] K.-S. Cheng and W.-M. Ni, On the structure of the conformal Gaussian curvature equation on $\mathbb{R}^2$, II, Math. Ann. 290 (1991), 671–680.
[23] K. S. Chou and T. Y. H. Wan, Asymptotic radial symmetry for solutions of $\Delta u + e^u = 0$ in a punctured disc, Pacific J. Math. 163 (1994), 269–276.
[24] P. Diaconis, "Recent progress on de Finetti's notions of exchangeability" in Bayesian Statistics, 3 (Valencia, 1987), Oxford Sci. Publ., Oxford Univ. Press, New York, 1988, 111–125.
[25] P. Diaconis and D. Freedman, A dozen de Finetti-style results in search of a theory, Ann. Inst. H. Poincaré Probab. Statist. 23 (1987), 397–423.
[26] W. Ding, J. Jost, J. Li, and G. Wang, The differential equation $\Delta u = 8\pi - 8\pi h e^u$ on a compact Riemann surface, Asian J. Math. 1 (1997), 230–248.
[27] E. B. Dynkin, Classes of equivalent random quantities (in Russian), Uspekhi Mat. Nauk. 8 (1953), 125–130.
[28] R. S. Ellis, Entropy, Large Deviations, and Statistical Mechanics, Grundlehren Math. Wiss. 271, Springer, New York, 1985.
[29] B. de Finetti, Funzione caratteristica di un fenomeno aleatorio, Atti Accad. Naz. Lincei Mem. Cl. Sci. Fis. Mat. Natur. 4 (6) (1931), 251–300.
[30] B. Gidas, W.-M. Ni, and L. Nirenberg, Symmetry and related properties via the maximum principle, Comm. Math. Phys. 68 (1979), 209–243.
[31] D. Gilbarg and N. S. Trudinger, Elliptic Partial Differential Equations of Second Order, 2d ed., Grundlehren Math. Wiss. 224, Springer, New York, 1983.
[32] J. Glimm and A. Jaffe, Quantum Physics, 2d ed., Springer, New York, 1987.
[33] Z.-C. Han, Prescribing Gaussian curvature on $S^2$, Duke Math. J. 61 (1990), 679–703.
[34] G. Hardy, J. E. Littlewood, and G. Pólya, Inequalities, Cambridge Math. Lib., Cambridge Univ. Press, Cambridge, 1988.
[35] E. Hewitt and L. J. Savage, Symmetric measures on Cartesian products, Trans. Amer. Math. Soc. 80 (1955), 470–501.
[36] J. L. Kazdan and F. W. Warner, Curvature functions for compact 2-manifolds, Ann. of Math. (2) 99 (1974), 14–47.
[37] M. K.-H. Kiessling, Statistical mechanics of classical particles with logarithmic interactions, Comm. Pure Appl. Math. 46 (1993), 27–56.
[38] M. K.-H. Kiessling, Statistical mechanics approach to some problems in conformal geometry, Phys. A 79 (2000), 353–368.
[39] M. K.-H. Kiessling and J. L. Lebowitz, Dissipative stationary plasmas: Kinetic modeling, Bennett's pinch and generalizations, Phys. Plasmas 1 (1994), 1841–1849.
[40] M. K.-H. Kiessling and J. L. Lebowitz, The micro-canonical point vortex ensemble: Beyond equivalence, Lett. Math. Phys. 42 (1997), 43–58.
[41] M. K.-H. Kiessling and H. Spohn, A note on the eigenvalue density of random matrices, Comm. Math. Phys. 199 (1999), 683–695.
[42] J. Liouville, Sur l'équation aux différences partielles $\partial^2 \log\lambda/\partial u\,\partial v \pm \lambda/2a^2 = 0$, J. Math. Pures Appl. 18 (1853), 71–72.
[43] C.-M. Li, Monotonicity and symmetry of solutions of fully nonlinear elliptic equations on unbounded domains, Comm. Partial Differential Equations 16 (1991), 585–615.
[44] Y.-Y. Li, Prescribing scalar curvature on $S^n$ and related problems, I, J. Differential Equations 120 (1995), 319–410.
[45] Li Ma, Bifurcation in Nirenberg's problem, C. R. Acad. Sci. Paris Sér. I Math. 326 (1998), 583–588.
[46] R. C. McOwen, Conformal metrics in $\mathbb{R}^2$ with prescribed Gaussian curvature and positive total curvature, Indiana Univ. Math. J. 34 (1985), 97–104.
[47] J. Moser, A sharp form of an inequality by N. Trudinger, Indiana Univ. Math. J. 20 (1971), 1077–1092.
[48] J. Moser, "On a nonlinear problem in differential geometry" in Dynamical Systems (Proc. Sympos., Univ. Bahia, Salvador, 1971), Academic Press, New York, 1973, 273–280.
[49] W. M. Ni, On the elliptic equation $\Delta u + Ke^{2u} = 0$ and conformal metrics with prescribed Gaussian curvatures, Invent. Math. 66 (1982), 343–352.
[50] M. Obata, The conjectures on conformal transformations of Riemannian manifolds, J. Differential Geometry 6 (1971), 247–258.
[51] E. Onofri, On the positivity of the effective action in a theory of random surfaces, Comm. Math. Phys. 86 (1982), 321–326.
[52] R. Osserman, On the inequality $\Delta u \ge f(u)$, Pacific J. Math. 7 (1957), 1641–1647.
[53] H. Poincaré, Les fonctions fuchsiennes et l'équation $\Delta u = e^u$, J. Math. Pures Appl. 40 (1898), 137–23.
[54] J. Prajapat and G. Tarantello, On a class of elliptic problems in $\mathbb{R}^2$: Symmetry and uniqueness results, to appear in Proc. Roy. Soc. Edinburgh.
[55] D. W. Robinson and D. Ruelle, Mean entropy of states in classical statistical mechanics, Comm. Math. Phys. 5 (1967), 288–300.
[56] R. T. Rockafellar, Convex Analysis, Princeton Math. Ser. 28, Princeton Univ. Press, Princeton, N.J., 1970.
[57] D. Ruelle, Statistical Mechanics: Rigorous Results, Addison-Wesley, Redwood City, Calif., 1989.
[58] E. B. Saff and V. Totik, Logarithmic Potentials with External Fields, Grundlehren Math. Wiss. 316, Springer, Berlin, 1997.
[59] D. H. Sattinger, Conformal metrics in $\mathbb{R}^2$ with prescribed curvature, Indiana Univ. Math. J. 22 (1972), 1–4.
[60] G. Tarantello, Vortex-condensations of a non-relativistic Maxwell-Chern-Simons theory, J. Differential Equations 141 (1997), 295–309.
[61] H. Wittich, Ganze Lösungen der Differentialgleichung $\Delta u = e^u$, Math. Z. 49 (1944), 579–582.
Department of Mathematics, Rutgers University, Piscataway, New Jersey 08854, USA; [email protected]; [email protected]
Vol. 105, No. 3
DUKE MATHEMATICAL JOURNAL
© 2000
SECOND-ORDER DERIVATIVES AND REARRANGEMENTS
ANDREA CIANCHI
1. Introduction. Let $l > 0$ and $1 \le p \le \infty$. Given any nonnegative function $u$ from the Sobolev space $W^{1,p}(0,l)$, the decreasing rearrangement $u^*$ of $u$ is also in $W^{1,p}(0,l)$ and
$$\bigl\|{u^*}'\bigr\|_{L^p(0,l)} \le \|u'\|_{L^p(0,l)}. \tag{1.1}$$
Here, prime stands for derivative. The above statement is a variant of the so-called Pólya-Szegö principle. In its original (one-dimensional) formulation, such a principle tells us that for every nonnegative function $u \in W^{1,p}(0,l)$ vanishing at zero and $l$, the symmetric rearrangement $u^\bigstar$ of $u$ is in $W^{1,p}(0,l)$ and
$$\bigl\|{u^\bigstar}'\bigr\|_{L^p(0,l)} \le \|u'\|_{L^p(0,l)}. \tag{1.2}$$
Inequality (1.2) is classical and very well known. It has been the object of extensions and variants which can be found in a number of papers and monographs, including [AFLT], [Bae], [BBMP], [BH], [Bro], [BZ], [CP], [E], [H], [Ka2], [Kl], [Ma], [Sp1], [Sp2], [Spi], [Ta1], [Ta5], and [Ta6]. Even if not as popular as (1.2), inequality (1.1) has also been known for a long time, and versions of it appear in [Ci1], [Du], [Ga], [Ka1], [Ma], [RT], and [Ry]. Pólya-Szegö-type inequalities have various applications. In particular, they have proved to be crucial for such results as isoperimetric inequalities of mathematical physics (see [PS] and [Ta4]) and Sobolev-type inequalities in optimal form (see, e.g., [AFT], [A], [Ci2], [CF], [EKP], [L], [Ko], [Mo], [Ta1], and [Ta5]). They are also closely connected to a priori sharp estimates for solutions to second-order elliptic and parabolic boundary value problems (see [ALT], [Ba], [Di], [Ka1], [Ke], [Ta2], and [Ta3]). On the other hand, the effect of rearrangements on Dirichlet-type functionals depending on higher-order derivatives seems to be still unknown. The present paper aims to contribute to this subject. Indeed, we are concerned with inequalities in the spirit of (1.1) and (1.2) involving second-order derivatives.
An evident obstacle in attacking this question is that very smooth functions may have a decreasing (and symmetric) rearrangement whose first-order derivative is not even weakly differentiable (see Section 2, Remark 3, for an example). Thus, unlike membership in $W^{1,p}(0,l)$, membership of a function in the second-order Sobolev space $W^{2,p}(0,l)$ need not be preserved after rearranging it in decreasing order. This shortcoming can be overcome by enlarging the class of admissible functions. Actually, our main result (Section 2, Theorem 1) states that the space of those functions whose second-order distributional derivative is a measure with finite total variation in $(0,l)$ is closed under the operation of decreasing rearrangement, and that a Pólya-Szegö-type principle holds for such a variation. A counterpart of Theorem 1 for functions vanishing at the endpoints of their domain is proved in Section 3. Interestingly, a new phenomenon of loss of symmetry comes into play here.

Received 3 June 1999. Revision received 29 December 1999.
2000 Mathematics Subject Classification. Primary 26D10, 26A45, 46E35.

2. A Pólya-Szegö principle for the second-order derivative. We begin by recalling a few definitions and basic properties about rearrangements and about function spaces involved in our discussion. Given any measurable nonnegative function $u$ in $[0,l]$, the distribution function $\mu_u$ of $u$ is defined as
$$\mu_u(t) = \mathcal{L}^1\bigl(\{x \in [0,l] : u(x) > t\}\bigr) \quad \text{for } t \ge 0. \tag{2.1}$$
Here and in what follows, $\mathcal{L}^1$ denotes Lebesgue measure in $\mathbb{R}$. Functions having the same distribution function are called equimeasurable or equidistributed. The decreasing rearrangement $u^*$ of $u$ is the decreasing, right-continuous function in $[0,l]$ which is equidistributed with $u$. Namely,
$$u^*(s) = \inf\{t \ge 0 : \mu_u(t) \le s\} \quad \text{for } s \in [0,l]. \tag{2.2}$$
The increasing rearrangement of $u$ is denoted by $u_*$ and defined accordingly. We call symmetric rearrangement $u^\bigstar$ of $u$ the function in $[0,l]$ which is equidistributed with $u$ and is symmetric about $l/2$. In formulas,
$$u^\bigstar(s) = u^*\Bigl(2\Bigl|s - \frac{l}{2}\Bigr|\Bigr) \quad \text{for } s \in [0,l]. \tag{2.3}$$
Notice that, unlike common usage, here $u^\bigstar$ is defined in $[0,l]$ instead of $[-l/2, l/2]$. A function $u \in L^1(0,l)$ is called of bounded variation if its distributional derivative is a (signed) Radon measure $Du$ whose total variation $\|Du\|$ is finite in $(0,l)$. The space of all functions of bounded variation in $(0,l)$ will be denoted by $\mathrm{BV}(0,l)$. Recall that
$$\|Du\|(0,l) = \sup\Bigl\{\int_0^l u\varphi'\,dx : \varphi \in C_0^\infty(0,l),\ |\varphi(x)| \le 1 \text{ for } x \in (0,l)\Bigr\} \tag{2.4}$$
for every $u \in \mathrm{BV}(0,l)$. Moreover, if we make use of equation (2.4) as a definition for the left-hand side when $u$ is merely in $L^1(0,l)$, we have that $u \in \mathrm{BV}(0,l)$ if and only if $\|Du\|(0,l) < \infty$. It is well known that, if $u \in \mathrm{BV}(0,l)$, then
$$\|Du\|(0,l) = \sup\Bigl\{\sum_{i=1}^{n-1} |u(x_{i+1}) - u(x_i)| : 0 < x_1 < \cdots < x_n < l \text{ are points of approximate continuity of } u\Bigr\} \tag{2.5}$$
(see, e.g., [Z]). Recall that a point $\bar x \in [0,l]$ is called of approximate continuity for $u$ if
$$\lim_{r\to 0^+} \frac{\mathcal{L}^1\bigl(\{x \in [0,l] : |u(x) - u(\bar x)| > \varepsilon\} \cap \{x : |x - \bar x| < r\}\bigr)}{2r} = 0 \quad \text{for every } \varepsilon > 0,$$
where $\mathcal{L}^1$ is Lebesgue measure, and that a function $u \in \mathrm{BV}(0,l)$ is approximately continuous at a.e. point in $[0,l]$. The Sobolev space $W^{1,p}(0,l)$, $1 \le p \le \infty$, is the space of those functions from $L^p(0,l)$ whose distributional derivative is absolutely continuous with respect to the Lebesgue measure and has a density $u'$, the weak derivative of $u$, in $L^p(0,l)$. Note that $W^{1,1}(0,l) \subset \mathrm{BV}(0,l)$ and that $\|Du\|(0,l) = \int_0^l |u'|\,dx$ for every $u \in W^{1,1}(0,l)$. Every function $u \in W^{1,1}(0,l)$ agrees, up to a set of Lebesgue measure zero, with a function from $AC(0,l)$, the space of absolutely continuous functions in $(0,l)$. In what follows, we always identify $u$ with such a representative. Spaces of functions endowed with higher-order derivatives are defined analogously. In particular, we shall deal with $W^{2,p}(0,l) = \{u \in W^{1,p}(0,l) : u' \in W^{1,p}(0,l)\}$, $1 \le p \le \infty$, and with
$$\mathrm{BV}^2(0,l) = \{u \in W^{1,1}(0,l) : u' \in \mathrm{BV}(0,l)\}.$$
If $u \in \mathrm{BV}^2(0,l)$, then $Du' = D^2u$, the second-order distributional derivative of $u$. Hence, by (2.5),
$$\|D^2u\|(0,l) = \sup\Bigl\{\sum_{i=1}^{n-1} |u'(x_{i+1}) - u'(x_i)| : 0 < x_1 < \cdots < x_n < l \text{ are points of approximate continuity of } u'\Bigr\} \tag{2.6}$$
for $u \in \mathrm{BV}^2(0,l)$. Now we are in a position to state our second-order Pólya-Szegö-type principle.
Theorem 1. Let $u$ be a nonnegative function from $\mathrm{BV}^2(0,l)$. Then $u^* \in \mathrm{BV}^2(0,l)$ and
$$\|D^2u^*\|(0,l) \le \|D^2u\|(0,l). \tag{2.7}$$
Remark 1. In the case when $u$ may change sign, $u^*$ can be defined on taking $|u|$ instead of $u$ on the right-hand side of (2.2). With such a definition, inequality (1.1) is easily seen to be true for every $u \in W^{1,p}(0,l)$. As far as second-order derivatives are concerned, it is known that if $u \in \mathrm{BV}^2(0,l)$, then $|u| \in \mathrm{BV}^2(0,l)$ (see [Sa]), whence $u^* \in \mathrm{BV}^2(0,l)$. Nevertheless, inequality (2.7) can be false if $u$ changes sign, as is readily seen by considering an affine function $w$ satisfying $w(0)w(l) < 0$ and $-w(0) \ne w(l)$. On the other hand, inequality (2.7) obviously still holds if $u^*$ denotes the signed rearrangement of $u$, that is, if the infimum in (2.2) is taken among all $t \in \mathbb{R}$.
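As an illustration of Theorem 1, the following is a minimal sketch (not part of the paper) that checks inequality (2.7) on one piecewise affine function. It assumes the function has no flat zones and is given by its vertices; the helper names `slopes`, `jump_variation`, and `rearranged_slopes` are hypothetical. For such a function, $\|D^2w\|(0,l)$ is the sum of the absolute jumps of $w'$ at interior corners, and $w^*$ has slope $-d/L$ on each level band of height $d$, where $L$ is the total length of the set where $w$ takes values in that band.

```python
from fractions import Fraction

def slopes(xs, ys):
    """Slopes of the piecewise affine interpolant through the vertices (xs[i], ys[i])."""
    return [Fraction(y1 - y0) / (x1 - x0)
            for x0, x1, y0, y1 in zip(xs, xs[1:], ys, ys[1:])]

def jump_variation(sl):
    """Sum of |jumps| of a piecewise constant derivative, i.e. ||D^2 w||(0, l)."""
    return sum(abs(b - a) for a, b in zip(sl, sl[1:]))

def rearranged_slopes(xs, ys):
    """Slopes of the decreasing rearrangement w*, band by band (assumes no flat zones)."""
    levels = sorted(set(ys), reverse=True)
    out = []
    for hi, lo in zip(levels, levels[1:]):
        length = Fraction(0)
        for x0, x1, y0, y1 in zip(xs, xs[1:], ys, ys[1:]):
            a, b = min(y0, y1), max(y0, y1)
            # All vertex values are levels, so a segment either spans the band or misses it.
            if a <= lo and hi <= b:
                length += Fraction(x1 - x0) * (hi - lo) / (b - a)
        out.append(-Fraction(hi - lo) / length)
    return out

# Example: w(0) = 3, local minimum 1 at x = 1, local maximum 2 at x = 2, w(4) = 0.
xs, ys = [0, 1, 2, 4], [3, 1, 2, 0]
var_w = jump_variation(slopes(xs, ys))
var_w_star = jump_variation(rearranged_slopes(xs, ys))
print(var_w, var_w_star)
```

In this example the second-order variation drops from 5 to 11/5 after rearrangement, consistent with (2.7). Exact rational arithmetic is used so that the comparison is not affected by rounding.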
Remark 2. With inequality (1.1) in mind, the question can be raised of whether a version of Theorem 1 holds for more general Dirichlet-type functionals depending on $D^2u$ than just $\|D^2u\|(0,l)$. However, counterexamples of convex sublinear functionals $J$ (defined on measures, e.g., as in [DT] and [Te]) and of functions $u \in \mathrm{BV}^2(0,l)$ can be found such that $J(D^2u^*) > J(D^2u)$. See also [Ka1, Remark 2.23a] on this point.
Remark 3. It is not difficult to exhibit smooth functions, for example, polynomials, whose decreasing rearrangement has a nonsmooth first-order derivative. On taking into account, for instance, $u : [0,2] \to [0,\infty)$, $u(x) = 3 + (x - 1/2)^2 - (x - 1/2)^4$, one may verify that $D^2u^*$ has a singular part with respect to $\mathcal{L}^1$ due to two jumps of ${u^*}'$. One of them is a consequence of the fact that $u(0)$ belongs to the interior of the range of $u$; the other jump is caused by a critical level of $u$ lying in the interior of the same range. In fact, these can be shown to be the only circumstances that may cause singularities in $D^2u^*$ for a Sobolev function. Indeed, arguments related to those of [LS] enable one to show that if $u$ is a nonnegative function from $W^{2,p}(0,l)$, $1 \le p \le \infty$, and $s_1$ and $s_2$ are points in $[0,l]$, with $s_1 < s_2$, such that neither $u(0)$, nor $u(l)$, nor any critical level of $u$ belongs to the interval $[u^*(s_2), u^*(s_1)]$, then ${u^*}' \in AC(s_1, s_2)$ and
$${u^*}''(s) = \Biggl(\sum_{x \in u^{-1}(\{u^*(s)\})} \frac{1}{|u'(x)|}\Biggr)^{-3} \sum_{x \in u^{-1}(\{u^*(s)\})} \frac{u''(x)}{|u'(x)|^3} \tag{2.8}$$
for a.e. $s \in (s_1, s_2)$. If, in addition, $u \in C^2(u^{-1}([u^*(s_2), u^*(s_1)]))$, then $u^* \in C^2(s_1, s_2)$ and equation (2.8) holds for every $s \in (s_1, s_2)$. Moreover, following a heuristic argument from [Du], formula (2.8) can be used to show that
$$\int_{s_1}^{s_2} \bigl|{u^*}''(s)\bigr|^p\,ds \le \int_{u^{-1}([u^*(s_2), u^*(s_1)])} |u''(x)|^p\,dx,$$
with the usual modification if $p = \infty$.
We now come to the proof of Theorem 1. Our approach relies on an approximation argument that enables us to reduce the proof of inequality (2.7) for a general $u \in \mathrm{BV}^2(0,l)$ to the proof of the same inequality for piecewise affine functions. Even for this special class of functions, inequality (2.7) is still quite delicate. Compared with (1.1), the main difficulty is that, whereas inequality (1.1) has a local nature, in the sense that
$$\int_{\{s\,:\,t_1 < u^*(s) < t_2\}} \bigl|{u^*}'\bigr|^p\,ds \le \int_{\{x\,:\,t_1 < u(x) < t_2\}} |u'|^p\,dx$$
for every pair of levels $0 \le t_1 \le t_2$, inequality (2.7) is true only for the total variation of $D^2u$ in the whole of $(0,l)$. (Incidentally, let us note that the counterexamples we
alluded to in Remark 2 are really related to the global character of inequality (2.7).) The proof of inequality (2.7) for piecewise affine functions is the task of Lemma 4 and is accomplished by passing from a general piecewise affine function to its decreasing rearrangement through a finite sequence of intermediate functions. Each of these functions is constructed by means of a local rearrangement in such a way that the total variation in $(0,l)$ of the second-order derivative is not increased. This procedure allows us to restrict our attention to two basic cases of piecewise affine functions having either exactly one or two extremum points in $(0,l)$. These cases are dealt with in Lemmas 2 and 3, respectively. Their proofs rest on an algebraic inequality between differences of positive numbers and differences of their harmonic means which is established in the following lemma.
Lemma 1. Let $a_{i,k} \in \mathbb{R}^+$ for $i = 1, \dots, m$ and $k = 1, \dots, n$. Then
$$\frac{1}{\sum_{k=1}^n a_{m,k}} - \frac{1}{\sum_{k=1}^n a_{1,k}} + \sum_{i=1}^{m-1} \Biggl|\frac{1}{\sum_{k=1}^n a_{i,k}} - \frac{1}{\sum_{k=1}^n a_{i+1,k}}\Biggr| \le \sum_{k=1}^n \Bigl(\frac{1}{a_{m,k}} - \frac{1}{a_{1,k}}\Bigr) + \sum_{i=1}^{m-1} \sum_{k=1}^n \Bigl|\frac{1}{a_{i,k}} - \frac{1}{a_{i+1,k}}\Bigr|. \tag{2.9}$$
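Inequality (2.9) can be sanity-checked numerically before reading its proof. The sketch below is an illustration only, not part of the paper: the helper names `signed_plus_total_variation` and `lemma1_sides` are hypothetical, the entries $a_{i,k}$ are stored as a list of $m$ rows of length $n$, and exact rational arithmetic is used. Both sides of (2.9) have the same shape (last term minus first term, plus the sum of absolute consecutive differences), applied either to the reciprocals of the row sums or, column by column, to the reciprocals of the entries.

```python
import random
from fractions import Fraction

def signed_plus_total_variation(seq):
    """seq[-1] - seq[0] + sum_i |seq[i] - seq[i+1]| for a finite sequence."""
    return seq[-1] - seq[0] + sum(abs(p - q) for p, q in zip(seq, seq[1:]))

def lemma1_sides(a):
    """Return (lhs, rhs) of (2.9) for a[i][k] > 0, i = rows (m of them), k = columns (n of them)."""
    m, n = len(a), len(a[0])
    lhs = signed_plus_total_variation([1 / Fraction(sum(row)) for row in a])
    rhs = sum(
        signed_plus_total_variation([1 / Fraction(a[i][k]) for i in range(m)])
        for k in range(n)
    )
    return lhs, rhs

def random_check(trials=200, seed=0):
    """Check lhs <= rhs on random positive integer matrices (a spot check, not a proof)."""
    rng = random.Random(seed)
    for _ in range(trials):
        m, n = rng.randint(2, 6), rng.randint(1, 4)
        a = [[rng.randint(1, 20) for _ in range(n)] for _ in range(m)]
        lhs, rhs = lemma1_sides(a)
        if lhs > rhs:
            return False
    return True

print(lemma1_sides([[1, 10], [2, 1]]))
print(random_check())
```

For the matrix with rows $(1, 10)$ and $(2, 1)$, the two sides are $16/33$ and $9/5$. A random spot check of this kind cannot replace the proof, but it is a quick way to catch a mis-transcribed sign or index in (2.9).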
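Proof. Let us denote by $i_1$ the first index greater than or equal to 1 such that
$$\sum_{k=1}^n a_{i_1,k} > \sum_{k=1}^n a_{i_1+1,k},$$
and by $i_2$ the first index greater than $i_1$ such that
$$\sum_{k=1}^n a_{i_2,k} < \sum_{k=1}^n a_{i_2+1,k};$$
and let us define $i_3 < i_4 < \cdots < i_h \le m - 1$ accordingly. Note that the left-hand side of (2.9) vanishes if $i_1$ does not exist, so that (2.9) trivially holds by the triangle inequality. Let us assume that $i_1$ exists. Then
$$\begin{aligned}
&\frac{1}{\sum_{k=1}^n a_{m,k}} - \frac{1}{\sum_{k=1}^n a_{1,k}} + \sum_{i=1}^{m-1} \Biggl|\frac{1}{\sum_{k=1}^n a_{i,k}} - \frac{1}{\sum_{k=1}^n a_{i+1,k}}\Biggr| \\
&\quad = 2\Biggl(\frac{1}{\sum_{k=1}^n a_{i_2,k}} - \frac{1}{\sum_{k=1}^n a_{i_1,k}}\Biggr) + 2\Biggl(\frac{1}{\sum_{k=1}^n a_{i_4,k}} - \frac{1}{\sum_{k=1}^n a_{i_3,k}}\Biggr) + \cdots \\
&\qquad + \begin{cases}
2\Biggl(\dfrac{1}{\sum_{k=1}^n a_{m,k}} - \dfrac{1}{\sum_{k=1}^n a_{i_h,k}}\Biggr) & \text{if } h \text{ is odd}, \\[2ex]
2\Biggl(\dfrac{1}{\sum_{k=1}^n a_{i_h,k}} - \dfrac{1}{\sum_{k=1}^n a_{i_{h-1},k}}\Biggr) & \text{if } h \text{ is even}.
\end{cases}
\end{aligned} \tag{2.10}$$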
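By the triangle inequality, we can estimate the terms on the right-hand side of (2.9) as
$$\begin{aligned}
\sum_{i=1}^{i_1-1} \sum_{k=1}^n \Bigl|\frac{1}{a_{i,k}} - \frac{1}{a_{i+1,k}}\Bigr| &\ge \sum_{k=1}^n \Bigl(\frac{1}{a_{1,k}} - \frac{1}{a_{i_1,k}}\Bigr), \\
\sum_{i=i_1}^{i_2-1} \sum_{k=1}^n \Bigl|\frac{1}{a_{i,k}} - \frac{1}{a_{i+1,k}}\Bigr| &\ge \sum_{k=1}^n \Bigl|\frac{1}{a_{i_2,k}} - \frac{1}{a_{i_1,k}}\Bigr|, \\
\sum_{i=i_2}^{i_3-1} \sum_{k=1}^n \Bigl|\frac{1}{a_{i,k}} - \frac{1}{a_{i+1,k}}\Bigr| &\ge \sum_{k=1}^n \Bigl(\frac{1}{a_{i_2,k}} - \frac{1}{a_{i_3,k}}\Bigr), \\
&\ \,\vdots \\
\sum_{i=i_h}^{m-1} \sum_{k=1}^n \Bigl|\frac{1}{a_{i,k}} - \frac{1}{a_{i+1,k}}\Bigr| &\ge \begin{cases}
\displaystyle\sum_{k=1}^n \Bigl|\frac{1}{a_{i_h,k}} - \frac{1}{a_{m,k}}\Bigr| & \text{if } h \text{ is odd}, \\[2ex]
\displaystyle\sum_{k=1}^n \Bigl(\frac{1}{a_{i_h,k}} - \frac{1}{a_{m,k}}\Bigr) & \text{if } h \text{ is even}.
\end{cases}
\end{aligned} \tag{2.11}$$
From inequalities (2.11), we get
$$\begin{aligned}
&\sum_{k=1}^n \Bigl(\frac{1}{a_{m,k}} - \frac{1}{a_{1,k}}\Bigr) + \sum_{i=1}^{m-1} \sum_{k=1}^n \Bigl|\frac{1}{a_{i,k}} - \frac{1}{a_{i+1,k}}\Bigr| \\
&\quad \ge \sum_{k=1}^n \Bigl(\frac{1}{a_{i_2,k}} - \frac{1}{a_{i_1,k}} + \Bigl|\frac{1}{a_{i_2,k}} - \frac{1}{a_{i_1,k}}\Bigr|\Bigr) + \cdots \\
&\qquad + \begin{cases}
\displaystyle\sum_{k=1}^n \Bigl(\frac{1}{a_{m,k}} - \frac{1}{a_{i_h,k}} + \Bigl|\frac{1}{a_{m,k}} - \frac{1}{a_{i_h,k}}\Bigr|\Bigr) & \text{if } h \text{ is odd}, \\[2ex]
\displaystyle\sum_{k=1}^n \Bigl(\frac{1}{a_{i_h,k}} - \frac{1}{a_{i_{h-1},k}} + \Bigl|\frac{1}{a_{i_h,k}} - \frac{1}{a_{i_{h-1},k}}\Bigr|\Bigr) & \text{if } h \text{ is even}.
\end{cases}
\end{aligned} \tag{2.12}$$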
1 1 1 1 1 1 + ≥ 2 − − − n n aiq ,k aiq−1 ,k aiq ,k aiq−1 ,k k=1 aiq ,k k=1 aiq−1 ,k k=1 (2.13) for every even index q. In order to prove inequality (2.13), let us denote by kj ,
SECOND-ORDER DERIVATIVES AND REARRANGEMENTS
361
j = 1, . . . , jq , the values of the index k for which aiq−1 ,kj > aiq ,kj . One has n
1
k=1 aiq ,k
n 1 k=1 aiq−1 ,k − aiq ,k n − n = n k=1 aiq−1 ,k k=1 aiq ,k k=1 aiq−1 ,k jq j =1 aiq−1 ,kj − aiq ,kj n ≤ n k=1 aiq ,k k=1 aiq−1 ,k ≤
jq
aiq−1 ,kj − aiq ,kj j =1
(2.14)
aiq ,kj aiq−1 ,kj
jq
1 1 . = − aiq ,kj aiq−1 ,kj j =1
Since
1 1 − − + aiq ,k aiq−1 ,k aiq ,k aiq−1 ,k 1 1 2 if k = kj for some j = 1, . . . , kq , − aiq ,k aiq−1 ,k = 0 otherwise, 1
1
(2.15)
inequality (2.13) follows from (2.14). Obviously, when h is odd, an inequality like (2.13) holds with iq−1 and iq replaced by ih and m, respectively. Combining equation (2.10) and inequalities (2.12)–(2.13) yields (2.9). Lemma 2. Let w be a nonnegative piecewise affine function in [0, l]. Assume that there exists a point x ∈ (0, l) such that w is increasing in (0, x) and decreasing in (x, l), or vice versa. Then 2 ∗ D w (0, l) ≤ D 2 w (0, l).
(2.16)
Moreover, in the case when w(0) ≥ w(l), ∗ w (0+) ≤ w (0+) if w has a minimum at x, ∗ w (l−) ≤ w (l−) if w has a maximum at x;
(2.17)
362
ANDREA CIANCHI
in the case when w(0) ≤ w(l), ∗ w (0+) ≤ w (l−) if w has a minimum at x, ∗ w (l−) ≤ w (0+) if w has a maximum at x.
(2.18)
Proof. We may assume, without loss of generality, that $w$ has no flat zone, that is, that $w$ is not constant in any subinterval of $(0,l)$. Indeed, $w$ can be approximated, if necessary, by a sequence $\{w_n\}$ of piecewise affine functions having this property, in such a way that every quantity involving $w$ in the statement is the limit as $n$ goes to $+\infty$ of the same quantity evaluated at $w_n$. We shall carry out the proof in the case when $w(0) \ge w(l)$ and $w$ has a minimum in $(0,l)$, the other cases being analogous. Denote by $t_0 > t_1 > \cdots > t_m$ the levels at which the graph of $w$ has corners (i.e., where $w'$ jumps) and set $d_i = t_i - t_{i+1}$ for $i = 0, \dots, m-1$. Observe that, since $w$ has exactly one minimum point in $(0,l)$, for every $i = 0, \dots, m-1$ there exist either one or two intervals $I_{i,k}$, with $k = 1$ or $k = 1, 2$, respectively, where $w$ takes values between $t_{i+1}$ and $t_i$. For fixed $i$, the intervals $I_{i,k}$ will be ordered in $k$ starting from the left. Denote by $\ell_{i,k}$ the length of $I_{i,k}$. Thus,
$$\begin{aligned}
w' &\equiv -\frac{d_i}{\ell_{i,1}} && \text{in } I_{i,1}, \\
w' &\equiv \frac{d_i}{\ell_{i,2}} && \text{in } I_{i,2} \text{ (if the last interval exists)}
\end{aligned} \tag{2.19}$$
for $i = 0, \dots, m-1$. Now, let us distinguish two cases.
Case 1: $t_0 = w(0) = w(l)$. In this case, $k$ takes both values 1 and 2 for every $i$. It is easily seen that $w^*$ is the piecewise affine decreasing function whose derivative satisfies
$${w^*}'(s) \equiv -\frac{d_i}{\sum_{k=1}^2 \ell_{i,k}} \quad \text{if } s \in \Biggl(\sum_{j=0}^{i-1} \sum_{k=1}^2 \ell_{j,k},\ \sum_{j=0}^{i} \sum_{k=1}^2 \ell_{j,k}\Biggr) \tag{2.20}$$
for $i = 0, \dots, m-1$. Hereafter, sums have to be taken as zero whenever the upper limit of summation is smaller than the lower one (see Figure 1). We can now make use of equation (2.6) applied to $w$ and $w^*$ and get, by (2.19),
$$\|D^2w\|(0,l) = \sum_{i=0}^{m-2} \sum_{k=1}^2 \Bigl|\frac{d_i}{\ell_{i,k}} - \frac{d_{i+1}}{\ell_{i+1,k}}\Bigr| + \sum_{k=1}^2 \frac{d_{m-1}}{\ell_{m-1,k}} \tag{2.21}$$
[Figure 1. Graphs of $w$ and $w^*$ in Case 1, with the levels $t_0 > t_1 > \cdots > t_m$, the gaps $d_0, \dots, d_{m-1}$, and the interval lengths $\ell_{i,k}$ marked on the axes.]
and, by (2.20), m−2
2 ∗ di+1 di D w (0, l) = − 2 2 . $ $ k=1 i,k k=1 i+1,k i=0
(2.22)
One has d$ di+1 di i i+1,1 − di+1 $i,1 + di $i+1,2 − di+1 $i,2 − 2 = 2 2 2 k=1 $i,k k=1 $i+1,k k=1 $i,k k=1 $i+1,k di $i+1,1 − di+1 $i,1 di $i+1,2 − di+1 $i,2 + < $i,1 $i+1,1 $i,2 $i+1,2 2
di di+1 = − $ $i+1,k i,k k=1 (2.23) for i = 0, . . . , m − 2. Inequality (2.16) then follows from (2.21)–(2.23). The first of inequalities (2.17) is a consequence of the fact that w∗ (0+) = − 2
d0
k=1 $0,k
>−
d0 = w (0+). $0,1
364
ANDREA CIANCHI
Case 2: t0 = w(0) > w(l). Since w ∗ (s) = w(s) for every s such that w(s) > w(l), we may assume that t1 = w(l). Thus, the index k in $i,k takes only the value 1 if i = 0 and both values 1 and 2 for i = 1, . . . , m − 1. We have d 0 − $ 0,1 di w ∗ (s) ≡ − 2 k=1 $i,k
if s ∈ 0, $0,1 , 2 2 i−1 i
if s ∈ $0,1 + $j,k , $0,1 + $j,k j =1 k=1
j =1 k=1
for i = 1, . . . , m − 1. (2.24)
By (2.19), m−2 2 2 2
di di+1 dm−1 D w (0, l) = d0 − d1 + − + $ $ $1,1 $i+1,k $m−1,k 0,1 i,k i=1 k=1
(2.25)
k=1
and, by (2.24) and the triangle inequality, m−2 d0 2 ∗ d d d 1 i i+1 + D w (0, l) = − 2 2 $ − 2 0,1 k=1 $1,k k=1 $i,k k=1 $i+1,k i=1 d0 d1 d1 d1 + − − 2 ≤ $0,1 $1,1 $1,1 k=1 $1,k m−2
di+1 di + − 2 2 . k=1 $i,k k=1 $i+1,k i=1
(2.26)
Thanks to Lemma 1, inequality (2.16) follows from (2.25)–(2.26). Finally, equality holds in the first of inequalities (2.17), since
\[ (w^*)'(0+) = -\frac{d_0}{\ell_{0,1}} = w'(0+). \]
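The comparison carried out in Case 2 can be checked on a concrete function. The following Python sketch is only an illustration under stated assumptions, not part of the original argument: the node data and the helper names `second_derivative_variation`, `distribution`, and `decreasing_rearrangement` are hypothetical, and the function is assumed to be piecewise affine with no flat zones. It computes the decreasing rearrangement exactly through the distribution function and compares the two total variations.

```python
from fractions import Fraction


def second_derivative_variation(xs, ys):
    """Total variation of w' over the open interval, for piecewise affine w."""
    slopes = [(ys[i + 1] - ys[i]) / (xs[i + 1] - xs[i]) for i in range(len(xs) - 1)]
    # w' is piecewise constant, so its variation is the sum of the jumps at the nodes.
    return sum(abs(b - a) for a, b in zip(slopes, slopes[1:]))


def distribution(xs, ys, t):
    """Lebesgue measure of {w > t}; assumes w has no flat zones."""
    total = Fraction(0)
    for i in range(len(xs) - 1):
        lo, hi = sorted((ys[i], ys[i + 1]))
        length = xs[i + 1] - xs[i]
        if t <= lo:
            total += length  # the whole segment lies above level t
        elif t < hi:
            total += length * (hi - t) / (hi - lo)  # only part of it does
    return total


def decreasing_rearrangement(xs, ys):
    """Nodes of w*: the level t is attained at s = measure of {w > t}."""
    levels = sorted(set(ys), reverse=True)
    return [distribution(xs, ys, t) for t in levels], levels


# Hypothetical example: w(0) = 3 > w(l) = 2, unique local minimum at x = 3.
xs = [Fraction(v) for v in (0, 1, 3, 6)]
ys = [Fraction(v) for v in (3, 1, 0, 2)]
rs, levels = decreasing_rearrangement(xs, ys)

print(second_derivative_variation(xs, ys))      # variation of w'
print(second_derivative_variation(rs, levels))  # variation of (w*)'
```

For this data the variation drops from 8/3 to 12/7, while the slopes of the two functions at 0+ coincide, in line with the equality case just noted.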
Lemma 3. Let \(w\) be a nonnegative piecewise affine function in \([0, l]\) having a unique local minimum point, say, \(\bar s\), and a unique local maximum point, say, \(\bar r\), in \((0, l)\), with
\[ \min\{w(0), w(l)\} \le w(\bar s) \le w(\bar r) \le \max\{w(0), w(l)\}. \tag{2.27} \]
Then
\[ \|D^2 w^*\|(0,l) \le \|D^2 w\|(0,l). \tag{2.28} \]
Moreover,
\[ |(w^*)'(0+)| \le |w'(0+)| \quad\text{and}\quad |(w^*)'(l-)| \le |w'(l-)| \tag{2.29} \]
if \(w(0) > w(l)\), and
\[ |(w^*)'(0+)| \le |w'(l-)| \quad\text{and}\quad |(w^*)'(l-)| \le |w'(0+)| \tag{2.30} \]
if \(w(0) < w(l)\).

Proof. As in Lemma 2, we may assume that \(w\) has no flat zones. We shall deal with the case when \(w(0) > w(l)\), the other one following by symmetry. We make use of the notation \(t_i\), \(d_i\), and \(\ell_{i,k}\) introduced in the proof of Lemma 2. Here, the index \(i\) will range from zero to \(m + 1\). Clearly, both \(w(\bar s)\) and \(w(\bar r)\) agree with one of the values \(t_i\). Moreover, the index \(k\) in \(\ell_{i,k}\) takes the unique value 1 if either \(w(\bar s) \ge t_i\) or \(w(\bar r) \le t_{i+1}\), and ranges from 1 to 3 otherwise. Let us first focus on the case when the strict inequalities \(w(l) < w(\bar s)\) and \(w(\bar r) < w(0)\) hold. Since \(w^*(s) = w(s)\) for every \(s\) such that \(w(s) \le w(\bar s)\) or \(w(s) \ge w(\bar r)\), we may assume that
\[ t_0 = w(0), \quad t_1 = w(\bar r), \quad t_m = w(\bar s), \quad t_{m+1} = w(l) \]
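(see Figure 2a). It is readily seen that
\[ \|D^2 w\|(0,l) = \Bigl| \frac{d_0}{\ell_{0,1}} - \frac{d_1}{\ell_{1,1}} \Bigr| + \sum_{k=2}^{3} \frac{d_1}{\ell_{1,k}} + \sum_{i=1}^{m-2}\sum_{k=1}^{3} \Bigl| \frac{d_i}{\ell_{i,k}} - \frac{d_{i+1}}{\ell_{i+1,k}} \Bigr| + \sum_{k=1}^{2} \frac{d_{m-1}}{\ell_{m-1,k}} + \Bigl| \frac{d_{m-1}}{\ell_{m-1,3}} - \frac{d_m}{\ell_{m,1}} \Bigr|. \tag{2.31} \]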
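Moreover, we have
\[ (w^*)'(s) \equiv \begin{cases} -\dfrac{d_0}{\ell_{0,1}} & \text{if } s \in (0, \ell_{0,1}), \\[2mm] -\dfrac{d_i}{\sum_{k=1}^{3}\ell_{i,k}} & \text{if } s \in \Bigl( \ell_{0,1} + \sum_{j=1}^{i-1}\sum_{k=1}^{3}\ell_{j,k},\ \ell_{0,1} + \sum_{j=1}^{i}\sum_{k=1}^{3}\ell_{j,k} \Bigr) \text{ for } i = 1, \dots, m-1, \\[2mm] -\dfrac{d_m}{\ell_{m,1}} & \text{if } s \in \Bigl( \ell_{0,1} + \sum_{i=1}^{m-1}\sum_{k=1}^{3}\ell_{i,k},\ l \Bigr) \end{cases} \tag{2.32} \]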
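(see Figure 2b). Therefore,
\[ \|D^2 w^*\|(0,l) = \Bigl| \frac{d_0}{\ell_{0,1}} - \frac{d_1}{\sum_{k=1}^{3}\ell_{1,k}} \Bigr| + \sum_{i=1}^{m-2} \Bigl| \frac{d_i}{\sum_{k=1}^{3}\ell_{i,k}} - \frac{d_{i+1}}{\sum_{k=1}^{3}\ell_{i+1,k}} \Bigr| + \Bigl| \frac{d_{m-1}}{\sum_{k=1}^{3}\ell_{m-1,k}} - \frac{d_m}{\ell_{m,1}} \Bigr|. \tag{2.33} \]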
Figure 2a. [Graph of \(w\) on \([0, l]\), with the gaps \(d_0, \dots, d_m\) and the lengths \(\ell_{i,k}\) marked.]
Figure 2b. [Graph of \(w^*\) on \([0, l]\), with the gaps \(d_0, \dots, d_m\) and the lengths \(\ell_{0,1}\), \(\sum_{k=1}^{3}\ell_{i,k}\), \(\ell_{m,1}\) marked.]
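By Lemma 1, applied with \(a_{i,k} = \ell_{i,k}/d_i\), \(i = 1, \dots, m-1\), \(k = 1, 2\), we have
\[ \sum_{i=1}^{m-2}\sum_{k=1}^{2} \Bigl| \frac{d_i}{\ell_{i,k}} - \frac{d_{i+1}}{\ell_{i+1,k}} \Bigr| \ge \sum_{k=1}^{2}\frac{d_1}{\ell_{1,k}} - \sum_{k=1}^{2}\frac{d_{m-1}}{\ell_{m-1,k}} + \frac{d_{m-1}}{\sum_{k=1}^{2}\ell_{m-1,k}} - \frac{d_1}{\sum_{k=1}^{2}\ell_{1,k}} + \sum_{i=1}^{m-2} \Bigl| \frac{d_i}{\sum_{k=1}^{2}\ell_{i,k}} - \frac{d_{i+1}}{\sum_{k=1}^{2}\ell_{i+1,k}} \Bigr|. \tag{2.34} \]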
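Combining (2.31) and (2.34) yields
\[ \begin{aligned} \|D^2 w\|(0,l) \ge{} & \Bigl| \frac{d_0}{\ell_{0,1}} - \frac{d_1}{\ell_{1,1}} \Bigr| + \sum_{k=2}^{3}\frac{d_1}{\ell_{1,k}} + \sum_{i=1}^{m-2} \Bigl| \frac{d_i}{\ell_{i,3}} - \frac{d_{i+1}}{\ell_{i+1,3}} \Bigr| \\ & + \sum_{k=1}^{2}\frac{d_1}{\ell_{1,k}} - \sum_{k=1}^{2}\frac{d_{m-1}}{\ell_{m-1,k}} + \frac{d_{m-1}}{\sum_{k=1}^{2}\ell_{m-1,k}} - \frac{d_1}{\sum_{k=1}^{2}\ell_{1,k}} \\ & + \sum_{i=1}^{m-2} \Bigl| \frac{d_i}{\sum_{k=1}^{2}\ell_{i,k}} - \frac{d_{i+1}}{\sum_{k=1}^{2}\ell_{i+1,k}} \Bigr| + \sum_{k=1}^{2}\frac{d_{m-1}}{\ell_{m-1,k}} + \Bigl| \frac{d_{m-1}}{\ell_{m-1,3}} - \frac{d_m}{\ell_{m,1}} \Bigr|. \end{aligned} \tag{2.35} \]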
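We now use again Lemma 1, with \(n = 2\) and
\[ a_{i,1} = \frac{\ell_{m-i,1} + \ell_{m-i,2}}{d_{m-i}}, \quad a_{i,2} = \frac{\ell_{m-i,3}}{d_{m-i}}, \quad i = 1, \dots, m-1, \]
to get
\[ \begin{aligned} \sum_{i=1}^{m-2} \Bigl| \frac{d_i}{\ell_{i,3}} - \frac{d_{i+1}}{\ell_{i+1,3}} \Bigr| & + \sum_{i=1}^{m-2} \Bigl| \frac{d_i}{\sum_{k=1}^{2}\ell_{i,k}} - \frac{d_{i+1}}{\sum_{k=1}^{2}\ell_{i+1,k}} \Bigr| \\ \ge{} & -\frac{d_1}{\ell_{1,3}} - \frac{d_1}{\sum_{k=1}^{2}\ell_{1,k}} + \frac{d_{m-1}}{\ell_{m-1,3}} + \frac{d_{m-1}}{\sum_{k=1}^{2}\ell_{m-1,k}} - \frac{d_{m-1}}{\sum_{k=1}^{3}\ell_{m-1,k}} \\ & + \frac{d_1}{\sum_{k=1}^{3}\ell_{1,k}} + \sum_{i=1}^{m-2} \Bigl| \frac{d_i}{\sum_{k=1}^{3}\ell_{i,k}} - \frac{d_{i+1}}{\sum_{k=1}^{3}\ell_{i+1,k}} \Bigr|. \end{aligned} \tag{2.36} \]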
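From (2.35) and (2.36), we deduce that
\[ \begin{aligned} \|D^2 w\|(0,l) \ge{} & \Bigl| \frac{d_0}{\ell_{0,1}} - \frac{d_1}{\ell_{1,1}} \Bigr| + \sum_{k=2}^{3}\frac{d_1}{\ell_{1,k}} + \sum_{k=1}^{2}\frac{d_1}{\ell_{1,k}} - \sum_{k=1}^{2}\frac{d_{m-1}}{\ell_{m-1,k}} \\ & + \frac{d_{m-1}}{\sum_{k=1}^{2}\ell_{m-1,k}} - \frac{d_1}{\sum_{k=1}^{2}\ell_{1,k}} - \frac{d_1}{\ell_{1,3}} - \frac{d_1}{\sum_{k=1}^{2}\ell_{1,k}} \\ & + \frac{d_{m-1}}{\ell_{m-1,3}} + \frac{d_{m-1}}{\sum_{k=1}^{2}\ell_{m-1,k}} - \frac{d_{m-1}}{\sum_{k=1}^{3}\ell_{m-1,k}} \\ & + \frac{d_1}{\sum_{k=1}^{3}\ell_{1,k}} + \sum_{i=1}^{m-2} \Bigl| \frac{d_i}{\sum_{k=1}^{3}\ell_{i,k}} - \frac{d_{i+1}}{\sum_{k=1}^{3}\ell_{i+1,k}} \Bigr| \\ & + \sum_{k=1}^{2}\frac{d_{m-1}}{\ell_{m-1,k}} + \Bigl| \frac{d_{m-1}}{\ell_{m-1,3}} - \frac{d_m}{\ell_{m,1}} \Bigr|. \end{aligned} \tag{2.37} \]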
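On the other hand,
\[ \begin{aligned} \|D^2 w^*\|(0,l) \le{} & \Bigl| \frac{d_0}{\ell_{0,1}} - \frac{d_1}{\ell_{1,1}} \Bigr| + \Bigl| \frac{d_1}{\ell_{1,1}} - \frac{d_1}{\sum_{k=1}^{3}\ell_{1,k}} \Bigr| + \sum_{i=1}^{m-2} \Bigl| \frac{d_i}{\sum_{k=1}^{3}\ell_{i,k}} - \frac{d_{i+1}}{\sum_{k=1}^{3}\ell_{i+1,k}} \Bigr| \\ & + \Bigl| \frac{d_{m-1}}{\sum_{k=1}^{3}\ell_{m-1,k}} - \frac{d_{m-1}}{\ell_{m-1,3}} \Bigr| + \Bigl| \frac{d_{m-1}}{\ell_{m-1,3}} - \frac{d_m}{\ell_{m,1}} \Bigr|. \end{aligned} \tag{2.38} \]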
Inequalities (2.37) and (2.38) imply that
\[ \|D^2 w\|(0,l) - \|D^2 w^*\|(0,l) \ge 2 \Bigl( \frac{d_1}{\ell_{1,2}} + \frac{d_{m-1}}{\sum_{k=1}^{2}\ell_{m-1,k}} + \frac{d_1}{\sum_{k=1}^{3}\ell_{1,k}} - \frac{d_1}{\sum_{k=1}^{2}\ell_{1,k}} \Bigr), \]
a positive number. Hence, (2.28) follows. Inequalities (2.29) hold as equations in the case under consideration.
In the case when either \(w(0) = w(\bar r)\) or \(w(l) = w(\bar s)\) (or when both equations are true), the conclusion follows via similar (simpler) estimates. Alternatively, it can be derived by two subsequent applications of Lemma 2. Notice that strict inequality holds in the first of inequalities (2.29) if \(w(0) = w(\bar r)\), and in the second one if \(w(l) = w(\bar s)\).

Lemma 4. Let \(w\) be a nonnegative piecewise affine function in \([0, l]\). Then
\[ \|D^2 w^*\|(0,l) \le \|D^2 w\|(0,l). \tag{2.39} \]
Proof. We assume, as in the preceding lemmas, that \(w\) has no flat zones. If \(w\) is already monotone, there is nothing to prove. If this is not the case, we shall produce another nonnegative piecewise affine function \(w_1\) in \([0, l]\) satisfying the following properties:
\(w_1\) is equidistributed with \(w\);   (2.40)
\(\|D^2 w_1\|(0,l) \le \|D^2 w\|(0,l)\);   (2.41)
the number of local extrema of \(w_1\) is strictly smaller than that of \(w\).   (2.42)
Let us denote by \(\{w_k\}\) the sequence of functions obtained by iterating this procedure. Thanks to (2.42), after a finite number of iterations, say, \(\bar k\), one ends up with a monotonic function \(w_{\bar k}\) that, by (2.40), satisfies either \(w_{\bar k} = w^*\) or \(w_{\bar k} = w_*\). Moreover, inequality (2.41) implies that
\[ \|D^2 w^*\|(0,l) = \|D^2 w_{\bar k}\|(0,l) \le \|D^2 w_{\bar k - 1}\|(0,l) \le \cdots \le \|D^2 w_1\|(0,l) \le \|D^2 w\|(0,l), \]
whence (2.39) follows.

We now describe an algorithm to construct \(w_1\). In what follows, given a subinterval \([a, b]\) of \([0, l]\), we shall denote by \((w|_{[a,b]})^*\) and by \((w|_{[a,b]})_*\) the decreasing and the increasing rearrangements defined in \([a, b]\) of the restriction of \(w\) to \([a, b]\). Let us call \(s_1, s_2, \dots\) the (finite) increasing sequence of those points in \((0, l)\) where \(w\) achieves local minima and \(r_1, r_2, \dots\) the (finite) increasing sequence of those points in \((0, l)\) where \(w\) achieves local maxima. Assume, for instance, that \(s_1 = \min\{s_1, r_1\}\). (If \(r_1 = \min\{s_1, r_1\}\), the argument is analogous.)

If \(r_1\) does not exist, that is, if \(w\) has no local maximum in \((0, l)\), just take \(w_1 = w^*\). Inequality (2.41) holds by Lemma 2.

Assume that \(r_1\) exists. If \(w(0) \le w(r_1)\), define
\[ w_1(s) = \begin{cases} (w|_{[0,r_1]})_*(s) & \text{if } 0 \le s \le r_1, \\ w(s) & \text{if } r_1 < s \le l. \end{cases} \]
Properties (2.40) and (2.42) hold by construction, and inequality (2.41) follows from Lemma 2.

Assume that \(w(0) > w(r_1)\). If \(s_2\) does not exist and \(w(s_1) \le w(l)\), set
\[ w_1(s) = \begin{cases} w(s) & \text{if } 0 \le s \le s_1, \\ (w|_{[s_1,l]})_*(s) & \text{if } s_1 < s \le l \end{cases} \]
and use Lemma 2. If \(s_2\) does not exist and \(w(s_1) > w(l)\), define \(w_1 = w^*\) and use Lemma 3.

Assume that \(s_2\) exists. If \(w(s_1) \ge w(s_2)\), set
\[ w_1(s) = \begin{cases} (w|_{[0,s_2]})^*(s) & \text{if } 0 \le s \le s_2, \\ w(s) & \text{if } s_2 < s \le l \end{cases} \]
and use Lemma 3.

Assume that \(w(s_1) < w(s_2)\). If \(r_2\) does not exist and \(w(r_1) \ge w(l)\), set
\[ w_1(s) = \begin{cases} w(s) & \text{if } 0 \le s \le r_1, \\ (w|_{[r_1,l]})^*(s) & \text{if } r_1 < s \le l \end{cases} \]
and use Lemma 2. If \(r_2\) does not exist and \(w(r_1) < w(l)\), set
\[ w_1(s) = \begin{cases} w(s) & \text{if } 0 \le s \le s_1, \\ (w|_{[s_1,l]})_*(s) & \text{if } s_1 < s \le l \end{cases} \]
and use Lemma 3.

Assume that \(r_2\) exists. If \(w(r_1) \le w(r_2)\), set
\[ w_1(s) = \begin{cases} (w|_{[s_1,r_2]})_*(s) & \text{if } s_1 \le s \le r_2, \\ w(s) & \text{otherwise} \end{cases} \]
and use Lemma 3.

Assume that \(w(r_1) > w(r_2)\). If \(s_3\) does not exist and \(w(s_2) \le w(l)\), set
\[ w_1(s) = \begin{cases} w(s) & \text{if } 0 \le s \le s_2, \\ (w|_{[s_2,l]})_*(s) & \text{if } s_2 < s \le l \end{cases} \]
and use Lemma 2.
(2.43)
If \(s_3\) does not exist and \(w(s_2) > w(l)\), set
\[ w_1(s) = \begin{cases} w(s) & \text{if } 0 \le s \le r_1, \\ (w|_{[r_1,l]})^*(s) & \text{if } r_1 < s \le l \end{cases} \]
and use Lemma 3.

Assume that \(s_3\) exists. If \(w(s_2) \ge w(s_3)\), set
\[ w_1(s) = \begin{cases} (w|_{[r_1,s_3]})^*(s) & \text{if } r_1 \le s \le s_3, \\ w(s) & \text{otherwise} \end{cases} \]
and use Lemma 3.

Assume that \(w(s_2) < w(s_3)\). The situation is now the same as when we assumed (2.43), provided that the indices of the minimum and maximum points are shifted by one unit. Thus, one can start again from (2.43) and proceed as above. Clearly, the algorithm ends in a finite number of steps.

Proof of Theorem 1. For \(n \in \mathbb{N}\), let us denote by \(u_n\) the piecewise affine function in \([0, l]\) such that
\[ u_n\Bigl(\frac{l}{n}\, i\Bigr) = u\Bigl(\frac{l}{n}\, i\Bigr), \quad i = 0, \dots, n. \]
Since \(u\) is uniformly continuous, the sequence \(u_n\) converges uniformly to \(u\) in \([0, l]\) as \(n\) goes to infinity. In particular, \(\lim_{n\to+\infty} \|u - u_n\|_{L^1(0,l)} = 0\). Thus, for every \(\varphi \in C_0^\infty(0, l)\),
\[ \lim_{n\to+\infty} \int_0^l u_n' \varphi \, dx = -\lim_{n\to+\infty} \int_0^l u_n \varphi' \, dx = -\int_0^l u \varphi' \, dx = \int_0^l u' \varphi \, dx, \]
whence \(u_n' \to u'\) in \(\mathcal{D}'(0, l)\), the space of distributions. By the lower semicontinuity of the total variation in \((0, l)\) with respect to the convergence in \(\mathcal{D}'(0, l)\),
\[ \liminf_{n\to+\infty} \|D^2 u_n\|(0,l) \ge \|D^2 u\|(0,l). \tag{2.44} \]
On the other hand, since \(u_n'(x)\) equals the mean value of \(u'\) in \(((l/n)i, (l/n)(i+1))\) for every \(x\) in this interval, it is easily seen from equation (2.6) that
\[ \|D^2 u_n\|(0,l) \le \|D^2 u\|(0,l) \quad \text{for } n \in \mathbb{N}. \tag{2.45} \]
Inequalities (2.44) and (2.45) imply that
\[ \lim_{n\to+\infty} \|D^2 u_n\|(0,l) = \|D^2 u\|(0,l). \tag{2.46} \]
Since the decreasing rearrangement operator is nonexpansive in \(L^1(0, l)\) (see, e.g., [Ch]),
\[ \|u^* - u_n^*\|_{L^1(0,l)} \le \|u - u_n\|_{L^1(0,l)}. \tag{2.47} \]
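A discrete analogue of (2.47) is easy to test numerically. In the Python sketch below, which is an illustration only (the sample data and the helper names `sorted_decreasing` and `l1_distance` are hypothetical), a step function on cells of equal length is represented by its list of values. Its decreasing rearrangement is then the list sorted in decreasing order, and the \(L^1\) distance is a fixed multiple of the sum of absolute differences.

```python
import random


def sorted_decreasing(values):
    """Decreasing rearrangement of a step function on cells of equal length."""
    return sorted(values, reverse=True)


def l1_distance(a, b):
    """Sum of absolute differences (the L1 distance up to the cell length)."""
    return sum(abs(x - y) for x, y in zip(a, b))


random.seed(0)
u = [random.uniform(0, 1) for _ in range(50)]
v = [x + random.uniform(-0.1, 0.1) for x in u]  # a perturbation of u

lhs = l1_distance(sorted_decreasing(u), sorted_decreasing(v))
rhs = l1_distance(u, v)
print(lhs <= rhs)
```

Sorting can only bring the two lists closer together in this distance, which is the discrete counterpart of the nonexpansivity used above.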
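Figure 3. [Graphs of \(u\) (variable \(x\)) and of its symmetric rearrangement \(u^{\star}\) (variable \(s\)) on \([0, 6]\), with values between 0 and 2.]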
Thus, \(\lim_{n\to+\infty} \|u^* - u_n^*\|_{L^1(0,l)} = 0\) as well, and the same argument as above tells us that
\[ \liminf_{n\to+\infty} \|D^2 u_n^*\|(0,l) \ge \|D^2 u^*\|(0,l). \tag{2.48} \]
Finally,
\[ \|D^2 u_n^*\|(0,l) \le \|D^2 u_n\|(0,l), \tag{2.49} \]
by Lemma 4. From (2.46), (2.48), and (2.49), we deduce that \(u^* \in BV^2(0, l)\) and that (2.7) holds.

3. Inequalities for functions vanishing at the endpoints. In view of inequalities (1.1) and (1.2) and Theorem 1, one could be led to conjecture that \(\|D^2 u^{\star}\|(0,l) \le \|D^2 u\|(0,l)\) for every nonnegative \(u \in BV^2(0, l)\) satisfying \(u(0) = u(l) = 0\). However, this is false in general, as shown by the example displayed in Figure 3. Indeed, from equation (2.6), one easily gets
\[ \|D^2 u^{\star}\|(0,6) = 6 > 4 = \|D^2 u\|(0,6). \]
Thus, given a nonnegative function \(u \in BV^2(0, l)\) satisfying \(u(0) = u(l) = 0\), the problem arises of characterizing those functions that minimize the total variation of the second-order derivative among all functions from \(BV^2(0, l)\) which are equidistributed with \(u\) and vanish at zero and \(l\). The simple example above turns out to be particularly instructive in connection with this problem. Loosely speaking, the main result of this section tells us that a typical minimizer has a graph shaped like that of the function \(u\) in Figure 3, in the sense that it increases linearly from zero to its maximum and then decreases again to zero. To be more specific, in Theorem 2 we show that there always exists a minimizer from a new one-parameter family of rearrangements \(u_L\) of \(u\) defined as follows. Given a nonnegative Lipschitz continuous function \(u\) and
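a real number \(L > \operatorname{ess\,sup} |(u^*)'|\), \(u_L\) is the function obeying
\[ u_L(s) = \begin{cases} \min u + Ls & \text{if } 0 \le s \le \dfrac{\max u - \min u}{L}, \\[2mm] \min u + \sup\Bigl\{ t \ge 0 : \mu_u(t + \min u) + \dfrac{t}{L} > s \Bigr\} & \text{if } \dfrac{\max u - \min u}{L} < s \le l. \end{cases} \tag{3.1} \]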
The restriction on \(L\) is needed for \(u\) and \(u_L\) to be equidistributed (see Lemma 5).

Theorem 2. Let \(u\) be a nonnegative function from \(BV^2(0, l)\) such that \(u(0) = u(l) = 0\). Then there exists \(L \ge 2 \operatorname{ess\,sup} |(u^*)'|\) such that \(u_L \in BV^2(0, l)\) and
\[ \|D^2 u_L\|(0,l) \le \|D^2 v\|(0,l) \tag{3.2} \]
for every \(v \in BV^2(0, l)\) which is equidistributed with \(u\) and vanishes at zero and \(l\) (see Figure 4).
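Figure 4. [Graphs of \(u\) on \([0, l]\) (variable \(x\)) and of \(u_L\) on \([0, l]\) (variable \(s\)); the maximum of \(u_L\) is attained at \(s = \max u / L\).]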
In the next theorem, a special class of functions \(u\) is exhibited for which \(u^{\star}\) happens to be a solution to the minimum problem stated above.

Theorem 3. Let \(u\) be a nonnegative function from \(BV^2(0, l)\) such that \(u(0) = u(l) = 0\). Assume that \(u^{\star}\) is concave. Then
\[ \|D^2 u^{\star}\|(0,l) \le \|D^2 v\|(0,l) \tag{3.3} \]
for every \(v \in BV^2(0, l)\) which is equidistributed with \(u\) and vanishes at zero and \(l\). Moreover, equality holds in (3.3) whenever \(v\) is concave and \(v'(0+) = (u^{\star})'(0+)\), \(v'(l-) = (u^{\star})'(l-)\). In particular, equality holds in (3.3) if \(v = u_L\) with \(L = \operatorname{ess\,sup} |(u^{\star})'|\).

The proof of Theorem 2 proceeds through the following three main steps. The first one consists in showing that we can limit ourselves to minimizing the right-hand side of (3.2) in the class of competing functions with a unique maximum point. Here, Lemma 4 is called into play again. In the second step, we prove that passing from a function \(u\) in this restricted class to the function \(u_L\) defined as in (3.1) with \(L = \operatorname{ess\,sup} |u'|\) does not increase the total variation of the second-order derivative in \((0, l)\). This step is based on Lemmas 7 and 9. The former substantiates an approximation argument. The latter deals with piecewise affine functions and depends upon an algebraic inequality provided by Lemma 8, a counterpart of Lemma 1 in the present setting. The last step is the proof of the lower semicontinuity of \(\|D^2 u_L\|(0,l)\) with respect to \(L\) and is the concern of Lemma 6.

Lemma 5. Let \(u\) be a nonnegative Lipschitz continuous function in \([0, l]\). Let \(u_L\) be defined as in (3.1) with \(L > \operatorname{ess\,sup} |(u^*)'|\). Then
(i) \(u_L(0) = u_L(l) = \min u\);
(ii) \(u_L\) increases in \([0, (\max u - \min u)/L]\) and decreases in \([(\max u - \min u)/L, l]\);
(iii) \(u\) and \(u_L\) are equidistributed;
(iv) \(u_L\) is Lipschitz continuous and
\[ \operatorname{ess\,sup} |u_L'| \le L \max\Bigl\{ 1, \frac{\operatorname{ess\,sup} |(u^*)'|}{L - \operatorname{ess\,sup} |(u^*)'|} \Bigr\}. \tag{3.4} \]
In particular, if \(L = \operatorname{ess\,sup} |u'|\), then
\[ \operatorname{ess\,sup} |u_L'| \le \operatorname{ess\,sup} |u'|. \tag{3.5} \]
Proof. It is not restrictive to assume that \(\min u = 0\). Assertions (i) and (ii) are straightforward consequences of definition (3.1). Consider (iii). We preliminarily observe that the function \(\nu_u\), defined by
\[ \nu_u(t) = \mu_u(t) + \frac{t}{L}, \tag{3.6} \]
is strictly decreasing in \([0, \max u]\). Indeed, fix \(t_1\) and \(t_2\) satisfying \(0 \le t_1 < t_2 \le \max u\). Since \(u^*\) is continuous, there exist \(s_1, s_2 \in [0, l]\) such that \(s_1 > s_2\) and \(u^*(s_1) = t_1\), \(u^*(s_2) = t_2\). Moreover, it is easily seen that \(s_1\) and \(s_2\) can be chosen in such a way that \(\mu_u(u^*(s_1)) = s_1\) and \(\mu_u(u^*(s_2)) = s_2\). Thus,
\[ \nu_u(t_1) - \nu_u(t_2) = s_1 - s_2 + \frac{u^*(s_1) - u^*(s_2)}{L} > 0, \]
inasmuch as we are assuming \(L > \operatorname{ess\,sup} |(u^*)'|\).
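Now, we claim that
\[ \{ s \in [0, l] : u_L(s) > t \} = \Bigl( \frac{t}{L},\ \nu_u(t) \Bigr) \quad \text{for } 0 \le t \le \max u. \tag{3.7} \]
Obviously, (3.7) implies that
\[ \mu_{u_L} = \mu_u, \tag{3.8} \]
that is, the equimeasurability of \(u\) and \(u_L\). Our claim is a consequence of the equations
\[ \Bigl\{ s \in \Bigl[ 0, \frac{\max u}{L} \Bigr] : u_L(s) > t \Bigr\} = \Bigl( \frac{t}{L},\ \frac{\max u}{L} \Bigr] \tag{3.9} \]
and
\[ \Bigl\{ s \in \Bigl( \frac{\max u}{L}, l \Bigr] : u_L(s) > t \Bigr\} = \Bigl( \frac{\max u}{L},\ \nu_u(t) \Bigr). \tag{3.10} \]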
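Equation (3.9) is trivial. As for (3.10), the left-hand side is included in the right-hand side because \(u_L(\nu_u(t)) \le t\), since \(\nu_u\) is decreasing and \(u_L\) is its right-continuous inverse. The converse inclusion follows via an easy argument exploiting the continuity of \(\nu_u\) from the right.

Finally, we prove (iv). Let \(\max u / L \le s_1 < s_2 \le l\). One has
\[ \begin{aligned} \mathcal{L}^1\bigl( \{ x \in [0, l] : u_L(s_2) < u(x) < u_L(s_1) \} \bigr) & = \mathcal{L}^1\bigl( \{ s \in [0, l] : u_L(s_2) < u^*(s) < u_L(s_1) \} \bigr) \\ & \ge \frac{1}{\operatorname{ess\,sup} |(u^*)'|} \int_{\{ s \in [0,l] \,:\, u_L(s_2) < u^*(s) < u_L(s_1) \}} |(u^*)'(r)| \, dr \\ & = \frac{1}{\operatorname{ess\,sup} |(u^*)'|} \int_{u_L(s_2)}^{u_L(s_1)} \mathcal{H}^0\bigl( \{ u^*(s) = t \} \bigr) \, dt \\ & = \frac{u_L(s_1) - u_L(s_2)}{\operatorname{ess\,sup} |(u^*)'|}. \end{aligned} \tag{3.11} \]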
Notice that the first equality in (3.11) holds because of the equimeasurability of \(u\) and \(u^*\), the second by the coarea formula, and the last one because \(\mathcal{H}^0(\{ s \in [0, l] : u^*(s) = t \}) = 1\) for a.e. \(t \in [0, \sup u]\). On the other hand,
\[ \mathcal{L}^1\bigl( \{ x \in [0, l] : u_L(s_2) < u(x) < u_L(s_1) \} \bigr) \le s_2 - s_1 + \frac{u_L(s_1) - u_L(s_2)}{L}. \tag{3.12} \]
Actually,
\[ \mathcal{L}^1\bigl( \{ x \in [0, l] : u(x) > u_L(s_2) \} \bigr) = \mu_u\bigl( u_L(s_2) \bigr) = \nu_u\bigl( u_L(s_2) \bigr) - \frac{u_L(s_2)}{L} \tag{3.13} \]
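and
\[ \begin{aligned} \mathcal{L}^1\bigl( \{ x \in [0, l] : u(x) \ge u_L(s_1) \} \bigr) & = \mu_u\bigl( u_L(s_1) \bigr) + \mathcal{L}^1\bigl( \{ x \in [0, l] : u(x) = u_L(s_1) \} \bigr) \\ & = \nu_u\bigl( u_L(s_1) \bigr) + \mathcal{L}^1\bigl( \{ x \in [0, l] : u(x) = u_L(s_1) \} \bigr) - \frac{u_L(s_1)}{L}. \end{aligned} \tag{3.14} \]
Furthermore,
\[ \nu_u\bigl( u_L(s_2) \bigr) \le s_2 \tag{3.15} \]
and
\[ \nu_u\bigl( u_L(s_1) \bigr) + \mathcal{L}^1\bigl( \{ x \in [0, l] : u(x) = u_L(s_1) \} \bigr) = \nu_u\bigl( u_L(s_1) \bigr) + \mathcal{L}^1\Bigl( \Bigl\{ s \in \Bigl[ \frac{\max u}{L}, l \Bigr] : u_L(s) = u_L(s_1) \Bigr\} \Bigr) \ge s_1, \tag{3.16} \]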
since \(u\) and \(u_L\) are equimeasurable and \(u_L\) is strictly increasing in \([0, \max u / L]\). Combining (3.13)–(3.16) yields (3.12). From (3.11) and (3.12) one gets
\[ \frac{u_L(s_1) - u_L(s_2)}{s_2 - s_1} \le \frac{L \operatorname{ess\,sup} |(u^*)'|}{L - \operatorname{ess\,sup} |(u^*)'|} \quad \text{for } \frac{\max u}{L} \le s_1 < s_2 \le l. \]
The Lipschitz continuity of \(u_L\) in the whole of \([0, l]\) and inequality (3.4) then easily follow. Inequality (3.5) is a consequence of (3.4) and of the inequality (1.1) with \(p = \infty\).

Lemma 6. Under the same assumption as in Lemma 5, the function \(L \mapsto \|D^2 u_L\|(0,l)\) is lower semicontinuous in the interval \((\operatorname{ess\,sup} |(u^*)'|, \infty)\).

Proof. As in the proof of (2.44), it suffices to show that, if \(\{L_n\}\) is a sequence converging to \(L \in (\operatorname{ess\,sup} |(u^*)'|, \infty)\) as \(n\) tends to infinity, then
\[ \lim_{n\to+\infty} \|u_L - u_{L_n}\|_{L^1(0,l)} = 0. \tag{3.17} \]
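On assuming, without loss of generality, that \(\min u = 0\), we have
\[ \begin{aligned} \int_0^l |u_L(s) - u_{L_n}(s)| \, ds & = \int_0^l \Bigl| \int_0^{\max u} \bigl( \chi_{\{u_L > t\}}(s) - \chi_{\{u_{L_n} > t\}}(s) \bigr) \, dt \Bigr| \, ds \\ & \le \int_0^{\max u} \int_0^l \bigl| \chi_{\{u_L > t\}}(s) - \chi_{\{u_{L_n} > t\}}(s) \bigr| \, ds \, dt \\ & = \int_0^{\max u} \mathcal{L}^1\bigl( \{u_L > t\} \,\triangle\, \{u_{L_n} > t\} \bigr) \, dt. \end{aligned} \tag{3.18} \]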
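Here, \(\triangle\) stands for symmetric difference of sets. By Lemma 5(iii), \(\mu_u = \mu_{u_L} = \mu_{u_{L_n}}\). From these equations and the very definition of \(u_L\) and \(u_{L_n}\), it is easily seen that
\[ \mathcal{L}^1\bigl( \{u_L > t\} \,\triangle\, \{u_{L_n} > t\} \bigr) \le 2 \max u \, \Bigl| \frac{1}{L} - \frac{1}{L_n} \Bigr|. \tag{3.19} \]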
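Combining (3.18) and (3.19) yields (3.17).

Lemma 7. Let \(u\) and \(u_n\), \(n \in \mathbb{N}\), be nonnegative Lipschitz continuous functions in \([0, l]\). Set \(L = \operatorname{ess\,sup} |u'|\) and \(L_n = \operatorname{ess\,sup} |u_n'|\), and let \(u_L\) and \((u_n)_{L_n}\) be defined as in (3.1). If
\[ \lim_{n\to+\infty} \min u_n = \min u, \quad \lim_{n\to+\infty} \max u_n = \max u, \quad \lim_{n\to+\infty} L_n = L, \tag{3.20} \]
and
\[ \lim_{n\to+\infty} \|u - u_n\|_{L^1(0,l)} = 0, \tag{3.21} \]
then
\[ \lim_{n\to+\infty} \|u_L - (u_n)_{L_n}\|_{L^1(0,l)} = 0. \tag{3.22} \]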
Proof, sketched. Let us set \(\min u = N\), \(\min u_n = N_n\), \(\max u = M\), \(\max u_n = M_n\), \(a_n = \min\{(M - N)/L, (M_n - N_n)/L_n\}\), and \(b_n = \max\{(M - N)/L, (M_n - N_n)/L_n\}\). It is easily seen that
\[ \int_0^l |u_L(s) - (u_n)_{L_n}(s)| \, ds \le \frac{a_n^2}{2} |L - L_n| + \max\{M, M_n\}(b_n - a_n) + \int_{b_n}^l |u_L(s) - (u_n)_{L_n}(s)| \, ds. \tag{3.23} \]
Owing to (3.20), the last but one and the last but two terms converge to zero as n goes to +∞. As for the last integral, on calling u the restriction of uL to [bn , l] and un the restriction of (un )Ln to the same interval, one can show that
l 0
uL (s) − (un )L (s) ds ≤ n
≤
max{M,Mn } 0
1 {u > t}$ un > t dt
min{M,Mn } max{N,Nn }
µu (t) − µu (t) dt n
(3.24)
M Mn + 2 min{M, Mn } − max{N, Nn } − L Ln
min2 {M, Mn } − max2 {N, Nn } 1 − 1 L Ln 2 N Nn + min{M, Mn } − max{N, Nn } − L Ln + max{M, Mn } − min{M, Mn } l + max{N, Nn } − min{N, Nn } l. +
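The integral on the right-hand side of (3.24) converges to zero as \(n\) goes to \(\infty\) as a consequence of (3.21), whereas the remaining terms converge to zero because of (3.20).

Lemma 8. Let \(a_{i,k} \in \mathbb{R}_+\) for \(i = 1, \dots, m\) and \(k = 1, 2\). Assume that \(\bar m \in \{1, \dots, m\}\) is such that
\[ a_{\bar m,1} = \min_{\substack{i = 1, \dots, m \\ k = 1, 2}} a_{i,k}. \tag{3.25} \]
Then
\[ \frac{1}{a_{\bar m,1}} + \frac{1}{\sum_{k=1}^{2} a_{1,k} - a_{\bar m,1}} + \sum_{i=1}^{m-1} \Bigl| \frac{1}{\sum_{k=1}^{2} a_{i,k} - a_{\bar m,1}} - \frac{1}{\sum_{k=1}^{2} a_{i+1,k} - a_{\bar m,1}} \Bigr| \le \sum_{k=1}^{2} \frac{1}{a_{1,k}} + \sum_{i=1}^{m-1} \sum_{k=1}^{2} \Bigl| \frac{1}{a_{i,k}} - \frac{1}{a_{i+1,k}} \Bigr|. \tag{3.26} \]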
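Proof. We argue by induction on \(m\). Equality trivially holds in (3.26) if \(m = 1\). Assume now that the statement is true with \(m\) replaced by \(m - 1\). We shall prove that it is true for \(m\) as well. Let us distinguish two cases.

Case 1: \(1 \le \bar m \le m - 1\). Then, besides (3.25), we also have
\[ a_{\bar m,1} = \min_{\substack{i = 1, \dots, m-1 \\ k = 1, 2}} a_{i,k}. \]
Hence, by induction, inequality (3.26) holds with \(m\) replaced by \(m - 1\). Furthermore, one has
\[ \Bigl| \frac{1}{\sum_{k=1}^{2} a_{m-1,k} - a_{\bar m,1}} - \frac{1}{\sum_{k=1}^{2} a_{m,k} - a_{\bar m,1}} \Bigr| \le \sum_{k=1}^{2} \Bigl| \frac{1}{a_{m-1,k}} - \frac{1}{a_{m,k}} \Bigr|. \tag{3.27} \]
Inequality (3.27) is an easy consequence of the fact that
\[ \Bigl( \sum_{\kappa=1}^{2} a_{m-1,\kappa} - a_{\bar m,1} \Bigr) \Bigl( \sum_{\kappa=1}^{2} a_{m,\kappa} - a_{\bar m,1} \Bigr) \ge a_{m-1,k} \, a_{m,k} \quad \text{for } k = 1, 2. \]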
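The last inequality is in turn equivalent to
\[ \bigl( a_{m,k\pm1} - a_{\bar m,1} \bigr) \Bigl( \sum_{\kappa=1}^{2} a_{m-1,\kappa} - a_{\bar m,1} \Bigr) + \bigl( a_{m-1,k\pm1} - a_{\bar m,1} \bigr) a_{m,k} \ge 0 \quad \text{for } k = 1, 2, \]
where the index \(k + 1\) or \(k - 1\) has to be taken according to whether \(k = 1\) or \(k = 2\). This inequality obviously holds by the definition of \(\bar m\). Adding inequality (3.26) with \(m\) replaced by \(m - 1\) to inequality (3.27) yields (3.26).

Case 2: \(\bar m = m\). Inequality (3.26) then reduces to
\[ \begin{aligned} & \frac{1}{\sum_{k=1}^{2} a_{1,k} - a_{m,1}} + \sum_{i=1}^{m-2} \Bigl| \frac{1}{\sum_{k=1}^{2} a_{i,k} - a_{m,1}} - \frac{1}{\sum_{k=1}^{2} a_{i+1,k} - a_{m,1}} \Bigr| + \Bigl| \frac{1}{\sum_{k=1}^{2} a_{m-1,k} - a_{m,1}} - \frac{1}{a_{m,2}} \Bigr| \\ & \qquad \le \sum_{k=1}^{2} \frac{1}{a_{1,k}} + \sum_{i=1}^{m-2} \sum_{k=1}^{2} \Bigl| \frac{1}{a_{i,k}} - \frac{1}{a_{i+1,k}} \Bigr| - \frac{1}{a_{m-1,1}} + \Bigl| \frac{1}{a_{m-1,2}} - \frac{1}{a_{m,2}} \Bigr|. \end{aligned} \tag{3.28} \]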
Inequality (3.28) follows via arguments similar to those in the proof of Lemma 1. We skip the details for brevity.

Lemma 9. Let \(w\) be a nonnegative piecewise affine function in \([0, l]\) satisfying \(w(0) = w(l) = 0\). Assume that there exists \(\bar x \in (0, l)\) such that \(w\) increases in \((0, \bar x)\) and decreases in \((\bar x, l)\). Let \(w_L\) be the function defined as in (3.1), with \(L = \operatorname{ess\,sup} |w'|\). Then
\[ \|D^2 w_L\|(0,l) \le \|D^2 w\|(0,l). \tag{3.29} \]
Proof. Without loss of generality, we may assume, as in Lemmas 2 and 3, that \(w\) has no flat zones. We shall use the notation \(t_i\), \(i = 0, \dots, m + 1\), \(d_i\), \(i = 0, \dots, m\), \(\ell_{i,k}\), \(i = 0, \dots, m\), \(k = 1, 2\), introduced in Lemma 2. Observe that here \(t_0 = \max w\) and \(t_{m+1} = 0\). Thus,
\[ \|D^2 w\|(0,l) = \sum_{k=1}^{2} \frac{d_0}{\ell_{0,k}} + \sum_{i=0}^{m-1} \sum_{k=1}^{2} \Bigl| \frac{d_i}{\ell_{i,k}} - \frac{d_{i+1}}{\ell_{i+1,k}} \Bigr|. \tag{3.30} \]
It is not restrictive to assume that \(L = \operatorname{ess\,sup} w'\). Therefore, there exists \(\bar m \in \{0, \dots, m\}\) such that
\[ L = \frac{d_{\bar m}}{\ell_{\bar m,1}} = \max_{\substack{i = 0, \dots, m \\ k = 1, 2}} \frac{d_i}{\ell_{i,k}}. \]
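Computations show that \(w_L\) is the piecewise affine function in \([0, l]\) whose derivative is given by
\[ w_L'(s) \equiv \frac{d_{\bar m}}{\ell_{\bar m,1}} \quad \text{if } s \in \Bigl( 0, \frac{\ell_{\bar m,1}}{d_{\bar m}} \max w \Bigr), \]
and by
\[ w_L'(s) \equiv -\frac{d_i}{\sum_{k=1}^{2} \ell_{i,k} - (d_i/d_{\bar m}) \ell_{\bar m,1}} \]
if \(s\) belongs to
\[ \Bigl( \frac{\ell_{\bar m,1}}{d_{\bar m}} \max w + \sum_{j=0}^{i-1} \sum_{k=1}^{2} \ell_{j,k} - \sum_{j=0}^{i-1} \frac{d_j}{d_{\bar m}} \ell_{\bar m,1},\ \ \frac{\ell_{\bar m,1}}{d_{\bar m}} \max w + \sum_{j=0}^{i} \sum_{k=1}^{2} \ell_{j,k} - \sum_{j=0}^{i} \frac{d_j}{d_{\bar m}} \ell_{\bar m,1} \Bigr) \]
for \(i = 0, \dots, m\). Hence,
\[ \|D^2 w_L\|(0,l) = \frac{d_{\bar m}}{\ell_{\bar m,1}} + \frac{d_0}{\sum_{k=1}^{2} \ell_{0,k} - (d_0/d_{\bar m}) \ell_{\bar m,1}} + \sum_{i=0}^{m-1} \Bigl| \frac{d_i}{\sum_{k=1}^{2} \ell_{i,k} - (d_i/d_{\bar m}) \ell_{\bar m,1}} - \frac{d_{i+1}}{\sum_{k=1}^{2} \ell_{i+1,k} - (d_{i+1}/d_{\bar m}) \ell_{\bar m,1}} \Bigr|. \tag{3.31} \]
Inequality (3.29) then follows from (3.30) and (3.31), on applying Lemma 8 with \(a_{i,k} = \ell_{i,k}/d_i\).

Proof of Theorem 2. Step 1. Let \(\bar x \in (0, l)\) be a maximum point for \(u\). With the same notation as in the proof of Lemma 4, we define the function \(\hat u\) in \([0, l]\) as
\[ \hat u(s) = \begin{cases} (u|_{[0,\bar x]})_*(s) & s \in [0, \bar x], \\ (u|_{[\bar x,l]})^*(s) & s \in [\bar x, l]. \end{cases} \tag{3.32} \]
Notice that \(\hat u\) is equidistributed with \(u\), vanishes at zero and \(l\), increases in \((0, \bar x)\) and decreases in \((\bar x, l)\). Moreover,
\[ \operatorname{ess\,sup} |u'| \ge \operatorname{ess\,sup} |\hat u'| \ge \operatorname{ess\,sup} |(u^{\star})'| \tag{3.33} \]
by inequalities (1.1) and (1.2) with \(p = \infty\). The aim of this step is to observe that
\[ \|D^2 \hat u\|(0,l) \le \|D^2 u\|(0,l). \tag{3.34} \]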
Indeed, inequality (3.34) holds for piecewise affine functions by Lemma 4 applied in \((0, \bar x)\) and in \((\bar x, l)\) and by inequalities analogous to (2.17) at \(\bar x-\) and \(\bar x+\). The general case follows on approximating \(u\) by a sequence of piecewise affine functions \(\{u_n\}\) such that \(u_n(\bar x) = u(\bar x)\) and \(u_n((l/n)i) = u((l/n)i)\) for \(i = 0, \dots, n\), and arguing as in the proof of Theorem 1.

Step 2. Let \(u\) be any nonnegative function from \(BV^2(0, l)\) vanishing at zero and \(l\). Assume that there exists \(\bar x \in (0, l)\) such that \(u\) increases in \((0, \bar x)\) and decreases in \((\bar x, l)\). Let \(u_L\) be defined as in (3.1) with \(L = \operatorname{ess\,sup} |u'|\). Then
\[ \|D^2 u_L\|(0,l) \le \|D^2 u\|(0,l). \tag{3.35} \]
In order to prove (3.35), consider an approximating sequence \(\{u_n\}\) as above, and let \((u_n)_{L_n}\) be the function defined as in (3.1) with \(L_n = \operatorname{ess\,sup} |u_n'|\). We have \(\max u_n = \max u\) and \(\min u_n = \min u = 0\) for \(n \in \mathbb{N}\). Moreover, \(\lim_{n\to+\infty} L_n = L\), since \(L_n \le L\) for \(n \in \mathbb{N}\) and since \(\lim_{n\to+\infty} u_n' = u'\) a.e. in \([0, l]\), as the argument used in [Du, Lemma 4.3] shows. Finally, \(u_n\) converges to \(u\) in \(L^1(0, l)\) when \(n\) tends to \(+\infty\). An application of Lemma 7 then tells us that \(\lim_{n\to+\infty} \|u_L - (u_n)_{L_n}\|_{L^1(0,l)} = 0\), whence we deduce, as in the proof of (2.48), that
\[ \liminf_{n\to+\infty} \|D^2 (u_n)_{L_n}\|(0,l) \ge \|D^2 u_L\|(0,l). \tag{3.36} \]
Furthermore,
\[ \lim_{n\to+\infty} \|D^2 u_n\|(0,l) = \|D^2 u\|(0,l), \tag{3.37} \]
as in (2.46), and
\[ \|D^2 (u_n)_{L_n}\|(0,l) \le \|D^2 u_n\|(0,l), \tag{3.38} \]
by Lemma 9. Combining (3.36)–(3.38) yields (3.35).

Step 3. Steps 1–2 clearly imply that, for every \(v \in BV^2(0, l)\) which is equidistributed with \(u\) and vanishes at zero and \(l\), there exists \(L \ge \operatorname{ess\,sup} |(u^{\star})'| = 2 \operatorname{ess\,sup} |(u^*)'|\) such that
\[ \|D^2 u_L\|(0,l) \le \|D^2 v\|(0,l). \tag{3.39} \]
Since \(\|D^2 u_L\|(0,l) \ge \operatorname{ess\,sup} |u_L'| \ge L\), we have \(\lim_{L\to+\infty} \|D^2 u_L\|(0,l) = +\infty\). Thus, the conclusion follows from Lemma 6.

The following characterization of those functions whose symmetric rearrangement is concave will be needed in the proof of Theorem 3.

Lemma 10. Let \(u\) be any nonnegative measurable function in \([0, l]\). Then the following assertions are equivalent:
(i) \(\mu_u\) is concave;
(ii) \(u^*\) is concave;
(iii) \(u^{\star}\) is concave.
If, in addition, \(u\) is Lipschitz continuous, the above assertions are also equivalent to
(iv) \(u_L\) is concave for every \(L > \operatorname{ess\,sup} |(u^*)'|\).

Proof. The functions \(\mu_u\) and \(u^*\) are decreasing and are mutual (right-continuous) inverses. Therefore, the subgraphs of \(\mu_u\) and of \(u^*\) are mutually symmetric about the line \(s = t\) in the \((s, t)\) plane. Since a function is concave if and only if its subgraph is convex, the equivalence of (i) and (ii) follows. By symmetry, \(u^{\star}\) is concave in \([0, l]\) if and only if it is concave in \([l/2, l]\). Inasmuch as \(u^{\star}(s) = u^*(2s - l)\) for \(s \in [l/2, l]\), the equivalence of (ii) and (iii) follows. Assume now that \(u\) is Lipschitz continuous. We also assume, without loss of generality, that \(\min u = 0\). Since \(u_L\) increases linearly in \([0, \max u / L]\) and decreases in \([\max u / L, l]\), \(u_L\) is concave in \([0, l]\) if and only if it is concave in \([\max u / L, l]\). The restriction of \(u_L\) to the interval \([\max u / L, l]\) and the function \(\nu_u\), defined in (3.6), are decreasing and mutual (right-continuous) inverses. Then the same argument as above tells us that such a restriction of \(u_L\) is concave if and only if \(\nu_u\) is concave. The concavity of \(\nu_u\) is in turn equivalent to that of \(\mu_u\), since \(\mu_u\) and \(\nu_u\) differ by a linear function.

Proof of Theorem 3. Let \(L > \operatorname{ess\,sup} |(u^*)'|\). By Lemma 10, \(u_L\) is concave; consequently, \(u_L'\) is decreasing. Thus, by (2.6),
\[ \|D^2 u_L\|(0,l) = u_L'(0+) - u_L'(l-). \tag{3.40} \]
Clearly,
\[ u_L'(0+) = L. \tag{3.41} \]
We claim that
\[ u_L'(l-) = \frac{L \, (u^*)'(l-)}{L + (u^*)'(l-)}. \tag{3.42} \]
Indeed, since \(u^*\) is continuous and, by Lemma 10, concave in \([0, l]\), then it is strictly decreasing in a left neighborhood of \(l\) and
\[ (u^*)'(l-) = -\operatorname{ess\,sup} |(u^*)'|. \tag{3.43} \]
Thus, \(\mu_u\) is strictly decreasing in \([0, \max u]\), it is the (classical) inverse of \(u^*\) in a right neighborhood of zero, and
\[ \mu_u'(0+) = \frac{1}{(u^*)'(l-)}. \]
Hence, the function \(\nu_u\) is strictly decreasing in \([0, l]\), it is continuous in a right neighborhood of zero (since \(\mu_u\) so is), and
\[ \nu_u'(0+) = \mu_u'(0+) + \frac{1}{L} = \frac{1}{(u^*)'(l-)} + \frac{1}{L}. \]
Thus, \(u_L\) is the (classical) inverse of \(\nu_u\) in a left neighborhood of \(l\) and
\[ u_L'(l-) = \frac{1}{\nu_u'(0+)}. \]
Equation (3.42) follows. Owing to (3.42) and (3.43) we have
\[ \|D^2 u_L\|(0,l) = \frac{L^2}{L - \operatorname{ess\,sup} |(u^*)'|}. \tag{3.44} \]
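The right-hand side of (3.44) is an explicit function of \(L\), so its behavior can be explored directly. The Python sketch below is only an illustration: the value chosen for `E`, which stands for \(\operatorname{ess\,sup} |(u^*)'|\), the grid of candidate values of \(L\), and the helper name `variation_of_u_L` are all hypothetical.

```python
from fractions import Fraction


def variation_of_u_L(L, E):
    """Right-hand side of (3.44): L^2 / (L - E), defined for L > E."""
    if L <= E:
        raise ValueError("formula (3.44) requires L > E")
    return L * L / (L - E)


E = Fraction(3, 2)  # hypothetical value of ess sup |(u*)'|
grid = [E + Fraction(k, 100) for k in range(1, 1001)]  # candidates L in (E, E + 10]
best_L = min(grid, key=lambda L: variation_of_u_L(L, E))

print(best_L, variation_of_u_L(best_L, E))
```

On this grid the minimizer is \(L = 2E\) and the minimum value is \(4E\), in agreement with the identity \(L^2/(L - E) - 4E = (L - 2E)^2/(L - E) \ge 0\) for \(L > E\).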
An elementary argument now shows that the minimum of \(\|D^2 u_L\|(0,l)\) in the interval \((\operatorname{ess\,sup} |(u^*)'|, \infty)\) is achieved at \(2 \operatorname{ess\,sup} |(u^*)'|\), and that such a minimum equals \(4 \operatorname{ess\,sup} |(u^*)'|\). Since \(2 \operatorname{ess\,sup} |(u^*)'| = \operatorname{ess\,sup} |(u^{\star})'|\) and
\[ \|D^2 u^{\star}\|(0,l) = (u^{\star})'(0+) - (u^{\star})'(l-) = 2 \operatorname{ess\,sup} |(u^{\star})'|, \tag{3.45} \]
inequality (3.3) follows, thanks to Theorem 2. The remaining assertions of the statement are a consequence of equations (3.44) and (3.45) and of Lemma 10.

References

[AFT] A. Alvino, V. Ferone, and G. Trombetti, Moser-type inequalities in Lorentz spaces, Potential Anal. 5 (1996), 273–299.
[AFLT] A. Alvino, V. Ferone, P. L. Lions, and G. Trombetti, Convex symmetrization and applications, Ann. Inst. H. Poincaré Anal. Non Linéaire 14 (1997), 275–293.
[ALT] A. Alvino, P. L. Lions, and G. Trombetti, Comparison results for elliptic and parabolic equations via Schwarz symmetrization, Ann. Inst. H. Poincaré Anal. Non Linéaire 7 (1990), 37–65.
[A] T. Aubin, Problèmes isopérimétriques et espaces de Sobolev, J. Differential Geom. 11 (1976), 573–598.
[Bae] A. Baernstein II, “A unified approach to symmetrization” in Partial Differential Equations of Elliptic Type (Cortona, 1992), Symposia Math. 35, Cambridge Univ. Press, Cambridge, 1994.
[Ba] C. Bandle, Isoperimetric Inequalities and Applications, Monographs Stud. Math. 7, Pitman, London, 1980.
[BBMP] M. F. Betta, F. Brock, A. Mercaldo, and M. Posteraro, A weighted isoperimetric inequality and applications to symmetrization, J. Inequal. Appl. 4 (1999), 215–240.
[BH] S. Bobkov and C. Houdré, Some Connections between Isoperimetric and Sobolev-Type Inequalities, Mem. Amer. Math. Soc. 129, Amer. Math. Soc., Providence, 1997.
[Bro] F. Brock, Weighted Dirichlet-type inequalities for Steiner symmetrization, Calc. Var. Partial Differential Equations 8 (1999), 15–25.
[BZ] J. E. Brothers and W. P. Ziemer, Minimal rearrangements of Sobolev functions, J. Reine Angew. Math. 384 (1988), 153–179.
[Ch] G. Chiti, Rearrangements of functions and convergence in Orlicz spaces, Appl. Anal. 9 (1979), 23–27.
[Ci1] A. Cianchi, Continuity properties of functions from Orlicz-Sobolev spaces and embedding theorems, Ann. Scuola Norm. Sup. Pisa Cl. Sci. (4) 23 (1996), 575–608.
[Ci2] A. Cianchi, A sharp embedding theorem for Orlicz-Sobolev spaces, Indiana Univ. Math. J. 45 (1996), 39–65.
[CF] A. Cianchi and N. Fusco, Gradient regularity for minimizers under general growth conditions, J. Reine Angew. Math. 507 (1999), 15–36.
[CP] A. Cianchi and L. Pick, Sobolev embeddings into BMO, VMO and L∞, Ark. Mat. 36 (1998), 317–340.
[DT] F. Demengel and R. Temam, Convex functions of a measure and applications, Indiana Univ. Math. J. 33 (1984), 673–709.
[Di] J. I. Diaz, “Applications of symmetric rearrangement to certain nonlinear elliptic equations with a free boundary” in Nonlinear Differential Equations (Granada, 1984), Res. Notes Math. 132, Pitman, London, 1985, 155–181.
[Du] G. F. D. Duff, Differences, derivatives and decreasing rearrangements, Canad. J. Math. 19 (1967), 1153–1178.
[EKP] D. E. Edmunds, R. A. Kerman, and L. Pick, Optimal Sobolev embeddings involving rearrangement-invariant quasinorms, J. Funct. Anal. 170 (2000), 307–355.
[E] A. Ehrhard, Inégalités isopérimétriques et intégrales de Dirichlet gaussiennes, Ann. Sci. École Norm. Sup. (4) 17 (1984), 317–332.
[Ga] S. Gallot, “Inégalités isopérimétriques et analytiques sur les variétés riemanniennes” in On the Geometry of Differentiable Manifolds (Rome, 1986), Astérisque 163–164, Soc. Math. France, Montrouge, 1988, 5–6, 31–91, 281.
[Gi] E. Giusti, Metodi diretti nel calcolo delle variazioni, Unione Mat. Italiana, Bologna, 1994.
[H] K. Hilden, Symmetrization of functions in Sobolev spaces and the isoperimetric inequality, Manuscripta Math. 18 (1976), 215–235.
[Ka1] B. Kawohl, Rearrangements and Convexity of Level Sets in PDE, Lecture Notes in Math. 1150, Springer, Berlin, 1985.
[Ka2] B. Kawohl, On the isoperimetric nature of a rearrangement inequality and its consequences for some variational problems, Arch. Rational Mech. Anal. 94 (1986), 227–243.
[Ke] S. Kesavan, Some remarks on a result of Talenti, Ann. Scuola Norm. Sup. Pisa Cl. Sci. (4) 15 (1988), 453–465.
[Kl] V. S. Klimov, Imbedding theorems and geometric inequalities, Math. USSR Izvestija 10 (1976), 615–638.
[Ko] V. I. Kolyada, Rearrangements of functions and embedding theorems, Russian Math. Surveys 44 (1989), 73–117.
[LS] P. Laurence and E. Stredulinsky, A bootstrap argument for Grad generalized differential equations, Indiana Univ. Math. J. 38 (1989), 377–415.
[L] E. Lieb, Sharp constants in the Hardy-Littlewood-Sobolev and related inequalities, Ann. of Math. (2) 118 (1983), 349–374.
[Ma] V. M. Maz’ja, Sobolev Spaces, Springer Ser. Soviet Math., Springer, Berlin, 1985.
[Mo] J. Moser, A sharp form of an inequality by N. Trudinger, Indiana Univ. Math. J. 20 (1971), 1077–1092.
[PS] G. Pólya and G. Szegö, Isoperimetric Inequalities in Mathematical Physics, Ann. of Math. Stud. 27, Princeton Univ. Press, Princeton, 1951.
[RT] J.-M. Rakotoson and R. Temam, A co-area formula with applications to monotone rearrangement and to regularity, Arch. Rational Mech. Anal. 109 (1990), 213–238.
[Ry] J. V. Ryff, Measure preserving transformations and rearrangements, J. Math. Anal. Appl. 31 (1970), 449–458.
[Sa] G. Savaré, On the regularity of the positive part of functions, Nonlinear Anal. 27 (1996), 1055–1074.
[Sp1] E. Sperner, Zur Symmetrisierung von Funktionen auf Sphären, Math. Z. 134 (1973), 317–327.
[Sp2] E. Sperner, Symmetrisierung für Funktionen mehrerer reeller Variablen, Manuscripta Math. 11 (1974), 159–170.
[Spi] W. Spiegel, Über die Symmetrisierung stetiger Funktionen im euklidischen Raum, Arch. Math. (Basel) 24 (1973), 545–551.
[Ta1] G. Talenti, Best constant in Sobolev inequality, Ann. Mat. Pura Appl. (4) 110 (1976), 353–372.
[Ta2] G. Talenti, Elliptic equations and rearrangements, Ann. Scuola Norm. Sup. Pisa Cl. Sci. (4) 3 (1976), 697–718.
[Ta3] G. Talenti, “Rearrangements and PDE” in Inequalities (Birmingham, 1987), Lecture Notes in Pure and Appl. Math. 129, Dekker, New York, 1991, 211–230.
[Ta4] G. Talenti, “On isoperimetric theorems of mathematical physics” in Handbook of Convex Geometry, Vol. A, B, North-Holland, Amsterdam, 1993, 1131–1147.
[Ta5] G. Talenti, “Inequalities in rearrangement invariant function spaces” in Nonlinear Analysis, Function Spaces and Applications (Prague, 1994), Vol. 5, Prometheus, Prague, 1994, 177–230.
[Ta6] G. Talenti, A weighted version of a rearrangement inequality, Ann. Univ. Ferrara Sez. VII (N.S.) 43 (1997), 121–133.
[Te] R. Temam, Approximation de fonctions convexes sur un espace de mesures et applications, Canad. Math. Bull. 25 (1982), 392–413.
[Z] W. P. Ziemer, Weakly Differentiable Functions, Grad. Texts in Math. 120, Springer, New York, 1989.
Dipartimento di Matematica e Applicazioni per l’Architettura, Università di Firenze, Piazza Ghiberti 27, 50122 Firenze, Italy
Vol. 105, No. 3
DUKE MATHEMATICAL JOURNAL
© 2000
UNIQUENESS FOR LOCALLY INTEGRABLE SOLUTIONS OF OVERDETERMINED SYSTEMS
S. BERHANU and J. HOUNIE

0. Introduction. It is well known that if $u = u(z_1, \dots, z_n)$ is a holomorphic function defined on a connected open set in $\mathbb{C}^n$ and vanishes on a subset of positive measure, then it vanishes everywhere. A holomorphic function may be regarded as a solution of the overdetermined system of equations
$$\frac{\partial u}{\partial \bar z_j} = 0, \qquad j = 1, \dots, n.$$
In this paper we explore generalizations of this property for locally integrable solutions of overdetermined systems
$$L_j u = 0, \qquad j = 1, \dots, n, \tag{0.1}$$
where the $L_j$ are linearly independent vector fields with smooth, complex-valued coefficients defined on some open region in $\mathbb{R}^N$. In contrast to the case of the Cauchy-Riemann (CR) equations, a locally integrable solution $u(x_1, \dots, x_N)$ of the system of $n \le N - 1$ equations
$$\frac{\partial u}{\partial x_j} = 0, \qquad j = 1, \dots, n,$$
may very well vanish on a set of positive measure without vanishing identically. Here we explain these different behaviors in terms of geometric objects associated to each system, namely, the family of orbits of the system (0.1) (see Section 3 for precise definitions). Since the properties involved do not change if each vector $L_j$ in (0.1) is replaced by a vector $L_j' = \sum_k a_{jk}(x) L_k$, where the smooth matrix $[a_{jk}(x)]$ is invertible, we state our results in the framework of a structure $\mathcal{L}$ of rank $n$ defined on an open set $\Omega$ in $\mathbb{R}^N$ of which the vectors (0.1) are local generators. Most of the time we assume that the structure is locally integrable, that is, that every point of $\Omega$ has a neighborhood where a set of $m = N - n$ first integrals $Z_1, \dots, Z_m$ of the structure (i.e., solutions of $\mathcal{L} Z_j = 0$) are defined and satisfy $dZ_1 \wedge \cdots \wedge dZ_m \ne 0$. On the subject of locally integrable structures, we refer to [T]. One of our results is the following theorem.

Received 8 April 1999. Revision received 3 February 2000.
2000 Mathematics Subject Classification. Primary 35F05, 35N10, 35A05; Secondary 35F20, 32F40.
Authors’ work partially supported by Conselho Nacional de Desenvolvimento Científico e Tecnológico, Financiadora de Estudos e Projetos, and Fundação de Amparo à Pesquisa do Estado de São Paulo.
BERHANU AND HOUNIE
Theorem A. Let $\mathcal{L}$ be a locally integrable structure defined on a connected open set $\Omega$ in $\mathbb{R}^N$. Assume that $\Omega$ can be decomposed as $\Omega = \mathcal{O} \cup F$, where $\mathcal{O}$ is an open a.e. minimal orbit of $\mathcal{L}$ and $F$ is a set of measure zero. Then any solution $u \in L^1_{\mathrm{loc}}(\Omega)$ of $\mathcal{L} u = 0$ that vanishes on a set of positive measure must vanish identically. (See Definition 3.3 for the concept of a.e. minimal orbit.)

If a region $\Omega$ satisfies the hypotheses of Theorem A, it has, in particular, the property of a.e. attainability with respect to $\mathcal{L}$ because almost every point $p \in \Omega$ can be attained from a fixed point $q \in \mathcal{O}$ by a continuous curve which is piecewise formed by integral curves of the real parts of local generators of $\mathcal{L}$. Thus, $\Omega$ has the a.e. attainability property if and only if $\Omega$ admits a trivial decomposition, that is, if it can be expressed as the union of an open orbit and a set of measure zero. For instance, the decomposition assumed in Theorem A is a trivial decomposition. The a.e. attainability property is not necessary for the conclusion of the theorem; this is true because a manifold may not admit a trivial decomposition with respect to a given structure while the structure possesses only trivial constant global solutions. For instance, the structure $\mathcal{L}$ generated on the 2-torus $\mathbb{T}^2$ by any real globally hypoelliptic vector field $L$ exhibits this latter property (see [H]). So in this case, homogeneous solutions are determined by their value at a single point. On the other hand, local a.e. attainability is necessary if the conclusion of Theorem A holds on any base of connected neighborhoods of a given point. More precisely, we have the following theorem.

Theorem B. Let $\mathcal{L}$ be a structure defined on an open set $\Omega$ in $\mathbb{R}^N$, and let $p \in \Omega$. Assume that there is a base $\{\Omega_j\}_{j=1}^{\infty}$ of connected neighborhoods of $p$ which do not admit a trivial decomposition. Then there is a base of connected neighborhoods $U_k \subset \Omega_k$ of $p$ and nontrivial functions $u_k \in L^1(U_k)$ that satisfy $\mathcal{L} u_k = 0$ on $U_k$, yet the sets $\{u_k = 0\}$ all have positive measure.

We point out that in Theorem B the structure is not assumed to be locally integrable. It is not even assumed that it is involutive. At any rate, for analytic structures (which are always locally integrable) Theorems A and B imply the local equivalence between the property of a.e. attainability and the uniqueness property that local homogeneous solutions are determined on sets of positive measure. For locally integrable structures that do not necessarily have the a.e. attainability property, we single out classes of homogeneous solutions which are a priori determined by their values on sets of positive measure. An important example of the system (0.1) comes from the tangential Cauchy-Riemann vector fields of a CR manifold embedded in $\mathbb{C}^N$. In this latter case, our main result (see Theorem 1.1) shows that if a locally integrable CR function $u$ vanishes on a set $E$ of positive measure and is the distribution boundary value of a holomorphic function defined on some wedge, then $u$ vanishes in a neighborhood of the Lebesgue density points of $E$. Loosely speaking, the strong uniqueness property that holomorphic functions have (that of being determined on any domain by their values on any subset of positive measure or, equivalently, that their null sets have measure equal to zero except in a trivial case) is
UNIQUENESS FOR SOLUTIONS OF OVERDETERMINED SYSTEMS
inherited by their boundary values at the edge of the wedge where they are defined. In the simpler situation of holomorphic functions of one variable defined on a disk, this principle is well known and present in the classical uniqueness theorems of Riesz and Priwaloff. Thus, Theorem 1.1 is, to a certain extent, a higher-dimensional version of Priwaloff’s theorem. If we are dealing with a CR structure with the property that all CR functions have local holomorphic extension to some wedge, a behavior that has a characterization in terms of orbits by Tumanov’s theorem on wedge extendability, we conclude by Theorem 1.1 that all CR functions possess the strong uniqueness property. This implies Theorem A for CR structures. For more general locally integrable systems as in (0.1), we prove an analogue of Theorem 1.1. As a consequence of this result and the analogue of Tumanov’s theorem for locally integrable systems due to Marson, we derive the following result: if for any small ball $B(p, r)$, $p \in \mathbb{R}^N$, the Sussmann orbit [S] in $B(p, r)$ of the real and imaginary parts of the $L_j$ through the point $p$ has dimension $N$, and if $p$ is a density point of the set where a locally integrable solution $u$ vanishes, then $u$ must vanish identically in a neighborhood of $p$. This is the basis for the proof of Theorem A in the general case. Heuristically, the existence of an a.e. minimal orbit that fills almost all of $\Omega$ assumed in Theorem A means that the structure $\Re\mathcal{L}$ has maximal rank $N$ at many points in any neighborhood of a given point. The opposite situation holds when a point $p$ has a base of connected neighborhoods $\Omega_j$ that do not admit a trivial decomposition, as assumed in Theorem B. In this case, either $\Omega_j$ contains an open orbit $\mathcal{O} \subset \Omega_j$ such that $\Omega_j \setminus \mathcal{O}$ has positive measure, or $\Omega_j$ contains no open orbit, which implies the existence of an element of hypersurface tangent to $\mathcal{L}$. In both cases, it is relatively easy to see that strong uniqueness does not hold.
These are purely geometric arguments that do not involve the theory of holomorphic functions and as such work for quite general structures; in particular, noninvolutive structures are allowed. The organization of this paper is as follows. In Section 1 we give the statement of our main result (see Theorem 1.1) in the CR case and provide a simple proof in the hypersurface situation. Section 2 contains the proof of Theorem 1.1 in full generality. In Section 3 we present the analogue of Theorem 1.1 for more general overdetermined systems. Section 4 contains a necessary condition that a system of vector fields must fulfill in order that their locally integrable solutions be determined on sets of positive measure. Section 5 is devoted to a variety of examples and applications where our main results can be applied. In particular, we prove an analyticity result for certain fully nonlinear systems (see Theorem 5.2). Finally, in the appendix, we present a simpler proof of our main result in the analytic case. The key tool used in the proof of our main result in Theorem 1.1 is the existence of partially attached analytic disks proved recently in [BER]. We provide separate proofs for the CR hypersurface case and the analytic case because in these cases the arguments are simpler, as there is no need to use the analytic disk theorem in [BER].
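To make concrete the contrast drawn in the introduction between holomorphic functions and solutions of $\partial u/\partial x_j = 0$, here is a minimal model example of our own choosing (it is not taken from the paper), with $N = 2$ and $n = 1$. The function $u$ and the test function $\psi$ below are illustrative names.

```latex
% Model case (illustrative assumption): N = 2, n = 1, L_1 = \partial/\partial x_1 on R^2.
\[
  u(x_1, x_2) =
  \begin{cases}
    1, & x_2 > 0,\\
    0, & x_2 \le 0 .
  \end{cases}
\]
% u is locally integrable and solves L_1 u = 0 in the sense of distributions:
\[
  \langle \partial_{x_1} u, \psi \rangle
  = -\int_0^{\infty}\!\!\int_{-\infty}^{\infty}
      \frac{\partial \psi}{\partial x_1}(x_1, x_2)\, dx_1\, dx_2
  = 0
  \qquad \text{for all } \psi \in C_c^{\infty}(\mathbb{R}^2),
\]
% because the inner integral of \partial\psi/\partial x_1 over a full line vanishes.
```

Here $u$ vanishes on the half-plane $\{x_2 \le 0\}$, a set of positive measure, without vanishing identically. The orbits of $\partial/\partial x_1$ are the horizontal lines, so no orbit is open, which is consistent with the role that open orbits play in Theorems A and B.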
Acknowledgment. We wish to thank the referee whose comments improved both the mathematics and presentation of this paper.

1. Statement of the main result and a proof in the hypersurface case. In Section 2 we prove the following result, which is the main theorem in this paper.

Theorem 1.1. Let $M \subset \mathbb{C}^N$ be a generic CR manifold of codimension $d$ and CR dimension $n$ ($N = n + d$). Assume that $\mathcal{W} \subset \mathbb{C}^N$ is a wedge with edge $M$. Suppose $F$ is a holomorphic function of tempered growth on $\mathcal{W}$ with distribution boundary value $f$ in $L^1_{\mathrm{loc}}(M)$. If $f$ vanishes on a subset $E$ of positive measure, then $f \equiv 0$ in a neighborhood of any Lebesgue density point of $E$.

As explained in the introduction, this theorem, its generalization given in Section 3, and its consequences are the essential ingredients for the proof of Theorem A. We now prove the theorem in the hypersurface case. We mention that a different proof for Lemma 1.1 was given in [R]. Let $\Sigma$ be a $C^\infty$-hypersurface in $\mathbb{C}^n$ containing the central point zero. Suppose $f$ is a CR function on $\Sigma$, and assume that $f \in L^p(\Sigma)$ for some $1 \le p \le \infty$. Suppose also that $f$ extends to a holomorphic function $F$ on a side $\Sigma^+$ of $\Sigma$, that is, that $f$ is the boundary value of $F$ in the distribution sense. Under this setup, we have the following lemma.

Lemma 1.1. For any $\Sigma$ and a sufficiently small ball $B$ in $\mathbb{C}^n$ containing zero, the restrictions of $F$ to the hypersurfaces $\{z \in B : \operatorname{dist}(z, \Sigma) = t\}$ have uniformly bounded $L^p$-norms. In particular, $F \in L^p(B \cap \Sigma^+)$.

Remark. If $p > 1$, a partial converse of the lemma holds: if $F$ is holomorphic in $\Sigma^+$ and the restrictions of $F$, as in the lemma, have uniformly bounded $L^p$-norms, then the boundary value of $F$ is in $L^p_{\mathrm{loc}}(\Sigma)$.

Proof. Without loss of generality, we may assume that $\Sigma$ is part of the boundary of a bounded open set $D$ with smooth boundary such that $D \subseteq \Sigma^+$. Let $H$ be harmonic in $D$ with boundary value $f$ on $\Sigma$ and zero off $\Sigma$. By the classical $h^p$-theory for harmonic functions (see [K, Chapter 8]), the restrictions $H_t$ of $H$ to the hypersurfaces $S_t = \{z \in D : \operatorname{dist}(z, \partial D) = t\}$ ($t$ small) are all in $L^p$ and
$$\|H_t\|_{L^p(S_t)} \le \|f\|_{L^p(\Sigma)}.$$
Moreover, it is well known that “$\operatorname{dist}(z, \partial D)$” can be replaced by any defining function for $\partial D$. Since $F$ is holomorphic in $\Sigma^+$ and has a boundary value on $\Sigma$, there exist $C, k > 0$ such that, for any $z \in D$,
$$|F(z)| \le C \operatorname{dist}(z, \Sigma)^{-k}.$$
This may require contracting $\Sigma$, but again we may assume this without loss of generality. It follows that $F$ has a boundary value that is a distribution on $\partial D$. Let $u = F - H$, where $u$ is harmonic in $D$ and has a distributional boundary value $bu$ on $\partial D$ which
is zero on the piece $\Sigma$. We show that $u$ is smooth up to $\Sigma$. Let $G(x, y)$ be Green’s function for $D$, and let $P(x, y)$ be its Poisson kernel. We recall that $P(x, y) = -N_y G(x, y)$ for $x \in D$, $y \in \partial D$, where $N_y$ is the unit outer normal to $D$ at $y$. Fix $x \in D$. The function $y \mapsto G(x, y)$ is zero on $\partial D$ and positive on $D \setminus \{x\}$. By Hopf’s lemma, $N_y G(x, y) \ne 0$ for all $y \in \partial D$. Hence, for $\epsilon$ small enough, the open sets
$$D_\epsilon = \{y \in D : G(x, y) > \epsilon\}$$
have smooth boundaries. Observe that if $G_\epsilon(z, y)$ is Green’s function for $D_\epsilon$, then
$$G_\epsilon(x, y) = G(x, y) - \epsilon.$$
Hence the Poisson kernel $P_\epsilon(z, y)$ for $D_\epsilon$ satisfies $P_\epsilon(x, y) = -N_y^\epsilon G(x, y)$, where $N_y^\epsilon$ is the unit outer normal to $D_\epsilon$ at $y$. We thus have
$$u(x) = \int_{\partial D_\epsilon} P_\epsilon(x, y)\, u(y)\, d\sigma_\epsilon(y) = \int_{\partial D} P_\epsilon\bigl(x, \Pi_\epsilon^{-1}(y)\bigr)\, u\bigl(\Pi_\epsilon^{-1}(y)\bigr)\, J_\epsilon(y)\, d\sigma(y),$$
where $\Pi_\epsilon : \partial D_\epsilon \to \partial D$ is the normal projection map and $J_\epsilon$ is the Jacobian of $\Pi_\epsilon^{-1}$. Since $P_\epsilon(x, y) = -N_y^\epsilon G(x, y)$, as $\epsilon \to 0^+$,
$$P_\epsilon\bigl(x, \Pi_\epsilon^{-1}(y)\bigr)\, J_\epsilon(y) \longrightarrow P(x, y)$$
in $C^\infty(\partial D)$. It follows that for any $x \in D$,
$$u(x) = \langle bu, P(x, \cdot) \rangle.$$
This latter formula, together with the vanishing of $bu$ on $\Sigma$, tells us that $u$ is $C^\infty$ up to the boundary piece $\Sigma$. Since $F = H + u$ and $H \in h^p(D)$, the assertions of the lemma follow.

Corollary 1.1 (Nontangential convergence). Let $f$ and $F$ be as in Lemma 1.1, and let $D$ be as in the proof of the lemma. For $\alpha > 1$ and $A \in \Sigma$, define
$$\Gamma_\alpha(A) = \{z \in D : |z - A| < \alpha\,\delta(z)\},$$
where $\delta(z) = \operatorname{dist}(z, \partial D)$. Then
$$\lim_{\Gamma_\alpha(A) \ni z \to A} F(z) = f(A)$$
for almost all $A$ in $\Sigma$.
Proof. Recall from the proof of Lemma 1.1 that $F = H + u$. Since $u$ is smooth up to the piece $\Sigma$ and $bu$ vanishes on $\Sigma$, $\lim_{D \ni z \to A} u(z) = 0$ for all $A \in \Sigma$. The corollary therefore follows from the fact that $H \in h^p(D)$ and that on $\Sigma$, $H = f$.

We now use Corollary 1.1 to prove the following uniqueness result.

Theorem 1.2. Suppose $\Sigma$ is a smooth hypersurface containing the origin in $\mathbb{C}^n$ and $f$ is the boundary value of a holomorphic function $F$ defined on one side of $\Sigma$. Assume that $f \in L^1_{\mathrm{loc}}(\Sigma)$ and that it vanishes on a set $E$ of positive measure for which the origin is a point of density. Then $f$ vanishes in a neighborhood of the origin.

Proof. We may choose coordinates $(z, s)$ near zero in $\mathbb{C}^{n-1} \times \mathbb{R}$ so that
$$\Sigma = \{(z, s + i\phi(z, s)) : z \in U,\ |s| < a\},$$
where $U$ is a neighborhood of zero in $\mathbb{C}^{n-1}$ and $a > 0$. The function $\phi$ is smooth, real-valued, $\phi(0) = 0$, and $d\phi(0) = 0$. We may assume that $F$ is holomorphic in
$$\Sigma^+ = \{(z, s + it) \in B : t > \phi(z, s)\},$$
where $B$ is a ball in $\mathbb{C}^n$ centered at zero. We may also assume that $\Sigma \subset B$ and that $f \in L^1(\Sigma)$. Let
$$A = \{z \in \mathbb{C}^{n-1} : (z, s + i\phi(z, s)) \in E \text{ for } s \text{ in a set of positive measure}\}.$$
Observe that $A$ has positive measure in $\mathbb{C}^{n-1}$. After modifying $A$ by a set of measure zero, we may assume that for $z \in A$, the function
$$(-a, a) \ni s \longmapsto f(z, s)$$
is in $L^1$. By Corollary 1.1, $F$ converges to $f$ nontangentially a.e. on $\Sigma$. Therefore, after modifying $A$ again but still maintaining its measure, we may suppose that, for each $z \in A$ and $P = (z, s + i\phi(z, s)) \in \Sigma$,
$$\lim_{\Gamma_\alpha(P) \ni Q \to P} F(Q) = f(P)$$
for $s$ a.e. in $(-a, a)$. For $z \in A$, consider the planar domain
$$D^z = \{(z, s + it) \in B : t > \phi(z, s)\},$$
which intersects $\Sigma$ in the smooth curve
$$\gamma^z = \{(z, s + i\phi(z, s)) : |s| < a\}.$$
Fix $z \in A$. The function $w \mapsto F(z, w)$ is holomorphic on $D^z$ and has nontangential limit $f$ a.e. on $\gamma^z$. Moreover, since $z \in A$, $f$ vanishes on a subset of $\gamma^z$ of positive
measure. By Priwaloff’s theorem [LP], we conclude that $F \equiv 0$ on $D^z$. But since
$$\bigcup_{z \in A} D^z$$
has a positive measure in $\Sigma^+$, $F \equiv 0$ on $\Sigma^+$. It follows that $f$ vanishes in a neighborhood of the origin.

2. Proof of Theorem 1.1. To prove Theorem 1.1, we may assume that $0 \in M$ is a density point of $E$ and that $M$ near zero is defined by $\Im w = \phi(x, y, \Re w)$, where $z = x + iy \in \mathbb{C}^n$ and $w \in \mathbb{C}^d$, $N = n + d$. The function $\phi$ is real-valued, smooth, $\phi(0) = 0$, and $d\phi(0) = 0$. We may also assume that the wedge $\mathcal{W}$ contains a wedge of the form
$$\{(z, w) : w = s + i\phi(x, y, s) + iv,\ |z| < 2\delta,\ |s| < 2\delta,\ |v| < 2\delta,\ v \in \Gamma\}$$
for some open convex cone $\Gamma \subset \mathbb{R}^d$ and $\delta > 0$. We may suppose that $\|d\phi(x, y, s)\| < 1/4$ for $|x|, |y|, |s| < 2\delta$. Without loss of generality, assume that
$$\Gamma = \{v = (v', v_d) : |v'| < 2\delta v_d\}.$$
Let
$$\Gamma' = \{(y, t) \in \mathbb{R}^{n+d} : |(y, t')| < \delta t_d\}, \qquad t = (t', t_d).$$
For $|y_0| < \delta$, the set
$$\mathcal{W}_{y_0} = \{(x + iy_0 + iy,\ s + i\phi(x, y_0, s) + it) : (y, t) \in \Gamma',\ |x|, |y|, |s|, |t| < \delta\}$$
is contained in the wedge $\mathcal{W}$. Indeed, this follows from the definitions of $\Gamma$ and $\Gamma'$ and from the assumption on the norm of $d\phi$. Observe that $\mathcal{W}_{y_0}$ is a wedge in $\mathbb{C}^N$ with a maximally totally real edge
$$M_{y_0} = \{(x + iy_0,\ s + i\phi(x, y_0, s)) : |x| < \delta,\ |s| < \delta\}.$$
Fix $y_0$, $|y_0| < \delta$, such that
$$(x, s) \longmapsto f(x, y_0, s)$$
is in $L^1$ and such that the $(n + d)$-dimensional set $M_{y_0}$ intersects $E$ in a set of positive measure. Note that $F$ is holomorphic and of tempered growth in the wedge $\mathcal{W}_{y_0}$. Hence $F$ has a distribution boundary value $bF$ on $M_{y_0}$. We eventually show that $bF$ on $M_{y_0}$ agrees with $f$ for almost all $y_0$. Assuming this for now, it is clear that Theorem 1.1 would follow if we could show that $F \equiv 0$ on $\mathcal{W}_{y_0}$. This kind of reduction to a maximally totally real manifold also appears in [BER, proof of Theorem 7.2.6].
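The inclusion of the wedge over $M_{y_0}$ in the original wedge is asserted above without computation. The following is a sketch of the cone part of that verification, under assumptions that are ours rather than the authors’: Euclidean norms throughout, the bound on $d\phi$ read as an operator-norm bound $\|d\phi\| < 1/4$, $\delta \le 1$, and $\Gamma'$ used as our label for the second cone $\{(y,t) : |(y, t')| < \delta t_d\}$. The size constraints $|z|, |s|, |v| < 2\delta$ are not tracked here.

```latex
% A point of the smaller wedge: z = x + i(y_0 + y), w = s + i\phi(x, y_0, s) + it,
% with (y, t) in \Gamma'. Rewrite w relative to the edge M at the base point (x, y_0 + y, s):
\[
  w = s + i\phi(x, y_0 + y, s) + iv,
  \qquad
  v = t + \phi(x, y_0, s) - \phi(x, y_0 + y, s).
\]
% Mean value inequality with \|d\phi\| < 1/4 along the segment from y_0 to y_0 + y
% (which stays in |y| < 2\delta):
\[
  |v - t| \le \tfrac14 |y| < \tfrac{\delta}{4}\, t_d ,
  \qquad \text{since } |y| \le |(y, t')| < \delta t_d .
\]
% Split v = (v', v_d) and use |t'| < \delta t_d:
\[
  |v'| \le |t'| + |v - t| < \tfrac54\, \delta\, t_d ,
  \qquad
  v_d \ge t_d - |v - t| > \bigl(1 - \tfrac{\delta}{4}\bigr)\, t_d .
\]
% For \delta \le 1 we have 5/4 \le 2(1 - \delta/4), hence
\[
  |v'| < \tfrac54\, \delta\, t_d
       \le 2\delta \bigl(1 - \tfrac{\delta}{4}\bigr)\, t_d
       < 2\delta\, v_d ,
  \qquad \text{i.e. } v \in \Gamma .
\]
```

The factor $1/4$ in the bound on $d\phi$ is what leaves room between the aperture $\delta$ of the smaller cone and the aperture $2\delta$ of $\Gamma$.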
We are thus led to consider a maximally totally real submanifold $\Sigma$ of $\mathbb{C}^m$ given in a neighborhood $U$ of the origin by
$$t = \phi(s), \qquad s \in U,$$
where $w = s + it$ are standard complex coordinates in $\mathbb{C}^m$, $\phi$ is a smooth $\mathbb{R}^m$-valued function defined near $0 \in \mathbb{R}^m$, and $\phi(0) = d\phi(0) = 0$. We recall that a $\mathbb{C}^m$-valued analytic disk is a map $A : \overline{\Delta} \to \mathbb{C}^m$ of class $C^{1+\alpha}$ on the closed unit disk $\overline{\Delta}$ of the complex plane which is holomorphic on $\Delta$ (here $0 < \alpha < 1$ is fixed once and for all). An analytic disk $A$ is said to be partially attached to $\Sigma$ at $p$ if (i) $A(e^{i\theta}) \in \Sigma$ for $|\theta| \le \pi/2$ and (ii) $A(1) = p$. The Banach space of $\mathbb{C}^m$-valued analytic disks is denoted by $\mathcal{A}^m$. We recall from [BER, Theorem 7.4.12] the existence of analytic disks partially attached to $\Sigma$.

Theorem [BER, Theorem 7.4.12]. There exist a neighborhood $U \times V$ of $(0, 0) \in \mathbb{R}^m \times \mathbb{R}^m$ and a smooth map $U \times V \ni (s, v) \mapsto A_{s,v} \in \mathcal{A}^m$ satisfying the following properties for all $(s, v) \in U \times V$:
(i) $A_{s,v}(1) = s + i\phi(s)$;
(ii) $A_{s,v}(e^{i\theta}) \in \Sigma$ for $|\theta| \le \pi/2$;
(iii) $(d/d\theta)(A_{s,v})(e^{i\theta})|_{\theta=0} = v + i\phi'(s) \cdot v$;
(iv) $(d/dr)(A_{s,v})(r)|_{r=1} = iv - \phi'(s) \cdot v$.

Notice that we have included (iv) here since it follows from (iii) and the Cauchy-Riemann equations satisfied by $\zeta \mapsto A_{s,v}(\zeta)$ at $\zeta = 1$. The meaning of (i) and (ii) is that $A_{s,v}$ is partially attached to $\Sigma$ at $p = (s, \phi(s))$; (iii) implies that we can choose a neighborhood $\widetilde{U} \subset U$ of the origin and a small $\epsilon > 0$ such that, for every $p = s_0 + i\phi(s_0)$, $s_0 \in \widetilde{U}$, the map
$$(0, \epsilon) \times S^{m-1}(0, \epsilon) \ni (\theta, \omega) \longmapsto A_{s_0,\omega}(e^{i\theta}) \in \Sigma$$
yields a $C^{1+\alpha}$ local system of polar coordinates centered at $p$ on $\Sigma$, where $S^{m-1}(0, \epsilon)$ denotes the sphere of radius $\epsilon$ centered at $0 \in \mathbb{R}^m$. In particular, given $v_0 \in \mathbb{R}^m$, $|v_0| = \epsilon$, and $p_1 = s_1 + i\phi(s_1)$, $s_1 \in \widetilde{U}$, we may find $s_0 \in U$ and $\theta_0 \in (0, \epsilon)$ such that $p_1 = A_{s_0,v_0}(e^{i\theta_0})$. Assume that $p_1$ is a density point of a measurable set $E \subseteq \Sigma$ (in particular, $E$ has positive measure), and let $U_0 \ni s_0$ and $V_0 \ni v_0$ be open sets of diameter smaller than $\epsilon$. Consider the set
$$\widetilde{E}^{\epsilon} = \Bigl\{(s, v) \in U_0 \times \bigl(S^{m-1}(0, \epsilon) \cap V_0\bigr) : \int_0^{\epsilon} \chi_E\bigl(A_{s,v}(e^{i\theta})\bigr)\, d\theta > 0\Bigr\},$$
where $\chi_E$ denotes the characteristic function of $E$. We observe that we may assume without loss of generality that $\widetilde{E}^{\epsilon}$ has positive $(2m - 1)$-dimensional measure. Indeed, the function
$$\theta \longmapsto \int_{|s - s_0| < \epsilon} \int_{|v - v_0| < 2\epsilon} \chi_E\bigl(A_{s,v}(e^{i\theta})\bigr)\, ds\, dv$$
is continuous and assumes a positive value at $\theta = 0$ because $A_{s,v}(1) = s + i\phi(s)$. Hence
$$\int_{|s - s_0| < \epsilon} \int_{|v - v_0| < 2\epsilon} \int_0^{\epsilon} \chi_E\bigl(A_{s,v}(e^{i\theta})\bigr)\, d\theta\, ds\, dv > 0,$$
and by writing $v$ in polar coordinates we see that, for some $0 < \epsilon' < 2\epsilon$, our claim is true for $\widetilde{E}^{\epsilon'}$. Now consider the map
$$U_0 \times \bigl(S^{m-1}(0, \epsilon) \cap V_0\bigr) \times (1 - \epsilon, 1) \ni (s, v, r) \longmapsto A_{s,v}(r) \in \mathbb{C}^m. \tag{2.1}$$
Taking account of (iv), we note that this map has rank $2m$ for small $\epsilon > 0$ and maps $\{s\} \times (S^{m-1}(0, \epsilon) \cap V_0) \times (1 - \epsilon, 1)$ onto $B_p \setminus \{p\}$, where $B_p$ is a $C^{1+\alpha}$-differentiable $m$-ball that intersects $\Sigma$ orthogonally at $p = s + i\phi(s)$. Indeed, the respective tangent planes at $p$ are
$$T_p\Sigma = \{s + i\phi(s) + v + i\phi'(s) \cdot v,\ v \in \mathbb{R}^m\}, \qquad T_p B_p = \{s + i\phi(s) + iv - \phi'(s) \cdot v,\ v \in \mathbb{R}^m\}.$$
Since the map (2.1) is a local diffeomorphism, it takes $\widetilde{E}^{\epsilon} \times (1 - \epsilon, 1)$ onto a set of positive measure which is contained in the union of the disks $\{A_{s,v} : (s, v) \in \widetilde{E}^{\epsilon}\}$. We could say that these disks are strongly attached to $E$ in the sense that, for any $(s, v) \in \widetilde{E}^{\epsilon}$, the set of boundary points $\{A_{s,v}(e^{i\theta}) : 0 < \theta < \epsilon\}$ intersects $E$ at a nonnegligible set of values of $\theta$.

Now consider a holomorphic function $F$ of slow growth defined in a wedge $\mathcal{W} = \Sigma \times \Gamma$ with edge $\Sigma$ possessing a weak trace $f \in L^p(\Sigma)$, and assume that $f$ vanishes on $E$. Furthermore, assume that $v_0 \in \Gamma$. We now sketch how we try to prove that $F$ must vanish. First we prove that if $\epsilon > 0$ is small enough and $(s, v) \in \widetilde{E}^{\epsilon}$, then the portion $A^{\epsilon}_{s,v}$ of the disk $A_{s,v}$ described by the inequalities $-\epsilon < \theta < \epsilon$ and $1 - \epsilon < r < 1$ is contained in the wedge $\mathcal{W}$. Then the composition $F(A_{s,v}(re^{i\theta}))$ is defined for $-\epsilon < \theta < \epsilon$, $1 - \epsilon < r < 1$, is holomorphic, and has a weak boundary value that, for a.e. $(s, v) \in \widetilde{E}^{\epsilon}$, is given by $f(A_{s,v}(e^{i\theta}))$; the proof of this fact is our second step. The third step is to prove that, for a.e. $(s, v)$, the restriction of $f$ to the curve $(\epsilon/2, \epsilon) \ni \theta \mapsto A_{s,v}(e^{i\theta})$ is in $L^p$. Hence, by the analysis made in the proof of Theorem 1.2, the holomorphic function of one complex variable $F(A_{s,v}(re^{i\theta}))$ vanishes identically for $-\epsilon < \theta < \epsilon$, $1 - \epsilon < r < 1$, in particular, for $\theta = 0$. But we know that by letting $(s, v, r)$ vary on $\widetilde{E}^{\epsilon} \times (1 - \epsilon, 1)$ and keeping $\theta = 0$, the union of $\{A_{s,v}(re^{i\theta})\}$ covers the image of $\widetilde{E}^{\epsilon} \times (1 - \epsilon, 1)$ under (2.1), a set of positive measure. Thus, $F$ vanishes a.e. on that set and so must vanish identically. The proof of the second step involves a discussion about the trace, which is developed next.
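We begin our considerations by looking at the simplest case of a holomorphic function of one complex variable $F(x + iy)$ defined for $|x| < 1$, $0 < y < 2$ which satisfies the inequality
$$|F(x + iy)| \le C\,|\ln y|, \qquad |x| < 1,\ 0 < y < 2. \tag{2.2}$$
We assume (2.2) for simplicity, but the argument below can be iterated to handle the case $|F(x + iy)| \le C|y|^{-N}$. The standard manner of defining the weak trace $f$ of $F$ as an element of $\mathcal{D}'$ is through the formula
$$\langle f, \psi \rangle = \lim_{\varepsilon \searrow 0} \int F(x + i\varepsilon)\, \psi(x)\, dx, \qquad \psi \in C_c^\infty(-1, 1). \tag{2.3}$$
In formula (2.3) we see that, for each fixed $x$, the argument of $F$ describes a straight vertical segment $\varepsilon \mapsto x + i\varepsilon$ that flows towards $x$ as $\varepsilon \to 0$. We wish to see what happens if we change each vertical segment to a curve $\varepsilon \mapsto x + \alpha(x, \varepsilon) + i\varepsilon$. We assume that $(-1, 1) \times [0, 1) \ni (x, \varepsilon) \mapsto \alpha$ is of class $C^2$ (we would need class $C^{N+2}$ if we were assuming $|F(x + iy)| \le C|y|^{-N}$ instead of (2.2)) and that $\alpha(x, 0) = 0$, $|x| < 1$. The latter assumption simply means that the curve $\varepsilon \mapsto x + \alpha(x, \varepsilon) + i\varepsilon$ flows towards $x$ as $\varepsilon \to 0$. Thus,
$$\Bigl(\frac{\partial}{\partial x}\Bigr)^{j} \alpha(x, 0) = 0, \qquad j = 0, 1, 2. \tag{2.4}$$
We now define
$$\langle \tilde f, \psi \rangle = \lim_{\varepsilon \searrow 0} \int F\bigl(x + \alpha(x, \varepsilon) + i\varepsilon\bigr)\, \psi(x)\, dx, \qquad \psi \in C_c^\infty(-1, 1),$$
and we wish to prove that $\tilde f = f$. To that end we write
$$F\bigl(x + \alpha(x, \varepsilon) + i\varepsilon\bigr) = F\bigl(x + \alpha(x, \varepsilon) + i\bigr) - i \int_{\varepsilon}^{1} F'\bigl(x + \alpha(x, \varepsilon) + it\bigr)\, dt.$$
It follows from (2.4) that if $x$ belongs to a compact part of $(-1, 1)$ and $\varepsilon$ is small, then $|\alpha_x(x, \varepsilon)| < 1/2$. We assume for simplicity that $|\alpha_x(x, \varepsilon)| < 1/2$ holds everywhere. Then
$$\int F\bigl(x + \alpha(x, \varepsilon) + i\varepsilon\bigr)\, \psi(x)\, dx = \int F\bigl(x + \alpha(x, \varepsilon) + i\bigr)\, \psi(x)\, dx + i \int\!\!\int_{\varepsilon}^{1} F\bigl(x + \alpha(x, \varepsilon) + it\bigr)\, dt\, \frac{\partial}{\partial x}\Bigl[\frac{\psi(x)}{1 + \alpha_x(x, \varepsilon)}\Bigr]\, dx.$$
Letting $\varepsilon \to 0$ and taking account of (2.4), we obtain $\langle \tilde f, \psi \rangle = \langle f, \psi \rangle$, as we wished.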
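The last displayed identity combines the fundamental theorem of calculus in $t$ with an integration by parts in $x$. A sketch of the intermediate steps follows, in notation that is partly ours ($\alpha_x = \partial\alpha/\partial x$, $F'$ the complex derivative of $F$, $\alpha$ short for $\alpha(x, \varepsilon)$), assuming as in the text that $|\alpha_x| < 1/2$, so that $1 + \alpha_x$ does not vanish, and that $\psi$ has compact support in $(-1, 1)$.

```latex
% Step 1: chain rule in x for the holomorphic function F.
\[
  \frac{\partial}{\partial x}\Bigl[F\bigl(x + \alpha(x,\varepsilon) + it\bigr)\Bigr]
  = \bigl(1 + \alpha_x(x,\varepsilon)\bigr)\, F'\bigl(x + \alpha(x,\varepsilon) + it\bigr).
\]
% Step 2: fundamental theorem of calculus in t, from t = \varepsilon to t = 1.
\[
  F(x + \alpha + i\varepsilon)
  = F(x + \alpha + i) - i \int_{\varepsilon}^{1} F'(x + \alpha + it)\, dt .
\]
% Step 3: multiply by \psi, integrate in x, substitute Step 1, and integrate by parts
% (no boundary terms, since \psi has compact support).
\begin{align*}
  -i \int\!\!\int_{\varepsilon}^{1} F'(x + \alpha + it)\, dt\, \psi(x)\, dx
  &= -i \int_{\varepsilon}^{1}\!\!\int
     \frac{\partial}{\partial x}\bigl[F(x + \alpha + it)\bigr]\,
     \frac{\psi(x)}{1 + \alpha_x(x,\varepsilon)}\, dx\, dt \\
  &= i \int_{\varepsilon}^{1}\!\!\int F(x + \alpha + it)\,
     \frac{\partial}{\partial x}\!\left[\frac{\psi(x)}{1 + \alpha_x(x,\varepsilon)}\right] dx\, dt .
\end{align*}
```

Under (2.2) the integrand on the last line is dominated by a constant multiple of $|\ln t|$, which is integrable on $(0, 1)$, and the $x$-derivative of $\psi/(1 + \alpha_x)$ involves $\alpha_{xx}$, which is where the $C^2$ assumption on $\alpha$ enters. These two observations are what allow the passage to the limit $\varepsilon \to 0$.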
From now on we return to the general situation of a maximally totally real submanifold of Cm and a holomorphic function F defined on a wedge ᐃ = × . and possessing a trace f ∈ Lp (). We now take advantage of two facts. (1) The formula f, ψ = lim F s + α(s, ε) + i φ(s) + εv0 + β(s, ε) ψ(s) ds ε→0
is independent of the family of curves σ (s, ε) = (s + α(s, ε), εv0 + β(s, ε)) as long as all curves ε → σ (ε, s) are contained in ᐃ, they have the right number of bounded derivatives, and α(s, 0) = β(s, 0) = 0, s ∈ U (the assumptions imply that vo ∈ .). (2) The analytic disks described in [BER, Theorem 7.4.12] are taken to be of class C k+α rather than C 1+α , where k is a large positive integer. The first fact follows from [BER, Proposition 7.2.22]. The second fact is true because [BER, Theorems 6.5.4 and 7.4.12] are valid with the same proofs if the analytic disks are taken to be in C k,α for a fixed-positive integer k. In the proof of [BER, Theorem 7.4.12], the function γ h has to be modified so that one gets a C k -extension. Set s = (s1 , . . . , sm−1 ). We assume without loss of generality that (i) for any ( > 0 the set
(2.5) s : |s | < ( and s , 0, v0 ∈ E was defined on page 8); has positive measure (remember that E (ii) v0 = (0, . . . , 0, a) for some small a > 0. For |s | < (, |θ| < (, consider the map (s , θ) −→ A(s ,0),v0 eiθ , which for small ( has an injective differential. We consider a family of curves σ (p, ε) defined by p = A(s ,0),v0 eiθ , σ (p, ε) = A(s ,0),v0 (1 − ε)eiθ . Observe that σ (p, 0) = p and that we are implicitly using (s , θ) as local coordinates. For small ( the curves ε → σ (p, ε) are contained in ᐃ, and it follows from our assumptions that, for any test function ψ with small support around s = 0, f, ψ = lim F p + σ (p, ε) + iσ (p, ε) ψ(p) ds ε→0 = lim F A(s ,0),v0 (1 − ε)eiθ ψ(s , θ)J (s , θ) ds dθ. ε→0
Assuming that f ∈ Lp () and using Fubini’s theorem in the coordinates (s , θ), we see that for a.e. |s | < (, the function θ → A(s ,0),v0 [eiθ ] is in Lp . Fixing such an s
is equivalent to fixing an analytic disk with the property that the restriction of $f$ to a portion of its boundary that is contained in $\Sigma$ is in $L^p$. We now take test functions such that $\psi J$ has separated variables; that is, $\psi(s', \theta) J(s', \theta) = \psi_1(s') \psi_2(\theta)$. Since $F$ has tempered growth, so does the composition $F \circ A_{(s',0),v_0}$, and it follows that
$$\langle \tilde f_{s'}, \psi_2 \rangle = \lim_{\varepsilon \to 0} \int F\bigl(A_{(s',0),v_0}\bigl((1 - \varepsilon) e^{i\theta}\bigr)\bigr)\, \psi_2(\theta)\, d\theta$$
defines a distribution in $\theta$ that depends continuously on $s'$ as a parameter (use the usual method to define the trace, integrating by parts with respect to $\theta$). We further have
$$\int \langle \tilde f_{s'}, \psi_2 \rangle\, \psi_1(s')\, ds' = \langle f, \psi \rangle.$$
We may now reason, as in [HT, Lemma B.2], that for a.e. $s'$, $|s'| < \epsilon$, $\tilde f_{s'} \in L^p(-\epsilon, \epsilon)$ and $\tilde f_{s'}(\theta) = f(s', \theta)$. If $s'$ is in the set (2.5) and $\tilde f_{s'}(\theta) = f(s', \theta)$ holds, this means that $\zeta \mapsto F(A_{(s',0),v_0}(\zeta))$ has an $L^p$-boundary value that vanishes on a set of positive measure, which implies that $\zeta \mapsto F(A_{(s',0),v_0}(\zeta))$ vanishes identically. We conclude that for a.e. $s'$ on the set (2.5), $F(A_{(s',0),v_0}(\zeta)) = 0$ or, equivalently, that the set
$$E(0, v_0) = \{s' : |s'| < \epsilon \text{ such that } F \circ A_{(s',0),v_0}(\zeta) \equiv 0\}$$
has positive measure. A similar conclusion could have been reached for the set
$$E(s_m, v) = \{s' : |s'| < \epsilon \text{ such that } F \circ A_{(s',s_m),v}(\zeta) \equiv 0\},$$
where $s_m$ is a small number and $|v - v_0|$ is small. Thus, the set $\{(s, v)\}$ such that $F \circ A_{s,v}(\zeta) \equiv 0$ has positive measure and so does the union of the corresponding partially attached disks.

3. Uniqueness for more general overdetermined systems. Let $\Omega$ be a neighborhood of the origin zero in $\mathbb{R}^{m+n}$, and suppose $\mathcal{V}$ is a vector subbundle of the complexified tangent space $\mathbb{C}T\Omega$ with the Frobenius property and fiber dimension $n$. If $\Omega'$ is an open subset of $\Omega$ and $u$ is a distribution in $\Omega'$, we say that $u$ is a solution in $\Omega'$ if $Lu = 0$ in $\Omega'$ for every smooth section $L$ of $\mathcal{V}$ over $\Omega'$. We assume that every point of $\Omega$ has a neighborhood in which the orthogonal of $\mathcal{V}$ in $\mathbb{C}T^*\Omega$ is spanned by $m$ exact forms. This implies that there are $m$ smooth, complex-valued functions $Z_1, \dots, Z_m$ whose differentials form a basis of the orthogonal of $\mathcal{V}$ near zero. We may assume that the $Z_j$ vanish at the central point zero. One can then arrange it (see [T]) so that, for some integer $r \le n$,
$$x_j = \Re Z_j, \quad y_j = \Im Z_j \quad (j = 1, \dots, r), \qquad s_k = \Re Z_{r+k} \quad (k = 1, \dots, m - r)$$
form part of a local coordinate system for $\Omega$ in some neighborhood $W$ of zero. Let $y_{r+1}, \dots, y_n$ denote the rest of the coordinates in that system. We may further assume
that the forms $dZ_{r+1}, \dots, dZ_m$ are real at zero. Thus, we may write these “first integrals” as
$$Z_j = x_j + i y_j, \qquad 1 \le j \le r,$$
$$Z_{r+k} = s_k + i\phi_k(x, y, s), \qquad 1 \le k \le m - r,$$
where each $\phi_k$ is real-valued, $\phi_k(0) = 0$, and $d\phi_k(0) = 0$. The integer $r$ is an invariant attached to the bundle $\mathcal{V}$ at zero. Indeed, $r$ equals the dimension of the real characteristic set of the bundle $\mathcal{V}$ at zero. In general, if the bundle $\mathcal{V}$ does not define a CR structure, $r$ varies from point to point. In the coordinates $(x, y, s)$, a basis for $\mathcal{V}$ over the neighborhood $W$ is given by the vector fields
$$L_j = \frac{\partial}{\partial \bar z_j} - i \sum_{k=1}^{m-r} \frac{\partial \phi_k}{\partial \bar z_j}\, N_k, \qquad 1 \le j \le r,$$
$$L_{r+l} = \frac{\partial}{\partial y_l} - i \sum_{k=1}^{m-r} \frac{\partial \phi_k}{\partial y_l}\, N_k, \qquad 1 \le l \le n - r,$$
where each $N_k = \sum_{j=1}^{m-r} \mu_{kj}\, \partial/\partial s_j$ and the matrix $(\mu_{kj})$ is the inverse of the matrix $(\partial Z_{r+k}/\partial s_j)$.

We next briefly recall the concepts of a hypoanalytic wave front set and the mini-Fourier-Bros-Iagolnitzer transform as presented in [M] (see also [BRT] and [BR]). Assume $W = U \times V$, where $0 \in U \subset \mathbb{R}^{n+r}$ and $0 \in V \subset \mathbb{R}^{m-r}$, with $(x, y) \in U$ and $s \in V$. Let $w, \sigma \in \mathbb{C}^{m-r}$ with $|\Im \sigma| < |\Re \sigma|$, $\langle \sigma \rangle = \bigl(\sum_{j=1}^{m-r} \sigma_j^2\bigr)^{1/2}$, and $\Delta(w, \sigma) = \det(\partial\theta/\partial\sigma)$, where $\theta = \sigma + i\langle \sigma \rangle w$. Suppose $u$ is a distribution in $W$ with compact support in $V$ for each fixed $(x, y)$ in $U$. The mini-Fourier-Bros-Iagolnitzer transform of $u$ is defined as
$$F(x, w, y, \sigma, u) = \int_{\mathbb{R}^{m-r}} e^{\,i\sigma \cdot (w - s' - i\phi(x,y,s')) - \langle \sigma \rangle (w - s' - i\phi(x,y,s'))^2}\, \Delta\bigl(w - s' - i\phi(x, y, s'), \sigma\bigr)\, u(x, y, s')\, \det\bigl(I + i\phi_{s'}(x, y, s')\bigr)\, ds'.$$
Suppose $h$ is a solution defined in $W$. Let $\chi(s) \in C_0^\infty(V)$ with $\chi \equiv 1$ in a neighborhood of the origin in $V$, and let $u(x, y, s) = \chi(s) h(x, y, s)$.

Definition 3.1. The solution $h$ is called microlocally hypoanalytic at $(0, \sigma^0)$, $\sigma^0 \in \mathbb{R}^{m-r}$, if there exist $C > 0$, open neighborhoods $0 \in U \subset \mathbb{R}^{n+r}$ and $0 \in \mathcal{O} \subset \mathbb{C}^{m-r}$, and a conic neighborhood $\Gamma$ of $\sigma^0$ in $\mathbb{C}^{m-r} \setminus \{0\}$ such that
$$|F(x, w, y, \sigma, u)| \le C e^{-|\sigma|/C}$$
for each $(x, y) \in U$, $w \in \mathcal{O}$, and $\sigma \in \Gamma$. If $h$ is not microlocally hypoanalytic at $(0, \sigma^0)$, we say $(0, \sigma^0)$ is in the hypoanalytic wave front set of $h$ and we write $\sigma^0 \in WF_0\, h$.

Theorem 3.1. Let $h$ be a locally integrable solution in a neighborhood of the origin, and assume that $h$ vanishes on a subset $E$ for which zero is a density point. If $WF_0(h) \subseteq \Gamma$ for some strictly convex, closed cone $\Gamma$ in $\mathbb{R}^{m-r}$, then $h \equiv 0$ in a neighborhood of zero.
Proof. The theorem follows by an application of Theorem 1.1 and the embedding trick of $\mathcal{V}$ into a CR structure, as in [M]. Following [M], let $W$ be the neighborhood of the origin, as in the discussion preceding this theorem. Define $\widetilde{W} = X \times W$, where $0 \in X \subseteq \mathbb{R}^{n-r}$ is an open set. Let $\tilde x = (\tilde x_{r+1}, \dots, \tilde x_n)$ denote the coordinates in $X$ and define
$$\widetilde{Z}_j(\tilde x, x, y, s) = \tilde x_{r+j} + i y_{r+j}, \qquad 1 \le j \le n - r,$$
$$\widetilde{Z}_{n-r+j}(\tilde x, x, y, s) = Z_j(x, y, s), \qquad 1 \le j \le m.$$
In $\widetilde{W}$ we have a CR bundle $\widetilde{\mathcal{V}}$, a basis for which is given by
$$\widetilde{L}_j = L_j, \qquad 1 \le j \le r,$$
$$\widetilde{L}_{r+j} = \frac{1}{2} \frac{\partial}{\partial \tilde x_{r+j}} + L_{r+j}, \qquad 1 \le j \le n - r.$$
Observe now that $h$ is a locally integrable CR function and that it vanishes on $\widetilde{E} = X \times E$ for which the origin in $\widetilde{W}$ is a point of density. By the results in [M], the wave front assumption on $h$ implies that $h$ is the boundary value of a holomorphic function defined on a wedge in $\mathbb{C}^{n+m-r}$. The theorem then follows from Theorem 1.1.

Definition 3.2. The bundle $\mathcal{V}$ is called minimal at zero if, given an open set $0 \in \omega \subset \Omega$, there exists a smaller open set $0 \in \omega' \subset \omega$ such that each point in $\omega'$ can be reached from zero by a finite sequence of integral curves contained in $\omega$ of sections of $\mathcal{V}$.

The preceding theorem, together with the extension of Tumanov’s theorem in [M], now yields the following corollary.

Corollary 3.1. Assume $\mathcal{V}$ is minimal at zero. If $h$ is a locally integrable solution in a neighborhood of the origin and vanishes on a set $E$ for which zero is a density point, then $h$ vanishes in a neighborhood of the origin.

We now briefly recall the concept of a Sussmann orbit [S], which is related to the definition of minimality. Let $\mathcal{M}$ be a smooth manifold, and let $\mathcal{L}$ be a set of vector fields on $\mathcal{M}$. Let $\Re\mathcal{L}$ denote the set of locally defined vector fields on $\mathcal{M}$ which are real parts of elements of $\mathcal{L}$. Define an equivalence relation on $\mathcal{M}$ by declaring that $p$ is related to $q$ if there is a piecewise smooth curve from $p$ to $q$ whose pieces are integral curves of elements of $\Re\mathcal{L}$. An equivalence class of this relation is called an orbit of $\mathcal{L}$. In [S] Sussmann proved that each orbit is an immersed submanifold of $\mathcal{M}$.

Definition 3.3. Let $\mathcal{L}$ be a set of vector fields on an open set $\Omega$ in $\mathbb{R}^N$. Suppose $\mathcal{O}$ is an orbit of $\mathcal{L}$. We say that $\mathcal{O}$ is a.e. minimal if the bundle $\mathcal{L}$ is minimal at $p$ for almost every $p \in \mathcal{O}$ in the sense of Lebesgue measure in $\mathbb{R}^N$.

Observe that if $\mathcal{O}$ is a.e. minimal, it has maximal dimension equal to $N$, as in the definition, and is open. In the real analytic category an open orbit is minimal at every one of its points. This follows from the fact that in the analytic case, the dimension
of an orbit equals the dimension of a fiber of the Lie algebra generated by the real parts of ᏸ over any point in the orbit.

Finally, we indicate how Theorem A in the introduction follows from Corollary 3.1. If u ∈ L^1_loc(Ω) satisfies ᏸu = 0 on Ω and vanishes on a set E of positive measure, we may find a density point p of E ∩ ᏻ such that ᏸ is minimal at p. We may apply Corollary 3.1 with ᏸ in the place of ᐂ to conclude that u vanishes in a neighborhood of p. Since the support of u propagates along an orbit (see [T]), u vanishes on ᏻ, which amounts to saying that u vanishes a.e. on Ω.

4. Necessary conditions for uniqueness. Corollary 3.1 provided a sufficient condition under which every solution of the homogeneous overdetermined system of equations
\[ L_j u = 0, \qquad j = 1, \dots, n, \tag{4.1} \]
is determined by its values on a set of positive measure. To study this matter from a local viewpoint, we introduce some definitions and notation. We denote by Z_u the zero set of any locally integrable function u defined on Ω. The set Z_u is determined up to a set of measure zero. A point p ∈ Ω is said to be a density point of Z_u if |Z_u ∩ B(p, r)| > 0 for every ball B(p, r).

Definition 4.1. We say that a system ᏸ of n linearly independent complex vector fields with smooth coefficients L_1, …, L_n has the strong uniqueness property at a point p ∈ Ω if any solution u of (4.1) defined in a neighborhood of p such that p is a density point of Z_u must vanish a.e. in a neighborhood of p.

Consider a system ᏸ of n linearly independent complex vector fields with smooth coefficients L_1, …, L_n defined on an open region Ω ⊂ R^N that contains the origin. By an orbit of ᏸ in Ω we mean an orbit in the sense of Sussmann of the set of real vector fields {Re L_j, Im L_j}_{j=1,…,n}. The next theorem gives a necessary condition that ᏸ must fulfill in order to possess the strong uniqueness property at the origin: any neighborhood V of the origin should contain an open orbit ᏻ of ᏸ in V such that V \ ᏻ has measure zero.

Theorem 4.1. Assume that no orbit ᏻ of ᏸ in Ω has the following two properties: (i) ᏻ is open; that is, it has maximal dimension N; (ii) Ω \ ᏻ has measure zero. Then there exist a neighborhood U of the origin and a function u ∈ L^1(U), not identically zero, such that L_j u = 0, j = 1, …, n, and u vanishes on a set of positive measure.

For the proof we need two lemmas.

Lemma 4.1. Suppose that Ω contains a germ of a hypersurface Σ such that ᏸ is tangent to Σ. Then the conclusion of the theorem follows.
Proof. Let q ∈ Ω belong to Σ. Then Σ divides the ball B = B(q, r) into two components B^+ and B^− for r > 0 small. If 0 ∉ B, we may enlarge B to a connected open set U that contains the origin in such a way that U \ Σ is also the disjoint union of two components U^− and U^+ ∋ 0 (we may even take U^− = B^−). The function u equal to 1 on U^− and equal to zero on U^+ has the required properties.

Lemma 4.2. Let X be a real vector field in Ω with smooth coefficients, and let V ⊂ Ω be an open set that is invariant under the flow of X. Then the characteristic function χ_V satisfies Xχ_V = 0 in the sense of distributions.

Proof. To show that Xχ_V vanishes identically, we begin by proving that each point p ∈ Ω for which X(p) ≠ 0 has a neighborhood where Xχ_V = 0 is satisfied. If X(p) ≠ 0, we choose local cubic coordinates y_1, …, y_N so that y_k(p) = 0, k = 1, …, N, and we assume without loss of generality that
\[ X = \frac{\partial}{\partial y_1}. \]
Let us write
\[ V' = \Pi\bigl(V \cap \{|y_k| < a;\ k = 1, \dots, N\}\bigr), \]
where Π denotes the projection (y_1, …, y_N) ↦ (y_2, …, y_N). The straight horizontal segments y_2 = const, …, y_N = const are contained in integral curves of X; thus they are either contained in V or are disjoint from V. It follows that V ∩ {|y_k| < a} = (−a, a) × V′. Hence, χ_V(y) = χ_{V′}(y_2, …, y_N) for |y_k| < a and Xχ_V = ∂_1 χ_{V′} = 0, as we wished to prove.

Now let ψ ∈ C_c^∞(Ω) be an arbitrary test function with support S, and consider the compact subset K of S where the vector field X vanishes. By what we have just proved, within S the support of Xχ_V is contained in K. For any ε > 0, we find a function φ_ε ∈ C_c^∞(Ω) such that (i) φ_ε = 1 in a neighborhood of K; (ii) φ_ε(q) = 0 if dist(q, K) ≥ ε; and (iii) |∇φ_ε| ≤ C/ε, where C > 0 is independent of ε. Thus,
\[ \langle X\chi_V, \psi\rangle = \langle X\chi_V, \varphi_\varepsilon\psi\rangle + \langle X\chi_V, (1-\varphi_\varepsilon)\psi\rangle = \langle X\chi_V, \varphi_\varepsilon\psi\rangle = \int_V X^t(\varphi_\varepsilon\psi)\,dm, \]
where X^t is the transpose of X and dm denotes the Lebesgue measure in R^N. As ε → 0, X^t(φ_ε ψ) remains bounded and converges pointwise to zero off the subset K′ ⊂ K where X vanishes to order exactly one. Since K′ has measure zero, the dominated convergence theorem shows that ⟨Xχ_V, ψ⟩ = 0.

Proof of Theorem 4.1. We consider two cases. First, assume that there is an orbit ᏻ in Ω of dimension N. By hypothesis, Ω \ ᏻ has positive measure and certainly ᏻ is a union of orbits of ᏸ. It follows from Lemma 4.2 that the characteristic function χ_ᏻ of ᏻ satisfies the equations L_j χ_ᏻ = 0, L̄_j χ_ᏻ = 0, j = 1, …, n. So we obtain the
desired result in this case as soon as we take U ⊂ Ω to be any bounded neighborhood of the origin such that U \ ᏻ has positive measure, and we choose u as the restriction of χ_ᏻ to U.

Let us next assume that no orbit of ᏸ in Ω has dimension N, and let M be the maximal real dimension of the Lie algebra L_q generated by {L_j, L̄_j}_{j=1,…,n} at q as q varies in Ω. It is a general fact that L_q is contained in the tangent space at q of the orbit of ᏸ that passes through q; in particular, M < N. Let q ∈ Ω be a point where the maximal dimension M of L_q is attained. Then the dimension of the Lie algebra L_p is greater than or equal to M for all p close to q. The maximality of M then shows that, near q, p ↦ L_p is a distribution of constant rank M, and so, by Frobenius’s theorem, its integral manifolds foliate a neighborhood V of q. More precisely, in appropriate local coordinates x_1, …, x_N centered at q, the integral manifolds are described by x_{M+1} = const, …, x_N = const, and it is obvious that ᏸ_p ⊂ C ⊗ L_p is tangent to them. Thus, the hypersurface described by x_N = 0 satisfies the hypothesis of Lemma 4.1. The proof is complete.

5. Applications and examples. We first present two examples of locally integrable structures ᏸ defined in a neighborhood of the origin in R^N with the following property: there is a basis {U_k} of neighborhoods of the origin such that for every k = 1, 2, … the decomposition of U_k into orbits of ᏸ contains one open orbit ᏻ such that U_k \ ᏻ is not empty but has measure zero. The third example is a hypersurface in C^2 with an open orbit whose boundary is a lower dimensional orbit. Our examples have analytic coefficients. These examples illustrate the role of the negligible set F in the statement of Theorem A. In this section we also present applications of our results to uniqueness questions for semilinear and fully nonlinear equations.

Example 1. Here N = 3 and the coordinates are denoted by x, s_1, s_2.
Consider the structure generated by the complex vector field L = X + iY, where
\[ X = \frac{\partial}{\partial x}, \qquad Y = \cos(x)\,s_2\,\frac{\partial}{\partial s_1} + \sin(x)\,s_1\,\frac{\partial}{\partial s_2}. \]
It is clear that the x-axis is an orbit of {X, Y} in R^3. We also have
\[ [X, Y] = -\sin(x)\,s_2\,\frac{\partial}{\partial s_1} + \cos(x)\,s_1\,\frac{\partial}{\partial s_2}. \]
The vectors Y and [X, Y] are linear combinations of ∂/∂s_1 and ∂/∂s_2, and the matrix of their coefficients has determinant equal to s_1 s_2. Hence, X, Y, and [X, Y] are linearly independent at any point (c, a, b) such that ab ≠ 0, and such a point belongs to a 3-dimensional orbit. Let p = (c, a, b) be a point off the x-axis, that is, a^2 + b^2 > 0. Assume that a ≠ 0. If b ≠ 0, then ab ≠ 0 and the orbit of p is open. If b = 0, either a sin(c) ≠ 0 or a cos(c) ≠ 0. Then any point p_1 = (c, a_1, b_1) ≠ p close to p, either on the integral curve of Y through p or on the integral curve of [X, Y] through p, will
satisfy a_1 b_1 ≠ 0. This proves that the orbit of p is open. A similar argument can be given in case b ≠ 0 and a = 0. Hence, the orbit of any point p that does not belong to the x-axis is open, and it is easy to conclude that ᏸ has only two orbits in R^3: the x-axis and its complement. The same decomposition holds if we consider the orbits in a ball of radius r > 0 centered at the origin.

Example 2. We now consider a complex vector field L in C × R^2 ≅ R^4 and denote the coordinates by z = x + iy, s_1, s_2. This example is obtained from the previous one by formally adding an imaginary part iy to the real variable x and replacing ∂/∂x by ∂/∂z, that is,
\[ L = \frac{\partial}{\partial z} + i\Bigl(\cos(x)\,s_2\,\frac{\partial}{\partial s_1} + \sin(x)\,s_1\,\frac{\partial}{\partial s_2}\Bigr), \]
or, in terms of its real and imaginary parts, L = X + iY with
\[ X = \frac12\,\frac{\partial}{\partial x}, \qquad Y = -\frac12\,\frac{\partial}{\partial y} + \cos(x)\,s_2\,\frac{\partial}{\partial s_1} + \sin(x)\,s_1\,\frac{\partial}{\partial s_2}. \]
Then L and L̄ are everywhere linearly independent, and the structure ᏸ generated by L is a CR structure of codimension 2. It is apparent that C × {0} is an orbit of ᏸ. We also have that the four vector fields X, Y,
\[ 2[X, Y] = -\sin(x)\,s_2\,\frac{\partial}{\partial s_1} + \cos(x)\,s_1\,\frac{\partial}{\partial s_2}, \qquad 4[X, [X, Y]] = -\cos(x)\,s_2\,\frac{\partial}{\partial s_1} - \sin(x)\,s_1\,\frac{\partial}{\partial s_2} \]
are linearly independent whenever
\[ \det\begin{pmatrix} 1/2 & 0 & 0 & 0\\ 0 & -1/2 & s_2\cos x & s_1\sin x\\ 0 & 0 & -s_2\sin x & s_1\cos x\\ 0 & 0 & -s_2\cos x & -s_1\sin x \end{pmatrix} = -\frac{s_1 s_2}{4} \neq 0. \]
Then the analysis of Example 1 shows that the decomposition in orbits of ᏸ of any ball centered at the origin consists of the 2-dimensional orbit given by s_1 = s_2 = 0 and its complement, which is an open orbit. A slightly different proof of this fact that does not mention Example 1 is as follows. If p = (a, b, c, d), cd = 0, and c^2 + d^2 > 0, then the orbit of ᏸ through p is transversal to the hypersurface S of R^4 \ {s_1 = s_2 = 0} given by the equation s_1 s_2 = 0, because either [X, Y] or [X, [X, Y]] will not be tangent to S. Hence, if ᏻ is an open orbit of ᏸ in B(0, r) \ {s_1 = s_2 = 0} and ᏻ is properly contained in B(0, r) \ {s_1 = s_2 = 0}, we have ∂ᏻ \ {s_1 = s_2 = 0} ≠ ∅, and the orbit through a point q ∈ ∂ᏻ \ {s_1 = s_2 = 0} cannot be open and will have dimension less than 4. Then q must belong to S, but we saw that the orbit through q does not remain confined in S; in particular, it has dimension 4, a contradiction.
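The bracket and determinant computations in Examples 1 and 2 can be verified directly. The following worked sketch uses the shorthand c = cos x and σ = sin x (our notation, not the authors') and the vector fields exactly as defined in the two examples. In Example 2 the brackets carry the constant factors 1/2 and 1/4 coming from X = (1/2)∂/∂x; these do not affect linear independence, so the computation is written for 2[X, Y] and 4[X, [X, Y]].

```latex
% Shorthand (assumption of this sketch): c = \cos x, \sigma = \sin x.
% Example 1: X = \partial_x, \quad Y = c\,s_2\,\partial_{s_1} + \sigma\,s_1\,\partial_{s_2}.
\begin{align*}
[X, Y] &= (\partial_x c)\,s_2\,\partial_{s_1} + (\partial_x \sigma)\,s_1\,\partial_{s_2}
        = -\sigma\,s_2\,\partial_{s_1} + c\,s_1\,\partial_{s_2},\\
\det\begin{pmatrix} c\,s_2 & \sigma\,s_1\\ -\sigma\,s_2 & c\,s_1 \end{pmatrix}
       &= c^2 s_1 s_2 + \sigma^2 s_1 s_2 = s_1 s_2 .
\end{align*}
% Example 2: X = \tfrac12\partial_x, \quad
%            Y = -\tfrac12\partial_y + c\,s_2\,\partial_{s_1} + \sigma\,s_1\,\partial_{s_2}.
\begin{align*}
2[X, Y] &= -\sigma\,s_2\,\partial_{s_1} + c\,s_1\,\partial_{s_2},\\
4[X, [X, Y]] &= -c\,s_2\,\partial_{s_1} - \sigma\,s_1\,\partial_{s_2},\\
\det\begin{pmatrix}
\tfrac12 & 0 & 0 & 0\\
0 & -\tfrac12 & c\,s_2 & \sigma\,s_1\\
0 & 0 & -\sigma\,s_2 & c\,s_1\\
0 & 0 & -c\,s_2 & -\sigma\,s_1
\end{pmatrix}
&= \tfrac12\cdot\bigl(-\tfrac12\bigr)\cdot\bigl(\sigma^2 s_1 s_2 + c^2 s_1 s_2\bigr)
 = -\frac{s_1 s_2}{4}.
\end{align*}
```

In both cases the determinant vanishes exactly on {s_1 s_2 = 0}, which is why the points with s_1 s_2 = 0 lying off the lower dimensional orbit require the separate argument given in the text.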
Example 3. We now borrow [BM, Example 6.1]. Denote the variables in C^2 by (z, w) = (x + iy, s + it). Let ρ(z, w) = s^2 + (t − |z|^2)^2 − |z|^4, and let
\[ ᏹ = \{(z, w) \in \mathbb{C}^2 \setminus 0 : \rho(z, w) = 0\}. \]
Note that ᏹ is a real analytic hypersurface and that it contains Σ = (C \ 0) × 0. The set ᏹ \ Σ is strictly pseudoconvex and connected, and hence it is a single open orbit. Observe that the boundary Σ is also an orbit.

We now consider a locally solvable smooth vector field L in the plane. In appropriate coordinates, after multiplication by a nonvanishing function, near a central point, say, zero, L has the form
\[ L = \frac{\partial}{\partial t} - i\,b(x, t)\,\frac{\partial}{\partial x} \]
with b(x, t) real-valued. The local solvability of L is then equivalent to the condition [NT]: for each x, the function t ↦ b(x, t) does not change sign. We also assume that the Sussmann orbit through zero generated by L and L̄ is 2-dimensional. We then have the following proposition.

Proposition 5.1. Let L be as above, and let f(x, t, ζ) be a C^1 complex-valued function with bounded derivatives for x, t real and ζ complex. Let u, w ∈ L^p_loc, p > 1, solve Lu = f(x, t, u), Lw = f(x, t, w) near the origin. If zero is a density point of the set where u = w, then u ≡ w in a neighborhood of the origin.

Proof. If we set v = u − w, then v satisfies an equation of the form Lv = Av + Bv̄, and the assumptions on L imply that the function b(x, t) has a consistent sign. Therefore, by [BHS, Theorem 3.1], v = e^g h, where e^g, h, and v = e^g h are locally integrable and where Lh = 0. The proposition therefore follows from Corollary 3.1.

We next consider a fully nonlinear system of equations
\[ F_j(x, u, u_x) = 0, \qquad j = 1, \dots, n, \tag{5.1} \]
which extend the linear systems considered in Section 3. Here the F_j = F_j(x, ζ_0, ζ) are smooth functions and they are holomorphic in (ζ_0, ζ). The linear independence of the linear vector fields is generalized by assuming that the forms d_ζ F_j are linearly independent, where ζ = (ζ_1, …, ζ_m). If we assume that the F_j vanish at a central
point p, it follows that near p,
\[ \Sigma = \{(x, \zeta_0, \zeta) : F_j(x, \zeta_0, \zeta) = 0,\ j = 1, \dots, n\} \]
is a smooth manifold. The involution condition for the linear system is generalized by requiring that the Poisson brackets satisfy
\[ \{F_k, F_l\} = 0 \quad \text{for all } l, k \text{ on } \Sigma. \]
After using the implicit function theorem and renaming the variables (see [T]), (5.1) can be rewritten as
\[ u_{t_j} = f_j(x, t, u, u_x), \qquad j = 1, \dots, n, \tag{5.2} \]
where the f_j(x, t, ζ_0, ζ) are complex-valued, smooth, and holomorphic in (ζ_0, ζ). Here x ∈ R^m, t ∈ R^n, and (ζ_0, ζ) lies in an open subset of C^{m+1}. We assume that u is a C^2-solution near (x, t) = (0, 0) in R^{m+n}. Consider also the linearized operators
\[ L_j = \frac{\partial}{\partial t_j} - \sum_{l=1}^{m} \frac{\partial f_j}{\partial \zeta_l}(x, t, u, u_x)\,\frac{\partial}{\partial x_l}. \]
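As an illustration of the linearized operators, consider the simplest case m = n = 1 with the hypothetical right-hand side f(x, t, ζ_0, ζ_1) = ζ_0 ζ_1, chosen here only for concreteness (it is not an example from the paper). Then (5.2) reads u_t = u u_x, and the following sketch computes L and checks that, in this special case, both u and x + tu are annihilated by L.

```latex
% Assumed toy example: f(x,t,\zeta_0,\zeta_1) = \zeta_0\zeta_1, so (5.2) reads u_t = u\,u_x.
\begin{align*}
L &= \frac{\partial}{\partial t}
     - \frac{\partial f}{\partial \zeta_1}(x,t,u,u_x)\,\frac{\partial}{\partial x}
   = \frac{\partial}{\partial t} - u\,\frac{\partial}{\partial x},\\
L u &= u_t - u\,u_x = 0,\\
L(x + t u) &= u + t\,u_t - u\,(1 + t\,u_x) = t\,(u_t - u\,u_x) = 0 .
\end{align*}
```

Here x + tu restricts to x at t = 0, so it behaves like a first integral of L built from the solution u. For a general f one should not expect Lu = 0, since L_j u equals f_j − Σ_l ζ_l ∂f_j/∂ζ_l evaluated at (x, t, u, u_x); the toy choice above is homogeneous of degree one in ζ_1, which is why this expression vanishes.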
With this setup, we have the following theorem.

Theorem 5.2. Suppose the f_j are real analytic in (x, t, ζ_0, ζ) and that the orbit through the origin of the system L_j is an open set. If zero is a density point in R^m of {x ∈ R^m : u(x, 0) = 0}, then u(x, 0) ≡ 0. Furthermore, u(x, t) is analytic in a neighborhood of the origin.

Proof. By [T, Theorem X.3.2], there exist real analytic functions
\[ Z_1(x, t, \zeta_0, \zeta), \dots, Z_m(x, t, \zeta_0, \zeta) \quad\text{and}\quad P_0(x, t, \zeta_0, \zeta), \dots, P_m(x, t, \zeta_0, \zeta), \]
holomorphic in (ζ_0, ζ), such that
\[ Z_i(x, 0, \zeta_0, \zeta) = x_i, \qquad P_j(x, 0, \zeta_0, \zeta) = \zeta_j, \]
and the functions
\[ Z_i^u(x, t) = Z_i(x, t, u, u_x) \quad\text{and}\quad P_j^u(x, t) = P_j(x, t, u, u_x) \]
are all solutions of the system L_k, k = 1, …, n. Observe that Z_1^u, …, Z_m^u are a complete set of first integrals for the L_j. By Marson’s extension of Tumanov’s theorem [M], it follows that the function P_0^u extends holomorphically into a certain wedge, as
described in [M, (1.1)]. But then by [M, Theorem 2], the wave front set of P_0^u(x, t) (at the origin) is contained in a strictly convex closed cone of R^m. It is crucial to note here that, for solutions like P_0^u which extend holomorphically into a wedge, the wave front set defined in [M] agrees with the hypoanalytic wave front set of Baouendi, Chang, and Trèves ([BCT, p. 580]). But u(x, 0) = P_0^u(x, 0) and Z_j^u(x, 0) = x_j. Therefore, the analytic wave front set of u(x, 0) is contained in the same cone as above. By a well-known theorem (see [Ho, Theorem 9.3.4]) it follows that u(x, 0) is the boundary value of a function holomorphic in a wedge, continuous on the closure of the wedge which lies in C^m. By Theorem 1.1, we conclude that u(x, 0) ≡ 0. Now, using the power series method, we may find a real analytic local solution v(x, t) of (5.2) such that v(x, 0) ≡ 0. That u and v coincide in a neighborhood of the origin follows from a general result of Baouendi, Goulaouic, Trèves, and Métivier ([BGT], [Me]).

APPENDIX: UNIQUENESS IN REAL ANALYTIC CR MANIFOLDS

Consider a CR manifold Σ in C^{n+d} of codimension d parametrized near a central point zero as follows:
\[ \Sigma = \{(z, s + i\phi(x, y, s)) : z = x + iy \in \mathbb{C}^n,\ s \in \mathbb{R}^d\}. \]
Here z and s vary near zero, and we assume that the function φ = (φ_1, …, φ_d) is real-valued and real analytic. We also assume that φ(0) = 0 and dφ(0) = 0. Suppose f is a locally integrable CR function on Σ which extends to a holomorphic function F on the wedge defined by
\[ W_\delta(\Sigma, \Gamma) = \{(z, s + it) : t - \phi(x, y, s) \in \Gamma,\ 0 < |t| < \delta\}, \]
where Γ is a cone in R^d given by
\[ \Gamma = \{(y_1, y') = (y_1, y_2, \dots, y_d) \in \mathbb{R}^d : y_1 > c\,|y'|\}, \]
and c is some positive number. Under these assumptions on f and Σ, we prove the following theorem.

Theorem. If the locally integrable CR function f vanishes on a set E of positive measure and E intersects every neighborhood of the central point zero in a set of positive measure, then f vanishes a.e. in a full neighborhood of zero.

Proof. For z ∈ C^n and s′ = (s_2, …, s_d) ∈ R^{d−1}, both near zero, consider the complex curve γ_{z,s′} given by
\[ w_1 \longmapsto \bigl(z,\ w_1 + i\phi_1(x, y, w_1, s'),\ s_2 + i\phi_2(x, y, w_1, s'),\ \dots,\ s_d + i\phi_d(x, y, w_1, s')\bigr). \]
Choose a > 0 such that for any z ∈ C^n, |z| ≤ a, and s′ = (s_2, …, s_d) ∈ R^{d−1},
|s_j| ≤ a, the points {γ_{z,s′}(w_1)} are in the wedge W_δ(Σ, Γ) whenever w_1 = s_1 + it_1 ∈ (−a, a) × (0, a). Indeed, such an a exists since dφ(0) = 0. For z ∈ C^n, |z| ≤ a, define
\[ E_z = \{ s \in \mathbb{R}^d : (z, s + i\phi(x, y, s)) \in E,\ |s_j| < a \}. \]
We also define
\[ A = \{ z \in \mathbb{C}^n : |z| < a,\ |E_z| > 0 \}. \]
Here |E_z| denotes the d-dimensional measure of E_z as a subset of R^d. Because of the assumptions on the set E, we note that the 2n-dimensional measure of the set A is positive. For each z ∈ A, let B_1^z denote the points s′ = (s_2, …, s_d) ∈ R^{d−1}, |s_j| < a, such that the set {s_1 : (s_1, s′) ∈ E_z} has positive 1-dimensional measure. Clearly, |B_1^z| > 0. Since f ∈ L^1(Σ), after contracting the sets A and B_1^z if necessary, we assume that, for each z ∈ A and s′ ∈ B_1^z, the function
\[ s_1 \longmapsto f\bigl(z,\ s_1 + i\phi_1(x, y, s_1, s'),\ \dots,\ s_d + i\phi_d(x, y, s_1, s')\bigr) \]
is integrable on (−a, a). The sets A and B_1^z still have the same positive measures. Fix z ∈ A and s′ ∈ B_1^z. The function
\[ F_{z,s'}(w_1) = F\bigl(z,\ w_1 + i\phi_1(x, y, w_1, s'),\ s_2 + i\phi_2(x, y, w_1, s'),\ \dots,\ s_d + i\phi_d(x, y, w_1, s')\bigr) \]
is holomorphic for w_1 ∈ (−a, a) × (0, a) since γ_{z,s′}(w_1) is contained in the wedge W_δ(Σ, Γ) for such w_1. When w_1 = s_1 is real, F_{z,s′}(s_1) = 0 on a subset of (−a, a) of positive measure. Moreover, by Corollary 1.1 and the arguments in the proof of Theorem 1.1 on equality a.e. of boundary values, F_{z,s′} has nontangential limit a.e. on the boundary piece (−a, a) × {0}. Hence, by Priwaloff’s theorem we conclude that
\[ F_{z,s'}(w_1) = 0 \quad \text{for } w_1 \in (-a, a) \times (0, a). \]
Next, for z ∈ A, let B_2^z be the points s″ = (s_3, …, s_d) ∈ R^{d−2}, |s_j| < a, such that the set {s_2 : (s_2, s″) ∈ B_1^z} has positive measure. Again note that |B_2^z| > 0. After decreasing a if necessary, we find b > 0 small such that for z ∈ C^n, |z| < a, w_1 = s_1 + it_1 ∈ (−a, a) × (0, a), w_j = s_j + it_j ∈ (−a, a) × (0, bt_1), j = 2, …, d, the points
\[ \bigl(z,\ w_1 + i\phi_1(x, y, w),\ \dots,\ w_d + i\phi_d(x, y, w)\bigr) \]
lie in the wedge W_δ(Σ, Γ). Fix z ∈ A, s″ ∈ B_2^z, and w_1 = s_1 + it_1 ∈ (−a, a) × (0, a).
The function
\[ F\bigl(z,\ w_1 + i\phi_1(x, y, w_1, w_2, s''),\ w_2 + i\phi_2(x, y, w_1, w_2, s''),\ s'' + i\psi(x, y, w_1, w_2, s'')\bigr) \]
is holomorphic in the rectangle w_2 ∈ (−a, a) × (0, bt_1). Here ψ = (φ_3, …, φ_d). This function is continuous on the boundary piece w_2 = s_2 ∈ (−a, a), and since s″ ∈ B_2^z, there is a subset of positive measure on this boundary piece where it is zero, as was already established. We can therefore apply Priwaloff’s theorem to conclude that it vanishes for w_2 ∈ (−a, a) × (0, bt_1). Since F is holomorphic, we have in fact shown that for z ∈ A and s″ ∈ B_2^z, the preceding function vanishes whenever w_1 ∈ (−a, a) × (0, a) and w_2 ∈ (−a, a) × (0, ba). Continuing this way, we conclude that F vanishes on the product of A with some open set in C^d. Since F is holomorphic and vanishes on a set of positive measure, it follows that it is identically zero, and hence f vanishes a.e. in a neighborhood of the origin in Σ.

References

[BCT] M. S. Baouendi, C. H. Chang, and F. Trèves, Microlocal hypo-analyticity and extension of CR functions, J. Differential Geom. 18 (1983), 331–391.
[BER] M. S. Baouendi, P. Ebenfelt, and L. P. Rothschild, Real Submanifolds in Complex Space and their Mappings, Princeton Math. Ser. 47, Princeton Univ. Press, 1999.
[BGT] M. S. Baouendi, C. Goulaouic, and F. Trèves, Uniqueness in certain first-order nonlinear complex Cauchy problems, Comm. Pure Appl. Math. 38 (1985), 109–123.
[BR] M. S. Baouendi and L. P. Rothschild, Normal forms for generic manifolds and holomorphic extension of CR functions, J. Differential Geom. 25 (1987), 431–467.
[BRT] M. S. Baouendi, L. P. Rothschild, and F. Trèves, CR structures with group action and extendability of CR functions, Invent. Math. 82 (1985), 359–396.
[BHS] S. Berhanu, J. Hounie, and P. Santiago, A similarity principle for complex vector fields and applications, to appear in Trans. Amer. Math. Soc.
[BM] S. Berhanu and G. A. Mendoza, Orbits and global unique continuation for systems of vector fields, J. Geom. Anal. 7 (1997), 173–194.
[Ho] L. Hörmander, The Analysis of Linear Partial Differential Operators, Grundlehren Math. Wiss. 256, Springer, Berlin, 1983.
[H] J. Hounie, Globally hypoelliptic vector fields on compact surfaces, Comm. Partial Differential Equations 7 (1982), 343–370.
[HT] J. Hounie and J. Tavares, On removable singularities of locally solvable differential operators, Invent. Math. 126 (1996), 589–623.
[K] S. G. Krantz, Function Theory of Several Complex Variables, Pure Appl. Math., Wiley, New York, 1982.
[LP] N. Lusin and J. Priwaloff, Sur l’unicité et la multiplicité des fonctions analytiques, Ann. Sci. École Norm. Sup. (4) 42 (1925), 143–191.
[M] M. E. Marson, Wedge extendability for hypo-analytic structures, Comm. Partial Differential Equations 17 (1992), 579–592.
[Me] G. Métivier, Uniqueness and approximation of solutions of first order nonlinear equations, Invent. Math. 82 (1985), 263–282.
[NT] L. Nirenberg and F. Trèves, Solvability of a first order linear partial differential equation, Comm. Pure Appl. Math. 16 (1963), 331–351.
[R] J.-P. Rosay, “On the radial maximal function and the Hardy Littlewood maximal function in wedges” in Madison Symposium on Complex Analysis (Madison, Wis., 1991), Contemp. Math. 137 (1992), 383–398.
[S] H. J. Sussmann, Orbits of families of vector fields and integrability of distributions, Trans. Amer. Math. Soc. 180 (1973), 171–188.
[T] F. Trèves, Hypo-analytic Structures: Local Theory, Princeton Math. Ser. 40, Princeton Univ. Press, Princeton, 1992.
Berhanu: Department of Mathematics, Temple University, Philadelphia, Pennsylvania 19122-6094, USA
Hounie: Departamento de Matemática, Universidade Federal de São Carlos 105, 13.565905, São Carlos, SP, Brasil
Vol. 105, No. 3
DUKE MATHEMATICAL JOURNAL
© 2000
FROBENIUS∞ INVARIANTS OF HOMOTOPY GERSTENHABER ALGEBRAS, I

S. A. MERKULOV
1. Introduction. Frobenius manifolds play a central role in the usual formulation of mirror symmetry, as may be seen in the following diagram,
\[ \left\{\begin{array}{c}\text{A category of compact}\\ \text{symplectic manifolds}\end{array}\right\} \xrightarrow{\ \mathrm{GW}\ } \left\{\begin{array}{c}\text{A category of}\\ \text{Frobenius manifolds}\end{array}\right\} \xleftarrow{\ \mathrm{BK}\ } \left\{\begin{array}{c}\text{A category of compact}\\ \text{Calabi-Yau manifolds}\end{array}\right\}, \]
where morphisms in all categories are just diffeomorphisms preserving relevant structures, and GW and BK stand, respectively, for the Gromov-Witten (see, e.g., [Ma1]) and Barannikov-Kontsevich (see [BK] and [Ba]) functors. A pair (M, M̃) consisting of a symplectic manifold M and a Calabi-Yau manifold M̃ is said to be mirror if GW(M) = BK(M̃). According to Kontsevich [Ko1], this equivalence is a shadow of a more fundamental equivalence of natural A∞-categories attached to M and M̃.

This paper is much motivated by the Barannikov-Kontsevich construction (see [BK] and [Ba]) of the functor from the right in the above diagram, and by Manin’s comments [Ma2] on their construction. The roots of the BK functor lie in the extended deformation theory of complex structures on M̃, more precisely in very special properties of the (differential) Gerstenhaber algebra g “controlling” such deformations. One of the miraculous features of Calabi-Yau manifolds, the one that played a key role in the BK construction, is that deformations of their complex structures are nonobstructed, always producing a smooth versal moduli space.¹ In the language of Gerstenhaber algebras, the exceptional algebraic properties necessary to produce a Frobenius manifold out of g have been axiomatized in [Ma1] and [Ma2]. As a result, the functor
\[ \left\{\begin{array}{c}\text{“Exceptional”}\\ \text{Gerstenhaber algebras}\end{array}\right\} \xrightarrow{\ \mathrm{BK}\ } \left\{\begin{array}{c}\text{Frobenius}\\ \text{manifolds}\end{array}\right\} \]
is now well understood.

¹ A similar phenomenon occurs in the extended deformation theory of Lefschetz symplectic structures which also produces, via the same BK functor, Frobenius manifolds (see [Me1]). These should not be confused with GW.

Received 11 January 2000. Revision received 8 May 2000.
2000 Mathematics Subject Classification. Primary 14D07; Secondary 14J32, 17B66, 32G13, 58D29.
One of our purposes in this paper is to extend the BK functor from the category of Calabi-Yau manifolds to the category of arbitrary compact complex manifolds, which means the study of a diagram
\[ \left\{\begin{array}{c}\text{Arbitrary}\\ \text{Gerstenhaber algebras}\end{array}\right\} \xrightarrow{\ ?\ } \bigl\{\ ?\ \bigr\}. \]
Generically, the extended deformation theory of complex structures is obstructed, and it would be naive to expect that the question mark above stands for the category of Frobenius manifolds. In fact, it does not, and the answer is captured in the following notion.

Definition 1.1. An F∞-manifold is the data set (Ᏼ, E, ∂, [µ_∗], e), where
(i) Ᏼ is a formal pointed Z-graded manifold;
(ii) E is the Euler vector field on Ᏼ, Ef := (1/2)|f|f, for all homogeneous functions f on Ᏼ of degree |f|;
(iii) ∂ is an odd homological (i.e., ∂^2 = 0) vector field on Ᏼ such that [E, ∂] = (1/2)∂ and ∂I ⊂ I^2, I being the ideal of the distinguished point in Ᏼ;
(iv) [µ_n : ⊗^n ᐀_Ᏼ → ᐀_Ᏼ], n ∈ N, is a homotopy class of smooth unital strong homotopy commutative (C∞) algebras defined on the tangent sheaf, ᐀_Ᏼ, to Ᏼ, such that Lie_E µ_n = (1/2)nµ_n for all n ∈ N, and µ_1 is given by µ_1 : ᐀_Ᏼ → ᐀_Ᏼ, X ↦ µ_1(X) := [∂, X];
(v) e is the unit, that is, an even vector field on Ᏼ such that [∂, e] = 0, µ_2(e, X) = X for all X ∈ ᐀_Ᏼ, and µ_n(…, e, …) = 0 for all n ≥ 3.

Clearly, the category of F∞-manifolds contains Frobenius manifolds as a subcategory. On any F∞-manifold the vector field ∂ defines an integrable distribution, Im µ_1, which is tangent to its subspace of zeros, “zeros(∂).” The structures (i)–(v) make the tangent sheaf to the smooth part of the associated quotient, “zeros(∂)”/Im µ_1, into a sheaf of graded unital associative algebras.

Theorem A. For any differential unital (graded commutative) Gerstenhaber algebra g, its cohomology, H(g), if finite-dimensional, is canonically an F∞-manifold.

The resulting diagram
\[ \left\{\begin{array}{c}\text{Differential unital}\\ \text{Gerstenhaber algebras}\end{array}\right\} \xrightarrow{\ F_\infty\ } \left\{\begin{array}{c}F_\infty\text{-}\\ \text{manifolds}\end{array}\right\} \]
implies, in turn, a diagram
\[ \begin{array}{ccccc}
\left\{\begin{array}{c}\text{A category of compact}\\ \text{symplectic manifolds}\end{array}\right\} & \xrightarrow{\ F_\infty\ } & \left\{\begin{array}{c}\text{A category of}\\ F_\infty\text{-manifolds}\end{array}\right\} & \xleftarrow{\ F_\infty\ } & \left\{\begin{array}{c}\text{A category of compact}\\ \text{complex manifolds}\end{array}\right\} \\
 & & \Big\uparrow{\scriptstyle F_\infty} & & \\
 & & \left\{\begin{array}{c}\text{A category of holomorphic vector bundles}\\ \text{on compact complex manifolds}\end{array}\right\} & &
\end{array} \]
through the Gerstenhaber algebras controlling extended deformations of symplectic, complex, and holomorphic vector bundle structures. Moreover, the F∞-functor enjoys the correct “classical limit”: when restricted to exceptional Gerstenhaber algebras (i.e., the ones satisfying Manin’s axioms (see [Ma2])), the F∞-functor coincides precisely with the Barannikov-Kontsevich construction (see [BK]), and, hence, takes values in the subcategory of Frobenius manifolds.

Let us emphasize once again that, in the above diagram, F∞(symplectic manifolds) has nothing to do with Gromov-Witten invariants.² Nevertheless, to rather different mathematical objects we can canonically attach invariants lying in one and the same geometric category. Hence, we can use these F∞-invariants for a classification and even speak about dull mirror symmetry when
\[ F_\infty(\text{Object}) = F_\infty(\text{Object}'). \]
Such a relation may be a shadow of something conceptually more interesting (cf. [Ko1]).

Theorem A is explained and generalized by the following theorem.

Theorem B. There is a canonical functor, F∞, from the derived category of unital homotopy Gerstenhaber (G∞) algebras with finite-dimensional cohomology to the category of F∞-manifolds.

This result implies that the cohomology space of any homotopy Gerstenhaber algebra is, if finite-dimensional, canonically an F∞-manifold,
\[ \left\{\begin{array}{c}\text{Homotopy}\\ \text{Gerstenhaber algebras}\end{array}\right\} \xrightarrow{\ F_\infty\ } \left\{\begin{array}{c}F_\infty\text{-}\\ \text{manifolds}\end{array}\right\}. \]

² At best, this is a very weak shadow of the mirror symmetry; see below.
The recent proof of Deligne’s conjecture (see [Ta1], [Ko2], [V], and [MS]) gives the following diagrammatic corollary of Theorem B,
\[ \left\{\begin{array}{c}A_\infty\text{-algebra}\\ Ꮽ\end{array}\right\} \longrightarrow \left\{\begin{array}{c}\text{Hochschild complex}\\ C^\bullet(Ꮽ, Ꮽ)\end{array}\right\} \xrightarrow{\ F_\infty\ } \left\{\begin{array}{c}F_\infty\text{-}\\ \text{manifold}\end{array}\right\}, \]
which, probably, has a direct relevance to the mirror symmetry through the following specializations, Theorems C and D, of Theorem B.

Any G∞-algebra g is, in particular, an L∞-algebra so that its cohomology, H(g), has the induced structure, [ , ]_ind, of Lie algebra. If there exists a quasi-isomorphism of L∞-algebras,
\[ F : \bigl(g,\ L_\infty\text{-component of the } G_\infty\text{-structure}\bigr) \longrightarrow \bigl(H(g),\ [\ ,\ ]_{\mathrm{ind}}\bigr), \]
then g is said to be L∞-formal, and F is called a formality map. In terms of the associated F∞-invariant, the L∞-formality of a G∞-algebra g gets translated into a canonical flat structure, the Gauss-Manin connection ∇, such that ∇_X e = 0 and ∇_X ∇_Y ∇_Z ∂ = 0 for any flat vector fields X, Y, and Z. This specialization of F∞-structure is called pre-Frobenius∞ structure.

Theorem C. There is a canonical functor from the category of pairs (g, F), where g is an L∞-formal homotopy Gerstenhaber algebra and F is a formality map, to the category of pre-Frobenius∞ manifolds.

In fact, a pre-Frobenius∞ manifold (a Frobenius manifold, in particular) is itself a homotopy Gerstenhaber algebra. According to Kontsevich’s celebrated Formality theorem (see [Ko3]), for any compact complex manifold M, the Hochschild differential Lie algebra, C^•(Ꮽ, Ꮽ), associated to the algebra of Dolbeault forms, Ꮽ = (Γ(M, Ω^{0,•}_M), ∂̄), is L∞-formal so that Theorem C has a wide area of applications.

Theorem D. If a homotopy Gerstenhaber algebra g is quasi-isomorphic, as an L∞-algebra, to an Abelian differential Lie algebra, then the tangent sheaf, ᐀_Ᏼ, to its cohomology H(g) viewed as a linear supermanifold is canonically a sheaf of unital graded commutative associative algebras.

The point is that the Hochschild complex built out of the Dolbeault algebra, Ꮽ = (Γ(M, Ω^{0,•}_M), ∂̄), of a Calabi-Yau manifold M, satisfies the conditions of Theorem D. In fact, the canonically induced associative product on the tangent sheaf to the associated linear supermanifold, H^•(M, ∧^• T_M), is, for an appropriate formality map, potential and satisfies the WDVV equations. The resulting composition,
\[ \left\{\begin{array}{c}\text{Calabi-Yau}\\ \text{manifold } M\end{array}\right\} \longrightarrow \left\{\begin{array}{c}\bigl(C^\bullet_{\mathrm{Hoch}}(Ꮽ, Ꮽ),\ \text{formality map},\ \text{trace}\bigr),\\ \text{where } Ꮽ = \bigl(\Gamma(M, \Omega^{0,*}_M),\ \bar\partial\bigr)\end{array}\right\} \xrightarrow{\ F_\infty\ } \left\{\begin{array}{c}\text{Frobenius}\\ \text{manifold}\end{array}\right\}, \]
together with its analogue for the de Rham algebra of a compact Lefschetz symplectic manifold, will be discussed in the second part of this paper.

In Section 2, the origin of data (i)–(iii) in Definition 1.1 of an F∞-manifold is explained via the deformation theory. Here we use only the L∞-component of a G∞-structure. The main technical tool is a modified version of the classical deformation functor which is proved to be nonobstructed. This part of the story is, probably, of independent interest. In Section 3, we use a homotopy technique to explain the origin of data (iv)–(v) in Definition 1.1 and to prove Theorems A–D. In Section 4, we give second proofs of the main claims of this paper using perturbative solutions of algebraic differential equations.

2. Deformation functors

2.1. Odd Lie superalgebras. Let k be a field of characteristic ≠ 2. An odd Lie superalgebra over k is a vector superspace g = g_{\tilde 0} ⊕ g_{\tilde 1} equipped with an odd k-linear map (see [Ma1])
\[ [\bullet] : g \otimes g \longrightarrow g, \qquad a \otimes b \longmapsto [a \bullet b], \]
which satisfies the following conditions:
(a) odd skew-symmetry:
\[ [a \bullet b] = -(-1)^{(\tilde a + 1)(\tilde b + 1)}\,[b \bullet a], \]
(b) odd Jacobi identity:
\[ [a \bullet [b \bullet c]] = [[a \bullet b] \bullet c] + (-1)^{(\tilde a + 1)(\tilde b + 1)}\,[b \bullet [a \bullet c]], \]
for all a, b, c ∈ g_{\tilde 0} ∪ g_{\tilde 1}. The parity change functor transforms this structure into the usual Lie superalgebra brackets, [ , ], on g. Thus, odd Lie superalgebras are nothing but Lie superalgebras in the “awkwardly” chosen Z_2-grading. For this reason, we sometimes omit the qualifier odd, and treat (g, d, [•]) and (g, d, [ , ]) as different representations of one and the same object. One advantage of working with [•] rather than with the usual Lie brackets [ , ] is that this awkward Z_2-grading induces the correct, for our purposes, Z_2-grading on the associated cohomology supermanifold (see below). Another advantage will become clear when we introduce on g one more algebraic structure (an even, in this awkward Z_2-grading, associative product) making g into a Gerstenhaber algebra (cf. [Ma1] and [Ma2]).

2.2. Cohomology as a formal supermanifold. The data set (g, [•], d), with (g, [•]) being a Lie superalgebra and d : g → g an odd k-linear map satisfying
\[ d[a \bullet b] = [da \bullet b] - (-1)^{\tilde a}\,[a \bullet db], \]
is called a differential Lie superalgebra, or dLie-algebra. This triple (g, [•], d) is often abbreviated to g.
The cohomology of $\mathfrak g$,
\[
H(\mathfrak g) := \frac{\operatorname{Ker} d}{\operatorname{Im} d},
\]
inherits the structure of a Lie superalgebra. We always assume in this paper that $H(\mathfrak g)$, which we often abbreviate to $H$, is a finite-dimensional superspace, say, $\dim H = p\,|\,q$. Let $\{[e_i],\ i = 1, \dots, p+q\}$ be a basis consisting of homogeneous elements with parity denoted by $\tilde i$, and let $\{t^i,\ i = 1, \dots, p+q\}$ be the associated dual basis in $H^*$. The supercommutative ring, $k[[t^1, \dots, t^{p+q}]]$, of formal power series will be abbreviated to $k[[t]]$. The (purely) notational advantage of working with $k[[t]]$ rather than with the invariantly defined object $\odot^\bullet H^*$ is that we prefer viewing
(i) $H$ as a smooth formal pointed $(p\,|\,q)$-dimensional supermanifold denoted (to emphasize this change of thought) by $\mathcal H$ or $\mathcal H_{\mathfrak g}$,
(ii) $\{t^i\}$ as linear coordinates on $\mathcal H$,
(iii) $k[[t]]$ as the space of global sections of the structure sheaf, $\mathcal O_{\mathcal H}$, on $\mathcal H$.
The ideal sheaf of the origin, $0 \in \mathcal H$, will be denoted by $I$. There is a canonical map
\[
s : k[[t]] \otimes H \longrightarrow H^0(\mathcal H, \mathcal T_{\mathcal H}), \qquad \sum_i a^i(t)\,[e_i] \longmapsto \sum_i a^i(t)\,\frac{\partial}{\partial t^i},
\]
where $H^0(\mathcal H, \mathcal T_{\mathcal H})$ stands for the space of global sections of the sheaf, $\mathcal T_{\mathcal H}$, of formal vector fields on $\mathcal H$. There is a well-defined action of $H^0(\mathcal H, \mathcal T_{\mathcal H})$ on both $k[[t]] \otimes H$ and $k[[t]] \otimes \mathfrak g$ through the first factor. If $X$ is a formal vector field on $\mathcal H$ and $\Gamma$ is an element of $k[[t]] \otimes \mathfrak g$ (or of $k[[t]] \otimes H$), then the result of this action is denoted by $\overrightarrow{X}\Gamma$.

Any element $\Gamma$ in $k[[t]] \otimes \mathfrak g$ (or in $k[[t]] \otimes H$) can be uniquely decomposed,
\[
\Gamma = \Gamma_{[0]} + \Gamma_{[1]} + \cdots + \Gamma_{[n]} + \cdots,
\]
into homogeneous polynomials, $\Gamma_{[n]}$, of degree $n$ in the variables $t^i$. The sum of the terms of degree at most $n$ in the above decomposition is denoted by $\Gamma_{(n)}$; that is, $\Gamma_{(n)} = \Gamma \bmod I^{n+1}$. We call an element $\Gamma \in k[[t]] \otimes \mathfrak g$ versal if it is even, $\Gamma \bmod I = 0$, $\Gamma_{[1]} = \Gamma \bmod I^2 \in I \otimes \operatorname{Ker} d$, and $\Gamma_{[1]} \bmod \operatorname{Im} d = \sum_{i=1}^{p+q} t^i [e_i]$.

The sheaf $\mathcal T_{\mathcal H}$ comes canonically equipped with a flat torsion-free affine connection $\nabla$ whose horizontal sections are, by definition, the linear span of $s([e_i])$, $i = 1, \dots, p+q$; that is, $\nabla X = 0$ if and only if $s^{-1}(X)$ is a "constant" (independent of $t^i$) element in $k[[t]] \otimes H$. This connection memorizes the origin of $\mathcal H$ as a vector superspace. It will be important, in this paper, to ignore sometimes the flat structure and view $\mathcal H$ only as a smooth formal supermanifold with a distinguished point zero but no
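preferred coordinate system. To avoid possible confusion, we adopt from now on this latter viewpoint unless the flat connection $\nabla$ is explicitly mentioned.

2.2.1. $\mathbb Z$-grading. Differential Lie superalgebras $(\mathfrak g, d, [\bullet])$, which we often encounter in geometry, have their $\mathbb Z_2$-grading induced from a finer structure, a $\mathbb Z$-grading, which is, by definition, a decomposition of $\mathfrak g$ into a direct sum,
\[
\mathfrak g = \bigoplus_{i \in \mathbb Z} \mathfrak g^i,
\]
with the following consistency conditions: (a) $d\,\mathfrak g^i \subset \mathfrak g^{i+1}$; (b) $[\mathfrak g^i \bullet \mathfrak g^j] \subset \mathfrak g^{i+j-1}$. The $\mathbb Z_2$-grading associated to this structure is then simply $\mathfrak g_{\tilde 0} := \bigoplus_{i \in 2\mathbb Z} \mathfrak g^i$ and $\mathfrak g_{\tilde 1} := \bigoplus_{i \in 2\mathbb Z+1} \mathfrak g^i$. Clearly, there is an induced $\mathbb Z$-grading on the cohomology Lie superalgebra, $H = \bigoplus_{i \in \mathbb Z} H^i(\mathfrak g)$, as well as on the structure sheaf, $\mathcal O_{\mathcal H}$, of the associated cohomology supermanifold.

2.3. Classical deformation functor. One of the approaches to constructing a (versal) deformation space of a given mathematical structure $\mathcal A$ consists of the following steps (see, e.g., [Ko3], [Ba], and references therein).

(1) Associate to $\mathcal A$ a "controlling" differential $\mathbb Z$-graded Lie algebra $(\mathfrak g = \bigoplus_{k \in \mathbb Z} \mathfrak g^k, d, [\bullet])$ over a field $k$ (which is usually $\mathbb R$ or $\mathbb C$).

(2) Define the deformation functor
\[
\mathrm{Def}^0_{\mathfrak g} : \left\{ \begin{array}{c} \text{the category of Artin} \\ k\text{-local algebras} \end{array} \right\} \longrightarrow \bigl\{ \text{the category of sets} \bigr\},
\]
as follows:
\[
\mathrm{Def}^0_{\mathfrak g}(\mathcal B) = \frac{\bigl\{ \Gamma \in (\mathfrak g \otimes m_{\mathcal B})^2 \ \big|\ d\Gamma + \tfrac12 [\Gamma \bullet \Gamma] = 0 \bigr\}}{\exp\,(\mathfrak g \otimes m_{\mathcal B})^1},
\]
where $m_{\mathcal B}$ is the maximal ideal of the Artin algebra $\mathcal B$, the latter is viewed as a $\mathbb Z$-graded algebra concentrated in degree zero (so that $(\mathfrak g \otimes m_{\mathcal B})^i = \mathfrak g^i \otimes m_{\mathcal B}$), and the quotient is taken with respect to the following representation of the gauge group $\exp\,(\mathfrak g \otimes m_{\mathcal B})^1$,
\[
\Gamma \longrightarrow \Gamma^g = e^{\mathrm{ad}_g}\,\Gamma - \frac{e^{\mathrm{ad}_g} - 1}{\mathrm{ad}_g}\, dg, \qquad \forall g \in (\mathfrak g \otimes m_{\mathcal B})^1,
\]
where $\mathrm{ad}$ is just the usual adjoint action of $\mathfrak g$ on itself, $\mathrm{ad}_g \Gamma := [g \bullet \Gamma]$.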
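(3) Try to represent the deformation functor by a topological (pro-Artin) algebra $\mathcal O_{\mathcal M}$ so that
\[
\mathrm{Def}^0_{\mathfrak g}(\mathcal B) = \mathrm{Hom}_{\mathrm{cont}}\bigl( \mathcal O_{\mathcal M}, \mathcal B \bigr).
\]
This associates to the mathematical structure $\mathcal A$ the formal moduli space $\mathcal M$ whose "ring of functions" is $\mathcal O_{\mathcal M}$.

In geometry, one often continues with a fourth step by constructing a cohomological splitting of $\mathfrak g$ and applying the Kuranishi method (see [Ku] and [GM]) to represent versally the deformation functor by the ring of analytic (rather than formal) functions on the Kuranishi space.

The tangent space, $\mathrm{Def}^0_{\mathfrak g}(k[\varepsilon]/\varepsilon^2)$, to the functor $\mathrm{Def}^0_{\mathfrak g}$ is isomorphic to the cohomology group $H^2(\mathfrak g)$ of the complex $(\mathfrak g, d)$. If one extends in the obvious way the above deformation functor to the category of arbitrary $\mathbb Z$-graded $k$-local Artin algebras (which may not be concentrated in degree zero), one gets the functor $\mathrm{Def}^*_{\mathfrak g}$ with the tangent space isomorphic to the full cohomology group $\bigoplus_{i \in \mathbb Z} H^i(\mathfrak g)$.

When working with the extended deformation functor $\mathrm{Def}^*_{\mathfrak g}$, it is often no loss of essential information to forget the $\mathbb Z$-grading on $\mathfrak g$ and keep only the associated $\mathbb Z_2$-grading. One then gets the following equivalent definition of $\mathrm{Def}^*_{\mathfrak g}$:
\[
\mathrm{Def}^*_{\mathfrak g} : \left\{ \begin{array}{c} \text{the category of Artin} \\ k\text{-local superalgebras} \end{array} \right\} \longrightarrow \bigl\{ \text{the category of sets} \bigr\},
\qquad
\mathrm{Def}^*_{\mathfrak g}(\mathcal B) := \frac{\bigl\{ \Gamma \in (\mathfrak g \otimes m_{\mathcal B})_{\tilde 0} \ \big|\ d\Gamma + \tfrac12 [\Gamma \bullet \Gamma] = 0 \bigr\}}{\exp\,(\mathfrak g \otimes m_{\mathcal B})_{\tilde 1}}.
\]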
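As a consistency check on the tangent-space statement above, here is a short worked computation for the classical functor $\mathrm{Def}^0_{\mathfrak g}$ over the dual numbers. It is only a sketch: the symbols $\gamma$ and $h$ are introduced here for illustration and do not appear in the original text.

```latex
% Take \mathcal B = k[\varepsilon]/\varepsilon^2, so m_{\mathcal B} = k\varepsilon and \varepsilon^2 = 0.
% Write \Gamma = \varepsilon\gamma with \gamma \in \mathfrak g^2, and g = \varepsilon h with h \in \mathfrak g^1.
\begin{align*}
d\Gamma + \tfrac12[\Gamma \bullet \Gamma]
  &= \varepsilon\, d\gamma + \tfrac12\,\varepsilon^2\,[\gamma \bullet \gamma]
   = \varepsilon\, d\gamma, \\
\Gamma^{g}
  &= e^{\mathrm{ad}_g}\,\Gamma - \frac{e^{\mathrm{ad}_g} - 1}{\mathrm{ad}_g}\, dg
   = \varepsilon\gamma - \varepsilon\, dh .
\end{align*}
```

Every term containing a bracket carries a factor $\varepsilon^2 = 0$, so the Maurer-Cartan equation reduces to $d\gamma = 0$ and the gauge action to $\gamma \mapsto \gamma - dh$. The solutions modulo gauge are therefore $(\operatorname{Ker} d \cap \mathfrak g^2)/d\,\mathfrak g^1 = H^2(\mathfrak g)$.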
This functor is representable by a smooth formal moduli space $\mathcal M$ if there exists a versal (in the sense of Section 2.2) solution,
\[
\Gamma = \sum_a e_a t^a + \sum_{a_1, a_2} \Gamma_{a_1 a_2}\, t^{a_1} t^{a_2} + \cdots \ \in\ \bigl( \mathfrak g \otimes k[[t]] \bigr)_{\tilde 0}, \tag{1}
\]
to the so-called Maurer-Cartan equation,
\[
d\Gamma + \tfrac12 [\Gamma \bullet \Gamma] = 0.
\]
Because of versality of $\Gamma$, any other solution over an arbitrary Artin algebra $\mathcal B$ is equivalent to this one by a base change $k[[t]] \to \mathcal B$.

2.3.1. $L_\infty$-morphisms, part I. Let $\mathfrak g_1$ and $\mathfrak g_2$ be two dLie-algebras. To formulate the basic theorem of the classical deformation theory, we shall need the following notion. A sequence of linear maps
\[
F_n : \odot^n \mathfrak g_1 \longrightarrow \mathfrak g_2, \qquad n = 1, 2, \dots, \qquad \tilde F_n = \tilde n + 1,
\]
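defines an $L_\infty$-morphism from $\mathfrak g_1$ to $\mathfrak g_2$ if
\[
\begin{aligned}
dF_n(\gamma_1, \gamma_2, \dots, \gamma_n) &+ \sum_{i=1}^{n} \pm F_n(\gamma_1, \dots, d\gamma_i, \dots, \gamma_n) \\
&= \frac12 \sum_{\substack{k+l=n \\ k,l \ge 1}} \frac{1}{k!\,l!} \sum_{\sigma \in \Sigma_n} \pm \bigl[ F_k(\gamma_{\sigma_1}, \dots, \gamma_{\sigma_k}) \bullet F_l(\gamma_{\sigma_{k+1}}, \dots, \gamma_{\sigma_n}) \bigr] \\
&\quad + \sum_{i<j} \pm F_{n-1}\bigl( [\gamma_i \bullet \gamma_j], \gamma_1, \dots, \widehat{\gamma_i}, \dots, \widehat{\gamma_j}, \dots, \gamma_n \bigr),
\end{aligned}
\]
for arbitrary $\gamma_1, \dots, \gamma_n \in \mathfrak g_1$. In particular, the first map $F_1$ is a morphism of complexes which respects the Lie brackets up to a homotopy defined by the second map $F_2$. An $L_\infty$-morphism $F = \{F_n\} : \mathfrak g_1 \to \mathfrak g_2$ is called a quasi-isomorphism if its linear part $F_1$ induces an isomorphism, $H(\mathfrak g_1) \to H(\mathfrak g_2)$, of associated cohomology groups.

Basic Theorem of Deformation Theory 2.3.2 [Ko3]. An $L_\infty$-morphism $F = \{F_n\} : \mathfrak g_1 \to \mathfrak g_2$ defines a natural transformation of the functors
\[
F_* : \mathrm{Def}^*_{\mathfrak g_1} \longrightarrow \mathrm{Def}^*_{\mathfrak g_2}, \qquad \Gamma \longrightarrow F_*(\Gamma) := \sum_{n=1}^{\infty} \frac{1}{n!}\, F_n(\Gamma, \dots, \Gamma);
\]
that is, if $\Gamma$ is a solution to the Maurer-Cartan equations in $(\mathfrak g_1 \otimes m_{\mathcal B})_{\tilde 0}$, then $F_*(\Gamma)$ is a solution to the Maurer-Cartan equations in $(\mathfrak g_2 \otimes m_{\mathcal B})_{\tilde 0}$. Moreover, if $F$ is a quasi-isomorphism, then $F_*$ is an isomorphism.

Corollary 2.3.3. If a dLie-algebra $\mathfrak g$ is quasi-isomorphic to an Abelian dLie-algebra, then $\mathrm{Def}^*_{\mathfrak g}$ is versally representable by a smooth formal pointed supermanifold $\mathcal H_{\mathfrak g}$ (the cohomology supermanifold of $\mathfrak g$).

Proof. If $\mathfrak h$ is an Abelian dLie-algebra, then, in the notation of Section 2.2,
\[
\Gamma = \sum_{i=1}^{\dim H(\mathfrak h)} t^i [e_i]
\]
is a versal solution of the Maurer-Cartan equations. Hence, $\mathrm{Def}^*_{\mathfrak h}$ is representable by $\mathcal H_{\mathfrak h}$. If $\mathfrak g$ is quasi-isomorphic to $\mathfrak h$, the required statement follows from Theorem 2.3.2 and the isomorphism $\mathcal H_{\mathfrak g} = \mathcal H_{\mathfrak h}$.

There are two remarkable examples, one dealing with extended deformations of complex structures on a Calabi-Yau manifold (see [BK]) and another with extended deformations of the symplectic structure on a Lefschetz manifold (see [Me1]), when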
the rather strong condition of Corollary 2.3.3 holds true.³ In general, however, there are obstructions to constructing a versal solution to the Maurer-Cartan equations, and we resort to other technical means such as modifying the deformation functor as explained in Section 2.5 or further extending the deformation problem to the category of $L_\infty$-algebras.

Example 2.3.4 (Deformations of complex manifolds). It is well known that the total space of the cotangent bundle, $\Omega^1_{\mathbb R}$, to a real $n$-dimensional manifold $M$ carries a natural Poisson structure $\{\ ,\ \}$ making the structure sheaf $\mathcal O_{\Omega^1_{\mathbb R}}$ into a sheaf of Lie algebras. In a natural local coordinate system $(x^a, p_a := \partial/\partial x^a)$,
\[
\{f, g\} = \frac{\partial f}{\partial p_a} \frac{\partial g}{\partial x^a} - \frac{\partial f}{\partial x^a} \frac{\partial g}{\partial p_a}.
\]
If we now change the parity of the fibers of the natural projection $\Omega^1_{\mathbb R} \to M$ (which is allowed since they are vector spaces), we get an $(n\,|\,n)$-dimensional supermanifold $\Pi\Omega^1_{\mathbb R}$ equipped with a natural odd Poisson structure $\{\bullet\}$ making the structure sheaf $\mathcal O_{\Pi\Omega^1_{\mathbb R}}$ into a sheaf of odd Lie superalgebras. In a natural local coordinate system $(x^a, \psi_a := \Pi\,\partial/\partial x^a)$ on $\Pi\Omega^1_{\mathbb R}$,
\[
\{f \bullet g\} = \frac{\partial f}{\partial x^a} \frac{\partial g}{\partial \psi_a} + (-1)^{\tilde f}\, \frac{\partial f}{\partial \psi_a} \frac{\partial g}{\partial x^a}.
\]
The smooth functions on $\Pi\Omega^1_{\mathbb R}$ have a simple geometric interpretation in terms of the underlying manifold $M$: they are just smooth polyvector fields. Indeed, a standard power series decomposition in odd variables gives
\[
f = \sum_{k=0}^{n} \sum_{a_1, \dots, a_k} f^{a_1 \cdots a_k}(x)\, \psi_{a_1} \cdots \psi_{a_k},
\]
implying the isomorphism of sheaves $\mathcal O_{\Pi\Omega^1_{\mathbb R}} = \Lambda^\bullet T_{\mathbb R}$, where $T_{\mathbb R}$ is the real tangent bundle to $M$. Therefore, the sheaf $\Lambda^\bullet T_{\mathbb R}$ with the $\mathbb Z_2$-grading,
\[
\bigl( \Lambda^* T_{\mathbb R} \bigr)_{\tilde 0} := \Lambda^{\mathrm{even}} T_{\mathbb R}, \qquad \bigl( \Lambda^* T_{\mathbb R} \bigr)_{\tilde 1} := \Lambda^{\mathrm{odd}} T_{\mathbb R},
\]
induced from that on $\mathcal O_{\Pi\Omega^1_{\mathbb R}}$, is naturally a sheaf of odd Lie superalgebras. The odd Poisson bracket $\{\bullet\}$ is called, in this incarnation, the Schouten bracket and is often denoted by $[\bullet]_{\mathrm{Sch}}$.

³ One more example (of a different technical origin though of the same mirror symmetry flavor) when a naturally extended deformation problem gives rise to a smooth extended moduli space is discussed in [Me2].
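To illustrate the coordinate formula for the odd Poisson bracket, the following sketch evaluates it on two vector fields written as functions linear in the odd coordinates $\psi_a$. It assumes left derivatives with respect to the odd variables; the component functions $X^a$ and $Y^a$ are illustrative notation, and the overall sign depends on that convention.

```latex
% f = X^a(x)\psi_a and g = Y^b(x)\psi_b are both odd, so (-1)^{\tilde f} = -1.
\begin{align*}
\{f \bullet g\}
 &= \frac{\partial f}{\partial x^a}\,\frac{\partial g}{\partial \psi_a}
    - \frac{\partial f}{\partial \psi_a}\,\frac{\partial g}{\partial x^a} \\
 &= \Bigl( \frac{\partial X^b}{\partial x^a}\,\psi_b \Bigr) Y^a
    - X^a\,\frac{\partial Y^b}{\partial x^a}\,\psi_b \\
 &= -\Bigl( X^a \frac{\partial Y^b}{\partial x^a} - Y^a \frac{\partial X^b}{\partial x^a} \Bigr)\psi_b
  = -[X, Y]^b\,\psi_b .
\end{align*}
```

Up to this convention-dependent sign, the odd bracket restricted to vector fields is the usual commutator of vector fields. It is antisymmetric, as condition (a) of Section 2.1 predicts for two odd elements.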
If $M$ is a complex manifold, then the canonical odd Poisson structure on the parity changed holomorphic cotangent bundle, $\Pi\Omega^1_M$, is itself holomorphic, thereby giving rise to the structure of odd Lie superalgebras on the sheaf, $\Lambda^\bullet T_M$, of holomorphic polyvector fields. This can be used to make the vector space
\[
\mathfrak g = \bigoplus_{k \in \mathbb Z} \mathfrak g^k, \qquad \mathfrak g^k := \bigoplus_{i+j=k} \Gamma\bigl( M, \Lambda^i T_M \otimes \Lambda^j \overline{T}{}^*_M \bigr),
\]
into a $\mathbb Z$-graded differential algebra by taking $\bar\partial$, the $(0,1)$ part of the de Rham operator, as a differential, and the map,
\[
\begin{aligned}
{[\,\bullet\,]} : \Gamma\bigl( M, \Lambda^{i_1} T_M \otimes \Lambda^{j_1} \overline{T}{}^*_M \bigr) \times \Gamma\bigl( M, \Lambda^{i_2} T_M \otimes \Lambda^{j_2} \overline{T}{}^*_M \bigr) &\longrightarrow \Gamma\bigl( M, \Lambda^{i_1+i_2-1} T_M \otimes \Lambda^{j_1+j_2} \overline{T}{}^*_M \bigr), \\
(X_1 \otimes w_1) \times (X_2 \otimes w_2) &\longrightarrow [X_1 \otimes w_1 \bullet X_2 \otimes w_2],
\end{aligned}
\]
given by
\[
[X_1 \otimes w_1 \bullet X_2 \otimes w_2] := (-1)^{\tilde j_1 \tilde i_2}\, [X_1 \bullet X_2]_{\mathrm{Sch}} \otimes (w_1 \wedge w_2),
\]
as the (odd) Lie brackets.

The importance of this dLie-algebra stems from the fact that the associated deformation functors $\mathrm{Def}^0_{\mathfrak g}$ and $\mathrm{Def}^*_{\mathfrak g}$ describe, respectively, ordinary and extended deformations of the given complex structure on a smooth manifold $M$. Indeed, a complex structure on a real $2n$-dimensional manifold $M$ is a decomposition, $\mathbb C \otimes T_{\mathbb R} = T_M \oplus \overline{T}_M$, of the complexified real tangent bundle into a direct sum of a complex integrable distribution, $T_M$, and its complex conjugate, $\overline{T}_M$. A similar decomposition, $\mathbb C \otimes T_{\mathbb R} = T'_M \oplus \overline{T}{}'_M$, can be described in terms of the original complex structure by the graph, $\overline{T}{}'_M$, of a linear map $\Gamma : \overline{T}_M \to T_M$, that is, by an element $\Gamma \in \mathfrak g^2$. The integrability of $T'_M$ amounts then to the Maurer-Cartan equation in $\mathfrak g^2$,
\[
\bar\partial\,\Gamma + \tfrac12 [\Gamma \bullet \Gamma]_{\mathrm{Sch}} = 0.
\]
By solving (if possible) the above equation in the full Lie algebra $\mathfrak g$ rather than in its subspace $\mathfrak g^2$ (and taking the quotient by the gauge group describing equivalent deformations), one gets a so-called extended complex structure on $M$ whose geometric meaning is not yet fully understood. It is understood (see [Ko1]), however, that this structure does have an important mirror symmetry aspect (at least for Calabi-Yau manifolds). Kontsevich noticed that his Formality theorem (see [Ko3]) identifies the moduli space of extended complex structures on a given complex manifold $M$ with the moduli space of $A_\infty$-deformations of the derived category of coherent sheaves on $M$, the mirror counterpart of the conjectured Fukaya category built out of a dual complex manifold $\widetilde M$.
If $M$ is a Calabi-Yau manifold, then, as was shown by Barannikov and Kontsevich (see [BK]), the Maurer-Cartan equations in $\mathfrak g$ admit a versal solution of the form (1), implying that the moduli space of extended deformations of complex structures on $M$ is smooth and is isomorphic to $\mathcal H$.⁴ In fact they have shown much more (see [BK] and [Ba]). In this case, $\mathcal H$ has an induced structure of Frobenius manifold which conjecturally coincides with the Frobenius manifold structure on $H^*(\widetilde M, \mathbb C)$ constructed via the Gromov-Witten invariants of the dual Calabi-Yau manifold $\widetilde M$. Barannikov [Ba] checked this conjecture for projective complete intersection Calabi-Yau manifolds.

For a general complex manifold $M$, the (extended) deformations are obstructed and the Maurer-Cartan equations associated to $\mathfrak g$ have no versal solutions of the form (1). It is one of the main tasks of this paper to understand what happens to the Barannikov-Kontsevich Frobenius structure on $\mathcal T_{\mathcal H}$ in the presence of obstructions.

Example 2.3.5 (Deformations of Poisson and symplectic manifolds). It is well known that a 2-vector field, $\nu_0 \in \Gamma(M, \Lambda^2 T_{\mathbb R})$, defines a Poisson structure,
\[
\{f, g\} = \nu_0(df \otimes dg), \qquad f, g \in \mathcal O_M,
\]
on a smooth real $n$-dimensional manifold $M$ if and only if $[\nu_0 \bullet \nu_0]_{\mathrm{Sch}} = 0$. Then a deformed 2-vector field, $\nu_0 + \nu \in \Gamma(M, \Lambda^2 T_{\mathbb R})$, is again a Poisson structure if and only if $\nu$ satisfies the Maurer-Cartan equation,
\[
d\nu + \tfrac12 [\nu \bullet \nu]_{\mathrm{Sch}} = 0,
\]
in the differential $\mathbb Z$-graded algebra
\[
\mathfrak g = \Bigl( \bigoplus_{i=0}^{n} \Gamma\bigl( M, \Lambda^i T_{\mathbb R} \bigr),\ [\bullet]_{\mathrm{Sch}},\ d := [\nu_0 \bullet \cdots]_{\mathrm{Sch}} \Bigr).
\]
Hence, the associated deformation functors $\mathrm{Def}^0_{\mathfrak g}$/$\mathrm{Def}^*_{\mathfrak g}$ describe (extended) deformations of the given Poisson structure $\nu_0$ on $M$.

For generic $\nu_0$, the associated cohomology group $\bigoplus_i H^i(\mathfrak g)$ may not be finite-dimensional even for compact manifolds. If, however, $\nu_0$ comes from a symplectic 2-form $\omega$ on $M$, the situation is very different. In this case, one may use the "lowering indices map" $\omega : T_{\mathbb R} \to \Omega^1_{\mathbb R}$ to identify $(\mathfrak g, d)$ with the de Rham complex on $M$; it is not hard to check that this map sends the differential $[\nu_0 \bullet \cdots]_{\mathrm{Sch}}$ on $\Lambda^\bullet T_{\mathbb R}$ into the usual de Rham differential on $\Omega^\bullet_{\mathbb R}$. The (odd) Lie brackets induced on $\Omega^*_{\mathbb R}$ from the Schouten brackets on $\Lambda^\bullet T_{\mathbb R}$ we denote by $[\bullet]_\omega$ to emphasize their dependence on the symplectic structure. Hence, the deformation functors $\mathrm{Def}^0_{\mathfrak g}$/$\mathrm{Def}^*_{\mathfrak g}$ associated with the dLie-algebra
\[
\mathfrak g = \Bigl( \bigoplus_{i=0}^{n} \Gamma\bigl( M, \Omega^i_{\mathbb R} \bigr),\ [\bullet]_\omega,\ d = \text{de Rham differential} \Bigr)
\]

⁴ Strictly speaking, this isomorphism holds true in the category of formal manifolds, in which we work in this paper. It is no problem to choose the power series (1) convergent, thereby inducing on the extended moduli space a smooth analytic structure. The latter is then analytically isomorphic to an open neighborhood of zero in $H$, which we denote by the same symbol $\mathcal H$ (and continue doing this every time the analyticity aspect emerges).
describe (extended) deformations of a symplectic structure $\omega$ on $M$. Its cohomology $H$ is nothing but the de Rham cohomology of $M$.

A compact symplectic manifold $(M, \omega)$ is called Lefschetz if the natural cup product on the de Rham cohomology,
\[
[\omega^k] : H^{m-k}(M, \mathbb R) \longrightarrow H^{m+k}(M, \mathbb R),
\]
is an isomorphism for any $k \le m := (1/2) \dim M$. This class of manifolds (which includes the class of Kähler manifolds by the Hard Lefschetz theorem) is of interest to us, for the extended deformation functor $\mathrm{Def}^*_{\mathfrak g}$ associated with an arbitrary Lefschetz symplectic manifold is nonobstructed and is representable by a smooth moduli space isomorphic to $\mathcal H$. Moreover, this moduli space of "extended symplectic structures" is always a Frobenius manifold (see [Me1]). This result is more than parallel to the Barannikov-Kontsevich construction of extended moduli spaces/Frobenius structures for Calabi-Yau manifolds; it is another example of how their beautiful machinery works. (See Manin's nice exposition in [Ma1].)

The extended deformation functor associated with a generic compact symplectic manifold seems to be obstructed, and one should employ different techniques (see below) to study geometric structures induced on moduli spaces of extended symplectic structures.

Example 2.3.6 (Deformations of holomorphic vector bundles). Let $E \to M$ be a holomorphic vector bundle on a complex $n$-dimensional manifold $M$. There is an associated differential $\mathbb Z$-graded Lie algebra
\[
\mathfrak g = \Bigl( \Gamma\bigl( M, \operatorname{End} E \otimes \overline{\Omega}{}^\bullet_M \bigr)[-1],\ [\bullet],\ \bar\partial \Bigr)
\]
with the Lie brackets
\[
\begin{aligned}
{[\,\bullet\,]} : \Gamma\bigl( M, \operatorname{End} E \otimes \overline{\Omega}{}^{\,i_1-1}_M \bigr) \times \Gamma\bigl( M, \operatorname{End} E \otimes \overline{\Omega}{}^{\,i_2-1}_M \bigr) &\longrightarrow \Gamma\bigl( M, \operatorname{End} E \otimes \overline{\Omega}{}^{\,(i_1+i_2-1)-1}_M \bigr), \\
(A_1 \otimes w_1) \times (A_2 \otimes w_2) &\longrightarrow [A_1 \otimes w_1 \bullet A_2 \otimes w_2],
\end{aligned}
\]
given by
\[
[A_1 \otimes w_1 \bullet A_2 \otimes w_2] := (A_1 A_2 - A_2 A_1) \otimes w_1 \wedge w_2.
\]
The deformation functor $\mathrm{Def}^0_{\mathfrak g}$ associated with this algebra describes deformations of the holomorphic structure in the vector bundle $E$. It is tempting to view its extension $\mathrm{Def}^*_{\mathfrak g}$ as a tool for studying extended deformations, but we reserve this role for the functor $\mathrm{Def}^*_{\cdots}$ associated with a larger differential algebra constructed in Example 3.1.5. In general, all these functors are obstructed.

One may combine this differential Lie algebra (or its extension in Example 3.1.5) with the one from Example 2.3.4 into their natural semidirect product to study joint (extended) deformations of the pair $E \to M$.

2.4. $L_\infty$-algebras. These algebras will play only an auxiliary, purely technical, role in this paper. By definition, a strong homotopy Lie algebra, or $L_\infty$-algebra for short, is a vector superspace $\mathfrak h$ equipped with linear maps,
\[
\mu_k : \Lambda^k \mathfrak h \longrightarrow \mathfrak h, \qquad v_1 \wedge \cdots \wedge v_k \longrightarrow \mu_k(v_1, \dots, v_k), \qquad k \ge 1, \quad \tilde\mu_k = \tilde k,
\]
satisfying, for any $n \ge 1$ and arbitrary $v_1, \dots, v_n \in \mathfrak h_{\tilde 0} \cup \mathfrak h_{\tilde 1}$, the following higher order Jacobi identities,
\[
\sum_{k+l=n+1}\ \sum_{\sigma \in \mathrm{Sh}(k,n)} (-1)^{\tilde\sigma + k(l-1)}\, e(\sigma; v_1, \dots, v_n)\ \mu_l\bigl( \mu_k(v_{\sigma(1)}, \dots, v_{\sigma(k)}), v_{\sigma(k+1)}, \dots, v_{\sigma(n)} \bigr) = 0,
\]
where $\mathrm{Sh}(k, n)$ is the set of all permutations $\sigma : \{1, \dots, n\} \to \{1, \dots, n\}$ which satisfy $\sigma(1) < \cdots < \sigma(k)$ and $\sigma(k+1) < \cdots < \sigma(n)$. The symbol $e(\sigma; v_1, \dots, v_n)$ (which we abbreviate from now on to $e(\sigma)$) stands for the Koszul sign defined by the equality
\[
v_{\sigma(1)} \wedge \cdots \wedge v_{\sigma(n)} = (-1)^{\tilde\sigma}\, e(\sigma)\, v_1 \wedge \cdots \wedge v_n,
\]
$\tilde\sigma$ being the parity of the permutation $\sigma$. The $\mathbb Z$-graded version of this definition would require $\mu_k$ to be homogeneous of degree $2 - k$. This notion and the associated notion of $A_\infty$-algebra (recalled below) are due to Stasheff [S].

The first three higher order Jacobi identities have the form
$n = 1$: $d^2 = 0$,
$n = 2$: $d[v_1, v_2] = [dv_1, v_2] + (-1)^{\tilde v_1} [v_1, dv_2]$,
$n = 3$:
\[
\begin{aligned}
{[[v_1, v_2], v_3]} &+ (-1)^{(\tilde v_1 + \tilde v_2)\tilde v_3}\, [[v_3, v_1], v_2] + (-1)^{\tilde v_1(\tilde v_2 + \tilde v_3)}\, [[v_2, v_3], v_1] \\
&= -d\mu_3(v_1, v_2, v_3) - \mu_3(dv_1, v_2, v_3) - (-1)^{\tilde v_1} \mu_3(v_1, dv_2, v_3) - (-1)^{\tilde v_1 + \tilde v_2} \mu_3(v_1, v_2, dv_3),
\end{aligned}
\]
where we denoted $dv_1 := \mu_1(v_1)$ and $[v_1, v_2] := \mu_2(v_1, v_2)$. Therefore, $L_\infty$-algebras with $\mu_k = 0$ for $k \ge 3$ are nothing but the usual differential Lie superalgebras with the differential $\mu_1$ and the Lie bracket $\mu_2$. If, furthermore, $\mu_1 = 0$, one gets the class of usual Lie superalgebras.

2.4.1. Odd $L_\infty$-algebras. To make the above picture consistent with the choices made in Section 2.1, we should change the parity of $\mathfrak h$. Hence, we shall work from now on with $\mathfrak g := \Pi\mathfrak h$ and denote
\[
\mu_n(v_1 \bullet v_2 \bullet \cdots \bullet v_n) := \Pi\mu_n(\Pi v_1, \Pi v_2, \dots, \Pi v_n), \qquad \forall v_1, \dots, v_n \in \mathfrak g,
\]
for all $n \ge 1$. This change of grading also unveils, through the following three observations, a rather compact image of the $L_\infty$-structure itself.

(1) The vector superspace $\odot^\bullet \mathfrak g = \bigoplus_{n=1}^{\infty} \odot^n \mathfrak g$ has a natural structure of cosymmetric coalgebra,
\[
\Delta(w_1 \odot \cdots \odot w_n) = \sum_{i=1}^{n-1}\ \sum_{\sigma \in \mathrm{Sh}(i,n)} e(\sigma)\, \bigl( w_{\sigma(1)} \odot \cdots \odot w_{\sigma(i)} \bigr) \otimes \bigl( w_{\sigma(i+1)} \odot \cdots \odot w_{\sigma(n)} \bigr).
\]

(2) Every coderivation of this coalgebra, that is, an odd map $Q : \odot^\bullet \mathfrak g \to \odot^\bullet \mathfrak g$ satisfying $\Delta \circ Q = (Q \otimes \mathrm{Id} + \mathrm{Id} \otimes Q) \circ \Delta$, is equivalent to an arbitrary series of odd linear maps, $\mu_n : \odot^n \mathfrak g \to \mathfrak g$.

(3) A coderivation $Q = \{\mu_*\}$ is a codifferential, that is, $Q^2 = 0$, if and only if the maps $\mu_n$ satisfy the higher order Jacobi identities.

In conclusion, an (odd) $L_\infty$-structure on $\mathfrak g$ is equivalent to a codifferential on $(\odot^\bullet \mathfrak g, \Delta)$.

2.4.2. $L_\infty$-morphisms, part II. Assume we have two $L_\infty$-algebras, $(\mathfrak g, \mu_*)$ and $(\mathfrak g', \mu'_*)$. An $L_\infty$-morphism $F$ from the first one to the second is, by definition, a differential coalgebra homomorphism
\[
F : \bigl( \odot^\bullet \mathfrak g, \Delta, Q \bigr) \longrightarrow \bigl( \odot^\bullet \mathfrak g', \Delta, Q' \bigr).
\]
It is completely determined by a set of even linear maps $F_n : \odot^n \mathfrak g \to \mathfrak g'$ satisfying a sequence of equations. If both input and output of $F$ are usual differential Lie superalgebras, these equations are precisely the ones written in Section 2.3.1.

An $L_\infty$-morphism $F : (\mathfrak g, \mu_*) \to (\mathfrak g', \mu'_*)$ is called a quasi-isomorphism if its first component $F_1 : \mathfrak g \to \mathfrak g'$ induces an isomorphism between cohomology groups of
complexes $(\mathfrak g, \mu_1)$ and $(\mathfrak g', \mu'_1)$. It is called a homotopy if $F_1 : \mathfrak g \to \mathfrak g'$ is an isomorphism of underlying graded vector superspaces.

If the $L_\infty$-morphism $F : (\mathfrak g, \mu_*) \to (\mathfrak g', \mu'_*)$ is a quasi-isomorphism, then, as was proven in [Ko3], there exists an $L_\infty$-morphism $F' : (\mathfrak g', \mu'_*) \to (\mathfrak g, \mu_*)$ which induces the inverse isomorphism between cohomology groups of complexes $(\mathfrak g, \mu_1)$ and $(\mathfrak g', \mu'_1)$.

2.4.3. A geometric interpretation of an $L_\infty$-algebra $\mathfrak g$. The dual of the free cocommutative coalgebra $\odot^\bullet \mathfrak g$ can be identified with the algebra of formal power series on the vector superspace $\mathfrak g$ viewed as a formal pointed supermanifold. (To emphasize this change of thought, we denote the supermanifold structure on $\mathfrak g$ by $M_{\mathfrak g}$.) With this identification, the $L_\infty$-structure $\mu_*$ on $\mathfrak g$, that is, the codifferential $Q$ on $\odot^\bullet \mathfrak g$, goes into an odd vector field $Q$ on $M_{\mathfrak g}$ satisfying (a) $Q^2 = 0$, (b) $QI \subset I$ (see [Ko3]), where $I$ is the ideal of the distinguished point, $0 \in M_{\mathfrak g}$. (An odd vector field on a formal pointed superspace satisfying the above two conditions is usually called homological.)

An $L_\infty$-morphism $F$ between two $L_\infty$-algebras $(\mathfrak g, \mu_*)$ and $(\mathfrak g', \mu'_*)$ is nothing but a $Q$-equivariant map between the associated formal pointed homological supermanifolds, $(M_{\mathfrak g}, Q)$ and $(M_{\mathfrak g'}, Q')$.

2.5. A modified deformation functor. For a general dLie-algebra $\mathfrak g$, the classical deformation functor $\mathrm{Def}^*_{\mathfrak g}$ is not representable by a smooth versal moduli space. At best, one can use the Kuranishi technique to represent $\mathrm{Def}^*_{\mathfrak g}$ by a singular analytic space. There is, however, a simple geometric way to keep track of versality and smoothness.

First, we extend the input category used in the construction of $\mathrm{Def}^*_{\mathfrak g}$ in Section 2.3 to a category of differential Artin superalgebras whose objects are pairs, $(\mathcal B, \partial)$, consisting of an Artin superalgebra $\mathcal B$ together with a differential $\partial : \mathcal B \to \mathcal B$ satisfying $\partial m_{\mathcal B} \subset m^2_{\mathcal B}$, and whose morphisms are morphisms of Artin superalgebras commuting with the differentials. A $\mathbb Z$-graded version of this definition would require $\partial$ to have degree $+1$.

Second, to the "controlling" differential Lie algebra $\mathfrak g$, we associate a new deformation functor,
\[
\mathsf{Def}^*_{\mathfrak g} : \left\{ \begin{array}{c} \text{the category of differential} \\ \text{Artin superalgebras} \end{array} \right\} \longrightarrow \bigl\{ \text{the category of sets} \bigr\}, \qquad (\mathcal B, \partial) \longrightarrow \mathsf{Def}^*_{\mathfrak g}(\mathcal B, \partial),
\]
by setting
\[
\mathsf{Def}^*_{\mathfrak g}(\mathcal B, \partial) = \frac{\bigl\{ \Gamma \in (\mathfrak g \otimes m_{\mathcal B})_{\tilde 0} \ \big|\ d\Gamma + \vec\partial\,\Gamma + \tfrac12 [\Gamma \bullet \Gamma] = 0 \bigr\}}{\exp\,(\mathfrak g \otimes m_{\mathcal B})_{\tilde 1}}.
\]
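Here the quotient is taken with respect to the following representation of the gauge group,
\[
\Gamma \longrightarrow \Gamma^g = e^{\mathrm{ad}_g}\,\Gamma - \frac{e^{\mathrm{ad}_g} - 1}{\mathrm{ad}_g}\, \bigl( d + \vec\partial \bigr) g, \qquad \forall g \in (\mathfrak g \otimes m_{\mathcal B})_{\tilde 1}.
\]
On the subcategory $(\mathcal B, 0)$ the deformation functor $\mathsf{Def}^*_{\mathfrak g}$ coincides precisely with the classical one $\mathrm{Def}^*_{\mathfrak g}$.

Remark 2.5.1. If, for a derivation $\partial : \mathcal B \to \mathcal B$, an element $\Gamma \in (\mathfrak g \otimes m_{\mathcal B})_{\tilde 0}$ satisfies the equation
\[
d\Gamma + \vec\partial\,\Gamma + \tfrac12 [\Gamma \bullet \Gamma] = 0
\]
(which we sometimes call the Master equation), then
\[
\begin{aligned}
0 &= d\Bigl( d\Gamma + \vec\partial\,\Gamma + \tfrac12 [\Gamma \bullet \Gamma] \Bigr) \\
&= -\vec\partial\, d\Gamma + [d\Gamma \bullet \Gamma] \\
&= -\vec\partial\, d\Gamma - \bigl[ \vec\partial\,\Gamma \bullet \Gamma \bigr] - \tfrac12 \bigl[ [\Gamma \bullet \Gamma] \bullet \Gamma \bigr] \\
&= -\vec\partial\, \Bigl( d\Gamma + \tfrac12 [\Gamma \bullet \Gamma] \Bigr) \\
&= \vec\partial^{\,2}\, \Gamma,
\end{aligned}
\]
motivating our assumption above that $\partial$ is a differential in $\mathcal B$ rather than merely a derivation.

2.5.2. $L_\infty$-extension of $\mathsf{Def}^*$. This extension will be used later only as a technical tool in the study of $\mathsf{Def}^*_{\mathfrak g}$ for usual differential Lie superalgebras $\mathfrak g$. Given an $L_\infty$-algebra $(\mathfrak g, \mu_*)$, we define
\[
\mathsf{Def}^*_{\mathfrak g} : \left\{ \begin{array}{c} \text{the category of differential} \\ \text{Artin superalgebras} \end{array} \right\} \longrightarrow \bigl\{ \text{the category of sets} \bigr\}, \qquad (\mathcal B, \partial) \longrightarrow \mathsf{Def}^*_{\mathfrak g}(\mathcal B, \partial),
\]
by setting
\[
\mathsf{Def}^*_{\mathfrak g}(\mathcal B, \partial) = \frac{\Bigl\{ \Gamma \in (\mathfrak g \otimes m_{\mathcal B})_{\tilde 0} \ \Big|\ \vec\partial\,\Gamma = \sum_{n=1}^{\infty} \frac{(-1)^{n(n+1)/2}}{n!}\, \mu_n(\Gamma \bullet \cdots \bullet \Gamma) \Bigr\}}{\sim}.
\]
Here the quotient is taken with respect to the gauge equivalence, $\sim$, which is best described using the following geometric model of the deformation functor.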
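In $\mathrm{Category}^{\mathrm{op}}$, both the differential Artin superalgebra, $(\mathcal B, \partial)$, and the $L_\infty$-algebra, $(\mathfrak g, \mu_*)$, are represented by formal pointed analytic homological superspaces, $(M_{\mathcal B}, 0, \partial)$ and, respectively, $(M_{\mathfrak g}, 0, Q)$. Then the set
\[
S = \Bigl\{ \Gamma \in (\mathfrak g \otimes m_{\mathcal B})_{\tilde 0} \ \Big|\ \vec\partial\,\Gamma = \sum_{n=1}^{\infty} \frac{(-1)^{n(n+1)/2}}{n!}\, \mu_n(\Gamma \bullet \cdots \bullet \Gamma) \Bigr\}
\]
is just the set of all formal maps of pointed supermanifolds, $\Gamma : (M_{\mathcal B}, 0) \longrightarrow (M_{\mathfrak g}, 0)$, satisfying the equivariance condition $d\Gamma(\partial) = \Gamma^*(Q)$. The latter is precisely the ($L_\infty$-generalization of the) Master equation.

Both superspaces, $(M_{\mathcal B}, 0, \partial)$ and $(M_{\mathfrak g}, 0, Q)$, are foliated by integrable distributions,
\[
\begin{aligned}
\mathcal D_\partial &:= \bigl\{ X \in TM_{\mathcal B} \ \big|\ X = [\partial, Y] \text{ for some } Y \in TM_{\mathcal B} \bigr\}, \\
\mathcal D_Q &:= \bigl\{ X \in TM_{\mathfrak g} \ \big|\ X = [Q, Y] \text{ for some } Y \in TM_{\mathfrak g} \bigr\},
\end{aligned}
\]
and $d\Gamma(\mathcal D_\partial) \subset \Gamma^*(\mathcal D_Q)$ for any $\Gamma \in S$. Hence, any such $\Gamma$ defines a map, $\hat\Gamma$, through the following commutative diagram.
\[
\begin{array}{ccc}
M_{\mathcal B} & \xrightarrow{\ \Gamma\ } & M_{\mathfrak g} \\
\downarrow & & \downarrow \\
M_{\mathcal B}/\mathcal D_\partial & \xrightarrow{\ \hat\Gamma\ } & M_{\mathfrak g}/\mathcal D_Q
\end{array}
\]
We say that two elements in $S$ are gauge equivalent, $\Gamma_1 \sim \Gamma_2$, if $\hat\Gamma_1 = \hat\Gamma_2$. Infinitesimally, the gauge equivalence is given by
\[
\Gamma \sim \Gamma + \bigl( d + \vec\partial \bigr) g - \sum_{n=2}^{\infty} \frac{(-1)^{n(n+1)/2}}{(n-1)!}\, \mu_n(g \bullet \Gamma \bullet \cdots \bullet \Gamma), \qquad \forall g \in (\mathfrak g \otimes m_{\mathcal B})_{\tilde 1}.
\]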
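Remark 2.5.4. If, for a derivation $\partial : \mathcal B \to \mathcal B$, an element $\Gamma \in (\mathfrak g \otimes m_{\mathcal B})_{\tilde 0}$ satisfies the master equation,
\[
\vec\partial\,\Gamma = \sum_{n=1}^{\infty} \frac{(-1)^{n(n+1)/2}}{n!}\, \mu_n(\Gamma \bullet \cdots \bullet \Gamma),
\]
then, using the higher Jacobi identities (as in Remark 2.5.1), one gets the implication $\vec\partial^{\,2}\,\Gamma = 0$.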
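For orientation, the sketch below specializes the $L_\infty$ master equation to a differential Lie superalgebra, taking $\mu_1 = d$, $\mu_2 = [\bullet]$ and $\mu_n = 0$ for $n \ge 3$ (the identification made in Section 2.4). Here $\vec\partial\,\Gamma$ denotes the action of $\partial$ on $\Gamma$ through the first factor, in the sense of Section 2.2. The computation only evaluates the two surviving coefficients.

```latex
% (-1)^{n(n+1)/2}/n!  equals  -1  for n = 1  and  -1/2  for n = 2.
\begin{align*}
\vec\partial\,\Gamma
  &= \frac{(-1)^{1}}{1!}\,\mu_1(\Gamma) + \frac{(-1)^{3}}{2!}\,\mu_2(\Gamma \bullet \Gamma)
   = -\,d\Gamma - \tfrac12\,[\Gamma \bullet \Gamma],
\end{align*}
% that is,
\[
d\Gamma + \vec\partial\,\Gamma + \tfrac12\,[\Gamma \bullet \Gamma] = 0 .
\]
```

This is the Master equation of Remark 2.5.1, so on dLie-algebras the $L_\infty$ version of the master equation agrees with the one used in Section 2.5.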
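Basic Theorem of Deformation Theory 2.5.5. Let $(\mathfrak g_1, Q_1)$ and $(\mathfrak g_2, Q_2)$ be two $L_\infty$-algebras (in particular, dLie-algebras). An $L_\infty$-morphism $F = \{F_n\} : \mathfrak g_1 \to \mathfrak g_2$ defines a natural transformation of the functors,
\[
F_* : \mathsf{Def}^*_{\mathfrak g_1} \longrightarrow \mathsf{Def}^*_{\mathfrak g_2}, \qquad \Gamma \longrightarrow F_*(\Gamma) := \sum_{n=1}^{\infty} \frac{1}{n!}\, F_n(\Gamma, \dots, \Gamma).
\]
Moreover, if $F$ is a quasi-isomorphism, then $F_*$ is an isomorphism.

Proof. Assume $\Gamma \in (\mathfrak g_1 \otimes m_{\mathcal B})_{\tilde 0}$ satisfies the master equation, $d\Gamma(\partial) = \Gamma^*(Q_1)$. The $L_\infty$-morphism $F$, when viewed as a map, $M_{\mathfrak g_1} \to M_{\mathfrak g_2}$, of pointed formal manifolds, satisfies $dF(Q_1) = F^*(Q_2)$. Hence, $F_*(\Gamma)$, which is the same as $F \circ \Gamma$, satisfies the master equation in $(\mathfrak g_2, Q_2)$.

Smoothness Theorem 2.5.6. The deformation functor $\mathsf{Def}^*$ is unobstructed; that is, for any dLie-algebra $\mathfrak g$ with finite-dimensional cohomology $H(\mathfrak g)$, the functor $\mathsf{Def}^*_{\mathfrak g}$ is versally representable by a smooth pointed formal $\dim H(\mathfrak g)$-dimensional homological supermanifold $(\mathcal H_{\mathfrak g}, \partial)$. Moreover, the diffeomorphism class of $(\mathcal H_{\mathfrak g}, \partial)$ is an invariant of $\mathfrak g$.

Proof. It is enough to show that
(i) there exist a versal element $\Gamma \in k[[t]] \otimes \mathfrak g$ and a differential $\partial : k[[t]] \to k[[t]]$ satisfying the master equation
\[
d\Gamma + \vec\partial\,\Gamma + \tfrac12 [\Gamma \bullet \Gamma] = 0; \tag{2}
\]
(ii) the differential $\partial$, when viewed as a vector field on the cohomological supermanifold $\mathcal H_{\mathfrak g}$, is an invariant of $\mathfrak g$.

Unless $\mathfrak g$ is formal, there is no quasi-isomorphism from $(\mathfrak g, [\bullet], d)$ to its cohomology $(H(\mathfrak g), [\bullet]_{\mathrm{ind}}, 0)$. However, there always exists an $L_\infty$-structure, $\{\mu_*, \text{ with } \mu_1 = 0\}$, on $H(\mathfrak g)$ which is quasi-isomorphic, via some $L_\infty$-morphism $F$, to $(\mathfrak g, [\bullet], d)$. Moreover, this structure is defined uniquely up to a homotopy. Setting $\Gamma_{[1]} = \sum_i t^i [e_i]$, in the notation of Section 2.2, we define a derivation, $\partial : k[[t]] \to k[[t]]$, by the formula
\[
\vec\partial\,\Gamma_{[1]} = \sum_{n=2}^{\infty} \frac{(-1)^{n(n+1)/2}}{n!}\, \mu_n\bigl( \Gamma_{[1]} \bullet \cdots \bullet \Gamma_{[1]} \bigr).
\]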
By Remark 2.5.4, this derivation satisfies $\partial^2 = 0$. Hence, $(\Gamma_{[1]}, \partial)$ is a versal solution of the master equation in $(H(\mathfrak g), \mu_*)$, while $(F_*(\Gamma_{[1]}), \partial)$ is, by Theorem 2.5.5, a versal solution of the master equation in $(\mathfrak g, [\bullet], d)$. This proves claim (i).

The $L_\infty$-structure $\{\mu_*\}$ on $H \simeq \mathbb R^{p|q}$ is well defined only up to a homotopy, $\{\eta_{(n)} : \odot^n H \to H,\ n \ge 2,\ \tilde\eta_{(n)} = \tilde 0\}$. It is easy to check that a homotopy change of the induced $L_\infty$-structure,
\[
\mu_* \xrightarrow{\ \eta_{(*)}\ } \mu'_*,
\]
does change the differential, $\partial \longrightarrow \partial'$, but in a remarkably geometric way, $\partial' = d\eta(\partial)$, where $\eta : \mathbb R^{p|q} \to \mathbb R^{p|q}$ is just a formal change of coordinates
\[
t^i \longrightarrow t'^{\,i} = t^i + \sum_{j,k} \pm\, \eta^{\,i}_{(2)jk}\, t^j t^k + \sum_{j,k,l} \pm\, \eta^{\,i}_{(3)jkl}\, t^j t^k t^l + \cdots.
\]
Put another way, a homotopy change of $\mu_*$ affects only the coordinate representation of the vector field $\partial$ on $\mathcal H_{\mathfrak g}$. As a geometric entity, this is an invariant of $\mathfrak g$.

Corollary 2.5.7. The derived category of $L_\infty$-algebras with finite-dimensional cohomology is canonically equivalent to the (purely geometric) category of pointed formal homological supermanifolds, $(\mathcal H, \partial, 0)$, with $\partial$ satisfying $\partial I \subset I^2$, $I$ being the ideal of the distinguished point $0 \in \mathcal H$.

Proof. It is well known that each quasi-isomorphism of $L_\infty$-algebras is a homotopy equivalence. Thus the derived category of $L_\infty$-algebras is equivalent to their homotopy category. Then the required statement follows immediately from an observation made in the proof of Theorem 2.5.6 that, for any homotopy class of $L_\infty$-algebras $[\mathfrak g]$, the associated homotopy class of $L_\infty$-structures induced on the cohomology $H(\mathfrak g)$ is isomorphically mapped into one and the same homological manifold $(\mathcal H_{\mathfrak g}, \partial, 0)$.

2.5.8. Remarks. (i) The origin of the vector field $\partial$ in Theorem 2.5.6 can be traced back to Chen's power series connection (see [C]). This will be made apparent in Section 4, where we give another, perturbative proof of Theorem 2.5.6. From now on we call $\partial$ the Chen differential or Chen vector field.

(ii) The higher order tensors $\mu_*$ induced on the cohomology $H(\mathfrak g)$ by an $L_\infty$-quasi-isomorphism from a dLie-algebra $\mathfrak g$ coincide precisely with the Massey products when they are well defined and univalued. Thus, the Chen differential gives a compact (and invariant) representation of the homotopy class of Massey products.
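As an illustration, consider the special case in which the induced structure on $H(\mathfrak g)$ can be chosen with $\mu_n = 0$ for all $n \ge 3$, so that only the induced bracket $\mu_2 = [\bullet]_{\mathrm{ind}}$ survives. The sketch below writes out the defining formula for $\partial$ from the proof of Theorem 2.5.6 in that case. The structure constants $C^i_{jk}$ are notation introduced here, $\vec\partial$ denotes the action of $\partial$ through the first factor as in Section 2.2, and the Koszul signs are left as $\pm$ rather than fixed.

```latex
% Only n = 2 contributes, with coefficient (-1)^{3}/2! = -1/2.
\begin{align*}
\vec\partial\,\Gamma_{[1]}
  &= -\tfrac12\,\bigl[ \Gamma_{[1]} \bullet \Gamma_{[1]} \bigr]_{\mathrm{ind}}
   = -\tfrac12 \sum_{j,k} \pm\, t^j t^k\, \bigl[ [e_j] \bullet [e_k] \bigr]_{\mathrm{ind}} .
\end{align*}
% Writing [[e_j] \bullet [e_k]]_{ind} = \sum_i C^i_{jk} [e_i] and comparing coefficients of [e_i]:
\[
\partial t^i = -\tfrac12 \sum_{j,k} \pm\, C^i_{jk}\, t^j t^k .
\]
```

In this case the Chen differential is a homogeneous quadratic vector field in the coordinates $t^i$, and the condition $\partial^2 = 0$ is a restatement of the Jacobi identity for $[\bullet]_{\mathrm{ind}}$.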
2.5.9. Extended Kuranishi moduli space. Since the Chen vector field $\partial$ on $\mathcal H$ is homological, the distribution
\[
\mathcal D_\partial = \bigl\{ X \in \mathcal T_{\mathcal H} \ \big|\ X = [\partial, Y] \text{ for some } Y \in \mathcal T_{\mathcal H} \bigr\}
\]
is integrable (cf. Section 2.5.3). Indeed, the Jacobi identities imply
\[
\bigl[ [\partial, X], [\partial, Y] \bigr] = \bigl[ \partial, [X, [\partial, Y]] \bigr].
\]
Consider an affine subscheme,
\[
\text{"Zeros}(\partial)\text{"} := \operatorname{Spec} \frac{k[[t]]}{(\partial t^1, \dots, \partial t^{p+q})},
\]
of zeros of the vector field $\partial$. The distribution $\mathcal D_\partial$ is tangent to "Zeros($\partial$)" since $[\partial, [\partial, X]] = 0$. We define the extended Kuranishi space, $\mathcal K_{\mathfrak g}$, as the so-called nonlinear homology (see [BK], [Ma1], and [Ba]) of the Chen differential, that is, as the quotient "Zeros($\partial$)"$/\mathcal D_\partial$. (For our purposes, it is enough to understand the latter as $\operatorname{Spec}\, \bigl( k[[t]] \cap \operatorname{Ker} \partial \bigr)/(\partial t^1, \dots, \partial t^{p+q})$.) This passage from $(\mathcal H, \partial)$ to $\mathcal K_{\mathfrak g}$ establishes a clear link between the unobstructed deformation functor $\mathsf{Def}^*_{\mathfrak g}$ and the classical one $\mathrm{Def}^*_{\mathfrak g}$.

Kuranishi spaces, $K_{\mathfrak g}$, originally emerged (see [Ku] and [GM]) in the context of the deformation functor $\mathrm{Def}^0_{\mathfrak g}$ associated to a cohomologically split $\mathbb Z$-graded dLie-algebra $\mathfrak g$. It is not hard to see that $K_{\mathfrak g}$ is a proper subspace of the extended Kuranishi space $\mathcal K_{\mathfrak g}$. We will see below that for a rich class of dLie-algebras $\mathfrak g$ (the so-called (homotopy) Gerstenhaber algebras), the tangent sheaves to the smooth parts, $\mathcal K_{\mathrm{smooth}}$, of the associated extended Kuranishi spaces are canonically sheaves of associative algebras. This structure is not visible if one works only in the category of original (nonextended) Kuranishi spaces $K$.

2.6. Cohomological splitting. It is easy to compute the Chen differential once a cohomological decomposition of the dLie-algebra $\mathfrak g$ under investigation is chosen. The latter means the data set $(i, p, Q)$, where $i : H(\mathfrak g) \to \mathfrak g$ is a linear injection, $p : \mathfrak g \to H(\mathfrak g)$ is a linear surjection, and $Q : \mathfrak g \to \mathfrak g$ is an odd linear operator, all satisfying the conditions $p \circ i = \mathrm{Id}$ on $H(\mathfrak g)$ and $\mathrm{Id} = i \circ p \oplus dQ \oplus Qd$ in $\operatorname{End}_k(\mathfrak g)$.

Such a decomposition of $\mathfrak g$ often occurs in (complex) differential geometry (see [K] and [Ku]), where typical dLie-algebras come equipped with a norm $\|\ \|$ and their cohomologies $H$ get identified with harmonic subspaces, $\mathrm{Harm} := \operatorname{Ker} d \cap \operatorname{Ker} d^* \subset \mathfrak g$, $d^*$ being the adjoint of $d$ with respect to $\|\ \|$. The operator $Q$ is then $G d^*$, where $G$ is the Green function of the Laplacian $\Box = dd^* + d^*d$. In this situation, the formal power series solution, $\Gamma$, of the master equation and the associated Chen vector field
can be chosen to be convergent, thereby inducing the structure of an analytic (rather than formal) homological supermanifold on $\mathcal{H}$. It is not hard to check that, given a cohomological splitting of g, the pair $(\Gamma, \partial)$, given recursively by
\[
\begin{aligned}
\Gamma_{[1]} &= \sum_i t^i e_i \in \operatorname{Ker} Q \cap \operatorname{Ker} d,\\
\Gamma_{[2]} &= -\tfrac{1}{2}\, Q\bigl[\Gamma_{[1]}(t) \bullet \Gamma_{[1]}(t)\bigr],\\
\Gamma_{[3]} &= -\tfrac{1}{2}\, Q\Bigl(\bigl[\Gamma_{[1]}(t) \bullet \Gamma_{[2]}(t)\bigr] + \bigl[\Gamma_{[2]}(t) \bullet \Gamma_{[1]}(t)\bigr]\Bigr),\\
&\;\;\vdots\\
\Gamma_{[n]} &= -\tfrac{1}{2}\, Q\Bigl(\sum_{k=1}^{n-1} \bigl[\Gamma_{[k]}(t) \bullet \Gamma_{[n-k]}(t)\bigr]\Bigr)
\end{aligned}
\tag{3}
\]
and
\[
\vec{\partial}\, p\,\Gamma_{[1]} := -\tfrac{1}{2}\, p\,[\Gamma \bullet \Gamma],
\]
give an explicit versal solution of the master equation in g. The above power series for $\Gamma$ is well known in the classical deformation theory (see [K] and [Ku]), where it plays a key role in constructing Kuranishi analytic moduli spaces. This series is essentially an inversion of the Kuranishi map (see [Ku]) in the category of $L_\infty$-algebras.
2.7. Formality and flat structures. If an algebra (g, d, [•]) is formal, then, as follows from the proof of Theorem 2.5.6, the associated homological supermanifold $(\mathcal{H}_g, \partial, 0)$ has a canonical flat structure ∇. In the associated flat coordinates $t^i$, the Chen vector field ∂ has coefficients that are polynomials in $t^i$ of order less than or equal to 2. (This observation can, in fact, be made into a geometric criterion of formality.) More precisely, the following is true.
Theorem 2.7.1. For any formal dLie-algebra g, there is a canonical isomorphism of sets,
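\[
\left\{\begin{array}{c}\text{Homotopy classes}\\ \text{of formality maps}\end{array}\right\}
\longleftrightarrow
\left\{\begin{array}{c}\text{Flat structures on } \mathcal{H}_g \text{ such that } \nabla_X \nabla_Y \nabla_Z\, \partial = 0\\ \text{for any horizontal vector fields } X, Y, \text{ and } Z\end{array}\right\}.
\]
Let us choose a basis, $\{s_i,\ i = 1, \dots, p+q\}$, in the (p | q)-dimensional vector superspace H(g), and let $\{t^i\}$ be the associated linear coordinates. We need, for a short time, a category $\mathrm{Artin}_{k[[t]]}$ consisting of Artin superalgebras of the form
\[
\mathcal{A}_N := \frac{k[[t^1, \dots, t^{p+q}]]}{\bigl((t^1)^{N_1} \cdots (t^{p+q})^{N_{p+q}}\bigr)}.
\]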
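Denoting the maximal ideal of such a superalgebra by $m_N$, we set $(g \otimes m_N)_{\mathrm{versal}}$ to be a linear subspace in $g \otimes m_N$ consisting of even elements, $\Gamma$, satisfying $\Gamma \bmod m_N = 0$, $\Gamma \bmod m_N^2 \in \operatorname{Ker} d$, and
\[
\bigl(\Gamma \bmod m_N^2\bigr) \bmod \operatorname{Im} d = \sum_{i=1}^{p+q} t^i s_i.
\]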
This set is invariant under the action of the gauge group $\exp(g \otimes m_N)_{\tilde 1}$ (see Section 2.5).
Lemma 2.7.2. For any formal dLie-algebra g, there is a canonical isomorphism of sets,
\[
\varprojlim \frac{\bigl\{\Gamma \in (g \otimes m_N)_{\mathrm{versal}} \mid d\Gamma + \vec{\partial}\,\Gamma + [\Gamma \bullet \Gamma]/2 = 0\bigr\}}{\text{gauge group}}
= \frac{\text{Formality maps}}{\text{homotopy equivalence}},
\]
where the projective limit is taken over the category $\mathrm{Artin}_{k[[t]]}$.
Proof. For any $\mathcal{A}_N \in \mathrm{Artin}_{k[[t]]}$, the master equation in the Lie algebra $(H(g) \otimes \mathcal{A}_N, [\bullet]_{\mathrm{ind}})$ has a canonical versal solution,
\[
\Gamma_0 = \sum_{i=1}^{p+q} t^i s_i, \qquad
\partial = \sum_{i,j,k=1}^{p+q} (-1)^{\tilde j(\tilde i+1)}\, t^i t^j C_{ij}^k\, \frac{\partial}{\partial t^k},
\]
where $C_{ij}^k$ are the structure constants of $[\bullet]_{\mathrm{ind}}$. If g is formal, and $F = \{F_n : \odot^n H(g) \to g,\ n = 1, 2, \dots\}$ is a formality map, then, by Theorem 2.5.5,
\[
\Gamma := \sum_{n=1}^{\infty} \frac{1}{n!}\, F_n(\Gamma_0, \dots, \Gamma_0), \qquad \text{the same } \partial,
\]
is a versal solution of the master equation in $g \otimes \mathcal{A}_N$. It is easy to check that an arbitrary homotopy change, $F \to F^h$, of the formality map, say, the one induced by a set of linear maps $h = \{h_n : \odot^n H(g) \to g,\ \tilde h_n = \tilde n,\ n = 1, 2, \dots\}$, changes the versal solution into a gauge equivalent one, $\Gamma \to \Gamma^g$, where
\[
g = \sum_i h_1(e_i)\, t^i + \sum_{i,j} \pm h_2(e_i, e_j)\, t^i t^j + \sum_{i,j,k} \pm h_3(e_i, e_j, e_k)\, t^i t^j t^k + \cdots.
\]
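Hence there is a canonical map,
\[
\frac{\text{Formality maps}}{\text{homotopy equivalence}} \longrightarrow
\frac{\bigl\{\Gamma \in (g \otimes m_N)_{\mathrm{versal}} \mid d\Gamma + \vec{\partial}\,\Gamma + [\Gamma \bullet \Gamma]/2 = 0\bigr\}}{\text{gauge group}},
\]
\[
F = \frac{\{F_n : \odot^n H(g) \to g,\ n = 1, 2, \dots\}}{\sim} \longmapsto
\frac{\sum_{n=1}^{\infty} (1/n!)\, F_n(\Gamma_0, \dots, \Gamma_0)}{\sim},
\]
which implies (almost immediately) the desired result.
2.7.3. Proof of Theorem 2.7.1. Let $\mathrm{Diff}_0$ be the group of all formal diffeomorphisms of $\mathcal{H}_g = H(g)$ into itself preserving the origin, and set
\[
\mathrm{Diff}_{0,\partial} := \bigl\{\phi \in \mathrm{Diff}_0 \mid \phi_*(\partial) \text{ is quadratic in } t^i\bigr\}.
\]
Note that $\mathrm{Diff}_{0,\partial} = \mathrm{Diff}_0$ if the Chen vector field ∂ vanishes. In general, $GL(p+q) \subseteq \mathrm{Diff}_{0,\partial} \subseteq \mathrm{Diff}_0$. There is an obvious isomorphism,
\[
\frac{\mathrm{Diff}_{0,\partial}}{GL(p+q)} =
\left\{\begin{array}{c}\text{Flat structures on } \mathcal{H}_g \text{ such that } \nabla_X \nabla_Y \nabla_Z\, \partial = 0\\ \text{for any horizontal vector fields } X, Y, \text{ and } Z\end{array}\right\}.
\]
On the other hand, by Theorem 4.2.2,
\[
\frac{\mathrm{Diff}_{0,\partial}}{GL(p+q)} =
\varprojlim \frac{\bigl\{\Gamma \in (g \otimes m_N)_{\mathrm{versal}} \mid d\Gamma + \vec{\partial}\,\Gamma + [\Gamma \bullet \Gamma]/2 = 0\bigr\}}{\text{gauge group}}.
\]
The final link in the chain of canonical isomorphisms is provided by Lemma 2.7.2.
Corollary 2.7.4. For any compact Calabi-Yau manifold, there is a canonical isomorphism of sets,
\[
\left\{\begin{array}{c}\text{Flat connections on Barannikov-Kontsevich's}\\ \text{moduli space of extended complex structures}\end{array}\right\}
\longleftrightarrow
\left\{\begin{array}{c}\text{Homotopy classes}\\ \text{of formality maps}\end{array}\right\}.
\]
3. Homotopy Gerstenhaber algebras
3.1. Differential Gerstenhaber algebras. A differential Gerstenhaber algebra, or dG-algebra, is the data set (g, d, [•], ·), where
(i) (g, d, [•]) is a Z-graded dLie-algebra as defined in Section 2.2.1;
(ii) (g, d, ·) is a differential Z-graded associative algebra with the product
\[
\cdot : g \otimes g \longrightarrow g, \qquad a \otimes b \longmapsto a \cdot b,
\]
having degree zero;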
(iii) the binary operations [•] and · satisfy the odd Poisson identity
\[
[a \bullet (b \cdot c)] = [a \bullet b] \cdot c + (-1)^{(\tilde a + 1)\tilde b}\, b \cdot [a \bullet c],
\]
for all homogeneous a, b, c ∈ g.
A dG-algebra is called graded commutative if its dot product is graded commutative. The identity in g is an even element $e_0$ such that $de_0 = 0$, $e_0 \cdot a = a \cdot e_0 = a$, and $[e_0 \bullet a] = 0$ for any a ∈ g. It defines a cohomology class $[e_0]$ in H, and a constant vector field on $\mathcal{H}$ which we denote by e.
Remark 3.1.1. If g is a unital dG-algebra, then a versal solution, $\Gamma$, of the master equation (2) in g can (and always will) be normalized in such a way that $\vec{e}\,\Gamma = e_0$.
3.1.2. Differential Gerstenhaber-Batalin-Vilkovisky algebras. Let (g, ·) be a Z-graded commutative associative algebra over a field k. Let us say that the zero operator, 0 : g → g, is of order −1, and let us denote the linear operator, x → a · x, of left multiplication by an element a ∈ g by $l_a$. A homogeneous linear operator, D : g → g, is said to be an operator of order k if the operator $[D, l_a]$ is of order k − 1 for any homogeneous a in g. Assume now that (g, ·) comes equipped with (i) a degree +1 linear operator, d : g → g, of order 1, and (ii) a degree −1 linear operator, Δ : g → g, of order 2, satisfying the conditions
\[
d^2 = 0, \qquad \Delta^2 = 0, \qquad d\Delta + \Delta d = 0.
\]
In this case, the data set (g, d, ·, [•]) with
\[
[a \bullet b] := (-1)^{\tilde a} \Delta(a \cdot b) - (-1)^{\tilde a} (\Delta a) \cdot b - a \cdot (\Delta b), \qquad \forall a, b \in g,
\]
defines a dG-algebra (see [Ma1]). The dG-algebras arising in this way are often called exact or dGBV-algebras. Example 3.1.3 (Complex manifolds). For any n-dimensional complex manifold M, the differential Lie algebra of Example 2.3.1, ∗ g = M, 0• TM ⊗ 0• T M , ∂, [•] , equipped with a supercommutative product, ∗ ∗ ∧ : M, 0i1 TM ⊗ 0j1 T M × M, 0i2 TM ⊗ 0j2 T M ∗ −→ M, 0i1 +i2 TM ⊗ 0j1 +j2 T M ,
˜˜ X1 ⊗ w1 × X2 ⊗ w 2 −→ (−1)j1 i2 X1 ∧ X2 ⊗ w1 ∧ w2 , is a unital Z-graded commutative dG-algebra. If M admits a nowhere vanishing global holomorphic volume form, ∈ (M, nM ), then the above dG-algebra is actually exact (see [Ti], [To], and [BK]) with 5 being the composition i
−1 i
∂
n−i n−i+1 5 : 0i TM −−→ M −→ M −−→ 0i−1 TM .
Here, i : 0• TM → •M is the natural isomorphism given by contraction with the holomorphic volume form, and ∂ is the (1, 0)-part of the de Rham operator. Example 3.1.4 (Symplectic manifolds). For any symplectic manifold (M, ω), the dLie-algebra of Example 2.3.5, g = M, •R , d, [•]ω , together with a graded commutative product, a · b := a ∧ b, is a unital Z-graded dGalgebra. Moreover, it is a dGBV-algebra with the second order differential given by 5 |k = (−1)k+1 ∗ d ∗ . R
is the symplectic analogue of the Hodge duality operator Here, ∗ : kR → defined by the condition, β ∧(∗α) = (β, α)ωm /m!, with ( , ) being the pairing between k-forms induced by the symplectic form. 2m−k R
Example 3.1.5 (Vector bundles). Let M be a complex manifold, and let π : E → M be a holomorphic vector bundle. There is a complex of Z-graded sheaves (• E ⊗ 0• E ∗ , 5), 5
5
5
5
· · · −→ k+1 E ⊗ 0l+1 E ∗ −→ k E ⊗ 0l E ∗ −→ k−1 E ⊗ 0l−1 E ∗ −→ · · · , where the differential 5 is just the contraction, the Z-grading is induced from the one on 0• E, and we set k E = 0k E ∗ = 0 for k < 0. It is easy to see that 5 is a linear operator of order 2 with respect to the natural supercommutative product, • E ⊗ 0• E ∗ × • E ⊗ 0• E ∗ −→ ˙ • E ⊗ 0• E ∗ , a1 ⊗ b1∗ × a2 ⊗ b2∗ −→ a1 a2 ⊗ b1∗ ∧ b2∗ . Hence, the data set j k k • i ∗ g= g , g := M, E ⊗ 0 E ⊗ M , 5, ∂ k
i+j =k
is a unital dGBV-algebra. It extends the dLie-algebra of Example 2.3.6, as the following calculation shows.
Proposition. The bracket [•] on • E ⊗ 0• E ∗ , when restricted to E ⊗ E ∗ , coincides, up to a sign, with the usual commutator of morphisms. Proof. Let us consider a pair of germs, C1 = a1 ⊗ b1∗ and C2 = a2 ⊗ b2∗ , in the same stalk of E ⊗ E ∗ . Then ˜ ˜ C1 • C2 = (−1)C1 5 C1 · C2 − (−1)C1 5(C1 ) · C2 − C1 · 5(C2 ) = −5 a1 a2 ) ⊗ b1∗ ∧ b2∗ +5 a1 ⊗ b1∗ · a2 ⊗ b2 − a1 ⊗ b1 ·5 a2 ⊗ b2∗ = − a1 , b1∗ a2 ⊗ b2∗ + a1 , b2∗ a2 ⊗ b1∗ − a2 , b1∗ a1 ⊗ b2∗ + a2 , b2∗ a1 ⊗ b1∗ + a1 , b1∗ a2 ⊗ b2∗ − a2 , b2∗ a1 ⊗ b1∗ = a1 , b2∗ a2 ⊗ b1∗ − a2 , b1∗ a1 ⊗ b2∗ = − C 1 C 2 − C 2 C1 , where angle brackets stand for the usual pairing between a vector and a 1-form. There is a problem with the constructed dGBV-algebra—its cohomology may not be finite-dimensional even for compact manifolds. It is resolved by passing to its dGBV-subalgebra, j k k i i ∗ gE = g , g := M, E ⊗ 0 E ⊗ M , 5, ∂ . k
i+j =k
If necessary, the asymmetry of E and E ∗ can be eliminated by taking the tensor product gE ⊗ gE ∗ . Example 3.1.6 (Hochschild cohomology). Let A be an associative algebra over a field k. The Z-graded vector space of Hochschild cochains, C • (A, A) :=
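\[
\bigoplus_{n=0}^{\infty} \operatorname{Hom}_k\bigl(A^{\otimes n}, A\bigr),
\]
can be made into a Hochschild complex with the differential, $d : C^n(A, A) \to C^{n+1}(A, A)$, given by
\[
(df)(a_1 \otimes \cdots \otimes a_{n+1}) := a_1 f(a_2 \otimes \cdots \otimes a_{n+1})
+ \sum_{i=1}^{n} (-1)^i f(a_1 \otimes \cdots \otimes a_i a_{i+1} \otimes \cdots \otimes a_{n+1})
+ (-1)^{n+1} f(a_1 \otimes \cdots \otimes a_n)\, a_{n+1}
\]
for any $f \in C^n(A, A)$.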
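The Hochschild differential above can be tested numerically on a small algebra. The following sketch is an illustration under stated assumptions, not part of the original text: it takes A to be the dual numbers Z[ε]/(ε²) (a choice made here only because products of basis elements are again basis elements or zero), stores a cochain by its values on basis tensors, and checks that applying the differential twice gives zero. All function names are hypothetical.

```python
import itertools
import random

# Dual numbers A = Z[eps]/(eps^2): basis index 0 is 1, index 1 is eps.
# An element x + y*eps is stored as the tuple (x, y).
BASIS = [(1, 0), (0, 1)]

def mul(u, v):
    """Product in A."""
    return (u[0] * v[0], u[0] * v[1] + u[1] * v[0])

def add(u, v):
    return (u[0] + v[0], u[1] + v[1])

def scale(c, u):
    return (c * u[0], c * u[1])

def basis_product(i, j):
    """Product of two basis elements: a basis index, or None if it is zero."""
    return None if i == 1 and j == 1 else max(i, j)

def hochschild_d(f, n):
    """Differential of an n-cochain f given by its values on basis tensors.

    f maps each tuple of n basis indices to an element of A; the result
    maps each tuple of n + 1 basis indices to an element of A.
    """
    df = {}
    for idx in itertools.product((0, 1), repeat=n + 1):
        # a_1 * f(a_2, ..., a_{n+1})
        total = mul(BASIS[idx[0]], f[idx[1:]])
        # sum over i = 1..n of (-1)^i f(..., a_i a_{i+1}, ...)
        for i in range(1, n + 1):
            merged = basis_product(idx[i - 1], idx[i])
            if merged is not None:
                arg = idx[:i - 1] + (merged,) + idx[i + 1:]
                total = add(total, scale((-1) ** i, f[arg]))
        # (-1)^{n+1} f(a_1, ..., a_n) * a_{n+1}
        last = mul(f[idx[:-1]], BASIS[idx[-1]])
        total = add(total, scale((-1) ** (n + 1), last))
        df[idx] = total
    return df

def random_cochain(n, rng):
    """A cochain with small random integer values on basis tensors."""
    return {idx: (rng.randint(-5, 5), rng.randint(-5, 5))
            for idx in itertools.product((0, 1), repeat=n)}

rng = random.Random(0)
for n in range(4):
    f = random_cochain(n, rng)
    ddf = hochschild_d(hochschild_d(f, n), n + 1)
    print(n, all(value == (0, 0) for value in ddf.values()))
```

As a side check, the differential of the identity 1-cochain is the multiplication of A, since (df)(a, b) = a f(b) − f(ab) + f(a) b = ab when f = Id.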
One can define two binary operations, $C^\bullet(A, A) \otimes C^\bullet(A, A) \to C^\bullet(A, A)$, the degree zero dot product,
\[
(f \cdot g)(a_1 \otimes \cdots \otimes a_{k+l}) := (-1)^{kl} f(a_1 \otimes \cdots \otimes a_k)\, g(a_{k+1} \otimes \cdots \otimes a_{k+l}),
\qquad \forall f \in C^k(A, A),\ g \in C^l(A, A),
\]
and the degree −1 bracket,
\[
[f \bullet g] := f \circ g - (-1)^{(k+1)(l+1)} g \circ f,
\]
where
\[
(f \circ g)(a_1 \otimes \cdots \otimes a_{k+l-1}) :=
\sum_{i=1}^{k-1} (-1)^{(i+1)(l+1)} f\bigl(a_1 \otimes \cdots \otimes a_i \otimes g(a_{i+1} \otimes \cdots \otimes a_{i+l}) \otimes \cdots \otimes a_{k+l-1}\bigr).
\]
These two make the Hochschild complex into a Z-graded differential associative algebra and a differential (odd) Lie algebra, respectively. Though $(C^\bullet(A, A), d, [\bullet], \cdot)$ is not a dG-algebra, it is a remarkable fact that the associated Hochschild cohomology,
\[
\mathrm{Hoch}^\bullet(A, A) = \frac{\operatorname{Ker} d}{\operatorname{Im} d},
\]
carries the structure of graded commutative dG-algebra with respect to the naturally induced dot product, the Lie bracket, and the zero differential. 3.2. A∞ -algebras. A strong homotopy algebra, or A∞ -algebra, is by definition a vector superspace V equipped with linear maps, µk : ⊗k V −→ V , v1 ⊗ · · · ⊗ vk −→ µk v1 , . . . , vk ,
k ≥ 1,
of parity k˜ satisfying, for any n ≥ 1 and any v1 , . . . , vn ∈ V , the following higher order associativity conditions:
\[
\sum_{k+l=n+1}\ \sum_{j=0}^{k-1} (-1)^r \mu_k\bigl(v_1, \dots, v_j, \mu_l(v_{j+1}, \dots, v_{j+l}), v_{j+l+1}, \dots, v_n\bigr) = 0,
\tag{4}
\]
where $r = \tilde l(\tilde v_1 + \cdots + \tilde v_j) + \tilde j(\tilde l - 1) + (\tilde k - 1)\tilde l$ and $\tilde v$ denotes the parity of v ∈ V. Denoting $dv_1 := \mu_1(v_1)$ and $v_1 \cdot v_2 := \mu_2(v_1, v_2)$, we can spell out the first three conditions from the above infinite series as
n = 1: $d^2 = 0$;
n = 2: $d(v_1 \cdot v_2) = (dv_1) \cdot v_2 + (-1)^{\tilde v_1} v_1 \cdot (dv_2)$;
n = 3: v1 ·(v2 ·v3 )−(v1 ·v2 )·v3 = dµ3 (v1 , v2 , v3 )+µ3 (dv1 , v2 , v3 )+(−1)v˜1 µ3 (v1 , dv2 , v3 ) + (−1)v˜1 +v˜2 µ3 (v1 , v2 , dv3 ). Therefore, A∞ -algebras with µk = 0 for k ≥ 3 are nothing but the differential associative superalgebras with the differential µ1 and the associative multiplication µ2 . If, furthermore, µ1 = 0, one recovers the usual associative superalgebras. There is a (finer) Z-graded version of the above definition in which the maps µn are required to be homogeneous (usually of degree n − 2) with respect to the given Z-grading on V . 3.2.1. Identity. An element e in the A∞ -algebra is called the identity if µ1 (e) = 0, µ2 (e, v) = µ2 (v, e) = v, and µn (v1 , . . . , e, . . . , vn−1 ) = 0 for all n ≥ 3 and arbitrary v, v1 , . . . , vn−1 ∈ V . 3.2.2. Homotopy classes of A∞ -algebras. For a pair of A∞ -algebras, (V , µ∗ ) and (V˜ , µ˜ ∗ ), there is a natural notion of an A∞ -morphism from V to V˜ which is, by definition, a set of linear maps,
\[
F = \bigl\{ f_n : V^{\otimes n} \longrightarrow \tilde V,\ n \ge 1 \bigr\},
\]
of parity $\tilde n + 1$ (or of degree 1 − n in the Z-graded case) which satisfy
\[
\sum_{1 \le k_1 < k_2 < \cdots < k_{i-1} < n} (-1)^{i+r}\, \tilde\mu_i\bigl(f_{k_1}(v_1, \dots, v_{k_1}), f_{k_2-k_1}(v_{k_1+1}, \dots, v_{k_2}), \dots, f_{n-k_{i-1}}(v_{k_{i-1}+1}, \dots, v_n)\bigr)
\]
\[
= \sum_{k+l=n+1}\ \sum_{j=0}^{k-1} (-1)^{l(\tilde v_1 + \cdots + \tilde v_j + n) + j(l-1)} f_k\bigl(v_1, \dots, v_j, \mu_l(v_{j+1}, \dots, v_{j+l}), v_{j+l+1}, \dots, v_n\bigr).
\]
The first three floors in the above infinite tower are
n = 1: $\tilde\mu_1 = \mu_1 =: d$;
n = 2: $\tilde\mu_2(v_1, v_2) = \mu_2(v_1, v_2) + (df_2)(v_1, v_2)$;
n = 3: $\tilde\mu_3(v_1, v_2, v_3) + \tilde\mu_2(f_2(v_1, v_2), v_3) - (-1)^{\tilde v_1} \tilde\mu_2(v_1, f_2(v_2, v_3)) = \mu_3(v_1, v_2, v_3) - f_2(\mu_2(v_1, v_2), v_3) + f_2(v_1, \mu_2(v_2, v_3)) + (df_3)(v_1, v_2, v_3)$,
where we naturally extended the differential d : V → V to $d : \otimes^k V^* \otimes V \to \otimes^k V^* \otimes V$ (so that, e.g., $(df_2)(v_1, v_2) = df_2(v_1, v_2) + f_2(dv_1, v_2) + (-1)^{\tilde v_1} f_2(v_1, dv_2)$).
A morphism $F = \{f_n\}$ of the $A_\infty$-algebra $(V, \mu_*)$ to itself is called a homotopy if $f_1 = \mathrm{Id}$. If $(V, \mu_*)$ has the identity e, then by a homotopy of $(V, \mu_*, e)$ we understand a homotopy of $(V, \mu_*)$ satisfying the additional conditions, $f_n(v_1, \dots, e, \dots, v_{n-1}) = 0$ for all n ≥ 2 and arbitrary $v_1, \dots, v_{n-1} \in V$. It is not hard to see that homotopy defines an equivalence relation in the set of all possible (unital) $A_\infty$-structures on a given vector superspace V.
Remark 3.2.3. For future reference, we rewrite the nth order associativity condition (4) as 0n v1 , . . . , vn = (dµn ) v1 , . . . , vn , where 0n v1 , . . . , vn := (−1)r µk+1 v1 , . . . , vj , k+l=n−1 k,l≥1
µl+1 vj +1 , . . . , vj +l+1 , vj +l+2 , . . . , vn
and $r = (l + 1)(\tilde v_1 + \cdots + \tilde v_j) + jl + k(l + 1) + 1$.
Remark 3.2.4. It follows from (4) for n = 3 that the cohomology
\[
H(V) := \frac{\operatorname{Ker} \mu_1}{\operatorname{Im} \mu_1}
\]
of a (unital) $A_\infty$-algebra $(V, \mu_*)$ is canonically a (unital) associative algebra. Moreover, a homotopy class of (unital) $A_\infty$-structures on V induces one and the same structure of a (unital) associative algebra on H(V).
3.2.5. The bar construction. There is a conceptually better interpretation (see [S]) of an $A_\infty$-structure on the vector superspace V as a codifferential on the bar construction of V. Here are the details.
(i) The vector space
\[
B(V) := \bigoplus_{n=1}^{\infty} \bigl(V[1]\bigr)^{\otimes n}
\]
is naturally a coalgebra with the coproduct given by
\[
\Delta(w_1 \otimes \cdots \otimes w_n) = \sum_{i=1}^{n} (w_1 \otimes \cdots \otimes w_i) \otimes (w_{i+1} \otimes \cdots \otimes w_n).
\]
(ii) A linear map Q : B(V) → B(V) is said to be a coderivation if $\Delta Q = (Q \otimes \mathrm{Id} + \mathrm{Id} \otimes Q)\Delta$. There is a one-to-one correspondence between such coderivations and Hochschild cochains understood as elements of Hom(B(V), V).
(iii) A homogeneous (of degree −2) Hochschild cochain $\mu_* : B(V) \to V$ defines an $A_\infty$-structure on V if and only if the associated coderivation Q is a codifferential, that is, satisfies $Q^2 = 0$.
In this setup, a morphism $(V, \mu_*) \to (\tilde V, \tilde\mu_*)$ as in Section 3.2.2 is precisely a morphism of the associated bar constructions respecting codifferentials.
3.3. $C_\infty$-algebras. This notion is a supercommutative analogue of the notion of $A_\infty$-algebra.
Let V be a Z-graded vector space and B(V ) its bar construction. One can make the latter into an associative and graded commutative algebra by defining the shuffle tensor product, : B(V ) ⊗ B(V )→ B(V ), as follows:
w1 ⊗· · ·⊗wk wk+1 ⊗· · ·⊗wn :=
e σ ; w1 , . . . , wn wσ (1) ⊗· · ·⊗wσ (n) .
σ ∈Sh(k,n)
Here we used the notation explained in Section 2.4. By definition (see [GJ]), a strong homotopy commutative algebra, or C∞ -algebra, is an A∞ -algebra (V , µ∗ ) such that the associated Hochschild cochain µ∗ : B(V ) → V factors through the composition5 natural projection
µ∗ : B(V ) −−−−−−→
B(V ) −→ V . B(V ) B(V )
This implies, in particular, that $\mu_2(v_1, v_2) = (-1)^{\tilde v_1 \tilde v_2} \mu_2(v_2, v_1)$ for any $v_1, v_2 \in V$. One defines notions of unital $C_\infty$-algebras, of a morphism of $C_\infty$-algebras, of their homotopy, and so on, in the same way as in the $A_\infty$-case.
5 Such cochains are often called Harrison cochains.
3.4. $G_\infty$-algebras. Let V be a Z-graded vector space, and let
\[
\mathrm{Lie}\bigl(V[1]^*\bigr) = \bigoplus_{k=1}^{\infty} \mathrm{Lie}^k\bigl(V[1]^*\bigr)
\]
be the free graded Lie algebra generated by the shifted dual vector space $V[1]^*$; that is,
\[
\mathrm{Lie}^1\bigl(V[1]^*\bigr) := V[1]^*, \qquad
\mathrm{Lie}^k\bigl(V[1]^*\bigr) := \bigl[V[1]^*, \mathrm{Lie}^{k-1}\bigl(V[1]^*\bigr)\bigr].
\]
The Lie bracket on $\mathrm{Lie}(V[1]^*)$ extends in a usual way to the skew-symmetric associative algebra
\[
\wedge^\bullet \mathrm{Lie}\bigl(V[1]^*\bigr) = \bigoplus_{k=0}^{\infty} \wedge^k \mathrm{Lie}\bigl(V[1]^*\bigr),
\]
making the latter into a Gerstenhaber algebra.
Definition 3.4.1 [Ta1], [TT]. A homotopy Gerstenhaber algebra, or $G_\infty$-algebra,
is a graded vector space V together with a degree-one linear operator Q : ∧• Lie V [1]∗ −→ ∧• Lie V [1]∗ such that Q2 = 0 and Q is a derivation with respect to both the product and the bracket. A G∞ -morphism, V → V , of G∞ -algebras is by definition a morphism, (∧• Lie(V [1]∗ ), Q) → (∧• Lie(V [1]∗ ), Q ), of associated differential Gerstenhaber algebras. Definition 3.4.1 makes sense only in the case when V is finite-dimensional. However, an obvious dualization fixes the problem (see [TT]). (i) The dual of Lie(V [1]∗ ) can be identified with the quotient B(V )/ B(V ) B(V ), being the shuffle tensor product. (ii) Derivations of ∧∗ Lie(V [1]∗ ) can be identified with arbitrary collections of linear maps, m∗k1 ,...,kn : V [1]∗ −→ Liek1 V [1]∗ ∧ · · · ∧ Liekn V [1]∗ , which upon dualization go into linear homogeneous maps, mk1 ,...,kn :
\[
\frac{V^{\otimes k_1}}{\text{shuffle product}} \otimes \cdots \otimes \frac{V^{\otimes k_n}}{\text{shuffle product}} \longrightarrow V,
\]
of degree $3 - n - k_1 - \cdots - k_n$.
(iii) The condition $Q^2 = 0$ translates into a well-defined set of quadratic equations for $m_{k_1,\dots,k_n}$ which say, in particular, that $m_1$ is a differential on V and that the product, $v_1 \cdot v_2 := (-1)^{\tilde v_1} m_2(v_1, v_2)$, together with the Lie bracket, $[v_1 \bullet v_2] := -(-1)^{\tilde v_1} m_{1,1}(v_1, v_2)$, satisfies the Poisson identity up to a homotopy given by $m_{2,1}$. Hence, the associated cohomology space H is a graded commutative Gerstenhaber algebra with respect to the binary operations induced by $m_2$ and $m_{1,1}$.
The identity in a $G_\infty$-algebra V is an even element e such that all $m_{k_1,\dots,k_n}(\dots, e, \dots)$ vanish except $m_2(e, v) = v$.
Theorem-Construction 3.4.2. There is a canonical functor from the derived category of unital $G_\infty$-algebras with finite-dimensional cohomology to the category of $F_\infty$-manifolds.
Proof. Since each quasi-isomorphism of $G_\infty$-algebras is an equivalence relation, the derived category of $G_\infty$-algebras coincides with their homotopy category. We construct the desired functor,
\[
\left\{\begin{array}{c}\text{Derived category}\\ \text{of } G_\infty \text{ algebras}\end{array}\right\}
\xrightarrow{\ F_\infty\ }
\bigl\{F_\infty \text{ manifolds}\bigr\},
\]
in two steps.
Step 1. Suppose we are given a homotopy class, [ ], of G∞ -structures on a graded vector space V . By [Ko2, Lemma 1], a cohomological splitting of the complex (V , m1 ) transfers [ ] into a homotopy class, [∧• Lie(H[1]∗ ), Q], of minimal G∞ -algebras on the finite-dimensional cohomology space of the above complex. Moreover, this class does not depend on the choice of a particular cohomological splitting, and it is homotopy equivalent to the original one. Step 2. Let Ᏽ be the multiplicative ideal in ∧• Lie(H[1]∗ ) generated by the commutant of Lie(H[1]∗ ). Any differential Q from the induced homotopy class preserves this ideal and induces, through the quotient ∧• Lie(H[1]∗ )/Ᏽ, a homotopy class of L∞ -structures on V which, by Corollary 2.5.7, can be identified with an odd vector field ∂ on the associated cohomological supermanifold Ᏼ satisfying ∂I ⊂ I 2 and [E, ∂] = ∂, I being the ideal of the distinguished point 0 ∈ Ᏼ and E the Euler vector field. We claim that the rest of the data listed in Definition 1.1 gets induced on Ᏼ through the quotient ∧• Lie(H[1]∗ )/Ᏽ2 . Indeed, what is left of a differential Q on this quotient can be described as a collection of tensors, mk,1,...,1 , which, in a basis {ea } of H, are represented by their components, µab1 ···bk ,c1 ,...,cl , k ≥ 1, l ≥ 0. The Chen’s vector field ∂ and the tensors µk defining the structure of a C∞ -algebra on the tangent sheaf ᐀Ᏼ are then given by formal power series, ∂=
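\[
\sum_{l \ge 0} \pm\, \mu^{a}_{b_1, c_1, \dots, c_l}\, t^{b_1} t^{c_1} \cdots t^{c_l}\, \frac{\partial}{\partial t^{a}}
\]
and
\[
\mu^{a}_{b_1 \cdots b_k} = \sum_{l \ge 0} \pm\, \mu^{a}_{b_1 \cdots b_k, c_1, \dots, c_l}\, t^{c_1} \cdots t^{c_l},
\]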
where t a are the associated linear coordinates on Ᏼ to which we assign degree 2−|ea |. It is easy to see that the G∞ -identities for mk,1,...,1 get transformed into the right identities for the tensor fields ∂ and µk on Ᏼ. This completes the construction. Corollary 3.4.3. For any unital G∞ -algebra with finite-dimensional cohomology, the tangent sheaf to the smooth part of the extended Kuranishi space, ᏹ = “zeros(∂)”/ Im µ1 , is canonically a sheaf of induced (unital) associative algebras. It will be interesting to find out when ᏹsmooth with its canonically induced structure in Corollary 3.4.3 is an F∞ -manifold in the sense of Hertling and Manin [HM]. Remark 3.5. Different “resolutions” of the chain operad in the little disk operad give different notions of homotopy Gerstenhaber algebra (see [V]). Definition 3.4.1 is the most canonical one. However, the functor F∞ is not an equivalence in this case. The proof of Theorem 3.4.2 suggests one more version: a reduced homotopy Gerstenhaber algebra is a graded vector space V together with the structure of G∞ -algebra such that all composition maps mk1 ,...,kn vanish except mk1 ,1,...,1 . The
derived category of such algebras is equivalent to the category of F∞ -manifolds (cf. Theorem 2.5.7). 3.6. Formality and Gauss-Manin connections. A pre-Frobenius∞ manifold is the data set (Ᏼ, E, ∇, ∂, [µ∗ ], e), where (i) Ᏼ is a formal pointed Z-graded manifold; (ii) E is the Euler vector field on Ᏼ, Ef := (1/2)|f |f , for all homogeneous functions on Ᏼ of degree |f |; (iii) ∇ is a flat torsion-free affine connection, called the Gauss-Manin connection, on Ᏼ; (iv) ∂ is an odd homological (i.e., ∂ 2 = 0) vector field on Ᏼ such that [E, ∂] = ∂, ∇X ∇Y ∇Z ∂ = 0 for any horizontal vector fields, X, Y , and Z on Ᏼ, and ∂I ⊂ I 2 , I being the ideal of the distinguished point in Ᏼ; (v) [µn : ⊗n ᐀Ᏼ → ᐀Ᏼ ], n ∈ N, is a homotopy class of smooth unital strong homotopy commutative (C∞ ) algebras defined on the tangent sheaf, ᐀Ᏼ , to Ᏼ, such that LieE µn = (1/2)nµn , for all n ∈ N, and µ1 is given by µ1 : ᐀Ᏼ −→ ᐀Ᏼ ,
X −→ µ1 (X) := [∂, X];
(vi) e is the flat unit, that is, an even vector field on Ᏼ such that [∂, e] = 0, ∇e = 0, µ2 (e, X) = X, ∀X ∈ ᐀Ᏼ , and µn (. . . , e, . . . ) = 0 for all n ≥ 3. Theorem 3.6.1. There is a canonical functor from the category of pairs (g, F ), where g is L∞ -formal unital homotopy Gerstenhaber algebra and F is a formality map, to the category of pre-Frobenius∞ manifolds. Proof. The desired statement follows immediately from Remark 2.5.9 and a version of Theorem-construction 3.4.2 where the formality map F is used to transfer the G∞ -structure from the algebra to its cohomology. Theorem 3.6.2. If a homotopy Gerstenhaber algebra g is quasi-isomorphic, as an L∞ -algebra, to an Abelian dLie-algebra, then the tangent sheaf, ᐀Ᏼ , to its cohomology viewed as a linear supermanifold is canonically a sheaf of unital graded commutative associative algebras. Proof. In this case, ∂ = 0, and µ2 , which is now defined uniquely, makes ᐀Ᏼ into a sheaf of unital graded commutative associative algebras. 4. Perturbative construction of F∞ -invariants. The purpose of this section is to give second “down-to-earth” proofs of some of the main claims of this paper. Our approach here is based on perturbative solutions of algebraic differential equations rather than on the homotopy technique used in the two previous sections. First comes a perturbative proof of the Smoothness Theorem 2.5.6. Theorem 4.1 (Chen’s construction). For any differential Lie superalgebra g, there exists a versal element, ∈ k[[t]]⊗ g, and an odd derivation, ∂ : k[[t]] → k[[t]], such
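that $\partial^2 = 0$ and the equation
\[
d\Gamma + \vec{\partial}\,\Gamma + \tfrac{1}{2}[\Gamma \bullet \Gamma] = 0
\]
holds. Moreover, for any quasi-isomorphism of complexes of vector spaces, φ : (g, d) → (H, 0), $\Gamma$ may be normalized so that $\phi(\Gamma_{[n]}) = 0$ for all n ≥ 2.
We prove Theorem 4.1 by induction using (twice) the following lemma, which is merely a truncated version of Remark 2.5.1.
Lemma 4.1.1. Assume that the elements $\Gamma_{(n)} = \sum_{k=0}^{n} \Gamma_{[k]} \in k[[t]] \otimes g$ and $\partial_{(n)} = \sum_{k=0}^{n} \partial_{[k]} \in \operatorname{Der} k[[t]]$ satisfy
\[
d\Gamma_{(n)} + \vec{\partial}_{(n)} \Gamma_{(n)} + \tfrac{1}{2}\bigl[\Gamma_{(n)} \bullet \Gamma_{(n)}\bigr] = 0 \mod I^{n+1}.
\]
Then
\[
\psi_{[n+1]} := d\Gamma_{(n)} + \vec{\partial}_{(n)} \Gamma_{(n)} + \tfrac{1}{2}\bigl[\Gamma_{(n)} \bullet \Gamma_{(n)}\bigr] \mod I^{n+2}
\]
satisfies
\[
d\psi_{[n+1]} = -\vec{\partial}_{(n)}^{\,2}\, \Gamma_{(n)} \mod I^{n+2}.
\]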
4.1.2. Proof of Theorem 4.1. Let φ : (g, d) → (H, 0) be a quasi-isomorphism, that is, a morphism of complexes inducing an isomorphism on cohomology. Since g is defined over a field, such a quasi-isomorphism always exists (note that we do not ask for any sort of a relationship between φ and the Lie brackets). Let $e_i$ be any representatives of the cohomology classes $[e_i]$ in Ker d ⊂ g. We may assume without loss of generality that $\phi(e_i) = [e_i]$. Then, choosing $\Gamma_{[0]} = 0$, $\Gamma_{[1]} := \sum_{i=1}^{p+q} t^i e_i$, and $\partial_{[0]} = \partial_{[1]} = 0$, we get the data set $(\Gamma_{(1)}, \partial_{(1)})$ satisfying the master equation modulo terms in $I^2$ and the nilpotency condition $\partial^2 = 0$ modulo terms in $I^3$.
Assume we have constructed a versal element $\Gamma_{(n)} = \sum_{k=1}^{n} \Gamma_{[k]} \in k[[t]] \otimes g$ and an odd vector field $\partial_{(n)} = \sum_{k=2}^{n} \partial_{[k]}$ on $\mathcal{H}$ such that the equations
\[
P_n : \quad
\begin{aligned}
d\Gamma_{(n)} + \vec{\partial}_{(n)} \Gamma_{(n)} + \tfrac{1}{2}\bigl[\Gamma_{(n)} \bullet \Gamma_{(n)}\bigr] &= 0 \mod I^{n+1},\\
\partial_{(n)}^2 &= 0 \mod I^{n+2}
\end{aligned}
\]
are satisfied.
Let us show that there exist $\Gamma_{[n+1]} \in k[[t]] \otimes g$ and $\partial_{[n+1]} \in H^0(\mathcal{T}_{M_H})$ such that
\[
\Gamma_{(n+1)} = \Gamma_{(n)} + \Gamma_{[n+1]}, \qquad \partial_{(n+1)} = \partial_{(n)} + \partial_{[n+1]}
\]
satisfy the equations $P_{n+1}$. Note that, in the notation of Lemma 4.1.1, one has
\[
\Bigl(d\Gamma_{(n+1)} + \vec{\partial}_{(n+1)} \Gamma_{(n+1)} + \tfrac{1}{2}\bigl[\Gamma_{(n+1)} \bullet \Gamma_{(n+1)}\bigr]\Bigr) \mod I^{n+2}
= d\Gamma_{[n+1]} + \psi_{[n+1]} + \vec{\partial}_{[n+1]} \Gamma_{[1]}.
\]
Let us now define $\vec{\partial}_{[n+1]}$ by setting
\[
\vec{\partial}_{[n+1]} \Gamma_{[1]} := -\phi\bigl(\psi_{[n+1]}\bigr).
\]
As $d\psi_{[n+1]} = 0$ by Lemma 4.1.1 and the second equation of $P_n$, we conclude that
\[
\psi_{[n+1]} + \vec{\partial}_{[n+1]} \Gamma_{[1]} \in (\operatorname{Ker} \phi \cap \operatorname{Ker} d) \otimes k[[t]]_{[n+1]}.
\]
Since φ is a quasi-isomorphism, Ker φ ∩ Ker d = Im d. Hence, there exists $\Gamma_{[n+1]} \in k[[t]] \otimes g$ such that
\[
d\Gamma_{[n+1]} = -\psi_{[n+1]} - \vec{\partial}_{[n+1]} \Gamma_{[1]}.
\]
Thus, the first equation of the system $P_{n+1}$ holds. This implies, by Lemma 4.1.1,
\[
d\psi_{[n+2]} = -\vec{\partial}_{(n+1)}^{\,2}\, \Gamma_{(n+1)} \mod I^{n+3} = -\vec{\partial}_{(n+1)}^{\,2}\, \Gamma_{[1]} \mod I^{n+3}.
\]
Applying φ to both sides of this equation, we get
\[
\vec{\partial}_{(n+1)}^{\,2}\, \phi\bigl(\Gamma_{[1]}\bigr) = 0,
\]
implying the second equation of the system $P_{n+1}$,
\[
\partial_{(n+1)}^2 = 0 \mod I^{n+3},
\]
and thus completing the inductive procedure.
Finally, we note that Ker d + Ker φ = g since φ is a quasi-isomorphism. Hence, we can always adjust $\Gamma_{[n+1]}$, n ≥ 1, so that it lies in Ker φ.
4.1.3. Remarks. (i) The role of ∂ in Chen's construction is to absorb all the obstructions so that constructing a versal solution to the master equation poses no problem (cf. Smoothness Theorem 2.5.6).
(ii) The Chen differential ∂ is completely determined by $\Gamma$. Indeed, the master equations imply
\[
\vec{\partial}\, \phi(\Gamma) = -\tfrac{1}{2}\, \phi\bigl([\Gamma \bullet \Gamma]\bigr).
\]
Decomposing,
\[
\phi(\Gamma) = \sum_i f^i(t)\,[e_i],
\]
we note that $f^i(t) = t^i \mod I^2$. Hence, the functions $f^i(t)$ define a coordinate system on $\mathcal{H}$ and the values $\partial f^i(t)$ completely determine the differential ∂. In particular, if $\Gamma$ is φ-normalized, that is, if $\phi(\Gamma_{[n \ge 2]}) = 0$, then ∂ can be computed by the formula
\[
\vec{\partial} \sum_{i=1}^{p+q} t^i [e_i] = -\tfrac{1}{2}\, \phi\bigl([\Gamma \bullet \Gamma]\bigr).
\]
(iii) We shall understand from now on a versal solution, $\Gamma$, of the master equations and the associated Chen differential ∂ as, respectively, global sections of the sheaves $g \otimes \mathcal{O}_{\mathcal{H}}$ and $\mathcal{T}_{\mathcal{H}}$ on $\mathcal{H}$. (In practical terms, this essentially fixes their transformation properties under arbitrary changes of coordinates on the cohomology supermanifold.) We sometimes call $\Gamma$ a master function.
(iv) The argument in (ii) also downplays the role of the quasi-isomorphism φ used in the Chen construction. If $\Gamma$ is normalized with respect to a quasi-isomorphism φ : (g, d) → (H, 0), then, for any other quasi-isomorphism φ′, the same $\Gamma$ can be viewed as φ′-normalized, but in a new coordinate system $t'^i = f^i(t^j)$ given by $\phi'(\Gamma) = \sum_i f^i(t^j)[e_i]$. Thus, varying the quasi-isomorphism φ used in the construction of $\Gamma$ amounts to varying the flat structure on the pointed supermanifold $\mathcal{H}$.
(v) Chen actually invented his differential ∂ in the context of differential associative algebras (see [C]). Its Lie algebra analogue, Theorem 4.1, is due to Hain [H].
4.2. Gauge equivalence. Let us consider the following action, called a gauge transformation, of $g_{\tilde 1} \otimes I$ on $g \otimes k[[t]]$:
\[
(g_{\tilde 1} \otimes I) \times (g \otimes k[[t]]) \longrightarrow g \otimes k[[t]], \qquad
g \otimes \Gamma \longmapsto \Gamma^g := e^{\mathrm{ad}_g} \Gamma - \frac{e^{\mathrm{ad}_g} - 1}{\mathrm{ad}_g}\,\bigl(d + \vec{\partial}\bigr) g.
\]
Lemma 4.2.1. If $\Gamma \in g \otimes k[[t]]$ is a master function, then, for any $g \in g_{\tilde 1} \otimes I$, the function $\Gamma^g$ is also a master function, and both share the same Chen differential.
Proof. We have to show that the equation $d\Gamma + \vec{\partial}\,\Gamma + (1/2)[\Gamma \bullet \Gamma] = 0$ implies
\[
d\Gamma^g + \vec{\partial}\,\Gamma^g + \tfrac{1}{2}\bigl[\Gamma^g \bullet \Gamma^g\bigr] = 0.
\]
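This follows immediately from the well-known formulae (see [GM]),
\[
e^{\mathrm{ad}_g}\, d\, e^{-\mathrm{ad}_g} = d - \mathrm{ad}_{((e^{\mathrm{ad}_g} - 1)/\mathrm{ad}_g)\, dg}, \qquad
e^{\mathrm{ad}_g}\, \vec{\partial}\, e^{-\mathrm{ad}_g} = \vec{\partial} - \mathrm{ad}_{((e^{\mathrm{ad}_g} - 1)/\mathrm{ad}_g)\, \vec{\partial} g}, \qquad
e^{\mathrm{ad}_g}\, \mathrm{ad}_{\Gamma}\, e^{-\mathrm{ad}_g} = \mathrm{ad}_{e^{\mathrm{ad}_g} \Gamma},
\]
and
\[
e^{\mathrm{ad}_g} \bigl[(\cdots) \bullet (\cdots)\bigr] = \bigl[e^{\mathrm{ad}_g} (\cdots) \bullet e^{\mathrm{ad}_g} (\cdots)\bigr].
\]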
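Two of the identities just quoted, the conjugation rule for ad and the fact that the exponential of ad g respects the bracket, can be checked exactly in a toy model. The sketch below is only an ungraded illustration under assumptions made here: it replaces the dLie-algebra by 3 × 3 rational matrices with the ordinary commutator, takes g strictly upper triangular so that the exponential series terminates, and ignores d and ∂ altogether. The helper names are hypothetical.

```python
from fractions import Fraction

N = 3  # matrix size; g is strictly upper triangular, so ad_g is nilpotent

def mat(rows):
    """Build an N x N matrix with exact rational entries."""
    return [[Fraction(v) for v in row] for row in rows]

def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(N)] for i in range(N)]

def scale(c, a):
    return [[c * a[i][j] for j in range(N)] for i in range(N)]

def mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]

def bracket(a, b):
    """Ordinary commutator [a, b] = ab - ba."""
    return add(mul(a, b), scale(-1, mul(b, a)))

def exp_ad(g, x):
    """exp(ad_g) x = sum over k of ad_g^k(x) / k!.

    For strictly upper triangular g one has g^3 = 0, hence ad_g^5 = 0,
    so summing the terms k = 0, ..., 2N - 1 gives the exact value.
    """
    result = x
    term = x
    for k in range(1, 2 * N):
        term = scale(Fraction(1, k), bracket(g, term))
        result = add(result, term)
    return result

g = mat([[0, 1, 2], [0, 0, 3], [0, 0, 0]])
x = mat([[1, 0, 2], [0, -1, 1], [3, 0, 0]])
y = mat([[0, 2, 0], [1, 0, -1], [0, 4, 1]])
minus_g = scale(-1, g)

# exp(ad_g) is an automorphism of the bracket.
print(exp_ad(g, bracket(x, y)) == bracket(exp_ad(g, x), exp_ad(g, y)))
# exp(ad_g) ad_x exp(-ad_g) = ad_{exp(ad_g) x}, applied to y.
print(exp_ad(g, bracket(x, exp_ad(minus_g, y))) == bracket(exp_ad(g, x), y))
```

Exact rational arithmetic is used so that the comparisons are genuine equalities rather than floating-point approximations.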
Theorem 4.2.2. Let g be a differential Lie algebra. For any two master functions on Ᏼ, and , there is a gauge function g ∈ (Ᏼ, g ⊗ I ) and a diffeomorphism f : (Ᏼ, 0) → (Ᏼ, 0) such that = f ∗ ( g ) and ∂ = f∗ (∂ ). A sketch of the proof. Let us fix a quasi-isomorphism φ : (g, d) → (H, 0) of complexes of Abelian groups, and a coordinate system on Ᏼ in which is φ-normalized. We have [1] = [1] − dg[1] , for some g[1] ∈ (Ᏼ, g ⊗ I ), and ∂[1] = ∂[1] = 0. Hence, = g[1] mod I 2 and there is a unique diffeomorphism f1 : Ᏼ → Ᏼ such that the master function := f1∗ ( g[1] ) is φ-normalized and [1] = [1] . Hence, −−−−−−−→ d [2] − [2] + ∂[2] − ∂[2] [1] = 0, implying ∂[2] = ∂[2] and d([2] − [2] ) = 0. Since φ([2] − [2] ) = 0 and φ is a quasi-isomorphism, there exists g[2] ∈ g ⊗ I 2 such that [2] − [2] = dg[2] . Hence, g = [2] = f1∗ g(2)
mod I 3 .
Continuing by induction and using Lemma 4.2.1, one easily obtains the desired result. Corollary 4.2.3. The Chen’s vector field ∂ on Ᏼ is an invariant of g. 4.3. Differential on ᐀Ᏼ . From now on, we fix a dG-algebra g and a master function on Ᏼ. The latter is not defined canonically, though the associated Chen differential ∂ is. We also fix a quasi-isomorphism, φ : (g, d) → (H, 0), of complexes of Abelian groups. This puts no restriction whatsoever on the dG-algebra under consideration. Moreover, our main results will not depend on the particular choices of and φ we have made—these two are no more than working tools. The global vector field ∂ on Ᏼ makes ᐀Ᏼ into a sheaf of complexes with the differential δ : ᐀Ᏼ −→ ᐀Ᏼ ,
X −→ δX := [∂, X],
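where [ , ] stands for the usual commutator of (germs of) vector fields. Indeed,
\[
\delta^2 X = \bigl[\partial, [\partial, X]\bigr] = \tfrac{1}{2}\bigl[[\partial, \partial], X\bigr] = \bigl[\partial^2, X\bigr] = 0,
\]
where we have used the Jacobi identity and the fact that $\partial^2 = 0$.
This, of course, induces a differential on the sheaf of tensor products, $\mathcal{T}_{\mathcal{H}}^{\otimes m} \otimes (\mathcal{T}_{\mathcal{H}}^*)^{\otimes n}$ (and on the associated vector space of global sections), which we denote by the same symbol δ.
4.4. Deformed dG-algebra. It is easy to check that the map
\[
d_\Gamma : k[[t]] \otimes g \longrightarrow k[[t]] \otimes g, \qquad
a \longmapsto d_\Gamma a := da + \vec{\partial} a + [\Gamma \bullet a]
\]
satisfies
(i) $(d_\Gamma)^2 = 0$,
(ii) $d_\Gamma(a \cdot b) = (d_\Gamma a) \cdot b + (-1)^{\tilde a}\, a \cdot d_\Gamma b$,
(iii) $d_\Gamma[a \bullet b] = [d_\Gamma a \bullet b] - (-1)^{\tilde a}\, [a \bullet d_\Gamma b]$,
implying that the data set $(k[[t]] \otimes g, [\bullet], \cdot, d_\Gamma)$ is a dG-algebra.
The differentials $d_\Gamma$ and δ make the sheaf $g \otimes (\mathcal{T}_{\mathcal{H}}^*)^{\otimes k}$ on $\mathcal{H}$ into a sheaf of complexes with the differential, which we denote by $D_\Gamma$. For example, for any germ $D \in g \otimes \mathcal{T}_{\mathcal{H}}^*$ and any germ $X \in \mathcal{T}_{\mathcal{H}}$ over the same point in $\mathcal{H}$,
\[
(D_\Gamma D)(X) := d_\Gamma\bigl(D(X)\bigr) - (-1)^{\tilde D} D(\delta X).
\]
The vector space $\operatorname{Hom}(\mathcal{T}_{\mathcal{H}}^{\otimes k}, g \otimes \mathcal{O}_{\mathcal{H}})$ is also a complex with the differential denoted by the same symbol $D_\Gamma$.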
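4.5. Morphism of sheaves of complexes. The versal solution $\Gamma$ gives rise to a morphism of $\mathcal{O}_{\mathcal{H}}$-modules,
\[
\Upsilon : \mathcal{T}_{\mathcal{H}} \longrightarrow \mathcal{O}_{\mathcal{H}} \otimes g, \qquad
X \longmapsto \Upsilon(X) := \vec{X}\,\Gamma.
\]
It is not hard to check that ϒ is a monomorphism.
Lemma 4.5.1. The element $\Upsilon \in \operatorname{Hom}(\mathcal{T}_{\mathcal{H}}, g \otimes \mathcal{O}_{\mathcal{H}})$ is cyclic; that is, $D_\Gamma \Upsilon = 0$.
Proof. Applying $X \in \mathcal{T}_{\mathcal{H}}$ to both sides of the equation
\[
d\Gamma + \vec{\partial}\,\Gamma + \tfrac{1}{2}[\Gamma \bullet \Gamma] = 0,
\]
we get
\[
(-1)^{\tilde X}\, d_\Gamma \vec{X}\,\Gamma + \overrightarrow{[X, \partial]}\,\Gamma = 0,
\]
implying $(D_\Gamma \Upsilon)(X) = 0$.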
450
S. A. MERKULOV
− → −→ Corollary 4.5.2. For any X ∈ ᐀Ᏼ , d (X ) = δX . Corollary 4.5.3. For any χ ∈ Hom(⊗k ᐀Ᏼ , ᐀Ᏼ ), one has D (ϒ ◦ χ) = ϒ ◦ (δχ). Proof. We have, using Corollary 4.5.2, −−−−−−−−−−→ −−−−−−−−−−−→ D (ϒ ◦ χ ) X1 , . . . , Xk = d χ X1 , . . . , Xk − (−1)χ˜ χ δX1 , . . . , Xk −−−−−−−−−−−→ ˜ ˜ − · · · − (−1)χ˜ +X1 +···+Xk−1 χ X1 , . . . , δXk −−− −−−−−−−−→ −−−−−−−−−−−→ = δχ X1 , . . . , Xk − (−1)χ˜ χ δX1 , . . . , Xk −−−−−−−−−−−→ ˜ ˜ − · · · − (−1)χ˜ +X1 +···+Xk−1 χ X1 , . . . , δXk = ϒ ◦ δχ X1 , . . . , Xk for arbitrary X1 , . . . , Xk ∈ ᐀Ᏼ . Therefore, the map
ϒ : (᐀Ᏼ , δ) −→ g ⊗ ᏻᏴ , d
is a morphism of sheaves of complexes. Note that the “projection” map s ◦ φ : g ⊗ ᏻᏴ → ᐀Ᏼ satisfies s ◦ φ ◦ ϒ = Id but does not, in general, respect the differentials.
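The step from the identity in the proof of Lemma 4.5.1 to Corollary 4.5.2 is only a sign manipulation; the following short derivation spells it out. It is a sketch under the assumption that the master function is written $\Gamma$, that $[\ ,\ ]$ is the supercommutator of vector fields, and that $\partial$ is odd.

```latex
\[
\begin{aligned}
0 &= (-1)^{\tilde X}\, d_\Gamma\bigl(\overrightarrow{X}\,\Gamma\bigr) + \overrightarrow{[X,\partial]}\,\Gamma
  && \text{(proof of Lemma 4.5.1)} \\
[X,\partial] &= -(-1)^{\tilde X}\,[\partial, X] = -(-1)^{\tilde X}\,\delta X
  && \text{(graded antisymmetry, $\partial$ odd)} \\
\Longrightarrow\quad
d_\Gamma\bigl(\overrightarrow{X}\,\Gamma\bigr) &= \overrightarrow{\delta X}\,\Gamma
  && \text{(Corollary 4.5.2)}
\end{aligned}
\]
```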
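Analogously, one shows that the morphism $\Upsilon \cdot \Upsilon$ defined by the commutative diagram
\[
\Upsilon \cdot \Upsilon :\quad
\mathcal{T}_{\mathcal{H}} \otimes \mathcal{T}_{\mathcal{H}}
\xrightarrow{\ \Upsilon \otimes \Upsilon\ }
\mathfrak{g} \otimes \mathfrak{g} \otimes \mathcal{O}_{\mathcal{H}}
\xrightarrow{\ \cdot\, \otimes\, \mathrm{Id}\ }
\mathfrak{g} \otimes \mathcal{O}_{\mathcal{H}}
\]
defines a cyclic element in $(\mathrm{Hom}(\mathcal{T}_{\mathcal{H}}^{\otimes 2}, \mathfrak{g} \otimes \mathcal{O}_{\mathcal{H}}), D_\Gamma)$. In a similar way, one uses the multiplicative structure in $\mathfrak{g}$ to construct cyclic elements $\Upsilon \cdot \Upsilon \cdot \Upsilon$ and so on (see footnote 6). For future reference, we define a morphism $\Upsilon_{(n)} \in \mathrm{Hom}(\mathcal{T}_{\mathcal{H}}, \mathfrak{g} \otimes \mathcal{O}_{\mathcal{H}})$ by setting
\[ \Upsilon_{(n+1)}(X) := \overrightarrow{X}\,\Gamma_{(n+1)}. \]
Similarly, one defines $\Upsilon_{(n)} \cdot \Upsilon_{(n)}$ and so on.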
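Footnote 6. The cyclicity of $\Upsilon \cdot \Upsilon$, and so on, relies on the Poisson identity holding in $(\mathfrak{g}, [\bullet], \cdot, d)$.

4.6. Multiplicative structure in $\mathcal{T}_{\mathcal{H}}$. We will show in this subsection that, for any dG-algebra $\mathfrak{g}$, the associated tangent sheaf $\mathcal{T}_{\mathcal{H}}$ is always a sheaf of differential associative algebras (defined uniquely up to a homotopy).

Theorem 4.6.1. There exists an even morphism of sheaves, $\mu \in \mathrm{Hom}(\mathcal{T}_{\mathcal{H}}^{\otimes 2}, \mathcal{T}_{\mathcal{H}})$, such that $\delta\mu = 0$ and the diagram
\[
\begin{array}{ccc}
\mathcal{T}_{\mathcal{H}} \otimes \mathcal{T}_{\mathcal{H}} & \xrightarrow{\ \mu\ } & \mathcal{T}_{\mathcal{H}} \\
 & {\scriptstyle \Upsilon \cdot \Upsilon}\ \searrow & \downarrow\ {\scriptstyle \Upsilon} \\
 & & \mathfrak{g} \otimes \mathcal{O}_{\mathcal{H}}
\end{array}
\]
is commutative at the cohomology level; that is, $[\Upsilon \cdot \Upsilon] = [\Upsilon \circ \mu]$ in the cohomology sheaf $\mathrm{Ker}\, D_\Gamma / \mathrm{Im}\, D_\Gamma$ associated with the sheaf of complexes $(\mathrm{Hom}(\mathcal{T}_{\mathcal{H}}^{\otimes 2}, \mathfrak{g} \otimes \mathcal{O}_{\mathcal{H}}), D_\Gamma)$.

Proof. We have to show that there exists $\mu \in \mathrm{Hom}(\mathcal{T}_{\mathcal{H}}^{\otimes 2}, \mathcal{T}_{\mathcal{H}})$ such that
\[ \delta\mu(X, Y) = \mu(\delta X, Y) + (-1)^{\tilde X}\, \mu(X, \delta Y) \tag{5} \]
and
\[ \overrightarrow{X}\,\Gamma \cdot \overrightarrow{Y}\,\Gamma = \overrightarrow{\mu(X, Y)}\,\Gamma + D_\Gamma A(X, Y) \tag{6} \]
for some $A \in \mathrm{Hom}(\mathcal{T}_{\mathcal{H}}^{\otimes 2}, \mathfrak{g} \otimes \mathcal{O}_{\mathcal{H}})$ and any $X, Y \in \mathcal{T}_{\mathcal{H}}$. We shall proceed by induction and assume, without loss of generality, that the vector fields $X$ and $Y$ are constant; that is, $\nabla X = \nabla Y = 0$. Equations (5) and (6) can obviously be satisfied modulo $I$: just set
\[ \mu_{[0]}(X, Y) := \phi\bigl(\overrightarrow{X}\,\Gamma_{[1]} \cdot \overrightarrow{Y}\,\Gamma_{[1]}\bigr). \]
Indeed,
\[ \overrightarrow{X}\,\Gamma_{[1]} \cdot \overrightarrow{Y}\,\Gamma_{[1]} - \overrightarrow{\mu_{[0]}(X, Y)}\,\Gamma_{[1]} \in \mathrm{Ker}\,\phi \cap \mathrm{Ker}\, d \]
and, hence, this expression is $d$-exact. Denote it by $dA_{[0]}(X, Y)$. (We can always normalize $A_{[0]}$ so that it lies in $\mathrm{Hom}(\mathcal{T}_{\mathcal{H}}^{\otimes 2}, \mathrm{Ker}\,\phi \otimes \mathcal{O}_{\mathcal{H}})$.) This solves (6) modulo $I$. Equation (5) modulo $I$ is trivial (recall that $\partial_{[<2]} = 0$).

Assume now that we have constructed $\mu_{(n)}$ and $A_{(n)}$ so that the equations
\[ \delta_{(n)}\mu_{(n-1)}(X, Y) = \mu_{(n-1)}(\delta_{(n)} X, Y) + (-1)^{\tilde X}\, \mu_{(n-1)}(X, \delta_{(n)} Y) \mod I^{n+1}, \tag{7} \]
\[ \overrightarrow{X}\,\Gamma_{(n+1)} \cdot \overrightarrow{Y}\,\Gamma_{(n+1)} = \overrightarrow{\mu_{(n)}(X, Y)}\,\Gamma_{(n+1)} + D_{\Gamma_{(n)}} A_{(n)}(X, Y) \mod I^{n+1} \tag{8} \]
hold.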
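The theorem will be proved if we find $\mu_{[n+1]}$ and $A_{[n+1]}$ satisfying
\[ \delta_{(n+1)}\bigl(\mu_{(n)}(X, Y) + \mu_{[n+1]}(X, Y)\bigr) = \mu_{(n)}(\delta_{(n+1)} X, Y) + (-1)^{\tilde X}\, \mu_{(n)}(X, \delta_{(n+1)} Y) \mod I^{n+2} \]
and
\[ \overrightarrow{X}\,\Gamma_{(n+2)} \cdot \overrightarrow{Y}\,\Gamma_{(n+2)} - \overrightarrow{\mu_{(n)}(X, Y)}\,\Gamma_{(n+2)} - \overrightarrow{\mu_{[n+1]}(X, Y)}\,\Gamma_{[1]} - D_{\Gamma_{(n+1)}} A_{(n)}(X, Y) = dA_{[n+1]}(X, Y) \mod I^{n+2}. \]
Defining
\[ \mu_{[n+1]}(X, Y) := \phi\Bigl(\overrightarrow{X}\,\Gamma_{(n+2)} \cdot \overrightarrow{Y}\,\Gamma_{(n+2)} - \overrightarrow{\mu_{(n)}(X, Y)}\,\Gamma_{(n+2)} - D_{\Gamma_{(n+1)}} A_{(n)}(X, Y)\Bigr), \]
we ensure that the morphism
\[ \lambda_{[n+1]}(X, Y) := \bigl(\Upsilon_{(n+1)} \cdot \Upsilon_{(n+1)} - \Upsilon_{(n+1)} \circ \mu_{(n)} - D_{\Gamma_{(n+1)}} A_{(n)} - \Upsilon_{(0)} \circ \mu_{[n+1]}\bigr)(X, Y) \mod I^{n+2} \]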
takes values in the sheaf in Ker φ ⊗ ᏻᏴ . Since it vanishes modulo I n+1 , we have, modulo I n+2 , dλ[n+1] (X, Y ) = D(n+1) λ[n+1] (X, Y ) ϒ(n+1) ◦ µ(n) (X, Y ) = − D(n+1) −−−−−−−→ −−−−−−−→ −−−−−−−→ −−−− ˜ −−−− µ(n) (X, Y ) + µ(n) δ(n+1) X, Y + (−1)X µ(n) X, δ(n+1) Y = −dn+1 − −−−−−−−−−−−− → −−−−−−−−−−−−−−−−−− −−−−−−−−−−−−−−− ˜ = − δ(n+1) µ(n) (X, Y ) − µ(n) δ(n+1) X, Y − (−1)X µ(n) X, δ(n+1) Y , where we have used Corollary 4.5.2 and the fact that D (ϒ · ϒ) = 0. Applying φ to the last equation, we get ˜ δ(n+1) µ(n) (X, Y ) = µ(n) δ(n+1) X, Y + (−1)X µ(n) X, δ(n+1) Y
mod I n+2 ,
and hence, dλ[n+1] (X, Y ) = 0. Since Ker φ ∩ Ker d ⊂ Im d, there exists A[n+1] (X, Y ) (which can be chosen to lie in
Ker φ ⊗ ᏻᏴ ) such that λ[n+1] (X, Y ) = dA[n+1] (X, Y ). This completes the inductive procedure and, hence, the proof of the theorem. Definition 4.6.2. An even morphism of sheaves, µ ∈ Hom(⊗2 ᐀Ᏼ , ᐀Ᏼ ), satisfying the conditions of Theorem 4.6.1 is called induced. The associated data set (᐀Ᏼ , δ, µ) is called a sheaf of induced differential algebras. Clearly, an induced product on ᐀Ᏼ is supercommutative if the product · in g is supercommutative. 4.7. (Non)Uniqueness. How unique is the product µ induced on the tangent sheaf
᐀Ᏼ by Theorem 4.6.1? When is it associative? To address these questions we shall
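need the following technical result.

Lemma 4.7.1. If $\tau \in \mathrm{Hom}(\otimes^k \mathcal{T}_{\mathcal{H}}, \mathcal{T}_{\mathcal{H}})$ and $B \in \mathrm{Hom}(\otimes^k \mathcal{T}_{\mathcal{H}}, \mathfrak{g} \otimes \mathcal{O}_{\mathcal{H}})$ satisfy the equation $\Upsilon \circ \tau = D_\Gamma B$, then there exist $\chi \in \mathrm{Hom}(\otimes^k \mathcal{T}_{\mathcal{H}}, \mathcal{T}_{\mathcal{H}})$ and $C \in \mathrm{Hom}(\otimes^k \mathcal{T}_{\mathcal{H}}, \mathrm{Ker}\,\phi \otimes \mathcal{O}_{\mathcal{H}})$ such that
(1) $B = \Upsilon \circ \chi + D_\Gamma C$;
(2) $\tau = \delta\chi$; that is,
\[ \tau(X_1, \dots, X_k) = \delta\bigl(\chi(X_1, \dots, X_k)\bigr) - (-1)^{\tilde\chi}\, \chi(\delta X_1, \dots, X_k) - \cdots - (-1)^{\tilde\chi + \tilde X_1 + \cdots + \tilde X_{k-1}}\, \chi(X_1, \dots, \delta X_k) \]
for any $X_1, \dots, X_k \in \mathcal{T}_{\mathcal{H}}$.

Proof. Without loss of generality, we may assume that (germs of) vector fields $X_1, \dots, X_k$ are constant. In view of Corollary 4.5.3 and injectivity of the map $\Upsilon$, it is enough to show that the equation
\[ \overrightarrow{\tau(X_1, \dots, X_k)}\,\Gamma = D_\Gamma B(X_1, \dots, X_k) \tag{9} \]
implies
\[ B(X_1, \dots, X_k) = \overrightarrow{\chi(X_1, \dots, X_k)}\,\Gamma + D_\Gamma C(X_1, \dots, X_k) \]
for some $\chi \in \mathrm{Hom}(\otimes^k \mathcal{T}_{\mathcal{H}}, \mathcal{T}_{\mathcal{H}})$ and $C \in \mathrm{Hom}(\otimes^k \mathcal{T}_{\mathcal{H}}, \mathrm{Ker}\,\phi \otimes \mathcal{O}_{\mathcal{H}})$. We shall proceed by induction. Equation (9) modulo $I$ is
\[ \overrightarrow{\tau_{[0]}(X_1, \dots, X_k)}\,\Gamma_{[1]} = dB_{[0]}(X_1, \dots, X_k). \]
Hence, $\tau_{[0]} = 0$ and $dB_{[0]} = 0$. Set
\[ \chi_{[0]}(X_1, \dots, X_k) := \phi\bigl(B_{[0]}(X_1, \dots, X_k)\bigr). \]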
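Then $B_{[0]}(X_1, \dots, X_k) - \overrightarrow{\chi_{[0]}(X_1, \dots, X_k)}\,\Gamma_{[1]}$ lies in $(\mathrm{Ker}\,\phi \cap \mathrm{Ker}\, d) \otimes \mathcal{O}_{\mathcal{H}}$ and, hence, equals $dC_{[0]}(X_1, \dots, X_k)$ for some $C_{[0]} \in \mathrm{Hom}(\otimes^k \mathcal{T}_{\mathcal{H}}, \mathrm{Ker}\,\phi \otimes \mathcal{O}_{\mathcal{H}})$.

Assume that $\chi_{(n)}$ and $C_{(n)}$ are constructed so that the equation
\[ B_{(n)}(X_1, \dots, X_k) = \overrightarrow{\chi_{(n)}(X_1, \dots, X_k)}\,\Gamma_{(n+1)} + D_{\Gamma_{(n)}} C_{(n)}(X_1, \dots, X_k) \mod I^{n+1} \]
holds. Let us show that there exist $\chi_{[n+1]}$ and $C_{[n+1]}$ satisfying
\[
\begin{aligned}
B_{(n+1)}(X_1, \dots, X_k) &= \overrightarrow{\chi_{(n)}(X_1, \dots, X_k)}\,\Gamma_{(n+2)} + \overrightarrow{\chi_{[n+1]}(X_1, \dots, X_k)}\,\Gamma_{[1]} \\
&\quad + D_{\Gamma_{(n+1)}} C_{(n)}(X_1, \dots, X_k) + dC_{[n+1]}(X_1, \dots, X_k) \mod I^{n+2}
\end{aligned}
\]
or, equivalently, satisfying
\[ dC_{[n+1]}(X_1, \dots, X_k) = \psi_{[n+1]}(X_1, \dots, X_k) - \overrightarrow{\chi_{[n+1]}(X_1, \dots, X_k)}\,\Gamma_{[1]} \mod I^{n+2}, \]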
where we have set ψ[n+1] X1 , . . . , Xk := B(n+1) X1 , . . . , Xk −−−−−−−−−−−→ C(n) X1 , . . . , Xk . − χ(n) X1 , . . . , Xk (n+2) − D(n+1) Since ψ[n+1] (X1 , . . . , Xk ) vanishes modulo I n+1 , this is a monomial of degree n + 1 and t i , and hence, modulo I n+2 , dψ[n+1] X1 , . . . , Xk = D(n+1) ψ[n+1] X1 , . . . , Xk B(n+1) X1 , . . . , Xk − D(n+1) ϒ(n+1) ◦ χ(n) X1 , . . . , Xk = D(n+1) −−−−−−−− −−−−−−−−− −−−−−−−−−−− −−−−− − → = τ(n+1) X1 , . . . , Xk − δ(n+1) χ(n) X1 , . . . , Xk . Applying φ to both sides of these equations, we get τ(n+1) X1 , . . . , Xk = δ(n+1) χ(n) X1 , . . . , Xk and, hence,
dψ[n+1] X1 , . . . , Xk = 0
mod I (n+1) .
mod I n+1
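We define $\chi_{[n+1]}$ by
\[ \chi_{[n+1]}(X_1, \dots, X_k) := \phi\bigl(\psi_{[n+1]}(X_1, \dots, X_k)\bigr). \]
Then $\psi_{[n+1]}(X_1, \dots, X_k) - \overrightarrow{\chi_{[n+1]}(X_1, \dots, X_k)}\,\Gamma_{[1]} \in \mathrm{Ker}\, d \cap \mathrm{Ker}\,\phi \subset \mathrm{Im}\, d$. This proves the existence of $C_{[n+1]}$ and, hence, completes the proof of the lemma.

If $\mu, \mu' \in \mathrm{Hom}(\otimes^2 \mathcal{T}_{\mathcal{H}}, \mathcal{T}_{\mathcal{H}})$ are two products as in Theorem 4.6.1, then $\Upsilon \circ (\mu - \mu') = D_\Gamma B$ for some $B \in \mathrm{Hom}(\otimes^2 \mathcal{T}_{\mathcal{H}}, \mathfrak{g} \otimes \mathcal{O}_{\mathcal{H}})$, and hence, by Lemma 4.7.1, $\mu - \mu' = \delta\chi$ for some odd $\chi \in \mathrm{Hom}(\otimes^2 \mathcal{T}_{\mathcal{H}}, \mathcal{T}_{\mathcal{H}})$; that is, $\mu$ and $\mu'$ are what is called homotopy equivalent. On the other hand, if $\mu$ is a product with the properties stated by Theorem 4.6.1, then, for any odd $\chi \in \mathrm{Hom}(\otimes^2 \mathcal{T}_{\mathcal{H}}, \mathcal{T}_{\mathcal{H}})$, the product $\mu' := \mu + \delta\chi$ also enjoys the properties of Theorem 4.6.1. Indeed, by Corollary 4.5.3,
\[ \Upsilon \circ \mu' = \Upsilon \circ \mu + \Upsilon \circ (\delta\chi) = \Upsilon \circ \mu + D_\Gamma(\Upsilon \circ \chi), \]
and hence,
\[ [\Upsilon \cdot \Upsilon] = [\Upsilon \circ \mu] = [\Upsilon \circ \mu'] \]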
in the cohomology sheaf Ker D / Im D . Thus the set of products µ induced on ᐀Ᏼ by Theorem 4.6.1 is a principal homogeneous space over the Abelian group δ Hom1˜ (⊗2 ᐀Ᏼ , ᐀Ᏼ ). Hence, all induced products on each stalk of ᐀Ᏼ combine into a single homotopy class that we call induced. Theorem 4.7.2. The sheaf ᐀Ᏼ is canonically a sheaf of induced homotopy classes of differential algebras. Proof. We have to show that the homotopy class of products induced on Ᏼ is an invariant of the dG-algebra under consideration; that is, it is independent of the choice of a quasi-isomorphism φ and of the choice of a master function used in its construction. In view of Remark 4.1.3(iv), it is enough to check the invariance of the product under the gauge transformations −→ g := eadg −
eadg − 1 d + ∂# g, adg
g ∈ Ᏼ, g1˜ ⊗ Ᏼ .
A straightforward analysis of the basic equation (5) shows that gauge transformation changes the tensor A, Ag = eadg A(X, Y ) − G · ϒ − ϒ · G + G · D G + G ◦ µ , where G ∈ Hom(᐀Ᏼ , g ⊗ ᏻᏴ ) is given by ˜
G(X) := (−1)X
− eadg − 1 → d + ∂# X g adg
but leaves the product invariant, µg = µ. 4.8. Identity in g ⇒ identity in ᐀Ᏼ . If the dG-algebra under consideration, g, has the identity e0 , and if the versal solution is appropriately normalized (see Remark 3.1.1), then 1 −−→ −−→ # δ(e) = [∂, e] = ∂e0 + e# d + [ • ] = 0 + de0 + [e0 • ] = 0, 2 so that δ(e) = 0. We shall show next that the induced homotopy class of differential algebras on each stalk of ᐀Ᏼ contains a canonical subclass of unital differential algebras. Theorem 4.8.1. If g has the identity eo , then ᐀Ᏼ is canonically a sheaf of induced homotopy classes of differential algebras with the identity e. A sketch of the proof. It is enough to show that there exists a δ-closed element, µ ∈ Homo˜ (⊗2 ᐀Ᏼ , ᐀Ᏼ ), such that, for arbitrary (constant) X, Y ∈ ᐀Ᏼ , the equation −−−−−→ − → − → X · Y = µ(X, Y ) + D A (X, Y ) ⊗2 , g ⊗ ᏻᏴ ) satisfying A(X, e) = X and A(e, Y ) = Y holds for some A ∈ Hom(᐀Ᏼ (cf. (6)). Recall (see the proof of Theorem 4.6.1) that at the lowest order we have − → − → µ[0] (X, Y ) = φ X [1] · Y [1] ,
−−−−−−−→ − → − → dA[0] (X, Y ) = X [1] · Y [1] − µ[0] (X, Y ) [1] , and hence, µ[0] (X, e) = µ[0] (e, X) = X and dA[0] (X, e) = dA[0] (e, X) = 0. We claim that A[0] can be chosen to satisfy A[0] (X, e) = A[0] (e, X) = 0. This can be achieved by a replacement, − → A[0] (X, Y ) −→ A[0] (X, Y ) := A[0] (X, Y ) − A[0] (X, e) · Y ˜− ˜− → → − → − (−1)X X · A[0] (e, Y ) + (−1)X X · A[0] (e, e) Y ,
which satisfies dA[0] (X, Y ) = dA[0] (X, Y ),
A[0] (X, e) = A[0] (e, X) = 0.
This observation allows us to include in the inductive procedure of the proof of Theorem 4.6.1 the additional assumptions µ(n) (X, e) = µ(n) (e, X) = X,
A(n) (X, e) = A(n) (e, X) = 0
and to show, by exactly the same argument as in the case n = 0 above, that they hold true for n + 1. Thus, there does exist a product µ from the induced homotopy class satisfying µ(X, e) = µ(e, X) = X. It is defined uniquely up to a transformation µ −→ µ + δχ, with χ satisfying χ (X, e) = χ(e, X) = 0 for arbitrary X ∈ ᐀Ᏼ . Thus what is well defined is the induced homotopy class of unital differential algebras. Theorem 4.9. For any (unital) dG-algebra g, the tangent sheaf ᐀Ᏼ to its cohomology supermanifold is canonically a sheaf of homotopy classes of (unital) A∞ algebras with (i) µ1 = [∂, . . . ], ∂ being the Chen’s vector field, and (ii) the homotopy class of µ2 being the induced homotopy class as in Theorem 4.6.1. A sketch of the proof. By Theorem 4.6.1, there exists a product µ2 ∈ Hom(⊗2 ᐀Ᏼ , ᐀Ᏼ ) satisfying the equation −−−−−−−−→ −→ −→ µ2 (X1 , X2 ) = X1 · X2 + D A2 (X1 , X2 ) for some odd A2 ∈ Hom(⊗2 ᐀Ᏼ , g ⊗ ᏻᏴ ) and arbitrary X1 , X2 ∈ ᐀Ᏼ . We have, in the notation of subsection 3.2.3, −−−−−−−−−−−→ −−−−−−−−→ −−−−−−−−−−−−−−−−−−− −−− 03 X1 , X2 , X3 = µ2 X1 , µ2 (X2 , X3 ) − µ2 µ2 (X1 , X2 ), X3 −→ −→ −→ = X1 · X2 · X3 + D A2 X1 , µ2 (X2 , X4 ) −→ −→ −→ + ϒ(X1 ) · D A2 (X2 , X3 ) − X1 · X2 · X3 − D A2 µ2 (X1 , X2 ), X3 − D A2 (X1 , X2 ) · ϒ(X3 ) = D B3 X 1 , X 2 , X 3 , where ˜ B3 X1 , X2 , X3 := (−1)X1 ϒ(X1 ) · A2 (X2 , X3 ) − A2 (X1 , X2 ) · ϒ(X3 ) + A2 X1 , µ2 (X2 , X3 ) − A2 µ2 (X1 , X2 ), X3 .
Here we used associativity of the dot product in g, D -closedness of ϒ, and δclosedness of µ2 . By Lemma 4.7.1, there exists µ3 ∈ Homo˜ (⊗3 ᐀Ᏼ , ᐀Ᏼ ) such that the third order associativity condition, 03 = δµ3 , is satisfied, and −−−−−−−−→ −−− µ3 X1 , X2 , X3 = B3 X1 , X2 , X3 + D A3 X1 , X2 , X3 for some A3 ∈ Homo˜ (⊗3 ᐀Ᏼ , g ⊗ ᏻᏴ ). Exactly the same procedure constructs inductively all the higher order products µn ∈ Homn˜ (⊗n ᐀Ᏼ , ᐀Ᏼ ) satisfying the higher order associativity conditions. Step 1. Assume that we have constructed µk ∈ Homk˜ (⊗k ᐀Ᏼ , ᐀Ᏼ ) and Ak ∈ k Homk+ ˜ 1˜ (⊗ ᐀Ᏼ , g ⊗ ᏻᏴ ), k = 2, . . . , n − 1, such that 0k = δµk (kth order associativity condition) and −−−−−−−−→ −−− µk X 1 , . . . , X k ˜ ˜ = (−1)(j +1)(X1 +···+Xi )+i+1 Ai X1 , . . . , Xi · Ai Xi+1 , . . . , Xk i+j =k
+
i−1 i+j =k+1 i≥2 j ≥2
(−1)r Ai X1 , . . . , Xl , µj Xl+1 , . . . , Xl+j , Xl+j +1 , . . . , Xk
l=0
+ D Ak X 1 , . . . , X k =: Bk X1 , . . . , Xk + D Ak X1 , . . . , Xk , ˜ j˜ + 1. ˜ j˜ − 1) + (i˜ − 1) where we have set A1 := ϒ and r = j˜(X˜1 + · · · X˜1 ) + l( Step 2. Use the above expressions for ϒ ◦ µk , k = 2, . . . , n − 1, to show that −−−−−−−−→ −−− 0n X1 , . . . , Xn = D Bn X1 , . . . , Xn . Step 3. Apply Lemma 3.6.1 to conclude that there exists µn such that 0n = δµn (nth order associativity condition) and ϒ ◦ µn = Bn + D An for some An ∈ k Homn+ ˜ 1˜ (⊗ ᐀Ᏼ , g ⊗ ᏻᏴ ). Finally, we note that at each stage of the above construction the nth product µn is defined only up to a δ-exact term, δfn . These arbitrary terms combine all together into a homotopy of the A∞ -structure (᐀Ᏼ , µ∗ ). Corollary 4.9.1. The cohomology sheaf on Ᏼ, ᏴH :=
Ker δ , Im δ
is canonically a sheaf of induced (unital) associative algebras.
Corollary 4.9.2. The tangent sheaf, $\mathcal{T}_{\mathcal{M}_{\mathrm{smooth}}}$, to the smooth part of the extended Kuranishi space is canonically a sheaf of induced (unital) associative algebras.

4.9.3. The Euler field. If the dG-algebra $\mathfrak{g}$ under consideration is $\mathbb{Z}$-graded, then the cohomology $H$ and, hence, its dual $H^*$ are also $\mathbb{Z}$-graded. We make $k[[t]]$ into a $\mathbb{Z}$-graded ring by setting $k[[t]] = \odot^{\bullet} H^*[2]$. This also induces a $\mathbb{Z}$-grading in the sheaves $\mathcal{O}_{\mathcal{H}}$ and $\mathcal{T}_{\mathcal{H}}$ on the supermanifold $\mathcal{H}$. If $\{[e_i]\}$ is a basis in $H$ and $\{t^i\}$ are the associated linear coordinates on $\mathcal{H}$ as in Section 2.2, then
\[ |t^i| = 2 - |[e_i]|. \]
With this choice of $\mathbb{Z}$-grading on $\mathcal{O}_{\mathcal{H}}$, we ensure that $|\Gamma| = 2$, and hence, $|\partial| = 1$, $|\delta| = 1$, and $|\mu_n| = n$ for all the induced higher order products on $\mathcal{T}_{\mathcal{H}}$.

The Euler field on $M$ is, by definition, the derivation $E$ of $k[[t]]$ given by
\[ Ef = \tfrac{1}{2}\,|f|\,f, \qquad \forall f \in k[[t]]. \]
In coordinates,
\[ E = \frac{1}{2} \sum_i |t^i|\, t^i\, \frac{\partial}{\partial t^i}. \]
This vector field generates the scaling symmetry on $(\mathcal{H}, \mu_*)$ (cf. [BK], [Ma1]). If we decompose
\[ \mu_n\Bigl(\frac{\partial}{\partial t^{i_1}}, \dots, \frac{\partial}{\partial t^{i_n}}\Bigr) = \sum_k \mu^k_{i_1 \cdots i_n}(t)\, \frac{\partial}{\partial t^k}, \]
then, as follows from the explicit construction of $\mu_n$ given in the proof of Theorem 4.9,
\[ E\mu^k_{i_1 \cdots i_n} = \frac{1}{2}\bigl(|t^k| - |t^{i_1}| - \cdots - |t^{i_n}| + n\bigr)\, \mu^k_{i_1 \cdots i_n}. \]
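The grading bookkeeping of Section 4.9.3 can be checked mechanically on monomials. The sketch below is only an illustration: the basis degrees `[0, 2, 3]` are a hypothetical choice, not data from the paper, and the helper names are ours. It computes $|t^i| = 2 - |[e_i]|$, applies $E = \frac12 \sum_i |t^i| t^i \partial/\partial t^i$ to a monomial, and evaluates the weight $\frac12(|t^k| - |t^{i_1}| - \cdots - |t^{i_n}| + n)$ of a structure function.

```python
from fractions import Fraction

def coordinate_degrees(basis_degrees):
    """Degrees of the linear coordinates: |t^i| = 2 - |[e_i]|."""
    return [2 - d for d in basis_degrees]

def monomial_degree(exponents, t_degrees):
    """Degree of the monomial prod_i (t^i)^{a_i}."""
    return sum(a * d for a, d in zip(exponents, t_degrees))

def euler_eigenvalue(exponents, t_degrees):
    """Eigenvalue of E = (1/2) sum_i |t^i| t^i d/dt^i on a monomial.

    t^i d/dt^i multiplies a monomial by its exponent a_i, so the
    eigenvalue is (1/2) sum_i |t^i| a_i.
    """
    return sum(Fraction(d, 2) * a for a, d in zip(exponents, t_degrees))

def structure_function_weight(k, indices, t_degrees):
    """Weight (1/2)(|t^k| - sum_j |t^{i_j}| + n) of mu^k_{i_1...i_n}."""
    n = len(indices)
    return Fraction(t_degrees[k] - sum(t_degrees[i] for i in indices) + n, 2)

# Hypothetical basis degrees |[e_i]|, chosen only for illustration.
t_deg = coordinate_degrees([0, 2, 3])          # [2, 0, -1]
eig = euler_eigenvalue((1, 3, 1), t_deg)       # monomial t^0 (t^1)^3 t^2
weight = structure_function_weight(0, (1, 2), t_deg)
```

The eigenvalue returned for a monomial agrees with $\frac12|f|$, which is the defining property $Ef = \frac12|f|f$.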
Note also that in the presence of identity, [e, E] = e. 4.9.4. The perturbative proof of Theorem A. The required statement follows immediately from the graded commutative version of Theorem 4.9 and Section 4.9.3. 4.9.5. A generalization to G∞ -algebras. In the perturbative construction of the F∞ -functor for dG-algebras, the odd Poisson identity was used in a few places. For example, in the construction of µ2 , the only place where we relied on it was the cyclicity of ϒ · ϒ, D (ϒ · ϒ) = 0.
However, a glance at the basic equation (6) (and its higher order analogues in Section 4.9) shows that the perturbative argument stands if the cyclicity (and its analogues) holds only up to a homotopy. Therefore, the generalization from dG-algebras to G∞-algebras is straightforward, affecting only auxiliary tensors $A_n$.

Acknowledgments. This work was done during my visit to the Max Planck Institute for Mathematics (MPIM) in Bonn. Excellent working conditions in the MPIM are gratefully acknowledged. I would like to thank Yu. I. Manin for many stimulating discussions and A. A. Voronov for valuable communications.

References

[Ba] S. Barannikov, Generalized periods and mirror symmetry in dimensions n > 3, preprint, http://xxx.lanl.gov/abs/math/9903124.
[BK] S. Barannikov and M. Kontsevich, Frobenius manifolds and formality of Lie algebras of polyvector fields, Internat. Math. Res. Notices 1998, 201–215.
[C] K. T. Chen, Iterated path integrals, Bull. Amer. Math. Soc. 83 (1977), 831–879.
[GV] M. Gerstenhaber and A. Voronov, Homotopy G-algebras and moduli space operad, Internat. Math. Res. Notices 1995, 141–153.
[GJ] E. Getzler and J. D. S. Jones, Operads, homotopy algebra and iterated integrals for double loop spaces, preprint, http://xxx.lanl.gov/abs/hep-th/9403055.
[GM] W. M. Goldman and J. J. Millson, The homotopy invariance of the Kuranishi space, Illinois J. Math. 34 (1990), 337–367.
[H] R. Hain, Twisting cochains and duality between minimal algebras and minimal Lie algebras, Trans. Amer. Math. Soc. 277 (1983), 397–411.
[HM] C. Hertling and Y. Manin, Weak Frobenius manifolds, Internat. Math. Res. Notices 1999, 277–286.
[K] K. Kodaira, Complex Manifolds and Deformations of Complex Structures, Grundlehren Math. Wiss. 283, Springer, New York, 1986.
[Ko1] M. Kontsevich, "Homological algebra of mirror symmetry" in Proceedings of the International Congress of Mathematicians (Zürich, 1994), Vol. 1, Birkhäuser, Basel, 1995, 120–139.
[Ko2] M. Kontsevich, Operads and motives in deformation quantization, Lett. Math. Phys. 48 (1999), 35–72.
[Ko3] M. Kontsevich, Deformation quantization of Poisson manifolds, I, preprint, http://xxx.lanl.gov/abs/q-alg/9709040.
[Ku] M. Kuranishi, Deformations of Complex Manifolds, Sém. Math. Sup. 39, Presses Univ. Montréal, Montréal, 1971.
[Ma1] Yu. I. Manin, Frobenius Manifolds, Quantum Cohomology, and Moduli Spaces, Amer. Math. Soc. Colloq. Publ. 47, Amer. Math. Soc., Providence, 1999.
[Ma2] Yu. I. Manin, Three constructions of Frobenius manifolds: A comparative study, Asian J. Math. 3 (1999), 179–220.
[MS] J. E. McClure and J. H. Smith, A solution of Deligne's conjecture, preprint, http://xxx.lanl.gov/abs/math/9910126.
[Me1] S. A. Merkulov, Formality of canonical symplectic complexes and Frobenius manifolds, Internat. Math. Res. Notices 1998, 727–733.
[Me2] S. A. Merkulov, The extended moduli space of special Lagrangian submanifolds, Comm. Math. Phys. 209 (2000), 13–27.
[S] J. D. Stasheff, Homotopy associativity of H-spaces, II, Trans. Amer. Math. Soc. 108 (1963), 293–312.
[Ta1] D. Tamarkin, Another proof of Kontsevich formality theorem, preprint, http://xxx.lanl.gov/abs/math/9803025.
[Ta2] D. Tamarkin, Formality of chain operad of small squares, preprint, http://xxx.lanl.gov/abs/math/9809164.
[TT] D. Tamarkin and B. L. Tsygan, Noncommutative differential calculus, homotopy BV algebras and formality conjectures, preprint, http://xxx.lanl.gov/abs/math/0002116.
[Ti] G. Tian, "Smoothness of the universal deformation space of compact Calabi-Yau manifolds and its Petersson-Weil metric" in Mathematical Aspects of String Theory (San Diego, 1986), World Sci., Singapore, 1987, 629–646.
[To] A. Todorov, The Weil-Petersson geometry of the moduli space of SU(n) (n ≥ 3) (Calabi-Yau) manifolds, I, Comm. Math. Phys. 126 (1989), 325–346.
[V] A. A. Voronov, Homotopy Gerstenhaber algebras, preprint, http://xxx.lanl.gov/abs/math/9908040.
Max-Planck-Institut für Mathematik, Vivatsgasse 7, D-53111, Bonn, Germany Current: Department of Mathematics, University of Glasgow, Glasgow, G12 8QW, United Kingdom; [email protected]
Vol. 105, No. 3
DUKE MATHEMATICAL JOURNAL
© 2000
SCHRÖDINGER OPERATORS WITH DECAYING POTENTIALS: SOME COUNTEREXAMPLES CHRISTIAN REMLING
1. Introduction. In this paper, we are interested in one-dimensional Schrödinger equations,
\[ -y''(x) + V(x)\,y(x) = E\,y(x), \tag{1} \]
with potentials $V$ bounded by a decaying power:
\[ |V(x)| \le \frac{C}{(1 + x)^{\alpha}} \qquad (\alpha > 0). \tag{2} \]
More specifically, we study the spectral properties of the associated selfadjoint operators $H_\beta = -d^2/dx^2 + V(x)$, acting on the Hilbert space $L^2(0, \infty)$. The index $\beta \in [0, \pi)$ refers to the boundary condition $y(0)\cos\beta + y'(0)\sin\beta = 0$. For the general theory, see, for example, [29]. These questions are of relevance in quantum mechanics; for more background information on this topic, refer to, for example, [18].

Since $V$ tends to zero as $x \to \infty$, it follows that the essential spectrum satisfies $\sigma_{\mathrm{ess}} = [0, \infty)$. (This is a classical result going back to Weyl.) However, this does not say much about the physics of the corresponding system since it does not give information on the type of the spectrum on $(0, \infty)$ (absolutely continuous, singular continuous, or point spectrum). There has been some progress on this question recently, and several new positive results have been obtained. In this paper, we launch the counterattack: by constructing suitable examples, we show that these results are in fact optimal.

We are interested in situations where $\sigma_{\mathrm{ac}} = [0, \infty)$ with possibly also some embedded singular spectrum on $(0, \infty)$. The corresponding range of exponents in (2) is $1/2 < \alpha \le 1$. If $\alpha > 1$ or if only $V(x) = o(x^{-1})$, then the spectrum is purely absolutely continuous on $(0, \infty)$. See [20] for the proof of this under the weaker assumption $V(x) = o(x^{-1})$; if $\alpha > 1$ is assumed, the result is classical and easy to prove. On the other hand, there are examples of (random) potentials $V(x) = O(x^{-1/2})$ with purely singular spectrum (see [6] and [24]), so there need not be any absolutely continuous spectrum if $\alpha \le 1/2$. That $\alpha > 1/2$ does imply presence of absolutely continuous spectrum was first shown in [1], [3], and [20] (see also [10] and [11] for earlier work in this direction). Deift and Killip [4] then proved the most satisfactory result that it already suffices to assume $V \in L^2$.

Received 11 May 1999. 2000 Mathematics Subject Classification. Primary 34L40, 81Q10; Secondary 34L20, 81Q20. Author's work supported by Heisenberg program of the Deutsche Forschungsgemeinschaft.

Actually, in the case where $V$ satisfies (2), one can say still more on the structure of the spectrum. To formulate this precisely, we first require a definition. The solutions of (1) are said to be of Wentzel-Kramers-Brillouin (WKB) form if there is a solution $y(x, E)$ satisfying the asymptotic formula
\[ \begin{pmatrix} y(x, E) \\ y'(x, E) \end{pmatrix} = \begin{pmatrix} 1 \\ i\sqrt{E} \end{pmatrix} \exp\Bigl( i \int_0^x \sqrt{E - V(t)}\; dt \Bigr) + o(1) \qquad (x \to \infty). \tag{3} \]
Note that in this case, we have control over all solutions of (1), because $\overline{y}$ is a linearly independent second solution. The so-called WKB methods give solutions satisfying (3), provided $V$ tends to zero and does not oscillate too much (see, for instance, [8]). Of course, this latter assumption need not be satisfied if we suppose only (2), but it turns out that we still have WKB asymptotics for "most" energies $E$. Namely, let $S$ denote the exceptional set; that is,
\[ S = \{ E > 0 : \text{No solution of (1) satisfies (3)} \}. \tag{4} \]
Then we have the following bound on the Hausdorff dimension of S. Theorem 1.1 [22]. Suppose (2) holds. Then dim S ≤ 2(1 − α). This strengthens the result from [1] and [20] that S is of Lebesgue measure zero if α > 1/2. Note that σac = [0, ∞), in turn, follows from |S| = 0 by the general results of [25] and [27]. In fact, we get the stronger statement that (0, ∞) is an essential support of the absolutely continuous part of the spectral measure. It is not, in general, possible to exclude embedded singular spectrum (cf. [16], [26], and [28]), but the bound on dim S leads to the following restrictions. Theorem 1.2 [22]. Suppose (2) holds. Then (a) the singular continuous part of the spectral measure is supported by a set of dimension less than or equal to 2(1 − α); (b) there is an exceptional set B ⊂ [0, π) of boundary conditions β with dim B ≤ 2(1−α), so that the spectrum of Hβ is purely absolutely continuous on (0, ∞) if β ∈ [0, π ) \ B. Part (a) is an immediate corollary to Theorem 1.1, because S itself always supports the singular part of the spectral measure on (0, ∞). For the proof of part (b) (which is based on methods developed in [5]), see again [22]. In all these statements, the bound 2(1−α) gives the correct values in the borderline cases α = 1/2 and α = 1, and it looks natural. So the obvious guess is that it is optimal. Our first construction addresses this issue; it may be summarized as follows.
Theorem 1.3. Suppose $\alpha > 2/3$. Then there is a potential $V$ so that (2) holds and $\dim S = 2(1 - \alpha)$.

In other words, Theorem 1.1 is sharp, and, in fact, the bound $2(1 - \alpha)$ is attained. A variation on the construction behind Theorem 1.3 gives an analogous statement (namely, Theorem 7.1), which shows that Theorem 1.2(b) is optimal, too (again, if $\alpha \ge 2/3$). On the other hand, no such claim can be made as to Theorem 1.2(a). The situation is even worse: It is an open question if there are potentials satisfying (2) with $\alpha > 1/2$ and a nonempty singular continuous spectrum. In fact, examples with a singular continuous spectrum embedded in the absolutely continuous spectrum were constructed only very recently in [15] and [21] (based in part on ideas from [12]), using so-called sparse potentials. In these examples, the essential support of the absolutely continuous part of the spectral measure does not have full measure in $\sigma_{\mathrm{ac}}$, in contrast to the situation under consideration here. That makes it possible to deduce existence of singular continuous spectrum by indirect arguments that are not available here.

The assumption $\alpha > 2/3$ in Theorem 1.3 is of course of a technical character and thus somewhat annoying. On the other hand, we try to argue below that at least the method used here cannot work for $1/2 < \alpha \le 2/3$.

The borderline case $\alpha = 1$ is particularly subtle. We are now interested in the set
\[ P = \{ E > 0 : \text{(1) has an } L^2\text{-solution} \}. \]
This is the set of positive energies that are eigenvalues for some boundary condition $\beta$. Clearly, $P \subset S$, so $\dim P = 0$ by Theorem 1.1. Kiselev, Last, and Simon showed that much more is true.

Theorem 1.4 [12]. Suppose (2) holds with $\alpha = 1$. Then $P$ is countable and $\sum_{E \in P} E < \infty$.

We prove that Theorem 1.4 is sharp in the sense that there are no further asymptotic conditions on the energies from the set $P$.

Theorem 1.5. For any sequence $e_n > 0$ with $\sum e_n < \infty$, there are energies $E_n \ge e_n$ and a potential $V(x) = O((1 + x)^{-1})$ so that $P \supset \{E_n\}$.

This answers a question asked in [12]. Earlier, Simon constructed potentials with the property that the point spectrum contains an arbitrary prescribed sequence $E_n > 0$ with $\sum E_n^{1/2} < \infty$ (see [26]). One can use a slight variant of this construction and combine it with an additional argument to get examples with the following properties.

Corollary 1.6. There are potentials $V(x) = O((1 + x)^{-1})$ with $\sum_{E \in P} E^p = \infty$ for every $p < 1$.

Of course, this is weaker than Theorem 1.5. We have stated it separately, because the proof of the Corollary based on [26] is elegant and different from the construction we use to prove Theorem 1.5.
The potentials we construct here are obtained by pasting together carefully chosen periodic pieces. Everything relies on two nice features of periodic potentials: first of all, they are good at localizing particles even at high energies, and second, there are efficient tools for an accurate analysis in this high energy regime. In fact, I used similar ideas already in [19] in a somewhat different context. It is also important that certain “inverse” problems can be solved. Namely, it is possible to prescribe lower bounds on the gap lengths and on the Lyapunov exponent. We attack these problems by reducing them to a similar problem on the Fourier coefficients of bounded functions, and the solution to this problem can be found in the literature (see Theorem 5.1). Here is a more detailed overview on what is done in this paper. The proof of Theorem 1.3 is given in Sections 2–6. In Section 2, we collect some simple preliminary observations. Then we interrupt the formal proof; in Section 3, we give a heuristic discussion in order to explain and motivate what happens in the following sections. We also try to show here that the assumption α > 2/3 is inevitable with the method used. Section 3 does not contain rigorous arguments and may thus be skipped by the confident reader. In Section 4, we prove asymptotic estimates on the trace of the transfer matrix (also known as the Lyapunov function) for periodic potentials. Although this subject is classical, there are several new features in the analysis we give. We deal with different potentials simultaneously, and we use recent results from [1] on multilinear operators to study more carefully than usual the remainder terms for potentials that are not smooth. What we state in Theorem 4.2 is what is needed later, but perhaps the ideas used in the proof are also of some independent interest. The main part of the proof of Theorem 1.3 is then presented in Sections 5 and 6. 
We construct the exceptional set $S$ and discuss its properties in Section 5. Finally, the analysis of the solutions of (1) for $E \in S$ given in Section 6 concludes the proof of Theorem 1.3. Section 7 contains the modification of this construction relevant to Theorem 1.2(b). Theorem 1.5 can now also be proved with the ideas developed in the preceding sections. This is done in Section 8. In Section 9, we present the independent proof of Corollary 1.6. Finally, in the appendix, we sketch the proof of the result of [1] that we use in Section 4. We do this partly because this result is rather recent and partly because the treatment of [1] simplifies considerably in the special case needed here.

Acknowledgment. I would like to thank Sasha Kiselev for most useful notes on the proof of Theorem 4.1.

2. Preliminaries. We work with wavenumbers $k = \sqrt{E}$ instead of energies $E$. Of course, the transformed set $\{k > 0 : k^2 \in S\}$ has the same Hausdorff dimension as $S$. Slightly abusing notation, we denote this set also by $S$. Let $y(x, k)$ be the solution of (1) with $E = k^2$ with the initial values $y(0, k) = 1$, $y'(0, k) = 0$ (say). It is convenient to use modified Prüfer variables $R(x, k) > 0$, $\varphi(x, k)$ defined by the relations $y = R \sin\varphi$, $y' = kR\cos\varphi$ and by demanding that $\varphi$ be continuous in $x$. The transfer matrix $T(x, t; k)$ is the $(2 \times 2)$-matrix that takes solution vectors $\bigl(y(s, k),\ y'(s, k)/k\bigr)^t$ from $s = t$ to $s = x$. In particular,
\[ R(x, k) = R(t, k)\, \bigl\| T(x, t; k)\, e_{\varphi(t, k)} \bigr\|, \qquad e_\varphi \equiv \begin{pmatrix} \sin\varphi \\ \cos\varphi \end{pmatrix}. \]
Also, if $u$, $v$ solve (1) and $u(t, k) = v'(t, k) = 1$, $u'(t, k) = v(t, k) = 0$, then
\[ T(x, t; k) = \begin{pmatrix} u(x, k) & k\,v(x, k) \\ u'(x, k)/k & v'(x, k) \end{pmatrix}. \tag{5} \]
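As a quick sanity check on (5) and on the Prüfer variables, consider the free case $V = 0$, where $u = \cos k(x - t)$ and $v = \sin k(x - t)/k$, so that (5) becomes a rotation matrix. The sketch below (helper names are ours, chosen for illustration) applies this matrix to $R\,e_\varphi$ and recovers the new Prüfer variables; the radius should stay constant and the angle should advance by $k(x - t)$.

```python
import math

def free_transfer_matrix(k, length):
    """Matrix (5) for V = 0: u = cos(k s), v = sin(k s)/k over s = length."""
    c, s = math.cos(k * length), math.sin(k * length)
    return ((c, s), (-s, c))

def prufer_step(T, R, phi):
    """Apply T to the solution vector R*e_phi = (y, y'/k) and read off (R, phi).

    Since y = R sin(phi) and y'/k = R cos(phi), the angle is atan2(y, y'/k),
    reduced to [0, 2*pi).
    """
    y0, z0 = R * math.sin(phi), R * math.cos(phi)
    y1 = T[0][0] * y0 + T[0][1] * z0
    z1 = T[1][0] * y0 + T[1][1] * z0
    return math.hypot(y1, z1), math.atan2(y1, z1) % (2 * math.pi)

k, length = 1.7, 0.9
R1, phi1 = prufer_step(free_transfer_matrix(k, length), 2.0, 0.4)
```

Here `R1` equals the initial radius and `phi1` equals `0.4 + k * length` up to rounding.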
If $V = 0$ on an interval $(a, b)$, then the evolution of $R$, $\varphi$ is very simple: $R(x, k) \equiv R(a, k)$ and $\varphi(x, k) = \varphi(a, k) + k(x - a)$ for all $x \in (a, b)$.

As already mentioned in the introduction, the potential we are after is built out of periodic pieces. The $n$th piece has period $g_n^{-1}$ and size $\sim c_n g_n^2$, where $g_n > 0$ is small. The number of periods is equal to $L_n \in \mathbb{N}$. We also need intervals of length $\Delta_n$ with zero potential between the blocks in order to randomize the phases $\varphi$. So, $V$ is of the following form: put $a_1 = 0$ and $b_n = a_n + L_n g_n^{-1}$, $a_{n+1} = b_n + \Delta_n$, where $L_n \in \mathbb{N}$, $\Delta_n \in [0, \pi/2]$, and define
\[ V(x) = \begin{cases} c_n W_{g_n}(x - a_n), & x \in (a_n, b_n), \\ 0, & x \in [b_n, a_{n+1}]. \end{cases} \]
Here, $W_g$ denotes the rescaled function $W_g(x) = g^2 W(gx)$, and the basic potential $W$ (as well as the other parameters) is chosen later. The following elementary observation is crucial to our construction. Namely, the rescaling $W \to W_g$ has the same effect as replacing $k$ by $k/g$. More precisely, write $T_W$ for the transfer matrix associated with $-y'' + Wy = k^2 y$. Then we have the following lemma.

Lemma 2.1. We have $T_{W_g}(1/g, 0; k) = T_W(1, 0; k/g)$.

Sketch of the proof. Introduce the new variable $t = gx$ in the Schrödinger equation and use (5). (See also [19, Lemma 3.1].)

Of course, we need a few simple facts about periodic potentials; a more comprehensive treatment can be found in [7]. In the sequel, $T(k) = T_W(1, 0; k)$ is shorthand for the transfer matrix over one period of some (unspecified, but fixed) potential $W$ with period 1. It suffices to study this object since $T_W(N, 0; k) = T(k)^N$ by periodicity. Then, since $\det T(k) = 1$, the eigenvalues of $T(k)$ can be computed from its trace $D(k) \equiv \operatorname{tr} T(k) = u(1, k) + v'(1, k)$ ($u$, $v$ solve $-y'' + Wy = k^2 y$, and $u(0) = v'(0) = 1$, $u'(0) = v(0) = 0$). Namely, the eigenvalues are $\lambda$, $\lambda^{-1}$, where
\[ \lambda(k) = \frac{D(k)}{2} \pm \sqrt{\frac{D(k)^2}{4} - 1}. \tag{6} \]
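Lemma 2.1 and formula (6) are easy to test numerically. The following sketch integrates $-y'' + Wy = k^2 y$ with a classical fourth-order Runge-Kutta scheme and assembles the matrix (5). The test potential $W(x) = \cos 2\pi x$, the values of $g$ and $k$, and the step count are arbitrary illustrative choices, not taken from the paper.

```python
import cmath
import math

def transfer_matrix(W, k, length, steps=2000):
    """Approximate T(length, 0; k) from (5) for -y'' + W y = k^2 y (RK4)."""
    h = length / steps

    def rhs(x, y, dy):
        # First-order form: (y, y')' = (y', (W(x) - k^2) y).
        return dy, (W(x) - k * k) * y

    def solve(y, dy):
        x = 0.0
        for _ in range(steps):
            a1, b1 = rhs(x, y, dy)
            a2, b2 = rhs(x + h / 2, y + h / 2 * a1, dy + h / 2 * b1)
            a3, b3 = rhs(x + h / 2, y + h / 2 * a2, dy + h / 2 * b2)
            a4, b4 = rhs(x + h, y + h * a3, dy + h * b3)
            y += h / 6 * (a1 + 2 * a2 + 2 * a3 + a4)
            dy += h / 6 * (b1 + 2 * b2 + 2 * b3 + b4)
            x += h
        return y, dy

    u, du = solve(1.0, 0.0)  # u(0) = 1, u'(0) = 0
    v, dv = solve(0.0, 1.0)  # v(0) = 0, v'(0) = 1
    return ((u, k * v), (du / k, dv))

def multiplier(D):
    """Eigenvalue from (6), with the sign chosen so that |lambda| >= 1."""
    root = cmath.sqrt(D * D / 4 - 1)
    return max((D / 2 + root, D / 2 - root), key=abs)

def W(x):
    return math.cos(2 * math.pi * x)  # illustrative period-1 potential

def rescaled(W, g):
    return lambda x: g * g * W(g * x)  # W_g(x) = g^2 W(g x)

g, k = 0.25, 0.8
T_scaled = transfer_matrix(rescaled(W, g), k, 1 / g)  # T_{W_g}(1/g, 0; k)
T_plain = transfer_matrix(W, k / g, 1.0)              # T_W(1, 0; k/g)
D = T_plain[0][0] + T_plain[1][1]
lam = multiplier(D)
```

The two matrices agree entrywise up to rounding, the determinant is 1 up to discretization error, and `lam` satisfies $\lambda + \lambda^{-1} = D$, as (6) requires.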
Here, we choose the sign so that $|\lambda| \ge 1$. If $|D| > 2$, then $\lambda$ is real and $|\lambda| > 1$; if $|D| < 2$, then $|\lambda| = 1$, $\lambda \notin \mathbb{R}$. The set $\{E \in \mathbb{R} : |D(E)| \le 2\}$ (this notation is a little sloppy since we have switched from $k$ to $E$, but it is self-explanatory) is a (possibly infinite) union of closed intervals. These intervals are called bands. The intervals where $|D| > 2$ are called gaps. The spectrum of the whole-line operator with potential $W$ is purely absolutely continuous and consists precisely of the bands. If $k$ is in a gap, the Lyapunov exponent is positive. So solutions that are not very close to the decaying solution are exponentially increasing. Here is the precise statement we use.

Lemma 2.2. If $|D(k)| > 2$, there is an angle $\theta(k) \in [0, \pi)$ so that, for all $n \in \mathbb{N}$,
\[ \bigl\| T(k)^n e_\varphi \bigr\| \ge |\lambda(k)|^n \,\bigl| \sin(\theta(k) - \varphi) \bigr|. \tag{7} \]
Here, we use the $\ell^2$-norm on $\mathbb{C}^2$, and, as above, $e_\varphi = (\sin\varphi, \cos\varphi)^t$.

Proof. Fix $k$, and pick $\psi, \theta \in [0, \pi)$ so that $e_\psi$, $e_\theta$ are eigenvectors of $T$ corresponding to the (real and distinct!) eigenvalues $\lambda$ and $\lambda^{-1}$, respectively. Actually, we may assume $\psi = 0$ here without loss of generality. Then $\theta \ne 0$, and a calculation shows that
\[ \| T^n e_\varphi \|^2 = \sin^{-2}\theta \left[ \lambda^{2n} \sin^2(\theta - \varphi) + \lambda^{-2n} \sin^2\varphi + 2\cos\theta \sin(\theta - \varphi)\sin\varphi \right] \]
\[ = \sin^{-2}\theta \left( \lambda^{-n}\sin\varphi + \lambda^{n}\sin(\theta - \varphi)\cos\theta \right)^2 + \lambda^{2n}\sin^2(\theta - \varphi) \ge \lambda^{2n}\sin^2(\theta - \varphi), \]
as claimed.

We conclude this section with an asymptotic estimate of the usual type (compare, e.g., [7], [14], and [17]). This result is used in Section 8, and, incidentally, it also suffices to prove Theorem 1.3 for $\alpha > 4/5$. However, to go further, we need the more accurate analysis of Section 4. Therefore, we omit the proof of the following proposition. We use the notation $\widehat{W}_n = \int_0^1 W(x) e^{2\pi i n x}\,dx$ and $\|W\|_1 = \int_0^1 |W(x)|\,dx$.

Proposition 2.3. There is a constant $C_0 > 0$ such that the following holds. If $W \in L^2(0,1)$, $\int_0^1 W = 0$, $j \ge C_0 \|W\|_1$, $j\,|\widehat{W}_j|^2 \ge C_0 \|W\|_1^3$, and
\[ |k - j\pi| \le \frac{|\widehat{W}_j|}{10j}, \]
then
\[ |\lambda(k)| \ge 1 + \frac{|\widehat{W}_j|}{10j}. \]
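Formula (6) and the lower bound (7) of Lemma 2.2 are easy to sanity-check numerically. The following is only a minimal sketch under stated assumptions: `gap_matrix` is a hypothetical stand-in for $T(k)$ in a gap (any real matrix with determinant 1 whose eigenvectors are $e_0$ and $e_\theta$, as in the normalization $\psi = 0$ of the proof), not a transfer matrix computed from an actual potential, and all function names are illustrative.

```python
import math


def large_eigenvalue(D):
    """Root of x^2 - D*x + 1 = 0 with |x| >= 1, following (6); needs |D| > 2."""
    if abs(D) <= 2:
        raise ValueError("k is not in a gap: |D| <= 2")
    root = math.sqrt(D * D / 4.0 - 1.0)
    return D / 2.0 + root if D > 0 else D / 2.0 - root


def gap_matrix(lam, theta):
    """Hypothetical stand-in for T(k): det = 1, T e_0 = lam e_0, T e_theta = e_theta / lam."""
    s, c = math.sin(theta), math.cos(theta)
    mu = 1.0 / lam
    return ((mu, 0.0), (c * (mu - lam) / s, lam))


def norm_of_iterate(T, phi, n):
    """Euclidean norm of T^n e_phi with e_phi = (sin phi, cos phi)^t."""
    x, y = math.sin(phi), math.cos(phi)
    for _ in range(n):
        x, y = T[0][0] * x + T[0][1] * y, T[1][0] * x + T[1][1] * y
    return math.hypot(x, y)


def lemma_2_2_holds(lam, theta, n_max=12, samples=180):
    """Check (7) on a grid of angles phi in [0, pi) and for n = 0, ..., n_max."""
    T = gap_matrix(lam, theta)
    for i in range(samples):
        phi = math.pi * i / samples
        for n in range(n_max + 1):
            lower = abs(lam) ** n * abs(math.sin(theta - phi))
            if norm_of_iterate(T, phi, n) < lower * (1.0 - 1e-9) - 1e-12:
                return False
    return True
```

For instance, `lemma_2_2_holds(1.3, 0.7)` scans 180 initial angles and 13 powers of the matrix; a grid check of this kind illustrates the inequality but of course does not replace the proof.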
3. A guide to the proof. The set $S$ from (4) is constructed as a Cantor-type set. More precisely, at step $n$, we use some of the gaps of the potential $c_n W_{g_n}$, and we then let $T$ be the intersection over all $n$ of the sets thus obtained. There is definite hope that $S$, as defined in (4), can indeed contain such a set $T$, because, by Lemma 2.2, we expect that the solutions of (1) grow if $E$ is in a gap of $c_n W_{g_n}$ for all $n$. Now, to analyze this situation quantitatively, we have to apply Proposition 2.3 to potentials of the form $c_n g_n^2 W(g_n x)$; by Lemma 2.1 we may as well apply the proposition to $c_n W(x)$ if we also replace $k$ by $k/g_n$. In this way, we see that the gaps are located at the points $k_j = j\pi g_n$; their size is of the order $l_n \approx c_n g_n^2 |\widehat{W}_j|$.

We proceed inductively, and we use only those gaps that are contained in a gap that was used in the preceding step. Let $N_n$ denote the number of new gaps contained in some fixed gap constructed in the $(n-1)$st step, and write $P_n \equiv N_1 \cdots N_n$ for the total number of gaps used in step $n$. Because of the condition $\sum_j |\widehat{W}_j|^2 < \infty$, there are restrictions on the size of $|\widehat{W}_j|$. We pick a $W$ that satisfies $|\widehat{W}_j| \approx P_n^{-1/2}$ for the indices $j$ that are used in the $n$th step. Then $l_n \approx c_n g_n^2 P_n^{-1/2}$. Since adjacent gaps are separated by a distance of order $\approx g_n$, we obtain the relation $l_{n-1} \approx N_n g_n = P_n P_{n-1}^{-1} g_n$. Finally, in order to get a set of dimension $D$, we should have a scaling of the type $P_n l_n^D \approx 1$. Solving these latter three relations for $l_n$, $g_n$, and $c_n$, we obtain
\[ l_n \approx P_n^{-1/D}, \]
\[ g_n \approx P_n^{-1} P_{n-1}^{1-1/D}, \tag{8} \]
\[ c_n \approx P_n^{5/2-1/D} P_{n-1}^{2/D-2}. \tag{9} \]
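The algebra behind (8) and (9) is pure exponent bookkeeping, so it can be replayed with exact rational arithmetic. The sketch below is illustrative only: the function name and the encoding of a quantity $X \approx P_n^{p} P_{n-1}^{q}$ as the pair `(p, q)` are assumptions made for this example, not notation from the text.

```python
from fractions import Fraction


def heuristic_exponents(D):
    """Return exponents (p, q), meaning X ~ P_n^p * P_{n-1}^q, for X = l_n, g_n, c_n."""
    D = Fraction(D)
    l = (-1 / D, Fraction(0))                 # from P_n * l_n^D ~ 1
    l_prev = (Fraction(0), -1 / D)            # the same relation at step n - 1
    # l_{n-1} ~ P_n * P_{n-1}^{-1} * g_n  =>  g_n ~ l_{n-1} * P_n^{-1} * P_{n-1}
    g = (l_prev[0] - 1, l_prev[1] + 1)
    # l_n ~ c_n * g_n^2 * P_n^{-1/2}  =>  c_n ~ l_n * P_n^{1/2} * g_n^{-2}
    c = (l[0] + Fraction(1, 2) - 2 * g[0], l[1] - 2 * g[1])
    return {"l": l, "g": g, "c": c}
```

With $D = 1/2$, for example, this yields $g_n \approx P_n^{-1} P_{n-1}^{-1}$ and $c_n \approx P_n^{1/2} P_{n-1}^{2}$, in agreement with the general exponents $1 - 1/D$, $5/2 - 1/D$, and $2/D - 2$.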
By Proposition 2.3 again, the large eigenvalue $\lambda_n$ of the transfer matrix is of the order $|\lambda_n| \approx 1 + c_n |\widehat{W}_j|/j \approx 1 + c_n g_n P_n^{-1/2}$. By Lemma 2.2, we expect that the solution grows by a factor $\approx |\lambda_n|^{L_n}$ across the $n$th piece. Thus, to get substantial increase, we must take
\[ L_n \approx P_n^{1/2} (c_n g_n)^{-1} \approx \left( \frac{P_n}{P_{n-1}} \right)^{1/D-1}. \]
It remains to estimate the rate of decay of $V$ we get with these parameters. In the $n$th piece, we have that $x \approx \sum_{m=1}^{n} L_m/g_m$. If $L_m/g_m$ increases rapidly, this is $\approx L_n/g_n \approx P_n^{1/D}$. On the other hand, $|V(x)| \le C c_n g_n^2 \approx P_n^{1/2-1/D}$ for these $x$, so $V(x) = O(x^{-\alpha})$ with $\alpha = 1 - D/2$ or $D = 2(1-\alpha)$, as desired.

We made heavy use of asymptotic expansions in these heuristic considerations, so we certainly need that $k/g_n \gg c_n \|W\|_1$ (the basic assumption in these expansions). That is, we need that $c_n g_n \ll 1$, and since $c_n g_n \approx P_n^{3/2-1/D} P_{n-1}^{1/D-1}$, our method can, unfortunately, only work if $D < 2/3$ (or, equivalently, $\alpha > 2/3$).

4. Asymptotic expansions. We need asymptotic formulae more precise than Proposition 2.3. We begin by introducing two maximal functions that are used to control
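the remainders. For $f, g \in L^1(0,1)$, put
\[ M_f(k) = \max_{0 \le b \le 1} \left| \int_0^b dx\, f(x) e^{2\pi i k x} \right|, \]
\[ M_{f,g}(k) = \max_{0 \le b \le 1} \left| \int_0^b dx\, f(x) e^{2\pi i k x} \int_0^x dt\, g(t) e^{-2\pi i k t} \right|. \]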
Then the results of Christ and Kiselev [1] on multilinear operators, specialized to the case at hand, give the following norm bounds.
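Theorem 4.1 [1]. If $p \in (1,2)$ and $1/p + 1/q = 1$, then
\[ \left( \sum_{n=-\infty}^{\infty} M_f(n)^q \right)^{1/q} \le C_p \|f\|_p, \qquad \left( \sum_{n=-\infty}^{\infty} M_{f,g}(n)^{q/2} \right)^{2/q} \le C_p \|f\|_p \|g\|_p. \]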
The proof is sketched in the appendix. Actually, we need only the weak-type bounds that follow from these estimates. These bounds let us prove the following theorem.

Theorem 4.2. There is a constant $C_0 > 0$ such that the following holds. If $W \in L^2(0,1)$, $\int_0^1 W = 0$, $j \ge C_0 \|W\|_1$, and
\[ |k - j\pi| \le \frac{|\widehat{W}_j|}{10j}, \]
then
\[ |D(k)| \ge 2 + \frac{|\widehat{W}_j|^2}{70 j^2} - R(j), \]
where, for every $q > 2$, $R(j)$ satisfies the weak-type estimate
\[ \#\left\{ j \in \mathbb{N} : j \ge C_0 \|W\|_1 \text{ and } R(j) > \lambda \|W\|_1 j^{-3} \right\} \le C_q \lambda^{-q/2} \|W\|_2^q. \]

Remark. Note that the constants $C_0$, $C_q$ are independent of $W$. This is important in the application of Theorem 4.2.

Proof of Theorem 4.2. The following expansion holds:
\[ D(k) = 2\cos k + \sum_{n=2}^{\infty} \frac{D_n(k)}{k^n}, \]
where
\[ D_n(k) = \int_0^1 dt_1\, W(t_1) \int_0^{t_1} dt_2\, W(t_2) \cdots \int_0^{t_{n-1}} dt_n\, W(t_n) \times \sin k(1 + t_n - t_1)\, \sin k(t_1 - t_2) \cdots \sin k(t_{n-1} - t_n). \tag{10} \]
This follows from the corresponding expansions of the solutions $u$, $v$ obtained from iterating the variation of constants formula (see, e.g., [17, Chapter 1]). The details of these routine computations can be left to the reader. Note also that the series for $D$ converges absolutely and uniformly in $k \ge 0$ (say).

We want to treat $\sum_{n \ge 3} k^{-n} D_n$ as a remainder. Of course, there is the obvious bound $(\|W\|_1/k)^3 e^{\|W\|_1/k}$, and, indeed, this estimate lets one prove Proposition 2.3. We give a more careful analysis in this proof. It is useful to rewrite the $D_n$'s using the complex representation of the sine. We get
\[ D_n(k) = (2i)^{-n} \sum_{\sigma_1, \ldots, \sigma_n = \pm 1} \sigma_1 \cdots \sigma_n\, e^{i\sigma_1 k}\, S_n(\sigma) \tag{11} \]
with
\[ S_n(\sigma) \equiv \int_0^1 dt_1\, W(t_1) \int_0^{t_1} dt_2\, W(t_2) \cdots \int_0^{t_{n-1}} dt_n\, W(t_n) \times e^{ik(\sigma_2 - \sigma_1)t_1} e^{ik(\sigma_3 - \sigma_2)t_2} \cdots e^{ik(\sigma_1 - \sigma_n)t_n}. \]
If $\sigma_1 = \cdots = \sigma_n$, then $S_n(\sigma) = (\int_0^1 W)^n/n! = 0$. On the other hand, if not all the $\sigma$'s are equal, we can find $m, l \in \{1, \ldots, n\}$ with $\sigma_{m+1} - \sigma_m = 2$ and $\sigma_{l+1} - \sigma_l = -2$ (here, we have put $\sigma_{n+1} \equiv \sigma_1$). We have
\[ |S_n(\sigma)| \le \int dt_1 |W(t_1)| \cdots \int dt_{m-1} |W(t_{m-1})| \int dt_{m+1} |W(t_{m+1})| \cdots \int dt_{l-1} |W(t_{l-1})| \int dt_{l+1} |W(t_{l+1})| \cdots \int dt_n |W(t_n)| \times \left| \int dt_m\, W(t_m) e^{2ikt_m} \int dt_l\, W(t_l) e^{-2ikt_l} \right|. \]
The integration is over the region $\{(t_1, \ldots, t_n) : 1 \ge t_1 \ge \cdots \ge t_n \ge 0\}$. In particular, in the last integral $t_l$ runs over $[t_{l+1}, t_{l-1}]$ (using the obvious convention $t_0 = 1$, $t_{n+1} = 0$). The structure of the double integral with respect to $(t_m, t_l)$ depends on the distance between the indices $m$ and $l$. We have to distinguish two cases.

If $m - l$ (evaluated modulo $n$) is not equal to $\pm 1$, then the region of integration for the $t_m$ integral is $[t_{m+1}, t_{m-1}]$, and thus the double integral is equal to the product of the two single integrals. Since, clearly,
\[ \max_{0 \le s, t \le 1} \left| \int_s^t dx\, W(x) e^{2ikx} \right| \le 2 M_W\!\left( \frac{k}{\pi} \right), \]
we see that, in the case under consideration,
\[ |S_n(\sigma)| \le 4\, \frac{\|W\|_1^{n-2}}{(n-2)!}\, M_W^2\!\left( \frac{k}{\pi} \right). \]
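On the other hand, if $l = m + 1$, say, then the limits of the double integral are (in self-explanatory notation)
\[ \int_{t_{m+2}}^{t_{m-1}} dt_m \int_{t_{m+2}}^{t_m} dt_l. \]
Since
\[ \int_a^b dt \int_a^t ds = \int_0^b \!\! \int_0^t - \int_0^a \!\! \int_0^t + \int_0^a \!\! \int_0^a - \int_0^b \!\! \int_0^a, \]
we get this time that
\[ |S_n(\sigma)| \le 2\, \frac{\|W\|_1^{n-2}}{(n-2)!} \left( M_W^2\!\left( \frac{k}{\pi} \right) + M_{W,W}\!\left( \frac{k}{\pi} \right) \right). \tag{12} \]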
In either case, it is true that $S_n(\sigma)$ can be estimated by the right-hand side of (12) with 2 replaced by 4. By (11), $|D_n|$ satisfies the same inequality.

Since we want to expand $D_n(k)$ around $k = j\pi$, we also need similar estimates on $dD_n/dk$. We have that
\[ \frac{d}{dk} S_n(\sigma) = i \int_0^1 dt_1\, W(t_1) \int_0^{t_1} dt_2\, W(t_2) \cdots \int_0^{t_{n-1}} dt_n\, W(t_n) \times e^{ik(\sigma_2 - \sigma_1)t_1} e^{ik(\sigma_3 - \sigma_2)t_2} \cdots e^{ik(\sigma_1 - \sigma_n)t_n} \left[ (\sigma_2 - \sigma_1)t_1 + \cdots + (\sigma_1 - \sigma_n)t_n \right]. \]
We pick $m$, $l$ as above, and we treat separately the contributions coming from $2it_m$, $-2it_l$ and those coming from $\sum_{r \ne m,l} (\sigma_{r+1} - \sigma_r)t_r$. For these latter terms, it suffices to note that $|\sum_{r \ne m,l} (\sigma_{r+1} - \sigma_r)t_r| \le 2(n-2)$ and then apply the above estimates again. Dealing with the other two contributions, we also proceed as above, but now new maximal functions enter; for instance, in the case when $l = m + 1$, we use that
\[ \left| \int_{t_{m+2}}^{t_{m-1}} dt_m\, 2it_m W(t_m) e^{2ikt_m} \int_{t_{m+2}}^{t_m} dt_l\, W(t_l) e^{-2ikt_l} \right| \le 4 \left( M_{tW}\!\left( \frac{k}{\pi} \right) M_W\!\left( \frac{k}{\pi} \right) + M_{tW,W}\!\left( \frac{k}{\pi} \right) \right). \]
We omit the details; the final result is an inequality of the form
\[ \left| \frac{dS_n(\sigma)}{dk} \right| \le 8\, \frac{\|W\|_1^{n-2}}{(n-3)!} \left( M_W^2 + M_{tW} M_W + M_{W,W} + M_{tW,W} + M_{W,tW} \right) \]
(where all maximal functions are evaluated at $k/\pi$). Since the additional term coming
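from differentiating $e^{i\sigma_1 k}$ is easily controlled, it now follows from (11) that also
\[ |D_n'(k)| \le 12\, \frac{\|W\|_1^{n-2}}{(n-3)!} \left( M_W^2 + M_{tW} M_W + M_{W,W} + M_{tW,W} + M_{W,tW} \right). \]
For the second derivative, a crude estimate is sufficient. It follows directly from (10) that
\[ \left| \frac{d^2 D_n(k)}{dk^2} \right| \le 4\, \frac{\|W\|_1^{n}}{(n-2)!}. \]
We now write $k = j\pi \pm \delta$ with $0 \le \delta \le \pi/2$ (and, typically, $j$ large). Then our results on $D_n$ from above imply that
\[ |D_n(k)| \le |D_n(j\pi)| + |D_n'(j\pi)|\,\delta + \max |D_n''(k)| \cdot \frac{\delta^2}{2} \]
\[ \le C\, \frac{\|W\|_1^{n-2}}{(n-2)!} \left( M_W^2(j) + M_{W,W}(j) \right) + C\, \frac{\|W\|_1^{n-2}}{(n-3)!} \left( M_W^2(j) + M_{tW}(j) M_W(j) + M_{W,W}(j) + M_{tW,W}(j) + M_{W,tW}(j) \right) \delta + C\, \frac{\|W\|_1^{n}}{(n-2)!}\, \delta^2. \]
Now if $\|W\|_1/j \le 1$, say, then this inequality leads to
\[ \left| \sum_{n=3}^{\infty} \frac{D_n(k)}{k^n} \right| \le C M(j)\, \frac{\|W\|_1}{j^3} + C\, \frac{\|W\|_1^3}{j^3}\, \delta^2, \]
where $M \equiv M_W^2 + M_{tW} M_W + M_{W,W} + M_{tW,W} + M_{W,tW}$. Theorem 4.1 shows that for $q > 2$, $1/p + 1/q = 1$,
\[ \|M\|_{\ell^{q/2}} \le \|M_W\|_{\ell^q}^2 + \|M_{tW}\|_{\ell^q} \|M_W\|_{\ell^q} + \|M_{W,W}\|_{\ell^{q/2}} + \|M_{tW,W}\|_{\ell^{q/2}} + \|M_{W,tW}\|_{\ell^{q/2}} \le C_q \left( \|W\|_p^2 + \|tW\|_p^2 \right) \le 2C_q \|W\|_p^2 \le 2C_q \|W\|_2^2. \]
This implies the weak-type bound
\[ \#\left\{ j \in \mathbb{N} : M(j) > \lambda \right\} \le C_q \lambda^{-q/2} \|W\|_2^q. \]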
Finally, we need to take a closer look at $D_2(k)$. A calculation gives
\[ D_2(k) = \frac{|\widehat{W}(k/\pi)|^2}{4} \cos k + \frac{1}{2} \operatorname{Im}\left( \int_0^1 dt_1\, W(t_1) e^{2ikt_1} \int_0^{t_1} dt_2\, W(t_2) e^{-2ikt_2} \right) \sin k, \]
where $\widehat{W}(s) = \int_0^1 W(x) e^{2\pi i s x}\,dx$. We again use Taylor expansions about $k = j\pi$ to control the second term on the right-hand side; using the fact that $|\sin k| \le \delta$, we get the bound $C M(j)\delta + C\|W\|_1^2 \delta^2$. Therefore, putting everything together, we obtain
\[ D(k) = 2\cos k \left( 1 + \frac{|\widehat{W}(k/\pi)|^2}{8k^2} \right) + O\!\left( M(j)\, \frac{\|W\|_1}{j^3} + M(j)\, \frac{\delta}{j^2} + \frac{\|W\|_1^2 \delta^2}{j^2} \right). \]
The constant implicit in the error term $O(\cdots)$ does not depend on anything. To conclude the proof, we also expand the leading term:
\[ \cos k = (-1)^j \left( 1 - \frac{\delta^2}{2} + O(\delta^4) \right), \qquad \widehat{W}\!\left( \frac{k}{\pi} \right) = \widehat{W}_j + O(\|W\|_1 \delta). \]
Using this and noting that, by hypothesis, $\delta \le |\widehat{W}_j|/(10j)$, it is now routine, if somewhat tedious, to check that the assertion holds if $\|W\|_1/j$ is small enough, that is, if we take $C_0$ sufficiently large. The role of $R(j)$ is taken by a remainder of the order $O(M(j)\|W\|_1/j^3)$.

5. Construction of the exceptional set. We now make the program outlined in Section 3 rigorous. We start by constructing a Cantor-type set (which basically is the set $S$ from (4)) and, simultaneously, the basic periodic potential $W$. So, fix $\alpha \in (2/3, 1)$, and let $D = 2(1-\alpha)$. We want to construct a potential $V$ of the order $V(x) = O(x^{-\alpha})$ so that $\dim S = D$. It is more convenient to work with still another parameter $a$ related to $D$ by
\[ a = \frac{4(1-D)}{2-3D}; \tag{13} \]
notice that $a > 2$. Pick $b \in (1, a)$, and put
\[ g_n = \exp\left( -a^{n+1}\, \frac{2a-3}{(2a-4)(a-1)} \right), \tag{14} \]
\[ c_n = 10 \exp\left( \frac{a^{n+1}}{a-1} - b^{n+1} \right). \tag{15} \]
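The parameter choices (13), (14), and (15) can be explored with a few lines of code. This is a hedged sketch for experimentation only: the function names are illustrative, `D_from_a` is simply (13) solved for $D$ (an elementary rearrangement, not a formula quoted from the text), and logarithms are used because $c_n$ and $g_n^{-1}$ grow doubly exponentially in $n$.

```python
from fractions import Fraction
import math


def a_from_D(D):
    """Parameter a of (13); D must lie in (0, 2/3)."""
    D = Fraction(D)
    return 4 * (1 - D) / (2 - 3 * D)


def D_from_a(a):
    """Inverse of (13), obtained by solving a(2 - 3D) = 4(1 - D) for D."""
    a = Fraction(a)
    return (2 * a - 4) / (3 * a - 4)


def log_g(n, a):
    """ln g_n from (14)."""
    return -(a ** (n + 1)) * (2 * a - 3) / ((2 * a - 4) * (a - 1))


def log_c(n, a, b):
    """ln c_n from (15)."""
    return math.log(10) + a ** (n + 1) / (a - 1) - b ** (n + 1)
```

For $D = 1/2$ one gets $a = 4$. With, say, $b = 2$, the sum `log_c(n, 4.0, 2.0) + log_g(n, 4.0)` equals $\ln 10 - a^{n+1}/((2a-4)(a-1)) - b^{n+1}$ and tends to $-\infty$, which reflects the requirement that $c_n g_n$ be small.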
The motivation for these choices comes from the arguments of Section 3. We make the ansatz $N_n = \exp(a^n)$, so $P_n \approx \exp(a^{n+1}/(a-1))$; then we use (8) and (9) and finally pick $a$ so that $c_n g_n \ll 1$. The use of the rapidly increasing sequence $\exp(a^n)$
is not a whim but is necessary to be able to satisfy this latter condition. Finally, the additional parameter $b$ is introduced for technical reasons; it allows us to slightly improve the decay rate of the $V$ to be constructed. In fact, Theorem 1.3 can be proved with $b = 0$, but in the proof of Theorem 7.1, it is essential that we squeeze out a little more than what Theorem 1.3 states.

In order to get gaps at the right places (by an application of Theorem 4.2), we need to make sure that certain Fourier coefficients of $W$ are large enough. More precisely, we construct a Cantor-type set $\bigcap F_n$, and we simultaneously try to find a function $W$ satisfying the following conditions.

(1) Put $F_0 = [1, 2]$.

(2) The set $F_1$ is a union of intervals of the form
\[ J_j^{(1)} = \left[ j\pi g_1 - \frac{l_1}{2},\; j\pi g_1 + \frac{l_1}{2} \right], \]
where
\[ l_1 = \frac{\pi}{10}\, c_1 g_1^2 \exp\left( -\frac{a^{1+1}}{2(a-1)} \right). \]
(The motivation for this choice becomes clear shortly; see Lemma 5.2.) In fact, we use only those $J_j^{(1)}$ that are contained in $F_0$. For the corresponding indices $j$, we require that
\[ |\widehat{W}_j| \ge \exp\left( -\frac{a^{1+1}}{2(a-1)} \right). \]
Let $F_1$ be the union of the $J_j^{(1)}$'s with $J_j^{(1)} \subset F_0$.

(3) Similarly, $F_2$ is a union of intervals
\[ J_j^{(2)} = \left[ j\pi g_2 - \frac{l_2}{2},\; j\pi g_2 + \frac{l_2}{2} \right] \]
with
\[ l_2 = \frac{\pi}{10}\, c_2 g_2^2 \exp\left( -\frac{a^{2+1}}{2(a-1)} \right). \]
This time, we use only those $j$ for which $J_j^{(2)} \subset F_1$, and we require that then
\[ |\widehat{W}_j| \ge \exp\left( -\frac{a^{2+1}}{2(a-1)} \right). \]
Continuing in this way, we obtain a sequence of sets $F_n \subset F_{n-1}$. Every $F_n$ is a union of intervals $J_j^{(n)}$ of the form
\[ J_j^{(n)} = \left[ j\pi g_n - \frac{l_n}{2},\; j\pi g_n + \frac{l_n}{2} \right], \qquad l_n = \frac{\pi}{10}\, c_n g_n^2 \exp\left( -\frac{a^{n+1}}{2(a-1)} \right). \tag{16} \]
Actually, we cannot be sure at this point that $F_n \ne \emptyset$ for all $n$. However, it is easy to repair this. We start out with a collection of $J_j^{(n_0)} \subset [1, 2]$ instead of $F_0$, where we take $n_0 \in \mathbb{N}$ large enough, and then proceed as described above. We see shortly that then $F_n \ne \emptyset$ for all $n \ge n_0$. In any event, we are also trying to find a function $W$ with the property that
\[ |\widehat{W}_j| \ge \exp\left( -\frac{a^{n+1}}{2(a-1)} \right) \tag{17} \]
whenever $j$ is such that $J_j^{(n)} \subset F_n$. The existence of such a $W$ follows from a result of de Leeuw, Kahane, and Katznelson.

Theorem 5.1. Suppose $w_j \ge 0$, $\sum w_j^2 < \infty$. Then there exists a bounded function $f : [0,1] \to \mathbb{R}$ with $|\widehat{f}_j| \ge w_j$ for all $j$. Moreover, it is possible to choose $f$ so that $\|f\|_\infty \le C (\sum w_j^2)^{1/2}$.

See [9, Section 5.9] for the proof. The bound on $\|f\|_\infty$ is also derived in the proof given there, although it is then not mentioned in the theorem of that section. Incidentally, we need this bound only in Section 8.

Before applying Theorem 5.1, we can simplify things a little by passing to subsets of the $F_n$'s which are, in a sense, more symmetric. To this end, we first observe that every fixed $J_j^{(n-1)} \subset F_{n-1}$ contains at least
\[ \frac{l_{n-1}}{\pi g_n} - 2 = \frac{1}{\pi g_n}\, \frac{\pi}{10}\, c_{n-1} g_{n-1}^2 \exp\left( -\frac{a^{n}}{2(a-1)} \right) - 2 \]
intervals $J_k^{(n)}$. A computation using (14) and (15) shows that this number is equal to $\exp(a^n - b^n) - 2$. In particular, for large $n$, it is certainly larger than $\exp(a^n - b^n)/2$, say. We fix once and for all integers
\[ N_n \in \left[ \tfrac{1}{3} \exp(a^n - b^n),\; \tfrac{1}{2} \exp(a^n - b^n) \right] \cap \mathbb{N} \qquad (n \ge n_0 + 1). \tag{18} \]
By the above remarks, we can now (inductively) pass to subsets $G_n \subset F_n$ ($n \ge n_0$) as follows. Put $G_{n_0} = F_{n_0}$; then $G_{n_0}$ is the union of certain intervals $J_j^{(n_0)}$. Generally, if $G_{n-1}$ is such a union, pick precisely $N_n$ subintervals $J_k^{(n)} \subset J_j^{(n-1)}$ for every interval $J_j^{(n-1)} \subset G_{n-1}$, and let $G_n$ be the union of these $J_k^{(n)}$'s.
Now, if we denote the right-hand side of (17) by $w_n$ and if $N_{n_0}$ is the number of intervals contained in $G_{n_0}$, then
\[ \sum_{n=n_0}^{\infty} \sum_{\{j : J_j^{(n)} \subset G_n\}} w_n^2 = \sum_{n=n_0}^{\infty} N_{n_0} N_{n_0+1} \cdots N_n\, w_n^2 \le C \sum_{n=n_0}^{\infty} 2^{-n} \exp\left( \frac{a^{n+1}}{a-1} - \frac{b^{n+1}}{b-1} \right) w_n^2 = C \sum_{n=n_0}^{\infty} 2^{-n} \exp\left( -\frac{b^{n+1}}{b-1} \right) < \infty. \]
So Theorem 5.1 applies. There is a bounded function $W$ so that, for every $n \ge n_0$, condition (17) holds for all $j \in \mathbb{N}$ with $J_j^{(n)} \subset G_n$. Since we want to apply Theorem 4.2 to (multiples of) $W$ later on, we replace this $W(x)$ by $W(x) - \int_0^1 W$ to make sure that $\int_0^1 W = 0$. Of course, this does not change $\widehat{W}_j$ for $j \ne 0$.

Let $T = \bigcap_{n \ge n_0} G_n$. In the sequel, we assume that $n_0 = 0$ and $G_0 = F_0 = [1, 2]$. This is done in order to simplify the notation; the following arguments are of course also valid in the general case. We summarize the properties of $T$, and we also add an important additional observation.

Lemma 5.2. $G_n$ is a (disjoint) union of precisely $N_1 N_2 \cdots N_n$ intervals $J_j^{(n)}$ of the form
\[ J_j^{(n)} = \left[ j\pi g_n - \frac{l_n}{2},\; j\pi g_n + \frac{l_n}{2} \right]. \]
Every $J_j^{(n)} \subset G_n$ contains exactly $N_{n+1}$ intervals $J_i^{(n+1)} \subset G_{n+1}$. If $J_j^{(n)} \subset G_n$, then (17) holds. Moreover, if $k \in J_j^{(n)} \subset G_n$, then
\[ \left| \frac{k}{g_n} - j\pi \right| \le \frac{c_n |\widehat{W}_j|}{10j}. \]

Proof. It remains to check the last statement. So suppose $k \in J_j^{(n)} \subset G_n$. Then, by (16) and (17),
\[ \left| \frac{k}{g_n} - j\pi \right| \le \frac{l_n}{2g_n} = \frac{\pi}{20}\, c_n g_n \exp\left( -\frac{a^{n+1}}{2(a-1)} \right) \le \frac{\pi}{20}\, c_n g_n |\widehat{W}_j|. \]
Since $j\pi g_n \in G_n \subset F_0 = [1, 2]$, we have $g_n \le 2/(\pi j)$, proving the assertion.

There is a natural Borel (probability) measure $\mu$ supported by $T$. Namely, we define $\mu$ by the following two properties: $\mu(\mathbb{R} \setminus T) = 0$, and for every $n$, $\mu$ puts equal weight
on the subintervals of $G_n$; that is,
\[ \mu\left( J_j^{(n)} \right) = \frac{1}{N_1 N_2 \cdots N_n} \]
for every $J_j^{(n)} \subset G_n$. It is easy to see that $\mu$ is indeed well defined by these requirements. We can now complete our analysis of $T$ by showing that $T$ itself and certain subsets are not too small. More precisely, we have the following lemma.

Lemma 5.3. If $\mu(M) > 0$, then $\dim M \ge D$.

Proof. We prove the following. For all $\gamma < D$, we have that
\[ \lim_{\delta \to 0+}\, \sup_{|I| \le \delta} \frac{\mu(I)}{|I|^\gamma} = 0. \]
Then, by general results on Hausdorff measures (see [23, Section 3.4, Theorem 67]), it follows that $\mu(M) = 0$ for all sets $M$ with $\dim M < D$, and this is exactly what we have to prove.

So fix $\gamma < D$, let $\delta > 0$, and let $I$ be an interval with $|I| \le \delta$. Define $n \in \mathbb{N}$ by $l_n < |I| \le l_{n-1}$. First, we consider the case when $l_n < |I| \le \pi g_n$. Then $I$ intersects at most two of the intervals $J_j^{(n)}$ that build up $G_n$, so
\[ \frac{\mu(I)}{|I|^\gamma} \le \frac{2}{N_1 N_2 \cdots N_n}\, |I|^{-\gamma} \le \frac{2\, l_n^{D-\gamma}}{N_1 N_2 \cdots N_n\, l_n^D}. \]
Next, if $\pi g_n < |I| \le l_{n-1}$, then $I$ intersects at most $|I|/(\pi g_n) + 1 \le 2|I|/(\pi g_n)$ of the intervals $J_j^{(n)}$. Thus
\[ \frac{\mu(I)}{|I|^\gamma} \le \frac{2}{\pi g_n N_1 N_2 \cdots N_n}\, |I|^{1-\gamma} \le \frac{2\, l_{n-1}}{\pi N_n g_n} \cdot \frac{l_{n-1}^{D-\gamma}}{N_1 N_2 \cdots N_{n-1}\, l_{n-1}^D}. \]
By (18), we have that
\[ N_1 \cdots N_n \ge C\, 3^{-n} \exp\left( \frac{a^{n+1}}{a-1} - \frac{b^{n+1}}{b-1} \right), \]
and a computation using (13), (14), (15), and (16) shows that
\[ l_n = \pi \exp\left( -\frac{a^{n+1}}{D(a-1)} - b^{n+1} \right). \]
Therefore, $N_1 \cdots N_n\, l_n^D \ge C_1 \exp(-C_2 b^n)$. Similarly, one sees that $l_{n-1}/(N_n g_n) \le C_1 \exp(C_2 b^n)$. So, since $D - \gamma > 0$ by assumption, we get, in either case, an inequality of the form
\[ \frac{\mu(I)}{|I|^\gamma} \le C_1 \exp\left( C_2 b^n - \epsilon a^n \right) \]
with $\epsilon = \epsilon(D, \gamma) > 0$. Our claim follows, because $n$ goes to infinity as $\delta$ tends to zero.

6. Asymptotics of the solutions. We know from Section 3 that we can expect things to work out provided nothing special happens, but there are two kinds of bad luck we have to be prepared against: first of all, the Prüfer angle $\varphi$ could be close to the exceptional phase $\theta$ from Lemma 2.2, and second, the error term $R(j)$ from Theorem 4.2 could be large.

The first problem is overcome by randomizing the Prüfer angles with the help of the parameters $\Delta_n$. More precisely, we show the following lemma.

Lemma 6.1. Given measurable functions $\theta_n(k)$, it is possible to choose $\Delta_n \in [0, \pi/2]$ in such a way that there is a subset $T_1 \subset T$ of measure $\mu(T_1) \ge 1/2$ with the following property. If $k \in T_1$, then
\[ \left| \varphi(a_n, k) - \theta_n(k) \right| \ge \frac{\pi}{8} \tag{19} \]
for infinitely many $n$.

Remarks. (1) In Lemma 6.1 and its proof, all angles are evaluated modulo $\pi$.
(2) As is probably already clear from the remarks preceding the lemma, $\theta_n$ eventually is the exceptional phase defined by Lemma 2.2. However, the following proof clearly also works for arbitrary (measurable) functions $\theta_n$.
(3) Since $\varphi(a_n, k)$ depends on the values of $V(x)$ for $x \le a_n$, it is clear that we should first fix $c_n$, $g_n$, $L_n$, and $W$, and only then can Lemma 6.1 be used to finally pick the $\Delta_n$'s. (On top of that, the exceptional angle $\theta_n$ from Lemma 2.2 of course depends on $c_n$, $g_n$, and $W$.) Having said that, we do not find it necessary to follow this correct order also in the representation of our arguments, because that would involve making rather unmotivated choices.

Proof of Lemma 6.1. We begin by showing that the $\Delta_n$'s can be (inductively) chosen so that
\[ \mu\left( \left\{ k : \left| \varphi(a_n, k) - \theta_n(k) \right| < \frac{\pi}{8} \right\} \right) \le \frac{1}{2} \tag{20} \]
for every $n$. Since $V = 0$ on $(b_{n-1}, a_n)$, we have $\varphi(a_n, k) = \varphi(b_{n-1}, k) + k\Delta_{n-1}$. So,
if we denote the set defined in (20) by $M(\Delta_{n-1})$, then
\[ \int_0^{\pi/2} \mu\bigl( M(\Delta_{n-1}) \bigr)\, d\Delta_{n-1} = \int d\mu(k) \int_0^{\pi/2} d\Delta_{n-1}\, \chi_{(-\pi/8, \pi/8)}\bigl( \varphi(b_{n-1}, k) - \theta_n(k) + k\Delta_{n-1} \bigr) \le \int d\mu(k)\, \frac{\pi}{4k} \le \frac{\pi}{4}. \]
Hence (20) must indeed hold for some $\Delta_{n-1} \in [0, \pi/2]$.

Now (20) implies the statement of the lemma by an elementary probabilistic argument. Namely, let $\widetilde{T}$ be the set of $k$'s for which (19) holds only for finitely many $n$. Also, for $k \in \widetilde{T}$, let $N(k)$ be the smallest integer with the property that (19) fails for all $n \ge N(k)$. Then the sets $\widetilde{T}_n \equiv \{k \in \widetilde{T} : N(k) \le n\}$ increase to $\widetilde{T}$, thus $\mu(\widetilde{T}) = \lim \mu(\widetilde{T}_n)$. But (20) in particular implies that $\mu(\widetilde{T}_n) \le 1/2$ for every $n$, so the proof is complete.

Next, we show that the error term $R(j)$ from Theorem 4.2 can cause problems only on a set of $\mu$-measure zero. To formulate this precisely, we need some notation. Of course, eventually, we want to get information on the trace of the transfer matrix associated with the potential $c_n W_{g_n}$. By Lemma 2.1, we can instead apply Theorem 4.2 to $c_n W$ if we also replace $k$ by $k/g_n$. We thus denote the trace and the largest eigenvalue (in absolute value) of the transfer matrix associated with $c_n W$ by $D_n$ and $\lambda_n$, respectively. Our goal is to establish the following estimate for $k \in T$:
\[ \left| \lambda_n\!\left( \frac{k}{g_n} \right) \right| \ge 1 + \frac{1}{6}\, c_n g_n \exp\left( -\frac{a^{n+1}}{2(a-1)} \right). \tag{21} \]
Indeed, we have the following lemma.

Lemma 6.2. Let $T_0 = \{k \in T : \text{(21) fails for infinitely many } n \in \mathbb{N}\}$. Then $\mu(T_0) = 0$.

Proof. In view of (6), inequality (21) is implied by
\[ \left| D_n\!\left( \frac{k}{g_n} \right) \right| \ge 2 + \left( \frac{c_n g_n}{6} \right)^2 \exp\left( -\frac{a^{n+1}}{a-1} \right), \tag{22} \]
so it suffices to show that for $\mu$-almost all $k$, this latter condition holds for all but finitely many $n$. Now if $k \in T$, then, for every $n$, there is a unique $j = j(n, k)$ so that $k \in J_j^{(n)}$. Moreover, $1 \le j\pi g_n \le 2$; in particular, $j \ge 1/(\pi g_n)$, and this is bigger than $C_0 c_n \|W\|_1$ for $n$ large enough, because $c_n g_n \to 0$. Using this and invoking
Lemma 5.2, we see that for $n \ge n_0$, Theorem 4.2 applies to $k/g_n$ and $c_n W$. Here, $n_0$ is independent of $k \in T$. Thus
\[ \left| D_n\!\left( \frac{k}{g_n} \right) \right| \ge 2 + \frac{c_n^2 |\widehat{W}_j|^2}{70 j^2} - R_n(j) \ge 2 + \frac{\pi^2}{280}\, c_n^2 g_n^2 \exp\left( -\frac{a^{n+1}}{a-1} \right) - R_n(j). \tag{23} \]
To pass to the second line, we used (17) and the fact that $j \le 2/(\pi g_n)$. The remainder $R_n$ satisfies, for every $q > 2$, the following weak-type estimate. Here, we absorb some irrelevant constants by $C_q$, so $C_q$ below is of course not the same as the one from Theorem 4.2. In particular, it now depends on $W$ (but not on $n$!):
\[ \#\left\{ j \in \mathbb{N} : 1 \le j\pi g_n \le 2 \text{ and } R_n(j) > \lambda c_n g_n^3 \right\} \le C_q \lambda^{-q/2} c_n^q. \tag{24} \]
We can now again reformulate the claim. Namely, let
\[ S_n = \left\{ j \in \mathbb{N} : 1 \le j\pi g_n \le 2 \text{ and } R_n(j) > \delta c_n^2 g_n^2 \exp\left( -\frac{a^{n+1}}{a-1} \right) \right\} \]
with $\delta = \pi^2/280 - 1/36$ (this is positive); then if $k \in J_j^{(n)}$ with $j \notin S_n$, then (23) implies (22). In other words, it suffices to prove that for $\mu$-almost all $k$, there are only finitely many $n$ so that
\[ k \in \bigcup_{j \in S_n} J_j^{(n)}. \]
But, by definition of $\mu$,
\[ \mu\left( \bigcup_{j \in S_n} J_j^{(n)} \right) \le \frac{|S_n|}{N_1 \cdots N_n}, \]
and (24) with
\[ \lambda = \delta\, \frac{c_n}{g_n} \exp\left( -\frac{a^{n+1}}{a-1} \right) \]
says that
\[ |S_n| \le C_q \delta^{-q/2} (c_n g_n)^{q/2} \exp\left( \frac{q}{2}\, \frac{a^{n+1}}{a-1} \right). \]
Using this and inserting the definitions (14), (15), and (18), we finally arrive at
\[ \mu\left( \bigcup_{j \in S_n} J_j^{(n)} \right) \le C_q \exp\left( \frac{a^{n+1}}{a-1} \left( \frac{q}{2}\, \frac{2a-5}{2a-4} - 1 \right) + C b^n \right). \tag{25} \]
We can now take $q = 4a/(2a-1)$, say (obviously, this is bigger than 2, as required). Then the right-hand side of (25) is summable, and hence an application of the Borel-Cantelli lemma completes the proof.

We are now in a position to show that the potential we have constructed (we still have to choose the $L_n$'s, though) has the properties stated in Theorem 1.3. We begin by proving that the set $S$ defined in (4) contains $T_1 \setminus T_0$. By Lemmas 5.3, 6.1, and 6.2, it must then indeed be true that $\dim S \ge D$.

So fix $k \in T_1 \setminus T_0$, and let $n$ be such that (19) and (21) hold. There are infinitely many such $n$'s. We want to show that the solution of (1) grows on the interval $(a_n, b_n)$ (which corresponds to the $n$th step of the construction; please consult Section 2 again for this and other definitions). The Prüfer radius $R$ satisfies
\[ R(b_n, k) = R(a_n, k)\, \left\| T_{c_n W_{g_n}}\!\left( \frac{1}{g_n}, 0; k \right)^{L_n} e_{\varphi(a_n, k)} \right\|, \]
so Lemmas 2.1 and 2.2 together with (19) and (21) now show that
\[ R(b_n, k) \ge R(a_n, k)\, \sin\frac{\pi}{8} \left( 1 + \frac{1}{6}\, c_n g_n \exp\left( -\frac{a^{n+1}}{2(a-1)} \right) \right)^{L_n}, \]
or, taking logarithms,
\[ \ln \frac{R(b_n, k)}{R(a_n, k)} \ge C L_n c_n g_n \exp\left( -\frac{a^{n+1}}{2(a-1)} \right) + \ln \sin(\pi/8). \]
This suggests that we take
\[ L_n = \left[ A\, \frac{1}{c_n g_n} \exp\left( \frac{a^{n+1}}{2(a-1)} \right) \right], \tag{26} \]
where $[x]$ denotes the largest integer less than or equal to $x$. By taking the constant $A$ in this definition large enough, we can achieve that $\ln(R(b_n)/R(a_n)) \ge \ln 2$, say. Hence
\[ \limsup_{n \to \infty} \frac{R(b_n, k)}{R(a_n, k)} \ge 2 \tag{27} \]
for all $k \in T_1 \setminus T_0$. On the other hand, it is easy to see that WKB asymptotics would imply that $R(x, k)$ tends to a positive limit as $x \to \infty$, and this is incompatible with (27). Thus $S \supset T_1 \setminus T_0$, as claimed.
It remains to check that the $V$ constructed above has the right rate of decay, but this is easy. Clearly, if $a_n < x < a_{n+1}$, then $|V(x)| \le C c_n g_n^2$. On the other hand,
\[ a_{n+1} = \sum_{r=1}^{n} \left( \Delta_r + \frac{L_r}{g_r} \right) \le \frac{n\pi}{2} + \sum_{r=1}^{n} \frac{L_r}{g_r} \le C\, \frac{L_n}{g_n}. \]
Here, the last estimate follows from the rapid increase of $L_r g_r^{-1}$. As usual, we plug in the definitions (14), (15), and (26) to obtain
\[ |V(x)| \le C_1 \exp\left( -\frac{a^{n+1}}{a-2} - b^{n+1} \right), \qquad x \le C_2 \exp\left( a^{n+1}\, \frac{3a-4}{(a-1)(2a-4)} + b^{n+1} \right). \tag{28} \]
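The exponent bookkeeping that turns (28) into a power-law decay can be checked with exact rational arithmetic. The sketch below is illustrative only: the function name and the tuple it returns are assumptions made for this example, and the two "rates" are just the coefficients of $a^{n+1}$ in the exponents of (28).

```python
from fractions import Fraction


def decay_exponents(a):
    """Return (D, alpha, v_rate, x_rate) for a given parameter a > 2."""
    a = Fraction(a)
    D = (2 * a - 4) / (3 * a - 4)                    # (13) solved for D
    alpha = 1 - D / 2                                # alpha = 1 - D/2
    v_rate = 1 / (a - 2)                             # |V(x)| <= C1 exp(-v_rate a^{n+1} - b^{n+1})
    x_rate = (3 * a - 4) / ((a - 1) * (2 * a - 4))   # x <= C2 exp(x_rate a^{n+1} + b^{n+1})
    return D, alpha, v_rate, x_rate
```

For every admissible $a$ one finds `alpha * x_rate == v_rate`, which is the identity that lets the factor $\exp(-a^{n+1}/(a-2))$ in (28) be bounded by a constant times $x^{-\alpha} \exp(\alpha b^{n+1})$.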
A calculation using (13) shows that $(2a-2)/(3a-4) = 1 - D/2 = \alpha$. So (28) implies that $|V(x)| \le C x^{-\alpha} \exp((\alpha - 1) b^{n+1})$; in particular, $V(x) = O(x^{-\alpha})$, as required. The proof of Theorem 1.3 is complete.

Using (28) again to estimate $n$, we see that we have in fact proved that
\[ |V(x)| \le C x^{-\alpha} \exp\left( -\delta (\ln x)^{1-\epsilon} \right). \]
Here, $\delta, \epsilon > 0$; moreover, we can get arbitrarily small $\epsilon$ by taking $b$ sufficiently close to $a$. I do not think this improvement of $V(x) = O(x^{-\alpha})$ is particularly interesting. We use Hausdorff dimensions to measure the size of the set $S$, so it is not natural to use a scale finer than the power scale to classify the potentials. The real point is the following. As we show in the next section, we can sacrifice a bit of the decay rate of $V$ to get more detailed information on the solutions. Then, of course, it is most useful to start out with a bound better than $O(x^{-\alpha})$.

7. Embedded eigenvalues. In this section, we show that it is also possible to get $\dim P = 2(1-\alpha)$. Here, $P$ is the set of positive energies (or wavenumbers, depending on the context) with $L^2$-solutions, as defined in the introduction. Recall also that $P \subset S$, so the following result is, in a sense, a strengthening of Theorem 1.3.

Theorem 7.1. Suppose $\alpha > 2/3$. Then there is a potential so that (2) holds and $\dim P = 2(1-\alpha)$.

To explain the significance of this variation on Theorem 1.3, we combine it with a result of del Rio, Jitomirskaya, Last, and Simon.

Theorem 7.2 [5, Theorem 5.1]. Let $B = \{\beta \in [0, \pi) : \sigma_{pp}(H_\beta) \cap (0, \infty) \ne \emptyset\}$. Then $\dim B = \dim P$.
In other words, Theorem 7.1 now shows that Theorem 1.2(b) is optimal (for $\alpha \ge 2/3$).

Proof of Theorem 7.1. We use the results and notation of the preceding sections. The basic idea of the modification is as follows. We try to get rapidly increasing solutions on the set $T$, and then we prove that there are also decaying solutions that eventually turn out to be square integrable. To this end, we modify the construction at two places. The $L_n$'s are taken somewhat larger, and the $\Delta_n$'s are picked such that eventually the Prüfer angle is never very close to the phase $\theta$ corresponding to the decaying direction. We discuss this latter point first.

For technical reasons, it is necessary to work with two linearly independent solutions $y_1$, $y_2$ in this proof. We can take, let us say, $y_1(0) = y_2'(0) = 1$, $y_1'(0) = y_2(0) = 0$. Denote the corresponding Prüfer angles by $\varphi_1$ and $\varphi_2$, respectively. The following statement replaces Lemma 6.1.

Lemma 7.3. Given measurable functions $\theta_n(k)$, it is possible to choose $\Delta_n \in [0, \pi/2]$ such that the following holds on a set $T_1 \subset T$ of measure $\mu(T_1) = 1$. If $k \in T_1$, then there is an $n_0 = n_0(k)$ so that
\[ \left| \varphi_i(a_n, k) - \theta_n(k) \right| \ge \frac{1}{n^2} \]
for all $n \ge n_0$ and for $i = 1, 2$.

Proof. Let
\[ M_n^{(i)} = \left\{ k \in T : \left| \varphi_i(a_n, k) - \theta_n(k) \right| < \frac{1}{n^2} \right\}, \]
$M_n = M_n^{(1)} \cup M_n^{(2)}$, and argue as in the first part of the proof of Lemma 6.1; one sees that the $\Delta_n$'s can be picked inductively so that $\mu(M_n) \le 8/(\pi n^2)$. This is summable, so the claim follows by the Borel-Cantelli lemma.

To go from increasing solutions to decaying solutions, we need the following tool.

Proposition 7.4 [13, Theorem 8.1]. Let $B_n \in \mathbb{R}^{2 \times 2}$ with $\det B_n = 1$, and put $T_n = B_n B_{n-1} \cdots B_1$. Suppose that
\[ \sum_{n=1}^{\infty} \frac{\|B_{n+1}\|^2}{\|T_n\|^2} < \infty. \]
Then there is $u \in \mathbb{R}^2$, $\|u\| = 1$, so that
\[ \|T_n u\|^2 \le 8 \left( \|T_n\|^{-2} + \|T_n\|^2 \left( \sum_{m=n}^{\infty} \frac{\|B_{m+1}\|^2}{\|T_m\|^2} \right)^{2} \right). \tag{29} \]
Moreover, if $v \in \mathbb{R}^2$ is linearly independent of $u$, then
\[ \inf_{n \in \mathbb{N}} \frac{\|T_n v\|}{\|T_n\|} > 0. \]
See [13] for the proof. The "moreover" part is not explicitly stated in [13], but it is established in the proof given there.

In this section, it is necessary to analyze the behavior of the solutions at the points $x_{n,j} \equiv a_n + j/g_n$. Lemma 7.3, however, gives control on the phases $\varphi$ only at the points $x_{n,0}$. We therefore need the following extension of Lemma 2.2. (We drop $k$ in the notation.)

Lemma 7.5. In the situation of Lemma 2.2, we also have that
\[ \|T^{m+n} e_\varphi\| \ge \|T^m e_\varphi\| \cdot |\lambda|^n \sin|\theta - \varphi| \]
for all $m, n \in \mathbb{N}_0$.

Clearly, in the special case $m = 0$, we recover Lemma 2.2.

Proof. We use the notation from the proof of Lemma 2.2. Of course, we can again assume that $\psi = 0$. In fact, we can further reduce the general case to the following situation: $\lambda > 1$, $0 < \theta \le \pi/2$, and $0 < \varphi < \pi$. (The case $\varphi = 0$ is of course trivial.) We define $\omega \in (0, 2\pi)$ by $T^m e_\varphi = \|T^m e_\varphi\|\, e_\omega$. We must then show that $\|T^n e_\omega\| \ge \lambda^n \sin|\theta - \varphi|$. Intuitively, this holds because the position of $\omega$ is not worse than that of the original phase $\varphi$. More precisely, we have the following. If $0 < \varphi < \theta$, then $0 < \omega < \varphi$, and if $\theta < \varphi < \pi$, then $\varphi < \omega < \pi$. In other words, $T^m e_\varphi$ approaches the subspace corresponding to the large eigenvalue $\lambda$. Since this is geometrically obvious, we do not give the formal verification (which is carried out using the explicit form of $T$).

Now if $0 < \omega \le \pi/2 + \theta$, then $\sin|\theta - \omega| \ge \sin|\theta - \varphi|$ by the discussion above; so applying Lemma 2.2 with $\omega$ taking the role of $\varphi$ gives the assertion in this case. But if $\pi/2 + \theta < \omega < \pi$, things are even better, because then $\|T^n e_\omega\| \ge \lambda^n$. To prove this, it suffices to consider $n = 1$ because by the above remarks again, under repeated application of $T$, the phase remains in the interval $(\pi/2 + \theta, \pi)$. By direct computation, as in the proof of Lemma 2.2, we have that
\[ \|T e_\omega\|^2 = \frac{\lambda^2}{\sin^2\theta}\, f(\lambda^{-2}), \tag{30} \]
where $f(x) \equiv x^2 \sin^2\omega + \sin^2(\omega - \theta) - 2x \cos\theta \sin\omega \sin(\omega - \theta)$. The parabola $f$ has its minimum at
\[ x_0 = \frac{\cos\theta\, \sin(\omega - \theta)}{\sin\omega}. \]
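The two facts about the parabola $f$ that drive the end of this proof, namely that its vertex satisfies $x_0 \ge 1$ in the range $0 < \theta < \pi/2$, $\pi/2 + \theta < \omega < \pi$, and that $f(1) = \sin^2\theta$ (an elementary trigonometric identity), can be confirmed on a grid. The following is a rough numerical sketch; the function names, grid size, and tolerances are arbitrary choices made for this example.

```python
import math


def f(x, theta, omega):
    """The parabola from the proof of Lemma 7.5."""
    return (x * x * math.sin(omega) ** 2 + math.sin(omega - theta) ** 2
            - 2 * x * math.cos(theta) * math.sin(omega) * math.sin(omega - theta))


def vertex(theta, omega):
    """Location x0 of the minimum of f."""
    return math.cos(theta) * math.sin(omega - theta) / math.sin(omega)


def check_region(steps=40):
    """Scan 0 < theta < pi/2 and pi/2 + theta < omega < pi on a grid."""
    for i in range(1, steps):
        theta = (math.pi / 2) * i / steps
        for j in range(1, steps):
            omega = math.pi / 2 + theta + (math.pi / 2 - theta) * j / steps
            if abs(f(1.0, theta, omega) - math.sin(theta) ** 2) > 1e-12:
                return False
            if vertex(theta, omega) < 1.0 - 1e-9:
                return False
    return True
```

Since $f$ is decreasing to the left of its vertex, $x_0 \ge 1$ gives $f(x) \ge f(1)$ for all $x \le 1$, which is how the value $f(1) = \sin^2\theta$ enters the estimate.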
Since $0 < \theta < \pi/2$ and $\pi/2 + \theta < \omega < \pi$, we see that $x_0 = \cos^2\theta + \sin\theta \cos\theta\, |\cot\omega| \ge \cos^2\theta + \sin\theta \cos\theta\, |\cot(\pi/2 + \theta)| = 1$. By assumption $\lambda^{-2} \le 1$, so we get, from (30),
\[ \|T e_\omega\|^2 \ge \frac{\lambda^2}{\sin^2\theta}\, f(1) = \lambda^2, \]
as claimed.

With these preparations out of the way, we can now begin the more careful analysis of the solutions, as outlined above. To this end, fix $k \in T_1 \setminus T_0$, where $T_0$ and $T_1$ are the sets defined in Lemmas 6.2 and 7.3, respectively. (Although we are about to change $\Delta_n$, $L_n$, we may still apply Lemma 6.2, because the set $T_0$ defined there is independent of these parameters.) By these lemmas together with Lemma 5.3, $\dim T_1 \setminus T_0 \ge D$. Recall that $x_{n,j} = a_n + j/g_n$, and abbreviate the corresponding transfer matrix by $T_{n,j} \equiv T(x_{n,j}, 0; k)$. Our first aim is to show that Proposition 7.4 applies to this situation. More precisely, we claim that, for appropriate $L_n$'s (see (32)),
\[ \sum_{n=1}^{\infty} \sum_{j=0}^{L_n} \frac{\|T(x_{n,j+1}, x_{n,j}; k)\|^2}{\|T_{n,j}\|^2} < \infty. \]
(Here, we set $x_{n,L_n+1} = x_{n+1,0}$.) First of all, we observe that the sequence of integrals
\[ \int_{x_{n,j}}^{x_{n,j+1}} |V(x)|\, dx = c_n g_n \|W\|_1 \]
is bounded. Thus also $\|T(x_{n,j+1}, x_{n,j}; k)\| \le C$ by a Gronwall estimate. Moreover, writing $R_{n,j} = R(x_{n,j}, k)$ for the Prüfer radius (corresponding to either $y_1$ or $y_2$) at $x_{n,j}$, we clearly have that $\|T_{n,j}\| \ge R_{n,j}$, so it suffices to prove that $\sum_{n,j} R_{n,j}^{-2} < \infty$.

Now for $n$ big enough and $l \ge j$, we can proceed as in the preceding section to estimate $R_{n,l}/R_{n,j}$. Namely, we have that $R_{n,l} = R_{n,0} \|T^l e_\varphi\|$, where $T \equiv T_{c_n W}(k/g_n)$ and $\varphi \equiv \varphi(a_n, k)$. Of course, the same relation holds if $l$ is replaced by $j$. Thus $R_{n,l}/R_{n,j} = \|T^{(l-j)+j} e_\varphi\| / \|T^j e_\varphi\|$, and we can now invoke Lemma 7.5 together with Lemma 7.3 and inequality (21). We get
\[ \ln \frac{R_{n,l}}{R_{n,j}} \ge C (l - j)\, c_n g_n \exp\left( -\frac{a^{n+1}}{2(a-1)} \right) - 2 \ln n \qquad (C > 0). \tag{31} \]
The last term comes from the bound $1/n^2$ on $\sin|\theta - \varphi|$. In this section, we take
\[ L_n = \left[ A^n\, \frac{1}{c_n g_n} \exp\left( \frac{a^{n+1}}{2(a-1)} \right) \right], \tag{32} \]
where $A$ is a constant with $A > a$. Then (31), when specialized to $j = 0$, $l = L_n$, says that
\[ \ln\frac{R_{n+1,0}}{R_{n,0}} \ge C A^n. \tag{33} \]
Using (31) and (33), it is now easy to obtain bounds on $R^{-2}$ which show that indeed $\sum_{n,j} R_{n,j}^{-2} < \infty$. We omit the details, because a similar analysis is carried out below.

We now want to show that the solution $y$ obtained from the vector $u$ from Proposition 7.4 as $(y(x), y'(x)/k)^t = T(x, 0; k)u$ is square integrable. So we must analyze the right-hand side of (29). To begin, we notice that, by the last statement of Proposition 7.4, we must have an estimate of the type $R_{n,j} \ge C\|T_{n,j}\|$ with $C > 0$ for at least one of the two solutions $y_1$, $y_2$. (Recall that $R$ involved a choice between $y_1$, $y_2$, even though this has not been made explicit in the notation.) Fix such a solution once and for all. Since the reverse inequality $R \le \|T\|$ is trivially true, we may replace $\|T\|$ by $R$ whenever it occurs on the right-hand side of (29). Finally, we can of course again estimate $B_{m+1}$ by a constant. Thus, we can work with
\[ \|T_{n,j} u\|^2 \le C R_{n,j}^{-2} \Bigl( 1 + \sum_{x_{m,l} \ge x_{n,j}} \frac{R_{n,j}^2}{R_{m,l}^2} \Bigr)^2. \]
We break this up according as $m = n$, $m = n + 1$, or $m \ge n + 2$:
\[ \|T_{n,j} u\|^2 \le C R_{n,j}^{-2} \Bigl( 1 + \sum_{l=j}^{L_n} \frac{R_{n,j}^2}{R_{n,l}^2} + \sum_{l=0}^{L_{n+1}} \frac{R_{n,j}^2}{R_{n+1,l}^2} + \sum_{m=n+2}^{\infty} \sum_{l=0}^{L_m} \frac{R_{n,j}^2}{R_{m,l}^2} \Bigr)^2, \]
and we call these three sums $D_1$, $D_2$, and $D_3$, respectively. As for $D_{1,2}$, we use the crude estimates $D_1 \le Cn^4 L_n$, $D_2 \le Cn^4 L_{n+1}$, which follow at once from (31) and also (33) in the case of $D_2$. To analyze $D_3$, we write it in the form
\[ D_3 = \sum_{m=n+2}^{\infty} \sum_{l=0}^{L_m} \frac{R_{n,j}^2}{R_{n+1,0}^2} \cdot \frac{R_{m,0}^2}{R_{m,l}^2} \cdot \frac{R_{n+1,0}^2}{R_{m,0}^2}. \]
Noting that $R_{n+1,0} = R_{n,L_n}$, we can again use (31) to bound the first two ratios by $Cn^4$ and $Cm^4$, respectively. The last ratio is controlled by iterating (33). This gives a bound of the type $\exp(-CA^m)$. Hence
\[ D_3 \le C_1 n^4 \sum_{m=n+2}^{\infty} m^4 L_m \exp(-C_2 A^m). \]
Collecting the bounds just proved and using a similar estimate on $R_{n,j}^{-2}$, we get
\[ \|T_{n,j} u\|^2 \le C_1 n^4 \exp(-C_2 A^n) \Bigl( n^4 L_{n+1} + n^4 \sum_{m=n+2}^{\infty} m^4 L_m \exp(-C_2 A^m) \Bigr)^2, \]
with positive constants $C_1$, $C_2$. In view of the definitions (14), (15), and (32), it is now clear that the small factors $\exp(-C_2 A^n)$, $\exp(-C_2 A^m)$ dominate everything else. Straightforward manipulations (details are left to the reader) finally give an estimate of the form $\|T_{n,j} u\|^2 \le C \exp(-\delta A^n)$, with $\delta > 0$.

It now follows easily that the solution $y$ defined as the first component of $T(x)u$ is in $L^2$. Namely, since $\int_{x_{n,j}}^{x_{n,j+1}} |V|$ is bounded, a Gronwall-type estimate lets us extend the inequality $\|T(x)u\|^2 \le C \exp(-\delta A^n)$ to all of $x_{n,j} \le x \le x_{n,j+1}$. (The constant $C$ here might be larger than the original one.) Thus
\[ \int_0^{\infty} |y(x)|^2\, dx \le C \sum_{n=1}^{\infty} \bigl( \ell_n + L_n g_n^{-1} \bigr) \exp(-\delta A^n), \]
and this is finite since $\ell_n + L_n g_n^{-1} \le \exp(Ca^n)$.

Finally, we have to estimate $V$. We proceed as in the last part of Section 6. Instead of (28) we obtain this time that if $a_n < x < a_{n+1}$, then
\[ |V(x)| \le C_1 \exp\Bigl( -\frac{a^{n+1}}{a-2} - b^{n+1} \Bigr), \qquad x \le C_2 A^n \exp\Bigl( \frac{3a-4}{(a-1)(2a-4)}\, a^{n+1} + b^{n+1} \Bigr). \]
Now the precaution of introducing $b$ at the beginning of the construction finally pays off. In spite of the additional large factor $A^n$, we still have $V(x) = O(x^{-\alpha})$ (actually, again a slightly better estimate holds), so the proof of Theorem 7.1 is complete.

8. Proof of Theorem 1.5. We again work with wavenumbers $k$ instead of energies $E$. We use only $k$'s of a rather special form because we want them to lie in the gaps of small periodic potentials, and by Proposition 2.3 (or Theorem 4.2), there is nothing we can do about the location of these gaps. Also, for technical reasons (see Lemma 8.1), we must not have too many rational relations among the $k$'s. (Rationally independent $k$'s would be best for this purpose.) The choice below is a good compromise between these partly contradictory requirements.

So suppose we are given $k_i$'s of the form $k_i = \pi j(i) 2^{-m(i)}$ with $j(i), m(i) \in \mathbb{N}$, $j(i)$ odd, and with $\sum k_i^2 < \infty$. We further assume that $m(i + 1) \ge m(i) + 2$ for all $i \in \mathbb{N}$. We then construct a potential $V(x) = O(x^{-1})$ so that $P \supset \{k_i^2\}$. Clearly, this proves Theorem 1.5; in fact, it also shows that we can make $(E_n - e_n)$ arbitrarily small.
The basic strategy of the proof is as follows. We place gaps at the $k_i$'s and then use the arguments of the preceding section to deduce the existence of $L^2$-solutions at these energies. At each step, we work with only finitely many $k_i$'s (let us say with $N_n$ of them), and we let this number $N_n$ tend to infinity. Again, several modifications of the original construction are necessary. First of all, we have to use different periodic potentials, depending on the interval $(a_n, b_n)$. So, here $V$ has the form
\[ V(x) = \begin{cases} W^{(n)}_{g_n}(x - a_n), & x \in (a_n, b_n), \\ 0, & x \in [b_n, a_{n+1}]. \end{cases} \]
Of course, the parameter $c_n$ is now superfluous. Then, we must use $g_n$'s that are only exponentially decreasing. We put $g_n = 2^{-n}$. Next, according to what has been said above, we pick functions $W^{(n)} \in L^\infty(0, 1)$ that satisfy for all $i \in \mathbb{N}$ with $m(i) \le n$ the inequality
\[ |W^{(n)}_j| \ge k_i. \]
Here, $j = j(i, n)$ is defined as the unique integer with $\pi j 2^{-n} = k_i$. If $n < m(i)$, there is no condition. We further require that $\int_0^1 W^{(n)} = 0$ and $\sup_n \|W^{(n)}\|_\infty < \infty$. The existence of such functions $W^{(n)}$ follows from Theorem 5.1. We need still another version of Lemma 6.1 in this section.

Lemma 8.1. Given angles $\theta_1, \dots, \theta_N$, $\varphi_1^{(s)}, \dots, \varphi_N^{(s)}$ ($s = 1, 2$), there is $\ell \in [0, 2^{m(N)}]$ so that
\[ |\varphi_i^{(s)} + k_i \ell - \theta_i| \ge \frac{\pi}{8} \tag{34} \]
for $i = 1, \dots, N$, $s = 1, 2$.

In the formulation, we already anticipated the fact that in the application of Lemma 8.1, $\theta_i$ is the exceptional angle from Lemma 2.2 and $\varphi_i^{(s)} = \varphi_s(b_n, k_i)$ are the Prüfer angles corresponding to two different solutions. Also, as usual in such situations, all angles are evaluated modulo $\pi$.

Proof. The proof is by induction on $N$. The case $N = 1$ is trivial. For general $N$, let $F_N(\ell) = (k_1 \ell, \dots, k_N \ell)$, viewed as a function with values in the torus $\mathbb{T}^N = [0, \pi)^N$. $F_N$ is periodic with period $p_N = 2^{m(N)}/J_N$, where $J_N$ is the greatest common factor of $j(1), \dots, j(N)$. In particular, we see that $p_N \ge 4 p_{N-1}$. By the induction hypothesis, there is an $\ell_0$ so that (34) holds for $\ell = \ell_0$ and $i \le N - 1$. We may of course assume that $\ell_0 \in [0, p_{N-1})$. We now look at $F_N(\ell)$ at the points $\ell_r = \ell_0 + r p_{N-1}$, $r = 0, 1, \dots, p_N/p_{N-1} - 1$. Since $F_N(\ell_r) = (F_{N-1}(\ell_0), k_N \ell_r)$ by periodicity of $F_{N-1}$, it suffices to consider the last component of $F_N$. Now the $k_N \ell_r$, $r = 0, 1, \dots, p_N/p_{N-1} - 1$, are all distinct, for otherwise the period of $F_N$ would be smaller than $p_N$. It is easy to see that these values are in fact equally spaced when ordered appropriately, and there are at least four of them.
So (34) with $i = N$ and $\ell = \ell_r$ must hold for some choice of $r$. (In fact, there are at least two $r$'s that do the job.) The estimate on $\ell$ is of course immediate from $p_N \le 2^{m(N)}$.

We now pick the numbers $N_n$ that indicate how many $k_i$'s are taken into account at step $n$ of the construction. We fix once and for all $N_n \in \mathbb{N}_0$ with the following properties: $N_n \to \infty$, and $m(N_n) \le n$ for all $n \in \mathbb{N}$.

We are now ready to show that (1) with $E = k_i^2$ indeed has an $L^2$-solution. Since the argument is reasonably close to the discussion of the preceding section, it suffices to provide a sketch. The right choice of the parameters $L_n$ turns out to be $L_n = 5 \cdot 2^n$ (5 could be replaced by other sufficiently large constants). Now use Lemma 8.1 to find $\ell_{n-1} \in [0, 2^{m(N_n)}]$ so that
\[ |\varphi_s(a_n, k_i) - \theta_n(k_i)| \ge \frac{\pi}{8} \tag{35} \]
for $s = 1, 2$, $i = 1, \dots, N_n$. As already explained above, $\theta$ denotes the exceptional angle defined via Lemma 2.2. Furthermore, we are again considering two different solutions of (1) here. We have indicated this by using the additional index $s$.

Fix $i$, and let $n$ be large enough so that $i \le N_n$. Then also $m(i) \le n$; so, if $j = j(i, n)$ is defined by $j\pi = k_i/g_n$ (as above), then $|W^{(n)}_j| \ge k_i$ by construction of $W^{(n)}$. Thus Proposition 2.3 tells us that the eigenvalue $\lambda_n$ associated with $W^{(n)}$ satisfies
\[ \lambda_n\Bigl(\frac{k_i}{g_n}\Bigr) \ge 1 + \frac{k_i}{10 j} = 1 + \frac{\pi}{10}\, g_n. \]
Actually, it may be necessary here to restrict $n$ to even larger values, to make sure the hypotheses of Proposition 2.3 are satisfied. With this and (35) as new input, we can now again run the machine based on Proposition 7.4 and Lemma 7.5. We omit the details, but we give some intermediate results for better orientation. The notation is from Section 7. The analogs of (31) and (33) are
\[ \ln\frac{R_{n,l}}{R_{n,j}} \ge \frac{\pi}{5}(l - j)\, g_n + \ln\sin(\pi/8), \qquad \ln\frac{R_{n+1,0}}{R_{n,0}} \ge 2. \]
(Of course, the constants are anything but optimal.) Using this, one first shows that Proposition 7.4 applies. Then, the analysis of (29) shows that there is a decaying solution $y$ of (1) with $|y(x)|^2 \le Ce^{-4n} 2^{2n}$ for $x \in (a_n, a_{n+1})$. It follows that
\[ \int_0^{\infty} |y(x)|^2\, dx \le C \sum_n e^{-4n} 2^{2n} \Bigl( \ell_n + \frac{L_n}{g_n} \Bigr) \le C \sum_n e^{-4n} 2^{2n} \bigl( 2^{n+1} + 5 \cdot 2^{2n} \bigr) < \infty, \]
as desired. The estimate on $\ell_n$ uses the fact that $m(N_n) \le n$.
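The inductive search in the proof of Lemma 8.1 can be mimicked numerically. The following is a minimal sketch, not part of the original argument: the function names, the grid scan used for the base case $N = 1$ (which the proof only calls trivial), and the sample angles in the usage note are all illustrative assumptions.

```python
from fractions import Fraction
from math import gcd, pi


def dist_mod_pi(x):
    """Distance from x to the nearest integer multiple of pi."""
    r = x % pi
    return min(r, pi - r)


def satisfies_34(ell, i, j, m, theta, phi, tol=0.0):
    """Check inequality (34) for index i and both solutions s = 1, 2."""
    k_i = pi * j[i] / 2 ** m[i]
    return all(dist_mod_pi(p + k_i * ell - theta[i]) >= pi / 8 - tol
               for p in phi[i])


def find_ell(j, m, theta, phi, grid=1000):
    """Follow the induction on N: fix ell_0 for the first N-1 indices,
    then try ell_0 + r * p_{N-1} for r = 0, ..., p_N / p_{N-1} - 1.

    Assumes j[i] odd and m[i+1] >= m[i] + 2, with k_i = pi * j[i] / 2**m[i];
    phi[i] is the pair (phi_i^(1), phi_i^(2)). `grid` is a hypothetical
    resolution for the base-case scan.
    """
    J = j[0]
    period = Fraction(2 ** m[0], J)  # p_1
    ell = next(period * t / grid for t in range(grid)
               if satisfies_34(float(period * t / grid), 0, j, m, theta, phi))
    for n in range(1, len(j)):
        J = gcd(J, j[n])
        new_period = Fraction(2 ** m[n], J)  # p_N
        steps = int(new_period / period)     # p_N / p_{N-1}, at least 4
        ell = next(ell + r * period for r in range(steps)
                   if satisfies_34(float(ell + r * period), n, j, m, theta, phi))
        period = new_period
    return ell
```

For instance, `find_ell([1, 1], [1, 3], [0.3, 0.7], [(1.0, 2.0), (0.5, 2.5)])` uses $k_1 = \pi/2$, $k_2 = \pi/8$ and returns a shift in $[0, 2^{m(2)}] = [0, 8]$ for which (34) holds for both indices. Exact rational arithmetic is used for the periods so that the shifts $r p_{N-1}$ do not accumulate rounding error.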
To conclude the proof, we note that, for $x \in (a_n, a_{n+1})$, we have that $|V(x)| \le C_1 g_n^2 = C_1 2^{-2n}$. On the other hand,
\[ x \le \sum_{r=1}^{n} \Bigl( \ell_r + \frac{L_r}{g_r} \Bigr) \le C_2 2^{2n} \]
for these $x$; thus $V(x) = O((1 + x)^{-1})$.
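The bookkeeping behind this last estimate is easy to check numerically. The sketch below is only an illustration under stated assumptions: it takes $C_1 = 1$, uses the worst case $\ell_r = 2^{r+1}$ allowed by $m(N_{r+1}) \le r + 1$, and the function name is hypothetical.

```python
def decay_ratios(n_max):
    """Return (1 + x_n) * g_n**2 for n = 1, ..., n_max, where
    g_n = 2**(-n) and x_n = sum_{r <= n} (ell_r + L_r / g_r)
    with the assumed worst case ell_r = 2**(r + 1) and L_r = 5 * 2**r.
    """
    ratios = []
    x = 0
    for n in range(1, n_max + 1):
        x += 2 ** (n + 1) + 5 * 4 ** n   # ell_n + L_n / g_n
        ratios.append((1 + x) / 4 ** n)  # (1 + x) times the bound g_n**2 on |V|
    return ratios
```

The returned ratios stay bounded (they approach $20/3$), which is exactly the statement $V(x) = O((1 + x)^{-1})$ for these parameter choices.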
9. Second proof of Corollary 1.6. In [26], Simon uses von Neumann-Wigner potentials (see [28]),
\[ W(x) = C\, \frac{\sin(2kx + \theta)}{1 + x}, \]
in an original way to construct a $V$ with prescribed point spectrum and "almost" $V(x) = O((1 + x)^{-1})$. (A different approach to this problem had earlier been presented in [16]; [26] also removes a technical condition from [16].) The gist of Simon's work is contained in the following.

Theorem 9.1 [26]. Given $k_m^{(n)} > 0$ ($n \in \mathbb{N}$, $m = 1, \dots, N_n$), put
\[ f_n(x; \theta) = \sum_{m=1}^{N_n} k_m^{(n)} \sin\bigl( 2 k_m^{(n)} x + \theta_m^{(n)} \bigr). \]
Then there are cutoffs $x_n \ge 0$, $x_n \to \infty$ so that, for arbitrary $\theta_m^{(n)} \in [0, 2\pi)$, the Schrödinger equation (1) with potential
\[ V(x) = \frac{4}{1 + x} \sum_{n=1}^{\infty} f_n(x; \theta)\, \chi_{(x_n, \infty)}(x) \]
has an $L^2$-solution for all $E \in \{(k_m^{(n)})^2\}$.

We refer to [26] for the proof, but a few comments may be helpful. First of all, note that the sum defining $V(x)$ is finite for each fixed $x$, so convergence is not an issue. Then, in [26], the $k$'s are not grouped together as in the definition of the functions $f_n$ above, but that does not make a real difference in the proof. The basic idea is that $V_N$ defined as the finite sum $4/(1 + x) \sum_{n=1}^{N} \cdots$ would generate $L^2$-solutions at the $k$'s occurring in the sum by the standard theory of von Neumann-Wigner potentials (see [8]). Then, with $x_{N+1}$ chosen sufficiently large, one shows that things change only a little at the energies considered so far when passing from $V_N$ to $V_{N+1}$.

In [26], Simon uses the $\theta$'s to control the initial phases of the $L^2$-solutions. Here, we trade this additional information for bounds on $|V|$, thus obtaining an example that proves Corollary 1.6. We specialize to the situation where
\[ k_m^{(n)} = \epsilon_n \Bigl( 1 + \frac{m}{N_n} \Bigr), \qquad m = 1, \dots, N_n, \]
with $\epsilon_n > 0$, $N_n \in \mathbb{N}$. One could of course treat other examples as well, but we do not see how to obtain Theorem 1.5 in full generality with the method of this section. The crucial property of the choice above is that $f_n$ becomes a periodic function with not too large period. To see that something of this sort is necessary to get reasonable bounds on $f_n$, consider the extreme case when the $k_m^{(n)}$ ($m = 1, \dots, N_n$) are rationally independent. Then $\|f_n\|_\infty = \sum_m k_m^{(n)}$, independently of the $\theta$'s, and this is far too big for our purposes.

We use the fact that random trigonometric polynomials are, up to logarithmic (in the degree) corrections, bounded pointwise by their $L^2$-norm. In the case at hand, this result takes the following form.

Proposition 9.2. There exist $\theta_m^{(n)} \in [0, 2\pi)$ so that
\[ \|f_n(\cdot\,; \theta)\|_\infty \le 4 \bigl( \epsilon_n^2 N_n \ln(32\pi N_n^2) \bigr)^{1/2}. \]

Although this is merely an adaptation of classical methods to the situation under consideration here, we give the full proof below. However, let us first show how the proposition can be used to prove Corollary 1.6. To this end, we let $\epsilon_n = 2^{-n^2 - n/2}$, $N_n = 2^{2n^2}$. Then
\[ \sum_{m=1}^{N_n} \bigl( k_m^{(n)} \bigr)^{2 - 1/n} \ge \epsilon_n^{2 - 1/n} N_n = 2^{1/2}, \]
so $E_{m,n} \equiv (k_m^{(n)})^2$ is not in $\ell^p$ for any $p < 1$. On the other hand, with $\theta$'s picked according to Proposition 9.2, we have that $\|f_n\|_\infty \le Cn 2^{-n/2}$ is summable; thus the $V$ from Theorem 9.1 satisfies $V(x) = O((1 + x)^{-1})$, as required.

Proof of Proposition 9.2. We basically follow the development of [9, Chapter 6]. Of course, $n$ is fixed, so we can drop this index in the notation. We view the $\theta_m$ ($m = 1, \dots, N$) as independent, identically distributed random variables with uniform distribution, and we show that the assertion in fact holds with large probability. The function $f$ is periodic in $x$ with period $\pi N/\epsilon$, so
\[ M(\theta) \equiv \sup_x |f(x; \theta)| = \max_{0 \le x \le \pi N/\epsilon} |f(x; \theta)|. \]
Pick $x_0$ so that $|f(x_0)| = M$. Notice that
\[ k_m = \frac{2\epsilon}{\pi N} \Bigl| \int_0^{\pi N/\epsilon} f(x) \sin(2 k_m x + \theta_m)\, dx \Bigr| \le 2M; \]
hence
\[ |f'(x)| = \Bigl| 2 \sum_{m=1}^{N} k_m^2 \cos(2 k_m x + \theta_m) \Bigr| \le 8 \epsilon N M. \]
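It follows that $|f(x)| \ge M - 8 \epsilon N M |x - x_0|$. In particular, there is an interval $I = I(\theta) \subset [0, \pi N/\epsilon]$ of length
\[ |I(\theta)| \ge \min\Bigl\{ \frac{1}{8 \epsilon N}, \frac{\pi N}{\epsilon} \Bigr\} = \frac{1}{8 \epsilon N} \]
so that $|f(x)| \ge M/2$ for all $x \in I$. We can now estimate the quantity $E(e^{\lambda M/2})$, where $E(\cdots)$ denotes expectation with respect to the probability measure introduced above, and $\lambda > 0$ is chosen later. Namely, also using independence and the computation
\[ \frac{1}{2\pi} \int_0^{2\pi} e^{a \sin(\alpha + \theta)}\, d\theta = \sum_{n=0}^{\infty} \frac{a^n}{n!} \frac{1}{2\pi} \int_0^{2\pi} \sin^n\theta\, d\theta = \sum_{n=0}^{\infty} \frac{a^{2n}}{(2n)!} \binom{2n}{n} 4^{-n} \le e^{a^2/4}, \]
we get
\[ \frac{1}{8 \epsilon N} E\bigl( e^{\lambda M/2} \bigr) \le E\bigl( |I| e^{\lambda M/2} \bigr) \le E\Bigl( \int_{I(\theta)} \bigl( e^{\lambda f(x; \theta)} + e^{-\lambda f(x; \theta)} \bigr)\, dx \Bigr) \le \int_0^{\pi N/\epsilon} E\bigl( e^{\lambda f(x)} + e^{-\lambda f(x)} \bigr)\, dx \le \frac{2\pi N}{\epsilon}\, e^{\lambda^2 \epsilon^2 N}. \]
We can write this in the form
\[ E \exp\Bigl( \frac{\lambda}{2} \Bigl( M - 2\lambda \epsilon^2 N - \frac{2}{\lambda} \ln(32\pi N^2) \Bigr) \Bigr) \le \frac{1}{2} \]
and deduce that, with probability at least $1/2$,
\[ M \le 2\lambda \epsilon^2 N + \frac{2}{\lambda} \ln(32\pi N^2). \]
With $\lambda = \epsilon^{-1} N^{-1/2} (\ln(32\pi N^2))^{1/2}$, we obtain the claim.

Appendix: Norm bounds

In this appendix, we sketch the proof of Theorem 4.1. We follow [1] and private notes of Kiselev. First of all, the bound on $M_f$ follows from the fact that the Fourier transform itself is bounded as a map from $L^p(0, 1)$ to $\ell^q$ if $1 \le p \le 2$ and $1/p + 1/q = 1$. Now such norm bounds automatically carry over to the corresponding maximal functions, provided that $p < q$; that is, $p < 2$. See [1, Theorem 3.1] and also [2] for this remarkable general fact.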
The proof of the bound on $M_{f,g}$ is more involved. We need the following result, which is essentially a special case of [1, Theorem 4.1].

Theorem A.1. Let $K_i(n, \cdot) \in L^\infty(0, 1)$ for every $n \in \mathbb{Z}$, and suppose that the corresponding integral operators $T_i : L^p(0, 1) \to \ell^q$ ($1 \le p < 2$, $q \ge 2$),
\[ (T_i f)(n) = \int_0^1 K_i(n, x) f(x)\, dx, \]
are norm bounded for $i = 1, 2$. Let
\[ T(f, g)(n) = \int_0^1 dx\, K_1(n, x) f(x) \int_0^x dt\, K_2(n, t) g(t). \]
Then
\[ \|T(f, g)\|_{q/2} \le C \|f\|_p \|g\|_p \]
for all $f, g \in L^p$. Here, we may take a constant $C$ that depends only on $p$ and the norms of $T_1$, $T_2$, but not on the particular form of the kernels $K_i$.

Before proving Theorem A.1, let us first show how it is used to complete the proof of Theorem 4.1. Clearly, the operator $T_1$ with kernel $K_1(n, x) = \chi_{(0, b_n)}(x) e^{2\pi i n x}$ is dominated by the corresponding maximal function for arbitrary choice of the sequence $b_n \in [0, 1]$. We know already that this maximal function is norm bounded as a map from $L^p$ to $\ell^q$ if $p \in [1, 2)$ and $1/p + 1/q = 1$, so $\|T_1\| \le C$, independently of $(b_n)$. Thus Theorem A.1 with this $K_1$ and $K_2(n, x) = e^{-2\pi i n x}$ yields
\[ \Bigl\| \int_0^{b_n} dx\, f(x) e^{2\pi i n x} \int_0^x dt\, g(t) e^{-2\pi i n t} \Bigr\|_{q/2} \le C \|f\|_p \|g\|_p. \]
Here, $(b_n)$ is arbitrary and $C$ is independent of $(b_n)$, so $M_{f,g}$ satisfies the same inequality, as claimed.

Proof of Theorem A.1. We need a decomposition of the characteristic function of the triangle $0 \le t \le x \le 1$ into an infinite sum of characteristic functions of rectangles. This decomposition is adapted to the functions $f$, $g$. Let $F = (|f|^p + |g|^p)^{1/p}$. Obviously, $F \in L^p$, and by bilinearity we can normalize so that $\|f\|_p = \|g\|_p = 2^{-1/p}$; thus $\|F\|_p = 1$. For every $m \in \mathbb{N}$, decompose $[0, 1]$ into $2^m$ subintervals $I_i^{(m)}$ so that $\|F \chi_{I_i^{(m)}}\|_p^p = 2^{-m}$ for all $i \in \{1, \dots, 2^m\}$. More specifically, we do this inductively, and at step $m$ we construct the intervals $I_i^{(m)}$ by subdividing every $I_j^{(m-1)}$ ($j = 1, \dots, 2^{m-1}$) into two new intervals. Furthermore, we label in the natural way; that is, $I_{j+1}^{(n)}$ is the right neighbor of $I_j^{(n)}$. Then we have that, for almost all $(x, t) \in [0, 1]^2$,
\[ \chi_{(0, \infty)}(x - t) = \sum_{m=1}^{\infty} \sum_{i=1}^{2^{m-1}} \chi_{I_{2i-1}^{(m)}}(t)\, \chi_{I_{2i}^{(m)}}(x). \]
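In the model case where $F$ is constant on $[0, 1]$, the intervals $I_i^{(m)}$ are simply the dyadic intervals $[(i-1)2^{-m}, i 2^{-m})$, and the identity can be checked by brute force. The sketch below makes exactly this assumption; the function name and the truncation depth are illustrative and not taken from the source.

```python
def count_terms(t, x, depth=40):
    """Count the pairs (m, i) with m <= depth such that t lies in
    I^(m)_{2i-1} and x lies in I^(m)_{2i}, for the dyadic intervals
    I^(m)_i = [(i - 1) / 2**m, i / 2**m) of the model case F = const.
    Expects 0 <= t, x < 1.
    """
    count = 0
    for m in range(1, depth + 1):
        it = int(t * 2 ** m) + 1  # 1-based index of the level-m interval containing t
        ix = int(x * 2 ** m) + 1  # same for x
        # t must sit in an odd-numbered interval and x in its right sibling
        if it % 2 == 1 and ix == it + 1:
            count += 1
    return count
```

For $t < x$ exactly one term of the double sum is nonzero (the first level at which $t$ and $x$ fall into different intervals), and for $t > x$ none is, so the count reproduces the indicator of $x - t > 0$ away from a null set. The general case differs only in where the subdivision points are placed.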
This is most easily seen from a sketch; for the formal proof, see [1]. Using this decomposition to evaluate T (f, g), we get
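\begin{align*}
\|T(f, g)\|_{q/2} &= \Bigl\| \sum_{m=1}^{\infty} \sum_{i=1}^{2^{m-1}} T_1\bigl( f \chi_{I_{2i}^{(m)}} \bigr)\, T_2\bigl( g \chi_{I_{2i-1}^{(m)}} \bigr) \Bigr\|_{q/2} \\
&\le \sum_{m=1}^{\infty} \sum_{i=1}^{2^{m-1}} \bigl\| T_1\bigl( f \chi_{I_{2i}^{(m)}} \bigr) \bigr\|_q \bigl\| T_2\bigl( g \chi_{I_{2i-1}^{(m)}} \bigr) \bigr\|_q \\
&\le \|T_1\| \|T_2\| \sum_{m=1}^{\infty} \sum_{i=1}^{2^{m-1}} \bigl\| f \chi_{I_{2i}^{(m)}} \bigr\|_p \bigl\| g \chi_{I_{2i-1}^{(m)}} \bigr\|_p \\
&\le \|T_1\| \|T_2\| \sum_{m=1}^{\infty} \sum_{i=1}^{2^{m-1}} \bigl\| F \chi_{I_{2i}^{(m)}} \bigr\|_p \bigl\| F \chi_{I_{2i-1}^{(m)}} \bigr\|_p \\
&= \|T_1\| \|T_2\| \sum_{m=1}^{\infty} 2^{-2m/p}\, 2^{m-1} \\
&= \bigl( 2^{2/p} - 2 \bigr)^{-1} \|T_1\| \|T_2\|,
\end{align*}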
as required.

References

[1] M. Christ and A. Kiselev, Absolutely continuous spectrum for one-dimensional Schrödinger operators with slowly decaying potentials: Some optimal results, J. Amer. Math. Soc. 11 (1998), 771–797.
[2] M. Christ and A. Kiselev, A maximal inequality, preprint, 1998.
[3] M. Christ, A. Kiselev, and C. Remling, The absolutely continuous spectrum of one-dimensional Schrödinger operators with decaying potentials, Math. Res. Lett. 4 (1997), 719–723.
[4] P. Deift and R. Killip, On the absolutely continuous spectrum of one-dimensional Schrödinger operators with square summable potentials, Comm. Math. Phys. 203 (1999), 341–347.
[5] R. del Rio, S. Jitomirskaya, Y. Last, and B. Simon, Operators with singular continuous spectrum, IV: Hausdorff dimensions, rank one perturbations, and localization, J. Anal. Math. 69 (1996), 153–200.
[6] F. Delyon, B. Simon, and B. Souillard, From power pure point to continuous spectrum in disordered systems, Ann. Inst. H. Poincaré Phys. Théor. 42 (1985), 283–309.
[7] M. S. P. Eastham, The Spectral Theory of Periodic Differential Equations, Texts in Math., Scottish Academic Press, Edinburgh, 1973.
[8] M. S. P. Eastham, The Asymptotic Solution of Linear Differential Systems: Applications of the Levinson Theorem, London Math. Soc. Monogr. (N.S.) 4, Clarendon Press, Oxford, 1989.
[9] J.-P. Kahane, Some Random Series of Functions, 2d ed., Cambridge Stud. Adv. Math. 5, Cambridge Univ. Press, Cambridge, 1985.
[10] A. Kiselev, Absolutely continuous spectrum of one-dimensional Schrödinger operators and Jacobi matrices with slowly decreasing potentials, Comm. Math. Phys. 179 (1996), 377–400.
[11] A. Kiselev, Stability of the absolutely continuous spectrum of the Schrödinger equation under slowly decaying perturbations and a.e. convergence of integral operators, Duke Math. J. 94 (1998), 619–646.
[12] A. Kiselev, Y. Last, and B. Simon, Modified Prüfer and EFGP transforms and the spectral analysis of one-dimensional Schrödinger operators, Comm. Math. Phys. 194 (1998), 1–45.
[13] Y. Last and B. Simon, Eigenfunctions, transfer matrices, and absolutely continuous spectrum of one-dimensional Schrödinger operators, Invent. Math. 135 (1999), 329–367.
[14] B. M. Levitan and I. S. Sargsjan, Sturm-Liouville and Dirac Operators, Math. Appl. (Soviet Ser.) 59, Kluwer, Dordrecht, 1991.
[15] S. Molchanov, One-dimensional Schrödinger operators with sparse potentials, preprint, 1997.
[16] S. N. Naboko, Dense point spectra of Schrödinger and Dirac operators, Theoret. and Math. Phys. 68 (1986), 646–653.
[17] J. Pöschel and E. Trubowitz, Inverse Spectral Theory, Pure Appl. Math. 130, Academic Press, Boston, 1987.
[18] M. Reed and B. Simon, Methods of Modern Mathematical Physics, Vol. I–IV, Academic Press, New York, 1972–1979.
[19] C. Remling, Some Schrödinger operators with power-decaying potentials and pure point spectrum, Comm. Math. Phys. 186 (1997), 481–493.
[20] C. Remling, The absolutely continuous spectrum of one-dimensional Schrödinger operators with decaying potentials, Comm. Math. Phys. 193 (1998), 151–170.
[21] C. Remling, Embedded singular continuous spectrum for one-dimensional Schrödinger operators, Trans. Amer. Math. Soc. 351 (1999), 2479–2497.
[22] C. Remling, Bounds on embedded singular spectrum for one-dimensional Schrödinger operators, Proc. Amer. Math. Soc. 128 (2000), 161–171.
[23] C. A. Rogers, Hausdorff Measures, Cambridge Univ. Press, Cambridge, 1970.
[24] B. Simon, Some Jacobi matrices with decaying potential and dense point spectrum, Comm. Math. Phys. 87 (1982), 253–258.
[25] B. Simon, Bounded eigenfunctions and absolutely continuous spectra for one-dimensional Schrödinger operators, Proc. Amer. Math. Soc. 124 (1996), 3361–3369.
[26] B. Simon, Some Schrödinger operators with dense point spectrum, Proc. Amer. Math. Soc. 125 (1997), 203–208.
[27] G. Stolz, Bounded solutions and absolute continuity of Sturm-Liouville operators, J. Math. Anal. Appl. 169 (1992), 210–228.
[28] J. von Neumann and E. Wigner, Über merkwürdige diskrete Eigenwerte, Z. Phys. 30 (1929), 465–467.
[29] J. Weidmann, Spectral Theory of Ordinary Differential Operators, Lecture Notes in Math. 1258, Springer, Berlin, 1987.
Fachbereich Mathematik/Informatik, Universität Osnabrück, 49069 Osnabrück, Germany; [email protected]
Vol. 105, No. 3
DUKE MATHEMATICAL JOURNAL
© 2000
GROUP SCHEMES AND LOCAL DENSITIES WEE TECK GAN and JIU-KANG YU
1. Introduction. The subject matter of this paper is an old one with a rich history, beginning with the work of Gauss and Eisenstein, maturing at the hands of Smith and Minkowski, and culminating in the fundamental results of Siegel. More precisely, if $L$ is a lattice over $\mathbb{Z}$ (for simplicity), equipped with an integral quadratic form $Q$, the celebrated Smith-Minkowski-Siegel mass formula expresses the total mass of $(L, Q)$, which is a weighted class number of the genus of $(L, Q)$, as a product of local factors. These local factors are known as the local densities of $(L, Q)$.

Subsequent work of Kneser, Tamagawa, and Weil resulted in an elegant formulation of the subject in terms of Tamagawa measures. In particular, the local density at a non-Archimedean place $p$ can be expressed as the integral of a certain volume form $\omega^{\mathrm{ld}}$ over $\mathrm{Aut}_{\mathbb{Z}_p}(L, Q)$, which is an open compact subgroup of $\mathrm{Aut}_{\mathbb{Q}_p}(L, Q)$. The question that remains is whether one can find an explicit formula for the local density.

Through the work of Pall (for $p \ne 2$) and Watson (for $p = 2$), such an explicit formula for the local density is in fact known for an arbitrary lattice over $\mathbb{Z}_p$ (see [P] and [Wa]). The formula is obviously structured, although [CS] seems to be the first to comment on this. Unfortunately, the known proof (as given in [P] and [K]) does not explain this structure and involves complicated recursions. On the other hand, Conway and Sloane [CS, Section 13] have given a heuristic explanation of the formula.

In this paper, we give a simple and conceptual proof of the local density formula for $p \ne 2$. The viewpoint taken here is similar to that of our earlier work [GHY], and the proof is based on the observation that there exists a smooth affine group scheme $\underline{G}$ over $\mathbb{Z}_p$ with generic fiber $\mathrm{Aut}_{\mathbb{Q}_p}(L, Q)$, which satisfies $\underline{G}(\mathbb{Z}_p) = \mathrm{Aut}_{\mathbb{Z}_p}(L, Q)$. This follows from general results of smoothening [BLR], as we explain in Section 3. To obtain an explicit formula, it is necessary to have an explicit construction of $\underline{G}$. The main contribution of this paper is to give such an explicit construction of $\underline{G}$ (in Section 5), and to determine its special fiber (in Section 6). Finally, by comparing $\omega^{\mathrm{ld}}$ and the canonical volume form $\omega^{\mathrm{can}}$ of $\underline{G}$, we obtain the explicit formula for the local density in Section 7. The smooth group schemes constructed in this paper should also be of independent interest.

Our method works over any non-Archimedean local field of residue characteristic $p \ne 2$ and also works for all types of classical groups. Therefore, we obtain new explicit formulas for local densities of lattices in symplectic spaces, Hermitian spaces, and quaternionic Hermitian and anti-Hermitian spaces. For lattices in a symplectic space, or a Hermitian space over an unramified quadratic extension, we note that such explicit formulas were obtained earlier in [HS1], [HS2], and [Hi] using very different techniques. The restriction that $p \ne 2$ is actually not required for symplectic spaces, Hermitian spaces over an unramified quadratic extension, and quaternionic Hermitian spaces, as we explain in Section 9. For the remaining types of spaces, the case $p = 2$ is much more involved and will not be pursued here.

Finally, we include a complete discussion of the Smith-Minkowski-Siegel mass formula. One reason for including this is that most treatments of this topic in the literature either worked over a number field of class number 1 (usually $\mathbb{Q}$) or worked explicitly with integral matrices. As such, the lattice involved is implicitly assumed to be free. For lattices that are not free, there is a minor subtlety which we clarify in Section 10. Furthermore, we have not been able to find a reference that treats the mass formula for general types of classical groups in sufficient detail. Even in the case of orthogonal groups, these issues are completely worked out only in the recent Ph.D. thesis of Hanke [Ha]. As a consequence of the mass formula and our results on the local densities, we obtain an explicit formula for the mass of an arbitrary lattice in a quaternionic Hermitian space. This extends a recent result of Shimura, who obtained an exact formula for the mass of a particular lattice called the maximal lattice.

Received 29 September 1999. Revision received 10 February 2000.
2000 Mathematics Subject Classification. Primary 11E41, 11E95, 14L15, 20G25, 20G30; Secondary 11E12, 11E57.

2. Notation

2.1. Let $F$ be a non-Archimedean local field of residue characteristic $p \ne 2$, and let $A$ be its ring of integers. Fix a uniformizing element $\pi$ of $F$, and let $\kappa = A/\pi A$. Let $q$ be the cardinality of $\kappa$.

2.2. Let $(K, \sigma)$ be one of the following $F$-algebras with involution:
• $K = F$, $\sigma$ = identity;
• $K = E$, a quadratic extension, $\sigma$ = the unique nontrivial automorphism of $E/F$;
• $K = F \oplus F$, $\sigma(x, y) = (y, x)$;
• $K = D$, the quaternion algebra over $F$, $\sigma$ = the standard involution;
• $K = M_2(F)$, the algebra of $(2 \times 2)$-matrices, $\sigma \begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} d & -b \\ -c & a \end{pmatrix}$.

2.3. Let $B$ be a maximal $A$-order in $K$. Then $B$ is uniquely determined except in the case $K = M_2(F)$, in which we may and do assume that $B = M_2(A)$. If $K = E$ is a ramified quadratic extension of $F$, or if $K = D$, we let $\pi_K$ be a uniformizer of $K$ and put $e = 2$; in all other cases, we let $\pi_K = \pi$ and put $e = 1$.

2.4. Let $\epsilon$ be either $1$ or $-1$. The triple $(K, \sigma, \epsilon)$ will be fixed throughout this paper (at least until Section 10), and by a Hermitian form we always mean a $(\sigma, \epsilon)$-Hermitian form. We consider a $B$-lattice $L$ (i.e., a free right $B$-module of finite rank) with a Hermitian form $\langle -, - \rangle : L \times L \to B$. Our convention is
\[ \langle v \cdot a, w \cdot b \rangle = \sigma(a) \langle v, w \rangle b, \qquad \langle w, v \rangle = \epsilon\, \sigma\bigl( \langle v, w \rangle \bigr). \]
We assume that $V = L \otimes_A F$ is nondegenerate with respect to $\langle -, - \rangle$, in the sense that $\langle x, V \rangle = 0$ implies that $x = 0$. The right $B$-module $L$ is also regarded as a left $B$-module by the rule $a \cdot v = v \cdot \sigma(a)$.

2.5. Definition. The dual lattice of $L$, denoted by $L^\perp$, is
\[ L^\perp = \{ x \in L \otimes_A F : \langle x, L \rangle \subset B \}. \]
The pair $(L, \langle -, - \rangle)$ is fixed throughout this paper.

2.6. Let $G$ be the reductive algebraic group over $F$ such that
\[ G(E) = \mathrm{Aut}_{K \otimes_F E}\bigl( V \otimes_F E, \langle -, - \rangle \bigr) \]
for any commutative $F$-algebra $E$. Then $G$ is a classical group, not necessarily connected. The group $G_L = \mathrm{Aut}_B(L, \langle -, - \rangle)$ is an open compact subgroup of $G(F)$. We denote by $\mathrm{GL}_B(L)_{/A}$ the $A$-group scheme whose group of $R$-valued points is
\[ \mathrm{GL}_{B \otimes_A R}(L \otimes_A R) \]
for any commutative $A$-algebra $R$. If $B$ is commutative, this is just the Weil restriction of scalars $\mathrm{Res}_{B/A}\, \mathrm{GL}_B(L)$. We define $\mathrm{GL}_K(V)_{/F}$ in the same way.

3. The local density. In this section, we explain our viewpoint and strategy for proving the local density formula. We first recall how the local density arises, and our exposition here follows that of [S].

3.1. Let $H$ be the $F$-vector space of Hermitian forms on $V = L \otimes_A F$, and let $h_0 \in H$ be our fixed Hermitian form $\langle -, - \rangle$. Define a map $\tilde{f} : \mathrm{End}_K(V) \to H$ by $\tilde{f}(t) = h_0 \circ t$. Here, $h_0 \circ t$ is the Hermitian form $(v, w) \mapsto \langle t \cdot v, t \cdot w \rangle$. Then the inverse image of $h_0$ under $\tilde{f}$ is simply $G$.

3.2. Regarding $H$ and $M = \mathrm{End}_K(V)$ as varieties over $F$, let $\omega_H$ and $\omega_M$ be nonzero, translation-invariant volume forms on $H$ and $M$, respectively. Then one can define a volume form $\omega$ on $G$ in the following way. Let $M^* = \mathrm{GL}_K(V)_{/F}$. It is easy to see that $f = \tilde{f}|_{M^*} : M^* \to H$ is smooth of relative dimension $\dim G$. Therefore, there is an exact sequence of locally free sheaves on $M^*$:
\[ 0 \to f^* \Omega_{H/F} \to \Omega_{M^*/F} \to \Omega_{M^*/H} \to 0. \]
This gives rise to an isomorphism
\[ f^*\Bigl( \bigwedge^{\mathrm{top}} \Omega_{H/F} \Bigr) \otimes \bigwedge^{\mathrm{top}} \Omega_{M^*/H} \simeq \bigwedge^{\mathrm{top}} \Omega_{M^*/F}. \]
Let $\omega' \in \bigwedge^{\mathrm{top}} \Omega_{M^*/H}(M^*)$ be such that $f^* \omega_H \otimes \omega' = \omega_M|_{M^*}$. We then put $\omega = \omega'|_G$. It is easy to see that $\omega$ is a nonzero quasi-invariant differential on $G$, by which we mean that for any $g \in G(F)$, $g^* \omega = \chi(g) \cdot \omega$ for some rational character $\chi$ of $G$ which is trivial on $G^\circ$, the connected component of identity of $G$. Hence, $\omega$ defines a Haar measure $|\omega|$ on $G(F)$. We sometimes denote $\omega$ by $\omega_M / f^* \omega_H$.

3.3. Let $H_L$ be the set of Hermitian forms on $V$ which take values in $B$ when restricted to $L$. Then $H_L$ is an $A$-lattice in $H$, which gives $H$ an integral structure. Similarly, $\mathrm{End}_B(L)$ is an $A$-lattice in $M$, giving $M$ an integral structure. The Hermitian form $h_0$ is an element of $H_L$, and so we obtain a naive integral model $\underline{G}'$ of $G$. More precisely, for any commutative $A$-algebra $R$,
\[ \underline{G}'(R) = \mathrm{Aut}_{B \otimes_A R}\bigl( L \otimes_A R, h_0 \otimes_A R \bigr). \]

3.4. Lemma. Assume that $\int_{H_L} |\omega_H| = 1$ and $\int_{\mathrm{End}_B(L)} |\omega_M| = 1$. Put $\omega^{\mathrm{ld}} = \omega_M / f^*(\omega_H)$. Then
\[ \int_{G_L} |\omega^{\mathrm{ld}}| = \lim_{N \to \infty} q^{-N \dim G}\, \# \underline{G}'(A/\pi^N A), \]
where the limit stabilizes for $N$ sufficiently large.

This lemma is well known, at least when $(K, \epsilon) = (F, 1)$ (see, e.g., [S] or [Ha]). The general case is proved in the same way.

3.5. Definition. The local density of $(L, \langle -, - \rangle)$ is the quantity
\[ \beta_L = \frac{1}{[G : G^\circ]} \cdot \lim_{N \to \infty} q^{-N \dim G}\, \# \underline{G}'(A/\pi^N A). \]

3.6. The proof of the local density formula that one finds in the literature involves a computation of the stabilizing value of the above limit. This limit does not always stabilize at the first term because the group scheme $\underline{G}'$ is not always smooth over $A$. The starting point of our work is the observation of the following proposition.

3.7. Proposition. There exists a unique smooth affine group scheme $\underline{G}$ over $A$ such that $\underline{G}$ has generic fiber $G$ and $\underline{G}(R) = \underline{G}'(R)$ for any étale $A$-algebra $R$.

Proof. Applying the general theorem of group smoothening (see [BLR, Theorem 7.1.5]) to $\underline{G}'$, we see that there exists a smooth group scheme $\underline{G}$ over $A$ of finite type with generic fiber $G$ satisfying $\underline{G}(R) = \underline{G}'(R)$ for any étale $A$-algebra $R$. By a result of Raynaud (see [SGA3, Exp. XVII, Appendice III, Proposition 2.1(iii)]), $\underline{G}$ is affine. The uniqueness follows from [BT1, 1.7].

3.8. Remark. The proof above also applies when $p = 2$, although when $F$ is of equal characteristic 2 and $G$ is a form of odd orthogonal group, one should replace $G$ by its reduced subgroup to make sure that $G$ is smooth.
3.9. Let $\omega^{can}$ be a differential of top degree on $\underline{G}_{/A}$, which is quasi-invariant under $\underline{G}$ and which has nonzero reduction on the special fiber. Note that $\omega^{can}$ is well defined up to a unit of $A$ on each connected component of $\underline{G}$ and, hence, determines a Haar measure $|\omega^{can}|$ on $G(F)$. Now the volume of $G_L$ with respect to this Haar measure is given by
\[ \int_{G_L} |\omega^{can}| = q^{-\dim G} \cdot \#\underline{G}(A/\pi A). \]
Moreover, $\#\underline{G}(A/\pi A)$ can be easily computed once we know its maximal reductive quotient. Hence, by Lemma 3.4, in order to obtain an explicit formula for the local density, it suffices to
• determine the special fiber of $\underline{G}$, especially its maximal reductive quotient;
• relate the Haar measures $|\omega^{ld}|$ and $|\omega^{can}|$.
The abstract existence result in Proposition 3.7 does not help in the solution of the two problems above. In Section 5, we give an explicit construction of $\underline{G}$, which makes all subsequent computations possible.

4. Jordan decomposition of Hermitian modules. From now until the end of Section 7, we assume that $K \neq M_2(F)$, because the case $K = M_2(F)$ is best handled by Morita context and will be done in Section 8. Though we have restricted ourselves to the case where $A$ is a complete discrete valuation ring, the results of this and the following two sections hold more generally for $A$ a Henselian discrete valuation ring with perfect residue field.

4.1. For $a \in K^*$ such that $\sigma(a) = \varepsilon a$, we denote by $\langle a \rangle$ the rank-1 lattice $B$ equipped with the Hermitian form $(x, y) \mapsto \sigma(x) a y$. For any $a \in K^*$, we denote by $H_a$ the rank-2 lattice $B \cdot e_1 + B \cdot e_2$ with the Hermitian form such that $\langle e_1, e_1 \rangle = \langle e_2, e_2 \rangle = 0$ and $\langle e_1, e_2 \rangle = a$.

4.2. Proposition. Let $L$ be as in 2.4.
• If $(K, \varepsilon) = (F, 1)$, or $K = F \oplus F$, or $K = E$ is unramified, or $(K, \varepsilon) = (D, -1)$, $L$ is isomorphic to an orthogonal direct sum of lattices of the form $\langle a \rangle$.
• If $(K, \varepsilon) = (F, -1)$, $L$ is isomorphic to an orthogonal direct sum of lattices of the form $H_a$.
• If $K = E$ is ramified or $(K, \varepsilon) = (D, 1)$, $L$ is isomorphic to an orthogonal direct sum of lattices of the form $\langle a \rangle$ or $H_a$.

Proof. The case $K = F$ is well known, and the other cases are probably known too. Since the proofs for the various cases are similar, we only sketch the argument for $(K, \varepsilon) = (D, 1)$. Let $r = \min\{\mathrm{ord}\langle v, v \rangle : v \in L\}$ and $s = \min\{\mathrm{ord}\langle v, w \rangle : v, w \in L\}$. Then $r$ is an integer, $s$ is half an integer, and obviously $s \le r$.
Since $I = \{\langle v, w \rangle : v, w \in L\}$ is a two-sided ideal in $B$, we must have $I = \pi_K^{2s} B$. Since the trace of $\langle x, y \rangle$ is
GAN AND YU
$\langle x + y, x + y \rangle - \langle x, x \rangle - \langle y, y \rangle$, we have $\mathrm{tr}(I) \subset \pi_K^{2r} A$. This implies that $s = r$ or $s = r - 1/2$.

Suppose that $s = r$. Let $e_1 \in L$ be such that $\mathrm{ord}\langle e_1, e_1 \rangle = r$, and let $e_2, \ldots, e_n$ be such that $e_1, \ldots, e_n$ form a $B$-basis of $L$. We replace $e_i$ by $e_i - \langle e_1, e_1 \rangle^{-1} \langle e_i, e_1 \rangle e_1$ and assume that $e_1$ is perpendicular to $e_2, \ldots, e_n$. Thus, $L$ is the orthogonal direct sum of $B \cdot e_1$ and $B \cdot e_2 + \cdots + B \cdot e_n$. The result follows by induction.

Suppose that $s = r - 1/2$. Let $e_1, e_2 \in L$ be such that $s = \mathrm{ord}\langle e_1, e_2 \rangle$. We claim that we can choose $e_1$ and $e_2$ so that $\langle e_1, e_1 \rangle = \langle e_2, e_2 \rangle = 0$. Assuming this claim, an argument similar to the above shows that $B \cdot e_1 + B \cdot e_2$ is an orthogonal direct summand of $L$, and, hence, the result follows by induction.

To verify the claim, we use the following well-known presentation of $D$: $D$ is generated by $i, j$ such that $i^2 \in A^* \setminus (A^*)^2$, $j^2 = \pi$, and $ij = -ji$. By scaling $\langle -,- \rangle$ suitably, we assume that $r = 0$ and $\langle e_1, e_2 \rangle = \pi^{-1} j$. It suffices to show that for any $a, c \in A$, we can find $x \in B$ such that $a - \mathrm{tr}(\pi^{-1} x j) + c\, x \sigma(x) = 0$. We actually show that there is a solution $x$ in $A[j]$ (this fact is also needed in the case where $K = E$ is ramified). By writing $x = u + jv$, $u, v \in A$, the equation that we solve is $c(u^2 - \pi v^2) + a - 2v = 0$. This equation defines a closed subscheme of $\mathrm{Spec}\, A[u, v]$, which is smooth over $A$. Therefore, the solvability follows from Hensel's lemma.

4.3. Corollary. There exists an orthogonal decomposition $L = \bigoplus_{i \ge 0} L_i$ such that the dual lattice of $L_i$ is $\pi_K^{-i} L_i$.
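The Hensel step at the end of the proof of Proposition 4.2 can be made concrete. The following is a minimal sketch under assumptions that are not in the text: $A = \mathbb{Z}_p$ with $p$ odd, $\pi = p$, and $u = 0$, so that the equation becomes $g(v) = -c\,p\,v^2 + a - 2v = 0$. Since $g'(v) \equiv -2 \pmod p$ is a unit, Newton iteration lifts the root $v \equiv a/2 \pmod p$ to any precision. The function name `hensel_solve` is hypothetical.

```python
def hensel_solve(a, c, p, N):
    """Return v mod p**N with c*(0 - p*v*v) + a - 2*v == 0 (mod p**N), taking u = 0.

    Assumes p is an odd prime, so that g'(v) = -2*c*p*v - 2 is a unit mod p.
    Requires Python 3.8+ for pow(x, -1, m).
    """
    if p == 2:
        raise ValueError("the derivative -2 is not a unit when p = 2")
    v = (a * pow(2, -1, p)) % p          # root of a - 2v modulo p
    k = 1                                # current precision: g(v) == 0 mod p**k
    while k < N:
        k = min(2 * k, N)                # Newton iteration doubles the precision
        m = p ** k
        g = (-c * p * v * v + a - 2 * v) % m
        dg = (-2 * c * p * v - 2) % m
        v = (v - g * pow(dg, -1, m)) % m
    return v % (p ** N)


if __name__ == "__main__":
    p, N, a, c = 3, 6, 5, 7
    v = hensel_solve(a, c, p, N)
    print(v, (c * (0 - p * v * v) + a - 2 * v) % p ** N)
```

The same iteration fails for $p = 2$ because the derivative $-2$ is then not a unit, which matches the caution in 9.1 about this proof.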
5. The smooth model $\underline{G}$. In this section, we give an explicit construction of the smooth integral model $\underline{G}$, which was shown to exist in Proposition 3.7.

5.1. Let $L$ be as in 2.4, and let $L = \bigoplus L_i$ be a decomposition as in Corollary 4.3. Then $L^\perp = \bigoplus \pi_K^{-i} L_i$. Obviously, if $g \in G_L$, then $g$ stabilizes $L^\perp$. We interpret this fact in terms of matrices as follows. Let $n_i = \mathrm{rank}_B L_i$, and let $n = \mathrm{rank}_B L = \sum n_i$. Assume that $n_i = 0$ unless $0 \le i < N$. In the following, we always divide an $(n \times n)$-matrix into $(N \times N)$-blocks such that the $(i, j)$-block is of size $n_i \times n_j$. By choosing a $B$-basis for $L_i$ for each $i$, we represent an element $g$ of $\mathrm{End}_B(L)$ by such an $(N \times N)$-block matrix $m = m(g)$. If $g \in G_L$, the fact that $g$ stabilizes $L^\perp$ means that
\[ \text{the $(i, j)$-block $m_{ij}$ has entries in } \begin{cases} \pi_K^{j-i} B & \text{if } j \ge i, \\ B & \text{if } j \le i. \end{cases} \tag{$*$} \]

5.2. We first construct an affine scheme of rings $\underline{M}$ over $A$, representing matrices formally of this type. Define a functor from the category of commutative flat $A$-algebras to the category of rings as follows. For any commutative flat $A$-algebra $R$, set
\[ \underline{M}(R) = \bigl\{ t \in \mathrm{End}_{B \otimes_A R}(L \otimes_A R) : t(L^\perp \otimes_A R) \subset L^\perp \otimes_A R \bigr\}. \]
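The bookkeeping behind condition $(*)$ can be checked on a toy case. The sketch below is an illustration under simplifying assumptions that are not in the text: $B = A = \mathbb{Z}$, $\pi_K = p$, and all $n_i = 1$, so a matrix of type $(*)$ is an integer matrix whose $(i, j)$ entry is $p^{\max(0, j-i)} u_{ij}$. It verifies that such matrices are closed under multiplication and computes the coefficients of the product directly from the $u_{ij}$, using the fact that $\max(0, k-i) + \max(0, j-k) \ge \max(0, j-i)$. All function names are hypothetical.

```python
import random


def random_star_matrix(N, p, rng):
    """Return (u, m) where m[i][j] = p**max(0, j - i) * u[i][j] satisfies (*)."""
    u = [[rng.randint(-5, 5) for _ in range(N)] for _ in range(N)]
    m = [[p ** max(0, j - i) * u[i][j] for j in range(N)] for i in range(N)]
    return u, m


def matmul(x, y):
    """Ordinary product of two square integer matrices."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def satisfies_star(m, p):
    """Check that entry (i, j) is divisible by p**max(0, j - i)."""
    n = len(m)
    return all(m[i][j] % p ** max(0, j - i) == 0
               for i in range(n) for j in range(n))


def formal_product(u, v, p):
    """Coefficients u'' with m * m' = (p**max(0, j - i) * u''[i][j])."""
    n = len(u)
    return [[sum(p ** (max(0, k - i) + max(0, j - k) - max(0, j - i))
                 * u[i][k] * v[k][j] for k in range(n))
             for j in range(n)] for i in range(n)]


if __name__ == "__main__":
    rng = random.Random(0)
    u, m = random_star_matrix(3, 5, rng)
    v, m2 = random_star_matrix(3, 5, rng)
    print(satisfies_star(matmul(m, m2), 5))
```

Because the exponents in `formal_product` are nonnegative, the same rule makes sense after reduction modulo $p$, where the matrices $m$ themselves would no longer determine the coefficients.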
Then the functor $\underline{M}$ is representable by a unique flat $A$-algebra $A(\underline{M})$. We denote the spectrum of this algebra again by $\underline{M}$ (so now $\underline{M}(R)$ is defined for any $A$-algebra $R$). The representability is easily verified by thinking of elements of $\underline{M}(R)$ as matrices satisfying $(*)$ and then directly writing down the affine ring of $\underline{M}$, which is simply a polynomial ring over $A$ in $n^2 \cdot [K : F]$ variables. Moreover, it is easy to see that $\underline{M}$ has the structure of a scheme of rings, since the set of matrices of the form $(*)$ is closed under addition and multiplication.

5.3. We stress that the above description of $\underline{M}(R)$ is valid only if $R$ is a flat $A$-algebra. Now suppose that $R$ is a $\kappa$-algebra. Then, by choosing a $B$-basis for each $L_i$, we describe each element of $\underline{M}(R)$ formally as a matrix $(\pi_K^{\max(0, j-i)} u_{ij})$, where $u_{ij}$ is an $(n_i \times n_j)$-matrix with entries in $R \otimes_A B$. Moreover, the addition law on the ring $\underline{M}(R)$ is defined in the obvious way. The multiplication law, on the other hand, is given as follows. To multiply $(u_{ij})$ and $(u'_{ij})$, we form the matrices $m = (\pi_K^{\max(0, j-i)} u_{ij})$ and $m' = (\pi_K^{\max(0, j-i)} u'_{ij})$, and we write $m \cdot m' = (\pi_K^{\max(0, j-i)} u''_{ij})$. Then $(u''_{ij})$ is the product of $(u_{ij})$ and $(u'_{ij})$.

Since $\underline{M}$ is a scheme of rings, the functor $R \mapsto \underline{M}(R)^*$ is represented by a group scheme $\underline{M}^*$. It is easy to see that $\underline{M}^*$ is an open subscheme of $\underline{M}$, with generic fiber $M^* = \mathrm{GL}_K(V)_{/F}$, and $\underline{M}^*$ is smooth over $A$.

5.4. For any flat $A$-algebra $R$, let $\underline{H}(R)$ be the set of Hermitian forms $h$ on $L \otimes_A R$ (with values in $B \otimes_A R$) such that $h(L, L^\perp) \subset R \otimes_A B$. (Recall that $\langle -,- \rangle$ is fixed throughout the paper and $L^\perp$ is the dual lattice of $L$ defined in 2.5.) It is easy to see that $\underline{H}$ is represented by a flat $A$-scheme, again denoted by $\underline{H}$, which is isomorphic to an affine space of dimension $n^2 [K : F] - \dim G$. The group $\underline{M}^*(R)$ acts on the right of $\underline{H}(R)$ by $h \cdot t = h \circ t$. It is easy to see that this action is represented by an action morphism $\underline{H} \times \underline{M}^* \to \underline{H}$. Note that our fixed Hermitian form $h_0$ is an element of $\underline{H}(A)$. Now we have the following crucial result.

5.5. Theorem. Let $f$ be the morphism $\underline{M}^* \to \underline{H}$ defined by $f(t) = h_0 \circ t$. Then $f$ is smooth of relative dimension $\dim G$.

The proof of Theorem 5.5 depends on the following two lemmas.

5.5.1. Lemma. Let $S$ be a Noetherian scheme, and let $f : X \to Y$ be a morphism of $S$-schemes. Assume that both $X, Y$ are of finite type over $S$. Suppose
(i) $X$ is flat over $S$;
(ii) $f_s = f \times_S \kappa(s) : X_s \to Y_s$ is smooth for all $s \in S$.
Then $f$ is smooth.

Proof. By (ii), $f_s$ is flat for all $s \in S$. By [BLR, Proposition 2.4.2], $f$ is flat. Let
$x \in X$. Set $y = f(x)$, and let $s \in S$ be such that both $x$ and $y$ are lying above $s$. Since $f_s$ is smooth at $x$, $X_y = (X_s)_y$ is smooth over $\kappa(y)$. By [BLR, Proposition 2.4.7], this implies that $f$ is smooth at $x$.

5.5.2. Lemma. The morphism $f \otimes \kappa : \underline{M}^* \otimes \kappa \to \underline{H} \otimes \kappa$ is smooth of relative dimension $\dim G$.

Proof. It is enough to check the statement over the algebraic closure $\bar{\kappa}$ of $\kappa$. By [H, III.10.4], it suffices to show that, for any $m \in \underline{M}^*(\bar{\kappa})$, the induced map on the Zariski tangent space $f_* : T_m \to T_{f(m)}$ is surjective.

To facilitate computations, we think of elements of $\underline{M}(R)$ as in 5.3. Similarly, we think of elements of $\underline{H}(R)$ formally as $(n \times n)$-Hermitian matrices $h$ whose $(i, j)$-block is of the form $\pi_K^{\max(i, j)} h_{ij}$, where $h_{ij}$ has entries in $R \otimes_A B$. The action morphism $\underline{H} \times \underline{M}^* \to \underline{H}$ is simply $(h, m) \mapsto \sigma({}^t m) \cdot h \cdot m$, where the multiplication is to be interpreted as in 5.3.

We introduce still another functor on flat $A$-algebras. Define $\underline{M}'(R)$ to be the set of all $(n \times n)$-matrices $h$ over $R \otimes_A B$ such that the $(i, j)$-block $h_{ij}$ of $h$ has entries in $\pi_K^{\max(i, j)} R \otimes_A B$. It is easy to see that $\underline{M}'$ is represented by a flat $A$-scheme, again denoted by $\underline{M}'$. The matrix products $(m, m') \mapsto \sigma({}^t m) \cdot m'$ and $(m, m') \mapsto m' \cdot m$ induce two morphisms $\underline{M} \times_A \underline{M}' \to \underline{M}'$ of schemes over $A$.

Then we identify $T_m$ with $\underline{M}(\bar{\kappa})$ and $T_{f(m)}$ with $\underline{H}(\bar{\kappa}) \subset \underline{M}'(\bar{\kappa})$. The map $f_* : T_m \to T_{f(m)}$ is then $X \mapsto \sigma({}^t m) \cdot h_0 \cdot X + \sigma({}^t X) \cdot h_0 \cdot m$. The desired surjectivity now follows from the following three easy statements:
(1) $X \mapsto h_0 \cdot X$ is a bijection $\underline{M}(\bar{\kappa}) \to \underline{M}'(\bar{\kappa})$;
(2) $m' \mapsto \varepsilon\, \sigma({}^t m') + m'$ is a surjection $\underline{M}'(\bar{\kappa}) \to \underline{H}(\bar{\kappa})$;
(3) for any $m \in \underline{M}^*(\bar{\kappa})$, $m' \mapsto \sigma({}^t m) \cdot m'$ is a bijection from $\underline{M}'(\bar{\kappa})$ to itself.

We now give the proof of Theorem 5.5. It is clear that $\underline{M}^*$ is flat over $A$ and $f \otimes_A F$ is smooth. By Lemma 5.5.2, $f \otimes_A \kappa$ is smooth. Applying Lemma 5.5.1 with $(X, Y, S) = (\underline{M}^*, \underline{H}, \mathrm{Spec}\, A)$, the theorem follows.

5.6. Let $\underline{G}$ be the stabilizer of $h_0$ in $\underline{M}^*$. It is an affine group subscheme of $\underline{M}^*$, defined over $A$.

5.7. Theorem. The group scheme $\underline{G}$ is smooth, and $\underline{G}(R) = \mathrm{Aut}_{B \otimes_A R}(L \otimes_A R, \langle -,- \rangle)$ for any étale $A$-algebra $R$.

Proof. Regard $h_0$ as a morphism $\mathrm{Spec}\, A \to \underline{H}$. Then $\underline{G} \to \mathrm{Spec}\, A$ is simply the base change of $\underline{M}^* \to \underline{H}$ by this morphism. Therefore, the first statement follows from Theorem 5.5. The second assertion follows from the definition of $\underline{G}$.

5.8. We now give another description of $\underline{G}$, which is more concise but not as informative as the construction above. When $L \subset L^\perp \subset \pi_K^{-1} L$, $G_L$ is a maximal
parahoric subgroup of $G$ and the following result specializes to the construction in [BT2].

Proposition. Let $\rho : G \to \mathrm{GL}_K(V)_{/F} \times_F \mathrm{GL}_K(V)_{/F}$ be the direct sum of two copies of the standard representation. Then the schematical closure of $\rho(G)$ in $\mathrm{GL}_B(L)_{/A} \times_A \mathrm{GL}_B(L^\perp)_{/A}$ is isomorphic to $\underline{G}$.

Proof. Let $\underline{G}''$ be this schematical closure. Clearly, $\underline{G}''(R) = \underline{G}(R)$ for any étale algebra $R$ over $A$. Let
\[ \tilde{\rho} : \mathrm{GL}_K(V)_{/F} \to \mathrm{GL}_K(V)_{/F} \times \mathrm{GL}_K(V)_{/F} \]
be the direct sum of two copies of the identity homomorphism. Then it is easy to see that the schematical closure of $\tilde{\rho}(\mathrm{GL}_K(V)_{/F})$ in $\mathrm{GL}_B(L)_{/A} \times_A \mathrm{GL}_B(L^\perp)_{/A}$ is nothing but $\underline{M}^*$. By the definition of schematical closure, $\underline{G}''$ is flat and there is a surjection $\mathrm{pr}''$ from the affine ring $A[\underline{M}^*]$ onto $A[\underline{G}'']$. By the construction in 5.4 and 5.6, $A[\underline{M}^*]$ also maps onto $A[\underline{G}]$, via a surjection $\mathrm{pr}$, and there is a surjection $\varphi : A[\underline{G}] \to A[\underline{G}'']$ such that $\mathrm{pr}'' = \varphi \circ \mathrm{pr}$. It is clear that $\varphi \otimes F$ is the identity $F[G] \to F[G]$. Since both $\underline{G}$ and $\underline{G}''$ are flat over $A$, the maps $A[\underline{G}] \to F[G]$ and $A[\underline{G}''] \to F[G]$ are injective. Therefore, $\varphi$ is injective and, hence, bijective. The proposition is proved.

5.9. Remark. The results in this section cover the case $K = F \oplus F$. However, this case can also be dealt with separately. A free $K$-module $V$ is of the form $W \oplus W$ for some $F$-vector space $W$, and then $G$ is isomorphic to $\mathrm{GL}_F(W)$. It is easy to show that $G_L$ is the intersection of the stabilizers (in $\mathrm{GL}_F(W)$) of two lattices $M', M'' \subset W$. By the theory of elementary divisors, we assume that there is a decomposition $M' = \bigoplus M_i$ and $M'' = \bigoplus \pi^{-i} M_i$. It follows that $\underline{G}$ is simply the group scheme $\underline{M}^*$ constructed in 5.2, with $(B, V, L, L^\perp)$ replaced by $(A, W, M', M'')$.

6. The special fiber of $\underline{G}$. The purpose of this section is to determine the structure of the special fiber $\tilde{G}$ of $\underline{G}$. We keep the notation from the previous section.

6.1. For each $i$, put
\[ L^{(i)} = \sum_{j < i} \pi_K^{i-j} L_j + \sum_{j \ge i} L_j = \bigl\{ x \in L : \langle x, L \rangle \subset \pi_K^i B \bigr\}. \]
Denote by $\tilde{M}$ the special fiber of $\underline{M}^*$. Let
\[ \tilde{M}_i = \mathrm{GL}_{B/\pi_K B}\bigl(L^{(i)} / \pi_K L^{(i)}\bigr)_{/\kappa}, \]
regarded as a $\kappa$-algebraic group, as in 2.6. For any $\kappa$-algebra $R$, let $m = (\pi_K^{\max(0, j-i)} u_{ij}) \in \tilde{M}(R)$. Then $u_{ii} \in \tilde{M}_i(R)$ for all $i$. Therefore, we have a morphism of algebraic varieties $r : \tilde{M} \to \prod \tilde{M}_i$, given by $m \mapsto (u_{ii})$. It is easy to see that $r$ is a homomorphism of algebraic groups. We now have the following lemma.

Lemma. The kernel of $r$ is the unipotent radical $\tilde{M}^+$ of $\tilde{M}$, and $\prod \tilde{M}_i$ is the maximal reductive quotient of $\tilde{M}$.

6.2. Suppose that $e = 1$. Consider the $(\sigma, \varepsilon)$-Hermitian form $(x, y) \mapsto \pi^{-i} \langle x, y \rangle \bmod \pi$ on the $B/\pi B$-vector space $\tilde{V}_i = L^{(i)}/\pi L^{(i)}$. Let $\tilde{V}_i'$ be the kernel of this Hermitian form, and let $V_i = \tilde{V}_i/\tilde{V}_i'$. We define $G_i$ to be the isometry group of $V_i$ with the induced Hermitian form. We represent $\langle -,- \rangle$ by a Hermitian matrix $\mathrm{diag}(\delta_0, \pi_K \delta_1, \ldots, \pi_K^{N-1} \delta_{N-1})$. Then $\delta_i \bmod \pi$ is a $(\sigma, \varepsilon)$-Hermitian matrix over $B/\pi_K B$ representing the Hermitian form on $V_i$.

For any $\kappa$-algebra $R$ and any $m = (\pi_K^{\max(0, j-i)} u_{ij}) \in \tilde{G}(R)$, we have $u_{ii} \in G_i(R)$. Therefore, we have a homomorphism of algebraic groups $r : \tilde{G} \to \prod G_i$, $m \mapsto (u_{ii})$.

6.2.1. Proposition. The maximal reductive quotient of $\tilde{G}$ is $\prod G_i$.

Proof. The map $\tilde{G}(\bar{\kappa}) \to \prod G_i(\bar{\kappa})$ is surjective. In fact, if $\tilde{G}'(\bar{\kappa})$ is the subgroup consisting of those $m = (\pi_K^{\max(0, j-i)} u_{ij})$ such that $u_{ij} = 0$ for $i \neq j$, then $\tilde{G}'(\bar{\kappa}) \to \prod G_i(\bar{\kappa})$ is already surjective.

Since both $\tilde{G}$ and $\prod G_i$ are smooth, $r$ is a quotient map by [W, Section 15.2]. Since $\prod G_i$ is clearly reductive, it remains to show that the kernel $U$ of $r$ is unipotent and connected. Being a subgroup of $\tilde{M}^+$, $U$ is clearly unipotent.

The equations defining $\tilde{G}$ are the following:
\[ \delta_i = \sigma({}^t u_{ii})\, \delta_i\, u_{ii}, \quad 0 \le i < N, \]
\[ 0 = \sum_{i \le k \le j} \sigma({}^t u_{ki})\, \delta_k\, u_{kj}, \quad 0 \le i < j < N. \]
The equations defining $U$ are obtained by setting $u_{ii} = 1$ in the above equations. It follows easily that as an algebraic variety over $\kappa$, $U$ is isomorphic to an affine space of dimension $\sum_{i < j} n_i \cdot n_j \cdot [K : F]$. This shows that $U$ is connected and also gives a second proof of the unipotency of $U$.

6.2.2. Remark. The above equations also show that $\tilde{G}$, as an algebraic variety, is isomorphic to the direct product of $\prod G_i$ and $U$. This gives a second proof of the surjectivity of $\tilde{G}(\bar{\kappa}) \to \prod G_i(\bar{\kappa})$ and the smoothness of $\tilde{G}$. We also remark that
the smoothness of $\prod G_i$ already implies that $r$ is a quotient map, and, therefore, the smoothness of $\prod G_i$ and $U$ gives a third proof of the smoothness of $\tilde{G}$.

6.2.3. Proposition. The type of $G_i$ is as follows.
• If $(K, \varepsilon) = (F, 1)$, then
\[ G_i = \begin{cases} \mathrm{O}(n_i) & \text{if } \det(\delta_i) \ (\mathrm{mod}\ \pi) \in \kappa^{\times 2}, \\ {}^2\mathrm{O}(n_i) & \text{otherwise.} \end{cases} \]
• If $(K, \varepsilon) = (F, -1)$, then $G_i = \mathrm{Sp}(n_i)$.
• If $K/F$ is an unramified quadratic extension, then $G_i = \mathrm{U}(n_i)$.
• If $K = F \oplus F$, then $G_i = \mathrm{GL}(n_i)$.
Here, $\mathrm{O}(n_i)$ (resp., ${}^2\mathrm{O}(n_i)$) denotes the split (resp., nonsplit) orthogonal group over $\kappa$ in $n_i$ variables, $\mathrm{Sp}(n_i)$ denotes the symplectic group in $n_i$ variables, and $\mathrm{U}(n_i)$ denotes the unitary group in $n_i$ variables.

6.3. The remainder of this section is devoted to the case $e = 2$, which is somewhat more complicated than the case $e = 1$. Let $\sigma_i$ be the reduction modulo $\pi_K$ of the automorphism $x \mapsto \pi_K^{-i} \sigma(x) \pi_K^i$ of $B$. Consider $\tilde{V}_i = L^{(i)}/\pi_K L^{(i)}$. Then
\[ (x, y) \mapsto \pi_K^{-i} \langle x, y \rangle \bmod \pi_K \]
is a $(\sigma_i, (-1)^i \varepsilon)$-Hermitian form on $\tilde{V}_i$. Let $V_i$ be the maximal nondegenerate quotient of $\tilde{V}_i$ with respect to this Hermitian form. We define $G_i$ as the isometry group of $V_i$. If we represent $\langle -,- \rangle$ by a block diagonal Hermitian matrix $\mathrm{diag}(\pi_K^i \delta_i)$, then $\delta_i \bmod \pi_K$ is a $(\sigma_i, (-1)^i \varepsilon)$-Hermitian matrix over $B/\pi_K B$ representing the Hermitian form on $V_i$. Again, the map
\[ m = \bigl(\pi_K^{\max(0, j-i)} u_{ij}\bigr) \mapsto (u_{ii} \bmod \pi_K) \]
is a homomorphism of algebraic groups $r : \tilde{G} \to \prod G_i$.

6.3.1. Proposition. The group $\prod G_i$ is the maximal reductive quotient of $\tilde{G}$.

The proof of this proposition depends on a series of lemmas.

6.3.2. Lemma. If $L = L_0$ (resp., $L = L_1$), then the map $\tilde{G}(\bar{\kappa}) \to G_0(\bar{\kappa})$ (resp., $\tilde{G}(\bar{\kappa}) \to G_1(\bar{\kappa})$) is surjective.

Proof. If $L = L_0$ (resp., $L = L_1$), then $L$ is a self-dual lattice (resp., $\pi^{-1} L$ is a maximal lattice). Since $\underline{G}$ is smooth, it is the Bruhat-Tits scheme associated to a maximal parahoric subgroup (cf. [T], [BT2], and [GHY]). Using Bruhat-Tits theory
(see [BT1, Section 4.6.10] and [T, Sections 3.5.1 and 3.5.2]), one can check that $G_0$ is actually the maximal reductive quotient of $\tilde{G}$. The surjectivity statement follows.

6.3.3. Lemma. Let $1 \to X \to Y \to Z \to 1$ be an exact sequence of group schemes that are locally of finite type over $\kappa$. Suppose that $X$ is smooth, connected, and unipotent. Then $1 \to X(R) \to Y(R) \to Z(R) \to 1$ is exact for any $\kappa$-algebra $R$.

Proof. Since the group schemes are locally of finite type over $\kappa$, $1 \to X \to Y \to Z \to 1$ is an exact sequence of sheaves on the (big) fppf site of $\mathrm{Spec}\, \kappa$. Hence, it suffices to show that the pointed set $H^1_{\mathrm{fppf}}(\mathrm{Spec}\, R, X)$ is trivial for all $\kappa$-algebras $R$. Since $\kappa$ is perfect, $X$ is a split unipotent group (see [Sp, Theorem 14.3.8(iii)]); that is, $X$ is a successive extension of additive groups $\mathbb{G}_a$. Therefore, it suffices to show that $H^1_{\mathrm{fppf}}(\mathrm{Spec}\, R, \mathbb{G}_a) = 0$. This follows from [SGA4, Exp. VII, Proposition 4.3 and Remark 4.5].

6.3.4. Recall that $\tilde{M}$ is the special fiber of $\underline{M}^*$ and that $\tilde{M}^+$ is the unipotent radical of $\tilde{M}$. Notice that $\underline{M}(R)$ is a $B/\pi B$-algebra for any $\kappa$-algebra $R$. Therefore, we consider the subfunctor $\pi_K \underline{M} : R \mapsto \pi_K \underline{M}(R)$ of $\underline{M} \otimes \kappa$ and the subfunctor $\tilde{M}^1 : R \mapsto 1 + \pi_K \underline{M}(R)$ of $\tilde{M}$. Then we have the following easy lemma.

Lemma. (i) The functor $\tilde{M}^1$ is representable by a smooth, connected, unipotent group scheme over $\kappa$. Moreover, $\tilde{M}^1$ is a closed normal subgroup of $\tilde{M}^+$.
(ii) The quotient group scheme $\tilde{M}^+/\tilde{M}^1$ represents the functor $R \mapsto \tilde{M}^+(R)/\tilde{M}^1(R)$ and is smooth, connected, and unipotent.

6.3.5. From the lemma, we describe the functors of points of the schemes $\tilde{M}$, $\tilde{M}^+$, and $\tilde{M}^+/\tilde{M}^1$ as follows:
\[ \tilde{M}(R) = \bigl\{ (u_{ij}) : u_{ij} \in M_{n_i \times n_j}(R \otimes B/\pi B),\ u_{ii} \text{ is invertible for all } i \bigr\}, \]
\[ \tilde{M}^+(R) = \bigl\{ (u_{ij}) : u_{ij} \in M_{n_i \times n_j}(R \otimes B/\pi B),\ u_{ii} = 1 \bmod \pi_K \text{ for all } i \bigr\}, \]
\[ (\tilde{M}^+/\tilde{M}^1)(R) = \bigl\{ (u_{ij}) : u_{ij} \in M_{n_i \times n_j}(R \otimes B/\pi_K B),\ u_{ii} = 1 \text{ for all } i \bigr\}. \]
We remark that the above only describes the underlying schemes. The group law is to be interpreted as in 5.3: to multiply $(u_{ij})$ and $(u'_{ij})$, let $m = (\pi_K^{\max(0, j-i)} u_{ij})$, $m' = (\pi_K^{\max(0, j-i)} u'_{ij})$ and write $m \cdot m' = (\pi_K^{\max(0, j-i)} u''_{ij})$; then $(u''_{ij})$ is the product of $(u_{ij})$ and $(u'_{ij})$.

6.3.6. Recall that there is a closed immersion $\tilde{G} \to \tilde{M}$. We define $\tilde{G}^+$ and $\tilde{G}^1$ as the kernels of the compositions
\[ \tilde{G} \to \tilde{M} \to \tilde{M}/\tilde{M}^+ \quad \text{and} \quad \tilde{G} \to \tilde{M} \to \tilde{M}/\tilde{M}^1, \]
respectively. Then $\tilde{G}^1$ is the kernel of the morphism $\tilde{G}^+ \to \tilde{M}^+/\tilde{M}^1$ and, hence,
is a closed normal subgroup of $\tilde{G}^+$. The induced morphism $\tilde{G}^+/\tilde{G}^1 \to \tilde{M}^+/\tilde{M}^1$ is a monomorphism, and thus $\tilde{G}^+/\tilde{G}^1$ is a closed subgroup scheme of $\tilde{M}^+/\tilde{M}^1$ by [SGA3, Exp. VI_B, Corollary 1.4.2].

6.3.7. Lemma. (i) $\tilde{G}^1$ is connected, smooth, and unipotent.
(ii) $\tilde{G}^+/\tilde{G}^1$ is connected, smooth, and unipotent.

Proof. The equations defining $\tilde{G}$ are the following:
\[ \delta_i = \sigma_i({}^t u_{ii})\, \delta_i\, u_{ii} - \sigma_i({}^t u_{i-1,i})\, \delta_{i-1}\, \pi_K\, u_{i-1,i} + \sigma_i({}^t u_{i+1,i})\, \pi_K\, \delta_{i+1}\, u_{i+1,i}, \quad 0 \le i < N; \]
\[ 0 = \sum_{i \le k \le j} \sigma_j({}^t u_{ki})\, \pi_K^{k-j} \delta_k \pi_K^{j-k}\, u_{kj} - \sigma_j({}^t u_{i-1,i})\, \pi_K^{i-j} \delta_{i-1} \pi_K^{j-i}\, \pi_K\, u_{i-1,j} + \sigma_j({}^t u_{j+1,i})\, \pi_K\, \delta_{j+1}\, u_{j+1,j}, \quad 0 \le i < j < N. \]
The equations defining $\tilde{G}^1$ are obtained by setting
\[ u_{ii} = 1 + \pi_K u'_{ii}, \qquad u_{ij} = \pi_K u'_{ij}, \quad i \neq j \]
in the above equations. It is then easy to check that the underlying algebraic variety of $\tilde{G}^1$ is simply an affine space. This proves (i).

Now (i) and Lemma 6.3.3 imply that $\tilde{G}^+/\tilde{G}^1$ represents the functor $R \mapsto \tilde{G}^+(R)/\tilde{G}^1(R)$. If $m = (u_{ij}) \in (\tilde{M}^+/\tilde{M}^1)(R)$ is such that $m \in (\tilde{G}^+/\tilde{G}^1)(R)$, then obviously $(u_{ij})$ satisfies the following equations (which are given as equalities in $R \otimes_A B/\pi_K B$):
\[ 1 = u_{ii}, \quad 0 \le i < N, \]
\[ 0 = \sum_{i \le k \le j} \sigma_j({}^t u_{ki})\, \pi_K^{k-j} \delta_k \pi_K^{j-k}\, u_{kj}, \quad 0 \le i < j < N. \]
Let $G^\ddagger$ be the subfunctor of $\tilde{M}^+/\tilde{M}^1$ consisting of those $(u_{ij})$ satisfying the above equations. Then it is easy to check that $G^\ddagger$ is represented by a smooth, connected, closed subscheme of $\tilde{M}^+/\tilde{M}^1$ and is isomorphic to an affine space over $\kappa$.

For ease of notation, let $G^\dagger = \tilde{G}^+/\tilde{G}^1$. Since $G^\dagger$ and $G^\ddagger$ are both closed subschemes of $\tilde{M}^+/\tilde{M}^1$ and $G^\dagger(\bar{\kappa}) \subset G^\ddagger(\bar{\kappa})$, $(G^\dagger)_{\mathrm{red}}$ is a closed subscheme of $(G^\ddagger)_{\mathrm{red}} = G^\ddagger$. It is easy to check that $\dim G^\dagger = \dim G^\ddagger$. Since $G^\ddagger$ is irreducible, we must have $(G^\dagger)_{\mathrm{red}} = G^\ddagger$, and, hence, $G^\dagger = G^\ddagger$ because $G^\dagger$ is a subfunctor of $G^\ddagger$. This proves (ii).

We now prove Proposition 6.3.1. Again, if we consider the subgroup scheme $\tilde{G}'$ of $\tilde{G}$ consisting of diagonal block matrices (i.e., those $m = (\pi_K^{\max(0, j-i)} u_{ij})$ with $u_{ij} = 0$ unless $i = j$), then the map $\tilde{G}'(\bar{\kappa}) \to \prod G_i(\bar{\kappa})$ is already surjective by Lemma 6.3.2.
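Therefore, the map $\tilde{G} \to \prod G_i$ is a quotient map whose kernel is by definition $\tilde{G}^+$. By Lemma 6.3.7, $\tilde{G}^+$ is connected, smooth, and unipotent. Since $\prod G_i$ is obviously reductive, the proposition is proved completely.

6.3.8. Remark. As in 6.2.2, the above analysis can be used to give another proof of the smoothness of $\tilde{G}$ and, thus, another proof of the smoothness of $\underline{G} \to \mathrm{Spec}\, A$. We omit the details.

6.3.9. Proposition. The type of $G_i$ is as follows.
• If $K/F$ is a ramified quadratic extension and $\varepsilon = (-1)^s$, then
\[ G_i = \begin{cases} \mathrm{O}(n_i) & \text{if } i + s \text{ is even and } \det(\delta_i) \bmod \pi_K \in \kappa^{\times 2}, \\ {}^2\mathrm{O}(n_i) & \text{if } i + s \text{ is even and } \det(\delta_i) \bmod \pi_K \notin \kappa^{\times 2}, \\ \mathrm{Sp}(n_i) & \text{if } i + s \text{ is odd.} \end{cases} \]
• If $K = D$ and $\varepsilon = 1$, then
\[ G_i = \begin{cases} \mathrm{U}(n_i) & \text{if } i \text{ is even}, \\ \mathrm{Res}_{\kappa_2/\kappa}\, \mathrm{Sp}(n_i) & \text{if } i \text{ is odd.} \end{cases} \]
• If $K = D$ and $\varepsilon = -1$, then
\[ G_i = \begin{cases} \mathrm{U}(n_i) & \text{if } i \text{ is even}, \\ \mathrm{Res}_{\kappa_2/\kappa}\, \mathrm{O}(n_i) & \text{if } i \text{ is odd and } \det(\delta_i) \bmod \pi_K \in \kappa_2^{\times 2}, \\ \mathrm{Res}_{\kappa_2/\kappa}\, {}^2\mathrm{O}(n_i) & \text{if } i \text{ is odd and } \det(\delta_i) \bmod \pi_K \notin \kappa_2^{\times 2}. \end{cases} \]
Here $\kappa_2$ is the quadratic extension of $\kappa$ and $\mathrm{Res}_{\kappa_2/\kappa}$ denotes the Weil restriction of scalars.

7. Comparison of volume forms and final formulas

7.1. In the construction of 3.2, pick $\omega'_M$ and $\omega'_H$ to be such that
\[ \int_{\underline{M}(A)} |\omega'_M| = 1 \quad \text{and} \quad \int_{\underline{H}(A)} |\omega'_H| = 1. \]
Put $\omega^{can} = \omega'_M/f^*\omega'_H$. By Theorem 5.5, we have an exact sequence of locally free sheaves on $\underline{M}^*$:
\[ 0 \to f^*\Omega_{\underline{H}/A} \to \Omega_{\underline{M}^*/A} \to \Omega_{\underline{M}^*/\underline{H}} \to 0. \]
It follows that $\omega^{can}$ is of the type discussed in 3.9.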
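7.2. Lemma. Let $d = \dim_\kappa(B/\pi_K B)$. Then
\[ \omega_M = \pi^{N_E} \omega'_M, \qquad N_E = \sum_{j > i} d \cdot (j - i) \cdot n_i \cdot n_j, \]
\[ \omega_H = \pi^{N_H} \omega'_H, \qquad N_H = \sum_{j > i} d \cdot j \cdot n_i \cdot n_j + \sum_i d_i, \]
\[ \omega^{ld} = \pi^{N_E - N_H} \omega^{can}. \]
Here
\[ d_i = i \cdot (d \cdot n_i^2 - \dim G_i) \quad \text{if } e = 1. \]
If $K = E$ is a ramified quadratic extension,
\[ d_i = \begin{cases} t \cdot n_i^2 & \text{if } i = 2t \text{ is even}, \\ t \cdot n_i^2 + \dim G_i & \text{if } i = 2t + 1 \text{ is odd.} \end{cases} \]
If $K = D$,
\[ d_i = \begin{cases} t \cdot n_i (2 n_i - \varepsilon) & \text{if } i = 2t \text{ is even}, \\ t \cdot n_i (2 n_i - \varepsilon) + n_i^2 & \text{if } i = 2t + 1 \text{ is odd.} \end{cases} \]

7.3. Theorem. Let $q = \#\kappa$. The local density of $(L, \langle -,- \rangle)$ is
\[ \beta_L = \frac{1}{[G : G^\circ]}\, q^N \prod_i q^{-\dim G_i}\, \#G_i(\kappa), \]
where
\[ N = N_H - N_E = \sum_{j > i} d \cdot i \cdot n_i \cdot n_j + \sum_i d_i. \]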
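As a sanity check on Theorem 7.3, the formula can be evaluated in the simplest case and compared with a direct count. The sketch below rests on assumptions not stated in the text: $A = \mathbb{Z}_3$, $K = F$, $\varepsilon = 1$, $e = 1$, $d = 1$, rank-2 lattices with diagonal Gram matrix, $[G : G^\circ] = 2$, and the reading $d_i = i \cdot (d \cdot n_i^2 - \dim G_i)$ of Lemma 7.2. The orders $\#G_i(\kappa)$ and dimensions $\dim G_i$ are supplied by the caller, and the function names are illustrative.

```python
from fractions import Fraction
from itertools import product


def density_from_theorem(q, blocks, d=1, component_index=2):
    """Evaluate beta_L = q^N * prod_i q^(-dim G_i) #G_i(kappa) / [G : G^0] for e = 1.

    blocks[i] = (n_i, dim G_i, #G_i(kappa)) for the Jordan component L_i.
    """
    N = 0
    for i, (n_i, dim_i, _) in enumerate(blocks):
        N += i * (d * n_i * n_i - dim_i)            # d_i in the case e = 1
        for j in range(i + 1, len(blocks)):
            N += d * i * n_i * blocks[j][0]         # sum over j > i of d*i*n_i*n_j
    beta = Fraction(q) ** N
    for _, dim_i, order_i in blocks:
        beta *= Fraction(order_i, q ** dim_i)
    return beta / component_index


def density_by_counting(s1, s2, p, N, component_index=2):
    """Definition 3.5 truncated at level N for the form diag(s1, s2), dim G = 1."""
    m = p ** N
    count = sum(
        1 for a, b, c, e in product(range(m), repeat=4)
        if (s1 * a * a + s2 * c * c - s1) % m == 0
        and (s1 * a * b + s2 * c * e) % m == 0
        and (s1 * b * b + s2 * e * e - s2) % m == 0
    )
    return Fraction(count, m) / component_index


if __name__ == "__main__":
    # diag(1, 3): L_0 and L_1 both of rank 1, G_0 = G_1 = O(1) of order 2.
    print(density_from_theorem(3, [(1, 0, 2), (1, 0, 2)]),
          density_by_counting(1, 3, 3, 2))
    # diag(1, 1): L = L_0 of rank 2, G_0 an orthogonal group of order 8 over F_3.
    print(density_from_theorem(3, [(2, 1, 8)]),
          density_by_counting(1, 1, 3, 1))
```

In both examples the two values agree ($6$ for $\mathrm{diag}(1, 3)$ and $4/3$ for $\mathrm{diag}(1, 1)$), which is consistent with the exponent $N = N_H - N_E$ above.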
7.4. Remark. Though we have assumed that $n_i = 0$ for $i < 0$, it is easy to check that the formula in the preceding theorem remains true without this assumption.

8. Morita context. In this section, we use the elementary aspects of the theory of Morita context (see [J]) to reduce the case $K = M_2(F)$ to the case $K = F$. We make a slight change of notation: $\mathbf{F} = M_2(F)$, $\mathbf{A} = M_2(A)$. Let
\[ e_1 = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \qquad e_2 = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} = 1 - e_1 = \sigma(e_1). \]

8.1. Let $V$ be an $F$-vector space. Then $\mathbf{V} = V \oplus V$ is an $\mathbf{F}$-module in a natural way. The functor $V \mapsto \mathbf{V}$ is an equivalence of categories. A quasi-inverse is the
functor sending an $\mathbf{F}$-module $\mathbf{V}$ to $\mathbf{V} \cdot e_1$. The module $\mathbf{V}$ is free over $\mathbf{F}$ if and only if $\dim V$ is even.

8.2. Similarly, the category of $A$-modules is equivalent to the category of $\mathbf{A}$-modules via the functors $L \mapsto L \oplus L$ and $\mathbf{L} \mapsto \mathbf{L} \cdot e_1$. The module $\mathbf{L}$ is free over $\mathbf{A}$ if and only if $L$ is free of even rank over $A$.

8.3. Suppose that $\mathbf{V}$ is a free $\mathbf{F}$-module and $V = \mathbf{V} \cdot e_1$. The correspondence $L \leftrightarrow \mathbf{L}$ is a bijection between $A$-lattices in $V$ and $\mathbf{A}$-lattices in $\mathbf{V}$.

8.4. Let $\mathbf{V}$ be an $\mathbf{F}$-module, and let $\langle -,- \rangle$ be a sesquilinear form on $\mathbf{V}$. Let $V = \mathbf{V} \cdot e_1$, and let $\langle -,- \rangle_V$ be the restriction of $\langle -,- \rangle$ to $V$. Then $\langle -,- \rangle_V$ is an $F$-bilinear mapping with values in the 1-dimensional $F$-vector space $\sigma(e_1) \cdot \mathbf{F} \cdot e_1$. Since $\mathbf{F} = \mathbf{F} \cdot e_1 \cdot \mathbf{F}$ and $\mathbf{V} = \mathbf{V} \cdot \mathbf{F} \cdot e_1 \cdot \mathbf{F} = V \cdot \mathbf{F}$, the mapping $\langle -,- \rangle \mapsto \langle -,- \rangle_V$ from the space $\mathbf{S}$ of sesquilinear forms on $\mathbf{V}$ to the space $S$ of bilinear forms on $V$ is injective. By comparing dimensions, we see that it is bijective if $\mathbf{V}$ is free over $\mathbf{F}$.

8.5. Suppose that $\mathbf{V}$ is free over $\mathbf{F}$. Then $\langle -,- \rangle$ is $\varepsilon$-Hermitian if and only if $\langle -,- \rangle_V$ is $(-\varepsilon)$-Hermitian. Moreover, in either case, $\langle -,- \rangle$ is nondegenerate if and only if $\langle -,- \rangle_V$ is nondegenerate.

8.6. From now on, assume that $\mathbf{V}$ is free over $\mathbf{F}$ and $\langle -,- \rangle$ is $\varepsilon$-Hermitian and nondegenerate. Let $\mathbf{L}$ be an $\mathbf{A}$-lattice in $\mathbf{V}$ corresponding to the $A$-lattice $L$ in $V$. It is well known that $g \mapsto g|_V$ is an isomorphism of algebraic groups from $\mathbf{G} = \mathrm{Aut}_{\mathbf{F}}(\mathbf{V}, \langle -,- \rangle)$ to $G = \mathrm{Aut}_F(V, \langle -,- \rangle_V)$. This isomorphism induces a group isomorphism from $\mathbf{G}_{\mathbf{L}} = \mathrm{Aut}_{\mathbf{A}}(\mathbf{L}, \langle -,- \rangle)$ to $G_L = \mathrm{Aut}_A(L, \langle -,- \rangle_V)$.

8.7. Let $\boldsymbol{\omega}^{ld}$ (resp., $\omega^{ld}$) be the volume form on $\mathbf{G}$ (resp., $G$) defined in 3.2 by using the lattice $\mathbf{L}$ (resp., $L$). Then the pullback of $\omega^{ld}$ by the isomorphism $\mathbf{G} \to G$ is simply $\boldsymbol{\omega}^{ld}$. By Lemma 3.4 and the last statement of 8.6, the local density of $\mathbf{L}$ is the same as that of $L$.

9. The case $p = 2$

9.1. As mentioned in the introduction, the results of this paper hold for $F$ of residue characteristic 2 in certain cases.
One of the reasons for restricting to the case $p \neq 2$ is that the Jordan decomposition described in Section 4 is in general more complicated in the case $p = 2$. However, for lattices in symplectic spaces, Hermitian and anti-Hermitian spaces over $F \oplus F$ or an unramified quadratic extension $E$, and Hermitian spaces over $M_2(F)$ or $D$, the results of Proposition 4.2 remain true. We caution the reader that the proof for the quaternionic Hermitian case given there is only valid if $p \neq 2$.

9.2. Another reason for assuming that $p \neq 2$ is that Theorem 5.5 is not true in general when $p = 2$. Indeed, in the proof of Lemma 5.5.2, statement (2) is no longer true in general. However, for the types of spaces mentioned above, the proofs
of Theorems 5.5 and 5.7 remain valid. Furthermore, for these spaces, the isometry groups $G_i$ defined in Section 6 do not involve orthogonal groups and, hence, are smooth even in characteristic 2. Thus, the determination of the structure of the special fiber $\tilde{G}$ given in Section 6 carries through without change.

9.3. In conclusion, the local density formula in Theorem 7.3 remains valid when $p = 2$ for symplectic spaces, Hermitian and anti-Hermitian spaces over $F \oplus F$ or an unramified quadratic extension $E$, and Hermitian spaces over $M_2(F)$ or $D$.

10. The mass formula. In this section, we establish the Smith-Minkowski-Siegel mass formula, with the aim of clarifying the role played by the local densities defined in Section 3. As we are working globally, we differ from our previous notation at times.

10.1. Let $k$ be a number field of degree $d$ over $\mathbb{Q}$, with ring of integers $A$ and adele ring $\mathbb{A}$. For a place $v$ of $k$, we let $k_v$ be the corresponding completion of $k$. For a finite place $v$, $A_v$ denotes the ring of integers of $k_v$, and $q_v$ denotes the cardinality of the residue field of $A_v$. Let $S_\infty$ denote the set of Archimedean places, and let $r_1$ (resp., $r_2$) be the number of real (resp., complex) places. Hence, $r_1 + 2 r_2 = d$.

10.2. Let $K$ be one of the following $k$-algebras:
• $K = k$;
• $K = E$, a quadratic extension of $k$;
• $K = D$, a quaternion division algebra over $k$,
equipped with the obvious involution $\sigma$ as in 2.2. Let $t = [K : k]$, the dimension of $K$ as a $k$-vector space. Note that there is a trace map $\mathrm{Tr} : K \to k$, which gives the $k$-vector space $K$ a natural symmetric bilinear trace form $(x, y) \mapsto \mathrm{Tr}(x \cdot \sigma(y))$. Fix a maximal $A$-order $B$ of $K$, and let $B^\perp$ be the $A$-lattice dual to $B$ with respect to the trace form. Then set $d_{K/k} = [B^\perp : B] \in \mathbb{Z}_{>0}$. More precisely,
\[ d_{K/k} = \begin{cases} 1 & \text{if } K = k, \\ N_{K/\mathbb{Q}}(\mathcal{D}) & \text{if } K = E, \\ \prod_{v \in S_D} q_v^2 & \text{if } K = D. \end{cases} \]
Here, $\mathcal{D}$ is the different ideal of $E/k$, and $S_D$ is the set of finite places of $k$ where $D_v$ is ramified.

10.3. Consider a finite-dimensional vector space $V$ over $K$, equipped with a $(\sigma, \varepsilon)$-Hermitian form $h_0 = \langle -,- \rangle$, where $\varepsilon = \pm 1$. Let $G$ be the corresponding isometry group, which is a (possibly disconnected) reductive algebraic group over $k$. Let $L$ be a lattice in $V$, by which we mean a finitely generated projective $B$-submodule of $V$ such that $L \otimes_A k = V$, and assume that $\langle -,- \rangle$ is $B$-valued on $L$. Note that $L$ is not necessarily free.
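10.4. The genus of $L$ is indexed by the finite double coset space
\[ \Sigma = G(k) \backslash G(\mathbb{A}) / U, \]
where $U = G(k \otimes \mathbb{R}) \times \prod_{v\ \mathrm{finite}} U_v$, and $U_v$ is the stabilizer of $L_v = L \otimes_A A_v$ in $G(k_v)$. For $\alpha \in \Sigma$, represented by $g_\alpha \in G(\mathbb{A})$, let $\Gamma_\alpha = G(k) \cap g_\alpha U g_\alpha^{-1}$. Then $\Gamma_\alpha$ is an arithmetic group, and since the connected center of $G$ is an anisotropic torus over $k$, $\Gamma_\alpha \backslash G(k \otimes \mathbb{R})$ is of finite volume with respect to any Haar measure of $G(k \otimes \mathbb{R})$.

10.5. If $k$ is totally real and $\langle -,- \rangle$ is totally definite, then $\Gamma_\alpha$ is finite, and, following Eisenstein, the mass of $L$ is defined to be
\[ \mathrm{Mass}(L) = \sum_{\alpha \in \Sigma} \frac{1}{\#\Gamma_\alpha}. \]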
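In the definite case the mass in 10.5 is a finite sum that can be computed by hand for very small lattices. The following sketch is an illustration only, under the assumptions $k = \mathbb{Q}$, $K = k$, $\varepsilon = 1$, and $L = \mathbb{Z}^n$ with the standard form; it also assumes the classical fact that for $n \le 7$ the genus of this lattice consists of a single class, so the sum has one term. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import product


def automorphism_count(n):
    """Count integer matrices g with g^T g = I_n, the automorphisms of Z^n.

    Each column of g has norm 1, so all entries lie in {-1, 0, 1} and the
    search below is exhaustive. Practical only for very small n.
    """
    count = 0
    for entries in product((-1, 0, 1), repeat=n * n):
        g = [entries[r * n:(r + 1) * n] for r in range(n)]
        # Check that the columns of g are orthonormal.
        if all(sum(g[r][i] * g[r][j] for r in range(n)) == (1 if i == j else 0)
               for i in range(n) for j in range(i, n)):
            count += 1
    return count


def standard_lattice_mass(n):
    """Mass of Z^n, assuming its genus has a single class (true for n <= 7)."""
    return Fraction(1, automorphism_count(n))


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, automorphism_count(n), standard_lattice_mass(n))
```

The counts are the orders of the groups of signed permutation matrices, so the masses obtained this way are $1/2$, $1/8$, and $1/48$ for $n = 1, 2, 3$.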
In general, the definition of the mass of $L$ (due to Siegel [Si]) depends on the choice of a Haar measure on the real Lie group $G(k \otimes \mathbb{R})$. There seem to be two natural choices for this. A reductive algebraic group over $\mathbb{R}$ has a unique compact form and a unique split form, and each of these possesses a natural Haar measure. For the compact form, the natural Haar measure is the one that gives the group volume 1. For the split form, it is the one determined by an invariant differential $\omega_0$ of top degree on the Chevalley model over $\mathbb{Z}$. One can then transfer these Haar measures to any other form of the group, as in [GrG]. This gives two natural Haar measures on $G(k \otimes \mathbb{R})$, denoted by $|\omega_c|$ and $|\omega_0|$, respectively.

10.6. The relation between these two measures can be found in [Gr, Section 7]. More precisely,
\[ |\omega_0| = \lambda^d \cdot [G : G^\circ]^{r_1 + r_2} \cdot |\omega_c|, \]
where
\[ \lambda = \prod_i \frac{(2\pi)^{d_i}}{(d_i - 1)!}, \]
and the $d_i$'s run over the degrees of $G$. The Haar measure we use on $G(k \otimes \mathbb{R})$ is the one coming from the compact form. Hence, the mass of $L$ is, by definition,
\[ \mathrm{Mass}(L) = \sum_{\alpha \in \Sigma} \int_{\Gamma_\alpha \backslash G(k \otimes \mathbb{R})} |\omega_c|. \]
Note that if $k$ is totally real and $\langle -,- \rangle$ is totally definite, so that $G(k \otimes \mathbb{R})$ is compact, this agrees with the previous definition, which explains our choice. It is interesting
to note that in [Si], the mass of $L$ was defined using a Haar measure, which is $|\omega_0|$ multiplied by a precise power of 2.

10.7. Let $\omega$ be any invariant differential of top degree on $G_{/k}$. Then $\omega$ gives rise to a Haar measure $|\omega|_v$ of $G(k_v)$ for each place $v$, and we let $|\omega|_\infty$ denote the measure $\otimes_{v \in S_\infty} |\omega|_v$ on $G(k \otimes \mathbb{R})$. The Tamagawa measure of $G$ is the measure
\[ |\omega|_{\mathbb{A}} = d_k^{-\dim(G)/2} \cdot \prod_v \frac{1}{[G : G^\circ]} \cdot |\omega|_v \]
on $G(\mathbb{A})$, where $d_k$ is the discriminant of $k$ over $\mathbb{Q}$. Actually, this is correct only when $G$ is semisimple. When $G$ is not semisimple, which occurs when $K = E$ or when $V$ is a 2-dimensional quadratic space, one should put in certain convergence factors. However, since the connected center of $G$ is anisotropic over $k$, the product
\[ \prod_{v\ \mathrm{finite}} \frac{1}{[G : G^\circ]} \int_{U_v} |\omega|_v \]
is conditionally convergent for any open compact subgroup $U = \prod_{v\ \mathrm{finite}} U_v$, so that the measure defined above makes sense and agrees with the one defined using suitable convergence factors. (For more discussion on these issues, we refer the reader to [S].) Let
\[ \tau(G) = \int_{G(k) \backslash G(\mathbb{A})} |\omega|_{\mathbb{A}} \]
be the Tamagawa number of $G$.

10.8. Recall from Section 3 that, for each finite place $v$, the lattice $L_v$ determines a Haar measure $|\omega^{ld}_{L_v}|$, which gives rise to a local density $\beta_{L_v}$ as defined in 3.5. The mass formula relates $\mathrm{Mass}(L)$ to the quantities $\beta_{L_v}$. Indeed, a standard and formal computation gives
\[ \mathrm{Mass}(L) = c(L) \cdot \frac{\tau(G) \cdot d_k^{\dim(G)/2}}{\prod_{v\ \mathrm{finite}} \beta_{L_v}}, \]
where
\[ c(L) = [G : G^\circ]^{r_1 + r_2} \cdot \frac{|\omega_c|}{|\omega|_\infty} \cdot \prod_{v\ \mathrm{finite}} \frac{|\omega^{ld}_{L_v}|}{|\omega|_v} = \lambda^{-d} \cdot \frac{|\omega_0|}{|\omega|_\infty} \cdot \prod_{v\ \mathrm{finite}} \frac{|\omega^{ld}_{L_v}|}{|\omega|_v}. \]
Note that all but finitely many terms of the above product are 1, and the product of the local densities $\beta_{L_v}$ is conditionally convergent only when $G$ is not semisimple.

10.9. Hence, the computation of $\mathrm{Mass}(L)$ is reduced to the computations of the local densities $\beta_{L_v}$, provided that $\tau(G)$ and $c(L)$ can be given explicitly; this is the content of the mass formula. The determination of the Tamagawa number $\tau(G)$ is a deep problem that was solved in complete generality only fairly recently. For the
groups we are dealing with, however, the value of $\tau(G)$ has been known for a while from the work of Weil and is given by
$$\tau(G) = \begin{cases} 1 & \text{if } K = k \text{ or } D, \\ 2 & \text{if } K = E. \end{cases}$$
The explicit determination of $c(L)$, on the other hand, is an easier problem and is the main purpose of this section.

10.10. In the classical case of quadratic forms, if the lattice $L$ is free, one can choose a global invariant differential $\omega$ such that $|\omega|_v = |\omega^{ld}_{L_v}|$ for every finite place $v$. In this case, $c(L)$ is interpreted as an Archimedean local density. If $L$ is not free, we do not have such an interpretation, nor do we see any reason for the existence of an $\omega$ as above. The computation of $c(L)$ in the case of quadratic forms is essentially the content of [Ha]. In the following, we determine $c(L)$ in general. One begins with the following elementary observation.

10.11. Lemma. Let $n = \dim_K V$, and let $d(L) = [L^\perp : L]$ be the discriminant of $L$. Then the quantity $\beta = c(L)/d(L)^{n(V)/2}$ is independent of $L$, where
$$n(V) = \frac{2 \dim_k H}{n \cdot t} = \begin{cases} n + \epsilon & \text{if } K = k, \\ n & \text{if } K = E, \\ n - \dfrac{\epsilon}{2} & \text{if } K = D. \end{cases}$$
Here, $H$ is the $k$-vector space of $(\sigma, \epsilon)$-Hermitian forms on $V$.

10.12. Now we come to the computation of $c(L)$. We first introduce some notation. Let $K^\epsilon = \{u \in K : \sigma(u) = \epsilon u\}$ be the $\epsilon$-eigenspace of $\sigma$, and let $t^\epsilon = \dim_k K^\epsilon$. Henceforth, fix a basis
$$\mathcal{E} = \{e_1, \dots, e_n\} \text{ of } V \text{ over } K, \qquad \mathcal{U} = \{u_1, \dots, u_t\} \text{ of } K \text{ over } k, \qquad \mathcal{V} = \{v_1, \dots, v_{t^\epsilon}\} \text{ of } K^\epsilon \text{ over } k.$$
Let $\mathcal{E}^* = \{e_1^*, \dots, e_n^*\}$ be the basis of the dual vector space $V^*$ which is dual to $\mathcal{E}$ and so on. Let $m^*_{ij} : M = V \otimes_K V^* \to K$ be the function $e_i^* \otimes e_j$. Similarly, for $i \neq j$ (resp., $i = j$), we have the function $h_{ij} : H \to K$ (resp., $H \to K^\epsilon$), given by $h \mapsto h(e_i, e_j)$ for $h \in H$.

10.13. Recall that we have a smooth morphism $f : M^* \to H$ of varieties over $k$, given by $t \mapsto h_0 \circ t$, and $G = f^{-1}(h_0)$. Note that $M^*$ refers to the open subvariety of invertible elements in $M$, and not the dual vector space. Consider the following
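differential forms on $M$ and $H$, respectively:
$$\omega_{M,\mathcal{E},\mathcal{U}} = \bigwedge_{i,j} \omega_{M,i,j}, \qquad \omega_{M,i,j} = \bigwedge_{k=1}^{t} d\bigl(u_k^* \circ m^*_{ij}\bigr);$$
$$\omega_{H,\mathcal{E},\mathcal{U},\mathcal{V}} = \bigwedge_{i \geq j} \omega_{H,i,j}, \qquad \omega_{H,i,j} = \begin{cases} \bigwedge_{k=1}^{t} d\bigl(u_k^* \circ h_{ij}\bigr) & \text{if } i \neq j, \\ \bigwedge_{k=1}^{t^\epsilon} d\bigl(v_k^* \circ h_{ii}\bigr) & \text{if } i = j. \end{cases}$$
Then $\omega_{\mathcal{E},\mathcal{U},\mathcal{V}} = \omega_{M,\mathcal{E},\mathcal{U}} / f^* \omega_{H,\mathcal{E},\mathcal{U},\mathcal{V}}$ is a differential of top degree on $G$, invariant under $G^\circ$. Now we have the following crucial lemma.

10.14. Lemma. Suppose that $g \in \mathrm{Aut}_K(V)$, $g' \in \mathrm{Aut}_k(K)$, and $g'' \in \mathrm{Aut}_k(K^\epsilon)$. Let $g \cdot \mathcal{E}$ denote the basis $\{ge_1, ge_2, \dots, ge_n\}$ and so on. Then
$$\frac{\omega_{g \cdot \mathcal{E},\, g' \cdot \mathcal{U},\, g'' \cdot \mathcal{V}}}{\omega_{\mathcal{E},\mathcal{U},\mathcal{V}}} = \mathrm{Nrd}(g)^{-n(V)} \cdot \det(g')^{-n(n+1)/2} \cdot \det(g'')^{n},$$
where $\mathrm{Nrd} : \mathrm{Aut}_K(V) \to k^\times$ is the reduced norm map.

10.15. Note that the above discussion can be carried out over any local field $k_v$. Suppose in particular that we are working over a non-Archimedean $k_v$ and that $\mathcal{E}_{L_v}$ is a basis of the lattice $L_v$ over $B_v$, $\mathcal{U}_{B_v}$ is a basis of $B_v$ over $A_v$, and $\mathcal{V}_{B_v}$ is a basis of $B_v^\epsilon$ over $A_v$. Then we have
$$\omega_{\mathcal{E}_{L_v},\, \mathcal{U}_{B_v},\, \mathcal{V}_{B_v}} = \omega^{ld}_{L_v}.$$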
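This, together with Lemma 10.14, implies that
$$c(L) = \lambda^{-d} \cdot \frac{|\omega_0|}{|\omega_{\mathcal{E},\mathcal{U},\mathcal{V}}|_\infty} \cdot [B(\mathcal{E}) : L]^{n(V)} \cdot [A(\mathcal{U}) : B]^{n(n+1)/2} \cdot [A(\mathcal{V}) : B^\epsilon]^{-n},$$
where $B(\mathcal{E})$ is the $B$-lattice generated by $\mathcal{E}$, and $A(\mathcal{U})$ (resp., $A(\mathcal{V})$) is the $A$-lattice generated by $\mathcal{U}$ (resp., $\mathcal{V}$). In particular, we now need to determine the ratio $|\omega_0| / |\omega_{\mathcal{E},\mathcal{U},\mathcal{V}}|_\infty$, which is a purely Archimedean computation.

10.16. Let $\mathcal{U}_\infty = \{u_{\infty,1}, \dots, u_{\infty,t}\}$ be a basis of $K \otimes \mathbb{R}$ over $k \otimes \mathbb{R}$ such that
$$\det\bigl(\mathrm{Tr}(u_{\infty,i} \cdot \sigma(u_{\infty,j}))\bigr) = (\pm 1)^{r_1 + r_2} \in k \otimes \mathbb{R}.$$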
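Then $\bigwedge_{i=1}^{t} du^*_{\infty,i}$ induces a measure on $K \otimes \mathbb{R}$, which we call the standard measure. Further,
$$\frac{\bigl|\bigwedge_{i=1}^{t} du_i^*\bigr|_\infty}{\bigl|\bigwedge_{i=1}^{t} du^*_{\infty,i}\bigr|_\infty} = [A(\mathcal{U}) : B] \cdot d_{K/k}^{-1/2}.$$
Together with Lemma 10.14, this implies that
$$c(L) = \lambda^{-d} \cdot \frac{|\omega_0|}{|\omega_{\mathcal{E},\mathcal{U}_\infty,\mathcal{V}}|_\infty} \cdot [B(\mathcal{E}) : L]^{n(V)} \cdot [A(\mathcal{V}) : B^\epsilon]^{-n} \cdot d_{K/k}^{n(n+1)/4}.$$

10.17. Let $\mathcal{E}_\infty = \{e_{\infty,1}, \dots, e_{\infty,n}\}$ be a basis of $V \otimes \mathbb{R}$ over $K \otimes \mathbb{R}$ such that
$$\mathrm{Nrd}\bigl(\langle e_{\infty,i}, e_{\infty,j} \rangle\bigr) = (\pm 1)^{r_1 + r_2} \in k \otimes \mathbb{R}.$$
Then again by Lemma 10.14, we have
$$c(L) = \lambda^{-d} \cdot \frac{|\omega_0|}{|\omega_{\mathcal{E}_\infty,\mathcal{U}_\infty,\mathcal{V}}|_\infty} \cdot [A(\mathcal{V}) : B^\epsilon]^{-n} \cdot d(L)^{n(V)/2} \cdot d_{K/k}^{n(n+1)/4}.$$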
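10.18. We now choose a basis $\mathcal{V}_\infty$ for $(K \otimes \mathbb{R})^\epsilon$ over $k \otimes \mathbb{R}$. If $\epsilon = 1$, then $(K \otimes \mathbb{R})^\epsilon = k \otimes \mathbb{R}$, and we let $\mathcal{V}_\infty = \{1\}$. If $\epsilon = -1$, consider the exact sequence
$$0 \longrightarrow K^\epsilon \longrightarrow K \longrightarrow k \longrightarrow 0,$$
given by the trace map $\mathrm{Tr} : K \to k$. We have already defined standard measures on $K \otimes \mathbb{R}$ and $k \otimes \mathbb{R}$. These induce a standard measure on $(K \otimes \mathbb{R})^\epsilon$, and we let $\mathcal{V}_\infty = \{v_{\infty,1}, \dots, v_{\infty,t^\epsilon}\}$ be a basis such that $\bigwedge_{i=1}^{t^\epsilon} dv^*_{\infty,i}$ induces this standard measure. Then we have
$$\frac{\bigl|\bigwedge_{i=1}^{t^\epsilon} dv_i^*\bigr|_\infty}{\bigl|\bigwedge_{i=1}^{t^\epsilon} dv^*_{\infty,i}\bigr|_\infty} = [A(\mathcal{V}) : B^\epsilon] \cdot \delta_{K,\epsilon},$$
where
$$\delta_{K,\epsilon} = \begin{cases} 1 & \text{if } \epsilon = 1 \text{ or } K = k, \\ d_{K/k}^{-1/2} \cdot [A : \mathrm{Tr}(B)] & \text{if } K = E \text{ and } \epsilon = -1, \\ d_{K/k}^{-1/2} & \text{if } K = D \text{ and } \epsilon = -1. \end{cases}$$
Again by Lemma 10.14, this implies that
$$c(L) = \lambda^{-d} \cdot \frac{|\omega_0|}{|\omega_{\mathcal{E}_\infty,\mathcal{U}_\infty,\mathcal{V}_\infty}|_\infty} \cdot d(L)^{n(V)/2} \cdot d_{K/k}^{n(n+1)/4} \cdot \delta_{K,\epsilon}^{-n}.$$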
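Note that
$$[A : \mathrm{Tr}(B)] = \prod_{v \mid 2} [A_v : \mathrm{Tr}(B_v)],$$
and the local factors $[A_v : \mathrm{Tr}(B_v)]$ can be explicitly given as
$$[A_v : \mathrm{Tr}(B_v)] = \begin{cases} 1 & \text{if } v \text{ is split or unramified in } K_v, \\ q_v^{[d_v/2]} & \text{if } v \text{ is ramified in } K_v, \end{cases}$$
where in the second case $(\pi_{K_v}^{d_v})$ is the different ideal of $K_v/k_v$. We are finally reduced to the computation of the number $\mu_0 = |\omega_0| / |\omega_{\mathcal{E}_\infty,\mathcal{U}_\infty,\mathcal{V}_\infty}|_\infty$. Notice that this number depends only on the triple $(n, \epsilon, (K \otimes \mathbb{R})/(k \otimes \mathbb{R}))$. The following proposition shows that, in fact, it depends only on $(n, \epsilon, (K \otimes \mathbb{C})/(k \otimes \mathbb{C}))$.

10.19. Proposition. We have the equality $\mu_0 = \mu^d$, where $\mu$ depends only on the triple $(n, t, \epsilon)$ and is given by
$$\mu = \begin{cases} 2^{n} & \text{if } t = 1,\ \epsilon = 1, \text{ and } n \text{ is even;} \\ 2^{(n+1)/2} & \text{if } t = 1,\ \epsilon = 1, \text{ and } n \text{ is odd;} \\ 2^{2n} & \text{if } t = 4 \text{ and } \epsilon = -1; \\ 1 & \text{otherwise.} \end{cases}$$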
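For readers who want to evaluate the constant numerically, the case distinction of the proposition can be transcribed as a short Python sketch. This is only an illustration: the function names `mu` and `mu0` are hypothetical and not from the paper, and the cases are taken as read from the statement above, with $t = \dim_k K \in \{1, 2, 4\}$ and $\epsilon = \pm 1$.

```python
def mu(n: int, t: int, eps: int) -> int:
    """Constant mu of Proposition 10.19 as a function of (n, t, eps).

    Assumed conventions (not code from the paper):
    t = dim_k K is 1, 2 or 4, and eps is +1 or -1.
    """
    if n < 1 or t not in (1, 2, 4) or eps not in (1, -1):
        raise ValueError("expected n >= 1, t in {1, 2, 4}, eps in {1, -1}")
    if t == 1 and eps == 1:
        # Orthogonal case: 2^n for even n, 2^((n+1)/2) for odd n.
        return 2 ** n if n % 2 == 0 else 2 ** ((n + 1) // 2)
    if t == 4 and eps == -1:
        return 2 ** (2 * n)
    return 1


def mu0(n: int, t: int, eps: int, d: int) -> int:
    """mu_0 = mu^d, where d is the degree of k over Q."""
    return mu(n, t, eps) ** d
```

For example, `mu(4, 1, 1)` evaluates the even orthogonal case and `mu(3, 2, 1)` falls under "otherwise".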
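Proof. Write $k \otimes \mathbb{R} = \prod_{v \in S_\infty} k_v$. Then it follows by definition that as Haar measures on $G(k \otimes \mathbb{R}) = \prod_{v \in S_\infty} G(k_v)$,
$$|\omega_0| = \bigotimes_{v \in S_\infty} |\omega_{0,v}|,$$
where $\omega_{0,v}$ is an invariant differential on $G_v = G \times_k k_v$ obtained from the Chevalley model over $\mathbb{Z}$. Note that under any isomorphism $\varphi : G_v \times_{k_v} \mathbb{C} \to G_{v'} \times_{k_{v'}} \mathbb{C}$, we have $\varphi^*(\omega_{0,v'}) = \pm \omega_{0,v}$. Each element $e_{\infty,i}$ of the basis $\mathcal{E}_\infty$ is an $(r_1 + r_2)$-tuple $(e_{\infty,i,v}) \in \prod_{v \in S_\infty} V \otimes_k k_v$, where, for each $v \in S_\infty$, $\mathcal{E}_v = \{e_{\infty,1,v}, \dots, e_{\infty,n,v}\}$ is a basis of $V \otimes_k k_v$, with the property that $\mathrm{Nrd}_v\bigl(\langle e_{\infty,i,v}, e_{\infty,j,v} \rangle\bigr) = \pm 1$. The analogous statement is true of the bases $\mathcal{U}_\infty$ and $\mathcal{V}_\infty$. Then
$$|\omega_{\mathcal{E}_\infty,\mathcal{U}_\infty,\mathcal{V}_\infty}|_\infty = \bigotimes_{v \in S_\infty} |\omega_{\mathcal{E}_v,\mathcal{U}_v,\mathcal{V}_v}|_v$$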
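as Haar measures on $G(k \otimes \mathbb{R}) = \prod_{v \in S_\infty} G(k_v)$. Now the first statement of the proposition follows from the observation that under any isomorphism $\varphi : G_v \times_{k_v} \mathbb{C} \to G_{v'} \times_{k_{v'}} \mathbb{C}$,
$$\bigl|\varphi^*\bigl(\omega_{\mathcal{E}_{v'},\mathcal{U}_{v'},\mathcal{V}_{v'}}\bigr)\bigr| = \bigl|\omega_{\mathcal{E}_v,\mathcal{U}_v,\mathcal{V}_v}\bigr|,$$
as Haar measures on $G_v(\mathbb{C})$. In particular, $\mu_0 = \mu^d$, with $\mu$ equal to the absolute value of the complex number $\omega_{0,v} / \omega_{\mathcal{E}_v,\mathcal{U}_v,\mathcal{V}_v}$, which is independent of $v \in S_\infty$.

To compute $\mu$, it suffices to work over $\mathbb{C}$. Hence, let $\mathcal{E}$, $\mathcal{U}$, and $\mathcal{V}$ be bases of $V$, $K$, and $K^\epsilon$ chosen as before, where now $K$ is $\mathbb{C}$, $\mathbb{C} \times \mathbb{C}$, or $M_2(\mathbb{C})$. To compare the invariant differentials $\omega_0$ and $\omega = \omega_{\mathcal{E},\mathcal{U},\mathcal{V}}$ on $G$, it suffices to compare them on the tangent space of the identity element $e$ of $G$. Let $df : T_e M^* = M \to T_{h_0} H = H$ be the map on tangent spaces induced by $f$. Then $\omega = \omega_{M,\mathcal{E},\mathcal{U}} / (df)^*(\omega_{H,\mathcal{E},\mathcal{U},\mathcal{V}})$ is a differential on the kernel of $df$.

We treat the case when $G$ is the orthogonal group in some detail. If $n$ is even, we choose the basis $\mathcal{E}$ of $V$ such that the form $h_0$ is represented by the matrix $A$, where $A_{ij} = \delta_{i,n+1-j}$. With respect to this basis, one identifies $M$ and $H$ with the space $M_n(\mathbb{C})$ of $(n \times n)$-matrices and the space $\mathrm{Sym}_n(\mathbb{C})$ of symmetric matrices, respectively. The differentials $\omega_{M,\mathcal{E},\mathcal{U}}$ and $\omega_{H,\mathcal{E},\mathcal{U},\mathcal{V}}$ are then the standard ones on $M_n(\mathbb{C})$ and $\mathrm{Sym}_n(\mathbb{C})$, and the map $df$ is given by
$$X \longmapsto {}^{t}XA + AX.$$
From this, one can write down $\omega$ explicitly. On the other hand, a Chevalley basis of $\mathrm{Lie}(G) = \mathrm{Ker}(df)$ was given explicitly by Bourbaki [B, Chapter VIII, Section 13.4, p. 211], which allows one to write down $\omega_0$. Comparing, one finds that
$$\omega = \frac{1}{2^{n}} \cdot \omega_0,$$
which is the result sought for in this case.

The case for odd $n$ is slightly more subtle. Let $\mathcal{E}'$ be a basis of $V$ such that $h_0$ is represented by the matrix $A'$, with
$$A'_{ij} = \begin{cases} 2 & \text{if } i = j = \dfrac{n+1}{2}, \\ \delta_{i,n+1-j} & \text{otherwise.} \end{cases}$$
Using $\mathcal{E}'$, Bourbaki described in [B, Chapter VIII, Section 13.2, pp. 199–200] an explicit Chevalley basis on $\mathrm{Ker}(df)$, from which a direct computation gives
$$\omega_{\mathcal{E}'} = \frac{1}{2^{n+1}} \cdot \omega_0.$$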
On the other hand, by Lemma 10.14,
$$\omega_{\mathcal{E}'} = \frac{1}{2^{(n+1)/2}} \cdot \omega.$$
Hence, the result follows in this case. The remaining cases, when $K = k$ or $E$, follow by a similar computation, using the results in [B, Chapter VIII, Section 13], and are more straightforward than the case of orthogonal groups. The cases when $K = D$ then follow by Morita context, as in Section 8. The proposition is proved.

We summarize the above discussion in the following theorem.

10.20. Theorem.
$$\mathrm{Mass}(L) = c(L) \cdot \frac{\tau(G) \cdot d_k^{\dim(G)/2}}{\prod_{v\ \mathrm{finite}} \beta_{L_v}},$$
where
$$c(L) = \bigl(\lambda^{-1} \mu\bigr)^{d} \cdot [L^\perp : L]^{n(V)/2} \cdot d_{K/k}^{n(n+1)/4} \cdot \delta_{K,\epsilon}^{-n}.$$
Here, $\tau(G)$ is given in 10.9, $\lambda$ is given in 10.6, $\mu$ is given in Proposition 10.19, $n(V)$ is given in Lemma 10.11, $d_{K/k}$ is given in 10.2, and $\delta_{K,\epsilon}$ is given in 10.18.

11. An example: quaternionic Hermitian spaces. As a result of Theorem 10.20, Theorem 7.3, and the remarks in Section 9, we give an exact formula for the mass of an arbitrary lattice in a quaternionic Hermitian space $V$ over any number field $k$. In this section, we write down the various quantities that appear in Theorem 10.20 as explicitly as possible.

11.1. Let $S_D$ be the finite set of finite places of $k$ for which $D_v$ is ramified. The group $G$ is a form of the symplectic group in $2n$ variables. Hence, we have
$$\tau(G) = 1, \qquad \lambda^{-1} = \prod_{j=1}^{n} \frac{(2j-1)!}{(2\pi)^{2j}}, \qquad d_{K/k} = \prod_{v \in S_D} q_v^{2}, \qquad \mu = 1, \qquad \delta_{K,\epsilon} = 1.$$
Therefore, the mass of a lattice $L$ is given by
$$\mathrm{Mass}(L) = d_k^{n(n+1/2)} \cdot \Bigl( \prod_{j=1}^{n} \frac{(2j-1)!}{(2\pi)^{2j}} \Bigr)^{d} \cdot \prod_{v \in S_D} q_v^{n(n+1)/2} \cdot [L^\perp : L]^{(n-1/2)/2} \cdot \prod_{v\ \mathrm{finite}} \beta_{L_v}^{-1},$$
where $\beta_{L_v}$ is given explicitly by Theorem 7.3 for all non-Archimedean $v$.
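The shape of this formula is perhaps easiest to see when its factors are assembled numerically. The following Python sketch does that under stated assumptions: the function and parameter names are hypothetical, floating-point arithmetic is used throughout, and the infinite product of inverse local densities is not computed here but must be supplied by the caller.

```python
import math


def archimedean_factor(n: int, d: int) -> float:
    """(prod_{j=1}^{n} (2j-1)! / (2*pi)^(2j))^d, the factor lambda^{-d} of 11.1."""
    prod = 1.0
    for j in range(1, n + 1):
        prod *= math.factorial(2 * j - 1) / (2 * math.pi) ** (2 * j)
    return prod ** d


def mass_quaternionic(n, d, d_k, q_ramified, dual_index, inv_density_product):
    """Assemble Mass(L) from the factors of the formula in 11.1.

    n                   -- rank of V over the quaternion algebra
    d                   -- degree of k over Q
    d_k                 -- discriminant of k
    q_ramified          -- residue field sizes q_v for v in S_D
    dual_index          -- the index [L^perp : L]
    inv_density_product -- product over finite v of 1/beta_{L_v},
                           evaluated separately by the caller
    """
    disc = d_k ** (n * (n + 0.5))
    ram = 1.0
    for q in q_ramified:
        ram *= q ** (n * (n + 1) / 2)
    index_factor = dual_index ** ((n - 0.5) / 2)
    return disc * archimedean_factor(n, d) * ram * index_factor * inv_density_product
```

As a usage example, taking $k = \mathbb{Q}$, $n = 1$, a single ramified prime $p$, `dual_index = 1`, and `inv_density_product` equal to $\zeta(2)(1 - 1/p) = (\pi^2/6)(1 - 1/p)$, the function returns $(p - 1)/24$ up to rounding, the value familiar from Eichler's mass formula for definite quaternion algebras over $\mathbb{Q}$.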
11.2. As a check for the above result, we consider the case when $k$ is totally real, $D$ is ramified at all real places of $k$, the Hermitian space $V$ is totally definite, and $L$ is a maximal lattice in $V$. The mass of such an $L$ was recently obtained by Shimura [Sh]. In the terminology of [Sh], $L$ is maximal with respect to the property that $\langle -, - \rangle$ is $B$-valued on $L$. For each finite place $v$ of $k$, $L_v$ is self-dual, and, hence, $[L^\perp : L] = 1$. Further,
$$\beta_v = \begin{cases} \prod_{j=1}^{n} \bigl(1 - q_v^{-2j}\bigr) & \text{if } v \notin S_D, \\ \prod_{j=1}^{n} \bigl(1 - (-q_v)^{-j}\bigr) & \text{if } v \in S_D. \end{cases}$$
As a result,
$$\mathrm{Mass}(L) = d_k^{n(n+1/2)} \cdot \prod_{j=1}^{n} \Bigl[ \Bigl( \frac{(2j-1)!}{(2\pi)^{2j}} \Bigr)^{d} \zeta_k(2j) \cdot \prod_{v \in S_D} \bigl( q_v^{j} + (-1)^{j} \bigr) \Bigr],$$
which agrees with the formula in [Sh].

11.3. On the other hand, one can consider a different notion of maximality. Let $L$ be a lattice such that $L^\perp$ is maximal with respect to the property that $\langle x, x \rangle \in A$ for all $x \in L^\perp$. The mass of such a lattice $L$ was obtained in [GHY, Section 9] from a general mass formula in [GrG, Section 10]. For such an $L$, $L_v$ is self-dual if $v \notin S_D$, but for $v \in S_D$,
$$L_v = \begin{cases} (L_v)_1 & \text{if } n \text{ is even,} \\ (L_v)_0 \oplus (L_v)_1 & \text{if } n \text{ is odd,} \end{cases}$$
with $(L_v)_0$ having rank 1. Here, $(L_v)_i$ is as defined in Corollary 4.3. Hence, we have
$$[L^\perp : L] = \prod_{v \in S_D} q_v^{4[n/2]},$$
and for $v \in S_D$,
$$\beta_{L_v} = \begin{cases} q_v^{n^2} \cdot \prod_{j=1}^{n/2} \bigl(1 - q_v^{-4j}\bigr) & \text{if } n \text{ is even,} \\ q_v^{(n-1)^2} \cdot \bigl(1 + q_v^{-1}\bigr) \cdot \prod_{j=1}^{(n-1)/2} \bigl(1 - q_v^{-4j}\bigr) & \text{if } n \text{ is odd.} \end{cases}$$
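As a result,
$$\mathrm{Mass}(L) = d_k^{n(n+1/2)} \cdot \prod_{j=1}^{n} \Bigl( \frac{(2j-1)!}{(2\pi)^{2j}} \Bigr)^{d} \zeta_k(2j) \cdot \prod_{v \in S_D} \lambda_v,$$
with
$$\lambda_v = \begin{cases} \prod_{j=1}^{n/2} \bigl( q_v^{4j-2} - 1 \bigr) & \text{if } n \text{ is even,} \\ \dfrac{1}{q_v + 1} \cdot \prod_{j=1}^{(n+1)/2} \bigl( q_v^{4j-2} - 1 \bigr) & \text{if } n \text{ is odd,} \end{cases}$$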
which agrees with [GHY, Proposition 9.4].

11.4. It should be noted that in [Sh], Shimura obtained the mass of the maximal lattice over an arbitrary number field. However, in the case when $\langle -, - \rangle$ is not totally definite, his definition of the mass of $L$ differs from ours. Indeed, the mass in [Sh] was defined using the symmetric space $G(k \otimes \mathbb{R})/U_\infty$ where $U_\infty$ is the maximal compact subgroup of $G(k \otimes \mathbb{R})$. The invariant measure used on the symmetric space is, up to a precise power of 2, the quotient of the Haar measures on $G(k \otimes \mathbb{R})$ and $U_\infty$ coming from the split form of the groups. In view of the precise relation between $|\omega_0|$ and $|\omega_c|$ given in 10.6, it should not be difficult to translate his formula for the mass of the maximal lattice to our formulation.

References

[SGA4] M. Artin, A. Grothendieck, and J. L. Verdier, eds., Théorie des topos et cohomologie étale des schémas, 2, Séminaire de Géométrie Algébrique du Bois-Marie 1963/64 (SGA 4), Lecture Notes in Math. 270, Springer, Berlin, 1972.
[BLR] S. Bosch, W. Lütkebohmert, and M. Raynaud, Néron Models, Ergeb. Math. Grenzgeb. (3) 21, Springer, Berlin, 1990.
[B] N. Bourbaki, Éléments de mathématique, fasc. 38: Groupes et algèbres de Lie, chapitres 7–8, Actualités Sci. Indust. 1364, Hermann, Paris, 1975.
[BT1] F. Bruhat and J. Tits, Groupes réductifs sur un corps local, II, Publ. Math. IHES 53 (1984), 197–376.
[BT2] F. Bruhat and J. Tits, Schémas en groupes et immeubles des groupes classiques sur un corps local, II: Groupes unitaires, Bull. Soc. Math. France 115 (1987), 141–195.
[CS] J. H. Conway and N. J. A. Sloane, Low-dimensional lattices, IV: The mass formula, Proc. Roy. Soc. London Ser. A 419 (1988), 259–286.
[SGA3] M. Demazure and A. Grothendieck, eds., Schémas en groupes, I, II, III, Séminaire de Géométrie Algébrique du Bois-Marie 1962/64 (SGA 3), Lecture Notes in Math. 151, 152, 153, Springer, Berlin, 1970.
[GHY] W. T. Gan, J. P. Hanke, and J.-K. Yu, On an exact mass formula of Shimura, preprint, 1999.
[Gr] B. H. Gross, On the motive of a reductive group, Invent. Math. 130 (1997), 287–313.
[GrG] B. H. Gross and W. T. Gan, Haar measure and the Artin conductor, Trans. Amer. Math. Soc. 351 (1999), 1691–1704.
[Ha] J. P. Hanke, An exact mass formula for orthogonal groups over number fields, Ph.D. dissertation, Princeton Univ., Princeton, 1999.
[H] R. Hartshorne, Algebraic Geometry, Grad. Texts in Math. 52, Springer, New York, 1977.
[Hi] Y. Hironaka, Local zeta functions on Hermitian forms and its applications to local densities, J. Number Theory 71 (1998), 40–64.
[HS1] Y. Hironaka and F. Sato, Spherical functions and local densities of alternating forms, Amer. J. Math. 110 (1988), 473–512.
[HS2] Y. Hironaka and F. Sato, Local densities of alternating forms, J. Number Theory 33 (1989), 32–52.
[J] N. Jacobson, Basic Algebra, II, W. H. Freeman, San Francisco, 1980.
[K] Y. Kitaoka, Arithmetic of Quadratic Forms, Cambridge Tracts in Math. 106, Cambridge Univ. Press, Cambridge, 1993.
[P] G. Pall, The Weight of a Genus of Positive n-ary Quadratic Forms, Proc. Sympos. Pure Math. 8, Amer. Math. Soc., Providence, 1965, 95–105.
[S] J.-P. Serre, Harvard lecture notes, 1984.
[Sh] G. Shimura, Some exact formulas for quaternion unitary groups, J. Reine Angew. Math. 509 (1999), 67–102.
[Si] C. L. Siegel, Gesammelte Abhandlungen, I, II, III, Springer, Berlin, 1966.
[Sp] T. A. Springer, Linear Algebraic Groups, 2d ed., Progr. Math. 9, Birkhäuser, Boston, 1998.
[T] J. Tits, “Reductive groups over local fields” in Automorphic Forms, Representations, and L-functions (Corvallis, Ore., 1977), Part 1, Proc. Sympos. Pure Math. 33, Amer. Math. Soc., Providence, 1979, 29–69.
[W] W. C. Waterhouse, Introduction to Affine Group Schemes, Grad. Texts in Math. 66, Springer, New York, 1979.
[Wa] G. L. Watson, The 2-adic density of a quadratic form, Mathematika 23 (1976), 94–106.
Department of Mathematics, Princeton University, Fine Hall, Princeton, New Jersey 08544, USA
Vol. 105, No. 3
DUKE MATHEMATICAL JOURNAL
© 2000
CLASSIFICATION OF POSITIVE DEFINITE LATTICES RICHARD E. BORCHERDS
Contents
1. Classification of positive norm vectors
2. Vectors in the lattice $II_{1,25}$
3. Lattices with no roots
The primitive norm 0 vectors of $II_{1,25}$
The norm 2 vectors of $II_{1,25}$
The norm 4 vectors of $II_{1,25}$

1. Classification of positive norm vectors. In this paper we describe an algorithm for classifying orbits of vectors in Lorentzian lattices. The main point of this is that isomorphism classes of positive definite lattices in some genus often correspond to orbits of vectors in some Lorentzian lattice, so we can classify some positive definite lattices. Section 1 gives an overview of this algorithm, and in Section 2 we describe this algorithm more precisely for the case of $II_{1,25}$, and as an application we give the classification of the 665 25-dimensional unimodular positive definite lattices and the 121 even 25-dimensional positive definite lattices of determinant 2 (see Tables 1 and 2). In Section 3 we use this algorithm to show that there is a unique 26-dimensional unimodular positive definite lattice with no roots. Most of the results of this paper are taken from the unpublished manuscript [B4], which contains more details and examples. For general facts about lattices used in this paper, see [CS2, especially Chapters 15–18 and 23–28].
Some previous enumerations of unimodular lattices include Kneser’s list of the unimodular lattices of dimension at most 16 [K], Conway and Sloane’s extension of this to dimensions at most 23 [CS2, Chapter 16], and Niemeier’s enumeration of the even 24-dimensional ones [N]. All of these used some variation of Kneser’s neighborhood method [K], but this becomes very hard to use for odd lattices of dimension 24, and it seems impractical for dimension at least 25 (at least for hand calculations; computers could probably push this further). The method used in this paper works well up to 25 dimensions, could be pushed to work for 26 dimensions, and does not seem to work at all beyond this. We use the “(+, −, −, · · · , −)” sign convention for Lorentzian lattices L, so that the reflections we are interested in are (usually) those of negative norm vectors of L.

Received 4 January 2000. Revision received 23 February 2000. 2000 Mathematics Subject Classification. Primary 11E12; Secondary 11E41. Author’s work supported by National Science Foundation grant number DMS-9970611.
We fix one of the two cones of positive norm vectors and call it the positive cone. The norm 1 vectors in the positive cone form a copy of hyperbolic space in the usual way. We assume that we are given a group G of automorphisms of a Lorentzian lattice L, such that G is the semidirect product of a normal subgroup R generated by reflections of some negative norm vectors, and a group Aut(D) of automorphisms preserving a fundamental domain D of R in hyperbolic space. We assume that all elements u ∈ L having nonnegative inner product with all simple roots of R have norm (u, u) at least zero. (This is just to eliminate some degenerate cases.) If L is a lattice, then L(−1) is the lattice L with all norms multiplied by −1. We use Conway’s convention of using small letters $a_n$, $d_n$, $e_n$ for the spherical Dynkin diagrams, and we use capital letters $A_n$, $D_n$, $E_n$ for the corresponding affine Dynkin diagrams. The Weyl vector of a root system is the vector ρ such that $(\rho, r) = -r^2/2$ for any simple root r.

We want to find the orbits of positive norm vectors of the positive cone of L under the group G. Every positive norm vector of the positive cone of L is conjugate under R to a unique vector in D, so it is enough to classify orbits of vectors u in D under Aut(D). The algorithm works by trying to reduce a vector u of D to a vector of smaller norm by adding a root of u⊥ to u. There are three possible cases we need to consider.
(1) There are no roots in u⊥.
(2) There is a root r in u⊥ such that u + r ∈ D.
(3) There is at least one root in u⊥, but if r is a root in u⊥, then u + r is never in D.

We try to deal with these three cases as follows. If there are no roots in u⊥, then we assume that D contains a nonzero vector w such that (r, w) ≤ (r, u) for any simple root r and any vector u ∈ L in the interior of D. Then u − w has inner product at least zero with all simple roots, so it also lies in D and has smaller norm than u unless u is a multiple of w and $w^2 = 0$.
So we can reduce u to a vector of smaller norm in D. The existence of a vector w with these properties is a very strong condition on the lattice L.

Example 1.1. The lattices $II_{1,9}$ and $II_{1,17}$ have properties 1 and 2; this follows easily from Vinberg’s description [V] of their automorphism groups. Conway showed that the lattice $II_{1,25}$ also has these properties; see Section 2. The lattices $II_{1,8n+1}$ for n ≥ 4 do not have these properties, but the Minkowski-Siegel mass formula shows that these lattices have such vast numbers of orbits of positive norm vectors that there seems little point in classifying them.

Example 1.2. It follows from [B2] that several lattices that are fixed points of finite groups acting on $II_{1,25}$ also have a suitable vector w. For example, the lattice $II_{1,1} \oplus BW(-1)$, where BW is the Barnes-Wall lattice, has this property. Some of the norm 0 vectors correspond to the 24 lattices in the genus of BW classified in [SV]; the remaining orbits of norm 0 vectors should not be hard to find.

Example 1.3. Take L to be the lattice $I_{1,9}$, and take R to be the group generated by reflections of norm −1 vectors. (This has infinite index in the full reflection group.)
Then the lattice has a Weyl vector for the reflection group as in [B2], so we can apply the algorithm to this reflection group. (However, it is not entirely clear what the point of doing this is, as it is easier to use the full reflection group of the lattice!)

Next, we look at the second case when u⊥ has a root r such that v = u − r is in D. Then −r is in the fundamental domain of the finite reflection group of u⊥, so r is a sum of the simple roots of u⊥ with the usual multiplicities. For u in D we let $R_i(u)$ be the simple roots r of D which have inner product i(r, r)/2 with u, so $R_i(u)$ is empty for i < 0 and $R_0(u)$ is the Dynkin diagram of u⊥. We write S(u) for $R_0(u) \cup R_1(u) \cup R_2(u)$. Then given S(v) we can find all vectors u of D which come from v as in case (2) above, and S(u) is contained in S(v). By keeping track of the action of Aut(D, v) on S(v) for vectors v of D, we can find all possible vectors u constructed in this way from v, together with the sets S(u).

Finally, the third case (when there is at least one root in u⊥, but if r is a root in u⊥, then u − r is never in D) has to be dealt with separately for each lattice L. In practice it does not present too much difficulty for lattices with a vector w as in case (1). See the next section for the example of $L = II_{1,25}$.

The following two lemmas are used later to prove some properties of the root systems of 25-dimensional lattices.

Lemma 1.4. Suppose that reflection in u⊥ is an automorphism of L. Then there is an automorphism σ of L (of order 1 or 2) with the following properties:
(1) σ fixes D;
(2) if σ fixes w, then w is a linear combination of u and the roots of L in u⊥.

Proof. There is an automorphism of L acting as 1 on u and as −1 on u⊥, given by the product of −1 and reflection in u⊥. As this automorphism fixes u ∈ D, we can multiply it by some (unique) element of the reflection group of u⊥ so that the product σ fixes D.
The element σ acts as −1 on the space orthogonal to u, z, and all roots of R in u⊥, which implies assertion (2) of Lemma 1.4.

Lemma 1.5. Suppose that there is a norm 0 vector z such that (z, u) = 2, where u is a vector in D. Then there is an automorphism σ of L with the following properties:
(1) σ fixes D;
(2) if σ fixes w, then w is a linear combination of u, z, and the roots of L in u⊥.

Proof. If M is the lattice spanned by z and u, then M has the property that all elements of $M'/M$ have order 1 or 2. So there is an automorphism of L acting as 1 on M and −1 on $M^\perp$. The result now follows as in the proof of Lemma 1.4. This proves Lemma 1.5.

Remark 1.6. It is usually easy to classify all orbits of negative norm vectors u in Lorentzian lattices, because this is closely related to the classification of the indefinite lattices u⊥, and by Eichler’s theorem [E] indefinite lattices in dimension at least 3 are classified by the spinor genus (which in practice is often determined by the genus).
For example, it is easy to give a proof along these lines that if n > 0 and m > 0, then $II_{1,8n+1}$ has a unique orbit of primitive vectors of norm −2m.

2. Vectors in the lattice $II_{1,25}$. In this section we specialize the algorithm of the previous section to the lattice $II_{1,25}$. Note that orbits of norm 4 vectors u of $II_{1,25}$ correspond naturally to 25-dimensional positive definite unimodular lattices, because u⊥ is isomorphic to the lattice of even vectors in a 25-dimensional unimodular negative definite lattice. In particular, we can classify the 665 positive definite 25-dimensional unimodular lattices, as in Table 2; this is the main application of the algorithm of the previous section. Similarly, norm 2 vectors of $II_{1,25}$ correspond to 25-dimensional even positive definite lattices of determinant 2. (Another interpretation of the vectors of $II_{25,1}$ of norm at least −2 is that they are the roots of the fake monster Lie algebra.)

First, we have to show the existence of a vector w satisfying the property of Section 1. This follows from Conway’s theorem [C] stating that the reflection group of $II_{1,25}$ has a Weyl vector w of norm 0, with the property that (w, r) = 1 for all simple roots r of the reflection group. Conway’s proof depends on the rather hard classification of the “deep holes” in the Leech lattice in [CPS]; there is a proof avoiding these long calculations in [B1]. It seems likely that 26 is the largest possible dimension of a lattice with a suitable vector w.

Next, we have to classify the vectors u of D such that u⊥ has roots but u + r is not in D for any root r ∈ u⊥. One obvious way this can happen is if u has norm 0, so we have to classify the norm 0 vectors in $II_{1,25}$. In any lattice $L = II_{8n+1,1}$, the orbits of primitive norm 0 vectors z correspond to the 8n-dimensional even negative definite unimodular lattices z⊥/z. So the orbits of primitive norm 0 vectors of $II_{1,25}$ correspond to the 24 Niemeier lattices (see [CS2]).
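As a small illustration of the objects indexing these orbits, the following Python sketch tabulates the root systems of the Niemeier lattices and checks two well-known facts about them: every nonempty root system has rank 24, and all of its components share a single Coxeter number h. The list is the standard one, entered by hand here rather than taken from this paper, and the helper names are hypothetical.

```python
# Root systems of the 23 Niemeier lattices that have roots (standard list,
# typed in by hand for illustration). The 24th Niemeier lattice is the
# Leech lattice, whose root system is empty.
NIEMEIER_ROOT_SYSTEMS = [
    "D24", "D16 E8", "E8^3", "A24", "D12^2", "A17 E7", "D10 E7^2", "A15 D9",
    "D8^3", "A12^2", "A11 D7 E6", "E6^4", "A9^2 D6", "D6^4", "A8^3",
    "A7^2 D5^2", "A6^4", "A5^4 D4", "D4^6", "A4^6", "A3^8", "A2^12", "A1^24",
]


def coxeter_number(kind: str, rank: int) -> int:
    """Coxeter number of an irreducible simply laced root system."""
    if kind == "A":
        return rank + 1
    if kind == "D":
        return 2 * rank - 2
    return {6: 12, 7: 18, 8: 30}[rank]  # E6, E7, E8


def parse(name: str):
    """Turn a name such as 'A7^2 D5^2' into [('A', 7), ('A', 7), ('D', 5), ('D', 5)]."""
    components = []
    for token in name.split():
        base, _, mult = token.partition("^")
        components += [(base[0], int(base[1:]))] * int(mult or 1)
    return components


def summarize(name: str):
    """Return (total rank, set of Coxeter numbers of the components)."""
    components = parse(name)
    total_rank = sum(rank for _, rank in components)
    coxeter_numbers = {coxeter_number(kind, rank) for kind, rank in components}
    return total_rank, coxeter_numbers


for name in NIEMEIER_ROOT_SYSTEMS:
    total_rank, coxeter_numbers = summarize(name)
    assert total_rank == 24 and len(coxeter_numbers) == 1
```

The common Coxeter number of each entry can be read off with `summarize(name)[1]`; the Leech lattice is conventionally assigned h = 0.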
The nonprimitive norm 0 vectors are of course either zero or a positive integer multiple of a primitive norm 0 vector, so this gives the classification of all orbits of norm 0 vectors in $II_{1,25}$; see Table 0.

Next, suppose that u is a positive norm vector of D with (u, u) = 2n, and suppose that r is a highest root in u⊥ such that u − r is not in D. Then u − r is conjugate under the reflection group to some vector v such that (v, u) < (u − r, u). But $(v, u)^2 \geq (u, u)(v, v) = 2n(2n - 2)$ and (v, u) < (u − r, u) = 2n, so (v, u) = 2n − 1. So if z = u − v, then (z, u) = 1 and $z^2 = 0$. If we put $z' = u - nz$, then z and $z'$ are norm 0 vectors with $(z, z') = 1$ and $u = nz + z'$. So $II_{1,25} = B \oplus \langle z, z' \rangle$ for some Niemeier lattice B. If this Niemeier lattice has roots, then adding some of these roots to r gives a vector in D by the previous argument; so B must be the Leech lattice and we can assume that z is in the orbit of w. If n > 1, then there are no roots in u⊥, and if n < 1, then (u, u) ≤ 0, so we must have n = 1. So the only possibility for u is that it is a norm 2 vector in the orbit of $w + w' = 2w + r$, where r is a simple root.

Putting everything together gives the following list of the vectors u ∈ D such that u⊥ has roots but u − r is not in D for any root r ∈ u⊥:
(1) the zero vector;
(2) the norm 0 vectors nz for n ≥ 1 and z a primitive norm 0 vector of D corresponding to some Niemeier lattice other than the Leech lattice (the vectors for a given Niemeier lattice and a given value of n are all conjugate under Aut(D));
(3) the norm 2 vectors of the form 2w + r for a simple root r of D (these form one orbit under Aut(D)).

Lemma 2.1. Suppose u, v ∈ D, $u^2 = 2n$, $v^2 = 2(n - 1)$, and (v, u) = 2n. Then
$$R_0(u) \subseteq R_0(v) \cup R_1(v) \cup R_2(v) = S(v),$$
$$R_i(u) \subseteq R_0(v) \cup R_1(v) \cup \cdots \cup R_i(v) \quad \text{for } i \geq 1.$$

Proof. The vector v is in D, so v = u + r for some highest root r of u⊥. The vector r has inner product 0, 1, or 2 with all simple roots of u⊥, and −r is a sum of roots of $R_0(u)$ with positive coefficients, so r has inner product greater than or equal to zero with all simple roots of D not in $R_0(u)$. The lemma follows from this and the fact that (v, s) = (u, s) + (r, s) for any simple root s of D. This proves Lemma 2.1.

We now start with a vector v of norm 2(n − 1) and try to reconstruct u from it. The vector u − v is a highest root of some component of $R_0(u)$, and $R_0(u)$ is contained in S(v), so we should be able to find u from S(v). By Lemma 2.1, S(u) is contained in S(v), so we can repeat this process with u instead of v. The following theorem shows how to construct all possible vectors u as in Lemma 2.1 from v and S(v).

Theorem 2.2. Suppose that v has norm 2(n − 1) and is in D (so n ≥ 1). Then there are bijections between
(1) norm 2n vectors u of D with (u, v) = 2n;
(2) simple spherical Dynkin diagrams C contained in the Dynkin diagram Λ
of D such that if r is the highest root of C and c in C satisfies (c, r) = i, then c is in $R_i(v)$;
(3) Dynkin diagrams C satisfying one of the following three conditions: either C is an $a_1$ and is contained in $R_2(v)$, or C is an $a_n$ (n ≥ 2) and the two endpoints of C are in $R_1(v)$ while the other points of C are in $R_0(v)$, or C is $d_n$ (n ≥ 4), $e_6$, $e_7$, or $e_8$ and the unique point of C which has inner product 1 with the highest root of C is in $R_1(v)$ while the other points of C are in $R_0(v)$.

Proof. Let u be as in (1), and put r = u − v. The vector r is orthogonal to u and has inner product less than or equal to zero with all roots of $R_0$ (because −v does) so it is a highest root of some component C of $R_0(u)$. The vector r therefore determines some simple spherical Dynkin diagram C contained in Λ. Any root c of C has (c, v + r) = (c, u) = 0, so c is in $R_i(v)$, where i = (c, r). This gives a map from (1) to (2).
Conversely, if we start with a Dynkin diagram C satisfying (2) and put u = v + r (where r is the highest root of C), then (c, u) = 0 for all c in C, so (r, u) = 0 as r is a sum of the c’s. This implies that $u^2 = 2n$ and (u, v) = 2n. We now have to show that u is in D. Let s be any simple root of D. If s is in C, then (s, r) = −(s, v), and if s is not in C, then (s, r) ≥ 0, so in either case (s, u) = (s, v + r) ≥ 0, and hence u is in D. This gives a map from (2) to (1) and shows that (1) and (2) are equivalent. Condition (3) is just the condition (2) written out explicitly for each possible C, so (2) and (3) are also equivalent. This proves Theorem 2.2.

We define the height of a vector u in $II_{1,25}$ to be (u, w). We show how to calculate the heights of vectors of $II_{1,25}$ which have been found with the algorithm above.

Lemma 2.3. Suppose u, v are vectors in D of norms 2n, 2(n − 1) with (u, v) = 2n, and suppose that v = u − r, where the root r of u⊥ corresponds to the component C of $R_0(u)$. Then height(u) = height(v) + h − 1, where h is the Coxeter number of the component C.

Proof. We have v = u − r, where r is the highest root of C, so height(u) = height(v) + (r, w). We have $r = \sum_i m_i c_i$, where the $c_i$ are the simple roots of C with weights $m_i$ and $\sum_i m_i = h - 1$. All the $c_i$ have inner product 1 with w, so (r, w) = h − 1. This proves Lemma 2.3.

Lemma 2.4. Let u be a primitive vector of D such that there is a norm 0 vector z with (z, u) = 0 or 1, and suppose that z corresponds to a Niemeier lattice B with Coxeter number h.
(1) If u has norm 0, then its height is h. The Dynkin diagram of u⊥ is the extended Dynkin diagram of B.
(2) If u has positive norm, then $\mathrm{height}(u) = 1 + (1 + u^2/2)h$. The Dynkin diagram of u⊥ is the Dynkin diagram of B if $u^2 > 2$ and the Dynkin diagram of B plus an $a_1$ if $u^2 = 2$.

Proof. (1) The Dynkin diagram of u⊥ is a union of extended Dynkin diagrams. If this union is empty, then u must be w and therefore has height 0 = h.
If not, then let C be one of the components. We have $u = \sum_i m_i c_i$, where the $c_i$’s are the simple roots of C with weights $m_i$. Also, $\sum_i m_i = h$ because C is an extended Dynkin diagram and all the $c_i$’s have height 1, so u has height h.
(2) As u has inner product 1 with a norm 0 vector z of D, we can put $u = nz + z'$ with $u^2 = 2n$ and $z'^2 = 0$, $(z, z') = 1$. By part (1) z has height h. We have $z' = z + r$, where r is a simple root of D, so $\mathrm{height}(z') = \mathrm{height}(z) + \mathrm{height}(r) = h + 1$. Hence $\mathrm{height}(u) = nh + h + 1 = 1 + (1 + u^2/2)h$. The lattice u⊥ is B ⊕ N, where N is a 1-dimensional lattice of determinant 2n, so the Dynkin diagram is that of B plus that of N, and the Dynkin diagram of (norm 2 roots of) N is empty unless 2n = 2, in
which case it is $a_1$. This proves that the Dynkin diagram of $u^\perp$ is what it is stated to be. This proves Lemma 2.4.

Orbits of norm 2 vectors $u \in II_{1,25}$ correspond to even 25-dimensional positive definite lattices $B$ of determinant 2, where $B(-1) \cong u^\perp$. One part of the algorithm for finding vectors of norm $2n$ consists of finding the vectors $u$ such that there are no roots in $u^\perp$. For norm 2 vectors $u$, the following lemma shows that there are no such vectors.

Lemma 2.5. If $u \in II_{1,25}$ has norm 2, then $u^\perp$ contains roots. In other words, every 25-dimensional even positive definite lattice of determinant 2 has a root.

Proof. If $u^\perp$ contains no roots, then, by the algorithm of Section 1, $u = w + u_1$ for some $u_1$ in $D$. We have $u_1^2 = u^2 - 2\,\mathrm{height}(u)$, so $u_1^2 = 0$ and $u$ has height 1 because $u_1^2 \ge 0$, $u^2 = 2$, and the height of $u$ is positive. Then $\mathrm{height}(u_1) = \mathrm{height}(u) = 1$, so $u_1$ is a norm 0 vector in $D$ which has inner product 1 with the norm 0 vector $w$ of $D$, but this is impossible as $u_1 - w$ would be a norm $-2$ vector separating the two vectors $u_1$ and $w$ of $D$. This proves Lemma 2.5.

Theorem 2.6. Suppose that $u \in D$ has norm 2. Then
$$w = \rho + \frac{\mathrm{height}(u)\,u}{2},$$
where $\rho$ is the Weyl vector of the root system of $u^\perp$. Also, $-2\rho^2 = \mathrm{height}(u)^2$.

Proof. The vector $w$ is fixed by any automorphism fixing $D$, so by Lemma 1.4 the vector $w$ must be in the space spanned by $u$ and the roots of $u^\perp$. However, $w$ also has inner product 1 with all simple roots of $u^\perp$ and has inner product $\mathrm{height}(u)$ with $u$, so $w$ must be $\rho + \mathrm{height}(u)u/2$. Taking norms of both sides of $w = \rho + \mathrm{height}(u)u/2$, and using the facts that $w^2 = 0$, $(u, \rho) = 0$, and $(u, u) = 2$, shows that $-2\rho^2 = \mathrm{height}(u)^2$. This proves Theorem 2.6.

In particular, we find the strange consequence that the norm of the Weyl vector of any 25-dimensional even positive definite lattice of determinant 2 must be half a square.

Norm 4 vectors in the fundamental domain $D$ of $II_{1,25}$ correspond to 25-dimensional unimodular lattices $A = A_1 \oplus I^n$, where $u^\perp$ is the lattice of even elements of $A(-1)$ and where $A_1$ has no norm 1 vectors. The odd vectors of $A(-1)$ can be taken as the projections of the vectors $y$ with $(y, u) = 2$ into $u^\perp$. A norm 4 vector $u$ can behave in four different ways, depending on whether the unimodular lattice $A_1$ with no norm 1 vectors corresponding to $u$ is at most 23-dimensional, or 24-dimensional and odd, or 24-dimensional and even, or 25-dimensional.

Theorem 2.7. Norm 1 vectors of $A$ correspond to norm 0 vectors $z$ of $II_{1,25}$ with $(z, u) = 2$. Write $A = A_1 \oplus I^n$, where $A_1$ has no vectors of norm 1. Then $u$ is in
exactly one of the following four classes.
(1) The vector $u$ has inner product 1 with a norm 0 vector. The lattice $A_1$ is a Niemeier lattice.
(2) $A$ has at least four vectors of norm 1, so that $A_1$ is at most 23-dimensional (but may be even). There is a unique norm 0 vector $z$ of $D$ with $(z, u) = 2$, and this vector $z$ is of the same type as either of the two even neighbors of $A_1 \oplus I^{n-1}$.
(3) $A_1$ is 24-dimensional and odd. There are exactly two norm 0 vectors that have inner product 2 with $u$, and they are both in $D$. They have the types of the two even neighbors of $A_1$.
(4) $A = A_1$ has no vectors of norm 1.

Proof. The vector $z$ is a norm 0 vector with $(z, u) = 2$ if and only if $u/2 - z$ is a norm 1 vector of $A$. Most of Theorem 2.7 follows from this. The only nontrivial things to check are the statements about norm 0 vectors that are in $D$. If $u$ does not have inner product 1 with any norm 0 vector, then a norm 0 vector $z$ with $(z, u) = 2$ is in $D$ if and only if it has inner product greater than or equal to zero with all simple roots of $u^\perp$, so there is one such vector in $D$ for each orbit of such norm 0 vectors under the reflection group of $u^\perp$. If $A$ has at least four vectors of norm 1, then they form a single orbit under the Weyl group of (the norm 2 vectors of) $u^\perp$, which proves (2), while if $A$ has only two vectors of norm 1, then they are both orthogonal to all norm 2 vectors of $A$ and so form two orbits under the Weyl group of $u^\perp$. This proves Theorem 2.7.

Theorem 2.8. Suppose that $u$ is a norm 4 vector corresponding to a unimodular 25-dimensional lattice $A = A_1 \oplus I^{25-n}$ with $2n \ge 4$ vectors of norm 1. Let $\rho$ be the Weyl vector of the root system of norm $-2$ roots of $u^\perp$ (which is the Weyl vector of the norm $-2$ vectors of $A(-1)$), and let $h$ be the Coxeter number of the even neighbors of the 24-dimensional unimodular lattice $A_1 \oplus I^{24-n}$. Then $\mathrm{height}(u) = (w, u) = 2(h + n - 1)$, $w = \rho + \mathrm{height}(u)u/4$, and $-\rho^2 = (h + n - 1)^2$.

Proof.
There is a unique norm 0 vector $z$ of $D$ with $(z, u) = 2$; we let $i$ be its projection into $u^\perp$. The lattice $A$ has at least four vectors of norm 1, so any vector of norm 1 and in particular $i$ is in the vector space generated by vectors of norm $-2$ of $u^\perp$. Hence, by Lemma 1.5 and the same argument as in Theorem 2.6, we have $w = \rho + \mathrm{height}(u)u/4$. The norm $-4$ vector $2i$ of $u^\perp$ is the sum of $-2(n-1)$ simple roots of the $d_n$-component of the Dynkin diagram of $u^\perp$, so $(2i, w) = (2i, \rho) = -2(n-1)$. The vector $i$ is the projection of $z$ into $u^\perp$, so $i = z - u/2$, and hence $\mathrm{height}(u) = (w, u) = 2(w, z - i) = 2(\mathrm{height}(z) + n - 1) = 2(h + n - 1)$. If we calculate the norms of both sides of $w = \rho + \mathrm{height}(u)u/4$, we find that $-\rho^2 = (h + n - 1)^2$. This proves Theorem 2.8.

Example 2.9. Suppose $u$ corresponds to the lattice $I^{25}$. The number $n$ is then 25 and the root system of the norm 2 vectors is $D_{25}$, so the Weyl vector $\rho$ can be taken as
$(0, 1, 2, \ldots, 24)$. The even neighbors of $I^{24}$ are both $D_{24}$ with Coxeter number $h = 46$, so we find that $0^2 + 1^2 + 2^2 + \cdots + 24^2 = \rho^2 = (h + n - 1)^2 = 70^2$. Watson [W] showed that the only solution of $0^2 + 1^2 + \cdots + k^2 = m^2$ with $k \ge 2$ is $k = 24$. See [CS2, Chapter 26] for a construction of the Leech lattice using this equality.

Theorem 2.10. Suppose that $u$ is a norm 4 vector of $D$ with exactly two norm 0 vectors $z_1$, $z_2$ that have inner product 2 with $u$, and suppose that there are no norm 0 vectors that have inner product 1 with $u$. Then $z_1$ and $z_2$ are both in $D$ and have Coxeter numbers $h_1$, $h_2$, where $h_i = (z_i, w)$. Then
$$w = \rho + \frac{h_1 z_2 + h_2 z_1}{2},$$
where $\rho$ is the Weyl vector of the norm $-2$ vectors of $u^\perp$. Also, $u = z_1 + z_2$, $\mathrm{height}(u) = h_1 + h_2$, $-\rho^2 = h_1 h_2$, and $u^\perp$ has $8(h_1 + h_2 - 2)$ roots.

Proof. The vector $u - z_1$ is a norm 0 vector that has inner product 2 with $u$ and so must be $z_2$. Hence $u = z_1 + z_2$ and $\mathrm{height}(u) = \mathrm{height}(z_1) + \mathrm{height}(z_2) = h_1 + h_2$. There is a norm 0 vector that has inner product 2 with $u$, and any automorphism of $L$ fixing $D$ also fixes $w$, so by Lemma 1.5 $w$ is a linear combination of $z_1$, $z_2$, and the roots of $R$ in $u^\perp$. Using the facts that $(w, z_1) = h_1$, $(w, z_2) = h_2$, and $(w, r) = -r^2/2$ for any simple root $r$ in $u^\perp$ shows that $w$ must then be $\rho + (h_1 z_2 + h_2 z_1)/2$. Using the fact that $w^2 = 0$, this shows immediately that $-\rho^2 = h_1 h_2$. The number of roots follows from Remark 2.13. This proves Theorem 2.10.

Corollary 2.11. If $A_1$ is an odd 24-dimensional positive definite unimodular lattice with no vectors of norm 1 and whose even neighbors have Coxeter numbers $h_1$ and $h_2$, then $\rho^2 = h_1 h_2$, where $\rho$ is the Weyl vector of $A_1$.

Proof. This follows immediately from Theorem 2.10, using the fact that $A_1 \oplus I$ is the 25-dimensional unimodular lattice corresponding to $u$, as in Theorem 2.10.

Remark 2.12. Let $B_1$, $B_2$ be the two even neighbors of $A_1$. Then it is not hard to show that $h_2 \le 2h_1 + 2$, and there are several lattices $A_1$ for which equality holds.

Remark 2.13. [B3, Theorem 13.1 and Corollary 13.2] show that the height of a vector in the fundamental domain of $II_{1,25}$ can be written as an explicit linear combination of the theta functions of cosets of the lattice $u^\perp$. In particular, we find that if $u$ is a norm 2 vector, then $12\,\mathrm{height}(u) = 18 - 4z_1 + r$, where $r$ is the number of norm $-2$ vectors of $u^\perp$ and where $z_i$ is the number of norm 0 vectors having inner product $i$ with $u$ (so $z_1$ is 0 or 2 and is 2 if and only if the lattice $u^\perp$ is the sum of a 1-dimensional lattice and an even lattice). Similarly, if $u$
has norm 4 and corresponds to a 25-dimensional unimodular lattice $A$, then $8t = 20 - 2z_2 - 8z_1 + r$, where $t$ is the height of $u$, $r$ is the number of norm 2 vectors of $A$, $z_2$ is the number of norm 1 vectors of $A$, and $z_1$ is 1 if $A$ is the sum of a Niemeier lattice and a 1-dimensional lattice and is zero otherwise. Note that these relations give congruences for the numbers of roots which immediately imply that 25-dimensional even lattices of determinant 2 and 25-dimensional unimodular lattices always have roots. There are similar relations and congruences for larger norm vectors of $II_{1,25}$.

There are several other genera of lattices which can be classified using $II_{1,25}$. Most of these do not seem important enough to be worth publishing, but here is a summary of what is available just in case anyone finds a use for any of these. The 24-dimensional even positive definite lattices of determinant 5 are easy to classify as they turn out to correspond to pairs consisting of a norm 2 vector $u$ of $II_{1,25}$ together with a norm $-2$ root $r$ with $(r, u) = 1$, and these can easily be read off from the list of norm 2 vectors. The 25-dimensional positive definite even lattices of determinant 6 correspond to the norm 6 vectors in $II_{1,25}$ and can be classified from the norm 4 vectors using the algorithm; there are 2825 orbits if I have made no mistakes. A list of them is available from my home page. These can be used to classify the 26-dimensional even positive definite lattices of determinant 3, because the norm 2 roots of such lattices correspond to the norm 6 vectors of $II_{1,25}$. (There is a unique such lattice with no roots; see the next section.) There are between 677 and 681 such lattices, and a provisional list is available from my home page. (There are a few small ambiguities that I have not yet got around to resolving.) If such a lattice has no norm 6 roots, then the number of norm 2 vectors is divisible by 6.
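The numerical identities in Example 2.9 and Remark 2.13 are easy to confirm by machine. The following is a minimal sketch in Python, not part of the original computation. The helper names and the bound `SEARCH_LIMIT` are illustrative choices, and the finite search only illustrates Watson's theorem rather than proving it.

```python
from math import isqrt


def square_pyramidal(k: int) -> int:
    """Return 0^2 + 1^2 + ... + k^2 via the closed form k(k+1)(2k+1)/6."""
    return k * (k + 1) * (2 * k + 1) // 6


def is_square(m: int) -> bool:
    """Return True if the non-negative integer m is a perfect square."""
    root = isqrt(m)
    return root * root == m


# Example 2.9: u corresponds to I^25, so n = 25 and h = 46 (Coxeter number of D_24).
h, n = 46, 25
rho_norm = sum(i * i for i in range(25))  # norm of rho = (0, 1, ..., 24)
assert rho_norm == (h + n - 1) ** 2 == 70 ** 2

# Finite illustration of Watson's result: k = 24 is the only k >= 2 in this range.
SEARCH_LIMIT = 10_000  # arbitrary bound chosen for this sketch
square_cases = [k for k in range(2, SEARCH_LIMIT + 1)
                if is_square(square_pyramidal(k))]

# Remark 2.13 (norm 4 case) for A = I^25: 8t = 20 - 2*z2 - 8*z1 + r.
t = 2 * (h + n - 1)   # height(u) from Theorem 2.8
r = 2 * n * (n - 1)   # norm 2 vectors +-e_i +- e_j of I^25
z2 = 2 * n            # norm 1 vectors +-e_i of I^25
z1 = 0                # I^25 is not a Niemeier lattice plus a 1-dimensional lattice
assert 8 * t == 20 - 2 * z2 - 8 * z1 + r
```

Both assertions pass, so the height $2(h + n - 1) = 140$ from Theorem 2.8 agrees with the root-counting relation of Remark 2.13 in this case.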
With a lot more effort it should be possible to classify the 26-dimensional unimodular lattices by finding the (roughly 50000?) orbits of norm 10 vectors of $II_{1,25}$; see the next section.

3. Lattices with no roots. In this section we show that there is a unique 26-dimensional positive definite unimodular lattice with no roots. Conway and Sloane use this result in their proof that there is a positive definite unimodular lattice with no roots in all dimensions greater than 25 [CS1]. We also show that the number of norm 2 vectors of a 26-dimensional unimodular lattice is divisible by 4, and we sketch a construction of a 27-dimensional unimodular lattice with no roots.

Lemma 3.1. A 26-dimensional unimodular lattice $L$ with no vectors of norm 1 has a characteristic vector of norm 10.

Proof. If $L$ has a characteristic vector $x$ of norm 2, then $x^\perp$ is a 25-dimensional even lattice of determinant 2 and therefore has a root $r$ by Theorem 2.6; $2r + x$ is a characteristic vector of norm 10. If the lemma is not true, we can therefore assume that $L$ has no vectors of norm 1 and no characteristic vectors of norm 2 or 10. Its
theta function is determined by these conditions and turns out to be $1 - 156q^2 + \cdots$, which is impossible as the coefficient of $q^2$ is negative. This proves Lemma 3.1.

Lemma 3.2. There is a bijection between isomorphism classes of
(1) norm 10 characteristic vectors $c$ in 26-dimensional positive definite unimodular lattices $L$ and
(2) norm 10 vectors $u$ in $II_{1,25}$
given by $c^\perp(-1) \cong u^\perp$. We have $\mathrm{Aut}(L, c) = \mathrm{Aut}(II_{1,25}, u)$.

Proof. The proof is routine. Note that $-1$ is a square mod 10. This proves Lemma 3.2.

Lemmas 3.1 and 3.2 give an algorithm for finding 26-dimensional unimodular lattices $L$. It is probably not hard to implement this on a computer if one is given a computer algorithm for deciding when two vectors of the Leech lattice are conjugate under its automorphism group; such an algorithm has been described by Allcock [A]. The main remaining open problem is to find a use for these lattices! We now apply this algorithm to find the unique such lattice with no roots.

Lemma 3.3. Take notation as in Lemma 3.2. The lattice $L$ has no roots if and only if $u^\perp$ has no roots and $u$ does not have inner product 1, 2, 3, or 4 with any norm 0 vector.

Proof. If $u^\perp$ has roots, then obviously $L$ has too. If there is a norm 0 vector $z$ that has inner product 1, 2, 3, or 4 with $u$, then the projection $z_u$ of $z$ into $u^\perp$ has norm $-1/10$, $-4/10$, $-9/10$, or $-16/10$, respectively. The lattice $L(-1)$ contains $u^\perp + c$, and the vector $z_u \pm 3c/10$, $z_u \pm 4c/10$, $z_u \pm c/10$, or $z_u \pm 2c/10$ is in $L$ for some choice of sign and has norm $-1$, $-2$, $-1$, or $-2$, respectively. Hence, if $u$ has inner product 1, 2, 3, or 4 with some norm 0 vector, then $L$ has roots. Conversely, if $L$ has a root $r$, then either $r$ has norm 2 and inner product $0, \pm 2, \pm 4$ with $c$ or it has norm 1 and inner product $\pm 1, \pm 3$ with $c$, and each of these cases implies that $u^\perp$ has roots or that $u$ has inner product 1, 2, 3, or 4 with some norm 0 vector by reversing the argument above. This proves Lemma 3.3.
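The four norm computations in the proof of Lemma 3.3 can be checked with exact rational arithmetic. The sketch below is only an illustration: the function name `glued_norm` is hypothetical, and it assumes, as in the proof, that $c$ has norm $-10$ in $L(-1)$ and is orthogonal to $z_u$.

```python
from fractions import Fraction

# Norm of the characteristic vector c inside L(-1) (sign reversed from norm 10).
C_NORM = Fraction(-10)


def glued_norm(k: int, m: int) -> Fraction:
    """Norm of z_u +- (m/10) c, where z is a norm 0 vector with (z, u) = k.

    The projection z_u = z - (k/10) u of z into the complement of the norm 10
    vector u has norm -k^2/10, and z_u is orthogonal to c, so the two norms add.
    """
    z_u_norm = Fraction(-k * k, 10)
    return z_u_norm + Fraction(m, 10) ** 2 * C_NORM


# Inner product k with u  ->  multiple m of c/10 used in the proof.
cases = {1: 3, 2: 4, 3: 1, 4: 2}
norms = {k: glued_norm(k, m) for k, m in cases.items()}

for k, value in norms.items():
    print(f"(z, u) = {k}: z_u +- {cases[k]}c/10 has norm {value}")
```

The resulting norms are $-1$, $-2$, $-1$, $-2$, matching the statement that each case produces a vector of norm 1 or 2 in $L$.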
Now let $L$ be a 26-dimensional unimodular lattice with no roots containing a characteristic vector $c$ of norm 10, and let $u$ be a norm 10 vector of $D$ corresponding to it as in Lemma 3.2.

Lemma 3.4. We have $u = z + w$, where $z$ is a norm 0 vector of $D$ corresponding to a Niemeier lattice with root system $A_4^6$, and $w$ is the Weyl vector of $D$. In particular, $u$ is determined up to conjugacy under $\mathrm{Aut}(D)$.

Proof. The lattice $u^\perp$ has no roots so $u = w + z$ for some vector $z$ of $D$. By Lemma 3.3, $u$ does not have inner product 1, 2, 3, or 4 with any norm 0 vector, so $(z, w) = (u, w) \ge 5$. Hence,
$$10 = u^2 = z^2 + 2(z, w) \ge 2(z, w) \ge 10$$
so $(z, w) = 5$ and $z^2 = 0$. The only norm 0 vectors $z$ in $D$ with $(z, w) = 5$ are the primitive ones corresponding to $A_4^6$ Niemeier lattices, which form one orbit under $\mathrm{Aut}(D)$. This proves Lemma 3.4.

Lemma 3.5. If $u = z + w$ is as in Lemma 3.4, then the 26-dimensional unimodular lattice corresponding to $u$ has no roots.

Proof. The lattice $u^\perp$ obviously has no roots so by Lemma 3.3 we have to check that there are no norm 0 vectors that have inner product 1, 2, 3, or 4 with $u$. Let $x$ be any norm 0 vector in the positive cone. If $x$ has type $A_4^6$, then $(x, u) \ge (x, w) \ge 5$; if $x$ has Leech type, then $(x, u) \ge (x, z) \ge 5$; if $x$ has type $A_1^{24}$, then $(x, u) = (x, w) + (x, z) \ge 2 + 3 = 5$ ($(x, z)$ cannot be 2 as there are no pairs of norm 0 vectors of types $A_1^{24}$ and $A_4^6$ which have inner product 2 by the classification of 24-dimensional unimodular lattices); and if $x$ has any other type, then $(x, u) = (x, w) + (x, z) \ge 3 + 2 = 5$. This proves Lemma 3.5.

Theorem 3.6. There is a unique 26-dimensional positive definite unimodular lattice $L$ with no roots. Its automorphism group is isomorphic to the group $O_5(5) = 2.G.2$ of order $2^8 \cdot 3^2 \cdot 5^4 \cdot 13$ and acts transitively on the 624 characteristic norm 10 vectors of $L$.

Proof. By Lemma 3.1, $L$ has a characteristic vector of norm 10, so by Lemmas 3.3 and 3.4 $L$ is unique and its automorphism group acts transitively on the characteristic vectors of norm 10. By Lemma 3.5, $L$ exists. The theta function is determined by the conditions that $L$ has no vectors of norm 1 or 2 and no characteristic vectors of norm 2, and it turns out that the number of characteristic vectors of norm 10 is 624. The stabilizer of such a vector is isomorphic to $\mathrm{Aut}(II_{1,25}, u)$, which is a group of the form $5^3.2.S_5$, where $S_5$ is the symmetric group on five letters. This determines the order of the automorphism group of the lattice. From this it is not difficult to determine it precisely; we omit the details. This proves Theorem 3.6.
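The order stated in Theorem 3.6 follows from the orbit-stabilizer count described in the proof: the group acts transitively on the 624 characteristic vectors of norm 10, with stabilizer of shape $5^3.2.S_5$. A short Python check of that arithmetic (variable names are illustrative only):

```python
from math import factorial

# Stabilizer of a characteristic norm 10 vector: a group of shape 5^3 . 2 . S_5.
stabilizer_order = 5 ** 3 * 2 * factorial(5)

# The automorphism group acts transitively on the 624 such vectors,
# so its order is (orbit size) * (stabilizer order).
num_characteristic_vectors = 624
group_order = num_characteristic_vectors * stabilizer_order

# Order quoted in Theorem 3.6.
claimed_order = 2 ** 8 * 3 ** 2 * 5 ** 4 * 13

print(stabilizer_order, group_order, claimed_order)
```

The two products agree, which is the sense in which the stabilizer "determines the order of the automorphism group of the lattice."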
We now show that the number of norm 2 vectors of any 26-dimensional even positive definite unimodular lattice is divisible by 4. There are strictly 26-dimensional unimodular lattices with no roots or with four roots, so this is the best possible congruence. For unimodular lattices of dimension less than 26, there are congruences modulo higher powers of 2 for the number of roots.

Lemma 3.7. If $L$ is a 25-dimensional positive definite lattice of determinant 2, then the number of norm 2 roots of $L$ is 2 mod 4.

Proof. The lattice $L$ is of the form $u^\perp$ for some norm 2 vector $u$ of $I_{1,25}$, so its even vectors are the vectors of $II_{1,25}$ in $u^\perp$ having even inner product with some $b \in II_{1,25}$, where $II_{1,25}$ is one of the two even neighbors of $I_{1,25}$. As $(u, b)$ is even, we can assume that $b \in u^\perp$ (by subtracting multiples of $u$ from $b$). Let $B$ be the even determinant 2 lattice $u^\perp$. The number of roots of $B$ is $12t - 10$ or $12t - 18$, where $t$
is the height of the norm 2 vector $u$ of $D$ corresponding to $B$ by Remark 2.13, so it is sufficient to prove that the number of norm 2 vectors of $B$ which have odd inner product with $b$ is divisible by 4. The vector $b$ has zero inner product with $u$ and integral inner product with $w$, so by Theorem 2.6 $b$ has integral inner product with $\rho$. Hence, $b$ has even inner product with the sum of the positive roots of $B$, so it has odd inner product with an even number of positive roots. This implies that the number of roots of $B$ which have odd inner product with $b$ is divisible by 4. This proves Lemma 3.7.

Corollary 3.8. If $L$ is a 26-dimensional unimodular lattice, then the number of norm 2 vectors of $L$ is divisible by 4.

Proof. The result is obvious if $L$ has no norm 2 roots, so let $r$ be a norm 2 vector of $L$. The lattice $r^\perp$ is a 25-dimensional even lattice of determinant 2, so by Remark 2.13 the number of roots of $r^\perp$ is 2 mod 4. The number of roots of $L$ not in $r^\perp$ is $4h - 6$, where $h$ is the Coxeter number of the component of $L$ containing $r$, so the number of norm 2 vectors of $L$ is divisible by 4. This proves Corollary 3.8.

Remark 3.9. A similar but more complicated argument can be used to show that there is a unique even 26-dimensional positive definite lattice of determinant 3 with no roots. Gluing on a 1-dimensional lattice to this gives a unique 27-dimensional unimodular lattice with no roots and a characteristic vector of norm 3. As a different proof of this has already been published in [EG], we give just a brief sketch of the proof from [B4]. (The preprint [BV] shows that there are exactly three 27-dimensional positive definite unimodular lattices with no roots.) Let $L$ be a 27-dimensional positive definite unimodular lattice with no roots and a characteristic vector $c$ of norm 3. The theta function of $L$ is determined by these conditions, and this implies that $L$ has vectors of norm 5; let $v$ be such a vector.
Then $\langle v, c \rangle^\perp$ is a 25-dimensional even lattice $X$ of determinant 14 such that $X'/X$ is generated by an element of norm $1/14$ mod 2. Such lattices $X$ correspond to norm 14 vectors $x$ in the fundamental domain $D$ of $II_{1,25}$, and the condition that $L$ has no vectors of norm 1 or 2 implies that there are exactly two possibilities for $x$: $x$ is either the sum of $w$ and a norm 0 vector of height 7 corresponding to $A_6^4$, or $x$ is the sum of $w$ and a norm 2 vector of height 6 corresponding to the 25-dimensional lattice of determinant 2 with root system $a_2^9$. Both of these $x$'s turn out to give the same lattice $L$, which therefore has two orbits of norm 5 vectors and is the unique 27-dimensional positive definite unimodular lattice with no roots and with characteristic vectors of norm 3.

The primitive norm 0 vectors of $II_{1,25}$. We list the set of orbits of primitive norm 0 vectors $z$ of $II_{1,25}$, which is of course more or less the same as the well-known list of Niemeier lattices (see [CS2, Table 16.1]). The "Height" is just $(w, z)$, where $w$ is the Weyl vector of a fundamental domain containing $z$. The letter after the height is just a name to distinguish vectors of the same height, and it is the letter referred to in the column headed "Norm 0 vectors" of Table 1. The column headed "Group" is the order
of the subgroup of $\mathrm{Aut}(D)$ fixing the primitive norm 0 vector. However, note that the group order is not (usually) the order of the quotient of the automorphism group of the Niemeier lattice by the reflection group; see [CS2, Chapter 16] for a description of the relation between these groups. For the vector $w$ of height 0, the group is the infinite group of automorphisms of the affine Leech lattice and is an extension of a finite group of the order given by the group of translations of the Leech lattice $\Lambda$.

Table 0. The Primitive Norm 0 Vectors of $II_{1,25}$

Height   Roots               Group
0x       None                $\Lambda \cdot 8315553613086720000$
2a       $A_1^{24}$          1002795171840
3a       $A_2^{12}$          138568320
4a       $A_3^8$             688128
5a       $A_4^6$             30000
6d       $D_4^6$             138240
6a       $A_5^4 D_4$         3456
7a       $A_6^4$             1176
8a       $A_7^2 D_5^2$       256
9a       $A_8^3$             324
10d      $D_6^4$             384
10a      $A_9^2 D_6$         80
12e      $E_6^4$             432
12a      $A_{11} D_7 E_6$    24
13a      $A_{12}^2$          52
14d      $D_8^3$             48
16a      $A_{15} D_9$        16
18d      $D_{10} E_7^2$      8
18a      $A_{17} E_7$        12
22d      $D_{12}^2$          8
25a      $A_{24}$            10
30e      $E_8^3$             6
30d      $D_{16} E_8$        2
46d      $D_{24}$            2
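The list of primitive norm 0 vectors has strong internal consistency: each root system has total rank 24, every component has Coxeter number equal to the height, and the letter in the label is the first letter of the Dynkin diagram. The following Python sketch checks these three properties for the 23 vectors with roots. The plain-text encoding such as `A7^2 D5^2` and the helper names are conventions of this sketch, not of the paper.

```python
import re


def coxeter_number(kind: str, rank: int) -> int:
    """Coxeter number of an irreducible root system of type A_n, D_n, or E_n."""
    if kind == "A":
        return rank + 1
    if kind == "D":
        return 2 * rank - 2
    return {6: 12, 7: 18, 8: 30}[rank]  # E6, E7, E8


def parse(root_system: str):
    """Parse e.g. 'A7^2 D5^2' into a list of (kind, rank, multiplicity)."""
    components = []
    for token in root_system.split():
        match = re.fullmatch(r"([ADE])(\d+)(?:\^(\d+))?", token)
        kind, rank, mult = match.group(1), int(match.group(2)), int(match.group(3) or 1)
        components.append((kind, rank, mult))
    return components


# Label (height followed by letter) -> root system of the Niemeier lattice.
NORM0_VECTORS = {
    "2a": "A1^24", "3a": "A2^12", "4a": "A3^8", "5a": "A4^6",
    "6d": "D4^6", "6a": "A5^4 D4", "7a": "A6^4", "8a": "A7^2 D5^2",
    "9a": "A8^3", "10d": "D6^4", "10a": "A9^2 D6", "12e": "E6^4",
    "12a": "A11 D7 E6", "13a": "A12^2", "14d": "D8^3", "16a": "A15 D9",
    "18d": "D10 E7^2", "18a": "A17 E7", "22d": "D12^2", "25a": "A24",
    "30e": "E8^3", "30d": "D16 E8", "46d": "D24",
}

for label, roots in NORM0_VECTORS.items():
    height, letter = int(label[:-1]), label[-1]
    components = parse(roots)
    # The root system spans a 24-dimensional lattice.
    assert sum(rank * mult for _, rank, mult in components) == 24
    # Every component has Coxeter number equal to the height (Lemma 2.4(1)).
    assert all(coxeter_number(kind, rank) == height for kind, rank, _ in components)
    # The letter is the first letter of the Dynkin diagram.
    assert letter == components[0][0].lower()

checked = len(NORM0_VECTORS)
```

Together with the Leech-type vector 0x this accounts for all 24 orbits; the check says nothing about the group orders, which would need a separate argument.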
The norm 2 vectors of $II_{1,25}$. The following sets are in natural 1:1 correspondence:
(1) orbits of norm 2 vectors in $II_{1,25}$ under $\mathrm{Aut}(II_{1,25})$;
(2) orbits of norm 2 vectors $u$ of $D$ under $\mathrm{Aut}(D)$;
(3) 25-dimensional even bimodular lattices $L$.
The lattice $L(-1)$ is isomorphic to $u^\perp$. Table 1 lists the 121 elements of any of these three sets.

"Height" is the height of the norm 2 vector $u$ of $D$, in other words $(u, w)$, where $w$ is the Weyl vector of $D$. The letter after the height is just a name to distinguish vectors of the same height and is the letter referred to in the column headed "Norm 2" of Table 2. An asterisk after the letter means that the vector $u$ is of type 1, in other words, the lattice $L$ is the sum of a Niemeier lattice and $a_1$.

The column "Roots" gives the Dynkin diagram of the norm 2 vectors of $L$ arranged into orbits under $\mathrm{Aut}(L)$.

"Group" is the order of the subgroup of $\mathrm{Aut}(D)$ fixing $u$. The group $\mathrm{Aut}(L)$ is a split extension $R.G$, where $R$ is the Weyl group of the Dynkin diagram and where $G$ is isomorphic to the subgroup of $\mathrm{Aut}(D)$ fixing $u$.

"S" is the maximal number of pairwise orthogonal roots of $L$.

The column headed "Norm 0 vectors" describes the norm 0 vectors $z$ corresponding to each orbit of roots of $u^\perp$, where $u$ is in $D$. A capital letter indicates that the corresponding norm 0 vector is twice a primitive vector; otherwise the norm 0 vector is primitive. The letter x stands for a norm 0 vector of Leech type. Otherwise the letter a, d, or e is the first letter of the Dynkin diagram of the norm 0 vector, and its height is given by $\mathrm{height}(u) - h + 1$, where $h$ is the Coxeter number of the component of the Dynkin diagram of $u$. For example, the norm 2 vector of type 23a has 3 components in its root system, of Coxeter numbers 12, 12, and 6, and the letters are e, a, and d, so the corresponding norm 0 vectors have Coxeter numbers 12, 12, and 18 and hence are norm 0 vectors with Dynkin diagrams $E_6^4$, $A_{11} D_7 E_6$, and $D_{10} E_7^2$.
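The rule $\mathrm{height}(u) - h + 1$ for reading the "Norm 0 vectors" column can be traced through the worked example 23a. In the Python sketch below, the pairing of the orbits $d_7$, $e_6^2$, $a_5$ with the letters e, a, d is read off in order from the table row, which is an assumption of this illustration; the function names and the three-entry lookup are likewise illustrative.

```python
def coxeter_number(component: str) -> int:
    """Coxeter number of an irreducible component written as 'a5', 'd7', 'e6', ..."""
    kind, rank = component[0], int(component[1:])
    if kind == "a":
        return rank + 1
    if kind == "d":
        return 2 * rank - 2
    return {6: 12, 7: 18, 8: 30}[rank]  # e6, e7, e8


def norm0_label(height_u: int, component: str, letter: str) -> str:
    """Label (height and letter) of the norm 0 vector attached to an orbit of roots."""
    return f"{height_u - coxeter_number(component) + 1}{letter}"


# Only the three entries of the primitive norm 0 vector list needed here.
NORM0_ROOTS = {"12e": "E6^4", "12a": "A11 D7 E6", "18d": "D10 E7^2"}

# Norm 2 vector 23a: root system d7 e6^2 a5, "Norm 0 vectors" entry "ead".
orbits_23a = [("d7", "e"), ("e6", "a"), ("a5", "d")]
labels = [norm0_label(23, component, letter) for component, letter in orbits_23a]
diagrams = [NORM0_ROOTS[label] for label in labels]

print(labels)    # labels of the three norm 0 vectors
print(diagrams)  # their Dynkin diagrams
```

The labels come out as 12e, 12a, and 18d, reproducing the Dynkin diagrams quoted in the example.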
For some remarks on the reliability of Table 1, see the introduction to Table 2.

Table 1. The Norm 2 Vectors of $II_{1,25}$
Roots
Group
S
Norm 0 vectors
1a∗
a1
8315553613086720000
1
X
2a
a2
991533312000
1
x
3a
a19
92897280
9
a
Height
4a
a2 a112
190080
13 aa
5a∗
a124 a1
244823040
25 aA
5b
3456
13 aa
5c
a24 a19 a3 a115
40320
17 aa
6a
a29
3024
9
6b
a3 a25 a16
240
13 aaa
7a∗
a212 a1
190080
13 aA
7b
48
13 aaa
384
17 aad
240
13 aaa
7e
a33 a24 a13 a34 a18 a1 a4 a26 a15 d4 a121
120960
25 ad
8a
a36 a2
240
13 ad
8b
12
13 aaaa
8c
a4 a33 a23 a12 d4 a29
864
13 aa
9a∗
a38 a1
2688
17 aA
9b
a42 a34 a1 a43 a3 a22 a13 d4 a34 a3 a13 a5 a33 a24 a5 a34 a16
16
13 aaa
12
13 aaaa
48
17 aada
24
13 aaa
48
17 aaa
10a
d4 a43 a23
12
13 aaa
10b
a5 a42 a32 a2 a1
4
13 aaaaa
11a∗
a46 a1
240
13 aA
11b
d44 a19 a5 d42 a33 a5 a5 a42 a3 a1 a52 d4 a32 a12 a1
432
25 dd
7c 7d
9c 9d 9e 9f
11c 11d 11e
a
24
17 daa
4
13 aaaaa
8
17 aaaad
11f
a53 a24
48
13 aa
11g
48
17 aad
11h
d5 a36 a1 a6 a42 a32 a2 a1
4
13 aaaaa
12a
a54 a2
24
13 ad
12b
d5 a44 a2
8
13 aaa
12c
a6 d4 a43
6
13 aaa
12d
a6 a52 a3 a22
4
13 aaaa
13a∗
a54 d4 a1
48
17 aaA
13b
4
17 aaada
12
17 aaae
2160
25 dD
4
13 aaaa
4
13 aaaa
13g
d5 a52 d4 a3 a1 d5 a53 a13 a1 d46 a1 a62 a5 a4 a12 a7 a5 a42 a3 a7 a5 d4 a32 a12
4
17 aaaaa
14a
a6 a6 d5 a4 a2
2
13 aaaaa
14b
a63 d4
12
13 aa
14c
a7 a6 a5 a4 a1
2
13 aaaaa
15a∗
a64 a1
24
13 aA
15b
12
17 ade
24
25 ddd
4
17 aada
4
17 aaad
8
17 aad
15g
d53 a5 a3 d6 d44 a13 d6 a52 a5 a3 a7 d52 a32 a1 a72 d42 a1 a8 a53
6
13 aa
15h
a8 a6 a5 a3 a2
2
13 aaaaa
16a
a73 a2
12
13 ad
16b
a8 a6 d5 a4
2
13 aaaa
17a∗
a72 d52 a1
8
17 aaA
13c 13d∗ 13e 13f
15c 15d 15e 15f
17b
e6 a53 d4
12
17 aae
17c
a7 d6 d5 a5
2
17 daaa
17d
4
17 aada
17e
a72 d6 a3 a1 a8 a72 a1
4
13 aaa
17f
a9 d5 a5 d4 a1
2
17 aaaaa
17g
a9 a7 a42
4
13 aaa
18a
e6 a63
6
13 aa
18b
a9 a8 a5 a2
2
13 aaaa
19a∗
a83 a1
12
13 aA
19b
d63 d4 a13 a7 e6 d52 a1
6
25 ddd
4
17 eaad
19d
d7 a7 d5 a5
2
17 aaad
19e
d7 a72 a3 a1
4
17 aaad
19f
a9 a7 d6 a1 a1
2
17 aaaad
19g
a10 a7 a6 a1
2
13 aaaa
20a
a82 e6 a2
4
13 aaa
20b
a10 a8 d5
2
13 aaa
21a∗
a92 d6 a1
4
17 aaA
21b
a11 d6 a5 a3
2
17 aaaa
21c
a11 a8 a5
2
13 aaa
21d∗
d64 a1
24
25 dD
21e
a9 e6 d6 a3
2
17 aaad
23a
d7 e62 a5
4
17 ead
23b
2
25 dddd
23c
d8 d62 d4 a1 a9 d72
4
17 da
23d
a 9 d8 a7
2
17 daa
23e
a11 d7 d5 a1
2
17 aaad
24a
2 a a11 2
4
13 ad
19c
24b
a12 e6 a6
2
13 aaa
25a∗
a11 d7 e6 a1
2
17 aaaA
25b
a13 d6 d5
2
17 aaa
25e∗
e64 a1
48
17 eE
26a
a13 a10 a1
2
13 aaa
27a∗
2 a a12 1
4
13 aA
27b
e7 d63
3
25 dd
27c
a 9 a 9 e7
2
17 ada
27d
d 9 a 9 e6
2
17 ada
27e
a11 d9 a5
2
17 aad
27f
a14 a9 a2
2
13 aaa
29a
a11 e7 e6
2
17 daa
29d∗
d83 a1
6
25 dD
31a
d82 e7 a1 a1
2
25 ddde
31b
d10 d8 d6 a1
1
25 dddd
31c
a15 d8 a1
2
17 aad
33a∗
a15 d9 a1
2
17 aaA
33b
a15 e7 a3
2
17 aad
33c
a17 a8
2
13 aa
35a
e73 d4
6
25 de
35b
a13 d11
2
17 da
36a
a18 e6
2
13 aa
37a∗
a17 e7 a1
2
17 aaA
37d∗
d10 e72 a1
2
25 ddD
39a
d12 e7 d6
1
25 ddd
45d∗
2 a d12 1
2
25 dD
47a
d10 e8 e7
1
25 edd
47b
d14 d10 a1
1
25 ddd
47c
a17 e8
2
17 da
48a
a23 a2
2
13 ad
51a∗
a24 a1
2
13 aA
61d∗
d16 e8 a1
1
25 ddD
61e∗
e83 a1
6
25 eE
63a
d18 e7
1
25 dd
93d∗
d24 a1
1
25 dD
The norm 4 vectors of $II_{1,25}$. There is a natural 1:1 correspondence between the elements of the following sets:
(1) orbits of norm 4 vectors $u$ in $II_{1,25}$ under $\mathrm{Aut}(II_{1,25})$;
(2) orbits of norm 4 vectors in the fundamental domain $D$ of $II_{1,25}$ under $\mathrm{Aut}(D)$;
(3) orbits of norm 1 vectors $v$ of $I_{1,25}$ under $\mathrm{Aut}(I_{1,25})$;
(4) 25-dimensional unimodular positive definite lattices $L$;
(5) unimodular lattices $L_1$ of dimension at most 25 with no vectors of norm 1;
(6) 25-dimensional even lattices $L_2$ of determinant 4.
$L_1$ is the orthogonal complement of the norm 1 vectors of $L$, $L_2$ is the lattice of elements of $L$ of even norm, $L_2(-1)$ is isomorphic to $u^\perp$, and $L(-1)$ is isomorphic to $v^\perp$. Table 2 lists the 665 elements of any of these sets.

"Height" is the height of the norm 4 vector $u$ of $D$, in other words $(u, w)$, where $w$ is the Weyl vector of $D$. The items in Table 2 are listed in increasing order of their height.

"Dim" is the dimension of the lattice $L_1$. A capital E after the dimension means that $L_1$ is even.

The column "Roots" gives the Dynkin diagram of the norm 2 vectors of $L_2$ arranged into orbits under $\mathrm{Aut}(L_2)$.
"Group" gives the order of the subgroup of $\mathrm{Aut}(D)$ fixing $u$. The group $\mathrm{Aut}(L) \cong \mathrm{Aut}(L_2)$ is of the form $2 \times R \cdot G$, where $R$ is the group generated by the reflections of norm 2 vectors of $L$, $G$ is the group described in the column "Group," and 2 is the group of order 2 generated by $-1$. If $\dim(L_1) \le 24$, then $\mathrm{Aut}(L_1)$ is of the form $R \cdot G$, where $R$ is the reflection group of $L_1$ and where $G$ is as above.

For any root $r$ of $u^\perp$, the vector $v = u - r$ is a norm 2 vector of $II_{1,25}$. This vector $v$ can be found as follows. Let $X$ be the component of the Dynkin diagram of $u^\perp$ to which $r$ belongs, and let $h$ be the Coxeter number of $X$. Then $u - r$ is conjugate to a norm 2 vector of $II_{1,25}$ in $D$ of height $t - h + 1$ (or $t - h$ if the entry under "Dim" is 24E), where $t$ is the height of $u$, whose letter is the letter corresponding to $X$ in the column headed "Norm 2." For example, let $u$ be the vector of height 6 and root system $a_2^2 a_1^{10}$. Then the norm 2 vectors corresponding to roots from the components $a_2$ or $a_1$ have heights $6 - 3 + 1$ and $6 - 2 + 1$ and letters a and b, so they are the vectors 4a and 5b of Table 1.

If $\dim(L_1) \le 24$, then the column "Neighbors" gives the two even neighbors of $L_1 + I^{24 - \dim(L_1)}$. If $\dim(L_1) \le 23$, then both neighbors are isomorphic so only one is listed, and if $L_1$ is a Niemeier lattice, then the neighbor is preceded by 2 (to indicate that the corresponding norm 0 vector is twice a primitive vector). If the two neighbors are isomorphic, then there is an automorphism of $L$ exchanging them.

Tables 1 and 2 were originally calculated by hand. Most of the lattices were found several times, once for each orbit of roots, and this gave a large number of checks for most entries. I later ran a computer version of the algorithm of this paper, which turned up about 20 minor errors (mostly errors in column 5, and a few misprints in the group order and root systems which were due to copying errors). I also checked the Minkowski-Siegel mass formula.
Any errors remaining are probably either copying errors (the tables are based on computer output but have had some hand editing to turn them into nice-looking TeX) or an error where a lattice should be split into two lattices, each with twice the automorphism group. The second possibility cannot be detected by mass formulas, but I think it unlikely that it occurs in these tables. (It becomes an irritating problem when classifying the 26-dimensional even lattices of determinant 3.)

Table 2. The Norm 4 Vectors of $II_{1,25}$
Height Dim
Roots
Group
1
24 E
None
8315553613086720000
2
23
a12
84610842624000
2
24
None
1002795171840
Norm 2
Neighbors 2!
a
! !
A24 1
3
25
a12
88704000
a
4
24
a18
20643840
a
4
25
a22
26127360
a
4
25
a16
138240
a
5
24
a112
190080
a
5
25
a2 a17
5040
aa
5
25
a110
1920
a
6
23
a116 a12
645120
ca
A24 1
6
24
a22 a110
5760
ab
A12 2
A12 2
6
24
a116
43008
c
A24 1
A83
6
25
a3 a18
21504
ac
6
25
a22 a18
128
ab
6
25
a2 a110 a1
120
abc
6
25
a18 a16
1152
bc
7
24
a24 a18
384
bb
A12 2
A83
7
24 E
a124
244823040
a
2A24 1
7
25
a25 a13
720
bb
7
25
72
acb
7
25
24
bba
7
25
1440
ab
7
25
12
bbb
7
25
a3 a2 a19 a24 a14 a12 a3 a112 a23 a16 a13 a22 a112
144
cb
8
22
a3 a122
887040
ae
A24 1
8
23
a26 a16 a12
1440
bda
A12 2
8
24
a32 a112
768
cc
A83
A83
8
24
a3 a24 a16
96
bbb
A83
A83
8
24
a28
672
a
A83
A83
8
24
a26 a16
240
bd
A12 2
A64
24
a124
e
A24 1
D46
8
138240
A24 1
A24 1
A24 1
A12 2
25
a4 a112
1440
ad
8
25
16
bbb
8
25
64
cbc
8
25
12
bbbc
6
bbbbd
8
bab
16
bbcbd
8
bbbdb
48
bdc
720
cce
8
8
25
8
25
8
25
8
25
8
25
8
25
a3 a24 a14 a32 a18 a12 a3 a23 a16 a1 a3 a23 a13 a13 a1 a24 a22 a14 a3 a22 a14 a14 a12 a24 a2 a14 a12 a1 a24 a18 a12 a3 a115 a1
9
24
a32 a24 a14
16
bbb
A83
A64
9
24
a28 a14
384
dc
A12 2
A45 D4
9
25
a4 a23 a16 a1
12
bdbb
9
25
a32 a24 a12
16
bba
9
25
4
bbcba
9
25
2
bbbbbbb
9
25
6
abb
9
25
8
bcbb
9
25
8
bbbbc
9
25
a32 a22 a2 a14 a1 a3 a3 a22 a2 a12 a12 a1 a3 a26 a12 a32 a22 a14 a14 a3 a24 a2 a14 a1 a3 a22 a22 a2 a12 a12 a1
2
bbbdbbb
10
22
a3 a210
2880
ac
A12 2
10
23
384
cfa
A83
10
23
48
bbea
A83
10
24
32
bab
A64
A64
10
24
16
bdbc
A64
A64
10
24
16
bbbe
A83
A45 D4
10
24
48
cdf
A83
A45 D4
10
24
384
cd
A83
D46
10
24 E
a34 a18 a12 a33 a24 a12 a12 a34 a22 a12 a4 a3 a24 a14 a32 a3 a24 a12 a34 a14 a14 a34 a18 a212 a35 d4 a24 a16 d4 a3 a112
190080
a
2A12 2
10
25
10
25
10
25
1920
c
144
bcd
576
ced
25
a5 a115
720
cf
10
25
8
bdbb
10
25
6
bcbb
10
25
2
bdbbbce
10
25
10
25
10
25
10
25
10
25
10
25
10
25
10
25
10
25
10
25
10
25
10
25
10
25
10
25
10
25
a4 a3 a24 a12 a33 a3 a2 a13 a4 a3 a22 a2 a12 a12 a1 a33 a3 a14 a12 a32 a32 a14 a12 a4 a26 a12 a4 a3 a22 a14 a12 a12 a32 a3 a22 a2 a12 a1 a32 a3 a22 a2 a1 a1 a1 a4 a25 a15 a32 a26 a32 a24 a22 a33 a22 a13 a12 a1 a3 a3 a3 a22 a12 a1 a1 a1 a1 a3 a3 a22 a22 a2 a12 a1 a32 a24 a14 a12 a33 a112 a3 a26 a2 a13
11
24
11 11
10
24
bcce
16
cbbd
6
abe
4
bdbccf
2
bbbbbc
2
bbbadbc
10
bbc
48
da
8
bba
12
bbdcf
2
cbbbcebfd
2
bbbbbee
8
dbcf
48
cf
12
dbce
a4 a32 a3 a22 a12
4
bbbbb
A64
A45 D4
24
a42 a24 a14
16
dca
A64
A45 D4
24
a36
240
a
A64
D46
A83
A46
24
a34 a24
24
be
11
25
8
befb
11
25
8
bcda
11
25
2
bbbcba
11
25
4
dcbbb
11
25
4
bbb
11
25
6
cbb
11
25
a5 a24 a2 a14 d4 a3 a24 a14 a4 a32 a3 a2 a12 a1 a42 a22 a2 a14 a1 a4 a32 a24 a4 a33 a16 a4 a32 a22 a2 a12 a1
2
bbbfab
11
25 a4 a3 a3 a2 a2 a2 a1 a1 a1
1
bbbbccbba
11
25 a4 a3 a3 a2 a2 a2 a1 a1 a1
1
bbbbecbbb
11
25
8
bab
11
a34 a3 a14
25
a32 a32 a22 a2 a1
2
bbbbb
11
25
2
bbabda
11
25
2
bbeccba
11
25
a32 a3 a3 a22 a2 a1 a4 a3 a22 a22 a2 a12 a1 a32 a32 a22 a14
4
bbdb
12
22
a36 a3 a12
96
dag
A83
12
23
a42 a32 a22 a12 a12
8
bcbha
A64
12
23
a4 a35 a12
40
aba
A64
12
24
d4 a4 a26
24
dca
A45 D4
A45 D4
12
24
d4 a34 a14
32
cde
A45 D4
A45 D4
12
24
a5 a33 a16 a1
24
cfec
A45 D4
A45 D4
12
24
a42 a32 a3 a12
8
bbbd
A45 D4
A45 D4
12
24
a5 a32 a24 a1
16
bebf
A45 D4
A45 D4
12
24
d4 a34 a14
48
cdc
D46
A45 D4
12
24
1152
eb
24
4
bcbh
12
24
32
fdg
D46 A64 A83
D46
12 12
25
8
cefdc
12
25
2
bebbdh
12
25
4
bddac
12
25
8
dcae
12
25
4
bfbdhf
12
25
2
bfebhdee
12
25
d42 a116 a42 a32 a22 a12 a34 a32 a14 a5 a32 a3 a14 a1 a5 a32 a22 a2 a1 a1 d4 a32 a3 a22 a12 d4 a4 a24 a14 a5 a32 a22 a12 a12 a1 a5 a3 a3 a22 a12 a1 a1 a1 a42 a3 a3 a2 a12 a1
2
bbcade
12
25
a4 a4 a3 a3 a2 a1 a1 a1
1
bbccbddd
12
25
4
bbb
12
25
4
bche
12
25
8
bdaeg
12
25
12
cdegb
12
25
2
abcbc
12
25
a42 a3 a24 a42 a32 a14 a12 d4 a32 a24 a1 a1 d4 a33 a16 a1 a1 a4 a32 a32 a2 a1 a42 a3 a22 a2 a1 a1 a1
2
bcbbfdh
12
25 a4 a4 a3 a2 a2 a2 a1 a1 a1
1
bbcabbhhd
12
25
a4 a34 a14
8
ach
25
a4 a32 a3 a3 a12 a12
2
bbcfde
11
12
A46 A27 D52
12
25
a4 a33 a23 a1
6
bbag
12
25
a4 a3 a3 a3 a2 a2 a2 a1
1
bbbebbah
12
25
8
bfdc
12
25
2
bccbhge
12
25
2
bcfbhh
12
25
a34 a3 a3 a12 a4 a32 a3 a22 a12 a1 a1 a4 a32 a3 a22 a12 a12 a32 a32 a3 a22 a12
4
edbbh
13
24
a44 a14
24
cc
A45 D4
A46
13
24
a5 a4 a3 a3 a2 a2 a1
2
bebbddd
A45 D4
A46
13
24
a42 a34
16
bb
A64
A27 D52
13
24
a42 a4 a3 a22 a12
4
ccahb
A64
A27 D52
13
24 E
a38
2688
a
2A83
13
25
a6 a26 a13
12
dhd
13
25
24
ca
13
25
2
bfbdc
13
25
6
bdac
13
25
a44 a12 a5 a4 a32 a2 a12 d4 a4 a33 a12 d42 a26
72
cc
13
25
a5 a4 a3 a2 a2 a2 a1 a1
1
bebhdeda
13
25
a5 a4 a3 a2 a2 a2 a1 a1
1
bebhdddd
13
25
d4 a4 a3 a3 a2 a2 a1 a1
1
bdaaeccb
13
25
a42 a4 a3 a22
2
bcbd
13
25
a4 a4 a4 a3 a2 a1 a1 a1
1
bccbdbcd
13
25
6
bbd
13
25
2
abbhc
13
25
2
behdfd
13
25
a5 a33 a23 a5 a32 a3 a22 a2 a5 a4 a22 a22 a2 a12 a5 a32 a3 a22 a12 a1
2
bbbedd
13
25
a4 a4 a3 a3 a3 a2 a1
1
bcbbbdc
13
25
a4 a4 a3 a3 a3 a2 a1
1
bbbabdb
13
25
6
cdb
13
25
6
bacg
13
25
2
ebaddd
13
25
a43 a23 a13 d4 a33 a23 a2 a42 a3 a3 a22 a2 a1 a5 a29
72
cf
14
21
d4 a37
336
ag
A83
14
22
a44 a3 a22
16
aab
14
23
a5 a42 a3 a3 a12 a1
4
bbddaf
A45 D4
14
23
d4 a43 a22 a12
12
caca
A45 D4
14
23
a5 d4 a33 a13 a12
12
dfega
A45 D4
14
23
a52 a3 a24 a12
16
efda
A45 D4
14
23
96
dcd
D46
14
24
12
cbe
A46
A46
14
24
8
eda
A46
A46
14
24
12
bhd
A46
A46
14
24
4
caac
A45 D4
A27 D52
14
24
4
dfecbg
A45 D4
A27 D52
14
24
8
febcg
A45 D4
A27 D52
14
24
4
bbddf
A45 D4
A27 D52
14
24
32
dc
D46
A27 D52
12
bh
A38
384
g
A64 A83
8
bgbb
48
cgc
2
bhhcf
8
efee
2
bhhdeg
6
cabc
14
24
14
24
14
25
14
25
14
25
14
25
14
25
14
25
14
25
14
25
14
25
d42 a34 a12 a5 a43 a13 a52 a32 a22 a6 a33 a23 d4 a42 a4 a22 a5 d4 a32 a3 a12 a1 a52 a32 a12 a12 a12 a5 a42 a3 a3 a1 d42 a34 a43 a33 a38 d5 a32 a24 a12 d5 a33 a18 a6 a32 a3 a22 a1 a52 a3 a3 a14 a6 a32 a3 a2 a12 a12 d4 a43 a13 a1 a5 d4 a3 a3 a22 a1 a5 a5 a3 a22 a12 a1 a1 a5 a42 a3 a2 a12
14
25
14
25
14
25
14
25
14
2
deeccb
2
efddefg
2
bbfdf
a5 a4 a4 a3 a2 a1 a1
1
cbbdabe
4
baeb
8
dcbb
2
cbhcee
25
d4 a42 a32 a12 d42 a32 a3 a14 a5 a42 a3 a12 a12 a1 a5 d4 a32 a14 a12 a1
4
dfegcg
14
25
a5 a4 a3 a3 a3 a2
1
bbhddc
14
25
a5 a4 a3 a3 a3 a2
1
bbeddc
14
25
a5 a42 a22 a2 a12
2
cbddf
25
a44 a22
8
ba
14
A64
D64
14
25
d4 a4 a4 a3 a2 a2 a1 a1
1
caaecbbg
14
25
a5 a4 a3 a3 a3 a1 a1 a1
1
bbhedbfg
14
25
4
dddcg
14
25
2
bheddf
14
25
2
cbhdge
14
25
a5 a32 a32 a3 a1 a5 a32 a3 a3 a3 a1 a5 a4 a32 a22 a12 a1 a42 a4 a32 a2 a1
2
bbdbf
14
25
a4 a4 a4 a3 a3 a2 a1
1
abbdhcf
14
25
2
bahbdf
14
25
a42 a4 a3 a22 a2 a1 d4 a34 a3 a14
8
fegg
15
24
a52 a42 a12
4
bdc
A46
A27 D52
15
24
a6 a4 a4 a3 a2 a1 a1
2
chhceaa
A46
A27 D52
15
24
a52 a4 a3 a22
4
bfdf
A45 D4
A38
15
24
a44 a32
16
hb
A64
A29 D6
15
25
20
ab
15
25
2
bgbba
15
25
d5 a35 d5 a4 a32 a22 a12 a6 a42 a3 a12 a1
2
chdcb
15
25
a6 a4 a4 a2 a2 a1 a1 a1
1
chhefacc
15
25
a5 d4 a42 a12 a1
2
abeab
15
25
a5 a5 a4 a3 a2 a1
1
bbddea
15
25
a6 a4 a3 a3 a2 a2 a1
1
bhdcfga
15
25
2
bhdfc
15
25
15
25
a6 a4 a32 a22 a1 d42 a42 a22 a52 a4 a3 a12 a12
15
25
15
25
15
25
15 15
4
acb
2
bedcc
a5 d4 a4 a3 a2 a2 a1
1
abecbba
2
bdac
2
adfec
25
a52 a32 a3 a12 a5 a42 a4 a2 a12 a5 a42 a4 a2 a12
2
addca
25
a5 a4 a4 a4 a2 a1 a1
1
bhddeca
15
25
a45
5
d
15
25
a5 a4 a4 a3 a3 a2
1
bhdcab
15
25
d4 a42 a32 a3
2
bccb
16
21
d4 a45
40
ab
A64
16
22
a52 d4 a32 a3
8
ceba
A45 D4
16
22
a53 a3 a3 a13
12
ecad
A45 D4
16
22
d44 a3 a16
144
bdc
D46
16
23
a6 a5 a4 a3 a2 a12 a1
2
bhdecah
A46
16
23
a53 a4 a12 a1
6
daag
16
24
16
dgb
16
24
4
fgbceb
16
24
16
cef
16
24
16
fge
16
24
4
cbba
16
24
d5 d4 a34 d5 a5 a32 a3 a12 a1 a52 d42 a12 a7 a34 a14 d5 a4 a42 a22 a6 d4 a42 a2
4
ahcb
16
24
a6 a5 a4 a3 a2 a1
2
bhdech
16
24
a52 d4 a32 a12
4
eegd
16
24
a5 a5 a42 a3
4
dddf
16
24
a52 d4 a32 a12
8
eebd
16
24
48
bc
16
24 E
240
a
16
25
d44 a18 a46 a7 a34 a12
8
fff
2
efggch
36
bcb
2
bbcbdb
6
dgbec
16
25
16
25
16
25
16
25
16
25
a7 a3 a3 a3 a22 a12 a5 d43 a13 d5 a42 a3 a3 a1 a1 d5 d4 a33 a13 a1 d5 a5 a3 a24 a1
4
egcad
16
25
a6 a5 a4 a2 a2 a1 a1
1
bhdcagh
16
25
d5 a4 a34
4
bbb
16
25
d5 a42 a3 a22 a12
2
cbcae
16
25
a6 d4 a4 a3 a2 a1 a1
1
ahcgafe
16
25
a52 a5 a3 a2
2
deeb
16
25
a6 a5 a3 a3 a2 a2
1
bhfebc
16
25
a6 a5 a3 a3 a2 a1 a1 a1
1
bhegchhh
16
25
2
bcee
16
25
2
defchd
16
25
4
cfcb
16
25
a6 a42 a32 a1 a5 a5 a5 a22 a12 a1 a52 d4 a3 a22 a5 a5 d4 a3 a22
2
ccdga
A46 2 A7 D52 A27 D52 A27 D52 A27 D52 A27 D52 A27 D52 A46 A45 D4 A45 D4 A45 D4 D46 2A64
A27 D52 A27 D52 A27 D52 A27 D52 A27 D52 A27 D52 A38 A29 D6 A29 D6 D64 D64
16
25
a6 d4 a32 a22 a2
2
ahgab
16
25
a5 a5 a4 a4 a2 a1
1
ddddcg
16
25
a5 a5 a4 a4 a2 a1
1
ddadad
16
25
a6 a4 a4 a3 a2 a2 a1
1
bddecah
16
25
2
edccb
16
25
2
cebecf
16
25
2
haeh
16
25
a5 d4 a42 a3 a1 a5 d42 a32 a1 a1 a1 a52 a4 a32 a12 a52 a4 a3 a3 a12
2
ddebd
16
25
a5 a5 a4 a3 a3 a1 a1
1
eddefdg
25
a52 a34
8
cf
16
25
2
hdcch
16
25
a52 a4 a3 a22 a12 d4 a42 a4 a32
2
hcbb
17
24
a7 a42 a32
4
bfc
A27 D52
A38
17
24
a62 a4 a3 a12
4
hebb
A27 D52
A38
17
24
a6 a5 a5 a3 a2
2
dddcg
A46
A29 D6
17
24
a62 a32 a22
4
hah
A46
A29 D6
17
24
a54
24
a
A46
D64
17
25
a7 a4 a4 a3 a2 a1
1
bfgcgb
17
25
2
hcfa
17
25
2
bfhea
17
25
a62 a32 a2 a1 a7 a42 a22 a2 a1 d5 d4 a42 a22
2
abbb
17
25
d5 a5 a4 a3 a3 a1
1
bbbaab
17
25
d5 a5 a4 a3 a2 a2 a1
1
bbbaedb
17
25
a6 a5 d4 a3 a2 a1
1
ecdafb
17
25
a6 a5 a42 a12
2
ddeb
17
25
a6 a5 a4 a3 a3
1
dcgbc
17
25
a6 a5 a4 a3 a3
1
dcfcc
17
25
a6 a5 a4 a3 a3
1
dceac
17
25
a52 d4 a4 a3
2
cdbb
17
25
a6 a4 a4 a4 a3 a1
1
deffab
18
21
a53 d4 d4 a1
12
bcab
A45 D4
22
a62 a42 a3
4
caa
A46
16
18
18
23
a6 d5 a4 a4 a2 a12
2
bhaaba
A27 D52
18
23
a62 d4 a4 a12
4
ceba
A27 D52
18
23
d5 a52 d4 a12 a12
4
ebcfa
A27 D52
18
23
a7 a5 a42 a12 a1
4
dfcag
A27 D52
18
23
a7 d42 a32 a12
8
cgfa
A27 D52
18
23
16
gea
A27 D52
18
24
8
ffa
A38
A38
18
24
d52 a34 a12 a7 a52 a22 a7 a5 a42 a1
4
dfcg
A27 D52
A29 D6
18
24
a7 a5 d4 a3 a1 a1 a1
2
eggfdfc
A27 D52
A29 D6
18
24
a6 d5 a4 a4 a2
2
bhaab
A27 D52
A29 D6
18
24
d52 a34
16
gb
A27 D52
D64
18
24
d5 a52 d4 a12
4
ebbc
A27 D52
D64
18
24
a52 a5 d4 a3 a1
4
gbcdb
A45 D4 A11 D7 E6
18
24
a54 a14
48
cb
18
25
d6 a34 a3 a12
8
ddcd
18
25
8
fge
18
25
2
egfbc
18
25
a7 a52 a14 a7 a5 d4 a22 a1 a7 d42 a3 a14
4
cgef
18
25
a7 a5 a4 a3 a2
1
dfchb
18
25
a62 a5 a2 a1 a1
2
deaed
18
25
a7 a5 a4 a3 a1 a1 a1
1
dgchfeg
18
25
a6 d5 a4 a3 a2 a1 a1
1
bhaebfd
18
25
d5 a5 a5 a4 a1 a1
1
dbcacc
18
25
a7 a5 a32 a3 a1
2
dggfd
18
25
a7 a5 a32 a3 a1
2
dfhfg
18
25
a6 a6 a4 a3 a2 a1
1
cdchbg
18
25
a6 a5 a5 a4 a1
1
aeecc
18
25
2
ebcefb
18
25
4
cbbc
18
25
2
dbadf
18
25
d5 a5 d4 a32 a12 a1 d5 d42 a32 a3 d5 a5 a42 a3 a1 d5 a5 a42 a3 a1
2
dbabb
18
25
a6 a5 d4 a4 a2 a1
1
cgeabf
18
25
a6 a5 a4 a4 a3
1
cfacg
25
a52 d42 a3 a12
2
bgff
18
A45 D4
E64
18
25
a6 a42 a42 a1
2
bcag
19
24
a8 a42 a32
4
hhb
A38
A29 D6
19
24
a7 a6 a5 a2 a1
2
dfceb
A38
A29 D6
19
24
a6 a6 a5 a4 a1
2
eeaha
A46 A11 D7 E6
2160
d
2D46
48
aa
2A45 D4
2
addc
19
24 E
19
24 E
19
25
d46 a54 d4 d6 a42 a4 a22
19
25
a8 a4 a4 a3 a2 a1
1
hghbfb
19
25
a8 a42 a22 a2 a1
2
hhgeb
19
25
a7 a6 a4 a2 a2 a1
1
dfheeb
19
25
d52 a4 a4 a22
2
bbec
19
25
a6 d5 d4 a4 a2
1
bcaec
19
25
a6 d5 a5 a3 a2 a1
1
bdabca
19
25
a7 a52 a3 a12
2
dcab
19
25
3
aaa
19
25
d5 a53 a1 a7 a52 a22 a12
2
dcdb
19
25
a7 a5 a4 a4 a2
1
achgc
19
25
a7 d4 a4 a4 a3
1
ccefb
19
25
2
fbad
19
25
6
cb
19
25
a62 a5 a3 a2 a6 a53 a6 a52 a5
2
ecb
19
25
a6 a5 a5 d4 a2
1
babcc
19
25
d5 a52 a4 a22
2
dadc
20
20
d5 a54
16
ad
A45 D4
20
20
d5 d45
120
dc
D46
20
21
a63 d4 a2
6
aaa
A46
20
22
a7 d5 a5 a3 a3 a1
2
bgecad
A27 D52
20
22
d52 a52 a3
8
bba
A27 D52
20
22
a72 a32 a3 a12
8
gdae
A27 D52
20
23
a7 a62 a12 a12
4
ecga
A38
20
23
a72 a4 a3 a12
4
faea
A38
23
a8 a52 a22 a12
4
dhba
A38
20
20
24
a72 d4 a14
8
gff
A29 D6
A29 D6
20
24
d6 a52 a32
8
edd
A29 D6
A29 D6
20
24
a8 a52 a3
4
dge
A29 D6
A29 D6
20
24
d6 a52 a32
4
edc
D64
A29 D6
20
24
d6 d43 a16
12
bcb
D64
20
24
a7 d5 a5 a3 a1 a1 a1
2
cgefecd
20
24
d5 d5 a52 a12
4
bced
20
24
a7 d5 d4 a3 a3
2
bgecf
20
24
a6 a6 d5 a4
2
aaeb
20
24
8
cbc
20
24
4
ch
20
24
48
c
20
25
2
dhfg
20
25
2
eddfeb
20
25
2
ccdcd
20
25
d52 a52 a12 a62 a52 d46 a8 a52 a12 a12 d6 a52 a3 a12 a1 a1 d6 a5 d4 a32 a1 a7 a7 a3 a22 a12
D64 2 A7 D52 A27 D52 A27 D52 A27 D52 A27 D52 A46 D46
2
fggbg
20
25
a8 d4 a4 a3 a3
1
chbff
20
25
2
bbecb
20
25
d52 a5 d4 a12 a1 a7 d5 a42 a1 a1
2
cfbcf
20
25
a7 a6 a5 a3
1
ecfe
20
25
a62 d5 a3 a12
2
aedd
20
25
a7 d5 a4 a3 a3
1
bfbfc
20
25
a7 a6 a5 a2 a1 a1 a1
1
echbgfe
20
25
a7 a6 a4 a4 a1
1
ecbad
20
25
a62 a6 a3 a1
2
abee
20
25
a6 a6 a6 a3 a1
1
caceg
20
25
a6 d5 a5 a4 a2 a1
1
aeebad
20
25
a6 a6 a5 d4 a1
1
abfhd
20
25
2
ehag
20
25
a7 a52 a4 a12 d5 a5 a5 a42
2
fbdb
21
24
a8 a6 a5 a2 a1
2
ehbga
21
24
a72 a42
4
cg
21
25
d6 a6 a4 a4 a2
1
cdcdd
A11 D7 E6 A11 D7 E6 A11 D7 E6 A11 D7 E6 E64 A212 D83
A38 A11 D7 E6 A27 D52
A212
21
25
a8 a6 a4 a3 a1
1
eggbb
21
25
a7 a6 d5 a2 a1 a1
1
aecfba
21
25
a6 d5 d5 a4 a2
1
baacc
21
25
a7 d5 a5 a4 a1
1
acbdb
21
25
a7 a6 a5 a4
1
cgbe
22
21
a7 d52 d4 a3
4
beac
A27 D52
22
22
a8 a62 a3
4
bba
A38
22
23
a8 a7 a5 a12 a1
2
cgeac
A29 D6
22
23
a8 a6 d5 a2 a12
2
abhba
A29 D6
22
23
a7 d6 a5 a3 a12 a1
2
dgdfab
A29 D6
22
23
a9 a5 a42 a12
4
fgba
A29 D6
22
23
d6 d5 a52 a12
4
bdcd
D64
22
23
d54 a12
48
bd
D64
22
24
a9 a5 d4 a3 a1 a1
2
gfffeb
A29 D6 A11 D7 E6
22
24
a7 d6 a5 a3 a1
2
dgcfe
A29 D6 A11 D7 E6
22
24
a8 a6 d5 a2
2
abhb
A29 D6 A11 D7 E6
22
24
d6 d5 a52
4
bdc
D64 A11 D7 E6
22
24
d54
48
b
D64
E64
22
24
a8 a7 a4 a3
2
chbg
A38
A212
22
24
a7 d52 a32
4
eed
A27 D52
D83
22
24
a72 d42
8
fd
A27 D52
D83
22
24 E
a64
24
a
2A46
12
cbc
2
dbace
8
fba
24
ge
2
gfgb
22
25
22
25
22
25
22
25
22
25
e6 d42 a33 e6 a5 a42 a3 a1 e6 a52 a24 d7 a36 a9 a5 d4 a22
22
25
a9 a5 a4 a3 a1 a1
1
ffbgbc
22
25
a9 a5 a32 a3
2
fggf
22
25
a8 a7 a4 a2 a1
1
chbbc
22
25
d6 d5 a5 a3 a3 a1
1
bdcdbe
22
25
a7 d6 a32 a3 a12
2
dgfeb
22
25
a8 d5 a5 a3 a1
1
agffe
22
25
a8 d5 a5 a2 a2 a1
1
ahfbab
22
25
a8 a6 a5 a3
1
bbgg
22
25
a8 a6 a5 a3
1
cbef
22
25
2
bcdb
22
25
6
cdb
22
25
d6 a52 d4 a3 d6 a53 a13 a8 a52 d4
2
bfe
22
25
a8 a6 a4 a4 a1
1
cbbbc
22
25
a7 a7 a5 a3 a1
1
ghefc
22
25
6
ec
22
25
d53 a33 a7 d5 d42 a3
2
bffc
23
24
a82 a32
4
hb
A29 D6
A212
23
24
a9 a6 a5 a2
2
cgbc
A29 D6
A212
23
24
a73
12
a
A38
D83
23
25
d7 a44
4
bd
23
25
a9 a6 a4 a3
1
cfgb
23
25
a8 d52 a22
2
ebe
23
25
d6 a6 a6 a4
1
accf
23
25
d6 a62 a4
2
bce
23
25
a8 a7 d4 a3
1
fbbb
23
25
2
baa
23
25
a72 d5 a3 a62 d52
2
cb
24
20
a72 d5 d5
4
cda
A27 D52
24
21
a82 d4 a4
4
baa
A38
24
22
a9 a7 d4 a3 a1
2
fffad
A29 D6
24
22
a72 d6 a3
4
cfa
A29 D6
24
22
d62 d42 a3 a12
4
cbdb
24
24
d7 a52 a5 a1
4
cdea
A11 D7 E6 A11 D7 E6
24
24
4
efec
A11 D7 E6 A11 D7 E6
24
24
4
bgce
A11 D7 E6 A11 D7 E6
24
24
4
eaa
A11 D7 E6 A11 D7 E6
24
24
4
cbca
E64 A11 D7 E6
24
24
a9 d52 a12 a1 a7 e6 d4 a32 e6 a62 a4 e6 d5 a52 a12 d62 d42 a14
4
cbb
D64
D64
D83
24
24
a72 d6 a12
4
dfd
A29 D6
D83
24
24
a72 d5 d4
4
fde
A27 D52
A15 D9
24
25
2
bdeeb
24
25
2
fgcd
24
25
d7 a52 d4 a1 a1 a9 a7 a32 a1 d6 d52 a5 a1
2
bcba
24
25
a7 d6 d5 a3 a1 a1
1
cedeeb
24
25
a8 a7 a6 a1
1
aebd
24
25
a7 d6 a5 d4 a1
1
cfdfb
24
25
a72 a7 a12
2
edd
24
25
a8 a7 d4 a4
1
bfgb
24
25
a7 a7 d5 a4
1
ccgb
24
25
a72 d5 a4
2
cea
25
24
a10 a6 a5 a1
2
hgbb
A11 D7 E6
A212
A38 2A27 D52
A15 D9
24
a8 a72
4
eb
25
24 E
a72 d52
8
aa
25
25
d7 a6 a6 a2 a2
1
adedc
25
25
a10 a6 a4 a2 a1
1
hgcea
25
25
e6 a6 d5 a4 a2
1
acaea
25
25
e6 a62 d4
2
bca
25
25
a7 e6 a5 a4 a1
1
acaeb
25
25
a10 a5 a4 a4
1
gbcb
25
25
a8 d6 a6 a2
1
dbfc
25
25
a9 a6 d5 a2 a1
1
bfbeb
25
25
a9 a62 a2
2
agc
25
25
a9 d5 a5 a4
1
bbbe
26
19
a72 d6 d5
4
dae
A27 D52
26
21
a9 d6 a5 d4
2
cfea
A29 D6
26
23
a9 d6 d5 a12 a1
2
cffab
A11 D7 E6
26
23
2
bbga
A11 D7 E6
26
23
2
ahaba
A11 D7 E6
26
23
2
edeea
A11 D7 E6
26
23
a10 a6 d5 a12 a8 e6 a6 a2 a12 d7 a7 d5 a3 a12 e6 d6 a52 a12
4
dbea
A11 D7 E6
25
26
23
e6 d53 a12
12
bce
E64
26
24
a92 a22
8
ga
A212
A212
26
24
d7 a7 d5 a3
2
eddc
A11 D7 E6
D83
26
24
a9 d6 a5 a3
2
dfbd
A29 D6
A15 D9
A29 D6 A27 D52
D10 E72
A38
A17 E7
26
24
a9 a8 a5
2
ebc
26
24
a72 d52
8
ce
26
25
d7 d52 a3 a3
2
bdba
26
25
a10 d5 a5 a2 a1
1
bgbbb
26
25
a7 d62 a3
2
bcc
26
25
a9 d6 d4 a3 a1
1
cfbeb
26
25
a9 a7 a6
1
egb
26
25
a9 a7 a6
1
efb
26
25
a7 d6 d5 a5 a1
1
dffeb
27
24
a82 a7
4
ga
27
25
a8 d7 a4 a4
1
dbde
27
25
2
baa
27
25
a92 a3 a12 a10 a62 a1
2
eca
28
20
a92 d5 a12
4
fac
A29 D6
28
20
d63 d5 a12
6
bdb
D64
28
22
a9 e6 d5 a3 a1
2
cfead
A11 D7 E6
28
22
a9 d7 a5 a3
2
dfca
A11 D7 E6
28
22
a11 d5 a5 a3 a1
2
fbeae
A11 D7 E6
28
22
8
bae
E64
28
23
2
bgaaf
A212
28
23
2
gcaa
A212
28
24
e62 a52 a3 a10 a9 a2 a12 a1 a11 a7 a4 a12 d8 d44
8
cb
D83
D83
28
24
a9 d7 a5 a1 a1
2
efede
A11 D7 E6
A15 D9
28
24
a11 d5 d4 a3
2
fbeb
A11 D7 E6
A15 D9
A29 D6 A29 D6 D64 2A38
A17 E7
28
24
a9 a7 d6 a1
2
fefc
28
24
a9 a7 d6 a1
2
fbfc
28
24
d62 d6 d4 a12
2
bbbb
a83
12
a
28
24 E
A15 D9
D10 E72 D10 E72
25
d8 a52 a5 a1
2
ddbe
28
25
a11 a7 a22 a12
2
gbaf
28
25
a11 d5 a4 a3
1
fcbb
28
25
e6 d6 d5 a5 a1
1
cceab
28
25
d7 d6 a5 a5
1
cdcb
28
25
a8 a7 e6 a1 a1
1
aeedc
28
25
a9 e6 a42 a1
2
cgbc
28
25
a10 a8 a4 a1
1
bbae
28
25
a92 a4 a12
2
gaf
28
25
a9 a8 d5 a1
1
fbcd
29
24
a11 a8 a3
2
bca
29
25
a10 d6 a6
1
fbe
30
21
d7 a7 e6 d4
2
cada
30
22
4
ba
A212
30
23
4
dcd
D83
30
23
4
ddd
D83
30
24
2 a a10 3 d72 a7 a12 d8 a72 a12 d8 a72
4
dd
D83
A15 D9
30
24
a11 d6 a5 a1
2
fbba
A11 D7 E6
A17 E7
30
24
a10 e6 a6
2
agb
A11 D7 E6
A17 E7
30
24
d7 a7 e6 a3
2
caed
A11 D7 E6
D10 E72
30
24
a9 e6 d6 a1
2
efea
A11 D7 E6
D10 E72
30
24
e62 d52
8
ca
E64
D10 E72
30
25
d8 a7 d5 a3
1
cbdd
30
25
a10 a9 a4
1
bca
30
25
d7 a7 d52
2
cae
31
24
a12 a7 a4
2
gbf
A212
A17 E7
28
31
24 E
31
24 E
d64 a92 d6
24
d
4
aa
31
25
a12 a6 a5
1
gba
31
25
a8 e62 a22
2
aaa
31
25
a10 e6 d5 a2
1
ebba
A212
A15 D9
A11 D7 E6
2D64 2A29 D6
31
25
a11 a8 d4
1
bea
31
25
a11 a7 d5
1
bba
31
25
a8 a8 d7
1
dcb
31
25
a82 d7
2
ca
32
18
a92 d7
4
da
A29 D6
32
18
d7 d63
6
db
D64
32
20
a11 e6 d5 a3
2
ebaa
A11 D7 E6
32
21
a12 a8 d4
2
bba
A212
32
22
d8 d62 a3 a12
2
bbdb
D83
32
24
d8 d62 a12 a12
2
bbba
D83
D10 E72
32
24
d64
8
b
D64
2 D12
32
25
a9 d8 a5 a1 a1
1
dfecb
32
25
a12 a7 d4
1
bbf
32
25
a9 d7 d6 a1
1
ceeb
34
19
a11 d7 d6
2
cea
A11 D7 E6
34
19
e63 d6
12
ae
E64
34
23
a11 d8 a3 a12
2
dbca
A15 D9
34
23
2
caac
A15 D9
34
23
a13 a8 a12 a1 d9 a72 a12
4
eea
A15 D9
34
24
a13 d6 a3 a1
2
bbcb
A15 D9
A17 E7
34
24
d9 a72
4
ed
A15 D9
D10 E72
34
24
a11 d7 d5
2
eee
A11 D7 E6
2 D12
34
25
e7 a7 d5 a5
1
cbca
34
25
e7 a72 a3 a1
2
dcab
34
25
d9 a7 d5 a3
1
ddeb
34
25
a13 a7 a3 a1
1
cfcc
34
25
d72 e6 a3
2
aca
34
25
d8 a7 e6 a3
1
edda
34
25
a11 d62
2
cb
35
24
2 a11
4
a
A212
2 D12
35
25
a12 d7 a4
1
ebc
36
20
d82 d5 d4
2
bda
D83
36
22
a13 d7 a3 a1
2
ebab
A15 D9
36
24
a9 e7 a7
2
cfa
D10 E72
A17 E7
D10 E72 D83
D10 E72
24
e7 d6 d6 d4 a1
2
bbbaa
36
24
d82 d42
2
bb
36
25
a12 a10 a1
1
aab
37
24 E
e64
48
e
37
24 E
a11 d7 e6
2
aaa
37
25
a13 e6 a4 a1
1
baba
38
17
a11 d8 e6
2
dae
A11 D7 E6
38
21
a11 d9 d4
2
dea
A15 D9
38
23
d9 e62 a12
4
add
D10 E72
38
23
a9 e7 e6 a12
2
aecd
D10 E72
38
23
a14 e6 a2 a12
2
bfaa
A17 E7
38
23
a11 e7 a5 a12
2
cbba
A17 E7
38
24
a11 d9 a3
2
eeb
A15 D9
2 D12
A212
A24
36
2 D12
2E64 2A11 D7 E6
38
24
a12 a11
2
af
38
25
d9 d7 a7
1
cdb
38
25
a11 d8 d5
1
dbc
40
20
a15 d5 d5
2
bba
A15 D9
40
22
d8 e7 d6 a3 a1
1
bbada
D10 E72
40
22
d10 d62 a3
2
bbd
D10 E72
40
22
a15 d6 a3
2
bca
A17 E7
40
24
2
bba
D10 E72
40
24 E
d10 d62 a12 2 a12
4
a
40
25
d10 a9 a5
1
dbb
40
25
a15 d5 a4
1
bca
40
25
e7 e62 a5
2
aaa
40
25
a9 e7 d7
1
aca
40
25
a11 e7 d5 a1
1
aeba
2A212
2 D12
40
25
a14 a9
1
ac
41
24
a15 a8
2
ac
A15 D9
42
21
a13 e7 d4
2
aba
A17 E7
43
24
a16 a7
2
fa
A17 E7
d83
6
d
2D83
A24
A24
43
24 E
44
16
d9 d82
2
db
D83
44
20
e72 d6 d5
2
bad
D10 E72
44
24
d82 d8
2
ba
D83
D16 E8
44
24
d83
6
a
D83
E83
46
23
d11 a11 a12
2
ebd
2 D12
46
24
a15 d8
2
cb
46
25
d11 a7 e6
1
dab
47
25
a16 d7
1
ca
48
18
d10 e7 d7 a1
1
abda
D10 E72
48
18
a17 d7 a1
2
cac
A17 E7
48
22
2 a a2 d10 3 1
2
bdb
2 D12
48
24
a15 e7 a1
2
bcc
A17 E7
D16 E8
D10 E72 D10 E72
D16 E8
A15 D9
24
d10 e7 d6 a1
1
abaa
48
24
d8 e72 a12
2
aaa
48
25
a13 d10 a1
1
bcb
49
24 E
a15 d9
2
aa
2A15 D9
50
15
a15 d10
2
ba
A15 D9
52
20
d12 d8 d5
1
bad
2 D12
52
23
a19 a4 a12
2
caa
A24
52
24
d12 d8 d4
1
bab
2 D12
48
D16 E8
D16 E8
D16 E8
55
24 E
d10 e72
2
dd
2D10 E72
55
24 E
a17 e7
2
aa
2A17 E7
55
25
a19 d5
1
aa
56
14
d11 e72
2
da
D10 E72
56
21
a20 d4
2
aa
A24
58
25
e8 a11 e6
1
aac
58
25
a11 d13
1
bb
60
24
e8 d82
2
aa
E83
62
23
a15 e8 a12
2
cbd
D16 E8
64
22
e8 e72 a3
2
aae
E83
64
22
d14 e7 a3 a1
1
abda
D16 E8
67
24 E
2 d12
2
d
68
12
d13 d12
1
db
2 D12
68
20
d12 e8 d5
1
aad
D16 E8
68
24
2 d12
2
b
2 D12
D24
71
24
a23
2
a
A24
D24
76
16 E
d16 d9
1
bd
D16 E8
76
16 E
d9 e82
2
ea
E83
76
24
d16 d8
1
ba
D16 E8
76
24 E
a24
2
a
2A24
91
24 E
e83
6
e
2E83
91
24 E
d16 e8
1
dd
2D16 E8
92
8 E
d17 e8
1
da
D16 E8
D16 E8
2 2D12
D24
d20 d5
1
ad
D24
24 E
d24
1
d
2D24
0 E
d25
1
d
D24
100
20
139 140
References

[A] D. Allcock, Recognizing equivalence of vectors in the Leech lattice, preprint, 1996, http://www.math.utah.edu/~allcock.
[BV] R. Bacher and B. B. Venkov, Réseaux entiers unimodulaires sans racine en dimension 27 et 28, preprint, 1996, http://www-fourier.ujf-grenoble.fr/PREP/html/a332/a332.html.
[B1] R. E. Borcherds, The Leech lattice, Proc. Roy. Soc. London Ser. A 398 (1985), 365–376.
[B2] R. E. Borcherds, Lattices like the Leech lattice, J. Algebra 130 (1990), 219–234.
[B3] R. E. Borcherds, Automorphic forms on $O_{s+2,2}(\mathbf{R})$ and infinite products, Invent. Math. 120 (1995), 161–213.
[B4] R. E. Borcherds, The Leech lattice and other lattices, Ph.D. dissertation, Cambridge Univ., Cambridge, 1985, preprint, http://www.arXiv.org/abs/math.NT/9911195.
[C] J. H. Conway, The automorphism group of the 26-dimensional even unimodular Lorentzian lattice, J. Algebra 80 (1983), 159–163.
[CPS] J. H. Conway, R. A. Parker, and N. J. A. Sloane, The covering radius of the Leech lattice, Proc. Roy. Soc. London Ser. A 380 (1982), 261–290.
[CS1] J. H. Conway and N. J. A. Sloane, A note on optimal unimodular lattices, J. Number Theory 72 (1998), 357–362.
[CS2] J. H. Conway and N. J. A. Sloane, Sphere Packings, Lattices and Groups, 3rd ed., Grundlehren Math. Wiss. 290, Springer, New York, 1999.
[E] M. Eichler, Quadratische Formen und orthogonale Gruppen, Grundlehren Math. Wiss. 63, Springer, Berlin, 1952.
[EG] N. D. Elkies and B. H. Gross, The exceptional cone and the Leech lattice, Internat. Math. Res. Notices 1996, 665–698.
[K] M. Kneser, Klassenzahlen definiter quadratischer Formen, Arch. Math. 8 (1957), 241–250.
[N] H.-V. Niemeier, Definite quadratische Formen der Dimension 24 und Diskriminante 1, J. Number Theory 5 (1973), 142–178.
[SV] R. Scharlau and B. B. Venkov, The genus of the Barnes-Wall lattice, Comment. Math. Helv. 69 (1994), 322–333.
[V] È. B. Vinberg, "Some arithmetical discrete groups in Lobačevskiĭ spaces" in Discrete Subgroups of Lie Groups and Applications to Moduli (Bombay, 1973), Oxford Univ. Press, Bombay, 1975, 323–348.
[W] G. N. Watson, The problem of the square pyramid, Messenger Math. 48 (1919), 1–22.
Department of Mathematics, University of California at Berkeley, Berkeley, California 94720-3840, USA; [email protected]; http://www.math.berkeley.edu/~reb