Commun. Math. Phys. 234, 1–35 (2003) Digital Object Identifier (DOI) 10.1007/s00220-002-0786-0
Communications in
Mathematical Physics
Periodic Monopoles with Singularities and N = 2 Super-QCD
Sergey A. Cherkis (1), Anton Kapustin (2)
(1) TEP, UCLA Physics Department, Los Angeles, CA 90095-1547, USA. E-mail: [email protected]
(2) Institute for Advanced Study, Olden Lane, Princeton, NJ 08540, USA. E-mail: [email protected]
Received: 9 March 2001 / Accepted: 15 January 2002 Published online: 20 January 2003 – © Springer-Verlag 2003
Abstract: We study solutions of the Bogomolny equation on $\mathbb{R}^2 \times S^1$ with prescribed singularities. We show that the Nahm transform establishes a one-to-one correspondence between such solutions and solutions of the Hitchin equations on a punctured cylinder with the eigenvalues of the Higgs field growing at infinity in a particular manner. The moduli spaces of solutions have natural hyperkähler metrics of a novel kind. We show that these metrics describe the quantum Coulomb branch of certain $N = 2$, $d = 4$ supersymmetric gauge theories on $\mathbb{R}^3 \times S^1$. The Coulomb branches of the corresponding uncompactified theories have been previously determined by E. Witten using the M-theory fivebrane. We show that the Seiberg-Witten curves of these theories are identical to the spectral curves associated to solutions of the Bogomolny equation on $\mathbb{R}^2 \times S^1$. In particular, this allows us to rederive Witten's results without recourse to the M-theory fivebrane.

Contents
1. Introduction
2. Periodic Monopoles with Singularities
2.1 Periodic U(2) monopoles with singularities
2.2 Periodic SO(3) monopoles with singularities
2.3 Periodic U(m) monopoles with singularities
3. N = 2 Gauge Theories Compactified on a Circle
3.1 The geometry of the Coulomb branch
3.2 String theory picture
3.3 Periodic U(m) monopoles and N = 2 Gauge theories
4. Nahm Transform
4.1 Direct Nahm transform
4.2 Reformulation of the Nahm transform
4.3 Monopole spectral data
4.4 Hitchin spectral data
4.5 Coincidence of the spectral data
4.6 Index computation
5. Boundary Conditions for Hitchin Data
5.1 General remarks
5.2 The case $n_- < k = n_+$
5.3 The case $n_- < k > n_+$
5.4 The case $n_- < k < n_+$
5.5 The case $n_- = k = n_+$
6. Inverse Nahm Transform
6.1 Construction of the monopole fields
6.2 Asymptotic behavior of the monopole fields
7. Nahm Transform for Periodic U(m) Monopoles
8. Concluding Remarks
A. Singularities of the Monopole Fields
References
1. Introduction

Let $X$ be an oriented Riemannian 3-manifold, $E$ a unitary vector bundle on $X$, $A$ a connection on $E$, and $\phi$ a Hermitian section of End($E$). The Bogomolny equation is the nonlinear differential equation
$$F_A = * d_A\phi. \tag{1}$$
In the case $X \cong \mathbb{R}^3$ with the standard metric, solutions of this equation have been extensively studied from many different viewpoints (see [1] and references therein). In [5] we studied solutions of the Bogomolny equation for rank($E$) = 2 and $X = \mathbb{R}^2 \times S^1$ with a standard metric. The norm of the Higgs field was assumed to grow logarithmically at infinity. Such solutions were called periodic monopoles. They are topologically classified by a positive integer, the monopole charge. Using the Nahm transform, we showed that there is a one-to-one correspondence between periodic monopoles with charge $k$ and solutions of the rank $k$ Hitchin equations on a cylinder with a particular asymptotic behavior. The Hitchin equations are the dimensional reduction of the self-duality equation to two dimensions.

In this paper we study solutions of the Bogomolny equation on $\mathbb{R}^2 \times S^1$ with $n$ points deleted. The behavior of $A$, $\phi$ in the neighborhood of a deleted point is that of a Dirac monopole minimally embedded in the nonabelian gauge group. The eigenvalues of the Higgs field $\phi$ are allowed to grow logarithmically at infinity. Solutions of this kind will be called periodic monopoles with $n$ singularities. We mostly deal with the case rank($E$) = 2, but we also sketch how our results can be generalized to higher rank. For rank($E$) = 2 periodic monopoles with $n$ singularities are topologically classified by a single integer $k$ which satisfies $2k \geq n$. We call this integer the monopole charge.

There are several reasons to study periodic monopoles with singularities. First of all, their moduli spaces carry natural hyperkähler metrics of a novel kind. For example, for $k = 2$ and $n = 4$ the centered moduli space is a smooth four-dimensional hyperkähler manifold with a distinguished complex structure. As a complex manifold it is isomorphic to a blow-up of $(\mathbb{C} \times S)/\mathbb{Z}_2$ at four points, where $S$ is an elliptic curve, and $\mathbb{Z}_2$ acts by reflection on $\mathbb{C}$ and by a nontrivial element of Aut($S$) on $S$.
The four blown-up points are the fixed points of the $\mathbb{Z}_2$ action. This complex manifold is a noncompact analogue of the Kummer surface. It can be argued that the natural hyperkähler metric on it is complete, nondegenerate, and asymptotically locally flat. For a fascinating introduction to noncompact approximations to K3 metrics see [15].

From the physical point of view, these moduli spaces are interesting because they provide exact low-energy effective actions for $N = 2$, $d = 4$ gauge theories compactified on a circle. "Exact" here means that both perturbative and non-perturbative quantum corrections are included. While the effective action of $N = 2$, $d = 4$ gauge theories on $\mathbb{R}^4$ can be computed by a variety of methods [27, 31, 21, 2], the analogous problem on $\mathbb{R}^3 \times S^1$ has remained intractable so far. The main reason is the necessity to sum over an infinite number of instanton contributions, including virtual BPS monopoles wrapping $S^1$. In our previous paper [5] we explained how the moduli space of periodic monopoles can be used to solve this problem in the case of $N = 2$ Yang-Mills theory without matter. Periodic monopoles with singularities allow one to solve $N = 2$ gauge theories with gauge group SU($k$) and matter in the fundamental representation.

Finally, studying periodic monopoles with singularities provides a new example of the Nahm transform, which is a differential-geometric analogue of the Fourier-Mukai transform. The Bogomolny equation is a reduction of the self-duality equation to three dimensions. In general, the Nahm transform maps a solution of the former equation into a solution of some different system of equations, which is also a reduction of the self-duality equation. The precise form of this new system of equations depends on the boundary conditions imposed on the Bogomolny equation. For example, the Nahm transform takes monopoles on $\mathbb{R}^3$ with finite energy to solutions of the so-called Nahm equations, which are the reduction of the self-duality equation to one dimension.
Periodic monopoles without singularities are mapped to solutions of the Hitchin equations on a cylinder [5]. We will see that the Nahm transform establishes a one-to-one correspondence between periodic monopoles with singularities and solutions of the Hitchin equations on a cylinder with singularities. The singularities of the Hitchin data on the cylinder are so-called tame singularities. Solutions of the Hitchin equations on compact curves with such singularities were previously studied by C. Simpson [29] and others. Since the Hitchin equations are conformally invariant, one may also wish to compactify the cylinder to a $\mathbb{P}^1$ by adding two points at infinity. For general $k$ and $n$ the singularities of the Hitchin data at the two added points are not tame. Nevertheless, it appears that the Hitchin-Kobayashi correspondence, if properly understood, continues to hold in this situation. It would be very interesting to understand this issue in detail.

There is one special situation ($2k = n$) where all four singularities of the Hitchin data on $\mathbb{P}^1$ are tame. In this case the rank of the Hitchin data is $k$. In particular, if we want the moduli space of the Hitchin data to be the noncompact Kummer surface mentioned above, we have to set $k = 2$ and consider the rank-two Hitchin equations on $\mathbb{P}^1$ with four tame singularities. This moduli space can be reinterpreted as the centered moduli space of two periodic monopoles with four Dirac-type singularities.

Monopoles on $\mathbb{R}^3$ with Dirac-type singularities have been studied in [6, 7, 8]. Their moduli spaces are asymptotically locally flat hyperkähler manifolds which can be used to solve $N = 4$, $d = 3$ gauge theories with matter. The present work can be viewed as an extension of both [6] and [5]. In this paper we explain the relation between periodic monopoles with singularities and $N = 2$ gauge theories, and study the Nahm transform. The properties of the moduli space will be explored in a forthcoming publication [9].
2. Periodic Monopoles with Singularities

2.1. Periodic U(2) monopoles with singularities. In this section we give the precise definition of a periodic monopole with singularities. Let $X$ be $(\mathbb{R}^2 \times S^1)\setminus\{p_1, \ldots, p_n\}$, where $p_i$, $i = 1, \ldots, n$, are distinct points. We will parametrize $S^1$ by $\chi \in \mathbb{R}/(2\pi\mathbb{Z})$ and $\mathbb{R}^2$ by $z \in \mathbb{C} \cong \mathbb{R}^2$. Consider a U(2) bundle $E$ on $X$. Its topological type is completely determined by $n$ integers $e_1, \ldots, e_n$, the values of the first Chern class of $E$ on small 2-spheres surrounding the points $p_1, \ldots, p_n$. We will assume that $e_i = \pm 1$ for all $i$.

Let us set $\phi_0(r) = -1/(2r)$. Let us define a U(1) connection $A_0(x)$ on a line bundle on $\mathbb{R}^3\setminus\{0\}$ by $dA_0 = * d\phi_0$. The first Chern class of this line bundle evaluated on a 2-sphere enclosing the origin is one.

Let $\phi_\infty(z, \chi) = \frac{\log|z|}{2\pi}$ be a function on $M = (\mathbb{R}^2 \times S^1)\setminus\{z = 0\}$. We cover $M$ with two coordinate patches, $U_0 = \{\arg z \neq \pi\}$ and $U_1 = \{\arg z \neq 0\}$, where $\arg z$ is assumed to take values in the interval $(-\pi, \pi]$. Let $L$ be a unitary line bundle on $M$ with the following transition function between $U_0$ and $U_1$:
$$g(z, \chi) = \begin{cases} 1, & \mathrm{Im}\, z < 0, \\ e^{-i\chi}, & \mathrm{Im}\, z > 0. \end{cases}$$
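The first Chern class of this line bundle evaluated on any 2-torus of the form $|z| = \mathrm{const}$ is one. We define a unitary connection on $L$ by
$$A_\infty = \begin{cases} \dfrac{\arg z}{2\pi}\, d\chi, & \arg z \neq \pi, \\[2mm] \dfrac{\arg(-z) - \pi}{2\pi}\, d\chi, & \arg z \neq 0. \end{cases} \tag{2}$$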
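As a consistency check on these definitions, the two Chern numbers quoted above can be verified by a short computation. The sketch below rests on assumptions that the text does not fix explicitly: spherical coordinates $(r, \theta, \varphi)$ on $\mathbb{R}^3$, polar coordinates $z = \rho e^{i\vartheta}$ on $\mathbb{R}^2$, and orientations chosen so that $*dr = r^2 \sin\theta\, d\theta \wedge d\varphi$ and $*d\rho = \rho\, d\vartheta \wedge d\chi$. The first Chern number is normalized as $\frac{1}{2\pi}\int F$.

```latex
% Assumed conventions (not taken from the text):
%   *dr    = r^2 \sin\theta \, d\theta \wedge d\varphi   on R^3,
%   *d\rho = \rho \, d\vartheta \wedge d\chi             on R^2 x S^1.
\begin{align*}
  % Dirac monopole: \phi_0 = -1/(2r), so d\phi_0 = dr/(2r^2).
  * d\phi_0 &= \tfrac{1}{2}\sin\theta\, d\theta \wedge d\varphi,
  & \frac{1}{2\pi}\int_{S^2} * d\phi_0
    &= \frac{1}{2\pi}\cdot\frac{1}{2}\cdot 4\pi = 1, \\
  % Periodic analogue: \phi_\infty = \log\rho/(2\pi), A_\infty = (\vartheta/2\pi)\, d\chi on U_0.
  * d\phi_\infty &= \frac{1}{2\pi}\, d\vartheta \wedge d\chi = dA_\infty,
  & \frac{1}{2\pi}\int_{|z| = \mathrm{const}} dA_\infty
    &= \frac{1}{2\pi}\cdot\frac{(2\pi)^2}{2\pi} = 1.
\end{align*}
```

With these conventions both line bundles have first Chern number one, and the curvature of the connection (2) equals $* d\phi_\infty$ on the patch $U_0$. The same holds on $U_1$, where the connection differs only by a constant multiple of $d\chi$.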
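The connection $A_\infty$ satisfies $dA_\infty = * d\phi_\infty$. A periodic monopole on $E$ is a solution of the Bogomolny equation such that the connection and the Higgs field behave as
\begin{align}
\phi(x) &\sim g_i(x) \begin{pmatrix} e_i \phi_0(r_i) & 0 \\ 0 & 0 \end{pmatrix} g_i(x)^{-1} + O(1), \tag{3} \\
d_A\phi(x) &\sim g_i(x) \begin{pmatrix} e_i\, d\phi_0(r_i) & 0 \\ 0 & 0 \end{pmatrix} g_i(x)^{-1} + O(1), \tag{4} \\
A(x) &\sim g_i(x) \begin{pmatrix} e_i A_0(x - x_i) & 0 \\ 0 & 0 \end{pmatrix} g_i(x)^{-1} + i g_i(x)\, dg_i(x)^{-1} + O(1), \tag{5}
\end{align}
near the $i$th singularity (here $r_i$ is the distance to the $i$th singularity and $g_i(x)$ is a U(2)-valued function), while at infinity their behavior is given by
$$2\pi \phi(x) \sim g(x)\, \mathrm{diag}\left( 2\pi \ell_1 \phi_\infty + v_1 + \mathrm{Re}\, \frac{\mu_1}{z},\ 2\pi \ell_2 \phi_\infty + v_2 + \mathrm{Re}\, \frac{\mu_2}{z} \right) g(x)^{-1} + O\!\left( \frac{1}{|z|^2} \right), \tag{6}$$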
\begin{align}
2\pi\, d_A\phi(x) &\sim g(x)\, \mathrm{diag}\left( 2\pi \ell_1\, d\phi_\infty - \mathrm{Re}\, \frac{\mu_1\, dz}{z^2},\ 2\pi \ell_2\, d\phi_\infty - \mathrm{Re}\, \frac{\mu_2\, dz}{z^2} \right) g(x)^{-1} + O\!\left( \frac{1}{|z|^3} \right), \tag{7} \\
2\pi A(x) &\sim g(x)\, \mathrm{diag}\Big( 2\pi \ell_1 A_\infty(x) + \Big( b_1 + \mathrm{Im}\, \frac{\mu_1}{z} \Big) d\chi + \alpha_1\, d\arg z, \tag{8} \\
&\qquad\qquad\quad 2\pi \ell_2 A_\infty(x) + \Big( b_2 + \mathrm{Im}\, \frac{\mu_2}{z} \Big) d\chi + \alpha_2\, d\arg z \Big)\, g(x)^{-1} + 2\pi i\, g(x)\, dg(x)^{-1} + O\!\left( \frac{1}{|z|^2} \right). \tag{9}
\end{align}
Here $g(x)$ is a U(2)-valued function, $\ell_i, v_i, b_i, \alpha_i \in \mathbb{R}$, $i = 1, 2$, and $\mu_i \in \mathbb{C}$, $i = 1, 2$. The integers $e_i$ will be referred to as the abelian charges of the periodic monopole. We can assume that $\ell_1 \geq \ell_2$ without loss of generality. If $\ell_1 = \ell_2$, we will require in addition that $v_1 > v_2$. Physically, this means that for large $|z|$ the U(2) gauge symmetry is broken to U(1) × U(1) by the Higgs field. The numbers $\alpha_1$ and $\alpha_2$ are not gauge-invariant: a gauge transformation may shift them by $2\pi m$, $m \in \mathbb{Z}$. Therefore we prefer to regard $\alpha_{1,2}$ as taking values in $\mathbb{R}/(2\pi\mathbb{Z})$.

From these formulas it is easy to see that $\ell_1 + \ell_2$ is the value of the first Chern class of $E$ on any sufficiently large 2-torus enclosing all the singularities. Since this 2-torus is homologous to the union of $n$ small 2-spheres surrounding the singularities, it follows that $\ell_1 + \ell_2 = \sum_i e_i$.

Both $\ell_1$ and $\ell_2$ are integers. Indeed, the eigenvalues of the Higgs field outside a sufficiently large compact region are distinct, and therefore we can define a line subbundle of $E$ associated with the largest eigenvalue of the Higgs field. The value of its Chern class on a large 2-torus is $\ell_1$, therefore $\ell_1$ must be an integer. Hence $\ell_2$ is also an integer.

There are also relations between the continuous parameters appearing in (3) and (6). If we denote by $(z_i, \chi_i)$ the coordinates of the point $p_i$, $i = 1, \ldots, n$, then these relations read:
\begin{align}
\mu_1 + \mu_2 &= -\sum_{i=1}^{n} e_i z_i, \tag{10} \\
\alpha_1 + \alpha_2 &= \sum_{i=1}^{n} e_i \chi_i. \tag{11}
\end{align}
The derivation of these relations is presented in the next subsection. We define the nonabelian charge of a monopole to be
$$k = \frac{1}{2}(\ell_1 - \ell_2 + n).$$
It is easy to see that $k$ is a positive integer; in fact, since $\ell_1 \geq \ell_2$, it satisfies $2k \geq n$. The asymptotic behavior of the Higgs field is completely fixed once we specify $e_i$, $i = 1, \ldots, n$, and $k$.
Let $n_\pm$ be the total number of singularities with $e_i = \pm 1$. By definition, $n_+ + n_- = n$. The integers $\ell_1$, $\ell_2$ which determine the behavior of the Higgs field at infinity can be expressed in terms of $k$, $n_+$, and $n_-$:
$$\ell_1 = k - n_-, \qquad \ell_2 = n_+ - k.$$
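The bookkeeping between the abelian charges, the nonabelian charge, and the integers $\ell_1$, $\ell_2$ can be checked mechanically. The following is a minimal sketch in Python; the function name `asymptotic_charges` and its interface are illustrative assumptions, not notation from the paper.

```python
def asymptotic_charges(abelian_charges, k):
    """Return (l1, l2) for a periodic U(2) monopole.

    abelian_charges: list of e_i, each +1 or -1 (one entry per singularity).
    k: nonabelian charge, required to satisfy 2k >= n.
    The name and signature are hypothetical, chosen for illustration only.
    """
    if any(e not in (1, -1) for e in abelian_charges):
        raise ValueError("each abelian charge e_i must be +1 or -1")
    n = len(abelian_charges)
    n_plus = sum(1 for e in abelian_charges if e == 1)
    n_minus = n - n_plus
    if 2 * k < n:
        raise ValueError("the nonabelian charge must satisfy 2k >= n")
    l1 = k - n_minus
    l2 = n_plus - k
    # Consistency with the relations stated in the text.
    assert l1 + l2 == sum(abelian_charges)  # first Chern class on a large 2-torus
    assert l1 - l2 + n == 2 * k             # definition of the nonabelian charge
    assert l1 >= l2                         # ordering convention
    return l1, l2


# Example: k = 2 with four singularities, three with e_i = +1 and one with e_i = -1.
print(asymptotic_charges([1, 1, 1, -1], k=2))  # (1, 1)
```

In this example $2k = n$, so $\ell_1 = \ell_2$ and the two eigenvalues of the Higgs field grow at the same rate; they are then distinguished only by $v_1 > v_2$.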
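2.2. Periodic SO(3) monopoles with singularities. A closely related problem is that of SO(3) monopoles with singularities on $\mathbb{R}^2 \times S^1$. These are solutions of the Bogomolny equation with traceless $A$ and $\phi$. The behavior of $A$, $\phi$ near the singularities is given by
\begin{align}
\phi(x) &\sim g_i(x) \begin{pmatrix} \frac{1}{2}\phi_0(r_i) & 0 \\ 0 & -\frac{1}{2}\phi_0(r_i) \end{pmatrix} g_i(x)^{-1} + O(1), \tag{12} \\
d_A\phi(x) &\sim g_i(x) \begin{pmatrix} \frac{1}{2}\, d\phi_0(r_i) & 0 \\ 0 & -\frac{1}{2}\, d\phi_0(r_i) \end{pmatrix} g_i(x)^{-1} + O(1), \tag{13} \\
A(x) &\sim g_i(x) \begin{pmatrix} \frac{1}{2} A_0(x - x_i) & 0 \\ 0 & -\frac{1}{2} A_0(x - x_i) \end{pmatrix} g_i(x)^{-1} + i g_i(x)\, dg_i(x)^{-1} + O(1), \tag{14}
\end{align}
where $g_i(x)$ are again U(2)-valued functions. The behavior at infinity is given by
\begin{align}
4\pi \phi(x) &\sim g(x)\, \mathrm{diag}\left( 2\pi k_\infty \phi_\infty + v + \mathrm{Re}\, \frac{\mu}{z},\ -2\pi k_\infty \phi_\infty - v - \mathrm{Re}\, \frac{\mu}{z} \right) g(x)^{-1} + O\!\left( \frac{1}{|z|^2} \right), \tag{15} \\
4\pi\, d_A\phi(x) &\sim g(x)\, \mathrm{diag}\left( 2\pi k_\infty\, d\phi_\infty - \mathrm{Re}\, \frac{\mu\, dz}{z^2},\ -2\pi k_\infty\, d\phi_\infty + \mathrm{Re}\, \frac{\mu\, dz}{z^2} \right) g(x)^{-1} + O\!\left( \frac{1}{|z|^3} \right), \tag{16} \\
4\pi A(x) &\sim g(x)\, \mathrm{diag}\Big( 2\pi k_\infty A_\infty(x) + \Big( b + \mathrm{Im}\, \frac{\mu}{z} \Big) d\chi + \alpha\, d\arg z, \notag \\
&\qquad\qquad\quad -2\pi k_\infty A_\infty(x) - \Big( b + \mathrm{Im}\, \frac{\mu}{z} \Big) d\chi - \alpha\, d\arg z \Big)\, g(x)^{-1} + 4\pi i\, g(x)\, dg(x)^{-1} + O\!\left( \frac{1}{|z|^2} \right), \tag{17}
\end{align}
where $g(x)$ is a U(2)-valued function, $k_\infty, v, b, \alpha \in \mathbb{R}$, and $\mu \in \mathbb{C}$. We may assume that $k_\infty \geq 0$ without loss of generality. The number $k_\infty$ is, in fact, an integer, as it measures the value of the first Chern class of the eigenbundle corresponding to the positive eigenvalue of $\phi$ on a large 2-torus.

The second Stiefel-Whitney class of this SO(3) bundle evaluated on a small 2-sphere surrounding the $i$th singularity is 1, therefore it cannot be lifted to an SU(2) bundle. The Stiefel-Whitney class of the bundle evaluated on a large 2-torus is $k_\infty \bmod 2$. Since the large 2-torus is homologous to the sum of the small 2-spheres surrounding the singularities, it follows that $k_\infty = n \bmod 2$, where $n$ is the total number of singularities. We will define the nonabelian charge of an SO(3) monopole to be $(k_\infty + n)/2$. In view of the above, the nonabelian charge is greater than or equal to $n/2$.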
The relation between the U(2) and SO(3) periodic monopoles is the following. If we decompose U(2) monopole fields $A$ and $\phi$ into a trace part and a trace-free part,
$$A = A_{tr} + A_{tf}, \qquad \phi = \phi_{tr} + \phi_{tf},$$
then $(A_{tr}, \phi_{tr})$ and $(A_{tf}, \phi_{tf})$ separately satisfy the Bogomolny equation. It is easy to see that the behavior of $(A_{tf}, \phi_{tf})$ near the singularities is described by (12), while their behavior at infinity is described by (15) with
$$k_\infty = \ell_1 - \ell_2, \qquad v = v_1 - v_2, \qquad b = b_1 - b_2, \qquad \mu = \mu_1 - \mu_2, \qquad \alpha = \alpha_1 - \alpha_2.$$
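Furthermore, $(A_{tr}, \phi_{tr})$ represents $n$ periodic Dirac monopoles and therefore obeys [5]
\begin{align}
\phi_{tr} &= \frac{1}{2\pi}\, \frac{1}{2}(v_1 + v_2) + \frac{1}{2} \sum_i e_i V(x - x_i) \notag \\
&\sim \frac{1}{2\pi}\, \frac{1}{2}(v_1 + v_2) + \frac{1}{2}(\ell_1 + \ell_2)\, \frac{\log|z|}{2\pi} - \frac{1}{4\pi} \sum_i e_i\, \mathrm{Re}\, \frac{z_i}{z} + O\!\left( \frac{1}{|z|^2} \right), \tag{18} \\
A_{tr} &\sim \frac{1}{2}(\ell_1 + \ell_2) A_\infty(x) + \frac{1}{4\pi} \Big( b_1 + b_2 - \sum_i e_i\, \mathrm{Im}\, \frac{z_i}{z} \Big) d\chi + \frac{1}{2}(\alpha_1 + \alpha_2)\, \frac{d\arg z}{2\pi} + O\!\left( \frac{1}{|z|^2} \right). \tag{19}
\end{align}
Here the function $V(x)$ is given by
$$V(x) = \frac{\log(4\pi) - \gamma}{2\pi} - \frac{1}{2} {\sum_{p=-\infty}^{\infty}}' \left[ \frac{1}{\sqrt{|z|^2 + (\chi - 2\pi p)^2}} - \frac{1}{2\pi |p|} \right],$$
where the prime means that for $p = 0$ the second term in the square brackets must be omitted, and $\gamma$ is Euler's constant. Equation (18) implies that $\mu_1$ and $\mu_2$ cannot be chosen arbitrarily, but must satisfy
$$\mu_1 + \mu_2 = -\sum_{i=1}^{n} e_i z_i.$$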
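The two limiting behaviors of the function $V(x)$ used above can be checked numerically: near the origin it should reduce to the Dirac potential $\phi_0 = -1/(2r)$, and for large $|z|$ it should approach $\phi_\infty = \log|z|/(2\pi)$, as required by (18). The sketch below evaluates a symmetrically truncated version of the sum. The function name `V`, the cutoff `p_max`, and the sample points are assumptions introduced here for illustration; the paper itself works with the infinite sum.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant


def V(z_abs, chi, p_max=20000):
    """Truncated evaluation of V at |z| = z_abs and circle coordinate chi.

    The sum over p is cut off symmetrically at |p| <= p_max. The cutoff is a
    numerical parameter introduced for this sketch, not part of the text.
    The point z_abs = 0, chi = 0 (the singularity itself) is excluded.
    """
    # p = 0: only the first term in the square brackets is kept.
    total = 1.0 / math.hypot(z_abs, chi)
    for p in range(1, p_max + 1):
        for q in (p, -p):
            r = math.hypot(z_abs, chi - 2.0 * math.pi * q)
            total += 1.0 / r - 1.0 / (2.0 * math.pi * p)
    constant = (math.log(4.0 * math.pi) - EULER_GAMMA) / (2.0 * math.pi)
    return constant - 0.5 * total


# Far from the singularity: compare with log|z| / (2 pi).
print(V(20.0, 0.3), math.log(20.0) / (2.0 * math.pi))

# Close to the singularity: compare with -1 / (2 r).
print(V(1e-3, 0.0), -1.0 / (2.0 * 1e-3))
```

The two printed pairs should agree closely. The remaining difference at large $|z|$ comes from the truncation of the sum and from terms that decay exponentially in $|z|$.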
There is another important relation constraining the parameters of the periodic monopole. It relates the asymptotic parameters $\alpha_i$, $i = 1, 2$, in Eq. (6) and the positions $\chi_1, \chi_2, \ldots, \chi_n$ of the singularities along the $S^1$. Consider the holonomy of $A_{tr}$ along a circle $|z| = R$, $\chi = \chi_0$. For $R \to \infty$ the holonomy tends to
$$\frac{1}{2}\left( \alpha_1 + \alpha_2 - (\ell_1 + \ell_2)\chi_0 \right).$$
This follows from Eq. (6) and the choice of trivialisation specified by Eq. (2). On the other hand, it is clear from symmetry considerations that the curvature $F_{z\bar{z}}$ constructed from $A_{tr}$ vanishes identically on the plane $\chi = \chi_0$ if $\chi_0$ is the $\chi$-coordinate of the center-of-mass of the singularities, i.e. if
$$\chi_0 = \frac{1}{n} \sum e_i \chi_i. \tag{20}$$
By Stokes' theorem, the limiting holonomy of $A_{tr}$ must vanish for this value of $\chi_0$. This implies that
$$\alpha_1 + \alpha_2 = \sum e_i \chi_i. \tag{21}$$
To summarize, to any U(2) periodic monopole with $n$ singularities one can associate a U(1) periodic monopole with $n$ singularities and an SU(2)/$\mathbb{Z}_2$ = SO(3) periodic monopole with $n$ singularities. The nonabelian charge of the SO(3) monopole is equal to the nonabelian charge of the U(2) monopole. Moreover, it is easy to see that the induced map on the moduli spaces of U(2) and SO(3) monopoles is an isometry. This basically follows from the fact that a periodic U(1) monopole is completely determined by $e_i$, the location of the singularities, and the asymptotics of the Higgs field at infinity, and therefore it has no moduli.

We just learned that the moduli space of periodic U(2) monopoles depends only on the total number of singularities $n$ and the nonabelian charge $k$, but not on the individual values of $e_i$. If we were only interested in the moduli space, we could have set all $e_i$ to be one, for example. We prefer to keep $e_i$ arbitrary, since the Nahm transform depends on the abelian charges in a nontrivial way.
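2.3. Periodic U(m) monopoles with singularities. One can similarly define periodic monopoles with singularities for other classical groups. As an example, we consider U($m$) periodic monopoles. The asymptotic behavior of the Higgs field and the connection at infinity is given by
\begin{align}
2\pi \phi(x) &\sim g(x)\, \mathrm{diag}\Big( 2\pi \ell_1 \phi_\infty(x) + v_1 + \mathrm{Re}\, \frac{\mu_1}{z},\ 2\pi \ell_2 \phi_\infty(x) + v_2 + \mathrm{Re}\, \frac{\mu_2}{z},\ \ldots, \notag \\
&\qquad\qquad\quad 2\pi \ell_m \phi_\infty(x) + v_m + \mathrm{Re}\, \frac{\mu_m}{z} \Big)\, g(x)^{-1} + O\!\left( \frac{1}{|z|^2} \right), \tag{22} \\
2\pi\, d_A\phi(x) &\sim g(x)\, \mathrm{diag}\Big( 2\pi \ell_1\, d\phi_\infty(x) - \mathrm{Re}\, \frac{\mu_1\, dz}{z^2},\ 2\pi \ell_2\, d\phi_\infty(x) - \mathrm{Re}\, \frac{\mu_2\, dz}{z^2},\ \ldots, \notag \\
&\qquad\qquad\quad 2\pi \ell_m\, d\phi_\infty(x) - \mathrm{Re}\, \frac{\mu_m\, dz}{z^2} \Big)\, g(x)^{-1} + O\!\left( \frac{1}{|z|^3} \right), \notag \\
2\pi A(x) &\sim g(x)\, \mathrm{diag}\Big( 2\pi \ell_1 A_\infty(x) + \Big( b_1 + \mathrm{Im}\, \frac{\mu_1}{z} \Big) d\chi + \alpha_1\, d\arg z, \tag{23} \\
&\qquad\qquad\quad 2\pi \ell_2 A_\infty(x) + \Big( b_2 + \mathrm{Im}\, \frac{\mu_2}{z} \Big) d\chi + \alpha_2\, d\arg z,\ \ldots, \notag \\
&\qquad\qquad\quad 2\pi \ell_m A_\infty(x) + \Big( b_m + \mathrm{Im}\, \frac{\mu_m}{z} \Big) d\chi + \alpha_m\, d\arg z \Big)\, g(x)^{-1} + 2\pi i\, g(x)\, dg(x)^{-1} + O\!\left( \frac{1}{|z|^2} \right). \notag
\end{align}
We may assume that $\ell_1 \geq \ell_2 \geq \cdots \geq \ell_m$. In addition we assume that if $\ell_i = \ell_{i+1}$, then $v_i > v_{i+1}$. The numbers $\ell_1, \ldots, \ell_m$ must be integers for $A(x)$ to be a well-defined connection. One way to see this is to note that since all of the eigenvalues of $\phi$ are distinct for large enough $|z|$, there is a well-defined splitting of $E$ into the eigenbundles of $\phi$, and $\ell_1, \ldots, \ell_m$ are equal to the values of the first Chern class of these line bundles on a large 2-torus $|z| = \mathrm{const}$.

The singularities at the points $p_1, \ldots, p_n$ are given by the Dirac monopole minimally embedded into the U($m$) gauge group:
\begin{align*}
\phi(x) &\sim g_i(x)\, \mathrm{diag}(e_i \phi_0(r_i), 0, \ldots, 0)\, g_i(x)^{-1} + O(1), \\
d_A\phi(x) &\sim g_i(x)\, \mathrm{diag}(e_i\, d\phi_0(r_i), 0, \ldots, 0)\, g_i(x)^{-1} + O(1), \\
A(x) &\sim g_i(x)\, \mathrm{diag}(e_i A_0(x - x_i), 0, \ldots, 0)\, g_i(x)^{-1} + i g_i(x)\, dg_i(x)^{-1} + O(1),
\end{align*}
where $e_i = \pm 1$. The first Chern class of $E$ evaluated on a small 2-sphere surrounding the $i$th singularity is equal to $e_i$. Since the first Chern class of $E$ evaluated on a large 2-torus $|z| = \mathrm{const}$ is equal to $\ell_1 + \cdots + \ell_m$, we have the relation
$$\sum_{j=1}^{m} \ell_j = \sum_{i=1}^{n} e_i.$$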
We define the nonabelian charge of a monopole to be a vector $(k_1, k_2, \ldots, k_{m-1})$ with components
$$k_p = n_- + \sum_{i=1}^{p} \ell_i.$$
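The definition of $k_p$ can be inverted explicitly. Extending it to $p = 0$ and $p = m$ gives $k_0 = n_-$ and, by the sum rule $\sum_j \ell_j = n_+ - n_-$, $k_m = n_+$, so that $\ell_j = k_j - k_{j-1}$. The sketch below implements both directions; the function names and the extension to $k_0$ and $k_m$ are conveniences introduced here, not notation from the paper.

```python
def nonabelian_charges(ells, n_minus):
    """Return [k_1, ..., k_{m-1}] with k_p = n_minus + l_1 + ... + l_p."""
    charges = []
    partial = n_minus
    for ell in ells[:-1]:  # the last l_m does not enter k_1, ..., k_{m-1}
        partial += ell
        charges.append(partial)
    return charges


def ells_from_charges(charges, n_plus, n_minus):
    """Return [l_1, ..., l_m] using l_j = k_j - k_{j-1}, k_0 = n_minus, k_m = n_plus."""
    extended = [n_minus] + list(charges) + [n_plus]
    return [extended[j] - extended[j - 1] for j in range(1, len(extended))]


# Example for m = 3: n_+ = 2, n_- = 1 and nonabelian charge (k_1, k_2) = (3, 4).
ells = ells_from_charges([3, 4], n_plus=2, n_minus=1)
print(ells)                                 # [2, 1, -2]
print(nonabelian_charges(ells, n_minus=1))  # [3, 4]
```

For $m = 2$ the same function returns $(k - n_-,\ n_+ - k)$, in agreement with the expressions for $\ell_1$ and $\ell_2$ in the U(2) case.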
Clearly, the integers $\ell_j$ are completely determined by the abelian charges $e_i$ and the nonabelian charge $(k_1, \ldots, k_{m-1})$.

Given a periodic U($m$) monopole with singularities, one can decompose its fields into a trace-free and a trace part which separately satisfy the Bogomolny equation. The trace part is completely determined by the abelian charges $e_i$. The traceless part defines a U($m$)/U(1) periodic monopole with singularities. To understand the nature of the singularities, recall that a vector bundle with the structure group U($m$)/U(1) = SU($m$)/$\mathbb{Z}_m$ has a characteristic class with values in $H^2(X, \mathbb{Z}_m)$ which generalizes the second Stiefel-Whitney class. This class measures the obstruction for lifting the U($m$)/U(1) bundle to an SU($m$) bundle. In the physics literature it is known as the 't Hooft magnetic flux. One can show that the value of the 't Hooft magnetic flux on a small 2-sphere surrounding the $i$th singularity is equal to $e_i \bmod m$. Thus the U($m$)/U(1) bundle corresponding to a U($m$) monopole with singularities cannot be lifted to an SU($m$) bundle. Note that for $m > 2$ the singularity with $e_i = +1$ is distinct from $e_i = -1$ even after passing to traceless fields.

Conversely, given a periodic U($m$)/U(1) monopole with singularities, one can unambiguously reconstruct a periodic U($m$) monopole with singularities. This happens because the trace part, being a periodic U(1) monopole with singularities, is completely determined by $e_i$ and the asymptotics at infinity.

As in the case of U(2) monopoles, there are constraints between various continuous parameters appearing in (22) and (24), namely
\begin{align}
\sum_{j=1}^{m} \mu_j &= -\sum_{i=1}^{n} e_i z_i, \tag{24} \\
\sum_{j=1}^{m} \alpha_j &= \sum_{i=1}^{n} e_i \chi_i. \tag{25}
\end{align}
These constraints can be derived by computing $A_{tr}$ and $\phi_{tr}$ and comparing with the known expressions for a U(1) periodic monopole with singularities.

3. N = 2 Gauge Theories Compactified on a Circle

In this section we explain the relevance of periodic monopoles with singularities for understanding quantum properties of supersymmetric gauge theories. Chalmers and Hanany [4] were the first to realize that the metric on the moduli space of certain supersymmetric gauge theories in three dimensions is identical to the metric on the moduli space of BPS monopoles. This relation was used to great effect by many authors, notably by Hanany and Witten [14]. Later on, this relation was extended to four-dimensional $N = 2$ gauge theories compactified on a circle of arbitrary radius $R$ [11, 20, 19], and it was shown that the quantum moduli space of many interesting theories of this kind coincides with the moduli space of self-duality equations or their reductions. Thus a difficult quantum-mechanical problem can often be converted to a much simpler problem of studying the moduli space of certain partial differential equations. In particular, in the decompactification limit $R \to \infty$ one can recover all the results of Seiberg, Witten, and others on the moduli space of four-dimensional $N = 2$ gauge theories.

The precise form of the PDE one has to study depends on the gauge theory in question. For example, certain finite $N = 2$ gauge theories (the so-called quiver theories) are solved in terms of instantons on $\mathbb{R}^2 \times T^2$ [20, 19]. $N = 2$ super-Yang-Mills theory with gauge group SU($k$) and no hypermultiplets is solved in terms of monopoles on $\mathbb{R}^2 \times S^1$ [5]. We will see below that periodic monopoles with $n$ singularities and nonabelian charge $k$ are relevant for $N = 2$ SU($k$) gauge theory with $n$ massive hypermultiplets.

3.1. The geometry of the Coulomb branch.
Consider an $N = 2$ SU($k$) gauge theory in a generic vacuum on the Coulomb branch, where the expectation value of the Higgs field in the vector multiplet breaks the gauge group down to its maximal torus. The low-energy effective theory is described by $k - 1$ abelian vector multiplets which contain $k - 1$ complex scalars $\phi$, $k - 1$ photons $A$, and $2(k - 1)$ Majorana fermions. Thus the moduli space of the theory is a $(k - 1)$-dimensional complex manifold. $N = 2$ supersymmetry requires the metric on the moduli space to be a special Kähler metric.

Now consider compactifying the theory on a circle of radius $R$. At length scales larger than $R$ the theory is effectively three-dimensional. Its bosonic fields include $k - 1$ complex scalars, $k - 1$ periodic real scalars originating from Wilson lines of the four-dimensional photons along the compactified direction, and $k - 1$ periodic real scalars obtained by dualizing $k - 1$ three-dimensional photons. All in all, the moduli space of the effective three-dimensional theory is $4(k - 1)$-dimensional. Its metric is required to be hyperkähler by supersymmetry.

Far from the origin of the Coulomb branch, the metric can be found by first flowing to the infrared in the four-dimensional theory, and then dimensionally reducing on a circle. This is possible because the low-energy effective theory in four dimensions is free, and thus no renormalization group flow occurs upon compactification. The resulting moduli space is fibered over the moduli space of the four-dimensional theory by $2(k - 1)$-dimensional tori. The metric on the fibers is flat, and thus the metric on the total space far along the Coulomb branch has a U(1)$^{2k-2}$ isometry [28, 19].

As one moves towards the origin of the Coulomb branch, the form of the metric starts to deviate from this simple form. In particular, while the four-dimensional instantons respect the U(1)$^{2k-2}$ isometry, the Euclidean BPS monopoles wrapping the compactified direction do not. These effects are exponentially small far along the Coulomb branch, but are very important near the origin. They tend to smooth out the singularities of the naive metric obtained by dimensional reduction.

From the above discussion it is clear that the asymptotic behavior of the metric on the moduli space of the compactified theory is determined by the four-dimensional physics alone. If the four-dimensional theory is asymptotically free, or finite, the metric on the moduli space is locally flat at infinity. For the SU($k$) gauge theory with $n$ hypermultiplets this happens if $2k \geq n$. The interpretation of this restriction in terms of periodic monopoles will be explained below.

3.2. String theory picture. The relation between periodic monopoles and $N = 2$ gauge theories emerges if one embeds these gauge theories into string theory in a particular way, which we now explain. $N = 2$ SU($k$) gauge theories can be realized in IIA string theory by suspending $k$ D4-branes between two parallel NS5-branes. We shall assume that the NS5-branes' world-volume is along the 0, 1, 2, 3, 4, 5 directions, and their positions in the 7, 8, 9 directions coincide. The NS5-brane with smaller (resp. larger) $x^6$ coordinate will be called the left (resp. right) NS5-brane. The $k$ D4-branes are infinite in the 0, 1, 2, 3 directions and span a finite interval in the 6 direction.
The two boundaries of the D4-brane worldvolume lie on the NS5-branes. The direction 3 will be assumed to be periodic with period $2\pi R$.

        0  1  2  3  4  5  6  7  8  9
NS5     x  x  x  x  x  x
D4      x  x  x  x        x
The world-volume theory on the D4-branes reduces in the infrared limit to the $N = 2$ SU($k$) Yang-Mills theory on $\mathbb{R}^3 \times S^1$, where $x^0, x^1, x^2$ are affine coordinates on $\mathbb{R}^3$ and $x^3$ parametrizes $S^1$. In order to obtain a theory with $n$ fundamental hypermultiplets, one should add D4-branes parallel to the original $k$ D4-branes but located outside the interval in $x^6$ where the latter are located. These D4-branes end on either left or right NS5-branes and extend to either $x^6 = -\infty$ or $x^6 = +\infty$. The two kinds of D4-branes will be called left and right semi-infinite D4-branes, respectively, and their numbers denoted $n_L$ and $n_R$. The world-volume theory on the $k$ suspended D4-branes is now an $N = 2$ SU($k$) gauge theory with $n_L + n_R$ hypermultiplets in the fundamental representation. Their masses are given by $n_L + n_R$ complex numbers which parametrize the positions of the semi-infinite D4-branes in the 45 plane. Since the direction 3 is periodic with period $2\pi R$, the gauge theory is compactified on a circle of radius $R$.

In three dimensions the mass of the hypermultiplet is parametrized by three real numbers rather than by one complex one. The same is true about the four-dimensional theory on a circle, except that one of the three real mass parameters takes values in $S^1 \cong \mathbb{R}/\mathbb{Z}$ rather than in $\mathbb{R}$. Indeed, each hypermultiplet is associated with a global U(1) symmetry. Gauging this U(1) symmetry and letting the Wilson line of the corresponding photon along the compactified direction be non-zero gives an effective mass to the three-dimensional hypermultiplet. In the above string theory picture the global U(1) is identified with the U(1) gauge group of the semi-infinite D4-brane, and the extra mass parameter is associated with the possibility of turning on a Wilson line along $x^3$ for the corresponding photon.

To interpret this brane configuration in terms of periodic monopoles, we perform T-duality along $x^3$. The resulting configuration in Type IIB string theory consists of $k$ D3-branes suspended between two NS5-branes, and $n_L + n_R$ semi-infinite D3-branes ending on the NS5-branes.

        0  1  2  3  4  5  6  7  8  9
NS5     x  x  x  x  x  x
D3      x  x  x           x

In the limit of $R \to 0$ the T-dual circle decompactifies and we end up with the Chalmers-Hanany-Witten brane configuration. As pointed out in [4], in the worldvolume theory on the two NS5-branes the suspended D3-branes appear as monopoles in the SU(2) subgroup of U(2), while the semi-infinite D3-branes appear as Dirac U(1) monopoles minimally embedded in U(2). These monopoles live in the part of the NS5-branes' world-volume orthogonal to the D3-branes, namely in the 3, 4, 5 directions. In the remaining directions (0, 1, 2) of the NS5-brane worldvolume the gauge field configuration on the NS5-branes is translationally invariant.

For $R \neq 0$ the only difference is that the direction 3 is compact. This means that suspended D3-branes and semi-infinite D3-branes are nonabelian and Dirac monopoles on $\mathbb{R}^2 \times S^1$, respectively. It can be checked that a left semi-infinite D3-brane corresponds to a Dirac monopole with $e_i = +1$, while a right semi-infinite D3-brane corresponds to a Dirac monopole with $e_i = -1$. Thus we can identify $n_L = n_+$, $n_R = n_-$.
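The effect of the T-duality along $x^3$ on the D-brane rows of these tables follows the standard rule: a D$p$-brane wrapping the dualized circle becomes a D$(p-1)$-brane that no longer extends along it, while one transverse to the circle gains that direction. The sketch below encodes only this rule for D-branes. Representing a brane as a set of worldvolume directions and the name `t_dualize` are illustrative choices made here, and the NS5-branes, which extend along $x^3$ and remain NS5-branes, are not modeled.

```python
def t_dualize(d_brane, direction):
    """T-dualize a D-brane along one compact direction.

    d_brane: frozenset of worldvolume directions (integers 0..9); a Dp-brane
    has p + 1 of them. Hypothetical helper for illustration only.
    """
    if direction in d_brane:
        # The brane wraps the circle: it loses that direction.
        return frozenset(d_brane - {direction})
    # The brane is transverse to the circle: it gains that direction.
    return frozenset(d_brane | {direction})


# A suspended D4-brane along 0, 1, 2, 3, 6 in the IIA picture.
d4 = frozenset({0, 1, 2, 3, 6})
d3 = t_dualize(d4, 3)
print(sorted(d3))  # [0, 1, 2, 6]
```

The result is a brane with four worldvolume directions, 0, 1, 2, 6, matching the D3 row of the Type IIB table. Applying the map twice returns the original D4-brane.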
As in [4], one can argue that the metric on the moduli space of the suspended D3-branes does not receive quantum corrections and thus is identical to the classical metric on the moduli space of k periodic monopoles with n = nL + nR singularities. On the other hand, this same metric must be the metric on the Coulomb branch of the N = 2 SU (k) gauge theory with n fundamental hypermultiplets. Note that the monopole charge k is related to the rank of the gauge group of the N = 2 gauge theory. If we require the N = 2 gauge theory to be asymptotically free or finite, k must obey 2k ≥ n. In Sect. 2 we derived the same restriction on the monopole charge by analyzing the asymptotic behavior of the periodic monopole. One may wonder which assumption about monopoles corresponds to the requirement that the gauge theory be asymptotically free or finite. The answer is quite simple: we assumed that the periodic monopole configuration breaks the gauge group U (2) down to U (1) × U (1) for large |z|. This is reflected in the fact that the difference of the eigenvalues of the Higgs field either goes to infinity or approaches a finite non-zero limit for |z| → ∞. That is, the assumption of asymptotic freedom or finiteness is equivalent to the assumption of maximal symmetry breaking at infinity. The string theory picture can also be used to anticipate the result of the Nahm transform applied to periodic monopoles. To this end we perform an S-duality and then a T-duality along the direction 3. The resulting brane configuration consists only of D4-branes. 0 1 2 3 4 5 6 7 8 9 D4 x x x x x x D4 x x x x
Such a configuration of D4-branes is described by the Hitchin equations on a cylinder parametrized by x 3 and x 6 [20, 19]. The gauge group is SU (K), where the number K depends on n± and k. We will see below that K = max(n+ , n− , k). 3.3. Periodic U (m) monopoles and N = 2 Gauge theories. The relation between periodic U (2) monopoles with and without singularities and quantum N = 2 gauge theories can be extended to U (m) monopoles. Consider the following gauge theory: the gauge group is SU (k1 )×SU (k2 )×· · ·×SU (km−1 ), the matter consists of m−2 hypermultiplets in the representations (k1 , k¯2 , 1, . . . , 1), (1, k2 , k¯3 , 1, . . . , 1), . . . , (1, . . . , 1, km−2 , k¯m−1 ), of the gauge group, n− hypermultiplets in the representation (k¯1 , 1, . . . , 1), and n+ hypermultiplets in the representation (1, . . . , 1, km−1 ). For m = 2 this gauge theory reduces to the one studied in the previous section. Its string theory realization consists of m parallel NS5-branes along 0, 1, 2, 3, 4, 5 separated in the x 6 direction, m − 1 stacks of D4-branes suspended between the successive NS5-branes such that the j th stack contains kj D4-branes, n− semi-infinite D4-branes ending on the right-most NS5-brane, and n+ semi-infinite D4-branes ending on the left-most NS5-brane [31]. Performing T-duality along the x 3 direction converts all the D4-branes into D3-branes. The resulting brane configuration in Type IIB string theory is identical to the one considered by Hanany and Witten [14], except that the direction 3 is periodic. Since we have m NS5-branes, their worldvolume theory has gauge group U (m). The above brane configuration is represented in this worldvolume theory by a BPS monopole with a nonabelian charge (k1 , . . . , km−1 ) with n+ + n− Dirac-type singularities of which n− have ei = −1 and n+ have ei = +1 [14, 6, 8]. 
We conclude that the moduli space of a U (m) periodic monopole with n+ singularities with ei = +1, n− singularities with ei = −1 and a nonabelian charge (k1 , . . . , km−1 ) is identical to the quantum Coulomb branch of the N = 2 gauge theory described above compactified on a circle. Performing S-duality and then T-dualizing x 3 again yields a configuration consisting solely of D4-branes. Such configurations have been studied in [20, 19] in the case when the direction 6 is periodic and k1 = k2 = · · · = km−1 . The results of [20, 19] suggest that brane configurations of this type are described by the SU (K) Hitchin equations on a cylinder, where K is some integer number. As explained below, the Hitchin equations are related to periodic monopoles by means of the Nahm transform. The integer K will be determined to be max(n+ , n− , k1 , . . . , km−1 ). 4. Nahm Transform This section describes the Nahm transform for periodic U (2) monopoles with singularities, as well as certain algebro-geometric data associated to monopoles. 4.1. Direct Nahm transform. Given a periodic monopole (A, φ) with singularities we define a family of Dirac-type operators parametrized by a point (r, t) ∈ R × R/Z as follows. Let L be a line bundle over R2 × S1 with a flat unitary connection a = −tdχ whose only non-zero component is along S1 . The variable t takes values in the dual circle R/Z which we denote Sˆ 1 . Let σi be Pauli matrices and let r be a real number. Let S be the trivial rank-two vector bundle on R2 × S1 . The Pauli matrices can be regarded
as morphisms S → S. We define a first-order differential operator D : S ⊗ E → S ⊗ E by
\[ D = \sigma \cdot d_{A+a} - (\phi - r). \tag{26} \]
With some abuse of terminology we shall refer to it as "the Dirac operator", and to the bundle S as "the spin bundle". We shall use a multi-valued complex coordinate s = r + it on R × Ŝ^1. It is important to know for which values of s the operator D is Fredholm. If both ℓ1 and ℓ2 are non-zero, then it is easy to see that D is Fredholm for all s ∈ R × Ŝ^1. If either ℓ1 or ℓ2 is zero, then one or both of the eigenvalues of the Higgs field stay finite for |z| → ∞. Depending on whether ℓ1 = 0 or ℓ2 = 0, this eigenvalue is equal to v1 or v2. In this case D can fail to be Fredholm only if r = Re s is equal to one of the finite eigenvalues, and at the same time t = Im s is equal to b1 or b2. This can be stated more concisely by introducing a non-unitary connection A − iφ dχ, and letting V(z) be its holonomy along the χ direction at a point z ∈ C. If both ℓ1 and ℓ2 are non-zero, the eigenvalues of V(z) do not approach a finite limit as z → ∞; if ℓ1 = 0, ℓ2 ≠ 0, then one of the eigenvalues of V(z) approaches a finite limit w1 = e^{v1+ib1}; if ℓ1 ≠ 0, ℓ2 = 0, then one of the eigenvalues of V(z) approaches a finite limit w2 = e^{v2+ib2}; if ℓ1 = ℓ2 = 0, then the eigenvalues of V(z) approach w1 and w2. Let K be the set of points on R × Ŝ^1 such that e^{2πs} is equal to one of the limiting eigenvalues of V(z). K consists of at most two distinct points. The operator D is Fredholm for s ∈ (R × Ŝ^1)\K because D†D has a mass gap. The Weitzenbock formula [5] implies that the operator D†D is positive-definite on the space of functions of rapid decrease. It is also easy to see that all elements of the L^2 kernel of D must be decreasing rapidly at infinity for s ∉ K, and therefore the L^2 kernel of D is trivial for s ∉ K. It follows that the L^2 kernel of D† is a vector bundle on (R × Ŝ^1)\K of rank −Ind D. We denote this bundle Ê. We will show below that for a periodic monopole with singularities Ind D = −max(n+, n−, k).
We now endow Ê with a unitary connection Â and a section of End Ê. The space of all L^2 sections of S ⊗ E forms a trivial unitary bundle over R × Ŝ^1 with a trivial connection. The bundle Ker D† is a subbundle in it. Let P denote the orthogonal projector to Ker D†. The induced connection on Ker D† is given by
\[ \hat{A} = i P \frac{\partial}{\partial s}\, ds + i P \frac{\partial}{\partial \bar{s}}\, d\bar{s}. \tag{27} \]
The Higgs field φ̂ ∈ Γ(End Ê) is defined by
\[ \hat{\phi}(s) = P z. \tag{28} \]
A computation similar to Nahm's original computation shows [5] that Â and φ̂ satisfy the Hitchin equations:
\[ \bar{\partial}_{\hat{A}} \hat{\phi} = 0, \tag{29} \]
\[ \hat{F}_{s\bar{s}} + \frac{i}{4} \big[ \hat{\phi}, \hat{\phi}^{\dagger} \big] = 0. \tag{30} \]
Thus to any periodic U(2) monopole with singularities we can associate a solution of the Hitchin equations on a cylinder, with zero, one, or two points deleted, depending on the values of ℓ1, ℓ2.
4.2. Reformulation of the Nahm transform. Here we present a useful reformulation of the Nahm transform in which one can easily recognize its cohomological origin. The cohomological formulation of the Nahm transform is made explicit in [5]. If we want to solve the Dirac equation
\[ D^{\dagger} \theta = D^{\dagger} \begin{pmatrix} \theta_1 \\ \theta_2 \end{pmatrix} = 0, \tag{31} \]
we can do it in two steps. First, let us find a solution of the equation
\[ 2(\partial_{\bar{z}} - iA_{\bar{z}})\, \breve{\theta}_1 - (\partial_{\chi} - \phi - iA_{\chi} + s)\, \breve{\theta}_2 = 0, \tag{32} \]
where θ̆1 and θ̆2 are L^2 sections of E. If we define a first-order differential operator
\[ \bar{D}_1^{s} = \Big( 2(\partial_{\bar{z}} - iA_{\bar{z}}),\; -(\partial_{\chi} - \phi - iA_{\chi} + s) \Big), \]
which acts from E⊕E to E, and let θ̆ = θ̆1 ⊕ θ̆2 ∈ Γ(E⊕E), then the above equation can be rewritten as D̄_1^s θ̆ = 0. Now let us define an operator D̄_0^s which acts from E to E⊕E:
\[ \bar{D}_0^{s} = \begin{pmatrix} \partial_{\chi} - \phi - iA_{\chi} + s \\ 2(\partial_{\bar{z}} - iA_{\bar{z}}) \end{pmatrix}. \tag{33} \]
The Bogomolny equations imply that D̄_1^s D̄_0^s = 0. Thus if θ̆ solves Eq. (32), then θ̆ + D̄_0^s ρ also solves Eq. (32) for any ρ ∈ Γ(E). In the second step, we look for ρ ∈ Γ(E) such that θ̆ + D̄_0^s ρ is square-integrable and solves the Dirac equation. Note that D† can be written in the form
\[ D^{\dagger} = \begin{pmatrix} (\bar{D}_0^{s})^{\dagger} \\ -\bar{D}_1^{s} \end{pmatrix}. \tag{34} \]
For
\[ \theta = \breve{\theta} + \bar{D}_0^{s} \rho \tag{35} \]
to solve D†θ = 0, the section ρ must satisfy
\[ (\bar{D}_0^{s})^{\dagger} \bar{D}_0^{s}\, \rho = -(\bar{D}_0^{s})^{\dagger} \breve{\theta}. \tag{36} \]
Now observe that the operator (D̄_0^s)† D̄_0^s is positive-definite for s ∉ K, and therefore the above equation has a unique square-integrable solution. Thus there is a one-to-one correspondence between solutions of the Dirac equation D†θ = 0, and solutions of the equation D̄_1^s θ̆ = 0 modulo sections of E⊕E of the form D̄_0^s ρ.

The benefit of the new description is that it simplifies the definition of φ̂ and the holomorphic structure on Ê. Since the operators D̄_1^s and D̄_0^s commute with ∂_s̄ as well as with multiplication by a holomorphic function of z, the new definition of φ̂ is simply φ̂ = z, and the new definition of the ∂̄ operator on Ê is simply
\[ \bar{\partial}_{\hat{A}} = \frac{\partial}{\partial \bar{s}}. \]
4.3. Monopole spectral data. An important role in the subsequent analysis is played by the spectral data, which are algebro-geometric data associated to every periodic monopole with singularities. To define the spectral data, consider the already familiar connection B = A − iφ dχ and its holonomy V(z) around the circle S^1 parametrized by χ at a point z ∈ C ≅ R^2. V(z) is a section of a bundle obtained by restricting E to the plane χ = 0. This bundle has a natural holomorphic structure given by ∂̄_A, and the Bogomolny equation ensures that V(z) is a holomorphic section [5]. Thus the coefficients of the characteristic polynomial of V(z) are holomorphic functions of z, and the equation
\[ \det\big( e^{2\pi s} - V(z) \big) = 0 \tag{37} \]
defines a holomorphic curve in C × C*, where we identified C* with the cylinder R × Ŝ^1 parametrized by s via the exponential map. This curve S will be called the monopole spectral curve. Since each point of the curve S corresponds to an eigenvalue of V(z), there is a well-defined sheaf M on S consisting of the eigenvectors of V(z). For a general monopole the spectral curve is nonsingular, and M is the sheaf of local sections of a line bundle [5]. The line bundle M has a natural holomorphic structure defined as follows. For s and z related by Eq. (37), a section of M is represented by a section θ of the bundle E satisfying
\[ \Big( \frac{\partial}{\partial \chi} - iA_{\chi} - \phi + s \Big) \theta = 0. \tag{38} \]
It is a holomorphic section of M if and only if
\[ \Big( \frac{\partial}{\partial \bar{z}} - iA_{\bar{z}} \Big) \theta = 0. \tag{39} \]
These two equations are consistent because of the Bogomolny equations. We will call M the spectral line bundle, and the pair (S, M) the monopole spectral data.

4.4. Hitchin spectral data. To every solution of the Hitchin equations on C* ≅ R × Ŝ^1 one can also associate a holomorphic curve C and a sheaf N on it. The equation of the curve is the characteristic equation of φ̂(s):
\[ \det\big( z - \hat{\phi}(s) \big) = 0. \tag{40} \]
It defines a holomorphic curve in C × C* because φ̂ is a holomorphic section of End Ê by virtue of the Hitchin equations. The sheaf N is the sheaf of eigenvectors of φ̂(s). If the curve C is nonsingular, then N is a line bundle [5]. The holomorphic structure on N is defined as follows: a section ψ of N is holomorphic if and only if
\[ \Big( \frac{\partial}{\partial \bar{s}} - i\hat{A}_{\bar{s}} \Big) \psi = 0. \tag{41} \]
Since ∂_s̄ − iÂ_s̄ commutes with φ̂, this definition is consistent. The pair (C, N) is called the Hitchin spectral data.
4.5. Coincidence of the spectral data. To any periodic monopole we can thus associate two kinds of spectral data: the monopole spectral data, and the Hitchin spectral data of its Nahm transform. A fact of paramount importance is that these two kinds of spectral data coincide. In [5] we proved this for periodic monopoles without singularities, but the argument applies to the present case just as well. Below we sketch the construction of the isomorphism between the two kinds of spectral data. Suppose that a point (ζ, e^{2πσ}) belongs to the spectral curve C ⊂ C × C*. If Ψ represents a holomorphic section of N, it satisfies
\[ \hat{\phi}(\sigma) \Psi = \zeta \Psi \tag{42} \]
and
\[ (\partial_{\bar{\sigma}} - i\hat{A}_{\bar{\sigma}}) \Psi = 0. \tag{43} \]
From the point of view of the Nahm transform, Ψ is a zero mode of D† twisted by σ. As explained above, we can also think of Ψ as a section θ̆ ∈ Γ(E⊕E) satisfying D̄_1^σ θ̆ = 0, modulo the equivalence relation θ̆ ∼ θ̆ + D̄_0^σ ρ. From this point of view, Eq. (42) is equivalent to (z − ζ)θ̆ = D̄_0^σ ψ. The latter equation implies that ψ|_{z=ζ} represents a section of the sheaf M as defined in Eq. (38). The statement that the holomorphic line bundles M and N are isomorphic means that the condition
\[ \Big( \frac{\partial}{\partial \bar{\zeta}} - iA_{\bar{\zeta}} \Big) \psi|_{z=\zeta} = 0 \tag{44} \]
on ψ is equivalent to imposing the condition (43) on Ψ.

4.6. Index computation. Let us now justify the claim that Ind D = −max(n+, n−, k). The operator D is a Callias-type, or Dirac-Schroedinger, operator. There are a number of index theorems for this kind of operator which express Ind D in terms of Chern classes of the eigenbundles of the Higgs field on the boundary. Unfortunately, none of these theorems apply to the present situation, as they usually do not allow for singularities of the fields. Instead of using this direct approach, we will give two indirect arguments which show that Ind D = −max(n+, n−, k). The first argument uses the fact that the monopole spectral curve coincides with the Hitchin spectral curve. By definition of the Nahm transform, −Ind D = rank Ê, which in turn is equal to the number of times the Hitchin spectral curve covers the w-plane. On the other hand, it is shown in the next section that the monopole spectral curve has the form A(z)w^2 − B(z)w + C(z) = 0, where w = e^{2πs}, and A(z), B(z), and C(z) are polynomials of degree n−, k, and n+, respectively. This curve covers the w-plane max(n+, n−, k) times, which proves our statement. The second argument uses the reformulation of the Nahm transform presented in Subsect. 4.2. It provides some information on the spatial structure of the zero modes of D†. Consider a complex of sheaves of vector spaces
\[ K^{\sigma} : \; 0 \to E \xrightarrow{\;\bar{D}_0^{\sigma}\;} E \oplus E \xrightarrow{\;\bar{D}_1^{\sigma}\;} E \to 0, \]
where the sections of E are assumed to be rapidly decaying. The value of σ is arbitrary, except that we require σ ∉ K. As shown in Sect. 4.2, H^1(K^σ) is naturally isomorphic to Ker D† twisted by s = σ. It remains to compute the dimension of H^1(K^σ). To this end consider another complex of sheaves:
\[ 0 \to E \xrightarrow{\;(z-\zeta)\;} E \xrightarrow{\;\mathrm{rest.}\;} E|_{z=\zeta} \to 0, \]
where rest. is the restriction map, and E|_{z=ζ} is concentrated on the circle z = ζ. This complex is not exact; nevertheless it leads to a long exact sequence in D̄^σ-cohomology [5]:
\[ 0 \to H^0_{\bar{D}^{\sigma}}(S^1, E|_{z=\zeta}) \to H^1(K^{\sigma}) \xrightarrow{\;z-\zeta\;} H^1(K^{\sigma}) \xrightarrow{\;\mathrm{rest.}\;} H^1_{\bar{D}^{\sigma}}(S^1, E|_{z=\zeta}) \to 0. \]
This exact sequence implies that the dimension of H^1(K^σ) is equal to the number of points at which the spectral curve intersects the line s = σ. Using the explicit form of the monopole spectral curve, one can easily see that this number is equal to max(n+, n−, k).

5. Boundary Conditions for Hitchin Data

5.1. General remarks. The boundary conditions on the Hitchin data will be determined mainly by studying the monopole spectral curve. It has the general form w^2 − b(z)w + c(z) = 0. The functions b(z) = Tr V(z, 2π) and c(z) = det V(z, 2π) are known to be holomorphic on C\{z1, . . . , zn}. We now show that they are rational functions. Using the known asymptotic behavior of φ and A near the singularities, we compute that c(z) has a simple zero at z = zi if ei = +1 and a simple pole if ei = −1. As for b(z), it has a simple pole at z = zi if ei = −1 and is regular if ei = +1. Thus b(z) and c(z) are meromorphic on C. Using the known asymptotic behavior of φ and A at infinity, we obtain the following asymptotic formulas for b(z) and c(z) for z → ∞:
\[ b(z) = z^{\ell_1} e^{\mathbf{v}_1} \Big( 1 + \frac{\mu_1}{z} + O(1/z^2) \Big) + z^{\ell_2} e^{\mathbf{v}_2} \Big( 1 + \frac{\mu_2}{z} + O(1/z^2) \Big), \tag{45} \]
\[ c(z) = z^{\ell_1+\ell_2} e^{\mathbf{v}_1+\mathbf{v}_2} \Big( 1 + \frac{\mu_1+\mu_2}{z} + O(1/z^2) \Big). \tag{46} \]
Here \mathbf{v}_1 = v_1 + i b_1, \mathbf{v}_2 = v_2 + i b_2. Hence b(z) and c(z) are meromorphic on P^1, i.e. rational functions on C. Moreover, the above information about the poles of b(z) and c(z) implies
\[ b(z) = \frac{B(z)}{A(z)}, \qquad c(z) = \frac{C(z)}{A(z)}, \]
where
\[ A(z) = \prod_{e_i=-1} (z - z_i), \qquad C(z) = e^{\mathbf{v}_1+\mathbf{v}_2} \prod_{e_i=1} (z - z_i), \]
and B(z) is a polynomial of degree k. Thus the monopole spectral curve can be rewritten in the following form:
\[ A(z) w^2 - B(z) w + C(z) = 0. \tag{47} \]
It is understood here that the points (w, z) on the curve satisfying A(z) = 0 must be deleted. It is important to note that the known asymptotics of b(z) and c(z) determine the leading and the next-to-leading coefficients of B(z) in terms of the monopole parameters. For example, the leading coefficient of B(z) is given by
\[ \begin{cases} e^{\mathbf{v}_1}, & \ell_1 > \ell_2, \\ e^{\mathbf{v}_1} + e^{\mathbf{v}_2}, & \ell_1 = \ell_2. \end{cases} \]
The precise expression for the next-to-leading coefficient will not be needed here. Since B(z) has k + 1 coefficients and two of them are fixed in this way, the remaining k − 1 coefficients of B(z) are the moduli of the monopole. In [31] it was shown that the curve (47) is the Seiberg-Witten curve for the N = 2, d = 4 gauge theory with gauge group SU(k) and n fundamental hypermultiplets. The masses of the hypermultiplets are the zeros of A(z) and C(z), which from the monopole point of view are just the positions of the Dirac-type singularities. The reason for this "coincidence" was explained in Sect. 3. We proved in Sect. 4 that the Hitchin spectral curve is identical to the monopole spectral curve. We now use the form of the spectral curve to determine the asymptotic behavior of the Hitchin data. The results depend on the relative magnitude of the numbers k, n+, n−. There are seven possible cases to consider. But note that the substitution φ → −φ, χ → −χ leaves the Bogomolny equation invariant and in terms of the spectral curve maps w to 1/w leaving z invariant. Thus this map interchanges n+ and n−, and without loss of generality we may assume that n− ≤ k. This leaves us with four cases to consider.

5.2. The case n− < k = n+. The rank of Ê is n+ = k. The set K ⊂ R × Ŝ^1 consists of a single point w = w2 = e^{v2+ib2}. Thus we need to understand the behavior of Â and φ̂ for |r| → ∞, as well as near w = w2. We begin with the region of large |r|. The spectral curve equation implies that the eigenvalues of φ̂ for r → +∞ (w → ∞) asymptote to
\[ a_1, a_2, \ldots, a_{n_-}, \; e^{(2\pi s - \mathbf{v}_1)/(k-n_-)}, \; \omega_{k-n_-} e^{(2\pi s - \mathbf{v}_1)/(k-n_-)}, \; \ldots, \; \omega_{k-n_-}^{\,k-n_--1} e^{(2\pi s - \mathbf{v}_1)/(k-n_-)}, \tag{48} \]
where ω_p = e^{2πi/p}, and a1, . . . , a_{n−} are the roots of A(z). For r → −∞ (w → 0) the n+ eigenvalues of φ̂(s) are all distinct and approach the n+ roots of C(z).

To determine the behavior of Â, note first that the curvature of Â goes to zero for large |r| [5]. Indeed, the computations in [5] imply that
\[ F_{\hat{A}} = i P \sigma_3 (D^{\dagger} D)^{-1} P, \tag{49} \]
where P is the projector onto the kernel of D†. Now, it is easy to see that the L^2 norm of the Green's function of D†D is bounded from above by const/|r|^{3/2}, and so is the norm
of F_Â. Since F_Â goes to zero at least as fast as 1/|r|^{3/2}, Â has well-defined limiting holonomies around Ŝ^1 for r → ±∞.

Since the Hitchin equations relate F_Â and [φ̂, φ̂†], we see that [φ̂, φ̂†] goes to zero for large |r|, and therefore in this limit the eigenvectors of φ̂ become orthogonal. Furthermore, since for r → −∞ the eigenvalues of φ̂ approach constants, in this limit the holonomy of Â becomes diagonal in the basis of the eigenvectors of φ̂.

It is possible to find the eigenvalues of the limiting holonomy by analyzing the Nahm transform in more detail. Instead we will use a shortcut. As explained in the next section, the Nahm transform admits an inverse. The inverse Nahm transform involves finding the kernel of a family of Dirac-type operators D̂_x parametrized by a point x = (z, χ) ∈ R^2 × S^1. The fields A(x), φ(x) are expressed through the overlaps of the zero-modes of D̂_x. It will be seen that D̂_x can fail to be Fredholm only when exp(iχ) coincides with one of the eigenvalues of the limiting holonomies of Â, and z coincides with the corresponding limiting eigenvalue of φ̂. For all other values of x = (z, χ), the operator D̂_x is Fredholm, and the fields A(x), φ(x) are nonsingular. We already know that the limiting eigenvalues of φ̂ for r → −∞ are precisely the z-coordinates of the singularities with ei = +1. Hence the eigenvalues of the limiting holonomy of Â must be given by the χ-coordinates of these singularities. Thus, if we denote these χ-coordinates by χ_{n−+1}, . . . , χ_n, the holonomy of Â in the basis of the eigenvectors of φ̂ and in the limit r → −∞ is
\[ \mathrm{diag}\big( e^{i\chi_{n_-+1}}, e^{i\chi_{n_-+2}}, \ldots, e^{i\chi_n} \big). \tag{50} \]
Now let us find the limiting holonomy of Â for r → +∞. Note that the Hitchin equations together with the limiting behavior of the eigenvalues of φ̂ imply that in the basis of the eigenvectors of φ̂ the limiting holonomy is
\[ \begin{pmatrix} e^{i\beta_1} & & & \\ & \ddots & & \\ & & e^{i\beta_{n_-}} & \\ & & & V_{k-n_-}\, e^{i\tilde{\alpha}_1/(k-n_-)} \end{pmatrix}, \tag{51} \]
where β1, . . . , β_{n−}, α̃1 ∈ R/(2πZ), and V_p is the p × p "shift" matrix given by
\[ V_p = \begin{pmatrix} 0 & 1 & 0 & \cdots & 0 & 0 \\ 0 & 0 & 1 & \cdots & 0 & 0 \\ 0 & 0 & 0 & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & 0 & 1 \\ 1 & 0 & 0 & \cdots & 0 & 0 \end{pmatrix}. \tag{52} \]
The "shift" matrix appears because k − n− of the limiting eigenvalues of φ̂ are cyclically permuted as one goes around Ŝ^1. Note also that α̃1 takes values in R/(2πZ) because a shift α̃1 → α̃1 + 2πm, m ∈ Z, can be undone by a gauge transformation. It remains to determine β1, . . . , β_{n−} and α̃1. The first n− eigenvalues of the limiting holonomy are associated with the eigenvalues of φ̂ which become constant in the limit r → +∞. Using the same shortcut as above, we find that βi is equal to χi, the χ-coordinate of the i-th singularity with ei = −1. We will discuss how to express α̃1 in terms of the parameters of the monopole at the end of this subsection.
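The two properties of the shift matrix used above can be checked numerically. The following is a minimal sketch and not part of the original argument: it builds V_p as in Eq. (52) for an arbitrarily chosen p = 4, checks that V_p^p is the identity (so the eigenvalues of V_p are the p-th roots of unity, which is what a cyclic permutation of p eigenvectors requires), and checks that multiplying V_p by ω_p = e^{2πi/p}, which is the effect of α̃1 → α̃1 + 2π in V_p e^{iα̃1/p}, is undone by conjugation with a constant diagonal unitary matrix. The function names and the value of p are illustrative assumptions.

```python
import numpy as np

def shift_matrix(p):
    """p x p cyclic shift matrix of Eq. (52): ones on the superdiagonal
    and a single one in the lower-left corner."""
    return np.roll(np.eye(p), 1, axis=1)

def phase_absorbing_matrix(p):
    """Diagonal unitary G with G V_p G^{-1} = omega_p V_p (illustrative helper)."""
    omega = np.exp(2j * np.pi / p)
    return np.diag(omega ** (-np.arange(p)))

p = 4  # arbitrary example size, an assumption of this sketch
V = shift_matrix(p)
omega = np.exp(2j * np.pi / p)
G = phase_absorbing_matrix(p)

# V_p^p is the identity, so every eigenvalue of V_p is a p-th root of unity.
vp_power = np.linalg.matrix_power(V, p)

# Conjugation by G multiplies V_p by omega_p, so the shift
# alpha -> alpha + 2*pi in V_p * exp(i*alpha/p) is only a change of basis.
conjugated = G @ V @ np.linalg.inv(G)
```

The choice G = diag(1, ω_p^{-1}, ω_p^{-2}, . . .) is one convenient option; any overall phase multiple of it works equally well.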
Now let us determine the behavior of Â and φ̂ near the point w = w2. From the spectral curve equation it is easy to see that the function z(w) considered as a meromorphic function on the curve has a simple pole at w = w2 = e^{v2+ib2} with residue e^{v2+ib2} μ2. This implies that the Higgs field φ̂ behaves as
\[ \hat{\phi}(s) \sim \frac{R}{s - s_2} + O(1), \]
where s2 = (1/2π) log w2, and R is a rank-one matrix whose non-zero eigenvalue is e^{v2+ib2} μ2.

Now let us show that [R, R†] = 0. To this end it is sufficient to demonstrate that [φ̂, φ̂†] grows at most as 1/|s − s2| in the limit s → s2. To estimate the norm of this commutator, note that by virtue of the Hitchin equations and (49) we have
\[ \big[ \hat{\phi}, \hat{\phi}^{\dagger} \big] = -4 P \sigma_3 (D^{\dagger} D)^{-1} P. \]
Now the required estimate on the commutator follows from a simple estimate of the L^2 norm of (D†D)^{-1}.

Having determined the behavior of φ̂ near s = s2, we now turn to Â. Our strategy will be the following. Specifying a unitary connection Â is equivalent to specifying a holomorphic structure on Ê, as well as a Hermitian metric on Ê. In particular, the behavior of Â near s = s2 is determined by the rate of growth of holomorphic sections of Ê near s = s2. As a basis of holomorphic sections we will use the eigenvectors of φ̂. They can be reinterpreted as holomorphic sections of the spectral line bundle. Their norm can be estimated using the "cohomological" reformulation of the Nahm transform described in Sect. 4.2. The same strategy was used in [3] to study the Nahm transform of instantons on R^2 × T^2, and we will make use of some of the results of that paper.

An important role in this argument is played by the holomorphic isomorphism of the spectral line bundles M and N whose construction we now recall. Let (ζ, e^{2πσ}) be the coordinates of a point on the spectral curve C. In the neighbourhood of σ = s2, ζ has a simple pole as a function of σ, ζ ∼ 1/(σ − s2). If Ψ is a vector in the fiber of the spectral line bundle over the point (ζ, e^{2πσ}), then
\[ \hat{\phi}(\sigma) \Psi = \zeta \Psi. \]
The fact that ζ diverges for σ → s2 means that Ψ is an eigenvector of φ̂, unique up to a multiple, whose eigenvalue diverges in this limit. Let θ be the corresponding class in D̄-cohomology, as described in Sect. 4.5. This implies
\[ (z - \zeta)\, \theta = \bar{D}_0^{\sigma} \psi_{\zeta}(z, \chi), \tag{53} \]
for some ψ_ζ(z, χ) ∈ Γ(E). Then ψ_ζ(ζ, χ) is an element in the fiber M → S, and the isomorphism between M and N identifies it with Ψ. Now let us start varying σ so that Ψ(σ) is a holomorphic section of the spectral line bundle. This is equivalent to saying that the section ψ_ζ(ζ, χ) is holomorphic with respect to ∂̄_A, i.e.
\[ (\partial_{\bar{\zeta}} - iA_{\bar{\zeta}})\, \psi_{\zeta}(\zeta, \chi) = 0. \tag{54} \]
From the asymptotics of A we infer that when ζ tends to infinity the norm of such a section behaves as |ζ|^{−n−α2/(2π)} with an integer n. Since M and N are holomorphically equivalent, the corresponding Ψ is going to satisfy (∂_σ̄ − iÂ_σ̄)Ψ(σ) = 0. For any μ(z, χ) ∈ Γ(E) we can replace ψ_ζ(z) with
ψ_ζ(z) + (z − ζ)μ(z, χ) without spoiling the holomorphicity condition (54). If we define a representative θ̆ of the class Ψ by θ̆_ζ(z) = D̄_0^σ ψ_ζ(z)/(z − ζ), such a change of ψ_ζ will change θ̆_ζ by D̄_0^σ μ. As explained in Sect. 4.2, from θ̆_ζ we can construct an L^2 solution of the Dirac equation D†θ_σ = 0:
\[ \theta_{\sigma} = \breve{\theta}_{\zeta} - \bar{D}_0^{\sigma} \Big( (\bar{D}_0^{\sigma})^{\dagger} \bar{D}_0^{\sigma} \Big)^{-1} (\bar{D}_0^{\sigma})^{\dagger} \breve{\theta}_{\zeta}. \tag{55} \]
The norm of Ψ_σ is defined to be the L^2 norm of θ_σ. Now we use Lemma 7.5 of [3] to estimate this norm:
\[ \Big| \bar{D}_0^{\sigma} \Big( (\bar{D}_0^{\sigma})^{\dagger} \bar{D}_0^{\sigma} \Big)^{-1} (\bar{D}_0^{\sigma})^{\dagger} \breve{\theta}_{\zeta} \Big| \le \frac{\mathrm{const}}{|\sigma - s_2|} \Big| (\bar{D}_0^{\sigma})^{\dagger} \breve{\theta}_{\zeta} \Big|. \tag{56} \]
Thus the norm of θ_σ is bounded from above by a multiple of |σ − s2|^{−1+n+α2/(2π)}. In a similar manner one can estimate the norms of the holomorphic sections which correspond to the eigenvalues of φ̂ which stay finite in the limit s → s2. We find that their norms remain bounded.

Now we are ready to determine the behavior of Â near the singularity. Since the norms of all holomorphic sections grow no faster than powers of |σ − s2|, Â must have a simple pole at this point:
\[ \hat{A}_s(\sigma) = \frac{Q}{\sigma - s_2} + O(1). \]
The residue Q must satisfy [Q, Q†] = 0, for the same reasons as the residue of φ̂. Furthermore, it must satisfy [Q†, R] = 0 for the equation ∂̄_Â φ̂ = 0 to be satisfied. Thus it is possible to choose a basis in the fiber of Ê over s = s2 such that both Q and R are diagonal. The above estimates on the norm of holomorphic eigenvectors of φ̂ then imply that the eigenvalues of Q restricted to Ker R are zero, and therefore Q is a rank-one matrix. It is easy to see that by a gauge transformation one can make the single non-zero eigenvalue of Q purely imaginary. We will denote this purely imaginary number by iα̃2/(4π). Further gauge transformations can shift α̃2 by multiples of 2π, so we should regard α̃2 as taking values in R/(2πZ).

It remains to understand how the parameters α̃1 and α̃2 depend on the monopole parameters. Since Tr F_Â = 0, we can obtain a relation between various limiting holonomies of Â by integrating this equation over X̂ and using Stokes' theorem:
\[ \sum_{i=1}^{n} e_i \chi_i = \tilde{\alpha}_1 + \tilde{\alpha}_2. \tag{57} \]
This equation leaves the individual values of α̃1 and α̃2 undetermined. We conjecture that α̃1 = α1 and α̃2 = α2. One piece of evidence in favor of this is that the relation (57) is automatically satisfied due to (21). More compelling evidence is provided by our estimate of the rate of growth of holomorphic sections of Ê near the point s = s2. We found that the sections in the image of Q are bounded from above by a multiple of |s − s2|^{−1+n+α2/(2π)}, where n is an integer. If this bound were saturated, the equality α̃2 = α2 would follow. Then α̃1 = α1 would also follow by combining (57) and (21).
5.3. The case n− < k > n+. The rank of Ê is k. The set K is empty, i.e. the Hitchin data are non-singular for all s. The fact that φ̂ is non-singular can be seen from the spectral curve equation
\[ B(z) - w A(z) - \frac{C(z)}{w} = 0. \tag{58} \]
If k > n±, then the roots of this polynomial equation in z have no singularities as functions of w ∈ C*. The spectral curve equation implies that for r → +∞ the asymptotics of the eigenvalues of φ̂ are given by Eq. (48). The asymptotic behavior of the eigenvalues of φ̂ for r → −∞ is analogous:
\[ c_1, c_2, \ldots, c_{n_+}, \; e^{(2\pi s - \mathbf{v}_2)/(n_+-k)}, \; \omega_{n_+-k}\, e^{(2\pi s - \mathbf{v}_2)/(n_+-k)}, \; \ldots, \; \omega_{n_+-k}^{\,k-n_+-1} e^{(2\pi s - \mathbf{v}_2)/(n_+-k)}, \tag{59} \]
where c1, . . . , c_{n+} are the roots of C(z). Note that all eigenvalues of φ̂ either approach a constant value, or grow exponentially with |r|.

Since the curvature of Â decays as |r|^{−3/2} for large |r|, Â has well-defined limiting holonomies around Ŝ^1. The same reasoning as in the previous subsection implies that in the basis of the eigenvectors of φ̂ the limiting holonomy for r → +∞ must have the form
\[ \begin{pmatrix} e^{i\chi_1} & & & \\ & \ddots & & \\ & & e^{i\chi_{n_-}} & \\ & & & V_{k-n_-}\, e^{i\tilde{\alpha}_1/(k-n_-)} \end{pmatrix}, \tag{60} \]
where χi, i = 1, . . . , n−, are the χ-coordinates of the singularities with ei = −1, and α̃1 ∈ R/(2πZ). For r → −∞ the limiting holonomy in the basis of the eigenvectors of φ̂ is given by
\[ \begin{pmatrix} e^{i\chi_{n_-+1}} & & & \\ & \ddots & & \\ & & e^{i\chi_{n_-+n_+}} & \\ & & & V_{k-n_+}\, e^{-i\tilde{\alpha}_2/(k-n_+)} \end{pmatrix}, \tag{61} \]
where χ_{n−+1}, . . . , χ_{n−+n+} are the χ-coordinates of the singularities with ei = +1, and α̃2 ∈ R/(2πZ).

It remains to express the parameters α̃1, α̃2 through the parameters of the monopole. Stokes' theorem again implies (57), but leaves the individual values of α̃1, α̃2 undetermined. We conjecture that in fact α̃1 = α1 and α̃2 = α2. The main evidence in favor of this conjecture is the upper bound on the rate of growth of holomorphic sections for r → ±∞. For example, consider the limit r → +∞. The bundle Ê has a subbundle spanned by the eigenvectors of φ̂ with diverging eigenvalues. The rank of this subbundle is k − n−. Let L be its top exterior power. L is a line bundle which inherits from Ê a holomorphic structure, as well as a Hermitian inner product. It is easy to see that holomorphic sections of this bundle grow as e^{r(2πn−α̃1)}, where n ∈ Z, for r → +∞. On the other hand, one can estimate the rate of growth using the coincidence of the spectral data, as in Subsect. 5.2, and get that the holomorphic sections of L are bounded by e^{r(2πn−α1)}, n ∈ Z. If the bound is saturated, then α̃1 = α1, and consequently α̃2 = α2.
5.4. The case n− < k < n+. The rank of the Hitchin system is n+. The set K is empty, i.e. the Hitchin data are defined everywhere on the cylinder. It follows from the spectral curve equation that for r → −∞ the eigenvalues of φ̂ approach the n+ roots of C(z). The same argument as in the case n− < k = n+ shows that the limiting holonomy of Â in the basis of the eigenvectors of φ̂ is given by Eq. (50).

The spectral curve equation also implies that for r → +∞, n− of the eigenvalues of φ̂(s) approach a1, . . . , a_{n−}, n+ − k of them asymptote to the n+ − k roots of
\[ z^{n_+-k} = e^{(2\pi s - \mathbf{v}_1)}, \tag{62} \]
and k − n− of them asymptote to the k − n− roots of
\[ z^{k-n_-} = e^{(2\pi s - \mathbf{v}_2)}. \tag{63} \]
The limiting holonomy of Â at r → +∞ in the basis of the eigenvectors of φ̂ is given by
\[ \begin{pmatrix} e^{i\chi_1} & & & & \\ & \ddots & & & \\ & & e^{i\chi_{n_-}} & & \\ & & & V_{n_+-k}\, e^{-i\tilde{\alpha}_1/(n_+-k)} & \\ & & & & V_{k-n_-}\, e^{-i\tilde{\alpha}_2/(k-n_-)} \end{pmatrix}, \tag{64} \]
where α̃1, α̃2 ∈ R/(2πZ). Stokes' theorem again yields (57). We conjecture that α̃1 = α1, α̃2 = α2, for the same reason as in the previous case.

5.5. The case n− = k = n+. The bundle Ê has rank k. The set K consists of two points given by w = w1 = e^{v1+ib1} and w = w2 = e^{v2+ib2}. For r → +∞ the eigenvalues of φ̂(s) approach the roots of A(z), while for r → −∞ they approach the roots of C(z). The limiting holonomies of Â are well-defined and are given by (50) for r → −∞ and by diag(e^{iχ1}, e^{iχ2}, . . . , e^{iχ_{n−}}) for r → +∞. The analysis of the singularities at w = w_{1,2} is the same as in the case n− < k = n+. For either of the singular points one of the eigenvalues of φ̂ has a simple pole. The residue is equal to μ1 e^{v1+ib1} for w = w1 and to μ2 e^{v2+ib2} for w = w2. Together with the estimate ||F_Â|| ≤ const/|w − wi|, this implies that φ̂ behaves as
\[ \hat{\phi} \sim \frac{R_i}{s - s_i}, \qquad s_i = \frac{1}{2\pi} \log w_i, \qquad i = 1, 2, \]
where Ri is a rank-one matrix whose only non-zero eigenvalue is μi e^{vi+ibi}, and which satisfies [Ri, Ri†] = 0. As for the connection, similar arguments show that it behaves as
\[ \hat{A}_s \sim \frac{Q_i}{s - s_i}, \]
where Qi is a rank-one matrix satisfying [Q†i , Ri ] = 0. The only non-zero eigenvalue of Qi can be made purely imaginary by a gauge transformation and will be denoted i α˜ i /(4π ). Stokes’ theorem implies the relation (57), as before. The estimate of the rate of growth of the holomorphic sections of Eˆ near the points s = si suggests that in fact α˜ i = αi , i = 1, 2. In the case n− = k = n+ the Nahm transform for periodic monopoles resembles very much the Nahm transform for doubly-periodic instantons studied in [18]. Recall that doubly-periodic instantons are solutions of the U (2) self-duality equation on R2 × T 2 with finite action and vanishing first Chern class. There the Nahm transform is described by solutions of the Hitchin equations on a torus with two punctures [20, 19, 18]. The behavior of Aˆ and φˆ at the punctures is the same as above. The rank of the Hitchin bundle is given by the second Chern class of the instanton bundle on R2 × T 2 . The Nahm data for periodic monopoles can be regarded as a limiting case of the Nahm data for doubly-periodic instantons. The cylinder R × Sˆ 1 can be regarded as a degeneration of the torus of [18]. For example, if the torus is realized as a quotient of C by the lattice generated by 1 and τ , one can consider the limit Imτ → +∞. The positions of the punctures should be held fixed in this limit. The nonabelian monopole charge k corresponds to the instanton number. One can see the reason for this by looking at the monopole side of the story. The case k = n+ = n− is special in that the eigenvalues of the monopole Higgs field φ approach constants at infinity. Now recall that the Bogomolny equation is a reduction of the self-duality equation to three dimensions. Thus a periodic monopole can be regarded as an instanton on Y = R2 × Sχ1 × Sθ1 invariant with respect to the translations of the circle Sθ1 . 
The relation between the self-dual connection $\tilde{A}$ on $Y$ and the monopole fields on $X = \mathbb{R}^2 \times S^1_\chi$ is given by $\tilde{A} = \psi^*(A) + \psi^*(\phi)\,d\theta$, where $\psi : Y \to X$ is the natural projection. The connection $\tilde{A}$ is self-dual for any $k$ and $n_\pm$, but the case $k = n_+ = n_-$ is special, because only in this case the large-$|z|$ behavior of $\tilde{A}$ is that of a doubly-periodic instanton as defined in [18]. Of course, unlike in [18], our $\tilde{A}$ has singularities for $z = z_i$, $\chi = \chi_i$.

The origin of these singularities can be understood by analyzing how the limiting procedure described above affects the instantons. A doubly-periodic $U(2)$ instanton with charge 1 can be regarded as made of two "monopole" constituents. Each constituent has a fixed size, so its moduli describe its position on $\mathbb{R}^2 \times T^2$. Thus a charge 1 instanton has 8-dimensional moduli space. (This interpretation of an instanton as a combination of two monopoles also arises for calorons, i.e. instantons on $\mathbb{R}^3 \times S^1$, see [22] for details.) The sizes of the two constituents are determined by the asymptotic behavior of the components of $A$ along the $T^2$ and need not be the same. In particular, one can take a limit in which the size of one of the constituents goes to zero, while the size of the other one stays finite. A point-like constituent monopole is nothing but a Dirac-type singularity on $\mathbb{R}^2 \times T^2$ of the kind considered in this paper. Thus for $k = n_- = n_+$ periodic monopoles with singularities can be obtained as a limit of doubly-periodic $U(2)$ instantons with instanton charge $k$.

The degeneration of a doubly-periodic instanton into a periodic monopole with singularities can be easily seen by slightly modifying the brane configuration discussed in Sect. 3. A doubly-periodic $U(2)$ instanton of charge $k$ is described by a brane configuration with $k$ D3-branes and 2 NS5-branes, but with the $x^6$ direction compactified on a
circle. As described in Sect. 3, a D3-brane can end on the NS5-brane, and therefore each D3-brane breaks into two segments suspended between the NS5-branes and capable of moving independently. These suspended segments represent the constituent monopoles. The size of a constituent is inversely proportional to the length of the segment. In order to obtain a charge $k$ periodic monopole with $2k$ singularities one has to take the limit in which the NS5-branes are very close together, so that half of the segments are much longer than the other half.

6. Inverse Nahm Transform

In this section we construct and study the inverse Nahm transform which associates a periodic monopole with singularities to a solution of the Hitchin equations on $(\mathbb{R} \times \hat{S}^1) \setminus K$ with the spectral curve and asymptotic behavior described in Sect. 5.

6.1. Construction of the monopole fields. Let $\hat{A}, \hat{\phi}$ be a solution of the Hitchin equations with $\mathrm{rank}(E) = K$. We will use these data to define a family of Dirac-type operators parametrized by a point $(z, \chi) \in \mathbb{C} \times S^1$. Let $\hat{a} = -\chi\,dt$ be a unitary connection on a trivial line bundle on $\mathbb{R} \times \hat{S}^1$. We define a Dirac-type operator from $\hat{E} \oplus \hat{E}$ to $\hat{E} \oplus \hat{E}$ by
$$\hat{D} = \begin{pmatrix} -\hat{\phi} + z & 2\partial_{\hat{A} + \hat{a}} \\ 2\bar{\partial}_{\hat{A} + \hat{a}} & -\hat{\phi}^\dagger + \bar{z} \end{pmatrix}. \tag{65}$$
Now let us assume that $\hat{A}$ and $\hat{\phi}$ satisfy the kind of boundary conditions described in Sect. 5. Standard arguments show that $\hat{D}$ can fail to be Fredholm only if $z$ is equal to one of the asymptotic eigenvalues of $\hat{\phi}$ and $e^{i\chi}$ is equal to the corresponding eigenvalue of the limiting holonomy of $\hat{A}$. (We recall that the boundary conditions of Sect. 5 imply that the holonomy of $\hat{A}$ in the limit $|r| \to \infty$ preserves all the eigenvectors of $\hat{\phi}$ corresponding to finite limiting eigenvalues and permutes the rest of the eigenvectors.) Let us denote the set of such points of $\mathbb{C} \times S^1$ by $M$. Obviously, $M$ is a finite set whose cardinality does not exceed $2K$.
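The Weitzenbock formula [5] implies that the $L^2$ kernel of $\hat{D}$ is trivial, therefore $\mathrm{Ker}\,\hat{D}^\dagger$ is a vector bundle on $\mathbb{R}^2 \times S^1$ of rank $-\mathrm{Ind}\,\hat{D}$. It is a subbundle of a trivial infinite-dimensional bundle on $\mathbb{R}^2 \times S^1$ whose fiber consists of all $L^2$ sections of $\hat{E} \oplus \hat{E}$. The latter bundle has a natural Hermitian inner product. Let $\hat{P}$ be the corresponding projector to $\mathrm{Ker}\,\hat{D}^\dagger$. We define a connection $A$ on $E = \mathrm{Ker}\,\hat{D}^\dagger$ by
$$d_A = \hat{P} \left( dz\,\frac{\partial}{\partial z} + d\bar{z}\,\frac{\partial}{\partial \bar{z}} + d\chi\,\frac{\partial}{\partial \chi} \right) \hat{P}. \tag{66}$$
We define a Higgs field $\phi \in \Gamma(\mathrm{End}(E))$ by
$$\phi = \hat{P}\, r\, \hat{P}. \tag{67}$$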
Obviously, $\phi^\dagger = \phi$. These formulas are well-defined because the elements of the $L^2$ kernel of $\hat{D}^\dagger$ decay at least exponentially for $(z, \chi) \notin M$, as can be easily verified. We claim that $A, \phi$ satisfy the Bogomolny equation. The computation demonstrating this is exactly the same as in [5]. The proof that the composition of the direct and inverse Nahm transform is the identity is also the same as in [5].
We now want to show that the solution of the Bogomolny equation obtained in this way is a periodic $U(2)$ monopole with singularities in the sense of Sect. 2. This will imply that there is a one-to-one correspondence between periodic $U(2)$ monopoles with singularities and the Hitchin data of the kind described in Sect. 5.

As a first step, we need to show that $-\mathrm{Ind}\,\hat{D} = 2$, and therefore the monopole bundle obtained by the inverse Nahm transform has rank two. The argument for this is exactly the same as in [5]. One proves that $\dim \mathrm{Ker}\,\hat{D}^\dagger$ for $z = z_0$, $\chi = \chi_0$ is equal to the number of points at which the Hitchin spectral curve intersects the line $z = z_0$ (provided $(z_0, \chi_0) \notin M$). On the other hand, one of our assumptions about the Hitchin system was that the Hitchin spectral curve has the form
$$A(z)w^2 - B(z)w + C(z) = 0,$$
where $w \neq 0$, and the roots of $A(z)$ and $C(z)$ are the asymptotic eigenvalues of $\hat{\phi}$. It follows that $\dim \mathrm{Ker}\,\hat{D}^\dagger = 2$ for all $(z, \chi) \notin M$.

6.2. Asymptotic behavior of the monopole fields. It remains to show that the monopole has the right behavior near the points of $M$, as well as for $|z| \to \infty$. Suppose that the boundary conditions for the Hitchin data are such that $n_-$ of the eigenvalues of $\hat{\phi}$ approach constants as $r \to +\infty$ and $n_+$ of them approach constants as $r \to -\infty$. Taking into account the $w \to 1/w$ symmetry we may assume that $n_- \le n_+$ without loss of generality. Recall now that we assumed that the spectral curve defined by the equation
$$\det(z - \hat{\phi}(s)) = 0 \tag{68}$$
has the form
$$A(z)w^2 - B(z)w + C(z) = 0, \tag{69}$$
where $w = e^{2\pi s}$, and $A(z)$ is a polynomial of degree $n_-$, $C(z)$ is a polynomial of degree $n_+$. If we denote by $k$ the degree of $B(z)$, then it is easy to see that the rank of the Hitchin system $K$ is $\max(n_+, n_-, k)$. Moreover, analyzing the four possible boundary conditions for $\hat{\phi}$, one can easily see that $2k \ge n_+ + n_-$, and if one normalizes the leading coefficient of $A(z)$ to be one, then the leading coefficient of $C(z)$ is $e^{v_1 + v_2}$, and the leading coefficient of $B(z)$ is
$$\begin{cases} e^{v_1}, & \ell_1 > \ell_2, \\ e^{v_1} + e^{v_2}, & \ell_1 = \ell_2, \end{cases}$$
where $\ell_1 = k - n_-$, $\ell_2 = n_+ - k$. We now use these properties of the spectral curve to find the behavior of $\phi$ and $A$ for large $|z|$ as well as near the points of $M$.

To analyze the behavior for $|z| \to \infty$, note the following formula for the components of curvature of $A$ [5]:
$$F_{\bar{z}\chi} = 2i\,\hat{P} \sigma_- (\hat{D}^\dagger \hat{D})^{-1} \hat{P}, \qquad F_{\bar{z}z} = i\,\hat{P} \sigma_3 (\hat{D}^\dagger \hat{D})^{-1} \hat{P}. \tag{70}$$
It is easy to see that the $L^2$ norm of $(\hat{D}^\dagger \hat{D})^{-1}$ is bounded from above by a multiple of $1/|z|$. Hence the components of curvature decay at least as fast as that, and then the Bogomolny equation implies that the covariant differential of $\phi$ goes to zero for $|z| \to \infty$. It follows that for large $|z|$ the eigenvalues of $\phi$ are independent of $\chi$, and furthermore that the eigenvalues of $V(z)$ factorize into the eigenvalues of the holonomy of $A$ and the eigenvalues of $e^{2\pi\phi(z,\chi)}$. Thus the behavior of the eigenvalues of $\phi$ for large $|z|$ can be read off the asymptotic behavior of the eigenvalues of $V(z)$, which are encoded in the spectral curve. One can easily see that this yields (6) with $\ell_1 = k - n_-$ and $\ell_2 = n_+ - k$.

The behavior of $A$ for large $|z|$ can be inferred from Theorem 10.5 of [17] about the behavior of solutions of the Bogomolny equation. (There the theorem is stated for the Bogomolny equation on $\mathbb{R}^3$, but the proof goes through for $\mathbb{R}^2 \times S^1$ as well.) This theorem asserts that if the difference between the eigenvalues of the Higgs field is bounded from below for $|z| \to \infty$, then $d_A\phi$ is proportional to $\phi$ with exponential accuracy. Thus, up to corrections of order $\exp(-\delta|z|)$, $\delta > 0$, the connection $A$ preserves the splitting of $E$ into the eigenbundles of $\phi$. Hence if we use the eigenvectors of $\phi$ as the orthonormal frame of $E$, the $U(2)$ Bogomolny equation splits into a pair of decoupled $U(1)$ Bogomolny equations. Solving the rank-one Bogomolny equation for $A$ we find that the asymptotic behavior of $A$ is given by (9) with some $\alpha_1, \alpha_2$. One can also check that (7) is satisfied.

We now turn to the behavior of $\phi$ and $A$ near the points of $M$. Recall that their $z$-coordinates are given by the roots of $A(z)$ and $C(z)$, while their $\chi$-coordinates are given by the corresponding limiting eigenvalues of the holonomy of $\hat{A}$. From the spectral curve equation we see that near such a point $z = z_i$ one of the eigenvalues of $V(z)$ either diverges as $(z - z_i)^{-1}$, or goes to zero as $z - z_i$.
Obviously, this can happen only if the Higgs field $\phi$ becomes singular at $z = z_i$, $\chi = \chi_i$. To determine the nature of the singularity, we first estimate how fast $F_A$ and $\phi$ can grow near the singular point. It is easy to show that the norm of $(\hat{D}^\dagger \hat{D})^{-1}$ is bounded by a multiple of $1/r_i^2$, where $r_i$ is the distance to the singular point. By (70), the same is true about the norm of $F_A$. Further, one can show that for sufficiently small $r_i$ all the elements of $\mathrm{Ker}\,\hat{D}^\dagger$ are bounded by a multiple of $\exp(-|r| r_i)$. (This is due to the exponential decay of the Green's function $(\hat{D}^\dagger \hat{D})^{-1}$.) Then from the definition of $\phi$ one can easily see that
$$\|\phi\| \le \frac{\mathrm{const}}{r_i}.$$
In the appendix we prove that any solution of the $U(2)$ Bogomolny equation with such a singularity has the following form:
$$\phi(x) \sim h(x) \begin{pmatrix} m_1 \phi_0(r_i) & 0 \\ 0 & m_2 \phi_0(r_i) \end{pmatrix} h(x)^{-1} + O(1), \tag{71}$$
$$d_A\phi(x) \sim h(x) \begin{pmatrix} m_1\, d\phi_0(r_i) & 0 \\ 0 & m_2\, d\phi_0(r_i) \end{pmatrix} h(x)^{-1} + O(1), \tag{72}$$
$$A(x) \sim h(x) \begin{pmatrix} m_1 A_0(x - x_i) & 0 \\ 0 & m_2 A_0(x - x_i) \end{pmatrix} h(x)^{-1} + i h(x)\, d h(x)^{-1} + O(1). \tag{73}$$
Here $h(x)$ is a $U(2)$-valued function defined in the neighborhood of the singular point, $m_1, m_2$ are integers, and $\phi_0(r)$ and $A_0(x)$ have been defined in Sect. 2. The idea of the proof is to lift the monopole to an instanton on the Taub-NUT space with a point deleted,
use the Uhlenbeck compactification theorem to show that the instanton can be continued to the deleted point, and then project the instanton back to three dimensions. Using these formulas for $A$ and $\phi$ it is straightforward to compute $V(z)$ near a singular point and compare with what the spectral curve predicts. The results match if and only if one of the $m_i$ is zero, and the other one is $1$ or $-1$, depending on whether $z_i$ is a root of $A(z)$ or $C(z)$. This completes the demonstration that the inverse Nahm transform produces a periodic monopole with singularities out of any solution of the Hitchin equations with the boundary conditions as in Sect. 5.

7. Nahm Transform for Periodic U(m) Monopoles

Periodic $U(m)$ monopoles with singularities defined in Sect. 2.3 can be analyzed in the same manner as $U(2)$ monopoles. Let us describe the result of the Nahm transform applied to a periodic $U(m)$ monopole with $n_+$ (resp. $n_-$) singularities of positive (resp. negative) Chern class and nonabelian charges $(k_1, k_2, \dots, k_{m-1})$. The Nahm transform yields a solution of the Hitchin equations on a cylinder with several points deleted. The rank of the Hitchin bundle is equal to $\max(n_+, n_-, k_1, \dots, k_{m-1})$. The deleted points are determined as follows. Recall that the integers $\ell_1, \dots, \ell_m$ which determine the behavior of the eigenvalues of $\phi$ at infinity are given by
$$\ell_1 = k_1 - n_+, \quad \ell_2 = k_2 - k_1, \quad \dots, \quad \ell_{m-1} = k_{m-1} - k_{m-2}, \quad \ell_m = n_- - k_{m-1}. \tag{74}$$
These numbers are ordered: $\ell_1 \ge \ell_2 \ge \dots \ge \ell_m$. If $\ell_i = 0$, then the $i$th eigenvalue of $\phi$ approaches a finite value $v_i$ for $|z| \to \infty$, and the corresponding eigenvalue of the holonomy of $A$ along $S^1$ approaches a constant value $e^{ib_i}$. Set $\mathfrak{v}_i = v_i + i b_i$. The deleted points are in one-to-one correspondence with $i$ such that $\ell_i = 0$, and they are located at $w = w_i = e^{\mathfrak{v}_i}$. The Higgs field $\hat{\phi}$ has a simple pole at each of the deleted points:
$$\hat{\phi} \sim \frac{R_i}{s - s_i}.$$
Here $s_i = \mathfrak{v}_i/(2\pi)$ and $R_i$ is a rank-one matrix satisfying $[R_i, R_i^\dagger] = 0$ whose only non-zero eigenvalue is equal to $\mu_i w_i$ (see (22) for the definition of $\mu_i$). The connection $\hat{A}$ also has a simple pole at the deleted point:
$$\hat{A} \sim \frac{Q_i}{s - s_i},$$
where $Q_i$ is a rank-one matrix satisfying $[Q_i^\dagger, R_i] = 0$.

Before we describe the behavior of the Hitchin data for $r \to \pm\infty$, let us note two alternative definitions of the rank $K$ in terms of $\ell_i$. Let $j_+$ be the number of strictly positive $\ell_i$. It is easy to see that
$$K = n_- + \sum_{i=1}^{j_+} \ell_i. \tag{75}$$
Similarly, if $j_-$ is the number of strictly negative $\ell_i$, then we have an identity
$$K = n_+ - \sum_{i=m-j_-+1}^{m} \ell_i. \tag{76}$$
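The bookkeeping behind (75) and (76) can be checked numerically. The following is only a sketch under a stated assumption: it takes $\ell_1 = k_1 - n_-$, $\ell_i = k_i - k_{i-1}$, $\ell_m = n_+ - k_{m-1}$, which is the convention of Sect. 6.2 and the one under which both sums telescope to $\max(n_+, n_-, k_1, \dots, k_{m-1})$. The function names `ell_from_charges` and `hitchin_rank` are hypothetical and do not come from the paper.

```python
def ell_from_charges(n_minus, n_plus, ks):
    """Return [l_1, ..., l_m] for nonabelian charges ks = [k_1, ..., k_{m-1}].

    Assumed convention (see the lead-in): l_1 = k_1 - n_minus,
    l_i = k_i - k_{i-1}, l_m = n_plus - k_{m-1}.
    """
    chain = [n_minus] + list(ks) + [n_plus]
    return [b - a for a, b in zip(chain, chain[1:])]


def hitchin_rank(n_minus, n_plus, ks):
    """Compute K three ways, as in (75), (76) and max(...), and compare them."""
    ells = ell_from_charges(n_minus, n_plus, ks)
    # The text requires l_1 >= l_2 >= ... >= l_m.
    if any(a < b for a, b in zip(ells, ells[1:])):
        raise ValueError("the l_i must be ordered: l_1 >= ... >= l_m")
    # Because the l_i are ordered, the strictly positive ones are exactly
    # the first j_+ entries and the strictly negative ones the last j_-.
    rank_75 = n_minus + sum(l for l in ells if l > 0)
    rank_76 = n_plus - sum(l for l in ells if l < 0)
    rank_max = max([n_minus, n_plus] + list(ks))
    assert rank_75 == rank_76 == rank_max
    return rank_max


if __name__ == "__main__":
    # U(4) example with n_- = 1, n_+ = 2 and charges (4, 5, 4).
    print(ell_from_charges(1, 2, [4, 5, 4]))
    print(hitchin_rank(1, 2, [4, 5, 4]))
```

For $m = 2$ the same functions reproduce the $U(2)$ statement $K = \max(n_+, n_-, k)$ with $\ell_1 = k - n_-$ and $\ell_2 = n_+ - k$.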
The curvature of $\hat{A}$ goes to zero as $\mathrm{const}/|r|^{3/2}$ for $|r| \to \infty$, therefore $\hat{A}$ has well-defined limiting holonomies for $r \to \pm\infty$.

For $r \to +\infty$, $n_-$ of the eigenvalues of $\hat{\phi}$ approach constant values $z_1, z_2, \dots, z_{n_-}$, where $z_i$ is the $z$-coordinate of the $i$th singularity with $e_i = -1$. The corresponding eigenvectors are also the eigenvectors of the limiting holonomy of $\hat{A}$ with eigenvalues $e^{i\chi_1}, e^{i\chi_2}, \dots, e^{i\chi_{n_-}}$, where $\chi_i$ is the $\chi$-coordinate of the $i$th singularity with $e_i = -1$. For $r \to +\infty$, $\ell_1$ of the eigenvalues of $\hat{\phi}$ asymptote to
$$\omega_{\ell_1}^j \exp\left(\frac{2\pi s - v_1}{\ell_1}\right), \quad j = 1, \dots, \ell_1,$$
where $\omega_p$ denotes $\exp(2\pi i/p)$. The limiting holonomy of $\hat{A}$ preserves the subspace spanned by the corresponding eigenvectors and its restriction to this subspace is equal to $V_{\ell_1} \exp(2\pi i \alpha_1/\ell_1)$, where $V_{\ell_1}$ is the $\ell_1 \times \ell_1$ "shift matrix" defined in (52). Further, for $r \to +\infty$, $\ell_2$ of the eigenvalues of $\hat{\phi}$ approach
$$\omega_{\ell_2}^j \exp\left(\frac{2\pi s - v_2}{\ell_2}\right), \quad j = 1, \dots, \ell_2,$$
and so on, until we reach $\ell_{j_+}$ eigenvalues of $\hat{\phi}$ which approach
$$\omega_{\ell_{j_+}}^j \exp\left(\frac{2\pi s - v_{j_+}}{\ell_{j_+}}\right), \quad j = 1, \dots, \ell_{j_+}.$$
By (75), we have described the behavior of all $K$ eigenvalues of $\hat{\phi}$ and the limiting holonomy of $\hat{A}$. Note that since all the numbers $\ell_1, \dots, \ell_{j_+}$ are positive, the eigenvalues of $\hat{\phi}$ which do not approach finite values grow exponentially as $r \to +\infty$.

For $r \to -\infty$ the situation is similar. $n_+$ of the eigenvalues of $\hat{\phi}$ approach constant values $z_{n_-+1}, z_{n_-+2}, \dots, z_{n_-+n_+}$, where $z_{n_-+i}$ is the $z$-coordinate of the $i$th singularity with $e_i = +1$. The corresponding eigenvectors are also the eigenvectors of the limiting holonomy of $\hat{A}$ with eigenvalues $e^{i\chi_{n_-+1}}, e^{i\chi_{n_-+2}}, \dots, e^{i\chi_{n_-+n_+}}$, where $\chi_{n_-+i}$ is the $\chi$-coordinate of the $i$th singularity with $e_i = +1$. $|\ell_m|$ of the eigenvalues of $\hat{\phi}$ asymptote to
$$\omega_{\ell_m}^j \exp\left(\frac{2\pi s - v_m}{\ell_m}\right), \quad j = 1, \dots, |\ell_m|.$$
The limiting holonomy of $\hat{A}$ preserves the subspace spanned by the corresponding eigenvectors and its restriction to this subspace is equal to $V_{\ell_m} \exp(2\pi i \alpha_m/\ell_m)$. $|\ell_{m-1}|$ eigenvalues of $\hat{\phi}$ asymptote to
$$\omega_{\ell_{m-1}}^j \exp\left(\frac{2\pi s - v_{m-1}}{\ell_{m-1}}\right), \quad j = 1, \dots, |\ell_{m-1}|,$$
and so on, until we reach the $|\ell_{m-j_-+1}|$ eigenvalues of $\hat{\phi}$ which asymptote to
$$\omega_{\ell_{m-j_-+1}}^j \exp\left(\frac{2\pi s - v_{m-j_-+1}}{\ell_{m-j_-+1}}\right), \quad j = 1, \dots, |\ell_{m-j_-+1}|.$$
By (76), we have described the behavior of all $K$ eigenvalues of $\hat{\phi}$ and the limiting holonomy of $\hat{A}$. Note that since all the numbers $\ell_m, \dots, \ell_{m-j_-+1}$ are negative, the eigenvalues of $\hat{\phi}$ which do not approach finite values grow exponentially as $r \to -\infty$.

The spectral curve corresponding to such a periodic monopole can be determined either by computing the characteristic polynomial of $\hat{\phi}(w)$, or directly from the definition of the monopole. The latter way is simpler and yields the following equation in $\mathbb{C} \times \mathbb{C}^*$:
$$A(z)w^m + B_1(z)w^{m-1} + B_2(z)w^{m-2} + \dots + B_{m-1}(z)w + C(z) = 0.$$
Here
$$A(z) = \prod_{i=1}^{n_-} (z - z_i), \qquad C(z) = e^{v_1 + \dots + v_m} \prod_{i=n_-+1}^{n} (z - z_i),$$
and $B_i$ is a polynomial of degree $k_i$ whose leading and next-to-leading coefficients are determined by the asymptotics of the monopole fields. Note that this is precisely the Seiberg-Witten curve for the $N = 2$, $d = 4$ gauge theory corresponding to our periodic $U(m)$ monopole [31].

8. Concluding Remarks

In this paper we have studied periodic monopoles with singularities. From the physical point of view, they are of interest for several reasons. First, as we have shown above, their moduli spaces can be used to "solve" $N = 2$, $d = 4$ gauge theories compactified on a circle. Although computing the metric on the moduli space of periodic monopoles is hard, it is still an infinitely easier problem than summing up an infinite number of instantons in a quantum field theory, including monopole loops wrapping the compactified direction. At the moment it is not clear if any of these metrics can be computed in a closed form, but it seems reasonably straightforward to compute the asymptotic expansion far along the Coulomb branch. This problem will be addressed in [9].
Second, we have seen that the Nahm transform for periodic monopoles with singularities is described by the Hitchin equations on a cylinder, and that the Hitchin spectral curve is the Seiberg-Witten curve of the corresponding gauge theory. Recall now that the space of solutions of the Hitchin equations is an algebraically completely integrable system, $\hat{\phi}$ being the Lax operator, and $s$ being the spectral parameter [16, 24]. Thus our approach allows one to associate an integrable system to any $N = 2$, $d = 4$ gauge theory which admits a brane realization, so that the Seiberg-Witten curve is given by the characteristic polynomial of the Lax operator. The relation between $N = 2$, $d = 4$ gauge theories and completely integrable systems was noted previously [12, 25, 10], but its origin remained somewhat mysterious. For finite $N = 2$ gauge theories this question was clarified in [20, 19] (see also [13]). In our work on nonsingular periodic monopoles [5], we described the Hitchin system corresponding to an $N = 2$ super-Yang-Mills theory without matter, and in this paper we generalized this to a much larger class of theories with product gauge groups and hypermultiplets.

Third, in Witten's approach to $N = 2$ gauge theories [31], the Seiberg-Witten curve describes the world-volume of the M-theory fivebrane. Then BPS monopoles wrapping the compactified direction are represented by Euclidean M2-branes whose boundaries lie on the M5-brane. Computing the contribution of such configurations to the metric on the moduli space seems rather difficult. On the other hand, our methods in principle allow one to compute the contribution of an arbitrary number of wrapped BPS monopoles. Conceivably, this could shed some light on the dynamics of open M2-branes.

From the mathematical point of view, periodic monopoles with singularities are of interest because their moduli spaces provide new examples of hyperkähler manifolds without any continuous isometries.
The interpretation of the moduli spaces in terms of $N = 2$ gauge theories makes it clear that they are complete (because the Higgs branch is absent) and locally flat at infinity (because the gauge theories are asymptotically free or finite). In the case when the rank of the gauge group is one, the moduli space is an elliptic fibration over $\mathbb{C}$. No examples of complete hyperkähler metrics on elliptic fibrations which are locally flat at infinity were known prior to this work (an incomplete example is provided by the so-called Ooguri-Vafa metric [26]), and it is an interesting question which fibrations admit such metrics. We will suggest a possible answer to this question in [9].

Acknowledgements. It is our pleasure to thank Nigel Hitchin, Marcos Jardim, and Tony Pantev for discussions. S.Ch. is grateful to the Institute for Advanced Study, Princeton, for hospitality during the final stage of this work. S.Ch. was supported in part by NSF grant PHY9819686. A.K. was supported in part by DOE grant DE-FG02-90ER40542.
A. Singularities of the Monopole Fields

Let $U \subset \mathbb{R}^3$ be a punctured neighborhood of the origin, and let $E$ be a rank two unitary vector bundle on $U$. Let $A$ be a unitary connection on $E$ and $\phi$ be a Hermitian section of $\mathrm{End}(E)$ such that the Bogomolny equation $F_A = *d_A\phi$ is satisfied and
$$\|\phi\| \le \frac{C_1}{|x|}, \tag{77}$$
$$\|F_A\| \le \frac{C_2}{|x|^2}, \tag{78}$$
where $C_1, C_2 > 0$. We are going to show that there exist integers $m_1, m_2$ and a $U(2)$-valued function $h(x)$ defined on $U$ such that
$$\phi(x) \sim h(x) \begin{pmatrix} m_1 \phi_0(r) & 0 \\ 0 & m_2 \phi_0(r) \end{pmatrix} h(x)^{-1} + O(1), \tag{79}$$
$$A(x) \sim h(x) \begin{pmatrix} m_1 A_0(x) & 0 \\ 0 & m_2 A_0(x) \end{pmatrix} h(x)^{-1} + i h(x)\, d h(x)^{-1} + O(1). \tag{80}$$
Here $r = |x|$, $\phi_0(r) = -1/(2r)$, and $A_0(x)$ is defined in Sect. 2.

Consider a four-dimensional noncompact manifold $X$ with coordinates $(x, \theta)$, $x \in U$, $\theta \in \mathbb{R}/(4\pi\mathbb{Z})$ and the metric
$$ds^2 = V(x)\, dx^i dx^j \delta_{ij} + V(x)^{-1} \left(d\theta + \omega_i(x)\, dx^i\right)^2, \tag{81}$$
where
$$V = 1 + \frac{1}{|x|},$$
and $\omega = \omega_i(x)\, dx^i$ is a unitary connection on a line bundle on $U$ satisfying the $U(1)$ Bogomolny equation: $d\omega = *dV$. It is easy to see that $\omega(x) = -A_0(x)$, and therefore the degree of the line bundle is $-1$. The metric (81) is of the Taub-NUT (or Gibbons-Hawking) type, and it is well known that it is hyperkähler and admits a tri-holomorphic $U(1)$ action generated by the vector field $\partial/\partial\theta$. The manifold $X$ endowed with this metric admits a nonsingular partial completion obtained by adding a single point $p$ over $x = 0$. This point is invariant with respect to the $U(1)$ action mentioned above. We denote this partial completion by $\bar{X}$.

Let $\psi$ be the projection $X \to U$. Let $\tilde{A}$ be a connection on $\psi^*(E)$ defined by
$$\tilde{A} = \psi^*(A) + \psi^*\left(V^{-1}\phi\right)(d\theta + \omega). \tag{82}$$
It is easy to check that it is a self-dual $U(1)$-invariant connection on $\psi^*(E)$. The self-duality holds because $A$ and $\phi$ satisfy the Bogomolny equation. (This observation is due to P. Kronheimer [23].) We claim that both $\psi^*(E)$ and $\tilde{A}$ can be continued in a unique manner to the point $p$ while preserving $U(1)$-invariance and self-duality. Indeed, the action density of $\tilde{A}$ is given by
$$\mathrm{Tr}\, F_{\tilde{A}} \wedge *F_{\tilde{A}} = 2\psi^*\left(V^{-2}\, \mathrm{Tr}\, F_A \wedge *F_A + dV^{-1} \wedge *dV^{-1}\, \mathrm{Tr}\, \phi^2\right) d\theta - d\left(\psi^*\left(*dV^{-1}\, \mathrm{Tr}\, \phi^2\right) d\theta\right). \tag{83}$$
Using (77), one can easily see that the integral of the action density over $X$ converges. By Uhlenbeck's compactification theorem [30], there is a unique smooth continuation of $\psi^*(E)$ and $\tilde{A}$ to $p$.

Obviously, the resulting connection is invariant with respect to $\partial/\partial\theta$ and self-dual. It is easy to see that in a $\theta$-invariant gauge any such connection has the following form in the neighborhood of $p$:
$$\tilde{A} = h(x) \begin{pmatrix} \left(\frac{m_1}{2} + O(x)\right) d\theta & 0 \\ 0 & \left(\frac{m_2}{2} + O(x)\right) d\theta \end{pmatrix} h(x)^{-1} + i h(x)\, d h(x)^{-1} + a_i(x)\, dx^i,$$
where $m_1, m_2$ are integers, $a_i(x)$ are smooth $u(2)$-valued functions, and $h(x)$ is a smooth $U(2)$-valued function in the punctured neighborhood of $x = 0$. The geometric meaning of $m_1$ and $m_2$ is the following: they are the weights of the $U(1)$ action on the fiber of $\psi^*(E)$ at $p$. Solving (82) for $A$ and $\phi$ and recalling that $\omega(x) = -A_0(x)$, we obtain (79)-(80).

References

1. Atiyah, M., Hitchin, N.: The Geometry and Dynamics of Magnetic Monopoles. Princeton, USA: University Press, 1988
2. Banks, T., Douglas, M.R., Seiberg, N.: Probing F theory with branes. Phys. Lett. B387, 278 (1996) [hep-th/9605199]
3. Biquard, O., Jardim, M.: Asymptotic behaviour and the moduli space of doubly-periodic instantons. Preprint math.dg/0005154
4. Chalmers, G., Hanany, A.: Three dimensional gauge theories and monopoles. Nucl. Phys. B489, 223 (1997) [hep-th/9608105]
5. Cherkis, S.A., Kapustin, A.: Nahm transform for periodic monopoles and N = 2 super Yang-Mills theory. Commun. Math. Phys. 218, 333 (2001) [hep-th/0006050]
6. Cherkis, S.A., Kapustin, A.: Singular monopoles and gravitational instantons. Commun. Math. Phys. 203, 713 (1999) [hep-th/9803160]
7. Cherkis, S.A., Kapustin, A.: D(k) gravitational instantons and Nahm equations. Adv. Theor. Math. Phys. 2, 1287 (1999) [hep-th/9803112]
8. Cherkis, S.A., Kapustin, A.: Singular monopoles and supersymmetric gauge theories in three dimensions. Nucl. Phys. B525, 215 (1998) [hep-th/9711145]
9. Cherkis, S.A., Kapustin, A.: Hyperkähler metrics from periodic monopoles. Phys. Rev. D 65, 084015 (2002) [hep-th/0109141]
10. Donagi, R., Witten, E.: Supersymmetric Yang-Mills theory and integrable systems. Nucl. Phys. B460, 299 (1996) [hep-th/9510101]
11. Ganor, O., Sethi, S.: New perspectives on Yang-Mills theories with sixteen supersymmetries. JHEP 9801, 007 (1998) [hep-th/9712071]
12. Gorsky, A. et al.: Integrability and Seiberg-Witten exact solution. Phys. Lett. B355, 466 (1995) [hep-th/9505035]
13. Gukov, S.: Seiberg-Witten solution from matrix theory. Preprint hep-th/9709138
14. Hanany, A., Witten, E.: Type IIB superstrings, BPS monopoles, and three-dimensional gauge dynamics. Nucl. Phys. B492, 152 (1997) [hep-th/9611230]
15. Hitchin, N.J.: Twistor construction of Einstein metrics. In: Global Riemannian Geometry (Durham, 1983), T.J. Willmore, N.J. Hitchin (eds), Chichester: Horwood, 1984, p. 115
16. Hitchin, N.: Stable bundles and integrable systems. Duke Math. J. 54, 91 (1987)
17. Jaffe, A., Taubes, C.: Vortices and Monopoles. Structure of Static Gauge Theories. Boston: Birkhäuser, 1980
18. Jardim, M.: Construction of doubly-periodic instantons. Commun. Math. Phys. 216, 1 (2001) [math.dg/9909069]; Nahm transform of doubly-periodic instantons. Preprint math.dg/9910120; Spectral curves and the Nahm transform of doubly-periodic instantons. Preprint math.ag/9909146
19. Kapustin, A.: Solution of N = 2 gauge theories via compactification to three dimensions. Nucl. Phys. B534, 531 (1998) [hep-th/9804069]
20. Kapustin, A., Sethi, S.: The Higgs branch of impurity theories. Adv. Theor. Math. Phys. 2, 571 (1998) [hep-th/9804027]
21. Klemm, A. et al.: Self-dual strings and N = 2 supersymmetric field theory. Nucl. Phys. B477, 746 (1996) [hep-th/9604034]
22. Kraan, T.C., van Baal, P.: Monopole constituents inside SU(N) calorons. Phys. Lett. B435, 389 (1998) [hep-th/9806034]
23. Kronheimer, P.B.: Monopoles and Taub-NUT metrics. M.Sc. Thesis, Oxford, 1985
24. Markman, E.: Spectral curves and integrable systems. Comp. Math. 93, 255 (1994)
25. Martinec, E., Warner, N.: Integrable systems and supersymmetric gauge theory. Nucl. Phys. B459, 97 (1996) [hep-th/9509161]
26. Ooguri, H., Vafa, C.: Summing up D-instantons. Phys. Rev. Lett. 77, 3296 (1996) [hep-th/9608079]
27. Seiberg, N., Witten, E.: Electric-magnetic duality, monopole condensation, and confinement in N = 2 supersymmetric Yang-Mills theory. Nucl. Phys. B426, 19 (1994), Erratum ibid. B430, 485 (1994) [hep-th/9407087]; Monopoles, duality, and chiral symmetry breaking in N = 2 supersymmetric QCD. Nucl. Phys. B431, 484 (1994) [hep-th/9408099]
28. Seiberg, N., Witten, E.: Gauge dynamics and compactification to three dimensions. In: The Mathematical Beauty of Physics (Saclay, 1996). River Edge, NJ: World Sci. Publishing, 1997, p. 333 [hep-th/9607163]
29. Simpson, C.T.: Harmonic bundles on noncompact curves. J. Am. Math. Soc. 3, 713 (1990)
30. Uhlenbeck, K.K.: Removable singularities in Yang-Mills fields. Commun. Math. Phys. 83, 11 (1982)
31. Witten, E.: Solutions of four-dimensional field theories via M-theory. Nucl. Phys. B500, 3 (1997) [hep-th/9703166]

Communicated by A. Connes
Commun. Math. Phys. 234, 37–75 (2003) Digital Object Identifier (DOI) 10.1007/s00220-002-0709-0
Communications in
Mathematical Physics
Heteroclinic Connections Between Periodic Orbits in Planar Restricted Circular Three-Body Problem – A Computer Assisted Proof
Daniel Wilczak1, Piotr Zgliczynski2,∗
1 GSB – NLU, Faculty of Computer Science, Department of Computational Mathematics, ul. Zielona 27, 33-300 Nowy Sącz, Poland. E-mail: [email protected]
2 Jagiellonian University, Institute of Mathematics, Reymonta 4, 30-059 Kraków, Poland. E-mail: [email protected]
Received: 15 April 2002 / Accepted: 16 May 2002 Published online: 20 January 2003 – © Springer-Verlag 2003
Abstract: The restricted circular three-body problem is considered for the following parameter values C = 3.03, µ = 0.0009537 – the values for the Oterma comet in the Sun-Jupiter system. We present a computer assisted proof of an existence of a homo- and heteroclinic cycle between two Lyapunov orbits and an existence of symbolic dynamics on four symbols built on this cycle.
1. Introduction and Statement of Results

In paper [11] methods of dynamical system theory were used (see also [13]) to explain rapid transitions from heliocentric orbits outside the orbit of Jupiter to heliocentric orbits inside the orbit of Jupiter and vice versa for Jupiter comets Oterma and Gehrels 3. To model this problem the authors in [11] used the planar circular restricted three-body problem and established that for a parameter corresponding to the Sun-Jupiter-Oterma system rapid transitions of Oterma are explained by transversal intersections of stable and unstable manifolds of two periodic orbits around libration points $L_1$ and $L_2$. In fact the existence of symbolic dynamics on three symbols was claimed.

The goal of this paper is to develop and test tools which allow one to prove, with computer assistance, the results claimed in [11].

Before we state our main results we give a short description of the planar restricted circular three-body problem. We follow the paper [11] and use the notation introduced there. Let $S$ and $J$ be two bodies called Sun and Jupiter, of masses $m_s = 1 - \mu$ and $m_j = \mu$, $\mu \in (0, 1)$, respectively. They rotate in the plane in circles counterclockwise about their common center and with angular velocity normalized as one. Choose a rotating coordinate system (synodical coordinates) so that the origin is at the center of mass and the Sun and Jupiter are fixed on the $x$-axis at $(-\mu, 0)$ and $(1 - \mu, 0)$ respectively.

∗ Research supported in part by Polish KBN grant 2 P03A 011 18, NSF grant DGE-98-04459 and PRODYN.
In this coordinate frame the equations of motion of a massless particle called the comet or the spacecraft under the gravitational action of Sun and Jupiter are (see [11] and references given there)
$$\ddot{x} - 2\dot{y} = \Omega_x(x, y), \qquad \ddot{y} + 2\dot{x} = \Omega_y(x, y), \tag{1.1}$$
where
$$\Omega(x, y) = \frac{x^2 + y^2}{2} + \frac{1 - \mu}{r_1} + \frac{\mu}{r_2} + \frac{\mu(1 - \mu)}{2},$$
$$r_1 = \sqrt{(x + \mu)^2 + y^2}, \qquad r_2 = \sqrt{(x - 1 + \mu)^2 + y^2}.$$
Equations (1.1) are called the equations of the planar circular restricted three-body problem (PCR3BP). They have a first integral called the Jacobi integral, which is given by
$$C(x, y, \dot{x}, \dot{y}) = -(\dot{x}^2 + \dot{y}^2) + 2\Omega(x, y). \tag{1.2}$$
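As an informal illustration of (1.1) and (1.2), the sketch below writes the PCR3BP as a first-order system and checks that the Jacobi integral stays nearly constant along a numerically integrated trajectory. It uses ordinary floating-point arithmetic and a classical fourth-order Runge-Kutta step, so it is only a sanity check and not the rigorous computation the paper relies on. The function names, the initial condition, and the step size are assumptions chosen for the example.

```python
import math

MU = 0.0009537  # Sun-Jupiter mass parameter quoted in the text


def omega(x, y, mu=MU):
    """Effective potential Omega(x, y) from the definition below (1.1)."""
    r1 = math.hypot(x + mu, y)
    r2 = math.hypot(x - 1.0 + mu, y)
    return 0.5 * (x * x + y * y) + (1.0 - mu) / r1 + mu / r2 + 0.5 * mu * (1.0 - mu)


def grad_omega(x, y, mu=MU):
    """Partial derivatives (Omega_x, Omega_y), computed by hand from omega."""
    r1 = math.hypot(x + mu, y)
    r2 = math.hypot(x - 1.0 + mu, y)
    ox = x - (1.0 - mu) * (x + mu) / r1**3 - mu * (x - 1.0 + mu) / r2**3
    oy = y - (1.0 - mu) * y / r1**3 - mu * y / r2**3
    return ox, oy


def vector_field(state, mu=MU):
    """Right-hand side of (1.1) for state = (x, y, vx, vy)."""
    x, y, vx, vy = state
    ox, oy = grad_omega(x, y, mu)
    return (vx, vy, 2.0 * vy + ox, -2.0 * vx + oy)


def jacobi(state, mu=MU):
    """Jacobi integral (1.2)."""
    x, y, vx, vy = state
    return -(vx * vx + vy * vy) + 2.0 * omega(x, y, mu)


def rk4_step(state, dt, mu=MU):
    """One classical Runge-Kutta step (non-rigorous)."""
    def shifted(k, h):
        return tuple(s + h * ki for s, ki in zip(state, k))

    k1 = vector_field(state, mu)
    k2 = vector_field(shifted(k1, 0.5 * dt), mu)
    k3 = vector_field(shifted(k2, 0.5 * dt), mu)
    k4 = vector_field(shifted(k3, dt), mu)
    return tuple(
        s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def integrate(state, dt, steps, mu=MU):
    for _ in range(steps):
        state = rk4_step(state, dt, mu)
    return state


if __name__ == "__main__":
    start = (0.5, 0.0, 0.0, 0.9)  # hypothetical initial condition near the Sun
    end = integrate(start, 1e-3, 1000)
    print("drift in Jacobi integral:", abs(jacobi(end) - jacobi(start)))
```

The drift printed at the end should be close to rounding error for this mild trajectory; a rigorous treatment would replace the floating-point step by a validated integrator.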
We consider PCR3BP on the hypersurface
$$M(\mu, C) = \{(x, y, \dot{x}, \dot{y}) \mid C(x, y, \dot{x}, \dot{y}) = C\}, \tag{1.3}$$
and we restrict our attention to the following parameter values: $C = 3.03$, $\mu = 0.0009537$, the parameter values for the Oterma comet in the Sun-Jupiter system (see [11]).

The projection of $M(\mu, C)$ onto position space is called a Hill's region and gives the region in the $(x, y)$-plane where the comet is free to move. The Hill's region for the parameter considered in this paper is shown in Fig. 1 in white; the forbidden region is shaded. The Hill's region consists of three regions: an interior (Sun) region, an exterior region and a Jupiter region.

In [11] very good numerical evidence was given for the following facts for the Sun-Jupiter-Oterma system:

0. Existence of Lyapunov orbits $L_1^*$ and $L_2^*$ around libration points $L_1$ and $L_2$, respectively. Both orbits are hyperbolic and are located in the Jupiter region.
Fig. 1. Hill's region for PCR3BP with $C = 3.03$, $\mu = 0.0009537$ from [11]
1. There exists a transversal heteroclinic orbit connecting $L_1^*$ and $L_2^*$. There exists a transversal heteroclinic orbit connecting $L_2^*$ and $L_1^*$. Both orbits are in the Jupiter region. These orbits were discovered for the first time in [11].
2. There exists a transversal homoclinic orbit to $L_1^*$ in the interior (Sun) region.
3. There exists a transversal homoclinic orbit to $L_2^*$ in the exterior region.
By transversal hetero- and homoclinic orbit, we mean that the appropriate unstable and stable manifolds intersect transversally. For example in Assertion 1: the stable manifold of $L_1^*$ intersects transversally the unstable manifold of $L_2^*$.
It is now standard in dynamical systems theory (see [11] and references given there) to derive from Assertions 0–3 the existence of symbolic dynamics on four symbols $S$, $L_1^*$, $L_2^*$, $X$ with the following allowed transitions:
$$S \to S, L_1^*, \qquad L_1^* \to L_1^*, S, L_2^*, \qquad L_2^* \to L_1^*, L_2^*, X, \qquad X \to X, L_2^*.$$
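The allowed transitions define a subshift of finite type on four symbols. The following sketch encodes the transition table and tests whether a finite itinerary is admissible; the names `ALLOWED`, `is_admissible` and `count_admissible` are hypothetical, and the strings `"L1"`, `"L2"` stand for $L_1^*$, $L_2^*$.

```python
# Allowed one-step transitions, copied from the list above.
ALLOWED = {
    "S": {"S", "L1"},
    "L1": {"L1", "S", "L2"},
    "L2": {"L1", "L2", "X"},
    "X": {"X", "L2"},
}


def is_admissible(word):
    """True if every consecutive pair of symbols in `word` is an allowed transition."""
    return all(b in ALLOWED[a] for a, b in zip(word, word[1:]))


def count_admissible(length):
    """Number of admissible words of the given length (length >= 1)."""
    # ending[s] = number of admissible words of the current length ending in s
    ending = {s: 1 for s in ALLOWED}
    for _ in range(length - 1):
        nxt = {s: 0 for s in ALLOWED}
        for a, successors in ALLOWED.items():
            for b in successors:
                nxt[b] += ending[a]
        ending = nxt
    return sum(ending.values())
```

For instance, the itinerary `["S", "L1", "L2", "X"]` (a passage from the interior to the exterior region) is admissible, whereas `["S", "X"]` is not, because the table contains no direct transition from $S$ to $X$.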
In [11] (Sect. 1.4) the existence of symbolic dynamics on three symbols only was claimed. Instead of the two symbols $L_1^*$ and $L_2^*$, one symbol $J$ for the Jupiter region was used. From the point of view of the rapid transition of Oterma from the interior region to the exterior region and vice versa, the existence of heteroclinic orbits between $L_1^*$ and $L_2^*$ claimed in Assertion 1 was of special importance, as they indicate the existence of a dynamical channel joining the interior region with the exterior region.
The following two theorems summarize the main results of our paper.
Theorem 1.1. For PCR3BP with $C = 3.03$, $\mu = 0.0009537$ there exist two periodic solutions in the Jupiter region, $L_1^*$ and $L_2^*$, called Lyapunov orbits, and there exist heteroclinic connections between them, in both directions. Moreover for both orbits $L_1^*$ and $L_2^*$ there exists a homoclinic orbit in the interior and exterior region, respectively.
The next theorem says that the homo- and heteroclinic connections whose existence is established in Theorem 1.1 are topologically transversal, i.e. they give rise to symbolic dynamics, just as in the case of transversal intersections of stable and unstable manifolds.
Theorem 1.2. For PCR3BP with $C = 3.03$, $\mu = 0.0009537$ there exists a symbolic dynamics on four symbols $\{S, X, L_1, L_2\}$ corresponding to the Sun and exterior regions and the vicinity of $L_1$ and $L_2$, respectively.
A precise statement of this theorem with all necessary details about the symbolic dynamics is given as Theorem 7.1 in Sect. 7.
Hence we have proved Assertion 0, but we did not prove Assertions 1–3, as we did not check that the stable and unstable manifolds of the Lyapunov orbits intersect transversally. Instead we proved that there is enough topological transversality present to build a symbolic dynamics on it.
The use of topological tools was essential for the success of this work, as the rigorous computation of stable and unstable manifolds appears to be much more difficult than the computations reported in this paper.
The contents of our paper may be described as follows. In Sect. 2 we continue our brief description of PCR3BP, we define suitable Poincaré maps and state their symmetry properties. In Sect. 3 we present the main topological tool used in this paper, the notion of a covering relation. In Sect. 4 we describe how to link local hyperbolic behavior with covering relations to obtain homo- and heteroclinic orbits. In Sects. 5 and 6 we report the results of our rigorous computations for PCR3BP and we prove Theorem 1.1. In Sect. 7 we show how to use symmetries of PCR3BP together with covering relations
D. Wilczak, P. Zgliczynski
to complete the proof of Theorem 1.2. Section 8 contains the details of the numerical part of the proof; we mainly discuss the question of an efficient approach to the verification of covering relations. We also include all initial data, so that a willing reader with their own code can verify our claims. Our C++ source code is available on-line (see [18]). At this site the reader will also find files describing the basic classes used in our program, and we give the names of the functions which perform the proofs of the numerical lemmas. In Sect. 9 we discuss some natural extensions of our results.
What is new in this paper besides giving a proof for some results from [11]? First of all it shows how to successfully link numerically cheap $C^0$-methods (the covering relations) with much more numerically expensive $C^1$-methods (local hyperbolicity). This was previously done only for maps (see [6]). The main obstacle in applications to ODEs was the lack of an efficient $C^1$ ODE solver. Such a solver, called a $C^1$-Lohner algorithm, was recently proposed by the second author in [23]. Other novelties in this paper are some theoretical improvements in the theory of covering relations. We use an abstract definition of covering relation from [7] and we show how to use a symmetry which involves a time reflection (for the Poincaré map this corresponds to taking an inverse map) together with covering relations. Both these improvements (in numerical algorithms and in the theory of covering relations) allow us to reduce the computation time considerably (probably by two orders of magnitude). The total computation time on a PC using a 1.1 GHz Celeron processor is less than 40 minutes.
2. Properties of PCR3BP: Poincaré Maps and Symmetries
In this section we continue our brief description of PCR3BP which we started in the Introduction and we introduce various notations which will be used throughout the paper.
The PCR3BP has three unstable collinear equilibrium points on the Sun-Jupiter line, called $L_1$, $L_2$ and $L_3$ (see Fig. 1.4 in [11]), whose eigenvalues include one real and one imaginary pair. The value of $C$ (Jacobi constant) at the point $L_i$ will be denoted by $C_i$. A linearization at $L_i$ for $i = 1, 2$ for the parameter range considered here shows that these points are of center-saddle type (see [11]). By a theorem of Moser [16] it follows that for $C < C_i$ and $|C - C_i|$ small enough, there exist hyperbolic periodic orbits, $L_i^*$, around $L_i$, called Lyapunov orbits. Observe that for a fixed value of $C < C_i$ the existence of the Lyapunov orbit $L_i^*$ is not settled by the Moser theorem and has to be proved.
As was mentioned in the Introduction we restrict our attention to the following parameter values: $C = 3.03$, $\mu = 0.0009537$, the parameter values for the comet Oterma in the Sun-Jupiter system (see [11]). Since we work with fixed parameter values we usually drop the dependence of various objects defined throughout the paper on $\mu$ and $C$, so for example $M = M(\mu, C)$. For our parameter values we have $C_2 > C > C_3$ (this means that we are considering Case 3 from Sect. 3.1 in [11]).
We consider Poincaré sections:
$$\Theta = \{(x, y, \dot x, \dot y) \in M \mid y = 0\}, \qquad \Theta_+ = \Theta \cap \{\dot y > 0\}, \qquad \Theta_- = \Theta \cap \{\dot y < 0\}.$$
On $\Theta_\pm$ we can express $\dot y$ in terms of $x$ and $\dot x$ as follows:
$$\dot y = \pm\sqrt{2\Omega(x, 0) - \dot x^2 - C}. \qquad (2.4)$$
Hence the sections $\Theta_\pm$ can be parameterized by two coordinates $(x, \dot x)$ and we will use this identification throughout the paper. More formally, we have the transformation $T_\pm : \mathbb{R}^2 \to \Theta_\pm$ given by the following formula:
$$T_\pm(x, \dot x) = \left(x, 0, \dot x, \pm\sqrt{2\Omega(x, 0) - \dot x^2 - C}\right). \qquad (2.5)$$
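Formula (2.5) can be sketched in code as follows. This is only an illustration in floating-point arithmetic; the function name `section_point` and the convention that `sign` is `+1` or `-1` are assumptions made for this example, not taken from the paper's program.

```python
import math

MU = 0.0009537   # mass parameter used in the paper
C = 3.03         # Jacobi constant used in the paper


def omega(x, y, mu=MU):
    """Effective potential Omega(x, y) of the PCR3BP."""
    r1 = math.hypot(x + mu, y)
    r2 = math.hypot(x - 1.0 + mu, y)
    return 0.5 * (x * x + y * y) + (1.0 - mu) / r1 + mu / r2 + 0.5 * mu * (1.0 - mu)


def section_point(x, xdot, sign, c=C, mu=MU):
    """Lift section coordinates (x, xdot) to (x, 0, xdot, ydot) as in (2.5).

    `sign` selects the branch of the square root: +1 for ydot > 0, -1 for ydot < 0.
    """
    if sign not in (1, -1):
        raise ValueError("sign must be +1 or -1")
    radicand = 2.0 * omega(x, 0.0, mu) - xdot * xdot - c
    if radicand < 0.0:
        raise ValueError("(x, xdot) lies outside the domain of the parameterization")
    return (x, 0.0, xdot, sign * math.sqrt(radicand))
```

By construction the returned point has Jacobi integral equal to `c`, up to rounding.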
The domain of $T_\pm$ is given by the inequality $2\Omega(x, 0) - \dot x^2 - C \ge 0$. Let $\pi_{\dot x} : \Theta_\pm \to \mathbb{R}$ and $\pi_x : \Theta_\pm \to \mathbb{R}$ denote the projection onto the $\dot x$ and $x$ coordinate, respectively. We have $\pi_{\dot x}(x_0, \dot x_0) = \dot x_0$ and $\pi_x(x_0, \dot x_0) = x_0$. We will say that $(x, \dot x) \in \Theta_\pm$, meaning that $(x, \dot x)$ represents the two-dimensional coordinates of a point on $\Theta_\pm$. Analogously we give meaning to the statement $M \subset \Theta_\pm$ for a set $M \subset \mathbb{R}^2$.
We define the following Poincaré maps between sections:
$$P_+ : \Theta_+ \to \Theta_+, \qquad P_- : \Theta_- \to \Theta_-, \qquad P_{1/2,+} : \Theta_+ \to \Theta_-, \qquad P_{1/2,-} : \Theta_- \to \Theta_+.$$
As a rule the sign $+$ or $-$ tells that the domain of the maps $P_\pm$ or $P_{1/2,\pm}$ is contained in $\Theta_\pm$ (the same sign). Observe that
$$P_+(x) = P_{1/2,-} \circ P_{1/2,+}(x), \qquad P_-(x) = P_{1/2,+} \circ P_{1/2,-}(x)$$
whenever $P_+(x)$ and $P_-(x)$ are defined. These identities express the following simple fact: to return to $\Theta_+$ we first need to cross $\Theta$ with negative $\dot y$ (this is $P_{1/2,+}$) and then we return to $\Theta$ with $\dot y > 0$ (this is $P_{1/2,-}$).
Sometimes we will drop the signs in $P_\pm$ and $P_{1/2,\pm}$, hence $P(z) = P_+(z)$ if $z \in \Theta_+$ and $P(z) = P_-(z)$ if $z \in \Theta_-$; a similar convention will be applied to $P_{1/2}$.
2.1. Symmetry properties of PCR3BP. Notice that PCR3BP has the following symmetry:
$$R(x, y, \dot x, \dot y, t) = (x, -y, -\dot x, \dot y, -t), \qquad (2.6)$$
which expresses the following fact: if $(x(t), y(t))$ is a trajectory for PCR3BP, then $(x(-t), -y(-t))$ is also a trajectory for PCR3BP. From this it follows immediately that
if $P_\pm(x_0, \dot x_0) = (x_1, \dot x_1)$, then $P_\pm(x_1, -\dot x_1) = (x_0, -\dot x_0)$,
if $P_{1/2,\pm}(x_0, \dot x_0) = (x_1, \dot x_1)$, then $P_{1/2,\mp}(x_1, -\dot x_1) = (x_0, -\dot x_0)$. (2.7)
We will denote also by $R$ the map $R : \Theta_\pm \to \Theta_\pm$, $R(x, \dot x) = (x, -\dot x)$ for $(x, \dot x) \in \Theta_\pm$. Now Eq. (2.7) can be written as
if $P_\pm(x_0) = x_1$, then $P_\pm(R(x_1)) = R(x_0)$,
if $P_{1/2,\pm}(x_0) = x_1$, then $P_{1/2,\mp}(R(x_1)) = R(x_0)$. (2.8)
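The reversing symmetry (2.6) can be observed numerically. The sketch below assumes that Eqs. (1.1) have the standard rotating-frame form $\ddot x - 2\dot y = \Omega_x$, $\ddot y + 2\dot x = \Omega_y$ (they are not restated in this section), and uses a classical fourth-order Runge-Kutta scheme in floating-point arithmetic, not the rigorous Lohner-type solver of the paper. All function names are hypothetical. It checks that flowing forward for time $t$, reflecting, and flowing forward for time $t$ again returns the reflection of the initial state.

```python
import math

MU = 0.0009537  # mass parameter used in the paper


def field(z, mu=MU):
    """Right-hand side of the PCR3BP (assumed standard rotating-frame form)."""
    x, y, vx, vy = z
    r1 = math.hypot(x + mu, y)
    r2 = math.hypot(x - 1.0 + mu, y)
    omega_x = x - (1.0 - mu) * (x + mu) / r1**3 - mu * (x - 1.0 + mu) / r2**3
    omega_y = y - (1.0 - mu) * y / r1**3 - mu * y / r2**3
    return (vx, vy, 2.0 * vy + omega_x, -2.0 * vx + omega_y)


def rk4_step(z, h):
    """One classical Runge-Kutta step of size h."""
    def shifted(k, s):
        return tuple(zi + s * ki for zi, ki in zip(z, k))
    k1 = field(z)
    k2 = field(shifted(k1, 0.5 * h))
    k3 = field(shifted(k2, 0.5 * h))
    k4 = field(shifted(k3, h))
    return tuple(zi + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
                 for zi, a, b, c, d in zip(z, k1, k2, k3, k4))


def flow(z, t, steps=1000):
    """Approximate time-t flow map using `steps` RK4 steps."""
    h = t / steps
    for _ in range(steps):
        z = rk4_step(z, h)
    return z


def reflect(z):
    """Phase-space part of the symmetry (2.6): (x, y, xdot, ydot) -> (x, -y, -xdot, ydot)."""
    x, y, vx, vy = z
    return (x, -y, -vx, vy)
```

With an arbitrary illustrative state `z0 = (0.5, 0.1, 0.0, 0.8)` and `z1 = flow(z0, 1.0)`, the point `flow(reflect(z1), 1.0)` should agree with `reflect(z0)` up to the integration error.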
3. Topological Tools
In this section we present the main topological tools used in this paper. The crucial notion is that of a covering relation. This notion in various forms was introduced in papers [20–22, 24]. Here we follow the most recent and most general version introduced in [7] and the reader is referred there for proofs.
3.1. h-sets
Notation. For a given norm in $\mathbb{R}^n$, by $B_n(c, r)$ we will denote the open ball of radius $r$ centered at $c \in \mathbb{R}^n$. When the dimension $n$ is obvious from the context we will drop the subscript $n$. Let $S^n(c, r) = \partial B_{n+1}(c, r)$; by the symbol $S^n$ we will denote $S^n(0, 1)$. We set $\mathbb{R}^0 = \{0\}$, $B_0(0, r) = \{0\}$, $\partial B_0(0, r) = \emptyset$. For a given set $Z$, by $\operatorname{int} Z$, $\overline{Z}$, $\partial Z$ we denote the interior, the closure and the boundary of $Z$, respectively. For a map $h : [0, 1] \times Z \to \mathbb{R}^n$ we set $h_t = h(t, \cdot)$. By $\mathrm{Id}$ we denote the identity map. For a map $f$, by $\operatorname{dom}(f)$ we will denote the domain of $f$. Let $f$ be a continuous map defined on a subset of $\mathbb{R}^n$ with values in $\mathbb{R}^n$; we will say that $X \subset \operatorname{dom}(f^{-1})$ if the map $f^{-1} : X \to \mathbb{R}^n$ is well defined and continuous.
Definition 3.1. An h-set, $N$, is the object consisting of the following data:
• $|N|$, a compact subset of $\mathbb{R}^n$, the support of $N$,
• $u(N), s(N) \in \{0, 1, 2, \ldots\}$, such that $u(N) + s(N) = n$,
• a homeomorphism $c_N : \mathbb{R}^n \to \mathbb{R}^n = \mathbb{R}^{u(N)} \times \mathbb{R}^{s(N)}$, such that $c_N(|N|) = \overline{B_{u(N)}}(0, 1) \times \overline{B_{s(N)}}(0, 1)$.
We set
$$N_c = \overline{B_{u(N)}}(0, 1) \times \overline{B_{s(N)}}(0, 1),$$
$$N_c^- = \partial B_{u(N)}(0, 1) \times \overline{B_{s(N)}}(0, 1), \qquad N_c^+ = \overline{B_{u(N)}}(0, 1) \times \partial B_{s(N)}(0, 1),$$
$$N^- = c_N^{-1}(N_c^-), \qquad N^+ = c_N^{-1}(N_c^+).$$
Hence an h-set, $N$, is a product of two closed balls in some coordinate system. The numbers $u(N)$ and $s(N)$ stand for the dimensions of the nominally unstable and stable directions, respectively. The subscript $c$ refers to the new coordinates given by the homeomorphism $c_N$. We will call $N^-$ ($N_c^-$) an exit set of $N$ and $N^+$ ($N_c^+$) an entry set of $N$. Observe that if $u(N) = 0$, then $N^- = \emptyset$, and if $s(N) = 0$, then $N^+ = \emptyset$.
Definition 3.2. Let $N$ be an h-set. We define an h-set $N^T$ as follows:
• $|N^T| = |N|$,
• $u(N^T) = s(N)$, $s(N^T) = u(N)$,
• the homeomorphism $c_{N^T} : \mathbb{R}^n \to \mathbb{R}^n = \mathbb{R}^{u(N^T)} \times \mathbb{R}^{s(N^T)}$ is defined by $c_{N^T}(x) = j(c_N(x))$, where $j : \mathbb{R}^{u(N)} \times \mathbb{R}^{s(N)} \to \mathbb{R}^{s(N)} \times \mathbb{R}^{u(N)}$ is given by $j(p, q) = (q, p)$.
Observe that $N^{T,+} = N^-$ and $N^{T,-} = N^+$. This operation is useful in the context of inverse maps, as was first pointed out in [1].
3.2. Covering relations. For $n > 0$ and a continuous map $f : S^n \to S^n$, by $d(f)$ we denote the degree of $f$ [4]. For $n = 0$ we define the degree, $d(f)$, as follows. Observe first that $S^0 = \{-1, 1\}$. We set
$$d(f) = \begin{cases} 1, & \text{if } f(1) = 1 \text{ and } f(-1) = -1, \\ -1, & \text{if } f(1) = -1 \text{ and } f(-1) = 1, \\ 0, & \text{otherwise.} \end{cases} \qquad (3.9)$$
Definition 3.3. Assume $n > 0$. Let $f : \overline{B_n}(0, 1) \to \mathbb{R}^n$ be such that
$$0 \notin f(\partial B_n(0, 1)). \qquad (3.10)$$
We define a map $s_f : S^{n-1} \to S^{n-1}$ by
$$s_f(x) = \frac{f(x)}{\|f(x)\|}. \qquad (3.11)$$
Definition 3.4. Assume $N$, $M$ are h-sets, such that $u(N) = u(M) = u$ and $s(N) = s(M) = s$. Let $f : |N| \to \mathbb{R}^n$ be continuous. Let $f_c = c_M \circ f \circ c_N^{-1} : N_c \to \mathbb{R}^u \times \mathbb{R}^s$. Let $w$ be a nonzero integer. We say that
$$N \stackrel{f,w}{\Longrightarrow} M$$
($N$ $f$-covers $M$ with degree $w$) iff the following conditions are satisfied:
1. There exists a continuous homotopy $h : [0, 1] \times N_c \to \mathbb{R}^u \times \mathbb{R}^s$, such that the following conditions hold:
$$h_0 = f_c, \qquad (3.12)$$
$$h([0, 1], N_c^-) \cap M_c = \emptyset, \qquad (3.13)$$
$$h([0, 1], N_c) \cap M_c^+ = \emptyset. \qquad (3.14)$$
2.1. If $u > 0$, then there exists a map $A : \mathbb{R}^u \to \mathbb{R}^u$, such that
$$h_1(p, q) = (A(p), 0), \quad \text{where } p \in \mathbb{R}^u \text{ and } q \in \mathbb{R}^s, \qquad (3.15)$$
$$A(\partial B_u(0, 1)) \subset \mathbb{R}^u \setminus \overline{B_u}(0, 1). \qquad (3.16)$$
Moreover, we require that $d(s_A) = w$.
2.2. If $u = 0$, then
$$h_1(x) = 0 \quad \text{for } x \in N_c, \qquad (3.17)$$
$$w = 1. \qquad (3.18)$$
Intuitively, $N \stackrel{f}{\Longrightarrow} M$ if $f$ stretches $N$ in the “nominally unstable” direction, so that its projection onto an “unstable” direction in $M$ covers the projection of $M$ in a topologically nontrivial manner. In the “nominally stable” direction $N$ is contracted by $f$. As a result $N$ is mapped across $M$ in the unstable direction, without touching $M^+$. An example of the covering relation on the plane with one unstable direction is shown in Fig. 3.
Definition 3.5. Assume $N$, $M$ are h-sets, such that $u(N) = u(M) = u$ and $s(N) = s(M) = s$. Let $g$ be a map defined on a subset of $\mathbb{R}^n$ with values in $\mathbb{R}^n$. Assume that $g^{-1} : |M| \to \mathbb{R}^n$ is well defined and continuous. We say that $N \stackrel{g,w}{\Longleftarrow} M$ ($N$ $g$-backcovers $M$ with degree $w$) iff $M^T \stackrel{g^{-1},w}{\Longrightarrow} N^T$.
The following theorem, proved in [7], is one of the main tools used in this paper. Various versions of this theorem (without backcovering) using slightly weaker notions of covering relations or even without an explicitly defined notion of covering relation were given in [20–22, 24]. In the planar case this theorem with backcovering was stated also in [1].
Theorem 3.6. Assume $N_i$, $i = 0, \ldots, k$, $N_k = N_0$ are h-sets and for each $i = 1, \ldots, k$ we have either
$$N_{i-1} \stackrel{f_i,w_i}{\Longrightarrow} N_i \qquad (3.19)$$
or $|N_i| \subset \operatorname{dom}(f_i^{-1})$ and
$$N_{i-1} \stackrel{f_i,w_i}{\Longleftarrow} N_i. \qquad (3.20)$$
Then there exists a point $x \in \operatorname{int} |N_0|$, such that
$$f_i \circ f_{i-1} \circ \cdots \circ f_1(x) \in \operatorname{int} |N_i|, \quad i = 1, \ldots, k, \qquad (3.21)$$
$$f_k \circ f_{k-1} \circ \cdots \circ f_1(x) = x. \qquad (3.22)$$
Obviously we cannot make any claim about the uniqueness of $x$ in Theorem 3.6.
3.3. Covering relation on the plane with one nominally expanding direction ($u = 1$). In this section we discuss the case when $u = s = 1$, hence we have only one nominally expanding and one nominally contracting direction. The basic idea here is: the set $N^-$ consists of two disjoint components and all possible values of the degree $w$ in the covering relation are $\pm 1$. This allows us to give sufficient conditions for the existence of covering relations which are relatively easy to verify.
Definition 3.7. Let $N$ be an h-set, such that $u(N) = s(N) = 1$. We set
$$N_c^{le} = \{-1\} \times [-1, 1], \qquad N_c^{re} = \{1\} \times [-1, 1],$$
$$S(N)_c^l = (-\infty, -1) \times \mathbb{R}, \qquad S(N)_c^r = (1, \infty) \times \mathbb{R}.$$
We define
$$N^{le} = c_N^{-1}(N_c^{le}), \qquad N^{re} = c_N^{-1}(N_c^{re}),$$
$$S(N)^l = c_N^{-1}(S(N)_c^l), \qquad S(N)^r = c_N^{-1}(S(N)_c^r).$$
We will call $N^{le}$, $N^{re}$, $S(N)^l$ and $S(N)^r$ the left edge, the right edge, the left side and the right side of $N$, respectively.
It is easy to see that $N^- = N^{le} \cup N^{re}$. The triple $(|N|, S(N)^l, S(N)^r)$ is a t-set from [2]. As in [2] we will use the following notation for $S(N)^{r,l}$:
$$N^l = S(N)^l, \qquad N^r = S(N)^r.$$
Remark 3.8. For all h-sets used in this paper the support is a parallelogram. A usual picture of an h-set is given in Fig. 2. A typical picture illustrating a covering relation on the plane with one “unstable” direction is given in Fig. 3.
Fig. 2. An example of h-set on the plane
Fig. 3. An example of an $f$-covering relation: $N_0 \Rightarrow N_0, N_1$
The following theorem was proved in [7] for any $n > 1$ and $u(N) = 1$. Here we rewrite it for the planar case in a slightly different notation (we use $N^l$ and $N^r$ for $S(N)^l$ and $S(N)^r$, respectively).
Theorem 3.9. Let $n = 2$ and let $N$, $M$ be two h-sets in $\mathbb{R}^n$, such that $u(N) = u(M) = 1$ and $s(N) = s(M) = 1$. Let $f : |N| \to \mathbb{R}^n$ be continuous. Assume that there exists $q_0 \in \overline{B_s}(0, 1)$, such that the following conditions are satisfied:
$$f(c_N([-1, 1] \times \{q_0\})) \subset \operatorname{int}(M^l \cup |M| \cup M^r), \qquad (3.23)$$
$$f(|N|) \cap M^+ = \emptyset, \qquad (3.24)$$
and one of the following two conditions holds:
$$f(N^{le}) \subset M^l \text{ and } f(N^{re}) \subset M^r, \qquad (3.25)$$
$$f(N^{le}) \subset M^r \text{ and } f(N^{re}) \subset M^l. \qquad (3.26)$$
Then there exists $w = \pm 1$, such that
$$N \stackrel{f,w}{\Longrightarrow} M.$$
3.4. Representation of the h-sets. In this paper we use very simple h-sets, namely the support is a parallelogram. An h-set is defined by specifying the triple $N = t(c, u, s)$, where $c, u, s \in \mathbb{R}^2$ are such that $u$, $s$ are linearly independent. We set
$$|N| = \{x \in \mathbb{R}^2 \mid \exists\, t_1, t_2 \in [-1, 1] : x = c + t_1 u + t_2 s\} = c + [-1, 1] \cdot u + [-1, 1] \cdot s,$$
$$c_N(t_1, t_2) = c + t_1 u + t_2 s.$$
In this representation $c$ is the center point of the parallelogram $N$, $u$ represents an oriented half-length in the “unstable” direction and $s$ is an oriented half-length in the “stable” direction. See Fig. 2 for an example of an h-set in this representation. We have
$$N^{le} = c - u + [-1, 1] \cdot s,$$
$$N^{re} = c + u + [-1, 1] \cdot s,$$
$$N^l = c + (-\infty, -1] u + (-\infty, \infty) s,$$
$$N^r = c + [1, \infty) u + (-\infty, \infty) s.$$
We introduce the notions of the top and bottom edges of $N$, $N^{te}$ and $N^{be}$, by
$$N^{be} = c + [-1, 1] \cdot u - s, \qquad N^{te} = c + [-1, 1] \cdot u + s.$$
Let us recall that the symmetry $R : \mathbb{R}^2 \to \mathbb{R}^2$ introduced in Sect. 2.1 was given by $R(x_1, x_2) = (x_1, -x_2)$.
Definition 3.10. An h-set, $N$, will be called an R-symmetric h-set if $N = t(c, u, s)$ for some $c, u, s \in \mathbb{R}^2$, such that $R(c) = c$, and $R(u) = s$ or $R(u) = -s$.
Figure 4 shows an example of an R-symmetric h-set. Symmetry properties of such an h-set are apparent.
Fig. 4. An example of an R-symmetric h-set
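The parallelogram representation lends itself to a direct implementation. The sketch below is a hypothetical illustration (the class `HSet` and the functions `is_r_symmetric` and `sampled_covering` are not taken from the paper's C++ code). The covering test evaluates the map on a finite grid only and uses ordinary floating-point arithmetic, so unlike the interval computations of the paper it proves nothing; it also checks a condition slightly stronger than (3.23)–(3.24) of Theorem 3.9, namely that no sampled image point has local coordinates with $|t_1| \le 1$ and $|t_2| \ge 1$.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HSet:
    """Planar h-set N = t(c, u, s) whose support is a parallelogram."""
    c: tuple
    u: tuple
    s: tuple

    def point(self, t1, t2):
        """The point c + t1*u + t2*s."""
        return (self.c[0] + t1 * self.u[0] + t2 * self.s[0],
                self.c[1] + t1 * self.u[1] + t2 * self.s[1])

    def local(self, p):
        """Coordinates (t1, t2) with p = c + t1*u + t2*s, by Cramer's rule."""
        det = self.u[0] * self.s[1] - self.u[1] * self.s[0]
        dx, dy = p[0] - self.c[0], p[1] - self.c[1]
        return ((dx * self.s[1] - dy * self.s[0]) / det,
                (self.u[0] * dy - self.u[1] * dx) / det)


def reflect(p):
    """R(x1, x2) = (x1, -x2)."""
    return (p[0], -p[1])


def is_r_symmetric(n, tol=1e-12):
    """Definition 3.10: R(c) = c and (R(u) = s or R(u) = -s)."""
    def close(a, b):
        return abs(a[0] - b[0]) <= tol and abs(a[1] - b[1]) <= tol
    ru = reflect(n.u)
    neg_s = (-n.s[0], -n.s[1])
    return close(reflect(n.c), n.c) and (close(ru, n.s) or close(ru, neg_s))


def sampled_covering(f, n, m, k=50):
    """Non-rigorous, grid-based check of the hypotheses of Theorem 3.9."""
    grid = [-1.0 + 2.0 * i / k for i in range(k + 1)]
    # Surrogate for (3.23)-(3.24): sampled images avoid {|t1| <= 1, |t2| >= 1} in M.
    for t1 in grid:
        for t2 in grid:
            a, b = m.local(f(n.point(t1, t2)))
            if abs(a) <= 1.0 and abs(b) >= 1.0:
                return False
    # (3.25) or (3.26): the two vertical edges of N land on opposite sides of M.
    left = [m.local(f(n.point(-1.0, t2)))[0] for t2 in grid]
    right = [m.local(f(n.point(1.0, t2)))[0] for t2 in grid]
    same = all(a < -1.0 for a in left) and all(a > 1.0 for a in right)
    flipped = all(a > 1.0 for a in left) and all(a < -1.0 for a in right)
    return same or flipped
```

For the unit square `N = HSet((0.0, 0.0), (1.0, 0.0), (0.0, 1.0))` and the linear map `lambda p: (3.0 * p[0], 0.3 * p[1])`, the sampled test accepts the self-covering of `N`, while for the identity map it rejects it, since the edges are not pushed strictly outside.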
3.5. Action of R on h-sets. The symmetry of $P_{1/2,\pm}$ and $P_\pm$ expressed in (2.7) relates the maps and their inverses, hence besides mapping the support of $N$ by $R$ it will also switch the nominally stable and unstable directions. This motivates the following definition of the action of the symmetry $R$ on h-sets:
Definition 3.11. Let $N$ be an h-set. We define an h-set $R(N)$ as follows:
• $|R(N)| = R(|N|)$,
• $u(R(N)) = s(N)$ and $s(R(N)) = u(N)$,
• the homeomorphism $c_{R(N)} : \mathbb{R}^n \to \mathbb{R}^{u(R(N))} \times \mathbb{R}^{s(R(N))}$ is given by $c_{R(N)} = c_{N^T} \circ R^{-1}$.
Observe that according to the above definition we have
$$R(N)^- = R(N^+) = R(N^{T,-}), \qquad R(N)^+ = R(N^-) = R(N^{T,+}),$$
$$R(t(c, u, s)) = t(R(c), R(s), R(u)). \qquad (3.27)$$
As an immediate consequence of Eq. (3.27) we obtain
Lemma 3.12. Let $N = t(c, u, s)$ be an R-symmetric h-set. Then $R(N) = N$.
We have the following easy lemma.
Lemma 3.13. Let $f_i : \operatorname{dom}(f_i) \subset \mathbb{R}^2 \to \mathbb{R}^2$ for $i = 1, 2$ be continuous and invertible on some open sets. Assume that
$$\text{if } f_1(x) = x_1, \quad \text{then } f_2(R x_1) = R(x). \qquad (3.28)$$
Let $N_1$, $N_2$ be h-sets such that
$$N_1 \stackrel{f_1}{\Longrightarrow} N_2.$$
Then
$$R(N_2) \stackrel{f_2}{\Longleftarrow} R(N_1).$$
Proof. From Def. 3.11 and the assumed symmetry it follows immediately that
$$R(N_1)^T \stackrel{f_2^{-1}}{\Longrightarrow} R(N_2^T). \qquad (3.29)$$
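From the above lemma and (2.8) we obtain
Corollary 3.14. If $N_1 \stackrel{P_\pm}{\Longrightarrow} N_2$, then $R(N_2) \stackrel{P_\pm}{\Longleftarrow} R(N_1)$.
If $N_1 \stackrel{P_{1/2,+}}{\Longrightarrow} N_2$, then $R(N_2) \stackrel{P_{1/2,-}}{\Longleftarrow} R(N_1)$.
If $N_1 \stackrel{P_{1/2,-}}{\Longrightarrow} N_2$, then $R(N_2) \stackrel{P_{1/2,+}}{\Longleftarrow} R(N_1)$.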
4. $C^1$ Tools
The goal of this section is to describe the tools which allow us, in the presence of hyperbolic fixed points for a map, to prove the existence of homo- and heteroclinic trajectories. In this section we recall the results from [6] with some additions (see also [19] where the method was outlined for the first time). In the symbol of the covering relation we will drop the degree part, hence we will use $N \stackrel{f}{\Longrightarrow} M$ instead of $N \stackrel{f,w}{\Longrightarrow} M$ for some nonzero $w$.
4.1. General theorems. Let $P : \mathbb{R}^n \to \mathbb{R}^n$ be a $C^1$-map. For any set $X$ we define an interval matrix $[DP(X)] \subset \mathbb{R}^{n \times n}$ to be an interval enclosure of $DP(X)$ given by
$$M \in [DP(X)] \quad \text{iff} \quad \inf_{x \in X} DP(x)_{ij} \le M_{ij} \le \sup_{x \in X} DP(x)_{ij}, \quad i, j = 1, 2, \ldots, n.$$
Lemma 4.1. Let $N$ be a convex set. Assume $x_0, x_1 \in N$. Then
$$P(x_1) - P(x_0) \in [DP(N)] \cdot (x_1 - x_0). \qquad (4.30)$$
Moreover, there exists a matrix $M \in [DP(N)]$ such that
$$P(x_1) - P(x_0) = M \cdot (x_1 - x_0). \qquad (4.31)$$
Proof.
$$P(x_1) - P(x_0) = \int_0^1 \frac{d}{dt} P(x_0 + t(x_1 - x_0))\, dt = \int_0^1 \frac{\partial P}{\partial x}(x_0 + t(x_1 - x_0))\, dt \cdot (x_1 - x_0).$$
To finish the proof observe that
$$M = \int_0^1 \frac{\partial P}{\partial x}(x_0 + t(x_1 - x_0))\, dt \in [DP(N)].$$
Let $\mathrm{Id} : \mathbb{R}^n \to \mathbb{R}^n$ denote the identity map.
Theorem 4.2. Let $N$ be a convex set. Assume that
$$0 \notin \det([DP(N)] - \mathrm{Id}) = \{t = \det(M - \mathrm{Id}) \mid M \in [DP(N)]\},$$
then $N$ contains at most one fixed point of $P$.
Proof. Assume that $x_0, x_1 \in N$ are fixed points for $P$. Then from Lemma 4.1 it follows that
$$x_1 - x_0 = P(x_1) - P(x_0) = M \cdot (x_1 - x_0) \qquad (4.32)$$
for some matrix $M \in [DP(N)]$. Hence $x_1 - x_0$ is in the kernel of $M - \mathrm{Id}$. From our assumption it follows that $\det(M - \mathrm{Id}) \ne 0$, hence $x_1 = x_0$.
Consider a two-dimensional function $f(x) = (f_1(x), f_2(x))^T$, where $x = (x_1, x_2)^T$. We assume that $f(0) = 0$, i.e. $0$ is a fixed point of $f$. For a convex set $U$, such that $0 \in U$, we define intervals $\lambda_1(U)$, $\varepsilon_1(U)$, $\varepsilon_2(U)$ and $\lambda_2(U)$ by
$$Df(U) = \begin{pmatrix} \lambda_1(U) & \varepsilon_1(U) \\ \varepsilon_2(U) & \lambda_2(U) \end{pmatrix}. \qquad (4.33)$$
Since $f(0) = 0$, from Lemma 4.1 it follows that
$$f_1(x) \in \lambda_1(U) x_1 + \varepsilon_1(U) x_2, \qquad f_2(x) \in \varepsilon_2(U) x_1 + \lambda_2(U) x_2.$$
Let
$$\varepsilon_1'(U) = \sup\{|\varepsilon| : \varepsilon \in \varepsilon_1(U)\}, \qquad \varepsilon_2'(U) = \sup\{|\varepsilon| : \varepsilon \in \varepsilon_2(U)\},$$
$$\lambda_1'(U) = \inf\{|\lambda_1| : \lambda_1 \in \lambda_1(U)\}, \qquad \lambda_2'(U) = \sup\{|\lambda_2| : \lambda_2 \in \lambda_2(U)\}.$$
Let us define the rectangle $N_{\alpha_1,\alpha_2}$ by $N_{\alpha_1,\alpha_2} = [-\alpha_1, \alpha_1] \times [-\alpha_2, \alpha_2]$.
Definition 4.3 [6, Def. 1]. Let $x_*$ be a fixed point for the map $f$. We say that $f$ is hyperbolic on $N \ni x_*$, if there exists a local coordinate system on $N$, such that in this coordinate system
$$x_* = 0, \qquad (4.34)$$
$$\varepsilon_1'(N)\, \varepsilon_2'(N) < (1 - \lambda_2'(N))(\lambda_1'(N) - 1), \qquad (4.35)$$
$$N = N_{\alpha_1,\alpha_2}, \qquad (4.36)$$
where $\alpha_1 > 0$, $\alpha_2 > 0$ are such that the following conditions are satisfied:
$$\frac{\varepsilon_1'(N)}{\lambda_1'(N) - 1} < \frac{\alpha_1}{\alpha_2} < \frac{1 - \lambda_2'(N)}{\varepsilon_2'(N)}. \qquad (4.37)$$
It is easy to see that for the map $f$ to be hyperbolic on $N$ it is necessary that $\lambda_1' > 1$, $\lambda_2' < 1$ and that the linearization of $f$ at $x_*$ is hyperbolic with one stable and one unstable direction.
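Theorem 4.4 [6, Thm. 3]. Assume that $f$ is hyperbolic on $N$. Then
1. If $f^k(x) \in N$ for $k \ge 0$, then $\lim_{k \to \infty} f^k(x) = x_*$,
2. If $y_k \in N$ and $f(y_{k-1}) = y_k$ for $k \le 0$, then $\lim_{k \to -\infty} y_k = x_*$.
The next theorem shows how we can combine $C^0$- and $C^1$-tools to prove the existence of asymptotic orbits with a prescribed itinerary.
Theorem 4.5 [6, Thm. 4]. Assume that $g$ is hyperbolic on $N_m$ and $f$ is hyperbolic on $N_0$. Let $x_g \in N_m$ be a fixed point for $g$ and $x_f \in N_0$ be a fixed point for $f$.
1. If
$$N_0 \stackrel{f_0}{\Longrightarrow} N_1 \stackrel{f_1}{\Longrightarrow} N_2 \stackrel{f_2}{\Longrightarrow} \cdots \stackrel{f_{m-1}}{\Longrightarrow} N_m \stackrel{g}{\Longrightarrow} N_m, \qquad (4.38)$$
then there exists $x_0 \in N_0$ such that
$$f_{i-1} \circ f_{i-2} \circ \cdots \circ f_0(x_0) \in N_i \quad \text{for } i = 1, \ldots, m,$$
$$g^k \circ f_{m-1} \circ \cdots \circ f_0(x_0) \in N_m \quad \text{for } k > 0,$$
$$\lim_{k \to \infty} g^k \circ f_{m-1} \circ \cdots \circ f_0(x_0) = x_g.$$
2. If
$$N_0 \stackrel{f}{\Longrightarrow} N_0 \stackrel{f_0}{\Longrightarrow} N_1 \stackrel{f_1}{\Longrightarrow} N_2 \stackrel{f_2}{\Longrightarrow} \cdots \stackrel{f_{m-1}}{\Longrightarrow} N_m, \qquad (4.39)$$
then there exists a sequence $(x_k)_{k=-\infty}^{0}$, $f(x_k) = x_{k+1}$ for $k < 0$, such that
$$x_k \in N_0 \quad \text{for } k \le 0,$$
$$f_{i-1} \circ f_{i-2} \circ \cdots \circ f_0(x_0) \in N_i \quad \text{for } i = 1, \ldots, m,$$
$$\lim_{k \to -\infty} x_k = x_f.$$
3. If
$$N_0 \stackrel{f}{\Longrightarrow} N_0 \stackrel{f_0}{\Longrightarrow} N_1 \stackrel{f_1}{\Longrightarrow} N_2 \stackrel{f_2}{\Longrightarrow} \cdots \stackrel{f_{m-1}}{\Longrightarrow} N_m \stackrel{g}{\Longrightarrow} N_m, \qquad (4.40)$$
then there exists a sequence $(x_k)_{k=-\infty}^{0}$, $f(x_k) = x_{k+1}$ for $k < 0$, such that
$$x_k \in N_0 \quad \text{for } k \le 0,$$
$$f_{i-1} \circ f_{i-2} \circ \cdots \circ f_0(x_0) \in N_i \quad \text{for } i = 1, \ldots, m,$$
$$g^n \circ f_{m-1} \circ \cdots \circ f_0(x_0) \in N_m \quad \text{for } n > 0,$$
$$\lim_{k \to -\infty} x_k = x_f, \qquad \lim_{k \to \infty} g^k \circ f_{m-1} \circ \cdots \circ f_0(x_0) = x_g.$$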
The above theorem can be used without any modifications for proving the existence of trajectories converging to periodic orbits. In this case we consider higher iterates of maps f and g in (4.38), (4.39) and (4.40).
4.2. How to prove existence of a heteroclinic orbit, fuzzy sets. To prove existence of a heteroclinic orbit we want to use the third assertion in Theorem 4.5 for $g = f$, but in order to make the exposition easier to follow we use two different maps $f$ and $g$. Observe that to apply this theorem directly one needs to know the exact location of the two fixed points $x_f \in N_0$ and $x_g \in N_m$, because the sets $N_0$ and $N_m$ are centered on $x_f$ and $x_g$, respectively. But the exact coordinates of $x_f$ and $x_g$ are usually unknown. We overcome this obstacle in three steps as follows:
1. Finding very good estimates for $x_f$ and $x_g$. In this paper we use an argument based on symmetry to obtain tight bounds for $x_f$ and $x_g$. In [6] a rigorous interval Newton algorithm was used. Let us denote by $M_f$ and $M_g$ the obtained estimates for $x_f$ and $x_g$, respectively. We choose one fixed point $x_f \in M_f$ and $x_g \in M_g$ for further considerations.
2. $C^1$-computations, hyperbolicity. We choose a set $U_f$, $M_f \subset U_f$, on which we compute rigorously $[Df(U_f)]$. Then we have to choose a coordinate system in which the matrix $[Df(U_f)]$ will be as close as possible to a diagonal one. In this paper we have chosen numerically obtained stable and unstable eigenvectors. Let us denote these eigenvectors by $u$ and $s$, where $u$ corresponds to the unstable direction and $s$ points in the stable direction. Assume that this process gives us a coordinate frame in which
$$\varepsilon_1'(U_f)\, \varepsilon_2'(U_f) < (1 - \lambda_2'(U_f))(\lambda_1'(U_f) - 1). \qquad (4.41)$$
From (4.41) it follows easily that there exist $\alpha_1 > 0$, $\alpha_2 > 0$ such that
$$\frac{\varepsilon_1'(U_f)}{\lambda_1'(U_f) - 1} < \frac{\alpha_1}{\alpha_2} < \frac{1 - \lambda_2'(U_f)}{\varepsilon_2'(U_f)}. \qquad (4.42)$$
Observe that the above inequality specifies only the ratio $\alpha_1/\alpha_2$, hence we can find a pair $(\alpha_1, \alpha_2)$ such that condition (4.42) and the following condition hold:
$$M_f + \alpha_1 \cdot [-1, 1] \cdot u + \alpha_2 \cdot [-1, 1] \cdot s \subset U_f. \qquad (4.43)$$
We now define an h-set $N_0$ by
$$N_0 = t(x_f, \alpha_1 u, \alpha_2 s). \qquad (4.44)$$
Obviously $f$ is hyperbolic on $N_0$. Observe that the hyperbolicity implies uniqueness of $x_f$ in $N_0$. We do a similar construction for $g$ to obtain $N_m = t(x_g, \beta_1 \bar u, \beta_2 \bar s)$.
3. Covering relations for fuzzy h-sets. We have to verify the following covering relations:
$$N_0 \stackrel{f}{\Longrightarrow} N_0 \stackrel{f}{\Longrightarrow} N_1, \qquad (4.45)$$
$$N_{m-1} \stackrel{f_{m-1}}{\Longrightarrow} N_m \stackrel{g}{\Longrightarrow} N_m. \qquad (4.46)$$
As was mentioned above we do not know the h-sets $N_0$, $N_m$ explicitly, but we know that
$$N_0 \in \tilde N_0 = \{t(c, \alpha_1 u, \alpha_2 s) \mid c \in M_f\}, \qquad N_m \in \tilde N_m = \{t(c, \beta_1 \bar u, \beta_2 \bar s) \mid c \in M_g\}.$$
The above equations define a fuzzy h-set as a collection of h-sets. We can now extend the definition of covering relations to fuzzy h-sets as follows.
Definition 4.6. Let $f$ be a continuous map on the plane. Assume $\tilde N_1$, $\tilde N_2$ are fuzzy h-sets (collections of h-sets) and $R$ is an h-set.
• We say that $\tilde N_1 \stackrel{f}{\Longrightarrow} R$ iff $M \stackrel{f}{\Longrightarrow} R$ for all $M \in \tilde N_1$.
• We say that $R \stackrel{f}{\Longrightarrow} \tilde N_1$ iff $R \stackrel{f}{\Longrightarrow} M$ for all $M \in \tilde N_1$.
• We say that $\tilde N_1 \stackrel{f}{\Longrightarrow} \tilde N_2$ iff $M_1 \stackrel{f}{\Longrightarrow} M_2$ for all $M_1 \in \tilde N_1$ and $M_2 \in \tilde N_2$.
With the above definition it is obvious that to prove the covering relations in Eqs. (4.45) and (4.46) it is enough to show that
$$\tilde N_0 \stackrel{f}{\Longrightarrow} \tilde N_0 \stackrel{f}{\Longrightarrow} N_1, \qquad (4.47)$$
$$N_{m-1} \stackrel{f_{m-1}}{\Longrightarrow} \tilde N_m \stackrel{g}{\Longrightarrow} \tilde N_m. \qquad (4.48)$$
In practice (in rigorous numerical computations) it is convenient to think about a fuzzy h-set $\tilde N$ as a parallelogram with thickened edges, hence all tools developed to verify covering relations for h-sets can be easily extended to fuzzy h-sets.
5. The Lyapunov Orbits
In this section we present a computer assisted proof of the existence and hyperbolicity of the Lyapunov orbits around the libration points. Hence we realize here Steps 1 and 2 from Sect. 4.2 on our way to the proof of existence of the heteroclinic connection. As in the previous section, in the symbol of the covering relation we will drop the degree part, hence we will use $N \stackrel{f}{\Longrightarrow} M$ instead of $N \stackrel{f,w}{\Longrightarrow} M$ for some nonzero $w$.
5.1. Existence of Lyapunov orbits
Theorem 5.1. Let $x_1 = 0.9208034913207400196$, $x_2 = 1.081929486841799903$.
• There exists a fixed point $L_1^* = (x_1^*, 0) \in \Theta_+$ for $P_+$, such that
$$|x_1^* - x_1| < \eta_1 = 6 \cdot 10^{-14}. \qquad (5.49)$$
• There exists a fixed point $L_2^* = (x_2^*, 0) \in \Theta_-$ for $P_-$, such that
$$|x_2^* - x_2| < \eta_2 = 10^{-13}. \qquad (5.50)$$
Proof. We consider two intervals $I_1 := [x_1 - \eta_1, x_1 + \eta_1] \times \{0\} \subset \Theta_+$, $I_2 := [x_2 - \eta_2, x_2 + \eta_2] \times \{0\} \subset \Theta_-$. The location of $x_i$ is schematically shown in Fig. 5. Let us recall that by $\pi_{\dot x} : \Theta_\pm \to \mathbb{R}$ we denote the projection onto the $\dot x$ coordinate. With computer assistance we proved the following:
Lemma 5.2. The maps $P_{1/2,+} : I_1 \to \Theta_-$ and $P_{1/2,-} : I_2 \to \Theta_+$ are well defined and continuous. Moreover we have the following properties:
$$\pi_{\dot x} P_{1/2,+}(x_1 - \eta_1, 0) < 0, \qquad \pi_{\dot x} P_{1/2,+}(x_1 + \eta_1, 0) > 0, \qquad (5.51)$$
$$\pi_{\dot x} P_{1/2,-}(x_2 - \eta_2, 0) < 0, \qquad \pi_{\dot x} P_{1/2,-}(x_2 + \eta_2, 0) > 0. \qquad (5.52)$$
Fig. 5. The Lyapunov orbits and the location of xi∗
Fig. 6. Rigorous enclosure of $P_{1/2,+}(x_1 - \eta_1, 0)$ (a box in the lower left corner) and $P_{1/2,+}(x_1 + \eta_1, 0)$ (a box in the upper right corner)
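Figures 6 and 7 display rigorous enclosures for $P_{1/2,+}(x_1 \pm \eta_1, 0)$ and $P_{1/2,-}(x_2 \pm \eta_2, 0)$, respectively.
Now we are ready to finish the proof of Theorem 5.1. From Lemma 5.2 and the Darboux property it follows that there exist points $x_1^* \in \operatorname{int}(I_1)$ and $x_2^* \in \operatorname{int}(I_2)$, such that
$$P_{1/2,+}(x_1^*, 0) = (x_1^0, 0), \qquad (5.53)$$
$$P_{1/2,-}(x_2^*, 0) = (x_2^0, 0). \qquad (5.54)$$
By the symmetry properties of $P_{1/2,\pm}$ (see Eq. (2.7)), applied with $\dot x_0 = \dot x_1 = 0$, we get $P_{1/2,-}(x_1^0, 0) = (x_1^*, 0)$ and $P_{1/2,+}(x_2^0, 0) = (x_2^*, 0)$. Composing the half-return maps gives
$$P_+(x_1^*, 0) = (x_1^*, 0), \qquad (5.55)$$
$$P_-(x_2^*, 0) = (x_2^*, 0). \qquad (5.56)$$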
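The shooting argument used in this proof can be imitated in ordinary floating-point arithmetic. The sketch below is only a non-rigorous illustration: it assumes the standard form $\ddot x - 2\dot y = \Omega_x$, $\ddot y + 2\dot x = \Omega_y$ for Eqs. (1.1), integrates with a classical Runge-Kutta scheme instead of the rigorous solver used in the paper, and locates the crossing of $y = 0$ by linear interpolation. The function names are hypothetical. Evaluating the third component of `half_return_plus(x, 0.0)` for `x` slightly below and slightly above `X1` is the floating-point analogue of the sign conditions (5.51), and a bisection on that quantity would approximate $x_1^*$.

```python
import math

MU = 0.0009537               # mass parameter used in the paper
C = 3.03                     # Jacobi constant used in the paper
X1 = 0.9208034913207400196   # approximate x-coordinate of L1* from Theorem 5.1


def omega(x, y):
    """Effective potential Omega(x, y)."""
    r1 = math.hypot(x + MU, y)
    r2 = math.hypot(x - 1.0 + MU, y)
    return 0.5 * (x * x + y * y) + (1.0 - MU) / r1 + MU / r2 + 0.5 * MU * (1.0 - MU)


def field(z):
    """Right-hand side of the PCR3BP (assumed standard rotating-frame form)."""
    x, y, vx, vy = z
    r1 = math.hypot(x + MU, y)
    r2 = math.hypot(x - 1.0 + MU, y)
    ox = x - (1.0 - MU) * (x + MU) / r1**3 - MU * (x - 1.0 + MU) / r2**3
    oy = y - (1.0 - MU) * y / r1**3 - MU * y / r2**3
    return (vx, vy, 2.0 * vy + ox, -2.0 * vx + oy)


def rk4_step(z, h):
    """One classical Runge-Kutta step of size h."""
    def shifted(k, s):
        return tuple(zi + s * ki for zi, ki in zip(z, k))
    k1 = field(z)
    k2 = field(shifted(k1, 0.5 * h))
    k3 = field(shifted(k2, 0.5 * h))
    k4 = field(shifted(k3, h))
    return tuple(zi + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
                 for zi, a, b, c, d in zip(z, k1, k2, k3, k4))


def half_return_plus(x, xdot, h=1e-3, max_steps=20000):
    """Approximate half-return map: start at y = 0 with ydot > 0, stop when y returns to 0.

    Returns the interpolated state (x, y, xdot, ydot) at the crossing.
    """
    ydot = math.sqrt(2.0 * omega(x, 0.0) - xdot * xdot - C)
    z = (x, 0.0, xdot, ydot)
    for _ in range(max_steps):
        w = rk4_step(z, h)
        if z[1] > 0.0 and w[1] <= 0.0:
            lam = z[1] / (z[1] - w[1])   # linear interpolation to y = 0
            return tuple(a + lam * (b - a) for a, b in zip(z, w))
        z = w
    raise RuntimeError("no crossing of y = 0 found")
```

Because the Lyapunov orbit is strongly unstable, the computed values are sensitive to the step size `h`; the sketch gives no guaranteed error bounds.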
5.2. Hyperbolicity in the neighborhood of Lyapunov orbits. The goal of this section is to prove that P is hyperbolic in the sense of Definition 4.3 in some neighborhood of points L∗1 and L∗2 .
Fig. 7. Rigorous enclosure of $P_{1/2,-}(x_2 - \eta_2, 0)$ (a box in the lower left corner) and $P_{1/2,-}(x_2 + \eta_2, 0)$ (a box in the upper right corner)
Let us define
$$u_1 = (1, 2.5733011), \quad s_1 = (-1, 2.5733011), \quad u_2 = (1, 2.2817915), \quad s_2 = (-1, 2.2817915).$$
These vectors appear to be a good approximation for the unstable ($u_i$) and stable ($s_i$) eigenvectors at $L_i^*$ on the $(x, \dot x)$-plane. Observe that $R(u_i) = -s_i$; this is in agreement with the symmetry of $P_\pm$ stated in Eq. (2.7). We will also use $(u_i, s_i)$ later, as the coordinate directions for a good coordinate frame in the proof of hyperbolicity of $P_+$ and $P_-$ in the neighborhood of $L_i^*$.
Let $H_i^1 = t(h_i, u_i^1, s_i^1)$ and $H_i^2 = t(h_i, u_i^2, s_i^2)$ for $i = 1, 2$ denote h-sets on the $(x, \dot x)$ plane, where
$$h_1 = (x_1, 0), \qquad h_2 = (x_2, 0),$$
$$\alpha_1 = 3 \cdot 10^{-10}, \qquad \alpha_2 = 4 \cdot 10^{-10},$$
$$u_1^1 = \alpha_1 u_1, \qquad u_2^1 = \alpha_2 u_2,$$
$$s_1^1 = \alpha_1 s_1, \qquad s_2^1 = \alpha_2 s_2,$$
$$u_1^2 = 2 \cdot 10^{-7} u_1, \qquad u_2^2 = 1.2 \cdot 10^{-8} u_2,$$
$$s_1^2 = 2 \cdot 10^{-7} s_1, \qquad s_2^2 = 2.8 \cdot 10^{-7} s_2. \qquad (5.57)$$
We assume that $H_1^1, H_1^2 \subset \Theta_+$ and $H_2^1, H_2^2 \subset \Theta_-$. Observe that $I_1 \subset H_1^1 \subset H_1^2$ and $I_2 \subset H_2^1 \subset H_2^2$, where the sets $I_i$ were defined in the proof of Theorem 5.1. Let
$$W_i = [-\eta_i, \eta_i] \times \{0\}, \quad i = 1, 2, \qquad (5.58)$$
where $\eta_i$ were defined in Theorem 5.1. Let $U_i$, for $i = 1, 2$, be given by
$$U_i = H_i^1 + W_i = \{(x + p, \dot x) : (x, \dot x) \in H_i^1,\ (p, 0) \in W_i\}. \qquad (5.59)$$
The choices made in (5.57) are motivated by the following considerations: since we want to exploit the hyperbolicity of $P$ around $L_i^*$, it is desirable to choose the stable and unstable directions as $s_i$ and $u_i$. The sets $H_i^1$ (in fact $U_i$) will be used to establish hyperbolicity around $L_i^*$, hence it is desirable to choose them very small, as we need to perform a costly rigorous computation of $DP$ on $U_i$ (a $C^1$-computation). The sets $H_i^2$ are used as a link in the chain of covering relations between the small $H_i^1$ and the relatively large sets $N_i$ defined later in Sect. 6. Since the unstable eigenvalue is bigger than $10^3$ (see the proof of Lemma 5.5), we can choose $H_i^2$ to be three orders of magnitude larger than $H_i^1$ and still have a covering relation between them.
The following lemma was proved with computer assistance.
Lemma 5.3. The maps $P_+ : U_1 \to \Theta_+$ and $P_- : U_2 \to \Theta_-$ are well defined. Moreover we have
$$[DP_+(U_1)] \subset \begin{pmatrix} A_1 & B_1 \\ C_1 & D_1 \end{pmatrix}, \qquad [DP_-(U_2)] \subset \begin{pmatrix} A_2 & B_2 \\ C_2 & D_2 \end{pmatrix}, \qquad (5.60)$$
where the intervals $A_i$, $B_i$, $C_i$, $D_i$ are given below:
$A_1 = [695.6589, 696.1086]$, $B_1 = [270.3511, 270.4974]$,
$C_1 = [1789.91096, 1791.46262]$, $D_1 = [695.6197, 696.12451]$,
$A_2 = [573.3982, 573.8351]$, $B_2 = [251.3098, 251.4675]$,
$C_2 = [1308.16775, 1309.52027]$, $D_2 = [573.3612, 573.84802]$.
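The determinant condition of Theorem 4.2 can be evaluated from these bounds with elementary interval arithmetic. The sketch below uses a minimal hypothetical `Interval` class with ordinary floating-point operations and no outward rounding, so it is an illustration only and not a rigorous enclosure; here the margin turns out to be far larger than any rounding error.

```python
class Interval:
    """Closed interval [lo, hi] with naive (not outward-rounded) arithmetic."""

    def __init__(self, lo, hi):
        if lo > hi:
            raise ValueError("lo must not exceed hi")
        self.lo, self.hi = lo, hi

    def __sub__(self, other):
        return Interval(self.lo - other.hi, self.hi - other.lo)

    def __mul__(self, other):
        products = [self.lo * other.lo, self.lo * other.hi,
                    self.hi * other.lo, self.hi * other.hi]
        return Interval(min(products), max(products))

    def __repr__(self):
        return f"[{self.lo}, {self.hi}]"


def det_minus_identity(a, b, c, d):
    """Interval bound for det(M - Id) over all M = [[a, b], [c, d]] in the interval matrix."""
    one = Interval(1.0, 1.0)
    return (a - one) * (d - one) - b * c


# Entries of [DP_+(U_1)] and [DP_-(U_2)] as listed in Lemma 5.3.
DP_PLUS_U1 = (Interval(695.6589, 696.1086), Interval(270.3511, 270.4974),
              Interval(1789.91096, 1791.46262), Interval(695.6197, 696.12451))
DP_MINUS_U2 = (Interval(573.3982, 573.8351), Interval(251.3098, 251.4675),
               Interval(1308.16775, 1309.52027), Interval(573.3612, 573.84802))
```

Both `det_minus_identity(*DP_PLUS_U1)` and `det_minus_identity(*DP_MINUS_U2)` are intervals lying entirely below zero, so $0$ is excluded, which is the hypothesis of Theorem 4.2.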
Using the above lemma and the symmetry $R$ we can now prove the following:

Lemma 5.4. There exists exactly one fixed point $L_1^* = (x_1^*, 0) \in U_1$ for $P_+$. Moreover, we have $|x_1^* - x_1| < \eta_1$. There exists exactly one fixed point $L_2^* = (x_2^*, 0) \in U_2$ for $P_-$. Moreover, we have $|x_2^* - x_2| < \eta_2$.

Proof. We write down the proof for $L_1^*$ only; the proof for $L_2^*$ is analogous. An easy computation shows that $\det([DP_+(U_1)] - \mathrm{Id}) < 0$, hence from Theorem 4.2 it follows that there exists at most one fixed point for $P_+$ in $U_1$. Since $I_1 \subset U_1$, we know from Theorem 5.1 that one such fixed point $L_1^* = (x_1^*, 0) \in I_1$ exists. The estimate for $|x_1^* - x_1|$ was also given in Theorem 5.1.

Lemma 5.5. There exist $R$-symmetric h-sets $H_1$ and $H_2$ such that $|H_1| \subset U_1$, $|H_2| \subset U_2$, $L_1^* \in H_1$ and $L_2^* \in H_2$, and the following conditions hold:
1. $P_+$ is hyperbolic on $|H_1|$.
2. $P_-$ is hyperbolic on $|H_2|$.

Proof. We proceed as outlined in step 2 in Section 4.2. First we need to find a coordinate frame (via an affine transformation) in which the inequality (4.41) is satisfied for $(f = P_+, U_f = U_1)$ and $(f = P_-, U_f = U_2)$. From Lemma 5.3 it follows that $P_+$ is defined on $U_1$ and $P_-$ is defined on $U_2$. Observe that the transformation of $[DP_+(U_1)]$ ($[DP_-(U_2)]$) to new coordinates does not depend on the exact location of $L_1^*$ ($L_2^*$). In the new coordinates $L_1^* = L_2^* = 0$, but we have to choose the coordinate directions in $U_1$ and $U_2$. It turns out that the vectors $(u_i, s_i)$ which were used in the definition of $H_i^1$ are good for this purpose, as they are reasonably good approximations of the unstable and stable directions of the corresponding Poincaré map. A short computation shows that in the new coordinates we obtain
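\[
[DP_+(U_1)] \subset \begin{bmatrix} \lambda_{1,1} & \varepsilon_{1,1} \\ \varepsilon_{1,2} & \lambda_{1,2} \end{bmatrix}, \qquad
[DP_-(U_2)] \subset \begin{bmatrix} \lambda_{2,1} & \varepsilon_{2,1} \\ \varepsilon_{2,2} & \lambda_{2,2} \end{bmatrix}, \tag{5.61}
\]
where
\[
\begin{aligned}
\lambda_{1,1} &= [1391.271,\ 1392.239], & \lambda_{1,2} &= [-0.4829,\ 0.4842], \\
\varepsilon_{1,1} &= [-0.4951,\ 0.4719], & \varepsilon_{1,2} &= [-0.4836,\ 0.4835], \\
\lambda_{2,1} &= [1146.751,\ 1147.69], & \lambda_{2,2} &= [-0.4686,\ 0.4697], \\
\varepsilon_{2,1} &= [-0.4815,\ 0.4567], & \varepsilon_{2,2} &= [-0.4687,\ 0.4695].
\end{aligned}
\]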
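It is clear that $\lambda_{i,2} < 1 < \lambda_{i,1}$ and $\varepsilon_{i,1}\varepsilon_{i,2} < (1 - \lambda_{i,2})(\lambda_{i,1} - 1)$. Moreover,
\[
\frac{\varepsilon_{1,1}}{\lambda_{1,1} - 1} < 1 < \frac{1 - \lambda_{1,2}}{\varepsilon_{1,2}}, \tag{5.62}
\]
\[
\frac{\varepsilon_{2,1}}{\lambda_{2,1} - 1} < 1 < \frac{1 - \lambda_{2,2}}{\varepsilon_{2,2}}. \tag{5.63}
\]
We define $H_i$ for $i = 1, 2$ as follows:
\[
H_i = t(L_i^*, \alpha_i u_i, \alpha_i s_i), \tag{5.64}
\]
where $\alpha_i$ were given in (5.57). Observe that by construction
\[
|H_i| \subset |H_i^1| \subset U_i. \tag{5.65}
\]
This shows that $P_+$ is hyperbolic on $|H_1|$ and $P_-$ is hyperbolic on $|H_2|$.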
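The two inequalities stated right after (5.61) can be checked in the worst case over the interval enclosures. The sketch below does this with the printed bounds; the dictionary layout and function names are hypothetical, the magnitudes of the $\varepsilon$ intervals are used for the product, and ordinary floating point replaces rigorous interval arithmetic.

```python
# Interval enclosures from (5.61), stored as (lo, hi) pairs.
ENCLOSURES = {
    "P+ on U1": {"lam1": (1391.271, 1392.239), "eps1": (-0.4951, 0.4719),
                 "lam2": (-0.4829, 0.4842), "eps2": (-0.4836, 0.4835)},
    "P- on U2": {"lam1": (1146.751, 1147.69), "eps1": (-0.4815, 0.4567),
                 "lam2": (-0.4686, 0.4697), "eps2": (-0.4687, 0.4695)},
}

def magnitude(interval):
    """Largest absolute value attained on the interval."""
    return max(abs(interval[0]), abs(interval[1]))

def worst_case_margins(e):
    """Lower bounds for 1 - lam2 and lam1 - 1, upper bound for |eps1 * eps2|."""
    gap_stable = 1.0 - e["lam2"][1]
    gap_unstable = e["lam1"][0] - 1.0
    coupling = magnitude(e["eps1"]) * magnitude(e["eps2"])
    return gap_stable, gap_unstable, coupling

def conditions_hold(e):
    """Check lam2 < 1 < lam1 and eps1*eps2 < (1 - lam2)(lam1 - 1) on the whole enclosure."""
    gap_stable, gap_unstable, coupling = worst_case_margins(e)
    return gap_stable > 0 and gap_unstable > 0 and coupling < gap_stable * gap_unstable

results = {name: conditions_hold(e) for name, e in ENCLOSURES.items()}
print(results)
```

The margin is large because the off-diagonal terms are of order one while $\lambda_{i,1} - 1$ is of order $10^3$.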
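With computer assistance we proved the following lemma.

Lemma 5.6. Let $H_1$ and $H_2$ be the h-sets obtained in Lemma 5.5. Then
• $H_1 \overset{P_+}{\Longrightarrow} H_1 \overset{P_+}{\Longrightarrow} H_1^2$,
• $H_2^2 \overset{P_-}{\Longrightarrow} H_2 \overset{P_-}{\Longrightarrow} H_2$.

Proof. Consider the following fuzzy h-sets:
\[
\tilde H_i = t(W_i, \alpha_i u_i, \alpha_i s_i), \tag{5.66}
\]
where $W_i$ are defined by Eq. (5.58). We assume that $|\tilde H_1| \subset \Theta_+$ and $|\tilde H_2| \subset \Theta_-$. Observe that $H_i \in \tilde H_i$. The fuzzy sets $\tilde H_i$ reflect our lack of knowledge of the exact coordinates of $L_i^*$. The following covering relations were established with computer assistance (see Fig. 8):
\[
\tilde H_1 \overset{P_+}{\Longrightarrow} \tilde H_1 \overset{P_+}{\Longrightarrow} H_1^2, \tag{5.67}
\]
\[
H_2^2 \overset{P_-}{\Longrightarrow} \tilde H_2 \overset{P_-}{\Longrightarrow} \tilde H_2. \tag{5.68}
\]
The assertion of the lemma now follows immediately from Def. 4.6.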
Fig. 8. The set $H_1^2$ (the large parallelogram), the fuzzy set $\tilde H_1$ (a small set in the center) and $P_+(\tilde H_1)$ (the nearly diagonal segment across $H_1^2$) illustrating the covering relation $\tilde H_1 \overset{P_+}{\Longrightarrow} \tilde H_1 \overset{P_+}{\Longrightarrow} H_1^2$. Vertical edges (when in color: red and blue) are marked by a bold line
6. Existence of Homo- and Heteroclinic Connections for Lyapunov Orbits

In this section we prove Theorem 1.1 with computer assistance. During the proof we define h-sets which will be used later in the construction of symbolic dynamics in the proof of Theorem 1.2.

6.1. Existence of heteroclinic connection between Lyapunov orbits. In order to prove the existence of a heteroclinic connection between $L_1^*$ and $L_2^*$ we need to find a chain of covering relations which starts close to $L_1^*$ (begins with $H_1^2$) and ends close to $L_2^*$ (with $H_2^2$). To this end we choose the sets $N_i$ along a numerically constructed (nonrigorous) heteroclinic orbit, in the vicinity of the intersections of such an orbit with the section $\Theta$ (see Fig. 9). Let $N_i = t(X_i, u_i, s_i)$ be h-sets, where
\[
\begin{array}{clll}
i & X_i & s_i & u_i \\
0 & (0.9522928423486199945,\ 1.23 \cdot 10^{-5}) & (-4 \cdot 10^{-6},\ 1.45 \cdot 10^{-5}) & -R(s_0)/10 \\
1 & (0.921005737890425169,\ 0.0005205932817646883714) & (-4.5 \cdot 10^{-7},\ 76 \cdot 10^{-6}) & -R(s_1)/10 \\
2 & (0.957916338594066441,\ 0.02191497366476494527) & (-1.2 \cdot 10^{-7},\ 2.92 \cdot 10^{-7}) & -R(s_2) \\
3 & (1.030069865952822683,\ 0.00330658676251664686) & (-1.05 \cdot 10^{-7},\ 2.92 \cdot 10^{-7}) & -R(s_3) \\
4 & (0.967306682018305608,\ 0.003703230165036550462) & (-1 \cdot 10^{-7},\ 2.9 \cdot 10^{-7}) & -R(s_4)/2 \\
5 & (1.040628850444842879,\ 0.02317063455298806404) & (-1.44 \cdot 10^{-7},\ 5.8 \cdot 10^{-7}) & -R(s_5)/6 \\
6 & (1.081670357450509545,\ 0.0005918226490172379421) & (-1.625 \cdot 10^{-7},\ 3.75 \cdot 10^{-7}) & -R(s_6)/2 \\
7 & (1.046819673646057103,\ 2.13365065043902489 \cdot 10^{-5}) & (-8.3 \cdot 10^{-7},\ 2.9 \cdot 10^{-6}) & -R(s_7)/5
\end{array}
\]
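To make the notation $t(X_i, u_i, s_i)$ concrete, the following sketch builds the corners of the parallelogram $N_0$ from the data above. It assumes that the support of $t(c, u, s)$ is $\{c + p\,u + q\,s : p, q \in [-1, 1]\}$ (consistent with the coordinate map recalled in Remark 8.5) and that the symmetry acts on section coordinates as $R(a, b) = (a, -b)$; the latter is an assumption made for this illustration, and all function names are hypothetical.

```python
def reflect(v):
    """Assumed form of the symmetry R in section coordinates: (a, b) -> (a, -b)."""
    return (v[0], -v[1])

def scale(v, k):
    """Multiply a planar vector by the scalar k."""
    return (k * v[0], k * v[1])

def h_set_corners(c, u, s):
    """Corners of the parallelogram c + p*u + q*s with p, q in {-1, 1}."""
    signs = ((-1, -1), (1, -1), (1, 1), (-1, 1))
    return [(c[0] + p * u[0] + q * s[0], c[1] + p * u[1] + q * s[1])
            for p, q in signs]

# Data of N_0 as listed above.
X0 = (0.9522928423486199945, 1.23e-5)
s0 = (-4e-6, 1.45e-5)
u0 = scale(reflect(s0), -1.0 / 10.0)   # u_0 = -R(s_0)/10

corners = h_set_corners(X0, u0, s0)
for corner in corners:
    print(corner)
```

Under these assumptions $u_0$ is the mirror image of $s_0$ shortened by a factor of 10, so $N_0$ is a thin parallelogram elongated along the approximate stable direction.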
Fig. 9. The location of sets Ni along a heteroclinic orbit in (x, y)-coordinates
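The vectors $s_i$ were chosen to be a good approximation of the stable direction at $X_i$. The vectors $u_i$ were chosen as a symmetric image of $s_i$, but usually with a different length.

Remark 6.1. For our computations to succeed, the vectors $u_i$ could have been chosen quite arbitrarily (but not too close to $s_i$). For example, numerical explorations show that if we choose $u_i = \alpha R(s_i)$, then the time of the rigorous computation (Lemma 6.2) is reduced by a factor of about 3 in comparison to $u_i$ parallel to the $x$-axis.

We assume that $N_0, N_2, N_4, N_6 \subset \Theta_-$ and $N_1, N_3, N_5, N_7 \subset \Theta_+$. With computer assistance we proved the following lemma.

Lemma 6.2. The maps
\[
P_{1/2,+} : H_1^2 \cup N_1 \cup N_3 \cup N_5 \cup N_7 \longrightarrow \Theta_-
\quad\text{and}\quad
P_{1/2,-} : N_0 \cup N_2 \cup N_4 \cup N_6 \longrightarrow \Theta_+
\]
are well defined and continuous. Moreover, we have the following chain of covering relations:
\[
\begin{aligned}
H_1^2 &\overset{P_{1/2,+}}{\Longrightarrow} N_0 \overset{P_{1/2,-}}{\Longrightarrow} N_1 \overset{P_{1/2,+}}{\Longrightarrow} N_2 \overset{P_{1/2,-}}{\Longrightarrow} N_3 \overset{P_{1/2,+}}{\Longrightarrow} N_4 \\
&\overset{P_{1/2,-}}{\Longrightarrow} N_5 \overset{P_{1/2,+}}{\Longrightarrow} N_6 \overset{P_{1/2,-}}{\Longrightarrow} N_7 \overset{P_{1/2,+}}{\Longrightarrow} H_2^2.
\end{aligned}
\]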
Figure 10 illustrates the chain of covering relations from Lemma 6.2 obtained by a nonrigorous procedure. Now we are ready to prove the part of Theorem 1.1 concerning existence of heteroclinic connections. Theorem 6.3. For PCR3BP with C = 3.03, µ = 0.0009537 there exist two periodic solutions in the Jupiter region, L∗1 and L∗2 , called Lyapunov orbits, and there exist heteroclinic connections between them, in both directions.
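Fig. 10a–h. A chain of covering relations. (a) $H_1^2 \overset{P_{1/2,+}}{\Longrightarrow} N_0$, (b) $N_0 \overset{P_{1/2,-}}{\Longrightarrow} N_1$, (c) $N_1 \overset{P_{1/2,+}}{\Longrightarrow} N_2$, (d) $N_2 \overset{P_{1/2,-}}{\Longrightarrow} N_3$, (e) $N_3 \overset{P_{1/2,+}}{\Longrightarrow} N_4$, (f) $N_4 \overset{P_{1/2,-}}{\Longrightarrow} N_5$, (g) $N_5 \overset{P_{1/2,+}}{\Longrightarrow} N_6$, (h) $N_6 \overset{P_{1/2,-}}{\Longrightarrow} N_7$. These pictures were not produced by a rigorous procedure, as we checked the covering relations by a less direct approach to reduce the computation time; see Sect. 8 for details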
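Proof. We prove only the existence of the connection from $L_1^*$ to $L_2^*$. The existence of the connection in the opposite direction is obtained by the symmetry $R$. From Lemmas 5.6 and 6.2 it follows that there exists the following chain of covering relations:
\[
\begin{aligned}
H_1 &\overset{P_+}{\Longrightarrow} H_1 \overset{P_+}{\Longrightarrow} H_1^2 \overset{P_{1/2,+}}{\Longrightarrow} N_0 \overset{P_{1/2,-}}{\Longrightarrow} N_1 \overset{P_{1/2,+}}{\Longrightarrow} N_2 \overset{P_{1/2,-}}{\Longrightarrow} N_3 \overset{P_{1/2,+}}{\Longrightarrow} N_4 \\
&\overset{P_{1/2,-}}{\Longrightarrow} N_5 \overset{P_{1/2,+}}{\Longrightarrow} N_6 \overset{P_{1/2,-}}{\Longrightarrow} N_7 \overset{P_{1/2,+}}{\Longrightarrow} H_2^2 \overset{P_-}{\Longrightarrow} H_2 \overset{P_-}{\Longrightarrow} H_2.
\end{aligned}
\]
The assertion now follows from Lemma 5.5 and Theorem 4.5.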
6.2. Homoclinic connection in an exterior region. In this section we establish the existence of an orbit homoclinic to $L_2^*$ (see Fig. 11). To this end we find a chain of covering relations which starts close to $L_2^*$, passes through sets located in the exterior region and then ends close to $L_2^*$. We choose the sets $E_i$ along a numerically constructed (nonrigorous) homoclinic orbit, in the vicinity of the intersections of such an orbit with the section $\Theta$ (see Fig. 11). We define the following h-sets $E_i = t(Y_i, u_i, s_i)$, where
\[
\begin{array}{clll}
i & Y_i & s_i & u_i \\
0 & (-2.08509704964865536,\ 0) & (-1 \cdot 10^{-7},\ 3 \cdot 10^{-8}) & -R(s_0) \\
1 & (1.160261327316386816,\ -0.1812035059427922688) & (1 \cdot 10^{-7},\ 8 \cdot 10^{-8}) & -4R(s_1) \\
2 & (1.059527808809695232,\ -0.03871458787165545984) & (-3 \cdot 10^{-7},\ 81 \cdot 10^{-8}) & -R(s_2)/10 \\
3 & (1.082284499686768768,\ -0.0008090412116073312256) & (-1 \cdot 10^{-7},\ 23 \cdot 10^{-8}) & -R(s_3)/4 \\
4 & (1.046834433386131072,\ -0.00002957990840481726976) & (-1 \cdot 10^{-7},\ 35 \cdot 10^{-8}) & -R(s_4)/4 \\
5 & (1.081929798158888576,\ -0.0000007068412578518833152) & (-1 \cdot 10^{-8},\ 22817915 \cdot 10^{-15}) & -R(s_5)/2
\end{array}
\]
We assume that $E_0, E_2, E_4 \subset \Theta_+$ and $E_1, E_3, E_5 \subset \Theta_-$.
Fig. 11. Homoclinic orbits to L∗1 and L∗2 Lyapunov orbits
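With computer assistance we proved the following lemma.

Lemma 6.4. The maps
\[
P_{1/2,+} : E_0 \cup E_2 \cup E_4 \to \Theta_-, \qquad
P_{1/2,-} : E_1 \cup E_3 \to \Theta_+, \qquad
P_- : E_5 \to \Theta_-
\]
are well defined and continuous. Moreover, we have the following chain of covering relations:
\[
E_0 \overset{P_{1/2,+}}{\Longrightarrow} E_1 \overset{P_{1/2,-}}{\Longrightarrow} E_2 \overset{P_{1/2,+}}{\Longrightarrow} E_3 \overset{P_{1/2,-}}{\Longrightarrow} E_4 \overset{P_{1/2,+}}{\Longrightarrow} E_5 \overset{P_-}{\Longrightarrow} H_2^2.
\]
We are now ready to state the basic theorem of this section.

Theorem 6.5. For PCR3BP with $C = 3.03$, $\mu = 0.0009537$ there exists an orbit homoclinic to $L_2^*$.

Proof. From Lemmas 6.4 and 5.6 it follows that
\[
E_0 \overset{P_{1/2,+}}{\Longrightarrow} E_1 \overset{P_{1/2,-}}{\Longrightarrow} E_2 \overset{P_{1/2,+}}{\Longrightarrow} E_3 \overset{P_{1/2,-}}{\Longrightarrow} E_4 \overset{P_{1/2,+}}{\Longrightarrow} E_5 \overset{P_-}{\Longrightarrow} H_2^2 \overset{P_-}{\Longrightarrow} H_2. \tag{6.69}
\]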
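Observe that from the definition of $E_0$ it follows that $E_0$ is $R$-symmetric. From Corollary 3.14, Lemma 5.6 and Eq. (6.69) we obtain
\[
\begin{aligned}
H_2 = R(H_2) &\overset{P_-}{\Longleftarrow} R(H_2^2) \overset{P_-}{\Longleftarrow} R(E_5) \overset{P_{1/2,-}}{\Longleftarrow} R(E_4) \overset{P_{1/2,+}}{\Longleftarrow} R(E_3) \\
&\overset{P_{1/2,-}}{\Longleftarrow} R(E_2) \overset{P_{1/2,+}}{\Longleftarrow} R(E_1) \overset{P_{1/2,-}}{\Longleftarrow} E_0 = R(E_0).
\end{aligned} \tag{6.70}
\]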
From (6.69), (6.70), Lemma 5.6 and Theorem 4.5 we obtain an orbit homoclinic to $L_2^*$.

6.3. Homoclinic connection in interior region. In this section we establish the existence of an orbit homoclinic to $L_1^*$ (see Fig. 11). To this end we find a chain of covering relations which starts close to $L_1^*$, passes through sets located in the interior region and then ends close to $L_1^*$. We choose the sets $F_i$ along a numerically constructed (nonrigorous) homoclinic orbit, in the vicinity of the intersections of such an orbit with the section $\Theta$ (see Fig. 11). We define the following h-sets $F_i = t(Z_i, u_i, s_i)$, where
\[
\begin{array}{clll}
i & Z_i & s_i & u_i \\
0 & (-0.6160415155975000064,\ 0) & (-1 \cdot 10^{-7},\ 25 \cdot 10^{-8}) & -R(s_0) \\
1 & (0.84668503722876047360,\ 0.17563753764246766080) & (1 \cdot 10^{-7},\ 92 \cdot 10^{-9}) & -2.2R(s_1) \\
2 & (0.94793695784874987520,\ 0.01522141990729746432) & (-25 \cdot 10^{-9},\ \tfrac{33}{4} \cdot 10^{-8}) & -R(s_2)/5 \\
3 & (0.92067611200358768640,\ 0.00032764933375860776) & (-1 \cdot 10^{-7},\ 26 \cdot 10^{-8}) & -R(s_3)/6 \\
4 & (0.95228425894935162880,\ 0.00001048139819208300) & (-1 \cdot 10^{-7},\ 37 \cdot 10^{-8}) & -R(s_4)/6
\end{array}
\]
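We assume that $F_0, F_2, F_4 \subset \Theta_-$ and $F_1, F_3 \subset \Theta_+$. With computer assistance we proved the following lemma.

Lemma 6.6. The maps
\[
P_{1/2,-} : F_0 \cup F_2 \cup F_4 \to \Theta_+, \qquad
P_{1/2,+} : F_1 \cup F_3 \to \Theta_-
\]
are well defined and continuous. Moreover, we have the following covering relations:
\[
F_0 \overset{P_{1/2,-}}{\Longrightarrow} F_1 \overset{P_{1/2,+}}{\Longrightarrow} F_2 \overset{P_{1/2,-}}{\Longrightarrow} F_3 \overset{P_{1/2,+}}{\Longrightarrow} F_4 \overset{P_{1/2,-}}{\Longrightarrow} H_1^2.
\]
We are now ready to state the basic theorem of this section.

Theorem 6.7. For PCR3BP with $C = 3.03$, $\mu = 0.0009537$ there exists an orbit homoclinic to $L_1^*$.

Proof. From Lemmas 6.6 and 5.6 it follows that
\[
F_0 \overset{P_{1/2,-}}{\Longrightarrow} F_1 \overset{P_{1/2,+}}{\Longrightarrow} F_2 \overset{P_{1/2,-}}{\Longrightarrow} F_3 \overset{P_{1/2,+}}{\Longrightarrow} F_4 \overset{P_{1/2,-}}{\Longrightarrow} H_1^2 \overset{P_+}{\Longrightarrow} H_1. \tag{6.71}
\]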
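Observe that from the definition of $F_0$ it follows that $F_0$ is $R$-symmetric. From Corollary 3.14, Lemma 5.6 and Eq. (6.71) we obtain
\[
\begin{aligned}
H_1 = R(H_1) &\overset{P_+}{\Longleftarrow} R(H_1^2) \overset{P_{1/2,+}}{\Longleftarrow} R(F_4) \overset{P_{1/2,-}}{\Longleftarrow} R(F_3) \overset{P_{1/2,+}}{\Longleftarrow} R(F_2) \\
&\overset{P_{1/2,-}}{\Longleftarrow} R(F_1) \overset{P_{1/2,+}}{\Longleftarrow} R(F_0) = F_0.
\end{aligned} \tag{6.72}
\]
From (6.71), (6.72), Lemma 5.6 and Theorem 4.5 we obtain an orbit homoclinic to $L_1^*$.

Proof of Theorem 1.1. We combine Theorems 6.3, 6.5 and 6.7.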
7. Symbolic Dynamics on Four Symbols

The goal of this section is to give a precise meaning and a proof to Theorem 1.2. As in previous sections, in the symbol of the covering relation we will drop the degree part; hence we will use $N \overset{f}{\Longrightarrow} M$ instead of $N \overset{f,w}{\Longrightarrow} M$ for some nonzero $w$.
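From Lemmas 5.6 and 6.2 we know that there exists the following chain of covering relations:
\[
\begin{aligned}
H_1 &\overset{P_+}{\Longrightarrow} H_1 \overset{P_+}{\Longrightarrow} H_1^2 \overset{P_{1/2,+}}{\Longrightarrow} N_0 \overset{P_{1/2,-}}{\Longrightarrow} N_1 \overset{P_{1/2,+}}{\Longrightarrow} N_2 \overset{P_{1/2,-}}{\Longrightarrow} N_3 \overset{P_{1/2,+}}{\Longrightarrow} N_4 \\
&\overset{P_{1/2,-}}{\Longrightarrow} N_5 \overset{P_{1/2,+}}{\Longrightarrow} N_6 \overset{P_{1/2,-}}{\Longrightarrow} N_7 \overset{P_{1/2,+}}{\Longrightarrow} H_2^2 \overset{P_-}{\Longrightarrow} H_2 \overset{P_-}{\Longrightarrow} H_2.
\end{aligned} \tag{7.73}
\]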
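From Lemmas 5.5 and 3.12 we have $R(H_i) = H_i$ for $i = 1, 2$. From Lemmas 5.6, 6.2 and Corollary 3.14 it follows that
\[
\begin{aligned}
H_2 = R(H_2) &\overset{P_-}{\Longleftarrow} R(H_2^2) \overset{P_{1/2,-}}{\Longleftarrow} R(N_7) \overset{P_{1/2,+}}{\Longleftarrow} R(N_6) \overset{P_{1/2,-}}{\Longleftarrow} R(N_5) \overset{P_{1/2,+}}{\Longleftarrow} R(N_4) \\
&\overset{P_{1/2,-}}{\Longleftarrow} R(N_3) \overset{P_{1/2,+}}{\Longleftarrow} R(N_2) \overset{P_{1/2,-}}{\Longleftarrow} R(N_1) \overset{P_{1/2,+}}{\Longleftarrow} R(N_0) \\
&\overset{P_{1/2,-}}{\Longleftarrow} R(H_1^2) \overset{P_+}{\Longleftarrow} R(H_1) = H_1.
\end{aligned} \tag{7.74}
\]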
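From Lemma 6.4 and the proof of Theorem 6.5 it follows that
\[
E_0 \overset{P_{1/2,+}}{\Longrightarrow} E_1 \overset{P_{1/2,-}}{\Longrightarrow} E_2 \overset{P_{1/2,+}}{\Longrightarrow} E_3 \overset{P_{1/2,-}}{\Longrightarrow} E_4 \overset{P_{1/2,+}}{\Longrightarrow} E_5 \overset{P_-}{\Longrightarrow} H_2^2 \overset{P_-}{\Longrightarrow} H_2, \tag{7.75}
\]
and
\[
\begin{aligned}
H_2 = R(H_2) &\overset{P_-}{\Longleftarrow} R(H_2^2) \overset{P_-}{\Longleftarrow} R(E_5) \overset{P_{1/2,-}}{\Longleftarrow} R(E_4) \overset{P_{1/2,+}}{\Longleftarrow} R(E_3) \\
&\overset{P_{1/2,-}}{\Longleftarrow} R(E_2) \overset{P_{1/2,+}}{\Longleftarrow} R(E_1) \overset{P_{1/2,-}}{\Longleftarrow} E_0 = R(E_0).
\end{aligned} \tag{7.76}
\]
From Lemma 6.6 and the proof of Theorem 6.7 it follows that
\[
F_0 \overset{P_{1/2,-}}{\Longrightarrow} F_1 \overset{P_{1/2,+}}{\Longrightarrow} F_2 \overset{P_{1/2,-}}{\Longrightarrow} F_3 \overset{P_{1/2,+}}{\Longrightarrow} F_4 \overset{P_{1/2,-}}{\Longrightarrow} H_1^2 \overset{P_+}{\Longrightarrow} H_1 \tag{7.77}
\]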
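and
\[
\begin{aligned}
H_1 = R(H_1) &\overset{P_+}{\Longleftarrow} R(H_1^2) \overset{P_{1/2,+}}{\Longleftarrow} R(F_4) \overset{P_{1/2,-}}{\Longleftarrow} R(F_3) \overset{P_{1/2,+}}{\Longleftarrow} R(F_2) \\
&\overset{P_{1/2,-}}{\Longleftarrow} R(F_1) \overset{P_{1/2,+}}{\Longleftarrow} R(F_0) = F_0.
\end{aligned} \tag{7.78}
\]
We will now construct the symbolic dynamics on four symbols. The construction is somewhat involved, because four different maps appear in the covering relations listed above. We assign symbols as follows: 1 – the set $H_1$, 2 – $H_2$, 3 – $E_0$ and 4 – $F_0$. The covering relations allow for the transitions $1 \to 1$, $1 \to 2$, $1 \to 4$, $2 \to 1$, $2 \to 2$, $2 \to 3$, $3 \to 2$ and $4 \to 1$. To each such transition $i \to j$ we associate the pair $(j, i)$. This defines a set of admissible pairs $\mathcal{A}$. For any $(\alpha, \beta) \in \mathcal{A}$ we define a map $f_{(\alpha,\beta)}$ as follows:
\[
f_{(\alpha,\beta)} =
\begin{cases}
P_+ & \text{if } (\alpha,\beta) = (1,1), \\
P_- \circ P_{1/2,+} \circ (P_{1/2,-} \circ P_{1/2,+})^4 \circ P_+ & \text{if } (\alpha,\beta) = (2,1), \\
P_+ \circ P_{1/2,-} \circ (P_{1/2,+} \circ P_{1/2,-})^4 \circ P_- & \text{if } (\alpha,\beta) = (1,2), \\
P_- & \text{if } (\alpha,\beta) = (2,2), \\
P_{1/2,+} \circ (P_{1/2,-} \circ P_{1/2,+})^2 \circ P_+ & \text{if } (\alpha,\beta) = (4,1), \\
P_{1/2,-} \circ (P_{1/2,+} \circ P_{1/2,-})^2 \circ P_-^2 & \text{if } (\alpha,\beta) = (3,2), \\
P_+ \circ P_{1/2,-} \circ (P_{1/2,+} \circ P_{1/2,-})^2 & \text{if } (\alpha,\beta) = (1,4), \\
P_-^2 \circ P_{1/2,+} \circ (P_{1/2,-} \circ P_{1/2,+})^2 & \text{if } (\alpha,\beta) = (2,3).
\end{cases}
\]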
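The compositions $f_{(\alpha,\beta)}$ can be cross-checked for consistency of domains. The sketch below lists, for each admissible pair, the maps in the order in which they are applied, as read off from the chains (7.73)–(7.78), and verifies that each map starts on the section where the previous one ended. The string labels, the bookkeeping of sections ("+" for $\Theta_+$, "-" for $\Theta_-$) and the function name are hypothetical; the location of each set on $\Theta_\pm$ is taken from the assumptions stated in Sects. 5 and 6.

```python
# Each map sends one section to another: (source, target).
SECTIONS = {"P+": ("+", "+"), "P-": ("-", "-"),
            "Ph+": ("+", "-"),   # P_{1/2,+}
            "Ph-": ("-", "+")}   # P_{1/2,-}

# Section containing S_1 = H_1, S_2 = H_2, S_3 = E_0, S_4 = F_0.
HOME = {1: "+", 2: "-", 3: "+", 4: "-"}

# Key (alpha, beta): maps taking S_beta to S_alpha, first applied map first.
WORDS = {
    (1, 1): ["P+"],
    (2, 1): ["P+"] + ["Ph+", "Ph-"] * 4 + ["Ph+", "P-"],
    (1, 2): ["P-"] + ["Ph-", "Ph+"] * 4 + ["Ph-", "P+"],
    (2, 2): ["P-"],
    (4, 1): ["P+"] + ["Ph+", "Ph-"] * 2 + ["Ph+"],
    (3, 2): ["P-", "P-"] + ["Ph-", "Ph+"] * 2 + ["Ph-"],
    (1, 4): ["Ph-", "Ph+"] * 2 + ["Ph-", "P+"],
    (2, 3): ["Ph+", "Ph-"] * 2 + ["Ph+", "P-", "P-"],
}

def end_section(word, start):
    """Follow the word from `start`; return the final section, or None if a map does not fit."""
    current = start
    for name in word:
        source, target = SECTIONS[name]
        if source != current:
            return None
        current = target
    return current

consistent = {pair: end_section(word, HOME[pair[1]]) == HOME[pair[0]]
              for pair, word in WORDS.items()}
print(consistent)
```

All eight words compose correctly, which is a cheap check that the exponents and the order of the half-return maps agree with the chains.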
Let $\Sigma_{\mathcal{A}} \subset \{1, 2, 3, 4\}^{\mathbb{Z}}$ be defined as follows: $c \in \Sigma_{\mathcal{A}}$ iff for every $i \in \mathbb{Z}$, $(c_i, c_{i+1}) \in \mathcal{A}$. We set $S_1 = H_1$, $S_2 = H_2$, $S_3 = E_0$ and $S_4 = F_0$. We can now formulate the theorem about the existence of symbolic dynamics on four symbols.

Theorem 7.1. For any sequence $\alpha = \{\alpha_i\} \in \Sigma_{\mathcal{A}}$ there exists a point $x_0 \in S_{\alpha_0}$ such that
• its trajectory exists for $t \in (-\infty, \infty)$,
• $x_n = f_{(\alpha_n,\alpha_{n-1})} \circ \cdots \circ f_{(\alpha_2,\alpha_1)} \circ f_{(\alpha_1,\alpha_0)}(x_0) \in S_{\alpha_n}$ for $n > 0$,
• $x_n = f^{-1}_{(\alpha_{n+1},\alpha_n)} \circ \cdots \circ f^{-1}_{(\alpha_{-1},\alpha_{-2})} \circ f^{-1}_{(\alpha_0,\alpha_{-1})}(x_0) \in S_{\alpha_n}$ for $n < 0$.
Moreover, we have
periodic orbits: If $\alpha$ is periodic with principal period equal to $k$, then $x_0$ can be chosen so that $x_k = x_0$, hence its trajectory is periodic.
homo- and heteroclinic orbits: If $\alpha$ is such that $\alpha_k = i_-$ for $k \le k_-$ and $\alpha_k = i_+$ for $k \ge k_+$, where $i_-, i_+ \in \{1, 2\}$, then
\[
\lim_{n \to -\infty} x_n = L^*_{i_-}, \qquad \lim_{n \to \infty} x_n = L^*_{i_+}.
\]
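The combinatorial side of the theorem can be illustrated with a small sketch that tests whether a finite word of symbols respects the eight allowed transitions and counts the periodic admissible words of a given length by brute force. Transitions are stored as (from, to) pairs exactly as listed in the text; the function names are hypothetical, and the sketch says nothing about the actual orbits, only about which symbol sequences are allowed.

```python
from itertools import product

# Allowed transitions i -> j between the symbols 1 (H_1), 2 (H_2), 3 (E_0), 4 (F_0).
TRANSITIONS = {(1, 1), (1, 2), (1, 4), (2, 1), (2, 2), (2, 3), (3, 2), (4, 1)}

def is_admissible(word):
    """True if every consecutive pair of symbols in the finite word is an allowed transition."""
    return all((a, b) in TRANSITIONS for a, b in zip(word, word[1:]))

def is_admissible_periodic(word):
    """True if the word, repeated periodically, is admissible (wrap-around included)."""
    return is_admissible(list(word) + [word[0]])

def count_periodic(k):
    """Number of words of length k over {1, 2, 3, 4} that are admissible when repeated."""
    return sum(is_admissible_periodic(w) for w in product((1, 2, 3, 4), repeat=k))

example = [1, 2, 3, 2, 1, 4, 1]
print(is_admissible(example))
print([count_periodic(k) for k in range(1, 6)])
```

For instance, the word 1, 2, 3, 2, 1, 4, 1 is admissible: it describes a passage from $H_1$ to $H_2$, an excursion through $E_0$, a return to $H_1$ and an excursion through $F_0$.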
Proof. From the chains of covering relations (7.73), (7.74), (7.75), (7.76), (7.77), (7.78) and Theorem 3.6 we obtain the statement on periodic points for periodic $\alpha$. To treat a nonperiodic $\alpha$ we approximate it by periodic sequences $\beta_n$ with increasing periods to obtain a sequence of points $x^n$, and after possibly passing to a subsequence we obtain $x_0$ with the desired properties. The statement on homo- and heteroclinic orbits is an easy consequence of Theorem 4.5 and the hyperbolicity of $P_\pm$ on $H_i$ established in Lemma 5.5.

Our methods do not allow us to make any claims about the uniqueness of $x_0$ for a given $\alpha$. The only claim of this type we can make is that if $\alpha_n = i$ for all $n \in \mathbb{Z}$, then $x_0 = L_i^*$.

8. Numerical Aspects of the Proof

In this section we give details of the computer assisted proofs of Lemmas 5.2, 5.3, 5.6 and 6.2. As in previous sections, in the symbol of the covering relation we will drop the degree part; hence we will use $N \overset{f}{\Longrightarrow} M$ instead of $N \overset{f,w}{\Longrightarrow} M$ for some nonzero $w$.
8.1. The existence and continuity of Poincaré maps. Hyperbolicity on $U_i$. All proofs require checking first that suitable Poincaré maps ($P_\pm$, $P_{1/2,\pm}$) are defined on some parallelograms (supports of our h-sets) on $\Theta_\pm$. To this end the parallelogram, $Z$, was represented as a finite union of small parallelograms, $Z_i$, and each of the $Z_i$'s was used as an initial condition for our routine computing the necessary Poincaré map, $P_{1/2,\pm}$ or $P_\pm$. We divided horizontal edges into $n$ equal parts (a horizontal grid) and vertical edges into $m$ equal parts (a vertical grid), and hence we covered $Z$ by $n \times m$ parallelograms. Our routine was constructed so that, if it completes successfully, we can claim that $Z_i$ is contained in the domain of $P$ and the computed image contains $P(Z_i)$. Our routine is based on the $C^0$- and $C^1$-Lohner algorithms [14, 23].
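The grid subdivision described above can be sketched as follows. The sketch assumes that the parallelogram is given as $c + [-1,1]\,u + [-1,1]\,s$ with the horizontal edges parallel to $u$ and the vertical edges parallel to $s$; the function name and the example data are hypothetical, and the rigorous integration of each cell (the Lohner algorithm) is not reproduced here.

```python
def subdivide_parallelogram(c, u, s, n, m):
    """Split c + [-1,1]*u + [-1,1]*s into n*m cells, each returned as (center, half_u, half_s)."""
    cells = []
    half_u = (u[0] / n, u[1] / n)
    half_s = (s[0] / m, s[1] / m)
    for i in range(n):
        for j in range(m):
            # Coordinates of the cell center in the frame (u, s), both in (-1, 1).
            p = -1.0 + (2 * i + 1) / n
            q = -1.0 + (2 * j + 1) / m
            center = (c[0] + p * u[0] + q * s[0], c[1] + p * u[1] + q * s[1])
            cells.append((center, half_u, half_s))
    return cells

def area(u, s):
    """Area of the parallelogram c + [-1,1]*u + [-1,1]*s."""
    return 4.0 * abs(u[0] * s[1] - u[1] * s[0])

# Hypothetical example data; the 13 x 13 grid is the one used for the sets U_i.
c, u, s = (0.95, 0.0), (2e-7, 3e-7), (-1e-7, 3e-7)
cells = subdivide_parallelogram(c, u, s, 13, 13)
print(len(cells), sum(area(cu, cs) for _, cu, cs in cells), area(u, s))
```

Each cell would then be passed as an initial condition to the routine computing the Poincaré map, and the union of the computed images encloses the image of the whole parallelogram.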
We had to prove the following assertions:
1. (in Lemma 5.2) $P_{1/2,+}$ is well defined and continuous on $I_1$ and $P_{1/2,-}$ is well defined and continuous on $I_2$.
2. (in Lemma 5.3) $P_+$ is well defined and smooth on $U_1$, $P_-$ is well defined and smooth on $U_2$.
3. (in Lemma 5.6 – Eqs. (5.67) and (5.68)) $P_-$ is well defined and continuous on $H_2^2$. Observe that since $\tilde H_1 \subset U_1$ and $\tilde H_2 \subset U_2$, the previous assertion guarantees the existence and continuity of $P_+$ on $\tilde H_1$ and $P_-$ on $\tilde H_2$.
4. (in Lemmas 6.2, 6.4 and 6.6) $P_{1/2,+}$ is well defined and continuous on $H_1^2$, $N_1$, $N_3$, $N_5$, $N_7$, $E_0$, $E_2$, $E_4$, $F_1$, $F_3$. $P_{1/2,-}$ is well defined and continuous on $N_0$, $N_2$, $N_4$, $N_6$, $E_1$, $E_3$, $F_0$, $F_2$ and $F_4$. $P_-$ is well defined on $E_5$.

The first assertion follows easily from the second one. We reason as follows: since $I_i \subset U_i$, the existence of $P_+$ ($P_-$) on $I_1$ ($I_2$) implies that $P_{1/2,+}$ ($P_{1/2,-}$) is also defined there.

To prove the second assertion we cover $U_i$ by a finite number ($13 \times 13$) of parallelograms. Then we compute an image of each part and an enclosure of the derivative of the Poincaré map using a routine based on the $C^1$-Lohner algorithm recently proposed in [23]. As a consequence we obtain an estimate of $DP_\pm$ (see Lemma 5.3). Parameter settings used in these computations are listed in Table 1. Let us stress that a successful termination of our routine also proves that $P_+$ and $P_-$ are defined on $U_1$ and $U_2$, respectively. From the standard theory it follows that $P_\pm$ are smooth on their domains.

Table 1. Parameter settings of the Taylor method used in $C^1$-computations – in the proof of Lemma 5.3
\[
\begin{array}{ccccc}
\text{set} & \text{order} & \text{step} & \text{horizontal grid} & \text{vertical grid} \\
U_1 & 5 & 0.007 & 13 & 13 \\
U_2 & 5 & 0.007 & 13 & 13
\end{array}
\]
To prove the third and fourth assertions we proceed in a similar way. We cover each set by a finite number of parallelograms and compute an image of each parallelogram. Since an estimate of the derivative of the Poincaré map is not necessary, we used the $C^0$-Lohner algorithm [14, 23]. Parameter settings for these computations are listed in Table 2.

8.2. Details of the proof of Lemma 5.2. To prove inequalities (5.51), (5.52) we had to compute rigorous enclosures for $P_{1/2,+}(x_1 \pm \eta_1, 0)$ and $P_{1/2,-}(x_2 \pm \eta_2, 0)$, respectively. The values of the time step and the order of the Taylor method used in our routine are listed in Table 3. Figures 6 and 7 display the actual enclosures obtained.

8.3. How do we verify covering relations – details of proofs of Lemmas 6.2, 6.4 and 6.6. This is the most computationally demanding part of our program. In principle the same rigorous computations can be used to obtain both the existence of Poincaré maps and the covering relations, but in practice this does not work, as it results in an enormous computation time (see the discussion in Sect. 6 of [5]). It turns out that once the existence of the Poincaré map is established, we can reduce the computations to the boundary of our h-sets and one interval inside only (see Lemmas 8.3
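Table 2. Parameters of the Taylor method used in the proof of existence of the Poincaré map in Lemma 6.2, Lemma 5.6, Lemma 6.4 and Lemma 6.6
\[
\begin{array}{lcccc}
\text{covering relations} & \text{order} & \text{step} & \text{h. grid} & \text{v. grid} \\
\tilde H_1 \overset{P_+}{\Longrightarrow} H_1^2 \overset{P_{1/2,+}}{\Longrightarrow} N_0 \overset{P_{1/2,-}}{\Longrightarrow} N_1 \overset{P_{1/2,+}}{\Longrightarrow} N_2 & & & & \\
N_2 \overset{P_{1/2,-}}{\Longrightarrow} N_3 \overset{P_{1/2,+}}{\Longrightarrow} N_4 \overset{P_{1/2,-}}{\Longrightarrow} N_5 & 5 & 0.01 & 1 & 1 \\
N_5 \overset{P_{1/2,+}}{\Longrightarrow} N_6 \overset{P_{1/2,-}}{\Longrightarrow} N_7 \overset{P_{1/2,+}}{\Longrightarrow} H_2^2 \overset{P_-}{\Longrightarrow} \tilde H_2 & & & & \\[4pt]
E_0 \overset{P_{1/2,+}}{\Longrightarrow} E_1 \overset{P_{1/2,-}}{\Longrightarrow} E_2 \overset{P_{1/2,+}}{\Longrightarrow} E_3 & 5 & 0.02 & 1 & 1 \\
E_3 \overset{P_{1/2,-}}{\Longrightarrow} E_4 \overset{P_{1/2,+}}{\Longrightarrow} E_5 \overset{P_-}{\Longrightarrow} H_2^2 & & & & \\[4pt]
F_0 \overset{P_{1/2,-}}{\Longrightarrow} F_1 \overset{P_{1/2,+}}{\Longrightarrow} F_2 \overset{P_{1/2,-}}{\Longrightarrow} F_3 & 5 & 0.02 & 1 & 1 \\
F_3 \overset{P_{1/2,+}}{\Longrightarrow} F_4 \overset{P_{1/2,-}}{\Longrightarrow} H_1^2 \overset{P_+}{\Longrightarrow} H_1 & & & &
\end{array}
\]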
Table 3. Settings used in the proof of (5.51) ($L_1^*$-row) and (5.52) ($L_2^*$-row)
\[
\begin{array}{ccc}
\text{orbit} & \text{order of Taylor method} & \text{time step} \\
L_1^* & 20 & 0.05 \\
L_2^* & 19 & 0.055
\end{array}
\]
and 8.4). Now, when we compute an image of an edge $I$, we still have to divide it into subintervals, but the number of subintervals is of the order of the square root of the number of parallelograms needed to achieve the same accuracy on a parallelogram built on two intervals of linear size similar to that of $I$.

In order to establish the existence of covering relations we need to verify the assumptions of Theorem 3.9. To facilitate the discussion of the various conditions implying Theorem 3.9 we introduce the following definition.

Definition 8.1. Let $f : \mathbb{R}^2 \to \mathbb{R}^2$ be a continuous map and let $N_1 = t(c_1, u_1, s_1)$ and $N_2 = t(c_2, u_2, s_2)$ be two h-sets. We say that $f$ satisfies condition ah, a0, a, b+, b− on $N_1$ and $N_2$ if
ah: there exists $q_0 \in [-1, 1]$ such that $f(c_{N_1}([-1, 1] \times \{q_0\})) \subset \mathrm{int}(N_2^l \cup |N_2| \cup N_2^r)$,
a0: $f(|N_1|) \cap N_2^+ = \emptyset$,
a: $f(|N_1|) \subset \mathrm{int}(N_2^l \cup |N_2| \cup N_2^r)$,
b+: $f(N_1^{le}) \subset \mathrm{int}(N_2^l)$ and $f(N_1^{re}) \subset \mathrm{int}(N_2^r)$,
b−: $f(N_1^{le}) \subset \mathrm{int}(N_2^r)$ and $f(N_1^{re}) \subset \mathrm{int}(N_2^l)$.
We say that $f$ satisfies condition b on $N_1$ and $N_2$ if either b+ or b− is satisfied.

Remark 8.2. Observe that conditions (3.23), (3.24), (3.25) and (3.26) from Theorem 3.9 coincide with conditions ah, a0, b+ and b−, respectively. Observe also that condition a implies conditions ah and a0.

The following lemma gives sufficient conditions for the existence of covering relations for injective maps.
Lemma 8.3. Let $f : \mathbb{R}^2 \to \mathbb{R}^2$ be a continuous map and let $N_1$ and $N_2$ be two h-sets. Assume that $f$ is injective on $|N_1|$ and that $f$ satisfies condition b on $N_1$ and $N_2$ and the following condition a':
a': $f(\partial|N_1|) \subset \mathrm{int}(N_2^l \cup |N_2| \cup N_2^r)$.
Then $N_1 \overset{f}{\Longrightarrow} N_2$.

Proof. From Remark 8.2 and Theorem 3.9 it follows that it is enough to verify condition a. This follows easily from a' and the Jordan theorem (see [5], p. 180).

Figures illustrating the covering relations obtained in Lemmas 6.2 and 5.6 suggest that condition a is satisfied in all relations. Unfortunately the verification of condition a (or a') poses the following difficulty. In the relation $N_1 \overset{f}{\Longrightarrow} N_2$ the set $|N_1|$ is mapped across $N_2$ without touching its horizontal edges, but if $|N_2|$ is small then we need a very good estimate of the image of the horizontal edges of $N_1$. This forces us to make a very fine partition of the boundary of $N_1$ and to take small time steps and a high order in the numerical method, resulting in very long computation times.

The above phenomenon is illustrated in Fig. 12, which shows enclosures obtained from our rigorous routines. In this picture we can see a rigorous enclosure of $P_{1/2,-}(\partial N_6)$. This image was obtained as follows: we divided the boundary of the set $N_6$ into some number of subintervals (see Table 4) and computed an image of each part via $P_{1/2,-}$. The picture shows that a much tighter enclosure of the image of the horizontal edges was required than of the image of the vertical edges (for example, the edge $(N_6)^{be}$ was divided into 8 equal parts, but $(N_6)^{le}$ into 5 equal parts; see Table 4). In other covering relations this disproportion was often much bigger.

To deal with this problem we use the following lemma, in which we indirectly verify conditions a0 and ah instead of a'. This approach allowed us to reduce the computation time by a factor of 5.

Lemma 8.4. Let $N_1 = t(c_1, u_1, s_1)$, $N_2 = t(c_2, u_2, s_2)$ be h-sets and $f : |N_1| \to \mathbb{R}^2$ an injection of class $C^1$. Let $\gamma$ be the horizontal line in $|N_1|$ connecting the vertical edges, given by
\[
\gamma : [-1, 1] \ni t \longmapsto c_1 + t \cdot u_1 \in |N_1|. \tag{8.79}
\]
Let $g = (g_1, g_2) = c_{N_2}^{-1} \circ f \circ \gamma$. Assume that $f$ satisfies condition b on $N_1$ and $N_2$ and that the following conditions hold:
a1: $\frac{dg_1}{dt}(t) \neq 0$ for $t \in (-1, 1)$,
a2: there exists $t_0 \in (-1, 1)$ such that $f(\gamma(t_0)) \in \mathrm{int}(|N_2|)$,
a3: $f^{-1}(N_2^+) \cap |N_1| = \emptyset$.
Then $N_1 \overset{f}{\Longrightarrow} N_2$.

Proof. We need to show that conditions ah and a0 are satisfied. Observe that condition a0 follows immediately from condition a3 and the injectivity of the map $f$. Namely, by applying $f$ to both sides of a3 we obtain $N_2^+ \cap f(|N_1|) = \emptyset$. We now show that condition ah is true.
Fig. 12. An example of the rigorous enclosure of an image of $\partial N_6$ in the relation $N_6 \overset{P_{1/2,-}}{\Longrightarrow} N_7$
To this end we consider $f \circ \gamma$ in the coordinates induced by the map $c_{N_2}$. In these coordinates
\[
|N_2| = [-1, 1] \times [-1, 1], \tag{8.80}
\]
\[
N_2^r = [1, \infty) \times (-\infty, \infty), \tag{8.81}
\]
\[
N_2^l = (-\infty, -1] \times (-\infty, \infty), \tag{8.82}
\]
\[
f \circ \gamma = g. \tag{8.83}
\]
Without any loss of generality we can assume that
\[
\frac{dg_1}{dt}(t) > 0 \quad \text{for } t \in (-1, 1). \tag{8.84}
\]
Hence $g_1$ is a strictly increasing function, and from condition b it follows that b+ is satisfied. We define two numbers
\[
t^* = \min\{t > t_0 \mid f(\gamma(t)) \in \partial|N_2|\}, \tag{8.85}
\]
\[
t_* = \max\{t < t_0 \mid f(\gamma(t)) \in \partial|N_2|\}. \tag{8.86}
\]
From conditions a2, a3 and b it follows that these numbers are well defined, $t_* < t^*$, and
\[
f(\gamma([-1, t_*))) \subset \mathrm{int}\, N_2^l, \tag{8.87}
\]
\[
f(\gamma((t^*, 1])) \subset \mathrm{int}\, N_2^r, \tag{8.88}
\]
\[
f(\gamma((t_*, t^*))) \subset \mathrm{int}(|N_2|). \tag{8.89}
\]
To finish the proof observe that from condition b it follows that $f(\gamma(\pm 1)) \in \mathrm{int}(N_2^r \cup N_2^l)$.

Remark 8.5. Observe that $c_{N_2}^{-1}(x) = A^{-1}(x - c_2)$, where $A = [u_2^T, s_2^T]$.
Hence
\[
\frac{dg_1}{dt}(t) = \sum_{i,j=1}^{2} A^{-1}_{1i}\, df(\gamma(t))_{ij}\, u_{1,j}, \tag{8.90}
\]
where $u_1 = (u_{1,1}, u_{1,2})$.

Remark 8.6. Observe that when $N_1 = t(c_1, u, s)$ and $N_2 = t(c_2, \alpha u, s_2)$, which means that the unstable coordinate directions of both h-sets coincide, then $u = \alpha^{-1} A e_1$ with $e_1 = (1, 0)^T$, and (8.90) gives
\[
\frac{dg_1}{dt}(t) = \alpha^{-1} \left( A^{-1} \cdot df(\gamma(t)) \cdot A \right)_{11}. \tag{8.91}
\]
Hence it is enough to look at the $(1, 1)$ entry of $df$ expressed in the coordinates of the h-set $N_2$.

In Table 4 we present the settings used in the proof of Lemma 6.2. In particular, the parameter grid gives the number of equal intervals into which we divide an edge. A positive time step means that Lemma 8.3 is used to verify a covering relation. A negative time step means that we use Lemma 8.4 and symbolizes the fact that we compute an inverse of the Poincaré map to verify condition a3. Parameter settings for the verification of a1 and a2 are given in Table 5. In this table order(m) and step(m) denote the order and the time step of the Taylor method which we use to prove a1, and order(c), step(c) denote the order and the time step of the Taylor method which we use to prove a2. The parameter grid(m) denotes the number of equal intervals used to cover the curve $\gamma$ in condition a1. To verify condition a2 we usually compute the image of the center of the set ($t_0 = 0$).
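The coordinate change of Remark 8.5 and the derivative formula (8.90) can be sketched numerically. The example below uses a linear map $f(x) = Mx$ as a stand-in for the Poincaré map, so that $df$ is the constant matrix $M$, and compares (8.90) with a finite-difference derivative of $g_1$. All numerical data and function names are hypothetical and serve only to illustrate the formula.

```python
def h_set_matrix(u, s):
    """Matrix A whose columns are the vectors u and s."""
    return ((u[0], s[0]), (u[1], s[1]))

def inverse_2x2(a):
    """Inverse of a 2x2 matrix given as a tuple of rows."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return ((a[1][1] / det, -a[0][1] / det), (-a[1][0] / det, a[0][0] / det))

def matvec(a, v):
    """Product of a 2x2 matrix with a planar vector."""
    return (a[0][0] * v[0] + a[0][1] * v[1], a[1][0] * v[0] + a[1][1] * v[1])

def to_h_set_coords(x, c, u, s):
    """c_N^{-1}(x) = A^{-1}(x - c) for the h-set t(c, u, s)."""
    return matvec(inverse_2x2(h_set_matrix(u, s)), (x[0] - c[0], x[1] - c[1]))

def dg1_dt(df, u1, u2, s2):
    """Formula (8.90): sum over i, j of (A^{-1})_{1i} * df_{ij} * u1_j."""
    a_inv = inverse_2x2(h_set_matrix(u2, s2))
    return sum(a_inv[0][i] * df[i][j] * u1[j] for i in range(2) for j in range(2))

# Hypothetical data: a linear stand-in for the map and two h-sets.
M = ((2.0, 1.0), (1.0, 1.0))
c1, u1 = (0.1, 0.2), (1.0, 0.5)
c2, u2, s2 = (0.0, 0.0), (2.0, 1.0), (-1.0, 1.0)

def g1(t):
    """First coordinate of c_{N_2}^{-1}(f(gamma(t))) with gamma(t) = c1 + t*u1."""
    point = (c1[0] + t * u1[0], c1[1] + t * u1[1])
    return to_h_set_coords(matvec(M, point), c2, u2, s2)[0]

formula = dg1_dt(M, u1, u2, s2)
finite_difference = (g1(1e-6) - g1(-1e-6)) / 2e-6
print(formula, finite_difference)
```

In a rigorous computation $df(\gamma(t))$ would be replaced by an interval enclosure of the derivative of the Poincaré map over a piece of $\gamma$, and condition a1 holds on that piece when the resulting interval does not contain zero.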
Only in the case $N_7 \overset{P_{1/2,+}}{\Longrightarrow} H_2^2$ we used $t_0 = 0.228$. Parameters of the Taylor method used in the proof of the conditions of Lemma 6.4
are listed in Tables 6 and 7. In the relation E5 ⇒ H22 we do not compute an image of the center of set E5 but an image of the point C = Y5 − 0.155u5 . Parameters settings used in the proof of Lemma 6.6 are listed in Tables 8 and 7. 8.4. Verification of covering relations for fuzzy set – details of the proof of Lemma 5.6. In this subsection we discuss how we verify covering relations for fuzzy h-sets. It is as a parallelogram with thickened edges. convenient to think about a fuzzy h-set N We define the support, left and right edges and left and right sides of a fuzzy set as follows: N | = = M∈N ∂|M|, |N |M| , ∂ N M∈ N le , N re , re = le = N M M M∈N M∈N l r r = l = N N M , M , M∈N M∈N where by &Z' we denoted a convex hull of the set Z. We introduce one more notation , for allowed image of the h-set covering N = strip N int M l ∪ |M| ∪ M r . M∈N
Lemmas 8.3 and 8.4 can be easily adopted to fuzzy h-sets. Namely we have
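Table 4. Parameters of the Taylor method used in the proof of Lemma 6.2
\[
\begin{array}{llrrr}
\text{covering relation} & \text{edges} & \text{grid} & \text{order} & \text{step} \\
H_1^2 \overset{P_{1/2,+}}{\Longrightarrow} N_0 & (H_1^2)^{be} \text{ and } (H_1^2)^{te} & 6 & 4 & 0.01 \\
 & (H_1^2)^{re} \text{ and } (H_1^2)^{le} & 7 & 5 & 0.01 \\
N_0 \overset{P_{1/2,-}}{\Longrightarrow} N_1 & N_1^{be} \text{ and } N_1^{te} & 1 & 5 & -0.01 \\
 & N_0^{re} \text{ and } N_0^{le} & 40 & 8 & 0.04 \\
N_1 \overset{P_{1/2,+}}{\Longrightarrow} N_2 & N_2^{be} \text{ and } N_2^{te} & 8 & 6 & -0.01 \\
 & N_1^{re} \text{ and } N_1^{le} & 25 & 5 & 0.01 \\
N_2 \overset{P_{1/2,-}}{\Longrightarrow} N_3 & N_3^{be} \text{ and } N_3^{te} & 6 & 6 & -0.004 \\
 & N_2^{re} \text{ and } N_2^{le} & 5 & 6 & 0.004 \\
N_3 \overset{P_{1/2,+}}{\Longrightarrow} N_4 & N_4^{be} \text{ and } N_4^{te} & 3 & 6 & -0.005 \\
 & N_3^{re} \text{ and } N_3^{le} & 5 & 6 & 0.004 \\
N_4 \overset{P_{1/2,-}}{\Longrightarrow} N_5 & N_5^{be} \text{ and } N_5^{te} & 2 & 6 & -0.01 \\
 & N_4^{re} \text{ and } N_4^{le} & 15 & 6 & 0.006 \\
N_5 \overset{P_{1/2,+}}{\Longrightarrow} N_6 & N_6^{be} \text{ and } N_6^{te} & 2 & 6 & -0.01 \\
 & N_5^{re} \text{ and } N_5^{le} & 32 & 6 & 0.01 \\
N_6 \overset{P_{1/2,-}}{\Longrightarrow} N_7 & N_6^{be} \text{ and } N_6^{te} & 8 & 6 & 0.01 \\
 & N_6^{re} \text{ and } N_6^{le} & 5 & 6 & 0.01 \\
N_7 \overset{P_{1/2,+}}{\Longrightarrow} H_2^2 & (H_2^2)^{be} \text{ and } (H_2^2)^{te} & 2 & 5 & -0.01 \\
 & N_7^{re} \text{ and } N_7^{le} & 33 & 5 & 0.01
\end{array}
\]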
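Table 5. Parameters of the Taylor method used in the proof of conditions a1, a2 in Lemmas 6.2 and 5.6
\[
\begin{array}{lccccc}
\text{covering relations} & \text{order(m)} & \text{step(m)} & \text{grid(m)} & \text{order(c)} & \text{step(c)} \\
N_0 \overset{P_{1/2,-}}{\Longrightarrow} N_1 \overset{P_{1/2,+}}{\Longrightarrow} N_2 & 4 & 0.01 & 1 & 6 & 0.01 \\
N_2 \overset{P_{1/2,-}}{\Longrightarrow} N_3 & 4 & 0.004 & 4 & 6 & 0.004 \\
N_3 \overset{P_{1/2,+}}{\Longrightarrow} N_4 & 4 & 0.003 & 2 & 6 & 0.003 \\
N_4 \overset{P_{1/2,-}}{\Longrightarrow} N_5 & 4 & 0.004 & 1 & 6 & 0.004 \\
N_5 \overset{P_{1/2,+}}{\Longrightarrow} N_6,\quad N_7 \overset{P_{1/2,+}}{\Longrightarrow} H_2^2 & 4 & 0.01 & 1 & 6 & 0.01 \\
H_2^2 \overset{P_-}{\Longrightarrow} \tilde H_2 & 4 & 0.01 & 1 & \text{–} & \text{–}
\end{array}
\]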
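Lemma 8.7. Let $f : \mathbb{R}^2 \to \mathbb{R}^2$ be a continuous map and let $\tilde N_1$ and $\tilde N_2$ be two fuzzy h-sets. Assume that $f$ is injective on $|\tilde N_1|$ and that the following conditions af and bf are satisfied:
af: $f(|\tilde N_1|) \subset \mathrm{strip}\, \tilde N_2$,
bf: either $f(\tilde N_1^{le}) \subset \mathrm{int}\, \tilde N_2^l$ and $f(\tilde N_1^{re}) \subset \mathrm{int}\, \tilde N_2^r$, or $f(\tilde N_1^{le}) \subset \mathrm{int}\, \tilde N_2^r$ and $f(\tilde N_1^{re}) \subset \mathrm{int}\, \tilde N_2^l$.
Then $\tilde N_1 \overset{f}{\Longrightarrow} \tilde N_2$.

Lemma 8.8. Let $\tilde N_1 = t(W_1, u_1, s_1)$, $\tilde N_2 = t(W_2, u_2, s_2)$ be fuzzy h-sets and $f : |\tilde N_1| \to \mathbb{R}^2$ an injection of class $C^1$. Let $\gamma$ be a fuzzy horizontal line in $|\tilde N_1|$,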
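Table 6. Parameters of the Taylor method used in the proof of covering relations in Lemma 6.4
\[
\begin{array}{llrrr}
\text{covering relation} & \text{edges} & \text{grid} & \text{order} & \text{step} \\
E_0 \overset{P_{1/2,+}}{\Longrightarrow} E_1 & E_1^{be} \text{ and } E_1^{te} & 33 & 7 & -0.04 \\
 & E_0^{re} \text{ and } E_0^{le} & 12 & 7 & 0.04 \\
E_1 \overset{P_{1/2,-}}{\Longrightarrow} E_2 & E_2^{be} \text{ and } E_2^{te} & 12 & 8 & -0.02 \\
 & E_1^{re} \text{ and } E_1^{le} & 25 & 7 & 0.02 \\
E_2 \overset{P_{1/2,+}}{\Longrightarrow} E_3 & E_3^{be} \text{ and } E_3^{te} & 1 & 7 & -0.02 \\
 & E_2^{re} \text{ and } E_2^{le} & 50 & 7 & 0.03 \\
E_3 \overset{P_{1/2,-}}{\Longrightarrow} E_4 & E_4^{be} \text{ and } E_4^{te} & 2 & 7 & -0.02 \\
 & E_3^{re} \text{ and } E_3^{le} & 7 & 7 & 0.03 \\
E_4 \overset{P_{1/2,+}}{\Longrightarrow} E_5 & E_5^{be} \text{ and } E_5^{te} & 2 & 7 & -0.02 \\
 & E_4^{re} \text{ and } E_4^{le} & 20 & 7 & 0.03 \\
E_5 \overset{P_-}{\Longrightarrow} H_2^2 & (H_2^2)^{be} \text{ and } (H_2^2)^{te} & 1 & 5 & -0.02 \\
 & E_5^{re} \text{ and } E_5^{le} & 4 & 5 & 0.01
\end{array}
\]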
Table 7. Parameters of the Taylor method used in the proof of conditions a1, a2 in Lemma 6.4 and Lemma 6.6 covering relations P1/2,+
P1/2,−
o(m)
s(m)
g(m)
o(c)
s(c)
4
0.02
1
7
0.02
5
0.02
1
8
0.02
5
0.01
2
8
0.02
4
0.01
1
8
0.02
4
0.01
1
–
–
P1/2,+
E0 ⇒ E1 ⇒ E2 ⇒ E3 P1/2,−
P1/2,+
E3 ⇒ E4 ⇒ E5 P−
E5 ⇒ H22 P1/2,− F0 ⇒ F1 P1/2,+ P1/2,− P1/2,+ F1 ⇒ F2 ⇒ F3 ⇒ P+ H12 ⇒ H1
P1/2,−
F4 ⇒ H12
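Table 8. Parameters of the Taylor method used in the proof of covering relations for sets $F_i$
\[
\begin{array}{llrrr}
\text{covering relation} & \text{edges} & \text{grid} & \text{order} & \text{step} \\
F_0 \overset{P_{1/2,-}}{\Longrightarrow} F_1 & F_1^{be} \text{ and } F_1^{te} & 50 & 6 & -0.01 \\
 & F_0^{re} \text{ and } F_0^{le} & 20 & 8 & 0.02 \\
F_1 \overset{P_{1/2,+}}{\Longrightarrow} F_2 & F_2^{be} \text{ and } F_2^{te} & 9 & 7 & -0.02 \\
 & F_1^{re} \text{ and } F_1^{le} & 330 & 7 & 0.02 \\
F_2 \overset{P_{1/2,-}}{\Longrightarrow} F_3 & F_3^{be} \text{ and } F_3^{te} & 1 & 7 & -0.02 \\
 & F_2^{re} \text{ and } F_2^{le} & 35 & 7 & 0.03 \\
F_3 \overset{P_{1/2,+}}{\Longrightarrow} F_4 & F_4^{be} \text{ and } F_4^{te} & 1 & 7 & -0.02 \\
 & F_3^{re} \text{ and } F_3^{le} & 10 & 7 & 0.03 \\
F_4 \overset{P_{1/2,-}}{\Longrightarrow} H_1^2 & (H_1^2)^{be} \text{ and } (H_1^2)^{te} & 3 & 7 & -0.02 \\
 & F_4^{re} \text{ and } F_4^{le} & 45 & 7 & 0.03 \\
H_1^2 \overset{P_+}{\Longrightarrow} H_1 & H_1^{be} \text{ and } H_1^{te} & 3 & 8 & -0.02 \\
 & (H_1^2)^{re} \text{ and } (H_1^2)^{le} & 7 & 7 & 0.03
\end{array}
\]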
given by
\[
\gamma : [-1, 1] \times W_1 \ni (t, w_1) \longmapsto w_1 + t \cdot u_1 \in \tilde N_1. \tag{8.92}
\]
For $w_2 \in W_2$ let $N_{2,w_2} = t(w_2, u_2, s_2)$ and
\[
g_{w_2}(t, w_1) = (g_{w_2,1}, g_{w_2,2})(t, w_1) = c^{-1}_{N_{2,w_2}} \circ f \circ \gamma(t, w_1).
\]
Assume that $W_1$ is connected, that $f$ satisfies condition bf on $\tilde N_1$ and $\tilde N_2$, and that the following conditions hold:
af1: $\frac{dg_{w_2,1}}{dt}(t, w_1) \neq 0$ for $t \in (-1, 1)$, $w_1 \in W_1$, $w_2 \in W_2$,
af2: there exist $t_0 \in (-1, 1)$ and $w_1 \in W_1$ such that $f(\gamma(t_0, w_1)) \in \mathrm{int}\, |\tilde N_2|$,
af3: $f^{-1}(\tilde N_2^+) \cap |\tilde N_1| = \emptyset$.
Then $\tilde N_1 \overset{f}{\Longrightarrow} \tilde N_2$.

Remark 8.9. Observe that (compare Remark 8.5) $\frac{dg_{w_2,1}}{dt}$ does not depend on $w_2$ and is given by formula (8.90) with the matrix $A$ depending only on $u_2$ and $s_2$.
Let us describe how the above conditions af1, af2, af3 were verified for relations (5.67) and (5.68), which we rewrite below for the convenience of the reader:
\[
\tilde H_1 \overset{P_+}{\Longrightarrow} \tilde H_1 \overset{P_+}{\Longrightarrow} H_1^2, \tag{8.93}
\]
\[
H_2^2 \overset{P_-}{\Longrightarrow} \tilde H_2 \overset{P_-}{\Longrightarrow} \tilde H_2. \tag{8.94}
\]
Let us recall (see Sect. 5.2) that the h-sets entering the above covering relations are given by
\[
\begin{aligned}
\tilde H_i &= t(W_i, \alpha_i u_i, \alpha_i s_i), \\
H_1^2 &= t(h_1,\ 2 \cdot 10^{-7} u_1,\ 2 \cdot 10^{-7} s_1), \\
H_2^2 &= t(h_2,\ 1.2 \cdot 10^{-8} u_2,\ 2.8 \cdot 10^{-8} s_2).
\end{aligned}
\]
For all covering relations $N \overset{P_\pm}{\Longrightarrow} M$ listed above, the unstable vectors of both h-sets entering the relation are proportional, hence we can apply Remark 8.6 and look at the $(1, 1)$-entry of $dP_\pm$ expressed in $c_M$-coordinates.

Observe that since $\tilde H_i \subset U_i$, from Lemma 5.3 and Eq. (5.61) we obtain an enclosure for $[DP_\pm(|\tilde H_i|)]$ expressed in $\tilde H_i$-coordinates. From inspection of $\lambda_{i,1}$ it follows that condition af1 holds for the covering relations $\tilde H_i \Longrightarrow \tilde H_i$ and $\tilde H_1 \Longrightarrow H_1^2$. Since $|H_2^2|$ is not contained in $U_2$, we had to verify condition af1 for the relation $H_2^2 \overset{P_-}{\Longrightarrow} \tilde H_2$ separately. The parameter settings for this computation are added as the last row of Table 5.

Condition af2 is clearly satisfied with $w_2 = L_1^*$ for relations (5.67) and $w_2 = L_2^*$ for relations (5.68).

Conditions af3 and bf must be verified by direct computation. Parameter settings for these computations are given in Table 9.

It turns out that some inclusions involved in condition b can be verified at the same time. For example, to prove that $\tilde H_1 \overset{P_+}{\Longrightarrow} \tilde H_1 \overset{P_+}{\Longrightarrow} H_1^2$ we need to verify condition b+
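Table 9. Parameters of the Taylor method used in the proof of the covering relations for fuzzy sets
\[
\begin{array}{llrrr}
\text{covering relations} & \text{edges} & \text{grid} & \text{order} & \text{time step} \\
\tilde H_1 \overset{P_+}{\Longrightarrow} \tilde H_1 & (\tilde H_1)^{be} \text{ and } (\tilde H_1)^{te} & 2 & 6 & -0.01 \\
 & (\tilde H_1)^{re} \text{ and } (\tilde H_1)^{le} & 3 & 6 & 0.01 \\
\tilde H_1 \overset{P_+}{\Longrightarrow} H_1^2 & (H_1^2)^{be} \text{ and } (H_1^2)^{te} & 2 & 5 & -0.01 \\
H_2^2 \overset{P_-}{\Longrightarrow} \tilde H_2 & (\tilde H_2)^{be} \text{ and } (\tilde H_2)^{te} & 4 & 8 & -0.02 \\
 & (H_2^2)^{re} \text{ and } (H_2^2)^{le} & 32 & 5 & 0.01 \\
\tilde H_2 \overset{P_-}{\Longrightarrow} \tilde H_2 & (\tilde H_2)^{re} \text{ and } (\tilde H_2)^{le} & 2 & 8 & 0.02
\end{array}
\]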
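for both relations:
$P_+\bigl((\widetilde H_1)^{re}\bigr) \subset (\widetilde H_1)^{r}$,  $P_+\bigl((\widetilde H_1)^{le}\bigr) \subset (\widetilde H_1)^{l}$,   (8.95)
$P_+\bigl((\widetilde H_1)^{re}\bigr) \subset (H_1^2)^{r}$,  $P_+\bigl((\widetilde H_1)^{le}\bigr) \subset (H_1^2)^{l}$.   (8.96)
It is sufficient to show that
$P_+\bigl((\widetilde H_1)^{re}\bigr) \subset (\widetilde H_1)^{r} \cap (H_1^2)^{r}$,   (8.97)
$P_+\bigl((\widetilde H_1)^{le}\bigr) \subset (\widetilde H_1)^{l} \cap (H_1^2)^{l}$.   (8.98)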
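Similarly, to prove that $H_2^2 \stackrel{P_-}{\Longrightarrow} \widetilde H_2 \stackrel{P_-}{\Longrightarrow} \widetilde H_2$ we must verify condition af 3 for both relations. Namely, we have to check that
$P_-^{-1}\bigl(\widetilde H_2^{+}\bigr) \cap H_2^2 = \emptyset$,   (8.99)
$P_-^{-1}\bigl(\widetilde H_2^{+}\bigr) \cap \widetilde H_2 = \emptyset$.   (8.100)
Since $H_2^2 \subset |\widetilde H_2|$ (compare (5.57) and (5.66)) it is sufficient to show (8.100) only.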
9. Concluding Remarks, Future Work
There are several directions in which this research can be extended. First, the methods presented in this paper are not restricted to the particular parameters of the Oterma comet. Other parameters may require slight changes in the definition of the sets on which covering relations are built, but the method itself will be the same. In principle, this method can be applied to prove symbolic dynamics in any system for which numerical simulations indicate the existence of some kind of hyperbolic behavior; here, for example, we have homo- and heteroclinic chains.
Another interesting problem is the question of existence of a hyperbolic invariant set claimed in [11], where the authors assumed the existence of transversal homo- and heteroclinic connections between Lyapunov orbits and then followed the standard dynamical systems argument from the Birkhoff-Smale homoclinic theorem. Since we did not compute stable and unstable manifolds here, we cannot use these arguments. Observe also that a rigorous computation of stable and unstable manifolds for our problem appears to be very difficult (it requires very extensive $C^1$-computations). Hence developing tools which avoid a direct computation of invariant manifolds is of interest. In this context we formulate the following conjecture.
Conjecture 9.1. Let $f$ be a diffeomorphism. Let $N_0$, $N_1$ be h-sets. Assume that $f$ is hyperbolic on $N_0$ and $N_1$ (in the sense of Def. 4.3). Assume that we have the following sequences of covering relations:
$N_0 \stackrel{f}{\Longrightarrow} N_0 \stackrel{f}{\Longrightarrow} A_1 \stackrel{f}{\Longrightarrow} A_2 \stackrel{f}{\Longrightarrow} \cdots \stackrel{f}{\Longrightarrow} A_s \stackrel{f}{\Longrightarrow} N_1$,
$N_1 \stackrel{f}{\Longrightarrow} N_1 \stackrel{f}{\Longrightarrow} B_1 \stackrel{f}{\Longrightarrow} B_2 \stackrel{f}{\Longrightarrow} \cdots \stackrel{f}{\Longrightarrow} B_r \stackrel{f}{\Longrightarrow} N_0$.
Then there exist $k \geq 1$ and $S \subset |N_0| \cup |N_1|$ such that
• $f^k(S) = S$, i.e. $S$ is an invariant set for $f^k$;
• $S$ is hyperbolic (in the standard sense, see for example [9]);
• the map $\pi : S \to \Sigma_2 = \{0,1\}^{\mathbb{Z}}$ given by $\pi(x)_i = j$ iff $f^{ki}(x) \in |N_j|$ is one-to-one.
Another interesting problem is the question of stability of the obtained results with respect to various extensions of PCR3BP. By this we mean the following:
• Does the symbolic dynamics persist if the Jupiter orbit becomes an ellipse with small eccentricity (which is the case in nature)? This can be seen as a small periodic perturbation of the ODE describing PCR3BP. We believe that the answer is positive. In this context one can consider a more general question: assume that we obtained a symbolic dynamics for an ODE $x' = f(x)$ using covering relations. Does this symbolic dynamics persist for a nonautonomous ODE $x' = f(x) + \epsilon(t,x)$ if $\epsilon(t,x)$ is small?
• What about a restricted three body problem in three dimensions? One obvious observation is that the plane $(x, y)$ is invariant for the full 3D problem, hence we have symbolic dynamics also in the spatial problem. We would like to pose a more general question: does there exist a symbolic dynamics for the 3D problem such that the corresponding orbits are not all contained in the Sun-Jupiter plane? Some preliminary numerical explorations in this direction can be found in [8].
• What about the full 3-body problem? Will the symbolic dynamics established here continue to a very small but nonzero mass of the comet? Some results in this direction for non-degenerate periodic orbits and generic bifurcations can be found in [15].
References
1. Arioli, G.: Periodic orbits, symbolic dynamics and topological entropy for the restricted 3-body problem. Commun. Math. Phys. 231, 1–24 (2002)
2. Arioli, G., Zgliczyński, P.: Symbolic dynamics for the Hénon-Heiles Hamiltonian on the critical level. J. Diff. Eq. 171, 173–202 (2001)
3.
CAPD – Computer assisted proofs in dynamics, a package for rigorous numerics. http://limba.ii.uj.edu.pl/capd
4. Dugundji, J., Granas, A.: Fixed Point Theory. Monografie Matematyczne 61, Warszawa: PWN, 1982
5. Galias, Z., Zgliczyński, P.: Computer assisted proof of chaos in the Lorenz system. Physica D 115, 165–188 (1998)
6. Galias, Z., Zgliczyński, P.: Abundance of homoclinic and heteroclinic orbits and rigorous bounds for the topological entropy for the Hénon map. Nonlinearity 14, 909–932 (2001)
7. Gidea, M., Zgliczyński, P.: Covering relations for multidimensional dynamical systems. Submitted, http://www.im.uj.edu.pl/zgliczyn
8. Gomez, G., Koon, W.S., Lo, M.W., Marsden, J.E., Masdemont, J., Ross, S.D.: Invariant manifolds, the spatial three-body problem and space mission design. Preprint
9. Guckenheimer, J., Holmes, P.: Nonlinear Oscillations, Dynamical Systems, and Bifurcations of Vector Fields. New York-Heidelberg-Berlin: Springer-Verlag
10. The IEEE Standard for Binary Floating-Point Arithmetics. ANSI-IEEE Std 754, 1985
11. Koon, W.S., Lo, M.W., Marsden, J.E., Ross, S.D.: Heteroclinic connections between periodic orbits and resonance transitions in celestial mechanics. Chaos 10 (2), 427–469 (2000)
12. Llibre, J., Martinez, R., Simó, C.: Transversality of the invariant manifolds associated to the Lyapunov family of periodic orbits near L2 in the restricted three-body problem. J. Diff. Eq. 58, 104–156 (1985)
13. Lo, M., Ross, S.: SURFing the solar system: invariant manifolds and the dynamics of the solar systems. JPL IOM 312/97, 2–4
14. Lohner, R.J.: Computation of guaranteed enclosures for the solutions of ordinary, initial and boundary value problems. In: Computational Ordinary Differential Equations, J.R. Cash, I. Gladwell (eds), Oxford: Clarendon Press, 1992
15. Meyer, K., Schmidt, D.: From the restricted to the full three-body problem. Trans. AMS 5, 2283–2299 (2000)
16. Moser, J.: On the generalization of a theorem of Liapunov. Comm. Pure Appl. Math. 11, 257–271 (1958)
17. Mrozek, M., Zgliczyński, P.: Set arithmetic and the enclosing problem in dynamics. Annales Pol. Math. LXXIV, 237–259 (2000)
18. Wilczak, D.: http://link.springer.de/link/service/journals/00220/index.htm or http://link.springer-ny.com/link/service/journals/00220/index.htm, DOI: 10.1007/s00220-002-0709-0
19. Wójcik, K., Zgliczyński, P.: How to show an existence of homoclinic trajectories using topological tools? In: Proceedings of Equadiff'99, B. Fiedler, K. Gröger, J. Sprekels (eds), Singapore-New Jersey-London-Hong-Kong: World Scientific, 2000, pp. 246–248
20. Zgliczyński, P.: Fixed point index for iterations of maps, topological horseshoe and chaos. Topological Methods in Nonlinear Analysis 8, 169–177 (1996)
21. Zgliczyński, P.: Sharkovskii's Theorem for multidimensional perturbations of 1-dim maps. Ergodic Theory and Dynamical Systems 19, 1655–1684 (1999)
22. Zgliczyński, P.: Computer assisted proof of chaos in the Rössler equations and the Hénon map. Nonlinearity 10, 243–252 (1997)
23. Zgliczyński, P.: $C^1$-Lohner algorithm. Foundations of Computational Mathematics 2, 429–465 (2002)
24. Zgliczyński, P.: On periodic points for systems of weakly coupled 1-dim maps. Nonlinear Analysis TMA 46/7, 1039–1062 (2001)
Communicated by G. Gallavotti
Commun. Math. Phys. 234, 77–100 (2003) Digital Object Identifier (DOI) 10.1007/s00220-002-0754-8
Communications in
Mathematical Physics
Deformations of Vertex Algebras, Quantum Cohomology of Toric Varieties, and Elliptic Genus
Fyodor Malikov1,∗, Vadim Schechtman2
1 Department of Mathematics, University of Southern California, Los Angeles, CA 90089, USA. E-mail: [email protected]
2 Laboratoire Emile Picard UFR MIG, Université Paul Sabatier, 118 Route de Narbonne, 31062 Toulouse Cedex 4, France. E-mail: [email protected]
Received: 15 July 2000 / Accepted: 17 August 2002 Published online: 8 January 2003 – © Springer-Verlag 2003
Abstract: We use previous work on the chiral de Rham complex and Borisov's deformation of a lattice vertex algebra to give a simple linear algebra construction of quantum cohomology of toric varieties. Somewhat unexpectedly, the same technique allows one to compute the formal character of the cohomology of the chiral de Rham complex on even dimensional projective spaces. In particular, we prove that the formal character of the space of global sections equals the equivariant signature of the loop space, a well-known example of the Ochanine-Witten elliptic genus.

Introduction
Let $X$ be a smooth complex variety. It was shown in [MSV] that the complex cohomology algebra $H^*(X)$ can be obtained as the cohomology of a certain differential vertex algebra $H^{ch}(X)$ canonically associated with $X$. By definition, $H^{ch}(X) = H^*(X; \Omega^{ch}_X)$, where $\Omega^{ch}_X$ is a sheaf of vertex superalgebras constructed in [MSV] and called the chiral de Rham complex. (If $X$ is compact, then $H^{ch}(X)$ may be called the chiral Hodge cohomology algebra of $X$.) The algebra $H^{ch}(X)$ is equipped with a canonical odd derivation $Q$ of square zero, and the cohomology of $H^{ch}(X)$ with respect to $Q$ is equal to $H^*(X)$. We shall record this as follows: $H^*(X) = H_Q(H^{ch}(X))$; and whenever we encounter a vector space $W$ with differential $d$, we let $H_d(W)$ denote the corresponding cohomology.
In the very interesting paper [B] Borisov defined for a smooth complete toric variety $X$ a certain vertex superalgebra $V(X)$ equipped with an odd derivation $D$, $D^2 = 0$, so that $H^{ch}(X) = H_D(V(X))$. It follows that $H^*(X) = H_Q(H_D(V(X)))$. In fact, $[Q, D] = 0$ and it is true that $H^*(X) = H_{Q+D}(V(X))$. He also constructs a family $(V_\phi(X), Q(\phi) + D(\phi))$, $\phi$ lying in the Kähler cone of $X$, of vertex superalgebras with differential such that
$\lim_{\phi \to \infty} (V_\phi(X), Q(\phi) + D(\phi)) = (V(X), Q + D)$.

∗ Partially supported by an NSF grant
We prove that $H_{Q(\phi)+D(\phi)}(V_\phi(X))$ is equal to Batyrev's quantum cohomology algebra of $X$, [Bat]. This is first done in Sect. 1.1–2.4, Theorem 2.4, in the case when $X = \mathbb{P}^N$ and then generalized to an arbitrary $X$ in Sect. 3, Theorem 3.2. We also get similar partial results for Fano hypersurfaces in $\mathbb{P}^N$. Their exposition in Sect. 5 is sketchy, reflecting the patchy status quo.
In Sect. 4 we present a simpler (making no use of $Q$) version of this construction in the case of $\mathbb{P}^N$ (Theorem 4.1) and apply the deformation technique to compute $H^*(\mathbb{P}^N; \Omega^{ch}_{\mathbb{P}^N})$ (Theorem 4.2). The simultaneous proof of Theorems 4.1 and 4.2 relies heavily on the previous work, in particular, on the $L\mathfrak{sl}_{N+1}$-symmetry of $\Omega^{ch}_{\mathbb{P}^N}$ introduced in [MS1]. According to Theorem 4.1, $H_{D(q)}(V_q(\mathbb{P}^N))$, $q \in \mathbb{C}$, is a family of vertex algebras over $\mathbb{C}$ with fiber that equals the quantum cohomology over $q \neq 0$ and blows up to the infinite dimensional vertex algebra $H^{ch}(X)$ over $0 \in \mathbb{C}$. (This $q$ is related to the $\phi$ above by $q = e^{-\phi}$.) Theorem 4.2 says that the intermediate cohomology functors unchiralize the chiral de Rham complex, that is,
$H^i(\mathbb{P}^N, \Omega^{ch}_{\mathbb{P}^N}) = H^i(\mathbb{P}^N, \Omega^{*}_{\mathbb{P}^N})$,  $0 < i < N$,
where $\Omega^{*}_{\mathbb{P}^N}$ stands for the usual de Rham complex. The following application of this theorem to elliptic genera was obtained in collaboration with V. Gorbounov:
$\mathrm{sign}(q, L\mathbb{P}^{2n}) = 2\,\mathrm{ch}\,H^0(\mathbb{P}^{2n}, \Omega^{ch}_{\mathbb{P}^{2n}}) - 1$;   (0.1)
here $\mathrm{ch}\,H^0(\mathbb{P}^{2n}, \Omega^{ch}_{\mathbb{P}^{2n}})$ is the formal character of $H^0(\mathbb{P}^{2n}, \Omega^{ch}_{\mathbb{P}^{2n}})$ and $\mathrm{sign}(q, L\mathbb{P}^{2n})$ is the equivariant signature of the loop space of $\mathbb{P}^{2n}$, an example of an Ochanine-Witten elliptic genus of $\mathbb{P}^N$ as defined in [HBJ, 6.1]. One consequence of (0.1) is that the coefficients of $\mathrm{sign}(q, L\mathbb{P}^{2n})$ are positive and usually even, a result which seems to be new. On the other hand, taking this result for granted, one reads (0.1) backwards to get a new interpretation of the chiral de Rham complex over $\mathbb{P}^{2n}$ as a sheaf whose space of global sections gives a combinatorial realization of this elliptic genus. These observations prompt an array of questions of topological as well as representation-theoretic nature, which we hope to explore in the future.

1. Borisov's Construction
1.1. Lattice vertex algebras. Let $L$ be a free abelian group on $2N$ generators $A^i, B^i$, $1 \leq i \leq N$. Give $L$ an integral lattice structure by defining a bilinear symmetric $\mathbb{Z}$-valued form $(.,.) : L \times L \to \mathbb{Z}$ so that
$(A^i, B^j) = \delta_{ij}$,  $(A^i, A^j) = (B^i, B^j) = 0$.
Introduce the complexification of $L$: $\mathfrak{h}_L = L \otimes_{\mathbb{Z}} \mathbb{C}$. The bilinear form $(.,.)$ carries over to $\mathfrak{h}_L$ by bilinearity. Let
$\hat{\mathfrak{h}}_L = \mathfrak{h}_L \otimes \mathbb{C}[t, t^{-1}] \oplus \mathbb{C}K$
be a Lie algebra with bracket
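$[x \otimes t^i, y \otimes t^j] = i(x, y)\delta_{i+j}K$,  $[x \otimes t^i, K] = 0$.
Associated with $L$ there is a group algebra $\mathbb{C}[L]$ with basis $e^\alpha$, $\alpha \in L$, and multiplication
$e^\alpha \cdot e^\beta = e^{\alpha+\beta}$,  $e^0 = 1$,  $\alpha, \beta \in L$.
Denote by $S_{\mathfrak{h}_L}$ the symmetric algebra of the space $\mathfrak{h}_L \otimes t^{-1}\mathbb{C}[t^{-1}]$. The space $S_{\mathfrak{h}_L} \otimes \mathbb{C}[L]$ carries the well-known vertex algebra structure, see for example [K]. Borisov proposes to enlarge this lattice vertex algebra by fermions as follows.
We tacitly assumed that $\mathfrak{h}_L$ is a purely even vector space: $\mathfrak{h}_L^{(0)} = \mathfrak{h}_L$, $\mathfrak{h}_L^{(1)} = 0$. Let $\Pi\mathfrak{h}_L$ satisfy the relations $\Pi\mathfrak{h}_L^{(1)} = \mathfrak{h}_L$, $\Pi\mathfrak{h}_L^{(0)} = 0$. Thus $\Pi\mathfrak{h}_L$ is a purely odd vector space with basis $\Psi^i, \Phi^i$ carrying the following odd bilinear form:
$(.,.) : \Pi\mathfrak{h}_L \times \Pi\mathfrak{h}_L \to \mathbb{C}$,  $(\Psi^i, \Phi^j) = \delta_{ij}$,  $(\Psi^i, \Psi^j) = (\Phi^i, \Phi^j) = 0$.
Given all this, one defines the Clifford algebra, $Cl_{\mathfrak{h}_L}$, to be the vector superspace
$Cl_{\mathfrak{h}_L} = \Pi\mathfrak{h}_L \otimes \mathbb{C}[t, t^{-1}] \oplus \mathbb{C}K'$,  $Cl_{\mathfrak{h}_L}^{(1)} = \Pi\mathfrak{h}_L \otimes \mathbb{C}[t, t^{-1}]$,  $Cl_{\mathfrak{h}_L}^{(0)} = \mathbb{C}K'$,
with (super) bracket $[x \otimes t^i, y \otimes t^j] = (x, y)\delta_{i+j}K'$. Let $\Lambda_{\mathfrak{h}_L}$ be the symmetric algebra of the superspace
$\bigoplus_{i=1}^{N} \bigl(\Phi^i \otimes \mathbb{C}[t^{-1}] \oplus \Psi^i \otimes t^{-1}\mathbb{C}[t^{-1}]\bigr)$.
(If we had been allowed to ignore the parity, we would have equivalently defined $\Lambda_{\mathfrak{h}_L}$ to be the exterior algebra of the indicated space.) The space $\Lambda_{\mathfrak{h}_L}$ carries the well-known vertex algebra structure, see for example [K]. Finally let
$V_L = \Lambda_{\mathfrak{h}_L} \otimes S_{\mathfrak{h}_L} \otimes \mathbb{C}[L]$.
Being a tensor product of vertex algebras, $V_L$ is also a vertex algebra.

1.2. Explicit description of the vertex algebra structure on $V_L$. To simplify the notation, we identify $\mathbb{C}[L]$ with the subspace $1 \otimes 1 \otimes \mathbb{C}[L]$. As an $\hat{\mathfrak{h}}_L \oplus Cl_{\mathfrak{h}_L}$-module, $V_L$ is a direct sum of irreducibles and there is one irreducible module, $V_L(\alpha)$, for each $\alpha \in L$. $V_L(\alpha)$ is freely generated by the supercommutative associative algebra $S_{\mathfrak{h}_L} \otimes \Lambda_{\mathfrak{h}_L}$ from the highest weight vector $e^\alpha$. The words "highest weight vector" mean that the following relations hold:
$A^i_{n+1} e^\alpha = B^i_{n+1} e^\alpha = \Psi^i_n e^\alpha = \Phi^i_{n+1} e^\alpha = 0$,  $n \geq 0$,
$K e^\alpha = K' e^\alpha = e^\alpha$,  $x_0 e^\alpha = (x, \alpha)e^\alpha$,  $x \in \mathfrak{h}_L$.
Thus, $V_L(\alpha)$, $\alpha \in L$, are different as $\hat{\mathfrak{h}}_L \oplus Cl_{\mathfrak{h}_L}$-modules, but isomorphic as $\hat{\mathfrak{h}}_L^1 \oplus Cl_{\mathfrak{h}_L}$-modules, where $\hat{\mathfrak{h}}_L^1 \subset \hat{\mathfrak{h}}_L$ is the subalgebra linearly spanned by $x \otimes t^i$, $i \neq 0$, $x \in \mathfrak{h}_L$. In fact, multiplication by $e^\beta$ provides an isomorphism of $\hat{\mathfrak{h}}_L^1 \oplus Cl_{\mathfrak{h}_L}$-modules:
$e^\beta : V_L(\alpha) \to V_L(\alpha + \beta)$,  $x \otimes e^\alpha \mapsto x \otimes e^{\alpha+\beta}$.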
Let us now define the state-field correspondence, that is, attach a field $x(z) \in \mathrm{End}(V_L)((z, z^{-1}))$ to each state $x \in V_L$. As has become customary, we shall write $x_i$ for $x \otimes t^i$ ($x \in \mathfrak{h}_L$ or $\Pi\mathfrak{h}_L$). We have:
$(x_{-n-1} e^0)(z) = \frac{1}{n!}\, x(z)^{(n)}$,  $x \in \mathfrak{h}_L$,
where
$x(z) = \sum_{j \in \mathbb{Z}} x_j z^{-j-1}$.
In particular, $(x_{-1} e^0)(z) = x(z)$. We continue in the same vein:
$(\Phi^i_{-n} e^0)(z) = \frac{1}{n!}\, \Phi^i(z)^{(n)}$,  where  $\Phi^i(z) = \sum_{j \in \mathbb{Z}} \Phi^i_j z^{-j}$;
$(\Psi^i_{-n-1} e^0)(z) = \frac{1}{n!}\, \Psi^i(z)^{(n)}$,  where  $\Psi^i(z) = \sum_{j \in \mathbb{Z}} \Psi^i_j z^{-j-1}$;
$e^\alpha(z) = e^\alpha \cdot \exp\Bigl(-\sum_{n<0} \frac{\alpha_n}{n} z^{-n}\Bigr) \cdot \exp\Bigl(-\sum_{n>0} \frac{\alpha_n}{n} z^{-n}\Bigr) \cdot z^{\alpha_0} (-1)^{pr_A(\alpha)}$.
The last formula deciphers as follows: the product of exponentials signifies an unambiguously defined generating series of operators acting on $V_L$; whenever $e^\alpha(z)$ is applied to an element of $V_L(\beta)$, $z^{\alpha_0}$ is replaced by $z^{(\alpha,\beta)}$; $pr_A$ stands for the projection of $L$ onto the sublattice spanned by the $A$'s, and when applied to an element of $V_L(\beta)$, $(-1)^{pr_A(\alpha)}$ is replaced by $(-1)^{(pr_A(\alpha),\beta)}$. Finally,
$\bigl(x^{(1)}_{-n_1} \cdot x^{(2)}_{-n_2} \cdots x^{(k)}_{-n_k} \cdot e^\alpha\bigr)(z) = \; : x^{(1)}_{-n_1}(z)\, x^{(2)}_{-n_2}(z) \cdots x^{(k)}_{-n_k}(z)\, e^\alpha(z) :$.
Remark. Of these formulas that for $e^\alpha(z)$ is undoubtedly the most intricate. The factor $(-1)^{pr_A(\alpha)}$ is the Frenkel-Kac cocycle [FK, K] needed in order to conform to the vertex algebra axioms. This factor being dropped, the formula gives the vertex operator known since the early days of string theory. The untiring reader will check that all the calculations of this paper go through with $(-1)^{pr_A(\alpha)}$ dropped; the light-minded reader may want to take this for granted.
The vertex algebra structure on $V_L$ is equivalently described by the following family of products ($n \in \mathbb{Z}$):
${}_{(n)} : V_L \otimes V_L \to V_L$,  $x \otimes y \mapsto x_{(n)} y \stackrel{\mathrm{def}}{=} \Bigl(\int x(z) z^n\Bigr)(y)$,
where $\int x(z) z^n$ stands for the linear transformation of $V_L$ equal to the coefficient of $z^{-n-1}$ in the series $x(z)$.
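The two lattice quantities that enter the formula for $e^\alpha(z)$, namely the exponent $(\alpha, \beta)$ of $z$ and the sign $(-1)^{(pr_A(\alpha),\beta)}$, can be sketched numerically. The following is a minimal illustration only: the encoding of a lattice vector as a pair of coefficient lists (for the $A^i$ and the $B^i$) and the function names `pairing` and `cocycle_sign` are assumptions made here, not notation from the paper.

```python
from typing import Sequence, Tuple

# Hypothetical encoding: a lattice vector sum_i a_i A^i + sum_i b_i B^i
# is stored as the pair (a, b) of integer coefficient sequences.
LatticeVector = Tuple[Sequence[int], Sequence[int]]


def pairing(alpha: LatticeVector, beta: LatticeVector) -> int:
    """Bilinear form with (A^i, B^j) = delta_ij and (A, A) = (B, B) = 0."""
    a, b = alpha
    c, d = beta
    return sum(x * y for x, y in zip(a, d)) + sum(x * y for x, y in zip(b, c))


def cocycle_sign(alpha: LatticeVector, beta: LatticeVector) -> int:
    """Sign (-1)^{(pr_A(alpha), beta)}; pr_A keeps only the A-part of alpha."""
    a, _ = alpha
    _, d = beta
    exponent = sum(x * y for x, y in zip(a, d))
    return -1 if exponent % 2 else 1


# Example with N = 2: alpha = A^1, beta = B^1.
A1 = ([1, 0], [0, 0])
B1 = ([0, 0], [1, 0])
print(pairing(A1, B1), cocycle_sign(A1, B1), cocycle_sign(B1, A1))
```

Note that the sign is not symmetric in its arguments, since only the $A$-component of the first vector is used.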
1.3. Degeneration of $V_L$. Denote by $L_A$ the subgroup of $L$ generated by $A^i$, $i = 1, \ldots, N$. Any smooth toric variety $X$ can be defined via a fan, $\Sigma$, that is, a collection of "cones" lying in $L_A$. Borisov uses such $\Sigma$ to define a certain degeneration, $V_L^\Sigma$, of the vertex algebra structure on $V_L$. He further shows that the cohomology of $V_L^\Sigma$ with respect to a certain differential $D^\Sigma : V_L^\Sigma \to V_L^\Sigma$ equals $H^*(X, \Omega^{ch}_X)$, where $\Omega^{ch}_X$ is the chiral de Rham complex of [MSV]. Let us describe the outcome of this construction in the case when $X = \mathbb{P}^N$, postponing the discussion of the general case until Sect. 3.
Consider the following set of $N + 1$ elements of $L_A$:
$\xi_1 = A^1$, $\xi_2 = A^2$, $\ldots$, $\xi_N = A^N$, $\xi_{N+1} = -A^1 - A^2 - \cdots - A^N$.
Define the cone $\Delta_i \subset L_A$ to be the set of all non-negative integral linear combinations of the elements $\xi_1, \ldots, \xi_{i-1}, \xi_{i+1}, \ldots, \xi_{N+1}$. The fan $\Sigma$ in this case is the set consisting of all the intersections $\Delta_{i_1} \cap \Delta_{i_2} \cap \cdots \cap \Delta_{i_l}$ with $1 \leq i_1 < \cdots < i_l \leq N + 1$, $1 \leq l \leq N + 1$.
We now define $V_L^\Sigma$ to be a vertex algebra equal to $V_L$ as a vector space with $n$th product ${}_{(n),\Sigma}$ as follows: if $\{\sum_i n_i A^i, \sum_i n'_i A^i\} \subset \Delta_j$ for some $j$, then
$\bigl(x \otimes e^{\sum_i m_i B^i + \sum_i n_i A^i}\bigr)_{(n),\Sigma} \bigl(y \otimes e^{\sum_i m'_i B^i + \sum_i n'_i A^i}\bigr) = \bigl(x \otimes e^{\sum_i m_i B^i + \sum_i n_i A^i}\bigr)_{(n)} \bigl(y \otimes e^{\sum_i m'_i B^i + \sum_i n'_i A^i}\bigr)$,   (1.1a)
otherwise
$\bigl(x \otimes e^{\sum_i m_i B^i + \sum_i n_i A^i}\bigr)_{(n),\Sigma} \bigl(y \otimes e^{\sum_i m'_i B^i + \sum_i n'_i A^i}\bigr) = 0$,   (1.1b)
where ${}_{(n)}$ stands for the $n$th product on $V_L$. The fact that these new operations satisfy the Borcherds identities can be proved by including both $V_L$ and $V_L^\Sigma$ in a 1-parameter family of vertex algebras; this will be done in 2.1. Let
$D = \int \sum_{i=1}^{N} \Psi^i(z)\bigl(e^{A^i} - e^{-\sum_j A^j}\bigr)(z)$.   (1.2)
It is obvious that $D \in \mathrm{End}(V_L^\Sigma)$ and $D^2 = 0$; therefore the cohomology $H_D(V_L^\Sigma)$ arises.
Theorem 1.3 ([B]). $H_D(V_L^\Sigma) = H^*(\mathbb{P}^N, \Omega^{ch}_{\mathbb{P}^N})$.
2. Deforming $H^*(\mathbb{P}^N)$
2.1. The family $V_{L,q}$. Here we exhibit a family of vertex algebras, $V_{L,q}$, $q \in \mathbb{C}$, so that $V_{L,q}$ is isomorphic to $V_L$ if $q \neq 0$ and $V_{L,0}$ is isomorphic to $V_L^\Sigma$; cf. the end of Sect. 8 [B]. Define the height function $ht : L_A \to \mathbb{Z}_{\geq 0}$ as follows. Since $L_A = \cup_i \Delta_i$, see 1.3, each $\alpha \in L_A$ is uniquely represented in the form
$\alpha = \sum_{i=1}^{N+1} n_i \xi_i$   (2.1a)
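so that all $n_i \geq 0$ and $\#\{i : n_i > 0\} \leq N$. Let
$ht(\alpha) = \sum_i n_i$,   (2.1b)
where $n_1, \ldots, n_{N+1}$ are as in (2.1a). Define the linear automorphism
$t_q : V_L \to V_L$,  $q \in \mathbb{C} - \{0\}$,
by the formula
$t_q\bigl(x \otimes e^{\sum_i m_i B^i + \sum_i n_i A^i}\bigr) = q^{ht(\sum_i n_i A^i)}\, x \otimes e^{\sum_i m_i B^i + \sum_i n_i A^i}$.
Define $V_{L,q}$ to be the vertex algebra equal to $V_L$ as a vector space with the following $n$th product:
$\bigl(x \otimes e^{\sum_i m_i B^i + \sum_i n_i A^i}\bigr)_{(n),q} \bigl(y \otimes e^{\sum_i m'_i B^i + \sum_i n'_i A^i}\bigr) = t_q^{-1}\Bigl(t_q\bigl(x \otimes e^{\sum_i m_i B^i + \sum_i n_i A^i}\bigr)_{(n)}\, t_q\bigl(y \otimes e^{\sum_i m'_i B^i + \sum_i n'_i A^i}\bigr)\Bigr)$.
By definition,
$t_q : V_{L,q} \to V_L$,  $q \in \mathbb{C} - \{0\}$,
is a vertex algebra isomorphism. It is also easy to see that if $\sum_i n_i A^i$ and $\sum_i n'_i A^i$ belong to the same cone from $\Sigma$, then
$\bigl(x \otimes e^{\sum_i m_i B^i + \sum_i n_i A^i}\bigr)_{(n),q} \bigl(y \otimes e^{\sum_i m'_i B^i + \sum_i n'_i A^i}\bigr) = \bigl(x \otimes e^{\sum_i m_i B^i + \sum_i n_i A^i}\bigr)_{(n)} \bigl(y \otimes e^{\sum_i m'_i B^i + \sum_i n'_i A^i}\bigr)$;
otherwise
$\bigl(x \otimes e^{\sum_i m_i B^i + \sum_i n_i A^i}\bigr)_{(n),q} \bigl(y \otimes e^{\sum_i m'_i B^i + \sum_i n'_i A^i}\bigr) \in q\,\mathbb{C}[q]\, \bigl(x \otimes e^{\sum_i m_i B^i + \sum_i n_i A^i}\bigr)_{(n)} \bigl(y \otimes e^{\sum_i m'_i B^i + \sum_i n'_i A^i}\bigr)$.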
Two things follow at once: first, the operations
${}_{(n),0} = \lim_{q \to 0}\, {}_{(n),q}$,  $n \in \mathbb{Z}$,
are well defined and satisfy the Borcherds identities; second, the vertex algebra, $V_{L,0}$, obtained in this way is isomorphic to $V_L^\Sigma$. By the way, this remark proves that $V_L^\Sigma$ is indeed a vertex algebra.
To get a better feel for this kind of deformation, and for future use, let us consider the subspace $\mathbb{C}[L_A] \subset V_{L,q}$ with basis $e^\alpha$, $\alpha \in L_A$. The $(-1)$st product makes this space a commutative algebra. The subspace $\mathbb{C}[\Delta_j]$ defined to be the linear span of $e^\alpha$, $\alpha \in \Delta_j$, is a polynomial ring on generators $e^{\xi_1}, \ldots, e^{\xi_{j-1}}, e^{\xi_{j+1}}, \ldots, e^{\xi_{N+1}}$. For example, if we denote $x_i = e^{A^i}$, then $\mathbb{C}[\Delta_{N+1}] = \mathbb{C}[x_1, \ldots, x_N]$ and this isomorphism identifies $e^{\sum_j n_j A^j}$ with the monomial $x_1^{n_1} \cdots x_N^{n_N}$.
The entire $\mathbb{C}[L_A]$ is not a polynomial ring. For example, as follows from the definition of the deformation, there is a relation
$\bigl(e^{-A^1 - \cdots - A^N}\bigr)_{(-1)} \bigl(e^{A^1 + \cdots + A^N}\bigr) = q^{N+1} e^0$,
because $ht(0) = 0$, $ht(A^1 + \cdots + A^N) = N$, $ht(-A^1 - \cdots - A^N) = 1$. If we let $T = e^{-A^1 - \cdots - A^N}$, then the last equality rewrites as follows: $T x_1 x_2 \cdots x_N = q^{N+1}$, and a moment's thought shows that in fact
$\mathbb{C}[L_A] = \mathbb{C}[x_1, \ldots, x_N, T]/(T x_1 x_2 \cdots x_N - q^{N+1})$.   (2.2)
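The height function and the power of $q$ produced by the deformed product can be checked on examples. The sketch below assumes that an element $\sum_i a_i A^i$ of $L_A$ is stored as the integer list $(a_1, \ldots, a_N)$ with $N \geq 1$; the function names are illustrative and do not come from the paper. It uses only the definitions (2.1a), (2.1b) and the formula for $t_q$: conjugating by $t_q$ multiplies the product of elements with lattice parts $\alpha$ and $\beta$ by $q^{ht(\alpha) + ht(\beta) - ht(\alpha + \beta)}$.

```python
from typing import List, Sequence


def cone_coefficients(alpha: Sequence[int]) -> List[int]:
    """Coefficients (n_1, ..., n_{N+1}) of (2.1a): all n_i >= 0 and at least one n_i = 0.

    alpha lists the A^i-coefficients; xi_{N+1} = -(A^1 + ... + A^N).
    """
    shift = max(0, -min(alpha))  # this is n_{N+1}
    return [a + shift for a in alpha] + [shift]


def ht(alpha: Sequence[int]) -> int:
    """Height function (2.1b)."""
    return sum(cone_coefficients(alpha))


def deformation_exponent(alpha: Sequence[int], beta: Sequence[int]) -> int:
    """Power of q by which the q-deformed product differs from the product on V_L."""
    total = [a + b for a, b in zip(alpha, beta)]
    return ht(alpha) + ht(beta) - ht(total)


N = 3
ones = [1] * N
minus_ones = [-1] * N
print(ht(ones), ht(minus_ones), deformation_exponent(ones, minus_ones))
```

For $N = 3$ this reproduces $ht(A^1 + A^2 + A^3) = 3$, $ht(-A^1 - A^2 - A^3) = 1$ and the exponent $N + 1 = 4$ in the relation $T x_1 x_2 x_3 = q^{4}$; for two elements of the same cone the exponent is 0, in agreement with the statement that the product is then undeformed.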
Being a group algebra, $\mathbb{C}[L_A]$ carries another algebra structure, a priori different from the one we just described and independent of $q$. We see that the two structures are isomorphic if $q \neq 0$, the isomorphism being defined by rescaling:
$x_i \mapsto q x_i$,  $T \mapsto qT$.
At $q = 0$, however, the one we just described degenerates into an algebra with zero divisors.

2.2. The cohomology $H^*(\mathbb{P}^N)$ via vertex algebra. Let
$Q(z) = A^i(z)\Phi^i(z) - \Bigl(\sum_j \Phi^j(z)\Bigr)'$,   (2.3a)
$G(z) = B^i(z)\Psi^i(z)$,   (2.3b)
$J(z) = \; :\Phi^i(z)\Psi^i(z): + \sum_j B^j(z)$,   (2.3c)
$L(z) = \; :B^i(z)A^i(z): + :\Phi^i(z)'\,\Psi^i(z):$,   (2.3d)
where the summation with respect to repeated indices is assumed. One checks that the Fourier components of these 4 fields satisfy the commutation relations of the $N = 2$ algebra. It is also easy to see that the fields $G(z)$, $L(z)$ commute with Borisov's differential $D$, see (1.2), and therefore define the fields, to be also denoted $G(z)$, $L(z)$, acting on $H_D(V_L^\Sigma)$. The field $Q(z)$ does not commute with $D$, but its Fourier component $Q_0 = \int Q(z)$ does: $[Q_0, D] = 0$. Thus we get an operator, to be also denoted $Q_0$, acting on $H_D(V_L^\Sigma)$. All this is summarized by saying that $H_D(V_L^\Sigma)$ is a topological vertex algebra.
A glance at the formulas on p. 17 of [B] shows that the isomorphism $H_D(V_L^\Sigma) = H^*(\mathbb{P}^N, \Omega^{ch}_{\mathbb{P}^N})$ (see Theorem 1.3) identifies these $G(z)$, $L(z)$, $Q_0$ with the fields (operators) constructed in [MSV] and denoted in the same way. One of the main results of [MSV] then reads
$H^*(\mathbb{P}^N) = H_{Q_0}(H_D(V_L^\Sigma))$.   (2.4)
Further, the product of two elements of $H^*(\mathbb{P}^N)$ equals the $(-1)$st product of corresponding representatives in $H_D(V_L^\Sigma)$.
2.3. Deformation of the product. It follows from the proof of Theorem 2.3 below that the cohomology (2.4) can be calculated in the reversed order:
$H^*(\mathbb{P}^N) = H_D(H_{Q_0}(V_L^\Sigma))$.   (2.5)
Note that $D$ and $Q_0$ are still well-defined operators acting on the deformed algebra:
$D = \int \sum_{i=1}^{N} \Psi^i(z)\bigl(e^{A^i} - e^{-\sum_j A^j}\bigr)(z)$,  $Q_0 = \int A^i(z)\Phi^i(z) : V_{L,q} \to V_{L,q}$.
It is immediate to see that $D^2 = Q_0^2 = 0$ on $V_{L,q}$ as well. Moreover,
$[D, Q_0] = 0$.   (2.6)
Indeed, the formulas of 1.2 imply the following OPE:
$\sum_{i=1}^{N} \Psi^i(z)\bigl(e^{A^i} - e^{-\sum_j A^j}\bigr)(z) \cdot A^j(w)\Phi^j(w) \sim \frac{\bigl(\sum_j e^{A^j}(w) + e^{-\sum_j A^j}(w)\bigr)'}{z - w}$.
Therefore,
$[D, Q_0] = \int \Bigl(\sum_j e^{A^j}(w) + e^{-\sum_j A^j}(w)\Bigr)' = 0$.
Thus it is natural to take the space $H_D(H_{Q_0}(V_{L,q}))$ for a deformation of $H^*(\mathbb{P}^N)$.
Theorem 2.3. $H_D(H_{Q_0}(V_{L,q})) = \mathbb{C}[T]/(T^{N+1} - q^{N+1})$.
Proof. 1) Computation of $H_{Q_0}(V_{L,q})$. Due to (2.3a),
$Q_0 = \sum_{n \in \mathbb{Z}} A^i_{-n}\Phi^i_n$.   (2.7)
Therefore,
$[Q_0, \Psi^j_0] = A^j_0$,  $[Q_0, G_0] = L_0$.
These relations imply that
$H_{Q_0}(V_{L,q}) = H_{Q_0}(\cap_j \mathrm{Ker} A^j_0 \cap \mathrm{Ker} L_0)$.   (2.8)
It follows from the commutation relations recorded in 1.1 that
$\cap_j \mathrm{Ker} A^j_0 = \Lambda_{\mathfrak{h}_L} \otimes S_{\mathfrak{h}_L} \otimes \mathbb{C}[L_A]$.
Next, $L_0$ is a semi-simple linear transformation of the latter space and, as follows from (2.3d), it multiplies a monomial $y_{i_1} \cdots x_{i_k} e^{\sum_i n_i A^i}$ by $i_1 + \cdots + i_k$. Since only $\Phi$'s may appear with zero index, we get
$\cap_j \mathrm{Ker} A^j_0 \cap \mathrm{Ker} L_0 = \Lambda(\oplus_{i=1}^{N} \mathbb{C}\Phi^i_0) \otimes \mathbb{C}[L_A]$,
where $\Lambda$ stands for the exterior algebra of the vector space in parentheses.
The $(-1)$st product makes this space a supercommutative algebra and the computations in the end of 2.1, especially formula (2.2), establish the algebra isomorphism
$\cap_j \mathrm{Ker} A^j_0 \cap \mathrm{Ker} L_0 \cong \Lambda(\oplus_{i=1}^{N} \mathbb{C}\Phi^i_0) \otimes \mathbb{C}[x_1, \ldots, x_N, T]/(T x_1 \cdots x_N - q^{N+1})$.
Formula (2.7) shows that the restriction of $Q_0$ to this subspace is 0. Thus
$H_{Q_0}(\cap_j \mathrm{Ker} A^j_0 \cap \mathrm{Ker} L_0) \cong \Lambda(\oplus_{i=1}^{N} \mathbb{C}\Phi^i_0) \otimes \mathbb{C}[x_1, \ldots, x_N, T]/(T x_1 \cdots x_N - q^{N+1})$.
2) Computation of $H_D(H_{Q_0}(V_{L,q}))$. In view of Step 1), we have to restrict $D$ to
$\cap_j \mathrm{Ker} A^j_0 \cap \mathrm{Ker} L_0 \cong \Lambda(\oplus_{i=1}^{N} \mathbb{C}\Phi^i_0) \otimes \mathbb{C}[x_1, \ldots, x_N, T]/(T x_1 \cdots x_N - q^{N+1})$.
The last isomorphism identifies $D$ with $\sum_i (x_i - T)\,\partial/\partial\Phi^i_0$, where $\partial/\partial\Phi^i_0$ is the odd derivation such that $\partial/\partial\Phi^i_0(\Phi^j_0) = \delta_{ij}$. Therefore, the complex
$(\mathbb{C}[x_1, \ldots, x_N, T; \Phi^1, \ldots, \Phi^N]/(T x_1 \cdots x_N - q^{N+1}), D)$
is simply the Koszul resolution of the algebra
$\{\mathbb{C}[x_1, \ldots, x_N, T]/(T x_1 \cdots x_N - q^{N+1})\}/(x_1 - T, x_2 - T, \ldots, x_N - T)$
associated with the sequence $x_1 - T, x_2 - T, \ldots, x_N - T$. This sequence is regular and we get at once
$H_D(H_{Q_0}(V_{L,q})) = \mathbb{C}[T]/(T^{N+1} - q^{N+1})$.
It is easy to infer from Borisov's proof of Theorem 1.3 that the element $T = e^{-A^1 - \cdots - A^N} \in V_{L,q}$ is a cocycle representing the cohomology class proportional to that of a hyperplane in $\mathbb{P}^N$. This means that the deformation of $H^*(\mathbb{P}^N)$ we have obtained coincides with the standard one, except that for some reason $q$ happened to be raised to the power $N + 1$.
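The last step of the proof amounts to an elementary substitution, which can be written out as follows. This is only a sketch of the quotient computation; the regularity of the sequence, which is what makes the Koszul complex a resolution, is taken from the text and not rechecked here.

```latex
\begin{aligned}
\mathbb{C}[x_1,\dots,x_N,T]\big/\bigl(T x_1\cdots x_N - q^{N+1},\; x_1 - T,\,\dots,\,x_N - T\bigr)
  &\cong \mathbb{C}[T]\big/\bigl(T\cdot T^{N} - q^{N+1}\bigr) \\
  &= \mathbb{C}[T]\big/\bigl(T^{N+1} - q^{N+1}\bigr),
\end{aligned}
```

since imposing $x_i = T$ for all $i$ turns the monomial $T x_1 \cdots x_N$ into $T^{N+1}$. The quotient has dimension $N + 1$ for every $q$, and at $q = 0$ it becomes $\mathbb{C}[T]/(T^{N+1})$.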
2.4. Reduction to a single differential. Of course it would be nicer to get $H^*(\mathbb{P}^N)$, or its deformation, as the cohomology of this or that vertex algebra with respect to a single differential rather than to compute a repeated cohomology.
Theorem 2.4. $H_{D+Q_0}(V_{L,q}) = \mathbb{C}[T]/(T^{N+1} - q^{N+1})$.
It is no wonder, in view of Theorem 2.3, that this assertion is a result of computation of a certain spectral sequence. We shall use several spectral sequences arising in the following situation, which is slightly different from the standard one. Let
$W = \oplus_{n=-\infty}^{+\infty} W^n$   (2.9a)
be a graded vector space with two commuting differentials
$d_1 : W^n \to W^{n+1}$,  $d_2 : W^n \to W^{n-s}$,  $s \geq 0$.   (2.9b)
(It is worthwhile to keep in mind that $s$ is fixed and in the situation at hand equals 1.)
There arise the total differential $d = d_1 + d_2$ and the cohomology $H_{d_1+d_2}(W)$. Note that this cohomology group is not graded since $d_1$ and $d_2$ map in opposite directions. We can, however, introduce the filtration
$W = \cup_n W^{\leq n}$,  $W^{\leq n} = \oplus_{m=-\infty}^{n} W^m$.
Then
$(d_1 + d_2)(W^{\leq n}) \subseteq W^{\leq (n+1)}$,
and there arises a filtration $H_{d_1+d_2}(W)^{\leq n}$ on the cohomology and the associated graded object $\mathrm{Gr}\,H_{d_1+d_2}(W)$. It is straightforward to define a spectral sequence
$\{E(W)^n_r,\; d^{(r)} : E(W)^n_r \to E(W)^{n-2(s+1)r+1}_r\}$,  $E(W)^n_{r+1} = H_{d^{(r)}}(E(W)^n_r)$,   (2.10)
the first three terms being as follows:
$E(W)^n_0 = W^n$,  $E(W)^n_1 = H_{d_1}(W^n)$,  $E(W)^n_2 = H_{d_2}(H_{d_1}(W^n))$,   (2.11)
where
$H_{d_1}(W^n) = \frac{\mathrm{Ker}\{d_1 : W^n \to W^{n+1}\}}{\mathrm{Im}\{d_1 : W^{n-1} \to W^n\}}$,
$H_{d_2}(H_{d_1}(W^n)) = \frac{\mathrm{Ker}\{d_2 : H_{d_1}(W^n) \to H_{d_1}(W^{n-s})\}}{\mathrm{Im}\{d_2 : H_{d_1}(W^{n+s}) \to H_{d_1}(W^n)\}}$.
In the situation pertaining to Theorem 2.4 we take $V_{L,q}$ for $W$, $Q_0$ for $d_1$, and $D$ for $d_2$. The space $V_{L,q}$ is graded by fermionic charge; this grading is defined by letting the degree of $\Psi^i_j$ equal $-1$, the degree of $\Phi^i_j$ equal 1, the degree of $A^i_j$, $B^i_j$, $e^\alpha$ equal 0. By definition,
$Q_0(V^n_{L,q}) \subseteq V^{n+1}_{L,q}$,  $D(V^n_{L,q}) \subseteq V^{n-1}_{L,q}$,
and we get a spectral sequence $\{E(V_{L,q})^n_r, d^{(r)}\}$ with $s = 1$. Observe that the grading by fermionic charge and the corresponding filtration are infinite in both directions. Therefore, the standard finiteness conditions that guarantee convergence of spectral sequences fail. Nevertheless the following lemma holds true.
Lemma 2.5. The spectral sequence $\{E(V_{L,q})^n_r, d^{(r)}\}$ converges to $\mathrm{Gr}\,H_{Q_0+D}(V_{L,q})$ and collapses: $H_D(H_{Q_0}(V_{L,q})) = H_{Q_0+D}(V_{L,q})$.
Lemma 2.5 combined with Theorem 2.3 gives Theorem 2.4 at once and it remains to prove Lemma 2.5.
Proof of Lemma 2.5. Introduce yet another grading of the space $V^n_{L,q}$ as follows. Let $\alpha = (\alpha_1, \ldots, \alpha_{N+1})$ be an element of the group $\mathbb{Z}^{N+1}$. Let
$V^n_{L,q}[\alpha] = \bigl(\cap_{i=1}^{N} \mathrm{Ker}(A^i_0 - \alpha_i\,\mathrm{Id})\bigr) \cap \mathrm{Ker}(L_0 - \alpha_{N+1}\,\mathrm{Id})$.
Of course
$V^n_{L,q} = \oplus_{\alpha \in \mathbb{Z}^{N+1}} V^n_{L,q}[\alpha]$,
and both the differentials preserve this grading. Therefore all calculations can be carried out inside $V^n_{L,q}[\alpha]$ with a fixed $\alpha$. Consider the following two cases.
1) $\alpha \neq 0$. In this case, as was observed in the beginning of the proof of Theorem 2.3 (see e.g. (2.8)), $H_{Q_0}(V_{L,q}[\alpha]) = 0$ and, therefore, $E[\alpha]_1 = 0$. It remains to show that $H_{Q_0+D}(V_{L,q}[\alpha]) = 0$. Let $x \in V_{L,q}[\alpha]^{\leq n}$ be a cocycle. This means that there is a "chain" of elements $x_i \in V_{L,q}[\alpha]^{n-2i}$, $i = 0, 1, \ldots, k$, so that
$x = \sum_{i=0}^{k} x_i$,
and the following holds:
$Q_0(x_0) = 0$,  $D(x_k) = 0$,  $Q_0(x_{i+1}) + D(x_i) = 0$,  $i = 0, \ldots, k - 1$.   (2.12)
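The conditions (2.12) are the homogeneous components of the single equation $(Q_0 + D)x = 0$. A short bookkeeping sketch, using only that $Q_0$ raises the fermionic charge by 1, that $D$ lowers it by 1, and that $x_i$ has charge $n - 2i$:

```latex
\begin{aligned}
(Q_0 + D)\sum_{i=0}^{k} x_i
  = \underbrace{Q_0(x_0)}_{\text{charge } n+1}
  \;+\; \sum_{i=0}^{k-1} \underbrace{\bigl(Q_0(x_{i+1}) + D(x_i)\bigr)}_{\text{charge } n-2i-1}
  \;+\; \underbrace{D(x_k)}_{\text{charge } n-2k-1}.
\end{aligned}
```

The charges $n+1, n-1, \ldots, n-2k-1$ are pairwise distinct, so the sum vanishes if and only if each bracketed term vanishes separately.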
We now repeatedly use the condition $H_{Q_0}(V_{L,q}[\alpha]) = 0$ and (2.12) to construct another chain $y_i \in V_{L,q}[\alpha]^{n-2i-1}$, $i \geq 0$, satisfying
$Q_0(y_0) = x_0$,  $Q_0(y_{i+1}) + D(y_i) = x_{i+1}$.   (2.13)
Indeed, since $Q_0(x_0) = 0$, there is $y_0 \in V_{L,q}[\alpha]^{n-1}$ so that $Q_0(y_0) = x_0$. Since
$Q_0(-D(y_0) + x_1) = DQ_0(y_0) + Q_0(x_1) = D(x_0) + Q_0(x_1) = 0$,
there is $y_1 \in V_{L,q}[\alpha]^{n-3}$ so that $Q_0(y_1) + D(y_0) = x_1$. In general, having found $y_i \in V_{L,q}[\alpha]^{n-2i-1}$, $y_{i-1} \in V_{L,q}[\alpha]^{n-2i+1}$ so that $Q_0(y_i) + D(y_{i-1}) = x_i$, we calculate as follows:
$0 = D(0) = D(Q_0(y_i) + D(y_{i-1}) - x_i) = DQ_0(y_i) - D(x_i)$.
Due to (2.12), the last expression rewrites as $DQ_0(y_i) + Q_0(x_{i+1})$ and we get $-Q_0D(y_i) + Q_0(x_{i+1}) = 0$. Therefore, $Q_0(D(y_i) - x_{i+1}) = 0$ and there is $y_{i+1}$ so that $Q_0(y_{i+1}) = -D(y_i) + x_{i+1}$ as desired.
Formally, (2.13) means that
$(D + Q_0)\sum_{i=0}^{\infty} y_i = x$,
and what does not allow us to conclude immediately that $x = \sum_i x_i$ is a coboundary is that the sum $\sum_{i=0}^{\infty} y_i$ looks infinite. To complete case 1) it remains to show that $y_i = 0$ for all sufficiently large $i$. This is achieved by the following dimensional consideration. Note that by construction
$y_i \in \bigoplus_{|m_j| \leq ki} \bigl(\Lambda^{n-2i-1}_{\mathfrak{h}_L} \otimes S_{\mathfrak{h}_L} \otimes e^{\sum_j m_j A^j + \sum_j \alpha_j B^j}\bigr)$,   (2.14)
where $k$ is a number independent of $i$, and $n - 2i - 1$ is the fermionic charge. Indeed, each application of $D$ changes $m_j$ by at most 1, $Q_0$ preserves $m_j$, and the linear estimate of $m_j$ follows; that the fermionic charge of $y_i$ is $n - 2i - 1$ follows from the fact that the fermionic charge of $Q_0$ ($D$ resp.) is 1 ($-1$ resp.). On the other hand, $L_0$ is defined explicitly by (2.3d), and this formula implies that the smallest eigenvalue of $L_0$ restricted
to $\Lambda^{n-2i-1}_{\mathfrak{h}_L}$ is nonnegative and grows faster than a polynomial of degree 2, say $q(i)$, as $i \to +\infty$. The same formula gives
$L_0\, e^{\sum_j m_j A^j + \sum_j \alpha_j B^j} = \Bigl(\sum_j m_j \alpha_j\Bigr) e^{\sum_j m_j A^j + \sum_j \alpha_j B^j}$.
Therefore, if non-zero, $y_i$ is a sum of eigenvectors associated to eigenvalues of $L_0$ greater than or equal to $q(i) - (\alpha_1 + \cdots + \alpha_n)ki$. Since this number tends to $+\infty$ as $i \to +\infty$, we arrive at a contradiction with the assumption $L_0 y_i = \alpha_{N+1} y_i$, $\alpha_{N+1}$ being fixed, if $i$ is sufficiently large. Hence, $y_i = 0$ for all sufficiently large $i$, each cocycle is a coboundary, and case 1) is accomplished.
2) $\alpha = 0$. As we saw in the beginning of the proof of Theorem 2.3, the restriction of $Q_0$ to $V_{L,q}[0]$ is 0; by definition, the complex $(V_{L,q}[0], D + Q_0)$ is equal to $(E(V_{L,q})[0]_1, d^{(1)})$ and the cohomology of the latter was computed in Step 2) of the proof of Theorem 2.3.
3. Quantum Cohomology of Toric Projective Varieties
Let us explain how the constructions and results of Sect. 2 carry over to an arbitrary smooth projective toric variety of dimension $N$. Each such variety is determined by a complete regular fan in $L_A$. This and other relevant concepts can be defined as follows (see [D, Bat] for details).
3.1. Let $I \subset L_A$. The cone generated by $I$ is said to be the set of all non-negative integral combinations of elements of $I$ and is denoted $\Delta_I$. A cone generated by part of a basis of $L_A$ is called regular. A complete regular fan $\Sigma$ is defined to be a collection of regular cones $\{\sigma_1, \ldots, \sigma_s\}$ so that the following conditions hold:
(i) If $\sigma'$ is a face of $\sigma \in \Sigma$, then $\sigma' \in \Sigma$;
(ii) If $\sigma, \sigma' \in \Sigma$, then $\sigma \cap \sigma'$ is a face of $\sigma$;
(iii) (the completeness condition) $L_A = \sigma_1 \cup \cdots \cup \sigma_s$.
We skip the construction of the smooth compact toric manifold $X_\Sigma$ attached to a regular complete fan $\Sigma$, referring the reader to [D, Bat], but formulate Batyrev's result on $H^2(X_\Sigma, \mathbb{R})$. A function $\phi : L_A \to \mathbb{R}$ is called piecewise linear if its restriction to any cone in $\Sigma$ is a morphism of abelian groups. Denote by $PL(\Sigma)$ the space of all piecewise linear functions. Let $G(\Sigma) = \{\xi_1, \ldots, \xi_n\}$ be the set of the generators of all 1-dimensional cones in $\Sigma$. Since each piecewise linear function is determined by its values on $\xi_i$ ($i = 1, \ldots, n$), $PL(\Sigma)$ is an $n$-dimensional real vector space. It contains the $N$-dimensional subspace of globally linear functions; the latter is naturally isomorphic to $L_B \otimes_{\mathbb{Z}} \mathbb{R}$.
Theorem 3.1 ([Bat]). $H^2(X_\Sigma, \mathbb{R}) = PL(\Sigma)/L_B \otimes_{\mathbb{Z}} \mathbb{R}$.
Deformations of Vertex Algebras
Denote by $K^0(\Sigma) \subset H^2(X_\Sigma, \mathbb{R})$ the set of classes of Kähler (1,1)-forms, known as the Kähler cone of $X_\Sigma$. According to [Bat], Theorem 4.5, its combinatorial description is as follows. Call a piecewise linear function $\phi$ convex if
$$\phi(x) + \phi(y) \ge \phi(x + y) \quad \text{for all } x, y \in L_A. \tag{3.1}$$
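The convexity condition (3.1) can be tested numerically in a small case. The following is a minimal sketch, assuming the standard fan of the projective plane in $L_A = \mathbb{Z}^2$ with ray generators $(1,0)$, $(0,1)$, $(-1,-1)$; the names `RAYS`, `CONES`, `pl_value` and `is_convex` are illustrative and do not come from the text. It evaluates the piecewise linear function with prescribed values on the rays and checks (3.1) on a finite grid only, so it is a sanity check rather than a proof.

```python
from itertools import product

# Ray generators of the fan of P^2 in L_A = Z^2 (standard choice, assumed here).
RAYS = [(1, 0), (0, 1), (-1, -1)]
# Maximal cones: each one is spanned by two of the three rays.
CONES = [(0, 1), (1, 2), (0, 2)]


def det(u, v):
    """2x2 determinant of the integer vectors u and v."""
    return u[0] * v[1] - u[1] * v[0]


def pl_value(values, x):
    """Evaluate at x the piecewise linear function with phi(xi_i) = values[i]."""
    for i, j in CONES:
        d = det(RAYS[i], RAYS[j])      # +1 or -1 because the cone is regular
        a = det(x, RAYS[j]) * d        # coefficient of xi_i (Cramer's rule)
        b = det(RAYS[i], x) * d        # coefficient of xi_j
        if a >= 0 and b >= 0:          # x lies in this cone
            return a * values[i] + b * values[j]
    raise ValueError("the fan is not complete")


def is_convex(values, radius=4):
    """Check phi(x) + phi(y) >= phi(x + y) for all x, y in a finite grid."""
    grid = list(product(range(-radius, radius + 1), repeat=2))
    return all(
        pl_value(values, x) + pl_value(values, y)
        >= pl_value(values, (x[0] + y[0], x[1] + y[1]))
        for x in grid
        for y in grid
    )
```

In this example the values $(0,0,1)$ give the function $\max(0,-x_1,-x_2)$, which passes the check, while $(0,0,-1)$ fails it. After subtracting a globally linear function only the sum $\phi(\xi_1)+\phi(\xi_2)+\phi(\xi_3)$ remains, which is consistent with $H^2$ of the projective plane being one-dimensional in Theorem 3.1.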
By Theorem 3.1, the cone of all convex piecewise linear functions descends to a cone in $H^2(X_\Sigma, \mathbb{R})$ and its interior is exactly $K^0(\Sigma)$. A little more explicitly, $K^0(\Sigma)$ is the set of classes of all strictly convex piecewise linear functions, that is, of all those functions $\phi$ for which the equality in (3.1) is achieved if and only if $x$ and $y$ belong to the same cone in $\Sigma$.

3.2. Let us return to the vertex algebra $V_L$. To begin with, Borisov's degeneration $V_{L^\Sigma}$ associated to the fan $\Sigma$ is still defined by (1.1a,b) and Borisov's Theorem 1.3 carries over word for word, or rather "symbol for symbol", provided $D$ is defined as follows (cf. (1.2)):
$$D = \sum_{i=1}^{N} \Psi^i(z) \sum_{j=1}^{n} (B^i, \xi_j)\, e^{\xi_j}(z) : V_{L^\Sigma} \to V_{L^\Sigma}, \tag{3.2}$$
where $\{\xi_1, \dots, \xi_n\}$ is the set of generators of all 1-dimensional cones in $\Sigma$. Next, generalizing 2.1, we construct a flat family of vertex algebras with base $H^2(X_\Sigma)$ such that when restricted to the Kähler cone it extends to the latter's one-point compactification with $V_{L^\Sigma}$ attached to infinity. For an arbitrary $\mathbb{R}$-valued function $\phi$ on $L_A$ define a linear automorphism $t_\phi : V_L \to V_L$ by the formula
$$t_\phi\bigl(x \otimes e^{\sum_i m_i B^i + \sum_i n_i A^i}\bigr) = e^{-\phi(\sum_i n_i A^i)}\, x \otimes e^{\sum_i m_i B^i + \sum_i n_i A^i}. \tag{3.3}$$
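Define $V_{L,\phi}$ to be the vertex algebra equal to $V_L$ as a vector space with the following $n$th product:
$$\bigl(x \otimes e^{\sum_i m_i B^i + \sum_i n_i A^i}\bigr)_{(n),\phi} \bigl(y \otimes e^{\sum_i m'_i B^i + \sum_i n'_i A^i}\bigr) = t_\phi^{-1}\Bigl( t_\phi\bigl(x \otimes e^{\sum_i m_i B^i + \sum_i n_i A^i}\bigr)_{(n)}\, t_\phi\bigl(y \otimes e^{\sum_i m'_i B^i + \sum_i n'_i A^i}\bigr) \Bigr). \tag{3.4}$$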
By definition, $t_\phi : V_{L,\phi} \to V_L$ is a vertex algebra isomorphism. This provides us with a constant family of vertex algebras parametrized by $\phi$. Formulas (3.1, 3.3–3.4) imply that
(i) if $\phi$ is convex piecewise linear, then the operations
$${}_{(n),\infty\phi} = \lim_{\tau \to +\infty} {}_{(n),\tau\phi}, \quad n \in \mathbb{Z},$$
are well defined and satisfy the Borcherds identities; denote the vertex algebra arising in this way by $V_{L,\infty\phi}$;
(ii) if $\phi$ is strictly convex piecewise linear, then $V_{L,\infty\phi}$ is isomorphic to Borisov's algebra $V_{L^\Sigma}$.
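The limit in (i) and the degeneration in (ii) can be illustrated on lattice vectors alone. Reading (3.3) and (3.4) on $e^a$ and $e^b$ with $a, b \in L_A$, the $\phi$-twisted products differ from the untwisted ones by the scalar $e^{\phi(a+b)-\phi(a)-\phi(b)}$. The sketch below is only an illustration under stated assumptions: it takes the fan of the projective line, so $L_A = \mathbb{Z}$ with cones $\mathbb{Z}_{\ge 0}$ and $\mathbb{Z}_{\le 0}$, and the function names and default values are hypothetical.

```python
import math


def phi(x, a_plus=1.0, a_minus=1.0):
    """Piecewise linear function on L_A = Z with phi(1) = a_plus, phi(-1) = a_minus."""
    return a_plus * x if x >= 0 else a_minus * (-x)


def twist_factor(a, b, tau):
    """Scalar rescaling the (tau * phi)-twisted product of e^a and e^b, from (3.3)-(3.4)."""
    return math.exp(tau * (phi(a + b) - phi(a) - phi(b)))


def rescaling_is_multiplicative(a, b):
    """Check that e^a -> e^{phi(a)} e^a intertwines the plain and the phi-twisted product."""
    lhs = math.exp(phi(a)) * math.exp(phi(b)) * twist_factor(a, b, 1.0)
    rhs = math.exp(phi(a + b))
    return math.isclose(lhs, rhs, rel_tol=1e-12)
```

With the default strictly convex choice, `twist_factor(1, 2, tau)` stays equal to 1 for every `tau` because 1 and 2 lie in a common cone, while `twist_factor(1, -1, tau)` equals $e^{-2\tau}$ and tends to 0. This is the mechanism by which only products within a single cone survive as $\tau \to +\infty$.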
F. Malikov, V. Schechtman
These assertions mean that the family $V_{L,\phi}$ produces a deformation of $V_{L^\Sigma}$ with base equal to the cone of strictly convex piecewise linear functions. It is also immediate to see that if $\phi - \phi'$ is a linear function, then the two deformations $V_{L,\tau\phi}$ and $V_{L,\tau\phi'}$, $\tau \ge 0$, are equal to each other. Thus we have obtained the family of vertex algebras $V_{L,\phi}$, $\phi \in K^0(\Sigma)$, which is a deformation of $V_{L^\Sigma}$ with base $K^0(\Sigma) \cup \infty$. Observe that the projectivity of $X_\Sigma$ is equivalent to $K^0(\Sigma)$ being non-empty. Denote by $QH^*_\phi(X_\Sigma, \mathbb{R})$ the quantum cohomology of $X_\Sigma$ as defined in Sect. 5 of [Bat]. With Borisov's differential still defined by (3.2), we obtain the following:

Theorem 3.2.
$$H_{Q_0+D}(V_{L,\phi}) = QH^*_\phi(X_\Sigma, \mathbb{R}).$$
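Proof. First of all,
$$(Q_0)^2 = 0, \quad D^2 = 0, \quad [Q_0, D] = 0.$$
(The first two of these assertions are obvious, the last one is obtained in the same way as (2.6). Indeed,
$$\sum_{i=1}^{N} \Psi^i(z) \sum_{j=1}^{n} (B^i, \xi_j)\, e^{\xi_j}(z) \cdot A^j(w)\Phi^j(w) \sim \frac{1}{z-w} \sum_{i=1}^{N} A^i(z) \sum_{j=1}^{n} (B^i, \xi_j)\, e^{\xi_j}(z) = \frac{1}{z-w} \sum_{j=1}^{n} e^{\xi_j}(w)',$$
and therefore
$$[D, Q_0] = \int \sum_{j=1}^{n} e^{\xi_j}(w)' = 0,$$
the integral of a total derivative.)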
Hence there arises a spectral sequence completely analogous to the one used in 2.4. It converges and collapses: $H_{Q_0+D}(V_{L,\phi}) = H_D(H_{Q_0}(V_{L,\phi}))$; this is done in exactly the same way as in the proof of Theorem 2.4. In part 1) of the proof of Theorem 2.3 the vector space $H_{Q_0}(V_{L,\phi})$ was shown to be equal to the group algebra $\mathbb{R}[L_A]$ extended by the Grassmann variables $\Phi^i_0$ ($i = 1, \dots, N$):
$$H_{Q_0}(V_{L,\phi}) = \mathbb{R}[L_A] \otimes \Lambda\bigl(\oplus_{i=1}^{N} \mathbb{C}\Phi^i_0\bigr). \tag{3.5a}$$
Both sides of this formula carry a natural product equal to the $(-1)$st product on the ambient vertex algebra $V_{L,\phi}$; it depends on $\phi$ and differs from another natural product on the right, the one stemming from the fact that $\mathbb{R}[L_A]$ is a group algebra and hence independent of $\phi$. To emphasize this subtlety (as needed to compare with [Bat]) we rewrite (3.5a) as the following algebra identification:
$$H_{Q_0}(V_{L,\phi}) = \mathbb{R}[L_A]_\phi \otimes \Lambda\bigl(\oplus_{i=1}^{N} \mathbb{C}\Phi^i_0\bigr). \tag{3.5b}$$
The multiplication on $\mathbb{R}[L_A]_\phi$ is indeed different from that on $\mathbb{R}[L_A]$, but not much; formulas (3.3–3.4) show that
$$\mathbb{R}[L_A] \to \mathbb{R}[L_A]_\phi, \quad e^{A^i} \mapsto e^{\phi(A^i)} e^{A^i} \tag{3.5c}$$
is an algebra isomorphism. (At this point the reader may want to consult the end of 2.1 to see what this amounts to in the simpler case of a projective space.) Definition (3.2) says that $(H_{Q_0}(V_{L,\phi}), D)$ is the Koszul complex associated with the sequence
$$\sum_{j=1}^{n} (B^i, \xi_j)\, e^{\xi_j}, \quad i = 1, \dots, N. \tag{3.6}$$
To identify the "Koszul cohomology" $H_D(H_{Q_0}(V_{L,\phi}))$ with $QH^*_\phi(X_\Sigma, \mathbb{R})$ we need to leaf through [Bat]. Batyrev defines $QH^*_\phi(X_\Sigma, \mathbb{R})$ to be the polynomial ring $\mathbb{R}[z_1, \dots, z_n]$ modulo the sum of two ideals denoted by $P(\Sigma)$ and $Q_\phi(\Sigma)$. It is shown in the beginning of the proof of [Bat], Theorem 8.4, that an algebra homomorphism $\mathbb{R}[z_1, \dots, z_n] \to \mathbb{R}[L_A]$ determined by the assignment
$$z_i \mapsto e^{-\phi(\xi_i)} e^{\xi_i}$$
factors through an algebra isomorphism
$$\mathbb{R}[z_1, \dots, z_n]/Q_\phi(\Sigma) \to \mathbb{R}[L_A]. \tag{3.7}$$
Composing (3.7) with (3.5c) we get an algebra isomorphism
$$\mathbb{R}[z_1, \dots, z_n]/Q_\phi(\Sigma) \to \mathbb{R}[L_A]_\phi, \quad z_i \mapsto e^{\xi_i}. \tag{3.8}$$
It follows from the comparison of (3.6) above with Definition 3.7 in [Bat] that under this isomorphism the image of the ideal $P(\Sigma)$ in $\mathbb{R}[z_1, \dots, z_n]/Q_\phi(\Sigma)$ is the ideal generated by the elements (3.6). Therefore $QH^*_\phi(X_\Sigma, \mathbb{R})$ equals the 0th cohomology of the complex $(H_{Q_0}(V_{L,\phi}), D)$, and it remains to show that all higher cohomology groups vanish. The ring $\mathbb{R}[L_A]$ is locally Cohen-Macaulay of dimension $N$; the sequence (3.6) has length $N$ and the corresponding quotient has dimension 0; therefore (see e.g. [H], Theorem 8.21) the sequence (3.6) is regular, and the Koszul complex of a regular sequence has no higher cohomology.

4. Computation of $H^*(\mathbb{P}^N, \Omega^{ch}_{\mathbb{P}^N})$ and an Elliptic Genus of $\mathbb{P}^{2n}$

In this section we return to the familiar case of a projective space and keep the notation of Sects. 1 and 2.

Theorem 4.1. If $q \ne 0$, then $H_D(V_{L,q})$ equals the quantum cohomology of $\mathbb{P}^N$.

It is clear that the combination of this result and Borisov's Theorem 1.3 asserts that $H_{D(q)}(V_q(\mathbb{P}^N))$, $q \in \mathbb{C}$, is a family of vertex algebras over $\mathbb{C}$ with fiber that equals the quantum cohomology over $q \ne 0$ and blows up to the infinite dimensional vertex algebra $H^{ch}(X)$ over $0 \in \mathbb{C}$. To formulate the next theorem, we use the operator $L_0$, see (2.3d), to introduce a grading (by conformal weight)
$$H^a(\mathbb{P}^N, \Omega^{ch}_{\mathbb{P}^N}) = \oplus_{i=0}^{\infty} H^a(\mathbb{P}^N, \Omega^{ch}_{\mathbb{P}^N})_i, \tag{4.1a}$$
where
$$H^a(\mathbb{P}^N, \Omega^{ch}_{\mathbb{P}^N})_i = \mathrm{Ker}(L_0 - i\,\mathrm{Id}), \tag{4.1b}$$
and a character
$$\mathrm{ch}\, H^a(\mathbb{P}^N, \Omega^{ch}_{\mathbb{P}^N}) = \sum_{i=0}^{\infty} q^i \dim H^a(\mathbb{P}^N, \Omega^{ch}_{\mathbb{P}^N})_i.$$
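Theorem 4.2. There is a canonical isomorphism
$$H^a(\mathbb{P}^N, \Omega^*_{\mathbb{P}^N}) \xrightarrow{\ \sim\ } H^a(\mathbb{P}^N, \Omega^{ch}_{\mathbb{P}^N}), \quad 0 < a < N, \tag{4.2a}$$
where $\Omega^*_{\mathbb{P}^N}$ is the sheaf of differential forms, so that
$$H^a(\mathbb{P}^N, \Omega^*_{\mathbb{P}^N}) \xrightarrow{\ \sim\ } H^a(\mathbb{P}^N, \Omega^{ch}_{\mathbb{P}^N})_0, \quad 0 < a < N, \tag{4.2b}$$
and
$$H^a(\mathbb{P}^N, \Omega^{ch}_{\mathbb{P}^N})_i = 0, \quad 0 < a < N, \ \text{if } i > 0. \tag{4.2c}$$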
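Corollary 4.1. Let $\mathrm{sign}(q, L\mathbb{P}^{2n})$ be the equivariant signature of the loop space of $\mathbb{P}^{2n}$, $n \in \mathbb{N}$, as defined in [HBJ, 6.1]. Then
$$\mathrm{sign}(q, L\mathbb{P}^{2n}) = 2\,\mathrm{ch}\, H^0(\mathbb{P}^{2n}, \Omega^{ch}_{\mathbb{P}^{2n}}) - 1. \tag{4.3}$$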
Recall the previously known results on the cohomology of $\Omega^{ch}_{\mathbb{P}^N}$. $\Omega^{ch}_{\mathbb{P}^N}$ is a sheaf of $\widehat{sl}_{N+1}$-modules [MS1]. In particular, if $U_0 = \mathbb{C}^N \subset \mathbb{P}^N$ is a big cell, then $\Gamma(U_0, \Omega^{ch}_{\mathbb{P}^N})$ is a generalized Wakimoto module over $\widehat{sl}_{N+1}$ introduced in [FF]. We proved in [MS1] that
$$H^0(\mathbb{P}^N, \Omega^{ch}_{\mathbb{P}^N}) = \Gamma(U_0, \Omega^{ch}_{\mathbb{P}^N})^{int},$$
where $\Gamma(U_0, \Omega^{ch}_{\mathbb{P}^N})^{int}$ stands for the maximal $sl_{N+1}$-integrable submodule of $\Gamma(U_0, \Omega^{ch}_{\mathbb{P}^N})$. On the other hand, it follows from the chiral Serre duality [MS2] that there is a canonical isomorphism
$$H^N(\mathbb{P}^N, \Omega^{ch}_{\mathbb{P}^N}) \xrightarrow{\ \sim\ } H^0(\mathbb{P}^N, \Omega^{ch}_{\mathbb{P}^N})^d,$$
where $d$ stands for the restricted dual.

Unfortunately, little is known about the structure of $\Gamma(U_0, \Omega^{ch}_{\mathbb{P}^N})$ and $\Gamma(U_0, \Omega^{ch}_{\mathbb{P}^N})^{int}$ if $N > 1$; see, however, 4.2 below for the case when $N$ is even and [MS1] for the case when $N = 1$. Otherwise, Theorem 4.2 and these results give a complete description of $H^*(\mathbb{P}^N, \Omega^{ch}_{\mathbb{P}^N})$. Moreover, Corollary 4.1, an immediate consequence of Theorem 4.2 as will be explained in 4.3, if read backwards gives us the character formula in the even dimensional case:
$$\mathrm{ch}\, H^0(\mathbb{P}^{2n}, \Omega^{ch}_{\mathbb{P}^{2n}}) = \mathrm{ch}\, H^{2n}(\mathbb{P}^{2n}, \Omega^{ch}_{\mathbb{P}^{2n}}) = \frac{1}{2}\,\mathrm{sign}(q, L\mathbb{P}^{2n}) + \frac{1}{2}.$$
4.1. Proof of Theorems 4.1 and 4.2. These theorems are proved simultaneously by considering the following spectral sequence. By definition, the complex $(V_{L,q}, D)$ is the constant vector space $V_L$ with differential $D$ polynomially depending on $q \in \mathbb{C}$. To make this more precise, observe that $V_L$ is graded by the function $ht$ defined by (2.1a, b):
$$V_L = \oplus_{n \ge 0} V_L^n, \tag{4.4}$$
where $V_L^n$ is the linear span of $x \otimes e^{\sum_i m_i B^i + \sum_i n_i A^i}$ with $ht(\sum_i n_i A^i) = n$. The differential $D$ then breaks into a sum
$$D = d_+ + q^N d_-, \tag{4.5a}$$
so that
$$d_+(V_L^n) \subset V_L^{n+1}, \tag{4.5b}$$
$$d_-(V_L^n) \subset V_L^{n-N}, \tag{4.5c}$$
and
$$(d_+)^2 = (d_-)^2 = [d_+, d_-] = 0. \tag{4.5d}$$
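These properties place us in the situation of (2.9a,b) with $s = N$, and therefore there arises a spectral sequence of the type (2.10). By construction, the 0th term and 0th differential of this spectral sequence coincide with Borisov's complex $(V_{L^\Sigma}, D)$. Hence the 1st term and 1st differential are (cf. (2.11)):
$$E_1 = H^*(\mathbb{P}^N, \Omega^{ch}_{\mathbb{P}^N}), \tag{4.6}$$
$$d_1 \stackrel{\mathrm{def}}{=} q^N d_- : H^*(\mathbb{P}^N, \Omega^{ch}_{\mathbb{P}^N}) \to H^*(\mathbb{P}^N, \Omega^{ch}_{\mathbb{P}^N}). \tag{4.7}$$
Because of (4.5c), the latter sharpens as follows:
$$d_1 : H^n(\mathbb{P}^N, \Omega^{ch}_{\mathbb{P}^N}) \to H^{n-N}(\mathbb{P}^N, \Omega^{ch}_{\mathbb{P}^N}), \quad 0 \le n \le N. \tag{4.8}$$
Thus the 2nd term is
$$E_2 = \frac{H^0(\mathbb{P}^N, \Omega^{ch}_{\mathbb{P}^N})}{\mathrm{Im}\{d_1 : H^N(\mathbb{P}^N, \Omega^{ch}_{\mathbb{P}^N}) \to H^0(\mathbb{P}^N, \Omega^{ch}_{\mathbb{P}^N})\}} \oplus \mathrm{Ker}\{d_1 : H^N(\mathbb{P}^N, \Omega^{ch}_{\mathbb{P}^N}) \to H^0(\mathbb{P}^N, \Omega^{ch}_{\mathbb{P}^N})\} \oplus \oplus_{i=1}^{N-1} H^i(\mathbb{P}^N, \Omega^{ch}_{\mathbb{P}^N}).$$
Again because of (4.5c), all higher differentials vanish. Since the grading we are using is bounded from below, an argument similar to (and simpler than) the one used in Step 1) of the proof of Lemma 2.4 shows that this spectral sequence converges to $\mathrm{Gr}\, H_D(V_{L,q})$. Therefore
$$\mathrm{Gr}\, H_D(V_{L,q}) = \frac{H^0(\mathbb{P}^N, \Omega^{ch}_{\mathbb{P}^N})}{\mathrm{Im}\{d_1 : H^N(\mathbb{P}^N, \Omega^{ch}_{\mathbb{P}^N}) \to H^0(\mathbb{P}^N, \Omega^{ch}_{\mathbb{P}^N})\}} \oplus \mathrm{Ker}\{d_1 : H^N(\mathbb{P}^N, \Omega^{ch}_{\mathbb{P}^N}) \to H^0(\mathbb{P}^N, \Omega^{ch}_{\mathbb{P}^N})\} \oplus \oplus_{i=1}^{N-1} H^i(\mathbb{P}^N, \Omega^{ch}_{\mathbb{P}^N}). \tag{4.9}$$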
The remainder of the proof uses a structure on our spectral sequence which we have not yet introduced: a grading. Just as the cohomology algebra $H^*(\mathbb{P}^N, \Omega^{ch}_{\mathbb{P}^N})$ is graded by the eigenvalues of the operator $L_0$, see (4.1a,b), so is our initial object, the vertex algebra $V_{L,q}$. With respect to this grading the differential $D$ is homogeneous of weight 0, and therefore the grading is inherited by the entire spectral sequence.

Now focus on the weight 0 component of (4.8). It equals the cohomology of $H^*(\mathbb{P}^N, \Omega^{ch}_{\mathbb{P}^N})_0$ with respect to $d_1$, see (4.6–4.7). By the definition of $\Omega^{ch}_{\mathbb{P}^N}$, the space $H^*(\mathbb{P}^N, \Omega^{ch}_{\mathbb{P}^N})_0$ is $H^*(\mathbb{P}^N, \Omega^*_{\mathbb{P}^N})$, and hence $H^*(\mathbb{P}^N)$. It is easy to see that $d_1 = 0$ on this space. (For example, this follows from the fact that this space is linearly spanned by powers of $e^{-\sum_j A^j}$, see the last sentence of 2.3.) Hence the weight 0 component of (4.9) is the ordinary cohomology of $\mathbb{P}^N$, as it should be. Computations done in the proof of Theorem 2.3 show that the genuine $H_D(V_{L,q})_0$ is the quantum cohomology.

A glance at (4.1a–4.2c) and the fact that all the intermediate cohomology groups $H^i(\mathbb{P}^N, \Omega^{ch}_{\mathbb{P}^N})$, $1 \le i \le N-1$, enter (4.9) show that what remains to prove is that the higher conformal weight components of (4.9) are 0. This follows immediately from

Lemma 4.2. There is a cocycle $\tilde{y} \in V^N_{L,q}$ representing an element of $H^N(\mathbb{P}^N, \Omega^{ch}_{\mathbb{P}^N})$ such that
$$[D, \tilde{y}_{(0)}] = L_0.$$
Indeed, $\tilde{y}_{(0)}$ is then a homotopy connecting a non-zero multiple of the identity map on each positive conformal weight space to zero.

The proof of Lemma 4.2 is a computation that relies heavily on our previous work and its interplay with [B]. Therefore the details of this calculation go beyond the scope of the present paper, but glimpses can be discerned. In [MS1], III, we constructed a vector space embedding
$$sl_{N+1} \hookrightarrow H^0(\mathbb{P}^N, \Omega^{ch}_{\mathbb{P}^N}), \tag{4.10}$$
which defines an $Lsl_{N+1}$-module structure on the sheaf $\Omega^{ch}_{\mathbb{P}^N}$. If we let $e_{ij}$ be an element of the standard basis of $gl_{N+1}$ and identify it with its image under (4.10), then we get a collection of fields, $e_{ij}(z)$, and Fourier components of these fields are linear transformations defining an action of the loop algebra $Lsl_{N+1}$ on $\Omega^{ch}_{\mathbb{P}^N}$. In fact, a little more is true. If we let $Vert(Lsl_{N+1})$ denote the vertex algebra associated to $Lsl_{N+1}$ in [FZ], then (4.10) gives a vertex algebra embedding
$$Vert(Lsl_{N+1}) \hookrightarrow H^0(\mathbb{P}^N, \Omega^{ch}_{\mathbb{P}^N}). \tag{4.11}$$
For example, we have identifications
$$e_{1N+1} = A^N_{-1} - \Phi^N_0 \Psi^N_{-1} e^{-B^N}, \tag{4.12a}$$
$$e_{1N+1}(z) = A^N(z) - \Phi^N(z)\Psi^N(z) e^{-B^N}(z). \tag{4.12b}$$
Now look at $H^N(\mathbb{P}^N, \Omega^{ch}_{\mathbb{P}^N})$. Elementary properties of $\Omega^{ch}_{\mathbb{P}^N}$ combined with the Serre duality imply the embedding
$$H^0(\mathbb{P}^N, T)^* \hookrightarrow H^N(\mathbb{P}^N, \Omega^{ch}_{\mathbb{P}^N})_1,$$
where $T$ is the tangent sheaf. Since $sl_{N+1} \cong H^0(\mathbb{P}^N, T)^*$, we have an embedding of $sl_{N+1}$-modules:
$$\pi : sl_{N+1} \hookrightarrow H^N(\mathbb{P}^N, \Omega^{ch}_{\mathbb{P}^N})_1. \tag{4.13}$$
It follows from Borisov’s proof of Theorem 1.3 that an element of VL representing the class of the image of e1N+1 under (4.13) can be chosen to be equal to N
N −1
N−1 A N A π(e1N +1 ) = (&−1 e )(0) (&−1 e
N
1 A 1 2 N )(0) · · · (&−1 e )(0) e−B 'N −1 '0 '0 · · · '0 . 1
A direct computation shows that
$$d_1(\pi(e_{1N+1})) = e_{1N+1} \ \bigl(= A^N_{-1} - \Phi^N_0 \Psi^N_{-1} e^{-B^N}\bigr),$$
cf. (4.12a), and thanks to the $sl_{N+1}$-invariance
$$d_1(\pi(e_{ij})) = e_{ij}, \quad 1 \le i, j \le N+1. \tag{4.14}$$
Being a vertex algebra, $Vert(Lsl_{N+1})$ has a Virasoro field of its own:
$$\frac{1}{N+1} \sum_{ij} : e_{ij}(z) e_{ji}(z) :,$$
and a little thought shows that it coincides with $L(z)$ from (2.3d). It follows then from (4.14) (and some representation theory of affine Lie algebras) that the $\tilde{y}$ needed in Lemma 4.2 can be chosen as follows:
$$\tilde{y} = \frac{1}{N+1} \sum_{ij} : \pi(e_{ij})(z)\, \pi(e_{ji})(z) : .$$
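4.2. Reducibility of $H^0(\mathbb{P}^N, \Omega^{ch}_{\mathbb{P}^N})$ and $H^N(\mathbb{P}^N, \Omega^{ch}_{\mathbb{P}^N})$. Observe that $H^0(\mathbb{P}^N, \Omega^{ch}_{\mathbb{P}^N})$ is a vertex algebra, and $H^0(\mathbb{P}^N, \Omega^{ch}_{\mathbb{P}^N})$ and $H^N(\mathbb{P}^N, \Omega^{ch}_{\mathbb{P}^N})$ are $H^0(\mathbb{P}^N, \Omega^{ch}_{\mathbb{P}^N})$-modules.

Corollary of the Proof of Theorems 4.1, 4.2.
$$\oplus_{i>0} H^0(\mathbb{P}^N, \Omega^{ch}_{\mathbb{P}^N})_i \subset H^0(\mathbb{P}^N, \Omega^{ch}_{\mathbb{P}^N})$$
and
$$H^N(\mathbb{P}^N, \Omega^{ch}_{\mathbb{P}^N})_0 \subset H^N(\mathbb{P}^N, \Omega^{ch}_{\mathbb{P}^N})$$
are submodules of codimension and dimension 1 (resp.).

Indeed, the mapping, see (4.8),
$$d_1 : H^N(\mathbb{P}^N, \Omega^{ch}_{\mathbb{P}^N}) \to H^0(\mathbb{P}^N, \Omega^{ch}_{\mathbb{P}^N})$$
is clearly an $H^0(\mathbb{P}^N, \Omega^{ch}_{\mathbb{P}^N})$-module morphism, and the essence of our proof of Theorems 4.1, 4.2 is that the indicated spaces are its image and kernel (resp.); formally, the reader may want to compare e.g. (4.9) with Theorem 4.1.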
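4.3. Proof of Corollary 4.1. Borisov and Libgober proved in [BL] that
$$\mathrm{sign}(q, L\mathbb{P}^{2n}) = \sum_{a=0}^{2n} (-1)^a\, \mathrm{ch}\, H^a(\mathbb{P}^{2n}, \Omega^{ch}_{\mathbb{P}^{2n}}),$$
although we must admit that it is not that easy to extract this particular form of the result from [BL]. Theorem 4.2 then gives
$$\mathrm{sign}(q, L\mathbb{P}^{2n}) = \mathrm{ch}\, H^0(\mathbb{P}^{2n}, \Omega^{ch}_{\mathbb{P}^{2n}}) + \mathrm{ch}\, H^{2n}(\mathbb{P}^{2n}, \Omega^{ch}_{\mathbb{P}^{2n}}) - 1, \tag{4.15}$$
because
$$\sum_{i=1}^{2n-1} (-1)^i \dim H^i(\mathbb{P}^{2n}, \Omega^{ch}_{\mathbb{P}^{2n}})_0 = \sum_{i=1}^{2n-1} (-1)^i \dim H^i(\mathbb{P}^{2n}, \Omega^*_{\mathbb{P}^{2n}}) = -1.$$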
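The value $-1$ of the last alternating sum can be confirmed by a short count. The sketch below assumes only the standard Hodge numbers of projective space, $\dim H^i(\mathbb{P}^m, \Omega^p) = 1$ if $p = i \le m$ and $0$ otherwise; the function names are illustrative.

```python
def hodge_number(m, p, i):
    """dim H^i(P^m, Omega^p): 1 if p == i and 0 <= p <= m, else 0 (standard fact)."""
    return 1 if (p == i and 0 <= p <= m) else 0


def dim_cohomology_of_forms(m, i):
    """dim H^i(P^m, Omega^*), summed over all form degrees p."""
    return sum(hodge_number(m, p, i) for p in range(m + 1))


def middle_alternating_sum(n):
    """Sum over 0 < i < 2n of (-1)^i dim H^i(P^{2n}, Omega^*)."""
    m = 2 * n
    return sum((-1) ** i * dim_cohomology_of_forms(m, i) for i in range(1, m))
```

Each intermediate group is one-dimensional and there are $2n-1$ of them, starting and ending with an odd degree, so the signs cancel in pairs and leave $-1$.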
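Since, due to the chiral Serre duality [MS1],
$$H^0(\mathbb{P}^{2n}, \Omega^{ch}_{\mathbb{P}^{2n}})^d = H^{2n}(\mathbb{P}^{2n}, \Omega^{ch}_{\mathbb{P}^{2n}}),$$
we have
$$\mathrm{ch}\, H^0(\mathbb{P}^{2n}, \Omega^{ch}_{\mathbb{P}^{2n}}) = \mathrm{ch}\, H^{2n}(\mathbb{P}^{2n}, \Omega^{ch}_{\mathbb{P}^{2n}}).$$
Plugging this in (4.15) we get the desired result.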
5. Deforming Cohomology Algebras of Hypersurfaces in Projective Spaces

Let $\mathcal{L} \to \mathbb{P}^N$ be a degree $-n < 0$ line bundle, $\mathcal{L}^* \to \mathbb{P}^N$ its dual, $s \in \Gamma(\mathbb{P}^N, \mathcal{L}^*)$ a global section so that its zero locus $Z(s) \subset \mathbb{P}^N$ is a smooth hypersurface. The way Borisov calculates the cohomology of the chiral de Rham complex over $Z(s)$ is as follows. Extend the lattice $(L, (.,.))$ introduced in 1.1 to the lattice $(\hat{L}, (.,.))$ so that
$$\hat{L} = L \oplus \mathbb{Z}A^u \oplus \mathbb{Z}B^u, \quad (A^u, B^u) = 1, \quad (A^u, L) = 0, \quad (B^u, L) = 0.$$
There arises the corresponding lattice vertex algebra $V_{\hat{L}}$. Observe that any subset $L' \subset \hat{L}$ closed under addition gives rise to the vertex subalgebra $V_{L'} \subset V_{\hat{L}}$ generated by $\mathfrak{h}_{\hat{L}}$ and $Cl_{\hat{L}}$ from the highest weight vectors $e^\beta$, $\beta \in L'$; see 1.1–1.2.

In our geometric situation let $\hat{L}_n$ be the span of $B^i$ ($i = 1, \dots, N$), $B^u$ with arbitrary integral coefficients and $A^i$ ($i = 1, \dots, N$), $A^u$, $nA^u - A^1 - \dots - A^N$ with nonnegative integral coefficients. The vertex algebra $V_{\hat{L}_n}$ affords a degeneration, $V_{\hat{L}_n^\Sigma}$, and is included in a family, $V_{\hat{L}_n, q}$, $q \in \mathbb{C}$, in the same way the algebra $V_L$ did, see 1.3, 2.1. To construct $V_{\hat{L}_n^\Sigma}$, consider the following $N+1$ elements of $\hat{L}$:
$$\xi_1 = A^1, \ \xi_2 = A^2, \ \dots, \ \xi_N = A^N, \ \xi_{N+1} = nA^u - A^1 - A^2 - \dots - A^N.$$
Define the cone $\Delta_i$ to be the set of all non-negative integral linear combinations of the elements $\xi_1, \dots, \xi_{i-1}, \xi_{i+1}, \dots, \xi_{N+1}, A^u$ and let $\Sigma = \{\Delta_1, \dots, \Delta_{N+1}\}$. The vertex algebra $V_{\hat{L}_n^\Sigma}$ is now defined by repeating word for word the definition of $V_{L^\Sigma}$ in 1.3. Similarly, the family $V_{\hat{L}_n, q}$, $q \ne 0$, is defined by repeating word for word the definition of $V_{L,q}$ in 2.1. The argument parallel to that leading to (2.2) shows that this family extends "analytically" to $q = 0$ if $n \le N+1$ and we again obtain an isomorphism
$$V_{\hat{L}_n, 0} = V_{\hat{L}_n^\Sigma} \quad \text{if } n < N+1. \tag{5.1}$$
Borisov's differential is as follows:
$$D = \sum_{i=1}^{N} \Psi^i(z)\bigl(e^{A^i} - e^{nA^u - \sum_j A^j}\bigr)(z) + \Psi^u(z)\bigl(n e^{nA^u - \sum_j A^j} - e^{A^u}\bigr)(z). \tag{5.2a}$$
(For future use let us note that the right-hand side of this equality can be rewritten as a sum over lattice points:
$$D = \sum_{i=1}^{N+1} \Psi^{\xi_i}(z) e^{\xi_i}(z) - \Psi^u(z) e^{A^u}(z), \tag{5.2b}$$
where $\Psi^{\xi_i} = \Psi^i$ ($i \le N$) and $\Psi^{\xi_{N+1}} = n\Psi^u - \sum_j \Psi^j$.) It is obvious that $D \in \mathrm{End}(V_{\hat{L}_n, q})$ and $D^2 = 0$; therefore there arise the cohomology groups $H_D(V_{\hat{L}_n, q})$ and $H_D(V_{\hat{L}_n^\Sigma}) = H_D(V_{\hat{L}_n, 0})$.

Theorem 5.1 ([B]).
$$H_D(V_{\hat{L}_n^\Sigma}) = H^*(\mathcal{L}, \Omega^{ch}_{\mathcal{L}}).$$
Borisov proposes to calculate the chiral de Rham complex over the hypersurface $Z(s) \subset \mathbb{P}^N$ by means of a certain Koszul-type resolution of the complex $\Omega^{ch}_{\mathcal{L}}$. The combinatorial data that determine $s \in \Gamma(\mathbb{P}^N, \mathcal{L}^*)$ consist of a finite set
$$\Delta^* = \Bigl\{ \beta = B^u + \sum_{j=1}^{N} n_j B^j \ \text{s.t.} \ (\beta, \xi_i) \ge 0, \ i = 1, 2, \dots, N+1 \Bigr\}, \tag{5.3}$$
and a function
$$g : \Delta^* \to \mathbb{Z}_{\ge}.$$
Define
$$K_g = \sum_{\beta \in \Delta^*} g(\beta)\, \Phi^\beta(z) e^\beta(z), \tag{5.4}$$
where
$$\Phi^\beta = \Phi^u + \sum_j n_j \Phi^j \quad \text{provided} \quad \beta = B^u + \sum_j n_j B^j. \tag{5.5}$$
It is easy to see that
$$K_g \in \mathrm{End}(V_{\hat{L}_n, q}), \quad K_g^2 = 0, \quad [K_g, D] = 0.$$
Therefore, there arise the cohomology groups $H_{D+K_g}(V_{\hat{L}_n, q})$ and $H_{D+K_g}(V_{\hat{L}_n^\Sigma}) = H_{D+K_g}(V_{\hat{L}_n, 0})$.

Theorem 5.2 ([B]).
$$H_{D+K_g}(V_{\hat{L}_n^\Sigma}) = H^*(Z(s), \Omega^{ch}_{Z(s)}).$$

All the vertex algebras in sight being topological (see 2.2), Theorem 5.2 and the main result of [MSV] give
$$H^*(Z(s)) = H_{Q_0}\bigl(H_{D+K_g}(V_{\hat{L}_n^\Sigma})\bigr), \tag{5.6a}$$
or, equivalently,
$$H^*(Z(s)) = H_{D+K_g}(V_{\hat{L}_n^\Sigma})_0, \tag{5.6b}$$
where $H_{D+K_g}(V_{\hat{L}_n^\Sigma})_0$ stands for the kernel of $L_0$. This prompts the following
Conjecture 5.3. If $n < N$, then the algebra $H_{D+K_g}(V_{\hat{L}_n, q})_0$ is isomorphic to the quantum cohomology algebra of $Z(s)$.

Unfortunately we do not have a proof of this conjecture; we cannot even prove that $H_{D+K_g}(V_{\hat{L}_n, q})_0$ is a deformation of $H^*(Z(s))$. What we know is collected in the following

Lemma 5.4. (i) The element $e^{nA^u - \sum_j A^j}$ satisfies
$$(D + K_g)\bigl(e^{nA^u - \sum_j A^j}\bigr) = 0,$$
and, therefore, determines an element of $H_{D+K_g}(V_{\hat{L}_n, q})_0$ for all $q$. If $q = 0$, then this element, considered as an element of $H^*(Z(s))$ (see (5.6b)), is proportional to the cohomology class of a hyperplane section.
(ii) Due to (i), $e^{nA^u - \sum_j A^j}$ determines an element of $H_{D+K_g}(V_{\hat{L}_n, q})_0$ to be denoted $x$. It satisfies the following relation:
$$x^{N+1} - q^{N+1-n} n^n x^n = 0. \tag{5.7}$$
(iii) If $Z(s)$ is a hyperplane (i.e. $n = 1$), then Conjecture 5.3 is correct.
(iv) If $Z(s)$ is the image of the embedding $\mathbb{P}^1 \times \mathbb{P}^1 \to \mathbb{P}^3$, $(u : v), (w : t) \mapsto (uw : ut : vw : vt)$, then $H_{D+K_g}(V_{\hat{L}_2, q})_0$ is isomorphic to $\mathbb{C}[x, y]/(x^2 - 1, y^2 - 1)$. Hence Conjecture 5.3 is true in this case.
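Relation (5.7) can be sanity-checked by direct substitution. The sketch below assumes the presentation quoted in the proof sketch of (ii): generators $x_1, \dots, x_N, T, u$ subject to $x_1 \cdots x_N T = q^{N+1-n} u^n$, with the Koszul sequence imposing $x_i = T$ and $u = nT$. It further assumes that $T$ is the generator playing the role of $x$. The function names are illustrative, and the check is performed only at integer sample points.

```python
from math import prod


def defining_relation(xs, T, u, q, n):
    """x_1 ... x_N T - q^(N+1-n) u^n, the relation of the quotient algebra."""
    N = len(xs)
    return prod(xs) * T - q ** (N + 1 - n) * u ** n


def relation_5_7(x, q, N, n):
    """Left-hand side of (5.7): x^(N+1) - q^(N+1-n) n^n x^n."""
    return x ** (N + 1) - q ** (N + 1 - n) * n ** n * x ** n


def check(N, n, samples=range(-3, 4)):
    """Impose x_i = T and u = n*T, then compare with (5.7) at integer samples."""
    return all(
        defining_relation([T] * N, T, n * T, q, n) == relation_5_7(T, q, N, n)
        for T in samples
        for q in samples
    )
```

For a hyperplane ($n = 1$) the relation reads $x(x^N - q^N) = 0$, which for $q \ne 0$ has $N+1$ distinct roots; this matches the $(N+1)$-point set that appears in the proof sketch of (iii).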
Sketch of Proof of Lemma 5.4. The first part of assertion (i) is a result of the obvious calculation using the formulas of 1.1–1.2. The fact that at $q = 0$ the element $e^{nA^u - \sum_j A^j}$ is proportional to the cohomology class of a hyperplane section follows from Borisov's proof of Theorem 5.2; this observation is completely analogous to the one made in the end of 2.3.

(ii) By definition (cf. (2.2)), the subalgebra of $V_{\hat{L}_n, q}$ generated by $e^{A^i}$ ($i = 1, \dots, N$), $e^{A^u}$, and $\Phi^i_0$ ($i = 1, \dots, N$), $\Phi^u_0$ is isomorphic to
$$\mathbb{C}[x_1, \dots, x_N, T, u; \Phi^1, \dots, \Phi^N, \Phi^u]/(x_1 x_2 \cdots x_N T - q^{N+1-n} u^n),$$
and the restriction of $D$ to this supercommutative algebra coincides with the Koszul differential associated with the regular sequence $x_i - T$, $u - nT$ ($i = 1, \dots, N$). The relation (5.7) follows at once.

(iii) Observe that we have the constant family of vector spaces $V_{\hat{L}_n, q}$, $q \in \mathbb{C}$, with differential $D + K_g$ depending on $q$. At $q = 0$ the complex $(V_{\hat{L}_n, q}, D + K_g)$ degenerates into Borisov's complex $(V_{\hat{L}_n^\Sigma}, D + K_g)$. As usually happens in situations of this kind, and has already happened in 4.1, the differential $d = D + K_g$ breaks into a sum $d = d_-(q) + d_+$ so that $[d_-(q), d_+] = 0$, $d_-(0) = 0$, and $d_+$ equals Borisov's differential on $V_{\hat{L}_n^\Sigma}$. There arises a spectral sequence, analogous to the one used in 4.1 but more complicated, converging to $H_{D+K_g}(V_{\hat{L}_n, q})$ with the 1st term equal to $H^*(Z(s), \Omega^{ch}_{Z(s)})$. The 2nd term equals the cohomology of the complex $(H^*(Z(s), \Omega^{ch}_{Z(s)}), d^{(1)})$ with $d^{(1)} = d_-(q)$ satisfying:
$$d_-(q)\bigl(H^i(Z(s), \Omega^{ch}_{Z(s),n})\bigr) \subset H^{i-N+n}(Z(s), \Omega^{ch}_{Z(s),n}); \quad n \ge 0. \tag{5.8}$$
All this is true for an arbitrary $Z(s)$. In the case of a hyperplane we have $n = 1$. It follows immediately that then all higher differentials vanish and it remains to calculate the cohomology $(H^*(Z(s), \Omega^{ch}_{Z(s)})_0, d_-(q))$. By Theorem 5.2, the space $H^*(Z(s), \Omega^{ch}_{Z(s)})_0$ is an algebra generated by the class of $e^{nA^u - \sum_j A^j}$, which $d_-(q)$ sends to 0, as follows from either (5.8) or assertion (i). Hence $d_-(q)$ is 0 on the indicated space and the algebra $H_{D+K_g}(V_{\hat{L}_n, q})_0$ is indeed a deformation of the cohomology of $\mathbb{P}^{N-1}$. The quantum cohomology algebra of the hyperplane is isomorphic to the algebra of functions on an $N$-point set. Because of (ii), $H_{D+K_g}(V_{\hat{L}_n, q})_0$ is a quotient of the algebra of functions on an $(N+1)$-point set, hence the algebra of functions on an $(N+1-k)$-point set. As we have just proved, $k$ can only be 1 and this quotient can only be the algebra of functions on an $N$-point set.

(iv) Our proof of this assertion is a relatively lengthy and not particularly illuminating computation of the type done in (iii); we omit the details.

Remarks. (i) By Corollary 9.3 of [G], in the case $n < N$ the cohomology class $p$ of a hyperplane section satisfies in the quantum cohomology of $Z(s)$ the relation $p^N = q n^n p^{n-1}$. The amusing similarity between this equality and (5.7) suggests that the subalgebra of $H_{D+K_g}(V_{\hat{L}_n, q})_0$ generated by $x$ might be equal to $\mathbb{C}[x]/(x^N - q^{N+1-n} n^n x^{n-1})$.
(ii) Borisov's suggestion to treat the mirror symmetry as an $A \leftrightarrow B$ flip seems to be working in our "quantized" situation as well. Compare (5.5) with (5.2b) to note that $D$ and $K_g$ are sums over two sets of lattice points defined by the self-dual condition (5.3). Hence the $A \leftrightarrow B$ flip changes $D$ to a similar differential to be associated with the mirror partner of $Z(s)$ lying in another toric manifold, see Sect. 3. Of course the vertex algebra $V_{\hat{L}_n}$ bears a certain asymmetry, since not all elements of the type $e^{\sum_j n_j A^j + n_u A^u}$ are allowed, but Borisov's "transition to the whole lattice" (see Theorem 8.3 in [B]) and the spectral sequence above seem to straighten things out.

Acknowledgements. This work was done while V.S. was visiting the Institut des Hautes Études Scientifiques. The last strokes were laid while F.M. was visiting the Max-Planck-Institut für Mathematik. We are grateful to both institutions for support and excellent working conditions.
References

[Bat] Batyrev, V.: Quantum cohomology rings of toric manifolds. Journées de Géométrie Algébrique d'Orsay (Orsay 1992), Astérisque 218, 9–34 (1993); alg-geom/9310004
[B] Borisov, L.: Vertex algebras and Mirror symmetry. CMP 215(3), 517–557 (2001); math.AG/9809094
[BL] Borisov, L., Libgober, A.: Elliptic genera and applications to mirror symmetry. Inv. Math. 140, 453–485 (2000); math/9906126
[D] Danilov, V.I.: The geometry of toric varieties. Uspekhi Mat. Nauk 33(2), 85–134 (1978) (Russian); Russ. Math. Surveys 33(2), 97–154 (1978)
[FK] Frenkel, I.B., Kac, V.G.: Basic representations of affine Lie algebras and dual resonance models. Invent. Math. 62, 23–66 (1980)
[FZ] Frenkel, I.B., Zhu, Y.: Vertex operator algebras associated to representations of affine and Virasoro algebra. Duke Math. J. 66(1), 123–168 (1992)
[G] Givental, A.: Equivariant Gromov-Witten invariants. Internat. Math. Res. Notices 13, 613–663 (1996)
[H] Hartshorne, R.: Algebraic geometry. 4th edition. Berlin-Heidelberg-New York: Springer-Verlag, 1987
[HBJ] Hirzebruch, F., Berger, T., Jung, R.: Manifolds and modular forms. In: Aspects of Mathematics, A publication of the Max-Planck-Institut für Mathematik, Bonn, 1992
[K] Kac, V.G.: Vertex algebras for beginners, 2nd edition, University Lecture Series 10, Providence, RI: AMS, 1998
[MSV] Malikov, F., Schechtman, V., Vaintrob, A.: Chiral de Rham complex. Commun. Math. Phys. 204, 439–473 (1999)
[MS1] Malikov, F., Schechtman, V.: Chiral de Rham complex. II. Am. Math. Soc. Transl. 194(2), 149–188 (1999)
[MS2] Malikov, F., Schechtman, V.: Chiral Poincaré duality. Math. Res. Lett. 6, 533–546 (1999)

Communicated by R. H. Dijkgraaf
Commun. Math. Phys. 234, 101–127 (2003) Digital Object Identifier (DOI) 10.1007/s00220-002-0753-9
Communications in
Mathematical Physics
Equivariant K-Theory, Generalized Symmetric Products, and Twisted Heisenberg Algebra Weiqiang Wang
∗,∗∗
Department of Math., North Carolina State University, Raleigh, NC 27695, USA Received: 3 February 2001 / Accepted: 17 August 2002 Published online: 10 January 2003 – © Springer-Verlag 2003
Abstract: For a space $X$ acted on by a finite group $\Gamma$, the product space $X^n$ affords a natural action of the wreath product $\Gamma_n = \Gamma^n \rtimes S_n$. The direct sum of equivariant K-groups $\bigoplus_{n \ge 0} K_{\Gamma_n}(X^n) \otimes \mathbb{C}$ was shown earlier by the author to carry several interesting algebraic structures. In this paper we study the K-groups $K^-_{\widetilde{H}\Gamma_n}(X^n)$ of $\Gamma_n$-equivariant Clifford supermodules on $X^n$. We show that $\mathcal{F}^-_\Gamma(X) = \bigoplus_{n \ge 0} K^-_{\widetilde{H}\Gamma_n}(X^n) \otimes \mathbb{C}$ is a Hopf algebra and it is isomorphic to the Fock space of a twisted Heisenberg algebra. Twisted vertex operators make a natural appearance. The algebraic structures on $\mathcal{F}^-_\Gamma(X)$, when $\Gamma$ is trivial and $X$ is a point, specialize to those on a ring of symmetric functions with the Schur Q-functions as a linear basis. As a by-product, we present a novel construction of K-theory operations using the spin representations of the hyperoctahedral groups.

Contents
1. Introduction
2. The Group $\widetilde{H}\Gamma_n$ and Its Spin Supermodules
2.1 Definition of the supergroup $\widetilde{H}\Gamma_n$
2.2 Conjugacy classes of $\widetilde{H}\Gamma_n$
2.3 Spin supermodules over $\widetilde{H}\Gamma_n$
3. A Decomposition Theorem in Equivariant K-Theory
3.1 The standard version
3.2 A super/twisted variant
4. The Group $\widetilde{H}_n$ and K-Theory Operations
4.1 The group $\widetilde{H}_n$ and a ring of symmetric functions
4.2 K-theory operations
5. Algebraic Structures on $\mathcal{F}^-_\Gamma(X)$
5.1 Generalized symmetric products
5.2 Hopf algebra $\mathcal{F}^-_\Gamma(X)$
5.3 Description of the algebra $\mathcal{F}^-_\Gamma(X)$
5.4 Twisted vertex operators and $\mathcal{F}^-_\Gamma(X)$
5.5 Twisted Heisenberg algebra and $\mathcal{F}^-_\Gamma(X)$
6. Appendix: Another Formulation Using $\widetilde{S}_n$ and $\widetilde{\Gamma}_n$

∗ Partially supported by an NSF grant and an FR&PD grant at NCSU
∗∗ Current address: Department of Mathematics, University of Virginia, Charlottesville, VA 22904, USA. E-mail: [email protected]
1. Introduction

Motivated in part by Vafa-Witten [VW1] and generalizing the work of Segal [Seg2] (also cf. Grojnowski [Gr]), we studied in [W1] a direct sum, denoted by $\mathcal{F}_\Gamma(X)$, of the equivariant K-groups $\bigoplus_{n \ge 0} K_{\Gamma_n}(X^n) \otimes \mathbb{C}$ associated to a topological $\Gamma$-space $X$. Here $\Gamma$ is a finite group and the wreath product $\Gamma_n$ (i.e. the semi-direct product $\Gamma^n \rtimes S_n$) acts naturally on the $n$th Cartesian product $X^n$. We proved that the space $\mathcal{F}_\Gamma(X)$ carries several remarkable algebraic structures such as Hopf algebra and Fock representation of a Heisenberg (super)algebra, etc., and that vertex operators make a natural appearance as a part of the λ-ring structure. We in addition pointed out in [W1] a new approach to the realization of the Frenkel-Kac-Segal homogeneous vertex representations of affine Lie algebras by using representation rings of the wreath product $\Gamma_n$ associated to a finite subgroup $\Gamma$ of $SL_2(\mathbb{C})$. This has been subsequently completed in [FJW1] jointly with I. Frenkel and Jing, and further extended in [FJW2] to realize vertex representations of twisted affine and toroidal Lie algebras by using the spin representation rings of a double cover $\widetilde{\Gamma}_n$ of the wreath product $\Gamma_n$.

In this paper we will introduce a variant of equivariant K-theory. Given a topological space $X$ acted upon by a finite supergroup $G$ in an appropriate sense, we introduce a category of complex $G$-equivariant spin vector superbundles over $X$, and consider the corresponding Grothendieck group $K^-_G(X)$. The superscript $-$ here and below is used in this paper to stand for spin, i.e. a certain central element $z$ in the supergroup $G$ acts as $-1$. This K-group can be thought of as an invariant of orbifolds with (certain distinguished) discrete torsion as introduced by Vafa [Va]. Discrete torsion has been a topic of interest from various viewpoints since then, cf. [VW2, Di, AR, Ru, Sh] and the references therein.
We present here a variant of the decomposition theorem of Adem and Ruan [AR] for what they call twisted equivariant K-theory, which generalizes the decomposition theorem [BC, Ku] (also cf. [AS, HKR]) in equivariant K-theory. Our formulation of such K-groups is motivated by providing a general framework for the main subject of study in this paper, namely a spin/twisted version of the space $\mathcal{F}_\Gamma(X)$ studied in [W1].

When the topological space under consideration is a point, our K-theory specializes to the theory of supermodules of finite supergroups, cf. Józefiak [Jo1]. (In this paper super always means $\mathbb{Z}_2$-graded.) Such a theory of supermodules has provided, in our opinion, a most natural framework (cf. [Jo]) for the exposition of the spin representations of a double cover $\widetilde{S}_n$ of the symmetric group initiated by Schur [Sc]. The spin representation theory of a double cover $\widetilde{H}_n$ of the hyperoctahedral group, being almost parallel to and to some extent simpler than the spin representation theory for $\widetilde{S}_n$, can also be treated successfully in terms of supermodules (cf. [Jo2]).

Given a space $X$ acted on by a (non-graded) finite group $\Gamma$, we obtain our main example of such a K-group $K^-_{\widetilde{H}\Gamma_n}(X^n)$ by considering the action on the $n$th Cartesian product $X^n$ by the wreath product $\Gamma_n$ which is further extended trivially to the action of a larger finite supergroup $\widetilde{H}\Gamma_n$. Here $\widetilde{H}\Gamma_n$ is a double cover of the wreath product $(\Gamma \times \mathbb{Z}_2)^n \rtimes S_n$. (We recommend that the reader set $\Gamma$ to be the one-element group on a first reading so that the whole picture becomes simpler and more transparent. In this case $\widetilde{H}\Gamma_n$ reduces to a double cover $\widetilde{H}_n$ of the hyperoctahedral group.) The category of $\widetilde{H}\Gamma_n$-equivariant spin vector superbundles over $X^n$ admits an equivalent reformulation which has a perhaps better geometric meaning. Namely this is the category of $\Gamma_n$-equivariant vector bundles $E$ over $X^n$ such that $E$ carries a supermodule structure with respect to the complex Clifford algebra of rank $n$ which is compatible with the action of $\Gamma_n$.

A fundamental example of $\widetilde{H}\Gamma_n$-vector superbundles over $X^n$, which plays an important role in this paper, is given as follows for $X$ compact. Given a $\Gamma$-vector bundle $V$ over $X$, we consider the vector superbundle $V|V = V \oplus V$ over $X$ with the natural $\mathbb{Z}_2$-grading. We can endow the $n$th outer tensor $(V|V)^{\boxtimes n}$ with a natural $\widetilde{H}\Gamma_n$-equivariant vector superbundle structure over $X^n$. We will show that the direct sum
∞ n=0
− n KH (X ) n
C
carries naturally a Hopf algebra structure and it is isomorphic to the Fock space of a twisted Heisenberg superalgebra associated to $K^-_{\widetilde{H}\Gamma_1}(X) \cong K_\Gamma(X)$. All the algebraic structures are constructed in terms of natural K-theory maps. In particular the dimension of $K^-_{\widetilde{H}\Gamma_n}(X^n)$ is determined explicitly for all n. We remark that such a twisted Heisenberg algebra has played an important role in the theory of affine Kac-Moody algebras, cf. [FLM].

Roughly speaking, the structure of the space $\mathcal{F}_\Gamma(X)$ studied in [Seg2, W1] is modeled on the ring $\Lambda_{\mathbb{C}}$ of symmetric functions with a basis given by Schur functions (or equivalently on the direct sum of representation rings of symmetric groups $S_n$ for all n). The structure of the space $\mathcal{F}^-_\Gamma(X)$ under consideration in this paper is shown to be modeled instead on the ring $\Omega_{\mathbb{C}}$ of symmetric functions with a linear basis given by the so-called Schur Q-functions (or equivalently on the direct sum of the spin representation rings of $\widetilde{H}_n$ for all n). It is well known that the graded dimension of the ring $\Omega_{\mathbb{C}}$ is given by the generating function
\[ \prod_{i=1}^{\infty} \frac{1}{1 - q^{2i-1}} = q^{-\frac{1}{24}} \, \frac{\eta(q^2)}{\eta(q)}, \]
where η(q) is the Dedekind η function. Just as the λ ring is modeled on the ring $\Lambda_{\mathbb{C}}$ of symmetric functions (cf. e.g. [Kn]), one can introduce a Q-λ ring$^1$ structure modeled on the ring $\Omega_{\mathbb{C}}$ with Adams operations of odd degrees only. We show that as a part of the Q-λ ring structure on $\mathcal{F}^-_\Gamma(X)$ twisted vertex operators naturally appear in $\mathcal{F}^-_\Gamma(X)$ via the nth outer tensor $(V|V)^{\boxtimes n}$ associated to $V \in K_\Gamma(X)$ in terms of the Adams operations.

It is of independent interest to see that when we restrict $K^-_{\widetilde{H}_n}(X^n)$ from $X^n$ to its diagonal we are able to obtain various K-theory operations on K(X), including supersymmetric power operations and Adams operations of odd degrees, by means of the spin supermodules of $\widetilde{H}_n$. This is a super analog of Atiyah's construction [At] of K-theory operations on K(X) by means of the representations of the symmetric groups.

1 Here Q stands for Queer or Schur Q-functions. We believe that there exists a rich Q-mathematical world which is relevant to various twisted, spin, super structures, etc.
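The graded dimension statement can be checked numerically. The following is a minimal sketch, assuming only the standard fact that the Schur Q-functions of degree n are indexed by strict partitions of n; the function names are ours and purely illustrative. It compares the coefficients of $\prod_{i\ge 1} (1-q^{2i-1})^{-1}$ with the number of partitions into distinct parts.

```python
def odd_product_coeffs(N):
    """Coefficients of prod_{i>=1} 1/(1 - q^(2i-1)) up to q^N."""
    coeffs = [1] + [0] * N
    for part in range(1, N + 1, 2):          # odd part sizes 1, 3, 5, ...
        for n in range(part, N + 1):         # ascending: parts may repeat
            coeffs[n] += coeffs[n - part]
    return coeffs


def strict_partition_counts(N):
    """Number of partitions of n into distinct parts, for n = 0..N."""
    counts = [1] + [0] * N
    for part in range(1, N + 1):
        for n in range(N, part - 1, -1):     # descending: each part used at most once
            counts[n] += counts[n - part]
    return counts


if __name__ == "__main__":
    print(odd_product_coeffs(10))
    print(strict_partition_counts(10))
```

Both lists agree degree by degree, which is the classical bijection between partitions into odd parts and partitions into distinct parts.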
W. Wang
Motivated by Göttsche's formula, Vafa and Witten [VW1] conjectured that the direct sum $\mathcal{H}(S)$ of homology groups for Hilbert schemes $S^{[n]}$ of n points on a (quasi-)projective surface S should carry the structure of a Fock space of a Heisenberg algebra, which was realized subsequently in a geometric way by Nakajima and Grojnowski (cf. [Na, Gr]). Parallel algebraic structures such as Hopf algebra, vertex operators, and Heisenberg algebra as part of vertex algebra structures [Bo, FLM], have naturally shown up in $\mathcal{H}(S)$ as well as in $\mathcal{F}_\Gamma(X)$. When S is a suitable resolution of singularities of an orbifold X/Γ, close connections appear between $\mathcal{H}(S)$ and $\mathcal{F}_\Gamma(X)$, cf. [W2, W3] and the references therein. It will be very important to see if one can find a "Hilbert scheme" version of the orbifold picture drawn in this paper. It is also interesting to see if our current work can find some applications in string theory, cf. [VW1, VW2, Di, Sh]. In fact the special case of our construction for trivial Γ is closely related to an earlier paper of Dijkgraaf [Di]. See the Appendix.

When X is a point, the K-group $K^-_{\widetilde{H}\Gamma_n}(X^n)$ becomes the Grothendieck group of spin supermodules of $\widetilde{H}\Gamma_n$. In a companion paper [JW] joint with Jing, the conjugacy classes and Grothendieck groups of $\widetilde{H}\Gamma_n$ have been studied in detail. In particular, when Γ is a subgroup of $SL_2(\mathbb{C})$, they are used to realize vertex representations of twisted affine algebras (cf. [FLM]) and toroidal Lie algebras. This provides a new group theoretic construction of twisted vertex representations, in our opinion a somewhat improved version of the one in [FJW2] using the groups $\widetilde{\Gamma}_n$. In the Appendix, we sketch another formulation of our main results in this paper using the groups $\widetilde{S}_n$ and $\widetilde{\Gamma}_n$ instead of $\widetilde{H}_n$ and $\widetilde{H}\Gamma_n$.

The plan of this paper is as follows. In Sect. 2 we present the spin representation rings of the finite supergroup $\widetilde{H}\Gamma_n$. In Sect. 3 we introduce the category and K-groups of spin vector superbundles which are equivariant with respect to finite supergroups and present a decomposition theorem for such equivariant K-groups. In Sect. 4 we study K-theory operations based on the spin representations of $\widetilde{H}_n$. The results in this section are not to be further used in this paper. In Sect. 5, which is the heart of the paper, we present the structures of a Hopf algebra and of a Q-λ ring on $\mathcal{F}^-_\Gamma(X)$, and relate the latter to the twisted vertex operators. We further construct a Heisenberg algebra which acts on $\mathcal{F}^-_\Gamma(X)$ irreducibly by means of natural K-theory maps. In the Appendix, we outline a somewhat different construction in terms of the group $\widetilde{\Gamma}_n$.

2. The Group $\widetilde{H}\Gamma_n$ and Its Spin Supermodules

In this section we recall briefly some essential points in the theory of supermodules of a finite supergroup (cf. [Jo1]). We define the finite supergroup $\widetilde{H}\Gamma_n$ associated to any finite group Γ, and study its conjugacy classes and spin supermodules. More details of these can be found in [JW].

2.1. Definition of the supergroup $\widetilde{H}\Gamma_n$. Let $\widetilde{G}$ be a finite group and let $d : \widetilde{G} \to \mathbb{Z}_2$ be a group epimorphism. We denote by $\widetilde{G}_0$ the kernel of d, which is a subgroup of $\widetilde{G}$ of index 2. We regard d(·) as a parity function on $\widetilde{G}$ by letting the degree of elements in $\widetilde{G}_0$ be 0 and the degree of elements in the complement $\widetilde{G}_1 = \widetilde{G} \setminus \widetilde{G}_0$ be 1. Elements in $\widetilde{G}_0$ (resp. $\widetilde{G}_1$) will be called even (resp. odd). We will often refer to the pair $(\widetilde{G}, d)$, or simply $\widetilde{G}$ when there is no ambiguity, as a finite supergroup. The class of finite supergroups $\widetilde{G}$ under consideration in this paper has an additional property: it contains a distinguished even central element z of order 2. We denote by θ the quotient group homomorphism $\widetilde{G} \to G \equiv \widetilde{G}/\langle 1, z\rangle$.
In this paper the modules over a finite supergroup or a superalgebra (such as the group superalgebra of a finite supergroup) will always be $\mathbb{Z}_2$-graded (i.e. supermodules) unless otherwise specified. A general theory of supermodules over finite supergroups was developed by Józefiak [Jo1]. This was motivated by the aim of providing a modern account [Jo] of Schur's seminal work on spin representations of symmetric groups [Sc].

Given two supermodules $M = M_0 + M_1$ and $N = N_0 + N_1$ over a superalgebra $A = A_0 + A_1$, a linear map $f : M \to N$ is a homomorphism of degree i if $f(M_j) \subset N_{i+j}$ and for any homogeneous element $a \in A$ and any homogeneous vector $m \in M$ we have $f(am) = (-1)^{d(f)d(a)} a f(m)$. The degree 0 (resp. 1) part of a superspace is referred to as the even (resp. odd) part. We denote $\mathrm{Hom}_A(M, N) = \mathrm{Hom}_A(M, N)_0 \oplus \mathrm{Hom}_A(M, N)_1$, where $\mathrm{Hom}_A(M, N)_i$ consists of A-homomorphisms of degree i from M to N. The notions of submodules, tensor product, irreducibility, etc. for supermodules are defined similarly.

Given a finite supergroup $\widetilde{G}$, a $\widetilde{G}$-supermodule V is called spin if the central element z acts as −1. Alternatively, one can associate a 2-cocycle $\alpha : G \times G \to \mathbb{Z}_2 = \{\pm 1\}$ such that V becomes a projective supermodule of the group $G = \widetilde{G}/\langle 1, z\rangle$ associated with α, namely, the actions of any two elements $g, h \in G$ on V, denoted by ρ(g), ρ(h), satisfy the relation
\[ \rho(g)\rho(h) = \alpha(g, h)\,\rho(gh). \tag{1} \]
Among all $\widetilde{G}$-supermodules, we will only consider the spin supermodules in this paper. It is clear that the restriction (resp. the induction) of a spin supermodule to a $\mathbb{Z}_2$-graded subgroup (resp. a larger supergroup) with the same distinguished even element z remains a spin supermodule.

There are two types of complex simple superalgebras according to C.T.C. Wall: M(r|s) and Q(n). Here M(r|s) is the superalgebra consisting of the linear transformations of the superspace $\mathbb{C}^{r|s} = \mathbb{C}^r + \mathbb{C}^s$. The superalgebra Q(n) is the graded subalgebra of M(n|n) consisting of matrices of the form
\[ \begin{pmatrix} A & B \\ B & A \end{pmatrix}. \]
It is known [Jo1] that the group (super)algebra of a finite supergroup is semisimple, i.e. decomposes into a direct sum of simple superalgebras. According to the classification of simple superalgebras above, the irreducible supermodules of a finite supergroup are divided into two types, type M and type Q. We note that the endomorphism algebra of an irreducible supermodule V is isomorphic to $\mathbb{C}$ if V is of type M and isomorphic to the complex Clifford algebra $C_1$ in one variable if V is of type Q.

Let $\Pi_n$ be the group generated by $1, z, a_1, \ldots, a_n$ subject to the relations
\[ z^2 = 1, \qquad a_i^2 = z, \qquad a_i a_j = z a_j a_i \quad \text{for } i \neq j. \]
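To make the relations concrete, here is a small Python model of the group generated by $1, z, a_1, \ldots, a_n$. It is only a sketch: the encoding of an element as a pair (power of z, bitmask of generators in increasing order) is our own choice for illustration and is not taken from the paper.

```python
def mul(x, y, n):
    """Multiply two elements of the group generated by z, a_1, ..., a_n.

    An element z^e * a_{i1} ... a_{ik} (i1 < ... < ik) is stored as (e, mask),
    where bit j of mask is set when a_{j+1} occurs in the word.
    """
    (e1, s1), (e2, s2) = x, y
    e, s = e1 + e2, s1
    for j in range(n):                       # append the generators of y in increasing order
        if not (s2 >> j) & 1:
            continue
        e += bin(s >> (j + 1)).count("1")    # moving past each larger generator costs one z
        if (s >> j) & 1:
            e += 1                           # a_j * a_j = z
        s ^= 1 << j
    return (e % 2, s)


def elements(n):
    """All 2^(n+1) elements in the (e, mask) encoding."""
    return [(e, s) for e in (0, 1) for s in range(2 ** n)]


if __name__ == "__main__":
    n = 3
    z = (1, 0)
    a = lambda i: (0, 1 << i)                # a(0) encodes a_1, a(1) encodes a_2, ...
    print(mul(a(0), a(0), n) == z)
    print(mul(a(0), a(1), n) == mul(z, mul(a(1), a(0), n), n))
    print(len(elements(n)))
```

In this model the group has $2^{n+1}$ elements, z is central of order 2, and setting z = −1 recovers the anticommuting generators of a Clifford algebra.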
The symmetric group $S_n$ acts on $\Pi_n$ via permutations of the elements $a_1, \ldots, a_n$, i.e. $\sigma(a_i) = a_{\sigma(i)}$ for $\sigma \in S_n$. We thus form the semi-direct product $\widetilde{H}_n = \Pi_n \rtimes S_n$, which is naturally endowed with a $\mathbb{Z}_2$-grading given by the parity $d(a_i) = 1$, $d(z) = 0$, and $d(\sigma) = 0$
for $\sigma \in S_n$. We may therefore regard $\widetilde{H}_n$ as a finite supergroup with a distinguished even central element. Note that the group superalgebra $\mathbb{C}[\Pi_n]/\langle z = -1\rangle$ is exactly the complex Clifford algebra $C_n$ in n variables.

Let Γ be a finite group with r + 1 conjugacy classes. We denote by $\Gamma^* = \{\gamma_i\}_{i=0}^r$ the set of complex irreducible characters, where $\gamma_0$ denotes the trivial character, and by $\Gamma_*$ the set of conjugacy classes. Let $R(\Gamma) = \oplus_{i=0}^r \mathbb{C}\gamma_i$ be the space of class functions on Γ, and set $R_{\mathbb{Z}}(\Gamma) = \oplus_{i=0}^r \mathbb{Z}\gamma_i$. For $c \in \Gamma_*$ let $\zeta_c$ be the order of the centralizer of an element in the conjugacy class c, so the order of the class is then $|\Gamma|/\zeta_c$.

Given a positive integer n, let $\Gamma^n = \Gamma \times \cdots \times \Gamma$ be the nth direct product of Γ, and let $\Gamma^0$ be the trivial group. The symmetric group $S_n$ naturally acts on $\Gamma^n \times \Pi_n$ by simultaneous permutations of elements in $\Gamma^n$ and $\Pi_n$. The finite supergroup $\widetilde{H}\Gamma_n$ is then defined to be the semi-direct product of the symmetric group $S_n$ and $\Gamma^n \times \Pi_n$, with the multiplication given by
\[ (g, \sigma) \cdot (h, \tau) = (g\,\sigma(h), \sigma\tau), \qquad g, h \in \Gamma^n \times \Pi_n, \quad \sigma, \tau \in S_n. \]
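The order of $\widetilde{H}\Gamma_n$ is clearly $2^{n+1} n! |\Gamma|^n$. The $\mathbb{Z}_2$-grading on $\widetilde{H}\Gamma_n$ is induced from that on $\Pi_n$ and by letting the elements in $\Gamma^n$ be even (i.e. of degree 0). Denoting $H\Gamma_n = (\Gamma \times \mathbb{Z}_2)^n \rtimes S_n$, we have the following exact sequence of groups:
\[ 1 \longrightarrow \mathbb{Z}_2 = \{1, z\} \longrightarrow \widetilde{H}\Gamma_n \stackrel{\theta_n}{\longrightarrow} H\Gamma_n \longrightarrow 1. \]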
It is clear that when Γ is trivial $\widetilde{H}\Gamma_n$ reduces to a double cover $\widetilde{H}_n$ of the hyperoctahedral group $H_n = \mathbb{Z}_2^n \rtimes S_n$. The finite supergroup $\widetilde{H}\Gamma_n$ contains $\Pi_n$, $\widetilde{H}_n$ and the wreath product $\Gamma_n = \Gamma^n \rtimes S_n$ as distinguished subgroups. Letting $\Gamma^n$ act trivially on $\Pi_n$ we extend the action of the symmetric group $S_n$ on $\Pi_n$ to an action of $\Gamma_n$. In this way we may also view $\widetilde{H}\Gamma_n$ as a semi-direct product between $\Gamma_n$ and $\Pi_n$.

2.2. Conjugacy classes of $\widetilde{H}\Gamma_n$. Let $\widetilde{G}$ be a finite supergroup and put $G = \widetilde{G}/\langle 1, z\rangle$ and $\theta : \widetilde{G} \to G$ as before. For any conjugacy class C of G, $\theta^{-1}(C)$ is either a conjugacy class of $\widetilde{G}$ or it splits into two conjugacy classes of $\widetilde{G}$, cf. [Jo1]. If $\theta^{-1}(C)$ splits, the conjugacy class C will be referred to as split, and an element g in C is also called split, which is equivalent to saying that the two preimages of g under θ are not conjugate to each other. Otherwise g is said to be non-split. In view of (1), we have the following easy equivalent formulation.

Lemma 2.1. An element g in $G = \widetilde{G}/\langle 1, z\rangle$ is split if and only if $\varepsilon_g(\cdot) := \alpha(g, \cdot)\,\alpha(\cdot, g)^{-1}$ defines a trivial character of the centralizer group $Z_G(g)$.

For the study of spin supermodules of $\widetilde{H}\Gamma_n$, it is crucial to have a detailed description of split conjugacy classes of G. Indeed the characters of spin supermodules vanish on non-split conjugacy classes as well as on odd split classes, cf. [Jo1]. Below we will concentrate on the group $\widetilde{H}\Gamma_n$.

Let $\lambda = (\lambda_1, \lambda_2, \ldots, \lambda_l)$ be a partition of the integer $|\lambda| = \lambda_1 + \cdots + \lambda_l$, where $\lambda_1 \geq \cdots \geq \lambda_l \geq 1$. The integer l is called the length of the partition λ and is denoted by l(λ). We will also make use of another notation for partitions: $\lambda = (1^{m_1} 2^{m_2} \ldots)$,
where $m_i$ is the number of parts in λ equal to i. A partition λ is strict if its parts are distinct integers, namely all the multiplicities $m_i$ are 1 or 0. Given a partition $\lambda = (1^{m_1} 2^{m_2} \ldots)$ of n, we define
\[ z_\lambda = \prod_{i \geq 1} i^{m_i}\, m_i!. \]
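As a quick illustration, $z_\lambda$ can be computed directly from the multiplicities. The sketch below uses helper names of our own choosing; it also checks the standard class equation for $S_n$, namely that the class sizes $n!/z_\lambda$ add up to $n!$.

```python
from collections import Counter
from math import factorial


def partitions(n, max_part=None):
    """Yield all partitions of n as weakly decreasing tuples."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, max_part), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest


def z(lam):
    """z_lambda = prod_i i^{m_i} * m_i!, with m_i the multiplicity of i in lam."""
    result = 1
    for i, m in Counter(lam).items():
        result *= i ** m * factorial(m)
    return result


if __name__ == "__main__":
    n = 5
    for lam in partitions(n):
        print(lam, z(lam), factorial(n) // z(lam))
```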
We note that $z_\lambda$ is the order of the centralizer of an element of cycle-type λ in $S_n$. For a finite set X and $\rho = (\rho(x))_{x \in X}$ a family of partitions indexed by X, we write
\[ \|\rho\| = \sum_{x \in X} |\rho(x)|. \]
It is convenient to regard $\rho = (\rho(x))_{x \in X}$ as a partition-valued function on X. We denote by $\mathcal{P}(X)$ the set of all partition-valued functions on X and by $\mathcal{P}_n(X)$ the set of all $\rho \in \mathcal{P}(X)$ such that $\|\rho\| = n$. The total number of parts, denoted by $l(\rho) = \sum_x l(\rho(x))$, is called the length of ρ. Let $\mathcal{OP}(X)$ be the set of partition-valued functions $(\rho(x))_{x \in X}$ in $\mathcal{P}(X)$ such that all parts of the partitions ρ(x) are odd integers, and let $\mathcal{SP}(X)$ be the set of partition-valued functions $\rho \in \mathcal{P}(X)$ such that each partition ρ(x) is strict. It is clear that $|\mathcal{OP}_n(X)| = |\mathcal{SP}_n(X)|$. When X consists of a single element, we will omit X and simply write $\mathcal{P}$ for $\mathcal{P}(X)$; $\mathcal{OP}$ and $\mathcal{SP}$ will be used similarly. We denote
\[ \mathcal{P}_n^+(X) = \{\rho \in \mathcal{P}_n(X) \mid l(\rho) \equiv 0 \bmod 2\}, \qquad \mathcal{P}_n^-(X) = \{\rho \in \mathcal{P}_n(X) \mid l(\rho) \equiv 1 \bmod 2\}, \]
and define $\mathcal{SP}_n^{\pm}(X) = \mathcal{P}_n^{\pm}(X) \cap \mathcal{SP}_n(X)$.

The conjugacy classes of a wreath product are well understood, cf. [M]. In particular this gives us the following description of conjugacy classes of the wreath product $H\Gamma_n = (\Gamma \times \mathbb{Z}_2)^n \rtimes S_n$.

Let $x = (g, \sigma)$ be an element in a conjugacy class of $H\Gamma_n$, where $g = (g_1, \ldots, g_n)$ and $g_i = (\alpha_i, \varepsilon_i) \in \Gamma \times \mathbb{Z}_2$. We take the convention here that $\mathbb{Z}_2 = \{\pm 1\}$. For each cycle $y = (i_1 i_2 \ldots i_k)$ in the permutation σ consider the elements $\alpha_y = \alpha_{i_k}\alpha_{i_{k-1}} \cdots \alpha_{i_1} \in \Gamma$ and $\varepsilon_y = \varepsilon_{i_k}\varepsilon_{i_{k-1}} \cdots \varepsilon_{i_1}$ (which is ±1) corresponding to the cycle y. For each $c \in \Gamma_*$, ε = ± and r ≥ 1, let $m_r^{\varepsilon}(c)$ be the number of r-cycles in the permutation σ such that the cycle product $\alpha_y$ lies in the conjugacy class c and $\varepsilon_y$ equals ε · 1. Then $c \mapsto \rho^{\varepsilon}(c) = (1^{m_1^{\varepsilon}(c)} 2^{m_2^{\varepsilon}(c)} \ldots)$ defines a partition-valued function on $\Gamma_*$ for each ε. The partition-valued function $\rho^{\varepsilon} = (\rho^{\varepsilon}(c))_{c \in \Gamma_*} \in \mathcal{P}(\Gamma_*)$ is called the ε-type of x. Denote by $\mathcal{P}_n^2(\Gamma_*)$ the set of pairs of partition-valued functions $\rho = (\rho^+, \rho^-)$ such that $\|\rho^+\| + \|\rho^-\| = n$. The pair $\rho = (\rho^+, \rho^-)$ is called the type of x. One can show that the type only depends on the conjugacy class of x in $H\Gamma_n$, and the conjugacy classes in $H\Gamma_n$ are parameterized by the types $\rho \in \mathcal{P}_n^2(\Gamma_*)$. If x is of type ρ, we will also say that the conjugacy class containing x has conjugacy type ρ, and we denote it by $C_\rho$. Denote $D_\rho = \theta_n^{-1}(C_\rho)$. The following is established in [JW], Theorem 2.1.

Theorem 2.1. For $\rho = (\rho^+, \rho^-) \in \mathcal{P}_n^2(\Gamma_*)$, $D_\rho$ splits into two conjugacy classes in $\widetilde{H}\Gamma_n$ if and only if: (1) when $D_\rho$ is even, we have $\rho^+ \in \mathcal{OP}_n(\Gamma_*)$ and $\rho^- = \emptyset$; (2) when $D_\rho$ is odd, we have $\rho^+ = \emptyset$ and $\rho^- \in \mathcal{SP}_n^-(\Gamma_*)$. Thus, the set $(H\Gamma_n)^{e.s}_*$ of even split conjugacy classes in $H\Gamma_n$ is in one-to-one correspondence with the set $\mathcal{OP}_n(\Gamma_*)$.
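The equality of the numbers of odd and strict partition-valued functions of size n can be illustrated numerically. In the sketch below, the integer k stands for the number of elements of the indexing set (for instance, the number of conjugacy classes of the finite group); all function names are hypothetical and chosen only for this example.

```python
def partition_counts(N, odd_only=False, distinct=False):
    """Number of partitions of n (n = 0..N), optionally with odd or distinct parts."""
    counts = [1] + [0] * N
    for part in range(1, N + 1):
        if odd_only and part % 2 == 0:
            continue
        # descending order uses each part at most once; ascending allows repeats
        order = range(N, part - 1, -1) if distinct else range(part, N + 1)
        for n in order:
            counts[n] += counts[n - part]
    return counts


def family_counts(base, k):
    """Number of k-tuples of partitions of total size n, given the counts for one partition."""
    N = len(base) - 1
    total = [1] + [0] * N
    for _ in range(k):
        total = [sum(total[i] * base[n - i] for i in range(n + 1)) for n in range(N + 1)]
    return total


if __name__ == "__main__":
    odd = partition_counts(10, odd_only=True)
    strict = partition_counts(10, distinct=True)
    print(family_counts(odd, 3))
    print(family_counts(strict, 3))
```

Since the one-partition counts already agree, the counts for families indexed by any finite set agree as well; the convolution simply makes the resulting numbers explicit.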
2.3. Spin supermodules over $\widetilde{H}\Gamma_n$. The number of irreducible spin supermodules of $\widetilde{H}\Gamma_n$ equals the number of even split conjugacy classes in $H\Gamma_n$, by a general theorem in the supermodule theory of finite supergroups [Jo1]. The next proposition follows from the equality $|\mathcal{SP}_n(\Gamma_*)| = |\mathcal{OP}_n(\Gamma_*)|$.

Proposition 2.1. The number of irreducible spin supermodules of $\widetilde{H}\Gamma_n$ is equal to the number $|\mathcal{SP}_n(\Gamma_*)|$ of strict partition-valued functions on $\Gamma_*$.

We denote by $R^-(\widetilde{H}\Gamma_n)$ (resp. $R^-_{\mathbb{Z}}(\widetilde{H}\Gamma_n)$) the $\mathbb{C}$-span (resp. $\mathbb{Z}$-span) of the characters of irreducible spin supermodules of $\widetilde{H}\Gamma_n$. Let
\[ R^-_\Gamma = \bigoplus_{n=0}^{\infty} R^-(\widetilde{H}\Gamma_n). \]
When Γ is trivial, we will simply drop the subscript Γ and write
\[ R^- = \bigoplus_{n=0}^{\infty} R^-(\widetilde{H}_n). \]
For example, when Γ is trivial and thus $\widetilde{H}\Gamma_n$ reduces to $\widetilde{H}_n$, the irreducible spin supermodules of $\widetilde{H}_n$ are parameterized by strict partitions of n (cf. [Ser] and [Jo2]). For strict partitions λ and µ of n, let $T^\lambda$ and $T^\mu$ denote the corresponding irreducible spin supermodules over $\widetilde{H}_n$. We have
\[ \dim \mathrm{Hom}_{\widetilde{H}_n}(T^\lambda, T^\mu) = \delta_{\lambda\mu}\, 2^{\delta(l(\lambda))}, \tag{2} \]
where the number δ(l(λ)) is 0 for l(λ) even and 1 otherwise. That is, the supermodule $T^\lambda$ is of type M (resp. type Q) if and only if l(λ) is even (resp. odd).

A most distinguished example of an irreducible $\widetilde{H}_n$-supermodule is the so-called basic spin supermodule $L_n$ constructed as follows (cf. [Jo2, JW]). As a superspace $L_n$ is isomorphic to the group superalgebra $\mathbb{C}[\Pi_n]/\langle z = -1\rangle$ (i.e. the Clifford algebra in n variables). Denote by $y_i \in L_n$ the image of $a_i \in \Pi_n$. Then the elements $y_I = \prod_{i \in I} y_i$, $I \subset \{1, \ldots, n\}$, form a linear basis of $L_n$. The action of $\widetilde{H}_n$ on $L_n$ is given by
\[ a_j\, y_I = y_j y_I \quad (j = 1, \ldots, n), \qquad \sigma\, y_I = y_{\sigma(I)}, \quad \sigma \in S_n. \]
Indeed $L_n$ is exactly the $\widetilde{H}_n$-supermodule $T^{(n)}$ associated to the one-part partition (n). If we denote by $\xi^n$ the character of $L_n$, then the character value of $\xi^n$ is $2^{l(\rho)}$ on a conjugacy class of type $\rho = (\rho^+, \emptyset)$, where $\rho^+ \in \mathcal{OP}_n(\Gamma_*)$. For each partition-valued function $\rho = (\rho(c))_{c \in \Gamma_*}$ we define
\[ Z_\rho = 2^{l(\rho)} \prod_{c \in \Gamma_*} z_{\rho(c)}\, \zeta_c^{\,l(\rho(c))}, \]
which is the order of the centralizer of an element in $H\Gamma_n$ of conjugacy type $\rho = (\rho(c))_{c \in \Gamma_*}$ (see [JW]).

For a fixed $c \in \Gamma_*$, we denote by $c_n$ ($n \in 2\mathbb{Z}_+ + 1$) the even split conjugacy class in $H\Gamma_n$ of the type $(\rho^+, \emptyset)$, where the partition-valued function $\rho^+$ takes as its value the one-part partition (n) at $c \in \Gamma_*$ and zero elsewhere. We denote by $\sigma_n(c) \in R^-(\widetilde{H}\Gamma_n)$ the class function of $\widetilde{H}\Gamma_n$ which takes its value $n\zeta_c$ at the conjugacy class $c_n$ ($c \in \Gamma_*$) and zero otherwise. For $\rho = \{m_r(c)\}_{c,r} \in \mathcal{OP}_n(\Gamma_*)$, we define
\[ \sigma_\rho = \prod_{c \in \Gamma_*,\, r \geq 1} \sigma_r(c)^{m_r(c)}, \]
and regard it as the class function on $\widetilde{H}\Gamma_n$ which takes its value $Z_\rho$ at the conjugacy class $D_\rho^+$ and 0 elsewhere. Then it follows that (cf. [Ser, Jo2])
\[ \xi^n = \sum_{\rho \in \mathcal{OP}_n(\Gamma_*)} 2^{l(\rho)}\, Z_\rho^{-1}\, \sigma_\rho. \tag{3} \]
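The character values of the basic spin supermodule admit a simple consistency check in the case of the trivial group, where $Z_\rho = 2^{l(\rho)} z_\rho$. The sketch below rests on two assumptions of ours that are not spelled out in the text: the usual inner product of characters on the double cover, and the count $|D_\rho| = |\widetilde{H}_n|/Z_\rho$. Under these assumptions the squared norm of $\xi^n$ equals $\sum_{\rho} 2^{2l(\rho)}/Z_\rho$ over odd partitions ρ of n, and by (2) it should equal 2, since the one-part partition (n) has odd length.

```python
from collections import Counter
from fractions import Fraction
from math import factorial


def odd_partitions(n, max_part=None):
    """Yield the partitions of n all of whose parts are odd."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, max_part), 0, -1):
        if first % 2 == 1:
            for rest in odd_partitions(n - first, first):
                yield (first,) + rest


def z(lam):
    """z_lambda = prod_i i^{m_i} * m_i!."""
    result = 1
    for i, m in Counter(lam).items():
        result *= i ** m * factorial(m)
    return result


def basic_spin_norm(n):
    """Sum over odd partitions rho of n of (2^l(rho))^2 / Z_rho, with Z_rho = 2^l(rho) z_rho."""
    return sum(
        Fraction(2 ** (2 * len(rho)), 2 ** len(rho) * z(rho))
        for rho in odd_partitions(n)
    )


if __name__ == "__main__":
    for n in range(1, 9):
        print(n, basic_spin_norm(n))
```

The constant value 2 also follows from the generating function $\exp\bigl(\sum_{r\ \mathrm{odd}} 2t^r/r\bigr) = (1+t)/(1-t)$.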
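Finally we define the analog for $\widetilde{H}\Gamma_n$ of Young subgroups of the symmetric groups. Let $\widetilde{H}\Gamma_n \,\tilde{\times}\, \widetilde{H}\Gamma_m$ be the direct product of $\widetilde{H}\Gamma_n$ and $\widetilde{H}\Gamma_m$ with a twisted multiplication
\[ (t, t') \cdot (s, s') = \bigl(t s\, z^{d(t')d(s)},\; t' s'\bigr), \]
where $s, t \in \widetilde{H}\Gamma_n$ and $s', t' \in \widetilde{H}\Gamma_m$. We define the spin product of $\widetilde{H}\Gamma_n$ and $\widetilde{H}\Gamma_m$ by letting
\[ \widetilde{H}\Gamma_n \,\hat{\times}\, \widetilde{H}\Gamma_m = \widetilde{H}\Gamma_n \,\tilde{\times}\, \widetilde{H}\Gamma_m \,/\, \{(1, 1), (z, z)\}, \tag{4} \]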
which carries a canonical $\mathbb{Z}_2$-grading and can be regarded as a subgroup of the supergroup $\widetilde{H}\Gamma_{n+m}$ in a natural way.

For two spin supermodules U and V of $\widetilde{H}\Gamma_n$ and $\widetilde{H}\Gamma_m$ we define a spin $\widetilde{H}\Gamma_n \,\hat{\times}\, \widetilde{H}\Gamma_m$-supermodule on the tensor product $U \otimes V$ by letting
\[ (t, s) \cdot (u \otimes v) = (-1)^{d(s)d(u)}\, (tu \otimes sv). \]
This induces an isomorphism $\phi_{n,m}$ from $R^-(\widetilde{H}\Gamma_n) \otimes R^-(\widetilde{H}\Gamma_m)$ to $R^-(\widetilde{H}\Gamma_n \,\hat{\times}\, \widetilde{H}\Gamma_m)$.

The space $R^-_\Gamma$ carries a (commutative associative) multiplication which is defined by the composition (for all n, m)
\[ R^-(\widetilde{H}\Gamma_n) \otimes R^-(\widetilde{H}\Gamma_m) \stackrel{\phi_{n,m}}{\longrightarrow} R^-(\widetilde{H}\Gamma_n \,\hat{\times}\, \widetilde{H}\Gamma_m) \stackrel{\mathrm{Ind}}{\longrightarrow} R^-(\widetilde{H}\Gamma_{n+m}). \]

3. A Decomposition Theorem in Equivariant K-Theory

In this section we introduce a variant of equivariant K-theory for a finite supergroup. We recall a decomposition theorem in the equivariant K-theory from [BC, Ku, AS, HKR], and present a generalization of it in our new setup.

3.1. The standard version. Given a (non-graded) finite group Γ and a compact Hausdorff Γ-space X, we recall [Seg1] that $K^0_\Gamma(X)$ is the Grothendieck group of Γ-vector bundles over X. One can define $K^1_\Gamma(X)$ in terms of the $K^0$ functor and a certain suspension operation, and one puts $K_\Gamma(X) = K^0_\Gamma(X) \oplus K^1_\Gamma(X)$. In this paper we will only be concerned with $K_\Gamma(X) \otimes \mathbb{C}$, and subsequently we will denote $\underline{K}_\Gamma(X) = K_\Gamma(X) \otimes \mathbb{C}$. We denote by $\dim K^i_\Gamma(X)$ (i = 0, 1) the dimension of $K^i_\Gamma(X) \otimes \mathbb{C}$.
If X is a locally compact, Hausdorff and paracompact Γ-space, take the one-point compactification $X^+$ with the extra point ∞ fixed by Γ. Then we define $K^0_\Gamma(X)$ to be the kernel of the map
\[ K^0_\Gamma(X^+) \longrightarrow K^0_\Gamma(\{\infty\}) \]
induced by the inclusion map $\{\infty\} \hookrightarrow X^+$. This definition is equivalent to the earlier one when X is compact. We also define $K^1_\Gamma(X) = K^1_\Gamma(X^+)$. Note that $K_\Gamma(pt)$ is isomorphic to the Grothendieck ring $R_{\mathbb{Z}}(\Gamma)$ and $\underline{K}_\Gamma(pt)$ is isomorphic to the ring R(Γ) of class functions on Γ.

Let $X^g$ denote the fixed-point set of $g \in \Gamma$, which is preserved under the action of the centralizer $Z_\Gamma(g)$. The following decomposition theorem (cf. [BC, Ku, AS, HKR]) gives a description of each direct summand over conjugacy classes of Γ. We remark that the subspace of invariants $\underline{K}(X^g)^{Z_\Gamma(g)}$ is isomorphic to $\underline{K}(X^g/Z_\Gamma(g))$, and that different choices of g in the same conjugacy class $[g] \in \Gamma_*$ give isomorphic spaces.

Theorem 3.1. There is a natural $\mathbb{Z}_2$-graded isomorphism
\[ \phi : \underline{K}_\Gamma(X) \longrightarrow \bigoplus_{[g] \in \Gamma_*} \underline{K}(X^g)^{Z_\Gamma(g)}. \]
3.2. A super/twisted variant. Now let $\widetilde{G}$ be a finite supergroup which contains a distinguished central element z of order 2, and let θ be the quotient homomorphism $\widetilde{G} \to G = \widetilde{G}/\langle 1, z\rangle$. Let X be a compact (non-graded) G-space. We may regard X as a $\widetilde{G}$-space by letting $\tilde{g} \in \widetilde{G}$ act by $\theta(\tilde{g}) \in G$. We denote by $\mathcal{C}^-_{\widetilde{G}}(X)$ the category whose objects consist of $\widetilde{G}$-equivariant complex vector superbundles (i.e. $\mathbb{Z}_2$-graded bundles) $E = E_0 + E_1$ (often denoted by $E_0|E_1$) over X on which $\widetilde{G}$ acts in a $\mathbb{Z}_2$-graded manner and z acts as −1. We will call such an object a spin $\widetilde{G}$-vector superbundle. Given two objects E, F in the category $\mathcal{C}^-_{\widetilde{G}}(X)$, the space of homomorphisms of $\widetilde{G}$-equivariant vector superbundles between E and F admits a natural $\mathbb{Z}_2$-grading:
\[ \mathrm{Hom}_{\widetilde{G}}(E, F) = \mathrm{Hom}^0_{\widetilde{G}}(E, F) \oplus \mathrm{Hom}^1_{\widetilde{G}}(E, F). \]
By our definition of finite supergroups, $\widetilde{G}$ always contains odd elements. It follows that $\mathrm{rank}\, E_0 = \mathrm{rank}\, E_1$ thanks to the existence of odd automorphisms coming from the odd elements of $\widetilde{G}$.

We denote by $K^{-,0}_{\widetilde{G}}(X)$ the Grothendieck group of the abelian monoid of isomorphism classes$^2$ of the vector superbundles in $\mathcal{C}^-_{\widetilde{G}}(X)$. As in the ordinary case, we can extend the definition of $\mathcal{C}^-_{\widetilde{G}}(X)$ to locally compact spaces, and define $K^{-,1}_{\widetilde{G}}(X)$ to be
\[ K^{-,1}_{\widetilde{G}}(X) = K^{-,0}_{\widetilde{G}}(X \times \mathbb{R}), \]
where $\mathbb{R}$ is the real line. We denote $K^-_{\widetilde{G}}(X) = K^{-,0}_{\widetilde{G}}(X) + K^{-,1}_{\widetilde{G}}(X)$. In this paper we will only be concerned with the free part $K^-_{\widetilde{G}}(X) \otimes \mathbb{C}$, which will be denoted by $\underline{K}^-_{\widetilde{G}}(X)$ subsequently.

The following theorem, generalizing Theorem 3.1, is a variation of Adem-Ruan's construction [AR] for the so-called twisted equivariant K-theory. Recall that the character $\varepsilon_g$ of the centralizer group $Z_G(g)$ was defined in Lemma 2.1 and that $Z_G(g)$ acts on $K(X^g)$.

2 The isomorphisms are $\mathbb{Z}_2$-graded and isomorphisms of degree 1 are allowed.
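Theorem 3.2. Let $\widetilde{G}$ be a finite supergroup which contains a distinguished central element z of order 2, and let θ be the quotient homomorphism $\widetilde{G} \to G = \widetilde{G}/\langle 1, z\rangle$. Given a locally compact Hausdorff G-space X and regarding it as a $\widetilde{G}$-space, we have a natural $\mathbb{Z}_2$-graded isomorphism
\[ \phi : \underline{K}^-_{\widetilde{G}}(X) \stackrel{\cong}{\longrightarrow} \bigoplus_{[g]} \bigl(\underline{K}(X^g) \otimes \varepsilon_g\bigr)^{Z_G(g)}, \tag{5} \]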
where the summation runs over the even conjugacy classes in G.

Let us indicate briefly how the map φ is defined (also compare [AR]). A $\widetilde{G}$-equivariant vector superbundle $E = E_0|E_1$, when restricted to $X^g$, becomes a spin $\widetilde{\langle g\rangle}$-vector superbundle, where $\widetilde{\langle g\rangle}$ denotes the subgroup of $\widetilde{G}$ covering the cyclic subgroup $\langle g\rangle$ of G generated by g. Using Proposition 4.1, we can obtain the following isomorphisms of $Z_G(g)$-modules:
\[ \underline{K}^-_{\widetilde{\langle g\rangle}}(X^g) \cong \underline{K}(X^g) \otimes R^-(\widetilde{\langle g\rangle}) \cong \underline{K}(X^g) \otimes R(\langle g\rangle) \otimes \varepsilon_g. \]
In this way we obtain a map $\underline{K}^-_{\widetilde{G}}(X) \to \underline{K}(X^g) \otimes R(\langle g\rangle) \otimes \varepsilon_g$. Composing this map with the character evaluation at g gives us a map
\[ \underline{K}^-_{\widetilde{G}}(X) \longrightarrow \underline{K}(X^g) \otimes \varepsilon_g, \tag{6} \]
whose image is indeed $Z_G(g)$-invariant.

Now we claim that this map is zero when g is odd (thanks to the $\mathbb{Z}_2$-grading!) and thus the summation above does not involve the odd conjugacy classes. Let $g \in G$ be an odd element. Take an eigenvector $v_0 + v_1$ of g with eigenvalue µ, lying in the fiber of the ungraded eigen-subbundle $E^\mu \subset E_0|E_1$, where $v_i \in E_i$ (i = 0, 1). Since g is odd, we see that $g.v_0 = \mu v_1$ and $g.v_1 = \mu v_0$. It follows that $v_0 - v_1$ is an eigenvector of g with eigenvalue −µ, i.e. $v_0 - v_1 \in E^{-\mu}$. Denote by F the automorphism of E which is the identity map when restricted to $E_0$ and the negative of the identity map when restricted to $E_1$. Clearly F sends $E^\mu$ to $E^{-\mu}$ and vice versa. Thus the map (6) becomes zero since $\mu[E^\mu] + (-\mu)[E^{-\mu}] = 0$.

Putting (6) together for all even conjugacy classes, we obtain the map φ. The rest of the proof of the above theorem is the same as in [AR], which in turn follows closely the classical case (cf. [BC, Ku, AS]).

Below we will single out a certain class of G-spaces X with a favorable property.

Definition 3.1. Assume we are in the setup of Theorem 3.2. We say the G-space X satisfies a strong vanishing property if for every even non-split conjugacy class [g] in G, there exists some element b in $Z_G(g)$ such that the character $\varepsilon_g$ of $Z_G(g)$ takes a non-trivial value at b (which has to be −1) and the element b fixes $X^g$ pointwise.
In view of Lemma 2.1, if the G-space X satisfies the strong vanishing property, the isomorphism (5) will simplify to the following isomorphism:
\[ \phi : \underline{K}^-_{\widetilde{G}}(X) \stackrel{\cong}{\longrightarrow} \bigoplus_{[g]} \underline{K}(X^g)^{Z_G(g)}, \]
where the summation runs over all even split conjugacy classes in G. When X is a point the isomorphism φ becomes the map from a spin supermodule of $\widetilde{G}$ to its character. As is known [Jo1, Jo2], the character of a spin supermodule vanishes on odd conjugacy classes as well as on even non-split conjugacy classes. In our terminology, the one-point space for any $\widetilde{G}$ automatically satisfies the strong vanishing property. We shall see that the strong vanishing property holds for other non-trivial examples.
Remark 3.1. Theorem 3.2 contains Theorem 3.1 as a special case. Indeed, let $\widetilde{G} = \widetilde{H}\Gamma_1 = \Pi_1 \times \Gamma$ for some finite group Γ and let X be a Γ-space. We have an isomorphism
\[ K_\Gamma(X) \cong K^-_{\widetilde{H}\Gamma_1}(X), \qquad V \mapsto V|V. \]
On the other hand, we note that $\Pi_1$ is isomorphic to $\mathbb{Z}_4 = \{1, a, z = a^2, a^3\}$ with the $\mathbb{Z}_2$-grading given by letting the generator a be of degree 1. The quotient of $\widetilde{G}$ by {1, z} is $G = \Gamma \times \mathbb{Z}_2$. The even conjugacy classes in G are given by the conjugacy classes in Γ × {1} = Γ. Therefore the right-hand side in Theorem 3.2 reduces to the right-hand side of Theorem 3.1. It is possible to further generalize Theorem 3.2 along the line of [AR].

4. The Group $\widetilde{H}_n$ and K-Theory Operations

In this section, we construct various K-theory operations based on the finite supergroup $\widetilde{H}_n$. This is an analog of Atiyah's construction [At] of K-theory operations by using the symmetric groups and implicitly Schur duality. The role of Schur duality is replaced here by Sergeev's generalization [Ser] of the Schur duality involving $\widetilde{H}_n$.

4.1. The group $\widetilde{H}_n$ and a ring of symmetric functions. Let V be a complex vector space of dimension m. We denote by q(V) the superalgebra of linear transformations on the superspace $V|V = V + V$ which preserve an odd automorphism $P : V|V \to V|V$ such that $P^2 = -1$. For example, if we take $V = \mathbb{C}^m$, and P to be given by the 2m × 2m matrix
\[ \begin{pmatrix} 0 & I \\ -I & 0 \end{pmatrix}, \]
then q(V) is the Lie superalgebra which is obtained from the associative superalgebra Q(m) (see Sect. 2) by taking the supercommutators.

Let us now consider the natural action of q(V) on V|V. We may form the n-fold tensor product $(V|V)^{\otimes n}$, on which q(V) acts naturally. In addition we have an action of the finite supergroup $\widetilde{H}_n$: the symmetric group $S_n$ acts on $(V|V)^{\otimes n}$ by permutations with appropriate signs; $a_i$ acts on $(V|V)^{\otimes n}$ by means of exchanging the parity of the ith copy of V|V via the odd automorphism P of V|V. More explicitly, $a_i$ transforms the vector $v_1 \otimes \cdots \otimes v_{i-1} \otimes v_i \otimes \cdots \otimes v_n$ in $(V|V)^{\otimes n}$ into $(-1)^{p(v_1) + \cdots + p(v_{i-1})}\, v_1 \otimes \cdots \otimes v_{i-1} \otimes P(v_i) \otimes \cdots \otimes v_n$. According to Sergeev [Ser], the actions of q(m) and $\widetilde{H}_n$ (super)commute with each other. Furthermore, one has
\[ (V|V)^{\otimes n} \cong \bigoplus_{\lambda} 2^{-\delta(l(\lambda))}\, U_m^\lambda \otimes T^\lambda, \]
where $T^\lambda$ is the irreducible $\widetilde{H}_n$-supermodule associated to a strict partition λ, and $U_m^\lambda = \mathrm{Hom}_{\widetilde{H}_n}(T^\lambda, (V|V)^{\otimes n})$ is an irreducible q(m)-module. The expression $2^{-\delta(l(\lambda))}\, U_m^\lambda \otimes T^\lambda$ above has the following meaning. Suppose that A and B are two supergroups or two superalgebras and suppose that $V_A$ and $V_B$ are irreducible supermodules over A and B of type Q, namely, $\mathrm{Hom}_A(V_A, V_A)$ and $\mathrm{Hom}_B(V_B, V_B)$ are both isomorphic to the Clifford superalgebra in one odd variable. It is known (cf. e.g. [Jo1]) that $V_A \otimes V_B$ as a module over $A \otimes B$ is not irreducible, but decomposes into a direct sum of two
isomorphic copies (via an odd isomorphism) of the same irreducible supermodule. In our particular setting when l(λ) is odd both $T^\lambda$ and $U_m^\lambda$ are such modules (cf. [Ser]). So in this case we mean to take one copy inside their tensor product.

We introduce (cf. [M]) the symmetric functions $q_n$ in the variables $x_1, x_2, \ldots$ by the formula
\[ \sum_{n \geq 0} q_n t^n = \prod_i \frac{1 + x_i t}{1 - x_i t}. \]
Denote by Ω the subring of symmetric functions generated by $q_1, q_2, q_3, \ldots$, and by $\Omega^n$ the subspace spanned by symmetric functions in Ω of degree n. Put $\Omega_{\mathbb{C}} = \Omega \otimes_{\mathbb{Z}} \mathbb{C}$ and $\Omega^n_{\mathbb{C}} = \Omega^n \otimes_{\mathbb{Z}} \mathbb{C}$. Recall [M] that a linear basis of $\Omega_{\mathbb{C}}$ is given by a distinguished class of symmetric functions $Q_\lambda$, called the Schur Q-functions, parameterized by strict partitions λ. Furthermore $\Omega_{\mathbb{C}} = \mathbb{C}[q_1, q_3, q_5, \ldots]$, where the $q_r$ (r odd) are algebraically independent. We take the convention that when the set of variables is finite, say $x = (x_1, x_2, \ldots, x_m)$, we set $Q_\lambda(x) = Q_\lambda(x_1, x_2, \ldots, x_m, 0, 0, \ldots)$. According to Sergeev [Ser], the trace of the diagonal matrix $D = \mathrm{diag}(x_1, \ldots, x_m; x_1, \ldots, x_m)$ in q(m) acting on $U_m^\lambda$ is
\[ \mathrm{Tr}\, D|_{U_m^\lambda} = 2^{\frac{\delta(l(\lambda)) - l(\lambda)}{2}}\, Q_\lambda(x). \tag{7} \]
Given a spin $\widetilde{H}_n$-supermodule W, we define W(V) to be the space of $\widetilde{H}_n$-invariants
\[ \bigl(W \otimes (V|V)^{\otimes n}\bigr)^{\widetilde{H}_n}. \]
It is easy to check that the correspondence $V \mapsto W(V)$ is functorial in V. In particular, if we take a diagonalizable linear transformation $l : V \to V$ with eigenvalues $x_1, \ldots, x_m$, then the eigenvalues of the induced map $W(l) = \mathrm{Id}_W \otimes (l|l)^{\otimes n} : W(V) \to W(V)$ are monomials in $x_1, \ldots, x_m$ of degree n. Consequently the trace of W(l) is a symmetric polynomial in $x_1, \ldots, x_m$ of degree n with integer coefficients. One can argue that this symmetric polynomial (for m ≥ n) is the restriction of a unique symmetric function in infinitely many variables. By definition we have $(W_1 \oplus W_2)(V) = W_1(V) \oplus W_2(V)$. It follows that the mapping $W \mapsto \mathrm{Tr}\, W(l)$ induces a map, denoted by ch, from $R^-(\widetilde{H}_n)$ to the space of symmetric functions of degree n. Note that $W(V) \cong \mathrm{Hom}_{\widetilde{H}_n}(W, (V|V)^{\otimes n})$ since all the character values of W are real, i.e. W is self-dual. It follows from (7) that if we take W to be $T^\lambda$ then ch sends the class function associated with $T^\lambda$ to $2^{\frac{\delta(l(\lambda)) - l(\lambda)}{2}} Q_\lambda(x)$. In this way we have defined a map ch from $R^- = \oplus_n R^-(\widetilde{H}_n)$ to the ring $\Omega_{\mathbb{C}}$ of symmetric functions.
Remark 4.1. It is possible that one can develop this approach coherently to study the map $\mathrm{ch} : R^- \to \Omega_{\mathbb{C}}$ without referring to the Lie superalgebra q(V) and thus essentially independently of the work of Sergeev [Ser], as sketched below. This is a super analog of an approach adopted in Knutson [Kn] in the setup of symmetric groups. For example, we can start by arguing that the space W(V) associated to the basic spin supermodule $L_n$ (with character $\xi^n$) is the nth supersymmetric algebra $Q^n(V|V) = \oplus_{i=0}^n S^i V \otimes \Lambda^{n-i} V$, and thus $\mathrm{ch}(\xi^n) = \sum_{i=0}^n h_i e_{n-i} = q_n$ (cf. Ex. 6(c), p. 261, [M]), which is exactly the Schur Q-function $Q_{(n)}$. Here $h_r$ and $e_r$, which stand for the rth complete and elementary symmetric functions respectively, are the traces of $\mathrm{diag}(x_1, \ldots, x_m)$ on $S^r V$ and $\Lambda^r V$. One can easily show by using the Frobenius reciprocity that $\mathrm{ch} : R^- \to \Omega_{\mathbb{C}}$ is an algebra homomorphism. The image of ch contains $\Omega^n_{\mathbb{C}}$ for every n, as we have just seen that it contains $q_n$ for all n. By comparing the dimensions, the characteristic map $\mathrm{ch} : R^- \to \Omega_{\mathbb{C}}$ is indeed an isomorphism.
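The identity $q_n = \sum_{i=0}^n h_i e_{n-i}$ used in the remark can be verified numerically in finitely many variables. The following sketch evaluates both sides at exact rational points; the helper names and the sample values are ours and serve only as an illustration.

```python
from fractions import Fraction
from itertools import combinations, combinations_with_replacement
from math import prod


def e(k, xs):
    """Elementary symmetric polynomial e_k evaluated at xs."""
    return sum(prod(c) for c in combinations(xs, k))


def h(k, xs):
    """Complete symmetric polynomial h_k evaluated at xs."""
    return sum(prod(c) for c in combinations_with_replacement(xs, k))


def q_series(xs, N):
    """Coefficients q_0..q_N of prod_i (1 + x_i t)/(1 - x_i t), via truncated series."""
    series = [Fraction(1)] + [Fraction(0)] * N
    for x in xs:
        # multiply by (1 + x t), reading only the old coefficients
        series = [series[n] + (x * series[n - 1] if n else 0) for n in range(N + 1)]
        # divide by (1 - x t): new[n] = old[n] + x * new[n-1]
        for n in range(1, N + 1):
            series[n] += x * series[n - 1]
    return series


if __name__ == "__main__":
    xs = [Fraction(1, 2), Fraction(2), Fraction(-3)]
    q = q_series(xs, 5)
    for n in range(6):
        print(n, q[n], sum(h(i, xs) * e(n - i, xs) for i in range(n + 1)))
```

The two columns printed for each n coincide, reflecting that the generating functions of the $h_i$ and the $e_i$ multiply to the generating function of the $q_n$.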
There is another way to define the characteristic map as follows (cf. [Jo2, Ser]). Denote by $p_n$ the nth power sum symmetric function, and define $p_\mu = p_{\mu_1} p_{\mu_2} \cdots$ for a partition $\mu = (\mu_1, \mu_2, \ldots)$. Given a character $\chi \in R^-(\widetilde{H}_n)$, we define the characteristic map $\mathrm{ch}' : R^- \to \Omega_{\mathbb{C}}$ by
\[ \mathrm{ch}'(\chi) = \sum_{\mu \in \mathcal{OP}_n} z_\mu^{-1}\, \chi_\mu\, p_\mu, \]
where $z_\mu = \prod_{i \geq 1} i^{m_i} m_i!$ for $\mu = (1^{m_1} 3^{m_3} \ldots)$ and $\chi_\mu$ is the character value at the even split conjugacy class $D^+_{\mu,\emptyset}$. It is known [Jo2] that ch′ is an algebra isomorphism and it sends $\xi^n$ to $q_n$ for all n. Thus ch′ coincides with the characteristic map ch we defined above.

4.2. K-theory operations. Let $\widetilde{G}$ be a finite supergroup with $G = \widetilde{G}/\langle 1, z\rangle$, and let X be a trivial G-space (and thus $\widetilde{G}$-space). Given a $\widetilde{G}$-supermodule V in $R^-_{\mathbb{Z}}(\widetilde{G})$ and a complex vector bundle E on X, $E \otimes V$ can be given a natural $\widetilde{G}$-equivariant vector superbundle structure over X by letting $\widetilde{G}$ act on only the factor V. Obviously $E \otimes V$ lies in the category $\mathcal{C}^-_{\widetilde{G}}(X)$. In this way we obtain a map $K(X) \otimes R^-_{\mathbb{Z}}(\widetilde{G}) \to K^-_{\widetilde{G}}(X)$. Then we have the following (compare [Seg1]).

Proposition 4.1. Under the above setup, there is a canonical isomorphism
\[ K^-_{\widetilde{G}}(X) \stackrel{\cong}{\longrightarrow} K(X) \otimes R^-_{\mathbb{Z}}(\widetilde{G}). \]

One needs some extra care to define the inverse of the isomorphism in the above proposition. Given a $\widetilde{G}$-supermodule V in $R^-_{\mathbb{Z}}(\widetilde{G})$ and a complex vector bundle E on X, consider $\mathrm{Hom}_{\widetilde{G}}(\mathbf{V}, E)$, where $\mathbf{V}$ is the trivial $\widetilde{G}$-superbundle over X associated to V. $\mathrm{Hom}_{\widetilde{G}}(\mathbf{V}, E)$ is isomorphic to E if V is of type M, but isomorphic to $E \otimes C_1 = E|E$ if V is of type Q, where $C_1$ is the Clifford algebra in one variable. It is perhaps more natural to replace K(X) in the proposition above by an isomorphic space $K^-_{\Pi_1}(X)$, cf. Remark 3.1.

Below we will construct various K-theory operations based on the construction in the previous subsection. It is a super analog of an approach due to Atiyah [At], who used the symmetric group representations.

Let E be a vector bundle over X, and consider the nth tensor power $(E|E)^{\otimes n}$ of the vector superbundle E|E. The odd operator P ($P^2 = -1$) acts on each factor E|E fiberwise and induces an action of the finite supergroup $\Pi_n$ on $(E|E)^{\otimes n}$. The symmetric group $S_n$ also acts on $(E|E)^{\otimes n}$ in a natural way. The joint action of $\Pi_n$ and $S_n$ then gives rise to an action of the finite supergroup $\widetilde{H}_n$ on $(E|E)^{\otimes n}$. We have the following decomposition:
\[ (E|E)^{\otimes n} \cong \bigoplus_{\pi} 2^{-\delta(l(\pi))}\, T^\pi \otimes \pi(E), \]
H where π is a strict partition of n, and π(E) = T π ⊗ (E|E)⊗n n is a vector bundle on X. Clearly one can extend the definition of the vector bundle π(E) associated to n so that it is additive on π . In this way we obtain a ring any spin supermodule π of H n ) and Op(K) is the ring homomorphism j − : R − → Op(K), where R − = n R − (H of K-theory operations on K(X) (cf. [At]).
We note that if $\pi$ is the one-part partition (n) then $\pi(E)$ is the nth supersymmetric power $Q^n$:
$$Q^n(E|E) = \bigoplus_{i=0}^{n} S^iE\otimes\Lambda^{n-i}E. \tag{8}$$
For odd n, we denote by $\psi^n$ the operation corresponding to the class function $\sigma_n\in R^-(\widetilde H_n)$ which takes its value n in the even split conjugacy class of type ((n), ∅) and zero elsewhere. Denote by $Q_t(E) = \sum_{n=0}^{\infty} Q^n(E|E)\,t^n$, where t is a formal variable.

Proposition 4.2. The operation $\psi^r$ (r odd) coincides with the usual rth Adams operation. In particular it is additive. Furthermore, we have
$$Q_t(E) = \exp\Big(\sum_{r>0,\ \mathrm{odd}} 2\,\psi^r(E)\,t^r/r\Big). \tag{9}$$
Proof. We first prove (9). Since the operations Qn ’s and ψ r ’s are obtained from the ring homomorphism j − : R − → Op(K), we only need to show that the corresponding identity holds in R − , or alternatively in C . But this is a classical identity [M]
$$\sum_{n\ge 0} q_n t^n = \exp\Big(\sum_{r>0,\ \mathrm{odd}} 2\,p_r\,t^r/r\Big).$$
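Now we denote by $\sigma_t = \sum_n \sigma^n t^n$ and $\lambda_t = \sum_n \lambda^n t^n$ the generating functions of the symmetric and exterior power operations. Let us denote by $\tilde\psi^r$ the rth Adams operations for the time being. It is classical that
$$\sigma_t(E) = \exp\Big(\sum_{r>0}\tilde\psi^r(E)\,t^r/r\Big),\qquad \lambda_t(E) = \exp\Big(\sum_{r>0}(-1)^{r-1}\tilde\psi^r(E)\,t^r/r\Big).$$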
We have from (8) that $Q_t(E) = \lambda_t(E)\,\sigma_t(E)$ and therefore
$$Q_t(E) = \exp\Big(\sum_{r>0,\ \mathrm{odd}} 2\,\tilde\psi^r(E)\,t^r/r\Big).$$
Comparing with (9) which we have already established, we have $\psi^r = \tilde\psi^r$. □
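The identity behind (9) can be sanity-checked in the model case of symmetric functions evaluated at a few sample points. The following is a minimal sketch and not part of the original argument: it assumes that $Q^n$ corresponds to $q_n = \sum_i h_i e_{n-i}$ (mirroring (8), with $h_i$ and $e_i$ the complete and elementary symmetric functions), and all function names are hypothetical.

```python
from fractions import Fraction

def elementary(xs, order):
    """e_0..e_order at the points xs: coefficients of prod (1 + x t)."""
    e = [Fraction(1)] + [Fraction(0)] * order
    for x in xs:
        for k in range(order, 0, -1):   # descending: e[k-1] is still the old value
            e[k] += x * e[k - 1]
    return e

def complete(xs, order):
    """h_0..h_order at the points xs: coefficients of prod 1 / (1 - x t)."""
    h = [Fraction(1)] + [Fraction(0)] * order
    for x in xs:
        for k in range(1, order + 1):   # ascending: h[k-1] is already updated
            h[k] += x * h[k - 1]
    return h

def q_from_sym_ext(xs, order):
    """q_n = sum_i h_i e_{n-i}, the symmetric-function analogue of (8)."""
    e, h = elementary(xs, order), complete(xs, order)
    return [sum(h[i] * e[n - i] for i in range(n + 1)) for n in range(order + 1)]

def q_from_powers(xs, order):
    """Coefficients of exp(sum over odd r of 2 p_r t^r / r), as in (9)."""
    p = [sum(Fraction(x) ** r for x in xs) for r in range(order + 1)]
    q = [Fraction(1)] + [Fraction(0)] * order
    for n in range(1, order + 1):
        # If Q = exp(A) then n q_n = sum_k k a_k q_{n-k}; here k a_k = 2 p_k for odd k.
        q[n] = sum(2 * p[k] * q[n - k] for k in range(1, n + 1, 2)) / n
    return q

xs = [Fraction(1, 2), Fraction(2), Fraction(-3)]   # arbitrary sample points
lhs = q_from_sym_ext(xs, 8)
rhs = q_from_powers(xs, 8)
print(lhs == rhs)
```

Exact rational arithmetic is used so that the comparison is an equality of coefficients rather than a floating-point approximation.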
It follows that $Q_t(E+F) = Q_t(E)\,Q_t(F)$ and $Q_t(E-F) = Q_t(E)\,Q_{-t}(F)$, where $E, F\in K(X)$, since the Adams operations are additive. For example, the second equation reads componentwise as follows:
$$Q^n(E-F\,|\,E-F) = \sum_{i=0}^{n}(-1)^i\, Ind^{\widetilde H_n}_{\widetilde H_{n-i}\times\widetilde H_i}\big(Q^{n-i}(E|E)\otimes Q^i(F|F)\big).$$
Remark 4.2. The ring of symmetric functions is a basic model for (free) λ rings, and indeed λ rings can be defined by axiomatizing various properties of natural operations on it, where the Adams operations play a crucial role (cf. [Kn]). We can define similarly a notion of Q-λ ring with Adams operations of odd degrees only, using C as a basic model. Then what we have just shown is that K(X) admits a Q-λ ring structure. We will see that the structure of a Q-λ ring, instead of a λ ring, shows up naturally in a fairly non-trivial setup in Sect. 5.

5. Algebraic Structures on $\mathcal F^-_\Gamma(X)$

In this section, we shall study in detail the K-groups $K^-_{\widetilde H\Gamma_n}(X^n)$ of generalized symmetric products for all n simultaneously.
5.1. Generalized symmetric products. Our main examples in this paper are as follows. Let $\Gamma$ be a finite group and let X be a $\Gamma$-space. The nth Cartesian product $X^n$ is acted on by the finite supergroup $\widetilde H\Gamma_n = \Pi_n\rtimes\Gamma_n$ in a canonical way: $\Pi_n$ acts trivially on $X^n$; $\Gamma^n$ acts on $X^n$ factorwise while $S_n$ acts by permutations, and this gives rise to a natural action of the wreath product $\Gamma_n = \Gamma^n\rtimes S_n$ by letting
$$a.(x_1,\ldots,x_n) = \big(g_1x_{s^{-1}(1)},\ldots,g_nx_{s^{-1}(n)}\big),$$
where $a = ((g_1,\ldots,g_n), s)\in\Gamma_n$, and $x_1,\ldots,x_n\in X$. Note that orbifolds $X^n/S_n$ are often called symmetric products. We may refer to $X^n/\Gamma_n$, or $X^n$ with $\Gamma_n$-action, or rather $X^n$ with $\widetilde H\Gamma_n$-action as generalized symmetric products.

Our earlier general construction when applied to the generalized symmetric products gives us the category $C^-_{\widetilde H\Gamma_n}(X^n)$ and its associated K-group $K^-_{\widetilde H\Gamma_n}(X^n)$. It turns out that this category affords an equivalent description below which has a more transparent geometric meaning.

The wreath product $\Gamma_n$ acts on the vector space $\mathbb C^n$ naturally by letting $\Gamma^n$ act trivially and $S_n$ act as the permutation representation. This action preserves the standard quadratic form on $\mathbb C^n$. We denote by $\mathbf n$ such a $\Gamma_n$-vector bundle $X^n\times\mathbb C^n$ over $X^n$. We denote by $\mathcal C_n$ the complex Clifford algebra associated to $\mathbb C^n$ and the standard quadratic form on it. The action of $\Gamma_n$ on $\mathbb C^n$ induces a natural action on $\mathcal C_n$. We denote by C(n) the associated $\Gamma_n$-vector (super)bundle $X^n\times\mathcal C_n$ on $X^n$, which is the Clifford module on $X^n$ associated to the vector bundle $\mathbf n$.

We introduce the following category $C^n_{\Gamma_n}(X^n)$: the objects consist of complex vector superbundles $E = E_0 + E_1$ on $X^n$ equipped with compatible actions of $\Gamma_n$ and the Clifford algebra $\mathcal C_n$ associated to $\mathbb C^n$ with the standard quadratic form. That is, E is a $\mathbb Z_2$-graded C(n)-module and a $\Gamma_n$-equivariant vector bundle over $X^n$ such that
$$g.(v.\xi) = (g.v).(g.\xi),\qquad g\in\Gamma_n,\ v\in\mathbf n,\ \xi\in E. \tag{10}$$
Given two superbundles E, F in $C^n_{\Gamma_n}(X^n)$, the space of $(\Gamma_n, C(n))$-equivariant homomorphisms of vector superbundles admits a natural $\mathbb Z_2$-gradation. Since the twisted group algebra of $\Pi_n$ is isomorphic to the Clifford algebra $\mathcal C_n$, the category $C^n_{\Gamma_n}(X^n)$ is then obviously equivalent to the category $C^-_{\widetilde H\Gamma_n}(X^n)$, and so are the corresponding K-groups.

When X is a point, $K^{-,0}_{\widetilde H\Gamma_n}(pt) = K^-_{\widetilde H\Gamma_n}(pt)$ reduces to the Grothendieck group $R^-_{\mathbb Z}(\widetilde H\Gamma_n)$ of the spin supermodules of $\widetilde H\Gamma_n$.
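The wreath product action defined at the start of this subsection can be illustrated on a small finite example. The sketch below is only an illustration under stated assumptions: Γ is taken to be Z/2 acting on X = Z/2 by translation (a hypothetical choice), permutations are stored 0-based, and the multiplication rule (g, s)(h, t) = (g · s(h), st) is the standard wreath product convention, which is assumed here rather than quoted from the text. The check confirms that the formula for a.(x_1, ..., x_n) is compatible with this multiplication.

```python
from itertools import permutations, product

M = 2   # hypothetical choice: Gamma = Z/M acting on X = Z/M by translation
N = 3   # number of factors in X^n

def inverse(s):
    """Inverse of a permutation stored as a tuple with s[i] the image of i."""
    inv = [0] * len(s)
    for i, si in enumerate(s):
        inv[si] = i
    return tuple(inv)

def act(a, x):
    """a.(x_1..x_n) = (g_1 x_{s^-1(1)}, ..., g_n x_{s^-1(n)}), with 0-based indices."""
    g, s = a
    s_inv = inverse(s)
    return tuple((g[i] + x[s_inv[i]]) % M for i in range(len(x)))

def multiply(a, b):
    """Assumed wreath product rule: (g, s)(h, t) = (g . s(h), s t)."""
    (g, s), (h, t) = a, b
    s_inv = inverse(s)
    gh = tuple((g[i] + h[s_inv[i]]) % M for i in range(len(g)))
    st = tuple(s[t[i]] for i in range(len(s)))
    return gh, st

group = [(g, s) for g in product(range(M), repeat=N)
         for s in permutations(range(N))]
points = list(product(range(M), repeat=N))

# Group action axiom: a.(b.x) == (ab).x for all a, b in Gamma_n and x in X^n.
is_action = all(act(a, act(b, x)) == act(multiply(a, b), x)
                for a in group for b in group for x in points)
print(len(group), is_action)
```

The exhaustive loop is feasible only because the example is tiny; it is meant as a check of conventions, not as a way to compute with larger wreath products.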
Remark 5.1. We may replace the rank n vector bundle $\mathbf n$ over $X^n$ above by the nth direct sum of a non-trivial line bundle endowed with a quadratic form, and modify the construction of the category $C^n_{\Gamma_n}(X^n)$ accordingly. We conjecture that the resulting K-group is isomorphic to $K^-_{\widetilde H\Gamma_n}(X^n)$.
Remark 5.2. We may reverse the above consideration in a more general setup as below. Take a G-space Y and a G-superbundle C(n) over Y whose fiber is the Clifford algebra in n variables. Assume that there exist n sections $a_1,\ldots,a_n$ of the superbundle C(n) which fiberwise generate the Clifford algebra and that G permutes $a_1,\ldots,a_n$. It is of interest in algebraic topology (cf. Karoubi [Ka]) to study the category $C^n_G(Y)$ of G-equivariant vector superbundles over Y which are compatible with the Clifford module structure in the sense of (10), and to study its associated K-group. Then we may form the finite supergroup $\Pi_n\rtimes G$ and reinterpret the category $C^n_G(Y)$ as the category $C^-_{\Pi_n\rtimes G}(Y)$, and then apply Theorem 3.2 to the study of the associated K-group. In particular if we take $Y = X^n$ and $G = \Gamma_n$ for a $\Gamma$-space X, we recover our main examples of generalized symmetric products.

5.2. Hopf algebra $\mathcal F^-_\Gamma(X)$. Let $\widetilde H$ be a $\mathbb Z_2$-graded subgroup of a finite supergroup $\widetilde G$ with the same distinguished even element z, and let X be a G-space which is regarded as a $\widetilde G$-space, where $G = \widetilde G/\langle 1, z\rangle$. We can define the restriction map $Res^{\widetilde G}_{\widetilde H}: K^-_{\widetilde G}(X)\longrightarrow K^-_{\widetilde H}(X)$ and the induction map $Ind^{\widetilde G}_{\widetilde H}: K^-_{\widetilde H}(X)\longrightarrow K^-_{\widetilde G}(X)$ in the same way as in the usual equivariant K-theory. When it is clear from the text, we will often abbreviate $Res^{\widetilde G}_{\widetilde H}$ as $Res_{\widetilde H}$ or Res. Similar remarks apply to the induction map.

Now we introduce the direct sum of equivariant K-groups
$$\mathcal F^-_\Gamma(X) = \bigoplus_{n\ge 0}\underline K^-_{\widetilde H\Gamma_n}(X^n),\qquad \mathcal F^-_\Gamma(X,t) = \bigoplus_{n\ge 0} t^n\,\underline K^-_{\widetilde H\Gamma_n}(X^n),$$
where $\widetilde H\Gamma_0$ is the one-element group by convention, and t is a formal variable counting the graded structure of $\mathcal F^-_\Gamma(X)$. We also set
$$\dim_t\mathcal F^-_\Gamma(X) = \sum_{n\ge 0} t^n\dim\underline K^-_{\widetilde H\Gamma_n}(X^n).$$
We define a multiplication · on the space $\mathcal F^-_\Gamma(X)$ by a composition of the induction map and the Künneth isomorphism k:
$$\underline K^-_{\widetilde H\Gamma_n}(X^n)\otimes\underline K^-_{\widetilde H\Gamma_m}(X^m)\ \xrightarrow{\ k\ }\ \underline K^-_{\widetilde H\Gamma_n\times\widetilde H\Gamma_m}(X^{n+m})\ \xrightarrow{\ Ind\ }\ \underline K^-_{\widetilde H\Gamma_{n+m}}(X^{n+m}). \tag{11}$$
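We denote by 1 the unit in $\underline K^-_{\widetilde H\Gamma_0}(X^0)$, which can be identified with $\mathbb C$.

On the other hand we can define a comultiplication $\Delta$ on $\mathcal F^-_\Gamma(X)$ to be a composition of the inverse of the Künneth isomorphism and the restriction from $\widetilde H\Gamma_n$ to $\widetilde H\Gamma_m\times\widetilde H\Gamma_{n-m}$:
$$\underline K^-_{\widetilde H\Gamma_n}(X^n)\ \longrightarrow\ \bigoplus_{m=0}^{n}\underline K^-_{\widetilde H\Gamma_m\times\widetilde H\Gamma_{n-m}}(X^n)\ \xrightarrow{\ k^{-1}\ }\ \bigoplus_{m=0}^{n}\underline K^-_{\widetilde H\Gamma_m}(X^m)\otimes\underline K^-_{\widetilde H\Gamma_{n-m}}(X^{n-m}). \tag{12}$$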
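We define the counit $\epsilon: \mathcal F^-_\Gamma(X)\longrightarrow\mathbb C$ by sending $\underline K^-_{\widetilde H\Gamma_n}(X^n)$ (n > 0) to 0 and $1\in\underline K^-_{\widetilde H\Gamma_0}(X^0) = \mathbb C$ to 1.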
Theorem 5.1. With various operations defined as above, $\mathcal F^-_\Gamma(X)$ is a graded Hopf algebra.

The proof is the same as the proof of the Hopf algebra structure on a direct sum of equivariant K-groups $\bigoplus_n\underline K_{\Gamma_n}(X^n)$ [W1], where a straightforward generalization to the equivariant K-groups of Mackey's theorem plays a key role. One easily checks that the super version of Mackey's theorem can be carried over to our K-group setup. In the case when X is a point and thus $\underline K^-_{\widetilde H\Gamma_n}(X^n)\cong R^-(\widetilde H\Gamma_n)$, the Hopf algebra structure above is also treated in [JW].

5.3. Description of the algebra $\mathcal F^-_\Gamma(X)$. Take an even split element $\tilde a = (g,\sigma)$ in $\widetilde H\Gamma_n$ of type $\rho = (\rho^+,\emptyset)$, where $g = (g_1,\ldots,g_n)$, $g_i = (\alpha_i,\varepsilon_i)\in\Gamma\times\mathbb Z_2$, and $\rho^+ = (m^+_r(c))_{r,c}\in\mathcal{OP}_n(\Gamma_*)$. We define $a = ((\alpha_1,\ldots,\alpha_n),\sigma)\in\Gamma_n = \Gamma^n\rtimes S_n$. Since the subgroup $\Pi_n\le\widetilde H\Gamma_n$ acts on $X^n$ trivially, the orbit space $(X^n)^{\tilde a}/Z_{\widetilde H\Gamma_n}(\tilde a)$ is identified with $(X^n)^a/Z_{\Gamma_n}(a)$, which has been calculated earlier in Lemmas 4 and 5 of [W1]. We make a convention here to denote the centralizer $Z_G(g)$ (resp. $X^g$, $X^g/Z_G(g)$) by $Z_G(c)$ (resp. $X^c$, $X^c/Z_G(c)$) by abuse of notation when the choice of a representative g in a conjugacy class c of a group G is irrelevant. For a fixed $c\in\Gamma_*$, recall that $c_n$ ($n\in 2\mathbb Z_++1$) is the even split conjugacy class in $\widetilde H\Gamma_n$ of the type $(\rho^+,\emptyset)$, where the partition-valued function $\rho^+$ takes the value of the one-part partition (n) at $c\in\Gamma_*$ and zero elsewhere.

Lemma 5.1. Let $a\in\widetilde H\Gamma_n$ be an even split element of type $\rho = (\rho^+,\emptyset)$, where $\rho^+ = (m^+_r(c))_{r\ge 1,\,c\in\Gamma_*}\in\mathcal{OP}_n(\Gamma_*)$. Then the orbit space $(X^n)^a/Z_{\widetilde H\Gamma_n}(a)$ can be naturally identified with
$$\prod_{r,c} S^{m^+_r(c)}\big(X^c/Z_\Gamma(c)\big),$$
where $S^m(\cdot)$ denotes the mth symmetric product. In particular, the orbit space $(X^n)^{c_n}/Z_{\widetilde H\Gamma_n}(c_n)$ can be naturally identified with $X^c/Z_\Gamma(c)$.

In view of Lemma 2.1, let us recall how we have classified the split conjugacy classes of $\widetilde H\Gamma_n$ in Theorem 1.2 of [JW]. In order to show that a given element a in $\widetilde H\Gamma_n$ is non-split, an explicit element, say b, is constructed in the centralizer $Z_{\widetilde H\Gamma_n}(a)$ such that the character $\varepsilon_a$ takes the value −1 at b. This was achieved in [JW] case by case. On the other hand, we observe that in all cases the element b fixes $(X^n)^a$ pointwise! In other words, we have the following.

Lemma 5.2. The $\widetilde H\Gamma_n$-space $X^n$ satisfies the strong vanishing property.

We now give an explicit description of $\mathcal F^-_\Gamma(X) = \bigoplus_{n\ge 0}\underline K^-_{\widetilde H\Gamma_n}(X^n)$ as a graded algebra.

Theorem 5.2. As a $(\mathbb Z_+\times\mathbb Z_2)$-graded algebra, $\mathcal F^-_\Gamma(X,t)$ is isomorphic to the supersymmetric algebra $\mathcal S\big(\bigoplus_{r=1}^{\infty} t^{2r-1}\underline K_\Gamma(X)\big)$. In particular, we have
$$\dim_t\mathcal F^-_\Gamma(X) = \frac{\prod_{r=1}^{\infty}(1+t^{2r-1})^{\dim K^1_\Gamma(X)}}{\prod_{r=1}^{\infty}(1-t^{2r-1})^{\dim K^0_\Gamma(X)}}.$$
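Here the supersymmetric algebra is equal to the tensor product of the symmetric algebra $S\big(\bigoplus_{r=1}^{\infty} t^{2r-1}\underline K^0_\Gamma(X)\big)$ and the exterior algebra $\Lambda\big(\bigoplus_{r=1}^{\infty} t^{2r-1}\underline K^1_\Gamma(X)\big)$.

Proof. Take an even split element $a\in\widetilde H\Gamma_n$ of type $\rho = (\rho^+,\emptyset)$, where $\rho^+ = (m^+_r(c))_{r\ge 1,\,c\in\Gamma_*}\in\mathcal{OP}_n(\Gamma_*)$. By Lemma 5.1 and the Künneth formula, we have
$$\underline K\big((X^n)^a/Z_{\widetilde H\Gamma_n}(a)\big)\ \approx\ \bigotimes_{c\in\Gamma_*,\ r\ge 1\ \mathrm{odd}}\Big(\big(\underline K(X^c)^{Z_\Gamma(c)}\big)^{\otimes m^+_r(c)}\Big)^{S_{m^+_r(c)}}\ \approx\ \bigotimes_{c\in\Gamma_*,\ r\ge 1\ \mathrm{odd}}\mathcal S^{m^+_r(c)}\big(\underline K(X^c/Z_\Gamma(c))\big). \tag{13}$$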
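We now calculate as follows. The statement concerning $\dim_t\mathcal F^-_\Gamma(X)$ follows from this immediately.

$$\mathcal F^-_\Gamma(X,t)\ \cong\ \bigoplus_{n\ge 0} t^n\bigoplus_{[a]\in(\widetilde H\Gamma_n)^{e.s.}_*}\underline K\big((X^n)^a/Z_{\widetilde H\Gamma_n}(a)\big)\qquad\text{by Theorem 3.2 and Lemma 5.2,}$$
$$\cong\ \bigoplus_{n\ge 0}\ \bigoplus_{\{m^+_r(c)\}_{c,r}\in\mathcal{OP}_n(\Gamma_*)} t^n\bigotimes_{c,\ r\ \mathrm{odd}}\mathcal S^{m^+_r(c)}\big(\underline K(X^c/Z_\Gamma(c))\big)\qquad\text{by (13) and Theorem 2.1,}$$
$$\cong\ \bigoplus_{\{m^+_r(c)\}_{c,r}\in\mathcal{OP}(\Gamma_*)}\ \bigotimes_{c,\ r\ \mathrm{odd}}\mathcal S^{m^+_r(c)}\big(t^r\underline K(X^c/Z_\Gamma(c))\big)$$
$$\cong\ \bigoplus_{\{m_r\}_r}\ \bigotimes_{r\ge 1\ \mathrm{odd}}\mathcal S^{m_r}\Big(\bigoplus_{c\in\Gamma_*} t^r\underline K(X^c/Z_\Gamma(c))\Big)\qquad\text{where } m_r = \sum_{c\in\Gamma_*} m^+_r(c),$$
$$\cong\ \bigoplus_{\{m_r\}_r}\ \bigotimes_{r\ge 1\ \mathrm{odd}}\mathcal S^{m_r}\big(t^r\underline K_\Gamma(X)\big)\qquad\text{by Theorem 3.1,}$$
$$\cong\ \mathcal S\Big(\bigoplus_{r=1}^{\infty} t^{2r-1}\underline K_\Gamma(X)\Big).$$
□

Recall that the orbifold Euler number $e(X,\Gamma)$ was introduced by Dixon, Harvey, Vafa and Witten [DHVW] in the study of orbifold string theory. It was subsequently interpreted as the Euler number of the equivariant K-group $K_\Gamma(X)$, cf. e.g. [AS]. If we define the Euler number of the generalized symmetric product to be the difference
$$e(X^n,\widetilde H\Gamma_n) := \dim K^{-,0}_{\widetilde H\Gamma_n}(X^n) - \dim K^{-,1}_{\widetilde H\Gamma_n}(X^n),$$
then we obtain the following corollary.

Corollary 5.1. The Euler number $e(X^n,\widetilde H\Gamma_n)$ is given by the following generating function:
$$\sum_{n=0}^{\infty} e(X^n,\widetilde H\Gamma_n)\,t^n = \prod_{r=1}^{\infty}\big(1-t^{2r-1}\big)^{-e(X,\Gamma)}.$$

In the case when X is a point, we obtain the following corollary (also cf. [JW]).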
Corollary 5.2. When X is a point and thus $\mathcal F^-_\Gamma(X)\cong R^-_\Gamma$, we have
$$\sum_{n\ge 0} t^n\dim R^-(\widetilde H\Gamma_n) = \prod_{r=1}^{\infty}\big(1-t^{2r-1}\big)^{-|\Gamma_*|}.$$
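The generating functions in Theorem 5.2 and Corollaries 5.1 and 5.2 are easy to expand to any finite order. The following sketch (function names are hypothetical, and the inputs stand in for $e(X,\Gamma)$, $|\Gamma_*|$ or the dimensions of $K^0_\Gamma(X)$ and $K^1_\Gamma(X)$) computes truncated coefficient lists with integer arithmetic. For exponent 1, which is the case of trivial Γ and X a point in Corollary 5.2, the coefficients count partitions of n into odd parts.

```python
def odd_product(exponent, order):
    """Coefficients up to t^order of prod_{r>=1} (1 - t^(2r-1))^(-exponent)."""
    series = [1] + [0] * order
    for m in range(1, order + 1, 2):            # m = 2r - 1 runs over odd integers
        for _ in range(abs(exponent)):
            if exponent > 0:                    # divide by (1 - t^m)
                for n in range(m, order + 1):
                    series[n] += series[n - m]
            else:                               # multiply by (1 - t^m)
                for n in range(order, m - 1, -1):
                    series[n] -= series[n - m]
    return series

def dim_series(d0, d1, order):
    """Coefficients of prod (1 + t^(2r-1))^d1 / (1 - t^(2r-1))^d0 for d0, d1 >= 0."""
    series = odd_product(d0, order)
    for m in range(1, order + 1, 2):
        for _ in range(d1):                     # multiply by (1 + t^m)
            for n in range(order, m - 1, -1):
                series[n] += series[n - m]
    return series

print(odd_product(1, 10))    # partitions of n into odd parts
print(dim_series(0, 1, 10))  # partitions of n into distinct odd parts
```

A negative `exponent` is accepted because the Euler number $e(X,\Gamma)$ in Corollary 5.1 need not be positive.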
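5.4. Twisted vertex operators and $\mathcal F^-_\Gamma(X)$. In the following, we will define various K-theory maps appearing in the following diagram (n odd):
$$\underline K_\Gamma(X)\xrightarrow{\ \aleph\ }\underline K^-_{\widetilde H\Gamma_1}(X)\xrightarrow{\ \boxtimes n\ }\underline K^-_{\widetilde H\Gamma_n}(X^n)\xrightarrow{\ \varphi_n\ }\bigoplus_{[a]\in(\widetilde H\Gamma_n)^{e.s.}_*}\underline K\big((X^n)^a/Z_{\widetilde H\Gamma_n}(a)\big)\ \underset{\iota}{\overset{pr}{\rightleftarrows}}\ \bigoplus_{c\in\Gamma_*}\underline K\big((X^n)^{c_n}/Z_{\widetilde H\Gamma_n}(c_n)\big)\xrightarrow{\ \vartheta\ }\bigoplus_{c\in\Gamma_*}\underline K\big(X^c/Z_\Gamma(c)\big)\xleftarrow{\ \varphi\ }\underline K_\Gamma(X). \tag{14}$$
Noting that $\widetilde H\Gamma_1 = \Gamma\times\Pi_1$, we have a canonical isomorphism, denoted by $\aleph$, from $\underline K_\Gamma(X)$ to $\underline K^-_{\widetilde H\Gamma_1}(X)$ given by $E\mapsto E|E$. Given a $\Gamma$-equivariant vector bundle V, consider the nth outer tensor product $(V|V)^{\boxtimes n}$, which is a vector superbundle over $X^n$. The odd operator P acting on each factor V|V induces an action of the finite supergroup $\Pi_n$ on $(V|V)^{\boxtimes n}$, while the wreath product $\Gamma_n$ acts on $(V|V)^{\boxtimes n}$ by letting
$$((g_1,\ldots,g_n),s).\,u_1\otimes\ldots\otimes u_n = g_1u_{s^{-1}(1)}\otimes\ldots\otimes g_nu_{s^{-1}(n)}, \tag{15}$$
where $g_1,\ldots,g_n\in\Gamma$, $s\in S_n$ and $u_i\in V|V$, $i = 1,\ldots,n$. It is easy to check that the combined action gives rise to an action of the finite supergroup $\widetilde H\Gamma_n$ on $(V|V)^{\boxtimes n}$, and $(V|V)^{\boxtimes n}$ endowed with such an $\widetilde H\Gamma_n$ action is an $\widetilde H\Gamma_n$-equivariant vector superbundle over $X^n$. On the other hand we can define an $\widetilde H\Gamma_n$ action on $V^{\boxtimes n}\otimes(\mathbb C^{1|1})^{\otimes n}$ as follows: $\Gamma^n$ acts on the first factor $V^{\boxtimes n}$ only while $\Pi_n$ acts only on the second factor $(\mathbb C^{1|1})^{\otimes n}$; the symmetric group $S_n$ acts diagonally. One can check that the combined action gives $V^{\boxtimes n}\otimes(\mathbb C^{1|1})^{\otimes n}$ the structure of an $\widetilde H\Gamma_n$-equivariant vector superbundle over $X^n$. We easily see that $(V|V)^{\boxtimes n}$ is canonically isomorphic to $V^{\boxtimes n}\otimes(\mathbb C^{1|1})^{\otimes n}$ as an $\widetilde H\Gamma_n$-equivariant superbundle.

Remark 5.3. Note that the above $(\mathbb C^{1|1})^{\otimes n}$ is precisely the basic $\widetilde H\Gamma_n$-supermodule $L_n$. In general for a given $\widetilde H\Gamma_n$-supermodule M with character $\chi$, we can define an $\widetilde H\Gamma_n$-equivariant superbundle structure on $V^{\boxtimes n}\otimes M$ when replacing $L_n$ above by M. We will write the corresponding element in $\underline K^-_{\widetilde H\Gamma_n}(X^n)$ as $V^{\boxtimes n}\star\chi$. This defines an additive map from $R^-(\widetilde H\Gamma_n)$ to $\underline K^-_{\widetilde H\Gamma_n}(X^n)$ by sending $\chi$ to $V^{\boxtimes n}\star\chi$.

Sending V|V to $(V|V)^{\boxtimes n}$ gives rise to the K-theory map $\boxtimes n$. More explicitly, given V, W two $\Gamma$-equivariant vector bundles on X, we use V itself to denote the corresponding element in $\underline K_\Gamma(X)$ by abuse of notation. Then
$$(V|V - W|W)^{\boxtimes n} = \sum_{j=0}^{n}(-1)^j\, Ind^{\widetilde H\Gamma_n}_{\widetilde H\Gamma_{n-j}\times\widetilde H\Gamma_j}\Big((V|V)^{\boxtimes(n-j)}\boxtimes(W|W)^{\boxtimes j}\Big). \tag{16}$$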
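Here $(V|V)^{\boxtimes(n-j)}$ and $(W|W)^{\boxtimes j}$ carry the standard actions of $\widetilde H\Gamma_{n-j}$ and respectively $\widetilde H\Gamma_j$. The map $\varphi_n$ is the isomorphism in Theorem 3.2 given by the summation $\sum_{\rho^+\in\mathcal{OP}_n(\Gamma_*)}(\varphi_n)_{\rho^+}$ over the even split conjugacy classes of $\widetilde H\Gamma_n$ of type $(\rho^+,\emptyset)$ when applied to the case $X^n$ with the action of $\widetilde H\Gamma_n$. The map pr is the projection to the direct sum over the even split conjugacy classes $c_n$ of $\widetilde H\Gamma_n$ while $\iota$ denotes the inclusion map. The map $\vartheta$ denotes the natural identification given by Lemma 5.1. Finally the last map $\varphi$ is the isomorphism given in Theorem 3.1. We introduce in addition the following K-theory operations.

Definition 5.1. For $n\in 2\mathbb Z_++1$, we define the following K-theory operations as composition maps:
$$\psi^n := n\,\varphi^{-1}\circ\vartheta\circ pr\circ\varphi_n\circ\boxtimes n\circ\aleph:\ \underline K_\Gamma(X)\longrightarrow\underline K_\Gamma(X),$$
$$\phi^n := n\,\varphi_n^{-1}\circ\iota\circ pr\circ\varphi_n\circ\boxtimes n:\ \underline K^-_{\widetilde H\Gamma_1}(X)\longrightarrow\underline K^-_{\widetilde H\Gamma_n}(X^n),$$
$$ch_n := \varphi^{-1}\circ\vartheta\circ pr\circ\varphi_n:\ \underline K^-_{\widetilde H\Gamma_n}(X^n)\longrightarrow\underline K_\Gamma(X),$$
$$U_n := \varphi_n^{-1}\circ\iota\circ\vartheta^{-1}\circ\varphi:\ \underline K_\Gamma(X)\longrightarrow\underline K^-_{\widetilde H\Gamma_n}(X^n).$$

Recall that the notation $\psi^n$ (n odd) was used in Sect. 4 to denote the nth Adams operation. We shall see that the $\psi^n$ defined here for trivial $\Gamma$ coincides with the nth Adams operation tensored with $\mathbb C$. This is why we have chosen to use the same notation. We list some properties of these K-theory maps which follow directly from the definitions.

Proposition 5.1. The following identities hold:
$$ch_n\circ U_n = I|_{\underline K_\Gamma(X)},\qquad U_n\circ\psi^n = \phi^n\circ\aleph,\qquad ch_n\circ\phi^n\circ\aleph = \psi^n,$$
where $I|_{\underline K_\Gamma(X)}$ denotes the identity operator on $\underline K_\Gamma(X)$.

Lemma 5.3. Both $\psi^n$ and $\phi^n$ (n odd) are additive K-theory maps. In particular, for $\Gamma$ trivial, the operation $\psi^n$ given in Definition 5.1 coincides with the nth Adams operation on K(X).

Proof. We sketch a proof. By definition $U_n$ is additive and $\aleph$ is an isomorphism. Thanks to the equality $U_n\circ\psi^n = \phi^n\circ\aleph$ from Proposition 5.1, to show that $\phi^n$ is additive, it suffices to check that $\psi^n$ is additive. This can be proved by using (16), in parallel with the way Atiyah [At] proves the additivity of the Adams operations defined in terms of symmetric groups.

Now we set $\Gamma = \{1\}$ and consider the diagonal embedding $\Delta_n: X\to X^\Delta\hookrightarrow X^n$. Since $\widetilde H_n$ acts on $X^\Delta$ trivially, it follows by Proposition 4.1 that $\underline K^-_{\widetilde H_n}(X)\cong\underline K(X)\otimes R^-(\widetilde H_n)$. We have the following commutative diagram (n odd):
$$\begin{array}{ccccc}
\underline K(X) & & & & \\
{\scriptstyle\boxtimes n\circ\aleph}\downarrow & \searrow{\scriptstyle\otimes n\circ\aleph} & & & \\
\underline K^-_{\widetilde H_n}(X^n) & \xrightarrow{\ \Delta_n^*\ } & \underline K^-_{\widetilde H_n}(X) & \xrightarrow{\ \cong\ } & \underline K(X)\otimes R^-(\widetilde H_n) \\
{\scriptstyle pr\circ\varphi_n}\downarrow\uparrow{\scriptstyle\iota} & & {\scriptstyle\iota}\uparrow\downarrow{\scriptstyle pr\circ\varphi_n^\Delta} & \swarrow{\scriptstyle ev} & \\
\underline K(X^\Delta) & \xrightarrow{\ \vartheta\ } & \underline K(X) & &
\end{array}$$
where $\varphi_n^\Delta$ is the analog of $\varphi_n$ when $X^n$ is replaced by the diagonal X, and the evaluation map ev is defined to be the character value at the conjugacy class of type ((n), ∅). The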
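map from K(X) to itself obtained along the left-bottom route in the above diagram coincides with $\frac1n\psi^n$ given in Definition 5.1. The map from K(X) to itself obtained along the top-right route in the above diagram gives $\frac1n$ times the nth Adams operation. This of course gives another proof that both $\psi^n$ and $\phi^n$ are additive when $\Gamma$ is trivial. □

Remark 5.4. $\mathcal F^-_\Gamma(X)$ is a Q-λ ring with $\phi^n$ (n odd) as the nth Adams operation. If X is a point and thus $\mathcal F^-_\Gamma(pt) = R^-_\Gamma$, then our result reduces to the fact that $\mathcal F^-_\Gamma(pt)$ is a free Q-λ ring generated by $\Gamma_*$. In particular when $\Gamma$ is trivial this is isomorphic to the model Q-λ ring C.

Denote by $\widehat{\mathcal F}^t_\Gamma(X)$ the completion of $\mathcal F^-_\Gamma(X,t)$ which allows formal infinite sums. Given $V\in\underline K_\Gamma(X)$, we introduce $Q(V,t)\in\widehat{\mathcal F}^t_\Gamma(X)$ as follows:
$$Q(V,t) = \sum_{n\ge 0} t^n\,(V|V)^{\boxtimes n}. \tag{17}$$
The following lemma is immediate by Definition 5.1 and Remark 5.3.

Lemma 5.4. Given $V\in\underline K_\Gamma(X)$, we have (for r odd)
$$\phi^r\circ\aleph(V) = \sum_{c\in\Gamma_*}\zeta_c^{-1}\,\varphi_r^{-1}(\varphi_r)_{c_r}\big(V^{\boxtimes r}\star\sigma_r(c)\big).$$

Proposition 5.2. Given $V\in\underline K_\Gamma(X)$, we can express Q(V, t) as follows:
$$Q(V,t) = \exp\Big(\sum_{r>0\ \mathrm{odd}}\frac2r\,\phi^r\circ\aleph(V)\,t^r\Big).$$
Here the right-hand side is understood in terms of the algebra structure on $\mathcal F^-_\Gamma(X)$.

Proof. By (3) and the above lemma, we have
$$Q(V,t) = \sum_{n\ge 0} t^n\,\big(V^{\boxtimes n}\star\xi^n\big)$$
$$= \sum_{n\ge 0} t^n\sum_{\rho\in\mathcal{OP}_n(\Gamma_*)}\varphi_n^{-1}(\varphi_n)_\rho\big(V^{\boxtimes n}\star 2^{l(\rho)}Z_\rho^{-1}\sigma^\rho\big)$$
$$= \sum_{n\ge 0}\ \sum_{\rho\in\mathcal{OP}_n(\Gamma_*)}\varphi_n^{-1}(\varphi_n)_\rho\, 2^{l(\rho)}Z_\rho^{-1}\,t^n\,V^{\boxtimes n}\star\sigma^\rho$$
$$= \prod_{c\in\Gamma_*,\ r\ge 1\ \mathrm{odd}}\ \sum_{m_r(c)\ge 0}\frac{1}{m_r(c)!}\Big(\frac2r\,t^r\,\zeta_c^{-1}\varphi_r^{-1}(\varphi_r)_{c_r}\big(V^{\boxtimes r}\star\sigma_r(c)\big)\Big)^{m_r(c)}$$
$$= \exp\Big(\sum_{r\ge 1\ \mathrm{odd}}\frac2r\,t^r\sum_{c\in\Gamma_*}\zeta_c^{-1}\varphi_r^{-1}(\varphi_r)_{c_r}\big(V^{\boxtimes r}\star\sigma_r(c)\big)\Big)$$
$$= \exp\Big(\sum_{r\ge 1\ \mathrm{odd}}\frac2r\,t^r\,\phi^r\circ\aleph(V)\Big).$$
□

Combining with the additivity of $\phi^r$, the proposition implies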
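Corollary 5.3. The following equations hold for $V, W\in\underline K_\Gamma(X)$:
$$Q(V\oplus W,t) = Q(V,t)\,Q(W,t),\qquad Q(-V,t) = Q(V,-t).$$

Remark 5.5. The generating function Q(V, t) is essentially half the twisted vertex operator, and the other half can be obtained by the adjoint operator to Q(V, t). Twisted vertex operators have played an important role in the representation theory of infinite dimensional Lie algebras and the moonshine module, cf. [FLM]. When X is a point, we can develop the picture more completely (cf. [JW]) to provide a group theoretic realization of vertex representations of twisted affine and twisted toroidal Lie algebras (also compare [FJW2] for a different construction).

5.5. Twisted Heisenberg algebra and $\mathcal F^-_\Gamma(X)$. We see from Theorem 5.2 that $\mathcal F^-_\Gamma(X)$ has the same size as the tensor product of the Fock space of an infinite-dimensional twisted Heisenberg algebra of rank $\dim K^0_\Gamma(X)$ and that of an infinite-dimensional twisted Clifford algebra of rank $\dim K^1_\Gamma(X)$. In this section we will actually construct such a Heisenberg/Clifford algebra, which we will simply refer to as a twisted Heisenberg (super)algebra from now on.

The dual of $\underline K_\Gamma(X)$, denoted by $\underline K_\Gamma(X)^*$, is naturally $\mathbb Z_2$-graded as identified with $\underline K^0_\Gamma(X)^*\oplus\underline K^1_\Gamma(X)^*$. Denote by $\langle\cdot,\cdot\rangle$ the pairing between $\underline K_\Gamma(X)^*$ and $\underline K_\Gamma(X)$. For any $m\in 2\mathbb Z_++1$ and $\eta\in\underline K_\Gamma(X)^*$, we define an additive map
$$a_{-m}(\eta):\ \underline K^-_{\widetilde H\Gamma_n}(X^n)\longrightarrow\underline K^-_{\widetilde H\Gamma_{n-m}}(X^{n-m}) \tag{18}$$
as the composition
$$\underline K^-_{\widetilde H\Gamma_n}(X^n)\xrightarrow{\ Res\ }\underline K^-_{\widetilde H\Gamma_m\times\widetilde H\Gamma_{n-m}}(X^n)\xrightarrow{\ k^{-1}\ }\underline K^-_{\widetilde H\Gamma_m}(X^m)\otimes\underline K^-_{\widetilde H\Gamma_{n-m}}(X^{n-m})$$
$$\xrightarrow{\ ch_m\otimes 1\ }\underline K_\Gamma(X)\otimes\underline K^-_{\widetilde H\Gamma_{n-m}}(X^{n-m})\xrightarrow{\ \eta\otimes 1\ }\underline K^-_{\widetilde H\Gamma_{n-m}}(X^{n-m}).$$
On the other hand, we define for any $m\in 2\mathbb Z_++1$ and $V\in\underline K_\Gamma(X)$ an additive map
$$a_m(V):\ \underline K^-_{\widetilde H\Gamma_{n-m}}(X^{n-m})\longrightarrow\underline K^-_{\widetilde H\Gamma_n}(X^n) \tag{19}$$
as the composition
$$\underline K^-_{\widetilde H\Gamma_{n-m}}(X^{n-m})\xrightarrow{\ \frac m2 U_m(V)\otimes\,\cdot\ }\underline K^-_{\widetilde H\Gamma_m}(X^m)\otimes\underline K^-_{\widetilde H\Gamma_{n-m}}(X^{n-m})\xrightarrow{\ k\ }\underline K^-_{\widetilde H\Gamma_m\times\widetilde H\Gamma_{n-m}}(X^n)\xrightarrow{\ Ind\ }\underline K^-_{\widetilde H\Gamma_n}(X^n).$$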
Let H be the linear span of the operators $a_{-m}(\eta)$, $a_m(V)$, $m\in 2\mathbb Z_++1$, $\eta\in\underline K_\Gamma(X)^*$, $V\in\underline K_\Gamma(X)$. Clearly H admits a natural $\mathbb Z_2$-gradation induced from that on $\underline K_\Gamma(X)$ and $\underline K_\Gamma(X)^*$. Below we shall use [−, −] to denote the supercommutator as well. It is understood that [a, b] is the anti-commutator ab + ba when a, b ∈ H are both odd elements according to the $\mathbb Z_2$-gradation.
Theorem 5.3. When acting on $\mathcal F^-_\Gamma(X)$, H satisfies the twisted Heisenberg superalgebra commutation relations, namely for $m, l\in 2\mathbb Z_++1$, $\eta,\eta'\in\underline K_\Gamma(X)^*$, $V, W\in\underline K_\Gamma(X)$, we have
$$[a_{-m}(\eta), a_l(V)] = \frac l2\,\delta_{m,l}\,\langle\eta, V\rangle,$$
$$[a_m(W), a_l(V)] = 0,\qquad [a_{-m}(\eta), a_{-l}(\eta')] = 0.$$
Furthermore, $\mathcal F^-_\Gamma(X)$ is an irreducible representation of the twisted Heisenberg superalgebra.

Remark 5.6. The proof of the Heisenberg algebra commutation relations can be given in parallel with the one for Theorem 4 of [W1]. The irreducibility of $\mathcal F^-_\Gamma(X)$ as a module over the Heisenberg algebra follows from Theorem 5.2. Given a bilinear form on $K_\Gamma(X)$, we can get rid of $K_\Gamma(X)^*$ in the formulation of the above theorem. In the special case when X is a point, the Heisenberg algebra here specializes to the one given in [JW] acting on $R^-_\Gamma$. In the case when $\Gamma$ is a finite subgroup of $SL_2(\mathbb C)$, we may consider further the space which is the tensor product of $\mathcal F^-_\Gamma(pt)$ with a module of a certain 2-group which can be constructed out of $\Gamma$, and realize in this way a vertex representation of a twisted affine and a twisted toroidal Lie algebra. This is treated in [JW] in detail.

6. Appendix: Another Formulation Using $\widetilde S_n$ and $\widetilde\Gamma_n$

As is well known (cf. e.g. [Sc, J, Jo, HH]), the symmetric group $S_n$ has a double cover $\widetilde S_n$:
$$1\longrightarrow\mathbb Z_2\longrightarrow\widetilde S_n\xrightarrow{\ \theta_n\ }S_n\longrightarrow 1,$$
generated by z and $t_i$, $i = 1,\ldots,n-1$, and subject to the relations:
$$z^2 = 1,\qquad t_i^2 = (t_it_{i+1})^3 = z,\qquad t_it_j = z\,t_jt_i\ \ (i>j+1),\qquad zt_i = t_iz.$$
The map $\theta_n$ sends the $t_i$'s to the simple reflections $s_i$'s in $S_n$. The group $\widetilde S_n$ carries a natural $\mathbb Z_2$ grading by letting the $t_i$'s be odd and z be even.

Given a finite group $\Gamma$, the symmetric group $S_n$ acts on the product group $\Gamma^n$, and the group $\widetilde S_n$ acts on $\Gamma^n$ via $\theta_n$. Thus we can form a semi-direct product $\widetilde\Gamma_n := \Gamma^n\rtimes_{\theta_n}\widetilde S_n$, which carries a natural finite supergroup structure by letting elements in $\Gamma^n$ be even, cf. [FJW2]. We still denote by $\theta_n$ the quotient map $\widetilde\Gamma_n\to\widetilde\Gamma_n/\langle 1, z\rangle = \Gamma_n$.

Given a $\Gamma$-space X, we have seen that $X^n$ affords a natural $\Gamma_n$ action. Then we can apply the general construction in Sect. 3.2 to construct the category $C^-_{\widetilde\Gamma_n}(X^n)$ and its associated K-group $K^-_{\widetilde\Gamma_n}(X^n)$. As before, we denote $\underline K^-_{\widetilde\Gamma_n}(X^n) = K^-_{\widetilde\Gamma_n}(X^n)\otimes\mathbb C$. We then form the direct sum
$$\widetilde{\mathcal F}^-_\Gamma(X) = \bigoplus_{n=0}^{\infty}\underline K^-_{\widetilde\Gamma_n}(X^n),\qquad\widetilde{\mathcal F}^-_\Gamma(X,t) = \bigoplus_{n=0}^{\infty} t^n\,\underline K^-_{\widetilde\Gamma_n}(X^n).$$
When X is a point, $\underline K^-_{\widetilde\Gamma_n}(pt)$ reduces to the Grothendieck group $R^-(\widetilde\Gamma_n)$ of spin supermodules of $\widetilde\Gamma_n$, and $\widetilde{\mathcal F}^-_\Gamma(pt)$ has been studied in detail in [FJW2]. The purpose of the Appendix is to outline how to modify the various constructions of algebraic structures on $\mathcal F^-_\Gamma(X)$ for the new space $\widetilde{\mathcal F}^-_\Gamma(X)$. As the constructions are very similar to the $\mathcal F^-_\Gamma(X)$ case, we will be rather sketchy.

Given n, m ≥ 0, we can define a $\mathbb Z_2$-graded subgroup $\widetilde\Gamma_n\times\widetilde\Gamma_m$ of $\widetilde\Gamma_{n+m}$, in a way analogous to (4). Then the obvious analog of constructions (11) and (12) defines a multiplication and comultiplication on the space $\widetilde{\mathcal F}^-_\Gamma(X)$. The following is an analog of Theorem 5.1, and it generalizes Theorem 3.8 of [FJW2], which is our special case when X is a point.

Theorem 6.1. The space $\widetilde{\mathcal F}^-_\Gamma(X)$ carries a natural Hopf algebra structure.

By the analysis of the split conjugacy classes given in the proof of Theorem 2.5 of [FJW2], we see that the $\widetilde\Gamma_n$-space $X^n$ satisfies the strong vanishing property, and the analog of Lemma 5.1 holds. Therefore we obtain the following theorem, which is an analog of Theorem 5.2.

Theorem 6.2. As a $(\mathbb Z_+\times\mathbb Z_2)$-graded algebra, $\widetilde{\mathcal F}^-_\Gamma(X,t)$ is isomorphic to the supersymmetric algebra $\mathcal S\big(\bigoplus_{r=1}^{\infty} t^{2r-1}\underline K_\Gamma(X)\big)$.

Except for the first two terms and the first two arrows in the diagram (14), the rest of the diagram has a direct analog for $\underline K^-_{\widetilde\Gamma_n}(X^n)$. Note that in the definition of the K-theory maps $ch_n$ and $U_n$ (see Definition 5.1) only the part of the diagram (14) which can be directly generalized to the $\underline K^-_{\widetilde\Gamma_n}(X^n)$ setup has been used. Therefore an analog of $ch_n$ and $U_n$ can be defined in our new setup. This guarantees that the analogs of the annihilation operators (18) and the creation operators (19) can be defined in our new setup. In this way we obtain the following, which is an analog of Theorem 5.3.

Theorem 6.3. The space $\widetilde{\mathcal F}^-_\Gamma(X)$ affords an action of the twisted Heisenberg algebra H in terms of natural additive K-theory maps. Furthermore this representation is irreducible.

We remark that it is much less natural to use $\widetilde S_n$ to construct various K-theory operations on K(X) than it is to use a double cover $\widetilde H_n$ of the hyperoctahedral group as done in Sect. 4. The connection between $\widetilde{\mathcal F}^-_\Gamma(X)$ and the Q-λ ring in Sect. 5.4 does carry over to our new setup. Keeping Remark 5.3 in mind and knowing that $\widetilde S_n$ also has a so-called basic spin supermodule (cf. e.g. [Jo, FJW2]), we can use it to define the analog of (17). Indeed this also defines an analog of the map $\boxtimes n\circ\aleph$ (cf. the diagram (14)), and thus an analog of $\phi^n\circ\aleph$. Therefore, we have an analog of Proposition 5.2 in our new setup which generalizes Proposition 6.2 in [FJW2]. However there is an unpleasant square root of 2 in the formula which originates in the spin representation theory of $\widetilde S_n$ and $\widetilde\Gamma_n$. This is another reason why we have preferred the formulation in the main body of the paper using $\widetilde H_n$ and $\widetilde H\Gamma_n$.

One may wonder why $K^-_{\widetilde H\Gamma_n}(X^n)$ and $K^-_{\widetilde\Gamma_n}(X^n)$ are so similar to each other and why there are almost parallel constructions on $\mathcal F^-_\Gamma(X)$ and $\widetilde{\mathcal F}^-_\Gamma(X)$. When X is a point and thus the K-groups reduce to the corresponding Grothendieck groups of spin supermodules, this has been noticed by various authors (cf. e.g. [Ser, Jo2, Naz, Y, FJW2] and the references therein). Yamaguchi [Y] explains clearly such a phenomenon by establishing an isomorphism between the group superalgebra $\mathbb C[\widetilde H_n]/\langle z = -1\rangle$ and the ($\mathbb Z_2$-graded) tensor product of the group superalgebra $\mathbb C[\widetilde S_n]/\langle z = -1\rangle$ with the complex Clifford algebra $\mathcal C_n$ of n variables (this is not the same copy of $\mathcal C_n$ associated to the subgroup $\Pi_n$ in $\widetilde H\Gamma_n$!). It follows that the group superalgebra $\mathbb C[\widetilde H\Gamma_n]/\langle z = -1\rangle$ is isomorphic to the tensor product of the group superalgebra $\mathbb C[\widetilde\Gamma_n]/\langle z = -1\rangle$ with $\mathcal C_n$. Note that a Clifford algebra admits a unique irreducible supermodule. As this $\mathcal C_n$ acts on an $\widetilde H\Gamma_n$-bundle over $X^n$ fiberwise, this isomorphism provides a direct isomorphism between $K^-_{\widetilde H\Gamma_n}(X^n)$ and $K^-_{\widetilde\Gamma_n}(X^n)$.
We can also forget about the $\mathbb Z_2$-gradings (i.e. the super structures) in the group $\widetilde\Gamma_n$, in the construction of the category $C^-_{\widetilde\Gamma_n}(X^n)$ and its associated K-group $K^-_{\widetilde\Gamma_n}(X^n)$. Let us denote the resulting new K-group by $K^s_{\widetilde\Gamma_n}(X^n)$. In particular when X is a point and $\Gamma$ is trivial, this reduces to the Grothendieck group $R^s(\widetilde S_n)$ of spin (not super) modules of $\widetilde S_n$ where z still acts as −1. We can then apply the decomposition theorem of Adem-Ruan [AR] to calculate $K^s_{\widetilde\Gamma_n}(X^n)\otimes\mathbb C$ in terms of $K_\Gamma(X)\otimes\mathbb C$. The difference here from the calculations in Theorem 5.2 and Theorem 6.2 is that the odd split conjugacy classes of $\widetilde\Gamma_n$ will also make contributions. Recall that the orbifold Euler number $e(X,\Gamma)$ defined in [DHVW] is the same as the Euler number of the equivariant K-theory $K_\Gamma(X)$. Using the description of even and odd split conjugacy classes of $\widetilde\Gamma_n$ (cf. [FJW2], Theorem 2.5), we can obtain the Euler number of $K^s_{\widetilde\Gamma_n}(X^n)$, denoted by $e^s(X^n,\widetilde\Gamma_n)$, in terms of the following generating function (compare Corollary 5.1):
$$\sum_{n=0}^{\infty} t^n e^s(X^n,\widetilde\Gamma_n) = \prod_{r=1}^{\infty}(1-t^{2r-1})^{-e(X,\Gamma)} + \prod_{r=1}^{\infty}(1+t^{2r-1})^{e(X,\Gamma)}\cdot\frac12\Big(\prod_{r=1}^{\infty}(1+t^{2r})^{e(X,\Gamma)} - \prod_{r=1}^{\infty}(1-t^{2r})^{e(X,\Gamma)}\Big).$$
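The generating function for $e^s(X^n,\widetilde\Gamma_n)$ can be expanded numerically. The sketch below treats the Euler number $e(X,\Gamma)$ as an integer input `e`; the function names are hypothetical, and the formula coded is the sum of a first product over odd exponents and a second term built from products over odd and even exponents, as read from the display above. For `e = 1` the first coefficients agree with a direct count of irreducible spin modules of the double cover of $S_n$ for n ≤ 6 (one for each strict partition λ of n with n − l(λ) even, two when it is odd).

```python
def factor_product(start, sign, exponent, order):
    """Coefficients up to t^order of prod over m = start, start+2, ... of (1 + sign*t^m)^exponent."""
    series = [1] + [0] * order
    for m in range(start, order + 1, 2):
        for _ in range(abs(exponent)):
            if exponent > 0:                    # multiply by (1 + sign*t^m)
                for n in range(order, m - 1, -1):
                    series[n] += sign * series[n - m]
            else:                               # divide by (1 + sign*t^m)
                for n in range(m, order + 1):
                    series[n] -= sign * series[n - m]
    return series

def multiply_series(a, b):
    """Product of two truncated power series of equal length."""
    return [sum(a[i] * b[n - i] for i in range(n + 1)) for n in range(len(a))]

def spin_euler_series(e, order):
    """First summand (even split classes) plus second summand (odd split classes)."""
    even_split = factor_product(1, -1, -e, order)   # prod (1 - t^(2r-1))^(-e)
    odd_parts = factor_product(1, +1, e, order)     # prod (1 + t^(2r-1))^e
    plus = factor_product(2, +1, e, order)          # prod (1 + t^(2r))^e
    minus = factor_product(2, -1, e, order)         # prod (1 - t^(2r))^e
    # plus - minus keeps only terms of odd degree in the factors, so it is even.
    half_diff = [(p - q) // 2 for p, q in zip(plus, minus)]
    odd_split = multiply_series(odd_parts, half_diff)
    return [a + b for a, b in zip(even_split, odd_split)]

print(spin_euler_series(1, 6))
```

The integer division by 2 is exact because the difference of the two even-exponent products has even coefficients.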
The second summand in the right-hand side of the above equation counts the contributions from odd split conjugacy classes. When we set X to be a point and $\Gamma$ trivial (and thus $e(X,\Gamma) = 1$), this formula reduces to the classical generating function for the spin Grothendieck group $R^s(\widetilde S_n)$ (cf. Theorem 3.6, [Jo], p. 213; Corollary 3.10, [HH], p. 32). We remark that due to some inaccurate analysis of split conjugacy classes of $\widetilde S_n$, the formula (6.10) given in [Di] (even in the case when $\Gamma$ is trivial and X is a point) is incompatible with this classical statement.

References

[AR] Adem, A., Ruan, Y.: Twisted orbifold K-theory. math.AT/0107168
[At] Atiyah, M.: Power operations in K-theory. Quart. J. Math. Oxford 17, 165–193 (1966)
[AS] Atiyah, M., Segal, G.: On equivariant Euler characteristics. J. Geom. Phys. 6, 671–677 (1989)
[BC] Baum, P., Connes, A.: Chern character for discrete groups. In: Y. Matsumoto et al. (eds.), A Fete of Topology, London-New York: Academic Press, 1988
[Bo] Borcherds, R.: Vertex algebras, Kac-Moody algebras, and the Monster. Proc. Natl. Acad. Sci. USA 83, 3068–3071 (1986)
[Di] Dijkgraaf, R.: Discrete torsion and symmetric products. hep-th/9912101
[DHVW] Dixon, L., Harvey, J.A., Vafa, C., Witten, E.: Strings on orbifolds. Nuclear Phys. B 261, 678–686 (1985)
[FJW1] Frenkel, I., Jing, N., Wang, W.: Vertex representations via finite groups and the McKay correspondence. Internat. Math. Res. Notices 4, 195–222 (2000), math.QA/9907166
[FJW2] Frenkel, I., Jing, N., Wang, W.: Twisted vertex representations via spin groups and the McKay correspondence. Duke Math. J. 111, 51–96 (2002), math.QA/0007159
[FLM] Frenkel, I., Lepowsky, J., Meurman, A.: Vertex operator algebras and the Monster. New York: Academic Press, 1988
[Gr] Grojnowski, I.: Instantons and affine algebras I: The Hilbert scheme and vertex operators. Math. Res. Lett. 3, 275–291 (1996)
[HH] Hoffman, P.N., Humphreys, J.F.: Projective representations of the symmetric groups: Q-functions and shifted tableaux. Oxford: Clarendon Press, 1992
[HKR] Hopkins, M., Kuhn, N., Ravenel, D.: Generalized group characters and complex oriented cohomology theories. J. Amer. Math. Soc. 13, 553–594 (2000)
[J] Jing, N.: Vertex operators, symmetric functions and the spin group $\Gamma_n$. J. Alg. 138, 340–398 (1991)
[JW] Jing, N., Wang, W.: Twisted vertex representations and spin characters. Math. Z. 239, 715–746 (2002), math.QA/0104127
[Jo] Józefiak, T.: Characters of projective representations of symmetric groups. Expositiones Math. 7, 193–247 (1989)
[Jo1] Józefiak, T.: Semisimple superalgebras. In: Some Current Trends in Algebra, Proc. of the Varna Conf. 1986, Lect. Notes in Math. 1352, Berlin-Heidelberg-New York: Springer, pp. 96–113
[Jo2] Józefiak, T.: A class of projective representations of hyperoctahedral groups and Schur Q-functions. In: Topics in Algebra, Banach Center Publications 26, Part 2, Warszawa: PWN-Polish Scientific Publishers, 1990, pp. 317–326
[Ka] Karoubi, M.: Algèbres de Clifford et K-théorie. Ann. Scient. Éc. Norm. Sup., 4e Série, t. 1, 161–270 (1968)
[Ku] Kuhn, N.: Character rings in algebraic topology. In: Advances in Homotopy, London Math. Soc. Lect. Notes Series 139, 1989, pp. 111–126
[Kn] Knutson, D.: λ-rings and the representation of the symmetric group. Lect. Notes in Math. 308, Berlin-Heidelberg-New York: Springer-Verlag, 1973
[M] Macdonald, I.: Symmetric functions and Hall polynomials. Second Edition, Oxford: Clarendon Press, 1995
[Na] Nakajima, H.: Lectures on Hilbert schemes of points on surfaces. Univ. Lect. Ser. 18, Providence RI: Amer. Math. Soc., 1999
[Naz] Nazarov, M.: Young's symmetrizers for projective representations of the symmetric group. Adv. in Math. 127, 190–257 (1997)
[Ru] Ruan, Y.: Stringy geometry and topology of orbifolds. math.AG/0011149
[Sc] Schur, I.: Über die Darstellung der symmetrischen und der alternierenden Gruppe durch gebrochene lineare Substitutionen. J. Reine Angew. Math. 139, 155–250 (1911)
[Seg1] Segal, G.: Equivariant K-theory. Publ. Math. IHES 34, 129–151 (1968)
[Seg2] Segal, G.: Equivariant K-theory and symmetric products. 1996 Preprint
[Ser] Sergeev, A.: The tensor algebra of the identity representation as a module over the Lie superalgebras gl(n, m) and Q(n). Math. USSR Sbornik 51, 419–427 (1985)
[Sh] Sharpe, E.: Recent developments in discrete torsion. Phys. Lett. B 498, 104–110 (2001), hep-th/0008191
[Va] Vafa, C.: Modular invariance and discrete torsion on orbifolds. Nucl. Phys. B 273, 592–606 (1986)
[VW1] Vafa, C., Witten, E.: A strong coupling test of S-duality. Nucl. Phys. B 431, 3–77 (1994)
[VW2] Vafa, C., Witten, E.: On orbifolds with discrete torsion. J. Geom. Phys. 15, 189–214 (1995)
[W1] Wang, W.: Equivariant K-theory, wreath products and Heisenberg algebra. Duke Math. J. 103, 1–23 (2000), math.QA/9907151
[W2] Wang, W.: Hilbert schemes, wreath products, and the McKay correspondence. Preprint, math.AG/9912104
[W3] Wang, W.: Algebraic structures behind Hilbert schemes and wreath products. Contemp. Math. 297, 271–295 (2002)
[Y] Yamaguchi, M.: A duality of the twisted group algebra of the symmetric group and a Lie superalgebra. J. Alg. 222, 301–327 (1999)
[Z] Zelevinsky, A.: Representations of finite classical groups. A Hopf algebra approach. Lect. Notes in Math. 869, Berlin-New York: Springer-Verlag, 1981
Communicated by R. H. Dijkgraaf
Commun. Math. Phys. 234, 129–183 (2003) Digital Object Identifier (DOI) 10.1007/s00220-002-0785-1
Communications in
Mathematical Physics
A Magnetic Model with a Possible Chern-Simons Phase (with an Appendix by F. Goodman and H. Wenzl) Michael H. Freedman Microsoft Research, Redmond, WA 98052, USA. E-mail:
[email protected] Received: 10 October 2001 / Accepted: 17 September 2002 Published online: 24 January 2003 – © Springer-Verlag 2003
Abstract: An elementary family of local Hamiltonians $H_{\circ,\ell}$, $\ell = 1, 2, 3, \ldots$, is described for a 2-dimensional quantum mechanical system of spin $= \frac{1}{2}$ particles. On the torus, the ground state space $G_{\circ,\ell}$ is (log) extensively degenerate but should collapse under “perturbation” to an anyonic system with a complete mathematical description: the quantum double of the SO(3)-Chern-Simons modular functor at $q = e^{2\pi i/(\ell+2)}$, which we call $DE\ell$. The Hamiltonian $H_{\circ,\ell}$ defines a quantum loop gas. We argue that for $\ell = 1$ and 2, $G_{\circ,\ell}$ is unstable and the collapse to $G_{\epsilon,\ell} \cong DE\ell$ can occur truly by perturbation. For $\ell \geq 3$, $G_{\circ,\ell}$ is stable and in this case finding $G_{\epsilon,\ell} \cong DE\ell$ must require either $\epsilon > \epsilon_\ell > 0$, help from finite system size, surface roughening (see Sect. 3), or some other trick, hence the initial use of quotes “ ”. A hypothetical phase diagram is included in the introduction. The effect of perturbation is studied algebraically: the ground state space $G_{\circ,\ell}$ of $H_{\circ,\ell}$ is described as a surface algebra and our ansatz is that perturbation should respect this structure, yielding a perturbed ground state $G_{\epsilon,\ell}$ described by a quotient algebra. By classification, this implies $G_{\epsilon,\ell} \cong DE\ell$. The fundamental point is that nonlinear structures may be present on degenerate eigenspaces of an initial $H_\circ$ which constrain the possible effective action of a perturbation. There is no reason to expect that a physical implementation of $G_{\epsilon,\ell} \cong DE\ell$ as an anyonic system would require the low temperatures and time asymmetry intrinsic to Fractional Quantum Hall Effect (FQHE) systems or rotating Bose-Einstein condensates, the currently known physical systems modelled by topological modular functors. A solid state realization of DE3, perhaps even one at room temperature, might be found by building and studying systems, “quantum loop gases”, whose main term is $H_{\circ,3}$. This is a challenge for solid state physicists of the present decade.
For $\ell \geq 3$, $\ell \neq 2 \bmod 4$, a physical implementation of $DE\ell$ would yield an inherently fault-tolerant universal quantum computer. But a warning must be posted: the theory at $\ell = 2$ is not computationally universal and the first universal theory at $\ell = 3$ seems somewhat harder
to locate because of the stability of the corresponding loop gas. Does nature abhor a quantum computer?
Contents
0. Introduction
1. The Model
1.1 An alternative microscopic model
2. Things Temperley-Lieb
3. Perturbation and Deformation of $H_{\circ,\ell}$
4. The Evidence for a Chern-Simons Phase
5. The $H_{\epsilon,\ell,t}$ Medium as a Quantum Computer
6. Appendix. Ideals in Temperley-Lieb Category (by Frederick M. Goodman and Hans Wenzl)
A1. The Temperley-Lieb Category
A1.1 The Generic Temperley-Lieb Category
A1.2 Specializations and evaluable morphisms
A1.3 The Markov trace
A2. The structure of the Temperley–Lieb algebras
A2.1 The generic Temperley–Lieb algebras
A2.2 Path idempotents
A2.3 Specializations at non-roots of unity
A2.4 Specializations at roots of unity and evaluable idempotents
A3. Ideals
0. Introduction
In Sect. 1, we write down a positive semidefinite local Hamiltonian $H_{\circ,\ell}$ for a system of locally interacting Ising spins on a 2-dimensional triangular lattice or surface triangulation, $\ell = 1, 2, 3, \ldots$. In the presence of topology, e.g. on a torus, $H_{\circ,\ell}$ has a highly degenerate space $G_{\circ,\ell}$ of zero modes. On any closed surface $Y$ different from the 2-sphere, the degeneracy is polylog$(2^v)$ = poly$(v)$, where $v$ is the number of sites in the triangulation and $2^v$ is the dimension of the Hilbert space $h$ of spins. On the torus $T^2$ the polynomial has degree 1; when $Y$ has genus $g > 1$ the polynomial has degree $3g - 3$ (see Proposition 3.8). We argue for an ansatz (3.4) which exploits the peculiarly rigid algebraic structure of $G_{\circ,\ell}$: it is a monoidal tensor category with a unique nontrivial ideal. The ansatz allows us to model any “perturbed” ground state space $G_{\epsilon,\ell}$ (which is itself stable to perturbation) uniquely as a known anyonic system or, in mathematical parlance, a modular functor. The functor is the Drinfeld double of the even-label-sector of the SU(2)-Chern-Simons unitary topological modular functor at level $\ell$, $DE\ell$. Even labels correspond in physical terms to the integer spin representations, so the even-label-sector derives from the group SO(3). The Hamiltonian $H_{\circ,\ell}$ defines a quantum loop gas which can be compared (see Sect. 3) with the classical analog. The statistical mechanics of classical loop gases [Ni] identifies a known critical regime, and from this we infer that for $\ell = 1$ and 2, $G_{\circ,\ell}$ is unstable and the collapse to $G_{\epsilon,\ell} \cong DE\ell$ is truly by perturbation; for $\ell \geq 3$, $G_{\circ,\ell}$ is stable and in this case finding $G_{\epsilon,\ell} \cong DE\ell$ requires $\epsilon > \epsilon_\ell > 0$, or some other device (see Sect. 3), hence the initial use of quotes “ ”. Figure 0.1 is a hypothetical phase
diagram. The stability of $G_{\circ,\ell}$ at $\ell = 3$ is probably very slight; see footnote 6 in Sect. 3 and the corresponding discussion. The reader should not be alarmed that a “doubled” Chern-Simons theory arises. The doubled structure makes it a gauge theory and, as we will explain, the double, being achiral, is more likely to have a robust physical realization. The modular functor $DE\ell$ has $\lambda = \lceil \frac{\ell+1}{2} \rceil^2$ “labels” or, physically, $\lambda$ superselection sectors for quasiparticle excitations (including the empty particle). Physically this means that a local bit of material, a two dimensional disk with a fixed boundary condition, which is in its unique ground state $G_{\epsilon,\ell}$, can have $\lambda$ types of point-like anyonic excitations (presumably with exponentially decaying tails) which can only be created in pairs. $\lambda$ is the number of ordered integer pairs $(x, y)$ with $0 \leq x, y \leq \ell$ and $x, y$ even. By mathematically deleting small neighborhoods of such excitations a ground state vector is approximately achieved in the highly degenerate ground state space $G_{\epsilon,\ell}$ associated to a punctured disk with boundary conditions. It is known ([FLW2] and [FKLW]) that for $\ell \geq 3$, $\ell \neq 2 \bmod 4$, there is a universal, inherently fault-tolerant model for quantum computation based on the ability to create, braid, fuse, and finally distinguish these excitation types. Thus $H_{\circ,\ell}$ could be of technological importance: a physical system, a “quantum loop gas”, in this (perturbed) universality class could be the substrate of a universal fault tolerant quantum computer. Any unit vector $\Psi \in G_{\circ,\ell}$ is a superposition of classical ±-spin states $|\Psi_i\rangle$ which are distinguished by the eigenvalues ±1 of a commuting family of Pauli matrices $\sigma_z^v$, equal to $\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$ at vertex $v$. Sampling $\Psi = \sum_i a_i |\Psi_i\rangle$ by observing $\{\sigma_z^v\}$, we “observe” a classical $|\Psi_i\rangle$ with probability $|a_i|^2$. The domain wall $\gamma_i$ separating the +-spin regions from the −-spin regions of $\Psi_i$ may be thought of as a random, self dual, loop gas [Ni].
This random state is self dual because there is a symmetry between “up” and “down”. It is a Gibbs state with parameters $k = 0$, self dual, and $n = \left(2\cos\frac{\pi}{\ell+2}\right)^2$, where the weight of a configuration $\gamma$ is $w(\gamma) = e^{-k(\text{total length } \gamma)}\, n^{\#\text{components } \gamma}$. It is known that for $0 < n \leq 2$ and $k = 0$ the loop gas is critical, sitting at a 2nd order phase transition as $k$ crosses from negative to positive. This information together with Sects. 3 and 4 supports a phase diagram like the one shown in Fig. 0.1 with parameters $d := 2\cos\frac{\pi}{\ell+2} = \sqrt{n}$. The parameter $\epsilon$ scales a local perturbation term $V$. We will argue that the simplest choice for $V$, $V = \sum_{\text{site } i} \sigma_x^i$, is a likely candidate. The diagram is labelled “hypothetical” since
there is no proof of its accuracy. The challenge to solid state physics is to find or engineer a two dimensional quantum medium in the universality class, DE3 below. The presumptive approach to building a quantum computer, nearly universal in the literature, is quite different from our topological/anyonic starting point. It is based on manipulating and protecting strictly local (as opposed to global or topological) degrees of freedom. It may be called the “qubit approach” since often a union of 2-level systems (with state space $\bigotimes_i \mathbb{C}^2_i$) is proposed. Actually, the number of levels, or even
their finiteness, is not the essential feature; it is that each tensor factor of the state space (call it a qunit) is physically localized in space (or momentum space). The environment will, despite the best efforts of the experimentalist, interact directly with these “raw” qunits. It has long been recognized ([S, U]) that the raw qunits must encrypt fewer “logical qubits.” The demon in this approach is that a very low initial (or raw) error rate, perhaps $10^{-5}$ per operation, and large ratios of raw to logical qunits, $\sim 10^3$, seem to be required [Pr] to have a stable computational scheme. This problem pervades
Fig. 0.1. Shaded regions are the topological phases DE1, DE2, DE3, DE4, . . . . Doubly shaded regions are the computationally universal phases DE3, DE5, DE7, . . . . We have no way of predicting if the topological phases are actually in contact with each other as drawn. Solid lines are phase boundaries between inequivalent systems
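The loop-gas parameters quoted in the introduction can be tabulated directly. The following is a minimal sketch, not part of the original paper: the function names are hypothetical, and it only evaluates $d = 2\cos\frac{\pi}{\ell+2}$ and $n = d^2$, checks which levels fall in the range $0 < n \leq 2$ that the text identifies as critical at $k = 0$, and evaluates the Gibbs weight $w(\gamma)$.

```python
import math

def loop_gas_parameters(level):
    """Return (d, n) for level ell: d = 2 cos(pi / (ell + 2)), n = d**2."""
    d = 2.0 * math.cos(math.pi / (level + 2))
    return d, d * d

def in_critical_regime(level, tol=1e-12):
    """True when 0 < n <= 2 (up to rounding), the range quoted as critical at k = 0."""
    _, n = loop_gas_parameters(level)
    return 0.0 < n <= 2.0 + tol

def loop_weight(k, total_length, components, n):
    """Gibbs weight w(gamma) = exp(-k * total_length) * n**components."""
    return math.exp(-k * total_length) * n ** components

# Illustrative scan over the first few levels (names and range are assumptions).
for level in range(1, 7):
    d, n = loop_gas_parameters(level)
    print(level, round(d, 4), round(n, 4), in_critical_regime(level))
```

Under these assumptions only $\ell = 1$ and $\ell = 2$ land in the critical range, which is consistent with the claim that the unperturbed ground state space is unstable exactly for those two levels.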
all approaches based on local or “qubit” systems: liquid NMR, solid state NMR, electron spin, quantum dot, optical cavity, ion trap, etc. Kitaev’s seminal paper [K1] on anyonic computation, amplified in numerous private conversations, provides the foundation for the approach described here. Anyons are a (2 + 1)-dimensional phenomenon: when sites containing identical particles in a 2-dimensional system are exchanged (without collision) there are, up to deformation, two basic exchanges, a clockwise and a counter-clockwise half turn, or “braid” if the motion is considered as generating world lines in (2 + 1)-dimensional space-time. The two are inverse to each other but of infinite order rather than order 2. So whereas only the permutation needs to be recorded for exchanges in $\mathbb{R}^3$, in $\mathbb{R}^2$ “statistics” becomes a representation $\rho$ of a braid group $B$ into the unitary group of a Hilbert space $h$ encoding the internal degrees of freedom of the particle system: $\rho : B \longrightarrow U(h)$. Since a representation into unitary transformations (a “gate set” in quantum computer science language) is the heart of quantum computation, it is not really a surprise that any kind of particle system with a sufficiently general (it certainly must be nonabelian)
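Theory | dim on $T^2$ | number of constant color particles = labels, and their braid reps. | number of additional color reversing particles, and the total braid reps. | specific heat | nonsingular unitary topological modular functor (UTMF)?
DE1 | 1 | 1, T | 1, T | 2 | yes, but trivially
DE2 | 4 | 4, T | 1, N | 5 | no, rank(S-matrix) = 1
DE3 | 4 | 4, U | 4, U | 8 | yes
DE4 | 9 | 9, N | 4, N | 13 | yes
DE5 | 9 | 9, U | 9, U | 18 | yes
DE6 | 16 | 16, ? | 9, ? | 25 | no
$DE\ell$ | $\lceil \frac{\ell+1}{2} \rceil^2$ | $\lceil \frac{\ell+1}{2} \rceil^2$, U for $\ell \geq 5$, $\ell \neq 2 \bmod 4$ | $\lceil \frac{\ell}{2} \rceil^2$, U for $\ell \geq 5$, $\ell \neq 2 \bmod 4$ | $\lceil \frac{(\ell+1)^2}{2} \rceil$ | yes if $\ell \equiv 0, 1, 3 \bmod 4$; no if $\ell \equiv 2 \bmod 4$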
Fig. 0.2. For sufficiently many particles we have recorded if the (generalized) braid group representations are: T (trivial), A (abelian), N (nonabelian), U (computationally universal); we have called the total number of elementary particles, including those that reverse the $(|+\rangle, |-\rangle)$ coloring, “specific heat” as it counts local degrees of freedom above the ground state. The color constant elementary particles are the irreducible representations of the corresponding linear category (see Sect. 2). Coloring-reversing particles are explained at the end of Sect. 2
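The general-level row of Fig. 0.2 can be checked against the listed rows. The sketch below is an illustration only: it assumes the general entries are the ceilings $\lceil (\ell+1)/2 \rceil^2$, $\lceil \ell/2 \rceil^2$ and $\lceil (\ell+1)^2/2 \rceil$ (the rounding is inferred from the numerical rows DE1 to DE6), and the helper names are hypothetical. It also counts the label pairs $(x, y)$ with $0 \leq x, y \leq \ell$ and $x, y$ even, as described in the introduction.

```python
def even_label_pairs(level):
    """Count ordered pairs (x, y) with 0 <= x, y <= level and x, y both even."""
    evens = [x for x in range(level + 1) if x % 2 == 0]
    return len(evens) ** 2

def ceil_div(a, b):
    """Integer ceiling of a / b for positive b."""
    return -(-a // b)

def table_row(level):
    """Assumed closed forms: (labels, color-reversing particles, specific heat)."""
    labels = ceil_div(level + 1, 2) ** 2
    reversing = ceil_div(level, 2) ** 2
    specific_heat = ceil_div((level + 1) ** 2, 2)
    return labels, reversing, specific_heat

# Compare with the rows DE1..DE6 as listed in the figure.
for level in range(1, 7):
    print(level, table_row(level), even_label_pairs(level))
```

With these assumed forms the "specific heat" column equals the sum of the two particle counts, which matches its description in the caption as the total number of elementary particles.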
image $\rho(B)$ can be used to build a universal model for quantum computation. This has been shown in [FLW1, FLW2 and FKLW]. What are the advantages and disadvantages of anyonic versus qubit computation? The most glaring disadvantage of anyons is that no one is absolutely sure that nonabelian anyons exist in any physical system. Two dimensional electron liquids exhibiting the fractional quantum Hall effect (FQHE) are the most widely studied candidates for anyonic systems. The Laughlin state at filling fraction $\nu = 1/3$ has observed excitation charges of $(1/3)e$, and these are convincingly linked by the mathematical model with a statistical factor of $\omega = \pm e^{2\pi i/3}$ for the exchange of such pairs. Quasiparticle excitations with nonabelian statistics are one of the most exciting predictions of Chern-Simons theory as a model for the FQHE. With a few low level exceptions (e.g. $\ell = 1$, 2 or 4, when $G = SU(2)$), nonabelian anyonic systems are capable, under braiding, of realizing universal quantum computation [FLW2]. The essential point is that the “Jones representation” of the braid group (on sufficiently many strands) associated to the Lie group $SU(2)$ has a dense image at least in $SU(h) \subset U(h)$, $h$ an irreducible summand of the representation. At $\nu = 5/2$ according to [RR] the Hall fluid is modelled by a $U(1)$ theory coupled to CS2 [the Chern-Simons theory of $SU(2)$ at the 4th root of unity (level $\ell = 2$)]; the latter is a theory with a nonabelian “Clifford group” representation. This model was selected from conformal field theories to match expected ground state degeneracies and central charge, and is further supported by numerical evidence on the overlap of trial wave functions. Though very interesting, this representation is still discrete and is not universal in the sense of [FLW1]. However at $\nu = 8/5$, with perhaps weaker numerical support [RR], it is thought that the Hall fluid model contains CS3 (level $\ell = 3$, 5th root of unity).
Here braiding and fusing the excitations would yield universal quantum computation [FLW1]. Even if we accept, for the sake of discussion, that FQHE systems have computationally universal anyons, we are still a long way from building a quantum computer. FQHE systems are very delicate:
1. The required crystals have been grown successfully only in a few laboratories.
2. The temperatures at which the finer plateaus are stable are of order millikelvin.
3. The chiral asymmetry intrinsic to the effect (for CS2 and CS3 the central charge is $\frac{3}{2}$ and $\frac{9}{5}$ respectively) requires an enormous transverse magnetic field, of order 10 to 15 Tesla, to reduce the magnetic length to where conduction plateaus are observed.
At feasible magnetic lengths,$^1$ the Coulomb interaction between electrons is at least three orders of magnitude weaker than in solids. Corresponding to the weakness of these interactions, the spectral gap protecting topological phases is necessarily quite small. Perhaps for this reason, even the most basic experiments to prove existence of “nonabelions” have not been carried out, and the use of these systems for computations appears unrealistic. For applications such as breaking the cryptographic scheme RSA, it can be estimated that several thousand anyons must be formed, braided at will (perhaps implementing tens of thousands of half twists), and finally fused. This appears to be a nearly impossible task in a FQHE system.
The main point of this paper is that computationally universal anyons may be available in more convenient systems. $H_{\circ,\ell}$ is a local model for a paramagnetic system of Ising spins with short range antiferromagnetic properties. Written out in products of Pauli matrices, $H_{\circ,\ell}$ is seventh order (on the standard triangular lattice) and thus looks complicated compared to, say, the Heisenberg magnet. But geometrically it is quite simple and its ground states are known exactly. A 2-dimensional material in the universality class $DE\ell$ proposed as the ground state space for $H_{\epsilon,\ell}$, $\ell \neq 1$ or $2 \bmod 4$, will have excitations (“quasiparticles”) capable of universal fault tolerant quantum computation within a model that allows creation, braiding, fusion and measurement of quasiparticle type. A topological feature is not too easy to detect; by definition, topological properties cannot be altered or measured by purely local operators but instead require something akin to an Aharonov-Bohm holonomy experiment.
So perhaps the universality class of $H_{\epsilon,\ell}$ already exists in surface layer physics but is waiting to be discovered. Or perhaps with $H_{\circ,\ell}$ in mind something in its (perturbed) universality class can be engineered. If this is possible there would be no reason to expect the system to be particularly delicate. The characteristic energies for magnets are often several hundred Kelvin [NS]. Furthermore the modular functor $DE\ell$ (this includes the information of the various braid group representations, 6j-symbols, S and fusion matrices) which arises is amphichiral, the central charge $c = 0$, so there is no reason that time symmetry must be broken and no apparent need for a strong transverse magnetic field. These two features are in marked contrast to the delicate FQHE systems.
Subsequent to the initial draft of this paper a different local Hamiltonian $\overline{H}_{\circ,\ell}$ was found which bears the same relation as $H_{\circ,\ell}$ to the topological modular functors $DE\ell$, but has potential advantages:
1. it is expressible as 4th rather than 7th order interactions, and
2. its classical analog is the much studied self dual Potts model for $q = \left(2\cos\frac{\pi}{\ell+2}\right)^4$.
We have added a Sect. 1.1 following Sect. 1 to explain this alternative microscopic model.
We make no proposal here for a specific implementation of $H_{\circ,\ell}$ or for how to trap and braid its excitations, but we hope that models in the spirit of [NS] for the high $T_c$ cuprates may soon be proposed. In this regard, we note that relatively simple, but still non-classical, braiding statistics have been proposed [SF] in conjunction with the phenomena of spin-charge separation [A] for high $T_c$ cuprate superconductors above their $T_c$. Certain −1 phases are predicted to occur when braiding the electron fragments “visons” and “chargons” around each other and around ground state defects called “holons”. Also contained in [SF] is the suggestion that topological charges might, in passing through a phase transition, become classical observables, e.g. magnetic vortices. Similarly other phase transitions might link higher ($\ell = 3$) to lower ($\ell = 1$) topological phases and might be useful in measuring quasiparticle types. Whether even the simplest topological theory is realized in any known superconductor is open, but [SF] is cited as precedent for anyonic models for solid state magnetic systems with high characteristic energies. So while the FQHE motivates this paper, we hope we have steered toward its mathematical beauty and away from its experimental difficulties.
1 In semiconductors with dielectric constant ≈ 10 and $|B| \approx 10$ Tesla the characteristic length $\ell = \sqrt{\hbar/e\beta} \approx 150$ Å, compared to about 4 Å separation between the ions in a crystalline solid.
What are the generic advantages of anyonic computation? First, information is stored in topological properties (“large scale entanglement”) of the system that cannot be altered (or read) by local interaction. This affords a kind of physical stability against error rather than the kind of combinatorial error correction scheme envisioned in the qubit models: “hardware” rather than “software” error correction. Second, at least in the simplest analysis,$^2$ one expects excitations of a stable system to be well localized with exponentially decaying tails. Thus physical braiding should approximate mathematical braiding, $\rho : B \longrightarrow U(h)$, up to a “tunnelling” error of the form $e^{-cL}$, where $c$ is a positive constant and $L$ is a microscopic length scale describing how well separated the excitations are kept during the braiding process. This error scaling is highly desirable and seems to have no analog in qubit models.
While tunnelling treats virtual errors, errors which borrow energy briefly from the vacuum, actual errors would be expected to scale like $e^{-c_1 T/T_\circ}$ where $T_\circ$ is a characteristic energy for the system and $T$ the operating temperature. This is essentially the error analysis Kitaev made for his anyonic system, the toric code [K1].
This paper draws on three sources of inspiration: 1) Kitaev’s paper [K1] on anyonic computation, 2) the FQHE, and 3) rigidity in the classification of von Neumann algebra subfactors. Rigidity implies that certain monoidal tensor categories have very few ideals. But when interpreted physically, “ideal” means “definable by local conditions”, so we find that a certain locality assumption (Ansatz 3.4) strongly limits the physics. This provides an algebraic approach to the perturbation theory of $H_{\circ,\ell}$, and perhaps yields greater insight than would be possible by analytic methods. We find that for $H_{\circ,\ell}$ the polylog extensively degenerate space of 0-modes $G_{\circ,\ell}$ possesses, in addition to its linear structure, an important “multiplicative” structure (the structure of a monoidal tensor category) which, we argue, should be preserved under a perturbation. The rigidity of type II$_1$ factor pairs, an aspect of which is stated as Thm 2.1, provides a unique candidate for the (still finitely degenerate) “perturbed” ground state space $G_{\epsilon,\ell}$ of $H_{\epsilon,\ell}$. The space $G_{\epsilon,\ell}$ is a braid group representation space with the representation induced by an adiabatic motion of quasiparticle excitations. Throughout, the excitations on a surface $Y$ are assumed to be localized near points, so excited states of $H_{\epsilon,\ell}(Y)$ become ground states of $H_{\epsilon,\ell}(Y^-)$ but now on a punctured surface $Y^-$ with “boundary conditions” or more exactly “labels” (see Sect. 2). We treat excited states indirectly as ground states on the more complicated surface $Y^-$.
The existence of a stable phase $G_{\epsilon,\ell} \cong DE\ell$ will be argued by analogy with the FQHE where topological phases are found to be stable, from algebraic uniqueness, and via “consistency checks”. But these arguments constitute neither a mathematical proof
2 Kivelson and Sandih [KS] find that Landau level-mixing in FQHE can thicken the tails to polynomial decay, but this is not a fundamental effect.
nor a numerical verification. The latter may be exactly as far off as a working quantum computer. It was precisely the problem of studying quantum mechanical Hamiltonians in the thermodynamic limit, e.g. questions of spectral gap, that led Feynman [Fe] to dream of the quantum computer in the first place. It is curiously self-referential that we may need a quantum computer to “prove” numerically that a given physical system works like one.
We turn now to the definition of $H_{\circ,\ell}$ and $\overline{H}_{\circ,\ell}$, returning later to amplify on the relations to quantum computing, $C^*$-algebras, Chern-Simons theory, and topology. (The connection between Chern-Simons theory and complexity classes is discussed in [F1].)
In addition to Alexei Kitaev, I would like to thank Christian Borgs, Jennifer Chayes, Steven Kivelson, Chetan Nayak, Oded Schramm, Kevin Walker, and Zhenghan Wang for stimulating conversations on the proposed model.
1. The Model
The model describes a system of spin $= \frac{1}{2}$ particles located at the vertices $v$ of a triangulated surface $Y$. The Hilbert space is $H = \bigotimes_{v=1}^{n} \mathbb{C}^2_v$, where $\mathbb{C}^2_v$ is the local degree of freedom $\{|+\rangle, |-\rangle\}$ at the vertex $v$. The basic Hamiltonian $H_{\circ,\ell}$ is written out below as a sum of local projections and thus is positive semidefinite. The ground state space (energy = 0 vectors) $G_{\circ,\ell}$ of $H_{\circ,\ell}$ can be completely understood (this is unusual since these projectors do not commute) and identified (as $n \longrightarrow \infty$) with what we call the even Temperley-Lieb surface “algebra” $ETLs_d$ where $d = 2\cos\frac{\pi}{\ell+2}$. Ultimately our focus will be on the ground states on a multiply punctured disk, the punctures corresponding to anyonic excitations (see Sect. 5). Two issues arise: (1) non-trivial topology and (2) boundary conditions. The boundary conditions are quite tricky so it is best to work first with closed surfaces of arbitrary genus (even though these are not our chief interest) to understand the influence of topology alone, “liberated” from boundary conditions.
$Y$ will denote a compact oriented surface throughout. In combinatorial contexts, $Y$ will be given a triangulation $\Delta$ with dual cellulation $C$. Initially, we consider the case where $Y$ is closed, boundary $Y = \partial Y = \emptyset$. We will speak in terms of the dual cellulation by 2-cells or “plaques” $c$. For example, if $Y$ is a torus it may be cellulated with regular hexagons. This is a perfectly good example to keep in mind but higher genus surfaces are also interesting, while the sphere is less so. Soon we will consider surfaces with boundary.
Distributing $\otimes$ over $\oplus$, one writes $H$ = span {classical spin configurations on plaques} =: span $\{s_i\}$. Let $c$ be a plaquet, $s$ a classical spin configuration and $s^c$ that configuration with reversed spin ($+ \longrightarrow -$ and $- \longrightarrow +$) at $c$. For $1 \leq i, j \leq 2^n$ define $h_{ij}(c) = 1$ if (1) $s_j = s_i^c$ and (2) $s_i$ assigns the same spin ± to $c$ and all its immediate neighbors, and $h_{ij}(c) = 0$ otherwise. Define $g_{ij}(c) = 1$ if (1) $s_j = s_i^c$ and (2) the domain wall $\gamma_{s_i}$ between + and − plaques, in the spin configuration $s_i$, meets $c$ in a single connected topological arc, and $g_{ij}(c) = 0$ otherwise. We define:
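$$H_{\circ,\ell} = \sum_{\substack{\text{plaques } c, \\ \text{pairs of spin states } s_i, s_j}} g_{ij}(c)\,\bigl(|s_i\rangle - |s_j\rangle\bigr)\bigl(\langle s_i| - \langle s_j|\bigr) \;+\; \kappa \sum_{\substack{\text{plaques } c, \\ \text{pairs of spin states } s_i, s_j}} h_{ij}(c)\,\Bigl(|s_i\rangle - \tfrac{1}{d}|s_j\rangle\Bigr)\Bigl(\langle s_i| - \tfrac{1}{d}\langle s_j|\Bigr). \qquad (1.1)$$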
The constant $\kappa$ is positive and may, in this paper, be set as $\kappa = 1$. To help digest the notation: each of the two sums has $n 2^n$ terms, most of which are zero. It is easy to see that $g_{ij} = g_{ji}$. If the domain wall $\gamma$ meets $c$ in a topological arc, reversing the spin of $c$ isotopes the domain wall across $c$ to the complementary arc $= \partial c \setminus \gamma$. Contrariwise if $h_{ij} = 1$ then $h_{ji} = 0$. The parameter $d$ could be any positive real number but we will be concerned mainly with $d = 2\cos\frac{\pi}{\ell+2}$, $\ell = 1, 2, 3, \cdots$. The cases $\ell = 2$, $d = \sqrt{2}$ and $\ell = 3$, $d = \frac{1+\sqrt{5}}{2}$, the golden ratio, are of particular interest. Finally, each term in the definition of $H_{\circ,\ell}$ should be read, according to the usual ket-bra notation, as orthogonal projection onto the indicated vector: $|s_i\rangle - |s_j\rangle$ or $|s_i\rangle - \frac{1}{d}|s_j\rangle$. These vectors (whose projectors occur nontrivially in the sums) are certainly not orthogonal to each other (using the inner product making $|+\rangle$ Hermitian orthonormal to $|-\rangle$ in $\mathbb{C}^2$, extended to define the tensor product Hermitian structure on $H$), so those individual projectors do not commute. It is therefore surprising at first that we can completely describe the (space of) zero modes $G_{\circ,\ell}$ of this positive semidefinite form, $H_{\circ,\ell}$. However once the description is given the surprise will evaporate, for it will be clear how $H_{\circ,\ell}$ was “engineered” precisely to yield this result.
Identifying $G_{\circ,\ell}$ is the next goal. Associate to the closed oriented surface $Y$ an infinite dimensional vector space $ETL_d(Y)$, the even Temperley-Lieb space of $Y$. It is the $\mathbb{C}$-span of “isotopy classes” of closed bounding 1-manifolds $\gamma$ modulo a relation called d-isotopy. The “bounding” condition means that $\gamma$ is a domain wall separating $Y$ into two regions which could be labelled “$|+\rangle$” and “$|-\rangle$”. Neither $\gamma$ nor the regions are presumed to be connected. We do not orient $\gamma$, so we do not distinguish here between states which differ by globally interchanging $|+\rangle$ and $|-\rangle$. The term “1-manifold” means $\gamma$ does not branch or terminate at any point.
Isotopy, of course, means gradual deformation. The d-isotopy relation $\gamma - d(\gamma \setminus \gamma_\circ) = 0$, when imposed, says that if a component $\gamma_\circ$ of $\gamma$ bounds a disk in $Y$ then $\gamma = d(\gamma \setminus \gamma_\circ)$, $d$ times the value on the submanifold with $\gamma_\circ$ deleted. We often work with the dual $ETL^*_d(Y)$, which consists of the functions $f$ on bounding isotopy classes satisfying $f(\gamma) = d\,f(\gamma \setminus \gamma_\circ)$. Let $\gamma^!$ be a $\gamma$ as above enhanced by one of the two choices for “signing” the complementary regions. Define $ETL^!_d(Y)$ to be $\mathbb{C}$-span $\{\gamma^!\}$, so that $ETL^{!*}_d(Y)$ consists of the functions from $\{\gamma^!\}$ obeying the d-isotopy relation.
Both the definition of $H_{\circ,\ell}$ and $ETL^*_d$ can easily be extended to $Y$ a compact surface with boundary $\partial Y$, given a fixed boundary condition, the points where $\gamma$ meets $\partial Y$ (transversely). So if $\Delta$ is a triangulation of $(Y, \partial Y)$ with dual cellulation $C$ and if the spin configuration $|+\rangle$ or $|-\rangle$ is fixed at every vertex (= dual cell) on $\partial Y$, then formula (1.1) defines a Hamiltonian operator on the configurations with that boundary condition provided, in both terms, we restrict the sum to plaques $c$ which do not meet $\partial Y$. This prevents “fluctuations” from altering the boundary conditions. Define $H_{\circ,\ell}(Y, \partial Y)$ in this way. Similarly if a 2-coloring (or +, − “signing”) of $Y$ is fixed along $\partial Y$ we may consider a relative $\gamma^!$ as an extension of this signing to a division of $Y$ into + and − signed regions (which are presumed to lie in $Y$ as subsurfaces). Now relative to the boundary condition (the signing), $ETL^{!*}_d(Y, \partial Y)$ is defined as the functions from $\{\gamma^!\}$ to $\mathbb{C}$ which obey the d-isotopy relation; $ETL^*_d(Y, \partial Y)$ is the set of such functions invariant under $-$, the global $|+\rangle \longleftrightarrow |-\rangle$ interchange.
If $\Delta$ is a triangulation on a surface $Y$, with or without boundary, then we have the combinatorial versions of $ETL^!_d(Y)$ and $ETL_d(Y)$, namely $ETL^{!,\Delta}_d(Y)$ and $ETL^{\Delta}_d(Y)$ (resp.), defined using only $(|+\rangle, |-\rangle)$ 2-colorings in which each dual 2-cell (plaquet) is + or −.
There are natural maps of $\mathbb{C}$-vector spaces:
$$ETL^{!,\Delta}_d(Y) \to ETL^!_d(Y) \quad \text{and} \quad ETL^{\Delta}_d(Y) \to ETL_d(Y). \qquad (1.2)$$
These maps are of course never onto (only the simpler d-isotopy classes are realized). Also for certain triangulations the kernel can be non-zero (due to “stuck” configurations). However, it is easy to see that as $\Delta$ is subdivided, $ETL^{\Delta}_d(Y)$ approximates $ETL_d(Y)$ in the sense that the direct limit $\varinjlim ETL^{\Delta}_d(Y) \cong ETL_d(Y)$; similarly $\varinjlim ETL^{!,\Delta}_d(Y) \cong ETL^!_d(Y)$.
Let $\Delta$ be a fixed triangulation of $Y$ (with fixed boundary condition, a $-$-projective $(|+\rangle, |-\rangle)$ 2-coloring, if $\partial Y \neq \emptyset$), and set $G_{\circ,\ell}(Y, \Delta)$ = zero modes (ground state space) of the positive semidefinite $H_{\circ,\ell}$ defined above (1.1). Clearly $H_{\circ,\ell}$ is $-$-invariant and so $G_{\circ,\ell}$ is $-$-invariant. Note that $-$ is not always fixed point free: on $Y = T^2$, the configuration which is $|+\rangle$ on an essential annulus $A \subset T^2$ and $|-\rangle$ on $T^2 \setminus A$ is a $-$-fixed point. Let $G^+_{\circ,\ell}(Y, \Delta)$ denote the +1-eigenspace of $-$.
Proposition 1.1. For $Y$ a closed surface or a surface with fixed boundary conditions, there are natural isomorphisms $G_{\circ,\ell}(Y, \Delta) \cong ETL^{!,\Delta}_d(Y)$ and $G^+_{\circ,\ell}(Y, \Delta) \cong ETL^{\Delta}_d(Y)$.
Proof. From line (1.1), $\Psi \in G_{\circ,\ell}(Y, \Delta)$ iff $\Psi = \sum_i f(|s_i\rangle)\,|s_i\rangle$ for some linear functional $f$ obeying the d-isotopy relation, thus $G_{\circ,\ell}(Y, \Delta) \cong ETL^{!,\Delta}_d(Y)$. The involution $-$ acts compatibly on both sides so $ETL^{\Delta}_d(Y)$ may be identified as the +1 eigenspace of $-$ on the r.h.s. □
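The mechanism behind the proof can be seen on a single pair of configurations. Because the Hamiltonian (1.1) is a sum of positive semidefinite terms, a vector is a zero mode exactly when every individual term annihilates it. The following minimal sketch is an illustration under assumed names, restricted to the two-dimensional span of one pair $s_i$, $s_j = s_i^c$; it is not taken from the paper. It checks that a term of the first type forces equal amplitudes on $s_i$ and $s_j$ (isotopy invariance), while a term of the second type forces the amplitude of $s_j$, the configuration with the extra small loop around $c$, to be $d$ times that of $s_i$ (the d-isotopy relation).

```python
import math

def projector(v):
    """Orthogonal projector onto the line spanned by the real vector v."""
    norm2 = sum(x * x for x in v)
    return [[a * b / norm2 for b in v] for a in v]

def apply_matrix(m, psi):
    """Matrix-vector product for nested-list matrices."""
    return [sum(row[k] * psi[k] for k in range(len(psi))) for row in m]

def is_zero_mode(v, psi, tol=1e-12):
    """True when the projector onto v annihilates psi."""
    return all(abs(x) < tol for x in apply_matrix(projector(v), psi))

level = 3                                      # illustrative choice of level
d = 2.0 * math.cos(math.pi / (level + 2))      # d = 2 cos(pi / (ell + 2))

# Basis order (s_i, s_j), with s_j equal to s_i flipped at plaquet c.
isotopy_vector = [1.0, -1.0]       # |s_i> - |s_j>        (g_ij-type term)
bubble_vector = [1.0, -1.0 / d]    # |s_i> - (1/d)|s_j>   (h_ij-type term)

print(is_zero_mode(isotopy_vector, [1.0, 1.0]))  # equal amplitudes
print(is_zero_mode(bubble_vector, [1.0, d]))     # amplitude ratio d
```

In this toy setting the amplitudes play the role of the functional $f$: the first check corresponds to $f$ being constant along an isotopy, the second to $f(\gamma) = d\,f(\gamma \setminus \gamma_\circ)$.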
When we come (Sect. 3) to imposing the mathematical structure of a modular functor (or TQFT) on the ground state spaces $G^+_{\circ,\ell}(Y)$ for various surfaces $Y$ we will need to impose a base point ∗ on each boundary component $C \subset \partial Y$. This is directly analogous to the framing of the Wilson loop in [Wi]; in fact the base point moving in time defines the first direction of a normal frame to the Wilson loop in the (2 + 1)-dimensional space-time picture. As in the previous application, the base point is introduced for mathematical rather than physical reasons. It allows the state vectors in each conformal block to be identified precisely and not merely up to a (block-dependent) phase ambiguity. Concretely, in our model the base point prevents domain walls from spinning around a puncture. Note that if (a superposition of) domain walls $\gamma$ represents an eigenspace for the Dehn twist around the puncture with eigenvalue $\lambda \neq 1$ and if twisting is not prevented then the relation $|\gamma\rangle = \lambda|\gamma\rangle$ will occur, killing the state $|\gamma\rangle$, which is certainly not desired. I thank Nayak for pointing out that although choosing base points breaks symmetry, none of the physics depends on which base points are chosen. The Hamiltonian has a $\underbrace{U(1) \times \cdots \times U(1)}_{k}$-gauge symmetry where $k$ = # boundary components of $Y$.
1.1. An alternative microscopic model. In this subsection we present an alternative Hamiltonian, $H^{\square}_{\circ,\ell}$, on a cellulated surface (Y, C). We do not restrict to triangulations since the square lattice actually yields the simplest form. It has the same relation, in the infrared, to topological theories as does $H_{\circ,\ell}$. In this model the degrees of freedom are on bonds and the loops lie in a “midlattice” separating the $|+\rangle$ clusters from the $|-\rangle$ dual clusters (isolated vertices and isolated dual vertices count as clusters). There are perhaps three advantages:
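1. On a square lattice, all terms in the Hamiltonian have order 4 (as compared to seven in the previous model). It is simple enough that we expand it as a product of Pauli matrices.
2. The corresponding classical statistical mechanical model is the Potts model in cluster expansion (FK) form with $q = \left(2\cos\frac{\pi}{\ell+2}\right)^4$.
3. The loops in this model are “fully packed” so no isotopy is possible, only d-isotopy. In particular the total length of the loops separating $|+\rangle$ from $|-\rangle$ is configuration independent.
Here is $H^{\square}_{\circ,\ell}$; the notation is explained below.
$$H^{\square}_{\circ,\ell} = \sum_{\substack{\text{plaquets with}\\ \text{a marked edge}}} \Big(|3\rangle - \tfrac{1}{d}|4\rangle\Big)\Big(\langle 3| - \tfrac{1}{d}\langle 4|\Big) \;+\; \kappa \sum_{\substack{\text{vertices with a}\\ \text{marked incoming bond}}} \Big(|\hat 1\rangle - \tfrac{1}{d}|\hat 0\rangle\Big)\Big(\langle \hat 1| - \tfrac{1}{d}\langle \hat 0|\Big) \tag{1.3}$$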
κ is a positive constant which for symmetry we suppose to be κ = 1. Again $d = 2\cos\frac{\pi}{\ell+2}$. On each bond there is a spin = 1/2 degree of freedom $= \mathbb{C}^2 = \mathrm{span}\{|+\rangle, |-\rangle\}$. The first summation is over all plaquets (2-cells) with a marked edge. (So, if the surface is a 10 × 10 torus cellulated with 100 squares, the first summand contains 400 terms.) Each term in the first sum is an orthogonal projection onto the vector $|-\rangle \otimes |+\rangle \otimes |+\rangle \otimes \cdots \otimes |+\rangle - \frac{1}{d}\,|+\rangle \otimes |+\rangle \otimes |+\rangle \otimes \cdots \otimes |+\rangle$ where the tensor factors begin with the bond containing the dot and proceed counterclockwise around the plaquet. (Of course this projector is understood to be tensored with the identity over all remaining bonds.) Perhaps this is confusing, but we have used the notation $|3\rangle$ for the first and $|4\rangle$ for the second basis vector in this combination because in the square lattice case, those numbers count the + signs: a more elaborate notation would be $|n_i - 1\rangle - \frac{1}{d}|n_i\rangle$. The second term is the “double dual” of the first where one duality swaps cellulation with dual cellulation (homology with cohomology) and the other duality swaps $|+\rangle$ and $|-\rangle$. Thus the second summation is over vertices with a marked incoming bond; the vector $|\hat 1\rangle$ denotes $|+\rangle \otimes |-\rangle \otimes |-\rangle \otimes \cdots \otimes |-\rangle$ and $|\hat 0\rangle$ denotes $|-\rangle \otimes |-\rangle \otimes |-\rangle \otimes \cdots \otimes |-\rangle$, again reading counterclockwise from the dot. The ∧ reminds us that we are reading around a site not a plaquet. In the case of the square lattice the two types of terms may be expressed as a 4th degree polynomial in Pauli matrices, written in the basis $(|+\rangle, |-\rangle)$: $\sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$ and $\sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$. $H^{\square}_{\circ,\ell}$ has two types of terms:
$$\Big[\tfrac{1}{16}(I - \sigma_z^0) - \tfrac{1}{8d}\,\sigma_x^0 + \tfrac{1}{16d^2}(I + \sigma_z^0)\Big] \otimes (I + \sigma_z^1) \otimes (I + \sigma_z^2) \otimes (I + \sigma_z^3)$$
$$= \tfrac{1}{16}\Big[\tfrac{d^2+1}{d^2}\,I - \tfrac{2}{d}\,\sigma_x^0 + \tfrac{1-d^2}{d^2}\,\sigma_z^0\Big] \otimes (I + \sigma_z^1) \otimes (I + \sigma_z^2) \otimes (I + \sigma_z^3),$$
and
$$\Big[\tfrac{1}{16}(I + \sigma_z^0) - \tfrac{1}{8d}\,\sigma_x^0 + \tfrac{1}{16d^2}(I - \sigma_z^0)\Big] \otimes (I - \sigma_z^1) \otimes (I - \sigma_z^2) \otimes (I - \sigma_z^3)$$
$$= \tfrac{1}{16}\Big[\tfrac{d^2+1}{d^2}\,I - \tfrac{2}{d}\,\sigma_x^0 + \tfrac{d^2-1}{d^2}\,\sigma_z^0\Big] \otimes (I - \sigma_z^1) \otimes (I - \sigma_z^2) \otimes (I - \sigma_z^3).$$
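The square-lattice plaquet term of (1.3) can be checked numerically. The sketch below is an illustration only: it assumes the basis ordering $(|+\rangle, |-\rangle)$, takes the marked bond as the first tensor factor, and uses $\ell = 3$ as a sample value. The helper names (`kron`, `lin`, `term_from_vector`, `term_from_paulis`) are ours and do not come from the paper. It builds the term once from the vector $|3\rangle - \frac{1}{d}|4\rangle$ and once from the Pauli-matrix expression, and also confirms that a state with $a_{|4\rangle} = d\,a_{|3\rangle}$ is annihilated.

```python
import math

I2 = [[1.0, 0.0], [0.0, 1.0]]
SZ = [[1.0, 0.0], [0.0, -1.0]]  # sigma_z in the basis (|+>, |->)
SX = [[0.0, 1.0], [1.0, 0.0]]   # sigma_x


def kron(a, b):
    """Kronecker product of two matrices stored as nested lists."""
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]


def lin(*terms):
    """Linear combination sum(c * m) over (c, m) pairs of equally sized matrices."""
    rows, cols = len(terms[0][1]), len(terms[0][1][0])
    return [[sum(c * m[r][k] for c, m in terms) for k in range(cols)]
            for r in range(rows)]


def special_d(ell):
    """d = 2 cos(pi / (ell + 2))."""
    return 2.0 * math.cos(math.pi / (ell + 2))


def term_from_vector(d):
    """|v><v| for v = |-+++> - (1/d)|++++>, marked bond first (most significant)."""
    v = [0.0] * 16
    v[0b1000] = 1.0       # |-+++> = |3>  (bit value 1 encodes |->)
    v[0b0000] = -1.0 / d  # |++++> = |4>
    return [[v[r] * v[c] for c in range(16)] for r in range(16)]


def term_from_paulis(d):
    """The same operator written as a degree-4 polynomial in Pauli matrices."""
    first = lin((1 / 16, lin((1, I2), (-1, SZ))),
                (-1 / (8 * d), SX),
                (1 / (16 * d * d), lin((1, I2), (1, SZ))))
    plus = lin((1, I2), (1, SZ))
    return kron(kron(kron(first, plus), plus), plus)


def max_abs_diff(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def matvec(m, v):
    return [sum(x * y for x, y in zip(row, v)) for row in m]
```

A state $a_{|3\rangle}|3\rangle + a_{|4\rangle}|4\rangle$ with $a_{|4\rangle} = d\,a_{|3\rangle}$ lies in the kernel of this term, which is the relation the zero modes must satisfy.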
The proper context for understanding $H^{\square}_{\circ,\ell}$ is Baxter’s “mid lattice” [B]. If $C$, $C^*$ are cellulation and dual cellulation, let $c' = c \cap c^*$ be the general intersection of a plaquet and a dual plaquet. Put a center ∗ in $c'$ and a center point $*_1, \dots, *_n$ in each of its boundary 1-cells. Subdivide $c'$ by the cone of $\bigcup_{i=1}^{n} *_i$ to ∗ and let $c''$ denote the general plaquet of this subdivision. The collection $\{c''\}$ are precisely the plaquets of the “mid lattice”. As an example, for the square lattice of unit size the dual lattice is shifted by (1/2, 1/2) and the resulting mid lattice is spanned by vectors {(0, 1/4), (1/4, 0)}. A classical configuration $s : \{\text{bonds of } C\} \to \{|+\rangle, |-\rangle\}$ is encoded as the union of bonds on which s is $|+\rangle$, the components of which are called clusters, and the union of the duals of bonds on which s is $|-\rangle$, whose components are called dual clusters. There is a well defined 1-manifold (multi-loop) $\gamma_s$ in the mid lattice which separates clusters from dual clusters. The Hamiltonian $H^{\square}_{\circ,\ell}$ builds in dynamics which fluctuates broken ($|3\rangle$) and complete ($|4\rangle$) boxes and broken ($|\hat 1\rangle$) and complete ($|\hat 0\rangle$) dual boxes with a prescribed weight factor = d. The vector $|4\rangle$ encodes a small face-centered loop, $O_f$, in the mid lattice while $|\hat 0\rangle$ encodes a small vertex-centered loop, $O_v$. The first term projectors, by annihilating $|3\rangle + d|4\rangle$, enforce a relation. If $\Psi = \sum_i a_i \Psi_i \in G^{\square}_{\circ,\ell}$, the zero modes of $H^{\square}_{\circ,\ell}$, and i is written out as (index on boundary plaquet, distant indices $\vec\kappa$) then
$$d\, a_{|3\rangle,\vec\kappa} = a_{|4\rangle,\vec\kappa}, \quad \text{for all } \vec\kappa. \tag{1.4}$$
Examining this relation on mid lattice multiloops γ (and suppressing $\vec\kappa$) we see that $\gamma_{|4\rangle}$ differs from $\gamma_{|3\rangle}$ in that an $O_f$ has been added to the isotopy class of $\gamma_{|3\rangle}$ by “pinching off” a small bend in $\gamma_{|3\rangle}$. Correspondingly $\gamma_{|4\rangle}$ has its coefficient $a_{|4\rangle}$ equal to d times the coefficient $a_{|3\rangle}$ of $\gamma_{|3\rangle}$. Similarly for the double dual: up to isotopy $\gamma_{|\hat 0\rangle} = \gamma_{|\hat 1\rangle} \cup O_v$ and for a zero mode the coefficients must satisfy:
$$d\, a_{|\hat 1\rangle,\vec\kappa} = a_{|\hat 0\rangle,\vec\kappa}, \tag{1.5}$$
analogous to line (1.2) and Proposition 1.1 we have:
Proposition 1.2. There are natural maps: $\widetilde{ETL}{}^{C}_d(Y) \to \widetilde{ETL}_d(Y)$ and $ETL^{C}_d(Y) \to ETL_d(Y)$. In the (direct) limit they become isomorphisms. There are natural isomorphisms: $G^{\square}_{\circ,\ell}(Y, C) \cong \widetilde{ETL}{}^{C}_d(Y)$ and $G^{\square,+}_{\circ,\ell}(Y, C) \cong ETL^{C}_d(Y)$. $\square$
Proposition 1.2 replaces the triangulation $\Delta$ with the cellulation C, so $\widetilde{ETL}{}^{C}_d$ means formal configurations, $\sum_s a_s s$, where s assigns $|+\rangle$ or $|-\rangle$ to the bonds of C and $a_s$ obeys (1.4) and (1.5). $ETL^{C}_d$ are formal configurations which are also invariant under the global swap $\bar{\ }$, $|+\rangle \leftrightarrow |-\rangle$. In the limit this relation expresses d-isotopy of the mid lattice domain wall γ. Note, however, that the first two maps mentioned in Proposition 1.2 are not necessarily injective. The situation is summed up by the following example. On a 2 × 2 square torus the two possible staircase diagonals, i.e. $|+\rangle$ on one positively sloping diagonal and $|-\rangle$ on the complement, do not fluctuate (are not in the same ergodic component) whereas already in the 3 × 3 torus there is enough room that any two staircases of slope = 1 are connected by fluctuations.
Fig. 1.3
Remark 1.3. Because of their importance in solid state physics, we observe that a certain ring exchange Hamiltonian $H_\circ$ is the parent of all $H^{\square}_{\circ,\ell}$ in that the zero-modes $G_\circ$ contain the zero modes $G^{\square}_{\circ,\ell}$, for all $\ell$. Each $G^{\square}_{\circ,\ell}$ arises from a distinct linear constraint on $G_\circ$.
$$H_\circ = \sum \big(|3\rangle - |3'\rangle\big)\big(\langle 3| - \langle 3'|\big) + \kappa \sum \big(|\hat 1\rangle - |\hat 1'\rangle\big)\big(\langle \hat 1| - \langle \hat 1'|\big)$$
$|3'\rangle$ is like $|3\rangle$ except cycled one step: $|3'\rangle = |+\rangle \otimes |-\rangle \otimes |+\rangle \otimes \cdots \otimes |+\rangle$, similarly $|\hat 1'\rangle = |-\rangle \otimes |+\rangle \otimes |-\rangle \otimes \cdots \otimes |-\rangle$. Note that $(|3\rangle - |3'\rangle) \in \mathrm{span}\{|3\rangle - \frac{1}{d}|4\rangle,\ |3'\rangle - \frac{1}{d}|4\rangle, \text{ etc.}\}$, so $G^{\square}_{\circ,\ell} \subset G_\circ$. The zero modes $G_\circ$ can be identified with the (combinatorial) isotopy classes of domain walls between $|+\rangle$ and $|-\rangle$ regions. Measuring spins (by a family of $\sigma_z$’s) converts a ground state vector $\Psi \in G_{\circ,\ell}$ or $G^{\square}_{\circ,\ell}$ into a classical probabilistic state meas.($\Psi$) which turns out to be a Gibbs state. The statistical physics of meas.($\Psi$) plays an important role in Sect. 3. First, however, we use Sect. 2 to lay down the algebraic framework.
2. Things Temperley-Lieb
The generic Temperley-Lieb algebra is a tensor algebra over the complex numbers adjoined an indeterminate d. Often d is written in terms of another indeterminate A as $d = -A^2 - A^{-2}$. In degree n the algebra $TL_n$ has generators $1, e_1, \dots, e_{n-1}$ and the relations $e_i^2 = e_i$, $e_i e_j = e_j e_i$ if $|i - j| \geq 2$, and $e_i e_{i\pm 1} e_i = \frac{1}{d^2} e_i$. Pictorially, after V. Jones and L. Kauffman, we may think of the generators as pictures of arcs disjointly imbedded in rectangles (multiplied by the coefficient 1/d) and multiplication as vertical stacking. For example for n = 4, we have:
Fig. 2.1 (the generators $1, e_1, e_2, e_3$ of $TL_4$; each $e_i$ is $\frac{1}{d}$ times a picture with a pair of arcs joining strands i and i + 1)
There is a convention that any closed circle (and these may arise when the pictures are stacked) should be regarded as a factor of d. All closed circles in a picture should be deleted and the resulting picture should then be formally multiplied by $d^{(\#\text{circles})}$. The reader can now easily verify the relations by stacking pictures. Kauffman proved the algebra of such pictures has no other relations [K]. A tensor structure between grades $TL_n \otimes TL_m \to TL_{n+m}$ is created by horizontal stacking. An inclusion $TL_n \to TL_{n+k}$ is obtained by adding k vertical strands on the right. The union of grades is the (generic) Temperley-Lieb algebra, $TL = \bigcup_{n=1}^{\infty} TL_n$. The structure of this algebra is completely worked out in [J]: each grade $TL_n$ has $\dim(TL_n) = \frac{1}{n+1}\binom{2n}{n}$, the n-th Catalan number, and is a direct sum of matrix algebras that fit together via a rather simple Bratteli diagram. Also of interest are specializations where the indeterminate d is set to a fixed nonzero real number. Here the structure differs from the generic case when d assumes a “special value” $d = 2\cos\frac{\pi}{\ell+2}$, $\ell$ a positive integer, and has been worked out by Goodman and Wenzl [GW]. There is an involution $\bar{\ }$ on TL which acts by reflecting the rectangle in a horizontal line and conjugating coefficients ($\bar d = d$), making TL a ∗-algebra. Using this, the “Markov trace” pairing $\langle a, b\rangle := \mathrm{trace}(a\bar b)$ may be defined. “Trace,” on pictures, means closing a rectangular diagram by a family of arcs sweeping from top to bottom and then evaluating each circle as a factor of d times $1 \in \mathbb{C}$. Extend this definition to a Hermitian pairing on TL.
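The dimension count quoted from [J] can be checked by brute force for small n. The sketch below assumes, as the picture description suggests, that a basis of $TL_n$ is given by the crossing-free pairings of the 2n boundary points of the rectangle (n on top, n on the bottom). It enumerates these pairings and compares their number with the Catalan formula. The function names are illustrative and not from the paper.

```python
from math import comb


def catalan(n):
    """n-th Catalan number, (1/(n+1)) * binomial(2n, n)."""
    return comb(2 * n, n) // (n + 1)


def noncrossing_pairings(points):
    """Yield every pairing of `points` (listed in cyclic order) without crossings."""
    if not points:
        yield []
        return
    first = points[0]
    # `first` must pair with a point that leaves an even number of points
    # on each side, otherwise some arc would have to cross.
    for k in range(1, len(points), 2):
        inside, outside = points[1:k], points[k + 1:]
        for left in noncrossing_pairings(inside):
            for right in noncrossing_pairings(outside):
                yield [(first, points[k])] + left + right


def count_pictures(n):
    """Number of crossing-free pictures on n strands."""
    return sum(1 for _ in noncrossing_pairings(tuple(range(2 * n))))
```

For example, `count_pictures(4)` counts the pictures available in the n = 4 case drawn in Fig. 2.1.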
Fig. 2.2 (an example of the trace pairing $\langle a, b\rangle = \mathrm{tr}(a\bar b)$)
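The stacking rules above lend themselves to a small experiment. The following sketch is an illustration under our own encoding, not the paper's: a picture on n strands is a dictionary pairing the boundary points `("t", i)` and `("b", i)`, and `cup_cap(n, i)` is the raw picture $U_i$ with $e_i = U_i/d$. Stacking returns the resulting picture together with the number of closed circles, each of which contributes a factor d.

```python
def identity(n):
    """Picture with n vertical strands: top point i joined to bottom point i."""
    pic = {}
    for i in range(n):
        pic[("t", i)] = ("b", i)
        pic[("b", i)] = ("t", i)
    return pic


def cup_cap(n, i):
    """Raw picture U_i: one arc joins top points i, i+1, another joins bottom points i, i+1."""
    pic = identity(n)
    pic[("t", i)], pic[("t", i + 1)] = ("t", i + 1), ("t", i)
    pic[("b", i)], pic[("b", i + 1)] = ("b", i + 1), ("b", i)
    return pic


def stack(a, b, n):
    """Stack picture a on top of picture b; return (picture, number of closed circles)."""
    crossed = set()  # positions on the middle line already traversed

    def follow(side, i):
        # Walk from an outer boundary point until the path exits again.
        in_a = side == "t"
        s, j = (a if in_a else b)[(side, i)]
        while True:
            if in_a and s == "t":
                return ("t", j)
            if not in_a and s == "b":
                return ("b", j)
            crossed.add(j)          # we hit the middle line at position j
            in_a = not in_a         # and continue in the other picture
            s, j = a[("b", j)] if in_a else b[("t", j)]

    result = {}
    for i in range(n):
        result[("t", i)] = follow("t", i)
        result[("b", i)] = follow("b", i)

    circles = 0
    for start in range(n):          # middle points never reached lie on closed circles
        if start in crossed:
            continue
        circles += 1
        j, in_a = start, True
        while True:
            crossed.add(j)
            _, j = a[("b", j)] if in_a else b[("t", j)]
            in_a = not in_a
            if j == start:
                break
    return result, circles


def closure_circles(pic):
    """Number of circles obtained by joining each top point i to bottom point i."""
    seen, count = set(), 0
    for start in pic:
        if start in seen:
            continue
        count += 1
        pt = start
        while pt not in seen:
            seen.add(pt)
            side, i = pic[pt]       # travel along an arc of the picture
            seen.add((side, i))
            pt = ("b" if side == "t" else "t", i)  # then along the closing arc
    return count
```

In this encoding $U_1 U_1$ is $U_1$ with one circle, so $e_1^2 = d\,U_1/d^2 = e_1$. Likewise $U_1 U_2 U_1 = U_1$ with no circle, so $e_1 e_2 e_1 = U_1/d^3 = \frac{1}{d^2} e_1$. The trace of a picture is d raised to `closure_circles(pic)`.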
Theorem 2.1. ([J]) The trace pairing $\langle\,,\rangle : TL \times TL \to \mathbb{C}[d]$, when d is specialized to a positive real number, becomes a positive definite Hermitian pairing $\langle\,,\rangle_d : TL_d \times TL_d \to \mathbb{C}$ exactly for $d \geq 2$. For d = “special” $= 2\cos\frac{\pi}{\ell+2}$, $\ell$ a positive integer, $\langle\,,\rangle_d$ is positive semidefinite. For other values of $d \in \mathbb{R} \setminus 0$, $\langle\,,\rangle_d$ has mixed signs. $\square$
For $d = 2\cos\frac{\pi}{\ell+2}$ define the radical $R_d \subset TL_d$ by $\langle R_d, TL_d\rangle_d \equiv 0$. The radical $R_d$ has first non-trivial intersection with the $(\ell+1)$-th grade where it is 1-dimensional: $R_d \cap TL_{d,\ell} = 0$ and $R_d \cap TL_{d,\ell+1} = \mathrm{span}_{\mathbb{C}}(p_{\ell+1})$. The elements $p_{\ell+1}$ belonging to $TL_{-,\ell+1}$ (the $\ell+1$ grade of the generic algebra) are called the Jones-Wenzl [W] projectors and a simple recursive formula for these is known. In this paper we will be particularly concerned with $p_3$ and $p_4$ (for $d = \sqrt{2}$ and $\frac{1+\sqrt{5}}{2}$ respectively) which can be computed (from the formula on p. 18 [KL]).
Fig. 2.3 (the Jones-Wenzl projectors $p_2$, $p_3$, $p_4$ written as formal pictures: $p_2$ generates a proper radical for $d = 1, -1$; $p_3$ for $d = 0$ and $d = \pm\sqrt{2}$; and $p_4$ for $d = \pm\frac{1+\sqrt{5}}{2}, \pm\frac{1-\sqrt{5}}{2}$)
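The values of d listed in Fig. 2.3 can be cross-checked numerically. The sketch below relies on a standard fact that is not derived in this section and is therefore an assumption here: the trace of $p_n$ is the polynomial $\Delta_n(d)$ defined by $\Delta_0 = 1$, $\Delta_1 = d$, $\Delta_{n+1} = d\,\Delta_n - \Delta_{n-1}$ (see [KL]). Under that assumption $p_{\ell+1}$ can lie in the radical only where $\Delta_{\ell+1}$ vanishes, and the special value $d = 2\cos\frac{\pi}{\ell+2}$ should be such a zero.

```python
import math


def delta(n, d):
    """Delta_0 = 1, Delta_1 = d, Delta_{n+1} = d * Delta_n - Delta_{n-1}."""
    prev, cur = 1.0, float(d)
    if n == 0:
        return prev
    for _ in range(n - 1):
        prev, cur = cur, d * cur - prev
    return cur


def special_d(ell):
    """The special value d = 2 cos(pi / (ell + 2))."""
    return 2.0 * math.cos(math.pi / (ell + 2))


GOLDEN = (1.0 + math.sqrt(5.0)) / 2.0
CONJUGATE = (1.0 - math.sqrt(5.0)) / 2.0

# Candidate radical values read off Fig. 2.3, keyed by the index n of p_n.
FIGURE_VALUES = {
    2: [1.0, -1.0],
    3: [0.0, math.sqrt(2.0), -math.sqrt(2.0)],
    4: [GOLDEN, -GOLDEN, CONJUGATE, -CONJUGATE],
}
```

With these definitions `special_d(2)` is $\sqrt{2}$ and `special_d(3)` is the golden ratio, matching the values quoted for $p_3$ and $p_4$.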
It is known that when d is special, the ideal $J(p_{\ell+1})$ generated by $p_{\ell+1}$ in $TL_d$ (the specialized TL algebra) is $R_d$. The notion of ideal closure in different algebraic contexts is essential to all that follows so we will be explicit here; $J(p_{\ell+1})$ is the smallest subset of $TL_d$ containing $p_{\ell+1}$ so that if $c_1, c_2 \in \mathbb{C}$; a, b belong to the subset and $x, y \in TL_d$ then: $c_1 a + c_2 b$, $ax$, $xa$, $a \otimes y$, and $y \otimes a$ all belong to the subset. $J(p_{\ell+1})$ is a linear subspace of $TL_d$ and is a two sided ideal under composition and ⊗. So J is, by definition, closed under formal linear combination and all types of picture stacking: top, bottom, right, and left. For d special the algebra $TL_d$ contains many other ideals besides $R_d$ (e.g. the ideal generated by diagrams with at least two “horizontal” arcs) but we find that when we move to the category, $R_d$ becomes unique (see Appendix). This motivates the definition of the Temperley-Lieb category $TL^c_d$. The generic Temperley-Lieb category $TL^c$ is a strict monoidal tensor category over $\mathbb{C}(A)$ with objects $\mathbb{N}_\circ = \{0, 1, 2, \dots\}$ thought of as the number of marked points in the interior of a horizontal interval. The indeterminate A determines d, above, by the formula $d = -A^2 - A^{-2}$. The morphism space Hom(m, n) is a $\mathbb{C}(A)$ vector space spanned by all pairings of the n + m points that can be realized by disjointly imbedded arcs in a rectangle for which the m points are on the top and the n points are on the bottom edge. The only difference from the algebra is that we do not demand that a nontrivial morphism have m = n. Again composition is vertical stacking and ⊗ is horizontal stacking. The involution, the specialization of d and the notions of “ideal” are defined using exactly the same words as before. Now the Markov trace $\langle a, b\rangle = \mathrm{tr}(a\bar b)$ becomes a Hermitian pairing $\mathrm{Hom}(m, n) \times \mathrm{Hom}(m, n) \to \mathbb{C}$. Theorem 2.1 continues to hold with $TL^c$ and $TL^c_d$ replacing TL and $TL_d$ respectively and for d special the radical $R_d$ is still the ideal closure of $p_{\ell+1}$. But in the categorical setting there is a new result, conjectured by the author and proved by Goodman and Wenzl (see Thm. A3.3 in the appendix to this paper), which when combined with Theorem 2.1 yields:
Theorem 2.2 (Goodman, Wenzl). For d = special value $= 2\cos\frac{\pi}{\ell+2}$, $TL^c_d$ has a unique non-zero, proper ideal $= R_d = J(p_{\ell+1})$ and on the quotient $TL^c_d/R_d$ the pairing $\langle\,,\rangle_d$ becomes positive definite. If $d \neq$ special value but is of the form $d = \alpha + \bar\alpha$, α a root of unity, $\alpha \neq \pm 1$ or $\pm i$, then $TL^c_d$ has a unique non-zero proper ideal, and the pairing $\langle\,,\rangle_d$ descends to the quotient but has mixed sign. For other values of $d \in \mathbb{R} \setminus 0$, $TL^c_d$ has no non-zero proper ideal. $\square$
We can continue to make the algebraic structure more flexible, more suited to both topology and physics, while retaining the key notion of “ideal” and the uniqueness property set out in the preceding theorem. One step in this direction is Jones’ theory of “planar algebras” [J2]. These are generalized categories with an operad structure replacing the notion of morphism. The TL-planar algebra, $TL^p$ or $TL^p_d$ if d is specialized, begins with a Hilbert space $h_{2k}$ associated to an even number 2k of points marked on a circle: $h_{2k} \cong$ span(imbeddable arc pairings in a disk D with 2k marked points on ∂D). To a disk with j internal punctures $D^-$ and a relatively imbedded 1-manifold $\gamma \subset D^-$, where γ has $2k_i$ endpoints on the i-th internal boundary component and 2k endpoints on the outer boundary component, Jones associated (in an obvious way$^3$!) a homomorphism $\bigotimes_{i=1}^{j} h_{2k_i} \to h_{2k}$. In the planar algebra context the distinction between times and tensor has been lost because there is no up, down, right, left. Instead we have “subpictures” of “pictures”, i.e. restrictions of an imbedded 1-manifold on a surface to a subsurface.
Definition 2.3. A picture γ in Y is an imbedded 1-submanifold (multi-curve), proper if $\partial Y \neq \emptyset$. A formal picture is a linear combination of pictures with identical boundary if $\partial\gamma \neq \emptyset$.
In Jones’ theory there is no action by Dehn twist because surfaces are considered up to homeomorphism. We take a further step and allow surfaces with genus > 0; here Dehn twist becomes crucially important. Consider an oriented compact surface Y, and the possible imbedded 1-manifolds (“multi-curves”) γ in Y. Picking a special value d for closed circles which bound a disk (“trivial circles”) defines d-isotopy. In Sect. 1, we have defined $ETL_d(Y)$ to be the $\mathbb{C}$-vector space of d-isotopy classes of closed null-bounding 1-manifolds on a surface Y.
Definition 2.4. Suppose $a = \sum_i a_i \gamma_i$ is a formal picture in a disk $\delta \subset \mathrm{interior}(Y)$ with fixed endpoints $\partial\gamma_i \subset \partial\delta$.
The ideal $J(a)$ or $\langle a\rangle \subset ETL_d(Y)$ generated by a consists of the d-isotopy classes of formal pictures of the form ax, $x = \sum_j x_j \chi_j$, $\chi_j$ a picture in $Y \setminus \delta$ with $\partial\chi_j = \partial\gamma_i$ for all i and j, $ax = \sum_{i,j} a_i x_j (\gamma_i \cup \chi_j)$. Dually, $\langle a\rangle^* \subset ETL^*_d(Y)$ are the functions annihilating $\langle a\rangle$. Concretely, $y \in \langle a\rangle^*$ iff $y(ax) = 0$ for all x as above. The definition of ideal is the same in $TL_d(Y)$ and similar in the combinatorial settings: $ETL^{\Delta}_d(Y)$ and $ETL^{C}_d(Y)$.
$^3$ Let closed loops in D be assigned the multiplicative factor d.
One finds that the quotient $ETL_d/J(p_{\ell+1}) =: QE$ has (or better “recovers”) the structure of a TQFT, or more precisely, a 2 + 1-dimensional unitary topological modular functor (UTMF). For this, we must extend the definition of QE to the case of a surface with labelled boundary $(Y, \vec t\,)$. The essential feature is that QE may be calculated by “gluing rules” applied to these smaller pieces. When we wish to emphasize the modular (UTMF) structure on QE we use the notation DE = QE to recall the doubled SO(3) or “even” theory discussed in the introduction. Up to the global $|+\rangle \leftrightarrow |-\rangle$ involution $\bar{\ }$ on configurations, DE will be our model for the perturbed ground state space $G_{\epsilon,\ell}$, $DE \cong G^+_{\epsilon,\ell}$. A UTMF is a very natural way to model the topological properties of a two dimensional particle system without low lying modes in the bulk. Knowing that the ground state has the structure of a particular modular functor (DE) tells us all the topological information about excitation types, braiding rules (nonabelian Berry phase), 6j-symbol, S-matrix, and fusion rules. It is this structure that we have been seeking. The statement DE = QE is a purely topological one and it is possible to piece it together from the topological literature using [BHMV, Prz and KL]. An exposition [FNWW] of the easiest modular functors is in progress and will explicate this isomorphism and contain a proof of Theorem 2.5 below. But let us take a step back and explain this structure (UTMF) in a context where the gluing rules are obvious. Then we will summarize the axioms and finally explain the labels, pairing, and cutting/gluing operations in DE in terms of functions on pictures. Let M(Y) be the vector space spanned by 1-submanifolds (= pictures) γ on Y with no equivalence relation. Suppose Y is cut into two pieces by a circle $\alpha \subset Y$, $Y = Y_1 \cup_\alpha Y_2$. The uncountable set X of all finite subsets of α will be the “labels” or “superselection sectors” of this theory. Neglecting the measure zero event that γ and α are not transverse, we can formally write:
$$M(Y) = \bigoplus_{x \in X} M(Y_1, x) \otimes M(Y_2, x), \tag{2.1}$$
where $M(Y_i, x)$ is the vector space of 1-manifolds in $Y_i$ meeting $\alpha = \partial Y_i$ in the finite point set x. Equation (2.1) is the essential feature of a TMF as used by Witten [Wi] and formalized by Segal [S], Atiyah [A], Walker [W], and Turaev [T]. Many enormous “classical spaces” have this kind of formal structure but it requires beautiful algebraic “accidents” to find finite dimensional “quantizations” of these. Bounding or “even” pictures span another (huge) vector space EM(Y). Let us set $d = 2\cos\frac{\pi}{\ell+2}$ and constrain the functions, in $M^*(Y)$ and $EM^*(Y)$, first by the d-isotopy relation and then by annihilation by the ideal generated by the Jones-Wenzl relation $p_{\ell+1}$. This yields the following quotients and inclusions in lines (2.2) and (2.3):
$$\begin{aligned} DK\big(A = ie^{\pi i/(2\ell+4)}\big) &\longleftarrow TL_d(Y) \longleftarrow M(Y), \\ DK^*\big(A = ie^{\pi i/(2\ell+4)}\big) = \langle p_{\ell+1}\rangle^* &\hookrightarrow TL^*_d(Y) \hookrightarrow M^*(Y), \end{aligned} \tag{2.2}$$
$$\begin{aligned} DE &\longleftarrow ETL_d(Y) \longleftarrow EM(Y), \\ DE^* = \langle p_{\ell+1}\rangle^* &\hookrightarrow ETL^*_d(Y) \hookrightarrow EM^*(Y). \end{aligned} \tag{2.3}$$
Theorem 2.5. The annihilating subspace $\langle p_{\ell+1}\rangle^*$ of $M^*(Y)$, which we wrote $DK^*$, is in fact the Drinfeld double [Dr] of the unitary topological modular functor (UTMF) derived from the Kauffman bracket at $A = ie^{\pi i/(2\ell+4)}$. This is true even at odd levels, $\ell$ = odd, where the undoubled Kauffman bracket MF is flawed by having a singular S-matrix. DE is a UTMF for $\ell \neq 2 \bmod 4$ and in these cases is a trivial double: $V^* \otimes V$.
Remark 2.6. For the even space DE the same MF arises for A, iA, −A, and −iA; $A = ie^{\pi i/(2\ell+4)}$, so the notation agrees with the introduction.
Remark 2.7. The Kauffman bracket TMF (or TQFT), constructed in [BHMV], is not identical to the TMF derived from SU(2). In physics there is the loop group L(SU(2)) approach and in representation theory there is the quantum group ($U_q sl_2$) approach and these lead to the same representation categories. Globalization of these representation categories (this viewpoint is explained in [Ku]) yields the same MF. The pictures underlying the Kauffman bracket are unoriented arcs. The Rumer-Teller-Weyl theorem shows these almost correspond to $\mathrm{Rep}_q(SU(2))$. However an important minus sign, the Frobenius-Schur indicator, corresponding to the quaternionic (not real) structure of the fundamental representation, is missing. This minus sign propagates into the S matrix, making the Kauffman bracket (K) and the SU(2) TMFs distinct. A different microscopic model which allowed arbitrary 1-manifolds (not just bounding 1-manifolds) could, depending on the local details, lead to DK or DSU(2), so solid state physicists looking for anyons will need to be aware of this distinction in detail [FNWW]. The present models $H_{\circ,\ell}$ and $H^{\square}_{\circ,\ell}$ address only bounding 1-manifolds, which correspond to (endomorphisms of) the even symmetric powers of the fundamental representation, which are all real. Thus K restricted to even labels is $EK \cong SO(3)$, the SU(2) theory at even labels. The same holds, of course, for the doubles of these TMFs.
Addendum 2.8. In [FLW2] it is shown that the braid representation of the “Fibonacci category”$^4$ (F) is universal for quantum computation. $DE3\,(A = e^{\pi i/10})$ has 4 labels 0, 0; 0, 2; 2, 0; and 2, 2 and is isomorphic to $F \otimes F^*$, implying that DE3 is also universal.
Addendum 2.9. If Y has a fixed triangulation $\Delta$ then combinatorial versions of the six vector spaces connected by maps (2.2) and (2.3) in Theorem 2.5 are defined. Provided $\Delta$ is sufficiently fine, no information is lost; the left most combinatorial spaces are actually isomorphic to DK and DE respectively. The proof is the same as in the main topological theorem of [F1]. Furthermore an estimate on the required fineness of $\Delta$ (it is linear in $\ell$) can be extracted from that proof. Also if Y has boundary and labels $\vec t$ (see the discussion of labels which follows immediately) are specified, then the left-most combinatorial spaces are again defined and these map to the TMFs with the given boundary labels $\vec t$.
To appreciate the last statement we set out Walker’s axioms [Wa] for a UTMF. Fortunately these can be abbreviated due to two simplifications: (1) the theories are unitary and (2) both are quantum doubles (i.e. the endomorphisms of another more primitive UTMF), so the central charge c = 0 ($c(V \otimes V^*) = c(V) + c(V^*) = c(V) - c(V) = 0$). Thus no “extended structures” or projective representations need be mentioned. For a concrete appreciation of these examples, see Figs. 2.4 and 2.5 where the particle types (= labels), fusion algebra, braiding, and S-matrices in the case DE3 are given.
$^4$ Greg Kuperberg’s term for the even label sub-theory of (SU(2), 3), also called the SO(3)-theory at level 3.
A labelled surface Y is a compact oriented surface Y possibly with boundary; each boundary component has a base point marked and a label $t \in L$ from a finite label set L with involution $\hat{\ }$ containing a distinguished trivial element 0, fixed by $\hat{\ }$. For Kauffman, SU(2), and SO(3) theories the labels are self dual, $\hat a = a$, but we include the hats in the formulas anyway. A UTMF will be a functor V from the category of labelled surfaces, and isotopy classes of diffeomorphisms (preserving labels and base points), to the category of finite dimensional Hilbert spaces over $\mathbb{C}$ and unitary maps.
Axiom 1 (Disjoint union). $V(Y_1 + Y_2, t_1 + t_2) = V(Y_1, t_1) \otimes V(Y_2, t_2)$; the equality is compatible with the mapping class groupoids: $V(f_1 + f_2) = V(f_1) \otimes V(f_2)$.
Axiom 2 (Gluing). If $Y_g$ is obtained by gluing Y along dually labeled $(x, \hat x)$ boundary components then:
$$V(Y_g, t) = \bigoplus_{(x,\hat x)\ \text{labels on the paired circles}} V(Y, t, x, \hat x).$$
The identification is also compatible with mapping class groupoids and is associative (independent of order of gluings).
Axiom 3 (Duality). $V(Y, t) = V(-Y, \hat t\,)^*$, where − is orientation reverse on Y and $\hat{\ }$ on labels, and ∗ denotes the space of complex linear functionals. The Hermitian structures on V give vertical maps and the diagram below must commute:
$$\begin{array}{ccc} V(Y) & \longleftrightarrow & V(-Y)^* \\ \updownarrow & & \updownarrow \\ V(Y)^* & \longleftrightarrow & V(-Y) \end{array}$$
All these identifications are compatible with the mapping class groupoids. The Hilbert space pairings are compatible with maps: $\langle x, y\rangle = \langle V(f)x, V(f)y\rangle$, where $x, y \in V(Y, t)$, and disjoint union: $\langle \alpha_1 \otimes \alpha_2, \beta_1 \otimes \beta_2\rangle = \langle\alpha_1, \beta_1\rangle\langle\alpha_2, \beta_2\rangle$. Writing $\alpha, \beta \in V(Y_g, t)$ as $\alpha = \bigoplus_x \alpha_x$ and $\beta = \bigoplus_x \beta_x$ according to Axiom 2, then $\langle\alpha, \beta\rangle = \sum_x S_x \langle\alpha_x, \beta_x\rangle$, where $S_x = \prod_{x_i \in \vec x = (x_1, \dots, x_n)} S_{0,x_i}$. The symbols $S_{0,x_i}$ are the values of a fixed function $L \to \mathbb{C} \setminus \{0\}$ which is a part of the definition of V. Experts will recognize $S_{0,x_i}$ as the (0, i)-entry of the S-matrix of V: this is the map that describes exchange of meridian and longitude of a torus in the natural “label” bases.
Axiom 4 (Empty surface). $V(\emptyset) \cong \mathbb{C}$.
Axiom 5 (Disk). Let D be a disk. Then
$$V(D, a) \cong \begin{cases} \mathbb{C}, & a = 0 \\ 0, & a \neq 0. \end{cases}$$
Axiom 6 (Annulus). Let A denote an annulus. Then
$$V(A, a, b) \cong \begin{cases} \mathbb{C}, & a = b \\ 0, & a \neq b. \end{cases}$$
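The gluing axiom turns dimension counting into bookkeeping over labels. The sketch below illustrates this for a sphere with labelled punctures in the even level-3 theory with labels {0, 2}. It assumes the usual Fibonacci fusion rule $2 \otimes 2 = 0 \oplus 2$ for the pair of pants, which is standard but is not stated in this section. It uses Axioms 5 and 6 for the disk and annulus, and the identification $DE3 \cong F \otimes F^*$ of Addendum 2.8 for the doubled count. All function names are ours.

```python
from functools import lru_cache

LABELS = (0, 2)  # even labels at level 3; 0 is the trivial label


def pants_dim(a, b, c):
    """Assumed Fibonacci fusion rule: a triple is admissible unless exactly one label is 2."""
    nontrivial = sum(1 for x in (a, b, c) if x != 0)
    return 0 if nontrivial == 1 else 1


@lru_cache(maxsize=None)
def sphere_dim(labels):
    """dim V of a sphere whose punctures carry `labels`, by cutting off pairs of pants."""
    n = len(labels)
    if n == 0:
        return 1                                  # two disks glued along label 0
    if n == 1:
        return 1 if labels[0] == 0 else 0         # Axiom 5 (disk)
    if n == 2:
        return 1 if labels[0] == labels[1] else 0  # Axiom 6 (annulus), labels self dual
    if n == 3:
        return pants_dim(*labels)
    a, b, rest = labels[0], labels[1], labels[2:]
    # Axiom 2: sum over the label x on the circle along which the pants are glued.
    return sum(pants_dim(a, b, x) * sphere_dim((x,) + rest) for x in LABELS)


def doubled_sphere_dim(pairs):
    """Dimension for doubled labels (a, b), using the factorization F tensor F*."""
    left = tuple(a for a, _ in pairs)
    right = tuple(b for _, b in pairs)
    return sphere_dim(left) * sphere_dim(right)
```

With n punctures all labelled 2 the dimensions follow the Fibonacci sequence, which is the origin of the name used in Addendum 2.8.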
To complete these axioms to a theory incorporating 3-manifolds Walker adds Axioms 7−10. We will not need these here except to note that a 3-manifold X determines a vector Z(X) belonging to V(∂X). If $X = Y \times I$, $Z(X) = \mathrm{id} \in \mathrm{Hom}(V(Y \times +1), V(Y \times -1)) = V(Y) \otimes V(Y)^* =: D(V(Y))$. In the SU(2) theories it has been known since [Wi] that if X contains a labeled link “Wilson loop” (or suitable labeled “trivalent graph”) then this pair also defines an element of $D(V(Y))$. The simple idea is to regard the 1-manifold $\gamma \subset Y = Y \times 0$ as a link labeled by “1”, the 2-dimensional representation of sl(2, q), inside $X = Y \times [-1, +1]$. This defines a map $M(Y) \to D(V(Y))$. If γ is null bounding (in $\mathbb{Z}_2$-homology) on Y then there is a subsurface $Y_\circ \subset Y$ with $\partial(Y_\circ) = \gamma$. Let G be a generic spine (trivalent graph) for $Y_\circ$. Derived from Witten’s theory and its bracket variation are combinatorial recoupling rules (6j symbols) which are exposited in detail by Kauffman and Lins in [KL]. We have adopted their notations (which caused us to rename Walker’s trivial label “1” by “0”) except in the choice of A, $A^4 = q = e^{2\pi i/(\ell+2)}$. To make d positive we choose $A = ie^{\pi i/(2\ell+4)}$, that is i times the primitive $4(\ell+2)$-th root chosen in [KL]. For $\ell$ = even our A is still a primitive $4(\ell+2)$-th root of unity; for $\ell$ = odd, it is a primitive $2(\ell+2)$-th root of unity but still defines a nonsingular UTMF on the even labels. Applying recoupling, γ yields a formal labelling of G in which only even labels (odd dimensional representations) appear. This means that the set of possible morphisms Z(X, G) is isomorphic to the endomorphism algebra of the sum of even labelled blocks. Restricted to even levels, the 4 choices for A differing by powers of i give the same Kauffman bracket UTMF and this agrees with the SO(3) UTMF, which we call DE. The recoupling relations on labelled trivalent graphs $G \subset Y \times 0$, i.e. the 6j symbols, are consequences of the projector relation $p_{\ell+1}$ applied to formal 1-manifolds (and conversely $p_{\ell+1}$ follows from 6j). What is less direct [Prz] is that on a surface Y, $p_{\ell+1}$ alone generates the same relation as including $Y \times 0 \subset Y \times [-1, +1]$ and then employing both $p_{\ell+1}$ and the Kauffman bracket relation $\times \;=\; A\,)( \;+\; A^{-1}\,\tfrac{\cup}{\cap}$. Abstractly we know the label sets for DK and DE, but we need to interpret these labels in $DK := TL_d/\langle p_{\ell+1}\rangle$ and $DE := ETL_d/\langle p_{\ell+1}\rangle$ resp. and in this context recover the gluing formula. From a physical point of view it would be surprising if we could not localize because we expect the Hamiltonian $H_{\epsilon,\ell}$ to define a stable topological phase for which the superselection sectors of excitations define the label set. But such reasoning is in the end circular; it is better to have a mathematical proof that the candidate ground state space $G_{\epsilon,\ell}$ has the structure of a UTMF and view this as a “consistency check” on the physical stability of $G_{\epsilon,\ell}$. We now explain the “labels” for the theories DK and DE in terms of “pictures”. A conceptual point is that the label has a kind of symplectic character: “half” the label’s information is a non-negative integer $\leq \ell$ which counts “essential” strands of γ passing inward from a component $C \subset \partial Y$. Think of this as “position” information. (Any “excess” strands correspond to a descendent field or gapless boundary excitation.) The other half of the information (“momentum”) is expressed as a symmetry condition on γ in the bulk Y. The formal picture γ must lie in the image of certain minimal idempotents, that is, certain eigenspaces or projector images as constructed below.
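The choice of A made in this section is easy to check numerically. The sketch below is an illustration with function names of our own choosing. It evaluates $d = -A^2 - A^{-2}$ at $A = ie^{\pi i/(2\ell+4)}$ and compares it with $2\cos\frac{\pi}{\ell+2}$, checks $A^4 = e^{2\pi i/(\ell+2)}$, and shows that dropping the factor i flips the sign of d.

```python
import cmath
import math


def kauffman_A(ell):
    """A = i * exp(pi i / (2 ell + 4)), the choice that makes d positive."""
    return 1j * cmath.exp(1j * math.pi / (2 * ell + 4))


def loop_value(A):
    """d = -A^2 - A^(-2), the value assigned to a closed circle."""
    return -(A ** 2) - A ** (-2)


def q_from_A(A):
    """q = A^4."""
    return A ** 4


def special_d(ell):
    """d = 2 cos(pi / (ell + 2))."""
    return 2.0 * math.cos(math.pi / (ell + 2))
```

For instance `loop_value(kauffman_A(3))` is the golden ratio up to rounding, while `loop_value(kauffman_A(3) / 1j)` is its negative.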
Abstractly, the label set L for DE = QE := ET Ld /p+1 may be written as: +1 +1 L = (0, 0); (0, 2); (2, 0); . . . ; 2 ,2 . 2 2 The “position” part of the doubled label t = (a, b) for γ ∈ QE on a boundary component C ⊂ ∂Y is |a − b|. This quantity is the smallest number # of domain wall (γ ) intersections with C , # = |a − b|, as C varies over all imbedded loops parallel to C (i.e. cobounding an annulus with C) and the domain wall γ also varies over all p+1 -equivalent pictures. The “momentum” part of the label is an eigenvalue. Let us do this more carefully. We follow [BHMV] to define what Walker calls an “anA 1 1 nulus category” ∧A . A = S × I , obj(∧ ) = {set of even number of points on S } then an element of morph (∧A ) are all formal combinations of pictures in A, which beginning on the object in S 1 × −1 and end on the object in S 1 × +1, and which obey the relations: d-isotopy and p+1 . (Recall p+1 = negligible morphisms of T Ld . Also see the appendix.) Suppose that Y is a surface with connected A boundary ∂Y = C, then there is a gluing action of ∧A on DE(Y ) : f ∈ morph ∧ , g ∈ DE(Y ), f ◦ g ∈ DE(Y ); f ◦ g(xi ⊗ zj ) = χi ωj , where the coefficient of f on the picture xi is χi , f (xi ) = χi and g(zj ) = ωj . For this action to be defined we must pick an identification Y ∼ = Y ∪ A. C
Also C has a fixed parameterization and an orientation; these tell us which end of A, A opp S 1 × −1 or S 1 × +1 to glue to C. Technically, this means one of ∧ or ∧A is acting according to orientation. Since we will not make calculations, we will not be careful in choosing orientations and in distinguishing categories and their opposites. If Y has k boundary components the k-fold product X∧A acts on DE(Y ). The reader should k
note that in the cases where there is a mismatch between boundary conditions on ∂Y and A there is by definition no action defined. This would not make sense if we were k
dealing with algebras and is precise by the extra flexibility that make linear categories, sometimes called algebroids, a useful generalization of algebras. So X∧A =: C has an action or − to use the usual terminology when actions are linear k
− a representation on DE(Y ) =: V . To clarify, for each ∂-condition on A there are two corresponding finite dimensional k
vector spaces DE(Yinto A ∂-condition ) =: Vin and DE(Yout A ∂-condition ) =: Vout . A morphism in γout ∈ morph(C) induces (by gluing) a linear map : Vin −→ Vout . The construction is so natural that all the required diagrams commute, and gluing, indeed, defines a representation. (There are some technical points but these are well understood and will cause us no trouble. As annular collars are added to Y various “association” must be chosen so the action is “weak” not “strict”. Also it is sometimes convenient to forget the parameterization of C and only remember the base point ∗ = 1 ∈ S 1 , and further to replace the uncountable object set − finite subsets of S 1 − with the countable object set consisting of one exemplar object for each finite cardinal 0, 1, 2, . . . . This processes is called skeletonizing the category.). The positivity properties of the pairing , , Fig. 2.2 and Thm 2.1, and its extension implies that its finite dimensional representations decompose uniquely into a direct sum of irreducibles. The arguments for this are nearly word for word what is said to prove that a finite dimensional C∗ -algebra is isomorphic to a direct sum of matrix algebras. The algebroid context changes little.
So let us decompose V as a representation of C into a direct sum of its irreducibles. We record multiplicities by tensor product with a vector space W_i on which no action of C exists:
$$V \cong \bigoplus_{\text{irreps. } i \text{ of } C} W_i \otimes V_i. \qquad (2.4)$$
The index i is a multi-index i = (i_1, . . . , i_k) and counts the “admissible labellings” of ∂Y = C_1 ∪ . . . ∪ C_k. In the theory DEℓ, the possible components of i are the
$$\{\text{irreps of } \wedge A\} =: L, \qquad (2.5)$$
the label set of DEℓ. The involution ∧ is induced by orientation reversal on A and conjugates representations: it happens to be trivial in these theories. A final categorical comment: the form of the r.h.s. of (2.4) suggests it represents a 2-vector: a linear combination (in this case of irreps.) with vector space, rather than scalar, coefficients. This is, in fact, the correct categorical setting for MF(Y) when Y has boundary. So far this discussion of the label set has been rather abstract, but it is possible to make explicit calculations by considering the annular categories as operators on T Lcd , the Temperley-Lieb, or “rectangle”, categories. A label a ∈ irrep ∧A is generated by some idempotent a = k a k = morph (k, k) ⊂ ∧A , which is a linear combination of annular pictures. Quoting a result which will appear in [FNWW] with full details, we describe (Fig. 2.4) the idempotent a for the four labels of DE3, a = (0, 0), (0, 2), (2, 0) and (2, 2). In the language of rational conformal field theory these labels are the 4 primary fields. They are given below as formal pictures in annuli. Previously, we only considered ideals to be generated by formal pictures in a disk. But now we can let a, by stacking formal pictures in annuli, generate an “annular ideal”, J. Elements b ∈ J may have more than k boundary points on S^1 × ±1; such b are the descendent fields. The idempotent (0, 2) has among its many terms five principal terms, pictures containing only arcs going between inner and outer boundaries of A and no arcs which are boundary parallel. The other terms enforce orthogonality of (0, 2) to the descendents of (0, 0) and (2, 2). The principal terms are written out below.
$$(0, 2) :: I + e^{3\pi i/5} F + e^{6\pi i/5} F^2 + e^{9\pi i/5} F^3 + e^{12\pi i/5} F^4 + \text{lower terms}, \qquad (2.6)$$
where F is the picture of the fractional Dehn twist and I the picture with no twist. (0, 2) is an e^{2πi/5} eigenvector of F. The powers of F are obtained by radial stacking of annuli. In the case |a_i − b_i| = 2k_i, k_i ≠ 0, consider the commuting actions of F = e^{πi/k_i} twists of C_i on QEℓ. The resulting eigenvalues turn out to be distinct within the “position” = |a − b| summand of V(Y). These eigenvalues add the “momentum” information which determines the labels: the minimal projections to eigenspaces. For k_i = 0, the Dehn twist acts by the identity, so here the prescription must be different. Suppose s is a configuration on Y which has constant spin (“monochromatic”), either |+⟩ or |−⟩, near C_i and let [s] be its image in QEℓ. Define an action on [s] by adding an annular ring of the opposite spin in the interior (Y) immediately parallel to C_i and define $\beta_n = \sum_{x=0}^{(\ell+2)/2} S_{2n,2x} R^x$, where R^x consists of x parallel annular rings of the
[Fig. 2.4. The idempotents for the four labels (0, 0), (2, 2), (0, 2) and (2, 0) of DE3, drawn as formal pictures in annuli. In the figure, D is the quantum dimension of DE3, d = −A^2 − A^{−2} with A = ie^{πi/10}, the crossing symbol on strands labelled 2 is expanded using the Kauffman bracket, and (2, 0) is the complex conjugate of (0, 2).]
opposite spin stacked up parallel to C_i but in the interior (Y), and where $S_{y,x} = \sqrt{\tfrac{2}{\ell+2}}\,\sin\frac{\pi x y}{\ell+2}$ is the S-matrix of the undoubled theory. For n = 0, . . . , ℓ/2, the maps β_n are commuting projectors (up to a scalar), which also commute with twists around other boundary components. For k_i = 0, “position” is refined to a label by applying the idempotent β_p. The trivial label is β_0. The image of β_p turns out to be the −(A^{2p+2} + A^{−2p−2}) eigenspace of the action of R on the |a − b| = 0 “position” summand of V(Y). Thus when ∂Y ≠ ∅, DEℓ(Y, t⃗) is an orthogonal summand of QEℓ(Y) determined by specifying a minimal even number n_i of arcs reaching each boundary component C_i (but not parallel to it), n_i ∈ {0, . . . , ℓ}, and an eigenvalue λ_i at C_i ⊂ ∂Y. Note that gluing
[Fig. 2.5. Data for ℓ = 3, with d = (1 + √5)/2. The S-matrix and the F-matrix of the doubled theory are each the tensor square of a 2 × 2 matrix of the undoubled theory. In the undoubled theory the nonzero fusion coefficients are N_000, N_222, N_022, N_202 and N_220. In the doubled theory, 25 nonzero N's occur: N_{ii′ jj′ kk′}, obtained by interleaving the subscripts of any two N_{ijk} and N_{i′j′k′} above. The action of the Dehn twist fixes (0, 0) and (2, 2) and multiplies (0, 2) and (2, 0) by phases.]
different eigenspace images automatically implies a trivial result, as required by the gluing axioms. This is immediate from the commutativity of the minimal idempotents. For surfaces with boundary, we may write
$$DE(Y) := \bigoplus_{\text{admissible labelings } \vec{t}} DE(Y, \vec{t}\,),$$
then DE(Y) = QE(Y) in this case as well. For the first computationally universal case, level ℓ = 3, the S-matrix, F-matrix (= 6j-symbol), the action of the Dehn twist, and Verlinde (fusion) relations are listed above. The only interesting F-matrix in the undoubled theory occurs when all four external labels = 2, so F reduces to a 2-tensor. We have come to a point where we can study the difference between span{2-colorings (|+⟩, |−⟩)} = G◦,ℓ and the − -invariant combinations G⁺◦,ℓ. We may begin with the enhancement of ETLℓ(Y) to ETL^!ℓ(Y), manifold 2-colorings modulo d-isotopy (for d = 2 cos π/(ℓ + 2)). The enhancement leads to the “color reversal particles” of Fig. 0.2, which do not fit exactly into the UTMF−TQFT formalism (but perhaps a Z2-graded version?) as they do not raise the ground state degeneracy on the torus. They should, however, arise physically and contribute to specific heat. We will return to these shortly. First, we show that there is only one lifting to the enhancement of the projector relation p_{ℓ+1}: for ℓ odd let p^{black}_{ℓ+1} and p^{white}_{ℓ+1} denote the extensions of p_{ℓ+1} to 2-colorings, that is, the relation applied to a 2-coloring in a neighborhood of a transverse arc which crosses ℓ + 1 strands of γ from black to black and white to white respectively. The reader may
take black = |+⟩ and white = |−⟩. If ℓ is even, noting that all the projectors p_i have left-right symmetry, there is only one way to lift p_{ℓ+1} to ETL^!ℓ. Proposition 2.10. J p^{black}_{ℓ+1} = J p^{white}_{ℓ+1}. Proof. We may use under crossings to indicate formal combinations of TL-diagrams which are consistent with the Kauffman relation:
[Fig. 2.6. The Kauffman relation: a crossing equals A times the smoothing )( plus A^{−1} times the other smoothing.]
Now consider (for ℓ = 3) the following sequence.
[Fig. 2.7. A sequence of pictorial identities, involving factors of 1/d and 13-picture expansions, moving a strand across the arc α.]
These 6 steps effect p^{black}_{ℓ+1} across the arc α with an application of p^{white}_{ℓ+1}. □
For a closed surface Y let QE^!ℓ(Y) be the enhancement ETL^!ℓ(Y)/⟨p^{black}_{ℓ+1}⟩ = ETL^!ℓ(Y)/⟨p^{white}_{ℓ+1}⟩ and QE^!ℓ(Y) −→ QEℓ(Y) the forgetful map. We do not know if this map is always an isomorphism. However, for the case of most interest, ℓ = 3, Proposition 2.11 below shows that QE^!_3(T^2) ≅ QE_3(T^2), T^2 the 2-torus. Since dim V(T^2) = |L|, the cardinality of the label set, this implies that the largest quotient of QE^!_3 having the structure of a UTMF is isomorphic to QE_3. Proposition 2.11. QE^!_3(T^2) ≅ QE_3(T^2). Proof. Let B be the black and W the white coloring of T^2 while M and M^2 are the one and two meridianal ring colorings respectively:
Fig. 2.8
Applying p4 across the arc γ (and a short calculation using Fig. 2.3) yields: M^2 = 3M − W. Since M and M^2 are black-white symmetric the third term must be symmetric as well, hence W = B. It follows quickly that QE^!_3(T^2) = C-span(W, M, L, D), where L is a longitudinal and D a diagonal ring. Forgetting the 2-coloring and retaining only the domain wall we get a basis for QE_3(T^2) ≅ DE3(T^2). □
It is possible to break the color symmetry − by adjusting the Hamiltonian to fix the color = |−⟩ at some plaquet on each component of Y. This adjustment creates a new ground state G canonically isomorphic to the former G⁺, so we drop the + from the notation. However this does not obviate the need to study the enhancement. The point is that localized color-reversing excitations remain and are expected physically. These, when realized on an annulus algebra, have opposite coloring on S^1 × −1 and S^1 × +1, and so cannot be glued into a ground state on T^2. Let us see how this works in the simplest example, the level ℓ = 1 theory: ℓ = 1, d = 1, A = ie^{2πi/12}. When we make no “evenness” restriction this theory, D1, is also called Z2-gauge theory [SF] and [K1]. It has four labels: 0 = ½(∅ + R), m = ½(∅ − R), e = ½(I + T), and em = ½(I − T), where the pictures in these combinations are:
[Pictures of the four annular diagrams ∅, R, I and T.]
Notice that the labels are orthogonal under stacking annuli. One may check that braiding m around e or em introduces a phase factor = −1, as does braiding e around m or em. The even theory DE1 (A = ie^{2πi/12}) has only one particle, (0, 0), which is the trivial particle, and has dimension = 1 on T^2, and so is quite trivial. Quantum systems with other microscopics (e.g. [K1]) can easily realize D1, but in our set up the pictures do not arise directly but indirectly as domain walls, so I and T make no sense. However R does make sense as a domain wall between |+⟩ and |−⟩ boundary conditions at opposite ends of A. In fact, we may define an elementary excitation of DE1 at a plaquet c by using the local projector on C^2_c, (|+⟩_c + |−⟩_c) ⊗ (⟨+|_c + ⟨−|_c), instead of the ground state projector (|+⟩_c − |−⟩_c) ⊗ (⟨+|_c − ⟨−|_c). Thus the “m” particle can arise as an
excitation of DE1 even though it does not contribute to ground state degeneracy. This is the prototypical color-reversing particle. Regarding the other information in Fig. 0.2, the coloring preserving or “label” excitations (the irreps. of ∧A) are counted by the (even, even) lattice points in [0, ℓ + 1] × [0, ℓ + 1]. The color reversing labels are (odd, odd) ⊂ [0, ℓ + 1] × [0, ℓ + 1]. The S-matrix of the (undoubled) SU(2) or Kauffman theory when restricted to even labels is singular (precisely) for ℓ ≡ 2 mod 4, for example at ℓ = 2,
$$S = \begin{pmatrix} \tfrac12 & \tfrac{\sqrt2}{2} & \tfrac12 \\ \tfrac{\sqrt2}{2} & 0 & -\tfrac{\sqrt2}{2} \\ \tfrac12 & -\tfrac{\sqrt2}{2} & \tfrac12 \end{pmatrix}, \qquad S_{\text{even}} :: \begin{pmatrix} \tfrac12 & \tfrac12 \\ \tfrac12 & \tfrac12 \end{pmatrix}.$$
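The ℓ = 2 example can be reproduced numerically. The following is a minimal sketch, assuming that the labels 0, 1, . . . , ℓ enter the S-matrix through the shifted indices a + 1 and b + 1 (an indexing convention adopted here because it yields the ℓ = 2 entries quoted above); the function names are illustrative only.

```python
import math

def su2_s_matrix(level):
    """S-matrix of the undoubled SU(2) theory, labels 0..level.

    Assumed convention: entry (a, b) is
    sqrt(2/(level+2)) * sin(pi*(a+1)*(b+1)/(level+2)).
    """
    n = level + 2
    norm = math.sqrt(2.0 / n)
    return [[norm * math.sin(math.pi * (a + 1) * (b + 1) / n)
             for b in range(level + 1)]
            for a in range(level + 1)]

def restrict_to_even(matrix):
    """Keep only the rows and columns with even labels 0, 2, 4, ..."""
    even = range(0, len(matrix), 2)
    return [[matrix[a][b] for b in even] for a in even]

def det2(m):
    """Determinant of a 2 x 2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

S2 = su2_s_matrix(2)                        # the 3 x 3 matrix at level 2
S2_even = restrict_to_even(S2)              # labels 0 and 2 only
S3_even = restrict_to_even(su2_s_matrix(3)) # labels 0 and 2 at level 3

print(S2_even, det2(S2_even), det2(S3_even))
```

At ℓ = 2 the two even rows coincide, so the restricted determinant vanishes, while at ℓ = 3 it does not.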
When S is nonsingular, ℓ ≠ 1, 2, 4 and the number of braid strands ≥ 5, it is known [FLW2] that the braid representations are dense in the corresponding special unitary groups.
3. Perturbation and Deformation of H◦,ℓ
As remarked near the end of Sect. 0, excited states, i.e. anyons, will be studied as ground states on a punctured surface with labelled boundary. In the large separation limit, the braiding of anyons can be formulated as an adiabatic evolution of the ground state space on a labelled surface Y with boundary. So in the present section we confine the discussion to ground states. Although boundary is assumed to be present and labelled, we will nevertheless consider only perturbations acting in the bulk, so the role of the boundary is peripheral in this section. The passage from G◦,ℓ, the ground state space of H◦,ℓ, to the deformed ground state space Gε,ℓ of Hε,ℓ does not result from the breaking of a symmetry; in fact G◦,ℓ has no obvious symmetry. Rather it is the creation of new “symmetry”: topological order. If a perturbation V is breaking an existing symmetry then only the original ground state and the effective action of V at the lowest nontrivial order are relevant. But in the present case, to understand the effect of a perturbation V, one should first describe all low lying (gapless) excitations above G◦,ℓ and then see how V can act effectively on G◦,ℓ through virtual excitations. For example, in the toric codes [K1] the ground state space may be rotated in an interesting way if a virtual pair of e (electric) particles appears, tunnels around an essential loop (of combinatorial length = L), and then annihilates. In the case of toric codes there is an energy gap to creation of (e, e) pairs, so the above process has exponentially small amplitude ∼ e^{−L/L◦} in the refinement scale L.
In contrast, for level ℓ ≤ 2 we expect the ground state space G◦,ℓ to be gapless^5 (in the thermodynamic limit) and processes which act through virtual excitations will be important in the perturbation theory because excitations are cheap. However it seems hopeless to analytically describe these gapless excitations, so we skip this step and resort to an ansatz (3.4) stated below.
5 For ℓ = 3 the gap may be extremely small, as explained later in this section.
It
asserts that Gε,ℓ is modelled as the common null space of local projectors acting on G◦,ℓ. We argue for this via an analogy to FQHE, uniqueness considerations and “consistency checks”. From Sect. 2, the reader knows that we wish to identify Gε,ℓ with G◦,ℓ/⟨p_{ℓ+1}⟩ (for suitable values of ε), and this is what the ansatz implies. The fact that G◦,ℓ/⟨p_{ℓ+1}⟩ ≅ DEℓ (see Sect. 2 and [FNWW]) has the structure of an anyonic system (mathematically a UTMF) is the first consistency check. There will be one more presented in Sect. 4. Let us prepare to state Ansatz 3.4 carefully.
Definition 3.1. An operator O on a tensor space $H = \bigotimes_{v \in V} \mathbb{C}^2_v$ is k-local if it is a sum of operators O_i each acting on a bounded (≤ k) number of tensor factors and as the identity on the remaining factors. We say O is strongly local if the index set {V} consists of the vertices of a triangulation Δ and O = Σ_i O_i, where each O_i is k-local with the k active vertices spanning a connected subgraph G_i of Δ. All G_i are assumed isomorphic, with fixed isomorphisms G_i −→ G_j inducing isomorphisms O_i ≅ O_j. In the latter case, we call (Δ, O) a quantum medium.
Note 3.2. In the special case that a family of strongly local operators {O_i} are projections onto 1-dimensional subspaces, the system {O_i} is equivalent to what topologists call a combinatorial skein relation ([L, KL]), though in the topological context equivalence of isotopic pictures is implicitly assumed. An example of a (14 term) skein relation is p_{ℓ+1} = 0 (see Fig. 2.3), applied to γ, the dual-1-cell domain wall between |+⟩ and |−⟩. A skein relation is a local linear relation between degrees of freedom, domain walls in our case. The intersection of all the null spaces, ∩_i null(O_i) = null(O), is the subspace perpendicular to the equivalence classes in H defined by the combinatorial skein relation.
Definition 3.3. The joint ground space (jgs) of {O_i} is ∩_i E◦,i, where E◦,i is the eigenspace corresponding to the lowest eigenvalue λ◦ of O_i.
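Definition 3.3 can be tried out on the smallest possible case. The sketch below is an illustration only, with a toy choice of operators that is not taken from the text: on a single qubit it takes O_1 = |1⟩⟨1| and O_2 = |−⟩⟨−| with |−⟩ = (|0⟩ − |1⟩)/√2, finds the lowest eigenspace of each, and compares with the lowest eigenvalue of O_1 + O_2.

```python
import math

def lowest_eigenpair(m):
    """Lowest eigenvalue and a unit eigenvector of a real symmetric 2x2 matrix.

    Assumes the lowest eigenvalue is non-degenerate (true for the examples below).
    """
    (a, b), (_, c) = m
    mean = (a + c) / 2.0
    radius = math.hypot((a - c) / 2.0, b)
    lam = mean - radius
    if abs(b) > 1e-12:
        v = (b, lam - a)            # solves (a - lam) x + b y = 0
    else:
        v = (1.0, 0.0) if a <= c else (0.0, 1.0)
    norm = math.hypot(v[0], v[1])
    return lam, (v[0] / norm, v[1] / norm)

O1 = [[0.0, 0.0], [0.0, 1.0]]        # projector onto |1>
O2 = [[0.5, -0.5], [-0.5, 0.5]]      # projector onto (|0> - |1>)/sqrt(2)
O_sum = [[O1[i][j] + O2[i][j] for j in range(2)] for i in range(2)]

lam1, v1 = lowest_eigenpair(O1)
lam2, v2 = lowest_eigenpair(O2)
lam_sum, _ = lowest_eigenpair(O_sum)

# Two lines in the plane meet only at 0 unless the spanning vectors are parallel.
cross = v1[0] * v2[1] - v1[1] * v2[0]
jgs_is_zero = abs(cross) > 1e-9

print(lam1, lam2, jgs_is_zero, lam_sum)
```

Here each O_i has lowest eigenvalue 0, the two lowest eigenspaces are distinct lines, and the lowest eigenvalue of the sum is strictly positive.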
The jgs {O_i} is not necessarily the lowest eigenspace of O = Σ_i O_i because jgs {O_i} can easily be {0}. In this case the Hamiltonian is “frustrated”. It may happen that O has long wave length excitations at the bottom of its spectrum which do not show up in the spectrum of O_i. However, if O defines a stable physical phase it is an optimistic but not unrealistic assumption that jgs {O_i} = ground state space (O). For example, this occurs in the “ice model” or “perfect matching problem” on the honeycomb lattice [CCK] and in the fractional quantum Hall effect (FQHE). The FQHE begins with a “raw” state space H, the lowest eigenspace for an individual electron confined to a 2-dimensional disk D and subjected to a transverse magnetic field B. This H is called the lowest Landau level. Each level can hold a number of spin + and − electrons ≈ area D/(magnetic length)^2, and the fraction of that number actually residing at the level is called the filling fraction ν. In a spherical model, the Coulomb interaction H between pairs of electrons (e_i, e_j) can be written [RR] as a sum of projectors onto various “joint angular momentum subspaces” p_ij((2k + 1)N_φ). The null space, null(H_q), for
$$H_q = \sum_{k=0}^{(q-3)/2} \sum_{i<j} p_{ij}\big((2k+1)N_\phi\big) \qquad (3.1)$$
is nontrivial and, of course, is the joint ground space jgs of the individual projectors in the sum. In fact, null(H_q) is Laughlin’s “odd denominator” state space at ν = 1/q.
Ansatz 3.4. For well chosen ε, the perturbed ground state space Gε will be stable and can be written as Gε = jgs(|s_i⟩⟨s_i|) ∩ G◦ for some strongly local family of projectors {|s_i⟩⟨s_i|} acting on H. Equivalently, if s_i^◦ is the orthogonal projection of s_i into G◦ and |s_i^◦⟩⟨s_i^◦| : G◦ −→ G◦ is the corresponding projector, then Gε ≅ jgs(|s_i^◦⟩⟨s_i^◦|).
In topological terms the ansatz asserts that the reduction G◦ −→ Gε occurs by imposing a skein relation. The ansatz is essentially a strong locality assumption. As discussed above, the Laughlin ν = 1/q states, q odd, follow this pattern, with the Coulomb interaction between electrons playing the role of the perturbation on the disjoint union of single electron systems. Since the Landau level has no low lying excitation the analogy is closest with G◦,ℓ, ℓ ≥ 3. Theorem 2.5 gives us the following:
Implication of Ansatz 3.4. Suppose H◦ is subjected to a sufficiently small perturbation, ℓ ≤ 2, or an appropriate deformation, ℓ ≥ 3, which partially lifts the log extensive degeneracy of the ground state G◦,ℓ to yield a strictly less degenerate ground state Gε,ℓ. If we assume the stability of Gε,ℓ we expect Gε,ℓ to be modelled as Gε,ℓ ≅ G◦,ℓ/⟨p_{ℓ+1}⟩ = the modular functor DEℓ.
For the projector p_{ℓ+1} to arise as an effective action of εV on G◦,ℓ = C⟨d-isotopy classes of domain walls⟩, for ε > 0, various sets of ℓ + 1 walls must have a polynomially large probability of simultaneously visiting the support U_i of some local combinatorial O_i = p^i_{ℓ+1} enforcing p_{ℓ+1}. The walls must visit a site U_i or p^i_{ℓ+1} cannot enforce orthogonality to the relation vector p_{ℓ+1} (as depicted in Fig. 2.3 for ℓ ≤ 3). The notion of a combinatorial instance of p_{ℓ+1} was developed in [F2]. It amounts to a discretization of the smooth domain wall diagrams (Fig. 2.3) by choosing specific superpositions of local plaquet spin configurations (with the spin state of the plaques at the boundary of the configuration constant) to represent the smooth relation p_{ℓ+1}. Evidently there are many distinct combinatorial patterns which are instances of a fixed p_{ℓ+1}. The simplest of these amount to geometric rules for simplifying the ℓ + 1 domain walls when these run parallel for (roughly) ℓ + 1 plaques. As discussed in [F2], imposition of such a combinatorial relation, in the presence of mild assumptions on the triangulation Δ, produces a result isomorphic to the smooth quotient. It is sufficient that the triangulation have injectivity radius >> ℓ + 1 and bounded valence. So Δ should be subdivided (to approach a thermodynamic limit) as shown in Fig. 3.1.
Heuristic. There is a curious pattern observed in Fig. 2.3 (and in further computer calculations of Walker (private communication)): for ℓ + 1 = even and d = 2 cos π/(ℓ + 2), the sum of the coefficients in p_{ℓ+1} in the geometric basis {g} is zero, and for ℓ + 1 = odd, the sum is small. The geometric pictures {g} may be filtered by an integer weight n which counts the fewest sign changes on plaquets (in topological terms, the fewest domain wall reconnections or “surgeries”) required to transform the straight (identity) picture to g. So, referring to Fig. 2.3, the first term for p2 has weight = 0, the second term has weight = 1. The terms for p3 have weights 0, 2, 2, 1, 1 respectively. Notice that sign(coefficient(g)) = (−1)^{weight(g)}. This suggests that V = Σ_{c, plaquet} σ_x^c is a reasonable choice for V to obtain Fig. 0.1.
The Pauli matrix σ_x has +1-eigenvector |+⟩ + |−⟩ and −1-eigenvector |+⟩ − |−⟩, and thus assigns a lower energy to combinations of geometric pictures g which have a (−1)
phase shift associated to domain wall surgery. The perturbation
$$V = \sum_c \tfrac12 \big(|+\rangle + |-\rangle\big)\big(\langle +| + \langle -|\big) = \tfrac12 \sum_c \big(\mathrm{id} + \sigma_x^c\big)$$
contains terms which annihilate antisymmetric combinations of approaching domain walls of γ: |)(⟩ − |≍⟩. Nonzero entries coupling all the terms of p_{ℓ+1} occur first at order ℓ, i.e. in V^ℓ, so one may expect that this is the order at which an effective action arises on G◦,ℓ.
It is now time to treat the statistical physics of a general ground state vector Ψ ∈ G◦,ℓ. A perturbed Hamiltonian Hε,ℓ will not have a ground state modelled by G◦,ℓ/⟨p_{ℓ+1}⟩ if the domain walls of Ψ have an effective tension. The simplest place to see this is on a closed surface Y with the triangulation determining a metric. When the domain walls γ ⊂ Y are pulled tight under tension they will stand a bounded distance apart and have exponentially small amplitude for simultaneously entering a small locality U_i, so p^i_{ℓ+1} will be unable to act. Measuring any Ψ ∈ G◦,ℓ via the commuting family {σ_z^c} projects into the geometric basis. This results in a probabilistic spin configuration which is Gibbs with probabilities proportional to n^{# loops}, where n = d^2 = (2 cos π/(ℓ + 2))^2. Let us write Ψ ∈ H, ||Ψ|| = 1, in the classical basis of spin configurations, Ψ = Σ a_k |k⟩. We say, consistent with measurement of any observable which is diagonal in the |k⟩ basis, that |k⟩ has probability |a_k|^2. Thus k becomes a random classical component of the random configuration, meas.(Ψ). Just as one asks about the typical Brownian path, we ask what a typical k looks like. There will be a competition between energy and entropy. Since d > 1, the Hamiltonian H◦,ℓ “likes” trivial circles and will place a high weight on configurations with most of the surface area of Y devoted to a “foam” of small circles. However entropy favors configurations with longer, fractal loops which exhibit more variations. From the critical behavior of loop gases we know that for d ≤ √2 entropy dominates and Ψ ∈ G◦,ℓ is a critical Gibbs state with typical loops fractal. For d > √2 energy dominates and the Gibbs state is stable: to free up dual lattice bonds to build this foam the topologically essential part γ⁺ of γ will be pulled tight by an effective “string tension”. Recall from the introduction that the Gibbs weight on a loop gas state γ is proportional to w(γ) = e^{−k(total length γ)} n^{# components γ}; ours is the self dual, k = 0 case. Let us be explicit. In any ground state vector Ψ ∈ G◦,ℓ, Ψ = Σ a_i ψ_i, we have seen that the coefficients a_1 and a_2 of d-isotopic configurations ψ_1 and ψ_2 satisfy a_1/a_2 = d^{#_1}/d^{#_2}, where #_i is the number of trivial domain wall components, “trivial loops”, of ψ_i. The Pauli matrices $\sigma_z^v = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$ applied at vertex = v form a commuting family of observables, so we may “observe” Ψ in the geometric basis {ψ_i} of classical spin configurations to obtain meas.(Ψ), and we see that the ratio of probabilities of observing ψ_1 versus ψ_2 is:
$$\frac{p(\psi_1)}{p(\psi_2)} = \big(d^{\#_1 - \#_2}\big)^2 = (d^2)^{\#_1 - \#_2}. \qquad (3.2)$$
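As a worked instance of (3.2), with numbers chosen here purely for illustration: take ℓ = 2, so that d = 2 cos(π/4) = √2, and suppose two d-isotopic configurations have 3 and 1 trivial loops respectively.

```latex
\[
  \frac{p(\psi_1)}{p(\psi_2)}
  = (d^2)^{\#_1 - \#_2}
  = \bigl(\sqrt{2}^{\,2}\bigr)^{3-1}
  = 2^{2}
  = 4 .
\]
```

Each additional trivial loop multiplies the probability of a configuration by n = d^2, which is 2 in this example.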
Thus observing Ψ in the classical basis yields a Gibbs state meas.(Ψ) :: e^{−βE(ψ_i)}|ψ_i⟩ for E(ψ_i) = −#_i and β = 2 log d. Such probabilistic states are called “loop gases” and have been extensively studied [Ni], e.g. in the context of the O(n)-model. It is believed that, in the self dual case k = 0, d ≤ √2, ℓ ≤ 2, there is no string tension and that the Gibbs state is critical: typical loops are 1/polynomial in size and correlations decay polynomially. Furthermore the familiar “space = imaginary time ansatz” (see lines (3.7)–(3.13)) suggests that this regime,
ℓ ≤ 2, should have G◦,ℓ gapless. For ℓ > 2, the loop gas is beyond the critical range. For these values correlations of the loop gas decay exponentially and it is believed the loops are in “bubble phase”, where any long loop forced by topology would be pulled tight by an effective string tension. The corresponding H◦,ℓ, ℓ > 2, should be gapped above its (polylog) extensively degenerate ground state space G◦,ℓ (compare with line (3.8)). It is in this case, specifically for ℓ = 3, that we still may hope the ansatz describes Gε,ℓ for some family of deformations Hε,ℓ, 1 > ε > ε◦ > 0, as suggested by the phase diagram, Fig. 0.1. This would imply Gε,ℓ = G◦,ℓ/⟨p_{ℓ+1}⟩ ≅ G◦,ℓ/R = DEℓ. The gap above G◦,3, and therefore ε◦, might be quite small. A loop gas with k = 0 (defined on a mid-lattice, see Ch. 12 [B]) is closely related to the FK representation of the self-dual Potts model at q = n^2 = d^4. For ℓ = 3, d = (1 + √5)/2 and q ≅ 5.6. Although the self-dual lattice Potts models in 2 dimensions have second order transitions (critical) for q ≤ 4 and first order transitions (finite correlation length) for q > 4, exact calculations show that for q = 5.6 the correlation length ζ, though finite, is several hundred lattice spacings [BJ]^6.
Our “alternative” Hamiltonian H′◦,ℓ resulted from an effort to sharpen the relation between meas.(Ψ) and the (FK) Potts model. I would like to thank Oded Schramm for helpful conversations on this relation. Recall that H′◦,ℓ is a Hamiltonian on the Hilbert space of spin = 1/2 particles on the bonds of a surface triangulation, cellulation, or lattice. We will work locally and so ignore contributions of the global Euler characteristic χ(Y) to the formulas below. Also, we write “=” to mean that the equation holds up to a fixed extensive constant, like the total number of bonds. Recall from the introduction that we consider the union of |+⟩-bonds and, disjoint from this, the union of |−⟩-dual bonds. Our “loop gas” is on the mid lattice separating the |+⟩-clusters from the |−⟩-clusters. Let E be the number of |+⟩-edges and E∗ the number of |−⟩-edges, C (C∗) the number of clusters (dual clusters), with the convention that isolated vertices (dual vertices) count as clusters (dual clusters). Let L be the number of loops in the loop gas. The Potts model parameter q (number of colors), it turns out, should be set as q = d^4 = (2 cos π/(ℓ + 2))^4. Finally, 0 ≤ p ≤ 1 denotes a probability. We have two basic equations. Every loop (in the plane) is the outermost boundary of either a cluster or dual cluster, so:
$$L = C + C^*. \qquad (3.3)$$
Also there is an Euler relation:
$$C^* \text{ “=” } C + E. \qquad (3.4)$$
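The Euler relation can be checked by brute force on a small planar example. The sketch below is only an illustration under stated assumptions: it uses a rectangular grid of vertices rather than a surface triangulation, treats each bond as either open (a primal edge) or closed (a dual edge), and verifies the exact planar identity C∗ = C + E − V + 1, which reduces to C∗ “=” C + E once the fixed number of vertices V is absorbed into the constant. All names are hypothetical.

```python
import random

class UnionFind:
    """Minimal union-find for counting connected components."""
    def __init__(self, items):
        self.parent = {x: x for x in items}
    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x
    def union(self, x, y):
        self.parent[self.find(x)] = self.find(y)
    def count(self):
        return len({self.find(x) for x in self.parent})

def grid_edges(rows, cols):
    """Yield (vertex, vertex, face, face) for each edge of a rows x cols grid."""
    def face(i, j):
        inside = 0 <= i < rows - 1 and 0 <= j < cols - 1
        return (i, j) if inside else "outer"
    for i in range(rows):
        for j in range(cols):
            if j + 1 < cols:   # horizontal edge: faces above and below it
                yield (i, j), (i, j + 1), face(i - 1, j), face(i, j)
            if i + 1 < rows:   # vertical edge: faces to its left and right
                yield (i, j), (i + 1, j), face(i, j - 1), face(i, j)

def cluster_counts(rows, cols, is_open):
    """Return (C, C_dual, E, V) for the given open/closed pattern."""
    vertices = [(i, j) for i in range(rows) for j in range(cols)]
    faces = [(i, j) for i in range(rows - 1) for j in range(cols - 1)] + ["outer"]
    primal, dual = UnionFind(vertices), UnionFind(faces)
    n_open = 0
    for k, (u, v, f, g) in enumerate(grid_edges(rows, cols)):
        if is_open[k]:
            primal.union(u, v)   # open bond joins two vertices
            n_open += 1
        else:
            dual.union(f, g)     # closed bond: its dual bond joins two faces
    return primal.count(), dual.count(), n_open, len(vertices)

rng = random.Random(0)
results = []
for _ in range(200):
    rows, cols = rng.randint(2, 6), rng.randint(2, 6)
    n_edges = rows * (cols - 1) + (rows - 1) * cols
    is_open = [rng.random() < 0.5 for _ in range(n_edges)]
    C, C_dual, E, V = cluster_counts(rows, cols, is_open)
    results.append(C_dual == C + E - V + 1)

print(all(results))
```

The identity is Euler's formula for the open subgraph, whose faces are exactly the dual clusters.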
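Now we can re-express the loop gas Gibbs weight ω in terms of clusters in the FK Potts model:
$$\omega(\text{spin conf.}) \text{ “=” } (d^2)^{L} = (d^2)^{C+C^*} \text{ “=” } (d^4)^{C} (d^2)^{E}\, 1^{E^*} \text{ “=” } (d^4)^{C} \left(\frac{d^2}{d^2+1}\right)^{E} \left(\frac{1}{d^2+1}\right)^{E^*} = q^{C} p^{E} (1-p)^{E^*}. \qquad (3.5)$$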
6 I thank Steve Kivelson for pointing out the existence and relevance of these calculations.
Because of the − -symmetry between |+⟩ and |−⟩ we expect p to be the self dual point for this value of q, and we check this below:
$$\omega \text{ “=” } q^{C} p^{E} (1-p)^{-E} = q^{C^*} p^{-E} (1-p)^{E} \text{ “=” } \omega^*,$$
using (3.4) we get p^{2E} (1 − p)^{−2E} = q^E, so we have proved
$$p = \frac{\sqrt{q}}{1+\sqrt{q}}. \qquad (3.6)$$
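The parameters in (3.5) and (3.6) are easy to tabulate. The following is a small sketch (function names are illustrative, not from the text) that computes d = 2 cos π/(ℓ + 2), q = d^4 and the self dual p of (3.6), and checks that p agrees with the factor d^2/(d^2 + 1) appearing in (3.5).

```python
import math

def loop_weight_d(level):
    """d = 2 cos(pi / (level + 2))."""
    return 2.0 * math.cos(math.pi / (level + 2))

def potts_parameters(level):
    """Return (d, q, p) with q = d**4 and p = sqrt(q) / (1 + sqrt(q))."""
    d = loop_weight_d(level)
    q = d ** 4
    p = math.sqrt(q) / (1.0 + math.sqrt(q))
    return d, q, p

def bond_probability_from_d(d):
    """The bond weight read off from (3.5): d^2 / (d^2 + 1)."""
    return d * d / (d * d + 1.0)

d1, q1, p1 = potts_parameters(1)   # level 1
d2, q2, p2 = potts_parameters(2)   # level 2, the edge of the critical range

consistent = all(
    abs(potts_parameters(level)[2]
        - bond_probability_from_d(loop_weight_d(level))) < 1e-12
    for level in range(1, 11)
)
print(d2, q2, p2, consistent)
```

At level 2 this gives d = √2, q = 4 and p = 2/3; at level 1 it gives d = 1, q = 1 and p = 1/2.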
Theorem 3.5. Observing the family {σ_z^b for all bonds b} maps any ground state vector Ψ ∈ G◦,ℓ into a Gibbs state meas.(Ψ) of the self dual Potts model for q = (2 cos π/(ℓ + 2))^4. □
This justifies thinking of H′◦,ℓ as a “quantum Potts” model and contemplating a diagram of relations:
[Fig. 3.1. Diagram of relations among: conformal field theory (sl2, level ℓ); the (SU(2), ℓ) UTMF; DEℓ; the Potts model with q = (2 cos π/(ℓ + 2))^2; the quantum Potts model with d = 2 cos π/(ℓ + 2); and the Potts model with q = (2 cos π/(ℓ + 2))^4. The arrows are labelled “infra-red limit”, “double”, “correlation functions”, “zero modes” and “observe”.]
Let us return to the heuristic connecting the spectral gap above G◦,ℓ and the statistical properties of a ground state vector Ψ ∈ G◦,ℓ. Let A and B be two strongly local operators on a quantum medium (Δ, H). For example, A might be σ_z^c, measuring at plaquet c. Assume, first, that there is a unique (up to a phase) ground state vector O and a spectral gap = δ above ⟨O|H|O⟩ = 0. Then if we evolve in imaginary time:
$$\text{“imaginary time correlation”} = \langle O| e^{tH} A e^{-tH} B |O\rangle = \langle O| A e^{-tH} B |O\rangle. \qquad (3.7)$$
Writing B|O⟩ = ⟨O|B|O⟩ |O⟩ + B′ and ⟨O|A = ⟨O|A|O⟩ ⟨O| + A′, note that the non-O summands B′ and A′ decay under imaginary time evolution at a rate ≥ e^{−δt}. Thus applying e^{−tH} to either the bra or the ket in (3.7) gives:
$$\text{“imaginary time correlation”} \longrightarrow \langle O|A|O\rangle \langle O|O\rangle \langle O|B|O\rangle = \langle O|A|O\rangle \langle O|B|O\rangle, \qquad (3.8)$$
where the convergence (−→) is exponential. Denote by A_ℓ the spatial translation of A, the analogous operator to A acting near a site at distance ℓ from the support of the original A. The simplest expectation is that spatial correlations should behave in the same way as imaginary time correlations,
$$\text{“space correlation”} = \langle O| A_\ell B |O\rangle \xrightarrow{\ \text{exponentially}\ } \langle O|A|O\rangle \langle O|B|O\rangle. \qquad (3.9)$$
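The decay in (3.7) and (3.8) can be seen in a toy computation. The sketch below is an illustration only: the three-level spectrum and the matrices A and B are made-up numbers, H is taken diagonal in its eigenbasis with the ground state at index 0, and the deviation from the product ⟨O|A|O⟩⟨O|B|O⟩ is compared with the bound e^{−δt} Σ_{k>0} |A_{0k} B_{k0}|.

```python
import math

# Hypothetical data: energies of a gapped 3-level system (gap delta = 0.5)
# and two real symmetric "observables" written in the energy eigenbasis.
energies = [0.0, 0.5, 1.3]
A = [[0.2, 0.7, -0.4], [0.7, 1.0, 0.3], [-0.4, 0.3, -0.5]]
B = [[-0.6, 0.5, 0.9], [0.5, 0.1, 0.2], [0.9, 0.2, 0.8]]

def imaginary_time_correlation(t):
    """<O| A exp(-tH) B |O> with |O> the ground state (index 0)."""
    return sum(A[0][k] * math.exp(-t * energies[k]) * B[k][0]
               for k in range(len(energies)))

limit = A[0][0] * B[0][0]                  # <O|A|O><O|B|O>
delta = energies[1]                        # spectral gap
bound = sum(abs(A[0][k] * B[k][0]) for k in range(1, len(energies)))

deviations = [(t, abs(imaginary_time_correlation(t) - limit))
              for t in (0.0, 5.0, 10.0, 20.0)]
within_bound = all(dev <= bound * math.exp(-delta * t) + 1e-15
                   for t, dev in deviations)
print(deviations, within_bound)
```

Only the excited-state terms contribute to the deviation, and each of them carries a factor of at most e^{−δt}.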
This statement is precise in a Lorentz invariant context but is expected also to hold in greater generality provided that there is some linkage between temporal and spatial scales. A code space G ⊂ H is an important generalization of a 1-dimensional subspace (see [G] for examples and discussion in other notations).
Definition 3.6. We say G ⊂ H is a k-code, with k measuring the strength of the encryption, if for any strongly k-local operator O_k the composition
$$G \xrightarrow{\ \mathrm{inc}\ } H \xrightarrow{\ O_k\ } H \xrightarrow{\ P_G\ } G \qquad (3.10)$$
must be multiplication by some scalar (perhaps zero). P_G is the orthogonal projection onto G.
The importance of a code space is that it resists local perturbations. In [G] Gottesman states, in different language, his Thm 3:
Theorem 3.7. Suppose L ⊂ H is a subspace of a Hilbert space and $\mathcal{E}$ is a linear space, called “errors”, of the operators HOM(H, H) so that the composition below, for any $E \in \mathcal{E}$, is always multiplication by a scalar c(E):
$$L \xrightarrow{\ \mathrm{inc}\ } H \xrightarrow{\ E\ } H \xrightarrow{\ P_L\ } L.$$
Then L constitutes a “code space” protected from errors in $\mathcal{E}$. This means that there is a physical operator (composition of measurements, their adjoints, and unitary transformations) which corrects errors coming from $E \in \mathcal{E}$. (P_L is orthogonal projection onto L.) □
We have already encountered a code space. The space DEℓ ≅ G◦,ℓ/⟨p_{ℓ+1}⟩ ⊂ H has the code property (and similarly for G◦,ℓ/p_{ℓ+1}). From the theorem and the disk axiom (Sect. 2) we see immediately that the dual space G◦,ℓ ∩ ⟨p_{ℓ+1}⟩^∗ ⊂ H is a code space for operators (errors) supported in any fixed disk D ⊂ Y. This is true long before the refinement limit; we need only a modest level of refinement before the combinatorial quotient exactly assumes the structures of a unitary topological modular functor (UTMF) [F2]. Then Axiom 4 (Sect. 2) says that V(D, 0) ≅ C and V(D, a) ≅ 0 whenever a label a ≠ 0. Thus any operator supported on D must act as a scalar. Observe that even when a ground state space G is degenerate, if it is nevertheless a code space and also has a spectral gap of δ > 0 between it and the first excited state, the argument for the decay of spatial correlations
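$$\langle O_\circ| A_\ell B |O_\circ\rangle \qquad (3.11)$$
remains valid for any O◦ ∈ G:
$$B|O_\circ\rangle = \langle O_\circ|B|O_\circ\rangle\, |O_\circ\rangle + 0\,|O_1\rangle + 0\,|O_2\rangle + \cdots + B', \quad B' \perp G, \qquad (3.12)$$
$$\langle O_\circ|A = \langle O_\circ|A|O_\circ\rangle\, \langle O_\circ| + 0\,\langle O_1| + 0\,\langle O_2| + \cdots + A', \quad A' \perp G, \qquad (3.13)$$
{O◦, O1, O2, . . . } an orthonormal basis for G. Applying e^{−tH} to, say, the ket (3.12) and then pairing with (3.13), observe that ⟨O◦|A|O_i⟩ = 0, i > 0, so:
$$\text{“imaginary time correlation”} \xrightarrow{\ \text{exponentially}\ } \langle O_\circ|A|O_\circ\rangle \langle O_\circ|B|O_\circ\rangle. \qquad (3.14)$$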
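The scalar-compression condition of Definition 3.6 can be tested directly on a small example. The sketch below uses a standard four-qubit code chosen here for illustration (it is not the space DEℓ of the text, and the lattice geometry in the definition of strong locality is ignored): G is spanned by the four states |b⟩ + |b̄⟩ with b of even weight. Every single-site Pauli operator compresses to a scalar on G, so by linearity every single-site operator does, while the two-site operator Z_0 Z_1 does not.

```python
def apply_pauli(pauli, site, state):
    """Apply a single-site Pauli ('I', 'X', 'Y' or 'Z') to a state.

    A state is a dict mapping bit tuples to complex amplitudes.
    """
    out = {}
    for bits, amp in state.items():
        b = bits[site]
        new_bits = list(bits)
        if pauli in ("X", "Y"):
            new_bits[site] = 1 - b           # X and Y flip the bit
        if pauli == "Z":
            amp = amp * (-1) ** b            # Z|b> = (-1)^b |b>
        elif pauli == "Y":
            amp = amp * (1j if b == 0 else -1j)  # Y|0> = i|1>, Y|1> = -i|0>
        key = tuple(new_bits)
        out[key] = out.get(key, 0) + amp
    return out

def inner(u, v):
    """Hermitian inner product <u|v>."""
    return sum(complex(u[k]).conjugate() * v.get(k, 0) for k in u)

def pair_state(bits):
    """Normalized (|bits> + |flipped bits>)."""
    flipped = tuple(1 - b for b in bits)
    r = 2 ** -0.5
    return {bits: r, flipped: r}

# Orthonormal basis of the 4-dimensional code space G.
code_basis = [pair_state(b) for b in
              [(0, 0, 0, 0), (0, 0, 1, 1), (0, 1, 0, 1), (0, 1, 1, 0)]]

def compress(operator):
    """Matrix of P_G O P_G in the basis of G."""
    return [[inner(u, operator(v)) for v in code_basis] for u in code_basis]

def is_scalar(m, tol=1e-12):
    """True if m is a multiple of the identity matrix."""
    c = m[0][0]
    size = len(m)
    return all(abs(m[i][j] - (c if i == j else 0)) < tol
               for i in range(size) for j in range(size))

single_site_ok = all(
    is_scalar(compress(lambda s, p=p, k=k: apply_pauli(p, k, s)))
    for p in "IXYZ" for k in range(4)
)
zz = compress(lambda s: apply_pauli("Z", 1, apply_pauli("Z", 0, s)))
two_site_ok = is_scalar(zz)
print(single_site_ok, two_site_ok)
```

In the language of Definition 3.6, this G passes the test at k = 1 and fails it at k = 2.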
Thus the usual heuristic (gap ←→ finite correlation length, gapless ←→ polynomial decay) is not less valid for code spaces than for simple, non-degenerate ground states. Hence, the expectation is that code spaces G are protected by a spectral gap and meas.(Ψ), Ψ ∈ G, has finite (or even zero) correlation length. Curiously, it is the square of the Beraha numbers, n^2 = d^4 = q = (2 cos π/(ℓ + 2))^4, ℓ ≥ 1, which enter as the weight of a circle in the loop gas Gibbs state. This means that for
ℓ ≥ 3 these systems are outside the critical range in the thermodynamic limit. But recall [BJ] that, for ℓ = 3, the resulting theoretical stability is extremely weak. We would like to propose the possibility that the transition n_c from critical to bubble phase on a surface Y might be sensitive to roughening, i.e. an increase of Hausdorff dimension. Roughening appears to increase the entropy of long domain walls, giving them more dimensions to fluctuate in, so one might expect the energy/entropy balance point to increase to n_c > 2. There are two arguments for n_c = 2. One is an explicit study of the spectral gap of the corresponding transfer matrix in the 6-vertex model. This needs a geometric product structure and so will not apply to a typical rough surface. The second (Ch. 12 [B]) is topological and seemingly does apply. It uses the Euler relation to create a translation between the Potts model, a loop model, and an ice-type model. Formally, when the Potts parameter q crosses 4, θ = log(z) (z is the parameter in the ice type model) passes from imaginary to real, q^{1/2} = e^z + e^{−z}. This shows a singularity in the coordinates but not necessarily in the model itself at q = 4, n = √q = 2, so it seems that a role for surface roughening has not been excluded. If n_c can be promoted to (3 + √5)/2 ≈ 2.62 by surface roughening, then a Gε,3 ≅ G◦,3/⟨p4⟩ ≅ DE3 might be available as the ground state space of an honest perturbation of H◦,3. Alternatively, the stability at n = 2.62 may be so slight as to be physically irrelevant. In either case, roughened or not, the quantum medium must certainly have topological dimension = 2 to admit anyons, but the Hausdorff dimension of Y might approach 3. To be susceptible to the imposition of a topological symmetry (to force the system to be perpendicular to the ideal ⟨p_{ℓ+1}⟩) we need G◦,ℓ to be unstable (or, in the ℓ = 3 case, nearly so). What will be the properties of a Gε,ℓ ≅ G◦,ℓ ∩ ⟨p_{ℓ+1}⟩^∗ if in fact such a ground state space is achieved? In Sect.
2 the global properties of DE3 as a UTMF were discussed. These have implications for the local properties of a unit vector ∈ G, ∼ = G◦, ∩ p+1 ∗ and these contrast with a vector R ∈ G◦, . First, because G, has the structure of a UTMF no information about the state can be determined from observables acting on a disk D imbedded in Y , D ⊂ Y . Usual correlations such as σzv ⊗ σzv must be zero (or at least exponentially decaying) or else measurements in a large disk D ⊂ Y and “extrapolation” would reveal information about the state on Y . One might think with rapid decay of correlations, that observing in the classical basis we would see a “bubble phase” with a few tight global walls γ◦ ⊂ γ forced by topology. But this is impossible that such global lines could be locally detected on a disk D ⊂ Y and would reveal information on which is forbidden. In other words, a local operator could split the ground state whereas no local operator should effect more than an exponentially small splitting of G, . How is the paradox resolved? The domain walls “loops” in a typical (according to L2 -norm) component i of will be very long, probably space filling, but at the same time not locally correlated. This behavior is seen already in the typical classical (i.e. observed) states of any toric code word [K1], so the phenomena is not a surprise. Next we show that on (Y, ) the ground state space G◦, of H◦, is polylog extensively degenerate. Understanding this scaling is an important ingredient in perturbation theory. Fix a closed surface Y of genus (Y ) = g and use the number of vertices v() as a measure of the combinatorial complexity of the triangulation . Assume, for studying the v(i ) := vi −→ ∞ asymptotics of G◦ , that the triangles of the triangulations i have bounded similarity type. This means that triangle shapes should not be arbitrarily distorted. In this regard “barycentric subdivision” is “bad” but more regular subdivisions are “good”.
Magnetic Model with a Possible Chern-Simons Phase
[Fig. 3.2: a “good” subdivision and a “bad” subdivision]
Proposition 3.8. If genus$(Y) = g > 1$ then $\dim(G_{\circ,\ell}(Y, \Delta))$ and $\dim(G_{\circ,\ell}(Y, \Delta))$ are $O\big(v(\Delta)^{3g-3}\big)$. If genus$(Y) = 1$, the dimensions are $O(v(\Delta))$.

Proof. A pant is a 3-punctured 2-sphere. Fix a hyperbolic metric on $Y$. In a hyperbolic metric, $d$-isotopy classes have unique geodesic representatives. Using the Fenchel-Nielsen coordinates [Th] on a geodesic pants decomposition of $(Y, \Delta)$, we find, for each pant, 3 multiplicity parameters and 3 twist parameters, each assuming $O(v^{1/2})$ values, a definite fraction of which are mutually consistent. Since the Euler characteristic $\chi(\text{pant}) = -1$, there are $2g - 2 = |\chi(Y)|$ pants in the decomposition. The consistent parameter settings define geodesic patterns, $O\big(v^{3g-3}\big)$ geodesic 1-manifolds compatible with $\Delta$. These 1-manifolds are unique up to isotopy and have no trivial circles, and so are also unique up to $d$-isotopy. But $G_{\circ,\ell}$ is defined as the perpendicular to the $d$-isotopy relation on $\mathcal{H}$. □

Theorem 3.9. Assume $\Delta$ of $Y$ is sufficiently fine. If $\{O_i\}$ is a family of strongly local Hermitian operators on $\mathcal{H}$ with jgs $X \subset \mathcal{H}$ then there are three possibilities for $X \cap G_{\circ,\ell}$:
(1) $X \cap G_{\circ,\ell} = \{0\}$,
(2) $X \cap G_{\circ,\ell} = G_{\circ,\ell}$, and
(3) $X \cap G_{\circ,\ell} \cong G_{\circ,\ell} \cap \langle p_{\ell+1}\rangle^{*} = DE\ell(Y)$.
The choice $O_i = p^{i}_{\ell+1}$ realizes possibility (3).

Note 3.10. By the Verlinde formulas, $\dim DE\ell(Y)$, $Y$ a closed surface, is asymptotically $\left(\sqrt{\tfrac{2}{\ell+2}}\,\sin\tfrac{\pi}{\ell+2}\right)^{\chi(Y)}$ (the fraction of error converges to zero). This may be compared to the much larger dim of $G_{\circ,\ell}$ (Proposition 3.8).

Proof. For each $O_i$ there are (orthonormal) vectors $f_{i,j}$ spanning the $k$ tensor factors on which $O_i$ acts nontrivially, so that $E_{\circ,i}$, the lowest eigenspace of $O_i$, has the form span$(f_{i,1}, f_{i,2}, \ldots, f_{i,\delta_i}) \otimes$ (the remaining $n - k$ factors), where $\delta_i = \dim(E_{\circ,i})$. The $f_{i,j}$ are assumed to be chosen coherently with respect to the natural isomorphism $O_i \cong O_{i'}$. For each $i$, $\bigoplus_{k=1}^{\delta_i} |f_{i,k}\rangle\langle f_{i,k}|$ constitutes a skein relation $s_i$ as explained above. Thus $X \cap G_{\circ,\ell}$ consists of vectors orthogonal to the equivalence relation spanned by both $d$-isotopy and the skein relations $\{s_i\}$. The easiest example is for the level $\ell = 1$ theory, where $A = ie^{\pi i/6}$, $d = -A^2 - A^{-2} = 1$. In this case the Jones-Wenzl projector $p_2$ reads: $p_2 =$ (two parallel strands) $-$ (a cup over a cap), or in the combinatorial model:
[Fig. 3.3: the projector $p_2$ in the combinatorial model, drawn as a difference of two configurations of $|+\rangle$ and $|-\rangle$ spins]
Any combinatorial version of the smooth relation will give the same quotient and the same $G_{\circ,\ell} \cap X$ provided the triangulation is sufficiently fine. When $d = 1$, generalized isotopy simply means isotopy and deletion of circles bounding disks; adding $s_i = p_2^{i}$ yields unoriented $Z_2$-homology, as shown in Fig. 3.3, as the quotient equivalence relation. So, for example, on a closed surface $Y$ the space $X \cap G_{\circ,\ell}$ has dimension $2^{b_1(Y)}$, $b_1$ being the first Betti number, and is identified with functions $H_1(Y; Z_2) \to \mathbb{C}$.

For any skein relation $s$ the subset $\langle s_i\rangle \cap G_{\circ,\ell} \subset \mathcal{H}$, each $s_i$ specializing $s$ at locations $U_i$, is an ideal of the surface algebra $G_{\circ,\ell}$ as defined in Sect. 2. The uniqueness Thm 2.5 implies $\langle s_i\rangle = G_{\circ,\ell}$, $\{0\}$, or $\langle p_{\ell+1}\rangle$ (the latter occurs when $s_i$ represents $p_{\ell+1}$ on $U_i$, i.e. $s_i = p^{i}_{\ell+1}$). These correspond respectively to the three alternatives in the theorem. □

Observation 3.11. Although $V = \epsilon \sum_{v} \sigma_x^{v}$ is a promising perturbation to find $DE\ell$, its ground state vector $\theta_\circ = \frac{1}{2^{V/2}} \sum_{k=1}^{2^{V}} (-1)^{\#|-\rangle}\,\Psi_k$ has an exponentially large $H_{\circ,\ell}$-expectation, $\ell > 1$,

$\langle \theta_\circ | H_{\circ,\ell} | \theta_\circ \rangle \ge e^{L^{\alpha}}$ (3.15)

for some $\alpha > 0$, and $L$ the refinement scale.

Proof. If a spin configuration is chosen uniformly at random from the $2^{v}$ possibilities, it follows from an easy independence argument that the mean number of circles # of that configuration is $O(v)$ and the standard deviation is s.d.(#) $\ge O(v^{1/2})$, $v$ the number of vertices of $(Y)$. For $d > 1$, line 3.15 is deduced using the “circle” term of $H_{\circ,\ell}$ together with the above inequality on standard deviation. □

To consider the possibility of frustration, which is outside our algebraic ansatz (but quite possible in a fundamental description of an anyonic system, see Sect. 4, (2) Uniqueness), we should think about how a general local operator could act on $G_{\circ,\ell}$. Such (nonscalar) actions are possible precisely because $G_{\circ,\ell}$ is not a topological modular functor⁷ (there is no disk axiom here), so local operators such as $O$ may detect statistical information on the topology of a state $R \in G_{\circ,\ell}$. For example, the presence of a bond in the domain wall $\gamma$ may have polynomial influence on a global topological event. This phenomenon is familiar from percolation: the state of a single bond in the middle of a square piece of lattice has influence, at $p_c$, on the existence of a percolating cluster joining two opposite boundary segments, and this influence decays as the $-5/4$ power of the side length (for the triangular lattice) [LSW and SW]. $O$ may determine an effective reduction $E_{\lambda_\circ} \subset G_{\circ,\ell}$ based on some local statistical difference between the random components of different $R_k \in \mathcal{H}$ which the local operators $|s_i\rangle\langle s_i|$ detect. The sites of these operators should be located and oriented randomly w.r.t. the domain wall $\gamma_k$ of $R_k$, so it is reasonable to believe that the deviation of $O$ from scalar × identity will be attributable to the topologically essential part $\gamma_k^{+}$ of $\gamma_k$, since other features may be lost in averaging over the sites. For example, $\gamma_k^{+}$ may appear “straighter” than the “foam” $\gamma_k \setminus \gamma_k^{+}$. That is, $O$ may favor or disfavor a topologically complex essential-domain-wall $\gamma_1^{+}$ over, say, a simpler essential-domain-wall $\gamma_2^{+}$.

⁷ For $\ell \le 2$, $G_{\circ,\ell}$ presumably has gapless modes which are un-topological. For $\ell > 2$, string tension allows local measurements to yield a global inference.

If complexity is favored, the ansatz is still capable of describing the final reduced ground state space, though for an indirect reason: a reduction $E_{\lambda_\circ} \subset G_{\circ,\ell}$, or a series of reductions, which preserves the complex topological representatives will retain many representatives for each class in the quotient $G_{\circ,\ell}/\langle p_{\ell+1}\rangle$ modular functor, so topological information will not be lost: $E_{\lambda_\circ} \cap \langle p_{\ell+1}\rangle^{*} = G_{\circ,\ell} \cap \langle p_{\ell+1}\rangle^{*}$. (But note that the intermediate subspace $E_{\lambda_\circ}$ fails to respect the multiplicative/tensor structure on $G_{\circ,\ell}$.) If topological complexity is disfavored in $E_{\lambda_\circ}$, then the action of $O$ could destroy the topological information of the modular functor by killing all classes except for the simplest, $\gamma^{+} = \emptyset$, corresponding to no essential domain wall and foam covering all of $Y$. In this case the ansatz is not applicable. But fortunately it may be possible to control the sign of $\epsilon$ in the perturbation $V$, placing the system into the favorable regime, where a further reduction, perhaps at higher order, could still find the modular functor.

Because of the topological character of the skein relation, the final quotient $G_{\circ,\ell}/\langle p_{\ell+1}\rangle$ is the same regardless of at which sites and with what coefficient norm the various $p^{i}_{\ell+1}$ relations are enforced. Applying $p_{\ell+1}$ requires no homogeneity of input and in fact “smooths” local disturbances.

4. The Evidence for a Chern-Simons Phase

There is no proof or “derivation” of a stable (gapped) phase $G_{\epsilon,\ell} \cong G_{\circ,\ell}/\langle p_{\ell+1}\rangle \cong DE\ell$, but there is evidence in the form of analogy and internal consistencies under the headings: “UTMF”, “uniqueness”, and “positivity”. The first two have been discussed and are only now summarized; the third is presented in more detail.

1) UTMF. The quotient algebra $G_{\circ,\ell}/\langle p_{\ell+1}\rangle \cong DE\ell$ is an anyonic system, in mathematical terms a UTMF. This in itself is a consistency check: the presence of an anyonic quotient system. The spatial correlation scale for topological information is zero, so we expect a gap protecting the topological degrees of freedom. The system is the quantum double of a Chern-Simons theory, and experience with the FQHE as a Chern-Simons theory has prepared us to believe that these beautiful structures can self-organize in nature from the simplest underlying Hamiltonians, e.g. Coulomb repulsion in a Landau level. Mathematically the double has the form of the algebra of operators on some (fictitious) FQHE-like system “X”: domain walls realize the Wilson loop operators on X. The double is freed from chiral asymmetry and the extreme physical conditions required to break time reversal symmetry. The double is a better place to look for a realization protected by a large spectral gap.

2) Uniqueness. There is a unique candidate model $G_{\circ,\ell}/\langle p_{\ell+1}\rangle$ respecting the local or “multiplicative” structure of $G_{\circ,\ell}$ (Thm 3.9). Uniqueness suggests that there is a sharp boundary: there are no slightly larger quotients which could include low frequency excitations. Thus the simplest expectation, that the reduction $G_{\epsilon,\ell}$, $\ell \ge 2$, be one dimensional (non-degenerate), cannot be achieved as a joint ground state jgs of local projectors
but instead requires frustration. A subtle point is involved here. The most interesting candidates in solid state physics for topological states are highly frustrated systems when written out in their fundamental degrees of freedom. This does not mean that they cannot have effective descriptions as a jgs (= unfrustrated). In fact a very general topological argument suggests that a phase with the structure of a TQFT always will. In local models on a surface $Y$ (e.g. [TV]), product structures $Y \times I$ yield projectors $p: \mathcal{H}(Y) \to \mathcal{H}(Y)$ whose image is the underlying UTMF, $V(Y)$. However, by building $Y \times I$ from overlapping local product structures, $p$ can be factored into a commuting family of local projectors $\{p_i\}$, so $V(Y) = \text{image}\, p = \text{jgs}\{p_i\}$. Finally, uniqueness creates an aesthetic bias: could nature really turn down such a possibility? We now turn to the final consistency check.

3) Positivity of the Markov trace pairing on $G_{\circ,\ell}/\langle p_{\ell+1}\rangle$. On a surface $Y$ with or without boundary, the Hilbert space $G_{\circ,\ell}/\langle p_{\ell+1}\rangle \cong G_{\circ,\ell} \cap \langle p_{\ell+1}\rangle^{*}$ inherits a Hermitian inner product $\langle\,,\rangle_{\text{geom.}}$ by inclusion in $\mathcal{H}$, the space of all spin configurations. (Beginning with $\{|+\rangle, |-\rangle\}$ as an orthogonal basis for $\mathbb{C}^2$, the Hilbert space $\mathcal{H}$ acquires the tensor product pairing, which may be restricted to $G^{+}_{\circ,\ell} \cap \langle p_{\ell+1}\rangle^{*}$.) On the other hand, $G^{+}_{\circ,\ell}/\langle p_{\ell+1}\rangle \cong DE\ell(Y)$ has a topologically defined “Markov trace” Hermitian inner product $\langle\,,\rangle_{\text{top.}}$ corresponding to its structure as a UTMF (see Definition 4.2).

Proposition 4.1. Suppose that the perturbed Hamiltonian $H_{\epsilon,\ell}$ has a spectral gap above its ground state $G_{\circ,\ell} \cap \langle p_{\ell+1}\rangle^{*}$. Then, up to a correction which is exponentially small in the refinement scale of the triangulation $\Delta$, $\langle\,,\rangle_{\text{geom.}}$ and $\langle\,,\rangle_{\text{top.}}$ are proportional: $\langle\,,\rangle_{\text{geom.}} \cong c\,\langle\,,\rangle_{\text{top.}}$ for some real number $c \ne 0$.

The definition of $\langle\,,\rangle_{\text{top.}}$ is recalled below.

Definition 4.2. Extending the definition given in Sect. 2: If $\gamma_1$ and $\gamma_2$ are domain walls in $Y$ with identical boundary data then $\gamma = \gamma_1 \cup \gamma_2$ defines a link in $\hat{Y} \times S^1$, where $\hat{Y}$ is $Y$ with its boundary capped by disks. (Place $\gamma_1$ and $\gamma_2$ on disjoint $\theta$-levels $\theta_1$ and $\theta_2$, then bend paired endpoints to meet at the intermediate level $(\theta_1 + \theta_2)/2$ w.r.t. the $S^1$ orientation.) Regarding $\gamma$ as labelled by the 2-dimensional representation, $\langle \gamma_1, \gamma_2 \rangle :=$ Witten invariant $(\hat{Y} \times S^1, \gamma)$ at level $\ell$, see [Wi, RT and BHMV]. We refer to this pairing, also, as the Markov trace pairing.

The combinatorial properties of the code space $C := G_{\circ,\ell} \cap \langle p_{\ell+1}\rangle^{*}$ are such that local operators (in fact any operator supported on some topological disk $D \subset Y$) cannot extract or modify information in $C$. (It is true that a local operator $E$ can rotate $C$ to $C'$, $C' \perp C$, but nondestructive (of details within $C$) measurements allow the error to be corrected by a physical operator $F$ acting on $\mathcal{H}$ so that the composition $C \xrightarrow{E} C' \xrightarrow{F|} C$ is the identity $\text{id}_C$.) The code property is a kind of combinatorial/topological rigidity and it is quite natural that, if achieved in a ground state space, that space should be protected by a spectral gap. As we argue for Proposition 4.1, a final consistency check emerges:

gap + code ⇒ positivity of Markov trace pairing. (4.1)

This explains how the Markov pairing is picked out and why its (indefinite) Galois conjugates are unrealizable as stable phases. The Markov trace pairing is known to be
positive [J], [FNWW] precisely for our choice of $A$, $A = ie^{\pi i/2r}$, $d = -A^2 - A^{-2}$, and, being topologically defined, it is automatically invariant under the mapping-class-group of $(Y, \partial Y)$. For other roots of unity $A$ the resulting Hermitian pairings are of mixed sign, so they cannot, by positivity of $\langle\,,\rangle_{\text{geom.}}$ on $\mathcal{H}$ and Proposition 4.1, correspond to stable physical phases. For example, at level $\ell = 3$, the Galois conjugate choice $d = \frac{1-\sqrt{5}}{2}$ is not physical. No ground state space modelling $G^{d}_{\circ}/\langle p^{d}_{4}\rangle$ could have a gap. The corresponding “UTMFs” are only “unitary” with respect to the mixed $(p, q)$ Lorentz form [BHMV], constructed for each labelled surface of the theory. These structures cannot be induced by restricting the standard Hermitian pairing $\langle\,,\rangle_{\text{geom.}}$ on $\mathcal{H}$.

By choosing $d = 2\cos\frac{\pi}{\ell+2}$ we have ensured that tr$(a, a)$ is positive. If $G_{\epsilon,\ell}$ has a gap and is naturally identified with $G_{\circ,\ell}/\langle p_{\ell+1}\rangle$ then 4.1 implies positivity of the Markov trace. Positivity is demonstrated by showing that, up to an error exponentially small in the refinement scale $L$, the Markov trace is in the similarity class of the Hermitian form induced from the standard inner product on $\mathcal{H}$. This is the argument; it is not mathematically rigorous, as the “imaginary-time = space ansatz” is employed, but we hope that it is convincing physically.
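As a numerical sanity check on the loop values quoted above, the following sketch evaluates $d = -A^2 - A^{-2}$ for $A = ie^{\pi i/2r}$. It assumes $r = \ell + 2$, which is our reading of the two formulas for $d$ given in the text rather than something stated explicitly here; the function names are illustrative only.

```python
import cmath
import math

def kauffman_A(level):
    """A = i * exp(pi*i / (2r)), ASSUMING r = level + 2 (our reading of the text)."""
    r = level + 2
    return 1j * cmath.exp(1j * math.pi / (2 * r))

def loop_value(level):
    """d = -A^2 - A^(-2); algebraically this equals 2*cos(pi/(level + 2))."""
    A = kauffman_A(level)
    return -(A ** 2) - A ** (-2)

for level in (1, 2, 3, 4):
    d = loop_value(level).real
    # d and d^2 for each level; the imaginary part of d vanishes up to rounding
    print(level, round(d, 6), round(d * d, 6))

# Galois conjugate of the level-3 value: 2*cos(3*pi/5) = (1 - sqrt(5))/2 < 0
print(round(2 * math.cos(3 * math.pi / 5), 6))
```

Under this assumption the level 1 value is $d = 1$, and at level 3 one finds $d = \frac{1+\sqrt{5}}{2}$ with $d^2 = \frac{3+\sqrt{5}}{2} \approx 2.62$, while the conjugate choice $\frac{1-\sqrt{5}}{2}$ is negative.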
Argument, Proposition 4.1. A surface $(Y, \Delta)$ can be gradually changed by bringing bonds in and out of the triangulation (and perhaps adding or deleting vertices). With patience, a Dehn twist can be effected. This takes $O(n^2)$ moves on an $n \times n$ square grid torus $T^2$. Similarly, a braid generator for quasiparticle excitations on a disk takes $O(n^2)$ such moves, where $n$ is the number of bonds in a loop surrounding the two quasiparticles. These changes can eventually return $\Delta$ to a homeomorphic, though now twisted, image of itself, see Fig. 4.1.
[Fig. 4.1: the triangulation before and after]
If $H_{\epsilon}$ has a gap, bounded as we change $\Delta$ (on which $H$ depends), the adiabatic theorem will define, in the slow deformation limit (deformation speed << gap), a time evolution of vectors in $G_{\epsilon,t} \subset \mathcal{H}$, $t \in [0, 1]$. At time $t = 0$, $G_{\epsilon,0} = G_{\epsilon}$ and finally at time $t = 1$, $G_{\epsilon,1} = G_{\epsilon}$ again. This evolution is (incidentally) identical with the one induced by the canonical connection on the universal topological bundle {$k$-plane, vector in $k$-plane} → {$k$-planes}. From the assumption of a gap $= \delta$, one can argue that this monodromy for a Dehn or braid twist is accomplished by a composition of $O(\delta^{-1} n^2)$ local operators, or more precisely operators $A_t$ which have only an exponentially small nonlocal part. This means that for Pauli matrices at sites $i$ and $j$, the commutator satisfies $\| [\sigma_x^{i} A_t, \sigma_y^{j}] \| < c_\circ e^{-c_1 \|i-j\|}$ for all indices $x, y, i, j, t$, and some positive constants $c_\circ$ and $c_1$. We call such operators quasi-local. The essential point is that the local disturbance caused by modifying $H_t$ near a bond to $H_{t+1}$ dies away exponentially in imaginary time and hence in space.

Let us ignore the exponential tail (which will lead to a manageable error term), and think of the monodromy as a composition of $O(\delta^{-1} n^2)$ local operators: monodromy $= \prod_t A_t^{\text{loc}}$. For each $t$ in a discretized unit interval with $O(\delta^{-1} n^2)$ points, $A_t^{\text{loc}}$ is a local unitary operator in Hom$(\mathcal{H}, \mathcal{H})$ which carries the code subspace $G_{\epsilon,t}$ of $\mathcal{H}$ at time $= t$ to the code subspace $G_{\epsilon,t+1}$ at time $= t + 1$. Adiabatic evolution has provided us with one (local) representation, $\rho_{\text{geom.}}: \pi_1(\text{moduli space}) \to PU_{\text{geom.}}(G_{\epsilon,t})$, from the fundamental group of moduli space $(Y)$ (in our discrete context moduli space is the space of triangulations of $Y$) to the projective unitary transformations of the perturbed ground state space. The subscript geom. signifies that $PU$ is defined with respect to $\langle\,,\rangle_{\text{geom.}}$.

On the other hand, assuming a spectral gap above $G_{\epsilon,t}$, there is a physical argument that a second, topologically defined, representation is also local. This representation, $\rho_{\text{top.}}: \pi_1(\text{moduli space}) \to PU_{\text{top.}}(G_{\epsilon,t})$, is defined into the projective unitaries w.r.t. $\langle\,,\rangle_{\text{top.}}$ by deforming the triangulation $\Delta_t$ while leaving the formal picture (:= superposition of domain walls) topologically invariant. This representation can be defined by choosing a local rotation which interpolates between the conditions (that define $G_{\circ,t} \cap \langle p_{\ell+1}\rangle^{*}$) in force at time $= t$ but not $t + 1$ and those in force at time $= t + 1$ but not $t$. What is not immediate is whether the effect of this local rotation on the jgs can be achieved by an operator $A_t$ on $\mathcal{H}$ which is quasi-local. But the existence of a quasi-local $A_t$ can be argued based on the “imaginary-time = space” ansatz (Sect. 3 lines (8)–(14)). Similarly, if we view the ground state $G_{\epsilon,t}$ as a local excitation of $H_{\epsilon,t+1}$, but one without topological content, we expect that it can be annihilated by a quasi-local $A_t$.

But if a local operator carries one code space into another, that operator restricted to the first code space is unique, up to a scalar, among all restrictions of such local operators. This is particularly clear in the present case, when the operators are unitary and all the code spaces have the same dimension. Suppose both $A$ and $B$ are unitary operators carrying $C_1$ into $C_2$; then $B^{\dagger} \circ A|_{C_1}: C_1 \to C_1$ is also local and so is multiplication by some unit norm scalar $\lambda$. Thus $A|_{C_1} = \lambda B|_{C_1}$. So assuming a gap, the preceding observation shows first that $\rho_{\text{geom.}}$ and $\rho_{\text{top.}}$ are both actually well defined as maps from the fundamental group (see Fig. 4.2) and second that $\rho_{\text{geom.}}$ will be projectively the same as $\rho_{\text{top.}}$ up to an error exponentially small in the refinement scale $L$ (when measured in the operator norm).

The latter, $\rho_{\text{top.}}$, is simply parallel transport in Witten’s [Wi] projectively flat connection on the modular functor bundle $V(Y_t)$ over the moduli space of surfaces $\{Y_t\}$ ($t$ now an arbitrary parameter). Projective flatness as well as uniqueness of this connection follow formally from locality properties: as the surface is gradually changed (discretely this is done by moves on the triangulation $\Delta$) the two surfaces $Y_t$ and $Y_{t+1}$ can be canonically identified in the complement of a disk $D$ supporting the changing bonds, and the identification can be extended arbitrarily over $D$. From the disk axiom and the gluing axiom of Sect. 2, we have a unique canonical projective isomorphism of modular functors $V(Y_t) \to V(Y_{t+1})$. This determines, via differentiation, a unique connection. Projective flatness follows by applying this uniqueness to a loop of identifications, see Fig. 4.2, representing a small cycle of changes to $\Delta$ collectively supported in a disk $D \subset Y$. Similar loops span the relations in $\pi_1$(moduli spaces).

Let $V = DE\ell$, $\ell \ne 1, 2,$ or $4$. It is known [FLW2] that for a sphere with 4 or more punctures (or a higher genus surface) the braid (or mapping class group) acts densely in
the projective unitary transformations of each label sector $PU_{\text{top.}}(V(Y, \vec{t}\,))$. But identifying $V(Y, \vec{t}\,)$ with a ground state space $G_{\epsilon,\ell}$ (followed by the idempotent $\vec{t}$ defined in Sect. 3), we have on the one hand the adiabatic evolution, which must be unitary w.r.t. the Hermitian pairing induced from the standard Hermitian pairing on $\mathcal{H}$, and on the other hand, exponentially close to this, transport in Witten’s connection. Both define (nearly) the same dense homomorphism from $\pi_1$ := the fundamental group of moduli space:

$\rho_{\text{geom.}} \approx \rho_{\text{top.}}: \pi_1 \to \text{End}\big(V(Y, \vec{t}\,)\big)$. (4.2)

In the case $Y$ is planar and all boundary labels equal, $\pi_1$ is a familiar braid group.
Fig. 4.2
It follows from the rapid approximation algorithm [KSV, So, K2] of elements of $PU_{\text{geom.}}, PU_{\text{top.}} \subset \text{End}\big(V(Y, \vec{t}\,)\big)$ by words in $\pi_1$, that the induced Hermitian metric on $V$ must be exponentially close (in $L$) to the intrinsic UTMF metric on $V$ up to the overall scalar $c$. The mathematical fact that we are using here is that Hermitian pairings can be recovered from their symmetries:

Lemma 4.3. Suppose a vector space $V$ has Hermitian (but not necessarily positive definite) inner products $\langle\,,\rangle_1$ and $\langle\,,\rangle_2$ with symmetry groups $U_1, U_2 \subset \text{End}(V)$ respectively. If $U_1 = U_2$ then $\langle\,,\rangle_1 = c\,\langle\,,\rangle_2$ for some real constant $c \ne 0$. Furthermore, if $U_1 \ne U_2$ but instead for all $A_1 \in U_1$ there exists an $A_2 \in U_2$ with $\|A_1 - A_2\| < \epsilon$, $\epsilon > 0$, and for all $A_2 \in U_2$ there exists an $A_1 \in U_1$ with $\|A_2 - A_1\| < \epsilon$, then for all $v \in V$, $\langle v, v\rangle_1^{2} / \langle v, v\rangle_2^{2} = \text{const.} + O(\epsilon)$.

Proof. Up to linear conjugacy the type of the form is determined by dimension, signature and nullity. If $U_1 = U_2$ (even approximately) these invariants agree. Let $M \in \text{End}(V)$ transform $\langle\,,\rangle_1$ into $\langle\,,\rangle_2$: $\langle M^{\dagger} v, M w\rangle_1 = \langle v, w\rangle_2$. Then $U_1 M = M U_2$, so if $U_1 = U_2 =: U$ then $M$ normalizes $U$. Let $s + t = \dim(V)$. In $PGL(s, t; \mathbb{C})$, $PU(s, t)$ is its own normalizer, establishing the lemma when $U$ is nonsingular. If the forms have radicals, these must agree and the preceding argument applies modulo the radical. Finally, in the case $U_1$ and $U_2$ are not identical but have Hausdorff distance $= \epsilon$, a counterexample to the lemma would yield a Lie algebra element $\alpha \in pg\ell(s, t) \setminus pu(s, t)$ with $\text{ad}_\alpha(pu(s, t)) \subset pu(s, t)$, but $pu(s, t) \subset pg\ell(s, t)$ is a maximal proper sub Lie algebra. □
Consequently, $G_{\epsilon,\ell}$ is not just linearly $V$ but metrically $V$, provided $H_{\epsilon,\ell}$ has a gap. The spectral gap assumption implies that the combinatorially defined Markov trace pairing is induced from the standard inner product of $\mathcal{H}$. The Markov trace pairing is a rather intricate structure in its relation to gluing (see Axiom 2). That it arises from a simple assumption can be viewed as a valuable “consistency check” on that assumption: the existence of a spectral gap above $G_{\epsilon,\ell}$.

5. The $H_{\epsilon,\ell,t}$ Medium as a Quantum Computer

In the literature one finds at least three polynomially equivalent models of quantum computation defined: q-Turing machine [D], q-circuit model [Y], q-cellular automata [Ll]. Nearly all proposed architectures ([NC] is an excellent survey) presume localization of the fundamental degrees of freedom. This may be called the “qubit approach”, although qunit might be more precise since there is nothing special about two state systems; the number $n$ of states per site may even be infinite, as in optical cavity models. What is important in these architectures is the tensorial structure of the computational degrees of freedom. However, there is another approach [FKLW] in which the global tensor structure becomes redundant. The physical degrees of freedom still have a local tensor structure, as is universal in quantum mechanics, but these are never touched directly. Instead a system is engineered so that these local degrees interact through a Hamiltonian $H$ whose eigenstates $E_\lambda$ are highly degenerate code spaces capable of storing, protecting and processing quantum information. For us $E_\lambda$ will be the internal symmetries of an anyonic system in which position coordinates have been frozen out. The processing will consist of braiding anyons in (2 + 1)-dimensional space time.

To make sense of this, consider a definition of “universal quantum computer” which does not presuppose any tensor decomposition. We need:

1. $h$: A Hilbert space $h$ on which to act. (Its dimension should scale exponentially in a physical parameter.)
2. $\Psi_\circ \in h$: We need to be able to initialize the system.
3. $\rho$: Operations $\to U(h)$, a representation of some group (or at least semigroup) of operations in the unitary transformations of $h$ which can be physically implemented, preferably with error scaling like $e^{-\text{constant}\cdot L}$ for some physical parameter $L$. (Lack of such scaling is the Achilles heel of qubit models.) The representation $\rho$ should have dense image in $SU(h)$: this, together with the rapid convergence property of dense subgroups of $U(h)$, ensures universality.
4. Compiler: This is a classical computer which takes a q-algorithm and an instance, e.g. Shor’s poly time factoring algorithm [S2] and a thousand bit integer, and maps the pair into a string $s$ of operations as in (3).
5. $\Psi_f$: The result of the quantum portion of the calculation is a final state $\Psi_f = \rho(s)\Psi_\circ$.
6. Observation: There must be a Hermitian operator which serves as the observation, projecting $\Psi_f$ into an eigenstate $\Psi_\lambda$ with probability $|a_\lambda|^2$, $\Psi_f = \sum_\lambda a_\lambda \Psi_\lambda$. The eigenvalue $\lambda$ is what is actually observed.
7. Answer: Another poly-time classical computation is now made to convert the observed eigenvalue, perhaps for many executions of 1) → 6), into a probabilistic output.

The class of problems that can be answered in polynomial time by 1) → 7) with bounded error probability (say error < $\frac{1}{4}$) is called $BQP$. For example, factoring [S2] is in $BQP$. Computer scientists believe, and cryptographers hope, that factoring is not in the corresponding classical computational class $BPP$.
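The remark in step 7 about combining many executions can be illustrated by a standard majority-vote calculation. The sketch below is not taken from the paper: it assumes a yes/no problem, independent executions that each err with probability at most 1/4, and a classical post-processing step that simply reports the majority answer; `majority_error` is a hypothetical helper name.

```python
from math import comb

def majority_error(p_err, runs):
    """Probability that a majority vote over `runs` independent executions is wrong,
    when each execution errs independently with probability p_err.
    `runs` must be odd so that no tie can occur."""
    if runs % 2 == 0:
        raise ValueError("use an odd number of runs to avoid ties")
    need = runs // 2 + 1  # number of wrong executions needed to spoil the vote
    return sum(
        comb(runs, k) * p_err ** k * (1 - p_err) ** (runs - k)
        for k in range(need, runs + 1)
    )

for runs in (1, 3, 11, 101):
    print(runs, majority_error(0.25, runs))
```

With a per-run error of 1/4 the failure probability of the vote falls off exponentially in the number of runs, which is why the particular constant in the bounded-error definition is not essential.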
The reader can easily take any qubit architecture (see [NC] for details of these) and fit it into the preceding format. Let us now do this for our anyonic system with Hamiltonian $H_{\epsilon,\ell}$. As explained in the introduction, the system was chosen to have two spatial dimensions topologically, i.e. to live on a triangulated surface $(Y, \Delta)$, so that exotic statistics become a possibility. By mathematically excising neighborhoods of the excitations we reduce to the case of studying the ground state space $G_t$ of a time dependent $H_{\epsilon,\ell,t}$ on a highly punctured surface $Y^{-}$ with labelled boundary. The subscript $t$ reminds us of the time dependence of the surface $(Y^{-}, \Delta)$ as the position of the punctures evolves. The ground state space $G_t$ describes the internal symmetries of a collection of quasiparticle (anyon) excitations whose spatial locations are $t$ dependent. The space $G_{t_\circ}$ is a representation space for the braid group (or generalized braid group) which describes the motion of these quasiparticles on $Y$. Because of the presumed spectral gap, the quasiparticles are expected to have exponentially decaying tails. Chopping off and ignoring these tails amounts to puncturing $Y$ at the quasiparticles. This identifies the excited state $\Psi_{\circ,t} \in E_{\lambda,t}$ containing anyons on $Y$ with a ground state $\Psi_{\circ,t} \in G_t$ on a multiply punctured $Y^{-}$ with labeled boundary. So by puncturing and labeling the surface $Y$, a ground state in $G_t$ can be used to represent the anyonic state $\Psi \in E_{\lambda,t}$, so the discussion of Sect. 3 applies to $E_{\lambda,t}$. Let us walk through steps 1 through 7 for our anyonic model, though it is not efficient to do this in strict order.

1. & 2. $h = E_{\lambda,t} \cong G_t$: In [FLW1] and [FKLW] abstract anyonic models (but with no known Hamiltonian $H$) were analyzed algebraically. In [F2] an explicit but artificial Hamiltonian was given as an existence theorem. The UTMFs of [FLW1, F2 and FKLW] required two (with care 1.5) quasiparticle pairs per qubit simulated. $DE\ell$ is a closely related UTMF and for $\ell = 3$ a similar encryption yields one qubit per 1.5 pairs of (0, 2) type excitations. Physically one imagines a disk of quantum media governed by $H_t$ and trivial outer boundary condition, lying in its (nondegenerate) ground state = “the vacuum”. The steps required to build $h$ are a subset of those discussed in [BK] in connection with their CS2 model. The disk is struck in some way (with a hammer?) at a point to create a pair of excitations. Already in building $E_{\lambda,t}$ we need measurement to tell if the newly created pair is of type ((0, 2), (0, 2)). If the pair is of this type, we keep it; if not, it is returned to the vacuum. Repeat (perhaps thousands of times) until a sufficiently large Hilbert space $E_{\lambda,t}(Y) \cong G_t(Y^{-}) \cong h$ and initial vector $\Psi_\circ \in G_t$ are realized. The initial state $\Psi_\circ$ is determined by the condition that all circles surrounding (not separating) the created pairs acquire label (0, 0). How many pairs are required depends on the problem instance. For example, for the factoring problem, it is a small multiple of the number of bits of the number to be factored.

6. Measurement. The creation process is probabilistic. So even at the start, there must be a local observation which tells us which anyon pairs have been created: ((0, 0), (0, 0)); ((0, 2), (0, 2)); ((2, 0), (2, 0)); or ((2, 2), (2, 2)). One hopes that this will not require a “topological microscope”. Because quasiparticles are arrangements of the elementary degrees of freedom (spins, charges, etc.) of the system, one expects each quasiparticle, when examined electromagnetically, to have its own unique signature, e.g. quadrupole moment. In this view, localized quasiparticles would always be “measured” by their environment and never lie in superpositions. However, it is essential for quantum computation that a well separated pair of quasiparticles, before being fused, could be in a superposition of collective states. Another idea
[SF], discussed in the introduction, is that a phase transition be employed for measurement.

3. Braiding. The group of operations is the braid group of quasiparticles moving on the disk. In order to implement this mathematically known representation on $h$ we need to be able to grab hold of the quasiparticle and, within some allowable dispersion corridor, nudge it along to execute the braid $s$ dictated by the output of the classical compiler, Step 4. This, like observation, should in principle be possible, using the characteristic electric and magnetic attributes of the nontrivial quasiparticles (whatever they are eventually measured to be). It may be possible to design wells that trap, and when desired, move specific quasiparticles.

5. $\Psi_f$ is the internal state after braiding, $\Psi_f = \rho(s)\Psi_\circ$.

6. Observation. We already discussed the necessity to observe halves of newly created pairs. To read out quantum information after braiding, take two quasiparticles in the system $\Psi_f$ and fuse them. Although they will retain their individual identities during braiding and still both be of type (0, 2), after fusing, two outcomes are possible: (0, 0) or (0, 2). The probabilities attached to these outcomes are the classical distillation of quantum information, equivalent to measuring a qubit in the usual architecture (see [FKLW] for details of the read out and its relation to quantum topology and the Jones polynomial). The braiding has rearranged, in an exponentially intricate fashion, the structure of the composite pairs of (0, 2)-quasiparticles. This recoupling is the heart of the computation.

7. & 4. The final conversion of eigenvalues observed to a probabilistic output is the same as for the qubit architecture. The structure of the compiler is also similar but must include a rapid approximation algorithm [K2, So] subroutine.

We have presented $H_{\epsilon,3,t}$ as a theoretical candidate for an anyonic medium capable of universal quantum computation. Its experimental realization would be a landmark.
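The qubit counts quoted above can be made plausible with a small counting sketch. It uses only the fusion outcome stated in the text (two (0, 2) quasiparticles fuse to (0, 0) or (0, 2)) and assumes, as is standard for a vacuum label, that fusing with (0, 0) changes nothing. The labels `"vacuum"` and `"tau"` stand for (0, 0) and (0, 2) and are our own shorthand, as is the function name.

```python
def fusion_space_dim(n_anyons, total="vacuum"):
    """Number of fusion paths of n_anyons particles of type "tau" (our shorthand
    for the (0, 2) excitation) whose overall label is `total`.

    ASSUMED rules: vacuum x tau -> tau;  tau x tau -> vacuum or tau.
    """
    # counts[c] = number of ways the particles fused so far carry total label c
    counts = {"vacuum": 1, "tau": 0}
    for _ in range(n_anyons):
        counts = {
            "vacuum": counts["tau"],                  # tau x tau -> vacuum
            "tau": counts["vacuum"] + counts["tau"],  # vacuum x tau, tau x tau -> tau
        }
    return counts[total]

for n in range(1, 9):
    print(n, fusion_space_dim(n, "vacuum"), fusion_space_dim(n, "tau"))
```

Under these assumptions four quasiparticles (two pairs) with trivial total label span a 2-dimensional space, and three quasiparticles (1.5 pairs) with total label (0, 2) also span a 2-dimensional space, which appears consistent with the "two (with care 1.5) pairs per qubit" count; the dimensions grow like Fibonacci numbers, i.e. exponentially in the number of pairs.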
6. Appendix. Ideals in Temperley-Lieb Category (by Frederick M. Goodman and Hans Wenzl)
This appendix contains a proof of the following result, which is used in the paper of Michael Freedman, A magnetic model with a possible Chern-Simons phase.
Theorem A0.1. When the parameter $d$ is equal to $2\cos(j\pi/n)$ with $n \ge 3$ and $j$ coprime to $n$, then the Temperley-Lieb category has exactly one non-zero, proper ideal, namely the ideal of negligible morphisms. For all other values of $d$, the Temperley-Lieb category has no non-zero, proper tensor ideal.
We are grateful to Michael Freedman for bringing the question of tensor ideals in the Temperley-Lieb category to our attention and for allowing us to present the proof as an appendix to his paper.
Our notation in the appendix differs slightly from that in the main text. We write $t$ instead of $-A^2$, $T_n$ for the Temperley-Lieb algebra with $n$ strands, and TL for the Temperley-Lieb category. We trust that this notational variance will not cause the reader any difficulty. This appendix can be read independently of the main text.
A1. The Temperley-Lieb Category
A1.1. The Generic Temperley-Lieb Category. Let $t$ be an indeterminate over $\mathbb{C}$, and let $d = t + t^{-1}$. The generic Temperley Lieb category TL is a strict tensor category whose objects are elements of $\mathbb{N}_0 = \{0, 1, 2, \dots\}$. The set of morphisms $\mathrm{Hom}(m, n)$ from $m$ to $n$ is a $\mathbb{C}(t)$ vector space described as follows: If $n - m$ is odd, then $\mathrm{Hom}(m, n)$ is the zero vector space. For $n - m$ even, we first define $(m, n)$-TL diagrams, consisting of:
1. A closed rectangle $R$ in the plane with two opposite edges designated as top and bottom,
2. $m$ marked points (vertices) on the top edge and $n$ marked points on the bottom edge,
3. $(n+m)/2$ smooth curves (or "strands") in $R$ such that for each curve $\gamma$, $\partial\gamma = \gamma \cap \partial R$ consists of two of the $n + m$ marked points, and such that the curves are pairwise non-intersecting.
Two such diagrams are equivalent if they induce the same pairing of the $n+m$ marked points. $\mathrm{Hom}(m, n)$ is defined to be the $\mathbb{C}(t)$ vector space with basis the set of equivalence classes of $(m, n)$-TL diagrams; we will refer to equivalence classes of diagrams simply as diagrams.
The composition of morphisms is defined first on the level of diagrams. The composition $ba$ of an $(m, n)$-diagram $b$ and an $(\ell, m)$-diagram $a$ is defined by the following steps:
1. Juxtapose the rectangles of $a$ and $b$, identifying the bottom edge of $a$ (with its $m$ marked points) with the top edge of $b$ (with its $m$ marked points).
2. Remove from the resulting rectangle any closed loops in its interior. The result is an $(\ell, n)$-diagram $c$.
3. The product $ba$ is $d^r c$, where $r$ is the number of closed loops removed.
The composition product evidently respects equivalence of diagrams, and extends uniquely to a bilinear product $\mathrm{Hom}(m, n) \times \mathrm{Hom}(\ell, m) \to \mathrm{Hom}(\ell, n)$, hence to a linear map $\mathrm{Hom}(m, n) \otimes \mathrm{Hom}(\ell, m) \to \mathrm{Hom}(\ell, n)$.
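The three composition steps above can be sketched in code. The following is a minimal illustration, not part of the original appendix: the encoding of a diagram as a Python dictionary pairing endpoints `("top", i)` and `("bot", i)`, and the helper names `diagram` and `compose`, are assumptions made for this sketch. The function returns the diagram $c$ together with the loop count $r$, so that $ba = d^r c$.

```python
def diagram(pairs):
    """Build a TL diagram from a list of strands.

    Each strand is a pair of endpoints; an endpoint is ("top", i) or
    ("bot", i), counted from 0 along the corresponding edge.
    (Hypothetical encoding chosen for this sketch.)
    """
    d = {}
    for p, q in pairs:
        d[p], d[q] = q, p
    return d


def compose(b, a, m):
    """Stack a on top of b, gluing a's m bottom points to b's m top points.

    Returns (c, r): the resulting diagram c and the number r of closed
    loops removed, so that the product ba equals d**r * c.
    """
    seen = set()  # glued middle points that lie on a through-going strand

    def walk(start, in_a):
        # Follow strands from an outer endpoint to the next outer endpoint.
        point = start
        while True:
            side, i = (a if in_a else b)[point]
            if side == ("top" if in_a else "bot"):
                return (side, i)          # reached the outer boundary
            seen.add(i)                   # crossed the glued edge at point i
            in_a = not in_a
            point = ("bot", i) if in_a else ("top", i)

    c = {}
    for p in a:
        if p[0] == "top":
            c[p] = walk(p, True)
    for p in b:
        if p[0] == "bot":
            c[p] = walk(p, False)

    # Every glued point not yet visited lies on a closed loop.
    r = 0
    for i in range(m):
        if i in seen:
            continue
        r += 1
        j = i
        while True:
            seen.add(j)
            _, k = a[("bot", j)]   # strand of a along the glued edge
            seen.add(k)
            _, j = b[("top", k)]   # strand of b back along the glued edge
            if j == i:
                break
    return c, r


# e is the diagram in T_2 pairing the two top points and the two bottom points.
e = diagram([(("top", 0), ("top", 1)), (("bot", 0), ("bot", 1))])
one = diagram([(("top", 0), ("bot", 0)), (("top", 1), ("bot", 1))])
cap = diagram([(("bot", 0), ("bot", 1))])   # the morphism in Hom(0, 2)
cup = diagram([(("top", 0), ("top", 1))])   # the morphism in Hom(2, 0)
```

With this encoding, `compose(e, e, 2)` returns `(e, 1)`, which is the familiar relation $e^2 = d\,e$, and `compose(cup, cap, 2)` returns the empty diagram with one loop removed, i.e. the scalar $d$ in $\mathrm{End}(0)$.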
Fig. A1.1. A (5,7)–Temperley Lieb Diagram
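As a small check on the definition of $\mathrm{Hom}(m, n)$, one can count TL diagrams. The sketch below relies on the standard fact (not stated in the appendix) that non-crossing pairings of $2k$ boundary points are counted by the $k$-th Catalan number; the function names are illustrative only. It compares the closed formula with a brute-force enumeration.

```python
from math import comb


def noncrossing_pairings(points):
    """Yield every non-crossing perfect matching of points in cyclic order."""
    if not points:
        yield ()
        return
    first = points[0]
    # The partner of `first` must leave an even number of points on each side.
    for i in range(1, len(points), 2):
        for inner in noncrossing_pairings(points[1:i]):
            for outer in noncrossing_pairings(points[i + 1:]):
                yield ((first, points[i]),) + inner + outer


def hom_dimension(m, n):
    """Dimension of Hom(m, n), i.e. the number of (m, n)-TL diagrams."""
    if (m + n) % 2:
        return 0
    k = (m + n) // 2
    return comb(2 * k, k) // (k + 1)  # k-th Catalan number


def count_by_enumeration(m, n):
    """Count (m, n)-TL diagrams by listing all non-crossing pairings."""
    if (m + n) % 2:
        return 0
    # Read the marked points once around the boundary of the rectangle.
    boundary = tuple(("top", i) for i in range(m)) + tuple(
        ("bot", j) for j in reversed(range(n)))
    return sum(1 for _ in noncrossing_pairings(boundary))
```

For instance, `hom_dimension(5, 7)` gives 132, the number of diagrams of the type shown in Fig. A1.1, and `hom_dimension(n, n)` is the dimension of the algebra $T_n$.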
The tensor product of objects in TL is given by $n \otimes n' = n + n'$. The tensor product of morphisms is defined by horizontal juxtaposition. More exactly, the tensor product $a \otimes b$ of an $(n, m)$-TL diagram $a$ and an $(n', m')$-diagram $b$ is defined by horizontal juxtaposition of the diagrams, the result being an $(n + n', m + m')$-TL diagram. The tensor product extends uniquely to a bilinear product $\mathrm{Hom}(m, n) \times \mathrm{Hom}(m', n') \to \mathrm{Hom}(m + m', n + n')$, hence to a linear map $\mathrm{Hom}(m, n) \otimes \mathrm{Hom}(m', n') \to \mathrm{Hom}(m + m', n + n')$.
For each $n \in \mathbb{N}_0$, $T_n := \mathrm{End}(n)$ is a $\mathbb{C}(t)$-algebra, with the composition product. The identity $1_n$ of $T_n$ is the diagram with $n$ vertical (non-crossing) strands. We have canonical embeddings of $T_n$ into $T_{n+m}$ given by $x \mapsto x \otimes 1_m$. If $m > n$ with $m - n$ even, there also exist obvious embeddings of $\mathrm{Hom}(n, m)$ and $\mathrm{Hom}(m, n)$ into $T_m$ as follows: If $\cap$ and $\cup$ denote the morphisms in $\mathrm{Hom}(0, 2)$ and $\mathrm{Hom}(2, 0)$, then we have linear embeddings
$$a \in \mathrm{Hom}(n, m) \mapsto a \otimes \cup^{\otimes (m-n)/2} \in T_m \quad\text{and}\quad b \in \mathrm{Hom}(m, n) \mapsto b \otimes \cap^{\otimes (m-n)/2} \in T_m.$$
Note that these maps have left inverses which are given by premultiplication by an element of $\mathrm{Hom}(n, m)$ in the first case, and postmultiplication by an element of $\mathrm{Hom}(m, n)$ in the second. Namely,
$$a = d^{-(m-n)/2}\,(a \otimes \cup^{\otimes (m-n)/2}) \circ (1_n \otimes \cap^{\otimes (m-n)/2})$$
and
$$b = d^{-(m-n)/2}\,(1_n \otimes \cup^{\otimes (m-n)/2}) \circ (b \otimes \cap^{\otimes (m-n)/2}).$$
By an ideal $J$ in TL we shall mean a vector subspace of $\bigoplus_{n,m} \mathrm{Hom}(n, m)$ which is closed under composition and tensor product with arbitrary morphisms. That is, if $a, b$ are composable morphisms, and one of them is in $J$, then the composition $ab$ is in $J$; and if $a, b$ are any morphisms, and one of them is in $J$, then the tensor product $a \otimes b$ is in $J$. Note that any ideal is closed under the embeddings described just above, and under their left inverses.
A1.2. Specializations and evaluable morphisms. For any $\tau \in \mathbb{C}$, we define the specialization $\mathrm{TL}(\tau)$ of the Temperley Lieb category at $\tau$, which is obtained by replacing the indeterminate $t$ by $\tau$. More exactly, the objects of $\mathrm{TL}(\tau)$ are again elements of $\mathbb{N}_0$, the set of morphisms $\mathrm{Hom}(m, n)(\tau)$ is the $\mathbb{C}$-vector space with basis the set of $(m, n)$-TL diagrams, and the composition rule is as before, except that $d$ is replaced by $d(\tau) = \tau + \tau^{-1}$. Tensor products are defined as before. $T_n(\tau) := \mathrm{End}(n)$ is a complex algebra, and $x \mapsto x \otimes 1_m$ defines a canonical embedding of $T_n(\tau)$ into $T_{n+m}(\tau)$. One also has embeddings $\mathrm{Hom}(m, n) \to T_n$ and $\mathrm{Hom}(n, m) \to T_n$, when $m < n$, as before. An ideal in $\mathrm{TL}(\tau)$ again means a subspace of $\bigoplus_{n,m} \mathrm{Hom}(n, m)$ which is closed under composition and tensor product with arbitrary morphisms.
Let $\mathbb{C}(t)_\tau$ be the ring of rational functions without pole at $\tau$. The set of evaluable morphisms in $\mathrm{Hom}(m, n)$ is the $\mathbb{C}(t)_\tau$-span of the basis of $(m, n)$-TL diagrams. Note that the composition and tensor product of evaluable morphisms are evaluable. We have an evaluation map from the set of evaluable morphisms to morphisms of $\mathrm{TL}(\tau)$ defined by
$$a = \sum_j s_j(t)\,a_j \mapsto a(\tau) = \sum_j s_j(\tau)\,a_j,$$
where the $s_j$ are in $\mathbb{C}(t)_\tau$, and the $a_j$ are TL-diagrams. We write $x \mapsto x(\tau)$ for the evaluation map. The evaluation map is a homomorphism for the composition and tensor products. In particular, one has a $\mathbb{C}$-algebra homomorphism from the algebra $T_n^\tau$ of evaluable endomorphisms of $n$ to the algebra $T_n(\tau)$ of endomorphisms of $n$ in $\mathrm{TL}(\tau)$.
The principle of constancy of dimension is an important tool for analyzing the specialized categories $\mathrm{TL}(\tau)$. We state it in the form which we need here:
Proposition A1.1. Let $e \in T_n$ and $f \in T_m$ be evaluable idempotents in the generic Temperley Lieb category. Let $A$ be the $\mathbb{C}(t)$-span in $\mathrm{Hom}(m, n)$ of a certain set of $(m, n)$-TL diagrams, and let $A(\tau)$ be the $\mathbb{C}$-span in $\mathrm{Hom}(m, n)(\tau)$ of the same set of diagrams. Then $\dim_{\mathbb{C}(t)} eAf = \dim_{\mathbb{C}} e(\tau)A(\tau)f(\tau)$.
Proof. Let $X$ denote the set of TL diagrams spanning $A$. Clearly $\dim_{\mathbb{C}(t)} A = \dim_{\mathbb{C}} A(\tau) = |X|$. Choose a basis of $e(\tau)A(\tau)f(\tau)$ of the form $\{e(\tau)xf(\tau) : x \in X_0\}$ where $X_0$ is some subset of $X$. If the set $\{exf : x \in X_0\}$ were linearly dependent over $\mathbb{C}(t)$, then it would be linearly dependent over $\mathbb{C}[t]$, and evaluating at $\tau$ would give a linear dependence of $\{e(\tau)xf(\tau) : x \in X_0\}$ over $\mathbb{C}$. It follows that $\dim_{\mathbb{C}(t)} eAf \ge \dim_{\mathbb{C}} e(\tau)A(\tau)f(\tau)$. But one has similar inequalities with $e$ replaced by $1 - e$ and/or $f$ replaced by $1 - f$. If any of the inequalities were strict, then adding them would give $\dim_{\mathbb{C}(t)} A > \dim_{\mathbb{C}} A(\tau)$, a contradiction.
A1.3. The Markov trace. The Markov trace $\mathrm{Tr} = \mathrm{Tr}_n$ is defined on $T_n$ (or on $T_n(\tau)$) by the following picture, which represents an element in $\mathrm{End}(0) \cong \mathbb{C}(t)$ (resp. $\mathrm{End}(0) \cong \mathbb{C}$).
On an $(n, n)$-TL diagram $a \in T_n$, the trace is evaluated geometrically by closing up the diagram as in the figure, and counting the number $c(a)$ of components (closed loops); then $\mathrm{Tr}(a) = d^{c(a)}$. It will be useful to give the following inductive description of closing up a diagram. We define a map $\varepsilon_n : T_{n+1} \to T_n$ (known as a conditional expectation in operator algebras) by only closing up the last strand; algebraically it can be defined by
$$a \in T_{n+1} \mapsto (1_n \otimes \cup) \circ (a \otimes 1) \circ (1_n \otimes \cap).$$
If $k > n$, the map $\varepsilon_{n,k}$ is defined by $\varepsilon_{n,k} = \varepsilon_n \circ \varepsilon_{n+1} \circ \cdots \circ \varepsilon_{k-1}$. It follows from the definitions that $\mathrm{Tr}(a) = \varepsilon_{0,n}(a)$ for $a \in T_n$. It is well-known that Tr is indeed a functional satisfying $\mathrm{Tr}(ab) = \mathrm{Tr}(ba)$; one easily checks that this equality is even true if $a \in \mathrm{Hom}(n, m)$ and $b \in \mathrm{Hom}(m, n)$. We need the following well-known fact:
Fig. A1.2. The categorical trace $\mathrm{Tr}(a) \in \mathrm{End}(0)$ of an element $a \in T_n$
Lemma A1.2. Let $f \in T_{n+m}$ and let $p \in T_n$ such that $(p \otimes 1_m) f (p \otimes 1_m) = f$, where $p$ is a minimal idempotent in $T_n$. Then $\varepsilon_{n,n+m}(f) = \gamma p$, where $\gamma = \mathrm{Tr}_{n+m}(f)/\mathrm{Tr}_n(p)$.
Proof. It follows from the definitions that $p\,\varepsilon_{n,n+m}(f)\,p = \varepsilon_{n,n+m}((p \otimes 1_m) f (p \otimes 1_m)) = \varepsilon_{n,n+m}(f)$. As $p$ is a minimal idempotent in $T_n$, $\varepsilon_{n,n+m}(f) = \gamma p$, for some scalar $\gamma$. Moreover, by our definition of trace, we have $\mathrm{Tr}_{n+m}(f) = \mathrm{Tr}_n(\varepsilon_{n,n+m}(f)) = \gamma\,\mathrm{Tr}_n(p)$. This determines the value of $\gamma$.
The negligible morphisms $\mathrm{Neg}(n, m)$ are defined to be all elements $a \in \mathrm{Hom}(n, m)$ for which $\mathrm{Tr}(ab) = 0$ for all $b \in \mathrm{Hom}(m, n)$. It is well-known that the set of all negligible morphisms forms an ideal in TL.
A2. The structure of the Temperley–Lieb algebras
A2.1. The generic Temperley–Lieb algebras. Recall that a Young diagram $\lambda = [\lambda_1, \lambda_2, \dots, \lambda_k]$ is a left justified array of boxes with $\lambda_i$ boxes in the $i$th row and $\lambda_i \ge \lambda_{i+1}$ for all $i$. For example, $[5, 3]$ has five boxes in its first row and three boxes in its second row. All Young diagrams in this paper will have at most two rows. For $\lambda$ a Young diagram with $n$ boxes, a Young tableau of shape $\lambda$ is a filling of $\lambda$ with the numbers 1 through $n$ so that the numbers increase in each row and column. The number of Young tableaux of shape $\lambda$ is denoted by $f_\lambda$.
The generic Temperley Lieb algebras $T_n$ are known ([J1]) to decompose as direct sums of full matrix algebras over the field $\mathbb{C}(t)$, $T_n = \bigoplus_\lambda T_\lambda$, where the sum is over all Young diagrams $\lambda$ with $n$ boxes (and with no more than two rows), and $T_\lambda$ is isomorphic to an $f_\lambda$-by-$f_\lambda$ matrix algebra.
When $\lambda$ and $\mu$ are Young diagrams of size $n$ and $n + 1$, one has a (non-unital) homomorphism of $T_\lambda$ into $T_\mu$ given by $x \mapsto (x \otimes 1) z_\mu$, where $z_\mu$ denotes the minimal central idempotent in $T_{n+1}$ such that $T_\mu = T_{n+1} z_\mu$. Let $g_{\lambda,\mu}$ denote the rank of $(e \otimes 1) z_\mu$, where $e$ is any minimal idempotent in $T_\lambda$. It is known that $g_{\lambda,\mu} = 1$ in case $\mu$ is obtained from $\lambda$ by adding one box, and $g_{\lambda,\mu} = 0$ otherwise.
One can describe the embedding of $T_n$ into $T_{n+1}$ by a Bratteli diagram (or induction-restriction diagram), which is a bipartite graph with vertices labelled by two-row Young diagrams of size $n$ and $n + 1$ (corresponding to the simple components of $T_n$ and $T_{n+1}$) and with $g_{\lambda,\mu}$ edges joining the vertices labelled by $\lambda$ and $\mu$. That is, $\lambda$ and $\mu$ are joined by an edge precisely when $\mu$ is obtained from $\lambda$ by adding one box. The sequence of embeddings $T_0 \to T_1 \to T_2 \to \cdots$ is described by a multilevel Bratteli diagram, as shown in Fig. A2.5.
A tableau of shape $\lambda$ may be identified with an increasing sequence of Young diagrams beginning with the empty diagram and ending at $\lambda$; namely the $j$th diagram in the sequence is the subdiagram of $\lambda$ containing the numbers $1, 2, \dots, j$. Such a sequence may also be interpreted as a path on the Bratteli diagram of Fig. A2.5, beginning at the empty diagram and ending at $\lambda$.
A2.2. Path idempotents. One can define a family of minimal idempotents $p_t$ in $T_n$, labelled by paths $t$ of length $n$ on the Bratteli diagram (or equivalently, by Young tableaux of size $n$), with the following properties:
1. $p_t p_s = 0$ if $t, s$ are different paths both of length $n$.
2. $z_\lambda = \sum \{p_t : t \text{ ends at } \lambda\}$.
3. $p_t \otimes 1 = \sum \{p_s : s \text{ has length } n + 1 \text{ and extends } t\}$.
Let $t$ be a path of length $n$ and shape $\lambda$ and let $\mu$ be a Young diagram of size $n + m$. It follows that $(p_t \otimes 1_m) z_\mu \ne 0$ precisely when there is a path on the Bratteli diagram from $\lambda$ to $\mu$.
It has been shown in [J1] that (in our notations) $\mathrm{Tr}(p_t) = [\lambda_1 - \lambda_2 + 1]$, where $[m] = (t^m - t^{-m})/(t - t^{-1})$ for any integer $m$, and where $\lambda$ is the endpoint of the path $t$. Observe that we get the same value for diagrams $\lambda$ and $\mu$ (of different sizes) that are in the same column in the Bratteli diagram.
Fig. A2.5. Bratteli diagram for the sequence $(T_n)$
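The path description of the Bratteli diagram lends itself to a short computation. The sketch below is an illustration under stated assumptions: diagrams are encoded as pairs `(l1, l2)` and the function names are hypothetical. It counts paths from the empty diagram to obtain $f_\lambda$, and then checks numerically a consequence of properties 2 and 3 together with the trace formula: since the $p_t$ of length $n$ sum to $1_n$, one expects $\sum_\lambda f_\lambda\,[\lambda_1 - \lambda_2 + 1] = \mathrm{Tr}(1_n) = d^n$.

```python
import math
from math import comb


def bratteli_path_counts(n):
    """Map each two-row Young diagram (l1, l2) with n boxes to f_lambda,
    the number of paths from the empty diagram on the Bratteli diagram."""
    level = {(0, 0): 1}
    for _ in range(n):
        nxt = {}
        for (l1, l2), count in level.items():
            successors = [(l1 + 1, l2)]          # add a box to the first row
            if l2 + 1 <= l1:
                successors.append((l1, l2 + 1))  # add a box to the second row
            for mu in successors:
                nxt[mu] = nxt.get(mu, 0) + count
        level = nxt
    return level


def quantum_integer(m, t):
    """[m] = (t^m - t^(-m)) / (t - t^(-1)), evaluated at a number t."""
    return (t**m - t**(-m)) / (t - 1 / t)


def trace_of_identity(n, t):
    """Sum of f_lambda * [l1 - l2 + 1] over diagrams with n boxes."""
    return sum(f * quantum_integer(l1 - l2 + 1, t)
               for (l1, l2), f in bratteli_path_counts(n).items())
```

For $n = 4$ the counts are $f_{[4]} = 1$, $f_{[3,1]} = 3$, $f_{[2,2]} = 2$, and the sum of their squares is the dimension of $T_4$, in agreement with the decomposition of $T_n$ into matrix algebras.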
The idempotents $p_t$ were defined by recursive formulas in [W2], generalizing the formulas for the Jones-Wenzl idempotents in [W1].
A2.3. Specializations at non-roots of unity. When $\tau$ is not a proper root of unity, the Temperley Lieb algebras $T_n(\tau)$ are semi-simple complex algebras with the "same" structure as generic Temperley Lieb algebras. That is, $T_n(\tau) = \bigoplus_\lambda T_\lambda(\tau)$, where $T_\lambda(\tau)$ is isomorphic to an $f_\lambda$-by-$f_\lambda$ matrix algebra over $\mathbb{C}$. The embeddings $T_n(\tau) \to T_{n+1}(\tau)$ are described by the Bratteli diagram as before. The idempotents $p_t$, and the minimal central idempotents $z_\lambda$, in the generic algebras $T_n$, are evaluable at $\tau$, and the evaluations $p_t(\tau)$, resp. $z_\lambda(\tau)$, satisfy analogous properties.
A2.4. Specializations at roots of unity and evaluable idempotents. We require some terminology for discussing the case where $\tau$ is a root of unity. Let $q = \tau^2$, and suppose that $q$ is a primitive $\ell$th root of unity. We say that a Young diagram $\lambda$ is critical if $w(\lambda) := \lambda_1 - \lambda_2 + 1$ is divisible by $\ell$. The $m$th critical line on the Bratteli diagram for the generic Temperley Lieb algebra is the line containing the diagrams $\lambda$ with $w(\lambda) = m\ell$. See Fig. A2.6. Say that two non-critical diagrams $\lambda$ and $\mu$ with the same number of boxes are reflections of one another in the $m$th critical line if $\lambda \ne \mu$ and $|w(\lambda) - m\ell| = |w(\mu) - m\ell| < \ell$. (For example, with $\ell = 3$, $[2, 2]$ and $[4]$ are reflections in the first critical line $w(\lambda) = 3$.)
Fig. A2.6. Critical lines
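The terminology of Sect. A2.4 can be made concrete with a few lines of code. The following sketch assumes the particular choice $\tau = e^{i\pi/\ell}$, so that $q = \tau^2$ is a primitive $\ell$th root of unity; the function names and the encoding of a Young diagram as a tuple of row lengths are illustrative assumptions. It computes $w(\lambda)$, tests criticality and reflection, and evaluates the quantum integer $[m]_\tau$, which vanishes at $m = \ell$.

```python
import cmath


def width(lam):
    """w(lambda) = lambda_1 - lambda_2 + 1 for a diagram with at most two rows."""
    l1 = lam[0] if len(lam) > 0 else 0
    l2 = lam[1] if len(lam) > 1 else 0
    return l1 - l2 + 1


def is_critical(lam, ell):
    """A diagram is critical if w(lambda) is divisible by ell."""
    return width(lam) % ell == 0


def are_reflections(lam, mu, ell):
    """True if lam and mu are reflections of one another in a critical line."""
    w1, w2 = width(lam), width(mu)
    if sum(lam) != sum(mu) or w1 == w2:
        return False  # different sizes, or the same two-row diagram
    if is_critical(lam, ell) or is_critical(mu, ell):
        return False
    last_line = max(w1, w2) // ell + 1
    return any(abs(w1 - m * ell) == abs(w2 - m * ell) < ell
               for m in range(1, last_line + 1))


def quantum_integer(m, tau):
    """[m]_tau = (tau^m - tau^(-m)) / (tau - tau^(-1))."""
    return (tau**m - tau**(-m)) / (tau - 1 / tau)


ell = 3
tau = cmath.exp(1j * cmath.pi / ell)  # assumed choice: tau^2 is a primitive cube root of unity
```

With $\ell = 3$ this reproduces the example in the text: `are_reflections((2, 2), (4,), 3)` holds, while $[3]_\tau$ is zero up to rounding, so a path idempotent ending on the first critical line has trace zero.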
For $\tau$ a proper root of unity, the formulas for path idempotents in [W1] and [W2] generally contain poles at $\tau$, i.e. the idempotents are not evaluable. However, suitable sums of path idempotents are evaluable. We will review some facts from [GW] about such evaluable sums.
Suppose $w(\lambda) \le \ell$ and $t$ is a path of shape $\lambda$ which stays strictly to the left of the first critical line (in case $w(\lambda) < \ell$), or hits the first critical line for the first time at $\lambda$ (in case $w(\lambda) = \ell$); then $p_t$ is evaluable at $\tau$, and furthermore $\mathrm{Tr}(p_t) = [w(\lambda)]_\tau = (\tau^{w(\lambda)} - \tau^{-w(\lambda)})/(\tau - \tau^{-1})$.
For each critical diagram $\lambda$ of size $n$, the minimal central idempotent $z_\lambda$ in $T_n$ is evaluable at $\tau$. Furthermore, for each non-critical diagram $\lambda$ of size $n$, an evaluable idempotent $z_\lambda^L = \sum p_t \in T_n$ was defined in [GW] as follows: The summation goes over all paths $t$ ending in $\lambda$ for which the last critical line hit by $t$ is the one nearest to $\lambda$ to the left, and over the paths obtained from such $t$ by reflecting its part after the last critical line in the critical line (see Figure A2.7).
These idempotents have the following properties (which were shown in [GW]):
1. $\{z_\lambda(\tau) : \lambda \text{ critical}\} \cup \{z_\mu^L(\tau) : \mu \text{ non-critical}\}$ is a partition of unity in $T_n(\tau)$; that is, the idempotents are mutually orthogonal and sum to the identity.
2. $z_\lambda(\tau)$ is a minimal central idempotent in $T_n(\tau)$ if $\lambda$ is critical, and $z_\lambda^L(\tau)$ is minimal central modulo the nilradical of $T_n$ if $\lambda$ is not critical (see [GW], Theorem 2.2 and Theorem 2.3).
Fig. A2.7. A path and its reflected path
3. For $\lambda$ and $\mu$ non-critical, $z_\lambda^L(\tau)\,T_n(\tau)\,z_\mu^L(\tau) \ne 0$ only if $\lambda = \mu$, or if there is exactly one critical line between $\lambda$ and $\mu$ which reflects $\lambda$ to $\mu$. If in this case $\nu$ denotes the leftmost of the two diagrams $\lambda$ and $\mu$, then $z_\lambda^L T_n z_\mu^L \subseteq T_\nu$ (in the generic Temperley Lieb algebra).
4. Let $z_n^{\mathrm{reg}} = \sum p_t$, where the summation goes over all paths $t$ which stay strictly to the left of the first critical line, and let $z_n^{\mathrm{nil}} = 1 - z_n^{\mathrm{reg}}$. Then both $z_n^{\mathrm{reg}}$ and $z_n^{\mathrm{nil}}$ are evaluable; this is a direct consequence of the fact that $z_n^{\mathrm{reg}} = \sum_\lambda z_\lambda^L$, where the summation goes over diagrams $\lambda$ with $n$ boxes with width $w(\lambda) < \ell$.
Proposition A2.1. The ideal of negligible morphisms in $\mathrm{TL}(\tau)$ is generated by the idempotent $p_{[\ell-1]}(\tau) \in T_{\ell-1}(\tau)$.
Proof. Let us first show that $z_n^{\mathrm{nil}}(\tau)$ is in the ideal generated by $p_{[\ell-1]}(\tau)$ for all $n$. This is clear for $n < \ell$, as $z_{\ell-1}^{\mathrm{nil}} = p_{[\ell-1]}$ and $z_n^{\mathrm{nil}} = 0$ for $n < \ell - 1$. Moreover, $z_n^{\mathrm{nil}}$ is a central idempotent in the maximum semisimple quotient of $T_n$, whose minimal central idempotents are the $z_\lambda^L$ with $w(\lambda) \ge \ell$. One checks pictorially that $p_{[\ell-1]} z_\lambda^L \ne 0$ for any such $\lambda$ (i.e. the path to $[\ell - 1]$ can be extended to a path $t$ for which $p_t$ is a summand of $z_\lambda^L$). This proves our assertion in the maximum semisimple quotient of $T_n$; it is well-known that in this case also the idempotent itself must be in the ideal generated by $p_{[\ell-1]}$. In particular, $\mathrm{Hom}(n, m)\,z_m^{\mathrm{nil}}(\tau) + z_n^{\mathrm{nil}}(\tau)\,\mathrm{Hom}(n, m)$ is also contained in this ideal.
By [GW], Theorem 2.2 (c), for $\lambda$ a Young diagram of size $n$, with $w(\lambda) < \ell$, $z_\lambda^L T_n z_\lambda^L(\tau)$ is a full matrix algebra, which moreover contains a minimal idempotent $p_t$ of trace $\mathrm{Tr}(p_t) = [w(\lambda)]_\tau \ne 0$. Therefore $z_\lambda^L T_n z_\lambda^L(\tau) \cap \mathrm{Neg}(n, n) = (0)$. Furthermore, $z_n^{\mathrm{reg}} T_n z_n^{\mathrm{reg}}(\tau) = \bigoplus_\lambda z_\lambda^L T_n z_\lambda^L(\tau)$, by Fact 4 above, so
$$z_n^{\mathrm{reg}} T_n z_n^{\mathrm{reg}}(\tau) \cap \mathrm{Neg}(n, n) = (0)$$
as well. Now for $x \in \mathrm{Neg}(n, n)$, one has $z_n^{\mathrm{reg}}(\tau)\,x\,z_n^{\mathrm{reg}}(\tau) = 0$, so $x \in T_n(\tau) z_n^{\mathrm{nil}}(\tau) + z_n^{\mathrm{nil}}(\tau) T_n(\tau)$. We have shown that $\mathrm{Neg}(n, n)$ is contained in the ideal of $\mathrm{TL}(\tau)$ generated by $p_{[\ell-1]}$, for all $n$. That the same is true for $\mathrm{Neg}(m, n)$ with $n \ne m$ follows from using the embeddings, and their left inverses, described at the end of Sect. A1.1. □
A3. Ideals
Proposition A3.1. Any proper ideal in TL (or in $\mathrm{TL}(\tau)$) is contained in the ideal of negligible morphisms.
Proof. Let $a \in \mathrm{Hom}(m, n)$. For all $b \in \mathrm{Hom}(n, m)$, $\mathrm{Tr}(ba)$ is in the intersection of the ideal generated by $a$ with the scalars $\mathrm{End}(0)$. If $a$ is not negligible, then the ideal generated by $a$ contains a non-zero scalar, and therefore contains all morphisms. □
Corollary A3.2. The categories TL and $\mathrm{TL}(\tau)$ for $\tau$ not a proper root of unity have no non-zero proper ideals.
Proof. There are no non-zero negligible morphisms in TL and in $\mathrm{TL}(\tau)$ for $\tau$ not a proper root of unity. □
Theorem A3.3. Suppose that $\tau$ is a proper root of unity. Then the negligible morphisms form the unique non-zero proper ideal in $\mathrm{TL}(\tau)$.
Proof. Let $J$ be a non-zero proper ideal in $\mathrm{TL}(\tau)$. By the embeddings discussed at the end of Sect. A1.1, we can assume $J \cap T_n \ne 0$ for some $n$. Now let $a$ be a non-zero element of $J \cap T_n(\tau)$. Since $\{z_\lambda(\tau)\} \cup \{z_\mu^L(\tau)\}$ is a partition of unity in $T_n(\tau)$, one of the following conditions holds:
(a) $b = a z_\mu(\tau) \ne 0$ for some critical diagram $\mu$.
(b) $b = z_\mu^L(\tau)\,a\,z_\mu^L(\tau) \ne 0$ for some non-critical diagram $\mu$.
(c) $b = z_\lambda^L(\tau)\,a\,z_{\lambda'}^L(\tau) \ne 0$ for some pair $\lambda, \lambda'$ of non-critical diagrams which are reflections of one another in a critical line. In this case, let $\mu$ denote the leftmost of the two diagrams $\lambda, \lambda'$.
In each of the three cases, one has $b \in e(\tau) T_n(\tau) f(\tau)$, where $e, f$ are evaluable idempotents in $T_n$ such that $e T_n f \subseteq T_\mu$. Let $\alpha$ be a Young diagram on the first critical line of size $n + m$, such that there exists a path on the generic Bratteli diagram connecting $\mu$ and $\alpha$. Then one has
$$\dim_{\mathbb{C}} z_\alpha(\tau)(e(\tau) \otimes 1_m)(T_n(\tau) \otimes \mathbb{C}\,1_m)(f(\tau) \otimes 1_m) = \dim_{\mathbb{C}(t)} z_\alpha (e \otimes \mathrm{id}_m)(T_n \otimes \mathbb{C}(t)\,1_m)(f \otimes 1_m) = \dim_{\mathbb{C}(t)} e T_n f = \dim_{\mathbb{C}} e(\tau) T_n(\tau) f(\tau),$$
where the first and last equalities result from the principle of constancy of dimension, and the second equality is because $x \mapsto z_\alpha(x \otimes 1_m)$ is injective from $T_\mu$ to $T_\alpha$. But then it follows that $x \mapsto z_\alpha(\tau)(x \otimes 1_m)$ is injective on $e(\tau) T_n(\tau) f(\tau)$. In particular $(b \otimes 1_m) z_\alpha$ is a non-zero element of $J \cap T_\alpha$. Hence there exists $c \in T_\alpha$ such that $g = c (b \otimes 1_m) z_\alpha$ is an idempotent. After conjugating (and multiplying with $p_{[\ell-1]} \otimes 1_m$, if necessary), we can assume $g$ to be a subidempotent of $p_{[\ell-1]} \otimes 1_m$. But then $\varepsilon_{\ell-1,\ell-1+m}(g)$ is a multiple of $p_{[\ell-1]}$, by Lemma A1.2, with the multiple equal to the rank of $g$ in $T_\alpha$. This, together with Proposition A2.1, finishes the proof. □
It is easily seen that TL has a subcategory $\mathrm{TL}^{ev}$ whose objects consist of even numbers of points, and with the same morphisms between sets of even points as for TL. The evaluation $\mathrm{TL}^{ev}(\tau)$ is defined in complete analogy to $\mathrm{TL}(\tau)$.
Corollary A3.4. If $\tau^2$ is a proper root of unity of degree $\ell$ with $\ell$ odd, the negligible morphisms form the unique non-zero proper ideal in $\mathrm{TL}^{ev}$.
Proof. If $\ell$ is odd, $p_{[\ell-1]}$ is a morphism in $\mathrm{TL}^{ev}$. The proof of the last theorem goes through word for word (one only needs to make sure that one stays within $\mathrm{TL}^{ev}$, which is easy to check). □
References
[A] Anderson, P.W.: The resonating valence bond state in La2CuO4 and superconductivity. Science 235, 1196–1198 (1987)
[At] Atiyah, M.: The geometry and physics of knots. Lezioni Lincee. [Lincei Lectures] Cambridge: Cambridge University Press, 1990, x+78 pp.
[B] Baxter, R.: Exactly Solved Models in Statistical Mechanics. London-San Diego: Academic Press, 1982
[BHMV] Blanchet, C., Habegger, N., Masbaum, G., Vogel, P.: Topological quantum field theories derived from the Kauffman bracket. Topology 34(4), 883–927 (1995)
[BK] Bravyi, S., Kitaev, A.Yu.: Quantum invariants of 3-manifolds and quantum computation. 2001 Preprint
[BJ] Borgs, C., Janke, W.: An explicit formula for the interface tension of the 2D Potts model. J. Phys. I France 2, 2011–2018 (1992)
[D] Deutsch, D.: Quantum computational networks. Proc. Roy. Soc. London, A425, 73–90 (1989)
[Dr] Drinfeld, V.G.: Quantum groups. Proceedings of the International Congress of Mathematicians. Vol. 1, 2 (Berkeley, Calif., 1986), Providence, RI: Am. Math. Soc., 1987, pp. 798–820
[F1] Freedman, M.H.: P/NP, and the quantum field computer. Proc. Natl. Acad. Sci. USA 95(1), 98–101 (1998)
[F2] Freedman, M.H.: Quantum computation and the localization of modular functors. Found. Comput. Math 1(2), 183–204 (2001)
[FKLW] Freedman, M., Kitaev, A., Larsen, M., Wang, Z.: Topological Quantum Computation. AMS 2002 - to be published
[FLP] Fathi, A., Laudenbach, F., Poenaru, V.: Travaux de Thurston sur les surfaces. Asterisque 66–67 (1979)
[FLW1] Freedman, M., Larsen, M., Wang, Z.: The two-eigenvalue problem and density of Jones representation of braid groups. Commun. Math. Phys. 228(1), 177–199 (2002)
[FLW2] Freedman, M., Larsen, M., Wang, Z.: A modular functor which is universal for quantum computation. Commun. Math. Phys. 227(3), 605–622 (2002)
[Fe] Feynman, R.: Simulating physics with computers. Int. J. Theor. Phys. 21, 467–488 (1982)
[FS] Senthil, T., Fisher, M.: Fractionalization and confinement in the U(1) and Z2 gauge theories of strongly correlated systems. J. Phys. A 34(10), L119–L125 (2001)
[FNWW] Freedman, M., Nayak, C., Walker, K., Wang, Z.: Pictures TQFTs. In preparation.
[G] Gottesman, D.: Theory of fault-tolerant quantum computation. Phys. Rev. Lett 79, 325–328 (—-)
[GW] Goodman, F., Wenzl, H.: The Temperley-Lieb algebra at roots of unity. Pac. J. Math. 161(2), 307–334 (1993)
[J1] Jones, V.F.R.: Index for subfactors. Invent. Math. 72(1), 1–25 (1983), and Jones, V.F.R.: Braid groups, Hecke algebras and type II1 factors. In: Geometric Methods in Operator Algebras, Proc. of the US-Japan Seminar, Kyoto, July 1983
[J2] Jones, V.F.R.: Planar Algebras, I. On personal website
[Ka] Kauffman, L.: An invariant of regular isotopy. Trans. Am. Math. Soc. 318(2), 417–471 (1990)
[KL] Kauffmann, L., Lins, S.: Temperley-Lieb recoupling theory and invariants of 3-manifolds. Ann. Math. Studies 134, (—)
[K1] Kitaev, A.Yu.: Fault-tolerant quantum computation by anyons. quant-ph/9707021, 9, July (1997)
[K2] Kitaev, A.Yu.: Quantum computations: algorithms and error correction. Russ. Math. Survey 52:61, 1191–1249 (1997)
[KSV] Kitaev, A.Yu., Shen, A., Vyalyi, M.: Classical and quantum computation. Translated from the 1999 Russian original by Lester J. Senechal. Graduate Studies in Mathematics, 47. Providence, RI: American Mathematical Society, 2002. xiv+257 pp.
[Ku] Kuperberg, G.: Spiders for rank 2 Lie algebras. Commun. Math. Phys. 180(1), 109–151 (1996)
[L] Lickorish, W.B.R.: Three-manifolds and the Temperley-Lieb algebra. Math. Ann. 290(4), 657–670 (1991), and The Skein method for three-manifold invariant. J. Knot Theory and its Ramifications 2, 171 (1993)
[Ll] Lloyd, S.: A potentially realisable quantum computer. Science 261, 1569 (1993) and Science 263, 695 (1994)
[LSW] Lawler, G.F., Schramm, O., Werner, W.: Values of Brownian intersection exponents. II. Plane exponents. Acta Math. 187(2), 275–308 (2001)
[NS] Nayak, C., Shtengel, K.: Microscopic models of two-dimensional magnets with fractionalized excitations. Phys. Rev. B. 64, 064422 (2001)
[Ni] Nienhuis, B.: Coulomb gas formulation of two-dimensional phase transitions. In: Phase Transition and Critical Phenomena. London-San Diego: Academic Press, 1987
[NC] Nielsen, M.A., Chuang, I.L.: Quantum Computation and Quantum Information. Cambridge: Cambridge University Press, 2000. xxvi+676 pp.
[P] Preskill, J.: Fault-tolerant quantum computation. In: Introduction to Quantum Computation and Information. River Edge NJ: World Sci. Publishing, 1998, pp. 213–269
[PH] Penner, R.C., Harer, J.L.: Combinatorics of train tracks. Annals of Mathematics Studies, 125, Princeton, NJ: Princeton University Press, 1992. xii+216 pp.
[Prz] Przytycki, J.H.: Skein modules of 3-manifolds. Bull. Polish Acad. Sci. Math. 39(1–2), 91–100 (1991)
[RR] Read, N., Rezayi, E.: Beyond paired quantum Hall states: Parafermions and incompressible states in the first excited Landau level. LANL cond-mat/9809384
[RT] Reshetikhin, N., Turaev, V.G.: Invariants of 3-manifolds via link polynomials and quantum groups. Invent. Math 103(3), 547–597 (1991)
[Se] Segal, G.: Notes on conformal field theory. Unpublished
[SF] Senthil, T., Fisher, M.P.A.: Z2 Gauge theory of electron fractionalization in strongly correlated systems. LANL cond-mat/9910224
[S1] Shor, P.: Fault-tolerant quantum computation. In: 37th Annual Symposium on Foundations of Computer Science (Burlington, VT, 1996), Los Alamitos, CA: IEEE Comput. Soc. Press, 1996, pp. 56–65
[S2] Shor, P.: Algorithms for quantum computation: Discrete logarithms and factoring. In: 35th Annual Symposium on Foundations of Computer Science (Santa Fe, NM, 1994), Los Alamitos, CA: IEEE Comput. Soc. Press, 1994, pp. 124–134
[So] Solovey, R.: Unpublished notes on rapid factoring in SU(2)
[SK] Sondhi, S.L., Kivelson, S.A.: Long range interactions and the quantum hall effect. Phys. Rev. B46, 13319–13325 (1992)
[SW] Smirnov, S., Werner, W.: Critical exponents for two-dimensional percolation. Math. Res. Lett. 8(5–6), 729–744 (2001)
[Th] Thurston, W.: Geometry of 3-manifolds (notes). Dept. of Math., Princeton University, 1978
[Tr] Turaev, V.G.: Quantum invariants of knots and 3-manifolds. de Gruyter Studies in Mathematics, 18. Berlin: Walter de Gruyter & Co., 1994. x+588 pp.
[TV] Turaev, V.G., Viro, Ya O.: State sum invariants of 3-manifolds and quantum 6j-symbols. Topology 31(4), 865–902 (1992)
[U] Unruh, W.G.: Maintaining coherence in quantum computers. Phys. Rev. A 51, 992 (1995)
[Wa] Walker, K.: On Witten's 3-manifold invariants. Preprint, 1991. (Available at http://www.xmission.com/∼kwalker/math/)
[W1] Wenzl, H.: On sequences of projections. Math. Rep. C.R. Acad. Sc. Canada 9, 5–9 (1987)
[W2] Wenzl, H.: Hecke algebras of type An and subfactors. Invent. Math 92, 349–383 (1988)
[Wi] Witten, E.: Quantum field theory and the Jones polynomial. Commun. Math. Phys. 121, 351–399 (1989)
[Y] Yao, A.: Quantum circuit complexity. In: 34th Annual Symposium on Foundations of Computer Science (Palo Alto, CA, 1993), Los Alamitos, CA: IEEE Comput. Soc. Press, 1993, pp. 352–361
Communicated by P. Sarnak
Commun. Math. Phys. 234, 185–190 (2003) Digital Object Identifier (DOI) 10.1007/s00220-002-0779-z
Communications in
Mathematical Physics
Differentiation of SRB States: Correction and Complements
David Ruelle (1, 2)
(1) IHES, 91440 Bures sur Yvette, France. E-mail: [email protected]
(2) Math. Dept., Rutgers University, Piscataway, NJ 08854-8019, USA
Received: 26 August 2002 / Accepted: 26 September 2002 Published online: 20 January 2003 – © Springer-Verlag 2003
Abstract: Taking into account criticism by D. Dolgopyat and M. Jiang, we present here an improved derivation of the formula for the derivative of an SRB measure with respect to parameters.
The SRB measure $\rho_f$ on a mixing Axiom A attractor $K$ depends smoothly on the diffeomorphism $f$, and a formula for the derivative was given in [3], namely
$$\delta\rho_f(\Phi) = \sum_{n=0}^{\infty} \rho_f \langle \operatorname{grad}(\Phi \circ f^n), X \rangle = \sum_{n=0}^{\infty} \rho_f \big[ \langle (\operatorname{grad}\Phi) \circ f^n, (Tf^n) X^s \rangle - (\Phi \circ f^n)\,\operatorname{div}^u X^u \big], \tag{1}$$
where $X = \delta f \circ f^{-1}$ has components $X^s$ and $X^u$ in the stable and unstable directions. The divergence $\operatorname{div}^u$ is taken along an unstable manifold with respect to the natural measure induced by $\rho_f$ on unstable manifolds, and $\operatorname{div}^u X^u$ is a Hölder continuous function on $K$. This last fact, as pointed out by Dolgopyat, is not obvious, and was not clearly stated or proved in [3] (the problem is that $X^u$ need not be smooth). Furthermore, as pointed out by Jiang, one term was omitted in the proof of the above formula (this term, see below, vanishes however in the present circumstances).
The purpose of the present note is to correct and complement the proof of the "first step" in Sect. 3 of [3]. From this one obtains formula (1) for $\delta\rho_f$, without an extra term, and with Hölder continuous $\operatorname{div}^u X^u$ as needed for applications to statistical mechanics. The following proposition is stated for an attractor $K$, but an extension to general Axiom A basic sets is discussed in Remark 3 below.
Proposition 1. Let $f$ be a $C^r$ diffeomorphism ($r \ge 3$) of a compact manifold $M$ and $K$ a mixing Axiom A attractor. Also let $J_f^u : K \to \mathbb{R}$ be the unstable Jacobian computed with respect to some smooth volume element $\omega$ on the unstable manifolds (say the volume
element associated with a Riemann metric). A change $\delta f$ of $f$ corresponds to a vector field $X = \delta f \circ f^{-1}$ on $M$. Let $\delta J^u$ be the corresponding first order change of $J_f^u$. We shall prove the formula
$$\frac{\delta J^u}{J_f^u} \sim \operatorname{div}^u_\sigma X^u, \tag{2}$$
where we have used the following notation. The equivalence $\sim$ means that the integrals of both sides with respect to every $f$-invariant measure on $K$ coincide. We have written $X^s + X^u$ for the decomposition of $X$ restricted to $K$ along the stable and unstable directions. Finally, the divergence $\operatorname{div}^u_\sigma$ is computed on unstable manifolds with respect to the canonical volume element $\sigma$ defined up to a multiplicative constant by $\sigma(x) = s(x)\omega(x)$ where
$$\frac{s(x)}{s(y)} = \prod_{k=1}^{\infty} \left[ \frac{J_f^u(f^{-k}x)}{J_f^u(f^{-k}y)} \right]^{-1}.$$
We claim that $\operatorname{div}^u_\sigma X^u$ makes sense as a Hölder continuous function on $K$.
The proof of the proposition will use the absolute continuity of the projection along stable directions: let $\pi : \Sigma_1 \to \Sigma_2$ be the projection along local stable manifolds between two $u$-dimensional manifolds $\Sigma_1$, $\Sigma_2$ transversal to stable directions ($u$ is the unstable dimension); then $\pi$ is absolutely continuous with respect to the Riemann $u$-volume on $\Sigma_1$, $\Sigma_2$. The corresponding Jacobian $J\pi$ is Hölder, but if one of the $u$-dimensional manifolds is moved, the Jacobian varies smoothly along stable manifolds (see Lemma 2 below).
We shall denote by $V_x^s$, $V_x^u$ the stable and unstable subspaces of $T_x M$ at $x \in K$. Remember that, by structural stability, changing $f$ to $\tilde f$ yields a map $j : K \to M$ such that $\tilde f j = j f$. Furthermore $j$ depends smoothly on $\tilde f$ (and similarly for other quantities like $\tilde V^u_{jx}$, etc.). Our proposition is a first order calculation, where we write $\tilde f = f + \delta f$, $X = \delta f \circ f^{-1}$, and $jx = x + \delta x$. We also want to take
$$\delta J^u = \tilde J^u_{\tilde f} \circ j - J^u_f.$$
In order to define the unstable Jacobian $\tilde J^u_{\tilde f}$, we have to choose an unstable volume element $\tilde\omega$ for $\tilde f$. Note that changing $\tilde\omega$ amounts to changing $\log \tilde J^u_{\tilde f}$ by a "coboundary" term $A \circ \tilde f - A$. This changes $\delta J^u / J_f^u$ by $A \circ \tilde f \circ j - A \circ j = A \circ j \circ f - A \circ j \sim 0$. Any choice of $\tilde\omega$ corresponding to a continuous function $A$ on $K$ is thus allowed.
The calculation of $\delta J^u$ involves unstable manifolds $V^u_x$, $V^u_{fx}$, $\tilde V^u_{jx}$, $\tilde V^u_{jfx}$ for $f$ and $\tilde f$, and there are maps $\pi : \tilde V^u_{jx} \to V^u_x$, $\tilde V^u_{jfx} \to V^u_{fx}$ defined by projection along the stable manifolds for $f$. By absolute continuity, $\pi$ also defines maps $(T_{j\cdot} M)^{\wedge u} \to (V^u_\cdot)^{\wedge u}$, and we shall use the volume element $\tilde\omega = \omega \circ \pi$ on $\tilde V^u_{jx}$, $f \tilde V^u_{jx}$, $\tilde f \tilde V^u_{jx}$ (note that $\tilde\omega$ has continuous rather than smooth density). Let now $F(\cdot) \in (V^u_\cdot)^{\wedge u}$, $\tilde F(\cdot) \in (\tilde V^u_\cdot)^{\wedge u}$ be defined by $\langle \omega, F \rangle = 1$, $\langle \omega, \pi \tilde F \rangle = 1$. Write
$$\lambda(x) = \langle \omega, (T_x f)^{\wedge u} F(x) \rangle, \qquad \tilde\lambda(x) = \langle \omega, \pi (T_{jx} \tilde f)^{\wedge u} \tilde F(jx) \rangle$$
and $\delta\lambda(x) = \tilde\lambda(x) - \lambda(x)$; then
$$|\lambda(x)| = J_f^u(x), \qquad \delta\lambda(x)/\lambda(x) = \delta J^u(x)/J_f^u(x).$$
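We may write
$$\begin{aligned}
\delta\lambda(x) &= \langle\omega, \pi(T_{x+\delta x}(f+\delta f))^{\wedge u}\tilde F(x+\delta x)\rangle - \langle\omega, (T_xf)^{\wedge u}F(x)\rangle\\
&= \langle\omega, \pi(T_{x+\delta x}f)^{\wedge u}\tilde F(x+\delta x)\rangle - \langle\omega, (T_xf)^{\wedge u}F(x)\rangle\\
&\quad + \langle\omega, \pi[(T_{f(x+\delta x)}(\mathrm{id}+X))^{\wedge u} - 1](T_{x+\delta x}f)^{\wedge u}\tilde F(x+\delta x)\rangle.
\end{aligned}$$
If we let $\delta x = \delta^u x + \delta^s x$ with $\delta^u x \in V^u_x$, $\delta^s x \in V^s_x$, we have
$$\pi(T_{x+\delta x}f)^{\wedge u}\tilde F(x+\delta x) = (T_{x+\delta^u x}f)^{\wedge u}F(x+\delta^u x)$$
so that
$$\langle\omega, \pi(T_{x+\delta x}f)^{\wedge u}\tilde F(x+\delta x)\rangle - \langle\omega, (T_xf)^{\wedge u}F(x)\rangle = \lambda(x+\delta^u x) - \lambda(x)$$
(this is the term forgotten in [3], as pointed out by Jiang). Also, to first order,
$$\begin{aligned}
\langle\omega, \pi[(T_{f(x+\delta x)}(\mathrm{id}+X))^{\wedge u} - 1](T_{x+\delta x}f)^{\wedge u}\tilde F(x+\delta x)\rangle
&= \langle\omega, \pi[(T_{fx}(\mathrm{id}+X))^{\wedge u} - 1](T_xf)^{\wedge u}F(x)\rangle\\
&= \langle\omega, \pi[(T_{fx}(\mathrm{id}+X))^{\wedge u} - 1]\lambda(x)F(fx)\rangle = \lambda(x)\,\Delta(fx),
\end{aligned}$$
where
$$\Delta(x) = \langle\omega, \pi[(T_x(\mathrm{id}+X))^{\wedge u} - 1]F(x)\rangle.$$
We claim that
$$\Delta(x) = \mathrm{div}^u_\omega X^u. \tag{3}$$
The formula (3) will be obtained in this setup as a consequence of Lemma 2 below. Using (3) we have
$$\delta\lambda(x) = \lambda(x+\delta^u x) - \lambda(x) + \lambda(x)\,\mathrm{div}^u_\omega X^u(fx).$$
Let us now define $\sigma_n = f^n\omega$, so that $\sigma_n(x) = s_n(x)\omega(x)$ with
$$s_n(x) = \prod_{k=1}^{n}\left[J_f^u(f^{-k}x)\right]^{-1},$$
and replace ω by $\sigma_n$. This replaces $J_f^u(x)$ by $J_f^u(f^{-n}x)$ so that, as n → ∞, the derivative in the unstable direction of $\log J_f^u(x)$ tends to 0. In particular $\lambda(x+\delta^u x) - \lambda(x) \to 0$. Note also the identity
$$\mathrm{div}^u_{\sigma_n}X^u - \mathrm{div}^u_\sigma X^u = X^u\cdot\mathrm{grad}\,\log\frac{d\sigma_n}{d\sigma}$$
with the Radon-Nikodym derivative $d\sigma_n/d\sigma = s_n/s$, and
$$X^u\cdot\mathrm{grad}\,\log\frac{s_n}{s} = \sum_{k=n+1}^{\infty} X^u\cdot\mathrm{grad}\,\log(J_f^u\circ f^{-k}).$$
When n → ∞ (for fixed Hölder $X^u$) the above expression tends to zero in a space of Hölder continuous functions on K. Therefore, if we know that $\mathrm{div}^u_\omega X^u$ is a Hölder continuous function, so are all $\mathrm{div}^u_{\sigma_n}X^u$ and also their limit $\mathrm{div}^u_\sigma X^u$. (That $\mathrm{div}^u_\omega X^u$ is Hölder will result from 2 and 3 below). Using also $\Delta\circ f \sim \Delta$, this concludes the proof of Proposition 1 (modulo the proof of (3) in the next two sections).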
Lemma 2. Identify $\mathbb{R}^u$, $\mathbb{R}^s$ with subspaces of $\mathbb{R}^u + \mathbb{R}^s = \mathbb{R}^{u+s}$, and choose a chart identifying an open set of M with the sum $B^u + B^s$ of unit balls around 0 in $\mathbb{R}^u$ and $\mathbb{R}^s$. We assume that the stable manifolds for f are (uniformly) transversal to the affine spaces $y + \mathbb{R}^u$. Also let $0 \neq e \in \mathbb{R}^s$. If $x \in \mathbb{R}^u$, let $V^s_x$ be the local stable manifold through x, and let ξ(x, s) be the point of intersection of $V^s_x$ and $se + \mathbb{R}^u$. Thus $s \mapsto \xi(x,s)$ is smooth, ξ(x, 0) = x, and if $\tilde e(x) = \frac{d}{ds}\xi(x,s)|_{s=0}$, we may write $\tilde e(x) = e + \tilde g(x)$, where $\tilde g(x) \in \mathbb{R}^u$.
We claim that $x \mapsto \xi(x,s), \frac{d}{ds}\xi(x,s)$ are Hölder, and that the same is true for $x \mapsto \Lambda(x,s)$ defined by
$$(T_x\xi(\cdot,s))^{\wedge u}e_1\wedge\cdots\wedge e_u = \Lambda(x,s)\,e_1\wedge\cdots\wedge e_u,$$
where $e_1,\ldots,e_u$ are the canonical basis vectors of $\mathbb{R}^u$. (We write here $(T_x\xi)^{\wedge u}$ somewhat abusively, since ξ is not differentiable in general, to indicate the well defined action on u-volume elements). Furthermore, $s \mapsto \Lambda(x,s)$ is smooth, $x \mapsto \Lambda'(x,s) = \frac{d}{ds}\Lambda(x,s)$ is Hölder and, writing $\tilde g(x) = \sum_{k=1}^{u}\tilde g^k(x)e_k$, we have
$$\Lambda'(x,0) = \sum_{k=1}^{u}\frac{\partial}{\partial x^k}\tilde g^k(x)$$
in the sense of distributions (the left-hand side is a Hölder continuous function, canonically identified with a distribution on $\mathbb{R}^n$).
It is well known that $x \mapsto \xi(x,s), \frac{d}{ds}\xi(x,s)$ are Hölder. To prove the “absolute continuity” result that $x \mapsto \Lambda(x,s)$ is Hölder we use the formula
$$\lim_{n\to\infty}\frac{\|(T_xf^n)^{\wedge u}e_1\wedge\cdots\wedge e_u\|}{\|(T_{\xi(x,s)}f^n)^{\wedge u}\Lambda(x,s)\,e_1\wedge\cdots\wedge e_u\|} = 1$$
(where the norm $\|\cdot\|$ is based, say, on a Riemann metric on M). If we write $F_0 = e_1\wedge\ldots\wedge e_u$, $\tau_0 = 1$, and
$$F_k(x,s) = (T_{f^{k-1}\xi(x,s)}f)^{\wedge u}F_{k-1}(x,s)/\tau_{k-1}(x,s), \qquad \tau_k(x,s) = \|F_k(x,s)\|,$$
we find
$$\log\Lambda(x,s) = -\sum_{k=1}^{\infty}[\log\tau_k(x,s) - \log\tau_k(x,0)],$$
where the sum converges exponentially fast (by hyperbolicity and the fact that ξ(x, s) is in the stable direction and the $F_k$ tend to the unstable direction) and is a Hölder continuous function by the usual argument. Also, $\Lambda'(x,s) = \frac{d}{ds}\Lambda(x,s)$ is given by a convergent series and is a Hölder continuous function of x.
Let $\mathcal{F}^{(0)}$ be a smooth foliation of a neighborhood of K, which is transversal to the stable manifolds ⊂ K. Let $\mathcal{F}^{(n)} = f^{-n}\mathcal{F}^{(0)}$ (restricted to a neighborhood of K). Then $\mathcal{F}^{(n)}$ tends to the “Hölder foliation” by stable manifolds: $\mathcal{F}^{(n)}_x \to V^s_x$ in $C^r$, uniformly in x. Define $\xi^{(n)}(x,s)$, $\tilde e^{(n)}(x)$, $\Lambda^{(n)}(x,s)$ in the same way as ξ(x, s), $\tilde e(x)$, Λ(x, s), but with stable manifolds replaced by leaves of $\mathcal{F}^{(n)}$. Then, there is α ∈ (0, 1) such that $\xi^{(n)}(x,\cdot) \to \xi(x,\cdot)$ in $C^r((-\alpha,\alpha))$ uniformly for $x \in \alpha B^u$.
A 3ε argument also shows that $\Lambda^{(n)}(x,s) \to \Lambda(x,s)$, $\Lambda'^{(n)}(x,s) \to \Lambda'(x,s)$ uniformly in $\alpha B^u \times (-\alpha,\alpha)$, but since $\mathcal{F}^{(n)}$ is a smooth foliation
$$\Lambda'^{(n)}(x,s) = \sum_{k=1}^{u}\frac{\partial}{\partial x^k}\frac{d}{ds}\xi^{(n)k}(x,s),$$
where we have written $\xi^{(n)} = \sum_{k=1}^{u}\xi^{(n)k}e_k$. Taking n → ∞, then s = 0 we obtain
$$\Lambda'(x,0) = \sum_{k=1}^{u}\frac{\partial}{\partial x^k}\tilde g^k(x),$$
where the left-hand side is a Hölder continuous function of x, and the right-hand side a sum of distributional derivatives, as announced.
Proof of (3). Note that both sides of (3) are defined independently of a coordinate system. We choose coordinates $x^1,\ldots,x^u,x^{u+1},\ldots,x^{u+s}$ such that $V^u_x$ satisfies $x^{u+1} = \ldots = x^{u+s} = 0$, and $\omega = dx^1\wedge\ldots\wedge dx^u$, and X is constant = e. We then see that
$$\begin{aligned}
\Delta(x) &= \langle\omega, \pi[(T_x(\mathrm{id}+X))^{\wedge u} - 1]F(x)\rangle\\
&= \Big\langle\omega, \Big[\Big(T_x\Big(\mathrm{id} - \frac{d}{ds}\xi(x,s)\Big|_{s=0}\Big)\Big)^{\wedge u} - 1\Big]e_1\wedge\ldots\wedge e_u\Big\rangle\\
&= -\Lambda'(x,0) = -\sum_{k=1}^{u}\frac{\partial}{\partial x^k}\tilde g^k(x) = \mathrm{div}^u_\omega X^u
\end{aligned}$$
(the second equality is geometrically clear, and the last follows from $X^u = -\tilde g$).
Remark 3. If the mixing Axiom A basic set K is not an attractor the local stable manifolds do not cover a neighborhood of K. The proof of Proposition 1 carries over to this situation, except that $\Lambda'(x,0)$ cannot in a straightforward manner be interpreted as a divergence. We do not pursue this topic here in spite of its interest (it is related to “escape from quasi-attractors”).
Calculation of $\delta\rho_f$. The fact that the attractor K and the SRB measure $\rho_f$ depend smoothly on f has been noted by various authors at various levels of generality. To the references in [3] one should add Bakhtin [1]. For recent results see also [2]. The specific formula (1) goes however beyond the proof of differentiability. As indicated in [3], $\delta\rho_f$ is a sum of two terms: $\delta^{(1)}\rho_f$ which takes into account the change in $\log J_f^u$, and $\delta^{(2)}\rho_f$ which takes into account the change x → x + δx associated with structural stability of K. One has (with $\Phi \in C^2(M)$)
$$\delta^{(1)}\rho_f(\Phi) = \sum_{k\in\mathbb{Z}}\big[\rho_f((\Phi\circ f^k)\cdot(-\mathrm{div}^u_\sigma X^u)) - \rho_f(\Phi)\cdot\rho_f(-\mathrm{div}^u_\sigma X^u)\big],$$
$$\delta^{(2)}\rho_f(\Phi) = \rho_f\Big[\sum_{n=0}^{\infty}\langle\mathrm{grad}(\Phi\circ f^n), X^s\rangle - \sum_{n=1}^{\infty}\langle\mathrm{grad}(\Phi\circ f^{-n}), X^u\rangle\Big].$$
But since $\rho_f$ has conditional measure σ on the unstable manifolds, the integral of $\mathrm{div}^u_\sigma$ vanishes: $\rho_f(-\mathrm{div}^u_\sigma X^u) = 0$ and also
$$\rho_f\big[\langle\mathrm{grad}(\Phi\circ f^{-n}), X^u\rangle + (\Phi\circ f^{-n})\cdot\mathrm{div}^u_\sigma X^u\big] = 0.$$
[To see this it is convenient to use a Markov partition {R}. We may disintegrate $\rho_f$ in each rectangle R with respect to the partition into local unstable manifolds, writing $\rho_f|R$ as an integral of measures σ. The integral of $\mathrm{div}^u_\sigma X^u$ with respect to σ on a piece of unstable manifold reduces to boundary terms. And the boundary terms of different rectangles cancel when we sum over R. Therefore $\rho_f(\mathrm{div}^u_\sigma\,\cdot) = 0$.] In conclusion
$$\delta\rho_f(\Phi) = \sum_{n=0}^{\infty}\rho_f\big[\langle\mathrm{grad}(\Phi\circ f^n), X^s\rangle + (\Phi\circ f^n)\cdot(-\mathrm{div}^u_\sigma X^u)\big]$$
as announced.
Calculation of $\delta(\rho_f(-\log J_f^u))$. The expression $\delta(\rho_f(-\log J_f^u))$ cannot be directly evaluated from (1) (indeed $J_f^u = |\lambda|$ depends on f, and is defined on K, not on M). Remembering that $\delta\lambda = \tilde\lambda - \lambda$, where $|\tilde\lambda(x)|$ is the unstable Jacobian for $\tilde f$ estimated at x + δx, we see that in fact
$$\begin{aligned}
\delta(\rho_f(-\log\lambda)) &= (\delta^{(1)}\rho_f)(-\log\lambda) - \rho_f(\lambda^{-1}\delta\lambda)\\
&= \sum_{k\in\mathbb{Z}}\big[\rho_f((\log\lambda\circ f^k)\cdot\mathrm{div}^u_\sigma X^u) - \rho_f(\log\lambda)\cdot\rho_f(\mathrm{div}^u_\sigma X^u)\big] - \rho_f(\mathrm{div}^u_\sigma X^u\circ f)\\
&= \sum_{k\in\mathbb{Z}}\rho_f((\log\lambda\circ f^k)\cdot\mathrm{div}^u_\sigma X^u).
\end{aligned}$$
Therefore
$$\delta(\rho_f(-\log J_f^u)) = \sum_{k\in\mathbb{Z}}\rho_f((\log J_f^u\circ f^k)\cdot\mathrm{div}^u_\sigma X^u).$$
If the unstable volume element ω can be chosen such that the function $J_f^u$ is constant on K, then $\delta(\rho_f(-\log J_f^u)) = 0$ (but this is not the case in general).
Acknowledgements. I am considerably indebted to Dima Dolgopyat and Miaohua Jiang for their useful criticism as indicated above. Subsequent discussions with Dolgopyat and correspondence with Jiang have been very constructive and are gratefully acknowledged.
References
1. Bakhtin, V.I.: Random processes generated by a hyperbolic sequence of mappings. I. Russian Acad. Sci. Izv. Math. 44, 247–279 (1995)
2. Dolgopyat, D.: On differentiability of SRB states. Preprint
3. Ruelle, D.: Differentiation of SRB states. Commun. Math. Phys. 187, 227–241 (1997)
Communicated by J. L. Lebowitz
Commun. Math. Phys. 234, 191–227 (2003) Digital Object Identifier (DOI) 10.1007/s00220-002-0751-y
Communications in
Mathematical Physics
Spectral Analysis of Unitary Band Matrices
Olivier Bourget (1), James S. Howland (2), Alain Joye (1)
(1) Institut Fourier, Université de Grenoble 1, BP 74, 38402 St.-Martin d’Hères, France
(2) Department of Mathematics, University of Virginia, Charlottesville, VA 22903, USA
Received: 29 April 2002 / Accepted: 7 August 2002 Published online: 20 January 2003 – © Springer-Verlag 2003
Abstract: This paper is devoted to the spectral properties of a class of unitary operators with a matrix representation displaying a band structure. Such band matrices appear as monodromy operators in the study of certain quantum dynamical systems. These doubly infinite matrices essentially depend on an infinite sequence of phases which govern their spectral properties. We prove the spectrum is purely singular for random phases and purely absolutely continuous in case they provide the doubly infinite matrix with a periodic structure in the diagonal direction. We also study some properties of the singular spectrum of such matrices considered as infinite in one direction only.
1. Introduction
The dynamical stability of quantum systems governed by a time periodic Hamiltonian is often characterized in terms of the spectral properties of the corresponding monodromy operator, a unitary operator defined as the evolution generated by the Hamiltonian over a period. A first rough classification consists in determining whether or not the spectrum of the monodromy operator contains an absolutely continuous (a.c.) component. The presence of absolutely continuous spectrum is a signature of unstable quantum systems, whereas a purely singular spectrum is a characteristic of quantum stability. For smooth Hamiltonians, these spectral properties can be obtained through the study of an associated self-adjoint operator, the so-called Floquet or quasi-energy operator [Ho1], [Y]. In case the Hamiltonian is singular, e.g. when it corresponds to a kicked system, one is often led to consider the monodromy operator directly [Co2]. In both situations, one is typically confronted with a problem where a dense pure point operator is perturbed either by the addition of a self-adjoint operator in the first case, or by a multiplicative unitary perturbation in the second case. A more or less detailed spectral analysis can thus be performed provided a perturbative framework of some sort is available, or in case disorder is present. See e.g. [Be, DS, DLSV, GY, Ho2, Ho3, N, J] for the smooth case, and also the review [Co2, Co1, dO, ADE, Bo] for the kicked case.
The dynamical quantum systems we address here are characterized by a monodromy operator given by a product of two pure point unitaries, neither of which can be considered a perturbation of the other. However, the spectral analysis can be carried out under certain circumstances due to the fact that the monodromy operator has a band structure in some basis. The motivation of the construction of such operators is borrowed from the work [BB] which we briefly recall below. As noted by these authors, this structure allows us to adapt the techniques developed in the study of one dimensional discrete Schrödinger operators to the unitary framework in order to obtain results about the spectrum of such monodromy operators.
Let us briefly summarize the paper. In Sec. 2, we define explicitly the class of unitary operators on the integer lattices Z and N that we shall study and discuss their relationship to [BB]. These operators depend on transmission and reflection amplitudes at each lattice point. Some simple perturbative results for essential and absolutely continuous spectra are obtained in Sec. 3. Here, the moduli of the transmission and reflection amplitudes may vary from point to point, but in the remainder of the paper these moduli are assumed constant on the lattice. In Secs. 4 and 5, we consider the random case, in which the phases are independent and randomly distributed on the circle, and we prove that the spectrum is purely singular. To do this, we first establish a version of the Ishii-Pastur Theorem according to which the absolutely continuous part of the spectrum is almost surely supported on the closure of the set where the Lyapunov exponent vanishes and then prove that the Lyapunov exponent is everywhere positive. In Sec. 6, we consider the coherent case, in which the phases are eventually periodic. We identify the absolutely continuous spectrum, and show that the singular continuous spectrum is absent. Finally, in Sec. 7, we give an example in which the phases are almost periodic and the spectrum is purely singular continuous.
2. Construction of the Monodromy Operator
We consider a class of monodromy operators whose construction is motivated by the study of a model of electronic transport in a ring threaded by a linear time dependent magnetic flux, as discussed in [BB], and references therein. Neglecting the curvature of the ring, the instantaneous Hamiltonian of the one-body Schrödinger operator corresponds to that of a one dimensional Schrödinger operator with a periodic potential describing the material of the ring and time dependent boundary conditions of Floquet type. With a choice of linear flux, the time plays the role of the quasi-momentum. Therefore, as a function of time, the Hamiltonian is periodic and its instantaneous spectrum is given by the band structure corresponding to the potential. Under some adiabaticity condition, the evolution operator is assumed to couple only adjacent pairs of states, by means of the Landau-Zener mechanism. The states concerned are those whose corresponding eigenvalues become close to one another. Thus, a given state with index k, say, is coupled once to the one with index k − 1 and once to the one with index k + 1. This yields the band structure of the evolution operator over a period in the basis of eigenvectors at time zero, say. We refer the reader to this paper for physical background and further description of the regime in which the model holds.
Let us now define our monodromy operator following the main lines of the construction sketched above. Our separable Hilbert space is $l^2(\mathbb{Z})$ and we denote the canonical basis by $\{\varphi_k\}_{k\in\mathbb{Z}}$. In order to make contact with the above model, we shall also state results for $l^2(\mathbb{N}^*)$.
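The most general 2 × 2 unitary matrix depends on 4 parameters and can be written as
$$S = e^{-i\theta}\begin{pmatrix} re^{-i\alpha} & ite^{i\gamma}\\ ite^{-i\gamma} & re^{i\alpha}\end{pmatrix}, \tag{2.1}$$
where α, γ, θ belong to the torus $\mathbb{T}$ and the real parameters r, t, also called reflection and transition coefficients, are linked by $r^2 + t^2 = 1$. We introduce an infinite set of such matrices $\{S_k\}_{k\in\mathbb{Z}}$, where $S_k$ depends on the phases $\alpha_k, \gamma_k, \theta_k$, and the reflection and transition coefficients $r_k, t_k$. They are the building blocks of our monodromy operator in $l^2(\mathbb{Z})$. Let $P_j$ be the orthogonal projector on the span of $\varphi_j, \varphi_{j+1}$ in $l^2(\mathbb{Z})$. We introduce $U_e$, $U_o$, two 2 × 2 block diagonal unitary operators on $l^2(\mathbb{Z})$ defined by
$$U_e = \sum_{k\in\mathbb{Z}} P_{2k}S_{2k}P_{2k}, \qquad U_o = \sum_{k\in\mathbb{Z}} P_{2k+1}S_{2k+1}P_{2k+1}, \tag{2.2}$$
or, in matrix representation in the canonical basis,
$$U_e = \begin{pmatrix}\ddots & & & & \\ & S_{-2} & & & \\ & & S_0 & & \\ & & & S_2 & \\ & & & & \ddots\end{pmatrix} \tag{2.3}$$
and similarly for $U_o$, with $S_{2k+1}$ in place of $S_{2k}$. Note that the 2 × 2 blocks in $U_e$ are shifted by one with respect to those of $U_o$ along the diagonal. We now define the monodromy operator U, object of our investigations, by
$$U = U_oU_e, \tag{2.4}$$
such that, for any $k \in \mathbb{Z}$,
$$\begin{aligned}
U\varphi_{2k} ={}& ir_{2k}t_{2k-1}e^{-i(\theta_{2k}+\theta_{2k-1})}e^{-i(\alpha_{2k}-\gamma_{2k-1})}\varphi_{2k-1} + r_{2k}r_{2k-1}e^{-i(\theta_{2k}+\theta_{2k-1})}e^{-i(\alpha_{2k}-\alpha_{2k-1})}\varphi_{2k}\\
&+ ir_{2k+1}t_{2k}e^{-i(\theta_{2k}+\theta_{2k+1})}e^{-i(\gamma_{2k}+\alpha_{2k+1})}\varphi_{2k+1} - t_{2k}t_{2k+1}e^{-i(\theta_{2k}+\theta_{2k+1})}e^{-i(\gamma_{2k}+\gamma_{2k+1})}\varphi_{2k+2},\\
U\varphi_{2k+1} ={}& -t_{2k}t_{2k-1}e^{-i(\theta_{2k}+\theta_{2k-1})}e^{i(\gamma_{2k}+\gamma_{2k-1})}\varphi_{2k-1} + it_{2k}r_{2k-1}e^{-i(\theta_{2k}+\theta_{2k-1})}e^{i(\gamma_{2k}+\alpha_{2k-1})}\varphi_{2k}\\
&+ r_{2k}r_{2k+1}e^{-i(\theta_{2k}+\theta_{2k+1})}e^{i(\alpha_{2k}-\alpha_{2k+1})}\varphi_{2k+1} + ir_{2k}t_{2k+1}e^{-i(\theta_{2k}+\theta_{2k+1})}e^{i(\alpha_{2k}-\gamma_{2k+1})}\varphi_{2k+2}.
\end{aligned} \tag{2.5}$$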
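In matrix form, without writing out the elements explicitly, we have the structure
$$U = \begin{pmatrix}
\ddots & & & & & & & \\
* & * & * & * & & & & \\
* & * & * & * & & & & \\
 & & * & * & * & * & & \\
 & & * & * & * & * & & \\
 & & & & * & * & * & * \\
 & & & & * & * & * & * \\
 & & & & & & & \ddots
\end{pmatrix}. \tag{2.6}$$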
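The construction (2.1)–(2.6) can be checked numerically on a finite system. The following is a minimal sketch, not part of the original paper: it assumes a ring of n sites (indices taken mod n) in place of $l^2(\mathbb{Z})$ so that the truncated operator stays unitary, and the helper names `scattering_block` and `ring_monodromy` are hypothetical. It verifies unitarity, the (cyclic) pentadiagonal band structure, and one matrix element of (2.5).

```python
import numpy as np

def scattering_block(r, theta, alpha, gamma):
    """2x2 unitary matrix S of (2.1), with t = sqrt(1 - r^2)."""
    t = np.sqrt(1.0 - r**2)
    return np.exp(-1j * theta) * np.array(
        [[r * np.exp(-1j * alpha), 1j * t * np.exp(1j * gamma)],
         [1j * t * np.exp(-1j * gamma), r * np.exp(1j * alpha)]])

def ring_monodromy(r, theta, alpha, gamma):
    """U = U_o U_e on a ring of n sites (n even); S_j acts on sites (j, j+1 mod n).

    The ring closure is an assumption of this sketch, used only to keep
    the finite matrix unitary."""
    n = len(r)
    assert n % 2 == 0 and n >= 6
    u_even = np.zeros((n, n), dtype=complex)
    u_odd = np.zeros((n, n), dtype=complex)
    for j in range(n):
        target = u_even if j % 2 == 0 else u_odd
        idx = [j, (j + 1) % n]
        target[np.ix_(idx, idx)] = scattering_block(r[j], theta[j], alpha[j], gamma[j])
    return u_odd @ u_even

rng = np.random.default_rng(0)
n = 8
r = rng.uniform(0.1, 0.9, n)
theta, alpha, gamma = rng.uniform(0.0, 2.0 * np.pi, (3, n))
t = np.sqrt(1.0 - r**2)
U = ring_monodromy(r, theta, alpha, gamma)

# Unitarity of the product of two block-diagonal unitaries.
is_unitary = np.allclose(U.conj().T @ U, np.eye(n))

# Band structure: nonzero entries lie within cyclic distance 2 of the diagonal.
rows, cols = np.nonzero(np.abs(U) > 1e-12)
max_band = np.minimum((rows - cols) % n, (cols - rows) % n).max()

# Coefficient of phi_{2k+2} in U phi_{2k}, as given in (2.5), here for k = 1.
k = 1
entry_25 = -t[2 * k] * t[2 * k + 1] * np.exp(
    -1j * (theta[2 * k] + theta[2 * k + 1] + gamma[2 * k] + gamma[2 * k + 1]))
matches_25 = np.isclose(U[2 * k + 2, 2 * k], entry_25)

print(is_unitary, max_band, matches_25)
```

The same check can be repeated for the other seven coefficients of (2.5) by reading off the corresponding entries of the column `U[:, 2 * k]` or `U[:, 2 * k + 1]`.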
In the regime considered in [BB], the transition coefficients $t_k$, and all elements of the scattering matrices $S_k$ can be computed from the band functions of the periodic background potential. In particular, the transition coefficients may admit a limit as k → ∞. In this paper, we briefly show how to get information on the spectral properties of the monodromy operators defined on $l^2(\mathbb{N}^*)$ from those of operators defined on $l^2(\mathbb{Z})$. Also, we briefly demonstrate how spectral properties of U when the $t_k$'s have limits $t_\pm$ as k → ±∞ can be related to those of the limiting operator with constant (in k) transition coefficients t. Then we focus on the case of constant transition and reflection coefficients $t_k = t \in\, ]0,1[$, i.e. $r_k = r \in\, ]0,1[$, for all $k \in \mathbb{Z}$, which is the main object of our analysis. This corresponds to a regime of the original model in which the behavior of the scattering phases $\theta_k, \gamma_k, \alpha_k$ alone determines the spectral properties of U. It is argued in [BB] on the basis of numerical computations that in case these phases have a coherent behavior as functions of k, if they are periodic say, U has an a.c. component in its spectrum, whereas U should be singular if some phases are random. Following their arguments, we are aiming at a rigorous version of similar statements in our setting.
3. First Properties
At this point, we have slightly generalized the construction proposed by [BB] in order to define our monodromy operator, a unitary pentadiagonal band matrix. Before going further in the analysis, one can ask whether simpler unitary band matrices could provide interesting models, spectrally speaking, as is the case in the self-adjoint setting where the discrete Schrödinger operators are tridiagonal, though non-trivial. The next lemma answers this question negatively, validating our model from another point of view. Its proof can be found in the Appendix.
Lemma 3.1. If U is unitary and tridiagonal, then U is either unitarily equivalent to a (bilateral) shift operator, or it is an infinite direct sum of 2 × 2 and 1 × 1 unitary matrices.
On the other hand, it is straightforward to construct unitary band matrices with larger width starting with general unitary finite size matrices, following the same steps as above.
Perturbative results. In the physical context alluded to above, the natural Hilbert space is $l^2(\mathbb{N}^*)$, with $\mathbb{N}^*$ the set of positive integers, and the definition of the unitary monodromy operator, say $U^+$, is
$$U^+\varphi_1 = r_1e^{-i(\theta_0+\theta_1)}e^{-i\alpha_1}\varphi_1 + it_1e^{-i(\theta_0+\theta_1)}e^{-i\gamma_1}\varphi_2, \qquad U^+\varphi_k,\ k > 1, \text{ as in (2.5)}. \tag{3.1}$$
We shall also define $U^-$ on $l^2(-\mathbb{N}^*)$ in a similar fashion. Consider $U^e$ on $l^2(\mathbb{Z})$ defined by (2.5) with even matrix elements
$$\{t_{-k}, \theta_{-k}, \alpha_{-k}, \gamma_{-k}\} = \{t_k, \theta_k, \alpha_k, \gamma_k\} \quad \forall k \in \mathbb{N}. \tag{3.2}$$
Theorem 3.1. Let $U^+$ and $U^e$ be as above and let $U^+_{a.c.}$ and $U^e_{a.c.}$ denote their restriction to their respective absolutely continuous subspaces. Then
$$\sigma_{ess}(U^+) = \sigma_{ess}(U^e),$$
and
$$U^+_{a.c.} \oplus U^+_{a.c.} \simeq U^e_{a.c.},$$
where $\simeq$ means unitary equivalence.
Proof. We can write on $l^2(-\mathbb{N}^*) \oplus \mathbb{C} \oplus l^2(\mathbb{N}^*)$,
$$U^e = \begin{pmatrix} CU^+C^{-1} & & \\ & 1 & \\ & & U^+\end{pmatrix} + F
= \begin{pmatrix} C & & \\ & 1 & \\ & & \mathbb{I}\end{pmatrix}\begin{pmatrix} U^+ & & \\ & 1 & \\ & & U^+\end{pmatrix}\begin{pmatrix} C^{-1} & & \\ & 1 & \\ & & \mathbb{I}\end{pmatrix} + F, \tag{3.3}$$
where absent elements denote zeros, $\mathbb{I}$ is the identity, C is the operator
$$C : l^2(\mathbb{N}^*) \to l^2(-\mathbb{N}^*), \qquad \varphi_k \mapsto \varphi_{-k} \tag{3.4}$$
and F is a finite rank operator. Noting that $\sigma(CU^+C^{-1}) = \sigma(U^+)$, we get the result by Weyl’s and Birman-Krein’s theorems on invariance of essential, resp. absolutely continuous, spectrum, under compact, resp. trace class, perturbation.
Let us now consider the situation where the transition coefficients of the operator U defined by (2.5) satisfy
$$\lim_{k\to\pm\infty}t_k = t_\pm \iff \lim_{k\to\pm\infty}r_k = r_\pm. \tag{3.5}$$
We measure the convergence by means of the quantities $\delta^\pm$ defined by
$$\delta^+(j) = \max\{|r_jr_{j-1} - r_+^2|,\ |t_jt_{j-1} - t_+^2|,\ |t_jr_{j\pm1} - r_+t_+|\}, \qquad j \in \mathbb{N}, \tag{3.6}$$
and similarly for $\delta^-$. Let $U^\pm(t_\pm)$ be defined on $l^2(\pm\mathbb{N}^*)$ by (3.1) with $t_k = t_\pm$ and $r_k = r_\pm$, for all $k \in \pm\mathbb{N}^*$.
Theorem 3.2. Assume (3.5) and let U and $U^\pm(t_\pm)$ be as above. Then
$$\sigma_{ess}(U) = \sigma_{ess}(U^+(t_+)) \cup \sigma_{ess}(U^-(t_-)).$$
If, furthermore, there exists ε > 1/2 such that $\sup_{j\in\mathbb{N}}\delta^\pm(j)j^{2\varepsilon} < \infty$,
$$U_{a.c.} \simeq U^+_{a.c.}(t_+) \oplus U^-_{a.c.}(t_-).$$
Proof. Let us introduce the asymptotic unitary operator $U_{-,+}$ by
$$U_{-,+} = \begin{pmatrix} U^-(t_-) & & \\ & 1 & \\ & & U^+(t_+)\end{pmatrix}. \tag{3.7}$$
The difference between the actual and asymptotic operators is given by the operator
$$\Delta = U - U_{-,+} \tag{3.8}$$
whose matrix elements $\Delta(j,k) = \langle\varphi_j\,|\,\Delta\varphi_k\rangle$ satisfy for |k| > 1,
$$|\Delta(j,k)| \le \begin{cases}\delta^\pm(j+1) & \text{if } |j-k| \le 2,\\ 0 & \text{otherwise.}\end{cases} \tag{3.9}$$
Therefore, approximating Δ by a finite matrix $\Delta_N$, we can use the Schur condition, [K], p. 143, to estimate the norm of the difference $\Delta - \Delta_N$ and get $\|\Delta - \Delta_N\| \to 0$ as N → ∞. This, in turn, shows that Δ is compact and that the essential spectra of U and $U_{-,+}$ coincide and yields the first assertion. The second is proven following arguments used in [Ho2]. Let ε > 1/2 and set $\langle j\rangle = (1+j^2)^{1/2}$. We define $\Lambda = \mathrm{diag}\{\langle j\rangle^\varepsilon\}$ in the basis $\{\varphi_k\}_{k\in\mathbb{Z}}$. As $\Delta = \Lambda^{-1}(\Lambda\Delta\Lambda)\Lambda^{-1}$, where $\Lambda^{-1}$ is Hilbert-Schmidt, Δ will be trace class as soon as $\Lambda\Delta\Lambda$ is bounded. Its non-zero matrix elements are
$$(\Lambda\Delta\Lambda)(j,k) = (\langle j\rangle\langle k\rangle)^\varepsilon\Delta(j,k), \qquad k = j,\ j\pm1,\ j\pm2, \tag{3.10}$$
so that we get boundedness as above from the Schur condition and the estimate (3.9).
Remarks. i) An analogous statement is obviously true for operators defined on $l^2(\mathbb{N}^*)$.
ii) The condition $\sup_{j\in\mathbb{N}}\delta^\pm(j)j^{2\varepsilon} < \infty$ for some ε > 1/2 is actually necessary as well for Δ to be trace class. Indeed, in case $t_\pm = t \iff r_\pm = r$ with $tr \neq 0$, $t \neq r$, and $t - t_j = 1/j^\alpha$, one checks that $\delta(j) \sim c/j^\alpha$, for some constant c. Assuming that Δ is trace class, we have $\sum_{j\in\mathbb{Z}}|\Delta(j,j)| < \infty$ which is equivalent to $\sum_{j\in\mathbb{Z}}1/j^\alpha < \infty$ and requires α > 2ε, for some ε > 1/2.
iii) It is clear that similar perturbative results hold for more general cases where the phases have a limiting behavior as well.
The case $t_+ = t_- = 0$ is of particular interest and allows a stronger result.
Theorem 3.3. Consider U on $l^2(\mathbb{Z})$ defined by (2.5) and $U^+$ on $l^2(\mathbb{N}^*)$ defined by (3.1). If $\liminf_{k\to\pm\infty}t_k = 0$, then
$$\sigma_{a.c.}(U) = \sigma_{a.c.}(U^+) = \emptyset. \tag{3.11}$$
Proof. We consider U only, the proof for $U^+$ being similar. Let $U_n$ be equal to U with $t_n = 0$ and
$$F_n = U - U_n. \tag{3.12}$$
The matrix of $U_n$ is separated into two disjoint blocks and $F_n$ is a rank four operator with $\|F_n\| \le ct_n$. The hypothesis insures the existence of a subsequence $t_{n(k)}$ going to zero as fast as we wish, say as $k^{-2}$, when k → ±∞. We set
$$G = \sum_{k\in\mathbb{Z}}F_{n(k)} \quad\text{and}\quad \tilde U = U - G. \tag{3.13}$$
By construction, we have for some constant $\tilde c$,
$$\|G\|_1 \le 4\sum_{k\in\mathbb{Z}}\|F_{n(k)}\| \le \sum_{k\in\mathbb{Z}}\frac{\tilde c}{k^2} < \infty, \tag{3.14}$$
and $\tilde U$ is pure point, hence the result.
Remarks. i) In case there exists a subsequence $\{t_{n(k)}\}$ such that
$$\lim_{k\to+\infty}t_{n(k)} = t_+ \quad\text{and}\quad \lim_{k\to-\infty}t_{n(k)} = t_-, \tag{3.15}$$
a similar construction is valid and we get an approximation of the form
$$U = \tilde U_{-,+} + G_{-,+}, \tag{3.16}$$
where $\tilde U_{-,+}$ contains an infinite number of $t_-$ and $t_+$ in its matrix representation and $G_{-,+}$ is trace class. However we do not know the spectral properties of such $\tilde U_{-,+}$'s.
ii) If $U^+(0)$ defined as in Theorem 3.2 is such that its pure point spectrum possesses a finite number of accumulation points only, then, if $t_j \to 0$, $\sigma(U^+)$ is pure point with finitely many accumulation points as well. This will be true in case the phases have a coherent behavior, see Sect. 6.
Motivated by the previous theorems, we now address the spectral properties of the limiting operators.
Constant reflection and transition coefficients. From now on, $t_k = t$, $r_k = r$, $\forall k \in \mathbb{Z}$. We first note that the extreme cases where rt = 0 are spectrally trivial.
Proposition 3.1. In case t = 0, i.e. r = 1, U is pure point and if t = 1, i.e. r = 0, U is purely absolutely continuous.
Proof. The first case is trivial. In the second case, we observe that U is reduced by the supplementary subspaces $L_+$, respectively $L_-$, generated by the vectors in the canonical basis with even indices, respectively odd indices. Moreover $U|_{L_\pm}$ is unitarily equivalent to the shift operator, hence the result.
Remark. As a typical corollary, we get the following spectral properties for monodromy operators $U^+$ defined on $l^2(\mathbb{N}^*)$ according to (3.1):
$$\sigma_{a.c.}(U^+) = S^1 \quad\text{if}\quad 1 - t_j \sim 1/j^\alpha, \quad \alpha > 1. \tag{3.17}$$
The rest of the paper is devoted to studying the limiting operator when $t_k = t \in\, ]0,1[$, i.e. $r_k = r \in\, ]0,1[$, for all $k \in \mathbb{Z}$. Not all phases in the definition of U play the same role, as the following lemma shows. On the one hand it justifies the choice made in [BB] where the phases $\gamma_k$ are taken equal to zero and, on the other hand, it will be very useful below.
Lemma 3.2. If we denote the matrix (2.6) by $M(\{\theta_k\},\{\alpha_k\},\{\gamma_k\})$, and identify U with it, we have for any sequences $\{\theta_k\},\{\alpha_k\},\{\gamma_k\}$, $k \in \mathbb{Z}$,
$$U \equiv M(\{\theta_k\},\{\alpha_k\},\{\gamma_k\}) \simeq M(\{\theta_k\},\{\alpha_k\},\{0\}).$$
Remarks. i) As a corollary, we can replace the sequence $\{\gamma_k\}$, $k \in \mathbb{Z}$, in the definition of U by any other sequence $\{\gamma'_k\}$.
ii) The same statement is true for $U^+$ defined on $l^2(\mathbb{N}^*)$ by (3.1).
Proof. Let V be the unitary operator defined by
$$V\varphi_k = e^{i\zeta_k}\varphi_k, \qquad k \in \mathbb{Z}. \tag{3.18}$$
One checks easily that the operator $V^{-1}UV$ has the form $M(\{\theta_k\},\{\alpha_k\},\{0\})$ in the canonical basis provided for all $j \in \mathbb{Z}$,
$$\zeta_j - \zeta_{j-1} = -\gamma_{j-1}. \tag{3.19}$$
This is realized by taking, for example, $\zeta_0 = 0$ and
$$\zeta_k = -\sum_{j=0}^{k-1}\gamma_j, \qquad \zeta_{-k} = \sum_{j=-k}^{-1}\gamma_j, \qquad k \in \mathbb{N}^*. \tag{3.20}$$
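The gauge argument of Lemma 3.2 can be illustrated on a finite open chain. The sketch below is an illustration under stated assumptions rather than the authors' computation: the chain has n sites labelled 0, …, n − 1, blocks $S_j$ act on sites (j, j + 1), the two end sites left uncovered by $U_o$ carry the identity, and `scattering_block` and `chain_monodromy` are hypothetical helper names. It builds V from (3.18) and (3.20) and checks that $V^{-1}UV$ coincides with the operator built with all $\gamma_k = 0$.

```python
import numpy as np

def scattering_block(r, theta, alpha, gamma):
    """2x2 unitary matrix S of (2.1), with t = sqrt(1 - r^2)."""
    t = np.sqrt(1.0 - r**2)
    return np.exp(-1j * theta) * np.array(
        [[r * np.exp(-1j * alpha), 1j * t * np.exp(1j * gamma)],
         [1j * t * np.exp(-1j * gamma), r * np.exp(1j * alpha)]])

def chain_monodromy(r, theta, alpha, gamma):
    """U = U_o U_e on an open chain of n = len(theta) + 1 sites.

    S_j acts on sites (j, j+1) for j = 0..n-2; sites not covered by a
    block of U_o are left invariant (an assumption of this finite sketch)."""
    n = len(theta) + 1
    u_even = np.eye(n, dtype=complex)
    u_odd = np.eye(n, dtype=complex)
    for j in range(n - 1):
        target = u_even if j % 2 == 0 else u_odd
        target[j:j + 2, j:j + 2] = scattering_block(r, theta[j], alpha[j], gamma[j])
    return u_odd @ u_even

rng = np.random.default_rng(1)
n = 8
r = 0.6                                   # constant reflection coefficient
theta, alpha, gamma = rng.uniform(0.0, 2.0 * np.pi, (3, n - 1))

U_gamma = chain_monodromy(r, theta, alpha, gamma)
U_zero = chain_monodromy(r, theta, alpha, np.zeros(n - 1))

# zeta_0 = 0 and zeta_k = -(gamma_0 + ... + gamma_{k-1}), as in (3.20).
zeta = -np.concatenate(([0.0], np.cumsum(gamma)))
V = np.diag(np.exp(1j * zeta))

gauge_removes_gamma = np.allclose(V.conj().T @ U_gamma @ V, U_zero)
print(gauge_removes_gamma)
```

An open chain is used here instead of a ring because on a ring the sum of the $\gamma_k$ around the loop cannot in general be gauged away.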
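Generalized eigenvectors. Without making use yet of the freedom we have in the sequence $\{\gamma_k\}$, $k \in \mathbb{Z}$, we now turn to the eigenvalue equation
$$U\psi = e^{i\lambda}\psi, \qquad \psi = \sum_{k\in\mathbb{Z}}c_k\varphi_k, \quad c_k \in \mathbb{C}, \quad \lambda \in \mathbb{C}. \tag{3.21}$$
One sees from the structure (2.6) of the operator U, that if ψ satisfies (3.21), a linear relation between the coefficients $(c_{2k}, c_{2k+1})$ and $(c_{2k-2}, c_{2k-1})$ of the form
$$\begin{pmatrix}c_{2k}\\ c_{2k+1}\end{pmatrix} = T(k)\begin{pmatrix}c_{2k-2}\\ c_{2k-1}\end{pmatrix} \tag{3.22}$$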
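must exist, provided some 2 × 2 matrix is invertible. Using the definition (2.5), straightforward computations show that the matrix T(k) has elements
$$\begin{aligned}
T(k)_{11} ={}& -e^{-i(\lambda+\gamma_{2k-1}+\gamma_{2k-2}+\theta_{2k-1}+\theta_{2k-2})},\\
T(k)_{12} ={}& i\frac{r}{t}\Big(e^{-i(\lambda+\gamma_{2k-1}-\alpha_{2k-2}+\theta_{2k-1}+\theta_{2k-2})} - e^{-i(\gamma_{2k-1}-\alpha_{2k-1})}\Big),\\
T(k)_{21} ={}& i\frac{r}{t}\Big(e^{-i(\theta_{2k-2}-\theta_{2k}+\gamma_{2k}+\gamma_{2k-1}+\gamma_{2k-2}+\alpha_{2k-1})} - e^{-i(\lambda+\theta_{2k-2}+\theta_{2k-1}+\gamma_{2k}+\gamma_{2k-1}+\gamma_{2k-2}+\alpha_{2k})}\Big),\\
T(k)_{22} ={}& -\frac{1}{t^2}e^{i(\lambda+\theta_{2k}+\theta_{2k-1}-\gamma_{2k}-\gamma_{2k-1})} + \frac{r^2}{t^2}e^{-i(\gamma_{2k}+\gamma_{2k-1})}\Big(e^{i(\theta_{2k}-\theta_{2k-2}+\alpha_{2k-2}-\alpha_{2k-1})} + e^{-i(\alpha_{2k}-\alpha_{2k-1})}\Big)\\
& - \frac{r^2}{t^2}e^{-i(\lambda+\theta_{2k-2}+\theta_{2k-1}+\gamma_{2k}+\gamma_{2k-1}+\alpha_{2k}-\alpha_{2k-2})},
\end{aligned} \tag{3.23}$$
provided t ≠ 0. We also compute
$$\det T(k) = e^{-i(\theta_{2k-2}-\theta_{2k}+\gamma_{2k}+2\gamma_{2k-1}+\gamma_{2k-2})} \tag{3.24}$$
so that |det T(k)| = 1. Therefore, once the coefficients $(c_0, c_1)$ are given, we compute for any $k \in \mathbb{N}^*$,
$$\begin{aligned}
\begin{pmatrix}c_{2k}\\ c_{2k+1}\end{pmatrix} &= T(k)\cdots T(2)T(1)\begin{pmatrix}c_0\\ c_1\end{pmatrix} \equiv \Phi(k)\begin{pmatrix}c_0\\ c_1\end{pmatrix},\\
\begin{pmatrix}c_{-2k}\\ c_{-2k+1}\end{pmatrix} &= T(-k+1)^{-1}\cdots T(-1)^{-1}T(0)^{-1}\begin{pmatrix}c_0\\ c_1\end{pmatrix} \equiv \Phi(-k)\begin{pmatrix}c_0\\ c_1\end{pmatrix}.
\end{aligned} \tag{3.25}$$
The multiplicity of possible eigenvalues is therefore bounded by two.
4. Random Setting
Apart from the fact that our transfer matrix is complex valued instead of the usual real valued setting suiting the discrete Schrödinger case, we will see that here also one Lyapunov exponent is enough to describe the spectral properties of U, when the phases are random and the transfer matrices T(k) are independent and identically distributed. Making use of Lemma 3.2, let us introduce a probabilistic space $(\Omega, \mathcal{F}, P)$, where Ω is identified with $\{\mathbb{T}^{\mathbb{Z}}\}$, $\mathbb{T}$ being the torus, and $P = \otimes_{k\in\mathbb{Z}}P_0$, where $P_0$ is the uniform distribution on $\mathbb{T}$, with $\mathcal{F}$ the σ-algebra generated by the cylinders. We introduce the set of random vectors on $(\Omega, \mathcal{F}, P)$ given by
$$\beta_k : \Omega \to \mathbb{T}^2, \qquad \omega \mapsto (\theta_k, \alpha_k), \qquad k \in \mathbb{Z}, \tag{4.1}$$
with $\theta_k(\omega) = \omega_{2k}$, $\alpha_k(\omega) = \omega_{2k+1}$. The random vectors $\{\beta_k\}_{k\in\mathbb{Z}}$ are i.i.d. and uniformly distributed on $\mathbb{T}^2$. We denote by $U_\omega$ the random unitary operator corresponding to the random infinite matrix (2.6),
$$U_\omega = M(\{\theta_k(\omega)\},\{\alpha_k(\omega)\},\{0\}). \tag{4.2}$$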
Introducing the shift operator S on Ω by
$$S(\omega)_k = \omega_{k+2}, \qquad k \in \mathbb{Z}, \tag{4.3}$$
we get an ergodic set $\{S^j\}_{j\in\mathbb{Z}}$ of translations. With the unitary operator $V_j$ defined on the canonical basis of $l^2(\mathbb{Z})$ by
$$V_j\varphi_k = \varphi_{k-2j}, \qquad \forall k \in \mathbb{Z}, \tag{4.4}$$
we observe that for any $j \in \mathbb{Z}$,
$$U_{S^j\omega} = V_jU_\omega V_j^*. \tag{4.5}$$
Therefore, our random operator $U_\omega$ is an ergodic operator. The spectral projectors $E_\Delta(\omega)$ of $U_\omega$, where Δ is a Borel set of $\mathbb{T}$, define a weakly measurable projector valued family of operators on Ω and the spectrum of $U_\omega$ is deterministic, see [CL]. However, we shall not make use of these properties below. As it stands, the transfer matrix T(k) depending on the random vectors $\beta_{2k}, \beta_{2k-1}, \beta_{2k-2}$ seems to be correlated with T(k + 1) and T(k − 1). Using the same Lemma 3.2, we can replace the sequence {0} in (4.2) by $\{(-1)^{k+1}\alpha_k\}$, so that we consider explicitly $M(\{\theta_k\},\{\alpha_k\},\{(-1)^{k+1}\alpha_k\})$ and the corresponding transfer matrices. Thus, in terms of the new variable, with $\lambda \in \mathbb{R}$,
$$\eta_k(\lambda) = \theta_k + \theta_{k-1} + \alpha_k - \alpha_{k-1} + \lambda, \tag{4.6}$$
the transfer matrix can be written as
$$T(k) \equiv T(\eta_{2k}(\lambda), \eta_{2k-1}(\lambda)) \tag{4.7}$$
with, $\forall k \in \mathbb{Z}$,
$$\begin{aligned}
T(k)_{11} &= -e^{-i\eta_{2k-1}(\lambda)},\\
T(k)_{12} &= i\frac{r}{t}\big(e^{-i\eta_{2k-1}(\lambda)} - 1\big),\\
T(k)_{21} &= i\frac{r}{t}\big(e^{i(\eta_{2k}(\lambda)-\eta_{2k-1}(\lambda))} - e^{-i\eta_{2k-1}(\lambda)}\big),\\
T(k)_{22} &= -\frac{1}{t^2}e^{i\eta_{2k}(\lambda)} + \frac{r^2}{t^2}\big(e^{i(\eta_{2k}(\lambda)-\eta_{2k-1}(\lambda))} + 1 - e^{-i\eta_{2k-1}(\lambda)}\big).
\end{aligned} \tag{4.8}$$
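To make (4.8) concrete, here is a small numerical sketch, offered as an illustration only. The function names `transfer_matrix` and `growth_rate` are hypothetical, and the sampling of $(\eta_{2k}, \eta_{2k-1})$ as independent uniform pairs is an assumption consistent with the uniform random-phase setting of this section. The code checks that det T(k) has modulus one, as (3.24) requires, and gives a rough Monte Carlo estimate of the exponential growth rate of products of the T(k).

```python
import numpy as np

def transfer_matrix(eta_even, eta_odd, t):
    """T(eta_{2k}, eta_{2k-1}) of (4.8) for constant t in (0, 1), r = sqrt(1 - t^2)."""
    r = np.sqrt(1.0 - t**2)
    e_odd = np.exp(-1j * eta_odd)
    e_diff = np.exp(1j * (eta_even - eta_odd))
    return np.array([
        [-e_odd, 1j * (r / t) * (e_odd - 1.0)],
        [1j * (r / t) * (e_diff - e_odd),
         -np.exp(1j * eta_even) / t**2 + (r / t)**2 * (e_diff + 1.0 - e_odd)]])

def growth_rate(t, n_steps, rng):
    """Average of log |T(n)...T(1) v| / n for one random sequence of phases.

    The vector is renormalised at every step to avoid overflow."""
    v = np.array([1.0, 0.0], dtype=complex)
    total = 0.0
    for _ in range(n_steps):
        eta_even, eta_odd = rng.uniform(0.0, 2.0 * np.pi, 2)
        v = transfer_matrix(eta_even, eta_odd, t) @ v
        norm = np.linalg.norm(v)
        total += np.log(norm)
        v = v / norm
    return total / n_steps

rng = np.random.default_rng(2)
t = 0.5
eta_even, eta_odd = rng.uniform(0.0, 2.0 * np.pi, 2)
T = transfer_matrix(eta_even, eta_odd, t)

# Direct evaluation of (4.8) gives det T = exp(i (eta_{2k} - eta_{2k-1})).
det_T = np.linalg.det(T)
det_ok = np.isclose(det_T, np.exp(1j * (eta_even - eta_odd)))

rate = growth_rate(t, 2000, rng)
print(det_ok, abs(det_T), rate)
```

A single sample path only gives a noisy estimate; it is meant to show how the product structure (3.25) is used, not to replace the proof of positivity of the Lyapunov exponent.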
Therefore, introducing the set of random vectors
$$\delta_k = (\eta_{2k}(\lambda), \eta_{2k-1}(\lambda)) \in \mathbb{T}^2, \qquad k \in \mathbb{Z}, \tag{4.9}$$
we observe that the set of random transfer matrices $\{T(k)\}_{k\in\mathbb{Z}}$ will be independent provided the set of random vectors $\{\delta_k\}_{k\in\mathbb{Z}}$ are independent. Using properties of the characteristic functions of random vectors
$$\Phi_\beta(n_1, n_2) = E\big(e^{-i(n_1\beta_1(\omega)+n_2\beta_2(\omega))}\big), \qquad n_1, n_2 \in \mathbb{Z}, \tag{4.10}$$
we get the following lemma.
Lemma 4.1. If the vectors $\{\beta_k\}_{k\in\mathbb{Z}}$ are i.i.d. and uniform, the random vectors $\{\delta_k\}_{k\in\mathbb{Z}}$ are also i.i.d. and uniformly distributed on $\mathbb{T}^2$. In turn, the set of transfer matrices $\{T(k)\}_{k\in\mathbb{Z}}$ are i.i.d. random matrices in $Gl_2(\mathbb{C})$.
We can now state our main result in the random setting:
Theorem 4.1. Let $U_\omega$ be defined by its matrix elements (2.5) with $t \in (0,1)$. Assuming the phases $\{\alpha_k\}_{k\in\mathbb{Z}}$ and $\{\theta_k\}_{k\in\mathbb{Z}}$ are i.i.d. and uniform on $\mathbb{T}$, we have almost surely $\sigma_{a.c.}(U_\omega) = \emptyset$.

The next section is devoted to the proof of this theorem.

Remarks. i) The same result for $U_\omega^+$ defined by (3.1) holds by Theorem 3.2.
ii) In case the phases $\alpha_k \in \mathbb{T}$ are deterministic and of the form $\alpha_k = ak + b$, $a, b \in \mathbb{R}$, whereas the $\theta_k$'s are i.i.d. and uniform, the conclusions of the above lemma and theorem still hold. The same is true if the $\theta_k$'s are deterministic and constant whereas the $\alpha_k$'s are i.i.d. and uniform.
iii) To motivate our hypotheses on the uniform distribution of the phase vectors $\beta_k$, we recall the following lemma.

Lemma 4.2. If $X_k$, $k \in \mathbb{Z}$, is a set of i.i.d. random variables on $\mathbb{T}$ with support not reduced to a point, then the random variables $Y_k^\pm = X_k \pm X_{k-1}$, $k \in \mathbb{Z}$, are independent if and only if the $X_k$ are uniformly distributed.

A proof of Lemmas 4.1 and 4.2 can be found in the Appendix.

5. Lyapunov Exponents

As the map (4.6) is measurable, we can realize our transfer matrices as an i.i.d. random process on the same probability space $(\Omega, \mathcal{F}, P)$ in such a way that
\[ T(k, \omega) = T(\omega_{2k}, \omega_{2k-1}), \qquad k \in \mathbb{Z},\ \forall \omega \in \mathbb{T}^{\mathbb{Z}}, \tag{5.1} \]
with the $C^\infty$ map $T : \mathbb{T}^2 \to Gl_2(\mathbb{C})$ defined in (4.7). Therefore,
\[ T(k+1, \omega) = T(k, S(\omega)), \qquad \forall k \in \mathbb{Z}. \tag{5.2} \]
The set of translations $\{S^j\}_{j\in\mathbb{Z}}$ is ergodic and we can write for all $k \in \mathbb{N}^*$,
\[ \Phi(k, \omega) = T(k, \omega)T(k-1, \omega)\cdots T(1, \omega) = T(1, S^k(\omega))T(1, S^{k-1}(\omega))\cdots T(1, \omega). \tag{5.3} \]
Similarly, we set $\Phi(0, \omega) = \mathbb{I}_2$ and
\[ \Phi(-k, \omega) = T^{-1}(0, S^{-k+1}(\omega))T^{-1}(0, S^{-k+2}(\omega))\cdots T^{-1}(0, \omega). \tag{5.4} \]
Therefore $\{\Phi(k, \omega)\}_{k\in\mathbb{N}}$ defines a random ergodic linear dynamical system over $Gl_2(\mathbb{C})$ generated by the map $T(1, \cdot)$ and $\{\Phi(-k, \omega)\}_{k\in\mathbb{N}}$ defines another one generated by $T^{-1}(0, \cdot)$. We are now formally in good shape to apply Oseledec's and Furstenberg's Theorems to define and study the Lyapunov exponents. However, the last result is stated for real valued matrices, and, in particular, irreducibility properties of groups of matrices are a delicate matter. Therefore, we first want to map our problem to a problem involving matrices in $Gl_4(\mathbb{R})$. This is done very conveniently using the method described in [MT],
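which we apply to our setting. We will denote by $\langle\cdot|\cdot\rangle$ the scalar product on $\mathbb{R}^4$ or $\mathbb{C}^2$ and we introduce
\[ I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \qquad J = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}. \tag{5.5} \]
We define a sub-algebra $A_4(\mathbb{R})$ of $M_4(\mathbb{R})$ by
\[ A_4(\mathbb{R}) = \left\{ \begin{pmatrix} a_1 I + a_2 J & b_1 I + b_2 J \\ c_1 I + c_2 J & d_1 I + d_2 J \end{pmatrix},\ a_j, b_j, c_j, d_j \in \mathbb{R},\ j = 1, 2 \right\}. \tag{5.6} \]
The topology on $M_2(\mathbb{C})$, $M_4(\mathbb{R})$ is generated by the spectral norm
\[ \|A\| = \Big( \sum_{\lambda \in \sigma(|A|)} |\lambda|^2 \Big)^{1/2} \tag{5.7} \]
and that of $A_4(\mathbb{R})$ is the induced topology. Let $\rho$ be the mapping
\[ \rho : \mathbb{C}^2 \to \mathbb{R}^4, \qquad \begin{pmatrix} x \\ y \end{pmatrix} \mapsto \begin{pmatrix} \Re(x) \\ -\Im(x) \\ \Re(y) \\ -\Im(y) \end{pmatrix}, \tag{5.8} \]
and $\tau : M_2(\mathbb{C}) \to A_4(\mathbb{R})$ be defined by
\[ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \mapsto \begin{pmatrix} \Re(a)I + \Im(a)J & \Re(b)I + \Im(b)J \\ \Re(c)I + \Im(c)J & \Re(d)I + \Im(d)J \end{pmatrix}. \tag{5.9} \]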
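The following properties are readily checked:

Lemma 5.1. For any $u, v \in \mathbb{C}^2$, and any $\alpha \in \mathbb{C}$,
\[ \rho(u + v) = \rho(u) + \rho(v), \tag{5.10} \]
\[ \rho(\alpha u) = \Re(\alpha)\rho(u) + \Im(\alpha)\rho(iu). \tag{5.11} \]
For any $A, B \in M_2(\mathbb{C})$, and $\alpha \in \mathbb{R}$,
\[ \tau(A + B) = \tau(A) + \tau(B), \quad \tau(AB) = \tau(A)\tau(B), \quad \tau(\alpha A) = \alpha\tau(A), \quad \tau(A^*) = \tau(A)^*, \quad \tau(A^{-1}) = \tau(A)^{-1}. \tag{5.12} \]
The last formula means that if $A \in M_2(\mathbb{C})$ is invertible, $\tau(A)$ is also invertible and the formula is true. Finally, for all $u \in \mathbb{C}^2$, $\forall T \in M_2(\mathbb{C})$,
\[ \rho(Tu) = \tau(T)\rho(u). \tag{5.13} \]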
We also note the following lemma for future reference. Lemma 5.2. If A ∈ M2 (C) and | det(A)| = 1, then | det(τ (A))| = 1. If A is self adjoint with eigenvalues γ1 and γ2 , then τ (A) is real symmetric with eigenvalues γ1 and γ2 of multiplicity two. More general results of the same sort in higher dimension can be found in [MT].
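The algebraic properties of $\rho$ and $\tau$ listed in Lemma 5.1 can be spot-checked numerically. The following is a minimal sketch in plain Python, assuming the definitions (5.8) and (5.9) and using randomly drawn test matrices; the helper names (`rho`, `tau`, `matmul`, `matvec`, `max_diff`) are illustrative and not part of the paper. The norm used in the last check is the square root of the sum of squared moduli of the entries, which coincides with (5.7).

```python
import random

def rho(u):
    """Map (x, y) in C^2 to (Re x, -Im x, Re y, -Im y) in R^4, as in (5.8)."""
    x, y = u
    return [x.real, -x.imag, y.real, -y.imag]

def tau(A):
    """Map a 2x2 complex matrix to a 4x4 real matrix, entry by entry, as in (5.9)."""
    out = [[0.0] * 4 for _ in range(4)]
    for i in range(2):
        for j in range(2):
            re, im = A[i][j].real, A[i][j].imag
            # Block Re(a) I + Im(a) J with J = [[0, 1], [-1, 0]].
            out[2 * i][2 * j], out[2 * i][2 * j + 1] = re, im
            out[2 * i + 1][2 * j], out[2 * i + 1][2 * j + 1] = -im, re
    return out

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def matvec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def max_diff(P, Q):
    """Largest entrywise difference between two matrices of the same shape."""
    return max(abs(p - q) for rp, rq in zip(P, Q) for p, q in zip(rp, rq))

rng = random.Random(0)  # arbitrary seed, chosen only for reproducibility

def rand_c():
    return complex(rng.uniform(-1, 1), rng.uniform(-1, 1))

A = [[rand_c() for _ in range(2)] for _ in range(2)]
B = [[rand_c() for _ in range(2)] for _ in range(2)]
u = [rand_c(), rand_c()]

# tau(AB) = tau(A) tau(B), cf. (5.12)
mult_err = max_diff(tau(matmul(A, B)), matmul(tau(A), tau(B)))
# rho(Au) = tau(A) rho(u), cf. (5.13)
inter_err = max(abs(p - q) for p, q in zip(rho(matvec(A, u)), matvec(tau(A), rho(u))))
# rho preserves the Euclidean norm; tau multiplies the norm (5.7) by sqrt(2)
norm_rho = sum(x * x for x in rho(u)) ** 0.5
norm_u = sum(abs(z) ** 2 for z in u) ** 0.5
norm_tau = sum(x * x for row in tau(A) for x in row) ** 0.5
norm_A = sum(abs(z) ** 2 for row in A for z in row) ** 0.5
```

All four discrepancies should vanish up to rounding errors, since each entry $a$ of $A$ contributes $2|a|^2$ to the squared norm of $\tau(A)$.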
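Remarks. i) Let us note as a consequence of Lemma 5.2 that the mappings $\rho$ and $\tau$ are homeomorphisms and $\forall u \in \mathbb{C}^2$, $\forall A \in M_2(\mathbb{C})$,
\[ \|\rho(u)\| = \|u\|, \qquad \|\tau(A)\| = \sqrt{2}\,\|A\|. \tag{5.14} \]
ii) The mapping $\rho$ does not transport the scalar product but it does preserve the norm. Note that we have for all $u, v \in \mathbb{C}^2$, and all $T \in M_2(\mathbb{C})$,
\[ \langle\rho(iu)|\rho(u)\rangle = 0, \tag{5.15} \]
\[ \langle\rho(u)|\tau(T)\rho(v)\rangle = \langle\rho(iu)|\tau(T)\rho(iv)\rangle = \Re(\langle u|Tv\rangle), \tag{5.16} \]
\[ \langle\rho(iu)|\tau(T)\rho(v)\rangle = -\langle\rho(u)|\tau(T)\rho(iv)\rangle = \Im(\langle u|Tv\rangle). \tag{5.17} \]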
Therefore, if $u$ and $v$ are orthogonal in $\mathbb{C}^2$, $\rho(u)$ and $\rho(v)$ are also orthogonal.

Existence of the Lyapunov Exponents. Using this operator $\tau : Gl_2(\mathbb{C}) \to Gl_4(\mathbb{R})$, we can now consider the random ergodic linear dynamical system over $Gl_4(\mathbb{R})$ defined from $\{\Phi(k, \omega)\}_{k\in\mathbb{N}}$ by
\[ \Psi(k, \omega) = \tau(\Phi(k, \omega)) \tag{5.18} \]
generated by the map $\tau(T(1, \cdot)) : \Omega \to Gl_4(\mathbb{R})$. We will work similarly if $-k \in \mathbb{N}$. We now apply Oseledec's Theorem according to [A], Thm. 3.4.11, specialized to our setting.

Proposition 5.1. Consider the random ergodic dynamical system generated by the map $\tau(T(1, \cdot)) : \Omega \to Gl_4(\mathbb{R})$. Then, on an invariant set $\Omega_0 \subset \Omega$ of $P$-measure one, the following limit exists:
\[ \lim_{n\to\infty} (\Psi(n, \omega)^*\Psi(n, \omega))^{1/2n} = \Lambda(\omega). \tag{5.19} \]
The matrix $\Lambda(\omega)$ possesses at most 2 distinct eigenvalues of multiplicity 2, denoted by
\[ e^{\gamma_1} \ge e^{\gamma_2} \equiv e^{-\gamma_1} > 0, \tag{5.20} \]
associated with at most two eigenspaces $E_1(\omega)$, $E_2(\omega)$. The Lyapunov exponents $\gamma_1 \ge \gamma_2$ are constant almost surely. If $\gamma_1 > 0$, there exists a filtration of $\mathbb{R}^4$, $\{0\} \subset V(\omega) \subset \mathbb{R}^4$, such that $V(\omega) = E_2(\omega)$, and
\[ \mathbb{R}^4 = E_2(\omega) \oplus E_1(\omega), \tag{5.21} \]
and $u \in V(\omega)$ iff
\[ \lim_{n\to+\infty} \frac{1}{n} \log \|\Psi(n, \omega)u\| = \gamma_2 = -\gamma_1 < 0, \tag{5.22} \]
and $u \in \mathbb{R}^4 \setminus V(\omega)$ iff
\[ \lim_{n\to+\infty} \frac{1}{n} \log \|\Psi(n, \omega)u\| = \gamma_1 > 0. \tag{5.23} \]
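Moreover, there exists a splitting
\[ \mathbb{R}^4 = E_2(\omega) \oplus E_1(\omega) \tag{5.24} \]
such that
\[ \lim_{n\to\pm\infty} \frac{1}{n} \log \|\Psi(n, \omega)u\| = \gamma_j \iff u \in E_j(\omega) \setminus \{0\}. \tag{5.25} \]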
Proof. We need to check the hypotheses of the Ergodic Multiplicative (Oseledec's) Theorem, see e.g. [A], Thm. 3.4.11, in order to get the existence of the limit. All norms being equivalent, considering the maximum modulus of the matrix elements, we get the existence of a finite constant depending only on $0 < t < 1$ such that $C(t)^{-1} \le \|T(1, \omega)\| \le C(t)$. As $|\det(T(1, \omega))| = 1$, the same bound is true for $T(1, \omega)^{-1}$. The properties of $\tau$ finally yield
\[ \big(\ln^+ \|\tau(T(1, \cdot))\| + \ln^+ \|\tau(T(1, \cdot)^{-1})\|\big) \in L^1(\Omega, \mathcal{F}, P), \tag{5.26} \]
where $\ln^+(x) = \max\{\ln(x), 0\}$, $x > 0$, which ensures the existence of the limit.

The statements about the number of Lyapunov exponents, their relations and multiplicities are shown as follows. For any $n$, the $2 \times 2$ matrix $\Phi(n, \omega)^*\Phi(n, \omega)$ is positive, of determinant one, so that it either possesses two distinct eigenvalues $\sigma_1(n, \omega) > \sigma_2(n, \omega) = 1/\sigma_1(n, \omega) > 0$ (of multiplicity one), or it is the identity matrix. Therefore, $\Psi(n, \omega)^*\Psi(n, \omega) = \tau(\Phi(n, \omega)^*\Phi(n, \omega))$ has two distinct eigenvalues $\sigma_1(n, \omega) > \sigma_2(n, \omega) = 1/\sigma_1(n, \omega) > 0$ of multiplicity two, or it is the identity matrix in $\mathbb{R}^4$. The determinant being continuous, the limit $\Lambda(\omega)$ is also positive of determinant equal to one. By continuity of $\tau$ and $\tau^{-1}$, $\Lambda(\omega)$ also belongs to $A_4(\mathbb{R})$ and there exists $\kappa(\omega) \in M_2(\mathbb{C})$ such that $\tau(\kappa(\omega)) = \Lambda(\omega)$. Moreover, the relation $\kappa(\omega) = \lim_{n\to\infty} \Phi(n, \omega)^*\Phi(n, \omega)$ shows that $\kappa(\omega)$ is also positive of determinant one, which proves that the eigenvalues of $\Lambda(\omega)$ have multiplicity two or that $\Lambda(\omega)$ is the identity matrix.

Corollary 5.1. Under the same hypotheses as above, there exists almost surely a subspace $V_0(\omega)$ of $\mathbb{C}^2$ of complex dimension 1 such that
\[ \forall u \in V_0(\omega) \setminus \{0\}, \quad \lim_{n\to+\infty} \frac{1}{n} \ln \|\Phi(n, \omega)u\| = -\gamma_1 < 0, \qquad \forall u \in \mathbb{C}^2 \setminus V_0(\omega), \quad \lim_{n\to+\infty} \frac{1}{n} \ln \|\Phi(n, \omega)u\| = \gamma_1 > 0. \tag{5.27} \]
Also, there exists a splitting $\mathbb{C}^2 = E_2^0(\omega) \oplus E_1^0(\omega)$ such that
\[ \lim_{n\to\pm\infty} \frac{1}{n} \log \|\Phi(n, \omega)u\| = \gamma_j \iff u \in E_j^0(\omega) \setminus \{0\}. \tag{5.28} \]
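Proof. By the proposition, there exists a filtration $\{0\} \subset V(\omega) \subset \mathbb{R}^4$ such that
\[ \forall v \in \mathbb{R}^4 \setminus V(\omega), \quad \lim_{n\to+\infty} \frac{1}{n} \log \|\tau(T(1, S^n(\omega)))\cdots\tau(T(1, \omega))v\| = \gamma_1, \qquad \forall v \in V(\omega) \setminus \{0\}, \quad \lim_{n\to+\infty} \frac{1}{n} \log \|\tau(T(1, S^n(\omega)))\cdots\tau(T(1, \omega))v\| = -\gamma_1. \tag{5.29} \]
The properties (5.1), (5.13) and (5.14) imply that $\forall v \in \mathbb{R}^4$,
\[ \lim_{n\to\infty} \frac{1}{n} \log \|\tau(T(1, S^n(\omega)))\cdots\tau(T(1, \omega))v\| = \lim_{n\to\infty} \frac{1}{n} \log \|T(1, S^n(\omega))\cdots T(1, \omega)\rho^{-1}(v)\|. \tag{5.30} \]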
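Let $v_0 \in V(\omega)$, $u_0 = \rho^{-1}(v_0)$ and $V_0(\omega) = \mathbb{C}u_0$. The equation above proves the first assertion. Consider $u \in \mathbb{C}^2 \setminus V_0(\omega)$. Then
\[ u = \alpha u_0 + u_0^\perp, \qquad u_0^\perp \ne 0,\ \alpha \in \mathbb{C}, \tag{5.31} \]
\[ \rho(u) = \Re(\alpha)\rho(u_0) + \Im(\alpha)\rho(iu_0) + \rho(u_0^\perp). \tag{5.32} \]
The three components are non-zero and mutually orthogonal. Therefore $\rho(u) \in \mathbb{R}^4 \setminus V(\omega)$, from which the second assertion follows.

We can proceed along the same lines in order to prove the statements concerning the existence and properties of a splitting $\mathbb{C}^2 = E_2^0(\omega) \oplus E_1^0(\omega)$ with $E_j^0(\omega) = \mathbb{C}\rho^{-1}(v_j)$, $v_j \in E_j(\omega)$. Indeed, let $v_1 \in E_1(\omega)$ and $u_1 = \rho^{-1}(v_1)$. We define $v_1' = \rho(iu_1)$, so that $\langle v_1|v_1'\rangle = 0$ and $\lim_{n\to\pm\infty} \frac{1}{n} \ln \|\Psi(n, \omega)v_1'\| = \lim_{n\to\pm\infty} \frac{1}{n} \ln \|\Phi(n, \omega)iu_1\| = \gamma_1$. Hence $v_1' \in E_1(\omega)$. Let $v_2 \in E_2(\omega)$ such that $u_2 := \rho^{-1}(v_2)$ is not collinear to $u_1$. There exists such a $v_2$; otherwise, $u_2 = \alpha u_1$ implies $\rho(u_2) = v_2 = \Re(\alpha)v_1 + \Im(\alpha)v_1' \in E_2(\omega)$, which is a contradiction. Hence, $v_2' = \rho(iu_2) \in E_2(\omega)$. Now $u = \alpha u_1 + \beta u_2$ and $\rho(u) = \Re(\alpha)v_1 + \Im(\alpha)v_1' + \Re(\beta)v_2 + \Im(\beta)v_2'$, so that
\[ \lim_{n\to\pm\infty} \frac{1}{n} \ln \|\Phi(n, \omega)u\| = \gamma_j = \lim_{n\to\pm\infty} \frac{1}{n} \ln \|\Psi(n, \omega)\rho(u)\| \tag{5.33} \]
is equivalent to $\beta = 0$ if $j = 1$ and $\alpha = 0$ if $j = 2$.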
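The exponential growth rate in (5.27) can be estimated by simulation. The following is a rough sketch in plain Python, not a substitute for the argument: it multiplies i.i.d. transfer matrices with entries as in (4.8), where the two phases $\eta_{2k-1}$ and $\eta_{2k}$ are drawn uniformly on $\mathbb{T}$ (the situation of Lemma 4.1), and returns $\frac{1}{n}\log\|\Phi(n,\omega)\|$. The value $t = 0.5$, the number of factors, the seed and the function names are arbitrary choices made for illustration.

```python
import cmath
import math
import random

def transfer(theta, eta, t):
    """2x2 transfer matrix with entries as in (4.8), theta = eta_{2k-1}, eta = eta_{2k}."""
    r = math.sqrt(1.0 - t * t)          # r^2 + t^2 = 1
    a = cmath.exp(-1j * theta)          # e^{-i theta}
    b = cmath.exp(1j * eta)             # e^{i eta}, so a * b = e^{i(eta - theta)}
    return [[-a, 1j * r / t * (a - 1)],
            [-1j * r / t * (a - a * b),
             -(r * r) / (t * t) * (a - 1 - a * b) - b / (t * t)]]

def lyapunov_estimate(t, n, seed=0):
    """Return (1/n) log ||T(n) ... T(1)|| for i.i.d. uniform phases."""
    rng = random.Random(seed)
    P = [[1.0 + 0j, 0j], [0j, 1.0 + 0j]]
    log_norm = 0.0
    for _ in range(n):
        T = transfer(rng.uniform(0, 2 * math.pi), rng.uniform(0, 2 * math.pi), t)
        P = [[T[i][0] * P[0][j] + T[i][1] * P[1][j] for j in range(2)]
             for i in range(2)]
        # Renormalize at every step to avoid overflow, and accumulate the logarithm.
        norm = math.sqrt(sum(abs(z) ** 2 for row in P for z in row))
        log_norm += math.log(norm)
        P = [[z / norm for z in row] for row in P]
    return log_norm / n

gamma_est = lyapunov_estimate(t=0.5, n=20000)
```

The entrywise Euclidean norm is used because all norms on $M_2(\mathbb{C})$ are equivalent, so the limit does not depend on this choice. Since each factor has determinant of modulus one, the estimate is always positive; how fast it stabilizes as `n` grows is an empirical matter.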
Positivity of the Lyapunov exponent. In order to assess the positivity of the first Lyapunov exponent, we use Furstenberg's Theorem. Let us introduce, according to [BL] III.2.1., the following notions. Let $S$ be a subset of $GL_d(\mathbb{R})$, $d > 0$. Such a set $S$ is said to be irreducible if there is no strict subspace $V$ of $\mathbb{R}^d$ such that $\forall M \in S$, $M(V) = V$. A set will be called strongly irreducible if there is no finite family $V_1, \ldots, V_N$ of strict subspaces of $\mathbb{R}^d$ such that $\forall M \in S$, $M(V_1 \cup \ldots \cup V_N) = V_1 \cup \ldots \cup V_N$. The basic theorem is then

Theorem 5.1 (Furstenberg). If $\mu$ is a probability measure on $\mathcal{M} = \{M \in GL_d(\mathbb{R});\ |\det M| = 1\}$ such that $\int \log \|M\|\, d\mu(M) < +\infty$ and the group $G_\mu$ generated by the support of $\mu$ is strongly irreducible and non-compact, then the first Lyapunov exponent associated with any sequence of i.i.d. matrices in $\mathcal{M}$ satisfies $\gamma_1 > 0$.

See [BL] Theorem III.6.1 for a proof. We note the following property (Exercise IV.2.9 of [BL]) reducing strong irreducibility to irreducibility in some cases.

Lemma 5.3. Let $1 < d \in \mathbb{N}$ and $S$ be a connected subset of $GL_d(\mathbb{R})$. Then $S$ is strongly irreducible if and only if $S$ is irreducible.

In our case, the measure $\mu$ is the image by the map (4.7) of the uniform measure $P_0 \otimes P_0$ on $\mathbb{T}^2$. In order to study the properties of the corresponding set $G_\mu$, we introduce the connected set of matrices given by the range of the smooth map from $\mathbb{T}^2$ to $M_2(\mathbb{C})$ which to $(\theta, \eta)$ assigns the matrix
\[ T_{(\theta,\eta)} = \begin{pmatrix} -e^{-i\theta} & \frac{ir}{t}\left(e^{-i\theta} - 1\right) \\ -\frac{ir}{t}\left(e^{-i\theta} - e^{i(\eta-\theta)}\right) & -\frac{r^2}{t^2}\left(e^{-i\theta} - 1 - e^{i(\eta-\theta)}\right) - \frac{1}{t^2}e^{i\eta} \end{pmatrix}. \tag{5.34} \]
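Two elementary properties of the matrix (5.34) are easy to check numerically. A short computation using $r^2 + t^2 = 1$ gives $\det T_{(\theta,\eta)} = e^{i(\eta-\theta)}$, and at $\theta = \eta = \pi$ the matrix is Hermitian with eigenvalues $(1 \pm r)^2/t^2$. The sketch below, in plain Python, illustrates both facts; the values of $t$, $\theta$ and $\eta$ and the helper names are arbitrary choices made for illustration.

```python
import cmath
import math

def transfer(theta, eta, t):
    """The 2x2 matrix T_(theta, eta) of (5.34), with r = sqrt(1 - t^2)."""
    r = math.sqrt(1.0 - t * t)
    a = cmath.exp(-1j * theta)          # e^{-i theta}
    b = cmath.exp(1j * eta)             # e^{i eta}, so a * b = e^{i(eta - theta)}
    return [[-a, 1j * r / t * (a - 1)],
            [-1j * r / t * (a - a * b),
             -(r * r) / (t * t) * (a - 1 - a * b) - b / (t * t)]]

def det2(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

t = 0.6
r = math.sqrt(1.0 - t * t)

# Determinant at a generic point of the torus.
theta, eta = 0.7, 2.1
det_err = abs(det2(transfer(theta, eta, t)) - cmath.exp(1j * (eta - theta)))

# At theta = eta = pi the matrix is Hermitian; solve the characteristic equation.
H = transfer(math.pi, math.pi, t)
herm_err = abs(H[0][1] - H[1][0].conjugate())
trace = (H[0][0] + H[1][1]).real
disc = math.sqrt(trace * trace - 4.0 * det2(H).real)
eigs = ((trace + disc) / 2.0, (trace - disc) / 2.0)
expected = ((1.0 + r) ** 2 / t ** 2, (1.0 - r) ** 2 / t ** 2)
```

The two eigenvalues in `expected` multiply to one, so whenever $0 < t < 1$ one of them exceeds one.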
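Let $G \subset G_\mu$ denote the smallest group generated by the support of the image measure of $P_0 \otimes P_0$ on $SL_4(\mathbb{R})$ under $(\theta, \eta) \mapsto \tau(T_{(\theta,\eta)})$.

Proposition 5.2. The group $G$ is not compact.

Proof. The matrix $\tau(T_{(\pi,\pi)})$ belongs to the support of the image of $P_0 \otimes P_0$ by (5.34) and it has eigenvalues
\[ \frac{(r+1)^2}{t^2} \quad \text{and} \quad \frac{(r-1)^2}{t^2}. \tag{5.35} \]
Their product equals 1 and they are distinct if $t < 1$, so that one of them is strictly larger than 1. Since for any $n \in \mathbb{N}$, $\tau(T_{(\pi,\pi)})^n \in G$, $G$ cannot be bounded.

Proposition 5.3. The group $G$ is strongly irreducible.

Proof. It is enough to exhibit an irreducible, connected subset of $G$. The map $\tau(T_{\cdot,\cdot})$ is smooth, hence the set $\{\tau(T_{(\theta,\eta)}),\ (\theta, \eta) \in [0, 2\pi[^2\}$, included in $G$, is connected. We now show that there exists no strict subspace of $\mathbb{R}^4$ invariant under this set of matrices. We first note that choosing $\eta = \theta \in [0, 2\pi[$ we get
\[ \tau(T_{(\theta,\theta)}) = M_0 + \sin(\theta)M_1 + \cos(\theta)M_2, \tag{5.36} \]
where
\[ M_0 = \begin{pmatrix} 0 & 0 & 0 & -\frac{r}{t} \\ 0 & 0 & \frac{r}{t} & 0 \\ 0 & \frac{r}{t} & \frac{2r^2}{t^2} & 0 \\ -\frac{r}{t} & 0 & 0 & \frac{2r^2}{t^2} \end{pmatrix}, \tag{5.37} \]
\[ M_1 = \begin{pmatrix} 0 & 1 & \frac{r}{t} & 0 \\ -1 & 0 & 0 & \frac{r}{t} \\ -\frac{r}{t} & 0 & 0 & -1 \\ 0 & -\frac{r}{t} & 1 & 0 \end{pmatrix}, \tag{5.38} \]
\[ M_2 = -(M_0 + \mathbb{I}), \tag{5.39} \]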
where $\mathbb{I}$ denotes the identity matrix. If there exists a strict invariant subspace $E$ for the set $\{\tau(T_{(\theta,\theta)})\}_{\theta\in[0,2\pi[}$, this subspace $E$ is also invariant for the matrices $M_j$, $j = 0, 1, 2$. Similarly, choosing $-\eta = \theta \in [0, 2\pi[$, we have
\[ \tau(T_{(\theta,-\theta)}) = N_0 + \sin(\theta)N_1 + \cos(\theta)N_2 + \sin(2\theta)N_3 + \cos(2\theta)N_4, \tag{5.40} \]
where, in particular,
\[ N_1 = \begin{pmatrix} 0 & 1 & \frac{r}{t} & 0 \\ -1 & 0 & 0 & \frac{r}{t} \\ -\frac{r}{t} & 0 & 0 & \frac{r^2+1}{t^2} \\ 0 & -\frac{r}{t} & -\frac{r^2+1}{t^2} & 0 \end{pmatrix}. \tag{5.41} \]
Again $E$ must be invariant under $N_1$. As $M_0$, $M_1$, $N_1$ are real (anti)symmetric, they all leave $E^\perp$ invariant as well, so that these matrices are reduced by the orthogonal spaces $E \oplus E^\perp = \mathbb{R}^4$. In particular, these
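invariant subspaces must be generated by the eigenvectors $\{u_1, u_2, u_3, u_4\}$ of $M_0$, which form a basis of $\mathbb{R}^4$. Explicitly,
\[ u_1 = \begin{pmatrix} 1 \\ 0 \\ 0 \\ -\frac{r+1}{t} \end{pmatrix}, \quad u_2 = \begin{pmatrix} 0 \\ \frac{1-r}{t} \\ 1 \\ 0 \end{pmatrix}, \quad u_3 = \begin{pmatrix} 1 \\ 0 \\ 0 \\ \frac{1-r}{t} \end{pmatrix}, \quad u_4 = \begin{pmatrix} 0 \\ -\frac{r+1}{t} \\ 1 \\ 0 \end{pmatrix}, \tag{5.42} \]
the first two vectors being associated with the eigenvalue $r(r+1)/t^2$ while the last two are associated with $r(r-1)/t^2$. We further compute, repeatedly using $r^2 + t^2 = 1$, that
\[ M_1u_1 = \frac{1}{t}u_4, \quad M_1u_2 = \frac{1}{t}u_3, \quad M_1u_3 = -\frac{1}{t}u_2, \quad M_1u_4 = -\frac{1}{t}u_1 \tag{5.43} \]
and
\[ N_1u_1 = -\frac{1+r}{t(1-r)}u_2, \quad N_1u_2 = \frac{1}{t}u_1, \quad N_1u_3 = \frac{1-r}{t(1+r)}u_4, \quad N_1u_4 = -\frac{1}{t}u_3. \tag{5.44} \]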
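The eigenvalue relations for $M_0$ and the action of $M_1$ and $N_1$ on the vectors $u_j$ can be confirmed numerically. The sketch below, in plain Python, enters the matrices (5.37), (5.38), (5.41) and the vectors (5.42) for the arbitrary value $t = 0.6$ and measures how far each image is from the stated multiple of a basis vector; in particular it takes $N_1u_3$ to be a multiple of $u_4$. The helper names are illustrative.

```python
import math

t = 0.6
r = math.sqrt(1.0 - t * t)            # r^2 + t^2 = 1
s = r / t
q = (r * r + 1.0) / (t * t)

M0 = [[0, 0, 0, -s], [0, 0, s, 0], [0, s, 2 * s * s, 0], [-s, 0, 0, 2 * s * s]]
M1 = [[0, 1, s, 0], [-1, 0, 0, s], [-s, 0, 0, -1], [0, -s, 1, 0]]
N1 = [[0, 1, s, 0], [-1, 0, 0, s], [-s, 0, 0, q], [0, -s, -q, 0]]

u1 = [1, 0, 0, -(r + 1) / t]
u2 = [0, (1 - r) / t, 1, 0]
u3 = [1, 0, 0, (1 - r) / t]
u4 = [0, -(r + 1) / t, 1, 0]

def matvec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def gap(A, v, c, w):
    """Largest entry of A v - c w."""
    return max(abs(x - c * y) for x, y in zip(matvec(A, v), w))

lam_plus, lam_minus = r * (r + 1) / t ** 2, r * (r - 1) / t ** 2
residuals = [
    # u1, u2 (resp. u3, u4) are eigenvectors of M0, cf. (5.42)
    gap(M0, u1, lam_plus, u1), gap(M0, u2, lam_plus, u2),
    gap(M0, u3, lam_minus, u3), gap(M0, u4, lam_minus, u4),
    # action of M1, cf. (5.43)
    gap(M1, u1, 1 / t, u4), gap(M1, u2, 1 / t, u3),
    gap(M1, u3, -1 / t, u2), gap(M1, u4, -1 / t, u1),
    # action of N1, cf. (5.44)
    gap(N1, u1, -(1 + r) / (t * (1 - r)), u2), gap(N1, u2, 1 / t, u1),
    gap(N1, u3, (1 - r) / (t * (1 + r)), u4), gap(N1, u4, -1 / t, u3),
]
max_residual = max(residuals)
```

In this basis $M_1$ exchanges the two eigenspaces of $M_0$, while $N_1$ acts inside each eigenspace with coefficients that are not related by a sign, which is what the irreducibility argument exploits.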
Clearly no one dimensional subspace $E = \langle u_j \rangle$ (or $E^\perp = \langle u_j \rangle$) can be invariant under $M_0$, $M_1$ and $N_1$. And by inspection, one checks that no two dimensional subspace $E = \langle u_j, u_k \rangle$ can be invariant under $M_0$, $M_1$ and $N_1$. The irreducible set $\{\tau(T_{(\theta,\eta)}),\ (\theta, \eta) \in [0, 2\pi[^2\}$ being contained in the group $G$, the latter and $G_\mu$ are a fortiori irreducible. Therefore,

Proposition 5.4. The Lyapunov exponent $\gamma_1(\lambda)$ associated to the ergodic linear dynamical system (5.18) is strictly positive for any $\lambda \in \mathbb{T}$.

Ishii-Pastur. The link between Lyapunov exponents and a.c. spectrum is provided in the self adjoint random case by the Ishii-Pastur-Kotani Theorem. We provide a unitary version of the Ishii-Pastur part of the result, which is enough for our purpose. In order to adapt the proof of [CFKS], it is only necessary to show that it is spectrally true that the generalized eigenvectors of $U$ are polynomially bounded. We first show that generalized eigenvectors corresponding to spectral parameters outside the spectrum cannot be polynomially bounded for bounded normal operators with a band structure. We will say that a matrix $\{M_{j,k}\}_{j,k\in\mathbb{Z}}$ has a band structure of order $2p + 1$, $p \in \mathbb{N}$, if $|j - k| > p$ implies $M_{j,k} = 0$. Note that if this is so, then
\[ (Mv)_k = \sum_{j\in\mathbb{Z}} M_{k,j}v_j = \sum_{j=k-p}^{k+p} M_{k,j}v_j \tag{5.45} \]
makes sense for an arbitrary vector $v = \{v_j\}_{j\in\mathbb{Z}}$, since the sum is finite. Define the projections
\[ P_{[a,b]} = \sum_{a\le j\le b} |\varphi_j\rangle\langle\varphi_j| \tag{5.46} \]
and note that
\[ P_{[a,b]}U = P_{[a,b]}UP_{[a-p,b+p]}, \qquad UP_{[a,b]} = P_{[a-p,b+p]}UP_{[a,b]}. \tag{5.47} \]
That is, in fact, just another way of saying that $U$ has band structure.
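The identities (5.47) only use the band structure and can be illustrated on a finite matrix. In the sketch below, written in plain Python, a random real matrix with $M_{j,k} = 0$ for $|j - k| > p$ stands in for $U$; it is not unitary, which does not matter for this check. The sizes `n`, `p`, `a`, `b` are arbitrary, chosen so that the enlarged window $[a - p, b + p]$ stays inside the index range, and the helper names are illustrative.

```python
import random

def banded(n, p, seed=0):
    """Random n x n matrix with entries M[j][k] = 0 whenever |j - k| > p."""
    rng = random.Random(seed)
    return [[rng.uniform(-1, 1) if abs(j - k) <= p else 0.0 for k in range(n)]
            for j in range(n)]

def proj(n, a, b):
    """Orthogonal projection onto the basis vectors with indices a, ..., b."""
    return [[1.0 if (j == k and a <= j <= b) else 0.0 for k in range(n)]
            for j in range(n)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def max_diff(P, Q):
    return max(abs(x - y) for rp, rq in zip(P, Q) for x, y in zip(rp, rq))

n, p, a, b = 12, 2, 4, 7
U = banded(n, p)
P_ab = proj(n, a, b)
P_wide = proj(n, a - p, b + p)

# P_[a,b] U = P_[a,b] U P_[a-p,b+p]
left_err = max_diff(matmul(P_ab, U), matmul(matmul(P_ab, U), P_wide))
# U P_[a,b] = P_[a-p,b+p] U P_[a,b]
right_err = max_diff(matmul(U, P_ab), matmul(P_wide, matmul(U, P_ab)))
```

Both discrepancies are exactly zero, because the rows (respectively columns) of `U` indexed by $[a, b]$ have no nonzero entries outside the columns (respectively rows) indexed by $[a - p, b + p]$.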
Lemma 5.4. Let $(\varphi_n)_{n\in\mathbb{Z}}$ be an orthonormal basis of a separable Hilbert space $\mathcal{H}$ on which a normal operator $U$ acts. Assume $U$ has a band structure of order $2p + 1$ and consider an arbitrary nontrivial sequence $\phi$ such that $U\phi = z\phi$, where $z \in \mathbb{C}$ is in the resolvent set of $U$. Then the sequence $(\langle\varphi_k|\phi\rangle)_{k\in\mathbb{Z}}$ is not polynomially bounded.

Proof. The operator $U$ being normal, for any $z$ in the resolvent set, $(U - z)^{-1}$ is normal too. Therefore $\|(U - z)^{-1}\| = r_\sigma((U - z)^{-1})$, where $r_\sigma(A)$ is the spectral radius of the operator $A$. As $r_\sigma((U - z)^{-1}) = 1/\mathrm{dist}(z, \sigma(U))$ ([K] (III 6.16 p.177)), we deduce
\[ \forall \psi \in \mathcal{H}, \qquad \|\psi\| \le \frac{1}{\mathrm{dist}(z, \sigma(U))}\|(U - z)\psi\|. \tag{5.48} \]
Consider the generalized eigenvector $\phi$. Since $z \notin \sigma(U)$, $\phi$ cannot be in $l^2$, so it must fail to be in $l^2$ either at $+\infty$ or at $-\infty$. We will assume that it fails at $+\infty$, and focus on the coefficients $\langle\varphi_k|\phi\rangle$, with $k \ge 0$ and large enough. Let $n > 3p$ and let
\[ P_n = P_{[p,n]} = \mathbb{I} - Q_n. \tag{5.49} \]
Since $P_n\phi \in l^2(\mathbb{Z})$, we have by (5.48),
\[ \|P_n\phi\| \le c_z\|(U - z)P_n\phi\|, \tag{5.50} \]
where $c_z^{-1} = \mathrm{dist}(z, \sigma(U))$. Since we have assumed that $\phi$ is not in $l^2$ for $k \ge 0$, necessarily
\[ \|P_{n-p}\phi\|^2 = \|P_{[p,n-p]}\phi\|^2 \to \infty \quad \text{as } n \to \infty. \tag{5.51} \]
So there exists an $n_0$ such that, given $\epsilon > 0$, $n \ge n_0$ implies
\[ \|P_{n-p}\phi\|^2 \ge \epsilon^{-1}\|P_{[0,2p]}\phi\|^2. \tag{5.52} \]
Since $(U - z)\phi = 0$, it follows for any finite projection $P$ that
\[ (U - z)P\phi = -(U - z)Q\phi, \tag{5.53} \]
where $Q = \mathbb{I} - P$, and hence that
\[ (U - z)P\phi = P(U - z)P\phi + Q(U - z)P\phi = -P(U - z)Q\phi + Q(U - z)P\phi = -PUQ\phi + QUP\phi. \tag{5.54} \]
Take in (5.50), $P = P_n = P_{[p,n]} = \mathbb{I} - Q_n$. By (5.47), we get
\[ P_nUQ_n = P_{[p,n]}UP_{[0,p+n]}Q_n = P_{[p,n]}U(P_{[0,p-1]} + P_{[n+1,p+n]}). \tag{5.55} \]
Also,
\[ Q_nUP_n = Q_nU(P_{[p,2p]} + P_{[2p+1,n-p]} + P_{[n-p+1,n]}). \tag{5.56} \]
But
\[ Q_nUP_{[2p+1,n-p]} = Q_nP_{[p+1,n]}UP_{[2p+1,n-p]} = 0 \tag{5.57} \]
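so that
\[ Q_nUP_n = Q_nU(P_{[p,2p]} + P_{[n-p+1,n]}). \tag{5.58} \]
Since the ranges of the appropriate projectors are orthogonal, we have with $A = \|U\|^2$,
\begin{align*}
\|(U - z)P_n\phi\|^2 &= \|P_nUQ_n\phi\|^2 + \|Q_nUP_n\phi\|^2 \\
&= \|P_nU(P_{[0,p-1]} + P_{[n+1,p+n]})\phi\|^2 + \|Q_nU(P_{[p,2p]} + P_{[n-p+1,n]})\phi\|^2 \\
&\le A\big(\|P_{[0,p-1]}\phi\|^2 + \|P_{[n+1,p+n]}\phi\|^2 + \|P_{[p,2p]}\phi\|^2 + \|P_{[n-p+1,n]}\phi\|^2\big) \\
&= A\big(\|P_{[0,2p]}\phi\|^2 + \|P_{[p,n+p]}\phi\|^2 - \|P_{[p,n-p]}\phi\|^2\big) \\
&= A\big(\|P_{[0,2p]}\phi\|^2 + \|P_{n+p}\phi\|^2 - \|P_{n-p}\phi\|^2\big). \tag{5.59}
\end{align*}
Thus, by (5.52) and (5.50), for $n > \max(n_0, 3p)$, we have
\[ \|P_{n-p}\phi\|^2 \le \|P_n\phi\|^2 \le c_z^2\|(U - z)P_n\phi\|^2 \le c_z^2A\big(\epsilon\|P_{n-p}\phi\|^2 + \|P_{n+p}\phi\|^2 - \|P_{n-p}\phi\|^2\big), \tag{5.60} \]
which implies that
\[ \|P_{n+p}\phi\|^2 \ge \|P_{n-p}\phi\|^2\left(\frac{1}{Ac_z^2} + 1 - \epsilon\right) \equiv B\|P_{n-p}\phi\|^2, \tag{5.61} \]
where $B > 1$ if $\epsilon < 1/(Ac_z^2)$. Iterating the argument, we get $\forall k \in \mathbb{N}$,
\[ \|P_{n+2kp}\phi\| \ge B^{k/2}\|P_n\phi\|, \tag{5.62} \]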
which ensures the existence of an exponentially growing subsequence of coefficients.

The second element is the construction of generalized solutions corresponding to spectral parameters in the spectrum of $U$ which are polynomially bounded, à la Berezanskii. This is done in our unitary setting following, mutatis mutandis, the arguments given in [S] for the self-adjoint case. We only quote the end result here, including a proof in the Appendix for completeness. Recall that a measure $\rho$ is in the measure class of a unitary operator $U$ with spectral projection $E(\cdot)$ if for any Borel set $\Delta \subset \mathbb{T}$: $\rho(\Delta) = 0 \iff E(\Delta) = 0$.

Theorem 5.2. Let $U$ be a unitary operator with a band structure defined on $l^2(\mathbb{Z})$ and $\delta > 1$. Then there exists a measure $\rho$ in the spectral measure class of $U$ and a family of disjoint measurable sets $(\Delta_n)_{n\in\mathbb{N}^*}$ whose union supports $\rho$ such that for $\lambda \in \Delta_n$, there exist $n$ vectors $\phi_j(\lambda)$ satisfying
• $(U - e^{i\lambda})\phi_j(\lambda) = 0$.
• $\forall n \in \mathbb{Z}$, $|\langle\varphi_n|\phi_j(\lambda)\rangle| \le C\langle n\rangle^\delta$.
• For any $\lambda$ fixed, the family $\{\phi_j(\lambda)\}_j$ is linearly independent.

Remark. The result is also true if the operator $U$ is defined on $l^2(\mathbb{N})$ or $l^2(\mathbb{N}^*)$.
Corollary 5.2. $\sigma(U)$ is the essential closure of the set
\[ S = \{\lambda \in \mathbb{T}^1;\ U\phi = e^{i\lambda}\phi \text{ admits a polynomially bounded solution}\} \tag{5.63} \]
and $E_{[0,2\pi[\setminus S}(U) = 0$.

Proof. If $e^{i\lambda} \in \sigma(U)$ then for any $\epsilon > 0$, $E(]\lambda - \epsilon, \lambda + \epsilon[) > 0$ and $\rho(]\lambda - \epsilon, \lambda + \epsilon[) > 0$. Hence, by Theorem 5.2, for $\lambda'$ arbitrarily close to $\lambda$ there exists a polynomially bounded solution $\phi_j(\lambda')$. Thus $\sigma(U) \subset \bar{S}$. The reverse inclusion follows from Lemma 5.4 and the fact that $\sigma(U)$ is closed. The last statement follows immediately.

Putting these arguments together, we get the unitary version of the Ishii-Pastur theorem suited to our monodromy operator:

Theorem 5.3. Let $U_\omega$ be unitary with a band structure. Assume that the corresponding transfer matrix at spectral parameter $e^{i\lambda}$ induces two Lyapunov exponents $\gamma_1(\lambda) \ge \gamma_2(\lambda) = -\gamma_1(\lambda)$ which are constant almost surely. Then
\[ \sigma_{ac}(U_\omega) \subseteq \{e^{i\lambda} \in S^1;\ \gamma_1(\lambda) = 0\}. \tag{5.64} \]
Proof. Identical to that given in [CFKS] Thm 9.13.

Therefore, Theorem 4.1 follows from the above theorem and Proposition 5.4.
6. Coherent Setting

In this section we consider situations where the matrix coefficients of $U$ in (2.5) are periodic functions of $k$ as the result of a coherent behavior of the phases. We first show that this implies purely absolutely continuous spectrum. Then we prove that when restricted to $l^2(\mathbb{N}^*)$, these operators have no singular continuous spectrum and may possess finitely many simple eigenvalues only.

Coherence on $l^2(\mathbb{Z})$. As a first particular case of coherent dependence of the scattering phases, we consider the simple situation where the $\theta_k$'s and $\alpha_k$'s take alternatively two values, up to a linear term. This corresponds to a monodromy operator $U = U_oU_e$, where $U_e$, $U_o$ are direct sums of constant blocks $S_{2k} = S_e$, $S_{2k+1} = S_o$.

Proposition 6.1. Let $t \in ]0, 1[$, let the sequence $\{\gamma_k\}$ be arbitrary and
\[ \theta_k = \begin{cases} \theta_e & \text{if } k \text{ is even} \\ \theta_o & \text{if } k \text{ is odd} \end{cases}, \qquad \alpha_k = ak + \begin{cases} \alpha_e & \text{if } k \text{ is even} \\ \alpha_o & \text{if } k \text{ is odd} \end{cases}, \qquad \forall k \in \mathbb{Z}, \]
where $\theta_e, \theta_o, \alpha_e, \alpha_o, a \in \mathbb{R}$. Define $\Delta = \alpha_e - \alpha_o$, $\Theta = \theta_e + \theta_o$. Then, with the identification $U \equiv M(\{\theta_k\}, \{\alpha_k\}, \{\gamma_k\})$, $U$ is purely absolutely continuous and
\[ \sigma_{ac}(U) = \{e^{-i(a+\Theta)}e^{\pm i\arccos(r^2\cos\Delta - t^2\cos(2x+\Delta))},\ x \in \mathbb{T}\}. \]
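Proof. By Lemma 3.2, we can replace $\gamma_k$ by $(-1)^{k+1}\alpha_k$ so that, with our choice of phases, $U \simeq e^{-i(a+\Theta)}V$, where
\begin{align*}
V\varphi_{2k} &= irte^{-i\Delta}\varphi_{2k-1} + r^2e^{-i\Delta}\varphi_{2k} + irte^{i\Delta}\varphi_{2k+1} - t^2e^{i\Delta}\varphi_{2k+2}, \\
V\varphi_{2k+1} &= -t^2e^{-i\Delta}\varphi_{2k-1} + itre^{-i\Delta}\varphi_{2k} + r^2e^{i\Delta}\varphi_{2k+1} + irte^{i\Delta}\varphi_{2k+2}. \tag{6.1}
\end{align*}
Let us map $l^2(\mathbb{Z})$ unitarily to $L^2(\mathbb{T})$ via
\[ W : \varphi_k \mapsto e^{ikx}, \tag{6.2} \]
such that for any $\psi = \sum_k c_k\varphi_k$, $c_k \in \mathbb{C}$,
\[ (W\psi)(x) = \sum_{k\in\mathbb{Z}} c_ke^{ikx} \in L^2(\mathbb{T}). \tag{6.3} \]
We further introduce $L^2_\pm(\mathbb{T}) = WL_\pm$, where $L_+$, $L_-$ are the subspaces of $l^2(\mathbb{Z})$ generated by the basis vectors with even, respectively odd, indices. It is easily checked that $V$ is unitarily equivalent on $L^2(\mathbb{T}) = L^2_+(\mathbb{T}) \oplus L^2_-(\mathbb{T})$ to the matrix valued multiplication operator
\[ V \simeq \begin{pmatrix} r^2e^{-i\Delta} - t^2e^{i\Delta}e^{2ix} & 2itr\cos(x + \Delta) \\ 2itr\cos(x + \Delta) & r^2e^{i\Delta} - t^2e^{-i\Delta}e^{-2ix} \end{pmatrix}. \tag{6.4} \]
This matrix is analytic in $x$ and has non-constant eigenvalues given by
\[ \lambda_\pm(x) = r^2\cos\Delta - t^2\cos(\Delta + 2x) \pm i\sqrt{1 - (r^2\cos\Delta - t^2\cos(\Delta + 2x))^2}, \tag{6.5} \]
from which the result follows.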
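The eigenvalue formula (6.5) can be checked against the matrix (6.4) by inserting $\lambda_\pm(x)$ into the characteristic polynomial. The sketch below, in plain Python, does this on a grid of values of $x$ and also records how much the real part $r^2\cos\Delta - t^2\cos(\Delta + 2x)$ varies, which illustrates that the band functions are not constant. The values of $t$ and $\Delta$, the grid size and the function names are arbitrary illustrative choices.

```python
import cmath
import math

def symbol(x, delta, t):
    """The 2x2 matrix of (6.4), with r^2 = 1 - t^2."""
    r2, t2 = 1.0 - t * t, t * t
    off = 2j * math.sqrt(r2) * t * math.cos(x + delta)
    return [[r2 * cmath.exp(-1j * delta) - t2 * cmath.exp(1j * (delta + 2 * x)), off],
            [off, r2 * cmath.exp(1j * delta) - t2 * cmath.exp(-1j * (delta + 2 * x))]]

def band_functions(x, delta, t):
    """The two eigenvalues lambda_+(x), lambda_-(x) of (6.5)."""
    c = (1.0 - t * t) * math.cos(delta) - t * t * math.cos(delta + 2 * x)
    s = math.sqrt(max(0.0, 1.0 - c * c))   # |c| <= r^2 + t^2 = 1
    return complex(c, s), complex(c, -s)

def char_poly(M, lam):
    """lam^2 - tr(M) lam + det(M)."""
    trace = M[0][0] + M[1][1]
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return lam * lam - trace * lam + det

t, delta = 0.6, 0.4
grid = [2 * math.pi * k / 50 for k in range(50)]

residual = max(abs(char_poly(symbol(x, delta, t), lam))
               for x in grid for lam in band_functions(x, delta, t))
real_parts = [band_functions(x, delta, t)[0].real for x in grid]
spread = max(real_parts) - min(real_parts)
```

The residual vanishes up to rounding because the matrix (6.4) has trace $2(r^2\cos\Delta - t^2\cos(\Delta + 2x))$ and determinant $r^4 + t^4 + 2r^2t^2 = 1$; the spread of the real part is close to $2t^2$.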
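Using basically the same strategy, we can consider the general case where the elements of $U$ display an arbitrary periodicity.

Theorem 6.1. Let $t \in ]0, 1[$, let the sequence $\{\gamma_k\}$ be arbitrary and $\{\theta_k\}$, $\{\alpha_k\}$ be such that for some $2 \le N \in \mathbb{N}$, and all $k \in \mathbb{Z}$, $\theta_{k+N} = \theta_k$, $\alpha_k = ak + \pi_k$, where $\pi_{k+N} = \pi_k$ and $a \in \mathbb{R}$. Then, with the identification $U \equiv M(\{\theta_k\}, \{\alpha_k\}, \{\gamma_k\})$, $U$ is purely absolutely continuous.

Proof. As above, we first replace $\gamma_k$ by $(-1)^{k+1}\alpha_k$ and we introduce
\[ L^2(\mathbb{T}) = \Big(\bigoplus_{q=0}^{N-1} L^2_{2q}(\mathbb{T})\Big) \oplus \Big(\bigoplus_{q=0}^{N-1} L^2_{2q+1}(\mathbb{T})\Big), \tag{6.6} \]
\[ L^2_{2q}(\mathbb{T}) = \mathrm{span}\{(e^{(2Nk+2q)ix})_{k\in\mathbb{Z}},\ x \in \mathbb{T}\}, \tag{6.7} \]
\[ L^2_{2q+1}(\mathbb{T}) = \mathrm{span}\{(e^{(2Nk+2q+1)ix})_{k\in\mathbb{Z}},\ x \in \mathbb{T}\}. \tag{6.8} \]
If $P_{2q}$ and $P_{2q+1}$ denote the orthogonal projections on these subspaces, we get for $\psi = \sum_{k\in\mathbb{Z}} c_k\varphi_k$, with the same notations as above,
\[ (P_{2q}W\psi)(x) = \sum_{k\in\mathbb{Z}} c_{2(Nk+q)}e^{i2(Nk+q)x} \in L^2_{2q}(\mathbb{T}), \tag{6.9} \]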
and similarly for $(P_{2q+1}W\psi)(x)$. To determine the image of $U$ under the unitary mapping $W$, we introduce
\[ \nu_k^\pm = \theta_{2k} + \theta_{2k\pm1} \mp (\pi_{2k} - \pi_{2k\pm1}), \tag{6.10} \]
for any $k \in \mathbb{Z}$, such that $\nu_k^\pm = \nu_{k+N}^\pm$. Hence $U \simeq e^{-ia}V$ where, this time, $V$ acts on $l^2(\mathbb{Z})$ according to
\begin{align*}
V\varphi_{2k} &= irte^{-i\nu_k^-}\varphi_{2k-1} + r^2e^{-i\nu_k^-}\varphi_{2k} + irte^{-i\nu_k^+}\varphi_{2k+1} - t^2e^{-i\nu_k^+}\varphi_{2k+2}, \\
V\varphi_{2k+1} &= -t^2e^{-i\nu_k^-}\varphi_{2k-1} + itre^{-i\nu_k^-}\varphi_{2k} + r^2e^{-i\nu_k^+}\varphi_{2k+1} + irte^{-i\nu_k^+}\varphi_{2k+2}. \tag{6.11}
\end{align*}
The phases $\nu_k^\pm$ being $N$-periodic, by manipulations similar to those performed above, one gets that $V \simeq T$, where $T$ is a matrix valued multiplication operator on the decomposition of the Hilbert space (6.6) by the $2N \times 2N$ matrix
\[ T(e^{ix}) = \sum_{k=-2}^{2} e^{ikx}T_k, \tag{6.12} \]
where the Tk ’s have a N × N block structure of the form 0 0 0 D− , T1 = irt T2 = −t 2 , W 0 Wu u 0 D− 0 Wl 0 0 Wl , T−1 = irt T0 = r 2 , , T−2 = −t 2 0 D+ 0 0 D+ 0
(6.13)
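with
\[ D^\pm = \mathrm{diag}(e^{-i\nu_0^\pm}, e^{-i\nu_1^\pm}, \ldots, e^{-i\nu_{N-1}^\pm}), \]
\[ W_u = \begin{pmatrix} 0 & e^{-i\nu_1^-} & & & \\ & 0 & e^{-i\nu_2^-} & & \\ & & \ddots & \ddots & \\ & & & 0 & e^{-i\nu_{N-1}^-} \\ e^{-i\nu_0^-} & & & & 0 \end{pmatrix}, \qquad W_l = \begin{pmatrix} 0 & & & & e^{-i\nu_{N-1}^+} \\ e^{-i\nu_0^+} & 0 & & & \\ & e^{-i\nu_1^+} & \ddots & & \\ & & \ddots & 0 & \\ & & & e^{-i\nu_{N-2}^+} & 0 \end{pmatrix}. \tag{6.14} \]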
Now, the operator T being unitary, the matrix T (eix ) is unitary as well for almost every x ∈ R. But this matrix being analytic in a neighborhood of the real axis, it must be unitary everywhere on the real axis. By classical results in analytic perturbation theory, see [K], it is therefore diagonalizable with analytic eigenprojectors in a neighborhood of the real axis, and identically zero eigennilpotents. In order to prove the absolutely continuous nature of the spectrum of U , it is then enough to show that the analytic eigenvalues of the matrix T (eix ) are non-constant in x ∈ R. But this is immediate, because otherwise, an infinitely degenerate eigenvalue would exist, which is forbidden by (3.25).
Remarks. i) The formulae (6.13) above are the starting point of a detailed analysis of the band spectrum of U as a function of t ∈]0, 1[, which we shall not perform here. We only note that for t = 0 (and with no loss of generality a = 0), + − + − σ (U ) = eiν0 , ν1+ + . . . + νN−1 + πeiν0 + ν1− + . . . + ν1− + . . . + νN−1 + π) , (6.15) where each eigenvalue is infinitely degenerate, whereas, for t = 1, + + + σ (U ) = Ran e−2ix ei(ν0 +ν1 +···+νN −1 +π)/N eik2π/N , x ∈ T k=0,... ,N−1
− − − ∪ Ran e2ix ei(ν0 +ν1 +···+νN −1 +π)/N eik2π/N , x ∈ T . (6.16)
Perturbation theory as $t \to 0$ and $t \to 1$ can now be applied to get information on the corresponding band functions in these regimes.
ii) It is not difficult to check that a unitary band matrix of order $2p + 1$ with periodic coefficients, in the sense that there exists $N > 0$ such that $\langle\varphi_j|U\varphi_k\rangle = \langle\varphi_{j+N}|U\varphi_{k+N}\rangle$, is always unitarily equivalent to a multiplication by a $pN \times pN$ unitary matrix $T(e^{ix})$ on $\bigoplus_{q=0,\ldots,pN-1} L^2_q(\mathbb{T})$, where $T(e^{ix})$ is a polynomial of degree $p$ in $e^{\pm ix}$. However, in general, one cannot rule out the existence of finitely many infinitely degenerate eigenvalues.

Coherence on $l^2(\mathbb{N}^*)$. Let us now turn to the study of $U^+$ defined on $l^2(\mathbb{N}^*)$ by (3.1) in case the phases $\{\gamma_k\}$ are arbitrary whereas $\{\theta_k\}$ and $\{\alpha_k\}$ are eventually coherent: i.e., there exist $k_0 \in \mathbb{N}$ and $2 \le N \in \mathbb{N}$ such that for all $k \ge k_0 \in \mathbb{N}^*$,
\[ \theta_{k+N} = \theta_k, \quad \alpha_k = ak + \pi_k, \quad \text{where } \pi_{k+N} = \pi_k \text{ and } a \in \mathbb{R}. \tag{6.17} \]
We can replace without loss $\gamma_k$ by $(-1)^{k+1}\alpha_k$ and assume $a = 0$, since we are working up to unitary equivalence. Our coherent comparison operator $U_0$ on $l^2(\mathbb{Z})$ is defined by (2.5) with phases $\{\theta_k\}$ and $\{\alpha_k\}$ obtained by extending (6.17) (with $a = 0$) to $\mathbb{Z}$. Therefore we can write on $l^2(-\mathbb{N}^*) \oplus \mathbb{C} \oplus l^2(\mathbb{N}^*)$
\[ U_0 = \begin{pmatrix} W^- & & \\ & 1 & \\ & & U^+ \end{pmatrix} - F, \tag{6.18} \]
where absent elements denote zeros, $W^-$ is an operator defined on $l^2(-\mathbb{N}^*)$ which is eventually periodic and $F$ is a finite rank operator. It is always possible to construct $U_0$ this way with $\dim \mathrm{Ran}\, F = M$ depending on $N$ and $k_0$.

Theorem 6.2. Let $U^+$ and $U_0$ be as above. Then $\sigma_{s.c.}(U^+) = \emptyset$ and $\sigma_{a.c.}(U^+) = \sigma_{a.c.}(U_0)$. The point spectrum of $U^+$ consists of finitely many simple eigenvalues in the resolvent set of $U_0$.

Remark. As the proof below shows, the same statement holds if $U^+$ denotes a doubly infinite coherent matrix perturbed by a finite rank operator.
Proof. Let us first show that the finite rank perturbation $F$ of the unitary $U_0$ does not produce any singular continuous spectrum. By Weyl's Theorem, this cannot happen in the gaps (on $S^1$) of the absolutely continuous spectrum of $U_0$. Therefore we focus on $\sigma(U_0)$. Depending on $k_0$ and $N$, we have for some finite $M > 0$,
\[ F = \sum_{|j|,|k|\le M} c_{j,k}|\varphi_j\rangle\langle\varphi_k|. \tag{6.19} \]
We know from (6.12) that $U_0$ is unitarily equivalent to the multiplication by a $2N \times 2N$ unitary matrix $V(x)$ on the decomposition (6.6), where $V(x)$ is a polynomial in $e^{\pm ix}$ whose eigenvalues are not constant in $x$. Therefore, $V(x)$ is analytic in a neighborhood of the real axis and we can write
\[ V(x) = \sum_{j=1}^{2N} P_j(x)\lambda_j(x), \tag{6.20} \]
where the eigenprojections $P_j$ and eigenvalues $\lambda_j$ are analytic in a neighborhood of the real axis as well (see [K]). We know that
\[ \sigma(U_0) = \bigcup_{j=1}^{2N} \mathrm{Ran}\{\lambda_j(x),\ x \in \mathbb{T}\}. \tag{6.21} \]
Note that
\[ F \simeq \sum_{|j|,|k|\le M} c_{j,k}|(e^{ijx})\rangle\langle(e^{ikx})|, \tag{6.22} \]
where the r.h.s. is to be understood as a multiplication operator on the decomposition (6.6) and $(e^{ijx})$ is a vector in $\mathbb{C}^{2N}$ with zero elements except at some line, depending on $j$, where the entry is $e^{ijx}$.

We follow the perturbation theory of unitary operators presented in [KK] to study the unitary operator $U_1 \equiv U_0 + F$. Let $\zeta = \rho e^{i\beta}$ with $\rho \ne 1$ and $\beta \in \mathbb{T}$. We set for $j = 0, 1$,
\[ R_j(\zeta) = U_j(U_j - \zeta)^{-1} = (\mathbb{I} - \zeta U_j^*)^{-1}, \tag{6.23} \]
\[ G(\zeta) = \mathbb{I} + \zeta(U_1^* - U_0^*)R_1(\zeta) = (\mathbb{I} + \zeta(U_0^* - U_1^*)R_0(\zeta))^{-1}. \tag{6.24} \]
These quantities are analytic in $\mathbb{C} \setminus S^1$. We know from [KK] that for any vectors $f$, $g$,
\[ \lim_{\rho\to1^-} \langle g|\delta_\rho(E_j, \beta)f\rangle = \frac{d}{d\beta}\langle g|E_{a.c.,j}(\beta)f\rangle \quad \text{a.e. } \beta \in \mathbb{T},\ j = 0, 1, \tag{6.25} \]
where
\[ 2\pi\delta_\rho(E_j, \beta) = R_j(\zeta) - R_j(\zeta') \quad \text{with } \zeta' = 1/\bar{\zeta}, \tag{6.26} \]
and $E_{a.c.,j}(\beta)$ is the absolutely continuous part of the spectral projector of $U_j$ at $e^{i\beta}$. Also,
\[ \delta_\rho(E_1, \beta) = G(\zeta)^*\delta_\rho(E_0, \beta)G(\zeta) = \big((\mathbb{I} - \zeta F^*R_0(\zeta))^{-1}\big)^*\delta_\rho(E_0, \beta)(\mathbb{I} - \zeta F^*R_0(\zeta))^{-1}. \tag{6.27} \]
In order to get information on the nature of the spectral measure of $U_1$, it is sufficient to consider (6.25) on the cyclic subspace for $U_0$ generated by the range of $F^*$. Indeed, the spectral measures of $U_0$ and $U_1$ associated with vectors in the orthogonal complement of this subspace coincide and it is cyclic also for $U_1$. Let $P$ denote the projector on $\mathrm{Ran}\, F^*$. We first note that $\mathbb{I} - \zeta F^*R_0(\zeta)$ is invertible on $\mathrm{Ran}\, P$ if and only if
\[ \det(P(\mathbb{I} - \zeta F^*R_0(\zeta))P) \ne 0 \tag{6.28} \]
and
\[ (\mathbb{I} - \zeta F^*R_0(\zeta)|_{\mathrm{Ran}\, P})^{-1} = (P(\mathbb{I} - \zeta F^*R_0(\zeta))P)^{-1}P. \tag{6.29} \]
So we need to consider the finite matrix whose elements are given for $|n|, |m| \le M$ by
\[ \langle\varphi_n|F^*R_0(\zeta)\varphi_m\rangle = \sum_{|j|\le M} \bar{c}_{j,n}\langle\varphi_j|U_0(U_0 - \zeta)^{-1}\varphi_m\rangle = \sum_{|j|\le M}\sum_{l=1}^{2N} \bar{c}_{j,n}\int_0^{2\pi} dx\, \Big\langle(e^{ijx}), \frac{P_l(x)\lambda_l(x)}{\lambda_l(x) - \zeta}(e^{imx})\Big\rangle, \tag{6.30} \]
where $\langle\cdot,\cdot\rangle$ denotes here the scalar product in $\mathbb{C}^{2N}$. Therefore, (6.30) is a finite sum of the form
\[ \sum_{l=1}^{2N}\int_0^{2\pi} dx\, \frac{f^{(l)}_{n,m}(x)}{\lambda_l(x) - \zeta}, \tag{6.31} \]
where $f^{(l)}_{n,m}$ is analytic in an open strip of finite width, independent of $l, n, m$, containing the real axis. Fix an $l \in \{1, \ldots, 2N\}$ and let $x_\beta \in \mathbb{T}$ be such that $e^{i\beta} = \lambda_l(x_\beta)$. There is only a finite number of such points. Assume $\lambda_l'(x_\beta) \ne 0$. Then we can deform the contour of integration in $x$ to control the integrals (6.31) when $\rho \to 1$ as follows. There exists a neighborhood $\mathbb{C} \supset N_\beta$ of $x_\beta$ which is mapped by $\lambda_l$ bijectively on its image $M_\beta$, which contains $e^{i\beta}$ in its interior. Let $D_\beta \subset M_\beta$ be a smooth deformation of the unit circle towards the exterior which avoids $e^{i\beta}$. Taking the inverse image $\lambda_l^{-1}(D_\beta) \subset N_\beta$ and connecting it at both ends with the real axis, we get a smooth path $C_\beta$ along which
\[ \int_0^{2\pi} \frac{f^{(l)}_{n,m}(x)}{\lambda_l(x) - \zeta}\, dx = \int_{C_\beta} dz\, \frac{f^{(l)}_{n,m}(z)}{\lambda_l(z) - \zeta}. \tag{6.32} \]
By construction, the last integral is now analytic in $\zeta$ in a neighborhood $\tilde{M}_\beta \subset M_\beta$ containing $e^{i\beta}$. Therefore, the matrix (6.30) has an analytic continuation in $\zeta$ in a neighborhood of $S^1$ except at a finite set of points. Hence there is only a countable set of points of $S^1$, call it $Z$, where the determinant (6.28) is zero. Then, for any $\psi = P\psi$ and any $e^{i\beta} \in S^1 \setminus Z$, we can write
\[ (\mathbb{I} - \zeta F^*R_0(\zeta))^{-1}\psi = \sum_{|k|\le M} d_k(\zeta)\varphi_k, \tag{6.33} \]
where the $d_k(\zeta)$'s are analytic in a neighborhood of $e^{i\beta}$ and the $\varphi_k$'s span the range of $F^*$. Thus, we deduce from the relation
\[ \langle\psi|\delta_\rho(E_1, \beta)\psi\rangle = \sum_{|k|,|j|\le M} \bar{d}_j(\zeta)d_k(\zeta)\langle\varphi_j|\delta_\rho(E_0, \beta)\varphi_k\rangle, \tag{6.34} \]
that the limit $\rho \to 1^-$ yields the derivative of an absolutely continuous measure on $S^1 \setminus Z$, as the limits $\lim_{\rho\to1^-}\langle\varphi_j|\delta_\rho(E_0, \beta)\varphi_k\rangle \in L^1(\mathbb{T})$. As a countable set of points cannot support a continuous measure, we get that $\sigma_{s.c.}(U_1) = \emptyset$.

Let us consider the point spectrum of $U^+$. From the relation (3.25), we get that the eigenvalues have multiplicity two at most. Except for a finite number of them, the transfer matrices $T(k)$ are periodic in $k$, of period $N$. Therefore we define
\[ R = T(k_0 + 1 + N)T(k_0 + N)\cdots T(k_0 + 1) \tag{6.35} \]
and set
\[ d(k) = \begin{pmatrix} c_{2k} \\ c_{2k+1} \end{pmatrix} \tag{6.36} \]
so that
\[ d(jN + k_0) = R^jd(k_0) = R^jT(k_0)T(k_0 - 1)\cdots T(2)d(1). \tag{6.37} \]
We will use the notation $D(j) = d(jN + k_0)$. Note also that $\det R = e^{i\kappa}$, where $\kappa \in \mathbb{T}$ is independent of $\lambda$, due to (3.24), and that the matrix $R$ is analytic in $\lambda$ since it is a polynomial in $e^{+i\lambda}$ and $e^{-i\lambda}$. Assume that an eigenvector of $U^+$ exists in $l^2(\mathbb{N}^*)$ for the eigenvalue $e^{i\lambda}$. This implies that the sequence $\{D(j)\}_{j\in\mathbb{Z}}$ belongs to $l^2$ at $+\infty$. We are thus led to the study of (6.37). This is done by means of the following elementary lemma whose proof we omit.

Lemma 6.1. Let $R$ be a $2 \times 2$ matrix with $|\det R| = 1$, and let $E_1$ be an eigenvalue of $R$. Consider $D(j) = R^jD(0)$, where $D(0) \in \mathbb{C}^2$. Then,
1) there exists $K > 0$ such that for all vectors $D(0)$ of norm 1, for all $j \in \mathbb{Z}$, $K \le \|D(j)\| \le |j|/K$ if and only if $|E_1| = 1$.
2) When $|E_1| \ne 1$, there exists another eigenvalue $E_2 \ne E_1$. We can assume $|E_1| > 1 > |E_2| = |E_1|^{-1}$ and we get
\[ D(j) = AE_1^jv_1 + BE_2^jv_2, \qquad j \in \mathbb{Z}, \]
where $v_1, v_2 \in \mathbb{C}^2$ are the corresponding eigenvectors of $R$ and $A, B \in \mathbb{C}$ are the coefficients of $D(0)$ in the basis they form.

A direct consequence is that $\{D(j)\} \in l^2(\mathbb{N})$ implies exponential decay at $+\infty$, thus $D(0) = v_2$ and any eigenvalue is simple. Now we use $D(0)$ as an initial vector to construct a generalized vector for the coherent operator $U_0$ on $l^2(\mathbb{Z})$. Note that, considered as a function of $\lambda$, $R(\lambda)$ is analytic in a neighborhood of the real axis; therefore, $E_1(\lambda)$ is analytic on $\mathbb{T}$, except at the finite set $X$ of exceptional points in $\mathbb{T}$ where the eigenvalues of $R(\lambda)$ cross. At such exceptional points, $|E_1| = 1$. Then observe that if the second statement of Lemma 6.1 is true for some $\lambda \in \mathbb{T}$, it is still true in a neighborhood of $\lambda$ by continuity. This implies that all generalized eigenvectors corresponding to eigenvalues
in the corresponding neighborhood grow exponentially at one end or the other. Due to Corollary 5.2, this can take place only in the resolvent set of U0 . Also, as the spectrum of U0 contains no isolated point, the argument above shows that X must belong to the closure of the set of points in T \ X, where |E1 (λ)| = 1. Therefore |E1 | is continuous on the whole of T and σ (U0 ) = |E1 |−1 ({1}). The band edges are also excluded from the point spectrum of U + , since they correspond to points where |E1 | = 1. We now study the number of eigenvalues of U + . The boundary condition that d(1) has to meet reads, according to (3.1), T˜ −1 d(1) = c1 b(λ), where c1 is the non zero first coefficient of the generalized eigenvector and irt −t 2 iλ 0 0 T˜ −1 = e−i(θ1 +θ2 +α2 −α1 ) , − e 10 r 2 itr 1 r b(λ) = eiλ − e−i(θ0 +θ1 +α1 ) , 0 it
(6.38)
(6.39) (6.40)
with $|\det \tilde{T}^{-1}| = 1$. Therefore, the condition to have an eigenvalue $e^{i\lambda}$ for $U^+$ is equivalent to
\[ b(\lambda) \parallel \tilde{T}^{-1} T^{-1}(2) T^{-1}(3) \cdots T^{-1}(k_0)\, v_2(\lambda), \tag{6.41} \]
where v2 (λ) is an eigenvector of R(λ) and all matrices involved are analytic in λ ∈ T. In other words, eiλ is an eigenvalue if and only if det(v2 (λ); T (k0 )T (k0 − 1) · · · T (2)T˜ b(λ)) ≡ det(v2 (λ); a(λ)) = 0,
(6.42)
where $a$ is analytic on $\mathbb{T}$ and $v_2$ can be chosen analytic on $\mathbb{T} \setminus X$, see [K]. Therefore, to show that the number of eigenvalues of $U^+$ is finite, it is enough to prove that, as a function of $\lambda$ on $\mathbb{T}$, the determinant above has finitely many zeros. This is a consequence of the next lemma, which we prove in the Appendix.

Lemma 6.2. If $\lambda_0 \in X$, the eigenvectors $v_j(\lambda)$, $j = 1, 2$, have at worst a square root branch point at $\lambda_0$.

It follows that the function $\lambda \mapsto \det(v_2(\lambda); a(\lambda))$ is analytic on $\mathbb{T} \setminus X$ and possesses square root branch point singularities at $X$ as well. Therefore it only possesses finitely many zeros on $\mathbb{T}$.

Finally, we show that $\sigma(U_0) \subset \sigma(U^+)$. Let $e^{i\lambda}$ be in the interior of the set $\sigma_{a.c.}(U_0)$ and consider the relation (3.25) yielding the coefficients $d(k)$ of the corresponding generalized eigenvector. Up to a finite number of transfer matrices $T(k_0)T(k_0 - 1) \cdots T(1)$, this relation is identical to that yielding the coefficients with positive indices of a corresponding generalized eigenvector for $U^+$. The discussion above shows that $d(k)$ is polynomially bounded at both ends, so that by Corollary 5.2, $e^{i\lambda}$ belongs to the spectrum of $U^+$ as well. This finishes the proof of the theorem.

Remark. In keeping with the last remark following the proof of Theorem 6.1, let $U_0$ denote a periodic band matrix of order $2p + 1$. Then, it is also true that a finite rank perturbation of the form (6.19) produces no singular continuous spectrum, since the first part of the above proof goes through without changes.
7. An Almost Periodic Example

In order to complete the picture of the spectral properties such matrices can possess, we briefly describe below an example of deterministic unitary band matrices which is almost periodic and purely singular continuous. This example is constructed in analogy with the random discrete Schrödinger case according to the approaches of Herman and Gordon, see e.g. [CFKS]. We consider again the matrix $M(\{\theta_k\}, \{\alpha_k\}, \{\gamma_k\})$, where the phases $\alpha_k$ are taken as constants, while the $\gamma_k$'s are arbitrary and can be replaced by $(-1)^{k+1}\alpha_k$, as above. The almost periodicity lies with the phases $\theta_k$ defined according to
\[ \theta_k = 2\pi\beta k + \theta, \qquad \forall k \in \mathbb{N}, \tag{7.1} \]
where $\beta$ is irrational, and $\theta \in [0, 2\pi[$. Consider the uniform measure $P_0$ on $\mathbb{T}$, and the translation $\tau : \mathbb{T} \to \mathbb{T}$ defined by
\[ \tau(\theta) = 2\pi\beta + \theta. \tag{7.2} \]
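Then the set of iterates $\tau^k$, $k \in \mathbb{Z}$, is ergodic. The corresponding transfer matrices $T(k)^\theta$ generated by this set of translations are then given by (see (4.8))
\[
\begin{aligned}
T(k)^\theta_{11} &= -e^{-i(\lambda+2\theta+8k\pi\beta-6\beta\pi)},\\
T(k)^\theta_{12} &= \frac{ir}{t}\left(e^{-i(\lambda+2\theta+8k\pi\beta-6\beta\pi)} - 1\right),\\
T(k)^\theta_{21} &= \frac{ir}{t}\left(e^{4i\pi\beta} - e^{-i(\lambda+2\theta+8k\pi\beta-6\beta\pi)}\right),\\
T(k)^\theta_{22} &= \frac{r^2}{t^2}\left(e^{4i\pi\beta} + 1 - e^{-i(\lambda+2\theta+8k\pi\beta-6\beta\pi)}\right) - \frac{1}{t^2}\, e^{i(\lambda+2\theta+8k\pi\beta-2\beta\pi)}.
\end{aligned}
\tag{7.3}
\]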
Following Herman [He], we first get the positivity of the Lyapunov exponent.

Proposition 7.1. Let $T(k)^\theta$ be the transfer matrices (7.3) at spectral parameter $\lambda \in \mathbb{T}$ corresponding to $U \equiv M(\{\theta_k\}, \{\alpha\}, \{\gamma_k\})$, where the $\theta_k$'s are given by (7.1). For $\beta$ irrational, the Lyapunov exponent $\gamma(\lambda)$ satisfies for almost all $\theta$:
\[ \gamma(\lambda) \ge \ln\frac{1}{t^2} > 0, \]
therefore $\sigma_{ac}(U) = \emptyset$.

Proof. We first note that the sub-additive ergodic theorem applies to $F_N(\theta) = \ln \left\| \prod_{k=1}^{N} T(k)^\theta \right\|$ and, since $\tau$ is ergodic,
\[ \lim_{N\to\infty} \frac{F_N(\theta)}{N} = \gamma(\lambda) \tag{7.4} \]
almost surely with respect to $P_0$. Setting $z = e^{-i\theta}$, we write our transfer matrices $T(k, z)$, making the dependence on $z \in \mathbb{C}^*$ explicit, and we define three matrices $(R_j(k))$, $j = -2, 0, 2$, by
\[ T(k, z) \equiv z^2 R_2(k) + R_0(k) + z^{-2} R_{-2}(k), \tag{7.5} \]
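where
\[ R_2(k) = e^{-i(\lambda+8k\pi\beta-6\beta\pi)} \begin{pmatrix} -1 & \frac{ir}{t} \\ -\frac{ir}{t} & -\frac{r^2}{t^2} \end{pmatrix}, \tag{7.6} \]
\[ R_0(k) = \begin{pmatrix} 0 & -\frac{ir}{t} \\ \frac{ir}{t}\, e^{4i\pi\beta} & \frac{r^2}{t^2}\left(e^{i4\pi\beta} + 1\right) \end{pmatrix}, \tag{7.7} \]
\[ R_{-2}(k) = -\frac{1}{t^2}\, e^{i(\lambda+8k\pi\beta-2\beta\pi)} \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}. \tag{7.8} \]
Then we consider $S_k(z) = z^2 T(k, z)$, which is analytic in $\mathbb{C}$ and such that if $z \in S^1$, $\|S_k(z)\| = \|T_k(z)\|$, $\forall k \in \mathbb{Z}$. Hence, the function $\ln \left\| \prod_{k=1}^{N} S_k(z) \right\|$ is sub-harmonic and, as $S_k(0) = R_{-2}(k)$, we get the estimate
\[ \frac{1}{2\pi} \int_0^{2\pi} \ln \left\| \prod_{k=1}^{N} S_k(e^{i\theta}) \right\| d\theta \ \ge\ \ln \left\| \prod_{k=1}^{N} S_k(0) \right\| = N \ln \frac{1}{t^2}. \tag{7.9} \]
We finally note that (7.4) implies
\[ \gamma(\lambda) = \lim_{N\to+\infty} \frac{1}{N} \int_0^{2\pi} \ln \left\| \prod_{k=1}^{N} T_k(e^{i\theta}) \right\| \frac{d\theta}{2\pi}. \tag{7.10} \]
The second statement then follows from Theorem 5.3.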
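The estimate (7.9) can be checked numerically. The sketch below is only an illustration, not part of the proof: it assumes the reading of (7.6)-(7.8) as the 2 × 2 arrays coded in `coefficient_matrices`, and the function names, the sample values of r, t, λ, β and the grid size are our own choices. It compares an equal-weight average of ln‖S_N(z)···S_1(z)‖ over the unit circle with the value at z = 0.

```python
import numpy as np

def coefficient_matrices(k, lam, beta, r, t):
    """R_2(k), R_0(k), R_{-2}(k) as we read them from (7.6)-(7.8) (an assumption)."""
    phase_minus = np.exp(-1j * (lam + 8 * k * np.pi * beta - 6 * beta * np.pi))
    phase_plus = np.exp(1j * (lam + 8 * k * np.pi * beta - 2 * beta * np.pi))
    w = np.exp(4j * np.pi * beta)
    R2 = phase_minus * np.array([[-1, 1j * r / t], [-1j * r / t, -(r / t) ** 2]])
    R0 = np.array([[0, -1j * r / t], [1j * r / t * w, (r / t) ** 2 * (w + 1)]])
    Rm2 = -(phase_plus / t**2) * np.array([[0, 0], [0, 1]])
    return R2, R0, Rm2

def log_norm_product(z, N, lam, beta, r, t):
    """ln || S_N(z) ... S_1(z) || with S_k(z) = z^4 R_2(k) + z^2 R_0(k) + R_{-2}(k)."""
    P = np.eye(2, dtype=complex)
    for k in range(1, N + 1):
        R2, R0, Rm2 = coefficient_matrices(k, lam, beta, r, t)
        P = (z**4 * R2 + z**2 * R0 + Rm2) @ P
    return float(np.log(np.linalg.norm(P, 2)))  # spectral norm

def circle_average(N, lam, beta, r, t, grid=512):
    """Equal-weight average of log_norm_product over `grid` points of the unit circle."""
    angles = 2 * np.pi * np.arange(grid) / grid
    values = [log_norm_product(np.exp(1j * a), N, lam, beta, r, t) for a in angles]
    return sum(values) / grid

r, t = 0.6, 0.8                          # sample values with r^2 + t^2 = 1
N, lam, beta = 5, 0.3, np.sqrt(2) - 1    # hypothetical spectral parameter and frequency
print(circle_average(N, lam, beta, r, t), ">=", N * np.log(1 / t**2))
```

The value at z = 0 involves only the matrices R_{-2}(k), whose product has norm t^{-2N}; sub-harmonicity is what pushes the circle average above it.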
Next, we adapt the argument of Gordon to our setting in order to exclude eigenvalues in $\sigma(U)$, for $\beta$ a Liouville number, that is, if for any $k \in \mathbb{N}$ there exist $p_k, q_k \in \mathbb{N}$ such that
\[ |\beta - p_k/q_k| \le k^{-q_k}. \tag{7.11} \]

Proposition 7.2. Assume the phases $(\theta_k)$ are given by (7.1) and $(\alpha_k)$ are zero. Moreover, suppose $\beta$ is a Liouville number and $((\theta^m)_k)$ a family of periodic sequences of period $q_m$. For each sequence, the corresponding family of transfer matrices (7.3) is denoted by $(T(k)^\theta)_{k\in\mathbb{Z}}$ and $(T_m^\theta(k))_{k\in\mathbb{Z}}$ respectively. Assume the period of the sequence $(\theta^m)$ obeys $\lim_{m\to+\infty} q_m = +\infty$ and the following estimates hold:
\[ \sup_{k,m} \|T_m^\theta(k)\| < \infty, \qquad \sup_{|k|\le 2q_m} \|T^\theta(k) - T_m^\theta(k)\| \le C m^{-q_m}. \]
Then, any non-zero solution $\phi = \sum_k c_k \varphi_k$ of $U\phi = e^{i\lambda}\phi$ satisfies
\[ \limsup_{|k|\to+\infty} \frac{c_{k+1}^2 + c_k^2}{c_1^2 + c_0^2} \ge \frac{1}{4}. \tag{7.12} \]
Its proof is identical to that given in [CFKS], Theorem 10.3, noting that the norm of any transfer matrix (7.3) is bounded by a constant depending on $t$, $r$ only.

Theorem 7.1. Let $U$ be as in Proposition 7.1. If $\beta$ is a Liouville number, then for a.e. $\theta$, $U$ is purely singular continuous.
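Proof. Let $\beta$ be a Liouville number. It can be approximated by a sequence of irreducible fractions $p_m/q_m$ obeying (7.11). Define the sequence $((\theta^m)_k)$ by: $\forall k \in \mathbb{Z}$, $(\theta^m)_k = 2\pi \frac{p_m}{q_m} k + \theta$. A simple computation shows that, $\forall k \in \mathbb{Z}$,
\[
\begin{aligned}
T^\theta(k) - T_m^\theta(k) = {} & 2i \sin\!\Big((4k-3)\pi\big(\beta - \tfrac{p_m}{q_m}\big)\Big)\, e^{-i\left(\lambda+2\theta+(4k-3)\pi\left(\beta+\frac{p_m}{q_m}\right)\right)} \begin{pmatrix} 1 & -irt^{-1} \\ irt^{-1} & r^2t^{-2} \end{pmatrix} \\
& - 2i \sin\!\Big((4k-1)\pi\big(\beta - \tfrac{p_m}{q_m}\big)\Big)\, e^{i\left(\lambda+2\theta+(4k-1)\pi\left(\beta+\frac{p_m}{q_m}\right)\right)} \begin{pmatrix} 0 & 0 \\ 0 & t^{-2} \end{pmatrix}.
\end{aligned}
\tag{7.13}
\]
Then, $\beta$ being a Liouville number allows us to check the hypotheses of Proposition 7.2. Therefore, the generalized eigenvalue equation cannot have an $l^2$ solution and the point spectrum of $U$ is empty. Combining this result with Proposition 7.1 yields the conclusion.

8. Appendix

Proof of Lemma 3.1. Assume $U$ is unitary and, for all $k \in \mathbb{Z}$,
\[ U\varphi_k = \alpha_k \varphi_{k-1} + \beta_k \varphi_k + \gamma_k \varphi_{k+1}, \tag{8.1} \]
so that
\[
U = \begin{pmatrix}
\ddots & \alpha_{k-1} & & & \\
 & \beta_{k-1} & \alpha_k & & \\
 & \gamma_{k-1} & \beta_k & \alpha_{k+1} & \\
 & & \gamma_k & \beta_{k+1} & \ddots \\
 & & & \gamma_{k+1} & \ddots
\end{pmatrix}.
\tag{8.2}
\]
Then, for all $k \in \mathbb{Z}$,
\[
\begin{aligned}
|\alpha_k|^2 + |\beta_k|^2 + |\gamma_k|^2 &= 1, \\
|\alpha_{k+1}|^2 + |\beta_k|^2 + |\gamma_{k-1}|^2 &= 1, \\
\alpha_k \beta_{k-1}^* + \beta_k \gamma_{k-1}^* &= 0, \\
\gamma_{k-1} \beta_{k-1}^* + \beta_k \alpha_k^* &= 0, \\
\alpha_k \gamma_k^* &= 0, \\
\alpha_k \gamma_{k-2}^* &= 0.
\end{aligned}
\tag{8.3}
\]
Let us start by noting that $|\beta_{k_0}| = 1$ is equivalent to $\alpha_{k_0} = \gamma_{k_0} = \alpha_{k_0+1} = \gamma_{k_0-1} = 0$, which creates an isolated $1 \times 1$ block in the matrix structure of $U$. We assume now that one off-diagonal element is non-zero. By considering the transpose of $U$ instead of $U$ if necessary, we assume without loss of generality that $\alpha_{k_0} \neq 0$. The last two relations impose $\gamma_{k_0} = \gamma_{k_0-2} = 0$ and the two middle ones yield
\[ |\beta_{k_0} \alpha_{k_0+1}| = |\beta_{k_0-1} \alpha_{k_0-1}| = |\beta_{k_0+1} \alpha_{k_0+1}| = |\beta_{k_0-2} \alpha_{k_0-1}| = 0. \tag{8.4} \]
On the one hand, if $\beta_{k_0} \neq 0$, then $\beta_{k_0-1} \neq 0$. Otherwise we would get from the first two relations in (8.3) $|\alpha_{k_0}| = 1$ and $\beta_{k_0} = 0$. Hence, from (8.4), $\alpha_{k_0-1} = \alpha_{k_0+1} = 0$, showing that an isolated block of the form
\[ \begin{pmatrix} \beta_{k_0-1} & \alpha_{k_0} \\ \gamma_{k_0-1} & \beta_{k_0} \end{pmatrix} \tag{8.5} \]
exists in the matrix $U$.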
If, on the other hand, $\beta_{k_0} = 0$, together with $\alpha_{k_0} \neq 0$ this implies $|\alpha_{k_0}| = 1$ and, in turn, $\beta_{k_0-1} = 0$. We first assume $\gamma_{k_0-1} \neq 0$. Hence the last two equations in (8.3) yield $\alpha_{k_0-1} = \alpha_{k_0+1} = 0$, which again yields an isolated block of the form
\[ \begin{pmatrix} 0 & \alpha_{k_0} \\ \gamma_{k_0-1} & 0 \end{pmatrix} \tag{8.6} \]
in $U$ with $|\gamma_{k_0-1}| = |\alpha_{k_0}| = 1$. If $\gamma_{k_0-1} = 0$ and $|\alpha_{k_0}| = 1$, we get $|\alpha_{k_0-1}| = |\alpha_{k_0+1}| = 1$. In turn, this imposes $\beta_{k_0-1} = \gamma_{k_0-1} = 0$ and $\beta_{k_0+1} = \gamma_{k_0+1} = 0$. Thus, $U$ is of the form
\[
U = \begin{pmatrix}
\ddots & \alpha_{k_0-1} & & & \\
 & 0 & \alpha_{k_0} & & \\
 & & 0 & \alpha_{k_0+1} & \\
 & & & 0 & \ddots
\end{pmatrix}
\tag{8.7}
\]
and is therefore unitarily equivalent to the shift operator, using a unitary defined similarly to (3.18). Hence, except in the last case, iteration of the above arguments shows that $U$ has the block structure announced.

Proof of Lemma 4.1. We can set the value $\lambda$ at zero without loss of generality. Let
\[ \Phi_u(n) = E(e^{-in\theta}) = \delta_{n,0} \tag{8.8} \]
be the characteristic function of the common uniform distribution of the phases $\theta_k$ and $\alpha_k$. Consider the characteristic function of the set of random vectors $\{\delta_{k_1}, \delta_{k_2}, \cdots, \delta_{k_j}\}$ given by
\[
\begin{aligned}
\Phi_{\delta_{k_1},\delta_{k_2},\cdots,\delta_{k_j}}(n_1, n_2, \cdots, n_j) &= E\big(\exp(-i(n_1 \cdot \delta_{k_1} + n_2 \cdot \delta_{k_2} + \cdots + n_j \cdot \delta_{k_j}))\big) \\
&= E\big(\exp(-i(n_{11}\theta_{2k_1} + (n_{11}+n_{21})\theta_{2k_1-1} + n_{21}\theta_{2k_1-2} + \cdots \\
&\qquad\quad + n_{1j}\theta_{2k_j} + (n_{1j}+n_{2j})\theta_{2k_j-1} + n_{2j}\theta_{2k_j-2}))\big) \\
&\quad \times E\big(\exp(-i(n_{11}\alpha_{2k_1} + (n_{21}-n_{11})\alpha_{2k_1-1} - n_{21}\alpha_{2k_1-2} + \cdots \\
&\qquad\quad + n_{1j}\alpha_{2k_j} + (n_{2j}-n_{1j})\alpha_{2k_j-1} - n_{2j}\alpha_{2k_j-2}))\big),
\end{aligned}
\tag{8.9}
\]
where $n_k = (n_{1k}, n_{2k}) \in \mathbb{Z}^2$. We used independence of the $\theta$'s and $\alpha$'s to factorize the expectations over these random variables. We can assume the $k_l$'s are ordered and we deal with the $\theta$'s only. The argument is similar for the $\alpha$'s. From the expression above, one sees that one can factorize the expectations over $\theta_l$ with $l \le 2k_r$ from those with $l \ge 2k_{r+1} - 2$ as soon as $k_r + 1 < k_{r+1}$. Therefore, it is enough to consider consecutive indices $k_1 = m$, $k_2 = m + 1$, ..., $k_j = m + j - 1$. As (8.8) shows, in such a case, the expectation over the $\theta$'s equals zero unless
\[ n_{11} = 0, \quad n_{11} + n_{21} = 0, \ \ldots, \ n_{1j} + n_{2j} = 0, \quad n_{2j} = 0, \tag{8.10} \]
when it equals one. But this is equivalent to $n_{lk} = 0$ for all $k = 1, \ldots, j$, $l = 1, 2$. Hence, we have proven that
\[ \Phi_\delta(n) = \Phi_u(n_1)\Phi_u(n_2) \tag{8.11} \]
for $j = 1$ and
\[ \Phi_{\delta_{k_1},\delta_{k_2},\ldots,\delta_{k_j}}(n_1, n_2, \ldots, n_j) = \Phi_{\delta_{k_1}}(n_1)\Phi_{\delta_{k_2}}(n_2) \cdots \Phi_{\delta_{k_j}}(n_j), \tag{8.12} \]
which is equivalent to independence of the random vectors $\delta_{k_1}, \delta_{k_2}, \cdots, \delta_{k_j}$.
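The mechanism behind this factorization can be seen on a finite toy model. The sketch below is a discrete analogue of our own making, not the setting of the lemma: the torus is replaced by the cyclic group Z_M, the variables X_0, X_1, X_2 are i.i.d. with a given probability vector p, and we look at the overlapping sums Y_1 = X_1 + X_0 and Y_2 = X_2 + X_1 (mod M). All names are hypothetical. For uniform p the joint law of (Y_1, Y_2) equals the product of its marginals; for the non-uniform sample law used here it does not.

```python
from itertools import product

def joint_pmf(p):
    """Joint law of (Y1, Y2) = (X1 + X0, X2 + X1) mod M for i.i.d. X_j with pmf p on Z_M."""
    M = len(p)
    joint = [[0.0] * M for _ in range(M)]
    for x0, x1, x2 in product(range(M), repeat=3):
        joint[(x1 + x0) % M][(x2 + x1) % M] += p[x0] * p[x1] * p[x2]
    return joint

def dependence(p):
    """Largest gap between the joint pmf of (Y1, Y2) and the product of its marginals."""
    M = len(p)
    joint = joint_pmf(p)
    marg1 = [sum(row) for row in joint]
    marg2 = [sum(joint[a][b] for a in range(M)) for b in range(M)]
    return max(abs(joint[a][b] - marg1[a] * marg2[b])
               for a in range(M) for b in range(M))

print(dependence([1 / 3, 1 / 3, 1 / 3]))  # uniform law: gap at rounding level
print(dependence([0.7, 0.2, 0.1]))        # non-uniform law: visibly positive gap
```

In the uniform case each pair (y_1, y_2) is reached by exactly M triples (x_0, x_1, x_2), one for each value of x_1, which is the counting counterpart of the vanishing of Φ_u(n) for n ≠ 0.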
Proof of Lemma 4.2. Let us consider $Y_k = Y_k^+ = X_k + X_{k-1}$ only, the other case being similar. Let the measure $\mu_X$ denote the distribution of the $X_k$'s. Then the $Y_k$'s are identically distributed according to the measure $\mu_Y = \mu_X * \mu_X$. Let $\Phi_X$ be the characteristic function of the random variable $X$. Then $\Phi_Y(n) = \Phi_X^2(n)$. Given Lemma 4.1, we need only prove that independence of the $Y_k$'s imposes that $\mu_X$ is uniform on the torus. If the $Y_k$'s are independent, the characteristic function of the variables $\{Y_k, Y_{k+1}\}$ must satisfy for all $(n_1, n_2) \in \mathbb{Z}^2$,
\[
\begin{aligned}
\Phi_{Y_k,Y_{k+1}}(n_1, n_2) &= E(\exp(-in_1(X_k + X_{k-1}) - in_2(X_{k+1} + X_k))) \\
&= \Phi_X(n_1)\Phi_X(n_1 + n_2)\Phi_X(n_2) \equiv \Phi_X^2(n_1)\Phi_X^2(n_2).
\end{aligned}
\tag{8.13}
\]
In case $\Phi_X(n_1)\Phi_X(n_2) = 0$, this relation is fulfilled. Otherwise, we have for all other cases
\[ \Phi_X(n_1)\Phi_X(n_2) = \Phi_X(n_1 + n_2). \tag{8.14} \]
If $N$ is the smallest positive integer such that $\Phi_X(N) \neq 0$, we get that
\[ 1 = \Phi_X(0) = \Phi_X(N)\Phi_X(-N) = |\Phi_X(N)|^2 \iff \Phi_X(N) = e^{-i\nu}, \tag{8.15} \]
for some $\nu \in \mathbb{T}$. Iteration of (8.14) implies that for any $m \in \mathbb{Z}$,
\[ \Phi_X(mN) = e^{-im\nu}. \tag{8.16} \]
One checks with (8.14) that there can be no integer $M > N$, $M \neq kN$, $k \in \mathbb{N}$, such that $\Phi_X(M) \neq 0$. That implies that $\mu_X = \delta(x - \nu/N)$, which is a contradiction to our hypothesis. Hence we must have $\Phi_X(n) = 0$ for all $n \neq 0$, which corresponds to a uniform distribution.

Proof of Theorem 5.2. We develop here the arguments yielding polynomially bounded generalized eigenfunctions associated with spectral parameters in the spectrum of $U$. We state the starting point result, Theorem C.5.1 in [S], specialized to our setting.

Theorem 8.1. Let $H$ be a separable Hilbert space. Assume that to any Borel set $\Delta \subset [0, 2\pi[$ we have a positive trace class operator $A(\Delta)$ on $H$ satisfying the condition: if $\Delta = \cup_{n=1}^{+\infty} \Delta_n$ with $\Delta_i \cap \Delta_j = \emptyset$ for $i \neq j$, then $A(\Delta) = \text{s-}\lim \sum_n A(\Delta_n)$. Then there exists a Borel measure $d\rho$ and a positive, trace class, operator valued measurable function $a(\lambda)$ such that:
• $\forall \phi \in H$, $\langle \phi | A(\Delta)\phi \rangle = \int_\Delta \langle \phi | a(\lambda)\phi \rangle\, d\rho(\lambda)$.
• $\mathrm{Tr}(a(\lambda)) = 1$, $d\rho$-a.e.
These two conditions characterize the operator valued function $a$.
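Let us introduce weighted $l^2(\mathbb{Z})$ spaces
\[ l^2_\delta(\mathbb{Z}) = \Big\{ \phi = (\phi_n)_n \in l^2(\mathbb{Z}^*);\ \sum_{n\in\mathbb{Z}} \langle n \rangle^{\delta} |\phi_n|^2 < +\infty \Big\}, \tag{8.17} \]
where $\langle n \rangle = (1 + n^2)^{1/2}$. We prove the equivalent of Theorem C.5.2 in [S].

Proposition 8.1. Let $U$ be a unitary operator defined on $l^2(\mathbb{Z})$ and $\delta > 1$. Then there exists a spectral measure $d\rho$ and, for $d\rho$ almost all $\lambda$, there exists a function $F_{\cdot,\cdot}(\lambda)$ defined on $\mathbb{Z} \times \mathbb{Z}$ such that:
• $F_{n,m}$ is measurable in $\lambda$.
• $\sum_{n,m} \langle n \rangle^{-\delta} |F_{n,m}(\lambda)|^2 \langle m \rangle^{-\delta} \le 1$, $d\rho$-a.e.
• $|F_{n,m}(\lambda)| \le C \langle n \rangle^{\frac{\delta}{2}} \langle m \rangle^{\frac{\delta}{2}}$.
• For any bounded Borel function $g$ on $S^1$, and for any vectors $\phi, \psi$ in $l^2_\delta(\mathbb{Z})$,
\[ \langle \phi | g(U)\psi \rangle = \int_{[0,2\pi[} g(\lambda) \sum_{n,m} F_{n,m}(\lambda)\, \phi_n^* \psi_m \, d\rho(\lambda). \tag{8.18} \]
• For any fixed $m$, $(U - e^{i\lambda})F_{\cdot,m}(\lambda) = 0$, where $F_{\cdot,m}(\lambda) = \sum_{n\in\mathbb{Z}} F_{n,m}(\lambda)\varphi_n$.

Proof. We denote the spectral projectors of $U$ by $(E(\Delta))_{\Delta\in B([0,2\pi[)}$, where $B([0, 2\pi[)$ denotes the Borel sets on the interval $[0, 2\pi[$. Let $x$ be the self adjoint operator, diagonal on the orthonormal basis $(\varphi_n)_{n\in\mathbb{Z}}$, defined $\forall n \in \mathbb{Z}$ by $x\varphi_n = \langle n \rangle \varphi_n$. The operators $(A(\Delta))_{\Delta\in B([0,2\pi[)}$ defined $\forall \Delta \in B([0, 2\pi[)$ by
\[ A(\Delta) = x^{-\frac{\delta}{2}} E(\Delta)\, x^{-\frac{\delta}{2}}, \tag{8.19} \]
are positive and trace class:
\[ \sum_{n\in\mathbb{Z}} \big\langle \varphi_n \big| x^{-\frac{\delta}{2}} E(\Delta)\, x^{-\frac{\delta}{2}} \varphi_n \big\rangle \le \sum_{n\in\mathbb{Z}} \langle n \rangle^{-\delta} \langle \varphi_n | E(\Delta)\varphi_n \rangle < +\infty. \tag{8.20} \]
By definition, the spectral family $E(\cdot)$ satisfies for any countable disjoint family $(\Delta_i)_{i\in I} \subset B([0, 2\pi[)$: $E(\cup_{i\in I} \Delta_i) = \text{s-}\lim \sum_{i\in I} E(\Delta_i)$. The operators $x^{-\delta}$ being bounded on $l^2(\mathbb{Z})$, we get $A(\cup_{i\in I} \Delta_i) = \text{s-}\lim \sum_{i\in I} A(\Delta_i)$. Hence $A(\cdot)$ is a Borel measure with values in positive, trace class operators, and Theorem 8.1 applies. Thus, $\forall (n, m) \in \mathbb{Z} \times \mathbb{Z}$, we get a function defined $d\rho$-a.e.,
\[ F_{n,m}(\lambda) = \big\langle \varphi_n \big| x^{\frac{\delta}{2}} a(\lambda)\, x^{\frac{\delta}{2}} \varphi_m \big\rangle = (\langle n \rangle \langle m \rangle)^{\frac{\delta}{2}} \langle \varphi_n | a(\lambda)\varphi_m \rangle = (\langle n \rangle \langle m \rangle)^{\frac{\delta}{2}} a_{n,m}(\lambda). \tag{8.21} \]
By construction, the functions $a_{n,m}$ (hence $F_{n,m}$) are measurable. Moreover,
\[ \sum_{n,m} |F_{n,m}(\lambda)|^2 (\langle n \rangle \langle m \rangle)^{-\delta} = \sum_{n,m} |a_{n,m}(\lambda)|^2 = \sum_n \|a(\lambda)\varphi_n\|^2 = \|a(\lambda)\|_2^2 \le \|a(\lambda)\|_1^2 = \mathrm{Tr}(a(\lambda))^2 = 1 \quad d\rho\text{-a.e.} \tag{8.22} \]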
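This implies the third statement. Let $A_i \subset [0, 2\pi[$ be a Borel set, $\chi_i : S^1 \to \mathbb{R}$ be its characteristic function and $\phi, \psi$ be two vectors of $l^2_\delta(\mathbb{Z})$. Then
\[
\begin{aligned}
\int_{[0,2\pi[} \chi_i(e^{i\lambda}) \sum_{n,m} \phi_n^* \psi_m F_{n,m}(\lambda)\, d\rho(\lambda) &= \sum_{n,m} \phi_n^* \psi_m \int_{[0,2\pi[} \chi_i(e^{i\lambda}) F_{n,m}(\lambda)\, d\rho(\lambda) \\
&= \sum_{n,m} \phi_n^* \psi_m \int_{A_i} F_{n,m}(\lambda)\, d\rho(\lambda) = \sum_{n,m} \phi_n^* \psi_m \langle \varphi_n | E(A_i)\varphi_m \rangle \\
&= \sum_{n,m} \phi_n^* \psi_m \Big\langle \varphi_n \Big| \int_{[0,2\pi[} \chi_i(e^{i\lambda})\, dE(\lambda)\, \varphi_m \Big\rangle = \langle \phi | \chi_i(U)\psi \rangle.
\end{aligned}
\tag{8.23}
\]
This result holds for step functions by linearity, and for bounded measurable functions on $[0, 2\pi[$. In particular, taking $g = \mathrm{id}$ and $\psi = \varphi_m$,
\[ \langle \phi | U\varphi_m \rangle = \int_{[0,2\pi[} e^{i\lambda} \sum_n F_{n,m}(\lambda)\phi_n^* \, d\rho(\lambda) = \int_{[0,2\pi[} \langle \phi | e^{i\lambda} F_{\cdot,m}(\lambda) \rangle\, d\rho(\lambda). \tag{8.24} \]
But,
\[
\begin{aligned}
\int_{[0,2\pi[} \langle \phi | U F_{\cdot,m}(\lambda) \rangle\, d\rho(\lambda) &= \sum_k \int_{[0,2\pi[} \phi_k^* (U F_{\cdot,m}(\lambda))_k \, d\rho(\lambda) \\
&= \sum_{k,j} \int_{[0,2\pi[} \phi_k^* U_{kj} F_{j,m}(\lambda)\, d\rho(\lambda) = \sum_{k,j} U_{kj} \phi_k^* \int_{[0,2\pi[} F_{j,m}(\lambda)\, d\rho(\lambda) \\
&= \sum_{k,j} U_{kj} \phi_k^* (\langle j \rangle \langle m \rangle)^{\frac{\delta}{2}} \int_{[0,2\pi[} a_{j,m}(\lambda)\, d\rho(\lambda) = \sum_{k,j} U_{kj} \phi_k^* (\langle j \rangle \langle m \rangle)^{\frac{\delta}{2}} A([0, 2\pi[)_{j,m} \\
&= \sum_{k,j} U_{kj} \phi_k^* E([0, 2\pi[)_{j,m} = \sum_{k,j} U_{kj} \phi_k^* \delta_{j,m} = \langle \phi | U\varphi_m \rangle.
\end{aligned}
\tag{8.25}
\]
It follows that $\forall m \in \mathbb{Z}$, $\forall \phi \in l^2_\delta(\mathbb{Z})$,
\[ \int_{[0,2\pi[} \langle \phi | U F_{\cdot,m}(\lambda) \rangle\, d\rho(\lambda) = \int_{[0,2\pi[} \langle \phi | e^{i\lambda} F_{\cdot,m}(\lambda) \rangle\, d\rho, \tag{8.26} \]
and thus
\[ \langle \phi | U F_{\cdot,m}(\lambda) \rangle = \langle \phi | e^{i\lambda} F_{\cdot,m}(\lambda) \rangle \quad d\rho\text{-a.e.} \tag{8.27} \]
At this point we can prove Theorem 5.2, following closely the arguments of [S]: Let $N(\lambda)$ be the rank of the Hilbert-Schmidt operator $a(\lambda)$, which is a measurable function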
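of $\lambda$. For all $\lambda$, there exists a set of orthogonal vectors [K], $(f_j(\lambda))_{j\in\{1,\ldots,N(\lambda)\}}$, such that $d\rho$-a.e.:
\[ a(\lambda) = \sum_{j=1}^{N(\lambda)} |f_j(\lambda)\rangle \langle f_j(\lambda)| \]
and
\[
\sum_{j=1}^{N(\lambda)} \|f_j(\lambda)\|^2 = \sum_{m,j} \frac{1}{\|f_m(\lambda)\|^2} \langle f_m(\lambda) | f_j(\lambda) \rangle \langle f_j(\lambda) | f_m(\lambda) \rangle = \sum_{m=1}^{N(\lambda)} \Big\langle \frac{f_m(\lambda)}{\|f_m(\lambda)\|} \Big| a(\lambda) \frac{f_m(\lambda)}{\|f_m(\lambda)\|} \Big\rangle = \mathrm{Tr}(a(\lambda)) = 1.
\]
In case of degeneracy of the spectrum, it is always possible [S] to choose the $f$'s so that they are measurable. It is enough to set now
\[ \phi_n(\lambda) = x^{\delta/2} f_n(\lambda), \quad \forall n \in \mathbb{Z}, \ \forall \lambda \in [0, 2\pi[, \qquad \Delta_n = \{\lambda;\ N(\lambda) = n\}. \tag{8.28} \]
The sets $\Delta_n$ are disjoint by construction. For any fixed $\lambda$, the vectors $\phi_j(\lambda)$ are linearly independent, as is easily checked. The conditions on the growth of the components of the vectors $\phi_j(\lambda)$ are consequences of their definitions and Proposition 8.1. By construction, $\forall k \in \mathbb{Z}$,
\[ \|f_j(\lambda)\|^2 (\phi_j(\lambda))_k = \sum_m \langle m \rangle^{-\delta} F_{k,m}(\lambda) (\phi_j(\lambda))_m. \tag{8.29} \]
Therefore, $\forall n \in \mathbb{Z}$, $\forall j \in \{1, \ldots, N(\lambda)\}$,
\[
\begin{aligned}
\langle \varphi_n | U \phi_j(\lambda) \rangle = \sum_k U_{nk} (\phi_j(\lambda))_k &= \frac{1}{\|f_j(\lambda)\|^2} \sum_{k,m} U_{nk} \langle m \rangle^{-\delta} F_{k,m}(\lambda) (\phi_j(\lambda))_m \\
&= \frac{1}{\|f_j(\lambda)\|^2} \sum_m \langle m \rangle^{-\delta} (\phi_j(\lambda))_m \langle \varphi_n | U F_{\cdot,m}(\lambda) \rangle.
\end{aligned}
\]
Using Proposition 8.1, it follows that the previous line equals
\[ \frac{1}{\|f_j(\lambda)\|^2} \sum_m \langle m \rangle^{-\delta} (\phi_j(\lambda))_m\, e^{i\lambda} \langle \varphi_n | F_{\cdot,m}(\lambda) \rangle = \langle \varphi_n | e^{i\lambda} \phi_j(\lambda) \rangle. \tag{8.30} \]
Thus, $\forall \phi \in l^2_\delta(\mathbb{Z})$, $\langle \phi | U \phi_j(\lambda) \rangle = \langle \phi | e^{i\lambda} \phi_j(\lambda) \rangle$, $d\rho$-a.e.

Proof of Lemma 6.2. Write
\[ R(\lambda) = \begin{pmatrix} a(\lambda) & b(\lambda) \\ c(\lambda) & d(\lambda) \end{pmatrix}, \tag{8.31} \]
where $a, b, c, d$ are analytic on $\mathbb{T}$ and $\det R(\lambda) = e^{i\kappa}$. The eigenvalues of $R(\lambda)$ are
\[ E_j(\lambda) = \frac{\mathrm{Tr}R(\lambda)}{2} + (-1)^j \sqrt{\frac{(\mathrm{Tr}R(\lambda))^2}{4} - e^{i\kappa}}, \qquad j = 1, 2, \tag{8.32} \]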
and the set $X$ consists of the zeros of $(\mathrm{Tr}R)^2 - 4e^{i\kappa}$. Let $\lambda = 0$ belong to $X$. We can assume that in a punctured neighborhood of 0, $b(\lambda) \neq 0$. Therefore, the eigenvectors can be chosen as
\[ v_j(\lambda) = \begin{pmatrix} b(\lambda) \\ E_j(\lambda) - a(\lambda) \end{pmatrix}. \tag{8.33} \]
Since
\[ (\mathrm{Tr}R(\lambda))^2/4 - e^{i\kappa} = \sum_{n\in\mathbb{N}} t_n \lambda^n \tag{8.34} \]
with $t_0 = 0$, $E_j$ and, in turn, $v_j$ admit convergent series expansions in non-negative powers of $\lambda^{1/2}$ in a neighborhood of 0.

Acknowledgements. We wish to thank Joachim Asch and Jean Brossard for helpful discussions. JH wishes to thank the Institut Fourier and AJ wishes to thank the University of Virginia for hospitality and support.
References

[A] Arnold, L.: Random Dynamical Systems. Berlin-Heidelberg-New York: Springer-Verlag, 1998
[ADE] Asch, J., Duclos, P., Exner, P.: Stability of driven systems with growing gaps, quantum rings, and Wannier ladders. J. Stat. Phys. 92, 1053–1070 (1998)
[BB] Blatter, G., Browne, D.: Zener tunneling and localization in small conducting rings. Phys. Rev. B 37, 3856 (1988)
[BG] Bambusi, D., Graffi, S.: Time Quasi-periodic unbounded perturbations of the Schrödinger operators and KAM methods. Commun. Math. Phys. 219, 465–480 (2001)
[BL] Bougerol, P., Lacroix, J.: Products of Random Matrices with Applications to Schrödinger Operators. Basel-Boston: Birkhäuser, 1985
[Be] Bellissard, J.: Stability and instability in quantum mechanics. In: Trends and Developments in the Eighties. S. Albeverio, Ph. Blanchard (eds), Singapore: World Scientific, 1985, pp. 1–106
[Bo] Bourget, O.: Floquet operators with singular continuous spectrum. J. Math. Anal. Appl., to appear
[CFKS] Cycon, H.L., Froese, R.G., Kirsch, W., Simon, B.: Schrödinger Operators. Berlin-Heidelberg-New York: Springer-Verlag, 1987
[CL] Carmona, R., Lacroix, J.: Spectral theory of Random Schrödinger Operators. Basel-Boston: Birkhäuser, 1990
[Co1] Combescure, M.: Spectral properties of a periodically kicked quantum Hamiltonian. J. Stat. Phys. 59, 679–690 (1990)
[Co2] Combescure, M.: Recurrent versus diffusive quantum behaviour for time dependent Hamiltonians. In: Operator theory: Advances and Applications, Vol. 57, 15–26. Basel-Boston: Birkhäuser Verlag, 1992
[DLSV] Duclos, P., Lev, O., Stovicek, P., Vittot, M.: Weakly regular Hamiltonians with pure point spectrum. Rev. Math. Phys. 14(6), 531–568 (2002)
[DS] Duclos, P., Stovicek, P.: Floquet Hamiltonians with pure point spectrum. Commun. Math. Phys. 177, 327–347 (1996)
[GY] Graffi, S., Yajima, K.: Absolute continuity of the Floquet spectrum for a nonlinearly forced harmonic oscillator. Commun. Math. Phys. 215(2), 245–250 (2000)
[He] Herman, M.: Une méthode pour minorer les exposants de Lyapunov et quelques exemples montrant le caractère local d'un théorème d'Arnold et Moser sur le tore en dimension 2. Comment. Math. Helv. 58, 453–502 (1983)
[Ho1] Howland, J.: Scattering theory for Hamiltonians periodic in time. Indiana J. Math. 28, 471–494 (1979)
[Ho2] Howland, J.: Floquet operators with singular continuous spectrum, I. Ann. Inst. H. Poincaré Phys. Théor. 49, 309–323 (1989); II, 49, 325–334 (1989); III, 69, 265–273 (1998)
[Ho3] Howland, J.: Perturbation theory of dense point spectra. J. Funct. Anal. 74(1), 52–80 (1987)
[J] Joye, A.: Absence of absolutely continuous spectrum of Floquet operators. J. Stat. Phys. 75, 929–952 (1994)
[K] Kato, T.: Perturbation Theory of Linear Operators. Berlin-Heidelberg-New York: Springer-Verlag, 1976
[KK] Kato, T., Kuroda, S.T.: Theory of simple scattering and eigenfunction expansions. In: Functional Analysis and Related Fields, F. Browder (ed.), Berlin-Heidelberg-New York: Springer-Verlag, 1970
[MT] Mneimné, R., Testard, F.: Introduction à la théorie des groupes de Lie classiques. Paris: Hermann, 1986
[N] Nenciu, G.: Floquet operators without absolutely continuous spectrum. Ann. Inst. H. Poincaré Phys. Théor. 59, 91–97 (1993); Adiabatic theory: Stability of systems with increasing gaps. Ann. Inst. H. Poincaré Phys. Théor. 67, 411–424 (1997)
[dO] de Oliveira, C.R.: On kicked systems modulated along the Thue-Morse sequence. J. Phys. A 27(22), 847–851 (1994)
[S] Simon, B.: Schrödinger semigroups. Bull. A.M.S. 7(3), 447–526 (1982)
[Y] Yajima, K.: Scattering theory for Schrödinger equations with potential periodic in time. J. Math. Soc. Japan 29, 729–743 (1977)
Communicated by B. Simon
Commun. Math. Phys. 234, 229–251 (2003) Digital Object Identifier (DOI) 10.1007/s00220-002-0763-7
Communications in
Mathematical Physics
Monstrous Branes
Ben Craps^1, Matthias R. Gaberdiel^2, Jeffrey A. Harvey^{1,3}
1 Enrico Fermi Institute, University of Chicago, 5640 S. Ellis Av., Chicago, IL 60637, USA. E-mail: [email protected]
2 Department of Mathematics, King's College London, Strand, London WC2R 2LS, U.K. E-mail: [email protected]
3 Department of Physics, University of Chicago, 5640 S. Ellis Av., Chicago, IL 60637, USA. E-mail: [email protected]
Received: 20 February 2002 / Accepted: 17 August 2002
Published online: 19 December 2002 – © Springer-Verlag 2002
Abstract: We study D-branes in the bosonic closed string theory whose automorphism group is the Bimonster group (the wreath product of the Monster simple group with Z2 ). We give a complete classification of D-branes preserving the chiral subalgebra of Monster invariants and show that they transform in a representation of the Bimonster. Our results apply more generally to self-dual conformal field theories which admit the action of a compact Lie group on both the left- and right-moving sectors. 1. Introduction The connection between the Monster sporadic group and modular functions known as moonshine is one of the most peculiar and mysterious facts in modern mathematics. Equally strange is the fact that the construction of a Monster module which most naturally encodes these connections uses the techniques and ideas of string theory [1]. There now exists a proof of the moonshine conjectures [2], and steps have been taken towards an explanation of their origin [3, 4]. However, a fully satisfactory conceptual explanation of the connection between the Monster, modular functions, and its appearance in string theory is still lacking. It may be that some new physical ideas and techniques will help to shed light on the situation. The current understanding of the Monster in string theory shows that, in a particular closed string background, which will be described later, the Bimonster acts as a symmetry group of the perturbative spectrum of the string theory. (The Bimonster is the wreath product of the Monster simple group with Z2 , i.e. two copies of the Monster group exchanged by an involution; see Subsect. 2.2 for more details.) Over the last several years it has been appreciated that string theory also contains non-perturbative states whose mass scales like 1/gs or 1/gs2 with gs the string coupling, and that these states play a fundamental role in understanding the structure of string theory [5]. 
In the bosonic string theory in which the Bimonster appears, the best understood non-perturbative states are Dirichlet branes or D-branes whose mass scales like 1/gs . The classification of D-brane
states in various string backgrounds has been an active area of research recently, and our aim here is to use the techniques that have been developed to at least partially classify the possible D-brane states in the bosonic string theory with Bimonster symmetry. In doing so we will provide evidence that the Bimonster extends from a symmetry of the perturbative spectrum to a symmetry of the full spectrum of the theory. The construction of the Monster module in [1] can be viewed, in string theory language, as the construction of an asymmetric orbifold of a special toroidal compactification of bosonic string theory. As a result we will need to utilise a number of results regarding the description of D-branes in orbifold conformal field theory. In the next subsection we shall therefore review briefly, following [6], some of the necessary background material.
1.1. Orbifolds, D-branes and conformal field theory. Let us begin by explaining the basic ideas underlying the orbifold construction [7]. Consider a closed string theory compactified on a manifold M on which a group Γ acts as a group of symmetries. Roughly speaking, the orbifold by Γ is the compactification on the quotient space M/Γ. If the action of the discrete group Γ on M is not free, i.e. if M has fixed points under the action of some elements in Γ, then the resulting space is singular. Despite such classical singularities, string theory is nevertheless well-behaved on such orbifolds. More specifically we can describe the orbifold theory as follows. Firstly, the theory consists of those states in the original space of states H that are invariant under the action of the orbifold group Γ. In addition, the theory has so-called twisted sectors containing strings that are closed in M/Γ but not in M. If the orbifold action has fixed points, the twisted sector states describe degrees of freedom that are localised at these fixed points; the presence of these additional states is the essential reason why string theory is well-behaved despite these singularities. The concept of an orbifold can be extended to a more general setting, where neither the underlying conformal field theory H nor the discrete symmetry group Γ need to have a direct geometric interpretation. (For example, in the case studied in this paper, Γ includes an asymmetric reflection, acting on the left-moving string coordinates only.) In this context the twisted sectors are then determined by the condition that the orbifold conformal field theory should be modular invariant. If Γ is abelian one finds that there is one twisted sector H_h for each element h ∈ Γ. Each twisted sector has to be projected again onto the states that are invariant under the action of the orbifold group Γ. Next, we turn to the description of D-branes on orbifolds.
Let us first consider the case where the orbifold has a geometric interpretation as M/Γ. Then we can construct D-branes as follows [8, 9]: we consider a D-brane on the covering space M, and add to it images under the action of Γ so as to obtain a Γ-invariant configuration of D-branes on M. We then restrict the resulting open string spectrum to those states that are invariant under the action of the orbifold group. A typical orbifold-invariant configuration will consist of |Γ| D-branes on the covering space. The resulting D-brane is then called a “bulk” brane, and it possesses moduli that describe its position on M. On the other hand, if the original D-brane is localised at a singular point of the orbifold, we need fewer preimages in the covering space to make an orbifold invariant configuration; such branes are then called “fractional” D-branes. Because they involve fewer preimage branes, fractional D-branes cannot move off the singular point; instead, a number of fractional D-branes have to come together in order for the system to be able to move off into the bulk.
For orbifolds that do not have a simple geometric interpretation, it is often useful to describe D-branes in terms of boundary states, using (boundary) conformal field theory methods.1 D-branes can be thought of as describing open string sectors that can be added consistently to a given closed string theory. From the point of view of conformal field theory, the construction of D-branes is therefore simply the construction of permissible boundary conditions. This problem has been studied for a number of years (for a recent review see for example [15]). In the conformal field theory approach, D-branes are described by coherent “boundary states” that can be constructed in the underlying (closed) string theory. These boundary states satisfy a number of consistency conditions, the most important of which is the so-called Cardy condition [16] (which we shall analyse in detail in Sect. 4). It arises from considering the annulus diagram for which the two boundary conditions are determined by two (possibly identical) D-branes, one for each boundary. This diagram can be given two interpretations, depending on which world-sheet coordinate is chosen as the worldsheet time. From the closed string point of view the diagram describes the tree-level exchange of closed string states between two sources (D-branes). On the other hand, the diagram can also be interpreted as a one-loop vacuum diagram of open strings with boundary conditions described by the two D-branes. The requirement that both the open and the closed string interpretations of the annulus diagram should be sensible imposes strong restrictions on the possible D-branes in a given closed string theory.
1.2. Outline. The paper is organised as follows. In Sect. 2 we review the construction of the Monster theory whose D-branes we want to study. In Sect. 3 we make use of the orbifold construction of the Monster theory to anticipate the presence of certain bulk and fractional branes in the D-brane spectrum. We then employ conformal field theory techniques in Sect. 4 in order to construct all D-branes that preserve the chiral subalgebra of Monster invariants. We demonstrate that the branes we construct satisfy all relative Cardy conditions, and we show that they are complete in a suitable sense. The D-branes are labelled by group elements in the Monster group M, and transform in the regular representation of both copies of M (that defines a representation of the Bimonster). As we shall explain, our analysis is actually valid for any self-dual conformal field theory which admits the action of a compact Lie group on both left- and right-moving sectors, and we therefore couch our arguments in this more general setting. In Sect. 5 we explain how the “geometrical” D-branes of Sect. 3 can be accounted for in terms of the more abstract conformal field theory construction. Finally, we end with some conclusions in Sect. 6.
2. The Monster Theory

2.1. The Monster conformal field theory. We are interested in the (bosonic) closed string theory whose spectrum is described by the tensor product of two (chiral) Monster conformal field theories,
\[ H = H_M \otimes H_M, \tag{2.1} \]
where $H_M$ is the (chiral) Monster theory,
\[ H_M = H_{\Lambda_L}/\mathbb{Z}_2. \tag{2.2} \]

1 Actually, the conformal field theory point of view is also powerful for geometric orbifolds; see for example [10–14].
Here $\Lambda_L$ is the Leech lattice, the (unique) even self-dual Euclidean lattice of dimension 24 that does not possess any points of length square 2 (see [17] for a good explanation of these matters), and $H_\Lambda$ is the holomorphic bosonic conformal field theory associated to the even, self-dual lattice $\Lambda$. The lattice theory $H_\Lambda$ is the conformal field theory that consists of the states of the form
\[ \prod_{i=1}^{n} \alpha^{j_i}_{-m_i} |p\rangle, \tag{2.3} \]
where $j_i \in \{1, \ldots, d\}$, $m_1 \ge m_2 \ge \cdots \ge m_n > 0$ and $p \in \Lambda$. Here $d$ is the dimension of the lattice, which equals the central charge of the conformal field theory $H_\Lambda$, $c = d$ with $d$ necessarily a multiple of 8, and the oscillators $\alpha^i_n$, $i \in \{1, \ldots, d\}$, $n \in \mathbb{Z}$ satisfy the standard commutation relations
\[ [\alpha^i_m, \alpha^j_n] = m\, \delta^{ij} \delta_{m,-n}. \tag{2.4} \]
More details about lattice conformal field theories can, for example, be found in [18, 19]. For the case under consideration $c = d = 24$, and the $\mathbb{Z}_2$ orbifold acts on the 24 oscillators $\alpha^i_n$, $i = 1, \ldots, 24$, by
\[ \alpha^i_n \mapsto -\alpha^i_n, \tag{2.5} \]
and on the ground states $|p\rangle$ with $p \in \Lambda_L$ as
\[ |p\rangle \mapsto |-p\rangle. \tag{2.6} \]
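As a small illustration of how the projection (2.5) acts, one can count oscillator monomials of the form (2.3) built on the single ground state |p = 0⟩, which is itself invariant under (2.6). The sketch below is our own bookkeeping exercise, with hypothetical function names: a monomial with n oscillators picks up the sign (−1)^n under (2.5), so the total count at level N is the coefficient of q^N in ∏_m (1 − q^m)^{−24}, the signed count is the coefficient in ∏_m (1 + q^m)^{−24}, and the Z_2-even count is half their sum. States with p ≠ 0 and the twisted sector are not included.

```python
def oscillator_counts(sign, d=24, order=4):
    """Coefficients of prod_{m>=1} (1 - sign*q^m)^(-d) up to q^order (integer arithmetic)."""
    coeffs = [1] + [0] * order
    for m in range(1, order + 1):
        for _ in range(d):
            # multiply in place by 1/(1 - sign*q^m) = sum_j (sign*q^m)^j
            for n in range(m, order + 1):
                coeffs[n] += sign * coeffs[n - m]
    return coeffs

total = oscillator_counts(+1)    # all oscillator monomials on |p = 0>
signed = oscillator_counts(-1)   # each monomial weighted by (-1)^(number of oscillators)
even = [(a + b) // 2 for a, b in zip(total, signed)]  # Z_2-even monomials

for level in range(4):
    print(level, total[level], even[level])
```

At level 1 all 24 monomials contain a single oscillator and are projected out, while at level 2 the 300 two-oscillator monomials survive and the 24 single-oscillator ones do not.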
The construction of this vertex operator algebra and the demonstration that the Monster acts as its automorphism group is due to Frenkel, Lepowsky and Meurman [1]; the embedding of this construction into conformal field theory has been discussed in [20–22].
The closed conformal field theory (2.1) can be described as the $\mathbb{Z}_2 \times \mathbb{Z}_2$ orbifold of the compactification of the closed bosonic string theory on the Leech torus, the quotient space $\mathbb{R}^{24}/\Lambda_L$, with a special background B-field turned on [20]. The two $\mathbb{Z}_2$'s can be taken to be the two asymmetric orbifolds that act on the left- and right-movers as described above. Alternatively, we can also think of the theory as an asymmetric $\mathbb{Z}_2$ orbifold (where $\mathbb{Z}_2$ acts as above on the left-movers only) of the geometric $\mathbb{Z}_2$ orbifold of the Leech theory. Because of the background B-field we have just mentioned, the generator of the asymmetric $\mathbb{Z}_2$ actually differs from the usual T-duality transformation that inverts all 24 directions: indeed, T-duality not only reflects the left-movers, but also rotates the right-movers by $D \equiv (G - B)^{-1}(G + B)$, where $G_{ij} = \delta_{ij}$ [23]. The distinction between the usual T-duality and the asymmetric $\mathbb{Z}_2$ we are considering here will be important later on in Sect. 3.
The untwisted sector of the chiral Monster theory (2.2) consists of those states of the lattice theory that are invariant under the action of (2.5) and (2.6). The twisted sector is created by the action of half-integrally moded oscillators $c^i_r$, $r \in \mathbb{Z} + \tfrac{1}{2}$, $i = 1, \ldots, 24$, satisfying
$$[c^i_r, c^j_s] = r\, \delta^{ij}\, \delta_{r,-s}, \tag{2.7}$$
on the irreducible representation of the Clifford algebra $\Gamma(\Lambda_L)$ associated to $\Lambda_L$. This Clifford algebra arises as a projective representation of $\Lambda_L/2\Lambda_L$, generated by $\pm\gamma_i$, where $i = 1, \ldots, 24$ labels a basis $k_i$ of $\Lambda_L$. The $\gamma_i$ satisfy the relations
$$\gamma_i \gamma_j = (-1)^{k_i \cdot k_j}\, \gamma_j \gamma_i, \qquad \gamma_i^2 = (-1)^{\frac{1}{2} k_i^2}. \tag{2.8}$$
Since $k_i^2 = 4$ for $\Lambda_L$, each element $\gamma_i$ squares to one. The irreducible representation of this Clifford algebra has dimension $2^{12}$, and thus the chiral theory has a degeneracy of $2^{12}$ in the twisted sector. As before in the untwisted sector, we also have to restrict the twisted sector states to be $\mathbb{Z}_2$-even, where the $\mathbb{Z}_2$ generator acts on the oscillators as
$$c^i_r \to -c^i_r, \tag{2.9}$$
and on the degenerate twisted sector ground state $|\chi\rangle$ as
$$|\chi\rangle \to -|\chi\rangle. \tag{2.10}$$
In the full theory (2.1) we then have a degeneracy of $2^{24}$ in the sector where both left- and right-moving oscillators are half-integrally moded; these correspond to (certain linear combinations of) the $2^{24}$ fixed points under the diagonal (geometrical) $\mathbb{Z}_2$ orbifold. Two additional sectors, where either the left- or the right-movers but not both are half-integrally moded, each have a degeneracy of $2^{12}$; these sectors are twisted by the $\mathbb{Z}_2$ acting on the left- or the right-movers, respectively.
2.2. The Monster group and some subgroups. The automorphism group of the chiral Monster conformal field theory is the so-called Monster group $\mathbb{M}$, the largest sporadic simple finite group. This is to say, for each $g \in \mathbb{M}$, we have an automorphism of the conformal field theory,
$$g : \mathcal{H}_M \to \mathcal{H}_M, \tag{2.11}$$
for which
$$g\, V(\psi, z)\, g^{-1} = V(g\psi, z), \tag{2.12}$$
where $V(\psi, z)$ is the vertex operator corresponding to the state $\psi \in \mathcal{H}_M$. Furthermore,
$$g\,|\Omega\rangle = |\Omega\rangle, \qquad g\,|\omega\rangle = g\, L_{-2}|\Omega\rangle = L_{-2}|\Omega\rangle. \tag{2.13}$$
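Here $L_n$ denote the modes of the Virasoro algebra,
$$[L_m, L_n] = (m - n)\, L_{m+n} + \frac{c}{12}\, m\,(m^2 - 1)\, \delta_{m,-n}, \tag{2.14}$$
$|\Omega\rangle$ is the vacuum vector and $|\omega\rangle = L_{-2}|\Omega\rangle$ is the conformal vector. Since $|\omega\rangle$ is invariant under the action of $g \in \mathbb{M}$, it follows from (2.12) that
$$[g, L_n] = 0 \quad \text{for all } g \in \mathbb{M},\ n \in \mathbb{Z}. \tag{2.15}$$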
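The step from (2.12) to (2.15) can be spelled out in one line. The sketch below assumes the standard mode expansion of the stress tensor, $V(\omega, z) = \sum_n L_n z^{-n-2}$, which is not written out in the text.

```latex
% Sketch: apply (2.12) to the conformal vector, assuming V(\omega,z) = \sum_n L_n z^{-n-2}.
\begin{align*}
g\, V(\omega, z)\, g^{-1} &= V(g\,\omega, z) = V(\omega, z)
  && \text{by (2.12) and (2.13)},\\
\sum_{n\in\mathbb{Z}} g\, L_n\, g^{-1}\, z^{-n-2} &= \sum_{n\in\mathbb{Z}} L_n\, z^{-n-2}
  && \Longrightarrow\quad g\, L_n\, g^{-1} = L_n \ \text{ for all } n .
\end{align*}
```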
In particular, this implies that each eigenspace of $L_0$ forms a representation of the Monster group $\mathbb{M}$. The theory has a single state with $h = 0$, the vacuum $|\Omega\rangle$, which transforms in the singlet representation of $\mathbb{M}$. There are no states with $h = 1$, and for $h = 2$ we have 196884 states that transform as
$$\mathbf{196884} = \mathbf{196883} + \mathbf{1}. \tag{2.16}$$
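As a cross-check of the number 196884, one can add up the $h = 2$ states sector by sector in the orbifold description, using the counts quoted in Sect. 5.1 ($24 \cdot 25/2$ oscillator states, 98280 momentum states and $24 \cdot 2^{12}$ twisted-sector states). The snippet is only an arithmetic sketch; the function name is illustrative.

```python
def count_h2_states():
    """Add up the h = 2 states of the chiral Monster theory by sector."""
    oscillator = 24 * 25 // 2  # symmetric bilinears alpha^i_{-1} alpha^j_{-1}|0>
    momentum = 98280           # Z2-even combinations (|p> + |-p>)/sqrt(2) with p^2 = 4
    twisted = 24 * 2**12       # twisted-sector states, count as quoted in Sect. 5.1
    return oscillator + momentum + twisted


total = count_h2_states()
print(total, total == 196883 + 1)
```

The three contributions are 300, 98280 and 98304, so the total agrees with (2.16).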
Indeed, as we have seen, the state $L_{-2}|\Omega\rangle$ is in the singlet representation of $\mathbb{M}$; the remaining states then transform in the smallest non-trivial (196883-dimensional) representation of the Monster group. This pattern persists at higher level [1].
The automorphism group of the full closed string theory (2.1) is then the so-called Bimonster group. The Bimonster is the wreath product of the Monster group $\mathbb{M}$ with $\mathbb{Z}_2$, i.e. the semi-direct product of $(\mathbb{M} \times \mathbb{M})$ with $\mathbb{Z}_2$, where the generator $\sigma$ of $\mathbb{Z}_2$ permutes the two copies of $\mathbb{M}$. (A neat presentation of the Bimonster in terms of Coxeter relations was conjectured by Conway, and subsequently proven by Norton [24].) Indeed, the two copies of the Monster group act on the left and right chiral theory separately, and $\sigma$ is the symmetry that exchanges left- and right-movers (combined with a shift of the B-field such that the background is preserved).
The Monster group contains a subgroup $C$ whose action on the Monster conformal field theory can be understood geometrically. This subgroup is an extension of the simple Conway group $(\cdot 1) = (\cdot 0)/\mathbb{Z}_2$ by an “extra-special” group denoted by $2_+^{1+24}$; one therefore writes $C = 2_+^{1+24}(\cdot 1)$. Since $C$ will play a role in Sects. 3 and 5, we now review its construction, following the exposition given in [19].
The Conway group $(\cdot 0)$ is the group of automorphisms of the Leech lattice,
$$(\cdot 0) = \mathrm{Aut}(\Lambda_L) = \{ R \in SO(24) : Rp \in \Lambda_L \text{ for } p \in \Lambda_L \}. \tag{2.17}$$
The centre of $(\cdot 0)$ contains one non-trivial element, the reflection map $p \to -p$ which we have used in the above orbifold construction. Since this symmetry acts (by construction) trivially on the orbifold theory, the automorphism group of the Monster conformal field theory only involves the quotient group by this reflection symmetry; this is the simple Conway group $(\cdot 1)$. Each element $R \in (\cdot 1)$ has a natural action on the oscillators, given by
$$\alpha^i_n \to R_{ij}\, \alpha^j_n, \qquad c^i_r \to R_{ij}\, c^j_r, \tag{2.18}$$
but the action on the ground states is ambiguous. This ambiguity is responsible for the extension of $(\cdot 1)$ mentioned above, as we now describe. Let us extend the $\gamma_i \equiv \gamma_{k_i}$ to being defined for arbitrary $k \in \Lambda_L$, where the generators $\gamma_k$ now satisfy
$$\gamma_k \gamma_l = (-1)^{k \cdot l}\, \gamma_l \gamma_k, \qquad \gamma_k \gamma_l = \varepsilon(k, l)\, \gamma_{k+l}, \tag{2.19}$$
and $\varepsilon(k, l)$ are suitable signs. Because of the first equation in (2.19), these signs must satisfy
$$\varepsilon(k, l) = (-1)^{k \cdot l}\, \varepsilon(l, k), \tag{2.20}$$
while the associativity of the algebra product implies that
$$\varepsilon(k, l)\, \varepsilon(k + l, m) = \varepsilon(k, l + m)\, \varepsilon(l, m). \tag{2.21}$$
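To see that signs obeying (2.20) and (2.21) exist, one can write down an explicit bilinear choice on a small even lattice and test both conditions by brute force. The sketch below uses the $A_2$ root lattice and the cocycle $\varepsilon(k, l) = (-1)^{\sum_{i<j} k_i G_{ij} l_j}$ in terms of integer coordinates; both the lattice and this particular cocycle are illustrative assumptions, not the choice made for the Leech lattice in the text.

```python
import itertools

# Gram matrix of the A2 root lattice (an even lattice); a toy stand-in for the Leech lattice.
GRAM = [[2, -1], [-1, 2]]
RANK = len(GRAM)


def dot(k, l):
    """Lattice inner product k.l in integer coordinates."""
    return sum(k[i] * GRAM[i][j] * l[j] for i in range(RANK) for j in range(RANK))


def sign(n):
    """Return (-1)^n for any integer n."""
    return -1 if n % 2 else 1


def eps(k, l):
    """Assumed bilinear cocycle: only the i < j part of the Gram matrix enters."""
    return sign(sum(k[i] * GRAM[i][j] * l[j]
                    for i in range(RANK) for j in range(RANK) if i < j))


def add(k, l):
    return tuple(a + b for a, b in zip(k, l))


# A finite box of lattice vectors on which the identities are tested.
VECTORS = list(itertools.product(range(-2, 3), repeat=RANK))


def satisfies_2_20():
    return all(eps(k, l) == sign(dot(k, l)) * eps(l, k)
               for k in VECTORS for l in VECTORS)


def satisfies_2_21():
    return all(eps(k, l) * eps(add(k, l), m) == eps(k, add(l, m)) * eps(l, m)
               for k in VECTORS for l in VECTORS for m in VECTORS)


print(satisfies_2_20(), satisfies_2_21())
```

Condition (2.21) holds automatically for any bilinear exponent, while (2.20) relies on the diagonal Gram entries being even.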
As we have explained before, the ground states of the twisted sector form an irreducible representation of the algebra (2.19). Each element $R \in (\cdot 1)$ gives rise to an automorphism of the gamma matrix algebra by
$$\gamma'_k = \gamma_{Rk}, \tag{2.22}$$
and this induces an automorphism of the corresponding representation. Since all irreducible representations of the Clifford algebra are isomorphic, there exists a unitary transformation $S$ so that
$$S\, \gamma_k\, S^{-1} = v_{R,S}(k)\, \gamma_{Rk}, \tag{2.23}$$
where $v_{R,S}(k) = \pm 1$. This transforms $\varepsilon(k, l)$ into
$$\varepsilon'(k, l) = \frac{v_{R,S}(k + l)}{v_{R,S}(k)\, v_{R,S}(l)}\, \varepsilon(k, l). \tag{2.24}$$
The action on the ground states is now defined by
$$|p\rangle \to v_{R,S}(p)\, |Rp\rangle, \qquad |\chi\rangle \to S\, |\chi\rangle, \tag{2.25}$$
where $|\chi\rangle$ denotes the $2^{12}$ ground states of the twisted sector. This construction also applies to elements $R \in (\cdot 0)$; for example, the generator of the $\mathbb{Z}_2$ defined by (2.5), (2.6), (2.9) and (2.10) corresponds to $R = -1$ and $S = -1$ with $v_{R,S}(k) = 1$. The unitary transformation $S$ that enters in (2.25) is only determined by (2.23) up to
$$S \to S\gamma, \tag{2.26}$$
where $\gamma = \pm\gamma_l$ for some $l \in \Lambda_L$, i.e. $\gamma \in \Gamma(\Lambda_L)$. For each $R \in (\cdot 1)$, there are therefore $|\Gamma(\Lambda_L)| = 2^{25}$ different choices for $S$, thus leading to the extension of $(\cdot 1)$ by $\Gamma(\Lambda_L) = 2_+^{1+24}$. In particular, for $R = e$ and $S = \gamma_l$, (2.23) and (2.19) imply that
$$v_{e,\gamma_l}(k) = (-1)^{k \cdot l}. \tag{2.27}$$
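For completeness, the short computation behind (2.27) reads as follows; it is a sketch that uses only the first relation in (2.19) and the definition (2.23) with $R = e$.

```latex
% Sketch: (2.23) with R = e and S = \gamma_l, using \gamma_l \gamma_k = (-1)^{k.l} \gamma_k \gamma_l.
\begin{align*}
\gamma_l\, \gamma_k\, \gamma_l^{-1}
  &= (-1)^{k\cdot l}\, \gamma_k\, \gamma_l\, \gamma_l^{-1}
   = (-1)^{k\cdot l}\, \gamma_k ,\\
S\, \gamma_k\, S^{-1} &= v_{e,S}(k)\, \gamma_k
  \quad\Longrightarrow\quad v_{e,\gamma_l}(k) = (-1)^{k\cdot l} .
\end{align*}
```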
The Monster group contains the involution $i$ which acts as $+1$ in the untwisted sector and as $-1$ in the twisted sector. Actually the centraliser of $i$ in $\mathbb{M}$, i.e. the subgroup that consists of those elements $g \in \mathbb{M}$ that commute with $i$, is precisely the group $C = 2_+^{1+24}(\cdot 1)$ that we have just described. In particular, $i$ is therefore an element of $C$; it corresponds to choosing $R = e$ in $(\cdot 1)$ and $S = -1$ in $\Gamma(\Lambda_L)$.
2.3. Partition function and McKay-Thompson series. The character or partition function of the (chiral) Monster theory is given as
$$\mathrm{Tr}_{\mathcal{H}_M}\, q^{L_0 - \frac{c}{24}} = j(\tau) - 744, \tag{2.28}$$
where $j(\tau)$ is the elliptic $j$-function,
$$j(\tau) = \frac{\Theta_{E_8}(\tau)^3}{\eta(\tau)^{24}} = q^{-1} + 744 + 196884\, q + 21493760\, q^2 + \cdots, \qquad q = e^{2\pi i \tau}. \tag{2.29}$$
Here $\Theta_{E_8}(\tau)$ is the theta function of the $E_8$ root lattice,
$$\Theta_\Lambda(\tau) = \sum_{x \in \Lambda} q^{\frac{1}{2} x^2}, \tag{2.30}$$
and $\eta(\tau)$ is the Dedekind eta-function,
$$\eta(\tau) = q^{\frac{1}{24}} \prod_{n=1}^{\infty} (1 - q^n). \tag{2.31}$$
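The first coefficients in (2.29) can be reproduced with exact integer power series. The sketch below assumes the standard identity $\Theta_{E_8}(\tau) = 1 + 240 \sum_{n \geq 1} \sigma_3(n)\, q^n$ (not stated in the text) and writes $j = q^{-1}\, \Theta_{E_8}^3 / \prod_n (1 - q^n)^{24}$; the helper names are illustrative.

```python
def sigma3(n):
    """Sum of the cubes of the divisors of n."""
    return sum(d**3 for d in range(1, n + 1) if n % d == 0)


def mul(a, b, order):
    """Multiply two power series given as coefficient lists, truncated at `order` terms."""
    out = [0] * order
    for i, x in enumerate(a):
        if x == 0:
            continue
        for j, y in enumerate(b):
            if i + j >= order:
                break
            out[i + j] += x * y
    return out


def j_coefficients(order=4):
    """Return [c_{-1}, c_0, c_1, ...]: the first `order` coefficients of j(tau)."""
    # Assumed identity: the E8 theta function equals the Eisenstein series E4.
    theta = [1] + [240 * sigma3(n) for n in range(1, order)]
    theta_cubed = mul(mul(theta, theta, order), theta, order)
    # Truncated product prod_{n>=1} (1 - q^n)^24, i.e. eta^24 without its prefactor q.
    prod = [1] + [0] * (order - 1)
    for n in range(1, order):
        factor = [0] * order
        factor[0], factor[n] = 1, -1
        for _ in range(24):
            prod = mul(prod, factor, order)
    # Invert the product as a power series (its constant term is 1).
    inv = [1] + [0] * (order - 1)
    for n in range(1, order):
        inv[n] = -sum(prod[k] * inv[n - k] for k in range(1, n + 1))
    # Index 0 of the result multiplies q^{-1} because of the prefactor in eta^24.
    return mul(theta_cubed, inv, order)


print(j_coefficients(4))
```

Subtracting the constant 744 from this list gives the expansion of the partition function (2.28).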
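The $j$-function (and therefore also (2.28)) is a modular function; this is to say, $j(\tau)$ is invariant under the action of $SL(2, \mathbb{Z})$, i.e.
$$j\!\left( \frac{a\tau + b}{c\tau + d} \right) = j(\tau), \qquad \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in SL(2, \mathbb{Z}). \tag{2.32}$$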
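The invariance (2.32) under the generator $\tau \to -1/\tau$ is easy to test numerically. The following sketch evaluates $j = \Theta_{E_8}^3/\eta^{24}$ in floating point, again assuming the Lambert-series form $\Theta_{E_8} = 1 + 240 \sum_n n^3 q^n/(1 - q^n)$; the sample point and the truncation at 60 terms are arbitrary choices.

```python
import cmath


def j_numeric(tau, terms=60):
    """Evaluate j(tau) from truncated q-series; adequate when Im(tau) is not small."""
    q = cmath.exp(2j * cmath.pi * tau)
    theta = 1.0   # running value of the assumed series for Theta_E8
    prod = 1.0    # running value of prod (1 - q^n)^24
    qn = 1.0      # running power q^n
    for n in range(1, terms + 1):
        qn *= q
        theta += 240 * n**3 * qn / (1 - qn)
        prod *= (1 - qn) ** 24
    return theta**3 / (q * prod)


tau = complex(0.3, 1.1)
relative_gap = abs(j_numeric(tau) - j_numeric(-1 / tau)) / abs(j_numeric(tau))
print(relative_gap < 1e-9, abs(j_numeric(1j) - 1728) < 1e-6)
```

The second check uses the classical value $j(i) = 1728$ at the fixed point of $\tau \to -1/\tau$.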
As we have explained above (see (2.15)), the action of the Monster group commutes with $L_0$. Thus it is natural to consider the so-called McKay-Thompson series
$$\chi_g(q) \equiv \mathrm{Tr}_{\mathcal{H}_M}\bigl( g\, q^{L_0 - 1} \bigr), \tag{2.33}$$
for every element $g \in \mathbb{M}$. For $g = e$, the identity element of the Monster group, (2.33) reduces to (2.28). Since the definition of (2.33) involves the trace over representations of $\mathbb{M}$, the McKay-Thompson series only depends on the conjugacy class of $g$ in $\mathbb{M}$. There are 194 conjugacy classes, and the first fifty terms in the power series expansions of (2.33) have been tabulated in [25].$^2$ The McKay-Thompson series have a number of remarkable properties. In particular, for each $g \in \mathbb{M}$, $\chi_g(q)$ is a Hauptmodul of a genus zero modular group. This is the key statement of the moonshine conjecture of Conway and Norton [26] that has now been proven by Borcherds [2]. (For a nice introduction to “monstrous moonshine” see [27].)
$^2$ Since the coefficients of $q^n$ in (2.33) are real (they are in fact all integers), the McKay-Thompson series for $g$ and $g^{-1}$ agree; in fact, there are 22 conjugacy classes with distinct inverses. Moreover, the two distinct classes of order 27 turn out to have the same McKay-Thompson series, so all in all there exist $194 - 22 - 1 = 171$ different McKay-Thompson series.
3. D-Branes of the Monster Conformal Field Theory: Some Examples
Our aim is to construct, and to some extent classify, the D-branes of the Monster theory. In particular, we would like to understand whether the D-brane states fall into representations of the Bimonster group. In the following we shall mainly concentrate on those D-branes that preserve the subalgebra $\mathcal{W}$ of the full (chiral) Monster vertex operator algebra that consists of the Monster invariant states in $\mathcal{H}_M$. As we shall see, a complete classification for these D-branes is possible, and one can show that they transform in a representation of the Bimonster group. This result will emerge as a special case of the much more general analysis in Sect. 4. That analysis combines well-known techniques from boundary conformal field theory [16] with mathematical results on vertex operator algebras [28]. In the present section, we use the description of the Monster conformal field theory as an orbifold of the Leech lattice to anticipate the presence of some of these boundary states. In Sect. 5 we shall then show how these examples fit into the analysis of Sect. 4.
3.1. Fractional D0–D24 at the origin. We are considering the $\mathbb{Z}_2 \times \mathbb{Z}_2$ orbifold of the Leech compactification. As we have mentioned in the paragraph following (2.6), we can think of this as an asymmetric $\mathbb{Z}_2$ orbifold of the geometric $\mathbb{Z}_2$ orbifold of the Leech theory. We are interested in those configurations of D-branes on the geometric $\mathbb{Z}_2$ orbifold of the Leech theory that are invariant under the asymmetric $\mathbb{Z}_2$ (which differs from T-duality by a rotation of the right-movers). The simplest such configuration is a fractional D0-brane sitting at the origin together with a D24-brane without Wilson lines. The corresponding D-brane in the orbifold theory is described by a boundary state
$$\frac{1}{\sqrt{2}} \Bigl( ||D0\rangle\rangle_U + ||D0\rangle\rangle_T + ||D24\rangle\rangle_U + ||D24\rangle\rangle_T \Bigr), \tag{3.1}$$
where the subscripts $U$ and $T$ denote components in the untwisted sector and in the sector twisted by the geometric $\mathbb{Z}_2$, respectively. Specifically, we have for the untwisted states
$$||D0\rangle\rangle_U = \sum_p \exp\Bigl( \sum_{n \geq 1} \frac{1}{n}\, \alpha^i_{-n} \bar\alpha^i_{-n} \Bigr) |(p, p)\rangle \tag{3.2}$$
and
$$||D24\rangle\rangle_U = \sum_p \exp\Bigl( -\sum_{n \geq 1} \frac{1}{n}\, \alpha^i_{-n} \bar\alpha^i_{-n} \Bigr) |(p, -p)\rangle, \tag{3.3}$$
with $|(p_L, p_R)\rangle$ denoting the momentum ground state in $\mathcal{H}$ with momentum $p_L, p_R \in \Lambda_L$. By construction, this combination of boundary states is invariant under the asymmetric $\mathbb{Z}_2$. However, our $\mathbb{Z}_2$ does not simply correspond to the T-duality transformation that inverts all 24 directions. Indeed, because of the background B-field mentioned after (2.6), the T-dual of $||D0\rangle\rangle_U$ would be the familiar
$$||D24'\rangle\rangle_U = \sum_p \exp\Bigl( -\sum_{n \geq 1} \frac{1}{n}\, \alpha^i_{-n} D_{ij}\, \bar\alpha^j_{-n} \Bigr) |(p, -p)\rangle, \tag{3.4}$$
where $D_{ij} = (1 - B)^{-1}_{il} (1 + B)_{lj}$. This expression can be obtained from (3.3) by rotating the right-movers with the matrix $D$. Physically, (3.3) therefore describes a D24-brane with a (Born-Infeld) flux $F = -B$ which compensates for the background B-field. For the twisted sector contribution we have similarly
$$||D0\rangle\rangle_T = \exp\Bigl( \sum_{r \geq 1/2} \frac{1}{r}\, c^i_{-r} \bar c^i_{-r} \Bigr) |x = 0\rangle \tag{3.5}$$
and$^3$
$$||D24\rangle\rangle_T = -\exp\Bigl( -\sum_{r \geq 1/2} \frac{1}{r}\, c^i_{-r} \bar c^i_{-r} \Bigr) |x = 0\rangle, \tag{3.6}$$
where $|x = 0\rangle$ denotes the twisted sector ground state localised at the fixed point 0.
$^3$ In the twisted sector, the image of $||D0\rangle\rangle_T$ under T-duality would involve all $2^{24}$ fixed points (see for instance [23, 29]).
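3.2. More fractional branes. Given this “geometric” D-brane, we can obtain a number of other configurations with a clear geometric interpretation. First of all, we can consider introducing Wilson lines for the D24-brane, thereby placing the D0-brane on a different fixed point $y$ (where $y \in \tfrac{1}{2}\Lambda_L/\Lambda_L$). In the untwisted sector of the above boundary states this corresponds to introducing $p$-dependent signs in (3.3) and (3.2) as
$$||D0, y\rangle\rangle_U = \sum_p (-1)^{2 y \cdot p} \exp\Bigl( \sum_{n \geq 1} \frac{1}{n}\, \alpha^i_{-n} \bar\alpha^i_{-n} \Bigr) |(p, p)\rangle \tag{3.7}$$
and
$$||D24, y\rangle\rangle_U = \sum_p (-1)^{2 y \cdot p} \exp\Bigl( -\sum_{n \geq 1} \frac{1}{n}\, \alpha^i_{-n} \bar\alpha^i_{-n} \Bigr) |(p, -p)\rangle. \tag{3.8}$$
In the twisted sector the relevant modifications are
$$||D0, y\rangle\rangle_T = \exp\Bigl( \sum_{r \geq 1/2} \frac{1}{r}\, c^i_{-r} \bar c^i_{-r} \Bigr) |x = y\rangle \tag{3.9}$$
and
$$||D24, y\rangle\rangle_T = -\exp\Bigl( -\sum_{r \geq 1/2} \frac{1}{r}\, c^i_{-r} \bar c^i_{-r} \Bigr) |x = y\rangle. \tag{3.10}$$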
Since there are $2^{24}$ different fixed points, there are $2^{24}$ such configurations. In addition, we can also change the overall sign of the twisted sector contributions; thus in total there should be $2^{25}$ such configurations. From Sect. 2 it is clear that introducing these signs corresponds precisely to the chiral action of $\Gamma(\Lambda_L) = 2_+^{1+24}$. Indeed, $i \in \Gamma(\Lambda_L)$ changes the overall sign of the twisted sector contributions, and the other elements with $R = e$ in $(\cdot 1)$ introduce the correct $p$-dependent signs in the untwisted sector (see (2.27)).
We can also apply a number of asymmetric reflections of the Leech lattice to relate the D0–D24 combination to a Dp–D(24−p) combination. This can be implemented by a suitable lift of an element in $(\cdot 1)$ to the extra-special extension $C$ introduced before. Combining these two constructions we therefore conclude that the image of the above D24–D0 brane under the action of $C = 2_+^{1+24}(\cdot 1)$ gives another geometric brane configuration.
On the other hand, the other generators of the Monster group do not map the D0–D24 brane into another geometrical configuration. This is not really surprising since the oscillators $\alpha^i_n$ do not transform in a representation of the Monster group. (After all, the smallest non-trivial representation of the Monster group has dimension 196883.) This is not in conflict with the claim that the Monster group acts on the conformal field theory since the modes $\alpha^i_n$ are not the modes of an actual state of the theory: they are the modes of the state $\alpha^i_{-1}|\Omega\rangle$, but this state does not survive the orbifold projection.
3.3. Bulk branes. Apart from the fractional D-brane states we have constructed so far, one also expects there to be bulk D-branes, which can move away from the fixed points of the orbifold. More specifically, if we start with a bulk D0-brane at an arbitrary point of the geometric $\mathbb{Z}_2$ orbifold (i.e. a pair of D0-branes at $x$ and $-x$ on the covering space), then adding two D24-branes with Wilson lines $x$ and $-x$ (in suitable units) on the covering space gives an orbifold invariant combination. This D-brane will have moduli (corresponding to the position of one of the two D0-branes, say), and so will be part of a continuum. Such a bulk D-brane can be obtained by combining two coincident fractional D-branes with cancelling twisted sector components. Indeed, as we shall see later on, the open string spectrum between two such states contains the appropriate marginal operators.
4. Symmetric D-Branes
In principle, the only symmetry the boundary states of a (bosonic) string theory are required to preserve is the conformal symmetry, i.e. the boundary states must satisfy
$$\bigl( L_n - \bar L_{-n} \bigr)\, ||B\rangle\rangle = 0. \tag{4.1}$$
In general it is difficult to classify all such conformal boundary conditions (see however [30, 31]), and one therefore often restricts the problem further by demanding that the boundary states preserve some larger symmetry. Examples are the familiar Dirichlet or Neumann branes whose corresponding open strings satisfy Dirichlet or Neumann boundary conditions at the ends. The boundary states then preserve a U(1) current algebra for each coordinate,
$$\bigl( \alpha^i_n \pm \bar\alpha^i_{-n} \bigr)\, ||B\rangle\rangle = 0, \tag{4.2}$$
where $\alpha^i_n$ are the modes associated to $X^i$, and the sign determines whether $X^i$ obeys a Dirichlet or a Neumann condition on the world-sheet boundary. Typically, the more symmetries one requires a D-brane to preserve, the easier it is to construct and classify the relevant boundary states. However, in general one then only finds a subset of all physically relevant boundary states.
For the case of the Monster theory we have so far only considered D-branes that consist of orbifold invariant combinations of Dirichlet and Neumann branes. While these are consistent D-branes, they are unlikely to describe all the lightest D-branes of the theory since none of these branes couples to the asymmetrically twisted closed string states. Furthermore, it is rather unnatural to consider gluing conditions that involve the modes $\alpha^i_n$, since these modes are not actually present in the theory. (The modes are only present in the theory before orbifolding, and thus the characterisation of these branes relies on a specific realisation of the Monster conformal field theory in terms of an orbifold construction.)
Instead, we will in the following consider D-branes that are characterised by a gluing condition that can be formulated within the Monster conformal field theory. More specifically, we shall analyse the branes that preserve the subalgebra of the original vertex operator algebra consisting of Monster-invariant states. We will denote this W-algebra by $\mathcal{W}$; it contains the Virasoro algebra, but is in fact strictly larger. Let $W_n$ denote the modes of an element of $\mathcal{W}$. Then the gluing condition we will impose, generalising (4.1), is
$$\bigl( W_n - (-1)^s\, \bar W_{-n} \bigr)\, ||B\rangle\rangle = 0, \tag{4.3}$$
where $s$ is the spin of $W$. As we shall see, this gluing condition is restrictive enough to allow a complete classification of those solutions. In Sect. 5 we will discuss which of the examples in Sect. 3 are captured by this construction.
It turns out that the mathematical results we need in our construction and classification of these symmetric boundary states are known in much wider generality. Therefore, in the present section, we will work in a broader framework than strictly necessary to analyse the Monster theory.
4.1. General framework. We will work in the general framework studied in [28]. Suppose $\mathcal{H}_0$ is a simple vertex operator algebra, i.e. a vertex operator algebra that does not have any non-trivial ideals. (An introduction to these matters can be found in [32, 1, 33].) Let us furthermore assume that $\mathcal{H}_0$ admits a continuous action of a compact Lie group $G$ (which may be finite). In the example of the Monster theory, $\mathcal{H}_0$ corresponds to the chiral Monster theory $\mathcal{H}_M$ and $G$ is the Monster group $\mathbb{M}$. Let $\mathcal{W}$ be the vertex operator subalgebra of $\mathcal{H}_0$ consisting of the $G$-invariants. The main result of [28] is that $\mathcal{H}_0$ can be decomposed as
$$\mathcal{H}_0 = \bigoplus_\lambda R_\lambda \otimes \mathcal{H}_\lambda, \tag{4.4}$$
where the sum runs over all irreducible representations $R_\lambda$ of $G$, and each $\mathcal{H}_\lambda$ is an irreducible representation of $\mathcal{W}$.$^4$ Moreover, the $\mathcal{H}_\lambda$ are inequivalent for different $\lambda$. The first few terms of the characters of $\mathcal{H}_\lambda$ can be found in [25].$^5$ The total space of states (the generalisation of $\mathcal{H}$, see (2.1)) then has the decomposition
$$\mathcal{H}_0 \otimes \bar{\mathcal{H}}_0 = \bigoplus_{\lambda, \mu} \bigl( R_\lambda \otimes \bar R_\mu \bigr) \otimes \bigl( \mathcal{H}_\lambda \otimes \bar{\mathcal{H}}_\mu \bigr), \tag{4.5}$$
where $\bar R$ denotes the conjugate $\mathbb{M}$-representation of $R$, and the sum extends over all tensor products of irreducible representations of $G$.
In the following we shall construct boundary states that preserve $\mathcal{W}$ for this theory. We shall make use of the fact that $\mathcal{H}_0 \otimes \bar{\mathcal{H}}_0$ has the decomposition (4.5). In general, $\mathcal{H}_0 \otimes \bar{\mathcal{H}}_0$ is only the vacuum sector of the full conformal field theory, and the theory also contains sectors that correspond to other representations of $\mathcal{H}_0$. A priori, we do not know whether these other sectors also have a decomposition as (4.5), and we shall therefore restrict ourselves to self-dual theories, i.e. theories for which the full conformal field theory is actually given by the vacuum sector alone. This is clearly the case for the Monster theory. Self-dual conformal field theories have the property that their character is invariant under the S-modular transformation.
$^4$ For the case of the Monster theory, this result implies that $\mathcal{W}$ must be strictly larger than the Virasoro algebra: otherwise the modular invariant partition function (2.28) would equal a finite sum of irreducible Virasoro characters with $c = 24$. One can also check directly, by comparing characters, that $\mathcal{W}$ contains at least an additional primary field of conformal weight 12.
$^5$ Table 2 of that paper does not contain any entries for the (conjugate pairs of) irreducible Monster representations IRR16-IRR17 and IRR26-IRR27. The corresponding $\mathcal{H}_\lambda$ are not trivial, but they only contain states whose conformal weight is bigger than $h = 51$.
4.2. Ishibashi states. We are interested in constructing D-branes that preserve the full $\mathcal{W}$-symmetry. Each such D-brane state can be written in terms of $\mathcal{W}$-Ishibashi states; the Ishibashi state is uniquely fixed (up to normalisation) by the gluing condition (4.3), and for each term in (4.5) for which the left- and right-moving $\mathcal{W}$-representations are conjugate, we can construct one such Ishibashi state. (See for example [15] for a review of these issues.) Thus we see from (4.5) that the $\mathcal{W}$-Ishibashi states are labelled just like matrix elements of representations of $G$,
$$|R_\lambda; i, \bar\jmath\rangle\rangle \in \bigl( R_\lambda \otimes \bar R_\lambda \bigr) \otimes \bigl( \mathcal{H}_\lambda \otimes \bar{\mathcal{H}}_\lambda \bigr), \tag{4.6}$$
where $i \in R_\lambda$, $\bar\jmath \in \bar R_\lambda$ are a basis for the representation $R_\lambda$ and $\bar R_\lambda$, respectively. The relevant overlap between these Ishibashi states is given as
$$\langle\langle R_\lambda; i_1, \bar\jmath_1 |\, q^{\frac{1}{2}(L_0 + \bar L_0 - \frac{c}{12})}\, | R_\mu; i_2, \bar\jmath_2 \rangle\rangle = \delta_{\lambda,\mu}\, \delta_{i_1, i_2}\, \delta_{\bar\jmath_1, \bar\jmath_2}\, \chi_{\mathcal{H}_\lambda}(q), \tag{4.7}$$
where $\chi_{\mathcal{H}_\lambda}(q)$ is the character of the irreducible $\mathcal{W}$-representation $\mathcal{H}_\lambda$.
4.3. Consistent boundary states. Next we want to construct actual D-brane states, which are certain linear combinations of the Ishibashi states. D-brane states are (at least partially) characterised by the property that they satisfy Cardy’s condition [16], i.e. that they give rise to a positive integer number of $\mathcal{W}$-representations (or more generally, Virasoro representations) in the open string. One D-brane state that satisfies this condition can be easily constructed: it is given by
$$||e\rangle\rangle = \sum_{\lambda; i} |R_\lambda; i, \bar\imath\rangle\rangle, \tag{4.8}$$
where the sum extends over all irreducible representations $R_\lambda$ of $G$, and $i$ labels a basis of the representation $R_\lambda$. In order to check that $||e\rangle\rangle$ satisfies the Cardy condition, we observe that
$$\langle\langle e||\, q^{\frac{1}{2}(L_0 + \bar L_0 - \frac{c}{12})}\, ||e\rangle\rangle = \sum_{\lambda; i} \chi_{\mathcal{H}_\lambda}(q) = \sum_\lambda \dim(R_\lambda)\, \chi_{\mathcal{H}_\lambda}(q) \equiv F(\tau), \tag{4.9}$$
where we have used that $\mathcal{H}_0$ decomposes as in (4.4), and where $F(\tau)$ is the character (or partition function) of the chiral conformal field theory $\mathcal{H}_0$. The character of a self-dual theory is invariant under the modular transformation $\tau \to -1/\tau$, and therefore
$$\langle\langle e||\, q^{\frac{1}{2}(L_0 + \bar L_0 - \frac{c}{12})}\, ||e\rangle\rangle = F(-1/\tau) = \sum_\lambda \dim(R_\lambda)\, \chi_{\mathcal{H}_\lambda}(\tilde q), \tag{4.10}$$
where $\tilde q = e^{-2\pi i/\tau}$. Since $\dim(R_\lambda)$ are positive integers, this demonstrates that the boundary state $||e\rangle\rangle$ satisfies the Cardy condition. For the special case of the Monster theory, $F(\tau) = j(\tau) - 744$, which is indeed invariant under the S-modular transformation.
As our notation suggests, the boundary state (4.8) is associated to the identity element of the automorphism group $G$. We want to show next that there is actually a boundary state for each group element of $G$. The different boundary states are transformed into one another by the left-action of $G$. Thus we define
$$||g\rangle\rangle = g\, ||e\rangle\rangle = \sum_{\lambda; i, j} D^{R_\lambda}_{ji}(g)\, |R_\lambda; j, \bar\imath\rangle\rangle, \tag{4.11}$$
where $D^R_{ji}(g)$ is the matrix element of $g \in G$ in the representation $R$,
$$\sum_j D^R_{lj}(h)\, D^R_{ji}(g) = D^R_{li}(hg). \tag{4.12}$$
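The self-overlap of each of these branes is in fact the same as (4.9) above: it follows directly from the definition of (4.11) that
$$\langle\langle g||\, q^{\frac{1}{2}(L_0 + \bar L_0 - \frac{c}{12})}\, ||g\rangle\rangle = \sum_{\lambda; i, j} \overline{D^{R_\lambda}_{ji}(g)}\, D^{R_\lambda}_{ji}(g)\, \chi_{\mathcal{H}_\lambda}(q). \tag{4.13}$$
Since each group representation can be taken to be unitary, we have
$$\sum_{i, j} \overline{D^{R_\lambda}_{ji}(g)}\, D^{R_\lambda}_{ji}(g) = \sum_{i, j} D^{R_\lambda}_{ij}(g^{-1})\, D^{R_\lambda}_{ji}(g) = \sum_i D^{R_\lambda}_{ii}(e) = \dim(R_\lambda). \tag{4.14}$$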
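Inserting (4.14) into (4.13) we thus reproduce (4.9). Incidentally, this also shows that all these D-branes have the same mass, since the mass is determined by the $q \to 0$ limit of (4.13) [5, 34].
It remains to show that the overlap between two different branes of the form (4.11) also gives rise to a positive integer number of representations of $\mathcal{W}$ in the open string. Using the same argument as above in (4.13) and (4.14) we now find
$$\begin{aligned} \langle\langle g||\, q^{\frac{1}{2}(L_0 + \bar L_0 - \frac{c}{12})}\, ||h\rangle\rangle &= \sum_{\lambda; i, j} \overline{D^{R_\lambda}_{ji}(g)}\, D^{R_\lambda}_{ji}(h)\, \chi_{\mathcal{H}_\lambda}(q) \\ &= \sum_\lambda \mathrm{Tr}_{R_\lambda}(g^{-1} h)\, \chi_{\mathcal{H}_\lambda}(q) \\ &= \mathrm{Tr}_{\mathcal{H}_0}\bigl( g^{-1} h\, q^{L_0 - \frac{c}{24}} \bigr). \end{aligned} \tag{4.15}$$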
It follows from standard orbifold considerations [7] that under $\tau \to -1/\tau$ we have
$$\mathrm{Tr}_{\mathcal{H}_0}\bigl( \hat g\, q^{L_0 - \frac{c}{24}} \bigr) = \mathrm{Tr}_{\mathcal{H}_{\hat g}}\bigl( \tilde q^{\,L_0 - \frac{c}{24}} \bigr), \tag{4.16}$$
where we have written $\hat g = g^{-1} h$, and $\mathcal{H}_{\hat g}$ is the (unique) $\hat g$-twisted representation of the conformal field theory $\mathcal{H}_0$ [35]. Since the $\hat g$-twist acts trivially on the generators of $\mathcal{W}$, we can decompose the representation $\mathcal{H}_{\hat g}$ in terms of representations of $\mathcal{W}$ as
$$\mathcal{H}_{\hat g} = \bigoplus_j D_j \otimes \mathcal{H}_j, \tag{4.17}$$
where each $\mathcal{H}_j$ is an irreducible representation of $\mathcal{W}$, and $D_j$ is some multiplicity space. Thus it follows that
$$\langle\langle g||\, q^{\frac{1}{2}(L_0 + \bar L_0 - \frac{c}{12})}\, ||h\rangle\rangle = \sum_j \dim(D_j)\, \chi_{\mathcal{H}_j}(\tilde q). \tag{4.18}$$
In particular, this therefore implies that the relative overlaps also satisfy Cardy’s condition. For the case of the Monster theory, the last line of (4.15) is precisely the McKay-Thompson series (2.33), which thus appears very naturally in the study of monstrous D-branes!
4.4. Completeness. The construction of a set of consistent boundary states in the previous subsection was fairly general. Now we want to argue that at least in some cases (including the Monster theory) this set is complete, in the sense that it contains all (fundamental) D-branes preserving $\mathcal{W}$. In the following we shall restrict ourselves to the case where $G$ is a finite group.
If $G$ is a finite group, there are only finitely many $\mathcal{W}$-preserving Ishibashi states. Whenever this is the case, one can show the completeness of the boundary states by the following algebraic argument: suppose there are $N$ $\mathcal{W}$-Ishibashi states, $|I_1\rangle\rangle, \ldots, |I_N\rangle\rangle$, and that we have managed to find $N$ boundary states, $||B_1\rangle\rangle, \ldots, ||B_N\rangle\rangle$, that are linearly independent over the complex numbers. (This is the case in our example as we shall show momentarily.) Then since every boundary state is a linear combination of Ishibashi states, we can express the $N$ Ishibashi states in terms of the $N$ boundary states, i.e. we can find an invertible matrix $A$ such that
$$|I_j\rangle\rangle = \sum_i A_{ij}\, ||B_i\rangle\rangle. \tag{4.19}$$
Suppose now that there exists another boundary state $||B\rangle\rangle$ that is compatible with the boundary states $||B_i\rangle\rangle$ with $i = 1, \ldots, N$. (By this we mean that the various overlaps between $||B\rangle\rangle$ and the $||B_i\rangle\rangle$ lead to positive integer numbers of $\mathcal{W}$ characters in the open string channel.) Since $||B\rangle\rangle$ is a boundary state, it can be written as a linear combination of Ishibashi states, and therefore, because of (4.19), as a linear combination of the boundary states $||B_i\rangle\rangle$. Thus we have shown that
$$||B\rangle\rangle = \sum_i C_i\, ||B_i\rangle\rangle, \tag{4.20}$$
where the $C_i$ are some (in general complex) constants. In order to prove that the $||B_i\rangle\rangle$ are all the (fundamental) boundary states it therefore only remains to show that the $C_i$ are in fact non-negative integers. This will typically follow from the fact that $||B\rangle\rangle$ is compatible with the $||B_i\rangle\rangle$ in the sense described above. For example, if the $||B_i\rangle\rangle$ have the property that the overlap between $||B_i\rangle\rangle$ and $||B_j\rangle\rangle$ only contains the vacuum representation in the open string provided that $i = j$, and that the vacuum representation occurs with multiplicity one if $i = j$ (again this is the case for the Monster theory as we shall show momentarily), then this can be shown as follows. We consider the overlap
$$\langle\langle B_i||\, q^{\frac{1}{2}(L_0 + \bar L_0 - \frac{c}{12})}\, ||B\rangle\rangle, \tag{4.21}$$
and transform into the open string description. From the above assumption and (4.20) it then follows that the vacuum character in the open string occurs with multiplicity $C_i$. Thus it follows that $C_i$ has to be a non-negative integer since $||B\rangle\rangle$ is compatible with $||B_i\rangle\rangle$.
For the case at hand, one can actually show that the two assumptions made above are satisfied. First of all, it follows from the above analysis that there are
$$\sum_\lambda \dim(R_\lambda)^2 = \dim(G) \tag{4.22}$$
Ishibashi states, which therefore agrees with the number of boundary states described by (4.11). By the Peter-Weyl Theorem (or the appropriate simpler statement for finite groups) these $\dim(G)$ boundary states are linearly independent. Given the above argument, this shows that the boundary states are all fundamental boundary states provided they are “orthogonal”, i.e. provided that the identity only arises in the open string of the overlap of each boundary state with itself (where it arises with multiplicity one). For the Monster theory, the latter statement is obviously correct since the open string overlap between each boundary state and itself is simply $j(\tilde q) - 744$, which starts indeed with $1 \cdot \tilde q^{-1} + \cdots$. Thus it only remains to check that the overlap between different boundary states starts with $0 \cdot \tilde q^{-1} + \cdots$ in the open string. This is simply the question of what the leading behaviour of the S-modular transform of the different McKay-Thompson series (for $g \neq e$) is. It has recently been argued that the only McKay-Thompson series that has a term of order $\tilde q^{-1}$ in its S-modular transform is the series associated to the identity element [36]. We have also checked this property for a number of McKay-Thompson series explicitly.
We have thus shown that there are precisely $|\mathbb{M}|$ $\mathcal{W}$-preserving boundary states $||g\rangle\rangle$, labelled by $g \in \mathbb{M}$. It is interesting to ask how these boundary states transform under the Bimonster group. First of all, it is easy to verify that the elements of $\mathbb{M} \times \mathbb{M}$ act as
$$(h_L, h_R)\, ||g\rangle\rangle = ||h_L\, g\, h_R^{-1}\rangle\rangle. \tag{4.23}$$
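Indeed, we calculate
$$\begin{aligned} (h_L, h_R)\, ||g\rangle\rangle &= (h_L, h_R) \sum_{\lambda; i, j} D^{R_\lambda}_{ji}(g)\, |R_\lambda; j, \bar\imath\rangle\rangle \\ &= \sum_{\lambda; i, j, k, l} D^{R_\lambda}_{ji}(g)\, D^{R_\lambda}_{kj}(h_L)\, \overline{D^{R_\lambda}_{li}(h_R)}\, |R_\lambda; k, \bar l\rangle\rangle \\ &= \sum_{\lambda; i, j, k, l} D^{R_\lambda}_{kj}(h_L)\, D^{R_\lambda}_{ji}(g)\, D^{R_\lambda}_{il}(h_R^{-1})\, |R_\lambda; k, \bar l\rangle\rangle \\ &= \sum_{\lambda; k, l} D^{R_\lambda}_{kl}(h_L\, g\, h_R^{-1})\, |R_\lambda; k, \bar l\rangle\rangle \\ &= ||h_L\, g\, h_R^{-1}\rangle\rangle. \end{aligned} \tag{4.24}$$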
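Next we note that the generator $\sigma$ of the $\mathbb{Z}_2$ that exchanges the left- and right-movers is an anti-linear map that replaces the Ishibashi states $|R_\lambda; j, \bar\imath\rangle\rangle$ by $|R_\lambda; i, \bar\jmath\rangle\rangle$. It therefore acts on the boundary states as
$$\sigma\, ||g\rangle\rangle = \sum_{\lambda; \bar\jmath, i} \overline{D^{R_\lambda}_{ji}(g)}\, |R_\lambda; i, \bar\jmath\rangle\rangle = \sum_{\lambda; \bar\jmath, i} D^{R_\lambda}_{ij}(g^{-1})\, |R_\lambda; i, \bar\jmath\rangle\rangle = ||g^{-1}\rangle\rangle, \tag{4.25}$$
where we have again used that the representations of the Monster group are unitary. The actions (4.23) and (4.25) combine to give a full representation of the Bimonster group since
$$\sigma\, (h_1, h_2)\, ||g\rangle\rangle = ||h_2\, g^{-1}\, h_1^{-1}\rangle\rangle = (h_2, h_1)\, \sigma\, ||g\rangle\rangle. \tag{4.26}$$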
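The compatibility relation (4.26) only involves the labels $g$ of the boundary states, so it can be tested on any finite group. The sketch below again uses $S_3$ as an illustrative stand-in for $\mathbb{M}$ and checks both (4.26) and the fact that (4.23) respects the product in $G \times G$; the function names are hypothetical.

```python
from itertools import permutations, product

# S3 as a toy stand-in for the Monster; labels g of ||g>> are permutations.
GROUP = list(permutations(range(3)))


def compose(g, h):
    """Group product: (g h)(x) = g(h(x))."""
    return tuple(g[h[x]] for x in range(3))


def inverse(g):
    inv = [0] * 3
    for x, y in enumerate(g):
        inv[y] = x
    return tuple(inv)


def act_pair(h_left, h_right, g):
    """Label of (h_L, h_R)||g>> according to (4.23): h_L g h_R^{-1}."""
    return compose(compose(h_left, g), inverse(h_right))


def act_sigma(g):
    """Label of sigma||g>> according to (4.25): g^{-1}."""
    return inverse(g)


def check_4_26():
    """sigma (h1, h2) and (h2, h1) sigma agree on every label."""
    return all(act_sigma(act_pair(h1, h2, g)) == act_pair(h2, h1, act_sigma(g))
               for h1, h2, g in product(GROUP, repeat=3))


def check_pair_homomorphism():
    """(a, b) acting after (c, d) equals (a c, b d) acting once."""
    return all(act_pair(a, b, act_pair(c, d, g)) == act_pair(compose(a, c), compose(b, d), g)
               for a, b, c, d, g in product(GROUP, repeat=5))


print(check_4_26(), check_pair_homomorphism())
```

Together with $\sigma^2 = 1$, these are the relations of the wreath product of the group with $\mathbb{Z}_2$.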
Thus we have shown that the $\mathcal{W}$-preserving boundary states fall into a representation of the Bimonster group.
4.5. Factorisation constraint. In the previous subsections we have constructed a family of $\mathcal{W}$-preserving boundary states that satisfy all relative Cardy conditions. Furthermore, we have shown that this set of boundary states is complete. In addition to the Cardy conditions, consistent boundary states also have to satisfy the “sewing relations” of [37]. One of these conditions is the factorisation (or cluster) condition that requires that certain bulk-boundary structure constants satisfy a set of non-linear equations (sometimes also referred to as the classifying algebra in this context). It was shown in [38] that a $\mathcal{W}$-preserving boundary state $||B\rangle\rangle$ satisfies this factorisation constraint provided that it preserves the full symmetry algebra $\mathcal{H}_0$ up to conjugation by an element $g \in G$, i.e. provided
$$\bigl( g\, S_n\, g^{-1} - (-1)^s\, \bar S_{-n} \bigr)\, ||B\rangle\rangle = 0, \tag{4.27}$$
for all modes of fields in $\mathcal{H}_0$. (Here $g \in G$ depends on the boundary condition $||B\rangle\rangle$.) Since $W_n \in \mathcal{W}$ is invariant under the action of $g \in G$, (4.27) contains (4.3) as a special case. As we shall now explain, the boundary states we have constructed actually satisfy (4.27); in fact, we have
$$\bigl( g\, S_n\, g^{-1} - (-1)^s\, \bar S_{-n} \bigr)\, ||g\rangle\rangle = 0 \tag{4.28}$$
for each $g \in \mathbb{M}$. Given the decomposition (4.5), the boundary state corresponding to $||e\rangle\rangle$ is the unique boundary state that preserves the full symmetry algebra (the theory only contains a single Ishibashi state that preserves this algebra), and thus (4.28) holds for $g = e$. The general statement then follows from this using (4.23).
5. Application to the Monster: Fractional and Bulk Branes
Let us now return to the specific case of the Monster theory. As we have shown in the previous section, the D-branes that preserve the W-algebra of Monster invariants $\mathcal{W}$ are labelled by group elements in $\mathbb{M}$. We want to analyse now how the various D-branes that we constructed in Sect. 3 fit into this analysis.
In order to do so it is useful to describe the “geometrical” boundary states of Sect. 3.1 in more detail.
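5.1. Fractional D0–D24 at the origin. In the untwisted sector of the geometric orbifold, the constituent boundary states are (up to normalisation) given as
$$||D24\rangle\rangle_U = \sum_p \exp\Big( -\sum_{n\ge 1} \frac{1}{n}\, \alpha^i_{-n} \bar\alpha^i_{-n} \Big)\, |(p,-p)\rangle \qquad (5.1)$$
and
$$||D0\rangle\rangle_U = \sum_p \exp\Big( \sum_{n\ge 1} \frac{1}{n}\, \alpha^i_{-n} \bar\alpha^i_{-n} \Big)\, |(p,p)\rangle. \qquad (5.2)$$
Let us expand out these boundary states, and in particular, consider the contributions for $h = \bar h = 0, 1, 2$. At $h = \bar h = 0$, both boundary states are proportional to the vacuum. At $h = \bar h = 1$, the $||D24\rangle\rangle_U$ boundary state is proportional to
$$-\sum_i \alpha^i_{-1} \bar\alpha^i_{-1}\, |(0,0)\rangle, \qquad (5.3)$$
while the $||D0\rangle\rangle_U$ state is proportional to
$$\sum_i \alpha^i_{-1} \bar\alpha^i_{-1}\, |(0,0)\rangle. \qquad (5.4)$$
The sum $||D24\rangle\rangle_U + ||D0\rangle\rangle_U$ therefore does not have any contribution at $h = \bar h = 1$. This is in agreement with the fact that the Monster theory does not have any states of $h = \bar h = 1$. At $h = \bar h = 2$ we get for $||D24\rangle\rangle_U$,
$$-\frac{1}{2} \sum_i \alpha^i_{-2} \bar\alpha^i_{-2}\, |(0,0)\rangle + \frac{1}{2} \sum_{i,j} \alpha^i_{-1} \alpha^j_{-1} \bar\alpha^i_{-1} \bar\alpha^j_{-1}\, |(0,0)\rangle + \sum_{p:\,p^2=4} |(p,-p)\rangle, \qquad (5.5)$$
while for $||D0\rangle\rangle_U$ we have
$$\frac{1}{2} \sum_i \alpha^i_{-2} \bar\alpha^i_{-2}\, |(0,0)\rangle + \frac{1}{2} \sum_{i,j} \alpha^i_{-1} \alpha^j_{-1} \bar\alpha^i_{-1} \bar\alpha^j_{-1}\, |(0,0)\rangle + \sum_{p:\,p^2=4} |(p,p)\rangle. \qquad (5.6)$$
At $h = \bar h = 2$ the sum $||D24\rangle\rangle_U + ||D0\rangle\rangle_U$ therefore has the contribution
$$\sum_{i,j} \alpha^i_{-1} \alpha^j_{-1} \bar\alpha^i_{-1} \bar\alpha^j_{-1}\, |(0,0)\rangle + \sum_{p:\,p^2=4} \frac{1}{\sqrt{2}} \big( |p\rangle + |{-p}\rangle \big)_L \otimes \frac{1}{\sqrt{2}} \big( |p\rangle + |{-p}\rangle \big)_R. \qquad (5.7)$$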
In the last sum we have written the momenta as tensor products of left- and right-moving momenta. Next we recall that the chiral Monster theory has 196884 states with h = 2; of these there are $\frac{24\cdot 25}{2}$ states of the form $\alpha^i_{-1}\alpha^j_{-1}|0\rangle$ and 98280 states of the form $\frac{1}{\sqrt{2}}(|p\rangle + |{-p}\rangle)$ with $p^2 = 4$ (as well as $24 \cdot 2^{12}$ states coming from the twisted sector). What the above calculation shows is that of the $196883^2$ W-Ishibashi states at $h = \bar h = 2$, those that come from the untwisted sector (i.e. the $\big(\frac{24\cdot 25}{2} + 98280 - 1\big)^2$ Ishibashi states coming from the first two types of states minus the stress-energy tensor) contribute only if the left-label of the Monster representation is the same as the right-label of the Monster representation. Furthermore, all these diagonal states appear with the same coefficient. If the D0–D24 combination is one of the boundary states we have constructed before, then the group element g must therefore have the property that $D^{196883}_{ij}(g) = \delta_{ij}$ if i and j are untwisted labels. It is also clear that $D^{196883}_{ij}(g) = 0$ if i and j describe one untwisted and one twisted label since the twisted sector of the geometric $Z_2$ orbifold (in which theory the D0-D24 boundary state is constructed) consists of those states that are twisted with respect to both the left- and the right- asymmetric orbifold. Thus the corresponding group element g must give rise to a representation matrix of the form
$$D^{196883}(g) = \begin{pmatrix} 1 & 0 \\ 0 & R(g) \end{pmatrix}, \qquad (5.8)$$
where we have written the matrix in block-diagonal form, with the two blocks corresponding to the untwisted and the twisted sector states, respectively. Here R(g) is a 98304 × 98304 matrix, describing the components of g in the representation 196883 with respect to the twisted sector states. We also know that, in the above notation,
$$D^{196883}(i) = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \qquad (5.9)$$
and thus it follows that
$$D^{196883}(g\, i\, g^{-1}\, i) = 1. \qquad (5.10)$$
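The counting and the block structure used above can be checked with a few lines of arithmetic. The following is only an illustrative sketch: the 2 × 2 blocks and the names `R_g` and `block_diag` are hypothetical stand-ins for the actual 98579- and 98304-dimensional blocks, and the stand-in for R(g) is chosen to be its own inverse purely for convenience.

```python
# Dimension bookkeeping for the h = 2 states, using the numbers quoted in the text.
oscillator = 24 * 25 // 2      # states alpha^i_{-1} alpha^j_{-1}|0>
momentum = 98280               # states (|p> + |-p>)/sqrt(2) with p^2 = 4
twisted = 24 * 2**12           # twisted-sector states
total = oscillator + momentum + twisted
untwisted_196883 = oscillator + momentum - 1   # remove the stress-energy tensor


def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]


def block_diag(top, bottom):
    """Build the block-diagonal matrix diag(top, bottom)."""
    p, q = len(top), len(bottom)
    rows = [row + [0] * q for row in top]
    rows += [[0] * p + row for row in bottom]
    return rows


identity2 = [[1, 0], [0, 1]]
minus_identity2 = [[-1, 0], [0, -1]]
# Hypothetical stand-in for R(g): a permutation matrix, which is its own inverse.
R_g = [[0, 1], [1, 0]]

D_g = block_diag(identity2, R_g)               # matrix of the form (5.8)
D_g_inv = block_diag(identity2, R_g)           # valid because R_g squared is the identity
D_i = block_diag(identity2, minus_identity2)   # matrix of the form (5.9)

# D(g i g^{-1} i) = D(g) D(i) D(g)^{-1} D(i), as in (5.10)
product = matmul(matmul(matmul(D_g, D_i), D_g_inv), D_i)
identity4 = block_diag(identity2, identity2)

print(total, untwisted_196883, untwisted_196883 + twisted)
print(product == identity4)
```

The matrix check works for any invertible lower block, because the lower block of D(i) is a multiple of the identity and therefore commutes with R(g).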
Since the Monster group does not have any non-trivial normal subgroups, it follows that $g\, i\, g^{-1}\, i = e$, and therefore that g is in the centraliser of i, i.e. in C. On the other hand, we know how the elements in C act on the 98579 untwisted sector states of 196883, and it is easy to see that there are only two elements in C that give rise to a matrix of the form (5.8) above: either g = e or g = i. Let us choose to identify the fractional D0–D24 discussed above with $||e\rangle\rangle$. As we will see in the next subsection, $||i\rangle\rangle$ then corresponds to another fractional D0–D24 at the origin, differing from $||e\rangle\rangle$ in the overall sign of the twisted sector components of its boundary state. Apart from the direct boundary state argument given above, there is another way to argue for the identification of $||e\rangle\rangle$ (and $||i\rangle\rangle$) with a fractional D0–D24 at the origin. We expect that such a D0-D24 is at least invariant under the left-right symmetric (geometric) action of a suitable lift of the simple Conway group $(\cdot 1)$ to the extra-special extension C (since the elements in $(\cdot 1)$ correspond to Leech lattice automorphisms, which leave the origin invariant). Indeed, it is obvious from (4.23) that both $||e\rangle\rangle$ and $||i\rangle\rangle$ are in fact invariant under the left-right symmetric (diagonal) action of any element of C. Furthermore, e and i are the only group elements in M with this property.
5.2. More fractional branes. In Subsect. 3.2 we saw that by acting on the fractional D0-D24 brane of Subsect. 3.1 with elements of the group $C = 2_+^{1+24}(\cdot 1)$ acting on the left, one obtains other fractional D-branes with a clear geometric interpretation. From the previous section, we know that the fractional D0–D24 is described by the boundary state $||e\rangle\rangle$, and in Subsect. 4.3 we showed that the left action of an element g of C
results in the boundary state $||g\rangle\rangle$. Thus the boundary states labelled by an element of $C = 2_+^{1+24}(\cdot 1)$ have a geometric interpretation as fractional branes. In particular, the boundary state $||i\rangle\rangle$ associated to the involution i corresponds to a fractional D0-D24 brane at the origin, which differs from $||e\rangle\rangle$ by the sign of the twisted sector components of its boundary state: if we normalise the various boundary states so that $||e\rangle\rangle$ is given by (3.1), then
$$||i\rangle\rangle = \frac{1}{\sqrt{2}} \Big( ||D0\rangle\rangle_U - ||D0\rangle\rangle_T + ||D24\rangle\rangle_U - ||D24\rangle\rangle_T \Big), \qquad (5.11)$$
where the subscripts U and T denote again the components in the untwisted sector and in the sector twisted by the geometric $Z_2$, respectively.
5.3. Bulk branes. None of the branes we described in Sect. 4 have any moduli. Indeed, as we saw in Subsect. 4.3, the self-overlap of any of these branes leads to the one-loop partition function $F(\tilde q) = j(\tilde q) - 744$ in the open string, which therefore does not have any massless states.$^6$ On the other hand, we saw in Sect. 3.3 that the theory should have a continuum of bulk D-brane states. It therefore follows that bulk branes generically cannot preserve W. They are therefore examples of physical D-branes (preserving the conformal symmetry) that are not captured by the construction in Sect. 4 (where we restricted our attention to D-branes preserving the larger algebra W). One may wonder whether these bulk branes can be thought of as being built out of fractional branes. In particular, one may expect that combinations of fractional branes with a vanishing twisted sector contribution can combine to form a bulk brane. The simplest example of such a combination of fractional branes is described by the superposition of $||e\rangle\rangle$ and $||i\rangle\rangle$. In order to check whether this configuration of branes possesses massless modes that describe the corresponding moduli, we determine the cylinder diagram
$$\big( \langle\langle e|| + \langle\langle i|| \big)\, q^{\frac{1}{2}\left(L_0 + \bar L_0 - \frac{c}{12}\right)}\, \big( ||e\rangle\rangle + ||i\rangle\rangle \big) = 2\chi_e(q) + 2\chi_i(q). \qquad (5.12)$$
We want to write this amplitude in the open string channel, i.e.
in terms of the open string variable $\tilde q = \exp(-2\pi i/\tau)$. As we have explained before,
$$\chi_e(q) = j(\tau) - 744 = j(-1/\tau) - 744 = \tilde q^{-1} + 0 + 196884\,\tilde q + \cdots, \qquad (5.13)$$
and thus we do not get any massless modes from $\chi_e$. (This is as expected since $||e\rangle\rangle$ and $||i\rangle\rangle$ separately do not have any massless modes.) On the other hand, i lies in the class 2B of the ATLAS [39], and using [26], we find for the second term
$$\chi_i(q) = \frac{\eta(\tau)^{24}}{\eta(2\tau)^{24}} + 24 = 24 + 2^{12}\, \tilde q^{1/2} + \cdots. \qquad (5.14)$$
In particular, the combined system has 48 massless modes; 24 of these correspond to the moduli that describe the 24 different directions in which the bulk D0-brane can move off the fixed point. (The other 24 massless states correspond on the covering space to an open string connecting a D0 and its image D0, which has massless modes when the bulk brane is at a fixed point.)
$^6$ Incidentally, this property of the branes is special to the Monster theory; all other known self-dual conformal field theories contain massless states in their chiral partition function.
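The closed-channel expansion of the function in (5.14) can be checked with integer power series. The sketch below is not part of the original argument; it relies on the elementary identity $\eta(\tau)^{24}/\eta(2\tau)^{24} = q^{-1}\prod_{n\ge 1}(1+q^n)^{-24}$ with $q = e^{2\pi i\tau}$, and the helper names are illustrative. It confirms that the constant term of $\chi_i$ vanishes in the closed channel, and records the count 2 × 24 = 48 of massless open-string modes read off from (5.12) to (5.14).

```python
from math import comb


def inverse_power_series(n, order, power=24):
    """Coefficients of (1 + q^n)^(-power) up to q^order."""
    coeffs = [0] * (order + 1)
    k = 0
    while n * k <= order:
        # (1 + x)^(-p) = sum_k (-1)^k C(p + k - 1, k) x^k
        coeffs[n * k] = (-1) ** k * comb(power + k - 1, k)
        k += 1
    return coeffs


def multiply(a, b, order):
    """Product of two power series, truncated at q^order."""
    out = [0] * (order + 1)
    for i, ai in enumerate(a):
        if ai == 0:
            continue
        for j, bj in enumerate(b):
            if i + j > order:
                break
            out[i + j] += ai * bj
    return out


def eta_quotient(order):
    """Coefficients c[m] with eta(tau)^24 / eta(2 tau)^24 = q^(-1) * sum_m c[m] q^m."""
    series = [1] + [0] * order
    for n in range(1, order + 1):
        series = multiply(series, inverse_power_series(n, order), order)
    return series


c = eta_quotient(3)
# chi_i in the closed-string variable q: q^(-1) + (c[1] + 24) + c[2] q + c[3] q^2 + ...
constant_term = c[1] + 24
# Constant term of 2 chi_e + 2 chi_i in the open-string variable, from (5.13) and (5.14).
massless_modes = 2 * 0 + 2 * 24
print(c, constant_term, massless_modes)
```

The open-channel form on the right of (5.14) follows from the modular transformation of the eta function and is not recomputed here.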
Finally, let us remark that these “bulk branes” (which do not preserve the full W-symmetry) have higher mass than the W-preserving branes labelled by group elements in M (as follows by comparing their boundary states in the limit q → 0 [5, 34]). It therefore seems plausible that the lightest branes of the theory preserve the full W-symmetry, and therefore that they fall into a representation of the Bimonster group.
6. Conclusions In this paper we have shown that the D-branes preserving the chiral algebra of Monster invariants transform in the regular representation of both copies of the Monster group (and define a representation of the Bimonster). Although this does not give a complete classification of all possible D-brane states in the Monster theory, it does provide evidence that the Bimonster symmetry of the perturbative spectrum extends to a nonperturbative symmetry of the full theory. In particular, it seems likely that the D-brane states we have constructed are the lightest D-branes in the spectrum, and that all other D-brane states can be formed as composites of these building blocks. In this paper we have restricted our attention to conformal field theories that only consist of a vacuum sector, i.e. that are self-dual. This assumption guaranteed that the decomposition (4.4) (that is only known to hold for the vacuum sector [28]) can be used to decompose the full space of states as in (4.5). It seems plausible that the decomposition (4.4) may hold more generally for an arbitrary representation. This would then suggest that our construction may generalise further. This idea is also supported by the observation that our result is structurally very similar to what was found in [40] for the (non-self-dual) WZW model corresponding to su(2) at level k = 1. We hope to come back to this point in a future publication. The techniques described here may also be useful in trying to obtain a more systematic understanding of D-branes in asymmetric orbifolds. (Previous attempts at constructing D-branes in asymmetric orbifolds have been made in [41, 42].) In particular, some of the branes described above (namely those that correspond to group elements in M \ C) actually involve Ishibashi states from asymmetrically twisted closed string sectors. 
Finally, we hope that the perspective we have discussed here might be of use to mathematicians trying to obtain a more conceptual understanding of monstrous moonshine. Conformal field theories with boundaries have been less well developed in the mathematical literature. In a physical framework the boundaries are associated to D-branes and open strings which have endpoints on the D-brane. We have shown here that the McKay-Thompson series which are the subject of the genus zero moonshine conjectures arise naturally in the open string sector of the closed string theory with Monster symmetry. Perhaps this will suggest new approaches to the moonshine conjectures. Acknowledgements. We would like to thank Terry Gannon, Peter Goddard, Emil Martinec, and Andreas Recknagel for useful conversations; and Richard Borcherds, and John McKay for helpful communications. The work of BC is supported by DOE grant DE-FG02-90ER-40560 and NSF grant PHY-9901194. MRG is grateful to the Royal Society for a University Research Fellowship. He also acknowledges partial support from the EU network ‘Superstrings’ (HPRN-CT-2000-00122), as well as from the PPARC special grant ‘String Theory and Realistic Field Theory’, PPA/G/S/1998/0061. The work of JH is supported by NSF grant PHY-9901194.
References
1. Frenkel, I., Lepowsky, J., Meurman, A.: Vertex Operator Algebras and the Monster. London–New York: Academic Press, 1988
2. Borcherds, R.E.: Monstrous moonshine and monstrous Lie superalgebras. Invent. Math. 109, 405 (1992)
3. Tuite, M.: On the relationship between monstrous moonshine and the uniqueness of the moonshine module. Commun. Math. Phys. 166, 495 (1995)
4. Cummins, C.J., Gannon, T.: Modular equations and the genus zero property of moonshine functions. Invent. Math. 129, 413 (1997)
5. Polchinski, J.: Dirichlet-branes and Ramond-Ramond charges. Phys. Rev. Lett. 75, 4724 (1995); hep-th/9510017
6. Craps, B., Gaberdiel, M.R.: Discrete torsion orbifolds and D-branes. II. J. High Energy Phys. 0104, 013 (2001); hep-th/0101143
7. Dixon, L., Harvey, J., Vafa, C., Witten, E.: Strings on orbifolds I and II. Nucl. Phys. B261, 678 (1985) and Nucl. Phys. B274, 295 (1986)
8. Gimon, E.G., Polchinski, J.: Consistency conditions for orientifolds and D-manifolds. Phys. Rev. D54, 1667 (1996); hep-th/9601038
9. Douglas, M.R., Moore, G.: D-branes, quivers and ALE instantons. hep-th/9603167
10. Bergman, O., Gaberdiel, M.R.: Stable non-BPS D-particles. Phys. Lett. B441, 133 (1998); hep-th/9806155
11. Diaconescu, D.-E., Gomis, J.: Fractional branes and boundary states in orbifold theories. J. High Energy Phys. 10, 1 (2002)
12. Gaberdiel, M.R., Stefański, B.: Dirichlet branes on orbifolds. Nucl. Phys. B578, 58 (2000); hep-th/9910109
13. Gaberdiel, M.R.: Lectures on Non-BPS Dirichlet branes. Class. Quant. Grav. 17, 3483 (2000); hep-th/0005029
14. Billó, M., Craps, B., Roose, F.: Orbifold boundary states from Cardy’s condition. J. High Energy Phys. 0101, 038 (2001); hep-th/0011060
15. Gaberdiel, M.R.: D-branes from conformal field theory. Fortschr. Phys. 50, 783 (2002)
16. Cardy, J.L.: Boundary conditions, fusion rules and the Verlinde formula. Nucl. Phys. B324, 581 (1989)
17. Conway, J.H., Sloane, N.J.A.: Sphere Packings, Lattices and Groups. 3rd edn. Berlin–Heidelberg–New York: Springer, 1999
18. Goddard, P.: Meromorphic conformal field theory. In: Infinite Dimensional Lie Algebras and Lie Groups: Proceedings of the CIRM Luminy Conference 1988, Singapore: World Scientific, 1989
19. Dolan, L., Goddard, P., Montague, P.: Conformal field theories, representations and lattice constructions. Commun. Math. Phys. 179, 61 (1996)
20. Dixon, L., Ginsparg, P., Harvey, J.A.: Beauty and the beast: Superconformal symmetry in a Monster module. Commun. Math. Phys. 119, 285 (1988)
21. Dolan, L., Goddard, P., Montague, P.: Conformal field theory, triality and the Monster group. Phys. Lett. B236, 165 (1990)
22. Dolan, L., Goddard, P., Montague, P.: Conformal field theory of twisted vertex operators. Nucl. Phys. B338, 529 (1990)
23. Lauer, J., Mas, J., Nilles, H.P.: Twisted sector representations of discrete background symmetries for two-dimensional orbifolds. Nucl. Phys. B351, 353 (1991)
24. Conway, J.H.: Y555 and all that. In: Groups, Combinatorics & Geometry: Proceedings of the LMS Durham Symposium 1990, M. Liebeck, J. Saxl (eds), Cambridge: Cambridge University Press, 1992
25. McKay, J., Strauss, H.: The q-series of monstrous moonshine & the decomposition of the head characters. Commun. Alg. 18(1), 253 (1990)
26. Conway, J.H., Norton, S.P.: Monstrous moonshine. Bull. London Math. Soc. 11, 308 (1979)
27. Gannon, T.: Monstrous moonshine and the classification of CFT. math.QA/9906167
28. Dong, C., Li, H., Mason, G.: Compact automorphism groups of vertex operator algebras. Int. Math. Res. Notices 18, 913 (1996)
29. Brunner, I., Entin, R., Römelsberger, C.: D-branes on T(4)/Z(2) and T-duality. J. High Energy Phys. 9906, 016 (1999); hep-th/9905078
30. Gaberdiel, M.R., Recknagel, A.: Conformal boundary states for free bosons and fermions. J. High Energy Phys. 0111, 016 (2001); hep-th/0108238
31. Janik, R.A.: Exceptional boundary states at c = 1. Nucl. Phys. B618, 675 (2001); hep-th/0109021
32. Kac, V.G.: Vertex Algebras for Beginners. Providence, RI: Amer. Math. Soc., 1997
33. Frenkel, I., Huang, Y.-Z., Lepowsky, J.: On axiomatic approaches to vertex operator algebras and modules. Mem. Am. Math. Soc. 104, 1 (1993)
34. Harvey, J.A., Kachru, S., Moore, G., Silverstein, E.: Tension is dimension. J. High Energy Phys. 0003, 001 (2000); hep-th/9909072
35. Dong, C., Li, H., Mason, G.: Modular invariance of trace functions in orbifold theory. Commun. Math. Phys. 214, 1 (2000); q-alg/9703016
36. Ivanov, R.I., Tuite, M.P.: Rational generalised moonshine from abelian orbifoldings of the moonshine module. Nucl. Phys. B635, 435 (2002)
37. Lewellen, D.C.: Sewing constraints for conformal field theories on surfaces with boundaries. Nucl. Phys. B372, 654 (1992)
38. Fuchs, J., Schweigert, Ch.: Symmetry breaking boundaries II. More structures; examples. Nucl. Phys. B568, 543 (2000); hep-th/9908025
39. Conway, J.H., Curtis, R.T., Norton, S.P., Parker, R.A., Wilson, R.A.: ATLAS of Finite Groups: Maximal Subgroups and Ordinary Characters for Simple Groups. Oxford: Clarendon Press, 1985
40. Gaberdiel, M.R., Recknagel, A., Watts, G.M.T.: The conformal boundary states for SU(2) at level 1. Nucl. Phys. B626, 344 (2002)
41. Brunner, I., Rajaraman, A., Rozali, M.: D-branes on asymmetric orbifolds. Nucl. Phys. B558, 205 (1999); hep-th/9905024
42. Kors, B.: D-brane spectra of nonsupersymmetric, asymmetric orbifolds and nonperturbative contributions to the cosmological constant. J. High Energy Phys. 9911, 028 (1999); hep-th/9907007
Communicated by R.H. Dijkgraaf
Commun. Math. Phys. 234, 253–285 (2003) Digital Object Identifier (DOI) 10.1007/s00220-002-0774-4
Communications in
Mathematical Physics
Birkhoff Normal Form for Some Nonlinear PDEs
Dario Bambusi
Dipartimento di Matematica, Università degli studi di Milano, Via Saldini 50, 20133 Milano, Italy
Received: 15 May 2002 / Accepted: 13 September 2002
Published online: 24 January 2003 – © Springer-Verlag 2003
Abstract: We consider the problem of extending to PDEs the Birkhoff normal form theorem on Hamiltonian systems close to nonresonant elliptic equilibria. As a model problem we take the nonlinear wave equation
$$u_{tt} - u_{xx} + g(x,u) = 0, \qquad (0.1)$$
with Dirichlet boundary conditions on $[0, \pi]$; g is an analytic skewsymmetric function which vanishes for u = 0 and is periodic with period $2\pi$ in the x variable. We prove, under a nonresonance condition which is fulfilled for most g’s, that for any integer M there exists a canonical transformation that puts the Hamiltonian in Birkhoff normal form up to a remainder of order M. The canonical transformation is well defined in a neighbourhood of the origin of a Sobolev type phase space of sufficiently high order. Some dynamical consequences are obtained. The technique of proof is applicable to quite general semilinear equations in one space dimension.
1. Introduction
Consider a finite dimensional Hamiltonian system having an elliptic equilibrium point; it is well known that there exist canonical coordinates in which the Hamiltonian takes the form $H = h_0 + f$, where
$$h_0(p,q) = \sum_j \omega_j\, \frac{p_j^2 + q_j^2}{2}, \qquad (1.1)$$
and $f(p,q) = O(\|(p,q)\|^3)$. Moreover, if the frequencies are nonresonant, then for any integer $M \ge 4$ there exists a canonical transformation $T$ (defined in a neighborhood
of the origin) which puts the system in Birkhoff normal form up to order M, i.e. such that
$$H \circ T = h_0 + Z + \mathcal{R}, \qquad (1.2)$$
where Z depends only on the action variables $I_j = (p_j^2 + q_j^2)/2$, and $\mathcal{R} = O(\|(p,q)\|^{M})$ is a remainder which is small close to the origin. In the present paper we investigate the existence of such a transformation for some infinite dimensional Hamiltonian systems describing nonlinear partial differential equations (PDEs). As a model problem we consider the nonlinear wave equation
$$u_{tt} - u_{xx} + g(x,u) = 0, \qquad (0.1)$$
with Dirichlet boundary conditions on $[0, \pi]$, where g is a function which extends analytically to the region $|u| < R_1$ and $|\mathrm{Im}\, x| < \sigma_1$, where $R_1$, $\sigma_1$ are positive numbers, and fulfills
$$g(x,0) = 0, \qquad g(x + 2\pi, u) = g(x,u), \qquad g(-x,-u) = -g(x,u). \qquad (1.3)$$
We will prove that if g fulfills a suitable nonresonance condition (which has full measure in a sense that will be made precise), then for any $M \ge 4$ and any positive (small) R there exists a canonical transformation which puts the Hamiltonian of the system in the form (1.2). The transformation is well defined in an open ball of radius R of Sobolev-type phase spaces of sufficiently high order, and the norm of the Hamiltonian vector field of $\mathcal{R}$ is smaller than $C R^{M}$ in the same ball, with C a positive constant independent of R. As a corollary we obtain an estimate of high Sobolev norms of the solution which is valid for very long times (in the spirit of [8]) and also a very precise qualitative description of the dynamics for such times. In particular we obtain (in the spirit of [15, 16]) that all the actions $I_j$ of the linearized system are approximately constants of motion for times of the order of the norm of the initial datum to the power −M.
We recall that perturbation theory for Hamiltonian PDEs has attracted great interest during the last fifteen years, and in particular KAM theory has been partially extended by proving that most of the finite dimensional invariant tori of an integrable (e.g. linear) system survive a small perturbation [12, 13, 11, 7, 9, 18, 19, 10]. However, it is clear that, although very interesting, the solutions lying on these tori correspond to exceptional initial data. Concerning solutions starting outside such tori only a few results are known [1, 2, 6, 3, 8]. In particular, the papers [1, 2] deal with perturbations of resonant systems and prove a long time stability property of periodic solutions or finite dimensional tori in the topology induced by the energy norm; such results are based on resonant averaging combined with other techniques.
The papers [6, 3] deal with perturbations of nonresonant systems and their results allow one to control, for long times, the dynamics corresponding to all initial data in a ball of a phase space of smooth functions; however, the topology in which the dynamics is controlled is much weaker than that of the original phase space (see Remark 2.5). These latter results are obtained by exploiting the linear stability of finite dimensional tori. Finally we quote the remarkable paper [8] where a stability property of most small amplitude infinite dimensional tori of some nonlinear Schrödinger equations is obtained. This result is strongly related to the main one proved here. In the present paper we obtain a precise qualitative description of all the solutions starting in a ball of a Sobolev type phase space, in particular we obtain a control of
the (phase space) norm of the solution, and moreover we prove that each solution remains close, in a topology slightly weaker than that of the phase space, to an infinite dimensional torus (see Corollary 2.3). We point out that the technique of proof of the present paper is very general, while that of [8] (which is completely different from ours) exploits some particular features of the NLS. Moreover, we think that the approach of the present paper could be the starting point for further developments in the same way as the Birkhoff normal form is for finite dimensional systems (following for example [13, 14, 17]).
The paper is organized as follows: in Sect. 2 we give a precise statement of the results for Eq. (0.1), in Sect. 3 we present the idea of the proof and we discuss some possible extensions. In Sect. 4 we state our abstract normal form theorem. In Sect. 5 we give the proof of this abstract result. This section is divided into three subsections: in the first one we study the properties of the class of functions allowed as nonlinearities by our theory; in the second one we study the behaviour of such functions under canonical transformations generated by functions of the same class; in the third subsection we iteratively construct the normalizing transformation. Section 6 is devoted to the application to the nonlinear wave equation (0.1) and is divided into two subsections: in the first one we give Eq. (0.1) the form needed for the application of the abstract theorem; in the second subsection we prove that for most values of a suitable parameter the frequencies fulfill the nonresonance condition needed to apply the theorem.
2. Statement of the Results
To start with we make our assumptions precise. Write g as
$$g(x,u) = g_0(x)\,u + \bar g(x,u), \qquad g_0(x) = m + V(x), \qquad \bar g(x,u) = O(|u|^2),$$
where m is the average over $[-\pi, \pi]$ of $g_0$ and V has zero average. Consider now the Sturm Liouville problem
$$-\partial_{xx}\varphi_j + V\varphi_j = \mu_j \varphi_j, \qquad \varphi_j(0) = \varphi_j(\pi) = 0; \qquad (2.1)$$
it is well known that all the eigenvalues are distinct, that the solutions $\{\varphi_j\}_{j\ge 1}$ of (2.1) form an orthogonal basis of $L^2$ that we will assume to be normalized, and that, since V is analytic and symmetric, the $\varphi_j$ are analytic and skewsymmetric. Note that the eigenvalues $\mu_j$ define the frequencies of oscillation of the linear system by
$$\omega_j := \sqrt{\mu_j + m}. \qquad (2.2)$$
It is well known that (0.1) is Hamiltonian with Hamiltonian function given by
$$H = h_0 + f, \qquad (2.3)$$
where
$$h_0(v,u) := \int_0^\pi \frac{[v(x)]^2 + [u_x(x)]^2 + V(x)[u(x)]^2 + m[u(x)]^2}{2}\, dx, \qquad f(u) := \int_0^\pi G(x,u(x))\, dx;$$
$v := u_t$ is the momentum conjugated to u and $G(x,u)$ is such that $\partial_u G = \bar g$. To define formally the phase space consider the Sobolev space $H^s$ of the functions $u \in L^2$ having
s weak derivatives in $L^2$. For any integer $s \ge 2$ we define the phase space $F_s$ as the space of the functions $(u,v) \in F_s := H^s \oplus H^{s-1}$ fulfilling the compatibility conditions
$$u^{(2j)}(0) = u^{(2j)}(\pi) = 0, \quad 0 \le j \le \frac{s-1}{2}, \qquad v^{(2j)}(0) = v^{(2j)}(\pi) = 0, \quad 0 \le j \le \frac{s}{2} - 1, \qquad (2.4)$$
(the upper index in parentheses denoting the derivative of the corresponding order). We endow $F_s$ with the norm
$$\|(u,v)\|^2_{F_s} := \int_0^\pi \Big[ (\partial_x^{s-1} v)^2 + (\partial_x^{s} u)^2 + u^2 + v^2 \Big]\, dx;$$
we will denote by $B_s(R)$ the ball of radius R in $F_s$ centered at the origin.
Remark 2.1. It is well known that the Cauchy problem for Eq. (0.1) is locally well posed in $F_s$, and that it is also globally well posed for small enough initial data.
To introduce the normal modes and the linear actions consider the expansion of u and v on the eigenfunctions $\varphi_j$:
$$u(x) = \sum_{j\ge 1} u_j \varphi_j(x), \qquad v(x) = \sum_{j\ge 1} v_j \varphi_j(x),$$
and define the jth action of the linearized system by
$$I_j \equiv I_j(u,v) := \frac{v_j^2 + \omega_j^2 u_j^2}{2\omega_j}.$$
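To make the role of the linear actions concrete, the following sketch checks numerically that $I_j$ is exactly conserved by the linearized flow of a single normal mode. It is only an illustration under stated assumptions: we take V = 0, so that $\mu_j = j^2$ for the Dirichlet problem on $[0,\pi]$, and the values of m, j and the initial datum are arbitrary choices, not taken from the paper.

```python
import math


def action(u_j, v_j, omega_j):
    """Linear action I_j = (v_j^2 + omega_j^2 u_j^2) / (2 omega_j)."""
    return (v_j**2 + omega_j**2 * u_j**2) / (2.0 * omega_j)


def linear_flow(u_j, v_j, omega_j, t):
    """Exact solution at time t of u'' = -omega^2 u with v = u' (one normal mode)."""
    c, s = math.cos(omega_j * t), math.sin(omega_j * t)
    return (u_j * c + v_j / omega_j * s, -omega_j * u_j * s + v_j * c)


# Assumed illustrative values: V = 0 so that mu_j = j^2, and m = 0.5.
m = 0.5
j = 3
omega = math.sqrt(j**2 + m)
u0, v0 = 0.2, -0.1
I0 = action(u0, v0, omega)
drift = max(abs(action(*linear_flow(u0, v0, omega, t), omega) - I0)
            for t in (0.1, 1.0, 10.0, 100.0))
print(I0, drift)
```

For the nonlinear equation the actions are no longer exactly conserved; the results stated in this section quantify how slowly they can drift.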
Having fixed two positive numbers $m_1 < m_2$, such that $\mu_j + m_1 > 0$ for all $j \ge 1$, we will consider the parameter m as varying in the interval $I := [m_1, m_2]$. We have the following
Theorem 2.2. Fix a positive integer $M \ge 4$, then there exists a subset $J \subset I$ with measure equal to $m_2 - m_1$ and a positive (large) integer $s_*$ with the following properties: for any $m \in J$ there exists a positive $R_*$ and, for any $R < R_*$ there exists an analytic canonical transformation $T_R : B_{s_*}(R/3) \to B_{s_*}(R)$ which puts the Hamiltonian (2.3) in Birkhoff normal form up to order $R^M$, namely such that
$$(h_0 + f) \circ T_R = h_0 + Z + \mathcal{R}$$
with Z a function that depends only on the actions $I_j$, and $\mathcal{R}$ an analytic functional having a Hamiltonian vector field $X_{\mathcal{R}}$ which is analytic as a function from $B_{s_*}(R/3)$ to $F_{s_*}$. Moreover, for any integer $s \ge s_*$ there exists a positive constant $C_s$ such that the transformation $T_R$ and the vector field $X_{\mathcal{R}}$ are analytic also as maps from $B_s(R/C_s)$ to $F_s$ and the following estimates hold:
$$\sup_{\|(u,v)\|_{F_s} \le R/C_s} \|(u,v) - T_R(u,v)\|_{F_s} \le C_s R^2, \qquad \sup_{\|(u,v)\|_{F_s} \le R/C_s} \|X_{\mathcal{R}}(u,v)\|_{F_s} \le C_s R^{M-1}. \qquad (2.5)$$
Then we have the following
Corollary 2.3. Fix $M \ge 4$ and $m \in J$, fix also $s \ge s_*$, then there exist positive constants $R_s$, $C_s$ such that the following holds true: let $\zeta(t) \equiv (u(t), v(t))$ be the solution of the Cauchy problem for Eq. (0.1) with initial datum $(\bar u, \bar v) \in F_s$; if $\varepsilon := \|(\bar u, \bar v)\|_{F_s} \le R_s$, then for all times t with
$$|t| \le \frac{1}{C_s\, \varepsilon^{M-3}} \qquad (2.6)$$
the following estimates hold:
$$\|\zeta(t)\|_{F_s} \le 2 C_s\, \varepsilon, \qquad \frac{|I_j(t) - I_j(0)|}{\varepsilon^2} \le \frac{C_s}{j^{2s}}\, \varepsilon. \qquad (2.7)$$
Corollary 2.4. With the same notations and assumptions of Corollary 2.3, fix $M_1$, $M_2$ with $2M_1 + M_2 = M - 2$, then there exists an infinite dimensional torus $\mathbb{T}_0$ smoothly embedded in $F_s$ and, for any $s' < s - 1/2$, $s' \ge 2$, there exists a constant $C_{s,s'}$ such that one has
$$d_{s'}(\zeta(t), \mathbb{T}_0) \le C_{s,s'}\, \varepsilon^{M_1}, \qquad (2.8)$$
where $d_{s'}(\cdot,\cdot)$ denotes the distance in $F_{s'}$ and t is such that
$$|t| \le \frac{1}{C_s\, \varepsilon^{M_2}}. \qquad (2.9)$$
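The constraint $2M_1 + M_2 = M - 2$ in Corollary 2.4 expresses a trade-off between how close the solution stays to the torus and how long the estimate lasts. The sketch below simply enumerates this trade-off. It rests on assumptions not made in the text: $M_1$ and $M_2$ are taken to be non-negative integers, and both constants $C_{s,s'}$ and $C_s$ are replaced by a hypothetical placeholder value `C = 1`.

```python
def admissible_exponents(M):
    """All pairs (M1, M2) of non-negative integers with 2*M1 + M2 = M - 2."""
    return [(M1, M - 2 - 2 * M1) for M1 in range((M - 2) // 2 + 1)]


def bounds(eps, M1, M2, C=1.0):
    """Distance bound C*eps^M1 as in (2.8) and time bound 1/(C*eps^M2) as in (2.9).

    C is a placeholder; the actual constants are not specified numerically.
    """
    return C * eps**M1, 1.0 / (C * eps**M2)


pairs = admissible_exponents(8)
for M1, M2 in pairs:
    dist, time = bounds(1e-2, M1, M2)
    print(M1, M2, dist, time)
```

Larger $M_1$ gives a tighter distance bound but leaves a smaller exponent $M_2$ for the time scale, and vice versa.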
We point out that the first of the estimates (2.7) holds also for the nonlinear Klein–Gordon equation (corresponding to the case V = 0) with periodic boundary conditions (see Corollary 6.7 below), and also the second of (2.7) has an analogue in this case.
Remark 2.5. Corollaries 2.3 and 2.4 should be compared with the main result of [6]. In [6] the estimates (2.7) and (2.8) were proven for solutions corresponding to initial data in $F_{s+s_1}$, with $\varepsilon$ defined as the $F_{s+s_1}$ norm of the initial datum, and $s_1$ a large M-dependent number.
3. Idea of the Proof and Discussion
We begin by discussing the nonresonance condition. To this end consider a finite sequence of frequencies $\omega \equiv \{\omega_j\}_{j=1}^{n}$, and recall that the standard nonresonance condition used in order to prove Birkhoff’s theorem in a finite dimensional context is “for any integer $r \ge 3$ there exists a positive $\delta_r$ such that $|\omega \cdot k| \ge \delta_r$, $\forall\, k \in Z^n \setminus \{0\}$ with $|k| \le r$”. It can be easily seen that in general such a condition is not satisfied in the infinite dimensional case $n = \infty$, so it has to be modified in order to deal with PDEs. To this end remark that, given k as above, one has $k \cdot \omega \equiv \pm\omega_{i_1} \pm \omega_{i_2} \pm \dots \pm \omega_{i_r}$ with a suitable choice of the signs and of the indices $i_1, \dots, i_r$, and consider the nonresonance condition
$$|\omega_{i_1} \pm \omega_{i_2} \pm \dots \pm \omega_{i_r}| \ge \frac{\gamma}{i_1^{\tau}\, i_2^{\tau} \cdots i_r^{\tau}}: \qquad (3.1)$$
we will prove that, provided $\tau$ is sufficiently large (depending on r), the frequencies (2.2) fulfill (3.1) if m is in a set with complement having measure of the order of $\gamma^{1/r}$ (see Theorem 6.16). We point out that condition (3.1) is fulfilled with large probability in very general models. For example, this is true if $\omega \in [0,1]^\infty$ with $[0,1]^\infty$ endowed with the product measure. Condition (3.1) makes clear that very small denominators can only occur when many of the indices involved are large (or when one large index occurs many times). Recall now that small denominators appear in solving the homological equation. More precisely, having introduced variables $(\xi_j, \eta_j)_{j\ge 1}$ in which the operator of Poisson brackets with $h_0$ is diagonal, the algorithm of the Birkhoff normal form introduces a denominator of the form $\omega_{k_1} + \omega_{k_2} + \dots + \omega_{k_{n_1}} - \omega_{j_1} - \omega_{j_2} - \dots - \omega_{j_{n_2}}$ in order to eliminate from the nonlinearity a monomial of the form
$$f_{k,j}\, \xi_{k_1} \xi_{k_2} \cdots \xi_{k_{n_1}}\, \eta_{j_1} \eta_{j_2} \cdots \eta_{j_{n_2}}, \qquad (3.2)$$
where $f_{k,j} \in C$ is a complex coefficient. The main point is that, due to the locality of the nonlinearity in (0.1), the vector field of a monomial having a nonvanishing coefficient $f_{k,j}$ turns out to be small when at least three of the indexes $k_l$, $j_l$ are large and the sequences $\xi_j$, $\eta_j$ decay fast enough with j (see Lemma 5.6 and Proposition 5.8 below). The smallness of monomials involving many variables with large index allows one to perform a spatial analogue of the ultraviolet cutoff currently used in KAM theory, namely a momentum cutoff: precisely, we fix a large N and eliminate from the nonlinearity only monomials which contain at most two variables with index larger than N, the others being already small. In order to do that we assume that the small denominators one has to control satisfy a diophantine type condition stronger than (3.1), namely that there exists $\alpha$ such that
$$|\omega_{i_1} \pm \omega_{i_2} \pm \dots \pm \omega_{i_r} \pm \omega_j \pm \omega_k| \ge \frac{\gamma}{N^{\alpha}}, \qquad (3.3)$$
where the indexes $i_l$ are smaller than N and $k \ge j > N$. This is the condition we will prove to have full measure (see Theorem 6.5 below). We point out that a related diophantine type condition was already used in [6] by Bourgain, and that a momentum cutoff was already used in different contexts in [5, 11]. We also point out that the properties that allow one to prove that monomials involving variables with large index have a small vector field are (i) the regularity of the nonlinearity, (ii) its locality and (iii) the fact that the eigenfunctions of the linearized problem are “well localized with respect to the Fourier basis”. Technically we are able to obtain the proof by a trick which consists in embedding the dynamical system (0.1) into a larger one, essentially in the same way as one embeds periodic skewsymmetric functions in the spaces of periodic functions in order to use the Fourier expansion on the exponentials, which is simpler to manipulate than the sine expansion.
We conclude by remarking that our approach applies directly also to the case of the nonlinear Klein–Gordon equation with periodic boundary conditions (see Corollary 6.7 below), and also to quite general equations in one space dimension. Concerning possible extensions to systems with more than one space dimension, we remark that the nonresonance condition (3.1) still holds, and the smallness property of terms with large index also holds for these systems. However, Eq. (3.3) is typical of systems in one space dimension, and we do not know whether similar results can be obtained under weaker nonresonance assumptions.
4. Statement of the Abstract Result

Denote $\bar{\mathbb Z} := \mathbb Z\setminus\{0\}$; define the space $\ell^2_s$ of the complex sequences $\xi\equiv\{\xi_j\}_{j\in\bar{\mathbb Z}}$ such that
$$\|\xi\|^2_{\ell^2_s} := \sum_{j\in\bar{\mathbb Z}}(1+j^{2s})\,|\xi_j|^2<\infty,$$
and the symplectic spaces
$$P_s := \ell^2_s\oplus\ell^2_s\ni\zeta\equiv\left(\{\xi_j\}_{j\in\bar{\mathbb Z}},\{\eta_j\}_{j\in\bar{\mathbb Z}}\right),$$
endowed by the symplectic form $i\sum_{j\in\bar{\mathbb Z}}d\eta_j\wedge d\xi_{-j}$. In $P_s$ we will use the norm $\|\zeta\|_s^2=\|\xi\|^2_{\ell^2_s}+\|\eta\|^2_{\ell^2_s}$, and will denote by $B_s(R^{(s)})$ the ball of radius $R^{(s)}$ centered at the origin of $P_s$. For the Hamiltonian vector field of a Hamiltonian function $H$ we will use the notation $X_H$, so that the Hamilton equations are $\dot\zeta=X_H(\zeta)$. Explicitly they are given by
$$\frac{d}{dt}\xi_j=i\frac{\partial H}{\partial\eta_{-j}},\qquad\frac{d}{dt}\eta_j=-i\frac{\partial H}{\partial\xi_{-j}}.$$
In order to fix ideas one can think of $\xi_j$ as the $j$th Fourier coefficient on the basis of the exponentials of a smooth function, and similarly for $\eta_j$. Consider a Hamiltonian of the form
$$H := h_0+f\tag{4.1}$$
with
$$h_0(\xi,\eta) := \sum_{j\in\bar{\mathbb Z}}\omega_j\xi_j\eta_{-j},\tag{4.2}$$
and real $\omega_j$'s; the functional $f$ will be assumed to be smooth in a sense that will be specified in a while and to have a zero of order three at the origin.

Definition. Consider a polynomial $h(\xi,\eta)$ homogeneous of degree $n_1$ in $\xi$ and of degree $n_2$ in $\eta$, namely
$$h(\xi,\eta)=\sum_{k,j}h_{k,j}\,\xi_{k_1}\cdots\xi_{k_{n_1}}\eta_{j_1}\cdots\eta_{j_{n_2}},\tag{4.3}$$
where we denoted $k=(k_1,\dots,k_{n_1})$, $j=(j_1,\dots,j_{n_2})$. The quantity
$$|h|_\mu := \sum_{l\in\mathbb Z}e^{\mu|l|}\sup_{k_1+\dots+k_{n_1}+j_1+\dots+j_{n_2}=l}|h_{k,j}|\tag{4.4}$$
will be called the $\mu$-modulus of $h$.

Definition. Consider a functional $h=\sum_{n_1,n_2}h_{n_1,n_2}$ with $h_{n_1,n_2}$ homogeneous of degrees $n_1$, $n_2$ in $\xi$ and $\eta$ respectively. If the quantity
$$|h|^\mu_R := \sum_{n_1,n_2}|h_{n_1,n_2}|_\mu R^{n_1+n_2}\tag{4.5}$$
is finite for some positive $R$, $\mu$, then we will say that $h$ is of class $\mathcal M^\mu_R$ and we will write $h\in\mathcal M^\mu_R$.
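The two definitions above can be made concrete on a finite polynomial. The following Python lines are only an illustrative sketch: the dictionary representation of the coefficients and the function names `mu_modulus` and `class_norm` are assumptions made here, not notation from the text.

```python
import math
from collections import defaultdict

def mu_modulus(coeffs, mu):
    """mu-modulus (4.4) of a homogeneous polynomial.

    coeffs maps (k, j) -> h_{k,j}, where k and j are tuples holding the
    indices of the xi factors and of the eta factors respectively.
    """
    sup_by_total = defaultdict(float)
    for (k, j), c in coeffs.items():
        total = sum(k) + sum(j)  # l = k_1+...+k_{n1}+j_1+...+j_{n2}
        sup_by_total[total] = max(sup_by_total[total], abs(c))
    return sum(math.exp(mu * abs(l)) * s for l, s in sup_by_total.items())

def class_norm(pieces, mu, radius):
    """Norm (4.5): sum over the homogeneous pieces of |h_{n1,n2}|_mu * R^(n1+n2)."""
    total = 0.0
    for coeffs in pieces:
        k, j = next(iter(coeffs))  # any monomial fixes the bidegree (n1, n2)
        total += mu_modulus(coeffs, mu) * radius ** (len(k) + len(j))
    return total

# Hypothetical example: h = 2 xi_1 eta_{-1} + 3 xi_2 eta_{-2} + xi_1 eta_1
h = {((1,), (-1,)): 2.0, ((2,), (-2,)): 3.0, ((1,), (1,)): 1.0}
modulus = mu_modulus(h, mu=0.5)            # index sums l = 0 (sup 3) and l = 2 (sup 1)
norm = class_norm([h], mu=0.5, radius=0.1)
```

The first two monomials have total index $l=0$ and contribute only through their supremum, while the third has $l=2$ and is weighted by $e^{2\mu}$.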
We will assume that $f\in\mathcal M^\mu_R$, and we will prove that in the case of the nonlinear wave equation the perturbation is of class $\mathcal M^\mu_R$.

Definition. We will say that a sequence of frequencies $\{\omega_j\}_{j\in\bar{\mathbb Z}}$ is strongly non-vanishing if for any $r\ge3$ there exist a positive $\gamma$ and a real $\alpha$ such that, for any (large) $N$, any choice of indexes $i_1,\dots,i_r,j,k$ with $|i_1|\le|i_2|\le\dots\le|i_r|\le N<|j|\le|k|$ and any choice of $r+1$ signs one has that
$$\omega_{i_1}\pm\omega_{i_2}\pm\dots\pm\omega_{i_r}\pm\omega_j\pm\omega_k\ne0$$
implies
$$\left|\omega_{i_1}\pm\omega_{i_2}\pm\dots\pm\omega_{i_r}\pm\omega_j\pm\omega_k\right|\ge\frac{\gamma}{N^\alpha}.\tag{3.3}$$
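Condition (3.3) can be probed numerically on a truncated index set. The Python sketch below is an illustration under explicit assumptions: it uses the model frequencies $\omega_j=\sqrt{j^2+m}$ (the case $V=0$), restricts to positive indices, truncates the high indices at a hypothetical cutoff `k_max`, and treats combinations below a tolerance as exactly resonant. The function name and its parameters are not from the text.

```python
import itertools
import math

def min_small_divisor(omega, r, n_cut, k_max, tol=1e-12):
    """Smallest nonzero |w_i1 +- ... +- w_ir +- w_j +- w_k| with
    1 <= i1 <= ... <= ir <= n_cut < j <= k <= k_max."""
    best = math.inf
    for idx in itertools.combinations_with_replacement(range(1, n_cut + 1), r):
        for j in range(n_cut + 1, k_max + 1):
            for k in range(j, k_max + 1):
                freqs = [omega(i) for i in idx] + [omega(j), omega(k)]
                # The first sign can be fixed to +1 because of the absolute value.
                for signs in itertools.product((1, -1), repeat=r + 1):
                    total = freqs[0] + sum(s * w for s, w in zip(signs, freqs[1:]))
                    if tol < abs(total) < best:
                        best = abs(total)
    return best

m = 0.5
omega = lambda j: math.sqrt(j * j + m)
divisor = min_small_divisor(omega, r=3, n_cut=3, k_max=8)
```

Repeating the computation for increasing `n_cut` gives a rough feeling for how fast the smallest divisor decays with $N$; it does not replace the measure estimate of Theorem 6.5.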
We will assume that the frequencies are strongly nonvanishing. The last definition we need is that of the Birkhoff normal form:
Definition. Consider a functional $Z=\sum_{n_1,n_2}Z^{n_1,n_2}$, with $Z^{n_1,n_2}$ homogeneous of degree $n_1$ in $\xi$ and of degree $n_2$ in $\eta$. Write
$$Z^{n_1,n_2}=\sum_{k_1,\dots,k_{n_1},j_1,\dots,j_{n_2}}Z^{n_1,n_2}_{k_1,\dots,k_{n_1},j_1,\dots,j_{n_2}}\,\xi_{k_1}\xi_{k_2}\cdots\xi_{k_{n_1}}\eta_{j_1}\eta_{j_2}\cdots\eta_{j_{n_2}};$$
$Z$ will be said to be in Birkhoff normal form if
$$Z^{n_1,n_2}_{k_1,\dots,k_{n_1},j_1,\dots,j_{n_2}}\ne0\;\Rightarrow\;\omega_{k_1}+\omega_{k_2}+\dots+\omega_{k_{n_1}}-\omega_{-j_1}-\omega_{-j_2}-\dots-\omega_{-j_{n_2}}=0.$$
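As an illustration of this definition, the following Python sketch checks whether a finite polynomial is in Birkhoff normal form. The coefficient layout (a dictionary keyed by the index tuples $(k,j)$), the tolerance and the sample frequencies $\omega_j=\sqrt{j^2+m}$ are assumptions made for the example only.

```python
import math

def divisor(omega, k, j):
    """omega_{k_1}+...+omega_{k_n1} - omega_{-j_1} - ... - omega_{-j_n2}."""
    return sum(omega(a) for a in k) - sum(omega(-b) for b in j)

def is_in_normal_form(coeffs, omega, tol=1e-12):
    """True if every monomial with a nonzero coefficient is resonant."""
    return all(abs(divisor(omega, k, j)) <= tol
               for (k, j), c in coeffs.items() if c != 0)

omega = lambda j: math.sqrt(j * j + 0.5)  # even in j, so omega(-j) == omega(j)

# xi_1 eta_{-1} and xi_1 xi_{-1} eta_1 eta_{-1} are resonant ...
resonant = {((1,), (-1,)): 1.0, ((1, -1), (1, -1)): 0.25}
# ... while xi_1 eta_{-2} produces the divisor omega_1 - omega_2 != 0.
non_resonant = {((1,), (-2,)): 1.0}
```

With these inputs `is_in_normal_form(resonant, omega)` holds and `is_in_normal_form(non_resonant, omega)` does not.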
In the application to Eq. (0.1) we will need to restrict the system to the submanifold of skew-symmetric functions that we will denote by
$$A := \{(\xi_k,\eta_k)\,:\,\xi_k=\xi_{-k}\ \text{and}\ \eta_k=\eta_{-k}\},$$
and to the manifold
$$Re := \{(\xi_k,\eta_k)\,:\,\xi_k^*=\eta_{-k}\}$$
corresponding to real functions. We assume that
(1) the frequencies $\omega_j$ of the Hamiltonian system (4.1) are strongly nonvanishing with some indices $\alpha$, $\gamma$;
(2) there exist $\bar R_1>0$ and $\mu>0$ such that the function $f$ is of class $\mathcal M^\mu_R$ for any $R\le\bar R_1$ and fulfills the estimate
$$|f|^\mu_R\le AR^3$$
with some positive $A$.

Theorem 4.1. Under the above assumptions, fix a large $M$, define $s_* := 4M\alpha(2M)$; then there exist constants $C$, $R_*$ such that for any positive $R^{(s_*)}$ fulfilling $R^{(s_*)}<R_*$ the following holds true: There exists a canonical transformation $T: B_{s_*}(R^{(s_*)}/3)\to B_{s_*}(R^{(s_*)})$ such that the transformed Hamiltonian has the form
$$(h_0+f)\circ T=h_0+Z+\mathcal R$$
with $Z$ in Birkhoff normal form, and $\mathcal R$ an analytic functional having a Hamiltonian vector field which is analytic as a function from $P_{s_*}$ to itself; moreover there exists a positive $c$ such that for any $s\ge s_*$, and any $R^{(s)}\le R_*/c^s$ the transformation $T$ and the vector field $X_{\mathcal R}$ are analytic also as maps from $B_s(R^{(s)})$ to $P_s$ and the following estimate holds:
$$\sup_{\|\zeta\|_s\le R^{(s)}}\|\zeta-T(\zeta)\|_s\le C\left(c^sR^{(s)}\right)^2,$$
moreover
$$\sup_{\|\zeta\|_s\le R^{(s)}}\|X_{\mathcal R}(\zeta)\|_s\le C\left(c^sR^{(s)}\right)^M.\tag{4.6}$$
Finally, if the vector field of $f$ leaves invariant $A$ and/or $Re$, then the same is true for $T$ and for the vector fields of $Z$ and of $\mathcal R$.

5. Proof of Theorem 4.1

We split the proof in three parts. In the first we establish some properties of the functions of class $\mathcal M^\mu_R$; in the second we prove that the Lie transform of a function of class $\mathcal M^\mu_R$ is of class $\mathcal M^\mu_{R'}$ with any $R'<R$, provided it is generated by a function of class $\mathcal M^\mu_R$; in the third subsection we prove the iterative lemma which constitutes the main step of the proof, and deduce Theorem 4.1. We will denote
$$\vartheta_s := \Big(\sum_{j\in\mathbb Z}(1+j^{2s})\,e^{-2\mu|j|}\Big)^{1/2},\qquad\rho_s := \frac{2^{s+1}}{\pi},$$
and remark that there exists a positive ($\mu$ dependent) constant $c$ such that
$$\vartheta_s<c^s.\tag{5.1}$$
Moreover $C$ will denote a positive (large) constant whose value can change through the paper, and which is independent of the relevant quantities. For analytic functions $f$ that have an analytic vector field we will write
$$\|X_f\|_s^{R^{(s)}} := \sup_{\|\zeta\|_s\le R^{(s)}}\|X_f(\zeta)\|_s.$$
Having fixed some integer $N$, we will denote by $(q_j,p_j)_{|j|\le N}=(\xi_j,\eta_j)_{|j|\le N}$ the first $N$ canonical variables and by $(Q_j,P_j)_{|j|>N}=(\xi_j,\eta_j)_{|j|>N}$ the remaining ones.

5.1. Properties of the functions of class $\mathcal M^\mu_R$. The main results of this subsection are Propositions 5.3 and 5.8; the hurried reader can go directly to them and leave the rest of the subsection for a subsequent reading.
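Given a homogeneous polynomial $h$ of degree $n_1$ in $\xi$ and of degree $n_2$ in $\eta$, write
$$h(\xi,\eta)=\sum_{k,i}h_{k,i}\,\xi_{k_1}\cdots\xi_{k_{n_1}}\eta_{i_1}\cdots\eta_{i_{n_2}};\tag{5.2}$$
denote by
$$\langle i\rangle := i_1+i_2+\dots+i_{n_2}$$
the sum of the components of a multiindex $i$ (and similarly for $k$) and define the functional
$$\check h(\xi,\eta) := \sum_{k,i}e^{-\mu|\langle k\rangle+\langle i\rangle|}\,\xi_{k_1}\cdots\xi_{k_{n_1}}\eta_{i_1}\cdots\eta_{i_{n_2}},\tag{5.3}$$
which clearly depends only on the values of $n_1$, $n_2$ in (5.2). The interest of this functional is that it is very easy to estimate the norm of its vector field, since $\check h$ can be written in the form
$$\check h(\xi,\eta)=\frac{1}{(2\pi)^{\frac{n-1}{2}}}\int_{-\pi}^{\pi}F(x)\,\xi(x)^{n_1}\eta(x)^{n_2}\,dx,\tag{5.4}$$
where $n=n_1+n_2$ and we defined
$$\xi(x) := \frac{1}{\sqrt{2\pi}}\sum_{j\in\bar{\mathbb Z}}\xi_je^{ijx},\qquad\eta(x) := \frac{1}{\sqrt{2\pi}}\sum_{j\in\bar{\mathbb Z}}\eta_je^{ijx},\qquad F(x) := \frac{1}{\sqrt{2\pi}}\sum_{j\in\bar{\mathbb Z}}e^{-\mu|j|}e^{ijx}.\tag{5.5}$$

Lemma 5.1. One has
$$\|X_h\|_s^R\le|h|_\mu\,\|X_{\check h}\|_s^R.$$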
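Proof. We write explicitly the vector field of $h$; to this end we use the form (5.2). Consider the $\eta_t$ component of $X_h$; it is given (except for an irrelevant sign) by
$$i\frac{\partial h}{\partial\xi_{-t}}=i\,n_1\sum_{k_2,\dots,k_{n_1},i}h_{-t,k_2,\dots,k_{n_1},i}\,\xi_{k_2}\cdots\xi_{k_{n_1}}\eta_{i_1}\cdots\eta_{i_{n_2}},$$
and the squared norm of the $\eta$ component of $X_h$ is given by
$$\sum_t\left(1+|t|^{2s}\right)\Big|n_1\sum_{k_2,\dots,k_{n_1},i}h_{-t,k_2,\dots,k_{n_1},i}\,\xi_{k_2}\cdots\xi_{k_{n_1}}\eta_{i_1}\cdots\eta_{i_{n_2}}\Big|^2$$
$$\le\Big[\sup_{t,k,i}\big|h_{t,k_2,\dots,k_{n_1},i}\big|\,e^{\mu|k_2+\dots+k_{n_1}+\langle i\rangle+t|}\Big]^2\times\sum_t\left(1+|t|^{2s}\right)\Big[n_1\sum_{k_2,\dots,k_{n_1},i}e^{-\mu|k_2+\dots+k_{n_1}+\langle i\rangle-t|}\,\big|\xi_{k_2}\cdots\xi_{k_{n_1}}\eta_{i_1}\cdots\eta_{i_{n_2}}\big|\Big]^2,$$
but the first square bracket is smaller than $|h|_\mu$, while the second one is controlled by the norm of the $\eta$ component of $X_{\check h}$. Indeed, the squared norm of the $\eta$ component of $X_{\check h}$ is given by
$$\sum_t\left(1+|t|^{2s}\right)\Big|n_1\sum_{k_2,\dots,k_{n_1},i}e^{-\mu|k_2+\dots+k_{n_1}+\langle i\rangle-t|}\,\xi_{k_2}\cdots\xi_{k_{n_1}}\eta_{i_1}\cdots\eta_{i_{n_2}}\Big|^2;$$
taking real positive $\xi$, $\eta$ one obtains the expression in the above square bracket. Similar inequalities clearly hold for the other components of the vector field of $h$. One thus obtains
$$|h|_\mu^2\left(\|X_{\check h}\|_s^R\right)^2=|h|_\mu^2\sup_{R_1+R_2=R}\Big[\big(\|X^\xi_{\check h}\|_s^{R_1}\big)^2+\big(\|X^\eta_{\check h}\|_s^{R_2}\big)^2\Big]\ge\sup_{R_1+R_2=R}\Big[\big(\|X^\xi_h\|_s^{R_1}\big)^2+\big(\|X^\eta_h\|_s^{R_2}\big)^2\Big]=\left(\|X_h\|_s^R\right)^2,$$
which is the thesis. □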
Lemma 5.2. Consider the function $\check h$ as above; one has
$$\|X_{\check h}\|_s^R\le n\vartheta_s(\rho_sR)^{n-1},$$
where $n=n_1+n_2$.

Proof. We begin by considering the $\xi$ component of $X_{\check h}$; one has
$$X^\xi_{\check h}=i\,\frac{n_2F\xi^{n_1}\eta^{n_2-1}}{(2\pi)^{\frac{n-1}{2}}},$$
or more precisely
$$X^\xi_{\check h}(x)=i\,\frac{n_2F(x)[\xi(x)]^{n_1}[\eta(x)]^{n_2-1}}{(2\pi)^{\frac{n-1}{2}}},$$
from which, using the inequality
$$\|\xi^2\|_{H^s}\le\sqrt{\frac{2}{\pi}}\,2^{s+1}\,\|\xi\|_{H^s}\|\xi\|_{H^s},\tag{5.6}$$
and remarking that $\vartheta_s=\|F\|_{H^s}$, one has $\|X^\xi_{\check h}\|_s^R\le n_2\vartheta_s(\rho_sR)^{n-1}$. Considering also the $\eta$ component one has the thesis. □
Proposition 5.3. Let $h\in\mathcal M^\mu_R$; then, for any positive $R^{(s)}$ such that $\rho_sR^{(s)}\le R$ and any $\delta^{(s)}<R^{(s)}$, the Hamiltonian vector field $X_h$ is analytic as a map from $B_s(R^{(s)})$ to $P_s$ and fulfills the estimate
$$\|X_h\|_s^{R^{(s)}-\delta^{(s)}}\le\frac{\vartheta_s}{\rho_s\delta^{(s)}}\,|h|^\mu_{\rho_sR^{(s)}}.\tag{5.7}$$

Proof. First work on homogeneous polynomials of degree $n$. One has
$$h=\sum_{n_1+n_2=n}h_{n_1,n_2},$$
with $h_{n_1,n_2}$ homogeneous in $\xi$ and in $\eta$. One has
$$\|X_h\|_s^{R^{(s)}-\delta^{(s)}}\le\sum_{n_1,n_2}\|X_{h_{n_1,n_2}}\|_s^{R^{(s)}-\delta^{(s)}}\le\sum_{n_1,n_2}|h_{n_1,n_2}|_\mu\,n\vartheta_s\left[\rho_s\big(R^{(s)}-\delta^{(s)}\big)\right]^{n-1}$$
$$\le\frac{\vartheta_s}{\rho_s\delta^{(s)}}\sum_{n_1,n_2}|h_{n_1,n_2}|_\mu\left(\rho_sR^{(s)}\right)^n=\frac{\vartheta_s}{\rho_s\delta^{(s)}}\,|h|^\mu_{\rho_sR^{(s)}},$$
where we used the inequality $n(R-d)^{n-1}\le R^nd^{-1}$. Then the result easily extends to analytic functions. □
Consider now a homogeneous polynomial of degree $m_1,m_2,m_3,m_4$ in $q,p,Q,P$ respectively, and write it in the form
$$h(q,p,Q,P)=\sum_{k,i,j,l}h_{k,i,j,l}\,q_{k_1}\cdots q_{k_{m_1}}p_{i_1}\cdots p_{i_{m_2}}Q_{j_1}\cdots Q_{j_{m_3}}P_{l_1}\cdots P_{l_{m_4}}.\tag{5.8}$$
Correspondingly we define the functional
$$\check{\check h}(p,q,P,Q) := \frac{1}{(2\pi)^{\frac{n-1}{2}}}\int_{-\pi}^{\pi}F(x)\,q(x)^{m_1}p(x)^{m_2}Q(x)^{m_3}P(x)^{m_4}\,dx$$
with $q(x),p(x),Q(x),P(x)$ defined in analogy with (5.5). For its vector field we will prove an estimate stronger than (5.7); this is the heart of the momentum cutoff. Working exactly as in the proof of Lemma 5.1 one has

Lemma 5.4. Let $h$ be a homogeneous polynomial of degree $m_1,m_2,m_3,m_4$ in $q,p,Q,P$ respectively; then one has
$$\|X_h\|_s^R\le|h|_\mu\,\|X_{\check{\check h}}\|_s^R.$$

To come to the estimate of the vector field of $\check{\check h}$ we need a few lemmas concerning the vector field of the product of the functions $P(x)$ and $Q(x)$.

Lemma 5.5. Let $\varphi$ be such that
$$\varphi(x)=\frac{1}{\sqrt{2\pi}}\sum_{|k|>N}\varphi_ke^{ikx};$$
then, given positive integers $s$, $j$ such that $s>j+\frac12$, one has
$$\left|\frac{d^j\varphi}{dx^j}(x)\right|\le\sqrt{\frac{2}{\pi}}\,\frac{1}{N^{s-j-\frac12}}\,\|\varphi\|_{H^s}.$$

Proof. One has
$$\sqrt{2\pi}\left|\frac{d^j\varphi}{dx^j}(x)\right|=\Big|\sum_{|k|>N}(ik)^j\varphi_ke^{ikx}\Big|\le\sum_{|k|>N}|\varphi_k||k|^j\le\Big(\sum_{|k|>N}|k|^{2s}|\varphi_k|^2\Big)^{1/2}\Big(\sum_{|k|>N}\frac{1}{|k|^{2(s-j)}}\Big)^{1/2}\le\Big(\frac{2}{[2(s-j)-1]N^{2(s-j)-1}}\Big)^{1/2}\|\varphi\|_{H^s}.\ \Box$$
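Lemma 5.6. Let $\varphi$ and $\psi$ be such that
$$\psi(x)=\frac{1}{\sqrt{2\pi}}\sum_{|k|>N}\psi_ke^{ikx},\qquad\varphi(x)=\frac{1}{\sqrt{2\pi}}\sum_{|k|>N}\varphi_ke^{ikx};$$
then, for any integer $s\ge1$, one has
$$\|\psi\varphi\|_{H^s}\le\sqrt{\frac{2}{\pi}}\,\frac{2^{s+1}}{N^{s-1}}\,\|\psi\|_{H^s}\|\varphi\|_{H^s}.\tag{5.9}$$

Proof. The main point consists in estimating the $L^2$ norm of
$$\frac{d^s}{dx^s}(\psi\varphi)=\sum_{j=0}^s\binom{s}{j}\psi^{(j)}\varphi^{(s-j)}.$$
Denote by $D_j$ the $j$th addendum at r.h.s. For $1\le j\le s-1$ one has (by Lemma 5.5)
$$|D_j(x)|\le\binom{s}{j}\frac{2}{\pi}\frac{1}{N^{s-1}}\,\|\varphi\|_{H^s}\|\psi\|_{H^s},$$
moreover,
$$|D_0(x)|\le\sqrt{\frac{2}{\pi}}\,\frac{1}{N^{s-1/2}}\,\big|\psi^{(s)}(x)\big|\,\|\varphi\|_{H^s},\qquad|D_s(x)|\le\sqrt{\frac{2}{\pi}}\,\frac{1}{N^{s-1/2}}\,\big|\varphi^{(s)}(x)\big|\,\|\psi\|_{H^s},$$
from which
$$\|\varphi\psi\|_{L^2}+\Big\|\sum_{j=0}^sD_j\Big\|_{L^2}\le2\sqrt{\frac{2}{\pi}}\,\frac{1}{N^{s-1}}\sum_{j=0}^s\binom{s}{j}\|\varphi\|_{H^s}\|\psi\|_{H^s};$$
using $\sum_{j=0}^s\binom{s}{j}=2^s$ the thesis follows. □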
Lemma 5.7. Let $\check{\check h}$ be of degree at least three in $P$, $Q$, namely with $m_3+m_4\ge3$; then one has
$$\|X_{\check{\check h}}\|_s^R\le n\,\frac{\vartheta_s}{N^{s-1}}\,(\rho_sR)^{n-1},\tag{5.10}$$
where $n=m_1+m_2+m_3+m_4$.

Proof. For simplicity we will consider only the case $m_3\ge3$; the other cases can be treated exactly in the same way. We proceed as in the proof of Lemma 5.2, but consider first the $P$ component of the vector field. Its norm is estimated by
$$(2\pi)^{\frac{n-1}{2}}\big\|X^P_{\check{\check h}}\big\|_{H^s}=m_3\big\|Fq^{m_1}p^{m_2}Q^{m_3-1}P^{m_4}\big\|_{H^s}\le m_3\big\|Fq^{m_1}p^{m_2}Q^{m_3-3}P^{m_4}\big\|_{H^s}\sqrt{\frac{2}{\pi}}\,2^{s+1}\,\|Q^2\|_{H^s}$$
$$\le m_3\big\|Fq^{m_1}p^{m_2}Q^{m_3-3}P^{m_4}\big\|_{H^s}\Big(\sqrt{\frac{2}{\pi}}\,2^{s+1}\Big)^2\frac{1}{N^{s-1}}\,\|Q\|_{H^s}\|Q\|_{H^s}\le m_3\vartheta_s\frac{1}{N^{s-1}}\Big(\sqrt{\frac{2}{\pi}}\,2^{s+1}R\Big)^{n-1}.$$
Considering the other components and summing up one has the thesis. □
Finally we have

Proposition 5.8. Let $h$ be a polynomial of degree less than or equal to $r$ and of class $\mathcal M^\mu_R$; assume also that it is at least cubic in $(P,Q)$, namely that it is the sum of polynomials of the form (5.8) each one fulfilling $m_3+m_4\ge3$; then, for any positive $R^{(s)}$ such that $\rho_sR^{(s)}\le R$ and any $\delta^{(s)}<R^{(s)}$, one has
$$\|X_h\|_s^{R^{(s)}-\delta^{(s)}}\le\frac{2^{r+1}\vartheta_s}{\rho_s\delta^{(s)}N^{s-1}}\,|h|^\mu_{\rho_sR^{(s)}}.\tag{5.11}$$

Proof. First consider a polynomial $h^{n_1,n_2}$ of degree $n_1$, $n_2$ in $\xi$ and $\eta$ respectively. Expand it in Taylor series in $P$, $Q$. Such a Taylor series contains at most $2^{n_1+n_2}$ homogeneous polynomials. Denote by $h^{n_1,n_2,m_3,m_4}$ the polynomial of degree $m_3$, $m_4$ in $Q$, $P$ respectively. Then one has
$$\big|h^{n_1,n_2,m_3,m_4}\big|^\mu_R\le\big|h^{n_1,n_2}\big|^\mu_R,$$
and thus, by Lemma 5.7, working as in the proof of Proposition 5.3 one has
$$\big\|X_{h^{n_1,n_2,m_3,m_4}}\big\|_s^{R^{(s)}-\delta^{(s)}}\le\frac{\vartheta_s}{\rho_s\delta^{(s)}}\frac{1}{N^{s-1}}\,\big|h^{n_1,n_2,m_3,m_4}\big|^\mu_{\rho_sR^{(s)}}.\tag{5.12}$$
Summing over all the different polynomials one gets at most the factor $2^{r+1}$ and the desired result. □

5.2. Lie transform estimates. In this subsection we first recall some facts about the Lie transform, and then we estimate the $\mathcal M^\mu_{R-d}$ norm of the Lie transform of an $\mathcal M^\mu_R$ function. This is given by Lemma 5.12 below, which constitutes the main result of this subsection. Consider a function $\chi$ and the corresponding Hamilton equations $\dot\zeta=X_\chi(\zeta)$; the corresponding flow $T^t$ is a canonical transformation, and $T := T^1\equiv T^t\big|_{t=1}$ is called the Lie transform generated by $\chi$. Given an analytic function $h$ one can consider the transformed function $h\circ T$. We recall that one has
$$h\circ T=\sum_{l=0}^\infty h_l,\tag{5.13}$$
where the $h_l$'s are defined by
$$h_l=\frac{1}{l}\{\chi,h_{l-1}\},\quad l\ge1;\qquad h_0=h,\tag{5.14}$$
as is easily seen by iterating the equality
$$\frac{d}{dt}\left[h\circ T^t\right]=\{\chi,h\}\circ T^t,$$
and considering the Taylor expansion in $t$ of $h\circ T^t$. In order to estimate the norm of $h\circ T$ we need first to estimate the Poisson bracket of two $\mathcal M^\mu_R$ functions. We will do it in a few steps.

Lemma 5.9. Let $h$ be a homogeneous polynomial of degree $m$ and $g$ a homogeneous polynomial of degree $n$; assume that both $h$, $g$ are homogeneous in $\xi$ and $\eta$ independently, and that they have finite $\mu$-modulus; then one has
$$|\{h,g\}|_\mu\le mn\,|h|_\mu|g|_\mu.\tag{5.15}$$
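Proof. Write $h$ and $g$ explicitly as follows
$$h(\xi,\eta)=\sum_{i_1,\dots,i_{n_1},j_1,\dots,j_{m_1}}h_{i_1,\dots,i_{n_1},j_1,\dots,j_{m_1}}\,\xi_{i_1}\cdots\xi_{i_{n_1}}\eta_{j_1}\cdots\eta_{j_{m_1}},$$
$$g(\xi,\eta)=\sum_{k_1,\dots,k_{n_2},l_1,\dots,l_{m_2}}g_{k_1,\dots,k_{n_2},l_1,\dots,l_{m_2}}\,\xi_{k_1}\cdots\xi_{k_{n_2}}\eta_{l_1}\cdots\eta_{l_{m_2}}.$$
One has
$$\varphi := \{h,g\}=i\sum_t\left(\frac{\partial h}{\partial\xi_t}\frac{\partial g}{\partial\eta_{-t}}-\frac{\partial g}{\partial\xi_t}\frac{\partial h}{\partial\eta_{-t}}\right).\tag{5.16}$$
We first consider the case where $h$ depends on $\xi$ only and $g$ depends on $\eta$: We have
$$\varphi(\xi,\eta)=\sum_{i_2,\dots,i_n,j_2,\dots,j_m}\varphi_{i_2,\dots,i_n,j_2,\dots,j_m}\,\xi_{i_2}\cdots\xi_{i_n}\eta_{j_2}\cdots\eta_{j_m}$$
with
$$\varphi_{i_2,\dots,i_n,j_2,\dots,j_m}=mn\sum_{j_1}h_{-j_1,i_2,\dots,i_n}\,g_{j_1,j_2,\dots,j_m}.$$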
In order to estimate the µ-modulus of ϕ we write it explicitly as follows: 1
|ϕ|µ = sup gj1 ,...,jm−1 ,l− n ia − m−1 jb h−j1 ,i2 ,...,in eµ|l| . a=2 b=2 nm l i2 ,...,in ,j2 ,...,jm−1 j1
Changing the variable in the sum from j1 to l1 := −j1 + na=2 ia , this becomes sup g n ia −l1 ,j2 ,...,jm−1 ,l− n ia − m−1 jb hl1 − na=2 ia ,i2 ,...,in eµ|l| l
≤
i2 ,...,in ,j2 ,...,jm−1 l 1
l
sup
l1 i2 ,..,in ,j2 ,..,jm−1
a=2
a=2
g n
n
a=2 ia −l1 ,j2 ,...,jm−1 ,l−
b=2
m−1
a=2 ia −
b=2
µ|l|
n h e . l − i ,i ,...,i n jb 1 a=2 a 2
Changing the variable of summation l to l2 = l − l1 one can estimate the above quantity by sup g n ia −l1 ,j2 ,...,jm−1 ,l2 +l1 − n ia − m−1 jb eµ|l2 | l1 i2 ,...,in ,j2 ,...,jm−1
l2
≤
sup
a=2
l2 i2 ,...,in ,j2 ,..,jm−1 ,l1
a=2
b=2
× hl1 − na=2 ia ,i2 ,...,in eµ|l1 |
n
m−1 eµ|l2 | i −l ,j ,...,j ,l +l − i − j m−1 1 2 a=2 a 1 2 a=2 a b=2 b × sup hl1 − na=2 ia ,i2 ,...,in eµ|l1 | , i2 ,...,in
g n
l1
but the first supremum is taken over all the combination of indices such that their sum is l2 , and therefore the first curly bracket is just the µ-modulus of g, the second is the µ-modulus of h. So the thesis follows in the present case.
The general case follows using
$$|\{h,g\}|_\mu\le(n_1m_2+n_2m_1)\,|h|_\mu|g|_\mu\le(n_1+m_1)(m_2+n_2)\,|h|_\mu|g|_\mu,$$
which is the thesis. □
Remark 5.10. Equation (5.15) holds also in the case of polynomials which are not homogeneous in $\xi$, $\eta$; to verify this point just use the definition of the $\mu$-modulus.

Lemma 5.11. Let $h\in\mathcal M^\mu_R$ and $g\in\mathcal M^\mu_{R-d}$; then for any positive $d'<R-d$ one has $\{h,g\}\in\mathcal M^\mu_{R-d-d'}$, and
$$|\{h,g\}|^\mu_{R-d-d'}\le\frac{1}{d'(d+d')}\,|h|^\mu_R\,|g|^\mu_{R-d}.\tag{5.17}$$

Proof. Write $h=\sum_jh_j$ and $g=\sum_kg_k$ with $h_j$ homogeneous of degree $j$ and similarly for $g$; we have
$$\{h,g\}=\sum_{j,k}\{h_j,g_k\}.$$
Now each term of the series is estimated by
$$\begin{aligned}|\{h_j,g_k\}|^\mu_{R-d-d'}&=|\{h_j,g_k\}|_\mu\,(R-d-d')^{j+k-2}\le|h_j|_\mu|g_k|_\mu\,jk\,(R-d-d')^{j+k-2}\\&=|h_j|_\mu|g_k|_\mu\,j(R-d-d')^{j-1}\,k(R-d-d')^{k-1}\\&\le|h_j|_\mu|g_k|_\mu\,R^j(R-d)^k\,\frac{1}{d+d'}\,\frac{1}{d'}=\frac{1}{d'(d+d')}\,|h_j|^\mu_R\,|g_k|^\mu_{R-d}.\end{aligned}\tag{5.18}$$
□

We estimate now the terms of the series (5.13), (5.14) defining the Lie transform.
Lemma 5.12. Let $h\in\mathcal M^\mu_R$ and $\chi\in\mathcal M^\mu_R$ be two analytic functions, denote by $h_n$ the functions defined recursively by (5.14); then, for any positive $d<R$, one has $h_n\in\mathcal M^\mu_{R-d}$, and the following estimate holds:
$$|h_n|^\mu_{R-d}\le|h|^\mu_R\left(\frac{e^2}{d^2}\,|\chi|^\mu_R\right)^n.\tag{5.19}$$

Proof. Fix $n$, and denote $\tilde\delta := d/n$; we look for a sequence $C_l^{(n)}$ such that
$$|h_l|^\mu_{R-\tilde\delta l}\le C_l^{(n)},\qquad\forall\,l\le n.$$
By (5.17) this sequence can be defined by
$$C_0^{(n)}=|h|^\mu_R,\qquad C_l^{(n)}=\frac{1}{l}\,\frac{1}{\tilde\delta\,\tilde\delta l}\,C_{l-1}^{(n)}\,|\chi|^\mu_R=\frac{n^2}{l^2d^2}\,C_{l-1}^{(n)}\,|\chi|^\mu_R.$$
So one has
$$C_n^{(n)}=\frac{1}{(n!)^2}\left(\frac{n^2|\chi|^\mu_R}{d^2}\right)^n|h|^\mu_R.$$
Using the inequality $n^n<n!\,e^n$, which is easily verified by writing the iterative definition of $n^n/n!$, one has the thesis. □
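The last step of the proof can be checked numerically. The short Python sketch below compares the closed form of $C_n^{(n)}$ obtained from the recursion with the right-hand side of (5.19); the function names and the sample values of the norms are hypothetical and serve only as an illustration.

```python
import math

def lie_term_bound(n, chi_norm, h_norm, d):
    """C_n^{(n)} = (n^2 |chi| / d^2)^n |h| / (n!)^2, as obtained from the recursion."""
    return (n * n * chi_norm / d**2) ** n * h_norm / math.factorial(n) ** 2

def stated_bound(n, chi_norm, h_norm, d):
    """Right-hand side of (5.19): |h| (e^2 |chi| / d^2)^n."""
    return h_norm * (math.e**2 * chi_norm / d**2) ** n

# Hypothetical sample values for |chi|, |h| and d.
checks = [
    (n**n < math.factorial(n) * math.e**n,
     lie_term_bound(n, 0.01, 1.0, 0.5) <= stated_bound(n, 0.01, 1.0, 0.5))
    for n in range(1, 21)
]
```

Both entries of every pair in `checks` are true, reflecting that $(n^n/n!)^2\le e^{2n}$ is exactly what turns the recursion into (5.19).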
Remark 5.13. Let $\chi$ be an analytic function with Hamiltonian vector field which is analytic as a map from $B_s(R^{(s)})$ to $P_s$, fix $\delta^{(s)}<R^{(s)}$. Assume $\|X_\chi\|_s^{R^{(s)}}<\delta^{(s)}$ and consider the time $t$ flow $T^t$ of $X_\chi$. Then, for $|t|\le1$, one has
$$\sup_{\|\zeta\|_s\le R^{(s)}-\delta^{(s)}}\big\|T^t(\zeta)-\zeta\big\|_s\le\|X_\chi\|_s^{R^{(s)}}.\tag{5.20}$$

Lemma 5.14. Consider $\chi$ as above and let $h: B_s(R^{(s)})\to\mathbb C$ be an analytic function with vector field analytic in $B_s(R^{(s)})$, fix $0<\delta^{(s)}<R^{(s)}$, assume $\|X_\chi\|_s^{R^{(s)}}\le\delta^{(s)}/3$; then, for $|t|\le1$, one has
$$\|X_{h\circ T^t}\|_s^{R^{(s)}-\delta^{(s)}}\le\left(1+\frac{3}{\delta^{(s)}}\|X_\chi\|_s^{R^{(s)}}\right)\|X_h\|_s^{R^{(s)}}.$$
For the proof see [2], proof of Lemma 8.2.
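5.3. Iterative lemma. In the statement of the forthcoming iterative lemma we will use the following notations: For any positive $R$, define $\delta := R/2r$ and $R_k := R-k\delta$. Moreover, for any positive $R^{(s)}$ define $\delta^{(s)} := R^{(s)}/2r$ and $R_k^{(s)} := R^{(s)}-k\delta^{(s)}$.

Proposition 5.15. Iterative Lemma. Fix $r\ge4$ and consider the Hamiltonian (4.1); define $R_*$ by
$$\frac{1}{R_*} := \frac{16e^2AN^\alpha r^2}{\gamma},$$
and assume
$$R<\bar R_1,\qquad\frac{R}{R_*}\le\frac12;\tag{5.21}$$
then, for any $k\le r-4$ there exists a canonical transformation $T^{(k)}$ which puts (4.1) in the form
$$H^{(k)} := H\circ T^{(k)}=h_0+Z^{(k)}+f^{(k)}+\mathcal R_N^{(k)}+\mathcal R_T^{(k)},\tag{5.22}$$
with $Z^{(k)}$ in Birkhoff normal form; moreover
1) the following estimates hold:
$$\big|Z^{(k)}\big|^\mu_{R_k}\le AR^3\sum_{l=0}^{k-1}\left(\frac{R}{R_*}\right)^l,\qquad\big|f^{(k)}\big|^\mu_{R_k}\le AR^3\left(\frac{R}{R_*}\right)^k;\tag{5.23}$$
2) for any $s\ge1$ define
$$R_*^{(s)} := \frac{R_*}{\rho_s};$$
then, for any $R^{(s)}$ such that
$$\frac{R^{(s)}}{R_*^{(s)}}\le\min\left\{\frac12,\frac{2e^2}{\rho_s\vartheta_s}\right\}\tag{5.24}$$
the transformation $T^{(k)}$ is analytic as a map
$$T^{(k)}: B_s\big(R^{(s)}_{k+1}\big)\to B_s\big(R^{(s)}_1\big),\tag{5.25}$$
and the following estimates hold:
$$\sup_{\|\zeta\|_s\le R^{(s)}_{k+1}}\big\|\zeta-T^{(k)}(\zeta)\big\|_s\le\frac{\rho_s\vartheta_sR^{(s)}}{16e^2r}\sum_{l=1}^k\left(\frac{R^{(s)}}{R_*^{(s)}}\right)^l,\tag{5.26}$$
$$\big\|X_{\mathcal R_N^{(k)}}\big\|_s^{R^{(s)}_{k+1}}\le\frac{2^{r+1}}{N^{s-1}}\,2rA\vartheta_s\big(\rho_sR^{(s)}\big)^2\prod_{l=0}^{k-1}\left[1+\frac{3\rho_s\vartheta_s}{16e^2}\left(\frac{R^{(s)}}{R_*^{(s)}}\right)^{l+1}\right]\times\sum_{l=0}^{k-1}\left(\frac{R^{(s)}}{R_*^{(s)}}\right)^l,\tag{5.27}$$
$$\big\|X_{\mathcal R_T^{(k)}}\big\|_s^{R^{(s)}_{k+1}}\le\vartheta_s2^{r-2}rAR_*^2\left(\frac{R^{(s)}}{R_*^{(s)}}\right)^{r-1}\prod_{l=0}^{k-1}\left[1+\frac{3\rho_s\vartheta_s}{16e^2}\left(\frac{R^{(s)}}{R_*^{(s)}}\right)^{l+1}\right]\times\sum_{l=0}^{k-1}\frac{1}{2^l};\tag{5.28}$$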
furthermore if the vector fields of $h_0$ and of $f$ leave invariant $A$ and/or $Re$, then the same is true for $T^{(k)}$ and for the Hamiltonian vector fields of all the terms of (5.22).

Remark 5.16. In particular $X_{\mathcal R_N^{(k)}}$ is of order $[R^{(s)}]^2/N^{s-1}$ and $X_{\mathcal R_T^{(k)}}$ is of order $(R^{(s)}/R_*^{(s)})^{r-1}$.

Proof. We proceed by induction. First we split $f^{(k)}$ ($f$ in the case $k=0$) into an effective part and a remainder. Write
$$f^{(k)}=f_0^{(k)}+f_T^{(k)},$$
where $f_0^{(k)}$ is the Taylor polynomial of $f^{(k)}$ truncated at order $r-1$ and $f_T^{(k)}$ is the remainder of the Taylor series (which begins with terms of order $r$). Since $f_0^{(k)}$ is a truncation of $f^{(k)}$ one has
$$\big|f_0^{(k)}\big|^\mu_{R_k}\le\big|f^{(k)}\big|^\mu_{R_k}.$$
Consider $f_0^{(k)}$, and its Taylor series in $(P,Q)$ only; we write
$$f_0^{(k)}=\tilde f^{(k)}+f_N^{(k)},$$
where $\tilde f^{(k)}$ is the truncation of such a series at second order (it contains at most terms quadratic in $(P,Q)$) and $f_N^{(k)}$ is the remainder of the expansion. One has
$$\big|\tilde f^{(k)}\big|^\mu_{R_k}\le\big|f^{(k)}\big|^\mu_{R_k},\qquad\big|f_N^{(k)}\big|^\mu_{R_k}\le\big|f^{(k)}\big|^\mu_{R_k}.$$
Rewrite
$$\tilde H^{(k)} := h_0+Z^{(k)}+\tilde f^{(k)}+\tilde{\mathcal R}_N^{(k)}+\tilde{\mathcal R}_T^{(k)},$$
where
$$\tilde{\mathcal R}_N^{(k)} := \mathcal R_N^{(k)}+f_N^{(k)},\qquad\tilde{\mathcal R}_T^{(k)} := \mathcal R_T^{(k)}+f_T^{(k)}.$$
Consider the Lie transform $T_k$ generated by a function $\chi_k$, and use it to transform the main part of the Hamiltonian, namely $h_0+Z^{(k)}+\tilde f^{(k)}$; by formulae (5.13), (5.14) one has
$$\big(h_0+Z^{(k)}+\tilde f^{(k)}\big)\circ T_k=h_0+Z^{(k)}+\{\chi_k,h_0\}+\tilde f^{(k)}+\left\{\sum_{l=1}^\infty Z_l^{(k)}+\sum_{l=1}^\infty\tilde f_l^{(k)}+\sum_{l=2}^\infty h_{0l}\right\},\tag{5.29}$$
where
$$Z_l^{(k)} := \frac{1}{l}\big\{\chi_k,Z_{l-1}^{(k)}\big\},\quad l\ge1;\qquad Z_0^{(k)}=Z^{(k)},$$
and similarly for $\tilde f_l^{(k)}$ and $h_{0l}$. If $\chi_k$ fulfills the equation
$$\{\chi_k,h_0\}+\tilde f^{(k)}=Z_k,\tag{5.30}$$
with $Z_k$ in normal form, then the order of the non-normalized part has increased by one. We determine a function $\chi_k$ fulfilling (5.30). To this end expand $\tilde f^{(k)}$ in Taylor series:
$$\tilde f^{(k)}(\xi,\eta)=\sum_{|l|+|j|\le r}f_{lj}^{(k)}\,\xi^j\eta^l,$$
where $l=\{l_i\}_{i\in\bar{\mathbb Z}}$ and $j=\{j_i\}_{i\in\bar{\mathbb Z}}$ are sequences of nonnegative integers, and we used the vector notation $\xi^j=\prod_{i\in\bar{\mathbb Z}}\xi_i^{j_i}$. Denote by $RE := \{(l,j)\,:\,\omega\cdot(j-l)=0\}$, and define
$$Z_k(\xi,\eta) := \sum_{(l,j)\in RE}f_{lj}^{(k)}\,\xi^j\eta^l,\qquad\chi_k(\xi,\eta) := \sum_{(l,j)\notin RE}\frac{f_{lj}^{(k)}}{i\,\omega\cdot(j-l)}\,\xi^j\eta^l;$$
then $Z_k$ is in normal form. The norms of these functions are estimated by
$$|Z_k|^\mu_R\le\big|\tilde f^{(k)}\big|^\mu_R,\qquad|\chi_k|^\mu_R\le\frac{N^\alpha}{\gamma}\big|\tilde f^{(k)}\big|^\mu_R,$$
as one can easily see by remarking that $Z_k$ is a truncation of $\tilde f^{(k)}$ and that the denominators in the definition of $\chi_k$ are larger than $\gamma/N^\alpha$. Then, using the inductive hypothesis, one has
$$|Z_k|^\mu_{R_k}\le AR^3\left(\frac{R}{R_*}\right)^k,\qquad|\chi_k|^\mu_{R_k}\le\frac{AN^\alpha}{\gamma}R^3\left(\frac{R}{R_*}\right)^k.\tag{5.31}$$
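The solution of the homological equation (5.30) can be illustrated on a finite polynomial. The Python sketch below is not the author's construction verbatim: it assumes the bracket convention (5.16), under which $\{\cdot,h_0\}$ multiplies the monomial $\xi_{k_1}\cdots\eta_{j_{n_2}}$ by $i(\omega_{k_1}+\dots-\omega_{-j_1}-\dots)$, so the overall sign of the generator may differ from the displayed formula for other sign conventions. The data layout, the names and the sample frequencies are assumptions.

```python
import math

def divisor(omega, k, j):
    """omega_{k_1}+...+omega_{k_n1} - omega_{-j_1} - ... - omega_{-j_n2}."""
    return sum(omega(a) for a in k) - sum(omega(-b) for b in j)

def solve_homological(f, omega, tol=1e-12):
    """Split f into a resonant part Z and a generator chi with {chi, h0} + f = Z."""
    Z, chi = {}, {}
    for mono, c in f.items():
        d = divisor(omega, *mono)
        if abs(d) <= tol:
            Z[mono] = c                # resonant monomial: kept in the normal form
        else:
            chi[mono] = -c / (1j * d)  # non-resonant monomial: divided by the small divisor
    return Z, chi

def bracket_with_h0(chi, omega):
    """{chi, h0} acts diagonally: each monomial is multiplied by i * divisor."""
    return {mono: 1j * divisor(omega, *mono) * c for mono, c in chi.items()}

omega = lambda j: math.sqrt(j * j + 0.5)
f = {((1,), (-1,)): 0.3, ((1, 2), (-3,)): 0.1 + 0.2j, ((2,), (-1,)): 0.5}
Z, chi = solve_homological(f, omega)
residual = bracket_with_h0(chi, omega)
```

Here only $\xi_1\eta_{-1}$ is resonant, so it is the only monomial of `Z`; the size of each coefficient of `chi` is the corresponding coefficient of `f` divided by the modulus of its divisor, which is where the factor $N^\alpha/\gamma$ in the estimate of $|\chi_k|^\mu_R$ comes from.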
Define T (k+1) := T (k) ◦ Tk and remark that, by the standard theory of systems with symmetry, if T (k) and the Hamiltonian vector field of the different parts of H˜ (k) leave invariant the manifold A and/or Re, then the same is true for T (k+1) . Then define Z (k+1) := Z (k) + Zk , and f (k+1) as the curly bracket of (5.29). For (k+1) clearly (5.23) holds with k + 1 in place of k. We now estimate f (k+1) . To this Z end let us denote e2 µ D := 2 |χ |Rk , δ and notice that, by (5.31), and the definition of R∗ , we have 1 R k+1 < 1/2, ∀ k. D≤ 4 R∗ So, using the estimate (5.19), from Lemma 5.12, we have 2µ 1 ∞ ∞ &µ % D % (k) &µ (k) Zl ≤ D l Z (k) = Z Rk Rk 1−D l=1
Rk+1
and similarly
l=1
≤ 4DAR 1+1
2µ 1 ∞ (k) f˜l l=1
≤ 2AR 2 D
Rk+1
R R∗
k
.
µ MRk ,
Finally, even if h0 is not of class all the terms of the sequence h0l generated by µ h0 turn out to be of class MRk+1 . Indeed, remarking that µ
h01 = {χk , h0 } = f˜(k) − Zk ∈ MRk and proceeding as in the proof of Lemma 5.12 one gets &µ % µ |h0l |Rk+1 ≤ D r−1 f˜(k) , l ≥1, Rk
and therefore
2µ 1 ∞ h0k l=2
≤ 2AR
1+1
D
Rk+1
R R∗
k
.
Summing up the different contributions to f (k+1) and using the definition of D, we obtain the second of (5.23) with k + 1 in place of k. We come to item 2). First we define (k+1)
RN
˜ (k) ◦ Tk , := R N
(k+1)
RT
˜ (k) ◦ Tk := R T
and remark that, by Proposition 5.3, one has k k+1 α (s) (s) R (s) ρs 9s R (s) Xχ Rk+1 ≤ 9s AN (ρs R (s) )3 R = , k s (s) (s) 16e2 r ρs δ (s) γ R∗ R∗
(5.32)
where we used the definition of R∗ . The above expression, by Lemma 5.13 and by (5.24), ensures that (s) (s) Tk : Bs Rk+2 → Bs Rk+1 , which together with the inductive assumptions ensures the validity of the estimate (5.26) with k + 1 in place of k. (k+1) Estimate of RN : To begin with remark that one has k R (s) 2r+1 A9s (ρs R (s) )3 R (s) k+1 ≤ Xf˜(k) (s) N N s−1 ρs δ (s) s R∗ k 2 R (s) 2r+1 = 2rA9s ρs R (s) , (s) N s−1 R∗ and, from Lemma 5.14 one has k+1 R (s) 2 R (s) k 2r+1 3ρs 9s R (s) k+2 2rA9s ρs R (s) ≤ 1+ , Xf (k) ◦T (s) (s) k s N 16e2 N s−1 R∗ R∗ and also
k+1 R (s) (s) r+1 R 9 3ρ s s k+2 2 ≤ 1 + 2rA9s XR(k) ◦T (s) k s N 16e2 N s−1 R∗ l+1 k−1 l k−1 (s) ' R (s) 3ρ R 9 s s 1 + ×(ρs R (s) )2 , (s) (s) 16e2 R ∗ l=0 l=0 R∗
which shows that the estimate (5.27) holds with k + 1 in place of k. µ (k) Estimate of (5.28): First we estimate the MRk norm of fT . To this end remark that (k) µ f is a function of R which is analytic and has all the Taylor coefficients that are Rk & % (k) µ is the sum of the terms of order higher than positive. Moreover, the quantity fT Rk
r in such a series. From this one easily sees that r r % & & % 1 R 2R (k) µ (k) µ ≤ ≤ 2r−3 AR∗3 . fT f Rk R∗k /2 R∗ R∗ 2k Then, by Proposition 5.3 one has r (s) R 2r 1 2r−3 AR∗3 k (s) (s) 2 ρs R R∗ r−1 R (s) 1 = 9s 2r−2 rAR∗2 . (s) 2k R∗
R (s) k+1 ≤ 9s Xf (k) T
s
(k)
Finally, proceeding as in the estimate of RN one obtains the thesis.
□
(5.33)
Corollary 5.17. Consider the Hamiltonian (4.1); There exists a constant C such that for any s ≥ 1 and any R (s) such that R (s) < (C2s N α 9sα )−1 the following holds true: There exists a canonical transformation Tr : Bs (R (s) /3) → Bs (R (s) ) fulfilling ζ − Tr (ζ )s ≤ C2s 9s (R (s) 2s N α )2 R (s) such that the transformed Hamiltonian has the form H ◦ Tr = h0 + Z + RT + RN with Z in normal form, and the following estimates hold: 2 s (s) 2 C9 R , s N s−1 r−1 ≤ C9s R (s) 2s N α .
(s) XR R /3 ≤ N s (s) XR R /3 T
s
1
(5.34)
Proof. Apply Proposition 5.15 with $k=k_r=(r-3)$, then estimate the vector field of $f^{(k_r)}$ by using (5.7), and define $\mathcal R_N := \mathcal R_N^{(k_r)}$ and $\mathcal R_T := \mathcal R_T^{(k_r)}+f^{(k_r)}$. Inserting the definition of $R_*^{(s)}$ and inserting all irrelevant constants in $C$ we get the result. □

We are now ready for the

Proof of Theorem 4.1. In Corollary 5.17 we have $N$ and $s$ at our disposal. We begin by choosing $N := R^{-1/(2\alpha)}$ so that the bracket in the second of (5.34) becomes of order $[R^{(s)}]^{1/2}$. Inserting in (5.34) we get
$$\big\|X_{\mathcal R_N}+X_{\mathcal R_T}\big\|_s^{R^{(s)}/3}\le C\vartheta_s\big(R^{(s)}\big)^{\frac{1}{2\alpha}(s-1)}+C\vartheta_s\big(4^sCR^{(s)}\big)^{\frac{r-1}{2}}.$$
Finally choosing $s\ge s_* := 4\alpha M$ with $M=(r-1)/2$ one has that the first addendum is smaller than the second. To obtain the statement use also Eq. (5.1). □

6. Application to the Nonlinear Wave Equation

We split this section in two subsections: in the first we will show how to apply Theorem 4.1 in order to obtain Theorem 2.2, and in the second we will show that the frequencies of the system we study are strongly nonvanishing for most values of the parameter $m$.

6.1. Application of Theorem 4.1 to Equation (0.1). Consider the eigenfunctions $\varphi_j$ of the Sturm–Liouville problem (2.1) and their Fourier expansion
$$\varphi_j(x)=\frac{1}{\sqrt{2\pi}}\sum_{k\in\mathbb Z}\varphi^j_ke^{ikx};$$
it is known [11, 9] that one has $|\varphi^j_k|\le Ce^{-\sigma||k|-j|}$ with positive $\sigma<\sigma_1$ and a suitable $C$. Remark also that $\varphi^j_0=0$. We want to introduce the analogue for our problem of the Fourier basis of the exponentials. So, we define
$$\psi_j(x) := \frac{1}{\sqrt{2\pi}}\sum_{k>0}\varphi^j_ke^{ikx},\qquad\psi_{-j}(x) := \frac{1}{\sqrt{2\pi}}\sum_{k<0}\varphi^j_ke^{ikx},\qquad j>0;$$
considering the Fourier expansion of the $\psi_j$'s, one has
$$\psi_j(x)=\frac{1}{\sqrt{2\pi}}\sum_{k\in\bar{\mathbb Z}}\psi^j_ke^{ikx},\qquad|\psi^j_k|\le Ce^{-\sigma|k-j|},\tag{6.1}$$
which is the property we need.

We fix the phase space to be $F^p_s := \ell^2_s\oplus\ell^2_{s-1}\ni(\hat q,\hat p)$ endowed by the symplectic form $i\sum_{k\in\bar{\mathbb Z}}d\hat p_k\wedge d\hat q_{-k}$; for any point $(\hat q,\hat p)\in F^p_s$ we define the functions
$$\hat u(x) := \sum_{k\in\bar{\mathbb Z}}\hat q_k\psi_k(x),\qquad\hat v(x) := \sum_{k\in\bar{\mathbb Z}}\hat p_k\psi_k(x);\tag{6.2}$$
their skew-symmetric part will play the role of the unknown of Eq. (0.1).
Remark 6.1. The norm of $F^p_s$ is defined in terms of the coefficients of the expansion of $(\hat u,\hat v)$ on the basis of the eigenfunctions of a Sturm–Liouville problem with analytic potential, so it is equivalent to the standard Sobolev norm of the corresponding functions. In particular (6.2) establishes an isomorphism of $F^p_s$ with the Sobolev space $H^s(\mathbb R/2\pi\mathbb Z)\oplus H^{s-1}(\mathbb R/2\pi\mathbb Z)$.

Consider the Hamiltonian system
$$\sum_{k\in\bar{\mathbb Z}}\frac{\hat p_k\hat p_{-k}+\omega_k^2\,\hat q_k\hat q_{-k}}{2}+\int_{-\pi}^{\pi}G(x,\hat u(x))\,dx,\tag{6.3}$$
with $\omega_{-k} := \omega_k\equiv\sqrt{\mu_k+m}$ $(k\ge1)$, and $\hat u$ which has to be thought of as a function of $\hat q$.
Remark 6.2. In general the functions ψj are not eigenfunctions of the Sturm Liouville problem with potential V and periodic boundary conditions, therefore the dynamical system (6.3) does not coincide with the system obtained by considering (0.1) with periodic boundary conditions on [−π, π]. Remark 6.3. A remarkable exception is the Klein Gordon equation (corresponding to V = 0) where the functions ψj are just the imaginary exponentials which are also eigenfunctions of the linearized problem with periodic boundary conditions. It follows that in this case the Hamiltonian (6.3) is the restriction of the Hamiltonian of the nonlinear Klein–Gordon equation with periodic boundary conditions in [−π, π] to the space of the functions with zero average. To obtain the true nonlinear Klein Gordon equation one has to add the degree of freedom corresponding to the average u. This can be easily done still fitting the framework of the previous section.
Define the subspace of skew-symmetric functions by
$$A := \{(\hat p_k,\hat q_k)\,:\,\hat q_k=\hat q_{-k}\ \text{and}\ \hat p_k=\hat p_{-k}\};$$
then we have the following

Proposition 6.4. The space $A$ is invariant for the dynamics of the system (6.3); the dynamics of (6.3) in $A$ coincides with the dynamics of the nonlinear wave equation (0.1).

Proof. Consider first the vector field of $h_0$. It is clear that it leaves invariant $A$ and that here its dynamics coincides with the dynamics of the linear part of (0.1) by identifying $\hat q_k$ $(k\ge1)$ with $u_k$ and $\hat p_k$ $(k\ge1)$ with $v_k$. Consider the nonlinear part and remark first that, since
$$\int_{-\pi}^{\pi}\psi_j(x)\psi_k(x)\,dx=\delta_{j,-k},\tag{6.4}$$
the symplectic form can be written in terms of the variables $\hat u$, $\hat v$ as
$$\int_{-\pi}^{\pi}d\hat v(x)\wedge d\hat u(x)\,dx.\tag{6.5}$$
Expand the nonlinearity $f$ of (6.3) in power series in $\hat u$, namely write $f=\sum_{n\ge3}f^n$, with
$$f^n(\hat u)=\int_{-\pi}^{\pi}G_n(x)\,[\hat u(x)]^n\,dx,\tag{6.6}$$
and consider the Hamiltonian vector field of $f^n$. By (6.5) it is given by
$$\frac{d\hat v}{dt}=nG_n\hat u^{n-1},\qquad\frac{d\hat u}{dt}=0;\tag{6.7}$$
by the skew symmetry property of $g$ it leaves invariant $A$. On this manifold the vector field (6.7) coincides with the one of the single terms of the expansion of the nonlinearity of (0.1) by identifying (as natural) $u$ and $v$ with the restriction of $\hat u$ and $\hat v$ to $[0,\pi]$. □

Introduce now complex variables $\xi$, $\eta$ by
$$\xi_j := \frac{1}{\sqrt2}\left(\hat q_j\sqrt{\omega_j}-i\frac{\hat p_j}{\sqrt{\omega_j}}\right),\qquad\eta_j := \frac{1}{\sqrt2}\left(\hat q_j\sqrt{\omega_j}+i\frac{\hat p_j}{\sqrt{\omega_j}}\right);\tag{6.8}$$
then clearly the quadratic part of the Hamiltonian takes the form (4.2). Fix a potential $V$ as in Sect. 2; then, concerning the frequencies, we have the following theorem that will be proved in the next subsection.

Theorem 6.5. Having fixed $m_1$ and $m_2$ as above, there exists a subset $J$ of $[m_1,m_2]$ with measure $m_2-m_1$ such that for any $m\in J$ the frequencies $\omega_j := \sqrt{\mu_j+m}$ $(j\ge1)$ are strongly non-vanishing. The index $\alpha=\alpha(r)$ can be chosen equal to $16r^5$. The same is true if one adds the frequencies $\omega_{-j} := \omega_j$, $j\ge1$.

Concerning the nonlinearity we have the following
Proposition 6.6. Under the above assumptions there exist positive $\mu$ and $R$ such that the nonlinearity of (6.3) is of class $\mathcal M^\mu_R$.

Proof. Define first the functions
$$\bar\xi(x) := \sum_k\frac{\xi_k}{\sqrt{\omega_k}}\psi_k(x),\qquad\bar\eta(x) := \sum_k\frac{\eta_k}{\sqrt{\omega_k}}\psi_k(x),$$
and, to start with, consider a homogeneous polynomial
$$f(\xi,\eta) := \int_{-\pi}^{\pi}G(x)\,[\bar\xi(x)]^{n_1}[\bar\eta(x)]^{n_2}\,dx,\tag{6.9}$$
with $G$ periodic and analytic in a strip of width $\kappa$ around the real axis; the expression (6.9) coincides with
$$\sum_{l,i}\frac{\xi_{l_1}\cdots\xi_{l_{n_1}}\eta_{i_1}\cdots\eta_{i_{n_2}}}{\sqrt{\omega_{l_1}\cdots\omega_{l_{n_1}}\omega_{i_1}\cdots\omega_{i_{n_2}}}}\int_{-\pi}^{\pi}G(x)\,\psi_{l_1}(x)\cdots\psi_{l_{n_1}}(x)\psi_{i_1}(x)\cdots\psi_{i_{n_2}}(x)\,dx;\tag{6.10}$$
we estimate the integral. Denote $k=(k_1,\dots,k_n) := (l,i)=(l_1,\dots,l_{n_1},i_1,\dots,i_{n_2})$; then such an integral satisfies
$$\Big|\sum_j\psi^{k_1}_{j_1}\cdots\psi^{k_n}_{j_n}\int_{-\pi}^{\pi}G(x)\,e^{i(j_1+\dots+j_n)x}\,dx\Big|\le C^n\sum_je^{-\sigma(|k_1-j_1|+\dots+|k_n-j_n|)}\,|G|_\kappa\,e^{-\kappa|j_1+\dots+j_n|},$$
where $|G|_\kappa$ is the supremum of $G$ in the strip $|{\rm Im}\,x|<\kappa$; here we used the estimate (6.1) of $\psi^k_j$, the fact that the last integral is the Fourier coefficient of $G$ and the standard decay estimate of Fourier coefficients of analytic functions. To fix ideas assume that $\kappa>\sigma$; then, denoting by $f_k\equiv f_{k_1,\dots,k_n}$ the integral in (6.10), and choosing a positive $\delta<\sigma$ we have
$$\begin{aligned}|f_k|&\le C^n|G|_\kappa\sum_je^{-\sigma(|k_1-j_1|+\dots+|k_n-j_n|)}e^{-\kappa|j_1+\dots+j_n|}\\&\le C^n|G|_\kappa\sum_je^{-\delta[|k_1-j_1|+\dots+|k_n-j_n|]}e^{-(\sigma-\delta)[|k_1-j_1|+\dots+|k_n-j_n|]}e^{-(\sigma-\delta)|j_1+\dots+j_n|}\\&\le C^n|G|_\kappa\,e^{-(\sigma-\delta)|k_1+\dots+k_n|}\sum_je^{-\delta|k_1-j_1|}e^{-\delta|k_2-j_2|}\cdots e^{-\delta|k_n-j_n|}\le C^n|G|_\kappa\,e^{-(\sigma-\delta)|k_1+\dots+k_n|},\end{aligned}$$
where we redefined the constant $C$ and used the triangular inequality:
$$|k_1+\dots+k_n|\le|k_1-j_1+\dots+k_n-j_n|+|j_1+\dots+j_n|\le|k_1-j_1|+\dots+|k_n-j_n|+|j_1+\dots+j_n|.$$
Take now $\mu=\sigma-2\delta$; then one has
$$|f|_\mu\le|G|_\kappa\sum_le^{\mu|l|}\,c^n\sup_{k_1+\dots+k_n=l}\frac{e^{-(\sigma-\delta)|k_1+\dots+k_n|}}{\sqrt{\omega_{k_1}\cdots\omega_{k_n}}}=c^n|G|_\kappa\sum_le^{-\delta|l|},$$
where we used the fact that $\omega_k$ is strictly positive for all k's and redefined c. So, one has that, for µ < min{σ, κ}, (6.9) has finite µ-modulus smaller than $C|G|_\kappa c^n$, where C, c are positive constants that depend on G only through κ. To obtain the statement just remark that by analyticity there exist constants such that $|G_n|_\kappa \le Cc^n$, and that $G_n$ is the sum of at most $2^n$ polynomials of the form (6.9). □

We have thus verified that the system (6.3) fulfills the assumptions of Theorem 4.1 that therefore applies. Theorem 2.2 immediately follows as a corollary by virtue of Proposition 6.4 and of the remark that, since on A $\xi_k = \xi_{-k}$ and $\eta_k = \eta_{-k}$, the restriction of the normal form to this manifold depends only on the actions $I_k$.

Proof of Corollary 2.3. The estimates (2.7) can be obtained just estimating the time derivatives of the functions $\|\zeta\|^2_{F_s}$ and $j^{2s} I_j(\zeta)$, which are analytic. □

Proof of Corollary 2.4. Denote by $(\bar u', \bar v') := T_R^{-1}(\bar u, \bar v)$ the "normalized coordinates" of the initial datum, and define $T_0' := \{(u, v) : I_j(u, v) = I_j(\bar u', \bar v'),\ \forall j \ge 1\}$, and finally define $T_0 := T_R(T_0')$. Denoting also by $\zeta'(t) := T_R^{-1}(\zeta(t))$ the solution in the normalized coordinates one has
\[ [d_{s'}(\zeta'(t), T_0')]^2 \le \left(\sum_{j\ge1} j^{-2(s-s')}\right)^{1/2} \left(\sum_{j\ge1} j^{4s}\,\bigl(I_j(\zeta'(t)) - I_j(\bar u', \bar v')\bigr)^2\right)^{1/2}, \]
and by estimating the time derivative of the function in square brackets which is analytic in $F_s$ one gets the estimate (2.8) in the normalized coordinates. By using the fact that $T_R$ is Lipschitz the thesis follows. □

A further corollary concerns the case of the nonlinear Klein Gordon equation with periodic boundary conditions. To state it denote by $J_k := I_k + I_{-k}$ the sum of actions corresponding to modes with the same frequency.

Corollary 6.7. Consider Eq. (0.1) with periodic boundary conditions on [−π, π] in the particular case V = 0, and fix m in such a way that the frequencies are strongly nonvanishing. Fix M ≥ 4 and $s \ge s_*$, then there exist positive constants $R_s$, $C_s$ such that the following holds true: let ζ(t) ≡ (u(t), v(t)) be the solution of the Cauchy problem for Eq. (0.1) with initial datum $(u_0, v_0) \in H^s(\mathbb R/2\pi\mathbb Z) \times H^{s-1}(\mathbb R/2\pi\mathbb Z)$; if $R := \|(u_0, v_0)\|_{H^s\times H^{s-1}} \le R_s$, then for all times t with
\[ |t| \le \frac{1}{C_s R^{M-3}} \tag{6.11} \]
the following estimates hold:
\[ \|\zeta(t)\|_{H^s\times H^{s-1}} \le 2C_s R, \qquad \frac{|J_j(t) - J_j(0)|}{R^2} \le \frac{C_s}{j^{2s}}\,R. \tag{6.12} \]
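6.2. Measure estimates. Since $\omega_{-j} = \omega_j$ we will consider only the case of positive j's. We first introduce some notations. Given integers $0 < i_1, \dots, i_r \le N$, and a choice σ of r signs, define the vector $k := \sum_{l=1}^{r} \pm e_{i_l}$, where $e_{i_l} \in \mathbb Z^\infty$ is the vector with all components equal to zero but the $i_l$-th which is equal to 1, so that $k\cdot\omega = \pm\omega_{i_1} \pm \omega_{i_2} \pm \dots \pm \omega_{i_r}$. We will consider only choices of σ such that $k \ne 0$. Then, for fixed τ, $\gamma_1$ we define
\[ R^\sigma_i(\gamma_1, \tau) := \left\{ m \in I : |\omega_{i_1} \pm \omega_{i_2} \pm \dots \pm \omega_{i_r}| < \frac{\gamma_1}{i_1^\tau i_2^\tau \cdots i_r^\tau} \right\}. \tag{6.13} \]
Given j > N define
\[ R^\sigma_{ij}(\gamma_1, \tau) := \left\{ m \in I : |\omega_{i_1} \pm \omega_{i_2} \pm \dots \pm \omega_{i_{r-1}} \pm \omega_j| < \frac{\gamma_1}{i_1^\tau i_2^\tau \cdots i_{r-1}^\tau} \right\}, \]
and finally, given also k ≥ j define
\[ R^\sigma_{ijk}(\gamma_1, \tau) := \left\{ m \in I : |\omega_{i_1} \pm \omega_{i_2} \pm \dots \pm \omega_{i_{r-2}} \pm \omega_j \pm \omega_k| < \frac{\gamma_1}{i_1^\tau i_2^\tau \cdots i_{r-2}^\tau} \right\}, \]
we assume that when j = k the sign in front of $\omega_j$ agrees with the sign in front of $\omega_k$. Obviously $R^\sigma_{ij}$ and $R^\sigma_{ijk}$ do not depend on $i_r$ and on $i_{r-1}, i_r$ respectively; this is not important for the following. We begin by proving the following estimate:

Lemma 6.8. For any K ≤ r, let $i_1 < \dots < i_K$ be K different positive indices; then one has
\[ \left| \det \begin{pmatrix} \omega_{i_1} & \omega_{i_2} & \cdots & \omega_{i_K} \\ \frac{d\omega_{i_1}}{dm} & \frac{d\omega_{i_2}}{dm} & \cdots & \frac{d\omega_{i_K}}{dm} \\ \vdots & \vdots & & \vdots \\ \frac{d^{K-1}\omega_{i_1}}{dm^{K-1}} & \frac{d^{K-1}\omega_{i_2}}{dm^{K-1}} & \cdots & \frac{d^{K-1}\omega_{i_K}}{dm^{K-1}} \end{pmatrix} \right| \ge \frac{C}{i_1^{2K-1}\, i_2^{2K-1} \cdots i_K^{2K-1}}. \tag{6.14} \]

Proof. First remark that, by explicit computation one has
\[ \frac{d^n\omega_i}{dm^n} = \frac{(-1)^{n+1}\,(2n-3)!}{2^{n-2}\,(n-2)!\,2^n\,(\mu_i+m)^{n-\frac12}}. \tag{6.15} \]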
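As a quick sanity check of (6.15), the numerical coefficient can be compared with the one obtained by differentiating $(\mu_i + m)^{1/2}$ directly, namely $\prod_{j=0}^{n-1}(\tfrac12 - j)$. The following Python sketch is an illustration added here, not part of the original argument; the function names are hypothetical, and the closed form is only meaningful for n ≥ 2, where the factorials are defined.

```python
from fractions import Fraction
from math import factorial


def coeff_direct(n):
    """Coefficient of (mu + m)**(1/2 - n) after differentiating sqrt(mu + m) n times."""
    c = Fraction(1)
    for j in range(n):
        c *= Fraction(1, 2) - j  # each derivative brings down the current exponent
    return c


def coeff_formula(n):
    """Coefficient given by the closed form (6.15); requires n >= 2."""
    return Fraction((-1) ** (n + 1) * factorial(2 * n - 3),
                    2 ** (n - 2) * factorial(n - 2) * 2 ** n)


# Compare the two expressions in exact rational arithmetic.
agree = all(coeff_direct(n) == coeff_formula(n) for n in range(2, 12))
```

Exact rationals are used so that the comparison is an identity check rather than a floating-point approximation.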
Substituting (6.15) in the l.h.s. of (6.14) we get the determinant to be estimated. To obtain the estimate factorize from the i-th column the term $(\mu_i + m)^{1/2}$, and from the j-th row the term $\frac{(2j-3)!}{2^{j-2}(j-2)!\,2^j}$. Forgetting the inessential powers of −1, we obtain that the determinant to be estimated is given by
\[ \left[\prod_{l=1}^{K} \omega_{i_l}\right] \left[\prod_{n=1}^{K-1} \frac{(2n-3)!}{2^{n-2}(n-2)!\,2^n}\right] \begin{vmatrix} 1 & 1 & 1 & \cdots & 1 \\ x_{i_1} & x_{i_2} & x_{i_3} & \cdots & x_{i_K} \\ x_{i_1}^2 & x_{i_2}^2 & x_{i_3}^2 & \cdots & x_{i_K}^2 \\ \vdots & \vdots & \vdots & & \vdots \\ x_{i_1}^{K-1} & x_{i_2}^{K-1} & x_{i_3}^{K-1} & \cdots & x_{i_K}^{K-1} \end{vmatrix}, \tag{6.16} \]
where we denoted by $x_i := (\mu_i + m)^{-1}$. The last determinant is a Vandermonde determinant whose value is given by
\[ \prod_{l<n} (x_{i_l} - x_{i_n}). \tag{6.17} \]
Now one has
\[ |x_{i_l} - x_{i_n}| = \left|\frac{1}{\mu_{i_l}+m} - \frac{1}{\mu_{i_n}+m}\right| = \frac{|\mu_{i_n} - \mu_{i_l}|}{(\mu_{i_l}+m)(\mu_{i_n}+m)} \ge C\,x_{i_l} x_{i_n} \]
with a suitable C. So, (6.17) is estimated by
\[ \prod_{l=2}^{K} \prod_{n=1}^{l-1} C\,x_{i_l} x_{i_n} = C^{\sum_{l=2}^{K}(l-1)} \prod_{l=2}^{K} x_{i_l}^{\,l-1} \prod_{n=1}^{l-1} x_{i_n} = C \prod_{l=1}^{K} x_{i_l}^{K}, \]
from which, using the asymptotics of the frequencies, the thesis immediately follows. □

The above result will be used in view of the following

Lemma 6.9 (Proposition of Appendix B in [4]). Let $u^{(1)}, \dots, u^{(K)}$ be K independent vectors with $\|u^{(i)}\|_{\ell^1} \le 1$. Let $w \in \mathbb R^K$ be an arbitrary vector, then there exist $i \in [1, \dots, K]$, such that
\[ |u^{(i)} \cdot w| \ge \frac{\|w\|_{\ell^1} \det(u^{(i)})}{K^{3/2}}, \]
where $\det(u^{(i)})$ is the determinant of the matrix formed by the components of the vectors $u^{(i)}$.

For the proof see [4].

Corollary 6.10. Let $w \in \mathbb R^\infty$ be a vector with only K components different from zero, namely those with index $i_1, \dots, i_K$; assume K ≤ r. Then, for any $m \in I$ there exists an index $i \in [0, \dots, K-1]$ such that
\[ \left| w \cdot \frac{d^i\omega}{dm^i}(m) \right| \ge C\,\frac{\|w\|_{\ell^1}}{\prod_{l=1}^{K} i_l^{\beta}}, \]
where ω is the frequency vector, and β = 2r.

Proof. Consider the vectors
\[ u^{(i)} := \begin{cases} \dfrac{d^i\omega}{dm^i} \Big/ \left\|\dfrac{d^i\omega}{dm^i}\right\|_{\ell^1} & \text{if } \left\|\dfrac{d^i\omega}{dm^i}\right\|_{\ell^1} > 1, \\[2ex] \dfrac{d^i\omega}{dm^i} & \text{if } \left\|\dfrac{d^i\omega}{dm^i}\right\|_{\ell^1} \le 1, \end{cases} \]
and apply Lemma 6.9. We thus get that there exists i such that
\[ \left| w \cdot \frac{d^i\omega}{dm^i} \right| \ge C\,\frac{\|w\|_{\ell^1}}{\prod_{l=1}^{K} i_l^{\tilde\beta}}\;\frac{1}{\prod_{l=0}^{K-1} \max\left\{1, \left\|\frac{d^l\omega}{dm^l}\right\|_{\ell^1}\right\}}, \]
with $\tilde\beta = 2r - 1$ and ω denotes here the vector with all components equal to zero except those with indexes $i_1, \dots, i_K$ which are put equal to $\omega_{i_1}, \dots, \omega_{i_K}$. Then remark that the $\ell^1$ norm of $\frac{d^i\omega}{dm^i}$ is bounded by a constant (independent of w) for all i ≥ 2, while for i = 1 we have that it is given by the sum of at most r components different from zero. Remark also that each component of $\frac{d\omega}{dm}$ is bounded uniformly with respect to the choice of the indexes, and therefore also for i = 1 the above norm is bounded by a constant independent of w. Thus we have
\[ \left| w \cdot \frac{d^i\omega}{dm^i} \right| \ge C\,\frac{\|w\|_{\ell^1}}{\|\omega\|_{\ell^1} \prod_{l=1}^{K} i_l^{\tilde\beta}}. \]
Now, as one can easily check, $\|\omega\|_{\ell^1}$ is bounded by
\[ C \prod_{l=1}^{K} \omega_{i_l}, \]
from which, using the asymptotic of the frequencies we get the thesis. □
Lemma 6.11 (Lemma 8.4 of [3]). Let $g : I \to \mathbb R$ be r times differentiable, and assume that
(1) ∀ m ∈ I there exists i ≤ r − 1 such that $|g^{(i)}(m)| > d$.
(2) There exists A such that $|g^{(i)}(m)| \le A$ ∀ m ∈ I, and ∀ i with 1 ≤ i ≤ r.
Define $I_h := \{m \in I : |g(m)| \le h\}$, then
\[ \frac{|I_h|}{|I|} \le \frac{A}{d}\,2\,(2 + 3 + \dots + r + 2d^{-1})\,h^{1/r}. \]
For the proof see [3] and [20].

By combining Lemma 6.11 and Corollary 6.10 we get the following

Lemma 6.12. For any choice of $i_1 \le i_2 \le \dots \le i_r$ and any acceptable choice σ of the signs, one has
\[ |R^\sigma_i(\gamma_1, \tau)| \le |I|\,C\,\frac{\gamma_1^{1/r}}{\prod_{l=1}^{r} i_l^{\delta}} \tag{6.18} \]
with δ = −2β + τ/r. Moreover, for any choice also of j > N one has
\[ |R^\sigma_{ij}(\gamma_1, \tau)| \le |I|\,C\,\frac{\gamma_1^{1/r}\, j^{\beta}}{\prod_{l=1}^{r-1} i_l^{\delta}} \tag{6.19} \]
and, for any choice of k ≥ j one has
\[ |R^\sigma_{ijk}(\gamma_1, \tau)| \le |I|\,C\,\frac{\gamma_1^{1/r}\, j^{\beta} k^{\beta}}{\prod_{l=1}^{r-2} i_l^{\delta}}. \tag{6.20} \]
Lemma 6.13. Assume $i_1 \le i_2 \le \dots \le i_{r-1} \le N$, then there exists $C_1$ such that $R^\sigma_{ij} = \emptyset$ when $j \ge C_1 N$.

Proof. Defining $\bar k$ as $\sum_{l=1}^{r-1} \pm e_{i_l}$ we have $|\bar k| \le r - 1$, and therefore, for large enough j we have
\[ |\bar k \cdot \omega \pm \omega_j| \ge \omega_j - (r-1)\,\omega_N, \]
which is automatically larger than $(r-1)\,\omega_N$ and also than $\gamma_1 / (i_1^\tau \cdots i_{r-1}^\tau)$ if
\[ \omega_j \ge 2(r-1)\,\omega_N, \]
which is implied by the hypothesis, with a suitable choice of $C_1$. □
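Lemma 6.14. Assume $i_1 \le i_2 \le \dots \le i_{r-1} \le N$, then one has
\[ \left| \bigcup_{j \ge N} R^\sigma_{ij}(\tau, \gamma_1) \right| \le \frac{C\,|I|\,\gamma_1^{1/r}\, N^{\beta+1}}{\prod_{l=1}^{r-1} i_l^{\delta}}. \tag{6.21} \]

Proof. We have
\[ \left| \bigcup_{j > N} R^\sigma_{ij}(\tau, \gamma_1) \right| = \left| \bigcup_{C_1 N > j \ge N} R^\sigma_{ij}(\tau, \gamma_1) \right| \le \sum_{j=N+1}^{C_1 N} |R^\sigma_{ij}|, \]
and using Lemma 6.12 this gives (6.21). □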
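The estimate of the union over j, k of the measure of the sets $R^\sigma_{ijk}$ is more difficult. It follows the proof of Lemma 8 of [18] and Lemma 3.1 of [12].

Lemma 6.15. Assume $i_1 \le i_2 \le \dots \le i_{r-2} \le N$. One has
\[ \left| \bigcup_{j \ge N,\; k > j} R^\sigma_{ijk}(\tau, \gamma_1) \right| \le C\,\frac{\gamma_1^{\theta/r^2}\, N^{\frac{\beta\theta}{r}+1}}{\prod_{l=1}^{r-2} i_l^{\frac{\tau\theta}{r^2} - 2\beta}} \]
with
\[ \theta := \frac{1}{2\beta + 1 + 1/r}. \]

Proof. Clearly the difficulties arise when the sign in front of $\omega_k$ is opposite to the sign in front of $\omega_j$. So, we denote
\[ \tilde R_{i,j,k} := \left\{ m \in I : |\omega_{i_1} \pm \omega_{i_2} \pm \dots \pm \omega_{i_{r-2}} \pm (\omega_j - \omega_k)| < \frac{\gamma_1}{i_1^\tau i_2^\tau \cdots i_{r-2}^\tau} \right\}. \]
Denote n = k − j, one has
\[ \omega_k - \omega_j = n + r_{jk} \quad \text{with} \quad |r_{jk}| \le \frac{c}{j}. \]
Now, one can easily verify that, defining
\[ Q^n_{ij} := \left\{ m \in I : |\bar k \cdot \omega + n| \le \frac{\gamma_1}{i_1^\tau i_2^\tau \cdots i_{r-2}^\tau} + \frac{c}{j} \right\}, \]
one has $\tilde R_{i,j,j+n} \subset Q^n_{ij}$ (for details see [3]); remarking also that, for $j \ge j_0$ one has $Q^n_{ij} \subset Q^n_{ij_0}$, we obtain
\[ \left| \bigcup_j \tilde R_{i,j,j+n} \right| \le \left| Q^n_{ij_0} \cup \bigcup_{j < j_0} \tilde R_{i,j,j+n} \right| \le C \left( \frac{\gamma_1}{i_1^\tau \cdots i_{r-2}^\tau} + \frac{c}{j_0} \right)^{1/r} \prod_{l=1}^{r-2} i_l^{2\beta} + \sum_{j < j_0} \frac{C\,\gamma_1^{1/r}\, j^{\beta} (j+n)^{\beta}}{\prod_{l=1}^{r-2} i_l^{\delta}} \]
\[ \le C \left( \frac{\gamma_1}{i_1^\tau \cdots i_{r-2}^\tau} + \frac{c}{j_0} \right)^{1/r} \prod_{l=1}^{r-2} i_l^{2\beta} + \frac{C\,N^{\beta}\,\gamma_1^{1/r}\, j_0^{2\beta+1}}{\prod_{l=1}^{r-2} i_l^{\delta}}, \tag{6.22} \]
where we used the fact that we always have n < CN, since otherwise, by the argument of Lemma 6.13, the set $\tilde R_{i,j,j+n}$ is empty, and Lemma 6.11 with $g = \bar k \cdot \omega + n$ and $\bar k := \sum_{l=1}^{r-2} \pm e_{i_l}$. We now choose $j_0$ so that both terms in the previous formula have the same order of magnitude. Thus we put
\[ j_0 = \left( \frac{\prod_{l=1}^{r-2} i_l^{\delta + 2\beta}}{N^{\beta}\,\gamma_1^{1/r}} \right)^{\theta}; \]
inserting in (6.22) and summing over n ≤ CN we get the thesis. □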
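Theorem 6.16. Provided δ > 1 there exists C such that
\[ \left| \bigcup_{i_1=1}^{\infty} \bigcup_{i_2=1}^{\infty} \cdots \bigcup_{i_r=1}^{\infty} \bigcup_{\sigma} R^\sigma_i \right| \le C\,\gamma_1^{1/r}\,|I| \]
(see (6.13) for the definition of $R^\sigma_i$).

Proof. By (6.18) one has
\[ \left| \bigcup_{i_1=1}^{\infty} \bigcup_{i_2=1}^{\infty} \cdots \bigcup_{i_r=1}^{\infty} \bigcup_{\sigma} R^\sigma_i \right| \le \sum_{i_1, \dots, i_r, \sigma} |R^\sigma_i| \le C\,\gamma_1^{1/r}\,|I|\,2^r \left( \sum_{i=1}^{\infty} \frac{1}{i^{\delta}} \right)^{r}, \]
from which, remarking that the series is convergent one has the thesis. □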
Remark 6.17. Here we did not assume that the indexes are smaller than N; on the contrary we considered them as varying from zero to infinity, thus this theorem ensures that the condition (3.1) is fulfilled for almost all masses.

Remark 6.18. The condition δ > 1 explicitly reads $\tau > 2r^2 + r$.

Analogously one has

Theorem 6.19. Assume
\[ \frac{\tau\theta}{r^2} - 2\beta > 1, \tag{6.23} \]
then
\[ \left| \bigcup_{i_1=0}^{N} \bigcup_{i_2=0}^{N} \cdots \bigcup_{i_{r-1}=0}^{N} \bigcup_{\sigma} \bigcup_{j \ge N+1} R^\sigma_{ij} \right| \le C\,|I|\,\gamma_1^{1/r}\, N^{\beta+1} \]
and moreover
\[ \left| \bigcup_{i_1=0}^{N} \bigcup_{i_2=0}^{N} \cdots \bigcup_{i_{r-1}=0}^{N} \bigcup_{\sigma} \bigcup_{j \ge N+1} \bigcup_{k \ge N+1} R^\sigma_{ijk} \right| \le C\,|I|\,\gamma_1^{\theta/r^2}\, N^{\frac{\beta\theta}{r}+1}. \]

Remark 6.20. Equation (6.23) is implied by $\tau > Cr^4$.

Proof of Theorem 6.5. Just take $\gamma_1 = \gamma / N^{4r^3}$ and apply Theorem 6.19. □
Remark 6.21. Note that in the definition of strongly nonvanishing frequencies the quantities entering the definition of the sets $R^\sigma_{ijk}$ appear with r + 2 instead of r. Actually the quantities entering the perturbative construction are those we estimated in this section. If one wants to fit the definition of strongly nonvanishing frequencies one should just substitute r + 2 for r in the estimates of this section.

Remark 6.22. The theory of this subsection directly applies also to the case of frequencies of the form $\omega_j = \sqrt{\mu_j + m}$ with $\mu_j \sim j^d$ and d > 1.

Acknowledgement. This work has been partially supported by INTAS-00.221 project and by "Gruppo Nazionale di Fisica Matematica" of "Istituto di Alta Matematica".
References
1. Bambusi, D., Nekhoroshev, N.N.: A property of exponential stability in nonlinear wave equation near the fundamental linear mode. Physica D 122, 73–104 (1998)
2. Bambusi, D.: Nekhoroshev theorem for small amplitude solutions in nonlinear Schrödinger equations. Math. Z. 130, 345–387 (1999)
3. Bambusi, D.: On long time stability in Hamiltonian perturbations of nonresonant linear PDE's. Nonlinearity 12, 823–850 (1999)
4. Benettin, G., Giorgilli, A., Galgani, L.: A proof of Nekhoroshev's Theorem for the stability times in nearly integrable Hamiltonian systems. Cel. Mech. 37, 1–25 (1985)
5. Benettin, G., Fröhlich, J., Giorgilli, A.: A Nekhoroshev-type Theorem for Hamiltonian systems with infinitely many degrees of freedom. Commun. Math. Phys. 119, 95–108 (1988)
6. Bourgain, J.: Construction of approximative and almost periodic solutions of perturbed linear Schrödinger and wave equation. GAFA 6, 201–230 (1995)
7. Bourgain, J.: Quasi-periodic solutions of Hamiltonian perturbations of 2D linear Schrödinger equations. Ann. of Math. 148, 363–439 (1998)
8. Bourgain, J.: On diffusion in high-dimensional Hamiltonian systems and PDE. J. Anal. Math. 80, 1–35 (2000)
9. Bourgain, J.: Nonlinear Schrödinger equations. In: Hyperbolic Equations and Frequency Interactions, L. Caffarelli, E. Weinan (eds), IAS/Park City mathematics series 5. Providence, RI: American Mathematical Society, 1999
10. Chierchia, L., You, J.: KAM tori for 1D nonlinear wave equations with periodic boundary conditions. Commun. Math. Phys. 211, 497–525 (2000)
11. Craig, W., Wayne, C.E.: Newton's method and periodic solutions of nonlinear wave equations. Comm. Pure Appl. Math. 46, 1409–1501 (1993)
12. Kuksin, S.B.: Nearly integrable infinite-dimensional Hamiltonian systems. Lect. Notes Math. 1556, Berlin-Heidelberg-New York: Springer, 1994
13. Kuksin, S.B., Pöschel, J.: Invariant Cantor manifolds of quasi-periodic oscillations for a nonlinear Schrödinger equation. Ann. Math. 142, 149–179 (1995)
14. Morbidelli, A., Giorgilli, A.: Superexponential stability of KAM tori. J. Statist. Phys. 78, 1607–1617 (1995)
15. Nekhoroshev, N.N.: Behaviour of Hamiltonian systems close to integrable. Funct. Anal. Appl. 5, 338–339 (1971)
16. Nekhoroshev, N.N.: Exponential estimate of the stability time of near integrable Hamiltonian systems. Russ. Math. Surveys 32(6), 1–65 (1977)
17. Niederman, L.: Nonlinear stability around an elliptic equilibrium point in a Hamiltonian system. Nonlinearity 11, 1465–1479 (1998)
18. Pöschel, J.: A KAM-Theorem for some partial differential equations. Ann. Scuola Norm. Sup. Pisa Cl. Sci. 23, 119–148 (1996)
19. Pöschel, J.: Quasi-periodic solutions for a nonlinear wave equation. Comment. Math. Helv. 71, 269–296 (1996)
20. Xu, J., You, J., Qiu, Q.: Invariant tori for nearly integrable Hamiltonian systems with degeneracy. Math. Z. 226, 375–387 (1997)

Communicated by P. Sarnak
Commun. Math. Phys. 234, 287–338 (2003) Digital Object Identifier (DOI) 10.1007/s00220-002-0767-3
Communications in
Mathematical Physics
Distribution of the First Particle in Discrete Orthogonal Polynomial Ensembles
Alexei Borodin¹, Dmitriy Boyarchenko²
¹ School of Mathematics, Institute for Advanced Study, Princeton, NJ 08540, USA. E-mail: [email protected]
² Department of Mathematics, University of Pennsylvania, Philadelphia, PA 19104-6395, USA. E-mail: [email protected]
Received: 12 April 2002 / Accepted: 17 September 2002
Published online: 20 January 2003 – © Springer-Verlag 2003
Abstract: We show that the distribution function of the first particle in a discrete orthogonal polynomial ensemble can be obtained through a certain recurrence procedure, if the (difference or q-) log-derivative of the weight function is rational. In a number of classical special cases the recurrence procedure is equivalent to the difference and q-Painlevé equations of [10, 17]. Our approach is based on the formalism of discrete integrable operators and discrete Riemann–Hilbert problems developed in [3, 4].

Contents
1. Introduction
2. Discrete Riemann-Hilbert Problems and Orthogonal Polynomials
3. Lax Pairs for Solutions of DRHP
4. Recurrence Relation for Fredholm Determinants
5. Compatibility Conditions for Lax Pairs
6. Initial Conditions for Recurrence Relations
7. Lax Pairs for Discrete Orthogonal Polynomials of the Askey Scheme
8. Solution of the Compatibility Condition: The General Case
9. The Fifth and the Fourth Discrete Painlevé Equations
10. A Connection with the q-P_VI Equation of M. Jimbo and H. Sakai
11. Applications: Recurrence Relations for Some Polynomials of the Askey Scheme
12. Numerical Computations
1. Introduction

1.1. The basic problem considered in this paper is the following. Let X be a locally finite subset of $\mathbb R$ and $w : X \to \mathbb R_{>0}$ be a positive-valued function on X with finite moments:
\[ \sum_{x \in X} |x|^n\, w(x) < \infty, \qquad n = 0, 1, \dots. \]
Fix a positive integer k (the number of particles) and consider the probability measure on all k-point subsets of X given by
\[ \mathrm{Prob}\{x_1, \dots, x_k\} = \mathrm{const} \cdot \prod_{1 \le i < j \le k} (x_i - x_j)^2 \prod_{i=1}^{k} w(x_i). \tag{1.1} \]
We are interested in the distribution of $\max\{x_1, \dots, x_k\}$ with respect to this measure. The problem is motivated by random matrix theory on one side, and by combinatorial and representation theoretic models on the other one. In random matrix theory, probability measures of the form
\[ \mathrm{const} \cdot \prod_{1 \le i < j \le k} (x_i - x_j)^2 \prod_{i=1}^{k} w(x_i)\,dx_i \]
on k-point subsets of $\mathbb R$, with w(x) being a smooth function on a subinterval of $\mathbb R$, play a prominent role. Most computations for such models are conveniently done by means of the orthogonal polynomials associated with the weight function w(x). On this ground, these measures are often called orthogonal polynomial ensembles. See [14, 15] and references therein for a further discussion.

The problem of describing the distribution of the $\max\{x_i\}$ in the continuous setting for the classical weights has been solved in the following sense: the distribution function was explicitly written in terms of a specific solution of one of the six (2nd order nonlinear ordinary differential) Painlevé equations. It was done in [20] for the Hermite weight $w(x) = \exp(-x^2)$, $x \in \mathbb R$, and for the Laguerre weight $x^a \exp(-x)$, x > 0; in [20, 9] for the Jacobi weight $(1-x)^a (1+x)^b$, $x \in (-1, 1)$; and in [22, 5] for the quasi-Jacobi weight $(1-ix)^s (1+ix)^{\bar s}$, $x \in \mathbb R$.¹ Thus, it is natural to ask what would be an analog of these results when we take w to be a classical discrete weight function.

On the other hand, in recent years the random variable $\max\{x_i\}$ with $x_i$'s distributed according to (1.1) with certain specific weights, came up as the main quantity of interest in a number of problems originating in combinatorics, first-passage percolation, representation theory, and growth processes, see e.g. [7, 11, 12, 2] and references therein.

1.2. In order to state our first result we need to introduce more notation. Let us denote the points of X by $\pi_s$, s = 0, 1, ..., N, with $\pi_0 < \pi_1 < \cdots$ or $\pi_0 > \pi_1 > \cdots$. Here N = |X| − 1 may be finite or infinite. We use the following two basic assumptions:
• There exists an affine transformation $\sigma : \mathbb R \to \mathbb R$ such that $\sigma\pi_{s+1} = \pi_s$ for all s, 0 ≤ s < N.
• There exist polynomials P(x) and Q(x) such that
\[ \frac{w(\pi_{s-1})}{w(\pi_s)} = \frac{P(\pi_s)}{Q(\pi_s)}, \qquad 1 \le s \le N, \]
and $P(\pi_0) = 0$.
The orthogonality data for a number (but not all) hypergeometric polynomials of the Askey scheme satisfy these assumptions, see Sect. 7 below for details.²

¹ Continuous problems of this type have been extensively studied. We refer to the introduction of [5] for a brief review and references.
² For some classical families of polynomials both assumptions are satisfied but the orthogonality set X is not locally finite. We extend our results to those cases, see Sect. 4 below.
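The second assumption can be checked by hand for the simplest classical weights. The Python sketch below is an added illustration, not part of the original text. It takes $X = \mathbb Z_{\ge 0}$ with $\pi_s = s$ and verifies in exact arithmetic that the Charlier weight $a^x/x!$ and the Meixner weight $(\beta)_x c^x/x!$ have rational ratio $w(\pi_{s-1})/w(\pi_s)$. The candidate polynomials P(x) = x, Q(x) = a (Charlier) and P(x) = x, Q(x) = c(x + β − 1) (Meixner) are worked out here directly from the weights and are assumptions of this sketch; all function names are hypothetical.

```python
from fractions import Fraction
from math import factorial


def charlier_weight(x, a):
    """Charlier weight a**x / x! on the nonnegative integers."""
    return a ** x / factorial(x)


def meixner_weight(x, beta, c):
    """Meixner weight (beta)_x * c**x / x!, with (beta)_x the Pochhammer symbol."""
    poch = Fraction(1)
    for j in range(x):
        poch *= beta + j
    return poch * c ** x / factorial(x)


def ratio_is_rational(weight, P, Q, s_max):
    """Check w(s-1)/w(s) == P(s)/Q(s) for s = 1..s_max, together with P(0) == 0."""
    ok = all(weight(s - 1) / weight(s) == P(s) / Q(s) for s in range(1, s_max + 1))
    return ok and P(0) == 0


# Sample parameter values (arbitrary choices made for this illustration).
a = Fraction(3, 2)
beta, c = Fraction(5, 2), Fraction(1, 3)

charlier_ok = ratio_is_rational(lambda x: charlier_weight(x, a),
                                lambda x: x, lambda x: a, 20)
meixner_ok = ratio_is_rational(lambda x: meixner_weight(x, beta, c),
                               lambda x: x, lambda x: c * (x + beta - 1), 20)
```

In both cases the affine map of the first assumption is the unit shift sending s + 1 to s.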
We prove that under the two conditions above, there exists a certain recurrence procedure which computes the gap probability
\[ D_s = \mathrm{Prob}\{x_i \notin \{\pi_s, \pi_{s+1}, \dots\} \text{ for all } i\} = \begin{cases} \mathrm{Prob}\{\max\{x_i\} < \pi_s\}, & \text{if } \pi_0 < \pi_1 < \cdots, \\ \mathrm{Prob}\{\min\{x_i\} > \pi_s\}, & \text{if } \pi_0 > \pi_1 > \cdots, \end{cases} \]
with $x_i$'s distributed according to (1.1). In fact, the recurrence procedure produces a sequence $(A_s, M_s(\zeta))$, where $A_s$ is a nilpotent 2 by 2 matrix and $M_s(\zeta)$ is a matrix polynomial
\[ M_s(\zeta) = M_s^{(l)} \zeta^l + \dots + M_s^{(0)}, \qquad M_s^{(i)} \in \mathrm{Mat}(2, \mathbb C), \]
of degree $l = \max\{\deg P, \deg Q\}$. The elementary step of the recurrence is provided by the equality
\[ \left( I + \frac{A_s}{\sigma\zeta - \pi_s} \right) M_s(\zeta) = M_{s+1}(\zeta) \left( I + \frac{A_{s+1}}{\zeta - \pi_{s+1}} \right). \tag{1.2} \]
It is not hard to see that if $\det M_s(\pi_{s+1}) \ne 0$ (which is always the case in our setting) then (1.2) defines $(A_{s+1}, M_{s+1})$ uniquely provided that we know $(A_s, M_s)$. However, the existence of $(A_{s+1}, M_{s+1})$ is not obvious and needs to be proved.³ Again, in our setting it always holds. We then show that the ratio
\[ \left( \frac{D_{s+3}}{D_{s+2}} - \frac{D_{s+2}}{D_{s+1}} \right) \left( \frac{D_{s+2}}{D_{s+1}} - \frac{D_{s+1}}{D_s} \right)^{-1} \]
is an explicit rational function of $A_s, M_s^{(l)}, \dots, M_s^{(0)}$ and $A_{s+1}, M_{s+1}^{(l)}, \dots, M_{s+1}^{(0)}$. Since $D_s = \mathrm{Prob}\{\max\{x_i\} < \pi_s\}$ is nonzero only if s ≥ k (recall that k is the number of $x_i$'s in (1.1)), it is enough to provide the initial conditions $D_k, D_{k+1}, D_{k+2}, A_k, M_k(\zeta)$ in order to be able to compute $D_s$ for arbitrary s. These initial conditions are readily expressed in terms of $\{\pi_s\}$ and $\{w(\pi_s)\}$, see Sect. 6 below.

For certain classical weights w the recurrence relation (1.2) can be substantially simplified. To illustrate the situation, let us consider $X = \mathbb Z_{\ge 0}$ and $w(x) = a^x / x!$, where a > 0 is a parameter. This weight function corresponds to the Charlier orthogonal polynomials. In this case, $A_s$ and $M_s(\zeta)$ can be parameterized by three scalar sequences $a_s, b_s, c_s$ as follows:
\[ M_s(\zeta) = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \zeta + \begin{pmatrix} b_s & b_s c_s \\ a/c_s & a \end{pmatrix}, \qquad A_s = (k + b_s) \begin{pmatrix} -1 & -a_s c_s \\ 1/(a_s c_s) & 1 \end{pmatrix}. \tag{1.3} \]
Then the equality (1.2) leads to the following recurrence relations:
\[ a_{s+1} = \frac{(b_s + a a_s)(k + b_s + a a_s)}{a a_s\,(s + 1 + b_s + a a_s)}, \tag{1.4} \]
\[ b_{s+1} = \frac{s+1}{1 - a_{s+1}} - (s + 1 + k + b_s + a a_s), \tag{1.5} \]
\[ c_{s+1} = \frac{a a_s c_s}{k + b_s + a a_s}. \tag{1.6} \]

³ In fact, for 2 by 2 matrices one can easily see that $(A_{s+1}, M_{s+1}^{(l)}, \dots, M_{s+1}^{(0)})$ are rational functions of $(A_s, M_s^{(l)}, \dots, M_s^{(0)})$.
The connection of these sequences and the distribution $D_s$ is given by
\[ \left( \frac{D_{s+3}}{D_{s+2}} - \frac{D_{s+2}}{D_{s+1}} \right) \left( \frac{D_{s+2}}{D_{s+1}} - \frac{D_{s+1}}{D_s} \right)^{-1} = \frac{c_s\,(b_s + a a_s)(b_{s+1} + a a_{s+1})}{a\,(s+2)\,a_{s+1}^2\,c_{s+1}}. \]
The corresponding initial conditions can be found in Subsect. 11.2. Under the change of variables
\[ f_s = a_s^{-1}, \qquad g_s = a a_s + b_s + s + 1, \]
(1.4)–(1.5) turn into
\[ f_s f_{s+1} = \frac{a g_s}{(g_s - s - 1)(g_s + k - s - 1)}, \]
\[ g_s + g_{s+1} = \frac{a}{f_{s+1}} - \frac{s+1}{1 - f_{s+1}} - k + 2s + 3. \]
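The passage from (1.4)–(1.5) to the two equations for $f_s$, $g_s$ can be confirmed in exact rational arithmetic. The Python sketch below is an added illustration: it iterates (1.4)–(1.6) from arbitrary starting values and checks that the transformed variables satisfy both equations at every step. The starting values are hypothetical and are not the initial conditions of Subsect. 11.2, so the resulting sequences do not describe an actual distribution $D_s$; only the algebraic identity is being tested. Function names are likewise hypothetical.

```python
from fractions import Fraction


def charlier_step(s, a_s, b_s, c_s, a, k):
    """One step of (1.4)-(1.6): returns (a_{s+1}, b_{s+1}, c_{s+1})."""
    t = b_s + a * a_s                          # the recurring combination b_s + a*a_s
    a_next = t * (k + t) / (a * a_s * (s + 1 + t))
    b_next = Fraction(s + 1) / (1 - a_next) - (s + 1 + k + t)
    c_next = a * a_s * c_s / (k + t)
    return a_next, b_next, c_next


def dpiv_residuals(s, a_s, b_s, a_next, b_next, a, k):
    """Differences between the two sides of the equations for f_s, g_s."""
    f_s, f_next = 1 / a_s, 1 / a_next
    g_s = a * a_s + b_s + s + 1
    g_next = a * a_next + b_next + s + 2
    r1 = f_s * f_next - a * g_s / ((g_s - s - 1) * (g_s + k - s - 1))
    r2 = g_s + g_next - (a / f_next - (s + 1) / (1 - f_next) - k + 2 * s + 3)
    return r1, r2


def run_check(steps=3):
    """Iterate from arbitrary (hypothetical) data and collect all residuals."""
    a, k = Fraction(1, 2), 2
    s, a_s, b_s, c_s = 2, Fraction(1, 3), Fraction(1, 5), Fraction(1)
    residuals = []
    for _ in range(steps):
        a_next, b_next, c_next = charlier_step(s, a_s, b_s, c_s, a, k)
        residuals.extend(dpiv_residuals(s, a_s, b_s, a_next, b_next, a, k))
        s, a_s, b_s, c_s = s + 1, a_next, b_next, c_next
    return residuals
```

Every residual returned by `run_check()` should vanish identically; a nonzero value would signal a transcription error in one of the formulas.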
This recurrence is immediately identified with the difference Painlevé IV equation (dPIV) of [17].

1.3. It turns out that the situation for the Charlier weight described above is rather typical. We are also able to reduce (1.2) to scalar rational recurrence relations for other classical weight functions. Our results are summarized in the following table.

Family of orthogonal polynomials | Orthogonality set | Weight defining the family | Corresponding (q-)difference equation
Charlier polynomials | $X = \mathbb Z_{\ge 0}$ | $\omega(x) = a^x / x!$, a > 0 | difference Painlevé IV, see Sect. 11.2
Meixner polynomials | $X = \mathbb Z_{\ge 0}$ | $\omega(x) = \frac{(\beta)_x}{x!} c^x$, β > 0, 0 < c < 1 | difference Painlevé V, see Sect. 11.4
Krawtchouk polynomials | $X = \{0, 1, \dots, N\}$ | $\omega(x) = \binom{N}{x} p^x (1-p)^{N-x}$, $N \in \mathbb Z_{\ge 0}$, 0 < p < 1 | difference Painlevé V, see Sect. 11.5
q-Charlier polynomials | $X = \{q^{-x} \mid x \in \mathbb Z_{\ge 0}\}$ | $\omega(x) = \frac{a^x}{(q;q)_x} q^{\binom{x}{2}}$, a > 0 | degeneration of q-P_VI, see Sect. 7.5, Sect. 10.3 and Sect. 11.7
Alternative q-Charlier polynomials | $X = \{q^{x} \mid x \in \mathbb Z_{\ge 0}\}$ | $\omega(x) = \frac{a^x q^{\binom{x+1}{2}}}{(q;q)_x}$, a > 0 | see the next paragraph
Little q-Laguerre (or Wall) polynomials | $X = \{q^{x} \mid x \in \mathbb Z_{\ge 0}\}$ | $\omega(x) = \frac{(aq)^x}{(q;q)_x}$, $0 < a < q^{-1}$ | degeneration of q-P_VI, see Sect. 7.5, Sect. 10.3 and Sect. 11.7
Little q-Jacobi polynomials | $X = \{q^{x} \mid x \in \mathbb Z_{\ge 0}\}$ | $\omega(x) = \frac{(bq;q)_x}{(q;q)_x} (aq)^x$, $0 < a < q^{-1}$, $b < q^{-1}$ | q-Painlevé VI, see Sect. 7.5, Sect. 10.1 and Sect. 11.7
q-Krawtchouk polynomials | $X = \{q^{-x} \mid x = 0, \dots, N\}$ | $\omega(x) = \frac{(q^{-N};q)_x}{(q;q)_x} (-p)^{-x}$, $N \in \mathbb Z_{\ge 0}$, p > 0 | q-Painlevé VI, see Sect. 7.5, Sect. 10.1 and Sect. 11.7
It is remarkable that in almost all the cases we can solve explicitly, we end up with one of the equations of Sakai's hierarchy which was constructed out of purely algebraic geometric considerations, see [17]. (We were not able to see such a reduction in the alternative q-Charlier case, but we do not claim that there is none.) So far we have not found a conceptual explanation for this fact. One can notice, however, that recurrence relations originating from (1.2) must have some kind of singularity confinement property. (This property was the starting point of Sakai's work.) For example, the parameterization (1.3) does not make much sense if, say, $A_s$ has a zero (2,1) element. Then the values of $a_s$ and $c_s$ are not well-defined. In terms of the recurrence relations, this is reflected by vanishing of one of the denominators in (1.4)–(1.6). However, the matrix sequence $\{(A_s, M_s)\}$ does not feel this singularity, which means that the sequences $\{a_s\}$, $\{b_s\}$, $\{c_s\}$ can be "continued through" their singular values. Of course, all Sakai's equations have this kind of singularity confinement by construction.

Let us also point out that it is not clear at this point whether the weights of higher hypergeometric polynomials of the Askey scheme will also lead to one of Sakai's equations. All the cases that we were able to solve explicitly have linear matrices $M_s(\zeta)$ in (1.2), while, say, for the Hahn weight $M_s(\zeta)$ is quadratic. Handling such cases seems to be a problem of the next level of difficulty. It remains an interesting open problem to derive explicit rational recurrence relations for $\deg M_s = 2$.

For Charlier and Meixner weights, $D_s$ can also be written as Toeplitz determinants with symbols
\[ (1+z)^k \exp(a z^{-1}) \qquad \text{and} \qquad (1+z)^k (1 + b z^{-1})^c \tag{1.7} \]
respectively, see [21, 12, 6]. Here a, b, c are parameters. Among previous results on the subject let us mention
• the derivation of dPII for Toeplitz determinants with the symbol $\exp(\theta(z + z^{-1}))$, see [2, 4, 1] (note also the derivation of the same equation for the quantity closely related to these Toeplitz determinants in [16, 21]);
• derivation of dPV for Toeplitz determinants with symbol $(1+z)^k (1 + b z^{-1})^c$ and k being not necessarily integral in [4];
• derivation of rational recurrence relations for Toeplitz determinants with symbols of the form
\[ \exp\bigl(P_1(z) + P_2(z^{-1})\bigr)\, z^{\gamma}\, (1 - d_1 z)^{\gamma_1} (1 - d_2 z)^{\gamma_2} (1 - d_1^{-1} z^{-1})^{\gamma_1'} (1 - d_2^{-1} z^{-1})^{\gamma_2'}, \tag{1.8} \]
where $P_1$ and $P_2$ are polynomials with $|\deg P_1 - \deg P_2| \le 1$, and $\gamma_1, \gamma_2, \gamma_1', \gamma_2', d_1, d_2$ are constants, see [1].
Interestingly enough, for the symbols (1.7), which are special cases of (1.8), the relations of [1] do not seem to have much in common with those of [4] and the present paper.

1.4. The methods used in this paper are based on the formalism of discrete integrable operators and discrete Riemann–Hilbert problem (DRHP) developed in [3, 4]. The first step is to represent $D_s$ as a Fredholm determinant of an integrable operator: $D_s = \det(1 - K_s)$, where $K_s$ is an operator in $\ell^2(\{\pi_s, \pi_{s+1}, \dots\})$ with the kernel
\[ K_s(x, y) = \frac{1}{\|p_{k-1}\|^2_{\ell^2(X, w)}}\; \frac{p_{k-1}(x)\,p_k(y) - p_k(x)\,p_{k-1}(y)}{x - y}\; \sqrt{w(x)\,w(y)}. \]
Here $p_k$ and $p_{k-1}$ are monic k-th and (k−1)-st orthogonal polynomials on X with respect to the weight function w. Using the results of [3, 4], the computation of such a Fredholm determinant can be reduced to solving a DRHP on $\{\pi_0, \dots, \pi_{s-1}\}$ with a jump matrix easily expressible in terms of w. Our assumptions on X and w, see above, then allow us to obtain a Lax pair for the solution $m_s(\zeta)$ of this DRHP, which has the form
\[ m_{s+1}(\zeta) = \left( I + \frac{A_s}{\zeta - \pi_s} \right) m_s(\zeta), \qquad m_s(\sigma\zeta) = M_s(\zeta)\, m_{s+1}(\zeta)\, D^{-1}(\zeta). \tag{1.9} \]
Here $M_s(\zeta)$ is a matrix polynomial and D(ζ) is a fixed diagonal matrix polynomial. The compatibility condition for this pair of equations is exactly (1.2).

The paper is organized as follows. In Sect. 2 we reduce the problem of computing discrete orthogonal polynomials with a given weight to a DRHP. In Sect. 3 we derive the Lax pair (1.9). In Sect. 4 we show how to express the Fredholm determinants $D_s$ in terms of $(A_s, M_s)$. In Sect. 5 we prove that the compatibility condition (1.2) always has a unique solution. In Sect. 6 we derive the initial conditions $A_k, M_k, D_k, D_{k+1}$. In Sect. 7 we write down explicitly the Lax pairs for 14 families of discrete hypergeometric orthogonal polynomials of the Askey scheme. In Sect. 8 we solve (1.2) in terms of rational recurrences for the matrix elements of $A_s$ and $M_s$ for $\deg M_s = 1$. In Sect. 9 we show how to reduce (1.2) to dPIV and dPV equations in the case when $\deg M_s = 1$ and σζ = ζ − 1. In Sect. 10 we reduce (1.2) to q-PVI or its degeneration in the case when $\deg M_s = 1$ and $\sigma\zeta = q^{\pm 1}\zeta$. Finally, in Sect. 11 we solve (1.2) in terms of difference and q-Painlevé equations for 7 families of classical orthogonal polynomials. At the end of the paper we also provide a few plots of the "density function" (difference or q-derivative of $D_s$) obtained using the formulas of Sect. 11.

This research was partially conducted during the period the first author (A. B.) served as a Clay Mathematics Institute Long-Term Prize Fellow. He was also partially supported by the NSF grant DMS-9729992.

1.5. The following notation is used throughout our paper. For an integer k, we write $\mathbb Z_{\ge k} = \{k, k+1, k+2, \dots\}$.
If $a, q \in \mathbb C$ and $k \in \mathbb Z_{\ge 0}$, one defines the Pochhammer symbol and its q-analogue (often also called the q-shifted factorial) by
\[ (a)_0 := 1, \qquad (a)_k := a(a+1)(a+2)\cdots(a+k-1) \quad \text{if } k \ge 1 \]
and
\[ (a; q)_0 := 1, \qquad (a; q)_k := (1-a)(1-aq)(1-aq^2)\cdots(1-aq^{k-1}) \quad \text{if } k \ge 1, \]
respectively. One usually writes
\[ (a_1, \dots, a_r)_k = \prod_{j=1}^{r} (a_j)_k \qquad \text{and} \qquad (a_1, \dots, a_r; q)_k = \prod_{j=1}^{r} (a_j; q)_k. \]
If $r, s \in \mathbb Z_{\ge 0}$ and $a_1, \dots, a_r, b_1, \dots, b_s, z, q \in \mathbb C$, the hypergeometric series and the basic hypergeometric series are defined by
\[ {}_rF_s\!\left( \begin{matrix} a_1, \dots, a_r \\ b_1, \dots, b_s \end{matrix} \,\Big|\, z \right) := \sum_{k=0}^{\infty} \frac{(a_1, \dots, a_r)_k}{(b_1, \dots, b_s)_k}\, \frac{z^k}{k!} \]
and
\[ {}_r\phi_s\!\left( \begin{matrix} a_1, \dots, a_r \\ b_1, \dots, b_s \end{matrix} \,\Big|\, q; z \right) := \sum_{k=0}^{\infty} \frac{(a_1, \dots, a_r; q)_k}{(b_1, \dots, b_s; q)_k}\, (-1)^{(1+s-r)k}\, q^{(1+s-r)\binom{k}{2}}\, \frac{z^k}{(q; q)_k}, \]
respectively.
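For readers who wish to experiment with these symbols, here is a minimal Python sketch of the two finite products, added as an illustration. The function names are hypothetical; the code works with any numeric type supporting `+`, `-`, `*` and integer powers (for instance `int`, `float` or `fractions.Fraction`).

```python
from fractions import Fraction


def pochhammer(a, k):
    """(a)_k = a (a+1) ... (a+k-1), with (a)_0 = 1."""
    result = 1
    for j in range(k):
        result *= a + j
    return result


def q_pochhammer(a, q, k):
    """(a; q)_k = (1-a)(1-aq) ... (1-a q^(k-1)), with (a; q)_0 = 1."""
    result = 1
    for j in range(k):
        result *= 1 - a * q ** j
    return result


def multi_pochhammer(params, k):
    """(a_1, ..., a_r)_k as the product of the individual symbols."""
    result = 1
    for a in params:
        result *= pochhammer(a, k)
    return result
```

For example, `pochhammer(1, k)` reproduces k!, which is why the Charlier weight $a^x/x!$ can equally be written as $a^x/(1)_x$.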
2. Discrete Riemann-Hilbert Problems and Orthogonal Polynomials

2.1. In this section we explain how solutions of discrete Riemann-Hilbert problems (DRHP) for jump matrices of a special type can be expressed in terms of the corresponding orthogonal polynomials. Let X be a discrete locally finite subset of $\mathbb C$, and let $w : X \to \mathrm{Mat}(2, \mathbb C)$ be a function. As in [3, 4], we say that an analytic function
\[ m : \mathbb C \setminus X \longrightarrow \mathrm{Mat}(2, \mathbb C) \]
solves the DRHP (X, w) if m has simple poles at the points of X and its residues at these points are given by the jump (or residue) condition
\[ \operatorname*{Res}_{\zeta = x} m(\zeta) = \lim_{\zeta \to x} \bigl( m(\zeta)\, w(x) \bigr), \qquad x \in X. \tag{2.1} \]
Lemma 2.1. If m(ζ ) is a solution of the DRHP (X, w) and the matrix w(x) is nilpotent for all x ∈ X, then the function det m(ζ ) is entire. If, in addition, det m(ζ ) → 1 as ζ → ∞, then det m(ζ ) ≡ 1. Proof. For each x ∈ X, the jump condition (2.1) implies that the function m(ζ ) · I − (ζ − x)−1 w(x) is analytic in a neighborhood of x. Since w(x) is nilpotent, this product has the same determinant as m(ζ ), which shows that det m(ζ ) has no pole at x. The second statement of the lemma follows from Liouville’s theorem. 2.2. We now assume that the matrix w(x) has the following form: 0 ω(x) w(x) = , 0 0
(2.2)
where $\omega : X \to \mathbb{C}$ is a function. Recall that a collection $\{P_n(\zeta)\}_{n=0}^{\infty}$ of complex polynomials is called the collection of orthogonal polynomials associated to the weight function $\omega$ if
• $P_n$ is a polynomial of degree $n$ for all $n = 1, 2, \dots$, and $P_0 \equiv \mathrm{const}$;
• if $m \ne n$, then $\sum_{x \in X} P_m(x) P_n(x) \omega(x) = 0$.

We will always take $P_n$ to be monic: $P_n(x) = x^n + \text{lower terms}$. In order for the definition to make sense, we assume that all moments of the weight function $\omega$ are finite, i.e.,
\[ \text{the series } \sum_{x \in X} |\omega(x) x^j| \text{ converges for all } j \ge 0. \tag{2.3} \]
Under this condition, one can consider the following inner product on the space $\mathbb{C}[\zeta]$ of all complex polynomials:
\[ (f(\zeta), g(\zeta))_\omega := \sum_{x \in X} f(x) g(x) \omega(x). \]
It is clear that there exists a collection of orthogonal polynomials $\{P_n(\zeta)\}$ associated to $\omega$ such that $(P_n, P_n)_\omega \ne 0$ for all $n$ if and only if the restriction of $(\cdot, \cdot)_\omega$ to the space $\mathbb{C}[\zeta]_{\le d}$ of polynomials of degree at most $d$ is nondegenerate for all $d \ge 0$. If this condition holds, we say that the weight function $\omega$ is nondegenerate. In this case, it is also clear that the collection $\{P_n\}$ is unique.
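To make the definition concrete, the following is a minimal sketch (not taken from the paper) of how the monic orthogonal polynomials can be computed for a finite orthogonality set by Gram-Schmidt with respect to $(\cdot, \cdot)_\omega$. The function names, the sample set $X = \{0, 1, 2, 3\}$ and the constant weight are illustrative assumptions; exact rational arithmetic is used so that orthogonality can be checked without rounding.

```python
from fractions import Fraction

def poly_eval(coeffs, x):
    """Evaluate a polynomial with coefficients [c0, c1, ...] (lowest degree first)."""
    result = Fraction(0)
    for c in reversed(coeffs):
        result = result * x + c
    return result

def inner(f, g, points, weight):
    """Discrete inner product (f, g)_omega = sum_x f(x) g(x) omega(x)."""
    return sum(poly_eval(f, x) * poly_eval(g, x) * w for x, w in zip(points, weight))

def monic_orthogonal_polys(points, weight, max_degree):
    """Gram-Schmidt on 1, zeta, zeta^2, ...; returns monic P_0, ..., P_max_degree.

    Assumes the weight is nondegenerate, so that (P_j, P_j)_omega != 0.
    """
    polys = []
    for n in range(max_degree + 1):
        mono = [Fraction(0)] * n + [Fraction(1)]      # the monomial zeta^n
        p = list(mono)
        for q in polys:
            coef = inner(mono, q, points, weight) / inner(q, q, points, weight)
            for i, c in enumerate(q):
                p[i] -= coef * c                      # subtract the projection onto q
        polys.append(p)
    return polys

# Illustrative data (an assumption, not from the paper).
points = [0, 1, 2, 3]
weight = [1, 1, 1, 1]
polys = monic_orthogonal_polys(points, weight, 2)
```

For this sample weight the sketch yields $P_1(\zeta) = \zeta - 3/2$ and $P_2(\zeta) = \zeta^2 - 3\zeta + 1$, stored as coefficient lists with the lowest degree first.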
A. Borodin, D. Boyarchenko
Remark 2.2. If the set $X$ is finite, one has to modify the definitions above. Indeed, if $X$ consists of $N + 1$ points ($N \in \mathbb{Z}_{\ge 0}$), the inner product $(\cdot, \cdot)_\omega$ is necessarily degenerate on $\mathbb{C}[\zeta]_{\le d}$ for all $d > N$. So, instead, we require that $(\cdot, \cdot)_\omega$ be nondegenerate on $\mathbb{C}[\zeta]_{\le d}$ for $0 \le d \le N$, and we are only interested in a collection $\{P_n(\zeta)\}_{n=0}^{N}$ of orthogonal polynomials of degrees up to $N$. On the other hand, the condition (2.3) is empty in this case.

Remark 2.3. If the values of the weight function $\omega$ are real and strictly positive and the orthogonality set $X$ is contained in $\mathbb{R}$, then $\omega$ is automatically nondegenerate.
2.3. The first basic result of the paper is

Theorem 2.4 (Solution of DRHP). Let $X$ be a discrete locally finite subset of $\mathbb{C}$ and $\omega : X \to \mathbb{C}$ be a nondegenerate weight function satisfying (2.3). Let $\{P_n(\zeta)\}_{n=0}^{N}$ be the collection of monic orthogonal polynomials associated to $\omega$, where $N = \mathrm{card}(X) - 1 \in \mathbb{Z}_{\ge 0} \cup \{\infty\}$. Assume that the jump matrix $w(x)$ is given by (2.2). Then for any $k = 1, 2, \dots, N$, the DRHP $(X, w)$ has a unique solution $m_X(\zeta)$ satisfying the asymptotic condition
\[ m_X(\zeta) \cdot \begin{pmatrix} \zeta^{-k} & 0 \\ 0 & \zeta^{k} \end{pmatrix} = I + O\!\left( \frac{1}{\zeta} \right) \quad \text{as } \zeta \to \infty, \tag{2.4} \]
where $I$ is the identity matrix. If we write
\[ m_X(\zeta) = \begin{pmatrix} m_X^{11}(\zeta) & m_X^{12}(\zeta) \\ m_X^{21}(\zeta) & m_X^{22}(\zeta) \end{pmatrix}, \]
then $m_X^{11}(\zeta) = P_k(\zeta)$ and $m_X^{21}(\zeta) = (P_{k-1}, P_{k-1})_\omega^{-1} \cdot P_{k-1}(\zeta)$.
Remark 2.5. The asymptotic condition (2.4) needs to be made more precise. Indeed, if $X$ is infinite, then the LHS of (2.4) has poles accumulating at infinity. In this case, we require that the asymptotics be uniform on a sequence of expanding contours (e.g., circles whose radii tend to $\infty$) whose distance from $X$ remains bounded away from zero. A similar remark applies to all asymptotic formulas below.

Remark 2.6. A result for continuous weight functions similar to Theorem 2.4 was proved in [8].

Proof. Fix a natural number $k \le N$ and define a matrix-valued function
\[ m = \begin{pmatrix} m^{11} & m^{12} \\ m^{21} & m^{22} \end{pmatrix} : \mathbb{C} \to \mathrm{Mat}(2, \mathbb{C}) \]
by
\[ m^{11}(\zeta) = P_k(\zeta), \qquad m^{12}(\zeta) = \sum_{x \in X} \frac{P_k(x) \omega(x)}{\zeta - x}, \]
\[ m^{21}(\zeta) = c \cdot P_{k-1}(\zeta), \qquad m^{22}(\zeta) = c \cdot \sum_{x \in X} \frac{P_{k-1}(x) \omega(x)}{\zeta - x}, \]
where $c$ is the unique constant for which $m^{22}(\zeta) = \zeta^{-k} + O(\zeta^{-k-1})$ as $\zeta \to \infty$ (we will show below that such a $c$ exists, and in fact $c = (P_{k-1}, P_{k-1})_\omega^{-1}$). We first observe that (2.3) implies that the series for $m^{12}(\zeta)$ converges uniformly and absolutely on compact subsets of $\mathbb{C} \setminus X$, and hence $m^{12}(\zeta)$ is analytic on the complement of $X$. Moreover, since for each $x \in X$, the series
\[ \sum_{y \in X \setminus \{x\}} \frac{P_k(y) \omega(y)}{\zeta - y} \]
converges uniformly and absolutely on compact subsets of $(\mathbb{C} \setminus X) \cup \{x\}$, we see that $m^{12}(\zeta)$ has a simple pole at $x$, with the residue given by
\[ \operatorname*{Res}_{\zeta = x} m^{12}(\zeta) = P_k(x) \omega(x). \]
Similarly, $m^{22}(\zeta)$ is analytic away from $X$, with a simple pole at each $x \in X$ satisfying
\[ \operatorname*{Res}_{\zeta = x} m^{22}(\zeta) = c \cdot P_{k-1}(x) \omega(x). \]
This shows that our matrix $m(\zeta)$ satisfies the jump condition (2.1). To verify the asymptotic condition (and find the constant $c$), note that
\[ m^{11}(\zeta) \cdot \zeta^{-k} = 1 + O(1/\zeta), \qquad m^{21}(\zeta) \cdot \zeta^{-k} = O(1/\zeta), \qquad \text{as } \zeta \to \infty. \]
Next, we write
\[ \frac{1}{\zeta - x} = \frac{1}{\zeta} \cdot \left( 1 - \frac{x}{\zeta} \right)^{-1} = \frac{1}{\zeta} + \frac{x}{\zeta^2} + \dots + \frac{x^{k-1}}{\zeta^{k}} + O(\zeta^{-k-1}). \tag{2.5} \]
By the definition of orthogonal polynomials, we have
\[ \sum_{x \in X} P_k(x) \omega(x) x^i = 0, \qquad 0 \le i \le k - 1, \]
and hence substituting (2.5) into the definition of $m^{12}(\zeta)$ yields
\[ m^{12}(\zeta) = O(\zeta^{-k-1}) \quad \text{as } \zeta \to \infty. \]
Also, (2.5) gives
\[ m^{22}(\zeta) = c \cdot \sum_{x \in X} P_{k-1}(x) \omega(x) x^{k-1} \cdot \zeta^{-k} + O(\zeta^{-k-1}) \quad \text{as } \zeta \to \infty. \]
Hence, if we set
\[ c = \left( \sum_{x \in X} P_{k-1}(x) \omega(x) x^{k-1} \right)^{-1} = (P_{k-1}, P_{k-1})_\omega^{-1}, \]
then the matrix $m(\zeta)$ satisfies all the conditions of the theorem.

To prove uniqueness, let $m_X$ be an arbitrary solution of the DRHP $(X, w)$ satisfying the asymptotic condition (2.4). Since the functions $m$ and $m_X$ have the same (simple) poles and satisfy the same residue conditions at these poles, it is clear that the function $f(\zeta) = m(\zeta)^{-1} \cdot m_X(\zeta)$ is entire (note that $m(\zeta)$ is invertible by Lemma 2.1). The asymptotic conditions on the matrices $m$ and $m_X$ imply that $f(\zeta) \to I$ as $\zeta \to \infty$. By Liouville's theorem, $f \equiv I$, which means that $m = m_X$.
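The explicit solution of Theorem 2.4 can be checked numerically on a small example. The sketch below is an illustration, not part of the paper: it assembles $m(\zeta)$ for a finite set $X$ from the monic orthogonal polynomials and their discrete Cauchy transforms, so that the identity $\det m(\zeta) \equiv 1$ of Lemma 2.1 can be tested in exact rational arithmetic. The helper names and the sample data ($X = \{0, 1, 2, 3\}$, weights $1, 2, 3, 4$, $k = 2$) are assumptions made for the example.

```python
from fractions import Fraction

def ev(p, x):
    """Evaluate a polynomial with coefficients [c0, c1, ...] at x (Horner scheme)."""
    r = Fraction(0)
    for c in reversed(p):
        r = r * x + c
    return r

def monic_ops(X, w, deg):
    """Monic orthogonal polynomials P_0..P_deg for the weight w on X, plus the inner product."""
    ip = lambda f, g: sum(ev(f, x) * ev(g, x) * wx for x, wx in zip(X, w))
    P = []
    for n in range(deg + 1):
        mono = [Fraction(0)] * n + [Fraction(1)]
        p = list(mono)
        for q in P:
            coef = ip(mono, q) / ip(q, q)
            for i, c in enumerate(q):
                p[i] -= coef * c
        P.append(p)
    return P, ip

def m_matrix(X, w, k, zeta):
    """The matrix m(zeta) from the proof of Theorem 2.4 (requires 1 <= k <= len(X) - 1)."""
    P, ip = monic_ops(X, w, k)
    c = 1 / ip(P[k - 1], P[k - 1])                      # c = (P_{k-1}, P_{k-1})^{-1}
    cauchy = lambda p: sum(ev(p, x) * wx / (zeta - x) for x, wx in zip(X, w))
    return [[ev(P[k], zeta), cauchy(P[k])],
            [c * ev(P[k - 1], zeta), c * cauchy(P[k - 1])]]

def det2(a):
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]

# Illustrative data (an assumption, not from the paper); zeta must avoid the points of X.
X = [0, 1, 2, 3]
w = [1, 2, 3, 4]
m = m_matrix(X, w, 2, Fraction(7, 2))
```

Evaluating `det2(m)` at any rational point away from $X$ should return exactly 1 under these assumptions, in agreement with Lemma 2.1.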
3. Lax Pairs for Solutions of DRHP

3.1. Let $X$ be a discrete locally finite subset of $\mathbb{C}$, let $w : X \to \mathrm{Mat}(2, \mathbb{C})$ be a function, and fix a natural number $k < \mathrm{card}(X)$. If $w$ arises from a nondegenerate weight function $\omega$ on $X$ with finite moments, as in Sect. 2, then from Theorem 2.4 we know that the DRHP $(X, w)$ admits a unique solution $m_X(\zeta)$ with the asymptotics of $\mathrm{diag}(\zeta^k, \zeta^{-k})$ at infinity.

Convention. From now on, we assume that the weight function under consideration is everywhere real and strictly positive, and the orthogonality set is contained in $\mathbb{R}$.

If $Z \subseteq X$ is any subset of cardinality $> k$, it follows from Remark 2.3 and our convention that the restriction of $\omega$ to $Z$ is also nondegenerate, whence by Theorem 2.4, the DRHP $(Z, w|_Z)$ has a unique solution $m_Z(\zeta)$ such that $m_Z(\zeta) \cdot \begin{pmatrix} \zeta^{-k} & 0 \\ 0 & \zeta^{k} \end{pmatrix} \to I$ as $\zeta \to \infty$.

Let us now assume that the set $X$ is parameterized as $X = \{\pi_x\}_{x=0}^{N}$, where $N \in \mathbb{Z}_{\ge 0} \cup \{\infty\}$. For every $s \ge 0$, we consider the subset $Z_s := \{\pi_x\}_{x \le s-1} \subseteq X$. If $s > k$, then $\mathrm{card}(Z_s) > k$, so by the previous paragraph, we have the corresponding solution $m_s(\zeta) := m_{Z_s}(\zeta)$. It can also be shown that even though $\mathrm{card}(Z_k) = k$, the DRHP $(Z_k, w|_{Z_k})$ still has a unique solution $m_k(\zeta)$. Indeed, uniqueness can be proved by the same argument as in Theorem 2.4, and in Proposition 6.1 below we give an explicit formula for $m_k(\zeta)$. Note also that $m_s(\zeta)$ is defined for all $s \ge k$; in particular, if $N$ is finite, we have $m_s(\zeta) = m_X(\zeta)$ for all $s > N$.

Our next basic assumption is:
\[ \text{there exists an affine transformation } \sigma : \mathbb{C} \to \mathbb{C} \text{ such that } \sigma \pi_{x+1} = \pi_x \text{ for all } 0 \le x < N. \tag{3.1} \]
The Lax pair in our setting will consist of two equations, one of which relates $m_{s+1}(\zeta)$ with $m_s(\zeta)$ and the other one relates $m_s(\sigma\zeta)$ with $m_{s+1}(\zeta)$. We will denote the derivative of $\sigma$ (which is a constant) by $\eta$, so that
\[ \sigma(\zeta_1) - \sigma(\zeta_2) = \eta \cdot (\zeta_1 - \zeta_2) \quad \text{for all } \zeta_1, \zeta_2 \in \mathbb{C}. \tag{3.2} \]
The cases of special interest are those when $X$ is the orthogonality set for one of the families of discrete hypergeometric orthogonal polynomials of the Askey scheme. In this situation, $X$ is a subset of either a one-dimensional lattice or a one-dimensional $q$-lattice in $\mathbb{C}$, and $\sigma$ is given by either $\sigma\zeta = \zeta - 1$ or $\sigma\zeta = q^{\pm 1}\zeta$. These cases will be treated in greater detail in Sect. 7; for now we concentrate on the general theory.

3.2. The main result of this section is
Theorem 3.1 (Lax pair). For each $s = k, k + 1, k + 2, \dots$, let $Z_s = \{\pi_x \in X \mid x \le s - 1\}$, and let $m_s(\zeta)$ be the unique solution of the DRHP $(Z_s, w|_{Z_s})$ such that $m_s(\zeta) \cdot \begin{pmatrix} \zeta^{-k} & 0 \\ 0 & \zeta^{k} \end{pmatrix} \to I$ as $\zeta \to \infty$, where $w$ is given by$^4$
\[ w(\pi_x) = \begin{pmatrix} 0 & \omega(x) \\ 0 & 0 \end{pmatrix}. \]
(a) For each $s \in \mathbb{Z}_{\ge k}$, $s \le N$, there exists a constant nilpotent matrix $A_s$ such that
\[ m_{s+1}(\zeta) = \left( I + \frac{A_s}{\zeta - \pi_s} \right) m_s(\zeta). \tag{3.3} \]
(b) Assume that there exist entire functions $d_1(\zeta)$, $d_2(\zeta)$ such that
\[ \frac{\omega(x - 1)}{\omega(x)} = \eta \cdot \frac{d_1(\pi_x)}{d_2(\pi_x)} \quad \text{for all } 1 \le x \le N, \qquad d_1(\pi_0) = 0, \]
and $d_2(\sigma^{-1}\pi_N) = 0$ if $N$ is finite. Let
\[ D(\zeta) = \begin{pmatrix} d_1(\zeta) & 0 \\ 0 & d_2(\zeta) \end{pmatrix}. \]
Then for every $s \in \mathbb{Z}_{\ge k}$, we have
\[ m_s(\sigma\zeta) = M_s(\zeta) m_{s+1}(\zeta) D^{-1}(\zeta), \tag{3.4} \]
where $M_s(\zeta)$ is entire$^5$.
(c) With the assumptions of part (b), suppose that the functions $d_1(\zeta)$ and $d_2(\zeta)$ are polynomials of degree at most $n$ in $\zeta$, and write
\[ d_1(\zeta) = \lambda_1 \zeta^n + (\text{lower terms}), \qquad d_2(\zeta) = \lambda_2 \zeta^n + (\text{lower terms}). \]
Set $\kappa_1 = \eta^{k} \lambda_1$, $\kappa_2 = \eta^{-k} \lambda_2$. Then the matrix $M_s(\zeta)$ is polynomial of degree at most $n$ in $\zeta$, with the coefficient of $\zeta^n$ equal to $\mathrm{diag}(\kappa_1, \kappa_2)$.

Equations (3.3) and (3.4) constitute the Lax pair.

$^4$ To simplify notation, we assume that the weight function $\omega(x)$ is always defined for $x \in \mathbb{Z}_{\ge 0}$, $x \le N$.
$^5$ Again, note that $M_s(\zeta)$ is defined for all $s \in \mathbb{Z}_{\ge k}$, in particular for $s > N$, even if $N$ is finite.

Remark 3.2. As follows from the proof below, the condition $d_2(\sigma^{-1}\pi_N) = 0$ in part (b) of the theorem is only required to assert that $M_s(\zeta)$ is entire for $s > N$. If $d_2(\sigma^{-1}\pi_N) \ne 0$, then (b) and (c) still hold for all $s$, $k \le s \le N$.

Proof. (a) Fix $s \in \mathbb{Z}_{\ge 0}$, $s \le N$, and consider the matrix-valued function $N(\zeta) := m_{s+1}(\zeta) m_s^{-1}(\zeta)$ (recall that $m_s(\zeta)$ is invertible by Lemma 2.1). It is clear that $N(\zeta)$ has only one simple pole, at $\zeta = \pi_s$. Hence the function $N(\zeta) - (\zeta - \pi_s)^{-1} A_s$ is entire, where $A_s = \operatorname{Res}_{\zeta = \pi_s} N(\zeta)$. But $m_s(\zeta)$ and $m_{s+1}(\zeta)$ have the same asymptotics at infinity, so $N(\zeta) - (\zeta - \pi_s)^{-1} A_s \to I$ as $\zeta \to \infty$. By Liouville's theorem, $N(\zeta) - (\zeta - \pi_s)^{-1} A_s \equiv I$, which gives (3.3). Taking determinants of both sides of (3.3) and using Lemma 2.1 gives $\det\bigl( I + (\zeta - \pi_s)^{-1} A_s \bigr) = 1$ for all $\zeta$, which forces the matrix $A_s$ to be nilpotent.

(b) We have $M_s(\zeta) = m_s(\sigma\zeta) D(\zeta) m_{s+1}^{-1}(\zeta)$. Therefore it is clear that $M_s(\zeta)$ is analytic away from $\{\pi_0, \pi_1, \dots, \pi_s\}$. Let $1 \le x \le s$. Then for $\zeta$ near $\pi_x$, we can write
\[ m_s(\sigma\zeta) = H_1(\zeta) \left( I + \frac{w(x - 1)}{\sigma\zeta - \pi_{x-1}} \right) \quad \text{and} \quad m_{s+1}^{-1}(\zeta) = \left( I - \frac{w(x)}{\zeta - \pi_x} \right) H_2(\zeta), \]
where $H_1$ and $H_2$ are analytic and invertible matrix-valued functions defined in a neighborhood of $\pi_x$. So $M_s(\zeta)$ is analytic near $\pi_x$ if and only if so is the product
\[ \left( I + \frac{w(x - 1)}{\sigma\zeta - \pi_{x-1}} \right) D(\zeta) \left( I - \frac{w(x)}{\zeta - \pi_x} \right) = \begin{pmatrix} d_1(\zeta) & \dfrac{\eta^{-1} d_2(\zeta) \omega(x - 1) - d_1(\zeta) \omega(x)}{\zeta - \pi_x} \\ 0 & d_2(\zeta) \end{pmatrix}, \]
i.e., if and only if
\[ \frac{\omega(x - 1)}{\omega(x)} = \eta \cdot \frac{d_1(\pi_x)}{d_2(\pi_x)}. \]
Similarly, we note that $m_s(\sigma\zeta)$ is analytic near $\pi_0$, whereas
\[ m_{s+1}^{-1}(\zeta) = \left( I - \frac{w(0)}{\zeta - \pi_0} \right) H(\zeta), \]
with $H$ analytic and invertible near $\pi_0$, which implies that $M_s(\zeta)$ is analytic near $\pi_0$ if and only if $d_1(\pi_0) = 0$. Finally, if $N$ is finite, we have to make sure that $M_s(\zeta)$ has no pole at $\sigma^{-1}\pi_N$ for $s \ge N + 1$. A necessary and sufficient condition for that is $d_2(\sigma^{-1}\pi_N) = 0$.

(c) Using the asymptotic conditions on the matrices $m_s(\sigma\zeta)$ and $m_{s+1}(\zeta)$, we obtain, as $\zeta \to \infty$,
\begin{align*}
M_s(\zeta) &= m_s(\sigma\zeta) D(\zeta) m_{s+1}^{-1}(\zeta) \\
&= \left( I + O\!\left( \tfrac{1}{\zeta} \right) \right) \begin{pmatrix} \eta^{k} \zeta^{k} & 0 \\ 0 & \eta^{-k} \zeta^{-k} \end{pmatrix} \begin{pmatrix} \lambda_1 \zeta^n + O(\zeta^{n-1}) & 0 \\ 0 & \lambda_2 \zeta^n + O(\zeta^{n-1}) \end{pmatrix} \begin{pmatrix} \zeta^{-k} & 0 \\ 0 & \zeta^{k} \end{pmatrix} \left( I + O\!\left( \tfrac{1}{\zeta} \right) \right) \\
&= \begin{pmatrix} \kappa_1 & 0 \\ 0 & \kappa_2 \end{pmatrix} \cdot \zeta^n + O(\zeta^{n-1}).
\end{align*}
Since $M_s(\zeta)$ is entire by part (b), this completes the proof.
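Parts (b) and (c) can be tested on a small example. The sketch below is an illustration under explicit assumptions, not the paper's computation: it takes the Charlier-type weight $\omega(x) = a^x / x!$ on $\pi_x = x$ with $\sigma\zeta = \zeta - 1$ (so $\eta = 1$), for which $d_1(\zeta) = \zeta$ and $d_2(\zeta) = a$ satisfy the hypotheses of part (b) with $n = 1$, $\kappa_1 = 1$, $\kappa_2 = 0$. It builds $m_s$ and $m_{s+1}$ from the formula of Theorem 2.4 (which requires $s > k$) and evaluates $M_s(\zeta) = m_s(\sigma\zeta) D(\zeta) m_{s+1}^{-1}(\zeta)$. All function names are hypothetical.

```python
from fractions import Fraction
from math import factorial

def ev(p, x):
    """Evaluate a polynomial with coefficients [c0, c1, ...] at x."""
    r = Fraction(0)
    for c in reversed(p):
        r = r * x + c
    return r

def m_thm24(X, w, k, zeta):
    """Solution of Theorem 2.4 on the finite set X (requires 1 <= k <= len(X) - 1)."""
    ip = lambda f, g: sum(ev(f, x) * ev(g, x) * wx for x, wx in zip(X, w))
    P = []
    for n in range(k + 1):                     # Gram-Schmidt for monic P_0..P_k
        mono = [Fraction(0)] * n + [Fraction(1)]
        p = list(mono)
        for q in P:
            coef = ip(mono, q) / ip(q, q)
            for i, c in enumerate(q):
                p[i] -= coef * c
        P.append(p)
    c = 1 / ip(P[k - 1], P[k - 1])
    cauchy = lambda p: sum(ev(p, x) * wx / (zeta - x) for x, wx in zip(X, w))
    return [[ev(P[k], zeta), cauchy(P[k])],
            [c * ev(P[k - 1], zeta), c * cauchy(P[k - 1])]]

def lax_M(s, k, a, zeta):
    """M_s(zeta) for the assumed weight a^x/x!, with d1(zeta) = zeta, d2(zeta) = a."""
    X = list(range(s + 1))
    w = [Fraction(a) ** x / factorial(x) for x in X]
    ms = m_thm24(X[:s], w[:s], k, zeta - 1)    # m_s(sigma zeta), sigma zeta = zeta - 1
    ms1 = m_thm24(X, w, k, zeta)               # m_{s+1}(zeta)
    left = [[ms[0][0] * zeta, ms[0][1] * a],
            [ms[1][0] * zeta, ms[1][1] * a]]   # m_s(sigma zeta) D(zeta)
    inv = [[ms1[1][1], -ms1[0][1]],
           [-ms1[1][0], ms1[0][0]]]            # inverse of m_{s+1}, using det = 1
    return [[sum(left[i][l] * inv[l][j] for l in range(2)) for j in range(2)]
            for i in range(2)]

def slope(Ma, Mb, za, zb):
    """Entrywise difference quotient of two matrix values."""
    return [[(Mb[i][j] - Ma[i][j]) / (zb - za) for j in range(2)] for i in range(2)]

# Sample points must avoid 0..s and 1..s; these choices are illustrative.
za, zb, zc = Fraction(1, 2), Fraction(5, 2), Fraction(7, 3)
Ma, Mb, Mc = (lax_M(2, 1, 2, z) for z in (za, zb, zc))
```

If $M_s(\zeta)$ is a polynomial of degree at most 1 with leading coefficient $\mathrm{diag}(1, 0)$, as part (c) predicts, then every difference quotient returned by `slope` equals that matrix exactly.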
3.3. To conclude this section, we remark that the method by which we have obtained the second equation of the Lax pair can be applied to derive a (second-order) difference equation for the orthogonal polynomials corresponding to the weight function $\omega$. More precisely, one proves, by the same argument as Theorem 3.1, the following

Proposition 3.3. Under the assumptions of Theorem 3.1(b), the function $M(\zeta) = m_X(\sigma\zeta) D(\zeta) m_X^{-1}(\zeta)$ is entire. The analogue of Theorem 3.1(c) also holds for the matrix $M(\zeta)$.

Now we have $m_X(\sigma\zeta) = M(\zeta) m_X(\zeta) D(\zeta)^{-1}$. Considering the (1, 1) and (2, 1) elements of both sides and using the explicit formula for $m_X(\zeta)$ given in Theorem 2.4, we obtain a system of equations of the form
\[ P_k(\sigma\zeta) = \bigl( M^{11}(\zeta) P_k(\zeta) + c M^{12}(\zeta) P_{k-1}(\zeta) \bigr) d_1(\zeta)^{-1}, \tag{3.5} \]
\[ c \cdot P_{k-1}(\sigma\zeta) = \bigl( M^{21}(\zeta) P_k(\zeta) + c M^{22}(\zeta) P_{k-1}(\zeta) \bigr) d_1(\zeta)^{-1}, \tag{3.6} \]
where $c = (P_{k-1}, P_{k-1})_\omega^{-1}$ and the $M^{ij}(\zeta)$ are the elements of the matrix $M(\zeta)$. If we find $c \cdot P_{k-1}(\zeta)$ from the first equation and substitute it into the second one, we will get a relation between $P_k(\zeta)$, $P_k(\sigma\zeta)$ and $P_k(\sigma^2\zeta)$. Even though the functions $M^{ij}(\zeta)$ may involve unknown parameters, one might hope that the equation we obtain in the end will only involve known parameters. We will work out an explicit example in Subsect. 11.3, for the weight function corresponding to the Charlier polynomials. We will see that the relation we get is exactly the standard difference equation for Charlier polynomials.
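As a quick illustration of the hypotheses of Theorem 3.1(b), consider the Charlier weight in its standard normalization $\omega(x) = a^x / x!$ on $\pi_x = x$. This normalization is an assumption made here for the example; the paper's own treatment in Subsect. 11.3 may use different conventions.

```latex
% Assumed data: \pi_x = x, \sigma\zeta = \zeta - 1 (so \eta = 1), \omega(x) = a^x / x!, N = \infty.
\[
  \frac{\omega(x-1)}{\omega(x)}
  = \frac{a^{x-1}}{(x-1)!} \cdot \frac{x!}{a^{x}}
  = \frac{x}{a}
  = \eta \cdot \frac{d_1(\pi_x)}{d_2(\pi_x)}
  \quad\text{with}\quad d_1(\zeta) = \zeta, \qquad d_2(\zeta) = a .
\]
% Then d_1(\pi_0) = 0 holds, and with n = 1 we get \lambda_1 = 1, \lambda_2 = 0,
% so \kappa_1 = 1 and \kappa_2 = 0 in Theorem 3.1(c).
```

Under these assumptions the matrix $M(\zeta)$ of Proposition 3.3 would be a polynomial of degree at most 1 in $\zeta$ with leading coefficient $\mathrm{diag}(1, 0)$.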
4. Recurrence Relation for Fredholm Determinants

4.1. We remain in the general setting of Sect. 3. Thus we consider a discrete locally finite subset $X = \{\pi_x\}_{x=0}^{N} \subseteq \mathbb{R}$, where $N \in \mathbb{Z}_{\ge 0} \cup \{\infty\}$, and we are also given a strictly positive weight function $\omega : \{x \in \mathbb{Z}_{\ge 0} \mid x \le N\} \to \mathbb{C}$ whose moments are finite. For all $s \in \mathbb{Z}_{\ge 0}$, $s \le N$, we let
\[ Z_s = \{\pi_x\}_{x=0}^{s-1}, \qquad Y_s = X \setminus Z_s. \]
Finally, we fix a natural number $k \le N$ and let
\[ m_X(\zeta) = \begin{pmatrix} m_X^{11}(\zeta) & m_X^{12}(\zeta) \\ m_X^{21}(\zeta) & m_X^{22}(\zeta) \end{pmatrix} \]
be the unique solution of the discrete Riemann-Hilbert problem $(X, w)$ with the asymptotics of $\mathrm{diag}(\zeta^k, \zeta^{-k})$ at infinity provided by Theorem 2.4. Let $\alpha, \beta : \{x \in \mathbb{Z}_{\ge 0} \mid x \le N\} \to \mathbb{C}$ be two functions such that $\alpha(x)\beta(x) = \omega(x)$ for all $x$. Consider the following kernel on $X \times X$:
\[ K(\pi_x, \pi_y) = \begin{cases} \alpha(x)\beta(y) \, \dfrac{\phi(\pi_x)\psi(\pi_y) - \psi(\pi_x)\phi(\pi_y)}{\pi_x - \pi_y}, & x \ne y, \\[2ex] \alpha(x)\beta(x) \bigl( \phi'(\pi_x)\psi(\pi_x) - \psi'(\pi_x)\phi(\pi_x) \bigr), & x = y, \end{cases} \tag{4.1} \]
where $\phi(\zeta) = m_X^{11}(\zeta) = P_k(\zeta)$ and $\psi(\zeta) = m_X^{21}(\zeta) = (P_{k-1}, P_{k-1})_\omega^{-1} \cdot P_{k-1}(\zeta)$. Up to the factor of $\alpha(x)\beta(y)$, this is precisely the Christoffel-Darboux kernel for the family of orthogonal polynomials corresponding to the weight function $\omega$. We will not need the specific form of the functions $\alpha$ and $\beta$ in our computations. Note also that changing $\alpha$ and $\beta$ while keeping their product fixed results in conjugation of the kernel $K$ and hence has no effect on the Fredholm determinants studied below.
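The identification of (4.1) with the Christoffel-Darboux kernel can be checked on a small example. The following sketch (an illustration with hypothetical helper names and arbitrary sample data, not taken from the paper) compares the off-diagonal expression $(\phi(x)\psi(y) - \psi(x)\phi(y))/(x - y)$ with the sum $\sum_{m=0}^{k-1} P_m(x) P_m(y) / (P_m, P_m)_\omega$, omitting the factor $\alpha(x)\beta(y)$. It also checks that the weighted trace of the sum equals $k$.

```python
from fractions import Fraction

def ev(p, x):
    """Evaluate a polynomial with coefficients [c0, c1, ...] at x."""
    r = Fraction(0)
    for c in reversed(p):
        r = r * x + c
    return r

def monic_ops(X, w, deg):
    """Monic orthogonal polynomials P_0..P_deg and their squared norms h_m = (P_m, P_m)."""
    ip = lambda f, g: sum(ev(f, x) * ev(g, x) * wx for x, wx in zip(X, w))
    P = []
    for n in range(deg + 1):
        mono = [Fraction(0)] * n + [Fraction(1)]
        p = list(mono)
        for q in P:
            coef = ip(mono, q) / ip(q, q)
            for i, c in enumerate(q):
                p[i] -= coef * c
        P.append(p)
    return P, [ip(p, p) for p in P]

# Illustrative data (an assumption, not from the paper).
X = [0, 1, 2, 4, 7]
w = [1, 3, 2, 5, 1]
k = 2
P, h = monic_ops(X, w, k)

def cd_sum(x, y):
    """Christoffel-Darboux sum: sum_{m<k} P_m(x) P_m(y) / h_m."""
    return sum(ev(P[m], x) * ev(P[m], y) / h[m] for m in range(k))

def cd_closed(x, y):
    """Off-diagonal part of (4.1) without the factor alpha(x) beta(y); requires x != y."""
    phi = lambda t: ev(P[k], t)
    psi = lambda t: ev(P[k - 1], t) / h[k - 1]
    return (phi(x) * psi(y) - psi(x) * phi(y)) / (x - y)

# Weighted trace of the kernel: sum_x omega(x) * cd_sum(x, x).
trace = sum(wx * cd_sum(x, x) for x, wx in zip(X, w))
```

The trace value reflects that $K$ has rank $k$; the choice $\alpha = \omega$, $\beta = 1$ implicit in the trace computation is one admissible factorization among many.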
4.2. For each $s \in \mathbb{Z}_{\ge k}$, $s \le N$, we let $K_s$ be the restriction of $K$ to $Y_s \times Y_s$. We also denote by $K$ and $K_s$ the operators on $l^2(X)$ and $l^2(Y_s)$ defined by the kernels $K$ and $K_s$, respectively. The main goal of our paper is to derive a recurrence relation for the Fredholm determinants
\[ D_s = \det(1 - K_s), \qquad s \in \mathbb{Z}_{\ge k}, \ s \le N. \]
The resulting equation will be in terms of the elements of the matrices $A_s$ and $M_s(\zeta)$ from the Lax pair (3.3), (3.4). Note that the Fredholm determinants are always well defined because $K$ is a finite rank operator, as follows from the Christoffel-Darboux formula (see, e.g., [18]):
\[ \frac{K(\pi_x, \pi_y)}{\alpha(x)\beta(y)} = \sum_{m=0}^{k-1} \frac{P_m(\pi_x) P_m(\pi_y)}{(P_m, P_m)_\omega}. \]
For a probabilistic interpretation of these determinants, see Subsect. 4.5 below.

Lemma 4.1. For each $s \in \mathbb{Z}_{\ge k}$, $s \le N$, the operator $1 - K_s$ is invertible, and $D_s = \det(1 - K_s) \ne 0$.
Proof. The operator $K$ is the orthogonal projection onto the subspace of $l^2(X)$ spanned by the polynomials $P_0, \dots, P_{k-1}$. In particular, it has finite rank, its only eigenvalues are 0 and 1, and the eigenvectors corresponding to the eigenvalue 1 are linear combinations of $P_0, \dots, P_{k-1}$, i.e., the polynomials of degree $\le k - 1$. It follows that the operator $K_s$ also has finite rank, hence $1 - K_s$ is not invertible if and only if 1 is an eigenvalue of $K_s$. Suppose that $k \le s \le N$ and there exists $f \in l^2(Y_s)$ such that $K_s f = f$. We extend $f$ by zero to $X$, and we denote the extension by $\tilde f \in l^2(X)$. Now $(K\tilde f)|_{Y_s} = K_s f = f = \tilde f|_{Y_s}$ and $\tilde f|_{Z_s} = 0$. This implies that $(K\tilde f)|_{Z_s} = 0$ because otherwise we would have $\|K\tilde f\|_\omega > \|\tilde f\|_\omega$, which is impossible because $K$ is a projection operator. This shows that $K\tilde f = \tilde f$, whence by the remarks above, $\tilde f$ is a polynomial of degree $\le k - 1$. But it vanishes on the set $Z_s$ of cardinality $\ge k$, which implies that $\tilde f$ is identically equal to zero. Hence, $1 - K_s$ is an invertible operator, and $\det(1 - K_s) \ne 0$.

4.3. For each $s \in \mathbb{Z}_{\ge k}$, $s \le N$, we have the unique solution $m_s(\zeta)$ of the DRHP $(Z_s, w|_{Z_s})$ having the same asymptotics at infinity as $m_X(\zeta)$. As in Sect. 3, we assume that there exist entire functions $d_1(\zeta)$, $d_2(\zeta)$ such that
\[ d_1(\pi_0) = 0 \quad \text{and} \quad \frac{\omega(x - 1)}{\omega(x)} = \eta \cdot \frac{d_1(\pi_x)}{d_2(\pi_x)} \quad \text{for all } 1 \le x \le N. \tag{4.2} \]
The assumption that $d_2(\sigma^{-1}\pi_N) = 0$ if $N$ is finite is not essential for us now since it was only used in Sect. 3 to show that the function $M_s(\zeta)$ is entire for $s > N$, whereas here we are only interested in the case $k \le s \le N$ (cf. Remark 3.2). Note that $d_1(\pi_x) d_2(\pi_x) \ne 0$ for all $1 \le x \le N$. We let
\[ D(\zeta) = \begin{pmatrix} d_1(\zeta) & 0 \\ 0 & d_2(\zeta) \end{pmatrix} \quad \text{and} \quad M_s(\zeta) = m_s(\sigma\zeta) D(\zeta) m_{s+1}^{-1}(\zeta). \]
By part (b) of Theorem 3.1, the function $M_s(\zeta)$ is entire for all $k \le s \le N$. Using part (a) of this theorem, we obtain the following general form of the Lax pair for the solutions $m_s(\zeta)$:
\[ m_{s+1}(\zeta) = \left( I + \frac{A_s}{\zeta - \pi_s} \right) m_s(\zeta), \qquad A_s = \begin{pmatrix} p_s & q_s \\ r_s & -p_s \end{pmatrix}, \qquad p_s^2 = -q_s r_s; \tag{4.3} \]
\[ m_s(\sigma\zeta) = M_s(\zeta) m_{s+1}(\zeta) D(\zeta)^{-1}. \tag{4.4} \]
In Sect. 5, we will explain how one can obtain a recurrence relation for the matrix elements of $M_s(\zeta)$ and $A_s$. In the present section, we assume that the function $M_s(\zeta)$ is known, and derive a recurrence relation for the Fredholm determinants using this function and the parameters $p_s$, $q_s$. For all $s$, $k \le s \le N$, we put
\[ m_s(\pi_s) = \begin{pmatrix} m_s^{11} & m_s^{12} \\ m_s^{21} & m_s^{22} \end{pmatrix}, \qquad M_s(\pi_{s+1}) = \begin{pmatrix} \mu_s^{11} & \mu_s^{12} \\ \mu_s^{21} & \mu_s^{22} \end{pmatrix}, \qquad M_s'(\pi_{s+1}) = \begin{pmatrix} \nu_s^{11} & \nu_s^{12} \\ \nu_s^{21} & \nu_s^{22} \end{pmatrix}, \tag{4.5} \]
where $M_s'(\zeta) = \frac{d}{d\zeta} M_s(\zeta)$. Then we have the following recurrence relations for the matrix elements $m_s^{11}$ and for the Fredholm determinants $D_s$ in terms of the parameters $p_s$, $q_s$, $\mu_s^{ij}$, $\nu_s^{ij}$.
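Theorem 4.2 (Recurrence relation for Fredholm determinants). Assume that $p_s \ne 0$ for all $s \ge k$, and hence also $q_s, r_s \ne 0$ for $s \ge k$.
(a) For each $s \in \mathbb{Z}_{\ge k}$, $s \le N$, we have
\[ m_{s+1}^{11} = d_2(\pi_{s+1})^{-1} \left( \mu_s^{22} + \frac{p_s}{q_s} \mu_s^{12} \right) m_s^{11}. \tag{4.6} \]
(b) For each $s \in \mathbb{Z}_{\ge k}$, $s \le N - 2$, we have
\[ \frac{D_{s+2}}{D_{s+1}} - \frac{D_{s+1}}{D_s} = \frac{\omega(s) \cdot u_s \cdot (m_s^{11})^2}{\eta \cdot d_1(\pi_{s+1}) \cdot d_2(\pi_{s+1})}, \tag{4.7} \]
where
\[ u_s = \bigl( \nu_s^{21}\mu_s^{22} - \nu_s^{22}\mu_s^{21} \bigr) + \frac{p_s}{q_s} \cdot \bigl( \nu_s^{21}\mu_s^{12} - \nu_s^{12}\mu_s^{21} + \nu_s^{11}\mu_s^{22} - \nu_s^{22}\mu_s^{11} \bigr) + \frac{p_s^2}{q_s^2} \cdot \bigl( \nu_s^{11}\mu_s^{12} - \nu_s^{12}\mu_s^{11} \bigr) \]
for all $s$.

Remark 4.3. We will show later (see Proposition 5.5 and Remark 6.3) that under the assumptions of Subsect. 1.2, $p_s$ is nonzero for all $s \ge k$.

Proof. (a) First we take the residues of both sides of (4.3) at $\zeta = \pi_s$ and use the jump condition on the LHS:
\[ \lim_{\zeta \to \pi_s} m_{s+1}(\zeta) w(\pi_s) = A_s m_s(\pi_s). \]
Since the first column of the matrix $w(\pi_s)$ is zero, we deduce that the first column of the matrix $A_s m_s(\pi_s)$ is zero, whence $m_s^{21} = -(p_s/q_s) m_s^{11}$. Next we substitute $\zeta = \pi_{s+1}$ in (4.4) and rewrite it as follows:
\[ m_{s+1}(\pi_{s+1}) = M_s^{-1}(\pi_{s+1}) m_s(\pi_s) D(\pi_{s+1}). \]
Since $\det M_s(\zeta) = \det D(\zeta)$ by Lemma 2.1, the last equation can be written explicitly as
\[ \begin{pmatrix} m_{s+1}^{11} & m_{s+1}^{12} \\ m_{s+1}^{21} & m_{s+1}^{22} \end{pmatrix} = \frac{1}{d_1(\pi_{s+1}) d_2(\pi_{s+1})} \begin{pmatrix} \mu_s^{22} & -\mu_s^{12} \\ -\mu_s^{21} & \mu_s^{11} \end{pmatrix} \begin{pmatrix} m_s^{11} & m_s^{12} \\ m_s^{21} & m_s^{22} \end{pmatrix} \begin{pmatrix} d_1(\pi_{s+1}) & 0 \\ 0 & d_2(\pi_{s+1}) \end{pmatrix}. \]
Equating the (1, 1) and (2, 1) elements of both sides yields
\[ m_{s+1}^{11} = d_2(\pi_{s+1})^{-1} \bigl( \mu_s^{22} m_s^{11} - \mu_s^{12} m_s^{21} \bigr) = d_2(\pi_{s+1})^{-1} \left( \mu_s^{22} + \frac{p_s}{q_s} \mu_s^{12} \right) m_s^{11} \]
and
\[ m_{s+1}^{21} = d_2(\pi_{s+1})^{-1} \bigl( -\mu_s^{21} m_s^{11} + \mu_s^{11} m_s^{21} \bigr) = -d_2(\pi_{s+1})^{-1} \left( \mu_s^{21} + \frac{p_s}{q_s} \mu_s^{11} \right) m_s^{11}. \]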
(b) For all $s \in \mathbb{Z}_{\ge k}$, $s \le N$, we put $R_s = K_s(1 - K_s)^{-1}$ (note that this is well defined by Lemma 4.1), so that $R_s + 1 = (1 - K_s)^{-1}$, and hence $(R_s + 1)(s, s) = \det(1 - K_{s+1}) / \det(1 - K_s)$, i.e.,
\[ R_s(s, s) = \frac{D_{s+1}}{D_s} - 1. \tag{4.8} \]
Now Theorem 2.3(ii) in [4] gives (Situation 2.2 ibid. explains why this theorem is applicable here)
\[ R_s(s, s) = g^t(s) \, m_s^{-1}(\pi_s) \, m_s'(\pi_s) \, f(s), \]
where $f(s) = (\alpha(s), 0)^t$ and $g(s) = (0, -\beta(s))^t$. For the purpose of our calculation, we rewrite this formula as follows:
\[ R_s(s, s) = -\omega(s) \cdot e_2^t \, m_s^{-1}(\pi_s) \, m_s'(\pi_s) \, e_1, \tag{4.9} \]
where $e_1 = (1, 0)^t$ and $e_2 = (0, 1)^t$. Similarly, we get
\[ R_{s+1}(s + 1, s + 1) = -\omega(s + 1) \cdot e_2^t \, m_{s+1}^{-1}(\pi_{s+1}) \, m_{s+1}'(\pi_{s+1}) \, e_1. \tag{4.10} \]
We substitute $\zeta = \pi_{s+1}$ into (4.4) again:
\[ m_s(\pi_s) = M_s(\pi_{s+1}) m_{s+1}(\pi_{s+1}) D(\pi_{s+1})^{-1}, \tag{4.11} \]
and we differentiate (4.4) at $\zeta = \pi_{s+1}$ to obtain
\[ m_s'(\pi_s) = \eta^{-1} \left. \frac{d}{d\zeta} \Bigl( M_s(\zeta) m_{s+1}(\zeta) D(\zeta)^{-1} \Bigr) \right|_{\zeta = \pi_{s+1}}. \tag{4.12} \]
If the derivative in (4.12) falls onto the third factor, the contribution of the corresponding term to the RHS of (4.9) is (by (4.11))
\[ -\eta^{-1}\omega(s) \cdot e_2^t \, D(\pi_{s+1}) \left. \frac{d}{d\zeta} D(\zeta)^{-1} \right|_{\zeta = \pi_{s+1}} e_1 = \mathrm{const} \cdot (0, 1) \begin{pmatrix} * & 0 \\ 0 & * \end{pmatrix} \begin{pmatrix} 1 \\ 0 \end{pmatrix} = 0. \]
If the derivative in (4.12) falls onto the second factor, the contribution of the corresponding term to the RHS of (4.9) is, by (4.11), (4.2) and (4.10),
\begin{align*}
-\eta^{-1}\omega(s) \cdot e_2^t \, D(\pi_{s+1}) m_{s+1}^{-1}(\pi_{s+1}) m_{s+1}'(\pi_{s+1}) D(\pi_{s+1})^{-1} e_1
&= -\eta^{-1}\omega(s) \cdot \frac{d_2(\pi_{s+1})}{d_1(\pi_{s+1})} \, e_2^t \, m_{s+1}^{-1}(\pi_{s+1}) m_{s+1}'(\pi_{s+1}) e_1 \\
&= -\omega(s + 1) \cdot e_2^t \, m_{s+1}^{-1}(\pi_{s+1}) m_{s+1}'(\pi_{s+1}) e_1 = R_{s+1}(s + 1, s + 1).
\end{align*}
Therefore, if we subtract $R_{s+1}(s + 1, s + 1)$ from both sides of (4.9), we get the following equation:
\[ R_{s+1}(s + 1, s + 1) - R_s(s, s) = \eta^{-1}\omega(s) \cdot e_2^t \, m_s^{-1}(\pi_s) M_s'(\pi_{s+1}) m_{s+1}(\pi_{s+1}) D(\pi_{s+1})^{-1} e_1. \tag{4.13} \]
Since $\det m_s \equiv 1$ for all $s$ by Lemma 2.1, the RHS of (4.13) equals
\begin{align*}
&\eta^{-1}\omega(s) \cdot (0, 1) \begin{pmatrix} m_s^{22} & -m_s^{12} \\ -m_s^{21} & m_s^{11} \end{pmatrix} \begin{pmatrix} \nu_s^{11} & \nu_s^{12} \\ \nu_s^{21} & \nu_s^{22} \end{pmatrix} \begin{pmatrix} m_{s+1}^{11} & m_{s+1}^{12} \\ m_{s+1}^{21} & m_{s+1}^{22} \end{pmatrix} \begin{pmatrix} d_1(\pi_{s+1})^{-1} & 0 \\ 0 & d_2(\pi_{s+1})^{-1} \end{pmatrix} \begin{pmatrix} 1 \\ 0 \end{pmatrix} \\
&\quad = \eta^{-1}\omega(s) d_1(\pi_{s+1})^{-1} \bigl( \nu_s^{21} m_s^{11} m_{s+1}^{11} + \nu_s^{22} m_s^{11} m_{s+1}^{21} - \nu_s^{11} m_s^{21} m_{s+1}^{11} - \nu_s^{12} m_s^{21} m_{s+1}^{21} \bigr).
\end{align*}
Substituting the formulas for $m_s^{21}$, $m_{s+1}^{11}$, $m_{s+1}^{21}$ derived in part (a) into the last expression and using (4.8), we find that (4.13) is equivalent to (4.7).
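The final substitution can be spelled out as follows. This is a sketch of the algebra using only the formulas of part (a); the shorthand $t_s := p_s / q_s$ and the abbreviation $d_2 := d_2(\pi_{s+1})$ are introduced here for brevity and do not appear in the original.

```latex
% Shorthand (introduced for this sketch): t_s := p_s / q_s, d_2 := d_2(\pi_{s+1}).
% From part (a):
%   m_s^{21}     = -t_s m_s^{11},
%   m_{s+1}^{11} =  d_2^{-1} (\mu_s^{22} + t_s \mu_s^{12}) m_s^{11},
%   m_{s+1}^{21} = -d_2^{-1} (\mu_s^{21} + t_s \mu_s^{11}) m_s^{11}.
\begin{align*}
&\nu_s^{21} m_s^{11} m_{s+1}^{11} + \nu_s^{22} m_s^{11} m_{s+1}^{21}
  - \nu_s^{11} m_s^{21} m_{s+1}^{11} - \nu_s^{12} m_s^{21} m_{s+1}^{21} \\
&\quad = m_s^{11} \Bigl[ (\nu_s^{21} + t_s \nu_s^{11})\, m_{s+1}^{11}
  + (\nu_s^{22} + t_s \nu_s^{12})\, m_{s+1}^{21} \Bigr] \\
&\quad = \frac{(m_s^{11})^2}{d_2} \Bigl[ (\nu_s^{21} + t_s \nu_s^{11})(\mu_s^{22} + t_s \mu_s^{12})
  - (\nu_s^{22} + t_s \nu_s^{12})(\mu_s^{21} + t_s \mu_s^{11}) \Bigr]
  = \frac{(m_s^{11})^2}{d_2}\, u_s .
\end{align*}
```

Multiplying by the prefactor $\eta^{-1}\omega(s) d_1(\pi_{s+1})^{-1}$ and rewriting the LHS of (4.13) through (4.8) then gives (4.7).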
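Corollary 4.4. With the notation of Theorem 4.2, we have
\[ u_s \left( \frac{D_{s+3}}{D_{s+2}} - \frac{D_{s+2}}{D_{s+1}} \right) = \frac{\bigl( \mu_s^{22} + (p_s/q_s) \cdot \mu_s^{12} \bigr)^2}{\eta \cdot d_1(\pi_{s+2}) \cdot d_2(\pi_{s+2})} \cdot u_{s+1} \left( \frac{D_{s+2}}{D_{s+1}} - \frac{D_{s+1}}{D_s} \right) \tag{4.14} \]
for all $s \in \mathbb{Z}_{\ge k}$, $s \le N - 3$.

Proof. Immediate from (4.7) and (4.6).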
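For completeness, here is a sketch of how (4.14) follows, assuming only (4.2), (4.6) and (4.7). The abbreviation $\Delta_s := D_{s+2}/D_{s+1} - D_{s+1}/D_s$ is introduced here and is not part of the original notation.

```latex
% Abbreviation (introduced for this sketch): \Delta_s := D_{s+2}/D_{s+1} - D_{s+1}/D_s.
\begin{align*}
\Delta_{s+1}
 &= \frac{\omega(s+1)\, u_{s+1}\, (m_{s+1}^{11})^2}{\eta\, d_1(\pi_{s+2})\, d_2(\pi_{s+2})}
 && \text{by (4.7) with } s \mapsto s+1, \\
(m_{s+1}^{11})^2
 &= \frac{\bigl(\mu_s^{22} + (p_s/q_s)\,\mu_s^{12}\bigr)^2}{d_2(\pi_{s+1})^2}\, (m_s^{11})^2
 && \text{by (4.6)}, \\
(m_s^{11})^2
 &= \frac{\eta\, d_1(\pi_{s+1})\, d_2(\pi_{s+1})}{\omega(s)\, u_s}\, \Delta_s
 && \text{by (4.7)}, \\
\frac{\omega(s+1)}{\omega(s)} \cdot \frac{d_1(\pi_{s+1})}{d_2(\pi_{s+1})}
 &= \frac{1}{\eta}
 && \text{by (4.2) with } x = s+1 .
\end{align*}
% Combining the four lines:
\[
 u_s\, \Delta_{s+1}
 = \frac{\bigl(\mu_s^{22} + (p_s/q_s)\,\mu_s^{12}\bigr)^2}{\eta\, d_1(\pi_{s+2})\, d_2(\pi_{s+2})}\; u_{s+1}\, \Delta_s .
\]
```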
4.4. It turns out that it is possible to extend the results of Theorem 4.2 to some cases where the orthogonality set $X$ is discrete but not locally finite, i.e., has an accumulation point. The motivation for such an extension is the following. Even though the DRHP $(X, w)$, as it is stated in Sect. 2, is not well posed if $X$ is not locally finite (for then we need to impose additional conditions near the accumulation point of the poles), the definition (4.1) of the kernel $K$ still makes sense: the polynomials $\phi(\zeta) = P_k(\zeta)$ and $\psi(\zeta) = (P_{k-1}, P_{k-1})_\omega^{-1} \cdot P_{k-1}(\zeta)$ are well-defined. In fact, there exist several classical families of basic hypergeometric orthogonal polynomials for which the orthogonality set is discrete but not locally finite (see e.g. [13], Chapter 3). On the other hand, the solutions $m_s(\zeta)$ of "restricted" DRHPs, and hence all the quantities derived from them, are also defined for a non-locally finite orthogonality set $X$, because their definitions involve only finite subsets of $X$. In particular, we can still consider the corresponding Lax pair as in Subsect. 4.3 and the scalar sequences $p_s$, $q_s$, $m_s^{ij}$, $\mu_s^{ij}$, $\nu_s^{ij}$ defined by (4.3) and (4.5). It is therefore natural to ask whether the recurrence relation (4.7) remains valid in the case where $X$ is not locally finite. We will see in Subsect. 4.6 that it does.

4.5. Let us recall the following probability-theoretic interpretation of the Fredholm determinants $D_s$, see e.g. [19]. In general, if $X$ is a discrete, not necessarily locally finite subset of $\mathbb{R}$ of cardinality $N + 1$ ($N \in \mathbb{Z}_{\ge 0} \cup \{\infty\}$), $\omega : X \to \mathbb{R}$ is a strictly positive weight function whose moments are finite, $\{P_n(\zeta)\}_{n=0}^{N}$ is the corresponding family of orthogonal polynomials and $k$ is a natural number, $k \le N$, we can consider a probability distribution $\mathbf{P}$ on the set of all subsets of $X$ of cardinality $k$, defined by
\[ \mathbf{P}\{x_1, \dots, x_k\} = \frac{1}{Z} \cdot \prod_{1 \le i < j \le k} (x_i - x_j)^2 \cdot \prod_{i=1}^{k} \omega(x_i). \tag{4.15} \]
Here $Z$ is the unique constant for which the measure of the set of all subsets of $X$ of cardinality $k$ is equal to 1:
\[ Z = \sum_{\{x_1, \dots, x_k\} \subseteq X} \ \prod_{1 \le i < j \le k} (x_i - x_j)^2 \cdot \prod_{i=1}^{k} \omega(x_i). \tag{4.16} \]
Now if $K$ denotes the kernel (4.1) defined in Subsect. 4.1, then for any subset $Y \subseteq X$, we have
\[ \det\bigl( 1 - K|_{Y \times Y} \bigr) = \sum_{x_i \notin Y} \mathbf{P}\{x_1, \dots, x_k\} \tag{4.17} \]
(the sum on the RHS is taken over all subsets $\{x_1, \dots, x_k\} \subseteq X$ of cardinality $k$ which are disjoint from $Y$).

4.6. We now assume that $X = \{\pi_x\}_{x=0}^{\infty} \subset \mathbb{R}$ is discrete but not necessarily locally finite. We consider, as before, the subsets $Y_s = \{\pi_x\}_{x=s}^{\infty}$ of $X$, and we are interested in the sequence $\{D_s\}_{s=k}^{\infty}$ defined by
\[ D_s = \det(1 - K_s), \qquad K_s = K|_{Y_s \times Y_s}, \]
where $K$ is the kernel (4.1). The proof of Lemma 4.1 is still valid and gives $D_s \ne 0$ for all $s \in \mathbb{Z}_{\ge k}$. Thus we see from the discussion of Subsect. 4.4 that both sides of (4.7) are at least well defined in this situation.

Proposition 4.5. All formulas of Theorem 4.2 and Corollary 4.4 remain valid in the present situation.

Proof. It is obvious from the proof of Theorem 4.2 that (4.6) remains valid. Now let us fix $s \in \mathbb{Z}_{\ge k}$ and prove that formula (4.7) also holds in the present situation (then (4.14) follows automatically). Let $L \in \mathbb{Z}$, $L \ge s + 2$. We write $X(L) = \{\pi_x\}_{x=0}^{L} \subset X$. Since $X(L)$ is finite, all of the discussion of Subsects. 4.1–4.3 is valid for $X(L)$ in place of $X$. Since $L \ge s + 2$, it is clear that replacing $X$ by $X(L)$ (and keeping the same weight function) has no effect on $m_s(\zeta)$, $p_s$, $q_s$, $m_s^{ij}$, $\mu_s^{ij}$, $\nu_s^{ij}$. But of course, the quantities $D_s$ do change. Let $D_s^{(L)}$ denote the Fredholm determinants defined as in Subsect. 4.2 for $X(L)$ in place of $X$. Then Theorem 4.2(b) gives
\[ \frac{D_{s+2}^{(L)}}{D_{s+1}^{(L)}} - \frac{D_{s+1}^{(L)}}{D_s^{(L)}} = \frac{\omega(s) \cdot u_s \cdot (m_s^{11})^2}{\eta \cdot d_1(\pi_{s+1}) \cdot d_2(\pi_{s+1})}. \]
It remains to observe that $D_{s+1}^{(L)} / D_s^{(L)} = D_{s+1} / D_s$ for all $s$. Indeed, the discussion of Subsect. 4.5 gives the formula
\[ D_s = \sum_{0 \le i_1 < \dots < i_k \le s-1} \mathbf{P}\{\pi_{i_1}, \dots, \pi_{i_k}\}, \tag{4.18} \]
where $\mathbf{P}$ is given by (4.15). If $\mathbf{P}^{(L)}$ denotes the probability distribution on the set of all subsets of $X(L)$ of cardinality $k$ defined similarly to $\mathbf{P}$, then we also have
\[ D_s^{(L)} = \sum_{0 \le i_1 < \dots < i_k \le s-1} \mathbf{P}^{(L)}\{\pi_{i_1}, \dots, \pi_{i_k}\}. \tag{4.19} \]
We note that the summations in (4.18) and (4.19) are over the same index set, and the only difference between the definitions of $\mathbf{P}$ and $\mathbf{P}^{(L)}$ is in the normalization constant $Z$. Of course, when we take the ratios $D_{s+1}^{(L)} / D_s^{(L)}$ and $D_{s+1} / D_s$, the normalization constants cancel each other, completing the proof.

5. Compatibility Conditions for Lax Pairs

5.1. In this section we study the compatibility conditions for the Lax pairs of the form considered in Sects. 3, 4:
\[ m_{s+1}(\zeta) = \left( I + \frac{A_s}{\zeta - \pi_s} \right) m_s(\zeta), \tag{5.1} \]
\[ m_s(\sigma\zeta) = M_s(\zeta) m_{s+1}(\zeta) D(\zeta)^{-1}. \tag{5.2} \]
The general notation and conventions are those of Sects. 3, 4. As in the second part of Sect. 4, we do not assume that the orthogonality set $X$ is locally finite: we have already remarked in Subsect. 4.4 that all of our arguments related to Lax pairs only involve finite subsets of $X$.

Lemma 5.1. Fix $s \in \mathbb{Z}_{\ge 0}$, $s \le N$, let $A \in \mathrm{Mat}(2, \mathbb{C})$, and define $m(\zeta) = \bigl( I + (\zeta - \pi_s)^{-1} A \bigr) m_s(\zeta)$. Then $m(\zeta) = m_{s+1}(\zeta)$ if and only if the matrix $A$ satisfies the following two conditions:
\[ A \cdot m_s(\pi_s) w(\pi_s) = 0, \tag{5.3} \]
\[ m_s(\pi_s) w(\pi_s) + A \cdot m_s'(\pi_s) w(\pi_s) = A \cdot m_s(\pi_s). \tag{5.4} \]
In particular, for a fixed $s$, there is a unique matrix $A$ satisfying (5.3) and (5.4), namely, $A = A_s$.

Proof. By the uniqueness of $m_{s+1}(\zeta)$, we only have to verify that $m(\zeta)$ satisfies the same residue conditions as $m_{s+1}(\zeta)$ if and only if (5.3) and (5.4) hold (note that the asymptotics of $m(\zeta)$ and $m_{s+1}(\zeta)$ as $\zeta \to \infty$ are clearly the same). Now if $0 \le x \le s - 1$, then since $I + (\zeta - \pi_s)^{-1} A$ is analytic near $\pi_x$, it is clear that the residue condition at $\pi_x$ for $m_s(\zeta)$ implies one for $m(\zeta)$. Thus, we only need to consider the residue condition at the pole $\zeta = \pi_s$. Since $m_s(\zeta)$ is analytic near $\pi_s$, we have $\operatorname{Res}_{\zeta = \pi_s} m(\zeta) = A \cdot m_s(\pi_s)$. On the other hand, the limit
\[ \lim_{\zeta \to \pi_s} m(\zeta) w(\pi_s) = \lim_{\zeta \to \pi_s} \bigl( I + (\zeta - \pi_s)^{-1} A \bigr) m_s(\zeta) w(\pi_s) \]
exists if and only if (5.3) holds. Moreover, if (5.3) holds, this limit equals $m_s(\pi_s) w(\pi_s) + A \cdot m_s'(\pi_s) w(\pi_s)$, so $m(\zeta)$ satisfies the required residue condition at $\zeta = \pi_s$ if and only if (5.4) holds.
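The limit in the proof can be read off from a first-order Taylor expansion of $m_s(\zeta)$ at $\pi_s$, where $m_s$ is analytic. The following is a sketch of that step, with $m_s'$ denoting the derivative in $\zeta$:

```latex
\begin{align*}
\Bigl( I + \frac{A}{\zeta - \pi_s} \Bigr) m_s(\zeta)\, w(\pi_s)
 &= \Bigl( I + \frac{A}{\zeta - \pi_s} \Bigr)
    \bigl( m_s(\pi_s) + (\zeta - \pi_s)\, m_s'(\pi_s) + O((\zeta - \pi_s)^2) \bigr)\, w(\pi_s) \\
 &= \frac{A\, m_s(\pi_s)\, w(\pi_s)}{\zeta - \pi_s}
    + m_s(\pi_s)\, w(\pi_s) + A\, m_s'(\pi_s)\, w(\pi_s) + O(\zeta - \pi_s).
\end{align*}
```

The singular term vanishes exactly when (5.3) holds, and the constant term is then the value of the limit.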
Theorem 5.2 (Compatibility conditions for Lax pairs). Fix $s \in \mathbb{Z}_{\ge k}$, $s \le N - 1$.
(a) We have
\[ \left( I + \frac{A_s}{\sigma\zeta - \pi_s} \right) M_s(\zeta) = M_{s+1}(\zeta) \left( I + \frac{A_{s+1}}{\zeta - \pi_{s+1}} \right). \tag{5.5} \]
(b) Conversely, assume that $M : \mathbb{C} \to \mathrm{Mat}(2, \mathbb{C})$ is an analytic function, $A \in \mathrm{Mat}(2, \mathbb{C})$ is a nilpotent matrix, and
\[ \left( I + \frac{A_s}{\sigma\zeta - \pi_s} \right) M_s(\zeta) = M(\zeta) \left( I + \frac{A}{\zeta - \pi_{s+1}} \right), \tag{5.6} \]
where $M_s(\zeta)$ and $A_s$ are defined by (5.1), (5.2). Then $M(\zeta) = M_{s+1}(\zeta)$ and $A = A_{s+1}$.

Equation (5.5) is the compatibility condition for the Lax pair (5.1), (5.2).

Remark 5.3. This theorem provides a recipe for computing $A_{s+1}$ and $M_{s+1}(\zeta)$ if we know $A_s$ and $M_s(\zeta)$. Indeed, one simply needs to find the unique solution of the compatibility condition which satisfies $A_{s+1}^2 = 0$.

Proof. (a) If we replace $\zeta$ by $\sigma\zeta$ in (5.1) and then substitute (5.2) into the result, we obtain
\[ m_{s+1}(\sigma\zeta) = \left( I + \frac{A_s}{\sigma\zeta - \pi_s} \right) M_s(\zeta) m_{s+1}(\zeta) D(\zeta)^{-1}. \tag{5.7} \]
On the other hand, if we substitute (5.1) into (5.2) and then replace $s$ by $s + 1$, we get
\[ m_{s+1}(\sigma\zeta) = M_{s+1}(\zeta) \left( I + \frac{A_{s+1}}{\zeta - \pi_{s+1}} \right) m_{s+1}(\zeta) D(\zeta)^{-1}. \tag{5.8} \]
Comparing (5.7) and (5.8) yields (5.5).

(b) From (5.5) and (5.6), we find that
\[ M(\zeta) \left( I + \frac{A}{\zeta - \pi_{s+1}} \right) = M_{s+1}(\zeta) \left( I + \frac{A_{s+1}}{\zeta - \pi_{s+1}} \right). \tag{5.9} \]
Using Lemma 2.1 and (5.2) with $s$ replaced by $s + 1$, we see that $\det M_{s+1}(\zeta) \equiv \det D(\zeta)$, and hence $M_{s+1}(\zeta)$ is invertible near $\pi_{s+1}$. Now since $A^2 = 0$, we can rewrite (5.9) as
\[ M_{s+1}^{-1}(\zeta) \cdot M(\zeta) = \left( I + \frac{A_{s+1}}{\zeta - \pi_{s+1}} \right) \cdot \left( I - \frac{A}{\zeta - \pi_{s+1}} \right). \tag{5.10} \]
The RHS is analytic for $\zeta \ne \pi_{s+1}$, and the LHS is analytic near $\pi_{s+1}$. Thus, both sides of (5.10) are entire functions. But the RHS tends to $I$ as $\zeta \to \infty$, so by Liouville's theorem, both sides are equal to $I$ for all $\zeta$. This proves that $M(\zeta) = M_{s+1}(\zeta)$ and $A = A_{s+1}$.
5.2. Recall that the formulas of Theorem 4.2 have been derived under the assumption that the parameter $p_s$ does not vanish for all $s \ge k$. Let us now establish the non-vanishing of $p_s$ for the weight functions $\omega$ and orthogonality sets $X$ satisfying the assumptions of Subsect. 1.2. We need the following well-known fact. For the reader's convenience, we also provide a proof.

Lemma 5.4 (Zeroes of discrete orthogonal polynomials). Let $Z \subset \mathbb{C}$ be a finite subset, and assume that $Z$ is contained in a closed interval $[a, b] \subseteq \mathbb{R} \subseteq \mathbb{C}$ such that $a$ is the minimal element of $Z$ and $b$ is the maximal element of $Z$. Let $\omega : Z \to \mathbb{R}$ be a strictly positive weight function. Then there exists a unique family $\{P_n(\zeta)\}_{n=0}^{L}$ of orthogonal polynomials corresponding to $\omega$, where $L = \mathrm{card}(Z) - 1$. The coefficients of each $P_n(\zeta)$ are real. Moreover, for any $0 \le n \le L$, all zeroes of $P_n(\zeta)$ are real and are contained in the open interval $(a, b)$.

Proof. Since $\omega$ is strictly positive, the restriction of the corresponding inner product $(\cdot, \cdot)_\omega$ to the space $\mathbb{R}[\zeta]_{\le d}$ of real polynomials of degree at most $d$ is nondegenerate for each $0 \le d \le L$. Thus, there exists a unique family of real orthogonal polynomials $\{P_n(\zeta)\}_{n=0}^{L}$ corresponding to $\omega$, and $(P_n, P_n)_\omega \ne 0$ for $0 \le n \le L$. Now fix $1 \le n \le L$, and assume that $P_n(\zeta)$ has fewer than $n$ zeroes in the open interval $(a, b)$. Let $z_1, \dots, z_m$ be the zeroes of $P_n(\zeta)$ in $(a, b)$, listed with their multiplicities, and let $Q(\zeta) = (\zeta - z_1) \cdots (\zeta - z_m)$. Then, since $P_n(\zeta)$ and $Q(\zeta)$ are real polynomials, we have either $P_n(\zeta) Q(\zeta) \ge 0$ for all $\zeta \in [a, b]$, or $P_n(\zeta) Q(\zeta) \le 0$ for all $\zeta \in [a, b]$. In addition, since the degree of $P_n(\zeta)$ is less than the cardinality of $Z$, there exists $z \in Z$ such that $P_n(z) Q(z) \ne 0$. This implies that $P_n(\zeta)$ and $Q(\zeta)$ are not orthogonal with respect to $\omega$. Since the degree of $Q(\zeta)$ is less than that of $P_n(\zeta)$, we have a contradiction with the definition of orthogonal polynomials.

Now we can prove

Proposition 5.5. With the notation and conventions of Sects. 3 and 4, assume that either $\pi_0 < \pi_1 < \pi_2 < \cdots$, or $\pi_0 > \pi_1 > \pi_2 > \cdots$. Then $p_s \ne 0$ for all $s > k$, $s \le N$.

Proof. Fix $s > k$, $s \le N$. By Lemma 5.4, there exists a family $\{P_n(\zeta)\}_{n=0}^{s-1}$ of polynomials orthogonal on $\{\pi_0, \dots, \pi_{s-1}\}$ with the weight function given by the restriction of $\omega$ to $\{0, 1, \dots, s - 1\}$. Moreover, as $\pi_s$ lies outside the interval between $\pi_0$ and $\pi_{s-1}$, we have $P_n(\pi_s) \ne 0$ for all $0 \le n \le s - 1$. Now since $k \le s - 1$, we know from Theorem 2.4 that the first column of the matrix $m_s(\zeta)$ has the form $\bigl( P_k(\zeta), c \cdot P_{k-1}(\zeta) \bigr)^t$, where $c$ is a nonzero constant. On the other hand, by Lemma 5.1, we have
\[ m_s(\pi_s) w(\pi_s) = A_s \cdot \bigl( m_s(\pi_s) - m_s'(\pi_s) w(\pi_s) \bigr). \tag{5.11} \]
If $p_s = 0$, then because the matrix $A_s$ is nilpotent by Theorem 3.1(a), we have either $q_s = 0$ or $r_s = 0$, i.e., either the first or the second row of $A_s$ is zero. By (5.11), this implies that either the (1, 2) or the (2, 2) element of the matrix $m_s(\pi_s) w(\pi_s)$ is zero. This contradicts $P_k(\pi_s), P_{k-1}(\pi_s) \ne 0$.

6. Initial Conditions for Recurrence Relations

6.1. In this section we derive the initial conditions for the recurrence relations (4.7) and (5.5). We keep the general notation and conventions of Sects. 4 and 5. Recall in particular that $k$ is a natural number that controls the asymptotics at infinity of the solutions of all DRHPs that we consider. As in Sect. 5, we do not assume that the orthogonality set $X$ is locally finite.
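Proposition 6.1. The solution $m_k(\zeta)$ of the DRHP $\bigl(\{\pi_0, \dots, \pi_{k-1}\},\ w|_{\{\pi_0, \dots, \pi_{k-1}\}}\bigr)$ with the asymptotics $m_k(\zeta) \sim \begin{pmatrix} \zeta^{k} & 0 \\ 0 & \zeta^{-k} \end{pmatrix}$ as $\zeta \to \infty$ is given by
\[ m_k(\zeta) = \begin{pmatrix} (\zeta - \pi_0)(\zeta - \pi_1) \cdots (\zeta - \pi_{k-1}) & 0 \\[4pt] (\zeta - \pi_0)(\zeta - \pi_1) \cdots (\zeta - \pi_{k-1}) \displaystyle\sum_{m=0}^{k-1} \frac{\rho_m}{\zeta - \pi_m} & (\zeta - \pi_0)^{-1}(\zeta - \pi_1)^{-1} \cdots (\zeta - \pi_{k-1})^{-1} \end{pmatrix}, \tag{6.1} \]
where
\[ \rho_m = \omega(m)^{-1} \cdot \prod_{\substack{0 \le j \le k-1 \\ j \neq m}} (\pi_m - \pi_j)^{-2} \tag{6.2} \]
for all $0 \le m \le k-1$.

Proof. Let $m(\zeta)$ be the matrix defined by the RHS of (6.1). It is clear that $m(\zeta)$ has the required asymptotics at infinity. Hence we only have to show that (6.2) is the (unique) choice of constants $\rho_m$ which makes $m(\zeta)$ satisfy the required residue conditions. Now if $0 \le x \le k-1$, then the (2, 2) element of the matrix $\operatorname{Res}_{\zeta = \pi_x} m(\zeta)$ equals
\[ \prod_{\substack{0 \le l \le k-1 \\ l \neq x}} (\pi_x - \pi_l)^{-1}, \]
and the (2, 2) element of the matrix $\lim_{\zeta \to \pi_x} m(\zeta)w(\pi_x)$ equals
\[ \omega(x) \cdot \rho_x \cdot \prod_{\substack{0 \le l \le k-1 \\ l \neq x}} (\pi_x - \pi_l). \]
The other elements of both matrices are zero. Equating the last two expressions yields (6.2).

6.2. Now we use Lemma 5.1 to find the matrix $A_k$.

Proposition 6.2. The elements of the matrix
\[ A_k = \begin{pmatrix} p_k & q_k \\ r_k & -p_k \end{pmatrix} \]
are given by the following formulas:
\[ q_k = \Bigl( \rho_k + \sum_{m=0}^{k-1} \frac{\rho_m}{(\pi_k - \pi_m)^2} \Bigr)^{-1} = \Bigl( \sum_{m=0}^{k} \omega(m)^{-1} \prod_{\substack{0 \le j \le k \\ j \neq m}} (\pi_m - \pi_j)^{-2} \Bigr)^{-1}, \tag{6.3} \]
where
\[ \rho_k = \omega(k)^{-1} \cdot \prod_{j=0}^{k-1} (\pi_k - \pi_j)^{-2}; \]
\[ p_k = -q_k \cdot \sum_{m=0}^{k-1} \frac{\rho_m}{\pi_k - \pi_m}, \tag{6.4} \]
and
\[ r_k = -q_k \cdot \Bigl( \sum_{m=0}^{k-1} \frac{\rho_m}{\pi_k - \pi_m} \Bigr)^{2}. \tag{6.5} \]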
Remark 6.3. It follows from (6.2), (6.3) and (6.4) that if the orthogonality set $X$ is contained in $\mathbb{R}$ and either $\pi_0 > \pi_1 > \pi_2 > \cdots$ or $\pi_0 < \pi_1 < \pi_2 < \cdots$ (and the weight function $\omega$ is strictly positive), then $\rho_m > 0$ for $0 \le m \le k-1$, $q_k > 0$, and hence $p_k \neq 0$.

Proof of Proposition 6.2. It follows from Lemma 5.1 that $A_k$ is the unique matrix satisfying the following system of equations:
\[ A_k \cdot m_k(\pi_k)w(\pi_k) = 0, \tag{6.6} \]
\[ m_k(\pi_k)w(\pi_k) + A_k \cdot m_k'(\pi_k)w(\pi_k) = A_k \cdot m_k(\pi_k). \tag{6.7} \]
Substituting (6.1) into (6.6) yields (6.4). It remains to prove (6.3), for (6.5) then follows from $p_k^2 = -q_k r_k$. To this end, we consider the (1, 2) elements of both sides of (6.7). The first summand on the LHS contributes $\omega(k) \cdot \prod_{j=0}^{k-1} (\pi_k - \pi_j)$ to the (1, 2) element. Now we consider the second summand. We can rewrite it as
\[ A_k \cdot \frac{d}{d\zeta}\Bigl[ m_k(\zeta)w(\pi_k) \Bigr]_{\zeta = \pi_k} = \frac{d}{d\zeta}\Biggl[ (\zeta - \pi_0) \cdots (\zeta - \pi_{k-1}) \cdot \omega(k) \cdot A_k \cdot \begin{pmatrix} 0 & 1 \\ 0 & \sum_{m=0}^{k-1} \frac{\rho_m}{\zeta - \pi_m} \end{pmatrix} \Biggr]_{\zeta = \pi_k}. \]
If the derivative falls onto the factor $(\zeta - \pi_0) \cdots (\zeta - \pi_{k-1})$, the corresponding term is zero because of (6.6). Hence the whole expression equals
\[ (\pi_k - \pi_0) \cdots (\pi_k - \pi_{k-1}) \cdot A_k \cdot \omega(k) \cdot \begin{pmatrix} 0 & 0 \\ 0 & -\sum_{m=0}^{k-1} \frac{\rho_m}{(\pi_m - \pi_k)^2} \end{pmatrix} = \omega(k) \cdot \prod_{j=0}^{k-1} (\pi_k - \pi_j) \cdot \begin{pmatrix} 0 & -q_k \cdot \sum_{m=0}^{k-1} \frac{\rho_m}{(\pi_m - \pi_k)^2} \\ 0 & * \end{pmatrix}. \]
Finally, the (1, 2) element of $A_k \cdot m_k(\pi_k)$ equals $q_k \cdot \prod_{j=0}^{k-1} (\pi_k - \pi_j)^{-1}$. Thus, comparing the (1, 2) elements of both sides of (6.7) yields
\[ q_k \cdot \Bigl( \prod_{j=0}^{k-1} (\pi_k - \pi_j) \cdot \omega(k) \cdot \sum_{m=0}^{k-1} \frac{\rho_m}{(\pi_k - \pi_m)^2} + \prod_{j=0}^{k-1} (\pi_k - \pi_j)^{-1} \Bigr) = \omega(k) \cdot \prod_{j=0}^{k-1} (\pi_k - \pi_j), \]
which gives (6.3).
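The formulas (6.2)–(6.5) and the system (6.6)–(6.7) can be checked in exact rational arithmetic. The sketch below is only an illustration: the function names are hypothetical, the sample weight $a^x/x!$ on $\pi_x = x$ is an arbitrary choice, and the jump matrix is assumed to have the form $w = \begin{pmatrix} 0 & \omega \\ 0 & 0 \end{pmatrix}$, which is what the displayed formula for $m_k(\zeta)w(\pi_k)$ in the proof suggests.

```python
from fractions import Fraction as F
from math import factorial, prod

def initial_data(pi, omega, k):
    """Return (rho, p_k, q_k, r_k) computed from (6.2)-(6.5)."""
    rho = [1 / (omega[m] * prod((pi[m] - pi[j]) ** 2 for j in range(k) if j != m))
           for m in range(k)]
    rho_k = 1 / (omega[k] * prod((pi[k] - pi[j]) ** 2 for j in range(k)))
    s1 = sum(rho[m] / (pi[k] - pi[m]) for m in range(k))
    s2 = sum(rho[m] / (pi[k] - pi[m]) ** 2 for m in range(k))
    q = 1 / (rho_k + s2)                      # first form of (6.3)
    return rho, -q * s1, q, -q * s1 ** 2      # p_k by (6.4), q_k, r_k by (6.5)

def q_closed_form(pi, omega, k):
    """Second form of (6.3): a single sum over m = 0..k."""
    total = sum(1 / (omega[m] * prod((pi[m] - pi[j]) ** 2
                                     for j in range(k + 1) if j != m))
                for m in range(k + 1))
    return 1 / total

def satisfies_6_6_and_6_7(pi, omega, k):
    """Check (6.6) and the second column of (6.7), assuming w = [[0, omega], [0, 0]]."""
    rho, p, q, r = initial_data(pi, omega, k)
    P = prod(pi[k] - pi[j] for j in range(k))            # (1,1) entry of m_k(pi_k)
    s1 = sum(rho[m] / (pi[k] - pi[m]) for m in range(k))
    s2 = sum(rho[m] / (pi[k] - pi[m]) ** 2 for m in range(k))
    w = omega[k]
    eq66 = (p + q * s1 == 0) and (r - p * s1 == 0)       # A_k (1, s1)^t = 0
    lhs = (w * P * (1 - q * s2), w * P * (s1 + p * s2))  # column 2 of m w + A m' w
    rhs = (q / P, -p / P)                                # column 2 of A m
    return eq66 and lhs == rhs

# Illustrative data (an assumption): weight 2^x / x! on the nodes pi_x = x.
pi = [F(x) for x in range(5)]
omega = [F(2) ** x / factorial(x) for x in range(5)]
```

The first column of (6.7) is not tested separately, since both sides vanish there once (6.6) holds.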
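Remark 6.4. The two propositions we have just proved give explicit formulas for the matrices $m_k(\zeta)$ and $A_k$. Using these formulas, we can also find the functions $m_{k+1}(\zeta)$ and $M_k(\zeta)$. Indeed, from the Lax pair (3.3) and (3.4), we have
\[ m_{k+1}(\zeta) = \Bigl( I + \frac{A_k}{\zeta - \pi_k} \Bigr) m_k(\zeta), \tag{6.8} \]
\[ M_k(\zeta) = m_k(\sigma\zeta) D(\zeta) m_{k+1}^{-1}(\zeta) = m_k(\sigma\zeta) D(\zeta) m_k^{-1}(\zeta) \Bigl( I - \frac{A_k}{\zeta - \pi_k} \Bigr). \tag{6.9} \]
Even though this gives an explicit formula for $M_k(\zeta)$, it is cumbersome to use it in practice. In the case where the matrix $D(\zeta)$ is linear in $\zeta$, a more explicit version is available:

Proposition 6.5. Assume that $d_1(\zeta) = \lambda_1\zeta + \mu_1$, $d_2(\zeta) = \lambda_2\zeta + \mu_2$, where $\lambda_1, \lambda_2, \mu_1, \mu_2 \in \mathbb{C}$ are constants (some of which could be zero). Then
\[ M_k(\zeta) = \begin{pmatrix} \eta^{k}(\lambda_1\zeta + \mu_1) + \eta^{k}\lambda_1(\pi_0 - \pi_k - p_k) & -\eta^{k}\lambda_1 q_k \\[4pt] -\eta^{-k}\lambda_2 r_k + (\eta^{k-1}\lambda_1 - \eta^{-k}\lambda_2) \displaystyle\sum_{m=0}^{k-1} \rho_m & \eta^{-k}(\lambda_2\zeta + \mu_2) + \eta^{-k}\lambda_2(p_k + \pi_k - \pi_0) \end{pmatrix}, \tag{6.10} \]
where $p_k$, $q_k$, $r_k$, $\rho_m$ are given by (6.4), (6.3), (6.5) and (6.2).

Proof. Since $M_k(\zeta)$ is an entire function, it suffices by Liouville's theorem to show that (6.10) holds up to terms of order $\zeta^{-1}$ as $\zeta \to \infty$. To that end, note that by (6.1) and (3.2), we have
\[ m_k(\sigma\zeta) = \begin{pmatrix} \eta^{k}(\zeta - \pi_1) \cdots (\zeta - \pi_k) & 0 \\[4pt] \eta^{k-1}(\zeta - \pi_1) \cdots (\zeta - \pi_k) \displaystyle\sum_{m=0}^{k-1} \frac{\rho_m}{\zeta - \pi_{m+1}} & \eta^{-k}(\zeta - \pi_1)^{-1} \cdots (\zeta - \pi_k)^{-1} \end{pmatrix}. \tag{6.11} \]
But it follows from (6.9) that
\[ M_k(\zeta) = m_k(\sigma\zeta) \begin{pmatrix} \zeta^{-k} & 0 \\ 0 & \zeta^{k} \end{pmatrix} \cdot D(\zeta) \cdot \Bigl( m_k(\zeta) \begin{pmatrix} \zeta^{-k} & 0 \\ 0 & \zeta^{k} \end{pmatrix} \Bigr)^{-1} \cdot \Bigl( I - \frac{A_k}{\zeta - \pi_k} \Bigr). \]
Substituting (6.1) and (6.11) into the last formula, we arrive at (6.10).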
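As a sanity check of (6.10), one can evaluate the right-hand side of (6.9) directly and compare. The sketch below does this in exact arithmetic for the Meixner data of Subsect. 7.2 ($\pi_x = x$, $\sigma\zeta = \zeta - 1$, $\eta = 1$, $d_1(\zeta) = \zeta$, $d_2(\zeta) = c\zeta + c(\beta - 1)$); the parameter values $\beta = 2$, $c = 1/3$, the evaluation points and all function names are illustrative assumptions, not part of the paper.

```python
from fractions import Fraction as F
from math import factorial, prod

def mul(X, Y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(X[i][l] * Y[l][j] for l in range(2)) for j in range(2)]
            for i in range(2)]

def data(pi, omega, k):
    """(rho, p_k, q_k, r_k) from (6.2)-(6.5)."""
    rho = [1 / (omega[m] * prod((pi[m] - pi[j]) ** 2 for j in range(k) if j != m))
           for m in range(k)]
    rho_k = 1 / (omega[k] * prod((pi[k] - pi[j]) ** 2 for j in range(k)))
    s1 = sum(rho[m] / (pi[k] - pi[m]) for m in range(k))
    s2 = sum(rho[m] / (pi[k] - pi[m]) ** 2 for m in range(k))
    q = 1 / (rho_k + s2)
    return rho, -q * s1, q, -q * s1 ** 2

def m_k(z, pi, rho, k):
    """The matrix (6.1) evaluated at z."""
    P = prod(z - pi[j] for j in range(k))
    S = sum(rho[m] / (z - pi[m]) for m in range(k))
    return [[P, 0], [P * S, 1 / P]]

def M_direct(z, pi, omega, k, d1, d2, sigma):
    """M_k(z) evaluated from the right-hand side of (6.9)."""
    rho, p, q, r = data(pi, omega, k)
    (a, _), (c, d) = m_k(z, pi, rho, k)
    m_inv = [[d, 0], [-c, a]]          # det m_k = 1 and the (1,2) entry vanishes
    t = 1 / (z - pi[k])
    corr = [[1 - p * t, -q * t], [-r * t, 1 + p * t]]   # I - A_k / (z - pi_k)
    D = [[d1(z), 0], [0, d2(z)]]
    return mul(mul(mul(m_k(sigma(z), pi, rho, k), D), m_inv), corr)

def M_formula(z, pi, omega, k, lam1, mu1, lam2, mu2, eta):
    """M_k(z) from the closed form (6.10)."""
    rho, p, q, r = data(pi, omega, k)
    e = eta ** k
    return [[e * (lam1 * z + mu1) + e * lam1 * (pi[0] - pi[k] - p), -e * lam1 * q],
            [-lam2 * r / e + (eta ** (k - 1) * lam1 - lam2 / e) * sum(rho),
             (lam2 * z + mu2) / e + lam2 * (p + pi[k] - pi[0]) / e]]

# Meixner data with illustrative parameters beta = 2, c = 1/3.
beta, c = F(2), F(1, 3)
pi = [F(x) for x in range(6)]
omega = [prod(beta + i for i in range(x)) * c ** x / factorial(x) for x in range(6)]
d1 = lambda z: z
d2 = lambda z: c * z + c * (beta - 1)
sigma = lambda z: z - 1
```

The evaluation point must avoid the nodes $\pi_0, \dots, \pi_k$, where the individual factors in (6.9) have poles; the product itself is entire.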
6.3. The final result of this section is the computation of the Fredholm determinants $D_k$ and $D_{k+1}$. We use the probability-theoretic interpretation of the Fredholm determinants $D_s$ given in Subsect. 4.5. One can show (this is a standard random matrix theory argument, see e.g. [14]) that the constant $Z$ given by (4.16) is equal to the product of the norms squared of the first $k$ monic orthogonal polynomials:
\[ Z = \prod_{i=0}^{k-1} (P_i, P_i)_\omega. \tag{6.12} \]
Now we prove

Proposition 6.6. With the notation of Sect. 4, let $X \subset \mathbb{R}$ be a discrete set, let $\{P_n(\zeta)\}$ be the family of orthogonal polynomials corresponding to a strictly positive weight function $\omega : X \to \mathbb{R}$, and let $Z$ be given by (6.12). Then
\[ D_k = \frac{1}{Z} \cdot \prod_{0 \le i < j \le k-1} (\pi_i - \pi_j)^2 \cdot \prod_{l=0}^{k-1} \omega(l), \tag{6.13} \]
\[ D_{k+1} = \omega(k) \cdot q_k^{-1} \cdot D_k \cdot \prod_{l=0}^{k-1} (\pi_k - \pi_l)^2, \tag{6.14} \]
where $q_k$ is given by (6.3).

Proof. Recall that for all $s \in \mathbb{Z}_{\ge k}$, $s \le N$, we have defined $Z_s = \{\pi_x\}_{x=0}^{s-1}$, $Y_s = X \setminus Z_s$, and $D_s = \det(1 - K|_{Y_s \times Y_s})$. Hence a subset of $X$ is disjoint from $Y_s$ if and only if it is contained in $Z_s$. There exists only one subset of $Z_k$ of cardinality $k$, namely, $Z_k = \{\pi_0, \dots, \pi_{k-1}\}$ itself. Applying (4.17) yields (6.13). Next, there are $k+1$ subsets of $Z_{k+1}$ of cardinality $k$, namely, those of the form $Z_{k+1} \setminus \{\pi_m\}$ for $0 \le m \le k$. Applying (4.17) gives
\[ D_{k+1} = \frac{1}{Z} \cdot \sum_{m=0}^{k} \frac{1}{\omega(m)} \cdot \prod_{l=0}^{k} \omega(l) \cdot \prod_{\substack{0 \le i < j \le k \\ i, j \neq m}} (\pi_i - \pi_j)^2. \]
Using (6.2), (6.3) and (6.13), we see that the last equation is equivalent to (6.14).
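The passage from the sum over $k$-point subsets to (6.14) is a purely algebraic identity and can be tested by brute force. The following sketch assumes, as the proof indicates, that $Z \cdot D_s$ is the total weight $\prod \omega \cdot \prod (\pi_i - \pi_j)^2$ of all $k$-point subsets of $Z_s$; the helper names and the sample nodes and weights are hypothetical.

```python
from fractions import Fraction as F
from itertools import combinations
from math import prod

def subset_weight(idx, pi, omega):
    """Product of the weights times the squared Vandermonde over the index set idx."""
    return (prod(omega[i] for i in idx)
            * prod((pi[i] - pi[j]) ** 2 for i, j in combinations(idx, 2)))

def Z_times_D(s, k, pi, omega):
    """Z * D_s as the total weight of all k-point subsets of {pi_0, ..., pi_{s-1}}."""
    return sum(subset_weight(c, pi, omega) for c in combinations(range(s), k))

def q_inverse(pi, omega, k):
    """1 / q_k, taken from the second form of (6.3)."""
    return sum(1 / (omega[m] * prod((pi[m] - pi[j]) ** 2
                                    for j in range(k + 1) if j != m))
               for m in range(k + 1))

def check_6_14(pi, omega, k):
    """Compare Z * D_{k+1} with Z times the right-hand side of (6.14)."""
    lhs = Z_times_D(k + 1, k, pi, omega)
    rhs = (omega[k] * q_inverse(pi, omega, k) * Z_times_D(k, k, pi, omega)
           * prod((pi[k] - pi[l]) ** 2 for l in range(k)))
    return lhs == rhs

# Arbitrary increasing nodes and positive weights, chosen only for illustration.
pi = [F(1), F(3, 2), F(4), F(6), F(13, 2)]
omega = [F(2), F(1, 3), F(5), F(1), F(7, 4)]
```

Working with $Z \cdot D_s$ rather than $D_s$ avoids computing the norms in (6.12), which cancel from both sides of (6.14).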
7. Lax Pairs for Discrete Orthogonal Polynomials of the Askey Scheme

7.1. In this section we specialize to weight functions appearing in the orthogonality relations for the hypergeometric orthogonal polynomials and the basic hypergeometric orthogonal polynomials of the Askey scheme. We use [13] as our main reference for the orthogonal polynomials of the Askey scheme. We are only interested in those families for which the orthogonality set is discrete. Since our ultimate goal is to derive a recurrence relation for the associated Fredholm determinants, we do not impose the local finiteness condition on $X$. However, the basic assumptions of Subsect. 1.2 have to be satisfied in order to use our approach (in its present form). These assumptions are not satisfied for the following families of discrete orthogonal polynomials: the Racah polynomials ([13], §1.2), the dual Hahn polynomials ([13], §1.6), the q-Racah polynomials ([13], §3.2), the big q-Jacobi polynomials ([13], §3.5), the big q-Legendre polynomials ([13], §3.5.1), the dual q-Hahn polynomials ([13], §3.7), the big q-Laguerre polynomials ([13], §3.11), the dual q-Krawtchouk polynomials ([13], §3.17), the Al-Salam-Carlitz I polynomials ([13], §3.24), and the discrete q-Hermite I polynomials ([13], §3.28).

7.2. Now we list the families of hypergeometric and basic hypergeometric orthogonal polynomials for which our results do apply. Instead of writing out the whole Lax pair in each case, we only give the orthogonality set $X$, the weight function $\omega(x)$, the affine transformation $\sigma : \mathbb{C} \to \mathbb{C}$, and the corresponding entire functions $d_1(\zeta)$, $d_2(\zeta)$ which satisfy the assumption of Theorem 3.1(b). For the basic hypergeometric polynomials, we assume from now on that $0 < q < 1$. This restriction ensures that $X \subset \mathbb{R}$ and the weight function is strictly positive and has finite moments.

• Hahn polynomials ([13], §1.5): $X = \{0, \dots, N\}$, $N \in \mathbb{Z}_{\ge 0}$;
  $\omega(x) = \binom{\alpha + x}{x} \binom{\beta + N - x}{N - x}$, where $\alpha, \beta > -1$ or $\alpha, \beta < -N$;
  $\sigma\zeta = \zeta - 1$, $d_1(\zeta) = \zeta(\zeta - \beta - N - 1)$, $d_2(\zeta) = (\zeta - N - 1)(\zeta + \alpha)$.

• Meixner polynomials ([13], §1.9): $X = \mathbb{Z}_{\ge 0}$;
  $\omega(x) = \dfrac{(\beta)_x}{x!} c^x$, where $\beta > 0$ and $0 < c < 1$;
  $\sigma\zeta = \zeta - 1$, $d_1(\zeta) = \zeta$, $d_2(\zeta) = c\zeta + c(\beta - 1)$.

• Krawtchouk polynomials ([13], §1.10): $X = \{0, \dots, N\}$, $N \in \mathbb{Z}_{\ge 0}$;
  $\omega(x) = \binom{N}{x} p^x (1 - p)^{N - x}$, where $0 < p < 1$;
  $\sigma\zeta = \zeta - 1$, $d_1(\zeta) = \zeta$, $d_2(\zeta) = \dfrac{p}{p - 1}(\zeta - N - 1)$.

• Charlier polynomials ([13], §1.12): $X = \mathbb{Z}_{\ge 0}$;
  $\omega(x) = \dfrac{a^x}{x!}$, where $a > 0$;
  $\sigma\zeta = \zeta - 1$, $d_1(\zeta) = \zeta$, $d_2(\zeta) = a$.

• q-Hahn polynomials ([13], §3.6): $X = \{q^{-x} \mid x = 0, \dots, N\}$, $N \in \mathbb{Z}_{\ge 0}$;
  $\omega(x) = \dfrac{(\alpha q; q)_x (q^{-N}; q)_x}{(q; q)_x (\beta^{-1} q^{-N}; q)_x} (\alpha\beta q)^{-x}$, where $0 < \alpha, \beta < q^{-1}$ or $\alpha, \beta > q^{-N}$;
  $\sigma\zeta = q\zeta$, $d_1(\zeta) = \alpha\beta(\zeta - 1)(\zeta - \beta^{-1} q^{-N-1})$, $d_2(\zeta) = (\zeta - \alpha)(\zeta - q^{-N-1})$.

• Little q-Jacobi polynomials ([13], §3.12): $X = \{q^{x} \mid x \in \mathbb{Z}_{\ge 0}\}$;
  $\omega(x) = \dfrac{(bq; q)_x}{(q; q)_x} (aq)^x$, where $0 < a < q^{-1}$ and $b < q^{-1}$;
  $\sigma\zeta = q^{-1}\zeta$, $d_1(\zeta) = \zeta - 1$, $d_2(\zeta) = a(b\zeta - 1)$.

• q-Meixner polynomials ([13], §3.13): $X = \{q^{-x} \mid x \in \mathbb{Z}_{\ge 0}\}$;
  $\omega(x) = \dfrac{(bq; q)_x}{(q; q)_x (-bcq; q)_x} c^x q^{\binom{x}{2}}$, where $0 < b < q^{-1}$ and $c > 0$;
  $\sigma\zeta = q\zeta$, $d_1(\zeta) = (\zeta - 1)(\zeta + bc)$, $d_2(\zeta) = c(\zeta - b)$.

• Quantum q-Krawtchouk polynomials ([13], §3.14): $X = \{q^{-x} \mid x = 0, \dots, N\}$, $N \in \mathbb{Z}_{\ge 0}$;
  $\omega(x) = \dfrac{(pq; q)_{N-x}}{(q; q)_x (q; q)_{N-x}} (-1)^{N-x} q^{\binom{x}{2}}$, where $p > q^{-N}$;
  $\sigma\zeta = q\zeta$, $d_1(\zeta) = (\zeta - 1)(pq^{N+1}\zeta - 1)$, $d_2(\zeta) = 1 - q^{N+1}\zeta$.

• q-Krawtchouk polynomials ([13], §3.15): $X = \{q^{-x} \mid x = 0, \dots, N\}$, $N \in \mathbb{Z}_{\ge 0}$;
  $\omega(x) = \dfrac{(q^{-N}; q)_x}{(q; q)_x} (-p)^{-x}$, where $p > 0$;
  $\sigma\zeta = q\zeta$, $d_1(\zeta) = p(\zeta - 1)$, $d_2(\zeta) = q^{-N} - q\zeta$.

• Affine q-Krawtchouk polynomials ([13], §3.16): $X = \{q^{-x} \mid x = 0, \dots, N\}$, $N \in \mathbb{Z}_{\ge 0}$;
  $\omega(x) = \dfrac{(pq; q)_x (q; q)_N}{(q; q)_x (q; q)_{N-x}} (pq)^{-x}$, where $0 < p < q^{-1}$;
  $\sigma\zeta = q\zeta$, $d_1(\zeta) = p(\zeta - 1)$, $d_2(\zeta) = (\zeta - p)(q^{N+1}\zeta - 1)$.

• Little q-Laguerre/Wall polynomials ([13], §3.20): $X = \{q^{x} \mid x \in \mathbb{Z}_{\ge 0}\}$;
  $\omega(x) = \dfrac{(aq)^x}{(q; q)_x}$, where $0 < a < q^{-1}$;
  $\sigma\zeta = q^{-1}\zeta$, $d_1(\zeta) = \zeta - 1$, $d_2(\zeta) = -a$.

• Alternative q-Charlier polynomials ([13], §3.22): $X = \{q^{x} \mid x \in \mathbb{Z}_{\ge 0}\}$;
  $\omega(x) = \dfrac{a^x}{(q; q)_x} q^{\binom{x+1}{2}}$, where $a > 0$;
  $\sigma\zeta = q^{-1}\zeta$, $d_1(\zeta) = \zeta - 1$, $d_2(\zeta) = -\dfrac{a}{q}\zeta$.

• q-Charlier polynomials ([13], §3.23): $X = \{q^{-x} \mid x \in \mathbb{Z}_{\ge 0}\}$;
  $\omega(x) = \dfrac{a^x}{(q; q)_x} q^{\binom{x}{2}}$, where $a > 0$;
  $\sigma\zeta = q\zeta$, $d_1(\zeta) = \zeta - 1$, $d_2(\zeta) = a$.

• Al-Salam-Carlitz II polynomials ([13], §3.25): $X = \{q^{-x} \mid x \in \mathbb{Z}_{\ge 0}\}$;
  $\omega(x) = \dfrac{q^{x^2} a^x}{(q; q)_x (aq; q)_x}$, where $a > 0$;
  $\sigma\zeta = q\zeta$, $d_1(\zeta) = (\zeta - 1)(\zeta - a)$, $d_2(\zeta) = a$.
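A quick way to sanity-check an entry of this list is to compare consecutive weights with the ratio $d_2/d_1$ at the nodes. The sketch below assumes that the condition of Theorem 3.1(b), which is not restated in this section, amounts to $\omega(x)/\omega(x-1) = \eta^{-1} d_2(\pi_x)/d_1(\pi_x)$, where $\eta$ is the linear coefficient of $\sigma$ and $\pi_x$ is the $x$-th point of $X$; this reading is an inference from the data above, and the function names and parameter values are illustrative only.

```python
from fractions import Fraction as F
from math import comb, factorial, prod

def qpoch(a, q, n):
    """q-Pochhammer symbol (a; q)_n."""
    return prod(1 - a * q ** i for i in range(n))

def ratio_ok(pi, omega, d1, d2, eta, xs):
    """Test omega(x)/omega(x-1) == d2(pi_x) / (eta * d1(pi_x)) for every x in xs."""
    return all(omega(x) / omega(x - 1) == d2(pi(x)) / (eta * d1(pi(x))) for x in xs)

# Illustrative parameter values (assumptions).
q, N = F(1, 2), 6
beta, c, p, a = F(5, 2), F(1, 3), F(1, 4), F(3, 2)

# Each entry: (pi, omega, d1, d2, eta), transcribed from the list above.
families = {
    "Meixner": (lambda x: F(x),
                lambda x: prod(beta + i for i in range(x)) * c ** x / factorial(x),
                lambda z: z, lambda z: c * z + c * (beta - 1), F(1)),
    "Krawtchouk": (lambda x: F(x),
                   lambda x: comb(N, x) * p ** x * (1 - p) ** (N - x),
                   lambda z: z, lambda z: p / (p - 1) * (z - N - 1), F(1)),
    "Charlier": (lambda x: F(x),
                 lambda x: a ** x / factorial(x),
                 lambda z: z, lambda z: a, F(1)),
    "q-Charlier": (lambda x: q ** (-x),
                   lambda x: a ** x * q ** comb(x, 2) / qpoch(q, q, x),
                   lambda z: z - 1, lambda z: a, q),
    "little q-Laguerre": (lambda x: q ** x,
                          lambda x: (a * q) ** x / qpoch(q, q, x),
                          lambda z: z - 1, lambda z: -a, 1 / q),
}

results = {name: ratio_ok(*spec, range(1, N + 1)) for name, spec in families.items()}
```

Only five of the families are included, to keep the sketch short; the remaining entries can be added to the dictionary in the same way.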
7.3. The next three sections (Sects. 8–10) deal with the various possible ways of “solving” the compatibility conditions for the Lax pairs listed above. By a “solution” of a compatibility condition of the form (5.5) we mean a collection of formulas which allow us to express the entries of the matrices $M_{s+1}^{(i)}$, $A_{s+1}$ as rational functions of the entries of the matrices $M_s^{(i)}$, $A_s$, where $M_s(\zeta) = M_s^{(l)}\zeta^l + \cdots + M_s^{(0)}$, $M_s^{(i)} \in \operatorname{Mat}(2, \mathbb{C})$, for all $s$. The most general case where we have been able to solve the compatibility condition explicitly is the one where the functions $d_1(\zeta)$ and $d_2(\zeta)$ are either linear or constant; this is described in Sect. 8. The resulting formulas can be used for practical computations, but they do not appear to be related to any known systems of difference equations. In certain more specialized cases we have been able to reduce the compatibility condition to one of the equations of H. Sakai's hierarchy in [17].

7.4. In Sect. 9 we solve the compatibility condition (5.5) in the case where the orthogonality set has the form $X = \{x \in \mathbb{Z}_{\ge 0} \mid x \le N\}$ ($N \in \mathbb{Z}_{\ge 0} \cup \{\infty\}$) and the functions $d_1(\zeta)$ and $d_2(\zeta)$ are linear, by a method different from the one used in Sect. 8. We show that if both $d_1$ and $d_2$ are nonconstant, then the compatibility condition is (generically) equivalent to the d-$P_V$ equation of H. Sakai [17], and if $d_2$ is constant, then the compatibility condition is (generically) equivalent to the d-$P_{IV}$ equation ibid. (see Theorem 9.3). This result allows us to write down explicit solutions for the recurrence relations corresponding to the Meixner polynomials, the Krawtchouk polynomials, and the Charlier polynomials (see Sect. 11).

7.5. As for the basic hypergeometric orthogonal polynomials, we show in Sect. 10 (see Theorem 10.1(b)) that in the case where the orthogonality set has the form $X = \{q^{-s}\}_{s=0}^{N}$ ($N \in \mathbb{Z}_{\ge 0} \cup \{\infty\}$) and the functions $d_1(\zeta)$ and $d_2(\zeta)$ are linear and nonconstant, the compatibility condition for the corresponding Lax pair is equivalent to the q-$P_{VI}$ system of M. Jimbo and H. Sakai, for a certain choice of the parameters. Since the case where $X = \{q^{s} \mid s \in \mathbb{Z}_{\ge 0}\}$ is reduced to the former one after replacing $q$ by $q^{-1}$, this situation suits the following families of basic hypergeometric orthogonal polynomials: the little q-Jacobi polynomials and the q-Krawtchouk polynomials (it does not suit the alternative q-Charlier polynomials because the constant terms of the functions $d_1(\zeta)$ and $d_2(\zeta)$ must be nonzero, cf. Theorem 10.3). If one of the functions $d_1(\zeta)$ and $d_2(\zeta)$ is linear and the other one is constant, it is possible to reduce the corresponding compatibility condition to a degeneration of the q-$P_{VI}$ system. This process is described in Subsect. 10.3. It allows us to solve the compatibility conditions for the little q-Laguerre/Wall polynomials and the q-Charlier polynomials.

It turns out, however, that the method of solving the compatibility condition by reducing it to the q-$P_{VI}$ system (or its degeneration) is rather difficult to carry out in practice. So to find a recurrence relation for the Fredholm determinants associated to classical families of basic hypergeometric orthogonal polynomials, we prefer to use the more general formulas of Sect. 8 (see Sect. 11). The disadvantage of the formulas of Sect. 8, as compared to q-$P_{VI}$, is the fact that the recurrence step substantially involves more than two sequences, while for q-$P_{VI}$ two sequences suffice, cf. Theorems 8.2 and 10.3 below.
8. Solution of the Compatibility Condition: The General Case

8.1. In this section we “solve” (in the sense of Subsect. 7.3) the compatibility condition
\[ \Bigl( I + \frac{\eta^{-1} A_s}{\zeta - \pi_{s+1}} \Bigr) M_s(\zeta) = M_{s+1}(\zeta) \Bigl( I + \frac{A_{s+1}}{\zeta - \pi_{s+1}} \Bigr) \tag{8.1} \]
derived in Sect. 5 (cf. Eq. (5.5)), in the case where the matrix $D(\zeta)$ defined in Theorem 3.1(b) depends linearly on $\zeta$. Our method is based on the following simple observation.

Proposition 8.1. If $M_s(\zeta) = @ \cdot \zeta + C_s$ for all $s$, where $C_s$ does not depend on $\zeta$ and $@$ is a fixed matrix independent both of $\zeta$ and of $s$, then under the assumption that $A_{s+1}^2 = 0$, the compatibility condition (8.1) is equivalent to the following system of linear equations:
\[ (\pi_{s+1} @ + C_s + \eta^{-1} A_s @) \cdot A_{s+1} = \eta^{-1} A_s \cdot (\pi_{s+1} @ + C_s), \tag{8.2} \]
\[ C_{s+1} = C_s + \eta^{-1} A_s @ - @ A_{s+1}. \tag{8.3} \]

Proof. Comparing the asymptotics of both sides of (8.1) as $\zeta \to \infty$ yields (8.3). If we take the residues of both sides of (8.1) at $\zeta = \pi_{s+1}$, we obtain
\[ \eta^{-1} A_s \cdot (\pi_{s+1} @ + C_s) = (\pi_{s+1} @ + C_{s+1}) \cdot A_{s+1}. \]
Substituting (8.3) into the last equation and using the assumption that $A_{s+1}^2 = 0$ gives (8.2).
8.2. We now note that if the matrix $\pi_{s+1} @ + C_s + \eta^{-1} A_s @$ is invertible, then the system (8.2), (8.3) already has a unique solution $(A_{s+1}, C_{s+1})$. Since the compatibility condition (8.1) always has a unique solution by Theorem 5.2(b), it follows from Proposition 8.1 that in this case, the solution of (8.2), (8.3) is also the solution of (8.1). Even though we cannot prove that the matrix $\pi_{s+1} @ + C_s + \eta^{-1} A_s @$ is invertible in general, the explicit computations we have carried out for five families of basic hypergeometric orthogonal polynomials (see Sect. 11) show that the result below (Theorem 8.2) has practical significance.

8.3. We introduce the following notation. Suppose that $d_1(\zeta) = \lambda_1\zeta + \mu_1$, $d_2(\zeta) = \lambda_2\zeta + \mu_2$, where $\lambda_1, \mu_1, \lambda_2, \mu_2 \in \mathbb{C}$ are constants. Then it follows from Theorem 3.1(c) that the assumption of Proposition 8.1 is satisfied for $@ = \operatorname{diag}(\kappa_1, \kappa_2)$, where $\kappa_1 = \eta^{k}\lambda_1$ and $\kappa_2 = \eta^{-k}\lambda_2$. Let us write
\[ A_s = \begin{pmatrix} p_s & q_s \\ r_s & -p_s \end{pmatrix} \quad \text{and} \quad C_s = \begin{pmatrix} \alpha_s & \beta_s \\ \gamma_s & \delta_s \end{pmatrix}. \]
Finally, define $\Delta_s = \det(\pi_{s+1} @ + C_s + \eta^{-1} A_s @)$. Now we can state

Theorem 8.2. We have
\[ \Delta_s = d_1(\pi_{s+1}) d_2(\pi_{s+1}) + \eta^{-1}\kappa_1 (p_s\delta_s - r_s\beta_s) - \eta^{-1}\kappa_2 (p_s\alpha_s + q_s\gamma_s). \tag{8.4} \]
If $\Delta_s \neq 0$, then the following formulas hold:
\[ p_{s+1} = -\eta^{-1} p_s^{-1} \Delta_s^{-1} \cdot (p_s\beta_s + q_s\delta_s + \kappa_2\pi_{s+1} q_s) \cdot (r_s\alpha_s - p_s\gamma_s + \kappa_1\pi_{s+1} r_s), \tag{8.5} \]
\[ q_{s+1} = \eta^{-1} q_s^{-1} \Delta_s^{-1} \cdot (p_s\beta_s + q_s\delta_s + \kappa_2\pi_{s+1} q_s)^2, \tag{8.6} \]
\[ r_{s+1} = \eta^{-1} r_s^{-1} \Delta_s^{-1} \cdot (r_s\alpha_s - p_s\gamma_s + \kappa_1\pi_{s+1} r_s)^2, \tag{8.7} \]
\[ \alpha_{s+1} = \alpha_s + \eta^{-1}\kappa_1 p_s - \kappa_1 p_{s+1}, \tag{8.8} \]
\[ \beta_{s+1} = \beta_s + \eta^{-1}\kappa_2 q_s - \kappa_1 q_{s+1}, \tag{8.9} \]
\[ \gamma_{s+1} = \gamma_s + \eta^{-1}\kappa_1 r_s - \kappa_2 r_{s+1}, \tag{8.10} \]
\[ \delta_{s+1} = \delta_s - \eta^{-1}\kappa_2 p_s + \kappa_2 p_{s+1}, \tag{8.11} \]
\[ u_s = -\kappa_2\gamma_s + \frac{p_s}{q_s} \cdot (\kappa_1\delta_s - \kappa_2\alpha_s) + \frac{p_s^2}{q_s^2} \cdot \kappa_1\beta_s, \tag{8.12} \]
where $u_s$ is defined in Theorem 4.2(b).

We omit the proof, as it consists entirely of straightforward computations. We first derive (8.4) using the identity $\det M_s(\zeta) \equiv \det D(\zeta)$ which follows from Lemma 2.1. If $\Delta_s \neq 0$, we rewrite (8.2) as
\[ A_{s+1} = \eta^{-1} \cdot (\pi_{s+1} @ + C_s + \eta^{-1} A_s @)^{-1} \cdot A_s \cdot (\pi_{s+1} @ + C_s). \]
Writing out the matrix product on the RHS explicitly yields (8.5)–(8.7). Then (8.3) gives (8.8)–(8.11). Finally, (8.12) follows immediately from the definition of $u_s$.
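The recurrence step of Theorem 8.2 is easy to test numerically. The sketch below implements (8.5)–(8.11) in exact arithmetic and checks that the resulting pair satisfies (8.1) at a sample point. It is an illustration under stated assumptions: the function names are hypothetical, the input matrices are arbitrary rather than taken from an actual Lax pair, and for that reason the determinant of Theorem 8.2 is computed with $\det(\pi_{s+1}@ + C_s)$ in place of $d_1(\pi_{s+1})d_2(\pi_{s+1})$ (the two coincide when $\det M_s \equiv \det D$). The fixed diagonal matrix written $@$ in the text is represented by the pair `(k1, k2)`.

```python
from fractions import Fraction as F

def mul(X, Y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(X[i][l] * Y[l][j] for l in range(2)) for j in range(2)]
            for i in range(2)]

def step(A, C, k1, k2, eta, pi1):
    """One step (A_s, C_s) -> (A_{s+1}, C_{s+1}) following (8.4)-(8.11).

    pi1 stands for pi_{s+1}; A must be nilpotent with nonzero entries.
    """
    (p, q), (r, _) = A
    (al, be), (ga, de) = C
    base = (k1 * pi1 + al) * (k2 * pi1 + de) - be * ga   # det(pi_{s+1} @ + C_s)
    Delta = base + k1 * (p * de - r * be) / eta - k2 * (p * al + q * ga) / eta
    u = p * be + q * de + k2 * pi1 * q
    v = r * al - p * ga + k1 * pi1 * r
    p1 = -u * v / (eta * p * Delta)                      # (8.5)
    q1 = u * u / (eta * q * Delta)                       # (8.6)
    r1 = v * v / (eta * r * Delta)                       # (8.7)
    A1 = [[p1, q1], [r1, -p1]]
    C1 = [[al + k1 * p / eta - k1 * p1, be + k2 * q / eta - k1 * q1],   # (8.8), (8.9)
          [ga + k1 * r / eta - k2 * r1, de - k2 * p / eta + k2 * p1]]   # (8.10), (8.11)
    return A1, C1, Delta

def sides_of_8_1(z, A, C, A1, C1, k1, k2, eta, pi1):
    """Both sides of the compatibility condition (8.1) at the point z."""
    t = 1 / (z - pi1)
    left = mul([[1 + A[0][0] * t / eta, A[0][1] * t / eta],
                [A[1][0] * t / eta, 1 + A[1][1] * t / eta]],
               [[k1 * z + C[0][0], C[0][1]], [C[1][0], k2 * z + C[1][1]]])
    right = mul([[k1 * z + C1[0][0], C1[0][1]], [C1[1][0], k2 * z + C1[1][1]]],
                [[1 + A1[0][0] * t, A1[0][1] * t],
                 [A1[1][0] * t, 1 + A1[1][1] * t]])
    return left, right

# Arbitrary sample input: a nilpotent A_s (p^2 + q r = 0) and a generic C_s.
A = [[F(2), F(4)], [F(-1), F(-2)]]
C = [[F(1), F(2)], [F(3), F(5)]]
k1, k2, eta, pi1 = F(2), F(3), F(1, 2), F(3)
A1, C1, Delta = step(A, C, k1, k2, eta, pi1)
```

For this input the determinant equals 56, and the new matrix $A_{s+1}$ is again nilpotent, as the derivation of (8.5)–(8.7) requires.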
9. The Fifth and the Fourth Discrete Painlevé Equations

9.1. In this section we assume that the orthogonality set is of the form $X = \{x \in \mathbb{Z}_{\ge 0} \mid x \le N\}$ ($N \in \mathbb{Z}_{\ge 0} \cup \{\infty\}$), so that, with the notation of Sect. 3, $\sigma\zeta = \zeta - 1$ and $\eta = 1$. Our goal is to prove that if the functions $d_1(\zeta)$ and $d_2(\zeta)$ are linear, then the compatibility condition (5.5) is equivalent to either the fifth or the fourth discrete Painlevé equation of H. Sakai [17]. Recall from Theorem 3.1 that since $\pi_0 = 0$, we must have $d_1(0) = 0$, and because only the ratio $d_1(\zeta)/d_2(\zeta)$ matters, we may assume without loss of generality that $d_1(\zeta) = \zeta$, and write $d_2(\zeta) = \xi\zeta + \tau$ for some $\xi, \tau \in \mathbb{C}$ (unless otherwise explicitly stated, we do not exclude the possibility $\xi = 0$). By Theorem 3.1(c), we can write
\[ M_s(\zeta) = @\zeta + C_s, \quad \text{where } @ = \begin{pmatrix} 1 & 0 \\ 0 & \xi \end{pmatrix} \text{ and } C_s = \begin{pmatrix} C_s^{11} & C_s^{12} \\ C_s^{21} & C_s^{22} \end{pmatrix}. \]
Then the compatibility condition (5.5) takes the form
\[ \Bigl( I + \frac{A_s}{\zeta - (s+1)} \Bigr) \cdot (@\zeta + C_s) = (@\zeta + C_{s+1}) \cdot \Bigl( I + \frac{A_{s+1}}{\zeta - (s+1)} \Bigr), \tag{9.1} \]
where $A_s = \begin{pmatrix} p_s & q_s \\ r_s & -p_s \end{pmatrix}$. The following result allows us to find a convenient reparameterization of the matrices $A_s$ and $C_s$ which leads to an explicit solution of (9.1).

Lemma 9.1. We have
\[ C_s^{11} + p_s = -k, \qquad C_s^{22} - \xi p_s = \xi k + \tau, \qquad C_s^{12} C_s^{21} = C_s^{11} C_s^{22} \tag{9.2} \]
for all $s \in \mathbb{Z}_{\ge k}$, $s \le N$.

Proof. Taking the asymptotics of both sides of (9.1) as $\zeta \to \infty$ gives
\[ C_s + A_s @ = C_{s+1} + @ A_{s+1}, \tag{9.3} \]
which implies that
\[ C_s^{11} + p_s = C_{s+1}^{11} + p_{s+1} \quad \text{and} \quad C_s^{22} - \xi p_s = C_{s+1}^{22} - \xi p_{s+1} \]
for all $s \in \mathbb{Z}_{\ge k}$, $s < N$. This means that the expressions $C_s^{11} + p_s$ and $C_s^{22} - \xi p_s$ do not depend on $s$. But we know from Proposition 6.5 that
\[ C_k = \begin{pmatrix} -k - p_k & -q_k \\ -\xi r_k + (1 - \xi) \sum_{m=0}^{k-1} \rho_m & \tau + \xi(p_k + k) \end{pmatrix}, \tag{9.4} \]
whence $C_k^{11} + p_k = -k$ and $C_k^{22} - \xi p_k = \xi k + \tau$, proving the first two equalities in (9.2). To prove the third equality, note that $\det M_s(\zeta) = \det D(\zeta)$ for all $\zeta$, by (3.4) and Lemma 2.1, and take $\zeta = 0$.
9.2. It follows from Lemma 9.1 that for all $s \in \mathbb{Z}_{\ge k}$, $s \le N$, the matrices $A_s$ and $C_s$ can be naturally parameterized as follows:
\[ A_s = (k + b_s) \begin{pmatrix} -1 & -\alpha_s\beta_s \\ 1/(\alpha_s\beta_s) & 1 \end{pmatrix}, \qquad C_s = \begin{pmatrix} b_s & b_s\beta_s \\ (\tau - \xi b_s)/\beta_s & \tau - \xi b_s \end{pmatrix}. \tag{9.5} \]

Remark 9.2. This parameterization is taken from [4] (see Eq. (6.8) ibid.). The proof of the theorem below is based on the same idea as the proof of Proposition 6.3 in [4]. In fact, the situation considered in Sect. 6 of [4] corresponds, with minor modifications, to the weight function for Meixner polynomials (see Sect. 7 above). The only essential difference with the present paper is that we consider a slightly more general situation by letting $d_2(\zeta)$ be an arbitrary linear function, which allows us to treat the cases of the Krawtchouk and Charlier polynomials, as well as the case of the Meixner polynomials. Note also that the parameterization (9.5) is only valid if the matrices $A_s$ and $C_s$ are sufficiently generic. If, for instance, $C_s^{12} = 0$, but $C_s^{11} \neq 0$, then (9.5) does not make sense. When we try to solve (9.1) in terms of the parameterization (9.5), we will encounter a similar difficulty: the formulas will involve rational functions of the parameters, and it is not clear a priori that the denominators of all fractions do not vanish. We refer the reader to [4], §6, where this problem is discussed in detail. The argument of [4] can be easily adapted to our situation.

Theorem 9.3. (a) Assume that $\xi \neq 0$. Introduce new variables $f_s$, $g_s$ by
\[ f_s = -k - b_s + \frac{s}{1 - \alpha_s}, \qquad g_s = -\alpha_s. \tag{9.6} \]
Then with the parameterization (9.5), the recurrence relation (9.1) has the following solution:
\[ f_{s+1} + f_s = -(k + \tau/\xi) + \frac{s}{1 + g_s} + \frac{\tau/\xi + s + 1}{1 + \xi g_s}, \tag{9.7} \]
\[ g_{s+1} g_s = \frac{(f_{s+1} - 1 - s)(f_{s+1} - 1 - s + k)}{\xi f_{s+1} (f_{s+1} + k + \tau/\xi)}, \tag{9.8} \]
\[ \frac{\beta_{s+1}}{\beta_s} = -\frac{\xi g_s}{g_{s+1}} \cdot \frac{(1 + g_{s+1}) f_{s+1} + (k + \tau/\xi) g_{s+1} - s - 1}{(1 + \xi g_s) f_{s+1} + k - s - 1}. \tag{9.9} \]
(b) Now let $\xi = 0$, and introduce new variables $f_s$, $g_s$ by
\[ f_s = \alpha_s^{-1}, \qquad g_s = \tau\alpha_s + b_s + s + 1. \tag{9.10} \]
Then with the parameterization (9.5), the recurrence relation (9.1) has the following solution:
\[ f_s f_{s+1} = \frac{\tau g_s}{(g_s - s - 1)(g_s + k - s - 1)}, \tag{9.11} \]
\[ g_s + g_{s+1} = \frac{\tau}{f_{s+1}} - \frac{s + 1}{1 - f_{s+1}} - k + 2s + 3, \tag{9.12} \]
\[ \frac{\beta_{s+1}}{\beta_s} = \frac{\tau}{f_s (g_s + k - s - 1)}. \tag{9.13} \]
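Both parts of Theorem 9.3 can be tested by performing one step of the linear system (8.2), (8.3) with $\eta = 1$ and then reading off the new parameters from (9.5). The following is a minimal sketch under stated assumptions: the function names are hypothetical, the starting values of $b_s$, $\alpha_s$, $\beta_s$ are arbitrary generic rationals, and the matrix written $@$ in the text is called `Lam`.

```python
from fractions import Fraction as F

def mul(X, Y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(X[i][l] * Y[l][j] for l in range(2)) for j in range(2)]
            for i in range(2)]

def lin(X, Y, c=1):
    """The matrix X + c * Y."""
    return [[X[i][j] + c * Y[i][j] for j in range(2)] for i in range(2)]

def inv(X):
    """Inverse of a 2x2 matrix with nonzero determinant."""
    d = X[0][0] * X[1][1] - X[0][1] * X[1][0]
    return [[X[1][1] / d, -X[0][1] / d], [-X[1][0] / d, X[0][0] / d]]

def next_params(s, k, xi, tau, b, al, be):
    """(b, alpha, beta) at step s + 1, via (8.2) and (9.3) with eta = 1."""
    A = [[-(k + b), -(k + b) * al * be], [(k + b) / (al * be), k + b]]   # (9.5)
    C = [[b, b * be], [(tau - xi * b) / be, tau - xi * b]]
    Lam = [[1, 0], [0, xi]]
    X = lin(C, Lam, s + 1)                           # (s+1) Lam + C_s
    A1 = mul(inv(lin(X, mul(A, Lam))), mul(A, X))    # (8.2)
    C1 = lin(lin(C, mul(A, Lam)), mul(Lam, A1), -1)  # (9.3)
    b1 = C1[0][0]
    be1 = C1[0][1] / b1
    al1 = A1[0][1] / (A1[0][0] * be1)   # q / (p * beta) recovers alpha in (9.5)
    return b1, al1, be1

def check_part_a(s, k, xi, tau, b, al, be):
    """Test (9.7)-(9.9) for xi != 0."""
    b1, al1, be1 = next_params(s, k, xi, tau, b, al, be)
    f0 = -k - b + s / (1 - al)
    f1 = -k - b1 + (s + 1) / (1 - al1)
    g0, g1, r = -al, -al1, tau / xi
    e7 = f1 + f0 == -(k + r) + s / (1 + g0) + (r + s + 1) / (1 + xi * g0)
    e8 = g1 * g0 == (f1 - 1 - s) * (f1 - 1 - s + k) / (xi * f1 * (f1 + k + r))
    e9 = be1 / be == (-(xi * g0 / g1) * ((1 + g1) * f1 + (k + r) * g1 - s - 1)
                      / ((1 + xi * g0) * f1 + k - s - 1))
    return e7 and e8 and e9

def check_part_b(s, k, tau, b, al, be):
    """Test (9.11)-(9.13) for xi = 0."""
    b1, al1, be1 = next_params(s, k, F(0), tau, b, al, be)
    f0, f1 = 1 / al, 1 / al1
    g0 = tau * al + b + s + 1
    g1 = tau * al1 + b1 + s + 2
    e11 = f0 * f1 == tau * g0 / ((g0 - s - 1) * (g0 + k - s - 1))
    e12 = g0 + g1 == tau / f1 - (s + 1) / (1 - f1) - k + 2 * s + 3
    e13 = be1 / be == tau / (f0 * (g0 + k - s - 1))
    return e11 and e12 and e13
```

For example, with $s = 3$, $k = 2$, $\xi = 1/2$, $\tau = 3$ and $(b_s, \alpha_s, \beta_s) = (1, 1/3, 2)$ the step returns $(b_{s+1}, \alpha_{s+1}, \beta_{s+1}) = (19, 6/7, 1)$, and all three relations of part (a) hold.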
Remark 9.4. (a) If we set $f = f_s$, $\bar f = f_{s+1}$, $g = g_s$, $\bar g = g_{s+1}$, then the relations (9.7), (9.8) form a special case of the difference Painlevé V equation (d-$P_V$) of [17], §7. The parameters $\lambda, a_0, a_1, a_2, a_3, a_4$ in our case are as follows:
\[ a_0 = \tau/\xi + s + 1, \quad a_1 = s, \quad a_2 = -s, \quad a_3 = -(k + \tau/\xi), \quad a_4 = k, \quad \lambda = a_1 + 2a_2 + a_3 + a_4 + a_0 = 1. \]
(b) If we set $f = f_{s+1}$, $\bar f = f_s$, $g = g_{s+1}$, $\bar g = g_s$, then the relations (9.11), (9.12) form a special case of the difference Painlevé IV equation (d-$P_{IV}$) of [17], §7. The parameters $\lambda, a_0, a_1, a_2, a_3$ in our case are as follows:
\[ a_0 = -s - 2, \quad a_1 = 1, \quad a_2 = k, \quad a_3 = s + 2 - k, \quad \lambda = a_1 + a_2 + a_3 + a_0 = 1. \]
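Proof of Theorem 9.3. For the first part of the proof we do not need to distinguish between the cases where $\xi = 0$ and $\xi \neq 0$. Taking the residues of both sides of (9.1) at $\zeta = s + 1$ yields
\[ A_s \cdot \bigl( (s+1)@ + C_s \bigr) = \bigl( (s+1)@ + C_{s+1} \bigr) \cdot A_{s+1}, \]
i.e.,
\[ (k + b_s) \cdot \begin{pmatrix} -1 & -\alpha_s\beta_s \\ 1/(\alpha_s\beta_s) & 1 \end{pmatrix} \cdot \begin{pmatrix} s + 1 + b_s & b_s\beta_s \\ (\tau - \xi b_s)/\beta_s & \xi(s+1) + \tau - \xi b_s \end{pmatrix} = (k + b_{s+1}) \cdot \begin{pmatrix} s + 1 + b_{s+1} & b_{s+1}\beta_{s+1} \\ (\tau - \xi b_{s+1})/\beta_{s+1} & \xi(s+1) + \tau - \xi b_{s+1} \end{pmatrix} \cdot \begin{pmatrix} -1 & -\alpha_{s+1}\beta_{s+1} \\ 1/(\alpha_{s+1}\beta_{s+1}) & 1 \end{pmatrix}. \]
Comparing the diagonal terms on both sides, we get a system of two equations:
\[ (k + b_s) \cdot \bigl[ -(\tau - \xi b_s)\alpha_s - (s+1) - b_s \bigr] = (k + b_{s+1}) \cdot \bigl[ b_{s+1}/\alpha_{s+1} - (s+1) - b_{s+1} \bigr], \tag{9.14} \]
\[ (k + b_s) \cdot \bigl[ b_s/\alpha_s + \xi(s+1) + \tau - \xi b_s \bigr] = (k + b_{s+1}) \cdot \bigl[ -(\tau - \xi b_{s+1})\alpha_{s+1} + \xi(s+1) + \tau - \xi b_{s+1} \bigr]. \tag{9.15} \]
If we multiply (9.14) by $\alpha_{s+1}$, (9.15) by $\alpha_s$, and add the results, we obtain an equation which can be written as follows:
\[ (k + b_s) \cdot \bigl[ (1 - \xi\alpha_s)(1 - \alpha_{s+1}) b_s + \tau\alpha_s(1 - \alpha_{s+1}) + \xi(s+1)\alpha_s - (s+1)\alpha_{s+1} \bigr] = (k + b_{s+1}) \cdot \bigl[ (1 - \xi\alpha_s)(1 - \alpha_{s+1}) b_{s+1} + \tau\alpha_s(1 - \alpha_{s+1}) + \xi(s+1)\alpha_s - (s+1)\alpha_{s+1} \bigr]. \tag{9.16} \]
Now we subtract the LHS of the last equation from its RHS and divide the result by $(1 - \xi\alpha_s)(1 - \alpha_{s+1})(b_{s+1} - b_s)$; noting that $(k + b_{s+1}) b_{s+1} - (k + b_s) b_s = (k + b_{s+1} + b_s)(b_{s+1} - b_s)$, we get
\[ k + b_s + b_{s+1} + \frac{\tau\alpha_s + s + 1}{1 - \xi\alpha_s} - \frac{s + 1}{1 - \alpha_{s+1}} = 0. \tag{9.17} \]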
Let us now assume that $\xi \neq 0$. In this case, it is easy to see that with the notation (9.6), the last equation is equivalent to (9.7). To obtain (9.8), we divide (9.15) by (9.14), which yields
\[ \frac{b_s/\alpha_s + \xi(s+1) + \tau - \xi b_s}{-(\tau - \xi b_s)\alpha_s - (s+1) - b_s} = \frac{-(\tau - \xi b_{s+1})\alpha_{s+1} + \xi(s+1) + \tau - \xi b_{s+1}}{b_{s+1}/\alpha_{s+1} - (s+1) - b_{s+1}}. \tag{9.18} \]
From (9.6) and (9.17), we have
\begin{align*}
b_s/\alpha_s + \xi(s+1) + \tau - \xi b_s &= \frac{1}{\alpha_s} \bigl[ (1 - \xi\alpha_s) b_s + (\xi(s+1) + \tau)\alpha_s \bigr] = \frac{1 - \xi\alpha_s}{\alpha_s} (f_{s+1} - 1 - s), \\
-(\tau - \xi b_s)\alpha_s - (s+1) - b_s &= -\bigl[ (1 - \xi\alpha_s) b_s + \tau\alpha_s + s + 1 \bigr] = -(1 - \xi\alpha_s) f_{s+1}, \\
-(\tau - \xi b_{s+1})\alpha_{s+1} + \xi(s+1) + \tau - \xi b_{s+1} &= \tau(1 - \alpha_{s+1}) - \xi \cdot \bigl[ (1 - \alpha_{s+1}) b_{s+1} - (s+1) \bigr] = (1 - \alpha_{s+1})(\tau + \xi f_{s+1} + \xi k), \\
b_{s+1}/\alpha_{s+1} - (s+1) - b_{s+1} &= \frac{1}{\alpha_{s+1}} \cdot \bigl[ (1 - \alpha_{s+1}) b_{s+1} - (s+1)\alpha_{s+1} \bigr] = \frac{1 - \alpha_{s+1}}{\alpha_{s+1}} (s + 1 - f_{s+1} - k).
\end{align*}
This computation immediately implies that (9.18) is equivalent to (9.8). To complete the proof of part (a), we equate the (2, 1) elements of both sides of (9.3), which gives
\[ (\tau - \xi b_s)/\beta_s + (k + b_s)/(\alpha_s\beta_s) = (\tau - \xi b_{s+1})/\beta_{s+1} + \xi(k + b_{s+1})/(\alpha_{s+1}\beta_{s+1}). \]
We can rewrite the last equation as
\[ \frac{1}{\beta_s} \bigl[ (\tau - \xi b_s) + (k + b_s)/\alpha_s \bigr] = \frac{1}{\beta_{s+1}} \bigl[ (\tau - \xi b_{s+1}) + \xi(k + b_{s+1})/\alpha_{s+1} \bigr]. \]
It is easily seen to be equivalent to (9.9), by (9.6).

Now we assume that $\xi = 0$. Then (9.17) becomes
\[ k + b_s + b_{s+1} + \tau\alpha_s + s + 1 - \frac{s + 1}{1 - \alpha_{s+1}} = 0. \tag{9.19} \]
It is clear that with the notation (9.10), the last equation is equivalent to (9.12). To obtain (9.11), we divide (9.15) by (9.14), which gives
\[ \frac{b_s/\alpha_s + \tau}{-\tau\alpha_s - (s+1) - b_s} = \frac{-\tau\alpha_{s+1} + \tau}{b_{s+1}/\alpha_{s+1} - (s+1) - b_{s+1}}. \tag{9.20} \]
From (9.10) and (9.19), we have
\begin{align*}
b_s/\alpha_s + \tau &= \frac{1}{\alpha_s} (\tau\alpha_s + b_s) = \frac{1}{\alpha_s} (g_s - s - 1), \\
-\tau\alpha_s - b_s - (s+1) &= -g_s, \\
-\tau\alpha_{s+1} + \tau &= (1 - \alpha_{s+1})\tau, \\
b_{s+1}/\alpha_{s+1} - (s+1) - b_{s+1} &= \frac{1}{\alpha_{s+1}} \cdot \bigl[ (1 - \alpha_{s+1}) b_{s+1} - (s+1)\alpha_{s+1} \bigr] = \frac{1 - \alpha_{s+1}}{\alpha_{s+1}} (s + 1 - k - g_s).
\end{align*}
This computation immediately implies that (9.20) is equivalent to (9.11). Finally, we compare the (2, 1) elements of both sides of (9.3). This yields
\[ \frac{\tau}{\beta_s} + \frac{k + b_s}{\alpha_s\beta_s} = \frac{\tau}{\beta_{s+1}}, \quad \text{i.e.,} \quad \frac{\beta_{s+1}}{\beta_s} = \frac{\tau}{\alpha_s^{-1}(\tau\alpha_s + k + b_s)}, \]
which gives (9.13), completing the proof of part (b).
10. A Connection with the q-$P_{VI}$ Equation of M. Jimbo and H. Sakai

10.1 Reduction to the q-$P_{VI}$ system. In this section we show that the compatibility conditions for the Lax pairs corresponding to some of the families of polynomials orthogonal on q-lattices are equivalent to the q-$P_{VI}$ system of M. Jimbo and H. Sakai (Eqs. (19)–(20) in [10]) for an appropriate choice of the parameters, or to a certain degeneration of this system. Thus, we now assume that the orthogonality set is of the form $X = \{q^{-s}\}$, where $s$ runs either over $\mathbb{Z}_{\ge 0}$ or over $\{0, \dots, N\}$ ($N \in \mathbb{Z}_{\ge 0}$), and $|q| \neq 0, 1$. Hence we have $\sigma\zeta = q\zeta$ and $\eta = q$. Then the corresponding Lax pair has the following form:
\[ m_{s+1}(\zeta) = \Bigl( I + \frac{A_s}{\zeta - q^{-s}} \Bigr) m_s(\zeta), \qquad m_s(q\zeta) = M_s(\zeta) m_{s+1}(\zeta) D(\zeta)^{-1}, \tag{10.1} \]
and the compatibility condition is (cf. Eq. (5.5)):
\[ \Bigl( I + \frac{A_s}{q\zeta - q^{-s}} \Bigr) M_s(\zeta) = M_{s+1}(\zeta) \Bigl( I + \frac{A_{s+1}}{\zeta - q^{-s-1}} \Bigr). \tag{10.2} \]
To relate our situation to the one considered in [10], we make the following change of notation: $x := \zeta$, $t := q^{-s}$. Then we define
\[ A(x, t) = M_s(x) \bigl( (x - t) I + A_s \bigr), \tag{10.3} \]
\[ B_0(t) = -qtI - A_{s-1}, \tag{10.4} \]
\[ B(x, t) = \frac{x \bigl( xI + B_0(t) \bigr)}{(x - qt)^2}. \tag{10.5} \]

Theorem 10.1. (a) The compatibility condition (10.2) with $s$ replaced by $s - 1$ is equivalent to the following equation:
\[ A(x, qt) B(x, t) = B(qx, t) A(x, t). \tag{10.6} \]
(b) If the matrix $D(\zeta) = \operatorname{diag}(d_1(\zeta), d_2(\zeta))$ is linear in $\zeta$, so that $d_1(\zeta) = \lambda_1(\zeta - a_3)$, $d_2(\zeta) = \lambda_2(\zeta - a_4)$ with $\lambda_1, \lambda_2 \neq 0$, then the matrix $A(x, t)$ is quadratic in $x$, and if we write
\[ A(x, t) = A_0(t) + A_1(t) x + A_2 x^2, \tag{10.7} \]
we have:
\[ A_2 = \begin{pmatrix} \kappa_1 & 0 \\ 0 & \kappa_2 \end{pmatrix}, \qquad \kappa_1 = q^{k}\lambda_1, \quad \kappa_2 = q^{-k}\lambda_2, \tag{10.8} \]
\[ A_0(t) \text{ has eigenvalues } t\theta_1,\ t\theta_2, \tag{10.9} \]
\[ \det A(x, t) = \kappa_1\kappa_2 (x - t)^2 (x - a_3)(x - a_4), \tag{10.10} \]
and the parameters $\kappa_j$, $a_j$, $\theta_j$ are independent of $t$. Recall that the natural number $k$ in (10.8) defines the asymptotics of the solutions $m_s(\zeta)$ as $\zeta \to \infty$ (see Theorem 3.1).

Remark 10.2. Equation (10.6) can be viewed as the compatibility condition for the following pair of q-difference matrix equations, cf. [10]:
\[ n(x, qt) = B(x, t) n(x, t), \qquad n(qx, t) = A(x, t) n(x, t). \tag{10.11} \]
One way to construct a matrix $n(x, t)$ which solves the system (10.11) is as follows. Let $V(\zeta) = \operatorname{diag}(v_1(\zeta), v_2(\zeta))$ be a diagonal matrix such that $V(q\zeta) = D(\zeta)V(\zeta)$ for all $\zeta$. Then for all $s$, we define a new matrix $n_s(\zeta)$ by
\[ m_s(\zeta) = \begin{cases} \zeta^{s} q^{\binom{s}{2}} \prod_{i=-\infty}^{s-1} (q^{i}\zeta - 1)^{-1} \cdot n_s(\zeta) \cdot V(\zeta)^{-1} & \text{if } |q| > 1, \\[4pt] \zeta^{s} q^{\binom{s}{2}} \prod_{i=s}^{+\infty} (q^{i}\zeta - 1) \cdot n_s(\zeta) \cdot V(\zeta)^{-1} & \text{if } 0 < |q| < 1. \end{cases} \tag{10.12} \]
Substituting this into the first equation of the Lax pair (10.1), replacing $s$ by $s - 1$ and simplifying yields
\[ \zeta \cdot n_s(\zeta) = \bigl( (\zeta - q^{-s+1}) I + A_{s-1} \bigr) \cdot n_{s-1}(\zeta). \]
Since $A_{s-1}^2 = 0$, this is equivalent to
\[ n_{s-1}(\zeta) = \frac{\zeta \bigl( \zeta I + (-q^{-s+1} I - A_{s-1}) \bigr)}{(\zeta - q^{-s+1})^2} \, n_s(\zeta). \tag{10.13} \]
On the other hand, if we substitute the first equation of the Lax pair (10.1) into the second one, use (10.12) and simplify the result, we obtain
\[ n_s(q\zeta) = M_s(\zeta) \cdot \bigl( (\zeta - q^{-s}) I + A_s \bigr) n_s(\zeta). \tag{10.14} \]
Now if we let $x = \zeta$, $t = q^{-s}$ and $n(x, t) = n_s(\zeta)$, then with the notation (10.3), (10.4), (10.5) of Theorem 10.1, the system (10.13), (10.14) leads to (10.11).
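Proof of Theorem 10.1. (a) Since $A_{s-1}^2 = 0$, we have
\[ A(x, qt) B(x, t) = M_{s-1}(\zeta) \cdot \bigl( (\zeta - q^{-s+1}) I + A_{s-1} \bigr) \cdot \frac{\zeta \bigl( (\zeta - q^{-s+1}) I - A_{s-1} \bigr)}{(\zeta - q^{-s+1})^2} = \zeta M_{s-1}(\zeta) \]
and
\[ B(qx, t) A(x, t) = \frac{q\zeta \bigl( q(\zeta - q^{-s}) I - A_{s-1} \bigr)}{q^2 (\zeta - q^{-s})^2} \cdot M_s(\zeta) \cdot \bigl( (\zeta - q^{-s}) I + A_s \bigr) = \zeta \cdot \Bigl( I - \frac{q^{-1} A_{s-1}}{\zeta - q^{-s}} \Bigr) \cdot M_s(\zeta) \cdot \Bigl( I + \frac{A_s}{\zeta - q^{-s}} \Bigr). \]
Hence, if we multiply both sides of (10.6) by $\bigl( I - (\zeta - q^{-s})^{-1} q^{-1} A_{s-1} \bigr)^{-1} = I + (\zeta - q^{-s})^{-1} q^{-1} A_{s-1}$ and divide by $\zeta$, we obtain
\[ \Bigl( I + \frac{q^{-1} A_{s-1}}{\zeta - q^{-s}} \Bigr) \cdot M_{s-1}(\zeta) = M_s(\zeta) \cdot \Bigl( I + \frac{A_s}{\zeta - q^{-s}} \Bigr). \]
Replacing $s$ by $s + 1$ yields (10.2).

(b) Since the matrix $D(\zeta) = \operatorname{diag}(\lambda_1(\zeta - a_3), \lambda_2(\zeta - a_4))$ is linear in $\zeta$, it follows from Theorem 3.1(c) that we can write
\[ M_s(\zeta) = \begin{pmatrix} \kappa_1\zeta & 0 \\ 0 & \kappa_2\zeta \end{pmatrix} + C_s \]
for a constant matrix $C_s$, where $\kappa_1$, $\kappa_2$ are given by (10.8). This immediately implies that the matrix $A(x, t)$ is quadratic in $x$ with the leading coefficient $A_2 = \operatorname{diag}(\kappa_1, \kappa_2)$. Since $\det M_s(\zeta) = \det D(\zeta)$ for all $\zeta$ by Lemma 2.1, and the matrix $A_s$ is nilpotent, we see from (10.3) that $\det A(x, t) = \kappa_1\kappa_2 (x - t)^2 (x - a_3)(x - a_4)$. In particular, taking $x = 0$, we see that $\det A_0(t) = t^2\delta$, where $\delta$ is a constant independent of $t$. To complete the proof, it therefore suffices to show that $\operatorname{Tr} A_0(t) = t\tau$, where $\tau$ is a constant independent of $t$. To this end, we study the compatibility condition (10.2). Taking residues of both sides of (10.2) at $\zeta = q^{-s-1}$ yields
\[ A_s \cdot \Biggl( \begin{pmatrix} \kappa_1 q^{-s-1} & 0 \\ 0 & \kappa_2 q^{-s-1} \end{pmatrix} + C_s \Biggr) = q \cdot \Biggl( \begin{pmatrix} \kappa_1 q^{-s-1} & 0 \\ 0 & \kappa_2 q^{-s-1} \end{pmatrix} + C_{s+1} \Biggr) \cdot A_{s+1}. \tag{10.15} \]
If we compare the asymptotics of both sides of (10.2) as $\zeta \to \infty$, we get
\[ C_s + A_s \cdot \begin{pmatrix} \kappa_1 q^{-1} & 0 \\ 0 & \kappa_2 q^{-1} \end{pmatrix} = C_{s+1} + \begin{pmatrix} \kappa_1 & 0 \\ 0 & \kappa_2 \end{pmatrix} \cdot A_{s+1}. \tag{10.16} \]
Multiplying (10.16) by $q^{-s}$ and subtracting it from (10.15) gives
\[ (A_s - q^{-s} I) C_s = q C_{s+1} (A_{s+1} - q^{-s-1} I). \]
Since $A_0(t) = C_s (A_s - q^{-s} I)$, this shows that $\operatorname{Tr} A_0(qt) = q \cdot \operatorname{Tr} A_0(t)$ for all $t$.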
10.2 Solution of the q − PV I system. We are now in a position to quote a result of M. Jimbo and H. Sakai [10]. The situation considered in their work is more general: it is assumed that B(x, t) has the form x xI + B0 (t) B(x, t) = , (10.17) (x − qta1 )(x − qta2 ) and instead of (10.10), it is assumed that det A(x, t) = κ1 κ2 (x − ta1 )(x − ta2 )(x − a3 )(x − a4 ).
(10.18)
Thus, the situation of Theorem 10.1 corresponds to the specialization $a_1 = a_2 = 1$. Now we have

Theorem 10.3 (Jimbo–Sakai, [10]). Assume (10.17), (10.7), (10.8), (10.9) and (10.18). Also, suppose that $\kappa_j, \theta_j \neq 0$ $(j = 1, 2)$, $a_j \neq 0$ $(j = 1, 2, 3, 4)$ and $\kappa_1 \neq \kappa_2$.

(a) Define $y = y(t)$ and $z_i = z_i(t)$ $(i = 1, 2)$ by
$$A_{12}(y,t) = 0, \qquad A_{11}(y,t) = \kappa_1 z_1, \qquad A_{22}(y,t) = \kappa_2 z_2,$$
where $A_{ij}(x,t)$ are the elements of the matrix $A(x,t)$, so that $z_1 z_2 = (y - ta_1)(y - ta_2)(y - a_3)(y - a_4)$. In terms of $y, z_1, z_2$, the matrix $A(x,t)$ can be parameterized as follows:
$$A(x,t) = \begin{pmatrix} \kappa_1\bigl((x-y)(x-\alpha) + z_1\bigr) & \kappa_2 w\,(x-y) \\ \kappa_1 w^{-1}(\gamma x + \delta) & \kappa_2\bigl((x-y)(x-\beta) + z_2\bigr) \end{pmatrix},$$
where
$$\alpha = \frac{1}{\kappa_1 - \kappa_2}\Bigl[\,y^{-1}\bigl((\theta_1+\theta_2)t - \kappa_1 z_1 - \kappa_2 z_2\bigr) - \kappa_2\bigl((a_1+a_2)t + a_3 + a_4 - 2y\bigr)\Bigr],$$
$$\beta = \frac{1}{\kappa_1 - \kappa_2}\Bigl[-y^{-1}\bigl((\theta_1+\theta_2)t - \kappa_1 z_1 - \kappa_2 z_2\bigr) + \kappa_1\bigl((a_1+a_2)t + a_3 + a_4 - 2y\bigr)\Bigr],$$
$$\gamma = z_1 + z_2 + (y+\alpha)(y+\beta) + (\alpha+\beta)y - a_1a_2t^2 - (a_1+a_2)(a_3+a_4)t - a_3a_4,$$
$$\delta = y^{-1}\bigl(a_1a_2a_3a_4t^2 - (\alpha y + z_1)(\beta y + z_2)\bigr).$$

(b) Define $z = z(t)$ by
$$z_1 = \frac{(y - ta_1)(y - ta_2)}{q\kappa_1 z}, \qquad z_2 = q\kappa_1 (y - a_3)(y - a_4)\,z.$$
Introduce the notation $\bar y = y(qt)$, $\bar z = z(qt)$, $\bar w = w(qt)$, and set
$$b_1 = \frac{a_1a_2}{\theta_1}, \qquad b_2 = \frac{a_1a_2}{\theta_2}, \qquad b_3 = \frac{1}{q\kappa_1}, \qquad b_4 = \frac{1}{\kappa_2}.$$
Then the compatibility condition (10.6) is equivalent to the following system of equations:
$$\frac{y\bar y}{a_3a_4} = \frac{(\bar z - tb_1)(\bar z - tb_2)}{(\bar z - b_3)(\bar z - b_4)}, \tag{10.19}$$
$$\frac{z\bar z}{b_3b_4} = \frac{(y - ta_1)(y - ta_2)}{(y - a_3)(y - a_4)}, \tag{10.20}$$
$$\frac{\bar w}{w} = \frac{b_4}{b_3}\cdot\frac{\bar z - b_3}{\bar z - b_4}. \tag{10.21}$$
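The parameterization in part (a) can be sanity-checked with exact rational arithmetic. The following is only a minimal sketch under stated assumptions: the function name `jimbo_sakai_matrix`, the helper names and the sample parameter values are hypothetical, and the check covers just two consequences of the formulas for $\alpha, \beta, \gamma, \delta$, namely the determinant identity (10.18) and the trace $\operatorname{Tr} A(0,t) = (\theta_1+\theta_2)t$ implied by the eigenvalues of $A_0(t)$. It does not test the evolution equations (10.19)–(10.21).

```python
from fractions import Fraction as F


def jimbo_sakai_matrix(x, t, y, z1, w, kappa, theta, a):
    """Entries of A(x, t) in the parameterization of Theorem 10.3(a)."""
    k1, k2 = kappa
    a1, a2, a3, a4 = a
    # z2 is fixed by z1 * z2 = (y - t a1)(y - t a2)(y - a3)(y - a4)
    z2 = (y - t * a1) * (y - t * a2) * (y - a3) * (y - a4) / z1
    s = (theta[0] + theta[1]) * t - k1 * z1 - k2 * z2
    e = (a1 + a2) * t + a3 + a4 - 2 * y
    alpha = (s / y - k2 * e) / (k1 - k2)
    beta = (-s / y + k1 * e) / (k1 - k2)
    gamma = (z1 + z2 + (y + alpha) * (y + beta) + (alpha + beta) * y
             - a1 * a2 * t * t - (a1 + a2) * (a3 + a4) * t - a3 * a4)
    delta = (a1 * a2 * a3 * a4 * t * t
             - (alpha * y + z1) * (beta * y + z2)) / y
    return [[k1 * ((x - y) * (x - alpha) + z1), k2 * w * (x - y)],
            [k1 * (gamma * x + delta) / w, k2 * ((x - y) * (x - beta) + z2)]]


def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def target_det(x, t, kappa, a):
    """Right-hand side of (10.18)."""
    a1, a2, a3, a4 = a
    return (kappa[0] * kappa[1]
            * (x - t * a1) * (x - t * a2) * (x - a3) * (x - a4))


# Hypothetical sample values, chosen only so that no denominator vanishes.
params = dict(t=F(1, 3), y=F(2), z1=F(5, 7), w=F(3),
              kappa=(F(2), F(5)), theta=(F(1, 2), F(4)),
              a=(F(1), F(3), F(-2), F(7)))

det_ok = all(
    det2(jimbo_sakai_matrix(F(x), **params))
    == target_det(F(x), params["t"], params["kappa"], params["a"])
    for x in range(-3, 4))

A0 = jimbo_sakai_matrix(F(0), **params)
trace_ok = A0[0][0] + A0[1][1] == sum(params["theta"]) * params["t"]
```

Because all inputs are `Fraction` objects, both checks are exact rather than approximate.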
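Remark 10.4. Equations (10.19)–(10.21) allow us to compute $(\bar y, \bar z, \bar w)$ if we know $(y, z, w)$, and vice versa.

Remark 10.5. Unfortunately, it was beyond our technical abilities to follow the proof of Theorem 10.3 in [10]. However, we were able to verify the statement of the theorem using computer simulations with random values of the parameters $\kappa_j, a_j, \theta_j$.

10.3 Degeneration of $q$-$P_{VI}$. We have mentioned in Subsect. 7.5 that the compatibility conditions for Lax pairs corresponding to certain families of orthogonal polynomials are equivalent not to the $q$-$P_{VI}$ system but to a degeneration of it. We now describe this degeneration.

Theorem 10.6. With the notation of Theorem 10.1, suppose that the matrix $D(\zeta) = \mathrm{diag}(d_1(\zeta), d_2(\zeta))$ is such that $d_1(\zeta) = \lambda_1(\zeta - a_3^\circ)$, $d_2(\zeta) = \lambda_2$, where $\lambda_j, a_3^\circ \neq 0$. Then

(a) The matrix $A(x,t)$ is quadratic in $x$, and if we write $A(x,t) = A_0(t) + A_1(t)x + A_2x^2$, we have:
$$A_2 = \begin{pmatrix} \kappa_1^\circ & 0 \\ 0 & 0 \end{pmatrix}, \tag{10.22}$$
$$A_0(t) \text{ has eigenvalues } t\theta_1^\circ,\ t\theta_2^\circ, \qquad \kappa_1^\circ = q^k\lambda_1, \tag{10.23}$$
$$\det A(x,t) = \kappa_1^\circ\kappa_2^\circ\,(x - t)^2(x - a_3^\circ), \qquad \kappa_2^\circ = q^{-k}\lambda_2, \tag{10.24}$$
and the parameters $\kappa_j^\circ, \theta_j^\circ, a_3^\circ$ are independent of $t$.

(b) We can parameterize the matrix $A(x,t)$ as follows:
$$A(x,t) = \begin{pmatrix} \kappa_1^\circ\bigl((x - y^\circ)(x - \alpha^\circ) + z_1^\circ\bigr) & \kappa_2^\circ w^\circ (x - y^\circ) \\ \kappa_1^\circ (w^\circ)^{-1}(\gamma^\circ x + \delta^\circ) & \kappa_2^\circ (x - y^\circ + z_2^\circ) \end{pmatrix},$$
where
$$\alpha^\circ = \frac{1}{\kappa_1^\circ}\Bigl[(y^\circ)^{-1}\bigl((\theta_1^\circ + \theta_2^\circ)t - \kappa_1^\circ z_1^\circ - \kappa_2^\circ z_2^\circ\bigr) + \kappa_2^\circ\Bigr], \tag{10.25}$$
$$\gamma^\circ = z_2^\circ - (y^\circ + \alpha^\circ) - y^\circ + 2t + a_3^\circ, \tag{10.26}$$
$$\delta^\circ = (y^\circ)^{-1}\bigl(-a_3^\circ t^2 + (\alpha^\circ y^\circ + z_1^\circ)(y^\circ - z_2^\circ)\bigr).$$
Define $z^\circ = z^\circ(t)$ by
$$z_1^\circ = \frac{(y^\circ - t)^2}{q\kappa_1^\circ z^\circ}, \qquad z_2^\circ = q\kappa_1^\circ (y^\circ - a_3^\circ)\,z^\circ.$$
Introduce the notation $\bar y^\circ = y^\circ(qt)$, $\bar z^\circ = z^\circ(qt)$, $\bar w^\circ = w^\circ(qt)$, and set
$$b_1^\circ = \frac{1}{\theta_1^\circ}, \qquad b_2^\circ = \frac{1}{\theta_2^\circ}, \qquad b_3^\circ = \frac{1}{q\kappa_1^\circ}.$$
Then the compatibility condition (10.6) is equivalent to the following system of equations:
$$\frac{y^\circ \bar y^\circ}{\kappa_2^\circ a_3^\circ} = \frac{(\bar z^\circ - tb_1^\circ)(\bar z^\circ - tb_2^\circ)}{\bar z^\circ - b_3^\circ}, \tag{10.27}$$
$$\frac{z^\circ \bar z^\circ}{b_3^\circ} = \frac{(y^\circ - t)^2}{\kappa_2^\circ (y^\circ - a_3^\circ)}, \tag{10.28}$$
$$\frac{\bar w^\circ}{w^\circ} = \frac{b_3^\circ - \bar z^\circ}{b_3^\circ}. \tag{10.29}$$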
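As a consistency check, here is a sketch of how two equations of the Jimbo–Sakai system degenerate under $\kappa_2 \to 0$ with $b_4 = 1/\kappa_2$, $\kappa_2 a_4 \to -\kappa_2^\circ$ and $\kappa_2 w \to \kappa_2^\circ w^\circ$ (the limit transition used in the proof of part (b)). This is only an illustration of the mechanism for (10.21) and (10.20), not a replacement for the full verification. Since $b_4 = 1/\kappa_2$,
$$\frac{\bar w}{w} = \frac{b_4}{b_3}\cdot\frac{\bar z - b_3}{\bar z - b_4} = \frac{1}{b_3}\cdot\frac{\bar z - b_3}{\kappa_2\bar z - 1} \;\longrightarrow\; \frac{b_3^\circ - \bar z^\circ}{b_3^\circ}, \qquad \frac{\bar w}{w} = \frac{\kappa_2\bar w}{\kappa_2 w} \;\longrightarrow\; \frac{\bar w^\circ}{w^\circ},$$
which is (10.29). Likewise, with $a_1 = a_2 = 1$, multiplying (10.20) by $1/\kappa_2$ and using $\kappa_2(y - a_4) \to \kappa_2^\circ$ gives
$$\frac{z\bar z}{b_3} = \frac{(y-t)^2}{(y - a_3)\,\kappa_2 (y - a_4)} \;\longrightarrow\; \frac{(y^\circ - t)^2}{\kappa_2^\circ (y^\circ - a_3^\circ)},$$
which is (10.28).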
Proof. (a) The proof of this part is almost identical to that of Theorem 10.1(b) and will therefore be omitted. (b) One checks directly that all formulas of [10] are compatible with the following limit transition:
$$\kappa_1 \to \kappa_1^\circ,\quad a_1, a_2 \to 1,\quad \theta_1 \to \theta_1^\circ,\quad \theta_2 \to \theta_2^\circ,\quad a_3 \to a_3^\circ,\quad \alpha \to \alpha^\circ,\quad y \to y^\circ,\quad z_1 \to z_1^\circ,\quad z \to z^\circ,$$
$$\kappa_2 \to 0,\quad \kappa_2 a_4 \to -\kappa_2^\circ,\quad \kappa_2\beta \to -\kappa_2^\circ,\quad \kappa_2\gamma \to \kappa_2^\circ\gamma^\circ,\quad \kappa_2\delta \to \kappa_2^\circ\delta^\circ,\quad \kappa_2 z_2 \to \kappa_2^\circ z_2^\circ,\quad \kappa_2 w \to \kappa_2^\circ w^\circ.$$
This limit transition takes the parameterization of $A(x,t)$ given in Theorem 10.3(a) to the parameterization (10.26), and it takes the system (10.19)–(10.21) to the system (10.27)–(10.29).

11. Applications: Recurrence Relations for Some Polynomials of the Askey Scheme

11.1 Notation. In this section we illustrate the results of Sects. 3–10 by considering several specific examples: Charlier polynomials, Meixner polynomials, Krawtchouk polynomials, q-Charlier polynomials, little q-Laguerre/Wall polynomials, alternative q-Charlier polynomials, little q-Jacobi polynomials and q-Krawtchouk polynomials. In the first four cases we solve the compatibility condition explicitly, we write down a recurrence relation for the corresponding Fredholm determinants in terms of the solution, and we provide the initial conditions for all of our recurrence relations. This is done in Subsects. 11.2, 11.4, 11.5 and 11.6. For the other four families, we content ourselves with making some general remarks in Subsect. 11.7.

The general notation of this section is that of Sects. 4 and 5. Recall that we are considering a family $\{P_n(\zeta)\}_{n=0}^N$ of monic polynomials orthogonal on a discrete, but not necessarily locally finite subset $X = \{\pi_x\}_{x=0}^N$ of $\mathbb{R}$ (where $N \in \mathbb{Z}_{\ge 0} \cup \{\infty\}$), with respect to a strictly positive weight function $\omega : X \to \mathbb{R}$. If $k$ is a natural number, $k \le N$, we can consider a kernel $K$ on $X \times X$ defined by the formula
$$K(\pi_x, \pi_y) = \begin{cases} \|P_{k-1}\|_\omega^{-2}\,\sqrt{\omega(x)\omega(y)}\;\dfrac{P_k(\pi_x)P_{k-1}(\pi_y) - P_{k-1}(\pi_x)P_k(\pi_y)}{\pi_x - \pi_y}, & x \neq y, \\[2ex] \|P_{k-1}\|_\omega^{-2}\,\omega(x)\,\bigl(P_k'(\pi_x)P_{k-1}(\pi_x) - P_{k-1}'(\pi_x)P_k(\pi_x)\bigr), & x = y, \end{cases} \tag{11.1}$$
where $\|P_{k-1}\|_\omega = (P_{k-1}, P_{k-1})_\omega^{1/2}$ denotes the norm of $P_{k-1}(\zeta)$ with respect to the inner product defined by $\omega$. Up to conjugation, this coincides with the kernel introduced in the beginning of Sect. 4 (see Eq. (4.1)). For all $k \le s \le N$, we define a subset $Y_s = \{\pi_x\}_{x=s}^N \subseteq X$, and we are interested in the Fredholm determinants
$$D_s = \det\bigl(1 - K|_{Y_s \times Y_s}\bigr). \tag{11.2}$$
11.2 Charlier polynomials ([13], §1.12). The $n$th Charlier polynomial is defined by
$$C_n(x; a) = {}_2F_0\!\left(-n, -x;\ -;\ -\frac{1}{a}\right). \tag{11.3}$$
These polynomials satisfy the orthogonality relation
$$\sum_{x=0}^{\infty} \frac{a^x}{x!}\, C_m(x; a)\,C_n(x; a) = a^{-n} e^a\, n!\,\delta_{mn}, \tag{11.4}$$
where $a > 0$. Thus the orthogonality set for Charlier polynomials is $X = \mathbb{Z}_{\ge 0}$, and the weight function is $\omega(x) = a^x/x!$. The polynomial $C_n(x; a)$ is not monic in general; in fact, its leading coefficient is $(-a)^{-n}$. Hence the corresponding family of orthogonal polynomials (recall that the orthogonal polynomials that we use are monic, see Sect. 2) is $\{P_n(\zeta) = (-a)^n C_n(\zeta; a)\}_{n=0}^\infty$. We call $P_n(\zeta)$ the $n$th normalized Charlier polynomial. Now from (11.4), we find that $(P_n, P_n)_\omega = a^n e^a n!$ for all $n \ge 0$. After these preliminaries, we can state our main result for Charlier polynomials.

Theorem 11.1. If $K$ is the kernel (11.1) corresponding to the family $\{P_n(\zeta)\}_{n=0}^\infty$ of normalized Charlier polynomials, then the Fredholm determinants $D_s$ defined by (11.2) can be computed from the following recurrence relation:
$$\frac{D_{s+2}}{D_{s+1}} - \frac{D_{s+1}}{D_s} = \frac{a^{s-1}}{(s+1)!}\cdot\frac{f_s^2}{e_s}\cdot(g_s - s - 1)\cdot h_s^2. \tag{11.5}$$
Here, the scalar sequences $\{e_s\}_{s\ge k}$, $\{f_s\}_{s\ge k}$, $\{g_s\}_{s\ge k}$ and $\{h_s\}_{s\ge k}$ satisfy the following recurrence relations:
$$e_{s+1} = \frac{a\,e_s}{f_s\,(g_s + k - s - 1)}, \tag{11.6}$$
$$f_{s+1} = \frac{a\,g_s}{f_s\,(g_s - s - 1)(g_s + k - s - 1)}, \tag{11.7}$$
$$g_{s+1} = \frac{a}{f_{s+1}} - \frac{s+1}{1 - f_{s+1}} - g_s - k + 2s + 3, \tag{11.8}$$
$$h_{s+1} = a^{-1}\cdot f_s\cdot(g_s - s - 1)\cdot h_s. \tag{11.9}$$
The initial conditions for the recurrence relations (11.5)–(11.9) are given by
$$D_k = e^{-ak}, \qquad D_{k+1} = e^{-ak}\cdot I(-k; 1; -a), \tag{11.10}$$
$$e_k = \frac{a^k\cdot(k-1)!}{I(1-k; 1; -a)}, \tag{11.11}$$
$$f_k = -\frac{a\cdot I(1-k; 2; -a)}{I(1-k; 1; -a)}, \tag{11.12}$$
$$g_k = k + 1 - \frac{(k+1)\cdot I(1-k; 1; -a)\cdot I(-k; 2; -a)}{I(-k; 1; -a)\cdot I(1-k; 2; -a)}, \tag{11.13}$$
$$h_k = k!, \tag{11.14}$$
where $I(u; w; z) = {}_1F_1(u;\ w;\ z)$.

Remark 11.2. As we have seen in Sect. 9 (cf. Remark 9.4(b)), the recurrence relations (11.7), (11.8) form a special case of the d-$P_{IV}$ equation of [17].

Proof. The proof is a straightforward application of our previous results. For the reader's convenience we provide some remarks; the same ones apply to Theorems 11.4 and 11.6, and hence the proofs of those results will be omitted. To make the notation of the present section more uniform, we have been writing $e_s$ for $\beta_s$ and $h_s$ for $m_s^{11}$, where $\{\beta_s\}$ and $\{m_s^{11}\}$ are the scalar sequences defined in Sects. 9 and 4, respectively. The symbols $f_s$ and $g_s$ have the same meaning as in Sect. 9. Then the recurrence relations (11.5)–(11.9) are obtained directly from Theorem 4.2(b) and Theorem 9.3(b). To find the initial conditions, one uses the definitions of $b_k, \alpha_k, \beta_k, f_k, g_k$, together with Propositions 6.1, 6.2, 6.6, and the obvious identities $h_k = m_k^{11} = k!$, $b_k\beta_k = -q_k$, which follow from (6.1), (9.4).
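Theorem 11.1 lends itself to a direct check in exact arithmetic. The sketch below is an illustration under stated assumptions, not the authors' Maple code: all function names are hypothetical, and the reference values of $D_{s+1}/D_s$ are obtained from the standard fact that, for a rank-$k$ orthogonal polynomial projection kernel, $D_s$ is proportional (with an $s$-independent constant) to the determinant of the truncated moment matrix $\bigl[\sum_{x<s} x^{i+j}\omega(x)\bigr]_{0\le i,j<k}$. Only ratios of determinants are compared, so the factor $e^{-ak}$ never appears.

```python
from fractions import Fraction
from math import factorial


def hyp1f1(n, w, z):
    """Terminating series 1F1(-n; w; z) for an integer n >= 0."""
    total, term = Fraction(0), Fraction(1)
    for j in range(n + 1):
        total += term
        term = term * (-n + j) / (w + j) * z / (j + 1)
    return total


def det(rows):
    """Determinant by exact Gaussian elimination over Fractions."""
    m = [row[:] for row in rows]
    n, d = len(m), Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if m[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            d = -d
        d *= m[c][c]
        for r in range(c + 1, n):
            factor = m[r][c] / m[c][c]
            for j in range(c, n):
                m[r][j] -= factor * m[c][j]
    return d


def truncated_gram(k, s, a):
    """det [sum_{x<s} x^(i+j) a^x / x!], 0 <= i, j < k (proportional to D_s)."""
    return det([[sum(Fraction(x) ** (i + j) * a ** x / factorial(x)
                     for x in range(s))
                 for j in range(k)] for i in range(k)])


def check_theorem_11_1(k, a, steps):
    """Compare both sides of (11.5) for s = k, ..., k + steps - 1."""
    I = lambda n, w: hyp1f1(n, w, -a)                 # I(-n; w; -a)
    e = a ** k * factorial(k - 1) / I(k - 1, 1)       # (11.11)
    f = -a * I(k - 1, 2) / I(k - 1, 1)                # (11.12)
    g = k + 1 - (k + 1) * I(k - 1, 1) * I(k, 2) / (I(k, 1) * I(k - 1, 2))
    h = Fraction(factorial(k))                        # (11.14)
    results = []
    for s in range(k, k + steps):
        m0, m1, m2 = (truncated_gram(k, s + j, a) for j in range(3))
        lhs = m2 / m1 - m1 / m0
        rhs = a ** (s - 1) / factorial(s + 1) * f * f / e * (g - s - 1) * h * h
        results.append(lhs == rhs)
        f_next = a * g / (f * (g - s - 1) * (g + k - s - 1))              # (11.7)
        g_next = a / f_next - (s + 1) / (1 - f_next) - g - k + 2 * s + 3  # (11.8)
        e = a * e / (f * (g + k - s - 1))                                 # (11.6)
        h = f * (g - s - 1) * h / a                                       # (11.9)
        f, g = f_next, g_next
    return results
```

Calling `check_theorem_11_1(2, Fraction(1), 2)` compares the two sides of (11.5) at $s = 2, 3$ for $k = 2$, $a = 1$; passing `a` as a `Fraction` keeps every intermediate quantity rational.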
11.3. We now illustrate the concluding remark of Sect. 3 by showing how one can use Proposition 3.3 to obtain difference equations satisfied by orthogonal polynomials.

Proposition 11.3 (cf. [13], §1.12, Eq. (1.12.5)). The $k$th normalized Charlier polynomial $P_k(\zeta)$ solves the following difference equation:
$$-kP_k(\zeta) = aP_k(\zeta + 1) - (\zeta + a)P_k(\zeta) + \zeta P_k(\zeta - 1). \tag{11.15}$$
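Equation (11.15) is easy to test numerically before reading the proof. The following minimal sketch (function names are hypothetical) evaluates the normalized Charlier polynomial $P_k(\zeta) = (-a)^k C_k(\zeta; a)$ from the terminating series (11.3) in exact rational arithmetic and compares the two sides of (11.15) at a given point.

```python
from fractions import Fraction


def charlier_monic(k, zeta, a):
    """P_k(zeta) = (-a)^k * 2F0(-k, -zeta; -; -1/a), a finite sum of k+1 terms."""
    total, term = Fraction(0), Fraction(1)
    for j in range(k + 1):
        total += term
        # ratio of consecutive series terms: (-k+j)(-zeta+j)/(j+1) * (-1/a)
        term = term * (-k + j) * (-zeta + j) / (j + 1) * (-1 / a)
    return (-a) ** k * total


def difference_equation_holds(k, zeta, a):
    """Check -k P_k(z) == a P_k(z+1) - (z+a) P_k(z) + z P_k(z-1) exactly."""
    p = lambda z: charlier_monic(k, z, a)
    lhs = -k * p(zeta)
    rhs = a * p(zeta + 1) - (zeta + a) * p(zeta) + zeta * p(zeta - 1)
    return lhs == rhs
```

For example, `charlier_monic(1, zeta, a)` reduces to `zeta - a`, which is the monic polynomial of degree one used in the text.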
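Proof. We use the notation of Proposition 3.3. Recall that in the case of Charlier polynomials, we have $D(\zeta) = \mathrm{diag}(\zeta, a)$. We also observe that from the proof of Theorem 2.4 it follows that the matrix $m_X(\zeta)$ has a full asymptotic expansion in $\zeta$ as $\zeta \to \infty$; in particular, we can write, by (2.4),
$$m_X(\zeta)\cdot\begin{pmatrix} \zeta^{-k} & 0 \\ 0 & \zeta^{k} \end{pmatrix} = I + \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}\cdot\zeta^{-1} + O(\zeta^{-2}).$$
Therefore, as $\zeta \to \infty$, we have
$$\begin{aligned} M(\zeta) &= m_X(\zeta - 1)\cdot D(\zeta)\cdot m_X^{-1}(\zeta) \\ &= m_X(\zeta - 1)\cdot\begin{pmatrix} \zeta^{-k} & 0 \\ 0 & \zeta^{k} \end{pmatrix}\cdot\begin{pmatrix} \zeta & 0 \\ 0 & a \end{pmatrix}\cdot\begin{pmatrix} \zeta^{k} & 0 \\ 0 & \zeta^{-k} \end{pmatrix}\cdot m_X^{-1}(\zeta) \\ &= \left[m_X(\zeta - 1)\cdot\begin{pmatrix} (\zeta-1)^{-k} & 0 \\ 0 & (\zeta-1)^{k} \end{pmatrix}\right]\cdot\begin{pmatrix} (1 - 1/\zeta)^{k} & 0 \\ 0 & (1 - 1/\zeta)^{-k} \end{pmatrix}\cdot\begin{pmatrix} \zeta & 0 \\ 0 & a \end{pmatrix} \times \left[m_X(\zeta)\cdot\begin{pmatrix} \zeta^{-k} & 0 \\ 0 & \zeta^{k} \end{pmatrix}\right]^{-1} \\ &= \begin{pmatrix} 1 + \alpha\zeta^{-1} & \beta\zeta^{-1} \\ \gamma\zeta^{-1} & 1 + \delta\zeta^{-1} \end{pmatrix}\cdot\begin{pmatrix} \zeta - k & 0 \\ 0 & a \end{pmatrix}\cdot\begin{pmatrix} 1 - \alpha\zeta^{-1} & -\beta\zeta^{-1} \\ -\gamma\zeta^{-1} & 1 - \delta\zeta^{-1} \end{pmatrix} + O(\zeta^{-1}) \\ &= \begin{pmatrix} \zeta - k & -\beta \\ \gamma & a \end{pmatrix} + O(\zeta^{-1}). \end{aligned}$$
Since $M(\zeta)$ is entire by Proposition 3.3, the last term $O(\zeta^{-1})$ is identically zero by Liouville's theorem. Hence the system of Eqs. (3.5), (3.6) takes the following form:
$$\zeta\cdot P_k(\zeta - 1) = (\zeta - k)\cdot P_k(\zeta) - \beta\cdot cP_{k-1}(\zeta), \tag{11.16}$$
$$\zeta\cdot cP_{k-1}(\zeta - 1) = \gamma\cdot P_k(\zeta) + a\cdot cP_{k-1}(\zeta). \tag{11.17}$$
Note that since $\det M(\zeta) = \det D(\zeta)$ for all $\zeta$ by Lemma 2.1, we have $\beta\gamma = ak$; in particular, $\beta \neq 0$. Now from (11.16), we find that
$$cP_{k-1}(\zeta) = \frac{1}{\beta}\cdot\bigl[(\zeta - k)P_k(\zeta) - \zeta P_k(\zeta - 1)\bigr].$$
Substituting this into (11.17), multiplying the result by $\beta$ and using $\beta\gamma = ak$, we obtain
$$\zeta\cdot\bigl[(\zeta - 1 - k)P_k(\zeta - 1) - (\zeta - 1)P_k(\zeta - 2)\bigr] = ak\cdot P_k(\zeta) + a\cdot\bigl[(\zeta - k)P_k(\zeta) - \zeta P_k(\zeta - 1)\bigr].$$
Dividing the last equation by $\zeta$ and replacing $\zeta$ by $\zeta + 1$, we find that it is equivalent to (11.15).

11.4 Meixner polynomials ([13], §1.9). The $n$th Meixner polynomial is defined by
$$M_n(x; \beta, c) = {}_2F_1\!\left(-n, -x;\ \beta;\ 1 - \frac{1}{c}\right). \tag{11.18}$$
These polynomials satisfy the orthogonality relation
$$\sum_{x=0}^{\infty}\frac{(\beta)_x}{x!}\,c^x\,M_m(x; \beta, c)\,M_n(x; \beta, c) = \frac{c^{-n}\,n!}{(\beta)_n\,(1-c)^{\beta}}\,\delta_{mn}, \tag{11.19}$$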
where $\beta > 0$ and $0 < c < 1$. Thus the orthogonality set for Meixner polynomials is $X = \mathbb{Z}_{\ge 0}$, and the weight function is $\omega(x) = \frac{(\beta)_x}{x!}c^x$. The leading coefficient of the polynomial $M_n(x; \beta, c)$ is $\frac{(1 - 1/c)^n}{(\beta)_n}$, so the corresponding family of monic orthogonal polynomials is $\{P_n(\zeta) = (\beta)_n(1 - 1/c)^{-n}M_n(\zeta; \beta, c)\}_{n=0}^\infty$. We call $P_n(\zeta)$ the $n$th normalized Meixner polynomial. Now from (11.19), we find that $(P_n, P_n)_\omega = \frac{(\beta)_n c^n n!}{(1-c)^{\beta + 2n}}$ for all $n \ge 0$. Then we have

Theorem 11.4. If $K$ is the kernel (11.1) corresponding to the family $\{P_n(\zeta)\}_{n=0}^\infty$ of normalized Meixner polynomials, then the Fredholm determinants $D_s$ defined by (11.2) can be computed from the following recurrence relation:
$$\frac{D_{s+2}}{D_{s+1}} - \frac{D_{s+1}}{D_s} = \frac{(\beta)_s\,c^{s-1}}{(\beta + s)\,(s+1)!}\cdot\frac{1 + cg_s}{e_s g_s^2}\cdot\bigl[(1 + cg_s)f_{s+1} - s - 1\bigr]\cdot h_s^2. \tag{11.20}$$
Here, the scalar sequences $\{e_s\}_{s\ge k}$, $\{f_s\}_{s\ge k}$, $\{g_s\}_{s\ge k}$ and $\{h_s\}_{s\ge k}$ satisfy the following recurrence relations:
$$e_{s+1} = -\frac{c\,e_s g_s}{g_{s+1}}\cdot\frac{(1 + g_{s+1})f_{s+1} + (\beta + k - 1)g_{s+1} - s - 1}{(1 + cg_s)f_{s+1} + k - s - 1}, \tag{11.21}$$
$$f_{s+1} = 1 - \beta - k - f_s + \frac{s}{1 + g_s} + \frac{\beta + s}{1 + cg_s}, \tag{11.22}$$
$$g_{s+1} = \frac{(f_{s+1} - 1 - s)(f_{s+1} - 1 - s + k)}{c\,g_s f_{s+1}\,(f_{s+1} + \beta + k - 1)}, \tag{11.23}$$
$$h_{s+1} = \frac{(1 + cg_s)(s + 1 - f_{s+1})}{c\,(\beta + s)\,g_s}\cdot h_s. \tag{11.24}$$
The initial conditions for the recurrence relations (11.20)–(11.24) are given by
$$D_k = (1 - c)^{k(\beta + k - 1)}, \qquad D_{k+1} = \frac{(\beta)_k}{k!}\cdot c^k\cdot(1 - c)^{k(\beta + k - 1)}\cdot F(-k, -k; \beta; 1/c), \tag{11.25}$$
$$e_k = \frac{\beta c\cdot(k-1)!^2}{F(1-k, 1-k; 1+\beta; 1/c)}, \tag{11.26}$$
$$f_k = 0, \tag{11.27}$$
$$g_k = \frac{k}{\beta c}\cdot\frac{F(1-k, 1-k; 1+\beta; 1/c)}{F(-k, 1-k; \beta; 1/c)}, \tag{11.28}$$
$$h_k = k!, \tag{11.29}$$
where $F(u, v; w; z) = {}_2F_1(u, v;\ w;\ z)$.

Proof. See the proof of Theorem 11.1.
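For $k = 1$ the kernel has rank one, so $D_s$ is proportional to the partial sum $\sum_{x<s}\omega(x)$ and the hypergeometric factors in (11.26)–(11.28) all equal 1. Under these assumptions the recurrences (11.20)–(11.24) can be checked exactly. The sketch below is limited to that special case; the function names are hypothetical, and it is not a general-$k$ implementation.

```python
from fractions import Fraction
from math import factorial


def pochhammer(b, n):
    """Rising factorial (b)_n."""
    out = Fraction(1)
    for j in range(n):
        out *= b + j
    return out


def check_meixner_k1(beta, c, steps):
    """Compare both sides of (11.20) for k = 1 and s = 1, ..., steps."""
    k = 1
    # For k = 1, D_s is proportional to this partial sum of the weight.
    partial = lambda s: sum(pochhammer(beta, x) * c ** x / factorial(x)
                            for x in range(s))
    # Initial conditions (11.26)-(11.29) with all 2F1 factors equal to 1.
    e, f, g, h = beta * c, Fraction(0), 1 / (beta * c), Fraction(1)
    results = []
    for s in range(k, k + steps):
        f_next = (1 - beta - k - f + s / (1 + g)
                  + (beta + s) / (1 + c * g))                          # (11.22)
        lhs = partial(s + 2) / partial(s + 1) - partial(s + 1) / partial(s)
        rhs = (pochhammer(beta, s) * c ** (s - 1)
               / ((beta + s) * factorial(s + 1))
               * (1 + c * g) / (e * g * g)
               * ((1 + c * g) * f_next - s - 1) * h * h)               # (11.20)
        results.append(lhs == rhs)
        g_next = ((f_next - 1 - s) * (f_next - 1 - s + k)
                  / (c * g * f_next * (f_next + beta + k - 1)))        # (11.23)
        e = (-c * e * g / g_next
             * ((1 + g_next) * f_next + (beta + k - 1) * g_next - s - 1)
             / ((1 + c * g) * f_next + k - s - 1))                     # (11.21)
        h = (1 + c * g) * (s + 1 - f_next) / (c * (beta + s) * g) * h  # (11.24)
        f, g = f_next, g_next
    return results
```

With `beta = Fraction(1)` and `c = Fraction(1, 2)` the weight is simply $2^{-x}$, which makes the first few steps easy to follow by hand.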
Remark 11.5. As we have seen in Sect. 9 (cf. Remark 9.4(a)), the recurrence relations (11.22), (11.23) form a special case of the d-$P_V$ equation of [17].

11.5 Krawtchouk polynomials ([13], §1.10). The $n$th Krawtchouk polynomial is defined by
$$K_n(x; p, N) = {}_2F_1\!\left(-n, -x;\ -N;\ \frac{1}{p}\right). \tag{11.30}$$
These polynomials satisfy the orthogonality relation
$$\sum_{x=0}^{N}\binom{N}{x}p^x(1-p)^{N-x}\,K_m(x; p, N)\,K_n(x; p, N) = \frac{(-1)^n\,n!}{(-N)_n}\left(\frac{1-p}{p}\right)^{n}\delta_{mn}, \tag{11.31}$$
where $N \in \mathbb{Z}_{\ge 0}$ and $0 < p < 1$. Thus the orthogonality set for Krawtchouk polynomials is $X = \{0, \ldots, N\}$, and the weight function is $\omega(x) = \binom{N}{x}p^x(1-p)^{N-x}$. The leading coefficient of the polynomial $K_n(x; p, N)$ is $(-N)_n^{-1}p^{-n}$, so the corresponding family of monic orthogonal polynomials is $\{P_n(\zeta) = (-N)_n p^n K_n(\zeta; p, N)\}_{n=0}^N$. We call $P_n(\zeta)$ the $n$th normalized Krawtchouk polynomial. Now from (11.31), we find that $(P_n, P_n)_\omega = (-1)^n n!\,(-N)_n\,p^n(1-p)^n$ for all $0 \le n \le N$. Then we have

Theorem 11.6. If $K$ is the kernel (11.1) corresponding to the family $\{P_n(\zeta)\}_{n=0}^N$ of normalized Krawtchouk polynomials, then the Fredholm determinants $D_s$ defined by (11.2) can be computed from the following recurrence relation:
$$\frac{D_{s+2}}{D_{s+1}} - \frac{D_{s+1}}{D_s} = \binom{N}{s+1}\frac{p^{s-1}(1-p)^{N-s+1}}{(N-s)^2}\cdot\frac{1 + pg_s/(p-1)}{e_s g_s^2}\cdot\Bigl[\bigl(1 + pg_s/(p-1)\bigr)f_{s+1} - s - 1\Bigr]\cdot h_s^2. \tag{11.32}$$
Here, the scalar sequences $\{e_s\}_{s\ge k}$, $\{f_s\}_{s\ge k}$, $\{g_s\}_{s\ge k}$ and $\{h_s\}_{s\ge k}$ satisfy the following recurrence relations:
$$e_{s+1} = \frac{p\,e_s g_s}{(1-p)\,g_{s+1}}\cdot\frac{(1 + g_{s+1})f_{s+1} + (k - N - 1)g_{s+1} - s - 1}{\bigl(1 + pg_s/(p-1)\bigr)f_{s+1} + k - s - 1}, \tag{11.33}$$
$$f_{s+1} = N + 1 - k - f_s + \frac{s}{1 + g_s} + \frac{(1-p)(N-s)}{p - 1 + pg_s}, \tag{11.34}$$
$$g_{s+1} = \frac{(1-p)(f_{s+1} - 1 - s)(f_{s+1} - 1 - s + k)}{p\,g_s f_{s+1}\,(N + 1 - k - f_{s+1})}, \tag{11.35}$$
$$h_{s+1} = \frac{(p - 1 + pg_s)(f_{s+1} - s - 1)}{p\,(N-s)\,g_s}\cdot h_s. \tag{11.36}$$
The initial conditions for the recurrence relations (11.32)–(11.36) are given by
$$D_k = (1-p)^{k(N+1-k)}, \qquad D_{k+1} = \binom{N}{k}\cdot p^k\cdot(1-p)^{k(N-k)}\cdot F(-k, -k; -N; 1 - 1/p), \tag{11.37}$$
$$e_k = \frac{Np\,(1-p)^{N-1}\,(k-1)!^2}{F(1-k, 1-k; 1-N; 1 - 1/p)}, \tag{11.38}$$
$$f_k = 0, \tag{11.39}$$
$$g_k = \frac{k(1-p)}{Np}\cdot\frac{F(1-k, 1-k; 1-N; 1 - 1/p)}{F(-k, 1-k; -N; 1 - 1/p)}, \tag{11.40}$$
$$h_k = k!, \tag{11.41}$$
where $F(u, v; w; z) = {}_2F_1(u, v;\ w;\ z)$.

Proof. See the proof of Theorem 11.1.
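A formal observation, not made in the text and offered only as a consistency check on the formulas: the substitution $\beta \mapsto -N$, $c \mapsto p/(p-1)$ turns the Meixner recurrences (11.22)–(11.24) into the Krawtchouk recurrences (11.34)–(11.36). For instance, in (11.22) one has $1 - \beta - k \mapsto N + 1 - k$ and
$$\frac{\beta + s}{1 + cg_s} \;\longmapsto\; \frac{s - N}{1 + pg_s/(p-1)} = \frac{(1-p)(N-s)}{p - 1 + pg_s},$$
which reproduces (11.34). In (11.23) the factor $1/c$ becomes $(p-1)/p$ and $f_{s+1} + \beta + k - 1$ becomes $-(N + 1 - k - f_{s+1})$, so the two signs combine into the prefactor $(1-p)/p$ of (11.35). The same substitution in (11.24) gives
$$\frac{(1 + cg_s)(s + 1 - f_{s+1})}{c\,(\beta + s)\,g_s} \;\longmapsto\; \frac{(p - 1 + pg_s)(f_{s+1} - s - 1)}{p\,(N - s)\,g_s},$$
in agreement with (11.36).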
Remark 11.7. As we have seen in Sect. 9 (cf. Remark 9.4(a)), the recurrence relations (11.34), (11.35) form a special case of the d-$P_V$ equation of [17].

11.6 q-Charlier polynomials ([13], §3.23). In this subsection, we assume that $q$ is a fixed real number, $0 < q < 1$. The $n$th q-Charlier polynomial is defined by
$$C_n(\zeta; a; q) = {}_2\phi_1\!\left(q^{-n}, \zeta;\ 0;\ q;\ -\frac{q^{n+1}}{a}\right). \tag{11.42}$$
These polynomials satisfy the orthogonality relation
$$\sum_{x=0}^{\infty}\frac{a^x}{(q; q)_x}\,q^{\binom{x}{2}}\,C_m(q^{-x}; a; q)\,C_n(q^{-x}; a; q) = q^{-n}\,(-a; q)_\infty\,(-a^{-1}q, q; q)_n\,\delta_{mn}, \tag{11.43}$$
where $a > 0$ and
$$(-a; q)_\infty = \prod_{l=0}^{\infty}(1 + aq^l).$$
Thus the orthogonality set for q-Charlier polynomials is $X = \{q^{-x}\}_{x=0}^\infty$, and the weight function is $\omega(x) = \frac{a^x}{(q; q)_x}q^{\binom{x}{2}}$. The leading coefficient of the polynomial $C_n(\zeta; a; q)$ is $(-1)^n q^{n^2}a^{-n}$, so the corresponding family of orthogonal polynomials is $\{P_n(\zeta) = (-1)^n a^n q^{-n^2}C_n(\zeta; a; q)\}_{n=0}^\infty$. We call $P_n(\zeta)$ the $n$th normalized q-Charlier polynomial. Now from (11.43), we find that
$$(P_n, P_n)_\omega = a^{2n}q^{-2n^2 - n}\,(-a; q)_\infty\,(-a^{-1}q, q; q)_n$$
for all $n \ge 0$. Then we have

Theorem 11.8. If $K$ is the kernel (11.1) corresponding to the family $\{P_n(\zeta)\}_{n=0}^\infty$ of normalized q-Charlier polynomials, then the Fredholm determinants $D_s$ defined by (11.2) satisfy the following recurrence relation:
$$\frac{D_{s+2}}{D_{s+1}} - \frac{D_{s+1}}{D_s} = \frac{a^{s-1}}{(q; q)_{s+1}}\cdot q^{\binom{s+1}{2}}\cdot u_s\cdot h_s^2, \tag{11.44}$$
where
$$u_s = q^k\cdot\frac{p_s\cdot(p_s\beta_s + aq^{-k}q_s)}{q_s^2} \tag{11.45}$$
for all $s \ge 0$. The scalar sequences $\{p_s\}_{s\ge k}$, $\{q_s\}_{s\ge k}$, $\{\beta_s\}_{s\ge k}$ and $\{h_s\}_{s\ge k}$ can be computed from the following recurrence relations (which involve additional sequences):
$$C_s = a(q^{-s-1} - 1) + q^{k-1}(aq^{-k}p_s - r_s\beta_s), \tag{11.46}$$
$$p_{s+1} = -q^{-1}p_s^{-1}C_s^{-1}\cdot(p_s\beta_s + aq^{-k}q_s)\cdot(r_s\alpha_s - p_s\gamma_s + q^{k-s-1}r_s), \tag{11.47}$$
$$q_{s+1} = q^{-1}q_s^{-1}C_s^{-1}\cdot(p_s\beta_s + aq^{-k}q_s)^2, \tag{11.48}$$
$$r_{s+1} = q^{-1}r_s^{-1}C_s^{-1}\cdot(r_s\alpha_s - p_s\gamma_s + q^{k-s-1}r_s)^2, \tag{11.49}$$
$$\alpha_{s+1} = \alpha_s + q^{k-1}p_s - q^kp_{s+1}, \tag{11.50}$$
$$\beta_{s+1} = \beta_s - q^kq_{s+1}, \tag{11.51}$$
$$\gamma_{s+1} = \gamma_s + q^{k-1}r_s, \tag{11.52}$$
$$h_{s+1} = \frac{p_s\beta_s + aq^{-k}q_s}{a\,q_s}\cdot h_s. \tag{11.53}$$
The initial conditions for the recurrence relations (11.44) and (11.47)–(11.53) are provided by
$$D_k = (-a; q)_\infty^{-k}\cdot a^{-\binom{k}{2}}\cdot\prod_{n=0}^{k-1}\Bigl[(-a^{-1}q; q)_n^{-1}\,q^{\binom{n+1}{2}}\Bigr], \tag{11.54}$$
$$D_{k+1} = (-a; q)_\infty^{-k}\cdot\frac{a^{k - \binom{k}{2}}}{(q; q)_k}\cdot q^{-\binom{k}{2}}\cdot G_q(q^{-k}, q^{-k}; -q^{2k}/a)\cdot\prod_{n=0}^{k-1}\Bigl[(-a^{-1}q; q)_n^{-1}\,q^{\binom{n+1}{2}}\Bigr], \tag{11.55}$$
$$p_k = (1 - q^{-k})\cdot\frac{G_q(q^{-k}, q^{1-k}; -q^{2k-1}/a)}{G_q(q^{-k}, q^{-k}; -q^{2k}/a)}, \tag{11.56}$$
$$q_k = (q; q)_k^2\cdot q^{-k(k+1)}\cdot G_q(q^{-k}, q^{-k}; -q^{2k}/a)^{-1}, \tag{11.57}$$
$$r_k = \frac{q^{k^2}(1 - q^{-k})}{(q; q)_k\,(q; q)_{k-1}}\cdot\frac{G_q(q^{-k}, q^{1-k}; -q^{2k-1}/a)^2}{G_q(q^{-k}, q^{-k}; -q^{2k}/a)}, \tag{11.58}$$
$$\alpha_k = -1 - q^kp_k, \tag{11.59}$$
$$\beta_k = -q^kq_k, \tag{11.60}$$
$$\gamma_k = \frac{q^{k^2 - 1}}{(q; q)_{k-1}^2}\cdot G_q(q^{1-k}, q^{1-k}; -q^{2k-2}/a), \tag{11.61}$$
$$h_k = q^{-k^2}\cdot(q; q)_k, \tag{11.62}$$
where
$$G_q(u, v; z) = {}_2\phi_0(u, v;\ -;\ q;\ z).$$
Proof. As before, the proof is quite straightforward. We have been writing $h_s$ for $m_s^{11}$ (defined in Sect. 4). The formulas (11.45) and (11.46)–(11.52) follow immediately from Theorem 8.2 (note that $\delta_s = aq^{-k}$ for all $s$, since we have $\delta_{s+1} = \delta_s$ from (8.11), and $\delta_k = aq^{-k}$ from (6.10)). Then (11.44) and (11.53) are deduced from (4.7) and (4.6), respectively. Finally, the initial conditions (11.54)–(11.62) are easily obtained from Propositions 6.6, 6.2 and 6.5.

Remark 11.9. As we have mentioned in Subsect. 7.5, the recurrence relation (11.47)–(11.52) for q-Charlier polynomials is in fact equivalent to a certain degeneration of the $q$-$P_{VI}$ equation of [10]. This is a special case of Theorem 10.6.

Remark 11.10. In the case of q-Charlier polynomials, it is possible to solve the compatibility condition for the corresponding Lax pair by a method similar to the one used in Theorem 9.3(b). Namely, it is easy to see that the Lax pair can be parameterized as follows:
$$m_{s+1}(\zeta) = \left[I + (\zeta - q^{-s})^{-1}\begin{pmatrix} p_s & p_sa_sc_s \\ -p_s/(a_sc_s) & -p_s \end{pmatrix}\right]\cdot m_s(\zeta),$$
$$m_s(q\zeta) = \begin{pmatrix} q^k(\zeta - 1) + b_s & b_sc_s \\ aq^{-k}/c_s & aq^{-k} \end{pmatrix} m_{s+1}(\zeta)\begin{pmatrix} (\zeta - 1)^{-1} & 0 \\ 0 & a^{-1} \end{pmatrix}.$$
Then the compatibility condition gives the following recurrence relations for the parameters $a_s, b_s, c_s, p_s$:
$$p_{s+1} = \frac{p_s\,(b_s + aq^{-k}a_s)(q^{k-s-1} - q^k + b_s + aq^{-k}a_s)}{q^kp_s(b_s + aq^{-k}a_s) + q\cdot(q^{k-s-1} - q^k)\cdot aq^{-k}a_s},$$
$$b_{s+1} = b_s + q^{k-1}p_s - q^kp_{s+1},$$
$$a_{s+1}c_{s+1} = \frac{b_s - q^kp_{s+1} + aq^{-k}a_s}{q^{k-s-1} - q^k + b_s - q^kp_{s+1} + aq^{-k}a_s},$$
$$c_s = \frac{aq^{-k}a_s}{aq^{-k}a_s - q^{k-1}p_s}.$$
With this notation, the recurrence relation for the Fredholm determinants is (11.44), the same as before, but now we have
$$u_s = \frac{q^k\cdot(aq^{-k} + b_s/a_s)}{a_sc_s},$$
and the recurrence relation for $h_s$ is given by $h_{s+1} = a^{-1}\cdot(aq^{-k} + b_s/a_s)\cdot h_s$. The initial values $a_k, b_k, c_k$ can be easily found from (11.56)–(11.61). The main difference between this situation and that of Sect. 9 is that we cannot further reduce the recurrence relations for $a_s, b_s, p_s$ to relations involving only two sequences of parameters. So this method cannot be used to show that our recurrence relations in the case of q-Charlier polynomials are equivalent to one of H. Sakai's q-difference
equations [17]. From the computational point of view, this method is slightly easier to use than the one presented in Theorem 11.8.

11.7 Concluding remarks. It follows from Theorem 10.1 that the recurrence relations corresponding to the little q-Jacobi polynomials and the q-Krawtchouk polynomials can be reduced to special cases of the $q$-$P_{VI}$ system of [10]. Also, it follows from Theorem 10.6 that the recurrence relations corresponding to the q-Charlier polynomials and the little q-Laguerre/Wall polynomials can be reduced to special cases of a certain degeneration of the $q$-$P_{VI}$ system described in Subsect. 10.3. However, it is more convenient to use the formulas of Sect. 8 for practical computations. In addition, the method of Sect. 8 covers the case of the alternative q-Charlier polynomials, whereas we do not know if the recurrence relation corresponding to these polynomials can be reduced to one of the equations of H. Sakai's hierarchy.

As far as using the formulas of Sect. 8 is concerned, there is no essential difference between the q-Charlier polynomials and the other four families of basic hypergeometric orthogonal polynomials that we consider here. So we have decided not to write out explicitly the results we have obtained for these four families. On the other hand, we have carried out all the calculations for some specific values of parameters in Maple, and in Sect. 12 we present a few plots of the "density function" (difference or q-derivative of $D_s$) for the eight families of orthogonal polynomials considered in this section.

12. Numerical Computations

12.1. The plots in this section have been obtained in Maple by using the formulas of Sect. 8 and Subsects. 11.2, 11.4 and 11.5 for the specific values of parameters indicated below.

12.2. The following are two plots of the density function $D_{s+1} - D_s$ for the family of Meixner polynomials. The parameters (cf. Subsect. 11.4 or Subsect. 7.2) are $k = 4$, $c = 0.01$, $\beta = 3000$ for the first graph and $k = 4$, $c = 0.9$, $\beta = 0.5$ for the second graph. The x-coordinate in each case is $s$.

[Figure: two plots of $D_{s+1} - D_s$ against $x = s$ for Meixner polynomials; first graph for $0 \le x \le 100$, second graph for $0 \le x \le 200$.]
12.3. The following are the plots of the density function for the families of Charlier, q-Charlier and alternative q-Charlier polynomials (left to right). The parameters (cf. Subsect. 7.2) are $k = 6$, $a = 20$ for the Charlier polynomials (first graph) and $k = 6$, $a = 20$, $q = 0.96$ for the q-Charlier and the alternative q-Charlier polynomials (last two graphs). In the case of Charlier polynomials we plot the difference derivative, $D_{s+1} - D_s$, of $D_s$, and the x-coordinate is $s$, while in the other two cases we plot the q-derivative, $q^s\cdot(D_{s+1} - D_s)/(1 - q)$, of $D_s$, and the x-coordinate is $q^{-s}$.

[Figure: three plots of the density function against $x$ for Charlier, q-Charlier and alternative q-Charlier polynomials; horizontal axis from 10 to 70 in each graph.]

12.4. The following are the plots of the density function $q^{-s}\cdot(D_{s+1} - D_s)/(1 - q)$ for the families of little q-Laguerre/Wall polynomials (first graph) and little q-Jacobi polynomials (second graph). The parameters (cf. Subsect. 7.2) are $k = 6$, $a = 0.5$, $q = 0.9$ for the little q-Laguerre polynomials and $k = 6$, $a = 0.5$, $b = 1.5$, $q = 0.9$ for the little q-Jacobi polynomials. The x-coordinate in each case is $q^s$.

[Figure: two plots of the density function against $x = q^s$ for little q-Laguerre/Wall and little q-Jacobi polynomials; horizontal axis from 0 to 0.5 in each graph.]
12.5. The following are the plots of the density function for the families of Krawtchouk polynomials (first graph) and q-Krawtchouk polynomials (second graph). The parameters (cf. Subsect. 7.2) are $k = 5$, $N = 80$, $p = 1/(0.7 + 1)$ for the Krawtchouk polynomials and $k = 5$, $N = 80$, $p = 0.7$, $q = 0.98$ for the q-Krawtchouk polynomials. In the first case we plot the difference derivative, $D_{s+1} - D_s$, of $D_s$, and the x-coordinate is $s$, while in the second case we plot the normalized q-derivative, $(q^{-N} - 1)\cdot q^s\cdot(D_{s+1} - D_s)/(1 - q)/N$, of $D_s$, and the x-coordinate is $N\cdot(q^{-s} - 1)/(q^{-N} - 1)$.

[Figure: two plots of the density function against $x$ for Krawtchouk and q-Krawtchouk polynomials; horizontal axis from 0 to 80 in each graph.]
References

1. Adler, M., van Moerbeke, P.: Recursion relations for Unitary integrals, Combinatorics and the Toeplitz Lattice. Preprint, 2001, math-ph/0201063
2. Baik, J.: Riemann–Hilbert problems for last passage percolation. math/0107079
3. Borodin, A.: Riemann–Hilbert problem and the discrete Bessel kernel. Intern. Math. Res. Notices, 9, 467–494 (2000), math/9912093
4. Borodin, A.: Discrete gap probabilities and discrete Painlevé equations. To appear in Duke Math. J. math-ph/0111008
5. Borodin, A., Deift, P.: Fredholm determinants, Jimbo-Miwa-Ueno tau-functions, and representation theory. Comm. Pure Appl. Math. 55(9), 1160–1230 (2002), math-ph/0111007
6. Borodin, A., Okounkov, A.: A Fredholm determinant formula for Toeplitz determinants. Integral Equations Operator Theory 37(4), 386–396 (2000), math/9907165
7. Borodin, A., Olshanski, G.: Z-Measures on partitions, Robinson–Schensted–Knuth correspondence, and β = 2 random matrix ensembles. Math. Sci. Res. Inst. Publ. 40, 71–94 (2001), math/9905189
8. Fokas, A.S., Its, A.R., Kitaev, A.V.: Discrete Painlevé equations and their appearance in quantum gravity. Commun. Math. Phys. 142(2), 313–344 (1991)
9. Haine, L., Semengue, J.-P.: The Jacobi polynomial ensemble and the Painlevé VI equation. J. Math. Phys. 40(4), 2117–2134 (1999)
10. Jimbo, M., Sakai, H.: A q-analog of the sixth Painlevé equation. Lett. Math. Phys. 38(2), 145–154 (1996)
11. Johansson, K.: Shape fluctuations and random matrices. Commun. Math. Phys. 209, 437–476 (2000), math/9903134
12. Johansson, K.: Discrete orthogonal polynomial ensembles and the Plancherel measure. Ann. Math. (2) 153(1), 259–296 (2001), math/9906120
13. Koekoek, R., Swarttouw, R.F.: The Askey-scheme of hypergeometric orthogonal polynomials and its q-analogue. Report no. 98-17, (1998), http://aw.twi.tudelft.nl/~koekoek/documents/as98.ps.gz
14. Mehta, M.L.: Random Matrices. 2nd edn, New York: Academic Press, 1991
15. Nagao, T., Wadati, M.: Correlation functions of random matrix ensembles related to classical orthogonal polynomials. J. Phys. Soc. Jap. 60(10), 3298–3322 (1991)
16. Periwal, V., Shevitz, D.: Unitary-matrix models as exactly solvable string theories. Phys. Rev. Lett. 64(12), 1326–1329 (1990)
17. Sakai, H.: Rational surfaces associated with affine root systems and geometry of the Painlevé equations. Commun. Math. Phys. 220(1), 165–229 (2001)
18. Szegö, G.: Orthogonal Polynomials. AMS Colloquium Publications, Vol. XXIII, Providence, RI: Am. Math. Soc., 1959
19. Tracy, C.A., Widom, H.: Introduction to random matrices. Springer Lecture Notes in Physics, Vol. 424, Berlin-Heidelberg-New York: Springer, 1993, pp. 103–130, hep-th/9210073
20. Tracy, C.A., Widom, H.: Fredholm determinants, differential equations and matrix models. Commun. Math. Phys. 163, 33–72 (1994), hep-th/9306042
21. Tracy, C.A., Widom, H.: Random unitary matrices, permutations and Painlevé. Commun. Math. Phys. 207(3), 665–685 (1999), math/9811154
22. Witte, N.S., Forrester, P.J.: Gap probabilities in the finite and scaled Cauchy random matrix ensembles. Nonlinearity 13(6), 1965–1986 (2000), math-ph/0009022

Communicated by P. Sarnak
Commun. Math. Phys. 234, 339–358 (2003) Digital Object Identifier (DOI) 10.1007/s00220-002-0766-4
Intersection Numbers of Twisted Cycles and the Correlation Functions of the Conformal Field Theory

Katsuhisa Mimachi (1), Masaaki Yoshida (2)

(1) Department of Mathematics, Tokyo Institute of Technology, Oh-okayama, Meguro-ku, Tokyo 152-8551, Japan. E-mail: [email protected]
(2) Department of Mathematics, Kyushu University, Ropponmatsu, Fukuoka 810-8560, Japan

Received: 8 March 2002 / Accepted: 25 September 2002 Published online: 8 January 2003 – © Springer-Verlag 2003
Abstract: The coefficients of the four-point correlation functions calculated by Dotsenko-Fateev are interpreted naturally in terms of the intersection numbers of twisted (or loaded) cycles. This is an answer to the long standing problem of clarifying the meaning of such coefficients appearing in correlation functions.

The physical correlation function $G(z, \bar z) = G(z_1, \ldots, z_n, \bar z_1, \ldots, \bar z_n)$ in the conformal field theory is a single-valued function expressed by
$$G(z, \bar z) = \sum_{i,j} X_{ij}\,I_i(z)\,\overline{I_j(z)}, \qquad X_{ij} \in \mathbb{C}. \tag{1}$$
Here $I_j(z)$, called the conformal blocks, are solutions to the linear differential equation arising from the representation of the Virasoro algebra (in some cases, it corresponds to the Knizhnik-Zamolodchikov equation) [2]. Though $I_j(z)$ are multivalued, the function $G(z, \bar z)$ must be single-valued; this imposes a strong constraint on the coefficients $X_{ij}$. Indeed, Dotsenko and Fateev calculated two cases of four-point correlation functions, in the first paper [4] of a series of their papers. We shall briefly explain their results.

In the first case they treated, the differential equation of the theory turns out to be the Gauss hypergeometric differential equation. If we suppose $0 < z < 1$, then two functions
$$I_1(z) = \int_1^\infty t^a(t-1)^b(t-z)^c\,dt, \qquad I_2(z) = \int_0^z t^a(1-t)^b(z-t)^c\,dt$$
form a basis of the solution space. Here, the argument of each factor of the integrand is fixed to be zero, and the exponents $a, b, c$ are supposed to satisfy $a, b, c, a+b+c \in \mathbb{R}\setminus\mathbb{Z}$. Then the quadratic form
$$\sin(\pi b)\sin(\pi(a+b+c))\,|I_1(z)|^2 + \sin(\pi a)\sin(\pi c)\,|I_2(z)|^2 \tag{2}$$
is shown to be invariant under the action of the monodromy group (this coincides with (4.18) of [4] up to an overall factor).

In the second case, the differential equation of the theory is of third order, which is shown to be
$$z^2(z-1)^2 I''' + \bigl(K_1z + K_2(z-1)\bigr)z(z-1)I'' + \bigl(L_1z^2 + L_2(z-1)^2 + L_3z(z-1)\bigr)I' + \bigl(M_1z + M_2(z-1)\bigr)I = 0,$$
with
$$K_1 = -g - 3b - 3c, \qquad K_2 = -g - 3a - 3c,$$
$$L_1 = (b+c)(2b+2c+g+1), \qquad L_2 = (a+c)(2a+2c+g+1),$$
$$L_3 = (b+c)(2a+2c+g+1) + (a+c)(2b+2c+g+1) + (c-1)(a+b+c) + (3c+g)(a+b+c+g+1),$$
$$M_1 = -c(2b+2c+g+1)(2a+2b+2c+g+2), \qquad M_2 = -c(2a+2c+g+1)(2a+2b+2c+g+2).$$
If we suppose $0 < z < 1$ and $a, b, c, g, 2a+g, 2b+g, 2c+g, a+b+c+g, 2a+2b+2c+g, 2a+2b+2c \in \mathbb{R}\setminus\mathbb{Z}$, then a basis of the solution space is given by
$$I_1(z) = \int_1^\infty dt_1\int_1^{t_1}dt_2\,(t_1-t_2)^g\prod_{i=1,2}t_i^a(t_i-1)^b(t_i-z)^c,$$
$$I_2(z) = \int_1^\infty dt_1\int_0^z dt_2\,(t_1-t_2)^g\,t_1^a(t_1-1)^b(t_1-z)^c\,t_2^a(1-t_2)^b(z-t_2)^c,$$
$$I_3(z) = \int_0^z dt_1\int_0^{t_1}dt_2\,(t_1-t_2)^g\prod_{i=1,2}t_i^a(1-t_i)^b(z-t_i)^c,$$
where the argument of each factor of the integrand is fixed to be zero. In this case, the monodromy invariant quadratic form is shown to be
$$\begin{aligned} & s(a+b+c+g)\,s\!\left(a+b+c+\tfrac{g}{2}\right)s(a+c+g)\,s(b)\,s\!\left(b+\tfrac{g}{2}\right)2c\!\left(\tfrac{g}{2}\right)\times|I_1(z)|^2 \\ &\quad + s(a+b+c+g)\,s\!\left(a+c+\tfrac{g}{2}\right)s(a)\,s(b)\,s(c)\times|I_2(z)|^2 \\ &\quad + s\!\left(a+\tfrac{g}{2}\right)s(a)\,s\!\left(c+\tfrac{g}{2}\right)s(c)\,s(a+c)\,2c\!\left(\tfrac{g}{2}\right)\times|I_3(z)|^2, \end{aligned} \tag{3}$$
which coincides with the correlation function (5.16) of [4]. Hereafter we use the symbols $c(A) = \cos(\pi A)$ and $s(A) = \sin(\pi A)$ for brevity's sake.

In general, the conformal blocks $I_j(z)$ are multivalued, and differential equations satisfied by $I_j(z)$ for the four-point correlation functions turn out to be ordinary linear differential equations with regular singularities at 0, 1, and $\infty$ [4]. The method of deriving the quadratic form due to Dotsenko and Fateev is to rewrite the condition for $G(z, \bar z)$ to be monodromy invariant as a system of linear equations among the $X_{ij}$, and then to solve this system. It sounds simple; however, there remain several problems apart from the tediousness of the procedure. The most serious problem is the solvability of the system of equations that appears at the final step, because such a system is an overdetermined one. Actually, Dotsenko and Fateev say (p. 338 of [4]) that there is a problem with the solvability of the system with respect to $X_{ij}$, and there should be a general theorem behind this.

In the twisted de Rham theory, integrals can be regarded as a pairing between the cohomology and the homology with coefficients in local systems. With the development of the intersection theory of twisted cycles, we got an idea that the coefficients $X_{ij}$ in the correlation function (1) would be related to the intersection numbers of the twisted cycles associated with the integral representing $I_j(z)$. The purpose of this paper is to demonstrate it, by taking the examples mentioned above.

The authors thank Professor Yasuhiko Yamada, who kindly explained to us the correlation functions in conformal field theory from the viewpoint of a physicist and helped us with the discussion in the Appendix.

1. Twisted Cycles and Intersection Numbers

We give a brief review of the intersection theory for the twisted homology groups. We refer the reader to [6, 7] and [11] for details.

Let $l_j$ be a linear function in $t = (t_1, \ldots, t_r)$, $\alpha_j \in \mathbb{R}\setminus\mathbb{Z}$, and $H_j$ the hyperplane defined by $l_j = 0$. Then $u(t) = \prod_j l_j^{\alpha_j}$ is a multivalued holomorphic function defined on the complex manifold $T = \mathbb{C}^r - \cup_j H_j$. Let $\mathcal{L}$ be the sheaf consisting of the local solutions of $dL = L\omega$ for the holomorphic 1-form $\omega = du(t)/u(t)$, $\mathcal{L}^\vee$ the sheaf consisting of the local solutions of $dL = -L\omega$, and $\overline{\mathcal{L}}$ the sheaf consisting of the local solutions of $dL = L\bar\omega$, where $\bar\omega$ is the complex conjugate of $\omega$. It is seen that the sheaves $\mathcal{L}$, $\mathcal{L}^\vee$ and $\overline{\mathcal{L}}$ are local systems. Since all the exponents $\alpha_j$ are real, the correspondence
$$\iota_u : \overline{\mathcal{L}} \ni a\bar v \longmapsto \frac{a\bar v}{|u|^2} = a\,\frac{\bar v}{\bar u}\,u^{-1} \in \mathcal{L}^\vee, \qquad (a \in \mathbb{C}),$$
where $v$ is a section of $\mathcal{L}$, gives a $\mathbb{C}$-linear isomorphism between $\overline{\mathcal{L}}$ and $\mathcal{L}^\vee$. Note that if $|v| = |u|$, then $\bar v/|u|^2 = v^{-1}$, and that $|u|^2$ would be multivalued if the exponents $\alpha_j$ were not real. This isomorphism depends on the choice of the section $u$ of $\mathcal{L}$: two sections $u$ and $u'$ give the same isomorphism if and only if $|u| = |u'|$. Anyway we fix this isomorphism $\iota_u : \overline{\mathcal{L}} \cong \mathcal{L}^\vee$.

Let $H_p(T, \mathcal{L})$ be the $p$th twisted homology group with coefficients in $\mathcal{L}$, and $H_p^{lf}(T, \mathcal{L})$ the $p$th locally finite twisted homology group with coefficients in $\mathcal{L}$. Elements of these
K. Mimachi, M. Yoshida
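homology groups, called twisted cycles or loaded cycles, are represented by ∂-closed twisted (finite or locally finite) chains
\[ C = \sum_\rho a_\rho\, \rho \otimes v_\rho \qquad (a_\rho \in \mathbb{C}), \]
where each $\rho$ is a $p$-simplex and $v_\rho$ a section of $\mathcal{L}$ on $\rho$. The subset $\cup_{\rho;\, a_\rho \neq 0}\, \rho$ is called the support of $C$. The boundary operator ∂ is defined to be the $\mathbb{C}$-linear mapping satisfying
\[ \partial(\rho \otimes v) = \sum_{i=0}^{p} (-1)^i \rho^i \otimes v|_{\rho^i}, \]
where $\rho$ is a $p$-simplex, $\rho^i$ denotes the $i$-th face of $\rho$, and $v|_{\rho^i}$ is the restriction of $v$ to $\rho^i$.

The Poincaré duality gives the non-degenerate pairing, called the intersection form of twisted (loaded) cycles,
\[ I : H_r(T, \mathcal{L}) \times H_r^{lf}(T, \mathcal{L}^\vee) \longrightarrow \mathbb{C}, \]
defined by
\[ I\Big(\sum_\rho a_\rho\, \rho \otimes v_\rho,\ \sum_\sigma b_\sigma\, \sigma \otimes w_\sigma\Big) = \sum_{\rho, \sigma} a_\rho b_\sigma \sum_{x \in \rho \cap \sigma} I_x(\rho, \sigma)\, v_\rho(x)\, w_\sigma(x), \]
where $a_\rho, b_\sigma \in \mathbb{C}$, each $\rho$ or $\sigma$ is an $r$-simplex, $v_\rho$ a section of $\mathcal{L}$ on $\rho$, $w_\sigma$ a section of $\mathcal{L}^\vee$ on $\sigma$, and $I_x(\rho, \sigma)$ the topological intersection number at $x$.

On the other hand, under some genericity condition on the exponents $\alpha_j$ (a condition is given explicitly at the beginning of each section to follow), we have the isomorphism, called the regularization,
\[ \mathrm{reg} : H_r^{lf}(T, \mathcal{L}) \longrightarrow H_r(T, \mathcal{L}), \]
which is the inverse of the natural map $H_r(T, \mathcal{L}) \to H_r^{lf}(T, \mathcal{L})$. Concerning the intersection numbers, if we encounter two non-compact cycles, we regularize one of them and compute the intersection number of the resulting cycles, because intersection numbers are not defined for two non-compact cycles.

The isomorphism $\iota_u : \bar{\mathcal{L}} \cong \mathcal{L}^\vee$ induces the $\mathbb{C}$-linear isomorphism
\[ \iota_{u*} : H_p^{lf}(T, \bar{\mathcal{L}}) \longrightarrow H_p^{lf}(T, \mathcal{L}^\vee) \quad \text{defined by} \quad \sum_\rho a_\rho\, \rho \otimes \bar v_\rho \longmapsto \sum_\rho a_\rho\, \rho \otimes \bar v_\rho/|u|^2, \]
where $a_\rho \in \mathbb{C}$, each $\rho$ is an $r$-simplex, and $v_\rho$ a section of $\mathcal{L}$ on $\rho$. Thus the intersection form $I$ leads to the bilinear form
\[ H_r(T, \mathcal{L}) \times H_r^{lf}(T, \bar{\mathcal{L}}) \longrightarrow \mathbb{C} \quad \text{defined by} \quad (C, C') \mapsto I(C, \iota_{u*} C'). \]
Define the complex conjugation $\bar{\ } : H_p^{lf}(T, \mathcal{L}) \to H_p^{lf}(T, \bar{\mathcal{L}})$ by
\[ \overline{\sum_\rho a_\rho\, \rho \otimes v_\rho} = \sum_\rho \bar a_\rho\, \rho \otimes \bar v_\rho, \]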
Intersection Numbers of Twisted Cycles and Correlation Functions of CFT
where $a_\rho \in \mathbb{C}$, each $\rho$ is an $r$-simplex and $v_\rho$ a section of $\mathcal{L}$ on $\rho$. Then, by combining the isomorphism $\mathrm{reg} : H_r^{lf}(T, \mathcal{L}) \to H_r(T, \mathcal{L})$ with the complex conjugation $\bar{\ } : H_p^{lf}(T, \mathcal{L}) \to H_p^{lf}(T, \bar{\mathcal{L}})$, the above bilinear form $H_r(T, \mathcal{L}) \times H_r^{lf}(T, \bar{\mathcal{L}}) \to \mathbb{C}$ yields the Hermitian form
\[ H_r^{lf}(T, \mathcal{L}) \times H_r^{lf}(T, \mathcal{L}) \longrightarrow \mathbb{C} \quad \text{defined by} \quad (C, C') \longmapsto I(\mathrm{reg}\, C, \iota_{u*} \overline{C'}), \qquad C, C' \in H_r^{lf}(T, \mathcal{L}). \]
The value $I(\mathrm{reg}\, C, \iota_{u*} \overline{C'})$ is given by
\[ \sum_{\rho, \sigma} a_\rho \bar a'_\sigma \sum_{x \in \rho \cap \sigma} I_x(\rho, \sigma)\, v_\rho(x)\, \overline{v'_\sigma(x)}/|u|^2, \]
if $\mathrm{reg}\, C$ and $C'$ are represented by
\[ \mathrm{reg}\, C = \sum_\rho a_\rho\, \rho \otimes v_\rho, \qquad C' = \sum_\sigma a'_\sigma\, \sigma \otimes v'_\sigma, \]
where $a_\rho, a'_\sigma \in \mathbb{C}$, each $\rho$ or $\sigma$ is an $r$-simplex, $v_\rho$ a section of $\mathcal{L}$ on $\rho$, and $v'_\sigma$ a section of $\mathcal{L}$ on $\sigma$. We shall also call this Hermitian form the intersection form. Up to a positive real multiplicative factor, this Hermitian form is determined only by the hyperplanes $H_j$ and their exponents $\alpha_j$. For brevity's sake, we often write $\mathrm{reg}\, C \cdot C'$ in place of $I(\mathrm{reg}\, C, \iota_{u*} \overline{C'})$ if the choice of $u$ is clearly indicated.

In the next section, as a preparation for the study of higher rank cases, we study the case in which the rank of the twisted homology group is one, which corresponds to the Beta integral.

2. A Rank One Case

Set $u(t) = t^a (t-1)^b$ with $a, b, a+b \in \mathbb{R}\setminus\mathbb{Z}$ and $T = \mathbb{C}\setminus\{0, 1\}$. Let $\mathcal{L}$ be the local system on $T$ determined by $u(t)$, that is, the sheaf of the local solutions of
\[ \frac{dL}{dt} = \Big(\frac{a}{t} + \frac{b}{t-1}\Big) L. \]
In this case, the rank of the locally finite twisted homology group $H_1^{lf}(T, \mathcal{L})$ is known to be 1, and a base is given by
\[ C = \overrightarrow{(0, 1)} \otimes t^a (1-t)^b, \tag{4} \]
where the argument of each factor of $t^a (1-t)^b$ is fixed to be zero on the path $(0, 1)$, that is, $\arg t = \arg(1-t) = 0$ on $(0, 1)$. Similarly, a base of $H_1^{lf}(T, \bar{\mathcal{L}})$ is given by
\[ \bar C = \overrightarrow{(0, 1)} \otimes \overline{t^a (1-t)^b}. \tag{5} \]
To compute the intersection numbers, we need to consider the regularization $\mathrm{reg}\, C \in H_1(T, \mathcal{L})$ of $C \in H_1^{lf}(T, \mathcal{L})$. To express the regularization explicitly, we
introduce the symbol $S(a\,; z)$, which stands for the positively oriented circle centered at the point $z$ with starting and ending point $a$ (Fig. 1).

Fig. 1

Then the regularization $\mathrm{reg}\, C \in H_1(T, \mathcal{L})$ of $C \in H_1^{lf}(T, \mathcal{L})$ can be given by
\[ \mathrm{reg}\, C = \Big\{ \frac{1}{e(a) - 1}\, S(\varepsilon\,; 0) + \overrightarrow{[\varepsilon, 1-\varepsilon]} - \frac{1}{e(b) - 1}\, S(1-\varepsilon\,; 1) \Big\} \otimes t^a (1-t)^b, \tag{6} \]
where $\varepsilon$ is a small positive number and $e(a) = \exp(2\pi\sqrt{-1}\,a)$. Here $\arg t = \arg(1-t) = 0$ on $[\varepsilon, 1-\varepsilon]$, and the argument on the oriented circle $S(\varepsilon\,; 0)$ or $S(1-\varepsilon\,; 1)$ is defined naturally by analytic continuation, i.e. $\arg t$ takes values from 0 to $2\pi$ on $S(\varepsilon\,; 0)$ and $\arg(1-t)$ from 0 to $2\pi$ on $S(1-\varepsilon\,; 1)$. The regularization $\mathrm{reg}\, C$ is a base of $H_1(T, \mathcal{L})$. To make formula (6) easier to visualize, we often illustrate the paths appearing in (6) as in Fig. 2.

Fig. 2
Note that $\mathrm{reg}\, C$ is defined as the sum of three parts with different weights, as in (6). In what follows, we use figures of this type without further comment. The intersection number $\mathrm{reg}\, C \cdot C$ can be calculated as follows. If the support $(0, 1)$ of $C$ is deformed into a sine-like curve as depicted in Fig. 3, the supports of $\mathrm{reg}\, C$ and of $C$ intersect at three points, and we have
\[ \mathrm{reg}\, C \cdot C = -\frac{1}{e(a) - 1} - 1 + \frac{-1}{e(b) - 1} = -\frac{e(a)e(b) - 1}{(e(a) - 1)(e(b) - 1)} = \frac{\sqrt{-1}}{2}\, \frac{s(a+b)}{s(a)s(b)}. \tag{7} \]

Fig. 3

Note that we may also make use of another deformation, for instance the one in Fig. 4:
\[ \mathrm{reg}\, C \cdot C = -\frac{1}{e(a) - 1} + \frac{-e(b)}{e(b) - 1} = -\frac{e(a)e(b) - 1}{(e(a) - 1)(e(b) - 1)} = \frac{\sqrt{-1}}{2}\, \frac{s(a+b)}{s(a)s(b)}. \]

Fig. 4
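The two deformations and the closed form in (7) can be compared numerically. The following is only a sanity-check sketch, not part of the original argument: the function names and the sample exponents are hypothetical, and s(x) is taken to be sin(πx).

```python
import cmath
import math


def e(x):
    """e(x) = exp(2*pi*sqrt(-1)*x), as defined after formula (6)."""
    return cmath.exp(2j * math.pi * x)


def s(x):
    """s(x) = sin(pi*x)."""
    return math.sin(math.pi * x)


def via_fig3(a, b):
    # Three local contributions: near 0, on the open interval, near 1.
    return -1 / (e(a) - 1) - 1 - 1 / (e(b) - 1)


def via_fig4(a, b):
    # Two local contributions from the alternative deformation.
    return -1 / (e(a) - 1) - e(b) / (e(b) - 1)


def closed_form(a, b):
    # Right-hand side of (7): (sqrt(-1)/2) * s(a+b) / (s(a) s(b)).
    return 0.5j * s(a + b) / (s(a) * s(b))


# Hypothetical sample exponents with a, b, a+b all non-integers.
a, b = 0.3, 0.45
print(via_fig3(a, b), via_fig4(a, b), closed_form(a, b))
```

All three printed values should agree up to rounding, and the result is purely imaginary for real exponents.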
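3. A Rank Two Case

Let $u(t) = t^a (t-1)^b (t-z)^c$ be a multivalued function that determines the local system $\mathcal{L}$ on $T = \mathbb{C}\setminus\{0, 1, z\}$. Suppose that $a, b, c, a+b+c \in \mathbb{R}\setminus\mathbb{Z}$. Then the rank of the homology $H_1^{lf}(T, \mathcal{L})$ is known to be two. Define the function $I_i(z)$ to be the pairing
\[ \langle C_i, dt \rangle = \sum_\rho a_{i\rho} \int_\rho v_\rho(t)\, dt \]
for bases $C_i = \sum_\rho a_{i\rho}\, \rho \otimes v_\rho(t)$ of $H_1^{lf}(T, \mathcal{L})$ and an element $dt$ of $H^1(T, \mathcal{L}^\vee)$. Here each $\rho$ is a 1-simplex in $T$ and $v_\rho(t)$ a section of $\mathcal{L}$ on $\rho$. Let $I_h$ be the intersection matrix defined by
\[ I_h = \begin{pmatrix} \mathrm{reg}\, C_1 \cdot C_1 & \mathrm{reg}\, C_1 \cdot C_2 \\ \mathrm{reg}\, C_2 \cdot C_1 & \mathrm{reg}\, C_2 \cdot C_2 \end{pmatrix}. \tag{8} \]
Then the Hermitian form
\[ F(z, \bar z) = \big(\overline{I_1(z)}, \overline{I_2(z)}\big)\, I_h^{-1} \begin{pmatrix} I_1(z) \\ I_2(z) \end{pmatrix} \tag{9} \]
is seen to be monodromy invariant. We shall explain this.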
Let $\gamma$ be a closed path in $\mathbb{C}\setminus\{0, 1\}$ starting and ending at $z$. Let $\gamma_* C_1$ and $\gamma_* C_2$ be the cycles obtained by the continuous deformation of $C_1$ and $C_2$ as the point $z$ moves along $\gamma$. Then we have an invertible matrix $M(\gamma) = (m_{ij}(\gamma))$, depending only on the homotopy class of $\gamma$, such that $\gamma_* C_i = \sum_{j=1,2} m_{ij}(\gamma) C_j$ for $i = 1, 2$. This induces $\gamma_* I_i(z) = \sum_{j=1,2} m_{ij}(\gamma) I_j(z)$, where each $\gamma_* I_i(z)$ is the analytic continuation of $I_i(z)$ along $\gamma$. This gives the monodromy representation of the fundamental group $\pi_1(z, \mathbb{C}\setminus\{0, 1\})$ on $H_1^{lf}(T, \mathcal{L})$ or on the space $\oplus_{i=1,2}\, \mathbb{C} I_i(z)$. Hence we have
\[ \gamma_* F(z, \bar z) = \big(\overline{I_1(z)}, \overline{I_2(z)}\big)\ {}^t\overline{M(\gamma)}\ \big(\gamma_* I_h^{-1}\big)\ M(\gamma) \begin{pmatrix} I_1(z) \\ I_2(z) \end{pmatrix}, \]
where ${}^tA$ stands for the transpose of the matrix $A$. On the other hand, deforming $I_h$ as the point $z$ moves along $\gamma$, we have $\gamma_* I_h = M(\gamma)\, I_h\ {}^t\overline{M(\gamma)}$, and so
\[ \gamma_* I_h^{-1} = {}^t\overline{M(\gamma)}^{\,-1}\, I_h^{-1}\, M(\gamma)^{-1}. \]
This completes the proof of the monodromy invariance of $F(z, \bar z)$.

In what follows, we shall determine the matrix $I_h^{-1}$ explicitly for a given set of cycles. In the case $0 < z < 1$, we take
\[ C_1 = \overrightarrow{(1, \infty)} \otimes t^a (t-1)^b (t-z)^c, \qquad C_2 = \overrightarrow{(0, z)} \otimes t^a (1-t)^b (z-t)^c \]
as a basis of $H_1^{lf}(T, \mathcal{L})$. Here the argument of each factor is fixed to be zero on the corresponding path, as explained in the rank one case.

Fig. 5
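The invariance argument for the form (9) is pure linear algebra and can be illustrated with arbitrary matrices. In the sketch below all numerical data are hypothetical and the helper names are not from the paper; it only checks that the form is unchanged when the integrals transform by M and the intersection matrix by M I_h times the conjugate transpose of M.

```python
def matmul(A, B):
    """Product of two 2x2 complex matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def conj_transpose(A):
    """Conjugate transpose of a 2x2 matrix."""
    return [[A[j][i].conjugate() for j in range(2)] for i in range(2)]


def inverse(A):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [[A[1][1] / det, -A[0][1] / det],
            [-A[1][0] / det, A[0][0] / det]]


def hermitian_form(I, Ih):
    """F = (conj I_1, conj I_2) Ih^{-1} (I_1, I_2)^T, as in (9)."""
    X = inverse(Ih)
    return sum(I[i].conjugate() * X[i][j] * I[j]
               for i in range(2) for j in range(2))


# Hypothetical sample data (not taken from the paper).
I = [0.7 - 0.2j, -0.4 + 1.1j]
Ih = [[2j, 0.3 + 0j], [0.1 - 0.2j, -1.5j]]
M = [[1 + 2j, 0.5 + 0j], [-1j, 2 + 0j]]

# Continuation along gamma: I -> M I and Ih -> M Ih (conjugate transpose of M).
I_cont = [sum(M[i][j] * I[j] for j in range(2)) for i in range(2)]
Ih_cont = matmul(matmul(M, Ih), conj_transpose(M))

F_before = hermitian_form(I, Ih)
F_after = hermitian_form(I_cont, Ih_cont)
print(abs(F_before - F_after))
```

The printed difference should vanish up to rounding for any invertible M and I_h.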
Accordingly, the regularizations of $C_1, C_2 \in H_1^{lf}(T, \mathcal{L})$ can be represented by
\[ \mathrm{reg}\, C_1 = \Big\{ \frac{1}{e(b) - 1}\, S(1+\varepsilon\,; 1) + \overrightarrow{[1+\varepsilon, R]} - \frac{1}{e(-a-b-c) - 1}\, S(R\,; \infty) \Big\} \otimes t^a (t-1)^b (t-z)^c, \]
\[ \mathrm{reg}\, C_2 = \Big\{ \frac{1}{e(a) - 1}\, S(\varepsilon\,; 0) + \overrightarrow{[\varepsilon, z-\varepsilon]} - \frac{1}{e(c) - 1}\, S(z-\varepsilon\,; z) \Big\} \otimes t^a (1-t)^b (z-t)^c, \]
where $\varepsilon$ is a small positive number and $R$ a large positive number. Here the circle $S(R\,; \infty)$ around the point at infinity is regarded as the negatively oriented circle $S^{-1}(R\,; 0)$ with center 0 (Fig. 6). Recall that $e(a) = \exp(2\pi\sqrt{-1}\,a)$.

Fig. 6
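It is clear that $\mathrm{reg}\, C_1 \cdot C_2 = \mathrm{reg}\, C_2 \cdot C_1 = 0$. To compute the "self"-intersection numbers $\mathrm{reg}\, C_1 \cdot C_1$ and $\mathrm{reg}\, C_2 \cdot C_2$, the formula derived in Sect. 2 can be applied: replacing the exponents $(a, b)$ in (7) by $(b, -a-b-c)$ gives the number $\mathrm{reg}\, C_1 \cdot C_1$, and replacing $(a, b)$ in (7) by $(a, c)$ gives the number $\mathrm{reg}\, C_2 \cdot C_2$. The result is
\[ I_h = \begin{pmatrix} \mathrm{reg}\, C_1 \cdot C_1 & \mathrm{reg}\, C_1 \cdot C_2 \\ \mathrm{reg}\, C_2 \cdot C_1 & \mathrm{reg}\, C_2 \cdot C_2 \end{pmatrix} = \frac{\sqrt{-1}}{2} \begin{pmatrix} \dfrac{s(a+c)}{s(b)s(a+b+c)} & 0 \\ 0 & \dfrac{s(a+c)}{s(a)s(c)} \end{pmatrix}; \]
so we have
\[ I_h^{-1} = \frac{2}{\sqrt{-1}} \begin{pmatrix} \dfrac{s(b)s(a+b+c)}{s(a+c)} & 0 \\ 0 & \dfrac{s(a)s(c)}{s(a+c)} \end{pmatrix}. \]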
In this way, we rediscover the invariant Hermitian form (2):
\[ F(z, \bar z) = \frac{2}{\sqrt{-1}}\, \frac{1}{s(a+c)}\, \big\{ s(b)s(a+b+c)|I_1(z)|^2 + s(a)s(c)|I_2(z)|^2 \big\}, \]
announced in the introduction. We remark here that the integrals $I_1(z)$ and $I_2(z)$ over the cycles $C_1$ and $C_2$ above can be expressed as
\[ I_1(z) = \int_1^\infty t^a (t-1)^b (t-z)^c\, dt = \frac{\Gamma(-a-b-c-1)\Gamma(b+1)}{\Gamma(-a-c)}\ {}_2F_1\Big( {-c,\ -a-b-c-1 \atop -a-c}\, ; z \Big), \]
\[ I_2(z) = \int_0^z t^a (1-t)^b (z-t)^c\, dt = z^{1+a+c} \int_0^1 t^a (1-t)^c (1-zt)^b\, dt = z^{1+a+c}\, \frac{\Gamma(a+1)\Gamma(c+1)}{\Gamma(a+c+2)}\ {}_2F_1\Big( {-b,\ a+1 \atop a+c+2}\, ; z \Big), \]
where ${}_2F_1\big( {\alpha,\ \beta \atop \gamma}\, ; z \big)$ is the Gauss hypergeometric series defined by
\[ \sum_{n \ge 0} \frac{\alpha(\alpha+1) \cdots (\alpha+n-1)\, \beta(\beta+1) \cdots (\beta+n-1)}{\gamma(\gamma+1) \cdots (\gamma+n-1)}\, \frac{z^n}{n!} \qquad (|z| < 1). \]
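The expression of I_2(z) through the Gauss hypergeometric series can be checked numerically. The sketch below is illustrative only: it assumes the standard series with rising factorials, uses a plain midpoint rule for the integral, and picks hypothetical sample values with a, c > -1 so that the integral over (0, z) converges. The helper names are not from the paper.

```python
import math


def hyp2f1(alpha, beta, gamma, z, terms=200):
    """Partial sum of the Gauss series with rising factorials, for |z| < 1."""
    total, term = 0.0, 1.0
    for n in range(terms):
        total += term
        term *= (alpha + n) * (beta + n) / ((gamma + n) * (n + 1)) * z
    return total


def I2_numeric(a, b, c, z, steps=100000):
    """Midpoint rule for the integral of t^a (1-t)^b (z-t)^c over (0, z)."""
    h = z / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * h
        total += t ** a * (1 - t) ** b * (z - t) ** c
    return total * h


def I2_closed(a, b, c, z):
    """z^(1+a+c) * Gamma(a+1)Gamma(c+1)/Gamma(a+c+2) * 2F1(-b, a+1; a+c+2; z)."""
    gamma_factor = math.gamma(a + 1) * math.gamma(c + 1) / math.gamma(a + c + 2)
    return z ** (1 + a + c) * gamma_factor * hyp2f1(-b, a + 1, a + c + 2, z)


# Hypothetical sample values; a, c > -1 keeps the integral convergent.
a, b, c, z = 0.6, 0.3, 0.7, 0.5
print(I2_numeric(a, b, c, z), I2_closed(a, b, c, z))
```

The integral I_1(z) over (1, ∞) converges only for a different range of exponents and is not covered by this sketch.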
Next, we give another expression of the invariant Hermitian form (2) in terms of another basis of $H_1^{lf}(T, \mathcal{L})$. The aim is mainly to explain how to compute the intersection numbers for adjacent cycles. This will be used in the next section. Take
\[ C_1 = \overrightarrow{(0, z)} \otimes t^a (1-t)^b (z-t)^c, \qquad C_2 = \overrightarrow{(z, 1)} \otimes t^a (1-t)^b (t-z)^c, \]
where the argument of each factor is fixed to be zero on the corresponding path. Then the "self"-intersection numbers $\mathrm{reg}\, C_1 \cdot C_1$ and $\mathrm{reg}\, C_2 \cdot C_2$ can be computed in the same way as in the previous case, and both intersection numbers $\mathrm{reg}\, C_1 \cdot C_2$ and $\mathrm{reg}\, C_2 \cdot C_1$ are shown to be
\[ \frac{e(c/2)}{e(c) - 1} = -\frac{\sqrt{-1}}{2}\, \frac{1}{s(c)}. \]
When we compute $\mathrm{reg}\, C_1 \cdot C_2$, we have $0 \le \arg(z-t) \le 2\pi$ on the circle $S(z-\varepsilon\,; z)$ used in the definition of $\mathrm{reg}\, C_1$, and in particular $\arg(z-t) = \pi$ at the intersection point $t = z+\varepsilon$. Similarly, when we compute $\mathrm{reg}\, C_2 \cdot C_1$, we have $0 \le \arg(t-z) \le 2\pi$ on $S(z+\varepsilon\,; z)$, and in particular $\arg(t-z) = \pi$ at the intersection point $t = z-\varepsilon$. See Fig. 7.

Fig. 7

As a result, the intersection matrix $I_h$ is shown to be
\[ I_h = \begin{pmatrix} \mathrm{reg}\, C_1 \cdot C_1 & \mathrm{reg}\, C_1 \cdot C_2 \\ \mathrm{reg}\, C_2 \cdot C_1 & \mathrm{reg}\, C_2 \cdot C_2 \end{pmatrix} = \frac{\sqrt{-1}}{2} \begin{pmatrix} \dfrac{s(a+c)}{s(a)s(c)} & -\dfrac{1}{s(c)} \\ -\dfrac{1}{s(c)} & \dfrac{s(b+c)}{s(b)s(c)} \end{pmatrix}; \]
so we have
\[ I_h^{-1} = \frac{2}{\sqrt{-1}} \begin{pmatrix} \dfrac{s(a)s(b+c)}{s(a+b+c)} & \dfrac{s(a)s(b)}{s(a+b+c)} \\ \dfrac{s(a)s(b)}{s(a+b+c)} & \dfrac{s(a+c)s(b)}{s(a+b+c)} \end{pmatrix}, \]
which leads to the expression
\[ F(z, \bar z) = \frac{2}{\sqrt{-1}}\, \frac{1}{s(a+b+c)} \Big[ s(a)s(b+c)|I_1(z)|^2 + s(a)s(b)\big\{ \overline{I_1(z)} I_2(z) + \overline{I_2(z)} I_1(z) \big\} + s(a+c)s(b)|I_2(z)|^2 \Big]. \]
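The inversion of the intersection matrix for the adjacent cycles rests on the identity sin(A+C)sin(B+C) - sin(A)sin(B) = sin(C)sin(A+B+C). As a hedged numerical sketch (hypothetical sample exponents, helper names not from the paper), one can multiply the two matrices as stated above and also check the off-diagonal entry against e(c/2)/(e(c) - 1):

```python
import cmath
import math


def s(x):
    """s(x) = sin(pi*x)."""
    return math.sin(math.pi * x)


def e(x):
    """e(x) = exp(2*pi*sqrt(-1)*x)."""
    return cmath.exp(2j * math.pi * x)


def intersection_matrix(a, b, c):
    """I_h for the adjacent cycles on (0, z) and (z, 1)."""
    pref = 0.5j  # sqrt(-1)/2
    off = -pref / s(c)
    return [[pref * s(a + c) / (s(a) * s(c)), off],
            [off, pref * s(b + c) / (s(b) * s(c))]]


def inverse_matrix(a, b, c):
    """The stated inverse: (2/sqrt(-1)) / s(a+b+c) times a real symmetric matrix."""
    pref = (2 / 1j) / s(a + b + c)
    return [[pref * s(a) * s(b + c), pref * s(a) * s(b)],
            [pref * s(a) * s(b), pref * s(a + c) * s(b)]]


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


# Hypothetical sample exponents with a, b, c, a+b+c non-integers.
a, b, c = 0.3, 0.45, 0.2
product = matmul(intersection_matrix(a, b, c), inverse_matrix(a, b, c))
off_diagonal = e(c / 2) / (e(c) - 1)
print(product, off_diagonal)
```

The product should be the identity matrix up to rounding, and the last printed value should equal the off-diagonal entry of I_h.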
4. A Rank Three Case

In this section, we rediscover the invariant Hermitian form (3) associated with the conformal blocks that satisfy a differential equation of the third order. Let
\[ u(t) = (t_1 - t_2)^g \prod_{i=1,2} t_i^a (t_i - 1)^b (t_i - z)^c \]
be a multivalued function which determines the local system $\mathcal{L}$ on the complex manifold $T = \mathbb{C}^2 \setminus D$, where
\[ D = \{t_1 - t_2 = 0\} \cup \bigcup_{i=1,2} \{t_i = 0\} \cup \bigcup_{i=1,2} \{1 - t_i = 0\} \cup \bigcup_{i=1,2} \{z - t_i = 0\}. \]
Fix $0 < z < 1$ and assume
\[ a,\ b,\ c,\ g,\ 2a+g,\ 2b+g,\ 2c+g,\ a+b+c+g,\ 2a+2b+2c+g,\ 2a+2b+2c \in \mathbb{R}\setminus\mathbb{Z}. \]
In this case, the rank of $H_2^{lf}(T, \mathcal{L})$ is known to be 6, and a basis is given as follows. Set the domains $D_i^{(j)}$ ($i = 1, 2, 3$, $j = 1, 2$) of $\mathbb{R}^2$ to be
\[ D_1^{(1)} = \{(t_1, t_2)\,;\ 1 < t_2 < t_1\}, \qquad D_1^{(2)} = \{(t_1, t_2)\,;\ 1 < t_1 < t_2\}, \]
\[ D_2^{(1)} = \{(t_1, t_2)\,;\ 0 < t_2 < z,\ 1 < t_1\}, \qquad D_2^{(2)} = \{(t_1, t_2)\,;\ 0 < t_1 < z,\ 1 < t_2\}, \]
\[ D_3^{(1)} = \{(t_1, t_2)\,;\ 0 < t_2 < t_1 < z\}, \qquad D_3^{(2)} = \{(t_1, t_2)\,;\ 0 < t_1 < t_2 < z\}, \]
with the standard orientation. Then we take
\[ C_1^{(1)} = D_1^{(1)} \otimes (t_1 - t_2)^g \prod_{i=1,2} t_i^a (t_i - 1)^b (t_i - z)^c, \]
\[ C_1^{(2)} = D_1^{(2)} \otimes (t_2 - t_1)^g \prod_{i=1,2} t_i^a (t_i - 1)^b (t_i - z)^c, \]
\[ C_2^{(1)} = D_2^{(1)} \otimes (t_1 - t_2)^g\, t_1^a (t_1 - 1)^b (t_1 - z)^c\, t_2^a (1 - t_2)^b (z - t_2)^c, \]
\[ C_2^{(2)} = D_2^{(2)} \otimes (t_2 - t_1)^g\, t_2^a (t_2 - 1)^b (t_2 - z)^c\, t_1^a (1 - t_1)^b (z - t_1)^c, \]
\[ C_3^{(1)} = D_3^{(1)} \otimes (t_1 - t_2)^g \prod_{i=1,2} t_i^a (1 - t_i)^b (z - t_i)^c, \]
\[ C_3^{(2)} = D_3^{(2)} \otimes (t_2 - t_1)^g \prod_{i=1,2} t_i^a (1 - t_i)^b (z - t_i)^c, \]
where the argument of each factor of the section of $\mathcal{L}$ is fixed to be zero on each $D_i^{(j)}$. These twisted cycles $C_i^{(j)}$ ($i = 1, 2, 3$, $j = 1, 2$) form a basis of $H_2^{lf}(T, \mathcal{L})$.

For the study of the invariant Hermitian form (3), it suffices to consider the $S_2$-invariant subspace $H_2^{lf}(T, \mathcal{L})^{S_2}$ of $H_2^{lf}(T, \mathcal{L})$, where the symmetric group $S_2$ acts by interchanging the coordinates $t_1, t_2$ of the manifold $T$; the rank of $H_2^{lf}(T, \mathcal{L})^{S_2}$ is 3. We remark that $\sigma(C_i^{(1)}) = -C_i^{(2)}$ for the generator $\sigma$ of $S_2$, because of the change of orientation. Thus a basis of $H_2^{lf}(T, \mathcal{L})^{S_2}$ is given by the elements $C_i = (C_i^{(1)} - C_i^{(2)})/2$ for $i = 1, 2, 3$. Moreover, it is known [10] that $H_2^{lf}(T, \mathcal{L})^{S_2}$ is closed under the action of $\pi_1(z, \mathbb{C}\setminus\{0, 1\})$ on $C_i$ ($i = 1, 2, 3$).
Fig. 8

On the other hand, we define the function $I_i^{(j)}(z)$ to be the pairing $\langle C_i^{(j)}, dt_1 \wedge dt_2 \rangle = \int_{D_i^{(j)}} v_i^{(j)}(t)\, dt_1 \wedge dt_2$ for the cycle $C_i^{(j)} = D_i^{(j)} \otimes v_i^{(j)}$ and the element $dt_1 \wedge dt_2 \in H^2(T, \mathcal{L}^\vee)$: for instance,
\[ I_1^{(1)}(z) = \int_1^\infty dt_1 \int_1^{t_1} dt_2\ (t_1 - t_2)^g \prod_{i=1,2} t_i^a (t_i - 1)^b (t_i - z)^c \]
and
\[ I_1^{(2)}(z) = -\int_1^\infty dt_2 \int_1^{t_2} dt_1\ (t_2 - t_1)^g \prod_{i=1,2} t_i^a (t_i - 1)^b (t_i - z)^c, \]
where the argument of each factor of the integrand is zero. It is seen that $I_i^{(1)}(z) = -I_i^{(2)}(z)$ for $i = 1, 2, 3$. Hence the functions
\[ \int_1^\infty dt_1 \int_1^{t_1} dt_2\ (t_1 - t_2)^g \prod_{i=1,2} t_i^a (t_i - 1)^b (t_i - z)^c = \frac{1}{2}\big( I_1^{(1)}(z) - I_1^{(2)}(z) \big), \]
\[ \int_1^\infty dt_1 \int_0^z dt_2\ (t_1 - t_2)^g\, t_1^a (t_1 - 1)^b (t_1 - z)^c\, t_2^a (1 - t_2)^b (z - t_2)^c = \frac{1}{2}\big( I_2^{(1)}(z) - I_2^{(2)}(z) \big), \]
\[ \int_0^z dt_1 \int_0^{t_1} dt_2\ (t_1 - t_2)^g \prod_{i=1,2} t_i^a (1 - t_i)^b (z - t_i)^c = \frac{1}{2}\big( I_3^{(1)}(z) - I_3^{(2)}(z) \big) \]
can be regarded as the integrals over the cycles $C_i = (C_i^{(1)} - C_i^{(2)})/2$, respectively. The integrals on the left-hand side above are the conformal blocks that give the quadratic form (3).

Therefore, for the study of (3), we evaluate the intersection numbers $\mathrm{reg}\, C_i \cdot C_j$ for $1 \le i, j \le 3$, following the recipe in [8]. First, blow up the $(t_1, t_2)$-space at each triple singular point. We denote by $\tilde C_i^{(j)}$ the cycles obtained after such blow-ups. Secondly, construct the regularization $\mathrm{reg}\, \tilde C_i^{(j)}$ of each cycle $\tilde C_i^{(j)}$. Then we have
\[ \mathrm{reg}\, C_i \cdot C_j = \frac{1}{4} \sum_{k,l=1,2} (-1)^{k+l}\ \mathrm{reg}\, \tilde C_i^{(k)} \cdot \tilde C_j^{(l)}. \]
It is clear that $\mathrm{reg}\, C_i \cdot C_j = 0$ for $i \ne j$. Thus we compute $\mathrm{reg}\, C_i \cdot C_i$ ($i = 1, 2, 3$) in what follows. For a while, we consider the intersection number $\mathrm{reg}\, C_3 \cdot C_3$. First we blow up the $(t_1, t_2)$-space at $(0, 0)$ and $(z, z)$; the process is depicted in Fig. 9, where $(\alpha_j)$ indicates the exponent of $l_j$ that defines the hyperplane $H_j$ by $l_j = 0$.

Fig. 9
Since each cycle $\tilde C_3^{(i)}$ ($i = 1, 2$) is locally a product of 1-dimensional cycles, the regularization $\mathrm{reg}\, \tilde C_3^{(j)}$ ($j = 1, 2$) can be made in an obvious way. The twisted cycles $\mathrm{reg}\, \tilde C_3^{(1)}$ and $\tilde C_3^{(2)}$ restricted to the $(t_1 - t_2)$-space are described in Fig. 10. Let $\bar D_3^{(j)}$ be the closure of $D_3^{(j)}$, and let $\bar D$ be $\bar D_3^{(1)} \cap \bar D_3^{(2)}$ (don't confuse this bar with the complex conjugate). Then we have
\[ \mathrm{reg}\, \tilde C_3^{(1)} \cdot \tilde C_3^{(2)} = \frac{e(g/2)}{d_g}\ \mathrm{reg}\, \tilde C_3^{(1)}\big|_{\bar D} \cdot \tilde C_3^{(2)}\big|_{\bar D}, \tag{10} \]
where $d_\alpha = e(\alpha) - 1$ and the symbol $C|_{\bar D}$ stands for the restriction of the cycle $C$ to $\bar D$.

Fig. 10

To evaluate $\mathrm{reg}\, \tilde C_3^{(1)}|_{\bar D} \cdot \tilde C_3^{(2)}|_{\bar D}$, deform the cycle $\tilde C_3^{(2)}|_{\bar D}$ as in Fig. 11; hence
\[ \mathrm{reg}\, \tilde C_3^{(1)}\big|_{\bar D} \cdot \tilde C_3^{(2)}\big|_{\bar D} = \frac{1}{d_{2a+g}} + 1 + \frac{1}{d_{2c+g}} = \frac{d_{2a+2c+2g}}{d_{2a+g}\, d_{2c+g}}. \tag{11} \]

Fig. 11
Combination of (10) and (11) leads to
\[ \mathrm{reg}\, \tilde C_3^{(1)} \cdot \tilde C_3^{(2)} = \frac{e(g/2)\, d_{2a+2c+2g}}{d_g\, d_{2a+g}\, d_{2c+g}}, \]
which can also be shown to be equal to $\mathrm{reg}\, \tilde C_3^{(2)} \cdot \tilde C_3^{(1)}$.

For $\mathrm{reg}\, \tilde C_3^{(1)} \cdot \tilde C_3^{(1)}$, deform the support of the cycle $\tilde C_3^{(1)}$ so that the intersection points of $\mathrm{reg}\, \tilde C_3^{(1)}$ and $\tilde C_3^{(1)}$ are situated at the barycenters of the faces of $\bar D_3^{(1)}$ and the intersections are transversal, and then sum up the local contributions (see Fig. 12). As a result, we have
\[ \mathrm{reg}\, \tilde C_3^{(1)} \cdot \tilde C_3^{(1)} = 1 + \frac{1}{d_a} + \frac{1}{d_c} + \frac{1}{d_{2c+g}} + \frac{1}{d_g} + \frac{1}{d_{2a+g}} + \frac{1}{d_a d_c} + \frac{1}{d_c d_{2c+g}} + \frac{1}{d_{2c+g} d_g} + \frac{1}{d_g d_{2a+g}} + \frac{1}{d_{2a+g} d_a} \]
\[ = 1 + \frac{1}{d_a} + \frac{1}{d_c} + \frac{1}{d_g} + \frac{1}{d_a d_c} + \frac{d_{a+g}}{d_{2a+g}\, d_a d_g} + \frac{d_{c+g}}{d_{2c+g}\, d_c d_g}. \]
It is immediate to check that $\mathrm{reg}\, \tilde C_3^{(2)} \cdot \tilde C_3^{(2)} = \mathrm{reg}\, \tilde C_3^{(1)} \cdot \tilde C_3^{(1)}$.

Fig. 12
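Combining these evaluations, we have
\[ \mathrm{reg}\, C_3 \cdot C_3 = \frac{1}{4} \sum_{i,j=1,2} (-1)^{i+j}\ \mathrm{reg}\, \tilde C_3^{(i)} \cdot \tilde C_3^{(j)} \]
\[ = \frac{1}{2} \Big\{ 1 + \frac{1}{d_a} + \frac{1}{d_c} + \frac{1}{d_g} + \frac{1}{d_a d_c} + \frac{d_{a+g}}{d_{2a+g}\, d_a d_g} + \frac{d_{c+g}}{d_{2c+g}\, d_c d_g} - \frac{e(g/2)\, d_{2a+2c+2g}}{d_g\, d_{2a+g}\, d_{2c+g}} \Big\} \]
\[ = \frac{1}{2}\, \frac{\big(e(a+c+\tfrac{g}{2}) - 1\big)\big(e(a+c+g) - 1\big)}{\big(e(a+\tfrac{g}{2}) - 1\big)\big(e(c+\tfrac{g}{2}) - 1\big)\big(e(\tfrac{g}{2}) + 1\big)\big(e(a) - 1\big)\big(e(c) - 1\big)} \]
\[ = -\frac{1}{8}\, \frac{s(a+c+\tfrac{g}{2})\, s(a+c+g)}{s(a+\tfrac{g}{2})\, s(c+\tfrac{g}{2})\, s(a)s(c)\, 2c(\tfrac{g}{2})}. \]
Similarly, we have
\[ \mathrm{reg}\, C_2 \cdot C_2 = \frac{1}{4} \sum_{i,j=1,2} (-1)^{i+j}\ \mathrm{reg}\, \tilde C_2^{(i)} \cdot \tilde C_2^{(j)} = -\frac{1}{8}\, \frac{s(a+c+g)\, s(a+c)}{s(a)s(b)s(c)\, s(a+b+c+g)} \]
and
\[ \mathrm{reg}\, C_1 \cdot C_1 = \frac{1}{4} \sum_{i,j=1,2} (-1)^{i+j}\ \mathrm{reg}\, \tilde C_1^{(i)} \cdot \tilde C_1^{(j)} = -\frac{1}{8}\, \frac{s(a+c)\, s(a+c+\tfrac{g}{2})}{s(b)\, s(a+b+c+g)\, s(a+b+c+\tfrac{g}{2})\, s(b+\tfrac{g}{2})\, 2c(\tfrac{g}{2})}. \]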
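The chain of equalities for reg C_3 · C_3 can be checked numerically. The sketch below is illustrative only: it assumes that c(x) denotes cos(πx), which is consistent with the factor e(g/2) + 1 in the intermediate expression, and that d_x = e(x) - 1 as in (10). The function names and the sample exponents are hypothetical.

```python
import cmath
import math


def e(x):
    """e(x) = exp(2*pi*sqrt(-1)*x)."""
    return cmath.exp(2j * math.pi * x)


def d(x):
    """d_x = e(x) - 1, as in (10)."""
    return e(x) - 1


def s(x):
    """s(x) = sin(pi*x)."""
    return math.sin(math.pi * x)


def cos_pi(x):
    """Assumed meaning of c(x) in the text: cos(pi*x)."""
    return math.cos(math.pi * x)


def from_local_contributions(a, c, g):
    """Half of (diagonal self-intersection minus cross term)."""
    diagonal = (1 + 1 / d(a) + 1 / d(c) + 1 / d(g) + 1 / (d(a) * d(c))
                + d(a + g) / (d(2 * a + g) * d(a) * d(g))
                + d(c + g) / (d(2 * c + g) * d(c) * d(g)))
    cross = (e(g / 2) * d(2 * a + 2 * c + 2 * g)
             / (d(g) * d(2 * a + g) * d(2 * c + g)))
    return 0.5 * (diagonal - cross)


def closed_form(a, c, g):
    """The final sine expression for reg C_3 . C_3."""
    numerator = s(a + c + g / 2) * s(a + c + g)
    denominator = s(a + g / 2) * s(c + g / 2) * s(a) * s(c) * 2 * cos_pi(g / 2)
    return -numerator / (8 * denominator)


# Hypothetical sample exponents satisfying the genericity assumptions.
a, c, g = 0.2, 0.37, 0.3
print(from_local_contributions(a, c, g), closed_form(a, c, g))
```

Both values should agree up to rounding, with a vanishing imaginary part in the first.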
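Hence the intersection matrix $I_h$ is given by
\[ I_h = \big( \mathrm{reg}\, C_i \cdot C_j \big) = -\frac{1}{8} \begin{pmatrix} \dfrac{s(a+c)\, s(a+c+\tfrac{g}{2})}{s(b)\, s(a+b+c+g)\, s(b+\tfrac{g}{2})\, s(a+b+c+\tfrac{g}{2})\, 2c(\tfrac{g}{2})} & 0 & 0 \\ 0 & \dfrac{s(a+c+g)\, s(a+c)}{s(a)s(b)s(c)\, s(a+b+c+g)} & 0 \\ 0 & 0 & \dfrac{s(a+c+\tfrac{g}{2})\, s(a+c+g)}{s(a+\tfrac{g}{2})\, s(c+\tfrac{g}{2})\, s(a)s(c)\, 2c(\tfrac{g}{2})} \end{pmatrix}. \]
As a consequence, we have
\[ F(z, \bar z) = \big( \overline{I_1(z)}, \overline{I_2(z)}, \overline{I_3(z)} \big)\, I_h^{-1} \begin{pmatrix} I_1(z) \\ I_2(z) \\ I_3(z) \end{pmatrix} \]
\[ = -\frac{8}{s(a+c)\, s(a+c+\tfrac{g}{2})\, s(a+c+g)} \Big[ s(a+b+c+g)\, s(a+b+c+\tfrac{g}{2})\, s(a+c+g)\, s(b)\, s(b+\tfrac{g}{2})\, 2c(\tfrac{g}{2}) \times |I_1(z)|^2 \]
\[ \quad + s(a+b+c+g)\, s(a+c+\tfrac{g}{2})\, s(a)s(b)s(c) \times |I_2(z)|^2 + s(a+\tfrac{g}{2})\, s(a)\, s(c+\tfrac{g}{2})\, s(c)\, s(a+c)\, 2c(\tfrac{g}{2}) \times |I_3(z)|^2 \Big]. \]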
This is the invariant Hermitian form corresponding to (3).

5. Comments

We first comment on the higher rank case that is associated with
\[ u(t) = \prod_{i=1}^m t_i^a (1 - t_i)^b (z - t_i)^c \prod_{1 \le i < j \le m} (t_i - t_j)^g \]
for general $m$ (it is studied in [5]). For the concrete expression of the intersection numbers, we need to consider the blow-ups of the twisted cycles in higher dimensional space. The principle of the calculation is the same as in the present case; indeed, we can carry out the computation in the rank 4 case (i.e. the case $m = 3$). However, the process is a bit tedious and hard to carry further. Therefore, we need to find an alternative way to calculate the intersection numbers.∗

On the other hand, look at the formula
\[ \int_{\mathbb{C}} |t|^{2a} |1 - t|^{2b}\, dt \wedge d\bar t = \Big\{ \frac{\Gamma(a+1)\Gamma(b+1)}{\Gamma(a+b+2)} \Big\}^2\ \frac{2}{\sqrt{-1}}\, \frac{\sin(\pi a) \sin(\pi b)}{\sin(\pi(a+b))}. \tag{12} \]
It is easily seen that the second factor of the right-hand side is the inverse of the intersection number (7), while the first factor is the square of the beta integral. Indeed, the intersection form for the twisted cohomology groups is defined and is related to the intersection form for the twisted homology groups (a generalization of Riemann's period relation) [3, 6]. The formula (12) can be interpreted in such a framework. Namely, if the left-hand side of (12) is regarded as an intersection number of twisted cocycles, then the equality (12) is automatically derived. In the same way, the integral
\[ \int_{\mathbb{C}} |t|^{2a} |1 - t|^{2b} |z - t|^{2c}\, dt \wedge d\bar t \]
can be shown to be the invariant Hermitian form $F(z, \bar z)$. At this stage, it is natural to expect that the formula about the complex Selberg integral by Aomoto [1],
\[ \int_{\mathbb{C}^m} \prod_{i=1}^m |t_i|^{2a} |1 - t_i|^{2b} \prod_{1 \le i < j \le m} |t_i - t_j|^{2g}\ dt_1 \wedge d\bar t_1 \wedge \cdots \wedge dt_m \wedge d\bar t_m, \]
and, furthermore, the integral discussed in [5] (called the integral over the whole 2D space, therein),
\[ \int_{\mathbb{C}^m} \prod_{i=1}^m |t_i|^{2a} |1 - t_i|^{2b} |z - t_i|^{2c} \prod_{1 \le i < j \le m} |t_i - t_j|^{2g}\ dt_1 \wedge d\bar t_1 \wedge \cdots \wedge dt_m \wedge d\bar t_m, \]
could be interpreted in the same framework. We will investigate these integrals, which are related with the Hodge structure, in future publications.

∗ It is completed in our preprint "Intersection numbers of twisted cycles and the correlation functions of the conformal field theory II", Kyushu Univ. Preprint Series in Mathematics, 2002-23.

6. Appendix

Our method is applicable also to correlation functions in the Wess-Zumino-Witten model. We explain it by giving an example of the Wess-Zumino-Witten model related to $SU(N)$. Let $u(t) = t^a (t-1)^b (t-z)^c$ be a multivalued function that determines the local system $\mathcal{L}$ on $T = \mathbb{C}\setminus\{0, 1, z\}$. Suppose $a, b, c, a+b+c \in \mathbb{R}\setminus\mathbb{Z}$. Then the homology group $H_1^{lf}(T, \mathcal{L})$ and the cohomology group $H^1(T, \mathcal{L}^\vee)$ are 2-dimensional. Let $I_{ij}(z)$ ($1 \le i, j \le 2$) be the pairing $\langle C_i, \varphi_j(t)\, dt \rangle$ defined by $\sum_\rho a_{i\rho} \int_\rho v_\rho(t) \varphi_j(t)\, dt$ for bases $C_i = \sum_\rho a_{i\rho}\, \rho \otimes v_\rho(t)$ of $H_1^{lf}(T, \mathcal{L})$ and $\varphi_j(t)\, dt$ of $H^1(T, \mathcal{L}^\vee)$. Here each $\rho$ is a 1-simplex in $T$ and $v_\rho(t)$ a section of $\mathcal{L}$ on $\rho$. Then the matrix-valued function
\[ F(z, \bar z) = \begin{pmatrix} \overline{I_{11}(z)} & \overline{I_{21}(z)} \\ \overline{I_{12}(z)} & \overline{I_{22}(z)} \end{pmatrix} I_h^{-1} \begin{pmatrix} I_{11}(z) & I_{12}(z) \\ I_{21}(z) & I_{22}(z) \end{pmatrix}, \tag{13} \]
where $I_h$ is the same as (8), is shown to be monodromy invariant in the same way as (9). If we take
\[ C_1 = \overrightarrow{(1, \infty)} \otimes t^a (t-1)^b (t-z)^c, \qquad C_2 = \overrightarrow{(0, z)} \otimes t^a (1-t)^b (z-t)^c \]
as a basis of $H_1^{lf}(T, \mathcal{L})$, the $(i, j)$-entry of $F(z, \bar z)$ for $1 \le i, j \le 2$ is expressed as
\[ \frac{2}{\sqrt{-1}}\, \frac{1}{s(a+c)}\, \big\{ s(b)s(a+b+c)\, \overline{I_{1i}(z)} I_{1j}(z) + s(a)s(c)\, \overline{I_{2i}(z)} I_{2j}(z) \big\}, \tag{14} \]
where $s(A) = \sin(\pi A)$. In what follows, from (14) we derive the expression of the correlation function (4.18) in [9]. Take $\varphi_1(t) = t^{-1}$, $\varphi_2(t) = (1-t)^{-1}$ and put
\[ a = b = \frac{1}{N+k}, \qquad c = -\frac{N+1}{N+k}, \]
where $k > 1$ denotes the level and $N$ corresponds to $SU(N)$. Furthermore, define the functions $F_i^{(j)}(z)$ by
\[ \begin{pmatrix} F_1^{(0)}(z) \\ F_2^{(0)}(z) \end{pmatrix} = \frac{\Gamma(k/(N+k))}{\Gamma(1/(N+k))\, \Gamma((k-1)/(N+k))}\, [z(1-z)]^{\frac{1}{N(N+k)}} \begin{pmatrix} I_{21}(z) \\ I_{22}(z) \end{pmatrix}, \]
\[ \begin{pmatrix} F_1^{(1)}(z) \\ F_2^{(1)}(z) \end{pmatrix} = \frac{N\, \Gamma(N/(N+k))}{\Gamma(1/(N+k))\, \Gamma((N-1)/(N+k))}\, [z(1-z)]^{\frac{1}{N(N+k)}} \begin{pmatrix} I_{11}(z) \\ I_{12}(z) \end{pmatrix}. \]
As a result, the monodromy invariants (4.18) of [9] are given as follows (up to an overall factor):
\[ \overline{F_i^{(0)}(z)}\, F_j^{(0)}(z) + h\, \overline{F_i^{(1)}(z)}\, F_j^{(1)}(z), \]
where
\[ h = \frac{s((N-1)/(N+k))}{s((N+1)/(N+k))} \Big\{ \frac{1}{N}\, \frac{\Gamma((N-1)/(N+k))\, \Gamma(k/(N+k))}{\Gamma(N/(N+k))\, \Gamma((k-1)/(N+k))} \Big\}^2 = \frac{1}{N^2}\, \frac{\Gamma((N-1)/(N+k))\, \Gamma((N+1)/(N+k))\, \Gamma(k/(N+k))^2}{\Gamma((k-1)/(N+k))\, \Gamma((k+1)/(N+k))\, \Gamma(N/(N+k))^2}. \]
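The equality of the two expressions for h follows from the reflection formula Γ(x)Γ(1-x) = π/sin(πx). A short numerical sketch, with hypothetical function names and sample values of N and k (N ≥ 2, k > 1), makes this easy to confirm:

```python
import math


def s(x):
    """s(x) = sin(pi*x)."""
    return math.sin(math.pi * x)


def h_first_form(N, k):
    """Sine ratio times the squared ratio of gamma factors."""
    kappa = N + k
    G = math.gamma
    ratio = (G((N - 1) / kappa) * G(k / kappa)
             / (N * G(N / kappa) * G((k - 1) / kappa)))
    return s((N - 1) / kappa) / s((N + 1) / kappa) * ratio ** 2


def h_second_form(N, k):
    """Pure gamma-function expression."""
    kappa = N + k
    G = math.gamma
    numerator = G((N - 1) / kappa) * G((N + 1) / kappa) * G(k / kappa) ** 2
    denominator = (N ** 2 * G((k - 1) / kappa) * G((k + 1) / kappa)
                   * G(N / kappa) ** 2)
    return numerator / denominator


# Hypothetical sample values of (N, k).
for N, k in [(2, 3), (3, 5), (4, 2)]:
    print(N, k, h_first_form(N, k), h_second_form(N, k))
```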
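Note that
\[ F_j(z) = \begin{pmatrix} F_1^{(j)}(z) \\ F_2^{(j)}(z) \end{pmatrix} \]
satisfies the differential equation (the first equation of (4.6) in [9])
\[ \partial_z F_j(z) = \frac{1}{N\kappa} \Big\{ \frac{1}{z} \begin{pmatrix} 1 - N^2 & -N \\ 0 & 1 \end{pmatrix} + \frac{1}{z - 1} \begin{pmatrix} 1 & 0 \\ -N & 1 - N^2 \end{pmatrix} \Big\} F_j(z), \]
and that
\[ F_1^{(0)}(z) = z^{-2\Delta} (1 - z)^{\Delta_1 - 2\Delta}\ {}_2F_1\Big( {\tfrac{1}{\kappa},\ -\tfrac{1}{\kappa} \atop 1 - \tfrac{N}{\kappa}}\, ; z \Big), \]
\[ F_2^{(0)}(z) = \frac{1}{\kappa - N}\, z^{1 - 2\Delta} (1 - z)^{\Delta_1 - 2\Delta}\ {}_2F_1\Big( {1 + \tfrac{1}{\kappa},\ 1 - \tfrac{1}{\kappa} \atop 2 - \tfrac{N}{\kappa}}\, ; z \Big), \]
\[ F_1^{(1)}(z) = [z(1 - z)]^{\Delta_1 - 2\Delta}\ {}_2F_1\Big( {\tfrac{N-1}{\kappa},\ \tfrac{N+1}{\kappa} \atop 1 + \tfrac{N}{\kappa}}\, ; z \Big), \]
\[ F_2^{(1)}(z) = -N\, [z(1 - z)]^{\Delta_1 - 2\Delta}\ {}_2F_1\Big( {\tfrac{N-1}{\kappa},\ \tfrac{N+1}{\kappa} \atop \tfrac{N}{\kappa}}\, ; z \Big), \]
where
\[ \Delta = \frac{N^2 - 1}{2N(N+k)}, \qquad \Delta_1 = \frac{N}{N+k}, \qquad \text{and} \qquad \kappa = N + k. \]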
References

1. Aomoto, K.: On the complex Selberg integral. Quart. J. Math. Oxford 38(2), 385–399 (1987)
2. Belavin, A.A., Polyakov, A.M., Zamolodchikov, A.B.: Infinite conformal symmetry in two-dimensional quantum field theory. Nucl. Phys. B241, 333–380 (1984)
3. Cho, K., Matsumoto, K.: Intersection theory for twisted cohomologies and twisted Riemann's period relations I. Nagoya Math. J. 139, 67–86 (1995)
4. Dotsenko, Vl.S., Fateev, V.A.: Conformal algebra and multipoint correlation functions in 2D statistical models. Nucl. Phys. B240[FS12], 312–348 (1984)
5. Dotsenko, Vl.S., Fateev, V.A.: Four-point correlation functions and operator algebra in 2D conformal invariant theories with central charge C ≤ 1. ibid. B251[FS13], 691–734 (1985)
6. Hanamura, M., Yoshida, M.: Hodge structure on twisted cohomologies and twisted Riemann inequalities I. Nagoya Math. J. 154, 123–139 (1999)
7. Kita, M., Yoshida, M.: Intersection theory for twisted cycles. Math. Nachr. 166, 287–304 (1994)
8. Kita, M., Yoshida, M.: Intersection theory for twisted cycles II – Degenerate arrangements. Math. Nachr. 168, 171–190 (1994)
9. Knizhnik, V.G., Zamolodchikov, A.B.: Current algebra and Wess-Zumino model in two dimensions. Nucl. Phys. B247, 83–103 (1984)
10. Mimachi, K.: Homological representations of the Iwahori-Hecke algebra associated with a Selberg type integral. Preprint, 2002
11. Yoshida, M.: Hypergeometric Functions, My Love. Vieweg, 1997

Communicated by L. Takhtajan
Commun. Math. Phys. 234, 359–381 (2003) Digital Object Identifier (DOI) 10.1007/s00220-002-0776-2
Communications in
Mathematical Physics
On the Absolutely Continuous Spectrum of Stark Operators Galina Perelman Centre de Math´ematiques, Ecole Polytechnique, 91128 Palaiseau, France. E-mail:
[email protected] Received: 26 June 2002 / Accepted: 30 September 2002 Published online: 20 January 2003 – © Springer-Verlag 2003
Abstract: The stability of the absolutely continuous spectrum of the one-dimensional Stark operator
\[ H = -\frac{d^2}{dx^2} - Fx, \]
under perturbations of the potential is discussed. The focus is on proving this stability under minimal assumptions on smoothness of the perturbation. A general criterion is presented together with some applications. These include the case of periodic perturbations where we show that any perturbation $v \in L^1(\mathbb{T}) \cap H^{-1/2}(\mathbb{T})$ preserves the a.c. spectrum.

Introduction

In this note we consider the Stark operator on $\mathbb{R}$,
\[ H = -\frac{d^2}{dx^2} - Fx + v(x), \qquad F > 0, \tag{1} \]
where $v$ is supposed to be locally integrable and such that $H$ is in the limit point case at ±∞. If $v = 0$ the spectrum of $H$ is purely absolutely continuous:
\[ \sigma(H) = \sigma_{ac} = \mathbb{R}. \tag{2} \]
The question we address is which perturbations v preserve the absolutely continuous spectrum of H. Two classes of conditions are known to ensure preservation of the absolutely continuous spectrum: smoothness and decay. There exists an extensive physical and mathematical literature on this subject. We shall describe here only some recent results relevant to this note, referring the interested reader to [5, 6, 12, 16, 17, 24] and references therein. In order to study the stability of the absolutely continuous spectrum of one-dimensional Schrödinger operators two different approaches were worked out during the last
G. Perelman
years (both of them were initially developed in the context of slowly decaying perturbations of the free Schrödinger operator ($F = 0$) and then applied to some more general background potentials). In the first approach the desired spectral statement is derived as a consequence of a stronger result establishing the WKB asymptotic behavior of the generalized eigenfunctions for almost all energies, see [8–11, 17–19, 22]. The proof is based on a representation of the solutions of the equation $(H - E)\psi = 0$ as sums of infinite series of multilinear operators and on a careful analysis of these operators. In the case of the Stark operator (1) (considered on $\mathbb{R}_+$) this approach allowed Christ and Kiselev [12] to prove the stability of the absolutely continuous spectrum under perturbations of the form $v = v_1 + v_2$, where both $v_1(x^2)$ and $x^{-1} v_2(x^2)$ are in $(L^1 + L^p)(\mathbb{R})$, and there exists $C_0 < 1$ such that $|v_2(x)| \le C_0 F|x|$ for $|x|$ sufficiently large. The disadvantage of this method is that it does not seem to allow the treatment of borderline perturbations.

In the case of the free Schrödinger operator the stability of the a.c. spectrum under the borderline (in the $L^p$ setting) perturbations $v \in L^2$ was proved by Deift and Killip [13] and by Molchanov, Novitskii and Vainberg [21]. In both [13] and [21] this is done by using the Buslaev-Faddeev-Zakharov trace formula. The trace formula method is effective only in the free case. To treat the case where the underlying potential is nonzero, a different although intimately related method was developed by Killip, see [16]. In the context of the Stark operator it gives in particular the stability of the absolutely continuous spectrum in the case of the borderline decay: $v(x^2) \in L^2$ (this is known to be optimal in the $L^p$ scale).

In this note we describe a modification of Killip's method which allows one to treat both cases, "smoothness and decay", simultaneously. In particular, for the case considered by Christ-Kiselev we get the persistence of the a.c. spectrum under perturbations of the form $v_1(x^2) \in L^2$, $x^{-1} v_2(x^2) \in L^2$.

Our second example is the case of periodic perturbations $v(x) = v(x + 1)$. In this special case the spectrum of (1) is known to remain absolutely continuous and to cover the whole axis under rather weak smoothness requirements on the potential. In fact, (2) can be proved for $v \in H^s(\mathbb{T})$, $s > 0$, see [4, 7]. On the other hand there are examples of very singular periodic perturbations, such as an array of δ potentials, for which the spectrum has no a.c. component or is even pure point, see [2, 3, 15, 20]. It was argued by Ao [1] that the spectral nature depends on the gap structure of the periodic perturbation. He conjectured that if the size of the $n$th gap behaves as $n^{-\alpha}$, the spectrum is continuous for $\alpha > 0$ and pure point for $\alpha < 0$ (at least for "non-resonant" $F$). In the critical case $\alpha = 0$, which corresponds to a comb of δ function potentials, a transition from pure point to continuous spectrum is expected as $F$ grows (see also [14] for related phenomena in the random setting). Furthermore, the spectral nature in this critical case seems to depend also on number theoretical properties of $F$ [7]. Our contribution to the issue is a proof of the persistence of the a.c. spectrum under perturbations $v \in L^1(\mathbb{T}) \cap H^{-1/2}(\mathbb{T})$, which corresponds to $\alpha > 0$ in the language of Ao.

1. Preliminary Notions and Statement of the Results

Consider the Stark operator
\[ H = -\frac{d^2}{dx^2} - Fx + q(x) + v(x), \qquad F > 0, \quad x \in \mathbb{R}, \tag{1.1} \]
Absolutely Continuous Spectrum of Stark Operators
$q = q_0 + q_1$, where $q_0$ is a "smooth" part of the potential that will be allowed to grow at infinity at a rate arbitrarily close to that of the constant electric field, and $\langle x \rangle^{-1/2} q_1 \in L^1(\mathbb{R})$. Finally, $v$ contains a "singular part without decay". More precisely, we assume the following.

Hypothesis H1. (i) $|q_0(x)| \le C_0 F|x| + C_1$ for some constants $C_0$, $C_1$, $C_0 < 1$; (ii) $\langle x \rangle^{-5/4} q_0 \in L^2(\mathbb{R}_+)$, $\langle x \rangle^{-3/2} q_0 \in L^1(\mathbb{R}_+)$.

Here we use the notation $\langle x \rangle = (1 + x^2)^{1/2}$. It is clear that in the case $v = 0$ assumption (H1) implies $\sigma(H) = \sigma_{ac}(H) = \mathbb{R}$. The potential $v$ is supposed to be uniformly in $L^1_{loc}$:

Hypothesis H2.
\[ |v| \equiv \sup_{\alpha \in \mathbb{R}} \int_\alpha^{\alpha+1} dx\, |v| < \infty. \]
This implies in particular that the operator H is in the limit point case at ±∞. Introduce the notation ∞ e2i( (y,E)− (x,E)) (x, E) = dy v(y), (F y − q0 )1/2 x
(1.2)
where
1 (x, E) = (F x − q0 )1/2 + E(F x − q0 )−1/2 . 2 Clearly, the integral in r.h.s. of (1.2) converges for E with imE > 0.
Hypothesis H3. x −1/4 (x, E) is in $L^2$ at +∞ with respect to x.

It is worth mentioning that if the above assumption is valid for some value $E_0$ of $E$ with $\mathrm{im}\, E_0 > 0$, then the same is true for any $E$ with $\mathrm{im}\, E > 0$; see Appendix 2, (A2.10). Our main result is the following.

Theorem 1.1. Under assumptions (H1), (H2) and (H3), an essential support of the absolutely continuous spectrum of the operator $H$, $\Sigma_{ac}(H)$, coincides with the whole axis.

Remark. In this and subsequent statements only the behavior of the potential as $x \to +\infty$ really matters. On the negative part of the real axis it is sufficient for all the conclusions to require that, in some suitable sense, $q + v - Fx \to +\infty$ as $x \to -\infty$.

Here are some simple applications; see Section 3 for the details.

Corollary 1.1. Suppose that (i) $|q_0^{(k)}(x)| \le c \langle x \rangle^{\alpha - k}$, $k = 0, 1, 2$, for some $\alpha < 1$; (ii) $v$ is periodic, $v(x) = v(x + 1)$, $v \in L^1(\mathbb{T}) \cap H^{-1/2}(\mathbb{T})$. Then $\Sigma_{ac}(H) = \mathbb{R}$.
Corollary 1.2. Assume that $v = v_1 + v_2$, where $v_1(x) \in \langle x\rangle^{1/2} L_1 + \langle x\rangle^{1/4} L_2$, $v_2 \in \langle x\rangle L_1 + \langle x\rangle^{3/4} L_2$, and there exists $C_0 < 1$ such that $|v_2| \le C_0 F|x|$ for $|x|$ sufficiently large. Then the essential support of the a.c. spectrum of operator (1) coincides with $\mathbb{R}$.

The rest of the paper is arranged as follows. The next section contains the proof of Theorem 1.1. Corollaries 1.1, 1.2 are proved in Sect. 3, with some technical details deferred to the appendices.

2. Proof of Theorem 1.1

2.1. Some preliminary reductions. The proof is based on the ideas of [13, 16]. We start by replacing the whole-line operator by a half-line operator, imposing a Dirichlet condition at a point $x = R$. This is a relatively trace class perturbation, and it decomposes the whole-line operator into the direct sum of two half-line operators. Since $-\frac{d^2}{dx^2} + \alpha(q_0 - Fx) + q_1 + v$ is bounded from below for any $\alpha > 0$ and $q_0 - Fx \to +\infty$ as $x \to -\infty$, the spectrum of the left half-line operator is discrete. Thus, by the Birman-Kuroda theorem, the absolutely continuous parts of the operator $H$ and of the right half-line operator are unitarily equivalent. In particular, $\Sigma_{ac}(H) = \Sigma_{ac}(H_R)$ for any $R \in \mathbb{R}$. Here $H_R$ denotes the right half-line operator with the Dirichlet condition at $x = R$,
\[
(H_R\psi)(x) = -\psi''(x) + (q(x) + v(x) - Fx)\psi(x), \qquad \psi(R) = 0, \qquad x \ge R.
\]
So, to prove the theorem it is sufficient to show that for any compact interval $I \subset \mathbb{R}$ there exists $R$ such that $I \subset \Sigma_{ac}(H_R)$. From now on we omit the index $R$ in the notation $H_R$. For $E \in \mathbb{C}$, let $\varphi(x,E)$, $\theta(x,E)$ denote the solutions of
\[
(H - E)\psi = 0, \tag{2.1}
\]
which obey $\varphi(R,E) = \theta'(R,E) = 0$, $\varphi'(R,E) = \theta(R,E) = 1$. Since $H$ is in the limit point case at infinity, for $E \in \mathbb{C}\setminus\mathbb{R}$ there exists a unique solution $\psi$ of (2.1) such that $\psi(\cdot,E) \in L_2$ at $+\infty$ and $\psi(R,E) = 1$. Expressing $\psi$ in terms of $\varphi$ and $\theta$ one gets the Weyl m-function:
\[
\psi(x,E) = \theta(x,E) + m(E)\varphi(x,E).
\]
$m(E)$ is a holomorphic function of $E \in \mathbb{C}\setminus\mathbb{R}$. In fact, $m(E)$ admits a unique representation
\[
m(E) = A + \int_{\mathbb{R}} \left[\frac{1}{\xi - E} - \frac{\xi}{1+\xi^2}\right] d\mu(\xi), \qquad \forall E \in \mathbb{C}\setminus\mathbb{R},
\]
where $A \in \mathbb{R}$ and $d\mu$ is a positive measure with $\int \frac{d\mu}{1+\xi^2}$ finite. The measure $d\mu$ is called the spectral measure, and $H$ is unitarily equivalent to $f(\xi) \mapsto \xi f(\xi)$ acting in $L_2(d\mu)$. We shall consider $H$ as a perturbation of the operator $H_0$,
\[
(H_0\psi)(x) = -\psi''(x) + (q(x) - Fx)\psi(x), \qquad \psi(R) = 0, \qquad x \ge R.
\]
Absolutely Continuous Spectrum of Stark Operators
363
Actually, following [13, 16], rather than deal with the true perturbation $v$ we shall study the operators $H_N = H_0 + v_N$,
\[
v_N(x) = \begin{cases} v(x), & x \le R + N, \\ 0, & x > R + N, \end{cases}
\]
and get some a priori estimates that allow us to pass to the limit $N \to \infty$. $H_N$ will be considered on the half-line $\{x \ge R\}$ with the Dirichlet boundary condition, and for consistency of notation we fix $v_0 \equiv 0$. Let $m_N$, $d\mu_N$ denote the Weyl m-function and the spectral measure of the operator $H_N$. Clearly, as $N \to +\infty$, $m_N(E) \to m(E)$ uniformly on compact subsets of $\mathbb{C}_+$, which implies that the spectral measures $d\mu_N$ converge weakly to $d\mu$. Consider the equation
\[
-\psi_{xx} + (q(x) - Fx)\psi = E\psi. \tag{2.2}
\]
Using the Liouville substitution one can easily construct a solution of (2.2) with the standard behavior at +∞. Since the subject is so well known we just formulate the necessary results. Lemma 2.1. For E, im E ≥ 0 there exist a solution f0 (x, E) of (2.2), which is a holomorphic function of E ∈ C+ , continuous up to the boundary, with the following asymptotic behavior as x → +∞ f0 = fas (1 + o(1)), where fas (x, E) = p−1/2 (x, E)ei ∞ (x,E) , p(x, E) = (F x − q0 + E)1/2 , ∞ x E E (x, E) = ds p (s) + ds p(s, E) − p (s) − − , ∞ 0 0 2p0 (s) 2p0 (s) x0 x p0 (x) = p(x, 0). The roots here are defined on the complex plane with the cut along negative imaginary semi-axes and is positive for positive values of the arguments. This asymptotic representation can be differentiated with respect to x and E. For E ∈ R the solutions f0 , f 0 form a basis, w(f0 , f 0 ) = 2i. Let ϕ0 (x, E) denote a solution of (2.2) with the initial conditions ϕ0 (R, E) = 0, ϕ0 (R, E) = 1. For E ∈ R one can express ϕ in terms of f, f : ϕ0 (x, E) = a0 (E)f0 (x, E) + a0 (E)f0 (x, E), where a0 (E) = 2i f0 (R, E) is a holomorphic function of E ∈ C+ , continuous up to the boundary, a0 (E) = 0, imE ≥ 0. 1 Since imm0 (E + i0) = 4|a (E)| 2 , E ∈ R, 0
dµ0 (E) =
1 dE, 4|a0 (E)|2
where m0 and dµ0 stand for the Weyl m-function and the spectral measure of H0 respectively.
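The identity $\operatorname{Im} m_0(E + i0) = 1/(4|a_0(E)|^2)$ can be checked in a few lines. The sketch below is a reconstruction under stated assumptions, not the author's own computation: it takes the Wronskian to be $w(f,g) = f'g - fg'$ (the sign that makes (2.3) consistent with $\varphi_N(R) = 0$, $\varphi_N'(R) = 1$), reads the normalization of Lemma 2.1 as $w(f_0, \overline{f_0}) = 2i$ for real $E$, and uses $a_0(E) = \frac{i}{2} f_0(R,E)$. The formula $m_0 = \psi_0'(R)$ follows from $\psi = \theta + m\varphi$ with $\theta'(R) = 0$, $\varphi'(R) = 1$.

```latex
\begin{align*}
\psi_0(x,E) &= \frac{f_0(x,E)}{f_0(R,E)}, \qquad
m_0(E) = \psi_0'(R,E) = \frac{f_0'(R,E)}{f_0(R,E)}, \\
\operatorname{Im} m_0(E+i0)
 &= \frac{\operatorname{Im}\bigl(f_0'(R,E)\,\overline{f_0(R,E)}\bigr)}{|f_0(R,E)|^2}
  = \frac{w(f_0,\overline{f_0})}{2i\,|f_0(R,E)|^2}
  = \frac{1}{|f_0(R,E)|^2}
  = \frac{1}{4|a_0(E)|^2}, \qquad E\in\mathbb{R}.
\end{align*}
```

Here $\psi_0$ is the solution that is square integrable at $+\infty$ for $\operatorname{Im} E > 0$, and the continuity of $f_0$ up to the real axis is what allows the boundary value $E + i0$ to be taken.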
One may similarly express the spectral measure $d\mu_N$ of the perturbed operators $H_N$ in terms of the corresponding coefficients $a_N$:
\[
d\mu_N(E) = \frac{1}{4|a_N(E)|^2}\,dE, \qquad a_N = \frac{i}{2}\,w(\varphi_N, f_N) = \frac{i}{2}\,f_N(R,E), \tag{2.3}
\]
where $\varphi_N$, $f_N$ are the solutions of the equation $(H_N - E)\psi = 0$ satisfying the conditions
\[
\varphi_N(R,E) = 0, \qquad \varphi_N'(R,E) = 1, \qquad f_N(x,E) = f_0(x,E), \quad x > R + N.
\]
Clearly, $a_N(E)$ is a holomorphic function of $E \in \mathbb{C}_+$, $a_N(E) \ne 0$, $\operatorname{Im} E \ge 0$. As a consequence of representation (2.3) and the weak convergence $d\mu_N \rightharpoonup d\mu$ as $N \to +\infty$ one has the following proposition (see [13, 16] for the proof).

Proposition 2.1. Let $I \subset \mathbb{R}$ be a compact interval. Suppose there exists a function $g \in C(I)$, almost everywhere positive, such that
\[
\int_I dE\, g(E) \ln|a_N(E)| \le C, \tag{2.4}
\]
uniformly in $N$. Then $I \subset \Sigma_{ac}(H)$.

The following observation is due to Killip.

Lemma 2.2. Assume that for some $\nu$ and for $E$ in a rectangle $\{\operatorname{Re} E \in I,\ 0 < \operatorname{Im} E \le K\}$ the following estimate holds uniformly in $N$:
\[
|\ln|a_N(E)|| \le C|\operatorname{Im} E|^{\nu}.
\]
Then there exists a function $g \in C(I)$, $g > 0$ almost everywhere, such that (2.4) is satisfied.

See [16] for the proof. The remainder of the section is devoted to the proof of the following proposition.

Proposition 2.2. Let (H1), (H2), (H3) hold. Then for any compact $K \subset \mathbb{C}_+$ there exist constants $R_0$, $C$, independent of $N$, such that for $E \in K$, $R > R_0$,
\[
|\ln|a_N(E)|| \le C|\operatorname{Im} E|^{-5}.
\]
By Proposition 2.1 and Lemma 2.2, this statement implies Theorem 1.1. We prove Proposition 2.2 in two steps. First (Subsect. 2.2-2.6), we show that
\[
\left|\frac{d}{dE}\ln a_N(E)\right| \le C|\operatorname{Im} E|^{-6},
\]
provided $R > R_0$. Second (Subsect. 2.7), we prove that $|\ln|a_N(E)||$ is bounded uniformly on compact subsets of $\mathbb{C}_+$, again for $R$ sufficiently large.
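How the two steps combine can be sketched as follows. This is a hedged reconstruction rather than the author's wording: it assumes that both bounds hold with constants independent of $N$ for $\operatorname{Re} E$ in a compact interval and $0 < \operatorname{Im} E \le K$, and it introduces the shorthand $\varepsilon = \operatorname{Im} E$, $E_K = \operatorname{Re} E + iK$ and the integration variable $t$, none of which appear in the original. Integrating the derivative bound along the vertical segment from $E$ to $E_K$ and taking real parts gives:

```latex
\begin{align*}
\ln|a_N(E)| &= \ln|a_N(E_K)|
   + \operatorname{Im}\int_{\varepsilon}^{K}
     \Bigl(\frac{d}{dE}\ln a_N\Bigr)(\operatorname{Re}E + it)\,dt, \\
\bigl|\ln|a_N(E)|\bigr| &\le \bigl|\ln|a_N(E_K)|\bigr| + C\int_{\varepsilon}^{K} t^{-6}\,dt
   = \bigl|\ln|a_N(E_K)|\bigr| + \frac{C}{5}\bigl(\varepsilon^{-5} - K^{-5}\bigr)
   \le C'\,\varepsilon^{-5}.
\end{align*}
```

The last inequality uses the second step (boundedness of $\ln|a_N|$ at height $K$) together with $\varepsilon \le K$, so that a constant can be absorbed into a multiple of $\varepsilon^{-5}$.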
2.2. Trace formula. To estimate the derivative $\frac{d}{dE}\ln a_N(E)$, $\operatorname{Im} E > 0$, we use the trace formula. Let $R_N(E)$ denote the resolvent of $H_N$: $R_N(E) = (H_N - E)^{-1}$. For $\operatorname{Im} E > 0$ its kernel is given by the formula
\[
R_N(x,y,E) = \frac{i}{2a_N(E)}
\begin{cases}
\varphi_N(x,E) f_N(y,E) & \text{for } x \le y, \\
\varphi_N(y,E) f_N(x,E) & \text{for } x \ge y.
\end{cases}
\]
For $\operatorname{Im} E > 0$ consider the trace of the difference $R_N(E) - R_0(E)$, understood as the trace of the integral operator
\[
\operatorname{tr}[R_N(E) - R_0(E)] = i\int_R^{\infty} dx \left[\frac{\varphi_N(x,E) f_N(x,E)}{2a_N(E)} - \frac{\varphi_0(x,E) f_0(x,E)}{2a_0(E)}\right].
\]
Since $v_N$ is supported on a compact subset, $\operatorname{tr}[R_N(E) - R_0(E)]$ is finite and is connected with the function $a_N(E)$ by the following relation:
\[
\frac{d}{dE}\ln\frac{a_N(E)}{a_0(E)} = -\operatorname{tr}[R_N(E) - R_0(E)].
\]
To calculate $\operatorname{tr}[R_N(E) - R_0(E)]$ we write the equation
\[
(H_N - E)\psi = f, \qquad \operatorname{Im} E > 0, \tag{2.5}
\]
as a first-order system 0 ψ ψ 0 1 − = q + v − F x − E 0 N f ψ ψ and apply a variation of parameter-type transformation that brings the system into nearly diagonalized form: ψ z1 1 1 = E(x, E)z, E(x, E) = , ∗ (x, E) , z = E(x, E) E ψ z2 E=
fas , fas
E ∗ (x, E) =
∗ fas ∗ , fas = p−1/2 e−i ∗ fas
We arrive at HN (E)z = g,
g=
∞
.
i f , 2 −f
d (E) + VN . HN (E) = HN d (V ) stands for the diagonal (anti-diagonal) part of the operator H : Here HN N N d E 0 d HN + iVN σ3 , VN = VN σ2 , =p − 0 E∗ dx
where
1 1 5 1 vN + V , V = q1 − (F − q0 )2 p −4 − q0 p −2 , 2 2 32 8 V ∈ L1 (+∞), σ2 , σ3 being the standard Pauli matrices 0 i 1 0 σ2 = , σ3 = . −i 0 0 −1 VN =
< x >−1/2
We shall consider HN (E) as an operator in L2 ([R, ∞) → C2 ) submitted to the boundary condition: z1 (R) + z2 (R) = 0. Clearly, for imE > 0, HN (E) has a bounded inverse, RN (E). Its kernel can be expressed in terms of the solutions fN , ϕN as follows: ˆ N (x, y, E)E(y, E)p−1 (y, E), R ˆ N = {R ˆ ij }2 RN (x, y, E) = E−1 (x, E)R N i,j =1 , ˆ 11 (x, y, E) = −R ˆ 22 (y, x, E), R ˆ ij (x, y, E) = R ˆ ij (y, x, E), j = i, R (2.6) N N N N 1 −ϕN (x, E)fN (y, E) ϕN (x, E)fN (y, E) ˆ N (x, y, E) = R for x ≤ y. 2iaN (E) −ϕN (x, E)fN (y, E) ϕN (x, E)fN (y, E)
It is worth mentioning that the components
ij RN
of RN , RN =
the relations similar to (2.6). In terms of the operator RN (E) the trace formula takes the form 2i
12 R11 N RN
22 R21 N RN
d aN (E) 22 N 21 12 N ln = tr[R11 j (E) − Rj (E)]j =0 + [trRj (E) − trRj (E)]j =0 . dE a0 (E)
satisfy
(2.7)
Here and through paper we use the notation [bj ]N j =0 = bN − b0 . 21 Consider the expression trRN (E). It is easy to check that trR12 N (E) is finite and admits the representation trR12 N (E) = −
i i − trRN (E)QN (E), 4p 2 (R, E) 2
(2.8)
where QN = Q1N + iQ2N , 3 25 Q1N (E) = −(q1 + vN )p −2 + q0 p −4 + (F − q0 )2 p −6 , 4 16 Q2N (E) = (F − q0 )p −3 . Let us mention that x −1/2 Q1N ∈ L1 (+∞) and Q1N is uniformly in L1loc (+∞): x −1/2 Q1N L1 ([R,∞)) , |Q1N | ≡ sup Q1N L1 ([α,α+1]) ≤ C, α≥R
provided R is sufficiently large. A representation similar to (2.8) holds for trR21 N (E): trR21 N (E) =
i 4p 2 (R, E)
i + trRN (E)Q∗N (E), 2
(2.9)
where Q∗N = Q1N − iQ2N . Combining (2.8), (2.9) with (2.7) one gets finally 2i
d aN (E) 1 1 22 N ln = tr[(R11 j (E) − Rj (E))(1 − Qj (E))]j =0 dE a0 (E) 2 1 1 1 N 12 1 N − [trR21 j (E)Qj (E)]j =0 + [trRj (E)Qj (E)]j =0 . 2 2
(2.10)
2.3. Preliminary estimates of the modified resolvent RN (E). In this subsection we describe some simple estimates related to RN (E) that we shall need later. Here and below (except for Subsect. 2.7) C is used as a general notation for the constants that are independent of N , R, uniform with respect to E in compact subsets of C+ but that may depend on q0 , q1 , v. Proposition 2.3. For imE > 0, RN (E) is bounded as an operator from l2 (L1 )+ < x >1/4 L1 to l2 (L∞ )∩ < x >−1/4 L∞ . Moreover, (im E)−1 gl2 (L1 ) , RN (E)g 1 ≤ C (im E)−3/2 < x >−1/4 gL1 , −1 l2 (L∞ ) (im E)−3/2 gl2 (L1 ) , < x >1/4 RN (E)g 1 ≤ C (im E)−2 < x >−1/4 gL1 , −1 L∞ (im E)−2 gl2 (L1 ) , RN (E)g 1 ≤C (im E)−3/2 < x >−1/4 gL1 , 1 l2 (L∞ ) (im E)−5/2 gl2 (L1 ) , < x >1/4 RN (E)g 1 ≤C (im E)−2 < x >−1/4 gL1 , 1 L∞ provided R is sufficiently large : R > C. Here l p (Lq ) norms are defined by f l p (Lq ) =
∞ n=0
1
p
p
f Lq (n )
,
n = [n + R, n + 1 + R].
Proof. The above estimates are an immediate consequence of the corresponding properties of the standard resolvent RN and the representations g (RN (E)g)(x) RN (E) (x) = −2iE−1 (x, E) , (RN (E)g) (x) −g g i g1 + z, = RN (E) RN (E) g 2 −g1 where
f0 g ∗ g1 = 2 − E − E − v N g2 , p f0 x 1 g −1 g2 = 2f0 . dy , z = g2 E f0−1 f0 pf0 R
By Lemma 2.1,
|g2 (x)| ≤ C
where β = one obtains
√ 1 , F (1+C0 )
x
R
dye−β im E(x
1/2 −y 1/2 )
x ≥ R, R is supposed to be sufficiently large. As a consequence,
(imE)
1/2
|g(y)| , x 1/4 y 1/4
g2 l2 (L∞ ) , x
1/4
g2 L∞ ≤ C
(imE)−1/2 gl2 (L1 ) , x −1/4 gL1 .
Thus, in order to get Proposition 2.3 it is sufficient to prove the following set of estimates for RN −1/2 d RN (E)gl2 (L∞ ) , x ≤ C(imE)−1 gl2 (L1 ) , (2.11) RN (E)g dx l2 (L∞ ) −1/2 d RN (E)gl2 (L∞ ) , x ≤ C(imE)−3/2 x −1/4 gL1 , (2.12) RN (E)g dx l2 (L∞ ) −1/4 d (imE)−3/2 gl2 (L1 ) , 1/4 x RN (E)gL∞ , x RN (E)g ≤C (2.13) (imE)−2 x −1/4 gL1 . dx L∞ Consider first the unperturbed resolvent R0 . By Lemma 2.1, |R0 (x, y, E)|, |x −1/2 R0x (x, y, E)|, |y −1/2 R0y (x, y, E)| 1/2 1/2 ≤ Cx −1/4 y −1/4 e−β im E(x −y ) , x > y, R is sufficiently large, which leads immediately to the estimates −1/2 d R x (E)g ≤ C(imE)−1/2 x −1/4 gL1 , R0 (E)gl2 (L∞ ) , 0 dx l2 (L∞ ) −1/4 d (imE)−1/2 gl2 (L1 ) , 1/4 x R0 (E)gL∞ , x R0 (E)g ≤ C x −1/4 gL1 . dx L∞ By the resolvent identity
(2.14)
(2.15)
RN = R0 − RN vN R0 ,
inequality (2.12) is a consequence of (2.11) and (2.14). In a similar way (2.13) follows from (2.11), (2.12) and (2.15). At last, the proof of (2.11) is a simple elliptic type argument, see Appendix 1 for the details. N N 1 21 1 2.4. Estimate [trR12 j (E)Qj (E)]j =0 , [trRj (E)Qj (E)]j =0 . To estimate the contribu12 21 tions of Rj , Rj to (2.10) we use the iterated resolvent identity
\[
R_N = T_N - A_N + R_N V_N A_N, \tag{2.16}
\]
where
\[
T_N = (H_N^d)^{-1}, \qquad A_N = T_N V_N T_N.
\]
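Identity (2.16) is the second iterate of the usual resolvent identity. As a hedged sketch in generic notation (the letters $D$, $V$, $T$, $R$, $A$ below are stand-ins introduced here, with $T = D^{-1}$, $R = (D+V)^{-1}$ and $A = TVT$ mirroring the roles of $T_N$, $R_N$, $A_N$), assume that $D$ and $D + V$ are both boundedly invertible. Then:

```latex
\begin{align*}
R - T &= R\,\bigl(D - (D+V)\bigr)\,T = -R\,V\,T, \\
R &= T - R\,V\,T
   = T - (T - R\,V\,T)\,V\,T
   = T - T V T + R\,V\,(T V T)
   = T - A + R\,V A.
\end{align*}
```

The first line uses only $DT = I$ and $R(D+V) = I$; the second substitutes the first identity into itself once.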
The kernel of the operator TN has the form tN0 (x, y) t (x, y):(x − y) , TN (x, y) = N 0 −tN (y, x):(y − x) x
−1
ei y ds(p−VN p ) tN (x, y) = , (p(x)p(y))1/2
tN0 (x, y)
=
ei
x R
+
(2.17)
y R
ds(p−VN p−1 )
(p(x)p(y))1/2
,
where : is the Heviside function. Clearly, |tN (x, y)| ≤ Cx −1/4 y −1/4 e−β im E(x
1/2 −y 1/2 )
,
x ≥ y ≥ R,
(2.18)
|tN0 (x, y)| ≤ Cx −1/4 y −1/4 e−β im E(x +y −2R ) . (2.19)
12 A11 N (x, y) AN (x, y) Consider the operator AN . Its kernel AN (x, y) = has the form 22 A21 N (x, y) AN (x, y) 1/2
0 1 A11 N (x, y) = −itN (x, y)N (y),
1/2
1/2
11 A22 N (x, y) = −AN (y, x),
0 1 2 A12 N (x, y) = −itN (x, y)N (R) − itN (x, y)N (y), 1 A21 N (x, y) = itN (x, y)N (x), ij AN (x, y)
=
ij AN (y, x),
x ≥ y,
x ≥ y,
(2.20) (2.21) (2.22)
i = j.
Here 1 (x, E) N
=
2 (x, E) = N
Clearly,
∞ x x R
dy
dy
e2i
e2i
y x
x y
ds(p−VN p−1 )
p(y, E) ds(p−VN p−1 )
p(y, E)
i N L∞ ([R,∞)) ≤ C|imE|−1 ,
i Some more estimates of N
VN (y),
VN (y).
i = 1, 2.
(2.23)
are collected in the following lemma, the proofs being given
in Appendix 2.
i satisfy the inequalities: Lemma 2.3. For R sufficiently large, the functions N i i N (·, E)L∞ ([R,∞)) , x −1/4 N (·, E)l2 (L∞ )([R,∞))
≤
C 2/3 (x −1/4 (·, E0 )L2 ([R,∞)) + R −1/6 + ρ(R, q)), imE
imE0 > 0, where ρ(R; q0 , q1 ) = x −5/4 q0 2L2 ([R,∞)) + x −3/2 q0 L1 ([R,∞)) + x −1/2 q1 L1 ([R,∞)) , being defined by (1.2). The constants here depend also on E0 . From now on we fix some E0 ∈ C+ and shall not make precise a dependence on E0 anymore. In particular, (x) will stand for (x, E0 ). As an immediate consequence of the representations (2.20–2.22), inequalities (2.18), (2.19), (2.23) and Lemma 2.3, one gets the following set of estimates for the kernel AN (x, y). k x 1/4 y 1/4 AN L∞ ([R,∞)2 ) ≤ C N L∞ ([R,∞)) ≤ C(imE)−1 , (2.24) k=1,2
Aa ≡ sup x 1/4 AN (x, ·)l2 (L∞ )([R,∞)) + sup y 1/4 AN (·, y)l2 (L∞ )([R,∞)) x≥R
≤ C(imE)−1/2
k=1,2
y≥R
k N L∞ ([R,∞)) ≤ C(imE)−3/2 ,
(2.25)
1 AN l2 (L∞ )([R,∞)2 ) ≤ C (imE)−1 N L∞ ([R,∞))
+(imE)
−1/2
x
−1/4
k=1,2
k N l2 (L∞ )([R,∞))
≤ C(imE)−2 . (2.26)
Here AN 2l (L 2
2 ∞ )([R,∞) )
=
∞ m,n=0
AN 2L∞ (Qmn ) ,
Qmn being the unit cubes Qmn = {n ≤ x − R ≤ n + 1, m ≤ y − R ≤ m + 1}. ij Consider the expressions trRN (E)Q1N (E), i = j . By (2.16), (2.17) they admit the representations 1 0 1 12 1 1 trR12 N QN = trtN QN − trAN QN + trpRN VN AN qQN , 1 21 1 1 trR21 N QN = −trAN QN + trqRN VN AN pQN .
(2.27)
Here p, (q) is the projection of C2 -vector onto the first (second) component: p = 00 q= . 01 It follows immediately from (2.19) that ∞ 0 1 |trtN QN | ≤ C x −1/2 |Q1N |dx ≤ C. R
(2.28) 10 , 00
(2.29)
By (2.18), (2.19), (2.21-2.23), C . imE Consider the last term in r.h.s. of (2.27). We represent it as the sum: 1 21 1 |trA12 N QN |, |trAN QN | ≤
(2.30)
trpRN VN AN qQ1N = I1 + I2 , 1 trpRN vN σ2 AN qQ1N . 2 Using Proposition 2.3 one can estimate I1 , I2 as follows. I1 = trpRN V σ2 AN qQ1N ,
I2 =
|I1 | ≤ CRN x 1/4 L1 →x −1/4 L∞ x −1/2 Q1N L1 x −1/2 V L1 × x 1/4 y 1/4 AN L∞ ≤ C(imE)−3 ,
|I2 | ≤ CRN l 2 (L1 )→x −1/4 L∞ x −1/2 Q1N L1 |v| AN a ≤ C(imE)−4 . Thus,
|trpRN VN AN qQ1N | ≤ C(imE)−4 . In a similar way one gets for the last term in (2.28):
(2.31)
|trqRN VN AN pQ1N | ≤ C(imE)−4 .
(2.32)
Combining (2.29–2.32) one obtains 1 21 1 −4 |trR12 N (E)QN (E)|, |trRN (E)QN (E)| ≤ C(imE) .
(2.33)
1 1 N 22 2.5. Estimate of tr[(R11 j (E) − Rj (E))(1 − 2 Qj (E)]j =0 . To estimate this expression we represent RN in the form
RN =
3
˜ N, RN,k + R
k=0
where
RN,k = (−1)k TN (VN TN )k ,
˜ N = RN (VN TN )4 . R
11 − R 22 )(x, x) = • Contribution of RN,0 . It follows from (2.17) that (RN,0 N,0
1 N 1 1 N 1 ∞ |[Qj ]j =0 | 11 22 dx ≤ C. tr (Rj,0 − Rj,0 ) 1 − Qj ≤ 2 2 R |p| j =0
1 p,
so
(2.34)
0 1 • Contribution of RN,1 = AN . Since Akk N = ±itN (x, x)N (x), k = 1, 2, the estimates (2.19), (2.23) give kk 1 1 1 trA L∞ ≤ C(imE)−2 . Q 1 − ≤ C(imE)−1 N N 2 N
Thus,
1 1 N 11 22 tr Rj,1 − Rj,1 1 − Qj ≤ C(imE)−2 . 2 j =0
(2.35)
• Contribution of RN,2 = TN VN AN . The direct calculations give R11 N,2 (x, x, E) = −
1 (x, E) 2 (x, E) 1 1 x N −1 (x, E)N (R, E) N − e2i R ds(p−VN p ) N , p(x, E) p(x, E)
which implies tr R11 1 − 1 Q1 ≤ C[(imE)−1/2 1 L x −1/4 1 2 ∞ N,2 N N l2 (L∞ ) 2 N 1 2 +x −1/4 N l 2 (L∞ ) x −1/4 N l 2 (L∞ ) ] ≤ C(imE)−5/2 . 22 Since R11 N,2 (x, x) = −RN,2 (x, x), one gets finally N 1 22 1 − Q1j tr R11 ≤ C(imE)−5/2 . j,2 − Rj,2 2 j =0
(2.36)
• Contribution of RN,3 = AN VN AN . Represent RN,3 as the sum RN,3 = J1 + J2 , J1 = AN V σ2 AN , J2 =
1 A N v N σ2 A N . 2
Then, kk 1 1 −1/2 trJ V L1 (1 + |Q1N |)AN 2a ≤ C(imE)−3 . 1 1 − 2 QN ≤ Cx
(2.37)
For trJ2kk 1 − 21 Q1N , k = 1, 2, one can write the following estimate: kk 1 1 trJ 1 − ≤ C|v|(1 + |Q1N |)AN 2l 2 (L ) ≤ C(imE)−4 . Q 2 ∞ 2 N Combining (2.37) and (2.38), one gets N 1 22 1 − Q1j tr R11 ≤ C(imE)−4 . j,3 − Rj,3 2 j =0
(2.38)
(2.39)
˜ N into two ˜ N = RN VN AN VN AN , we break R • To estimate the contribution of R ˜ N = K1 + K 2 , parts: R 1 RN vN σ2 AN VN AN . 2
Using Proposition 2.3 one can estimate the expressions trK1kk 1 − 21 Q1N , k = 1, 2 as follows: K1 = RN V σ2 AN VN AN , K2 =
1 |trK1kk (1 − Q1N )| 2 ≤ C(1 + |Q1N |)RN x 1/4 L1 →l 2 (L∞ ) (x −1/2 V 2L1 x 1/4 y 1/4 AN L∞ AN a +|v| x −1/2 V L1 AN l 2 (L∞ ) AN a ) ≤ C(imE)−5 .
For trK2kk 1 − 21 Q1N , k = 1, 2, one has kk 1 1 trK Q 1 − ≤ C(1 + |Q1N |)|v| RN l 2 (L1 )→l 2 (L∞ ) 2 2 N × (|v| AN 2l 2 (L ≤ C(imE)−6 . Thus,
∞)
+ x −1/2 V L1 AN 2a )
1 1 N 11 22 ˜ ˜ 1 − Qj tr Rj − Rj ≤ C(imE)−6 . 2 j =0
Combining estimates with (2.34–2.36), (2.39–2.40), one obtains 1 1 N 11 22 tr (Rj − Rj )(1 − Qj ) ≤ C(imE)−6 . 2 j =0
(2.40)
(2.41)
2.6. Estimate of the derivative $\frac{d}{dE}\ln a_N(E)$. Estimates (2.33), (2.41) together with representation (2.10) give
\[
\left|\frac{d}{dE}\ln a_N(E)\right| \le C(\operatorname{Im} E)^{-6}. \tag{2.42}
\]
2.7. Estimate of ln |aN (E)| outside the real axis. We shall derive the desired estimates working directly with the differential equation. In this subsection C stands as a general notation for the constants that are independent of N , R, uniform with respect to E in compact subsets of C+ , but that
may depend on v, q. Introduce the vector z = zz21 , fN = EE1 z, fN
∞ −1 fas ei x dsp VN 0 ∞ E1 = . ∗ e−i x dsp−1 VN 0 fas Then z satisfies the system z = −p −1 VN σ2 e2i
˜ ∞ σ3
z,
1 + o(1), as x → +∞, 0
z=
where ˜∞ =
∞+
∞ x
(2.43)
dsp −1 VN .
Using the integral representation for z2 z2 e
−2i ˜ ∞
= −i
∞
x
dy
e2i
x y
ds(p−VN p−1 )
p
V N z1 ,
(2.44)
one gets immediately the estimate |z2 e−2i
˜∞
| ≤ Cz1 L∞ .
(2.45)
By (2.43), this implies |z1 | ≤ C < x >−1/2 |VN |z1 L∞ .
(2.46)
Integrating by parts in (2.44) one can write the following representation: ∞ y −1 −2i ˜ ∞ 1 1 z2 e = −iz1 N − i dye2i x ds(p−VN p ) N z1 , x
which together with (2.46) leads to the estimate
1/2 ∞ −2β im E(y 1/2 −x 1/2 ) e −2i ˜ ∞ 1 1 2 . |z2 e | ≤ Cz1 L∞ |N | + dy |VN (y)||N (y)| y 1/2 x (2.47) In particular, |z2 e−2i
˜∞
1 | ≤ Cz1 L∞ N L∞ ≤ Cz1 L∞ ρ1 (R; v, q0 , q1 ),
where ρ1 (R; v, q0 , q1 ) = x −1/4 L2 ([R,∞)) + R −1/6 + ρ(R; q0 , q1 ) → 0, 2/3
as R → ∞.
(2.48)
Using system (2.43) one can easily check that 2
(|z1 | ) = e where M =
p¯ p
4im ˜ ∞
2
(|z2 | )
e−2i + 4imV re p
+ 2re(Mz2 z¯ 2 )
˜∞
z2 z¯ 1 ,
− 1. Integrating this identity one gets
|z1 (x)|2 − 1 = e4im
˜∞
∞ −2 x ∞
−4 x
∞ |z2 (x)|2 + 4
dye4im
˜∞
|z2 |2 im(p − VN p −1 ), (2.49)
x
dye4im
˜∞
re(M z¯ 2 z2 ),
e−2i dyimV re p
˜∞
(2.50)
z2 z¯ 1 .
(2.51)
Using inequality (2.47–2.48) one can estimate the r.h.s. of (2.49) as follows: |r.h.s. of (2.49)| ≤ C
˜ z2 e−2i ∞ 2L∞
+
∞ x
dyy
−1/2
|z2 e
−2i ˜ ∞ 2
|
1 2 1 2 ≤ Cz1 2L∞ (N L∞ + x −1/4 N l 2 (L ) ) ≤ Cz1 2L∞ ρ12 . ∞
In the first inequality here we have made use of the obvious estimate: ∞ −1 im p VN ≤ C. x
Consider expression (2.50): |(2.50)| ≤ C
∞
x
dyy −3/2 |VN |e2im
≤ Cz2 e−2i
˜∞
˜∞
|z2 ||z1 |
L∞ z1 L∞ ≤ Cz1 2L∞ ρ1 .
The same estimate is valid for (2.51): |(2.51)| ≤ Cz2 e−2i
˜∞
L∞ z1 L∞ ≤ Cz1 2L∞ ρ1 .
Thus, for x ≥ R, ||z1 (x)|2 − 1| ≤ Cz1 2L∞ ([R,∞)) ρ1 (R; v, q0 , q1 ). Since ρ1 → 0 as R → ∞, (2.48), (2.52) imply |z1 |2 − 1L∞ ([R,∞)) , z2 e−2i provided R is sufficiently large.
˜∞
L∞ ([R,∞)) ≤ Cρ1 (R; v, q0 , q1 ),
(2.52)
Returning to $f_N$, one gets
\[
\left|\ln\frac{f_N(x,E)}{f_0(x,E)}\right| \le C, \qquad x \ge R,
\]
which gives, in particular,
\[
\left|\ln\frac{a_N(E)}{a_0(E)}\right| \le C. \tag{2.53}
\]
Since ln |a0 (E)| is bounded on compact subsets of C+ , inequalities (2.42), (2.53) imply Proposition 2.2. 3. Applications Here we prove Corollaries 1.1 and 1.2. Corollary 1.1 is an immediate consequence of Theorem 1.1 and the following proposition. Proposition 3.1. For v ∈ L1 (T) ∩ H −1/2 (T), q0 as in Corollary 1.1, the following inequality holds: 2 | v ˆ | C m R −1/6 + < x >−1/4 (·, E)2L2 ([R,∞)) ≤ , (im E)2 m −1 m≥π
p0 (R)−2
provided R ≥ C. See Appendix 3 for the proof. Proof of Corollary 1.2. Using arguments close to that of [12], Theorem 4.1, we will show that if v satisfies the assumptions of Corollary 1.2, then it can be represented as the sum v = q0 + q1 + v, ˜ where < x >−1/2 q1 ∈ L1 (R), q0 and v˜ satisfy hypotheses H1 and H2, H3 respectively. Then the result will follow from Theorem 1.1. Since < x >1/4 L2 ⊂< x >1/2 L1 + l∞ (L1 ), v1 admits the representation v1 = v11 + v12 , where v11 ∈< x >1/2 L1 and v12 ∈ l∞ (L1 ). Write v2 in the form v2 = v21 + v22 , ∈< x > L , v ∈< x >3/4 L , |v | ≤ C < x > with some constant C. where v21 1 22 2 2i Fix two functions ψ, η ∈ C0∞ (R), such that η ≥ 0, R dxη(x) = 1, suppψ ⊂ (−1, 1), ψ(x − n) = 1. Set v˜2 = v˜21 + v˜22 ,
n∈Z
v˜2i =
n∈Z
vi,n (x) = αi,n
ψ(x − n)vi,n (x),
i = 1, 2,
dyη(αi,n (x − y))v2i (y),
where αi,n are defined as follows: α1,n =< n >1/2 ,
α2,n = v22 L2 ([n−2,n+2]) + < n >µ ,
0 < µ < 1/4,
αi,n → ∞ as n → +∞. It is easy to check that (i) v21 − v˜21 ∈< x >1/2 L1 , v22 − v˜22 ∈ l∞ (L1 ); (ii) |v˜2 | ≤ C0 |x| for any C0 > C0 provided |x| is sufficiently large;
(iii) v˜2 ∈< x >5/4 L2 , v˜2 ∈< x >3/2 L1 . ˜ q0 = v˜2 satisfies Hypothesis H1, q1 = v11 + v21 − v˜21 ∈ Thus, v = q0 + q1 + v, < x >1/2 L1 , v˜ = v12 + v22 − v˜22 ∈ l∞ (L1 ). ∞ 2i( (y,E)− (x,E) Consider (x, E) = x dy e (Fy−q )1/2 ) v(y), ˜ im E > 0. Since v˜ can be written 0
in the form v˜ = V1 + V2 + V3 + V4 , where V1 ∈< x >1/2 L1 , V2 ∈< x >1/4 L2 , V3 + V4 ∈ l∞ (L1 ), |V3 | + |V4 | ≤ C|x|, V3 ∈< x >3/4 L2 , V4 ∈< x >5/4 L2 ,V4 ∈ < x >3/2 L1 , it is not difficult to check that (·, E) ∈< x >1/4 L2 (+∞).
Appendix 1 Here we outline the proof of estimates (2.11). In this appendix and the next ones the constants C have the same nature as in Subsects. 2.3–2.6, 2.8. Consider ψ = RN (E)g, im E > 0, g ∈ l 2 (L1 ). To estimate L∞ -norms of ψ and ψ on the interval n = [n + R, n + 1 + R], n ≥ 0, R being sufficiently large, we use the representation: x
x
ψ(x) = µ1 p −1/2 (x)ei ξ pds + µ2 p −1/2 (x)e−i ξ pds x x sin( y dsp) − dy (A1.1) [g − 2VN ψ] , (p(x)p(y))1/2 ξ
µ1 where x, ξ ∈ n , µ = µ is given by the relation ψψ (ξ ) = p−1/2 (ξ )E(ξ )µ. 2 As an immediate consequence of (A1.1) one gets the estimate ψL∞ (n ) , x −1/2 ψ L∞ (n ) ≤ C(ψL2 (n ) + x −1/2 ψ L2 (n ) + (n + R)−1/2 gL1 (n ) ), provided R is sufficiently large. Summing up these inequalities one obtains ψl 2 (L∞ ) , x −1/2 ψ l 2 (L∞ ) ≤ C(ψL2 + x −1/2 ψ L2 + R −1/2 gl 2 (L1 ) ). (A1.2) It follows immediately from the equation (HN − E)ψ = g that imEψ2L2 ≤ gl 2 (L1 ) ψl 2 (L∞ ) , x −1/2 ψ 2L2
≤
C(ψ2L2
+ R −1/2 ψ2l 2 (L ) ∞
(A1.3)
+ R −1 g2l 2 (L ) ). 1
(A1.4)
Combining (A1.2–A1.4) one gets inequalities (2.11). Appendix 2 In this appendix we prove Lemma 2.3. Here and below the constants C may also depend on E0 . i can be estimated by its We start by remarking that the l2 (L∞ )-norm of x −1/4 N L2 -norm. Indeed, using the inequality 1 1 N L∞ (m ) ≤ N L2 (m ) + VN p −1 dy m 1 −1/2 ≤ N L2 (m ) + C(m + R) +C y −1/2 |V |dy, m
one gets 1 1 l 2 (L∞ )([R,∞)) ≤ C(x −1/4 N L2 ([R,∞)) + R −1/4 ). x −1/4 N
(A2.1)
In a similar way, 2 2 x −1/4 N l 2 (L∞ )([R,∞)) ≤ C(x −1/4 N L2 ([R,∞)) + R −1/4 ). j
(A2.2) j
To estimate the Lp - norms of N , j = 1, 2 we break it into two parts: N (x, E) = j2 1 j1 2 N (x, E) + N (x, E), 11 (x, E) N
=
21 N 12 N 22 N
∞ x
dy
x
e2i
e2i
y x
x y
ds(p−VN p−1 )
p0 (y)
vN (y),
ds(p−VN p−1 )
vN (y), dy p0 (y) R∞ y vN −1 −1 = dye2i x ds(p−VN p ) Vp −1 + p − p0−1 , 2 x x x −1 vN −1 = dye2i y ds(p−VN p ) Vp −1 + p − p0−1 . 2 R =
12 can be estimated as follows. The function N ∞ 1/2 1/2 12 |N (x, E)| ≤ C dye−2β im E(y −x ) (y −1/2 |V | + y −3/2 |vN |). x
As a consequence, 12 N L∞ ([R,∞)) ≤ C(y −1/2 V L1 ([R,∞)) + R −1/2 ) 12 L2 ([R,∞)) x −1/4 N
≤ C(ρ(R; q0 , q1 ) + R −1/2 ), C ≤ (y −1/2 V L1 ([R,∞) + R −1/2 ) (imE)1/2 C ≤ (ρ(R; q0 , q1 ) + R −1/2 ). (imE)1/2
(A2.3)
(A2.4)
In a similar way, 22 N L∞ ([R, ∞)) ≤ C(ρ(R; q0 , q1 ) + R −1/2 ), 22 L2 ([R, ∞)) ≤ x −1/4 N
C (ρ(R; q0 , q1 ) + R −1/2 ). (imE)1/2
j1
Consider N , j = 1, 2. Integrating by parts one gets the representations ∞ y −1 11 N (x, E) = N (x, E0 ) + 2i dye2i x ds(p−VN p ) x
VN ×[p(y, E) − (y, E) − p
(y, E0 )]N (y, E0 ),
(A2.5) (A2.6)
x
−1
21 N (x, E) = −N (x, E0 ) + e2i R ds(p−VN p ) N (R, E0 ) x x −1 −2i dye2i y ds(p−VN p ) R VN (y, E) − (y, E0 ) N (y, E0 ). × p(y, E) − p
Here N (x, E) =
∞
x
dye2i(
(y,E0 )− (x,E0 )) vN
p0
.
These representations imply immediately j1
N (·, E)L∞ ≤
C C N (·, E0 )L∞ ≤ (·, E0 )L∞ , imE imE (A2.7)
C (N (·, E0 )L∞ + x −1/4 N (·, E0 )l 2 (L∞ ) ). imE 1 (x, E) replaced Taking into account inequality (A2.1) (clearly, (A2.1) is valid with N by N (x, E) as well as by (x, E)) one can rewrite the last estimate as follows: j1
x −1/4 N (·, E)L2 ≤
j1
x −1/4 N (·, E)L2 ([R,∞)) C ≤ (N (·, E0 )L∞ ([R,∞)) + x −1/4 N (·, E0 )L2 ([R,∞)) + R −1/4 ) imE C ≤ ((·, E0 )L∞ ([R,∞)) + x −1/4 (·, E0 )L2 ([R,∞)) + R −1/4 ). (A2.8) imE At last, one can estimate the L∞ -norm of the in terms of L2 -norm of x −1/4 : (·, E0 )L∞ ([R,∞)) ≤ Cx −1/4 (·, E0 )l 2 (L 2/3
∞ )([R,∞))
≤ C(x −1/4 (·, E0 )L2 ([R,∞)) + R −1/6 ). 2/3
(A2.9)
Combining (A2.1–A2.9) one gets the statement of Lemma 2.3. Let us mention that exactly in a same way one proves that Hypothesis H2 is independent of E. Representing (x, E) in the form ∞ E − E0 (x, E) = (x, E0 ) + i dye2i( (y,E)− (x,E)) (y, E0 ), p0 x one gets x −1/4 (·, E)L2 ([R,∞)) ≤
C x −1/4 (·, E0 )L2 ([R,∞)) . imE
Appendix 3 Here we prove Proposition 3.1. We break into two parts = (0) + (1) , ∞ e2i( (0) = dyθ (y − x)
v(y), p0 (y) ∞ e2i( (y,E)− (x,E)) = dy(1 − θ (y − x)) v(y), p0 (y) x x
(1)
(y,E)− (x,E))
(A2.10)
where θ ∈ C0∞ (R), θ (ξ ) =
1, 0,
for |ξ | ≤ 1, Clearly, (0) admits the estimate for |ξ | ≥ 2.
x −1/4 (0) L2 ([R,∞)) ≤ CR −1/4 ,
(A3.1)
provided R is sufficiently large. To estimate (1) we represent it as the sum (1) = vˆm ζm ,
(A3.2)
m∈Z
1 where vˆm stand for Fourier coefficients of v: vˆm = 0 dxv(x)e2πimx , ∞ e2i( (y,E)− (x,E)−πmy) ζm (x, E) = . (A3.3) dy(1 − θ (y − x)) p0 (y) x Break the sum (A3.2) into two parts m∈Z = m≤m(x)−1 + m≥m(x) , m(x) = (F x−q0 (x))1/2 , where [a] denotes the integer part of a. Integrating by parts in (A3.3) π one gets the representation 1 ζm = − 4
2 iE y dsp−1 (s) ∞ y 0 d 1 e x (1 − θ(y − x)) , dye2i( x dsp0 −πmy) dy (p0 (y) − π m) p0 (y) x
which leads to the estimate |ζm | ≤
C (m(x) − m)−2 x −1/2 , imE
m < m(x).
As a consequence, one obtains C −1/2 x v ˆ ζ . m m ≤ imE m≤m(x)−1
(A3.4)
To estimate ζm for m ≥ m(x), we break it into three terms: ζm = ζm(0) + ζm(1) + ζm(2) , x+2 e2i( ζm(0) = θm (y)(1 − θ (x − y)) ζm(1)
=
ζm(2) =
x
∞
x+2 ∞ x
θm (y)
e2i(
(y,E)− (x,E)−πmy)
p0 (y)
(y,E)− (x,E)−πmy)
p0 (y)
(1 − θm (y))(1 − θ (x − y))
,
,
e2i(
(y,E)− (x,E)−πmy)
p0 (y)
,
where θm (y) = θ (mν−1 (y − ym )), ym being defined by the equation F ym − q0 (ym ) = π 2 m2 , 1/6 ≤ ν ≤ 1/2 will be fixed later. (0) The expression ζm can be estimated as follows. −1/2 , m ≤ m(x) + 2, x (0) (A3.5) |ζm | ≤ C 0, m > m(x) + 2.
(1)
Consider ζm . Clearly, it admits the representation ∞ (y − ym ) 1 (1) −2i( (ym )− (x))−πmym ) iκm (y−ym )2 ζm e 1 + iE = dyθm e πm x+2 πm +O(m−3ν ),
where κm =
F −q0 (ym ) 2πm .
The integral here can be estimated by Cm1/2 . So, one gets |ζm(1) | ≤ Ce−γ im E(πm−p0 (x)) m−1/2 ,
(A3.6)
with some γ > 0. (2) Consider ζm . Integrating by parts one can easily check that |ζm(2) | ≤ Cx −1/2+ν ((imE)−1 < p0 (x) − π m >−2 +e−γ im E(πm−p0 (x)) ).
(A3.7)
Combining (A3.4–A3.7) and choosing ν = 1/6 one obtains | (1) (x, E)|
≤ C (imE)−1 x −1/3 + (imE)−1/2
1/2 |vˆm |2 m−1 e−γ im E(πm−p0 (x)) .
m≥m(x)
(A3.8) Inequalities (A3.1), (A3.8) imply immediately
< x >−1/4 2L2 ([R,∞)) ≤ C(imE)−2 R −1/6 + m≥
|vˆm |2 m−1 ,
p0 (R) π −2
provided R is sufficiently large.

Acknowledgement. It is a pleasure to thank V.S. Buslaev for pointing out the problem, J. Sjöstrand for numerous helpful discussions, and the referee for a careful reading of the paper and useful suggestions.
References
1. Ao, P.: Absence of localization in energy space of a Bloch electron driven by a constant electric force. Phys. Rev. B 41, 3998–4001 (1990)
2. Asch, J., Duclos, P., Exner, P.: Stark-Wannier Hamiltonians with pure point spectrum. In: Differential Equations, Asymptotic Analysis, and Mathematical Physics. Potsdam 1996, Math. Res., 100, Berlin: Akademie Verlag, 1997, pp. 10–25
3. Asch, J., Duclos, P., Exner, P.: Stability of driven systems with growing gaps, quantum rings, and Wannier ladders. J. Stat. Phys. 92, 1053–1070 (1998)
4. Asch, J., Duclos, P., Bentosela, F., Nenciu, G.: On the dynamics of crystal electrons, high momentum regime. J. Math. Anal. Appl. 256, 99–114 (2001)
5. Bentosela, F., Carmona, R., Duclos, P., Simon, B., Souillard, B., Weder, R.: Schrödinger operators with an electric field and random or deterministic potentials. Commun. Math. Phys. 88, 387–397 (1983)
6. Briet, Ph.: Absolutely continuous spectrum for singular Stark Hamiltonians. Rev. Math. Phys. 13, 587–599 (2001)
7. Buslaev, V.S.: Kronig-Penney electron in homogeneous electric field. Am. Math. Soc. Transl. Ser. 2, 189
8. Christ, M., Kiselev, A.: WKB and spectral analysis of one-dimensional Schrödinger operators with slowly varying potentials. Commun. Math. Phys. 218, 245–262 (2001)
9. Christ, M., Kiselev, A.: WKB asymptotic behavior of almost all generalized eigenfunctions for one-dimensional Schrödinger operators with slowly decaying potentials. J. Funct. Anal. 179, 426–447 (2001)
10. Christ, M., Kiselev, A.: Absolutely continuous spectrum for one-dimensional Schrödinger operators with slowly decaying potentials: some optimal results. J. Am. Math. Soc. 11, 771–797 (1998)
11. Christ, M., Kiselev, A., Remling, Ch.: The absolutely continuous spectrum of one-dimensional Schrödinger operators with decaying potentials. Math. Res. Lett. 4, 719–723 (1997)
12. Christ, M., Kiselev, A.: Absolutely continuous spectrum of Stark operators, mp arc preprint, 2001
13. Deift, P., Killip, R.: On the absolutely continuous spectrum of one-dimensional Schrödinger operators with square summable potentials. Commun. Math. Phys. 203, 341–347 (1999)
14. Delyon, F., Simon, B., Souillard, B.: From power pure point to continuous spectrum in disordered systems. Ann. Inst. H. Poincaré Phys. Théor. 42 (3), 283–309 (1985)
15. Exner, P.: The absence of the absolutely continuous spectrum for δ Wannier-Stark ladders. J. Math. Phys. 36(9), 4561–4570 (1995)
16. Killip, R.: Perturbations of one-dimensional Schrödinger operators preserving the absolutely continuous spectrum. mp arc preprint, 2001
17. Kiselev, A.: Absolutely continuous spectrum of perturbed Stark operators. Trans. Am. Math. Soc. 352, 243–256 (2000)
18. Kiselev, A.: Stability of the absolutely continuous spectrum of the Schrödinger equation under slowly decaying perturbations and a.e. convergence of integral operators. Duke Math. J. 94, 619–646 (1998)
19. Kiselev, A., Last, Y., Simon, B.: Modified Prüfer and EFGP transforms and the spectral analysis of one-dimensional Schrödinger operators. Commun. Math. Phys. 194, 1–45 (1998)
20. Maioli, M., Sacchetti, A.: Absence of the absolutely continuous spectrum for Stark-Bloch operators with strongly periodic potentials. J. Phys. A 28, 1101–1106 (1995)
21. Molchanov, S., Novitskii, M., Vainberg, B.: First KdV integrals and absolutely continuous spectrum for 1-D Schrödinger operator. Commun. Math. Phys. 216, 195–213 (2001)
22. Remling, Ch.: The absolutely continuous spectrum of one-dimensional Schrödinger operators with decaying potential. Commun. Math. Phys. 193, 151–170 (1998)
23. Shabani, J.: Propagations theorem for some classes of pseudo-differential operators. J. Math. Anal. Appl. 211, 481–497 (1997)
24. Shabani, J.: On the absolutely continuous spectrum of Stark Hamiltonians. mp arc preprint 00–259
25. Weidmann, J.: Spectral Theory of Ordinary Differential Operators. LNM 1258, Berlin: Springer-Verlag, 1987

Communicated by B. Simon
Commun. Math. Phys. 234, 383–422 (2003) Digital Object Identifier (DOI) 10.1007/s00220-002-0783-3
Communications in
Mathematical Physics
Rigorous Solution of the Gardner Problem
Mariya Shcherbina¹, Brunello Tirozzi²
¹ Institute for Low Temperature Physics, Ukr. Ac. Sci., 47 Lenin ave., Kharkov, Ukraine. E-mail: [email protected]
² Department of Physics of Rome University “La Sapienza”, 5, p-za A.Moro, Rome, Italy. E-mail: [email protected]
Received: 2 December 2001 / Accepted: 12 June 2002
Published online: 31 January 2003 – © Springer-Verlag 2003
Abstract: We prove rigorously the well-known result of Gardner about the typical fractional volume of interactions between N spins which solve the problem of storing a given set of p random patterns. The Gardner formula for this volume in the limit N, p → ∞, p/N → α is proven for all values of α. Besides, we prove a useful criterion for the factorisation of all correlation functions for a class of models of classical statistical mechanics.
1. Introduction

The spin glass and neural network theories are of considerable importance and interest for a number of branches of theoretical and mathematical physics (see [M-P-V] and references therein). Among many topics of interest the analysis of different models of neural network dynamics is one of the most important. A discrete-time neural network dynamics is defined as

$$\sigma_i(t+1) = \mathrm{sign}\Big\{\sum_{j=1,\,j\neq i}^{N} J_{ij}\,\sigma_j(t)\Big\} \quad (i=1,\dots,N), \tag{1.1}$$
where $\{\sigma_j(t)\}_{j=1}^{N}$ are Ising spins and the interaction matrix $\{J_{ij}\}$ (not necessarily symmetric) depends on the concrete model, but usually it satisfies the conditions

$$\sum_{j=1,\,j\neq i}^{N} J_{ij}^2 = N R\,(1+o(1)), \quad N\to\infty \quad (i=1,\dots,N), \tag{1.2}$$

where $R$ is some fixed number which can be taken equal to 1.
A main problem in neural network theory is to introduce an interaction in such a way that some chosen vectors $\{\boldsymbol{\xi}^{(\mu)}\}_{\mu=1}^{p}$ (patterns) are fixed points of the dynamics (1.1). This requires the conditions:

$$\xi_i^{(\mu)} \sum_{j=1,\,j\neq i}^{N} J_{ij}\,\xi_j^{(\mu)} > 0 \quad (i=1,\dots,N). \tag{1.3}$$
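As an illustration of the dynamics (1.1) and the fixed-point condition (1.3), the following Python sketch may help. The outer-product choice of the matrix J is an assumption made only to have a concrete interaction matrix; it is not the construction studied in this paper. The convention sign(0) = +1 and all function names are likewise choices of the sketch.

```python
def local_fields(J, s):
    """h_i = sum over j != i of J[i][j] * s[j], the argument of sign in (1.1)."""
    n = len(s)
    return [sum(J[i][j] * s[j] for j in range(n) if j != i) for i in range(n)]


def step(J, s):
    """One synchronous update of (1.1); sign(0) is taken as +1 (a convention of this sketch)."""
    return [1 if h >= 0 else -1 for h in local_fields(J, s)]


def stabilities(J, xi):
    """xi_i * h_i(xi) for every i; condition (1.3) asks that all of them be positive."""
    return [x * h for x, h in zip(xi, local_fields(J, xi))]


def outer_product_matrix(patterns):
    """Hypothetical choice J_ij = sum_mu xi_i^mu xi_j^mu, used only to have a concrete J."""
    n = len(patterns[0])
    return [[sum(p[i] * p[j] for p in patterns) for j in range(n)] for i in range(n)]


xi = [1, -1, 1, 1, -1, -1, 1, -1]
J = outer_product_matrix([xi])
print(stabilities(J, xi))  # all entries equal N - 1 when a single pattern is stored
print(step(J, xi) == xi)   # the stored pattern is a fixed point of the update
```

With a single stored pattern every stability equals N − 1, so (1.3) holds with room to spare; with many random patterns some stabilities become small or negative, which is what motivates the stricter condition with a positive margin k.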
Usually, to simplify the problem, the patterns $\{\boldsymbol{\xi}^{(\mu)}\}_{\mu=1}^{p}$ are chosen to be i.i.d. random vectors with i.i.d. components $\xi_i^{(\mu)}$ $(i=1,\dots,N)$ which assume the values $\pm 1$ with probability $\frac12$. Sometimes condition (1.3) is not sufficient to have $\boldsymbol{\xi}^{(\mu)}$ as the end points of the dynamics. To have some “basin of attraction” (that is, some neighbourhood of $\boldsymbol{\xi}^{(\mu)}$, starting from which we are sure to arrive at $\boldsymbol{\xi}^{(\mu)}$) one should introduce some positive parameter $k$ and impose the conditions:

$$\xi_i^{(\mu)} \sum_{j=1,\,j\neq i}^{N} \tilde J_{ij}\,\xi_j^{(\mu)} > k \quad (i=1,\dots,N). \tag{1.4}$$
Gardner [G] was the first to solve a kind of inverse problem. She asked the questions: for which $\alpha = p/N$ do interactions $\{J_{ij}\}$ satisfying (1.2) and (1.4) exist? What is the typical fractional volume of these interactions? After a simple transformation this problem can be replaced by the following one. For the system of $p \sim \alpha N$ i.i.d. random patterns $\{\boldsymbol{\xi}^{(\mu)}\}_{\mu=1}^{p}$ with i.i.d. Bernoulli components $\xi_j^{(\mu)}$ consider

$$\Theta_{N,p}(k) = \sigma_N^{-1} \int_{(\mathbf{J},\mathbf{J})=N} d\mathbf{J} \prod_{\mu=1}^{p} \theta\big(N^{-1/2}(\boldsymbol{\xi}^{(\mu)},\mathbf{J}) - k\big), \tag{1.5}$$
where the Heaviside function $\theta(x)$, as usual, is zero on the negative half-line and 1 on the positive half-line, and $\sigma_N$ is the Lebesgue measure of the hypersurface area of the $N$-dimensional sphere of radius $N^{1/2}$. Then the question of interest is the behaviour of $\frac1N \log \Theta_{N,p}(k)$ in the limit $N, p \to \infty$, $p/N \to \alpha$.

This problem has a very simple geometrical interpretation (see [S-T2]). For very large integer $N$ consider the $N$-dimensional sphere $S_N$ of radius $N^{1/2}$ centred at the origin and $p = \alpha N$ independent random half-spaces $\Pi_\mu$ $(\mu = 1,\dots,p)$. Let $\Pi_\mu = \{\mathbf{J} \in \mathbf{R}^N : N^{-1/2}(\boldsymbol{\xi}^{(\mu)},\mathbf{J}) \ge k\}$, where $\boldsymbol{\xi}^{(\mu)}$ are i.i.d. random vectors with i.i.d. Bernoulli components $\xi_j^{(\mu)}$ and $k$ is the distance from $\Pi_\mu$ to the origin. The problem is to find the maximum value of $\alpha$ such that the volume of the intersection of $S_N$ with $\cap_\mu \Pi_\mu$ is not “too small” (i.e. of order $e^{-N\,\mathrm{const}}$). More precisely, we study the “typical” behaviour as $N \to \infty$ of $\Theta_{N,p}(k)$.

Gardner [G] solved this problem by using the so-called replica trick, which is non-rigorous from the mathematical point of view but sometimes very useful in the physics of spin glasses (see [M-P-V] and references therein). She obtained that for any $\alpha < \alpha_c(k)$, where

$$\alpha_c(k) \equiv \Big(\frac{1}{\sqrt{2\pi}} \int_{-k}^{\infty} (u+k)^2 e^{-u^2/2}\,du\Big)^{-1}, \tag{1.6}$$
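the following limit exists:

$$\lim_{N,p\to\infty,\;p/N\to\alpha} \frac1N E\{\log \Theta_{N,p}(k)\} = \mathcal{F}(\alpha,k) \equiv \min_{q:\,0\le q\le 1} \Big[\alpha E\Big\{\log H\Big(\frac{u\sqrt{q}+k}{\sqrt{1-q}}\Big)\Big\} + \frac12\,\frac{q}{1-q} + \frac12 \log(1-q)\Big], \tag{1.7}$$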
where $u$ is a Gaussian random variable with zero mean and variance 1, $H(x)$ is defined as

$$H(x) \equiv \frac{1}{\sqrt{2\pi}} \int_x^{\infty} e^{-t^2/2}\,dt, \tag{1.8}$$

and here and below we denote by the symbol $E\{\dots\}$ the averaging with respect to all random parameters of the problem and also with respect to $u$. Moreover, $E\{\frac1N \log \Theta_{N,p}(k)\}$ tends to minus infinity for $\alpha \ge \alpha_c(k)$.

In this paper we give a proof of the Gardner results. As far as we know, it is one of the first cases when a problem from spin glass theory is solved completely (i.e. for all values of the parameters of the corresponding problem). Before, there were only a few rather simple models, such as the Random Energy Model [D] and the spherical Sherrington-Kirkpatrick model [K-T-J], which were solved rigorously for all values of their parameters. The possibility of a complete solution for the Gardner model can be explained by the fact that here the so-called replica symmetric solution is true for all $\alpha$ and $k$, while in most of the other mean field models of spin glass theory the replica symmetric solution is valid only for some values of their parameters, e.g. for small enough $\alpha$ or inverse temperature $\beta$ for the Hopfield model, or small enough $\beta$ for the Sherrington-Kirkpatrick model (see [M-P-V] for the physical theory). Rigorous results for these models were obtained only for those parameter values where the replica symmetric solution is valid (see [S1, S2, T1, T2, B-G]). A similar situation holds, unfortunately, with a problem similar to the Gardner one, the so-called Gardner-Derrida [D-G] problem, although physical theory predicts that in this model too the replica symmetric solution is valid for all parameter values. This model studies the behaviour as $N, p \to \infty$, $p/N \to \alpha$ of

$$\Theta^{GD}_{N,p}(k) = 2^{-N} \sum_{J_i=\pm 1} \prod_{\mu=1}^{p} \theta^{(\beta)}\big(N^{-1/2}(\boldsymbol{\xi}^{(\mu)},\mathbf{J}) - k\big), \tag{1.9}$$

where

$$\theta^{(\beta)}(x) = e^{-\beta} + (1-e^{-\beta})\,\theta(x).$$
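The critical value (1.6) and the Gaussian tail (1.8) introduced above are easy to evaluate numerically. The sketch below uses a composite Simpson rule from the Python standard library; the integration cutoff, the number of panels and the function names are assumptions of this sketch and are not taken from the paper.

```python
import math


def H(x):
    """Gaussian tail (1.8): (2*pi)^(-1/2) times the integral of exp(-t^2/2) over [x, infinity)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def alpha_c(k, upper=12.0, n=4000):
    """Critical value (1.6) by composite Simpson quadrature on [-k, -k + upper].

    `upper` (cutoff) and `n` (even number of panels) are choices of this sketch;
    for moderate k the integrand is negligible beyond the cutoff.
    """
    a, b = -k, -k + upper
    h = (b - a) / n

    def g(u):
        return (u + k) ** 2 * math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return 1.0 / (total * h / 3.0)


for k in (0.0, 0.5, 1.0):
    print(k, alpha_c(k))
```

For k = 0 the integral in (1.6) equals 1/2, so the printed value is close to 2, and the values decrease as k grows, in agreement with the remark in Sect. 2 that α_c(k) does not exceed 2.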
One can see easily that as $\beta \to \infty$, $\Theta^{GD}_{N,p}(k)$ becomes the discrete measure of the intersection of the $p$ random half-spaces $\Pi_\mu$ described above with the discrete cube $\{-1,1\}^N$. In the paper [T4] a more general model was considered. There the function $\theta^{(\beta)}(x)$ is replaced by $e^{u(x)}$, where $u(x)$, $|u(x)| < D$, is any function continuous except possibly at finitely many points. This model was studied for $\alpha < \alpha_0(D)$, and $\alpha_0(D) \to 0$ as $D \to \infty$. Thus, even for small $\alpha$ it is not possible to consider in (1.9) the limit $\beta \to \infty$, which is the analogue of our Theorem 3 (see the next section).

We solve the Gardner problem in three steps, which are Theorems 1, 2 and 3 below. In the first step we prove a general statement. We study an abstract situation, where the energy function (the Hamiltonian) and the configuration space are convex (we recall here that we study a model where the $J$'s become variables in the configuration space instead
of interactions and some function of them plays the role of the Hamiltonian. Thus the $J$'s vary continuously). We consider the Gibbs measure generated by this Hamiltonian on our convex set and prove that in this case all the correlation functions become factorised in the thermodynamic limit (e.g., for any $i \neq j$, $\langle J_i J_j\rangle - \langle J_i\rangle\langle J_j\rangle \to 0$ as $N \to \infty$). Usually this factorisation means that the ground state and the Gibbs measure are uniquely defined. In fact, physicists have understood this for a rather long time, but it has not been proved before. The proof of Theorem 1 is based on the application of a theorem of classical geometry, known since the nineteenth century as the Brunn-Minkowski theorem (see e.g. [Ha] or [B-L]). This theorem studies the intersections of a convex set with a family of parallel hyperplanes (see the proof of Theorem 1 for the exact statement). We only need to prove a corollary of this theorem (Proposition 1), which allows us to have $N$-independent estimates. As a result we obtain a rigorous proof of the general factorisation property of all correlation functions (see (2.8)). Everybody who is familiar with mean field models of spin glasses knows that the vanishing of correlations as $N \to \infty$ is the key point in the derivation of self-consistent equations. We remark here that a similar idea was used in [B-G], where the results of [B-L] (also based on the Brunn-Minkowski theorem) have been used.

The second step is the derivation of self-consistent equations for the order parameters of our model. In fact Theorem 1 provides all the tools necessary to express the free energy in terms of the order parameters, but the problem is that we are not able to produce the equations for these parameters in the case when the “randomness” is not included in the Hamiltonian, but is connected with the integration domain.
That is why we use a rather common trick in mathematics: we replace the $\theta$-functions by some smooth functions which depend on a small parameter $\varepsilon$ and tend, as $\varepsilon \to 0$, to the $\theta$-function. We choose for this purpose $H(-x\varepsilon^{-1/2})$. But the particular form of these smoothing functions is not very important for us. The most important fact is that their logarithms are well defined and concave functions, and so we can treat them as a part of our Hamiltonian. The proof of Theorem 2 is based on the application to the Gardner problem of the so-called cavity method, the rigorous version of which was proposed in [P-S] and developed in [S1, P-S-T1, P-S-T2]. But in the previous papers ([P-S, P-S-T1, P-S-T2]) we assumed the factorisation of the correlation functions in the thermodynamic limit and on the basis of this fact derived the replica symmetry equation for the order parameters (to be more precise, we assumed that the order parameter possesses the self-averaging property and obtained from this fact the factorisation of the correlation function). Here, due to Theorem 1, we can prove the asymptotic factorisation property, which allows us to finish completely the study of the Gardner model.

Our last step is the limiting transition $\varepsilon \to 0$, i.e. the proof that the product of $\alpha N$ $\theta$-functions in (1.5) can be replaced by the product of $H(-x/\sqrt{\varepsilon})$ with a small difference, when $\varepsilon$ is small enough. Despite our expectations, it is the most difficult step from the technical point of view. It is rather simple to prove that the expression (1.7) is an upper bound for $\frac1N E\{\log \Theta_{N,p}(k)\}$. But the estimate from below is much more complicated. The problem is that to estimate the difference between the free energies corresponding to two Hamiltonians we, as a rule, need to have them defined on a common configuration space or, at least, we need to know some a priori bounds for some Gibbs averages. In the case of the Gardner problem we do not possess this information.
This leads to rather serious (from our point of view) technical problems (see the proof of Theorem 3 and Lemma 4).
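The smoothing step described above can be checked numerically. The following sketch, which assumes only the definition (1.8) of H and uses illustrative function names and step sizes, compares H(−x ε^{-1/2}) with θ(x) as ε decreases and tests the concavity of its logarithm by a second difference.

```python
import math


def H(x):
    """Gaussian tail (1.8), written through the complementary error function."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def theta(x):
    """Heaviside function: 0 on the negative half-line, 1 on the positive one."""
    return 1.0 if x > 0 else 0.0


def smoothed_theta(x, eps):
    """The smoothing H(-x * eps^(-1/2)) of theta(x) used in the second step."""
    return H(-x / math.sqrt(eps))


def second_difference(x, eps, step=0.1):
    """Discrete second derivative of log smoothed_theta; negative values indicate concavity.

    Only moderate arguments are safe here: far on the negative side the
    smoothed function underflows to 0 and the logarithm is undefined in floats.
    """
    def f(y):
        return math.log(smoothed_theta(y, eps))
    return f(x + step) + f(x - step) - 2.0 * f(x)


for eps in (1.0, 1e-2, 1e-4):
    gap = max(abs(smoothed_theta(x, eps) - theta(x)) for x in (-0.5, -0.1, 0.1, 0.5))
    print(eps, gap)
```

The printed gap shrinks as ε decreases at any fixed x ≠ 0, while at x = 0 the smoothed function stays equal to 1/2 for every ε, so the convergence is pointwise away from the jump and not uniform.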
The paper is organised as follows. The main definitions and results are formulated in Sect. 2. The proofs of these results are given in Sect. 3. The auxiliary results (lemmas and propositions which we need for the proofs) are formulated in the text of Sect. 3 and their proofs are given in Sect. 4.

2. Main Results

As mentioned above, we start from an abstract statement which allows us to prove the factorisation of all correlation functions for some class of models.

Let $\{\Phi_N(\mathbf{J})\}_{N=1}^{\infty}$ $(\mathbf{J} \in \mathbf{R}^N)$ be a system of convex functions which possess third derivatives, bounded in any compact set. Consider also a system of convex domains $\{\Gamma_N\}_{N=1}^{\infty}$ $(\Gamma_N \subset \mathbf{R}^N)$ whose boundaries consist of a finite number (maybe depending on $N$) of smooth pieces. We remark here that for the Gardner problem we need to study $\Gamma_N$ which is the intersection of $\alpha N$ half-spaces, but in Theorem 1 (see below) we consider a more general sequence of convex sets. Define the Gibbs measure and the free energy corresponding to $\Phi_N(\mathbf{J})$ in $\Gamma_N$:

$$\langle \dots \rangle_{\Phi_N} \equiv \Sigma_N^{-1}(\Phi_N) \int_{\Gamma_N} d\mathbf{J}\,(\dots)\exp\{-\Phi_N(\mathbf{J})\}, \quad \Sigma_N(\Phi_N) \equiv \int_{\Gamma_N} d\mathbf{J}\,\exp\{-\Phi_N(\mathbf{J})\}, \quad f_N(\Phi_N) \equiv \frac1N \log \Sigma_N(\Phi_N). \tag{2.1}$$

Denote

$$\tilde\Omega_N(U) \equiv \{\mathbf{J} : \Phi_N(\mathbf{J}) \le N U\}, \quad \Omega_N(U) \equiv \tilde\Omega_N(U) \cap \Gamma_N, \quad D_N(U) \equiv \tilde D_N(U) \cap \Gamma_N,$$

where $\tilde D_N(U)$ is the boundary of $\tilde\Omega_N(U)$. Then define

$$f_N^*(U) = \frac1N \log \int_{\mathbf{J} \in D_N(U)} d\mathbf{J}\, e^{-NU}. \tag{2.2}$$
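Theorem 1. Let the functions $\Phi_N(\mathbf{J})$ satisfy the conditions:

$$\frac{d^2}{dt^2}\Phi_N(\mathbf{J} + t\mathbf{e})\Big|_{t=0} \ge C_0 > 0 \tag{2.3}$$

for any direction $\mathbf{e} \in \mathbf{R}^N$, $|\mathbf{e}| = 1$, and uniformly in any set $|\mathbf{J}| \le N^{1/2}R_1$,

$$\Phi_N(\mathbf{J}) \ge C_1 (\mathbf{J},\mathbf{J}), \quad \text{as } (\mathbf{J},\mathbf{J}) > N R^2, \tag{2.4}$$

and for any $U > U_{\min} \equiv \min_{\mathbf{J} \in \Gamma_N} N^{-1}\Phi_N(\mathbf{J}) \equiv N^{-1}\Phi_N(\mathbf{J}^*)$,

$$|\nabla \Phi_N(\mathbf{J})| \le N^{1/2} C_2(U), \quad \text{as } \mathbf{J} \in \tilde\Omega_N(U), \tag{2.5}$$

with some positive $N$-independent $C_0$, $C_1$, $C_2(U)$, where $C_2(U)$ is continuous in $U$. Assume also that there exists some finite $N$-independent $C_3$ such that

$$f_N(\Phi_N) \ge -C_3. \tag{2.6}$$

Then

$$|f_N(\Phi_N) - f_N^*(U_*)| \le O\Big(\frac{\log N}{N}\Big), \quad U_* \equiv \frac1N \langle \Phi_N \rangle_{\Phi_N}. \tag{2.7}$$

Moreover, for any $\mathbf{e} \in \mathbf{R}^N$ $(|\mathbf{e}| = 1)$ and any natural $p$,

$$\langle (\dot{\mathbf{J}}, \mathbf{e})^p \rangle_{\Phi_N} \le C(p) \quad (\dot J_i \equiv J_i - \langle J_i \rangle_{\Phi_N}) \tag{2.8}$$

with some positive $N$-independent $C(p)$.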
Let us remark that the main conditions here are, of course, the conditions that the domain $\Gamma_N$ and the Hamiltonian $\Phi_N$ are convex (2.3). Conditions (2.4) and (2.5) are not very restrictive, because they are fulfilled for most Hamiltonians. The bound (2.6) is in fact a condition on the domain $\Gamma_N$. This condition prevents $\Gamma_N$ from being too small. In the application to the Gardner problem the existence of such a bound is very important, because in this case we have to study precisely the question of the measure of $\Gamma_N$, which is the intersection of $\alpha N$ random half-spaces with the sphere of radius $N^{1/2}$. But from the technical point of view it is more convenient for us to check the existence of a bound from below for the free energy than for the volume of the configuration space (see the proof of Theorem 3 below).

Theorem 1 has two corollaries which are rather important for us.

Corollary 1. Under conditions (2.3)–(2.6) for any $U > U_{\min}$,

$$f_N^*(U) = \min_{z>0}\{f_N(z\Phi_N) + zU\} + O\Big(\frac{\log N}{N}\Big). \tag{2.9}$$
This corollary is a simple generalisation of a result for the so-called spherical model, which has become rather popular recently (see, e.g., the review paper [K-K-P-S] and references therein). It allows us to replace an integration over the level surface of the function $\Phi_N$ by an integration over the whole space, i.e. to substitute the “hard condition” $\Phi_N = UN$ by the “soft one” $\langle \Phi_N \rangle_{\Phi_N} = UN$. This is a common trick which is often very useful in statistical mechanics.

The second corollary gives the most important and convenient form of the general property (2.8):

Corollary 2. Relations (2.8) imply that uniformly in $N$,

$$\frac{1}{N^2} \sum_{i,j=1}^{N} \langle \dot J_i \dot J_j \rangle_{\Phi_N}^2 \le \frac{C}{N}.$$

Remark 1. For $\Gamma_N = \mathbf{R}^N$ Corollaries 1 and 2 follow from the results of [B-L].

To find the free energy corresponding to the model (1.5) and to derive the replica symmetric equations for the order parameters we introduce the “regularised” Hamiltonian, depending on the small parameter $\varepsilon > 0$,

$$H_{N,p}(\mathbf{J}, k, h, z, \varepsilon) \equiv -\sum_{\mu=1}^{p} \log H\Big(\frac{k - (\boldsymbol{\xi}^{(\mu)},\mathbf{J})N^{-1/2}}{\sqrt{\varepsilon}}\Big) + h(\mathbf{h},\mathbf{J}) + \frac{z}{2}(\mathbf{J},\mathbf{J}), \tag{2.10}$$

where the function $H(x)$ is defined in (1.8) and $\mathbf{h} = (h_1,\dots,h_N)$ is an external random field with independent Gaussian $h_i$ with zero mean and variance 1, which we need for technical reasons. The partition function for this Hamiltonian is

$$Z_{N,p}(k, h, z, \varepsilon) = \sigma_N^{-1} \int d\mathbf{J}\, \exp\{-H_{N,p}(\mathbf{J}, k, h, z, \varepsilon)\}. \tag{2.11}$$

We denote also by $\langle \dots \rangle$ the corresponding Gibbs averaging and

$$f_{N,p}(k, h, z, \varepsilon) \equiv \frac1N \log Z_{N,p}(k, h, z, \varepsilon). \tag{2.12}$$
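Theorem 2. For any $\alpha, k \ge 0$ and $z > 0$ the functions $f_{N,p}(k, h, z, \varepsilon)$ are self-averaging in the limit $N, p \to \infty$, $\alpha_N \equiv p/N \to \alpha$:

$$E\big\{(f_{N,p}(k, h, z, \varepsilon) - E\{f_{N,p}(k, h, z, \varepsilon)\})^2\big\} \to 0, \tag{2.13}$$

and, if $\varepsilon$ is small enough, $\alpha < 2$ and $z \le \varepsilon^{-1/3}$, then there exists

$$\lim_{N,p\to\infty,\;\alpha_N\to\alpha} E\{f_{N,p}(k, h, z, \varepsilon)\} = F(\alpha, k, h, z, \varepsilon),$$

$$F(\alpha, k, h, z, \varepsilon) \equiv \max_{R>0}\,\min_{0\le q\le R} \Big[\alpha E\Big\{\log H\Big(\frac{u\sqrt{q}+k}{\sqrt{\varepsilon+R-q}}\Big)\Big\} + \frac12\,\frac{q}{R-q} + \frac12\log(R-q) - \frac{z}{2}R + \frac{h^2}{2}(R-q)\Big], \tag{2.14}$$

where $u$ is a Gaussian random variable with zero mean and variance 1.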
Let us note that the bound $\alpha < 2$ is not important for us, because for any $\alpha > \alpha_c(k)$ ($\alpha_c(k) < 2$ for any $k$) the free energy of the partition function $\Theta_{N,p}(k)$ tends to $-\infty$ as $N \to \infty$ (see Theorem 3 for the exact statement). The bound $z < \varepsilon^{-1/3}$ also is not a restriction for us. We might need to consider $z > \varepsilon^{-1/3}$ only if, applying (2.9) to the Hamiltonian (2.10), we obtain that the minimum point $z_{\min}(\varepsilon)$ in (2.9) does not satisfy this bound. But it is shown in Theorem 3 that for any $\alpha < \alpha_c(k)$, $z_{\min}(\varepsilon) < \bar z$ with some finite $\bar z$ depending only on $k$ and $\alpha$.

We start the analysis of $\Theta_{N,p}(k)$, defined in (1.5), from the following remark.

Remark 2. Let us note that $\Theta_{N,p}(k)$ can be zero with nonzero probability (e.g., if for some $\mu \neq \nu$, $\boldsymbol{\xi}^{(\mu)} = -\boldsymbol{\xi}^{(\nu)}$). Therefore we cannot, as usual, just take $\log \Theta_{N,p}(k)$. To avoid this difficulty, we take some large enough $M$ and replace below the log-function by the function $\log^{(MN)}$, defined as

$$\log^{(MN)} X = \log \max\{X, e^{-MN}\}. \tag{2.15}$$
Theorem 3. For any $\alpha \le \alpha_c(k)$, $N^{-1}\log^{(MN)}\Theta_{N,p}(k)$ is self-averaging in the limit $N, p \to \infty$, $p/N \to \alpha$,

$$E\Big\{\big(N^{-1}\log^{(MN)}\Theta_{N,p}(k) - E\{N^{-1}\log^{(MN)}\Theta_{N,p}(k)\}\big)^2\Big\} \to 0,$$

and for $M$ large enough there exists

$$\lim_{N,p\to\infty,\;p/N\to\alpha} E\{N^{-1}\log^{(MN)}\Theta_{N,p}(k)\} = \mathcal{F}(\alpha, k), \tag{2.16}$$

where $\mathcal{F}(\alpha, k)$ is defined by (1.7). For $\alpha > \alpha_c(k)$, $E\{N^{-1}\log^{(MN)}\Theta_{N,p}(k)\} \to -\infty$, as $N \to \infty$ and then $M \to \infty$.

We would like to mention here that the self-averaging of $N^{-1}\log \Theta_{N,p}(k)$ was proven in [T4], but our proof of this fact is necessary for the proof of (2.16).
3. Proof of the Main Results

Proof of Theorem 1. For any $U > 0$ consider the set $\Omega_N(U)$ defined in (2.2). Since $\Phi_N(\mathbf{J})$ is a convex function, the set $\Omega_N(U)$ is also convex and $\Omega_N(U) \subset \Omega_N(U')$ if $U < U'$. Let

$$V_N(U) \equiv \mathrm{mes}(\Omega_N(U)), \quad S_N(U) \equiv \mathrm{mes}(D_N(U)), \quad F_N(U) \equiv \int_{\mathbf{J} \in D_N(U)} |\nabla \Phi_N(\mathbf{J})|^{-1}\, dS_J. \tag{3.1}$$

Here and below the symbol $\mathrm{mes}(\dots)$ means the Lebesgue measure in the corresponding dimension. Then it is easy to see that the partition function $\Sigma_N$ can be represented in the form

$$\Sigma_N = \int_{U>U_{\min}} e^{-NU} F_N(U)\,dU = N^{-1}\int_{U>U_{\min}} e^{-NU} \frac{d}{dU}V_N(U)\,dU = \int_{U>U_{\min}} e^{-NU} V_N(U)\,dU. \tag{3.2}$$

Here we have used the relation $F_N(U) = N^{-1}\frac{d}{dU}V_N(U)$ and integration by parts. Besides, for a chosen direction $\mathbf{e} \in \mathbf{R}^N$ $(|\mathbf{e}| = 1)$ and any real $c$ consider the hyperplane

$$A(c, \mathbf{e}) = \{\mathbf{J} \in \mathbf{R}^N : (\mathbf{J}, \mathbf{e}) = N^{1/2}c\}$$

and denote

$$\Omega_N(U, c) \equiv \Omega_N(U) \cap A(c, \mathbf{e}), \quad D_N(U, c) \equiv D_N(U) \cap A(c, \mathbf{e}), \quad V_N(U, c) \equiv \mathrm{mes}(\Omega_N(U, c)), \quad F_N(U, c) \equiv \int_{\mathbf{J} \in D_N(U,c)} |\nabla \Phi_N(\mathbf{J})|^{-1}\, dS_J. \tag{3.3}$$

Then, since $F_N(U, c) = N^{-1}\frac{\partial}{\partial U}V_N(U, c)$, we obtain

$$\Sigma_N = \int dc\,dU\, e^{-NU} F_N(U, c) = \int dc\,dU\, e^{-NU} V_N(U, c), \quad \langle (\mathbf{J}, \mathbf{e})^p \rangle_{\Phi_N} = \frac{N^{p/2}\int dc\,dU\, c^p e^{-NU} V_N(U, c)}{\int dc\,dU\, e^{-NU} V_N(U, c)}. \tag{3.4}$$

Denote

$$s_N(U) \equiv \frac1N \log V_N(U), \quad s_N(U, c) \equiv \frac1N \log V_N(U, c). \tag{3.5}$$

Then relations (3.2), (3.4) give us

$$\Sigma_N = N \int \exp\{N(s_N(U) - U)\}\,dU, \quad \langle (\dot{\mathbf{J}}, \mathbf{e})^p \rangle_{\Phi_N} = N^{p/2} \big\langle (c - \langle c \rangle_{(U,c)})^p \big\rangle_{(U,c)}, \tag{3.6}$$

where

$$\langle \dots \rangle_{(U,c)} \equiv \frac{\int dU\,dc\,(\dots)\exp\{N(s_N(U, c) - U)\}}{\int dU\,dc\,\exp\{N(s_N(U, c) - U)\}}. \tag{3.7}$$

Then (2.7) and (2.8) can be obtained by the standard Laplace method, if we prove that $s_N(U)$ and $s_N(U, c)$ are concave functions and that they are strictly concave in the neighbourhood of the maximum points of the functions $(s_N(U) - U)$ and $(s_N(U, c) - U)$. To prove this we apply the theorem of Brunn-Minkowski from classical geometry (see e.g. [Ha]) to the functions $s_N(U)$ and $s_N(U, c)$. To formulate this theorem we need some extra definitions.
Definition 1. Consider two bounded sets $A, B \subset \mathbf{R}^N$. For any positive $\alpha$ and $\beta$,

$$\alpha A \times \beta B \equiv \{\mathbf{s} : \mathbf{s} = \alpha\mathbf{a} + \beta\mathbf{b},\ \mathbf{a} \in A,\ \mathbf{b} \in B\}.$$

$\alpha A \times \beta B$ is the Minkowski sum of $\alpha A$ and $\beta B$.

Definition 2. The one-parameter family of bounded sets $\{A(t)\}_{t_1^* \le t \le t_2^*}$ is a convex one-parameter family, if for any positive $\alpha < 1$ and $t_{1,2} \in [t_1^*, t_2^*]$ they satisfy the condition

$$A(\alpha t_1 + (1-\alpha)t_2) \supset \alpha A(t_1) \times (1-\alpha)A(t_2).$$

Theorem of Brunn-Minkowski. Let $\{A(t)\}_{t_1^* \le t \le t_2^*}$ be some convex one-parameter family. Consider $R(t) \equiv (\mathrm{mes}\,A(t))^{1/N}$. Then $\frac{d^2 R(t)}{dt^2} \le 0$, and $\frac{d^2 R(t)}{dt^2} \equiv 0$ for $t \in [t_1, t_2]$ if and only if all the sets $A(t)$ for $t \in [t_1, t_2]$ are homothetic to each other.

For the proof of this theorem see, e.g., [Ha].

To use this theorem for the proof of (2.7) let us observe that the family $\{\Omega_N(U)\}_{U>U_{\min}}$ is a convex one-parameter family and then, according to the Brunn-Minkowski theorem, the function $R(U) = (V_N(U))^{1/N}$ is a concave function. Thus, we get that $s_N(U)$ is a concave function:

$$\frac{d^2}{dU^2}s_N(U) = \frac{d^2}{dU^2}\log R(U) = \frac{R''(U)}{R(U)} - \Big(\frac{R'(U)}{R(U)}\Big)^2 \le -\Big(\frac{R'(U)}{R(U)}\Big)^2.$$

But $\frac{R'(U)}{R(U)} = \frac{d}{dU}s_N(U) > 1$ for $U < U^*$, and even if $\frac{d}{dU}s_N(U) = 0$ for $U > U^*$, we obtain that $\frac{d}{dU}(s_N(U) - U) = -1$. Thus, using the standard Laplace method, we get

$$f_N(\Phi_N) = s_N(U^*) - U^* + O\Big(\frac{\log N}{N}\Big) = \frac1N \log V_N(U^*) - U^* + O\Big(\frac{\log N}{N}\Big), \quad U_* \equiv \frac1N \langle \Phi_N \rangle_{\Phi_N} = U^* + o(1). \tag{3.8}$$

Using condition (2.5), and taking $\mathbf{J}^*$, which is the minimum point of $\Phi_N(\mathbf{J})$, we get

$$V_N(U^*) \ge N^{-1}\int_{\mathbf{J} \in D_N(U^*)} |(\mathbf{J} - \mathbf{J}^*, \nabla\Phi_N(\mathbf{J}))|\,|\nabla\Phi_N(\mathbf{J})|^{-1}\,dS_J \ge S_N(U^*)\frac{U^* - U_{\min}}{\max_{\mathbf{J} \in D_N(U^*)}|\nabla\Phi_N(\mathbf{J})|} = N^{-1/2}S_N(U^*)C(U^*). \tag{3.9}$$

On the other hand, for any $U < U^*$,

$$\frac{S_N(U)}{N^{1/2}V_N(U)} \ge \min_{\mathbf{J} \in D_N(U)}|\nabla\Phi_N(\mathbf{J})|\,\frac{F_N(U)}{N^{1/2}V_N(U)} \ge N^{1/2}\min_{\mathbf{J} \in D_N(U)}\frac{U - U_{\min}}{|\mathbf{J} - \mathbf{J}^*|}\,\frac{d}{dU}s_N(U) \ge \tilde C\,\frac{d}{dU}s_N(U) > \tilde C. \tag{3.10}$$

Here we have used (3.3) and (2.4). Thus the same inequality is valid also for $U = U^*$. Inequalities (3.10) and (3.9) imply that

$$\frac1N \log S_N(U^*) = \frac1N \log V_N(U^*) + O\Big(\frac{\log N}{N}\Big).$$

Combining this relation with (3.8) we get (2.7).
Let us observe also that for any $(U_0, c_0)$ and $(\delta_U, \delta_c)$ the family $\{\Omega_N(U_0 + t\delta_U, c_0 + t\delta_c)\}_{t \in [0,1]}$ is a convex one-parameter family and then, according to the Brunn-Minkowski theorem, the function $R_N(t) \equiv V_N^{1/N}(U_0 + t\delta_U, c_0 + t\delta_c)$ is concave. But since in our consideration $N \to \infty$, to obtain that this function is strictly concave in some neighbourhood of the point $(U^*, c^*)$ of maximum of $s_N(U, c) - U$, we shall use a corollary of the theorem of Brunn-Minkowski:

Proposition 1. Consider the convex set $M \subset \mathbf{R}^N$ whose boundary consists of a finite number of smooth pieces. Let the convex one-parameter family $\{A(t)\}_{t_1^* \le t \le t_2^*}$ be given by the intersections of $M$ with the parallel hyperplanes $B(t) \equiv \{\mathbf{J} : (\mathbf{J}, \mathbf{e}) = tN^{1/2}\}$. Suppose that there is some smooth piece $D$ of the boundary of $M$, such that for any $\mathbf{J} \in D$ the minimal normal curvature satisfies the inequality $N^{1/2}\kappa_{\min}(\mathbf{J}) > K_0$ (the minimal normal curvature $\kappa_{\min}(\mathbf{J})$ is defined by minimizing the curvature over all directions). Let also the Lebesgue measure $S(t)$ of the intersection $D \cap A(t)$ satisfy the bound

$$S(t) \ge N^{1/2}V(t)C(t), \tag{3.11}$$

where $V(t)$ is the volume of $A(t)$. Then

$$\frac{d^2}{dt^2}V^{1/N}(t) \le -K_0 C(t)V^{1/N}(t).$$
One can see that if we consider the sets M, M , A, B(t) ⊂ RN+1 , M ≡ M ∩ A, M ≡ {(J , U ) : N U ≥ N (J ), J ∈ N }, A ≡ {(J , U ) : δU ((J , e) − N 1/2 c0 ) − N 1/2 δc (U − U0 ) = 0}, B(t) ≡ {(J , U ) : δc ((J , e) − N 1/2 c0 ) + N 1/2 δU (U − U0 ) = N 1/2 t}, then N (U0 + tδU , c0 + tδc ) = M ∩ B(t) (without loss of generality we assume that δc2 + δU2 = 1). Conditions (2.3) and (2.5) guarantee that the minimal normal curvature (U ) ≡ {(J , (J )), J ∈ } satisfies the inequality N 1/2 κ ˜ for of DN N N min (J ) > K J ∈ DN (U ), if |U − U ∗ | < ε with small enough but N -independent ε. Besides, similar to (3.10), mesDN (U, c) d ≥ C3 sN (U, c). 1/2 N VN (U, c) dU Thus we get that d2 d 1 ≤ −C4 . (3.12) sN (U, c) ≥ ⇒ 2 sN (U + t sin ϕ, c + t cos ϕ) 2 dt dU t=0 d Remark 1. If N = RN , then the conditions of Theorem 1 guarantee that dU sN (U, c) ≥ ∗ ∗ const, when (U, c) ∼ (U , c ) and so Proposition 1 and (3.10) give us that
sN (U, c) − U − (sN (U ∗ , c∗ ) − U ∗ ) ≤ −
C˜ 0 ((c − c∗ )2 + (U − U ∗ )2 ), 2
(3.13)
which implies immediately (2.8). But in the general case, the proof is more complicated. Let us introduce the new variables ρ ≡ ((U − U ∗ )2 + (c − c∗ )2 )1/2 , ϕ ≡ arcsin U −U ∗ , and let φ˜ N (ρ, ϕ) ≡ φN (U, c) ≡ sN (U ∗ + U, c∗ + c) − U − ((U −U ∗ )2 +(c−c∗ )2 )1/2 ∗ ∗ ∗ sN (U , c ) + U . We shall prove now that φ˜ N (N −1/2 , ϕ) ≤ −
K , N
(3.14)
where K does not depend on ϕ, N . Consider the set
d 1 = (U, c) : sN (U, c) < . dU 2 One can see easily that if (U , c ) ∈ , then (U, c ) ∈ for any U > U and d 1 ∗ ∗ dU φN (U, c ) < − 2 . That is why it is clear that (U , c ) ∈ (but it can belong to the boundary ∂). Denote ϕ ∗ ≡ infπ π r(N −1/2 sin ϕ, N −1/2 cos ϕ) ∩ = ∅ , ϕ∈[− 2 , 2 ]
where r(U, c) is the set of all points of the form (U ∗ + tU, c∗ + tc), t ∈ [0, 1]). Then for any ϕ < ϕ ∗ we can apply (3.12) to obtain that φ˜ N (N −1/2 , ϕ) ≤ −
C4 . 2N
(3.15)
Assume that − π4 ≤ ϕ ∗ ≤ π4 . Let us remark that, using (2.5), similarly to (3.9) one can obtain that for all (U, c): |U − U ∗ | ≤ N −1/2 and |c − c∗ | ≤ N −1/2 , d SN (U, c) sN (U, c) ≤ min |∇ N (J )|−1 ≤ C5 . dU VN (U, c)
(3.16)
C4 Choose d ≡ 4C . Then for all ϕ ∗ ≤ ϕ ≤ ϕd ≡ arctan(tan ϕ ∗ + dN −1/2 ), using (3.15) 5 and (3.16), we have
φ˜ N (N −1/2 , ϕ) = φN (N −1/2 sin ϕ, N −1/2 cos ϕ) d C5 d ≤ φN N −1/2 sin ϕ − , N −1/2 cos ϕ + N N C4 ≤− + O(N −3/2 ). 4N For
π 4
(3.17)
≥ ϕ > ϕd , according to the definition of ϕ ∗ and ϕd , there exists ρ1 < 1 such that dρ1 , N −1/2 ρ1 cos ϕ) ∈ N tdρ1 ⇒ (N −1/2 ρ1 sin ϕ − , N −1/2 ρ1 cos ϕ) ∈ (t ∈ [0, 1]). N
(N −1/2 ρ1 sin ϕ −
Therefore, using that φ˜ N (ρ, ϕ) is a concave function of ρ, we get φ˜ N (N −1/2 , ϕ) ≤ ρ1−1 φ˜ N (N −1/2 ρ1 , ϕ) = ρ1−1 φN (N −1/2 ρ1 sin ϕ, N −1/2 ρ1 cos ϕ) dρ1 d d −1 −1/2 −1/2 ≤ ρ1 φN N ρ1 sin ϕ − ρ1 cos ϕ − ,N ≤− . N 2N 2N (3.18) And finally, if |ϕ| >
π 4,
denote
Lφ ≡ r(N −1/2 sin ϕ, N −1/2 cos ϕ) ∩ , lφ = N 1/2 mes{Lφ }.
Then, using that for (U, c) ∈ Lφ , π d 1 π d φ˜ N (N −1/2 ρ, ϕ) ≤ N −1/2 cos φN (U, c) < − N −1/2 cos , dρ 4 dU 2 4 and for (U, c) ∈ Lφ we can apply (3.12), we have φ˜ N (N −1/2 , ϕ) ≤ −
lφ (1 − lφ )2 C4 K − ≤− . 2N 2(2N )1/2 N
(3.19)
Inequalities (3.15)–(3.19) prove (3.14) for |ϕ| < π2 . For the rest of ϕ the proof is the same. Now let us derive (2.8) (for p = 2) from (3.14). Choose ρ ∗ = K4 and remark that since φ˜ N (ρ, ϕ) is a concave function of ρ, we have that for ρ > N −1/2 ρ ∗ , K d 2 1 d ˜ ˜ φN (ρ, ϕ) φN (ρ, ϕ) + log ρ < − 1/2 . ≤ 2 dρ dρ N 2N ρ=N −1/2 ρ ∗ Thus, using the Laplace method, one can obtain that d ˜ 2 N φ˜ N (ρ,ϕ) (ρ ∗ )2 ρ>N −1/2 ρ ∗ dρ ρ e dρ φN (ρ, ϕ) ≤ d ˜ 2 ˜ N (ρ,ϕ) N φ N [φN (ρ, ϕ) + log ρ] −1/2 ∗ dρe ρ>N
dρ
ρ
N
≤ ρ=N −1/2 ρ ∗
2(ρ ∗ )2 . N
So, we have for any ϕ, (ρ ∗ )2 ˜ ˜ dρ ρ 2 eN φN (ρ,ϕ) ≤ dρ eN φN (ρ,ϕ) −1/2 ρ ∗ N ρ
N ρ (ρ ∗ )2 ˜ ≤2 dρeN φN (ρ,ϕ) . N This relation proves (2.8) for p = 2, because of the inequalities ˜ dφ dρ ρ 2 eN φN (ρ,ϕ) 2(ρ ∗ )2 2 ∗ 2 ≤ . (c − c (U,c) ) (U,c) ≤ (c − c ) (U,c) ≤ N dφ dρeN φ˜N (ρ,ϕ) For other values of p the proof of (2.8) is similar. Proof of Theorem 2. For our consideration below it is convenient to introduce also the Hamiltonian HN,p (J , x, h, z, ε) ≡
p 1 −1/2 (µ) z (N (ξ , J ) − x (µ) )2 + h(h, J ) + (J , J ). (3.20) 2ε 2 µ=1
Evidently HN,p (J , k, h, z, ε) = − log
x (µ) >k
dx exp{HN,p (J , x, h, z, ε)} +
p log(2π ε) 2
and so F˜ (J ) = F˜ (J ) HN,p for any F˜ (J ). Therefore below we denote . . . both averaging with respect to HN,p and HN,p .
µ,ν
Lemma 1. Define the matrix XN =
N 1 (µ) (ν) ξi ξi . If the inequalities N i=1
√ ||XN || ≤ ( α + 2)2 ,
1 (h, h) ≤ 2, N
(3.21)
are fulfilled, then the Hamiltonian HN,p (J , k, h, z, ε) satisfies conditions (2.3), (2.4), (2.5) and (2.6) of Theorem 1 and therefore N 1 ˙ ˙ C(z, ε) , Ji Jj Ji Jj ≤ N2 N i,j =1
N 1 ˙ ˙ 2 C(z, ε) , Ji Jj ≤ N2 N
(3.22)
i,j =1
where J˙i ≡ Ji − Ji . Moreover, choosing εN ≡ N −1/2 log N we obtain that there exist N -independent C1 and C2 such that
2 2 (3.23) Prob maxθ (Ji − N 1/2 εN ) > e−C1 log N ≤ e−C2 log N . i
Remark 2. According to the result of [S-T1] and to a low of large numbers, PN -the 2/3 probability that inequalities (3.21) are fulfilled, is more than 1 − e−constN . Remark 3. Let us note that since the Hamiltonian (2.10) under conditions (3.21) satisfies (2.3), (2.4) and (2.6), we can choose R0 large enough to have 2 −1 θ (|J | − N 1/2 R0 )e−HN,p dJ ≤ (R0 )N e−NC1 R0 < e−NC3 −N σN N ⇒ θ (|J | − N 1/2 R0 ) ≤ e−N , so in all computations below we can use the inequality |J | ≤ N 1/2 R0 with the error O(e−N const ). Remark 4. Let us note that sometimes it is convenient to use (3.22) in the form E
N −1
N
(1) (2) J˙i J˙i
i
2 (1,2)
≤
C(z, ε) , N
2
N C(z, ε) E N −1 ≤ J˙i Ji . N i
Here and below we put an upper index to Ji to show that we take a few replicas of our Hamiltonians and the upper index indicate the replica number. We put also an upper index .. (1,2) to stress that we consider the Gibbs measure for two replicas. The last relation mean, in particular, that 1 ˙ 1 ˙(1) ˙(2) Ji Ji → 0, Ji Ji → 0, as N → ∞ N N in the Gibbs measure and in probability.
(3.24)
We start the proof of Theorem 2 from the proof of the self-averaging property (2.13) of fN,p (h, z, ε). Using an idea first proposed in [P-S] (see also [S-T1]), we write p 1 fN,p (h, z, ε) − E{fN,p (k, h, z, ε)} = µ , N µ=0
where
µ ≡ Eµ (log ZN,p (k, h, z, ε)) − Eµ+1 (log ZN,p (h, z, ε)) , (1) (µ) the symbol Eµ {..} means the averaging with respect to random vectors ξ , ..., ξ and E0 log ZN,p (k, h, z, ε) = log ZN,p (h, z, ε). Then, in the usual way,
E µ ν = 0 (µ = ν),
and therefore p 1 E (fN,p (h, z, ε) − E{fN,p (k, h, z, ε)})2 = 2 E{2µ }. N
(3.25)
µ=0
But
E{2µ−1} ≤ E{(Eµ−1 {(log ZN,p (k, h, z, ε))})}
(3.26)
Eµ−1 (log ZN,p−1 (k, h, z, ε))2 ≤ E{(µ )2 }, where
(µ)
µ ≡ log ZN,p (k, h, z, ε) − log ZN,p−1 (k, h, z, ε), (µ)
(µ)
with ZN,p−1 (k, h, z, ε) being the partition function for the Hamiltonian (2.10), where in the r.h.s. we take the sum with respect to all upper indices except µ. Denoting by (µ) ... p−1 the corresponding Gibbs averaging and integrating with respect to x, we get: µ
(µ) √ k − (ξ (µ) , J )N −1/2 = ε log H . √ ε
(3.27)
p−1
But evidently (µ) 0 ≥ log H ε−1/2 (k − (ξ (µ) , J )N −1/2 ) p−1 (µ) −1/2 −1/2 ( µ ) ≥ log H ε (k − (ξ , J )N )
≥ − const (N ε)−1 (ξ (µ) , J )2
(µ) p−1
p−1
+ const .
(3.28)
Thus, E{(µ )2 }
≤ const E
(N ε)
−1
(ξ (µ) , J )2
(µ) p−1
(N ε)
−1
(ξ (µ) , J )2
(µ)
p−1
.
But since ... p−1 does not depend on ξ (µ) we can average with respect to ξ (µ) inside (µ)
(µ)
... p−1 . Hence, we obtain E{(µ )2 }
≤ const ε
−2
E
N
−1
(µ)
(J , J )
p−1
N
−1
(µ)
(J , J )
p−1
≤ const .
(3.29)
Inequalities (3.25)–(3.28) prove (2.13). Define the order parameters of our problem RN,p ≡
N 1 2 Ji , N
qN,p ≡
i=1
N 1 Ji 2 . N
(3.30)
i=1
To prove the self-averaging properties of RN,p and qN,p we use the following general lemma: Lemma 2. Consider the sequence of convex random functions {fn (t)}∞ n=1 (fn (t) ≥ 0) on the interval (a, b). If the sequence of functions fn is self-averaging (E{(fn (t) − E{fn (t)})2 } → 0, as n → ∞ uniformly in t) and bounded (|E{fn (t)}| ≤ C uniformly in n, t ∈ (a, b)), then for almost all t,
lim E{[fn (t) − E{fn (t)}]2 } = 0,
n→∞
(3.31)
i.e. the derivatives fn (t) are also self-averaging ones for almost all t. In addition, if we consider another sequence of convex functions {gn (t)}∞ n=1 (gn ≥ 0) which are also self-averaging (E{(gn (t) − E{gn (t)})2 } → 0, as n → ∞ uniformly in t), and |E{fn (t)} − E{gn (t)}| → 0, as n → ∞, uniformly in t, then for all t, which satisfy (3.31) lim |E{fn (t)} − E{gn (t)}| = 0,
n→∞
lim E{[gn (t) − E{g (t)}]2 } = 0.
n→∞
(3.32)
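For the proof of this lemma see [P-S-T2]. On the basis of Lemma 2, in Sect. 4 we prove
Proposition 2. Denote by $R_{N,p-1}$, $q_{N,p-1}$ the analogues of $R_{N,p}$, $q_{N,p}$ (see definition (3.30)) for $H_{N,p-1}$. Then for any convergent subsequence $E\{f_{N_m,p_m}(k,h,z,\varepsilon)\}$, for almost all $z$ and $h$, the quantities $R_{N_m,p_m}$, $q_{N_m,p_m}$ satisfy
$$E\{(R_{N_m,p_m} - \overline{R}_{N_m,p_m})^2\},\ E\{(q_{N_m,p_m} - \overline{q}_{N_m,p_m})^2\} \to 0, \qquad |\overline{R}_{N_m,p_m} - \overline{R}_{N_m,p_m-1}|,\ |\overline{q}_{N_m,p_m} - \overline{q}_{N_m,p_m-1}| \to 0 \quad \text{as } m\to\infty, \tag{3.33}$$
where
$$\overline{R}_{N,p} = E\{R_{N,p}\}, \qquad \overline{q}_{N,p} = E\{q_{N,p}\}, \tag{3.34}$$
and
$$E\Big\{\Big\langle\Big(N_m^{-1}\sum_{i=1}^{N_m} J_i^2 - \overline{R}_{N_m,p_m}\Big)^2\Big\rangle\Big\} \to 0, \quad \text{as } N_m\to\infty. \tag{3.35}$$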
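Our strategy now is to choose an arbitrary convergent subsequence $f_{N_m,p_m}(k,h,z,\varepsilon)$ and, by applying to it the above proposition, to show that its limit for all $h$, $z$ coincides with the r.h.s. of (2.14). Then this will mean that there exists the limit of $f_{N,p}(h,z,\varepsilon)$ as $N,p\to\infty$, $p/N\to\alpha$. In order to simplify formulae below we shall omit the lower index $m$ for $N$ and $p$. Now we formulate the main technical point of the proof of Theorem 2.
Lemma 3. Consider $H_{N,p-1}$ and denote by $\langle\ldots\rangle_{p-1}$ the respective Gibbs averages. For any $\varepsilon_1 > 0$ and $0 \le k_1 \le 2k$ define
$$\phi_N(\varepsilon_1,k_1) \equiv \varepsilon_1^{1/2}\Big\langle H\Big(\frac{k_1 - N^{-1/2}(\xi^{(p)},J)}{\sqrt{\varepsilon_1}}\Big)\Big\rangle_{p-1}, \tag{3.36}$$
$$\phi_{0,N}(\varepsilon_1,k_1) \equiv \varepsilon_1^{1/2} H\Big(\frac{k_1 - N^{-1/2}(\xi^{(p)},\langle J\rangle_{p-1})}{U^{1/2}_{N,p-1}(\varepsilon_1)}\Big), \tag{3.37}$$
where $U_{N,p-1}(\varepsilon_1) \equiv \overline{R}_{N,p-1} - \overline{q}_{N,p-1} + \varepsilon_1$. Then
$$E\big\{\big(\phi_N(\varepsilon_1,k_1) - \phi_{0,N}(\varepsilon_1,k_1)\big)^2\big\} \to 0,$$
$$E\big\{\big(\log\phi_N(\varepsilon_1,k_1) - \log\phi_{0,N}(\varepsilon_1,k_1)\big)^2\big\} \to 0,$$
$$E\Big\{\Big(\frac{d}{d\varepsilon_1}\log\phi_N(\varepsilon_1,k_1) - \frac{d}{d\varepsilon_1}\log\phi_{0,N}(\varepsilon_1,k_1)\Big)^2\Big\} \to 0,$$
$$E\Big\{\Big(\frac{d}{dk_1}\log\phi_N(\varepsilon_1,k_1) - \frac{d}{dk_1}\log\phi_{0,N}(\varepsilon_1,k_1)\Big)^2\Big\} \to 0, \tag{3.38}$$
and $N^{-1/2}(\xi^{(p)},\langle J\rangle_{p-1})$ converges in distribution to $\sqrt{\overline{q}_{N,p}}\,u$, where $u$ is a Gaussian random variable with zero mean and variance 1. Besides, if we denote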
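$$t^{(\mu)} \equiv N^{-1/2}(\xi^{(\mu)},J) - x^{\mu}, \qquad \dot t^{(\mu)} \equiv t^{(\mu)} - \langle t^{(\mu)}\rangle, \qquad \tilde U_N \equiv \frac{1}{\varepsilon^2 N}\sum_{\mu=1}^{p}\langle (t^{(\mu)})^2\rangle, \qquad \tilde q_N \equiv \frac{1}{\varepsilon^2 N}\sum_{\mu=1}^{p}\langle t^{(\mu)}\rangle^2, \tag{3.39}$$
then $\tilde U_N$ and $\tilde q_N$ are self-averaging quantities and for $\mu \ne \nu$,
$$E\big\{\langle \dot t^{(\mu)}\dot t^{(\nu)}\rangle^2\big\} \to 0, \qquad E\big\{\big\langle \big((t^{(\mu)})^2 - \langle (t^{(\mu)})^2\rangle\big)\big((t^{(\nu)})^2 - \langle (t^{(\nu)})^2\rangle\big)\big\rangle^2\big\} \to 0,$$
$$E\big\{\langle (t^{(\mu)})^4\rangle\big\} \le \mathrm{const}, \qquad E\big\{\langle (t^{(\mu)})^4 (t^{(\nu)})^4\rangle\big\} \le \mathrm{const}. \tag{3.40}$$
Now we are ready to derive the equations for $\overline{q}_{N,p}$ and $\overline{R}_{N,p}$. From the symmetry of the Hamiltonian (3.20) it is evident that $\overline{q}_{N,p} = E\{\langle J_1\rangle^2\}$ and $\overline{R}_{N,p} = E\{\langle J_1^2\rangle\}$. The integration with respect to $J_1$ is Gaussian. So, if we denote
$$t_1^{(\mu)} \equiv t^{(\mu)} - N^{-1/2}\xi_1^{(\mu)} J_1,$$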
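we get
$$\langle J_1\rangle = -(z + \alpha_N/\varepsilon)^{-1}\Big\langle \frac{1}{\varepsilon N^{1/2}}\sum_{\mu=1}^{p}\xi_1^{(\mu)} t_1^{(\mu)} + h h_1\Big\rangle.$$
Hence,
$$(z + \alpha_N/\varepsilon)^2 E\{\langle J_1\rangle^2\} = \frac{1}{\varepsilon^2 N} E\Big\{\sum_{\mu,\nu=1}^{p}\xi_1^{(\mu)}\xi_1^{(\nu)}\langle t_1^{(\mu)}\rangle\langle t_1^{(\nu)}\rangle\Big\} + h^2 + \frac{2h}{\varepsilon\sqrt{N}}\sum_{\mu=1}^{p} E\big\{h_1\xi_1^{(\mu)}\langle t_1^{(\mu)}\rangle\big\} + o(1), \tag{3.41}$$
and similarly
$$(z + \alpha_N/\varepsilon)^2 E\{\langle J_1^2\rangle\} = (z + \alpha_N/\varepsilon) + \frac{1}{\varepsilon^2 N} E\Big\{\sum_{\mu,\nu=1}^{p}\xi_1^{(\mu)}\xi_1^{(\nu)}\langle t_1^{(\mu)} t_1^{(\nu)}\rangle\Big\} + h^2 + \frac{2h}{\varepsilon\sqrt{N}}\sum_{\mu=1}^{p} E\big\{h_1\xi_1^{(\mu)}\langle t_1^{(\mu)}\rangle\big\} + o(1). \tag{3.42}$$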
Now to calculate the r.h.s. in (3.41) and (3.42) we use the formula of “integration by parts”, which is valid for any function $f$ with bounded third derivative:
$$E\big\{\xi_1^{(\mu)} f(\xi_1^{(\mu)} N^{-1/2})\big\} = \frac{1}{N^{1/2}} E\big\{f'(\xi_1^{(\mu)} N^{-1/2})\big\} + \frac{1}{N^{3/2}} E\big\{f'''(\zeta(\xi_1^{(\mu)})\xi_1^{(\mu)} N^{-1/2})\big\}, \tag{3.43}$$
where |ζ (ξ1 )| ≤ 1. Thus, using this formula and the second line of (3.40), we get: (z + αN /ε)2 q N,p 1 (µ) (µ) (µ) (ν) (ν) (ν) = q˜N + 2 4 E t˙1 (t1 J1 − t1 J1 ) t˙1 (t1 J1 − t1 J1 ) N ε µ=ν 2 (µ) (µ) (µ) (ν) (ν) (ν) + 2 4 E t˙1 (t1 J1 − t1 J1 )(t1 J1 − t1 J1 ) t1 N ε µ=ν 1 (µ) (ν) (µ) (ν) (ν) (µ) + 2 4 E t˙1 (t1 J1 − t1 J1 ) t˙1 (t1 J1 − t1 J1 ) N ε µ=ν
+h2 +
2h2 (µ) (µ) ˙1 (t1 J1 − t1(µ) J1 )J˙1 + o(1). E t ε2 N µ
(3.44)
(µ)
Replacing t1 by t (µ) and using the symmetry of the Hamiltonian with respect to Ji , we obtain e.g. for the first sum in (3.44): 1 (µ) (µ) ˙1 (t1 J1 ) − t1(µ) J1 ) t˙1(ν) (t1(ν) J1 − t1(ν) J1 ) E t N2 µ=ν
=
p N 1 (µ) (µ) ˙ ˙ (t (Ji + Ji ) − t (µ) (J˙i + Ji ) E t N3 i=1 µ,ν=1 · t˙(ν) (J˙i + Ji ) − t (ν) (J˙i + Ji ) + o(1)
p N 1 = 3 E Ji 2 (t˙(µ) )2 (t˙(ν) )2 + o(1) N i=1 µ,ν=1 = q N,p (U˜ N − q˜N )2 + o(1).
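Here we have used the relation (3.24), which allows us to get rid of the terms containing $\dot J_i$, and the self-averaging properties of $q_{N,p}$, $\tilde U_N$ and $\tilde q_N$. Transforming in a similar way the other sums in the r.h.s. of (3.44) and using also relations (3.40) to get rid of the terms containing $\langle \dot t^{(\mu)}\dot t^{(\nu)}\rangle$, we get finally:
$$(z + \alpha_N/\varepsilon)^2\,\overline{q}_{N,p} = \tilde q_N + 2(\overline{R}_{N,p} - \overline{q}_{N,p})\,\tilde q_N(\tilde U_N - \tilde q_N) + \overline{q}_{N,p}(\tilde U_N - \tilde q_N)^2 + h^2\big(1 + 2(\tilde U_N - \tilde q_N)(\overline{R}_{N,p} - \overline{q}_{N,p})\big) + o(1). \tag{3.45}$$
Similarly we obtain
$$(z + \alpha_N/\varepsilon)^2\,\overline{R}_{N,p} = (z + \alpha_N/\varepsilon) + \tilde U_N + \overline{R}_{N,p}(\tilde U_N^2 - \tilde q_N^2) - 2\overline{q}_{N,p}\,\tilde q_N(\tilde U_N - \tilde q_N) + h^2\big(1 + 2(\tilde U_N - \tilde q_N)(\overline{R}_{N,p} - \overline{q}_{N,p})\big) + o(1). \tag{3.46}$$
Considering (3.45) and (3.46) as a system of equations for $\overline{R}_{N,p}$ and $\overline{q}_{N,p}$, we get
$$\overline{q}_{N,p} = \frac{\tilde q_N + h^2}{(z + \Delta_N)^2} + o(1), \qquad \overline{R}_{N,p} - \overline{q}_{N,p} = \frac{1}{z + \Delta_N} + o(1), \tag{3.47}$$
where we denote for simplicity
$$\Delta_N \equiv \frac{\alpha}{\varepsilon} - \tilde U_N + \tilde q_N. \tag{3.48}$$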
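The passage from (3.45)–(3.46) to (3.47) can be checked by elementary algebra. The shorthand $x$, $w$, $\delta$ below is introduced only for this check and is not the paper's notation; all $o(1)$ terms are dropped.

$$x \equiv z + \alpha_N/\varepsilon, \qquad w \equiv \tilde U_N - \tilde q_N, \qquad \delta \equiv \overline{R}_{N,p} - \overline{q}_{N,p}.$$

Subtracting (3.45) from (3.46) and using $\overline{R}_{N,p} = \overline{q}_{N,p} + \delta$ together with $\tilde U_N^2 - \tilde q_N^2 = w(w + 2\tilde q_N)$, the terms proportional to $\overline{q}_{N,p}$ and to $\tilde q_N$ cancel and one is left with

$$x^2\delta = x + w + \delta w^2 \quad\Longrightarrow\quad \delta = \frac{x + w}{x^2 - w^2} = \frac{1}{x - w}.$$

Substituting this into (3.45) gives

$$x^2\,\overline{q}_{N,p} = (\tilde q_N + h^2)(1 + 2w\delta) + \overline{q}_{N,p}\,w^2, \qquad 1 + 2w\delta = \frac{x + w}{x - w} \quad\Longrightarrow\quad \overline{q}_{N,p} = \frac{\tilde q_N + h^2}{(x - w)^2}.$$

Since $x - w = z + \alpha_N/\varepsilon - \tilde U_N + \tilde q_N$, this agrees with (3.47) and the quantity defined in (3.48), up to replacing $\alpha_N$ by its limit $\alpha$, which changes the result only by $o(1)$.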
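Now we should find the expressions for $\tilde q_N$ and $\tilde U_N$. From the symmetry of the Hamiltonian (2.10) it is evident that
$$\tilde q_N = \alpha_N E\Big\{\frac{1}{\varepsilon^2}\big\langle N^{-1/2}(\xi^{(p)},J) - x^{(p)}\big\rangle^2\Big\} = \alpha_N E\Big\{\Big(\frac{d}{dk_1}\log\Big\langle\int_{x>0} dx\,\exp\Big\{-\frac{1}{2\varepsilon_1}\big(N^{-1/2}(\xi^{(p)},J) - x - k_1\big)^2\Big\}\Big\rangle_{p-1}\Big)^2\Big|_{k_1=k}\Big\} = \alpha_N E\Big\{\Big(\frac{d}{dk_1}\log\phi_N(\varepsilon_1,k_1)\Big)^2\Big|_{k_1=k}\Big\}. \tag{3.49}$$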
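Therefore, using Lemma 3, we derive:
$$\tilde q_N = \alpha_N E\Big\{\Big(\frac{d}{dk_1}\log H\Big(\frac{\sqrt{\overline{q}_{N,p}}\,u + k_1}{\sqrt{U_{N,p}}}\Big)\Big)^2\Big\} = \frac{\alpha_N}{U_{N,p}} E\Big\{A^2\Big(\frac{\sqrt{\overline{q}_{N,p}}\,u + k}{\sqrt{U_{N,p}}}\Big)\Big\}. \tag{3.50}$$
Here and below we denote
$$A(x) \equiv -\frac{d}{dx}\log H(x) = \frac{e^{-x^2/2}}{\sqrt{2\pi}\,H(x)}, \tag{3.51}$$
where the function $H(x)$ is defined by (1.8). Similarly
$$\tilde U_N = \alpha_N E\Big\{\frac{1}{\varepsilon^2}\big\langle (N^{-1/2}(\xi^{(p)},J) - x^{(p)})^2\big\rangle\Big\} = 2\alpha_N E\Big\{\frac{d}{d\varepsilon_1}\log\Big\langle\int_{x>0} dx\,\exp\Big\{-\frac{1}{2\varepsilon_1}\big(N^{-1/2}(\xi^{(p)},J) - x - k_1\big)^2\Big\}\Big\rangle_{p-1}\Big|_{\varepsilon_1=\varepsilon}\Big\} = 2\alpha_N E\Big\{\frac{d}{d\varepsilon_1}\log\phi_N(\varepsilon_1,k_1)\Big|_{\varepsilon_1=\varepsilon}\Big\}. \tag{3.52}$$
Now, using Lemma 3 and Lemma 1, we derive:
$$\tilde U_N = 2\alpha_N E\Big\{\frac{d}{d\varepsilon_1}\log\Big[\varepsilon_1^{1/2} H\Big(\frac{\sqrt{\overline{q}_{N,p}}\,u + k_1}{U^{1/2}_{N,p}(\varepsilon_1)}\Big)\Big]\Big|_{\varepsilon_1=\varepsilon}\Big\} = \frac{\alpha_N}{\varepsilon} + \frac{\alpha_N}{U^{3/2}_{N,p}} E\Big\{\big(k + \sqrt{\overline{q}_{N,p}}\,u\big) A\Big(\frac{\sqrt{\overline{q}_{N,p}}\,u + k}{\sqrt{U_{N,p}}}\Big)\Big\}. \tag{3.53}$$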
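The function A(x) of (3.51) enters all the equations that follow, so a small numerical sketch may help. It assumes that H in (1.8) is the Gaussian tail H(x) = (2π)^{-1/2} ∫_x^∞ e^{-t²/2} dt (the definition itself lies outside this section); the helper names are illustrative. The block compares the closed form of A with a finite-difference derivative of -log H and evaluates the standard identity A'(x) = A(x)(A(x) - x), which follows by differentiating (3.51).

```python
import math


def H(x):
    # Gaussian tail probability; assumed form of the function defined in (1.8).
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def A(x):
    # Closed form from (3.51): exp(-x^2/2) / (sqrt(2*pi) * H(x)).
    return math.exp(-0.5 * x * x) / (math.sqrt(2.0 * math.pi) * H(x))


def minus_dlog_h(x, step=1e-5):
    # Central finite difference for -d/dx log H(x).
    return -(math.log(H(x + step)) - math.log(H(x - step))) / (2.0 * step)


def d_a(x, step=1e-5):
    # Central finite difference for A'(x).
    return (A(x + step) - A(x - step)) / (2.0 * step)


# Each row: x, A(x), -d/dx log H(x), numerical A'(x), A(x) * (A(x) - x).
checks = [(x, A(x), minus_dlog_h(x), d_a(x), A(x) * (A(x) - x))
          for x in (-3.0, -1.0, 0.0, 0.5, 2.0, 5.0)]
```

In each row the second and third entries should agree, as should the fourth and fifth, up to finite-difference error.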
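Thus, from (3.45), (3.46), (3.50) and (3.53) we obtain the system of equations for $\overline{R}_{N,p}$ and $\overline{q}_{N,p}$,
$$\overline{q}_{N,p} = (\overline{R}_{N,p} - \overline{q}_{N,p})^2\Big[\frac{\alpha}{U_{N,p}} E\Big\{A^2\Big(\frac{\sqrt{\overline{q}_{N,p}}\,u + k}{\sqrt{U_{N,p}}}\Big)\Big\} + h^2\Big] + \tilde\varepsilon_N,$$
$$\frac{\alpha}{U^{3/2}_{N,p}} E\Big\{\big(\sqrt{\overline{q}_{N,p}}\,u + k\big) A\Big(\frac{\sqrt{\overline{q}_{N,p}}\,u + k}{\sqrt{U_{N,p}}}\Big)\Big\} = z + \frac{\overline{q}_{N,p}}{(\overline{R}_{N,p} - \overline{q}_{N,p})^2} - \frac{1}{\overline{R}_{N,p} - \overline{q}_{N,p}} - h^2 + \tilde\varepsilon'_N, \tag{3.54}$$
where $\tilde\varepsilon_N, \tilde\varepsilon'_N \to 0$, as $N,p\to\infty$, $\alpha_N\to\alpha$.
Proposition 3. For any $\alpha_0 < 2$ and small enough $h$ there exists $\varepsilon^*(\alpha_0,k,h)$ such that for all $\alpha < \alpha_0$, $\varepsilon \le \varepsilon^*$ and $z < \varepsilon^{-1/3}$ the solution of the system (3.54) tends, as $\tilde\varepsilon_N, \tilde\varepsilon'_N \to 0$, to $(R^*, q^*)$, which gives the unique point of $\max_R\min_q$ in the r.h.s. of (2.14).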
On the basis of this proposition we conclude that for almost all $z$, $h$ there exist the limits
$$\lim_{m\to\infty} E\Big\{\frac{d}{dz} f_{N_m,p_m}(k,h,z,\varepsilon)\Big\} = R^*(\alpha,k,h,z,\varepsilon),$$
d lim E fNm ,pm (k, h, z, ε) = h(R ∗ (α, k, h, z, ε) − q ∗ (α, k, h, z, ε)). m→∞ dh But since the r.h.s. here are continuous functions of z, h we derive that for any convergent subsequence fNm ,pm (k, h, z, ε) the above limits exist for all z, h. Besides, choosing a subsequence fNm ,pm (k, h, z, ε) which converges for any rational α, we obtain that for such that α = any Nm , pm m = αm
pm Nm
pm Nm
such that → α1 (α1 is a rational number) and pm
→ 0,
E fNm ,pm (αk , k, h, z, ε)} − E{fNm ,pm (αk , k, h, z, ε)
p −p
1 m m E log ZNm ,pm −i (k, h, z, ε) − log ZNm ,pm −i−1 (k, h, z, ε) = Nm i=0 ) p −p q Nm ,pm −i u + k 1 m m E log H → ,p −i Nm U N m m i=0 α1 q ∗ (α)u + k → E log H √ ∗ dα. R (α) + ε − q ∗ (α) 0
Thus, for all rational $\alpha$ there exists
$$\lim_{m\to\infty} E\{f_{N_m,p_m}(k,h,z,\varepsilon)\} = F(\alpha,k,h,z,\varepsilon),$$
where F (α, k, h, z, ε) is defined by (2.14). But since the free energy is obviously monotonically decreasing in α, we obtain that for any convergent subsequence the limit of the free energy coincides with the r.h.s. of (2.14). Hence, as it was already mentioned after Proposition 2, there exists a limit which coincides with the r.h.s. of (2.14). Theorem 2 is proven. Proof of Theorem 3. For any z > 0 let us take h small enough and consider z N,p (k, h, z) ≡ σN−1 dJ exp − (J , J ) − h(h, J ) , 2
N where
N,p ≡ J : N −1/2 (ξ (ν ) , J ) ≥ k, (ν = 1, . . . , p) .
To obtain the self-averaging of N −1 log(MN) (k, h, z) and the expression for E{N −1 log(MN) (k, h, z)} we define also the interpolating Hamiltonians, corresponding partition functions and free energies: p k − N −1/2 (ξ (ν ) , J ) z (µ) HN,p (J , k, h, z, ε) ≡ − log H + (J , J ) + h(h, J ), √ 2 ε ν=µ+1
(3.55)
(µ) ZN,p (k, h, z, ε)
≡
(µ)
σN−1
fN,p (k, h, z, ε, M) ≡ where
(µ)
(µ)
N,p
dJ exp{−HN,p (J , k, h, z, ε)},
1 log(MN) ZN,p (k, h, z, ε), N
(3.56)
(µ)
N,p ≡ J : N −1/2 (ξ (µ ) , J ) ≥ k, (µ = 1, . . . , µ) .
According to Theorem 2, for large enough M with probability more than (1 − O(N −1 )) 1 (p) (0) log(MN) (k, h, z), fN,p (k, h, z, ε, M) = fN,p (k, h, z, ε), fN,p (k, h, z, ε) = N where fN,p (k, h, z, ε) is defined by (2.12). Hence, fN,p (k, h, z, ε, M) − ˜
p 1 1 (µ) ˜ , log(MN) N,p (k, h, z) = N N µ=1
(µ)
≡
(µ−1) log(MN) ZN,p
(3.57)
(µ) − log(MN) ZN,p .
Below in the proof of Theorem 3 we denote by x (µ) ≡ N −1/2 (ξ (µ) , J ), by the sym(µ) bol . . . µ the Gibbs averaging corresponding to the Hamiltonian HN,p in the domain (µ−1)
N,p
(µ,µ)
and by ZN,p the corresponding partition function. Denote also Tµ (x) ≡ θ (x (µ) − x) , Xµ ≡ x (µ) . µ
µ
To proceed further, we use the following lemma:
Lemma 4. If the inequalities (3.21) are fulfilled and there exists an $N,\mu,\varepsilon$-independent $D$ such that
$$\frac{1}{N}\big\langle(\dot J,\dot J)\big\rangle_\mu \ge D^2, \tag{3.58}$$
then there exist $N,\mu,\varepsilon$-independent $K_1, C_1^*, C_2^*, C_3^*$ such that for $|X_\mu| \le \log N$,
$$T_\mu(k + 2\varepsilon^{1/4}) \ge C_1^* e^{-C_2^* X_\mu^2}, \qquad T_\mu(k - 2\varepsilon^{1/4}) - T_\mu(k + 2\varepsilon^{1/4}) \le \varepsilon^{1/4} C_3^* \tag{3.59}$$
with probability PN ≥ (1 − K1 N −3/2 ). (µ)
Remark 5. Similarly to Remark 3 one can conclude that, if ZN,p > e−MN , then there exists an ε, N, µ-independent R0 such that we can use the inequality |J | ≤ N 1/2 R0 with the error O(e−N const ). Remark 6. Denote by D˜ µ2 the l.h.s. of (3.58). Then 4D˜ µ2 θ (|J˙ | − 2D˜ µ N 1/2 ) µ ≤ N −1 (J˙ , J˙ ) = D˜ µ2 (µ,µ)
µ
1 ⇒ θ (|J˙ | − 2D˜ µ N 1/2 ) µ ≤ 4 4 −1 z (µ,µ) ⇒ ZN ≤ σN exp{− (J , J ) − h(h, J )} 3 2 |J˙ |<2D˜ µ N 1/2 4 ≤ (2D˜ µ )N e2hNR0 . 3 (µ,µ) Thus, the inequality Z > e−MN implies that D˜ µ ≥ 1 exp{−M − 2hR0 } ≡ D 2 . N,p
2
(p)
Let us prove the self-averaging property of fN,p (k, h, z, ε, M), using Lemma 4. Similarly to (3.25) we write (p)
(p)
fN,p (k, h, z, ε, M) − E{fN,p (k, h, z, ε, M)} =
p−1 1 ν , N ν=0
where
(p)
(p)
ν ≡ Eν {fN,p (k, h, z, ε, M)} − Eν+1 {(fN,p (k, h, z, ε, M)}.
Then E{ν ν } = 0, (ν = ν ) and therefore (p)
(p)
E{(fN,p (k, h, z, ε, M) − E{fN,p (k, h, z, ε, M)})2 } =
p−1 1 E{2ν }, N2
(3.60)
ν=0
where similarly to (3.26) 2
E{2ν } ≤ E{ν }, with
(3.61)
(p)
(p,ν+1)
ν ≡ log(MN) ZN,p − log(MN) ZN,p
,
(p,ν)
(p)
where ZN,p is the partition function, corresponding to the Hamiltonian HN,p in the domain N,p which differs from N,p by the absence of the inequality for µ = ν. Therefore for ν ≤ p − 1, (p,ν)
(p)
E{|ν |2 } = E{|p−1 |2 } = E{θ (ZN,p − e−MN )| log(MN) ZN,p − log(MN) ZN,p |2 } (p,p)
(p)
(p,p)
+E{θ (e−MN − ZN,p )| log(MN) ZN,p − log(MN) ZN,p |2 }. (p,p)
(p)
(p,p)
(3.62)
But the second term in the r.h.s. is zero, because ZN,p ≤ ZN,p and thus ZN,p ≤ e−MN (p)
(p,p)
(p,p)
implies ZN,p ≤ e−MN , and so log(MN) ZN,p = log(MN) ZN,p = −MN . Then, denoting by χµ the indicator function of the set, where Z (µ,µ) > e−MN , and the inequalities (3.59) are fulfilled, on the basis of Lemma 4, we obtain that 2 (p,p) E{ν } = E{θ (ZN,p − e−MN ) log2(M) θ (x (p) − k) } (p)
≤
(p)
(p,p)
p (p,p) −MN )θ (|Xp | − log N )} (MN ) [E{θ (ZN,p − e (p,p) +E{θ (ZN,p − e−MN )(1 − χp )θ (log N − |Xp |)}] 2
(p,p) +E θ (ZN,p − e−MN )χp θ (log N − |Xp |) log2 exp{−C1∗ Xµ2 }
≤ (MN )2 [e− log
2 N/2R 2 0
) + K1 N −3/2 ] + 2(R02 C1∗ )2 ≤ 2M 2 K1 N 1/2 . (3.63)
Here we have used that, according to the definition of the function log(MN) (see (2.15), | log(MN) θ (x (p) − k) p | ≤ MN . Besides, we used the standard Chebyshev inequality, according to which Pµ (X) ≡ Prob{Xµ ≥ X} ≤ e−X
2 /2R 2 0
.
(3.64)
Relations (3.60), (3.61) and (3.63) prove the self-averaging property of N1 log(MN) N,p (k, h, z). ˜ (µ) , defined in (3.57), for any µ satisfies the bound Now let us prove that ˜ (µ) }| |E{
(µ,µ) = |E{θ (ZN,p −e−MN )[log(MN) H((k−x (µ) )ε −1/2 ) −log(MN) θ(x (µ) −k) ]} µ
≤ ε K, λ
µ
(3.65)
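with some positive $N,\mu,\varepsilon$-independent $\lambda$, $K$. We remark here that, similarly to (3.62), $Z^{(\mu-1)}_{N,p}, Z^{(\mu)}_{N,p} \le Z^{(\mu,\mu)}_{N,p}$ and so, if $Z^{(\mu,\mu)}_{N,p} < e^{-MN}$, then $\log_{(MN)} Z^{(\mu-1)}_{N,p} = \log_{(MN)} Z^{(\mu)}_{N,p} = -MN$. Using the inequalities
$$H(-\varepsilon^{-1/4})\,\theta(x - \varepsilon^{1/4}) \le H\Big(-\frac{x}{\varepsilon^{1/2}}\Big) \le \varepsilon_1 + \theta(x + \varepsilon^{1/4}) \tag{3.66}$$
with $\varepsilon_1 \equiv H(\varepsilon^{-1/4})$, we get
$$\log H(-\varepsilon^{-1/4}) - E\big\{\theta(Z^{(\mu,\mu)}_{N,p} - e^{-MN})\log(1 + r_1(k,\varepsilon))\big\} \le E\{\tilde\Delta^{(\mu)}\} \le E\big\{\theta(Z^{(\mu,\mu)}_{N,p} - e^{-MN})\log(1 + r_2(k,\varepsilon))\big\}, \tag{3.67}$$
where
$$r_1(k,\varepsilon) \equiv \frac{T_\mu(k) - T_\mu(k + \varepsilon^{1/4})}{T_\mu(k + \varepsilon^{1/4})}, \qquad r_2(k,\varepsilon) \equiv \frac{T_\mu(k - \varepsilon^{1/4}) - T_\mu(k) + \varepsilon_1}{T_\mu(k)}.$$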
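The sandwich inequality (3.66) is what allows the smooth factor H to be compared with the hard constraint θ. The sketch below checks it on a grid. It assumes that H is the Gaussian tail function (the assumed form of (1.8)) and that θ is the Heaviside step with θ(0) = 1, a convention chosen here; the function name `sandwich_holds` is illustrative, and a tiny tolerance absorbs rounding at the two switching points.

```python
import math


def H(x):
    # Gaussian tail probability; assumed form of the function defined in (1.8).
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def theta(x):
    # Heaviside step; the value at 0 is a convention chosen for this sketch.
    return 1.0 if x >= 0 else 0.0


def sandwich_holds(x, eps, tol=1e-15):
    # Checks H(-eps^(-1/4)) theta(x - eps^(1/4)) <= H(-x / eps^(1/2))
    #        <= H(eps^(-1/4)) + theta(x + eps^(1/4)), as in (3.66).
    lower = H(-eps ** -0.25) * theta(x - eps ** 0.25)
    middle = H(-x / math.sqrt(eps))
    upper = H(eps ** -0.25) + theta(x + eps ** 0.25)
    return lower <= middle + tol and middle <= upper + tol


all_ok = all(sandwich_holds(i / 100.0, eps)
             for eps in (1e-1, 1e-2, 1e-4)
             for i in range(-300, 301))
```

Both bounds use only the monotonicity of H: for x ≥ ε^{1/4} the argument -x/ε^{1/2} is at most -ε^{-1/4}, and for x < -ε^{1/4} it exceeds ε^{-1/4}.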
But by virtue of Lemma 4, one can get easily that, if |Xµ | ≤ log N , then with probability (µ) PN ≥ (1 − K1 N −3/2 ), 2
r1,2 (k, ε) ≤ ε1/4 CeCXµ with some N, µ-independent C. Therefore, choosing λ ≡ 18 R02 (1 + 2CR02 )−1 and L2 ≡ 2λ| log ε|, for small enough ε we can write similarly to (3.63) (µ,µ) E θ (ZN,p − e−MN ) log(MN) 1 + r1,2 (k, ε) ≤ (MN )Pµ (log N ) + K1 N −3/2 (MN ) 2 + θ (log N − |X|) log(1 + ε1/4 CeCX )dPµ (X) 1/4 CL2 = ε Ce + C θ (|X| − L)X 2 dPµ (X) + o(1) 2
≤ ε1/4 CeCL + 2CL2 P (L) ≤ K(C, R0 )ε λ , where Pµ (X) is defined and estimated in (3.64) and we have used that, according to definition (2.15), −MN ≤ log(MN) θ (x (µ) − k) µ , log(MN) θ(x (µ) − k ± ε 1/4 ) µ ≤ 0 and therefore always | log(MN) (1 + r1,2 (k, ε))| ≤ MN . Using the bound 1 1 log log(MN) N,p (k, 0, z) ≤ 2hR0 , (MN) N,p (k, h, z) − N N
representation (3.57) and the self-averaging property of obtain that with probability PN ≥ 1 − O(N −1/2 ),
1 N
log(MN) N,p (k, h, z), we
1 log(MN) N,p (k, 0, z) N ≤ F (α, k, 0, z, ε) + O(ελ ) + O(h).
F (α, k, 0, z, ε) + O(ελ ) + O(h) ≤
Now we are going to use Corollary 1 to replace the integration over the whole space by the integration over the sphere of the radius N 1/2 . But since Theorem 2 is valid only for z < ε−1/3 , we need to check that minz {F (α, k, 0, z, ε) + 2z } takes place for a z, which satisfies this bound. Proposition 4. For any α < αc (k) there exists an ε-independent z(k, α) such that zmin < z(k, α). Then, using 2.9, we conclude that with the same probability for α ≤ αc (k), z min F (α, k, 0, z, ε) + + O(ε λ ) + O(h) z 2 1 N,p (k) log ≤ N (MN) z ≤ min F (α, k, 0, z, ε) + + O(ε λ ) + O(δ) + O(h). z 2 Thus,
(3.68)
2 1 1 ≤ O(ε2λ ) + O(h), log(MN) N,p (k) − E log(MN) N,p (k) lim E N→∞ N N (3.69) and since ε, h are arbitrarily small numbers (3.69) proves the self-averaging property of 1 1 N log(MN) N,p (k). Besides, averaging N log(MN) N,p (k) with respect to all random variables and taking the limits h, ε → 0, we obtain (2.14) from (3.69). The last statement of Theorem 3 follows from the one proven above, if we note that log(MN) N,p (k) is a monotonically decreasing function of α and, on the other hand, the r.h.s. of (2.16) tends to −∞ as α → αc (k). Hence, we have finished the proof of Theorem 3.
4. Auxiliary Results Proof of Proposition 1. Let us fix t ∈ (t1∗ , t2∗ ), take some small enough δ and consider Dδ (t) which is the set of all J ∈ A(t) ∩ D whose distance from the boundary of D is more than dN 1/2 max{δ, 2K0 δ}. Now for any J 0 ∈ Dδ (t) consider (J˜ , φ(J˜ )) – the local parametrisation of D with the points of the (N − 1)-dimensional hyper-plane ˜ = 0}, where n˜ is the projection of the normal n to D at the point B = {J˜ : (J˜ , n) J 0 on the hyper-plane B(t). We chose the orthogonal coordinate system in B in such a way that J˜1 = (J , e) = N 1/2 t. Denote J˜0 = P J 0 (P is the operator of the orthogonal projection on B). According to the standard theory of the Minkowski sum (see e.g.[Ha]), the boundary of 21 A(t) × 21 A(t + δ) consists of the points J =
1 1 J + J (δ) (J ), 2 2
(4.1)
where J belongs to the boundary of A(t) and the point J (δ) (J ) (belonging to the boundary of A(t + δ)) is chosen in such a way that the normal to the boundary of A(t + δ) at this point coincides with the normal n to the boundary of A(t) at the point J . Denote ˜ 1 ) the part of the boundary of 1 A(t) × 1 A(t + δ) for which in representation (4.1) D( 2 2 2 J ∈ Dδ (t). Now for J 0 ∈ Dδ (t) let us find the point J (δ) (J 0 ). Since by construction (δ) ∂ φ(J˜ 0 ) = 0 (i = 2, . . . , N − 1), we obtain for J˜ (J 0 ) ≡ P J (δ) (J 0 ) the system of ∂ J˜i equations ∂ (δ) φ(J˜ ) = 0, (i = 2, . . . , N − 1) ˜ ∂ Ji (δ) and J˜1 = N 1/2 (t + δ). Then we get (δ) −1 −1 J˜i = J˜i0 + δN 1/2 (D11 ) (D −1 )i,1 + o(δ)
(i = 2, . . . , N − 1),
(4.2)
˜ where the matrix {Di,j }N−1 i,j =1 consists of the second derivatives of the function φ(J ) (Di,j ≡
∂2 ∂ J˜i ∂ J˜j
φ(J˜ )). Thus, it was mentioned above, the point J 1 ≡ ( 21 (J˜ 0 + J˜
1 ˜ 1 ˜ ˜ (δ) 2 (φ(J 0 )) + φ(J )) ∈ D( 2 ). Consider also the point J 1 (δ) J˜ ))) ∈ A(t + 21 δ) ∩ D. Then,
(δ)
),
(δ) ≡ ( 21 (J˜ 0 + J˜ ), φ( 21 (J˜ 0 +
1 ˜ 1 ˜ (δ) (δ) (J 0 + J˜ ) − φ(J 0 ) + φ(J˜ ) 2 2 N−1 N −1 δ 2 −1 2 −1 −1 −1 −1 = N (D1,1 ) Di,j Di,1 Dj,1 + 2D1,1 Di,1 Di,1 + D1,1 2
|J 1 − J 1 | = φ
i,j =2
+o(δ )N δ 2
2
−1 −1 (D1,1 )
i=2
+ o(δ ). 2
−1 −1 ) ≥ λmin , where λmin is the minimal eigenvalue of the matrix D. Therefore, But (D1,1 since
λmin =
(D J˜ , J˜ ) min (D J˜ , J˜ ) ≥ min ≥ κmin ≥ K0 N −1/2 , (4.3) 2 (n, e)2 )3/2 ˜ ˜ ˜ ˜ ˜ (1 + J (J ,J )=1 (J ,J )=1 1
we obtain that |J 1 − J 1 | ≥ δ 2 K0 N 1/2 .
(4.4)
(δ) ∂ φ(J˜ 0 ) = 0 and ∂˜ φ(J˜ ) = 0, we get that the tan∂ J˜i ∂ Ji gent hyper-plane of the boundary 21 A(t) × 21 A(t + δ) at the point J 1 is orthogonal to ˜ 1) (J 1 − J 1 ). So, in fact, we have proved that the distance between Dδ (t + 21 δ) and D( 2 1 1 2 1/2 ˜ ), we obtain that ˜ ) ≡ mesD( is more than δ K0 N . Thus, denoting by S( 2 2
Besides, since by construction
V
1 1 1 2 1/2 ˜ ˜ t + δ −V ≥ δ N K0 S + o(δ 2 ) = δ 2 N 1/2 K0 S(t) + o(δ 2 ). (4.5) 2 2 2
˜ 1 ) = S(t) + o(1), as δ → 0, because the boundary D is Here we have used that S( 2 smooth. Therefore, denoting V˜ (τ ) the volume of τ A(t) × (1 − τ )A(t + δ) and using (4.5), we get 1 2V 1/N t + δ − V 1/N (t) − V 1/N (t + δ) 2 1/N 1 2 −1/2 ˜ ≥ 2 V( )+δ N K0 S(t) − V˜ 1/N (0) − V˜ 1/N (1) + o(δ 2 ) 2 1 2δ 2 K0 S(t) = 2V˜ 1/N ( ) − V˜ 1/N (0) − V˜ 1/N (1) + + o(δ 2 ) 2 N 1/2 V˜ 1−1/N ( 21 ) ≥
2δ 2 K0 S(t) N 1/2 V 1−1/N (t + 21 δ)
+ o(δ 2 ) = 2δ 2 K0 C(t)V 1/N (t) + o(δ 2 ).
Here we have used the inequality 2V˜ 1/N ( 21 ) − V˜ 1/N (0) − V˜ 1/N (1) ≥ 0, which follows from the Brunn-Minkowski theorem and the relation V (t + 21 δ) = V (t) + o(1) (as δ → 0). Then, sending δ → 0, we obtain the statement of Proposition 1. Proof of Lemma 1. Since log H(x) is a concave function of x, HN,p (J , h, z, ε) is a convex function of J , satisfying (2.3). Since log H(x) < 0 for any x, (2.4) is also fulfilled. To prove (2.5) let us write 3 (µ) (ν) ξi ξi Aµ Aν + 3h2 (hh) + 3z2 (J , J ) Nε i,µ,ν $ % −1 2 2 2 ≤ const ε Aµ + z (J , J ) + h (hh)
|∇HN,p (J )|2 ≤
µ
$ ≤ const ε
−1
∗
pC −
µ
N −1/2 (J , ξ (µ) ) log H k − √ ε
where we denote for simplicity Aµ ≡ A k −
%
N −1/2 (J ,ξ (µ) ) √ ε
+ h + z (J , J ) , (4.6) 2
2
, with the function A(x)
defined in (3.51). The second inequality in (4.6) is based on the first line of (3.21), the third inequality is valid by virtue of the bound 21 A2 (x) ≤ − log H (x) + C ∗ , with some constant C ∗ , and due to the second line of (3.21). Taking into account (2.4) one can conclude also that for any U there exists some N-independent constant C(U ) such that (J , J ) ≤ N C(U ), if HN,p (J ) ≤ N U . Thus, we can derive from (4.6) that under conditions (3.21), (2.5) is fulfilled. Besides, due to the inequality log H (x) ≥ C1∗ − 21 x 2 , it is easy to obtain that fN,p (k, h, z, ε) ≥ C1∗ + so (2.6) is also fulfilled.
1 log det(ε−2 X + zI ), N
Hence, we have proved that under conditions (3.21) the norm of the matrix D ≡ {J˙i J˙j }N i,j =1 is bounded by some N -independent C(z, ε). Then with the same probability N −1 N J˙i J˙j 2 = N −1 TrD2 ≤ C(z, ε), i,j =1
which implies (3.22). To prove (3.23) let us observe that θ (|JN | − N 1/2 εN ) = θ(|c| − εN ) (U,c) ,
(4.7)
where . . . (U,c) is defined in (3.3)–(3.7) with e = (0, . . . , 0, 1). For the function sN (U, c), defined by (3.5), we get ∂ H (J ) exp{−HN.p (J )}|JN =0 dJ1 . . . dJN−1 ∂ −1/2 ∂JN N.p =N sN (U, 0) ∂c exp{−HN.p (J )}|JN =0 dJ1 . . . dJN−1 (U,0) p hhN 1 (µ) = 1/2 + ξN Aµ . (4.8) N Nε JN =0 µ=1
(µ)
But since Aµ |JN =0 does not depend on ξN , by using the standard Chebyshev inequality, we obtain that ∂ 2 2 Prob (4.9) s (U, 0) > εN ≤ e−C1 NεN = e−C1 log N . ∂c N (U,0) On the other hand, since sN (U, c) is a concave function of U, c satisfying (3.13), denoting φN (U, c) ≡ sN (U, c) − U − (sN (U ∗ , c∗ ) − U ∗ ) for any (U, c) ∼ (U ∗ , c∗ ), one can write C0 [(U − U ∗ )2 + (c − c∗ )2] ≤ −
∂ ∂ φN (U, c)(c − c∗ ) − φN (U, c)(U − U ∗ ). (4.10) ∂c ∂U
Multiplying this inequality by eNφN (U,c) and integrating with respect to U , we obtain for c = 0, ∂ + O(N −1 ). sN (U, 0) C0 (c∗ )2 ≤ c∗ ∂c (U,0) Therefore, taking into account (4.9), we get that, if (3.21) is fulfilled, then εN 2 Prob |c∗ | > ≤ e−C1 log N . 2
(4.11)
But, using the Laplace method, we get easily εN 2 2 θ |c − c∗ | − ≤ e−CNεN ≤ e−C log N . 2 (U,c) Combining this inequality with (4.7) and using the symmetry with respect to J1 , . . . , JN , we obtain (3.23).
Proof of Proposition 2. Applying Lemma 2 to the sequences fNm ,pm and fNm ,pm −1 as function of z, we obtain immediately relations (3.33) for RNm ,pm for all z, where the limiting free energy f (z, h) has continuous first derivative with respect to z. Besides, since for all λ ∈ (−1, 1) and arbitrarily small δ > 0,
λE δ −1 fNm ,pm (z − δ) − fNm ,pm (z − 2δ) ≤ E log exp λNm−1 (J , J ) −1
≤ λE δ (fNm ,pm (z + 2δ) − fNm ,pm (z + δ) ,
we obtain that E log exp{λ(Nm−1 (J , J )} − R Nm ,pm ) → 0 for all such z and all λ ∈ (−1, 1). Using Remark 2, we can derive then that fm (λ) ≡ E exp λ(Nm−1 (J , J ) − R Nm ,pm ) → 1. (3)
Then, since it follows from Remark 2 that fk (λ) is bounded uniformly in m and λ, we derive that fm (λ) → 0 and, taking here λ = 0, obtain (3.35). To derive relations (3.33) for qNm ,pm we consider fNm ,pm and fNm ,pm −1 as functions of h, derive from Lemma 2 that 2
E Nm−1 (h, J Nm ,pm ) − E Nm−1 (h, J Nm ,pm ) → 0, and therefore E Nm−1 (h, J Nm ,pm )−E Nm−1 (h, J Nm ,pm ) Nm−1 (J Nm ,pm , J Nm ,pm ) → 0. Integrating it with respect to hi , we get
. m ˙ ˙ E (qNm ,pm −q Nm ,pm −(RNm ,pm − R Nm ,pm ))qNm ,pm = N22 N i,j =1 E Ji Ji Jj Jj . m
Using relations (3.22) and (3.27) we derive now (3.33) for qNm ,pm . Proof of Lemma 3. Let us note that, by virtue of Lemma 1, computing φN (ε1 , k1 ), 4 φ0,N (ε1 , k1 ) with probability more than (1 − e−C2 log N ) we can restrict all the integrals with respect to J to the domain
N = |Ji | ≤ εN N 1/2 , (i = 1, . . . , N ), (J , J ) ≤ N R02 . In this case the error for φN (ε1 , k1 ) and φ0,N (ε1 , k1 ) will be of the order O(N e−C1 log N ). So below in the proof of Lemma 3 we denote by ... p−1 the Gibbs measure, corresponding to the Hamiltonian HN,p−1 in the domain N . In this case the inequalities (3.22) are also valid, because their l.h.s., comparing with those, computing in the whole RN , 2 have errors of the order O(N 2 e−C1 log N ). 2
We start from the proof of the first line of (3.38). To this end consider the functions , FN (t) ≡ θ (N −1/2 (ξ (µ) , J ) − t) p−1 −1/2 F0,N (t) = H UN,p (0) N −1/2 (ξ (µ) , J p−1 ) − t , (4.12) ψN (u) ≡ exp iu(ξ (µ) , J˙ )N −1/2 , p−1 2 ψ0,N (u) ≡ exp − u2 (RN,p−1 − qN,p−1 ) .
Take $L \equiv \dfrac{\pi}{4\varepsilon_N}$. According to the Lyapunov theorem (see [Lo]),
$$\max_t |F_N(t) - F_{0,N}(t)| \le \frac{2}{\pi}\int_{-L}^{L} u^{-1}\,du\,|\psi_N(u) - \psi_{0,N}(u)| + \frac{\mathrm{const}}{L}. \tag{4.13}$$
Since evidently
$$\phi_N(\varepsilon_1,k_1) = \varepsilon_1^{1/2}\int H\big(\varepsilon_1^{-1/2}(k_1 - t)\big)\,dF_N(t), \qquad \phi_{0,N}(\varepsilon_1,k_1) = \varepsilon_1^{1/2}\int H\big(\varepsilon_1^{-1/2}(k_1 - t)\big)\,dF_{0,N}(t),$$
we obtain
$$|\phi_N(\varepsilon_1,k_1) - \phi_{0,N}(\varepsilon_1,k_1)| \le \mathrm{const}\,\max_t |F_N(t) - F_{0,N}(t)|. \tag{4.14}$$
u−2 |ψN (u) − ψ0,N (u)|2 du , I1 ≡ E 1 I2 ≡ du(1) du(2) 1<|u(1) |,|u(2) |
I2
≡ Ep du(1) du(2) ψN (u(1) )ψ N (u(2) ) 1<|u(1) |,|u(2) |
p−1
.
(4.16) We would like to prove that one can replace the product of cos(ai ) in (4.16) by the product of exp{−ai2 /2}. So we should estimate
≡E
(1)
1<|u(1) |,|u(2) |
− exp
du du
(2)
N
cos N
−1/2
(1) u(1) J˙j
j =1
2 1 (1) ˙(1) (2) − u Jj − u(2) J˙j 2N
.
Let us denote
(1) (2) log cos N −1/2 τ u(1) J˙j − u(2) J˙j
2 τ 2 (1) (1) (2) (2) + . u Jj − u Jj 2N i
p−1
g(τ ) ≡
(2) − u(2) J˙j
(4.17)
Then
M. Shcherbina, B. Tirozzi
(1) (2) g(1) g(0) || = du du e − e 1<|u(1) |,|u(2) |
(4.18)
But since g(0), g (0), g (0), g (0) = 0, 4 1 const (1) ˙(1) (2) ˙(2) |g(1) − g(0)| ≤ |g (4) (ζ )| ≤ + u J J u j j 6 0 N 2 1 (1) (1) (2) (2) 2 N −1 (J˙ , J˙ ) + N −1 (J˙ , J˙ ) |u(1) |4 + |u(2) |4 . ≤ const εN Besides, using the inequality (valid for any |x| ≤ log cos x + we obtain that
|eg(0) + eg(1) | ≤ 2 exp
π 2)
x2 x2 ≤ , 2 6
1 (1) ˙(1) (2) 2 . u Jj + u(2) J˙j 6N
2 . Hence, we have proved that Thus, we get from (4.18) || ≤ const εN (1,2) 2 1 (1) (1) 2 Al,m u(l) u(m) + O(εN ), I2 = du(1) du(2) exp − 2 l,m=1
p−1
where
1 ˙(l) ˙(l) 1 (1) (J , J ), (l = 1, 2) A1,2 = (J ˙(1) , J ˙(2) ). N N Now, taking into account that Proposition 2 implies (1,2)
(1) → 0, (N → ∞), E (Al,m − Al,m )2 (1)
Al,l =
m,l=1,2
p−1
where Al,m = δl,m (RN,p−1 − qN,p−1 ), we obtain immediately that du(1) du(2) E ψN (u(1) )ψ N (u(2) ) (1) |,|u(2) |
In the same way one can prove also du(1) du(2) E ψN (u(1) )ψ 0,N (u(2) ) 1<|u (1) |,|u(2) |
(4.19)
which gives us that I2 = o(1). Similarly one can prove that I1 = o(1). Then, using (4.15), we obtain the first line of (3.38). To prove the second line of (3.38) we denote by A ≡ (φN (ε1 , k1 )), B ≡ (φ0,N (ε1 , k1 )), −1/2 ε˜ N ≡ E{(A − B)2 }, L˜ ≡ | log ε˜ N |˜εN , and write E | log A − log B|2 ≤ E θ (L˜ − A−1 )θ (L˜ − B −1 )(| log A − log B|2 +2E (θ (L˜ − A−1 ) + θ(L˜ − B −1 ))(log2 A + log2 B) ˜ −2 E (log4 A + log4 B) ≤ 4L˜ −2 E (A − B)2 + 4| log L| ˜ −2 const ≤ const | log L| ˜ −3/2 . ≤ 4˜εN L˜ −2 + | log L|
(4.20)
Here we have used the inequality | log A − log B| ≤ |A − B|(A−1 + B −1 ), the first line of (3.38) and the fact that E{log4 A}, E{log4 B} are bounded (it can be obtained similarly to (3.28)–(3.29)). Since we have proved above that ε˜ N → 0, as N → ∞, inequality (4.20) implies the second line of (3.38). The third and the fourth line of (3.38) can be derived in the usual way (see e.g. [P-S-T2]) from the second line by using the fact that the functions log φN (ε1 , k1 ) and log φ0,N (ε1 , k1 ) are convex with respect to ε1−1 and k1 . The convergence in distribution N −1/2 (ξ (p) , J p−1 ) → q N,p u follows from the central limit theorem (see, e.g. the book [Lo]), because J p−1 does not depend on ξ (p) and the Lindeberg condition is fulfilled: 1 2 Ji 4p−1 ≤ const εN . N2 i
Thus, to finish the proof of Lemma 3 we are left to prove (3.40). It can be easily done, e.g. for µ = p and ν = p − 1, if we in the same manner as above consider the functions 1 (2) dx1 dx2 exp − (N −1/2 (ξ (p) , J ) − x1 − k1 )2 φN (ε1 , ε2 , k1 , k2 ) ≡ 2ε1 x1 ,x2 >0
1 − (N −1/2 (ξ (p) , J ) − x2 − k2 )2 (4.21) 2ε2 p−1 (2)
φ0,N (ε1 , ε2 , k1 , k2 ) −1/2 (p) −1/2 (p−1) (ξ , J p−2 ) − k1 (ξ , J p−2 ) − k2 N N ≡ (ε1 ε2 )1/2 H H , 1/2 1/2 UN,p−2 (ε1 ) UN,p−2 (ε2 ) (4.22) and prove for them the analogue of relations (3.38). Then relations (3.40) will follow immediately. The self-averaging property for U˜ N and q˜N follows from the fact that (2) φ0,N (ε1 , ε2 , k1 , k2 ) is a product of two independent functions.
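Proof of Proposition 3. It is easy to see that Eqs. (3.54) have the form
$$\frac{\partial F}{\partial q} = O(\tilde\varepsilon_N), \qquad \frac{\partial F}{\partial R} = O(\tilde\varepsilon'_N), \tag{4.23}$$
where $F(q,R)$ is defined by (2.14). Let us make the change of variables $s = q(R + \varepsilon - q)^{-1}$. Then Eqs. (4.23) take the form
$$\frac{\partial\tilde F}{\partial s} = O(\overline{\varepsilon}_N), \qquad \frac{\partial\tilde F}{\partial R} = O(\overline{\varepsilon}_N), \tag{4.24}$$
where $\overline{\varepsilon}_N = |\tilde\varepsilon_N| + |\tilde\varepsilon'_N|$ and
$$\tilde F(s,R) \equiv \alpha E\Big\{\log H\Big(u\sqrt{s} + \frac{k\sqrt{1+s}}{\sqrt{\varepsilon + R}}\Big)\Big\} + \frac12\,\frac{s(R+\varepsilon)}{R - \varepsilon s} + \frac12\log(R - \varepsilon s) - \frac12\log(1+s) - \frac{z}{2}R + \frac{h^2}{2}\,\frac{R - \varepsilon s}{1+s}. \tag{4.25}$$
Then (4.24) can be written in the form
$$f_1(s,R) \equiv -\frac{\alpha}{s}E\{A^2\} + \frac{(R+\varepsilon)^2}{(R-\varepsilon s)^2} - \frac{h^2}{s(s+1)}(R+\varepsilon) = O(\overline{\varepsilon}_N),$$
$$f_2(s,R) \equiv \frac{\alpha k\sqrt{1+s}}{(R+\varepsilon)^{3/2}}E\{A\} - \frac{\varepsilon s(s+1)}{(R-\varepsilon s)^2} + \frac{1}{R-\varepsilon s} + \frac{h^2}{s+1} - z = O(\overline{\varepsilon}_N), \tag{4.26}$$
where the function $A(x)$ is defined by (3.51) and, to simplify formulae, we here and below omit the arguments of the functions $A$ and $A'$. Below we shall use also the corollary from Eqs. (4.26) of the form (cf. (3.47))
$$f_3(R,s) \equiv \frac{1+s}{R+\varepsilon}\Big(\frac{R+\varepsilon}{R-\varepsilon s} - \alpha E\{A'\}\Big) - z = O(\overline{\varepsilon}_N). \tag{4.27}$$
But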
√ √ k 1+s u s+ √ AA ε+R α αk + 2 E A2 + 2 E AA s s (1 + s)1/2 (ε + R)1/2 2(R + ε)2 ε h2 (2s + 1) h2 + + (R + ε) > R. (R − εs)3 s 2 (s + 1)2 s2
α ∂ f1 (s, R) = − 2 E ∂s s
(4.28)
Here we have used the inequalities (see [A]):
$$A(x) \le \frac{\sqrt{x^2+4} + x}{2} \;\Rightarrow\; A^2(x) - xA'(x)A(x) = A^2(x)\big(1 + x^2 - xA(x)\big) > 0, \tag{4.29}$$
which gives us that the sum of the first two terms in (4.28) is positive. Therefore we conclude that the equation $\frac{\partial\tilde F}{\partial s}(s,R) = 0$ for any $R$ has a unique solution $s = s(R)$ and, if we consider the first of Eqs. (4.24), then its solution $s_1(R)$ for any $R$ behaves like
$$s_1(R) = s(R) + O(\overline{\varepsilon}_N). \tag{4.30}$$
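The two statements in (4.29) are easy to probe numerically. The following sketch, again assuming that H is the Gaussian tail function of (1.8), evaluates the gap in the upper bound on A(x) and the factor 1 + x² - xA(x) on a grid; the grid range is an arbitrary choice for illustration.

```python
import math


def H(x):
    # Gaussian tail probability; assumed form of the function defined in (1.8).
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def A(x):
    # A(x) = -d/dx log H(x), written in the closed form of (3.51).
    return math.exp(-0.5 * x * x) / (math.sqrt(2.0 * math.pi) * H(x))


def upper_bound(x):
    # Right-hand side of the first inequality in (4.29).
    return 0.5 * (math.sqrt(x * x + 4.0) + x)


grid = [i / 100.0 for i in range(-1000, 2001)]  # x from -10 to 20
gap_min = min(upper_bound(x) - A(x) for x in grid)
positivity_min = min(1.0 + x * x - x * A(x) for x in grid)
```

For large positive x both quantities become small (the gap decays like 1/x³ and the second factor like 2/x²) but stay positive, which is why the bound is sharp enough for the argument.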
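On the other hand,
$$\frac{\partial^2\tilde F}{\partial R^2}(s,R) = -\frac{3\alpha k\sqrt{s+1}}{2(R+\varepsilon)^{5/2}}E\{A\} - \frac{\alpha k^2(s+1)}{2(R+\varepsilon)^3}E\{A'\} - \frac{R - 3\varepsilon s - 2s^2\varepsilon}{(R-\varepsilon s)^3}. \tag{4.31}$$
Thus, if we prove that
$$3\varepsilon(1+s)^2 \le \frac12 R, \tag{4.32}$$
we get
$$\frac{\partial^2\tilde F}{\partial R^2}(s,R) < -\frac{R}{2(R-\varepsilon s)^3}, \tag{4.33}$$
and then obtain that the function $\varphi(R) \equiv \tilde F(s(R),R)$ is concave. So the equation $\varphi'(R) = 0$ has the unique solution $R^*(\alpha,k)$, which is a maximum of $\varphi(R)$. Besides, in view of (4.33) the solution of the equation $\varphi'(R) = O(\tilde\varepsilon_N)$ has the form $R = R^* + O(\tilde\varepsilon_N)$. But in view of (4.30) the second equation of (4.24) can be rewritten in the above form. Therefore its solution tends to $R^*(\alpha,k)$ as $\overline{\varepsilon}_N \to 0$. Thus, our goal is to prove (4.32). Denote
$$\tilde k = k\,(s(R+\varepsilon))^{-1/2}(1+s)^{1/2}, \qquad D \equiv \sqrt{\alpha I_2(\tilde k)} - \alpha H(-\tilde k),$$
$$I_2(\tilde k) \equiv \frac{1}{\sqrt{2\pi}}\int_{-\tilde k}^{\infty}(u + \tilde k)^2 e^{-u^2/2}\,du = \frac{\tilde k}{\sqrt{2\pi}}e^{-\tilde k^2/2} + (1 + \tilde k^2)H(-\tilde k). \tag{4.34}$$
We shall use the inequalities
$$sI_2(\tilde k) + K_1 < E\{A^2\} < sI_2(\tilde k) + K_1(1 + \tilde k^2) + K_2 s^{2/3}, \qquad H(-\tilde k) < E\{A'\} < H(-\tilde k) + K_3 s^{-1/3}e^{-\tilde k^2/3}, \tag{4.35}$$
and for $\frac13 \le \alpha \le \alpha_0 < 2$,
$$D = \frac{\alpha H(-\tilde k)}{\sqrt{\alpha I_2(\tilde k)} + \alpha H(-\tilde k)}\Big(\tilde k A(-\tilde k) + \tilde k^2 + 1 - 2H(-\tilde k) + (2-\alpha)H(-\tilde k)\Big) > K_4. \tag{4.36}$$
Here $K_{1,2,3,4}$ do not depend on $s$, $\tilde k$, and to obtain (4.36) we have used the inequality
$$\tilde k A(-\tilde k) + \tilde k^2 + 1 - 2H(-\tilde k) \ge \Big(1 - \frac{2}{\pi}\Big)\tilde k^2,$$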
which we have checked numerically. We study first the case when k = 0. 1 Consider R ≥ Kε −1/3 , where K ≡ min{ 24 ; (4.26) and (4.35), we get
K42 48 }.
˜ + K2 s 2/3 + 2K1 ) + s sf1 (R, s) ≥ −α(sI2 (k)
For such R, using the first lines of (R + ε)2 h2 − (R − εs)2 1+s
α0 − O(ε 1/3 ))−1 [α(2K1 + h2 + K2 s 2/3 ) + h2 ] 2 (4.37) ⇒ s < K 1 (α0 , h).
⇒ s < (1 −
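The auxiliary inequality used to obtain (4.36) is stated above to have been checked numerically. A minimal sketch of such a check follows. It assumes the inequality reads k̃A(-k̃) + k̃² + 1 - 2H(-k̃) ≥ (1 - 2/π)k̃² for k̃ ≥ 0 (our reading of the formula), that H is the Gaussian tail function of (1.8), and an arbitrary grid 0 ≤ k̃ ≤ 10; none of these choices is taken from the authors' own computation.

```python
import math


def H(x):
    # Gaussian tail probability; assumed form of the function defined in (1.8).
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def A(x):
    # A(x) = -d/dx log H(x), closed form of (3.51).
    return math.exp(-0.5 * x * x) / (math.sqrt(2.0 * math.pi) * H(x))


def lhs(kt):
    # Left-hand side of the auxiliary inequality, kt standing for k-tilde.
    return kt * A(-kt) + kt * kt + 1.0 - 2.0 * H(-kt)


def rhs(kt):
    # Assumed right-hand side: (1 - 2/pi) * kt^2.
    return (1.0 - 2.0 / math.pi) * kt * kt


grid = [i / 1000.0 for i in range(0, 10001)]  # kt from 0 to 10
worst = min(lhs(kt) - rhs(kt) for kt in grid)
```

Both sides vanish at k̃ = 0 and agree to second order there, so the minimum of the difference on the grid is attained at the origin; a small negative tolerance is therefore appropriate when testing `worst`.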
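It is evident that there exists $\varepsilon_1^*(\alpha_0,h)$ such that for any $\varepsilon < \varepsilon_1^*(\alpha_0,h)$ the last inequality in (4.37) under condition $R > K\varepsilon^{-1/3}$ implies (4.32). Consider now $R \le K\varepsilon^{-1/3}$. If $\alpha < \frac13$, then (4.35) and (4.27) imply
$$\alpha E\{A'\} < \frac12\,\frac{R+\varepsilon}{R-\varepsilon s} \;\Rightarrow\; z = \frac{1+s}{R+\varepsilon}\Big(\frac{R+\varepsilon}{R-\varepsilon s} - \alpha E\{A'\}\Big) \ge \frac12\,\frac{1+s}{R-\varepsilon s} \;\Rightarrow\; \frac{1+s}{R-\varepsilon s} \le 2\varepsilon^{-1/3}. \tag{4.38}$$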
1 −1/3 ε then evidently there exists ε2∗ such that for any ε ≤ ε2∗ and any α < 13 , If R ≤ 48 (4.32) follows from (4.38). Let now 13 < α ≤ α0 < 2. The first equation of (4.26), (4.27) and the inequalities (4.35), ) 1+s R+ε 1+s z= − αE{A } > αE{A2 } − αE{A } R + ε R − εs R+ε 1+s ˜2 > D − K3 s −1/3 e−k /3 R+ε 1+s ˜2 ⇒ D − K3 (1 + s)−1/3 e−k /3 ≤ ε−1/3 , (4.39) R+ε
where D is defined by (4.34). Inequalities (4.39) and (4.36) give us two possibilities: ˜
1+s K4 −1/3 (i) K3 s −1/3 e−k2 /3 ≤ K24 ⇒ (R+ε) 2 ≤ε −2 ⇒ 3ε(1 + s)2 ≤ 12K4 ε 1/3 (R + ε)2 and R > K4−1 ε 1/3 − ε; ˜
(ii) K3 s −1/3 e−k2 /3 >
K4 2
⇒
1+s (R+ε)
≤ k −1
8K33 K43
≡ K5
(4.40)
⇒ $3\varepsilon(1+s)^2 \le 3\varepsilon K_5^2(R+\varepsilon)^2$ and $R > K_5^{-1}$. One can see easily that there exists $\varepsilon_3^*(\alpha_0,k)$ such that for any $\alpha < \alpha_0$, $\varepsilon \le \varepsilon_3^*$, under condition $R \le \frac{K_4^2}{48}\varepsilon^{-1/3}$, (4.40) imply (4.32). Hence, we have proved (4.32) for any $\alpha < \alpha_0$, $\varepsilon < \varepsilon^*(\alpha_0,k,h) \equiv \min\{\varepsilon_1^*, \varepsilon_2^*, \varepsilon_3^*\}$ ($k \ne 0$). Now to finish the proof of Proposition 3 we are left to prove (4.32) for $k = 0$. Since the only place above where we have used that $k \ne 0$ is the case (ii) of (4.40), it is enough to prove that Eqs. (4.26) for $k = 0$ imply
$$R \ge \frac12 (z - h^2)^{-1}. \tag{4.41}$$
But for $k = 0$ the second equation in (4.26) is quadratic in $R$ with the first root satisfying the bound (4.41) and the second root $R = \varepsilon s(s+2) + O(\varepsilon^2 z)$. Substituting the second root in the first equation of (4.26), we obtain
αE{A2 } +
h2 = s + 2 + s −1 + O(εz). s+1
(4.42)
But using the first inequality in (4.29) we have E{A2 } ≤ s+2 2 (k = 0). Therefore for any small enough h there exists ε∗ (α0 , h) such that for any α < α0 < 2 (4.42) has no solutions. So we have proved (4.41) which, as it was mentioned below implies the statement of Proposition 3 for k = 0. Proposition 3 is proven.
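The claim that (4.42) has no solutions can be made explicit. The following is only a sketch under our reading of (4.42) as $\alpha E\{A^2\} + h^2/(s+1) = s + 2 + s^{-1} + O(\varepsilon z)$, assuming $s > 0$ and that the $O(\varepsilon z)$ term can be made smaller than any fixed constant by taking $\varepsilon$ small:

```latex
\begin{align*}
\text{l.h.s. of (4.42)} &\le \frac{\alpha_0}{2}(s+2) + h^2
   && \text{since } E\{A^2\} \le \tfrac{s+2}{2},\ \tfrac{h^2}{s+1} \le h^2, \\
\text{r.h.s. of (4.42)} &\ge (s+2) + O(\varepsilon z)
   && \text{since } s^{-1} > 0, \\
\text{r.h.s.} - \text{l.h.s.} &\ge \Big(1 - \frac{\alpha_0}{2}\Big)(s+2) - h^2 + O(\varepsilon z)
   \ \ge\ (2 - \alpha_0) - h^2 + O(\varepsilon z).
\end{align*}
% For h^2 < 2 - \alpha_0 and \varepsilon small enough the last expression is
% strictly positive, so the two sides of (4.42) cannot coincide.
```

The threshold $h^2 < 2 - \alpha_0$ is what this crude estimate gives; the text only asserts the existence of some admissible range of small $h$.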
Proof of Proposition 4. One can see easily that, if we want to study minz {F (α, k, 0, z, ε)+ z 2 }, then we should consider the system (4.26) with zeros in the r.h.s. and with the additional equation ∂ F (α, k, 0, z, ε) = 1 ⇔ R = 1. ∂z Thus we need to substitute R = 1 in the first equation. Since the l.h.s. of this equation for ε = 0 is an increasing function which tends to 1 − ααc−1 > 0, as s → ∞, there exist the unique s ∗ , which is the solution of this equation. Then, choosing ε small enough, it is easy to obtain that s(ε) is in some ε-neighbourhood of s ∗ and therefore s(ε) ≤ s(k, α). Then, substituting this s(ε) in the second equation, we get the ε-independent bound for z. Proof of Lemma 4. Repeating conclusions (3.3)–(3.6) of the proof of Theorem 1, one can see that (4.43) θ (x (µ) − k) µ = θ (c − kN −1/2 ) (U,c) , (µ−1)
(µ)
where . . . (U,c) are defined by (3.7) (see also (3.3), (3.5) for N = N,p , N = HN,p . (µ) (µ) (µ) (µ) and c = N −1 ξi Ji . We denote φN (c, U ) ≡ (sN (c, U )−U −(sN (c∗ , U ∗ )−U ∗ )), (µ) where sN (c, U ) is defined by (3.5) and (c∗ , U ∗ ) is the point of maximum of the function (µ) sN (c, U ) − U . (µ) Applying Theorem 1, we find that sN (c, U ) is a concave function of (c, U ) and it satisfies (3.14). Denote M ≡ {(U, c) : N φN (c, U ) ≥ M}, c∗ ,c˜ ≡ {(U, c) : c∗ ≤ c ≤ c˜ }, (µ)
(4.44)
and let for any measurable B ⊂ R2 m(B) ≡ χB (c, U ) (U,c) . To prove Lemma 4 we use the following statement: (µ)
Proposition 5. If the function φN (c, U ) is concave and satisfies inequality (3.14), 1/2
˜ U ), then c, ˜ c˜ > c∗ , and the constant A ≤ − 2(Nc−c ˜ ∗ ) maxU φN (c, (µ)
√ θ (c − c)e ˜ AN c (U,c) ≤ 2e NAc˜ , θ (c − c) ˜ (U,c) 1/2
(4.45)
and for any M < −4, m(M ) ≤
1 , 4
m(M ∩ c∗ ,c˜ ) 1 ≤ . m(M ∩ c∗ ,c˜ ) 4
(4.46)
The proof of this proposition is given after the proof of Lemma 4. 1/2 (µ) ˜ U ). Using (4.45), we get Let us choose any c˜ > c∗ and A = − 2(Nc−c ˜ ∗ ) maxU φN (c, 1/2 ˜ eAN (c−c)
1/2
θ (c − c)e ˜ AN c (U,c) θ(c − c2 ) (U,c) (U,c) θ(c − c) ˜ (U,c) ≤ θ (c2 − c) (U,c) + 2θ (c − c2 ) (U,c) ≤ 2. (4.47) On the other hand, we shall prove below = θ (c2 − c) (U,c) +
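Proposition 6. For any $|A| \le O(\log N)$,
$$
g(A) \equiv \log\big\langle \exp\{AN^{1/2}(c - \langle c\rangle)\}\big\rangle_{(U,c)}
= \log\big\langle \exp\{AN^{-1/2}(\xi^{(\mu)}, \dot J)\}\big\rangle_{\mu}
= \frac{A^2}{2N}\big\langle(\dot J, \dot J)\big\rangle_{\mu} + R_N,
\qquad E\{R_N^4\} = O(A^{16}N^{-2}). \tag{4.48}
$$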
It follows from this proposition that the probability to have for all $A_i = \pm 1, \ldots, \pm[\log N]$ the inequalities
$$
e^{A_i^2 R_0^2} \ge \big\langle \exp\{A_i N^{1/2}(c - \langle c\rangle)\}\big\rangle_{(U,c)} \ge e^{A_i^2 D^2/4} \tag{4.49}
$$
is $P_N \ge 1 - O(N^{-3/2})$. Therefore, using that $\log\langle\exp\{AN^{1/2}(c - \langle c\rangle)\}\rangle_{(U,c)}$ is a convex function of $A$, and this function is zero for $A = 0$, one can conclude that with the same probability for any $A$ : $1 \le |A| \le \log N$,
$$
e^{2A^2 R_0^2} \ge \big\langle \exp\{A N^{1/2}(c - \langle c\rangle)\}\big\rangle_{(U,c)} \ge e^{A^2 D^2/8}. \tag{4.50}
$$
The first of these inequalities implies in particular that for any $0 < L < \log N$,
$$
\big\langle\theta(\langle c\rangle - LN^{-1/2} - c)\big\rangle_{(U,c)}
\le \min_{A>0}\big\langle \exp\{AN^{1/2}(\langle c\rangle - LN^{-1/2} - c)\}\big\rangle_{(U,c)}
\le e^{-L^2/8R_0^2}. \tag{4.51}
$$
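The last bound in (4.51) is an exponential Chebyshev estimate. As a sketch, writing $\langle c\rangle$ for the average of $c$ (our reading of the centering term), and assuming that the upper bound in (4.50) may be used with $A$ replaced by $-A$ and that the optimal $A$ falls into the admissible range $1 \le A \le \log N$, the computation reads:

```latex
\begin{align*}
\big\langle\theta(\langle c\rangle - LN^{-1/2} - c)\big\rangle_{(U,c)}
 &\le \big\langle \exp\{AN^{1/2}(\langle c\rangle - LN^{-1/2} - c)\}\big\rangle_{(U,c)}
   && \text{since } \theta(x) \le e^{Ax},\ A > 0, \\
 &= e^{-AL}\,\big\langle \exp\{-AN^{1/2}(c - \langle c\rangle)\}\big\rangle_{(U,c)} \\
 &\le \exp\{-AL + 2A^2R_0^2\}
   && \text{by the first inequality in (4.50)}.
\end{align*}
% Minimizing the exponent -AL + 2A^2 R_0^2 over A > 0:
%   d/dA = -L + 4 A R_0^2 = 0  =>  A = L / (4 R_0^2),
%   minimal value = -L^2/(4R_0^2) + L^2/(8R_0^2) = -L^2/(8R_0^2).
```

For small $L$ the minimizer $L/(4R_0^2)$ drops below 1 and leaves the range in which (4.50) is stated, so this computation covers only the regime where the optimal $A$ is admissible.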
The same bound is valid for θ (c − c − LN −1/2 ) (U,c) . Thus, assuming that c > c∗ and denoting L0 = 21 N 1/2 (c − c∗ ), c1 ≡ c − 2L0 N −1/2 = c∗ , c2 ≡ c − L0 N −1/2 , c3 ≡ c + L0 N −1/2 , we can write 1 = θ (c1 − c) (U,c) + χc1 ,c3 (c) (U,c) + θ(c − c3 ) (U,c) ≤ 4e−L0 /8R0 ⇒ N |c − c∗ |2 = 4L20 ≤ 16R02 . 2
2
(4.52)
(µ)
Here we have used (4.51) and the fact that since φN (U, c) is a concave function and (U ∗ , c∗ ) is the point of its maximum, we have for any d > 0 and c˜ > c∗ , χc, ˜ c+d ˜ (c) (U,c) ≤ χc∗ ,c∗ +d (c) (U,c) ⇒ χc2 ,c (c) (U,c) , χc ,c3 (c) (U,c) ≤ χc∗ ,c2 (c) (U,c) ≤ θ (c∗ − c) (U,c) ≤ e−L0 /8R0 . 2
2
(4.53)
The case c < c∗ can be studied similarly. We would like to stress here that Theorem 1 also allows us to estimate N |c − c∗ |2 , but this estimate can depend on ε. Now let us come back to (4.47). In view of (4.50) for our choice of A, 8N 1/2 (c˜ − c ) + 4D A2 D 2 (µ) − AN 1/2 (c˜ − c ) ≤ log 2 ⇒ A ≤ ⇒ max φN (c, ˜ U) U 8 D2 7(c˜ − c )2 + 3(c − c∗ )2 4 K0 (c˜ − c )2 ≥ −2 − − ≥ −14 (4.54) 2 2 D N D N with some N, µ, ε-independent K0 .
˜ c) Let us take L1 = 8R0 and c˜ > c +L1 N −1/2 . Consider M( ˜ ≡ N maxU φN (c + 2(c˜ − c ), U ). ˜ c) If M( ˜ < −4, consider the sets (µ)
1 ≡ {(U, c) : c > c}, ˜
2 ≡ {(U, c) : c − L1 N −1/2 ≤ c ≤ c}. ˜
(4.55)
Applying (4.46) and (4.51), we get 3 3 )≥ , m(M( ˜ c) ˜ 4 4 1 ⇒ m(M( ˜ c) ˜ c) ˜ ∩ (1 ∪ 2 )) ≥ 2 ≥ m(M( ˜ ∪ (1 ∩ 2 )) m(M( ˜ c) ˜ ∩ 1 ) ⇒ θ (c − c) ˜ (U,c) ≥ m(M( ˜ c) ˜ c) ˜ ∩ (1 ∪ 2 )) + m(M( ˜ ∪ (1 ∩ 2 ))
m(1 ∪ 2 ) ≥
≥
m(M( ˜ c) ˜ ∩ 1 ) 2(m(M( ˜ c) ˜ c) ˜ ∩ 1 ) + m(M( ˜ ∩ 2 ))
≥
1 ˜ c) ˜ S S −1 ) 2(1 + e−M( 2 1
(4.56)
,
where we denote by S1,2 the Lebesgue measure of M( ˜ c) ˜ ∩ 1,2 , and use the fact that (µ) ˜ c). ˜ 0 ≥ Nφ (U, c) ≥ M( N
(µ)
Consider the point (c + 2(c˜ − c ), U1 ), found from the condition N φN (c + ˜ c) 2(c˜ − c ), U1 ) = M( ˜ and two points (c, ˜ U2 ), (c, ˜ U3 ) which belong to the boundary of M( . Since is a convex set, if we draw two straight lines through the first and ˜ c) ˜ c) ˜ M( ˜ the second and the first and the third points and denote by T the domain between these lines, then T ∩ 1 ⊂ M( ˜ c) ˜ c) ˜ ∩ 1 and M( ˜ ∩ 2 ⊂ T ∩ 2 . Therefore (c˜ − c )2 1 S1 ≥ ≥ . 2 2 S2 (2(c˜ − c ) + L1 ) − (c˜ − c ) 8
(4.57)
Thus, we derive from (4.56): ˜
θ (c − c) ˜ (U,c) ≥
˜ eM(c) ˜ c) ˜ + 16 2eM(
(4.58)
.
˜ c) If M( ˜ > −4, let us choose c1 > c∗ , which satisfies condition N maxU φN (2c1 , U ) = −4 (c1 > c + 2(c˜ − c )). Replacing in the above consideration M( ˜ c) ˜ by −4 , we finish the proof of the first line of (3.59). To prove the second line of (3.59) we choose any c1 > c∗ + L1 N −1/2 , which satisfies (µ) the condition N maxU φN (2c1 , U ) < −4, denote d = 2ε 1/4 N −1/2 and write similarly to (4.56), (µ)
χc∗ ,c∗ +d (c) (U,c) ≤ ≤
m(−4 ∩ c∗ ,c∗ +d ) + m(−4 ∩ c∗ ,c∗ +d ) m(−4 ∩ c∗ ,c∗ +d ) 5m(−4 ∩ c∗ ,c∗ +d )
4m(−4 ∩ c∗ ,c∗ +d ) 5e4 S˜2 5e4 (c1 − c∗ )2 − (c1 − c∗ − d)2 ≤ ≤ ≤ ε1/4 C3∗ , 4 (c1 − c∗ − d)2 4S˜1
(4.59)
where we denote by S˜1,2 the Lebesgue measures of −4 ∩ c∗ ,c∗ +d and −4 ∩ c∗ ,c∗ +d respectively. Now, using the first line of (4.53), we obtain the second line of (3.59). Lemma 4 is proven. Proof of Proposition 5. Let us introduce new variables ρ ≡ (c − c∗ )2 + (U − U ∗ )2 , (µ) U −U ∗ . Then φN (ρ, ϕ) for any ϕ is a concave function of ρ. ϕ ≡ arcsin √ ∗ 2 ∗ 2 (c−c ) +(U −U )
(µ)
Let r(ϕ) be defined from the condition N φN (r(ϕ), ϕ) = M. Consider φM (ρ, ϕ) ≡ (µ) (µ) r −1 (ϕ) · φN (r(ϕ), ϕ)ρ. Since φN (ρ, ϕ) is concave, we obtain that (µ)
φN (ρ, ϕ) ≥ φM (ρ, ϕ), 0 ≤ ρ ≤ r(ϕ), (µ) φN (ρ, ϕ) ≤ φM (ρ, ϕ), ρ ≥ r(ϕ).
(4.60)
Thus, denoting by R the l.h.s. of the first inequality in (4.46), we get
(µ) dϕ ρ>r(ϕ) dρ exp{N φN (ρ, ϕ)} R≤ (µ) dϕ ρr(ϕ) dρ exp{N φM (ρ, ϕ)} (1 − M)eM 1 ≤ ≤ . ≤ 1 − (1 − M))eM 4 dϕ ρ
d d (µ) 1 φN (ρϕ , ϕ) 1 d (µ) ≤ φc˜ (ρϕ , ϕ) ≤ φ (ρϕ , ϕ) − φ (ρϕ , ϕ). dρ dρ N 2 ρϕ 2 dρ N Thus, for any ϕ we can write ρ>ρϕ
(µ)
dρeNφN
(ρ,ϕ) AN 1/2 (cos ϕρ−c) ˜ e (µ)
NφN ρ>ρϕ e
(ρ,ϕ)
≤
d (µ) −1/2 cos ϕ −1 dρ φN (ρϕ , ϕ) + AN d (µ) φ (ρϕ , ϕ)−1 dρ N
≤ 2.
This inequality implies (4.45). Proof of Proposition 6. To prove Proposition 6 we use the method developed in [P-S-T2]. Consider the function g(A) defined by (4.48) and let us write the Taylor expansion up to the second order with respect to t for g(tA) (t ∈ [o, 1]). Then 1 1 RN = A2 dt (1 − t)g (tA)dt − A2 g (0) 2 0 1 t (µ) 3 ξi (J˙ , J˙ )J˙i µ,t1 dt (1 − t) dt1 N −3/2 =A 0
0
1
+A2 0
dt (1 − t)N −1
i=j
(µ) (µ) ˙ ˙ ξj Ji Jj µ,t
ξi
(1)
(2)
≡ RN + RN , (4.61)
where we denote
(. . .) exp{tAN −1/2 (ξ (µ) , J )} µ ≡ . ( µ ) exp{tAN −1/2 (ξ , J )}
... µ,t
µ
Let us estimate (1) E{(RN )4 }
≤A N 12
−6
1
dt 0
i1 =i2 =i3 =i4
(µ) (µ) (µ) (µ)
E{ξi1 ξi2 ξi3 ξi4
(J˙ , J˙ )J˙i1 µ,t (J˙ , J˙ )J˙i2 µ,t (J˙ , J˙ )J˙i3 µ,t (J˙ , J˙ )J˙i4 µ,t } (µ) (µ) +6 E{ξi2 ξi3 (J˙ , J˙ )J˙i1 2µ,t (J˙ , J˙ )J˙i2 µ,t (J˙ , J˙ )J˙i3 µ,t } i1 =i2 =i3
3
E{(J˙ , J˙ )J˙i1 2µ,t (J˙ , J˙ )J˙i2 2µ,t }
i1 =i2
4
i1 =i2
E{ξi1 ξi2 (J˙ , J˙ )J˙i1 3µ,t (J˙ , J˙ )J˙i2 µ,t } + (µ) (µ)
E{(J˙ , J˙ )J˙i1 4µ,t } .
i1
(4.62) Now, using the formula of integration by parts (3.43), taking into account that in our case ∂(µ) = Ath−1 N −1/2 ∂h∂ i , and then using integration by parts with respect to the ∂ξi
Gaussian variable hi , one can substitute (µ)
E{ξi
. . . t,µ } → Ath−1 N −1/2 E{hi . . . t,µ } + N −3/2 A3 O(E{(J˙i )2 (. . .) t,µ }). (4.63)
Thus, for the first sum in (4.62), we obtain 4 1 −4 16 −8 E{ 1 } ≤ h A N dtE hi1 (J˙ , J˙ )J˙i1 µ,t + O(A18 N −3 ) 0 i1 2 1 −4 16 −2 −1 −1 ˙ ˙ 2 2 ˙ ˙ ≤h A N dtE N hi hj Ji Jj µ,t (N (J , J )) 0 i,j ≤ const A16 N −2 .
(4.64)
Here to estimate the error term in (4.63) we use that, according to Theorem 1 (see (2.8)), p for any fixed p E{J˙i µ,t } is bounded by N -independent constant. (1)
Other sums in the r.h.s. of (4.62) and E{(RN )4 } can be estimated similarly to (4.64). Acknowledgement. The authors would like to thank Prof. A. D. Milka for the fruitful discussion of the geometrical aspects of the problem and Prof. M. Talagrand for the valuable remarks.
References

[A-S] Abramowitz, M., Stegun, I. (eds.): Handbook of Mathematical Functions. National Bureau of Standards Applied Mathematics Series 55, 1964
[B-G] Bovier, A., Gayrard, V.: Hopfield models as generalized random mean field models. In: Mathematical Aspects of Spin Glasses and Neuronal Networks, A. Bovier, P. Picco (eds), Progress in Probability 41. Birkhäuser, 1998, pp. 3–89
[B-L] Brascamp, H.J., Lieb, E.H.: On extensions of the Brunn-Minkowski and Prékopa-Leindler theorems, including inequalities for log concave functions, and with an application to the diffusion equation. J. Funct. Anal. 22, 366–389 (1976)
[D] Derrida, B.: Random energy model: An exactly solvable model of disordered systems. Phys. Rev. B24, 2613–2626 (1981)
[D-G] Derrida, B., Gardner, E.: Optimal storage properties of neural network models. J. Phys. A: Math. Gen. 21, 271–284 (1988)
[G] Gardner, E.: The space of interactions in neural network models. J. Phys. A: Math. Gen. 21, 257–270 (1988)
[Ha] Hadwiger, H.: Vorlesungen über Inhalt, Oberfläche und Isoperimetrie. Springer-Verlag, 1957
[K-T-J] Kosterlitz, J.M., Thouless, D.J., Jones, R.C.: Spherical model of spin glass. Phys. Rev. Lett. 36, 1217–1220 (1976)
[K-K-P-S] Khorunzhy, A., Khoruzhenko, B., Pastur, L., Shcherbina, M.: The large-n limit in statistical mechanics and spectral theory of disordered systems. In: Phase Transitions and Critical Phenomena, vol. 15, p. 73. Academic Press, 1992
[Lo] Loève, M.: Probability Theory. D. Van Nostrand Comp. Inc., 1960
[M-P-V] Mézard, M., Parisi, G., Virasoro, M.A.: Spin Glass Theory and Beyond. Singapore: World Scientific, 1987
[P-S] Pastur, L., Shcherbina, M.: Absence of self-averaging of the order parameter in the Sherrington-Kirkpatrick model. J. Stat. Phys. 62, 1–26 (1991)
[P-S-T1] Pastur, L., Shcherbina, M., Tirozzi, B.: The replica-symmetric solution without replica trick for the Hopfield model. J. Stat. Phys. 74(5/6), 1161–1183 (1994)
[P-S-T2] Pastur, L., Shcherbina, M., Tirozzi, B.: On the replica symmetric equations for the Hopfield model. J. Math. Phys. 40 (1999)
[S1] Shcherbina, M.V.: On the replica symmetric solution for the Sherrington-Kirkpatrick model. Helvetica Physica Acta 70, 772–797 (1997)
[S2] Shcherbina, M.: Some estimates for the critical temperature of the Sherrington-Kirkpatrick model with magnetic field. In: Mathematical Results in Statistical Mechanics. World Scientific, Singapore, 1999, pp. 455–474
[S-T1] Shcherbina, M., Tirozzi, B.: The free energy of a class of Hopfield models. J. Stat. Phys. 72(1/2), 113–125 (1993)
[S-T2] Shcherbina, M., Tirozzi, B.: On the volume of the intersection of a sphere with random half spaces. C.R. Acad. Sci. Paris Ser. I 334, 1–4 (2002)
[T1] Talagrand, M.: Rigorous results for the Hopfield model with many patterns. Prob. Theor. Rel. Fields 110, 176–277 (1998)
[T2] Talagrand, M.: Exponential inequalities and replica symmetry breaking for the Sherrington-Kirkpatrick model. Ann. Probab. 28, 1018–1068 (2000)
[T3] Talagrand, M.: Intersecting random half-spaces: Toward the Gardner-Derrida problem. Ann. Probab. 28, 725–758 (2000)
[T4] Talagrand, M.: Self averaging and the space of interactions in neural networks. Random Structures and Algorithms 14, 199–213 (1998)

Communicated by A. Kupiainen
Commun. Math. Phys. 234, 423–454 (2003) Digital Object Identifier (DOI) 10.1007/s00220-002-0745-9
Communications in
Mathematical Physics
On the Gribov Problem for Generalized Connections

Christian Fleischhack 1,2

1 Max-Planck-Institut für Mathematik in den Naturwissenschaften, Inselstrasse 22-26, 04103 Leipzig, Germany
2 Institut für Theoretische Physik, Universität Leipzig, Augustusplatz 10/11, 04109 Leipzig, Germany. E-mail: [email protected]

Received: 4 March 2002 / Accepted: 20 August 2002
Published online: 30 January 2003 – © Springer-Verlag 2003
Abstract: The bundle structure of the space A of Ashtekar’s generalized connections is investigated in the compact case. It is proven that every stratum is a locally trivial fibre bundle. The only stratum being a principal fibre bundle is the generic stratum. Its structure group equals the space G of all generalized gauge transforms modulo the constant center-valued gauge transforms. For abelian gauge theories the generic stratum is globally trivial and equals the total space A. However, for a certain class of non-abelian gauge theories – e.g., all SU(N) theories – the generic stratum is nontrivial. This means, there are no global gauge fixings – the so-called Gribov problem. Nevertheless, for many physical measures there is a covering of the generic stratum by trivializations each having total measure 1. Finally, possible physical consequences and the relation between fundamental modular domains and Gribov horizons are discussed.

1. Introduction

The functional integral approach to quantum field theories consists of two basic steps: first the construction of a “physical” (Euclidean) measure on the configuration space and second the reconstruction of the quantum theory via an Osterwalder-Schrader procedure. In this paper we will focus on a certain issue arising in the first step – the Gribov problem.

The configuration space of smooth gauge field theories has a very complicated structure. The space A/G of all smooth gauge orbits is, in general, neither affine, nor compact, nor a manifold. In contrast, the space A of all smooth gauge fields (mathematically, connections) is an affine space, and therefore (from the mathematical point of view) by far more pleasant. That is why in the 60s Faddeev and Popov [18] proposed to transfer the problem of finding a measure on A/G to that on A by means of a certain gauge fixing. This change of coordinates produced a kind of a degenerate Jacobi determinant, the so-called Faddeev-Popov determinant.
About ten years later, Gribov [27] noticed a severe problem: The Coulomb gauge is (for certain gauge theories) no gauge fixing
since in some gauge orbits several connections fulfill the gauge condition. Therefore, besides the Faddeev-Popov determinant a corresponding, highly nontrivial factor had to be inserted into the functional integral. But, Singer showed that the situation is still worse: He investigated [40] this problem more systematically and found that the non-existence of gauge fixings is a typical property of gauge theories with non-abelian structure group. Mathematically, this means simply that there is no global section in the foliation A −→ A/G of the set of connections over the gauge orbit space. More precisely, Singer was even able to prove (explicitly for structure group G = SU (N ) and spacetime manifold M = S r ) that there is no global section even in the principal fibre bundle of all irreducible connections being a subbundle of A −→ A/G. Most of these problems can be formulated and could arise in a similar way also in the Ashtekar framework which differs (in the gauge-theoretical sense) from the standard one mainly in the usage of distributional connections instead of only smooth ones. Therefore we want to study in the present paper whether there is a Gribov problem for Ashtekar connections as well and, if necessary, study its impact. Our considerations can be understood as a continuation of the investigations started in [23, 22] on the structure of the configuration space of generalized gauge theories. The present paper is organized as follows: After introducing the necessary notations we will investigate first the fibre bundle structure of the strata and prove then that the generic stratum is the only stratum having the structure of a principal fibre bundle. Next, using rather simple simplicial arguments we show that the generic stratum of Gk for sufficiently large k is globally trivial up to a subset of Haar measure zero. By means of the reduction mapping from [23] we can lift this statement to the space A of generalized connections. 
This means the generalized Gribov horizon – if not empty anyway – is always contained in a zero subset for a large class of physical measures. Afterwards, we investigate whether the Gribov problem indeed occurs or not: On the one hand, we prove the triviality of A in the abelian case. On the other hand, we state a group-theoretical criterion for the nontriviality of the generic stratum of A in the nonabelian case. Using homotopy arguments, we find concrete structure groups that fulfill this criterion. Among them are, e.g., all nontrivial SU(N). This means that the Gribov problem appears for generalized connections again. The paper concludes with a discussion of the results and an outlook.

2. Preliminaries

We recall the necessary facts and notations from [23] where the details can be found. Other references are [1–3]. Let the “space-time” M be an at least two-dimensional manifold, m ∈ M some fixed point and the structure group G be a compact and (for technical reasons) connected Lie group. P denotes the groupoid of all paths in M, HG the group of all paths starting and ending in m. The set A of generalized connections A is defined by $A := \varprojlim_{\Gamma} G^{\#E(\Gamma)} = \mathrm{Hom}(P, G)$. Here $\Gamma$ runs over all finite graphs in M. $E(\Gamma)$ is the set of edges in $\Gamma$, $V(\Gamma)$ will be that of all vertices. The set G of generalized gauge transforms g is $G := \varprojlim_{\Gamma} G^{\#V(\Gamma)} = \mathrm{Maps}(M, G)$. It acts continuously on A via $h_{A\circ g}(\gamma) = g_{\gamma(0)}^{-1}\, h_A(\gamma)\, g_{\gamma(1)}$, where the path $\gamma$ is in P and $h_A$ is the homomorphism corresponding to A. The corresponding canonical projection from A to A/G is denoted by π. The stabilizer B(A) of A contains exactly those gauge transforms that fulfill $h_A(\gamma_x) = g_m^{-1}\, h_A(\gamma_x)\, g_x$ for all x ∈ M and whose m-component $g_m$ lies in the holonomy
centralizer $Z(H_A)$ of A, i.e. the centralizer of the holonomy group of A. Here, for all x, $\gamma_x$ is some fixed path from m to x. We have $B(A) \cong Z(H_A)$. Now, the orbit type of A is defined to be the G-conjugacy class of B(A), but equivalently it can be defined to be the G-conjugacy class $[Z(H_A)]$ of the holonomy centralizer of A. This definition will be used in the following. The types are partially ordered by the natural inclusion-induced ordering of classes of subgroups of G. A stratum $A_{=t}$ is the set of all connections A ∈ A having type t and the generic stratum $A_{gen}$ is the set of all connections having the maximal orbit type [Z] where Z ≡ Z(G) is the center of G. $A_{gen}$ is an open, dense and G-invariant subset of A. Furthermore, there is a slice theorem on A. This means, for every A ∈ A there is a so-called slice S ⊆ A with A ∈ S such that:
• S ◦ G is an open neighbourhood of A ◦ G and
• there is an equivariant retraction F : S ◦ G −→ A ◦ G with $F^{-1}(\{A\}) = S$.

The most important tool for the proof of this theorem has been a so-called reduction mapping $\varphi_\alpha$. Note, due to the compactness of G, any centralizer is finitely generated and consequently there is a finite set α ⊆ HG of paths starting and ending in m such that $Z(h_A(\alpha)) = Z(H_A)$. Since $Z(h_A(\alpha))$ is the orbit type of $h_A(\alpha)$ w.r.t. the adjoint action of G on $G^{\#\alpha}$, the reduction mapping $\varphi_\alpha : A \longrightarrow G^{\#\alpha}$ with $A \longmapsto h_A(\alpha)$ lifts the slice theorem from $G^{\#\alpha}$ to A. The notion of a reduction mapping will be crucial again in the present paper.

Finally we note that there is a natural measure $\mu_0$ on A, the so-called Ashtekar-Lewandowski measure or induced Haar measure [2, 22]. At the moment, we should only notice that $\mu_0(\pi_\Gamma^{-1}(U)) = \mu_{Haar}(U)$ for all graphs $\Gamma$ with corresponding projections $\pi_\Gamma : A \longrightarrow G^{\#E(\Gamma)}$, $\pi_\Gamma(A) = (h_A(e_1), \ldots, h_A(e_{\#E(\Gamma)}))$, and all measurable $U \subseteq G^{\#E(\Gamma)}$.

3. Fibre Bundle Structure of the Strata

The first question about the structure of the fibering π : A −→ A/G is, of course, whether it is a fibre bundle or not. In general, the answer is negative; at least, if we demand that all fibres A ◦ G have to be isomorphic G-spaces. This required [10] that all stabilizers are G-conjugate. But, this implies [23] that all Howe subgroups of G have to be G-conjugate. This, however, is only possible for abelian G. Hence, for arbitrary G we can expect bundle structures only on subsets of A where all connections have conjugate stabilizers. The maximal sets of that kind are exactly the sets of connections having one and the same gauge orbit type t, i.e. the strata $A_{=t}$.

Proposition 3.1. Let t ∈ T be a gauge orbit type and $A \in A_{=t}$ be some connection of that type. Then the stratum $A_{=t}$ is a locally trivial fibre bundle over $A_{=t}/G$ with fibre B(A)\G and structure group B(A)\N(B(A)) acting on the typical fibre by left translation. Here, N(B(A)) denotes the normalizer of B(A) w.r.t. G.

Proof. We know from [23] that there is a slice theorem on A. Now the assertion follows immediately because in general every stratum of a space with slice theorem is such a fibre bundle [24].
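To make the notion of orbit type concrete, here is a small worked case for the structure group G = SU(2). It is not taken from the text; it only uses the standard fact that every non-central element of SU(2) lies in exactly one maximal torus. Under that assumption the centralizer of a subset S ⊆ SU(2) is:

```latex
\[
Z(S) =
\begin{cases}
SU(2), & S \subseteq \{\pm 1\}, \\[2pt]
T' \cong U(1), & S \subseteq T' \text{ for a maximal torus } T',\ S \not\subseteq \{\pm 1\}, \\[2pt]
\{\pm 1\} = Z(SU(2)), & \text{otherwise.}
\end{cases}
\]
% Hence there are at most three candidate orbit types,
%   [SU(2)],  [U(1)],  [Z] = [\{\pm 1\}],
% where [Z] is the maximal (generic) type in the convention of the text.
```

So for SU(2) there are at most three strata, and only the one of type [Z] is the generic stratum. Which of these types are actually realized by holonomy groups of generalized connections is a separate question, treated in [23].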
Remark. The preceding proposition is a further generalization of a result for regular connections to the Ashtekar approach. Already in 1978, Daniel and Viallet [12] noticed that classically gauge fixings always exist at least locally for irreducible connections. This result was extended to general connections by later statements [38, 37, 32, 33] on the triviality of the strata.

We know already from [23] that a lot of structures in G can be reduced to structures in the Lie group G. For instance, we have seen that B(A) is always isomorphic to $Z(H_A)$. This, the other way round, implies that the homeomorphy class of each orbit is determined not only by B(A)\G, but also by $Z(H_A)\backslash G$. We will see now that there is an analogous result for the structure group B(A)\N(B(A)). We start investigating N(B(A)) itself.

Proposition 3.2. Let A ∈ A and g ∈ G. Furthermore, we again fix for every x ∈ M a path $\gamma_x$ from m to x, where $\gamma_m$ is trivial. Then we have g ∈ N(B(A)) iff
1. $g_m \in N(Z(H_A))$ and
2. $g_x \in h_A(\gamma_x)^{-1}\, Z(Z(H_A))\, g_m\, h_A(\gamma_x)$ for all x ∈ M.

Proof. In general, g ∈ N(B(A)) ⇐⇒ $g^{-1} B(A)\, g \subseteq B(A)$. So let g ∈ G and g' ∈ B(A). Then we have
$$
\begin{array}{rll}
g^{-1} g' g \in B(A)
 & \Longleftrightarrow\ 1.\ g_m^{-1} g'_m g_m \in Z(H_A) & \\
 & \phantom{\Longleftrightarrow}\ 2.\ h_A(\gamma_x) = (g_m^{-1} g'_m g_m)^{-1}\, h_A(\gamma_x)\, (g_x^{-1} g'_x g_x) & \forall x \in M \\[4pt]
 & \Longleftrightarrow\ 1.\ g_m^{-1} g'_m g_m \in Z(H_A) & \\
 & \phantom{\Longleftrightarrow}\ 2.\ g_m h_A(\gamma_x) g_x^{-1} h_A(\gamma_x)^{-1} = (g'_m)^{-1}\, g_m h_A(\gamma_x) g_x^{-1} h_A(\gamma_x)^{-1}\, g'_m & \forall x \in M \\
 & \phantom{\Longleftrightarrow\ 2.}\ (\text{since } g'_x = h_A(\gamma_x)^{-1} g'_m h_A(\gamma_x)). &
\end{array}
$$
Hence,
$$
\begin{array}{rll}
g \in N(B(A))
 & \Longleftrightarrow\ g^{-1} g' g \in B(A) & \forall g' \in B(A) \\[4pt]
 & \Longleftrightarrow\ 1.\ g_m \in N(Z(H_A)) & \\
 & \phantom{\Longleftrightarrow}\ 2.\ g_m h_A(\gamma_x) g_x^{-1} h_A(\gamma_x)^{-1} \in Z(Z(H_A)) & \forall x \in M \\[4pt]
 & \Longleftrightarrow\ 1.\ g_m \in N(Z(H_A)) & \\
 & \phantom{\Longleftrightarrow}\ 2.\ g_x \in h_A(\gamma_x)^{-1}\, Z(Z(H_A))\, g_m\, h_A(\gamma_x) & \forall x \in M.
\end{array}
$$

Example 1. In the generic stratum we have $Z(H_A) = Z(G)$, hence $N(Z(H_A)) = G$ and $Z(Z(H_A)) = G$. Consequently, N(B(A)) = G.

Example 2. In the minimal stratum we have $Z(H_A) = G$. Hence $N(Z(H_A)) = G$ and $Z(Z(H_A)) = Z(G)$. By the proposition above we have g ∈ N(B(A)) iff there is some $g_m \in G$ such that $g_x \in h_A(\gamma_x)^{-1}\, Z(G)\, g_m\, h_A(\gamma_x) = Z(G)\, h_A(\gamma_x)^{-1} g_m h_A(\gamma_x)$ for all x ∈ M.

Remark. We realize already at this point that (for nonabelian G) B(A) is not a normal subgroup in G provided A has minimal type. A more general statement can be found in Sect. 5.
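As a hedged illustration of Proposition 3.2 alongside the two examples, consider again G = SU(2). Write T ≅ U(1) for the diagonal maximal torus and j for an element of N(T) \ T; these concrete choices are ours, and the table assumes the standard facts Z(T) = T and N(T) = T ∪ jT:

```latex
\[
\begin{array}{l|l|l|l}
Z(H_A) & N(Z(H_A)) & Z(Z(H_A)) & Z(H_A)\backslash N(Z(H_A)) \\ \hline
\{\pm 1\} = Z(G) \ \text{(generic)} & SU(2) & SU(2) & \{\pm 1\}\backslash SU(2) \cong SO(3) \\
T \cong U(1) & T \cup jT & T & \mathbb{Z}_2 \\
SU(2) \ \text{(minimal)} & SU(2) & \{\pm 1\} & \{e\}
\end{array}
\]
% Middle row, read through Proposition 3.2:
%   g \in N(B(A))  iff  g_m \in T \cup jT  and
%   g_x \in h_A(\gamma_x)^{-1}\, T\, g_m\, h_A(\gamma_x)  for all x \in M.
```

The first and last rows reproduce Examples 1 and 2. The middle row shows the intermediate case, in which the condition on $g_m$ is a genuine restriction.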
The preceding proposition shows how the normalizer of B(A) is determined by objects in the Lie group G.

Corollary 3.3. For all A ∈ A the spaces N(B(A)) and $N(Z(H_A)) \times \prod_{x \neq m} Z(Z(H_A))$ are homeomorphic.

Proof. We see immediately that the map $\Psi_1 : g \longmapsto \big(g_m, (h_A(\gamma_x)\, g_x\, h_A(\gamma_x)^{-1} g_m^{-1})_{x \neq m}\big)$ is a homeomorphism between these two spaces.

We emphasize that both subgroups of G are in general not isomorphic as topological groups. At least there is no “reasonable” homomorphism. We will discuss this problem a bit more in detail in Appendix A. Roughly speaking, the homomorphy property is destroyed by the above restriction on $g_x$ to be a (usually non-commutative) product of $g_m$ with elements in $Z(Z(H_A))$.

In order to investigate the structure of B(A)\N(B(A)), we recall the form of B(A). By [23], B(A) and $Z(H_A) \times \prod_{x \neq m} \{e_G\}$ are isomorphic topological groups. Hence, heuristically B(A)\N(B(A)) and
$$
Z(H_A)\backslash N(Z(H_A)) \times \prod_{x \neq m} Z(Z(H_A))
$$
are homeomorphic. The group isomorphy, however, is not to be expected: Since the base centralizer B(A) and the holonomy centralizer $Z(H_A)$ are isomorphic groups, it is unlikely that there arise isomorphic groups from originally non-isomorphic groups by factorization. Indeed there are examples (generic connections for G = SU(2)) admitting no such “reasonable” group isomorphism. We will discuss this further in Appendix A.

Proposition 3.4. For every A ∈ A
$$
\begin{array}{rcl}
[\Psi_1] :\ B(A)\backslash N(B(A)) & \longrightarrow & Z(H_A)\backslash N(Z(H_A)) \times \prod_{x \neq m} Z(Z(H_A)) \\[2pt]
[g]_{B(A)} & \longmapsto & \big([g_m]_{Z(H_A)},\ (h_A(\gamma_x)\, g_x\, h_A(\gamma_x)^{-1} g_m^{-1})_{x \neq m}\big)
\end{array}
$$
is a homeomorphism.

Proof. Using the properties of base centralizers [23] and straightforward algebraic calculations, well-definedness and injectivity of $[\Psi_1]$ are clear. To prove surjectivity and continuity use the commutative diagram
$$
\begin{array}{ccc}
N(B(A)) & \xrightarrow[\ \cong\ ]{\ \Psi_1\ } & N(Z(H_A)) \times \prod_{x \neq m} Z(Z(H_A)) \\
\downarrow & & \downarrow \\
B(A)\backslash N(B(A)) & \xrightarrow{\ [\Psi_1]\ } & Z(H_A)\backslash N(Z(H_A)) \times \prod_{x \neq m} Z(Z(H_A))
\end{array}
$$
and Corollary 3.3. Since N(B(A)) is a closed, hence compact subgroup of G, B(A)\N(B(A)) is compact as well. Thus, $[\Psi_1]$ is a continuous map of a compact space to a Hausdorff space. Standard theorems [31] show that $[\Psi_1]$ is a homeomorphism.
4. Modified Principal Fibre Bundles

Next we will find out which strata are even principal fibre bundles. Here we will use a slightly modified definition for principal fibre bundles. Usually, one makes the demand on such a bundle that the structure group acts freely on the fibres, i.e. all stabilizers are to be trivial. This will not be the case for generalized connections in general because every holonomy centralizer – being isomorphic to the corresponding stabilizer – contains at least the center of G. Therefore we will factor out exactly these “bugging” parts:

Definition 4.1. A G-space X is called principal G-fibre bundle iff all x ∈ X have the same stabilizer S. The structure group of π is S\G. Sometimes we say that also π : X −→ X/G is a principal G-fibre bundle.

Lemma 4.1. Let X be a G-space where all stabilizers are G-conjugate. Then X is a principal G-fibre bundle (according to our notation) iff the stabilizer $G_x$ of x is a normal subgroup of G for every x ∈ X.

Proof. If X is a principal fibre bundle, then $G_x = G_y$ for all x, y ∈ X and, in particular, $G_x = G_{x\circ g} = g^{-1} G_x g$ for all g ∈ G. Thus, $G_x$ is a normal subgroup in G. Conversely, let all $G_x$ be normal subgroups. Then, since all stabilizers are conjugate, we have $G_y = g^{-1} G_x g \subseteq G_x$ for all x, y ∈ X, i.e., X is a principal fibre bundle.
Proposition 4.2. Let X be a principal fibre bundle with “typical” stabilizer S. Then S \ G is a topological group that acts in a natural manner continuously and freely on X. Moreover, X/G ∼ = X/(S \ G). This way the definition of a (not necessarily locally trivial) principal fibre bundle above is equivalent to the usual one. Here neither the total nor the base space gets changed. Only the acting group is reduced to its part being essential for the action. By the proposition above we see that the notation “structure group” is reasonable. It coincides with the standard definition for fibre bundles. Note, however, that the G in a “principal G-fibre bundle” X does not denote the structure group as usual, but the total acting group. Proof. By the lemma above, S is normal, hence S \ G a topological group. Obviously, the action x ◦ [g]S := x ◦ g is well-defined and continuous. Since from x ◦ [g]S = x we get x ◦ g = x, hence g ∈ S, the action is free. The homeomorphy of the two quotient spaces is clear as well.
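A toy example may help to see how Definition 4.1 and Proposition 4.2 remove the non-free part of an action. The following choice of space and action is ours and serves only as an illustration:

```latex
\[
X = S^1 \subset \mathbb{C}, \qquad G = U(1), \qquad z \circ g := z\,g^2 .
\]
\[
G_z = \{ g \in U(1) : g^2 = 1 \} = \{\pm 1\} =: S \quad \text{for every } z \in X,
\]
\[
S\backslash G \xrightarrow{\ \cong\ } U(1), \quad [g]_S \longmapsto g^2,
\qquad z \circ [g]_S = z\,g^2,
\qquad X/G = X/(S\backslash G) = \{\mathrm{pt}\}.
\]
% All stabilizers coincide with S, so X is a principal G-fibre bundle in the
% sense of Definition 4.1. The action of G is not free (both +1 and -1 fix
% every point), but the induced action of the structure group S\G is free.
```

Here S plays the role that the constant center-valued gauge transforms play for generalized connections: it acts trivially everywhere and is divided out to obtain a free action.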
Now we are left with the modification of the definition of trivializations.

Definition 4.2. Let G be a compact topological group, X be a principal G-fibre bundle and π : X −→ X/G be the canonical projection. Moreover, let µ be some normalized measure on X.
1. An open G-invariant set U ⊆ X is called local trivialization of the principal fibre bundle X iff there is an equivariant homeomorphism χ : U −→ π(U) × (S\G) with $\pi|_U = \mathrm{pr}_1 \circ \chi$. Here, “equivariant” means $(\mathrm{pr}_2 \circ \chi)(x \circ g) = ((\mathrm{pr}_2 \circ \chi)(x)) \cdot [g]_S$ for all x ∈ U and g ∈ G. We often call χ itself local trivialization.
2. A principal fibre bundle is called locally trivial iff there is a covering $(U_\iota)_{\iota \in I}$ of X by local trivializations $U_\iota$.
3. A principal fibre bundle is called µ-almost globally trivial iff there is a covering $(U_\iota)$ as in the preceding item where additionally $\mu(U_\iota) = 1$ for all ι. In the following we usually say simply “almost global” instead of “µ-almost global” provided the measure µ meant is clear.
4. A principal fibre bundle is called globally trivial iff X is a local trivialization.

Here again it should be clear that the definitions above are equivalent to the standard ones. However, we will use these definitions throughout the whole paper when we speak about principal fibre bundles and trivializations.

5. Principal Fibre Bundle Structure of the Strata

In order to decide which strata are principal fibre bundles we have to study the form of the stabilizers, i.e. of the base centralizers.

Definition 5.1. The set $B_Z := \{ g \in G \mid g_m \in Z(G) \text{ and } g_x = g_m \ \forall x \in M \}$ is called base center.
Lemma 5.1. 1. The base center is contained in every base centralizer. 2. A base centralizer is a normal subgroup of G iff it equals the base center. 3. The base centralizer of a connection equals the base center iff the connection is generic. Proof. Let A ∈ A with base centralizer B(A). 1. We have g ∈ B(A) ⇐⇒ a) gm ∈ Z(HA ) −1 h (γ )g b) hA (γ ) = gm x A
∀γ ∈ Pmx , x ∈ M.
Since Z ⊆ Z(U ) for all U ⊆ G, we have BZ ⊆ B(A). 2. ⇒ Let B(A) be a normal subgroup in G. Let g ∈ B(A) and g0 ∈ G. Furthermore, let x ∈ M, x = m, and γ ∈ Pmx be arbitrary. Additionally choose a g 0 ∈ G with g0,m = g0 and g0,x = hA (γ ). Then by assumption g −1 0 g g 0 ∈ B(A), in particular −1 −1 −1 hA (γ ) = g0,m gm g0,m hA (γ ) g0,x gx g0,x −1 −1 = g0 gm g0 hA (γ ) hA (γ )−1 gx hA (γ ) −1 g0 gx hA (γ ), = g0−1 gm
hence gm g0 = g0 gx . Since g0 is arbitrary, we have gm = gx for all x ∈ M and consequently gm ∈ Z. Thus, g ∈ BZ . We get B(A) ⊆ BZ , hence the equality by the first part of the present proof. ⇐ Let B(A) = BZ . −1 Let now g ∈ BZ and g 0 ∈ G. Then we have (g −1 0 g g 0 )x = g0,x gx g0,x = −1 −1 g0,x gm g0,x = g0,x g0,x gm = gm ∈ Z, hence, in particular, (g −1 0 g g 0 )x = −1 −1 (g 0 g g 0 )m for all x ∈ M. Thus, g 0 g g 0 ∈ BZ . Hence, B(A) = BZ is a normal subgroup. 3. ⇒ Let A ∈ Agen , i.e. Z(HA ) ⊃ Z. Let g ∈ Z(HA ) \ Z, and set g := (hA (γx )−1 g hA (γx ))x∈M with γx chosen as usual. Per definitionem we have g ∈ B(A), but g ∈ BZ .
Ch. Fleischhack
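⇐ Let $\bar A \in \overline{\mathcal A}_{gen}$, i.e. $Z(H_{\bar A}) = Z$. Now we have
$$\begin{aligned}
\bar g \in B(\bar A) &\iff \text{a) } g_m \in Z(H_{\bar A}) = Z \ \text{ and b) } h_{\bar A}(\gamma) = g_m^{-1}\, h_{\bar A}(\gamma)\, g_x \quad \forall \gamma \in P_{mx},\ x \in M \\
&\iff \text{a) } g_m \in Z \ \text{ and b) } h_{\bar A}(\gamma) = h_{\bar A}(\gamma)\, g_m^{-1}\, g_x \quad \forall \gamma \in P_{mx},\ x \in M \\
&\iff \text{a) } g_m \in Z \ \text{ and b) } g_m = g_x \quad \forall x \in M \\
&\iff \bar g \in B_Z.
\end{aligned}$$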
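As a hedged illustration of parts 2 and 3 of Lemma 5.1 (not taken from the original text), consider the trivial connection, i.e. assume $h_{\bar A}(\gamma) = e_G$ for all paths γ. The sketch uses only the characterization a), b) of the base centralizer given in the proof; it also assumes that M contains at least one point besides m.

```latex
% Worked example (illustrative assumption: trivial connection, h_{\bar A}(\gamma) = e_G).
H_{\bar A} = \{e_G\} \;\Longrightarrow\; Z(H_{\bar A}) = G ,
\qquad\text{so a) imposes nothing and b) reads } e_G = g_m^{-1} g_x .
% Hence the base centralizer consists of the constant gauge transforms:
B(\bar A) = \{\bar g \in \overline{\mathcal G} \mid g_x = g_m \ \forall x \in M\} \;\cong\; G .
% Conjugating a constant \bar g = (g)_{x \in M} by an arbitrary \bar g_0 gives
(\bar g_0^{-1}\, \bar g\, \bar g_0)_x = g_{0,x}^{-1}\, g\, g_{0,x} ,
% which is constant in x for every choice of \bar g_0 only if g \in Z.
```

For non-abelian G this base centralizer is strictly larger than $B_Z$ and is not normal, in line with the lemma; for abelian G it coincides with $B_Z$.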
Consequently, only the generic stratum is a principal fibre bundle: Proposition 5.2. Let t ∈ T be a gauge orbit type. Then we have: The stratum A=t is a principal fibre bundle iff t = tmax , i.e. A=t = Agen . Proof. By the definition of the gauge orbit type, all holonomy centralizers Z(HA ) occurring in a fixed stratum A=t are conjugate. As proven in [23] the same is true for the stabilizers B(A). Lemma 4.1 and Lemma 5.1 yield the assertion.
6. Almost Global Triviality of $(G^k)_{gen}$

In Sect. 8 we will prove that $\overline{\mathcal A}_{gen}$ is not only a locally trivial, but even an almost globally trivial principal fibre bundle. As mentioned in the introduction we will deduce this by means of the reduction mapping from the corresponding statement on the generic stratum of $G^k$ w.r.t. the adjoint action that we are going to deal with in the present section.

Definition 6.1. Let $k \in \mathbb N_+$. An element $\vec g := (g_1, \dots, g_k) \in G^k$ (and its orbit, resp.) is called generic iff $\mathrm{Typ}(\vec g) \equiv [Z(\{g_1, \dots, g_k\})] = [Z]$. The set of all generic elements of $G^k$ is denoted by $(G^k)_{gen}$.

Since $\mathrm{Typ}(\vec g) \le [Z]$ for all $\vec g \in G^k$, every generic orbit is of maximal type. Moreover, a Lie group is abelian iff one (and then each) (nontrivial) power consists of generic elements only. In fact, we have $Z(\vec e_G) := Z((e_G, \dots, e_G)) = G$. If $G^k$ now consists for $k \in \mathbb N_+$ of generic elements only, then $Z = Z(\vec e_G) = G$. Conversely, if $Z = G$, then $G \supseteq Z(\vec g) \supseteq Z = G$, hence $Z(\vec g) = Z$ for all $\vec g \in G^k$.

Proposition 6.1. There is a $k_{min} \in \mathbb N_+$, such that the generic stratum $(G^k)_{gen}$ is an open and dense submanifold of $G^k$ with full Haar measure for all $k \ge k_{min}$ and is empty for all $k < k_{min}$.

Proof. 1. Choice of $k_{min}$. By the finiteness lemma for centralizers [23] there is a $k \in \mathbb N_+$, such that there is at least one orbit in $G^k$ having type $[Z(G)]$. Now, choose for $k_{min}$ simply the minimum of all such k.
• Obviously $(G^k)_{gen} = \emptyset$ for $k < k_{min}$.
• For $k \ge k_{min}$ there is at least one generic element in $G^k$. Namely, let $\vec g := (g_1, \dots, g_{k_{min}}) \in G^{k_{min}}$ be a generic element. Then $\vec g_k := (g_1, \dots, g_{k_{min}}, e_G, \dots, e_G) \in G^k$ is generic as well because of
$$[Z] \ge \mathrm{Typ}(\vec g_k) = [Z(\{g_1, \dots, g_{k_{min}}, e_G, \dots, e_G\})] = [Z(\{g_1, \dots, g_{k_{min}}\})] = \mathrm{Typ}(\vec g) = [Z].$$
Gribov Problem for Generalized Connections
Let now $k \ge k_{min}$ in the following.
2. $(G^k)_{gen}$ is a smooth submanifold of $G^k$, since the adjoint action is smooth [10].
3. $(G^k)_{gen}$ is open, since any of its points possesses an open neighbourhood whose points have at least, hence by the maximality of $[Z]$ even exactly type $[Z]$ [10].
4. Since G is connected, $G^k/\mathrm{Ad}$ is connected again. Again by the maximality of $[Z]$, we have for the closure $\overline{(G^k)_{gen}} \equiv \overline{(G^k)_Z} = \bigcup_{[K] \le [Z]} (G^k)_K = G^k$ [39]. Here we set $(G^k)_K = \{\vec g \in G^k \mid \mathrm{Typ}(\vec g) = [K]\}$.
5. Now, $(G^k)_H$ is a smooth submanifold of $G^k$ for every closed subgroup H of G with $[H] < [Z]$ [39]. Moreover, $\dim (G^k)_H < \dim (G^k)_Z \equiv \dim (G^k)_{gen} = \dim G^k$. Since the Haar measure of a lower-dimensional submanifold vanishes [15, 16], we have $\mu_{Haar}((G^k)_H) = 0$. Since there are only finitely many orbit types on $G^k$ [39], we get
$$\mu_{Haar}(G^k \setminus (G^k)_{gen}) = \mu_{Haar}\Big(\bigcup_{[H] < [Z]} (G^k)_H\Big) = 0.$$

Proposition 6.2. $(G^k)_{gen}$ is a $\mu_{Haar}$-almost globally trivial (smooth) principal fibre bundle having structure group $Z \backslash G$ for all $k \ge k_{min}$.

Proof. 1. Bundle structure over $(G^k)_{gen}/\mathrm{Ad}$: The adjoint action of G on $G^k$ is smooth. Hence the generic stratum $(G^k)_{gen} \equiv (G^k)_Z$ is a smooth submanifold and $\pi_k : (G^k)_{gen} \to (G^k)_{gen}/\mathrm{Ad}$ is a smooth (locally trivial) fibre bundle with typical fibre and structure group $Z \backslash G$ ([10] and note $N(Z) = G$). Since the stabilizer of any point in $(G^k)_{gen}$ equals Z, $\pi_k$ is (in our notation) even a principal fibre bundle over the generic stratum.

2. Choice of a neighbourhood $V_{k,\vec v}$ for every fixed $\vec v \in (G^k)_{gen}$: First we cut out from the n-dimensional smooth manifold $(G^k)_{gen}/\mathrm{Ad}$ a neighbourhood $B^n$ of $[\vec v]$ diffeomorphic to an n-dimensional ball and get a remaining smooth manifold N. By general arguments [36, 43, 42] there are a simplicial complex K consisting of countably many simplices and a smooth triangulation $f : K \to N$. Let $K_n$ denote the set of all n-cells of K. (Note that here [in contrast to the standard definition] a cell $\mathring\sigma$ is the interior of a simplex σ. Only for dimension 0 the cell shall be a simplex.) $K_n$ can be constructed from K by deleting the (n − 1)-skeleton $K_{\le n-1}$, i.e. all cells whose dimension is smaller than that of K. Now define $V_{k,\vec v} := \pi_k^{-1}(f(K_n) \cup \mathrm{int}\, B^n)$.

3. Properties of $V_{k,\vec v}$:
• $V_{k,\vec v}$ is Ad-invariant.
• $\vec v \in V_{k,\vec v}$.
• Since f is a smooth triangulation, $f(K_n)$ is a smooth manifold [15] that equals the disjoint union of all $f(\mathring\sigma)$, where σ is an n-simplex of K. Since f is a homeomorphism and $\mathring\sigma$ always contractible, also $f(\mathring\sigma)$ is always contractible. The contractibility of $\mathrm{int}\, B^n$ is trivial. Moreover, $\mathrm{int}\, B^n$ and $f(K_n)$ are disjoint.
• Hence, as a $\pi_k$-preimage of a submanifold of $(G^k)_{gen}/\mathrm{Ad}$ the set $V_{k,\vec v}$ is a submanifold of $(G^k)_{gen}$, thus of $G^k$ as well. In particular, $V_{k,\vec v}$ is open in $G^k$ by the continuity of $\pi_k$. Thus $\pi_k(V_{k,\vec v})$ is a disjoint union of contractible manifolds.
• We have $V_{k,\vec v} = (G^k)_{gen} \setminus \big( \pi_k^{-1}(\partial B^n) \cup \bigcup_{\sigma \in K_{\le n-1}} \pi_k^{-1}(f(\mathring\sigma)) \big)$, i.e. $V_{k,\vec v}$ emerges from $(G^k)_{gen}$ by deleting the $\pi_k$-preimages of the boundary of $B^n$ and of all images of lower-dimensional skeletons in $(G^k)_{gen}/\mathrm{Ad}$, respectively.
• We have $\mu_{Haar}(V_{k,\vec v}) = 1$: Since $\mu_{Haar}((G^k)_{gen}) = 1$, it is sufficient to prove that the just eliminated objects have Haar measure 0. However, this is clear, because they are a countable union of submanifolds having at least codimension 1.
• $V_{k,\vec v}$ is dense in $G^k$. This follows from $\mu_{Haar}(V_{k,\vec v}) = 1$ and the strict positivity of the Haar measure: the complement of $V_{k,\vec v}$ has measure 0 and therefore contains no non-empty open set.

4. Triviality of $\pi_k|_{V_{k,\vec v}}$: The map $\pi_k|_{V_{k,\vec v}} : V_{k,\vec v} \to \pi_k(V_{k,\vec v}) = \mathrm{int}\, B^n \cup f(K_n)$ is a principal fibre bundle over the disjoint union $\mathrm{int}\, B^n \cup f(K_n)$ of contractible manifolds having structure group $Z \backslash G$. Hence [30], this bundle is trivial, i.e., there is an equivariant homeomorphism $V_{k,\vec v} \cong \pi_k(V_{k,\vec v}) \times (Z \backslash G)$.

5. Almost global triviality of $(G^k)_{gen}$: Obviously $\mathcal V_k := \{V_{k,\vec v} \mid \vec v \in (G^k)_{gen}\}$ is a non-empty covering of $(G^k)_{gen}$ by $\mu_{Haar}$-almost global trivializations.
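To make Definition 6.1 and Proposition 6.1 concrete, the following worked example treats G = SU(2), for which Sect. 10.2 states that the generic stratum is empty for k = 1 and non-empty for k = 2. The derivation is only a sketch and relies on standard facts that are assumptions here, not statements of this section: Z = {±1}, and every element of SU(2) other than ±1 lies in exactly one maximal torus. The symbols $T$, $T_1$, $T_2$ are introduced for illustration.

```latex
% G = SU(2), Z = \{\pm 1\}.
% k = 1: for g \neq \pm 1 the centralizer is the maximal torus T \cong U(1) containing g,
Z(\{g\}) = T \supsetneq Z , \qquad Z(\{\pm 1\}) = G ,
% so no single element is generic and (G^1)_{gen} = \emptyset .
% k = 2: let g_1, g_2 \neq \pm 1 lie in distinct maximal tori T_1 \neq T_2. Then
Z(\{g_1, g_2\}) = Z(\{g_1\}) \cap Z(\{g_2\}) = T_1 \cap T_2 = \{\pm 1\} = Z ,
% hence (g_1, g_2) is generic and k_{min} = 2 .
```

In this example the non-generic pairs are exactly the commuting pairs, which form a lower-dimensional subset of $G^2$, consistent with the full Haar measure of the generic stratum.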
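7. Relations Between the Structure Groups

We know already that the structure group of the principal fibre bundle $(G^k)_{gen}$ equals $Z \backslash G$ and that of the principal fibre bundle $\overline{\mathcal A}_{gen}$ equals $B_Z \backslash \overline{\mathcal G}$. In order to lift the almost global triviality of $(G^k)_{gen}$ to that of $\overline{\mathcal A}_{gen}$ in the next section, we have to investigate the relation between the corresponding structure groups. We need

Definition 7.1. Let $* : (Z \backslash G) \times \overline{\mathcal G} \to B_Z \backslash \overline{\mathcal G}$ be defined by $[g]_Z * \bar g := [(g\, g_x)_{x \in M}]_{B_Z}$.

Lemma 7.1.
• ∗ is well-defined and continuous.
• The restriction of ∗ to $(Z \backslash G) \times \overline{\mathcal G}_0$ is an isomorphism.
We recall $\overline{\mathcal G}_0 := \{\bar g \in \overline{\mathcal G} \mid g_m = e_G\}$.

Proof.
• Let $g_1 \sim g_2$, i.e. $g_1 = z g_2$ for a $z \in Z$. Then we have, with $\bar z \equiv (z)_{x \in M} \in B_Z$,
$$[g_1]_Z * \bar g = [(g_1 g_x)_{x \in M}]_{B_Z} = [(z\, g_2 g_x)_{x \in M}]_{B_Z} = [\bar z \circ (g_2 g_x)_{x \in M}]_{B_Z} = [(g_2 g_x)_{x \in M}]_{B_Z} = [g_2]_Z * \bar g.$$
• The continuity of ∗ follows immediately from the surjectivity and openness of the canonical projection $G \times \overline{\mathcal G} \to (Z \backslash G) \times \overline{\mathcal G}$, the continuity of $G \times \overline{\mathcal G} \to \overline{\mathcal G}$ with $(g, \bar g) \mapsto (g\, g_x)_{x \in M}$ as well as that of the canonical projection $\overline{\mathcal G} \to B_Z \backslash \overline{\mathcal G}$ and the commutativity of the corresponding diagram
$$\begin{array}{ccc}
G \times \overline{\mathcal G} & \longrightarrow & \overline{\mathcal G} \\
\downarrow & & \downarrow \\
(Z \backslash G) \times \overline{\mathcal G} & \xrightarrow{\ *\ } & B_Z \backslash \overline{\mathcal G}
\end{array}$$
• ∗ is injective: Let $[g_1]_Z * \bar g_1 = [g_2]_Z * \bar g_2$ with $\bar g_1, \bar g_2 \in \overline{\mathcal G}_0$, i.e. $g_{1,m} = g_{2,m} = e_G$. Then per def. $[(g_1 g_{1,x})_{x \in M}]_{B_Z} = [(g_2 g_{2,x})_{x \in M}]_{B_Z}$. Thus, there is a $z \in Z$ with $g_1 g_{1,x} = z\, g_2 g_{2,x}$ for all $x \in M$. For $x = m$ we have $g_1 = z g_2$ (and so $[g_1]_Z = [g_2]_Z$) and consequently $\bar g_1 = \bar g_2$.
• ∗ is surjective: Let $[\bar g]_{B_Z}$ be given. Choose a representative $\bar g \in [\bar g]_{B_Z}$. Set $g' := g_m$ and $g'_x := (g')^{-1} g_x$. Then $[g']_Z * \bar g' = [\bar g]_{B_Z}$.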
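The injectivity and surjectivity steps can be combined into an explicit inverse. The following display is a sketch derived from the surjectivity argument above; the name $\iota$ for the inverse map is introduced here for illustration only and does not appear in the source.

```latex
% Candidate inverse of * restricted to (Z \backslash G) \times \overline{\mathcal G}_0
% (\iota is a hypothetical name):
\iota\big([\bar g]_{B_Z}\big) := \big([g_m]_Z,\ (g_m^{-1} g_x)_{x \in M}\big).
% Well-definedness: replacing \bar g by \bar z\, \bar g with \bar z = (z)_{x \in M} \in B_Z gives
[z\, g_m]_Z = [g_m]_Z , \qquad (z\, g_m)^{-1} (z\, g_x) = g_m^{-1} g_x .
% The second component lies in \overline{\mathcal G}_0, since its value at x = m is e_G.
% Composition with *:
[g_m]_Z * (g_m^{-1} g_x)_{x \in M} = [(g_m\, g_m^{-1} g_x)_{x \in M}]_{B_Z} = [\bar g]_{B_Z} .
```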
The map ∗ is a modification of the isomorphism $\overline{\mathcal G}_0 \times Z(H_{\bar A}) \backslash G \to B(\bar A) \backslash \overline{\mathcal G}$ of [23], tailored to the special case of generic connections. There we started with the canonical isomorphism $\phi|_{Z(H_{\bar A})} = \phi^{-1} : Z(H_{\bar A}) \to B(\bar A)$, $g \mapsto (h_{\bar A}(\gamma_x)^{-1}\, g\, h_{\bar A}(\gamma_x))_{x \in M}$, and extended it in a natural way to the whole G. For generic connections, $\phi|_Z$ now maps g simply to $(h_{\bar A}(\gamma_x)^{-1}\, g\, h_{\bar A}(\gamma_x))_{x \in M} = (g)_{x \in M}$. But therewith we get a second natural extension of $\phi|_Z$ to the whole G. This is just the definition of ∗ above.

8. Almost Global Triviality of $\overline{\mathcal A}_{gen}$

We start with

Proposition 8.1. $\overline{\mathcal A}_{gen}$ has the induced Haar measure 1.

Proof. By Proposition 6.1, $(G^k)_{gen}$ has Haar measure 1 for some $k \in \mathbb N_+$. Now, let Γ be some graph with k edges $\alpha_i$ and the vertex m. Since the corresponding reduction mapping $\varphi \equiv \pi_\Gamma : \overline{\mathcal A} \to G^k$ decreases the type [23], we have $[Z] = \mathrm{Typ}(\varphi(\bar A)) \le \mathrm{Typ}(\bar A) \le [Z]$ for all $\bar A \in \varphi^{-1}((G^k)_{gen})$. Hence $\overline{\mathcal A}_{gen} \supseteq \varphi^{-1}((G^k)_{gen})$. By definition of the induced Haar measure,
$$1 \ge \mu_0(\overline{\mathcal A}_{gen}) \ge \mu_0(\varphi^{-1}((G^k)_{gen})) = \mu_{Haar}((G^k)_{gen}) = 1.$$
The first goal of the present section is the proof of the following Theorem 8.2. Agen is (w.r.t. the action of G) a µ0 -almost globally trivial principal fibre bundle with structure group BZ \ G. By Proposition 3.1, we know Agen is (w.r.t. the action of G) a fibre bundle with structure group BZ \ G and by Proposition 5.2 even a principal fibre bundle. Hence, we are left only with the proof that there is a covering U of Agen by almost global trivializations. This will be done in the next three subsections. During the construction we will see what properties of µ0 are really essential for that, and generalize then the above theorem accordingly to a large class of dynamical measures given in the last subsection. 8.1. Independent Generators of the Holonomy Centralizer. Our task is to find for all A ∈ Agen a neighbourhood UA that, on the one hand, has the full induced Haar measure 1 and, on the other hand, is a local trivialization of Agen . For that purpose we choose for A according to the finiteness lemma for centralizers [23] finitely many paths αi in HG with Z(hA (α)) = Z(HA ) = Z and denote the corresponding reduction mapping shortly by ϕ. The most obvious choice of a neighbourhood of A would be U := ϕ −1 (V ), where V is an almost global trivialization of (Gk )gen from the covering above with ϕ(A) ∈ V . But, it can happen that despite µHaar (V ) = 1 the induced Haar measure of U is smaller than 1. This is, in particular, the case if ϕ is not surjective, because, e.g., the paths in α are not independent or one path in α occurs twice what is not a priori forbidden. That is why we have to guarantee that we can always find an α fulfilling a certain independency condition. For that purpose we will need the notion of hyphs [22]: A hyph υ is a set of “independent” paths. For instance, graphs and webs [8] are special hyphs. One of the most striking properties of a hyph is that the parallel transports can be assigned to the paths independently. 
Moreover, the corresponding projection πυ : A −→ G#υ , A −→ hA (υ), gives µ0 (πυ−1 (W )) = µHaar (W ) for all W .
It would be optimal if we were able to show that for every A there is a hyph α with Z(HA ) = Z(hA (α)). However, though we can find – starting with an arbitrary β with Z(HA ) = Z(hA (β)) – a hyph υ such that all paths in β can be written as products of paths in υ, this set υ typically consists not only of paths in HG, i.e. closed paths only. To avoid this problem we weaken the notion of a hyph. Definition 8.1. A finite set α ⊆ HG is called weak hyph iff the corresponding reduction mapping ϕα : A −→ G#α A −→ hA (α) is surjective and fulfills µ0 (πα−1 (W )) = µHaar (W ) for all measurable W ⊆ G#α . Obviously, every hyph that consists only of closed paths is a weak hyph. Proposition 8.3. For every A ∈ A there is a weak hyph α ⊆ HG with Z(hA (α)) = Z(HA ). The proof of this proposition is very technical and is therefore shifted to Appendix E. 8.2. Choice of the Covering of Agen . Now, we are able to define the desired covering of Agen : For each A ∈ Agen we choose a weak hyph α ⊆ HG with Z(hA (α)) = Z(HA ) = Z by Proposition 8.3 and denote the corresponding reduction mapping as usual by ϕα . Now we choose according to Sect. 6 an almost global trivialization V#α,ϕα (A) of (G#α )gen that contains ϕα (A), and set UA := ϕα−1 (V#α,ϕα (A) ) ⊆ A. By A ∈ UA we have Proposition 8.4. U := {UA }A∈Agen is a covering of Agen . 8.3. Properties of the Covering. We still have to show that each UA is an almost global trivialization of Agen . For this, let us fix an arbitrary A ∈ Agen with reduction mapping ϕ := ϕα and set simply k := #α, V := Vk,ϕ(A) ⊆ (Gk )gen and U := UA = ϕ −1 (V ). The easily verifiable properties of U are described by Lemma 8.5. U is an open, dense, G-invariant subset of Agen with µ0 (U ) = 1. Proof. By construction, V is an open, dense, Ad-invariant subset of (Gk )gen with µHaar (V ) = 1. Since ϕ always preserves or reduces the types, [23] we have [Z] = Typ(ϕ(A )) ≤ Typ(A ) ≤ [Z], i.e. A ∈ Agen for all A ∈ U . 
Since ϕ is continuous, U = ϕ −1 (V ) is open as a subset of A, hence as a subset of Agen as well due to the openness of Agen . Since α is a weak hyph, we have µ0 (U ) = µ0 (ϕ −1 (V )) = µHaar (V ) = 1 by Proposition 8.3. Due to the strict positivity of µ0 , both statements yield the denseness of U in Agen . Finally, the G-invariance of U follows from the Ad-invariance of V .
The most important property of U , however, requires a longer proof. Proposition 8.6. U is a local trivialization of Agen .
Proof. We denote the equivariant homeomorphism that belongs to the almost global trivialization of $G^k$ according to Proposition 6.2 by $\psi : V \to \pi_k(V) \times (Z \backslash G)$. The projection onto the second component is $\psi_2 : V \to Z \backslash G$. Furthermore, let $\gamma_x$ be for every $x \in M$ some fixed path from m to x. W.l.o.g., $\gamma_m$ is the trivial path. Now we define the trivialization mapping of $\overline{\mathcal A}$:
$$\Psi : U \to \pi(U) \times B_Z \backslash \overline{\mathcal G}, \qquad h \mapsto \big([h],\ \psi_2(\varphi(h)) * (h(\gamma_x))_{x \in M}\big).$$

1. Ψ is well-defined: Because of $h \in U$ we have $\varphi(h) \in V$, i.e. $\psi_2(\varphi(h))$ is well-defined.

2. Ψ is surjective: Let $([h], [\bar g]_{B_Z}) \in \pi(U) \times B_Z \backslash \overline{\mathcal G}$ be given. By Lemma 7.1 there is exactly one $[g']_Z \in Z \backslash G$ and some $\bar g' \in \overline{\mathcal G}_0$ with $[g']_Z * \bar g' = [\bar g]_{B_Z}$. Additionally choose some $h' \in [h]$. Hence, $h' \in U$ and $\varphi(h') \equiv h'(\alpha) \in V$. Let $\vec g'' := \psi^{-1}([h'(\alpha)]_{\mathrm{Ad}}, [g']_Z)$. Since $h'(\alpha)$ and $\vec g''$ are in one and the same orbit w.r.t. the adjoint action, there is a $g'' \in G$ with $\vec g'' = g''^{-1}\, h'(\alpha)\, g''$. Now, let
$$h(\gamma) := (g'_x)^{-1}\, g''^{-1}\, h'(\gamma_x\, \gamma\, \gamma_y^{-1})\, g''\, g'_y \quad \text{for all } \gamma \in P_{xy}.$$
Obviously, h is gauge equivalent to $h'$ by means of the gauge transform $(h'(\gamma_x)^{-1}\, g''\, g'_x)_{x \in M}$. Hence h lies in the given class $[h'] = [h]$, and $h \in U$. Moreover, $\varphi(h) \equiv h(\alpha) = g''^{-1}\, h'(\alpha)\, g''$. Finally,
$$\begin{aligned}
\Psi(h) &= \big([h],\ \psi_2(\varphi(h)) * (h(\gamma_x))_{x \in M}\big) \\
&= \big([h],\ \psi_2(g''^{-1}\, h'(\alpha)\, g'') * ((g'_m)^{-1}\, g''^{-1}\, h'(1)\, g''\, g'_x)_{x \in M}\big) \\
&= \big([h],\ \psi_2(\vec g'') * (g'_x)_{x \in M}\big) \\
&= \big([h],\ [g']_Z * \bar g'\big) = \big([h], [\bar g]_{B_Z}\big).
\end{aligned}$$

3. Ψ is injective: Let $\Psi(h_1) = \Psi(h_2)$. Then, in particular, $\psi_2(\varphi(h_1)) * (h_1(\gamma_x))_{x \in M} = \psi_2(\varphi(h_2)) * (h_2(\gamma_x))_{x \in M}$. From the bijectivity of ∗ on $(Z \backslash G) \times \overline{\mathcal G}_0$ we get $h_1(\gamma_x) = h_2(\gamma_x)$ for all $x \in M$ and $\psi_2(h_1(\alpha)) = \psi_2(h_2(\alpha))$. Since by assumption $h_1$ and $h_2$ are gauge equivalent, there is a $\bar g \in \overline{\mathcal G}$ with $h_1 = h_2 \circ \bar g$. In particular, we have $h_1(\alpha) = g_m^{-1}\, h_2(\alpha)\, g_m$, i.e., $h_1(\alpha)$ and $h_2(\alpha)$ are contained in the same orbit. Due to $\psi_2(h_1(\alpha)) = \psi_2(h_2(\alpha))$ we have $h_2(\alpha) = h_1(\alpha) = g_m^{-1}\, h_2(\alpha)\, g_m$, i.e. $g_m \in Z(h_2(\alpha)) = Z$. Finally we get $g_x = h_2(\gamma_x)^{-1}\, g_m\, h_1(\gamma_x) = h_2(\gamma_x)^{-1}\, h_1(\gamma_x)\, g_m = g_m$ for all $x \in M$, i.e. $\bar g \in B_Z$. Due to $h_1, h_2 \in U$ (the stabilizer of a generic connection is $B_Z$), we get $h_1 = h_2 \circ \bar g = h_2$.

4. Ψ is continuous: It is sufficient to prove that the projections from Ψ onto the two factors are continuous:
• $\Psi_1 := \mathrm{pr}_1 \circ \Psi$ is equal to $\pi : U \to \pi(U)$, hence continuous.
• $\Psi_2 := \mathrm{pr}_2 \circ \Psi$ is continuous as a concatenation of continuous mappings $\varphi$, $\psi_2$, $\pi_{\gamma_x}$ and ∗. Here we used the map $\pi_{\gamma_x} : \overline{\mathcal A} \to G$, $\bar A \mapsto h_{\bar A}(\gamma_x)$.

5. Ψ is a homeomorphism: First, note that the standard theorem on the continuity of the inverse mapping is not applicable because U is typically noncompact. Therefore we cannot directly get the continuity of $\Psi^{-1}$ from that of Ψ. Since Ψ is bijective and continuous, it is sufficient [24] to show that every element of $\pi(U) \times B_Z \backslash \overline{\mathcal G}$ has a compact neighbourhood whose preimage is again compact in U. Let now such an element $([h'], [\bar g]_{B_Z})$ be given. Since U is an open subset of the compact Hausdorff space $\overline{\mathcal A}$, U is locally compact. Hence, there is a compact neighbourhood $U' \subseteq U$ of $h'$. Now, $W := \pi(U') \times B_Z \backslash \overline{\mathcal G}$ is the desired compact neighbourhood of $([h'], [\bar g]_{B_Z})$: Due to the compactness of $U'$ and $\overline{\mathcal G}$, W is
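compact and by the openness of π also a neighbourhood; moreover, the preimage of W, $\pi^{-1}(\pi(U')) = U' \circ \overline{\mathcal G}$, is (again by the compactness of $\overline{\mathcal G}$) compact.

6. Ψ is equivariant: We have
$$\begin{aligned}
\Psi(h \circ \bar g) &= \big([h \circ \bar g],\ \psi_2(\varphi(h \circ \bar g)) * ((h \circ \bar g)(\gamma_x))_{x \in M}\big) \\
&= \big([h],\ \psi_2(\varphi(h) \circ g_m) * (g_m^{-1}\, h(\gamma_x)\, g_x)_{x \in M}\big) \\
&= \big([h],\ (\psi_2(\varphi(h)) \cdot [g_m]_Z) * (g_m^{-1}\, h(\gamma_x)\, g_x)_{x \in M}\big) && (\cdot \text{ denotes the multiplication in } Z \backslash G) \\
&= \big([h],\ \psi_2(\varphi(h)) * (h(\gamma_x)\, g_x)_{x \in M}\big) \\
&= \big([h],\ (\psi_2(\varphi(h)) * (h(\gamma_x))_{x \in M}) \cdot [\bar g]_{B_Z}\big) && (\cdot \text{ now denotes multiplication in } B_Z \backslash \overline{\mathcal G}) \\
&= \Psi(h) \circ [\bar g]_{B_Z}.
\end{aligned}$$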
This concludes the proof of Theorem 8.2.

8.4. Generalization. Of course, from the physical point of view the $\mu_0$-almost global triviality is not very relevant, because $\mu_0$ is just a very singular kinematical measure. Much more interesting is the investigation of dynamical measures. Basically, the only point in the proof of Theorem 8.2 where we have used a property of $\mu_0$ was the equality of $\mu_0(\varphi_\alpha^{-1}(V))$ and $\mu_{Haar}(V)$ for certain α and certain V having full Haar measure 1. This is, by normalization of $\mu_0$, equivalent to the corresponding equality for zero subsets V or – in other words – the equivalence of $(\varphi_\alpha)_* \mu_0$ and $\mu_{Haar}$. This leads immediately to

Theorem 8.7. Let µ be some normalized regular Borel measure on $\overline{\mathcal A}$ such that for every $\bar A \in \overline{\mathcal A}_{gen}$ there is a weak hyph α with the following two properties:
1. $(\varphi_\alpha)_* \mu$ is equivalent to $(\varphi_\alpha)_* \mu_0$, i.e. the Haar measure on $G^{\#\alpha}$.
2. $Z(h_{\bar A}(\alpha)) = Z$.
Then $\overline{\mathcal A}_{gen}$ is a µ-almost globally trivial principal fibre bundle.

The assumptions of the theorem are fulfilled, in particular, if all the projections of µ to the underlying lattice theories are Lebesgue measures or, more generally, are absolutely continuous w.r.t. the Haar measure. In fact, then we only need the existence of a weak hyph preserving the genericity of $\bar A$ under the projection to the “lattice” α. But this is guaranteed by Proposition 8.3. Moreover, typically, dynamical measures have just this property. All explicitly constructed physical measures belong to this class of measures, be it the two-dimensional Yang-Mills measure (for non-vanishing coupling constant) [5, 21, 19], be it the Fock measures for the weavy states of the free Maxwell field [4] or be it, more generally, the induced heat kernel measures [3]. Moreover, also the lattice measures known from other approaches usually fulfill the assumption above. Definitely this is the case if $\mu_\alpha = \frac{1}{Z_\alpha}\, e^{-S_\alpha}\, \mu_{Haar}$ for some Euclidean lattice action $S_\alpha$ and an appropriate normalization constant $Z_\alpha$.
Thus we can expect that Theorem 8.7 is applicable for a large class of physical measures in the Ashtekar framework. 9. Triviality of A for Abelian G For commutative structure groups every connection is generic. Moreover, A is even globally trivial:
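Proposition 9.1. Let G be a commutative compact Lie group. Then $\overline{\mathcal A}$ is a globally trivial principal fibre bundle with structure group $\overline{\mathcal G}_{const} \backslash \overline{\mathcal G}$.

Here, $\overline{\mathcal G}_{const}$ denotes the set of all constant gauge transforms. Obviously, $\overline{\mathcal G}_{const}$ equals the base center. $\overline{\mathcal G}_{const} \backslash \overline{\mathcal G}$ is isomorphic to $\overline{\mathcal G}_0$ as a topological group via $[\bar g]_{\overline{\mathcal G}_{const}} \mapsto (g_m^{-1} g_x)_{x \in M}$. Therefore one can (after an appropriate modification of the action) regard $\overline{\mathcal A}$ in the abelian case as a principal $\overline{\mathcal G}_0$-fibre bundle over $\overline{\mathcal A}/\overline{\mathcal G}$.

Proof. Let
$$\Psi : \overline{\mathcal A} \to \overline{\mathcal A}/\overline{\mathcal G} \times \overline{\mathcal G}_{const} \backslash \overline{\mathcal G}, \qquad h \mapsto \big([h],\ [(h(\gamma_x))_{x \in M}]\big),$$
where $\gamma_x$ is as usual for all $x \in M$ some path from m to x being trivial for $x = m$. In a commutative group the adjoint action is trivial, hence $\overline{\mathcal A}/\overline{\mathcal G} = \mathrm{Hom}(\mathcal{HG}, G)/\mathrm{Ad} = \mathrm{Hom}(\mathcal{HG}, G)$.

1. Ψ is surjective: Let $[h] \in \overline{\mathcal A}/\overline{\mathcal G}$ and $[\bar g] \in \overline{\mathcal G}_{const} \backslash \overline{\mathcal G}$ be given. As just remarked there is an $h' \in \overline{\mathcal A}$ with $h'|_{\mathcal{HG}} = [h]$. Now, let $h''(\gamma) := g_x^{-1}\, h'(\gamma_x\, \gamma\, \gamma_y^{-1})\, g_y$ for $\gamma \in P_{xy}$. Obviously, $h'' \in \overline{\mathcal A}$ and $\Psi(h'') = ([h], [\bar g])$.

2. Ψ is injective: Let $h_1, h_2 \in \overline{\mathcal A}$ and $\Psi(h_1) = \Psi(h_2)$. Then $[(h_1(\gamma_x))_{x \in M}] = [(h_2(\gamma_x))_{x \in M}]$, hence ($\gamma_m$ is trivial) also $h_1(\gamma_x) = h_2(\gamma_x)$ for all $x \in M$. The injectivity now follows, because two connections are equal if their holonomies are equal and if their parallel transports coincide for each x along at least one path from m to x.

3. Ψ is obviously continuous.

4. $\Psi^{-1}$ is continuous because $\overline{\mathcal A}$ is compact and $\overline{\mathcal A}/\overline{\mathcal G} \times \overline{\mathcal G}_{const} \backslash \overline{\mathcal G}$ is Hausdorff.

5. Ψ is clearly equivariant.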
10. Criterion for the Non-Triviality of Agen We already know that Agen is almost globally trivial and in the abelian case even globally trivial. Now we want to know when the generic stratum is nontrivial. First we state a sufficient condition for the non-triviality of Agen requiring only a property of G and find then a class of Lie groups having this property. Finally we discuss some problems arising when we tried to prove that Agen is nontrivial for all non-commutative G. 10.1. General Criterion. We start with the sufficient condition for the non-triviality of Agen . Proposition 10.1. If there is a natural number k ≥ 1 such that (Gk )gen is a nontrivial principal G-fibre bundle, then Agen (and thus A) is nontrivial as well. Proof. Let k ∈ N+ and (Gk )gen nontrivial. Suppose there were a section s : Agen /G −→ Agen for π : Agen −→ Agen /G. We choose a graph with k edges and exactly one vertex m. The set of edges is denoted by α ⊆ HG and defines the reduction mapping ϕ := ϕα . Moreover, we define U := ϕ −1 ((Gk )gen ). Our goal is now to construct a section s[ϕ] in the bottom line of the diagram
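$$\begin{array}{ccccc}
\overline{\mathcal A}_{gen} & \supseteq & U & \underset{s_\varphi}{\overset{\varphi}{\rightleftarrows}} & (G^k)_{gen} \\[4pt]
\pi \downarrow\ \uparrow s & & \pi|_U \downarrow\ \uparrow s|_{[U]} & & \pi_k \downarrow\ \uparrow s_k \\[4pt]
\overline{\mathcal A}_{gen}/\overline{\mathcal G} & \supseteq & [U] & \underset{s_{[\varphi]}}{\overset{[\varphi]}{\rightleftarrows}} & (G^k)_{gen}/\mathrm{Ad}
\end{array}$$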
We construct first a section $s_\varphi : (G^k)_{gen} \to \overline{\mathcal A}_{gen}$. Here we choose for $s_\varphi(\vec g)$ that connection that is built by means of the construction method [22] out of the trivial connection if one successively assigns the components of $\vec g$ to the k edges in Γ. Clearly, then $\varphi(s_\varphi(\vec g)) = \vec g$ for all $\vec g \in G^k$. It is easy to see¹ from this method that $s_\varphi$ is continuous and obviously maps $(G^k)_{gen}$ to $\overline{\mathcal A}_{gen}$. Now we define for $[\vec g] \in (G^k)_{gen}/\mathrm{Ad}$ a mapping $s_{[\varphi]} : (G^k)_{gen}/\mathrm{Ad} \to \overline{\mathcal A}_{gen}/\overline{\mathcal G}$ by $s_{[\varphi]}([\vec g]) := \pi(s_\varphi(\vec g))$. One sees immediately that $s_{[\varphi]}$ is a well-defined continuous section. Since s is also a section, the continuous map $s_k := \varphi \circ s \circ s_{[\varphi]}$ is a section for $\pi_k : (G^k)_{gen} \to (G^k)_{gen}/\mathrm{Ad}$ by
$$\pi_k \circ s_k = \pi_k \circ \varphi \circ s \circ s_{[\varphi]} = [\varphi] \circ \pi \circ s \circ s_{[\varphi]} = \mathrm{id}_{(G^k)_{gen}/\mathrm{Ad}}.$$
This is a contradiction to the assumption that $\pi_k$ is a nontrivial bundle. Therefore, there is no continuous section over the whole $\overline{\mathcal A}_{gen}/\overline{\mathcal G}$.
10.2. Concrete Criterion. The crucial question is now what concrete G and k give nontrivial bundles $\pi_k : (G^k)_{gen} \to (G^k)_{gen}/\mathrm{Ad}$. It is quite easy to see (cf. Appendix B) that in the case of G = SU(2) the bundle is empty for k = 1, trivial for k = 2 and nontrivial for k ≥ 3. So maybe typically up to some k the bundles are trivial, but nontrivial for bigger k. But is there a k for every non-abelian G such that $\pi_k$ is nontrivial? Up to now, we did not find a complete answer. However, the following two propositions give a wide class of groups for which the bundle is nontrivial starting at some k. In particular, the proposition above is non-empty, i.e., its assumptions can be fulfilled.

Proposition 10.2. Let G be a non-abelian Lie group with $\pi_1^h(Z \backslash G) \neq 1$ and $\pi_1^h(G) = 1$.² Then there is a $k \in \mathbb N$ such that $(G^k)_{gen}$ is a nontrivial principal G-fibre bundle.

Proof.
• Choose $k' \in \mathbb N$ so large that $(G^{k'})_{gen}$ is non-empty (cf. Proposition 6.2). By general arguments one sees [10] that the codimension of all non-generic strata, i.e. all strata whose type is smaller than $[Z]$, is at least 1. By Corollary D.2 in Appendix D the non-generic strata in $G^{3k'}$ have at least codimension 3. Let $k := 3k'$.

¹ Let e be some edge in M. There are at most two indices i and j such that the initial paths of $\alpha_i$ and $\alpha_j$ coincide with a partial path of e or $e^{-1}$. If there were no such indices, then $\pi_e \circ s_\varphi(\vec g) = e_G$ for all $\vec g$. Otherwise $\pi_e \circ s_\varphi(\vec g)$ equals a product of $g_i$, $g_j$, $g_i^{-1}$ or $g_j^{-1}$. Thus, in any case $\pi_e \circ s_\varphi : G^k \to G$ is continuous for all edges e. Hence, $s_\varphi$ is continuous as well.
² We denote the fundamental group not as usual by $\pi_1$, but by $\pi_1^h$, in order to avoid confusion with $\pi_k : G^k \to G^k/\mathrm{Ad}$.
• Suppose the principal fibre bundle $(G^k)_{gen}$ were trivial. Then $(G^k)_{gen} \cong (G^k)_{gen}/\mathrm{Ad} \times Z \backslash G$, hence in particular
$$\pi_1^h((G^k)_{gen}) \cong \pi_1^h((G^k)_{gen}/\mathrm{Ad}) \oplus \pi_1^h(Z \backslash G). \qquad (1)$$
Since, because of the compactness of G, the number of non-generic strata in $G^k$ is finite [39] and each one of the strata is a submanifold of $G^k$ [10] having codimension at least 3, we have $\pi_1^h((G^k)_{gen}) \cong \pi_1^h(G^k)$. Consequently, (1) reduces to
$$\pi_1^h(G)^k \cong \pi_1^h(G^k) \cong \pi_1^h((G^k)_{gen}/\mathrm{Ad}) \oplus \pi_1^h(Z \backslash G).$$
This, however, is a contradiction to the assumptions $\pi_1^h(G) = 1$ and $\pi_1^h(Z \backslash G) \neq 1$. Hence, $(G^k)_{gen}$ is nontrivial.
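As a hedged worked instance of this argument, take G = SU(2). The facts used below (Z = {±1}, SU(2) is simply connected, $Z \backslash G \cong SO(3)$ with fundamental group $\mathbb Z_2$) are standard and are assumptions relative to this section, not statements from it.

```latex
% G = SU(2):
\pi_1^h(G) = 1 , \qquad Z \backslash G \cong SO(3) , \qquad \pi_1^h(Z \backslash G) \cong \mathbb{Z}_2 \neq 1 .
% If (G^k)_{gen} were trivial for the k chosen in the proof, Eq. (1) would give
1 = \pi_1^h(G)^k \cong \pi_1^h(G^k) \cong \pi_1^h\big((G^k)_{gen}/\mathrm{Ad}\big) \oplus \mathbb{Z}_2 ,
% which is impossible, because the right-hand side contains a subgroup of order 2.
```

The proof only produces some admissible k of the form 3k'; it does not claim that this k is minimal. For SU(2) the text refers to Appendix B for non-triviality already from k = 3 on.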
Proposition 10.3. The assumptions of the proposition above are fulfilled, in particular, for all semisimple (simply connected) Lie groups whose decomposition into simple Lie groups contains at least one of the factors An , Bn , Cn , Dn , E6 or E7 . Proof. It is well-known that among the simple (and simply connected) compact Lie groups exactly the representatives of the series listed above have nontrivial center [28]. The assumption now follows, because for simply connected G the order of Z equals π1h (Z \ G) [29] and the center of the direct product of groups equals the direct product of the corresponding centers.
In particular, we see that $\overline{\mathcal A}_{gen}$ is nontrivial for all G = SU(N) (= $A_{N-1}$, N ≥ 2). However, the corresponding problem, e.g., for G = SO(N) or the prominent case G = E8 × E8 remains unsolved. We remark that in general for fixed G the bundles get “more nontrivial” when k increases. Strictly speaking, we have

Proposition 10.4. For every G the non-triviality of $(G^k)_{gen}$ implies that of $(G^{k+1})_{gen}$.

Proof. Let k be chosen such that $(G^k)_{gen}$ is nontrivial. Suppose there is a section $s_{k+1}$ for $\pi_{k+1} : (G^{k+1})_{gen} \to (G^{k+1})_{gen}/\mathrm{Ad}$. We get the following commutative diagram:
$$\begin{array}{ccccc}
(G^{k+1})_{gen} & \supseteq & U & \xrightarrow{\ p_k\ } & (G^k)_{gen} \\[4pt]
\downarrow \pi_{k+1} & & \downarrow \pi_{k+1}|_U & & \downarrow \pi_k \\[4pt]
(G^{k+1})_{gen}/\mathrm{Ad} & \supseteq & [U] & \xrightarrow{\ [p_k]\ } & (G^k)_{gen}/\mathrm{Ad}
\end{array}$$
Here pk : (Gk+1 )gen −→ (Gk )gen is the projection onto the first k coordinates and [pk ] is induced in a natural way. Additionally, we defined U := pk−1 ((Gk )gen ). Now we reuse the idea for the proof of Proposition 10.1. First we set ik : (Gk )gen −→ U = (Gk )gen × G with ik ( g ) := ( g , eG ), where the rhs vector is viewed as an element of Gk+1 . Obviously, ik is a continuous section for pk that additionally – as can be checked quickly – defines a continuous section for [pk ] via [ik ]([ g ]) := πk+1 (ik ( g )). Finally one sees that sk := pk ◦ sk+1 |[U ] ◦ [ik ] is a section for πk . This, however, is a contradiction to the non-triviality of πk .
10.3. Conjecture. Let us return again to the proof of Proposition 10.1. There we deduced via ϕ from the non-triviality of the generic stratum in Gk that the preimage ϕ −1 ((Gk )gen ), hence Agen as well is a nontrivial bundle. But, besides we know that ϕ is surjective even as a mapping from Agen to the whole Gk [23]. It seems to be obvious that one can now deduce from the non-existence of a section for πk over the whole space Gk /Ad (and not only over the generic stratum as above) analogously to the case above the non-existence of a global section in Agen −→ Agen /G. But, the existence of a section for the whole πk is rather not to be expected because typically in the case of non-commutative structure groups G the mapping πk does not define a fibre bundle. (This can easily be seen because in Gk there occurs both the orbit [i.e. the fibre] Z \ G and the orbit pt = G\ G being never isomorphic. However, this is not a criterion for the non-existence of a section, but simply just an indication.) Hence, one can guess that Agen is surely nontrivial for non-commutative G, at least as far as πk possesses no section over the whole Gk /Ad. Unfortunately, we have not been able to prove this up to now. At one point the proof above uses explicitly the fact that only the generic stratum in Gk is considered – namely, for the definition of s[ϕ] . A continuation of that mapping from the generic elements of (Gk )gen /Ad to the whole Gk /Ad is not possible as the next proposition shows: Proposition 10.5. Let k ∈ N+ be some number for that there are both generic and non-generic elements in Gk . Then there is no continuous mapping sϕ : Gk −→ Agen , such that s[ϕ] : Gk /Ad −→ Agen /G with s[ϕ] ([ g ]) := π(sϕ ( g )) is well-defined and that ϕ ◦ sϕ = idGk . We need the following Lemma 10.6. Let k, l ∈ N+ such that there are generic elements in Gl . 
Then we have: G is abelian iff there is a continuous $f : G^l \to G^k$ with
• $\vec g_1 \sim \vec g_2 \Rightarrow (\vec g_1, f(\vec g_1)) \sim (\vec g_2, f(\vec g_2))$ and
• $Z(\vec g) \cap Z(f(\vec g)) = Z$ for all $\vec g \in G^l$.

Proof.
• Let G be abelian. Then, e.g., $f(\vec g) = \vec e_G$, $\vec g \in G^l$, fulfills the conditions of the lemma.
• Let G be non-abelian. Suppose there were such an f. Let $\vec g_1, \vec g_2 \in G^l$ be equivalent, i.e., let there exist a $g \in G$ with $\vec g_2 = \vec g_1 \circ g$. By assumption there is also a $g' \in G$ with $(\vec g_2, f(\vec g_2)) = (\vec g_1, f(\vec g_1)) \circ g' = (\vec g_1 \circ g', f(\vec g_1) \circ g')$. Hence, $\vec g_1 \circ g = \vec g_1 \circ g'$, i.e. $g' = g'' g$ for some $g'' \in Z(\vec g_1)$. Consequently, $f(\vec g_1 \circ g) = f(\vec g_2) = f(\vec g_1) \circ g' = f(\vec g_1) \circ g'' \circ g$. In particular, for all generic $\vec g_1$ we have $g'' \in Z$, i.e. $f(\vec g_1 \circ g) = f(\vec g_1) \circ g$. Since the generic elements by assumption form a dense subset in $G^l$ (cf. Proposition 6.1) and f is to be continuous, $f(\vec g_1 \circ g) = f(\vec g_1) \circ g$ has to hold even for all $\vec g_1 \in G^l$ and all $g \in G$. Let $\vec g$ now be a non-generic element in $G^l$, i.e., let there exist a $g \in Z(\vec g) \setminus Z$. But now $f(\vec g) = f(\vec g \circ g) = f(\vec g) \circ g$, hence $g \in Z(f(\vec g)) \cap (Z(\vec g) \setminus Z) = (Z(\vec g) \cap Z(f(\vec g))) \setminus Z = \emptyset$. Therefore all $\vec g \in G^l$ are generic in contradiction to the non-commutativity of G.
Proof of Proposition 10.5. Suppose there exists such an sϕ : Gk −→ Agen . 1. Let g ∈ Gk be arbitrary, but fixed. Due to sϕ ( g ) ∈ Agen there is an α g ⊆ HG with hsϕ (g ) (α g ) ∈ (G#α g )gen . Since the generic stratum is always open and since together with sϕ and hα g also hα g ◦ sϕ : Gk −→ G#α g is continuous, Ug := (hα g ◦ sϕ )−1 ((G#α g )gen ) defines an open neighbourhood of g.
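2. Varying over all $\vec g$ one gets an open covering $\mathcal U := \{U_{\vec g} \mid \vec g \in G^k\}$ of $G^k$. Since G is compact, there are finitely many $\vec g_i \in G^k$ such that $\bigcup_i U_{\vec g_i} = G^k$. Let now $\alpha'$ be the set (the tuple, respectively) of all these $\alpha_{\vec g_i}$.

3. We define $f' := h_{\alpha'} \circ s_\varphi : G^k \to G^{\#\alpha'}$ and – recall $\varphi \equiv h_\alpha$ – $f := (h_\alpha \circ s_\varphi, h_{\alpha'} \circ s_\varphi) \equiv (\mathrm{id}_{G^k}, f') : G^k \to G^{k + \#\alpha'}$. We have:
• $f'$ is continuous.
• Let $\vec g, \vec g' \in G^k$ with $\vec g \sim \vec g'$. From that, due to the assumed well-definedness of $s_{[\varphi]}$, $s_\varphi(\vec g) \sim s_\varphi(\vec g')$ w.r.t. $\overline{\mathcal G}$. Hence, in particular $(\vec g, f'(\vec g)) = f(\vec g) \sim f(\vec g') = (\vec g', f'(\vec g'))$.
• $Z(f'(\vec g)) = Z$ for all $\vec g \in G^k$. Let $\vec g \in G^k$. Then there is an i with $\vec g \in U_{\vec g_i}$. Thus, $h_{\alpha_{\vec g_i}}(s_\varphi(\vec g)) \in (G^{\#\alpha_{\vec g_i}})_{gen}$, hence $f'(\vec g) = h_{\alpha'} \circ s_\varphi(\vec g) \in (G^{\#\alpha'})_{gen}$ due to $\alpha_{\vec g_i} \subseteq \alpha'$.

Due to $Z \subseteq Z(\vec g) \cap Z(f'(\vec g)) \subseteq Z(f'(\vec g)) = Z$, $f'$ fulfills all assumptions of the preceding lemma, i.e., G is abelian in contradiction to the supposition.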
Despite these obstacles we close this section with an even stronger Conjecture 10.7. Agen is nontrivial for every non-commutative G. Perhaps, there is even for every non-commutative G a k such that (Gk )gen is nontrivial. 11. Discussion We summarize the results of this paper on the structure of the generic stratum of A: Theorem 11.1. Let µ be some normalized regular Borel measure on A such that all of its projections to the lattice theories are Lebesgue measures (or, more general, let µ be a measure according to Theorem 8.7). Then we have: 1. Agen has µ-measure 1. 2. Agen is a µ-almost globally trivial principal G-fibre bundle with structure group BZ \ G. 3. Agen is globally trivial for abelian G. 4. Agen is not globally trivial for nonabelian, simply connected G with nontrivial center. Here BZ ⊆ G is the set of all constant, center-valued gauge transforms. Remark. 1. All the assertions of the theorem are completely independent of the base manifold M. This is a striking contrast to the standard Gribov problem. There, Singer [40] proved that gauge theories over S 4 with semisimple, connected structure group do not have a global gauge fixing, whereas every gauge theory over R4 does – namely the axial gauge [26]. In contrast, here in the Ashtekar framework, e.g., for G = SU (N ) we have non-triviality both on S 4 and on R4 . Of course, one could perhaps expect such a different behaviour in the cases not covered by our non-triviality result. However, this seems to be very unlikely. In fact, as we have seen, in the Ashtekar approach the Gribov problem is rather a phenomenon on the lattice level. And, roughly speaking, lattices do not distinguish between R4 and S 4 . Lattices should do this at most, if we deal with manifolds having different fundamental groups.
2. Completely irrelevant for the appearance of the Gribov problem in the Ashtekar framework is the topology of the underlying principal fibre bundle P over the manifold M, at least if G fulfills the conditions of item 4 in the theorem above. However, this is a general effect that already occurs in the definition of A, G and A/G using projective limits [2, 20].

The theorem above shows that the Gribov problem, well-known for a long time in the case of regular (Sobolev) connections, appears in the Ashtekar approach as well. However, we could refine the statement substantially: The non-triviality is now concentrated on a zero subset, both kinematically and often also dynamically. Just this fact has large physical importance: it justifies the definition of measures on A/G by the image of the corresponding G-invariant measures on A.

Actually, there are two possibilities for the choice of such measures (for an earlier discussion see [23]): Let X be some topological space equipped with a measure µ and let G be some topological group acting measure-preservingly on X. What should the corresponding measure µ_G on the orbit space X/G look like? On the one hand, one could simply define $\mu_{G,1}(U) := \pi_*\mu(U) \equiv \mu(\pi^{-1}(U))$ for all measurable U ⊆ X/G, i.e. use the image measure w.r.t. the canonical projection π : X → X/G. On the other hand, one could perform a kind of Faddeev-Popov transformation: Let us assume that X can be written as a disjoint union of certain sets V and that each V equals $V/G \times G_V \backslash G$, where $G_V$ characterizes the type of orbits on V. Then one could define the measure of a set U that w.l.o.g. is contained in one of these V/G naively by

  $\mu_{G,2}(U) := \mu(\pi^{-1}(U))\,\nu(G_V)$,  (2)

where ν somehow measures the “size” of the stabilizer $G_V$ in G. (The measure of a general U can then be defined via $\mu_{G,2}(U) := \sum_V \mu_{G,2}(U \cap (V/G))$.)
In contrast to the first method, here the orbit space and not the total space is regarded as primary: If the measure is uniformly distributed over all points of the total space, the image measure on the orbits would in the first case no longer be uniformly distributed; the orbits are weighted according to their size. In the second case, however, the uniform distribution remains, i.e. the group degrees of freedom do not play any rôle in the Faddeev-Popov context. That Eq. (2) above can indeed be seen as a Faddeev-Popov transformation becomes clear after a small rearrangement. Namely, we have (maybe up to an overall constant factor)

  $\mu(\pi^{-1}(U)) = \frac{1}{\nu(G_V)}\,\mu_{G,2}(U)$,

hence

  $\pi_*\mu = \Delta\,\mu_{G,2}$ with $\Delta : X \to \mathbb{R}$, $x \mapsto \frac{1}{\nu(\mathrm{Stab}(x))}$.  (3)

Thus, $\Delta = \frac{1}{\nu(G_V)}$ is nothing but the Faddeev-Popov determinant.
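The relation between the two candidate measures on the orbit space can be checked on a finite toy model. The following sketch is purely illustrative and rests on assumptions that are not part of the text: the group is Z_2 = {+1, −1} acting on X = {−2, ..., 2} by multiplication, µ is the counting measure, and ν counts the elements of a stabilizer. The helper names are hypothetical.

```python
from fractions import Fraction

# Toy model (an assumption for illustration): Z_2 = {+1, -1} acts on
# X = {-2, ..., 2} by multiplication; mu is the counting measure on X.
X = [-2, -1, 0, 1, 2]
GROUP = [1, -1]


def orbit(x):
    """Orbit of x under the toy group action."""
    return frozenset(g * x for g in GROUP)


def nu_stabilizer(x):
    """Size nu of the stabilizer of x, measured by counting group elements."""
    return sum(1 for g in GROUP if g * x == x)


def orbit_measures():
    """Return (orbit, mu_1, mu_2, Delta) for every orbit of the toy action."""
    rows = []
    for o in sorted({orbit(x) for x in X}, key=max):
        mu_preimage = Fraction(len(o))      # mu(pi^{-1}({o})) for counting measure
        nu = nu_stabilizer(next(iter(o)))   # identical for all points of an orbit
        mu_1 = mu_preimage                  # image measure pi_* mu
        mu_2 = mu_preimage * nu             # Eq. (2): weight by stabilizer size
        delta = Fraction(1, nu)             # Delta = 1 / nu(Stab(x))
        rows.append((sorted(o), mu_1, mu_2, delta))
    return rows


for points, mu_1, mu_2, delta in orbit_measures():
    # Eq. (3): the image measure equals Delta times mu_2 on every orbit.
    assert mu_1 == delta * mu_2
    print(points, "mu_1 =", mu_1, "mu_2 =", mu_2, "Delta =", delta)
```

In this toy model the image measure gives the fixed point 0 half the weight of the two-point orbits, whereas µ_2 is uniform on the orbit space; Δ equals 1 on the orbits with trivial stabilizer and 1/2 on the orbit with stabilizer of order 2.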
But what does this mean for our concrete problem X = $\overline{\mathcal{A}}$ (the space of generalized connections) and G = $\overline{\mathcal{G}}$ (the group of generalized gauge transforms), if we (for simplicity) assume that the lattice projection measures for (the G-invariant) µ are Lebesgue measures? Since the stabilizer of a connection in the generic stratum is minimal, Δ is maximal on A_gen. Moreover, Δ has to be constant, say $\Delta_{\mathrm{gen}}$, on the whole A_gen. Since we can assume that Δ is not identically 0, we get the measure µ_2 on A_gen/G directly from (3). Hence, we are left with the nongeneric strata only. We know already that the stabilizer of a connection A is isomorphic to the corresponding holonomy centralizer Z(H_A). There are now two different cases [39]:
1. The holonomy centralizer has the same dimension as the center Z of G, i.e. A is an exceptional connection. Then, by compactness, Z(H_A)/Z is discrete, whence, by the assumed invariance of µ w.r.t. G, we get

  $\Delta(A) = \frac{\Delta_{\mathrm{gen}}}{\#\,(Z(\mathbf{H}_A)/Z)} > 0.$

Therefore, for the exceptional strata the measure is given by (3) again. However, since the exceptional strata have µ-measure 0, their G-projection will have µ_2-measure 0 as well.
2. The holonomy centralizer has larger dimension than the center Z of G, i.e. A is singular. Consequently, its measure w.r.t. ν should be infinite. Hence, Δ would vanish on the singular connections, and a complete determination of µ_2 via (3) is no longer possible. The simplest way out would be, of course, to give the singular strata simply µ_2-measure 0. This is indeed plausible, if one takes the following argument into account: First, in the case of (sufficiently smooth) actions of Lie groups on manifolds the nonmaximal strata have lower dimension than the maximal stratum (even after being projected down to the orbit space), i.e., they can be regarded as zero subsets. And, second, the measure on A/G can also be defined using the projections $\pi_\upsilon : A/G \to G^k/\mathrm{Ad}$.

Now we realize that both possibilities above for defining a measure on the orbit space are equivalent in the case of A/G, if we assume the lattice measure to be Lebesgue. In fact, Δ is constant up to a zero subset, hence equal to 1, i.e., we have $\mu_1 = \pi_*\mu = \Delta\,\mu_2 = \mu_2$.

As a by-product, our argumentation provides us with generalizations of Gribov horizons [27] and fundamental modular domains [13, 45, 25]. In fact, given some almost global gauge fixing in the generic stratum we have an embedding $\sigma : A_{\mathrm{gen}}/G \supseteq U \to A_{\mathrm{gen}}$. Now we define Λ to be the closure of σ(U). Since U is dense in A/G, we have π(Λ) = A/G and $\Lambda_{\mathrm{int}} := \Lambda \cap \pi^{-1}(U) = \sigma(U)$. This means Λ is a fundamental modular domain. Its “interior” $\Lambda_{\mathrm{int}}$ is free of Gribov copies, but its “boundary” in general is not. However, the Yabuki problem [44] (certain (axial) gauges sometimes miss some smooth orbits) is not possible; only non-continuities or Gribov ambiguities can arise here in $\Lambda \setminus \Lambda_{\mathrm{int}}$.
Next, the Gribov region Ω is defined as the set of all connections in Λ with positive Faddeev-Popov determinant Δ. But, as seen above, Δ is precisely 1 for generic connections, between 0 and 1 for exceptional connections and 0 for singular connections. Therefore Ω equals the union of the generic strata and all exceptional strata restricted to the fundamental modular domain, and the Gribov horizon ∂Ω is the union of all singular strata restricted to Λ. In the cases where the Gribov problem occurs, we get additionally $\Lambda_{\mathrm{int}} \subset \Omega$. This coincides with the observations of Dell’Antonio and Zwanziger in [14] and van Baal in [41] for smooth connections.

Finally, we remark that the construction of A/G and of the physical measures is only the first step in the quantization. It is an open question how the non-triviality of the generic stratum as well as the non-generic strata affect the second step, the reconstruction of the quantum theory via a kind of Osterwalder-Schrader procedure (for the Ashtekar approach developed in [6]). First, singular gauge fixings can lead to inequivalent reductions to the physical quantum theory and produce in this way different superselection sectors [35]. Second, it is known that singularities can lead, on the one hand, to quantum nodes [7], i.e. zeros of all wave functions, but, on the other hand, also to concentrations of wave functions [17]. Indeed, in the present approach the Gauß-type measures
are concentrated “near” non-generic strata [19]. Then, e.g., these strata could gain a larger impact. This, however, would force a revision of the arguments on which the definition of measures on A/G by measures on A is based (see above). In fact, the non-generic strata could then no longer simply be given measure 0, because this could exclude physically interesting phenomena.

Acknowledgements. The author has been supported in part by the Reimar-Lüst-Stipendium of the Max-Planck-Gesellschaft. The idea of using homotopy arguments for the proof of the non-triviality on the Lie group level is due to Lorenz Schwachhöfer. Moreover, the author thanks Detlev Buchholz, Hartmann Römer, Gerd Rudolph and Matthias Schmidt for inspiring discussions.
Appendix

A. Group Isomorphy of the Two Normalizers

In Sect. 3 we dealt with structure groups of the strata. As in the case of the stabilizer of a connection we could describe each of these groups using by far simpler structures in the underlying Lie group G. We saw that

  $B(A)\backslash N(B(A))$ and $\bigl(Z(\mathbf{H}_A)\backslash N(Z(\mathbf{H}_A))\bigr) \times \prod_{x \neq m} Z(Z(\mathbf{H}_A))$

are homeomorphic. In this appendix we are going to discuss whether these groups are isomorphic even as topological groups.

In order to reduce the size of the expressions in what follows, we assume w.l.o.g. that the connection A ∈ A under consideration has the property that $h_A(\gamma_x) = e_G$ for all x ∈ M, where $\gamma_x$ is as usual some fixed path from m to x and $\gamma_m$ is trivial. This, indeed, is no restriction because every A ∈ A is gauge equivalent to such a connection A′. For this, one would simply set $g := (h_A(\gamma_x)^{-1})_{x \in M} \in G$ and A′ := A ∘ g. Note that our special choice has the technical advantage that now B(A) consists just of the constant Z(H_A)-valued gauge transforms.

As we noticed in the main text already, we will restrict ourselves to so-called “reasonable” isomorphisms.
1. We look only for isomorphisms between $B(A)\backslash N(B(A))$ and $\bigl(Z(\mathbf{H}_A)\backslash N(Z(\mathbf{H}_A))\bigr) \times \prod_{x \neq m} Z(Z(\mathbf{H}_A))$ that are induced by an isomorphism

  $\psi : N(B(A)) \to N(Z(\mathbf{H}_A)) \times \prod_{x \neq m} Z(Z(\mathbf{H}_A))$

between the two non-factorized spaces. This means such a factor isomorphism has to be a continuation of the natural isomorphism between the base centralizer and the holonomy centralizer, hence fulfill $\psi(B(A)) = Z(\mathbf{H}_A) \times \prod_{x \neq m} \{e_G\}$.
2. We denote the mapping that one gets by concatenation of ψ and of the corresponding projection to the x-component of the image space by $\psi_x$, x ∈ M. For a “reasonable” isomorphism ψ we demand
a) $\psi_m \equiv \pi_m : N(B(A)) \to N(Z(\mathbf{H}_A))$ and
b) $\psi_x(g)$ depends only on the values of g in x and m.
In view of Proposition 3.2, we think these restrictions are natural. We neglect only “wild” isomorphisms, i.e. mappings that mix the points of M.
Now, let ψ be a “reasonable” isomorphism. We fix a point x ≠ m and investigate what the projection $\psi_x$ of ψ to the point x has to look like. Since ψ is to be a “reasonable” homomorphism, $\psi_x$ is a map from $(g_m, g_x)$ to $\psi_x(g) \equiv \psi_x(g_m, g_x)$. By Proposition 3.2 every g ∈ N(B(A)) is determined by the values of $g_m \in N(Z(\mathbf{H}_A))$ and of $z_x \in Z(Z(\mathbf{H}_A))$: we have $g_x = z_x g_m$. Hence, $\psi_x$ is well-defined iff

  $\psi_x(g, zg) \in Z(Z(\mathbf{H}_A))$ for all $g \in N(Z(\mathbf{H}_A))$ and $z \in Z(Z(\mathbf{H}_A))$.  (4)

(Footnote 3: In what follows we in general drop the index m in $g_m$ and the index x in $z_x$.)

The homomorphy property implies $\psi(g_1)\psi(g_2) = \psi(g_1 g_2)$ for all $g_i \in N(B(A))$, hence

  $\psi_x(g_1 g_2, z_1 g_1 z_2 g_2) = \psi_x(g_1, z_1 g_1)\,\psi_x(g_2, z_2 g_2)$ for all $g_i \in N(Z(\mathbf{H}_A))$ and $z_i \in Z(Z(\mathbf{H}_A))$.  (5)

Now, we define $\chi(z) := \psi_x(1, z)$ for $z \in Z(Z(\mathbf{H}_A))$ and $\phi(g) := \psi_x(g, g)$ for $g \in N(Z(\mathbf{H}_A))$. Obviously $\chi : Z(Z(\mathbf{H}_A)) \to Z(Z(\mathbf{H}_A))$ and $\phi : N(Z(\mathbf{H}_A)) \to Z(Z(\mathbf{H}_A))$ are continuous (hence smooth) homomorphisms and we have $\psi_x(g, zg) = \chi(z)\phi(g)$. χ is even an automorphism of $Z(Z(\mathbf{H}_A))$ because χ is by construction an injective Lie morphism. The injectivity of χ here is a consequence of our assumption that $\psi_m$ is trivial and $\psi_x(g)$ does not depend on $g_{x'}$, x′ ≠ x, m.

Now let $g \in N(Z(\mathbf{H}_A))$ with $\phi(g) \in Z(Z(Z(\mathbf{H}_A))) = Z(\mathbf{H}_A)$. Due to the bijectivity of χ we have exactly for those g that $\psi_x(g, zg) = \chi(z)\phi(g) = \phi(g)\chi(z) = \psi_x(g, gz)$ for all $z \in Z(Z(\mathbf{H}_A))$. This implies that zg = gz, i.e. $g \in Z(Z(Z(\mathbf{H}_A))) = Z(\mathbf{H}_A)$. Hence we get

  $\phi^{-1}(Z(\mathbf{H}_A)) \subseteq Z(\mathbf{H}_A)$.  (6)

Our assumption $\psi(B(A)) = Z(\mathbf{H}_A) \times \prod_{x \neq m} \{e_G\}$ implies now $\phi(Z(\mathbf{H}_A)) = \{e_G\}$. This again yields $\ker\phi \subseteq \phi^{-1}(Z(\mathbf{H}_A)) \subseteq Z(\mathbf{H}_A) \subseteq \ker\phi$, hence

  $\ker\phi = Z(\mathbf{H}_A)$.  (7)

Therefore we have

  $Z(\mathbf{H}_A)\backslash N(Z(\mathbf{H}_A)) \cong \operatorname{im}\phi \subseteq Z(Z(\mathbf{H}_A))$.  (8)

This, however, cannot always be fulfilled. Let, e.g., G = SU(2) and A be generic. Then $Z(\mathbf{H}_A) = Z(SU(2)) = \mathbb{Z}_2$ and $Z(Z(\mathbf{H}_A)) = N(Z(\mathbf{H}_A)) = SU(2)$. We are looking now for a homomorphism $\phi : SU(2) \to SU(2)$ with $\ker\phi = \mathbb{Z}_2$. By the homomorphism theorem, $SO(3) \cong SU(2)/\mathbb{Z}_2 \cong \operatorname{im}\phi$ is a subgroup of SU(2). This is a contradiction, since −1 is the only element of order 2 in SU(2), whereas SO(3) contains infinitely many elements of order 2 (the rotations by π). Hence, in general, there is no “reasonable” isomorphism of topological groups between $B(A)\backslash N(B(A))$ and $\bigl(Z(\mathbf{H}_A)\backslash N(Z(\mathbf{H}_A))\bigr) \times \prod_{x \neq m} Z(Z(\mathbf{H}_A))$.
Finally, we discuss two special cases.
• χ(z) = z and ϕ(g) = g. The resulting mapping $\psi_x(g, zg) = zg$ just corresponds to the restriction $\Psi_0$ of the identical map on G. This, however, gives a group isomorphism only if ϕ is indeed a map from $N(Z(\mathbf{H}_A))$ to $Z(Z(\mathbf{H}_A))$, i.e., these two spaces are equal. This criterion is fulfilled for instance in the generic stratum. And indeed, in this case $\Psi_0$ is a group isomorphism between N(B(A)) and $N(Z(\mathbf{H}_A)) \times \prod_{x \neq m} Z(Z(\mathbf{H}_A))$. Nevertheless, by condition (7), $\Psi_0$ factorizes to an isomorphism of the quotient groups only if $Z(G) = Z(\mathbf{H}_A) = \ker\phi$ equals $\{e_G\}$. In the minimal stratum $\Psi_0$ is in general no longer an isomorphism because at least for non-abelian G (i.e., if $A_{\mathrm{gen}} \neq A_{=t_{\min}}$), $N(Z(\mathbf{H}_A)) = G$ is not equal to $Z(Z(\mathbf{H}_A)) = Z(G)$.
• χ(z) = z and ϕ(g) = e_G. The resulting mapping $\psi_x(g, zg) = z$ corresponds here to the homeomorphism $\Psi_1$ from Corollary 3.3. In order to turn ψ into a homomorphism, by condition (7), $N(Z(\mathbf{H}_A)) = \ker\phi = Z(\mathbf{H}_A)$ has to hold. This condition is fulfilled in the minimal stratum. (Then, as can be easily checked, $\Psi_1$ is indeed a group isomorphism.) On the other hand, $\Psi_1$ is no isomorphism for generic A in the non-abelian case. This is clear, because there we have $N(Z(\mathbf{H}_A)) = G$, but $Z(\mathbf{H}_A) = Z(G)$.
We see again that it is in many cases impossible to find a group isomorphism that additionally does not depend explicitly on the respective stratum containing A.

B. Stratification of SU(2)^k

This subsection is devoted to the analysis of the special case G = SU(2), which is relevant for the formulation of Ashtekar’s general relativity in the compact case. We will usually only state the results. Their proofs are technical, but straightforward [24].

Every SU(2)-matrix A can be written uniquely as $A = \begin{pmatrix} a & b \\ -b^* & a^* \end{pmatrix}$, where a, b ∈ C are complex numbers fulfilling $|a|^2 + |b|^2 = 1$. Alternatively such a matrix can be seen as a quaternion A = a + bj ∈ H. In this case we also describe A by $a_0 + a_1 i + a_2 j + a_3 k$ or $a_0 + \vec a$ with $\sum_i a_i^2 = 1$, $a_i \in \mathbb{R}$. We have $SU(2) \cong S^3 \subseteq \mathbb{R}^4$.

B.1. Adjoint Action on SU(2). Let A, C ∈ SU(2) with $A = a_0 + \vec a$ and $C = c_0 + \vec c$. It is easy to check that the adjoint action in terms of quaternions is

  $C^+ A C = a_0 + \vec a + 2\bigl(\langle \vec a, \vec c\rangle \vec c - \langle \vec c, \vec c\rangle \vec a + c_0 (\vec a \times \vec c)\bigr) = a_0 + \vec a + 2\bigl(c_0 (\vec a \times \vec c) - \vec c \times (\vec a \times \vec c)\bigr)$.

Here, ⟨·,·⟩ is the canonical scalar product on R^3. We determine the stabilizer of A. We have $C^{-1} A C = C^+ A C = A + 2\bigl(\langle \vec a, \vec c\rangle \vec c - \langle \vec c, \vec c\rangle \vec a + c_0 (\vec a \times \vec c)\bigr)$, hence

  $C \in Z(A) \iff C^{-1} A C = A \iff \langle \vec a, \vec c\rangle \vec c - \langle \vec c, \vec c\rangle \vec a + c_0 (\vec a \times \vec c) = 0$.  (9)

There are two cases:
1. $\vec a = 0$, i.e. A = ±1. Clearly, the rhs of (9) is true for all C ∈ SU(2), i.e. Z(A) = SU(2).
2. $\vec a \neq 0$, i.e. A ≠ ±1.
Table 1

Type      Dim. of stratum in      Total          Single            Dimension in
          SU(2)    SU(2)/Ad       centralizer    centralizer       SU(2)    SU(2)/Ad
[Z2]                                             empty
[U(1)]    3        1              Rd ∩ B^3       1× Rd ∩ B^3       3        1
[SU(2)]   0        0              B^3            1× B^3            0        0

Table 2

Type      Dim. of stratum in      Total          Single                        Dimension in
          SU(2)^2  SU(2)^2/Ad     centralizer    centralizers                  SU(2)^2  SU(2)^2/Ad
[Z2]      6        3              {0}            1× Ra ∩ B^3, 1× Rb ∩ B^3      6        3
[U(1)]    4        2              Rd ∩ B^3       2× Rd ∩ B^3                   4        2
                                                 1× Rd ∩ B^3, 1× B^3           3        1
[SU(2)]   0        0              B^3            2× B^3                        0        0
Let C ∈ Z(A). Multiplying the rhs of (9) scalarly by $\vec a$ we get $\langle \vec a, \vec c\rangle \langle \vec c, \vec a\rangle - \langle \vec c, \vec c\rangle \langle \vec a, \vec a\rangle = 0$. This implies due to $\vec a \neq 0$ that $\vec c = \mu \vec a$ for some µ ∈ R. Conversely, every such C is indeed a solution. One easily sees Z(A) ≅ U(1).

In the following we interpret a subset X of the three-dimensional ball B^3 also as a subset $X := \{A = a_0 + \vec a \in SU(2) \mid \vec a \in X\}$ of SU(2).

Lemma B.1. For $A = a_0 + \vec a \in SU(2)$ we have
  $Z(A) = B^3$ for $a_0^2 = 1$, and $Z(A) = \mathbb{R}\vec a \cap B^3$ for $a_0^2 \neq 1$.

The strata in SU(2) are now displayed in Table 1. Since every SU(2)-matrix can be diagonalized, we get

Proposition B.2.
• We have SU(2)/Ad ≅ [−1, 1] by $[A] \mapsto \tfrac{1}{2}\operatorname{tr} A$.
• The map SU(2)/Ad → SU(2), $a_0 \mapsto a_0 + \sqrt{1 - a_0^2}\; i$, is a continuous section.
• The orbits are the small spheres with constant real part $a_0$.
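The quaternionic formula for the adjoint action and the description of the centralizer can be checked numerically. The following is a minimal sketch, not taken from the paper: quaternions are represented as pairs (scalar, 3-vector), and all function names are hypothetical. It compares C⁺AC computed by quaternion multiplication with the closed form from B.1.

```python
import math
import random


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def qmul(p, q):
    """Quaternion product; a quaternion is stored as (scalar, (x, y, z))."""
    p0, pv = p
    q0, qv = q
    cr = cross(pv, qv)
    scalar = p0 * q0 - dot(pv, qv)
    vector = tuple(p0 * qv[i] + q0 * pv[i] + cr[i] for i in range(3))
    return (scalar, vector)


def conj(q):
    """Quaternion conjugate, which is the inverse for unit quaternions."""
    return (q[0], tuple(-x for x in q[1]))


def adjoint(A, C):
    """C^+ A C computed directly by quaternion multiplication."""
    return qmul(qmul(conj(C), A), C)


def adjoint_closed_form(A, C):
    """Closed form a0 + a + 2(<a,c> c - <c,c> a + c0 (a x c))."""
    a0, a = A
    c0, c = C
    ac, cc, axc = dot(a, c), dot(c, c), cross(a, c)
    vector = tuple(a[i] + 2 * (ac * c[i] - cc * a[i] + c0 * axc[i])
                   for i in range(3))
    return (a0, vector)


def random_unit(rng):
    """Random unit quaternion, i.e. a random element of SU(2)."""
    v = [rng.gauss(0, 1) for _ in range(4)]
    n = math.sqrt(sum(x * x for x in v))
    return (v[0] / n, (v[1] / n, v[2] / n, v[3] / n))


def close(P, Q, tol=1e-12):
    return abs(P[0] - Q[0]) < tol and all(
        abs(x - y) < tol for x, y in zip(P[1], Q[1]))


rng = random.Random(0)
A, C = random_unit(rng), random_unit(rng)
print(close(adjoint(A, C), adjoint_closed_form(A, C)))
```

The real part a_0 = ½ tr A is untouched by the closed form, in line with Proposition B.2, and an element C = cos t + sin t · a/|a| with vector part collinear to that of A leaves A fixed, as Lemma B.1 states.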
B.2. Adjoint Action on SU(2)^2. As we have seen, the generic stratum for SU(2)^1 is empty. There are generic elements in SU(2)^k iff there is some element whose stabilizer equals the center Z of SU(2). We know already from the finiteness lemma for centralizers [23] that every centralizer is finitely generated, i.e. can be written as the intersection of the centralizers of finitely many elements. By $Z \equiv Z(SU(2)) = \mathbb{Z}_2 \equiv \{\pm 1\} \equiv \{\vec 0\}$ we see immediately that the minimal number k for SU(2)^k to have generic elements is precisely 2: Two non-collinear centralizers have trivial intersection. So we get the stratification of SU(2) × SU(2) given in Table 2 ($\vec a$, $\vec b$ are mutually non-collinear). Clearly, (SU(2)^2)_gen is open and dense in SU(2)^2.

Next, we shall find a continuous section in the generic stratum of SU(2)^2. For this we are looking for a “natural” element describing the orbit [(A, B)] with (A, B) ∈ (SU(2)^2)_gen. First we are free to use a matrix C for diagonalizing the first component A.
Table 3

Type      Dim. of stratum in      Total          Single                                      Dimension in
          SU(2)^3  SU(2)^3/Ad     centralizer    centralizers                                SU(2)^3  SU(2)^3/Ad
[Z2]      9        6              {0}            1× Ra ∩ B^3, 1× Rb ∩ B^3, 1× Rc ∩ B^3       9        6
                                                 2× Ra ∩ B^3, 1× Rb ∩ B^3                    7        4
                                                 1× Ra ∩ B^3, 1× Rb ∩ B^3, 1× B^3            6        3
[U(1)]    5        3              Rd ∩ B^3       3× Rd ∩ B^3                                 5        3
                                                 2× Rd ∩ B^3, 1× B^3                         4        2
                                                 1× Rd ∩ B^3, 2× B^3                         3        1
[SU(2)]   0        0              B^3            3× B^3                                      0        0
We get $(C^+ A C, C^+ B C)$. There remains the freedom to act with a second matrix $C_\beta$, however, keeping $C^+ A C$ invariant. Hence, $C_\beta$ has to be a diagonal matrix. On the other hand, $C_\beta$ has to transform the matrix $C^+ B C$. Otherwise, $C^+ B C$ would be a diagonal matrix, in contradiction to $Z(A) \cap Z(B) = \{\vec 0\}$. Hence, by an appropriate choice of β we can make the secondary diagonal of $(C C_\beta)^+ B (C C_\beta)$ real. Explicitly we get:

Proposition B.3.
• Every generic orbit contains a unique element (the so-called standard element) of the form $(\lambda,\; x + \sqrt{1 - |x|^2}\, j)$, where λ and x are complex numbers with |λ| = 1, Im λ > 0 and |x| < 1.
• The map
  $(SU(2) \times SU(2))_{\mathrm{gen}}/\mathrm{Ad} \to (SU(2) \times SU(2))_{\mathrm{gen}}$, $[a_0 + \vec a,\; b_0 + \vec b]_{\mathrm{Ad}} \mapsto (\lambda,\; x + \sqrt{1 - |x|^2}\, j)$
  with $\lambda = a_0 + \|\vec a\|\, i$ and $x = b_0 + i\,\frac{\langle \vec a, \vec b\rangle}{\|\vec a\|}$ is a continuous section.

Obviously, (SU(2)^2)_gen/Ad is homeomorphic to the space of all standard elements, which in turn is homeomorphic to the product of the upper open semicircle of U(1) (λ-part) and the upper open hemisphere of S^2 (x-part). Hence, (SU(2)^2)_gen/Ad is homeomorphic to R^3. Consequently, the generic stratum of SU(2)^2 is homeomorphic to R^3 × SO(3) because SU(2)/Z_2 ≅ SO(3). Finally we remark that there is a homeomorphism between the total orbit space SU(2)^2/Ad and the three-ball B^3 such that the singular strata are mapped to the boundary S^2 of B^3.

B.3. Adjoint Action on SU(2)^3. To close this analysis of SU(2)-orbits we will show that the adjoint action on SU(2)^3 leads to a nontrivial generic stratum. The argument here is again the pure homotopy argument from the proof of the more general Proposition 10.2. The explicit structure of the strata on SU(2)^3 is summarized in Table 3 ($\vec a$, $\vec b$, $\vec c$ mutually non-collinear).
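The standard element of Proposition B.3 (Sect. B.2) depends only on quantities that are invariant under simultaneous conjugation, namely a_0, b_0, the norm of the vector part of A and the scalar product of the two vector parts. The following sketch illustrates this numerically. It is not the paper's construction: quaternions are stored as (scalar, 3-vector), and the helper names `standard_invariants` and `standard_pair` are hypothetical.

```python
import math
import random


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def qmul(p, q):
    """Quaternion product; a quaternion is stored as (scalar, (x, y, z))."""
    cr = cross(p[1], q[1])
    return (p[0] * q[0] - dot(p[1], q[1]),
            tuple(p[0] * q[1][i] + q[0] * p[1][i] + cr[i] for i in range(3)))


def conj(q):
    return (q[0], tuple(-x for x in q[1]))


def conjugate_pair(A, B, C):
    """Simultaneous adjoint action (C^+ A C, C^+ B C)."""
    return qmul(qmul(conj(C), A), C), qmul(qmul(conj(C), B), C)


def standard_invariants(A, B):
    """lambda = a0 + |a| i and x = b0 + i <a,b>/|a| for a generic pair (A, B)."""
    (a0, a), (b0, b) = A, B
    norm_a = math.sqrt(dot(a, a))
    return complex(a0, norm_a), complex(b0, dot(a, b) / norm_a)


def standard_pair(lam, x):
    """The pair (lambda, x + sqrt(1 - |x|^2) j) written as two quaternions."""
    rest = math.sqrt(max(0.0, 1.0 - abs(x) ** 2))
    return (lam.real, (lam.imag, 0.0, 0.0)), (x.real, (x.imag, rest, 0.0))


def random_unit(rng):
    v = [rng.gauss(0, 1) for _ in range(4)]
    n = math.sqrt(sum(t * t for t in v))
    return (v[0] / n, (v[1] / n, v[2] / n, v[3] / n))


rng = random.Random(0)
A, B, C = random_unit(rng), random_unit(rng), random_unit(rng)
lam, x = standard_invariants(A, B)
lam_c, x_c = standard_invariants(*conjugate_pair(A, B, C))
print(abs(lam - lam_c) < 1e-9, abs(x - x_c) < 1e-9)
```

Applying `standard_invariants` to the output of `standard_pair` returns the same (λ, x), so the standard element is a fixed point of the section, at least in this numerical sketch.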
We see that the non-generic strata in SU(2)^3 have codimension 4. (By the way, one easily recognizes that the codimension of the non-generic strata on SU(2)^k equals 2(k − 1).) Therefore, since SU(2)^3 is simply connected, the same has to be true for the total space (SU(2)^3)_gen of the generic stratum. However, since the structure group of the generic stratum is Z_2 \ SU(2) = SO(3), which is not simply connected, the bundle (SU(2)^3)_gen cannot be trivial.

C. Other Kinds of Connections

In the literature there are various kinds of regular connections characterized again by the holonomy group H_A:
1. $A_{\mathrm{irr}} := \{A \in A \mid \mathbf{H}_A = G\}$ (often called irreducible),
2. $A_{\mathrm{gen}} := \{A \in A \mid Z(\mathbf{H}_A) = Z(G)\}$ (often called generic),
3. $A_{\mathrm{alm\,gen}} := \{A \in A \mid Z(\mathbf{H}_A) \text{ is discrete}\}$ (we call it almost generic).
Unfortunately, the notations sometimes diverge. The corresponding sets fulfill $A_{\mathrm{irr}} \subseteq A_{\mathrm{gen}} \subseteq A_{\mathrm{alm\,gen}} \subseteq A$ for semisimple G. If G is not semisimple, the center of G is never discrete [11], hence no centralizer can be discrete. Thus, $A_{\mathrm{alm\,gen}}$ would be empty. However, the relation between $A_{\mathrm{irr}}$ and $A_{\mathrm{gen}}$ survives for arbitrary G. As mentioned, e.g., in [9] the inclusions are not always proper. Suppose, e.g., G = SU(2), then $A_{\mathrm{irr}} = A_{\mathrm{alm\,gen}}$ for simply connected base manifolds M and $A_{\mathrm{alm\,gen}} = A$ for certain M depending on the topology of the bundle P = P(M, G). Moreover, $A_{\mathrm{gen}} = A_{\mathrm{alm\,gen}}$ iff G is a product of SU(N_i)’s only.

However, in the case of generalized connections the relation $A_{\mathrm{irr}} \subset A_{\mathrm{gen}}$ is proper for every (at least one-dimensional, but not necessarily semisimple) Lie group G. This is a simple consequence of the following

Proposition C.1. Let M be at least two-dimensional and m be as always some fixed point in M. For every (abstract) subgroup H ⊆ G there is a generalized connection whose holonomy group equals H.

Proof. Let A_0 be the trivial connection, i.e. we have $h_{A_0}(\gamma) = e_G$ for all γ ∈ P. Furthermore, let m′ be some point in M different from m. We choose a set $E := \{\gamma_g\}_{g \in H} \subseteq P$ of edges connecting m and m′ that do not intersect each other (i.e., the only common points of each two edges are their endpoints m and m′, respectively). Such an E always exists because H as a subset of G has at most the cardinality of R. (Footnote 4: Imagine in some chart (w.l.o.g. equal to $\mathbb{R}^{\dim M}$) for instance the set E of all field lines between a positive charge in m and a negative charge in m′. Since there are ℵ_1 radial directions starting in m, there are ℵ_1 such field lines as well.) Obviously the set V_− of all starting points is {m}, hence finite. By a proposition in [22] there is an A ∈ A with $h_A(\gamma_g) = g$ for all g ∈ H. In particular, $h_A(\gamma_g \gamma_e^{-1}) = g$ for all g ∈ H, hence H ⊆ H_A. On the other hand, by the definition of A (cf. again [22]) and the group property of H, we have $h_A(\gamma) \in H$ for all γ ∈ P. Hence, H_A ⊆ H.

Thus, there are always proper (even countable) subgroups H ⊂ G with Z(H) = Z(G) and H = H_A for some A ∈ A. Up to now, we do not know whether $A_{\mathrm{irr}}$ is open or closed or whatever in $A_{\mathrm{gen}}$. We also do not know whether π : A → A/G may be trivial on the irreducible connections. This would be an interesting problem, in
particular, because the original paper [40] by Singer on the Gribov ambiguity showed the non-triviality just of the bundle of irreducible (Sobolev) connections.

D. Codimension of Nongeneric Strata

Let G be a compact Lie group acting smoothly on a manifold M. By $(x_1, \ldots, x_k) \circ g := (x_1 \circ g, \ldots, x_k \circ g)$ for every k ∈ N_+ we define an (again smooth) action of G on M^k. Obviously, the types of this action are again conjugacy classes of closed subgroups of G. Now we have

Proposition D.1. Let G and M be as above, H be a closed subgroup of G and k ∈ N_+ be arbitrary. Then we have

  $\max_{[K] < [H]} \dim (M^k)_K \le k \max_{[K] < [H]} \dim M_K$.

Proof. Let $\vec x := (x_1, \ldots, x_k) \in (M^k)_K$ for some [K] < [H]. For some appropriate g ∈ G we have $g^{-1} K g = \mathrm{Stab}(\vec x) = \bigcap_i \mathrm{Stab}(x_i) \subseteq \mathrm{Stab}(x_i)$ for all i. Thus, $[\mathrm{Stab}(x_i)] \le [K] < [H]$, hence $x_i \in \bigcup_{[K_i] \le [K]} M_{K_i}$. Consequently,

  $(M^k)_K \subseteq \prod_i \bigcup_{[K_i] \le [K]} M_{K_i} = \bigcup_{[K_i] \le [K]} M_{K_1} \times \cdots \times M_{K_k}$.

Since there are only finitely many orbit types on M^k [39], we get

  $\dim (M^k)_K \le \max_{[K_i] \le [K]} \dim \bigl(M_{K_1} \times \cdots \times M_{K_k}\bigr) = k \max_{[K'] \le [K]} \dim M_{K'} \le k \max_{[K'] < [H]} \dim M_{K'}$.

In particular, we have $\max_{[K] < [H]} \dim (M^k)_K \le k \max_{[K] < [H]} \dim M_K$.

Corollary D.2. We have

  $\min_{[K] < [H]} \operatorname{codim}_{M^k} (M^k)_K \ge k \min_{[K] < [H]} \operatorname{codim}_M M_K$.
E. Proof of Proposition 8.3

First we discuss properties of so-called fundamental systems of connected hyphs, i.e., certain free generating systems for the group of based paths spanned by that hyph. For that purpose, we consider hyphs as abstract graphs. This makes graph-theoretical concepts applicable, like the notion of maximal trees.

Definition E.1.
• Let $\upsilon = \{e_1, \ldots, e_Y\}$ be a connected hyph with m ∈ V(υ) and υ′ be a maximal tree in υ. Moreover, let φ : [1, n] → [1, Y] be an injective function, such that $\upsilon' = \upsilon \setminus \{e_{\varphi(1)}, \ldots, e_{\varphi(n)}\}$. (This intricate definition is necessary, because υ′ is to be a hyph again and the hyph property requires a certain ordering of the edges of a hyph.) $\alpha \subseteq \mathcal{HG}$ is called weak υ′-fundamental system for υ iff for every i = 1, ..., n the path $\alpha_i$ can be expressed as a product of three paths, where the first and the third path are contained in $P_{\{e_{\varphi(1)}, \ldots, e_{\varphi(i-1)}\} \cup \upsilon'}$ and the second path equals the edge $e_{\varphi(i)}$. The edge $e_{\varphi(i)}$ is called free edge of $\alpha_i$.
• $\alpha \subseteq \mathcal{HG}$ is called weak fundamental system iff there are a connected hyph υ and a maximal tree υ′ in υ, such that α is a weak υ′-fundamental system for υ.
Here, $P_\gamma$ denotes the subgroupoid of P generated by γ ⊆ P. Analogously, $\mathcal{HG}_\beta$ is defined.

We have obviously

Lemma E.1. Every connected hyph possesses a weak fundamental system.

The basic properties of weak fundamental systems are summarized in

Proposition E.2. Let υ be a connected hyph with m ∈ V(υ), and let α be a weak fundamental system of υ. Then the following holds:
1. $P_\alpha$ is freely generated by α and equals $\mathcal{HG}_\upsilon$;
2. $\pi_{\alpha *}\mu_0 = \mu_{\mathrm{Haar}}^{\#\alpha}$;
3. $\pi_\alpha : A \to G^{\#\alpha}$ is surjective.
Consequently, every weak fundamental system of a connected hyph is a weak hyph.

Proof. By assumption there is a maximal tree υ′ in $\upsilon = \{e_1, \ldots, e_Y\}$, such that α is a υ′-fundamental system of υ. In order to avoid more complicated expressions, we simply assume $\upsilon' = \{e_{n+1}, \ldots, e_Y\}$.
1. • Obviously $P_\alpha \subseteq \mathcal{HG}_\upsilon$.
• We show $\mathcal{HG}_\upsilon \subseteq P_\alpha$ inductively w.r.t. n = #α = #υ − #υ′. For n = 0 the statement is trivial. Now, let n > 0 and $\gamma \in \mathcal{HG}_\upsilon$, i.e. $\gamma = \gamma_1 e_n^{\eta_1} \gamma_2 e_n^{\eta_2} \cdots e_n^{\eta_q} \gamma_{q+1}$ with $\gamma_j \in P_{\tilde\upsilon}$ and $\eta_j = \pm 1$ for all j. Here $\tilde\upsilon := \upsilon \setminus \{e_n\}$. Let $\delta_{+1}$ and $\delta_{-1}$ be those paths in $P_{\tilde\upsilon}$ that fulfill $\alpha_n = \delta_{-1} e_n \delta_{+1}$. Hence,

  $\gamma = \underbrace{\gamma_1 \delta_{-\eta_1}^{-\eta_1}}_{\in \mathcal{HG}_{\tilde\upsilon}} \alpha_n^{\eta_1} \underbrace{\delta_{\eta_1}^{-\eta_1} \gamma_2 \delta_{-\eta_2}^{-\eta_2}}_{\in \mathcal{HG}_{\tilde\upsilon}} \alpha_n^{\eta_2} \delta_{\eta_2}^{-\eta_2} \cdots \delta_{-\eta_q}^{-\eta_q} \alpha_n^{\eta_q} \underbrace{\delta_{\eta_q}^{-\eta_q} \gamma_{q+1}}_{\in \mathcal{HG}_{\tilde\upsilon}}$.  (10)

By $e_n \notin \upsilon'$, also υ′ is a maximal tree for $\tilde\upsilon$ and, in particular, $\tilde\upsilon$ is connected. Now, $\alpha \setminus \{\alpha_n\}$ is a weak υ′-fundamental system for $\tilde\upsilon$. Due to $\#\tilde\upsilon - \#\upsilon' = \#\upsilon - \#\upsilon' - 1 = n - 1$ we have by the induction hypothesis $\mathcal{HG}_{\tilde\upsilon} = P_{\alpha \setminus \{\alpha_n\}}$. Equation (10) implies $\gamma \in P_\alpha$.
• Consequently, α is a generating system of the free group $\mathcal{HG}_\upsilon$, whose rank equals #υ − #υ′, hence #α. Thus, α is free [34].
2. We have

  $\pi_\alpha^\upsilon : G^Y \to G^n$, $(g_1, \ldots, g_Y) \mapsto (\cdots g_1 \cdots, \ldots, \cdots g_n \cdots)$,

where the ··· in $\cdots g_i \cdots$ denote a product of some $g_j$ or $g_j^{-1}$ with j > n or j < i. Now, we get for all f ∈ C(G^n), using successively the structure of $\cdots g_i \cdots$ and the translation invariance and the normalization of the Haar measure,
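  $\int_{G^Y} (\pi_\alpha^\upsilon)^* f \, d\mu_{\mathrm{Haar}}^Y$
  $= \int_{G^Y} f(\pi_\alpha^\upsilon(g_1, \ldots, g_Y)) \, d\mu_{\mathrm{Haar},1} \cdots d\mu_{\mathrm{Haar},Y}$
  $= \int_{G^Y} f(\cdots g_1 \cdots, \ldots, \cdots g_{n-1} \cdots, \cdots g_n \cdots) \, d\mu_{\mathrm{Haar},1} \cdots d\mu_{\mathrm{Haar},Y}$
  $= \int_{G^Y} f(\cdots g_1 \cdots, \ldots, \cdots g_{n-1} \cdots, g_n) \, d\mu_{\mathrm{Haar},1} \cdots d\mu_{\mathrm{Haar},Y}$
  $\;\;\vdots$
  $= \int_{G^n} f(g_1, \ldots, g_n) \, d\mu_{\mathrm{Haar},1} \cdots d\mu_{\mathrm{Haar},n}$
  $= \int_{G^n} f \, d\mu_{\mathrm{Haar}}^n$.

Since $\pi_\upsilon$ projects the induced Haar measure onto the Haar measure [22], we have $\pi_{\alpha *}\mu_0 = (\pi_\alpha^\upsilon)_*(\pi_{\upsilon *}\mu_0) = (\pi_\alpha^\upsilon)_*\mu_{\mathrm{Haar}}^Y = \mu_{\mathrm{Haar}}^n$.
3. Let $(g_1, \ldots, g_n) \in G^n$. By assumption we can express every $e_i$, i ≤ n, as a product of $\alpha_i$ with appropriate paths in $P_{\{e_1, \ldots, e_{i-1}\} \cup \upsilon'}$. Now, we choose $\delta_{i,\pm} \in P_{\upsilon'}$, such that $\delta_{i,-} e_i \delta_{i,+}$ is a path in $\mathcal{HG}$. But then $\delta_{i,-} e_i \delta_{i,+}$ equals a product of a closed path in $P_{\{e_1, \ldots, e_{i-1}\} \cup \upsilon'}$, the path $\alpha_i$ and a further closed path in $P_{\{e_1, \ldots, e_{i-1}\} \cup \upsilon'}$. By the first step, $\mathcal{HG}_{\{e_1, \ldots, e_{i-1}\} \cup \upsilon'} = P_{\{\alpha_1, \ldots, \alpha_{i-1}\}}$. Hence,

  $\delta_{i,-} e_i \delta_{i,+} = \Bigl(\prod_{k_i = 1}^{K_i} \alpha_{k(k_i,i)}^{\pm 1}\Bigr)\, \alpha_i \,\Bigl(\prod_{l_i = 1}^{L_i} \alpha_{l(l_i,i)}^{\pm 1}\Bigr)$

for some functions k and l that always fulfill $k(k_i, i) < i$ and $l(l_i, i) < i$. Since $\pi_\upsilon$ is surjective for every hyph [22], there is an A ∈ A, such that $h_A(e_i) = e_G$ for all i > n and

  $h_A(e_i) = h_A(\delta_{i,-})^{-1} \Bigl(\prod_{k_i = 1}^{K_i} g_{k(k_i,i)}^{\pm 1}\Bigr)\, g_i \,\Bigl(\prod_{l_i = 1}^{L_i} g_{l(l_i,i)}^{\pm 1}\Bigr) h_A(\delta_{i,+})^{-1}$

for all i ≤ n (defined inductively, since $\delta_{i,\pm}$ may run through $e_{i'}$ with i′ < i or i′ > n). By construction we have $h_A(\alpha_i) = g_i$ for all i = 1, ..., n.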
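Items 2 and 3 of Proposition E.2 can be illustrated on a finite toy model. The sketch below is not the paper's construction and rests on several assumptions: the hyph is replaced by a small abstract graph with five oriented edges, the structure group by the finite group S_3 with its normalized counting measure in place of the Haar measure, and the fundamental system is the special case in which the first and third path of each α_i lie in a maximal tree found by breadth-first search. All function names are hypothetical.

```python
from collections import Counter, deque
from itertools import permutations, product


def tree_paths(edges, root=0):
    """BFS over a connected graph; returns the set of tree edge indices and,
    for every vertex v, the tree path root -> v as a list of (edge, sign)."""
    path = {root: []}
    tree = set()
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for idx, (s, t) in enumerate(edges):
            for a, b, sign in ((s, t, +1), (t, s, -1)):
                if a == u and b not in path:
                    path[b] = path[u] + [(idx, sign)]
                    tree.add(idx)
                    queue.append(b)
    return tree, path


def fundamental_system(edges, root=0):
    """One closed path (tree path)(free edge)(inverse tree path) per non-tree edge."""
    tree, path = tree_paths(edges, root)
    loops = []
    for idx, (s, t) in enumerate(edges):
        if idx not in tree:
            back = [(j, -sgn) for j, sgn in reversed(path[t])]
            loops.append(path[s] + [(idx, +1)] + back)
    return tree, loops


def compose(p, q):
    """Product of two permutations given as tuples."""
    return tuple(p[i] for i in q)


def inverse(p):
    inv = [0] * len(p)
    for i, pi in enumerate(p):
        inv[pi] = i
    return tuple(inv)


def holonomy(loop, assignment):
    """Multiply the group elements assigned to the edges along the loop."""
    h = tuple(range(len(assignment[0])))
    for idx, sign in loop:
        g = assignment[idx]
        h = compose(h, g if sign > 0 else inverse(g))
    return h


# Toy graph with 4 vertices and Y = 5 oriented edges (an assumption).
EDGES = [(0, 1), (1, 2), (2, 0), (2, 3), (3, 0)]
GROUP = list(permutations(range(3)))  # S_3 as a stand-in for G

tree, loops = fundamental_system(EDGES)
counts = Counter(
    tuple(holonomy(loop, assignment) for loop in loops)
    for assignment in product(GROUP, repeat=len(EDGES))
)
print(len(loops), len(counts), set(counts.values()))
```

Every n-tuple of group elements is hit (surjectivity), and each is hit equally often, namely |G|^(Y−n) times, so the push-forward of the uniform measure on G^Y is the uniform measure on G^n.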
Finally, we have Proof of Proposition 8.3. First we choose a β ⊆ HG with Z(hA (β)) = Z(HA ). Since β is obviously connected, there is a connected hyph υ, such that every path in β can be expressed by paths in υ. (The existence of a hyph is proven in [22], the connectedness is a simple consequence [24].) Lemma E.1 guarantees the existence of a weak fundamental system α ⊆ HG for υ. By Proposition E.2, α is a weak hyph. Since every β ∈ HG is a closed path in υ by construction, it can be written as a product of paths in α (and their inverses). Hence, hA (β) is a product of hA (αi ) and their inverses. Now, let g ∈ Z(hA (α)). Then g commutes with all finite products of hA (αi ) and their inverses as well; thus, in particular, g ∈ Z(hA (β)). This implies Z(HA ) ⊆ Z(hA (α)) ⊆ Z(h
β∈β A (β)) = Z(hA (β)) = Z(HA ), hence Z(hA (α)) = Z(HA ). References 1. Ashtekar, A., Isham, C.J.: Representations of the holonomy algebras of gravity and nonabelian gauge theories. Class. Quant. Grav. 9, 1433–1468 (1992) 2. Ashtekar, A., Lewandowski, J.: Representation theory of analytic holonomy C ∗ algebras. In: Knots and Quantum Gravity. J.C. Baez (ed.), Proceedings, Riverside, CA, 1993, Oxford: Oxford University Press, 1994, pp. 21–61
Gribov Problem for Generalized Connections
453
3. Ashtekar, A., Lewandowski, J.: Projective techniques and functional integration for gauge theories. J. Math. Phys. 36, 2170–2191 (1995) 4. Ashtekar, A., Lewandowski, J.: Relation between polymer and Fock excitations. Class. Quant. Grav. 18, L117–L128 (2001) 5. Ashtekar, A., Lewandowski, J., Marolf, D., Mour˜ao, J., Thiemann, T.: SU (N) quantum Yang-Mills theory in two dimensions: A complete solution. J. Math. Phys. 38, 5453–5482 (1997) 6. Ashtekar, A., Marolf, D., Mour˜ao, J., Thiemann, T.: Constructing Hamiltonian quantum theories from path integrals in a diffeomorphism-invariant context. Class. Quant. Grav. 17, 4919–4940 (2000) 7. Asorey, M., Falceto, F., L´opez, J.L., Luz´on, G.: Nodes, monopoles and confinement in (2 + 1)dimensional gauge theories. Phys. Lett. B349, 125–130 (1995) 8. Baez, J.C., Sawin, S.: Functional integration on spaces of connections. J. Funct. Anal. 150, 1–26 (1997) 9. Baum, M., Friedrich, T.: A group-theoretical study of the stabilizer of a generic connection. Ann. Glob. Anal. Geom. 3, 120–128 (1985) 10. Bredon, G.E.: Introduction to Compact Transformation Groups. New York: Academic Press, Inc., 1972 11. Br¨ocker, Th., tom Dieck, T.: Representations of Compact Lie Groups. New York: Springer, 1985 12. Daniel, M., Viallet, C.M.: The gauge fixing problem around classical solutions of the Yang-Mills theory. Phys. Lett. B76, 458–460 (1978) 13. Dell’Antonio, G., Zwanziger, D.: Ellipsoidal bound on the Gribov horizon contradicts the perturbative renormalization group. Nucl. Phys. B326, 333–350 (1989) 14. Dell’Antonio, G., Zwanziger, D.: Every gauge orbit passes inside the Gribov horizon. Commun. Math. Phys. 138, 291–299 (1991) 15. Dieudonn´e, J.: Grundz¨uge der modernen Analysis, Bd. 3. Berlin: VEB Deutscher Verlag der Wissenschaften, 1976 16. Dieudonn´e, J.: Grundz¨uge der modernen Analysis, Bd. 4. Berlin: VEB Deutscher Verlag der Wissenschaften, 1976 17. 
Emmrich, C., R¨omer, H.: Orbifolds as configuration spaces of systems with gauge symmetries. Commun. Math. Phys. 129, 69–94 (1990) 18. Faddeev, L.D., Popov, V.N.: Feynman diagrams for the Yang-Mills field. Phys. Lett. B25, 29–30 (1967) 19. Fleischhack, Ch.: On the support of physical measures in gauge theories. e-print: math-ph/0109030 20. Fleischhack, Ch.: Regular connections among generalized connections. To appear in J. Geom. Phys. 21. Fleischhack, Ch.: A new type of loop independence and SU (N) quantum Yang-Mills theory in two dimensions. J. Math. Phys. 41, 76–102 (2000) 22. Fleischhack, Ch.: Hyphs and the Ashtekar-Lewandowski measure. J. Geom. Phys. (to appear) 23. Fleischhack, Ch.: Stratification of the generalized gauge orbit space. Commun. Math. Phys. 214, 607–649 (2000) 24. Fleischhack, Ch.: Mathematische und physikalische Aspekte verallgemeinerter Eichfeldtheorien im Ashtekarprogramm (Dissertation). Universit¨at Leipzig, 2001 25. Fuchs, J., Schmidt, M.G., Schweigert, Ch.: On the configuration space of gauge theories. Nucl. Phys. B426, 107–128 (1994) 26. Glimm, J., Jaffe, A.: Quantum Physics: A Functional Integral Point of View. New York: Springer, 1987 27. Gribov, V.N.: Quantization of non-abelian gauge theories. Nucl. Phys. B139, 1–19 (1978) 28. Helgason, S.: Differential Geometry, Lie Groups, and Symmetric Spaces. San Diego: Academic Press, Inc., 1978 29. Hilgert, J., Neeb, K.-H.: Lie-Gruppen und Lie-Algebren. Braunschweig/Wiesbaden: Friedr. Vieweg & Sohn, 1991 30. Husemoller, D.: Fibre Bundles. New York: Springer, 1994 31. Kelley, J.L.: General Topology. Toronto, New York, London: D. van Nostrand Company, Inc., 1955 32. Kondracki, W., Rogulski, J.: On the stratification of the orbit space for the action of automorphisms on connections. Dissertationes mathematicae 250, Warszawa: 1985 33. Kondracki, W., Sadowski, P.: Geometric structure on the orbit space of gauge connections. J. Geom. Phys. 3, 421–434 (1986) 34. Kurosch, A.G.: Gruppentheorie 1. 
Berlin: Akademie-Verlag, 1970 35. McMullan, D.: Constrained quantisation, gauge fixing and the Gribov ambiguity. Commun. Math. Phys. 160, 431–456 (1994) 36. Milnor, D., Staxef, D.: Harakteristiqeskie Klassy. Moskva: Izdatelstvo «Mir», 1979. Extended Russian translation of the English original edition: Milnor, J.W., Stasheff, J.D.: Characteristic Classes. Princeton: Princeton University Press, 1974
37. Mitter, P.K.: Geometry of the space of gauge orbits and the Yang-Mills dynamical system. Lectures given at Carg`ese Summer Inst. on Recent Developments in Gauge Theories, Carg`ese, France, Aug 26–Sep 8, 1979 38. Mitter, P.K., Viallet, C.M.: On the bundle of connections and the gauge orbit manifold in Yang-Mills theory. Commun. Math. Phys. 79, 457–472 (1981) 39. Onishchik, A.L. (ed.): Lie Groups and Lie Algebras I (Encyclopaedia of Mathematical Sciences 20). Berlin: Springer-Verlag, 1993 40. Singer, I.M.: Some Remarks on the Gribov ambiguity. Commun. Math. Phys. 60, 7–12 (1978) 41. van Baal, P.: More (thoughts on) Gribov copies. Nucl. Phys. B369, 259–275 (1992) 42. Whitehead, J.H.C.: On C 1 -complexes. Annal. Math. 41, 809–824 (1940) 43. Whitney, H.: Geometric Integration Theory. Princeton, NJ: Princeton University Press, 1957 44. Yabuki, H.: Structure of the Gribov horizon for the non-abelian gauge field theory in axial gauges. Phys. Lett. B231, 271–274 (1989) 45. Zwanziger, D.: Fundamental modular domain, Boltzmann factor and area law in lattice theory. Nucl. Phys. B412, 657–730 (1994) Communicated by H. Nicolai
Commun. Math. Phys. 234, 455–490 (2003) Digital Object Identifier (DOI) 10.1007/s00220-002-0777-1
Communications in
Mathematical Physics
Cercignani’s Conjecture is Sometimes True and Always Almost True
Cédric Villani
UMPA, ENS Lyon, 46 Allée d’Italie, 69364 Lyon Cedex 07, France. E-mail: [email protected]
Received: 25 July 2002 / Accepted: 13 September 2002
Published online: 10 February 2003 – © Springer-Verlag 2003
It is a pleasure to dedicate this paper to Carlo Cercignani, whose influence on the theory of the Boltzmann equation over the last decades cannot be overestimated. The results presented here are one instance of the numerous achievements which have been directly or indirectly triggered by his remarks and ideas.

Abstract: We establish several new functional inequalities comparing Boltzmann’s entropy production functional with the relative H functional. First we prove a longstanding conjecture by Cercignani under the nonphysical assumption that the Boltzmann collision kernel is superquadratic at infinity. The proof rests on the method introduced in [39] combined with a novel use of the Blachman-Stam inequality. If the superquadraticity assumption is not satisfied, then it is known that Cercignani’s conjecture is not true; however we establish a slightly weakened version of it for all physically relevant collision kernels, thus extending previous results from [39]. Finally, we consider the entropy-entropy production version of Kac’s spectral gap problem and obtain estimates about the dependence of the constants with respect to the dimension. The first two results are sharp in some sense, and the third one is likely to be, too; they contain all previously known entropy estimates as particular cases. This gives a first coherent picture of the study of entropy production, according to a program started by Carlen and Carvalho [12] ten years ago. These entropy inequalities are one step in our study of the trend to equilibrium for the Boltzmann equation, in both its spatially homogeneous and spatially inhomogeneous versions.

Contents
1. Introduction
2. Superquadratic Collision Kernels
3. Nonvanishing Collision Kernels
4. General Collision Kernels
5. Further Developments and Open Problems
6. The Entropy Variant of Kac’s Problem
References
1. Introduction

Cercignani’s conjecture [19] asserts the domination of Boltzmann’s relative H functional by a constant multiple of Boltzmann’s entropy production functional. Since its formulation twenty years ago, it has been disproved in greater and greater generality [8, 10, 47]. Nevertheless, we shall show in this paper that it is true in certain cases. In fact, if Grad’s angular cut-off is imposed, then it holds true in essentially all the cases which were not previously covered by counterexamples. We shall also show that a slightly weaker family of inequalities holds true in all physical cases, thus recovering and improving previous results in this direction [12, 13, 39]. These functional inequalities play a key role in our subsequent treatment of trend to equilibrium for the Boltzmann equation, both in a spatially homogeneous [34] and in a spatially inhomogeneous context [25]. We will not develop these issues here, in order to limit the size of the present paper; the reader who would like to consult a tentative global view on the subject is referred to [42, Chapter C], or, better, to [46].

Before going further, let us give precise definitions of all the quantities which will be under study. Whenever $f = f(v)$ is a nonnegative integrable function on $\mathbb{R}^N$ ($N \ge 2$), to be thought of as a density in velocity space, we define

1) the macroscopic density, velocity and temperature associated to $f$, by the identities
\[ \rho = \int_{\mathbb{R}^N} f(v)\,dv; \qquad u = \frac{1}{\rho}\int_{\mathbb{R}^N} f(v)\,v\,dv; \qquad T = \frac{1}{N\rho}\int_{\mathbb{R}^N} f(v)\,|v-u|^2\,dv; \tag{1} \]

2) the H functional, or negative of the entropy, by
\[ H(f) = \int_{\mathbb{R}^N} f \log f; \tag{2} \]

3) the thermodynamical equilibrium, or maximum of the entropy under the constraints (1),
\[ M^f(v) = M_{\rho,u,T}(v) \equiv \frac{\rho\, e^{-\frac{|v-u|^2}{2T}}}{(2\pi T)^{N/2}}; \tag{3} \]

4) the H-dissipation, or entropy production¹:
\[ D(f) = \frac{1}{4}\int_{\mathbb{R}^N\times\mathbb{R}^N\times S^{N-1}} (f'f'_* - ff_*)\,\log\frac{f'f'_*}{ff_*}\; B(v-v_*,\sigma)\,d\sigma\,dv\,dv_*. \tag{4} \]

¹ With respect to previous papers, I have decided to change my own sign convention for the entropy (unphysical, albeit common in kinetic theory), and accept the idea that it should be nondecreasing with time. The influence of Sasha Bobylev in this decision is acknowledged.
Here $S^{N-1}$ stands for the unit sphere in $\mathbb{R}^N$; we have used the shorthands $f' = f(v')$, $f_* = f(v_*)$, $f'_* = f(v'_*)$, where
\[ v' = \frac{v+v_*}{2} + \frac{|v-v_*|}{2}\,\sigma, \qquad v'_* = \frac{v+v_*}{2} - \frac{|v-v_*|}{2}\,\sigma \qquad (\sigma \in S^{N-1}). \tag{5} \]
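The parametrization (5) places $v'$ and $v'_*$ at opposite ends of a diameter of the sphere with center $(v+v_*)/2$ and radius $|v-v_*|/2$, so it preserves total momentum, kinetic energy and the modulus of the relative velocity. The following is a minimal numerical sanity check of these three conservation laws; the function names and the choice $N = 3$ are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def post_collision(v, v_star, sigma):
    """Apply the parametrization (5): return (v', v'_*) for a unit vector sigma."""
    center = 0.5 * (v + v_star)
    radius = 0.5 * np.linalg.norm(v - v_star)
    return center + radius * sigma, center - radius * sigma

def random_unit_vector(rng, n):
    """Draw a direction on S^{n-1} by normalizing a Gaussian vector."""
    x = rng.normal(size=n)
    return x / np.linalg.norm(x)

rng = np.random.default_rng(0)
N = 3  # illustrative dimension; any N >= 2 works
v, v_star = rng.normal(size=N), rng.normal(size=N)
sigma = random_unit_vector(rng, N)
v_prime, v_prime_star = post_collision(v, v_star, sigma)

# Gaps between pre- and post-collisional invariants; all should vanish up to rounding.
momentum_gap = np.linalg.norm((v_prime + v_prime_star) - (v + v_star))
energy_gap = abs(v_prime @ v_prime + v_prime_star @ v_prime_star
                 - v @ v - v_star @ v_star)
speed_gap = abs(np.linalg.norm(v_prime - v_prime_star) - np.linalg.norm(v - v_star))
print(momentum_gap, energy_gap, speed_gap)
```

The energy identity is the one used later in the approximation argument, where $|v'|^2 + |v'_*|^2 = |v|^2 + |v_*|^2$ allows a Gaussian weight to be factored out of the integrand.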
Finally, the function $B$ appearing in (4) is Boltzmann’s collision kernel². Boltzmann’s entropy production functional describes the amount of entropy which is produced per unit of time by the collisions of particles in a dilute gas, at a given position in space. These collisions are assumed to be elastic and binary; one may think of $v'$, $v'_*$ as the respective velocities of particles which are just about to collide, and will have respective velocities $v$, $v_*$ as a result of this interaction. As for Boltzmann’s collision kernel, it depends on the particular interaction between the particles, but it is always assumed to depend only on the two parameters $|v - v_*|$ (modulus of the relative velocity) and $\cos\theta = \langle k, \sigma\rangle$ (cosine of the deviation angle), where $k = (v - v_*)/|v - v_*|$. Typical examples are the hard-sphere model, in which $B = |v - v_*|$, and the inverse-power model, in which $B = |v - v_*|^{\gamma}\, b(\cos\theta)$ for some exponent $\gamma \in \mathbb{R}$ and some complicated function $b$, which is only known implicitly. Many more details can be found in [42] or in the classical references [20–22, 41].

Boltzmann’s H theorem identifies $D$ with the entropy production and asserts that (i) $D(f) \ge 0$, (ii) if the collision kernel $B$ is almost everywhere positive, then
\[ D(f) = 0 \iff f = M^f. \tag{6} \]
In other words, the entropy production is nonzero if the distribution function is not in thermodynamical equilibrium. This theorem is at the heart of most studies of the hydrodynamical approximation, or the long-time behavior of solutions of the Boltzmann equation. This gives strong motivation for establishing quantitative versions of this theorem. Accordingly, the following question has been studied by many authors (e.g. [24, 12, 13, 39]): Can one establish a lower bound on the entropy production, in terms of how much the distribution function departs from thermodynamical equilibrium?

This question is very representative of recent trends in partial differential equations: in many cases of interest, an entropy principle has been identified, and a detailed study of the entropy production principle enables a sharp insight into the problem of asymptotic behavior of the equation under consideration [4, 17, 18, 23, 35]. In the case of the Boltzmann equation, this problem is interesting not only for the sake of developing the theory of this particular equation, but also because it presents a number of interesting mathematical features, some of which are typical of kinetic equations and some of which are more specific.

In our context, a natural way to measure the departure of $f$ from thermodynamical equilibrium is by means of the relative H functional,
\[ H(f|M^f) = H(f) - H(M^f) = \int_{\mathbb{R}^N} f \log\frac{f}{M^f}. \tag{7} \]
This is nothing but the relative Kullback information of $f$ with respect to $M^f$.

² The kernel $B$ is often improperly called the cross-section. Strictly speaking, the cross-section would rather be $B(v - v_*, \sigma)/|v - v_*|$.
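The definitions (1), (2), (3) and (7) can be evaluated on a velocity grid. The sketch below does so in dimension $N = 2$ for a toy bimodal density; the grid size, the test density and all function names are illustrative assumptions, and simple Riemann sums stand in for the integrals. It checks numerically that the two expressions for $H(f|M^f)$ in (7) agree, which reflects the fact that $\log M^f$ is a quadratic polynomial in $v$ and that $f$ and $M^f$ share the moments (1).

```python
import numpy as np

def grid_2d(half_width=10.0, points=251):
    """Uniform grid on [-half_width, half_width]^2; returns velocities and cell area."""
    axis = np.linspace(-half_width, half_width, points)
    V = np.stack(np.meshgrid(axis, axis, indexing="ij"), axis=-1)  # (points, points, 2)
    cell = (axis[1] - axis[0]) ** 2
    return V, cell

def macroscopic(f, V, cell):
    """Density, mean velocity and temperature from the identities (1)."""
    N = V.shape[-1]
    rho = f.sum() * cell
    u = (f[..., None] * V).sum(axis=(0, 1)) * cell / rho
    T = (f * ((V - u) ** 2).sum(axis=-1)).sum() * cell / (N * rho)
    return rho, u, T

def maxwellian(V, rho, u, T):
    """The equilibrium M_{rho,u,T} of (3)."""
    N = V.shape[-1]
    return rho * np.exp(-((V - u) ** 2).sum(axis=-1) / (2 * T)) / (2 * np.pi * T) ** (N / 2)

def H(f, cell):
    """The H functional (2), with the convention 0 log 0 = 0."""
    safe = np.where(f > 0, f, 1.0)
    return (f * np.log(safe)).sum() * cell

def relative_H(f, M, cell):
    """The relative H functional, last expression in (7)."""
    safe_ratio = np.where(f > 0, f / M, 1.0)
    return (f * np.log(safe_ratio)).sum() * cell

# Toy density: symmetric mixture of two unit-temperature Maxwellians (an assumption).
V, cell = grid_2d()
shift = np.array([1.5, 0.0])
f = 0.5 * maxwellian(V, 1.0, shift, 1.0) + 0.5 * maxwellian(V, 1.0, -shift, 1.0)

rho, u, T = macroscopic(f, V, cell)
M_f = maxwellian(V, rho, u, T)
gap = relative_H(f, M_f, cell)               # integral of f log(f / M^f)
difference = H(f, cell) - H(M_f, cell)       # H(f) - H(M^f)
print(rho, u, T, gap, difference)
```

For this density the temperature is $T = (1 + 1.5^2 + 1)/2 = 2.125$, and the relative H functional is strictly positive because $f$ is not a Maxwellian.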
After these preparations, we can state Cercignani’s conjecture: it consists in the validity of the functional inequality
\[ D(f) \ge K(f)\,H(f|M^f), \tag{8} \]
where $K(f)$ would be a positive constant depending on $f$ only via certain a priori estimates, such as smoothness, decay at infinity or strict positivity.

For the rest of the discussion, let us introduce the functional norms which we shall use in our estimates. We define the functional spaces $L^1_s$ (weighted $L^1$), $L^1_s \log L$ (weighted Orlicz space) and $H^k$ (Sobolev space) by the identities
\[ \|f\|_{L^1_s} = \int_{\mathbb{R}^N} f(v)\,(1+|v|^2)^{s/2}\,dv, \qquad \|f\|_{L^1_s \log L} = \int_{\mathbb{R}^N} f(v)\,(1+|v|^2)^{s/2}\,\big(1 + |\log f(v)|\big)\,dv, \]
and
\[ \|f\|_{H^k} = \left( \sum_{|\alpha|\le k} \int_{\mathbb{R}^N} |D^\alpha f(v)|^2\,dv \right)^{1/2}, \]
in which $\alpha$ stands for a multi-index of length $|\alpha|$, and $D^\alpha f = \partial_1^{\alpha_1}\cdots\partial_N^{\alpha_N} f$. Of course in these definitions we have assumed $f \ge 0$.

Cercignani’s conjecture was first formulated with a view to providing simplified proofs of convergence to equilibrium for the spatially homogeneous Boltzmann equation; however, later it was understood that it would have important consequences on the quantitative study of convergence to equilibrium, even in a spatially inhomogeneous setting. A moment of reflection shows that inequality (8) is essentially a nonlinear variant of a spectral gap estimate, with all the possible implications that one may imagine. One can make this slightly more precise: a standard linearization procedure of the Boltzmann equation (with the notation below, writing $f = M(1 + \varepsilon h)$ and comparing second order terms in $\varepsilon$) transforms inequality (8) into a spectral gap inequality for a linearized Boltzmann operator; therefore this inequality is stronger than a spectral gap estimate.

At first the constant $K(f)$ in (8) was conjectured to depend only on $B$, $\rho$, $u$ and $T$. Note that, if we define
\[ \tilde f(v) = \frac{T^{N/2}}{\rho}\, f(u + \sqrt{T}\,v), \tag{9} \]
\[ M(v) = \frac{e^{-\frac{|v|^2}{2}}}{(2\pi)^{N/2}} = M^{\tilde f}(v), \tag{10} \]
then a homogeneity argument yields
\[ H(f|M^f) = \rho\,H(\tilde f|M), \qquad D(f) = \rho^2\,\widetilde{D}(\tilde f), \tag{11} \]
where $\widetilde{D}$ is the entropy production functional associated with the rescaled collision kernel
\[ \widetilde{B}(v - v_*, \sigma) = B\big(\sqrt{T}\,(v - v_*), \sigma\big). \]
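The homogeneity argument behind (11) is a change of variables, which can be sketched as follows. The auxiliary variable $w$ is notation introduced here for illustration; the computation uses only (3), (4), (5), (9) and (10), together with the observation that the collision rule (5) commutes with the affine map $v \mapsto u + \sqrt{T}\,v$.

```latex
% Change of variables w = u + \sqrt{T}\,v, so that dw = T^{N/2}\,dv.
\begin{align*}
f(w) &= \rho\,T^{-N/2}\,\tilde f(v), \qquad M^f(w) = \rho\,T^{-N/2}\,M(v), \\
H(f|M^f) &= \int_{\mathbb{R}^N} f(w)\,\log\frac{f(w)}{M^f(w)}\,dw
  = \int_{\mathbb{R}^N} \rho\,T^{-N/2}\,\tilde f(v)\,\log\frac{\tilde f(v)}{M(v)}\;T^{N/2}\,dv
  = \rho\,H(\tilde f|M), \\
D(f) &= \frac14 \int_{\mathbb{R}^N\times\mathbb{R}^N\times S^{N-1}}
  \big(\rho\,T^{-N/2}\big)^2\,\big(\tilde f'\tilde f'_* - \tilde f\tilde f_*\big)\,
  \log\frac{\tilde f'\tilde f'_*}{\tilde f\tilde f_*}\;
  B\big(\sqrt{T}\,(v-v_*),\sigma\big)\;d\sigma\;T^{N}\,dv\,dv_*
  = \rho^2\,\widetilde{D}(\tilde f).
\end{align*}
```

In the first identity the prefactors $\rho\,T^{-N/2}$ cancel inside the logarithm; in the second, the two prefactors from $ff_*$ combine with the Jacobian $T^N$ of the pair $(v, v_*)$ to leave $\rho^2$.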
Therefore it is natural to consider only the case when $f$ lies in the set $C_{1,0,1}(\mathbb{R}^N)$ defined by
\[ C_{1,0,1}(\mathbb{R}^N) = \big\{ f \in L^1(\mathbb{R}^N);\ f \ge 0;\ \rho = 1,\ u = 0,\ T = 1 \big\}. \tag{12} \]
Indeed, estimates on this set immediately lead to general estimates by means of (11).

As we mentioned above, for a long time only counterexamples appeared in the subject. The best (negative!) results available are due to Bobylev and Cercignani [10]. Roughly speaking, they showed that, for a large class of collision kernels, inequality (8) cannot be true if $K(f)$ is allowed to depend on $f$ only via high-order moments, a finite number of Sobolev norms, and “perfect” positivity estimates. Here is a way to formulate this result precisely.

Theorem 1.1 (Bobylev, Cercignani). Let $B$ satisfy
\[ \int_{S^{N-1}} B(v - v_*, \sigma)\,d\sigma \le C_B\,(1 + |v - v_*|^{\gamma}) \]
for some $\gamma \in [0, 2)$, $C_B < +\infty$. Then there exist sequences $(M_s)_{s\in\mathbb{N}}$, $(S_k)_{k\in\mathbb{N}}$ (to be understood as moment and smoothness bounds, respectively), and $K_0, A_0 > 0$, such that
\[ \inf\left\{ \frac{D(f)}{H(f|M)};\ f \in C_{1,0,1} \cap \mathcal{B}\big[(M_s), (S_k), A_0, K_0\big] \right\} = 0, \]
where $\mathcal{B}[(M_s), (S_k), A_0, K_0]$ stands for the space of distribution functions $f$ satisfying the bounds
\[ \forall s \in \mathbb{N},\ \ \|f\|_{L^1_s} \le M_s; \qquad \forall k \in \mathbb{N},\ \ \|f\|_{H^k} \le S_k; \qquad \forall v \in \mathbb{R}^N,\ \ f(v) \ge K_0\,e^{-A_0|v|^2}. \]
Moreover, one can impose that any finite number of the bounds $M_s$, $S_k$, $K_0$ and $A_0$ be arbitrarily close to their equilibrium values. More precisely, for all $s_*$ and $k_*$, and for all $\varepsilon > 0$ one can impose
\[ \forall s \le s_*,\ \ M_s \le \|M\|_{L^1_s} + \varepsilon; \qquad \forall k \le k_*,\ \ S_k \le \|M\|_{H^k} + \varepsilon; \qquad K_0 \ge 1 - \varepsilon, \qquad A_0 \le \frac12 + \varepsilon. \]

The last part of the theorem rules out even the hope that the conjecture would hold true “in the neighborhood of the Maxwellian”. But Theorem 1.1 does not rule out the possibility of Cercignani’s conjecture holding true under more stringent assumptions on $f$, for instance $\int e^{\alpha|v|^2} f(v)\,dv < +\infty$. Barthe [6] has shown to us strong indication that the conjecture might hold true if $f$, viewed as a reference measure, satisfies a Poincaré inequality, a condition which typically would require at least $\int e^{\alpha|v|} f(v)\,dv < +\infty$. However, in the present state of the theory of the Boltzmann equation, “polynomial” moment estimates are essentially
the best we can hope for in a fully nonlinear context (see [9] for an exceptional case in which exponential moment estimates are available, however still far from implying a Poincaré inequality).

At the beginning of the nineties, Carlen and Carvalho [12] were able to prove weaker entropy-entropy production inequalities, in the form
\[ D(f) \ge \Theta_f\big(H(f|M^f)\big), \]
where the function $\Theta_f$ would depend on $f$ only via some moment and (mild) smoothness estimates, and be strictly increasing from 0. Their results, and even more their methods, had considerable impact on the field. It was only in 1999 that the results were improved [39]: the first polynomial bounds, of the form
\[ D(f) \ge K(f)\,H(f|M^f)^{\alpha} \qquad (\alpha > 1), \]
were established in a joint paper by Toscani and the author. It was moreover shown that the exponent $\alpha$ can be taken arbitrarily close to 1 if the collision kernel was nonvanishing. More precisely, if
\[ B(v - v_*, \sigma) \ge K_B\,(1 + |v - v_*|)^{-\beta} \qquad (K_B > 0,\ \beta > 0), \tag{13} \]
then for all $\varepsilon > 0$ one had
\[ D(f) \ge K_\varepsilon(f)\,H(f|M^f)^{1+\varepsilon}. \tag{14} \]
In this sense Cercignani’s conjecture was shown to be “almost true” for nonvanishing kernels.

Also at the end of the nineties, Carlen, Carvalho and Loss took up the study of an old spectral gap conjecture formulated by Kac. We shall describe it more precisely in Sect. 6; let us just say here that the main difficulty in Kac’s problem is that the dimension $n$ of the phase space grows without bound, and one wishes to obtain spectral gap estimates which would be independent of the dimension. The conjecture, formulated in the fifties, was first solved by Janvresse [28] and independently by Maslen [33], no earlier than the year 2000. However, the correct formulation for achieving Kac’s goal would rather be an entropic variant of Kac’s problem, namely finding “entropy-entropy production constants” $K_n$ which would admit a uniform lower bound, independently of $n$. However, it is not expected that $K_n$ be universally bounded below, because this would imply a variant of Cercignani’s conjecture, which is believed to be false as well. In fact Carlen, Carvalho and Loss recently showed that $K_n^{-1} = O(n)$ is admissible. Although this has not been explicitly checked, their arguments suggest that this bound is sharp (what they do prove is that this bound is sharp for an interesting intermediate entropy inequality; for more information see recent work by Carlen, Lieb and Loss).

From the above presentation we see that many shadow regions remained in the picture. In this paper we shall shed light on part of them, and present for the first time a coherent (although certainly not final) picture of these entropy production inequalities. Indeed, we shall establish the following three statements:
1) Cercignani’s conjecture holds true when the collision kernel is super-quadratic, in the sense
\[ B(v - v_*, \sigma) \ge K_B\,(1 + |v - v_*|^2). \]
This seems to be the very first instance in which Cercignani’s conjecture is proven. Moreover, our estimate on the constant $K(f)$ will be surprisingly good. Note that this is precisely the limit case which is not covered by Theorem 1.1, so that the assumption of superquadraticity is essentially sharp. We emphasize that this is a nonphysical assumption! As a consequence of this result in the superquadratic case, we shall be able to recover the main results of [39], namely $D(f) \ge K_\varepsilon(f)\,H(f|M^f)^{1+\varepsilon}$ under assumption (13), with a technically simplified proof and improved constants.

2) Cercignani’s conjecture is almost true in all the physically relevant cases. Indeed, we shall extend the results of [39] to cover also situations in which the collision kernel vanishes for $v = v_*$ (most typically, the important example of hard spheres, $B = |v - v_*|$). Since collision kernels coming from physics are bounded below by positive constants except possibly when $|v - v_*| \to \infty$ or when $|v - v_*| \to 0$, our result can be applied in all physically relevant cases. As a trade-off, we shall need a higher regularity for the distribution function (bounds in all Sobolev spaces).

3) The entropy variant of Kac’s conjecture is also true in the superquadratic case, at least for Kac’s caricature of the Boltzmann equation. At the same time that we establish this, we shall give a new proof of the result by Carlen, Carvalho and Loss that, in general, the constant $K_n^{-1}$ can be chosen $O(n)$ as $n \to \infty$, in the basic case of constant collision kernel (see Sect. 6 for details). As a consequence, under this assumption of superquadratic behavior, we shall be able to realize Kac’s program and connect the trend to equilibrium of a simple caricature of the Boltzmann equation, with that of a well-chosen many-particle system.

Of these three results, the second is certainly the most technical: it will turn out to be really tricky to handle the problem of small relative velocities without doing any harm to the form of inequality (14). The first and third will be obtained more easily, essentially by a re-examination of the method in [39] together with a new use of the Blachman-Stam inequality (see [3, 11] for background on this inequality). In particular, we keep from [39] the idea of the auxiliary Ornstein-Uhlenbeck diffusion semigroup, the use of Landau’s entropy production and the study of symmetries in Boltzmann’s entropy production functional. The Blachman-Stam inequality used here is in fact a differential version of the logarithmic Sobolev inequality by Stam and Gross (see [3]). The fact that both our method of proof and logarithmic Sobolev-type inequalities behave well with respect to the dimension will be the key to the proof of the third result.

What are the consequences in terms of convergence to equilibrium? Of course, if we consider $f = f(t, v)$ a solution of the spatially homogeneous Boltzmann equation, satisfying nice bounds, in such a way that (14) holds true, then it is immediate to prove convergence to equilibrium like $O(t^{-\infty})$, meaning faster than $O(t^{-\kappa})$ for all $\kappa$. The fact that we need high-order Sobolev regularity for proving (14) in the general case could be seen as a major weakness of our results. This is in fact not the case: whenever the Cauchy problem can be studied in enough detail, these bounds can be used. In particular, in [34], we shall be able to use this result and recover convergence like $O(t^{-\infty})$, even if the solution is not smooth (we shall just need an $L^2$ assumption). The main idea is that the solution can be split into the sum of a very smooth part, and a nonsmooth part which decays exponentially fast.
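The step from (14) to algebraic decay of arbitrary order can be sketched as a differential inequality. The derivation below assumes, as hypotheses introduced here for illustration, that the solution satisfies the H theorem in the form $\frac{d}{dt} H(f(t)|M^f) = -D(f(t))$ and that the constants $K_\varepsilon(f(t))$ are bounded below by some $K_\varepsilon > 0$ uniformly in time; the shorthand $h(t)$ is not notation from the paper.

```latex
% Notation: h(t) = H(f(t)|M^f), with K_\varepsilon a uniform-in-time lower bound (assumption).
\begin{align*}
h'(t) = -D(f(t)) &\le -K_\varepsilon\, h(t)^{1+\varepsilon}, \\
\frac{d}{dt}\, h(t)^{-\varepsilon} = -\varepsilon\, h(t)^{-1-\varepsilon}\, h'(t) &\ge \varepsilon K_\varepsilon, \\
h(t) &\le \big( h(0)^{-\varepsilon} + \varepsilon K_\varepsilon\, t \big)^{-1/\varepsilon}
      \le (\varepsilon K_\varepsilon\, t)^{-1/\varepsilon}.
\end{align*}
```

Since $\varepsilon > 0$ is arbitrary, the exponent $\kappa = 1/\varepsilon$ can be taken as large as desired, which is the meaning of $O(t^{-\infty})$ in the text.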
When one considers the full (spatially inhomogeneous) Boltzmann equation, then things are much, much trickier. However, with the help of inequality (14) one can still prove [25] convergence to equilibrium like $O(t^{-\infty})$, in the presence of suitable a priori estimates.

A point which deserves particular attention in future studies is the relationship of (8) to spectral gap estimates. Our methods, which are quite efficient for entropy estimates, seem however unable to estimate spectral gaps in cases where Cercignani’s conjecture does not hold true. For instance, when applied to Kac’s problem in the case of a constant kernel, our method will yield a bound in $O(n)$ for the inverse entropy constant $K_n^{-1}$, which is optimal, but also a bound in $O(n)$ for the inverse spectral gap, which is not! The Carlen-Carvalho-Loss strategy is completely different, and manages to treat the spectral gap in a sharp way; maybe a common framework should be looked for. Further remarks will be formulated in Sects. 5 and 6.

The paper is organized as follows: in each section (except Sect. 5), we state and prove one main theorem. Section 2 establishes the validity of Cercignani’s conjecture for the superquadratic potential. For the sake of completeness, we have included there some of the arguments in [39] (with a slightly simplified presentation), as well as a simple but crucial auxiliary estimate from [26]. We shall not repeat the proofs already published in [26, 39], but we thought it would help clarity to re-state the various lemmas proven there. Then Sect. 3 shows how to recover the main results of [39] from the results in Sect. 2, with a simplified proof and improved constants. Section 4 deals with small relative velocities, and establishes the validity of (14) for all physically realistic cross-sections. In Sect. 5 we present some problems left open, and formulate a conjecture about the range of validity of Cercignani’s conjecture. Finally, Sect. 6 discusses the entropy variant of Kac’s problem. On this occasion we shall recall a little bit of the history of this topic, and formulate one last open problem.

2. Superquadratic Collision Kernels

In this section, we prove the following theorem.

Theorem 2.1. Let $B$ satisfy
\[ B(v - v_*, \sigma) \ge K_B\,(1 + |v - v_*|^2) \tag{15} \]
for some $K_B > 0$, and let $D$ be the associated H-dissipation functional. Let $f \in C_{1,0,1}(\mathbb{R}^N)$. Then
\[ D(f) \ge \frac{K_B\,|S^{N-1}|}{4(2N+1)}\,\big(N - T^*(f)\big)\,H(f|M), \tag{16} \]
where
\[ T^*(f) = \max_{e \in S^{N-1}} \int_{\mathbb{R}^N} f(v)\,(v\cdot e)^2\,dv. \tag{17} \]
Remark. In words, $T^*(f)$ is the “maximum directional temperature” of $f$. Note that $N$ is the sum of all directional temperatures. In particular, $N - T^*(f) \ge (N - 1)\,T_*(f)$, where $T_*(f)$ stands for the minimum directional temperature of $f$ (defined as in (17) but with “max” replaced by “min”). The strict positivity of $T_*(f)$ is a very weak strict positivity assumption on $f$; it just means that “all directions in $\mathbb{R}^N$” are represented in the support of $f$. The assumption $T^*(f) < N$ is even weaker, it just means that $f$ is not concentrated on a line. In subsequent developments, we shall crudely bound below $(N - T^*(f))/(4(2N + 1))$ by $T_*(f)/20$.

Corollary 2.2. Let $B$ satisfy (15). Then, for every distribution $f$ on $\mathbb{R}^N$ there is a constant $K(f)$, depending only on $N$, $K_B$, $\rho$, $T$ and an upper bound for $H(f)$, such that
\[ D(f) \ge K(f)\,H(f|M). \]
This corollary follows from Theorem 2.1 by use of (11) and by standard estimates on the positivity of $T_*(f)$, see for instance [26, Prop. 2]. There are many other ways to control the strict positivity of $N - T^*(f)$, for instance by means of $L^\infty$ bounds combined with moment bounds.

Proof of Theorem 2.1. Without loss of generality we assume that $B(v - v_*, \sigma) = K_B\,(1 + |v - v_*|^2)$. As in [39] we shall begin by introducing the adjoint Ornstein-Uhlenbeck (or Fokker-Planck) regularizing semigroup $(S_t)_{t\ge 0}$, defined by the partial differential equation
\[ \frac{\partial f}{\partial t} = \Delta f + \nabla\cdot(f v) \qquad (t \ge 0,\ v \in \mathbb{R}^N). \]
As shown in [39], under suitable assumptions on $f$ (for instance, $\int f\,|\log f|\,(1 + |v|^s)\,dv < +\infty$ for some $s > 2$), the function $D(S_t f)$ is continuous with respect to $t$, differentiable for $t > 0$, and goes to 0 as $t \to \infty$. In the sequel, we shall assume that $f$ satisfies these assumptions, and establish (16) for this restricted class of data. Then an approximation argument will imply (16) in the general case. Here is a possible way to implement this approximation argument: without loss of generality, assume that $f \in L \log L$. Then replace $f(v)$ by $f_\delta(v) = f(v)\,e^{-\delta|v|^2}$, which has fast decay at infinity for fixed $\delta$. Then note that
\[ \big(f'_\delta f'_{\delta *} - f_\delta f_{\delta *}\big)\,\log\frac{f'_\delta f'_{\delta *}}{f_\delta f_{\delta *}} = e^{-\delta(|v|^2 + |v_*|^2)}\,(f'f'_* - ff_*)\,\log\frac{f'f'_*}{ff_*}. \]
Therefore, by the monotone convergence theorem, $D(f_\delta) \to D(f)$ as $\delta \to 0$. Of course a priori $f_\delta \notin C_{1,0,1}$, but one can reduce to this case by using (9) with a rescaling depending on $\delta$. Details are tedious but easily filled in.

Next, we state four lemmas in a row; each of them is quite easy to check. The first three are reformulations of tools used in [39], and detailed proofs can be found there.

Lemma 2.3 (Quantitative version of Boltzmann’s angular integration trick).
\[ D(f) \ge \frac{K_B\,|S^{N-1}|}{4}\,\overline{D}(f), \]
where
\[ \overline{D}(f) \equiv \int_{\mathbb{R}^{2N}} \left( ff_* - \frac{1}{|S^{N-1}|}\int_{S^{N-1}} f'f'_*\,d\sigma \right) \log\frac{ff_*}{\dfrac{1}{|S^{N-1}|}\displaystyle\int_{S^{N-1}} f'f'_*\,d\sigma}\;(1 + |v - v_*|^2)\,dv\,dv_*. \]
This lemma is a direct consequence of Jensen’s inequality and the joint convexity of the function $(X, Y) \mapsto (X - Y)(\log X - \log Y)$ on $\mathbb{R}_+ \times \mathbb{R}_+$.

Lemma 2.4 (Averaging symmetries). The function
\[ G(v, v_*) \equiv \frac{1}{|S^{N-1}|}\int_{S^{N-1}} f'f'_*\,d\sigma \]
only depends on
\[ m = v + v_*, \qquad e = \frac{|v|^2 + |v_*|^2}{2}. \]
This lemma, due to Boltzmann himself, is easy to understand: it suffices to note that $G$ only depends on the sphere with radius $|v - v_*|/2$ and center $(v + v_*)/2$. A precise (easy) formulation can be found in [39, Lemma 1]. From now on we shall use the notation
\[ F(v, v_*) = ff_*, \qquad G(v, v_*) = \frac{1}{|S^{N-1}|}\int_{S^{N-1}} f'f'_*\,d\sigma = G(m, e). \]
Moreover, we shall abuse notation by writing $\overline{D}(f) = \overline{D}(F, G)$. Rather than bounding $D$ directly from below, we shall do this for $\overline{D}$.

Lemma 2.5 (The semigroup is compatible with the symmetries). For brevity, let us use the same notation $(S_t)_{t\ge 0}$ for the adjoint Ornstein-Uhlenbeck semigroup in $L^1(\mathbb{R}^N)$ and in $L^1(\mathbb{R}^{2N})$. Then, for any $t \ge 0$,
\[ S_t(ff_*) = (S_t f)(S_t f)_*, \qquad S_t G \text{ depends only on } m \text{ and } e. \]
The first part of this lemma is obvious from the explicit representation of $S_t$ in terms of Gaussian functions. As for the second, it is most easily seen by operating an orthonormal change of coordinates sending $(v, v_*)$ to $(\tilde v, \tilde w) = (v + v_*, v - v_*)/\sqrt{2}$. Then one just has to notice that $S_t[\varphi(\tilde v)\,\psi(\tilde w)] = S_t\varphi(\tilde v)\,S_t\psi(\tilde w)$, and that $S_t$ maps radial functions into radial functions; so if $\psi$ above depends only on $|\tilde w| = |v - v_*|/\sqrt{2}$, then so does $S_t\psi$. In fact a stronger property holds true, see [39, Prop. 2].

In the sequel, we shall use the notation $X = (v, v_*) \in \mathbb{R}^{2N}$. The following lemma is a generalized form of Proposition 5 in [39].

Lemma 2.6 (Nonlinear commutator formula). Let $(S_t)$ be any diffusion semigroup on $\mathbb{R}^d$ with generator $L = \Delta + a(X)\cdot\nabla + b(X)$, acting on $L^1(\mathbb{R}^d)$. Then, whenever $F$ and $G$ are smooth densities on $\mathbb{R}^d$,
\[ \frac{d}{dt}\bigg|_{t=0} \left[ S_t\Big( (F - G)\log\frac{F}{G} \Big) - (S_t F - S_t G)\,\log\frac{S_t F}{S_t G} \right] = \left| \frac{\nabla F}{F} - \frac{\nabla G}{G} \right|^2 (F + G). \tag{18} \]
The proof is by direct computation. For instance, the object to be computed can be written as
\[ L\Big[ (F - G)\log\frac{F}{G} \Big] - (LF - LG)\,\log\frac{F}{G} - (F - G)\left( \frac{LF}{F} - \frac{LG}{G} \right). \]
The terms in $a\cdot\nabla$ cancel out because they represent the action of differentiation operators, the terms in $b$ disappear because $(F - G)\log(F/G)$ is homogeneous of degree 1. Thus only the terms in $\Delta$ remain (this simplifies a little bit the proof in [39]).

From now on, our treatment departs from that in [39]. Let $\psi(v, v_*) = 1 + |v - v_*|^2$. Applying Lemma 2.6 with the adjoint Ornstein-Uhlenbeck semigroup on $\mathbb{R}^{2N}$ (with $a(X) = X$ and $b(X) = 2N$, the divergence of $X$ in $\mathbb{R}^{2N}$), and then integrating with respect to $\psi\,dv\,dv_*$, we find that
\[ -\frac{d}{dt}\bigg|_{t=0} \overline{D}(S_t f) = \int_{\mathbb{R}^{2N}} \psi(X)\left| \frac{\nabla_X F}{F} - \frac{\nabla_X G}{G} \right|^2 (F + G)\,dX - \int_{\mathbb{R}^{2N}} \psi(X)\,L\Big[ (F - G)\log\frac{F}{G} \Big]\,dX \]
if $f$ is smooth enough. Here the notation $\nabla_X$ is used to recall that this is the full gradient with respect to the $2N$ scalar variables $(v, v_*)$. If we introduce the adjoint operator $L^* = \Delta_X - X\cdot\nabla_X$, then we obtain
\[ -\frac{d}{dt}\bigg|_{t=0} \overline{D}(S_t F, S_t G) = \int_{\mathbb{R}^{2N}} \psi(X)\left| \frac{\nabla_X F}{F} - \frac{\nabla_X G}{G} \right|^2 (F + G)\,dX - \int_{\mathbb{R}^{2N}} L^*\psi(X)\,(F - G)\log\frac{F}{G}\,dX. \]
By the semigroup property, for all $t > 0$,
\[ -\frac{d}{dt}\,\overline{D}(S_t F, S_t G) = \int_{\mathbb{R}^{2N}} \psi(X)\left| \frac{\nabla_X S_t F}{S_t F} - \frac{\nabla_X S_t G}{S_t G} \right|^2 (S_t F + S_t G)\,dX - \int_{\mathbb{R}^{2N}} L^*\psi(X)\,(S_t F - S_t G)\log\frac{S_t F}{S_t G}\,dX. \tag{19} \]
To have this formula for $t > 0$ it is not even necessary to assume that $f$ is smooth, since $(S_t)$ has a regularizing effect. By direct computation,
\[ L^*\psi(X) = 4N - 2|v - v_*|^2 \le 4N\,\psi(X). \]
Thus, since $(S_t F - S_t G)\log(S_t F/S_t G) \ge 0$,
\[ \int_{\mathbb{R}^{2N}} L^*\psi(X)\,(S_t F - S_t G)\log\frac{S_t F}{S_t G}\,dX \le 4N\,\overline{D}(S_t F, S_t G). \]
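To summarize, we have arrived at
\[ -\frac{d}{dt}\,\overline{D}(S_t F, S_t G) + 4N\,\overline{D}(S_t F, S_t G) \ge \int_{\mathbb{R}^{2N}} \psi(X)\left| \frac{\nabla_X S_t F}{S_t F} - \frac{\nabla_X S_t G}{S_t G} \right|^2 (S_t F + S_t G)\,dX. \tag{20} \]
This is the same as
\[ -\frac{d}{dt}\Big[ e^{-4Nt}\,\overline{D}(S_t F, S_t G) \Big] \ge e^{-4Nt}\int_{\mathbb{R}^{2N}} \psi(X)\left| \frac{\nabla_X S_t F}{S_t F} - \frac{\nabla_X S_t G}{S_t G} \right|^2 (S_t F + S_t G)\,dX. \]
Integrating this differential inequality with respect to $t$, taking into account $\overline{D}(S_t F, S_t G) \to 0$ as $t \to \infty$ and $\overline{D}(F, G) = \overline{D}(f)$, we obtain
\[ \overline{D}(f) \ge \int_0^{+\infty} e^{-4Nt}\int_{\mathbb{R}^{2N}} \psi(X)\left| \frac{\nabla_X S_t F}{S_t F} - \frac{\nabla_X S_t G}{S_t G} \right|^2 (S_t F + S_t G)\,dX\,dt. \tag{21} \]
Complicated as this expression may seem, it will yield the desired bound.

Now we borrow another lemma from [39]. In the sequel, we shall denote by $\Pi_{z^\perp}$ the orthogonal projection onto the hyperplane which is orthogonal to the vector $z \in \mathbb{R}^N$.

Lemma 2.7. Let $P$ be the $X$-dependent linear operator on $\mathbb{R}^{2N}$,
\[ P : [A, B] \longmapsto \Pi_{(v - v_*)^\perp}[A - B], \]
where $A$ and $B$ stand for the components in $\mathbb{R}^N_v$ and $\mathbb{R}^N_{v_*}$ respectively. To be precise, $P \in L^\infty\big(\mathbb{R}^N_v \times \mathbb{R}^N_{v_*};\ \mathcal{L}(\mathbb{R}^{2N}, \mathbb{R}^N)\big)$, where $\mathcal{L}(\mathbb{R}^{2N}, \mathbb{R}^N)$ stands for the set of linear mappings from $\mathbb{R}^{2N}$ to $\mathbb{R}^N$. Then, as soon as $G$ is a smooth density only depending on $m$ and $e$, one has $P(\nabla_X G) = 0$. As a consequence, whenever $G$ only depends on $m$ and $e$,
\[ \left| \frac{\nabla_X F}{F} - \frac{\nabla_X G}{G} \right|^2 \ge \frac{1}{\|P\|^2_{\mathcal{L}(\mathbb{R}^{2N}, \mathbb{R}^N)}}\left| P\,\frac{\nabla_X F}{F} \right|^2 = \frac12\left| P\,\frac{\nabla_X F}{F} \right|^2. \]
The proof is elementary linear algebra, plus the key observation that
\[ \nabla_X G = \left[ \nabla_m G + v\,\frac{\partial G}{\partial e},\ \ \nabla_m G + v_*\,\frac{\partial G}{\partial e} \right]. \]
Coming back to our entropy production problem, since $S_t G$ depends only on $m$ and $e$ (Lemma 2.5) for all $t > 0$, the action of $P$ will enable one to eliminate this function from the estimates. On the other hand, the action of $P$ on $F(v, v_*) = ff_*$ is quite nice because of the tensor product structure; in fact, for smooth $f$,
\[ \frac{\nabla_X F}{F} = \left[ \frac{\nabla f}{f}(v),\ \frac{\nabla f}{f}(v_*) \right]. \]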
Taking into account the first part of Lemma 2.5, we see that, for all $t > 0$,
\[ \frac{\nabla_X S_t F}{S_t F} = \left[ \frac{\nabla S_t f}{S_t f}(v),\ \frac{\nabla S_t f}{S_t f}(v_*) \right]. \]
Thus when we apply Lemma 2.7 to the right-hand side of (21), we get
\[ \overline{D}(f) \ge \frac12\int_0^{+\infty} e^{-4Nt}\int_{\mathbb{R}^{2N}} (1 + |v - v_*|^2)\left| \Pi_{(v - v_*)^\perp}\left[ \frac{\nabla S_t f}{S_t f} - \Big( \frac{\nabla S_t f}{S_t f} \Big)_* \right] \right|^2 (S_t F + S_t G)\,dv\,dv_*\,dt, \]
in particular
\[ \overline{D}(f) \ge \frac12\int_0^{+\infty} e^{-4Nt}\int_{\mathbb{R}^{2N}} |v - v_*|^2\left| \Pi_{(v - v_*)^\perp}\left[ \frac{\nabla S_t f}{S_t f} - \Big( \frac{\nabla S_t f}{S_t f} \Big)_* \right] \right|^2 (S_t f)(S_t f)_*\,dv\,dv_*\,dt. \tag{22} \]
At this point we recall the following lemma, taken from [26, Theorem 1].

Lemma 2.8 (A strong estimate on Landau’s entropy production functional). Let $f$ be a distribution in $C_{1,0,1}(\mathbb{R}^N)$, and let $T^*(f)$ be the associated maximum directional temperature; assume that $T^*(f) < N$. Define
\[ D_L(f) = \frac12\int_{\mathbb{R}^{2N}} |v - v_*|^2\left| \Pi_{(v - v_*)^\perp}\left[ \frac{\nabla f}{f} - \Big( \frac{\nabla f}{f} \Big)_* \right] \right|^2 ff_*\,dv\,dv_*. \tag{23} \]
Then,
\[ D_L(f) \ge \big(N - T^*(f)\big)\,I(f|M), \]
where
\[ I(f|M) = \int_{\mathbb{R}^N} f\,\left| \nabla\log\frac{f}{M} \right|^2 \]
is the relative Fisher information of $f$ with respect to $M$.

The proof of this lemma is performed by first reducing to the case when $\int f(v)\,v_i v_j\,dv = \delta_{ij}\,T_i$, which can always be achieved because the matrix $\int f\,v \otimes v\,dv$ is nonnegative symmetric. Then one can expand $D_L$ by taking advantage of the quadraticity, and apply the two inequalities
\[ \sum_{ij}\int \frac{\partial_i f\,\partial_j f}{f}\,\big(|v|^2\delta_{ij} - v_i v_j\big)\,dv \ge 0, \qquad \sum_i \alpha_i\int \frac{(\partial_i f)^2}{f}\,dv \ge \sum_i \frac{\alpha_i}{T_i}. \]
The second one is just a variant of Heisenberg’s inequality. We refer to [26] for details. Since $(S_t)$ preserves the class $C_{1,0,1}$, we can use Lemma 2.8 to estimate (22), and find
\[ \overline{D}(f) \ge \int_0^{+\infty} e^{-4Nt}\,\big(N - T^*(S_t f)\big)\,I(S_t f|M)\,dt. \]
We now apply
Lemma 2.9. For all $f\in C_{1,0,1}$, for all $t\ge 0$, $T^*(S_tf)\le T^*(f)$.

This lemma is easily obtained by noting that whenever $e\in S^{N-1}$, then $\int S_tf(v)\,(v\cdot e)^2\,dv$ converges monotonically towards 1, as shown by a simple study of moment behavior. To sum up, at this point we have shown
\[ D(f)\ \ge\ (N-T^*(f))\int_0^{+\infty} e^{-4Nt}\,I(S_tf|M)\,dt. \]
This is the point where we shall make use of the Blachman-Stam inequality [7, 11, 36]. From the representation
\[ S_\tau g = M_{1-e^{-2\tau}} * g_{e^{-2\tau}}, \]
where we use the notation
\[ g_\lambda(v) = \lambda^{-N/2}\,g(v/\sqrt{\lambda}), \]
we deduce that
\[ I(S_\tau g|M)\ \le\ e^{-2\tau}I(g|M) + (1-e^{-2\tau})\,I(M|M) = e^{-2\tau}I(g|M). \qquad (24) \]
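As a sanity check, inequality (24) can be tested on centered Gaussians in dimension N = 1, where everything is explicit. The sketch below assumes that g is the Gaussian of variance `var` and M the standard Gaussian, so that the representation above makes S_τ g a Gaussian of variance e^{-2τ} var + 1 - e^{-2τ}, and I(g|M) = (var - 1)^2 / var. The function names are illustrative and not taken from the paper.

```python
import math


def relative_fisher_gaussian(var):
    """I(g|M) for g = N(0, var), M = N(0, 1), in dimension 1.

    grad log(g/M) = v - v/var, hence I(g|M) = (1 - 1/var)^2 * var.
    """
    return (var - 1.0) ** 2 / var


def ou_variance(var, tau):
    """Variance of S_tau g, read off from S_tau g = M_{1-lam} * g_lam, lam = e^{-2 tau}."""
    lam = math.exp(-2.0 * tau)
    return lam * var + (1.0 - lam)


def both_sides_of_24(var, tau):
    """Return (I(S_tau g|M), e^{-2 tau} I(g|M)) for the Gaussian test case."""
    lhs = relative_fisher_gaussian(ou_variance(var, tau))
    rhs = math.exp(-2.0 * tau) * relative_fisher_gaussian(var)
    return lhs, rhs
```

For Gaussians the ratio of the two sides equals e^{-2τ} var / (e^{-2τ} var + 1 - e^{-2τ}), which is at most 1, so this only illustrates (24) in a special case and says nothing about general densities.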
This inequality can also be proven by elementary $\Gamma_2$ theory, since it is at the basis of the famous Bakry-Emery argument for proving logarithmic Sobolev inequalities [5]. In the present context, we apply (24) with $g=S_tf$ and $\tau=2Nt$, and discover that
\[ I(S_{t+2Nt}f|M)\ \le\ e^{-4Nt}\,I(S_tf|M). \]
In particular,
\[ D(f)\ \ge\ (N-T^*(f))\int_0^{+\infty} I(S_{t+2Nt}f|M)\,dt. \]
Now we change variables in the time-integral, to obtain
\[ D(f)\ \ge\ \frac{N-T^*(f)}{1+2N}\int_0^{+\infty} I(S_tf|M)\,dt. \qquad (25) \]
The proof of Theorem 2.1 will be completed by the use of the following lemma (see [16] for instance):

Lemma 2.10 (Representation formula for the relative information). Whenever $f$ is a density with finite entropy and finite moments up to order 2, then
\[ H(f|M) = \int_0^{+\infty} I(S_tf|M)\,dt. \]
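The representation formula of Lemma 2.10 can be checked numerically in the same one-dimensional Gaussian setting. This is a sketch under stated assumptions: f is taken to be N(0, var), for which H(f|M) = (var - 1 - log var)/2 and I(S_t f|M) = (var_t - 1)^2 / var_t with var_t = e^{-2t} var + 1 - e^{-2t}. The time integral is truncated at `t_max` and computed by a composite Simpson rule; all names are illustrative.

```python
import math


def relative_entropy_gaussian(var):
    """H(f|M) for f = N(0, var), M = N(0, 1), in dimension 1."""
    return 0.5 * (var - 1.0 - math.log(var))


def relative_fisher_along_semigroup(var, t):
    """I(S_t f|M) for the Gaussian f, using the explicit variance of S_t f."""
    lam = math.exp(-2.0 * t)
    var_t = lam * var + (1.0 - lam)
    return (var_t - 1.0) ** 2 / var_t


def integrated_fisher(var, t_max=30.0, steps=60000):
    """Composite Simpson approximation of int_0^{t_max} I(S_t f|M) dt (steps must be even)."""
    h = t_max / steps
    total = relative_fisher_along_semigroup(var, 0.0)
    total += relative_fisher_along_semigroup(var, t_max)
    for k in range(1, steps):
        weight = 4.0 if k % 2 == 1 else 2.0
        total += weight * relative_fisher_along_semigroup(var, k * h)
    return total * h / 3.0
```

The integrand decays like e^{-4t}, so truncating at t_max = 30 discards a negligible tail; combined with (25), the identity gives the entropy production bound with constant (N - T^*(f))/(1 + 2N).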
3. Nonvanishing Collision Kernels

In this section, we shall recover the main result of [39] in a very straightforward way and in an improved form, as a consequence of Theorem 2.1 plus some easy error estimate. The main simplification lies in the fact that we do not have to go through the complicated error estimate arising in the time-integral along the semigroup $S_t$.

Theorem 3.1. Let $B$ satisfy
\[ B(v-v_*,\sigma)\ \ge\ K_B\,(1+|v-v_*|)^{-\beta} \qquad (\beta\ge 0). \qquad (26) \]
Let $f$ be a distribution function satisfying
\[ f(v)\ \ge\ K_0\,e^{-A_0|v|^{q_0}} \quad \forall v\in\mathbb{R}^N \qquad (K_0>0,\ A_0>0,\ q_0\ge 2). \qquad (27) \]
Then, for all $\varepsilon\in(0,1)$ there exists a constant $K_\varepsilon(f)$, depending on $\varepsilon$ and depending on $f$ only through $\rho$, $T$, $q_0$, and upper bounds on $A_0$, $1/K_0$, $\|f\|_{L^1_{2+s}\log L}$, $\|f\|_{L^1_{2+s+q_0}}$ for $s=(2+\beta)/\varepsilon$, such that
\[ D(f)\ \ge\ K_\varepsilon(f)\,H(f|M)^{1+\varepsilon}. \]
For instance, when $\rho=1$, $T=1$, the following constant will do:
\[ K_\varepsilon(f) = \frac{K_B\,|S^{N-1}|\,T_*(f)}{40\cdot 2^{2+\beta}}\ \min\left(\frac{1}{H(f|M)},\ \frac{|S^{N-1}|\,T_*(f)}{2^{s}\cdot 40\,C_\varepsilon(f)}\right)^{\varepsilon}, \qquad (28) \]
where $T_*(f)=\min_{e\in S^{N-1}}\int f(v)(v\cdot e)^2\,dv$ as before, and
\[ C_\varepsilon(f) = 32\cdot 2^{q_0/2}\,|S^{N-1}|\,\Big(1+\log\frac{1}{K_0}+A_0\Big)\,\|f\|_{L^1_{2+s}\log L}\,\|f\|_{L^1_{2+q_0+s}}, \qquad s=\frac{2+\beta}{\varepsilon}. \qquad (29) \]

Remark. This bound is rather crude; we emphasize that there are many possible variants of estimate (29), depending on the norms that one is willing to use. The interest of having good control of the dependence of $C_\varepsilon$ upon the various bounds above is demonstrated in [40].

Proof. First we write
\[ B(v-v_*,\sigma)\ \ge\ \frac{K_B}{(1+R)^{2+\beta}}\,(1+|v-v_*|^2) - \frac{K_B}{(1+R)^{2+\beta}}\,(1+|v-v_*|^2)\,1_{|v-v_*|\ge R}. \]
Accordingly, by Theorem 2.1,
\[ D(f)\ \ge\ \frac{K_B}{(1+R)^{2+\beta}}\left[\frac{|S^{N-1}|}{20}\,T_*(f)\,H(f|M) - \int_{|v-v_*|\ge R}(1+|v-v_*|^2)\,(f'f'_*-ff_*)\log\frac{f'f'_*}{ff_*}\,dv\,dv_*\,d\sigma\right]. \qquad (30) \]
Now we claim that
\[ \int_{|v-v_*|\ge R}(1+|v-v_*|^2)\,(f'f'_*-ff_*)\log\frac{f'f'_*}{ff_*}\,dv\,dv_*\,d\sigma\ \le\ \frac{C_\varepsilon(f)}{(R/2)^s}, \qquad (31) \]
where $C_\varepsilon(f)$ and $s$ are given by (29). If this is true, then Theorem 3.1 follows from (30) with the choice
\[ R = 2\left(\frac{40\,C_\varepsilon(f)}{|S^{N-1}|\,T_*(f)\,H(f|M)}\right)^{1/s}, \]
after a few elementary computations. So the proof will be complete when we have established (31).

Proof of (31). We start with
\[ ff_*\ \ge\ K_0^2\,e^{-A_0(|v|^{q_0}+|v_*|^{q_0})}\ \ge\ K_0^2\,e^{-A_0(|v|^2+|v_*|^2)^{q_0/2}}. \]
Since $|v'|^2+|v'_*|^2=|v|^2+|v_*|^2$, also
\[ f'f'_*\ \ge\ K_0^2\,e^{-A_0(|v|^2+|v_*|^2)^{q_0/2}}. \]
As a consequence, we have the pointwise inequality
\[ (f'f'_*-ff_*)\log\frac{f'f'_*}{ff_*}\ \le\ f'f'_*\log(f'f'_*) + ff_*\log(ff_*) + (f'f'_*+ff_*)\left[\log\frac{1}{K_0^2}+A_0(|v|^2+|v_*|^2)^{q_0/2}\right]. \]
If we integrate this with respect to $(1+|v-v_*|^2)\,dv\,dv_*\,d\sigma$, and then use the pre-postcollisional change of variables
\[ \Big(v,\ v_*,\ \sigma=\frac{v'-v'_*}{|v'-v'_*|}\Big)\ \longleftrightarrow\ \Big(v',\ v'_*,\ \frac{v-v_*}{|v-v_*|}\Big), \]
which has unit Jacobian, together with the equality $|v'-v'_*|=|v-v_*|$, we obtain
\[ \int_{|v-v_*|\ge R}(1+|v-v_*|^2)\,(f'f'_*-ff_*)\log\frac{f'f'_*}{ff_*}\,dv\,dv_*\,d\sigma \]
\[ \le\ 2\int_{|v-v_*|\ge R}(1+|v-v_*|^2)\,ff_*\log(ff_*)\,dv\,dv_*\,d\sigma + 2\int_{|v-v_*|\ge R}(1+|v-v_*|^2)\,ff_*\left[\log\frac{1}{K_0^2}+A_0(|v|^2+|v_*|^2)^{q_0/2}\right]dv\,dv_*\,d\sigma. \]
By using the inequality $(X+Y)^{q/2}\le 2^{\frac q2-1}(X^{q/2}+Y^{q/2})$ for $q\ge 2$, and simple symmetry tricks, we crudely bound this expression by
\[ 4|S^{N-1}|\int_{|v-v_*|\ge R}(1+|v-v_*|^2)\,ff_*\log f\,dv\,dv_* + 2^{q_0/2}\,|S^{N-1}|\Big(\log\frac{1}{K_0^2}+A_0\Big)\int_{|v-v_*|\ge R}(1+|v-v_*|^2)\Big[(1+|v|^2)^{q_0/2}+(1+|v_*|^2)^{q_0/2}\Big]ff_*\,dv\,dv_*. \qquad (32) \]
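Now we use the inclusion
\[ \big\{|v-v_*|\ge R\big\}\ \subset\ \big\{|v|\ge R/2 \text{ and } |v_*|\le|v|\big\}\ \cup\ \big\{|v_*|\ge R/2 \text{ and } |v|\le|v_*|\big\}. \qquad (33) \]
Accordingly, the first term in (32) can be bounded by
\[ 4|S^{N-1}|\left[\int_{|v|\ge R/2}(1+4|v|^2)\,ff_*\log f\,dv\,dv_* + \int_{|v_*|\ge R/2}(1+4|v_*|^2)\,ff_*\log f\,dv\,dv_*\right] \]
\[ \le\ 16|S^{N-1}|\left[\|f\|_{L^1}\int_{|v|\ge R/2}(1+|v|^2)\,f\log f\,dv + \|f\|_{L^1\log L}\int_{|v_*|\ge R/2} f_*\,(1+|v_*|^2)\,dv_*\right] \]
\[ \le\ 16|S^{N-1}|\ \frac{\|f\|_{L^1}\,\|f\|_{L^1_{2+s}\log L} + \|f\|_{L^1\log L}\,\|f\|_{L^1_{2+s}}}{(R/2)^s}. \]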
The second term in (32) is even easier to bound: indeed, by (33), use of symmetry and application of Fubini’s theorem,
\[ \int_{|v-v_*|\ge R}(1+|v-v_*|^2)\Big[(1+|v|^2)^{q_0/2}+(1+|v_*|^2)^{q_0/2}\Big]ff_*\,dv\,dv_* \]
\[ \le\ 4\int_{|v|\ge R/2}(1+4|v|^2)(1+|v|^2)^{q_0/2}\,ff_*\,dv\,dv_*\ \le\ 16\,\|f\|_{L^1}\int_{|v|\ge R/2}(1+|v|^2)^{\frac{2+q_0}{2}}f(v)\,dv\ \le\ 16\,\frac{\|f\|_{L^1}\,\|f\|_{L^1_{2+q_0+s}}}{(R/2)^s}. \]
Putting all these bounds together, one easily concludes that (31) holds.
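The "few elementary computations" that turn (30) and (31) into the constant (28) can be spot-checked numerically. In the sketch below, `a` stands for the product |S^{N-1}| T_*(f), `h` for H(f|M) and `c_eps` for C_ε(f); these are treated as arbitrary positive inputs, and the function names are illustrative rather than taken from the paper.

```python
def choose_radius(a, h, c_eps, s):
    """R = 2 * (40 C_eps / (a h))^(1/s), the choice made in the proof of Theorem 3.1."""
    return 2.0 * (40.0 * c_eps / (a * h)) ** (1.0 / s)


def bracket_in_30(a, h, c_eps, s):
    """a h / 20 - C_eps / (R/2)^s for the radius above; it should equal a h / 40."""
    r = choose_radius(a, h, c_eps, s)
    return a * h / 20.0 - c_eps / (r / 2.0) ** s


def lower_bound_from_30(k_b, a, h, c_eps, beta, eps):
    """Right-hand side of (30) after inserting (31) and the chosen radius."""
    s = (2.0 + beta) / eps
    r = choose_radius(a, h, c_eps, s)
    return k_b / (1.0 + r) ** (2.0 + beta) * bracket_in_30(a, h, c_eps, s)


def k_eps(k_b, a, h, c_eps, beta, eps):
    """The constant of (28), written in terms of a = |S^{N-1}| T_*(f)."""
    s = (2.0 + beta) / eps
    prefactor = k_b * a / (40.0 * 2.0 ** (2.0 + beta))
    return prefactor * min(1.0 / h, a / (2.0 ** s * 40.0 * c_eps)) ** eps
```

The comparison relies only on (1 + R) ≤ 2 max(1, R) and on s ε = 2 + β, so the bound from (30) dominates K_ε(f) H(f|M)^{1+ε} whether the chosen R is smaller or larger than 1.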
4. General Collision Kernels

In this section, we establish the following theorem, which generalizes Theorem 3.1 to the case when the collision kernel is allowed to vanish for $v=v_*$ like a power law, under a Sobolev regularity assumption for the distribution function.

Theorem 4.1. Let $B$ satisfy
\[ B(v-v_*,\sigma)\ \ge\ K_B\,\min\big(|v-v_*|^{\gamma},\ |v-v_*|^{-\beta}\big) \qquad (\beta,\gamma\ge 0), \qquad (34) \]
and let $D$ be the associated $H$-dissipation functional. Let $f$ be a distribution function such that
\[ \forall v\in\mathbb{R}^N,\qquad f(v)\ \ge\ K_0\,e^{-A_0|v|^{q_0}} \qquad (K_0>0,\ A_0>0,\ q_0\ge 2). \qquad (35) \]
Then, for any $\varepsilon\in(0,1)$ there exists a constant $\overline{K}_\varepsilon(f)$, depending on $f$ only through $\rho$, $T$, $q_0$, and upper bounds on $A_0$, $1/K_0$, $\|f\|_{L^1_s}$ and $\|f\|_{H^k}$, where $s=s(\varepsilon,q_0,\beta,\gamma)$ and $k=k(\varepsilon,s,\beta,\gamma)$, such that
\[ D(f)\ \ge\ \overline{K}_\varepsilon(f)\,H(f|M^f)^{1+\varepsilon}. \]
For instance, when ρ = 1, T = 1, the following constant will do: Kε0 (f )1+4γ /N 1 K ε (f ) = KB K(N, ε, β, γ ) min ,1 , H (f |M)ε C ε0 (f )4γ /N where ε0 = ε/(1 + 4γ /N ); Kε0 (f ) is defined by (28), in which ε is replaced by ε0 ; 3 C ε0 (f ) = 1 + log fK L0 ∞ + A0 × f L1
2q0 +s
s = q0
η log L f L log L f L1q
0 η +2
2 −1 , ε0
k=
2η
log L
(1 + f L2 )(1 + f H k ),
2(N + 1) , ε0
η=
ε0 . 2 − ε0
We have written down this bound so that the reader can have an idea of the orders of the constants involved in Theorem 4.1; however, we insist that this is a quite crude estimate which admits many possible refinements.

Proof. Without loss of generality, we assume $\rho=1$, $u=0$, $T=1$, so $M^f=M$. Throughout the proof, we shall denote by $C$ and $K$ various numerical constants only depending on $N$, $\varepsilon$, $\beta$ and $\gamma$. Our starting point is the natural splitting according to whether $|v-v_*|$ is small or large. Let $B_0(v-v_*,\sigma)=(1+|v-v_*|)^{-\beta}$, and let $D_0$ be the associated $H$-dissipation functional. By assumption,
\[ B(v-v_*,\sigma)\ \ge\ K_B\,K(\beta,\gamma)\,\min(|v-v_*|^{\gamma},1)\,B_0(v-v_*,\sigma). \]
Since $D$ is a monotone function of $B$, we can assume without loss of generality that $B(v-v_*,\sigma)=\min(|v-v_*|^{\gamma},1)\,B_0(v-v_*,\sigma)$. So, for all $\delta\in(0,1)$, taking into account $B_0\le 1$ we have
\[ B(v-v_*,\sigma)\ \ge\ \delta^{\gamma}\big[B_0(v-v_*,\sigma) - 1_{|v-v_*|\le\delta}\big]. \]
Hence,
\[ D(f)\ \ge\ \delta^{\gamma}\big[D_0(f)-\overline{D}_\delta(f)\big], \qquad (36) \]
where we have set
\[ \overline{D}_\delta(f) = \frac14\int_{\mathbb{R}^{2N}\times S^{N-1}} (f'f'_*-ff_*)\log\frac{f'f'_*}{ff_*}\ 1_{|v-v_*|\le\delta}\,dv\,dv_*\,d\sigma. \qquad (37) \]
Let $\varepsilon_0>0$, to be specified later. This will be an auxiliary small parameter, of the same order as $\varepsilon$.
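The pointwise splitting of the kernel that leads to (36) is easy to spot-check. The sketch below works with the scalar r = |v - v_*| and assumes the normalized kernel B = min(r^γ, 1) B_0 with B_0 = (1 + r)^{-β}, as in the proof; the helper names are illustrative.

```python
def kernel(r, gamma, beta):
    """B = min(r^gamma, 1) * B0, with B0 = (1 + r)^(-beta) and r = |v - v_*|."""
    return min(r ** gamma, 1.0) * (1.0 + r) ** (-beta)


def splitting_lower_bound(r, delta, gamma, beta):
    """delta^gamma * (B0 - indicator of {r <= delta})."""
    b0 = (1.0 + r) ** (-beta)
    indicator = 1.0 if r <= delta else 0.0
    return delta ** gamma * (b0 - indicator)


def splitting_gap(r, delta, gamma, beta):
    """Nonnegative whenever the splitting inequality holds at this point."""
    return kernel(r, gamma, beta) - splitting_lower_bound(r, delta, gamma, beta)
```

For r ≤ δ the lower bound is nonpositive because B_0 ≤ 1, and for r > δ one has min(r^γ, 1) ≥ δ^γ since δ < 1; these are the only two facts the inequality uses.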
From Theorem 3.1 we know that there exists a constant $K_{\varepsilon_0}(f)$, depending on positivity, smoothness and moment bounds for $f$, such that
\[ D_0(f)\ \ge\ K_{\varepsilon_0}(f)\,H(f|M)^{1+\varepsilon_0}. \]
So far there is nothing new: upon replacement of the bounds in [13] by those in [39], this is the strategy used by Carlen and Carvalho [13] to get rid of small relative velocities. The “standard” way to conclude would be to prove an estimate like $\overline{D}_\delta(f)=O(\delta^{\kappa})$ for some $\kappa>0$, then optimize in (36) to get a lower bound like $D(f)\ge K\,H(f|M)^{\alpha}$ for some $\alpha>1$. But here, we want to keep the exponent arbitrarily close to 1, so we have to be more clever. The main difficulty will be to show that not only does $\overline{D}_\delta(f)$ vanish as $\delta\to 0$, but also it vanishes up to (almost) order 1 in $H(f|M)$. To state it in formulas, we wish to prove
\[ \overline{D}_\delta(f)\ \le\ \overline{C}_{\varepsilon_0}(f)\,H(f|M)^{1-\varepsilon_0}\,\delta^{\kappa}. \qquad (38) \]
Note that here $\varepsilon_0$ is arbitrarily small, while $\kappa$ is fixed and does not go to 0 as $\varepsilon_0\to 0$. If we plug (38) into (36), we find
\[ D(f)\ \ge\ \delta^{\gamma}\,K_{\varepsilon_0}(f)\,H(f|M)^{1+\varepsilon_0}\left[1-\frac{\delta^{\kappa}\,\overline{C}_{\varepsilon_0}(f)}{K_{\varepsilon_0}(f)\,H(f|M)^{2\varepsilon_0}}\right]. \]
Then the choice
\[ \delta = \left(\frac{K_{\varepsilon_0}(f)\,H(f|M)^{2\varepsilon_0}}{2\,\overline{C}_{\varepsilon_0}(f)}\right)^{1/\kappa} \]
leads to
\[ D(f)\ \ge\ K\cdot K_{\varepsilon_0}(f)^{1+\frac{\gamma}{\kappa}}\ \overline{C}_{\varepsilon_0}(f)^{-\frac{\gamma}{\kappa}}\ H(f|M)^{1+\varepsilon_0\left(1+\frac{2\gamma}{\kappa}\right)}, \]
which is what we are looking for (with $\varepsilon=\varepsilon_0(1+(2\gamma/\kappa))$). So the proof will be complete when we have shown the theorem below.

Theorem 4.2. Let $f\in C_{1,0,1}$. For any $\varepsilon_0\in(0,1)$ there exists a finite constant $\overline{C}_{\varepsilon_0}(f)$, depending on $f$ only via estimates of positivity, decay and smoothness, such that
\[ \forall\delta\in(0,1),\qquad \overline{D}_\delta(f)\ \le\ \overline{C}_{\varepsilon_0}(f)\,H(f|M)^{1-\varepsilon_0}\,\delta^{N/4}. \]
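Before turning to the proof, the optimization in δ that reduces Theorem 4.1 to an estimate of the form (38) can be verified numerically. The sketch below treats K_{ε0}(f), the constant in (38), and H(f|M) as plain positive numbers `k`, `c_bar`, `h`; it makes explicit that the unspecified numerical constant K in the final bound can be taken equal to 2^{-(1+γ/κ)}. It does not enforce δ ∈ (0, 1), which has to be checked separately. Names are illustrative.

```python
def optimal_delta(k, c_bar, h, eps0, kappa):
    """delta = (K H^{2 eps0} / (2 C))^{1/kappa}, the choice made after (38)."""
    return (k * h ** (2.0 * eps0) / (2.0 * c_bar)) ** (1.0 / kappa)


def bound_from_36_and_38(delta, k, c_bar, h, eps0, gamma, kappa):
    """delta^gamma * (K H^{1+eps0} - C H^{1-eps0} delta^kappa)."""
    return delta ** gamma * (k * h ** (1.0 + eps0) - c_bar * h ** (1.0 - eps0) * delta ** kappa)


def closed_form_bound(k, c_bar, h, eps0, gamma, kappa):
    """2^{-(1+g/k)} K^{1+g/k} C^{-g/k} H^{1 + eps0 (1 + 2 g/k)}, with g/k = gamma/kappa."""
    ratio = gamma / kappa
    exponent = 1.0 + eps0 * (1.0 + 2.0 * ratio)
    return 2.0 ** (-(1.0 + ratio)) * k ** (1.0 + ratio) * c_bar ** (-ratio) * h ** exponent
```

With κ = N/4 as in Theorem 4.2, the exponents of K_{ε0}(f) and of the constant in (38) become 1 + 4γ/N and -4γ/N, matching the powers that appear in the constant of Theorem 4.1.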
Proof of Theorem 4.2. The proof is not so long but quite tricky. We divide it into five steps.

Step 1. Introduction of the Maxwellian distribution $M$ by force. We shall use the following elementary inequality:

Lemma 4.3. Let $X$, $Y$, $Z$ be three positive real numbers. Then
\[ (X-Z)\log\frac{X}{Z}\ \le\ C\,\max\Big(1,\ \log\frac{X}{Z},\ \log\frac{Z}{X}\Big)\left[(X-Y)\log\frac{X}{Y} + (Y-Z)\log\frac{Y}{Z}\right], \]
where $C$ is a numerical, computable constant.
Before we prove this lemma, let us explain how we shall use it. Choose
\[ X=f'f'_*,\qquad Z=ff_*,\qquad Y=MM_*=M'M'_*. \]
From the assumption on $f$ we have
\[ \|f\|_{L^\infty}\equiv C_0\ \ge\ f\ \ge\ K_0\,e^{-A_0|v|^{q_0}}, \]
so
\[ C_0^2\ \ge\ ff_*\ \ge\ K_0^2\,e^{-A_0(|v|^{q_0}+|v_*|^{q_0})}. \]
Just as in the proof of Theorem 3.1,
\[ C_0^2\ \ge\ f'f'_*\ \ge\ K_0^2\,e^{-A_0(|v|^2+|v_*|^2)^{q_0/2}}. \]
Thus
\[ \max\Big(1,\ \log\frac{X}{Z},\ \log\frac{Z}{X}\Big)\ \le\ 2\Big(1+\log\frac{C_0}{K_0}+A_0(|v|^2+|v_*|^2)^{q_0/2}\Big). \qquad (39) \]
Applying Lemma 4.3, we deduce
\[ (f'f'_*-ff_*)\log\frac{f'f'_*}{ff_*}\ \le\ C\Big(1+\log\frac{C_0}{K_0}+A_0(|v|^2+|v_*|^2)^{q_0/2}\Big)\left[(f'f'_*-M'M'_*)\log\frac{f'f'_*}{M'M'_*} + (ff_*-MM_*)\log\frac{ff_*}{MM_*}\right] \]
\[ =\ C\Big(1+\log\frac{C_0}{K_0}+A_0(|v'|^2+|v'_*|^2)^{q_0/2}\Big)(f'f'_*-M'M'_*)\log\frac{f'f'_*}{M'M'_*} + C\Big(1+\log\frac{C_0}{K_0}+A_0(|v|^2+|v_*|^2)^{q_0/2}\Big)(ff_*-MM_*)\log\frac{ff_*}{MM_*}, \qquad (40) \]
by energy conservation again. We next multiply (40) by $1_{|v-v_*|\le\delta}$ and integrate with respect to $d\sigma\,dv\,dv_*$. Using the pre-postcollisional change of variables and the equality $|v'-v'_*|=|v-v_*|$ again, we end up with
\[ \overline{D}_\delta(f)\ \le\ C\Big(1+\log\frac{C_0}{K_0}+A_0\Big)\int_{|v-v_*|\le\delta}(ff_*-MM_*)\log\frac{ff_*}{MM_*}\,(1+|v|^2+|v_*|^2)^{q_0/2}\,dv\,dv_*\,d\sigma \qquad (41) \]
\[ =\ C\,|S^{N-1}|\Big(1+\log\frac{C_0}{K_0}+A_0\Big)\int_{|v-v_*|\le\delta}(ff_*-MM_*)\log\frac{ff_*}{MM_*}\,(1+|v|^2+|v_*|^2)^{q_0/2}\,dv\,dv_*, \qquad (42) \]
where $C$ is a numerical constant. This is the end of Step 1. Before going further, we shall give the proof of Lemma 4.3.
Proof of Lemma 4.3. By symmetry and homogeneity, we may assume without loss of generality that $X\ge Z$ and $Y=1$. Then what we wish to prove is
\[ (X-Z)\log\frac{X}{Z}\ \le\ C\,\max\Big(1,\ \log\frac{X}{Z}\Big)\big[(X-1)\log X + (Z-1)\log Z\big]. \]
We distinguish six cases.

Case 1. $\frac12\le X\le 4$, $\frac{1}{10}\le Z\le 4$. Then
\[ (X-Z)\log\frac{X}{Z}\ \le\ C(X-Z)^2\ \le\ C\big[(X-1)^2+(Z-1)^2\big]\ \le\ C\big[(X-1)\log X+(Z-1)\log Z\big]. \]
Case 2. $\frac12\le X\le 4$, $Z<\frac{1}{10}$. Then
\[ (X-Z)\log\frac{X}{Z}\ \le\ C\log\frac{1}{Z}\ \le\ C(Z-1)\log Z. \]
Case 3. $X\le\frac12$, $X\ge Z\ge X/2$. Then
\[ (X-Z)\log\frac{X}{Z}\ \le\ C\ \le\ C(X-1)\log X. \]
Case 4. $X\le\frac12$, $Z<X/2$. Then
\[ (X-Z)\log\frac{X}{Z}\ \le\ X\log\frac{X}{Z} = X\log X + X\log\frac{1}{Z}\ \le\ C + C\log\frac{1}{Z}\ \le\ C(Z-1)\log Z. \]
Case 5. $X\ge 4$, $X/2\le Z\le X$. Then
\[ (X-Z)\log\frac{X}{Z}\ \le\ C(X-Z)\ \le\ CX\ \le\ C(X-1)\log X. \]
Case 6. $X\ge 4$, $Z\le X/2$. Then
\[ (X-Z)\log\frac{X}{Z}\ \le\ X\log\frac{X}{Z}\ \le\ C\,\log\frac{X}{Z}\ (X-1)\log X. \]
This concludes the proof of the lemma.
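The lemma only asserts the existence of a numerical constant C. A rough feeling for its size can be obtained by evaluating the ratio of the two sides on a logarithmic grid. The sketch below is a numerical experiment, not a proof: it normalizes Y = 1 (allowed by homogeneity), and the grid parameters and function names are arbitrary choices.

```python
import math


def lemma_4_3_ratio(x, y, z):
    """Left-hand side of Lemma 4.3 divided by its right-hand side taken with C = 1."""
    lhs = (x - z) * math.log(x / z)
    factor = max(1.0, abs(math.log(x / z)))
    rhs = factor * ((x - y) * math.log(x / y) + (y - z) * math.log(y / z))
    if rhs == 0.0:
        # rhs vanishes only when x = y = z, in which case lhs vanishes too
        return 0.0
    return lhs / rhs


def worst_ratio_on_grid(half_points=30, step=0.2):
    """Largest ratio over X = e^{i*step}, Z = e^{j*step}, |i|, |j| <= half_points, with Y = 1."""
    worst = 0.0
    for i in range(-half_points, half_points + 1):
        for j in range(-half_points, half_points + 1):
            x = math.exp(i * step)
            z = math.exp(j * step)
            worst = max(worst, lemma_4_3_ratio(x, 1.0, z))
    return worst
```

On the symmetric points X = e^{a}, Z = e^{-a} the ratio equals 2 / max(1, 2a), so any admissible constant C is at least 2; the largest values on such grids occur for X and Z on opposite sides of Y with log(X/Z) close to 1.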
Step 2. Tail estimates. Let R > 0, and ff∗ (ff∗ − MM∗ ) log (1 + |v|2 + |v∗ |2 )q0 /2 dv dv∗ . ER (f ) = MM∗ |v|2 +|v∗ |2 ≥R 2 /2 By a proof strictly analogous to the one in Sect. 3, for all s > 0 one can establish the estimate
f L1 log L
f L1 q0 q0 +s q0 +s +1 2 ER (f ) ≤ 2
f L1 +
f L log L s (R/2) (R/2)s
M L1 log L
M L1 q0 +s q0 +s + M L1 +
M L log L (R/2)s (R/2)s
f L1
M L1 1 2q0 +s 2q0 +s q0 +2 log 2 + A0
f L1 +
M L1 , (43) (R/2)s (R/2)s K 0
where $\overline{K}_0=\inf(K_0,(2\pi)^{-N/2})$, $\overline{A}_0=\max(A_0,1/2)$. We abbreviate this by
\[ E_R(f)\ \le\ \frac{\overline{C}_s(f)}{R^s}, \]
where $\overline{C}_s(f)=C\,(1+\log\overline{K}_0^{-1}+\overline{A}_0)\,\|f\|_{L^1_{2q_0+s}\log L}\,\|f\|_{L\log L}$. If we now plug this estimate into (41), we find
\[ \overline{D}_\delta(f)\ \le\ C\Big(1+\log\frac{1}{K_0}+A_0\Big)\left[\Big(1+\frac{R^2}{2}\Big)^{q_0/2}\int_{|v-v_*|\le\delta}(ff_*-MM_*)\log\frac{ff_*}{MM_*}\,dv\,dv_* + \frac{\overline{C}_s(f)}{R^s}\right]. \]
A proper choice of $R$ shows that
\[ \overline{D}_\delta(f)\ \le\ C\Big(1+\log\frac{1}{K_0}+A_0\Big)\,\overline{C}_s(f)\,\max\left\{\left(\int_{|v-v_*|\le\delta}(ff_*-MM_*)\log\frac{ff_*}{MM_*}\,dv\,dv_*\right)^{\frac{s}{s+q_0}},\ \int_{|v-v_*|\le\delta}(ff_*-MM_*)\log\frac{ff_*}{MM_*}\,dv\,dv_*\right\}. \]
We shall choose $s$ in such a way that
\[ \frac{s}{s+q_0} = 1-\frac{\varepsilon_0}{2}. \]
So all we have to prove is the following: for all $\eta\in(0,1)$, there exists $C_\eta(f)$, only depending on suitable estimates on $f$, such that
\[ \int_{|v-v_*|\le\delta}(ff_*-MM_*)\log\frac{ff_*}{MM_*}\,dv\,dv_*\ \le\ C_\eta(f)\,\delta^{N/2}\,H(f|M)^{1-\eta}. \qquad (44) \]
Then the proof of Theorem 4.2 will be complete upon choosing $\eta=\varepsilon_0/(2-\varepsilon_0)$, so that $(1-\eta)(1-\varepsilon_0/2)=1-\varepsilon_0$.

Step 3. Localization to physical space. This step is crucial. The error term appearing in the left-hand side of (44) has the form of an integral over a thin strip in $\mathbb{R}^{2N}$; and we wish to replace it by integrals over small balls in $\mathbb{R}^N$. For this we proceed as follows. First, by symmetry,
\[ \int_{|v-v_*|\le\delta}(ff_*-MM_*)\log\frac{ff_*}{MM_*}\,dv\,dv_* = 2\int_{|v-v_*|\le\delta}(ff_*-MM_*)\log\frac{f}{M}\,dv\,dv_*. \]
Next, we rewrite the integrand as
\[ (ff_*-MM_*)\log\frac{f}{M} = f_*\,(f-M)\log\frac{f}{M} - (f_*-M_*)\Big(M\log\frac{M}{f}-M+f\Big) + (f_*-M_*)(f-M). \qquad (45) \]
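Note that in this decomposition we have carefully isolated the dominant terms in the form of nonnegative expressions! This will be crucial to avoid destroying the structure of the integrand. A less careful decomposition would lead to the appearance of terms like $f|\log f|$, at the level of (47) for instance, and then we would be lost for the last steps of the proof. As a consequence, we obtain
\[ \int_{|v-v_*|\le\delta}(ff_*-MM_*)\log\frac{ff_*}{MM_*}\,dv\,dv_* = 2\int_{|v-v_*|\le\delta} f_*\,(f-M)\log\frac{f}{M}\,dv\,dv_* - 2\int_{|v-v_*|\le\delta}(f_*-M_*)\Big(M\log\frac{M}{f}-M+f\Big)dv\,dv_* + 2\int_{|v-v_*|\le\delta}(f_*-M_*)(f-M)\,dv\,dv_*. \qquad (46) \]
We shall estimate separately the three terms appearing in (46). First of all,
\[ \int_{\mathbb{R}^{2N}} f_*\,1_{|v-v_*|\le\delta}\,(f-M)\log\frac{f}{M}\,dv\,dv_*\ \le\ \left(\sup_{v\in\mathbb{R}^N}\int_{\mathbb{R}^N} f_*\,1_{|v-v_*|\le\delta}\,dv_*\right)\int_{\mathbb{R}^N}(f-M)\log\frac{f}{M}\,dv. \]
By Cauchy-Schwarz inequality, for any $v\in\mathbb{R}^N$,
\[ \int_{\mathbb{R}^N} f_*\,1_{|v-v_*|\le\delta}\,dv_*\ \le\ \|f\|_{L^2}\sqrt{\int_{\mathbb{R}^N}1_{|v-v_*|\le\delta}\,dv_*}\ \le\ |B^N|^{\frac12}\,\|f\|_{L^2}\,\delta^{N/2}, \]
where $|B^N|$ stands for the volume of the unit ball in $\mathbb{R}^N$. Thus,
\[ \int_{\mathbb{R}^{2N}} f_*\,1_{|v-v_*|\le\delta}\,(f-M)\log\frac{f}{M}\,dv\,dv_*\ \le\ |B^N|^{\frac12}\,\|f\|_{L^2}\,\delta^{N/2}\int_{\mathbb{R}^N}(f-M)\log\frac{f}{M}\,dv. \qquad (47) \]
We next treat similarly the second term in (46):
\[ \left|\int_{|v-v_*|\le\delta}(f_*-M_*)\Big(M\log\frac{M}{f}-M+f\Big)dv\,dv_*\right|\ \le\ \left(\sup_{v\in\mathbb{R}^N}\int_{\mathbb{R}^N}|f_*-M_*|\,1_{|v-v_*|\le\delta}\,dv_*\right)\int_{\mathbb{R}^N}\Big(M\log\frac{M}{f}-M+f\Big)dv \]
\[ \le\ |B^N|^{\frac12}\,\big(\|f\|_{L^2}+\|M\|_{L^2}\big)\,\delta^{N/2}\int_{\mathbb{R}^N}\Big(M\log\frac{M}{f}-M+f\Big)dv. \]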
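As for the last term in (46), we have
\[ \left|\int_{|v-v_*|\le\delta}(f_*-M_*)(f-M)\,dv\,dv_*\right|\ \le\ \int_{\mathbb{R}^N}|f-M|\,dv\ \sup_{v\in\mathbb{R}^N}\int_{\mathbb{R}^N}1_{|v-v_*|\le\delta}\,|f_*-M_*|\,dv_*\ \le\ |B^N|^{\frac12}\,\delta^{N/2}\,\|f-M\|_{L^2}\,\|f-M\|_{L^1}. \]
Putting all previous estimates together, we end up with
\[ \int_{|v-v_*|\le\delta}(ff_*-MM_*)\log\frac{ff_*}{MM_*}\,dv\,dv_*\ \le\ 2|B^N|^{\frac12}\big(\|f\|_{L^2}+\|M\|_{L^2}+1\big)\,\delta^{N/2}\left[\int(f-M)\log\frac{f}{M} + \int\Big(M\log\frac{M}{f}-M+f\Big) + \|f-M\|_{L^2}\|f-M\|_{L^1}\right]. \qquad (48) \]
Since $\int f=\int M$, this can be rewritten in terms of relative informations:
\[ \int_{|v-v_*|\le\delta}(ff_*-MM_*)\log\frac{ff_*}{MM_*}\,dv\,dv_*\ \le\ 2|B^N|^{\frac12}\big(\|f\|_{L^2}+\|M\|_{L^2}+1\big)\,\delta^{N/2}\,\Big[H(f|M)+H(M|f)+\|f-M\|_{L^2}\|f-M\|_{L^1}\Big]. \qquad (49) \]
This is the end of Step 3. To conclude the proof of (44), it will suffice to show that
\[ \|f-M\|_{L^2}\,\|f-M\|_{L^1}\ \le\ C_\eta(f)\,H(f|M)^{1-\eta}, \qquad (50) \]
\[ H(M|f)\ \le\ C_\eta(f)\,H(f|M)^{1-\eta}. \qquad (51) \]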
These inequalities will be proven in Steps 4 and 5 respectively.

Step 4. Interpolations. In this step we prove (50). The idea is to reduce to $L^1$ norms in order to apply the Csiszár-Kullback-Pinsker inequality,
\[ \|f-M\|_{L^1}\ \le\ \sqrt{2H(f|M)}. \qquad (52) \]
So we use the following interpolation lemma:

Lemma 4.4. Let $u$ be a smooth function on $\mathbb{R}^N$; then, for all $\theta\in(0,1)$ there is a numeric constant $C_\theta=C(\theta,N)$ such that
\[ \|u\|_{L^2(\mathbb{R}^N)}\ \le\ C_\theta\,\|u\|^{1-\theta}_{L^1(\mathbb{R}^N)}\,\|u\|^{\theta}_{H^k(\mathbb{R}^N)},\qquad k=\frac{N+1}{2\theta}. \qquad (53) \]
Remark. Any $k>N(1-\theta)/(2\theta)$ would do.
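Once this lemma is proven, inequality (50) will follow by choosing $\theta=2\eta$ and applying (52).

Proof of Lemma 4.4. Let $\hat{u}$ stand for the Fourier transform of $u$:
\[ \hat{u}(\xi) = \int_{\mathbb{R}^N} e^{-ix\cdot\xi}\,u(x)\,dx. \]
Then,
\[ \|u\|^2_{L^2} = \frac{1}{(2\pi)^N}\|\hat{u}\|^2_{L^2} = \frac{1}{(2\pi)^N}\int_{\mathbb{R}^N}\frac{|\hat{u}(\xi)|^2}{(1+|\xi|)^{(N+1)(1-\theta)}}\,(1+|\xi|)^{(N+1)(1-\theta)}\,d\xi \]
\[ \le\ \frac{1}{(2\pi)^N}\left(\int_{\mathbb{R}^N}\frac{|\hat{u}(\xi)|^2}{(1+|\xi|)^{N+1}}\,d\xi\right)^{1-\theta}\left(\int_{\mathbb{R}^N}|\hat{u}(\xi)|^2\,(1+|\xi|)^{\frac{(N+1)(1-\theta)}{\theta}}\,d\xi\right)^{\theta} \]
thanks to Hölder’s inequality. Then,
\[ \|u\|^2_{L^2}\ \le\ C(N)\Big(\sup_{\xi\in\mathbb{R}^N}|\hat{u}(\xi)|\Big)^{2(1-\theta)}\left(\int_{\mathbb{R}^N}\frac{d\xi}{(1+|\xi|)^{N+1}}\right)^{1-\theta}\left(\int_{\mathbb{R}^N}|\hat{u}(\xi)|^2\,(1+|\xi|)^{\frac{N+1}{\theta}}\,d\xi\right)^{\theta}\ \le\ C(\theta,N)\,\|u\|^{2(1-\theta)}_{L^1}\,\|u\|^{2\theta}_{H^{\frac{N+1}{2\theta}}}. \]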
In the last inequality we have used the well-known representation of Sobolev norms by Fourier transform. This concludes the proof.

Step 5. From $H(M|f)$ to $H(f|M)$. In this last step, we prove inequality (51). For this we shall again use positivity and moment estimates. First we write
\[ H(M|f) = \int M\log\frac{M}{f} = \int\Big(\frac{f}{M}-\log\frac{f}{M}-1\Big)M\,dv. \]
In this way we shall be able to take advantage of the quadratic behavior of the function $X-\log X-1$.

Lemma 4.5. There exists a numeric constant $C_L$ such that for all $X>0$,
\[ X-\log X-1\ \le\ C_L\,\max\Big(1,\ \log\frac{1}{X}\Big)\,(X\log X-X+1). \]
Proof. The proof is immediate if one separates the three cases $X\le 1/2$, $1/2\le X\le 2$ and $X\ge 2$.

From this lemma we deduce
\[ H(M|f)\ \le\ C_L\int\max\Big(1,\ \log\frac{M}{f}\Big)\Big(\frac{f}{M}\log\frac{f}{M}-\frac{f}{M}+1\Big)M\,dv = C_L\int\max\Big(1,\ \log\frac{M}{f}\Big)\Big(f\log\frac{f}{M}-f+M\Big)dv \]
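\[ \le\ C_L\left(\int\Big(f\log\frac{f}{M}-f+M\Big)dv\right)^{1-\eta}\left(\int\max\Big(1,\ \log\frac{M}{f}\Big)^{1/\eta}\Big(f\log\frac{f}{M}-f+M\Big)dv\right)^{\eta}. \]
In order to conclude the proof of (44), we just have to bound the integral on the right-hand side. For this we note that
\[ \max\Big(1,\ \log\frac{M}{f}\Big)\ \le\ \max\Big(1,\ \log\frac{1}{K_0}+A_0|v|^{q_0}\Big), \qquad (54) \]
so a rough estimate leads to
\[ \int\max\Big(1,\ \log\frac{M}{f}\Big)^{1/\eta}\Big(f\log\frac{f}{M}-f+M\Big)dv\ \le\ C(N)\,\max\Big(1,\ \log\frac{1}{K_0}+A_0\Big)^{1/\eta}\left[\|f\|_{L^1_{q_0/\eta}\log L} + \|f\|_{L^1_{\frac{q_0}{\eta}+2}} + \|M\|_{L^1_{q_0/\eta}}\right]. \]
This concludes the proof of Theorem 4.2.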
Remark. Handling rough distributions. One can wonder what can still be proven if one has to deal with density functions which do not possess the nice properties of decay, smoothness and positivity which we used above. This was the situation in the original Carlen-Carvalho results [12, 13]. If one is ready to accept polynomial lower bounds of the form D(f ) ≥ H (f |M f )αf , where the exponent α would not necessarily be close to 1, then the assumptions on f can be considerably relaxed: • It is very easy to relax the moment assumption. This can be achieved by the following standard way. Let χ (v) be a radially symmetric smooth cut-off function, which will be fixed once for all in such a way that χ ≥ 0, χ (v) = 1 for |v| ≤ 1, χ (v) = 1/|v|µ for |v| ≥ 2, where µ will be chosen later on. We define v f˜R (v) = f (v)χ ; R then we apply our main theorem to fR . Using the crude moment bound and the smoothness, it is possible to show that H (fR |M fR ) is not much smaller than H (f |M f ), and that D(fR ) is not much bigger than D(f ); error terms for this are given as a function of R. Then, an optimization in R yields the result. • If the kernel does not vanish, then it is also very easy to relax the smoothness and the lower bound assumption. Indeed, by crude moment bounds, we may assume that B ≥ 1. Then, if Pt stands for the heat semigroup, we have D(Pt f ) ≤ D(f ) by convexity of f ⊗ f −→ D(f ). But Pt f is smooth and bounded from below; so the only thing to know is that H (Pt f |M Pt f ) is not much smaller than H (f |M). This only needs an ad hoc assumption on the modulus of continuity of the function H (Pt f ) in terms of t. For instance, if f has a finite Fisher information, |∇f |2 < +∞, I (f ) = f
then one has the bound
\[ H(f)-H(P_tf) = \int_0^t I(P_\tau f)\,d\tau\ \le\ I(f)\,t, \]
because $I(P_tf)$ is a nonincreasing function of $t$ (this property is well-known to specialists; see [45] for a short proof).
• Even if the kernel vanishes, then it is possible to require less smoothness on the distribution. An example of such a computation is in [12], which uses Lipschitz bounds.

5. Further Developments and Open Problems

A natural question to ask now is whether Cercignani’s conjecture holds true for other collision kernels than the quadratic ones. As we saw, the answer is negative if $\int B(v-v_*,\sigma)\,d\sigma=O(1+|v-v_*|^{\gamma})$ for some $\gamma<2$. However, a physically interesting case is not covered by these assumptions, namely the one in which $B$ presents a nonintegrable angular singularity for small deviation angles. In this case, the Boltzmann operator resembles a fractional derivation operator (see [2]) and one could wonder whether this has a chance to help. For instance, it is shown in [2] that for some kernels of this kind, finiteness of $D(f)$ implies finiteness of well-chosen fractional Sobolev norms of $\sqrt{f}$. At present, we do not know how to answer this question, which seems extremely intricate. However, we can formulate a guess as follows.

The links between Boltzmann’s and Landau’s entropy production functionals have been studied in detail (see for instance [1, 43]). In particular, the following is known. Let $B_\varepsilon(v-v_*,\sigma)$ be a family of Boltzmann collision kernels taking the special form
\[ B_\varepsilon(v-v_*,\sigma) = \Phi(|v-v_*|)\,b_\varepsilon(\cos\theta),\qquad \cos\theta = \Big\langle\frac{v-v_*}{|v-v_*|},\ \sigma\Big\rangle, \]
where the angular dependence $b_\varepsilon$ satisfies
\[ \int_0^{\pi} b_\varepsilon(\cos\theta)\,(1-\cos\theta)\,\sin^{N-2}\theta\,d\theta\ \xrightarrow[\varepsilon\to 0]{}\ \mu>0,\qquad \forall\theta_0>0,\quad \sup_{\theta\ge\theta_0} b_\varepsilon(\cos\theta)\ \xrightarrow[\varepsilon\to 0]{}\ 0. \]
Then, for any fixed smooth density $f$, one has
\[ D_{B_\varepsilon}(f)\ \xrightarrow[\varepsilon\to 0]{}\ D_L(f), \]
where DBε stands for Boltzmann’s entropy production functional with kernel Bε , and DL stands for Landau’s entropy production functional, in the form + '
' + +2 + (|v − v∗ |)+(v−v∗ )⊥ ∇ f − (∇ f )∗ + , (55) DL (f ) = 2 R2N
(|z|) = CN µ (|z|)|z|2 .
482
C. Villani
In particular, if one considers a family of collision kernels of the form B(v − v∗ , σ ) = (1 + |v − v∗ |γ ) b(cos θ),
(56)
where b(cos θ ) is a positive kernel satisfying
2π
b(cos θ )(1 − cos θ) sinN−2 θ dθ = 1
0
and presenting a singularity of order 1 + ν at θ = 0, then the Landau entropy production (23) appears as the limit case ν = 2, γ = 0. More generally, for a given power γ , the limit case ν = 2 would correspond to (55) with behaving like a power γ + 2. In fact, most of the time the quantity γ + ν has a lot of influence on the qualitative properties of the Boltzmann operator when grazing collisions have an important role [2]. Now, the main result of the present paper states that Cercignani’s conjecture holds true in the case γ = 2, ν = 0, while the main result of [26] states that this conjecture also holds true in the limit case γ = 0, ν = 2. In view of the above considerations, it is very tempting to conjecture that we have just identified the extremal points of a family of inequalities governed by the quantity γ + ν. Thus we are led to the Conjecture. Let B(v − v∗ , σ ) be a collision kernel of the form (56), where b has an angular singularity of order 1 + ν at θ = 0 (ν ≥ 0; by convention ν = 0 if b is integrable). Then, Cercignani’s conjecture holds true if and only if γ + ν ≥ 2. On the other hand, the linearized Boltzmann collision operator admits a spectral gap if and only if γ ≥ 0. Would this conjecture be true, it would mean that Cercignani’s conjecture is always false in realistic cases; indeed, classical physics yields exponents γ and ν satisfying γ + ν ≤ 1. It would also lead to a remarkable conclusion, namely that the physical quantities which matter for an entropy-entropy production inequality to hold, differ from those which matter for a spectral gap inequality to hold. As we mentioned above, another problem in which questions naturally arise is an entropic version of Kac’s famous spectral gap problem for many particles. As we shall see in the next section, our estimates can be adapted for such a purpose, and lead to the first accomplishment of Kac’s program on a simplified artificial model of the Boltzmann equation. 6. 
The Entropy Variant of Kac’s Problem At the beginning of the fifties, Kac [30] had the idea to attack the problem of trend to equilibrium for the Boltzmann equation, by a study of a “master equation”, i.e. a linear equation for the density function of a many-particles system. The main apparent advantage of this approach was to trade the complexity of the nonlinear Boltzmann equation for the simplicity of a linear equation, however on a many-particle phase space. It seems natural to use the tools of spectral theory to study the asymptotic behavior of the solution of this linear equation, and this might have led to a natural road towards convergence theorems for the nonlinear Boltzmann equation, for which no way of attack could be seen. The problem set by Kac, and which he was unable to achieve at the time, was to prove that a certain family of linear operators Ln , defined on larger and larger phase spaces (the phase spaces for n particles), admitted a uniform lower bound on the size of their spectral gap λn .
Let us give here the basic example, commonly called Kac’s caricature of a Maxwell gas. In this model, the particles are modelled by $n$ one-dimensional velocities, evolving stochastically as follows. Attached to the system is a Poisson clock with rate $n$; whenever it rings, two distinct particles $i$ and $j$ are chosen randomly, and they change states from the old velocities $(v_i,v_j)$ to the new values $(v_i',v_j')=R_\theta(v_i,v_j)$, where $R_\theta$ is the counterclockwise rotation of angle $\theta$ in the plane, and $\theta$ is chosen randomly (uniformly) in $[0,2\pi]$. Since these “collisions” preserve kinetic energy, the natural phase space is the sphere $\sqrt{n}\,S^{n-1}$ (so that each particle has energy 1 on the average). The linear operator describing this evolution is
\[ L_nf^{(n)} = \frac{n}{\binom{n}{2}}\sum_{i<j}\fint_0^{2\pi}\big[f^{(n)}\circ R^{ij}_\theta - f^{(n)}\big]\,d\theta, \qquad (57) \]
where $R^{ij}_\theta$ stands for the counterclockwise rotation of angle $\theta$ in the $(i,j)$ plane, and $\fint$ stands for the normalized integral. Since the clock typically rings $n$ times over a unit time interval, each particle typically collides once, and in the limit as $n\to\infty$, under a chaos assumption, the one-particle marginal can be shown to satisfy a simple analogue of Boltzmann’s equation,
\[ \frac{\partial f}{\partial t} = \int_{\mathbb{R}}\fint_0^{2\pi} f(v')\,f(v'_*)\,d\theta\,dv_* - f,\qquad t\ge 0,\ v\in\mathbb{R}, \qquad (58) \]
where $(v',v'_*)=R_\theta(v,v_*)$.

As a by-product, these studies led Kac to introduce the notion of propagation of chaos, and one of the first mathematical treatments of mean-field limit for a continuous phase space. There is hardly any doubt that these by-products were by far more important than Kac’s original goal. The study of chaos and mean-field limits underwent important progress in the following decades [37, 38], while the spectral gap study of the master equations was stalled. Kac implicitly conjectured that the spectral gap $\lambda_n$ of (57) was uniformly bounded below as $n\to\infty$. Recently, Diaconis and Saloff-Coste [27], using a group-theoretical approach, could not obtain better than $\lambda_n^{-1}=O(n^2)$ for the spectral gap $\lambda_n$ of the operator (57). It was only two years ago that Janvresse [28] finally solved Kac’s conjecture; for this she used the so-called Yau entropy method. Independently, Maslen [33] managed to determine the whole spectrum of the Kac operator by a completely different method. Shortly after, Carlen, Carvalho and Loss [14] dramatically improved this result by computing the exact value of the spectral gap, introducing a deep and general induction argument, by which they also could treat more general versions of Kac master equations, in particular the one corresponding to Boltzmann’s equation for Maxwell molecules.

However, it was soon realized that these results would not help solving Kac’s original problem, namely convergence to equilibrium for a nonlinear equation set on a one-particle phase space. What is the problem? Let us still consider the case of Kac’s caricature: the phase space is $\sqrt{n}\,S^{n-1}$, and spectral gap tells about the speed of convergence in $L^2(\sqrt{n}\,S^{n-1})$. But the norm of the $n$-particle distribution function $f^{(n)}$ in $L^2(\sqrt{n}\,S^{n-1})$ is very hard to relate to relevant quantities about the one-particle marginal $f$, even when
$f^{(n)}$ is very close to being in tensor product form. Roughly speaking, it increases exponentially with $n$, and cannot be asymptotically expressed in terms of some simple norm of $f$, see [30]. Even if nobody has ever checked this explicitly, it seems that Kac’s approach rather leads to estimates about the speed of convergence for a linearized Boltzmann-type equation.

Now, if one still wants to achieve Kac’s program and relate the speed of convergence for the nonlinear Boltzmann equation to the speed of convergence for a many-particles system, the natural thing to do is to look at this problem in terms of entropy and entropy production. Indeed, entropy behaves very well for problems with arbitrarily large phase space, a property which has been used again and again in the context of hydrodynamic limits of particle systems [31]. The entropy version of Kac’s problem now becomes: Find an asymptotically sharp lower bound, as $n\to\infty$, on
\[ K_n\ \equiv\ \inf_{f^{(n)}}\frac{D(f^{(n)})}{H(f^{(n)})}, \]
where $f^{(n)}$ describes the set of probability densities on $\sqrt{n}\,S^{n-1}$ with finite entropy, $D(f^{(n)})$ stands for the entropy production of the system of $n$ particles under consideration, and $H(f^{(n)})$ for the negative of its entropy. In [29], Janvresse gives some hint of why Yau’s method seems to fail on this problem. It was shown in the author’s master thesis [44] that $K_n^{-1}=O(n)$. This estimate was first believed by the author to be suboptimal; but recent work by Carlen, Lieb and Loss [15], using a completely different method, suggests that it is optimal. The fact that $K_n^{-1}$ is not uniformly bounded prevents one from drawing any relevant conclusion at the level of the limit $n\to\infty$; this is consistent with the fact that Cercignani’s conjecture is not believed to hold true for Eq. (58).

Now, what we prove by an adaptation of the proof of Theorem 2.1 is the following more general result. In the case $\gamma=2$, it includes the first Kac-type model for which the entropy production problem can be solved in a satisfactory way.

Theorem 6.1. Let $\gamma\in[0,2]$. Consider the following Kac-type master equation:
\[ \frac{\partial f^{(n)}}{\partial t} = L_nf^{(n)}\ \equiv\ \frac{n}{\binom{n}{2}}\sum_{i<j}\fint_0^{2\pi}\Big(1+(v_i^2+v_j^2)^{\gamma/2}\Big)\big[f^{(n)}\circ R^{ij}_\theta - f^{(n)}\big]\,d\theta,\qquad t\ge 0,\ v\in\sqrt{n}\,S^{n-1}, \qquad (59) \]
where the unknown f (n) is a probability density on the sphere nS n−1 . Define (n) H (f ) = √ f (n) log f (n) , nS n−1 1 ij (n) (n) (n) (n) . D(f ) = L (f ) log f ◦ R − log f n θ 2 √nS n−1 Then D(f (n) ) ≥
$\dfrac{n^{\gamma/2}}{(\gamma+1)n-1}\,H(f^{(n)})\ \ge\ \dfrac{H(f^{(n)})}{(\gamma+1)\,n^{1-\frac{\gamma}{2}}}.$
In particular, for $\gamma = 2$, one has
\[
D(f^{(n)}) \ge \frac{H(f^{(n)})}{3}
\]
independently of $n$.

Remarks.

1. Note that $v_i^2 + v_j^2$ is typically of order 1, so our modified Kac-type linear operator is of the same order of magnitude as the one appearing in Kac's caricature of a Maxwell gas. It describes the same system as Kac's model, except that now each pair of particles $(i, j)$ has its own Poisson clock, ringing with rate $(v_i^2 + v_j^2)^{\gamma}$ if the two particles are in states $v_i$, $v_j$ respectively. The collision between particles $i$ and $j$ is performed only when the corresponding clock has rung.

2. In the case $\gamma = 2$, we deduce from this study that
\[
H(f_t^{(n)}) \le H(f_0^{(n)})\, e^{-t/5},
\]
where $f_t^{(n)}$ stands for the $n$-particle probability density at time $t$. Assume that $f_0^{(n)}$ is a "strongly" chaotic sequence of initial data, for instance the restriction of a tensor product probability measure to the sphere $\sqrt{n}S^{n-1}$. Then one has
\[
H(f_0 | M) = \lim_{n \to \infty} \frac{H(f_0^{(n)})}{n}.
\]
On the other hand, from a propagation of chaos argument and the properties of $H$ one can show that for any $t \ge 0$,
\[
H(f_t | M) \le \liminf_{n \to \infty} \frac{H(f_t^{(n)})}{n},
\]
where $f_t$ stands for the solution of (58) at time $t$. Thus the convergence to equilibrium for the limit equation can be deduced from the estimates on the $n$-particle system. Note that in the present case, it could also have been established directly.

3. By a standard linearization procedure, this bound on the entropy production implies an estimate of the spectral gap, with the same constants. In the case $\gamma = 0$, the estimate obtained in this way already improves on the Diaconis-Saloff-Coste result [27] by a factor $n$, but does not (cannot?) catch the optimal results of [28] or [14].

4. We conjecture that the estimate $K_n^{-1} = O(n^{1-\gamma/2})$ is optimal. This could possibly be checked by a reexamination of the methods by Carlen, Carvalho and Loss [14], and by Carlen, Lieb and Loss [15].

5. Here is the conclusion that can be drawn from this study, when compared with the observations by Carlen, Lieb and Loss [15]: in the entropy production problem for $n$ particles with $\gamma = 0$, very low entropy production rates can be achieved by considering particular configurations which are very "unfair", in the sense that just a few (a small fraction of $N$, however nonvanishing in the limit $N \to \infty$) fast particles carry an important fraction of the energy. If one enhances the rate of interaction of these fast particles (by introducing $v_i^2 + v_j^2$ in the collision kernel), then this effect can be compensated for. It should be noted that the spectral gap inequality does not worry about the existence of these "unfair" energy distributions. As in the previous section, we see that entropy production estimates are more sensitive to the dynamics of the particle system.
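The dependence of the constant in Theorem 6.1 on $n$ and $\gamma$ can be tabulated with a few lines of Python. This is only an illustrative sketch: the function names `entropy_constant` and `weaker_constant` are hypothetical and simply evaluate the two constants appearing in the chain of inequalities of the theorem.

```python
def entropy_constant(n: int, gamma: float) -> float:
    """Constant n^(gamma/2) / ((gamma+1) n - 1) from Theorem 6.1 (hypothetical helper)."""
    return n ** (gamma / 2.0) / ((gamma + 1.0) * n - 1.0)


def weaker_constant(n: int, gamma: float) -> float:
    """Simplified lower bound 1 / ((gamma+1) n^(1 - gamma/2)) from Theorem 6.1."""
    return 1.0 / ((gamma + 1.0) * n ** (1.0 - gamma / 2.0))


if __name__ == "__main__":
    for gamma in (0.0, 1.0, 2.0):
        for n in (2, 10, 100, 10_000):
            c = entropy_constant(n, gamma)
            w = weaker_constant(n, gamma)
            # The first constant always dominates the simplified one.
            assert c >= w
            print(f"gamma={gamma:3.1f}  n={n:6d}  c={c:.6f}  simplified={w:.6f}")
```

For $\gamma = 2$ the simplified constant equals $1/3$ for every $n$, whereas for $\gamma = 0$ both constants decay like $1/n$, in line with Remark 4.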
Proof of Theorem 6.1. It is in the same spirit as that of Theorem 2.1. The main difference is that the use of the Blachman-Stam inequality will be replaced by the use of a classical estimate coming from the theory of logarithmic Sobolev inequalities on manifolds.

To begin with, we use the inequality $1 + X^\alpha \ge (1 + X)^\alpha$ ($X \ge 0$, $0 \le \alpha \le 1$) to replace the kernel $1 + (v_i^2 + v_j^2)^{\gamma/2}$ by its lower bound $(1 + v_i^2 + v_j^2)^{\gamma/2}$. The interest of this will appear later. To simplify notations we rescale everything to work on the unit sphere $S^{n-1}$. Let $\sigma^n$ be the uniform probability on $S^{n-1}$. The problem becomes: bounding below
\[
D(f^{(n)}) = \frac{n}{2\binom{n}{2}} \sum_{i<j} \int_{S^{n-1}} \frac{1}{2\pi} \int_0^{2\pi} \left[f^{(n)} \circ R^{ij}_\theta - f^{(n)}\right] \left[1 + n(v_i^2 + v_j^2)\right]^{\gamma/2} \log \frac{f^{(n)} \circ R^{ij}_\theta}{f^{(n)}} \, d\sigma^n \, d\theta
\]
in terms of
\[
H(f^{(n)}) = \int_{S^{n-1}} f^{(n)} \log f^{(n)} \, d\sigma^n,
\]
whenever $\int_{S^{n-1}} f^{(n)} \, d\sigma^n = 1$. To alleviate notations we shall drop the superscript $(n)$ for the probability density. By Jensen's inequality, $D(f) \ge \overline{D}(f)$, where
\[
\overline{D}(f) = \frac{n}{2\binom{n}{2}} \sum_{i<j} \int_{S^{n-1}} \left[1 + n(v_i^2 + v_j^2)\right]^{\gamma/2} (f - f^{ij}) \log \frac{f}{f^{ij}} \, d\sigma^n
\]
and
\[
f^{ij}(v) = \frac{1}{2\pi} \int_0^{2\pi} f(R^{ij}_\theta(v)) \, d\theta.
\]
Let $\Delta_S$ be the Laplace-Beltrami operator on the sphere $S^{n-1}$. It can be defined as
\[
\Delta_S = \frac{1}{2} \sum_{i \ne j} (D^{ij})^2 = \sum_{i<j} (D^{ij})^2,
\]
where
\[
D^{ij} f = v_i \frac{\partial f}{\partial v_j} - v_j \frac{\partial f}{\partial v_i} = \left. \frac{d}{d\theta} \right|_{\theta = 0} f \circ R^{ij}_\theta.
\]
An easy computation shows that
\[
\int_{S^{n-1}} f \, \Delta_S f = - \int_{S^{n-1}} |\nabla f|^2 \, d\sigma^n,
\]
where $\nabla f$ stands for the tangential gradient on $S^{n-1}$.
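The pointwise identity behind this computation, namely that on the unit sphere the sum over $i<j$ of $(D^{ij} f)^2$ equals the squared norm of the tangential gradient, can be checked numerically. The sketch below is only illustrative: the helper names are hypothetical, and an arbitrary vector stands in for the ambient gradient of $f$ at the point $v$.

```python
import math
import random


def random_unit_vector(n, rng):
    """Draw a point of the unit sphere S^(n-1) by normalizing a Gaussian vector."""
    g = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(x * x for x in g))
    return [x / norm for x in g]


def sum_of_squared_rotation_derivatives(v, grad):
    """Sum over i<j of (v_i * d_j f - v_j * d_i f)^2, i.e. of (D^{ij} f)^2."""
    n = len(v)
    return sum(
        (v[i] * grad[j] - v[j] * grad[i]) ** 2
        for i in range(n)
        for j in range(i + 1, n)
    )


def squared_tangential_gradient(v, grad):
    """|grad|^2 minus the squared radial component, valid when |v| = 1."""
    radial = sum(vi * gi for vi, gi in zip(v, grad))
    return sum(gi * gi for gi in grad) - radial * radial


if __name__ == "__main__":
    rng = random.Random(0)
    v = random_unit_vector(6, rng)
    # Stand-in for the ambient gradient of some smooth f at the point v.
    grad = [rng.uniform(-1.0, 1.0) for _ in range(6)]
    lhs = sum_of_squared_rotation_derivatives(v, grad)
    rhs = squared_tangential_gradient(v, grad)
    print(abs(lhs - rhs))
```

The agreement is an instance of Lagrange's identity; combined with the antisymmetry of each $D^{ij}$ with respect to $\sigma^n$, it gives the integration by parts formula stated above.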
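Let $(S_t)_{t \ge 0}$ be the heat semigroup on the sphere, generated by the Laplace-Beltrami operator. A computation similar to the one in the proof of Theorem 2.1 shows that
\[
\begin{aligned}
- \frac{d}{dt} \overline{D}(S_t f) ={}& \frac{n}{2\binom{n}{2}} \sum_{i<j} \int_{S^{n-1}} \phi_{ij} \, (S_t f + S_t f^{ij}) \, |\nabla \log S_t f - \nabla \log S_t f^{ij}|^2 \, d\sigma^n \\
& - \frac{n}{2\binom{n}{2}} \sum_{i<j} \int_{S^{n-1}} \Delta_S \phi_{ij} \, (S_t f - S_t f^{ij}) \log \frac{S_t f}{S_t f^{ij}} \, d\sigma^n,
\end{aligned}
\tag{60}
\]
where $\phi_{ij}(v) = [1 + n(v_i^2 + v_j^2)]^{\gamma/2}$. Here we have used the fact that $(S_t f)^{ij} = S_t(f^{ij})$. Next, by direct computation, $\Delta_S (v_i^2 + v_j^2) = 4 - 2n(v_i^2 + v_j^2) \le 4$; in particular, from the formula $\Delta a^{\gamma/2} \le (\gamma/2)(\Delta a)/a^{1-\gamma/2}$, we deduce
\[
\Delta_S \phi_{ij}(v) \le \frac{2n\gamma}{[1 + n(v_i^2 + v_j^2)]^{1-\gamma/2}} \le 2n\gamma \, \phi_{ij}(v).
\]
Let $P_{ij}$ stand for the orthogonal projection onto the $(i, j)$ plane: clearly,
\[
(S_t f + S_t f^{ij}) \, |\nabla \log S_t f - \nabla \log S_t f^{ij}|^2 \ge S_t f \, |P_{ij} \nabla \log S_t f - P_{ij} \nabla \log S_t f^{ij}|^2.
\]
Now, the crucial symmetry argument on which the proof relies is that $P_{ij} \nabla \log S_t f^{ij}$ is always colinear to $(v_i, v_j)$. Let $T_{ij} \in L^\infty(S^{n-1}; L(\mathbb{R}^2, \mathbb{R}))$ be the $v$-dependent linear operator with unit norm defined by
\[
T_{ij} : [X_i, X_j] \longmapsto \frac{X_j v_i - X_i v_j}{\sqrt{v_i^2 + v_j^2}}.
\]
Then $T_{ij} P_{ij} \nabla \log S_t f^{ij} = 0$, and thus
\[
|P_{ij} \nabla \log S_t f - P_{ij} \nabla \log S_t f^{ij}|^2 \ge (T_{ij} P_{ij} \nabla \log S_t f)^2 = \frac{(D^{ij} \log S_t f)^2}{v_i^2 + v_j^2}.
\]
Since $v \in S^{n-1}$, we have
\[
\frac{\phi_{ij}}{v_i^2 + v_j^2} \ge \frac{n^{\gamma/2}}{(v_i^2 + v_j^2)^{1 - \gamma/2}} \ge n^{\gamma/2}.
\]
All in all, we arrive at
\[
\begin{aligned}
- \frac{d}{dt} \overline{D}(S_t f) \ge{}& \frac{n}{2\binom{n}{2}} \sum_{i<j} \int_{S^{n-1}} n^{\gamma/2} \, S_t f \, (D^{ij} \log S_t f)^2 \, d\sigma^n \\
& - \frac{n}{2\binom{n}{2}} \sum_{i<j} \int_{S^{n-1}} 2\gamma n \, \phi_{ij} \, (S_t f - S_t f^{ij}) \log \frac{S_t f}{S_t f^{ij}} \, d\sigma^n,
\end{aligned}
\]
which implies
\[
- \frac{d}{dt} \overline{D}(S_t f) + 2\gamma n \, \overline{D}(S_t f) \ge \frac{n^{\gamma/2}}{n - 1} \int_{S^{n-1}} \frac{|\nabla S_t f|^2}{S_t f} \, d\sigma^n.
\]
Let
\[
I(f) = \int_{S^{n-1}} \frac{|\nabla f|^2}{f} \, d\sigma^n
\]
stand for the Fisher information of a probability distribution $f$ on $S^{n-1}$. The preceding differential inequality implies
\[
\overline{D}(f) \ge \frac{n^{\gamma/2}}{n - 1} \int_0^{+\infty} e^{-2\gamma n t} \, I(S_t f) \, dt.
\]
It is known from the theory of logarithmic Sobolev and hypercontractive inequalities (see for instance [32]) that
\[
I(S_\tau f) \le e^{-2\rho\tau} I(f), \qquad \rho = n - 1
\]
(more generally, on a sphere of radius $r$, $\rho = (n-1)/r^2$). Using this and reasoning as in the proof of Theorem 2.1, we end up with
\[
D(f) \ge \frac{n^{\gamma/2}}{\gamma n + n - 1} \int_0^{+\infty} I(S_t f) \, dt = \frac{n^{\gamma/2}}{(\gamma + 1)n - 1} \, H(f).
\]
This ends the proof of Theorem 6.1.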
We shall conclude this paper on still another open problem. What happens if one tries to replace Kac's caricature of a Maxwell gas by a more realistic model, in which particles have velocities in $\mathbb{R}^N$ ($N \ge 2$) and undergo elastic collisions, with a collision kernel of the form $B = 1 + |v_i - v_j|^\gamma$ (as a first step towards an even more realistic model with $B = |v_i - v_j|^\gamma$)? More explicitly, consider the case in which
\[
L_n f = \frac{n}{\binom{n}{2}} \sum_{ij} \frac{1}{|S^{N-1}|} \int_{S^{N-1}} B(v_i - v_j, \sigma) \left[f(v^{ij}) - f(v)\right] d\sigma,
\]
where $v^{ij}$ stands for $v$, with velocities $v_i$ and $v_j$ replaced by
\[
v_i' = \frac{v_i + v_j}{2} + \frac{|v_i - v_j|}{2} \sigma, \qquad v_j' = \frac{v_i + v_j}{2} - \frac{|v_i - v_j|}{2} \sigma,
\]
and $B(v_i - v_j, \sigma) \ge 1 + |v_i - v_j|^\gamma$. Then the phase space would be a sphere of dimension $nN - (N + 1)$, due to the existence of the $N + 1$ conservation laws,
\[
\sum_{i=1}^{n} |v_i|^2 = Nn, \qquad \sum_{i=1}^{n} v_i = 0.
\]
We do believe that the same bounds in $O(n^{1-\gamma/2})$ still hold true for this model. This would be consistent with both Theorem 2.1 and Theorem 6.1. However, the proof of this fact turns out to require a more delicate geometrical analysis, if one wishes to establish an analogue of Lemma 2.8 in arbitrarily large dimension. This will be the object of future study.
Acknowledgement. I started to work hard again on Boltzmann's entropy production on the occasion of a course that I taught at the Institut Henri Poincaré (Paris) during the fall of 2001, upon the suggestion of Stefano Olla. The proof of Theorem 2.1 came as an illumination while I was trying to answer some nasty questions by Thierry Bodineau. Thus, both of them have played a crucial role in the genesis of the paper and deserve many thanks. Eric Carlen had an obvious influence on the present work, and was kind enough to read it carefully and make some constructive remarks. The paper also relies crucially on some previous joint works with Laurent Desvillettes and Giuseppe Toscani. Additional thanks are due to Michael Loss for several enlightening discussions about the subject, and to Laurent Saloff-Coste for pointing out reference [33]. Finally, the support of the European network "Hyperbolic and Kinetic Equations", contract HPRN-CT-2002-00282 is acknowledged.
References

1. Alexandre, R., Villani, C.: On the Landau approximation in plasma physics. Preprint, to appear in Ann. Inst. H. Poincaré Anal. Non-linéaire
2. Alexandre, R., Villani, C.: On the Boltzmann equation for long-range interactions. Comm. Pure Appl. Math. 55(1), 30–70 (2002)
3. Ané, C., Blachère, S., Chafaï, D., Fougères, P., Gentil, I., Malrieu, F., Roberto, C., Scheffer, G.: Sur les inégalités de Sobolev logarithmiques. Vol. 10 of Panoramas et Synthèses. Paris: Société Mathématique de France, 2000
4. Arnold, A., Markowich, P., Toscani, G., Unterreiter, A.: On logarithmic Sobolev inequalities and the rate of convergence to equilibrium for Fokker-Planck type equations. Comm. Partial Differential Equations 26(1–2), 43–100 (2001)
5. Bakry, D., Emery, M.: Diffusions hypercontractives. In: Sém. Proba. XIX, no. 1123 in Lecture Notes in Math. Berlin-Heidelberg-New York: Springer, 1985, pp. 177–206
6. Barthe, F.: Personal communication about his joint works with K. Ball and A. Naor
7. Blachman, N.: The convolution inequality for entropy powers. IEEE Trans. Inform. Theory 2, 267–271 (1965)
8. Bobylev, A.: The theory of the nonlinear, spatially uniform Boltzmann equation for Maxwellian molecules. Sov. Sci. Rev. C. Math. Phys. 7, 111–233 (1988)
9. Bobylev, A.V.: Moment inequalities for the Boltzmann equation and applications to spatially homogeneous problems. J. Stat. Phys. 88(5–6), 1183–1214 (1997)
10. Bobylev, A.V., Cercignani, C.: On the rate of entropy production for the Boltzmann equation. J. Stat. Phys. 94(3–4), 603–618 (1999)
11. Carlen, E.: Superadditivity of Fisher's information and logarithmic Sobolev inequalities. J. Funct. Anal. 101(1), 194–211 (1991)
12. Carlen, E., Carvalho, M.: Strict entropy production bounds and stability of the rate of convergence to equilibrium for the Boltzmann equation. J. Stat. Phys. 67(3–4), 575–608 (1992)
13. Carlen, E., Carvalho, M.: Entropy production estimates for Boltzmann equations with physically realistic collision kernels. J. Stat. Phys. 74(3–4), 743–782 (1994)
14. Carlen, E., Carvalho, M., Loss, M.: Determination of the spectral gap for Kac's master equation and related stochastic evolutions. Preprint, 2000; to appear in Acta Mathematica
15. Carlen, E., Lieb, E., Loss, M.: Personal communication
16. Carlen, E., Soffer, A.: Entropy production by block variable summation and central limit theorems. Commun. Math. Phys. 140, 339–371 (1991)
17. Carrillo, J., Jüngel, A., Markowich, P. A., Toscani, G., Unterreiter, A.: Entropy dissipation methods for degenerate parabolic systems and generalized Sobolev inequalities. Preprint, 1999
18. Carrillo, J.A., Toscani, G.: Asymptotic L1-decay of solutions of the porous medium equation to self-similarity. Indiana Univ. Math. J. 49(1), 113–142 (2000)
19. Cercignani, C.: H-theorem and trend to equilibrium in the kinetic theory of gases. Arch. Mech. 34, 231–241 (1982)
20. Cercignani, C.: The Boltzmann Equation and its Applications. New York: Springer-Verlag, 1988
21. Cercignani, C.: Rarefied Gas Dynamics. From basic concepts to actual calculations. Cambridge: Cambridge University Press, 2000
22. Cercignani, C., Illner, R., Pulvirenti, M.: The Mathematical Theory of Dilute Gases. Berlin-Heidelberg-New York: Springer, 1994
23. Del Pino, M., Dolbeault, J.: Best constants for Gagliardo-Nirenberg inequalities and applications to nonlinear diffusions. To appear in J. Maths Pures Appl. Available on http://www.ceremade.dauphine.fr/ dolbeaul/Preprints/Preprints.html, 2001
24. Desvillettes, L.: Entropy dissipation rate and convergence in kinetic equations. Commun. Math. Phys. 123(4), 687–702 (1989)
25. Desvillettes, L., Villani, C.: On the trend to global equilibrium in spatially inhomogeneous entropy-dissipating systems: The Boltzmann equation. Preprint, 2002
26. Desvillettes, L., Villani, C.: On the spatially homogeneous Landau equation for hard potentials. II. H-theorem and applications. Comm. Partial Differential Equations 25(1–2), 261–298 (2000)
27. Diaconis, P., Saloff-Coste, L.: Bounds for Kac's master equation. Commun. Math. Phys. 209(3), 729–755 (2000)
28. Janvresse, E.: Spectral gap for Kac's model of Boltzmann equation. Ann. Probab. 29(1), 288–304 (2001)
29. Janvresse, E.: Proceedings of the conference "Inhomogeneous Random Systems", held in Cergy-Pontoise (France), January 2001
30. Kac, M.: Foundations of kinetic theory. In: Proceedings of the Third Berkeley Symposium on Mathematical Statistics and Probability, 1954–1955, vol. III, Berkeley-Los Angeles: University of California Press, 1956, pp. 171–197
31. Kipnis, C., Landim, C.: Scaling Limits of Interacting Particle Systems. Berlin: Springer-Verlag, 1999
32. Ledoux, M.: On an integral criterion for hypercontractivity of diffusion semigroups and extremal functions. J. Funct. Anal. 105(2), 444–465 (1992)
33. Maslen, D.: The eigenvalues of Kac's master equation. To appear in Math. Zeit.
34. Mouhot, C., Villani, C.: Regularity theory for the spatially homogeneous Boltzmann equation with cut-off. Preprint, 2002
35. Otto, F.: The geometry of dissipative evolution equations: The porous medium equation. Comm. Partial Differential Equations 26(1–2), 101–174 (2001)
36. Stam, A.: Some inequalities satisfied by the quantities of information of Fisher and Shannon. Inform. Control 2, 101–112 (1959)
37. Sznitman, A.: Equations de type de Boltzmann, spatialement homogènes. Z. Wahrsch. Verw. Gebiete 66, 559–562 (1984)
38. Sznitman, A.-S.: Topics in propagation of chaos. In: École d'Été de Probabilités de Saint-Flour XIX – 1989. Berlin: Springer, 1991, pp. 165–251
39. Toscani, G., Villani, C.: Sharp entropy dissipation bounds and explicit rate of trend to equilibrium for the spatially homogeneous Boltzmann equation. Commun. Math. Phys. 203(3), 667–706 (1999)
40. Toscani, G., Villani, C.: On the trend to equilibrium for some dissipative systems with slowly increasing a priori bounds. J. Stat. Phys. 98(5–6), 1279–1309 (2000)
41. Truesdell, C., Muncaster, R.: Fundamentals of Maxwell's Kinetic Theory of a Simple Monoatomic Gas. New York: Academic Press, 1980
42. Villani, C.: A survey of mathematical topics in kinetic theory. To appear in Handbook of Mathematical Fluid Dynamics, S. Friedlander, D. Serre (eds.), Elsevier
43. Villani, C.: On a new class of weak solutions for the spatially homogeneous Boltzmann and Landau equations. Arch. Rational Mech. Anal. 143(3), 273–307 (1998)
44. Villani, C.: Contribution à l'étude mathématique des collisions en théorie cinétique. Master's thesis, Univ. Paris-Dauphine, France, 2000
45. Villani, C.: A short proof of the "concavity of entropy power". IEEE Trans. Inform. Theory 46(4), 1695–1696 (2000)
46. Villani, C.: Entropy dissipation and convergence to equilibrium. Notes from a series of lectures in Institut Henri Poincaré, Paris, 2001
47. Wennberg, B.: Entropy dissipation and moment production for the Boltzmann equation. J. Stat. Phys. 86(5–6), 1053–1066 (1997)

Communicated by J. L. Lebowitz
Commun. Math. Phys. 234, 491–516 (2003) Digital Object Identifier (DOI) 10.1007/s00220-002-0769-1
Communications in
Mathematical Physics
Asymptotics of Determinants of Bessel Operators

Estelle L. Basor¹,*, Torsten Ehrhardt²

¹ Department of Mathematics, California Polytechnic State University, San Luis Obispo, CA 93407, USA. E-mail: [email protected]
² Fakultät für Mathematik, Technische Universität Chemnitz, 09107 Chemnitz, Germany. E-mail: [email protected]
Received: 23 April 2002 / Accepted: 25 September 2002 Published online: 24 January 2003 – © Springer-Verlag 2003
Abstract: For $a \in L^\infty(\mathbb{R}_+) \cap L^1(\mathbb{R}_+)$ the truncated Bessel operator $B_\tau(a)$ is the integral operator acting on $L^2[0, \tau]$ with the kernel
\[
K(x, y) = \int_0^\infty t \sqrt{xy} \, J_\nu(xt) J_\nu(yt) \, a(t) \, dt,
\]
where $J_\nu$ stands for the Bessel function with $\nu > -1$. In this paper we determine the asymptotics of the determinant $\det(I + B_\tau(a))$ as $\tau \to \infty$ for sufficiently smooth functions $a$ for which $a(x) \ne -1$ for all $x \in [0, \infty)$. The asymptotic formula is of the form $\det(I + B_\tau(a)) \sim G^\tau E$ with certain constants $G$ and $E$, and thus similar to the well-known Szegö-Akhiezer-Kac formula for truncated Wiener-Hopf determinants.
1. Introduction

For $a \in L^\infty(\mathbb{R}_+) \cap L^1(\mathbb{R}_+)$ the Bessel operator $B(a)$ is the integral operator acting on $L^2(\mathbb{R}_+)$ with the kernel
\[
K(x, y) = \int_0^\infty t \sqrt{xy} \, J_\nu(xt) J_\nu(yt) \, a(t) \, dt. \tag{1}
\]
Here $J_\nu$ is the Bessel function with a parameter $\nu > -1$. For each $\tau > 0$, the truncated Bessel operator $B_\tau(a)$ is the integral operator acting on $L^2[0, \tau]$ with the same kernel (1). Obviously, $B_\tau(a)$ can be considered as the restriction of $B(a)$ onto $L^2[0, \tau]$, i.e.,
\[
B_\tau(a) = P_\tau B(a) P_\tau |_{L^2[0, \tau]}, \tag{2}
\]
where $P_\tau$ is the projection
\[
P_\tau : f(x) \mapsto g(x) = \begin{cases} f(x) & \text{for } 0 \le x \le \tau \\ 0 & \text{for } x > \tau. \end{cases} \tag{3}
\]

* Supported in part by NSF Grant DMS-9970879.
For $a \in L^\infty(\mathbb{R}_+) \cap L^1(\mathbb{R}_+)$, the Bessel operator $B(a)$ is bounded on $L^2(\mathbb{R}_+)$, and the truncated Bessel operator $B_\tau(a)$ is a trace class operator on $L^2[0, \tau]$. Hence the operator determinant $\det(I + B_\tau(a))$ is well defined for each $\tau$. For more information about trace class operators and related notions we refer to [9].

The main purpose of this paper is to compute the asymptotics of the determinants $\det(I + B_\tau(a))$ as $\tau \to \infty$ for certain continuous functions $a$. This problem is a variation of the classical Szegö-Akhiezer-Kac formula which gives the asymptotics for determinants of Wiener-Hopf operators. Both classes of operators, that is, the finite Wiener-Hopf and the truncated Bessel operators, arise in random matrix theory. The determinants give information about the distribution function for certain random variables defined on the eigenvalues in the matrix ensembles. In many instances (when the function $a$ is smooth enough) the asymptotics show that the distribution functions for the variables, after scaling, are normal. The Bessel case is particularly important for ensembles of positive Hermitian matrices, and the case of $\nu = \pm 1/2$ has additional applications in random matrix theory.

Given $a \in L^1(\mathbb{R}_+)$, we denote by $\hat{a}$ the cosine transform of the function $a$:
\[
\hat{a}(x) = \frac{1}{\pi} \int_0^\infty \cos(xt) \, a(t) \, dt. \tag{4}
\]
A function $a$ defined on $[0, \infty)$ is said to be piecewise $C^2$ on $[0, \infty)$ if there exist $0 = t_0 < t_1 < \ldots < t_N < \infty$, $N \ge 0$, such that $a$ is two times continuously differentiable on each of the intervals $[0, t_1], \ldots, [t_{N-1}, t_N], [t_N, \infty)$, where the derivatives at $t_0, \ldots, t_N$ are considered as one-sided derivatives.

The main result of this paper is as follows.

Theorem 1.1. Let $\nu > -1$ and suppose the function $b \in L^\infty(\mathbb{R}_+) \cap L^1(\mathbb{R}_+)$ satisfies the following conditions:
(i) $b$ is continuous and piecewise $C^2$ on $[0, \infty)$, and $\lim_{t \to \infty} b(t) = 0$;
(ii) $(1 + t)^{-1/2} b'(t) \in L^1(\mathbb{R}_+)$, $b''(t) \in L^1(\mathbb{R}_+)$.
Denote by $\hat{b}$ the cosine transform of $b$ and put $a = e^b - 1$. Then
\[
\det(I + B_\tau(a)) \sim \exp\left( \tau \hat{b}(0) - \frac{\nu}{2} b(0) + \frac{1}{2} \int_0^\infty x (\hat{b}(x))^2 \, dx \right) \qquad \text{as } \tau \to \infty. \tag{5}
\]
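As an illustration of formula (5), the following Python sketch evaluates the exponent for the sample choice $b(t) = c\,e^{-t}$, which is not taken from the paper and is only assumed here because its cosine transform (4) has the closed form $\hat{b}(x) = c / (\pi (1 + x^2))$. All function names are hypothetical; the integral in (5) is approximated by a plain Simpson rule on a truncated interval and compared with the value $c^2/(2\pi^2)$ obtained by hand.

```python
import math


def b(t, c):
    """Sample symbol b(t) = c * exp(-t); an assumption made for illustration."""
    return c * math.exp(-t)


def b_hat(x, c):
    """Cosine transform (4) of the sample b, in closed form."""
    return c / (math.pi * (1.0 + x * x))


def simpson(f, lo, hi, steps):
    """Composite Simpson rule with an even number of subintervals."""
    if steps % 2:
        steps += 1
    h = (hi - lo) / steps
    total = f(lo) + f(hi)
    for k in range(1, steps):
        total += (4.0 if k % 2 else 2.0) * f(lo + k * h)
    return total * h / 3.0


def log_det_exponent(tau, nu, c, cutoff=400.0, steps=200_000):
    """Exponent of (5), with the x-integral truncated at `cutoff`."""
    integral = simpson(lambda x: x * b_hat(x, c) ** 2, 0.0, cutoff, steps)
    return tau * b_hat(0.0, c) - 0.5 * nu * b(0.0, c) + 0.5 * integral


def log_det_exponent_closed_form(tau, nu, c):
    """Same exponent, using the hand-computed integral c^2 / (2 pi^2)."""
    return tau * c / math.pi - 0.5 * nu * c + c * c / (4.0 * math.pi ** 2)


if __name__ == "__main__":
    for nu in (-0.5, 0.0, 0.5):
        numeric = log_det_exponent(10.0, nu, 1.0)
        exact = log_det_exponent_closed_form(10.0, nu, 1.0)
        print(nu, numeric, exact)
```

The term linear in $\tau$ and the constant term correspond to $G^\tau$ and $E$ in the abstract; only the constant term depends on $\nu$.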
The proof of this theorem will be given in the last section of this paper (Sect. 6). Note that the assumptions on $b$ ensure that all expressions appearing in formula (5) are well defined (see also the arguments in the proof). A result of this kind has already been established by one of the authors in [1] under more restrictive assumptions. In comparison with [1], the proof of the asymptotic formula given here will be more transparent as we also employ a new algebraic method for the proof [7]. In particular, we remove the quite restrictive assumption that $\|a\|_{L^\infty(\mathbb{R}_+)} < 1$, which was imposed in [1]. It is replaced by the assumption that $1 + a$ is a function that possesses a logarithm, which is a natural requirement for Szegö-Akhiezer-Kac type formulas [6].
It is notable that for particular values of $\nu$ the Bessel operator $B(a)$ can be written in terms of Wiener-Hopf and Hankel operators:
\[
\begin{aligned}
B(a) &= W(a) + H(a) && \text{if } \nu = -1/2, \\
B(a) &= W(a) - H(a) && \text{if } \nu = 1/2.
\end{aligned}
\]
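At the level of kernels this can be seen from the elementary closed forms $J_{1/2}(z) = \sqrt{2/(\pi z)} \sin z$ and $J_{-1/2}(z) = \sqrt{2/(\pi z)} \cos z$: the integrand of (1) then reduces to $\frac{1}{\pi}[\cos((x-y)t) \mp \cos((x+y)t)]$, whose integral against $a(t)$ is $\hat{a}(x-y) \mp \hat{a}(x+y)$ in terms of the cosine transform (4). The short Python check below compares the two integrands pointwise; the function names are hypothetical and the sketch makes no claim about the operators beyond this pointwise identity.

```python
import math


def bessel_half(nu, z):
    """J_nu(z) for nu = +1/2 or nu = -1/2, via the elementary closed forms."""
    prefactor = math.sqrt(2.0 / (math.pi * z))
    if nu == 0.5:
        return prefactor * math.sin(z)
    if nu == -0.5:
        return prefactor * math.cos(z)
    raise ValueError("only nu = +1/2 and nu = -1/2 are supported")


def bessel_kernel_integrand(nu, x, y, t):
    """Integrand of the kernel (1) without the factor a(t)."""
    return t * math.sqrt(x * y) * bessel_half(nu, x * t) * bessel_half(nu, y * t)


def wiener_hopf_hankel_integrand(nu, x, y, t):
    """(1/pi) [cos((x - y) t) + s cos((x + y) t)], s = +1 for nu = -1/2, -1 for nu = +1/2."""
    sign = 1.0 if nu == -0.5 else -1.0
    return (math.cos((x - y) * t) + sign * math.cos((x + y) * t)) / math.pi


if __name__ == "__main__":
    for nu in (-0.5, 0.5):
        for (x, y, t) in ((0.3, 1.7, 2.0), (2.5, 0.4, 5.0), (1.0, 1.0, 0.1)):
            lhs = bessel_kernel_integrand(nu, x, y, t)
            rhs = wiener_hopf_hankel_integrand(nu, x, y, t)
            print(nu, x, y, t, abs(lhs - rhs))
```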
Here we think of $a$ as a function on $\mathbb{R}_+$ which is extended to an even function on $\mathbb{R}$ by stipulating $a(-x) = a(x)$. Hence in these cases, Theorem 1.1 describes the asymptotics of the determinants of Wiener-Hopf + Hankel operators $\det(I + P_\tau W(a) P_\tau \pm P_\tau H(a) P_\tau)$, where the symbol $a$ is even.

We should also note here that our proof requires that we establish for general $\nu$ sufficient conditions on a function $a$ such that the Bessel operator $B(a)$ differs from the Wiener-Hopf operator $W(a)$ by a Hilbert-Schmidt operator. For $\nu = \pm 1/2$ this reduces to the condition that the Hankel operator $H(a)$ is Hilbert-Schmidt. However the result for general $\nu$ is of independent interest, much more difficult to obtain, and thus the main focus of the next section of the paper.

Finally, the discrete analogue of computing Toeplitz + Hankel determinants has recently been investigated by the authors, and results have been generalized to the case where the symbol is discontinuous [3] (see also [2, 4]).

2. Operator Theoretic Preliminaries

In this section, we establish all general operator theoretic facts as well as particular results about Bessel operators and Wiener-Hopf operators that we will need later on.

First of all, let us mention that Bessel operators can be defined for arbitrary functions $a \in L^\infty(\mathbb{R}_+)$. For $\nu > -1$, let $H_\nu$ denote the Hankel transform
\[
H_\nu : L^2(\mathbb{R}_+) \to L^2(\mathbb{R}_+), \qquad f(x) \mapsto g(x) = \int_0^\infty \sqrt{tx} \, J_\nu(tx) f(t) \, dt. \tag{6}
\]
It is well known that $H_\nu$ is selfadjoint and unitary on $L^2(\mathbb{R}_+)$, i.e., $H_\nu^* = H_\nu^{-1} = H_\nu$ [14]. For $a \in L^\infty(\mathbb{R}_+)$ the Bessel operator $B(a) \in L(L^2(\mathbb{R}_+))$ is defined by
\[
B(a) = H_\nu M(a) H_\nu, \tag{7}
\]
where $M(a)$ is the multiplication operator on $L^2(\mathbb{R}_+)$. If $a \in L^\infty(\mathbb{R}_+) \cap L^1(\mathbb{R}_+)$, then this definition coincides with the one given in the introduction. From formula (7) it follows immediately that
\[
B(ab) = B(a) B(b) \tag{8}
\]
for all $a, b \in L^\infty(\mathbb{R}_+)$. It is also clear that the Bessel operators are bounded and
\[
\|B(a)\|_{L(L^2(\mathbb{R}_+))} = \|a\|_{L^\infty(\mathbb{R}_+)}. \tag{9}
\]
For $a \in L^\infty(\mathbb{R})$, the two-sided Wiener-Hopf operator $W^0(a) \in L(L^2(\mathbb{R}))$ is defined by
\[
W^0(a) = F M(a) F^{-1}, \tag{10}
\]
where $F : L^2(\mathbb{R}) \to L^2(\mathbb{R})$ is the Fourier transform and $M(a)$ stands here for the multiplication operator on $L^2(\mathbb{R})$. The usual Wiener-Hopf and Hankel operators acting on $L^2(\mathbb{R}_+)$ are defined by
\[
W(a) = P W^0(a) P |_{L^2(\mathbb{R}_+)}, \qquad H(a) = P W^0(a) J P |_{L^2(\mathbb{R}_+)}, \tag{11}
\]
where $(Pf)(x) = \chi_{\mathbb{R}_+}(x) f(x)$ and $(Jf)(x) = f(-x)$. For $a \in L^\infty(\mathbb{R})$, these operators are bounded and
\[
\|W(a)\|_{L(L^2(\mathbb{R}_+))} = \|a\|_{L^\infty(\mathbb{R})}, \qquad \|H(a)\|_{L(L^2(\mathbb{R}_+))} \le \|a\|_{L^\infty(\mathbb{R})}. \tag{12}
\]
Moreover, for $a, b \in L^\infty(\mathbb{R})$ the well known identities
\[
W(ab) = W(a) W(b) + H(a) H(\tilde{b}), \tag{13}
\]
\[
H(ab) = W(a) H(b) + H(a) W(\tilde{b}) \tag{14}
\]
hold, where $\tilde{b}(x) = b(-x)$. These identities are a simple consequence of the facts that $W^0(ab) = W^0(a) W^0(b)$, $I = P + JPJ$ and $J W^0(b) = W^0(\tilde{b}) J$.

If $a \in L^\infty(\mathbb{R}) \cap L^1(\mathbb{R})$, then $W(a)$ and $H(a)$ are integral operators on $L^2(\mathbb{R}_+)$ with kernel $\hat{a}(x - y)$ and $\hat{a}(x + y)$, respectively, where
\[
\hat{a}(x) = \frac{1}{2\pi} \int_{-\infty}^\infty e^{-ixt} a(t) \, dt \tag{15}
\]
is the Fourier transform of $a$. We remark that if we extend $a \in L^1(\mathbb{R}_+)$ to an even function $a_0 \in L^1(\mathbb{R})$ by stipulating $a_0(x) = a(|x|)$, $x \in \mathbb{R}$, then the cosine transform of $a$ coincides with the Fourier transform of $a_0$. Therefore we will use the same notation for the cosine transform (4) and the Fourier transform (15).

In addition to the projection $P_\tau$, we define the following operators acting on $L^2(\mathbb{R}_+)$:
\[
W_\tau : f(x) \mapsto g(x) = \begin{cases} f(\tau - x) & \text{for } 0 \le x \le \tau \\ 0 & \text{for } x > \tau, \end{cases} \tag{16}
\]
\[
V_\tau : f(x) \mapsto g(x) = \begin{cases} 0 & \text{for } 0 \le x \le \tau \\ f(x - \tau) & \text{for } x > \tau, \end{cases} \tag{17}
\]
\[
V_{-\tau} : f(x) \mapsto g(x) = f(x + \tau), \tag{18}
\]
and $Q_\tau = I - P_\tau$. It is readily verified that $P_\tau^2 = W_\tau^2 = P_\tau$, $W_\tau P_\tau = P_\tau W_\tau = W_\tau$, $V_{-\tau} V_\tau = I$ and $V_\tau V_{-\tau} = Q_\tau$. Moreover, the following identities hold:
\[
W_\tau W(a) W_\tau = P_\tau W(\tilde{a}) P_\tau, \qquad P_\tau W(a) V_\tau = W_\tau H(\tilde{a}), \qquad V_{-\tau} W(a) P_\tau = H(a) W_\tau. \tag{19}
\]
Lemma 2.1.
(a) Let $a \in L^\infty(\mathbb{R})$ and $K$ be a compact operator on $L^2(\mathbb{R}_+)$ such that $W(a) + K = 0$. Then $a = 0$ and $K = 0$.
(b) Let $a \in L^\infty(\mathbb{R}_+)$, $K$ be a compact operator on $L^2(\mathbb{R}_+)$ and $\{C_\tau\}_{\tau \in (0, \infty)}$ be a sequence of bounded operators on $L^2(\mathbb{R}_+)$ tending to zero in the operator norm as $\tau \to \infty$ such that $B_\tau(a) + W_\tau K W_\tau + C_\tau = 0$. Then $a = 0$, $K = 0$ and $C_\tau = 0$ for all $\tau \in (0, \infty)$.

Proof. (a): From the equation $W(a) + K = 0$ it follows that $V_{-\tau} W(a) V_\tau + V_{-\tau} K V_\tau = 0$. Since $V_{-\tau} W(a) V_\tau = W(a)$, we obtain that $W(a) = -V_{-\tau} K V_\tau$. Observing that $V_{-\tau} \to 0$ strongly as $\tau \to \infty$, we take the strong limit of the previous equality and it follows that $W(a) = 0$. Hence $a = 0$ and $K = 0$.

(b): Since $W_\tau \to 0$ weakly as $\tau \to \infty$, the operators $W_\tau K W_\tau$ converge strongly to zero. Taking the strong limit of $B_\tau(a) + W_\tau K W_\tau + C_\tau = 0$ we obtain $B(a) = 0$ because $B_\tau(a) = P_\tau B(a) P_\tau \to B(a)$ strongly. Hence $a = 0$ and $W_\tau K W_\tau + C_\tau = 0$. Multiplying with $W_\tau$ from both sides and taking again the strong limit, we conclude that $0 = P_\tau K P_\tau + W_\tau C_\tau W_\tau \to K$. Thus $K = 0$ and $C_\tau = 0$.

2.1. Hilbert-Schmidt and trace class conditions.

In what follows we establish sufficient conditions for the operators $H(a)$ and $B(a) - W(a)$ to be Hilbert-Schmidt. Moreover, we state sufficient conditions such that $B_\tau(a)$ is a trace class operator for each $\tau \in (0, \infty)$.

Proposition 2.2. Let $a \in L^\infty(\mathbb{R}) \cap L^1(\mathbb{R})$. If
\[
\int_0^\infty x |\hat{a}(x)|^2 \, dx < \infty, \tag{20}
\]
where $\hat{a}$ is given by (15), then $H(a)$ is a Hilbert-Schmidt operator on $L^2(\mathbb{R}_+)$.

Proof. As pointed out above, $H(a)$ is an integral operator with kernel $\hat{a}(x + y)$. This operator is Hilbert-Schmidt if and only if the following integral is finite, which is the square of the Hilbert-Schmidt norm of $H(a)$:
\[
\int_0^\infty \int_0^\infty |\hat{a}(x + y)|^2 \, dx \, dy.
\]
This integral coincides with (20).
Proposition 2.3. If $a \in L^\infty(\mathbb{R}_+) \cap L^1(\mathbb{R}_+)$, then $B_\tau(a)$ is a trace class operator on $L^2[0, \tau]$ for each $\tau \in (0, \infty)$.

Proof. Here we make use of Mercer's Theorem [9, Ch. III], which reads as follows: If $m$ is a continuous function on $[0, \tau] \times [0, \tau]$ such that $m(s, t) = \overline{m(t, s)}$ and
\[
\int_0^\tau \int_0^\tau m(t, s) f(t) \overline{f(s)} \, dt \, ds \ge 0, \tag{21}
\]
then the (positive semi-definite) operator given by
\[
L^2[0, \tau] \to L^2[0, \tau], \qquad f(t) \mapsto \int_0^\tau m(t, s) f(s) \, ds
\]
is a trace class operator. This theorem shows that if $a \in L^\infty(\mathbb{R}_+) \cap L^1(\mathbb{R}_+)$ is a nonnegative function, then $B_\tau(a)$ is trace class on $L^2[0, \tau]$. Indeed, let $m(s, t) = K(s, t)$ be the kernel (1) and note that $K(s, t)$ is continuous since $\nu > -1$. Moreover, the above integral (21) equals
\[
\int_0^\infty a(t) \, (H_\nu f)(t) \, \overline{(H_\nu f)(t)} \, dt,
\]
which is obviously nonnegative if $a$ is also. Since each function in $L^\infty(\mathbb{R}_+) \cap L^1(\mathbb{R}_+)$ can be represented as the linear combination of four nonnegative functions, the assertion follows for the general case.
The rest of this section is devoted to establishing sufficient conditions under which the difference $B(a) - W(a)$ is a Hilbert-Schmidt operator on $L^2(\mathbb{R}_+)$, where $a(-x) = a(x)$. This result will be crucial for the considerations in the subsequent sections. As mentioned in the introduction, it is motivated, to some extent, by the fact that $B(a) - W(a) = \pm H(a)$ for $\nu = \mp 1/2$. In this case the desired assertion is clear from Proposition 2.2.

Lemma 2.4. Let $a \in L^\infty(\mathbb{R}) \cap L^1(\mathbb{R})$. Then the operators $W(a) P_1$ and $P_1 W(a)$ are Hilbert-Schmidt operators on $L^2(\mathbb{R}_+)$.

Proof. The operator $W(a) P_1$ is Hilbert-Schmidt if and only if
\[
\int_0^1 \int_0^\infty |\hat{a}(x - y)|^2 \, dx \, dy < \infty, \tag{22}
\]
where $\hat{a}(x)$ is given by (15). Obviously,
\[
\int_0^1 \int_0^\infty |\hat{a}(x - y)|^2 \, dx \, dy \le \int_{-\infty}^\infty |\hat{a}(x)|^2 \, dx = \frac{1}{2\pi} \int_{-\infty}^\infty |a(t)|^2 \, dt, \tag{23}
\]
since $\hat{a}$ is the Fourier transform of $a$. This integral is finite because $L^\infty(\mathbb{R}) \cap L^1(\mathbb{R}) \subset L^2(\mathbb{R})$. Hence $W(a) P_1$ is Hilbert-Schmidt. It follows analogously that $P_1 W(a)$ is a Hilbert-Schmidt operator.

Let us at this point recall the following indefinite integrals for Bessel functions:
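\[
\int t J_\nu^2(tx) \, dt = \frac{t^2}{2} \left( J_\nu^2(tx) - J_{\nu+1}(tx) J_{\nu-1}(tx) \right), \tag{24}
\]
\[
\int t J_\nu(tx) J_\nu(ty) \, dt = \frac{tx J_{\nu+1}(tx) J_\nu(ty) - ty J_\nu(tx) J_{\nu+1}(ty)}{x^2 - y^2}. \tag{25}
\]
They can be proved in a straightforward manner by using the recursion formulas $J_{\nu-1}(x) + J_{\nu+1}(x) = \frac{2\nu}{x} J_\nu(x)$ and $J_{\nu-1}(x) - J_{\nu+1}(x) = 2 J_\nu'(x)$. The asymptotic behavior of $J_\nu(t)$ at zero and at infinity is as follows:
\[
J_\nu(t) = \left( \frac{t}{2} \right)^\nu \left( \frac{1}{\Gamma(\nu + 1)} + O(t^2) \right), \qquad t \to 0, \tag{26}
\]
\[
J_\nu(t) = \sqrt{\frac{2}{\pi t}} \left( \cos(t - \alpha) - \frac{\nu^2 - \frac{1}{4}}{2t} \sin(t - \alpha) + O(t^{-2}) \right), \qquad t \to \infty, \tag{27}
\]
where $\alpha = \frac{\pi}{2}\nu + \frac{\pi}{4}$.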
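The two asymptotic statements (26) and (27) can be probed numerically. The sketch below uses the standard power series of $J_\nu$, evaluated in plain double precision, which is an assumption adequate only for moderate arguments (roughly $0 < z \le 20$); the function names are hypothetical. It reports the ratio of $J_\nu(t)$ to the leading term of (26) for small $t$, and the remainder of (27) multiplied by $t^2$, which should stay bounded.

```python
import math


def bessel_j(nu, z, terms=120):
    """Power series of J_nu(z) for nu > -1; reliable here only for moderate z."""
    half = 0.5 * z
    term = half ** nu / math.gamma(nu + 1.0)
    total = term
    for k in range(terms):
        # Ratio of consecutive series terms: -(z/2)^2 / ((k+1)(k+nu+1)).
        term *= -half * half / ((k + 1) * (k + nu + 1.0))
        total += term
    return total


def alpha(nu):
    """Phase shift alpha = pi*nu/2 + pi/4 used in (27)."""
    return 0.5 * math.pi * nu + 0.25 * math.pi


def small_argument_ratio(nu, t):
    """J_nu(t) divided by the leading term (t/2)^nu / Gamma(nu+1) of (26)."""
    return bessel_j(nu, t) / ((0.5 * t) ** nu / math.gamma(nu + 1.0))


def scaled_large_argument_remainder(nu, t):
    """t^2 times the difference between sqrt(pi t / 2) J_nu(t) and the bracket of (27)."""
    a = alpha(nu)
    bracket = math.cos(t - a) - (nu * nu - 0.25) / (2.0 * t) * math.sin(t - a)
    return t * t * (math.sqrt(0.5 * math.pi * t) * bessel_j(nu, t) - bracket)


if __name__ == "__main__":
    nu = 0.3
    for t in (1e-1, 1e-2, 1e-3):
        print("small t:", t, small_argument_ratio(nu, t))
    for t in (5.0, 10.0, 15.0, 20.0):
        print("large t:", t, scaled_large_argument_remainder(nu, t))
```

For $\nu = 1/2$ the coefficient $\nu^2 - \frac{1}{4}$ vanishes and the first term of (27) is exact, so the scaled remainder is zero up to rounding.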
Lemma 2.5. Let $a \in L^\infty(\mathbb{R}_+) \cap L^1(\mathbb{R}_+)$. Then the operators $B(a) P_1$ and $P_1 B(a)$ are Hilbert-Schmidt operators on $L^2(\mathbb{R}_+)$.

Proof. The operator $B(a) P_1$ is Hilbert-Schmidt if and only if
\[
\int_0^1 \int_0^\infty |K(x, y)|^2 \, dx \, dy < \infty, \tag{28}
\]
where $K(x, y)$ is given by (1). Another interpretation of formula (1) is that for fixed $y$ the function $K(x, y)$ is the Hankel transform (6) of the function $a_y(t) = \sqrt{yt} \, J_\nu(yt) a(t)$. In other words, $K(x, y) = (H_\nu a_y)(x)$. Since $H_\nu$ is an isometry on $L^2(\mathbb{R}_+)$, it follows that
\[
\int_0^\infty |K(x, y)|^2 \, dx = \int_0^\infty |a_y(t)|^2 \, dt.
\]
Hence (28) is equal to
\[
\begin{aligned}
\int_0^1 \int_0^\infty yt J_\nu^2(yt) |a(t)|^2 \, dt \, dy &= \int_0^\infty |a(t)|^2 \, t \left( \int_0^1 y J_\nu^2(yt) \, dy \right) dt \\
&= \frac{1}{2} \int_0^\infty |a(t)|^2 \, t \left( J_\nu^2(t) - J_{\nu+1}(t) J_{\nu-1}(t) \right) dt.
\end{aligned}
\]
Here we have used (24). The term involving the Bessel functions is continuous for $t \in (0, \infty)$ and behaves at zero like $O(t^{2\nu})$, $\nu > -1$, and at infinity like $O(t^{-1})$. We can split the integral into an integral from zero to one and an integral from one to infinity. Using the fact that $a \in L^\infty(\mathbb{R}_+) \cap L^1(\mathbb{R}_+)$, it follows that (28) is finite. Hence $B(a) P_1$ is a Hilbert-Schmidt operator. It can be shown analogously that $P_1 B(a)$ is a Hilbert-Schmidt operator.

Lemma 2.6. For each $t \in [0, \infty)$ and $x, y \in [1, \infty)$, the following integral converges:
\[
K_t(x, y) = \int_t^\infty \left( s \sqrt{xy} \, J_\nu(xs) J_\nu(ys) - \frac{2 \cos(xs - \alpha) \cos(ys - \alpha)}{\pi} \right) ds. \tag{29}
\]
In particular,
\[
K_0(x, y) = - \frac{\sin(2\alpha)}{\pi (x + y)}, \tag{30}
\]
and with a certain constant $C_\nu$ depending only on $\nu$ we have
\[
\left( \int_1^\infty \int_1^\infty |K_t(x, y)|^2 \, dx \, dy \right)^{1/2} \le \frac{C_\nu}{t} \qquad \text{for all } t \in (0, \infty). \tag{31}
\]
Proof. From (25), we have the following indefinite integrals: 2 cos(xs − α) cos(ys − α) √ s xyJν (xs)Jν (ys) − ds π √ sxJν+1 (sx)Jν (sy) − syJν (sx)Jν+1 (sy) = xy x2 − y2 sin((x + y)s − 2α) sin((x − y)s) − − π(x + y) π(x − y) √ sJν+1 (sx)Jν (sy) + sJν (sx)Jν+1 (sy) sin((x + y)s − 2α) = xy − 2(x + y) π(x + y) √ sJν+1 (sx)Jν (sy) − sJν (sx)Jν+1 (sy) sin((x − y)s) + xy − . 2(x − y) π(x − y) Using the leading term in the asymptotics (27), it is easily seen that the previous expression tends to zero as s → ∞ for fixed x, y. Hence the integral Kt (x, y) exists for
498
E.L. Basor, T. Ehrhardt
t ∈ (0, ∞). Using the asymptotics (26) we obtain that the terms involving the Bessel functions tend to zero as s → 0 since ν > −1. Hence K0 (x, y) exists and equals (30). To be precise, the just stated assertions hold for x = y. However, if x = y, we can proceed similarly by using (24): 2 cos2 (xs − α) ds sxJν2 (xs) − π sin(2xs − 2α) xs 2 2 s = Jν (sx) − Jν+1 (sx)Jν−1 (sx) − − . 2 2π x π Using the first and second order asymptotics of (27), it follows that the previous expression tends to zero as s → ∞. Due to the asymptotics (26) the term containing the Bessel functions tends to zero as s → 0. Hence Kt (x, x) exists for all t ∈ [0, ∞) and K0 (x, x) equals (30). In order to prove (31) we divide the integral into three parts: ∞ ∞ ∞ Kt (x, y) = k1 (x, y; s) ds + k2 (x, y; s) ds + k3 (x, y; s) ds, (32) t
where
√ xsJν (xs) − k1 (x, y; s) =
t
t
2 cos(xs − α) π
√
ysJν (ys) −
2 2 cos(xs − α) cos(ys − α), π π 2 2 √ k3 (x, y; s) = cos(xs − α) ysJν (ys) − cos(ys − α) . π π
√ k2 (x, y; s) = xsJν (xs) −
2 cos(ys − α) , π
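Next we remark that from (26) and (27) it follows that for ν > −1,
\[
\Big| \sqrt{z}\,J_\nu(z) - \sqrt{\tfrac{2}{\pi}}\cos(z-\alpha) \Big| \le \frac{\mathrm{const}}{z} \quad\text{for all } z \in (0,\infty), \tag{33}
\]
with a constant that depends only on ν. Hence $|k_1(x,y;s)| \le \mathrm{const}\,(xys^2)^{-1}$, whence
\[
\Big| \int_t^\infty k_1(x,y;s)\,ds \Big| \le \frac{\mathrm{const}}{xyt}
\]
follows. Partial integration of the second integral in (32) gives
\begin{align*}
\int_t^\infty k_2(x,y;s)\,ds
&= \Big[ \Big( \sqrt{xs}\,J_\nu(xs) - \sqrt{\tfrac{2}{\pi}}\cos(xs-\alpha) \Big)\sqrt{\tfrac{2}{\pi}}\,\frac{\sin(ys-\alpha)}{y} \Big]_{s=t}^{\infty}\\
&\quad - \int_t^\infty x\Big( \frac{1}{2\sqrt{xs}}J_\nu(xs) + \sqrt{xs}\,J_\nu'(xs) + \sqrt{\tfrac{2}{\pi}}\sin(xs-\alpha) \Big)\sqrt{\tfrac{2}{\pi}}\,\frac{\sin(ys-\alpha)}{y}\,ds.
\end{align*}
By (33) the first expression is bounded by a constant times $(xyt)^{-1}$. Next observe that for fixed ν and arbitrary z ∈ (0, ∞) the following identity holds:
\begin{align*}
\frac{1}{2\sqrt{z}}J_\nu(z) + \sqrt{z}\,J_\nu'(z)
&= \frac{\nu+\frac12}{\sqrt{z}}J_\nu(z) - \sqrt{z}\,J_{\nu+1}(z) \tag{34}\\
&= \sqrt{\tfrac{2}{\pi}}\Big( -\sin(z-\alpha) - \frac{\big((\nu+1)^2-\frac14\big)\cos(z-\alpha)}{2z} + \frac{\big(\nu+\frac12\big)\cos(z-\alpha)}{z} \Big) + O\Big(\frac{1}{z^2}\Big). \tag{35}
\end{align*}
Like (33) this follows from (26) and (27). We emphasize that this estimate holds not just for z → ∞ but also for z → 0, thus uniformly for all z ∈ (0, ∞). Thus
\[
\int_t^\infty k_2(x,y;s)\,ds = O\Big(\frac{1}{xyt}\Big) + O\Big(\int_t^\infty \frac{ds}{xys^2}\Big) + A_\nu\int_t^\infty \frac{\cos(xs-\alpha)\sin(ys-\alpha)}{ys}\,ds
\]
with $A_\nu = \frac{1}{\pi}\big(\nu^2 - \frac14\big)$. A similar expression can be obtained for the integral involving $k_3(x,y;s)$. It follows that
\[
\int_t^\infty \big(k_2(x,y;s) + k_3(x,y;s)\big)\,ds = O\Big(\frac{1}{xyt}\Big) + \frac{A_\nu}{xy}\int_t^\infty \big( x\cos(xs-\alpha)\sin(ys-\alpha) + y\sin(xs-\alpha)\cos(ys-\alpha) \big)\frac{ds}{s}.
\]
The last integral equals
\[
\int_t^\infty \frac{d}{ds}\big( \sin(xs-\alpha)\sin(ys-\alpha) \big)\,\frac{ds}{s}.
\]
Another partial integration shows that this equals $O(t^{-1})$. Summarizing the previous results we can conclude that
\[
K_t(x,y) = O\Big(\frac{1}{xyt}\Big),
\]
from which the desired assertion (31) follows.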
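The uniform estimate (33), on which the bound for $k_1$ rests, can be probed numerically. The following is only a sketch under stated assumptions: the helper names `bessel_j` and `scaled_error` are ours, the Bessel function is evaluated from its power series (adequate only for moderate arguments), and the check is restricted to the sample indices ν = 0 and ν = 1/2 on the range 0 < z ≤ 15. For ν = 1/2 the difference in (33) vanishes identically, since $\sqrt{z}J_{1/2}(z) = \sqrt{2/\pi}\sin z$ and α = π/2.

```python
import math


def bessel_j(nu, z, terms=60):
    """J_nu(z) from its power series; adequate for moderate z (here z <= 15)."""
    total = 0.0
    for k in range(terms):
        total += ((-1) ** k * (z / 2.0) ** (2 * k + nu)
                  / (math.factorial(k) * math.gamma(k + nu + 1.0)))
    return total


def scaled_error(nu, z):
    """z * |sqrt(z) J_nu(z) - sqrt(2/pi) cos(z - alpha)|, the quantity bounded in (33)."""
    alpha = math.pi * nu / 2.0 + math.pi / 4.0
    main_term = math.sqrt(2.0 / math.pi) * math.cos(z - alpha)
    return z * abs(math.sqrt(z) * bessel_j(nu, z) - main_term)


grid = [0.05 * k for k in range(1, 301)]  # sample points in (0, 15]
worst_nu_zero = max(scaled_error(0.0, z) for z in grid)
worst_nu_half = max(scaled_error(0.5, z) for z in grid)
print(worst_nu_zero, worst_nu_half)
```

On this grid the scaled error stays bounded for ν = 0 and is at rounding level for ν = 1/2, in line with (33). This is a sanity check on a finite range, not a proof of the uniform bound.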
For $a \in L^\infty(\mathbb{R}_+) \cap L^1(\mathbb{R}_+)$, we introduce two operators acting on $L^2[1,\infty)$. Firstly, let
\[
K_a = Q_1\big(B(a) - W(a)\big)Q_1\big|_{L^2[1,\infty)}, \tag{36}
\]
where we stipulate a(−x) = a(x), x < 0, for the symbol of W(a). Secondly, define the Hankel operator $H_a$ as the integral operator on $L^2[1,\infty)$ with kernel
\[
H_a(x,y) = -\frac{\sin(2\alpha)\,a(0)}{\pi(x+y)} + \frac{1}{\pi}\int_0^\infty \cos\big((x+y)t - 2\alpha\big)\,a(t)\,dt, \tag{37}
\]
where $\alpha = \frac{\pi\nu}{2} + \frac{\pi}{4}$.
Lemma 2.7. Let $a \in L^\infty(\mathbb{R}_+) \cap L^1(\mathbb{R}_+)$ and assume that
(i) a is continuous on [0, ∞);
(ii) there exist a finite number of points $0 < t_1 < \dots < t_N < \infty$, N ≥ 1, such that a is two times continuously differentiable on the interval $[0, t_1]$ and one times continuously differentiable on each of the intervals $[t_1, t_2], \dots, [t_{N-1}, t_N], [t_N, \infty)$;
(iii) $(1+t)^{-1/2}a'(t) \in L^1(\mathbb{R}_+)$.
Then $K_a - H_a$ is a Hilbert-Schmidt operator on $L^2[1,\infty)$.

Proof. The kernel of the operator $K_a = Q_1(B(a) - W(a))Q_1$ is given by
\[
K_a(x,y) = \int_0^\infty \Big( t\sqrt{xy}\,J_\nu(xt)J_\nu(yt) - \frac{\cos(xt-yt)}{\pi} \Big)a(t)\,dt.
\]
This combined with (37) yields that $K_a(x,y) - H_a(x,y)$ equals
\[
\frac{\sin(2\alpha)\,a(0)}{\pi(x+y)} + \int_0^\infty \Big( t\sqrt{xy}\,J_\nu(xt)J_\nu(yt) - \frac{2\cos(xt-\alpha)\cos(yt-\alpha)}{\pi} \Big)a(t)\,dt.
\]
In other words,
\[
K_a(x,y) - H_a(x,y) = \frac{\sin(2\alpha)\,a(0)}{\pi(x+y)} - \int_0^\infty a(t)\,\frac{d}{dt}K_t(x,y)\,dt, \tag{38}
\]
where $K_t(x,y)$ is given by (29). We first consider functions a(t) which are two times continuously differentiable on [0, ∞), have compact support and satisfy $a'(0) = 0$. Notice that then $t^{-1}a'(t) \in L^1(\mathbb{R}_+)$. From formula (30) and partial integration of (38) we obtain
\[
K_a(x,y) - H_a(x,y) = \int_0^\infty a'(t)\,K_t(x,y)\,dt. \tag{39}
\]
Notice that $\lim_{t\to\infty} K_t(x,y) = 0$ and $a \in L^\infty(\mathbb{R}_+)$. Equation (31) says that the integral operators on $L^2[1,\infty)$ with kernel $tK_t(x,y)$ are Hilbert-Schmidt and their Hilbert-Schmidt norm is uniformly bounded for all t ∈ (0, ∞). Since $t^{-1}a'(t) \in L^1(\mathbb{R}_+)$, it follows that the operator $K_a - H_a$ is Hilbert-Schmidt.

Next, we assume that the function a(t) satisfies the assumptions of the lemma and in addition a(0) = 0. In this case, it is easily verified that the following integrals:
\[
\int_0^\infty \frac{|a(t)|}{t}\,dt,\qquad
\int_0^\infty \frac{|a(t)|}{t^{3/2}(1+t^{1/2})}\,dt,\qquad
\int_0^\infty \frac{|a'(t)|}{\sqrt{t}}\,dt,\qquad
\int_0^\infty \Big| \frac{d}{dt}\,\frac{a(t)}{t} \Big|\,dt
\]
are finite. Since a(0) = 0, Eq. (38) becomes
\[
K_a(x,y) - H_a(x,y) = -\int_0^\infty a(t)\,\frac{d}{dt}K_t(x,y)\,dt.
\]
Also recall, using the notation from the proof of Lemma 2.6, that we can write the function $-\frac{d}{dt}K_t(x,y)$ as the sum of three terms $k_1(x,y;t) + k_2(x,y;t) + k_3(x,y;t)$. We will now write the sum of these three operators as another sum of operators each of
which is trace class or Hilbert-Schmidt. To do this we recall that if an integral operator on $L^2[1,\infty)$ has a kernel given by
\[
K(x,y) = \int_0^\infty h_1(x,t)\,h_2(y,t)\,a(t)\,dt,
\]
then the trace class norm of this operator is at most
\[
\int_0^\infty |a(t)|\Big( \int_1^\infty |h_1(x,t)|^2\,dx \Big)^{1/2}\Big( \int_1^\infty |h_2(y,t)|^2\,dy \Big)^{1/2}dt.
\]
Let us begin with the term $k_1(x,y;t)$. From (26) and (27) along with the assumption ν > −1, we obtain
\[
\int_0^\infty \Big| \sqrt{z}\,J_\nu(z) - \sqrt{\tfrac{2}{\pi}}\cos(z-\alpha) \Big|^2 dz < \infty.
\]
This immediately yields that the trace class norm of the operator with kernel
\[
\int_0^\infty k_1(x,y;t)\,a(t)\,dt
\]
is bounded by a constant times $\int_0^\infty t^{-1}|a(t)|\,dt$.

The next term involving $k_2$ is a bit more complicated, but still follows the computation of Lemma 2.6. We write
\[
\int_0^\infty k_2(x,y;t)\,a(t)\,dt = \int_0^\infty \Big( \sqrt{xt}\,J_\nu(xt) - \sqrt{\tfrac{2}{\pi}}\cos(xt-\alpha) \Big)a(t)\sqrt{\tfrac{2}{\pi}}\cos(yt-\alpha)\,dt.
\]
We use integration by parts to write this as two terms. The first term is given by
\[
\Big[ \Big( \sqrt{xt}\,J_\nu(xt) - \sqrt{\tfrac{2}{\pi}}\cos(xt-\alpha) \Big)a(t)\sqrt{\tfrac{2}{\pi}}\,\frac{\sin(yt-\alpha)}{y} \Big]_{t=0}^{\infty}.
\]
Since a(t) is bounded at infinity, $a(t) = O(\sqrt{t})$ as t → 0 and ν > −1, this expression is zero. The next term yields, from differentiating the first factor, two terms, one of which is
\[
-\int_0^\infty \Big( \sqrt{xt}\,J_\nu(xt) - \sqrt{\tfrac{2}{\pi}}\cos(xt-\alpha) \Big)a'(t)\sqrt{\tfrac{2}{\pi}}\,\frac{\sin(yt-\alpha)}{y}\,dt.
\]
By the same argument as above, the trace norm of this operator is at most a constant times $\int_0^\infty t^{-1/2}|a'(t)|\,dt$ and is thus finite. So we are left with one last term
\[
-\int_0^\infty x\Big( \frac{1}{2\sqrt{xt}}J_\nu(xt) + \sqrt{xt}\,J_\nu'(xt) + \sqrt{\tfrac{2}{\pi}}\sin(xt-\alpha) \Big)a(t)\sqrt{\tfrac{2}{\pi}}\,\frac{\sin(yt-\alpha)}{y}\,dt.
\]
Following Lemma 2.6 and Eq. (27) we rewrite the Bessel functions to obtain an error term of the form
\[
\int_0^\infty x\,h(xt)\,a(t)\sqrt{\tfrac{2}{\pi}}\,\frac{\sin(yt-\alpha)}{y}\,dt, \tag{40}
\]
where h(z) is $O(z^p)$ with p = min{ν − 1/2, −1} for small z and $O(1/z^2)$ for large z. Note the error term for large z follows from the estimate in Eq. (35) in Lemma 2.6. The norm of the term
\[
\frac{\sin(yt-\alpha)}{y}
\]
in $L^2[1,\infty)$ is uniformly bounded for t ∈ [0, ∞), and the estimate of the norm of $x\,h(xt)$ in $L^2[1,\infty)$ is $O(t^{-3/2})$ as t → 0 and $O(t^{-2})$ as t → ∞. Thus we have a trace norm estimate of a constant times
\[
\int_0^\infty \frac{|a(t)|}{t^{3/2}(1+t^{1/2})}\,dt,
\]
which is finite. Hence (40) has bounded trace norm. Thus we have one remaining term
\[
A_\nu\int_0^\infty \frac{\cos(xt-\alpha)\sin(yt-\alpha)}{yt}\,a(t)\,dt.
\]
Now the computation for the last term $k_3$ is exactly the same as above with the variables reversed, so once again we combine these terms into one final integral,
\[
\frac{A_\nu}{xy}\int_0^\infty \big( x\cos(xt-\alpha)\sin(yt-\alpha) + y\sin(xt-\alpha)\cos(yt-\alpha) \big)\frac{a(t)}{t}\,dt,
\]
or
\[
\frac{A_\nu}{xy}\int_0^\infty \frac{d}{dt}\big( \sin(xt-\alpha)\sin(yt-\alpha) \big)\frac{a(t)}{t}\,dt.
\]
We integrate by parts one more time so that the above is
\[
\frac{A_\nu}{xy}\Big[ \sin(xt-\alpha)\sin(yt-\alpha)\,\frac{a(t)}{t} \Big]_{t=0}^{\infty} - \frac{A_\nu}{xy}\int_0^\infty \sin(xt-\alpha)\sin(yt-\alpha)\,\frac{d}{dt}\Big(\frac{a(t)}{t}\Big)\,dt.
\]
From our assumptions on a(t), these expressions are easily seen to be $O\big(\frac{1}{xy}\big)$ and are hence Hilbert-Schmidt.

To complete the proof for arbitrary functions a satisfying the assumptions of the lemma, let f be a two times continuously differentiable function on $\mathbb{R}_+$ with compact support and f(0) = 1, $f'(0) = 0$. We decompose $a = a_1 + a_2$, where $a_1(t) = a(0)f(t)$ and $a_2(t) = a(t) - a(0)f(t)$. The function $a_1$ then fulfills the first assumed conditions, and the function $a_2$ satisfies the second assumptions. This completes the proof.

Lemma 2.8. Let $a \in L^\infty(\mathbb{R}_+) \cap L^1(\mathbb{R}_+)$ and assume that a is continuous and piecewise $C^2$ on [0, ∞). Assume also that $\lim_{x\to\infty} a(x) = 0$ and $a'' \in L^1(\mathbb{R}_+)$. Then $H_a$ is Hilbert-Schmidt.

Proof. We integrate the following integral twice by parts, where x ≥ 2:
\begin{align*}
\frac{1}{\pi}\int_0^\infty \cos(xt-2\alpha)\,a(t)\,dt
&= \Big[ \frac{\sin(xt-2\alpha)}{\pi x}\,a(t) \Big]_0^\infty - \int_0^\infty \frac{\sin(xt-2\alpha)}{\pi x}\,a'(t)\,dt\\
&= \frac{\sin(2\alpha)\,a(0)}{\pi x} + \sum_{i=0}^{n-1}\Big( \Big[ \frac{\cos(xt-2\alpha)}{\pi x^2}\,a'(t) \Big]_{t_i}^{t_{i+1}} - \int_{t_i}^{t_{i+1}} \frac{\cos(xt-2\alpha)}{\pi x^2}\,a''(t)\,dt \Big),
\end{align*}
where $0 = t_0 < t_1 < \dots < t_{n-1} < t_n = \infty$ are the points where the derivatives are discontinuous. From this it follows that
\[
|H_a(x,y)| \le \frac{C}{(x+y)^2}.
\]
Hence this operator is Hilbert-Schmidt.
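The cancellation behind Lemma 2.8 can be seen on a concrete symbol. The sketch below is an illustration of ours, not part of the original argument: it assumes the sample symbol a(t) = exp(−t) and the sample index ν = 0.3, evaluates the kernel (37) by a Simpson rule (the helper names are hypothetical), and compares it with the closed form $(u\cos 2\alpha - \sin 2\alpha)/(\pi u(1+u^2))$, u = x + y, which follows from the elementary integral of $e^{-t}\cos(ut - 2\alpha)$.

```python
import math


def simpson(f, a, b, steps):
    """Composite Simpson rule on [a, b]; steps must be even."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0


def hankel_kernel_numeric(u, alpha):
    """Kernel (37) for the sample symbol a(t) = exp(-t), as a function of u = x + y."""
    integral = simpson(lambda t: math.cos(u * t - 2 * alpha) * math.exp(-t),
                       0.0, 40.0, 80_000)
    return -math.sin(2 * alpha) / (math.pi * u) + integral / math.pi


def hankel_kernel_closed(u, alpha):
    """Closed form of the same kernel; the 1/u terms cancel, leaving O(1/u^2)."""
    return (u * math.cos(2 * alpha) - math.sin(2 * alpha)) / (math.pi * u * (1 + u * u))


nu = 0.3  # sample Bessel index (assumption), nu > -1
alpha = math.pi * nu / 2 + math.pi / 4
samples = [2.0, 5.0, 10.0, 20.0]
max_gap = max(abs(hankel_kernel_numeric(u, alpha) - hankel_kernel_closed(u, alpha))
              for u in samples)
max_scaled = max(u * u * abs(hankel_kernel_closed(u, alpha)) for u in samples)
print(max_gap, max_scaled)
```

For this symbol the quantity $(x+y)^2|H_a(x,y)|$ stays bounded, which is exactly the decay used to conclude that $H_a$ is Hilbert-Schmidt on $L^2[1,\infty)$.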
Proposition 2.9. Let $a \in L^\infty(\mathbb{R}_+) \cap L^1(\mathbb{R}_+)$ and assume that
(i) a is continuous and piecewise $C^2$ on [0, ∞), and $\lim_{t\to\infty} a(t) = 0$;
(ii) $(1+t)^{-1/2}a' \in L^1(\mathbb{R}_+)$ and $a'' \in L^1(\mathbb{R}_+)$.
Then the operator B(a) − W(a) is Hilbert-Schmidt on $L^2(\mathbb{R}_+)$.

Proof. We write the operators B(a) and W(a) as follows:
\begin{align*}
B(a) &= P_1B(a) + Q_1B(a)P_1 + Q_1B(a)Q_1,\\
W(a) &= P_1W(a) + Q_1W(a)P_1 + Q_1W(a)Q_1.
\end{align*}
Hence by Lemma 2.4 and Lemma 2.5, $B(a) - W(a) = K_a$ + Hilbert-Schmidt. Now it remains to apply Lemma 2.7 and Lemma 2.8.
3. The Algebraic Approach

Here we follow, essentially, the method developed in [7]. It is useful to point out that the next sections use Banach algebra techniques to compute determinants. This is done without the knowledge that the operator B(a) − W(a) is trace class, but merely Hilbert-Schmidt.

3.1. A Banach algebra of functions. Let $\mathcal{S}$ stand for the set of all functions $a \in L^\infty(\mathbb{R}_+)$ such that the following properties are fulfilled:
(i) H(a) is a Hilbert-Schmidt operator on $L^2(\mathbb{R}_+)$, where a(−x) := a(x);
(ii) B(a) − W(a) is a Hilbert-Schmidt operator on $L^2(\mathbb{R}_+)$.
We introduce a norm in $\mathcal{S}$ by stipulating
\[
\|a\|_{\mathcal{S}} = \|a\|_{L^\infty(\mathbb{R}_+)} + \|H(a)\|_{\mathcal{C}_2(L^2(\mathbb{R}_+))} + \|B(a) - W(a)\|_{\mathcal{C}_2(L^2(\mathbb{R}_+))}. \tag{41}
\]
We use the notation $\|A\|_{\mathcal{C}_2(H)}$ to denote the Hilbert-Schmidt norm of an operator A defined on a space H and later the notation $\|A\|_{\mathcal{C}_1(H)}$ to denote the trace norm.

Proposition 3.1. $\mathcal{S}$ is a Banach algebra.

Proof. The linearity and the completeness are easy to verify. It remains to show that $a, b \in \mathcal{S}$ implies $ab \in \mathcal{S}$ and that $\|ab\|_{\mathcal{S}} \le \mathrm{const}\,\|a\|_{\mathcal{S}}\|b\|_{\mathcal{S}}$. Note that this estimate makes $\mathcal{S}$ a Banach algebra only up to passing to an equivalent norm, in contrast to the usual definition, where the above constant is one.
Indeed, let $a, b \in \mathcal{S}$. Obviously $ab \in L^\infty(\mathbb{R}_+)$. Moreover,
\begin{align*}
H(ab) &= H(a)W(b) + W(a)H(b),\\
B(ab) - W(ab) &= B(a)B(b) - W(a)W(b) - H(a)H(b)\\
&= B(a)\big( B(b) - W(b) \big) + \big( B(a) - W(a) \big)W(b) - H(a)H(b).
\end{align*}
It follows that both H(ab) and B(ab) − W(ab) are Hilbert-Schmidt. Moreover,
\begin{align*}
\|H(ab)\|_2 &\le \|H(a)\|_2\|b\|_\infty + \|a\|_\infty\|H(b)\|_2,\\
\|B(ab) - W(ab)\|_2 &\le \|a\|_\infty\|B(b) - W(b)\|_2 + \|B(a) - W(a)\|_2\|b\|_\infty + \|H(a)\|_2\|H(b)\|_2.
\end{align*}
From this the norm estimate is easy to obtain.
3.2. A Banach algebra of Wiener-Hopf operators. Let $\mathcal{B}$ be the set of all operators of the form
\[
A = W(a) + K, \tag{42}
\]
where $a \in \mathcal{S}$ and K is a trace class operator on $L^2(\mathbb{R}_+)$. We define in $\mathcal{B}$ a norm by
\[
\|A\|_{\mathcal{B}} = \|a\|_{\mathcal{S}} + \|K\|_{\mathcal{C}_1(L^2(\mathbb{R}_+))}. \tag{43}
\]
This definition is correct since a and K are uniquely determined by the operator A. In fact, this is a consequence of Lemma 2.1(a).

Proposition 3.2. $\mathcal{B}$ is a Banach algebra.

Proof. The linearity and the completeness can be shown straightforwardly. As above, we prove that $A, B \in \mathcal{B}$ implies $AB \in \mathcal{B}$ and a corresponding norm estimate. In fact, let
\[
A = W(a) + K, \qquad B = W(b) + L, \tag{44}
\]
where $a, b \in \mathcal{S}$ and $K, L \in \mathcal{C}_1(L^2(\mathbb{R}_+))$. Then
\[
AB = \big( W(a) + K \big)\big( W(b) + L \big) = W(ab) - H(a)H(b) + KW(b) + W(a)L + KL.
\]
Noting that $ab \in \mathcal{S}$ and H(a)H(b) is trace class, it follows that $AB \in \mathcal{B}$. Moreover,
\begin{align*}
\|AB\|_{\mathcal{B}} &\le \|ab\|_{\mathcal{S}} + \|H(a)\|_2\|H(b)\|_2 + \|K\|_1\|b\|_\infty + \|a\|_\infty\|L\|_1 + \|K\|_1\|L\|_1\\
&\le \mathrm{const}\,\|a\|_{\mathcal{S}}\|b\|_{\mathcal{S}} + \|K\|_1\|b\|_{\mathcal{S}} + \|a\|_{\mathcal{S}}\|L\|_1 + \|K\|_1\|L\|_1.
\end{align*}
Thus $\|AB\|_{\mathcal{B}} \le \mathrm{const}\,\|A\|_{\mathcal{B}}\|B\|_{\mathcal{B}}$.
3.3. A Banach algebra of sequences of Bessel operators. We are going to introduce a Banach algebra of sequences $\{A_\tau\}$, τ ∈ (0, ∞), which contains the sequences $\{B_\tau(a)\}$ of Bessel operators. First of all, let $\mathcal{N}$ be the set of all sequences $\{C_\tau\}$, τ ∈ (0, ∞), where
• $C_\tau$ is a trace class operator on $L^2(\mathbb{R}_+)$ for all τ ∈ (0, ∞);
• $\sup_{\tau\in(0,\infty)} \|C_\tau\|_{\mathcal{C}_1(L^2(\mathbb{R}_+))} < \infty$;
• $\lim_{\tau\to\infty} \|C_\tau\|_{\mathcal{C}_1(L^2(\mathbb{R}_+))} = 0$.

Now let $\mathcal{F}$ stand for the set of all sequences $\{A_\tau\}$, τ ∈ (0, ∞), which are of the form
\[
A_\tau = B_\tau(a) + W_\tau K W_\tau + C_\tau, \tag{45}
\]
where $a \in \mathcal{S}$, K is a trace class operator on $L^2(\mathbb{R}_+)$ and $\{C_\tau\} \in \mathcal{N}$. For such sequences we introduce a norm by
\[
\|\{A_\tau\}\|_{\mathcal{F}} = \|a\|_{\mathcal{S}} + \|K\|_{\mathcal{C}_1(L^2(\mathbb{R}_+))} + \sup_{\tau\in(0,\infty)} \|C_\tau\|_{\mathcal{C}_1(L^2(\mathbb{R}_+))}. \tag{46}
\]
This definition is correct because Lemma 2.1(b) implies that for a given sequence $\{A_\tau\}$ the function a and the operators K and $C_\tau$ are determined uniquely.

Moreover, for sequences $\{A_\tau^{(1)}\}, \{A_\tau^{(2)}\} \in \mathcal{F}$ and $\lambda^{(1)}, \lambda^{(2)} \in \mathbb{C}$ we define algebraic operations by
\[
\lambda^{(1)}\{A_\tau^{(1)}\} + \lambda^{(2)}\{A_\tau^{(2)}\} = \{\lambda^{(1)}A_\tau^{(1)} + \lambda^{(2)}A_\tau^{(2)}\}, \qquad
\{A_\tau^{(1)}\}\{A_\tau^{(2)}\} = \{A_\tau^{(1)}A_\tau^{(2)}\}. \tag{47}
\]
We will provide $\mathcal{F}$ with these algebraic operations and the above norm.

Proposition 3.3. $\mathcal{F}$ is a Banach algebra, and $\mathcal{N}$ is a closed two-sided ideal of $\mathcal{F}$.

Proof. The only non-trivial statement to prove is that $\{A_\tau^{(1)}\} \in \mathcal{F}$ and $\{A_\tau^{(2)}\} \in \mathcal{F}$ implies that $\{A_\tau^{(1)}\}\{A_\tau^{(2)}\} \in \mathcal{F}$ and the corresponding norm estimate. Let
\[
A_\tau^{(j)} = B_\tau(a_j) + W_\tau K_j W_\tau + C_\tau^{(j)}, \qquad j = 1, 2, \tag{48}
\]
where $a_j \in \mathcal{S}$, $K_j$ is trace class and $\{C_\tau^{(j)}\} \in \mathcal{N}$. Using for brevity the notation R(a) = B(a) − W(a), consider first
\begin{align*}
P_\tau B(a_1)Q_\tau B(a_2)P_\tau
&= P_\tau W(a_1)Q_\tau W(a_2)P_\tau + P_\tau W(a_1)Q_\tau R(a_2)P_\tau\\
&\quad + P_\tau R(a_1)Q_\tau W(a_2)P_\tau + P_\tau R(a_1)Q_\tau R(a_2)P_\tau\\
&= W_\tau H(a_1)H(a_2)W_\tau + W_\tau H(a_1)V_{-\tau}R(a_2)P_\tau\\
&\quad + P_\tau R(a_1)V_\tau H(a_2)W_\tau + P_\tau R(a_1)Q_\tau R(a_2)P_\tau.
\end{align*}
Since $H(a_j)$ and $R(a_j)$ are Hilbert-Schmidt operators and $V_\tau^* = V_{-\tau}$ and $Q_\tau$ converge strongly to zero on $L^2(\mathbb{R}_+)$ as τ → ∞, it follows that the last three terms belong to $\mathcal{N}$. Thus, it can be seen that $\{P_\tau B(a_1)Q_\tau B(a_2)P_\tau\} \in \mathcal{F}$, and the norm can be estimated by $\|a_1\|_{\mathcal{S}}\|a_2\|_{\mathcal{S}}$. From this it follows that
\begin{align*}
B_\tau(a_1)B_\tau(a_2) &= P_\tau B(a_1)B(a_2)P_\tau - P_\tau B(a_1)Q_\tau B(a_2)P_\tau\\
&= P_\tau B(a_1a_2)P_\tau - W_\tau H(a_1)H(a_2)W_\tau + D_\tau^{(1)},
\end{align*}
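where $\{D_\tau^{(1)}\} \in \mathcal{N}$ with $\|\{D_\tau^{(1)}\}\|_{\mathcal{F}} \le \|a_1\|_{\mathcal{S}}\|a_2\|_{\mathcal{S}}$. In particular, $\{B_\tau(a_1)\}\{B_\tau(a_2)\} \in \mathcal{F}$. Furthermore, observe that
\begin{align*}
B_\tau(a_1)W_\tau K_2W_\tau &= P_\tau W(a_1)W_\tau K_2W_\tau + P_\tau R(a_1)W_\tau K_2W_\tau\\
&= W_\tau W(a_1)P_\tau K_2W_\tau + P_\tau R(a_1)W_\tau K_2W_\tau\\
&= W_\tau W(a_1)K_2W_\tau - W_\tau W(a_1)Q_\tau K_2W_\tau + P_\tau R(a_1)W_\tau K_2W_\tau,
\end{align*}
where the last two terms belong to $\mathcal{N}$ since $Q_\tau \to 0$ strongly and $W_\tau \to 0$ weakly. Hence
\[
B_\tau(a_1)W_\tau K_2W_\tau = W_\tau W(a_1)K_2W_\tau + D_\tau^{(2)},
\]
where $\{D_\tau^{(2)}\} \in \mathcal{N}$ with $\|\{D_\tau^{(2)}\}\|_{\mathcal{F}} \le \|a_1\|_{\mathcal{S}}\|K_2\|_1$. By a similar argument we obtain that
\[
W_\tau K_1W_\tau B_\tau(a_2) = W_\tau K_1W(a_2)W_\tau + D_\tau^{(3)},
\]
where $\{D_\tau^{(3)}\} \in \mathcal{N}$ with $\|\{D_\tau^{(3)}\}\|_{\mathcal{F}} \le \|K_1\|_1\|a_2\|_{\mathcal{S}}$. Finally,
\[
W_\tau K_1W_\tau W_\tau K_2W_\tau = W_\tau K_1K_2W_\tau + D_\tau^{(4)},
\]
where $\{D_\tau^{(4)}\} = \{-W_\tau K_1Q_\tau K_2W_\tau\} \in \mathcal{N}$ and $\|\{D_\tau^{(4)}\}\|_{\mathcal{F}} \le \|K_1\|_1\|K_2\|_1$. Summarizing the above we can conclude that
\[
A_\tau^{(1)}A_\tau^{(2)} = B_\tau(a_1a_2) + W_\tau KW_\tau + D_\tau, \tag{49}
\]
where
\[
K = W(a_1)K_2 + K_1W(a_2) + K_1K_2 - H(a_1)H(a_2) \tag{50}
\]
and $\{D_\tau\} \in \mathcal{N}$ with $\|\{D_\tau\}\|_{\mathcal{F}} \le \mathrm{const}\,\|\{A_\tau^{(1)}\}\|_{\mathcal{F}}\|\{A_\tau^{(2)}\}\|_{\mathcal{F}}$. Hence $\{A_\tau^{(1)}\}\{A_\tau^{(2)}\} \in \mathcal{F}$. Noting that
\begin{align*}
\|K\|_1 &\le \|a_1\|_\infty\|K_2\|_1 + \|K_1\|_1\|a_2\|_\infty + \|K_1\|_1\|K_2\|_1 + \|H(a_1)\|_2\|H(a_2)\|_2\\
&\le \big( \|a_1\|_{\mathcal{S}} + \|K_1\|_1 \big)\big( \|a_2\|_{\mathcal{S}} + \|K_2\|_1 \big),
\end{align*}
we obtain that
\[
\|\{A_\tau^{(1)}\}\{A_\tau^{(2)}\}\|_{\mathcal{F}} = \|a_1a_2\|_{\mathcal{S}} + \|K\|_1 + \|\{D_\tau\}\|_{\mathcal{F}} \le \mathrm{const}\,\|\{A_\tau^{(1)}\}\|_{\mathcal{F}}\|\{A_\tau^{(2)}\}\|_{\mathcal{F}}.
\]
Finally, the fact that $\mathcal{N}$ is an ideal of $\mathcal{F}$ follows from formulas (49) and (50) with $a_1 = 0$ and $K_1 = 0$ or $a_2 = 0$ and $K_2 = 0$, respectively.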
3.4. Banach algebra ideals and homomorphisms. In the following proposition we introduce a Banach algebra homomorphism that links the Banach algebras $\mathcal{F}$ and $\mathcal{B}$, which have been introduced in the previous sections. This result also shows that the quotient Banach algebra $\mathcal{F}/\mathcal{N}$ is isomorphic to $\mathcal{B}$.

Proposition 3.4. The mapping $\Phi : \mathcal{F} \to \mathcal{B}$ defined by
\[
\Phi : \{A_\tau\} = \{B_\tau(a) + W_\tau KW_\tau + C_\tau\} \mapsto A = W(a) + K
\]
is a surjective Banach algebra homomorphism with kernel $\mathcal{N}$.

Proof. The linearity and surjectivity are obvious. The continuity follows from the definition of the norms:
\[
\|\{A_\tau\}\|_{\mathcal{F}} = \|a\|_{\mathcal{S}} + \|K\|_1 + \|\{C_\tau\}\|_{\mathcal{F}} \qquad\text{and}\qquad \|A\|_{\mathcal{B}} = \|a\|_{\mathcal{S}} + \|K\|_1. \tag{51}
\]
It is also clear that the kernel equals $\mathcal{N}$. The only point that needs an explanation is the multiplicativity. Assume that sequences $\{A_\tau^{(1)}\}$ and $\{A_\tau^{(2)}\}$ are given by (48). Then $\{A_\tau\} = \{A_\tau^{(1)}\}\{A_\tau^{(2)}\}$ is given by formulas (49) and (50). Hence
\[
\Phi(\{A_\tau\}) = W(a_1a_2) + W(a_1)K_2 + K_1W(a_2) + K_1K_2 - H(a_1)H(a_2).
\]
On the other hand,
\[
\Phi(\{A_\tau^{(j)}\}) = W(a_j) + K_j.
\]
Since $W(a_1)W(a_2) = W(a_1a_2) - H(a_1)H(a_2)$, this implies that $\Phi(\{A_\tau\}) = \Phi(\{A_\tau^{(1)}\})\,\Phi(\{A_\tau^{(2)}\})$.
Let $\mathcal{J} \subseteq \mathcal{F}$ be the set
\[
\mathcal{J} = \big\{ \{W_\tau KW_\tau + C_\tau\} : K \in \mathcal{C}_1(L^2(\mathbb{R}_+)) \text{ and } \{C_\tau\} \in \mathcal{N} \big\}. \tag{52}
\]
Formulas (48), (49) and (50) and the definition of the norm in $\mathcal{F}$ show that $\mathcal{J}$ is a closed two-sided ideal of $\mathcal{F}$. One can form the quotient Banach algebra $\mathcal{F}/\mathcal{J}$. By
\[
\pi : \mathcal{F} \to \mathcal{F}/\mathcal{J}, \quad \{A_\tau\} \mapsto \{A_\tau\} + \mathcal{J}, \tag{53}
\]
we denote the canonical homomorphism. Furthermore, we introduce the linear and continuous mapping
\[
\Lambda : \mathcal{S} \to \mathcal{F}, \quad a \mapsto \{B_\tau(a)\}. \tag{54}
\]
Proposition 3.5. The mapping $\pi \circ \Lambda : \mathcal{S} \to \mathcal{F}/\mathcal{J}$ is a Banach algebra isomorphism.

Proof. It is obvious that $\pi \circ \Lambda$ is linear, continuous and surjective. In regard to the multiplicativity we may refer to formulas (48) and (49) again, which show that $B_\tau(a_1)B_\tau(a_2) = B_\tau(a_1a_2) + W_\tau KW_\tau + C_\tau$ with a certain $K \in \mathcal{C}_1(L^2(\mathbb{R}_+))$ and $\{C_\tau\} \in \mathcal{N}$. Finally, the mapping is in fact an isomorphism, since for $\{A_\tau\} = \{B_\tau(a) + W_\tau KW_\tau + C_\tau\}$ the following norm equality holds,
\[
\|\{A_\tau\} + \mathcal{J}\|_{\mathcal{F}/\mathcal{J}} = \|a\|_{\mathcal{S}},
\]
which in turn follows immediately from the definition of the norm in $\mathcal{F}$.
4. Asymptotic Analysis

We start with an auxiliary result concerning the exponential of an element of $\mathcal{F}$. Recall that the exponential of an element b of a Banach algebra is defined by the (absolutely convergent) sum
\[
e^b = \sum_{n=0}^\infty \frac{b^n}{n!}. \tag{55}
\]
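As a small illustration of the series (55), consider the Banach algebra of real 2 × 2 matrices. The sketch below is ours and purely illustrative: the helper names `mat_mul` and `mat_exp` and the truncation after 40 terms are assumptions. It sums the series for the generator of a plane rotation and compares the result with the rotation matrix by the angle θ.

```python
import math


def mat_mul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_exp(b, terms=40):
    """Partial sum of the exponential series (55) in a matrix algebra."""
    n = len(b)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]  # current summand b^k / k!, starting with k = 0
    for k in range(1, terms):
        term = [[entry / k for entry in row] for row in mat_mul(term, b)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result


theta = 0.7
generator = [[0.0, -theta], [theta, 0.0]]
rotation = mat_exp(generator)
identity_check = mat_mul(rotation, mat_exp([[0.0, theta], [-theta, 0.0]]))
print(rotation)
```

The computed matrix agrees with the rotation by θ up to rounding, and multiplying by the exponential of the negated generator returns the identity, as expected for commuting exponents.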
For any Banach algebra homomorphism ξ we have $\xi(e^b) = e^{\xi(b)}$.

Lemma 4.1. Let $\{A_\tau\} \in \mathcal{F}$. Then $\{e^{A_\tau}\} = e^{\{A_\tau\}} \in \mathcal{F}$.

Proof. Obviously, $e^{\{A_\tau\}} \in \mathcal{F}$. Hence $e^{\{A_\tau\}} = \{\hat{A}_\tau\}$, where $\{\hat{A}_\tau\}$ is a sequence of operators in $\mathcal{F}$. For each fixed $\tau_0 \in (0,\infty)$, the mapping $\xi_{\tau_0} : \{A_\tau\} \in \mathcal{F} \mapsto A_{\tau_0} \in \mathcal{L}(L^2(\mathbb{R}_+))$ is a Banach algebra homomorphism. We apply these homomorphisms to the equation $e^{\{A_\tau\}} = \{\hat{A}_\tau\}$ and obtain that $e^{A_{\tau_0}} = \hat{A}_{\tau_0}$. Hence $\{\hat{A}_\tau\} = \{e^{A_\tau}\}$, and the assertion is proved.

The following result is the crucial point in our argument.

Proposition 4.2. Let $b \in \mathcal{S}$ and $\{A_\tau\} = \{B_\tau(e^b)e^{-B_\tau(b)}\}$. Then there exist an operator $K \in \mathcal{C}_1(L^2(\mathbb{R}_+))$ and a sequence $\{C_\tau\} \in \mathcal{N}$ such that
\[
\{A_\tau\} = \{P_\tau + W_\tau KW_\tau + C_\tau\}. \tag{56}
\]
Moreover, the operator K is determined by the identity
\[
I + K = W(e^b)e^{-W(b)}. \tag{57}
\]
Proof. First of all, the previous lemma implies that the sequence $\{A_\tau\}$ is contained in $\mathcal{F}$. Moreover, the following identity holds:
\[
\{A_\tau\} = \{B_\tau(e^b)e^{-B_\tau(b)}\} = \{B_\tau(e^b)\}e^{-\{B_\tau(b)\}} = \Lambda(e^b)e^{-\Lambda(b)}.
\]
Applying the canonical homomorphism π to this identity, we obtain
\[
\pi(\{A_\tau\}) = \pi\big( \Lambda(e^b)e^{-\Lambda(b)} \big) = \pi(\Lambda(e^b))\,e^{-\pi(\Lambda(b))} = (\pi\circ\Lambda)(e^b)\,e^{-(\pi\circ\Lambda)(b)}.
\]
Using the fact that $\pi \circ \Lambda$ is a homomorphism, it follows that
\[
\pi(\{A_\tau\}) = (\pi\circ\Lambda)(e^be^{-b}) = \{P_\tau\} + \mathcal{J}.
\]
Hence (56) is proved. Finally we apply the homomorphism Φ to this identity. From the definition of this homomorphism, identity (57) follows immediately.

An operator A is said to be of determinant class if A = I + K, where K is a trace class operator. In this case, the operator determinant of A is well defined. A conclusion of the previous proposition is that the operator
\[
W(e^b)e^{-W(b)} \tag{58}
\]
is of determinant class for each $b \in \mathcal{S}$. Hence the following operator determinant
\[
E[b] = \det W(e^b)e^{-W(b)} \tag{59}
\]
is well defined for each $b \in \mathcal{S}$.
Lemma 4.3. Let $A_\tau = P_\tau + W_\tau KW_\tau + C_\tau$, where $K \in \mathcal{C}_1(L^2(\mathbb{R}_+))$ and $\{C_\tau\} \in \mathcal{N}$. Then
\[
\lim_{\tau\to\infty} \det A_\tau = \det(I + K). \tag{60}
\]
Proof. We remark that
\[
\det A_\tau = \det(I + W_\tau KW_\tau + C_\tau) = \det(I + P_\tau KP_\tau + W_\tau C_\tau W_\tau).
\]
Since $Q_\tau^* = Q_\tau \to 0$ strongly on $L^2(\mathbb{R}_+)$, it follows that this equals $\det(I + K + \tilde{C}_\tau)$ with $\tilde{C}_\tau$ tending to zero in the trace class norm.

In order to prepare for the theorem below, let us recall Proposition 2.3, which says that for each $b \in L^\infty(\mathbb{R}_+) \cap L^1(\mathbb{R}_+)$ the operator $B_\tau(b)$ is a trace class operator on $L^2[0,\tau]$ for each τ ∈ (0, ∞). Consequently, the operator trace of $B_\tau(b)$ is well defined. Moreover, the following result holds.

Proposition 4.4. For each $b \in L^\infty(\mathbb{R}_+) \cap L^1(\mathbb{R}_+)$ the operator $B_\tau(e^b)$ is of determinant class for each τ ∈ (0, ∞).

Proof. The set $L^\infty(\mathbb{R}_+) \cap L^1(\mathbb{R}_+)$ is a Banach algebra without unit element. Upon adjoining the constant functions on $\mathbb{R}_+$ to $L^\infty(\mathbb{R}_+) \cap L^1(\mathbb{R}_+)$, one obtains a Banach algebra with unit element. From the series expansion of $e^b$ it follows that $e^b - 1 \in L^\infty(\mathbb{R}_+) \cap L^1(\mathbb{R}_+)$ whenever $b \in L^\infty(\mathbb{R}_+) \cap L^1(\mathbb{R}_+)$. Now Proposition 2.3 implies that $B_\tau(e^b - 1)$ is a trace class operator. Thus $B_\tau(e^b) = P_\tau + B_\tau(e^b - 1)$ is of determinant class for each τ ∈ (0, ∞).

These considerations show that all the expressions appearing in Eq. (61) below are well defined. This equation is the asymptotic result in which the arguments of this and the previous section culminate.

Theorem 4.5. Let $b \in \mathcal{S} \cap L^1(\mathbb{R}_+)$. Then
\[
\lim_{\tau\to\infty} \frac{\det B_\tau(e^b)}{e^{\operatorname{trace} B_\tau(b)}} = E[b], \tag{61}
\]
where E[b] is given by (59).

Proof. We first use Proposition 4.2 in connection with Lemma 4.3 and obtain that
\[
\lim_{\tau\to\infty} \det B_\tau(e^b)e^{-B_\tau(b)} = \det(I + K),
\]
where I + K is given by (57). Thus det(I + K) equals E[b]. Now we observe that
\[
\det B_\tau(e^b)e^{-B_\tau(b)} = \big( \det B_\tau(e^b) \big)\,e^{-\operatorname{trace} B_\tau(b)},
\]
which is a consequence of the facts just stated that $B_\tau(b)$ is a trace class operator and $B_\tau(e^b)$ is of determinant class. Notice that $\mathcal{S} \subseteq L^\infty(\mathbb{R}_+)$.

In order to prove the main result stated in the introduction, it is now clear that we have to find an explicit expression for the operator determinant E[b] and to determine the asymptotics of trace $B_\tau(b)$ as τ → ∞. This will be done next.
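5. Further Calculations

5.1. Trace determination. In what follows we evaluate the asymptotics of the trace of $B_\tau(b)$ as τ → ∞. The computation goes along the lines of [1, Sect. 3].

Proposition 5.1. Let $b \in L^\infty(\mathbb{R}_+) \cap L^1(\mathbb{R}_+)$, and assume that b is continuous and piecewise $C^1$ on [0, ∞) such that $(1+t)^{-1}b' \in L^1(\mathbb{R}_+)$. Then
\[
\operatorname{trace} B_\tau(b) = \frac{\tau}{\pi}\int_0^\infty b(t)\,dt - \frac{\nu}{2}\,b(0) + o(1), \qquad \tau \to \infty. \tag{62}
\]
Proof. From formula (1) for the kernel of $B_\tau(b)$, it follows that
\begin{align*}
\operatorname{trace} B_\tau(b) &= \int_0^\tau\!\!\int_0^\infty xt\,J_\nu^2(xt)\,b(t)\,dt\,dx
= \int_0^\infty b(t)\,t\int_0^\tau x\,J_\nu^2(xt)\,dx\,dt\\
&= \frac{\tau^2}{2}\int_0^\infty b(t)\,t\,\big( J_\nu^2(\tau t) - J_{\nu+1}(\tau t)J_{\nu-1}(\tau t) \big)\,dt\\
&= \int_0^\infty b\Big(\frac{t}{\tau}\Big)\frac{t}{2}\big( J_\nu^2(t) - J_{\nu+1}(t)J_{\nu-1}(t) \big)\,dt.
\end{align*}
Here (24) and (26) have been used. Due to the asymptotics (27) we obtain
\[
\frac{t}{2}\big( J_\nu^2(t) - J_{\nu+1}(t)J_{\nu-1}(t) \big) = \frac{1}{\pi} + \frac{\sin(2(t-\alpha))}{2\pi t} + O(t^{-2}), \qquad t \to \infty. \tag{63}
\]
Hence the following integral exists:
\[
F(\xi) = -\int_\xi^\infty \Big( \frac{t}{2}\big( J_\nu^2(t) - J_{\nu+1}(t)J_{\nu-1}(t) \big) - \frac{1}{\pi} \Big)\,dt.
\]
From integration by parts it follows that
\[
\operatorname{trace} B_\tau(b) - \frac{1}{\pi}\int_0^\infty b\Big(\frac{t}{\tau}\Big)\,dt = \Big[ b\Big(\frac{t}{\tau}\Big)F(t) \Big]_{t=0}^{\infty} - \frac{1}{\tau}\int_0^\infty b'\Big(\frac{t}{\tau}\Big)F(t)\,dt.
\]
Hence
\[
\operatorname{trace} B_\tau(b) = \frac{\tau}{\pi}\int_0^\infty b(t)\,dt - b(0)F(0) - \int_0^\infty b'(t)F(t\tau)\,dt.
\]
The asymptotics (63) implies that $F(t) = O(t^{-1})$ as t → ∞. Obviously, F is continuous on [0, ∞). Hence there exists a constant C > 0 such that $|F(t)| \le C(1+t)^{-1}$ for all t ∈ [0, ∞). We estimate the last integral of the last equation as follows:
\begin{align*}
\Big| \int_0^\infty b'(t)F(t\tau)\,dt \Big|
&\le \int_0^\infty |b'(t)|\,\frac{C}{1+\tau t}\,dt\\
&\le \int_0^{1/\sqrt{\tau}} \frac{C\,|b'(t)|}{1+\tau t}\,dt + \int_{1/\sqrt{\tau}}^\infty \frac{C\,|b'(t)|}{1+\tau t}\,dt\\
&\le C\int_0^{1/\sqrt{\tau}} |b'(t)|\,dt + C\sup_{t\ge 1/\sqrt{\tau}} \frac{1+t}{1+\tau t}\int_{1/\sqrt{\tau}}^\infty \frac{|b'(t)|}{1+t}\,dt.
\end{align*}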
This expression tends to zero as τ → ∞. Notice that
\begin{align*}
-F(0) &= \int_0^\infty \Big( \frac{t}{2}\big( J_\nu^2(t) - J_{\nu+1}(t)J_{\nu-1}(t) \big) - \frac{1}{\pi} \Big)\,dt\\
&= \int_0^\infty \Big( \frac{t}{2}\big( J_\nu^2(t) + J_{\nu+1}^2(t) \big) - \frac{1}{\pi} \Big)\,dt - \nu\int_0^\infty J_{\nu+1}(t)J_\nu(t)\,dt = -\frac{\nu}{2}.
\end{align*}
Here we have used $J_{\nu-1} + J_{\nu+1} = \frac{2\nu}{t}J_\nu$. The first integral equals zero, which can be seen from a direct calculation based on (24) and the asymptotics (26) and (27). The last integral equals 1/2, see, for instance, [10, Sect. 6.512–3].

5.2. Operator determinant calculation. In nearly the last step of our program we calculate the operator determinant (59) for a certain class of sufficiently smooth functions b. We remark that (only) in this section we are dealing with functions b defined on $\mathbb{R}$ that are not necessarily even. It should also be pointed out that results that compute (59) can be found in [16]. We do this computation in a different way and include it here for completeness' sake.

Let $L^2_{1/2}(\mathbb{R})$ stand for the Banach space of all Lebesgue measurable functions f defined on $\mathbb{R}$ for which
\[
\|f\|_{L^2_{1/2}(\mathbb{R})} = \Big( \int_{-\infty}^\infty (1+|x|)\,|f(x)|^2\,dx \Big)^{1/2} < \infty. \tag{64}
\]
By $C_0^\infty(\mathbb{R})$ we denote the Banach algebra of all continuous functions f on $\mathbb{R}$ for which f(x) → 0 as x → ±∞, where the multiplication is defined pointwise and the norm is the supremum norm on $\mathbb{R}$. Let $\mathcal{W}_0$ stand for the set of all functions a defined on $\mathbb{R}$ which are the inverse Fourier transform of a function $\hat{a} \in L^1(\mathbb{R}) \cap L^2_{1/2}(\mathbb{R})$, i.e.,
\[
a(x) = \int_{-\infty}^\infty e^{ixt}\,\hat{a}(t)\,dt \tag{65}
\]
(see also (15)). The norm in $\mathcal{W}_0$ is defined as
\[
\|a\|_{\mathcal{W}_0} := \|\hat{a}\|_{L^1(\mathbb{R})} + \|\hat{a}\|_{L^2_{1/2}(\mathbb{R})}. \tag{66}
\]
A routine computation shows that the set $L^1(\mathbb{R}) \cap L^2_{1/2}(\mathbb{R})$ is a Banach algebra with the multiplication defined as the convolution on $\mathbb{R}$, i.e.,
\[
(\hat{a} * \hat{b})(x) = \int_{-\infty}^\infty \hat{a}(x-y)\,\hat{b}(y)\,dy. \tag{67}
\]
Since $F^{-1}(\hat{a} * \hat{b}) = (F^{-1}\hat{a}) \cdot (F^{-1}\hat{b})$, the set $\mathcal{W}_0$ is also a Banach algebra with the multiplication defined pointwise. Moreover, $\mathcal{W}_0$ is continuously embedded into $C_0^\infty(\mathbb{R})$.

Let $\mathcal{W}_0^+$ and $\mathcal{W}_0^-$, respectively, be the sets of all functions $a \in \mathcal{W}_0$, where $a = F^{-1}\hat{a}$, for which $\hat{a}$ is supported on [0, ∞) and (−∞, 0], respectively. It is easily seen that $\mathcal{W}_0^+$ and $\mathcal{W}_0^-$ are Banach subalgebras of $\mathcal{W}_0$, for which the decomposition $\mathcal{W}_0 = \mathcal{W}_0^+ \oplus \mathcal{W}_0^-$ holds.
The Banach algebras $\mathcal{W}_0$ and $\mathcal{W}_0^\pm$ do not contain a unit element. The corresponding Banach algebras with unit element are obtained by adjoining the constant functions on $\mathbb{R}$, i.e., $\mathcal{W} = \mathbb{C} \oplus \mathcal{W}_0$ and $\mathcal{W}^\pm = \mathbb{C} \oplus \mathcal{W}_0^\pm$. Due to the fact that $\mathcal{W} \subset L^\infty(\mathbb{R})$, for each $a \in \mathcal{W}$ the two-sided Wiener-Hopf operator $W^0(a)$ is well defined by (10).

Proposition 5.2. Let $a \in \mathcal{W}$, where $a = \alpha + F^{-1}\hat{a}$ with $\alpha \in \mathbb{C}$ and $\hat{a} \in L^1(\mathbb{R}) \cap L^2_{1/2}(\mathbb{R})$. Then $W^0(a) = \alpha I + K$, where K is the integral operator on $L^2(\mathbb{R})$ with the kernel $k(x,y) = \hat{a}(x-y)$.

Proof. Obviously, $W^0(a) = \alpha I + W^0(F^{-1}\hat{a})$. Let $f \in L^1(\mathbb{R})$ and denote by $\hat{a} * f$ the convolution of $\hat{a}$ and f. Then $F^{-1}(\hat{a} * f) = (F^{-1}\hat{a}) \cdot (F^{-1}f)$. Hence $\hat{a} * f = W^0(F^{-1}\hat{a})f$. By an approximation argument the same equality holds also for all functions $f \in L^2(\mathbb{R})$. This proves the assertion.

For each $a \in \mathcal{W}$ the Wiener-Hopf operator W(a) and the Hankel operator H(a) are well defined in the sense of (11). Moreover, the previous proposition implies that under the same hypotheses, W(a) equals αI plus an integral operator with the kernel $\hat{a}(x-y)$ on $L^2(\mathbb{R}_+)$ and H(a) is an integral operator with the kernel $\hat{a}(x+y)$ on $L^2(\mathbb{R}_+)$.

The following result is taken from [8], where a formula for the evaluation of a certain type of operator determinant has been established. This formula represents a generalization of Pincus' formula. The classical Pincus' formula has been obtained in [13, 11] and employed in [15] in the calculation of the asymptotics of Toeplitz determinants.

Proposition 5.3. Let H be a Hilbert space and let $A, B \in \mathcal{L}(H)$ be such that the commutator AB − BA is a trace class operator. Then $e^Ae^Be^{-A-B}$ is of determinant class and
\[
\det e^Ae^Be^{-A-B} = \exp\Big( \frac{1}{2}\operatorname{trace}(AB - BA) \Big). \tag{68}
\]
Next we establish an explicit formula for the operator determinant $\det W(e^b)e^{-W(b)}$ for functions $b \in \mathcal{W}$.

Theorem 5.4. Let $b \in \mathcal{W}$, where $b = \beta + F^{-1}\hat{b}$ with $\beta \in \mathbb{C}$ and $\hat{b} \in L^1(\mathbb{R}) \cap L^2_{1/2}(\mathbb{R})$. Then $W(e^b)e^{-W(b)}$ is of determinant class and
\[
\det W(e^b)e^{-W(b)} = \exp\Big( \frac{1}{2}\operatorname{trace} H(b)H(\tilde{b}) \Big) = \exp\Big( \frac{1}{2}\int_0^\infty x\,\hat{b}(x)\,\hat{b}(-x)\,dx \Big). \tag{69}
\]
Proof. According to the remarks made after Proposition 5.2, for $a \in \mathcal{W}$ the Hankel operators H(a) and $H(\tilde{a})$ (where $\tilde{a}(t) = a(-t)$) are integral operators with the kernels $\hat{a}(x+y)$ and $\hat{a}(-x-y)$, respectively. Since $\hat{a} \in L^2_{1/2}(\mathbb{R})$, these Hankel operators are Hilbert-Schmidt operators on $L^2(\mathbb{R}_+)$ and the estimates
\begin{align*}
\|H(a)\|_{\mathcal{C}_2(L^2(\mathbb{R}_+))} &\le \|a\|_{\mathcal{W}}, \tag{70}\\
\|H(\tilde{a})\|_{\mathcal{C}_2(L^2(\mathbb{R}_+))} &\le \|a\|_{\mathcal{W}} \tag{71}
\end{align*}
hold (see also the argument given in the proof of Proposition 2.2).
Moreover, from formula (13) it follows that W(ac) = W(a)W(c) whenever $a \in \mathcal{W}^-$ or $c \in \mathcal{W}^+$. In fact, H(a) = 0 or $H(\tilde{c}) = 0$, respectively. From this it is not difficult to conclude that
\[
W(e^{b_+}) = e^{W(b_+)} \qquad\text{and}\qquad W(e^{b_-}) = e^{W(b_-)} \tag{72}
\]
for all $b_+ \in \mathcal{W}^+$ and all $b_- \in \mathcal{W}^-$. Each function $b \in \mathcal{W}$ can be decomposed into $b = b_+ + b_-$ with $b_\pm \in \mathcal{W}^\pm$. We define $A = W(b_-)$ and $B = W(b_+)$. Using formulas (72) and (13) it follows that
\[
e^Ae^B = e^{W(b_-)}e^{W(b_+)} = W(e^{b_-})W(e^{b_+}) = W(e^{b_-}e^{b_+}) = W(e^b).
\]
Obviously, $e^{-A-B} = e^{-W(b_-)-W(b_+)} = e^{-W(b)}$. Again from formula (13) it follows that
\[
AB - BA = W(b_-)W(b_+) - W(b_+)W(b_-) = W(b_-b_+) - W(b_+)W(b_-) = H(b_+)H(\tilde{b}_-) = H(b)H(\tilde{b}).
\]
Since both these Hankel operators are Hilbert-Schmidt (see (70)), AB − BA is a trace class operator. Thus we can apply Proposition 5.3 and obtain the first part of the desired formula. The second part of the formula follows from the fact that the kernels of the operators H(b) and $H(\tilde{b})$ are given by $\hat{b}(x+y)$ and $\hat{b}(-x-y)$, respectively.

For our purposes it is not enough to have formula (69) proved for functions $b \in \mathcal{W}$. In what follows we establish this formula also for a slightly different class of functions. The proof is based on an approximation argument.

Let $\mathcal{S}_0$ stand for the set of all functions $a \in L^\infty(\mathbb{R})$ such that both H(a) and $H(\tilde{a})$ are Hilbert-Schmidt operators on $L^2(\mathbb{R}_+)$. We introduce a norm in $\mathcal{S}_0$ as follows:
\[
\|a\|_{\mathcal{S}_0} = \|a\|_{L^\infty(\mathbb{R})} + \|H(a)\|_{\mathcal{C}_2(L^2(\mathbb{R}_+))} + \|H(\tilde{a})\|_{\mathcal{C}_2(L^2(\mathbb{R}_+))}. \tag{73}
\]
Proposition 5.5. $\mathcal{S}_0$ is a Banach algebra.

Proof. We restrict our considerations to showing that $a, b \in \mathcal{S}_0$ implies that $ab \in \mathcal{S}_0$ and that $\|ab\|_{\mathcal{S}_0} \le \mathrm{const}\,\|a\|_{\mathcal{S}_0}\|b\|_{\mathcal{S}_0}$. In fact, if $a, b \in \mathcal{S}_0$, then it follows from formula (14) that also $ab \in \mathcal{S}_0$. Moreover,
\begin{align*}
\|H(ab)\|_2 &\le \|a\|_\infty\|H(b)\|_2 + \|H(a)\|_2\|b\|_\infty,\\
\|H(\widetilde{ab})\|_2 &\le \|a\|_\infty\|H(\tilde{b})\|_2 + \|H(\tilde{a})\|_2\|b\|_\infty.
\end{align*}
This implies the desired norm estimate.
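Let $\mathcal{B}_0$ stand for the set of all operators of the form
\[
A = W(a) + K, \tag{74}
\]
where $a \in \mathcal{S}_0$ and K is a trace class operator on $L^2(\mathbb{R}_+)$. We define in $\mathcal{B}_0$ a norm by
\[
\|A\|_{\mathcal{B}_0} = \|a\|_{\mathcal{S}_0} + \|K\|_{\mathcal{C}_1(L^2(\mathbb{R}_+))}. \tag{75}
\]
By Lemma 2.1(a), this definition is correct. In the same way as in the proof of Proposition 3.2 one can show that $\mathcal{B}_0$ is a Banach algebra. This Banach algebra possesses the ideal $\mathcal{C}_1(L^2(\mathbb{R}_+))$, and the canonical homomorphism is denoted by $\pi_0 : \mathcal{B}_0 \to \mathcal{B}_0/\mathcal{C}_1(L^2(\mathbb{R}_+))$. The mapping $\Lambda_0 : a \in \mathcal{S}_0 \mapsto W(a) \in \mathcal{B}_0$ is a continuous linear mapping. Moreover, the mapping $\pi_0 \circ \Lambda_0 : \mathcal{S}_0 \to \mathcal{B}_0/\mathcal{C}_1(L^2(\mathbb{R}_+))$ is a Banach algebra isomorphism.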
Proposition 5.6. Let $b \in \mathcal{S}_0$. Then $H(b)H(\tilde{b})$ is a trace class operator and $W(e^b)e^{-W(b)}$ is of determinant class. Moreover, the mapping
\[
b \in \mathcal{S}_0 \mapsto W(e^b)e^{-W(b)} \in \mathcal{B}_0 \tag{76}
\]
is continuous.

Proof. The first assertion is obvious since both Hankel operators are Hilbert-Schmidt. Now let $A = W(e^b)e^{-W(b)}$. Observe that $A = \Lambda_0(e^b)e^{-\Lambda_0(b)}$, and thus A belongs to $\mathcal{B}_0$. Taking into account that both $\pi_0$ and $\pi_0 \circ \Lambda_0$ are homomorphisms it follows that
\[
\pi_0(A) = (\pi_0\circ\Lambda_0)(e^b)\,e^{-(\pi_0\circ\Lambda_0)(b)} = (\pi_0\circ\Lambda_0)(e^be^{-b}) = I + \mathcal{C}_1(L^2(\mathbb{R}_+)).
\]
Thus A − I is a trace class operator. The last assertion follows essentially from the fact that $\Lambda_0$ is continuous and that the exponential function (in $\mathcal{S}_0$ and $\mathcal{B}_0$) is continuous.

Theorem 5.7. Let $b \in \mathcal{S}_0$ and assume in addition that $b \in C_0^\infty(\mathbb{R}) \cap L^1(\mathbb{R})$. Then
\[
\det W(e^b)e^{-W(b)} = \exp\Big( \frac{1}{2}\operatorname{trace} H(b)H(\tilde{b}) \Big). \tag{77}
\]
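Proof. Let $\hat{b}$ stand for the Fourier transform of b (see (15)). Since $b \in L^1(\mathbb{R})$, the Fourier transform $\hat{b}$ is a function in $C_0^\infty(\mathbb{R}) \subset L^\infty(\mathbb{R})$. For λ ∈ (0, ∞), let $\rho_\lambda(x) = \lambda\pi^{-1}(1+\lambda^2x^2)^{-1}$ stand for the Poisson kernel, and let $b_\lambda$ stand for the convolution of $\rho_\lambda$ with b, i.e.,
\[
b_\lambda(x) = \int_{-\infty}^\infty \frac{\lambda\,b(t)}{\pi\big(1+\lambda^2(x-t)^2\big)}\,dt. \tag{78}
\]
Since $\rho_\lambda$ and b belong to $L^1(\mathbb{R})$, also $b_\lambda \in L^1(\mathbb{R})$. Note that the Fourier transform of $\rho_\lambda$ is equal to $(F\rho_\lambda)(x) = (2\pi)^{-1}e^{-|x|/\lambda}$. It follows that the Fourier transform $\hat{b}_\lambda$ of $b_\lambda$ is given by
\[
\hat{b}_\lambda(x) = \big(F(\rho_\lambda * b)\big)(x) = 2\pi\,(F\rho_\lambda)(x)\cdot(Fb)(x) = e^{-|x|/\lambda}\,\hat{b}(x). \tag{79}
\]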
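The damping factor in (79) comes entirely from the Fourier transform of the Poisson kernel, and this can be checked numerically. The sketch below is an illustration of ours under stated assumptions: it takes the normalization $(Fa)(x) = (2\pi)^{-1}\int e^{-ixt}a(t)\,dt$, which is the one consistent with (65) and with the value of $F\rho_\lambda$ quoted above, and it truncates the integral to [−400, 400]. The function names are hypothetical.

```python
import math


def poisson_kernel(x, lam):
    """rho_lambda(x) = lambda / (pi * (1 + lambda^2 x^2))."""
    return lam / (math.pi * (1.0 + (lam * x) ** 2))


def two_pi_fourier_transform(t, lam, cutoff=400.0, steps=400_000):
    """2*pi*(F rho_lambda)(t): integral of rho_lambda(x) cos(x t) over the real line.

    rho_lambda is even, so the integral over [0, cutoff] is doubled (Simpson rule).
    """
    h = cutoff / steps
    total = poisson_kernel(0.0, lam) + poisson_kernel(cutoff, lam) * math.cos(cutoff * t)
    for k in range(1, steps):
        x = k * h
        total += (4 if k % 2 else 2) * poisson_kernel(x, lam) * math.cos(x * t)
    return 2.0 * total * h / 3.0


pairs = [(1.0, 2.0), (0.5, 1.0), (3.0, 4.0)]  # sample (t, lambda) values
checks = [(two_pi_fourier_transform(t, lam), math.exp(-abs(t) / lam)) for t, lam in pairs]
worst = max(abs(numeric - exact) for numeric, exact in checks)
print(worst)
```

For the sampled values the quadrature agrees with $e^{-|t|/\lambda}$ up to the truncation error of the integral, so multiplying $\hat{b}$ by this factor gives (79).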
We see immediately that $\hat{b}_\lambda \in L^1(\mathbb{R}) \cap L^2_{1/2}(\mathbb{R})$. Thus $b_\lambda = F^{-1}\hat{b}_\lambda$, whence it follows that $b_\lambda \in \mathcal{W}_0$. We can apply Theorem 5.4 and obtain that
\[
\det W(e^{b_\lambda})e^{-W(b_\lambda)} = \exp\Big( \frac{1}{2}\operatorname{trace} H(b_\lambda)H(\tilde{b}_\lambda) \Big) \tag{80}
\]
holds for all λ ∈ (0, ∞). Another consequence of (79) is that
\[
H(b_\lambda) = M_\lambda H(b)M_\lambda \qquad\text{and}\qquad H(\tilde{b}_\lambda) = M_\lambda H(\tilde{b})M_\lambda,
\]
where $M_\lambda$ stands for the multiplication operator on $L^2(\mathbb{R}_+)$ with the function $e^{-x/\lambda}$, $x \in \mathbb{R}_+$. Here we have to observe that the above Hankel operators are integral operators whose kernels are given in terms of $\hat{b}_\lambda$ and $\hat{b}$, respectively.
Since $(M_\lambda)^* = M_\lambda$ converges strongly on $L^2(\mathbb{R}_+)$ to the identity operator as λ → ∞, it follows that $H(b_\lambda) \to H(b)$ and $H(\tilde{b}_\lambda) \to H(\tilde{b})$ in the Hilbert-Schmidt norm as λ → ∞. Hence
\[
H(b_\lambda)H(\tilde{b}_\lambda) \to H(b)H(\tilde{b})
\]
in the trace class norm as λ → ∞. From (78) and the assumption that $b \in C_0^\infty(\mathbb{R})$ we obtain that $b_\lambda \to b$ in the norm of $L^\infty(\mathbb{R})$. Hence, together with what has just been said, $b_\lambda \to b$ in the norm of $\mathcal{S}_0$. Now we employ the last statement of Proposition 5.6 and obtain that
\[
W(e^{b_\lambda})e^{-W(b_\lambda)} \to W(e^b)e^{-W(b)}
\]
in the norm of $\mathcal{B}_0$. Since both $W(e^{b_\lambda})e^{-W(b_\lambda)}$ and $W(e^b)e^{-W(b)}$ are operators of the form identity plus trace class operator, it follows from the particular definition of the Banach algebra $\mathcal{B}_0$ that
\[
W(e^{b_\lambda})e^{-W(b_\lambda)} - I \to W(e^b)e^{-W(b)} - I
\]
in the trace class norm as λ → ∞. By passing to the limit λ → ∞ in (80), the desired identity follows.

The following corollary is an immediate consequence of the previous theorem.

Corollary 5.8. Let $b \in C_0^\infty(\mathbb{R}) \cap L^1(\mathbb{R})$ be such that $\hat{b} = Fb \in L^2_{1/2}(\mathbb{R})$. Then
\[
\det W(e^b)e^{-W(b)} = \exp\Big( \frac{1}{2}\int_0^\infty x\,\hat{b}(x)\,\hat{b}(-x)\,dx \Big). \tag{81}
\]
Proof. From Proposition 2.3 it follows that the Hankel operators H(b) and $H(\tilde{b})$ are integral operators with the kernels $\hat{b}(x+y)$ and $\hat{b}(-x-y)$, respectively, and that both integral operators are Hilbert-Schmidt. Hence $b \in \mathcal{S}_0$ and trace $H(b)H(\tilde{b})$ can be expressed as the above integral.

Notice that $\mathcal{W} \subseteq \mathcal{S}_0$. However, the example of $\hat{a}(x) = \operatorname{sign}(x)e^{-|x|}$, $a(x) = 2ix/(1+x^2)$ shows that $a \in \mathcal{W}_0$ but $a \notin L^1(\mathbb{R})$. Thus Corollary 5.8 does not cover all of Theorem 5.4.
6. Proof of the Main Result Proof of Theorem 1.1. Suppose that the function b ∈ L∞ (R+ ) ∩ L1 (R+ ) satisfies the assumptions (i), (ii) of Theorem 1.1. We are going to identify b with its even continuation, b(x) = b(−x). Let bˆ stand for the Fourier transform of the even function b, i.e., the cosine transform (see (4)), and integrate twice by parts: ∞ ∞ ∞ sin(xt) sin(xt) ˆb(x) = 1 cos(xt)b(t) dt = − b(t) b (t) dt π 0 πx πx 0 0 ti+1 ti+1 n−1 cos(xt) cos(xt) = b (t) − b (t) dt , πx 2 π x2 ti ti i=0
where $0 = t_0 < t_1 < \dots < t_{n-1} < t_n = \infty$ are the points where the derivatives are discontinuous. From this it follows that $\hat b(x) = O(1/x^2)$ as $x \to \pm\infty$. Moreover, since $b \in L^1(\mathbb{R}_+)$, we have $\hat b \in L^\infty(\mathbb{R})$. Hence we can conclude that $\hat b \in L^2_{1/2}(\mathbb{R})$. Firstly, this means that we are in a position to apply Corollary 5.8. Secondly, the Hankel operator H(b) is Hilbert-Schmidt (see Proposition 2.2). Moreover, the assumptions (i) and (ii) also imply that B(b) − W(b) is a Hilbert-Schmidt operator (see Proposition 2.9). Thus from the definition of S in Sect. 3.1 it follows that b ∈ S. This allows us to apply Theorem 4.5. Finally, the assumptions (i) and (ii) also imply that the assumptions of Proposition 5.1 are fulfilled. Combining Theorem 4.5 with Proposition 5.1 and Corollary 5.8 yields the desired claim of Theorem 1.1.

Communicated by J.L. Lebowitz
Commun. Math. Phys. 234, 517–532 (2003) Digital Object Identifier (DOI) 10.1007/s00220-002-0768-2
Communications in
Mathematical Physics
Spectral Estimates for Periodic Jacobi Matrices
Evgeni Korotyaev¹, Igor V. Krasovsky²
¹ Institut für Mathematik, Humboldt Universität zu Berlin, Germany. E-mail: [email protected]
² Institut für Mathematik, Technische Universität, Berlin, Germany. E-mail: [email protected]
Received: 17 April 2002 / Accepted: 1 October 2002 Published online: 13 January 2003 – © Springer-Verlag 2003
Abstract: We obtain bounds for the spectrum and for the total width of the spectral gaps for Jacobi matrices on $\ell^2(\mathbb{Z})$ of the form $(H\psi)_n = a_{n-1}\psi_{n-1} + b_n\psi_n + a_n\psi_{n+1}$, where $a_n = a_{n+q}$ and $b_n = b_{n+q}$ are periodic sequences of real numbers. The results are based on a study of the quasimomentum k(z) corresponding to H. We consider k(z) as a conformal mapping in the complex plane. We obtain the trace identities which connect integrals of the Lyapunov exponent over the gaps with the normalised traces of powers of H.

1. Introduction

The purpose of the present work is to study the spectrum of the q-periodic Jacobi matrix on $\ell^2(\mathbb{Z})$
$$(H\psi)_n = a_{n-1}\psi_{n-1} + b_n\psi_n + a_n\psi_{n+1}, \tag{1.1}$$
where $b_n = b_{n+q} \in \mathbb{R}$, $a_n = a_{n+q} > 0$, $n \in \mathbb{Z}$. The number q is the smallest period. (It is well known that any symmetric Jacobi matrix can be represented in the form with positive off-diagonal elements.) Periodic Jacobi matrices were discussed in many works (see, e.g., [1–3]). In what follows, we consider q > 1, thus excluding the trivial case of q = 1 (the spectrum of H with q = 1 is the interval $[b_1 - 2a_1, b_1 + 2a_1]$). Let r(H) denote half the distance between the ends of the spectrum of H. We set
$$c = r(H), \tag{1.2}$$
$$A = (a_1 a_2 \cdots a_q)^{1/q}. \tag{1.3}$$
We will also use the notation $a_j \equiv a$ for $a_1 = a_2 = \dots = a_q = a$. We remark that periodic discrete 1D Schrödinger operators are a particular case of (1.1) with $a_j \equiv 1$.
By Gershgorin's theorem (see, e.g., [4]), the spectrum of H lies inside the interval $[\min_j(b_j - a_j - a_{j-1}), \max_j(b_j + a_j + a_{j-1})]$. It is absolutely continuous and consists of exactly q intervals
$$\sigma_m = [\lambda^+_{m-1}, \lambda^-_m], \qquad m = 1, \dots, q,$$
separated by gaps
$$\gamma_m = (\lambda^-_m, \lambda^+_m), \qquad m = 1, \dots, q-1.$$
If a gap degenerates, i.e. $\gamma_m = \emptyset$, then the corresponding segments $\sigma_m$, $\sigma_{m+1}$ merge. In what follows, we will use | · | to denote the Lebesgue measure of sets. It is known [5, 6] that the total width of bands and gaps, respectively, satisfy the inequalities¹
$$2c - \tilde b \ge \sum_{m=1}^{q} |\sigma_m| > \frac{4A^q}{M^{q-1}}, \qquad \sum_{m=1}^{q-1} |\gamma_m| \ge \tilde b, \tag{1.4}$$
where $\tilde b = \max_j b_j - \min_j b_j$ and $M = \max(\max_j(b_j + a_j + a_{j-1}) - \min_j b_j,\ \max_j b_j - \min_j(b_j - a_j - a_{j-1}))$. Analysis of the proof in [6] shows that M in (1.4) can be replaced by 2c.

In the present work we shall find further estimates for the widths of the gaps and for the total width of the spectrum in terms of the matrix elements of H. Our results are obtained from properties of the quasimomentum (associated with the periodic operator H) considered as a conformal mapping of complex domains. Originally, the method of conformal mappings was proposed in [7] for the Hill operator ($Hy(x) = y''(x) + f(x)y(x)$, $f(x) = f(x+T)$). It was further developed in [8–12] to obtain spectral estimates for the Hill operator and the Dirac operator. As we shall see below, the main peculiarity of the discrete case is that the spectrum is bounded. This makes $z = \arccos(\lambda/\max|\lambda_n^\pm|)$ a more natural spectral variable than λ. Thus our estimates for the gaps will be formulated in terms of z.

As is known, the real part of the quasimomentum is the integrated density of states of the operator H while the imaginary part is the Lyapunov exponent (it describes the rate of growth in n of the solutions $\psi_n(\lambda)$ of the equation (H − λ)ψ = 0). In this paper we shall obtain identities (trace formulas and Dirichlet integrals) which relate various integrals of the quasimomentum to traces of powers of H (Lemmas 3.2 and 3.3). These identities will serve as a starting point to derive our estimates. They are also of separate interest.

We shall call a Jacobi matrix normalised if $-\lambda_0^+ = \lambda_q^- > 0$. Obviously, this can be achieved for an arbitrary matrix by adding a certain constant. Below we always assume that H is normalised. We shall need the matrix L defined by
$$L = \begin{pmatrix} b_1 & a_1 + ia_2 \\ a_1 - ia_2 & b_2 \end{pmatrix} \ \text{if } q = 2, \qquad L = \begin{pmatrix} b_1 & a_1 & & & & ia_q \\ a_1 & b_2 & a_2 & & & \\ & a_2 & b_3 & a_3 & & \\ & & \ddots & \ddots & \ddots & \\ & & & a_{q-2} & b_{q-1} & a_{q-1} \\ -ia_q & & & & a_{q-1} & b_q \end{pmatrix} \ \text{if } q > 2. \tag{1.5}$$

¹ They were written in [6] for the Schrödinger case ($a_j \equiv 1$) but the arguments are easy to generalise.
It follows from Floquet theory (see the next section) that c ≡ r(H) > r(L). Hence, averaging over the eigenvalues of L gives the following estimate (lower bounds):
$$c^{2j} > \frac{1}{q}\,\mathrm{Tr}\,L^{2j}, \qquad j = 1, 2, \dots. \tag{1.6}$$
Below we obtain a different type of bounds for c:

Theorem 1.1 (Bounds for the width of the spectrum). Let H be normalised and q > 1. Then
$$c > 2A, \tag{1.7}$$
$$c^2\Big(\frac{1}{2} + \ln\frac{c}{2A}\Big) > \frac{1}{q}\,\mathrm{Tr}\,L^2. \tag{1.8}$$
Remark. The equality sign in (1.7) happens if and only if q = 1.

This theorem will be proved in Sect. 3. The first part of it, (1.7), will also be proved by a much simpler method which uses only general properties of polynomials (Lemma 2.2). Note that the quantity A can be interpreted as the logarithmic capacity of the spectrum (see, e.g., [13]). One might ask if the inequality (1.7) between half the diameter of a set and its logarithmic capacity holds in a more general context. For the Schrödinger case ($a_j \equiv 1$), the inequality (1.7) reduces to c > 2. It was recently shown in [14] that c > 2 for an arbitrary, not necessarily periodic, real nonconstant sequence $b_j$, $a_j \equiv 1$.

As will be clear from the next section, one can also derive further inequalities similar to (1.8). They are analogous to (1.6) but better. We obtain from them lower bounds for c. Consider, for example, Harper's matrix (see [16] for a review): $a_j \equiv 1$, $b_j = 2\cos(2\pi\alpha j + \theta)$, α = p/q, p and q ∈ Z, θ ∈ R. Obviously, $c \le 2 + \max_j |b_j| \le 4$. Simple calculation shows that $\mathrm{Tr}\,L^2 = 4q$. Substituting this value into inequality (1.8) and solving the latter numerically gives c > 2.41. Note that this bound does not depend on p and q. Therefore it follows from continuity that it also holds for irrational α.

Let
$$h_+ = \max_{\lambda \in \cup\gamma_n} \frac{1}{q}\,\mathrm{arccosh}\,\big|A^{-q}\det(\lambda - L)/2\big|.$$
We obtain the following estimates for the gaps of H in terms of the matrix elements and c:

Theorem 1.2 (Bounds for the gaps). Let H be normalised and q > 1. Furthermore, let $g_n = (\arccos(\lambda_n^-/c), \arccos(\lambda_n^+/c))$, where $(\lambda_n^-, \lambda_n^+)$ is the nth gap in the spectrum of H. Then at least one gap is open and
$$\sum_{n=1}^{q-1} |g_n| > \pi\,\frac{\ln(c/2A)}{h_+} > \pi\,\frac{\ln(c/2A)}{\ln(2c/A)}, \tag{1.9}$$
$$\sum_{n=1}^{q-1} \int_{g_n} \cos^2 x\,dx > \frac{\pi}{h_+}\Big(\frac{\ln(c/2A)}{2} + \frac{1}{4} - \frac{1}{2qc^2}\,\mathrm{Tr}\,L^2\Big), \tag{1.10}$$
$$\frac{\ln(c/2A)}{\max\{1, qh_+/\pi\}} < \sum_{n=1}^{q-1} |g_n|^2 < 8\ln(c/2A), \tag{1.11}$$
where $h_+$ satisfies $0 < h_+ < \ln\Big(\frac{c}{A} + \Big|\frac{1}{Aq}\sum_{j=1}^{q} b_j\Big|\Big) < \ln\frac{2c}{A}$.
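The numerical bound quoted above for Harper's matrix can be reproduced in a few lines. The following is only an illustrative sketch: it takes from the text that $a_j \equiv 1$ (so A = 1) and $\mathrm{Tr}\,L^2 = 4q$, and the function names are ours, not the authors'. It finds by bisection the value of c at which (1.8) turns into an equality; (1.8) then forces c to exceed that root.

```python
import math

def harper_defect(c):
    """Left side minus right side of (1.8) for Harper's matrix.

    With A = 1 and (1/q) Tr L^2 = 4 the inequality reads
    c^2 (1/2 + ln(c/2)) > 4.
    """
    return c * c * (0.5 + math.log(c / 2.0)) - 4.0

def harper_lower_bound(lo=2.0, hi=4.0, tol=1e-12):
    """Bisection for the root of harper_defect on [lo, hi].

    The defect is increasing in c on this interval, so (1.8) holds
    exactly for c above the returned root.
    """
    assert harper_defect(lo) < 0.0 < harper_defect(hi)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if harper_defect(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

bound = harper_lower_bound()
print(f"(1.8) requires c > {bound:.4f}")
```

The search interval [2, 4] is chosen because c > 2 by (1.7) and c ≤ 4 for Harper's matrix.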
Remarks.
• Existence of an open gap implies that $h_+ \neq 0$.
• Just like for (1.8) it is possible to derive further inequalities similar to (1.10) improving the estimates.

The plan of the paper is as follows. In the next section we recall some facts from Floquet theory and prove a lemma which guarantees existence of at least one open gap if q > 1 and validity of (1.7). Note that existence of gaps for the Schrödinger case also follows from (1.4) [6]. In Sect. 3 we construct the quasimomentum, obtain the trace formulas and prove Theorem 1.1. In Sect. 4 we show how our construction fits into the theory of the general quasimomentum developed in [7, 15, 8] and recall some properties of the general quasimomentum. In Sect. 5 we use these properties together with estimates for the Dirichlet integrals (Lemma 3.3) to prove Theorem 1.2 and to establish further bounds on the maxima of the Lyapunov exponent (Lemma 5.1). Note that the proof of the bounds (1.9) and (1.10) is simpler than that of (1.11). To establish the former, we only need Lemmas 3.1, 3.2, and 5.2.

2. Preliminaries

Recall some facts from Floquet theory (see, e.g., [1]). For a fixed λ we introduce two fundamental solutions φ(λ), θ(λ) of the equation (ψ = φ or ψ = θ) $(H\psi(\lambda))_n = \lambda\psi_n(\lambda)$, n = −∞, …, ∞, by the initial conditions $\varphi_0 = 0$, $\varphi_1 = 1$; $\theta_0 = 1$, $\theta_1 = 0$. Both of them satisfy the recurrence relation:
$$a_n\psi_{n+1}(\lambda) = (\lambda - b_n)\psi_n(\lambda) - a_{n-1}\psi_{n-1}(\lambda), \qquad n = 1, 2, \dots. \tag{2.1}$$
The polynomial of degree q
$$D(\lambda) = \varphi_{q+1}(\lambda) + \theta_q(\lambda) \tag{2.2}$$
is called the discriminant of H. For example, when q = 2 we have $D(\lambda) = (\lambda^2 - (b_1 + b_2)\lambda + b_1 b_2 - a_1^2 - a_2^2)/(a_1 a_2)$. It is known that:

1) Any solution of the equation (H − λ)ψ = 0 (except possibly for those corresponding to D(λ) = ±2) is a linear combination of two solutions $\psi^+$ and $\psi^-$ with the property
$$\psi^\pm_{n+q}(\lambda) = e^{\pm iqk(\lambda)}\psi^\pm_n(\lambda), \quad n \in \mathbb{Z}, \qquad \text{where } k(\lambda) = \frac{1}{q}\arccos\frac{D(\lambda)}{2}. \tag{2.3}$$
Henceforth, we shall fix the branch of arccos(x) by the condition arccos(0) = −π/2. The function k(λ) plays a crucial role in the present paper.

2) The spectral intervals $\sigma_m$ are the image of [−2, 2] under the inverse of the transform D(λ) (see Fig. 1). (Note that by (2.3) solutions $\psi_n$ are bounded in n if D(λ) ∈ [−2, 2] and otherwise exponentially increase.) Hence, all the critical points of D(λ) are maxima and minima; moreover, at all maxima points $\lambda_{\max}$ we have $D(\lambda_{\max}) \ge 2$ and at all minima points $\lambda_{\min}$ we have $D(\lambda_{\min}) \le -2$. All q − 1 critical points are mutually different.

3) The solutions φ(λ) and θ(λ) satisfy the identity
$$\varphi_{q+1}(\lambda)\theta_q(\lambda) - \varphi_q(\lambda)\theta_{q+1}(\lambda) = 1. \tag{2.4}$$

We shall need the representation of D(λ) given by
Fig. 1. The discriminant D(λ) is sketched for q = 3. The bands of the spectrum are shown by thick lines
Lemma 2.1. $D(\lambda) = A^{-q}\det(\lambda - L)$ for all λ ∈ C.

Proof. Denote by $D_{jk}$, j, k > 0, the determinant of the matrix λ − L with the first j − 1 and the last q − k rows and columns removed, and $L_{1q}$, $L_{q1}$ set to zero. Expanding det(λ − L) in the elements of the first row, we get
$$\det(\lambda - L) = D_{1q} - a_q^2 D_{2\,q-1}. \tag{2.5}$$
Similarly, the expansion of $D_{1j}$ by the last row gives: $D_{1j} = (\lambda - b_j)D_{1\,j-1} - a_{j-1}^2 D_{1\,j-2}$, j = 1, 2, …, q, where $D_{1\,-1} = 0$, $D_{10} = 1$. Comparing this recurrence with (2.1), we get $D_{1q} = a_1 a_2 \cdots a_q\,\varphi_{q+1}$ by induction. Similarly, we show that $D_{2\,q-1} = \frac{a_1 a_2 \cdots a_{q-1}}{-a_q}\,\theta_q$. Substituting these expressions into (2.5) proves the lemma.
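Lemma 2.1 is easy to test numerically. The sketch below is an illustration under our own conventions, not the authors' code: the lists `a` and `b` hold $a_1, \dots, a_q$ and $b_1, \dots, b_q$, the helper names are hypothetical, and the determinant is computed by a naive cofactor expansion that is only suitable for small q. It evaluates D(λ) from the recurrence (2.1) and the definition (2.2), and compares it with $A^{-q}\det(\lambda - L)$ for L as in (1.5).

```python
def discriminant(a, b, lam):
    """D(lam) = phi_{q+1}(lam) + theta_q(lam), using the recurrence (2.1).

    a[n-1], b[n-1] hold a_n, b_n for n = 1..q; indices are q-periodic,
    so a_0 = a_q.
    """
    q = len(a)

    def solve(psi0, psi1):
        vals = [psi0, psi1]
        for n in range(1, q + 1):
            a_n, b_n, a_prev = a[(n - 1) % q], b[(n - 1) % q], a[(n - 2) % q]
            vals.append(((lam - b_n) * vals[n] - a_prev * vals[n - 1]) / a_n)
        return vals  # vals[n] = psi_n for n = 0..q+1

    phi = solve(0.0, 1.0)
    theta = solve(1.0, 0.0)
    return phi[q + 1] + theta[q]

def matrix_L(a, b):
    """The matrix L of (1.5), valid for q >= 2."""
    q = len(a)
    L = [[0j] * q for _ in range(q)]
    for j in range(q):
        L[j][j] = complex(b[j])
    for j in range(q - 1):
        L[j][j + 1] = L[j + 1][j] = complex(a[j])
    # Corner entries +i a_q and -i a_q; for q = 2 they add to a_1.
    L[0][q - 1] += 1j * a[q - 1]
    L[q - 1][0] += -1j * a[q - 1]
    return L

def det(m):
    """Determinant by cofactor expansion along the first row (small q only)."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def lemma_2_1_gap(a, b, lam):
    """|D(lam) - A^{-q} det(lam - L)|, which Lemma 2.1 says is zero."""
    q = len(a)
    L = matrix_L(a, b)
    shifted = [[(lam if i == j else 0.0) - L[i][j] for j in range(q)]
               for i in range(q)]
    prod_a = 1.0
    for x in a:
        prod_a *= x  # A^q = a_1 a_2 ... a_q
    return abs(discriminant(a, b, lam) - det(shifted) / prod_a)

print(lemma_2_1_gap([1.0, 2.0, 0.5], [0.3, -1.0, 0.7], 1.7))
```

The printed difference should be at the level of floating-point rounding; the sample coefficients are arbitrary.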
Now we shall prove that q ≠ 1 implies the inequality (1.7) and existence of gaps. The proof is based on the above given facts from Floquet theory and general properties of polynomials.

Lemma 2.2. Let q > 1. Then c = r(H) > 2A and at least one gap is open.

Proof. Suppose that c ≤ 2A. Then by adding a constant to H, we first ensure that the spectrum of H (Spec(H)) lies within the interval $I = [-2A, 2A] = \mathrm{Spec}(H_0)$. Here $H_0$ is the Jacobi matrix with $a_j \equiv A$ and $b_j \equiv 0$. Denote by $D_H(\lambda)$ the discriminant of H. It is a polynomial of degree q with the coefficient $A^{-q}$ of $\lambda^q$. Let us set all $a_j = A$ and $b_j = 0$ in it, and denote the resulting polynomial $D_{H_0}(\lambda)$. It is the discriminant for $H_0$ viewed as a matrix of period q. Since $\mathrm{Spec}(H_0)$ has no gaps, we conclude that $\max_{\lambda \in I}|D_{H_0}(\lambda)| = 2$. Let $\lambda_0 = -2A$, $\lambda_q = 2A$, and $\lambda_j$, j = 1, …, q − 1 be the critical points of $D_H(\lambda)$. By our construction, they all lie in I. By Floquet theory, $|D_H(\lambda_j)| \ge 2$ and the signs in the sequence $D_H(\lambda_j)$, j = 0, …, q alternate. Thus
$$D_H(\lambda_q) \ge D_{H_0}(\lambda_q), \quad D_H(\lambda_{q-1}) \le D_{H_0}(\lambda_{q-1}), \quad D_H(\lambda_{q-2}) \ge D_{H_0}(\lambda_{q-2}), \quad \dots. \tag{2.6}$$
The polynomial $S(\lambda) = D_H(\lambda) - D_{H_0}(\lambda)$ is of degree less than q. However, as follows from (2.6), it changes its sign on I at least q times. Therefore S(λ) ≡ 0. Thus the strict inequality c < 2A is impossible, and c = 2A happens only when $D_H \equiv D_{H_0}$. We shall now demonstrate that the last identity implies $H = H_0$, i.e., the smallest period of H is 1. The method is essentially borrowed from [3].

Assume $D_H \equiv D_{H_0}$ (we shall omit the subscript). Hence $\lambda_j^+ = \lambda_j^-$ and $\max_{\lambda \in I}|D(\lambda)| = |D(\lambda_j^\pm)| = 2$. Let $\nu_j$, j = 1, …, q − 1 be the zeros of the (q − 1)-degree polynomial $\varphi_q(\lambda)$. Since $\varphi_n(\lambda)$, n = 1, 2, … are the orthogonal polynomials corresponding to H, all the $\nu_j$ are simple and belong to the open set (−2A, 2A). Using (2.4), we get $|D(\nu_j)| = |\theta_q(\nu_j)^{-1} + \theta_q(\nu_j)| \ge 2$. Therefore $\nu_j = \lambda_j^\pm$, $|D(\nu_j)| = 2$, and $\theta_q(\nu_j) = \pm 1$, where the sign corresponds to that of $D(\nu_j)$, j = 1, …, q − 1. Since $\theta_q(\lambda)$ is a polynomial of degree q − 2, it can be uniquely reconstructed from its values in q − 1 points $\nu_j$ by the Lagrange interpolation formula. From the definition of the discriminant, we obtain $\varphi_{q+1}(\lambda) = D(\lambda) - \theta_q(\lambda)$. On the other hand, we also know the monic polynomial (i.e., with the coefficient 1 of the highest degree) $\hat\varphi_q(\lambda) = \prod_{j=1}^{q-1}(\lambda - \nu_j)$; note that it is $D'(\lambda)$ up to a factor. We shall now see that these two polynomials determine H.

Note first that the coefficient of the highest (n − 1) degree of $\varphi_n(\lambda)$ is $(a_1 a_2 \cdots a_{n-1})^{-1}$. Therefore we have the following recursion for the monic polynomials $\hat\varphi_n(\lambda) = \lambda^{n-1} + \alpha_n^{n-2}\lambda^{n-2} + \dots + \alpha_n^0$:
$$\hat\varphi_{n+1}(\lambda) + (b_n - \lambda)\hat\varphi_n(\lambda) + a_{n-1}^2\hat\varphi_{n-1}(\lambda) = 0. \tag{2.7}$$
All coefficients of this polynomial are zero. This gives
$$b_n = \alpha_n^{n-2} - \alpha_{n+1}^{n-1}, \qquad a_{n-1}^2 = \alpha_n^{n-3} - \alpha_{n+1}^{n-2} - b_n\alpha_n^{n-2}. \tag{2.8}$$
Assuming $\hat\varphi_{n+1}(\lambda)$ and $\hat\varphi_n(\lambda)$ are given, we then obtain the polynomial $\hat\varphi_{n-1}(\lambda)$ from (2.7). Performing this procedure successively for n = q, q − 1, …, 1, we reconstruct the coefficients $a_1, \dots, a_{q-1}$ and $b_1, \dots, b_q$. Finally, $a_q$ is obtained from the highest-degree coefficient α of $\varphi_{q+1}(\lambda)$: $a_q = (\alpha a_1 \cdots a_{q-1})^{-1}$. From the uniqueness of our reconstruction, it follows that $H = H_0$. Thus for q ≥ 2 we have c > 2A.

Suppose now that the spectrum of H has no gaps, i.e., $|D_H(\lambda)| \le 2$ on [−c, c] (assuming H is normalised). Then considering the inequalities between $D_H(\lambda)$ and $D_{H_0}(\lambda)$ now at the critical points of $D_{H_0}(\lambda)$ on [−c, c], we obtain as before $S(\lambda) = D_H(\lambda) - D_{H_0}(\lambda) \equiv 0$. This contradiction shows that there is at least one open gap in the spectrum of H.
3. Trace Formulas and Dirichlet Integrals

Our first aim is to investigate the asymptotics of the function k(λ) for large imaginary λ. We shall see below that it is convenient to introduce instead of λ a new variable z such that the mapping k(λ(z)) becomes asymptotically the identity. After determining the asymptotics of k(λ(z)), we construct the regions into which the upper half-plane is mapped by the functions z(λ) and k(λ). This information is then used to obtain the trace formulas and Dirichlet integrals for the quasimomentum.

First, we set $c = \lambda_q^- = |\lambda_0^+|$ (recall the normalisation of Sect. 1). Then, in the variable ζ = λ/c, the spectrum lies within the interval [−1, 1] and its boundaries coincide with −1 and 1. Using the expansion
$$\arccos x = i\Big(\ln 2x - \sum_{j=1}^{\infty}\binom{2j}{j}\frac{1}{2j\,(2x)^{2j}}\Big), \qquad |x| > 1, \tag{3.1}$$
we get $qk(\lambda) = \arccos(D(\lambda)/2) = i\ln D(\lambda) + O(1/\lambda^{2q})$ as |λ| → ∞. Here, by Lemma 2.1,
$$\ln D(\lambda) = \ln\det(\lambda - L) - \ln A^q = \mathrm{Tr}\ln(\lambda - L) - \ln A^q = q\ln\frac{\lambda}{A} + \mathrm{Tr}\ln(1 - L/\lambda) = q\ln\frac{\lambda}{A} - \sum_{j=1}^{\infty}\frac{\mathrm{Tr}\,L^j}{j\lambda^j}.$$
Further, using (3.1), we have for |ζ| > 1,
$$i\ln\frac{\lambda}{A} = i\ln\frac{c}{2A} + i\ln 2\zeta = i\ln\frac{c}{2A} + \arccos\zeta + i\sum_{j=1}^{\infty}\binom{2j}{j}\frac{1}{2j\,(2\zeta)^{2j}}.$$
k(z) = k(−c cos z) + π,
(3.2)
where k(λ) is defined by (2.3), (2.2), is called quasimomentum. From the preceding discussion, we have: k(z) = z + iQ0 +
∞ iQj , cosj z
z → i∞,
(3.3)
j =1
where Q0 = ln
c ; 2A
(1/j cj ) q1 Tr Lj Qj = (1/j 2j ) j − (1/j cj ) q1 Tr Lj j/2
if 1 ≤ j < 2q and odd; if 2 ≤ j < 2q and even. (3.4)
In particular, 1 1 1 Tr L, Q2 = − Tr L2 . (3.5) qc 4 2qc2 Investigating the mapping of the boundaries (as we explain below), we obtain the correspondence of the domains in λ-, z-, and k-planes under the conformal transformations z(λ) = π + arccos λ/c, k = k(z) as shown in Figs. 2–4 (for q = 3). The notation for the real and imaginary parts of z and k is fixed as follows: z = x + iy, k = u + iv. + + The points λ+ n , zn , kn correspond to each other under the mappings; so do the points − − − λn , zn , kn . That the upper λ-half-plane is mapped onto the half-strip by the function z(λ) is easy to verify. We prove first the mapping of the boundary ∂SRλ onto ∂SR (for finite regions SRλ and SR ) and then let R → ∞. To obtain the mapping of the boundary of λ domain onto that of k domain, it is convenient to represent k(λ) + π in the form
D(λ) dD(λ) 1 q D(λ+0 ) 4 − D(λ)2 Q1 =
and integrate (keeping in mind Fig. 1) along the real part of ∂SRλ with vanishing semicircles above the ends of the gaps λ± j .
Fig. 2. λ-plane for q = 3
We now write these results more precisely. Let
$$\Lambda_+ = \mathbb{C}_+ \setminus \cup\gamma_n, \qquad \mathbb{C}_+ = \{\lambda : \mathrm{Im}\,\lambda > 0\},$$
$$Z_+^0 = \mathbb{C}_+^0 \setminus \cup g_n, \qquad \mathbb{C}_+^0 = \{z : 0 \le \mathrm{Re}\,z \le \pi,\ \mathrm{Im}\,z \ge 0\},$$
$$g_n = (z_n^-, z_n^+), \qquad n = 1, \dots, q-1,$$
where $z_n^\pm = \pi + \arccos\lambda_n^\pm/c$. The function z = π + arccos λ/c is a conformal mapping of $\Lambda_+$ onto $Z_+^0$. The gap $\gamma_n$ is mapped onto $g_n$, n = 1, …, q − 1. Let
$$K_+^0 = \mathbb{C}_+^0 \setminus \cup c_n, \qquad c_n = \Big(\frac{n\pi}{q}, \frac{n\pi}{q} + ih_n\Big), \qquad n = 1, \dots, q-1,$$
where $c_n$ is a vertical slit. The function k = k(z) conformally maps $Z_+^0$ onto $K_+^0$. The spectral interval $[z_{n-1}^+, z_n^-]$ is mapped onto the interval of length π/q of the u-axis [(n − 1)π/q, nπ/q], n = 1, …, q; the gap $g_n$, onto the slit $c_n$, n = 1, …, q − 1. The height of the nth slit is $h_n = (1/q)\,\mathrm{arccosh}\,|D(\lambda_{\max,n})/2|$, where $\lambda_{\max,n}$ is the critical point of D(λ) in the gap $\gamma_n$. Thus we proved
Fig. 3. z-plane

Fig. 4. k-plane
Lemma 3.1 (Quasimomentum). The function k(z) = k(−c cos z) + π is a conformal mapping of the domain $Z_+^0$ onto the quasimomentum domain $K_+^0$. It possesses the asymptotic expansion (3.3).

We shall now obtain the identities (trace formulas) which connect the integrals of u ≡ Re k and v ≡ Im k with the coefficients $Q_i$ in the asymptotic expansion of k(z) and, via these coefficients, with the matrix elements of H. The trace formulas will be the basis of all the spectral estimates in the present work. To establish (1.7) (in a way different from that of Lemma 2.2), only the first of these formulas is sufficient. The other ones can also be used to extract some local information about the gaps, but we shall not do this here.

Lemma 3.2 (Trace formulas). The following identities hold:
$$\frac{1}{\pi}\int_0^\pi v(x)\,dx = Q_0 = \ln\frac{c}{2A}, \tag{3.6}$$
$$\frac{1}{\pi}\int_0^\pi v(x)\cos^n x\,dx = \begin{cases} \displaystyle\sum_{i=0}^{(n-1)/2} Q_{2i+1}\binom{n-1-2i}{(n-1)/2 - i}\Big/2^{\,n-1-2i} & \text{if } n \text{ is odd},\\[3mm] \displaystyle\sum_{i=0}^{n/2} Q_{2i}\binom{n-2i}{n/2 - i}\Big/2^{\,n-2i} & \text{if } n \text{ is even}. \end{cases} \tag{3.7}$$
In particular,
$$\frac{1}{\pi}\int_0^\pi v(x)\cos x\,dx = Q_1, \tag{3.8}$$
$$\frac{1}{\pi}\int_0^\pi v(x)\cos^2 x\,dx = Q_0/2 + Q_2. \tag{3.9}$$
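The trace formulas (3.6) and (3.9) can be checked numerically in the simplest nontrivial case. The sketch below is our own illustration, not part of the paper: it assumes q = 2 with $b_1 = -b_2 = \beta$, for which the q = 2 discriminant of Sect. 2 gives band edges at ±c with $c^2 = \beta^2 + (a_1 + a_2)^2$ (so H is normalised) and (1.5) gives $\mathrm{Tr}\,L^2 = 2\beta^2 + 2(a_1^2 + a_2^2)$. On the real axis v(x) is taken to be $(1/q)\,\mathrm{arccosh}(|D|/2)$ where |D| > 2 and zero otherwise, with λ = −c cos x, and the integrals are approximated by the midpoint rule.

```python
import math

def check_trace_formulas(a1=1.0, a2=0.5, beta=0.8, n_points=200_000):
    """Compare both sides of (3.6) and (3.9) for q = 2, b = (beta, -beta).

    Returns (lhs_36, rhs_36, lhs_39, rhs_39).
    """
    q = 2
    A = math.sqrt(a1 * a2)
    c = math.sqrt(beta ** 2 + (a1 + a2) ** 2)        # outer band edges are -c and c
    tr_L2 = 2.0 * beta ** 2 + 2.0 * (a1 ** 2 + a2 ** 2)
    Q0 = math.log(c / (2.0 * A))
    Q2 = 0.25 - tr_L2 / (2.0 * q * c ** 2)

    def v(x):
        lam = -c * math.cos(x)
        half_D = abs((lam ** 2 - beta ** 2 - a1 ** 2 - a2 ** 2) / (a1 * a2)) / 2.0
        # v vanishes on the bands (|D| <= 2) and is positive in the gap.
        return math.acosh(half_D) / q if half_D > 1.0 else 0.0

    h = math.pi / n_points
    int0 = int2 = 0.0
    for i in range(n_points):
        x = (i + 0.5) * h                            # midpoint rule on [0, pi]
        vx = v(x)
        int0 += vx
        int2 += vx * math.cos(x) ** 2
    return int0 * h / math.pi, Q0, int2 * h / math.pi, Q0 / 2.0 + Q2

lhs36, rhs36, lhs39, rhs39 = check_trace_formulas()
print(lhs36, rhs36)
print(lhs39, rhs39)
```

The two numbers on each printed line should agree up to the quadrature error; the parameter values are arbitrary.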
Proof. Let us calculate the integral of k(z) along the boundary of $Z_+^0$ as shown in Fig. 3. We integrate first along the contour $\partial S_R$:
$$0 = \oint_{\partial S_R} k(z)\,dz = \int_0^\pi k(x)\,dx + \int_0^R (\pi + iv(\pi, y))\,i\,dy + \int_\pi^0 k(x + iR)\,dx + \int_R^0 iv(0, y)\,i\,dy.$$
Now take the limit R → ∞ using (3.3) for k(x + iR). We obtain
$$0 = \oint_{\partial Z_+^0} k(z)\,dz = \int_0^\pi k(x)\,dx - i\pi Q_0 - \pi^2/2 - \int_0^\infty (v(\pi, y) - v(0, y))\,dy.$$
Separating the real and imaginary parts, we obtain (3.6) and the identity
$$\int_0^\pi u(x)\,dx = \frac{\pi^2}{2} + \int_0^\infty (v(\pi, y) - v(0, y))\,dy. \tag{3.10}$$
Similarly, considering the integrals of $\big(k(z) - z - iQ_0 - i\sum_{j=1}^{n} Q_j/\cos^j z\big)\cos^n z$, n = 1, 2, …, along $\partial Z_+^0$ we get
$$\int_0^\pi v(x)\cos^n x\,dx = \sum_{j=0}^{n} Q_j\int_0^\pi \cos^{n-j} x\,dx, \tag{3.11}$$
whence the lemma follows.
Proof of Theorem 1.1. i) Since v(x) ≥ 0, it follows from (3.6) that c ≥ 2A. ii) Similarly, the right-hand side of (3.9) is nonnegative. Substituting there (3.4), we obtain (1.8) with the greater-or-equal sign. To make the inequalities strict, we use Lemma 2.2: if q > 1 there exists at least one open gap, hence v(x) is not identically zero. Thus the integrals in Lemma 3.2 are strictly positive and the theorem is proven.
Similarly, using (3.7), we can obtain further inequalities for c.

For a function f(z), z ∈ C+, we formally define the Dirichlet integral
$$I(f) = \frac{1}{\pi}\iint_{Z_+^0} |f'(z)|^2\,dx\,dy.$$
We shall now present two such integrals, the first of which we use below for our estimates.

Lemma 3.3 (Dirichlet integrals). The following identities hold:
$$I(k(z) - z) = Q_0 = \ln\frac{c}{2A}; \tag{3.12}$$
$$I([k(z) - z - iQ_0]\cos z) = \frac{Q_0}{2} + Q_2 - 2Q_0Q_2 - \frac{Q_1^2}{2}. \tag{3.13}$$
Proof. Using the Cauchy-Riemann conditions, we rewrite the integral I(f(z)) in the form
$$I(f(z)) = \frac{1}{\pi}\iint_{Z_+^0} |\nabla\,\mathrm{Im}\,f(z)|^2\,dx\,dy. \tag{3.14}$$
Recall the Green formula for the harmonic function ω(x, y) = Im f(z) in the closed area $S_R$:
$$\iint_{S_R} |\nabla\omega(x, y)|^2\,dx\,dy = \oint_{\partial S_R}\omega\,\frac{\partial\omega}{\partial n}\,dl, \tag{3.15}$$
where ∂/∂n denotes the normal derivative (the normal points outside $S_R$) to the curve $\partial S_R$. Let us calculate (3.15) for f = k(z) − z, that is for ω(x, y) = v(x, y) − y. We use the Cauchy-Riemann conditions to see that the integrals along both vertical boundaries of $S_R$ vanish. To calculate the integral along the interval [0, π] of the real line, we note (cf. Fig. 4) that u = const on each gap, and v = 0 on each band, hence the term v(x)∂u(x)/∂x vanishes on [0, π] and the contribution of this part of the path is $\int_0^\pi v(x)\,dx$. Taking then the limit as R → ∞ (using the asymptotics of k(z) on the part of $\partial S_R$ where y = R to see that the contribution of that part to the integral tends to zero), we obtain
$$\iint_{Z_+^0} |k'(z) - 1|^2\,dx\,dy = \int_0^\pi v(x)\,dx. \tag{3.16}$$
In view of (3.6), we have (3.12).

We turn to the calculation of I(f) for $f = [k(z) - z - iQ_0]\cos z$. Proceeding as above and using (3.9), we get
$$I([k(z) - z - iQ_0]\cos z) = \frac{3}{4}Q_0 + Q_2 + \frac{1}{2\pi}I, \tag{3.17}$$
$$I = \int_0^\pi (vu - vx + Q_0u)\sin 2x\,dx. \tag{3.18}$$
In order to evaluate I, we consider
$$0 = \lim_{\epsilon \to 0}\int_{\partial\tilde Z_+^0}\Big(\frac{k(z)^2}{2} - k(z)z + iQ_0k(z) + \frac{z^2}{2} + \frac{3}{2}Q_0^2 - iQ_0z + \frac{2Q_0Q_1}{\cos z} + \frac{2Q_0Q_2 + Q_1^2/2}{\cos^2 z}\Big)\sin 2z\,dz, \tag{3.19}$$
where $\partial\tilde Z_+^0$ differs from $\partial Z_+^0$ only in the ε-neighbourhood of the point z = π/2, where it is a semicircle above this point of radius ε (the integrand has a simple pole at z = π/2). The integrand in (3.19) is chosen to satisfy two conditions: 1) integration of the imaginary part of the first three terms in it over [0, π] gives I (3.18); 2) the integrand behaves like O(1/cos z) as z → i∞. We evaluate (3.19) as the contour integrals in the proof of Lemma 3.2. The imaginary part of (3.19) then yields
$$I = -\pi\Big(\frac{Q_0}{2} + 4Q_0Q_2 + Q_1^2\Big).$$
Substituting this into (3.17), we conclude the proof of the lemma.
Further formulas similar to (3.12, 3.13) can be obtained with more effort.
4. Properties of the Quasimomentum

By Lemma 3.1, the quasimomentum k(z) conformally maps the strip $Z_+^0$ onto the strip $K_+^0$. We can extend it to a conformal mapping of the upper half z-plane onto the upper half k-plane (with slits) in the following way. First, let $k(-x + iy) = -\overline{k(x + iy)}$ for $x + iy \in Z_+^0$ (reflection with respect to the y axis). Now continue k(z) periodically: k(z + 2πn) = k(z), n = ±1, ±2, …. Thus extended, k(z) is a particular (periodic) case of the general quasimomentum: a well-studied mapping [15, 8]. We can utilise, therefore, the known properties of the general quasimomentum. The rest of this section will be devoted to a review of some of these properties.

Define the so-called comb domain
$$K_+ = \mathbb{C}_+ \setminus \Gamma, \qquad \Gamma = \cup\Gamma_n, \qquad \Gamma_n = (u_n, u_n + ih_n), \quad h_n \ge 0, \quad n \in \mathbb{Z},$$
where $u_n$ is a strictly increasing sequence of real numbers such that $u_n \to \pm\infty$ as n → ±∞, and $\{h_n\}_{-\infty}^{\infty} \in \ell^\infty$. A conformal mapping k(z) = u + iv from the upper half plane C+ onto some comb $K_+$ is called a general quasimomentum if k(0) = 0, and k(iy) = iy(1 + o(1)) as y → ∞.

A general quasimomentum k(z) is a continuous function in $z \in \overline{\mathbb{C}}_+$. The inverse function z(k) maps each slit $\Gamma_n$ onto the interval (gap) $g_n = (z_n^-, z_n^+)$ of the real axis; and each interval $[u_{n-1}^+, u_n^-]$ onto (the band) $[z_{n-1}^+, z_n^-]$. Obviously, v(z) > 0 for z ∈ C+. The function v(x) is continuous on the real axis, equals zero on the bands, is positive and reaches the maximum $h_n$ in the nth gap. Hence $v \in L^\infty(\mathbb{R})$. It is known that also $u'_x(z) = v'_y(z) > 0$ for z ∈ C+. On the real line the function u'(x) is positive on the bands and equals zero on the gaps. (Thus it plays a similar role for the bands as v(x) for the gaps. However, it is more difficult to obtain good estimates for u'(x).) By the Herglotz theorem for positive harmonic functions (e.g., [17]), we have
$$v(z) = y\Big(1 + \frac{1}{\pi}\int_{-\infty}^{\infty}\frac{v(t)}{|t - z|^2}\,dt\Big), \qquad z \in \mathbb{C}_+. \tag{4.1}$$
For its harmonic conjugate:
$$u(z) = x + \frac{1}{\pi}\int_{-\infty}^{\infty}\Big(\frac{t - x}{|t - z|^2} - \frac{t}{t^2 + 1}\Big)v(t)\,dt + \mathrm{const}, \qquad z \in \mathbb{C}_+. \tag{4.2}$$
(The term with $\frac{t}{t^2+1}$ is here only to ensure convergence if v(t) does not decay at infinity.) Further,
$$k(z) = u + iv = z + C_0 + \frac{1}{\pi}\int_{-\infty}^{\infty} v(t)\Big(\frac{1}{t - z} - \frac{t}{1 + t^2}\Big)dt, \qquad C_0 = -\frac{1}{\pi}\int_{-\infty}^{\infty}\frac{v(t)\,dt}{t(1 + t^2)}, \tag{4.3}$$
where the value of the constant $C_0$ is obtained from the condition k(0) = 0. Since by the definition of the quasimomentum $v'_y(iy) = 1 + o(1)$ as y → ∞, we have $u'_x(iy) = 1 + o(1)$ as y → ∞. Hence, the Herglotz representation for $u'_x$ has the form
$$u'_x(z) = 1 + \frac{1}{\pi}\int_{-\infty}^{\infty}\frac{y\,u'(t)}{|t - z|^2}\,dt, \qquad z \in \mathbb{C}_+. \tag{4.4}$$
Consequently, on each gap $g_n$,
$$-v''_{xx}\big|_{y \downarrow 0} = (u'_x)'_y\big|_{y \downarrow 0} = \frac{1}{\pi}\int_{-\infty}^{\infty}\frac{u'(t)\,dt}{(t - x)^2} > 0, \qquad x \in g_n, \tag{4.5}$$
i.e. the function v(x) is concave on each gap. For our estimates we shall need the following result proved in [8]: For any gap $g_j$ one has
$$v(x) = w_j(x)\Big(1 + \frac{1}{\pi}\int_{\mathbb{R} \setminus g_j}\frac{v(t)\,dt}{|t - x|\,w_j(t)}\Big), \qquad w_j(z) = |(z - z_j^+)(z - z_j^-)|^{1/2}, \quad x \in g_j. \tag{4.6}$$
Thus, the graph of the function v(x) lies above the semicircle over the gap (see Fig. 5). Note that in our case the quasimomentum is periodic (with period 2π). This allows us to reduce the general formulas. For example, (4.3) takes on the form

Lemma 4.1. Let k(z) = k(z + 2π), z ∈ C+. Then
$$k(z) = z + C + \frac{1}{2\pi}\int_{-\pi}^{\pi} v(\theta)\,\mathrm{cotan}\,\frac{\theta - z}{2}\,d\theta, \qquad C = -\frac{1}{2\pi}\,\mathrm{V.p.}\int_{-\pi}^{\pi} v(\theta)\,\mathrm{cotan}\,\frac{\theta}{2}\,d\theta. \tag{4.7}$$

Proof. We represent (4.3) as a sum of integrals over the intervals [−π + 2πn, π + 2πn], n = −∞, …, ∞, change the integration variable in each of them to transform it to [−π, π], and then use the periodicity of v(x) and the identity
$$\mathrm{V.p.}\sum_{n=-\infty}^{\infty}\frac{1}{\theta - z + 2\pi n} = \frac{1}{2}\,\mathrm{cotan}\,\frac{\theta - z}{2}.$$
The value of the constant is obtained from the condition k(0) = 0.
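The partial-fraction identity used in this proof can be checked numerically by taking symmetric partial sums, which is what the principal value prescribes. The snippet below is a small illustration with a real argument w standing for θ − z; the function name and the truncation level are our choices, and the symmetric sum converges only like 1/N.

```python
import math

def cot_partial_fractions(w, n_terms=100_000):
    """Symmetric partial sum of sum_n 1/(w + 2 pi n) over n = -N..N."""
    total = 1.0 / w
    for n in range(1, n_terms + 1):
        total += 1.0 / (w + 2.0 * math.pi * n) + 1.0 / (w - 2.0 * math.pi * n)
    return total

w = 1.0
print(cot_partial_fractions(w), 0.5 / math.tan(w / 2.0))
```

The argument w must not be an integer multiple of 2π, where both sides have a pole.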
Fig. 5. The graph of v(x) is sketched on a gap $x \in (z_j^-, z_j^+)$
5. Estimates

We return to the quasimomentum we constructed in Sect. 3. Recall that for q > 1 the function v(x) is not identically zero. Recall also that we assume H to be normalised. First, as a side result, note that similarly to the case of the Hill equation [8], we have a double-sided estimate for $Q_0$ in terms of the gap lengths $|g_n|$ and maxima $h_n$ of v(x) in the gaps:

Lemma 5.1. Let q > 1 and $h_n = \max_{x \in g_n} v(x)$. Then
$$\frac{1}{2\pi}\sum_{n=1}^{q-1} h_n|g_n| < Q_0 < \frac{1}{\pi}\sum_{n=1}^{q-1} h_n|g_n|. \tag{5.1}$$

Proof. We have
$$Q_0 = \frac{1}{\pi}\int_0^\pi v(x)\,dx < \frac{1}{\pi}\sum_{n=1}^{q-1} h_n|g_n|,$$
which is the r.h.s. inequality of (5.1). To establish the l.h.s. we observe that the concavity of v(x) on gaps implies that
$$\int_{g_n} v(x)\,dx > \frac{1}{2}h_n|g_n|$$
and then use (3.6).
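For a concrete feel of (5.1), all three quantities can be written in closed form when q = 2 and $b_1 = -b_2 = \beta$. This is our own illustrative example, not one from the paper: by the q = 2 discriminant of Sect. 2, the single gap is $\lambda^2 < \beta^2 + (a_1 - a_2)^2$, the band edges are ±c with $c^2 = \beta^2 + (a_1 + a_2)^2$, and the critical point of D lies at λ = 0. The helper name is hypothetical.

```python
import math

def lemma_5_1_quantities(a1=1.0, a2=0.5, beta=0.8):
    """Return (lower, Q0, upper) of (5.1) for q = 2, b = (beta, -beta)."""
    q = 2
    A = math.sqrt(a1 * a2)
    c = math.sqrt(beta ** 2 + (a1 + a2) ** 2)
    gap_edge = math.sqrt(beta ** 2 + (a1 - a2) ** 2)   # gap is (-gap_edge, gap_edge)
    g1 = 2.0 * math.asin(gap_edge / c)                 # gap length in the variable z
    half_D0 = (beta ** 2 + a1 ** 2 + a2 ** 2) / (2.0 * a1 * a2)  # |D(0)| / 2
    h1 = math.acosh(half_D0) / q                       # height of the slit
    Q0 = math.log(c / (2.0 * A))
    return h1 * g1 / (2.0 * math.pi), Q0, h1 * g1 / math.pi

lower, Q0, upper = lemma_5_1_quantities()
print(lower, Q0, upper)
```

For these sample parameters $Q_0$ falls strictly between the two bounds, as the lemma predicts.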
In order to find a bound on the gap width from below, we shall need an upper bound on $h_n$:

Lemma 5.2. Let q > 1 and $h_+ = \max_n h_n$. Then
$$0 < h_+ < \ln\Big(\frac{c}{A} + \frac{1}{Aq}\Big|\sum_{j=1}^{q} b_j\Big|\Big) < \ln\frac{2c}{A}. \tag{5.2}$$

Proof. We begin with an argument similar to that in [6]. Let $\lambda_j$ be the eigenvalues of L (zeros of D(λ)). By the inequality between the algebraic and geometric means
$$A^q|D(\lambda)| = \prod_{j=1}^{q}|\lambda - \lambda_j| < (S(\lambda)/q)^q,$$
where $S(\lambda) = \sum_{j=1}^{q}|\lambda - \lambda_j|$. Observe that for $\lambda \in [\min\lambda_n, \max\lambda_n]$,
$$S(\lambda) \le \max\Big\{\sum_{j=1}^{q}(\lambda_j - \min\lambda_n),\ \sum_{j=1}^{q}(\max\lambda_n - \lambda_j)\Big\}. \tag{5.3}$$
Hence, $S(\lambda) < cq + \big|\sum_{j=1}^{q}\lambda_j\big| = cq + \big|\sum_{j=1}^{q} b_j\big| < 2cq$, where we used the fact that H is normalised. Now note that
$$h_+ = \frac{1}{q}\,\mathrm{arccosh}\,\frac{|D(\lambda_+)|}{2} < \frac{1}{q}\ln|D(\lambda_+)| < \ln(S(\lambda_+)/Aq),$$
where $\lambda_+$ is a point of maximum of |D(λ)| in the gaps. The positivity of $h_+$ follows from Lemma 2.2.
We have
$$\frac{1}{\pi}\int_0^\pi v(x)\,dx < \frac{1}{\pi}\sum_{n=1}^{q-1} h_n|g_n| < \frac{h_+}{\pi}\sum_{n=1}^{q-1}|g_n|, \tag{5.4}$$
and, in view of (3.6), we have the lower bound (1.9) for the total gap width. The inequality (1.10) is proved similarly using (3.9). To prove the r.h.s. of (1.11), we use the semicircle property of v(x):
$$Q_0 = \ln\frac{c}{2A} = \frac{1}{\pi}\int_0^\pi v(x)\,dx > \frac{1}{\pi}\sum_{n=1}^{q-1}\frac{1}{2}\pi(|g_n|/2)^2 = \frac{1}{8}\sum_{n=1}^{q-1}|g_n|^2,$$
which yields the r.h.s. of (1.11). We are yet to prove the l.h.s. of (1.11). For this we shall make use of the Dirichlet integral (3.12).

Lemma 5.3. We have the following estimate:
$$\sum_{n=1}^{q-1} h_n^2 \le \pi^2 b_+\ln(c/2A), \qquad \text{where } b_+ = \max\Big\{1, \frac{qh_+}{\pi}\Big\}. \tag{5.5}$$
Proof. First, recall the following result from [11]. Let a real function $f(u+iv)$ belong to the Sobolev space $W_2^1(D(h,\alpha,\beta))$, where the domain $D(h,\alpha,\beta) = \{\alpha < u < \beta,\ 0 < v < h\}$ for some $\beta > \alpha$, $h > 0$. Let $f$ obey the following conditions:
(a) $f(u+i0) = 0$ if $u \in (\alpha, \beta)$,
(b) $f$ is continuous in the closure $\bar{D}(h,\alpha,\beta)$.
Then it was shown in [11] that
$$\int_0^{h} \frac{|f(\alpha+iv)|^2}{v}\,dv \le \frac{\pi}{2}\max\Big\{1, \frac{h}{\beta-\alpha}\Big\} \iint_{D(h,\alpha,\beta)} |\nabla f|^2\,du\,dv. \tag{5.6}$$
Now take the function $f(u+iv) = \operatorname{Im}(k - z(k)) = v - y(k)$ and note that
(a) the identity $y(u+i0) = 0$ yields $f(u+i0) = 0$;
(b) as the function $y(k)$ is continuous in $K_+^0$, $f$ is also continuous there.
Therefore the conditions leading to (5.6) are fulfilled for this $f(k)$ and the domain $D = D(h_n, \pi n/q, \pi(n+1)/q)$. Since $f(\pi n/q + iv) = v$ if $0 < v < h_n$, we get
$$\frac{h_n^2}{2} = \int_0^{h_n} v\,dv \le \frac{\pi}{2}\max\Big\{1, \frac{q h_n}{\pi}\Big\} \iint_{D} |z'(k) - 1|^2\,du\,dv. \tag{5.7}$$
Replacing $\max\{1, q h_n/\pi\}$ by its maximum $b_+$ and summing over $n$, we obtain
$$\sum_{n=1}^{q-1} h_n^2 \le \pi b_+ \iint_{K_+^0} |z'(k) - 1|^2\,du\,dv = \pi b_+ \iint_{Z_+^0} |k'(z) - 1|^2\,dx\,dy, \tag{5.8}$$
where the last equality follows by change of variables. In view of (3.12), this finishes the proof.
Applying Lemma 5.3, we get
$$\ln\frac{c}{2A} = \frac{1}{\pi}\int_0^{\pi} v(x)\,dx < \frac{1}{\pi}\sum_{m=1}^{q-1} h_m |g_m| < \frac{1}{\pi}\sqrt{\sum_{m=1}^{q-1} h_m^2}\,\sqrt{\sum_{m=1}^{q-1} |g_m|^2} < \sqrt{b_+ \ln\frac{c}{2A}}\,\sqrt{\sum_{m=1}^{q-1} |g_m|^2},$$
which yields the l.h.s. of (1.11).
Acknowledgements. The authors were partly supported by the Sfb288. I.K. is grateful to S. Jitomirskaya, D. Lehmann, H. Schulz-Baldes, and F. Sobieczky for useful discussions.
References
1. Toda, M.: Theory of Nonlinear Lattices. Berlin: Springer, 1981
2. van Moerbeke, P.: The spectrum of Jacobi matrices. Inv. Math. 37, 45–81 (1976)
3. Perkolab, L.B.: Inverse problem for a periodic Jacobi matrix: Theor. Funkzij Funkz. Anal. i Prilozh. 42, 107–121 (1984), in Russian
4. Lancaster, P.: Theory of Matrices. NY: Academic, 1969
5. Deift, P., Simon, B.: Almost periodic Schrödinger operators III. The absolutely continuous spectrum in one dimension. Commun. Math. Phys. 90, 389–411 (1983)
6. Last, Y.: On the measure of gaps and spectra for discrete 1D Schrödinger operators. Commun. Math. Phys. 149, 347–360 (1992)
7. Marchenko, V., Ostrovski, I.: A characterization of the spectrum of the Hill operator. Mat. Sb. 97(139), 4(8), 540–606 (1975), in Russian; V. Marchenko: Sturm-Liouville operators and applications. Basel: Birkhäuser, 1986
8. Kargaev, P., Korotyaev, E.: Effective masses and conformal mappings. Commun. Math. Phys. 169, 597–625 (1995). Doklady RAN 336(3), 312–315 (1994), in Russian
9. Korotyaev, E.: Estimates for the Hill operator. I. J. Differential Equations 162 (1), 1–26 (2000)
10. Korotyaev, E.: The Estimates of periodic potential in terms of effective masses. Commun. Math. Phys. 183, 383–400 (1997)
11. Korotyaev, E.: Metric properties of conformal mappings on the complex plane with parallel slits. IMRN 10, 493–503 (1996)
12. Korotyaev, E.: Estimates of periodic potentials in terms of gap lengths. Commun. Math. Phys. 197 (3), 521–526 (1998)
13. Van Assche, W.: Asymptotics for Orthogonal Polynomials. Berlin: Springer, 1987
14. Killip, R., Simon, B.: Sum rules for Jacobi matrices and their applications to spectral theory. Ann. Math., to appear
15. Levin, B.: Majorants in the class of subharmonic functions. 1–3. Theor. Funkzij Funkz. Anal. i Prilozh. 51, 3–17 (1989); 52, 3–33 (1989), in Russian
16. Last, Y.: Almost everything about the almost Mathieu operator. In: XIth Intl. Congress Math. Phys. Proceedings, D. Iagolnitzer (ed.) Boston: International, 1995, pp. 366; S. Jitomirskaya, ibid., p. 373
17. Koosis, P.: Introduction to Hp spaces. Cambridge: Cambridge University Press, 1998
Communicated by B. Simon
Commun. Math. Phys. 234, 533–555 (2003) Digital Object Identifier (DOI) 10.1007/s00220-002-0771-7
Communications in
Mathematical Physics
Method of Quantum Characters in Equivariant Quantization
J. Donin∗, A. Mudrov
Department of Mathematics, Bar-Ilan University, 52900 Ramat-Gan, Israel. E-mail: [email protected]; [email protected]
Received: 28 April 2002 / Accepted: 3 October 2002
Published online: 24 January 2003 – © Springer-Verlag 2003
∗ This research is partially supported by the Israel Academy of Sciences grant no. 8007/99-01.
Abstract: Let $G$ be a reductive Lie group, $\mathfrak{g}$ its Lie algebra, and $M$ a $G$-manifold. Suppose $A_h(M)$ is a $U_h(\mathfrak{g})$-equivariant quantization of the function algebra $A(M)$ on $M$. We develop a method of building $U_h(\mathfrak{g})$-equivariant quantization on $G$-orbits in $M$ as quotients of $A_h(M)$. We are concerned with those quantizations that may be simultaneously represented as subalgebras in $U_h^*(\mathfrak{g})$ and quotients of $A_h(M)$. It turns out that they are in one-to-one correspondence with characters of the algebra $A_h(M)$. We specialize our approach to the situation $\mathfrak{g} = gl(n, \mathbb{C})$, $M = \mathrm{End}(\mathbb{C}^n)$, and $A_h(M)$ the so-called reflection equation algebra associated with the representation of $U_h(\mathfrak{g})$ on $\mathbb{C}^n$. For this particular case, we present in an explicit form all possible quantizations of this type; they cover symmetric and bisymmetric orbits. We build a two-parameter deformation family and obtain, as a limit case, the $U(\mathfrak{g})$-equivariant quantization of the Kirillov-Kostant-Souriau bracket on symmetric orbits.
1. Introduction
Let $G$ be a reductive Lie group and $\mathfrak{g}$ its Lie algebra. Let $M$ be a right $G$-manifold and $A_h(M)$ a quantization of the function algebra $A(M)$ on $M$. We suppose that the quantization is $U_h(\mathfrak{g})$-equivariant, i.e. $A_h(M)$ is a left $U_h(\mathfrak{g})$-module algebra. We consider the problem of "restricting" $A_h(M)$ to the $U_h(\mathfrak{g})$-equivariant quantization on orbits in $M$. This means finding an invariant ideal in $A_h(M)$, a deformation of the classical ideal specifying an orbit, such that the quotient algebra will be a flat deformation of the function algebra on the orbit. Our principal example is $M = \mathrm{End}(V)$, where $V$ is the underlying linear space of a finite dimensional representation of $G$. We consider $\mathrm{End}(V)$ as a right $G$-manifold with respect to the action by conjugation and study quantizations on $G$-invariant sub-manifolds in $\mathrm{End}(V)$. The problem of equivariant quantization on $\mathrm{End}(V)$, equipped with the structure of the adjoint $U_h(\mathfrak{g})$-module, was considered in [D1]. It leads to the study of the so-called
reflection equation (RE) algebras, [KSkl]. Torsion factored out, they are flat deformations of polynomial functions on the cone $\mathrm{End}_\Omega(V)$ of matrices whose tensor square commutes with the split Casimir $\Omega$, the image of the invariant symmetric element from $\mathfrak{g}^{\otimes 2}$, [DM2]. It is easy to show that the RE algebra can be restricted to orbits in $\mathrm{End}_\Omega(V)$ in the case $\mathfrak{g} = gl(n, \mathbb{C})$ and $V = \mathbb{C}^n$. The problem is to describe such a restriction explicitly, i.e., to find an appropriate ideal in the RE algebra and prove flatness of the quotient algebra as a module over $\mathbb{C}[[h]]$. To this end, we develop a quantization method, confining ourselves to those quantizations that may be represented as subalgebras in the function algebra on the quantum group and as quotients of the RE algebra. Being simultaneously a subalgebra and a quotient algebra of flat deformations guarantees flatness of the quantization.
In the classical case, every orbit in $M$ is realized as a subalgebra in $A(G)$ and a quotient of $A(M)$. The algebra $A(M)$ is a comodule over the Hopf algebra $A(G)$. A point $a \in M$ defines the map $g \mapsto ag$ from $G$ onto $O_a$, the orbit passing through $a$. It corresponds to a character $\chi^a$ of the algebra $A(M)$, which defines the reversed arrow from $A(O_a)$ to $A(G)$. Since $O_a \subset M$, the algebra $A(O_a)$ is also a quotient of $A(M)$. The idea of our method is to quantize this picture. Suppose $A(M)$ is quantized together with the $A(G)$-comodule structure, so that $A_h(M)$ is a comodule over the dual Hopf algebra $U_h^*(\mathfrak{g})$. Suppose there is a character of the algebra $A_h(M)$ that is a deformation of $\chi^a$. Then it defines a homomorphism from $A_h(M)$ to $U_h^*(\mathfrak{g})$, and the image of this homomorphism is a deformation of $A(O_a)$. We prove that, conversely, every equivariant homomorphism $A_h(M) \to U_h^*(\mathfrak{g})$ is of this form. We cannot expect to quantize all orbits in such a way. Indeed, a deformed algebra has a poorer supply of characters than its classical counterpart; therefore not every orbit, in general, fits our scheme.
Any character of $A_h(M)$ corresponds to a point on $M$ where the Poisson bracket vanishes. An open question is whether every such point can be quantized to a character of $A_h(M)$.
In this paper, we apply our method to the standard quantum group $U_h(gl(n, \mathbb{C}))$ and $M = \mathrm{End}(\mathbb{C}^n)$. We give a full classification of the Poisson brackets on semisimple orbits of $GL(n, \mathbb{C})$ that are obtained by restriction from the RE Poisson structure on $\mathrm{End}(\mathbb{C}^n)$. We describe all $U_h(gl(n, \mathbb{C}))$-equivariant quantizations that can be obtained within our approach and present them explicitly in terms of the RE algebra generators and relations. In particular, we build quantizations of symmetric and bisymmetric orbits of $GL(n, \mathbb{C})$ (those are the orbits consisting of matrices with two and three different eigenvalues, respectively). On symmetric orbits, the admissible Poisson brackets form a one-parameter family and we construct quantizations for all of them. On bisymmetric orbits, the admissible Poisson brackets are parameterized by a two dimensional variety. Within our approach, we quantize certain one-parameter sub-families. Also, we build quantizations on nilpotent orbits formed by matrices of zero square. It is known that a bisymmetric orbit has the structure of a homogeneous fiber bundle over a symmetric orbit as a base. This fact is important for the Penrose transformation theory. We show that our quantization respects that structure and build the quantization of the bundle map.
We extend the quantization on symmetric orbits obtained by the method of characters to a two-parameter $U_h(gl(n, \mathbb{C}))$-equivariant family. It is a restriction of the two-parameter quantization $L_{h,t}$ on $\mathrm{End}(\mathbb{C}^n)$, which has the algebra $U(gl(n, \mathbb{C}))[t]$ as the limit $h \to 0$. This algebra is a $U(gl(n, \mathbb{C}))$-equivariant quantization of the Poisson-Lie
bracket on $gl^*(n, \mathbb{C}) \simeq \mathrm{End}(\mathbb{C}^n)$. Taking the limit $h \to 0$ in the two-parameter deformation on symmetric orbits, we obtain explicitly the $U(gl(n, \mathbb{C}))$-equivariant quantizations of the Kirillov-Kostant-Souriau (KKS) bracket on symmetric spaces as quotients of the algebra $U(gl(n, \mathbb{C}))[t]$.
The setup of the article is as follows. The next section contains some basic information essential for our exposition. In particular, we collect some facts from the quantum group theory in Subsect. 2.1 and recall definitions of modules and comodules over Hopf algebras in Subsect. 2.2. Therein, we introduce the FRT² and RE algebras associated with the representation of $U_h(\mathfrak{g})$ on $V$ and recall their basic properties. In Sect. 3, we formulate the method of restricting the RE algebra to adjoint orbits in $\mathrm{End}(V)$ by means of the RE algebra characters. We specialize this method to the $GL(n, \mathbb{C})$-case in Sect. 4. We compute the RE Poisson structures on semisimple (co)adjoint orbits of $GL(n, \mathbb{C})$ in Subsect. 4.1 and present the classification of the RE algebra characters in Subsect. 4.2. On this ground, we build the quantizations of symmetric and bisymmetric orbits in Subsect. 4.3. In Subsect. 4.4, we show that the constructed quantization respects the structure of homogeneous fiber bundles on bisymmetric orbits. In Subsect. 4.5, we construct the two-parameter quantization on symmetric orbits and give the explicit quantization of the KKS bracket on them as a limit case.
2. Preliminaries
2.1. Quantum group $U_h(\mathfrak{g})$. Let $\mathfrak{g}$ be a reductive Lie algebra over $\mathbb{C}$ and $r \in \wedge^2 \mathfrak{g}$ a solution to the modified classical Yang-Baxter equation
$$[[r, r]] = \varphi, \tag{1}$$
where $[[\cdot, \cdot]]$ stands for the Schouten bracket and $\varphi$ is an invariant element from $\wedge^3 \mathfrak{g}$. The universal enveloping algebra $U(\mathfrak{g})$ is a Hopf one, with the coproduct $\Delta_0$, counit $\varepsilon_0$, and antipode $\gamma_0$ defined by
$$\Delta_0(X) = X \otimes 1 + 1 \otimes X, \qquad \varepsilon_0(X) = 0, \qquad \gamma_0(X) = -X, \qquad X \in \mathfrak{g}.$$
These operations are naturally extended over $U(\mathfrak{g})[[h]]$ as a topological $\mathbb{C}[[h]]$-module. The following theorem is implied by the results of Drinfeld, [Dr2], and Etingof and Kazhdan, [EK].
Theorem 2.1. There exists an element $F_h \in U^{\otimes 2}(\mathfrak{g})[[h]]$,
$$F_h = 1 \otimes 1 + \frac{h}{2}\, r + o(h),$$
such that $U(\mathfrak{g})[[h]]$ becomes a quasitriangular Hopf algebra $U_h(\mathfrak{g})$ with the coproduct $\Delta$, counit $\varepsilon$, and antipode $\gamma$:
$$\Delta(x) = F_h^{-1} \Delta_0(x) F_h, \qquad \varepsilon(x) = \varepsilon_0(x), \qquad \gamma(x) = u^{-1} \gamma_0(x) u, \qquad x \in U_h(\mathfrak{g}). \tag{2}$$
2 The FRT algebra is the quantization of the Drinfeld-Sklyanin bracket on End(V ). It was used by Faddeev, Reshetikhin, and Takhtajan for definition of Hopf algebra duals to quantum groups.
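The element $u$ is equal to $u = \gamma_0(F_1) F_2$, where $F_1 \otimes F_2 = F_h$ (summation implicit) and the universal R-matrix is given by
$$R_h = (F_h^{-1})_{21}\, e^{h\omega} F_h = 1 \otimes 1 + h(r + \omega) + o(h), \tag{3}$$
where $\omega \in \mathfrak{g}^{\otimes 2}$ is a symmetric invariant element such that the sum $r + \omega$ satisfies the classical Yang-Baxter equation, [Dr1]. The algebra $U_h(\mathfrak{g})$ is a quantization of $U(\mathfrak{g})$ in the sense that $U_h(\mathfrak{g})/h\,U_h(\mathfrak{g}) = U(\mathfrak{g})$ as a Hopf algebra. The coassociativity of the coproduct $\Delta$ implies the cocycle equation on $F_h$:
$$(\Delta_0 \otimes \mathrm{id})(F_h)(F_h \otimes 1) = \Phi_h^{-1} (\mathrm{id} \otimes \Delta_0)(F_h)(1 \otimes F_h). \tag{4}$$
Here, $\Phi_h$ is an invariant element from $U_h^{\otimes 3}(\mathfrak{g})$; it is called coassociator and satisfies the pentagon identity
$$(\mathrm{id}^{\otimes 2} \otimes \Delta_0)(\Phi_h)\,(\Delta_0 \otimes \mathrm{id}^{\otimes 2})(\Phi_h) = (1 \otimes \Phi_h)\,(\mathrm{id} \otimes \Delta_0 \otimes \mathrm{id})(\Phi_h)\,(\Phi_h \otimes 1).$$
Example 2.2 (Standard quantum groups). Let $\Delta$ be the root system of $\mathfrak{g}$ and $\Delta^\pm$ the subsets of positive and negative roots. Let $e_{\pm\alpha}$, $\alpha \in \Delta^+$, be root vectors normalized to $(e_\alpha, e_{-\alpha}) = 1$ with respect to the Killing form. The simplest example of the classical r-matrix is
$$r = \sum_{\alpha \in \Delta^+} e_\alpha \wedge e_{-\alpha}. \tag{5}$$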
It is called the standard r-matrix, and the corresponding quantum group the standard or Drinfeld-Jimbo quantization of $U(\mathfrak{g})$. Other possible r-matrices for simple Lie algebras are listed in [BD]. They were explicitly quantized in [ESS]. By $U_h^*(\mathfrak{g})$ we mean the FRT dual to $U_h(\mathfrak{g})$, [FRT]. This is a quantized polynomial algebra on the group $G$; as a linear space, $U_h^*(\mathfrak{g})$ consists of $\mathrm{End}^*(V)$ while $V$ runs over finite dimensional completely reducible representations of $U_h(\mathfrak{g})$.
2.2. FRT and RE algebras. In this section, we collect some basic facts about the FRT and RE algebras and their symmetries. All $U_h(\mathfrak{g})$-modules are considered to be free over $\mathbb{C}[[h]]$. Tensor products are assumed to be completed in the $h$-adic topology. Recall that an associative algebra $A$ over $\mathbb{C}[[h]]$ is called a left (right) $U_h(\mathfrak{g})$-module algebra if it is a left (right) module with respect to the action $\triangleright$ ($\triangleleft$) and its multiplication is consistent with the module structure:
$$x \triangleright (ab) = (x_{(1)} \triangleright a)(x_{(2)} \triangleright b), \qquad (ab) \triangleleft x = (a \triangleleft x_{(1)})(b \triangleleft x_{(2)}), \tag{6}$$
$$1 \triangleright a = a, \qquad a \triangleleft 1 = a, \tag{7}$$
$$x \triangleright 1_A = \varepsilon(x) 1_A, \qquad 1_A \triangleleft x = \varepsilon(x) 1_A \tag{8}$$
for any $x \in U_h(\mathfrak{g})$ and $a, b \in A$. We adopt the standard brief Sweedler notation for the coproduct, $\Delta(x) = x_{(1)} \otimes x_{(2)}$, where $x$ is an element from a Hopf algebra $H$. If $A$ is a left and right module simultaneously and the two actions commute with each other,
$$x_1 \triangleright (a \triangleleft x_2) = (x_1 \triangleright a) \triangleleft x_2, \qquad x_1, x_2 \in U_h(\mathfrak{g}),\ a \in A, \tag{9}$$
then it is called a bimodule. $A$ is a $U_h(\mathfrak{g})$-bimodule algebra if its bimodule and algebra structures are consistent with the coproduct in $U_h(\mathfrak{g})$ in the sense of (6–8). A right $U_h^*(\mathfrak{g})$-comodule algebra is an associative algebra $A$ endowed with a homomorphism $\delta : A \to A \otimes U_h^*(\mathfrak{g})$ obeying the coassociativity constraint
$$(\mathrm{id} \otimes \Delta) \circ \delta = (\delta \otimes \mathrm{id}) \circ \delta \tag{10}$$
and the conditions
$$\delta(1_A) = 1_A \otimes 1, \qquad (\mathrm{id} \otimes \varepsilon) \circ \delta = \mathrm{id}, \tag{11}$$
where the identity map on the right-hand side of the second equation assumes the isomorphism $A \otimes \mathbb{C}[[h]] \simeq A$. As for the coproduct $\Delta$, we use symbolic notation $\delta(x) = x_{[1]} \otimes x_{(2)}$, marking the tensor component belonging to $A$ with the square brackets. The subscript of the $U_h^*(\mathfrak{g})$-component is enclosed in parentheses. Every right $U_h^*(\mathfrak{g})$-comodule $A$ is a left $U_h(\mathfrak{g})$-module, the action being defined through the pairing $\langle \cdot\,, \cdot \rangle$ between $U_h(\mathfrak{g})$ and $U_h^*(\mathfrak{g})$:
$$x \triangleright a = a_{[1]} \langle x, a_{(2)} \rangle, \qquad x \in U_h(\mathfrak{g}),\ a \in A. \tag{12}$$
If $A$ is finite dimensional, the converse is also true. Similarly to right $U_h^*(\mathfrak{g})$-comodule algebras, one can consider left ones. They are also right $U_h(\mathfrak{g})$-module algebras.
A completely reducible finite dimensional representation $\rho$ of the universal enveloping algebra $U(\mathfrak{g})$ on a linear space $V$ is naturally extended to that of $U_h(\mathfrak{g})$ on $V[[h]]$. Denote by $M$ the matrix space $\mathrm{End}(V)$ and fix a basis $e_i$ in $V$. Let $e^i_j \in M$ be the matrix units acting on $V$ from the right by the rule $e_l e^i_j = \delta^i_l e_j$, where $\delta^i_l$ is the Kronecker symbol. In terms of the basis $\{e^i_j\}$, the multiplication is expressed by the formula $e^i_j e^l_k = \delta^l_j e^i_k$. As an associative algebra, $M$ is a bimodule over itself with respect to the right and left regular representation. The homomorphism $\rho$ equips $M[[h]]$ with the structure of a $U_h(\mathfrak{g})$-bimodule. By duality, the space $M^*[[h]]$ is a $U_h(\mathfrak{g})$-bimodule as well. Let $R$ denote the image of the universal R-matrix of $U_h(\mathfrak{g})$ under the representation $\rho$. We shall also use the matrix $S$, which differs from $R$ by the permutation $P$ on $V \otimes V$,
$$S = PR, \qquad P = \sum_{i,j} e^i_j \otimes e^j_i. \tag{13}$$
Example 2.3 (FRT algebra). Let $\{T^i_k\} \subset M^*$ be the dual basis to $\{e^k_i\}$. The associative algebra $T_R$ is generated by the matrix coefficients $\{T^i_k\} \subset M^*$ subject to the FRT relations
$$\sum_{\alpha,\beta} S^{\alpha\beta}_{ij}\, T^m_\alpha T^n_\beta = \sum_{\alpha,\beta} T^\alpha_i T^\beta_j\, S^{mn}_{\alpha\beta}, \tag{14}$$
or, in the standard compact form,
$$S T_1 T_2 = T_1 T_2 S, \tag{15}$$
where $T = \sum_{i,j} T^i_j e^j_i$. The matrix elements $T^i_j$ of the representation $\rho$ may be thought of as linear functions on $U_h(\mathfrak{g})$; they define an algebra homomorphism
$$T_R \to U_h^*(\mathfrak{g}). \tag{16}$$
Proposition 2.4. Let $\rho$ be a finite dimensional completely reducible representation of $U_h(\mathfrak{g})$ on a module $V[[h]]$ and $T_R$ the FRT algebra associated with $\rho$. Then, $T_R$ is a $U_h(\mathfrak{g})$-bimodule algebra, with the left and right actions extended from $M^*$:
$$x \triangleright T = T\rho(x), \qquad T \triangleleft x = \rho(x) T, \qquad x \in U_h(\mathfrak{g}). \tag{17}$$
It is a bialgebra, with the coproduct and counit being defined as
$$\Delta(T^i_j) = \sum_{l=1}^{n} T^l_j \otimes T^i_l, \qquad \varepsilon(T^i_j) = \delta^i_j. \tag{18}$$
Composition of the coproduct with the algebra homomorphism (16) applied to the (left) right tensor factor makes $T_R$ a (left) right $U_h^*(\mathfrak{g})$-comodule algebra.
Proof. Actions (17) are extended to the actions on the tensor algebra $T(M^*)[[h]]$, and the ideal generated by (15) turns out to be invariant. Concerning the bialgebra structure, the reader is referred to [FRT]. The structure of a comodule is inherited from the bialgebra one, so it is obviously coassociative. The coaction is an algebra homomorphism, being a composition of two homomorphisms.
Remark that the FRT relations (15) arose within the quantum inverse scattering method and were used for systematic definition of the quantum group duals in [FRT].
Example 2.5 (RE algebra). Another algebra of interest to us, $L_R$, is defined as the quotient of the tensor algebra $T(M^*)[[h]]$ by the RE relations
$$\sum_{\alpha,\beta,\mu,\nu} S^{\alpha\beta}_{ij}\, L^\mu_\beta\, S^{m\nu}_{\alpha\mu}\, L^n_\nu = \sum_{\alpha,\beta,\mu,\nu} L^\alpha_j\, S^{\beta\mu}_{i\alpha}\, L^\nu_\mu\, S^{mn}_{\beta\nu}, \tag{19}$$
where $\{L^i_j\}$ is the basis in $M^*$ that is dual to $\{e^j_i\}$. In the compact form, (19) reads
$$S L_2 S L_2 = L_2 S L_2 S, \tag{20}$$
where $L$ is the matrix $\sum_{i,j} L^i_j e^j_i$.
Proposition 2.6. Let $\rho$ be a finite dimensional completely reducible representation of $U_h(\mathfrak{g})$ on a module $V[[h]]$ and $T^k_l \in U_h^*(\mathfrak{g})$ its matrix coefficients. Let $L^i_j$ be the generators of the algebra $L_R$ associated with $\rho$. Then, $L_R$ is a left $U_h(\mathfrak{g})$-module algebra with the action extended from the coadjoint representation in $M^*[[h]]$:
$$x \triangleright L = \rho\big(\gamma(x_{(1)})\big)\, L\, \rho(x_{(2)}), \qquad x \in U_h(\mathfrak{g}). \tag{21}$$
It is a right $U_h^*(\mathfrak{g})$-comodule algebra with respect to the coaction
$$\delta(L^i_j) = \sum_{l,k} L^l_k \otimes \gamma(T^k_j)\, T^i_l. \tag{22}$$
Proof. Action (21) is naturally extended to $T(M^*)[[h]]$ and it preserves relations (20). The coassociativity of (22) is obvious. To prove that $\delta$ is an algebra homomorphism, one needs to employ commutation relations (15) and (20). For details, the reader is referred to [KS].
A spectral dependent version of the RE appeared first in [Cher]. In the form of (20), it may be found in articles [Skl, AFS] devoted to integrable models. The algebra $L_R$ was studied in [KSkl, KS]. Its relation to the braid group of a solid handlebody was pointed out in [K]. Remark that the algebras $T_R$ and $L_R$ may be defined for any quasitriangular Hopf algebra and its finite dimensional representation. Propositions 2.4 and 2.6 will be also valid.
It was proven in [DM2] that the FRT and RE algebras are twist-equivalent. For a detailed exposition of the twist theory, the reader is referred e.g. to [Mj]. Here we recall that the twist of a Hopf algebra $H$ with a cocycle $F$ is a Hopf algebra $\tilde{H}$ with the same multiplication as in $H$ and the coproduct
$$\tilde{\Delta}(x) = F^{-1} \Delta(x) F, \qquad x \in H. \tag{23}$$
To ensure coassociativity of $\tilde{\Delta}$, the element $F$ must obey the constraint
$$(\Delta \otimes \mathrm{id})(F)\, F_{12} = (\mathrm{id} \otimes \Delta)(F)\, F_{23}. \tag{24}$$
If $A$ is a left $H$-module algebra with the multiplication $m$, the new associative multiplication
$$\tilde{m}(a \otimes b) = m(F_1 \triangleright a \otimes F_2 \triangleright b), \qquad a, b \in A, \tag{25}$$
can be introduced on $A$. This algebra, $\tilde{A}$, is an $\tilde{H}$-module algebra. For example, if $H$ is quasitriangular with the universal R-matrix $R$, the co-opposite Hopf algebra $H^{op}$ is a twist with $F = R^{-1}$. Another example is the twisted tensor square $H \overset{R}{\otimes} H$ of a quasitriangular Hopf algebra. This is a twist of the ordinary tensor square $H \otimes H$ by the cocycle $F = R_{23} \in (H \otimes H) \otimes (H \otimes H)$.
Theorem 2.7 ([DM2]). Let $T_R$ and $L_R$ be respectively the FRT and RE algebras associated with a finite dimensional representation of $H$. Consider $T_R$ as an $H^{op} \otimes H$-module algebra. Then, there exists a twist from $H^{op} \otimes H$ to $H \overset{R}{\otimes} H$ such that the induced transformation (25) converts $T_R$ to $L_R$.
This twist is performed in two steps. The first one transforms the factor $H^{op}$ to $H$ in $H^{op} \otimes H$. It is carried out via the cocycle $F' = R_{13} \in (H^{op} \otimes H) \otimes (H^{op} \otimes H)$. The second twist, from $H \otimes H$ to $H \overset{R}{\otimes} H$, is via the cocycle $F'' = R_{23} \in (H \otimes H) \otimes (H \otimes H)$. The composite transformation with the cocycle $F = F' F''$ converts multiplication in $T_R$ to that in $L_R$ according to formula (25).
It follows from Theorem 2.7 that the algebras $T_R$ and $L_R$ are isomorphic as $\mathbb{C}[[h]]$-modules in the case $H = U_h(\mathfrak{g})$. Another consequence is that $L_R$ is a module not only over $H$ but over $H \overset{R}{\otimes} H$ as well. The action of $H$ on $L_R$ is induced by the Hopf algebra embedding $H \to H \overset{R}{\otimes} H$ via the coproduct.
3. Quantum Characters and Quantization on Orbits
3.1. Invariant monoid $M_\Omega$. Let $V$ be a complex linear space and $\rho$ a homomorphism of $U(\mathfrak{g})$ into the matrix algebra $M = \mathrm{End}(V)$. An element $\xi \in \mathfrak{g}$ generates the left and right invariant vector fields on $M$,
$$(\xi^l f)(A) = df\big(A\rho(\xi)\big), \qquad (f \xi^r)(A) = df\big(\rho(\xi) A\big), \qquad A \in M,\ f \in A(M),$$
defining the left and right actions of the algebra $U(\mathfrak{g})$ on functions on $M$. Given an element $\psi \in U(\mathfrak{g})$, by $\psi^r$ and $\psi^l$ we denote, correspondingly, its extensions to the right- and left-invariant differential operators on $M$. The left adjoint action of $U(\mathfrak{g})$ on $M$ is generated by the vector fields
$$\xi^{ad} = \xi^l - \xi^r, \qquad \xi \in \mathfrak{g}. \tag{26}$$
Let $\Omega \in M^{\otimes 2}$ be the image of the invariant symmetric tensor $\omega \in \mathfrak{g}^{\otimes 2}$ participating in the construction of $U_h(\mathfrak{g})$, cf. formula (3). Introduce the cone of matrices³
$$M_\Omega = \{A \in M \mid [\Omega, A \otimes A] = 0\}. \tag{27}$$
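The defining condition of the cone can be tested directly on small matrices. The sketch below assumes the gl(n) case with the trace pairing, where the image of the invariant symmetric tensor in the defining representation is the flip operator P on the tensor square; all function names are hypothetical.

```python
def kron(a, b):
    """Kronecker product of two square matrices given as lists of rows."""
    n, m = len(a), len(b)
    return [[a[i][j] * b[k][l] for j in range(n) for l in range(m)]
            for i in range(n) for k in range(m)]

def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def flip(n):
    """Permutation matrix P with P(e_i (x) e_j) = e_j (x) e_i."""
    p = [[0] * (n * n) for _ in range(n * n)]
    for i in range(n):
        for j in range(n):
            p[j * n + i][i * n + j] = 1
    return p

def in_cone(a):
    """Test [Omega, A (x) A] = 0, assuming Omega is the flip P (gl(n) case)."""
    omega = flip(len(a))
    aa = kron(a, a)
    return matmul(omega, aa) == matmul(aa, omega)

print(in_cone([[1, 2], [3, 4]]))
print(in_cone([[1, 0, 2], [0, 3, 0], [4, 5, 6]]))
```

Under this assumption every matrix passes the test, because conjugation by the flip exchanges the two tensor factors of A (x) A; this agrees with the remark that the cone is all of End(V) in the defining representation.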
Evidently, $M_\Omega$ is an algebraic variety; it is closed under the matrix multiplication and invariant with respect to the two-sided action of the group $G$. There are two remarkable Poisson structures on $M_\Omega$; they are given by the Drinfeld-Sklyanin bracket
$$r^{l,l} - r^{r,r} \tag{28}$$
and the RE bracket
$$r^{ad,ad} + (\omega^{r,l} - \omega^{l,r}). \tag{29}$$
Theorem 3.1. 1. The quotient of $T_R$ by torsion is a $U_h(\mathfrak{g})^{op} \otimes U_h(\mathfrak{g})$-equivariant quantization of Poisson bracket (28) on the cone $M_\Omega$. 2. The quotient of $L_R$ by torsion is a $U_h(\mathfrak{g})$-equivariant quantization of Poisson bracket (29) on the cone $M_\Omega$.
Proof. For the proof of the first statement, the reader is referred to [DS]. The second statement is deduced from the first one using the twist from Theorem 2.7, see [DM2].
³ The cone $M_\Omega$ coincides with $M = \mathrm{End}(V)$ for $\mathfrak{g} = sl(n, \mathbb{C})$ and $V = \mathbb{C}^n$.
3.2. General formulation of the method. In the previous section, we considered two examples of equivariant quantization on the space $M_\Omega$. Depending on a particular choice of symmetry, they were quotients of the algebras $T_R$ and $L_R$ by the ideal of $h$-torsion elements. Further we study $U_h(\mathfrak{g})$-invariant ideals in $L_R$ that are deformations of the classical ideals in the function algebra $A(M_\Omega)$ specifying the orbits. The problem is to ensure flatness of the quotient algebras. In this section, we formulate a method realizing the quantized orbits simultaneously as quotients and subalgebras of flat deformations, and that guarantees flatness of the quantization. Briefly, we construct quantizations that are quotients of $L_R$ and subalgebras in $A_h(G) = U_h^*(\mathfrak{g})$.
Our quantization method uses analogs of points, which are one-dimensional representations or characters of quantized algebras. Let $M$ be a manifold with a right action
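of the group $G$. Let $a \in M$ be a point and $\chi^a$ the corresponding character of the function algebra $A(M)$. On the diagram
$$\begin{array}{ccc} M \times G & \longrightarrow & M \\ \uparrow & & \uparrow \\ \{a\} \times G & \longrightarrow & O_a \end{array} \qquad\qquad \begin{array}{ccc} A(M) \otimes A(G) & \longleftarrow & A(M) \\ {\scriptstyle \chi^a \otimes \mathrm{id}}\ \downarrow & & \downarrow \\ \mathbb{C} \otimes A(G) & \longleftarrow & A(O_a) \end{array}$$
the left square displays embedding of the orbit $O_a$ passing through $a$ into the $G$-space $M$. The induced morphisms of the function algebras are depicted on the right square. Our goal is to quantize this picture.
Proposition 3.2. Let $H$ be a Hopf algebra over $\mathbb{C}[[h]]$ and $A$ a comodule algebra over its Hopf dual $H^*$. Any character $\chi$ of $A$ defines a homomorphism $\varphi_\chi : A \to H^*$ fulfilling the equivariance condition
$$(\varphi_\chi \otimes \mathrm{id}) \circ \delta = \Delta \circ \varphi_\chi. \tag{30}$$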
Conversely, any homomorphism $\varphi : A \to H^*$ obeying (30) has the form $\varphi_\chi$, where $\chi$ is a character of $A$.
Proof. Let $\chi$ be a character of $A$. We set $\varphi_\chi = (\chi \otimes \mathrm{id}) \circ \delta$ and make use of the isomorphism $\mathbb{C}[[h]] \otimes H^* \simeq H^*$. Due to coassociativity (10) of the coaction $\delta$, condition (30) holds. Conversely, if $\varphi$ satisfies (30), then we apply the counit of the Hopf algebra $H^*$ to the first tensor factor of (30) and obtain $\varphi = \varphi_\chi$ with $\chi = \varepsilon \circ \varphi$.
Given an associative algebra $A$ over $\mathbb{C}[[h]]$, let $\mathrm{Char}(A)$ denote its set of characters, i.e. homomorphisms $A \to \mathbb{C}[[h]]$. Any element $\chi \in \mathrm{Char}(A)$ defines an algebra $A_\chi$ closing up the commutative diagram (the right-most arrow is onto)
$$\begin{array}{ccc} A \otimes H^* & \overset{\delta}{\longleftarrow} & A \\ {\scriptstyle \chi \otimes \mathrm{id}}\ \downarrow & & \downarrow \\ \mathbb{C}[[h]] \otimes H^* & \longleftarrow & A_\chi \end{array} \tag{31}$$
We consider $H^*$ as the left coregular module over $H$. Then, because of (30), $\varphi_\chi$ is an $H$-equivariant homomorphism of algebras, and its image $A_\chi$ is an $H$-module algebra.
Definition 3.3. Let $A$ be a right $H^*$-comodule algebra. Two characters $\chi_1, \chi_2 \in \mathrm{Char}(A)$ are called gauge-equivalent, $\chi_1 \sim \chi_2$, if there exists an element $\eta \in \mathrm{Char}(H^*)$ such that
$$\chi_2 = (\chi_1 \otimes \eta) \circ \delta. \tag{32}$$
This is an equivalence relation. Indeed, $\chi_1 \sim \chi_2$ and $\chi_2 \sim \chi_3$ imply $\chi_1 \sim \chi_3$, due to coassociativity of the coaction. Also, $\chi_1 \sim \chi_2 \Rightarrow \chi_2 \sim \chi_1$, since (32) implies $\chi_1 = (\chi_2 \otimes \eta \circ \gamma) \circ \delta$. Obviously $\chi_1 \sim \chi_1$ via the counit $\varepsilon \in \mathrm{Char}(H^*)$ of the Hopf algebra $H^*$. In the classical situation $A = A(M)$ and $H^* = A(G)$, two characters are gauge-equivalent if and only if the corresponding points lie on the same orbit of $G$.
Proposition 3.4. Let $A$ be a right $H^*$-comodule algebra and $\chi_1 \sim \chi_2 \in \mathrm{Char}(A)$. There exist Hopf algebra automorphisms $f_H : H \to H$, $f_{H^*} : H^* \to H^*$ and an algebra automorphism $f_A : A \to A$ such that the diagram
$$\begin{array}{ccc} A & \overset{\varphi_{\chi_1}}{\longrightarrow} & H^* \\ {\scriptstyle f_A}\ \downarrow & & \downarrow\ {\scriptstyle f_{H^*}} \\ A & \overset{\varphi_{\chi_2}}{\longrightarrow} & H^* \end{array}$$
is commutative and $H$-equivariant with respect to the left $H$-actions
$$\begin{array}{ccc} H \otimes A & \longrightarrow & A \\ {\scriptstyle f_H \otimes f_A}\ \downarrow & & \downarrow\ {\scriptstyle f_A} \\ H \otimes A & \longrightarrow & A \end{array} \qquad\qquad \begin{array}{ccc} H \otimes H^* & \longrightarrow & H^* \\ {\scriptstyle f_H \otimes f_{H^*}}\ \downarrow & & \downarrow\ {\scriptstyle f_{H^*}} \\ H \otimes H^* & \longrightarrow & H^* \end{array}$$
Proof. Let $\eta$ be the character of $H^*$ realizing the equivalence $\chi_1 \sim \chi_2$ in (32). It may be thought⁴ of as a group-like element from $H$, i.e. the one whose coproduct is equal to $\Delta(\eta) = \eta \otimes \eta$. The similarity transformation with such an element is an automorphism of the Hopf algebra $H$. We set
$$f_{H^*}(x) = \langle \eta, x_{(1)} \rangle\, x_{(2)}\, \langle \eta, \gamma(x_{(3)}) \rangle, \qquad x \in H^*, \tag{33}$$
$$f_H(y) = \gamma(\eta)\, y\, \eta, \qquad y \in H, \tag{34}$$
$$f_A(a) = a_{[1]} \langle \eta, \gamma(a_{(2)}) \rangle, \qquad a \in A. \tag{35}$$
A straightforward verification using coassociativity of $\delta$ and $\Delta$ shows that these maps possess the required properties.
Specifically for the deformation quantization situation, we suppose that $A_h(M)$ is a quantization of $A(M)$ and it is a comodule algebra for $A_h(G) \simeq U_h^*(\mathfrak{g})$. The coaction $\delta : A_h(M) \to A_h(M) \otimes U_h^*(\mathfrak{g})$ is assumed to be a deformation of the classical map $A(M) \to A(M) \otimes A(G)$. This implies that $A_h(M)$ is an equivariant quantization of $A(M)$ because every $U_h^*(\mathfrak{g})$-comodule is a $U_h(\mathfrak{g})$-module via action (12). In connection with Proposition 3.2, there arises the problem of describing the set $\mathrm{Char}(A_h(M))$. The following statement is elementary.
Proposition 3.5. Let $M$ be a Poisson manifold and $A_h(M)$ a quantization of the function algebra $A(M)$. Let $a \in M$ and suppose $\chi_h$ is a character of the algebra $A_h(M)$ such that $\chi_h(f) = f(a) \bmod h$, $f \in A(M)$. Then, the Poisson bracket vanishes at the point $a$.
Proof. By definition, $\chi_h\big(m_h(f, g)\big) = \chi_h(f)\chi_h(g)$ for $f, g \in A(M)$. Expanding this equality in the deformation parameter and collecting the terms of first order in $h$, we come to the condition
$$m_1(f, g)(a) + \chi_1(fg) = f(a)\chi_1(g) + \chi_1(f) g(a). \tag{36}$$
Here $m_1$ and $\chi_1$ are the infinitesimal terms of the deformed multiplication $m_h$ and character $\chi_h$. Skew-symmetrization of (36) proves the statement.
Given a Poisson manifold $M$, we denote by $\mathrm{Char}_0(M)$ the subset of points where the Poisson bracket vanishes.
Question 1. Let $M$ be a Poisson manifold and $A_h(M)$ a quantization of its function algebra. Given a point $a \in \mathrm{Char}_0(M)$, does there exist a character $\chi_h \in \mathrm{Char}(A_h(M))$ such that $\chi_h(f) = f(a) \bmod h$ for all $f \in A(M)$?
⁴ This is obviously true for finite dimensional Hopf algebras. In the quantum group case, $H$ is complete in the $h$-adic topology. Then $H^*$ consists of continuous linear functionals on $H$. It is equipped with the weak topology, in which continuous linear functionals on $H^*$ form $H$.
3.3. Application to the quantum matrix algebra $L_R$. We specialize the construction suggested in the previous subsection to the situation when $H$ is the quantum group $U_h(\mathfrak{g})$, $H^*$ the quantized function algebra $A_h(G) \simeq U_h^*(\mathfrak{g})$ on the group $G$, and $M$ the matrix cone $M_\Omega$ relative to a given representation of $U_h(\mathfrak{g})$. The equivariant quantization $A_h(M)$ is the RE algebra $L_R$. It is a $U_h^*(\mathfrak{g})$-comodule algebra, and we may apply Proposition 3.2 in order to obtain quantization of orbits in $M_\Omega$ as quotients of $L_R$. Elements of $\mathrm{Char}(L_R)$ are defined by matrices $A \in M_\Omega[[h]]$ solving the numerical reflection equation
$$S A_2 S A_2 = A_2 S A_2 S. \tag{37}$$
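Equation (37) is a finite system of polynomial conditions on the entries of A, so candidate characters can be screened by direct matrix arithmetic. The sketch below rests on several assumptions that are not fixed by the text at this point: it uses the standard gl(n) R-matrix at a numerical value of q, identifies the matrix units with ordinary ones, places the off-diagonal term with the ordering i < k, and takes A_2 to be 1 (x) A. All names are hypothetical.

```python
from fractions import Fraction

def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def kron(a, b):
    """Kronecker product of two square matrices."""
    n, m = len(a), len(b)
    return [[a[i][j] * b[k][l] for j in range(n) for l in range(m)]
            for i in range(n) for k in range(m)]

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def flip(n):
    """Permutation matrix P with P(e_i (x) e_j) = e_j (x) e_i."""
    p = [[0] * (n * n) for _ in range(n * n)]
    for i in range(n):
        for j in range(n):
            p[j * n + i][i * n + j] = 1
    return p

def r_matrix(n, q):
    """Assumed standard gl(n) R-matrix; the i < k ordering is a convention."""
    lam = q - 1 / q
    r = [[Fraction(0)] * (n * n) for _ in range(n * n)]
    for i in range(n):
        for j in range(n):
            r[i * n + j][i * n + j] = q if i == j else Fraction(1)
    for i in range(n):
        for k in range(i + 1, n):
            r[k * n + i][i * n + k] += lam  # term E_ki (x) E_ik
    return r

def satisfies_re(a, q):
    """Check S A_2 S A_2 = A_2 S A_2 S with S = P R and A_2 = 1 (x) A."""
    n = len(a)
    s = matmul(flip(n), r_matrix(n, q))
    a2 = kron(identity(n), a)
    sa, a_s = matmul(s, a2), matmul(a2, s)
    return matmul(sa, sa) == matmul(a_s, a_s)

q = Fraction(3, 2)
print(satisfies_re([[1, 0], [0, 1]], q))
print(satisfies_re([[1, 0], [0, 0]], q))
print(satisfies_re([[1, 0], [0, 2]], q))
```

Exact rational arithmetic avoids tolerance issues. Which degenerate diagonal matrices pass depends on the ordering convention chosen in r_matrix, so the outcome for a specific matrix should be read relative to that assumption.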
Definition 3.6. We say that a matrix $A_h \in M_\Omega[[h]]$ belongs to the orbit $O_A$ if $A_h = A \triangleleft u$ for some invertible element $u \in U(\mathfrak{g})[[h]]$.
Theorem 3.7. Let $A_h \in M_\Omega[[h]]$ be a solution of (37) and $\chi = \chi^{A_h}$ the corresponding character of the algebra $L_R$. Suppose the matrix $A_h$ belongs to $O_A \subset M_\Omega$. Then, the algebra $A_\chi$ closing up the commutative diagram (the right-most arrow is onto)
$$\begin{array}{ccc} L_R \otimes U_h^*(\mathfrak{g}) & \longleftarrow & L_R \\ {\scriptstyle \chi^{A_h} \otimes \mathrm{id}}\ \downarrow & & \downarrow \\ \mathbb{C}[[h]] \otimes U_h^*(\mathfrak{g}) & \longleftarrow & A_\chi \end{array} \tag{38}$$
is a $U_h(\mathfrak{g})$-equivariant quantization of the polynomial algebra on $O_A$. Its embedding into the left coregular $U_h(\mathfrak{g})$-module $U_h^*(\mathfrak{g})$ is equivariant.
Proof. The algebra $A_\chi$ is simultaneously defined by the commutative diagram (38) as a quotient and a subalgebra of flat $\mathbb{C}[[h]]$-modules, therefore it is flat. $A_\chi$ is invariant under the left action $x \triangleright T = T\rho(x)$, $x \in U_h(\mathfrak{g})$, and we should check that it is isomorphic to $A(O_A)[[h]]$ as a $U(\mathfrak{g})[[h]]$-module. Let $u$ be an element from $U_h(\mathfrak{g})$ such that $A_h = A \triangleleft u$. The coboundary twist of $U_h(\mathfrak{g})$ with the element $\Delta(u)(u^{-1} \otimes u^{-1})$ converts $A_\chi$ into $\tilde{A}_\chi$, for which $A$ is a character. The matrix $A$ defines an equivariant embedding of $\tilde{A}_\chi$ into the twisted algebra $\tilde{U}_h^*(\mathfrak{g})$, which is generated by the matrix elements of $\tilde{T} = \rho(u) T \rho(u^{-1})$. Its image is a subalgebra $\tilde{A}_\chi$ in $\tilde{U}_h^*(\mathfrak{g})$. Since $U_h(\mathfrak{g})$ itself is a twist of $U(\mathfrak{g})[[h]]$, the algebras $\tilde{A}_\chi$ and $A_\chi$ are isomorphic as $U(\mathfrak{g})[[h]]$-modules. Obviously, the subalgebra $\tilde{A}_\chi$ coincides with $A(O_A)$ modulo $hA(O_A)$.
The gauge-equivalence between characters of $L_R$ is realized by means of elements from $\mathrm{Char}(U_h^*(\mathfrak{g}))$, which are described by the following proposition.
Proposition 3.8. For the Drinfeld-Jimbo quantum group $U_h(\mathfrak{g})$, the set $\mathrm{Char}(U_h^*(\mathfrak{g}))$ consists of the elements $e^\eta \in U_h(\mathfrak{g})$, where $\eta$ belongs to the Cartan subalgebra $\mathfrak{h} \subset \mathfrak{g}$.
Proof. The standard r-matrix (5) is of zero weight, so the subset $\mathrm{Char}_0(G)$ coincides with the maximal torus corresponding to the Cartan subalgebra. As an associative algebra, $U_h^*(\mathfrak{g})$ is isomorphic to $U(\mathfrak{g}^*)[[h]]$, [Dr1]. Its characters are parameterized by the dual space to $\mathfrak{g}^*/[\mathfrak{g}^*, \mathfrak{g}^*] = \mathfrak{h}^*$. On the other hand, the elements $e^\eta$, $\eta \in \mathfrak{h}$, are group-like because all $\eta$ are primitive with respect to the coproduct in $U_h(\mathfrak{g})$, [Dr1].
In the next section we specialize our consideration to the Drinfeld-Jimbo quantum group $U_h(gl(n, \mathbb{C}))$, its representation in $\mathrm{End}(\mathbb{C}^n)$, and the related RE algebra $L_R$.
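4. The $GL(n, \mathbb{C})$-case
From now on, we concentrate on the case $\mathfrak{g} = gl(n, \mathbb{C})$, $G = GL(n, \mathbb{C})$ and its representation in $M = \mathrm{End}(\mathbb{C}^n)$. Let us fix the Cartan subalgebra $\mathfrak{h} \subset \mathfrak{g}$ as the subspace of diagonal matrices in $M$. As above, we denote by $\Delta_\mathfrak{g}$, $\Delta^\pm_\mathfrak{g}$ the sets of all, positive, and negative roots with respect to $\mathfrak{h}$. It is customary, in the $gl(n, \mathbb{C})$-case under consideration, to take the trace pairing for the invariant scalar product $(\cdot\,, \cdot)$ on $\mathfrak{g}$. Let $e_\alpha$, $e_{-\alpha}$, $\alpha \in \Delta^+_\mathfrak{g}$, be the root vectors normalized to $(e_\alpha, e_{-\alpha}) = 1$. Put $h_i$ to be the diagonal matrix idempotents $h_i = e^i_i$, $i = 1, \ldots, n$; they form an orthonormal basis in $\mathfrak{h}$. The standard classical r-matrix and the invariant symmetric 2-tensor $\omega$ for $gl(n, \mathbb{C})$ are
$$r = \sum_{\alpha \in \Delta^+_\mathfrak{g}} (e_{-\alpha} \otimes e_\alpha - e_\alpha \otimes e_{-\alpha}), \tag{39}$$
$$\omega = \sum_{i=1}^{n} h_i \otimes h_i + \sum_{\alpha \in \Delta^+_\mathfrak{g}} (e_{-\alpha} \otimes e_\alpha + e_\alpha \otimes e_{-\alpha}). \tag{40}$$
Quantization of these data yields the Yang-Baxter operator
$$R = q \sum_{i=1,\ldots,n} e^i_i \otimes e^i_i + \sum_{\substack{i,j=1,\ldots,n \\ i \ne j}} e^i_i \otimes e^j_j + (q - q^{-1}) \sum_{\substack{i,k=1,\ldots,n \\ i < k}} e^k_i \otimes e^i_k, \tag{41}$$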
where q = eh . This is the image of the universal R-matrix R ∈ Uh⊗2 gl(n, C) . The corresponding matrix S, defined by (13), satisfies the Hecke condition S 2 − (q − q −1 )S = 1 ⊗ 1.
(42)
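The Hecke condition (42) can be checked by direct computation for small n. The following is a minimal sketch, not the authors' construction: it assumes that S = PR with P the permutation operator (formula (13) is not reproduced here), that the matrix unit in the off-diagonal term sends the second basis vector of a pair to the first, and it uses exact rational arithmetic with a hypothetical helper `hecke_defect`. The opposite triangular convention for the off-diagonal term satisfies the same quadratic relation.

```python
from fractions import Fraction


def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def hecke_defect(n, q):
    """Return S^2 - (q - 1/q) S - 1 for S = P R acting on C^n (x) C^n.

    Assumed conventions (not taken from the paper): basis vector e_i (x) e_j
    sits at position i*n + j, and the off-diagonal part of R is
    (q - 1/q) * sum_{i<j} E_ij (x) E_ji with E_ij e_j = e_i.
    """
    size = n * n
    pos = lambda i, j: i * n + j
    w = q - 1 / q
    R = [[Fraction(0)] * size for _ in range(size)]
    P = [[Fraction(0)] * size for _ in range(size)]
    for i in range(n):
        for j in range(n):
            R[pos(i, j)][pos(i, j)] = q if i == j else Fraction(1)
            P[pos(j, i)][pos(i, j)] = Fraction(1)  # P(e_i (x) e_j) = e_j (x) e_i
            if i < j:
                # E_ij (x) E_ji maps e_j (x) e_i to e_i (x) e_j
                R[pos(i, j)][pos(j, i)] = w
    S = matmul(P, R)
    S2 = matmul(S, S)
    return [[S2[a][b] - w * S[a][b] - (1 if a == b else 0) for b in range(size)]
            for a in range(size)]


defect = hecke_defect(3, Fraction(3, 2))
print(all(entry == 0 for row in defect for entry in row))
```

Working over the rationals avoids any floating-point tolerance; the value of q is arbitrary as long as it is nonzero.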
We start our quantization programme with computing the relevant Poisson structures.

4.1. RE Poisson structures on adjoint orbits of GL(n, C). Let $M$ be a right $G$-space and $r_M$ the bivector field on $M$ obtained from the classical r-matrix $r \in \wedge^2 \mathfrak{g}$ by the group action. Recall, [DGS], that if $\mathcal{A}_h(M)$ is a $U_h(\mathfrak{g})$-equivariant quantization on a right $G$-space $M$, the corresponding Poisson structure has the form (see footnote 5)

$$ p = r_M + f, \tag{43} $$

where $r_M$ is the bivector field on $M$ generated by the r-matrix via the group action and $f$ is a $G$-invariant bivector field on $M$ whose Schouten bracket is equal to $[[f, f]] = -[[r_M, r_M]] = -\varphi_M$. Here $\varphi \in \wedge^3 \mathfrak{g}$ is an ad-invariant element, see (1). The group $G$ is equipped with the Poisson-Lie structure related to the classical r-matrix $r$, [Dr1]. Admissible Poisson brackets on $M$ are such that the action $M \times G \to M$, where $M \times G$ is the Cartesian product of Poisson manifolds, is a Poisson map. The bivector $f$ defines a skew-symmetric bilinear operation on $\mathcal{A}(M)$ called a $\varphi$-bracket.

5 We use a slightly different definition of the quantum group than in [DGS]; this results in a different sign before $r_M$.

Specifically for the case $M = \mathrm{End}(V)$, where the
latter is considered as the right adjoint $G$-module, the invariant part of the RE Poisson bracket is given by the expression within parentheses in (29). Classification of $\varphi$-brackets on semisimple orbits of semisimple Lie groups was done in [DGS] and [Kar]. In this subsection, we compute those obtained by restriction of the RE bracket (29) to semisimple orbits of the group $GL(n, \mathbb{C})$ in $\mathrm{End}(\mathbb{C}^n)$. Semisimple orbits are those consisting of diagonalizable matrices. They are characterized by eigenvalues and their multiplicities. As abstract homogeneous spaces, they are specified by ordered sets of multiplicities, which fix the stabilizer subgroups $H \subset G$. Eigenvalues specify a Poisson structure on the right coset space $H \backslash G$ by embedding it into the Poisson manifold $\mathrm{End}(\mathbb{C}^n)$. We shall show that the RE Poisson structures on $H \backslash G$ form a variety whose dimension coincides with the rank of the orbit.

Lemma 4.1. Let $X$, $Y$ be linear functions from $M^*$ identified with $M$ by the trace pairing. The $U(\mathfrak{g})$-invariant part of the reflection equation Poisson bracket (29) on $M$ is given by

$$ f(X, Y)(A) = (A^2, [X, Y]) = \mathrm{Tr}(A^2 [X, Y]), \qquad A \in M. \tag{44} $$
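Two properties of the bracket (44) are easy to confirm on explicit matrices: it is skew-symmetric in X and Y, and it is unchanged when X, Y and A are conjugated simultaneously. The sketch below is only an illustration on 2 × 2 integer matrices; the helper names and the sample matrices are assumptions chosen for the example.

```python
def mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def sub(a, b):
    """Entrywise difference a - b."""
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


def f(X, Y, A):
    """Invariant part of the RE bracket, f(X, Y)(A) = Tr(A^2 [X, Y])."""
    commutator = sub(mul(X, Y), mul(Y, X))
    return trace(mul(mul(A, A), commutator))


def conjugate(g, g_inv, m):
    """Return g m g^{-1}; the inverse is supplied explicitly."""
    return mul(mul(g, m), g_inv)


A = [[1, 2], [3, 4]]
X = [[0, 1], [0, 0]]
Y = [[0, 0], [1, 0]]
g, g_inv = [[1, 1], [0, 1]], [[1, -1], [0, 1]]

print(f(X, Y, A), f(Y, X, A))
print(f(conjugate(g, g_inv, X), conjugate(g, g_inv, Y), conjugate(g, g_inv, A)))
```

Invariance under conjugation follows from cyclicity of the trace, so the check does not depend on whether the group action is written on the left or on the right.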
Proof. We identify $gl(n, \mathbb{C})$ with $M$, as well as the tangent space at the point $A \in M$. Right- and left-invariant vector fields on $M$ take the form $\xi^l \triangleright X = \xi X$, $X \triangleleft \xi^r = X\xi$, $\xi \in \mathfrak{g}$, $X \in M^* \sim M$. Calculating the invariant part of the RE bracket (29) with the invariant element $\omega$ from (40), we find

$$ \bigl[(X \triangleleft \omega_1^r)(\omega_2^l \triangleright Y) - (\omega_1^l \triangleright X)(Y \triangleleft \omega_2^r)\bigr]\big|_A = (X\Omega_1, A)(\Omega_2 Y, A) - (\Omega_1 X, A)(Y\Omega_2, A). $$

Here, we used the conventional notation with implicit summation, $\omega = \omega_1 \otimes \omega_2$ and $\Omega = \Omega_1 \otimes \Omega_2$. The $U\bigl(gl(n, \mathbb{C})\bigr)$-invariant element $\Omega$ coincides with the matrix permutation $P$, see (13). Using the identity $(\Omega, X \otimes Y) = (X, Y)$, which is valid for any $X, Y \in M$, we come to formula (44).

Introduce the notation $G_m = GL(m)$, $m \in \mathbb{N}$, and put $G_{[n_1,\dots,n_k]} = G_{n_1} \times \dots \times G_{n_k}$, a Levi subgroup in $G_n$. To every set $\{n_i\}$ of $k$ positive integers such that $\sum_{i=1}^{k} n_i = n$ there corresponds the right coset space $O_{[n_1,\dots,n_k]} = G_{[n_1,\dots,n_k]} \backslash G_n$. This becomes a one-to-one correspondence between classes of isomorphic homogeneous manifolds and sets $\{n_i\}$ provided they are ordered. An abstract homogeneous space $O_{[n_1,\dots,n_k]}$ is realized by orbits in $\mathrm{End}(\mathbb{C}^n)$. Every such realization induces a Poisson structure on it via restriction of the RE bracket (29), and different orbits give different Poisson structures.

Consider the direct sum decomposition $\mathbb{C}^n = \mathbb{C}^{n_1} \oplus \dots \oplus \mathbb{C}^{n_k}$ and set $P_{n_i} : \mathbb{C}^n \to \mathbb{C}^{n_i}$ to be the diagonal projector of rank $n_i$, $i = 1, \dots, k$. We define the orbit $O_{[n_1,\dots,n_k;\lambda_1,\dots,\lambda_k]}$ as that passing through the point $\sum_{i=1}^{k} \lambda_i P_{n_i}$, where $\lambda_i$ are pairwise distinct complex numbers. This correspondence between orbits and diagonal matrices becomes one-to-one if we require some linear ordering among those parameters $\lambda_i$ that correspond to equal $n_i$. We choose the lexicographic ordering on $\mathbb{C}$: $\lambda_1 \succ \lambda_2$ if and only if $\mathrm{Re}\,\lambda_1 > \mathrm{Re}\,\lambda_2$, or $\mathrm{Re}\,\lambda_1 = \mathrm{Re}\,\lambda_2$ and $\mathrm{Im}\,\lambda_1 > \mathrm{Im}\,\lambda_2$.

To summarize, the abstract homogeneous manifolds with the RE Poisson structures are in one-to-one correspondence with ordered sets of pairs $(n_i, \lambda_i) \in \mathbb{N} \times \mathbb{C}$ such that $n_i > 0$ and $\sum_i n_i = n$. The ordering on $\mathbb{N} \times \mathbb{C}$ is defined as

$$ (n_1, \lambda_1) \succ (n_2, \lambda_2) \iff n_1 > n_2 \ \text{ or } \ n_1 = n_2 \text{ and } \lambda_1 \succ \lambda_2. \tag{45} $$

Our further goal is to compute RE Poisson brackets, which are distinguished by their invariant parts, on the abstract homogeneous manifolds $O_{[n_1,\dots,n_k]}$. Let $\mathfrak{l} \subset \mathfrak{g} = gl(n, \mathbb{C})$
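be the Lie algebra of the stabilizer $G_{[n_1,\dots,n_k]}$ and $\Delta_{\mathfrak{l}} \subset \Delta_{\mathfrak{g}}$ its root system. The set of quasi-roots $\Delta_{\mathfrak{g}/\mathfrak{l}}$ corresponding to a given Levi subalgebra $\mathfrak{l}$ consists of equivalence classes, $\Delta_{\mathfrak{g}/\mathfrak{l}} = (\Delta_{\mathfrak{g}} - \Delta_{\mathfrak{l}}) \bmod \mathbb{Z}\Delta_{\mathfrak{l}}$, [DGS]. For any pair of quasi-roots $\bar\alpha$ and $\bar\beta$, it is possible to choose representatives $\alpha$ and $\beta$ such that $\alpha + \beta = \bar\alpha + \bar\beta$. The root vectors $e_\alpha$, $\alpha \in \bar\alpha \in \Delta_{\mathfrak{g}/\mathfrak{l}}$, form a basis of the tangent space $\mathfrak{g}/\mathfrak{l}$ at the origin of $O_{[n_1,\dots,n_k]}$. When $\mathfrak{g} = gl(n, \mathbb{C})$, we can consider $\mathfrak{h}^*$ as a subspace in $M$, using the invariant trace pairing. All roots of $\mathfrak{g}$ are parameterized by pairs of integers $i, j = 1, \dots, n$, $i \neq j$, and can be represented by the elements $\alpha_{ij} = e^i_i - e^j_j$. By definition, the orbit $O_{[n_1,\dots,n_k;\lambda_1,\dots,\lambda_k]}$ passes through the point $\sum_{i=1}^{k} \lambda_i P_{n_i}$. The corresponding quasi-roots are labeled by the pairs of integers $i, j = 1, \dots, k - 1$, $i \neq j$. Every quasi-root $\bar\alpha_{ij}$ is the set of elements $h_i - h_j$, where $h_i$ are diagonal matrix idempotents of rank one such that $h_i P_{n_j} = \delta_{ij} h_i = P_{n_j} h_i$.

Proposition 4.2. Consider an ordered set of $k$ positive integers $n_1 \geq n_2 \geq \dots \geq n_k$. The invariant part of the RE Poisson bracket (29) on the homogeneous space $O_{[n_1,\dots,n_k]}$ is induced by the right-invariant bivector field

$$ \sum_{\bar\alpha_{ij} \in \Delta^+_{\mathfrak{g}/\mathfrak{l}}} \frac{\lambda_i + \lambda_j}{\lambda_i - \lambda_j} \sum_{\alpha \in \bar\alpha_{ij}} (e_\alpha \otimes e_{-\alpha} - e_{-\alpha} \otimes e_\alpha), \tag{46} $$

where $k$ complex numbers $\{\lambda_i\}$ form a decreasing sequence of pairs $(n_i, \lambda_i)$, $i = 1, \dots, k$.

Proof. As was shown in [DGS], an invariant bracket on $O_{[n_1,\dots,n_k]}$ must have the form

$$ \sum_{\bar\alpha_{ij} \in \Delta^+_{\mathfrak{g}/\mathfrak{l}}} c(\bar\alpha_{ij}) \sum_{\alpha \in \bar\alpha_{ij}} (e_\alpha \otimes e_{-\alpha} - e_{-\alpha} \otimes e_\alpha). \tag{47} $$

To find the coefficients $c(\bar\alpha_{ij})$, let us evaluate (47) on the linear functions $e_{-\alpha}, e_{+\alpha} \in M^* \simeq M$, $\alpha \in \bar\alpha_{ij}$, at a diagonal matrix $A$:

$$ c(\bar\alpha_{ij})\, e_{-\alpha}([A, e_\alpha])\, e_\alpha([A, e_{-\alpha}]) = c(\bar\alpha_{ij})\, ([e_\alpha, e_{-\alpha}], A)\, ([e_{-\alpha}, e_\alpha], A) = -c(\bar\alpha_{ij})\, \bigl(\alpha(A)\bigr)^2. $$

On the other hand, formula (44) gives $f(e_{-\alpha}, e_\alpha)|_A = (A^2, [e_{-\alpha}, e_\alpha]) = -\alpha(A^2)$. Substituting $A = \sum_{i=1}^{k} \lambda_i P_{n_i}$, we obtain the coefficients $c(\bar\alpha_{ij})$:

$$ c(\bar\alpha_{ij}) = \frac{\alpha(A^2)}{\bigl(\alpha(A)\bigr)^2} = \frac{\lambda_i^2 - \lambda_j^2}{(\lambda_i - \lambda_j)^2} = \frac{\lambda_i + \lambda_j}{\lambda_i - \lambda_j}. $$

Note that the RE Poisson structures on $O_{[n_1,\dots,n_k]}$ form a $(k-1)$-dimensional variety because the bivector (46) is stable under the dilation transformation $\lambda_i \to \nu\lambda_i$, $i = 1, \dots, k$, $\nu \neq 0$.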
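The coefficient computed in the proof, and its invariance under dilation, can be reproduced with exact rational arithmetic. This is a small illustrative sketch; the function name `coefficient` and the sample eigenvalues are assumptions made for the example.

```python
from fractions import Fraction


def coefficient(lam_i, lam_j):
    """c = alpha(A^2) / alpha(A)^2 for alpha = h_i - h_j on a diagonal matrix A.

    For A = sum_i lam_i P_{n_i} one has alpha(A) = lam_i - lam_j and
    alpha(A^2) = lam_i^2 - lam_j^2.
    """
    lam_i, lam_j = Fraction(lam_i), Fraction(lam_j)
    return (lam_i ** 2 - lam_j ** 2) / (lam_i - lam_j) ** 2


lam_i, lam_j, nu = Fraction(3), Fraction(1), Fraction(7, 5)
print(coefficient(lam_i, lam_j))            # 2, i.e. (3 + 1) / (3 - 1)
print(coefficient(nu * lam_i, nu * lam_j))  # unchanged under lam -> nu * lam
```

Since only ratios of the eigenvalues enter, one of the k parameters is redundant, which is the (k − 1)-dimensional count stated above.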
4.2. Characters of the RE algebra relative to the standard quantum group $U_h\bigl(gl(n, \mathbb{C})\bigr)$. In this subsection, we formulate the classification theorem for characters of the algebra $\mathcal{L}_R$ associated with the representation of the standard $U_h\bigl(gl(n, \mathbb{C})\bigr)$ in $\mathrm{End}(\mathbb{C}^n)$. To do so, we need the following data.

Definition 4.3. An admissible pair $(Y, \sigma)$ consists of a subset $Y \subset I = \{1, \dots, n\}$ and a decreasing injective map $\sigma : Y \to I$.

Clearly such a map is uniquely determined by its image $\sigma(Y)$. Introduce the subsets $Y_+ = \{i \in Y \mid i > \sigma(i)\}$ and $Y_- = \{i \in Y \mid i < \sigma(i)\}$. Let $b_- = \max\{Y_- \cup \sigma(Y_+)\}$ and $b_+ = \min\{Y_+ \cup \sigma(Y_-)\}$. Because $\sigma$ is a decreasing map, one has $b_- < b_+$.

Theorem 4.4 ([M]). For the standard $gl(n, \mathbb{C})$ R-matrix (41), the numerical solutions to the RE (37) fall into the following two classes.

1. Let $(Y, \sigma)$ be the admissible pair such that $Y = [1, m] \cup [l + 1, l + m]$, where $m$ and $l$ are non-negative integers such that $l + m \leq n$ and $m \leq l$; then $\sigma(i) = l + m + 1 - i$ for $i \in Y$. Solutions of type A are gauge equivalent to

$$ A(l, m; \lambda, \mu) = \mu \sum_{i=1}^{m} e^i_i + \lambda \sum_{i=1}^{l} e^i_i + \sqrt{\lambda\mu} \sum_{i=1}^{m} \bigl(e^i_{\sigma(i)} - e^{\sigma(i)}_i\bigr), $$

where $\mu$, $\lambda$ are arbitrary complex numbers. The matrix $A(l, m; \lambda, \mu)$ has eigenvalues $\mu$, $\lambda$, $0$ with multiplicities $m$, $l$, and $n - m - l$, correspondingly. It is semisimple if and only if $\lambda \neq \mu$.

2. Let $(Y, \sigma)$ be an admissible pair such that $\mathrm{card}(Y) \leq \frac{n}{2}$ and $\sigma(Y) \cap Y = \emptyset$. Let $l$ be an integer from the semiclosed interval $[b_-, b_+)$ and $\lambda \in \mathbb{C}$. Solutions to the numerical RE of type B are gauge equivalent to

$$ B(Y, \sigma, l; \lambda) = \lambda \sum_{i=1}^{l} e^i_i + \sum_{i \in Y} e^i_{\sigma(i)}. $$

The matrix $B(Y, \sigma, l; \lambda)$ has eigenvalues $\lambda$ and $0$ of multiplicities $l$ and $n - l$. It is semisimple if and only if $\lambda \neq 0$; otherwise it is nilpotent of nilpotence degree two.

The layout of solutions to the numerical RE is as follows. The generic RE matrix of type A is obtained by embedding $A(l, m; \lambda, \mu)|_{l+m=n}$ as the left top block extended with zeros to the entire matrix. The matrix $A(l, m; \lambda, \mu)|_{l+m=n}$ itself has the form

$$ A(l, m; \lambda, \mu)|_{l+m=n} = \begin{pmatrix} (\mu + \lambda)\, I_m & 0 & \sqrt{\lambda\mu}\, J_m \\ 0 & \lambda\, I_{l-m} & 0 \\ -\sqrt{\lambda\mu}\, J_m & 0 & 0 \end{pmatrix}, $$

where $I_p$ is the $p \times p$ unit matrix and $J_m$ is the $m \times m$ matrix with units on the anti-diagonal. The $(l - m) \times (l - m)$ square in the middle is located in the center of the matrix and disappears when $l = m$. Solutions of type B are decomposed into the direct sum of matrices $\lambda e^i_i$, $i \notin Y \cup \sigma(Y)$, $i \leq l$; $\lambda e^i_i + e^i_{\sigma(i)}$, $i \in Y_-$; and $\lambda e^{\sigma(i)}_{\sigma(i)} + e^i_{\sigma(i)}$, $i \in Y_+$.
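The spectral statements about the type A solutions can be illustrated on small cases. The sketch below builds A(l, m; λ, µ) as a list of rows and checks the polynomial identities that encode the eigenvalue data. It assumes that the upper index of a matrix unit labels the row and the lower index the column (the transposed convention gives the transposed matrix, which satisfies the same identities), and it takes the square root s of λµ as an explicit argument so that the arithmetic stays exact. The helper names are hypothetical.

```python
def build_A(n, l, m, lam, mu, s):
    """Matrix A(l, m; lam, mu) of size n x n; s must satisfy s * s == lam * mu."""
    assert s * s == lam * mu and m <= l and l + m <= n
    A = [[0] * n for _ in range(n)]
    for i in range(1, l + 1):        # lam * sum_{i <= l} e^i_i
        A[i - 1][i - 1] += lam
    for i in range(1, m + 1):        # mu * sum_{i <= m} e^i_i plus anti-diagonal part
        A[i - 1][i - 1] += mu
        j = l + m + 1 - i            # sigma(i)
        A[i - 1][j - 1] += s         # +sqrt(lam * mu) e^i_{sigma(i)}
        A[j - 1][i - 1] -= s         # -sqrt(lam * mu) e^{sigma(i)}_i
    return A


def mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def shifted(a, c):
    """Return a - c * identity."""
    return [[a[i][j] - (c if i == j else 0) for j in range(len(a))]
            for i in range(len(a))]


def is_zero(a):
    return all(x == 0 for row in a for x in row)


# l + m = n, lam != mu: the minimal polynomial is (x - lam)(x - mu), so A is semisimple.
A3 = build_A(3, 2, 1, 4, 1, 2)
print(is_zero(mul(shifted(A3, 4), shifted(A3, 1))))

# l + m < n: the extra zero eigenvalue requires the additional factor x.
A4 = build_A(4, 2, 1, 4, 1, 2)
print(is_zero(mul(A4, mul(shifted(A4, 4), shifted(A4, 1)))))

# lam == mu: A - lam is nonzero but squares to zero, so A is not semisimple.
N = shifted(build_A(3, 2, 1, 2, 2, 2), 2)
print(not is_zero(N) and is_zero(mul(N, N)))
```

The last case is the degenerate limit λ → µ of the theorem: the Jordan block appears exactly when the two nonzero eigenvalues collide.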
4.3. Quantization of symmetric and bisymmetric orbits. In this subsection, we apply the method of characters to quantization of the RE bracket (29) restricted to adjoint orbits of $GL(n, \mathbb{C})$ in $\mathrm{End}(\mathbb{C}^n)$.

Proposition 4.5. The sub-variety $O_{[n_1,\dots,n_k;\lambda_1,\dots,\lambda_k]} \subset M$ is determined by the system of equations

$$ (A - \lambda_1) \cdots (A - \lambda_k) = 0, \tag{48} $$

$$ \mathrm{Tr}(A^m) = \sum_{i=1}^{k} n_i \lambda_i^m, \qquad m = 0, \dots, k - 1. \tag{49} $$

Proof. Reduction to the canonical Jordan form.
Relations (48) and (49) may be generalized to the quantum case, and one can try to use them for building quantization of semisimple orbits. However, the problem is how to ensure that the quotient by those relations is a flat module over $\mathbb{C}[[h]]$. The method of quantum characters enables us to do that for certain types of orbits.

Relation (48) makes sense in the algebra $\mathcal{L}_R$, because the entries of any polynomial in the generating matrix $L$ form a $U_h\bigl(gl(n, \mathbb{C})\bigr)$-module of the same type as $L$ itself. There is a q-analog of the matrix trace, too. Let $D \in M[[h]]$ be a matrix such that the linear functional $A \to \mathrm{Tr}(DA)$ on $M[[h]]$ is invariant with respect to the action $A \to \rho\bigl(\gamma(x_{(1)})\bigr) A \rho(x_{(2)})$ of the quantum group $U_h\bigl(gl(n, \mathbb{C})\bigr)$. It is unique up to a scalar factor and we take it in the form

$$ D = \sum_{i=1}^{n} q^{-2i+2} e^i_i. \tag{50} $$
Definition 4.6. The quantum trace of a matrix $A$ with entries in an associative algebra $\mathcal{A}$ is the element

$$ \mathrm{Tr}_q(A) = \sum_{i=1}^{n} q^{-2i+2} A^i_i = \mathrm{Tr}(DA) \in \mathcal{A}. \tag{51} $$
When $\mathcal{A} = \mathcal{L}_R$ and $A = L$, the matrix of generators, this is an invariant element belonging to the center of $\mathcal{L}_R$. Moreover, the quantum traces $\mathrm{Tr}_q(L^k)$ of the powers of $L$ are invariant and central as well, [AFS, KS].

Lemma 4.7. Let $L$ be an RE matrix with coefficients in an associative algebra $\mathcal{A}$. Suppose the matrix coefficients $T^n_m \in U_h^*(\mathfrak{g})$ commute with all $L^i_j$. Then, the quantum trace is invariant under the similarity transformation with the matrix $T$,

$$ \mathrm{Tr}_q\bigl(\gamma(T) L T\bigr) = \mathrm{Tr}_q(L). \tag{52} $$

For any polynomial $\mathcal{P}$ in one variable,

$$ \mathcal{P}\bigl(\gamma(T) L T\bigr) = \gamma(T)\, \mathcal{P}(L)\, T. \tag{53} $$

Proof. Formula (53) is an immediate corollary of the equality $\gamma(T) = T^{-1}$. Verification of (52) is less simple and uses relations (15) and (20) in the algebras $U_h^*(\mathfrak{g})$ and $\mathcal{L}_R$. The proof can be found, e.g., in [KS].
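Definition 4.8 (Quantum integers). By the quantum integer $\hat k$, $k \in \mathbb{Z}$, we mean the quantity $\hat k = \frac{1 - q^{-2k}}{1 - q^{-2}}$.

Obviously, $\hat k$ is equal to the quantum trace, $\hat k = \sum_{i=1}^{k} q^{-2i+2}$, of the unit endomorphism of the space $\mathbb{C}^k$, for $k \in \mathbb{N}$. Recall that we consider the set $\mathbb{N} \times \mathbb{C}$ lexicographically ordered, see (45).

Theorem 4.9. Consider two pairs $(l, \lambda)$ and $(m, \mu)$ from $\mathbb{N} \times \mathbb{C}$ and assume $(l, \lambda) \succ (m, \mu)$, $l + m = n$. The quotient of the algebra $\mathcal{L}_R$ by the relations

$$ (L - \lambda)(L - \mu) = 0, \tag{54} $$

$$ \mathrm{Tr}_q(L) = \lambda \hat l + \mu \hat m, \tag{55} $$

is the $U_h\bigl(gl(n, \mathbb{C})\bigr)$-equivariant quantization of the manifold $M = O_{[l,m]}$ with the Poisson bracket

$$ r_M + \zeta \sum_{\alpha \in \bar\alpha_{12}} (e_\alpha \otimes e_{-\alpha} - e_{-\alpha} \otimes e_\alpha), $$

where $\zeta = \frac{\lambda + \mu}{\lambda - \mu}$.

Proof. The Poisson structure on $O_{[l,m]}$ is induced by embedding $O_{[l,m]}$ in $M$ as the orbit $O_{[l,m;\lambda,\mu]}$. The RE matrix $A_h = A(l, m; \lambda, \mu)|_{l+m=n}$ does not depend on the deformation parameter and belongs to $O_{[l,m;\lambda,\mu]}$. Applying Theorem 3.7, we quantize this orbit by the quantum character corresponding to $A_h$. As a subalgebra in $U_h^*\bigl(gl(n, \mathbb{C})\bigr)$, the algebra $\mathcal{A}_h(O_{[l,m]})$ is generated by entries of the RE matrix $\gamma(T) A T$, which fulfills conditions (54) and (55), by Lemma 4.7.

Note that if one of the eigenvalues $\mu$, $\lambda$ turns to zero, there are several numerical RE matrices giving quantization of the same orbit. We can take, e.g., the matrix $B(Y, \sigma, l; \lambda)$ for $A_h$, with an arbitrary set $Y$ such that $\max\{Y_- \cup \sigma(Y_+)\} < l$. It has eigenvalues $\lambda$ and $0$ of multiplicities $l$ and $n - l$. For example, one can pick $Y = \emptyset$ and consider the RE matrix $\lambda \sum_{i=1}^{l} e^i_i$. This solution to the numerical RE and the corresponding quantization were built in [DM1].

Theorem 4.10. Consider two pairs $(l, \lambda)$ and $(m, \mu)$ from $\mathbb{N} \times \mathbb{C}$ and assume $(l, \lambda) \succ (m, \mu)$, $n - (l + m) = k > 0$. The quotient of the algebra $\mathcal{L}_R$ by the relations

$$ L(L - \lambda)(L - \mu) = 0, \tag{56} $$

$$ \mathrm{Tr}_q(L) = \lambda \hat l + \mu \hat m, \tag{57} $$

$$ \mathrm{Tr}_q(L^2) = (\lambda + \mu)(\lambda \hat l + \mu \hat m) - \lambda\mu\, \widehat{l + m} \tag{58} $$

is the $U_h\bigl(gl(n, \mathbb{C})\bigr)$-equivariant quantization of the following manifolds:

1. $M = O_{[l,m,k]}$ with the Poisson bracket

$$ r_M + \zeta \sum_{\alpha \in \bar\alpha_{12}} (e_\alpha \otimes e_{-\alpha} - e_{-\alpha} \otimes e_\alpha) + \sum_{\alpha \in \bar\alpha_{23} \cup \bar\alpha_{13}} (e_\alpha \otimes e_{-\alpha} - e_{-\alpha} \otimes e_\alpha), $$

if $(l, \lambda) \succ (m, \mu) \succ (k, 0)$,

2. $M = O_{[l,k,m]}$ with the Poisson bracket

$$ r_M + \zeta \sum_{\alpha \in \bar\alpha_{13}} (e_\alpha \otimes e_{-\alpha} - e_{-\alpha} \otimes e_\alpha) + \sum_{\alpha \in \bar\alpha_{12} \cup \bar\alpha_{32}} (e_\alpha \otimes e_{-\alpha} - e_{-\alpha} \otimes e_\alpha), $$

if $(l, \lambda) \succ (k, 0) \succ (m, \mu)$,

3. $M = O_{[k,l,m]}$ with the Poisson bracket

$$ r_M + \zeta \sum_{\alpha \in \bar\alpha_{23}} (e_\alpha \otimes e_{-\alpha} - e_{-\alpha} \otimes e_\alpha) + \sum_{\alpha \in \bar\alpha_{21} \cup \bar\alpha_{31}} (e_\alpha \otimes e_{-\alpha} - e_{-\alpha} \otimes e_\alpha), $$

if $(k, 0) \succ (l, \lambda) \succ (m, \mu)$,

where $\zeta = \frac{\lambda + \mu}{\lambda - \mu}$.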
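That the numerical matrix A(l, m; λ, µ) of Theorem 4.4 obeys the quantum trace conditions (57) and (58) can be confirmed by a short exact computation. The sketch below is illustrative only: it repeats a hypothetical builder for A under the assumption that the upper index of a matrix unit labels the row, passes the square root s of λµ explicitly, and uses the weights q^{-2i+2} of (50) for the quantum trace.

```python
from fractions import Fraction


def qint(k, q):
    """Quantum integer k^ = sum_{i=1}^{k} q^(-2i+2)."""
    return sum(q ** (-2 * i + 2) for i in range(1, k + 1))


def qtrace(a, q):
    """Quantum trace Tr_q(a) = sum_i q^(-2i+2) a^i_i with i = 1, ..., n."""
    return sum(q ** (-2 * i) * a[i][i] for i in range(len(a)))  # 0-based i


def build_A(n, l, m, lam, mu, s):
    """A(l, m; lam, mu) with s * s == lam * mu (assumed row/column convention)."""
    A = [[Fraction(0)] * n for _ in range(n)]
    for i in range(l):
        A[i][i] += lam
    for i in range(m):
        A[i][i] += mu
        j = l + m - 1 - i            # 0-based sigma(i)
        A[i][j] += s
        A[j][i] -= s
    return A


def mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


q = Fraction(3, 2)
lam, mu, s = Fraction(4), Fraction(1), Fraction(2)
l, m, k = 2, 1, 1
A = build_A(l + m + k, l, m, lam, mu, s)

tr1 = lam * qint(l, q) + mu * qint(m, q)               # right-hand side of (57)
tr2 = (lam + mu) * tr1 - lam * mu * qint(l + m, q)     # right-hand side of (58)
print(qtrace(A, q) == tr1, qtrace(mul(A, A), q) == tr2)
print(qint(5, q) == (1 - q ** -10) / (1 - q ** -2))    # closed form of Definition 4.8
```

The second identity reflects the fact that A squared equals (λ + µ)A minus λµ times the projector onto the first l + m coordinates, whose quantum trace is the quantum integer of l + m.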
Proof. As Poisson manifolds, all three possibilities are realized by the bisymmetric orbit passing through the point $A_h = A(l, m; \lambda, \mu)$ (cf. Theorem 4.4), which satisfies the RE and defines the character of the RE algebra. By Theorem 4.9, the quantization of the orbit with this character is the quotient of the RE algebra and the subalgebra in $U_h^*\bigl(gl(n, \mathbb{C})\bigr)$ generated by entries of the RE matrix $\gamma(T) A_h T$. Since the matrix $A_h$ satisfies conditions (56–58), so does the matrix $\gamma(T) A_h T$, by Lemma 4.7.

Remark 4.11. There is a one-parameter family of RE Poisson structures on symmetric orbits, as follows from Proposition 4.2, and their quantization is described by Theorem 4.9. On bisymmetric orbits, the RE Poisson brackets form a two-parameter family. Theorem 4.10 provides quantization for only special one-parameter subfamilies. This is a limitation of the method of characters, which gives those and only those quantizations that can be represented as subalgebras in $U_h^*(\mathfrak{g})$ and quotients of $\mathcal{L}_R$.

Remark 4.12. Theorem 4.4 provides two classes of non-semisimple numerical RE matrices obtained by the limits $A(l, m; \lambda, \mu)|_{\lambda \to \mu}$ and $B(Y, \sigma, l; \lambda)|_{\lambda \to 0}$. They belong to orbits that are limits of the semisimple ones. The orbit passing through $B(Y, \sigma, l; \lambda)|_{\lambda = 0}$ is nilpotent. Using Theorem 3.7, one can quantize all nilpotent orbits of nilpotence degree two.

4.4. Quantizing fiber bundle $O_{[l,m,k]} \to O_{[l+m,k]}$. In this subsection we consider the following problem. The group embedding $G_{[l,m,k]} \to G_{[l+m,k]}$ defines a map of the right coset spaces,

$$ O_{[l,m,k]} \to O_{[l+m,k]}, \tag{59} $$
which is a bundle with the fiber $O_{[l,m]} = G_{[l,m,k]} \backslash G_{[l+m,k]} \simeq G_{[l,m]} \backslash G_{l+m}$. Map (59) is equivariant with respect to the action of the group $G_{l+m+k}$ and the bundle is homogeneous. Suppose the total space and the base of the bundle are Poisson manifolds and map (59) is Poisson. The problem is to quantize the diagram

$$ \mathcal{A}(O_{[l,m,k]}) \leftarrow \mathcal{A}(O_{[l+m,k]}), \tag{60} $$

i.e., to build $U_h\bigl(gl(l + m + k)\bigr)$-equivariant quantizations of the function algebras and an equivariant monomorphism

$$ \mathcal{A}_h(O_{[l,m,k]}) \leftarrow \mathcal{A}_h(O_{[l+m,k]}). \tag{61} $$

The spaces $O_{[l,m,k]}$ and $O_{[l+m,k]}$ can be realized as the orbits $O_{[l,m,k;\lambda,\mu,0]}$ and $O_{[l+m,k;-\lambda\mu,0]}$ (here we do not assume ordering (45)) with the induced RE Poisson
structures. We will show that their quantizations built in the previous subsection admit an equivariant morphism, a quantization of (60). This will imply that the projection $O_{[l,m,k;\lambda,\mu,0]} \to O_{[l+m,k;-\lambda\mu,0]}$ is a Poisson map, because of the flatness of the quantizations.

Remark 4.13. The fiber over the origin $A(l, m; \lambda, \mu) \in O_{[l,m,k;\lambda,\mu,0]}$ is realized as a Poisson manifold $O_{[l,m;\lambda,\mu]}$. It can be shown that its embedding into $O_{[l,m,k;\lambda,\mu,0]}$ is a Poisson map as well and that this map can be lifted to a homomorphism of quantized algebras that is equivariant with respect to the quantum group embedding $U_h\bigl(gl(l + m)\bigr) \to U_h\bigl(gl(l + m + k)\bigr)$. The proof of this statement is rather straightforward, and we do not concentrate on this subject here.

Before formulating the main result of this subsection, let us prove an algebraic statement. Let $\mathcal{P}(x)$ be a polynomial in one variable with coefficients in a commutative ring $K$. Suppose $\mathcal{P}(0) = 0$. Fix two scalars $\alpha, \beta \in K$ and consider an associative unital algebra $A(e, s)$ over $K$ generated by the elements $\{e, s\}$ subject to the relations

$$ eses = sese \quad (\text{“reflection equation”}), \qquad s^2 - \beta s = 1 \quad (\text{“Hecke condition”}). \tag{62} $$

Lemma 4.14. The correspondence $s \to s$, $e \to \mathcal{P}(e)$ gives a homomorphism of the algebra $A(e, s)$ to the quotient algebra $A(e, s)/(e\mathcal{P}(e) - \alpha e)$. This homomorphism is factored through the ideal $(e^2 - \alpha e)$.

Proof. First let us note that the last statement is an immediate corollary of the condition $\mathcal{P}(0) = 0$. We should verify that, modulo the ideal $(e\mathcal{P}(e) - \alpha e)$, the following relation holds true in the algebra $A(e, s)$:

$$ \mathcal{P}(e)\, s\, \mathcal{P}(e)\, s = s\, \mathcal{P}(e)\, s\, \mathcal{P}(e). \tag{63} $$

We will prove a stronger assertion; namely, for any $m = 1, 2, \dots$, the identity

$$ \mathcal{P}(e)\, s e^m s = s e^m s\, \mathcal{P}(e) \tag{64} $$

is valid modulo the ideal $(e\mathcal{P}(e) - \alpha e)$. This will imply (63), because $\mathcal{P}(0) = 0$ by the hypothesis of the lemma. We assume $\beta \neq 0$, since otherwise $s e^m s$ can be replaced by $(ses)^m$, and the proof becomes immediate. For $m = 1$ this is a consequence of the “reflection equation” relation (62). Suppose (64) is proven for some integer $m = l \geq 1$. Using the “Hecke condition” (62), we rewrite (64) for $m = l + 1$ as

$$ \mathcal{P}(e)\, s e^l (s^2 - \beta s) e s = s e^l (s^2 - \beta s) e s\, \mathcal{P}(e). $$

The terms without $\beta$ compensate each other, by the induction assumption. The problem reduces to checking the equality $\mathcal{P}(e)\, s e^l s e s = s e^l s e s\, \mathcal{P}(e)$. Employing the induction assumption, rewrite the left-hand side as

$$ \mathcal{P}(e)\, s e^l s e s = s e^l s\, \mathcal{P}(e)\, e s = \alpha\, s e^l s e s. $$

The right-hand side can be transformed as

$$ s e^l s e s\, \mathcal{P}(e) = s e^{l-1} e s e s\, \mathcal{P}(e) = s e^{l-1} s e s e\, \mathcal{P}(e) = \alpha\, s e^{l-1} s e s e = \alpha\, s e^{l-1} e s e s = \alpha\, s e^l s e s. $$

Here, we used the “reflection equation” (62).
Theorem 4.15. Let $E$ and $B$ be the RE matrices whose entries generate the algebras $\mathcal{A}_h\bigl(O_{[l,m,k;\lambda,\mu,0]}\bigr)$ and $\mathcal{A}_h\bigl(O_{[l+m,k;-\lambda\mu,0]}\bigr)$, respectively. The matrix correspondence $\pi(B) = E^2 - (\lambda + \mu)E$ is extended to the $U_h\bigl(gl(l + m + k)\bigr)$-equivariant algebra morphism

$$ \mathcal{A}_h\bigl(O_{[l,m,k;\lambda,\mu,0]}\bigr) \leftarrow \mathcal{A}_h\bigl(O_{[l+m,k;-\lambda\mu,0]}\bigr). \tag{65} $$

Proof. The matrix $\pi(B)$ satisfies the equations $\pi(B)\bigl(\pi(B) + \lambda\mu\bigr) = 0$ and

$$ \mathrm{Tr}_q\, \pi(B) = (\lambda + \mu)(\lambda \hat l + \mu \hat m) - \lambda\mu\, \widehat{l + m} - (\lambda + \mu)(\lambda \hat l + \mu \hat m) = -\lambda\mu\, \widehat{l + m}. $$

Taking into account Theorem 4.9, it remains to show that $\pi(B)$ is an RE matrix. The matrix $E$ fulfills the polynomial relation (56), and the braid matrix $S$ satisfies the Hecke condition (42). It remains to apply Lemma 4.14, setting $e = E_2$, $s = S$, $\mathcal{P} = \pi$, $\alpha = -\lambda\mu$, $\beta = q - q^{-1}$.
4.5. Quantizing the Kirillov-Kostant-Souriau bracket on symmetric orbits. It is known that there is a two-parameter quantization $\mathcal{L}_{h,t}$ on $\mathrm{End}(\mathbb{C}^n)$ that is equivariant with respect to the adjoint action of $U_h\bigl(gl(n, \mathbb{C})\bigr)$, [D1]. The corresponding Poisson structure is obtained from (29) by adding the Poisson-Lie bracket with an arbitrary overall factor $t$. The algebra $\mathcal{L}_{h,t}$ can be obtained from $\mathcal{L}_R$ by the substitution $L = E + \frac{t}{1 - q^{-2}}$, $q = e^h$, of the matrix of generators. Relations (20) go over into

$$ S E_2 S E_2 - E_2 S E_2 S = qt\, [E_2, S]. \tag{66} $$

This is true for any matrix $S$ satisfying the Hecke condition (42). In the limit $h \to 0$, the matrix $S$ tends to the permutation operator $P$ on $\mathbb{C}^n \otimes \mathbb{C}^n$ and relation (66) turns into those of the classical universal enveloping algebra $U\bigl(gl(n, \mathbb{C})\bigr)[t]$. Indeed, the substitution $S \to P$ in (66) gives explicitly

$$ E^j_i E^n_m - E^n_m E^j_i = t\, (\delta^j_m E^n_i - \delta^n_i E^j_m). \tag{67} $$

The algebra $U\bigl(gl(n, \mathbb{C})\bigr)[t]$ is a quantization of the Poisson-Lie bracket on $\mathrm{End}^*(\mathbb{C}^n)$ identified with $gl(n, \mathbb{C})$ by the invariant trace pairing. Its restriction to orbits is the Kirillov-Kostant-Souriau bracket.

Theorem 4.16. Let $\mu_1$, $\mu_2$ be two distinct complex numbers and $n_1$, $n_2$ two positive integers such that $n_1 + n_2 = n$. Let $\{E^j_i\}_{i,j=1}^{n}$ be the set of generators of the algebra $\mathcal{L}_{h,t}$ subject to relation (66). The quotient of $\mathcal{L}_{h,t}$ by the relations

$$ (E - \mu_1)(E - \mu_2) = 0, \tag{68} $$

$$ \mathrm{Tr}_q(E) = \hat n_1 \mu_1 + \hat n_2 \mu_2 + t\, \hat n_1 \hat n_2 \tag{69} $$

is a two-parameter quantization on the symmetric orbit $O_{[n_1,n_2;\mu_1,\mu_2]}$. It is equivariant with respect to the adjoint action of $U_h\bigl(gl(n, \mathbb{C})\bigr)$.
Proof. Consider the quantization of $O_{[l,m;\lambda,\mu]}$ as a quotient of the RE algebra $\mathcal{L}_R$, along the line of Theorem 4.9. The substitution

$$ L = E + \frac{t}{1 - q^{-2}}, \qquad \lambda = \mu_1 + \frac{t}{1 - q^{-2}}, \qquad \mu = \mu_2 + \frac{t}{1 - q^{-2}}, \qquad l = n_1, \qquad m = n_2 $$

transforms relations (20) into (66) and relations (54–55) into (68–69).
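The passage from (55) to (69) rests on one identity for quantum integers: after the shift of L, λ and µ by t/(1 − q^{-2}), the constant left over in the quantum trace is t/(1 − q^{-2}) times the difference between the sum of the quantum integers of n1 and n2 and the quantum integer of n1 + n2, and this equals t times the product of the two quantum integers. A minimal sketch checking this with exact rationals follows; the function names and sample values are assumptions for the example.

```python
from fractions import Fraction


def qint(k, q):
    """Quantum integer k^ = sum_{i=1}^{k} q^(-2i+2) = (1 - q^(-2k)) / (1 - q^(-2))."""
    return sum(q ** (-2 * i + 2) for i in range(1, k + 1))


def shifted_trace_constant(n1, n2, q, t):
    """Constant term produced in Tr_q(E) by the shift L = E + t / (1 - q^-2).

    Tr_q(L) = Tr_q(E) + c * n^ with n = n1 + n2 and c = t / (1 - q^-2), while
    lam * n1^ + mu * n2^ contributes c * (n1^ + n2^); the difference is returned.
    """
    c = t / (1 - q ** -2)
    return c * (qint(n1, q) + qint(n2, q) - qint(n1 + n2, q))


q, t = Fraction(3, 2), Fraction(5, 7)
print(shifted_trace_constant(2, 3, q, t) == t * qint(2, q) * qint(3, q))
```

In the limit q → 1 the quantum integers become ordinary integers and the constant reduces to the product t·n1·n2.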
Corollary 4.17. Let $\mu_1$, $\mu_2$ be two distinct complex numbers and $n_1$, $n_2$ two positive integers such that $n_1 + n_2 = n$. Let $\{E^j_i\}_{i,j=1}^{n}$ be the set of generators of the algebra $U\bigl(gl(n, \mathbb{C})\bigr)[t]$ subject to relations (67). The quotient of $U\bigl(gl(n, \mathbb{C})\bigr)[t]$ with respect to the ideal generated by

$$ (E - \mu_1)(E - \mu_2) = 0, \tag{70} $$

$$ \mathrm{Tr}(E) = n_1 \mu_1 + n_2 \mu_2 + t\, n_1 n_2, \tag{71} $$
is the quantization of the KKS bracket on the symmetric orbit $O_{[n_1,n_2;\mu_1,\mu_2]}$. It is equivariant with respect to the adjoint action of $U\bigl(gl(n, \mathbb{C})\bigr)$.

Proof. Take the limit $h \to 0$ in the two-parameter quantization of Theorem 4.16.

Remark 4.18. Theorem 4.16 gives a two-parameter generalization of the quantum sphere, [GS], to symmetric orbits. The explicit description of the quantized KKS bracket in Corollary 4.17 is an especially interesting result that could not be obtained otherwise than by extending the deformation to the q-domain. This is a remarkable application of the quantum group theory.

Example 4.19 (Complex sphere). Now we illustrate Corollary 4.17 on the $U\bigl(gl(2, \mathbb{C})\bigr)$-equivariant quantization of the complex sphere $O_{[1,1;\mu_1,\mu_2]} \subset \mathrm{End}(\mathbb{C}^2)$. Our goal is to demonstrate on this simple example that the system of conditions (67), (70), and (71) is self-consistent. It is known that $O_{[1,1;\mu_1,\mu_2]}$ is specified, as a maximal orbit, by values of two invariant functions, the traces of a matrix and its square. We shall show that matrix equation (70) boils down to a condition on the second Casimir of $U\bigl(gl(2, \mathbb{C})\bigr)$ only when the first Casimir is fixed as in (71). Consider the $2 \times 2$-matrix $E = \|E^i_j\|$ of generators of $U\bigl(gl(2, \mathbb{C})\bigr)[t]$ obeying the commutation relations

$$ \begin{aligned} {}[E^1_1, E^2_1] &= tE^2_1, & [E^1_1, E^1_2] &= -tE^1_2, & [E^2_2, E^1_2] &= tE^1_2, \\ [E^2_2, E^2_1] &= -tE^2_1, & [E^1_1, E^2_2] &= 0, & [E^2_1, E^1_2] &= t(E^1_1 - E^2_2), \end{aligned} \tag{72} $$

which are obtained by specialization of system (67) to the $gl(2, \mathbb{C})$-case. Put $n_1 = n_2 = 1$, $\sigma_1 = \mu_1 + \mu_2$, $\sigma_2 = \mu_1\mu_2$ in (70–71) and write (70) out explicitly:

$$ (E^1_1)^2 + E^1_2 E^2_1 - \sigma_1 E^1_1 + \sigma_2 = 0, \tag{73} $$

$$ (E^2_2)^2 + E^2_1 E^1_2 - \sigma_1 E^2_2 + \sigma_2 = 0, \tag{74} $$

$$ E^2_1 E^1_1 + E^2_2 E^2_1 - \sigma_1 E^2_1 = 0, \tag{75} $$

$$ E^1_1 E^1_2 + E^1_2 E^2_2 - \sigma_1 E^1_2 = 0. \tag{76} $$
Equations (73–76) are equivalent to the system

$$ (E^1_1)^2 + (E^2_2)^2 + E^2_1 E^1_2 + E^1_2 E^2_1 - \sigma_1 (E^1_1 + E^2_2) + 2\sigma_2 = 0, \tag{77} $$

$$ (E^1_1)^2 - (E^2_2)^2 - t(E^1_1 - E^2_2) - \sigma_1 (E^1_1 - E^2_2) = 0, \tag{78} $$

$$ E^1_1 E^2_1 + E^2_2 E^2_1 - (\sigma_1 + t) E^2_1 = 0, \tag{79} $$

$$ E^1_1 E^1_2 + E^2_2 E^1_2 - (\sigma_1 + t) E^1_2 = 0. \tag{80} $$

To obtain it, we pulled the diagonal generators to the left in (75–76) and took the sum and difference of (73–74) using the commutation relations (72). Equations (78–80) are satisfied modulo the condition

$$ E^1_1 + E^2_2 - (\sigma_1 + t) = 0, \tag{81} $$

which is the specialization of (71) for $O_{[1,1;\mu_1,\mu_2]}$. Equations (77) and (81) can be rewritten in terms of the elements $\mathrm{Tr}(E) = \sum_i E^i_i$, $\mathrm{Tr}(E^2) = \sum_{i,j} E^j_i E^i_j$, which generate the center of $U\bigl(gl(2, \mathbb{C})\bigr)$. The resulting relations for the quantized orbit $O_{[1,1;\mu_1,\mu_2]}$ are

$$ \mathrm{Tr}(E^2) = \sigma_1(\sigma_1 + t) - 2\sigma_2, \tag{82} $$

$$ \mathrm{Tr}(E) = \sigma_1 + t. \tag{83} $$

Together with (72), this is a $U\bigl(gl(2, \mathbb{C})\bigr)$-equivariant quantization of the KKS bracket on the two-dimensional complex sphere.

5. Conclusion

The method of quantum characters formulated in this paper is designed for building $U_h(\mathfrak{g})$-equivariant quantizations on a $G$-manifold $M$ that are representable as subalgebras in $U_h^*(\mathfrak{g})$ and quotients of $\mathcal{A}_h(M)$. Despite its simplicity, it allows one to obtain new and interesting results, for example, the two-parameter quantization on semisimple orbits. Analyzing the quantizations built within the present approach, one may come to the following conclusion. The two-parameter quantization on a semisimple orbit of $GL(n, \mathbb{C})$ may be sought in the form of a matrix polynomial equation on the generators $E^j_i \in \mathcal{L}_{h,t}$ with additional conditions on the quantum traces $\mathrm{Tr}_q(E^k)$, $k \in \mathbb{N}$. This conjecture turns out to be true. The proof is based on a different technique than that used in the present paper. It is the subject of our forthcoming publication [DM3], as are the explicit equations defining quantized semisimple orbits of $GL(n, \mathbb{C})$, including the special case of the KKS bracket.

Acknowledgements. We thank Steven Shnider for numerous valuable discussions. We are grateful to the referee for his remarks, which helped us to improve the manuscript.
References

[AFS] Alekseev, A., Faddeev, L., Semenov-Tian-Shansky, M.: Hidden quantum group inside Kac-Moody algebra. Commun. Math. Phys. 149(2), 335 (1992)
[BD] Belavin, A.A., Drinfel’d, V.G.: Triangle equations and simple Lie algebras. In: Math. Phys. Rev., S.P. Novikov (ed.) New York: Harwood, 1984, p. 93
[Cher] Cherednik, I.: Factorizable particles on a half-line and root systems. Theor. Math. Phys. 64, 35 (1984)
[GS] Gurevich, D., Saponov, P.A.: Quantum sphere via reflection equation algebra. Preprint math.QA/9911141
[D1] Donin, J.: Double quantization on the coadjoint representation of sl(n). Czech J. Phys. 47(11), 1115 (1997)
[DM1] Donin, J., Mudrov, A.: Uq(sl(n))-invariant quantization of symmetric coadjoint orbits via reflection equation algebra. Contemp. Math., AMS, Providence, RI, 315, 61 (2002); math.QA/0108112
[DM2] Donin, J., Mudrov, A.: Reflection equation, twist, and equivariant quantization. Isr. J. Math., in press, math.QA/0204295
[DM3] Donin, J., Mudrov, A.: Explicit equivariant quantization on coadjoint orbits of GL(n, C). Lett. Math. Phys., in press, math.QA/0206049
[DS] Donin, J., Shnider, S.: Deformations of quadratic algebras and the corresponding quantum semigroups. q-alg/9505015
[DGS] Donin, J., Gurevich, D., Shnider, S.: Double quantization on some orbits in the coadjoint representation of simple Lie groups. Commun. Math. Phys. 204(1), 39 (1999)
[Dr1] Drinfeld, V.G.: Quantum groups. In: Proc. Int. Congress of Mathematicians, Berkeley, 1986, A.V. Gleason (ed.) Providence: AMS, 1987, p. 798
[Dr2] Drinfeld, V.G.: Quasi-Hopf algebras. Leningrad Math. J. 1, 321 (1990)
[EK] Etingof, P., Kazhdan, D.: Quantization of Lie bialgebras. Selecta Math. 2(1), 1 (1996) (q-alg/9510020)
[ESS] Etingof, P., Schedler, T., Schiffmann, O.: Explicit quantization of dynamical r-matrices for finite dimensional semisimple Lie algebras. math.QA/9912009, J. Am. Math. Soc. 13(3), 595 (2000)
[FRT] Faddeev, L., Reshetikhin, N., Takhtajan, L.: Quantization of Lie groups and Lie algebras. Leningrad Math. J. 1, 193 (1990)
[Kar] Karolinskii, E.: A classification of Poisson homogeneous spaces of complex reductive Poisson-Lie groups. In: Banach Center Publ. 51, 103, math.QA/9901073
[K] Kulish, P.P.: Quantum groups, q-oscillators, and equivariant algebras. Theor. Math. Phys. 94, 193 (1993)
[KSkl] Kulish, P.P., Sklyanin, E.K.: Algebraic structure related to the reflection equation. J. Phys. A 25, 5963 (1992)
[KS] Kulish, P.P., Sasaki, R.: Covariance properties of reflection equation algebras. Prog. Theor. Phys. 89(3), 741 (1993)
[M] Mudrov, A.I.: Characters of Uq(sl(n))-reflection equation algebra. Lett. Math. Phys. 60(3), 283 (2002)
[Mj] Majid, S.: Foundations of Quantum Group Theory. Cambridge: Cambridge University Press, 1995
[Skl] Sklyanin, E.K.: Boundary conditions for integrable quantum systems. J. Phys. A 21, 2375 (1988)

Communicated by L. Takhtajan
Commun. Math. Phys. 234, 557–565 (2003) Digital Object Identifier (DOI) 10.1007/s00220-003-0795-7
Communications in
Mathematical Physics
Supplement
On the Structure of Stationary Solutions of the Navier-Stokes Equations
Commun. Math. Phys. 226, 455–474 (2002)

Peter Wittwer ∗
Département de Physique Théorique, Université de Genève, Switzerland. E-mail: [email protected]

Received: 7 May 2002 / Accepted: 15 October 2002
Published online: 10 February 2003 – © Springer-Verlag 2003
Abstract: This paper is a supplementary section to [1]. We show that without any additional hypothesis the main result in [1] (Theorem 1) can be considerably strengthened. Note. This paper cannot be read independently of [1]. The numbering of equations, theorems and propositions as well as cross-references used here have to be understood as if this paper were an additional section to [1].
∗ Supported in part by the Fonds National Suisse.

6. Asymptotic Flow

This supplementary section should be read in conjunction with [1]. Namely, as indicated at the beginning of each subsection, the estimates and propositions below replace certain (weaker) estimates and propositions in [1]. Indeed, Theorem 1 of [1] can be replaced without changing the hypothesis by Theorem 18 below, which gives an explicit description of the asymptotic flow as the sum of two divergence-free contributions: a parabolic wake term, and a compensating source term.

6.1. Improved version of Theorem 1. (Reader’s guide: The following theorem and remarks replace Theorem 1 in [1].)

Theorem 18 (Improved version of Theorem 1). Let and be as defined above. Then, for each u∗ = (u∗ , v∗ ) in a certain set of vector fields S to be defined later on, there exist a (locally unique) vector field u = u∞ + (u, v) and a function p satisfying the Navier-Stokes equations (1) and (2) in and the boundary conditions (3) and (4).
Furthermore,

$$ \lim_{x \to \infty} x^{1/2} \sup_{y \in \mathbb{R}} |(u - u_0)(x, y)| = 0, \tag{69} $$

$$ \lim_{x \to \infty} x \sup_{y \in \mathbb{R}} |(v - v_0)(x, y)| = 0, \tag{70} $$

where

$$ u_0(x, y) = \frac{c}{2\sqrt{\pi}} \frac{1}{\sqrt{x}}\, e^{-\frac{y^2}{4x}} + \frac{d}{\pi} \frac{x}{x^2 + y^2}, \tag{71} $$

$$ v_0(x, y) = \frac{c}{4\sqrt{\pi}} \frac{y}{x^{3/2}}\, e^{-\frac{y^2}{4x}} + \frac{d}{\pi} \frac{y}{x^2 + y^2}, \tag{72} $$

with

$$ c = \lim_{k \to 0_+} \int_{\mathbb{R}} e^{iky} \bigl(u_*(y) + i v_*(y)\bigr)\, dy, \tag{73} $$

$$ d = \lim_{k \to 0_+} \int_{\mathbb{R}} e^{iky} \bigl(-i v_*(y)\bigr)\, dy, \tag{74} $$

and

$$ \int_{\mathbb{R}} (u - u_0)(x, y)\, dy = 0, \tag{75} $$

for all $x \geq 1$.

A proof of this theorem is given below. It will follow rather easily from Proposition 23, which contains improved versions of the inequalities (28), (29), (31) and (32) of Proposition 6.

Remark 19. The integrals in (73), (74) and (75) have to be understood in the $(C, \delta)$-sense, with $0 < \delta \leq 1$ (see e.g. [3], Theorem 15). Namely, let $C(\delta, R) = (1 - |y|/R)^{\delta}$. Then the exact versions of (73), (74) and (75) are (using in addition that $u_*$ is even and $v_*$ is odd)

$$ c = \lim_{R \to \infty} \int_{-R}^{R} C(\delta, R)\, u_*(y)\, dy - d, $$

$$ d = \lim_{k \to 0_+} \lim_{R \to \infty} \int_{-R}^{R} C(\delta, R) \sin(ky)\, v_*(y)\, dy, $$

and

$$ \lim_{R \to \infty} \int_{-R}^{R} C(\delta, R)\, (u - u_0)(x, y)\, dy = 0. $$
Remark 20. The terms multiplied by d in u0 and v0 are the velocity field of a source located at x = y = 0. We could replace these terms without changing the theorem by those of a source with the same amplitude but located at x = x0 < 1 and y = 0. The terms produced by the difference of two such sources are bounded for x ≥ 1 and vanish as x goes to infinity sufficiently rapidly for (69) and (70) to remain valid.
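The two structural claims about the asymptotic field, that (71) and (72) define a divergence-free flow and that the wake term carries the flux c across every vertical line, lend themselves to a quick numerical check. The sketch below is only illustrative: the sample values of c and d, the finite-difference step and the quadrature cutoff are assumptions chosen for the example, and the flux of the source term is not integrated numerically because its closed form over the interval from −Y to Y is (2d/π)·arctan(Y/x), which tends to d.

```python
import math


def u0(x, y, c, d):
    """Horizontal component (71): parabolic wake plus point source at the origin."""
    wake = c / (2 * math.sqrt(math.pi)) * math.exp(-y * y / (4 * x)) / math.sqrt(x)
    source = d / math.pi * x / (x * x + y * y)
    return wake + source


def v0(x, y, c, d):
    """Vertical component (72)."""
    wake = c / (4 * math.sqrt(math.pi)) * y / x ** 1.5 * math.exp(-y * y / (4 * x))
    source = d / math.pi * y / (x * x + y * y)
    return wake + source


def divergence(x, y, c, d, h=1e-4):
    """Central-difference approximation of du0/dx + dv0/dy."""
    du_dx = (u0(x + h, y, c, d) - u0(x - h, y, c, d)) / (2 * h)
    dv_dy = (v0(x, y + h, c, d) - v0(x, y - h, c, d)) / (2 * h)
    return du_dx + dv_dy


def wake_flux(x, c, cutoff=20.0, steps=4000):
    """Trapezoidal integral over y of the wake part of u0 (source switched off)."""
    half = cutoff * math.sqrt(x)
    dy = 2 * half / steps
    total = 0.0
    for j in range(steps + 1):
        weight = 0.5 if j in (0, steps) else 1.0
        total += weight * u0(x, -half + j * dy, c, 0.0)
    return total * dy


c, d = -0.7, 0.35
print(abs(divergence(2.0, 0.5, c, d)) < 1e-6)
print(abs(wake_flux(3.0, c) - c) < 1e-9)
```

The Gaussian integrand makes the trapezoidal rule converge very quickly, so a modest number of steps already reproduces c to high accuracy at any fixed x ≥ 1.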
Stationary Solutions of Navier-Stokes Equations
Remark 21. The source term in (71) (the term multiplied by d) is irrelevant in the sense that (69) is satisfied for any value of d. Eq. (75), however, is only satisfied for d as given in (74).

Remark 22. The constant c is the amount of fluid transported within the wake. Indeed, it follows from (71) and (75) that, for any 0 < ε < 1/2 and 0 < δ ≤ 1,
\[ c = \lim_{x\to\infty} \int_{-x^{1/2+\varepsilon}}^{x^{1/2+\varepsilon}} C(\delta, x^{1/2+\varepsilon})\, u(x,y)\, dy , \]
whereas the total fluid flow through a vertical line is
\[ \lim_{R\to\infty} \int_{-R}^{R} C(\delta,R)\, u(x,y)\, dy = d + c , \]
independently of x ≥ 1. On physical grounds (see for example [2]) we expect that c < 0, since the fluid supposedly flows more slowly within the wake than at infinity. This means that fluid is transported within the wake towards the body, and this fluid is then “radiated” away from the body by the source-like contribution to the velocity field (the terms multiplied by d). For the case of boundary conditions on the body and at infinity as given after Eq. (8) we therefore expect that d = −c/2.

6.2. Improved versions of inequalities (28), (29), (31) and (32). (Reader’s guide: The inequalities (79)-(82) replace the inequalities (28), (29), (31) and (32) of Proposition 6 in [1]. Proposition 23 should therefore be read in conjunction with Proposition 6. The reader can skip the proofs of the inequalities (28), (29), (31) and (32) in [1] and read the proof of Proposition 23 instead. Proposition 7 in [1] has to be replaced by its strengthened version, Proposition 24, provided in the appendix of this supplement.)

Proposition 23. Let u∗− , ω∗− and q be as defined above. Let 1 t ∗ − (k, t) = ω− (k) − q(k, s) ds e− (t−1) , 1 1i t U− (k, t) = u∗− (k) − q(k, s) ds e−|k|(t−1) , 2k 1
(76) (77)
and let, for α ≥ 0,
\[ \mu_\alpha(k,t) = \left( \frac{1}{1+|k|\,t^{3/4}} \right)^{\alpha} . \tag{78} \]
Then, we have the following bounds, ε |ω− (k, t) − − (k, t)| ≤ µα+1 (k, t) , t i − + (ω− (k, t) − − (k, t)) ≤ ε µα+1 (k, t) , k t 1/2 ε ε |u+ (k, t)| ≤ 3/4 µα+1 (k, t) + 1/2 µα+1 (k, t) , t t ε ε |u− (k, t) − U− (k, t)| ≤ 3/4 µα+1 (k, t) + 1/2 µα+1 (k, t) . t t
(79) (80) (81) (82)
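Proof. We first prove (81). We have that
\[
\int_t^{\infty} e^{|k|(t-s)} \frac{ds}{s^{3/2}}
\le \int_t^{t+t^{3/4}} \frac{ds}{s^{3/2}} + \int_{t+t^{3/4}}^{\infty} e^{|k|(t-s)} \frac{ds}{s^{3/2}}
\le \frac{\mathrm{const.}}{t^{3/4}} + \frac{\mathrm{const.}}{t^{1/2}}\, e^{-t^{3/4}|k|} ,
\]
and therefore
\[
\begin{aligned}
|u_+(k,t)| &\le \varepsilon \mu_{\alpha+1}(k,t) \int_t^{\infty} e^{|k|(t-s)} \frac{ds}{s^{3/2}} \\
&\le \frac{\varepsilon}{t^{3/4}}\, \mu_{\alpha+1}(k,t) + \frac{\varepsilon}{t^{1/2}}\, e^{-t^{3/4}|k|} \\
&\le \frac{\varepsilon}{t^{3/4}}\, \mu_{\alpha+1}(k,t) + \frac{\varepsilon}{t^{1/2}}\, \mu_{\alpha+1}(k,t)
\end{aligned}
\]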
as claimed. To prove the bounds on ω− and u− , we first show that − t − (t−s) ω− (k, t) = − (k, t) − e q1 (k, s, t) ds , 1 t i |k| u− (k, t) = U− (k, t) + e−|k|(t−s) q1 (k, s, t) ds , 2 k 1
(83) (84)
where
\[ q_1(k,s,t) = -\int_s^t q(k,\sigma)\, d\sigma . \]
Namely, integration by parts in (16) leads to s=t 1 − (t−s) ∗ ω− (k, t) = ω− (k)e− (t−1) − q1 (k, s, t) e s=1 − t − (t−s) − e q1 (k, s, t) ds , 1 s=t 1 i −|k|(t−s) u− (k, t) = u∗− (k)e−|k|(t−1) − e q1 (k, s, t) s=1 2k t 1i |k| e−|k|(t−s) q1 (k, s, t) ds , + 2k 1 and (83) and (84) follow, since q1 (k, t, t) = 0. From the bound (20) on q it follows that for 1 ≤ s ≤ t, |q1 (k, s, t)| ≤ ε|k|µα+1 (k, s) √
t −s √ √ . t + s ts
We now prove the bounds (79) and (80). Namely, from (85) and the inequality 1
(t+1)/2
√
t −s t −1 2√ t, √ √ ds ≤ const. t t + s ts
(85)
we find using Proposition 24 that (t+1)/2 − − (t−s) e q1 (k, s, t) ds 1
t −1 2√ t t t −1 2√ t ≤ εµα+3/2 (k, 1)e− (t−1)/2 |− |3/2 t ε t − 1 1/2 µα+3/2 (k, t) , ≤ t t
|− | ε |k| µα+1 (k, 1)e− (t−1)/2 ≤
and that
− t − (t−s) e q (k, s, t) ds 1 (t+1)/2 t ε t −s ≤ |k| µα+1 (k, t) √ √ e− (t−s) |− | ds √ ( t) ts s + (t+1)/2 t ε t −s ≤ √ µα+1 (k, t) √ √ e− (t−s) |− |3/2 ds √ (t+1)/2 ( s + t) ts t ε 1 ≤ 3/2 √ µα+1 (k, t) (t − s)e− (t−s) |− |3/2 ds t (t+1)/2 t ε 1 1 ≤ 3/2 √ µα+1 (k, t) ds √ t t −s (t+1)/2 ε 1 ≤ √ µα+1 (k, t) , t which proves (79). Similarly, + − (t+1)/2 (t−s) − e q (k, s, t) ds 1 k 1 2 t −1 √ t ≤ εµα+1 (k, 1)e− (t−1)/2 |− | t ε t −1 ≤ 1/2 µα+1 (k, t) , t t and + − t − (t−s) e q (k, s, t) ds 1 k (t+1)/2 t t −s ≤ εµα+1 (k, t) √ √ e− (t−s) |− | ds √ ( s + t) ts (t+1)/2 t ε ≤ 3/2 µα+1 (k, t) (t − s)e− (t−s) |− | ds t (t+1)/2 ε ≤ 1/2 µα+1 (k, t) , t
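which proves (80). Finally, we prove (82). For 1 ≤ t ≤ 2 we have that
\[
\begin{aligned}
\left| \int_1^t e^{-|k|(t-s)} q_1(k,s,t)\, ds \right|
&\le \varepsilon \mu_{\alpha+1}(k,1) \int_1^t e^{-|k|(t-s)}\, |k|\, \frac{t-s}{(\sqrt{s}+\sqrt{t})\sqrt{ts}}\, ds \\
&\le \frac{\varepsilon}{t}\, \mu_{\alpha+1}(k,1) \int_1^t e^{-|k|(t-s)}\, |k|\, (t-s)\, ds \\
&\le \varepsilon \mu_{\alpha+1}(k,1) \le \frac{\varepsilon}{t^{1/2}}\, \mu_{\alpha+1}(k,t) ,
\end{aligned}
\]
and for t > 2 we have that
\[
\begin{aligned}
\left| \int_1^{t-(t-1)^{3/4}} e^{-|k|(t-s)} q_1(k,s,t)\, ds \right|
&\le \varepsilon \mu_{\alpha+1}(k,1) \int_1^{t-(t-1)^{3/4}} e^{-|k|(t-s)}\, |k|\, \frac{t-s}{(\sqrt{s}+\sqrt{t})\sqrt{ts}}\, ds \\
&\le \frac{\varepsilon}{t}\, \mu_{\alpha+1}(k,1)\, e^{-|k|\frac{(t-1)^{3/4}}{2}} \int_1^t e^{-|k|\frac{(t-s)}{2}}\, \frac{|k|\,(t-s)}{s^{1/2}}\, ds \\
&\le \frac{\varepsilon}{t}\, \mu_{\alpha+1}(k,1)\, e^{-|k|\frac{(t-1)^{3/4}}{2}} \int_1^t \frac{ds}{s^{1/2}} \\
&\le \frac{\varepsilon}{t^{1/2}}\, \mu_{\alpha+1}(k,1)\, e^{-|k|\frac{(t-1)^{3/4}}{2}} \le \frac{\varepsilon}{t^{1/2}}\, \mu_{\alpha+1}(k,1) ,
\end{aligned}
\]
and furthermore that
\[
\begin{aligned}
\left| \int_{t-(t-1)^{3/4}}^{t} e^{-|k|(t-s)} q_1(k,s,t)\, ds \right|
&\le \varepsilon \mu_{\alpha+1}(k,t) \int_{t-(t-1)^{3/4}}^{t} e^{-|k|(t-s)}\, |k|\, \frac{t-s}{(\sqrt{s}+\sqrt{t})\sqrt{ts}}\, ds \\
&\le \frac{\varepsilon}{t^{3/2}}\, \mu_{\alpha+1}(k,t) \int_{t-(t-1)^{3/4}}^{t} e^{-|k|(t-s)}\, |k|\, (t-s)\, ds \\
&\le \frac{\varepsilon}{t^{3/2}}\, \mu_{\alpha+1}(k,t) \int_{t-(t-1)^{3/4}}^{t} ds \le \frac{\varepsilon}{t^{3/4}}\, \mu_{\alpha+1}(k,t) .
\end{aligned}
\]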
This completes the proof of Proposition 23.
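The splitting of the integral at s = t + t^{3/4} used in the proof of (81) can be checked numerically. The sketch below assumes the explicit constants 1 and 2 for the two pieces, which follow from bounding the exponential by 1 on [t, t + t^{3/4}] and by exp(−|k| t^{3/4}) beyond it; the function names and the substitution s = t/w² are choices made for this illustration and are not taken from [1].

```python
import math

def tail_integral(k, t, n=20000):
    """Midpoint-rule value of int_t^inf exp(|k|(t-s)) s^(-3/2) ds.

    The substitution s = t / w**2 maps the range to 0 < w <= 1 and turns the
    integral into (2 / sqrt(t)) * int_0^1 exp(|k| * t * (1 - 1/w**2)) dw.
    """
    h = 1.0 / n
    total = 0.0
    for j in range(n):
        w = (j + 0.5) * h
        total += math.exp(abs(k) * t * (1.0 - 1.0 / (w * w)))
    return 2.0 / math.sqrt(t) * total * h

def split_bound(k, t):
    """Bound t^(-3/4) + 2 t^(-1/2) exp(-|k| t^(3/4)) from the two pieces."""
    return t ** -0.75 + 2.0 * t ** -0.5 * math.exp(-abs(k) * t ** 0.75)

if __name__ == "__main__":
    for k in (0.0, 0.1, 1.0, 10.0):
        for t in (1.0, 10.0, 100.0):
            print(k, t, tail_integral(k, t), split_bound(k, t))
```

For k = 0 the integral equals 2/t^{1/2} exactly, so the second piece of the bound is sharp there and the first piece is pure slack.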
6.3. Proof of Theorem 18. (Reader’s guide: This section should be read in conjunction with Sect. 5 of [1]. Note that c ≡ û(0, 1) + i v̂(0+ , 1) as indicated in Proposition 14 of [1] and that d ≡ −i v̂(0+ , 1).)

Let − and U− as defined in Proposition 23, and let i U (k, t) = − + − (k, t) + U− (k, t) , k V (k, t) = − (k, t) + iσ (k)U− (k, t) .
From (27), and using that |− /k| < 1, we find for t ≥ 1 that i − − ω+ ≤ ε 1 1 µα (k, t) ≤ ε µα+1 (k, t) . t2 k t 3/2 + Using in addition the inequalities (80)-(82) we therefore get for u using (15), that ε ε |(u − U )(k, t)| ≤ 3/2 µα+1 (k, t) + 1/2 µα+1 (k, t) t t ε ε + 3/4 µα+1 (k, t) + 1/2 µα+1 (k, t) t t ε ≤ 1/2 µα+1 (k, t) , (86) t the worst bound being the contribution from ω− . Similarly we get for v using (15) and the inequalities (79), (81) and (82), that ε ε |(v − V )(k, t)| ≤ 3/2 µα+1 (k, t) + µα+1 (k, t) t t ε ε + 3/4 µα+1 (k, t) + 1/2 µα+1 (k, t) t t ε ε ≤ 3/4 µα+1 (k, t) + 1/2 µα+1 (k, t) , (87) t t the worst bounds being here the contributions from u+ and u− . The bounds (86) and (87) imply in direct space for α > 0, that except for the terms U and V , respectively, contributions to u and v are bounded by O(1/x) and O(1/x 5/4 ), uniformly in y. Since, moreover, ∞ i k k 2 ∗ lim − + ( 1/2 )− ( 1/2 , t) = −i∂k ω− (0) + i ∂k q(0, s) ds e−k , t→∞ k/t 1/2 t t 1 (88) ∞ k 2 1/2 ∗ lim t − ( 1/2 , t) = − −i∂k ω− (0) + i ∂k q(0, s) ds (−ik) e−k , (89) t→∞ t 1 ∞ k i lim U− ( , t) = u∗− (0) − ∂k q(0, s) ds e−|k| ≡ d e−|k| , (90) t→∞ t 2 1 and furthermore − i + ( k )− ( k , t) ≤ εµα+1 (k, 1) , k/t 1/2 1/2 1/2 t t 1/2 t − ( k , t) ≤ εµα+1 (k, 1) , 1/2 t U− ( k , t) ≤ εµα+1 (k, 1) , t uniformly in t ≥ 1, it follows by the Lebesgue dominated convergence theorem that the convergence in (88)-(90) is not only pointwise but also in L1 (R). The equalities (69), (70) and (75) now follow by using in addition the identities for u(0, 1) and v(0+ , 1) established in the proof of Proposition 14. The existence of the integrals in (73), (74) and (75) in the (C, δ)-sense (see Remark 19) follows by using that u and U are by definition continuous functions of k ∈ R and that v and V are by definition continuous functions of k ∈ R \ {0}. This completes the proof of Theorem 18.
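The passage from the Fourier-space bounds (86) and (87) to decay rates in direct space rests on a scaling argument: the supremum over y of a function is bounded by a constant times the L1 norm in k of its Fourier transform, and the k-integral of a weight of the form (1 + |k| t^p)^{-a} with a > 1 equals a constant times t^{-p}. The sketch below checks this scaling numerically. The exponent p = 3/4 matches the weight written in (78); using p = 1/2 for weights taken over from [1] is an assumption made here, suggested by the rescaling k/t^{1/2} in (88) and (89). The function name and the substitution k = tan(θ) are likewise choices made for this illustration.

```python
import math

def weight_integral(t, p, a, n=20000):
    """Midpoint-rule value of the integral over all real k of
    (1 + |k| * t**p) ** (-a), computed with k = tan(theta).
    Requires a > 1 for the integral to converge.
    """
    h = math.pi / n
    scale = t ** p
    total = 0.0
    for j in range(n):
        theta = -0.5 * math.pi + (j + 0.5) * h
        k = math.tan(theta)
        # dk = (1 + k^2) d(theta)
        total += (1.0 + k * k) / (1.0 + abs(k) * scale) ** a
    return total * h

if __name__ == "__main__":
    for p in (0.5, 0.75):
        for t in (1.0, 10.0, 100.0):
            # the product with t**p should be independent of t
            print(p, t, weight_integral(t, p, 3.0) * t ** p)
```

Under these assumptions a prefactor ε/t^{1/2} combined with p = 1/2 gives O(1/t), while ε/t^{3/4} with p = 1/2, or ε/t^{1/2} with p = 3/4, gives O(1/t^{5/4}).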
6.4. Appendix. Proposition 24 (Improved version of Proposition 7). Let α ≥ β ≥ γ ≥ 0. Then, for all t ≥ 1 and k ∈ R, 1 1+
|k|α
e
− t−1 2
β
|− |
t −1 t
γ
≤ const.
1 1 . t β 1 + |k| t 1/2 α −β +γ
(91)
Proof. For 1 ≤ t < 2 we have that 1 1 + |k|α
e
− t−1 2
≤ const. ≤ const. ≤ const.
β
|− | 1
1+
|k|α
1 1+
e−
t −1 t t−1 2
|− (t − 1)|γ |− |β −γ
|k|α
γ
|− |β −γ
1 |k|α −β +γ
1+ 1 1 ≤ const. β , t 1 + |k| t 1/2 α −β +γ as claimed, and for t > 2 we use that
α −β +γ t−1 t −1 γ e− 2 |− t|β 1 + |k| t 1/2 t
α t −1 γ 1/2 − t−1 β 2 |− t| e ≤ const. 1 + |k| t t
α t e− 4 |− t|β ≤ const. 1 + |k| t 1/2 t |k|α |− t|α /2 |− t|β e− 4 ≤ const. 1 + |− |α /2 |k|α ≤ const. 1 + |− |α /2
≤ const. 1 + |k|α /2 ≤ const. 1 + |k|α , and (91) follows.
Acknowledgement. The author strongly regrets that Theorem 18 was not directly included in [1] and thanks the editors of Commun. Math. Phys. for the possibility of publishing this supplement. The author would also like to thank the referee for his input which helped to improve the readability of these notes.
References
1. Wittwer, P.: On the structure of stationary solutions of the Navier-Stokes equations. Commun. Math. Phys. 226, 455–474 (2002)
2. Batchelor, G.K.: An introduction to fluid dynamics. Cambridge: Cambridge University Press, 1967
3. Titchmarsh, E.C.: Theory of Fourier Integrals. Oxford: Clarendon Press, 1937

Communicated by A. Kupiainen